ValueError: You have to specify either input_ids or inputs_embeds!

See original GitHub issue

Details

I’m quite new to NLP tasks. However, while trying to train the T5-large model I set things up as follows, and unfortunately got an error.

import tensorflow as tf
from tensorflow.keras.layers import Dense, Input
from tensorflow.keras.models import Model

def build_model(transformer, max_len=512):
    # Token IDs go in as a single int32 sequence of length max_len
    input_word_ids = Input(shape=(max_len,), dtype=tf.int32, name="input_word_ids")
    sequence_output = transformer(input_word_ids)[0]
    # Use the first token's hidden state as a pooled representation
    cls_token = sequence_output[:, 0, :]
    out = Dense(1, activation='sigmoid')(cls_token)
    model = Model(inputs=input_word_ids, outputs=out)
    return model

model = build_model(transformer_layer, max_len=MAX_LEN)

It throws:

ValueError: in converted code:
ValueError                                Traceback (most recent call last)
<ipython-input-19-8ad6e68cd3f5> in <module>
----> 5     model = build_model(transformer_layer, max_len=MAX_LEN)
      6 
      7 model.summary()

<ipython-input-17-e001ed832ed6> in build_model(transformer, max_len)
     31     """
     32     input_word_ids = Input(shape=(max_len,), dtype=tf.int32, name="input_word_ids")
---> 33     sequence_output = transformer(input_word_ids)[0]
     34     cls_token = sequence_output[:, 0, :]
     35     out = Dense(1, activation='sigmoid')(cls_token)
ValueError: You have to specify either input_ids or inputs_embeds
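For context, this error comes from a guard at the top of the model’s call: if neither input_ids nor inputs_embeds reaches the model (for example because Keras hands the tensor over under an unexpected keyword, leaving both arguments None), it raises exactly this ValueError. A minimal stdlib sketch of that guard (toy_t5_call is a hypothetical stand-in, not the real transformers code):

```python
# Hypothetical mimic of the input check at the top of the model's call().
# When neither argument arrives, the same ValueError as in the traceback
# above is raised.
def toy_t5_call(input_ids=None, inputs_embeds=None):
    if input_ids is None and inputs_embeds is None:
        raise ValueError("You have to specify either input_ids or inputs_embeds")
    # Normally the model would embed input_ids here; just echo the input back.
    return input_ids if input_ids is not None else inputs_embeds

try:
    toy_t5_call()  # nothing supplied -> the error from the traceback
except ValueError as e:
    print(e)

toy_t5_call(input_ids=[101, 2023, 102])  # supplying input_ids passes the guard
```

So the fix is not in the model itself but in making sure the tensor actually arrives under one of those two argument names.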

Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Reactions: 1
  • Comments: 21 (17 by maintainers)

Top GitHub Comments

2 reactions
patrickvonplaten commented, May 16, 2020

@ratthachat - thanks for your message! We definitely need to provide more TF examples for the T5 model. I plan to tackle this problem in ~2 weeks.

In TF we use the naming convention inputs, so you should change this to model.fit({"inputs": x_encoder}). I very much agree that the error message is quite misleading; I correct it in this PR: #4401.
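The naming point above can be sketched in plain Python (unpack_inputs is a hypothetical helper, only mimicking how a TF 2.x model pulls named tensors out of a dict; it is not the transformers API):

```python
# Hypothetical sketch of why the dict key matters: when Keras passes a
# dict, the model looks the arguments up under agreed-upon names, so
# feeding the encoder tensor under any other key leaves input_ids unset.
def unpack_inputs(inputs):
    if isinstance(inputs, dict):
        return inputs.get("inputs"), inputs.get("inputs_embeds")
    return inputs, None  # a bare tensor is treated as input_ids

ids, embeds = unpack_inputs({"inputs": [101, 2023, 102]})
assert ids == [101, 2023, 102]  # expected key -> input_ids is set

ids, embeds = unpack_inputs({"input_word_ids": [101, 2023, 102]})
assert ids is None  # wrong key -> input_ids is None -> the ValueError
```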

2 reactions
patrickvonplaten commented, Apr 5, 2020

I’m not 100% sure what you want to do here exactly. T5 is always trained in a text-to-text format. We have a section here on how to train T5: https://huggingface.co/transformers/model_doc/t5.html#training

Otherwise I’d recommend taking a look at the official paper.
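The text-to-text framing mentioned above can be sketched as pairs of prefixed input strings and target strings (make_example is illustrative, not a library function; the task-prefix wording follows the convention from the T5 paper):

```python
# T5 frames every task as text-to-text: a task prefix plus the input on
# the encoder side, and the answer as plain text on the decoder side.
def make_example(task_prefix, source, target):
    return {"inputs": task_prefix + source, "targets": target}

ex = make_example("translate English to German: ",
                  "The house is wonderful.",
                  "Das Haus ist wunderbar.")
print(ex["inputs"])  # translate English to German: The house is wonderful.
```

Both sides are then tokenized to IDs, which is exactly the input_ids tensor the model expects.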

