ValueError: You have to specify either input_ids or inputs_embeds!

See original GitHub issue

Details

I’m quite new to NLP tasks. However, while trying to train the T5-large model I set things up as follows, and unfortunately got an error.

import tensorflow as tf
from tensorflow.keras.layers import Dense, Input
from tensorflow.keras.models import Model

def build_model(transformer, max_len=512):
    # Token IDs go in as a single int32 sequence of length max_len
    input_word_ids = Input(shape=(max_len,), dtype=tf.int32, name="input_word_ids")
    sequence_output = transformer(input_word_ids)[0]
    # Use the first token's hidden state as a pooled representation
    cls_token = sequence_output[:, 0, :]
    out = Dense(1, activation='sigmoid')(cls_token)
    model = Model(inputs=input_word_ids, outputs=out)
    return model

model = build_model(transformer_layer, max_len=MAX_LEN)

It throws:

ValueError: in converted code:
ValueError                                Traceback (most recent call last)
<ipython-input-19-8ad6e68cd3f5> in <module>
----> 5     model = build_model(transformer_layer, max_len=MAX_LEN)
      6 
      7 model.summary()

<ipython-input-17-e001ed832ed6> in build_model(transformer, max_len)
     31     """
     32     input_word_ids = Input(shape=(max_len,), dtype=tf.int32, name="input_word_ids")
---> 33     sequence_output = transformer(input_word_ids)[0]
     34     cls_token = sequence_output[:, 0, :]
     35     out = Dense(1, activation='sigmoid')(cls_token)
ValueError: You have to specify either input_ids or inputs_embeds
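For context, this error comes from a guard at the top of the model’s call: if neither input_ids nor inputs_embeds reaches the model (for example because Keras hands the tensor over under an unexpected keyword, leaving both arguments None), it raises exactly this ValueError. A minimal stdlib sketch of that guard (toy_t5_call is a hypothetical stand-in, not the real transformers code):

```python
# Hypothetical mimic of the input check at the top of the model's call().
# When neither argument arrives, the same ValueError as in the traceback
# above is raised.
def toy_t5_call(input_ids=None, inputs_embeds=None):
    if input_ids is None and inputs_embeds is None:
        raise ValueError("You have to specify either input_ids or inputs_embeds")
    # Normally the model would embed input_ids here; just echo the input back.
    return input_ids if input_ids is not None else inputs_embeds

try:
    toy_t5_call()  # nothing supplied -> the error from the traceback
except ValueError as e:
    print(e)

toy_t5_call(input_ids=[101, 2023, 102])  # supplying input_ids passes the guard
```

So the fix is not in the model itself but in making sure the tensor actually arrives under one of those two argument names.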

Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Reactions: 1
  • Comments: 21 (17 by maintainers)

Top GitHub Comments

2 reactions
patrickvonplaten commented, May 16, 2020

@ratthachat - thanks for your message! We definitely need to provide more TF examples for the T5 model. I plan to tackle this problem in ~2 weeks.

In TF we use the naming convention inputs, so you should change this to model.fit({"inputs": x_encoder}). I very much agree that the error message is quite misleading; I correct it in this PR: #4401.
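The naming point above can be sketched in plain Python (unpack_inputs is a hypothetical helper, only mimicking how a TF 2.x model pulls named tensors out of a dict; it is not the transformers API):

```python
# Hypothetical sketch of why the dict key matters: when Keras passes a
# dict, the model looks the arguments up under agreed-upon names, so
# feeding the encoder tensor under any other key leaves input_ids unset.
def unpack_inputs(inputs):
    if isinstance(inputs, dict):
        return inputs.get("inputs"), inputs.get("inputs_embeds")
    return inputs, None  # a bare tensor is treated as input_ids

ids, embeds = unpack_inputs({"inputs": [101, 2023, 102]})
assert ids == [101, 2023, 102]  # expected key -> input_ids is set

ids, embeds = unpack_inputs({"input_word_ids": [101, 2023, 102]})
assert ids is None  # wrong key -> input_ids is None -> the ValueError
```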

2 reactions
patrickvonplaten commented, Apr 5, 2020

I’m not 100% sure what you want to do here exactly. T5 is always trained in a text-to-text format. We have a section here on how to train T5: https://huggingface.co/transformers/model_doc/t5.html#training

Otherwise I’d recommend taking a look at the official paper.
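The text-to-text framing mentioned above can be sketched as pairs of prefixed input strings and target strings (make_example is illustrative, not a library function; the task-prefix wording follows the convention from the T5 paper):

```python
# T5 frames every task as text-to-text: a task prefix plus the input on
# the encoder side, and the answer as plain text on the decoder side.
def make_example(task_prefix, source, target):
    return {"inputs": task_prefix + source, "targets": target}

ex = make_example("translate English to German: ",
                  "The house is wonderful.",
                  "Das Haus ist wunderbar.")
print(ex["inputs"])  # translate English to German: The house is wonderful.
```

Both sides are then tokenized to IDs, which is exactly the input_ids tensor the model expects.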

