ValueError: You have to specify either decoder_input_ids or decoder_inputs_embeds
Environment info
- transformers version: 4.0.0
- Platform: Google Colab
- Python version: 3
- PyTorch version (GPU?): 1.7.0+cu101
Who can help
@patrickvonplaten @patil-suraj
Information
Model I am using: T5
The problem arises when using:
import torch
import sentencepiece
from transformers import T5Tokenizer, T5ForConditionalGeneration

tokenizer = T5Tokenizer.from_pretrained('t5-small')
model = T5ForConditionalGeneration.from_pretrained('t5-small', torchscript=True)

input_ids = tokenizer('The <extra_id_0> walks in <extra_id_1> park', return_tensors='pt').input_ids
labels = tokenizer('<extra_id_0> cute dog <extra_id_1> the <extra_id_2> </s>', return_tensors='pt').input_ids

# Forward pass and generation work fine with keyword arguments
outputs = model(input_ids=input_ids, labels=labels)
outputs = model.generate(input_ids)

# Tracing with only input_ids raises the ValueError below
traced_model = torch.jit.trace(model, input_ids)
torch.jit.save(traced_model, "traced_t5.pt")
As mentioned in the article, I tried to convert the model to TorchScript, but the T5ForConditionalGeneration model does not support the trace function for converting the model to TorchScript.
The output produced:
ValueError                                Traceback (most recent call last)
<ipython-input-7-e37c13fee7bc> in <module>()
      1 import torch
----> 2 traced_model = torch.jit.trace(model, input_ids)
      3 torch.jit.save(traced_model, "traced_t5.pt")

7 frames
/usr/local/lib/python3.6/dist-packages/transformers/models/t5/modeling_t5.py in forward(self, input_ids, attention_mask, encoder_hidden_states, encoder_attention_mask, inputs_embeds, head_mask, past_key_values, use_cache, output_attentions, output_hidden_states, return_dict)
    774         else:
    775             err_msg_prefix = "decoder_" if self.is_decoder else ""
--> 776             raise ValueError(f"You have to specify either {err_msg_prefix}inputs or {err_msg_prefix}inputs_embeds")
    777
    778         if inputs_embeds is None:

ValueError: You have to specify either decoder_inputs or decoder_inputs_embeds
I got the same issue when converting a question-generation T5 model to TorchScript; the issue is here.
Seq2Seq models are a bit special - they also need decoder_input_ids, as the error message states. However, since TorchScript does not allow keyword arguments, we need to provide positional arguments, so it's mandatory to also provide the 2nd argument, the attention_mask (for the encoder). The following is what you are looking for (I think):
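A minimal sketch of that call, reusing the tensors from the reproduction above and assuming transformers 4.0.0's positional argument order for T5ForConditionalGeneration.forward (input_ids, attention_mask, decoder_input_ids):

import torch
from transformers import T5Tokenizer, T5ForConditionalGeneration

tokenizer = T5Tokenizer.from_pretrained('t5-small')
model = T5ForConditionalGeneration.from_pretrained('t5-small', torchscript=True)

# Encoder inputs: keep the attention mask alongside the input ids.
encoding = tokenizer('The <extra_id_0> walks in <extra_id_1> park', return_tensors='pt')
input_ids = encoding.input_ids
attention_mask = encoding.attention_mask

# Decoder inputs: for tracing, the target ids from the reproduction can serve as decoder_input_ids.
decoder_input_ids = tokenizer('<extra_id_0> cute dog <extra_id_1> the <extra_id_2> </s>', return_tensors='pt').input_ids

# torch.jit.trace only accepts positional arguments, so the tensors are passed in the
# order of T5ForConditionalGeneration.forward: input_ids, attention_mask, decoder_input_ids.
traced_model = torch.jit.trace(model, (input_ids, attention_mask, decoder_input_ids))
torch.jit.save(traced_model, "traced_t5.pt")

The traced module can then be reloaded with torch.jit.load("traced_t5.pt") and called with the same three tensors in the same order.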
Thanks @Ki6an , I was trying something similar for Pegasus Models for the summarisation task.
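The same pattern should carry over, under the assumption that PegasusForConditionalGeneration.forward takes the same first three positional arguments (input_ids, attention_mask, decoder_input_ids); the checkpoint below (google/pegasus-xsum) is only an illustrative choice, and this sketch is untested:

import torch
from transformers import PegasusTokenizer, PegasusForConditionalGeneration

# 'google/pegasus-xsum' is just an example checkpoint for the summarisation task.
tokenizer = PegasusTokenizer.from_pretrained('google/pegasus-xsum')
model = PegasusForConditionalGeneration.from_pretrained('google/pegasus-xsum', torchscript=True)

article = "PG&E scheduled the blackouts in response to forecasts for high winds."
encoding = tokenizer(article, return_tensors='pt')

# A short dummy summary gives the decoder something to trace through.
decoder_input_ids = tokenizer("Power shutoffs announced.", return_tensors='pt').input_ids

traced_model = torch.jit.trace(model, (encoding.input_ids, encoding.attention_mask, decoder_input_ids))
torch.jit.save(traced_model, "traced_pegasus.pt")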