model.generate with prefix_allowed_tokens_fn throws RuntimeError: probability tensor contains either `inf`, `nan` or element < 0
Environment info
- transformers version: 4.15.0
- Platform: Linux-5.4.0-90-generic-x86_64-with-debian-bullseye-sid
- Python version: 3.7.12
- PyTorch version (GPU?): 1.10.0+cu102 (True)
- Tensorflow version (GPU?): 2.7.0 (True)
- Flax version (CPU?/GPU?/TPU?): not installed (NA)
- Jax version: not installed
- JaxLib version: not installed
- Using GPU in script?: No
- Using distributed or parallel set-up in script?: No
Information
Model I am using: T5ForConditionalGeneration
The problem arises when using my own modified scripts: the script to reproduce the error is given below.
The task I am working on is my own task or dataset: it requires conditional generation from T5 such that the output vocabulary is restricted to a small set.
To reproduce
- Run the following script to reproduce the behaviour.
from transformers import T5Tokenizer, T5ForConditionalGeneration

lm_model = 't5-small'
model = T5ForConditionalGeneration.from_pretrained(lm_model)
tokenizer = T5Tokenizer.from_pretrained(lm_model)

def restrict_decode_vocab(batch_idx, prefix_beam):
    # Once the beam already holds 3 tokens, restrict the vocabulary to the
    # tokens of ' '; otherwise allow a small set of sentinel and word tokens.
    if len(prefix_beam) == 3:
        restricted_vocab = tokenizer(' ', return_tensors="pt")['input_ids'].tolist()
    else:
        restricted_vocab = tokenizer('<extra_id_0> cute dog <extra_id_1> the <pad>', return_tensors="pt")['input_ids'].tolist()
    return restricted_vocab
source = ['The <extra_id_0> walks in <extra_id_1> park .']
source_encoding = tokenizer(source[:], padding='longest', return_tensors="pt")
input_ids, attention_mask = source_encoding['input_ids'], source_encoding['attention_mask']
decoded_beams = model.generate(input_ids=input_ids, attention_mask=attention_mask, do_sample=True, num_beams=2, prefix_allowed_tokens_fn=restrict_decode_vocab, min_length=4, max_length=4, remove_invalid_values=True)
print(decoded_beams)
- The above script produces the following stack trace.
/home/jsingh319/uploaded_venvs/venv-koala-torch-1.10-python-3.7.12/lib/python3.7/site-packages/transformers/generation_utils.py:2259: UserWarning: __floordiv__ is deprecated, and its behavior will change in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values. To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor').
next_indices = next_tokens // vocab_size
Traceback (most recent call last):
File "reproduce_error.py", line 17, in <module>
decoded_beams = model.generate(input_ids=input_ids, attention_mask=attention_mask, do_sample=True, num_beams=2, prefix_allowed_tokens_fn=restrict_decode_vocab, min_length=4, max_length=4, remove_invalid_values=True)
File "/home/jsingh319/uploaded_venvs/venv-koala-torch-1.10-python-3.7.12/lib/python3.7/site-packages/torch/autograd/grad_mode.py", line 28, in decorate_context
return func(*args, **kwargs)
File "/home/jsingh319/uploaded_venvs/venv-koala-torch-1.10-python-3.7.12/lib/python3.7/site-packages/transformers/generation_utils.py", line 1220, in generate
**model_kwargs,
File "/home/jsingh319/uploaded_venvs/venv-koala-torch-1.10-python-3.7.12/lib/python3.7/site-packages/transformers/generation_utils.py", line 2253, in beam_sample
next_tokens = torch.multinomial(probs, num_samples=2 * num_beams)
RuntimeError: probability tensor contains either `inf`, `nan` or element < 0
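For context, a minimal standalone illustration of the failure (my own example, not from the issue): once every logit in a row is -inf, softmax turns the row into nan, and torch.multinomial raises exactly this RuntimeError.

import torch

logits = torch.full((1, 5), float("-inf"))
probs = torch.softmax(logits, dim=-1)    # exp(-inf) / sum(exp(-inf)) == 0 / 0 -> all nan
torch.multinomial(probs, num_samples=2)  # RuntimeError: probability tensor contains either `inf`, `nan` or element < 0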
Expected behavior
No error.
Possible solution
The __call__ method of the InfNanRemoveLogitsProcessor class should include the following statement before returning scores:
scores[scores == float("-inf")] = torch.finfo(scores.dtype).min
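A minimal sketch of the patched processor (the class name is my own; the first two assignments mirror what InfNanRemoveLogitsProcessor already does, and only the -inf line is the proposed addition):

import torch
from transformers import LogitsProcessor

class PatchedInfNanRemoveLogitsProcessor(LogitsProcessor):  # hypothetical name
    def __call__(self, input_ids, scores):
        scores[scores != scores] = 0.0                                   # nan -> 0.0 (existing behaviour)
        scores[scores == float("inf")] = torch.finfo(scores.dtype).max   # +inf -> largest finite value (existing behaviour)
        scores[scores == float("-inf")] = torch.finfo(scores.dtype).min  # proposed: -inf -> smallest finite value
        return scores

With every entry finite, softmax produces a valid (if extremely peaked) distribution and torch.multinomial can sample without raising.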
I don’t think that this is related in any way to the InfNanRemoveLogitsProcessor processor. IMO, the reason for the error here is that in the 3rd generation step, all values of next_token_scores are set to -inf (I think) due to the prefix_allowed_tokens_fn that you’ve added. This is not a bug IMO with transformers, but with the prefix_allowed_tokens_fn function, as it should not set all values to -inf.

A tip from my side @iamjanvijay would be to do the following: create the PrefixConstrainedLogitsProcessor object with your function and just play around with it locally (what happens at generation step 3?). I think you’ll see then that it sets all values to -inf at some point, which it shouldn’t do.
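A rough sketch of that experiment (reusing restrict_decode_vocab, tokenizer, and model from the script above; the dummy tensors and their shapes are my own assumptions for illustration):

import torch
from transformers import PrefixConstrainedLogitsProcessor

processor = PrefixConstrainedLogitsProcessor(restrict_decode_vocab, num_beams=2)
# Pretend we are at generation step 3: each of the 2 beams already holds 3 tokens.
dummy_prefix = torch.zeros((2, 3), dtype=torch.long)      # (num_beams, cur_len)
dummy_scores = torch.zeros((2, model.config.vocab_size))  # (num_beams, vocab_size)
masked = processor(dummy_prefix, dummy_scores)
# True for a beam means every token of that beam was masked to -inf.
print(torch.isinf(masked).all(dim=-1))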
I’ll close my PR in the meantime. We can reopen it if needed, but I tend to agree with @patrickvonplaten that having everything float(-inf) can be considered a bug already.