RuntimeError: Trying to create tensor with negative dimension
Environment info
- transformers version: 3.4.0
- Platform: Linux-3.10.0-957.el7.x86_64-x86_64-with-debian-stretch-sid
- Python version: 3.6.9
- PyTorch version (GPU?): 1.6.0 (True)
- Tensorflow version (GPU?): not installed (NA)
- Using GPU in script?: No
- Using distributed or parallel set-up in script?: No

@TevenLeScao
Information
I am using TransfoXLModel. The problem arises when running the code below (if I do not pass vocab_size=256, it works fine):
from transformers import TransfoXLConfig, TransfoXLModel

# Initializing a configuration with a small vocabulary
configuration = TransfoXLConfig(vocab_size=256)
# Initializing a model from the configuration
model = TransfoXLModel(configuration)
Error I get:
RuntimeError                              Traceback (most recent call last)
<ipython-input-323-7039580347ad> in <module>
      3 configuration = TransfoXLConfig(vocab_size=256)
      4 # Initializing a model from the configuration
----> 5 model = TransfoXLModel(configuration)

/opt/conda/lib/python3.6/site-packages/transformers/modeling_transfo_xl.py in __init__(self, config)
    736
    737         self.word_emb = AdaptiveEmbedding(
--> 738             config.vocab_size, config.d_embed, config.d_model, config.cutoffs, div_val=config.div_val
    739         )
    740

/opt/conda/lib/python3.6/site-packages/transformers/modeling_transfo_xl.py in __init__(self, n_token, d_embed, d_proj, cutoffs, div_val, sample_softmax)
    421             l_idx, r_idx = self.cutoff_ends[i], self.cutoff_ends[i + 1]
    422             d_emb_i = d_embed // (div_val ** i)
--> 423             self.emb_layers.append(nn.Embedding(r_idx - l_idx, d_emb_i))
    424             self.emb_projs.append(nn.Parameter(torch.FloatTensor(d_proj, d_emb_i)))
    425

/opt/conda/lib/python3.6/site-packages/torch/nn/modules/sparse.py in __init__(self, num_embeddings, embedding_dim, padding_idx, max_norm, norm_type, scale_grad_by_freq, sparse, _weight)
    107         self.scale_grad_by_freq = scale_grad_by_freq
    108         if _weight is None:
--> 109             self.weight = Parameter(torch.Tensor(num_embeddings, embedding_dim))
    110             self.reset_parameters()
    111         else:
RuntimeError: Trying to create tensor with negative dimension -199744: [-199744, 16]
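For reference, the reported shape [-199744, 16] follows directly from the cluster arithmetic in AdaptiveEmbedding once vocab_size=256 is combined with the library's default cutoffs, d_embed, and div_val. A minimal sketch of that arithmetic, assuming the TransfoXLConfig defaults (the values below are assumptions, but they reproduce the numbers in the error exactly):

# Sketch of the cluster-size arithmetic in AdaptiveEmbedding, assuming the
# default TransfoXLConfig values: cutoffs=[20000, 40000, 200000], d_embed=1024, div_val=4
vocab_size = 256
cutoffs = [20000, 40000, 200000]
d_embed, div_val = 1024, 4

cutoff_ends = [0] + cutoffs + [vocab_size]
for i in range(len(cutoff_ends) - 1):
    l_idx, r_idx = cutoff_ends[i], cutoff_ends[i + 1]
    d_emb_i = d_embed // (div_val ** i)
    print(r_idx - l_idx, d_emb_i)
# Last iteration: 256 - 200000 = -199744 and 1024 // 4**3 = 16, i.e. the
# negative embedding dimension [-199744, 16] that nn.Embedding rejects.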
Hey! From your description it sounds like you haven't changed the cutoff points for adaptive embeddings (the different sizes of the clusters for the hierarchical softmax generation). This causes an issue because the last cluster of embeddings, the one for the least frequent words, has size vocab_size - cutoffs[-1], so if the last cutoff is bigger than the vocab size, that size is negative. Now for only 256 vocab words, adaptive embeddings don't really matter anyway, so I'd recommend running without them (or with cutoffs that fit inside the vocabulary).
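The code block that originally followed this reply is not preserved here; below is a minimal sketch of a fix along those lines, assuming the goal is simply to keep the cutoffs inside the 256-token vocabulary (the cutoffs=[] and div_val=1 values are illustrative, not the maintainer's exact snippet):

from transformers import TransfoXLConfig, TransfoXLModel

# Illustrative fix (assumed values): use a single embedding cluster for the
# whole 256-token vocabulary so no cluster size can go negative.
configuration = TransfoXLConfig(vocab_size=256, cutoffs=[], div_val=1)
model = TransfoXLModel(configuration)

Keeping the default div_val but choosing cutoffs smaller than vocab_size (for example cutoffs=[64, 128]) should avoid the negative cluster size as well.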
@TevenLeScao Thanks very much, it works great for me. Closing the issue now.