RuntimeError: Trying to create tensor with negative dimension
Environment info
- transformers version: 3.4.0
- Platform: Linux-3.10.0-957.el7.x86_64-x86_64-with-debian-stretch-sid
- Python version: 3.6.9
- PyTorch version (GPU?): 1.6.0 (True)
- Tensorflow version (GPU?): not installed (NA)
- Using GPU in script?: No
- Using distributed or parallel set-up in script?: No

@TevenLeScao
Information
I am using TransfoXLModel. The problem arises when running the code below (if I do not pass vocab_size=256, it works fine):
from transformers import TransfoXLConfig, TransfoXLModel

# Initializing a configuration with a small vocabulary
configuration = TransfoXLConfig(vocab_size=256)
# Initializing a model from the configuration
model = TransfoXLModel(configuration)
Error I get:
RuntimeError                              Traceback (most recent call last)
<ipython-input-323-7039580347ad> in <module>
      3 configuration = TransfoXLConfig(vocab_size=256)
      4 # Initializing a model from the configuration
----> 5 model = TransfoXLModel(configuration)

/opt/conda/lib/python3.6/site-packages/transformers/modeling_transfo_xl.py in __init__(self, config)
    736
    737         self.word_emb = AdaptiveEmbedding(
--> 738             config.vocab_size, config.d_embed, config.d_model, config.cutoffs, div_val=config.div_val
    739         )
    740

/opt/conda/lib/python3.6/site-packages/transformers/modeling_transfo_xl.py in __init__(self, n_token, d_embed, d_proj, cutoffs, div_val, sample_softmax)
    421             l_idx, r_idx = self.cutoff_ends[i], self.cutoff_ends[i + 1]
    422             d_emb_i = d_embed // (div_val ** i)
--> 423             self.emb_layers.append(nn.Embedding(r_idx - l_idx, d_emb_i))
    424             self.emb_projs.append(nn.Parameter(torch.FloatTensor(d_proj, d_emb_i)))
    425

/opt/conda/lib/python3.6/site-packages/torch/nn/modules/sparse.py in __init__(self, num_embeddings, embedding_dim, padding_idx, max_norm, norm_type, scale_grad_by_freq, sparse, _weight)
    107         self.scale_grad_by_freq = scale_grad_by_freq
    108         if _weight is None:
--> 109             self.weight = Parameter(torch.Tensor(num_embeddings, embedding_dim))
    110             self.reset_parameters()
    111         else:
RuntimeError: Trying to create tensor with negative dimension -199744: [-199744, 16]
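For reference, the reported shape [-199744, 16] follows directly from the cluster arithmetic in AdaptiveEmbedding once vocab_size=256 is combined with the library's default cutoffs, d_embed, and div_val. A minimal sketch of that arithmetic, assuming the TransfoXLConfig defaults (the values below are assumptions, but they reproduce the numbers in the error exactly):

# Sketch of the cluster-size arithmetic in AdaptiveEmbedding, assuming the
# default TransfoXLConfig values: cutoffs=[20000, 40000, 200000], d_embed=1024, div_val=4
vocab_size = 256
cutoffs = [20000, 40000, 200000]
d_embed, div_val = 1024, 4

cutoff_ends = [0] + cutoffs + [vocab_size]
for i in range(len(cutoff_ends) - 1):
    l_idx, r_idx = cutoff_ends[i], cutoff_ends[i + 1]
    d_emb_i = d_embed // (div_val ** i)
    print(r_idx - l_idx, d_emb_i)
# Last iteration: 256 - 200000 = -199744 and 1024 // 4**3 = 16, i.e. the
# negative embedding dimension [-199744, 16] that nn.Embedding rejects.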
Hey! From your description it sounds like you haven't changed the cutoff points for adaptive embeddings (the different sizes of the clusters for the hierarchical softmax generation). This causes an issue because the last cluster of embeddings, the one for the least frequent words, has size vocab_size - cutoffs[-1], so if the last cutoff is bigger than the vocab size, that size is negative. Now for only 256 vocab words, adaptive embeddings don't really matter anyway, so I'd recommend running without them (or with cutoffs that fit inside the vocabulary).
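The code block that originally followed this reply is not preserved here; below is a minimal sketch of a fix along those lines, assuming the goal is simply to keep the cutoffs inside the 256-token vocabulary (the cutoffs=[] and div_val=1 values are illustrative, not the maintainer's exact snippet):

from transformers import TransfoXLConfig, TransfoXLModel

# Illustrative fix (assumed values): use a single embedding cluster for the
# whole 256-token vocabulary so no cluster size can go negative.
configuration = TransfoXLConfig(vocab_size=256, cutoffs=[], div_val=1)
model = TransfoXLModel(configuration)

Keeping the default div_val but choosing cutoffs smaller than vocab_size (for example cutoffs=[64, 128]) should avoid the negative cluster size as well.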
@TevenLeScao Thanks very much, it works great for me. Closing the issue now.