What to do about this warning message: "Some weights of the model checkpoint at bert-base-uncased were not used when initializing BertForSequenceClassification"
See original GitHub issue.

model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased")
returns this warning message:
Some weights of the model checkpoint at bert-base-uncased were not used when initializing BertForSequenceClassification: ['cls.predictions.bias', 'cls.predictions.transform.dense.weight', 'cls.predictions.transform.dense.bias', 'cls.predictions.decoder.weight', 'cls.seq_relationship.weight', 'cls.seq_relationship.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.predictions.transform.LayerNorm.bias']
- This IS expected if you are initializing BertForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPretraining model).
- This IS NOT expected if you are initializing BertForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of BertForSequenceClassification were not initialized from the model checkpoint at bert-base-uncased and are newly initialized: ['classifier.weight', 'classifier.bias']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
This just started popping up with v3, so I'm not sure what the recommended action is here. Please advise if you can. Basically, any of my code using AutoModelFor<X> is now throwing this warning.
Thanks.
@ohmeow you're loading the bert-base-cased checkpoint (which is a checkpoint that was trained using a similar architecture to BertForPreTraining) in a BertForSequenceClassification model. This means that:

- The layers that BertForPreTraining has, but BertForSequenceClassification does not have, will be discarded.
- The layers that BertForSequenceClassification has, but BertForPreTraining does not have, will be randomly initialized.

This is expected, and tells you that you won't have good performance with your BertForSequenceClassification model before you fine-tune it 🙂.
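For reference, a minimal sketch that surfaces the same information programmatically, assuming the output_loading_info flag of from_pretrained:

```python
# Sketch only: assumes from_pretrained(..., output_loading_info=True), which also
# returns a dict describing how the checkpoint weights were mapped onto the model.
from transformers import AutoModelForSequenceClassification

model, loading_info = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", output_loading_info=True
)

# Checkpoint weights with no counterpart in BertForSequenceClassification (discarded),
# e.g. the cls.predictions.* and cls.seq_relationship.* pre-training heads:
print(loading_info["unexpected_keys"])

# Model weights with no counterpart in the checkpoint (randomly initialized),
# e.g. classifier.weight and classifier.bias:
print(loading_info["missing_keys"])
```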
@fliptrail this warning means that during your training, you're not using the pooler in order to compute the loss. I don't know how you're fine-tuning your model, but if you're not using the pooler layer then there's no need to worry about that warning.

You can manage the warnings with the logging utility introduced in version 3.1.0:
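A minimal sketch, assuming the transformers.logging verbosity helpers available since v3.1.0:

```python
# Sketch: silence the "Some weights ... were not used / newly initialized" messages
# by lowering the library's log verbosity (assumes transformers >= 3.1.0).
from transformers import AutoModelForSequenceClassification, logging

logging.set_verbosity_error()  # only show errors while loading
model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased")
logging.set_verbosity_warning()  # restore the default verbosity afterwards
```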