What to do about this warning message: "Some weights of the model checkpoint at bert-base-uncased were not used when initializing BertForSequenceClassification"

from transformers import AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased")

returns this warning message:

Some weights of the model checkpoint at bert-base-uncased were not used when initializing BertForSequenceClassification: ['cls.predictions.bias', 'cls.predictions.transform.dense.weight', 'cls.predictions.transform.dense.bias', 'cls.predictions.decoder.weight', 'cls.seq_relationship.weight', 'cls.seq_relationship.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.predictions.transform.LayerNorm.bias']
- This IS expected if you are initializing BertForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPretraining model).
- This IS NOT expected if you are initializing BertForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of BertForSequenceClassification were not initialized from the model checkpoint at bert-base-uncased and are newly initialized: ['classifier.weight', 'classifier.bias']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.

This just started popping up with v3, so I'm not sure what the recommended action is here. Please advise if you can. Basically, any of my code using AutoModelFor<X> now throws this warning.

Thanks.

Top GitHub Comments

72 reactions
LysandreJik commented on Jul 1, 2020

@ohmeow you’re loading the bert-base-uncased checkpoint (which was trained with an architecture similar to BertForPreTraining) into a BertForSequenceClassification model.

This means that:

  • The layers that BertForPreTraining has but BertForSequenceClassification does not have will be discarded.
  • The layers that BertForSequenceClassification has but BertForPreTraining does not have will be randomly initialized.

This is expected, and tells you that you won’t have good performance with your BertForSequenceClassification model before you fine-tune it 🙂.
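If you want to see exactly which weights fall into each group, from_pretrained accepts an output_loading_info=True flag that returns this bookkeeping alongside the model. A minimal sketch (the dictionary keys shown are those used by recent transformers versions; check your version's docs if they differ):

from transformers import BertForSequenceClassification

# Ask from_pretrained to also return its bookkeeping about the checkpoint.
model, loading_info = BertForSequenceClassification.from_pretrained(
    "bert-base-uncased", output_loading_info=True
)

# Checkpoint weights with no matching layer in this architecture (discarded):
print(loading_info["unexpected_keys"])  # e.g. the cls.predictions.* head weights

# Layers in this architecture with no weights in the checkpoint (randomly initialized):
print(loading_info["missing_keys"])     # e.g. classifier.weight, classifier.bias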

@fliptrail this warning means that during your training, you’re not using the pooler in order to compute the loss. I don’t know how you’re finetuning your model, but if you’re not using the pooler layer then there’s no need to worry about that warning.
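For completeness, here is a minimal fine-tuning sketch showing the newly initialized classification head being trained. The two-example dataset is a made-up placeholder, and loss = outputs[0] is used so the snippet works whether the model returns a tuple (the v3 default) or a ModelOutput object (the v4 default):

import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)

# Toy placeholder data; substitute your own labeled examples.
texts = ["a great movie", "a terrible movie"]
labels = torch.tensor([1, 0])
batch = tokenizer(texts, padding=True, return_tensors="pt")

optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)
model.train()
outputs = model(**batch, labels=labels)  # passing labels makes the model compute the loss
loss = outputs[0]                        # first element is the loss in both v3 and v4
loss.backward()
optimizer.step()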

47 reactions
LysandreJik commented on Sep 25, 2020

You can manage the warnings with the logging utility introduced in version 3.1.0:

from transformers import logging

logging.set_verbosity_warning()
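Note that set_verbosity_warning() still lets these WARNING-level messages through; to hide them, lower the verbosity to errors only. A small sketch using the same logging utility (set_verbosity_error() and the TRANSFORMERS_VERBOSITY environment variable are both part of it):

from transformers import logging

logging.set_verbosity_error()  # only report actual errors; hides the "Some weights ..." warnings

# Alternatively, set the verbosity process-wide before Python starts:
#   TRANSFORMERS_VERBOSITY=error python my_script.py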

