What to do about this warning message: "Some weights of the model checkpoint at bert-base-uncased were not used when initializing BertForSequenceClassification"
See original GitHub issue.

model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased")
returns this warning message:
Some weights of the model checkpoint at bert-base-uncased were not used when initializing BertForSequenceClassification: ['cls.predictions.bias', 'cls.predictions.transform.dense.weight', 'cls.predictions.transform.dense.bias', 'cls.predictions.decoder.weight', 'cls.seq_relationship.weight', 'cls.seq_relationship.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.predictions.transform.LayerNorm.bias']
- This IS expected if you are initializing BertForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPretraining model).
- This IS NOT expected if you are initializing BertForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of BertForSequenceClassification were not initialized from the model checkpoint at bert-base-uncased and are newly initialized: ['classifier.weight', 'classifier.bias']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
This just started popping up with v3, so I'm not sure what the recommended action is here. Please advise if you can. Basically, any of my code using AutoModelFor<X> is now throwing this warning.
Thanks.
@ohmeow you're loading the bert-base-cased checkpoint (which is a checkpoint that was trained using a similar architecture to BertForPreTraining) in a BertForSequenceClassification model. This means that:

- The layers that BertForPreTraining has, but BertForSequenceClassification does not have, will be discarded.
- The layers that BertForSequenceClassification has, but BertForPreTraining does not have, will be randomly initialized.

This is expected, and tells you that you won't have good performance with your BertForSequenceClassification model before you fine-tune it 🙂.
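For reference, a minimal sketch that surfaces the same information programmatically, assuming the output_loading_info flag of from_pretrained:

```python
# Sketch only: assumes from_pretrained(..., output_loading_info=True), which also
# returns a dict describing how the checkpoint weights were mapped onto the model.
from transformers import AutoModelForSequenceClassification

model, loading_info = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", output_loading_info=True
)

# Checkpoint weights with no counterpart in BertForSequenceClassification (discarded),
# e.g. the cls.predictions.* and cls.seq_relationship.* pre-training heads:
print(loading_info["unexpected_keys"])

# Model weights with no counterpart in the checkpoint (randomly initialized),
# e.g. classifier.weight and classifier.bias:
print(loading_info["missing_keys"])
```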
@fliptrail this warning means that during your training, you're not using the pooler in order to compute the loss. I don't know how you're fine-tuning your model, but if you're not using the pooler layer then there's no need to worry about that warning.

You can manage the warnings with the logging utility introduced in version 3.1.0:
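A minimal sketch, assuming the transformers.logging verbosity helpers available since v3.1.0:

```python
# Sketch: silence the "Some weights ... were not used / newly initialized" messages
# by lowering the library's log verbosity (assumes transformers >= 3.1.0).
from transformers import AutoModelForSequenceClassification, logging

logging.set_verbosity_error()  # only show errors while loading
model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased")
logging.set_verbosity_warning()  # restore the default verbosity afterwards
```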