Fine-tune BERTForMaskedLM

See original GitHub issue

Hello,

I am doing a project on spelling correction. I used the pre-trained “bert-base-cased” model, but the results are not very accurate, so I planned to fine-tune BERT for the masked LM task. I couldn’t find any examples of fine-tuning a BERT model for masked LM, so I tried to use “run_language_modeling.py”. However, I ran into the following error:

C:\Users\ravida6d\spell_correction\transformers\examples\language-modeling>python run_language_modeling.py --output_dir ="C:\\Users\\ravida6d\\spell_correction\\contextualSpellCheck\\fine_tune\\" --model_type = bert --model_name_or_path = bert-base-cased --do_train --train_data_file =$TRAIN_FILE --do_eval --eval_data_file =$TEST_FILE --mlm

C:\Users\ravida6d\AppData\Local\Continuum\anaconda3\envs\contextualSpellCheck\lib\site-packages\transformers\training_args.py:291: FutureWarning: The `evaluate_during_training` argument is deprecated in favor of `evaluation_strategy` (which has more options)
  FutureWarning,

Traceback (most recent call last):
  File "run_language_modeling.py", line 313, in <module>
    main()
  File "run_language_modeling.py", line 153, in main
    model_args, data_args, training_args = parser.parse_args_into_dataclasses()
  File "C:\Users\ravida6d\AppData\Local\Continuum\anaconda3\envs\contextualSpellCheck\lib\site-packages\transformers\hf_argparser.py", line 151, in parse_args_into_dataclasses

    raise ValueError(f"Some specified arguments are not used by the HfArgumentParser: {remaining_args}")
ValueError: Some specified arguments are not used by the HfArgumentParser: ['bert', 'bert-base-cased']

I do not understand how to use this script. Can anyone give me some pointers on how to fine-tune BERT for masked LM?
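
For reference, the ValueError above comes from the spaces around “=”: argparse splits “--model_type = bert” into three tokens, consumes the lone “=” as the option value, and leaves “bert” (and likewise “bert-base-cased”) as unrecognised extras. A corrected invocation, keeping the same flags, paths and placeholders as the command above, would look roughly like this:

    python run_language_modeling.py --output_dir="C:\\Users\\ravida6d\\spell_correction\\contextualSpellCheck\\fine_tune\\" --model_type=bert --model_name_or_path=bert-base-cased --do_train --train_data_file=$TRAIN_FILE --do_eval --eval_data_file=$TEST_FILE --mlm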

Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Comments: 7 (1 by maintainers)

Top GitHub Comments

1 reaction
naturecreator commented, Sep 29, 2020

While fine-tuning, we can only see the loss and perplexity, which is useful. Is it also possible to see the accuracy of the model, and to use TensorBoard, with the “run_language_modeling.py” script? It would also be really helpful if anyone could explain how the loss is calculated for the BertForMaskedLM task (as no labels are provided while fine-tuning).
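
For the loss question: the labels are not supplied by the user; the data collator used by “run_language_modeling.py” (DataCollatorForLanguageModeling) creates them on the fly by masking a fraction of the input tokens and copying the original token ids into labels, with unmasked positions set to -100 so the cross-entropy loss ignores them. The loss is therefore just cross-entropy over the masked positions, and the reported perplexity is exp(loss). A minimal sketch of that mechanism, assuming a reasonably recent transformers version (the example sentence and variable names here are mine, not taken from the script):

    import torch
    from transformers import BertForMaskedLM, BertTokenizerFast, DataCollatorForLanguageModeling

    tokenizer = BertTokenizerFast.from_pretrained("bert-base-cased")
    model = BertForMaskedLM.from_pretrained("bert-base-cased")

    # The collator selects ~15% of the tokens (mostly replacing them with [MASK]) and
    # builds `labels`: original ids at the selected positions, -100 everywhere else.
    collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=True, mlm_probability=0.15)
    encoding = tokenizer("The quick brown fox jumps over the lazy dog.", return_tensors="pt")
    batch = collator([encoding["input_ids"][0]])

    outputs = model(input_ids=batch["input_ids"], labels=batch["labels"])
    print(outputs.loss)             # cross-entropy over the masked positions only
    print(torch.exp(outputs.loss))  # perplexity = exp(loss), as reported by the script
    # Note: on a short sentence a given run may mask no tokens at all, giving a nan loss.

As for TensorBoard, the Trainer used by the script writes event files to the directory given by --logging_dir when tensorboard is installed; accuracy is not reported out of the box.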

0 reactions
ucas010 commented, Nov 28, 2022

Hi, how can I use this repo for spelling error correction? Could you please help me?
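
A common approach (a rough sketch of my own, not the actual contextualSpellCheck API; the sentence, the hard-coded typo, and the variable names are made up for illustration) is to replace the suspicious token with [MASK] and let BertForMaskedLM rank replacement candidates from the surrounding context:

    import torch
    from transformers import BertForMaskedLM, BertTokenizerFast

    tokenizer = BertTokenizerFast.from_pretrained("bert-base-cased")
    model = BertForMaskedLM.from_pretrained("bert-base-cased")
    model.eval()

    text = "I went to the bank to depossit some money."  # "depossit" is the suspected typo
    masked = text.replace("depossit", tokenizer.mask_token)

    inputs = tokenizer(masked, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits

    # Find the [MASK] position, then take the model's top-5 candidate replacements for it.
    mask_pos = (inputs["input_ids"][0] == tokenizer.mask_token_id).nonzero(as_tuple=True)[0]
    top_ids = logits[0, mask_pos[0]].topk(5).indices.tolist()
    print(tokenizer.convert_ids_to_tokens(top_ids))

In practice you would still need a detection step (e.g. a vocabulary or edit-distance check) to decide which token to mask, and a way to restrict the candidates to words close to the original spelling.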

