[Wav2Vec2] Cannot load newly added Wav2Vec2 checkpoints

🐛 Bug

A recent commit, https://github.com/pytorch/fairseq/commit/2513524a1604dbafcc4ea9cc5a99ae0aa4f19694, added two new fine-tuned Wav2Vec2 checkpoints. However, there seems to be a problem with the saved config, because those checkpoints cannot be loaded. For example, the following code fails:

import fairseq
model, _, _ = fairseq.checkpoint_utils.load_model_ensemble_and_task([checkpoint_path], arg_overrides={"data": "path/to/dict"})

To Reproduce

The following Colab notebook reproduces the error (just run all cells): https://colab.research.google.com/drive/13hJI4w8pOD33hxOJ_qwKkN9QqdKVH5IM?usp=sharing

Kindly pinging @alexeib here 😃

Top GitHub Comments

6 reactions

ag027592 commented, Sep 5, 2021

@patrickvonplaten Hi, I ran into the same problem. Do you have a solution? Thank you. I ran this code:

model, _, _ = fairseq.checkpoint_utils.load_model_ensemble_and_task([cp_path])

I got the error:

ConfigKeyError: Key 'target_dict' not in 'AudioPretrainingConfig'
	full_key: target_dict
	reference_type=Optional[AudioPretrainingConfig]
	object_type=AudioPretrainingConfig
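
The error text suggests that the saved config carries a key that the config class chosen at load time does not declare. Here is a minimal sketch of that failure mode using plain dataclasses. `ToyPretrainingConfig` and `undeclared_keys` are hypothetical stand-ins made up for illustration; this is not fairseq or OmegaConf code, and the values in the sample dict are invented.

```python
from dataclasses import dataclass, fields
from typing import Any, List, Mapping, Optional


@dataclass
class ToyPretrainingConfig:
    # Hypothetical stand-in for a task config; NOT fairseq's AudioPretrainingConfig.
    data: Optional[str] = None
    sample_rate: int = 16000


def undeclared_keys(saved: Mapping[str, Any], schema: type) -> List[str]:
    """Return the keys in `saved` that the dataclass `schema` does not declare."""
    declared = {f.name for f in fields(schema)}
    return sorted(k for k in saved if k not in declared)


# A saved task config containing the two keys named in the error reports
# in this thread (the values here are made up).
saved_task_cfg = {"data": "path/to/dict", "eval_wer": False, "target_dict": None}

print(undeclared_keys(saved_task_cfg, ToyPretrainingConfig))
# → ['eval_wer', 'target_dict']
```

A strict structured config would reject each of those keys, which would match the shape of the `ConfigKeyError` reported here.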

2 reactions

Kristopher-Chen commented, Mar 1, 2022

I am still getting this error:

ConfigKeyError: Key 'eval_wer' not in 'AudioPretrainingConfig'
	full_key: eval_wer
	reference_type=Optional[AudioPretrainingConfig]
	object_type=AudioPretrainingConfig

while running the code

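import torch
import fairseq

cp_path = '../w2v_large_lv_fsh_swbd_cv_ftsb300_updated.pt'
# This call raises the ConfigKeyError shown above.
model, cfg, task = fairseq.checkpoint_utils.load_model_ensemble_and_task([cp_path])
model = model[0]  # a list of models is returned; take the first one
model.eval()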

Package                Version         Location
---------------------- --------------- --------------------------------------------
antlr4-python3-runtime 4.8
backcall               0.2.0
bitarray               2.3.7
certifi                2021.10.8
cffi                   1.15.0
colorama               0.4.4
Cython                 0.29.28
debugpy                1.5.1
decorator              5.1.1
entrypoints            0.3
fairseq                1.0.0a0+5175fd5 
hydra-core             1.0.7
ipykernel              6.4.1
ipython                7.31.1
ipython-genutils       0.2.0
jedi                   0.18.1
jupyter-client         7.1.2
jupyter-core           4.9.1
matplotlib-inline      0.1.2
nest-asyncio           1.5.1
numpy                  1.22.2
omegaconf              2.0.6
parso                  0.8.3
pexpect                4.8.0
pickleshare            0.7.5
pip                    21.2.4
portalocker            2.4.0
prompt-toolkit         3.0.20
protobuf               3.19.4
ptyprocess             0.7.0
pycparser              2.21
Pygments               2.11.2
python-dateutil        2.8.2
PyYAML                 6.0
pyzmq                  22.3.0
regex                  2022.1.18
sacrebleu              2.0.0
setuptools             58.0.4
six                    1.16.0
tabulate               0.8.9
tensorboardX           2.5
torch                  1.10.2
torchaudio             0.10.2
tornado                6.1
tqdm                   4.62.3
traitlets              5.1.1
typing_extensions      4.1.1
wcwidth                0.2.5
wheel                  0.37.1

Same problem. Have you solved it?
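
For anyone debugging this, it may help to look at what the checkpoint file itself stores before fairseq tries to build a task from it. The sketch below assumes the checkpoint is a dict with a `cfg` entry that contains a `task` section with a `_name` field; that layout is an assumption and is not confirmed anywhere in this thread. `describe_saved_task` and `load_and_describe` are hypothetical helper names.

```python
from typing import Any, Dict, Mapping


def describe_saved_task(state: Mapping[str, Any]) -> Dict[str, Any]:
    """Summarize the task section of a loaded checkpoint dict.

    Assumes the layout {"cfg": {"task": {...}}}; returns empty results
    when those entries are missing.
    """
    cfg = state.get("cfg") or {}
    task = cfg.get("task") or {}
    return {
        "task_name": task.get("_name"),
        "task_keys": sorted(k for k in task if k != "_name"),
    }


def load_and_describe(cp_path: str) -> Dict[str, Any]:
    """Load a checkpoint on CPU and summarize its saved task config."""
    import torch  # imported lazily so describe_saved_task has no torch dependency

    # Written against the torch 1.10.2 environment listed above.
    state = torch.load(cp_path, map_location="cpu")
    return describe_saved_task(state)
```

Calling `load_and_describe(cp_path)` with the path used above would show which task name was saved and whether keys such as `eval_wer` or `target_dict` sit under it.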
