Segmentation fault when trying to load models
We are using Azure ML pipelines to train our transformers models. This had been working for a few weeks, but recently (we first noticed a few days ago) we get a segmentation fault when trying to initialize a model.
I tried just loading the models locally this morning and hit the same issue. See the snippet below.
# config_class, tokenizer_class, model_class come from our model-class mapping
config = config_class.from_pretrained(model_name, num_labels=10)
tokenizer = tokenizer_class.from_pretrained(model_name, do_lower_case=False)
model = model_class.from_pretrained("distilroberta-base", from_tf=False, config=config)
I also tried downloading the *_model.bin file and passing a local path instead of the model name, and still got a segmentation fault. Using bert-base-uncased instead of distilroberta-base gave the same result.
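As a debugging aid (not part of the original report), Python's built-in faulthandler module can dump the Python-level traceback when the interpreter crashes, which helps pinpoint which from_pretrained call triggers the segfault:

```python
# Enable faulthandler before loading anything: on a segmentation fault it
# prints the Python traceback to stderr instead of dying silently.
import faulthandler

faulthandler.enable()

# ...then run the loading code as before, e.g.:
# model = model_class.from_pretrained("distilroberta-base", config=config)
```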
I am running on Ubuntu, with the following package versions:
torch==1.3.0
tokenizers==0.0.11
transformers==2.4.1
UPDATE:
I adapted some of the example scripts and they ran successfully, so I think the issue is that our code uses…
"roberta": (RobertaConfig, RobertaForTokenClassification, RobertaTokenizer),
"mroberta": (RobertaConfig, RobertaForMultiLabelTokenClassification, RobertaTokenizer), # our custom multilabel class
instead of what the example scripts use…
AutoConfig,
AutoModelForTokenClassification,
AutoTokenizer,
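For comparison, here is a minimal sketch of the Auto-class loading path that the example scripts use. The function name and arguments are illustrative; the Auto* classes resolve the concrete architecture from the checkpoint's config, rather than relying on an explicit per-model mapping:

```python
from transformers import AutoConfig, AutoModelForTokenClassification, AutoTokenizer


def load_for_token_classification(model_name: str, num_labels: int):
    # AutoConfig/AutoTokenizer/AutoModelForTokenClassification pick the
    # concrete classes (e.g. RobertaConfig, RobertaTokenizer, ...) based on
    # the checkpoint's config.json.
    config = AutoConfig.from_pretrained(model_name, num_labels=num_labels)
    tokenizer = AutoTokenizer.from_pretrained(model_name, do_lower_case=False)
    model = AutoModelForTokenClassification.from_pretrained(model_name, config=config)
    return config, tokenizer, model
```

Note this cannot cover a custom head like RobertaForMultiLabelTokenClassification, which would still need an explicit mapping entry.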
Was there a breaking change to the model files recently that would mean our use of the “non-auto” classes is no longer supported?
UPDATE 2:
Our original code does not cause a Segmentation fault on Windows.
Issue Analytics: created 3 years ago · 7 comments (2 by maintainers)
Downgrading to sentencepiece==0.1.91 solved it. I am using PyTorch 1.2.0 + transformers 3.0.0.
Bumping to torch==1.5.1 fixes this issue, but it’s still unclear why.