"terminate called after throwing an instance of 'c10::Error'"


Hello,

Thanks for sharing the code. When I ran

python train.py --config configs/maml/halfcheetah-vel.yaml --output-folder maml-halfcheetah-vel --seed 1 --num-workers 8

it gave me this error: "terminate called after throwing an instance of 'c10::Error'".

I checked that all the requirements are satisfied. What could be the problem?

Thanks

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Comments: 9 (5 by maintainers)

Top GitHub Comments

1 reaction
tristandeleu commented, May 22, 2020

It looks like this is a CUDA error. It could be a problem with the multiprocessing context, and adding mp.set_start_method('spawn') might solve it. I would suggest running the code on CPU instead (the networks are small enough that this shouldn't be a bottleneck); this code was not tested on GPU.
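
Both suggestions above can be sketched in a few lines. This is only an illustration, not code from the repository: the helper names force_cpu and use_spawn are hypothetical, and where such calls would belong in train.py is an assumption. It uses the standard multiprocessing module; torch.multiprocessing wraps that module, so the same call should apply there. Hiding the GPUs through CUDA_VISIBLE_DEVICES only has an effect if it happens before anything initializes CUDA.

```python
import multiprocessing as mp
import os


def force_cpu():
    """Hide all CUDA devices from this process and its children.

    Assumption: this runs before any library has initialized CUDA.
    """
    os.environ["CUDA_VISIBLE_DEVICES"] = ""


def use_spawn():
    """Select the 'spawn' start method and return the active method name."""
    # force=True lets the call succeed even if a start method was already set.
    mp.set_start_method("spawn", force=True)
    return mp.get_start_method()


if __name__ == "__main__":
    force_cpu()
    print(use_spawn())  # prints "spawn"
```

With 'spawn', each worker starts in a fresh interpreter rather than inheriting the parent's state through fork, which is why it is the usual remedy when a forked worker fails to initialize CUDA. Either measure alone may be enough; the CPU route is the one the maintainer recommends.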

0 reactions
qingerVT commented, May 22, 2020

This is the full traceback. Thanks!

terminate called after throwing an instance of 'c10::Error'
  what(): CUDA error: initialization error (setDevice at /opt/conda/conda-bld/pytorch_1579040055865/work/c10/cuda/impl/CUDAGuardImpl.h:42)
frame #0: c10::Error::Error(c10::SourceLocation, std::string const&) + 0x47 (0x7efcc564c627 in /efs/qinsun/anaconda3/lib/python3.7/site-packages/torch/lib/libc10.so)
frame #1: <unknown function> + 0xecf2 (0x7efcc5880cf2 in /efs/qinsun/anaconda3/lib/python3.7/site-packages/torch/lib/libc10_cuda.so)
frame #2: torch::autograd::Engine::set_device(int) + 0x159 (0x7efccaf3c419 in /efs/qinsun/anaconda3/lib/python3.7/site-packages/torch/lib/libtorch.so)
frame #3: torch::autograd::Engine::thread_init(int) + 0x1a (0x7efccaf3cd9a in /efs/qinsun/anaconda3/lib/python3.7/site-packages/torch/lib/libtorch.so)
frame #4: torch::autograd::python::PythonEngine::thread_init(int) + 0x2a (0x7efcf6a98faa in /efs/qinsun/anaconda3/lib/python3.7/site-packages/torch/lib/libtorch_python.so)
frame #5: <unknown function> + 0xc819d (0x7efcf638519d in /efs/qinsun/anaconda3/lib/python3.7/site-packages/torch/…/…/…/libstdc++.so.6)
frame #6: <unknown function> + 0x76ba (0x7efd057c56ba in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #7: clone + 0x6d (0x7efd054fb41d in /lib/x86_64-linux-gnu/libc.so.6)


