Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False.
I'm using Ray to run prediction over a large number of files.
The PyTorch model was trained on a GPU. The machine has one GPU, and if I run torch.cuda.is_available(), it returns True.
But I still get the error:
File "python/ray/_raylet.pyx", line 410, in ray._raylet.execute_task
File "python/ray/_raylet.pyx", line 427, in ray._raylet.execute_task
File "/home/yujl/anaconda3/lib/python3.7/site-packages/ray/serialization.py", line 312, in deserialize_objects
self._deserialize_object(data, metadata, object_id))
File "/home/yujl/anaconda3/lib/python3.7/site-packages/ray/serialization.py", line 252, in _deserialize_object
return self._deserialize_msgpack_data(data, metadata)
File "/home/yujl/anaconda3/lib/python3.7/site-packages/ray/serialization.py", line 233, in _deserialize_msgpack_data
python_objects = self._deserialize_pickle5_data(pickle5_data)
File "/home/yujl/anaconda3/lib/python3.7/site-packages/ray/serialization.py", line 221, in _deserialize_pickle5_data
obj = pickle.loads(in_band)
File "/home/yujl/anaconda3/lib/python3.7/site-packages/torch/storage.py", line 142, in _load_from_bytes
return torch.load(io.BytesIO(b))
File "/home/yujl/anaconda3/lib/python3.7/site-packages/torch/serialization.py", line 585, in load
return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
File "/home/yujl/anaconda3/lib/python3.7/site-packages/torch/serialization.py", line 765, in _legacy_load
result = unpickler.load()
File "/home/yujl/anaconda3/lib/python3.7/site-packages/torch/serialization.py", line 721, in persistent_load
deserialized_objects[root_key] = restore_location(obj, location)
File "/home/yujl/anaconda3/lib/python3.7/site-packages/torch/serialization.py", line 174, in default_restore_location
result = fn(storage, location)
File "/home/yujl/anaconda3/lib/python3.7/site-packages/torch/serialization.py", line 150, in _cuda_deserialize
device = validate_cuda_device(location)
File "/home/yujl/anaconda3/lib/python3.7/site-packages/torch/serialization.py", line 134, in validate_cuda_device
raise RuntimeError('Attempting to deserialize object on a CUDA '
RuntimeError: Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False. If you are running on a CPU-only machine, please use torch.load with map_location=torch.device('cpu') to map your storages to the CPU.
How do I use the model on GPU?
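The error happens because the pickled object carries CUDA storages, and the process that unpickles it cannot see a GPU. As the message suggests, passing map_location to torch.load remaps those storages at load time. A minimal sketch (using an in-memory buffer instead of a real checkpoint file, purely for illustration):

```python
import io
import torch

# Pick whatever device is actually visible to the current process.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# For illustration, serialize a tiny state_dict in memory.
buf = io.BytesIO()
torch.save({"weight": torch.zeros(3)}, buf)
buf.seek(0)

# map_location remaps storages to `device` during deserialization,
# avoiding the "torch.cuda.is_available() is False" RuntimeError.
state_dict = torch.load(buf, map_location=device)
print(state_dict["weight"].device)
```

Note that this only helps where you call torch.load yourself; in the traceback above the load happens inside Ray's deserializer, so the fix is to make sure the worker process deserializing the object can actually see the GPU (see below).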
Issue Analytics
- State:
- Created: 3 years ago
- Comments: 24 (10 by maintainers)
Hey @junliangyu96, your code has ray.init(num_cpus=32, num_gpus=1), which is not the same as ray.remote(num_gpus). ray.init declares resources globally for the cluster; ray.remote requests them per task. Everywhere you want to use a GPU, in every function, you need to do the following:
Same issue with Ray 1.12.1. I tried to use Ray's Queue to send the state_dict to the remote machine, but it failed with the same error.
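One workaround that avoids the deserialization error entirely is to move every tensor to CPU before handing the state_dict to Ray, so unpickling never touches CUDA; the receiving task then moves it back to its own device. A hedged sketch (to_cpu_state_dict is a hypothetical helper, not a Ray or PyTorch API):

```python
import torch

def to_cpu_state_dict(state_dict):
    # Copy every tensor to CPU so the pickled payload contains no
    # CUDA storages, regardless of where the model was trained.
    return {k: v.cpu() for k, v in state_dict.items()}

# Illustrative model; in the original report this would be the
# GPU-trained prediction model.
model = torch.nn.Linear(4, 2)
cpu_sd = to_cpu_state_dict(model.state_dict())

# Safe to put into Ray's object store or a Queue; the receiver calls
# model.load_state_dict(cpu_sd) and then model.to(its_own_device).
assert all(t.device.type == "cpu" for t in cpu_sd.values())
```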