[BUG] LocalCUDACluster doesn't work with NVIDIA MIG
See original GitHub issue

(py)nvml does not appear to be compatible with MIG, which prevents various Dask services from working correctly, for example `LocalCUDACluster`.
While this isn't explicitly Dask-CUDA's fault, the end result is the same. Adding this issue for others to reference, and for discussion of potential workarounds.
```python
from dask_cuda import LocalCUDACluster
from dask.distributed import Client

cluster = LocalCUDACluster(device_memory_limit=1.0, rmm_managed_memory=True)
client = Client(cluster)
```

```
---------------------------------------------------------------------------
NVMLError_NoPermission Traceback (most recent call last)
<ipython-input-1-48e0ebf5a2e9> in <module>
33
34
---> 35 cluster = LocalCUDACluster(device_memory_limit=1.0,
36 rmm_managed_memory=True)
37 client = Client(cluster)
/opt/conda/envs/rapids/lib/python3.8/site-packages/dask_cuda/local_cuda_cluster.py in __init__(self, n_workers, threads_per_worker, processes, memory_limit, device_memory_limit, CUDA_VISIBLE_DEVICES, data, local_directory, protocol, enable_tcp_over_ucx, enable_infiniband, enable_nvlink, enable_rdmacm, ucx_net_devices, rmm_pool_size, rmm_managed_memory, jit_unspill, **kwargs)
166 memory_limit, threads_per_worker, n_workers
167 )
--> 168 self.device_memory_limit = parse_device_memory_limit(
169 device_memory_limit, device_index=0
170 )
/opt/conda/envs/rapids/lib/python3.8/site-packages/dask_cuda/utils.py in parse_device_memory_limit(device_memory_limit, device_index)
478 device_memory_limit = float(device_memory_limit)
479 if isinstance(device_memory_limit, float) and device_memory_limit <= 1:
--> 480 return int(get_device_total_memory(device_index) * device_memory_limit)
481
482 if isinstance(device_memory_limit, str):
/opt/conda/envs/rapids/lib/python3.8/site-packages/dask_cuda/utils.py in get_device_total_memory(index)
158 """
159 pynvml.nvmlInit()
--> 160 return pynvml.nvmlDeviceGetMemoryInfo(
161 pynvml.nvmlDeviceGetHandleByIndex(index)
162 ).total
/opt/conda/envs/rapids/lib/python3.8/site-packages/pynvml/nvml.py in nvmlDeviceGetMemoryInfo(handle)
1286 fn = get_func_pointer("nvmlDeviceGetMemoryInfo")
1287 ret = fn(handle, byref(c_memory))
-> 1288 check_return(ret)
1289 return c_memory
1290
/opt/conda/envs/rapids/lib/python3.8/site-packages/pynvml/nvml.py in check_return(ret)
364 def check_return(ret):
365 if (ret != NVML_SUCCESS):
--> 366 raise NVMLError(ret)
367 return ret
368
NVMLError_NoPermission: Insufficient Permissions
```
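The traceback shows the failure happening in `parse_device_memory_limit`: a fractional `device_memory_limit` (here `1.0`) is resolved against the device's total memory, which triggers the `pynvml` query that MIG rejects. A possible interim sidestep, sketched below and untested under MIG, is to pass an explicit byte count so that this particular NVML call should not be reached on this code path (other NVML queries in Dask-CUDA may still fail):

```python
from dask_cuda import LocalCUDACluster
from dask.distributed import Client

# Hypothetical workaround sketch: an explicit size such as "4GB" takes the
# string branch of parse_device_memory_limit instead of being multiplied by
# the total device memory, so get_device_total_memory (and its failing NVML
# call) is skipped for this particular argument.
cluster = LocalCUDACluster(device_memory_limit="4GB", rmm_managed_memory=True)
client = Client(cluster)
```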
Issue Analytics
- State:
- Created: 2 years ago
- Comments: 34 (20 by maintainers)
I do not have access to an A100, but the latest (unreleased) version of pynvml should include MIG-supported NVML bindings. I believe we will need to modify `get_device_total_memory` to optionally pass a MIG device handle when necessary. As a first-order functionality test, someone could try adding a try/except for the current `NVMLError` and retry with a MIG handle, e.g. the sketch below.
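A rough sketch of what that retry could look like, assuming a pynvml build that exposes the MIG bindings (e.g. `nvmlDeviceGetMigDeviceHandleByIndex`); the `mig_index` argument is assumed for illustration and this is not the final Dask-CUDA implementation:

```python
import pynvml


def get_device_total_memory(index=0, mig_index=0):
    """Return the total memory of device ``index``, falling back to the MIG
    instance ``mig_index`` when the parent GPU has MIG mode enabled."""
    pynvml.nvmlInit()
    handle = pynvml.nvmlDeviceGetHandleByIndex(index)
    try:
        # Raises NVMLError_NoPermission on a MIG-enabled parent GPU
        return pynvml.nvmlDeviceGetMemoryInfo(handle).total
    except pynvml.NVMLError:
        # Retry against one of the parent GPU's MIG device handles instead
        mig_handle = pynvml.nvmlDeviceGetMigDeviceHandleByIndex(handle, mig_index)
        return pynvml.nvmlDeviceGetMemoryInfo(mig_handle).total
```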
I tested a few things. I used a VM on AWS which has 8 A100 GPUs. I enabled MIG on GPU 0 and divided it into 7 5GB instances.

(Expand to see MIG instances configuration.)
Some of the tests were done in a notebook running on the bare VM. For other tests, I am using the RAPIDS 21.06 Docker container, where I restrict which GPUs the container can see using the `--gpus` flag. I will describe the setup as needed.

Observations:
- Currently, `LocalCUDACluster` requires the `CUDA_VISIBLE_DEVICES` argument to use the `MIG-GPU-` prefix if we want to specify MIG instances: https://github.com/rapidsai/dask-cuda/blob/branch-21.08/dask_cuda/utils.py#L467. Non-MIG GPUs can be specified via integers or with the prefix `GPU-` (see the sketch after this list).
- `LocalCUDACluster` fails when I try to use MIG instances by specifying the MIG-enabled GPU by its index, `CUDA_VISIBLE_DEVICES="0"`. This is directly on the VM. (Expand to see error details.)
  Note: If we test the same by attaching GPU 0 by index to a Docker container via `docker run --gpus '"device=0"' --rm -it rapidsai/rapidsai:21.06-cuda11.0-runtime-ubuntu18.04-py3.8`, we get the same error as in the next bullet point.
- `LocalCUDACluster` fails when I try to use MIG instances from inside a Docker container (a case similar to when we run things with GKE or EKS). I start the Docker container with `docker run --gpus '"device=0:0,0:1,0:2"' --rm -it rapidsai/rapidsai:21.06-cuda11.0-runtime-ubuntu18.04-py3.8` to allow the container to see only the 1st, 2nd, and 3rd MIG instances of GPU 0. (Expand to see error details.)
  This error goes away if I make the changes mentioned in https://github.com/rapidsai/dask-cuda/issues/583#issuecomment-878349249 in `nvmlDeviceGetMemoryInfo`. But `nvmlDeviceGetMemoryInfo` needs both the handle of the parent GPU and the MIG instance index. These are not passed in correctly at the moment; however, we no longer get the permissions error. Hence we will need to handle these changes in dask-cuda code. (Expand to see image.)
- `LocalCUDACluster` fails, but with a different error, when I try to use MIG instances directly without Docker and use `CUDA_VISIBLE_DEVICES` to denote the MIG instances. Need to investigate further. (Expand to see error details.)
- `LocalCUDACluster` succeeds if I use non-MIG GPUs directly, with or without Docker.
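For reference, a minimal illustration of the first observation. The device strings below are placeholders, and the exact MIG naming (here the pre-R470 `MIG-GPU-<parent UUID>/<GPU instance>/<compute instance>` form) depends on the driver and on what `nvidia-smi -L` reports:

```python
from dask_cuda import LocalCUDACluster

# Placeholder MIG identifiers; substitute the values reported by `nvidia-smi -L`.
# MIG instances need the "MIG-GPU-" prefix; whole GPUs can use an index or "GPU-<uuid>".
mig_devices = [
    "MIG-GPU-<parent-uuid>/1/0",
    "MIG-GPU-<parent-uuid>/2/0",
]

cluster = LocalCUDACluster(CUDA_VISIBLE_DEVICES=",".join(mig_devices))
```

Whether the workers then come up correctly is exactly what the observations above are tracking.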
Based on these PoCs, there appear to be some existing discrepancies. We think that we first need to properly identify what type of device each entry in `CUDA_VISIBLE_DEVICES` refers to. Once we do that, we then need to query the GPUs with the right NVML calls via the right `pynvml` API in several places, such as `get_cpu_affinity`, `get_device_total_memory`, etc.

Action plan after discussion with @pentschev:
1. Firstly, map the MIG counterparts of the `pynvml` API we use in `dask_cuda/utils.py`. We should be able to write an `is_mig_device` utils function which parses a device index and returns whether it is a MIG device or not (see the sketch at the end of this comment). This can subsequently be used in `get_cpu_affinity` and `get_device_total_memory` to pick the correct `pynvml` APIs.
2. Secondly, add a more user-friendly error when trying to start a CUDA worker on a MIG-enabled device. See error 2 above.
3. Thirdly, add handling of the default Dask-CUDA setup when we use a hybrid deployment of MIG-enabled and MIG-disabled GPUs. Suppose we have a deployment where the user wants to have the following configuration:
Three possible solution approaches are applicable in such a scenario:
a. We rely on the default behavior and create workers only on the non-MIG devices, and create workers on MIG devices only when they are explicitly specified via `CUDA_VISIBLE_DEVICES`.
b. Add a new argument `--mig` that will create workers using all MIG devices (and ignore the non-MIG ones), where the default behavior (when `--mig` is NOT specified) would be to create workers on all non-MIG devices.
c. Create 3 workers with 3 completely different memory sizes and characteristics. Generally a bad idea.

This perhaps needs much more discussion before we do something.
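As a starting point for item 1, a hypothetical sketch of such a helper. The names (`is_mig_device`, `get_nvml_device_handle`), the prefix logic, and the UUID-based handle lookup are assumptions for discussion, not the merged dask-cuda code:

```python
import pynvml


def is_mig_device(dev):
    """Return True if ``dev`` (one CUDA_VISIBLE_DEVICES entry) denotes a MIG
    instance rather than a plain GPU index ("0") or GPU UUID ("GPU-<uuid>")."""
    return str(dev).startswith("MIG-")


def get_nvml_device_handle(dev):
    """Fetch the NVML handle matching ``dev`` so helpers such as
    ``get_device_total_memory`` or ``get_cpu_affinity`` can query the
    right (MIG or non-MIG) device."""
    pynvml.nvmlInit()
    dev = str(dev)
    if dev.startswith(("GPU-", "MIG-")):
        # UUID-style entries ("GPU-<uuid>", and "MIG-<uuid>" on recent NVML
        # builds) can be resolved directly; the older
        # "MIG-GPU-<parent>/<gi>/<ci>" enumeration form would additionally
        # need the parent handle plus nvmlDeviceGetMigDeviceHandleByIndex,
        # which this sketch leaves out. The binding expects a char pointer,
        # hence the encode() to bytes.
        return pynvml.nvmlDeviceGetHandleByUUID(dev.encode())
    return pynvml.nvmlDeviceGetHandleByIndex(int(dev))
```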