RuntimeError: sigmoid_focal_loss_forward_impl: implementation for device cuda:0 not found.

Hi, I tried to train the model for LiDAR-only detector using this command:

torchpack dist-run -np 8 python tools/train.py configs/nuscenes/det/transfusion/secfpn/lidar/voxelnet_0p075.yaml

but got the following error. All the other training commands work fine except this one. Do I need to build any additional library? Any suggestions? Thanks.

Traceback (most recent call last):
  File "tools/train.py", line 87, in <module>
    main()
  File "tools/train.py", line 76, in main
    train_model(
  File "/home/trainer/bevnet/mmdet3d/apis/train.py", line 126, in train_model
    runner.run(data_loaders, [("train", 1)])
  File "/usr/local/lib/python3.8/dist-packages/mmcv/runner/epoch_based_runner.py", line 127, in run
    epoch_runner(data_loaders[i], **kwargs)
  File "/home/trainer/bevnet/mmdet3d/runner/epoch_based_runner.py", line 14, in train
    super().train(data_loader, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/mmcv/runner/epoch_based_runner.py", line 50, in train
    self.run_iter(data_batch, train_mode=True, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/mmcv/runner/epoch_based_runner.py", line 29, in run_iter
    outputs = self.model.train_step(data_batch, self.optimizer,
  File "/usr/local/lib/python3.8/dist-packages/mmcv/parallel/distributed.py", line 52, in train_step
    output = self.module.train_step(*inputs[0], **kwargs[0])
  File "/home/trainer/bevnet/mmdet3d/models/fusion_models/base.py", line 78, in train_step
    losses = self(**data)
  File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/mmcv/runner/fp16_utils.py", line 128, in new_func
    output = old_func(*new_args, **new_kwargs)
  File "/home/trainer/bevnet/mmdet3d/models/fusion_models/bevfusion.py", line 187, in forward
    outputs = self.forward_single(
  File "/usr/local/lib/python3.8/dist-packages/mmcv/runner/fp16_utils.py", line 128, in new_func
    output = old_func(*new_args, **new_kwargs)
  File "/home/trainer/bevnet/mmdet3d/models/fusion_models/bevfusion.py", line 269, in forward_single
    losses = head.loss(gt_bboxes_3d, gt_labels_3d, pred_dict)
  File "/usr/local/lib/python3.8/dist-packages/mmcv/runner/fp16_utils.py", line 214, in new_func
    output = old_func(*new_args, **new_kwargs)
  File "/home/trainer/bevnet/mmdet3d/models/heads/bbox/transfusion.py", line 645, in loss
    layer_loss_cls = self.loss_cls(
  File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/mmdet/models/losses/focal_loss.py", line 233, in forward
    loss_cls = self.loss_weight * calculate_loss_func(
  File "/usr/local/lib/python3.8/dist-packages/mmdet/models/losses/focal_loss.py", line 139, in sigmoid_focal_loss
    loss = _sigmoid_focal_loss(pred.contiguous(), target.contiguous(), gamma,
  File "/usr/local/lib/python3.8/dist-packages/mmcv/ops/focal_loss.py", line 55, in forward
    ext_module.sigmoid_focal_loss_forward(
RuntimeError: sigmoid_focal_loss_forward_impl: implementation for device cuda:0 not found.
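
For anyone hitting the same error: a message of the form "implementation for device cuda:0 not found" usually suggests that the installed mmcv ops were compiled without CUDA support, although this thread does not confirm that this is the cause here. Below is a minimal diagnostic sketch, assuming mmcv 1.x with compiled ops (mmcv-full). The helper name `check_mmcv_cuda_build` and the report keys are hypothetical; `torch.cuda.is_available()` and `mmcv.ops.get_compiling_cuda_version()` are the only library calls used.

```python
import importlib.util


def check_mmcv_cuda_build():
    """Collect facts about the local torch/mmcv build.

    Hypothetical helper for illustration; it never raises if a package
    is missing or broken, it just records what it could find out.
    """
    report = {
        "torch_installed": False,
        "torch_cuda_available": None,
        "mmcv_installed": False,
        "mmcv_compiled_cuda": None,
    }

    # Is PyTorch present, and can it see a GPU at runtime?
    if importlib.util.find_spec("torch") is not None:
        report["torch_installed"] = True
        try:
            import torch
            report["torch_cuda_available"] = torch.cuda.is_available()
        except Exception as exc:
            report["torch_cuda_available"] = f"unavailable ({exc!r})"

    # Which CUDA version (if any) were the mmcv ops compiled against?
    if importlib.util.find_spec("mmcv") is not None:
        report["mmcv_installed"] = True
        try:
            from mmcv.ops import get_compiling_cuda_version
            report["mmcv_compiled_cuda"] = get_compiling_cuda_version()
        except Exception as exc:
            # e.g. an mmcv install without the compiled extension module
            report["mmcv_compiled_cuda"] = f"unavailable ({exc!r})"

    return report


if __name__ == "__main__":
    for key, value in check_mmcv_cuda_build().items():
        print(f"{key}: {value}")
```

If `torch_cuda_available` is true but `mmcv_compiled_cuda` does not report a CUDA version, one plausible remedy (an assumption, not something stated in this thread) is to reinstall mmcv-full so that its ops are built with CUDA, for example by building from source with `MMCV_WITH_OPS=1 FORCE_CUDA=1` set and the CUDA toolkit available.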

Issue Analytics

Created 10 months ago
Comments: 12 (5 by maintainers)

Top GitHub Comments

YoushaaMurhij commented, Nov 16, 2022

I will try that.

kentang-mit commented, Dec 10, 2022

Please let me know if the solution does not work. We are doing a major internal reformat of this codebase to remove unnecessary dependencies on mmcv/mmdet and make installation easier, but this process may take a relatively long time (months).