User Prompt (Modified)

I cloned the BEVFusion repo and tried to build the Dockerfile, but the build failed with errors. After a few attempts the build succeeded, but PyTorch cannot see the GPU. Give me the modified Dockerfile.

...
error logs
...

Response

The issue is that when you pass --override-channels with only -c pytorch, Mamba can search nothing but the pytorch channel, so it often cannot resolve the dependencies of the GPU build (such as cudatoolkit), which are usually hosted on conda-forge or defaults. When the solver can’t satisfy the dependencies of the GPU build, it can silently fall back to the CPU build.

The most reliable way to get a GPU build of PyTorch 1.10.1 for CUDA 11.3 in Docker is to install it with pip from the official PyTorch cu113 wheel index (download.pytorch.org), rather than with Conda. This bypasses the Conda solver entirely.

Here is the fixed Dockerfile. I have also cleaned up a conflict in your OpenMMLab installation: you were installing both mmcv and mmcv-full, and since both provide the same mmcv Python package, one overwrites the other.
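
To check an existing environment for the mmcv / mmcv-full clash, a minimal sketch along these lines may help. It only uses the standard library; the helper names `find_conflicts` and `installed_distributions` are made up for this example and are not part of mmcv or pip.

```python
from importlib import metadata

# Both distributions ship the same top-level `mmcv` package,
# so installing them side by side lets one overwrite the other.
CONFLICTING = ({"mmcv", "mmcv-full"},)


def find_conflicts(installed):
    """Return every conflicting group that is fully present in `installed`."""
    names = {name.lower().replace("_", "-") for name in installed}
    return [sorted(group) for group in CONFLICTING if group <= names]


def installed_distributions():
    """Names of all distributions visible to the current interpreter."""
    names = (dist.metadata["Name"] for dist in metadata.distributions())
    return [name for name in names if name]


if __name__ == "__main__":
    conflicts = find_conflicts(installed_distributions())
    if conflicts:
        for group in conflicts:
            print("Conflict: " + " and ".join(group) + " are both installed")
    else:
        print("No mmcv / mmcv-full conflict found")
```

If it reports a conflict, uninstalling both packages and reinstalling only mmcv-full is the usual way out.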

Corrected Dockerfile

FROM nvidia/cuda:11.3.1-devel-ubuntu20.04

ENV DEBIAN_FRONTEND=noninteractive
RUN apt-get update && apt-get install -yq \
        wget git build-essential g++ gcc \
        libgl1-mesa-glx libglib2.0-0 \
        openmpi-bin openmpi-common libopenmpi-dev libgtk2.0-dev && \
    rm -rf /var/lib/apt/lists/*

# 1. Setup Miniforge (Python 3.8)
ENV CONDA_DIR=/opt/conda
RUN wget --quiet https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-Linux-x86_64.sh -O ~/miniforge.sh && \
     /bin/bash ~/miniforge.sh -b -p /opt/conda
ENV PATH=$CONDA_DIR/bin:$PATH

RUN mamba install -y python=3.8

# 2. CRITICAL FIX: Install PyTorch via PIP with the CUDA 11.3 wheel.
# This guarantees you get the GPU version.
RUN pip install torch==1.10.1+cu113 torchvision==0.11.2+cu113 torchaudio==0.10.1 \
    -f https://download.pytorch.org/whl/cu113/torch_stable.html

# 3. Install other packages
RUN pip install Pillow==8.4.0
RUN pip install tqdm
RUN pip install torchpack
RUN pip install nuscenes-devkit
RUN pip install mpi4py==3.0.3
RUN pip install numba==0.48.0
RUN pip install numpy==1.23.5
RUN pip install flash-attn==0.2.0

# 4. OPTIMIZATION: Install mmcv-full using pre-built wheels.
# Installing "mmcv" AND "mmcv-full" causes conflicts. Only install mmcv-full.
# Using the -f link prevents compiling from source (which takes 20 mins).
RUN pip install mmcv-full==1.4.0 -f https://download.openmmlab.com/mmcv/dist/cu113/torch1.10.0/index.html
RUN pip install mmdet==2.20.0

# 5. Set Environment variables to ensure setup.py uses CUDA
ENV FORCE_CUDA="1"
ENV TORCH_CUDA_ARCH_LIST="6.0 6.1 7.0 7.5 8.0 8.6+PTX"
ENV TORCH_NVCC_FLAGS="-Xfatbin -compress-all"
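
Whether the TORCH_CUDA_ARCH_LIST above covers a particular GPU can be checked with a small sketch like the following. It assumes the usual CUDA compatibility rules: a binary built for compute capability X.y runs on devices X.z with z >= y, and an entry marked +PTX can be JIT-compiled for any device of equal or higher capability. The helpers `parse_arch_list` and `is_covered` are hypothetical names for this example, and the parser only handles numeric entries, not architecture names such as "Ampere".

```python
import os


def parse_arch_list(value):
    """Split a TORCH_CUDA_ARCH_LIST string into (major, minor, has_ptx) tuples."""
    archs = []
    for token in value.replace(";", " ").split():
        has_ptx = token.endswith("+PTX")
        number = token[:-len("+PTX")] if has_ptx else token
        major, minor = number.split(".")
        archs.append((int(major), int(minor), has_ptx))
    return archs


def is_covered(capability, arch_list):
    """True if a device with `capability` (major, minor) can run the build."""
    major, minor = capability
    for a_major, a_minor, has_ptx in parse_arch_list(arch_list):
        # Native binary: same major version, minor not newer than the device.
        if a_major == major and a_minor <= minor:
            return True
        # PTX: can be JIT-compiled for any device at least as new.
        if has_ptx and (a_major, a_minor) <= (major, minor):
            return True
    return False


if __name__ == "__main__":
    arch_list = os.environ.get(
        "TORCH_CUDA_ARCH_LIST", "6.0 6.1 7.0 7.5 8.0 8.6+PTX"
    )
    try:
        import torch
    except ImportError:
        torch = None
    if torch is not None and torch.cuda.is_available():
        cap = torch.cuda.get_device_capability(0)
        print(cap, "covered:", is_covered(cap, arch_list))
    else:
        print("No visible GPU; nothing to check.")
```

Run it inside the container after starting it with GPU access; during `docker build` no GPU is normally visible, which is one reason the list is set explicitly.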

Steps to proceed:

Rebuild the image:
```
docker build -t bevfusion .
```

Run the container (Ensure you pass --gpus all):

docker run -it --gpus all --shm-size 16g \
  -v $(pwd):/home/bevfusion \
  -v /data/datasets/nuscenes/dataset:/dataset \
  bevfusion /bin/bash

Verify inside the container:

python -c "import torch; print(torch.cuda.is_available())"
# Should print: True
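
If the check prints False, it helps to tell a CPU-only PyTorch build apart from a CUDA build that simply cannot see a GPU. The sketch below is one way to do that; `diagnose` is a hypothetical helper, while `torch.__version__`, `torch.version.cuda` (None on CPU-only builds) and `torch.cuda.is_available()` are the values it is fed.

```python
def diagnose(torch_version, cuda_build, cuda_available):
    """Turn the three values reported by torch into a short diagnosis."""
    if cuda_build is None:
        return (f"torch {torch_version} is a CPU-only build; "
                "reinstall the +cu113 wheel.")
    if not cuda_available:
        return (f"torch {torch_version} was built for CUDA {cuda_build}, "
                "but no GPU is visible; check that the container was started "
                "with --gpus all and that the host driver works.")
    return f"torch {torch_version} (CUDA {cuda_build}) can see the GPU."


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        print("torch is not installed in this environment")
    else:
        print(diagnose(torch.__version__,
                       torch.version.cuda,
                       torch.cuda.is_available()))
```

The first message points back at the install step in the Dockerfile, the second at the docker run command.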

Install BEVFusion:

cd /home/bevfusion
python setup.py develop

Additional Note from User

Comment out the line below in mmdet3d/ops/__init__.py and mmdet3d/ops/feature_decorator/__init__.py:

#from .feature_decorator import feature_decorator

Downgrade numpy:

pip install "numpy<1.20"

Install flash attention:

pip install flash-attn==0.2.0

Fix the data path

# 1. Create the data directory in the repo root
mkdir -p data

# 2. Create the symbolic link. 
# Based on your docker run command, your data is at /dataset
ln -s /dataset data/nuscenes

# 3. Verify the link works
ls -l data/nuscenes/nuscenes_infos_val.pkl
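
The same check can be done from Python for both info files at once. This is only a sketch: `missing_files` is a hypothetical helper, and the two file names are the ones used elsewhere in these notes.

```python
from pathlib import Path

# Info files expected under data/nuscenes (names taken from these notes).
EXPECTED = ("nuscenes_infos_train.pkl", "nuscenes_infos_val.pkl")


def missing_files(root, expected=EXPECTED):
    """Return the expected file names that do not exist under `root`."""
    root = Path(root)
    return [name for name in expected if not (root / name).is_file()]


if __name__ == "__main__":
    root = Path("data/nuscenes")
    if root.is_symlink():
        # Show where the link points, which catches a wrong mount path.
        print(f"{root} -> {root.resolve()}")
    missing = missing_files(root)
    print("missing:", ", ".join(missing) if missing else "nothing")
```

Run it from the repo root, the same directory where the link was created.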

Create a file named fix_pkl_keys.py in bevfusion root directory:

import mmcv
import os

# The files that need fixing
files = [
    'data/nuscenes/nuscenes_infos_train.pkl',
    'data/nuscenes/nuscenes_infos_val.pkl'
]

for f in files:
    if not os.path.exists(f):
        print(f"Skipping {f} (not found)")
        continue

    print(f"Checking {f}...")
    data = mmcv.load(f)
    dirty = False

    # 1. Rename 'data_list' -> 'infos'
    if 'data_list' in data:
        print(f"  - Found 'data_list', renaming to 'infos'...")
        data['infos'] = data.pop('data_list')
        dirty = True
    
    # 2. Rename 'metainfo' -> 'metadata'
    if 'metainfo' in data:
        print(f"  - Found 'metainfo', renaming to 'metadata'...")
        data['metadata'] = data.pop('metainfo')
        dirty = True

    if dirty:
        mmcv.dump(data, f)
        print(f"  - Saved fixed file: {f}")
    else:
        print(f"  - File seems correct (keys 'infos' and 'metadata' present or no new keys found).")

print("Done.")

Run it and
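
Since fix_pkl_keys.py overwrites the .pkl files in place, it may be worth trying the renaming logic on an in-memory dict first. The sketch below mirrors the two renames without needing mmcv; `rename_keys` is a hypothetical helper and the sample values are invented for illustration.

```python
# Old key -> new key, the same two renames fix_pkl_keys.py performs.
RENAMES = {"data_list": "infos", "metainfo": "metadata"}


def rename_keys(data, renames=RENAMES):
    """Return (renamed_copy, changed) without modifying `data` itself."""
    fixed = dict(data)
    changed = False
    for old, new in renames.items():
        if old in fixed:
            # Note: an existing `new` key would be overwritten here.
            fixed[new] = fixed.pop(old)
            changed = True
    return fixed, changed


if __name__ == "__main__":
    # Made-up sample standing in for the contents of an info file.
    sample = {"data_list": [{"token": "abc"}], "metainfo": {"version": "x"}}
    fixed, changed = rename_keys(sample)
    print("keys before:", sorted(sample))
    print("keys after: ", sorted(fixed))
    print("changed:", changed)
```

Because the helper works on a copy, the input dict is left untouched. Copying the original .pkl files somewhere safe before running the real script serves the same purpose.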
