Installation¶

In this section we demonstrate how to prepare an environment with PyTorch.

Installation

Requirements¶

Linux (Windows is not officially supported)
Python 3.6+
PyTorch 1.8 or higher
CUDA 10.1 or higher
NCCL 2
GCC 4.9 or higher
mmcv-full 1.4.7 or higher (use mmcv for fast installation)

We have tested the following versions of OS and softwares:

OS: Ubuntu 16.04/18.04 and CentOS 7.2
CUDA: 10.0/10.1/11.0/11.2
NCCL: 2.1.15/2.2.13/2.3.7/2.4.2 (PyTorch-1.1 w/ NCCL-2.4.2 has a deadlock bug, see here)
GCC(G++): 4.9/5.3/5.4/7.3/7.4/7.5

Install openmixup¶

We recommend that users follow our best practices to install OpenMixup.

Step 0. Create a conda virtual environment and activate it.

conda create -n openmixup python=3.8 -y
conda activate openmixup

Step 1. Install PyTorch and torchvision following the official instructions, e.g., on GPU platforms:

conda install pytorch torchvision torchaudio cudatoolkit=10.1 -c pytorch
# assuming CUDA=10.1, "pip install torch==1.8.1+cu101 torchvision==0.9.1+cu101 torchaudio==0.8.1 -f https://download.pytorch.org/whl/torch_stable.html"

Step 2. Install MMCV using MIM. We recommend the users to install mmcv-full from the source using MIM (or it will install from the source with pip in step 4). You can also use pip install mmcv for fast installation.

pip install -U openmim
mim install mmcv-full

Step 3. Install other third-party libraries (not necessary). Please install pyGCO for PuzzleMix (used for cut_grid_graph, DON’T USE pip install gco==1.0.1).

conda install faiss-gpu cudatoolkit=10.1 -c pytorch  # optional for DeepCluster and ODC, assuming CUDA=10.1
pip install opencv-contrib-python  # optional for SaliencyMix (cv2.saliency.StaticSaliencyFineGrained_create())

Step 4. Install OpenMixup. To develop and run openmixup directly, install it from the source:

git clone https://github.com/Westlake-AI/openmixup.git
cd openmixup
pip install -v -e .
# "-v" means verbose, and "-e" means installing in editable mode;
# or "python setup.py develop"

Step 5. Install Apex (optional), following the official instructions, e.g.

git clone https://github.com/NVIDIA/apex
cd apex
pip install -v --disable-pip-version-check --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" ./

If some errors occur when you install Apex from the source, you can try python setup.py install for fast installation. Note that we recommend using PyTorch AMP for mixed precision training in high versions of PyTorch.

Note:

The git commit id will be written to the version number with step d, e.g. 0.1.0+2e7045c. The version will also be saved in trained models.
Following the above instructions, openmixup is installed on dev (editable) mode, and any local modifications made to the code will take effect immediately (except for the running experiments). You can install it to pip/conda by pip install . and the local modifications will not take effect without reinstalling it.
If you are installing cv2 for the first time, ImportError: libGL.so.1 will occur, which can be solved by apt install libgl1-mesa-glx. If you would like to use opencv-python-headless instead of opencv-python, you can install it before installing MMCV. Refer to issue #48 for some errors encountered with the version of cv2.

(back to top)

Customized installation¶

Benchmark¶

According to MMSelfSup, if you need to evaluate your pre-training model with some downstream tasks such as detection or segmentation, please also install Detectron2, MMDetection and MMSegmentation.

If you don’t run MMDetection and MMSegmentation benchmark, it is unnecessary to install them.

You can simply install MMDetection and MMSegmentation with the following command:

pip install mmdet mmsegmentation

For more details, you can check the installation page of MMDetection and MMSegmentation.

Install MMCV without MIM¶

MMCV contains C++ and CUDA extensions, thus depending on PyTorch in a complex way. MIM solves such dependencies automatically and makes the installation easier. However, it is not a must. To install MMCV with pip instead of MIM, please follow MMCV installation guides. This requires manually specifying a find-url based on PyTorch version and its CUDA version.

For example, the following command install mmcv-full built for PyTorch 1.10.x and CUDA 11.3.

pip install mmcv-full -f https://download.openmmlab.com/mmcv/dist/cu113/torch1.10/index.html

Another option: Docker Image¶

We provide a Dockerfile to build an image.

# build an image with PyTorch 1.10.0, CUDA 11.3, CUDNN 8.
docker build -f ./docker/Dockerfile --rm -t openmixup:torch1.10.0-cuda11.3-cudnn8 .

Note: Make sure you’ve installed the nvidia-container-toolkit.

Prepare datasets¶

It is recommended to symlink your dataset root (assuming $YOUR_DATA_ROOT) to $OPENMIXUP/data. If your folder structure is different, you may need to change the corresponding paths in config files.

Prepare Classification Datasets¶

We support following datasets: CIFAR-10/100, Tiny-ImageNet, ImageNet-1k, Place205, iNaturalist2017/2018, CUB200, FGVC-Aircrafts, StandordCars. Taking ImageNet for example, you need to 1) download ImageNet; 2) create the following list files or download meta files under $DATA/meta/: train.txt and val.txt contains an image file name in each line, train_labeled.txt and val_labeled.txt contains filename label\n in each line; train_labeled_*percent.txt are the down-sampled lists for semi-supervised evaluation. 3) create a symlink under $OPENMIXUP/data/.

Prepare PASCAL VOC¶

Assuming that you usually store datasets in $YOUR_DATA_ROOT (e.g., for me, /home/xhzhan/data/). This script will automatically download PASCAL VOC 2007 into $YOUR_DATA_ROOT, prepare the required files, create a folder data under $OPENSELFSUP and make a symlink VOCdevkit.

cd $OPENMIXUP
bash tools/prepare_data/prepare_voc07_cls.sh $YOUR_DATA_ROOT

At last, the folder with all related datasets looks like:

openmixup
├── openmixup
├── benchmarks
├── configs
├── data
│   ├── meta [used for 'ImageList' dataset]
│   ├── ade
│   ├── cifar10
│   ├── cifar100
│   │   ├── cifar-100-batches-py
│   │   ├── cifar-100-python.tar
│   │── coco
│   │── CUB200
│   ├── FGVC_Aircrafts
│   │   |   ├── images (contains all train & val)
│   ├── ImageNet
│   │   ├── train
│   │   |   ├── n01440764
│   │   |   ├── n01443537
│   │   |   ...
│   │   |   ├── n15075141
│   │   ├── val
│   │── iNaturalist2017
│   │── iNaturalist2018
│   ├── Places205
│   │   ├── images256
│   │   |   ├── a
│   │   |   |   ├── abbey
│   │   |   |   ├── airport_terminal
│   │   |   |   ...
│   │   |   ├── b
│   │   |   ...
│   │   |   ├── y
│   │── StanfordCars
│   │   ├── test
│   │   ├── train
│   │── STL10
│   │   ├── test
│   │   ├── train
│   ├── TinyImageNet
│   │   ├── train
│   │   |   ├── n01443537
│   │   |   ...
│   │   ├── val
│   │   |   ├── images (contains all train & val)
│   ├── VOCdevkit
│   │   ├── VOC2007
│   │   ├── VOC2012

A from-scratch setup script¶

Here is a full script for setting up openmixup with conda and link the dataset path. The script does not download full datasets, you have to prepare them on your own.

conda create -n openmixup python=3.8 -y
conda activate openmixup

conda install -c pytorch pytorch torchvision -y
git clone https://github.com/Westlake-AI/openmixup.git
cd openmixup
python setup.py develop

# download 'meta' and move to data/meta
wget https://github.com/Westlake-AI/openmixup/releases/download/dataset/meta.zip
unzip -d data/meta meta.zip
# download full classification datasets
ln -s $CIFAR10_ROOT data/cifar10
ln -s $CIFAR100_ROOT data/cifar100
ln -s $IMAGENET_ROOT data/ImageNet
ln -s $TINY_ROOT data/TinyImagenet
# download VOC datasets
bash tools/prepare_data/prepare_voc07_cls.sh $YOUR_DATA_ROOT

(back to top)

Common Issues¶

Using multiple openmixup versions¶

If there are more than one openmixup on your machine, and you want to use them alternatively, the recommended way is to create multiple conda environments and use different environments for different versions. The develop mode is recommanded if you want to add your own codes in openmixup.

Another way is to insert the following code to the main scripts (train.py, test.py or any other scripts you run)

import os.path as osp
import sys
sys.path.insert(0, osp.join(osp.dirname(osp.abspath(__file__)), '../'))

Or run the following command in the terminal of corresponding folder to temporally use the current one.

export PYTHONPATH=`pwd`:$PYTHONPATH

Issues of bugs¶

PyTorch-1.8 has a bug in the AdamW optimizer, which will cause some errors in DDP training. See this issue.
PyTorch-1.8 or higher has a bug in printing logs to the console. The log and log.json files are not affected.
The training hangs / deadlocks in some intermediate iterations. See this issue. This bug is fixed in the higher versions of PyTorch>=1.6.

(back to top)