MMSegmentation: OpenMMLab’s Semantic Segmentation Toolbox

Introduction

MMSegmentation is an open-source semantic segmentation toolbox from the OpenMMLab ecosystem. Built on PyTorch and the MMEngine framework, it provides a unified, config-driven workflow for training, evaluating, and deploying semantic segmentation models. It ships with implementations of over 50 segmentation architectures and supports a wide range of benchmark datasets out of the box.

In this guide you will learn how to install MMSegmentation, run inference with a pretrained model, and understand the config system that ties models, datasets, and training schedules together.

Overview

Key features:

50+ segmentation architectures (DeepLabV3, DeepLabV3+, PSPNet, UNet, SegFormer, Mask2Former, and more)
Config-driven design – models, datasets, schedules, and runtime settings are all defined in Python config files
Pretrained model zoo with hundreds of checkpoints on ADE20K, Cityscapes, PASCAL VOC, and other datasets
Built on MMEngine and MMCV for a consistent training and evaluation pipeline
Support for distributed training, mixed precision, and TensorRT/ONNX export (deployment is handled through the companion MMDeploy project)

Use cases:

Autonomous driving scene parsing (Cityscapes, BDD100K)
Indoor scene understanding (ADE20K, ScanNet)
Medical image segmentation
Remote sensing and satellite imagery analysis
Research benchmarking of new segmentation architectures

Current version: mmsegmentation 1.2.2 (requires mmengine and mmcv >= 2.0)

Getting Started

MMSegmentation depends on PyTorch, MMEngine, and MMCV. Install them in order:

pip install torch torchvision
pip install -U openmim
mim install mmengine
mim install "mmcv>=2.0.0"
pip install mmsegmentation

Alternatively, install MMSegmentation and its dependencies together using MIM:

mim install mmsegmentation

Verify the installation:

import mmseg
print(mmseg.__version__)
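If the import fails, it helps to see which of the required packages are actually present. The following is a minimal sketch that uses only the standard library. The helper name installed_versions is hypothetical, and the package names passed to it are assumed to be the PyPI distribution names.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Return {distribution name: version string, or None if not installed}."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # The distribution is not installed in the current environment
            found[name] = None
    return found


# Report the status of each dependency named in this guide
for name, ver in installed_versions(
    ["torch", "mmengine", "mmcv", "mmsegmentation"]
).items():
    print(f"{name}: {ver or 'not installed'}")
```

A missing entry points to the install step that needs to be repeated.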

Core Concepts

Config System

MMSegmentation uses Python config files to define every aspect of an experiment. A config file typically inherits from base configs and specifies the model architecture, dataset pipeline, training schedule, and runtime options. For example, pspnet_r50-d8_4xb2-40k_cityscapes-512x1024.py describes a PSPNet with a ResNet-50 backbone trained on Cityscapes.
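The config file name itself encodes the experiment. As a rough sketch, the helper below (parse_config_name, a hypothetical name) splits a name such as the one above into its parts. It assumes the four-field pattern algorithm_backbone_training_data, where 4xb2 is read as 4 GPUs with 2 samples each, 40k as the iteration schedule, and 512x1024 as the crop size; names that do not follow this pattern are rejected rather than guessed.

```python
import re
from pathlib import Path


def parse_config_name(path):
    """Split a config file name into its experiment fields (assumed pattern)."""
    parts = Path(path).stem.split("_")
    if len(parts) != 4:
        raise ValueError(f"unexpected config name: {path}")
    algorithm, backbone, training, data = parts

    # Training field, e.g. "4xb2-40k": GPUs, batch size per GPU, schedule
    train_match = re.fullmatch(r"(\d+)xb(\d+)-(\w+)", training)
    # Data field, e.g. "cityscapes-512x1024": dataset name and crop size
    data_match = re.fullmatch(r"(.+)-(\d+)x(\d+)", data)
    if train_match is None or data_match is None:
        raise ValueError(f"unexpected config name: {path}")

    return {
        "algorithm": algorithm,
        "backbone": backbone,
        "gpus": int(train_match.group(1)),
        "batch_per_gpu": int(train_match.group(2)),
        "schedule": train_match.group(3),
        "dataset": data_match.group(1),
        "crop_size": (int(data_match.group(2)), int(data_match.group(3))),
    }


info = parse_config_name(
    "configs/pspnet/pspnet_r50-d8_4xb2-40k_cityscapes-512x1024.py"
)
print(info)
```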

Model Architecture

Models are built from a config dictionary with three main components:

backbone – the feature extractor (e.g., ResNet, Swin Transformer)
decode_head – the segmentation head that produces per-pixel predictions (e.g., PSPHead, ASPPHead, SegformerHead)
auxiliary_head (optional) – an auxiliary loss head for deeper supervision
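In a config file these components appear as nested dictionaries under model. The sketch below is a trimmed illustration, not a buildable config: the type names follow the PSPNet configs shipped with MMSegmentation, but most fields (normalization, losses, the data preprocessor) are left out, and the helper component_types is hypothetical.

```python
# Trimmed model definition; real configs carry many more fields per component
model = dict(
    type="EncoderDecoder",
    backbone=dict(type="ResNetV1c", depth=50),
    decode_head=dict(
        type="PSPHead", in_channels=2048, channels=512, num_classes=19
    ),
    auxiliary_head=dict(
        type="FCNHead", in_channels=1024, channels=256, num_classes=19
    ),
)


def component_types(model_cfg):
    """Map each component that is present to its registered type name."""
    names = ("backbone", "decode_head", "auxiliary_head")
    return {
        name: model_cfg[name]["type"]
        for name in names
        if model_cfg.get(name) is not None
    }


print(component_types(model))
```

Dropping the auxiliary_head entry leaves a valid two-component description, which is why it is listed as optional.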

Inference API

The high-level inference API consists of two functions from mmseg.apis:

init_model(config, checkpoint) – loads a model from a config file and checkpoint
inference_model(model, image) – runs inference and returns the segmentation result

Practical Examples

Example 1: Running Inference with a Pretrained Model

from mmseg.apis import init_model, inference_model

# Paths to the config file and pretrained checkpoint
config_file = "configs/pspnet/pspnet_r50-d8_4xb2-40k_cityscapes-512x1024.py"
checkpoint_file = "pspnet_r50-d8_512x1024_40k_cityscapes_20200605_003338-2966598c.pth"

# Initialize the model
model = init_model(config_file, checkpoint_file, device="cuda:0")

# Run inference on a single image
result = inference_model(model, "demo/demo.png")

# The result contains predicted semantic labels per pixel
seg_map = result.pred_sem_seg.data.cpu().numpy()
print(f"Segmentation map shape: {seg_map.shape}")
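Once the label map is a NumPy array, ordinary array operations apply. As a small sketch, the helper below (class_pixel_fractions, a hypothetical name) reports what share of the image each class covers. It runs on a synthetic array so it works without a model; the assumption is that seg_map from the example above holds non-negative integer class IDs, typically with shape (1, H, W).

```python
import numpy as np


def class_pixel_fractions(seg_map, num_classes):
    """Return the fraction of pixels assigned to each class ID."""
    labels = np.asarray(seg_map).reshape(-1)
    # Count occurrences of each ID; minlength keeps absent classes at zero
    counts = np.bincount(labels, minlength=num_classes)
    return counts / labels.size


# Synthetic stand-in for seg_map: one 2x3 label map with three classes
demo_map = np.array([[[0, 0, 1], [2, 2, 2]]])
fractions = class_pixel_fractions(demo_map, num_classes=3)
print(fractions)
```

For a Cityscapes model, num_classes would be 19.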

Example 2: Visualizing Segmentation Results

from mmseg.apis import init_model, inference_model
from mmseg.visualization import SegLocalVisualizer
import mmcv

config_file = "configs/pspnet/pspnet_r50-d8_4xb2-40k_cityscapes-512x1024.py"
checkpoint_file = "pspnet_r50-d8_512x1024_40k_cityscapes_20200605_003338-2966598c.pth"

model = init_model(config_file, checkpoint_file, device="cuda:0")
image = mmcv.imread("demo/demo.png")
result = inference_model(model, image)

# Visualize the result. mmcv.imread returns BGR, while the visualizer
# expects an RGB image, so convert before drawing.
visualizer = SegLocalVisualizer()
visualizer.dataset_meta = model.dataset_meta
visualizer.add_datasample(
    name="result",
    image=mmcv.bgr2rgb(image),
    data_sample=result,
    draw_gt=False,
    show=False,
    out_file="result.png",
)
print("Saved visualization to result.png")
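For cases where a custom overlay is preferable to the built-in visualizer, the same effect can be approximated by hand. The sketch below is an assumption-laden illustration rather than what SegLocalVisualizer does internally: colorize and blend are hypothetical helpers, the palette is taken to be a list of RGB triples indexed by class ID (for example the "palette" entry of model.dataset_meta), and the label map is taken to be 2D.

```python
import numpy as np


def colorize(seg_map, palette):
    """Map an (H, W) array of class IDs to an (H, W, 3) uint8 color image."""
    lookup = np.asarray(palette, dtype=np.uint8)
    # Indexing the lookup table with the label map picks one color per pixel
    return lookup[np.asarray(seg_map)]


def blend(image, color_mask, opacity=0.5):
    """Alpha-blend a color mask over an image of the same shape."""
    mixed = (1.0 - opacity) * image.astype(np.float32)
    mixed += opacity * color_mask.astype(np.float32)
    return mixed.round().astype(np.uint8)


# Synthetic data: a 1x2 image, two classes, and a two-color palette
demo_image = np.full((1, 2, 3), 200, dtype=np.uint8)
demo_labels = np.array([[0, 1]])
demo_palette = [[0, 0, 0], [100, 100, 100]]

overlay = blend(demo_image, colorize(demo_labels, demo_palette))
print(overlay)
```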

Example 3: Listing Available Pretrained Models

You can use MIM to search the model zoo for available checkpoints:

mim search mmsegmentation --model "deeplabv3plus"

To inspect the model definition behind a given config programmatically in Python:

from mmseg.utils import register_all_modules
from mmengine import Config

# Register MMSegmentation's modules with the MMEngine registries
# (needed to build models; reading a config works without it)
register_all_modules()

# Load a config to inspect the model definition
cfg = Config.fromfile("configs/deeplabv3plus/deeplabv3plus_r101-d8_4xb2-40k_cityscapes-512x1024.py")
print(f"Backbone: {cfg.model.backbone.type}")
print(f"Decode head: {cfg.model.decode_head.type}")

Example 4: Training on a Custom Dataset

To train on your own dataset, create a config that inherits from a base and overrides the dataset paths:

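# my_config.py
_base_ = "configs/pspnet/pspnet_r50-d8_4xb2-40k_cityscapes-512x1024.py"

# Override dataset settings. In MMSegmentation 1.x the generic dataset class
# is BaseSegDataset, and image/mask folders are given through data_prefix.
# Class names and palette go in a metainfo dict on each dataset.
train_dataloader = dict(
    dataset=dict(
        type="BaseSegDataset",
        data_root="data/my_dataset",
        data_prefix=dict(
            img_path="images/train",
            seg_map_path="annotations/train",
        ),
    ),
)

val_dataloader = dict(
    dataset=dict(
        type="BaseSegDataset",
        data_root="data/my_dataset",
        data_prefix=dict(
            img_path="images/val",
            seg_map_path="annotations/val",
        ),
    ),
)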

Then launch training from the command line:

python tools/train.py my_config.py
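The override in my_config.py works because nested dictionaries are merged key by key with the base config instead of replacing it wholesale. The sketch below imitates that behavior in plain Python. It is a simplified illustration, not MMEngine's implementation (which has additional features such as the _delete_ key), and merge_config is a hypothetical helper.

```python
import copy


def merge_config(base, override):
    """Recursively merge `override` into a copy of `base`."""
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are dicts: descend and merge their keys
            merged[key] = merge_config(merged[key], value)
        else:
            # Otherwise the override value replaces the base value
            merged[key] = copy.deepcopy(value)
    return merged


base = dict(
    train_dataloader=dict(
        batch_size=2,
        dataset=dict(type="CityscapesDataset", data_root="data/cityscapes/"),
    )
)
override = dict(
    train_dataloader=dict(dataset=dict(data_root="data/my_dataset"))
)

merged = merge_config(base, override)
print(merged)
```

Only data_root changes here; batch_size and the dataset type are inherited from the base, which is why an override file can stay short.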

Best Practices

Use MIM (openmim) to manage dependencies and download pretrained checkpoints. For MMCV in particular, it selects a build that matches your installed PyTorch and CUDA versions.
Start from an existing config in the configs/ directory and customize it rather than writing a config from scratch.
Use register_all_modules() when loading configs or models outside the standard training scripts.
For custom datasets, ensure your annotation masks use contiguous integer class IDs starting from 0, and define the classes and palette fields in your dataset config.
Enable mixed-precision training with --amp for faster training on modern GPUs.
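The class-ID requirement for custom datasets can usually be met with a small preprocessing step over the annotation masks. The following is a minimal sketch, assuming the raw masks use arbitrary integer codes and that 255 serves as the ignore value. The helper remap_mask and the raw IDs are hypothetical. The mapping is passed in explicitly so that every image in the dataset uses the same one.

```python
import numpy as np


def remap_mask(mask, raw_ids, ignore_index=255):
    """Remap raw label values to contiguous IDs 0..N-1.

    `raw_ids` lists the original values in the desired class order.
    Pixels whose value is not listed are set to `ignore_index`.
    """
    mask = np.asarray(mask)
    remapped = np.full(mask.shape, ignore_index, dtype=np.uint8)
    for new_id, raw_id in enumerate(raw_ids):
        remapped[mask == raw_id] = new_id
    return remapped


# Synthetic mask with raw codes 10, 20, 30 and one unlabeled value (99)
raw_mask = np.array([[10, 20], [30, 99]])
clean_mask = remap_mask(raw_mask, raw_ids=[10, 20, 30])
print(clean_mask)
```

The order of raw_ids should match the order of the class names declared in the dataset config.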

Conclusion

MMSegmentation provides a comprehensive, well-maintained toolkit for semantic segmentation research and production. Its config-driven design, extensive model zoo, and tight integration with the OpenMMLab ecosystem make it straightforward to reproduce published results, benchmark new ideas, and deploy models to production.

About this article. This article was generated by the Best-of-the-Best autonomous AI digest and reviewed by Ruslan Magana Vsevolodovna. Package metadata was last checked on 24 January 2026. See the data leaderboard and the GitHub repository for sources.
