DeepLabCut 3.0 - PyTorch Model Architectures

Introduction

You can see a list of supported architectures/variants by using:

from deeplabcut.pose_estimation_pytorch import available_models
print(available_models())

Backbones (neural networks)

Several backbones are currently implemented in the DeepLabCut PyTorch engine. More are planned, and the new model registry makes it easy to add your own.

ResNets

HRNet

DEKR

BUTCTD

DLCRNet

AnimalTokenPose
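To illustrate what a model registry does in general, here is a minimal decorator-based sketch. It is not DeepLabCut's registry: the `Registry` class, the `BACKBONES` instance, and `ToyBackbone` are all hypothetical names chosen for this example, and the real registry may expose a different interface.

```python
from typing import Callable, Dict, List


class Registry:
    """Hypothetical stand-in for a model registry (not DeepLabCut's class)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._builders: Dict[str, Callable[..., object]] = {}

    def register(self, key: str) -> Callable:
        """Return a decorator that stores the decorated builder under `key`."""

        def decorator(builder: Callable[..., object]) -> Callable[..., object]:
            if key in self._builders:
                raise KeyError(f"{key!r} is already registered in {self.name}")
            self._builders[key] = builder
            return builder

        return decorator

    def build(self, key: str, **kwargs) -> object:
        """Instantiate the builder registered under `key` with `kwargs`."""
        return self._builders[key](**kwargs)

    def available(self) -> List[str]:
        """Names of all registered builders, sorted alphabetically."""
        return sorted(self._builders)


BACKBONES = Registry("backbones")


@BACKBONES.register("toy_backbone")
class ToyBackbone:
    """Placeholder backbone that only records its output stride."""

    def __init__(self, stride: int = 32) -> None:
        self.stride = stride


backbone = BACKBONES.build("toy_backbone", stride=16)
print(BACKBONES.available(), backbone.stride)
```

The point of the pattern is that a configuration file can refer to a model by name, and new models become available as soon as they are registered.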

Information on Single Animal Models

Single-animal models are composed of a backbone (encoder) and a head (decoder) that predicts the position of keypoints. The default head contains a single deconvolutional layer. To create a single-animal model from a backbone and this default head, call deeplabcut.create_training_dataset with net_type set to the backbone name (e.g., resnet_50 or hrnet_w32).
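A minimal sketch of that call, assuming DeepLabCut 3.0 is installed and a project already exists. The wrapper name `create_single_animal_dataset` and the config path in the comment are hypothetical; only deeplabcut.create_training_dataset and its net_type argument come from the text above.

```python
def create_single_animal_dataset(config_path, net_type="resnet_50"):
    """Create a training dataset whose model uses `net_type` as its backbone.

    `config_path` is the path to the project's config.yaml.
    """
    # Imported lazily so this sketch can be loaded without DeepLabCut installed.
    import deeplabcut

    return deeplabcut.create_training_dataset(config_path, net_type=net_type)


# Example call (hypothetical project path, left commented out on purpose):
# create_single_animal_dataset("/path/to/project/config.yaml", net_type="hrnet_w32")
```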

If you want to add a second deconvolutional layer (which makes the model slower but may improve accuracy), edit your pytorch_config.yaml file.
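The trade-off can be seen with a little arithmetic. The sketch below uses the standard output-size formula for a transposed convolution and assumes, purely for illustration, a backbone with an overall stride of 32 and deconvolutional layers with kernel size 2 and stride 2. These values are assumptions, not read from any particular pytorch_config.yaml.

```python
def deconv_output_size(size: int, kernel: int = 2, stride: int = 2, padding: int = 0) -> int:
    """Output length of a transposed convolution along one dimension.

    Assumes dilation 1 and no output padding.
    """
    return (size - 1) * stride - 2 * padding + kernel


def head_output_size(image_size: int, backbone_stride: int, num_deconv: int) -> int:
    """Side length of the heatmap after the backbone and `num_deconv` layers."""
    size = image_size // backbone_stride
    for _ in range(num_deconv):
        size = deconv_output_size(size)
    return size


# Assumed values: 256-pixel input, backbone stride 32, kernel 2, stride 2.
for num_deconv in (1, 2):
    side = head_output_size(256, 32, num_deconv)
    print(f"{num_deconv} deconv layer(s): {side} x {side} heatmap")
```

Under these assumptions the second layer doubles the heatmap side length from 16 to 32, so the head works on four times as many spatial positions. That gives finer localization at a higher compute cost.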

Of course, any multi-animal model can also be used for single-animal projects!

Information on Multi-Animal Models

Backbones with Part-Affinity Fields

As in DeepLabCut 2.X, the base multi-animal model is composed of a backbone (encoder) and a head that predicts keypoints and part-affinity fields (PAFs). The PAFs are used to group the detected keypoints into individuals.

Passing a backbone name as net_type (e.g., resnet_50, hrnet_w32) for a multi-animal project will create a model consisting of that backbone and a heatmap + PAF head.
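To give an intuition for how a part-affinity field can link keypoints, the sketch below scores a candidate connection by averaging how well the field aligns with the segment between two keypoints. It follows the general PAF idea only. DeepLabCut's actual assembly code may differ, and `paf_score`, `toy_field`, and the keypoint names are hypothetical.

```python
import math
from typing import Callable, Tuple

Point = Tuple[float, float]
Field = Callable[[float, float], Tuple[float, float]]


def paf_score(field: Field, start: Point, end: Point, n_samples: int = 10) -> float:
    """Mean dot product of the field with the unit vector from start to end."""
    dx, dy = end[0] - start[0], end[1] - start[1]
    length = math.hypot(dx, dy)
    if length == 0:
        return 0.0
    ux, uy = dx / length, dy / length
    total = 0.0
    for i in range(n_samples):
        t = (i + 0.5) / n_samples  # sample at evenly spaced points on the segment
        vx, vy = field(start[0] + t * dx, start[1] + t * dy)
        total += vx * ux + vy * uy
    return total / n_samples


def toy_field(x: float, y: float) -> Tuple[float, float]:
    """Toy field in which the limb direction is +x everywhere."""
    return (1.0, 0.0)


snout = (0.0, 0.0)
candidates = {"same_animal": (10.0, 0.0), "other_animal": (0.0, 10.0)}
scores = {name: paf_score(toy_field, snout, point) for name, point in candidates.items()}
best = max(scores, key=scores.get)
print(scores, best)
```

The candidate lying along the field direction gets the higher score, so it is the one that would be linked to the snout.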

Top-Down Models

Top-down pose estimation models split the task into two distinct parts: individual localization (through an object detector), followed by pose estimation (for each individual). As localization of individuals is handled by the detector, this simplifies the pose task to single-animal pose estimation!

Hence any single-animal model can be transformed into a top-down, multi-animal model. To do so, prefix your single-animal model name with top_down_. Currently only a single FasterRCNN variant is available as a detector. Other variants will be added soon!

The pose model for top-down nets is simply the backbone followed by a single convolution for pose estimation. It’s also possible to add deconvolutional layers to top-down model heads.
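The two-stage split described above can be sketched in plain Python. The detector and pose model here are toy stand-ins (a real pipeline would use the FasterRCNN detector and a trained pose network), and every name in the block is hypothetical. The sketch only shows the control flow: detect, crop, estimate pose per crop, then map keypoints back to image coordinates.

```python
from typing import Callable, List, Tuple

Box = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max) in image pixels
Keypoint = Tuple[float, float]   # (x, y)
Image = List[List[int]]          # row-major grid of pixel values


def crop(image: Image, box: Box) -> Image:
    """Cut the region covered by `box` out of the image."""
    x0, y0, x1, y1 = box
    return [row[x0:x1] for row in image[y0:y1]]


def top_down_pose(
    image: Image,
    detector: Callable[[Image], List[Box]],
    pose_model: Callable[[Image], List[Keypoint]],
) -> List[List[Keypoint]]:
    """Run single-animal pose estimation on each detected individual."""
    poses = []
    for box in detector(image):
        x0, y0, _, _ = box
        local_keypoints = pose_model(crop(image, box))
        # Shift crop-relative keypoints back into full-image coordinates.
        poses.append([(x + x0, y + y0) for x, y in local_keypoints])
    return poses


def toy_detector(image: Image) -> List[Box]:
    """Stand-in detector that always reports the same two boxes."""
    return [(0, 0, 4, 4), (4, 2, 8, 6)]


def toy_pose_model(patch: Image) -> List[Keypoint]:
    """Stand-in pose model that predicts one keypoint at the crop centre."""
    height, width = len(patch), len(patch[0])
    return [(width / 2, height / 2)]


image = [[0] * 8 for _ in range(6)]  # 8 pixels wide, 6 pixels tall
poses = top_down_pose(image, toy_detector, toy_pose_model)
print(poses)
```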

Example top-down models would be top_down_resnet_50 and top_down_hrnet_w32.
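The naming rule is simple enough to express as a small helper. `to_top_down` is a hypothetical convenience function, not part of DeepLabCut. It only builds the string that would then be passed as net_type.

```python
TOP_DOWN_PREFIX = "top_down_"


def to_top_down(net_type: str) -> str:
    """Return the top-down variant name for a single-animal model name."""
    if net_type.startswith(TOP_DOWN_PREFIX):
        return net_type  # already a top-down name, leave it unchanged
    return TOP_DOWN_PREFIX + net_type


for name in ("resnet_50", "hrnet_w32"):
    print(to_top_down(name))
```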