MuPPet: Multi-person 2D-to-3D Pose Lifting

April 7, 2026

Thomas Markhorst

Zhi-Yi Lin

Jouh Yeong Chew

Jan van Gemert

Xucong Zhang


Abstract

Multi-person social interactions are inherently built on coherence and relationships among all individuals within the group, making multi-person localization and body pose estimation essential to understanding these social dynamics. One promising approach is 2D-to-3D pose lifting, which provides a 3D human pose with rich spatial detail by building on the significant advances in 2D pose estimation. However, existing 2D-to-3D pose lifting methods often neglect inter-person relationships or cannot handle varying group sizes, limiting their effectiveness in multi-person settings. We propose MuPPet, a novel multi-person 2D-to-3D pose lifting framework that explicitly models inter-person correlations. To leverage these inter-person dependencies, our approach introduces Person Encoding to structure individual representations, Permutation Augmentation to enhance training diversity, and Dynamic Multi-Person Attention to adaptively model correlations between individuals. Extensive experiments on group interaction datasets demonstrate that MuPPet significantly outperforms state-of-the-art single- and multi-person 2D-to-3D pose lifting methods and improves robustness in occlusion scenarios. Our findings highlight the importance of modeling inter-person correlations, paving the way for accurate and socially aware 3D pose estimation.

Type

CVPRw 2026

Last updated on April 7, 2026
