Publications

You can also find my articles on my Google Scholar profile.

Journal Articles

Reasoning Segmentation for Images and Videos: A Survey

Submitted to IJCV, 2025

Reasoning Segmentation (RS) segments objects from natural-language queries that require reasoning and knowledge, moving beyond fixed categories or explicit prompts. This survey synthesizes 26 methods, evaluation metrics, and 29 datasets/benchmarks, reviews applications across domains, and outlines current gaps and future research directions.

Recommended citation: Yiqing Shen, Chenjia Li, et al. (2025). "Reasoning Segmentation for Images and Videos: A Survey." arXiv preprint arXiv:2505.18816.
Download Paper | Download Bibtex

Conference Papers

Temporally-Constrained Video Reasoning Segmentation and Automated Benchmark Construction

Medical AI for Global Impact (MedAGI), Oral Presentation, 2025

Accepted as an Oral presentation at MedAGI 2025. This work introduces a temporally-constrained video reasoning segmentation task and a scalable pipeline for benchmark construction.

Recommended citation: Yiqing Shen, Chenjia Li, Chenxiao Fan, Mathias Unberath. (2025). Temporally-Constrained Video Reasoning Segmentation and Automated Benchmark Construction. Oral presentation at MedAGI 2025.
Download Paper | Download Bibtex

Online Reasoning Video Segmentation with Just-in-Time Digital Twins

Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2025

Accepted as an Oral presentation at ICCV 2025. This work introduces a just-in-time digital twin concept for video segmentation: given an implicit query, an LLM plans the construction of a low-level scene representation from high-level video using specialist vision models.

Recommended citation: Yiqing Shen, Bohan Liu, Chenjia Li, Lalithkumar Seenivasan, Mathias Unberath. (2025). Online Reasoning Video Segmentation with Just-in-Time Digital Twins. arXiv:2503.21056.
Download Paper | Download Bibtex

Operating Room Workflow Analysis via Reasoning Segmentation over Digital Twins

International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI), 2025

This work introduces ORDiRS, an operating room reasoning segmentation framework that integrates SAM2, DepthAnything, OWLv2, and LLaVA for workflow analysis.

Recommended citation: Yiqing Shen, Chenjia Li, Bohan Liu, Cheng-Yi Li, Tito Porras, Mathias Unberath. (2025). Operating Room Workflow Analysis via Reasoning Segmentation over Digital Twins. In Proceedings of the International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI), 2025.
Download Paper | Download Bibtex

RVTBench: A Benchmark for Visual Reasoning Tasks

Submitted to WACV 2026, 2025

This work proposes RVTBench, a large-scale benchmark for reasoning visual tasks, including segmentation, grounding, VQA, and summarization.

Recommended citation: Yiqing Shen, Chenjia Li, Chenxiao Fan, Mathias Unberath. (2025). RVTBench: A Benchmark for Visual Reasoning Tasks. Under revision at NeurIPS 2025. arXiv:2505.11838.
Download Paper | Download Bibtex