Abstract
LLM-conditioned segmentation has recently advanced by coupling large language models with iterative mask generation frameworks. However, current query-based propose-then-select pipelines can generate high-quality mask candidates while still failing to select the mask that matches the linguistic condition. FlowSeg addresses this semantic misalignment by introducing dynamic semantic guidance through a bidirectional semantic flow between intermediate decoding states and LLM-derived condition embeddings.
Language conditions actively guide mask refinement at each decoding stage, while condition embeddings are progressively updated by emerging visual evidence. A lightweight boundary-aware refinement module further enhances uncertain regions without perturbing confident interiors. Experiments on referring expression segmentation and reasoning segmentation demonstrate consistent improvements and state-of-the-art performance.
Motivation
In query-based LLM-conditioned segmentation, the model may already produce a candidate mask that overlaps well with the target object, but the final matching step can select a semantically wrong candidate. FlowSeg treats language grounding as part of the generation dynamics rather than only a post-hoc selection signal.
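The failure mode described above can be made concrete with a toy example. This is a minimal sketch, not FlowSeg's pipeline: the masks, the matching scores, and the helper names `iou`, `select_candidate`, and `oracle_candidate` are all hypothetical and chosen only for illustration. It shows a candidate pool that already contains a perfect mask, while the selection step picks a different one.

```python
def iou(mask_a, mask_b):
    """Intersection over union of two flat binary masks (lists of 0/1)."""
    inter = sum(1 for a, b in zip(mask_a, mask_b) if a and b)
    union = sum(1 for a, b in zip(mask_a, mask_b) if a or b)
    return inter / union if union else 0.0


def select_candidate(scores):
    """Post-hoc selection: index of the highest language-matching score."""
    return max(range(len(scores)), key=lambda i: scores[i])


def oracle_candidate(candidates, target):
    """Index of the candidate that overlaps the ground truth best."""
    return max(range(len(candidates)), key=lambda i: iou(candidates[i], target))


# Hypothetical data: candidate 0 matches the target exactly, but
# candidate 1 (a different object) receives the highest matching score.
target = [1, 1, 1, 0, 0, 0]
candidates = [
    [1, 1, 1, 0, 0, 0],
    [0, 0, 0, 1, 1, 1],
    [1, 1, 0, 0, 0, 0],
]
scores = [0.41, 0.48, 0.11]

selected = select_candidate(scores)
best = oracle_candidate(candidates, target)
```

Here `selected` differs from `best`: the proposal stage did its job, and the error comes entirely from selection.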
Method
FlowSeg is built on a standard LLM-segmentor scaffold with dual visual encoders and a query-based segmentation decoder. Its key contribution is the decoder-side Bidirectional Semantic Flow, where condition embeddings guide query refinement and are updated by decoder queries throughout the generation process.
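The two directions of the flow can be sketched as alternating attention updates. This is an illustrative approximation under stated assumptions, not the authors' implementation: it uses single-head scaled dot-product attention with residual updates and omits learned projections, normalization, and feed-forward layers. The names `attend`, `bidirectional_flow`, and `num_stages` are hypothetical.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def attend(targets, sources):
    """Each target vector reads from all source vectors (residual update).

    Assumption: plain dot-product attention without learned weights.
    Returns new lists; the inputs are not modified.
    """
    dim = len(targets[0])
    scale = 1.0 / math.sqrt(dim)
    updated = []
    for t in targets:
        scores = [scale * sum(a * b for a, b in zip(t, s)) for s in sources]
        weights = softmax(scores)
        context = [sum(w * s[d] for w, s in zip(weights, sources))
                   for d in range(dim)]
        updated.append([t[d] + context[d] for d in range(dim)])
    return updated


def bidirectional_flow(queries, conditions, num_stages=3):
    """Alternate the two directions at every decoding stage."""
    for _ in range(num_stages):
        # Language -> vision: condition embeddings guide query refinement.
        queries = attend(queries, conditions)
        # Vision -> language: refined queries update the condition embeddings.
        conditions = attend(conditions, queries)
    return queries, conditions


# Toy inputs: three decoder queries and one condition embedding, dimension 2.
queries = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
conditions = [[0.2, 0.8]]
new_queries, new_conditions = bidirectional_flow(queries, conditions, num_stages=2)
```

The point of the sketch is the ordering: the condition embeddings are not a fixed prompt but a state that changes at each stage alongside the queries.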
Results
Referring Expression Segmentation
FlowSeg improves over prior methods on RefCOCO, RefCOCO+, and RefCOCOg, with stronger gains on more challenging splits.
Reasoning Segmentation
On the ReasonSeg test set, FlowSeg reaches 54.7 cIoU, outperforming the baseline by 13.7 points.
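The reported margin implies a baseline score of 54.7 - 13.7 = 41.0 cIoU. For readers unfamiliar with the metric, the sketch below assumes cIoU denotes cumulative IoU, as is common on reasoning segmentation benchmarks: total intersection pixels divided by total union pixels over the whole dataset, rather than an average of per-image IoUs. The function names and toy masks are hypothetical.

```python
def cumulative_iou(predictions, targets):
    """Cumulative IoU: summed intersections over summed unions.

    Masks are flat lists of 0/1, one list per image.
    """
    total_inter = 0
    total_union = 0
    for pred, gt in zip(predictions, targets):
        for p, g in zip(pred, gt):
            total_inter += 1 if (p and g) else 0
            total_union += 1 if (p or g) else 0
    return total_inter / total_union if total_union else 0.0


def mean_iou(predictions, targets):
    """Per-image IoU averaged over images, shown for contrast."""
    per_image = [cumulative_iou([p], [g]) for p, g in zip(predictions, targets)]
    return sum(per_image) / len(per_image)


# Toy dataset with two images of four pixels each.
preds = [[1, 1, 0, 0], [1, 1, 1, 1]]
gts = [[1, 0, 0, 0], [1, 1, 1, 0]]

ciou = cumulative_iou(preds, gts)   # (1 + 3) / (2 + 4)
miou = mean_iou(preds, gts)         # (1/2 + 3/4) / 2
```

Because the sums are pooled before dividing, cumulative IoU weights large objects more heavily than a per-image average does, which is why the two values differ on the same data.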
Qualitative Results
FlowSeg produces more accurate masks with finer details compared with prior work, especially for ambiguous referring expressions and complex object boundaries.
Citation
@inproceedings{flowseg2026,
title = {FlowSeg: Dynamic Semantic Flow for LLM-Conditioned Segmentation},
author = {Zekang Zhang and Guangyu Gao and Youyun Tang and ChengJing Wu and Xiaochao Qu and Chi Harold Liu and Jianbo Jiao and Yunchao Wei and Luoqi Liu and Ting Liu},
booktitle = {Proceedings of the International Conference on Machine Learning (ICML)},
year = {2026}
}