Segment Anything in Light Fields for Real-Time Applications via Constrained Prompting
Segmented light field images can serve as a powerful representation in autonomous driving tasks such as object pose tracking. Segment Anything Model 2 (SAM 2) produces semantically meaningful segments for monocular images and videos, but it is not designed for light fields. In this work, we introduce:
- A novel light field segmentation method that adapts SAM 2 to the light field domain.
- Segmentation refinement, a two-step method of light field segmentation: disparity-based mask propagation followed by reprompting of SAM 2.
- Semantic occluding, a technique that uses latent semantic features from the SAM 2 image encoder to estimate the occluded regions of segments and thereby refine the prompts provided to the model.
We show that our method produces semantically accurate and spatio-angularly consistent segments, avoids excessive oversegmentation of objects, and achieves higher performance than SAM 2 video tracking while being 7 times faster.
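As a rough illustration of the first refinement step, the sketch below warps a central-view mask into a neighbouring view using per-pixel disparity and then derives a box prompt from the warped mask. It is a minimal sketch and not the released implementation: the function names `propagate_mask` and `mask_to_box_prompt`, the sign convention for disparity, and the choice of a box prompt are all assumptions made for the example. The call to SAM 2 itself is intentionally left out.

```python
import numpy as np


def propagate_mask(mask, disparity, du, dv):
    """Forward-warp a boolean central-view mask to the view at angular offset (du, dv).

    Assumed convention (hypothetical): a pixel with disparity d moves by
    d pixels per unit of angular offset, in the same direction as the offset.
    """
    h, w = mask.shape
    ys, xs = np.nonzero(mask)            # pixel coordinates inside the segment
    d = disparity[ys, xs]                # per-pixel disparity in the central view
    xt = np.rint(xs + d * du).astype(int)
    yt = np.rint(ys + d * dv).astype(int)
    inside = (xt >= 0) & (xt < w) & (yt >= 0) & (yt < h)
    out = np.zeros((h, w), dtype=bool)
    out[yt[inside], xt[inside]] = True   # pixels warped off-image are dropped
    return out


def mask_to_box_prompt(mask):
    """Return an [x_min, y_min, x_max, y_max] box around the mask, or None if it is empty."""
    ys, xs = np.nonzero(mask)
    if xs.size == 0:
        return None
    return np.array([xs.min(), ys.min(), xs.max(), ys.max()])


# Toy example: a 2x2 segment with uniform disparity 1, warped two views to the right.
mask = np.zeros((8, 8), dtype=bool)
mask[2:4, 2:4] = True
disparity = np.ones((8, 8))
warped = propagate_mask(mask, disparity, du=2, dv=0)
box = mask_to_box_prompt(warped)         # box spans x = 4..5, y = 2..3
```

In a full pipeline, a prompt such as `box` would be passed to SAM 2 for the target view so that the model re-segments the object there. Forward warping alone can leave holes and ignores occlusions, which is why a reprompting step is useful.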
Publications
• N. Goncharov and D. G. Dansereau, “Segment anything in light fields for real-time applications via constrained prompting,” under review, 2024. Available here.
Citing
If you find this work useful, please cite:
@article{goncharov2024_arxiv,
  title = {Segment Anything in Light Fields for Real-Time Applications via Constrained Prompting},
  author = {Nikolai Goncharov and Donald G. Dansereau},
  journal = {arXiv preprint arXiv:2411.13840},
  year = {2024}
}
This work was carried out in the Robotic Imaging Group at the Australian Centre for Robotics, University of Sydney.
Acknowledgments
This research was supported in part by funding from Ford Motor Company.
Downloads
The code for the work is available here.