Hacker News

Oh, I didn’t know about the visual bounding boxes; this is super cool!

Quick question: are you talking about this feature?

https://docs.cloud.google.com/vertex-ai/generative-ai/docs/b...

Because it’s just using structured responses, so it should be doable with Gemini 3? (We are using Gemini 3 for some document processing, and its visual understanding is just incredible.)
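
For what it's worth, here is a minimal sketch of the post-processing side. It assumes the model returns a JSON list whose items carry a `box_2d` field in [ymin, xmin, ymax, xmax] order, normalized to 0-1000, which is the convention the Gemini image-understanding docs describe. The sample payload and the `to_pixel_boxes` helper are made up for illustration, and no API call is shown.

```python
import json


def to_pixel_boxes(response_text, width, height):
    """Convert 0-1000 normalized boxes into pixel coordinates.

    Assumes each item looks like {"box_2d": [ymin, xmin, ymax, xmax], "label": "..."}.
    """
    boxes = []
    for item in json.loads(response_text):
        ymin, xmin, ymax, xmax = item["box_2d"]
        boxes.append({
            "label": item.get("label", ""),
            # x values scale with image width, y values with image height
            "left": round(xmin * width / 1000),
            "top": round(ymin * height / 1000),
            "right": round(xmax * width / 1000),
            "bottom": round(ymax * height / 1000),
        })
    return boxes


# Hypothetical model output for a 1000x2000 pixel page scan
sample = '[{"box_2d": [100, 200, 500, 800], "label": "invoice table"}]'
print(to_pixel_boxes(sample, width=1000, height=2000))
```

If the response is not schema-enforced, it is probably worth wrapping `json.loads` in a try/except and stripping any Markdown fences first.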





No, I'm talking about the image segmentation feature: https://simonwillison.net/2025/Apr/18/gemini-image-segmentat...

But the bounding box stuff might work well enough in Gemini 3 to handle this case as well.


Hmm, so that post also links back to segmentation done via structured outputs? (Though here the structure isn't even enforced.)

https://ai.google.dev/gemini-api/docs/image-understanding#se...


Image segmentation is not supported by Gemini 3: https://ai.google.dev/gemini-api/docs/gemini-3#migrating_fro...

> Image segmentation: Image segmentation capabilities (returning pixel-level masks for objects) are not supported in Gemini 3 Pro or Gemini 3 Flash. For workloads requiring native image segmentation, we recommend continuing to utilize Gemini 2.5 Flash with thinking turned off or Gemini Robotics-ER 1.5.



