Google quietly rolled out experimental multimodal image output for its Gemini 2.0 Flash model in AI Studio, making it the first major lab to ship such a feature ahead of competitors such as OpenAI and xAI. The update introduces native image generation, image editing without full regeneration, and a new "Output format" setting that lets users toggle between "Text" and "Images and text" output.
Native image output
Gemini 2.0 Flash now supports multimodal output, letting users generate new images or edit existing ones conversationally. This includes advanced editing capabilities: users can refine specific aspects of an image without generating an entirely new one, for example by modifying the background or other elements through targeted instructions. All generated images carry SynthID watermarks to verify provenance and reduce misinformation risks.
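The conversational editing flow can be pictured as a follow-up request that carries the previous image plus an instruction. Below is a minimal sketch, assuming the Gemini API's documented `inline_data` part format; the `build_edit_turn` helper name and the placeholder image bytes are illustrative, not part of the official SDK.

```python
import base64

# Hedged sketch of a follow-up "edit" turn for a Gemini generateContent
# conversation: the prior image is attached as an inline_data part and the
# edit instruction as a text part, so the model can revise the image rather
# than regenerate it from scratch. Helper name is hypothetical.
def build_edit_turn(image_bytes: bytes, instruction: str) -> dict:
    return {
        "role": "user",
        "parts": [
            {"inline_data": {
                "mime_type": "image/png",
                # Inline image data is base64-encoded in the request body.
                "data": base64.b64encode(image_bytes).decode("ascii"),
            }},
            {"text": instruction},
        ],
    }

# Placeholder bytes stand in for a real PNG returned by a previous turn.
turn = build_edit_turn(b"\x89PNG...", "Change the background to a sunset.")
```

Appending such a turn to the running `contents` list is what makes the edit targeted: the model sees both the image and the instruction in context.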
Additionally, the "Output format" setting lets users switch between text-only output and combined text-and-image output, providing flexibility based on the use case.
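In the underlying API, this toggle corresponds to the response modalities requested in the generation config. The sketch below builds such a request body, assuming the documented `responseModalities` field of the Gemini API's `generateContent` method; the `build_request` helper name is illustrative.

```python
import json

# Hedged sketch: construct a generateContent request body whose
# "responseModalities" field mirrors AI Studio's "Output format" toggle
# ("Text" vs. "Images and text"). Helper name is hypothetical.
def build_request(prompt: str, want_images: bool = True) -> dict:
    modalities = ["TEXT", "IMAGE"] if want_images else ["TEXT"]
    return {
        "contents": [{"role": "user", "parts": [{"text": prompt}]}],
        "generationConfig": {"responseModalities": modalities},
    }

body = build_request("Draw a watercolor fox, then describe it.")
print(json.dumps(body, indent=2))
```

Sending this body to the experimental image-capable Flash model would return interleaved text and image parts; with `want_images=False` the same request yields text only.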
About AI Studio
Google DeepMind has continued to iterate on its Gemini model family; Gemini 2.0 debuted in December 2024 and builds on earlier versions with multimodal capabilities, fast processing, and improved spatial reasoning. The company has made the model available to developers in AI Studio and Vertex AI, with a focus on agentic AI features such as tool calling and real-time multimodal applications.
Gemini’s recent updates have emphasized creative tooling, such as localized artwork generation and detailed image editing, catering to creative industries.