Best Multimodal Models

Study tests five multimodal AI models on CT scan, finds 20% major errors

Artificial intelligence is rapidly transforming health care. AI systems can now detect diabetic eye disease from retinal photos and analyze CT images for signs of early-stage lung cancers and stroke.

12d

Microsoft open-sources multimodal reasoning model with 15B parameters

The company mainly trained Phi-4-reasoning-vision-15B on open-source data. The data included images and text-based descriptions of the objects depicted in those images. Before it started training the ...

12d

Black Forest Labs' new Self-Flow technique makes training multimodal AI models 2.8x more efficient

This efficiency makes it viable for enterprises to move beyond generic off-the-shelf solutions and develop specialized models ...

5d on MSN

Google unveils Gemini Embedding 2, its first multimodal embedding model

Google introduces Gemini Embedding 2, its first multimodal embedding model designed to map text, images, audio, and video into a single space.

Google's Gemini Embedding 2 arrives with native multimodal support to cut costs and speed up your enterprise data stack

While previous embedding models were largely restricted to text, this new model natively integrates text, images, video, audio, and documents into a single numerical space — reducing latency by as muc ...

9to5Mac

New Apple model combines vision understanding and image generation with impressive results

In the study titled MANZANO: A Simple and Scalable Unified Multimodal Model with a Hybrid Vision Tokenizer, a team of nearly 30 Apple researchers details a novel unified approach that enables both ...

AASTOCKS.com

BABA-W's Tongyi Lab Open-Sources Film-Grade Dubbing Multimodal Large Model, Fun-CineForge

BABA-W's Tongyi Lab released and open-sourced the first multimodal large model supporting film-grade multi-scenario dubbing, Fun-CineForge, according to Chinese media.

SiliconANGLE

AWS expands Nova foundation models, adds multimodal support

In conjunction with its announcement of Nova Forge, a platform for building customized variants of its Nova foundation models, Amazon Web Services Inc. today introduced four new artificial ...

Devdiscourse

Multi-modal artificial intelligence can improve smart city traffic analytics

Smart city initiatives are generating vast amounts of data from sensors, cameras, mobile devices, and digital service ...
