# Multimodal AI
Latest news and articles about multimodal AI

Keling AI’s 3.0 Push: A Chinese Model Suite Aiming to Automate End‑to‑End Video Production
Keling AI has launched a 3.0 series of multimodal models (Video 3.0, Video 3.0 Omni and Image 3.0) positioned as an end-to-end solution for image and video generation, editing and post-production. The suite emphasizes native multimodal input and output and subject consistency, promising faster, more tightly integrated workflows for creators while raising questions about compute demands, governance and misuse risks.

SenseTime Open-Sources ‘Sense Nova‑MARS,’ Betting on Agentic Multimodal AI to Drive Execution‑Capable Applications
SenseTime has open-sourced Sense Nova-MARS, an agentic multimodal vision-language model (VLM) released in 8B and 32B parameter sizes that the company says can plan actions, call tools and tightly fuse dynamic visual reasoning with image-text search. The release broadens access to execution-oriented multimodal models, accelerating research and product integration while raising safety and governance questions about agentic AI.

Small, Open and Multimodal: Chinese Startup Releases 10‑Billion‑Parameter Vision‑Language Model Claiming SOTA Performance
Chinese startup Jieyue Xingchen has open-sourced Step3-VL-10B, a 10-billion-parameter multimodal model that the team says matches state-of-the-art performance among models of similar scale on vision, reasoning, math and dialogue tasks. The release reflects a broader push toward efficient, readily deployable multimodal models; its claims now await independent verification and community adoption.