BlockBeats News, March 30: Alibaba's Qianwen (Qwen) team announced the launch of its omni-modal model, Qwen3.5-Omni. The Qwen3.5-Omni series comes in three sizes of Instruct models (Plus, Flash, and Light) and supports a 256k-token long context window. The model accepts more than 10 hours of audio input and more than 400 seconds of 720P audio-visual input sampled at 1 FPS. It was natively pretrained as a multimodal model on massive text and visual data together with more than 100 million hours of audio-visual data, and it demonstrates strong omni-modal perception and generation capabilities. Compared with Qwen3-Omni, Qwen3.5-Omni's multilingual capabilities are greatly improved: it supports speech recognition in 113 languages and dialects and speech generation in 36 languages and dialects. (Jin10)