Microsoft Launches 7 In-House MAI Models at Build 2026 — Declaring AI Stack Independence

Microsoft AI unveiled seven new models at Build 2026 spanning image, voice, transcription, reasoning, and coding — all trained on proprietary data without distillation from third-party models. MAI-Code-1-Flash immediately shipped inside GitHub Copilot and VS Code.

TL;DR — Microsoft AI launched the MAI model family — seven models spanning image generation, voice synthesis, transcription, reasoning, and coding — at Microsoft Build 2026. All models were trained from scratch on clean, traceable, enterprise-grade Microsoft data, without distillation from OpenAI or any third-party models. MAI-Code-1-Flash is live in GitHub Copilot and VS Code today.

The Declaration: Humanist Superintelligence

Microsoft AI CEO Mustafa Suleyman opened the Build 2026 keynote with a simple statement: Microsoft's goal is Humanist Superintelligence — AI designed to serve people, not replace them. The MAI model family is the first concrete step toward that mission.

More significant than the models themselves is what Suleyman emphasized about how they were built: "We don't distill from other labs and we don't rely on opaque data. Our datasets are clean, traceable, and enterprise-grade." This is a direct signal that Microsoft — which has been a major Azure reseller of OpenAI models — is now building its own complete AI stack.

7 New models announced simultaneously

43 Languages supported by MAI-Transcribe-1.5

51.2% MAI-Code-1-Flash on SWE-Bench Pro

5B MAI-Code-1-Flash active parameters

The Seven Models, Explained

MAI-Image-2.5 & Flash

A multimodal image model supporting both text-to-image generation and precise image editing. It ranks #2 on the Arena leaderboard for image editing, surpassing Google's Nano Banana Pro. Already live in PowerPoint and rolling out to OneDrive, with availability on Microsoft Foundry for developers. The Flash variant optimizes for production-scale cost and latency.

MAI-Transcribe-1.5

The headline claim: the world's best transcription model, with state-of-the-art word error rate (WER) across 43 languages on the FLEURS multilingual benchmark. It transcribes an hour of audio in under 15 seconds — up to 5× faster than Gemini 3.1 and GPT-4o-Transcribe at comparable accuracy. Keyword Biasing allows users to supply domain-specific terminology, reducing WER by up to 30%. Integrating into Copilot, Teams, GitHub, and Dynamics 365 Contact Centre.

MAI-Voice-2 & Flash

High-quality speech generation across 15 languages, with the ability to adapt to a speaker's voice from a short audio sample. Voice-2-Flash is optimized for ultra-low-latency real-time voice agents.

MAI-Thinking-1

Microsoft AI's first dedicated reasoning model. It reaches 97.0% on AIME 2025 and 94.5% on AIME 2026, demonstrating strong mathematical and scientific reasoning competitive with models in its weight class. In blind human side-by-side evaluations, it outperforms Claude Sonnet 4.6. Currently available in private preview on Microsoft Foundry.

MAI-Code-1-Flash

At just 5B active parameters, MAI-Code-1-Flash achieves 51.2% on SWE-Bench Pro — 16 percentage points ahead of Claude Haiku 4.5 (35.2%) on the same benchmark. It solves harder problems with up to 60% fewer tokens on SWE-Bench Verified, reducing latency and cost simultaneously. Adaptive Solution Length Control automatically calibrates response depth to task complexity. Rolling out today as a selectable model in VS Code GitHub Copilot.

💡

How to Access MAI Models
MAI models are available via Microsoft Foundry, as well as OpenRouter, Fireworks AI, and Baseten. For the first time, developers can fine-tune model weights directly through Foundry. MAI-Code-1-Flash requires no setup for GitHub Copilot users in VS Code — it appears in the model picker automatically as the rollout progresses.

The Full MAI Family at a Glance

Model	Category	Key Metric	Where Available
MAI-Image-2.5	Image gen + editing	Arena image editing #2	PowerPoint, OneDrive, Foundry
MAI-Image-2.5-Flash	Image (efficient)	Cost/speed optimized	Foundry
MAI-Transcribe-1.5	Speech-to-text	Best WER in 43 languages	Copilot, Teams, GitHub, Foundry
MAI-Voice-2	Text-to-speech	15 languages, voice adapt	Copilot
MAI-Voice-2-Flash	Voice (low latency)	Ultra-low latency	Voice agents
MAI-Thinking-1	Reasoning	AIME 2026: 94.5%	Foundry (private preview)
MAI-Code-1-Flash	Coding	SWE-Bench Pro: 51.2%	GitHub Copilot, VS Code

ℹ️

Microsoft + Mayo Clinic: Medical AI
Alongside the MAI model family, Suleiman announced that Microsoft and Mayo Clinic are co-creating a frontier AI model for healthcare. The collaboration combines Mayo Clinic's world-leading clinical expertise and de-identified longitudinal clinical data with Microsoft's foundational AI capabilities — marking MAI's first major domain-specific vertical application.

Key Takeaways

Microsoft AI launched 7 new MAI models at Build 2026 spanning image, transcription, voice, reasoning, and coding.
All models trained on proprietary Microsoft data only — no distillation from OpenAI or third-party models.
MAI-Code-1-Flash: 5B params, SWE-Bench Pro 51.2%, live in GitHub Copilot and VS Code immediately.
MAI-Transcribe-1.5: best-in-class WER across 43 languages, 5× faster than comparable models.
Available on Foundry, OpenRouter, Fireworks, Baseten — weight fine-tuning permitted for the first time.

🔗

Official Sources & Documentation
— Microsoft AI Blog — Full MAI family announcement with specs and vision
— Introducing MAI-Thinking-1 — Reasoning model benchmarks and Foundry access
— Introducing MAI-Code-1-Flash — GitHub Copilot integration and SWE-Bench results