Dec 26, 2025 | The Tongyi Weekly: Your weekly dose of cutting-edge AI from Tongyi Lab

Published: December 26, 2025 at 02:30 AM EST
3 min read
Source: Dev.to

Overview

As 2025 comes to a close, we want to extend our deepest gratitude to each of you for your creativity and support this year. Your experiments, feedback, and brilliant creations have been the heartbeat of our open ecosystem.

As a final gift of the year, we’re excited to share the newest models and tools born in this last week of 2025.

Let’s take a look at what’s just landed.

👉 Subscribe to The Tongyi Weekly and never miss a release
Subscribe Now →

📣 Model Release & Updates

Introducing Qwen-Image-Layered – native image decomposition, fully open‑sourced

Why it stands out

  • Photoshop‑grade layering – physically isolated RGBA layers with true native editability.
  • Prompt‑controlled structure – explicitly specify 3–10 layers, from coarse layouts to fine‑grained details.
  • Infinite decomposition – keep drilling down: layers within layers, to any depth of detail.
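Physically isolated RGBA layers recombine into the full image with standard alpha-over (Porter-Duff "over") blending, which is what makes per-layer edits lossless. The sketch below is not the model's API, just the generic compositing math, assuming straight-alpha float RGBA arrays ordered back to front:

```python
import numpy as np

def alpha_over(bottom: np.ndarray, top: np.ndarray) -> np.ndarray:
    """Composite one straight-alpha RGBA layer over another (Porter-Duff 'over')."""
    a_top = top[..., 3:4]
    a_bot = bottom[..., 3:4]
    a_out = a_top + a_bot * (1.0 - a_top)
    # Guard against division by zero where the result is fully transparent.
    safe = np.where(a_out == 0.0, 1.0, a_out)
    rgb = (top[..., :3] * a_top + bottom[..., :3] * a_bot * (1.0 - a_top)) / safe
    return np.concatenate([rgb, a_out], axis=-1)

def flatten(layers: list[np.ndarray]) -> np.ndarray:
    """Flatten a back-to-front stack of RGBA layers into one image."""
    out = layers[0]
    for layer in layers[1:]:
        out = alpha_over(out, layer)
    return out
```

Because each layer is isolated rather than baked in, you can edit or swap one layer and re-flatten without disturbing the others.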

Get started

New Open‑Source End‑to‑End Voice Model: Fun‑Audio‑Chat

We’re open‑sourcing Fun‑Audio‑Chat — an end‑to‑end voice model that’s more than just a chatbot. It’s your AI voice partner:

  • Empathetic – understands emotion, tone, and intent.
  • Action‑oriented – follows voice commands to complete tasks.
  • End‑to‑end S2S architecture – lower latency, higher efficiency.
  • Dual‑resolution design – ~50% lower GPU cost.
  • Leader on multiple benchmarks (OpenAudioBench, MMAU, etc.).

Try it

New Qwen3‑TTS Lineup: VoiceDesign & VoiceClone

Create, control, and clone voices—faster and more expressive than ever.

VoiceDesign‑VD‑Flash

  • Fully controllable speech via free‑form text instructions (tone, rhythm, emotion, persona).
  • No preset voices – design your own unique vocal identity.
  • Outperforms GPT‑4o‑mini‑tts & Gemini‑2.5‑pro on role‑play benchmarks.

VoiceClone‑VC‑Flash

  • Clone any voice from just 3 seconds of audio.
  • Generate speech in 10+ languages (Chinese, English, Japanese, Spanish, etc.).
  • 15% lower WER vs. ElevenLabs & GPT‑4o‑Audio in multilingual tests.
  • Context‑aware cadence for more natural delivery.
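The WER figure above is standard word error rate: word-level edit distance (substitutions + insertions + deletions) divided by the number of reference words. A minimal self-contained sketch of the metric:

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level Levenshtein distance / reference word count."""
    ref, hyp = reference.split(), hypothesis.split()
    # Rolling-array dynamic programming over word sequences.
    d = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        prev, d[0] = d[0], i
        for j, h in enumerate(hyp, 1):
            cur = d[j]
            d[j] = min(d[j] + 1,         # deletion
                       d[j - 1] + 1,     # insertion
                       prev + (r != h))  # substitution (or match)
            prev = cur
    return d[-1] / max(len(ref), 1)
```

So "15% lower WER" means the cloned voice's transcripts drift 15% less from the reference text than the compared systems' do on the same multilingual test set.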

Try it now

Qwen‑Image‑Edit‑2511: Stronger Consistency & Real‑World Image Editing

What’s new in 2511

  • Stronger multi‑person consistency for group photos and complex scenes.
  • Built‑in popular community LoRAs – no extra tuning required.
  • Enhanced industrial & product‑design generation.
  • Reduced image drift with dramatically improved character & identity consistency.
  • Improved geometric reasoning (construction lines, structural edits).

From identity‑preserving portrait edits to high‑fidelity multi‑person fusion and practical engineering & design workflows, 2511 pushes image editing to the next level.

Try it now

🧩 Ecosystem Highlights

Z‑Image Turbo: #1 Open‑Weight Text‑to‑Image Model in the Artificial Analysis Image Arena

According to Artificial Analysis, Z‑Image Turbo now ranks #1 among all open‑weight models in its Image Arena.

Why it leads

  • Only $5 / 1k images on Alibaba Cloud.
  • Runs on consumer hardware with just 16 GB of memory.
  • Apache 2.0 open‑source license.
  • A 6B powerhouse that proves high quality doesn’t require high cost.

Z‑Image Turbo ranking

✨ Community Spotlights

Portrait Photography: BEYOND REALITY Z IMAGE 1.0 from Nurburgring

Fine‑tuned from Z‑Image‑Turbo, this model optimizes skin textures and environmental details while preserving analog film aesthetics. Available in both BF16 and FP8 (the latter runs on 8 GB VRAM hardware).

👉 Try it here

📬 Want More? Stay Updated

Every week we bring you:

  • New model releases & upgrades
  • AI research breakthroughs
  • Open‑source tools you can use today
  • Community highlights that inspire

👉 Subscribe to The Tongyi Weekly and never miss a release
Subscribe Now →

Tongyi Lab is a research institution under Alibaba Group dedicated to artificial intelligence and foundation models, focusing on the research, development, and innovative applications of AI models across diverse domains. Tongyi Lab spans large language models (LLMs), multimodal understanding and generation, visual AIGC, speech technologies, and more.
