
Making Mixture-of-Experts Robust: A Dual-Model Strategy for Accuracy and Adversarial Defense

Mixture-of-Experts (MoE) models are a core building block of modern large-scale AI systems — from Vision Transformers to large language models. They scale efficiently by routing inputs to specialized expert networks.

But this modular structure hides a structural vulnerability.

In our ICML 2025 paper:

“Optimizing Robustness and Accuracy in Mixture of Experts: A Dual-Model Approach”

we ask:

Can we make MoE models adversarially robust without sacrificing standard accuracy?

This post explains our key insight, the proposed method, and why it matters.


🔎 Overview of Our Approach

Below is the overview figure from our paper:

*(Figure: overview of the Robust MoE framework from the paper)*

The framework has three main components:

  1. Diagnose the structural weakness of MoE
  2. Robustify expert networks (RT-ER)
  3. Balance robustness and accuracy via a dual-model with joint training (JTDMoE)

1️⃣ Why Are Mixture-of-Experts Vulnerable?

An MoE model consists of:

  • Multiple expert networks $f_i(x)$
  • A router $a_i(x)$ assigning weights
  • A weighted aggregation:

$$
F(x) = \sum_{i=1}^{E} a_i(x)\, f_i(x)
$$

MoE works because different experts specialize in different regions of the input space.
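The aggregation above can be sketched in a few lines. The linear experts, softmax router, and dimensions below are toy assumptions for illustration, not the paper's architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

E, d_in, d_out = 4, 8, 3                         # experts, input dim, output dim
W_experts = rng.normal(size=(E, d_in, d_out))    # toy linear experts f_i
W_router = rng.normal(size=(d_in, E))            # toy linear router

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def moe_forward(x):
    """F(x) = sum_i a_i(x) f_i(x) with softmax routing weights."""
    a = softmax(x @ W_router)                    # routing weights a_i(x), sum to 1
    f = np.stack([x @ W_experts[i] for i in range(E)])  # expert outputs f_i(x)
    return (a[:, None] * f).sum(axis=0)

x = rng.normal(size=d_in)
y = moe_forward(x)
```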

However, adversarial perturbations reveal a structural asymmetry.

🔬 Our Key Observation

We evaluate four robustness metrics:

  • SA – Standard Accuracy
  • RA – Robust Accuracy (whole model attack)
  • RA-E – Attack targeting experts
  • RA-R – Attack targeting router

On CIFAR-10:

  • Router robustness (RA-R) > 50%
  • Expert robustness (RA-E) < 4%

This means:

The experts — not the router — are the weak link.

Even if the router behaves well, a fragile expert can collapse the entire model.


2️⃣ Why Standard Adversarial Training Fails

Standard adversarial training solves:

$$
\min_\theta \; \max_{\|\delta\| \le \epsilon} \ell(F(x+\delta), y)
$$
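The inner maximization is typically approximated with projected gradient descent (PGD). A minimal sketch on a toy logistic model with an analytic gradient (a real MoE would use autograd; `pgd_attack` and its defaults are our illustrative choices):

```python
import numpy as np

def pgd_attack(x, y, w, b, eps=0.1, alpha=0.02, steps=10):
    """Inner maximization: find delta with ||delta||_inf <= eps that maximizes
    the cross-entropy loss of a toy logistic model sigma(w.x + b)."""
    delta = np.zeros_like(x)
    for _ in range(steps):
        z = w @ (x + delta) + b
        p = 1.0 / (1.0 + np.exp(-z))             # predicted P(class 1)
        grad_x = (p - y) * w                     # gradient of CE loss w.r.t. input
        delta = delta + alpha * np.sign(grad_x)  # ascent step (maximize the loss)
        delta = np.clip(delta, -eps, eps)        # project back into the eps-ball
    return delta

def ce_loss(x, y, w, b):
    z = w @ x + b
    p = 1.0 / (1.0 + np.exp(-z))
    return -(y * np.log(p) + (1 - y) * np.log(1 - p))
```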

But MoE has a modular structure:

  • Routing decisions may change under perturbation.
  • Some experts remain poorly trained.
  • Robustness becomes uneven across experts.

Empirically, we observe:

  • Training instability.
  • Sudden accuracy drops.
  • Robust accuracy stagnating around 54%.

This motivates a component-level solution.


3️⃣ RT-ER: Robust Training with Expert Robustification

Instead of only optimizing the final MoE output, we explicitly regularize expert behavior.

We propose:

RT-ER (Robust Training with Experts’ Robustification)

$$
\mathcal{L}_{rob} =
\max_{\|\delta\| \le \epsilon}
\Big[ \ell_{CE}(F(x+\delta), y)
+ \beta \cdot \ell_{KL}\big(f_2(x+\delta),\, f_2(x)\big) \Big]
$$

Where:

  • $f_2$ is the second-top expert based on routing weight.
  • KL divergence enforces output consistency between clean and adversarial inputs.
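Given an adversarial example that has already been found, the RT-ER loss itself is a small computation. A sketch with illustrative argument names (the routing-weight ranking and probability handling below are our assumptions):

```python
import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def kl_div(p, q, eps=1e-12):
    """KL(p || q) between probability vectors."""
    return float(np.sum(p * (np.log(p + eps) - np.log(q + eps))))

def rt_er_loss(logits_adv, y, route_w, expert_logits_clean,
               expert_logits_adv, beta=1.0):
    """RT-ER objective for one adversarial example: cross-entropy on the full
    MoE output plus beta times a KL consistency penalty on the second-top
    expert (ranked by routing weight)."""
    ce = -np.log(softmax(logits_adv)[y] + 1e-12)
    i2 = np.argsort(route_w)[-2]                 # second-top expert index
    kl = kl_div(softmax(expert_logits_adv[i2]),
                softmax(expert_logits_clean[i2]))
    return ce + beta * kl
```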

Why the Second-Top Expert?

When adversarial noise alters the routing decision, the second-top expert often becomes active.

If that expert is fragile, robustness collapses.

RT-ER ensures:

  • Experts behave consistently.
  • Routing changes become less harmful.
  • Lipschitz constants decrease.
  • Training becomes stable.

📈 Results

Compared to traditional adversarial training:

  • +16% robust accuracy improvement (CIFAR-10)
  • Significantly more stable training
  • Minimal drop in clean accuracy

Robustness is now distributed across experts.


4️⃣ The Robustness–Accuracy Trade-Off

Even with RT-ER, improving robustness slightly reduces clean accuracy.

So we ask:

Can we balance robustness and accuracy instead of choosing one?


5️⃣ Dual-Model Strategy

We introduce a dual-model framework:

$$
F_D(x) = (1 - \alpha)\, F_S(x) + \alpha\, F_R(x)
$$

Where:

  • $F_S$ = standard MoE
  • $F_R$ = robust MoE (trained via RT-ER)
  • $\alpha \in [0.5, 1]$

This provides a controllable trade-off:

  • Larger $\alpha$ → stronger robustness
  • Smaller $\alpha$ → higher clean accuracy
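The combination itself is a one-liner; the two stand-in models below are toy assumptions for illustration:

```python
import numpy as np

def dual_model(x, F_S, F_R, alpha=0.7):
    """F_D(x) = (1 - alpha) * F_S(x) + alpha * F_R(x); alpha in [0.5, 1]
    dials between clean accuracy (low alpha) and robustness (high alpha)."""
    assert 0.5 <= alpha <= 1.0
    return (1 - alpha) * F_S(x) + alpha * F_R(x)

# Toy stand-ins for the two MoE models (logit vectors over 2 classes):
F_S = lambda x: np.array([2.0, 0.0])   # standard model favors class 0
F_R = lambda x: np.array([0.0, 2.0])   # robust model favors class 1
```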

6️⃣ Theoretical Insight: Certified Robustness Bound

We derive certified robustness bounds for both:

  • Single MoE
  • Dual-model

The bound reveals:

$$
\epsilon \propto \frac{\text{classification margin}}{\text{Lipschitz constants}}
$$

Implications:

  • Increasing margin improves robustness.
  • Reducing expert Lipschitz constants improves robustness.
  • Robustness fundamentally depends on expert stability.
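The general shape of such a certificate follows from a standard Lipschitz argument (a sketch only; the paper's exact bound may differ in constants and norm choice):

```latex
% Let c = \arg\max_j F_j(x) be the predicted class, and define the margin
\[
  m(x) = F_c(x) - \max_{j \ne c} F_j(x).
\]
% If each class score F_j is L-Lipschitz, then
% |F_j(x+\delta) - F_j(x)| \le L \|\delta\|,
% so no score can overtake F_c, and the prediction cannot flip, while
\[
  \|\delta\| < \frac{m(x)}{2L}.
\]
```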

This provides theoretical support for RT-ER.


7️⃣ JTDMoE: Joint Training for Dual-Model

Instead of training models separately, we propose:

JTDMoE (Joint Training for Dual-Model)

Bi-level optimization:

Lower level:

$$
\min_{\Theta_R} \mathcal{L}_{rob}
$$

Upper level:

$$
\min_{\Theta_S, \Theta_R} \ell_{CE}(F_D(x), y)
$$

This aligns:

  • Standard MoE
  • Robust MoE
  • Dual-model margin
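The alternating update pattern can be illustrated on scalar stand-ins. The quadratic losses and targets below are ours, chosen only to make the dynamics visible; they are not the paper's objectives:

```python
import numpy as np

# Scalar stand-ins: each "model" is one parameter producing a score for the
# true class. l_rob stands in for the lower-level robust loss on Theta_R,
# l_ce_dual for the upper-level CE loss on the dual-model output.
ROBUST_OPT, CLEAN_OPT = 2.0, 3.0   # hypothetical optima of the two objectives
ALPHA, LR = 0.7, 0.1

def l_rob(theta_R):
    return (theta_R - ROBUST_OPT) ** 2

def l_ce_dual(theta_S, theta_R):
    f_D = (1 - ALPHA) * theta_S + ALPHA * theta_R
    return (f_D - CLEAN_OPT) ** 2

theta_S, theta_R = 0.0, 0.0
for _ in range(1000):
    # Lower level: gradient step on the robust loss w.r.t. theta_R.
    theta_R -= LR * 2 * (theta_R - ROBUST_OPT)
    # Upper level: gradient step on the dual-model loss w.r.t. both parameters.
    g = 2 * ((1 - ALPHA) * theta_S + ALPHA * theta_R - CLEAN_OPT)
    theta_S -= LR * (1 - ALPHA) * g
    theta_R -= LR * ALPHA * g
```

At the fixed point of this toy system, both objectives are satisfied at once: the robust parameter sits at its own optimum while the standard parameter absorbs the remaining clean-accuracy pressure, which is the intuition behind joint training.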

🚀 Empirical Gains

Under AutoAttack:

  • +20% robust accuracy improvement (CIFAR-10)
  • Clean accuracy improves
  • Margin increases across all classes

This confirms our theoretical prediction.


🔑 Key Takeaways

If you remember three things:

  1. Experts are the structural vulnerability in MoE.
  2. Targeted expert robustification stabilizes the architecture.
  3. A jointly trained dual-model breaks the robustness–accuracy trade-off.

📎 Paper & Code

📄 ICML 2025 Paper
Optimizing Robustness and Accuracy in Mixture of Experts: A Dual-Model Approach

💻 Code
https://github.com/TIML-Group/Robust-MoE-Dual-Model


Mixture-of-Experts models are becoming foundational in large-scale AI.

Understanding and strengthening their structural robustness is essential for deploying reliable and safe systems.

If you're working on robust ML, MoE architectures, or scalable AI systems — I would love to discuss.

