[PyTorch] Support FA3 MLA CP feature #1907
Open
Description
Because flash-attention #1604 already supports hdimQK != hdimV in the backward pass, we can now enable the FA3 (Flash Attention 3) backend for MLA (multi-latent attention). #1604 lets us skip the explicit padding and unpadding of hdimV when using FA3 as the attention backend, which brings performance benefits.
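For illustration, here is a minimal sketch of the padding/unpadding workaround that #1604 makes unnecessary. The helper name and shapes are hypothetical, and `F.scaled_dot_product_attention` stands in for the real FA3 kernel, which previously required hdimQK == hdimV in the backward pass:

```python
import torch
import torch.nn.functional as F

def mla_attn_with_pad_workaround(q, k, v, attn_fn):
    """Zero-pad v's head dim up to q/k's head dim, run attention, then
    slice the padded tail off the output. Mathematically equivalent to
    attention on the unpadded v, but wastes memory and bandwidth."""
    hdim_qk, hdim_v = q.shape[-1], v.shape[-1]
    v_padded = F.pad(v, (0, hdim_qk - hdim_v))  # zero-pad last dim
    out = attn_fn(q, k, v_padded)               # kernel sees hdimQK == hdimV
    return out[..., :hdim_v]                    # "unpad": drop padded columns

# Example with MLA-style head dims (q/k: 192, v: 128):
b, h, s = 2, 8, 256
q, k = torch.randn(b, h, s, 192), torch.randn(b, h, s, 192)
v = torch.randn(b, h, s, 128)
out = mla_attn_with_pad_workaround(q, k, v, F.scaled_dot_product_attention)
assert out.shape == (b, h, s, 128)
```

With #1604 in place, the unequal-head-dim tensors can be passed to the FA3 kernel directly, avoiding the extra pad, copy, and slice on every forward and backward pass.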
Test Results:
Type of change
Changes
Checklist: