Pull requests: Dao-AILab/flash-attention


Pull requests list

[AMD] Triton Backend for ROCm #1
#1203 opened Sep 4, 2024 by micmelesse
flash_attn_varlen support tree attention
#1188 opened Aug 30, 2024 by efsotr
add softmax_d for mha_bwd
#1161 opened Aug 19, 2024 by MayDomine
Add how to import FA3 to documentation.
#1112 opened Jul 31, 2024 by AdamLouly
Windows actions
#1036 opened Jul 9, 2024 by bdashore3
change condition to num_heads >= num_heads_k
#1030 opened Jul 5, 2024 by Luke20000429
Fix +/-inf in LSE returned by forward
#978 opened Jun 3, 2024 by sgrigory
add pyproject.toml with build dependencies
#958 opened May 17, 2024 by dhellmann
Relative position encoding
#956 opened May 14, 2024 by b-albar (1 of 4 tasks)
ALiBi for the non-flash code path
#858 opened Feb 29, 2024 by Markus28
Add support for small page sizes
#824 opened Feb 13, 2024 by skrider
Add C++ build support for use with LibTorch
#819 opened Feb 9, 2024 by shaltielshmid
meta tensor stuff
#769 opened Jan 15, 2024 by tsengalb99
Jetson (aarch64) support
#724 opened Dec 14, 2023 by jasl
Update utils.py
#710 opened Dec 8, 2023 by adarshxs
Add flash_attn_varlen_func_with_kvcache.
#685 opened Nov 22, 2023 by garrett4wade
ProTip! Filter pull requests by the default branch with base:main.
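
As a sketch of that tip, a query like the one below in the search box above the list should show only open pull requests targeting the main branch (is:pr and is:open are this page's default qualifiers; base:main assumes the repository's default branch is named main):

    is:pr is:open base:main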