[rls-v3.12]xe: ggemm: Optimize sparse groups by launching wg per token #5075

Open
umar456 wants to merge 2 commits into uxlfoundation:rls-v3.12 from umar456:cherry-pick/fd4a50ea-to-rls-v3.12

umar456 (Contributor) commented Apr 23, 2026

Summary

Backport of sparse groups optimization to rls-v3.12. Includes a prerequisite fix for src_attr_zp calculation that was missing from the release branch.

Changes

  • 669979f995 xe: ggemm: Fix src_attr_zp calculation for grouped src zp (prerequisite, not in rls-v3.12)
  • 500f6c5143 xe: ggemm: Optimize sparse groups by launching wg per token

Original Change

  • Original PR branch: uarshad/ggemm_second_token_perf

(cherry picked from commit 669979f and 500f6c5)

github-actions bot added the platform:gpu-intel (Codeowner: @oneapi-src/onednn-gpu-intel), backport, and component:tests (Codeowner: @oneapi-src/onednn-arch) labels Apr 23, 2026

umar456 changed the title from "xe: ggemm: Fix src_attr_zp calculation for grouped src zp" to "xe: ggemm: Optimize sparse groups by launching wg per token [rls-v3.12]" Apr 23, 2026

umar456 changed the title from "xe: ggemm: Optimize sparse groups by launching wg per token [rls-v3.12]" to "[rls-v3.12]xe: ggemm: Optimize sparse groups by launching wg per token" Apr 23, 2026

umar456 marked this pull request as ready for review April 23, 2026 22:15

umar456 requested review from a team as code owners April 23, 2026 22:15

vpirogov (Contributor) commented Apr 24, 2026

@umar456, please add links to the PRs to main that are being backported. The general rule for backports is that PRs must land in main first.
