[WIP] xe: gated_mlp: improve performance of ukernel-based gmlp #5059

Draft
hidefromkgb wants to merge 8 commits into main from aguskov/gated_mlp_ugemm_perf

@hidefromkgb (Contributor) commented Apr 21, 2026

Partly addresses MFDNN-14598.

Perf results so far:

| GPU   | ukern, ms | ref, ms  | perf, %  |
|-------|----------:|---------:|---------:|
| PTL-H |  1.058240 | 1.079940 | 102.0506 |
| BMG   |  0.574106 | 0.447617 |  77.9677 |
| LNL   |  2.653071 | 2.894186 | 109.0881 |
| DG2   |  1.374104 | 0.712022 |  51.8172 |
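
The perf column appears to be the reference time divided by the ukernel time, times 100, so values above 100 mean the ukernel path is faster than the reference. This reading is an assumption inferred from the numbers, not something stated in the PR. A minimal sketch that recomputes the column from the timings above (the `relative_perf` helper and `timings_ms` dictionary are hypothetical names used only for illustration):

```python
# Timings copied from the results above, in milliseconds: (ukern, ref).
timings_ms = {
    "PTL-H": (1.058240, 1.079940),
    "BMG": (0.574106, 0.447617),
    "LNL": (2.653071, 2.894186),
    "DG2": (1.374104, 0.712022),
}


def relative_perf(ukern_ms: float, ref_ms: float) -> float:
    """Return ref time as a percentage of ukernel time.

    Assumed definition: above 100 means the ukernel is faster than the reference.
    """
    return ref_ms / ukern_ms * 100.0


if __name__ == "__main__":
    for gpu, (ukern_ms, ref_ms) in timings_ms.items():
        print(f"{gpu:6s} {relative_perf(ukern_ms, ref_ms):9.4f}")
```

Under that reading, the ukernel path is ahead on PTL-H and LNL and behind on BMG and DG2.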

@hidefromkgb hidefromkgb requested review from a team as code owners April 21, 2026 00:04
@github-actions github-actions Bot added platform:gpu-intel Codeowner: @oneapi-src/onednn-gpu-intel component:tests Codeowner: @oneapi-src/onednn-arch component:common labels Apr 21, 2026
@dzarukin dzarukin marked this pull request as draft April 21, 2026 01:19