xe: gemm: fix stride interface by rjoursler · Pull Request #5068 · uxlfoundation/oneDNN

rjoursler · 2026-04-22T17:38:42Z

Avoids incorrect offset calculations for batched GEMM. Fixes the following issue, reported in MFDNN-14479:

$ ./tests/benchdnn/benchdnn --matmul --engine=gpu --dt=f32:f32:f32 --stag=acbd --wtag=adbc --dtag=abcd --attr-post-ops=binary_mul:f32:0 --attr-scratchpad=user 4x3x16413x16:4x3x16x16413
Segmentation fault from GPU at 0xff0000000000f000, ctx_id: 1 (CCS) type: 0 (NotPresent), level: 1 (PDE), access: 1 (Write), banned: 1, aborting.
Segmentation fault from GPU at 0xff0000000000f000, ctx_id: 1 (CCS) type: 0 (NotPresent), level: 1 (PDE), access: 1 (Write), banned: 1, aborting.
Abort was called at 306 line in file:
./shared/source/os_interface/linux/drm_neo.cpp
Aborted (core dumped)
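To illustrate why this shape is a plausible trigger for bad offsets, here is a minimal sketch of the batch offset arithmetic for the destination tensor. It assumes a dense `abcd` destination of shape 4x3x16413x16413 (derived from the matmul shapes above) and offsets counted in elements. The function names are hypothetical and are not taken from oneDNN; the PR does not state where in the kernel the bad offset was formed.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical helper: element offset of batch (a, b) in a dense,
// row-major A x B x M x N tensor, computed entirely in 64-bit arithmetic.
std::int64_t batch_offset_i64(std::int64_t a, std::int64_t b,
        std::int64_t B, std::int64_t M, std::int64_t N) {
    assert(a >= 0 && b >= 0 && b < B);
    const std::int64_t stride_b = M * N;        // elements per matrix
    const std::int64_t stride_a = B * stride_b; // elements per outer batch
    return a * stride_a + b * stride_b;
}

// True if the value can be represented by a signed 32-bit integer.
bool fits_in_i32(std::int64_t v) {
    return v >= std::numeric_limits<std::int32_t>::min()
            && v <= std::numeric_limits<std::int32_t>::max();
}
```

For the last batch, `batch_offset_i64(3, 2, 3, 16413, 16413)` is 2963252259, which `fits_in_i32` rejects. Each individual stride still fits in 32 bits here (the outer batch stride is 808159707), so the overflow only appears once a stride is multiplied by a batch index.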

rjoursler · 2026-04-22T17:40:59Z

make test
disable test_device_cpu
disable build_cpu_runtime_omp
disable build_cpu_runtime_sycl
disable build_cpu_runtime_tbb

rjoursler · 2026-04-22T17:41:28Z

make test_ov

echeresh · 2026-04-22T20:11:02Z

                arg_list.set(argn++, pd()->scale_stride(i, eff_b_arg));
            }
            if (problem->hasCMXScale()) {
                arg_list.set(argn++, stride_c / problem->cqGroupM);


This is now 64-bit, but the kernel interface uses 32-bit. I guess we don't have any type checks, but theoretically that could lead to a similar overflow issue.

h-sadia · 2026-04-24

Could this also be true for stride_binary? https://github.com/uxlfoundation/oneDNN/blob/main/src/gpu/intel/gemm/jit.cpp#L238

rjoursler · 2026-04-27

Good points. I will try updating these as well and check if there are any issues.
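The concern in this thread, a 64-bit host value handed to a 32-bit kernel argument with no type check, can be sketched with a checked narrowing helper. This is only an illustration under assumed names: `narrow_to_i32` and `narrow_to_i32_checked` are hypothetical and are not part of oneDNN or of this PR.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <optional>

// Hypothetical helper: convert a 64-bit value to 32 bits only when the
// conversion preserves the value; otherwise report failure.
std::optional<std::int32_t> narrow_to_i32(std::int64_t v) {
    if (v < std::numeric_limits<std::int32_t>::min()
            || v > std::numeric_limits<std::int32_t>::max())
        return std::nullopt;
    return static_cast<std::int32_t>(v);
}

// Hypothetical variant for call sites where a value that does not fit is
// a programming error. The check is compiled out when NDEBUG is defined.
std::int32_t narrow_to_i32_checked(std::int64_t v) {
    const auto r = narrow_to_i32(v);
    assert(r.has_value() && "value does not fit a 32-bit kernel argument");
    return *r;
}
```

A call site that must pass a 32-bit argument could route the stride through such a helper, so that an out-of-range value is caught on the host instead of being silently truncated.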

Simonsays095 · 2026-04-24T16:28:31Z

Could we do something similar to what we do with the opencl dispatcher, and use 64-bit strides/dimensions only if required by the problem?

rjoursler · 2026-04-24T18:09:16Z

> Could we do something similar to what we do with the opencl dispatcher, and use 64-bit strides/dimensions only if required by the problem?

We definitely can, but I only intend to do this if we encounter a performance regression. If the offset calculations are not important for performance (which is generally the case for GEMM), then there is no benefit to adding this control; it only creates an extra point of failure.
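For reference, the kind of per-problem selection being discussed could look roughly like the sketch below. It is an assumption-laden illustration, not the OpenCL dispatcher's logic and not code from this PR: the function names are hypothetical, offsets are counted in elements, strides are assumed non-negative, and overflow of the 64-bit accumulator itself is not handled.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical helper: largest element offset reachable in a tensor with
// the given dimensions and (non-negative) strides.
std::int64_t max_offset(const std::vector<std::int64_t> &dims,
        const std::vector<std::int64_t> &strides) {
    assert(dims.size() == strides.size());
    std::int64_t off = 0;
    for (std::size_t i = 0; i < dims.size(); ++i) {
        assert(dims[i] >= 1 && strides[i] >= 0);
        off += (dims[i] - 1) * strides[i];
    }
    return off;
}

// Hypothetical predicate: 64-bit offsets are required only when the
// largest reachable offset exceeds the signed 32-bit range.
bool needs_64bit_offsets(const std::vector<std::int64_t> &dims,
        const std::vector<std::int64_t> &strides) {
    return max_offset(dims, strides)
            > std::numeric_limits<std::int32_t>::max();
}
```

The predicate is cheap, but every consumer of its result becomes a second code path that has to be tested, which is the extra point of failure mentioned above.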

rjoursler · 2026-04-27T22:40:28Z

make test
disable test_device_cpu
disable build_cpu_runtime_omp
disable build_cpu_runtime_sycl
disable build_cpu_runtime_tbb

rjoursler · 2026-04-27T22:40:37Z

make test_ov

rjoursler requested a review from a team as a code owner April 22, 2026 17:38

github-actions bot added the platform:gpu-intel label (Codeowner: @oneapi-src/onednn-gpu-intel) Apr 22, 2026

dyoussif approved these changes Apr 22, 2026

atkassen approved these changes Apr 22, 2026

echeresh reviewed Apr 22, 2026

xe: gemm: fix stride interface

053f482

Avoids incorrect offset calculations for batched GEMM.

rjoursler force-pushed the rjoursle/gemm_stride branch from c6ddc5f to 053f482 Compare April 27, 2026 22:39

echeresh approved these changes Apr 27, 2026

rjoursler wants to merge 1 commit into main from rjoursle/gemm_stride

Review comments: echeresh (Apr 22, 2026), h-sadia (Apr 24, 2026), rjoursler (Apr 27, 2026)

6 participants
