    arg_list.set(argn++, pd()->scale_stride(i, eff_b_arg));
}
if (problem->hasCMXScale()) {
    arg_list.set(argn++, stride_c / problem->cqGroupM);
This is now 64-bit, but the kernel interface uses 32-bit. I guess we don't have any type checks, but theoretically this could lead to a similar overflow issue.
Could this be true for stride_binary? https://github.com/uxlfoundation/oneDNN/blob/main/src/gpu/intel/gemm/jit.cpp#L238
Good points, I will try updating these as well and check if there are any issues.
Could we do something similar to what we do with the OpenCL dispatcher, and use 64-bit strides/dimensions only if the problem requires them?

We definitely can, but I only intend to do this if we encounter a performance regression. If the offset calculation is not important for performance (which is generally the case for GEMM), there is no benefit to adding this control; it only creates an extra point of failure.
Avoids incorrect offset calculations for batched GEMM.
Avoids incorrect offset calculations for batched GEMM. Fixes the issue reported in MFDNN-14479.