In Example 11.27 we suggested that there will be more cache misses than necessary if rows of the matrix Z are so long that they do not fit in the cache. If that is the case, how could you rewrite the loop nest in order to guarantee group-spatial reuse?
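One common way to approach this is to strip-mine the inner loop and move the strip loop outermost, so that only a narrow band of each row is live at a time. The sketch below is illustrative only: the loop body of Example 11.27 is not reproduced here, so the update `Z[i][j] += Z[i - 1][j]`, the dimension `N`, the strip width `STRIP`, and all function names are assumptions chosen for the demonstration, and the matrix is assumed to be stored in row-major order.

```c
#include <stddef.h>
#include <string.h>

#define N 64      /* hypothetical matrix dimension */
#define STRIP 16  /* hypothetical strip width; pick it so two row strips fit in cache */

/* Assumed original order: sweep an entire row before moving to the next.
 * If a row is longer than the cache, the start of row i-1 has been
 * evicted by the time row i needs the neighbouring elements again. */
void sweep_rows(double Z[N][N]) {
    for (size_t i = 1; i < N; i++)
        for (size_t j = 0; j < N; j++)
            Z[i][j] += Z[i - 1][j];
}

/* Strip-mined order: the j loop is split into strips of width STRIP and
 * the strip loop is moved outermost. Within one strip, only STRIP
 * elements of rows i-1 and i are touched, so lines brought in for one
 * row are still resident when the next row uses them. The dependence
 * runs only along i within a column, so the reordering is legal. */
void sweep_strips(double Z[N][N]) {
    for (size_t jj = 0; jj < N; jj += STRIP) {
        size_t jend = (jj + STRIP < N) ? jj + STRIP : N;
        for (size_t i = 1; i < N; i++)
            for (size_t j = jj; j < jend; j++)
                Z[i][j] += Z[i - 1][j];
    }
}

/* Returns 1 if both loop orders produce the same matrix. */
int orders_agree(void) {
    static double A[N][N], B[N][N];
    for (size_t i = 0; i < N; i++)
        for (size_t j = 0; j < N; j++)
            A[i][j] = B[i][j] = (double)(i * N + j);
    sweep_rows(A);
    sweep_strips(B);
    return memcmp(A, B, sizeof A) == 0;
}
```

In practice the strip width would be derived from the cache size and line size rather than fixed at compile time. The same transformation applies to other loop bodies as long as the reordering preserves their data dependences.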