How does this change lead to different performance?

February 27, 2023

I’ve just stumbled upon this from 2015 Fall: 15-213 Introduction to Computer Systems lecture by Carnegie Mellon University.

Is this incorrect? Because it’s exactly the same code with swapped int variables. Please help. I appreciate any explanation here.

>Solution :

Lets take a small and simple "2D" array:

int a[2][2];

In memory it’s laid out like this:

+---------+---------+---------+---------+
| a[0][0] | a[0][1] | a[1][0] | a[1][1] |
+---------+---------+---------+---------+

In the first variant of the loops, you iterate in the order a[0][0], a[0][1], a[1][0], and a[1][1]. You iterate over the elements that are close together, so they are in the cache.

In the second variant you iterate in the order a[0][0], a[1][0], a[0][1], and a[1][1].

See how you jump back and forth in the second variant? When the arrays are larger, this will make the CPU need to fetch data from memory more often, since it’s no longer possible to fit it all into the cache. More memory access leads to lower efficiency and longer execution times.