Why does the following loop unrolling lead to a wrong result?

I am currently trying to optimize some MIPS assembly that I’ve written for a program that triangulates a 24×24 matrix. My current goal is to use delayed branching and manual loop unrolling to cut down on the cycle count. Note: I am using 32-bit single precision for all the matrix arithmetic. Part of the…