How does using mulhsu avoid the shift for the 32-bit div5 case? The magic constant is 0x66666667, which is a positive number, so mulh and mulhsu are equivalent.
It needs 32 bits, not 31, so it avoids the requirement for shifting.
Doesn't the shifting requirement come from the magic constant being (2**33+3)/5? We need the shift because of the 33 exponent. The mulhsu only divides by 2**32.
I wrote this test code, which I think is a correct translation to C, and the result is twice as large as it should be:
#include <stdint.h>
#include <stdio.h>

int div5(int x) {
    uint32_t magic = 0x66666667U;
    int32_t q = ((int64_t)x * (uint64_t)magic) >> 32;
    if (x < 0)
        ++q;
    return q;
}

int main() {
    int x = 5;
    printf("%d %d\n", x/5, div5(x));
    return 0;
}
You’re right. The missing srai was left over from an earlier experiment with 0x333… as the magic number, which had a > 1 bit error for numbers > 0x400…. I have updated the code on GitHub, but the fix was delayed by a family emergency.
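For reference, here is a minimal C sketch of the sequence with the shift restored, assuming the fix is mulh followed by srai by 1 (an effective shift of 33, matching the (2**33+3)/5 constant discussed above). The function name div5_shift33 is hypothetical and is not taken from the GitHub code. The sketch also assumes that >> on a negative int64_t is an arithmetic shift, which C leaves implementation-defined but which common compilers provide.

```c
#include <stdint.h>

/* Signed division by 5 via multiply-and-shift.
 * 0x66666667 == (2**33 + 3) / 5, so the product must be divided by 2**33:
 * mulh supplies the >> 32, and srai supplies the remaining >> 1.
 * Assumption: >> on a negative int64_t is an arithmetic shift. */
int32_t div5_shift33(int32_t x) {
    const int64_t magic = 0x66666667;      /* positive, fits in 31 bits */
    int64_t prod = (int64_t)x * magic;     /* full signed 64-bit product */
    int32_t q = (int32_t)(prod >> 33);     /* mulh (>> 32), then srai 1 */
    if (x < 0)
        ++q;                               /* floor to truncation, like C's '/' */
    return q;
}
```

With x = 5 the product is 0x200000003. Shifting by 32 gives 2, the doubled result reported above, while shifting by 33 gives 1.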
You can easily run 32-bit RISC-V code here: https://riscv-programming.org/ale/