Richard Thier
|
31dd239ad3
|
Revert "thier3 / tpxb: int->uint32, but this loses a little perf because likely compiler uses the UB of signed overflow to optimize out stuff so will be reverted as it is not a practical thing anyways"
This reverts commit 808b87f266b2ce8a058b94d9183d100362abe1b4.
|
2025-10-01 02:06:23 +02:00 |
|
Richard Thier
|
808b87f266
|
thier3 / tpxb: int->uint32, but this loses a little perf because likely compiler uses the UB of signed overflow to optimize out stuff so will be reverted as it is not a practical thing anyways
|
2025-10-01 02:06:14 +02:00 |
|
Richard Thier
|
22ec030116
|
thier3: micro-optimized some of the unrolls
|
2025-10-01 01:00:07 +02:00 |
|
Richard Thier
|
69d1432721
|
thier3: removed the unnecessary 4096 loops and storage, which the last commit made redundant
|
2025-10-01 00:34:08 +02:00 |
|
Richard Thier
|
1f6ef0f2ea
|
thier3: realized that I can run with a 256-bucket float prepass (many µs faster!) + experimented with a 1-bit split trick (too much overhead to win, despite fewer cache misses)
|
2025-10-01 00:29:32 +02:00 |
|
Richard Thier
|
08cb90bb1b
|
Revert "prepared for flame graph analysis"
This reverts commit ac873f7123c0dd23ff9d73668e005c71944a8afa.
|
2025-09-30 22:18:10 +02:00 |
|
Richard Thier
|
52fc14b0f6
|
Revert "thier3: write caching queues fixed - bug just makes it slower despite less cache misses"
This reverts commit 967c7c19b54fd0db820bbfa1cbe199a8ac9f5419.
|
2025-09-30 22:17:30 +02:00 |
|
Richard Thier
|
967c7c19b5
|
thier3: write caching queues fixed - bug just makes it slower despite less cache misses
|
2025-09-30 22:12:22 +02:00 |
|
Richard Thier
|
ac873f7123
|
prepared for flame graph analysis
|
2025-09-30 17:19:47 +02:00 |
|
Richard Thier
|
a5cb0995e3
|
added missing headers for thiersort3
|
2025-09-29 18:21:16 +02:00 |
|