10 Commits

Author SHA1 Message Date
Richard Thier
31dd239ad3 Revert "thier3 / tpxb: int->uint32, but this loses a little perf because likely compiler uses the UB of signed overflow to optimize out stuff so will be reverted as it is not a practical thing anyways"
This reverts commit 808b87f266b2ce8a058b94d9183d100362abe1b4.
2025-10-01 02:06:23 +02:00
Richard Thier
808b87f266 thier3 / tpxb: int->uint32, but this loses a little perf because likely compiler uses the UB of signed overflow to optimize out stuff so will be reverted as it is not a practical thing anyways 2025-10-01 02:06:14 +02:00
Richard Thier
22ec030116 thier3: micro-optimized some of the unrolls 2025-10-01 01:00:07 +02:00
Richard Thier
69d1432721 thier3: no unnecessary 4096 loops and storage because last commit makes it not necessary 2025-10-01 00:34:08 +02:00
Richard Thier
1f6ef0f2ea thier3: realized that I can run with 256 bucketed float prepass (many us faster!) + experimented with a 1bit split trick (too much overhead to win, despite less cache misses) 2025-10-01 00:29:32 +02:00
Richard Thier
08cb90bb1b Revert "prepared for flame graph analysis"
This reverts commit ac873f7123c0dd23ff9d73668e005c71944a8afa.
2025-09-30 22:18:10 +02:00
Richard Thier
52fc14b0f6 Revert "thier3: write caching queues fixed - bug just makes it slower despite less cache misses"
This reverts commit 967c7c19b54fd0db820bbfa1cbe199a8ac9f5419.
2025-09-30 22:17:30 +02:00
Richard Thier
967c7c19b5 thier3: write caching queues fixed - bug just makes it slower despite less cache misses 2025-09-30 22:12:22 +02:00
Richard Thier
ac873f7123 prepared for flame graph analysis 2025-09-30 17:19:47 +02:00
Richard Thier
a5cb0995e3 added missing headers for thiersort3 2025-09-29 18:21:16 +02:00