14 Commits

Author SHA1 Message Date
Richard Thier
d43b55f065 thier3: added mlock/munlock for array and its temporary (you can turn this off) 2025-10-01 04:36:18 +02:00
Richard Thier
0beb389c50 Revert "more uint->int, but these seem to make it slower a bit so will be reverted"
This reverts commit ef9e4f799b4f73e9319264a82fc89e885ef455ac.
2025-10-01 02:16:42 +02:00
Richard Thier
ef9e4f799b more uint->int, but these seem to make it slower a bit so will be reverted 2025-10-01 02:16:34 +02:00
Richard Thier
478d87e148 bugfix: remaining 4095 in code after mass changing 4096 to 256 2025-10-01 02:07:10 +02:00
Richard Thier
31dd239ad3 Revert "thier3 / tpxb: int->uint32, but this loses a little perf because likely compiler uses the UB of signed overflow to optimize out stuff so will be reverted as it is not a practical thing anyways"
This reverts commit 808b87f266b2ce8a058b94d9183d100362abe1b4.
2025-10-01 02:06:23 +02:00
Richard Thier
808b87f266 thier3 / tpxb: int->uint32, but this loses a little perf because likely compiler uses the UB of signed overflow to optimize out stuff so will be reverted as it is not a practical thing anyways 2025-10-01 02:06:14 +02:00
Richard Thier
22ec030116 thier3: micro-optimized some of the unrolls 2025-10-01 01:00:07 +02:00
Richard Thier
69d1432721 thier3: removed the 4096 loops and storage, which the previous commit made unnecessary 2025-10-01 00:34:08 +02:00
Richard Thier
1f6ef0f2ea thier3: realized that I can run with a 256-bucket float prepass (many µs faster!) + experimented with a 1-bit split trick (too much overhead to win, despite fewer cache misses) 2025-10-01 00:29:32 +02:00
Richard Thier
08cb90bb1b Revert "prepared for flame graph analysis"
This reverts commit ac873f7123c0dd23ff9d73668e005c71944a8afa.
2025-09-30 22:18:10 +02:00
Richard Thier
52fc14b0f6 Revert "thier3: write caching queues fixed - bug just makes it slower despite less cache misses"
This reverts commit 967c7c19b54fd0db820bbfa1cbe199a8ac9f5419.
2025-09-30 22:17:30 +02:00
Richard Thier
967c7c19b5 thier3: write caching queues fixed - bug just makes it slower despite less cache misses 2025-09-30 22:12:22 +02:00
Richard Thier
ac873f7123 prepared for flame graph analysis 2025-09-30 17:19:47 +02:00
Richard Thier
a5cb0995e3 added missing headers for thiersort3 2025-09-29 18:21:16 +02:00