Commit Graph

  • ce32232a2d results and result generating AWKs master Richard Thier 2025-10-04 06:30:11 +02:00
  • ae88ba5725 not random, but thier3->2 change in rthier (which feels best now except for big rand magyar is best and smaller rand frewr and for comparison sort schwab is best) Richard Thier 2025-10-03 14:23:42 +02:00
  • 88da973e02 better thier3->2 in rthier and fflush-es for tail -f for measurement tracking Richard Thier 2025-10-03 12:40:14 +02:00
  • e7a4f24a87 re-enabled 4pasu for upcoming youtube-ing Richard Thier 2025-10-02 18:18:26 +02:00
  • 66376651a3 Revert "thier3: tricky rotation based state storing..." Richard Thier 2025-10-02 08:09:57 +02:00
  • 74e24486f4 Revert "tiny (hopefully) optimization for tpxb" Richard Thier 2025-10-02 08:09:44 +02:00
  • 2aa7de0d40 tiny (hopefully) optimization for tpxb Richard Thier 2025-10-02 08:09:27 +02:00
  • ce121571ca more stuff in makefile in case of releases (-march and -fschedule-insns) Richard Thier 2025-10-02 08:08:58 +02:00
  • 1d1f151c07 thier3: tricky rotation based state storing... Richard Thier 2025-10-02 05:48:24 +02:00
  • 7ef63734a1 magyarsort: comment about GC totally not working in my opinion Richard Thier 2025-10-02 04:52:13 +02:00
  • 12431f229e rthier randomized only above threshold Richard Thier 2025-10-02 04:51:33 +02:00
  • 9b9997cbdb Revert "simpler occurrence template" Richard Thier 2025-10-02 02:28:54 +02:00
  • d487bb111b simpler occurrence template Richard Thier 2025-10-02 02:28:46 +02:00
  • b5aeaa1bdb added frewr comment - because it became fastest on Oct 1, 2025 Richard Thier 2025-10-01 19:09:18 +02:00
  • fb0b8ce255 added licences - this is first commit that I will push upstream online to my gitea local repo! Richard Thier 2025-10-01 17:26:18 +02:00
  • 27873e06fe 7 relative randomization + a full random / 8 element Richard Thier 2025-10-01 17:12:58 +02:00
  • 603e689de7 added various shell script helpers Richard Thier 2025-10-01 16:49:30 +02:00
  • 7d407000fe added pre-randomized sorts (not so great so far - probably too much cache misses) Richard Thier 2025-10-01 16:49:00 +02:00
  • d43b55f065 thier3: added mlock/munlock for array and its temporary (you can turn this off) Richard Thier 2025-10-01 04:36:18 +02:00
  • ccdf991824 Revert "tpxb: 16-wide manual unroll - but it does not seem to be faster" Richard Thier 2025-10-01 04:26:44 +02:00
  • 100de9bc67 Revert "32-wide manual unroll with 2x compiled... still not as good perf as automatic 48x" Richard Thier 2025-10-01 04:26:32 +02:00
  • 18b734a6e7 32-wide manual unroll with 2x compiled... still not as good perf as automatic 48x Richard Thier 2025-10-01 04:18:04 +02:00
  • 6d79461262 tpxb: 16-wide manual unroll - but it does not seem to be faster Richard Thier 2025-10-01 04:02:08 +02:00
  • 036725611b removed non-temporal writes as too random patterns for it Richard Thier 2025-10-01 03:24:08 +02:00
  • a16917830f add back "make release_ypsu_noinline_debug_sym" for flamegraphs Richard Thier 2025-10-01 02:37:29 +02:00
  • 0beb389c50 Revert "more uint->int, but these seem to make it slower a bit so will be reverted" Richard Thier 2025-10-01 02:16:42 +02:00
  • ef9e4f799b more uint->int, but these seem to make it slower a bit so will be reverted Richard Thier 2025-10-01 02:16:34 +02:00
  • 478d87e148 bugfix: remaining 4095 in code after mass changing 4096 to 256 Richard Thier 2025-10-01 02:07:10 +02:00
  • 31dd239ad3 Revert "thier3 / tpxb: int->uint32, but this loses a little perf because likely compiler uses the UB of signed overflow to optimize out stuff so will be reverted as it is not a practical thing anyways" Richard Thier 2025-10-01 02:06:23 +02:00
  • 808b87f266 thier3 / tpxb: int->uint32, but this loses a little perf because likely compiler uses the UB of signed overflow to optimize out stuff so will be reverted as it is not a practical thing anyways Richard Thier 2025-10-01 02:06:14 +02:00
  • c032109110 Revert "tpxb: tried removal of relative addressing but it does not help, just makes n be int instead of uint32_t so probably will be reverted. Sad because this actually looked beneficial" Richard Thier 2025-10-01 01:53:38 +02:00
  • 5ecb48815b tpxb: tried removal of relative addressing but it does not help, just makes n be int instead of uint32_t so probably will be reverted. Sad because this actually looked beneficial Richard Thier 2025-10-01 01:53:28 +02:00
  • 98222d4494 tpxb: tried non-temporal writes (bad for random writes) Richard Thier 2025-10-01 01:28:49 +02:00
  • 22ec030116 thier3: micro-optimized some of the unrolls Richard Thier 2025-10-01 01:00:07 +02:00
  • 69d1432721 thier3: no unnecessary 4096 loops and storage because last commit makes it not necessary Richard Thier 2025-10-01 00:34:08 +02:00
  • 1f6ef0f2ea thier3: realized that I can run with 256 bucketed float prepass (many us faster!) + experimented with a 1bit split trick (too much overhead to win, despite less cache misses) Richard Thier 2025-10-01 00:29:32 +02:00
  • 45820cf81c re-added FlameGraph submodule Richard Thier 2025-09-30 22:22:30 +02:00
  • 08cb90bb1b Revert "prepared for flame graph analysis" Richard Thier 2025-09-30 22:18:10 +02:00
  • a849b01fa8 Reapply "adds cache_miss_flamegraph.sh" Richard Thier 2025-09-30 22:17:53 +02:00
  • 2a507e9f54 Revert "adds cache_miss_flamegraph.sh" Richard Thier 2025-09-30 22:17:38 +02:00
  • 52fc14b0f6 Revert "thier3: write caching queues fixed - bug just makes it slower despite less cache misses" Richard Thier 2025-09-30 22:17:30 +02:00
  • 967c7c19b5 thier3: write caching queues fixed - bug just makes it slower despite less cache misses Richard Thier 2025-09-30 22:12:22 +02:00
  • 78266ef345 adds cache_miss_flamegraph.sh Richard Thier 2025-09-30 17:22:19 +02:00
  • ac873f7123 prepared for flame graph analysis Richard Thier 2025-09-30 17:19:47 +02:00
  • da0c024a32 add results/ Richard Thier 2025-09-30 13:53:52 +02:00
  • 0a199b9d72 Revert "hand unrolled thiersort3 - I think its slower than gcc unrolling and surely more complex so I will revert" Richard Thier 2025-09-29 18:52:02 +02:00
  • 523605e8d8 hand unrolled thiersort3 - I think its slower than gcc unrolling and surely more complex so I will revert Richard Thier 2025-09-29 18:51:53 +02:00
  • a5cb0995e3 added missing headers for thiersort3 Richard Thier 2025-09-29 18:21:16 +02:00
  • 7ca9a19c5d tests for thier3 - works and very fast Richard Thier 2025-09-29 18:18:37 +02:00
  • 86f81d2a1c minor threepass optimizations and thier2 variant that uses threepass (but does unnecessary work in that case: allocation, extra copies, extra step for partitioning, etc) Richard Thier 2025-09-29 03:31:06 +02:00
  • a17b284c8a added three-plus-one pass radix which performs very well, but there is 0.8 ILP only because of lot of cache misses. worse perf on random than magyarsort, but better than ska_copy and best worst cases - might hook into thier2? Richard Thier 2025-09-29 02:24:50 +02:00
  • f4e4db43f9 4096-wise thiersort2 Richard Thier 2025-09-27 01:43:55 +02:00
  • 76001efd98 2048-wise thiersort2 Richard Thier 2025-09-27 01:41:52 +02:00
  • dcef96fee8 512-bucketed thiersort2 Richard Thier 2025-09-27 01:24:18 +02:00
  • 5fc08c6fae unlikely optimization in thiersort + measurements Richard Thier 2025-09-12 02:25:57 +02:00
  • 30e868d154 tried fewer but simpler bucketing Richard Thier 2025-09-12 01:58:28 +02:00
  • 2c5b0b1177 minor optimization Richard Thier 2025-09-12 01:49:20 +02:00
  • 5a8f34efa0 fixed thiersort2 Richard Thier 2025-09-12 01:42:11 +02:00
  • a3643eba9b added thiersort2 - better than std, somewhat similar to schwab in perf but is a bucket sort - very interestingly not huge boost in bucketing speed Richard Thier 2025-09-11 20:42:04 +02:00
  • 85aaf4b1a1 testing schwab_sort Richard Thier 2025-05-09 01:10:12 +02:00
  • 707ab1eb81 neoqs, meanqs and various quicksort variants Richard Thier 2025-05-06 03:06:37 +02:00
  • e38a76c0c4 added vergesort Richard Thier 2025-04-04 20:36:32 +02:00
  • b2c4e7082b mormord ILP-variant "nearly sorting properly" but some values buggy Richard Thier 2024-04-12 01:09:59 +02:00
  • b2d66b7fd0 some fixes for mormord-ilp-richi Richard Thier 2024-04-12 00:37:50 +02:00
  • 23a5bb1d55 mormordsort ILP version by me - with probably lot of bugs Richard Thier 2024-04-11 23:59:13 +02:00
  • 0f716e912c bit_partition function added - its like quicksort, but different Richard Thier 2024-04-11 21:43:18 +02:00
  • 3f0ae7ae77 Revert "mormord sort more branchless plus extra edge-case handling for empty sized calls" - speed was not great... Richard Thier 2024-04-11 20:02:23 +02:00
  • 9894f6c6d4 Revert "less branchless mor... not good I think" Richard Thier 2024-04-11 19:58:08 +02:00
  • 8e8d4257bc less branchless mor... not good I think Richard Thier 2024-04-11 19:57:58 +02:00
  • 3f4b17f0ef Revert "more branchless mormord - slower" Richard Thier 2024-04-11 19:54:07 +02:00
  • f4ceffe6e2 more branchless mormord - slower Richard Thier 2024-04-11 19:53:58 +02:00
  • 2d2cad2c5a mormord sort more branchless plus extra edge-case handling for empty sized calls Richard Thier 2024-04-11 19:45:30 +02:00
  • d16505a297 mormordsort got template recursion for 33% speedup (I think it still has 2x maybe) Richard Thier 2024-04-11 19:00:52 +02:00
  • ae2cd09452 removed unnecessary mormordsort if Richard Thier 2024-04-11 17:52:58 +02:00
  • 32e98de308 Revert "mormord sort further optimizations (for me slower - btw it might need to be called mormord-prenex-magyarsort at this point? I added a lot to it tbh like copied parts of thiersort for this to work)" Richard Thier 2024-04-11 17:19:10 +02:00
  • bccb1d0703 mormord sort further optimizations (for me slower - btw it might need to be called mormord-prenex-magyarsort at this point? I added a lot to it tbh like copied parts of thiersort for this to work) Richard Thier 2024-04-11 17:18:39 +02:00
  • 02bad1f59f minor optimization on mormord sort Richard Thier 2024-04-11 16:59:09 +02:00
  • b2d700f127 mormord sort - working version, slow on random input for me Richard Thier 2024-04-11 16:41:08 +02:00
  • 55583bcb4a mormordsort - buggy version (I actually think its some of the Magyarsort 2.x in this form - but needs fixing) Richard Thier 2024-04-11 06:13:51 +02:00
  • 6426560519 outliersort ideas Richard Thier 2023-07-20 23:28:52 +02:00
  • 0521ddd52d added neargoodsort idea (and some merge space optimization ideas that I think are known) Richard Thier 2023-07-20 21:09:57 +02:00
  • e3c229337c debug Richard Thier 2023-07-02 15:56:21 +02:00
  • 1c32648026 wip: debugging - should be reverted? Richard Thier 2023-07-02 13:33:27 +02:00
  • 259ae1e540 debug log for differences - I found nearby each elements to differ in this test! Richard Thier 2023-07-01 06:48:38 +02:00
  • 880fb7e991 with -g it seems there is some error actually... Richard Thier 2023-07-01 06:37:20 +02:00
  • 4436c79821 quicksort pivoting strategy changes when slowdown is recognized (works well against worst cases) Richard Thier 2023-07-01 06:06:03 +02:00
  • 83c79f4832 quicksort optimization to avoid const worstcase Richard Thier 2023-07-01 05:52:51 +02:00
  • f7c025c0dd 100k test case Richard Thier 2023-07-01 05:01:00 +02:00
  • c05e484ea0 interestingly the code I marked "rotten" might actually work lol Richard Thier 2023-07-01 04:53:42 +02:00
  • 4ad1c8b820 tested new thier and thier-qs and seems to work it looks like - constant is really slow because its the worst case for both (should be special-cased in my quicksort) Richard Thier 2023-07-01 04:50:32 +02:00
  • c47e8a133d added more regular quicksort as a separate file - still trying to prefer ours.. Richard Thier 2023-07-01 04:35:52 +02:00
  • 5df76664bb fixes to thiersort_apply - not sure actually but promising Richard Thier 2023-07-01 04:34:59 +02:00
  • 873c17f658 inplace quicksort fixes - but thier_apply seems like not doing anything? Richard Thier 2023-07-01 03:48:42 +02:00
  • 79b95bf905 various bugs Richard Thier 2023-06-30 22:06:24 +02:00
  • 58176a89b6 thiersort apply fixes, my own qsort added to the algs, quicksort_fromto fix; thier still buggy on random data - but others seem to get handled by its quicksorts under the hood... Richard Thier 2023-06-30 18:00:44 +02:00
  • 9ac3a76209 more info for thiersort testing - seems like apply maybe has a bug Richard Thier 2023-06-30 17:00:37 +02:00
  • 36189e8a3c hopefully fixing internal quicksorts? Richard Thier 2023-06-30 16:39:56 +02:00
  • 96e9fb4440 add thiersort for testing - all kinds of crashes for now Richard Thier 2023-06-30 16:39:33 +02:00
  • 88a8e87418 thiersort compile errors Richard Thier 2023-05-02 13:20:07 +02:00
  • 8dd103ca54 apply and prepare operations - first version Richard Thier 2023-04-29 19:14:03 +02:00