Richard Thier
|
3fdcaad537
|
trying some prefetch - not that good yet
|
2021-12-17 21:42:35 +01:00 |
|
Richard Thier
|
0b4eb5e5a6
|
minor speed tweaks by being able to define the counter type
|
2021-12-17 21:17:53 +01:00 |
|
Richard Thier
|
1686967f10
|
minor tweaks to 4pasu and added 4rot
|
2021-12-17 19:20:58 +01:00 |
|
Richard Thier
|
a878f20100
|
ypsu's 4passu method optimized a bit
|
2021-12-15 16:09:40 +01:00 |
|
Richard Thier
|
a947cda58d
|
Revert "vsort version that got slower, but is really funny template code"
This reverts commit fd35dbc51b63fa97ff5a9d7a823cdfa271b99a43.
|
2021-12-15 14:48:27 +01:00 |
|
Richard Thier
|
fd35dbc51b
|
vsort version that got slower, but is really funny template code
|
2021-12-15 14:48:14 +01:00 |
|
Richard Thier
|
bff96c8f7f
|
upgraded vsort a bit (50-100ms)
|
2021-12-15 12:53:00 +01:00 |
|
Richard Thier
|
520db7049d
|
added ypsu-variants of radix-like things
|
2021-12-15 12:52:33 +01:00 |
|
Richard Thier
|
a044787846
|
finally a real optimization again, plus an API for reuse - even faster for the non-reused case
|
2021-12-15 03:14:35 +01:00 |
|
Richard Thier
|
3490201420
|
further optimization - the API change, however, is not a no-cost abstraction: it makes clang slower than the original heap variant, and g++, albeit faster than the original, is not as fast as hardcoded - will investigate the API change
|
2021-12-15 00:43:25 +01:00 |
|
Richard Thier
|
a1d6e96f5a
|
back to regular perf / measure run
|
2021-12-14 17:32:43 +01:00 |
|
Richard Thier
|
05235e269f
|
added simd-sort - basically the whole repo, but I haxed-in magyarsort as measure
|
2021-12-14 17:30:07 +01:00 |
|
Richard Thier
|
c4ed2994ea
|
Setting up causal profiling with "coz"
|
2021-12-14 17:29:33 +01:00 |
|
Richard Thier
|
675b90c0d8
|
make: release_debug_sym for better perfing
|
2021-12-13 04:20:05 +01:00 |
|
Richard Thier
|
11ceee29a1
|
minor tweaking for more ILP
|
2021-12-13 03:48:17 +01:00 |
|
Richard Thier
|
c2fc962766
|
test can now be used with perf, valgrind --tool=cachegrind and such tools
|
2021-12-13 03:48:01 +01:00 |
|
Richard Thier
|
bcdb905748
|
added better test by rlblaster / ypsu / kbalazs
|
2021-12-13 02:30:12 +01:00 |
|
Richard Thier
|
62dcda6bf2
|
minor tweaks
|
2021-12-13 02:18:08 +01:00 |
|
Richard Thier
|
76ba29018d
|
tweak: added ska_sort for measuring against because Ypsu told me about it... mixed results on my machine (on small numbers below 100 mine wins always, above it is really mixed and close)
|
2021-12-13 00:51:26 +01:00 |
|
Richard Thier
|
860cc4e702
|
added some result measurements - why image? will likely look better on git when someone shares it on social media if I add to readme...
|
2021-03-13 17:11:34 +01:00 |
|
Richard Thier
|
68684f7fb0
|
Implemented ILP and cache optimized simple radix variant - surprisingly good already!
|
2021-03-13 15:51:24 +01:00 |
|
Richard Thier
|
4199393153
|
make: c++14 and -O2
|
2021-03-13 11:12:29 +01:00 |
|
Richard Thier
|
1d0ba81e49
|
tried if it works with nibbles too: seems like easier to debug actually in this mode
|
2021-03-11 23:20:03 +01:00 |
|
Richard Thier
|
151b8f398b
|
Likely better ILP and no manual digit counts in code
|
2021-03-11 23:13:53 +01:00 |
|
Richard Thier
|
22e80d4cd5
|
indent
|
2021-03-11 22:40:37 +01:00 |
|
Richard Thier
|
f30b5056cc
|
noexcept
|
2021-03-11 22:39:53 +01:00 |
|
Richard Thier
|
83ae455c34
|
rename
|
2021-03-11 22:38:23 +01:00 |
|
Richard Thier
|
cfc9a050e4
|
removed manual digit usage by recursive template trickz
|
2021-03-11 22:34:44 +01:00 |
|
Richard Thier
|
c7fe2f0507
|
little refactor to maybe avoid manual digit misery
|
2021-03-11 22:19:29 +01:00 |
|
Richard Thier
|
e076ab662b
|
prefix sum
|
2021-03-11 22:06:50 +01:00 |
|
Richard Thier
|
f8d4f597c6
|
fix dumb mistakes
|
2021-03-11 21:38:06 +01:00 |
|
Richard Thier
|
33910b7e50
|
project init
|
2021-03-11 21:23:50 +01:00 |
|