23 Commits

Author SHA1 Message Date
Richard Thier
298edba5d2 minor unroll 2021-12-18 01:48:42 +01:00
Richard Thier
e7b677e4db basic prefetch optimizations 2021-12-18 01:23:06 +01:00
Richard Thier
e5d4ff74ad more manual unrolls 2021-12-17 23:37:48 +01:00
Richard Thier
645bc19f19 Manual occurrence unrolling 2021-12-17 22:48:38 +01:00
Richard Thier
be450086b5 took out prefetch and added commented-out pragmas - they do not help 2021-12-17 22:09:35 +01:00
Richard Thier
3fdcaad537 trying some prefetch - not that good yet 2021-12-17 21:42:35 +01:00
Richard Thier
0b4eb5e5a6 minor speed tweaks by being able to define the counter type 2021-12-17 21:17:53 +01:00
Richard Thier
a044787846 finally a real optimization again, plus an API for reuse - even faster for the non-reused case 2021-12-15 03:14:35 +01:00
Richard Thier
3490201420 further optimization - the API change, however, is not a no-cost abstraction: it makes clang slower than the original heap variant, and g++, albeit faster than the original, is not as fast as the hardcoded version - will investigate the API change 2021-12-15 00:43:25 +01:00
Richard Thier
c4ed2994ea Setting up causal profiling with "coz" 2021-12-14 17:29:33 +01:00
Richard Thier
11ceee29a1 minor tweaking for more ILP 2021-12-13 03:48:17 +01:00
Richard Thier
62dcda6bf2 minor tweaks 2021-12-13 02:18:08 +01:00
Richard Thier
68684f7fb0 Implemented ILP and cache optimized simple radix variant - surprisingly good already! 2021-03-13 15:51:24 +01:00
Richard Thier
1d0ba81e49 tried whether it works with nibbles too: it actually seems easier to debug in this mode 2021-03-11 23:20:03 +01:00
Richard Thier
151b8f398b Likely better ILP and no manual digit counts in code 2021-03-11 23:13:53 +01:00
Richard Thier
22e80d4cd5 indent 2021-03-11 22:40:37 +01:00
Richard Thier
f30b5056cc noexcept 2021-03-11 22:39:53 +01:00
Richard Thier
83ae455c34 rename 2021-03-11 22:38:23 +01:00
Richard Thier
cfc9a050e4 removed manual digit usage by recursive template trickz 2021-03-11 22:34:44 +01:00
Richard Thier
c7fe2f0507 little refactor to maybe avoid manual digit misery 2021-03-11 22:19:29 +01:00
Richard Thier
e076ab662b prefix sum 2021-03-11 22:06:50 +01:00
Richard Thier
f8d4f597c6 fix dumb mistakes 2021-03-11 21:38:06 +01:00
Richard Thier
33910b7e50 project init 2021-03-11 21:23:50 +01:00