magyarsort/neargoodsort_and_merge_ideas.md

# Sorting for "nearly sorted data"

## Algorithm:

* Go over the data and like in Stalin-sort keep only those who are in order
* BUT: Unlike stalin-sort we partition!
* in[], outs[], outns[]
* The in is the input array
* The outs is the "sorted part" of the separation (Stalin would keep them)
* The outns is the "outliers" part of the separation (Stalin would kill them)
* Use the same algorithm to recursively sort the outns part
* Use the merge sort's merge algoritm to merge outs[] and outns[] back into in[]

## This works because we know for sure that outs has at least a single element!
## When it only has one element we get worst case O(n^2) runtime!
## When the data is nearly sorted, we get nearly O(n) runtime!
## Can be used to "keep an array/list sorted" with an "update" method on it that iterates over and update pos/key similar to kismap.
## Idea: decide if we go from top or bottom based on which is smaller - hopefully mitigates worst case being descending case!

## Example

    ------------------------- Split 0
    in0:
    3 7 5 8 9 5 8 9 5 9 9 3 1
    
    outs0:
    3 7 8 9 9 9 9
    outns0:
    5 5 8 5 3 1
    ------------------------- Split 1
    in1:
    5 5 8 5 3 1
    
    outs1:
    5 5 8
    outns1:
    5 3 1
    ------------------------- Split 2
    in2:
    5 3 1
    
    outs2:
    5
    outns2:
    3 1
    ------------------------- Split 3
    in3:
    3 1
    
    outs2:
    3
    outns2:
    1
    ------------------------- Merge 3
    outs2:
    3
    outns2:
    1

    in3 [merge-out]:
    1 3
    ------------------------- Merge 2
    outs2:
    5
    outns2:
    1 3      == in3

    in2 [merge-out]:
    1 3 5
    ------------------------- Merge 1
    outs1:
    5 5 8
    outns1:
    1 3 5    == in2

    in1 [merge-out]:
    1 3 5 5 5 8
    ------------------------- Merge 0
    outs0:
    3 7 8 9 9 9 9
    outns0:
    1 3 5 5 5 8  == in1

    in0:
    1 3 3 5 5 5 7 8 8 9 9 9 9

Which is - as you can see the sort result of the input array!

    3 7 5 8 9 5 8 9 5 9 9 3 1
    
## Time and space analysis

On random data this sounds to be close to the O(n*logn) amortized runtime statistically I think but did not go after it.

On the worst case its clearly O(n^2) because we always just get a single element to outsi means that...

Space analysis is roughly same as the non-optimized merge sort - see below for space optimized merge steps - maybe useful for this to!

## Remark: Yes there is a variant in which we only need outnsi and no outsi vectors!

For split:

* Have i,j indices on the input
* i reads and j writes - except when i == j no self-overwrite happens.
* read from [i]
* put (pushback) either to outnsi or [j]
* this means j <= i
* j only grows when was put there
* if i is after length of input, j tells bounds of outsi in input arr

Recursion just done as-is above...

For merge:

* See that input has junk data at its end right as many as there is in the corresponding outnsi count
* so we merge from right-to-left. 
* The (j - 1) tells where to read from one merge source
* The end of the vector (or popback / peek) tells where to read from other merge source
* The (i - 1) is destination where to write the smaller among the two
* The corresponding merge-source-indices step only when they are moved to destination and i always moves.

This ends with the input array being sorted.

Rem.: Maybe it is worth it to set the outnsi vectors to be of length |input| / 2 or maybe even |input| to avoid vector growth at cost of more memory usage...

Rem.: This is based on the merge-sort last idea below - which as I said in git commit I think is well known, but just came up with.

# A random bad inplace-merge idea

## Example

Lets say we have this two lists

    1 3 3 5 7 9
    2 3 4 5 6 7

But represented in the same array, partitioned into two parts:

    1 3 3 5 7 9|2 3 4 5 6 7

We can go with two pointers and try to make this work with SWAPs:

    1 3 3 5 7 9|2 3 4 5 6 7
    ^           ^         ~
(noswap)
    1 3 3 5 7 9|2 3 4 5 6 7
      ^         ^
(swap*)
    1 2 3 5 7 9|3 3 4 5 6 7
        ^       ^ 
(noswap)
    1 2 3 5 7 9|3 3 4 5 6 7
          ^     ^
(swap*)
    1 2 3 3 7 9|3 4 5 5 6 7
            ^   ^
(swap*)
    1 2 3 3 3 9|4 5 5 6 7 7
              ^ ^
(swap*)
    1 2 3 3 3 4|5 5 6 7 7 9
              ^ ^
## Where: swap* means swap element on left with right, but on the right list put it in its right place (binary search + memcpy)
## Maybe: The second part should be heapified! Then we can get log(n) pop&insert, but issue is then it does not stay sorted :-(
## Runtime: O(n^2) worst case which is extreme slow...
## Rem.: Likely swap + bubble is better here for the second side...

# Better, but still slow random inplace merge idea
    1 3 3 5 7 9|2 3 4 5 6 7
    ^           ^
        (<=)
    1 3 3 5 7 9|2 3 4 5 6 7
      ^           ^
        (<=)
    1 3 3 5 7 9|2 3 4 5 6 7
        ^           ^
        (<=)
    1 3 3 5 7 9|2 3 4 5 6 7
          ^           ^
        (<=)
    1 3 3 5 7 9|2 3 4 5 6 7
            ^           ^
        (>)
    1 3 3 5 6 9|2 3 4 5 7 7
              ^           ^
        (>)
    1 3 3 5 6 7|2 3 4 5 7 9
                ^           ^
        (!!)
    1 3 3 5 6 7|2 3 4 5 7 9
    ^ !         ^
        (logsearch: ~)
    1 3 3 5 6 7|2 3 4 5 7 9
    ^ ^         ^ ~
        (tmpvec)
    1 3 3 5 6 7|. . 4 5 7 9
    ^ ^         ^ ~
        tmp: 2 3
        (memcpy)
    1 . . 3 3 5|6 7 4 5 7 9
    ^ ^         ^ ~
        tmp: 2 3
        (backwrite)
    1 2 3 3 3 5 6 7|4 5 7 9
          ^ ^       ^
        tmp: nil
        (not(3 <= 4 < 3))
    1 2 3 3 3 5 6 7|4 5 7 9
            ^ ^     ^
        tmp: nil
        (logsearch: ~)
        (not(3 <= 4 < 3))
    1 2 3 3 3 5 6 7|4 5 7 9
            ^ ^     ^ ~
        (tmpvec)
    1 2 3 3 3 5 6 7|. . 7 9
            ^ ^     ^ ~
        tmp: 4 5
        (memcpy)
    1 2 3 3 3 . . 5|6 7 7 9
            ^ ^     ^ ~
        tmp: 4 5
        (backwrite)
    1 2 3 3 3 4 5 5|6 7|7 9
              ^ ^     ^ ~
        tmp: nil

        (not(3 <= 4 < 3))
        (not(3 <= 4 < 3))
        (not(3 <= 4 < 3))
    [END]

## This sounds like O(n*logn) for the merge operation - which would make a merge sort slower than n*log*n still, but not so bad as above
## This is not totally in-place because can use worst case a lot of mem, but averagely less than regular merge
## But just using n/2 element tmp array for "regular" alg works if you think about it so not sure if beating that one...

# Doing n/2 element tmp array

From:
    arr:    1 3 3 5 7 9|2 3 4 5 6 7

To:
    arr:    . . . . . .|2 3 4 5 6 7
    tmp:    1 3 3 5 7 9

And then we just always  pick the smaller between the two piecewise:
            _
    arr:    . . . . . .|2 3 4 5 6 7
    tmp:    1 3 3 5 7 9 ^
            ^
              _
    arr:    1 . . . . .|2 3 4 5 6 7
    tmp:    . 3 3 5 7 9 ^
              ^
                _
    arr:    1 2 . . . .|. 3 4 5 6 7
    tmp:    . 3 3 5 7 9   ^
              ^
        (rem.: tmp is preferred to keep order of elements unchanged for same keys!)
                  _
    arr:    1 2 3 . . .|. 3 4 5 6 7
    tmp:    . . 3 5 7 9   ^
                ^
        (rem.: tmp is preferred to keep order of elements unchanged for same keys!)
                    _
    arr:    1 2 3 3 . .|. 3 4 5 6 7
    tmp:    . . . 5 7 9   ^
                  ^
                      _
    arr:    1 2 3 3 3 .|. . 4 5 6 7
    tmp:    . . . 5 7 9     ^
                  ^
                        _
    arr:    1 2 3 3 3 4|. . . 5 6 7
    tmp:    . . . 5 7 9       ^
                  ^
        (rem.: tmp is preferred to keep order of elements unchanged for same keys!)
                          _
    arr:    1 2 3 3 3 4|5 . . 5 6 7
    tmp:    . . . . 7 9       ^
                    ^
                            _
    arr:    1 2 3 3 3 4|5 5 . . 6 7
    tmp:    . . . . 7 9         ^
                    ^
                              _
    arr:    1 2 3 3 3 4|5 5 6 . . 7
    tmp:    . . . . 7 9           ^
                    ^
        (rem.: tmp is preferred to keep order of elements unchanged for same keys!)
                                _
    arr:    1 2 3 3 3 4|5 5 6 7 . 7
    tmp:    . . . . . 9           ^
                      ^
                                  _
    arr:    1 2 3 3 3 4|5 5 6 7 7 .
    tmp:    . . . . . 9             ^
                      ^
                                    _
    arr:    1 2 3 3 3 4|5 5 6 7 7 9
    tmp:    . . . . . .             ^
                        ^

And this ends the merge algorithm!
added neargoodsort idea (and some merge space optimization ideas that I think are known) 2023-07-20 21:09:57 +02:00			`# Sorting for "nearly sorted data"`

			`## Algorithm:`

			`* Go over the data and like in Stalin-sort keep only those who are in order`
			`* BUT: Unlike stalin-sort we partition!`
			`* in[], outs[], outns[]`
			`* The in is the input array`
			`* The outs is the "sorted part" of the separation (Stalin would keep them)`
			`* The outns is the "outliers" part of the separation (Stalin would kill them)`
			`* Use the same algorithm to recursively sort the outns part`
			`* Use the merge sort's merge algoritm to merge outs[] and outns[] back into in[]`

			`## This works because we know for sure that outs has at least a single element!`
			`## When it only has one element we get worst case O(n^2) runtime!`
			`## When the data is nearly sorted, we get nearly O(n) runtime!`
			`## Can be used to "keep an array/list sorted" with an "update" method on it that iterates over and update pos/key similar to kismap.`
			`## Idea: decide if we go from top or bottom based on which is smaller - hopefully mitigates worst case being descending case!`

			`## Example`

			`------------------------- Split 0`
			`in0:`
			`3 7 5 8 9 5 8 9 5 9 9 3 1`

			`outs0:`
			`3 7 8 9 9 9 9`
			`outns0:`
			`5 5 8 5 3 1`
			`------------------------- Split 1`
			`in1:`
			`5 5 8 5 3 1`

			`outs1:`
			`5 5 8`
			`outns1:`
			`5 3 1`
			`------------------------- Split 2`
			`in2:`
			`5 3 1`

			`outs2:`
			`5`
			`outns2:`
			`3 1`
			`------------------------- Split 3`
			`in3:`
			`3 1`

			`outs2:`
			`3`
			`outns2:`
			`1`
			`------------------------- Merge 3`
			`outs2:`
			`3`
			`outns2:`
			`1`

			`in3 [merge-out]:`
			`1 3`
			`------------------------- Merge 2`
			`outs2:`
			`5`
			`outns2:`
			`1 3 == in3`

			`in2 [merge-out]:`
			`1 3 5`
			`------------------------- Merge 1`
			`outs1:`
			`5 5 8`
			`outns1:`
			`1 3 5 == in2`

			`in1 [merge-out]:`
			`1 3 5 5 5 8`
			`------------------------- Merge 0`
			`outs0:`
			`3 7 8 9 9 9 9`
			`outns0:`
			`1 3 5 5 5 8 == in1`

			`in0:`
			`1 3 3 5 5 5 7 8 8 9 9 9 9`

			`Which is - as you can see the sort result of the input array!`

			`3 7 5 8 9 5 8 9 5 9 9 3 1`

			`## Time and space analysis`

			`On random data this sounds to be close to the O(n*logn) amortized runtime statistically I think but did not go after it.`

			`On the worst case its clearly O(n^2) because we always just get a single element to outsi means that...`

			`Space analysis is roughly same as the non-optimized merge sort - see below for space optimized merge steps - maybe useful for this to!`

outliersort ideas 2023-07-20 23:28:52 +02:00			`## Remark: Yes there is a variant in which we only need outnsi and no outsi vectors!`

			`For split:`

			`* Have i,j indices on the input`
			`* i reads and j writes - except when i == j no self-overwrite happens.`
			`* read from [i]`
			`* put (pushback) either to outnsi or [j]`
			`* this means j <= i`
			`* j only grows when was put there`
			`* if i is after length of input, j tells bounds of outsi in input arr`

			`Recursion just done as-is above...`

			`For merge:`

			`* See that input has junk data at its end right as many as there is in the corresponding outnsi count`
			`* so we merge from right-to-left.`
			`* The (j - 1) tells where to read from one merge source`
			`* The end of the vector (or popback / peek) tells where to read from other merge source`
			`* The (i - 1) is destination where to write the smaller among the two`
			`* The corresponding merge-source-indices step only when they are moved to destination and i always moves.`

			`This ends with the input array being sorted.`

			`Rem.: Maybe it is worth it to set the outnsi vectors to be of length \|input\| / 2 or maybe even \|input\| to avoid vector growth at cost of more memory usage...`

			`Rem.: This is based on the merge-sort last idea below - which as I said in git commit I think is well known, but just came up with.`
added neargoodsort idea (and some merge space optimization ideas that I think are known) 2023-07-20 21:09:57 +02:00
			`# A random bad inplace-merge idea`

			`## Example`

			`Lets say we have this two lists`

			`1 3 3 5 7 9`
			`2 3 4 5 6 7`

			`But represented in the same array, partitioned into two parts:`

			`1 3 3 5 7 9\|2 3 4 5 6 7`

			`We can go with two pointers and try to make this work with SWAPs:`

			`1 3 3 5 7 9\|2 3 4 5 6 7`
			`^ ^ ~`
			`(noswap)`
			`1 3 3 5 7 9\|2 3 4 5 6 7`
			`^ ^`
			`(swap*)`
			`1 2 3 5 7 9\|3 3 4 5 6 7`
			`^ ^`
			`(noswap)`
			`1 2 3 5 7 9\|3 3 4 5 6 7`
			`^ ^`
			`(swap*)`
			`1 2 3 3 7 9\|3 4 5 5 6 7`
			`^ ^`
			`(swap*)`
			`1 2 3 3 3 9\|4 5 5 6 7 7`
			`^ ^`
			`(swap*)`
			`1 2 3 3 3 4\|5 5 6 7 7 9`
			`^ ^`
			`## Where: swap* means swap element on left with right, but on the right list put it in its right place (binary search + memcpy)`
			`## Maybe: The second part should be heapified! Then we can get log(n) pop&insert, but issue is then it does not stay sorted :-(`
			`## Runtime: O(n^2) worst case which is extreme slow...`
			`## Rem.: Likely swap + bubble is better here for the second side...`

			`# Better, but still slow random inplace merge idea`
			`1 3 3 5 7 9\|2 3 4 5 6 7`
			`^ ^`
			`(<=)`
			`1 3 3 5 7 9\|2 3 4 5 6 7`
			`^ ^`
			`(<=)`
			`1 3 3 5 7 9\|2 3 4 5 6 7`
			`^ ^`
			`(<=)`
			`1 3 3 5 7 9\|2 3 4 5 6 7`
			`^ ^`
			`(<=)`
			`1 3 3 5 7 9\|2 3 4 5 6 7`
			`^ ^`
			`(>)`
			`1 3 3 5 6 9\|2 3 4 5 7 7`
			`^ ^`
			`(>)`
			`1 3 3 5 6 7\|2 3 4 5 7 9`
			`^ ^`
			`(!!)`
			`1 3 3 5 6 7\|2 3 4 5 7 9`
			`^ ! ^`
			`(logsearch: ~)`
			`1 3 3 5 6 7\|2 3 4 5 7 9`
			`^ ^ ^ ~`
			`(tmpvec)`
			`1 3 3 5 6 7\|. . 4 5 7 9`
			`^ ^ ^ ~`
			`tmp: 2 3`
			`(memcpy)`
			`1 . . 3 3 5\|6 7 4 5 7 9`
			`^ ^ ^ ~`
			`tmp: 2 3`
			`(backwrite)`
			`1 2 3 3 3 5 6 7\|4 5 7 9`
			`^ ^ ^`
			`tmp: nil`
			`(not(3 <= 4 < 3))`
			`1 2 3 3 3 5 6 7\|4 5 7 9`
			`^ ^ ^`
			`tmp: nil`
			`(logsearch: ~)`
			`(not(3 <= 4 < 3))`
			`1 2 3 3 3 5 6 7\|4 5 7 9`
			`^ ^ ^ ~`
			`(tmpvec)`
			`1 2 3 3 3 5 6 7\|. . 7 9`
			`^ ^ ^ ~`
			`tmp: 4 5`
			`(memcpy)`
			`1 2 3 3 3 . . 5\|6 7 7 9`
			`^ ^ ^ ~`
			`tmp: 4 5`
			`(backwrite)`
			`1 2 3 3 3 4 5 5\|6 7\|7 9`
			`^ ^ ^ ~`
			`tmp: nil`

			`(not(3 <= 4 < 3))`
			`(not(3 <= 4 < 3))`
			`(not(3 <= 4 < 3))`
			`[END]`

			`## This sounds like O(nlogn) for the merge operation - which would make a merge sort slower than nlog*n still, but not so bad as above`
			`## This is not totally in-place because can use worst case a lot of mem, but averagely less than regular merge`
			`## But just using n/2 element tmp array for "regular" alg works if you think about it so not sure if beating that one...`

			`# Doing n/2 element tmp array`

			`From:`
			`arr: 1 3 3 5 7 9\|2 3 4 5 6 7`

			`To:`
			`arr: . . . . . .\|2 3 4 5 6 7`
			`tmp: 1 3 3 5 7 9`

			`And then we just always pick the smaller between the two piecewise:`
			`_`
			`arr: . . . . . .\|2 3 4 5 6 7`
			`tmp: 1 3 3 5 7 9 ^`
			`^`
			`_`
			`arr: 1 . . . . .\|2 3 4 5 6 7`
			`tmp: . 3 3 5 7 9 ^`
			`^`
			`_`
			`arr: 1 2 . . . .\|. 3 4 5 6 7`
			`tmp: . 3 3 5 7 9 ^`
			`^`
			`(rem.: tmp is preferred to keep order of elements unchanged for same keys!)`
			`_`
			`arr: 1 2 3 . . .\|. 3 4 5 6 7`
			`tmp: . . 3 5 7 9 ^`
			`^`
			`(rem.: tmp is preferred to keep order of elements unchanged for same keys!)`
			`_`
			`arr: 1 2 3 3 . .\|. 3 4 5 6 7`
			`tmp: . . . 5 7 9 ^`
			`^`
			`_`
			`arr: 1 2 3 3 3 .\|. . 4 5 6 7`
			`tmp: . . . 5 7 9 ^`
			`^`
			`_`
			`arr: 1 2 3 3 3 4\|. . . 5 6 7`
			`tmp: . . . 5 7 9 ^`
			`^`
			`(rem.: tmp is preferred to keep order of elements unchanged for same keys!)`
			`_`
			`arr: 1 2 3 3 3 4\|5 . . 5 6 7`
			`tmp: . . . . 7 9 ^`
			`^`
			`_`
			`arr: 1 2 3 3 3 4\|5 5 . . 6 7`
			`tmp: . . . . 7 9 ^`
			`^`
			`_`
			`arr: 1 2 3 3 3 4\|5 5 6 . . 7`
			`tmp: . . . . 7 9 ^`
			`^`
			`(rem.: tmp is preferred to keep order of elements unchanged for same keys!)`
			`_`
			`arr: 1 2 3 3 3 4\|5 5 6 7 . 7`
			`tmp: . . . . . 9 ^`
			`^`
			`_`
			`arr: 1 2 3 3 3 4\|5 5 6 7 7 .`
			`tmp: . . . . . 9 ^`
			`^`
			`_`
			`arr: 1 2 3 3 3 4\|5 5 6 7 7 9`
			`tmp: . . . . . . ^`
			`^`

			`And this ends the merge algorithm!`