I am trying to write an optimal function that sets a list of elements, returns the original list with top_k elements at the beginning (they do not need to order themselves). I have no restrictions on the order of the remaining elements, but ideally I would like them to be in the original order.
I tried 3 approaches. Firstly, a trivial solution that runs in O (top_k * N) time. Secondly, using a bunch of priorities of the largest elements with O (log (top_k) * N) (the slowest) and, finally, brute force sorting the entire list of O (N * logN), which turned out to be the fastest)
def semi_sort_trivial(items, top_k=3):
for i in range(top_k):
maximum = items[i]
pos = i
for j in range(i+1, len(itemss)):
if maximum < events[j]:
pos = j
maximum = items[j]
items[pos], items[i] = items[i], items[pos]
return items
def semi_heap_sort(items, top_k=3):
lst = []
heap_store = items[:top_k]
for item in items[top_k:]:
lst.append(heapq.heappushpop(heap_store, item))
return heap_store + lst
def semi_sort_usingsort(items, top_k=3):
lst = sorted(items)[-top_k:]
return lst + [item for item in items if item not in lst]
In [7]: %timeit semi_heap_sort(range(20))
10000 loops, best of 3: 26.3 us per loop
In [8]: %timeit semi_sort_trivial(range(20))
100000 loops, best of 3: 11 us per loop
In [9]: %timeit semi_sort_usingsort(range(20))
100000 loops, best of 3: 5.89 us per loop
, . , . . , . ?
, . N = 20 k = 3 N log N 20 * 5 , , N- top_k 20 * 2. , semi_sort_usingsort. , ?
.