Sorted data structure for iteration in order, ordered push and delete (N items only on top)

What is considered the optimal data structure for pushing something in order (so it inserts into any position that can find the right position), the iteration is in order and choosing N elements from above (so that N is the smallest elements, N is determined by comparison with a threshold value) ? The jerk and pop should be especially fast (start each iteration of the loop), while the complete iteration of the data in order occurs at a variable speed, but probably an order of magnitude less. Data cannot be cleared by a complete iteration, it should not be changed. Everything that is clicked will eventually pop up, but since pop can remove multiple elements, there may be more shocks than pop. The data scale in the structure at any time can increase to hundreds or thousands of thousands of elements.

I am currently using std::dequebinary search to insert elements in ascending order. Profiling shows that it takes up most of the time, so something has changed. std::priority_queuedoesn’t allow iteration, and the hacks I saw will not be put in order. Even with a limited test (without full iteration!) The class std::setperformed worse than my approach std::deque.

None of the classes I fiddled with seem to be built with this use case in mind. I would not mind making my own class if the data structure is not found in the STL or raised for some reason.

edit:

There are currently two main functions, pushand prune. pushuses 65% of the time, pruneuses 32%. Most of the time used in push, is connected with an insert in the deque(64% from 65%). Only 1% is obtained from binary search to find a position.

template<typename T, size_t Axes>
void Splitter<T, Axes>::SortedData::push(const Data& data) //65% of processing
{
 size_t index = find(data.values[(axis * 2) + 1]);

 this->data.insert(this->data.begin() + index, data); //64% of all processing happens here
}

template<typename T, size_t Axes>
void Splitter<T, Axes>::SortedData::prune(T value) //32% of processing
{
 auto top = data.begin(), end = data.end(), it = top;

 for (; it != end; ++it)
 {
  Data& data = *it;

  if (data.values[(axis * 2) + 1] > value) break;
 }

 data.erase(top, it);
}

template<typename T, size_t Axes>
size_t Splitter<T, Axes>::SortedData::find(T value)
{
 size_t start = 0;
 size_t end = this->data.size();

 if (!end) return 0;

 size_t diff;

 while (diff = (end - start) >> 1)
 {
  size_t mid = diff + start;

  if (this->data[mid].values[(axis * 2) + 1] <= value)
  {
   start = mid;
  }
  else
  {
   end = mid;
  }
 }

 return this->data[start].values[(axis * 2) + 1] <= value ? end : start;
}
+5
source share
5 answers

According to your requirements, a hybrid data structure tailored to your needs is likely to work best. As others have said, continuous memory is very important, but I would not recommend keeping the array sorted at any time. I suggest you use 3 buffers (1 std::arrayand 2 std::vectors):

  • 1 ( ) " ". .
  • 2 ( ) (A + B) .

, std::push_heap. , . , std::sort std::merge (A) (B), . , , .. A B . , . , ( , pop_back pop_front).

, .

+2
0

, . std:: map.

0

, o (log n), , priority_queue ( deque). STL, , .

, , STL . , , , , , , , , - .

0

, - ? , ; , , <() . ( <())

EDIT: Your own numbers show that inserting does not take time, but "sorting". Although some of these insertion times create a copy of your object, some (possibly most) are creating an internal structure that will hold your object - and I don’t think it will change between list / map / set / queue, etc. . If you can predict the probable final / maximum size of your dataset and can write or find your own sorting algorithm, and time is wasted when distributing objects, then the vector can be a way.

0
source

All Articles