Finding a good match between two lists of numbers

I got two sets of numbers, with SET2 having more elements in it. This ensured that the quantity of SET2 is equal to or greater than the quantity of SET1 . In fact, since order matters, the input is lists rather than sets.

My goal is to combine (summarize) / reorder the numbers from SET2 to make it as similar as possible to SET1 . I define similarity as the sum of the variances in each position. See this post for how I calculate the similarities. The smaller the amount, the better.

My first approach was to try all the combinations and choose the best one. This only works for fairly small sets (especially the second). See this post and answer from Rawling. Is there a smarter way to get a good combination? I definitely don't need the best. The result would be nice. Obviously, sets with empty subsets are meaningless. The extremely unbalanced sets do not seem to be very promising for me. SET1 tends to have about 8, but can have up to 18 entries. SET2 often has a score of over 10 (up to 35). The sum of the numbers in the two sets is (except for rounding errors).

Here is an example with good and bad results (not all possible):

SET1 = { 272370, 194560, 233430 }; SET2 = { 53407.13, 100000, 365634.03, 181319.07 }

      272370            |      194560          |        233430 
---------------------------------------------------------------------
     365634.03         |  100000 + 53407.13   |      181319.07       (best match)
     365634.03         |     181319.07        |  100000 + 53407.13   (good)
     365634.03         |      100000          |181319.07 + 53407.13  (ok)
      53407.13          |365634.03 + 100000    |      181319.07       (bad)
      53407.13          |365634.03 + 181319.07 |        100000        (bad)
.                 |365634.03 + 181319.07 | 53407.13 + 100000    (invalid)
53407.13 + 100000 |365634.03 + 181319.07 |                      (invalid)

, , , . .

!

+5
1

, :

1. list<int> set1, set2;
2. sort(set2) // decreasing, set2[0] would be the greatest value in set2
3. struct set1item = {set1index, value, list<int> chosen}
4. prepare list<set1item> set1items from set1 //(index = index in set1 list, value = set1[index] and chosen = null)
5. put set1items to some priorityqueue pq // ordered by value
6. for each set2item in set2{
7.     item = pq.first()
8.     item.chosen.add(set2item);
9.     item.value -= set2item;
10.    pq.updateFirst(item)
11.}

: set2 , set1, , set2, set2 set1.

, set1 .

1: Set1 = {20, 9, 7, 3}, Set2 = {7, 6, 6, 4, 2, 2, 2, 2, 2, 2, 2, 2}.

iter1: fromSet2 = 7, Set1 = {20:{}, 9:{}, 7:{}, 3:{}}, fromSet1=20. 20 7 7 . : Set1 = {13:{7}, 9:{}, 7:{}, 3:{}}.

iter2: fromSet2 = 6, Set1 = {13:{7}, 9:{}, 7:{}, 3:{}}, fromSet1=13. 13 6 6 . : Set1 = {7:{7, 6}, 9:{}, 7:{}, 3:{}}.

iter3: fromSet2 = 6, Set1 = {7:{7, 6}, 9:{}, 7:{}, 3:{}}, fromSet1=9. 9 6 6 . : Set1 = {7:{7, 6}, 3:{6}, 7:{}, 3:{}}.

iter4: fromSet2 = 4, Set1 = {7:{7, 6}, 3:{6}, 7:{}, 3:{}}, fromSet1=7. 7 4 4 . : Set1 = {3:{7, 6, 4}, 3:{6}, 7:{}, 3:{}}.

iter5: fromSet2 = 2, Set1 = {3:{7, 6, 4}, 3:{6}, 7:{}, 3:{}}, fromSet1=7. 7 2 2 . : Set1 = {3:{7, 6, 4}, 3:{6}, 5:{2}, 3:{}}.

...

+1

All Articles