I have a set U of elements (its size is initially unknown) and I would like to draw a random sample of n < |U| elements from it. Stream (reservoir) sampling works well for this.
The problem arises when I have split U into several subsets and drawn a random sample from each subset (each sample contains k <= n elements, but usually k = n). I also know how many elements each subset contains. I would like to know how to combine these samples (preferably two at a time) into a single sample of size n.
Put another way: given disjoint sets A and B, with random samples a and b drawn from them, I would like to build c ⊆ a ∪ b such that c is a random sample of A ∪ B, and I can specify the size of c (usually |c| will be about the same as |a|).
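Consider a uniform sample of U. If U is split into subsets S_i, each element of that sample falls in S_i with probability proportional to the size of S_i. For example, if S_1 holds 20% of U, each sampled element has a 20% chance of coming from S_1. Because each per-subset sample has k <= n elements (usually k = n), it will generally contain enough elements to supply its share.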
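So, to merge the samples for A and B, choose each element of c as follows: with probability |A|/|A ∪ B| take it from a; with probability |B|/|A ∪ B| = 1 - (|A|/|A ∪ B|) take it from b. (In expectation, this takes n * (|A|/|A ∪ B|) elements from a, and correspondingly for b.)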
If |A| == |B| and |a| == |b|, this is simpler: each element is equally likely to come from either sample, so you can draw c from a ∪ b.
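The merge described above can be sketched in Python roughly as follows. This is an illustration rather than a definitive implementation: the name `merge_samples` and its parameters are made up for this example, and A and B are assumed to be disjoint. It also uses a slight variant of the fixed-probability rule: the remaining population counts are decremented after every draw, so the number of elements taken from a follows a hypergeometric distribution (what a uniform sample of A ∪ B without replacement would give) rather than a binomial one.

```python
import random

def merge_samples(a, size_a, b, size_b, n, rng=random):
    """Merge uniform samples a (from A) and b (from B) into one sample of size n.

    size_a and size_b are |A| and |B|. A and B are assumed to be disjoint.
    """
    if n > size_a + size_b:
        raise ValueError("n exceeds |A| + |B|")
    # Shuffle copies so that popping from the end yields a uniformly random element.
    a, b = list(a), list(b)
    rng.shuffle(a)
    rng.shuffle(b)
    rem_a, rem_b = size_a, size_b  # population elements not yet accounted for
    c = []
    for _ in range(n):
        # Draw from a's side with probability rem_a / (rem_a + rem_b).
        if rng.random() * (rem_a + rem_b) < rem_a:
            if not a:
                raise ValueError("sample a is too small for this draw")
            c.append(a.pop())
            rem_a -= 1
        else:
            if not b:
                raise ValueError("sample b is too small for this draw")
            c.append(b.pop())
            rem_b -= 1
    return c
```

With k = n for both samples, neither sample can run out before c is full. With smaller k the function raises an error instead of silently biasing the result. Because the output is again a uniform sample together with a known population size (|A| + |B|), samples can be merged two at a time.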