Efficiently accessing a B+-tree containing multidimensional data

I have a set of tuples (x, y) of 64-bit integers that make up my dataset. I have, say, trillions of these tuples; the dataset cannot be held in memory on any machine on earth. Storing it on disk, however, is quite reasonable.

I have an on-disk store (a B+-tree) that lets me retrieve data quickly and concurrently in one dimension. However, some of my queries depend on both dimensions.

Query examples:

  • Find a tuple whose x is greater than or equal to some given value
  • Find the tuple with the smallest possible x such that its y is greater than or equal to some given value
  • Find the tuple with the smallest possible x such that its y is less than or equal to some given value
  • Perform maintenance operations (insert some tuples, delete some tuples)
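To pin down what the three lookups mean, here is a small in-memory reference written in Python. The function names and the sample points are made up for illustration. The first query uses binary search on x-sorted data, which is what a one-dimensional B+-tree already provides; the other two fall back to a scan in x order, which is exactly what is ruled out at full scale and is shown only as a specification to test faster approaches against.

```python
import bisect

def first_with_x_at_least(sorted_points, x_min):
    """Query 1: binary search, the in-memory analogue of a B+-tree seek on x."""
    # (x_min,) sorts before every (x_min, y), so this lands on the first x >= x_min.
    i = bisect.bisect_left(sorted_points, (x_min,))
    return sorted_points[i] if i < len(sorted_points) else None

def min_x_with_y_at_least(sorted_points, y_min):
    """Query 2, reference only: linear scan in increasing x."""
    return next((p for p in sorted_points if p[1] >= y_min), None)

def min_x_with_y_at_most(sorted_points, y_max):
    """Query 3, reference only: linear scan in increasing x."""
    return next((p for p in sorted_points if p[1] <= y_max), None)

# Hypothetical sample data, sorted by (x, y).
points = sorted([(5, 1), (3, 2), (7, 9), (4, 8), (10, 20)])
print(first_with_x_at_least(points, 6))
print(min_x_with_y_at_least(points, 8))
print(min_x_with_y_at_most(points, 1))
```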

The best I have found are Z-order curves, but I can't figure out how to execute queries given my two-dimensional dataset.
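For context, a Z-order key interleaves the bits of x and y so that a pair becomes a single value an ordinary one-dimensional B+-tree can sort. The sketch below assumes unsigned 64-bit coordinates (signed values would first need an offset) and assumes the store compares keys as raw bytes; `z_key`, `z_key_bytes` and `z_decode` are names chosen here for illustration, not part of any library.

```python
BITS = 64  # bits per coordinate, as in the question

def z_key(x, y, bits=BITS):
    """Interleave x and y; the x bit is the more significant of each pair."""
    key = 0
    for i in range(bits):
        key |= ((x >> i) & 1) << (2 * i + 1)
        key |= ((y >> i) & 1) << (2 * i)
    return key

def z_decode(key, bits=BITS):
    """Inverse of z_key: split the interleaved bits back into (x, y)."""
    x = y = 0
    for i in range(bits):
        x |= ((key >> (2 * i + 1)) & 1) << i
        y |= ((key >> (2 * i)) & 1) << i
    return x, y

def z_key_bytes(x, y):
    """16-byte big-endian key: byte-wise order equals numeric Z-order."""
    return z_key(x, y).to_bytes(16, "big")

k = z_key(2**64 - 1, 0)
print(hex(k))          # alternating 1/0 bits
print(z_decode(k))
```

Every prefix of such a key corresponds to an axis-aligned rectangle, and all keys sharing that prefix form one contiguous key range, which is what makes range lookups on the B+-tree usable for two-dimensional searches.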

Sequentially scanning the data is not an acceptable solution; it would be too slow.

+5
4 answers

I think the most suitable data structures for your requirements are the R-tree and its variants (R*-tree, R+-tree, Hilbert R-tree). The R-tree is similar to the B+-tree, but also allows multi-dimensional queries.
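As a rough illustration of how a bounding-rectangle hierarchy can answer "smallest x such that y is at least some value", here is a simplified stand-in written in Python. It is not a real R-tree: it is bulk-loaded once by sorting on x, has no insertion or node-splitting logic, and lives in memory. The names `Node`, `build` and `min_x_with_y_at_least` and the sample points are assumptions made for this sketch. The search is best-first: nodes are visited in order of the smallest x their rectangle can contain, and any node whose rectangle lies entirely below the y threshold is skipped.

```python
import heapq
from dataclasses import dataclass

@dataclass
class Node:
    mbr: tuple      # bounding rectangle (x_min, y_min, x_max, y_max)
    children: list  # child Nodes, or (x, y) points when is_leaf is True
    is_leaf: bool

def _mbr_of_points(pts):
    xs, ys = [p[0] for p in pts], [p[1] for p in pts]
    return (min(xs), min(ys), max(xs), max(ys))

def _mbr_of_nodes(nodes):
    return (min(n.mbr[0] for n in nodes), min(n.mbr[1] for n in nodes),
            max(n.mbr[2] for n in nodes), max(n.mbr[3] for n in nodes))

def _chunks(items, size):
    return [items[i:i + size] for i in range(0, len(items), size)]

def build(points, fanout=4):
    """Bulk-load: pack x-sorted points into leaves, then pack nodes upward."""
    level = [Node(_mbr_of_points(c), c, True) for c in _chunks(sorted(points), fanout)]
    while len(level) > 1:
        level = [Node(_mbr_of_nodes(c), c, False) for c in _chunks(level, fanout)]
    return level[0] if level else None

def min_x_with_y_at_least(root, y_min):
    """Best-first search ordered by the smallest x each entry can contain."""
    if root is None:
        return None
    counter = 0  # unique tie-breaker so the heap never compares Nodes
    heap = [(root.mbr[0], counter, root)]
    while heap:
        _, _, item = heapq.heappop(heap)
        if isinstance(item, tuple):
            return item  # a point: nothing left in the heap has a smaller x
        if item.mbr[3] < y_min:
            continue     # the whole rectangle lies below the y threshold
        for child in item.children:
            counter += 1
            if item.is_leaf:
                if child[1] >= y_min:
                    heapq.heappush(heap, (child[0], counter, child))
            else:
                heapq.heappush(heap, (child.mbr[0], counter, child))
    return None

root = build([(5, 1), (3, 2), (7, 9), (4, 8), (10, 20)], fanout=2)
print(min_x_with_y_at_least(root, 9))
```

The "y less than or equal" variant is symmetric: prune on the rectangle's `y_min` instead of its `y_max`.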

- . , 1.. 3, , . . : " " ( 18.5).

+2

, , z-? Wikipedia , .

Z- , . :

Start with the largest rectangle that might contain your point.

    Recursively:

        Create an empty result set of rectangles

        For each rectangle in your current set
            If the rectangle is a single point, you are done; it is the point you are looking for.
            Otherwise, divide the rectangle in two (specify one additional bit of the z-curve)
                If both halves contain a point
                    If one half is better
                        Add that half to your result set of rectangles
                    Otherwise
                        Add both halves to your result set of rectangles
                Otherwise, only one half contains a point
                    Add that half to your result set of rectangles

        Search your result set of rectangles
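The steps above can be sketched in Python for the query "smallest x such that y is at least some value". Several things here are assumptions of the sketch rather than part of the answer: a sorted in-memory list of Z-order keys stands in for the B+-tree, `bisect` stands in for a range seek, coordinates are 16 bits wide to keep the demo small, and the "is one half better" test is replaced by a priority queue ordered by the smallest x each rectangle can contain. All function names are invented for illustration.

```python
import bisect
import heapq

BITS = 16  # bits per coordinate in this demo; the question uses 64

def interleave(x, y, bits=BITS):
    """Z-order key with the x bit as the more significant bit of each pair."""
    key = 0
    for i in range(bits):
        key |= ((x >> i) & 1) << (2 * i + 1)
        key |= ((y >> i) & 1) << (2 * i)
    return key

def deinterleave(key, bits=BITS):
    x = y = 0
    for i in range(bits):
        x |= ((key >> (2 * i + 1)) & 1) << i
        y |= ((key >> (2 * i)) & 1) << i
    return x, y

def has_key_in_range(keys, lo, hi):
    """Stand-in for a B+-tree seek: is any stored key inside [lo, hi]?"""
    i = bisect.bisect_left(keys, lo)
    return i < len(keys) and keys[i] <= hi

def min_x_with_y_at_least(keys, y_min, bits=BITS):
    total = 2 * bits
    heap = [(0, 0, 0)]  # (smallest x in rectangle, key prefix, prefix length)
    while heap:
        _, prefix, plen = heapq.heappop(heap)
        if plen == total:
            return deinterleave(prefix, bits)  # single point with minimal x
        for bit in (0, 1):                     # add one more bit of the z-curve
            child = (prefix << 1) | bit
            shift = total - (plen + 1)
            lo = child << shift                # lower-left corner of the half
            hi = lo | ((1 << shift) - 1)       # upper-right corner of the half
            x_lo, _ = deinterleave(lo, bits)
            _, y_hi = deinterleave(hi, bits)
            if y_hi < y_min:
                continue                       # half lies entirely below y_min
            if not has_key_in_range(keys, lo, hi):
                continue                       # half contains no stored point
            heapq.heappush(heap, (x_lo, child, plen + 1))
    return None

points = [(5, 1), (3, 2), (7, 9), (4, 8), (10, 20)]
keys = sorted(interleave(x, y) for x, y in points)
print(min_x_with_y_at_least(keys, 8))
```

Each half costs one range probe against the index, so the amount of work depends on how many rectangles have to be kept alive before the first single point is popped.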

, - . , z-order.

0

, "" B + ( d +, d - ) . , .

:

B + B + tree. , , B +, . B + x.

, . ( ), . "" , B +.

// 2 :

log_b(card(x)) + log_b(card(y))

where b is the branching factor of the B+-tree and card(x) is the cardinality of x.
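To get a feel for the size of this expression, the short Python calculation below plugs in hypothetical numbers: a branching factor of 256 and one million distinct values in each dimension. Reading the result as a count of tree levels descended is an interpretation made here, not something the formula states on its own.

```python
import math

def lookup_cost(b, card_x, card_y):
    """Evaluate log_b(card(x)) + log_b(card(y))."""
    return math.log(card_x, b) + math.log(card_y, b)

# Hypothetical figures, chosen only for illustration.
cost = lookup_cost(b=256, card_x=10**6, card_y=10**6)
print(f"{cost:.2f}")
```

Because both terms are logarithmic, squaring either cardinality only doubles that term.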

, . , .

0

http://fallabs.com/tokyocabinet/

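Tokyo Cabinet is a library of routines for managing a database. The database is a simple data file containing records, each of which is a pair of a key and a value. Every key and value is a variable-length sequence of bytes. Both binary data and character strings can be used as keys and values. There is no concept of data tables or data types. Records are organized in a hash table, a B+-tree, or a fixed-length array.

Tokyo Cabinet is written in C and provides APIs for C, Perl, Ruby, Java, and Lua. It is available on platforms whose API conforms to C99 and POSIX. Tokyo Cabinet is free software licensed under the GNU Lesser General Public License.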

u?

0
