UNIS Template
*
COMMONWEALTH OF AUSTRALIA
Copyright Regulations 1969
WARNING
This material has been reproduced and communicated to you by or on behalf of the University of Sydney pursuant to Part VB of the Copyright Act 1968 (the Act). The material in this communication may be subject to copyright under the Act. Any further copying or communication of this material by you may be the subject of copyrightprotection under the Act.
Do not remove this notice.
*
Orthogonal range searching
Orthogonal range searching
Joachim Gudmundsson
*
Orthogonal range searching
*
Range queries
Input: A set of n points S={s1, s2, , sn} in d-dimensional space.
Aim: Preprocess S such that orthogonal range queries can be handled efficiently.
*
Range queries
The data is usually processed once while queries are performed many times.
Preprocessing: O(n log n)?
Space: O(n)?
Query time: O(polylog n+k)?
*
1D-range queries
S={12,3,29,20,2,11,30,31,24,22,5,9,26}
Q=[10,25]
*
1D-range queries
S={12,3,29,20,2,11,30,31,24,22,5,9,26}
What if we allow no preprocessing?
Try every point separately O(n) time
Q=[10,25]
*
1D-range queries
S={12,3,29,20,2,11,30,31,24,22,5,9,26}
What if we allow preprocessing?
Sort the points
S={2,3,5,9,11,12,20,22,24,26,29,30,31}
Query: Search for the boundary values.
Report everything in-between.
Q=[10,25]
*
1D-range queries using BST
S={2,3,5,9,11,12,20,22,24,26,29,30,31}
Q=[10,25]
*
1D-range queries using BST
S={2,3,5,9,11,12,20,22,24,26,29,30,31}
Observation 1:A balanced binary search tree can be constructed in linear time if the input is sorted.
2359111220 22 24 2629 30 31
5
The largest value in the left subtree
All elements in the leaves
*
1D-range queries using BST
S={2,3,5,9,11,12,20,22,24,26,29,30,31}
Query: Search for the boundary values.
Report everything in-between.
Q=[10,25]
2359111220 22 24 2629 30 31
*
1D-range queries using BST
S={2,3,5,9,11,12,20,22,24,26,29,30,31}
Query: Search for the boundary values.
Report everything in-between.
Q=[10,25]
2359111220 22 24 2629 30 31
*
1D-range queries using BST
S={2,3,5,9,11,12,20,22,24,26,29,30,31}
Query: Search for the boundary values.
Report everything in-between.
Q=[10,25]
2359111220 22 24 2629 30 31
*
1D-range queries using BST
S={2,3,5,9,11,12,20,22,24,26,29,30,31}
Query: Search for the boundary values.
Report everything in-between.
Q=[10,25]
2359111220 22 24 2629 30 31
*
1D-range queries using BST
S={2,3,5,9,11,12,20,22,24,26,29,30,31}
Query: Search for the boundary values.
Report everything in-between.
Q=[10,25]
2359111220 22 24 2629 30 31
*
1D-range queries using BST
S={2,3,5,9,11,12,20,22,24,26,29,30,31}
Query time:
Q=[10,25]
2359111220 22 24 2629 30 31
O(log n+reporting points)
log n
*
1D-range queries using BST
How to report the points?
Q=[10,25]
2359111220 22 24 2629 30 31
split point (search paths split)
Follow left path.
*
1D-range queries using BST
How to report the points?
Q=[10,25]
2359111220 22 24 2629 30 31
split point (search paths split)
Follow left path.
*
1D-range queries using BST
How to report the points?
Q=[10,25]
2359111220 22 24 2629 30 31
split point (search paths split)
Follow left path.
No point in sub-tree lies inside Q
*
1D-range queries using BST
How to report the points?
Q=[10,25]
split point (search paths split)
Follow left path.
Every time the left path turns left report the leaves in the right subtree!
2359111220 22 24 2629 30 31
All points in sub-tree lies inside Q
*
1D-range queries using BST
How to report the points?
Q=[10,25]
split point (search paths split)
Follow right path.
Every time the right path turns right report the leaves in the left subtree!
2359111220 22 24 2629 30 31
Time: O(size of subtree)
All points in sub-tree lies inside Q
*
1D-range queries using BST
Algorithm 1D-RangeQuery(T,[a,b])
SplitNode FindSplitNode(T,a,b)
v lc(split)
While v is not a leaf do
If value(v) a then
ReportSubtree(rc(v))
v lc(v)
else v rc(v)
If value(v) in [a,b] then report(value(v))
% Do the same for the right path
SplitNode
v
rc(v)
*
1D-range queries using BST
Correctness proof
Prove that every reported point p lies in [a,b]
If p is a leafof the search path then p is tested explicitly.
Otherwise p must be reported in call to ReportSubtree(rc(v)).
Assume this happened when searching for a.
SplitNode
v
rc(v)
Search for b goes right at split node b > SplitNode p
Search for a goes left at v a v < pFinally prove that every point in the range is reported.p [a,b]*1D-range queries using BSTHow long does it take to report all the leaves of a bbs-tree? 2359111220 22 24 2629 30 31Query time:O(log n+reporting points)n leaves 2n-1 nodes*1D-range queries using BSTHow long does it take to report all the leaves of a bbs-tree? 2359111220 22 24 2629 30 31Query time:O(log n+k)Preprocessing: O(n log n)Space: O(n)Query time: O(log n+k)*2D-range queriesInput: A set of n points S={s1, s2, , sn} in the Euclidean plane.Aim: Preprocess S such that orthogonal range queries can behandled efficiently.*kD-treesHow can we generalise the 1D structure to 2D?What if we split both wrt x-coordinates and y-coordinates?*kD-treesHow can we generalise the 1D structure to 2D?What if we split both wrt x-coordinates and y-coordinates?AA*kD-treesHow can we generalise the 1D structure to 2D?What if we split both wrt x-coordinates and y-coordinates?ABCABC*kD-treesHow can we generalise the 1D structure to 2D?What if we split both wrt x-coordinates and y-coordinates?ABDEFGCBCADEFG*kD-treesHow can we generalise the 1D structure to 2D?What if we split both wrt x-coordinates and y-coordinates?BCABDEFGCHIJKLMNOADEFGHLNKMIJO*kD-treesHow can we generalise the 1D structure to 2D?What if we split both wrt x-coordinates and y-coordinates?ABDEFGCHIJKLMNOkD-treesHow can we generalise the 1D structure to 2D?What if we split both wrt x-coordinates and y-coordinates?ABDEFGCHIJKLMNO*kd-tree pseudocodeAlgorithm kd-tree(P,depth)if #P=1 then return leaf containing P else if even(depth) thensplit P with vertical line L through Xmedian(P)P1 subset of P on or to the left of L,P2 subset of P to the right of Lelse if odd(depth) thensplit P with horizontal line L through Ymedian(P.y)P1 subset of P on or below L,P2 subset of P above Lcreate node v storing L v.left kd-tree(P1,depth+1)v.right kd-tree(P2,depth+1)RETURN vConstruction time?*kd-tree preprocessingMedian can be computed in O(n) time[Blum et al. 1973]or*kd-tree preprocessingMedian can be computed in O(n) time[Blum et al. 1973]or123456sorted wrt x-coord.sorted wrt y-coord.Time = O(n log n)123456Time T(n) = O(1) if n=1O(n) + 2T(n/2)otherwise*kd-tree spaceSpace = size of a balanced binary tree = O(n)*kd-tree queryA node v in the kd-tree corresponds to a rectangular region in the plane (region(v)).V123123All points in region(v) are leaves in the subtree with root at v.v*kd-tree queryQuery algorithm:Given query rectangle R, visit all nodes in T that intersect R.2R11233vkd-tree queryQuery algorithm:Given query rectangle R, visit all nodes in T that intersect R.2RAll points in the subtree of T rooted at v should be reported!4112334vv*kd-tree queryAlgorithm SearchTree(v root of T, R query range)if v is a leaf then RETURN v if v in R else if region(lc(v)) in R then ReportSubtree(lc(v))else if region(lc(v)) intersects R then SearchTree(lc(v),R)else if region(rc(v)) in R then ReportSubtree(rc(v))else if region(rc(v)) intersects R then SearchTree(lc(v),R)Query time?O(k) + number of nodes tested in T that do not contain any points to report. 21R*kd-tree queryAlgorithm SearchTree(v root of T, R query range)if v is a leaf then RETURN v if v in R else if region(lc(v)) in R then ReportSubtree(lc(v))else if region(lc(v)) intersects R then SearchTree(lc(v),R)else if region(rc(v)) in R then ReportSubtree(rc(v))else if region(rc(v)) intersects R then SearchTree(lc(v),R)Query time?O(k) + number of nodes tested in T that do not contain any points to report. How many such regions is there?21R*kd-tree queryThe number of such regions is at most the number of regions that can intersect a vertical line (4).Consider a vertical line L and a kd-tree TRALAL*kd-tree queryConsider a vertical line L and a kd-tree TALBANumber of visited nodes?Level 1: Q(n) = 1 + Q(n/2)Level 2: Q(n) = 1 + 2Q(n/4)BQ(n) = 2 + 2Q(n/4) = O(n )*Summary: kd-treesPreprocessing: O(n log n)Space: O(n)Query time O(n1/2+k)*kd-trees: higher dimensions3D: Consider a side L of the query range (6)LBAA*kd-trees: higher dimensions3D: Consider a side L of the query rangeLBBAA*kd-trees: higher dimensionsQuery: Q(n)= 4 + 4Q(n/8) =O(n2/3)3D: Consider a side of the query rangeLABALB*Summary kd-trees: higher dimensionsPreprocessomg: O(dn log n)Space: O(dn)Query: O(n1-1/d + k)dD: Consider a side of the query rangeCan we do better?If space = O(n) then query time = (n1-1/d + k)*2D-range queriesRange Treeskd-trees are optimal if we only allow for O(n) space.What if we allow for more space, can we do better?*2D-range queriesObservation: A 2D-query is two 1D-queries.*2D-range queriesObservation: A 2D-query is two 1D-queries.Idea: Build a balanced binary search tree w.r.t. the x-coordinatesof the points in S. *2D-range queriesObservation: A 2D-query is two 1D-queries.Idea: Build a balanced binary search tree w.r.t. the x-coordinatesof the points in S. T*2D-range queriesFor each internal node v in T construct an associated data structure A(v), for the points in the subtree of T rooted at v.vT*2D-range queriesFor each internal node v in T construct an associated data structure A(v), for the points in the subtree of T rooted at v.A(v) is a balanced binary search tree w.r.t. the y-coordinates.vA(v)vA(v)*2D-range queriesFor each internal node v in T construct an associated data structure A(v), for the points in the subtree of T rooted at v.A(v) is a balanced binary search tree w.r.t. the y-coordinates.Range tree: multilevel tree structureT main tree (first-level tree)vA(v) associated data structure(secondary tree) *2D-range queriesQuery:[x:x] [y;y]xx*2D-range queriesQuery:[x:x] [y;y]xxxx*2D-range queriesQuery:[x:x] [y;y]xxCBAxxABC*2D-range queriesQuery:[x:x] [y;y]xxCBAxxABCyyQuery time:Left and right search paths +Searching the associated structures2 x O(log n)*2D-range queriesEach 1D-range tree takes O(log n+k) to query.How many associated structures do we need to query? *2D-range queriesQuery time:Querying a 1D-tree requires O(log n+k) time. How many 1D trees (associated structures) do we need to query?At most 2 height of T = 2 log nEach 1D query requires O(log n+k) time.Query time = O(log2 n + k)Query: [x,x]xx*Range treesSpace: – Number of nodes in T is O(n)- |A(v)| = ?Consider one leaf v. How many associated structures does v belong to?Which ones does it belong to?vTv*Range treesSpace: – Number of nodes in T is O(n)- |A(v)| = ?Consider one leaf v. How many associated structures does v belong to?Which ones does it belong to?vTv*Range treesSpace: – Number of nodes in T is O(n)- |A(v)| = ?Consider one leaf v. How many associated structures does v belong to?Which ones does it belong to?vTvlog n*Range treesSpace: – Number of nodes in T is O(n)- |A(v)| = O(n log n)Consider one leaf v. How many associated structures does v belong to?Which ones does it belong to?vTvlog n*Range treesConstruction time:- T can be constructed in O(n log n) time – All the associated structures can be constructed in timeO(n log n) + |A(v)| = O(n log n). Why? Observation 1!vTvlog n*2D-range queriesSummaryPreprocessing:O(n log n)Space:O(n log n)Query time: O(log2 n + k)*dD-range queriesSummaryPreprocessing:O(n logd-1 n)Space:O(n logd-1 n)Query time: O(logd n + k)first-level treesecond-level treed-level treev*Lower boundQuery time and preprocessing is almost optimal. Can we improve the space requirement? Theorem: (Chazelle 1990)If query time is O(logc n + k) thenone needs at least (n log n/loglog n) space.*Summary: Range treeskd-treesSpace: O(dn)Preprocessomg:O(dn log n)Query: O(n1-1/d + k)Range treesSpace: O(n logd-1 n)Preprocessing: O(n logd-1 n)Query time: O(logd n)O(logd-1 n) with some tricks
Reviews
There are no reviews yet.