ֿ

Overview

SpaceBase is a real-time spatial database, specifically designed for applications that require very low latencies or at very high rates. It provides a Java, C++, Erlang and Apache Thrift (Python, Ruby, Node.js) APIs.

SpaceBase is developed by Parallel Universe and its trial version can be downloaded from the SpaceBase homepage.

Introduction

What is SpaceBase?

SpaceBase is a server-side, in-memory, high-performance, dynamic, concurrent and distributable spatial data-store. It stores 2D and 3D spatial objects and allows you to update and query them at high rates and concurrently from many threads and many servers. SpaceBase not only allows for fast and concurrent updates and queries, but supports atomic multi-object transactions, and, most importantly, can be used as a basis for parallelizing your entire application.

Note: Please note that the Thrift and Erlang APIs provides only a subset of the capabilities provided with the Java and C++ APIs.

Is SpaceBase For You?

SpaceBase was specifically designed for real-time or near real-time applications (requiring low latency), modifying many spatial objects at a high rate (a high write-ratio). SpaceBase is an excellent fit for interactive systems, where updates are done on many threads and/or servers, in response to concurrent network requests. SpaceBase is especially suited for:

  • Online MMO games and virtual worlds. Spatial queries can be used for streaming and culling for each user. SpaceBase’s join-queries are great for broad-phase collision-detection.
  • Military simulation and C4I systems.
  • Highly dynamic location-based services. If your LBS only occasionally queries mostly static-objects, to inform users of nearby restaurants for example, there might be other solutions that are a better fit for your needs. SpaceBase does not particularly excel at storing a high-volume of static objects.

What SpaceBase Provides

  • Stores 2D and 3D spatial objects with extents (i.e. objects that have an area or volume, not just points).
  • Fast insert, update and delete operations.
  • Fast pre-built and user-made spatial queries. Pre-built queries include range queries (return all objects contained in a given region of space) and intersection queries (return all object intersecting a given region of space.
  • Fast join-queries. Join queries return pairs of objects that satisfy a certain condition. For example, return all pairs of objects that intersect one another, or return all pairs of objects that are within a given distance from one another.
  • Spatial processing parallelization. Transactions executing on groups of objects at close proximity are executed atomically, while transactions operating on different object-groups that are spatially separated are parallelized with virtually zero interference with one another. All of the application-logic handling spatial objects can be run through SpaceBase and thus parallelized by it.
  • Load balancing. SpaceBase can distribute its objects among different servers for efficient load-balancing. Objects that are in relative close spatial proximity will usually be kept on the same server, but SpaceBase distributes the objects in such a way that each server would hold a similar number of objects. That is, a small region of space with a very high-density of objects would be distributed across several servers, while a large region with few objects would be kept on a single server, thus ensuring a reasonable work-sharing among servers.
  • JMX monitoring of performance.

What SpaceBase Doesn’t Provide

  • Disk durability. The reason for that is the SpaceBase was designed to keep objects that move in space at a high modification rate, and so an in-memory store was the best fit.

System Requirements

SpaceBase is implemented in Java™, and runs on all hardware and operating systems with available JavaSE JRE. It provides a Java, C++, Erlang and Apache Thrift APIs.

How SpaceBase Works

SpaceBase uses a spatial data-structure based on the well-known R-Tree as an index allowing fast spatial queries. SpaceBase uses a variant of the R-Tree that never requires re-organizing to preserve its efficiency, and allows for concurrent modifications. While spatial data-structures are generally not as fast as their one-dimensional counterparts (like the B-Tree), spatial data usually has a unique characteristic which allows for its high level of concurrency: while bank accounts participating in a financial transactions usually have no common feature — any two random accounts can participate in a transaction — spatial objects usually model real-world things, and those tend to affect one another only when close. When objects affect each other (and so their behavior needs to be modeled as an atomic transaction), they are often in close proximity, while far-away objects do not, usually, interact, and so, the processes modeling them can be carried out in parallel. The R-Tree keeps near objects close together in its internal index representation, and SpaceBase employs this to allow many transactions, each operating on a group of close-proximity objects, while spatially separated transactions do not interfere with one another at all.

News

October 17, 2013

SpaceBase demo released using Quasar and without callbacks.

February 28, 2013

Spaceship simulation using SpaceBase.

December 6, 2012

SpaceBase Lite has been released.

March 22, 2012

Introductory blog post: Introducing SpaceBase: a New Realtime Spatial Data Store.

User Manual

SpaceBase Concepts

Spatial Elements

A spatial object represents a real or virtual entity that occupies a region or a point in space. Once such an object is put into a SpaceBase store, it is called an element of the store, or just a spatial element. SpaceBase supports spatial elements existing in a two-dimensional or a three-dimensional space.

Spatial Tokens

A spatial-token is an object created by SpaceBase for each element as it is inserted into the store, and used to identify the element when updating or deleting it. The insert method returns the spatial token allocated for the inserted element, and in order to delete or update it, you pass the token you’ve received to the delete or update method respectively. It is a good idea to keep the token in one of the element’s fields.

Axis-Aligned Bounding Boxes

An axis-aligned bounding-box (or an AABB, in short) is a box in the 3D case or a rectangle in the 2D case, that bounds a spatial element, and whose sides are parallel to the axes. An AABB is defined as a list of pairs, one for every dimension, each comprising of a minimum and maximum bounds along that dimension. It’s OK to have minimum and maximum bounds equal to each other. If the element is a point, all bounds along all dimensions have an extent of length zero.

AABB

SpaceBase does not consider an element’s entire geometry — only its AABB. For each element, SpaceBase holds a pointer to the object and its AABB.

Furthermore, AABBs play a crucial part in SpaceBase’s entire operation, as SpaceBase’s internal index is a hierarchical tree of nodes, each holding the AABB of all the nodes below it. You should know this should you choose to write your own special kind of spatial query, which are described below.

AABBs are represented by SpaceBase’s AABB class.

Spatial Queries

A spatial query searches the data-store for elements fulfilling a certain spatial condition. For example, you could search for all objects contained in a given region of space, or all objects whose bounds intersect some geometry. You can use SpaceBase’s built-in queries or roll your own.

Since SpaceBase does not consider the elements’ geometry, only their AABBs, the query actually only tests whether or not an element’s AABB fulfills the criteria, so it is possible that the element itself does not intersect the shape given in the query — only its bounding box does — but the results can be easily refined and culled. We’ll explore that further when we describe queries in depth.

Note that the criteria for spatial queries must only consider the elements’ location and extent in space, not other geometric properties like size, for example. Searching for all elements that longer than one meter is a geometric query, not a spatial query. While you could write such a query in SpaceBase, you really shouldn’t, as it will be very slow, and will terribly hurt the performance of other SpaceBase operations running concurrently.

Spatial Joins

A spatial join is a query that does not test each element against a given condition, but rather the elements against each other, in pairs. For example, you could search for all elements (whose bounding boxes are) intersecting each other. The result is a set of pairs of intersecting elements.

SpaceBase can join the elements of a given data-store against each other, and it can also join the elements of one store with those in another.

Geo-Spatial Queries and Joins

Internally, SpaceBase does not consider the units-of-measure of the spatial coordinates. It makes no assumption on the nature of the space, except for it being a vector space (or a finite portion of one). It doesn’t even require a metric to exist on the space. All interpretation for what the coordinates mean is done by the queries (or joins). For example, those that take into account distances, assume that the unit of distance is consistent with the coordinates, and together they describe a euclidic space (i.e., an object at coordinate (1,0) and one at (2,0) are 1 distance-unit apart).

Sometimes, though, it is desired to give the space a different interpretation. SpaceBase provides built-in spatial queries and joins for the common usage of describing objects located on (or close to) the surface of the earth. Those queries support geographic coordinates and distances in well-known units.

Complex Operations and Transactions

A complex operation is composed of several simple operations, working on several elements. A transaction is a complex operation that executes atomically, that is, queries either see the effect of all the simple operations of the transaction (once it completes), or none of them (before it completes).

SpaceBase supports different flavors of transactions, but it also supports complex operations that are not transactional — operations that other, concurrently executing, queries can observe the partial completion of.

So if you need to perform a complex operation, consider whether or not you need it to be atomic, or whether you can live without atomicity, in which case the operation will perform faster, and with less interference with other concurrent operations.

The Visitor Pattern (Java and C++ APIs only)

SpaceBase has been built from the ground-up to support multi-threaded programming in low-latency applications, and so its API was designed to enable the fastest performance and the highest concurrency (as large a number of operations working concurrently without affecting one-another’s performance).

For that reason, SpaceBase’s queries and joins don’t return a set of results, or even an iterator to traverse them. Instead, SpaceBase’s API employs the visitor pattern, in which you pass the query (or join) method an interface which you implement, and SpaceBase calls the methods (usually just one) of that interface, passing it the query’s results, either one-by-one or as a set. Complex operations and transactions also make use of visitors. This allows SpaceBase to better manage the threads in which the operations run, to make sure that internal locks are always released promptly, and to better take advantage of the hardware.

If you wish to have a set of elements returned back to the caller, you may put the results into a collection that you have allocated yourself. However, this is considered a bad idea, it is unnecessary in almost all circumstances, and may be a little tricky if SpaceBase is configured to run in the parallel execution mode, which is explained later (if, on the other hand, your use of SpaceBase is very basic, you can use the SimpleSpaceBase API, which does not make use of visitors.

From within a visitor, you cannot start a SpaceBase operation on the same store you are visiting (attempting to do so results in an exception being thrown). This would have almost certainly caused a deadlock. You may, however, start an operation on another SpaceBase store, although this could cause a deadlock if a visitor on that store were to call your own, so you should be careful if you’d like to do that. However, complex operations and transactions pass special objects as arguments to the visitor that allow it to safely modify the store.

If your visitor throws a runtime-exception for whatever reason, that exception is thrown by the SpaceBase method that started the operation (e.g. query), with one caveat discussed a bit later. If this happens as part of a complex operation or a transaction, none of the operations of that complex operations or transactions are executed. That is, both transactions and non-transactional complex operations fail atomically.

Java API

Getting Started

Installation

SpaceBase requires no installation. Simply include all jar files in the lib directory of the SpaceBase distribution in your class-path. All of SpaceBase user-facing public classes are in the package co.paralleluniverse.spacebase, and co.paralleluniverse.spacebase.queries, and they all have a fine Javadoc documentation.

Running SpaceBase

SpaceBase is a library that runs as part of your app. Just make sure to disable assertions, at least for the SpaceBase packages co.paralleluniverse.spacebase.*

Performance Note: Do not enable assertions for any of the SpaceBase classes. Enabling assertions for SpaceBase may considerably hurt performance.

Configuring and Creating a SpaceBase Store

All operations on a SpaceBase store are done by calling methods on the SpaceBase interface. To create a new SpaceBase store, we use the class SpaceBaseBuilder. We first create a new instance of SpaceBaseBuilder like so:

SpaceBaseBuilder builder = new SpaceBaseBuilder();

Then we must specify the number of dimensions this store supports — either 2 or 3:

builder.setDimensions(2);

Finally, we create the store. We must give it a non-null, non-empty, unique name that will be used by the JMX monitoring mechanism:

SpaceBase<MySpatialObject> base = builder.build("space-1");

That’s it. In fact, this could all be done in one line:

SpaceBase<MySpatialObject> base 
    = new SpaceBaseBuilder().setDimensions(2).build("space-1");

The same builder can be used again and again to create new store instances, but the builder itself is not thread-safe and should be used from a single thread.

While the builder requires setting only the dimensions property, you would probably want to also set at-least the execution mode, as explained in the next section.

Multi-Threaded Execution Modes

SpaceBase is a concurrent data-store, and as such it supports many threads operating on a single store simultaneously. It has two different execution modes that make use of threads differently. Which one is best for you depends on your specific application design and your hardware.

The first execution mode is the concurrent mode. In this mode, every operation is executed in its entirety on the thread that started it. If you call a query method with a visitor, the visitor’s method will be called on the same thread in that originally called the query method. Only internal, asynchronous housekeeping operations will be executed in background threads.

The second execution mode is the parallel mode, in which every operation may be further parallelized (using Java’s Fork/Join framework), and a single operation will execute on several different threads in parallel, none of which is the thread used to start the operation. In this mode, your visitor may be called, concurrently, on several different threads. If, against all good judgment, you have decided to collect the element passed to the visitor in a collection, take this fact into account, and make sure the collection supports concurrent insertion.

To set SpaceBase’s execution mode, you must create an appropriate — concurrent or parallel — executor, and pass it to the builder’s setExecutor method. Passing a ‘ThreadPoolExecutor’ will set the concurrent mode (the executor will be used to run background operations), while passing a jsr166e.ForkJoinPool will set the parallel mode (the ForkJoinPool must be constructed with the asyncMode flag set). For convenience, you may use one of the static factory methods of the DbExecutors class (in the co.paralleluniverse.db.api package) to create the executor.

If you use the builder to construct multiple SpaceBase stores, they will all share the same executor — as they should. In fact, when you instantiate multiple stores using the same builder, or even multiple builders, it is strongly recommended to have them share the same executor.

To set SpaceBase to run in the concurrent mode, we use DbExecutors’s concurrent method. The ‘int’ parameter indicates how large should the thread pool responsible for the background operations be:

builder.setExecutor(SpaceBaseExecutors.concurrent(2));

The parallel mode is set with the parallel method, which takes as an argument the number of threads in the fork-join pool:

builder.setExecutor(SpaceBaseExecutors.parallel(10));

Sync and Asynchronous Mode

In the parallel execution mode a SpaceBase operation may run in many threads in parallel. Since, as explained, no query or join or transaction ever return a result back to the caller but rather call a provided visitor, why should the caller wait for the operation to complete?

Well, there may be a reason to wait, if the caller wants to make sure that its modifications to the store are visible to a query it wants to issue. In the parallel mode, you could insert an element, and then immediately issue a query that will start before the insert operation has completed, and that will fail to find the newly inserted element. This caller-thread visibility may be important in some circumstances and not in others.

For that reason, all SpaceBase operations return a co.paralleluniverse.db.api.Sync object (even those that return a spatial-token, as a token is also a Sync). A Sync is really just a void Future (in fact, it implements Future<Void>). It has a join method, that, when called, blocks the current thread, until the operation that returned the Sync has completed, after which, all modifications to the store made by that operation are visible to the joining thread.

In addition, SpaceBase provides a method to wait for all operations issued from the current thread to complete.

Note that even if the SpaceBase is configured to run in parallel mode, its default is to automatically join all operations, that is, when a method returns, its operation has completed. If you wish not to wait, or only wait for some operations, you must call method setCurrentThreadAsynchronous, which affects the current thread only: from that moment on, all SpaceBase methods issued by this thread will return immediately, and you can wait for them manually, if you like, using the Sync’s join method.

If you’re using SpaceBase in concurrent mode, or if you have not called setCurrentThreadAsynchronous you may ignore the returned Syncs as they would be completely useless.

One last thing about Syncs: if a visitor called by an operation throws an exception, and if you’re using SpaceBase in parallel mode and are using manual joins, the exception is thrown when the Sync for the failed operation is joined.

Other Settings

The builder has other options that are used to set the store’s execution mode as well as other tuning parameters that affect performance, but it comes with sensible defaults. The more advanced tuning parameters are discussed in the chapter Tuning SpaceBase for Performance.

Inserting, Modifying and Deleting Spatial Elements

Once we have a SpaceBase instance, we can start using the store. In this chapter we offer simple usage examples. Please refer to the Javadoc for a more detailed reference.

Assume we have a class called MySpatialObject that represents spatial objects in our application, and an instance, spatialObj, of that class. We insert spatialObj into the store:

SpatialToken token = base.insert(spatialObj, AABB.create(8.0, 15.0, 7.5, 19.7));

Notice how we create an AABB with the AABB.create() static method. In a real application, the spatial object would probably maintain its AABB itself, in which case it might have a method called getAABB(), and we’d have something like:

SpatialToken token = base.insert(spatialObj, spatialObj.getAABB());

We said it would be a good idea to keep the token in the spatial object, so we might wish to give it a setToken and getToken methods, and so the next line would be something like:

spatialObj.setToken(token);

If we’d then like to update the object’s location, we would write:

base.update(spatialObj.getToken(), AABB.create(-100.0, -2.5, 7.5, 7.5));

Or, more likely,

base.update(spatialObj.getToken(), spatialObj.getAABB());

Finally, to remove the object from the store, we call:

base.delete(spatialObj.getToken());

Because SpaceBase was built for concurrency, the update and delete methods don’t throw exceptions if the object represented by the taken had already been removed from the store, because it may have been removed concurrently from a different thread. So, in order to save us unnecessary tests, in such a case those operations are simply ignored and the methods return normally.

Another version of the update method is used like this:

update(token, new SpatialModifyingVisitor<MySpatialObject>() {
    public void visit(ElementUpdater<MySpatialObject> updater) {
      // modify updater.element()
        updater.update(AABB.create(-200.0, 0.0, 1.0, 2.0));
       }
    });

Within the visitor’s visit method, you may freely modify the element’s fields without interference from other transactions (it is essentially locked for writing).

Performance Note: Insertion is SpaceBase’s costliest operation, and it may entail even further, asynchronous housekeeping. So when populating the store (i.e. when executing a large number of insert operations) in the parallel execution mode, do not set the thread for asynchronous operations, or, alternatively, join the Sync returned from each insert operation before issuing the next. This is only recommended when many inserts are done consecutively. There is no need to join if the inserts are spaced-out.

Queries

Using the Built-In Queries

SpaceBase comes with a few useful built-in spatial queries. They are created by the static methods of the SpatialQueries class, and are listed in the next section. Here we’ll provide an example:

base.query(SpatialQueries.contained(AABB.create(-100.0, 100.0, 0.0, 1000.0)),
    new SpatialVisitor<MySpatialObject>() {
        public void visit(MySpatialObject elem, SpatialToken token) {
            System.out.println(elem + " is in the region!");
        }
    });

The visitor is called for each element matching the query. Of course, in the visitor you may examine the element’s geometry further (not just its bounding-box as the query does), and decide if it’s of use to you.

You can also request that all matching elements be passed to the visitor in one call as a set, like so:

base.query(SpatialQueries.contained(AABB.create(-100.0, 100.0, 0.0, 1000.0)),
    new SpatialSetVisitor<MySpatialObject>() {
        public void visit(Set<MySpatialObject> result, Set<ElementUpdater<T>> resForUpdate) {
            for(MySpatialObject elem : result)
                System.out.println(elem + " is in the region!");
        }
    });

This visitor will be called only once.

SpaceBase’s Built-In Queries

The following table lists SpaceBase’s built-in spatial queries. The first column lists the queries obtained using SpatialQueries’s static methods or fields, while the second lists the classes implementing those queries, which are all found in the ...spacebase.queries package.

The queries consider only the elements’ AABBs, but by extending the classes in spacebase.queries package and overriding their implementation of queryElement (as detailed in the next section), you may refine the queries to take into account more specific properties of the element other than its bounding-box.

SpatialQueries Query class Description
method/field (in spacebase.queries)  
ALL_QUERY AllQuery Finds all elements in the store.
NONE_QUERY - Finds no elements. Provided for the sake of completeness.
equals(AABB) EqualsQuery Finds all elements whose AABBs are equal to a given AABB.
intersect(AABB) IntersectQuery Finds all elements whose AABBs intersect a given AABB (or are entirely contained in it).
contained(AABB) ContainedQuery Finds all elements whose AABBs are contained within a given AABB.
range(AABB,double) RangeQuery Finds all elements whose AABBs lie, completely or partially, within a given range from a given AABB.
contain(AABB) ContainsQuery Finds all elements whose bounding boxes contain a given AABB.

For further details, please consult the Javadoc.

Composing Queries

You can compose queries together using SpatialQueries’s not, and, and or methods, which provide query negation, conjunction and disjunction, respectively. The binary operations and and or make use of short-circuit logic, so the order of the two queries passed as arguments may have a performance impact — in the case of the and operation, the less inclusive query should come first, and in the case of the or operation, the more inclusive query should come first.

See the Javadoc for details.

Writing Your Own Queries

You may want to write your own queries that check for containment or intersection with shapes more interesting than axis-aligned boxes (like polygon containment, line intersection, etc.). In order to do that, you should understand how SpaceBase runs queries.

SpaceBase’s internal index is a tree of nodes, each holding the AABB of all the nodes below it. When a query runs, it decides for each node whether or not there may be matching elements in the child nodes. The way it does this is by examining the node’s AABB and deciding whether there could be elements that are fully contained in that AABB that may satisfy the query. This way SpaceBase decides whether to descend further to lower nodes in the tree, and if so, to which.

To write your own spatial-query, you need to implement the SpatialQuery interface which declares two methods:

Result queryContainer(AABB aabb);
boolean queryElement(AABB aabb, T elem);

The second, queryElement is simple. It receives an element’s AABB, and decides whether it satisfies the query. If it returns true, the element will be passed on to the visitor. If in queryElement you take into account properties of the element other than its bounding box, you can refine the query to more complex geometries than simple AABBs.

The first, queryContainer, receives an index-node’s bounding-box, and determines if there could be elements fully contained in those bounds that may satisfy the query. It returns an enum with three options:

NONE, if no elements within the bounding box could possibly match the query,

SOME, if there could be elements within the bounding box that would match the query,

or ALL, if all elements within the bounds would definitely match the query.

Of course, you could always return SOME and let the queryElement examine each element for you, but that would require SpaceBase to call the method for all elements in the store. By now you should understand that it is the NONE result that makes SpaceBase queries so fast, as it stops the descent from certain nodes down the tree. The ALL result might be beneficial if the queryContainer has a slow computation, as it saves SpaceBase the trouble of calling the query if a positive result is assured.

Fast(er) Spatial Queries

When implementing the SpatialQuery interface, the resulting query may suffer from non-optimal performance due to the need to allocate many instances of the AABB class. If you wish to improve the performance of your hand-rolled query, it should implement the FastSpatialQuery interface, which extends SpatialQuery, and adds an extra method for you to write:

Result queryContainer(IndexedAABB aabbs, int index);

This method should satisfy exactly the same contract as the first queryContainer method we’ve covered, except that it takes its AABB argument in a different form — as an IndexedAABB, which can be thought of as an array of AABBs, and an index into that array. Using IndexedAABB and the accompanying index is very similar to using an AABB object, but you should refer to IndexedAABB’s Javadoc for specifics.

Spatial Joins

A spatial join is a query that operates not on each spatial element individually but on pairs of elements. It is useful in finding, say, all pairs of elements that intersect one another, or all pairs that are within a certain distance from each other.

Using the Pre-Built Spatial Joins

SpaceBase comes with a couple of useful built-in spatial joins. They, too, are created by the static methods of the SpatialQueries class, and are listed in the next section. Here’s an example:

base.join(SpatialQueries.intersect(),
    new SpatialJoinVisitor<MySpatialObject, MySpatialObject>() {
        public void visit(MySpatialObject elem1, MySpatialObject elem2) {
           System.out.println(elem1 + " and " + elem2 + " intersect!");
        }
    });

You may also limit the area in which the join is performed, like so:

base.join(SpatialQueries.contained(AABB.create(-100.0, 100.0, 0.0, 1000.0)), 
SpatialQueries.distance(10),
    new SpatialJoinVisitor<MySpatialObject, MySpatialObject>() {
        public void visit(MySpatialObject elem1, MySpatialObject elem2) {
           System.out.println(elem1 + " and " + elem2 + " are close!");
        }
    });

It is also possible to join elements from two different SpaceBase stores. In that case, the spatial-join examines pairs of elements where the first element in the pair comes from the first store, and the second comes from the other:

base.join(base2, SpatialQueries.intersect(),
    new SpatialJoinVisitor<MySpatialObject, OtherSpatialType>() {
        public void visit(MySpatialObject elem1, OtherSpatialType elem2) {
            System.out.println(elem1 + " and " + elem2 + " intersect!");
        }
    });

When joining two stores it is also possible to limit the join to a region of space.

Note that unlike simple queries, spatial joins do not have (in this version of SpaceBase, at least) versions that return all results in a set.

SpaceBase’s Built-In Spatial Joins

The following table lists SpaceBase’s built-in spatial joins. The first column lists the join-queries obtained using SpatialQueries’s static methods or fields, while the second lists the classes implementing those join-queries, which are all found in the ...spacebase.queries package.

The join-queries consider only the elements’ AABBs, but by extending the classes in spacebase.queries package and overriding their implementation of joinElements (as detailed in the next section), you may refine the queries to take into account more specific properties of the element other than its bounding-box

SpatialQueries Query class Description
method/field (in spacebase.queries)  
JOIN_ALL_QUERY AllJoin Finds all pairs of elements in the store(s).
JOIN_NONE_QUERY - Finds no pairs of elements. Provided for the sake of completeness.
intersect() IntersectJoin Finds all pairs of elements whose AABBs intersect.
distance(double) DistanceJoin Finds all pairs of elements whose AABBs are less than or equal to a given distance apart.

For further details, please consult the Javadoc.

Writing Your Own Spatial Joins

You may want to write your own spatial joins that check for containment or intersection with shapes more interesting than axis-aligned boxes. To do that, you implement the SpatialJoinQuery interface, which has the following methods:

boolean join(AABB aabb1, AABB aabb2);
boolean joinElements(AABB aabb1, T1 elem1, AABB aabb2, T2 elem2);

The join method is analogous to queryContainer in the SpatialQuery interface. It is passed two AABBs from the store’s internal hierarchy. Each is the bounding box of a group of elements. If an element contained in the first box and an element contained in the second may possibly satisfy the join, the method should return true. If no element contained in the first AABB can possibly be joined with an element in the second, the method should return false.

The joinElements method receives two elements and their respective AABBs. If the two elements satisfy the join — return true. Otherwise, return false. By having joinElements take into account properties of the elements other than their bounding box, you can refine the join to more complex geometries than simple AABBs.

Fast(er) Spatial Joins

Just like SpatialQuery, SpatialJoinQuery also has its faster counterpart FastJoinQuery, which adds a couple of methods to those of its parent SpatialJoinQuery:

boolean join(IndexedAABB aabbs1, int index1, IndexedAABB aabbs2, int index2);
boolean join(AABB aabb1, IndexedAABB aabbs2, int index2);

These two methods should satisfy exactly the same contract as the SpatialJoinQuery’s join method, except that it uses IndexedAABBs and indeces. Please take a look at the Javadocs for specifics.

Geo-Spatial Queries and Joins

SpaceBase has a few built-in queries and joins designed to support objects positioned geographically. These — unlike all other built-in queries — make strict requirements on how coordinates are specified.

If you wish to make use of the geo-spatial queries provided with SpaceBase, you must adhere by the following rules:

  • The first coordinate is the latitude, given in degrees, and must have a value in the range [-90.090.0].
  • The second coordinate is the longitude, given in degrees, and must have a value in the range [-180.0180.0].
  • If you wish the queries to take altitude into account, you can use 3D AABBs, in which case the third coordinate is the altitude, and it must be given in meters.

In short, the coordinates must be given as (lat, lng) or (lat, lng, alt) with lat and lng being in degrees, and alt in meters. For AABBs, of course, the first pair is the minimum and maximum latitude, etc.

The built-in geo-spatial queries and joins can be created by the GeoQueries class in the co.paralleluniverse.spacebase.go package:

SpatialQueries Query class Description
method/field (in spacebase.queries)  
georange(AABB, double) GeoRangeQuery Finds all elements whose geographic AABBs lie, completely or partially, within a given range, in meters, from a given geographic AABB.
geodistance(double) GeoDistanceJoin Finds all pairs of elements whose geographic AABBs are less than or equal to a given distance, in meters, apart.

Complex Operations and Transactions

Simple Transactions

As defined earlier, a complex operation is one composed of many simple operations like insert, update or delete. Transactions are complex operations that are executed atomically. There are several kinds of complex-operations and transactions you can run with SpaceBase. As before, here we’ll provide some examples, and you should refer to the Javadoc for further details.

The first kind of transaction, simply inserts, deletes and/or updates several elements atomically. It is used like so:

base.transaction(new SpatialTransaction<MySpatialObject>() {
    public void run(StoreUpdater<MySpatialObject> updater) {
        sobj1.setToken(updater.insert(sobj1, sobj1.getAABB());
        // updater.delete(sobj1.getToken); - illegal
        updater.update(sobj2.getToken(), sobj2.getAABB());
        updater.delete(sobj3.getToken);
    }
});

There are a few things you should note here:

  1. Notice how the StoreUpdater object is used to update the store. Calling methods on base from within a transaction would result in an exception.
  2. You may not update or delete an element that was inserted within the same transaction, nor are you allowed to update the same element more than once within a transaction. Doing so a. makes no sense, and b. may cause unexpected behavior.
  3. If the store is configured to run in parallel execution mode, the transaction will run in a thread other than the one calling the transaction method. However, you should not be concerned with concurrency issues when your getAABB and setToken methods are called (as long as you’re not trying to insert sobj1 from two different threads). SpaceBase makes it just work.

Query-based Transactions and Complex Operations

The second kind of transaction allows updating elements that are found by a query. This transaction atomically deletes all elements contained in a given box:

base.transaction(SpatialQueries.contained(AABB.create(-100.0, 100.0, 0.0, 1000.0),
    new SpatialModifyingVisitor<MySpatialObject>() {
        public void visit(ElementUpdater<MySpatialObject> updater) {
            updater.delete();
        }
        public void done() {
        }
    });

A non-atomic version of complex operation that does the same thing is called almost identically, except with a queryForUpdate method instead of transaction:

base.queryForUpdate(
    SpatialQueries.contained(AABB.create(-100.0, 100.0, 0.0, 1000.0)),
    new SpatialModifyingVisitor<MySpatialObject>() {
        public void visit(ElementUpdater<MySpatialObject> updater) {
            updater.delete();
        }
        public void done() {
        }
    });

Again, the visitor must use the updater provided, and never call any of base’s methods. However, within the done method, you may call any of base’s method to start a new operation.

Another very useful operation runs two queries at once: one for returning elements which we only wish to read, and another that returns elements that are to be modified. The results are passed to the visitor all at once:

base.queryForUpdate(
SpatialQueries.contained(AABB.create(-100.0, 100.0, 0.0, 1000.0)), // for read
SpatialQueries.contained(AABB.create(-10.0, 10.0, -50.0, 10.0)), // for write
false, // non-atomic
    new SpatialSetVisitor<MySpatialObject>() {
        public void visit(Set<MySpatialObject> readOnly, Set<ElementUpdater<MySpatialObject>> forUpdate) {
            // read elements in the readOnly set
            // modify elements in the forUpdate set
        }
    });

An advantage to this operation is that any SpaceBase method could be called from within the visitor, to start a new query or transaction.

Another transaction method provides the visitor with a set all of the query’s results in one call.

There is, however, one restriction that applies to the transactional operations (using the transaction methods) and not to the non-transaction versions (the query methods): in a transaction, you may not update an element’s bounds so they no longer match the given query. In other words, in a query-based transaction you may not move the element outside the region defined by the query. The following would throw an exception:

base.transaction(SpatialQueries.contained(AABB.create(-100.0, 100.0, 0.0, 1000.0)),
    new SpatialModifyingVisitor<MySpatialObject>() {
        public void visit(ElementUpdater<MySpatialObject> updater) {
            // throws an exception:
            updater.update(AABB.create(-200.0, 0.0, 1.0, 2.0));
        }
    });

The new bounds are not contained in the query region. Doing it in a non-transactional operation, however, is fine:

base.queryForUpdate(
SpatialQueries.contained(AABB.create(-100.0, 100.0, 0.0, 1000.0)),
    new SpatialModifyingVisitor<MySpatialObject>() {
        public void visit(ElementUpdater<MySpatialObject> updater) {
            updater.update(AABB.create(-200.0, 0.0, 1.0, 2.0)); // OK
        }
    });

In addition, the query passed to the transaction method must implement the interface BoundedSpatialQuery, which specifies a query whose matching elements reside in some bounded region of space (i.e. can all be contained in some AABB). BoundedSpatialQuery declares a method

boolean isContainedIn(AABB);

which returns ‘true’ if all of the query’s matching elements are contained in the given AABB.

Naturally, some queries do not fulfill this requirement and so do not implement the interface. For example, SpatialQueries.intersect is not a bounded query because all elements that intersect a given AABB are not necessarily contained in a bounded region of space.

However, note that you may compose a non-bounded query with a bounded one using the SpatialQueries.and method, to obtain a bounded query.

The Simple API and Auto-Expiration

Expiration

Having spatial elements automatically disappear from the data store after a given duration is a useful feature, so we’ve added it to a class that wraps a SpaceBase instance. It is called SpaceBaseWithExpiration, and is found in the co.paralleluniverse.spacebase.expiration package. Two subclasses provide expiration based on real ‘‘wall clock time’’, and another uses virtual time. Please consult the Javadocs for further information.

The Simple API

Some applications use SpaceBase simply as a fast spatial database, and do not use it as a parallelization framework to scale their logic. Such applications may make use of the simple API provided by the co.paralleluniverse.spacebase.simple.SimpleSpaceBase class. It does not employ visitor, and queries simply return collections. It is suitable for simple CRUD usage. Please consult the Javadocs for further information on the simple API.

The Quasar API

Quasar is an open-source lightweight thread and actors library developed by Parallel Universe. It enables the creation of many thousands (or even millions) of lightweight threads, and provides the scalability advantages of asynchronous, callback-based APIs while maintaining the simplicity of the sycnrhonous (blocking) model.

The co.paralleluniverse.spacebase.quasar.SpaceBase interface, is a SpaceBase API that integrates with Quasar. Instead of passing callbacks to SpaceBase operations, those operations block the calling fiber. The transactions are no longer delinated by the scope of the visitor callback(s) they invoke, but explicitely. Transactions look like this:

try (ResultSet<MyElement> rs = global.sb.query(SpatialQueries.range(...))) {
	for (MyElement e : rs.getResultReadOnly()) {
        // ....
    }
}

Please refer to the Javadocs for details.

Working with Records

Quasar provides records as a standard abstraction to access shared state. Records are an excellent choice for a SpaceBase element type, because they can be easily decorated with transaction-protection that make sure that you can only read or write a record field when protected by an appropriate transaction.

SpaceBase comes with two types of transaction-protection for record elements, both found in the co.paralleluniverse.db.record package.

If you’ve chosen to use records as your SpaceBase elements, you should wrap them with TransactionalRecord before inserting them into SpaceBase; a special version of SpaceBase.insert takes an instance of Transactional<E> which TransactionalRecord implements (where E is the element type).

When inserting TransactionalRecords into a SpaceBase instance, all queries that return elements, return protected record references that only allow read access if the transaction is read-only, and are invalidated when the transaction terminates.

While you’d want to insert TransactionalRecords into SpaceBase, sometimes you’d want to hold a reference to one of the records, but would still like to enjoy transactional protection. But, as we’ve said, the record references returned by queries are invalidated as soon the transaction ends and can never be used again.

After you wrap a record with TransactionalRecord and insert it into SpaceBase, you might want to also wrap it with a StrandedTransactionalRecord and hold on to that. StrandedTransactionalRecord is a record reference that is never invalidated, but queries the state of the current strand to see whether a transaction is in effect and whether it includes the record.

Please refer to the Javadocs for details.

Tuning SpaceBase for Performance

There are several properties, all set through SpaceBaseBuilder, that can affect SpaceBase’s performance. We’ve already discussed the execution modes, now let’s look at SpaceBaseBuilder’s other properties. The optimal settings depend on your hardware, data distribution and access patterns, so you should experiment until you find one to your liking. Note, however, that all properties have sensible defaults, so you should get excellent performance out of the box even without touching anything. However, for that extra oomph, we’ve got some knobs for you to play with.

Node Width

Each of SpaceBase’s internal nodes has up to a constant number of child nodes. The wider the node (the more children it has), the shorter the tree. With wider nodes there are less levels to traverse. However, each node you do traverse requires more work. The optimal node width mostly depends on your CPU’s cache-line size and memory performance characteristics. The node width can be set between 5 and 32 by calling:

SpaceBaseBuilder setNodeWidth(int width);

Node Compression

The index nodes may be compressed to allow more width in the same number of cache-lines. Compression, however, has some processing costs. For large node-widths (say, over 20) we recommend enabling node compression. Node compression can be set on or off with the method:

SpaceBaseBuilder setCompressed(boolean compress);

Floating-Point Precision

By default, SpaceBase stores AABB coordinates with double-precision (64 bits). If you’d like to conserve memory, you may instruct SpaceBase to use single-precision (32 bits) coordinates by passing false to:

SpaceBaseBuilder setSinglePrecision(boolean singlePrecision);

Optimistic or Pessimistic Locking

In order to facilitate concurrent update to the spatial index, SpaceBase has to lock its internal nodes. It may, however, make use of ‘‘optimistic locking’’ which avoids actual locking and improves performance in many circumstances, at the cost of possible repeated attempts at executing a single operation. The optimistic locking options are set with the method:

SpaceBaseBuilder setOptimisticLocking(int height, int retryLimit);

The height argument specifies above which optimistic locking is to be used. If it’s -1 then optimistic locking is used for all nodes. We recommend setting its value to 0 or 1.

retryLimit specifies the maximum number of times an operation is to be retried before reverting to pessimistic locking (for the execution of that operation only).

To disable optimistic locking, call this method:

SpaceBaseBuilder setPessimisticLocking();

Backpressure

When SpaceBase runs in parallel mode, methods called on the SpaceBase API merely initiate operations that run in a separate thread-pool. If you setCurrentThreadAsynchronous to true, those methods return immediately. This could result in the thread-pool being overwhelmed if operations are initiated in too-high a rate. To prevent that from happening, you should enable backpressure, which blocks the SpaceBase method if the thread-pool is too busy. This is done with this method of SpaceBaseBuilder:

SpaceBaseBuilder setQueueBackpressure(int queueBackPressure);

Experimental Settings

SpaceBaseBuilder has other properties you can set, but those are considered experimental and are deliberately left undocumented.

Monitoring

SpaceBase uses JMX for monitoring itself. It exposes several MBeans which you can monitor using VisualVM, jconosole, or any other JMX dashboard. The most interesting MBean is probably the one monitoring SpaceBase’s performance, and is named co.paralleluniverse:type=SpaceBase,name=<store-name>,monitor=performance, which in VisualVM would appear in the MBean tree as co.paralleluniverse/SpaceBase/<store-name>/performance. It has many properties, which are categorized in sub-MBeans for viewing convenience. All the values of the performance properties refer to the previous polling period. The period is 5 seconds long, by default, but can be changed through the PerformanceTimePeriod property of the co.paralleluniverse/SpaceBase/MonitoringServices MBean — Just enter a new time period in milliseconds.

Here we explain some of them:

Transaction Information

Transactions Total number of operations on the store (in the last polling period).
Inserts Total number of insertions.
Updates Total number of element updates.
Deletes Total number of deletions from the store.
Queries Total number of queries on the store.
AvgQueryResults Average number of results returned from queries.
AvgQueryVolume Average volume/area of bounded queries.

General

QueueLength The number of internal SpaceBase tasks currently awaiting execution. The smaller the better. (This value does not refer to the last polling period, but is an instantaneous measurement).
TotalNodeTouches Total number of SpaceBase nodes accessed.
PropagateTimeRatio Amount of time, expressed as a ratio of the last polling period, spent on some housekeeping tasks (the lower the better).
AsyncTimeRatio Amount of time, expressed as a ratio of the last polling period, spent on some housekeeping tasks (the lower the better).

Locking

Collisions Total number of lock collisions (the lower the better).
WaitTimeRatio Amount of time, expressed as a ratio of the last polling period, spent waiting for locks (the lower the better).
OptimisticLockingTransactions Number of SpaceBase operations that made use of optimistic locking.
OptimisticLockingNodes Total number of nodes accessed with optimistic locking.
OptimisticLockingRetries Number of operation retries due to optimistic locking (the lower the better).

Descends

AvgDescenXBifurcationLevel The lower the better.
AvgDescenXBifurcationLevel The lower the better.

Distributing SpaceBase

This version of SpaceBase is single-node only.

Thrift API - Ruby, Python, Node.js

Getting Started

System Requirements

You must have the JRE 7 (Java Runtime Environment) installed on the machine running SpaceBase.

Installation

To install SpaceBase, simply unpack the distribution archive.

To install the Thrift client libraries for Python, cd to the SpaceBase distribution directory, and run:

cd thrift/clients/python
sudo python setup.py install 

To install the Thrift client libraries for Ruby, run:

gem install thrift 

To install the Thrift client libraries for Node.js, cd to the SpaceBase distribution directory, and run:

npm install thrift/clients/nodejs

Running SpaceBase

SpaceBase for Thrift is a Java process running a Thrift server. To start the SpaceBase server, run the following command in the distribution’s root directory:

java -server  
	-Dco.paralleluniverse.spacebase.properties.file=config/spacebase_thrift.properties 
	-Dco.paralleluniverse.spacebase.license.file=<path to .lic file>
	-Djava.util.logging.config.file=config/jul.properties
	-jar spacebase-thrift.jar <sb name1> <sb port1> <sb name2> <sb port2>...

<sb name1> is the name of the SpaceBase store.

<sb port1> is the port Thrift client will use to connect to the store.

You can create several SpaceBase instances (stores) on the command line.

The -Dco.paralleluniverse.spacebase.properties.file is optional. Use it only if you want to change the default SpaceBase configuration.

Interacting with SpaceBase

SpaceBase Thrift clients use the Thrift library as well as SpaceBase’s Thrift generated API files to interact with the server.

Python

import sys
sys.path.append('<path to spacebase dir>/thrift/python')

from spacebase import SpaceBase
from spacebase.ttypes import *

from thrift import Thrift
from thrift.transport import TSocket
from thrift.transport import TTransport
from thrift.protocol import TBinaryProtocol
from thrift.server import TServer

try:
  transport = TSocket.TSocket('localhost', <port>) 
  transport = TTransport.TFramedTransport(transport) 
  protocol = TBinaryProtocol.TBinaryProtocol(transport) 
 
  spacebase = SpaceBase.Client(protocol) 
  transport.open() 
...

  transport.close()

except Thrift.TException, tx:
  print "%s" % (tx.message)

Ruby

require 'thrift'

$:.push('<path to spacebase dir>/thrift/ruby')
require 'space_base'
require 'spacebase_constants'

begin
  socket = Thrift::Socket.new('localhost', <port>) 
  transport = Thrift::FramedTransport.new(socket)
  protocol = Thrift::BinaryProtocol.new(transport)
  spacebase = SpaceBase::Client.new(protocol)

  transport.open()
...

  transport.close()

rescue Thrift::Exception => tx
  print 'Exception: ', tx.message, "\n"
end

Node

var thrift = require('thrift');//,
    //ttransport = require('thrift/transport');

var SpaceBase = require('<path to spacebase dir>/SpaceBase.js'),
    ttypes = require('<path to spacebase dir>/spacebase_types');

var connection = thrift.createConnection('localhost', <port>) 
// {transport: ttransport.TFramedTransport}
    spacebase = thrift.createClient(SpaceBase, connection);

connection.on('error', function(err) {
  console.error("connection:", err);
});

...
connection.end();

All SpaceBase operations are run by calling methods on the spacebase object.

AABBs

SpaceBase stores and retrieves object’s spatial bounds in the form of AABBs, represented by the AABB class. Here’s how to create an AABB:

Python

aabb = AABB (3.1, 57.2, -6, 700)

Ruby

aabb = AABB.new({'minX' => 3.1, 'maxX' => 57.2, 'minY' => -6, 'maxY' => 700})

Node

var aabb = new AABB({minX: 3.1, maxX: 57.2, minY: -6, maxY: 700});

Element and Bounds

Some operations return an object or a list of objects of type ElementAndBounds. Those have three fields:

  • id — the element’s id;
  • aabb — the element’s bounds;
  • data— the data associated with the element in the form of a hash/dictionary/map.

General Operations

Retrieve the name of the store:

Python/Ruby

getName()

Node

getName(function(err, name) {})

Get the total number of items in the store:

Python/Ruby

size()

Node

size(function(err, size) {})

Inserting, Modifying and Deleting Elements

To insert an element into the store, use the method:

Python/Ruby

id = insert(aabb, data, ttl)

Node

insert(aabb, data, ttl, function(err, id) {})

Where aabb is the element’s bounds and data is a hash/dictionary/map containing the element’s key-value pairs with string keys and string values.

ttl is the duration in milliseconds from now that the element will stay in the store. When the ttl expires, the element is automatically removed from the store. If ttl is negative, the element will stay in the store until it is explicitly removed.

The method returns the new element’s ID.

To update an element, use:

Python/Ruby

update(id, newAABB, newData, ttl)

Node

update(id, newAABB, newData, ttl, function(err) {})

The update method has no return value.

Please note that if an element with the given ID is not founds in the store, the operation will fail silently. This is so the application doesn’t have to explicitly deal with concurrent deletes.

update actually replaces the element, i.e., it changes its bounding-box, its expiration and all of its key-value pairs, which are replaced by those provided to the method. If you only wish to change some of those values, you should use the updatek method, that has the same signature:

Python/Ruby

updatek(id, newAABB, newData, ttl)

Node

updatek(id, newAABB, newData, ttl, function(err) {})

The only required argument to updatek is the id. You can pass null/nil/Nothing for newAABB and newData, or 0 for ttl, if you do not wish to change their value. Also, if newData contains key-value pairs, only they will be modified or added to the element. Note that you cannot remove key-value pairs using the updatek method.

Finally, to remove an element from the store:

Python/Ruby

remove(id)

Node

remove(id, function(err) {})

Here, too, if an element with the given ID is not founds in the store, the operation will fail.

Performance Note: Insertion is SpaceBase’s costliest operation. It is much, much faster to update an item than to delete and re-insert it.

Queries

SpaceBase spatial queries are run with the query method. It has the following form:

Python

results = query(SimpleQuery(...))

Ruby

results = query(SimpleQuery.new(...))

Node

query(new SimpleQuery(...), function(err, results) {})

where the SimpleQuery type defines the query type and additional parameter.

Queries return sets of ElementAndBounds.

You can also request to retrieve only partial elements with the queryk method:

Python

results = queryk(SimpleQuery(...), ['key1', 'key5', ...]))

Ruby

results = queryk(SimpleQuery.new(...), ['key1', 'key5', ...]))

Node

queryk(new SimpleQuery(...), ["key1", "key5", ...]), function(err, results) {})

The resulting ElementAndBounds set will contain only the elements’ requested keys and their respective values.

The following table lists SpaceBase’s built-in spatial queries:

Query Description Form
ALL Finds all elements in the store. Python: SimpleQuery(SimpleQueryType.ALL)
    Ruby: SimpleQuery.new({'type'=>SimpleQueryType::ALL})
    Node: new SimpleQuery({type: ttypes.SimpleQueryType.ALL})
EQUALS Finds all elements whose AABBs are equal to a given AABB. Python: SimpleQuery(SimpleQueryType.EQUALS, aabb)
    Ruby: SimpleQuery.new({'type'=>SimpleQueryType::EQUALS,'aabb'=>aabb})
    Node: new SimpleQuery({type: ttypes.SimpleQueryType.EQUALS,aabb: aabb})
XSECT Finds all elements whose AABBs intersect the given AABB (or are entirely contained in it). Python: SimpleQuery(SimpleQueryType.XSECT, aabb)
    Ruby: SimpleQuery.new({'type'=>SimpleQueryType::XSECT,'aabb'=>aabb})
    Node: new SimpleQuery({type: ttypes.SimpleQueryType.XSECT aabb: aabb})
CONTAINED Finds all elements whose AABBs are contained within a given AABB. Python: SimpleQuery(SimpleQueryType.CONTAINED, aabb)
    Ruby: SimpleQuery.new({'type'=>SimpleQueryType::CONTAINED,'aabb'=>aabb})
    Node: new SimpleQuery({type: ttypes.SimpleQueryType.CONTAINED, aabb: aabb})
RANGE Finds all elements whose AABBs lie, completely or partially, within a given distance (double precision floating point) from a given AABB. Python: SimpleQuery(SimpleQueryType.RANGE, aabb, param=distance)
    Ruby: SimpleQuery.new({'type'=>SimpleQueryType::RANGE,'aabb'=>aabb,'param'=>distance})
    Node: new SimpleQuery({type: ttypes.SimpleQueryType.RANGE, aabb: aabb, param: distance})
CONTAINS Finds all elements whose bounding boxes contain a given AABB. Python: SimpleQuery(SimpleQueryType.CONTAINS, aabb)
    Ruby: SimpleQuery.new({'type'=>SimpleQueryType::CONTAINS,'aabb'=>aabb})
    Node: new SimpleQuery({type: ttypes.SimpleQueryType.CONTAINS,aabb: aabb})

Spatial Joins

A spatial join is a query that operates not on each spatial element individually but on pairs of elements. It is useful in finding, say, all pairs of elements that intersect one another, or all pairs that are within a certain distance from each other.

SpaceBase spatial joins are run with the join method. It has the following form:

Python

results = join(Join(...))

Ruby

results = join(Join.new(...))

Node

join(new Join(...), function(err, results) {})

where Join is a tuple or an atom defining the join type and additional parameter. Join results are sets of ElementAndBoundsPair. ElementAndBoundsPair simply contains two ElementAndBounds objects in two fields named first and second . Just as in queries, joins allow retrieving partial elements using joink:

Python

results = joink(Join(...), ['key1', 'key5', ...])

Ruby

results = joink(Join.new(...), ['key1', 'key5', ...])

Node

joink(new Join(...), ["key1", "key5", ..], function(err, results) {})

The following table lists SpaceBase’s built-in spatial joins:

Join Description Form
XSECT Finds all elements whose AABBs intersect. Python: Join(JoinType.XSECT)
    Ruby: Join.new({'type'=>JoinType::XSECT})
    Node: new Join({type: ttypes.JoinType.XSECT})
DISTANCE Finds all pairs of elements whose AABBs are less than or equal to a given distance (double precision floating point) apart. Python: Join(JoinType.DISTANCE, distance)
    Ruby: Join.new({'type'=>JoinType::DISTANCE,'param'=>distance})
    Node: new Join({type: ttypes.JoinType.DISTANCE, param: distance})

Geo-Spatial Queries and Joins

SpaceBase has a few built-in queries and joins designed to support objects positioned geographically. These — unlike all other built-in queries — make strict requirements on how coordinates are specified.

If you wish to make use of the geo-spatial queries provided with SpaceBase, you must adhere by the following rules:

  • The first coordinate is the latitude, given in degrees, and must have a value in the range [-90.090.0].
  • The second coordinate is the longitude, given in degrees, and must have a value in the range [-180.0180.0].
  • If you wish the queries to take altitude into account, you can use 3D AABBs, in which case the third coordinate is the altitude, and it must be given in meters.

In short, the coordinates must be given as (lat, lng) or (lat, lng, alt) with lat and lng being in degrees, and alt in meters. For AABBs, of course, the first pair is the minimum and maximum latitude, etc.

Query Description Form
     
GEORANGE Finds all elements whose geographic AABBs lie, completely or partially, within a given distance, given in meters (double precision floating point) from a given geographic AABB. Python: SimpleQuery(SimpleQueryType.GEORANGE, aabb, param=distance)
    Ruby: SimpleQuery.new({'type'=>SimpleQueryType::GEORANGE, 'aabb'=>aabb, 'param'=>distance})
    Node: new SimpleQuery({type: ttypes.SimpleQueryType.GEORANGE, aabb: aabb, param: distance})
     
     
Join Description Form
     
GEODISTANCE Finds all pairs of elements whose geographic AABBs are less than or equal to a given distance (double precision floating point) in meters apart. Python: Join(JoinType.GEODISTANCE, distance)
    Ruby: Join.new({'type'=>JoinType::GEODISTANCE,'param'=>distance})
    Node: new Join({type: ttypes.JoinType.GEODISTANCE, param: distance})

Configuring SpaceBase

The SpaceBase instances are configured in a .properties file, optionally specified by:

-Dco.paralleluniverse.spacebase.properties.file=<path to properties file>

on the command line. If the file is not specified, default values will be used for all properties.

A file with default configuration is provided with the SpaceBase distribution, and is found in the config directory. The same configuration applies to all SpaceBase stores created by the server.

There are two important properties you can set in the .properties file:

Property Description Default
spacebase.server.protocol The Thrift wire protocol. Can be binary, compact or json. binary
spacebase.server.nonblocking Whether or not to use a nonblocking Java server. If set to true (the default), clients must use the FramedTransport transport; otherwise they must use TBufferedTransport true
spacebase.server.minWorkers Minimum number of worker threads serving requests. 5
spacebase.server.maxWorkersWorkers Maximum number of worker threads serving requests. 50000
spacebase.server.selectorThreads Number of threads listening on the port and transferring requests to the worker threads [relevant only if nonblocking is set to true (the default)] 5
spacebase.datum The name of the geodesic datum to use for the geospatial queries and joins. WGS84
spacebase.datum.a Rather than specifying a named datum, you can define the datum parameters. a is the datum’s large radius, in meters.  
spacebase.datum.f If spacebase.datum.a is specified, f must be, too. f is the datum’s oblaten ess factor.  

There are other properties you can set in the file that tune SpaceBase’s performance. They are explained in the next section.

Tuning SpaceBase for Performance

There are several properties, all set in the .properties file, that can affect SpaceBase’s performance. The optimal settings depend on your hardware, data distribution and access patterns, so you should experiment until you find one to your liking. Note, however, that all properties have sensible defaults, so you should get excellent performance out of the box even without touching anything. However, for that extra oomph, we’ve got some knobs for you to play with.

Multi-Threaded Execution Modes

SpaceBase is a concurrent data-store, and as such it supports many threads operating on a single store simultaneously. It has two different execution modes that make use of threads differently. Which one is best for you depends on your specific application design and your hardware.

The first execution mode is the concurrent mode. In this mode, every operation is executed in its entirety on the thread that started it. In the Thrift case, that will be the worker thread handling the request.

The second execution mode is the parallel mode, in which every operation may be further parallelized (using Java’s Fork/Join framework), and a single operation will execute on several different threads in parallel, none of which is the thread used to start the operation.

To turn on parallel mode, include this line in the .properties file:

spacebase.parallel = true  #default: false

By default, SpaceBase works in concurrent mode (spacebase.parallel is false).

Another property,

spacebase.parallelism  #default: 2 if parallel is false, num of processors if true

is the number of total threads to use if in the parallel mode (defaults to the number of available processors), or the number of background bookkeeping threads to use when in the concurrent mode (defaults to 2).

Node Width

Each of SpaceBase’s internal nodes has up to a constant number of child nodes. The wider the node (the more children it has), the shorter the tree. With wider nodes there are less levels to traverse. However, each node you do traverse requires more work. The optimal node width mostly depends on your CPU’s cache-line size and memory performance characteristics. The node width can be set between 5 and 32 with the property:

spacebase.node-width  #default: 10

Node Compression

The index nodes may be compressed to allow more width in the same number of cache-lines. Compression, however, has some processing costs. For large node-widths (say, over 20) we recommend enabling node compression. Node compression can be set on or off with the property:

spacebase.compressed  #default: false

Floating-Point Precision

By default, SpaceBase stores AABB coordinates with double-precision (64 bits). If you’d like to conserve memory, you may instruct SpaceBase to use single-precision (32 bits) coordinates by setting to true the property:

spacebase.single-precision  #default: false

Optimistic or Pessimistic Locking

In order to facilitate concurrent update to the spatial index, SpaceBase has to lock its internal nodes. It may, however, make use of ‘‘optimistic locking’’ which avoids actual locking and improves performance in many circumstances, at the cost of possible repeated attempts at executing a single operation. Optimistic locking is turned on by default, and can be set off with the property:

spacebase.optimistic  #default: true

If using optimistic locking, two more properties affect its performance:

spacebase.optimistic-height  #default: 1

This property specifies above which tree-livel optimistic locking is to be used. If it’s -1 then optimistic locking is used for all nodes. We recommend setting its value to 0 or 1.

spacebase.optimistic-retry-limit  #default: 3

This property specifies the maximum number of times an operation is to be retried before reverting to pessimistic locking (for the execution of that operation only).

Monitoring

SpaceBase uses JMX for monitoring itself. It exposes several MBeans which you can monitor using VisualVM, jconosole, or any other JMX dashboard. The most interesting MBean is probably the one monitoring SpaceBase’s performance, and is named co.paralleluniverse:type=SpaceBase,name=<store-name>,monitor=performance, which in VisualVM would appear in the MBean tree as co.paralleluniverse/SpaceBase/<store-name>/performance. It has many properties, which are categorized in sub-MBeans for viewing convenience. All the values of the performance properties refer to the previous polling period. The period is 5 seconds long, by default, but can be changed through the PerformanceTimePeriod property of the co.paralleluniverse/SpaceBase/MonitoringServices MBean — Just enter a new time period in milliseconds.

Here we explain some of them:

Transaction Information

Transactions Total number of operations on the store (in the last polling period).
Inserts Total number of insertions.
Updates Total number of element updates.
Deletes Total number of deletions from the store.
Queries Total number of queries on the store.
AvgQueryResults Average number of results returned from queries.
AvgQueryVolume Average volume/area of bounded queries.

General

QueueLength The number of internal SpaceBase tasks currently awaiting execution. The smaller the better. (This value does not refer to the last polling period, but is an instantaneous measurement).
TotalNodeTouches Total number of SpaceBase nodes accessed.
PropagateTimeRatio Amount of time, expressed as a ratio of the last polling period, spent on some housekeeping tasks (the lower the better).
AsyncTimeRatio Amount of time, expressed as a ratio of the last polling period, spent on some housekeeping tasks (the lower the better).

Locking

Collisions Total number of lock collisions (the lower the better).
WaitTimeRatio Amount of time, expressed as a ratio of the last polling period, spent waiting for locks (the lower the better).
OptimisticLockingTransactions Number of SpaceBase operations that made use of optimistic locking.
OptimisticLockingNodes Total number of nodes accessed with optimistic locking.
OptimisticLockingRetries Number of operation retries due to optimistic locking (the lower the better).

Descends

AvgDescenXBifurcationLevel The lower the better.
AvgDescenXBifurcationLevel The lower the better.

Distributing SpaceBase

This version of SpaceBase is single-node only.

C++ API

Getting Started

Installation

Before you use SpaceBase, make sure you have the JRE (Java Runtime Environment) version 6 and up, installed, and the JAVA_HOME environment variable pointing to the JRE installation path. You can download the Oracle JRE from http://java.oracle.com. It is preferable, however to install the full JDK (Java Development Kit) as it contains, beside the JRE, other tools which may be useful for monitoring and troubleshooting.

SpaceBase is a Java library with a C++ API, that runs as part of your C++ application by means of the JNI (Java Native Interface) Invocation API. This means that a JVM (Java Virtual Machine) instance runs embedded in your application.

The JVM is contained in a shared library called jvm.so or jvm.dll depending on your platform, which is bundled with the JRE installation and must be included in the shared library path of your application.

Performance Note: The Oracle HotSpot™ JVM comes in two versions — client and server — that are found in the bin/client or bin/server directory of the JRE respectively. Make sure that when running SpaceBase you use the server JVM shared library, as it has far superior performance. On the Windows platform, the regular HotSpot JRE installation contains only the client JVM, so you should install the full JDK (Java Development Kit), which contains both JVMs.

In addition, SpaceBases shared library, named spacebasec.so or spacebasec.dll depending on your platform, must be included in your library path and linked with. The SpaceBase native libraries are found in the lib directory of the SpaceBase distribution,

Running SpaceBase

Before compiling your SpaceBase application, make sure to include the include subdirectory of the SpaceBase distribution in your C++ include path.

To use SpaceBase, you must first initialize the JVM and the SpaceBase library somewhere on your application’s main thread. When your application finishes, you must destroy the embedded JVM. You do it all like so:

#include "JVM.h"
#include "SpaceBase.h"

int main()
{
    ...

    void *jvm = create_JVM("<full-path>", 100); // create the JVM
    init_spacebase(jvm); // initialize SpaceBase

    ... // main program loop / spawn more threads.

    destroy_JVM(jvm); // destroy the JVM
}

JVM.h and SpaceBase.h are two header files in the SpaceBase distribution. SpaceBase.h is the SpaceBase API’s main header file, while JVM.h contains only the declarations for create_JVM and destroy_JVM.

create_JVM, as you may have guessed, creates the JVM. You pass it two parameters. The first is the full path to the directory containing SpaceBase’s .jar files — the lib directory of the SpaceBase distribution. The other, is the size, in megabytes, of heap memory you’d wish to allot the JVM. create_JVM may also take an optional variable-length list of string (const char*) arguments to be passed on to the JVM; refer to the reference documentation for details.

destroy_JVM, well, destroys the JVM.

If, for some reason, you already have a JVM embedded in your application, you don’t need to include JVM.h. Instead you create and destroy the JVM using the regular JNI functions, like so:

#include <jni.h>
#include "SpaceBase.h"

int main()
{
    ...

    JavaVMInitArgs vm_args;
    // ... set vm_args
    JavaVM* jvm;
    JNIEnv* env;
    int res = JNI_CreateJavaVM(&jvm, (void**)&env, &vm_args); // create the JVM
    init_spacebase(jvm); // initialize SpaceBase

    ... // main program loop / spawn more threads.

    jvm->DestroyJavaVM(); // destroy the JVM
}

Just make sure, when setting the JVM arguments (through the JavaVMInitArgs struct), to add SpaceBase’s jar files — spacebase.jar, spacebasec.jar, jsr166y.jar — to the classpath.

Performance Note: Do not enable Java assertions for any of the SpaceBase classes. Enabling assertions for SpaceBase may considerably hurt performance.

Configuring and Creating a SpaceBase Store

The SpaceBase API is declared in the SpaceBase.h header file. All operations on a SpaceBase store are done by calling methods on the SpaceBase class. To create a new SpaceBase store, we use the class SpaceBaseBuilder.

We first create a new instance of SpaceBaseBuilder:

SpaceBaseBuilder builder();

To create the store, we use the BUILD_SPACEBASE macro, and pass it the number of spatial dimensions we wish to use, the class type of the objects we wish to store, and a non-null, non-empty, unique name, that will be used for monitoring:

SpaceBase<2, MySpatialObject>* base =     
    BUILD_SPACEBASE(builder, 2, MySpatialObject, "space-1");

That’s it.

The same builder can be used again and again to create new store instances, but the builder itself is not thread-safe and should be used from a single thread.

When you’re done using the SpaceBase instance, do not delete the pointer to the instance, rather, call dispose:

base->dispose();

While that is all that is required, you would probably want to also set at-least the execution mode, as explained in the next section.

Multi-Threaded Execution Modes

SpaceBase is a concurrent data-store, and as such it supports many threads operating on a single store simultaneously. It has two different execution modes that make use of threads differently. Which one is best for you depends on your specific application design and your hardware.

The first execution mode is the concurrent mode. In this mode, every operation is executed in its entirety on the thread that started it. If you call a query method with a visitor, the visitor’s method will be called on the same thread in that originally called the query method.

he second execution mode is the parallel mode, in which every operation may be further parallelized (using Java’s Fork/Join framework), and a single operation will execute on several different threads in parallel, none of which is the thread used to start the operation. In this mode, your visitor may be called, concurrently, on several different threads. If, against all good judgment, you have decided to collect the element passed to the visitor in a collection, take this fact into account, and make sure the collection supports concurrent insertion.

To set SpaceBase to run in the concurrent mode, use the builder’s set_concurrent_mode method, before calling the build method. Even in the concurrent mode, some housekeeping operations are done asynchronously on other threads. The method takes an int parameter specifying the number of threads to use in the housekeeping thread-pool:

builder.set_concurrent_mode(2);

The parallel mode is set with the set_parallel_mode method, which takes, as an argument, the number of threads in the fork-join pool:

builder.set_parallel_mode(10);

Sync and Asynchronous Mode

In the parallel execution mode a SpaceBase operation may run in many threads in parallel. Since, as explained, no query or join or transaction ever return a result back to the caller but rather call a provided visitor, why should the caller wait for the operation to complete?

Well, there may be a reason to wait, if the caller wants to make sure that its modifications to the store are visible to a query it wants to issue. In the parallel mode, you could insert an element, and then immediately issue a query that will start before the insert operation has completed, and that will fail to find the newly inserted element. This caller-thread visibility may be important in some circumstances and not in others.

For that reason, all SpaceBase operations return a Sync object (even those that return a spatial-token, as a token is also a Sync). It has a join method, that, when called, blocks the current thread, until the operation that returned the Sync has completed, after which, all modifications to the store made by that operation are visible to the joining thread.

In addition, SpaceBase provides a method to wait for all operations issued from the current thread to complete.

Note that even if the SpaceBase is configured to run in parallel mode, its default is to automatically join all operations, that is, when a method returns, its operation has completed. If you wish not to wait, or only wait for some operations, you must call method setCurrentThreadAsynchronous, which affects the current thread only: from that moment on, all SpaceBase methods issued by this thread will return immediately, and you can wait for them manually, if you like, using the Sync’s join method.

If you’re using SpaceBase in concurrent mode, or if you have not called setCurrentThreadAsynchronous you may ignore the returned Syncs as they would be completely useless.

One last thing about Syncs: if a visitor called by an operation throws an exception, and if you’re using SpaceBase in parallel mode and are using manual joins, the exception is thrown when the Sync for the failed operation is joined.

Other Settings

The builder has other options that are used to set the store’s execution mode as well as other tuning parameters that affect performance, but it comes with sensible defaults. The more advanced tuning parameters are discussed in the chapter Tuning SpaceBase for Performance.

Inserting, Modifying and Deleting Spatial Elements

Once we have a SpaceBase instance, we can start using the store. In this chapter we offer simple usage examples. Please refer to the Javadoc for a more detailed reference.

Assume we have a class called MySpatialObject that represents spatial objects in our application, and an instance, spatialObj, of that class. We insert spatialObj into the store:

SpatialToken token = base->insert(&spatial_obj, AABB<2>(8.0, 15.0, 7.5, 19.7));

In a real application, the spatial object would probably maintain its AABB itself, in which case it might have a method called getAABB(), and we’d have something like:

SpatialToken token = base->insert(&spatial_obj, spatial_obj.get_AABB());

We said it would be a good idea to keep the token in the spatial object, so we might wish to give it a setToken and getToken methods, and so the next line would be something like:

spatialObj.setToken(token);

If we’d then like to update the object’s location, we would write:

base->update(spatial_obj.get_token(), AABB<2>(-100.0, -2.5, 7.5, 7.5));

Or, more likely,

base->update(spatial_obj.get_token(), spatial_obj.get_AABB());

Finally, to remove the object from the store, we call:

base->del(spatial_obj.getToken());

Because SpaceBase was built for concurrency, the update and delete methods don’t throw exceptions if the object represented by the taken had already been removed from the store, because it may have been removed concurrently from a different thread. So, in order to save us unnecessary tests, in such a case those operations are simply ignored and the methods return normally.

Performance Note: Insertion is SpaceBase’s costliest operation, and it may entail even further, asynchronous housekeeping. So when populating the store (i.e. when executing a large number of insert operations) in the parallel execution mode, do not set the thread for asynchronous operations, or, alternatively, join the Sync returned from each insert operation before issuing the next. This is only recommended when many inserts are done consecutively. There is no need to join if the inserts are spaced-out.

Queries

Using the Built-In Queries

SpaceBase comes with a few useful built-in spatial queries. They are created by the static methods of the SpatialQueries class, and are listed in the next section. Here we’ll provide an example:

#include "SpaceBaseQueries.h"

class MyVisitor : public SpatialVisitor<MySpatialObject>
{
public:
    void visit(MySpatialObject& elem) 
    {
        cout << elem << " is in the region!" << endl;
    }
} visitor;
base->query(*SpatialQueries::contained(AABB<2>(-100.0,100.0,0.0,1000.0)), visitor);

The SpatialQueries class is declared in the SpaceBaseQueries.h header file. The visitor is called for each element matching the query. Of course, in the visitor you may examine the element’s geometry further (not just its bounding-box as the query does), and decide if it’s of use to you.

SpaceBase’s Built-In Queries

The following table lists SpaceBase’s built-in spatial queries. The first column lists the queries obtained using SpatialQueries static methods, while the second lists the classes implementing those queries.

The queries consider only the elements’ AABBs, but by extending the classes and overriding their implementation of query_element (as detailed in the next section), you may refine the queries to take into account more specific properties of the element other than its bounding-box.

SpatialQueries method Query class Description
all() AllQuery Finds all elements in the store.
equals(const AABB<D>&) EqualsQuery Finds all elements whose AABBs are equal to a given AABB.
intersect(const AABB<D>&) IntersectQuery Finds all elements whose AABBs intersect a given AABB (or are entirely contained in it).
contained(const AABB<D>&) ContainedQuery Finds all elements whose AABBs are contained within a given AABB.
range(const AABB<D>&,double) RangeQuery Finds all elements whose AABBs lie, completely or partially, within a given range from a given AABB.
contain(const AABB<D>&) ContainsQuery Finds all elements whose bounding boxes contain a given AABB.

For further details please consult the doxygen HTML reference documentation.

Composing Queries

You can compose queries together using SpatialQueries’s not, and, and or methods, which provide query negation, conjunction and disjunction, respectively. The binary operations and and or make use of short-circuit logic, so the order of the two queries passed as arguments may have a performance impact — in the case of the and operation, the less inclusive query should come first, and in the case of the or operation, the more inclusive query should come first.

See the doxygen HTML reference documentation for details.

Writing Your Own Queries

You may want to write your own queries that check for containment or intersection with shapes more interesting than axis-aligned boxes (like polygon containment, line intersection, etc.). In order to do that, you should understand how SpaceBase runs queries.

SpaceBase’s internal index is a tree of nodes, each holding the AABB of all the nodes below it. When a query runs, it decides for each node whether or not there may be matching elements in the child nodes. The way it does this is by examining the node’s AABB and deciding whether there could be elements that are fully contained in that AABB that may satisfy the query. This way SpaceBase decides whether to descend further to lower nodes in the tree, and if so, to which.

To write your own spatial-query, you need to extends the SpatialQuery<D,T> template class which declares two methods:

SpatialQueryResult query_container(const AABB<D>& aabb);
bool query_element(const AABB<D>& aabb, const T& elem);

The second, queryElement is simple. It receives an element’s AABB, and decides whether it satisfies the query. If it returns true, the element will be passed on to the visitor. If in queryElement you take into account properties of the element other than its bounding box, you can refine the query to more complex geometries than simple AABBs.

The first, queryContainer, receives an index-node’s bounding-box, and determines if there could be elements fully contained in those bounds that may satisfy the query. It returns an enum with three options:

NONE, if no elements within the bounding box could possibly match the query,

SOME, if there could be elements within the bounding box that would match the query,

or ALL, if all elements within the bounds would definitely match the query.

Of course, you could always return SOME and let the queryElement examine each element for you, but that would require SpaceBase to call the method for all elements in the store. By now you should understand that it is the NONE result that makes SpaceBase queries so fast, as it stops the descent from certain nodes down the tree. The ALL result might be beneficial if the queryContainer has a slow computation, as it saves SpaceBase the trouble of calling the query if a positive result is assured.

You can also extend the built-in query classes that are defined in SpaceBaseQueries.h and override their implementation of query_element to take into account more specific properties of the element other than its bounding-box. However, note that overriding the built-in queries’ query_container method not only does not make sense, but will have no effect.

Spatial Joins

A spatial join is a query that operates not on each spatial element individually but on pairs of elements. It is useful in finding, say, all pairs of elements that intersect one another, or all pairs that are within a certain distance from each other.

Using the Pre-Built Spatial Joins

SpaceBase comes with a couple of useful built-in spatial joins. They, too, are created by the static methods of the SpatialQueries class, and are listed in the next section. Here’s an example:

class MyJoinVisitor : public SpatialJoinVisitor<MySpatialObject, MySpatialObject>
{
public:
    void visit(MySpatialObject& elem1, MySpatialObject& elem2) 
    {
        cout << elem1 << " and " << elem2 << " intersect!" << endl;
    }
} join_visitor;
base->join(*SpatialQueries::intersect<2>(), join_visitor);

You may also limit the area in which the join is performed, like so:

class MyJoinVisitor : public SpatialJoinVisitor<MySpatialObject, MySpatialObject>
{
public:
    void visit(MySpatialObject& elem1, MySpatialObject& elem2) 
    {
        cout << elem1 << " and " << elem2 << " are close!" << endl;
    }
} join_visitor;
base->join(*SpatialQueries::contained(AABB<2>(-100.0, 100.0, 0.0, 1000.0)), 
*SpatialQueries::distance<2>(10),
join_visitor);

Note the use of the type ‘Any’, a placeholder type which does nothing, but indicates that the region query for spatial joins does not consider the elements themselves — only their AABBs.

It is also possible to join elements from two different SpaceBase stores. In that case, the spatial-join examines pairs of elements where the first element in the pair comes from the first store, and the second comes from the other:

class MyJoinVisitor : public SpatialJoinVisitor<MySpatialObject, OtherSpatialType>
{
public:
    void visit(MySpatialObject& elem1, OtherSpatialType & elem2) 
    {
        cout << elem1 << " and " << elem2 << " intersect!" << endl;
    }
} join_visitor;
base->join(base2, *SpatialQueries::intersect<2>(), join_visitor);

When joining two stores it is also possible to limit the join to a region of space.

SpaceBase’s Built-In Spatial Joins

The following table lists SpaceBase’s built-in spatial joins. The first column lists the join-queries obtained using SpatialQueries’s static methods or fields, while the second lists the classes implementing those join-queries.

The join-queries consider only the elements’ AABBs, but by extending the classes and overriding their implementation of join_elements (as detailed in the next section), you may refine the queries to take into account more specific properties of the element other than its bounding-box.

SpatialQueries method Spatial join class Description
     
all_pairs() AllJoin Finds all pairs of elements in the store(s).
intersect() IntersectJoin Finds all pairs of elements whose AABBs intersect.
distance(double) DistanceJoin Finds all pairs of elements whose AABBs are less than or equal to a given distance apart.

For further details please consult the doxygen HTML reference documentation.

Writing Your Own Spatial Joins

You may want to write your own spatial joins that check for containment or intersection with shapes more interesting than axis-aligned boxes. To do that, you extend the SpatialJoinQuery<D,T1,T2> template class, which has the following methods:

bool join(const AABB<D>& aabb1, const AABB<D>& aabb2);
bool join_elements(const AABB<D>& aabb1, const T2& elem1, 
    const AABB<D>& aabb2, const T2& elem2);

The join method is analogous to queryContainer in the SpatialQuery interface. It is passed two AABBs from the store’s internal hierarchy. Each is the bounding box of a group of elements. If an element contained in the first box and an element contained in the second may possibly satisfy the join, the method should return true. If no element contained in the first AABB can possibly be joined with an element in the second, the method should return false.

The joinElements method receives two elements and their respective AABBs. If the two elements satisfy the join — return true. Otherwise, return false. By having joinElements take into account properties of the elements other than their bounding box, you can refine the join to more complex geometries than simple AABBs.

You can also extend the built-in spatial join classes that are defined in SpaceBaseQueries.h and override their implementation of join_elements to take into account more specific properties of the elements other than their bounding-boxes. However, note that overriding the built-in joines’ join method not only does not make sense, but will also have no effect.

Complex Operations and Transactions

As defined earlier, a complex operation is one composed of many simple operations like insert, update or delete. Transactions are complex operations that are executed atomically. There are several kinds of complex-operations and transactions you can run with SpaceBase. As before, here we’ll provide some examples, and you should refer to the doxygen HTML reference documentation for further details.

The first kind of transaction, simply inserts, deletes and/or updates several elements atomically. It is used like so (assume sobj1, sobj2, sobj3 are global instances of MySpatialObject):

class MySpatialTransaction : public SpatialTransaction<2, MySpatialObject>
{
public:
    void run(StoreUpdater<2, MySpatialObject>& updater) 
    {
        sobj1.set_token(updater.insert(&sobj1, sobj1.get_AABB());
        // updater.delete(sobj1.getToken); - illegal
        updater.update(sobj2.get_token(), sobj2.get_AABB());
        updater.delete(sobj3.get_token);
    }
} txn;
base->transaction(txn);

There are a few things you should note here:

  1. Notice how the StoreUpdater object is used to update the store. Calling methods on base from within a transaction would result in an exception.
  2. You may not update or delete an element that was inserted within the same transaction, nor are you allowed to update the same element more than once within a transaction. Doing so a. makes no sense, and b. may cause unexpected behavior.
  3. If the store is configured to run in parallel execution mode, the transaction will run in a thread other than the one calling the transaction method. However, you should not be concerned with concurrency issues when your get_AABB and set_Token methods are called (as long as you’re not trying to insert sobj1 from two different threads). SpaceBase makes it just work.

Query-based Transactions and Complex Operations

The second kind of transaction allows updating elements that are found by a query. This transaction atomically deletes all elements contained in a given box:

class MyModifyingVisitor : public SpatialModifyingVisitor<2, MySpatialObject>
{
public:
    void visit(MySpatialObject& elem, ElementUpdater<2>& eu) 
    {
        updater.delete();
    }
} modifying_visitor;
base->transaction(*SpatialQueries::contained(AABB<2>(-100.0, 100.0, 0.0, 1000.0)),
      modifying_visitor);

A non-atomic version of complex operation that does the same thing is called almost identically, except with a query_for_update method instead of transaction:

base->query_for_update(
      *SpatialQueries::contained(AABB<2>(-100.0, 100.0, 0.0, 1000.0)),
      modifying_visitor);

Again, the visitor must use the updater provided, and never call any of base’s methods.

Another transaction method provides the visitor with a set all of the query’s results in one call.

There is, however, one restriction that applies to the transactional operations (using the transaction methods) and not to the non-transaction versions (the query methods): in a transaction, you may not update an element’s bounds so they no longer match the given query. In other words, in a query-based transaction you may not move the element outside the region defined by the query. If a SpatialModifyingVisitor is defined like so,

class MyModifyingVisitor : public SpatialModifyingVisitor<2, MySpatialObject>
{
public:
    void visit(MySpatialObject& elem, ElementUpdater<2>& eu) 
    {
        updater.update(AABB<2>(-200.0, 0.0, 1.0, 2.0));
    }
} mover;

then the following would throw a Java exception and crash the program:

base->transaction(*SpatialQueries::contained(AABB<2>(-100.0, 100.0, 0.0, 1000.0),
      mover); // throws an exception

The new bounds are not contained in the query region. Doing it in a non-transactional operation, however, is fine:

base->query_for_update(
*SpatialQueries::contained(AABB<2>(-100.0, 100.0, 0.0, 1000.0),
    mover); // OK

In addition, the query passed to the transaction method must extend the template class BoundedSpatialQuery<D,T>, which specifies a query whose matching elements reside in some bounded region of space (i.e. can all be contained in some AABB). BoundedSpatialQuery declares a method

bool is_contained_in(const AABB<D>&);

which returns ‘true’ if all of the query’s matching elements are contained in the given AABB.

Naturally, some queries do not fulfill this requirement and so do not implement the interface. For example, SpatialQueries::intersect is not a bounded query because all elements that intersect a given AABB are not necessarily contained in a bounded region of space.

However, note that you may compose a non-bounded query with a bounded one using the SpatialQueries::and method, to obtain a bounded query. 

Tuning SpaceBase for Performance

There are several properties, all set through SpaceBaseBuilder, that can affect SpaceBase’s performance. We’ve already discussed the execution modes, now let’s look at SpaceBaseBuilder’s other properties. The optimal settings depend on your hardware, data distribution and access patterns, so you should experiment until you find one to your liking. Note, however, that all properties have sensible defaults, so you should get excellent performance out of the box even without touching anything. However, for that extra oomph, we’ve got some knobs for you to play with.

Node Width

Each of SpaceBase’s internal nodes has up to a constant number of child nodes. The wider the node (the more children it has), the shorter the tree. With wider nodes there are less levels to traverse. However, each node you do traverse requires more work. The optimal node width mostly depends on your CPU’s cache-line size and memory performance characteristics. The node width can be set between 5 and 32 by calling:

SpaceBaseBuilder& set_node_width(int width);

Node Compression

The index nodes may be compressed to allow more width in the same number of cache-lines. Compression, however, has some processing costs. For large node-widths (say, over 20) we recommend enabling node compression. Node compression can be set on or off with the method:

SpaceBaseBuilder& set_compressed(bool compress);

Floating-Point Precision

By default, SpaceBase stores AABB coordinates with double-precision (64 bits). If you’d like to conserve memory, you may instruct SpaceBase to use single-precision (32 bits) coordinates by passing false to:

SpaceBaseBuilder& set_single_precision(bool single_precision);

Optimistic or Pessimistic Locking

In order to facilitate concurrent update to the spatial index, SpaceBase has to lock its internal nodes. It may, however, make use of ‘‘optimistic locking’’ which avoids actual locking and improves performance in many circumstances, at the cost of possible repeated attempts at executing a single operation. The optimistic locking options are set with the method:

SpaceBaseBuilder& set_optimistic_locking(int height, int retry_limit);

The height argument specifies above which optimistic locking is to be used. If it’s -1 then optimistic locking is used for all nodes. We recommend setting its value to 0 or 1.

retryLimit specifies the maximum number of times an operation is to be retried before reverting to pessimistic locking (for the execution of that operation only).

To disable optimistic locking, call this method:

SpaceBaseBuilder& set_pessimistic_locking();

Experimental Settings

SpaceBaseBuilder has other properties you can set, but those are considered experimental and are deliberately left undocumented.

Monitoring

SpaceBase uses JMX (Java Management Extensions) — the accepted standard for Java monitoring and management — for monitoring itself. It exposes several MBeans (monitoring entities) that you can monitor using VisualVM, jconosole (those two tools come bundled with the JDK. VisualVM can be downloaded separately from http://visualvm.java.net/, or any other JMX dashboard. To learn more about JMX please refer to Oracle’s JMX technology page: http://www.oracle.com/technetwork/java/javase/tech/javamanagement-140525.html

You can monitor SpaceBase remotely using jconsole, VisualVM etc., but for that you need to enable remote monitoring. All you need to do is add a couple of parameters to create_JVM:

void *jvm = create_JVM("<full-path>", 100, 
      1, "com.sun.management.jmxremote.port=<port-num>");

Where <port-num> is the TCP port number that will be used for remote access. You may also enable security for remote monitoring; for further details please see http://download.oracle.com/javase/6/docs/technotes/guides/management/agent.html.

The most interesting MBean is probably the one monitoring SpaceBase’s performance, which in VisualVM would appear in the MBeans tree as co.paralleluniverse/SpaceBase/<store-name>/performance. It has many properties, which are categorized in sub-MBeans for viewing convenience. All the values of the performance properties refer to the previous polling period. The period is 5 seconds long, by default, but can be changed through the PerformanceTimePeriod property of the co.paralleluniverse/SpaceBase/MonitoringServices MBean – Just enter a new time period in milliseconds.

Transaction Information

Transactions Total number of operations on the store (in the last polling period).
Inserts Total number of insertions.
Updates Total number of element updates.
Deletes Total number of deletions from the store.
Queries Total number of queries on the store.
AvgQueryResults Average number of results returned from queries.
AvgQueryVolume Average volume/area of bounded queries.

General

QueueLength The number of internal SpaceBase tasks currently awaiting execution. The smaller the better. (This value does not refer to the last polling period, but is an instantaneous measurement).
TotalNodeTouches Total number of SpaceBase nodes accessed.
PropagateTimeRatio Amount of time, expressed as a ratio of the last polling period, spent on some housekeeping tasks (the lower the better).
AsyncTimeRatio Amount of time, expressed as a ratio of the last polling period, spent on some housekeeping tasks (the lower the better).

Locking

Collisions Total number of lock collisions (the lower the better).
WaitTimeRatio Amount of time, expressed as a ratio of the last polling period, spent waiting for locks (the lower the better).
OptimisticLockingTransactions Number of SpaceBase operations that made use of optimistic locking.
OptimisticLockingNodes Total number of nodes accessed with optimistic locking.
OptimisticLockingRetries Number of operation retries due to optimistic locking (the lower the better).

Descends

AvgDescenXBifurcationLevel The lower the better.
AvgDescenXBifurcationLevel The lower the better.

Distributing SpaceBase

This version of SpaceBase is single-node only.

Erlang API

Getting Started

System Requirements

You must have the JRE 7 (Java Runtime Environment), as well as a recent version of the OTP installed on the machine running SpaceBase.

Installation

Just unpack the distribution archive.

Running SpaceBase

SpaceBase for Erlang is a Java process that uses the JInterface API to pose as an Erlang node with an sname. Before you run SpaceBase, the epmd process must be running — any running Erlang process starts the epmd, or you can just start it manually.

To start the SpaceBase server, run the following command in the distribution’s root directory:

java -server  
    -Dco.paralleluniverse.spacebase.properties.file=config/spacebase_erl.properties 
    -Dco.paralleluniverse.spacebase.license.file=<path to .lic file>
    -Djava.util.logging.config.file=config/jul.properties
    -Derlang.cookie=<XXXXXXX>
    -jar spacebase-erl.jar <erl node name>

<erl node name> is the sname name of the Erlang node the SpaceBase server will run as.

You can use the -Derlang.cookie option to pass a string to be used as the cookie for the Erlang node. If you don’t, the default cookie in the ~/.cookie file will be used.

The -Dco.paralleluniverse.spacebase.properties.file is optional. Use it only if you want to change the default SpaceBase configuration.

Interacting with SpaceBase

All SpaceBase operations are run by sending messages either to the main server process (used only to spawn SPaceBase stores), or to one of the SpaceBase stores themselves — for all other operation.

All messages have the same structure:

{self(), Op}

When running an operation that takes no parameters, Op is an atom.

When the operation takes parameters, Op is a tuple:

{self(), {Op_Name, Params}}

A SpaceBase store process will reply to the message with a response of the form:

{Pid, ok, Result} | {Pid, error, Reason}

where Pid is the responding process, while the main server process will respond with:

{sb_server, ok, Result} | {sb_server, error, Reason}

Creating and Destroying SpaceBase Stores

The first thing you need to do is create at least one SpaceBase store, this is done by sending the get_store message to the server process, which is always {server, SbNode}, where SbNode is the node name given in the SpaceBase command line. Like so:

{server, SbNode} ! {self(), {get_store, ''my_store1''}} 

The SpaceBase server will then either create a new store named ‘‘my_store1’’, or locate an existing store by that name. It will then respond with:

{sb_server, ok, Pid} 

Where Pid identifies the process controlling the store. You will send all additional messages for that store to that Pid.

To destroy the store, simply send it the kill message:

Pid ! {self(), kill} 

IDS

Each element stored in SpaceBase must have a unique id. You can provide your own id (if you know it to be unique) of any Erlang type. If you don’t, SpaceBase will assign an automatic id of type integer().

AABBs

SpaceBase stores and retrieves object’s spatial bounds in the form of AABBs. In Erlang, those are represented as 4- or 6- tuples of type float(), depending on the store’s dimension:

{Min_X, Max_X, Min_Y, Max_Y} | {Min_X, Max_X, Min_Y, Max_Y, Min_Z, Max_Z}

Elements

Many operations return an element or a list of elements. Those are represented as the tuple:

{Id, Aabb, Value}

With Value being any Erlang value (of any type) that is the element’s data content.

Store Info

There are two simple operations that return general information about the store:

get_name -> Name :: string()

that returns the store’s name (same as the Pid name) set on the command line, and

get_size -> Size :: integer()

that returns the total number of elements in the store.

Element Info

There are two simple operations that query an element’s data:

{contains, Id} -> true|false

tests whether an element exists in the store, and

{get_element, Id} -> {Id, {MinX, MaxX, MinY, MaxY}, Value} | undefined

returns the elements AABB and data (or undefined if the element doesn’t exist).

Inserting, Modifying and Deleting Elements

To insert an element into the store, use the operation:

{insert, Aabb, Value} -> Id :: integer()

Or, to provide your own unique id (of any Erlang type):

{insert, Aabb, Value, Id} -> Id

With Value being any Erlang value (of any type) that is the element’s data content.

To update an element, use:

{update, Id, Aabb} | {update, Id, Aabb, Value} -> ok

Please note that if an element with the given ID is not founds in the store, the operation will fail silently (and still return ok). This is so the application doesn’t have to explicitly deal with concurrent deletes.

And, finally, to remove an element from the store:

{delete, Id} -> ok

Here, too, if an element with the given ID is not founds in the store, the operation will fail silently (and still return ok).

Performance Note: Insertion is SpaceBase’s costliest operation, so when initially populating the store, consider sending all insert messages at bulk, receive-ing the all the responses after all the insert messages have been sent.

Queries

Simple Queries

SpaceBase spatial queries are run with the squery operation. It has the following form:

{squery, Query} -> [Element]

where Query is a tuple or an atom defining the query type and additional parameter.

The following table lists SpaceBase’s built-in spatial queries:

Query Description
all Finds all elements in the store.
{equal, Aabb} Finds all elements whose AABBs are equal to the given AABB.
{xsect, Aabb} Finds all elements whose AABBs intersect the given AABB (or are entirely contained in it).
{contained, Aabb} Finds all elements whose AABBs are contained within a given AABB.
{range, Aabb, Distance} Finds all elements whose AABBs lie, completely or partially, within a given Distance (:: float()) from a given AABB.
{contain, Aabb} Finds all elements whose bounding boxes contain a given AABB.

Compound Queries

You can compose queries together using not, and, and or tuples, which provide query negation, conjunction and disjunction, respectively. The binary operations and and or make use of short-circuit logic, so the order of the two queries passed as arguments may have a performance impact — in the case of the and operation, the less inclusive query should come first, and in the case of the or operation, the more inclusive query should come first.

Examples:

{squery, {'not', {contained, {0.0, 10.0, 50.0, 60.0}}}}
{squery, {'or', {'not', {contained, {0.0, 10.0, 50.0, 60.0}}},
                {xset, {30.0, 30.0, -1000.0, 1000.0}}}}

Spatial Joins

A spatial join is a query that operates not on each spatial element individually but on pairs of elements. It is useful in finding, say, all pairs of elements that intersect one another, or all pairs that are within a certain distance from each other.

SpaceBase spatial joins are run with the sjoin operation. It has the following form:

{sjoin, Join} -> [{Element, Element}]

where Join is a tuple or an atom defining the join type and additional parameter.

The following table lists SpaceBase’s built-in spatial joins:

Join Description
xsect Finds all pairs of elements whose AABBs intersect.
{dist, Distance} Finds all pairs of elements whose AABBs are less than or equal to a given Distance (:: float()) apart.

Geo-Spatial Queries and Joins

SpaceBase has a few built-in queries and joins designed to support objects positioned geographically. These — unlike all other built-in queries — make strict requirements on how coordinates are specified.

If you wish to make use of the geo-spatial queries provided with SpaceBase, you must adhere by the following rules:

  • The first coordinate is the latitude, given in degrees, and must have a value in the range [-90.090.0].
  • The second coordinate is the longitude, given in degrees, and must have a value in the range [-180.0180.0].
  • If you wish the queries to take altitude into account, you can use 3D AABBs, in which case the third coordinate is the altitude, and it must be given in meters.

In short, the coordinates must be given as (lat, lng) or (lat, lng, alt) with lat and lng being in degrees, and alt in meters. For AABBs, of course, the first pair is the minimum and maximum latitude, etc.

Query Description
   
{georange, Aabb, Distance} Finds all elements whose geographic AABBs lie, completely or partially, within a given Distance (:: float()) in meters, from a given geographic AABB.
   
   
Join Description
   
{geodist, Distance} Finds all pairs of elements whose geographic AABBs are less than or equal to a given Distance (:: float()) in meters, apart.

Configuring SpaceBase

The SpaceBase instances are configured in a .properties file, optionally specified by:

-Dco.paralleluniverse.spacebase.properties.file=<path to properties file>

on the command line. If the file is not specified, default values will be used for all properties.

A file with default configuration is provided with the SpaceBase distribution, and is found in the config directory. The same configuration applies to all SpaceBase stores created by the server.

There are two important properties you can set in the .properties file:

Property Description Default
     
spacebase.dim The SpaceBase stores’ dimension. Can be 2 or 3. 2
spacebase.server.max-threads The maximum number of threads that will be used by the server to execute SpaceBase operations. When not set, defaults to the number of available cores.
spacebase.datum The name of the geodesic datum to use for the geospatial queries and joins. WGS84
spacebase.datum.a Rather than specifying a named datum, you can define the datum parameters. a is the datum’s large radius, in meters.  
spacebase.datum.f If spacebase.datum.a is specified, f must be, too. f is the datum`s oblaten ess factor.  

There are other properties you can set in the file that tune SpaceBase’s performance. They are explained in the next section.

Tuning SpaceBase for Performance

There are several properties, all set in the .properties file, that can affect SpaceBase’s performance. The optimal settings depend on your hardware, data distribution and access patterns, so you should experiment until you find one to your liking. Note, however, that all properties have sensible defaults, so you should get excellent performance out of the box even without touching anything. However, for that extra oomph, we’ve got some knobs for you to play with.

Multi-Threaded Execution Modes

SpaceBase is a concurrent data-store, and as such it supports many threads operating on a single store simultaneously. It has two different execution modes that make use of threads differently. Which one is best for you depends on your specific application design and your hardware.

The first execution mode is the concurrent mode. In this mode, every operation is executed in its entirety on the thread that started it. In the Erlang case, that will be the thread handling the request.

The second execution mode is the parallel mode, in which every operation may be further parallelized (using Java’s Fork/Join framework), and a single operation will execute on several different threads in parallel, none of which is the thread used to start the operation.

To turn on parallel mode, include this line in the .properties file:

spacebase.parallel = true # default: false

By default, SpaceBase works in concurrent mode (spacebase.parallel is false).

Another property,

spacebase.parallelism  # default: 2 if parallel is false, num of processors if true

is the number of total threads to use if in the parallel mode (defaults to the number of available processors), or the number of background bookkeeping threads to use when in the concurrent mode (defaults to 2).

Node Width

Each of SpaceBase’s internal nodes has up to a constant number of child nodes. The wider the node (the more children it has), the shorter the tree. With wider nodes there are less levels to traverse. However, each node you do traverse requires more work. The optimal node width mostly depends on your CPU’s cache-line size and memory performance characteristics. The node width can be set between 5 and 32 with the property:

spacebase.node-width # default: 10

Node Compression

The index nodes may be compressed to allow more width in the same number of cache-lines. Compression, however, has some processing costs. For large node-widths (say, over 20) we recommend enabling node compression. Node compression can be set on or off with the property:

spacebase.compressed # default: false

Floating-Point Precision

By default, SpaceBase stores AABB coordinates with double-precision (64 bits). If you’d like to conserve memory, you may instruct SpaceBase to use single-precision (32 bits) coordinates by setting to true the property:

spacebase.single-precision # default: false

Optimistic or Pessimistic Locking

In order to facilitate concurrent update to the spatial index, SpaceBase has to lock its internal nodes. It may, however, make use of ‘‘optimistic locking’’ which avoids actual locking and improves performance in many circumstances, at the cost of possible repeated attempts at executing a single operation. Optimistic locking is turned on by default, and can be set off with the property:

spacebase.optimistic # default: true

If using optimistic locking, two more properties affect its performance:

spacebase.optimistic-height # default: 1

This property specifies above which tree-livel optimistic locking is to be used. If it’s -1 then optimistic locking is used for all nodes. We recommend setting its value to 0 or 1.

spacebase.optimistic-retry-limit # default: 3

This property specifies the maximum number of times an operation is to be retried before reverting to pessimistic locking (for the execution of that operation only).

Monitoring

SpaceBase uses JMX for monitoring itself. It exposes several MBeans which you can monitor using VisualVM, jconosole, or any other JMX dashboard. The most interesting MBean is probably the one monitoring SpaceBase’s performance, and is named co.paralleluniverse:type=SpaceBase,name=<store-name>,monitor=performance, which in VisualVM would appear in the MBean tree as co.paralleluniverse/SpaceBase/<store-name>/performance. It has many properties, which are categorized in sub-MBeans for viewing convenience. All the values of the performance properties refer to the previous polling period. The period is 5 seconds long, by default, but can be changed through the PerformanceTimePeriod property of the co.paralleluniverse/SpaceBase/MonitoringServices MBean — Just enter a new time period in milliseconds.

Here we explain some of them:

Transaction Information

Transactions Total number of operations on the store (in the last polling period).
Inserts Total number of insertions.
Updates Total number of element updates.
Deletes Total number of deletions from the store.
Queries Total number of queries on the store.
AvgQueryResults Average number of results returned from queries.
AvgQueryVolume Average volume/area of bounded queries.

General

QueueLength The number of internal SpaceBase tasks currently awaiting execution. The smaller the better. (This value does not refer to the last polling period, but is an instantaneous measurement).
TotalNodeTouches Total number of SpaceBase nodes accessed.
PropagateTimeRatio Amount of time, expressed as a ratio of the last polling period, spent on some housekeeping tasks (the lower the better).
AsyncTimeRatio Amount of time, expressed as a ratio of the last polling period, spent on some housekeeping tasks (the lower the better).

Locking

Collisions Total number of lock collisions (the lower the better).
WaitTimeRatio Amount of time, expressed as a ratio of the last polling period, spent waiting for locks (the lower the better).
OptimisticLockingTransactions Number of SpaceBase operations that made use of optimistic locking.
OptimisticLockingNodes Total number of nodes accessed with optimistic locking.
OptimisticLockingRetries Number of operation retries due to optimistic locking (the lower the better).

Descends

AvgDescenXBifurcationLevel The lower the better.
AvgDescenXBifurcationLevel The lower the better.

Distributing SpaceBase

This version of SpaceBase is single-node only.

documentation

SpaceBase

API Discuss