Frequently Asked Questions
When should I use Bitsy?
The Bitsy graph database is a good fit for you if:
- your data can be modeled as a property graph
- your data can fit in memory
- your application is multi-threaded
- your application needs ACID transactions and transparent concurrency control
- your application needs high read/write performance on small transactions
- you need the flexibility to move to a different graph database implementation in the future
What is different about Bitsy?
Bitsy is designed differently from disk-based and cluster-based graph databases. You can read more about Bitsy's design principles in this presentation. Because Bitsy doesn't perform disk seeks, it has a higher write throughput in a multi-threaded setting while offering ACID guarantees.
Bitsy's read throughput is also comparable to the fastest graph databases out there. This presentation discusses how Bitsy achieves cutting-edge read performance.
Can I move from/to a different graph database?
Yes. Bitsy is a strict implementation of the Blueprints API and supports the Tinkerpop software stack. The Blueprints API provides "a common set of interfaces to allow developers to plug-and-play their graph database backend."
To move your data across databases, you can write a simple program to read all vertices and edges from your old graph database and store them in the new graph database. In this manner, you can avoid getting locked into any single graph database implementation.
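The copy program described above could look roughly like the following sketch. To stay self-contained, it uses small hypothetical stand-ins (SimpleVertex, SimpleEdge, SimpleGraph) instead of the real Blueprints interfaces; with Blueprints you would make the analogous calls on its Graph, Vertex and Edge types. The sketch assumes the target database may assign its own identifiers, so it keeps a map from old vertex ids to the newly created vertices and uses that map to re-attach the edges.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class GraphCopySketch {
    // Hypothetical stand-ins for a property graph; NOT the real Blueprints classes.
    static class SimpleVertex {
        final Object id;
        final Map<String, Object> props = new HashMap<>();
        SimpleVertex(Object id) { this.id = id; }
    }

    static class SimpleEdge {
        final SimpleVertex out;
        final SimpleVertex in;
        final String label;
        final Map<String, Object> props = new HashMap<>();
        SimpleEdge(SimpleVertex out, SimpleVertex in, String label) {
            this.out = out;
            this.in = in;
            this.label = label;
        }
    }

    static class SimpleGraph {
        private final String idPrefix; // each graph hands out its own ids
        final List<SimpleVertex> vertices = new ArrayList<>();
        final List<SimpleEdge> edges = new ArrayList<>();
        SimpleGraph(String idPrefix) { this.idPrefix = idPrefix; }

        SimpleVertex addVertex() {
            SimpleVertex v = new SimpleVertex(idPrefix + vertices.size());
            vertices.add(v);
            return v;
        }

        SimpleEdge addEdge(SimpleVertex out, SimpleVertex in, String label) {
            SimpleEdge e = new SimpleEdge(out, in, label);
            edges.add(e);
            return e;
        }
    }

    // Copies all vertices first, then all edges, translating endpoints via the id map.
    static void copy(SimpleGraph from, SimpleGraph to) {
        Map<Object, SimpleVertex> oldIdToNew = new HashMap<>();
        for (SimpleVertex v : from.vertices) {
            SimpleVertex copied = to.addVertex();
            copied.props.putAll(v.props);
            oldIdToNew.put(v.id, copied);
        }
        for (SimpleEdge e : from.edges) {
            SimpleEdge copied = to.addEdge(
                    oldIdToNew.get(e.out.id), oldIdToNew.get(e.in.id), e.label);
            copied.props.putAll(e.props);
        }
    }

    public static void main(String[] args) {
        SimpleGraph oldDb = new SimpleGraph("old-");
        SimpleVertex a = oldDb.addVertex();
        a.props.put("name", "alice");
        SimpleVertex b = oldDb.addVertex();
        b.props.put("name", "bob");
        oldDb.addEdge(a, b, "knows").props.put("since", 2012);

        SimpleGraph newDb = new SimpleGraph("new-");
        copy(oldDb, newDb);
        System.out.println(newDb.vertices.size() + " vertices and "
                + newDb.edges.size() + " edges copied");
    }
}
```

For a large graph, a real migration would probably commit in batches rather than in one transaction; the sketch leaves that out.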
How can I use Bitsy in my application?
Bitsy is an embedded database, which means that you directly invoke its methods through the standard Blueprints API from your application.
To embed Bitsy, your application must be Java-based, i.e., written in Java, Groovy, Scala, Clojure, JRuby, Jython, etc. Reads from your application will access Bitsy's in-memory data-structures, and writes will modify these structures and write the updates to files. Bitsy guarantees clean recovery on file-systems with metadata journaling.
Server-based applications that embed Bitsy can expose domain-specific APIs corresponding to the business logic layer. The server could even include the presentation layer for small-to-medium installations. You can configure a large thread pool (50-200 threads) to get the best performance out of Bitsy.
A domain-specific API built on the current generation of servers and protocols (Tomcat/HTTP, Jetty/HTTP, Jersey/JAX-RS, Thrift, Avro, etc.) should be able to support thousands, if not tens of thousands, of transactions per second.
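As a rough illustration of the thread-pool advice, the sketch below runs many small requests on a fixed pool from java.util.concurrent. The pool size of 100 is just one value from the 50-200 range mentioned above, and handleRequest is a hypothetical placeholder: in a real server it would run a small transaction against the embedded graph, whereas here it only increments a counter so that the example runs on its own.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicLong;

class ThreadPoolSketch {
    // Hypothetical request handler. The AtomicLong stands in for the database;
    // a real handler would read/write the graph and commit.
    static long handleRequest(AtomicLong store) {
        return store.incrementAndGet();
    }

    // Runs the given number of requests on a fixed-size pool and returns
    // how many of them were applied to the stand-in store.
    static long runRequests(int poolSize, int requests) {
        AtomicLong store = new AtomicLong();
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            List<Callable<Long>> tasks = new ArrayList<>();
            for (int i = 0; i < requests; i++) {
                tasks.add(() -> handleRequest(store));
            }
            // invokeAll blocks until every task has finished
            for (Future<Long> f : pool.invokeAll(tasks)) {
                f.get(); // surfaces any exception thrown by a task
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
        return store.get();
    }

    public static void main(String[] args) {
        long applied = runRequests(100, 10_000);
        System.out.println(applied + " requests applied");
    }
}
```

Servers such as Tomcat or Jetty manage their own worker pools, so in those setups the equivalent step would be a configuration setting rather than code like this.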
Can I use Bitsy in an eco-system with other NoSQL technologies?
Yes. All Bitsy identifiers are UUIDs that don't change over time. The toString() method on these identifiers gives you a String identifier that can be used to access the vertices/edges in the graph. You can maintain such references to Bitsy vertices/edges in external stores, such as search indexes and other databases.
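The sketch below illustrates the idea of keeping string references in an external store. It uses java.util.UUID as a stand-in for a Bitsy identifier and a plain map as a stand-in for a search index; both are assumptions made to keep the example runnable, and the helper names toExternalKey and fromExternalKey are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

class ExternalReferenceSketch {
    // The string form of the id is what gets stored outside the graph.
    static String toExternalKey(UUID elementId) {
        return elementId.toString();
    }

    // Rebuilds the id from the string kept in the external store.
    static UUID fromExternalKey(String key) {
        return UUID.fromString(key);
    }

    public static void main(String[] args) {
        // Stand-in for the id of a vertex obtained from the graph
        UUID vertexId = UUID.randomUUID();

        // Stand-in for an external search index: keyword -> element reference
        Map<String, String> searchIndex = new HashMap<>();
        searchIndex.put("alice", toExternalKey(vertexId));

        // Later: resolve the keyword back to the same identifier
        UUID resolved = fromExternalKey(searchIndex.get("alice"));
        System.out.println("round trip ok: " + resolved.equals(vertexId));
    }
}
```

Because the identifiers do not change over time, a reference stored this way stays valid for as long as the element exists in the graph.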
A couple of common use-cases involving complementary NoSQL technologies have been documented in the following pages:
- Large Objects: This page describes how large objects/blobs can be stored outside Bitsy while maintaining ACID guarantees.
- Fancy Indexes: This page describes a way to integrate Bitsy with Lucene or other search indexes
Other uses are possible too. For example, you could use Bitsy to
- Maintain temporary user information such as 'shopping carts' which can later be committed to the main database
- Store the configuration of your application
- Keep track of application statistics that need to survive crashes and restarts
How do I get the best performance out of Bitsy?
Please refer to the page on Tuning Bitsy.
How can I monitor a Bitsy application?
Please refer to the page on Monitoring and Management.
Can I back up a Bitsy database?
Please refer to the page on Backup and Restore.
What is the BitsyRetryException? How do I deal with it?
Please refer to the page on Optimistic Concurrency. The retry mechanism is going to be a part of the Tinkerpop standard.
What is the BitsyException with error code ACCESS_OUTSIDE_TX_SCOPE? How do I deal with it?
"Element references created in a transactional context may not be accessed outside the transactional context", according to the Blueprints standard. The standard continues to say that "in cases where the element reference needs to be accessed outside its original transactional context, it should be re-instantiated based on the element id."
Since other graph databases seem to automatically reload vertices/edges, Bitsy offers a wrapper class named BitsyAutoReloadingGraph which can be used as follows:
    import com.lambdazen.bitsy.wrapper.BitsyAutoReloadingGraph;
    ...
    BitsyGraph baseGraph = new BitsyGraph(...);
    BitsyAutoReloadingGraph graph = new BitsyAutoReloadingGraph(baseGraph);
    // ... use graph in any number of threads
Bitsy gives me an older version of vertices/edges. What is happening?
The most likely cause is that a transaction is already in progress at the time of the query. The default isolation level for Bitsy is REPEATABLE_READ, which ensures that you don't see different properties for the same element within the same transaction. So if a thread reads vertices/edges but never calls commit or rollback, its transaction stays open, and later reads in that thread may keep returning the same, possibly outdated, information.
The best way to avoid these problems is to move to a transaction model such as the one described in the "Recommended method" section, which automatically does the commits and retries. The other option is to use the isTransactionActive() method available in BitsyGraph and BitsyAutoReloadingGraph to see if a transaction is in progress.
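The general shape of a commit-and-retry wrapper is sketched below. This is not the code from the "Recommended method" section. It is a minimal illustration built on hypothetical stand-ins: the Tx interface represents the commit/rollback operations of a transactional graph, RetryException plays the role of BitsyRetryException, and inTransaction and CountingTx are invented names. The point is that every path ends in a commit or a rollback, so no transaction is left open to serve stale reads.

```java
import java.util.function.Supplier;

class RetryingTxSketch {
    // Hypothetical stand-in for BitsyRetryException
    static class RetryException extends RuntimeException {}

    // Hypothetical stand-in for the commit/rollback operations of a graph
    interface Tx {
        void commit();
        void rollback();
    }

    // Runs the work, commits on success, rolls back on failure, and
    // re-runs the work when a RetryException signals a conflict.
    static <T> T inTransaction(Tx tx, int maxAttempts, Supplier<T> work) {
        for (int attempt = 1; ; attempt++) {
            try {
                T result = work.get();
                tx.commit();
                return result;
            } catch (RetryException e) {
                tx.rollback();
                if (attempt >= maxAttempts) {
                    throw e; // give up after the last allowed attempt
                }
            } catch (RuntimeException e) {
                tx.rollback(); // never leave a transaction open
                throw e;
            }
        }
    }

    // Test double that only counts how often each operation was called
    static class CountingTx implements Tx {
        int commits;
        int rollbacks;
        public void commit() { commits++; }
        public void rollback() { rollbacks++; }
    }

    public static void main(String[] args) {
        CountingTx tx = new CountingTx();
        int[] calls = {0};
        String result = inTransaction(tx, 5, () -> {
            if (++calls[0] < 3) {
                throw new RetryException(); // simulate two conflicts
            }
            return "ok";
        });
        System.out.println(result + " after " + calls[0] + " attempts, "
                + tx.commits + " commit(s), " + tx.rollbacks + " rollback(s)");
    }
}
```

Since the work may run more than once, it should not have side effects outside the transaction that cannot safely be repeated.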
How does the open-source license work?
Bitsy is licensed under the Apache 2.0 license, a popular corporate-friendly open-source license. Bitsy was previously a dual-licensed product.