Commits

lambdazen committed ed99f29

First draft

  • Participants
  • Parent commits 884ccd9

Comments (0)

Files changed (20)

BackupAndRestore.md

+# Backup and Restore
+
+For an offline backup, you can copy the contents of the database folder to a backup location using a simple file copy operation. To restore the database, you simply have to restore the database folder from the backup, after removing the original contents.
+
+Bitsy also supports online backups using the backup() method in BitsyGraph. This method expects a path to an empty folder to which the vertex and edge logs can be written. You can put this on a schedule using a simple Timer or your favorite scheduler library (cron4j, Quartz, etc.). 
+
+You can perform online backups manually using the backup() method that is [exposed over JMX](MonitoringAndManagement.md)
+# Benchmarks
+
+These benchmarks are based on tests run on a low-end desktop PC ($600 HP Pavilion desktop p7-1287c running Windows 7). You can find these benchmarks in the unit test named BitsyGraphTest. 
+
+**Note:** Please model and execute performance tests based on your use-cases. These tests aren't meant to be performance guarantees. 
+
+## Write throughput
+
+The plot below shows the throughput of a test application that repeatedly commits a small transaction (1 vertex + 1 edge) from multiple threads. The code for this test is available in BitsyGraphTest.XtestMultiThreadedPerformance(). 
+
+![WriteThroughput](images/BitsyWriteThroughput.png)
+
+The throughput increases as more threads are added till the system is able to perform 50K+ operations/second at ~750 concurrent threads. The performance is lower when the number of concurrent threads is small due to the _rotational latency_ of the disk that slows down the force() operation on the [file channel](http://docs.oracle.com/javase/7/docs/api/java/nio/channels/FileChannel.html).
+
+In typical OLTP applications, a thread pool is dedicated to service clients with each thread committing changes to the database. The above plot indicates that the maximum size of this thread pool can be as high as 500-1000 (provided the system has enough resources for the application). The best size for the thread pool may be higher or lower depending on the application, the nature of transactions, and the hardware capabilities. 
+
+**Note**: The throughput will automatically scale up and down depending on the number of concurrent requests. However, when fewer threads are committing transactions, the latency seen by the application will be better. 
+
+## Read throughput
+
+The following two plots show the throughput of a test application that repeatedly traverses a portion of the graph. The code for this test is available in BitsyGraphTest.XtestMultiThreadedPerformance(). 
+
+![Read throughput without thread switching](images/BitsyReadThroughput-2.png)
+
+This plot shows the read throughput when there is no thread switching (x=1-3 threads) and thread switching becomes a bottleneck (x=4-5 threads). A CPU with N-processors can handle (N-1) read threads without context switching, provided the remaining core takes care of everything else (OS, Java garbage collection, etc). As you can see in the above the plot, once the number of concurrent threads exceeds 3, the performance drops because thread context switching becomes the bottleneck. 
+
+The peak performance is _2 million operations per second_ when the number of threads is 3. This level of performance could be useful for standalone applications and batch operations. But for typical OLTP applications, it is likely that the bottleneck is elsewhere (sockets, serialization, etc). 
+
+For OLTP applications which typically operate with a larger number of concurrent threads, Bitsy offers a _performance floor_ that is still very attractive (~400K operations/second). The plot below shows the read throughput for 4-1000 concurrent threads. As you can see, the X-axis doesn't matter much once the number of threads exceeds the number of cores. 
+
+![Read throughput with thread switching](images/BitsyReadThroughput-1.png)
+
+

BlueprintsFeatures.md

+# Supported Blueprints Features
+
+Most of the [Blueprints features] supported by Bitsy. 
+
+* ignoresSuppliedIds: **true**
+* isWrapper: **false**
+* supportsDuplicateEdges: **true**
+* supportsEdgeIteration: Same as the allowFullGraphScans option passed during the creation of the BitsyGrah
+* supportsEdgeProperties: **true**
+* supportsEdgeRetrieval: **true**
+* supportsSelfLoops: **true**
+* supportsThreadedTransactions: **true**
+* supportsTransactions: **true**
+* supportsVertexIteration: Same as allowFullGraphScans
+* supportsVertexProperties: **true**
+* supportsEdgeKeyIndex: **true**
+* supportsKeyIndices: **true**
+* supportsVertexKeyIndex: **true**
+
+**Note:** KeyIndexableGraph is supported, but not the older IndexableGraph interface. Refer to [this page](https://github.com/tinkerpop/blueprints/wiki/Graph-Indices) for more information. 
+
+* supportsIndices: **false** 
+* supportsVertexIndex: **false**
+* supportsEdgeIndex: **false**
+
+## Serialization features
+
+All serialization features are supported. Object serialization depends on the [Jackson JSON processor] with "default typing" option enabled, *not* Java serialization. The default typing option is described in the section named "Deserializing Polymorphic types" in the [Jackson FAQ].
+
+* supportsBooleanProperty: **true**
+* supportsDoubleProperty: **true**
+* supportsDuplicateEdges: **true**
+* supportsFloatProperty: **true**
+* supportsIntegerProperty: **true**
+* supportsLongProperty: **true**
+* supportsMapProperty: **true**
+* supportsMixedListProperty: **true**
+* supportsPrimitiveArrayProperty: **true**
+* supportsSerializableObjectProperty: **true**
+* supportsStringProperty: **true**
+* supportsUniformListProperty: **true**
+
+
+[Blueprints features]: http://www.tinkerpop.com/docs/javadocs/blueprints/2.3.0/com/tinkerpop/blueprints/Features.html
+[default typing]: https://github.com/FasterXML/jackson-docs/wiki/JacksonFAQ
+[Jackson JSON processor]: http://jackson.codehaus.org/
+[Jackson FAQ]: https://github.com/FasterXML/jackson-docs/wiki/JacksonFAQ
+# Dependencies
+
+- [Blueprints Core] which is the API implemented by the database
+- [Jackson JSON processor] for serialization and deserialization
+- [Ness Computing Core Component] for fast UUID serialization and deserialization (faster than JDK UUID.fromString() and toString())
+- [SLF4J API] for logging. The SLF4J implementation/binding depends on the library selected by the application integrating with Bitsy. If no binding is included in the classpath, the logging statements will be ignored. 
+
+[Blueprints Core]: https://github.com/tinkerpop/blueprints/wiki
+[Ness Computing Core Component]: https://github.com/NessComputing/components-ness-core 
+[SLF4J API]: http://www.slf4j.org/
+[Jackson JSON processor]: http://wiki.fasterxml.com/JacksonHome

DesignPrinciples.md

+# Design Principles
+
+The speed and simplicity of Bitsy relies on three design principles: No Seek, No Socket, NoSQL. 
+
+### No Seek
+
+Bitsy's first principle is _"No Seek."_ All updates to the database are written to the end of a transaction log file. Changes committed concurrently from different threads are written to the disk using file-append operations followed by a _force operation_ to the disk. The rest of this section describes the reason behind the "No Seek" principle.
+
+> *Seek time*: The seek time of a hard disk measures the amount of time required for the read/write heads to move between tracks over the surfaces of the platters -- _From [PC Guide](http://www.pcguide.com/ref/hdd/perf/perf/spec/posSeek-c.html)_
+
+Most databases persist information in tree-like structures, such a B-Trees. These data structures are stored on random-access storage devices, typically [hard disk-drives]. Each query/update on these data structures involves several _seek_ operations on the disk as the query/update algorithm traverses the tree. 
+
+The average seek time is in the order of 5-15 milliseconds for typical hard disks, which is 1000x slower than in-memory operations. The bottleneck in the seek operation is the mechanical movement of the drive head. Once the head is moved to the right track, there is a _rotational latency_ which is the time it takes for the spinning disk to get to the correct sector. This is between 2-4ms. 
+
+While caches can help speed up queries, the database will trigger _one or more seeks per updated element_ to register updates to the B-Tree. Some databases write the updates to a log and then move them to a B-Tree at a later time. But in the long-run, an ordered on-disk structure (like the B-Tree) cannot perform an update to a large data structure faster than the seek time of the underlying random-access device. The seek-time bottleneck is traditionally addressed by (a) adding more disks to the disk array, or (b) moving to solid-state drives. Both methods are expensive and only improve the seek time to 0.1ms at best -- still 100x slower than RAM lookups. 
+
+Although hard-disk drives have high seek times, they are pretty fast at reading from and writing to a continuous block of data. The sequential data transfer rate for a standard hard-disk is around 100MB/sec. If a record is 1KB in size, you could write ~100K records in a second, which is still 2-3 orders of magnitude faster than 100K seek operations. 
+
+This is the reason behind the "No Seek" principle. A cheap off-the-shelf hard disk drive can keep up an in-memory database, as long as each updated element doesn't trigger a seek operation. There are a couple of caveats here: 
+
+1. Though a commit operation in Bitsy will not result in a seek operation in the common case, the OS will fire seeks to reach different blocks of the file. Similary re-organization processes that access different files will require a seek to get to each file. The cost of these seek operations, however, can be amortized over the 1000s of records that are processed per seek operation. 
+1. Rotational latency, which can be around 2-4ms, can slow down a commit operation from the perspective of a single thread. However, since concurrent commits from different threads are written at once, the rotational latency can be amortized over the threads that commit at the same time. When sufficient threads are operating concurrently, the rotational latency doesn't affect the overall throughput of the system. The write throughput in the [benchmarks page](Benchmarks.md) demonstrates this effect.
+
+[Hard disk-drives]: http://en.wikipedia.org/wiki/Hard_disk_drive
+
+### No Socket
+
+Bitsy's second principle is _"No Socket."_ It is an embedded database and is meant to be integrated into an _application_. The application can be any of the following:
+
+* A standalone Java program
+* A web-application that implements all "three tiers"
+* A stateful server with a remote interface based on your favorite protocol (REST, SOAP, Thrift, Protocol buffer, RMI, Avro) exposed to a cluster of stateless web-applications.
+
+Typically, databases are designed to be accessed by applications over sockets. This allows multiple applications to access the same database and makes it possible to restart the application without restarting the database. 
+
+The downside is that a sockets-based interface requires OS-level calls and serialization/de-serialization of requests and responses which reduces the overall throughput of the system. Every socket interaction at the application level corresponds to several interactions at the database level. While connection pooling addresses the cost of establishing a new connection to the database, the cost of transmission, serialization and de-serialization can not be avoided. 
+
+The speed-up gained by avoiding sockets is especially visible while querying Bitsy. Queries in Bitsy can take _less than a micro-second_ on desktop PCs because serialization is not required. This is hard to achieve over sockets. 
+
+Socket-based interfaces also complicate the database's design. Since the interface is exposed to other servers, the database must now manage users accounts, and handle authentication and authorization. This is a layer of authentication and security that is different from the application-level authentication and security rules. With an embedded database like Bitsy, the application has sole control over the authentication and authorization rules. 
+
+
+### No SQL
+
+If you have come this far, you must have your own reasons to look for a "NoSQL" graph database. Bitsy was originally developed to support an application that talked to a different graph database using the [Blueprints API] and [Gremlin]. When the graph database implementation was not fast enough in a multi-threaded OLTP setting, the author of the application built Bitsy. 
+
+The biggest issue with SQL is that it is very easy (almost natural) to write poorly performing queries. Tuning a SQL database, on the other hand, is a non-trivial task. SQL was developed as a 4th-generation declarative language. The idea is that the programmer declares what he/she wants to query and system will take care of the rest. This works fine as long the tables are small (<1K rows). Once the tables become big, the "execution plan" for each query starts to matter. Soon indexes have be created, queries have to be re-written, profiling tools have to be launched, DBAs have to be hired, and so on. By the end of the process, any development time saved by the use of SQL is offset by the cost of tuning the queries and the database. The widespread adoption of "3GL" [object-relational mapping tools](http://en.wikipedia.org/wiki/Object-relational_mapping) like Hibernate, to hide SQL from the programmer, is a testament to the SQL's failure as a 4GL. Bitsy, on the other hand, has very few [tuning parameters](TuningBitsy.md). When launched with the allowFullGraphScans option set of false, it guarantees that _all_ operations are executed quickly. 
+
+The second issue with SQL is its complexity from the database implementor's as well as the application developer's point of view. SQL is a database implementor's nightmare. The SQL 2003 standard consists of [1300+ BNF rules](http://savage.net.au/SQL/sql-2003-2.bnf). The vendors that got together to standardize SQL had their interests in mind (read: barrier to entry) rather than the interest of the SQL programmer. Furthermore, most application developers aren't as comfortable with SQL as they may think, because of its complex syntax, database-specific variants and varying programming styles. The [Blueprints API] was developed by one expert, [Marko Rodriguez], not a consortium of software giants. In the author's opinion, the [Property Graph Model] is simpler and more dynamic than the relational model with SQL. 
+
+
+### Final thoughts
+
+The design principles, "No Seek", "No Socket" and "No SQL" keep Bitsy very simple and very fast. A couple of other recent developments are worth mentioning in the context of Bitsy's design: 
+
+1. **Journaled file-systems**: Modern file systems support metadata journaling which ensures that files aren't (badly) corrupted in the middle of a write operation. While Bitsy's startup procedure accounts for incomplete writes, and has the logic to repair these problems, it doesn't have to deal with the _hard_ problems of metadata corruption. 
+1. **Cheap RAM**: The cost of RAM has come down drastically over the last few decades. [Amazon EC2](http://aws.amazon.com/ec2/) offers high-memory instances with up to 68GB of memory. Off-the-shelf servers support upto 512GB RAM. This has enabled a [paradigm shift in the database market.](http://www.linux-magazine.com/Online/Features/In-Memory-DBMS)
+
+[Marko Rodriguez]: http://markorodriguez.com/
+[Blueprints API]: https://github.com/tinkerpop/blueprints/wiki
+[Gremlin]: https://github.com/tinkerpop/gremlin/wiki
+[Property Graph Model]: https://github.com/tinkerpop/blueprints/wiki/Property-Graph-Model
+
+
+# Fancy Indexes
+
+Recent progress in indexing and search technology obsoletes old-style database indexes for _human_ searches and lookups. [Apache Lucene](http://lucene.apache.org/) is the most popular indexing search technology. Lucene indexes can have documents that refer to the vertex/edge IDs in Bitsy. This section discusses how you can integrate Bitsy with Lucene (or an alternate "fancy index") while maintaining ACID guarantees. 
+
+Lucene indexes are good for human use. Bitsy indexes are good for for machine (algorithmic) use. For example, a [Blueprints key-index](https://github.com/tinkerpop/blueprints/wiki/Graph-Indices) is better to find person node by SSN and a Lucene index is better to find a person node based on his/her education background. 
+
+For this reason, it is OK to update Lucene indexes with a little bit of delay. Delayed batch updates are also preferable because Lucene performs much better with batched updates followed by commits. Please refer to the bullet on [setting autoCommit=false](http://wiki.apache.org/lucene-java/ImproveIndexingSpeed). 
+
+**Note:** There are no plans to integrate Lucene with Bitsy. If you are managing [large objects](LargeObjects.md) outside Bitsy, it makes sense to keep the indexes on these objects externally as well. Furthermore, the mapping between vertex/edge properties and documents to be indexed by Lucene is application-dependent. If there is one search box for all properties of a person, there will only be one Lucene index against all properties of the person vertex. 
+
+## Ensuring consistency with Lucene
+
+Bitsy supports ACID transactions, and so does Lucene in the sense that all commits will be fully applied or not applied at all. The problem, however, is that when committing to Lucene in batches, a crash might leave some elements/documents un-indexed. 
+
+To avoid this problem, your application can maintain _one vertex per Lucene index_ maintained. Whenever a vertex is added with an impact on the index, you can add an edge from that vertex to the vertex representing the Lucene index. This guarantees that the database always keeps track of documents that need to be added to the indexes. 
+
+In the background, a separate thread can wake up (say every minute) and perform these steps:
+
+1. Find the vertex for the Lucene index managed by this thread, say V, based on a key-index or the UUID of the vertex
+1. Get the incoming edges to V. These link to vertices that have some properties or references to [large objects](LargeObjects.md) that must be indexed. 
+1. Add the document(s) corresponding to the linked vertices. This should be an _idempotent_ operation since it may be executed more than once in case of a crash in step 5/6. 
+1. Commit the Lucene index writer
+1. Remove the processed edges from the Bitsy graph
+1. Commit the changes to the Bitsy graph
+
+The above algorithm ensures that the Lucene indexes are updated efficiently, while keeping Bitsy and Lucene consistent with one another within a configurable period of delay. In the event of a crash, the system will pick up the pending vertices to be indexed and add them to the Lucene index. 
-# Welcome
+# Introduction
 
-Welcome to your wiki! This is the default page we've installed for your convenience. Go ahead and edit it.
+Bitsy is a small, fast, embeddable, durable in-memory [graph database] that implements the [Blueprints API]. 
 
-## Wiki features
+[Blueprints API]: https://github.com/tinkerpop/blueprints/wiki
+[graph database]: http://en.wikipedia.org/wiki/Graph_database
+[ACID]: http://en.wikipedia.org/wiki/ACID
 
-This wiki uses the [Markdown](http://daringfireball.net/projects/markdown/) syntax.
+## Features
 
-The wiki itself is actually a git repository, which means you can clone it, edit it locally/offline, add images or any other file type, and push it back to us. It will be live immediately.
+* Support for [most Blueprints features](BlueprintsFeatures.md) including key indices and threaded transactions
+* ACID guarantees on transactions
+* Designed for multi-threaded OLTP applications
+* Implements optimistic concurrency control
+* Data stored in readable text files
+* Serialization using the [Jackson JSON processor]
+* Recovers cleanly from power failures and crashes provided the underlying file system supports metadata journaling, like NTFS, ext3, ext4, XFS and JFS (not FAT32 or ext2)
 
-Go ahead and try:
+[OLTP]: http://en.wikipedia.org/wiki/Online_transaction_processing
+[Jackson JSON processor]: http://wiki.fasterxml.com/JacksonHome
+
+## Getting started
+
+### Embedding Bitsy
+
+Bitsy is a [JAR file]() that can be added to your project's classpath. The dependencies are listed [here](Dependencies.md)
+
+If you are using Maven, you can add Bitsy and its dependencies to your project by adding this dependency: 
 
 ```
-$ git clone https://bitbucket.org/lambdazen/bitsy.git/wiki
+<dependency>
+  <groupId>com.lambdazen.bitsy</groupId>
+  <artifactId>bitsy</artifactId>
+  <version>1.0-SNAPSHOT</version>
+</dependency>
 ```
 
-Wiki pages are normal files, with the .md extension. You can edit them locally, as well as creating new ones.
-
-## Syntax highlighting
+The project artifacts are hosted in the Sonatype OSS Maven Repository at https://oss.sonatype.org/content/groups/public/. You can configure this repository by following [these instructions](http://maven.apache.org/guides/mini/guide-multiple-repositories.html).
 
+The 1.0 snapshot will be promoted to the 1.0 release after the Beta period ends on **July 1, 2013**. 
 
-You can also highlight snippets of text (we use the excellent [Pygments][] library).
+### Using Bitsy
 
-[Pygments]: http://www.pygments.org/
+The following code snippet shows how Bitsy can be launched and shut-down: 
 
 ```java
+import java.nio.file.Path;
+import com.lambdazen.bitsy.BitsyGraph;
+
 public class Test {
-    // Comment
-    public static final String FOO = "foo";
+    public static void main(String[] args) {
+        Path dbPath = Paths.get("...path to a directory...");
+
+        // Open the database
+        BitsyGraph myGraph = new BitsyGraph(dbPath);
+
+        ... use myGraph in any number of threads
+
+        // Close the database
+        myGraph.shutdown();
+    }
 }
 ```
 
+Now you can use Bitsy like any [Blueprints graph database](https://github.com/tinkerpop/blueprints/wiki) using the Tinkerpop software stack. 
+
+**Note**: The above example uses the simple constructor for BitsyGraph. Refer to [Tuning Bitsy](TuningBitsy.md) for other constructors, including a memory-only (non-durable) database constructor. 
+
+## Deep dive
+
+[This presentation](http://www.slideshare.net/lambdazen/bitsy-graphdatabase) provides an overview of Bitsy's design principles and features. The following Wiki pages go into a little more detail:
 
-You can check out the source of this page to see how that's done, and make sure to bookmark [the vast library of Pygment lexers][lexers], we accept the 'short name' or the 'mimetype' of anything in there.
-[lexers]: http://pygments.org/docs/lexers/
+* [Design Principles](DesignPrinciples.md): _No Seek_, _No Socket_ and _No SQL_
+* [Optimistic concurrency](OptimisticConcurrency.md): Covers the concurrency control and best practices to retry transactions
+* [Write algorithms](WriteAlgorithms.md): Details the different write algorithms along with the threads, buffers and logs used
+* [Serialization format](SerializationFormat.md): Specifies the JSON format for the various records in the database files
+* [Benchmarks](Benchmarks.md): Illustrates the read and write throughputs in a multi-threaded OLTP setting
+* [Tuning Bitsy](TuningBitsy.md): Covers the different BitsyGraph settings and how they can be changed
+* [Monitoring and Management](MonitoringAndManagement.md): Covers JMX and SLF4J integration
+* [Backup and Restore](BackupAndRestore.md): Discusses offline/online backup and restore procedures
+* [Large Objects](LargeObjects.md): Handling large objects outside of Bitsy
+* [Fancy Indexes](FancyIndexes.md): Maintaining Lucene and Lucene-like indexes outside of Bitsy
+* [Roadmap](Roadmap.md): Lists planned releases
 
+## Licensing
 
-Have fun!
+Bitsy is available under an open-source [AGPLv3](http://www.gnu.org/licenses/agpl.html) license. 
+# Large Objects
+
+Seek operations while accessing large objects have a low amortized cost. So the ["No seek" design principle](DesignPrinciples.md) is not particularly useful for large objects. Bitsy's strength lies is quickly querying and updating graphs with a lot of small vertices and nodes in a transactional setting. 
+
+There are better technologies to store large objects, such as: 
+
+* A plain old file system: The java.io and java.nio packages are pretty efficient when it comes to storing, loading and streaming large files. You can use an ID or date-based mechanism to save the object. 
+* Clustered file systems like HDFS
+* Key-value store like [Apache Cassandra](http://cassandra.apache.org/)
+
+By moving large objects to an external store, you can conserve the memory used by Bitsy and reduce the time taken during startup. The rest of this section discusses how you can combine Bitsy with an external store for large objects while maintaining the ACID guarantees. 
+
+## Ensuring ACID with large objects
+
+You can place the references to the large objects in vertex or edge properties. The reference could be a URI to the file's path or the key to a key-value store. 
+
+When multiple transactions are operating on vertices and nodes, you can follow these rules to ensure ACID guarantees: 
+
+1. Save all large objects _before_ committing the vertices/edges with references to them
+1. Delete large objects _after_ the vertices/edges with the references to them have been deleted and committed
+1. _Do not modify_ any large object once it is created
+
+The last rule calls for immutable large-objects. This ensures that crashes in the middle of updating a large object don't corrupt the database. 
+
+### Example
+
+Here is a quick example. Consider an application that maintains a vertex per "person" in Bitsy. The image corresponding to each person is maintained in a file system. Using the above-mentioned approach you could model [CRUD operations](http://en.wikipedia.org/wiki/Create,_read,_update_and_delete) as follows: 
+
+**Creating the Person for the first time: **
+
+1. Save the image in a file, say /resource/<user-id>/<random-uuid>.jpg. By inserting a random [UUID](http://en.wikipedia.org/wiki/Universally_unique_identifier) to the path, we ensure that it won't conflict with an older image. 
+1. Create the person vertex with a property 'imageURI'
+1. Commit the changes to Bitsy
+
+**Reading the Person's details: **
+
+1. Read the person vertex
+1. If the image is needed, use the 'imageURI' property to lookup the image. 
+
+**Updating the Person's image: **
+
+1. Create a new image with a different UUID
+1. Change the 'imageURI' in the person vertex
+1. Commit the changes to Bitsy
+
+A crash in the middle of this process will leave the person node in its last known state, i.e., pointing to the old image file. 
+
+**Removing a Person: **
+
+1. Load the person vertex
+1. Keep track of the imageURI in a local variable
+1. Remove the vertex
+1. Commit the changes to Bitsy
+1. Remove the file
+
+A crash in the middle of this process will still leave the person node in a valid state. In the worst case, you could have an extra image file on the file sytem. 

MonitoringAndManagement.md

+# Monitoring and Management
+
+## JMX interface
+
+Bitsy exposes the following monitoring and management methods over JMX: 
+
+1. Get and set key tuning parameters: These parameters are exposed as JMX attributes. 
+1. Online backup: The backup() operation takes a String parameter that refers to the directory to which the database should be backed up. Refer to the page [Backup and Restore](BackupAndRestore.md) for information on restoring the database and making offline backups
+
+These can be accessed using [jconsole](http://docs.oracle.com/javase/7/docs/technotes/guides/management/jconsole.html) under com.lambdazen.bitsy. Here is a screenshot of jconsole showing the [three tuning parameters]((TuningBitsy.md) and operations to 
+
+* Backup the database while it is running, and 
+* Flush the transaction log (refer to "transaction log flush" in the [write algorithms page](WriteAlgorithms.md)).
+
+![Backing up using JConsole](images/JConsole.png)
+
+The above screenshot also shows an online backup to folder of "C:\tmp\backup" using JConsole. 
+
+## Logging
+
+Bitsy uses the [SLF4J API](http://www.slf4j.org/) to log messages. If the embedding application doesn't include an SLF4J implementation (like [Logback](http://logback.qos.ch/)), no log messages will be output. Otherwise, the log messages will go to the respective implementation. 
+
+You can set the log level to ERROR, INFO or DEBUG based on your requirements. The loggers used by Bitsy are named after the respective classnames, and always begin with "com.lambdazen". 

OptimisticConcurrency.md

+# Optimistic Concurrency Control
+
+Bitsy implements [optimistic concurrency control](http://en.wikipedia.org/wiki/Optimistic_concurrency_control) and does not lock on any vertex or edge when it is queried or updated. Instead, every queried vertex or edge is retrieved along with its _version number_. When a vertex or edge is _modified_, the version number is checked before the element is committed. If the version matches the latest version for that vertex/edge in the database, the commit goes through. Otherwise, the Bitsy database throws a _BitsyRetryException_. 
+
+**Note:** The version numbers are at the vertex and edge level. So if two different properties are updated concurrently by two transactions, one of them will fail with the BitsyRetryException. 
+
+## Simulating locks
+
+Vertices or edge that are queried, but _not updated_ by a transaction, do not participate in the version check. For example, if an edge is added between two vertices by a transaction, and a property is concurrently changed in one of the end-point vertices by a different transaction, both transactions will go through. 
+
+A transaction can simulate a lock on the vertex or edge, by either setting a property to the currently-held value, or calling the _markForUpdate()_ method defined in the concrete implementations of the Vertex and Edge interfaces, viz. BitsyVertex and BitsyEdge. This ensures that other transactions that update the same edge/vertex after the "lock" operation, but before this transaction commits, will fail with the BitsyRetryException. 
+
+An application can also implement locks in Java for pessimistic concurrency control. However, you should make sure that [deadlocks](http://en.wikipedia.org/wiki/Deadlock) are avoided using method such as lock ordering. 
+
+## Dealing with the BitsyRetryException
+
+**Note:** This section only contains recommendations. Please feel free to implement what makes sense. 
+
+The best way to deal with the _BitsyRetryException_ is to retry the transaction again, starting with the queries. One method to retry transactions is to model every transaction that operates on the database as an implementation of an ITransaction interface like this one:
+
+```java
+public interface Transaction<T> {
+    public T call(BitsyGraph graph) throws Exception;
+}
+```
+
+Now, you can use a transaction executor to execute these transactions multiple times as required. Here is an example:
+
+```java
+public class TransactionExecutor<T> {
+    public TransactionExecutor(BitsyGraph graph, int numTries) {
+        this.graph = graph;
+        this.numTries = numTries;
+    }
+
+    public T execute(Transaction<T> tx) throws Exception {
+        for (int i=0; i < numTries; i++) {
+            boolean commit = false;
+            try {
+                tx.call(graph);
+                commit = true;
+                break;
+            } catch (BitsyRetryException e) {
+                if (i < numTries - 1) continue; else throw e;
+            } finally {
+                graph.stopTransaction(commit ? Conclusion.SUCCESS : Conclusion.FAILURE);
+            }
+        }
+    }
+```
+
+The actual work can be implemented by named or anonymous classes:
+
+```java
+    // Say txExecutor is a field in this class
+    ITransaction sampleAddVertexTx = new ITransaction() {
+        public Object call(BitsyGraph graph) {
+            return graph.addVertex(null);
+        }
+    }
+
+    Object newVertexId = txExecutor.execute(sampleAddVertexTx);
+```
+# Roadmap
+
+## Version 1.0
+
+**Planned for July 15, 2013**: Bug-fixes and minor feature requests based on feedback from Beta. 
+
+## Version 1.5
+
+**Planned for Q4, 2013**: Version 1.5 will focus on fitting larger graphs into memory. 
+
+### Features
+
+* Reduce memory footprint taken up by the graph data structures
+* Improve startup performance
+* Minor features, performance improvements and bug-fixes
+* Move to Java 8, which includes a better ConcurrentHashMap (smaller, faster)
+
+## Version 2.0
+
+**Planned for Q1 2014**: Version 2.0 will provide tools to build a _stateful cluster_ of Bitsy applications that operate on several graphs and expose remote operations to a stateless cluster of _client applications_. 
+
+Version 2.0 will still follow the [original design principles](DesignPrinciples.md), viz. No Seek, No Socket, No SQL, for the common-case. Sockets will be used in cache-able lookups to [Zookeeper](http://zookeeper.apache.org/) from the client applications to locate the database application that holds a specific graph. Sockets will also be used to perform transfers of graphs from one server in the stateful cluster to another. 
+
+
+There are no plans to support graphs that span multiple machines in the cluster. ACID is hard to achieve in this setting because automatic graph partioning is [NP-hard]([NP-hard](http://en.wikipedia.org/wiki/Graph_partition). Without partitioning, we need two-phase commits on multiple servers which [will slowdown Bitsy](http://en.wikipedia.org/wiki/Two-phase_commit#Disadvantages). 
+### Features
+
+* Support for multiple graphs in a Bitsy database (a form of sharding)
+* Integration with [Zookeeper](http://zookeeper.apache.org/) to locate the server that owns a graph
+* Methods to move a graph-shard from one database to another
+* Methods to perform a manual hot failover of an entire server
+

SerializationFormat.md

+# Serialization Format
+
+There are a few benefits to understanding the serialization format used by the Bitsy graph database: 
+
+* Inspect the text files to see if the vertex and edge properties set by the application are serialized correctly
+* Use text editors and text processing tools (like sed, awk, perl) to investigate the database contents
+* Fix corrupt data files: Ideally, you shouldn't have to do this. The [Backup and Recovery](BackupAndRecovery.md) section discusses how you can make online/offline backups and recover the database in case of failures. 
+
+All Bitsy files are text files encoded with the [UTF-8 charset](http://en.wikipedia.org/wiki/UTF-8). Each file consits of 0 or more records which are separated by a [Unix line separator (\n)](http://en.wikipedia.org/wiki/Newline). The format for a record is as follows: 
+
+    <record type>=<record contents>#<checksum>
+
+The record type is a single character and the checksum is a six-digit hexadecimal number. The rest of this page discusses the various record formats. 
+
+
+## Header
+
+The first record of every file is a header record of the form:
+
+    H=<Log number>#<Checksum>
+
+Header records are not used in any line other than the first line of a data file. The purpose of the header record is to identify the order in which the logs should be loaded into memory. It also helps identify partially re-organized vertex/edge log files which may occur if the system crashes in the middle of a reorganization process performed by the VEReorg thread. 
+
+**Note:** The log number is important to the database's consistency. You should not delete files that only have the header defined. 
+
+## Vertex
+
+The vertex record captures a vertex that is inserted, modified or deleted. The record has the following format: 
+
+    V={"id":"<ID>","v":<version>,"s":<state>,"p":<JSON-encoded map of properties>}#<checksum>
+
+The version is an integer and the state is either M/D referring to modified and deleted vertices (respectively). Any vertex record that has a version number that doesn't match the version number in the in-memory version is an _obsolete_ record and is removed during a re-organization process. 
+
+## Edge
+
+The edge record captures an edge and is similar to the vertex record. It has the following format (in a single line): 
+
+    E={"id":"<edge ID>","v":<version>,"s":<M/D>,\
+       "o":"<out vertex ID>","l":"<edge label>","i":"<in vertex ID>",\
+       "p":<JSON-encoded map of properties>}#<checksum>
+
+The properties "o", "l" and "i" refer to the outgoing vertex, edge label and incoming vertex (respectively).
+
+## Transaction
+
+A transaction record captures the end of a transaction flush to the log. It is only present in the transaction log files, viz. txA.txt and txB.txt. The purpose of this record is to capture a successful transaction commit. The format of the record looks like this: 
+
+    T=<long ID>#<checksum>
+
+The purpose of this record is to recover from crashes where a batch of transactions are only partially written to the transaction log by the _MemToTxLogWriter_ thread. The checksum facilitates the detection of a corrupt state caused by a partial flush. To recover the database to a valid state, Bitsy removes all records after the last valid T record, and doesn't load these records to the in-memory database during startup. 
+
+## Log
+
+A log record captures the end of a flush from the transaction log to the vertex/edge log. It is only present in vertex and edge log files, viz. vA.txt, vB.txt, eA.txt and eB.txt. The format of this record looks like this:
+
+    L=<log counter>#<checksum>
+
+The log counter used here is the log counter of the next transaction log to be flushed into this V/E log. The purpose of this record is to recover from crashes that occur when a transaction log is only partially flushed to a V/E log. Bitsy truncates all records that follow an L record, if its log counter matches that of the header record in txA.txt or txB.txt. 
+
+# Detailed constructor
+
+BitsyGraph supports a simple constructor that just takes a database path as shown in the "Embedding Bitsy" section in the [home page]. In addition to this, Bitsy a supports a detailed constructor which takes four parameters:
+
+1. **Path dbPath**: This is the path to the database directory which must be created before launching the application. The directory should preferably not be used for other purposes. 
+1. **boolean allowFullGraphScans**: If set to false, the methods getVertices() or getEdges() defined in BitsyGraph will throw an exception. The methods getVertices(key, value) and getEdges(key, value) will also throw exception unless a [key index] is defined for the given key. You can set this to false to ensure that all Bitsy operations are executed quickly. *Defaults to true.* 
+1. **int txLogThreshold**: The transaction log threshold is the size of the transaction log in bytes upon reaching which the transaction log flusher moves the records to vertex and edge logs. A higher number indicates fewer file operations, but more disk space and startup time in the worst case. Refer to the "transaction log flush" algorithm in the [Write Algorithms] page. *Defaults to 4MB.* 
+1. **double reorgFactor**: The reorganization factor. Reorganization is triggered only when the total number of new vertices and edges added is more than the factor multiplied by the original number of vertices and edges. Refer to the "vertex and edge log reorganization algorithm" algorithm in the [Write Algorithms] page. A reasonable range for this value is between 0.1 and 10. *Defaults to 1.*
+
+## Changing the parameters in runtime
+
+All of the above parameters, except _dbPath_, can be changed in runtime. The application can programatically change the parameters using the setter methods in BitsyGraph. 
+
+These setter methods are also [exposed as JMX attributes](MonitoringAndManagement.md) and can be changed using **jconsole** or an alternate JMX client. 
+
+## Memory-only constructor
+
+BitsyGraph supports an _empty constructor_ that creates a _non-durable_ memory-only graph. This is could be useful for unit tests and non-durable applications. It is similar to TinkerGraph, the reference implementation for Blueprints, but implements [optimistic concurrency control](OptimisticConcurrency.md). 
+
+[key index]: https://github.com/tinkerpop/blueprints/wiki/Graph-Indices
+[Write Algorithms]: WriteAlgorithms.md
+[home page]: Home.md

WriteAlgorithms.md

+# Write Algorithms
+
+Bitsy is a durable in-memory database. The vertices and edges in the graph are maintained in various Java Map data structures. Changes made to the database are logged to files. These files are routinely compacted and re-organized to remove obsolete information. The rest of this section describes the various algorithms involved in writing transactions to files. The following illustration describes the various buffers and processes. 
+
+![Bitsy Write Algorithms](images/BitsyWriteAlgorithms.png)
+
+## Transaction commit algorithm
+
+A transactional context is tied to every thread that operates on Bitsy. When the vertices and edges are queried in a transaction, the elements are copied to the transaction context along with the version number. 
+
+When a transaction is committed, the database performs the following steps:
+
+1. Serialize all updated vertices and edges using the [Jackson JSON processor], after incrementing the version number
+1. Grab an exclusive lock on the graph
+1. Check to see if all updated vertices and edges have the same version number at time of the commit operation
+1. Apply the changes to the in-memory database
+1. Release the exclusive lock on the graph
+1. Enqueue the transaction details to a transaction buffer
+1. Wait till the transaction buffer is flushed to the file
+
+The commit algorithm tries to minimize the time for which the exclusive write lock is held. It also ensures that the method does not return till the commits are written (and forced) to disk. 
+
+## Transaction buffer flush algorithm
+
+There are two transaction buffers (A/B) which are in-memory queue implementations. One of them is the _enqueue buffer_ to which transaction details are enqueued. The other buffer is called the _flush buffer_. 
+
+A transaction-buffer flusher thread, named MemToTxLogWriter-<id>, runs in the background performing the following steps in an infinite loop:
+
+1. Wait till the enqueue buffer has more than one transaction
+1. Swap the enqueue/flush buffers (A -> B or B -> A)
+1. Pick up transactions from the flush buffer and write them to a _transaction log_ (see next section)
+1. Force the updates to disk
+1. Notify all the processed transactions (see step 7 in the previous section)
+
+The idea behind the dual queue system is to:
+
+* avoid the file write operations from blocking the transaction commit operations, 
+* reduce rotational latency by writing as much as possible at once and then forcing the updates to disk, and
+* maximize the throughput when there are a lot of queued-up transactions. 
+
+The write throughput in the [benchmarks page](Benchmarks.md) shows the decreasing impact of rotational latency as the number of concurrently writing threads increases. 
+
+## Transaction log flush algorithm
+
+The transaction logs are organized in a similar fashion to the transaction buffers. There are two files (txA.txt and txB.txt). One of them is the _enqueue log_ and the other one is _flush log_. 
+
+When the enqueue transaction log exceeds the transaction log threshold (configurable with a default of 4MB), the transaction-log flusher thread, named TxFlusher-<id>, wakes up and performs the following steps:
+
+1. Swap the enqueue/flush transaction logs: This is not a file rename operation. It simply instructs the transaction-log flusher to use the other log file. 
+1. Read records from the flush transaction log and write the up-to-date vertices and edges to the vertex and edge logs. The obsolete vertices and edges are ignored. 
+
+## Vertex and edge log reorganization algorithm
+
+The vertex and edge logs are organized in a similar fashion to the transaction logs. There are two sets of vertex and edge files, named vA.txt and vB.txt, and eA.txt and eB.txt. One of the sets of logs is the enqueue logs to which the transaction logs are flushed. 
+
+When the total number of vertices and edges in the vertex/edge logs exceeds the previously known size (say N) by the _reorganization factor_ times N, a thread named the VEReorg-<id> wakes up and performs the following steps: 
+
+1. Block all further transaction log flushes
+1. Read all vertices from the enqueue vertex log and write the up-to-date records to the flush vertex log
+1. Perform a similar copy operation on the edge logs
+1. Delete the old enqueue logs
+1. Swap the enqueue/flush status of the V/E logs so that future transaction log flushes will go to the correct file
+
+Unlike the previous two dual-queue systems, a re-organization of the vertex and edge logs will block flushes from the transaction logs. However, this doesn't adversely affect performance as long as the file operations catch up with the updates
+
+## Online backup algorithm
+
+When an online backup operation is triggered through the [JMX console](MonitoringAndManagement.md), the database performs the following steps:
+
+1. Trigger a transaction flush and wait for it to complete
+1. Copy the "enqueue" vertex and edge logs to the given backup directory
+
+This algorithm ensures that a _valid snapshot_ of the database is copied to the given backup directory. All transaction log flush operations will wait till step 2 is complete. To restore the database, you simply have to delete the contents of the database directory and overwrite it with the contents of the backup directory. 
+
+## Asynchronous commits (future)
+
+The first two algorithms (transaction commit and transaction buffer flush) can easily modified to support asynchronous commits for asynchronous server applications. Specifically, the commit method can return after step 6 in the transaction commit algorithm, and step 5 in the transaction buffer flush algorithm can invoke callbacks to the asynchronous application. If you are building an asynchronous server with Bitsy and would like to have this feature, please [create an issue](https://bitbucket.org/lambdazen/bitsy/issues). 
+
+**Note:** Asynchronous commits are currently not available in the [Blueprints API](https://github.com/tinkerpop/blueprints/wiki). However, the API supports the notion of a ThreadedTransactionalGraph which could implement a method such as commit(CallbackInterface). 
+
+[Jackson JSON processor]: http://jackson.codehaus.org/
+

images/BitsyReadThroughput-1.png

Added
New image

images/BitsyReadThroughput-2.png

Added
New image

images/BitsyWriteAlgorithms.png

Added
New image

images/BitsyWriteThroughput.png

Added
New image

images/HardDiskHead.jpg

Added
New image

images/JConsole.png

Added
New image