JCR at eXo Platform
eXo Platform uses its JCR component to store almost all its data (portal, social, wiki, forum, calendar, ecms, etc.). The only exceptions are the data related to users, groups, and roles that are stored in dedicated tables through JBoss PicketLink.
Using JCR in the early days of eXo Platform was an interesting and natural choice because
- it is a Java standard (JSR 170 and JSR 283);
- it allows the abstraction of data storage behind a standard API, meaning the data-storage system can be changed without affecting the application code;
- it offers a broad set of interesting features out of the box (hierarchical data structure, querying, versioning, export/import, and so on); and
- it provides a flexible schema.
But as we always do in the IT world and at eXo Platform, we question our choices. This has led us to start dropping the JCR technology progressively from some of our components and applications. Here are the main reasons:
- JCR is not a silver bullet (neither is any other tool). Because of its flexible data model, we probably overused it, even where it did not make much sense. JCR is a good technology for storing hierarchical data structures with nodes and properties, which is typically what we need for our content and document management. That’s why we’ll definitely keep it for our ECMS application. In other words, “the right tool for the right job!”
- Readable database tables. JCR uses a generic data model that results in very generic tables (items, attributes, etc., similar to an EAV model). While this brings flexibility to application data modeling, it also makes the data very hard to read, debug, hack, back up, or import. We believe that having meaningful tables and columns will help devs and ops substantially.
- Performance. Despite many optimizations to our JCR implementation, which is very mature today, we needed to go a step further and find additional ways to sustain an ever-growing volume of relationships between data while keeping performance and scalability at comparable levels, or even improving them. Having dedicated tables for our applications will allow us to optimize the model for our specific use cases and to do finer tuning (of database indexes, for instance). This will help increase performance substantially. (We have already achieved great improvements in our benchmarks.)
- JCR’s decreasing adoption. Although the initial goal of JCR was promising, it did not reach its objectives in terms of adoption. The latest specification is already seven years old (JCR 2.0 in 2009), and the community is not very active. We cannot tell the future, but these are not good signs. This also implies that it is harder to find people skilled in this technology than in more widespread ones (like JPA).
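To make the "readable tables" point more concrete, here is a minimal plain-Java sketch that contrasts a generic, EAV-style row with a row from a dedicated table. Everything in it is an assumption made for illustration: the `EavRow` and `WikiPageRow` classes, the `title` and `author` properties, and the in-memory lists standing in for database tables. It is not the actual eXo JCR schema.

```java
import java.util.ArrayList;
import java.util.List;

public class EavVsDedicated {

    // One row of a generic EAV-style table: item id, property name, value as text.
    // (Hypothetical shape, for illustration only.)
    static final class EavRow {
        final String itemId;
        final String name;
        final String value;

        EavRow(String itemId, String name, String value) {
            this.itemId = itemId;
            this.name = name;
            this.value = value;
        }
    }

    // One row of a dedicated table with meaningful, typed columns.
    // (Hypothetical table, for illustration only.)
    static final class WikiPageRow {
        final long id;
        final String title;
        final String author;

        WikiPageRow(long id, String title, String author) {
            this.id = id;
            this.title = title;
            this.author = author;
        }
    }

    // Rebuilding one page from generic rows means scanning the rows,
    // matching property names by string, and converting values by hand.
    static WikiPageRow fromEav(String itemId, List<EavRow> rows) {
        String title = null;
        String author = null;
        for (EavRow row : rows) {
            if (!row.itemId.equals(itemId)) {
                continue;
            }
            if (row.name.equals("title")) {
                title = row.value;
            } else if (row.name.equals("author")) {
                author = row.value;
            }
        }
        return new WikiPageRow(Long.parseLong(itemId), title, author);
    }

    public static void main(String[] args) {
        List<EavRow> generic = new ArrayList<>();
        generic.add(new EavRow("42", "title", "Home"));
        generic.add(new EavRow("42", "author", "mary"));
        generic.add(new EavRow("43", "title", "FAQ"));

        WikiPageRow rebuilt = fromEav("42", generic);
        System.out.println(rebuilt.title + " by " + rebuilt.author);
    }
}
```

With a dedicated table, the same page is a single row whose columns can be read, indexed, and backed up directly, which is the benefit the list above describes for devs and ops.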
We have decided to add support for dedicated and meaningful database tables, and we chose JPA with Hibernate to manage these data. JPA is the API specification that describes the management of relational data in Java applications. Hibernate is one of the most widely used JPA implementations.
Also, in order to replace the JCR search capabilities, we will use Elasticsearch as the default search engine.
Where do we stand today?
The latest version of eXo Platform, 4.3, which was released at the beginning of this year, still runs fully on JCR, but we have prepared the groundwork for moving away from JCR progressively. The 4.3 version includes everything needed to start the migration, including
- a new datasource dedicated to JPA,
- integration with the JPA environment (persistence unit, entities, etc.),
- database data creation and update with Liquibase, and
- an indexing and search API that makes it easy to plug in any search engine.
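The idea of an indexing and search API that any engine can plug into could be sketched as follows. This is only an illustration under stated assumptions: `SearchConnector`, `InMemoryConnector`, and `findPages` are hypothetical names and do not reflect the actual eXo Platform API. The in-memory connector simply stands in for a real engine.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

public class SearchPluginSketch {

    // Hypothetical contract that application code depends on.
    interface SearchConnector {
        void index(String id, String text);

        List<String> search(String term);
    }

    // Stand-in engine that keeps documents in memory. A real connector would
    // forward the same two calls to an external search engine instead.
    static final class InMemoryConnector implements SearchConnector {
        private final Map<String, String> docs = new LinkedHashMap<>();

        @Override
        public void index(String id, String text) {
            // Store a lower-cased copy so that matching is case-insensitive.
            docs.put(id, text.toLowerCase(Locale.ROOT));
        }

        @Override
        public List<String> search(String term) {
            String needle = term.toLowerCase(Locale.ROOT);
            List<String> hits = new ArrayList<>();
            for (Map.Entry<String, String> entry : docs.entrySet()) {
                if (entry.getValue().contains(needle)) {
                    hits.add(entry.getKey());
                }
            }
            return hits;
        }
    }

    // Application code sees only the interface, so the engine can be swapped
    // without touching this method.
    static List<String> findPages(SearchConnector connector, String term) {
        return connector.search(term);
    }

    public static void main(String[] args) {
        SearchConnector connector = new InMemoryConnector();
        connector.index("wiki-1", "Getting started with eXo Platform");
        connector.index("wiki-2", "Release notes");
        System.out.println(findPages(connector, "platform"));
    }
}
```

The point of the design is that the calling code never names a concrete engine, so replacing the in-memory stand-in with a connector for another engine would not change `findPages`.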
At the same time as we released eXo Platform 4.3, we also released our new application called eXo Tasks. This application relies fully on dedicated tables using JPA; it does not use JCR at all. Tasks is bundled as an add-on and is also part of the eXo Platform Enterprise edition.
Since these releases, we have also started to migrate some applications, using the pluggable capabilities of eXo Platform to easily switch from the JCR implementation to the JPA implementation.
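As a rough illustration of what switching implementations behind a common contract can look like, here is a minimal plain-Java sketch. All names in it (`PageStorage`, `InMemoryStorage`, `choose`, `migrate`, the `"jpa"` switch value) are hypothetical and chosen for this example only. Two in-memory maps stand in for the JCR-backed and JPA-backed stores, and the copy step is an assumption about how a one-time migration might behave, not a description of the actual eXo mechanism.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class StorageSwitchSketch {

    // Hypothetical storage contract shared by both implementations.
    interface PageStorage {
        void save(String id, String content);

        String load(String id);

        Map<String, String> all();
    }

    // In-memory stand-in used here for both the legacy and the new store.
    static final class InMemoryStorage implements PageStorage {
        private final Map<String, String> pages = new LinkedHashMap<>();

        @Override
        public void save(String id, String content) {
            pages.put(id, content);
        }

        @Override
        public String load(String id) {
            return pages.get(id);
        }

        @Override
        public Map<String, String> all() {
            // Return a copy so callers cannot modify the store by accident.
            return new LinkedHashMap<>(pages);
        }
    }

    // Pick an implementation from a configuration value (hypothetical switch).
    static PageStorage choose(String kind, PageStorage jcr, PageStorage jpa) {
        return "jpa".equals(kind) ? jpa : jcr;
    }

    // Copy every page that the target does not hold yet and report how many
    // were copied. Running it a second time copies nothing.
    static int migrate(PageStorage from, PageStorage to) {
        int copied = 0;
        for (Map.Entry<String, String> entry : from.all().entrySet()) {
            if (to.load(entry.getKey()) == null) {
                to.save(entry.getKey(), entry.getValue());
                copied++;
            }
        }
        return copied;
    }

    public static void main(String[] args) {
        PageStorage legacy = new InMemoryStorage();
        PageStorage relational = new InMemoryStorage();
        legacy.save("home", "Welcome");
        legacy.save("faq", "Questions");

        System.out.println("copied: " + migrate(legacy, relational));
        PageStorage active = choose("jpa", legacy, relational);
        System.out.println(active.load("home"));
    }
}
```

Because the application only talks to `PageStorage`, the choice of backing store becomes a configuration concern rather than a code change.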
The first application being migrated is Wiki. The add-on using JPA is close to being released in its first stable version (1.0.0-RC3 as of this writing) and is already installed in production on the Tribe! So you can use this right now by installing it through the add-on manager with
./addon install exo-wiki-rdbms
Once installed, start eXo Platform. All the Wiki JCR data will be automatically migrated to new tables. Be aware, though, that this add-on is not supported yet.
The migration of social data is also on its way and should land very soon.
These two migrations will be part of eXo Platform 4.4. In that version, they will be integrated directly into the product rather than shipped as add-ons, and they will therefore be supported.
We also implemented the Elasticsearch integration as add-ons, two of them actually: ES Search and ES Embedded. The first one provides all the APIs and glue code needed to interact easily with an Elasticsearch cluster; the second one lets you start an Elasticsearch node directly in eXo Platform to use and test it quickly (not recommended in production).
The integration and capabilities of JPA and Elasticsearch will be detailed in an upcoming blog post. Stay tuned!