What’s new in eXo JCR 1.15?
A little more than a year after eXo JCR 1.14.0-GA, we are pleased to announce the release of eXo JCR 1.15.0-GA, our implementation of the JSR-170 spec. For this version, we mainly focused on improving the user experience based on feedback from our customers and consultants. In the next sections, I will present the main new features and enhancements that you will be able to benefit from starting with eXo Platform 4.
Dedicated tables per workspace
In previous versions of eXo JCR, we offered two storage modes:
- The single mode, where the data of all the workspaces is stored in the same tables.
- The multiple mode, where each workspace has its own tables, each set in a dedicated DB schema.
The first mode is easy to set up but not very scalable. The second mode is much more scalable but much harder to set up, especially when you have a lot of workspaces and you use a database like Oracle. A new mode has been added that combines the benefits of both previous modes without their drawbacks. This new mode, called isolated, stores the data of each workspace in dedicated tables within the same DB schema, which means that it is as easy to set up as the single mode and as scalable as the multiple mode.
Lucene indexes re-synchronized thanks to rsync
As you may know, eXo JCR relies on Lucene every time we execute a JCR query, and Lucene doesn’t provide any clustering capabilities out of the box. In eXo JCR 1.14, we offered two ways to manage the Lucene indexes in a cluster environment:
- Shared index: The Lucene indexes are stored in a shared file system, and only the master has write access to the shared indexes.
- Local index: The Lucene indexes are stored locally on each cluster node, and each node is responsible for updating its own copy of the indexes.
The first approach easily ensures consistency, as all nodes read the same indexes, but it relies on a shared file system, which slows down both read and write accesses. The second approach gives much better read and write performance because all accesses are local, but since each node is responsible for keeping its own copy of the Lucene indexes up to date, the index content can differ from one cluster node to another.
This new version offers a third approach in which only the master is in read/write mode and the other nodes are read-only. Each node has its own copy of the indexes, which is resynchronized from the master using the rsync command. This way we get both good performance and consistency.
Features related to the extension mechanism
In versions prior to JCR 1.12, when we wanted to customize the portal, we had to modify the portal binaries directly, which made upgrades and support complex. Starting from JCR 1.12 / PLF 3.0, we added the extension mechanism, which allows us to customize our portal application without modifying the portal binaries. This solved the initial problem but unfortunately brought new ones, which we fixed thanks to the features described in the next subsections.
Use System properties to configure your workspace
One of the features of the extension mechanism allows you to add or redefine configuration files, including the configuration files of your repository/workspace, from different war files. As a direct consequence, we no longer have one single file containing the whole JCR configuration but several small ones distributed over different war files. Thanks to this new feature, you can now change the configuration of one or all of the registered workspaces, wherever their configuration files are located. This is done simply with System properties whose names follow a specific syntax. For example, let’s say that I want to set the max-buffer-size to 300k for all the registered workspaces. To do so, I just need to add the system property -Dexo.jcr.config.force.all.container.max-buffer-size=300k to my startup command.
The extension mechanism is re-loadable
Developing an extension can be tedious, especially because the extension mechanism didn’t support hot reloading so far, meaning that we had to restart the server regularly. Starting from this version, when you are in developing mode, the extension mechanism is able to detect a change and, depending on what has been modified, automatically reload only one portal container, a set of portal containers or all the containers.
Exclude nodes from query results
Until now, all nodes were indexed automatically, but you may sometimes need to exclude some nodes from JCR query results. For example, if you have a folder representing the trash, you may want to prevent any JCR query from returning anything from it. It is now possible to exclude from indexing a subtree of nodes using a path, all nodes of a given node type, or both.
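To make the path / node type / both combinations concrete, here is a minimal sketch in plain Java. It is not eXo JCR's indexing code or its configuration format: the class IndexExclusionRule and its method are hypothetical, and the sketch assumes that when both a path and a node type are given, a node must match both to be excluded.

```java
/**
 * Hypothetical illustration of an index exclusion rule.
 * Not part of the eXo JCR API; names and semantics are assumptions.
 */
final class IndexExclusionRule {
    private final String pathPrefix; // subtree root, or null for "any path"
    private final String nodeType;   // primary node type, or null for "any type"

    IndexExclusionRule(String pathPrefix, String nodeType) {
        if (pathPrefix == null && nodeType == null) {
            throw new IllegalArgumentException("A path, a node type, or both are required");
        }
        this.pathPrefix = pathPrefix;
        this.nodeType = nodeType;
    }

    /** Returns true when the node should be left out of the index. */
    boolean excludes(String nodePath, String primaryType) {
        boolean pathMatches = true;
        if (pathPrefix != null) {
            // Append "/" so that "/Trash" does not match "/TrashCan".
            String subtree = pathPrefix.endsWith("/") ? pathPrefix : pathPrefix + "/";
            pathMatches = nodePath.equals(pathPrefix) || nodePath.startsWith(subtree);
        }
        boolean typeMatches = nodeType == null || nodeType.equals(primaryType);
        // Assumption: when both criteria are set, both must match.
        return pathMatches && typeMatches;
    }
}
```

With a rule such as new IndexExclusionRule("/Trash", null), every node under /Trash would be skipped, while a sibling folder like /TrashCan would still be indexed.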
CND file format support
You can now write your node type definitions using the “Compact Node Definition” (CND) format that has been specified in the JCR 2.0 spec. This format has been designed to be much more natural and easier to read and write than the old XML format. It also significantly reduces the size of your node type definition files.
Here is how we define the node type nt:base with the old XML syntax:
<nodeType name="nt:base" isMixin="false" hasOrderableChildNodes="false" primaryItemName="">
  <propertyDefinitions>
    <propertyDefinition name="jcr:primaryType" requiredType="Name" autoCreated="true" mandatory="true" onParentVersion="COMPUTE" protected="true" multiple="false">
      <valueConstraints />
    </propertyDefinition>
    <propertyDefinition name="jcr:mixinTypes" requiredType="Name" autoCreated="false" mandatory="false" onParentVersion="COMPUTE" protected="true" multiple="true">
      <valueConstraints />
    </propertyDefinition>
  </propertyDefinitions>
</nodeType>
Here is how we define the same node type with CND:
[nt:base] abstract
  - jcr:primaryType (NAME) mandatory autocreated protected COMPUTE
  - jcr:mixinTypes (NAME) protected multiple COMPUTE
Write operations optimizations
In terms of performance, we mainly focused on improving write operations by optimizing the change log, both to make it easier to access and to get rid of useless operations. For example, if eXo JCR detects that you asked to update the same property several times, only the last change is taken into account, which reduces the total number of queries to be executed by the database. This helps reduce I/O (network and disk).
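The idea of keeping only the last change per property can be sketched as follows. This is an illustration under assumptions, not eXo JCR's actual change log: the class PropertyChangeLog, its methods and the property paths are hypothetical, and values are simplified to strings.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/**
 * Hypothetical change log that collapses repeated updates of the same property.
 * Not the eXo JCR implementation; a simplified illustration only.
 */
final class PropertyChangeLog {
    // One entry per property path, holding only the most recent value.
    private final Map<String, String> lastValueByPath = new LinkedHashMap<>();
    private int recordedUpdates;

    /** Records an update; an earlier pending update of the same property is dropped. */
    void update(String propertyPath, String value) {
        recordedUpdates++;
        // Remove first so the entry is re-inserted at the position of its latest update.
        lastValueByPath.remove(propertyPath);
        lastValueByPath.put(propertyPath, value);
    }

    /** Number of updates requested by the caller. */
    int recordedUpdates() {
        return recordedUpdates;
    }

    /** Writes that would actually be sent to the database, in order. */
    Map<String, String> pendingWrites() {
        return new LinkedHashMap<>(lastValueByPath);
    }
}
```

If a session updates /doc/title three times and /doc/author once, four updates are recorded but only two writes remain pending, one per property.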
Another interesting improvement is the ability to enable update batching, which can really improve performance, especially when you do a lot of operations of the same type (CREATE or UPDATE or DELETE) within the same transaction and you have a remote database. Our internal tests showed that we could reduce the first startup time of an eXo Platform instance by up to 40%, depending on the configuration used.
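To see why runs of same-type operations benefit from batching, the sketch below groups consecutive operations of the same type so that each group could be sent to the database in one round trip (for instance with JDBC's PreparedStatement.addBatch and executeBatch). The class OperationBatcher and its grouping strategy are assumptions made for illustration; the actual batching logic in eXo JCR may differ.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical helper that groups consecutive operations of the same type.
 * Not the eXo JCR implementation; an illustration of the batching idea only.
 */
final class OperationBatcher {
    enum Type { CREATE, UPDATE, DELETE }

    /** Splits the operations into runs of consecutive operations of the same type. */
    static List<List<Type>> groupConsecutive(List<Type> operations) {
        List<List<Type>> batches = new ArrayList<>();
        List<Type> current = null;
        for (Type operation : operations) {
            // Start a new batch whenever the operation type changes.
            if (current == null || current.get(0) != operation) {
                current = new ArrayList<>();
                batches.add(current);
            }
            current.add(operation);
        }
        return batches;
    }
}
```

Under this grouping, a transaction made of three CREATEs, one UPDATE and two DELETEs would need three round trips instead of six, and the gain grows with the length of each run and with the network latency to the database.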
Smaller memory footprint
To improve the scalability of a product, it is important to reduce its memory footprint. We did so by reviewing our code in order to reuse as many existing objects as we could.
Use JCR Action as validator
JCR actions allow you to run custom logic before and after a JCR operation. In previous versions, when an action failed, the exception was logged but the JCR operation was still performed normally. It is now possible to raise an exception from your JCR action and decide yourself how the JCR should behave in that case. This improvement helps you implement data validation that cannot be expressed at the NodeType definition level. A good example is validating a property whose value depends on the value of another property, such as a start date and an end date: we don’t want the end date to be before the start date.
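The start date / end date check itself could look like the sketch below. It only shows the validation logic that such an action might run; how the check is wired into an eXo JCR action and how the properties are read from the node are not shown, and the class DateRangeValidator is a hypothetical name.

```java
import java.time.LocalDate;

/**
 * Hypothetical cross-property validation, as might be called from a JCR action.
 * Not part of the eXo JCR API; the wiring into an action is intentionally omitted.
 */
final class DateRangeValidator {
    /** Throws if the end date is before the start date, so the caller can reject the change. */
    static void validate(LocalDate startDate, LocalDate endDate) {
        if (endDate.isBefore(startDate)) {
            throw new IllegalArgumentException(
                "End date " + endDate + " must not be before start date " + startDate);
        }
    }
}
```

An action calling validate before the operation is applied would let the exception propagate, which, with the new behavior, gives you the choice to abort the JCR operation instead of only logging the failure.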
Shutdown your cluster easily
In previous versions, when you had a cluster under load and you wanted to stop it safely, you had to suspend all the running transactions on your cluster using JMX before stopping all your nodes. In this version, we improved the inter-node communication so that transactions are suspended over the cluster automatically as soon as we detect that you are trying to shut down one or several nodes. This should make administrators’ lives easier.
Other Interesting Enhancements
- The indexing process of a workspace can be long. So, to give you a better idea of what’s left, we compute the total number of nodes to be indexed in a non-blocking manner and provide it as soon as we have it.
- The result of a JCR query can now be sorted case insensitively.
- The code of the whole eXo JCR’s stack has been reviewed to be compatible with JBoss AS 7 and its new classloading strategy.
- eXo JCR supports Java 7.