Technical Aspects

File Storage Interface

The file storage interface is built on the Legacy, Storage and Document classes. A Legacy can be encrypted, in which case it holds and controls the encryption key. Each Legacy defines two paths:

base: the root directory under which the Legacy keeps its Storages and Documents
outbase: the directory where the Legacy stores copies of files when the COPY function is called

Each Storage should be considered as the mount point of an independent filesystem, so a Legacy is an aggregation of several filesystems.

All Storages in one Legacy must have the same size. Storages are created automatically when needed. Even so, the Storage administrator should create and mount the filesystems, and then create the corresponding LSDStorage entries, using the LSDAdmin function (or the Web interface).

Warning: The size declared in the Legacy must be less than the real size of the filesystem, since encryption adds some bytes to each file. In general, depending on the average file size and on whether encryption is enabled, declaring 5 percent less than the real size should be enough.
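As a worked example of the 5 percent rule, here is a minimal sketch of the arithmetic. The class and method names are hypothetical and are not part of OpenLSD; the margin is a parameter so that it can be adjusted to the average file size.

```java
// Hypothetical helper, not part of OpenLSD: computes the size to declare
// for a Storage from the real size of its filesystem.
final class StorageSizing {

    /** Returns realSizeBytes minus a headroom of marginPercent percent. */
    static long declaredSize(long realSizeBytes, int marginPercent) {
        if (realSizeBytes < 0 || marginPercent < 0 || marginPercent >= 100) {
            throw new IllegalArgumentException("invalid size or margin");
        }
        // Divide before multiplying so very large sizes cannot overflow.
        return realSizeBytes - (realSizeBytes / 100) * marginPercent;
    }

    public static void main(String[] args) {
        long real = 100L * 1024 * 1024 * 1024; // a 100 GiB filesystem
        System.out.println(declaredSize(real, 5) + " bytes to declare");
    }
}
```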

OpenLSD takes the filesystem block size into account but not the crypto block size, although a later version could.

Network Protocol

The OpenLSD message-passing concept is based on the serialization of objects between clients and servers. One could object that this is inefficient, but since OpenLSD uses an excellent NIO framework (MINA), it handles serialization and deserialization with high performance and bandwidth (please see our benchmarks), using non-blocking sockets and a multi-threaded approach. This also secures access to the Legacies, the front entry of the system, since they cannot be reached through standard protocols such as FTP or HTTP. This does not mean that FTP and HTTP cannot be used on the client side. In fact, the application example shows how to use a servlet to store or retrieve a document over HTTP. This is done by what we can call an « LSD proxy »: the Tomcat server (in the example) implements a proxy for OpenLSD, talking HTTP to the client on one side (get or put) and using an LSD tunnel to the LSD server on the other side.

From version 1.0, OpenLSD supports an HTTP connector for short functions on the OpenLSD Server and the OpHandler Server (statistics, shutdown):

Info on Sessions: shows some statistics on global sessions by Handler
Info on Storage: shows a short description with some statistics on Physical Storages (OpenLSD Server only)
Full info on Storage: shows more detailed statistics on Physical Storages, but can take a long time (depending on disk access speed and number of files) (OpenLSD Server only)
Clean: triggers a garbage collection run in the JVM (on OpHandler, also cleans the pool of connections)
Shutdown: shuts down the OpenLSD Server or OpHandler Server (protected by a password)

With the CSV option, the data are returned in CSV format.

Message Protocol

Each message is either a command (client to server) or an answer (server to client). Each command is of one of two types, set by the unique field in the message or session objects:

Unique: only one command and one answer are exchanged during the session. Once the command and answer are done, the session is closed immediately and automatically.
Non Unique: multiple commands and answers can occur, so the session must be closed explicitly, or it will be closed after an inactivity timeout.

Some messages are based on multiple blocks, such as the protocol for sending or receiving a document. When a document is carried over the network, it is split into blocks according to its size (in the message or session objects, fileblocksize is the size of one block, rankblock the rank of the current block and bytesblock the bytes of the current block; filesize, in the session object, is the total file size). For instance, a document of 1 MB could be split into 64 blocks of 16 KB each. The size of one block is a parameter of OpenLSD. Some messages do not need multiple-block support (such as an administration message to stop or start a service).
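The block arithmetic can be sketched as follows. This is only an illustration: the class and method names are hypothetical, the parameter names merely echo the fileblocksize, rankblock and filesize fields, and the 0-based rank is an assumption, since the numbering used by OpenLSD is not stated here.

```java
// Hypothetical helper, not OpenLSD code: how a file of fileSize bytes
// maps onto blocks of fileBlockSize bytes.
final class BlockSplit {

    /** Number of blocks needed for the file (ceiling division). */
    static long blockCount(long fileSize, int fileBlockSize) {
        if (fileSize < 0 || fileBlockSize <= 0) {
            throw new IllegalArgumentException("invalid sizes");
        }
        return (fileSize + fileBlockSize - 1) / fileBlockSize;
    }

    /** Bytes carried by block rankBlock (0-based here, an assumption). */
    static int bytesInBlock(long fileSize, int fileBlockSize, long rankBlock) {
        if (fileBlockSize <= 0 || rankBlock < 0) {
            throw new IllegalArgumentException("invalid block size or rank");
        }
        long offset = rankBlock * fileBlockSize;
        if (offset >= fileSize) {
            throw new IllegalArgumentException("rank beyond end of file");
        }
        // Every block is full except possibly the last one.
        return (int) Math.min(fileBlockSize, fileSize - offset);
    }

    public static void main(String[] args) {
        // The example from the text: 1 MB in blocks of 16 KB.
        System.out.println(blockCount(1024 * 1024, 16 * 1024) + " blocks");
    }
}
```

A file that is not an exact multiple of the block size simply ends with a shorter block.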

For each multiple block message, two implementations have been made:

Acknowledging protocol: each block must be acknowledged before the next block is sent. The advantage of this protocol is that memory usage is limited to one block at a time per session. The disadvantage is that if the network is slow or latency is high, performance is bounded by the network.
Non-Acknowledging protocol (NAck): only the first and the last blocks are acknowledged, so the sender sends all the blocks without waiting for the receiver's acknowledgment. The advantage is that performance is no longer limited by the network; the limit becomes memory, since a whole file can be in memory while it is sent or received, which can be a disadvantage. However, this memory is held only for a short time, just until each block is consumed by the receiver or actually sent by the sender.
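The memory trade-off between the two protocols can be illustrated with a small in-process simulation. This is not the OpenLSD or MINA implementation: an in-memory queue stands in for the network, all names are hypothetical, and the NAck case is simplified to a single wait at the end instead of acknowledging the first and last blocks.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;

// Simulation only: a queue plays the role of the network.
final class AckDemo {

    /** Sends blockCount blocks and returns the peak number of blocks in flight. */
    static int peakInFlight(final int blockCount, int blockSize, boolean ackEachBlock) {
        final BlockingQueue<byte[]> wire = new LinkedBlockingQueue<>();
        final Semaphore acks = new Semaphore(0);
        final AtomicInteger inFlight = new AtomicInteger();

        Thread receiver = new Thread(() -> {
            try {
                for (int i = 0; i < blockCount; i++) {
                    wire.take();                // consume one block
                    inFlight.decrementAndGet();
                    acks.release();             // acknowledge it
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        receiver.start();

        int peak = 0;
        try {
            for (int rank = 0; rank < blockCount; rank++) {
                peak = Math.max(peak, inFlight.incrementAndGet());
                wire.put(new byte[blockSize]);
                if (ackEachBlock) {
                    acks.acquire();             // wait for this block's acknowledgment
                }
            }
            if (!ackEachBlock) {
                acks.acquire(blockCount);       // simplified: one wait at the very end
            }
            receiver.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("transfer interrupted", e);
        }
        return peak;
    }

    public static void main(String[] args) {
        System.out.println("Ack peak:  " + peakInFlight(64, 16 * 1024, true));
        System.out.println("NAck peak: " + peakInFlight(64, 16 * 1024, false));
    }
}
```

With acknowledgments the peak is always one block. Without them it depends on how fast the receiver drains the queue and can reach the whole file.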

I implemented a final blocking function that waits until the underlying asynchronous command is done (waitForAllBlocks and endedAllBlocks in the session objects). It is needed to finalize some actions, mainly on the client side.
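One common way to build such a blocking wait in Java is a CountDownLatch. The sketch below reuses the two method names mentioned above, but their signatures, the timeout parameter and the class itself are assumptions and may differ from the real session objects.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical session sketch: signatures are assumptions, not OpenLSD's.
final class TransferSessionSketch {
    private final CountDownLatch done = new CountDownLatch(1);

    /** Called by the I/O side once the last block has been handled. */
    void endedAllBlocks() {
        done.countDown();
    }

    /** Blocks until endedAllBlocks() was called; returns false on timeout or interruption. */
    boolean waitForAllBlocks(long timeoutMillis) {
        try {
            return done.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // keep the interrupt flag for the caller
            return false;
        }
    }

    public static void main(String[] args) {
        TransferSessionSketch session = new TransferSessionSketch();
        new Thread(session::endedAllBlocks).start();   // simulated asynchronous completion
        System.out.println("finished: " + session.waitForAllBlocks(5000));
    }
}
```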

For importers local to the OpenLSD service, I implemented a specific message that does not transmit the file to be imported but only its full access path. This implementation is obviously the most efficient. Importers running over the network also perform well, but are limited by the network bandwidth.

A KeepAlive filter (inspired by MINA's KeepAlive filter) has been inserted in the message protocol. It allows connections to be kept open over the long term (Web pool of connections, OpHandler pool of connections …). However, after a long period of inactivity (around 10 minutes), the connection is closed so as not to keep very long-lived connections open.

Database Aspects

OpenLSD uses a JDBC interface to access the database. After some tests, it appeared that a big framework such as Hibernate could be a problem in applications that do not already use Hibernate objects. Moreover, its performance was very bad on specific tasks compared to « manual SQL code ». So my implementation uses direct JDBC connections and no persistence objects. However, I decided to write one class for each kind of object, as with Hibernate objects, but using my own underlying implementation (LSDDbX classes). This allowed me to introduce a special class (LSDSpecific) that holds the functions depending on the database actually used. Indeed, Oracle, PostgreSQL and MySQL do not offer the same SQL features, nor the same performance for a given way of writing the SQL code. So for very specific code, this class calls specific subclasses according to the database currently in use (LSDMySQLSpecific, LSDPostGreSQLSpecific and LSDOracleSpecific).
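The idea of one facade class dispatching to database-specific subclasses can be sketched as follows. Everything here is illustrative: the class names, the limitRows method, the selection by JDBC URL prefix and the docs table are hypothetical, and the real LSDSpecific classes may be organized differently. Row limiting is used as the example because its syntax really does differ between these databases.

```java
// Hypothetical sketch of per-database SQL dialect dispatch.
abstract class SpecificSketch {

    /** Wraps a SELECT so that it returns at most maxRows rows. */
    abstract String limitRows(String select, int maxRows);

    /** Picks the dialect from the JDBC URL prefix (an assumption for this sketch). */
    static SpecificSketch forJdbcUrl(String url) {
        if (url.startsWith("jdbc:mysql:")) return new MySqlSketch();
        if (url.startsWith("jdbc:postgresql:")) return new PostgreSqlSketch();
        if (url.startsWith("jdbc:oracle:")) return new OracleSketch();
        throw new IllegalArgumentException("unsupported database: " + url);
    }

    public static void main(String[] args) {
        String select = "SELECT id FROM docs"; // hypothetical table
        System.out.println(forJdbcUrl("jdbc:oracle:thin:@host:1521:db").limitRows(select, 10));
    }
}

final class MySqlSketch extends SpecificSketch {
    @Override String limitRows(String select, int maxRows) {
        return select + " LIMIT " + maxRows;
    }
}

final class PostgreSqlSketch extends SpecificSketch {
    @Override String limitRows(String select, int maxRows) {
        return select + " LIMIT " + maxRows;
    }
}

final class OracleSketch extends SpecificSketch {
    @Override String limitRows(String select, int maxRows) {
        // Classic Oracle idiom: filter on ROWNUM around a subquery.
        return "SELECT * FROM (" + select + ") WHERE ROWNUM <= " + maxRows;
    }
}
```

The calling code only ever sees the abstract class, so the database-specific SQL stays in one place per database.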

Specific Java problems

I also had to face some specific and quite odd problems in Java.

One was that when you start to work in 64 bits, you assume that everything works with sizes up to 2^64, but it does not. Almost all objects in Java (even 1.6), such as arrays and collections, are indexed by a 32-bit int and so are limited to about 2^31 elements. Specifically, I had to create a List that allows more elements than that, the LinkedLongList. However, I try to limit its usage as much as possible since memory would then be the problem (it is only used to get the list of Storages for one Legacy). At the time of writing, I suspect that this is not a real problem, since having that many real servers is probably not very common…
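A list whose size and index are long values can be sketched with a singly linked chain of nodes, which never relies on an int-indexed array. This is not the actual LinkedLongList code; the class name and its methods are hypothetical.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical sketch: a linked list counted and indexed with long.
final class LongCountedList<E> implements Iterable<E> {

    private static final class Node<E> {
        final E value;
        Node<E> next;
        Node(E value) { this.value = value; }
    }

    private Node<E> head;
    private Node<E> tail;
    private long size;          // long, so not capped at Integer.MAX_VALUE

    /** Appends a value at the end in constant time. */
    void add(E value) {
        Node<E> node = new Node<>(value);
        if (tail == null) {
            head = node;
        } else {
            tail.next = node;
        }
        tail = node;
        size++;
    }

    long size() {
        return size;
    }

    /** Linear-time access by long index. */
    E get(long index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        Node<E> node = head;
        for (long i = 0; i < index; i++) {
            node = node.next;
        }
        return node.value;
    }

    @Override
    public Iterator<E> iterator() {
        return new Iterator<E>() {
            private Node<E> cursor = head;

            @Override public boolean hasNext() { return cursor != null; }

            @Override public E next() {
                if (cursor == null) throw new NoSuchElementException();
                E value = cursor.value;
                cursor = cursor.next;
                return value;
            }
        };
    }
}
```

Access by index is linear, so such a list is best traversed with its iterator.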

I created a Lock that is not reentrant (contrary to the ReentrantLock of the java.util.concurrent.locks package), meaning that if the same thread tries to lock it again, it blocks. Of course, this is OK only if another thread has the right to unlock it from elsewhere (if not, it leads to a deadlock!). Perhaps there is a better way to do that, but the ReentrantLock was not the right option.
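One standard way to obtain this behaviour is a binary Semaphore, which has no notion of an owning thread: a second acquisition by the same thread fails or blocks, and any thread may release it. The sketch below is an alternative illustration under that assumption, with a hypothetical class name; it is not the lock class written for OpenLSD.

```java
import java.util.concurrent.Semaphore;

// Hypothetical sketch: a non-reentrant lock built on a one-permit semaphore.
final class NonReentrantLockSketch {
    private final Semaphore permit = new Semaphore(1);

    /** Blocks until the lock is free, even if the caller already holds it. */
    void lock() {
        permit.acquireUninterruptibly();
    }

    /** Takes the lock only if it is free right now. */
    boolean tryLock() {
        return permit.tryAcquire();
    }

    /** May be called from any thread; callers must not unlock more often than they lock. */
    void unlock() {
        permit.release();
    }

    public static void main(String[] args) throws InterruptedException {
        NonReentrantLockSketch lock = new NonReentrantLockSketch();
        lock.lock();
        System.out.println("second tryLock by same thread: " + lock.tryLock());
        Thread other = new Thread(lock::unlock);   // another thread releases the lock
        other.start();
        other.join();
        System.out.println("tryLock after unlock elsewhere: " + lock.tryLock());
    }
}
```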

Another problem arises when you launch an external process with Runtime.exec. The API was not really well documented, but there is now at least a warning (as of version 1.6) in the Process API:

« All its standard io (i.e. stdin, stdout, stderr) operations will be redirected to the parent process through three streams (getOutputStream(), getInputStream(), getErrorStream()). The parent process uses these streams to feed input to and get output from the subprocess. Because some native platforms only provide limited buffer size for standard input and output streams, failure to promptly write the input stream or read the output stream of the subprocess may cause the subprocess to block, and even deadlock. »

What does this mean? Well, if you launch a process but do not care about its ErrorStream, for instance, and so never read anything from it, you are likely to block the subprocess and even the main JVM, ending in a deadlock. To avoid this, I use a specific thread that reads from the error stream and the input stream (reading everything and discarding it), the LSDStreamGobbler class. A few web sites present much the same code, as it is inspired by several blogs and forums. To use it when you do not care about the outputs of the subprocess, just call the waitForProcess(Process p) static function. If you do care about the InputStream, then create a new LSDStreamGobbler object with the Process as parameter and start the newly created LSDStreamGobbler thread.
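The gobbler pattern can be sketched as below. This is not the LSDStreamGobbler source: the class name is hypothetical, and unlike the class described above this sketch takes an InputStream rather than a Process in its constructor, which keeps it easy to try on any stream. Only the waitForProcess name is borrowed from the text.

```java
import java.io.IOException;
import java.io.InputStream;

// Hypothetical sketch: a thread that reads a stream to the end and discards it.
final class StreamGobblerSketch extends Thread {
    private final InputStream in;
    private volatile long bytesDiscarded;

    StreamGobblerSketch(InputStream in) {
        this.in = in;
        setDaemon(true); // never keeps the JVM alive on its own
    }

    @Override
    public void run() {
        byte[] buffer = new byte[4096];
        try {
            int n;
            while ((n = in.read(buffer)) != -1) {
                bytesDiscarded += n; // count only; the content is thrown away
            }
        } catch (IOException e) {
            // The stream was closed under us: nothing left to drain.
        }
    }

    long bytesDiscarded() {
        return bytesDiscarded;
    }

    /** Drains stdout and stderr of the process, then returns its exit code. */
    static int waitForProcess(Process p) throws InterruptedException {
        StreamGobblerSketch out = new StreamGobblerSketch(p.getInputStream());
        StreamGobblerSketch err = new StreamGobblerSketch(p.getErrorStream());
        out.start();
        err.start();
        int exitCode = p.waitFor();
        out.join();
        err.join();
        return exitCode;
    }
}
```

A typical call would be `StreamGobblerSketch.waitForProcess(Runtime.getRuntime().exec(command))`, where command is whatever external program has to be run.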

Optimizations

I tried to optimize in as many ways as I could find, with the help of Vincent and also Brice Carriere Montjosieu.

One, of course, was to make the most of the efficiency brought by the NIO support of the MINA framework. The non-blocking socket protocol was a good way to implement a scalable server for OpenLSD with only a few threads.

Another was to optimize the SQL requests as much as possible, and even to create stored procedures inside the database so as to be as efficient as possible.

And finally, I tried to make OpenLSD as general as possible. I cannot say that this is a final or even a complete version, since I have other improvements planned, but I think it was time to bring it to the open source community and let this project live free (free as in free beer too!).