Concept and Logic OpenLSD ML
How to enter production with ML
Multiple Legacies (ML) is the ability of OpenLSD to maintain a mirror of the documents inserted in a Legacy across several (1 to n) servers that implement this Legacy. A Legacy can be seen as a virtual storage. Each virtual storage can have 1 to n implementations on separate servers, which increases the security of the documents archived in this Legacy.
With a huge number of files, backing up to tape is not practical, so this kind of mirror should be considered a real safeguard for the archive, and probably the only one.
Consider also an organization whose network covers a very large area: this kind of mirror can be used according to the location of each user. For instance, imagine one OpenLSD Server in the US, another in Europe, another in Asia and a fourth in Africa. Users from Asia should then access the Asia instance first, to get fast answers without using intercontinental network links. All of this supposes that every mirror holds the same archived documents.
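The location-based access described above can be sketched as a simple ordering of mirrors, with the local one first and the others kept as fallbacks. This is only an illustration: the region names, the host names and the `servers_for` helper are hypothetical and are not part of OpenLSD.

```python
# Hypothetical region-to-server map; region and host names are
# illustrative assumptions, not real OpenLSD configuration.
MIRRORS = {
    "us": "lsd-us.example.org",
    "europe": "lsd-eu.example.org",
    "asia": "lsd-asia.example.org",
    "africa": "lsd-africa.example.org",
}


def servers_for(user_region):
    """Return the mirrors to try, the local one first, the others as fallback."""
    local = MIRRORS.get(user_region)
    others = [host for region, host in MIRRORS.items() if region != user_region]
    return ([local] if local else []) + others
```

A user in Asia would get the Asia host first and would only cross an intercontinental link if that instance does not answer.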
There are three ways to mirror the storages:
You need to use the new ML interface instead of the standard OpenLSD one. There are also some production tips to know.
The main idea is that each import or delete is first done on a “main” server (most of the time, the one closest to the database), which then stores in the database the actions to perform on the other servers that implement the same Legacy.
Replication is therefore asynchronous and starts after an action has succeeded. For instance, an import is replicated only once the first import has completed successfully, and the same holds for a deletion. The asynchronous scheme relies on database persistence: the database stores the actions that remain to be done. For a given document, once all related actions have completed successfully, the related entries are deleted from the database.
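As a rough sketch of this scheme, the pending actions can be modelled as rows in a table that are removed only when the remote action succeeds. The table layout, the function names and the use of SQLite are assumptions made for illustration; the actual OpenLSD schema and interface are not shown in this document.

```python
import sqlite3


def open_db():
    # In-memory SQLite stands in for the real OpenLSD database (assumption).
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE pending (doc_id INTEGER, server TEXT, action TEXT,"
        " PRIMARY KEY (doc_id, server, action))"
    )
    return db


def record_action(db, doc_id, action, main, servers):
    """Call only after `action` succeeded on the main server."""
    rows = [(doc_id, s, action) for s in servers if s != main]
    db.executemany("INSERT INTO pending VALUES (?, ?, ?)", rows)
    db.commit()


def replicate(db, apply_remote):
    """Try every pending action; delete an entry only when it succeeds."""
    rows = db.execute("SELECT doc_id, server, action FROM pending").fetchall()
    for doc_id, server, action in rows:
        if apply_remote(server, doc_id, action):
            db.execute(
                "DELETE FROM pending WHERE doc_id=? AND server=? AND action=?",
                (doc_id, server, action),
            )
    db.commit()


def pending_count(db):
    return db.execute("SELECT COUNT(*) FROM pending").fetchone()[0]
```

Because the entries live in the database, a failed or interrupted replication simply leaves its rows in place, and the next pass retries them.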
ML support can be enabled even after production has started without it, and the reverse is also possible, so you can switch a Legacy to or from ML as you wish. It is not a good idea, however, to go from no ML to ML, back to no ML, and then to ML again, since each time you enable ML support you have to synchronize the Legacy servers.
There are several kinds of check in OpenLSD: files from the database point of view and the database from the files point of view, and each of them can be run on each component of an ML. There is also a specific function that resynchronizes, if necessary, one or more components of an ML. For instance, this function can be used to start an ML instance after production has started without ML support, or to resynchronize a site that had a problem (such as a storage failure).
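The two directions of check amount to two set differences between the document identifiers known to the database and those found on a server. The sketch below assumes documents are identified by simple ids; `check` and `resync_plan` are hypothetical names, not OpenLSD functions.

```python
def check(db_ids, file_ids):
    """Compare database entries with the files found on one component."""
    db_ids, file_ids = set(db_ids), set(file_ids)
    return {
        # Files from the database point of view: referenced but absent.
        "missing_files": sorted(db_ids - file_ids),
        # Database from the files point of view: present but unreferenced.
        "orphan_files": sorted(file_ids - db_ids),
    }


def resync_plan(db_ids, components):
    """For each component (server name -> ids on disk), list what to copy."""
    return {
        name: check(db_ids, ids)["missing_files"]
        for name, ids in components.items()
    }
```

A component that has just been added, or that lost its storage, would show up with a long list of documents to copy, while an up-to-date one would show an empty list.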
ML support can be used not only for security but also for efficiency, since web services (even import) can be implemented using the closest OpenLSD Server as a component of an ML.
The database is unique, since it is the core of the OpenLSD implementation for ensuring efficiency and security. However, one should take care of replicating this database, since ML support does not do it. The reason is that the database may contain business tables unrelated to OpenLSD, which makes it impossible for the framework to handle this replication. On a very large network, replication should follow a master/slave plan, even if the master changes from time to time (for instance following opening hours across the world). Depending on the database software used, several options are available. Alternatively, an application-level replication scheme can be used, in which the application replicates the database by gathering the SQL statements into one file and pushing it to the other sites (this is the option we chose).
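The application-level option can be pictured as a journal of SQL statements that is written to a file and replayed on another site. This is a minimal sketch under strong assumptions: SQLite replaces the real database, statements are plain SQL text without bound parameters, and the file transfer between sites is left out. `JournalingDb` and `apply_batch` are hypothetical names.

```python
import sqlite3
from pathlib import Path


class JournalingDb:
    """Master-side wrapper that remembers every statement it executes."""

    def __init__(self):
        self.db = sqlite3.connect(":memory:")  # stand-in for the real database
        self.journal = []

    def execute(self, sql):
        self.db.execute(sql)
        self.db.commit()
        # Journal only statements that succeeded on the master.
        self.journal.append(sql.rstrip().rstrip(";") + ";")

    def export_batch(self, path):
        """Write the journaled statements to one file and reset the journal."""
        Path(path).write_text("\n".join(self.journal) + "\n", encoding="utf-8")
        self.journal.clear()


def apply_batch(replica, path):
    """Replay a batch file on a replica connection."""
    replica.executescript(Path(path).read_text(encoding="utf-8"))
    replica.commit()
```

Replaying statements in their original order keeps the replica consistent only if a single master writes at a time, which matches the master/slave plan described above.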
Once the database is replicated, the security is complete, and both the OpenLSD Servers and the database schema can be accessed from anywhere (at least in read mode).