ELNs and Data Portability

I recall a lot of discussion at CENSA meetings back in the late ’90s about the need to move data between different systems, and of course from one ELN to another. From the customer’s perspective it is a really important issue, although sadly one that doesn’t get enough attention until they are committed to a vendor – and of course it isn’t in the vendor’s interest to allow you to take your data somewhere else… to a competing product, for example. We even sponsored the development of CENSML (Collaborative Electronic Notebook Systems Markup Language), which was met with complete apathy; interestingly, no one else proposed anything similar.

So at this time the data portability situation in the ELN world is pretty awful. Which is a shame, and at some point people are going to start noticing – and perhaps the next round of ELN purchases will have open file formats as a purchasing consideration.

I came across the Data Portability project in this article on TechCrunch. The project seems to be a really nice way of at least making Data Portability issues obvious to consumers. They are starting off with online web apps, but clearly it is relevant to any IT system, whether cloud-based or on premises.

For the record, Amphora’s systems are completely open – our view is that it is your data and you should be able to take it where you want, when you want, without even having to involve us.

In addition, our focus on IP means we need to be able to reassure our customers that they can take a record out of our ELN and defend their IP long after their relationship with Amphora has come to a close – with a 50 or 100 year retention timescale, requiring the vendor to be around just isn’t acceptable (which is a big concern with services that claim to outsource IP protection, something I’ll blog on in due course).

We take this a step further in our Hosted/SaaS offerings, where customers can take a copy of their data (via rsync or similar) onto another server controlled by them every night. We also work with those customers to make sure they can spin up their own server as needed. This means that even where we’re hosting them, they can tell us our services aren’t required and still have complete access to their data without any cooperation from us.
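As a rough illustration of what such a nightly copy could look like from the customer’s side, here is a minimal Python sketch that wraps rsync. It is not a description of how our hosted service is actually set up: the function names are made up for this example, and the host and paths in the sample call are placeholders.

```python
import subprocess


def build_rsync_command(source, destination, delete=False):
    """Return the rsync argument list for copying source into destination."""
    command = ["rsync", "--archive", "--compress"]
    if delete:
        # Also remove files from the copy that no longer exist at the source.
        # Left off by default so the copy never loses a record.
        command.append("--delete")
    command += [source, destination]
    return command


def mirror(source, destination):
    """Run the copy; raises CalledProcessError if rsync reports a failure."""
    subprocess.run(build_rsync_command(source, destination), check=True)


# Hypothetical host and paths, shown only to illustrate the call.
print(" ".join(build_rsync_command(
    "backup@eln.example.com:/srv/eln/records/",
    "/srv/eln-mirror/records/",
)))
```

Scheduling `mirror` from cron (or any job scheduler) on a server the customer controls would give the nightly copy described above.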

We believe that open data, neutral file formats, powerful APIs and above all a respectful policy towards our customers’ IP are the cornerstones of any ELN vendor’s offering.

Our next web site refresh will contain our Data Portability policy. In the meantime I can only hope that, as various advocacy groups get more vocal about the need for Facebook, Twitter and others to unlock your data, Data Portability will be given the consideration it deserves in the ELN world.

What gets kept in Informatics Systems, and where?

Not all of the “Stuff” sloshing around the lab is the same, and distinguishing between the different kinds helps tease out the best place to store things. We use a simple Triangle Diagram (originally proposed by John Trigg of PhaseFour), which really just tries to point out that the stuff is related but sits at different levels of abstraction:


It is quite hard to draw definite lines around things, but I think most people can appreciate that a raw data dump from an instrument is somewhat different from a report to management, or that an experimental write-up in Word is different from some tabular data in a spreadsheet. The differences between the levels come out in:

  • The software that’s used to read the file and interpret the content. Some will require very specific software (e.g. from an instrument vendor), but a PDF or text file can be read by many different things.
  • Who might be interested in the data. Again, some files are useful to anyone (for example, a report) but some are only useful to certain people with specific training.
  • How long your company might want to keep the data, and indeed how long you are realistically able to keep the data. Typically the lower you go, the harder it is to keep something, so if you feel it’s business critical you really need to pay attention to the formats used.

This differentiation can really help in ELN system design. Partly it draws your attention to what needs to be stored in the ELN (typically the “Experiment” write-up level), and what can be left in other systems, e.g. a database or a file server, which can be pointed to from the ELN.

Not everything needs to be stored in the ELN, and indeed it would be unrealistic to expect to be able to do so. The important thing is common keys, so you can offer the user a link to more information. The advent of web-based systems has made this level of “integration” so trivial that one sometimes feels a bit of a fraud describing it as such.
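To show just how little is involved, here is a minimal Python sketch of linking by common key. It assumes each surrounding system exposes a predictable URL per experiment. The system names, URL templates and the `links_for` function are all invented for illustration and don’t describe any particular product.

```python
from urllib.parse import quote

# Hypothetical URL templates for systems that sit alongside the ELN.
LINK_TEMPLATES = {
    "raw_data": "https://files.example.com/instruments/{key}/",
    "results": "https://lims.example.com/samples/{key}",
}


def links_for(experiment_key):
    """Return one link per system, all built from the same shared key."""
    # Percent-encode the key so it is safe to drop into a URL path segment.
    safe_key = quote(experiment_key, safe="")
    return {
        name: template.format(key=safe_key)
        for name, template in LINK_TEMPLATES.items()
    }


for system, url in links_for("EXP-0042").items():
    print(system, url)
```

The ELN record only needs to carry the key; the bulky raw data stays where it already lives, and the user just follows the link.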

By building on the storage tools you have in place, and focusing an ELN on Experiments, the resulting system is cheap to run, costs little to acquire, and results in little disruption to existing practices.

Google has cool book scanning technology

Seems Google have some really nifty technology for scanning books automatically without destroying them.

Some of our clients have scans of their Paper Notebooks, and you can put them into PatentSafe quite easily. However, most people are put off by the expense of scanning (& tagging), which is a real pity.

Given the amount of knowledge that’s locked in legacy Paper Lab Notebooks, let’s hope Google’s technology is made available outside of Google.

Good Introductory presentation on Records Management

Found this while messing around on SlideShare (which I might use to make my stuff more available). Good basic overview of records management. Our PatentSafe product is a specialized solution for Lab Notebooks/Patent Evidence – something that traditional Document Management/Records Management/Content Management solutions can’t really do all that well for a whole variety of reasons. But we still need to implement good Records Management practices, and this presentation explains them quite nicely.

(I guess we’re good Records Management for Electronic Lab Notebooks – shipped in a box That Just Works, and used by an awful lot of people :-))

Anyway, enjoy: