Chemistry ELNs and Open Source

For some scientists, Chemical Structure-based searching is an important part of the toolset they use to plan and write up their experiments. Historically this functionality has been the domain of proprietary software vendors, who have used their monopoly on Cheminformatics technology to lever the adoption of their wider informatics suites (including products positioned as “Electronic Laboratory Notebooks”).

The resulting lack of competition, on top of vendor consolidation, has led to Chemistry-focused ELNs tending to lag in terms of ease of use and openness, whilst of course being pretty expensive. As those vendors seek to expand into other scientific disciplines, they bring with them the same costs, which are then unnecessarily imposed onto other areas.

One major reason for this is that the Open Source Cheminformatics world has historically been under-developed. My theory is that’s because Cheminformatics started in earnest before Open Source took off as a concept (in comparison to Bioinformatics) but I have no real evidence for this.

Open Source is an important part of today’s software ecosystem:

  • It provides a set of building blocks, and I would imagine almost every software product (commercial or otherwise) has some Open Source components. By sharing the basic foundations, the cost of entry is reduced and this results in more entrants and lower costs for everyone.
  • Open Source drives innovation by allowing people to re-mix things to “scratch their own itch” and produce new approaches as needed. Even if those solutions remain in-house they still inspire others, and perhaps allow the engineers inside the commercial vendors to successfully propose new approaches.
  • The threat of “free” competition as well as more players in the market generally keeps vendors on their toes. Without a complete lock on particular functionality, vendors must instead compete on value and functionality.

Amphora are not in the Chemistry ELN market (and have no intention of being in that market), but I look at what’s out there and compare with what I see happening in other areas and it is clear there’s a lot that could be done which would benefit the wider ELN world as well. Frankly what’s going on in Chemistry is giving the wider ELN community a bad name – especially as marketers keep positioning their products as the only “proper” approach for any kind of science, chemistry or otherwise. You really don’t need to spend thousands of dollars a seat and days/weeks of implementation time to deploy an ELN!

So I’ve been waiting for a decent Open Source approach to Chemistry-based searching because, if nothing else, it will inject some innovation where it has been sorely lacking.

So I was delighted to read this post on how to Enable Exact Structure Search and Substructure Search for Your Chemical Database. I don’t think there’s a great breakthrough here, but it is a straightforward set of instructions on how you can do it which demystifies Cheminformatics a lot.

This could get pretty interesting in the next few years…

  • HTML5 and other web technologies are surely at the stage where we don’t need a “thick client” deployed onto a desktop anymore – can’t we do it all in the browser?
  • What about all the tablets (like the iPad), can we make them full clients?
  • Can we finally have true cross platform chemistry ELNs?
  • Can we easily embed chemistry into a variety of other applications, rather than having to buy a complete implementation of someone else’s idea of an ELN?

Amphora’s focus will remain on our particular slice of the ELN problem, which is providing the secure recordkeeping back end, discipline-neutral collaboration etc. Once you’ve done all that work the lawyers generally want to make sure you get the credit for all that Intellectual Property you’ve created even if they don’t explicitly apply for a Patent – even in Academic environments this is becoming more important as the journals and funding agencies raise their expectations in terms of record keeping etc. Amphora’s job is to help our customers focus on the science, and we’ll look after the Intellectual Property and Records considerations.

Even though we don’t plan to directly participate, I’m really looking forward to this. It is great fun working with our customers’ in-house Bioinformatics solutions, and I’d love to see that level of innovation in Cheminformatics.

Traditional Scholarly Publishing on Bloggers

This editorial in Analytical Chemistry is a nice example of the reaction more formal publications have to the rise of “Bloggers” (found via Abhishek Tiwari’s blog, which has the delicious subtitle “In the spider-web of facts, many a truth is strangled”).

This is a classic case of an established industry being threatened by The Internet. We’ve seen it with Travel Agents, Book Stores, Insurance Brokers, Newspapers… you name it. The Editorial just effuses indignation that some people have the temerity to bypass the establishment.

Scientific Blogging is here to stay, as is blogging in general (sorry, newspapers). My advice to anyone who feels threatened by that is:

  1. Understand the rules have changed. What was scarce is now plentiful. What’s the new scarcity?
  2. The new paradigm has some advantages and some weaknesses – what are they?
  3. With the resources and expertise you have built up, how can you bring value to this new world (and hence remain relevant)?

What won’t work is sitting on the sidelines hoping the new thing will go away, because it won’t. It’ll just keep on getting more relevant and by the time you are forced to engage with it, they’ll have solved most of the problems without you… and you will be irrelevant.

Academia: A satirical Data Management Plan

As I’ve written elsewhere, we’re increasingly working with Academic labs who need to get their Lab Notebook and general data management act together.

Whilst there are considerable differences between Academia and Industry (as indeed there are between Biotech and Pharma) I have to say I find the challenge of implementing an ELN in an academic environment quite refreshing – the constraints are such that you really have to think carefully about the human side of things, as well as ensuring you can do it all for a cost that will fit into whatever money a lab might be able to find. So far we’ve had an excellent response which is quite gratifying, and the less buttoned-down culture is a lot of fun.

So I was greatly amused to see this satirical data management plan, ostensibly in response to an NSF request. Of course one of the reasons why it is so funny is that it is rather close to the mark!

Getting a decent data management process and associated ELN implementation in such an environment is perfectly possible, and we can often do it without inflicting too much pain on the individual researchers. Fact is, scientists became scientists because they liked science, not record keeping! Fortunately I think the tools have finally got to the stage where despite the relentless increase in computerization and resulting data volumes, we can make the record keeping side of things as transparent and hassle-free as it needs to be.