Flat vs. Hierarchical Information Storage

A really interesting blog post by Tom Evslin (creator of Microsoft Exchange Server) on how information systems have evolved, in a comparatively short period of time, from being hierarchical to flat:

The WorldWideWeb is where Moore’s Law met Metcalfe’s Law. Information management – the way we find out what we want to know – went from hierarchical to flat in just a few years as a result. We now assume – usually correctly – that we can find any particular piece of data from a railroad schedule in Estonia to a quote by an Argentine novelist on the Web within minutes of wanting it….

…we all assumed that most people would approach information through the categories they assigned the information to….To put it mildly, we were all wrong!

People don’t think hierarchically – at least most people don’t. We think in terms of associations….

When we were working on our first ELN projects (in the mid-90s), categorization and hierarchy were on everyone’s mind. Before a scientist created an experiment we had them fill out all sorts of metadata about it, and we’d have day-long meetings as part of the implementation where records management, library services, IT, and the user representatives would thrash out what they needed to have for each project. We’d be trying to keep the amount of metadata down to a few elements (past a certain point users just put anything into the field because they’ve had enough of filling in silly boxes), but there was still a huge amount of pressure to add more. Then, once we’d figured out all the metadata, we’d get into yet more meetings about how the information should be presented. An awful lot of pain for everyone involved….

Fast forward to today. Our ELN can still capture and track metadata, and show the content of the system in different ways (e.g. you can drill down by project). But it is much less of a “thing”. I guess it comes up only about 50% of the time, and when it does customers only really want a few items – generally, what they need to implement a records management process and maybe make the list of documents pretty. Sure, we’ve improved the product so the whole metadata issue is less hassle: we can now extract most of the metadata transparently rather than bugging the user, and we’ve got a more open framework to manage it. But even so, metadata is less on people’s minds.

I’m not saying metadata isn’t important, because it is. But it isn’t as big a thing as it was. Partly that’s because we’ve got better tools (primarily more CPU/memory/disk), partly because we’re comfortable using Google and know that full-text search really does work even on large bodies of content, but also because people realize that acquiring metadata isn’t cost-free. So the tradeoffs have changed and, most importantly, we’re all members of the Google generation.

The great thing is that ELNs are becoming much more lightweight: less disruptive to existing processes (but still delivering huge amounts of benefit), and cheaper and quicker to install too (because we’re not spending two days in meetings to figure out how to configure the thing).

Thanks, Google 🙂
