This blog post includes the following very relevant thought:
Apache Lucene/Solr is used in more companies than a large majority, if not all, of the commercial vendors out there combined
I’m not sure how you could “prove” this, but anecdotally I suspect there’s a lot of truth in it. It’s a good example of the case for using Open Source components vs. “more Enterprise capable” bits.
Fact is, Lucene works, and it works better (for our purposes) than any other search engine, commercial or open source. We’ve looked at a whole bunch, both standalone engines and those embedded in SQL databases.
Not only does it scale well, but it was easy for us to customise it to work with the specifics of our problem domain – for example, allowing users to search for sample IDs within the text of a document, even when they are formatted in a way that causes most text engines to treat them as “noise” words. And when a customer does have a problem, it is really easy for us to see what’s going on and tweak as required.
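To make the sample ID point concrete, here is a toy sketch in plain Python. It is not Lucene code and not our actual implementation: the ID format (a few letters followed by digit groups joined by "-" or "/"), the function names and the documents are all made up for illustration. In Lucene itself this kind of rule would live in a custom Analyzer or Tokenizer; the sketch only shows why a generic tokenizer loses such IDs and an ID-aware one keeps them.

```python
import re
from collections import defaultdict

# Hypothetical sample-ID shape (an assumption, not a real format):
# 1-4 letters, an optional hyphen, then digit groups joined by "-" or "/".
SAMPLE_ID = re.compile(r"[A-Za-z]{1,4}-?\d+(?:[-/]\d+)*")
WORD = re.compile(r"[A-Za-z]+")
# The ID alternative is tried first so an ID survives as a single token.
TOKEN = re.compile(f"(?P<id>{SAMPLE_ID.pattern})|(?P<word>{WORD.pattern})")


def naive_tokens(text):
    """Rough stand-in for a generic engine: letters only, short fragments dropped."""
    return [w.lower() for w in WORD.findall(text) if len(w) > 2]


def id_aware_tokens(text):
    """Keep sample IDs whole (upper-cased); lower-case ordinary words."""
    tokens = []
    for match in TOKEN.finditer(text):
        if match.lastgroup == "id":
            tokens.append(match.group().upper())
        else:
            tokens.append(match.group().lower())
    return tokens


def build_index(docs, tokenize):
    """Build a minimal inverted index: token -> set of document ids."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index


def search(index, query, tokenize):
    """Return the documents that contain every token of the query."""
    hits = [index.get(token, set()) for token in tokenize(query)]
    return set.intersection(*hits) if hits else set()


# Made-up documents for illustration only.
docs = {
    1: "Sample AB-1234/07 was retested.",
    2: "Sample XY-99 passed.",
}
id_index = build_index(docs, id_aware_tokens)
naive_index = build_index(docs, naive_tokens)
```

With the naive tokenizer the ID is broken into fragments that are thrown away as noise, so a query for it finds nothing; with the ID-aware tokenizer the same query lands on the right document.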
The only problem we have is when we meet an IT department that is used to traditional search engines. We say “It’ll work like this, it will take this much resource, and it will easily scale to X” and they don’t believe us: “It can’t be that easy, surely? You guys can’t have done this.” That is a little energy-sapping, but it’s nothing that can’t be resolved with a pilot, and once we’ve reassured them about that, they tend to respect the other engineering calls we’ve made.