This week Apache Solr 5.0 was released. Is there a better excuse to discuss this amazing tool?
agencyQ has been using Solr since the heady days of the pre-beta, when Lucene sounded like a place in Switzerland and “Solr” looked so much like a typo. We’re fanboys for sure. 5.0 is just another great release that makes the tool easier to use, easier to manage and incrementally better. But, as with so many technologies, its strength isn’t what it does, but what you can make it do. Let’s see how this plays into the agencyQ belief that the Internet isn’t finished and change is inevitable, so go plan for it already.
To begin, let’s stop calling Solr a search engine. We think of it as a data engine. True, it will suck in millions of records and query across them at lightning speed. But so what? Lots of search engines do that. If you tease apart how Solr does this, you can leverage the power under the hood to help future-proof your projects.
Firstly, Solr can be schema-free. This means that you don’t need to tell it ahead of time what data you want to store. You can just send data to Solr and let it take care of storing that data and bringing it back superfast. When your data needs change over time, just add the new data to the mix and Solr will happily pick it up and run with it.
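To make “just send data” a little more concrete, here is a minimal Python sketch that builds (but does not send) a JSON update request. It assumes a Solr server at `http://localhost:8983/solr` with a collection called `products` that has been configured for schemaless mode; the URL, the collection name, the field names and the helper `build_add_request` are all made up for illustration.

```python
import json
from urllib.request import Request

# Assumptions for illustration only: a local Solr server and a
# hypothetical "products" collection running in schemaless mode.
SOLR_BASE = "http://localhost:8983/solr"
COLLECTION = "products"


def build_add_request(docs, base=SOLR_BASE, collection=COLLECTION):
    """Build a POST request that adds documents to Solr as a JSON array."""
    url = f"{base}/{collection}/update?commit=true"
    body = json.dumps(docs).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    return Request(url, data=body, headers=headers, method="POST")


# The second document carries a field the first one lacks. In schemaless
# mode no schema change should be needed before sending it.
docs = [
    {"id": "tool-1", "name": "Hammer", "category": "Tool",
     "color": "red", "price": 4.5},
    {"id": "tool-2", "name": "Wrench", "category": "Tool",
     "color": "blue", "price": 7.25, "warranty_years": 2},
]
request = build_add_request(docs)
# urllib.request.urlopen(request) would send it to a running server.
```

Note that nothing here declares `warranty_years` in advance; the document simply includes it.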
Since you’re schema-free, you’re also not tied to the old grid of rows and columns that relational databases are stuck with. And, unlike a relational database, Solr can query, group and count on individual fields – Solr calls these facets – at lightning speed. Need to know how many products in the “Tool” category are “red” and under $5? One quick facet query and you’ve got it. Need a new field to facet on? No problem: it’s already there, and not a line of code needs to be touched to make it happen.
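A facet query like the one described is really just a URL. The Python sketch below builds one and reads the counts out of a response. It assumes a hypothetical `products` collection with `category`, `color` and `price` fields on a local Solr server; the helper names are ours, and the sample response is hand-written to show the shape of the data, not real output.

```python
from urllib.parse import urlencode


def build_facet_url(base="http://localhost:8983/solr", collection="products"):
    """Build a select URL: tools under $5, counted by color."""
    params = [
        ("q", "*:*"),               # match everything...
        ("fq", "category:Tool"),    # ...then filter to the Tool category
        ("fq", "price:[* TO 5}"),   # price below 5 (upper bound excluded)
        ("rows", "0"),              # we only want counts, not documents
        ("facet", "true"),
        ("facet.field", "color"),   # count matching products per color
        ("wt", "json"),
    ]
    return f"{base}/{collection}/select?{urlencode(params)}"


def facet_counts(response, field):
    """Turn Solr's default flat [value, count, value, count] list into a dict."""
    flat = response["facet_counts"]["facet_fields"][field]
    return dict(zip(flat[0::2], flat[1::2]))


url = build_facet_url()

# Hand-written illustration of the response shape (not real data).
sample_response = {
    "response": {"numFound": 4, "docs": []},
    "facet_counts": {"facet_fields": {"color": ["red", 3, "blue", 1]}},
}
counts = facet_counts(sample_response, "color")
```

With that sample, `counts["red"]` is the number of red tools under $5. Faceting on a different field means changing the `facet.field` value and nothing else.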
But what if you need to go in and change data in every record? Since you’re schemaless, it’s no problem structurally. With Solr’s atomic updates, you can patch just the part of your data that’s changing and leave the rest alone. This is no mean feat when you have millions of records and thousands of concurrent queries hitting the server at the same time.
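An atomic update is sent as a small JSON document that names the record and wraps each changed field in a modifier such as `set`. The Python sketch below only builds that payload; the document id, the field name and the helper `build_atomic_update` are hypothetical, and Solr’s documentation lists the configuration an index needs before atomic updates will work.

```python
import json


def build_atomic_update(doc_id, changes):
    """Build a JSON body that overwrites only the given fields of one document."""
    doc = {"id": doc_id}
    for field, value in changes.items():
        # "set" replaces the field's value; every other field is left alone.
        doc[field] = {"set": value}
    return json.dumps([doc])


# Hypothetical example: change only the price of document "tool-1".
payload = build_atomic_update("tool-1", {"price": 3.99})
```

The payload would be posted to the collection’s update URL like any other document, but it carries only the id and the changed field rather than the whole record.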
Scale has yet to present a problem for Solr. Whether your business is adding a few records at a time or a few hundred million, Solr can handle it. It’s true that at some point you’ll want to branch out into SolrCloud (that’s a thing) and spread your load across a lot of servers, but with the built-in replication and scaling tools, Solr has you covered there.
Finally, we mentioned atomic updates, but what if the real nuclear option happens and your project gets re-platformed into a different language entirely? Surely then you’d need to rework your Solr setup? Not so! Solr runs as a stand-alone service, and its data is accessed via secure URLs, so it is entirely independent of your platform, coding language and even hosting preference.
When you start to see Solr as your data engine and not just a quickie search tool, you can make sure it treats you well over time and grows with you. The Internet is going to change; let Solr help you change with it.