Saturday, March 2, 2013

Why use the Solr search engine instead of searching in the DB

Next week, on 6.3, I will give a talk about Solr at the Windows Azure user group here, and I thought it would be nice to explain why I think it is better to use a search engine like Solr than to rely on a database's search capabilities or to implement your own search by iterating over your data structures and looking for matches.

Before I start counting the reasons, let me say that I think Solr is good, but any search capability based on the Lucene framework, the bible of text search, is also good enough. Note that some databases use Lucene for their search, so if you are using those capabilities, maybe you don't need Solr after all.

So let me count some major advantages of working with Solr (and Lucene):

1. Performance
2. Ranking
3. Flexibility
4. Clustering & Cloud support
5. Solr is free open source

Let me provide more details about each advantage.

Performance
Solr is fast, and I mean very fast. A search over millions of documents can return results in a few milliseconds! This is the whole idea of an indexed search engine: you spend some time up front indexing each term in an inverted index, and in return you get really fast searches. And search must be ultra fast, otherwise nobody will use it.
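
To make the inverted index idea concrete, here is a minimal sketch in Python. It is only an illustration of the principle, not how Solr or Lucene are actually implemented, and the names build_inverted_index and search are made up for this example. The point is that a query only has to look up its own words instead of scanning every document.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each lowercase token to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.lower().split():
            index[token].add(doc_id)
    return index


def search(index, query):
    """Return the ids of documents that contain every token of the query."""
    tokens = query.lower().split()
    if not tokens:
        return set()
    # One dictionary lookup per query token, then intersect the posting sets.
    postings = [index.get(token, set()) for token in tokens]
    return set.intersection(*postings)


# Hypothetical sample data, just for the illustration.
docs = {
    1: "solr is a fast search engine",
    2: "a database can search text too",
    3: "lucene powers solr",
}
index = build_inverted_index(docs)
print(search(index, "solr search"))  # only document 1 contains both words
```

The indexing work is paid once, when a document is added. After that, each query costs a few lookups plus a set intersection, no matter how long the documents are.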
Ranking
Search is not about finding, it is about ranking! The most important thing in search is to put the most relevant results at the top of the result list. Solr scores every matching document and returns the highest-scoring ones first, and you can boost documents so that they rank higher.
Flexibility
Let's say your user makes a typo, or types a word in the singular when it exists in the index in the plural, or even types a word that just sounds like the one they meant. Without Lucene as your search framework, it is very hard to return results in such cases. In Solr you can add plugins for different languages that support them.
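
To get a feeling for what such matching involves, here is a rough sketch using only Python's standard library. It is not what Solr does internally: the normalize function is a deliberately naive stand-in for a real stemmer, and match_terms is a name invented for this example. It only shows the two ideas, reducing words to a common form and tolerating small spelling differences.

```python
import difflib


def normalize(word):
    """Naive singularization: lowercase and drop one trailing 's'."""
    word = word.lower()
    if len(word) > 3 and word.endswith("s") and not word.endswith("ss"):
        return word[:-1]
    return word


def match_terms(query_word, indexed_words, cutoff=0.8):
    """Return indexed terms (normalized) that are close to the query word."""
    vocabulary = sorted({normalize(w) for w in indexed_words})
    return difflib.get_close_matches(
        normalize(query_word), vocabulary, n=3, cutoff=cutoff
    )


# Hypothetical index vocabulary, just for the illustration.
indexed_words = ["documents", "engine", "ranking"]

print(match_terms("document", indexed_words))  # singular query, plural in index
print(match_terms("documnet", indexed_words))  # typo in the query
print(match_terms("zebra", indexed_words))     # nothing similar
```

Even this toy version needs a rule per language and a similarity threshold to tune, which is exactly the kind of work you would rather leave to the search engine.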
Clustering & Cloud support
Today many applications are SaaS and implement multi-tenancy, and sometimes you need to search billions of documents. Having a tool that lets you split the work across hundreds of nodes, and work with cores so that each customer has its own core, is not trivial. More than that, a document's ranking is calculated relative to the other documents in the system, so such capabilities are very hard to implement yourself.
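
The dependency of ranking on the other documents can be shown with inverse document frequency (IDF), one common ingredient of relevance scoring. The sketch below uses a smoothed textbook variant of IDF, which I am assuming only for illustration; it is not claimed to be the exact formula Lucene uses. The function name idf and the sample corpus are made up.

```python
import math


def idf(term, docs):
    """Smoothed inverse document frequency: rarer terms get a higher weight."""
    n_docs = len(docs)
    # Number of documents that contain the term at least once.
    doc_freq = sum(1 for text in docs if term in text.lower().split())
    return math.log((n_docs + 1) / (doc_freq + 1)) + 1


# Hypothetical corpus, and a "shard" holding only part of it.
corpus = ["solr search engine", "search in a database", "search with lucene"]
shard = corpus[:1]

print(idf("search", corpus))  # appears everywhere, so it gets a low weight
print(idf("solr", corpus))    # rare in the full corpus, so it weighs more
print(idf("solr", shard))     # same term, different weight on one shard
```

The same term gets a different weight depending on which documents you count, so when the data is split across many nodes, each node sees only part of the picture and the scores have to be reconciled somehow.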
Solr is free open source
And you get all of this for free, with an easy-to-use interface, the ability to add your own code, and a very large, high-quality community that will help you provide enterprise search for your application.
