Friday, March 6, 2015

A Solr Core-nucopia?

[N.B.: If you haven't already, check out my series of posts (1, 2, and 3) that walk through installing Solr for Sitecore.]

Given that there isn't yet a search scaling guide for Sitecore 8.0, the best authoritative source of guidance is currently the Sitecore Search Scaling Guide for 7.5. One interesting line in the guide recommends creating "separate cores for each Sitecore index." This was necessary to avoid inconsistent and unexpected results.

While digging through the release notes for Sitecore 8 update 1 I found a good, technical description of the risk:

When two indexes were configured to use the same SOLR core, it was impossible to differentiate the index data between the indexes. As a result, index data related in one index would override the index data in the other index. This has been fixed so that the _uniqueid index field value has been extended with information about the index name. (426743)

Out of curiosity, I decided to validate the fix by running a query against one of my Solr cores connected to a Sitecore 8 update 2 instance (a big thumbs-up to Solr's built-in admin tool!).

Let's break down the taxonomy of this _uniqueid:

  1. sitecore://<database name>/<item id>?lang=<language name>&ver=<version number>&ndx=<index name>
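To make that layout concrete, here is a minimal Python sketch that splits a _uniqueid value into its parts. The function name and the sample value (item ID, language, index name) are made up for illustration; only the overall pattern comes from the breakdown above.

```python
from urllib.parse import urlsplit, parse_qs


def parse_unique_id(unique_id):
    """Split a Sitecore _uniqueid value into its named parts.

    Assumes the layout described above:
    sitecore://<database>/<item id>?lang=<language>&ver=<version>&ndx=<index>
    """
    parts = urlsplit(unique_id)
    if parts.scheme != "sitecore":
        raise ValueError("not a sitecore:// identifier: %r" % unique_id)

    query = parse_qs(parts.query)

    def first(key):
        # parse_qs returns a list per key; take the first value or None.
        values = query.get(key)
        return values[0] if values else None

    return {
        "database": parts.netloc,
        "item_id": parts.path.lstrip("/"),
        "language": first("lang"),
        "version": first("ver"),
        "index": first("ndx"),  # missing on ids written before the fix
    }


# Illustrative value only, not copied from a real core.
sample = ("sitecore://master/{110d559f-dea5-42ea-9c1c-8a5df7e70ef9}"
          "?lang=en&ver=1&ndx=sitecore_master_index")
print(parse_unique_id(sample)["index"])
```

If the "index" entry comes back as None, the document was most likely indexed before the fix and would still collide with the same item in another index sharing that core.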

Clearly, the index name is now a part of the key! Does this mean you can disregard the advice in the Search Scaling Guide? In a word, yes. A more important question is, "Should you?" Well, probably not. Here are some reasons why:

  1. Any single Sitecore index (Solr core) rebuild is less expensive since there is less data. Thus, the rebuild is quicker.
  2. When reviewing statistics about a core in Solr's core admin, facts about the core such as the number of documents easily translate to facts about the Sitecore index.
  3. Probably most important of all, it's possible to tune the core's cache and other settings per Sitecore index. Undoubtedly, usage patterns will vary per Sitecore index. So should the strategies you implement to tune the Solr core responsible for that Sitecore index.
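As a rough illustration of the second point, the sketch below builds a Solr CoreAdmin STATUS request and reads the document count for each core out of the JSON response. The base URL and core name are assumptions about a local setup, and the sample response is a hand-written fragment rather than real output.

```python
from urllib.parse import urlencode


def core_status_url(solr_base_url, core_name):
    """Build a CoreAdmin STATUS URL for one core (JSON response)."""
    query = urlencode({"action": "STATUS", "core": core_name, "wt": "json"})
    return "%s/admin/cores?%s" % (solr_base_url.rstrip("/"), query)


def doc_counts(status_response):
    """Map each core name to its numDocs value.

    Cores that report no "index" section (for example, a name Solr
    does not know) are skipped.
    """
    return {
        name: info["index"]["numDocs"]
        for name, info in status_response.get("status", {}).items()
        if "index" in info
    }


# Assumed local Solr address and core name; adjust for your environment.
print(core_status_url("http://localhost:8983/solr", "sitecore_master_index"))

# Hand-written fragment shaped like a STATUS response, for illustration.
sample_response = {
    "status": {
        "sitecore_master_index": {"index": {"numDocs": 1200, "maxDoc": 1250}},
    }
}
print(doc_counts(sample_response))
```

With one core per Sitecore index, each count maps directly to a single index. With a shared core, the same number would lump several indexes together.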

Update (3-8-2015): In Sitecore 8 update 2, one Solr core was removed and two were added. I updated the paragraph below to reflect this.

Keep in mind that these advantages come at a cost. Each core incurs some overhead, and maintaining many cores is a management headache. As of Sitecore 8 update 2 there are 13 indexes in a vanilla install. If you want to take advantage of the SwitchOnRebuildSolrSearchIndex feature (while an index rebuilds, Sitecore can still return search results for that index), you will need to add an additional core for each Sitecore index that uses this feature. That is a possible 25 cores to manage (see here for an explanation of why the total number changed)!

I'm interested in other people's opinions on this topic. Let me know what you all think.

