Friday, May 22, 2015

MSBuild and TDS

I recently had a need to get continuous integration working for a project that uses Team Development for Sitecore (TDS). Most of us do, right? :) While some of this blog post will deal specifically with creating a build definition in Team Foundation Server (TFS), the vast majority of this article applies to any software that uses MSBuild under the hood: TeamCity, Jenkins, CruiseControl, etc.
 
The wrinkle in my requirements was that I could not install TDS on the build server. There's already a very helpful resource on this topic, but I found I had to do things slightly differently in my environment. I also made it a goal to remove the SlowCheetah dependency (I make use of XML transformations) from my build server. Finally, I ran into a couple of other small roadblocks that I thought I might as well document here while I was at it.

TDS

As I said, Mike Edwards has an immensely useful article that shows how to avoid installing TDS on your build server. The only things I will add are where I diverged from his steps. For clarity, Mike added a folder called TDSFiles at the root of his solution. I added a folder called MSBuild Support Files with a child folder called TDS.

  1. The HedgehogDevelopment.SitecoreProject.targets file has many references to HedgehogDevelopment.SitecoreProject.Tasks.dll. For each of these, you need to modify the path. In my case, the correct path is no path at all. This was because (I assume) TFS used the working directory of the .targets file itself as a starting location; the .targets file and the DLL live side by side. A sketch of the edited entries follows this list.


  2. In the same .targets file you will also need to modify the paths of the TdsService.asmx and the HedgehogDevelopment.SitecoreProject.Service.dll. Here is a screenshot of my modifications.
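
In text form, the edited entries end up looking something like the sketch below. The task name and the "before" path are placeholders from memory rather than exact TDS values; the pattern is the point: every file reference becomes a bare file name, because the DLLs, the service files, and the .targets file all live together in the MSBuild Support Files\TDS folder.

    <!-- Illustrative only: the TaskName and the "before" path are placeholders, not exact TDS values -->
    <!-- Before: points at the machine-wide TDS install location -->
    <UsingTask TaskName="SomeTdsTask"
               AssemblyFile="C:\Program Files (x86)\MSBuild\HedgehogDevelopment\SitecoreProject\v9.0\HedgehogDevelopment.SitecoreProject.Tasks.dll" />

    <!-- After: no path at all; the DLL sits right beside the .targets file in source control -->
    <UsingTask TaskName="SomeTdsTask"
               AssemblyFile="HedgehogDevelopment.SitecoreProject.Tasks.dll" />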


SlowCheetah

After corresponding with Hedgehog's Charlie Turano, I decided to eliminate MSBuild's dependency on SlowCheetah. This step is only necessary if you do not supply the SlowCheetah DLLs and .targets files to MSBuild. One easy way of doing that is to simply include the "packages" folder from NuGet in source control, which guarantees that MSBuild will be able to make use of the files. In fact, this is how my solution was already set up. Nonetheless, TDS is perfectly capable of doing XML transformations during the build, and I want to be ready should a future release of TDS completely replace SlowCheetah (a possibility, since SlowCheetah's developer has said he will no longer maintain it).

This is very easy to do. Simply comment out the following line in any .csproj file that uses SlowCheetah:
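
The .csproj hook added by the SlowCheetah NuGet package typically looks something like the snippet below (the property value varies by SlowCheetah version, so treat the path as an illustration). The <Import> element is the line to comment out:

    <PropertyGroup>
      <!-- Added by the SlowCheetah NuGet package; the exact path varies by version -->
      <SlowCheetahTargets Condition=" '$(SlowCheetahTargets)'=='' ">$(LocalAppData)\Microsoft\MSBuild\SlowCheetah\v1\SlowCheetah.Transforms.targets</SlowCheetahTargets>
    </PropertyGroup>

    <!-- Comment out this import so the build no longer depends on SlowCheetah -->
    <!-- <Import Project="$(SlowCheetahTargets)" Condition="Exists('$(SlowCheetahTargets)')" Label="SlowCheetah" /> -->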


Some Miscellaneous Issues

  1. I encountered another .targets-related issue. This time it was:

    The imported project "C:\Program Files (x86)\MSBuild\Microsoft\VisualStudio\v11.0\WebApplications\Microsoft.WebApplication.targets" was not found. Confirm that the path in the <Import> declaration is correct, and that the file exists on disk.

    What's happening? Inside the .csproj file there is a property, $(VSToolsPath), being set that MSBuild uses to resolve the path of Microsoft.WebApplication.targets. When VisualStudioVersion is not supplied, that path falls back to a Visual Studio version (v11.0 in the error above) that is not installed on the build server. You could modify the .csproj to prevent this behavior, but it's much easier to simply use a command-line switch like so:

    msbuild myproject.csproj /p:VisualStudioVersion=12.0

    If you are using TFS, the fix is just as easy: in your build definition, on the Process tab, set your MSBuild Arguments to the same switch, /p:VisualStudioVersion=12.0.


  2. I was receiving a post-build error:

    API restriction: The assembly 'file:///D:\Builds\6\XXXXXXXXXX\XXX-TestBuild\Binaries\_PublishedWebsites\TDS.MyProject\bin\MyProject.Tests.dll' has already loaded from a different location. It cannot be loaded from a new location within the same appdomain.


    The full explanation of what is happening is here. The resolution is again very simple. In the build definition make sure you do not recursively match all test DLLs:
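
In my case that meant tightening the Test assembly file specification in the build definition so it no longer matched every copy of the test DLL. The recursive pattern picks up both the copy in the binaries root and the copy under _PublishedWebsites, which is why the same assembly gets loaded from two locations. The default pattern shown below is from memory of TFS 2013 and may differ in your version:

        Test assembly file specification (before): **\*test*.dll
        Test assembly file specification (after):  *test*.dll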

Thursday, May 21, 2015

Create a Reverse Proxy Controlled By Sitecore

Reverse proxies can be an incredibly useful technology in your Sitecore implementation, depending upon your needs. The basic idea is that a reverse proxy forwards requests on to other servers on behalf of the requesting client, sort of like a traffic cop. The responses from the servers behind the reverse proxy are then returned to the requesting client. This can be done in a way that is completely transparent to the end-user.

The Use Case

So why bother? Well, as Grant Killian suggests over on his blog, at least two scenarios come to mind (I'm sure all you very smart folks could undoubtedly name more!). I want to focus on the case of a reverse proxy sitting between the Internet and a set of web servers that includes one or more legacy web servers and a Sitecore instance.

I've kept the conceptual diagram above simple (no load-balanced servers, firewalls, cache servers, etc.) but the technique readily applies to an enterprise ecosystem. The basic strategy is as follows:

  1. A user tries to browse a page (perhaps one they have bookmarked) e.g. http://company.com/foobar.php
  2. The reverse proxy receives the request and "asks" Sitecore where to route the request
  3. Sitecore tells the reverse proxy if it can handle the request and, if so, what the URL should be.
  4. The reverse proxy rewrites the request and forwards it. For example, if Sitecore responded positively to the reverse proxy our URL might be transformed to http://sitecorecd.company.com/foo/bar
  5. Sitecore or the legacy server responds to the page request
  6. The reverse proxy rewrites the response so that the end-user is unaware the page they see came from a different server than the one they contacted.

The payoff with this scenario is that we can now manage incremental content migrations from legacy servers to Sitecore servers without any disruption to the end-user experience. Bookmarks, campaign emails, RSS feeds, Google search result rankings... all of it will happily continue on as always, regardless of whether the legacy web server or Sitecore actually answers the HTTP request. Powerful stuff! This technique is especially useful for clients that have a very large inventory of content and cannot, or are unwilling to, migrate everything at once.

The Solution

The first order of business is setting up a reverse proxy in IIS. The goal is to have a dedicated web site in IIS act as the reverse proxy. To do that we need to install the Application Request Routing (ARR) extension. Once ARR is installed, we'll need to perform the following configuration steps:

  1. Open IIS Manager and select the server node. Double click on the Application Request Routing Cache icon.

  2. In the right-hand pane click Server Proxy Settings.

  3. Check the Enable proxy setting and uncheck the Reverse rewrite host in response headers option.

  4. Set the Response buffer and Response buffer threshold values to 8092 and then click Apply. The reason for this I discovered through the school of hard knocks: some pages were mysteriously causing YSODs. After digging through logs (more on that later), we found that the response from the server was literally truncated. The page was large enough that it was overflowing the response buffer, causing it to flush with only part of the overall page.
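
If you'd rather script those server-level settings than click through the UI, the first three steps correspond to the <proxy> section in applicationHost.config. A minimal sketch, assuming a standard ARR install, is below; I set the two buffer values through the UI and have not verified their attribute names, so they are omitted here.

    <!-- %windir%\System32\inetsrv\config\applicationHost.config -->
    <system.webServer>
      <!-- Enable ARR's proxy functionality and stop it rewriting the host in response headers -->
      <proxy enabled="true" reverseRewriteHostInResponseHeaders="false" />
    </system.webServer>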


Now that ARR is installed and configured at the server level, we need to turn our attention to the reverse proxy site. Here is the secret sauce of our solution: rather than merely writing some routing rules into the web.config, we are going to create our own custom Rewrite Provider. This allows us to execute our own code during the runtime of the reverse proxy. I followed this guide to develop my own custom provider; it should get you up and running.

So what does my rewrite provider do? At its heart it's just a very simple URL resolver. The provider is rather dumb (and it should be!); we want the reverse proxy to do as little processing as possible. The logic, sketched in code after the list below, is as follows:

  1. First, make a request to Sitecore to see if the requested page (rewritten with Sitecore's host header) can be served, i.e. whether the web request returns a response status code < 400.
  2. If that fails, the reverse proxy contacts a web service that knows how to map a legacy URL onto a Sitecore URL. Thus a URL like /news/article.php?id=foobar in the legacy system can be mapped onto /news/articles/foobar for example.
  3. If steps 1 and 2 fail, then the request is routed to the legacy server.
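
To make that flow concrete, here is a heavily simplified sketch of such a provider, written against the IRewriteProvider interface used in that guide. The host names, the mapping-service URL, and both helper methods are hypothetical placeholders, and a real implementation would add timeouts, caching, and better error handling.

    using System;
    using System.Collections.Generic;
    using System.Net;
    using Microsoft.Web.Iis.Rewrite; // URL Rewrite extensibility assembly used by custom providers

    public class SitecoreRoutingProvider : IRewriteProvider
    {
        // Hypothetical defaults - in practice these come in via provider settings in web.config
        private string sitecoreHost = "sitecorecd.company.com";
        private string legacyHost = "legacy.company.com";
        private string mappingServiceUrl = "http://sitecorecd.company.com/api/legacyurlmap?path=";

        public void Initialize(IDictionary<string, string> settings, IRewriteContext rewriteContext)
        {
            // Allow the web.config provider settings to override the defaults above
            if (settings.ContainsKey("SitecoreHost")) sitecoreHost = settings["SitecoreHost"];
            if (settings.ContainsKey("LegacyHost")) legacyHost = settings["LegacyHost"];
            if (settings.ContainsKey("MappingServiceUrl")) mappingServiceUrl = settings["MappingServiceUrl"];
        }

        // "value" is the requested path (e.g. /foobar.php); the return value is the URL to route to
        public string Rewrite(string value)
        {
            // 1. Can Sitecore serve the path as-is (status code < 400)?
            if (UrlResponds("http://" + sitecoreHost + value))
                return "http://" + sitecoreHost + value;

            // 2. Ask the mapping web service whether this legacy URL maps onto a Sitecore URL
            string mapped = GetMappedUrl(value);
            if (!string.IsNullOrEmpty(mapped) && UrlResponds("http://" + sitecoreHost + mapped))
                return "http://" + sitecoreHost + mapped;

            // 3. Otherwise, fall back to the legacy server
            return "http://" + legacyHost + value;
        }

        // Hypothetical helper: true when the URL answers with a status code below 400
        private static bool UrlResponds(string url)
        {
            try
            {
                var request = (HttpWebRequest)WebRequest.Create(url);
                request.Method = "HEAD";
                using (var response = (HttpWebResponse)request.GetResponse())
                    return (int)response.StatusCode < 400;
            }
            catch (WebException)
            {
                return false; // 4xx/5xx and connection failures all mean "no"
            }
        }

        // Hypothetical helper: the mapping service and its contract are whatever you build
        private string GetMappedUrl(string path)
        {
            try
            {
                using (var client = new WebClient())
                {
                    string result = client.DownloadString(mappingServiceUrl + Uri.EscapeDataString(path));
                    return string.IsNullOrWhiteSpace(result) ? null : result.Trim();
                }
            }
            catch (WebException)
            {
                return null;
            }
        }
    }

The provider then gets registered under the rewrite providers section of the proxy site's web.config and referenced from the routing rule as {SitecoreRoutingProvider:{REQUEST_URI}} (again, the provider name here is my placeholder).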

You may be thinking that all of this still sounds like too much work, given that every request passes through the reverse proxy. Fortunately, ARR has very good built-in caching, so in practice your most requested pages (and resources) will not be processed in code continuously.

Closing Thoughts

Beyond what's already been covered, I recommend you consider the following:

  1. You need some kind of logging strategy. I write to the event logs from the reverse proxy, and to Sitecore's logs during the runtime of Sitecore (for example, in the URL mapping web service).
  2. Perform load testing. ARR is remarkably good OOTB with its cache settings, but better to test and know than simply assume.
  3. Think ahead about sessions and how you will deal with them.
  4. Redundancy. Our solution uses more than one reverse proxy. As a side note, with my implementation, if Sitecore itself goes down the reverse proxy will continue to function: all requests would simply go to the legacy server. Eventually this behavior may become undesirable, but early in a project's lifetime it can be a real selling point.
  5. You definitely need to create some outbound rewrite rules in your reverse proxy's web.config (a sketch of one such rule appears after this list) to deal with:
    1. Rewriting relative links in the response HTML
    2. Rewriting the Location header in the response when the status code is a 3XX (a redirect) and the host name is your backend server. This prevents the end-user's browser from being redirected to http://legacy.company.com/foobar rather than http://company.com/foobar.
  6. You should absolutely turn on Failed Request Tracing rules. This is the logging feature I mentioned earlier, and it proved invaluable in diagnosing and resolving issues during development.
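
As an illustration of the Location-header case, the outbound rule in the reverse proxy's web.config might look something like this sketch (the host names are the example ones from above; adjust the patterns to your environment):

    <rewrite>
      <outboundRules>
        <!-- Rewrite redirects issued by the backend so the browser stays on company.com -->
        <rule name="Rewrite Location header" preCondition="IsRedirect">
          <match serverVariable="RESPONSE_Location" pattern="^http://legacy\.company\.com/(.*)" />
          <action type="Rewrite" value="http://company.com/{R:1}" />
        </rule>
        <preConditions>
          <preCondition name="IsRedirect">
            <!-- Only touch 3XX responses -->
            <add input="{RESPONSE_STATUS}" pattern="3\d\d" />
          </preCondition>
        </preConditions>
      </outboundRules>
    </rewrite>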