A long time ago, in a business not too far away…
It is a time of struggle for the young Jedi of Summit 7. Their work has been brought to a halt by the evil overlord Load Balancer. In an effort to fend off the overlord, the Jedi Knight Jason Miller and his two young padawans, Michael Pigott and Jessica Criner, set out to defeat him and bring peace to their client.
Little did they know that the overlord had his secret weapon, Source IP Address Translation! With limited knowledge of this powerful weapon, the young Jedi Knights set out on their journey with hopes of setting free King Search and Queen InfoPath from the grasp of the evil overlord…
Source IP Address Translation
Source IP Address Translation is a lot like Network Address Translation. It lets the load balancer alter the TCP/IP packets traveling from the client to the server, rewriting the source IP address so that it is the address of the load balancer itself rather than that of the requesting client. When the server receives the packet, it sends its response back to the load balancer instead of directly to the client. Our farm was configured this way, which resulted in errors with an InfoPath form and the Search Service.
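To make the rewrite concrete, here is a minimal Python sketch of the idea. It is only a toy model, not how any particular load balancer is implemented, and every IP address and name in it (LB_IP, WFE_IP, CLIENT_IP, load_balance, server_reply) is a made-up assumption for illustration.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Packet:
    src: str      # source IP address
    dst: str      # destination IP address
    payload: str


# Hypothetical addresses, chosen only for this illustration.
LB_IP = "10.0.0.10"
WFE_IP = "10.0.0.21"
CLIENT_IP = "192.168.1.50"


def load_balance(packet: Packet, backend_ip: str, snat_enabled: bool) -> Packet:
    """Forward a packet to a backend, optionally rewriting its source address."""
    forwarded = replace(packet, dst=backend_ip)
    if snat_enabled:
        # Source IP Address Translation: the backend now sees the
        # load balancer as the sender instead of the real client.
        forwarded = replace(forwarded, src=LB_IP)
    return forwarded


def server_reply(request: Packet) -> Packet:
    """The server answers whoever appears in the request's source field."""
    return Packet(src=request.dst, dst=request.src, payload="response")


request = Packet(src=CLIENT_IP, dst=LB_IP, payload="GET /")

with_snat = server_reply(load_balance(request, WFE_IP, snat_enabled=True))
without_snat = server_reply(load_balance(request, WFE_IP, snat_enabled=False))

print("With translation, reply goes to:   ", with_snat.dst)
print("Without translation, reply goes to:", without_snat.dst)
```

With translation on, the reply is addressed to the load balancer; with it off, the reply is addressed to the original client.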
We first noticed the effects of this feature with the InfoPath form. It was a simple form that auto-populated the currently logged-in user’s name and email address. It was set up using a data connection to the UserProfileService.asmx web service, pulling values from the GetUserProfileByName method. The issue: the form would only work occasionally! This inconsistency was the most difficult part of troubleshooting. We targeted each web front end (WFE) individually to see if we could track down the error, and we searched through the ULS logs and event logs to try to understand the issue. The error the form was giving us was very misleading as well. It was an error that I’m all too familiar with…
Event ID 5566: an error occurred accessing a data source.
I knew from previous experience that this was related to loopback. The error, however, is very misleading, because you could navigate directly to the web service without issue. Also, if it were purely a loopback issue, the form would not work at all. Instead, it was working sometimes, leading to complete confusion and frustration. The inconsistency in the form working was all due to the Source IP Address Translation and load balancing issues discussed above.
We also saw the effects of this with our Search Service Application. When a crawl started, the Search Service Application requested content from a particular WFE. The load balancer then presented its own IP address to the WFE instead of the source IP address of the Search Service Application, so the WFE communicated with the load balancer rather than with the search server. In short, the load balancer was causing search to stall for long stretches during the crawl and to return numerous errors.
At this point we suspected that the load balancer was not configured correctly. We looked into the configuration of the Source IP Address Translation and found that it was set to the load balancer IP address. To remedy the issues above, we turned the Source IP Address Translation feature off. After this fix was in place, the form still failed to work. However, we were now able to locate the error on the WFE servers: they were timing out talking to the User Profile Service. So all that remained to get the form working consistently was adding hosts file entries on each WFE that point to 127.0.0.1 and disabling the loopback check. Here is a link to the article we used to disable the loopback check per host name instead of disabling it globally: http://blogs.technet.com/b/sharepoint_foxhole/archive/2010/06/21/disableloopbackcheck-lets-do-it-the-right-way.aspx.
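For the hosts file step, a small Python sketch like the one below can generate the lines to add on each WFE. It is only an illustration under assumptions: the host names are placeholders (substitute your own site host names), the helper name hosts_entries is made up, and the script only prints the lines rather than editing the hosts file itself.

```python
def hosts_entries(hostnames, address="127.0.0.1"):
    """Build hosts-file lines that map each host name to the given address."""
    return [f"{address}\t{name}" for name in hostnames]


# Placeholder host names, not taken from the farm described in this post.
site_hostnames = ["intranet.example.com", "mysites.example.com"]

# On Windows the hosts file normally lives at
# C:\Windows\System32\drivers\etc\hosts; this sketch only prints the
# lines so they can be reviewed before being added by hand.
for line in hosts_entries(site_hostnames):
    print(line)
```

Printing instead of writing keeps the sketch harmless to run anywhere; the same list of host names is also what you would register when disabling the loopback check per host name, as the linked article describes.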
As for search, the crawler was re-run after the fix was in place. The crawler worked wonders! It began to crawl faster and finished in about an hour. Any errors it did return were minimal and completely unrelated to load balancing. Simply turning the Source IP Address Translation feature off not only remedied our Search issues, but also resulted in an InfoPath form that finally worked consistently, after the additional loopback fixes. Balance was now restored to the SharePoint 2013 farm!
In summary, load balancing was key to the SharePoint farm’s success, but proper configuration and attention to detail are always necessary. We hope this helps you in the future. Go forth and conquer!