This is the second post in a mini blog series in which I discuss some newly discovered limitations to Microsoft’s cool new Cloud Search Service Application (SSA). If you haven’t yet, I recommend reviewing the first limitation: Windows Claims Only. You can’t use alternative identity providers with the Cloud SSA, folks. Take a look at that article, too, if you’d like some background on hybrid search. In this article, I’ll discuss another limitation that I found: you can’t really use alternative URLs with on-premises content and the Cloud SSA.
I’ll give you the same warning as last time. Beware: The following is going to get very technical. You can’t say I didn’t warn you. :)
New Limitation: No Alternative URLs
In SharePoint, alternative URLs can be defined for different zones (done via Alternate Access Mappings). When I was testing ADFS with the Cloud SSA (all that stuff in the first article), I was doing so with two zones. The Default zone was NTLM only (for the sake of crawling). We added the Extranet zone and configured it for ADFS identities only. That way, users won’t get prompted to select their identity provider when they hit the site (like they would if both ADFS and Windows were enabled in the same zone). Yes, I know there are other ways to work around this annoyance; that’s not the point. Since we wanted users to sign in using ADFS, we put the primary URL in the Extranet zone: https://site1.company.com. Since the Default zone still needed a URL, we gave it a throwaway URL: https://site1d.company.com. It would only be used for crawling, so the users would never need it.
Normally, this wouldn’t be a problem. As long as crawling is done against the Default zone, the query component in the SSA is smart enough to know how to handle alternative URLs. Brian Pendergrass discusses this in detail in the following awesome blog post: https://blogs.msdn.microsoft.com/sharepoint_strategery/2014/07/08/problems-crawling-the-non-default-zone-explained/. In short, the SSA effectively parameterizes the Default zone’s URL and enables the query component to switch it out with the equivalent URL for the zone the user is in. In our example, the SSA crawls and indexes the Default zone URL, https://site1d.company.com. If a user browses to https://site1.company.com, SharePoint will know that the user is in the Extranet zone. If they do a search, the query component will go ahead and swap the Default URL (site1d.company.com) with the Extranet URL (site1.company.com) without anybody having to do anything. It’s very cool, and it’s why SharePoint should always be configured to crawl the Default zone.
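To make that swap concrete, here is a minimal Python sketch of the idea. It is only a conceptual model, not SharePoint’s actual implementation: the `ZONE_URLS` table and the `rewrite_for_zone` function are hypothetical names I made up for illustration, and the URLs are the ones from our example.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical stand-in for the web application's zone-to-URL mappings.
ZONE_URLS = {
    "Default": "https://site1d.company.com",
    "Extranet": "https://site1.company.com",
}


def rewrite_for_zone(indexed_url: str, user_zone: str) -> str:
    """Swap the Default zone's scheme and host for those of the user's zone."""
    default = urlsplit(ZONE_URLS["Default"])
    # Unknown zones fall back to the Default zone's URL.
    target = urlsplit(ZONE_URLS.get(user_zone, ZONE_URLS["Default"]))
    parts = urlsplit(indexed_url)

    # Only rewrite URLs that were indexed under the Default zone's URL.
    if (parts.scheme, parts.netloc.lower()) != (default.scheme, default.netloc.lower()):
        return indexed_url

    return urlunsplit(
        (target.scheme, target.netloc, parts.path, parts.query, parts.fragment)
    )


indexed = "https://site1d.company.com/docs/report.docx"
print(rewrite_for_zone(indexed, "Extranet"))
# prints https://site1.company.com/docs/report.docx
```

The important part is the `user_zone` argument: the rewrite is only possible because the on-premises query side knows which zone the request came in on.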
Unfortunately, this doesn’t work with the Cloud SSA. The components aren’t smart enough to know about the zones. Remember, the Office 365 (O365) index is external to the on-premises farm, and the query component in O365 doesn’t know what zone a user is in. The URL crawled on-premises is the URL sent to the O365 index. When a user issues a query, the O365 query component doesn’t do the on-premises query magic and rewrite the URL for you. This makes perfect sense when you think about it from the angle of a user performing a search from SPO. There is no concept of zones there. How would SPO know which URL the user should get in their results? It wouldn’t.
The result of all of this is that users will get in their search results whatever URLs were crawled in the on-premises Cloud SSA. In our example, all results will come back as https://site1d.company.com even though we want them to be https://site1.company.com. If they click on the link for the item, they will be sent to the Default zone. That’s not what we want.
I’ve discovered that there is a potential workaround to this issue. It’s really not a good one, though. One of the few things you can configure in the on-premises Cloud SSA is crawling. One option you have available is Server Name Mappings. These will let you map a crawled URL to a URL of your choosing. We can use this to replace https://site1d.company.com with https://site1.company.com. When you do a crawl, site1.company.com is sent to the O365 index and is then displayed in the search results. Unfortunately, it’s very problematic:
- It messes up standard SSA functionality like contextual search. See Brian’s article (link above) for more about the negative impact of Server Name Mappings.
- The Server Name Mapping is not respected everywhere. For example, when you hover over a search result you get the Hover Panel. One option on the Hover Panel is “View Library.” I have found that this link does not get the rewritten URL but will instead have the URL for the crawled zone (Default). Even though your item will have a URL of https://site1.company.com, “View Library” will send you to https://site1d.company.com. Not cool.
- Say you need more than one alternative URL. You might have a web application that uses four different zones. Each would have its own URL. Perhaps they each need different authentication providers or have different user policies. When a user does a search, they should get back the URL for the zone that they’re in. Server Name Mappings provide only one chance to rewrite the URL. If, say, the Custom zone has the URL https://internalsite, you won’t be able to rewrite the URL to both https://site1.company.com and https://internalsite. It’s one or the other.
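The “one or the other” problem in that last bullet can be sketched in a few lines of Python. Again, this is just a conceptual model under my own assumptions, not how the crawler or the O365 index is actually implemented: `SERVER_NAME_MAPPINGS`, `url_sent_to_index`, and `result_url` are hypothetical names, and the prefix matching is deliberately simplistic.

```python
# Hypothetical crawl-time mapping: each crawled prefix maps to exactly one replacement.
SERVER_NAME_MAPPINGS = {
    "https://site1d.company.com": "https://site1.company.com",
}


def url_sent_to_index(crawled_url: str) -> str:
    """Apply the first matching mapping at crawl time, before the item is indexed."""
    for crawled_prefix, display_prefix in SERVER_NAME_MAPPINGS.items():
        if crawled_url.startswith(crawled_prefix):
            return display_prefix + crawled_url[len(crawled_prefix):]
    return crawled_url


def result_url(indexed_url: str, user_zone: str) -> str:
    """Model of the cloud side: no zone information, so user_zone changes nothing."""
    return indexed_url


stored = url_sent_to_index("https://site1d.company.com/docs/report.docx")
for zone in ("Extranet", "Custom"):
    # Both zones get the same URL; the Custom zone never sees https://internalsite.
    print(zone, result_url(stored, zone))
```

Because the rewrite happens once, when the item is crawled, only a single URL per item ever reaches the index. There is nothing left to swap at query time.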
As you can see, there really isn’t a great story with the Cloud SSA and alternative URLs. The moral of the story is that, if your web application requires multiple URLs, then the Cloud SSA isn’t right for you.
As I said in the other article, I’m really hoping that this will help somebody avoid a lot of pain as they attempt to deploy the Cloud SSA where they probably shouldn’t. Just make sure to factor in all of the Cloud SSA’s limitations. If it fits, then enjoy the Cloud SSA!