Problem
You have two AppFabric hosts in your SharePoint Server 2013 farm and want to add a third. After adding this host, you find that the new host's status is DOWN. You first try starting the service using the usual Start-CacheHost command, but this fails. When you try starting it through the Services control panel, it does start up, but then stops after a few minutes. You then check the CacheClusterConfiguration file and see all three hosts there, configured correctly. But when you check the health stats (Get-CacheClusterHealth), only the two original hosts are shown.
Solution
- After adding a new cache host, stop and then start the CacheCluster.
References
- Adding a local SharePoint 2013 (development server) as a cache host to AppFabric’s cache cluster
- Configuring Multiple Distributed Cache Servers in SharePoint 2013
- AppFabric cache not clearing on cluster restart
- Stop-CacheCluster
- Restart-CacheCluster
- Start-CacheHost
- AppFabric Caching and SharePoint: Concepts and Examples (Part 1)
- AppFabric Caching and SharePoint: Concepts and Examples (Part 2)
- AppFabric 1.1 Cache - Problem with Start-CacheCluster
- Server Unavailability Troubleshooting (Windows Server AppFabric Caching)
Notes
- Thanks to Marco van Wieren and his post for providing the clue needed for solving this.
- Here's the process I went through to add the instance and troubleshoot:
- I first added the instance by executing the following script on the target new AppFabric host:
$SPFarm = Get-SPFarm
$cacheClusterName = "SPDistributedCacheCluster_" + $SPFarm.Id.ToString()
$cacheClusterManager = [Microsoft.SharePoint.DistributedCaching.Utilities.SPDistributedCacheClusterInfoManager]::Local
$cacheClusterInfo = $cacheClusterManager.GetSPDistributedCacheClusterInfo($cacheClusterName)
$instanceName = "SPDistributedCacheService Name=AppFabricCachingService"
$serviceInstance = Get-SPServiceInstance | ? {($_.Service.Tostring()) -eq $instanceName -and ($_.Server.Name) -eq $env:computername}
if ([System.String]::IsNullOrEmpty($cacheClusterInfo.CacheHostsInfoCollection)) {
    Add-SPDistributedCacheServiceInstance
    $cacheClusterInfo.CacheHostsInfoCollection
}
- After this, I executed Use-CacheCluster followed by Get-CacheHost to check service status.
Note that I did this while logged into the new AppFabric host machine. The results were that my two existing hosts were UP while the new one was DOWN. I then executed the following script to spot check the new host's service instance status:
$instanceName = "SPDistributedCacheService Name=AppFabricCachingService"
$serviceInstance = Get-SPServiceInstance | ? {($_.service.tostring()) -eq $instanceName -and ($_.server.name) -eq $env:computername}
$serviceInstance
The result showed that the new service instance's status was Online.
- I then decided to check the service instance status of all hosts in the farm using this script:
Get-SPServiceInstance | ? {($_.service.tostring()) -eq "SPDistributedCacheService Name=AppFabricCachingService"} | select Server, Status
The results were that all service instances were Online.
- I then started the AppFabric service manually, via the Services control panel on the new host. I noted that the correct service account was configured (spService), and when I clicked Start, the service started up without issue.
- Still logged into this machine, I again executed Use-CacheCluster followed by Get-CacheHost to check service application status. This time, the results were UNKNOWN for all. I then logged into one of the existing AppFabric hosts and executed Use-CacheCluster followed by Get-CacheHost to check service status. The results were UNKNOWN for the new host and UP for the two existing hosts.
- I waited a minute or two to give time for the caching service to warm up and synchronize. I then repeated steps 4 - 5, but experienced the same outcome.
- I repeated steps 4 - 5, but this time from the SharePoint Management Shell, executing Start-CacheHost. This immediately resulted in the error:
Start-CacheHost : Cannot start service AppFabricCachingService on computer.
- I then checked the cluster configuration by exporting the configuration file using this script:
Export-CacheClusterConfig C:\CacheClusterConfig.txt
All three hosts were identified in this file and the configuration information for each seemed to be correct.
- Next, I checked the health metrics by executing this script:
Use-CacheCluster
Get-CacheClusterHealth
Oddly, this showed that the two existing hosts were perfectly healthy, but it did not mention anything about the new one I had just added.
- I then checked all of the caches using this script:
Get-Cache
The result was a listing of the names of all of the caches, but it only showed the caches for the two existing hosts and nothing for the new one.
- Next, I checked the configuration of the new host using this script:
Get-CacheHostConfig
The result showed a CacheSize of 1124 MB, which was significantly higher than any of the CacheSizes for the other hosts.
- Thinking that this might be the issue, I stopped the cluster, set the new CacheSize and then started the cluster using this script:
Stop-CacheCluster
Set-CacheHostConfig -Hostname OCS-VS-BAT13D1 -CachePort 22233 -CacheSize 300
Start-CacheCluster
- I then checked the service status of all hosts, using Get-CacheHost, and this time all instances reported back UP status.
- It wasn't clear to me whether it was setting the CacheSize to a lower value or stopping and starting the cache cluster that resolved the issue. So, I stopped the cluster, set the CacheSize back to 1124 MB, and then started the cluster and checked service statuses again: they were all UP.
- Therefore, it was stopping and starting the cache cluster that enabled the new host to fully integrate with the cluster.
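For reference, the fix that the steps above arrived at can be sketched as a single script. This is only a sketch under a few assumptions, not a tested procedure: it assumes you run it from the SharePoint 2013 Management Shell on one of the cache hosts, that the objects returned by Get-CacheHost expose HostName and Status properties (as the tabular output suggests), and the five-minute timeout and 15-second polling interval are arbitrary illustrative values. Keep in mind that stopping the cluster takes every cache host offline and discards whatever is currently cached.

```powershell
# Sketch only: restart the whole cache cluster after adding a new host.
# Assumes the SharePoint 2013 Management Shell on a cache host.
Use-CacheCluster      # bind this session to the local host's cache cluster
Stop-CacheCluster     # stops the caching service on every host; cached data is lost
Start-CacheCluster

# Poll until every host reports UP, or give up after a timeout.
$timeoutMinutes = 5   # assumed value; adjust for your farm
$deadline = (Get-Date).AddMinutes($timeoutMinutes)
do {
    Start-Sleep -Seconds 15
    # Compare as strings so the check works whether Status is an enum or text.
    $notUp = @(Get-CacheHost | Where-Object { "$($_.Status)" -ne 'Up' })
} while ($notUp.Count -gt 0 -and (Get-Date) -lt $deadline)

if ($notUp.Count -gt 0) {
    $names = ($notUp | ForEach-Object { $_.HostName }) -join ', '
    Write-Warning "Hosts still not UP after $timeoutMinutes minutes: $names"
} else {
    Get-CacheHost            # all hosts, including the new one, should show UP
    Get-CacheClusterHealth   # the new host should now appear in the health report
}
```

If the warning fires, the earlier checks in this post (Get-SPServiceInstance, Export-CacheClusterConfig, Get-CacheHostConfig) are the places I would look next.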