Friday, May 16, 2014

SharePoint 2013: The process was terminated due to an unhandled exception

Problem

You have a SharePoint Server 2013 farm with one application server and two web front end (WFE) servers in a traditional topology.  The farm servers are VMs hosted on Hyper-V.  Each farm server has both a private and an external NIC configured.  The private network is limited to farm servers, and a hosts file on each server enables private network routing among them.  The Distributed Cache service is running on the application server and on both WFEs.  You review the Application event log on one of the WFEs and find the following set of events appearing on an hourly basis:
Log Name:      Application
Source:        Application Error
Date:          [date/time]
Event ID:      1000
Task Category: (100)
Level:         Error
Keywords:      Classic
User:          N/A
Computer:      [WFE]
Description:
Faulting application name: WerFault.exe, version: 6.2.9200.16659, time stamp: 0x51db3bf4
Faulting module name: wer.dll, version: 6.2.9200.16384, time stamp: 0x501081cc
Exception code: 0xc0000005
Fault offset: 0x0000000000021fe5
Faulting process id: 0x318c
Faulting application start time: 0x01cf62c15e954491
Faulting application path: C:\Windows\system32\WerFault.exe
Faulting module path: C:\Windows\system32\wer.dll
Report Id: 9d3c283d-ceb4-11e3-9405-00155d38891a
Faulting package full name: 
Faulting package-relative application ID: 
Event Xml:
...
and
Log Name:      Application
Source:        Application Error
Date:          [date/time]
Event ID:      1000
Task Category: (100)
Level:         Error
Keywords:      Classic
User:          N/A
Computer:      [WFE]
Description:
Faulting application name: DistributedCacheService.exe, version: 1.0.4632.0, time stamp: 0x4eafeccf
Faulting module name: KERNELBASE.dll, version: 6.2.9200.16815, time stamp: 0x52f2ca60
Exception code: 0xe0434352
Fault offset: 0x00000000000264a8
Faulting process id: 0x3588
Faulting application start time: 0x01cf62c03cd277d1
Faulting application path: C:\Program Files\AppFabric 1.1 for Windows Server\DistributedCacheService.exe
Faulting module path: C:\Windows\system32\KERNELBASE.dll
Report Id: 9c501e53-ceb4-11e3-9405-00155d38891a
Faulting package full name: 
Faulting package-relative application ID: 
Event Xml:
...
and
Log Name:      Application
Source:        .NET Runtime
Date:          [date/time]
Event ID:      1026
Task Category: None
Level:         Error
Keywords:      Classic
User:          N/A
Computer:      [WFE]
Description:
Application: DistributedCacheService.exe
Framework Version: v4.0.30319
Description: The process was terminated due to an unhandled exception.
Exception Info: Microsoft.ApplicationServer.Caching.DataCacheException
Stack:
   at Microsoft.ApplicationServer.Caching.VelocityWindowsService.StartServiceCallback(System.Object)
   at System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean)
   at System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean)
   at System.Threading.QueueUserWorkItemCallback.System.Threading.IThreadPoolWorkItem.ExecuteWorkItem()
   at System.Threading.ThreadPoolWorkQueue.Dispatch()

Event Xml:
...
You check the Distributed Cache service through Manage Services on Server in Central Administration and see that the service is running on all three servers.  You then open a SharePoint Management Shell as administrator on one of the WFEs and run Use-CacheCluster followed by Get-CacheHost to check the status of the cache hosts.  The output shows the local WFE as UP, but the application server and the other WFE as UNKNOWN.  You then remote into the application server and repeat the PowerShell commands; this time the application server is UP and the two WFEs are UNKNOWN.  The second WFE gives a similar result: only the local server shows as UP.

Solution
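  1. Check the hosts file (C:\Windows\System32\drivers\etc\hosts) on each of the farm servers and verify that every farm server host name in the file is fully qualified.  Replace any short host name with its fully qualified name.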
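
With more than a couple of servers, the hosts file check above can be scripted. The following is a minimal Python sketch, not part of the original fix. It assumes that "fully qualified" simply means the name contains a dot, and the addresses and contoso.local names in the sample are hypothetical. It reports every host name entry that has no domain part, including short aliases listed beside a fully qualified name.

```python
def unqualified_hosts(hosts_text):
    """Return (line_number, name) pairs for host names that lack a domain part.

    Assumption: a name counts as fully qualified if it contains a dot.
    'localhost' is skipped because it is a standard loopback entry.
    """
    findings = []
    for lineno, raw in enumerate(hosts_text.splitlines(), start=1):
        # Drop trailing comments and surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        # The first field is the IP address; the rest are host names.
        for name in fields[1:]:
            if name.lower() == "localhost":
                continue
            if "." not in name.strip("."):
                findings.append((lineno, name))
    return findings


# Hypothetical hosts file content for a three-server farm.
SAMPLE = """\
# private farm network (hypothetical addresses and names)
10.0.0.10  app01.contoso.local
10.0.0.11  wfe01
10.0.0.12  wfe02.contoso.local wfe02
127.0.0.1  localhost
"""

if __name__ == "__main__":
    for lineno, name in unqualified_hosts(SAMPLE):
        print(f"line {lineno}: '{name}' is not fully qualified")
```

To check a real server, read the text from C:\Windows\System32\drivers\etc\hosts and pass it to unqualified_hosts in place of SAMPLE.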
References
  1. Cache Administration with Windows PowerShell (AppFabric 1.1)
  2. AppFabric 1.1 caching service crashes with System.UriFormatException: Invalid URI: The hostname could not be parsed
  3. AppFabric Event ID 1000 and Event ID 1026 with SharePoint 2013
  4. Raw error paste data on Pastebin by Anonymous
  5. AppFabric Caching and SharePoint: Concepts and Examples (Part 1)
Notes
  • The primary posting that helped solve this was [3]. Hat tip to the experts at Sterling International Consulting Group for identifying this one as it relates to static private networking.
