Saturday, February 15, 2014

SharePoint 2010 Health Analyzer: One or more servers is not responding: SharePoint 2010 Timer Service Stopped

Problem

You find the following entry in the SharePoint 2010 Central Administration Review problems and solutions All Reports listing:

Title: One or more servers is not responding.
Severity: 1 - Error
Category: Availability
Explanation: The following servers have not executed any timer jobs in the last 2 hours: [SharePointServer1]. This can happen if the server was shut down or lost network connectivity, or if the timer service on that server has crashed, hung, or been stopped.
Remedy: Ensure that the server(s) listed above are running and connected to the network. If the timer service is not started, restart the service by typing the following command at the command prompt on each server: "net start SPTimerV4." If the server was intentionally removed from the farm, remove the record of the server from the SharePoint topology in the Central Administration site at http://[pathtoCA]. For more information about this rule, see "http://go.microsoft.com/fwlink/?LinkID=142656".
Failing Servers:
Failing Services: SPTimerService (SPTimerV4)
Rule Settings: View
 
Troubleshooting
  1. Reviewed job history for SharePointServer1.
    • Result: no job currently scheduled, nor any job completed since midnight.
  2. Remoted into SharePointServer1, opened Services, scrolled to SharePoint 2010 Timer, and then viewed status.
    • Result: service is set to Auto and is stopped.
  3. Started this service and monitored for a short while.
    • Result: service remained started.
  4. Returned to CA, refreshed Job History page for SharePointServer1:
    • Result: new jobs began appearing in history.  All jobs completed successfully except for two, which were repeatedly aborted: Service Application Instance Provisioning Job, Application Server Administration Service Timer Job.
  5. Clicked Aborted status for Service Application Instance Provisioning Job.
    • Result: message: "The administration service job definition 'SecurityTokenServiceApplication' (id 5e0bad8b-948b-460a-8bb2-19da7e10fd9c) was not executed because the administration service on this server is not started. This job definition can be run manually using 'stsadm -o execadmsvcjobs'. "
  6. Clicked Aborted status for Application Server Administration Service Timer Job.
    • Result: message: "The administration service job definition 'job-application-server-admin-service' (id 43b74f53-53ba-4eb2-981d-28c13bf012e2) was not executed because the administration service on this server is not started. This job definition can be run manually using 'stsadm -o execadmsvcjobs'."
  7. Remoted into SharePointServer1, opened Services, scrolled to SharePoint 2010 Administration, and then viewed status.
    • Result: service is set to Auto and is stopped.  Also noted that its logon is set to Local System.
  8. Remoted into SharePointServer2 to compare.  Opened Services, scrolled to SharePoint 2010 Administration, and then viewed status.
    • Result: service is set to Auto and is started.  Also noted that its logon is set to Local System.
  9. Remoted into SharePointServer1, opened Services, scrolled to SharePoint 2010 Administration, and then started service and observed.
    • Result: service remained started.
  10. Returned to CA, refreshed Job History page for SharePointServer1, and looked for SharePoint 2010 Timer Service Job status.
    • Result: job repeatedly being completed successfully.
  11. Looked for Service Application Instance Provisioning Job status.
    • Result: this job did not reappear.
  12. Returned to the SharePoint 2010 Central Administration Review problems and solutions All Reports listing and clicked the error entry.  Clicked the Re-analyze Now button.
    • Result: Severity changed to "4 - Success"
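
The service checks in the steps above could also be scripted. What follows is only a minimal sketch in Python, not part of the original troubleshooting. It assumes it runs locally on the affected server from an elevated prompt, that the output of `sc query` is in English, and that SPAdminV4 is the Windows service name behind SharePoint 2010 Administration (SPTimerV4 is the name given by the Health Analyzer rule). The function names are made up for illustration.

```python
import re
import subprocess

# SPTimerV4 is named in the Health Analyzer rule text.
# SPAdminV4 is an assumption: the service name for SharePoint 2010 Administration.
SERVICES = ["SPTimerV4", "SPAdminV4"]


def parse_sc_state(sc_output):
    """Return the state word (e.g. 'RUNNING', 'STOPPED') from `sc query` output.

    Returns None when no STATE line is found.
    """
    match = re.search(r"^\s*STATE\s*:\s*\d+\s+(\w+)", sc_output, re.MULTILINE)
    return match.group(1) if match else None


def services_to_start(states):
    """Given {service_name: state}, list the services reported as STOPPED."""
    return [name for name, state in states.items() if state == "STOPPED"]


def query_state(service):
    """Run `sc query <service>` on the local server and parse its state."""
    result = subprocess.run(["sc", "query", service],
                            capture_output=True, text=True)
    return parse_sc_state(result.stdout)


def main():
    """Check both services and start any that are stopped."""
    states = {name: query_state(name) for name in SERVICES}
    for name in services_to_start(states):
        print(f"Starting {name}")
        # Same command the rule's Remedy text suggests for the timer service.
        subprocess.run(["net", "start", name], check=True)
```

Calling main() on the server repeats steps 2-3 and 7-9 in one pass. It does not replace the Job History check or the Re-analyze Now step in Central Administration.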
Summary

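This SharePoint Health Analyzer rule issue was straightforward to resolve: the SharePoint 2010 Timer and SharePoint 2010 Administration services on SharePointServer1 were both stopped, and starting them cleared the error.  It helped to correlate the issue against Windows Server logs.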

Notes
  • Starting the SharePoint 2010 Administration service generates an error event in the server Application log: EventID 2137: The SharePoint Health Analyzer detected an error.  One or more services have started or stopped unexpectedly.
