SharePoint Server 2013 Issue: AppFabric Distributed Cache Service Crashes

Recently, I set up a SharePoint Server 2013 farm for a client. It is a small 3-server farm consisting of one web front-end server, one application server, and one database server. The installation completed successfully, but afterwards the AppFabric Distributed Cache Service would not start, either automatically or manually; it kept crashing. The event logs and the ULS logs showed several error messages:

Microsoft.Fabric.Common.OperationCompletedException: Operation completed with an exception ---> System.TimeoutException: The operation has timed out.

AppFabricCachingService.Crash Microsoft.ApplicationServer.Caching.DataCacheException: ErrorCode<ERRService0001>:SubStatus<ES0001>:Service initialization failed. No user action required.

I tried reinstalling the service, the farm, and even SharePoint itself, but the problem persisted. For context, this is the server and farm setup:

Virtualization: Yes, all 3 farm servers are virtual machines
Virtualization Software: VMware ESX 4.1
Operating System: Windows Server 2008 R2 SP1
SharePoint: SharePoint Server 2013 RTM
SQL Server: SQL Server 2008 R2 SP1

This issue didn’t happen in other environments, so it could be specific to the hardware or the virtualization system.

After many troubleshooting sessions with Microsoft Support, we managed to work around the issue by altering the default distributed cache service configuration and manually setting a smaller cache size. The steps are:

  1. Open SharePoint 2013 Management Shell as Administrator.
  2. Type Use-CacheCluster.
  3. Type Export-CacheClusterConfig .\afconfig.xml. You can change .\afconfig.xml to another location or filename if necessary.
  4. Open afconfig.xml with Notepad or any other text editor.
  5. Replace <dataCache size="Medium"> with <dataCache size="Small">.
  6. Replace <caches partitionCount="256"> with <caches partitionCount="32">.
  7. Save and close afconfig.xml.
  8. Go back to the SharePoint 2013 Management Shell (still running as Administrator).
  9. Type Stop-CacheCluster to ensure that AppFabric Distributed Cache Service is not running.
  10. Type Import-CacheClusterConfig .\afconfig.xml.
  11. Type Start-CacheCluster and ensure that all cache hosts are started.
  12. If the above command exits before all cache hosts have started (some are still in the starting state), wait a few minutes, then type Get-CacheHost to check their status again.
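
The steps above can also be run as one script. This is only a sketch under a few assumptions, not a tested procedure: $configPath is a hypothetical location, the backup copy is an extra precaution that is not part of the original steps, and the Notepad edit is swapped for a small XML edit that looks for the dataCache and caches elements named in steps 5 and 6. Run it in the SharePoint 2013 Management Shell as Administrator.

```powershell
# Hypothetical location for the exported configuration; any writable path works.
$configPath = Join-Path $env:TEMP 'afconfig.xml'

# Steps 2-3: connect to the cache cluster and export its configuration.
Use-CacheCluster
Export-CacheClusterConfig $configPath

# Extra precaution (not in the original steps): keep an untouched copy.
Copy-Item $configPath "$configPath.bak"

# Steps 5-6: change the two attributes instead of editing by hand.
# Assumes the export contains dataCache/caches elements with these values;
# local-name() is used so the match does not depend on any XML namespace.
$xml = New-Object System.Xml.XmlDocument
$xml.PreserveWhitespace = $true
$xml.Load($configPath)

foreach ($node in $xml.SelectNodes('//*[local-name()="dataCache"][@size="Medium"]')) {
    $node.SetAttribute('size', 'Small')
}
foreach ($node in $xml.SelectNodes('//*[local-name()="caches"][@partitionCount="256"]')) {
    $node.SetAttribute('partitionCount', '32')
}
$xml.Save($configPath)

# Steps 9-12: stop the cluster, import the edited file, start it again,
# then list the cache hosts to check their status.
Stop-CacheCluster
Import-CacheClusterConfig $configPath
Start-CacheCluster
Get-CacheHost
```

If no element matches (for example because the exported values differ from Medium and 256), the file is saved unchanged, so it is worth opening afconfig.xml once to confirm the edit before importing it.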

The above workaround fixed the crashing of the AppFabric Distributed Cache Service. Oddly, after I reverted the configuration to the Medium dataCache size and a partitionCount of 256, the service still ran happily. I never found the root cause of the issue, but at least the farm is now up and running.

  • http://blog.philipp-riedel.de/ Philipp Riedel

    Hi Denni. I already followed your instructions, but I still get the same exception. Do you have any other advice? Event Viewer does not give any further help.

    • http://www.denniland.com/ Denni Gautama

      Hi Philipp, perhaps you can try to rebuild the cache cluster (remove and re-add the cache hosts). Jake’s suggestion can be useful too.

  • Jake N

    Thanks for this information – we have a similar setup and this seemed to work for us.

    Just to add, we renamed a host when rebuilding a Frontend and this service seemed to keep the old hostname (as well as add the new). Once we removed the old host in the XML file and re-imported the settings, the service came back up on all servers.

    • http://www.denniland.com/ Denni Gautama

      Thanks for the extra tip, Jake :)

  • SP

    Hi,

    I can see the service status as “UP” in PowerShell, but I am not able to start the AppFabric Caching Service from services.msc. When I try, it starts and then stops immediately, logging this error in Event Viewer:

    Faulting application name: DistributedCacheService.exe, version: 1.0.4632.0, time stamp: 0x4eafeccf

    Faulting module name: KERNELBASE.dll, version: 6.3.9600.17415, time stamp: 0x54505737

    Exception code: 0xe0434352

    Fault offset: 0x0000000000008b9c

    Faulting process id: 0x13fc

    Faulting application start time: 0x01d07ba05f6c2974

    Faulting application path: C:\Program Files\AppFabric 1.1 for Windows Server\DistributedCacheService.exe

    Faulting module path: C:\Windows\system32\KERNELBASE.dll

    Report Id: 9e0be60d-e793-11e4-80d5-005056a56db0

    Faulting package full name:

    Faulting package-relative application ID:

    Any suggestions for this?

    • suresh kumar

      Hi
      I am also having the same issue. Could you please tell me how you resolved this?

  • Kavita Joshi

    Hi Denni,
    I tried all the different options, including deleting the Distributed Cache service and adding it again. None worked. Finally I tried your suggestion, and that worked perfectly. Thanks a lot!

  • Simon h

    Hey, just in case it helps anyone with a root cause analysis: in our situation this was *definitely* caused by us changing the RAM or the CPU count (upwards). The most likely candidate is that the cache freaked out when we increased the RAM on our VMs via VMware.

  • Randall OkweI

    For each node listed in the export file, make sure you can ping its FQDN, and remove the nodes you cannot communicate with. Also, are your WFEs or nodes in a different VLAN?

  • Hilton Giesenow

    Just had this on an older 2013 farm – turns out just simply exporting and re-importing the config does the trick sometimes, no need to swap anything. Got the advice from https://social.msdn.microsoft.com/Forums/vstudio/en-US/3cf6048e-2b5f-47a0-aafd-56a6c3b3b9a8/cache-service-immediately-stops-with-an-error?forum=velocity, combined with your post so thanks