We are using SharePoint 2013 on Windows Server 2012.
We are having problems with 2013 workflows because the Distributed Cache keeps growing. At some point it crashes, and 2013 workflows then fail with an HTTP 401 Unauthorized error. I think Workflow Manager tries to retrieve the user's token and the request times out because of the cache's size. Requests to both the Distributed Logon Token Cache and the SPViewStateCache time out.
The ULS logs show:
Unexpected error occurred in method 'GetObject' , usage 'Distributed Logon Token Cache' - Exception 'Microsoft.ApplicationServer.Caching.DataCacheException: ErrorCode<ERRCA0018>:SubStatus<ES0001>:The request timed out.. Additional Information : The client was trying to communicate with the server : net.tcp://SP2013.Domain.local:22233
at Microsoft.ApplicationServer.Caching.DataCache.ThrowException(ResponseBody respBody, RequestBody reqBody)
at Microsoft.ApplicationServer.Caching.DataCache.InternalGet(String key, DataCacheItemVersion& version, String region, IMonitoringListener listener)
at Microsoft.ApplicationServer.Caching.DataCache.<>c__DisplayClass49.<Get>b__48()
at Microsoft.SharePoint.DistributedCaching.SPDistributedCache.GetObject(String key)'.
Unexpected error occurred in method 'Put' , usage 'SPViewStateCache' - Exception 'Microsoft.ApplicationServer.Caching.DataCacheException: ErrorCode<ERRCA0018>:SubStatus<ES0001>:The request timed out.. Additional Information : The client was trying to communicate with the server : net.tcp://SP2013Dev.DMC.local:22233
at Microsoft.ApplicationServer.Caching.DataCache.ThrowException(ResponseBody respBody, RequestBody reqBody)
at Microsoft.ApplicationServer.Caching.DataCache.InternalPut(String key, Object value, DataCacheItemVersion oldVersion, TimeSpan timeout, DataCacheTag[] tags, String region, IMonitoringListener listener)
at Microsoft.ApplicationServer.Caching.DataCache.<>c__DisplayClass25.<Put>b__24()
at Microsoft.ApplicationServer.Caching.DataCache.Put(String key, Object value, TimeSpan timeout)
at Microsoft.SharePoint.DistributedCaching.SPDistributedCache.Put(String key, Object value)'.
I've rebuilt the Workflow Manager farm, but the problem keeps coming back. After a couple of days of troubleshooting and rebuilding Workflow Manager, Service Bus, Distributed Cache, and UPS while trying to locate the problem, I got workflows running again, but they break again once the cache grows too big.
I set the limit to 1024 MB with this PowerShell command, but it's not working.
Update-SPDistributedCacheSize -CacheSizeInMB 1024
PS C:\Windows\system32> Get-AFCacheHostConfiguration -ComputerName SP2013 -CachePort "22233"
HostName : SP2013.Domain.local
ClusterPort : 22234
CachePort : 22233
ArbitrationPort : 22235
ReplicationPort : 22236
Size : 1024 MB
ServiceName : AppFabricCachingService
HighWatermark : 99%
LowWatermark : 90%
IsLeadHost : True
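For reference, this is how I read those watermark percentages against the 1024 MB limit. It is only a sketch. It assumes the AppFabric behaviour as I understand it: expired items start being evicted once memory use passes the low watermark, and items are evicted regardless of expiry once it passes the high watermark. The variable names are mine, not part of any cmdlet.

```powershell
# Values copied from the Get-AFCacheHostConfiguration output above.
$cacheSizeMB      = 1024
$lowWatermarkPct  = 90
$highWatermarkPct = 99

# Assumption: the watermarks are percentages of the configured cache size.
$lowWatermarkMB  = $cacheSizeMB * $lowWatermarkPct  / 100
$highWatermarkMB = $cacheSizeMB * $highWatermarkPct / 100

"Low watermark  (expired items evicted above this): {0} MB" -f $lowWatermarkMB
"High watermark (any items evicted above this):     {0} MB" -f $highWatermarkMB
```

If that reading is right, no eviction would be expected until the cache host is using roughly 920 MB.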
The cache just keeps growing. What can I do to have Distributed Cache delete expired data?
Command run at 3 PM:
PS C:\Windows\system32> Get-CacheStatistics -ComputerName sp2013 -CachePort 22233
Size : 3397632
ItemCount : 198
RegionCount : 304
NamedCacheCount : 11
RequestCount : 2530
MissCount : 562
At 4:30 PM:
PS C:\Windows\system32> Get-CacheStatistics -ComputerName sp2013 -CachePort 22233
Size : 8997888
ItemCount : 260
RegionCount : 607
NamedCacheCount : 11
RequestCount : 9372
MissCount : 1242
At 6 PM:
PS C:\Windows\system32> Get-CacheStatistics -ComputerName sp2013 -CachePort 22233
Size : 9325568
ItemCount : 222
RegionCount : 629
NamedCacheCount : 11
RequestCount : 10224
MissCount : 1292
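To put the three samples next to the configured limit, the arithmetic below converts each Size value to MB and to a percentage of the limit. It assumes Size is reported in bytes, which is how I read the AppFabric documentation; if that assumption is wrong, the figures below are wrong too. The sample list and variable names are just for illustration.

```powershell
# Size values copied from the three Get-CacheStatistics runs above.
# Assumption: Size is reported in bytes.
$samples = @(
    [pscustomobject]@{ Time = '3:00 PM'; SizeBytes = 3397632 },
    [pscustomobject]@{ Time = '4:30 PM'; SizeBytes = 8997888 },
    [pscustomobject]@{ Time = '6:00 PM'; SizeBytes = 9325568 }
)
$limitMB = 1024   # configured cache size from Get-AFCacheHostConfiguration

$report = foreach ($s in $samples) {
    $sizeMB = $s.SizeBytes / 1MB   # 1MB is PowerShell's literal for 1048576
    [pscustomobject]@{
        Time       = $s.Time
        SizeMB     = [math]::Round($sizeMB, 2)
        PctOfLimit = [math]::Round(100 * $sizeMB / $limitMB, 2)
    }
}

$report | Format-Table -AutoSize
```

If the byte assumption holds, the largest sample is about 8.9 MB, which is under 1% of the 1024 MB limit.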
Here are my cache settings for the Distributed Logon Token Cache and the SPViewStateCache.
PS C:\Windows\system32> Get-CacheConfig -CacheName DistributedViewStateCache_a080f929-f0d1-42cd-a9c1-14cc3ae717c3
CacheName : DistributedViewStateCache_a080f929-f0d1-42cd-a9c1-14cc3ae717c3
TimeToLive : 10 mins
CacheType : Partitioned
Secondaries : 0
MinSecondaries : 0
IsExpirable : True
EvictionType : LRU
NotificationsEnabled : False
WriteBehindEnabled : False
WriteBehindInterval : 300
WriteBehindRetryInterval : 60
WriteBehindRetryCount : -1
ReadThroughEnabled : False
ProviderType :
ProviderSettings : {}
PS C:\Windows\system32> Get-CacheConfig -CacheName DistributedLogonTokenCache_a080f929-f0d1-42cd-a9c1-14cc3ae717c3
CacheName : DistributedLogonTokenCache_a080f929-f0d1-42cd-a9c1-14cc3ae717c3
TimeToLive : 10 mins
CacheType : Partitioned
Secondaries : 0
MinSecondaries : 0
IsExpirable : True
EvictionType : LRU
NotificationsEnabled : False
WriteBehindEnabled : False
WriteBehindInterval : 300
WriteBehindRetryInterval : 60
WriteBehindRetryCount : -1
ReadThroughEnabled : False
ProviderType :
ProviderSettings : {}