java – Infinispan health check not responding sometimes

Currently, we deploy three pods of Infinispan 13.0.6 in one DC and another three pods of Infinispan 13.0.6 in another DC under OpenShift. One of our biggest issues is that the health check at http://127.0.0.1:11222/rest/v2/cache-managers/default/health/status frequently stops responding, and the pods get killed because they fail the health check. We checked all the JVM configurations, and GC pauses take only about 20-50 ms (the cache size is less than 200 MB). We enabled Infinispan debug logging, but cannot see any obvious problem in the log:

2022-04-26 16:08:27,936 DEBUG (non-blocking-thread--p2-t1) [org.infinispan.statetransfer.StateConsumerImpl] Removing no longer owned entries for cache ___hotRodTopologyCache_hotrod-default
2022-04-26 16:08:58,583 DEBUG (non-blocking-thread--p2-t1) [org.infinispan.server.hotrod.ClientListenerRegistry] Channel disconnected, removing event sender listener for id: [B0x63F977FB33D5BD49D503B9D39755F0F6
2022-04-26 16:08:58,583 DEBUG (non-blocking-thread--p2-t1) [org.infinispan.server.hotrod.ClientListenerRegistry] Channel disconnected, removing event sender listener for id: [B0xFE5F80F4CE74763B2DEDAA7CE24F27F2
2022-04-26 16:08:58,583 DEBUG (non-blocking-thread--p2-t1) [org.infinispan.server.hotrod.ClientListenerRegistry] Channel disconnected, removing event sender listener for id: [B0xAB85D37038BDDCDC607C56EFA28810AB
2022-04-26 16:08:58,583 DEBUG (non-blocking-thread--p2-t1) [org.infinispan.server.hotrod.ClientListenerRegistry] Channel disconnected, removing event sender listener for id: [B0x7C7970E75BA3DF681589BE00F49DAA4F
2022-04-26 16:08:58,583 DEBUG (non-blocking-thread--p2-t1) [org.infinispan.server.hotrod.ClientListenerRegistry] Channel disconnected, removing event sender listener for id: [B0x52BE86595CC53C803242CF11EA235EBC
2022-04-26 16:08:58,583 DEBUG (non-blocking-thread--p2-t1) [org.infinispan.server.hotrod.ClientListenerRegistry] Channel disconnected, removing event sender listener for id: [B0x041352068F6C6160D9CD38CF2E8D41F8
2022-04-26 16:08:58,583 DEBUG (non-blocking-thread--p2-t1) [org.infinispan.server.hotrod.ClientListenerRegistry] Channel disconnected, removing event sender listener for id: [B0x8F0F435791D64A7CE4440B9AA593869B
2022-04-26 16:08:58,583 DEBUG (non-blocking-thread--p2-t1) [org.infinispan.server.hotrod.ClientListenerRegistry] Channel disconnected, removing event sender listener for id: [B0x238866F3A5EF2ADED0E433541813E8B9
2022-04-26 16:08:58,603 DEBUG (non-blocking-thread--p2-t2) [org.infinispan.server.hotrod.ClientListenerRegistry] Channel disconnected, removing event sender listener for id: [B0xEDE9E166865BD7E129911A9F84C276E0
2022-04-26 16:09:23,823 DEBUG (non-blocking-thread--p2-t2) [org.infinispan.server.hotrod.ClientListenerRegistry] Channel disconnected, removing event sender listener for id: [B0xC931FFA88895CDAD20CD4F6C3C93A497
2022-04-26 16:09:23,825 DEBUG (non-blocking-thread--p2-t1) [org.infinispan.server.hotrod.ClientListenerRegistry] Channel disconnected, removing event sender listener for id: [B0xC1BA71B5104AC08802435F6BC1243366
2022-04-26 16:09:27,675 DEBUG (Timer runner-2,infinispan-server-dca-845c7b4cb-njzsr-17472) [org.jgroups.protocols.UNICAST3] infinispan-server-dca-845c7b4cb-njzsr-17472: removing expired connection for infinispan-server-dca-845c7b4cb-knn9g-1190 (240068 ms old) from send_table
2022-04-26 16:09:27,675 DEBUG (Timer runner-2,infinispan-server-dca-845c7b4cb-njzsr-17472) [org.jgroups.protocols.UNICAST3] infinispan-server-dca-845c7b4cb-njzsr-17472: removing expired connection for infinispan-server-dca-845c7b4cb-knn9g-1190 (240068 ms old) from recv_table
2022-04-26 16:09:38,028 DEBUG (non-blocking-thread--p2-t2) [org.infinispan.server.hotrod.ClientListenerRegistry] Channel disconnected, removing event sender listener for id: [B0xA84F87971CA41A7B571CAEFB4C764937
2022-04-26 16:09:38,115 DEBUG (non-blocking-thread--p2-t1) [org.infinispan.server.hotrod.ClientListenerRegistry] Channel disconnected, removing event sender listener for id: [B0x9996A56AD52F055526ECC41029317606
2022-04-26 16:09:48,613 DEBUG (non-blocking-thread--p2-t1) [org.infinispan.server.hotrod.ClientListenerRegistry] Channel disconnected, removing event sender listener for id: [B0xEBE99F39AAAA22913DE87C39AA0F0FDC
2022-04-26 16:09:48,613 DEBUG (non-blocking-thread--p2-t1) [org.infinispan.server.hotrod.ClientListenerRegistry] Channel disconnected, removing event sender listener for id: [B0xF1D05C4F8709884D7BF5D6C8B5EA8582
2022-04-26 16:09:48,613 DEBUG (non-blocking-thread--p2-t1) [org.infinispan.server.hotrod.ClientListenerRegistry] Channel disconnected, removing event sender listener for id: [B0xEB33B2982C5103C0C5F8432B7F5130F2
2022-04-26 16:09:48,613 DEBUG (non-blocking-thread--p2-t1) [org.infinispan.server.hotrod.ClientListenerRegistry] Channel disconnected, removing event sender listener for id: [B0x92B71C3DC97C6315198FDCB27238B833
2022-04-26 16:09:48,613 DEBUG (non-blocking-thread--p2-t1) [org.infinispan.server.hotrod.ClientListenerRegistry] Channel disconnected, removing event sender listener for id: [B0xAEBDFA9045BBCA626AC3ACD7981E0941
2022-04-26 16:09:48,613 DEBUG (non-blocking-thread--p2-t1) [org.infinispan.server.hotrod.ClientListenerRegistry] Channel disconnected, removing event sender listener for id: [B0xEDE9E166865BD7E129911A9F84C276E0
2022-04-26 16:09:48,613 DEBUG (non-blocking-thread--p2-t1) [org.infinispan.server.hotrod.ClientListenerRegistry] Channel disconnected, removing event sender listener for id: [B0x1449D385C2B9379E66A1B0A14B137049
2022-04-26 16:09:48,884 DEBUG (non-blocking-thread--p2-t2) [org.infinispan.server.hotrod.ClientListenerRegistry] Channel disconnected, removing event sender listener for id: [B0x238866F3A5EF2ADED0E433541813E8B9
2022-04-26 16:09:48,885 DEBUG (non-blocking-thread--p2-t1) [org.infinispan.server.hotrod.ClientListenerRegistry] Channel disconnected, removing event sender listener for id: [B0x8F0F435791D64A7CE4440B9AA593869B
2022-04-26 16:09:57,993 DEBUG (non-blocking-thread--p2-t2) [org.infinispan.server.hotrod.ClientListenerRegistry] Channel disconnected, removing event sender listener for id: [B0xFB4DEFE662E301948C0B1EA5BA4FAD89
2022-04-26 16:10:13,929 DEBUG (non-blocking-thread--p2-t2) [org.infinispan.server.hotrod.ClientListenerRegistry] Channel disconnected, removing event sender listener for id: [B0x496849732A7832711F82CF11D270B94E
2022-04-26 16:10:38,809 DEBUG (non-blocking-thread--p2-t1) [org.infinispan.server.hotrod.ClientListenerRegistry] Channel disconnected, removing event sender listener for id: [B0xF1D05C4F8709884D7BF5D6C8B5EA8582
2022-04-26 16:10:38,905 DEBUG (non-blocking-thread--p2-t2) [org.infinispan.server.hotrod.ClientListenerRegistry] Channel disconnected, removing event sender listener for id: [B0x1449D385C2B9379E66A1B0A14B137049
2022-04-26 16:11:18,418 DEBUG (non-blocking-thread--p2-t2) [org.infinispan.server.hotrod.ClientListenerRegistry] Channel disconnected, removing event sender listener for id: [B0x29815A6BC19D0CFC467E1B1174B73457
2022-04-26 16:11:29,265 DEBUG (non-blocking-thread--p2-t1) [org.infinispan.server.hotrod.ClientListenerRegistry] Channel disconnected, removing event sender listener for id: [B0x238866F3A5EF2ADED0E433541813E8B9
2022-04-26 16:11:29,278 DEBUG (non-blocking-thread--p2-t2) [org.infinispan.server.hotrod.ClientListenerRegistry] Channel disconnected, removing event sender listener for id: [B0x041352068F6C6160D9CD38CF2E8D41F8
2022-04-26 16:11:38,270 DEBUG (non-blocking-thread--p2-t2) [org.infinispan.server.hotrod.ClientListenerRegistry] Channel disconnected, removing event sender listener for id: [B0x22E3ABA27E342442A1205D444329A137
2022-04-26 16:11:54,126 DEBUG (non-blocking-thread--p2-t2) [org.infinispan.server.hotrod.ClientListenerRegistry] Channel disconnected, removing event sender listener for id: [B0x6503F87FCDA766988BCF722CA64A4182
2022-04-26 16:12:10,713 DEBUG (Timer runner-2,infinispan-server-dca-845c7b4cb-njzsr-17472) [org.jgroups.protocols.UNICAST3] infinispan-server-dca-845c7b4cb-njzsr-17472: removing expired connection for infinispan-server-dca-845c7b4cb-6r7mv-15796 (240065 ms old) from send_table
2022-04-26 16:12:10,713 DEBUG (Timer runner-2,infinispan-server-dca-845c7b4cb-njzsr-17472) [org.jgroups.protocols.UNICAST3] infinispan-server-dca-845c7b4cb-njzsr-17472: removing expired connection for infinispan-server-dca-845c7b4cb-6r7mv-15796 (240065 ms old) from recv_table
2022-04-26 16:12:19,202 DEBUG (non-blocking-thread--p2-t1) [org.infinispan.server.hotrod.ClientListenerRegistry] Channel disconnected, removing event sender listener for id: [B0xF1D05C4F8709884D7BF5D6C8B5EA8582
2022-04-26 16:12:31,176 INFO  (Thread-1) [org.infinispan.SERVER] ISPN080002: Infinispan Server stopping
2022-04-26 16:12:31,178 DEBUG (Thread-1) [org.infinispan.rest.RestServer] Stopping server REST-rest-default listening at xxx.xxx.xxx.xxx:11222
2022-04-26 16:12:31,178 DEBUG (Thread-1) [org.infinispan.server.core.AbstractProtocolServer] Stopping server REST-rest-default listening at xxx.xxx.xxx.xxx:11222
2022-04-26 16:12:31,178 DEBUG (Thread-1) [org.infinispan.server.core.AbstractProtocolServer] Server REST-rest-default stopped
2022-04-26 16:12:31,178 DEBUG (Thread-1) [org.infinispan.server.hotrod.HotRodServer] Stopping server HotRod-hotrod-default listening at xxx.xxx.xxx.xxx:11222
2022-04-26 16:12:31,179 DEBUG (Thread-1) [org.infinispan.registry.impl.InternalCacheRegistryImpl] Unregistering internal cache ___hotRodTopologyCache_hotrod-default
2022-04-26 16:12:31,179 DEBUG (Thread-1) [org.infinispan.cache.impl.CacheImpl] Stopping cache ___hotRodTopologyCache_hotrod-default on infinispan-server-dca-845c7b4cb-njzsr-17472
2022-04-26 16:12:31,182 DEBUG (Thread-1) [org.infinispan.topology.LocalTopologyManagerImpl] Node infinispan-server-dca-845c7b4cb-njzsr-17472 leaving cache ___hotRodTopologyCache_hotrod-default
*** Server process (99) received TERM signal ***
2022-04-26 16:12:31,193 DEBUG (Thread-1) [org.infinispan.server.core.AbstractProtocolServer] Stopping server HotRod-hotrod-default listening at 198.18.226.101:11222
2022-04-26 16:12:31,193 DEBUG (Thread-1) [org.infinispan.server.core.AbstractProtocolServer] Server HotRod-hotrod-default stopped

I’m wondering: apart from the JVM doing a GC, are there any other reasons why Infinispan would not respond to the health check? Does the Infinispan server respond to health checks during state transfer?
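To narrow this down, it may help to poll the same endpoint independently of the OpenShift probe and record how long each request takes, so that slow or missing responses can be lined up against the timestamps in the debug log. Below is a minimal sketch using the JDK's built-in `java.net.http.HttpClient` (Java 11+). The URL is the one from the question; the class name `HealthProbe`, the 5-second timeout, the 1-second interval, and the default of 3 probes are illustrative assumptions, not values taken from our deployment.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;
import java.time.Instant;

public class HealthProbe {
    // Endpoint from the question; timeout and interval are assumptions.
    static final String URL =
        "http://127.0.0.1:11222/rest/v2/cache-managers/default/health/status";
    static final Duration TIMEOUT = Duration.ofSeconds(5);
    static final long INTERVAL_MILLIS = 1000;

    /** Formats one successful probe as a single log line. */
    static String describe(int statusCode, String body, long elapsedMillis) {
        return String.format("status=%d body=%s elapsedMs=%d",
            statusCode, body.trim(), elapsedMillis);
    }

    /** Sends one GET and reports either the response or the failure, with latency. */
    static String probeOnce(HttpClient client) throws InterruptedException {
        HttpRequest request = HttpRequest.newBuilder(URI.create(URL))
            .timeout(TIMEOUT)
            .GET()
            .build();
        long start = System.nanoTime();
        try {
            HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
            long elapsed = (System.nanoTime() - start) / 1_000_000;
            return describe(response.statusCode(), response.body(), elapsed);
        } catch (IOException e) {
            // Covers connection refused and HttpTimeoutException alike.
            long elapsed = (System.nanoTime() - start) / 1_000_000;
            return "FAILED elapsedMs=" + elapsed + " cause=" + e;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        // Number of probes; pass a larger value to watch for longer.
        int count = args.length > 0 ? Integer.parseInt(args[0]) : 3;
        HttpClient client = HttpClient.newBuilder().connectTimeout(TIMEOUT).build();
        for (int i = 0; i < count; i++) {
            System.out.println(Instant.now() + " " + probeOnce(client));
            if (i < count - 1) {
                Thread.sleep(INTERVAL_MILLIS);
            }
        }
    }
}
```

Running this inside the pod (or via a port-forward) alongside the real probe should show whether requests are slow, time out, or are refused outright in the minutes before the TERM signal. That would at least distinguish a stalled server from a probe timeout that is simply too tight.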
