Forum Discussion
No KB that I'm aware of. Their RCA was...
Good Morning!
Here is the root cause our Engineering team has identified:
Looking at the threads in hostd, we see that there are lots of threads blocked on the lock of the host managed object.
11 threads (threads 12, 14, 15, 16, 17, 18, 19, 20, 21, 26, 27) were blocked trying to read-lock the host.
The thread that holds the read lock is thread 2. It is blocked in some vSAN code.
Code in the GetRuntime() property performed some RPC operations and then blocked waiting on a condition variable. This caused a deadlock.
Whether it is a true deadlock depends on where the event the vSAN stub was waiting for would be generated. If it would come from an I/O thread, the blocked thread would eventually be unblocked. If it needed a worker thread to be generated, it would be a deadlock by thread starvation.
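To make the two cases concrete, here is a minimal Python sketch of the general pattern. It is not hostd or vSAN code. The pool size of one, the `property_getter` function, and the `run_scenario` helper are all hypothetical stand-ins. The timeout exists only so the demo terminates, whereas the wait described in the RCA is presumably unbounded.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def run_scenario(event_from_io_thread: bool, timeout: float = 0.5) -> bool:
    """Return True if the blocked worker was woken, False if it starved."""
    # Hypothetical worker pool with a single worker, so one blocked task
    # is enough to exhaust it.
    pool = ThreadPoolExecutor(max_workers=1)
    done = threading.Event()

    def property_getter() -> bool:
        # Stand-in for a getter that issues an RPC and then blocks
        # until the completion event arrives.
        if event_from_io_thread:
            # Completion is delivered by a dedicated thread outside the
            # pool, so the wait below is eventually satisfied.
            threading.Thread(target=done.set).start()
        else:
            # Completion needs a pool worker, but the only worker is the
            # one running this function: starvation.
            pool.submit(done.set)
        # Bounded wait for the demo only; an unbounded wait would hang.
        return done.wait(timeout)

    woken = pool.submit(property_getter).result()
    pool.shutdown(wait=True)
    return woken


if __name__ == "__main__":
    print("event from I/O thread, worker woken:", run_scenario(True))
    print("event needs a worker, worker woken:", run_scenario(False))
```

In the second scenario the completion task only runs after the getter gives up and frees the worker. With an unbounded wait it would never run, and any thread queued behind the lock held by the blocked thread would hang with it.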
Since the root cause of the bug is a piece of vSAN code that causes a deadlock, our Engineering team is working with the vSAN team to get more insight into the respective property.