Running applications is only part of the job; understanding what’s happening inside your cluster is what keeps systems r…