Troubleshooting¶

Early warnings from pods¶

Warnings, like the one below, can be seen from pods scheduled on newly started nodes.

MountVolume.MountDevice failed for volume "xxxx" : kubernetes.io/csi: attacher.MountDevice failed to create newCsiDriverClient: driver name gcs.csi.ofek.dev not found in the list of registered CSI drivers

Those warnings are temporary and reflect that the driver is still starting. Kubernetes will retry until the driver is ready. The problem is often encountered in clusters with auto-scaler as nodes come and go.

This is a known issue of kubernetes (see kubernetes#75890).

A possible workaround is to taint all nodes running the csi-gcs driver like <driver name>/driver-ready=false:NoSchedule and use, as suggested in this comment, a custom controller like wish/nodetaint to remove the taint once the csi-gcs pod is ready.

This workaround will ensure pods are repelled from nodes until the csi-gcs driver is ready without interfering with other components like the cluster auto-scaler.

By default, <driver name> is gcs.csi.ofek.dev.

Warning

The driver labels the node with <driver name>/driver-ready=true to reflect its readiness state. It's possible to use a node selector to select nodes with a ready csi-gcs node driver. However, it doesn't work with clusters using cluster-autoscaler as the auto-scaler will never find a node with matching <driver name>/driver-ready=true label.

Last update: May 14, 2022