Blog | Longhorn

Troubleshooting: Some old instance manager pods are still running after upgrade

November 9, 2021 · 3 min read

Derek Su

Applicable versions

Longhorn ≥ v0.8.0.

Symptoms

Some old instance manager pods are still running after upgrade.

Details

This behavior is expected rather than a bug. The following paragraphs explain why.

Let us first look at an example. We created a pod with a volume backed by 3 replicas in a Kubernetes cluster with 1 master node and 4 worker nodes. The running volume is associated with the engine instance manager pod instance-manager-e-ec3eb207.

Troubleshooting: Volume cannot be cleaned up after the node of the workload pod is down and recovered

November 8, 2021 · One min read

Derek Su

Applicable versions

All Longhorn versions.

Symptoms

Volume cannot be cleaned up after the node of the workload pod is down and recovered.

Solution

The root cause is a race condition in the pod cleanup process of Kubernetes. It has been fixed in Kubernetes v1.22.0 and later, according to this commit.
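
To tell whether a node already runs a release with the fix, its version can be compared against v1.22.0. The following is a minimal sketch, not taken from the original post: the helper names parse_version and has_cleanup_fix are hypothetical, and the input is assumed to look like the version strings shown by kubectl get nodes (for example v1.21.5). Pre-release and vendor suffixes are ignored.

```python
import re

# First Kubernetes release that contains the fix, as stated in the post.
FIXED_IN = (1, 22, 0)


def parse_version(version: str) -> tuple:
    """Extract (major, minor, patch) from a string such as 'v1.21.5+k3s1'.

    Anything after the patch number (pre-release or vendor suffix) is ignored.
    """
    match = re.match(r"v?(\d+)\.(\d+)\.(\d+)", version.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.groups())


def has_cleanup_fix(version: str) -> bool:
    """Return True if the given Kubernetes version is v1.22.0 or later."""
    return parse_version(version) >= FIXED_IN
```

For example, has_cleanup_fix("v1.21.5") is False, so a cluster on that version may still hit the race condition.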

Troubleshooting: DNS Resolution Failed

October 26, 2021 · One min read

JenTing Hsiao

Applicable versions

All Longhorn versions.

Symptoms

The longhorn-driver-deployer, longhorn-csi-plugin, or longhorn-ui Pods are unable to access the longhorn-manager backend at http://longhorn-backend:9500/v1.
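
A quick way to confirm that the failure is at the DNS level is to resolve the backend host name from inside an affected Pod. The sketch below is an illustration only, not part of the original post: the function name diagnose_backend and its resolve parameter are hypothetical, and it assumes it runs in a Pod in the same namespace as the longhorn-backend service, so that the short service name is resolvable by the cluster DNS.

```python
import socket
from urllib.parse import urlparse


def diagnose_backend(url, resolve=socket.getaddrinfo):
    """Report whether the host in `url` resolves through the configured DNS.

    `resolve` defaults to socket.getaddrinfo and can be swapped for testing.
    """
    parsed = urlparse(url)
    host = parsed.hostname
    port = parsed.port or 80
    try:
        results = resolve(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror as err:
        # Name resolution itself failed; the HTTP request never had a chance.
        return {"host": host, "resolved": False, "error": str(err)}
    # Each result is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    addresses = sorted({info[4][0] for info in results})
    return {"host": host, "resolved": True, "addresses": addresses}
```

Calling diagnose_backend("http://longhorn-backend:9500/v1") returns a dictionary whose resolved field is False when the name cannot be looked up, which points to the cluster DNS rather than to longhorn-manager itself.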

Troubleshooting: Generate pprof runtime profiling data

October 18, 2021 · One min read

Derek Su

Applicable versions

Longhorn ≥ v1.1.2.

Symptoms

It is not possible to investigate longhorn-manager performance bottlenecks from the external state of the Longhorn processes alone.

Troubleshooting: Pod stuck in creating state when Longhorn volumes filesystem is corrupted

August 19, 2021 · 4 min read

Chin-Ya Huang

Derek Su

Applicable versions

All Longhorn versions.

Symptoms

The pod using a Longhorn volume with an ext4 filesystem stays in the ContainerCreating state, with errors in the log.

  Warning  FailedMount             30s (x7 over 63s)  kubelet                  MountVolume.SetUp failed for volume "pvc-bb8582d5-eaa4-479a-b4bf-328d1ef1785d" : rpc error: code = Internal desc = 'fsck' found errors on device /dev/longhorn/pvc-bb8582d5-eaa4-479a-b4bf-328d1ef1785d but could not correct them: fsck from util-linux 2.31.1
ext2fs_check_if_mount: Can't check if filesystem is mounted due to missing mtab file while determining whether /dev/longhorn/pvc-bb8582d5-eaa4-479a-b4bf-328d1ef1785d is mounted.
/dev/longhorn/pvc-bb8582d5-eaa4-479a-b4bf-328d1ef1785d contains a file system with errors, check forced.
/dev/longhorn/pvc-bb8582d5-eaa4-479a-b4bf-328d1ef1785d: Inodes that were part of a corrupted orphan linked list found.  

/dev/longhorn/pvc-bb8582d5-eaa4-479a-b4bf-328d1ef1785d: UNEXPECTED INCONSISTENCY; RUN fsck MANUALLY.
  (i.e., without -a or -p options)
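
When many pods report this event, it can help to pull the affected volume names out of the messages automatically. The following is a rough sketch based only on the message format shown above: the function name corrupted_volumes is hypothetical, and the /dev/longhorn/pvc-... path pattern is assumed from this single sample rather than from any documented format.

```python
import re

# Matches device paths like /dev/longhorn/pvc-bb8582d5-eaa4-479a-b4bf-328d1ef1785d
# (pattern inferred from the sample event above).
DEVICE_RE = re.compile(r"/dev/longhorn/(pvc-[0-9a-f-]+)")


def corrupted_volumes(event_message: str) -> list:
    """Return the unique Longhorn volume names mentioned in an fsck failure message."""
    if "fsck" not in event_message:
        return []
    seen = []
    for name in DEVICE_RE.findall(event_message):
        if name not in seen:  # keep first-seen order, drop repeats
            seen.append(name)
    return seen
```

Applied to the event above, the function yields the single volume pvc-bb8582d5-eaa4-479a-b4bf-328d1ef1785d, even though its device path appears several times in the message.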
