Blog | Longhorn

Troubleshooting: Pod with `volumeMode: Block` is stuck in terminating

April 6, 2022 · 2 min read

Phan Le

Applicable versions

All Longhorn versions.

Symptoms

A user has a pod that uses a PVC with volumeMode: Block provisioned by the Longhorn CSI driver. After the Longhorn volume crashes unexpectedly (for example, because of network problems, CPU pressure, or a hardware fault), the user cannot delete the pod. The pod stays stuck in terminating indefinitely because the kubelet refuses to unmount the block volume. This prevents the user from cleaning up the pod and starting a replacement, which leads to prolonged service degradation. For example, if the pod is part of a StatefulSet, the replacement pod cannot come up while the old pod is stuck terminating.
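To make the affected setup concrete, here is a minimal sketch of the kind of claim the symptom applies to, written as a Python helper that emits a manifest. The helper name `block_pvc_manifest`, the claim name `block-pvc`, the 1Gi size, and the `longhorn` storage class name are illustrative assumptions, not values taken from the post.

```python
import json


def block_pvc_manifest(name: str, size: str, storage_class: str = "longhorn") -> dict:
    """Build a PersistentVolumeClaim manifest that requests a raw block device.

    The storage class name defaults to "longhorn", which is an assumption;
    use whatever class your cluster maps to the Longhorn CSI driver.
    """
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            "accessModes": ["ReadWriteOnce"],
            # "Block" exposes the volume as a device instead of a mounted filesystem.
            "volumeMode": "Block",
            "storageClassName": storage_class,
            "resources": {"requests": {"storage": size}},
        },
    }


if __name__ == "__main__":
    # Kubernetes accepts JSON manifests as well as YAML.
    print(json.dumps(block_pvc_manifest("block-pvc", "1Gi"), indent=2))
```

A pod consumes such a claim through `volumeDevices` (a device path) rather than `volumeMounts`, which is why the kubelet handles its teardown differently from a filesystem volume.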

Troubleshooting: Instance manager pods are restarted every hour

February 25, 2022 · 2 min read

Phan Le

Applicable versions

v1.0.1 or newer

Background

Each Longhorn volume has one engine and one or more replicas (see the Longhorn architecture documentation for more detail). When a Longhorn volume is attached, Longhorn launches a process for each engine and replica object. Engine processes are launched inside engine instance manager pods (the instance-manager-e-xxxxxxxx pods in the longhorn-system namespace). Replica processes are launched inside replica instance manager pods (the instance-manager-r-xxxxxxxx pods in the longhorn-system namespace).
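The naming convention above can be sketched as a small classifier that tells engine and replica instance manager pods apart by name prefix. The function name `instance_manager_kind` and the sample pod names are hypothetical; only the two prefixes come from the text.

```python
from typing import Optional

ENGINE_PREFIX = "instance-manager-e-"
REPLICA_PREFIX = "instance-manager-r-"


def instance_manager_kind(pod_name: str) -> Optional[str]:
    """Return "engine" or "replica" for an instance manager pod name, else None."""
    if pod_name.startswith(ENGINE_PREFIX):
        return "engine"
    if pod_name.startswith(REPLICA_PREFIX):
        return "replica"
    return None


if __name__ == "__main__":
    # Sample names are made up for illustration.
    for name in ("instance-manager-e-1a2b3c4d", "instance-manager-r-5e6f7a8b", "longhorn-manager-abcde"):
        print(name, "->", instance_manager_kind(name))
```

Feeding it the pod names from the longhorn-system namespace gives a quick view of which pods host engine processes and which host replica processes.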

Troubleshooting: Open-iSCSI on RHEL based systems

February 22, 2022 · 2 min read

Keith Lucas

Applicable versions

All Longhorn versions.

Symptoms

The iscsi.service systemd service may add about 2-3 minutes to the boot time of a node if the node is restarted with Longhorn volumes attached to it.

Troubleshooting: Upgrading volume engine is stuck in deadlock

January 3, 2022 · 3 min read

Phan Le

Applicable versions

This happens when users upgrade Longhorn from version ≤ v1.1.1 to a newer version.
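As a rough illustration of that version condition, the sketch below checks whether an upgrade path starts at or below v1.1.1 and moves to a newer version. The helper names `parse_version` and `upgrade_path_affected` are hypothetical, and the parser assumes plain `vMAJOR.MINOR.PATCH` strings with no pre-release suffixes.

```python
from typing import Tuple

LAST_AFFECTED = (1, 1, 1)  # upgrades starting at or below v1.1.1 can hit the issue


def parse_version(version: str) -> Tuple[int, int, int]:
    """Parse a "vMAJOR.MINOR.PATCH" string into a tuple of ints.

    Assumption: no pre-release or build suffix (e.g. "-rc1") is present.
    """
    major, minor, patch = version.lstrip("v").split(".")
    return int(major), int(minor), int(patch)


def upgrade_path_affected(from_version: str, to_version: str) -> bool:
    """Return True if the upgrade starts at <= v1.1.1 and targets a newer version."""
    source = parse_version(from_version)
    target = parse_version(to_version)
    return source <= LAST_AFFECTED and target > source


if __name__ == "__main__":
    print(upgrade_path_affected("v1.1.1", "v1.2.3"))
    print(upgrade_path_affected("v1.2.0", "v1.2.3"))
```

Tuples are compared element by element, so `(1, 1, 1) < (1, 2, 0)` holds without any string handling.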

Symptoms

Upgrading a Longhorn system involves two steps: first upgrade the Longhorn manager to the latest version, then use the upgraded Longhorn manager to upgrade the Longhorn engine to the latest version. During the second step (upgrading the Longhorn engine), some volumes may get stuck in engine upgrading. Volume attachment or detachment may also fail to finish (e.g., Longhorn volumes are stuck in the detaching or attaching state).

Tip: Set Longhorn To Only Use Storage On A Specific Set Of Nodes

November 15, 2021 · 2 min read

Phan Le

Applicable versions

All Longhorn versions.

Background

Let's say you have a cluster of 5 nodes (node-1, node-2, ..., node-5). You have fast disks on node-1, node-2, and node-3, so you want Longhorn to use storage on those nodes only. There are a few ways to do this.
