# Seagate Exos X CSI (ME5 dual-site): installation and operations guide

This README documents how I made the installation of the *Seagate Exos X CSI Driver* (which supports ME5) **reproducible** on a Kubernetes cluster with **two arrays / two zones** (site-a and site-b), using iSCSI + multipath and *per-zone topology*.

> **Goal**
>
> * A single driver deployment (Helm).
> * **Two StorageClasses** (one per site), each with its own `allowedTopologies` and credentials (Secret).
> * *WaitForFirstConsumer*, so that each volume is created in the **same zone** as the pod.
> * Fast iSCSI mounts thanks to a well-configured multipath (`greedy` mode).

---

## 1) iSCSI configuration on the nodes

On **every node** in the cluster:

1. Install the dependencies:

```bash
sudo zypper install open-iscsi yast2-iscsi-client multipath-tools
```

2. Enable and start the iSCSI service:

```bash
sudo systemctl enable --now iscsid.service
systemctl status iscsid.service
```

3. Discover the targets on the arrays:

```bash
sudo iscsiadm -m discovery -t sendtargets -p 192.168.3.11
sudo iscsiadm -m discovery -t sendtargets -p 192.168.3.21
```

At this point, **add a host group containing every host on each array**.

4. Log in to all portals of both arrays:

```bash
# Logins run in parallel as background jobs; `wait` at the end blocks until all finish.
# Run `sudo -v` first so that no background job stalls on a password prompt.

# Array site-a
sudo iscsiadm -m node -T iqn.1988-11.com.dell:01.array.bc305b5e92b6 -p 192.168.3.11:3260 --login &
sudo iscsiadm -m node -T iqn.1988-11.com.dell:01.array.bc305b5e92b6 -p 192.168.3.12:3260 --login &
sudo iscsiadm -m node -T iqn.1988-11.com.dell:01.array.bc305b5e92b6 -p 192.168.3.13:3260 --login &
sudo iscsiadm -m node -T iqn.1988-11.com.dell:01.array.bc305b5e92b6 -p 192.168.3.14:3260 --login &
sudo iscsiadm -m node -T iqn.1988-11.com.dell:01.array.bc305b5e92b6 -p 192.168.3.15:3260 --login &
sudo iscsiadm -m node -T iqn.1988-11.com.dell:01.array.bc305b5e92b6 -p 192.168.3.16:3260 --login &
sudo iscsiadm -m node -T iqn.1988-11.com.dell:01.array.bc305b5e92b6 -p 192.168.3.17:3260 --login &
sudo iscsiadm -m node -T iqn.1988-11.com.dell:01.array.bc305b5e92b6 -p 192.168.3.18:3260 --login &

# Array site-b
sudo iscsiadm -m node -T iqn.1988-11.com.dell:01.array.bc305b5e8e43 -p 192.168.3.21:3260 --login &
sudo iscsiadm -m node -T iqn.1988-11.com.dell:01.array.bc305b5e8e43 -p 192.168.3.22:3260 --login &
sudo iscsiadm -m node -T iqn.1988-11.com.dell:01.array.bc305b5e8e43 -p 192.168.3.23:3260 --login &
sudo iscsiadm -m node -T iqn.1988-11.com.dell:01.array.bc305b5e8e43 -p 192.168.3.24:3260 --login &
sudo iscsiadm -m node -T iqn.1988-11.com.dell:01.array.bc305b5e8e43 -p 192.168.3.25:3260 --login &
sudo iscsiadm -m node -T iqn.1988-11.com.dell:01.array.bc305b5e8e43 -p 192.168.3.26:3260 --login &
sudo iscsiadm -m node -T iqn.1988-11.com.dell:01.array.bc305b5e8e43 -p 192.168.3.27:3260 --login &
sudo iscsiadm -m node -T iqn.1988-11.com.dell:01.array.bc305b5e8e43 -p 192.168.3.28:3260 --login &

wait
```

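
The sixteen logins above follow one pattern: a single target IQN per array and eight consecutive portal addresses. As a minimal sketch, assuming the portals really are contiguous (`.11` to `.18` and `.21` to `.28`, as listed above), the commands can be generated instead of typed. `login_cmds` is a hypothetical helper, not part of `open-iscsi`; it only prints the commands so that they can be reviewed before anything runs.

```bash
#!/bin/sh
# Print (do not run) one iscsiadm login command per portal.
# Args: target IQN, first three octets, first host octet, last host octet.
login_cmds() {
    iqn=$1 prefix=$2 first=$3 last=$4
    i=$first
    while [ "$i" -le "$last" ]; do
        printf 'iscsiadm -m node -T %s -p %s.%s:3260 --login\n' "$iqn" "$prefix" "$i"
        i=$((i + 1))
    done
}

# site-a array: portals 192.168.3.11 to 192.168.3.18
login_cmds iqn.1988-11.com.dell:01.array.bc305b5e92b6 192.168.3 11 18
# site-b array: portals 192.168.3.21 to 192.168.3.28
login_cmds iqn.1988-11.com.dell:01.array.bc305b5e8e43 192.168.3 21 28

# After reviewing the output, one way to execute it is to pipe it to a root shell:
#   login_cmds <iqn> 192.168.3 11 18 | sudo sh
```
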
5. Verify the active sessions:

```bash
sudo iscsiadm -m session
```

6. Edit the iSCSI configuration in `/etc/iscsi/iscsid.conf`:

```conf
iscsid.startup = /bin/systemctl start iscsid.socket iscsiuio.socket
iscsid.safe_logout = Yes
node.startup = automatic
node.leading_login = No
node.session.timeo.replacement_timeout = 120
node.conn[0].timeo.login_timeout = 15
node.conn[0].timeo.logout_timeout = 15
node.conn[0].timeo.noop_out_interval = 5
node.conn[0].timeo.noop_out_timeout = 5
node.session.err_timeo.abort_timeout = 15
node.session.err_timeo.lu_reset_timeout = 30
node.session.err_timeo.tgt_reset_timeout = 30
node.session.err_timeo.host_reset_timeout = 60
node.session.initial_login_retry_max = 8
node.session.cmds_max = 128
node.session.queue_depth = 32
node.session.xmit_thread_priority = -20
node.session.iscsi.InitialR2T = No
node.session.iscsi.ImmediateData = Yes
node.session.iscsi.FirstBurstLength = 262144
node.session.iscsi.MaxBurstLength = 16776192
node.conn[0].iscsi.MaxRecvDataSegmentLength = 262144
node.conn[0].iscsi.MaxXmitDataSegmentLength = 0
discovery.sendtargets.iscsi.MaxRecvDataSegmentLength = 32768
node.session.nr_sessions = 1
node.session.reopen_max = 0
node.session.iscsi.FastAbort = Yes
node.session.scan = auto
```

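
Because the same file has to match on every node, it can help to check a few key settings mechanically. The following is a minimal sketch, assuming the `key = value` layout shown above; `iscsid_get` and `iscsid_expect` are hypothetical helpers written for this README, not tools shipped with `open-iscsi`. Commented-out lines are ignored, and if a key appears twice the last assignment wins.

```bash
#!/bin/sh
# Print the value assigned to KEY in an iscsid.conf-style file.
iscsid_get() {
    key=$1 file=$2
    awk -F' *= *' -v k="$key" '$1 == k { v = $2 } END { if (v != "") print v }' "$file"
}

# Compare the configured value with the expected one; return 1 on mismatch.
iscsid_expect() {
    key=$1 want=$2 file=$3
    got=$(iscsid_get "$key" "$file")
    if [ "$got" = "$want" ]; then
        echo "ok    $key = $got"
    else
        echo "DIFF  $key = ${got:-<unset>} (expected $want)"
        return 1
    fi
}

# Example calls, to be run on a node:
#   iscsid_expect node.startup automatic /etc/iscsi/iscsid.conf
#   iscsid_expect node.session.timeo.replacement_timeout 120 /etc/iscsi/iscsid.conf
```
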

---

## 2) Node prerequisites

### 2.1. `/etc/multipath.conf` configuration

```conf
defaults {
    user_friendly_names "no"
    find_multipaths "greedy"
    no_path_retry "queue"
}

devices {
    device {
        vendor "DellEMC"
        product "ME5"
        path_grouping_policy "multibus"
        path_checker "tur"
        prio "alua"
    }
}
```

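> **Why `greedy`?**
>
> * With `find_multipaths "greedy"`, multipath treats every non-blacklisted device as a path and creates the *map* as soon as the first path appears, instead of waiting for a second path with the same WWID (as `yes` or `smart` do). iSCSI volumes therefore get their multipath device without delay; in exchange, any local disk that must stay out of multipath has to be blacklisted explicitly.
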
### 2.2. Multipath and iSCSI running

Make sure `multipathd` is running:

```bash
sudo systemctl restart multipathd
sudo multipath -r
```

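
Since both daemons need to be up before the driver can attach anything, a quick check per node is useful. This is a minimal sketch built on `systemctl is-active --quiet`, which exits non-zero when a unit is not active; `check_services` is a hypothetical helper name.

```bash
#!/bin/sh
# Report the state of each given systemd unit; return 1 if any is not active.
check_services() {
    rc=0
    for svc in "$@"; do
        if systemctl is-active --quiet "$svc"; then
            echo "ok    $svc"
        else
            echo "DOWN  $svc"
            rc=1
        fi
    done
    return "$rc"
}

# Example call, to be run on a node:
#   check_services iscsid multipathd
```
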
### 2.3. Mount propagation (rshared)

```bash
sudo mount --make-rshared /

# systemd drop-in for kubelet
sudo install -d /etc/systemd/system/kubelet.service.d
cat <<'EOF' | sudo tee /etc/systemd/system/kubelet.service.d/10-mount-propagation.conf
[Service]
MountFlags=
ExecStartPre=/bin/mkdir -p /var/lib/kubelet
ExecStartPre=/bin/mount --bind /var/lib/kubelet /var/lib/kubelet
ExecStartPre=/bin/mount --make-rshared /var/lib/kubelet
EOF

sudo systemctl daemon-reload
sudo systemctl restart kubelet
```

Verify:

```bash
sudo findmnt -o TARGET,PROPAGATION /
sudo findmnt -o TARGET,PROPAGATION /var/lib/kubelet
```

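
The `findmnt` output above has to be read by eye. As a minimal sketch, the same check can be turned into a pass/fail test; `is_shared` and `check_propagation` are hypothetical helper names, and the only assumption is that `findmnt` reports `shared` in its `PROPAGATION` column for a shared mount.

```bash
#!/bin/sh
# Succeed when the mount at the given path reports shared propagation.
is_shared() {
    findmnt -n -o PROPAGATION "$1" | grep -q shared
}

# Check several mount points; return 1 if any of them is not shared.
check_propagation() {
    rc=0
    for mnt in "$@"; do
        if is_shared "$mnt"; then
            echo "ok          $mnt"
        else
            echo "NOT shared  $mnt"
            rc=1
        fi
    done
    return "$rc"
}

# Example call, to be run on a node:
#   check_propagation / /var/lib/kubelet
```
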
### 2.4. Topology labels on the nodes

```bash
kubectl label nodes <nodo-del-site-a> topology.kubernetes.io/zone=site-a --overwrite
kubectl label nodes <nodo-del-site-b> topology.kubernetes.io/zone=site-b --overwrite
```

---

## 3) Deploying the driver with Helm

### 3.1. Namespace and values

```bash
kubectl apply -f namespace.yaml   # namespace: seagate
```

**values.yaml** (summary of what is used):

* Driver image: `ghcr.io/seagate/seagate-exos-x-csi:v1.10.0`
* Sidecars: provisioner, attacher, resizer, snapshotter, registrar
* `controller.extraArgs: ["-v=2"]`
* `node.extraArgs: ["-v=2"]`

### 3.2. Installation

```bash
helm upgrade --install exos-x-csi \
  -n seagate --create-namespace \
  ./seagate-exos-x-csi \
  -f ./values.yaml
```

*(If RBAC leftovers from a previous install remain, delete them before retrying.)*

---

## 4) One Secret per array (A and B)

Create one `Secret` per site holding `apiAddress`, `username` and `password`, each Base64-encoded.

```bash
kubectl apply -f secret-me5-site-a.yaml
kubectl apply -f secret-me5-site-b.yaml
```

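
The Secret files themselves are not reproduced in this README. As a minimal sketch of what one could contain, the helper below prints a manifest with the three keys named above. The Secret name, the API address and the credentials used in the example call are placeholders, and `secret_manifest` and `b64` are hypothetical helpers. Encoding with `printf '%s'` rather than `echo` keeps a trailing newline out of the Base64 value.

```bash
#!/bin/sh
# Base64-encode a string without a trailing newline and without line wrapping.
b64() { printf '%s' "$1" | base64 | tr -d '\n'; }

# Print a Secret manifest for one array.
# Args: Secret name, API address, username, password.
secret_manifest() {
    name=$1 api=$2 user=$3 pass=$4
    cat <<EOF
apiVersion: v1
kind: Secret
metadata:
  name: $name
  namespace: seagate
type: Opaque
data:
  apiAddress: $(b64 "$api")
  username: $(b64 "$user")
  password: $(b64 "$pass")
EOF
}

# Example call with placeholder values (not the real ones):
#   secret_manifest me5-site-a http://192.0.2.10 <user> <password> | kubectl apply -f -
```
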

---

## 5) One StorageClass per zone

Define **two** `StorageClass` objects, each with:

* its Secret (A or B)
* `pool` and `volPrefix`
* `allowedTopologies` for its zone
* `volumeBindingMode: WaitForFirstConsumer`

```bash
kubectl apply -f sc-me5-site-a.yaml
kubectl apply -f sc-me5-site-b.yaml
```

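
To show how the four items in the list fit together, here is a minimal sketch that prints one such StorageClass. It is not the file from the repo: the provisioner name `csi-exos-x.seagate.com` is an assumption to confirm against the installed `CSIDriver` object, the `csi.storage.k8s.io/*-secret-*` keys are the generic CSI sidecar parameters, and any further driver-specific parameters should be taken from the chart's own examples. `sc_manifest` is a hypothetical helper.

```bash
#!/bin/sh
# Print a zone-pinned StorageClass.
# Args: StorageClass name, zone label value, Secret name, pool, volume prefix.
sc_manifest() {
    name=$1 zone=$2 secret=$3 pool=$4 prefix=$5
    cat <<EOF
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: $name
# Assumption: provisioner name; confirm with "kubectl get csidrivers".
provisioner: csi-exos-x.seagate.com
volumeBindingMode: WaitForFirstConsumer
parameters:
  csi.storage.k8s.io/provisioner-secret-name: $secret
  csi.storage.k8s.io/provisioner-secret-namespace: seagate
  csi.storage.k8s.io/controller-publish-secret-name: $secret
  csi.storage.k8s.io/controller-publish-secret-namespace: seagate
  csi.storage.k8s.io/controller-expand-secret-name: $secret
  csi.storage.k8s.io/controller-expand-secret-namespace: seagate
  pool: $pool
  volPrefix: $prefix
allowedTopologies:
  - matchLabelExpressions:
      - key: topology.kubernetes.io/zone
        values:
          - $zone
EOF
}

# Example call with placeholder pool and prefix:
#   sc_manifest me5-site-a site-a me5-site-a <pool> <volPrefix> | kubectl apply -f -
```
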

---

## 6) End-to-end test

PVC + Pod in site-a:

```bash
kubectl apply -f pvc-pod-a.yaml
kubectl apply -f pod-a.yaml
kubectl get pvc,pod
```

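
The two manifests applied above are not shown in this README. As a rough sketch of what such a pair could look like, the helper below prints a PVC and a Pod that mounts it, in a single stream instead of two files. The object name, the 1Gi size, the `busybox:1.36` image and the StorageClass name are all placeholders, and `pvc_pod_manifest` is a hypothetical helper. With `WaitForFirstConsumer`, the PVC is expected to stay `Pending` until the Pod is scheduled.

```bash
#!/bin/sh
# Print a PVC plus a Pod that mounts it at /data.
# Args: object name (used for both), StorageClass name.
pvc_pod_manifest() {
    name=$1 sc=$2
    cat <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: $name
spec:
  accessModes: ["ReadWriteOnce"]
  storageClassName: $sc
  resources:
    requests:
      storage: 1Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: $name
spec:
  containers:
    - name: shell
      image: busybox:1.36
      command: ["sleep", "3600"]
      volumeMounts:
        - name: data
          mountPath: /data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: $name
EOF
}

# Example call with placeholder names:
#   pvc_pod_manifest test-site-a me5-site-a | kubectl apply -f -
```
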
Check `iscsiadm`, `multipath`, the PVC events and the controller logs.

---

## 7) Measuring *NodePublish* times

```bash
kubectl -n seagate logs -l name=seagate-exos-x-csi-node-server \
  -c seagate-exos-x-csi-node --tail=10000 \
  | grep "NodePublishVolume" | grep "ROUTINE END"
```

---

## 8) Troubleshooting

* `missing API credentials` → check the CSI secret keys in the StorageClass.
* `DeadlineExceeded` → check multipath, the zone labels and the topology.
* Helm RBAC conflict → delete the leftover roles.

---

## 9) Cleanup

```bash
kubectl delete -f pod-a.yaml
kubectl delete -f pvc-pod-a.yaml
```

To uninstall completely:

```bash
helm uninstall exos-x-csi -n seagate
```

---

## 10) Repo summary (`seagate/`)

* `namespace.yaml`
* `secret-me5-site-a.yaml`, `secret-me5-site-b.yaml`
* `values.yaml`
* `sc-me5-site-a.yaml`, `sc-me5-site-b.yaml`
* `pvc-pod-a.yaml`, `pod-a.yaml`

---

## 11) Appendix: useful commands

* Restart multipath/kubelet
* iSCSI/multipath cleanup:

```bash
sudo iscsiadm -m node -u || true
sudo iscsiadm -m node -o delete || true
sudo multipath -F || true
sudo multipath -r
```