Requirement:
OKE has powerful block volume management built in: it can find, detach, and reattach block storage volumes across different worker nodes seamlessly. Here is what we are going to test.
We create an Oracle DB StatefulSet on OKE, simulate a hardware or OS issue on the worker node, and test HA failover to another worker node with a single command (kubectl drain).
The following happens automatically when the node is drained:
- OKE shuts down the DB pod
- OKE detaches the PV from the worker node
- OKE finds a new worker node in the same AD
- OKE attaches the PV to the new worker node
- OKE starts the DB pod on the new worker node
The DB in the StatefulSet is not RAC, but with the power of OKE we can fail over a DB to a new VM in a few minutes.
Solution:
- Create a service for the DB StatefulSet
$ cat testsvc.yaml
apiVersion: v1
kind: Service
metadata:
  labels:
    name: oradbauto-db-service
  name: oradbauto-db-svc
spec:
  ports:
  - port: 1521
    protocol: TCP
    targetPort: 1521
  selector:
    name: oradbauto-db-service
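With the manifest saved as testsvc.yaml, the service can be created and verified with standard kubectl commands (assuming the cluster context is already set):

```shell
# Create the service that fronts the DB StatefulSet
kubectl apply -f testsvc.yaml

# Confirm the service exists and exposes port 1521
kubectl get svc oradbauto-db-svc
```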
- Create a DB StatefulSet, then wait about 15 minutes for the DB to come fully up
$ cat testdb.yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: oradbauto
  labels:
    app: apexords-operator
    name: oradbauto
spec:
  selector:
    matchLabels:
      name: oradbauto-db-service
  serviceName: oradbauto-db-svc
  replicas: 1
  template:
    metadata:
      labels:
        name: oradbauto-db-service
    spec:
      securityContext:
        runAsUser: 54321
        fsGroup: 54321
      containers:
      - image: iad.ocir.io/espsnonprodint/autostg/database:19.2
        name: oradbauto
        ports:
        - containerPort: 1521
          name: oradbauto
        volumeMounts:
        - mountPath: /opt/oracle/oradata
          name: oradbauto-db-pv-storage
        env:
        - name: ORACLE_SID
          value: "autocdb"
        - name: ORACLE_PDB
          value: "autopdb"
        - name: ORACLE_PWD
          value: "whateverpass"
  volumeClaimTemplates:
  - metadata:
      name: oradbauto-db-pv-storage
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 50Gi
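For reference, the volumeClaimTemplates entry above makes the StatefulSet controller create one PVC per replica, named `<template name>-<pod name>`. A sketch of what gets generated for replica 0 (illustrative only, not applied by hand; the storageClassName is an assumption based on OKE's default "oci" block volume storage class):

```yaml
# Roughly the PVC OKE generates for replica 0 of the StatefulSet
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: oradbauto-db-pv-storage-oradbauto-0   # <template name>-<pod name>
spec:
  accessModes: [ "ReadWriteOnce" ]
  storageClassName: oci    # assumption: OKE's default block volume storage class
  resources:
    requests:
      storage: 50Gi
```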
- Imagine we have a hardware issue on this node and need to fail over to a new node
- Before failover: check the status of the PV and pod; the pod is running on node 1.1.1.1
- Check if any other pods running on the node will be affected
- Make sure we have a node ready in the same AD as the StatefulSet pod
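One way to check for a ready node in the same AD is to list nodes with their zone label. The label key below is an assumption for older OKE clusters (newer Kubernetes versions use topology.kubernetes.io/zone instead):

```shell
# Show each node's availability-domain label alongside its status
kubectl get nodes -L failure-domain.beta.kubernetes.io/zone
```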
- kubectl get pv,pvc
- kubectl get po -owide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
oradbauto-0   1/1   Running   0   20m   10.244.3.40   1.1.1.1   <none>   <none>
- One command to fail over the DB to a new worker node
- kubectl drain <node name> --ignore-daemonsets --delete-local-data
- kubectl drain 1.1.1.1 --ignore-daemonsets --delete-local-data
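Note that drain leaves the node cordoned (unschedulable). Once the hardware issue is fixed, the node can be returned to service with the standard kubectl command:

```shell
# Allow pods to be scheduled on the repaired node again
kubectl uncordon 1.1.1.1
```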
- No need to update the mid-tier (MT) connection string, as the DB service name is unchanged and the move is transparent to clients of the new DB pod
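For example, an in-cluster client can keep using a connect string built from the service name. The PDB service name below comes from the ORACLE_PDB value in the manifest; the fully qualified host form assumes the default namespace:

```text
jdbc:oracle:thin:@//oradbauto-db-svc.default.svc.cluster.local:1521/autopdb
```

Because the string references the Kubernetes service rather than a node or pod IP, it resolves to the new pod after failover with no client-side change.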
- After failover: check the status of the PV and pod; the pod is now running on the new node
- kubectl get pv,pvc
- kubectl get pod -owide
- The movement of the PV and PVC works with volumeClaimTemplates as well as with PVs/PVCs created via YAML files with storage class "oci"