Problem Description
I'm having difficulty understanding which approach would be best for my situation and how to actually implement it.
In a nutshell, the problem is this:
- I'm spinning up a DB (Postgres), BE (Django), and FE (React) deployment with Skaffold
- The BE comes up before the DB about 50% of the time
- The first thing Django does is connect to the DB
- It only tries once (by design, and that can't be changed); if it can't connect, it fails and the app is broken
- Thus, I need to make sure that every single time I spin up my deployments, the DB deployment is running before the BE deployment starts
I came across readiness, liveness, and startup probes. I've read through them a couple of times, and readiness probes sound like what I need: I don't want the BE deployment to start until the DB deployment is ready to accept connections.
I guess I'm not understanding how to set them up. This is what I've tried, but I still run into instances where one is loaded before the other.
postgres.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: postgres-deployment
spec:
  replicas: 1
  selector:
    matchLabels:
      component: postgres
  template:
    metadata:
      labels:
        component: postgres
    spec:
      containers:
        - name: postgres
          image: testappcontainers.azurecr.io/postgres
          ports:
            - containerPort: 5432
          env:
            - name: POSTGRES_DB
              valueFrom:
                secretKeyRef:
                  name: testapp-secrets
                  key: PGDATABASE
            - name: POSTGRES_USER
              valueFrom:
                secretKeyRef:
                  name: testapp-secrets
                  key: PGUSER
            - name: POSTGRES_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: testapp-secrets
                  key: PGPASSWORD
            - name: POSTGRES_INITDB_ARGS
              value: "-A md5"
          volumeMounts:
            - name: postgres-storage
              mountPath: /var/lib/postgresql/data
              subPath: postgres
      volumes:
        - name: postgres-storage
          persistentVolumeClaim:
            claimName: postgres-storage
---
apiVersion: v1
kind: Service
metadata:
  name: postgres-cluster-ip-service
spec:
  type: ClusterIP
  selector:
    component: postgres
  ports:
    - port: 1423
      targetPort: 5432
api.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-deployment
spec:
  replicas: 3
  selector:
    matchLabels:
      component: api
  template:
    metadata:
      labels:
        component: api
    spec:
      containers:
        - name: api
          image: testappcontainers.azurecr.io/testapp-api
          ports:
            - containerPort: 5000
          env:
            - name: PGUSER
              valueFrom:
                secretKeyRef:
                  name: testapp-secrets
                  key: PGUSER
            - name: PGHOST
              value: postgres-cluster-ip-service
            - name: PGPORT
              value: "1423"
            - name: PGDATABASE
              valueFrom:
                secretKeyRef:
                  name: testapp-secrets
                  key: PGDATABASE
            - name: PGPASSWORD
              valueFrom:
                secretKeyRef:
                  name: testapp-secrets
                  key: PGPASSWORD
            - name: SECRET_KEY
              valueFrom:
                secretKeyRef:
                  name: testapp-secrets
                  key: SECRET_KEY
            - name: DEBUG
              valueFrom:
                secretKeyRef:
                  name: testapp-secrets
                  key: DEBUG
          readinessProbe:
            httpGet:
              host: postgres-cluster-ip-service
              port: 1423
            initialDelaySeconds: 10
            periodSeconds: 5
            timeoutSeconds: 2
---
apiVersion: v1
kind: Service
metadata:
  name: api-cluster-ip-service
spec:
  type: ClusterIP
  selector:
    component: api
  ports:
    - port: 5000
      targetPort: 5000
client.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: client-deployment
spec:
  replicas: 3
  selector:
    matchLabels:
      component: client
  template:
    metadata:
      labels:
        component: client
    spec:
      containers:
        - name: client
          image: testappcontainers.azurecr.io/testapp-client
          ports:
            - containerPort: 3000
          readinessProbe:
            httpGet:
              path: api-cluster-ip-service
              port: 5000
            initialDelaySeconds: 10
            periodSeconds: 5
            timeoutSeconds: 2
---
apiVersion: v1
kind: Service
metadata:
  name: client-cluster-ip-service
spec:
  type: ClusterIP
  selector:
    component: client
  ports:
    - port: 3000
      targetPort: 3000
I don't think the ingress.yaml and skaffold.yaml will be helpful, but let me know if I should add those.
So what am I doing wrong?
So I've tried out a few things based on David Maze's response. This helped me better understand what is going on, but I'm still running into issues I don't quite understand how to resolve.
The first problem is that even with the default restartPolicy: Always, the Pods themselves don't fail when Django does. The Pods think they are perfectly healthy even though Django has failed.
The second problem is that the Pods apparently need to be made aware of Django's status. That's the part I can't quite wrap my brain around: should the probes be checking the status of other deployments, or of the Pods themselves?
Yesterday my thinking was the former, but today I'm thinking it's the latter: the Pod needs to know that the program it contains has failed. However, everything I've tried just results in a failed probe, connection refused, etc.:
# referring to itself
host: /health
port: 5000

host: /healthz
port: 5000

host: /api
port: 5000

host: /
port: 5000

host: /api-cluster-ip-service
port: 5000

host: /api-deployment
port: 5000

# referring to the DB deployment
host: /health
port: 1423 # or 5432

host: /healthz
port: 1423 # or 5432

host: /api
port: 1423 # or 5432

host: /
port: 1423 # or 5432

host: /postgres-cluster-ip-service
port: 1423 # or 5432

host: /postgres-deployment
port: 1423 # or 5432
So apparently I'm setting up the probes wrong, despite it being a "super easy" implementation (as a few blogs have described it). For example, the /health and /healthz routes: are these built into Kubernetes, or do they need to be set up? Rereading the docs to hopefully clarify this.
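From what I can tell so far, part of my mistake is in the fields themselves: host is an optional hostname that defaults to the Pod's own IP (so above I was putting URL paths where a hostname belongs), and path is where the route goes. If that's right, a correctly shaped httpGet probe would look something like the sketch below, where the /health route is hypothetical and would have to actually be served by my Django app:

readinessProbe:
  httpGet:
    path: /health # hypothetical route; Django has to serve it, Kubernetes doesn't provide it
    port: 5000    # the container's own port; host is omitted, so it defaults to the Pod IP
  initialDelaySeconds: 10
  periodSeconds: 5
  timeoutSeconds: 2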
Recommended Answer
Actually, I think I might have sorted it out.
Part of the problem is that even though restartPolicy: Always is the default, the Pods are not aware that Django has failed, so they think they are healthy.
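As I currently understand it, restartPolicy sits at the Pod spec level and only takes effect once the kubelet actually sees a failure, i.e. the container process exits or a liveness probe fails. A sketch of where it lives (writing out Always is redundant, since it's the default):

spec:
  restartPolicy: Always # the default; only triggers once a container exits or a liveness probe fails
  containers:
    - name: api
      image: testappcontainers.azurecr.io/testapp-api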
My thinking was wrong in that I originally assumed I needed to refer to the DB deployment to see if it had started before starting the API deployment. Instead, I needed to check whether Django had failed, and redeploy it if it had.
Doing the following accomplished this for me:
livenessProbe:
  tcpSocket:
    port: 5000
  initialDelaySeconds: 2
  periodSeconds: 2
readinessProbe:
  tcpSocket:
    port: 5000
  initialDelaySeconds: 2
  periodSeconds: 2
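For context, these go alongside image and ports under the api container in api.yaml, roughly like this (trimmed to the relevant fields):

containers:
  - name: api
    image: testappcontainers.azurecr.io/testapp-api
    ports:
      - containerPort: 5000
    livenessProbe:
      tcpSocket:
        port: 5000 # fails if Django has crashed, so the Pod gets restarted
      initialDelaySeconds: 2
      periodSeconds: 2
    readinessProbe:
      tcpSocket:
        port: 5000 # keeps the Pod out of the Service until Django is listening
      initialDelaySeconds: 2
      periodSeconds: 2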
I'm still learning Kubernetes, so please correct me if there is a better way to do this, or if this is just plain wrong. I just know it accomplishes what I want.
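For completeness, another pattern I've seen suggested for the original ordering problem is an initContainer that blocks the API Pod until Postgres accepts connections. A rough sketch, untested on my end; it assumes an image that ships pg_isready and reuses my existing Service name and port:

initContainers:
  - name: wait-for-db
    image: postgres:13 # assumed tag; any image with pg_isready works
    # loop until Postgres accepts connections through the Service
    command: ["sh", "-c", "until pg_isready -h postgres-cluster-ip-service -p 1423; do sleep 2; done"]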