This page applies to Apigee and Apigee hybrid.
Symptoms
API proxy deployments fail with the following error messages.
Error Messages
If the TLS certificate of the apigee-webhook-service.apigee-system.svc service has expired or is not yet valid, the following error message appears in the apigee-watcher logs:
{"level":"error","ts":1687991930.7745812,"caller":"watcher/watcher.go:60", "msg":"error during watch","name":"ingress","error":"INTERNAL: INTERNAL: failed to update ApigeeRoute [org-env]-group-84a6bb5, namespace apigee: Internal error occurred: failed calling webhook \"mapigeeroute.apigee.cloud.google.com\": Post \"https://apigee-webhook-service.apigee-system.svc:443/mutate-apigee-cloud-google-com-v1alpha1-apigeeroute?timeout=30s\": x509: certificate has expired or is not yet valid: current time 2023-06-28T22:38:50Z is after 2023-06-17T17:14:13Z, INTERNAL: failed to update ApigeeRoute [org-env]-group-e7b3ff6, namespace apigee
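When many such entries are logged, it can help to pull out only the certificate expiry timestamp they report. The following is a minimal sketch, not an official Apigee tool: the function name `extract_cert_expiry` is hypothetical, and APIGEE_NAMESPACE and WATCHER_POD are placeholders for your Apigee namespace and the name of an apigee-watcher pod. It handles only the "is after" (expired) form of the message.

```shell
#!/bin/sh
# Hypothetical helper: reads log lines on stdin and prints the expiry
# timestamp reported by each x509 "certificate has expired" error.
# Lines without that error, or without an "is after" timestamp, are ignored.
extract_cert_expiry() {
  sed -n '/x509: certificate has expired or is not yet valid/s/.*is after \([0-9TZ:-]*\).*/\1/p'
}

# Usage (APIGEE_NAMESPACE and WATCHER_POD are placeholders):
#   kubectl -n APIGEE_NAMESPACE logs WATCHER_POD | extract_cert_expiry | sort -u
```

Each distinct timestamp printed is the expiry time of a certificate that the API server rejected when calling the webhook.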
Possible Causes
| Cause | Description |
|---|---|
| The apigee-serving-cert is not found | If the apigee-serving-cert is not found in the apigee-system namespace, this issue could occur. |
| Duplicate certificate requests were created for renewing apigee-serving-cert | If there are duplicate certificate requests created for renewing the apigee-serving-cert certificate, the apigee-serving-cert certificate may not get renewed. |
| cert-manager is not healthy | If cert-manager is not healthy, the apigee-serving-cert certificate may not get renewed. |
Cause: The apigee-serving-cert is not found
Diagnosis
- Check the availability of the apigee-serving-cert certificate in the apigee-system namespace:

      kubectl -n apigee-system get certificates apigee-serving-cert

  If this certificate is available, output similar to the following is shown:

      NAME                  READY   SECRET                AGE
      apigee-serving-cert   True    webhook-server-cert   2d10h

- If the apigee-serving-cert certificate is not found in the apigee-system namespace, that could be the reason for this issue.
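The check above can also be scripted. The following is a minimal sketch under stated assumptions: the function name `certificate_is_ready` is hypothetical, and it assumes the default table output of `kubectl get certificates`, with the certificate name in the first column and READY in the second.

```shell
#!/bin/sh
# Hypothetical helper: reads `kubectl get certificates` table output on stdin
# and succeeds only if the certificate named in $1 is listed with READY=True.
certificate_is_ready() {
  awk -v name="$1" '
    $1 == name { found = 1; ready = ($2 == "True") }
    END { exit !(found && ready) }
  '
}

# Usage:
#   kubectl -n apigee-system get certificates 2>/dev/null |
#     certificate_is_ready apigee-serving-cert ||
#     echo "apigee-serving-cert is missing or not ready"
```

The function fails both when the certificate is absent and when it exists but is not ready, so a failure only tells you that further diagnosis is needed.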
Resolution
- Update the apigee-serving-cert certificate using Helm:

      helm upgrade ENV_NAME apigee-env/ \
        --namespace APIGEE_NAMESPACE \
        --set env=ENV_NAME \
        --atomic \
        -f OVERRIDES_FILE

  Make sure to include all of the settings shown, including --atomic so that the action rolls back on failure.

- Verify that the apigee-serving-cert certificate has been created:

      kubectl -n apigee-system get certificates apigee-serving-cert
Cause: Duplicate certificate requests were created for renewing apigee-serving-cert
Diagnosis
- Check the cert-manager controller logs for an error message similar to the examples shown below.

  List all cert-manager pods:

      kubectl -n cert-manager get pods

  An example output:

      NAME                                       READY   STATUS    RESTARTS   AGE
      cert-manager-66d9545484-772cr              1/1     Running   0          6d19h
      cert-manager-cainjector-7d8b6bd6fb-fpz6r   1/1     Running   0          6d19h
      cert-manager-webhook-669b96dcfd-6mnm2      1/1     Running   0          6d19h

  Check the cert-manager controller logs:

      kubectl -n cert-manager logs cert-manager-66d9545484-772cr | grep "issuance is skipped until there are no more duplicates"

  Example outputs:

      1 controller.go:163] cert-manager/certificates-readiness "msg"="re-queuing item due to error processing" "error"="multiple CertificateRequests were found for the 'next' revision 3, issuance is skipped until there are no more duplicates" "key"="apigee-system/apigee-serving-cert"
      1 controller.go:167] cert-manager/certificates-readiness "msg"="re-queuing item due to error processing" "error"="multiple CertificateRequests were found for the 'next' revision 683, issuance is skipped until there are no more duplicates" "key"="apigee/apigee-istiod"

  If you see either of the messages shown above, the affected certificate (apigee-serving-cert or apigee-istiod-cert) will not be renewed.

- List all certificate requests in the apigee-system namespace or the apigee namespace, depending on the namespace printed in the log entries above, and check whether multiple certificate requests were created for renewing the same apigee-serving-cert or apigee-istiod-cert certificate revision:

      kubectl -n apigee-system get certificaterequests

See the cert-manager issue relevant to this problem: "cert-manager created multiple CertificateRequest objects with the same certificate-revision".
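Spotting duplicates by eye is error-prone when many certificate requests exist. The following is a minimal sketch of one way to group them by revision. The function name `find_duplicate_revisions` is hypothetical, and the usage comment assumes that each CertificateRequest carries the `cert-manager.io/certificate-name` and `cert-manager.io/certificate-revision` annotations; verify that on your cluster before relying on it.

```shell
#!/bin/sh
# Hypothetical helper: reads lines of "REQUEST_NAME CERTIFICATE_NAME REVISION"
# on stdin and reports every certificate revision that has more than one
# CertificateRequest.
find_duplicate_revisions() {
  awk '
    NF >= 3 { count[$2 " revision " $3]++ }
    END {
      for (key in count)
        if (count[key] > 1)
          print key ": " count[key] " certificate requests"
    }
  ' | sort
}

# Usage (assumes the annotations named above are present on each request):
#   kubectl -n apigee-system get certificaterequests --no-headers \
#     -o custom-columns='NAME:.metadata.name,CERT:.metadata.annotations.cert-manager\.io/certificate-name,REV:.metadata.annotations.cert-manager\.io/certificate-revision' |
#     find_duplicate_revisions
```

No output means that no revision has more than one request in the namespace that was queried.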
Resolution
- Delete all certificate requests in the apigee-system namespace:

      kubectl -n apigee-system delete certificaterequests --all

- Verify that the duplicate certificate requests have been deleted and that only one certificate request is available for the apigee-serving-cert certificate in the apigee-system namespace:

      kubectl -n apigee-system get certificaterequests

- Verify that the apigee-serving-cert certificate has been renewed:

      kubectl -n apigee-system get certificates apigee-serving-cert -o yaml

  An example output:

      apiVersion: cert-manager.io/v1
      kind: Certificate
      metadata:
        creationTimestamp: "2023-06-26T13:25:10Z"
        generation: 1
        name: apigee-serving-cert
        namespace: apigee-system
        resourceVersion: "11053"
        uid: e7718341-b3ca-4c93-a6d4-30cf70a33e2b
      spec:
        dnsNames:
        - apigee-webhook-service.apigee-system.svc
        - apigee-webhook-service.apigee-system.svc.cluster.local
        issuerRef:
          kind: Issuer
          name: apigee-selfsigned-issuer
        secretName: webhook-server-cert
      status:
        conditions:
        - lastTransitionTime: "2023-06-26T13:25:11Z"
          message: Certificate is up to date and has not expired
          observedGeneration: 1
          reason: Ready
          status: "True"
          type: Ready
        notAfter: "2023-09-24T13:25:11Z"
        notBefore: "2023-06-26T13:25:11Z"
        renewalTime: "2023-08-25T13:25:11Z"
        revision: 1
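To confirm the renewal without reading the full YAML, the notAfter timestamp can be compared with the current time. The following is a minimal sketch: the function name `is_later_than` is hypothetical, and it assumes both timestamps use the same UTC layout as the example output (YYYY-MM-DDTHH:MM:SSZ), which lets them be compared as plain strings.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the UTC timestamp in $1 (for example a
# certificate's status.notAfter) is later than the UTC timestamp in $2.
# Both must use the "YYYY-MM-DDTHH:MM:SSZ" layout, which sorts as a string.
is_later_than() {
  # Appending "" forces awk to compare the values as strings.
  awk -v a="$1" -v b="$2" 'BEGIN { exit !((a "") > (b "")) }'
}

# Usage:
#   not_after=$(kubectl -n apigee-system get certificates apigee-serving-cert \
#     -o jsonpath='{.status.notAfter}')
#   is_later_than "$not_after" "$(date -u +%Y-%m-%dT%H:%M:%SZ)" &&
#     echo "certificate is valid until $not_after"
```

An empty notAfter value, as when the certificate has never been issued, makes the check fail.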
Cause: cert-manager is not healthy
Diagnosis
- Check the health of the cert-manager pods in the cert-manager namespace:

      kubectl -n cert-manager get pods

  If cert-manager is healthy, all cert-manager pods are ready (1/1) and in the Running state, as in the following example. If any pod is not, that could be the reason for this issue.

      NAME                                       READY   STATUS    RESTARTS   AGE
      cert-manager-59cf78f685-mlkvx              1/1     Running   0          15d
      cert-manager-cainjector-78cc865768-krjcp   1/1     Running   0          15d
      cert-manager-webhook-77c4fb46b6-7g9g6      1/1     Running   0          15d

- cert-manager can fail for many reasons. Check the cert-manager logs to identify the reason for the failure and resolve it accordingly.

  One known reason is that cert-manager fails if it cannot communicate with the Kubernetes API. In this case, an error message similar to the following is displayed:

      E0601 00:10:27.841516 1 leaderelection.go:330] error retrieving resource lock kube-system/cert-manager-controller: Get "https://192.168.0.1:443/api/v1/namespaces/kube-system/configmaps/cert-manager-controller": dial tcp 192.168.0.1:443: i/o timeout
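The pod health check can be sketched as a small filter over the `kubectl get pods` table. This is an illustration under stated assumptions, not part of cert-manager or Apigee: the function name `list_unhealthy_pods` is hypothetical, and it assumes the default table output with a header row and with READY and STATUS in the second and third columns.

```shell
#!/bin/sh
# Hypothetical helper: reads `kubectl get pods` table output (with its header
# row) on stdin, prints every pod that is not fully ready or not Running, and
# fails if any such pod is found or if no pods are listed at all.
list_unhealthy_pods() {
  awk '
    NR == 1 { next }                      # skip the header row
    NF >= 3 {
      pods++
      split($2, ready, "/")               # READY column, for example "1/1"
      if (ready[1] != ready[2] || $3 != "Running") {
        print $1 " (" $2 ", " $3 ")"
        unhealthy = 1
      }
    }
    END { exit (unhealthy || pods == 0) }
  '
}

# Usage:
#   kubectl -n cert-manager get pods | list_unhealthy_pods ||
#     echo "cert-manager is not healthy"
```

Treating an empty pod list as a failure is a design choice in this sketch: a cert-manager namespace with no pods cannot renew certificates either.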
Resolution
- Check the health of the Kubernetes cluster and fix any issues found. See Troubleshooting Clusters.
- Refer to Troubleshooting for additional cert-manager troubleshooting information.
Must gather diagnostic information
If the problem persists even after following the above instructions, gather the following diagnostic information, and then contact Google Cloud Customer Care.
- Google Cloud Project ID
- Apigee hybrid organization
- Apigee hybrid overrides.yaml file, masking any sensitive information.
- Kubernetes pod status in all namespaces:

      kubectl get pods -A > kubectl-pod-status`date +%Y.%m.%d_%H.%M.%S`.txt

- Kubernetes cluster-info dump:

      # generate kubernetes cluster-info dump
      kubectl cluster-info dump -A --output-directory=/tmp/kubectl-cluster-info-dump

      # zip kubernetes cluster-info dump
      zip -r kubectl-cluster-info-dump`date +%Y.%m.%d_%H.%M.%S`.zip /tmp/kubectl-cluster-info-dump/*