ESO Core needs to establish a secure and authenticated communication channel with out-of-process providers. The channel has to be encrypted in transit because we transmit secrets over the network. It must also be authenticated; otherwise a malicious actor could call GetSecret() to retrieve secrets and take advantage of the out-of-process provider's service account to read secrets or other resources within the cluster.
We have to take the following into consideration:
Out-of-process providers run in separate pods and communicate with ESO Core via gRPC. We need a certificate management system that:
We do want to integrate with cert-manager eventually, but this is out of scope for the proposal.
Decision: Push Model
ESO Core generates and distributes certificates to provider namespaces.
User labels Service → ESO discovers Service → ESO generates certificates →
ESO creates Secret in provider namespace → Provider mounts Secret →
Provider starts with mTLS → ESO connects with mTLS
How it works:
Pros:
Cons:
Why chosen: Simplicity for provider developers, no chicken-and-egg authentication problem, leverages Kubernetes primitives.
How it works:
Pros:
Cons:
Why rejected: Circular dependency problem, excessive complexity, tight coupling between ESO and providers.
Decision: Label-Based Discovery
Services are labeled with external-secrets.io/provider: "true".
Configuration:
apiVersion: v1
kind: Service
metadata:
  name: provider-aws
  namespace: external-secrets-system
  labels:
    external-secrets.io/provider: "true"
spec:
  ports:
  - port: 8080
Pros:
Cons:
Why chosen: Explicit contract over implicit behavior, Kubernetes-native, clear intent.
How it works:
Extract namespace from Provider.spec.address (e.g., provider-aws.external-secrets-system.svc:8080).
Pros:
Cons:
Why rejected: Too implicit, preference for explicit contracts.
How it works: Configure providers via controller flags or ConfigMap.
Pros:
Cons:
Why rejected: Not dynamic, requires manual maintenance.
Decision: In-Cluster Only
Automatic certificate management only supports providers running inside the Kubernetes cluster.
We will support out-of-cluster providers, but we don't manage mTLS credentials for them.
Rationale:
Out of scope:
Decision: Required
Providers must reload certificates without restarting when secrets are updated.
Implementation requirement:
// Watch certificate files for changes
// Use tls.Config.GetCertificate callback for dynamic loading
// Reload certificates in-memory when files change
// Use fsnotify or similar to detect file changes
Rationale:
Metrics requirement:
provider_certificate_hot_reload_total
provider_certificate_hot_reload_failures_total
In ESO Namespace:
apiVersion: v1
kind: Secret
metadata:
  name: eso-provider-ca-internal
  namespace: external-secrets-system
data:
  ca.crt: <CA certificate>
  ca.key: <CA private key> # ONLY in ESO namespace
In Provider Namespaces:
apiVersion: v1
kind: Secret
metadata:
  name: external-secrets-provider-tls
  namespace: <provider-namespace>
data:
  ca.crt: <CA certificate>
  tls.crt: <Server certificate with DNS SANs>
  tls.key: <Server private key>
  # note: no client certs/keys!
Security:
| Certificate Type | Validity | Rotation Lookahead |
|---|---|---|
| CA Certificate | 1 year | 60 days |
| Server Certificate | 90 days | 35 days |
| Client Certificate | 90 days | 35 days |
Reconciliation Interval: 10 minutes
E.g. for service provider-aws in namespace provider-system:
DNS SANs in server certificate:
- provider-aws
- provider-aws.provider-system
- provider-aws.provider-system.svc
- provider-aws.provider-system.svc.cluster.local
Covers all Kubernetes DNS resolution patterns. The cluster.local must be configurable, as some clusters have custom cluster domains.
Note: We do NOT want to support custom SANs at this point. Certificates with custom SANs are out of scope, and users should use other tooling for that.
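The SAN list above can be derived from the Service name, its namespace, and the cluster domain. A small sketch, where the function name `serverDNSNames` and the `clusterDomain` parameter are illustrative assumptions:

```go
package main

import "fmt"

// serverDNSNames returns the DNS SANs for a provider Service, covering the
// short, namespaced, .svc, and fully qualified forms.
func serverDNSNames(service, namespace, clusterDomain string) []string {
	return []string{
		service,
		service + "." + namespace,
		service + "." + namespace + ".svc",
		service + "." + namespace + ".svc." + clusterDomain,
	}
}

func main() {
	for _, name := range serverDNSNames("provider-aws", "provider-system", "cluster.local") {
		fmt.Println(name)
	}
}
```

Passing the cluster domain as a parameter keeps the last entry correct on clusters that do not use cluster.local.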
ESO can trust any provider. ESO can trust anyone who is able to create a service object with the appropriate labels.
However: We must not distribute client certificates anywhere, as this would allow anyone with access to a client certificate and key to fetch secrets from a provider.
ESO controller requires:
# Watch services cluster-wide
- apiGroups: [""]
  resources: ["services"]
  verbs: ["get", "list", "watch"]
# Manage secrets in any namespace
- apiGroups: [""]
  resources: ["secrets"]
  verbs: ["create", "update", "patch", "get", "list", "watch"]
Security consideration: Cross-namespace secret write is privileged. ESO only writes to the fixed secret name external-secrets-provider-tls and only in namespaces with labeled services.
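Because RBAC alone cannot express this restriction, the controller would have to enforce it in code. The following sketch shows one possible guard. The `service` struct is a hypothetical stand-in for the relevant fields of a Kubernetes Service, and `providerNamespaces` and `mayWriteSecret` are illustrative names.

```go
package main

import "fmt"

const (
	providerLabel = "external-secrets.io/provider"
	tlsSecretName = "external-secrets-provider-tls"
)

// service is a hypothetical stand-in for the fields of a Kubernetes Service
// that the check needs.
type service struct {
	Namespace string
	Labels    map[string]string
}

// providerNamespaces returns the namespaces that contain at least one Service
// labeled external-secrets.io/provider: "true".
func providerNamespaces(services []service) map[string]bool {
	out := make(map[string]bool)
	for _, s := range services {
		if s.Labels[providerLabel] == "true" {
			out[s.Namespace] = true
		}
	}
	return out
}

// mayWriteSecret allows a write only for the fixed secret name in a namespace
// that has a labeled Service.
func mayWriteSecret(namespace, name string, allowed map[string]bool) bool {
	return name == tlsSecretName && allowed[namespace]
}

func main() {
	allowed := providerNamespaces([]service{
		{Namespace: "provider-system", Labels: map[string]string{providerLabel: "true"}},
		{Namespace: "default", Labels: map[string]string{"app": "web"}},
	})
	fmt.Println(mayWriteSecret("provider-system", tlsSecretName, allowed)) // true
	fmt.Println(mayWriteSecret("default", tlsSecretName, allowed))         // false
	fmt.Println(mayWriteSecret("provider-system", "other", allowed))       // false
}
```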
Deploy provider with labeled service:
labels:
  external-secrets.io/provider: "true"
Create Provider resource:
apiVersion: external-secrets.io/v1
kind: Provider
metadata:
  name: my-aws
spec:
  address: provider-aws.provider-system.svc:8080
That's it. Certificates are automatic.
Label your service with external-secrets.io/provider: "true"
Mount secret in pod:
volumeMounts:
- name: certs
  mountPath: /etc/provider/certs
volumes:
- name: certs
  secret:
    secretName: external-secrets-provider-tls
Configure gRPC server to use certs from /etc/provider/certs/
Implement hot certificate reload (required)
Expose Prometheus metrics
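For the gRPC server step, a provider would typically build a `tls.Config` and hand it to its gRPC server. The sketch below covers only the standard-library part and assumes that the provider verifies ESO's client certificate against the mounted ca.crt, which this proposal implies but does not spell out. `serverTLSConfig` is an illustrative name; the `getCert` callback would be whatever hot-reload mechanism the provider uses.

```go
package main

import (
	"crypto/tls"
	"crypto/x509"
	"errors"
	"fmt"
)

// serverTLSConfig builds a TLS configuration that serves the certificate
// returned by getCert and requires a client certificate signed by the CA.
func serverTLSConfig(caPEM []byte, getCert func(*tls.ClientHelloInfo) (*tls.Certificate, error)) (*tls.Config, error) {
	pool := x509.NewCertPool()
	if !pool.AppendCertsFromPEM(caPEM) {
		return nil, errors.New("ca.crt contains no usable PEM certificate")
	}
	return &tls.Config{
		GetCertificate: getCert,
		ClientCAs:      pool,
		ClientAuth:     tls.RequireAndVerifyClientCert, // this is what makes it mutual TLS
		MinVersion:     tls.VersionTLS12,
	}, nil
}

func main() {
	// In a provider, caPEM would be read from /etc/provider/certs/ca.crt.
	_, err := serverTLSConfig([]byte("not a certificate"), nil)
	fmt.Println("rejects invalid CA bundle:", err != nil)
}
```

With grpc-go, the resulting configuration can be wrapped with `credentials.NewTLS` and passed to the server through `grpc.Creds`.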
# Certificate expiration (seconds until expiry)
eso_provider_certificate_expiry_seconds{namespace, service}
# CA certificate expiration
eso_ca_certificate_expiry_seconds
# Certificate rotations
eso_provider_certificate_rotation_total{namespace, service, reason}
# Rotation failures
eso_provider_certificate_rotation_failures_total{namespace, service}
# Hot reload events (in provider pods)
eso_provider_certificate_hot_reload_total{namespace, service}
# Hot reload failures (in provider pods)
eso_provider_certificate_hot_reload_failures_total{namespace, service}
Options:
Impact: Affects connection stability during rotation.
Should ESO briefly accept both old and new certificates during rotation?
Pros:
Cons: