Clusters
Every env in Platz has a set of clusters attached to it. The cluster set is what an env user can deploy into — when you click "New Deployment" the cluster dropdown is filtered to the clusters attached to the current env.
This page is the env-level perspective on clusters: what's already attached, what users can do with them, and how this view relates to the site-level cluster admin page.
How clusters get attached
Cluster discovery happens at the site level (see Admin → Clusters). Once a cluster shows up in Platz, a site admin attaches it to an env. After attachment, env-level users see the cluster in their env settings and in the cluster picker on deployment forms.
Env-level admins cannot attach or detach clusters themselves. That's a site-admin operation. If you need a cluster attached to your env and don't have site admin rights, ask whoever installed Platz to do it.
The env's cluster list
/envs/<env>/settings/clusters shows the clusters attached to the current env. For each cluster:
- Name: derived from the EKS cluster name or the kubeconfig context.
- Region: for EKS clusters, the AWS region. For local clusters, local.
- Status: green if is_ok: true and ignore: false, otherwise an error badge with the not_ok_reason.
- Ingress: a quick view of the ingress domain, class, and TLS secret name. Editable only by site admins.
- Number of deployments: a count of deployments currently on this cluster from this env.
Env users (non-admin) see this page in read-only mode. Env admins see the same view; the only editable parts (ingress, Grafana) are still site-admin-only because those affect every deployment on the cluster, not just this env's deployments.
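The Status rule in the list above can be sketched in a few lines of Python. This is an illustration only, not Platz's actual UI code: the function name status_badge and the fallback text used when no reason is recorded are assumptions; only the field names is_ok, ignore, and not_ok_reason come from this page.

```python
from typing import Optional


def status_badge(is_ok: bool, ignore: bool, not_ok_reason: Optional[str] = None) -> str:
    """Return the badge text for one cluster row (hypothetical helper)."""
    # Green only when the cluster is healthy AND not ignored.
    if is_ok and not ignore:
        return "green"
    # Every other combination is an error badge carrying the reason.
    # The fallback text is an assumption for the case where no reason is set.
    return f"error: {not_ok_reason or 'no reason reported'}"


print(status_badge(True, False))                           # green
print(status_badge(False, False, "connection timed out"))  # error: connection timed out
```

Note that, under this reading of the rule, a cluster that is healthy but ignored still renders as an error badge.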
What env users can change
Effectively nothing on this page. The env-level cluster view is informational — letting users see "yes, the cluster is attached, yes, it's healthy" without giving them site-admin powers.
If a cluster is unhealthy and your deployment is broken because of it, the right escalation path is:
- Note the not_ok_reason shown in the UI.
- Open /admin/clusters/<id> if you're a site admin; otherwise ping your operator with the cluster name and reason.
Cluster status and what it means
A cluster's is_ok flag is set by the agent's periodic health check (a kube API ping). Common failure modes:
- Connection refused / timed out: the cluster's API endpoint is unreachable from the Platz agent's network. Could be a VPC peering issue, an EKS endpoint that's now private-only, or the cluster being torn down.
- Unauthorized: the agent's credentials no longer have access. For EKS, this usually means the role's aws-auth mapping was changed.
- Stale last_seen_at: the agent isn't running, or hasn't reached this cluster in the last few intervals. Check the platz-platzio-k8s-agent-<name> pod's logs.
A failing cluster doesn't take down running deployments — the kubelets in the cluster keep doing their thing. It does prevent Platz from running new deployment tasks (helm install, upgrade, restart). Existing tasks already in flight may finish or stall depending on when the failure started.
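To make the "stale last_seen_at" case concrete, here is a minimal sketch of a staleness check. The 60-second check interval and the threshold of three missed intervals are assumptions chosen for illustration; this page only says "the last few intervals", and the function name is_stale is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def is_stale(
    last_seen_at: datetime,
    now: datetime,
    check_interval: timedelta = timedelta(seconds=60),  # assumed interval
    missed_intervals: int = 3,                          # assumed threshold
) -> bool:
    """True when the agent has not reached the cluster for more than
    `missed_intervals` consecutive check intervals."""
    return now - last_seen_at > check_interval * missed_intervals


now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
print(is_stale(now - timedelta(minutes=1), now))   # False
print(is_stale(now - timedelta(minutes=10), now))  # True
```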
Choosing a cluster on a new deployment
The cluster picker in the deployment creation form:
- Shows clusters attached to the current env.
- Filters out clusters with ignore: true.
- Filters out clusters with is_ok: false.
- Filters out clusters whose ingress_domain is empty if the chart sets features.ingress.enabled: true (because the standard ingress feature can't function without a domain).
If the dropdown is empty when you click "New Deployment", the most likely causes are:
- The env has no clusters attached (talk to a site admin).
- The only attached clusters are ignored or unhealthy.
- The chart requires standard ingress and no cluster has an ingress domain set.
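The picker rules and the empty-dropdown checklist above can be expressed as a small filter plus a diagnostic helper. This is a sketch under stated assumptions, not the real Platz implementation: the Cluster dataclass, the chart_needs_ingress flag (standing in for features.ingress.enabled: true), and both function names are hypothetical, and the input list is assumed to contain only clusters attached to the current env.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Cluster:
    # Field names mirror the ones used on this page; the class is illustrative.
    name: str
    is_ok: bool = True
    ignore: bool = False
    ingress_domain: str = ""


def eligible_clusters(attached: List[Cluster], chart_needs_ingress: bool) -> List[Cluster]:
    """Apply the picker filters to the clusters attached to the current env."""
    return [
        c for c in attached
        if not c.ignore                                      # drop ignored clusters
        and c.is_ok                                          # drop unhealthy clusters
        and (c.ingress_domain or not chart_needs_ingress)    # ingress charts need a domain
    ]


def explain_empty_picker(attached: List[Cluster], chart_needs_ingress: bool) -> str:
    """Walk the checklist in order and name the first cause that applies."""
    if not attached:
        return "no clusters attached to this env"
    usable = [c for c in attached if c.is_ok and not c.ignore]
    if not usable:
        return "attached clusters are all ignored or unhealthy"
    if chart_needs_ingress and not any(c.ingress_domain for c in usable):
        return "chart needs standard ingress but no cluster has an ingress domain"
    return "picker should not be empty"


clusters = [
    Cluster("a", ingress_domain="a.example.com"),
    Cluster("b"),
    Cluster("c", is_ok=False, ingress_domain="c.example.com"),
]
print([c.name for c in eligible_clusters(clusters, chart_needs_ingress=True)])   # ['a']
print([c.name for c in eligible_clusters(clusters, chart_needs_ingress=False)])  # ['a', 'b']
```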
Moving a deployment between clusters
The Recreate task type moves a deployment between clusters or namespaces. See Deployment Tasks & History. The user-facing flow: open the deployment, click Edit Deployment, change the cluster in the dropdown, and save. Platz uninstalls the deployment from the old cluster and reinstalls it on the new one in a single task.
Recreate is a maintainer-or-owner operation. A regular User role can't move deployments between clusters.
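The permission rule above amounts to a simple role check. The sketch below is illustrative only: the role strings "owner", "maintainer", and "user" and the function name can_recreate are assumptions, and how Platz stores or compares roles internally is not described on this page.

```python
# Roles allowed to run a Recreate task, per the rule above (strings are assumed).
RECREATE_ROLES = {"owner", "maintainer"}


def can_recreate(role: str) -> bool:
    """Return True if a user with this role may move a deployment between clusters."""
    return role.lower() in RECREATE_ROLES


print(can_recreate("Maintainer"))  # True
print(can_recreate("User"))        # False
```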
Caveats
- Detaching a cluster from an env doesn't uninstall deployments. A site admin who detaches prod-us-east-1 from the production env doesn't make the deployments disappear from Kubernetes; they keep running. They just become invisible in the Platz UI (because the env no longer owns them). This is rarely what you want; site admins should uninstall first, then detach.
- A cluster belongs to one env at a time. You can't share a cluster across two envs (e.g., letting both production and staging users deploy to a single cluster). The workaround is to have two clusters at the Kubernetes level, which is usually a good idea anyway for blast-radius reasons.
- Cluster names aren't unique across the install. Two clusters can share a name if their provider IDs differ (e.g., two EKS clusters in different regions both named apps). The UI shows the region next to the name to disambiguate.
- The number of attached clusters affects deployment dropdown latency on slow connections. Each cluster's status is fetched at form-render time. With dozens of clusters, the dropdown can take a noticeable moment to populate.