Kubernetes Multi-Tenancy Management Strategies
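Introduction
Multi-tenancy lets multiple teams or workloads share the resources of a single Kubernetes cluster. This article covers the core concepts of Kubernetes multi-tenancy, the main isolation strategies, and best practices.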
1. Multi-Tenancy Architecture
1.1 Multi-Tenancy Layers

    ┌─────────────────────────────────────────────────────────────┐
    │                 Multi-Tenancy Architecture                  │
    ├─────────────────────────────────────────────────────────────┤
    │                                                             │
    │   ┌─────────────────────────────────────────────────────┐   │
    │   │  Cluster-level isolation                            │   │
    │   │  - Dedicated clusters / cluster federation          │   │
    │   │  - Physical isolation                               │   │
    │   └─────────────────────────────────────────────────────┘   │
    │                              │                              │
    │                              ▼                              │
    │   ┌─────────────────────────────────────────────────────┐   │
    │   │  Namespace-level isolation                          │   │
    │   │  - Namespace isolation                              │   │
    │   │  - RBAC / NetworkPolicy                             │   │
    │   └─────────────────────────────────────────────────────┘   │
    │                              │                              │
    │                              ▼                              │
    │   ┌─────────────────────────────────────────────────────┐   │
    │   │  Pod-level isolation                                │   │
    │   │  - ResourceQuota / LimitRange                       │   │
    │   │  - Resource limits                                  │   │
    │   └─────────────────────────────────────────────────────┘   │
    │                                                             │
    └─────────────────────────────────────────────────────────────┘

1.2 Isolation Levels
| Level | Isolation strength | Use case |
|---|---|---|
| Cluster | Highest | Strict isolation requirements |
| Namespace | Medium | Shared cluster |
| Pod | Lowest | Resource sharing |
2. Namespace Isolation
2.1 Creating Namespaces

    apiVersion: v1
    kind: Namespace
    metadata:
      name: tenant-a
      labels:
        tenant: tenant-a
        environment: production
    ---
    apiVersion: v1
    kind: Namespace
    metadata:
      name: tenant-b
      labels:
        tenant: tenant-b
        environment: staging

2.2 Network Isolation

    apiVersion: networking.k8s.io/v1
    kind: NetworkPolicy
    metadata:
      name: tenant-isolation
      namespace: tenant-a
    spec:
      podSelector: {}          # applies to every pod in tenant-a
      policyTypes:
        - Ingress
        - Egress
      ingress:
        # Accept traffic only from namespaces labeled tenant=tenant-a
        - from:
            - namespaceSelector:
                matchLabels:
                  tenant: tenant-a
      egress:
        - to:
            - namespaceSelector:
                matchLabels:
                  tenant: tenant-a
        # Allow egress to kube-system (for example, cluster DNS)
        - to:
            - namespaceSelector:
                matchLabels:
                  kubernetes.io/metadata.name: kube-system

3. RBAC Permission Management
3.1 Tenant Role Definition

    apiVersion: rbac.authorization.k8s.io/v1
    kind: Role
    metadata:
      name: tenant-admin
      namespace: tenant-a
    rules:
      - apiGroups: [""]
        resources: ["pods", "services", "configmaps", "secrets"]
        verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
      - apiGroups: ["apps"]
        resources: ["deployments", "statefulsets", "daemonsets"]
        verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
    ---
    apiVersion: rbac.authorization.k8s.io/v1
    kind: RoleBinding
    metadata:
      name: tenant-admin-binding
      namespace: tenant-a
    roleRef:
      apiGroup: rbac.authorization.k8s.io
      kind: Role
      name: tenant-admin
    subjects:
      - kind: User
        name: tenant-a-admin
        apiGroup: rbac.authorization.k8s.io

3.2 Cluster-Level Permissions
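A cluster-scoped binding is not always necessary. When a RoleBinding references a ClusterRole, the permissions apply only inside the RoleBinding's namespace, so a shared role definition can be reused without widening a tenant's reach. The following is a minimal sketch, assuming the built-in `view` ClusterRole is available; the binding name and the group `tenant-a-developers` are hypothetical.

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: tenant-a-view            # hypothetical binding name
  namespace: tenant-a            # permissions are limited to this namespace
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: view                     # built-in read-only ClusterRole
subjects:
  - kind: Group
    name: tenant-a-developers    # hypothetical group, replace with a real one
    apiGroup: rbac.authorization.k8s.io
```

A ClusterRoleBinding is then only needed for permissions that are cluster-wide by nature, such as listing namespaces or reading node metrics.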

    apiVersion: rbac.authorization.k8s.io/v1
    kind: ClusterRole
    metadata:
      name: tenant-viewer
    rules:
      - apiGroups: [""]
        resources: ["namespaces"]
        verbs: ["get", "list", "watch"]
      - apiGroups: ["metrics.k8s.io"]
        resources: ["pods", "nodes"]
        verbs: ["get", "list", "watch"]
    ---
    apiVersion: rbac.authorization.k8s.io/v1
    kind: ClusterRoleBinding
    metadata:
      name: tenant-viewer-binding
    roleRef:
      apiGroup: rbac.authorization.k8s.io
      kind: ClusterRole
      name: tenant-viewer
    subjects:
      - kind: Group
        name: tenants
        apiGroup: rbac.authorization.k8s.io

4. Resource Quota Management
4.1 ResourceQuota Configuration

    apiVersion: v1
    kind: ResourceQuota
    metadata:
      name: tenant-a-quota
      namespace: tenant-a
    spec:
      hard:
        requests.cpu: "10"
        requests.memory: 20Gi
        limits.cpu: "20"
        limits.memory: 40Gi
        pods: "50"
        services: "20"
        configmaps: "100"
        secrets: "50"

4.2 LimitRange Configuration
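A LimitRange matters as soon as a ResourceQuota tracks CPU or memory requests and limits: pods whose containers omit those fields are then typically rejected at admission, because the quota system cannot account for them. A LimitRange fills the gap by injecting default values. As an illustration, the hypothetical pod below (its name and image are assumptions) declares no resources, so under `tenant-a-quota` it would only be admitted if a LimitRange in the namespace supplies defaults.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: no-resources-demo        # hypothetical pod name
  namespace: tenant-a
spec:
  containers:
    - name: app
      image: registry.k8s.io/pause:3.9   # placeholder image for illustration
      # No resources block here: without a LimitRange providing defaults,
      # a quota on requests.cpu / limits.memory cannot be evaluated for this pod.
```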

    apiVersion: v1
    kind: LimitRange
    metadata:
      name: tenant-a-limits
      namespace: tenant-a
    spec:
      limits:
        # Bounds on the total resources of a pod
        - type: Pod
          max:
            cpu: "4"
            memory: 8Gi
          min:
            cpu: "100m"
            memory: 128Mi
        # Per-container defaults and bounds
        - type: Container
          default:
            cpu: "500m"
            memory: 512Mi
          defaultRequest:
            cpu: "200m"
            memory: 256Mi
          max:
            cpu: "2"
            memory: 4Gi
          min:
            cpu: "100m"
            memory: 128Mi

5. Storage Isolation
5.1 StorageClass Isolation

    apiVersion: storage.k8s.io/v1
    kind: StorageClass
    metadata:
      name: tenant-a-storage
    provisioner: ebs.csi.aws.com
    parameters:
      type: gp3
    reclaimPolicy: Delete
    volumeBindingMode: WaitForFirstConsumer
    allowedTopologies:
      - matchLabelExpressions:
          - key: topology.ebs.csi.aws.com/zone
            values:
              - us-west-2a

5.2 PV/PVC Isolation
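StorageClass objects are cluster-scoped, so a class named after one tenant can still be referenced by PVCs in any namespace. One way to restrict it is a ResourceQuota in each of the other tenants' namespaces that sets the per-class storage quota to zero. The sketch below assumes the `tenant-b` namespace created earlier; the quota name is hypothetical.

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: block-tenant-a-storage   # hypothetical quota name
  namespace: tenant-b
spec:
  hard:
    # No capacity and no claims may be requested from tenant-a-storage here
    tenant-a-storage.storageclass.storage.k8s.io/requests.storage: "0"
    tenant-a-storage.storageclass.storage.k8s.io/persistentvolumeclaims: "0"
```

With this in place, a PVC in `tenant-b` that names `tenant-a-storage` should be rejected by quota admission, while PVCs in `tenant-a` are unaffected.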

    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: tenant-a-db
      namespace: tenant-a
    spec:
      accessModes:
        - ReadWriteOnce
      resources:
        requests:
          storage: 100Gi
      storageClassName: tenant-a-storage

6. Multi-Tenancy Best Practices
6.1 Tenant Template

    apiVersion: templates.gatekeeper.sh/v1
    kind: ConstraintTemplate
    metadata:
      name: k8stenant
    spec:
      crd:
        spec:
          names:
            kind: K8sTenant
          validation:
            openAPIV3Schema:
              type: object   # v1 templates require a structural schema
              properties:
                name:
                  type: string
                quota:
                  type: object
                  properties:
                    cpu:
                      type: string
                    memory:
                      type: string
      targets:
        - target: admission.k8s.gatekeeper.sh
          rego: |
            package k8stenant

            # Reject any Namespace that has no "tenant" label
            violation[{"msg": msg}] {
              input.review.object.kind == "Namespace"
              not input.review.object.metadata.labels.tenant
              msg := "namespace must have tenant label"
            }

6.2 Tenant Creation Workflow
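
    # Create the tenant namespace
    kubectl create namespace tenant-c
    # Label it so that namespaceSelector-based policies can match it
    kubectl label namespace tenant-c tenant=tenant-c
    # Create the resource quota
    kubectl apply -f tenant-c-quota.yaml
    # Create the network policy
    kubectl apply -f tenant-c-network-policy.yaml
    # Create the RBAC bindings
    kubectl apply -f tenant-c-rbac.yaml

6.3 Monitoring and Auditing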
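
    apiVersion: monitoring.coreos.com/v1
    kind: PrometheusRule
    metadata:
      name: tenant-alerts
    spec:
      groups:
        - name: tenant_rules
          rules:
            - alert: TenantResourceQuotaExceeded
              # Assumes kube-state-metrics, which exposes one kube_resourcequota
              # series per resource, with type="used" or type="hard".
              # The ratio is computed per resource rather than summed across units.
              expr: |
                kube_resourcequota{type="used"}
                  / ignoring(type)
                (kube_resourcequota{type="hard"} > 0)
                  > 0.9
              for: 5m
              labels:
                severity: warning
              annotations:
                summary: "Tenant {{ $labels.namespace }} is about to exhaust its resource quota"
                description: "Usage of {{ $labels.resource }} has reached 90% of the quota"

7. Multi-Tenancy Management Tools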
7.1 Kubeflow Multi-Tenancy

    apiVersion: kubeflow.org/v1beta1
    kind: Profile
    metadata:
      name: tenant-a
    spec:
      owner:
        kind: User
        name: tenant-a-user
      resourceQuotaSpec:
        hard:
          requests.cpu: "4"
          requests.memory: 8Gi
          limits.cpu: "8"
          limits.memory: 16Gi

7.2 vCluster Virtual Clusters
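
    # Create a virtual cluster
    # (flag availability depends on the vcluster CLI version; check `vcluster create --help`)
    vcluster create tenant-a-vcluster \
      --namespace tenant-a \
      --expose \
      --use-ingress

8. Common Problems and Solutions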
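8.1 Resource Leaks
Problem analysis:
- Resources are not cleaned up after a tenant is deleted
- Orphaned resources remain
Solution:

    # Configure the resource cleanup policy.
    # Note: helm.sh/resource-policy: keep tells Helm NOT to delete this object
    # on uninstall. It guards the namespace against accidental removal; it does
    # not clean up leftovers. Deleting the namespace removes its namespaced objects.
    apiVersion: v1
    kind: Namespace
    metadata:
      name: tenant-a
      annotations:
        "helm.sh/resource-policy": keep

8.2 Permission Overreach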
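Problem analysis:
- Misconfigured RBAC
- Overly broad ClusterRole permissions
Solution:

    # Inspect the role's permissions
    kubectl describe role tenant-admin -n tenant-a
    # Verify a specific permission
    kubectl auth can-i delete namespaces --as=tenant-a-admin
    # List everything the user may do inside the tenant namespace
    kubectl auth can-i --list --as=tenant-a-admin -n tenant-a

8.3 Network Policy Conflicts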
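Problem analysis:
- Rules from multiple network policies appear to conflict
- A default policy is overridden
Solution:

    # The standard NetworkPolicy API has no priority field or annotation.
    # Policies that select the same pod are additive: allowed traffic is the
    # union of all matching rules. Start from a default-deny baseline and add
    # narrowly scoped allow policies on top of it.
    apiVersion: networking.k8s.io/v1
    kind: NetworkPolicy
    metadata:
      name: default-deny-all
      namespace: tenant-a
    spec:
      podSelector: {}
      policyTypes:
        - Ingress
        - Egress

Conclusion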
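Multi-tenancy management is central to sharing a Kubernetes cluster. Namespace isolation, RBAC, resource quotas, and network policies together provide effective isolation between tenants. Choosing the isolation level and tooling that fit each scenario meets its security requirements while improving resource utilization.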
