Scale out to more pods as demand rises (CPU, requests, connections). Scale back in as demand falls. Simple.
More pods eventually need more nodes.
If the nodes are full, new pods stay in Pending. A periodic check (typically the Cluster Autoscaler, where one is enabled) picks up pending pods and adds nodes as necessary.
The HPA is all about pods.
Horizontal = more copies of the same pod

1 HPA per Deployment (each HPA targets a single workload via scaleTargetRef)
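
How many copies the HPA asks for is simple arithmetic. The Kubernetes documentation gives the rule as desired = ceil(currentReplicas * currentMetric / targetMetric), clamped between minReplicas and maxReplicas. Below is a minimal shell sketch of that rule. The function name `desired_replicas` and the sample numbers are made up for illustration, and the sketch ignores the controller's tolerance band and stabilization behavior.

```shell
#!/bin/sh
# Hypothetical helper, not part of kubectl or Kubernetes.
# Usage: desired_replicas CURRENT_REPLICAS CURRENT_UTIL% TARGET_UTIL% MIN MAX
desired_replicas() {
  current=$1; util=$2; target=$3; min=$4; max=$5
  # Integer ceiling of (current * util / target)
  want=$(( (current * util + target - 1) / target ))
  # Clamp to the HPA's minReplicas / maxReplicas
  if [ "$want" -lt "$min" ]; then want=$min; fi
  if [ "$want" -gt "$max" ]; then want=$max; fi
  echo "$want"
}

# 1 pod running at 250% of its CPU request, target 50%, bounds 1..10
desired_replicas 1 250 50 1 10
```

With a 50% target, one pod running at 250% of its request asks for 5 replicas. The same pod count at 20% would stay at the minimum.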

# Namespace: not a security boundary
apiVersion: v1
kind: Namespace
metadata:
  name: acg-ns
---
# Service: Load Balancer
apiVersion: v1
kind: Service
metadata:
  namespace: acg-ns
  name: acg-lb
spec:
  type: LoadBalancer
  ports:
    - port: 80
  selector:
    app: acg-stress
---
# Deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  namespace: acg-ns
  labels:
    app: acg-stress
  name: acg-web
spec:
  selector:
    matchLabels:
      app: acg-stress
  replicas: 1
  strategy:
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
    type: RollingUpdate
  template:
    metadata:
      labels:
        app: acg-stress
    spec:
      containers:
        - image: k8s.gcr.io/hpa-example
          name: stresser
          ports:
            - containerPort: 80
          resources:
            requests:
              cpu: 0.2
---
# HPA
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: acg-hpa
  namespace: acg-ns
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: acg-web
  minReplicas: 1
  maxReplicas: 10
  targetCPUUtilizationPercentage: 50
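
The 50 in targetCPUUtilizationPercentage is a percentage of each pod's CPU request, not of a node or a whole core. With the 0.2 CPU (200m) request in the Deployment above, the target works out as in this small sketch. The variable names are illustrative only.

```shell
#!/bin/sh
# Illustrative arithmetic only; values copied from the manifest above.
request_millicores=200   # cpu: 0.2 is the same as 200m
target_percent=50        # targetCPUUtilizationPercentage

# Average per-pod CPU usage the HPA tries to hold
target_millicores=$(( request_millicores * target_percent / 100 ))
echo "${target_millicores}m"
```

So the HPA adds replicas once average usage across the pods climbs past roughly 100m each. Without a CPU request on the container there is nothing to take a percentage of.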
# Apply the manifest above (saved as hpademo.yml)
kubectl apply -f hpademo.yml
kubectl get deploy --namespace acg-ns
kubectl get hpa --namespace acg-ns
kubectl get nodes
# Watch the HPA (current/target CPU and replica count)
kubectl get hpa --namespace acg-ns --watch
# Inspect the Deployment as updated by the HPA
# spec.replicas will have changed based on load
# Configuration is all handled by the control plane
kubectl get deploy --namespace acg-ns -o yaml
# -----------------------
# Stress the pods above via busybox
kubectl run -i --tty loader --image=busybox -- /bin/sh
# Sample script (run inside the busybox shell)
while true; do wget -q -O- http://acg-lb.acg-ns.svc.cluster.local; done
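
While the load runs, pods that do not fit on the current nodes show up as Pending, which is the signal for adding nodes mentioned at the top. A rough way to count them from a second terminal is sketched below. `count_pending` is a hypothetical helper, and it assumes the default single-namespace column order of `kubectl get pods --no-headers` (NAME READY STATUS RESTARTS AGE).

```shell
#!/bin/sh
# Hypothetical helper: counts Pending pods in `kubectl get pods --no-headers` output.
# Assumes STATUS is the third column (single namespace, default output format).
count_pending() {
  awk '$3 == "Pending" { n++ } END { print n + 0 }'
}

# Sample input standing in for real kubectl output (pod names are made up)
printf '%s\n' \
  'acg-web-aaa 1/1 Running 0 2m' \
  'acg-web-bbb 0/1 Pending 0 5s' | count_pending

# Against a live cluster (not run here):
#   kubectl get pods --namespace acg-ns --no-headers | count_pending
```

A count that stays above zero while the HPA is at its replica target suggests the nodes are full rather than the HPA being slow.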