Autonomous infrastructure ops

The autonomous ops engineer that fixes infrastructure before your on-call gets paged.

24/7 monitoring agent that ships fixes for disk, memory, CPU, and Kubernetes incidents — and escalates only what it can't handle.

Last shipped: 2026-05-30 · v0.1.30
Incidents auto-resolved (last 30 days)
On-call hours saved (since launch)
Mean time to remediation (vs. 47 min industry avg)
Signal sources: AWS, GCP, Kubernetes
Interactive sandbox

Watch it run

Pick a real incident type. BytePort detects it, picks a runbook, remediates, verifies, and writes a postmortem — in under 30 seconds. Demo environment — no real infrastructure is touched.
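
The detect, pick, remediate, verify, postmortem loop can be sketched roughly as below. This is a minimal illustration, not BytePort's implementation: the runbook shape (`matches`, `remediate`, `verify`), the `notify` callback, and the message fields are all hypothetical names chosen for the example.

```javascript
// Hypothetical runbook shape: { name, matches(signal), remediate(signal), verify(signal) }.
// None of these names come from the BytePort API; they only illustrate the loop.
async function handleSignal(signal, runbooks, notify) {
  // Pick the first runbook that claims this signal.
  const runbook = runbooks.find((rb) => rb.matches(signal));
  if (!runbook) {
    // Nothing can handle this signal, so hand it to a human.
    await notify({ type: "escalation", signal, reason: "no matching runbook" });
    return { status: "escalated", runbook: null };
  }

  const startedAt = Date.now();
  await runbook.remediate(signal);
  // Verify after remediating; an unverified fix is treated as an escalation.
  const healthy = await runbook.verify(signal);

  await notify({
    type: healthy ? "postmortem" : "escalation",
    signal,
    runbook: runbook.name,
    durationMs: Date.now() - startedAt,
  });
  return { status: healthy ? "resolved" : "escalated", runbook: runbook.name };
}
```

Escalation is the fallback on both branches: when no runbook matches and when verification fails.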


Seven runbooks. Zero pages.

Every card below is a live runbook — a real action BytePort takes when a signal breaches threshold. Not a recommendation. A fix.
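
As a rough sketch of what "breaches threshold" means, the table below copies the signal names and thresholds from the cards on this page. The object layout and runbook identifiers are assumptions for illustration. The "sustained" CPU condition and the deployment rollback signal depend on time and are simplified or left out here.

```javascript
// Thresholds copied from the runbook cards; the layout and ids are assumptions.
const RUNBOOK_SIGNALS = [
  { runbook: "disk-pressure", signal: "disk_usage", breached: (v) => v > 85 },
  { runbook: "memory-pressure", signal: "memory_used", breached: (v) => v > 90 },
  // The card says "sustained"; this sketch checks a single sample only.
  { runbook: "cpu-pressure", signal: "cpu_pct", breached: (v) => v > 95 },
  { runbook: "host-unhealthy", signal: "host_unhealthy", breached: (v) => v === 1 },
  { runbook: "pod-crashloop", signal: "pod_crashloop_count", breached: (v) => v > 3 },
  { runbook: "oom-killed-pod", signal: "oom_killed_count", breached: (v) => v > 0 },
];

// Returns the ids of every runbook whose signal is present and over threshold.
function breachedRunbooks(sample) {
  return RUNBOOK_SIGNALS
    .filter((r) => r.signal in sample && r.breached(sample[r.signal]))
    .map((r) => r.runbook);
}
```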

Browse the full runbook catalog →

Disk pressure

Signal: disk_usage > 85%

Vacuums journald logs (via journalctl), rotates stale log files, prunes Docker layers, and sweeps /tmp. Resolves without a restart in 90% of cases.

avg resolution: 38s
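
A minimal sketch of the cleanup sequence, assuming the steps run in the order listed and stop as soon as usage drops back under the threshold. The commands are standard CLI invocations, but the retention values (`7d`, `+7` days), the step order, and the `readUsagePct` / `runCommand` callbacks are assumptions, not BytePort's actual settings.

```javascript
// Retention values and ordering are assumptions made for this example.
const DISK_CLEANUP_STEPS = [
  { name: "vacuum journal", cmd: ["journalctl", "--vacuum-time=7d"] },
  { name: "rotate logs", cmd: ["logrotate", "--force", "/etc/logrotate.conf"] },
  { name: "prune docker images", cmd: ["docker", "image", "prune", "--force"] },
  { name: "sweep /tmp", cmd: ["find", "/tmp", "-xdev", "-type", "f", "-mtime", "+7", "-delete"] },
];

// readUsagePct and runCommand are injected so the logic can be tested
// without touching a real disk.
async function relieveDiskPressure(readUsagePct, runCommand, thresholdPct = 85) {
  const executed = [];
  for (const step of DISK_CLEANUP_STEPS) {
    // Stop early once usage is back at or under the threshold.
    if ((await readUsagePct()) <= thresholdPct) break;
    await runCommand(step.cmd);
    executed.push(step.name);
  }
  return { executed, resolved: (await readUsagePct()) <= thresholdPct };
}
```

Running the least disruptive step first and re-measuring after each one keeps the fix as small as the incident allows.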

Memory pressure

Signal: memory_used > 90%

Identifies the leaking process via cgroup stats. Restarts it only if it carries the byteport.restart=safe label, so only opted-in services are touched.

avg resolution: 52s
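
One way to picture the opt-in gate: take successive memory readings per cgroup, pick the one that grew the most, and restart it only if it carries the label. The snapshot shape below is hypothetical, and "largest growth" is an assumed heuristic rather than BytePort's documented leak detection. In cgroup v2, the readings would come from each group's `memory.current` file.

```javascript
// Each entry is a hypothetical snapshot: { name, labels, memoryCurrent: [bytes, ...] },
// where memoryCurrent holds successive reads of the cgroup v2 memory.current file.
function pickRestartCandidate(cgroups) {
  const growth = (c) => c.memoryCurrent[c.memoryCurrent.length - 1] - c.memoryCurrent[0];

  // Assumed heuristic: the suspect is the cgroup whose usage grew the most.
  const suspect = cgroups
    .filter((c) => c.memoryCurrent.length >= 2 && growth(c) > 0)
    .sort((a, b) => growth(b) - growth(a))[0];

  if (!suspect) return { action: "escalate", reason: "no growing cgroup found" };

  // Only services labeled byteport.restart=safe may be restarted.
  const optedIn = Boolean(suspect.labels) && suspect.labels["byteport.restart"] === "safe";
  return optedIn
    ? { action: "restart", target: suspect.name }
    : { action: "escalate", target: suspect.name, reason: "service has not opted in" };
}
```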

CPU pressure

Signal: cpu_pct > 95% sustained

Applies a cgroup CPU throttle to processes that carry the byteport.throttle=safe label. Buys headroom without killing the process, then files a postmortem with root cause.

avg resolution: 29s
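
For context on what a cgroup throttle looks like: in cgroup v2 a CPU cap is written to the group's `cpu.max` file as `<quota> <period>` in microseconds, so `50000 100000` allows half a CPU. The sketch below only builds that value and checks the label. The `cgroupPath` field and the function names are hypothetical, and the cap BytePort actually applies is not stated on this page.

```javascript
// Builds the "<quota> <period>" string that cgroup v2 expects in cpu.max.
// Passing null yields "max <period>", which removes the cap.
function cpuMaxValue(cpus, periodUs = 100000) {
  if (cpus === null) return `max ${periodUs}`;
  if (!(cpus > 0)) throw new RangeError("cpus must be a positive number");
  return `${Math.round(cpus * periodUs)} ${periodUs}`;
}

// proc is a hypothetical record: { labels, cgroupPath }.
// Returns what to write and where, or null if the process has not opted in.
function throttlePlan(proc, cpus) {
  if (!proc.labels || proc.labels["byteport.throttle"] !== "safe") return null;
  return { path: `${proc.cgroupPath}/cpu.max`, value: cpuMaxValue(cpus) };
}
```

Keeping the plan separate from the write makes it easy to log or dry-run the change before applying it.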

Host unhealthy

Signal: host_unhealthy = 1

Runs connectivity checks, service probes, and NTP drift detection. Restarts unhealthy services in dependency order, and escalates if the host itself is unreachable.

avg resolution: 71s
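
"Dependency order" can be read as a topological sort: a service restarts only after the unhealthy services it depends on. The sketch below assumes a simple dependency map with made-up service names. How BytePort discovers dependencies is not described here.

```javascript
// deps maps a service to the services it depends on (hypothetical names).
// unhealthy is a Set of services that need a restart.
function restartOrder(deps, unhealthy) {
  const order = [];
  const state = new Map(); // service -> "visiting" | "done"

  const visit = (svc) => {
    if (state.get(svc) === "done") return;
    if (state.get(svc) === "visiting") throw new Error(`dependency cycle at ${svc}`);
    state.set(svc, "visiting");
    // Visit dependencies first so they land earlier in the order.
    for (const dep of deps[svc] || []) visit(dep);
    state.set(svc, "done");
    if (unhealthy.has(svc)) order.push(svc);
  };

  for (const svc of unhealthy) visit(svc);
  return order;
}
```

Healthy dependencies are traversed but never restarted, and a cycle raises an error instead of looping.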

Pod crashloop

Signal: pod_crashloop_count > 3

Classifies the crash cause (OOM, config error, missing dependency). Safe-restarts the pod with a back-off guard. Requires the byteport.io/allow-remediation: "true" annotation on the namespace.

avg resolution: 44s
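
A sketch of how the classification and the namespace gate might look against Kubernetes objects. `lastState.terminated.reason` of `OOMKilled` and a waiting reason of `CreateContainerConfigError` are standard Kubernetes container status values. The log pattern for a missing dependency and both function names are assumptions made for this example.

```javascript
// containerStatus follows the shape of an entry in pod.status.containerStatuses.
function classifyCrash(containerStatus, logTail = "") {
  const terminated = containerStatus.lastState?.terminated;
  const waiting = containerStatus.state?.waiting;

  if (terminated?.reason === "OOMKilled") return "oom";
  if (waiting?.reason === "CreateContainerConfigError") return "config-error";
  // Log matching is a heuristic assumption, not a Kubernetes signal.
  if (/connection refused|no such host|ECONNREFUSED/i.test(logTail)) return "missing-dep";
  return "unknown";
}

// The namespace must opt in with byteport.io/allow-remediation: "true".
function remediationAllowed(namespace) {
  const annotations = namespace.metadata?.annotations || {};
  return annotations["byteport.io/allow-remediation"] === "true";
}
```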

Deployment rollback

Signal: failed deploy within 30 min

Reads the byteport.io/allow-rollback annotation. If set, triggers kubectl rollout undo and notifies the commit author via Slack or email.

avg resolution: 18s
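
The rollback gate can be sketched as two checks followed by a standard `kubectl rollout undo`. The assumptions here: the annotation lives on the Deployment and counts as set when its value is `"true"`, the 30-minute window is measured from deploy time to failure time, and timestamps are in milliseconds. The page does not spell out any of these details.

```javascript
const ROLLBACK_WINDOW_MS = 30 * 60 * 1000;

// deployment follows the Kubernetes object shape ({ metadata: { name, namespace, annotations } }).
// Returns the argv for kubectl, or null when no rollback should happen.
function rollbackCommand(deployment, deployedAtMs, failedAtMs) {
  const annotations = deployment.metadata.annotations || {};
  // Assumption: "set" means the annotation value is the string "true".
  const allowed = annotations["byteport.io/allow-rollback"] === "true";

  const elapsed = failedAtMs - deployedAtMs;
  const recent = elapsed >= 0 && elapsed <= ROLLBACK_WINDOW_MS;
  if (!allowed || !recent) return null;

  return [
    "kubectl", "rollout", "undo",
    `deployment/${deployment.metadata.name}`,
    "--namespace", deployment.metadata.namespace,
  ];
}
```

Returning an argument array rather than a shell string avoids quoting problems if a name ever contains unusual characters.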

OOM-killed pod

Signal: oom_killed_count > 0

Distinguishes a memory leak from under-provisioned limits. If the limits look correct, restarts with debug logging enabled. If a leak signature is detected, patches the resource limit and files a postmortem.

avg resolution: 61s
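
What counts as a "leak signature" is not defined on this page. One plausible reading, sketched below, is steady upward growth in memory samples: fit a least-squares line and flag a leak when the slope exceeds a small fraction of the limit per sample. The 1% default and both function names are assumptions.

```javascript
// Least-squares slope of evenly spaced samples, in bytes per sample.
function slope(samples) {
  const n = samples.length;
  const meanX = (n - 1) / 2;
  const meanY = samples.reduce((a, b) => a + b, 0) / n;
  let num = 0;
  let den = 0;
  samples.forEach((y, x) => {
    num += (x - meanX) * (y - meanY);
    den += (x - meanX) ** 2;
  });
  return den === 0 ? 0 : num / den;
}

// Assumption: growth of at least minGrowthRatio of the limit per sample
// is treated as a leak; flat or noisy usage is not.
function hasLeakSignature(samples, limitBytes, minGrowthRatio = 0.01) {
  return slope(samples) >= limitBytes * minGrowthRatio;
}
```

A pod that sits flat just under its limit fails this test, which points toward the limit rather than the code.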

Running in 60 seconds.

Point it at your signal source. Give it a notification target. It watches from there.

```bash
npm install @byteport/agent

# minimum required env vars
export BYTEPORT_API_KEY=your_api_key
export PROMETHEUS_URL=http://prometheus:9090    # or DATADOG_API_KEY
export POSTMARK_SERVER_TOKEN=your_token         # or SLACK_WEBHOOK_URL

node -e "require('@byteport/agent').start()"
```

```bash
docker run -d \
  -e BYTEPORT_API_KEY=your_api_key \
  -e PROMETHEUS_URL=http://prometheus:9090 \
  -e POSTMARK_SERVER_TOKEN=your_token \
  byteport/agent:latest
```

```bash
helm repo add byteport https://byteport.polsia.app/charts
helm repo update

# Default install is dry-run safe: it never mutates without your permission
helm install byteport byteport/byteport-agent \
  --namespace byteport-agent \
  --create-namespace

# Confirm it's watching (logs appear within 30s)
kubectl logs -n byteport-agent \
  -l app.kubernetes.io/name=byteport-agent --tail=20 -f
```
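
Before starting the agent from Node, it can help to check the minimum environment described in the npm and Docker snippets: an API key, one signal source, and one notification target. The helper below is an illustrative sketch and not part of `@byteport/agent`; only the variable names come from this page.

```javascript
// Returns a list of human-readable gaps; an empty list means the minimum is met.
function missingConfig(env) {
  const missing = [];
  if (!env.BYTEPORT_API_KEY) missing.push("BYTEPORT_API_KEY");
  // One signal source is enough.
  if (!env.PROMETHEUS_URL && !env.DATADOG_API_KEY) {
    missing.push("PROMETHEUS_URL or DATADOG_API_KEY");
  }
  // One notification target is enough.
  if (!env.POSTMARK_SERVER_TOKEN && !env.SLACK_WEBHOOK_URL) {
    missing.push("POSTMARK_SERVER_TOKEN or SLACK_WEBHOOK_URL");
  }
  return missing;
}
```

Calling `missingConfig(process.env)` at startup and exiting when the list is non-empty gives a clearer failure than a late runtime error.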

Works with what you already run.

Plug into your existing signal sources and notification channels. No new dashboards required.

Datadog: Metrics API
Prometheus: HTTP query API
Postmark: Transactional email
Slack: Webhook alerts
PagerDuty: Events API v2
Kubernetes: kubectl + annotations
GitHub: Commit metadata
CloudWatch: GetMetricData API
Grafana: roadmap
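
To make two of these integrations concrete, here is a sketch of the requests involved. It assumes the public Prometheus instant-query endpoint (`/api/v1/query`) and the PagerDuty Events API v2 trigger format. The function names, the choice of `critical` severity, and any PromQL expression passed in are illustrative and not taken from BytePort.

```javascript
// Builds an instant-query URL for the Prometheus HTTP API.
// Note: the absolute path replaces any path prefix on baseUrl.
function prometheusQueryUrl(baseUrl, promql) {
  const url = new URL("/api/v1/query", baseUrl);
  url.searchParams.set("query", promql);
  return url.toString();
}

// Builds a trigger event for the PagerDuty Events API v2
// (to be sent as JSON via POST to https://events.pagerduty.com/v2/enqueue).
function pagerDutyEvent(routingKey, summary, source) {
  return {
    routing_key: routingKey,
    event_action: "trigger",
    // Severity is fixed here for brevity; the API also accepts error, warning, and info.
    payload: { summary, source, severity: "critical" },
  };
}
```

Both helpers only build data, so they can be unit-tested without network access.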

What BytePort fixed recently.

Real incidents from the live database. Updated as they happen.

See the full live feed →

Want this watching your infra?

BytePort is in private beta. Join the waitlist and we'll reach out when your stack is a good fit.