SRE Skills (9)
Part of Role Skills Catalog | Phase 4 + Phase 6
Chains: SLO → Alerts | Pipeline Create → Pipeline Audit → Env | Toil → Hardening | Runbook → Postmortem
Userflow Schema
flowchart TD
jaan-to-sre-slo-setup["sre-slo-setup<br>SLO Setup<br>SLOs + error budgets + SLIs"] --> jaan-to-sre-alert-tuning["sre-alert-tuning<br>Alert Tuning<br>Noise reduction + severity map"]
jaan-to-sre-alert-tuning["sre-alert-tuning<br>Alert Tuning<br>Noise reduction + severity map"] -.-> jaan-to-dev-observability-alerts["dev-observability-alerts<br>DEV: observability-alerts"]
detect-dev["detect-dev<br>DEV: stack-detect"] -.-> jaan-to-sre-pipeline-create["sre-pipeline-create<br>Pipeline Create<br>CI/CD workflows + Docker + deploy"]
jaan-to-sre-pipeline-create["sre-pipeline-create<br>Pipeline Create<br>CI/CD workflows + Docker + deploy"] --> jaan-to-sre-pipeline-audit["sre-pipeline-audit<br>Pipeline Audit<br>Weak points + missing gates"]
jaan-to-sre-pipeline-audit["sre-pipeline-audit<br>Pipeline Audit<br>Weak points + missing gates"] --> jaan-to-sre-env-check["sre-env-check<br>Env Check<br>Config drift + readiness"]
jaan-to-sre-env-check["sre-env-check<br>Env Check<br>Config drift + readiness"] -.-> jaan-to-dev-ship-check["dev-ship-check<br>DEV: ship-check"]
jaan-to-sre-toil-audit["sre-toil-audit<br>Toil Audit<br>Toil list + automation candidates"] --> jaan-to-sre-hardening-plan["sre-hardening-plan<br>Hardening Plan<br>Reliability backlog + steps"]
jaan-to-sre-hardening-plan["sre-hardening-plan<br>Hardening Plan<br>Reliability backlog + steps"] -.-> jaan-to-delivery-backlog-split["delivery-backlog-split<br>DELIVERY: backlog-split"]
jaan-to-sre-incident-runbook["sre-incident-runbook<br>Incident Runbook<br>Runbook + escalation + triage"] --> jaan-to-sre-incident-postmortem["sre-incident-postmortem<br>Incident Postmortem<br>Timeline + actions + owners"]
jaan-to-sre-incident-postmortem["sre-incident-postmortem<br>Incident Postmortem<br>Timeline + actions + owners"] --> jaan-to-sre-hardening-plan["sre-hardening-plan<br>Hardening Plan<br>Reliability backlog + steps"]
style detect-dev fill:#f0f0f0,stroke:#999
style jaan-to-dev-observability-alerts fill:#f0f0f0,stroke:#999
style jaan-to-dev-ship-check fill:#f0f0f0,stroke:#999
style jaan-to-delivery-backlog-split fill:#f0f0f0,stroke:#999
Legend: Solid = internal | Dashed = cross-role exit | Gray nodes = other roles
/jaan-to-sre-slo-setup
- Logical:
sre-slo-setup - Description: SLOs + error budgets, suggested SLIs (latency, errors, uptime), reporting cadence
- Quick Win: No - needs monitoring data
- Key Points:
- Define SLOs and error budgets
- Alerts should reflect user impact
- Tune noise and prioritize
- → Next:
sre-alert-tuning - MCP Required: Monitoring tools (Datadog/Grafana)
- Input: [service]
- Output:
$JAAN_OUTPUTS_DIR/sre/monitor/{slug}/slo-setup.md
/jaan-to-sre-alert-tuning
- Logical:
sre-alert-tuning - Description: Noise reduction plan, priorities + severity mapping, missing alerts checklist
- Quick Win: No - needs alert history
- Key Points:
- Define SLOs and error budgets
- Alerts should reflect user impact
- Tune noise and prioritize
- → Next:
dev-observability-alerts - MCP Required: Monitoring tools (Datadog/Grafana)
- Input: [alerts]
- Output:
$JAAN_OUTPUTS_DIR/sre/monitor/{slug}/alert-tuning.md
/jaan-to:sre-pipeline-create
- Logical:
sre-pipeline-create - Description: Generate CI/CD pipeline configuration: GitHub Actions workflows, Dockerfile, deployment manifests, testing stages
- Quick Win: Yes
- Key Points:
- Multi-stage Docker builds for optimization
- Parallel test/lint/type-check jobs
- Environment-specific deployment with approval gates
- Secrets management pattern (GitHub Secrets)
- → Next:
sre-pipeline-audit - MCP Required: None
- Input: [tech-stack, project-structure, deployment-target]
- Output:
$JAAN_OUTPUTS_DIR/sre/cicd/{slug}/pipeline/ - Note: Complements existing
sre-pipeline-audit(audit-only)
/jaan-to-sre-pipeline-audit
- Logical:
sre-pipeline-audit - Description: Weak points + quick fixes, missing gates checklist, reliability improvements backlog
- Quick Win: No - needs GitLab MCP
- Key Points:
- Identify weak gates and flaky tests
- Ensure env parity and secret management
- Make rollbacks routine
- → Next:
sre-env-check - MCP Required: GitLab (pipeline config)
- Input: [repo]
- Output:
$JAAN_OUTPUTS_DIR/sre/cicd/{slug}/pipeline-audit.md
/jaan-to-sre-env-check
- Logical:
sre-env-check - Description: Config drift + readiness checklist, missing secrets/configs, "safe to deploy?" hints
- Quick Win: No - needs GitLab MCP
- Key Points:
- Identify weak gates and flaky tests
- Ensure env parity and secret management
- Make rollbacks routine
- → Next:
dev-ship-check - MCP Required: GitLab (env config)
- Input: [env]
- Output:
$JAAN_OUTPUTS_DIR/sre/cicd/{slug}/env-check.md
/jaan-to-sre-toil-audit
- Logical:
sre-toil-audit - Description: Toil list + automation candidates, time spent estimates, top ROI opportunities
- Quick Win: Yes
- Key Points:
- Track toil weekly; automate repeatable work
- Maintain reliability backlog
- Prioritize hardening with measurable outcomes
- → Next:
sre-hardening-plan - MCP Required: None
- Input: [week]
- Output:
$JAAN_OUTPUTS_DIR/sre/ops/{slug}/toil-audit.md
/jaan-to-sre-hardening-plan
- Logical:
sre-hardening-plan - Description: Reliability improvement backlog, sequenced steps, verification metrics
- Quick Win: Yes
- Key Points:
- Track toil weekly; automate repeatable work
- Maintain reliability backlog
- Prioritize hardening with measurable outcomes
- → Next:
delivery-backlog-split - MCP Required: None
- Input: [service]
- Output:
$JAAN_OUTPUTS_DIR/sre/ops/{slug}/hardening-plan.md
/jaan-to-sre-incident-runbook
- Logical:
sre-incident-runbook - Description: Runbook + escalation steps, triage checklist, "if X then Y" actions
- Quick Win: Yes
- Key Points:
- Runbooks should be actionable and short
- Clear escalation paths
- Postmortems must yield backlog items
- → Next:
sre-incident-postmortem - MCP Required: None
- Input: [service]
- Output:
$JAAN_OUTPUTS_DIR/sre/incident/{slug}/runbook.md
/jaan-to-sre-incident-postmortem
- Logical:
sre-incident-postmortem - Description: Blameless retro template filled, timeline + contributing factors, action items + owners
- Quick Win: Yes
- Key Points:
- Runbooks should be actionable and short
- Clear escalation paths
- Postmortems must yield backlog items
- → Next:
sre-hardening-plan - MCP Required: None
- Input: [incident]
- Output:
$JAAN_OUTPUTS_DIR/sre/incident/{slug}/postmortem.md