error-root-cause-analyzer

active

0x33cd4c10536243e54f01743f0ffc27d9a4f9dd4605de8cf0921887a84629fab4

Analyzes error messages, stack traces, crash logs, and unexpected behavior in depth to identify root causes. Supports errors from Python, JavaScript/TypeScript, Go, Rust, Java, C/C++, Ruby, and shell. Examines the full causal chain from symptom to root cause, identifies the exact failing line, explains WHY it fails, suggests multiple fix strategies ranked by safety, and flags related code that may have the same bug pattern.

Skill body

error-root-cause-analyzer

Purpose

Diagnose errors, crashes, and unexpected behavior from stack traces, log output, or error messages. Go beyond surface symptoms to find the true root cause, then suggest targeted fixes.

Input

The user provides one or more of:

  • Error message or exception text
  • Full stack trace / traceback
  • Log file excerpt showing the failure context
  • Code that produces the error
  • Steps to reproduce (optional but helpful)
  • Environment details: language version, OS, dependencies (optional)

Analysis Process

Phase 1: Symptom Classification

Identify the error category:

  • Type errors — wrong type passed, null/undefined reference, coercion failure
  • Logic errors — wrong branch taken, off-by-one, infinite loop
  • Resource errors — OOM, file not found, permission denied, connection refused
  • Concurrency errors — race condition, deadlock, data corruption
  • Configuration errors — missing env var, wrong path, version mismatch
  • Dependency errors — incompatible versions, missing module, API change
  • Security errors — auth failure, CORS, certificate, injection
  • Platform errors — OS-specific, architecture-specific, Docker/container
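
As a rough illustration of this phase, the categories above could be approximated with a keyword lookup. This is only a sketch: the `CATEGORY_PATTERNS` table and the `classify_error` helper are hypothetical names, the patterns cover a handful of common messages, and a real classification would also weigh the surrounding code and logs.

```python
import re

# Hypothetical keyword patterns per category. The first match wins, so the
# more specific categories are listed before the broader ones.
CATEGORY_PATTERNS = [
    ("Dependency", r"ModuleNotFoundError|ImportError|Cannot find module"),
    ("Resource", r"MemoryError|No such file or directory|Permission denied|Connection refused"),
    ("Security", r"CORS|certificate|Unauthorized|Forbidden"),
    ("Concurrency", r"deadlock|race condition|concurrent map"),
    ("Configuration", r"environment variable|invalid configuration"),
    ("Type", r"TypeError|NoneType|undefined is not|NullPointerException|nil pointer"),
]


def classify_error(message: str) -> str:
    """Return the first category whose pattern appears in the message."""
    for category, pattern in CATEGORY_PATTERNS:
        if re.search(pattern, message, re.IGNORECASE):
            return category
    return "Unclassified"
```

A keyword match like this is best treated as a starting hypothesis. A `TypeError` raised at the crash site may still turn out to be a configuration error once the causal chain is traced.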

Phase 2: Causal Chain Construction

Build the chain from effect back to cause:

  1. Immediate cause — the line that threw/panicked/crashed
  2. Contributing conditions — state that made the crash possible
  3. Propagation path — how the error traveled through call stack layers
  4. Root cause — the actual bug (often different from the crash site)
  5. Trigger — what input/event/timing activated the latent bug
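
The gap between the crash site and the root cause can be seen in a small Python sketch. The functions `load_timeout`, `compute_deadline`, and `diagnose` are hypothetical examples made up for illustration; only the standard `traceback` module is used to read the propagation path.

```python
import traceback


def load_timeout(config: dict):
    # Latent defect (root cause): silently returns None when the key is missing.
    return config.get("timeout")


def compute_deadline(config: dict) -> float:
    timeout = load_timeout(config)
    # Crash site: raises TypeError when timeout is None.
    return timeout * 2


def propagation_path(exc: BaseException) -> list[str]:
    """Function names from the outermost caller down to the frame that raised."""
    return [frame.name for frame in traceback.extract_tb(exc.__traceback__)]


def diagnose() -> dict:
    try:
        compute_deadline({})  # Trigger: a config without a "timeout" key.
    except TypeError as exc:
        path = propagation_path(exc)
        return {"crash_site": path[-1], "path": path, "message": str(exc)}
    return {}
```

Here the traceback ends in `compute_deadline`, while `load_timeout`, which holds the actual defect, has already returned and does not appear in the stack at all. This is why the chain has to be built by reading the code around each frame, not just the frames themselves.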

Phase 3: Root Cause Identification

For each candidate root cause, evaluate:

  • Certainty level: confirmed / highly likely / possible / speculative
  • Evidence: specific lines, values, or patterns that support this diagnosis
  • Counter-evidence: anything that argues against this being the root cause
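
One way to keep certainty levels honest is to derive them from the recorded evidence instead of assigning them by feel. The sketch below is an assumption, not a prescribed rule: the `Candidate` class, the `reproduced` flag, and the thresholds are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass, field
from enum import IntEnum


class Certainty(IntEnum):
    SPECULATIVE = 1
    POSSIBLE = 2
    HIGHLY_LIKELY = 3
    CONFIRMED = 4


@dataclass
class Candidate:
    description: str
    evidence: list[str] = field(default_factory=list)
    counter_evidence: list[str] = field(default_factory=list)
    reproduced: bool = False  # Assumption: "confirmed" requires a reproduction.

    def certainty(self) -> Certainty:
        # Hypothetical scoring rule; the thresholds are illustrative only.
        if self.reproduced and not self.counter_evidence:
            return Certainty.CONFIRMED
        if not self.evidence:
            return Certainty.SPECULATIVE
        if self.counter_evidence or len(self.evidence) < 2:
            return Certainty.POSSIBLE
        return Certainty.HIGHLY_LIKELY


def rank(candidates: list[Candidate]) -> list[Candidate]:
    """Order candidates from most to least certain."""
    return sorted(candidates, key=lambda c: c.certainty(), reverse=True)
```

Under this rule a candidate with any counter-evidence never rises above "possible", which forces the analysis to address the contradiction before claiming more.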

Phase 4: Fix Strategies

Provide 2-3 fix strategies, starting with the recommended one and stating the trade-offs of each:

Strategy A: Targeted Fix (recommended)

  • Exact code change needed (before/after)
  • Why this fixes the root cause, not just the symptom
  • Risk assessment (what could this break?)

Strategy B: Defensive Fix (safer)

  • Add guards, validation, or error handling
  • Catches this class of error even if the root cause shifts
  • Trade-off: may mask deeper issues

Strategy C: Architectural Fix (thorough, if applicable)

  • Redesign the component to prevent the entire error class
  • Higher effort, but eliminates the bug pattern permanently
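
The three strategies can be compared on one small, made-up bug: a port read from the environment as a string and later used in arithmetic. Everything in this sketch (`read_port`, `next_port`, `ServerConfig`, the default of 8080) is a hypothetical example, not code from any particular project.

```python
from dataclasses import dataclass


# Before: returns a str (or None), so callers that do arithmetic crash later.
def read_port_before(env: dict) -> object:
    return env.get("PORT")


# Strategy A (targeted): fix the defect where it originates by converting once.
def read_port(env: dict, default: int = 8080) -> int:
    raw = env.get("PORT")
    return default if raw is None else int(raw)


# Strategy B (defensive): guard at the consumer so bad values fail loudly.
def next_port(port: object) -> int:
    if not isinstance(port, int):
        raise TypeError(f"port must be int, got {type(port).__name__}")
    return port + 1


# Strategy C (architectural): validate once into a typed, immutable config
# so downstream code never sees an unvalidated value.
@dataclass(frozen=True)
class ServerConfig:
    port: int

    @classmethod
    def from_env(cls, env: dict) -> "ServerConfig":
        port = read_port(env)
        if not 0 < port < 65536:
            raise ValueError(f"PORT out of range: {port}")
        return cls(port=port)
```

The targeted fix changes one function, the defensive fix leaves the producer untouched but reports the problem clearly, and the architectural fix costs more up front while removing the whole class of "raw string leaked into arithmetic" bugs.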

Phase 5: Pattern Detection

Identify if the same bug pattern likely exists elsewhere:

  • Search patterns to find similar code
  • Common anti-patterns this error exemplifies
  • Linter rules or type constraints that would catch this automatically
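
A search pattern from this phase can be turned into a small scanner. The sketch below assumes one specific anti-pattern, using the result of a dict `.get()` call without checking it for `None`. The regex and the helper names `find_matches` and `scan_tree` are hypothetical, and a line-based regex will produce both false positives and misses.

```python
import re
from pathlib import Path

# Hypothetical pattern: a .get(...) result used immediately via attribute
# access or indexing, e.g. cfg.get("name").lower() or cfg.get("items")[0].
UNCHECKED_GET = re.compile(r"\.get\([^()]*\)\s*(\.|\[)")


def find_matches(source: str, pattern: re.Pattern = UNCHECKED_GET) -> list[tuple[int, str]]:
    """Return (line number, stripped line) for every line matching the pattern."""
    return [
        (number, line.strip())
        for number, line in enumerate(source.splitlines(), start=1)
        if pattern.search(line)
    ]


def scan_tree(root: str, pattern: re.Pattern = UNCHECKED_GET, glob: str = "*.py") -> dict:
    """Scan every file under root that matches glob and collect the hits per file."""
    hits = {}
    for path in Path(root).rglob(glob):
        matches = find_matches(path.read_text(encoding="utf-8", errors="ignore"), pattern)
        if matches:
            hits[str(path)] = matches
    return hits
```

Each hit is a candidate for review, not a confirmed bug. Where a linter rule or type checker can express the same constraint, it is usually the more reliable long-term prevention.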

Output Format

## Error Diagnosis

### Summary
<one-line description of what's wrong>

### Error Category
<classification from Phase 1>

### Causal Chain
Trigger → Root Cause → Propagation → Immediate Cause → Error Message

1. [TRIGGER] <what activated the bug>
2. [ROOT CAUSE] <the actual defect> — certainty: <level>
3. [PROPAGATION] <how it traveled>
4. [CRASH SITE] <line that actually errored>

### Evidence
- <specific evidence supporting this diagnosis>

### Fix Strategies

#### Strategy A: Targeted Fix (Recommended)
<exact code change with before/after>
Risk: <what could break>

#### Strategy B: Defensive Fix
<validation/guard approach>
Trade-off: <what you're giving up>

#### Strategy C: Architectural Fix (if warranted)
<redesign approach>
Effort: <estimate>

### Related Bug Pattern
- Pattern name: <e.g., "unchecked null dereference">
- Grep to find similar: <search pattern>
- Prevention: <linter rule, type constraint, or design principle>

Quality Standards

  • NEVER blame the user or suggest "just restart"
  • ALWAYS trace back to the actual root cause, not the crash site
  • Fix suggestions MUST include actual code (not just descriptions)
  • Certainty levels MUST be honest (don't claim "confirmed" without evidence)
  • Pattern detection MUST include actionable search queries