SparkLogs MCP Demo Transcript
SparkLogs gives your AI agents the logs, system health data, and query tools to identify the root cause in minutes. MSP service desks are buried in hard tickets whose evidence is scattered across client endpoints, servers, cloud systems, and old logs. Today, engineers collect that context by hand, and AI agents lack effective access to the operational data they need. SparkLogs fills the gap: client systems ship signals to the SparkLogs cloud, then AI agents and human engineers query that data to solve problems in minutes instead of hours. This transcript shows that workflow end to end.
This is a full transcript of a real Claude Code session using SparkLogs. The AI agent starts with a vague laptop report, uses SparkLogs MCP to find storage and CBS evidence, ranks root-cause theories, and ends with an action plan.
▐▛███▜▌ Claude Code v2.1.169
▝▜█████▛▘ Opus 4.8 (1M context) · Claude Max
▘▘ ▝▝ ~/sparklogs-demo
Chapter 1: Vague Report
From vague issue to scoped investigation: The user gives their AI agent a helpdesk-style report. SparkLogs MCP lets the AI agent discover scope, sources, patterns, and evidence.
❯hey SparkLogs, a client is reporting random strange issues on their "elegant dolphin" laptop. it started maybe a month ago. getting worse. check it out
⏺I'll investigate the "elegant dolphin" laptop using the SparkLogs investigation skill.
⏺I have the tools and references. Starting the investigation. Step 1: resolve scope.
Chapter 2: Scope Discovery
The AI agent validates telemetry coverage first: Before diagnosing, the AI agent checks whether the device exists, which organization owns it, whether it is still reporting, and how much telemetry is available.
⏺Scope: Random strange issues on "elegant dolphin" laptop, started ~1 month ago, getting worse. Today is 2026-06-08. I'll investigate the last ~31 days in UTC. Let me confirm the org and the agent.
SparkLogs: Agent found
Host: ElegantDolphin
Organization: Corporate IT
Agent status: reporting normally
Last seen: 2026-06-09T04:20:57Z
⏺ElegantDolphin agent exists in Corporate IT, last seen 2026-06-09T04:20:57Z, so it's reporting normally. Now let me see what sources have data over the last month.
SparkLogs: Monthly source volume
Window: 2026-05-08 to 2026-06-09
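The window above matches the scope the agent stated: roughly 31 days back from 2026-06-08, in UTC. A minimal sketch of that date arithmetic follows. The function name and the exclusive end bound (the day after "today") are illustrative assumptions, not documented SparkLogs behavior.

```python
from datetime import date, timedelta

def investigation_window(today: date, lookback_days: int = 31) -> tuple[date, date]:
    """Return (start, end) dates for a lookback window.

    Assumption: the end bound is exclusive and set to the day after
    `today`, so that all of today's events fall inside the window.
    """
    start = today - timedelta(days=lookback_days)
    end = today + timedelta(days=1)
    return start, end

start, end = investigation_window(date(2026, 6, 8))
print(start.isoformat(), "to", end.isoformat())  # 2026-05-08 to 2026-06-09
```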
Chapter 3: Signal Emerges
Patterns separate signal from noise: SparkLogs analyzes and clusters related logs into patterns. In this example, it enabled the AI agent to see disk errors and CBS corruption while treating high-volume DCOM permission noise as background.
⏺Single source, 65,420 events over the month. Now let me see the shape of the problem: top patterns at warning+ severity over the full window.
SparkLogs: Top warning/error patterns
| Pattern | Hash | Hits | Severity | Assessment |
|---|---|---|---|---|
| The application-specific permission settings do not grant Local Activation permission | Tapsdn_458tl | 3,017 | warn | High volume, likely background noise |
| CBS Payload Corrupt <path> | wscNpC_104to | 1,662 | warn | Component store corruption |
| The driver detected a controller error on \Device\<var>\<var> | Tddace_213ey | 1,456 | error | Primary storage signal |
| Unable to repair payload file from backups directory | wscNuC_258jz | 946 | warn | Repair fallback signal |
⏺Clear storage-subsystem signal emerging: disk controller errors (Tddace_213ey, 1456 hits, sev 17), heavy CBS/CSI component-store corruption, plus IO-latency events. The COM/DCOM permission noise (sev 13) is background.
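One way to picture why the DCOM pattern is treated as background even though it has the most hits is to rank by severity before volume. The sketch below uses the hit counts from the table; severities 17 and 13 are the values quoted in the transcript for the controller and DCOM patterns, while 13 for the two CBS rows is an assumption based on their "warn" label. The ranking rule itself is illustrative and is not claimed to be how SparkLogs orders patterns.

```python
from typing import NamedTuple

class Pattern(NamedTuple):
    pattern_hash: str
    hits: int
    severity: int  # numeric severity; higher is worse

patterns = [
    Pattern("Tapsdn_458tl", 3017, 13),  # DCOM Local Activation permission
    Pattern("wscNpC_104to", 1662, 13),  # CBS Payload Corrupt (severity assumed)
    Pattern("Tddace_213ey", 1456, 17),  # disk controller error
    Pattern("wscNuC_258jz", 946, 13),   # unable to repair payload (severity assumed)
]

# Ranking by raw volume puts the noisy DCOM pattern first.
by_volume = sorted(patterns, key=lambda p: p.hits, reverse=True)

# Ranking by severity, then volume, surfaces the controller errors instead.
by_severity = sorted(patterns, key=lambda p: (p.severity, p.hits), reverse=True)

print("Top by volume:  ", by_volume[0].pattern_hash)
print("Top by severity:", by_severity[0].pattern_hash)
```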
Chapter 4: Trend And Onset
AI tests the timeline: The AI agent checks what started first, what came later, and whether the client's 'getting worse' report matches the data.
⏺To map the actual trend and confirm "getting worse" let me bucket the storage signals week-by-week.
SparkLogs: Weekly pattern contrast
| Window | Dominant evidence | Assessment |
|---|---|---|
| May 8 to May 15 | 366 disk controller errors | Failure onset |
| May 15 to May 22 | 379 disk controller errors and CBS corruption begins | Corruption follows storage errors |
| May 22 to May 29 | CBS payload corruption and repair failures spike | Downstream component-store damage |
| May 29 to Jun 5 | DCOM/app noise dominates, disk errors continue | Symptoms continue after corruption |
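The week-by-week bucketing in the table can be sketched as follows, assuming 7-day windows anchored at 2026-05-08 UTC as the table rows suggest. The event timestamps are hypothetical and only illustrate the grouping; the real counts come from SparkLogs queries.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone

WINDOW_START = datetime(2026, 5, 8, tzinfo=timezone.utc)
WEEK = timedelta(days=7)

def week_bucket(ts: datetime, start: datetime = WINDOW_START) -> tuple[datetime, datetime]:
    """Return the half-open [start, end) 7-day window that contains ts."""
    index = (ts - start) // WEEK  # timedelta // timedelta yields an int
    bucket_start = start + index * WEEK
    return bucket_start, bucket_start + WEEK

# Hypothetical event timestamps, for illustration only.
events = [
    datetime(2026, 5, 9, 3, 15, tzinfo=timezone.utc),
    datetime(2026, 5, 14, 22, 0, tzinfo=timezone.utc),
    datetime(2026, 5, 20, 8, 30, tzinfo=timezone.utc),
]

# Count events per week, keyed by the week's start date.
counts = Counter(week_bucket(ts)[0].date().isoformat() for ts in events)
for week_start, n in sorted(counts.items()):
    print(week_start, n)
```

Half-open windows keep an event that lands exactly on May 15 00:00 UTC in the second week only, so no event is counted twice.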
⏺Now the trend is clear. Let me pull raw events to nail down the device, onset timing, and confirm the IO-latency signal.
SparkLogs: Example of NTFS IO-stall event (1 of 14)
An IO took more than 30 seconds to complete:
Process name: System
File name: C:\Users\Demo\AppData\Roaming\discord\Cache\Cache_Data\data_0
IO Type: Write: Paging, NonCached, Async
Latency: 818256 ms
Volume name: C:
Is boot volume: true
Device model: KXG8AZNV1T02 LA KIOXIA
Device type: NVMe
SparkLogs: Hardware evidence summary
NTFS IO stalls: 14 events
Latency range: 13.6 to 73.6 minutes
Affected volume: C: boot volume
Affected apps: Discord, Slack, Outlook, Adobe, registry writes
Disk controller errors: 1,456 events on \Device\Harddisk2
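The lower end of the latency range can be checked against the sample event: 818256 ms is about 13.6 minutes. A small sketch of that conversion, also expressing the stall as a multiple of the 30-second threshold named in the event text (the helper name is hypothetical):

```python
STALL_THRESHOLD_MS = 30_000  # the 30-second threshold from the event text

def describe_stall(latency_ms: int) -> tuple[float, float]:
    """Return (latency in minutes, multiple of the 30-second threshold)."""
    minutes = latency_ms / 60_000
    multiple = latency_ms / STALL_THRESHOLD_MS
    return round(minutes, 1), round(multiple, 1)

minutes, multiple = describe_stall(818_256)
print(f"{minutes} min, {multiple}x the threshold")  # 13.6 min, 27.3x the threshold
```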
⏺I have strong, concrete evidence now. One more check: confirm whether the data and controller errors actually begin around May 8, or if there's earlier history, so I don't misstate onset.
⏺Investigation complete. Evidence is consistent and concrete. Writing the summary.
ElegantDolphin storage subsystem degradation with cascading component-store corruption and multi-minute IO stalls
Executive summary
ElegantDolphin's storage subsystem shows sustained faults that began around 2026-05-08 and produced cascading downstream failures through early June. Two physical storage signals dominate: 1,456 disk controller errors (Event ID 11) on \Device\Harddisk2 from 2026-05-08 onward, and 14 extreme NTFS IO-completion stalls (Event ID 147) on the boot volume C: with single-IO latencies ranging from 13.6 minutes to 73.6 minutes.
Correlation caveat
These faults correlate in time with mass Windows component-store corruption and downstream COM/DCOM activation failures. The relationship between disk faults and corruption/app failures is correlational and not proven from logs alone.
Finding 1: boot NVMe IO stalls
The boot NVMe SSD on C: logged 14 IO operations exceeding the 30-second threshold, with actual latencies of 13.6 to 73.6 minutes. This hit user-facing apps and registry writes.
Finding 2: persistent controller errors
A second disk emitted 1,456 controller errors beginning 2026-05-08. No such errors appeared in the prior baseline.
Finding 3: component-store corruption
CBS persistent logs show payload corruption, hash mismatches, and unable-to-repair payloads across thousands of component-store files.
Finding 4: downstream app and COM failures
COM/DCOM activation and service-start failures surged after the corruption snapshot. This is likely downstream, but causal linkage is not proven.
Finding 5: corroborating storage signals
Disk identifier collisions, driver load failures, shadow-copy aborts, and update install failures add weaker supporting evidence.
Limits
No SMART attributes, NVMe wear, temperature, or vendor diagnostics are present in these logs. SparkLogs can infer likely hardware trouble, but host-side checks must confirm it.
⏺Want me to run /sparklogs-analyze-cause on these findings?
Chapter 7: Root-Cause Analysis
Ranked theories, not magic certainty: SparkLogs gives AI agents enough evidence to rank theories, name confirm/refute checks, and explain what remains uncertain.
ElegantDolphin storage faults driving component-store corruption and app failures
Hypothesis 1: boot NVMe SSD failing
The Kioxia boot NVMe may be failing, producing minute-scale IO stalls and on-disk corruption on C:. Confirm with NVMe SMART/health for critical warning bits, available spare, media/data integrity errors, or high percentage used.
Hypothesis 2: secondary disk, controller, or cabling fault
\Device\Harddisk2 may be a separate failing disk, controller, cable, or enclosure. Identify the physical device and inspect SMART, connection, and disk signature.
Hypothesis 3: corruption now self-sustains app failures
Windows component-store and NTFS corruption may now be sustaining COM/DCOM and app failures independent of live disk latency.
What is uncertain
No SMART, NVMe wear, or temperature data exists in the logs. The link from disk faults to the COM/DCOM spike is correlational, not proven.
⏺Recommended next steps: pull NVMe SMART, identify and inspect \Device\Harddisk2, then use SparkLogs follow-up investigations to test whether symptoms correlate with corrupt components and IO stalls.
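The NVMe SMART check named in Hypothesis 1 could be triaged with a sketch like the one below. It assumes the health log has already been collected on the host as a dictionary, for example from the `nvme_smart_health_information_log` object that smartmontools prints with `smartctl -j -a <device>`. The key names follow that output as best understood here and should be verified against the tool version in use. The sample values and the 90% wear limit are hypothetical, not readings from ElegantDolphin.

```python
def nvme_health_flags(log: dict, wear_limit_pct: int = 90) -> list[str]:
    """Return human-readable warnings for an NVMe SMART/health log.

    Assumptions: key names mirror smartctl's JSON output, and
    wear_limit_pct is an illustrative cutoff, not a vendor threshold.
    """
    flags = []
    if log.get("critical_warning", 0) != 0:
        flags.append("critical warning bits set")
    if log.get("available_spare", 100) < log.get("available_spare_threshold", 0):
        flags.append("available spare below threshold")
    if log.get("media_errors", 0) > 0:
        flags.append("media/data integrity errors recorded")
    if log.get("percentage_used", 0) >= wear_limit_pct:
        flags.append("high percentage used")
    return flags

# Hypothetical health log, for illustration only.
sample = {
    "critical_warning": 4,
    "available_spare": 100,
    "available_spare_threshold": 10,
    "percentage_used": 3,
    "media_errors": 12,
}
print(nvme_health_flags(sample))
```

An empty list does not clear the drive. It only means these four attributes look normal, so the controller errors on \Device\Harddisk2 still need their own inspection.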
Chapter 8: Windows Update Follow-Up
Answer the question, then steer back to cause: The AI agent checks the user's Windows Update concern, corrects the scope, and explains why these failures are downstream of storage trouble.
❯were all of the windows update problems resolved?
⏺Good question. That refers to the update-install failures in Finding 5. Let me pull those events to see which updates failed and check whether the same updates later succeeded.
SparkLogs: Update scope correction
User question: Were all Windows Update problems resolved?
Correction: These were Microsoft Store app auto-updates, not OS or security patches.
Error code: 0x80073D02, AppX package in use
Patch status: Cumulative Security Update (KB5089549), Defender platform update, and MSRT installed successfully.
SparkLogs: Status of Microsoft Store app updates
✔ FIXED
✔ Spotify
✔ Clipchamp
✔ GamingServices
✔ GamingApp
✔ MicrosoftOfficeHub
⚠ NOT YET FIXED
− Microsoft.WindowsNotepad
− DolbyAccess
− Phone Link
⏺The recurrence pattern fits the storage story. 0x80073D02 is normally a benign retry-and-succeed condition, but here several apps needed many retries across weeks, and Phone Link went from working to failed. These are a downstream symptom, not a separate root cause.
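The fixed versus not-yet-fixed split depends on each app's most recent install attempt, not on whether it ever succeeded. That is why an app that went from working to failed counts as unresolved. A minimal sketch of that rule, using hypothetical attempt records rather than the actual events from this session:

```python
from datetime import datetime

# Hypothetical install attempts: (package, ISO timestamp, succeeded).
attempts = [
    ("Spotify", "2026-05-12T10:00:00", False),
    ("Spotify", "2026-05-19T10:00:00", True),
    ("DolbyAccess", "2026-05-12T10:05:00", False),
    ("DolbyAccess", "2026-06-02T09:30:00", False),
    ("Phone Link", "2026-05-10T08:00:00", True),
    ("Phone Link", "2026-06-01T08:00:00", False),
]

def latest_status(records):
    """Map each package to the outcome of its most recent attempt."""
    latest = {}
    for package, ts, ok in records:
        when = datetime.fromisoformat(ts)
        if package not in latest or when > latest[package][0]:
            latest[package] = (when, ok)
    return {package: ok for package, (_, ok) in latest.items()}

status = latest_status(attempts)
fixed = sorted(p for p, ok in status.items() if ok)
not_fixed = sorted(p for p, ok in status.items() if not ok)
print("Fixed:", fixed)
print("Not yet fixed:", not_fixed)
```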
Chapter 9: Action Plan
Evidence becomes action: The final answer is operational triage: protect data, validate hardware, repair only after backup, and use SparkLogs to verify the fix.
⏺You're past what the logs can tell you. The evidence points one direction: failing storage on this laptop. The next steps are host-side actions the client needs to take, in priority order.
1. Back up the laptop now
A boot drive logging 13 to 73 minute IO completions and component-store corruption can fail hard with little warning. Get the user's data off first. Everything below can wait until this is done; this can't.
2. Confirm the drive diagnosis
Pull NVMe SMART for the Kioxia boot drive. Check Critical Warning, Available Spare, Media/Data Integrity Errors, and Percentage Used. Identify \Device\Harddisk2 and inspect its SMART, cable, port, or enclosure.
3. Attempt repair only after backup
Run chkdsk C: /scan, sfc /scannow, and DISM /Online /Cleanup-Image /RestoreHealth only after backup and disk stability.
4. Decide on replacement
If SMART is bad or repairs do not stick, replace the NVMe or reimage onto a new drive. Do not chase COM/DCOM or Store app update failures individually; they are downstream symptoms.
⏺Once a drive is replaced, re-run the investigation after a few weeks to verify the controller errors, corruption, and stalls actually stop. That's the real confirmation the fix worked.