From 5 Clicks to Zero Guesswork: Building a Jenkins Dashboard with an AI Coding Agent

Dmytro Protsyk
QA Engineer

Every couple of weeks my morning starts with coffee and the warm, soothing glow of red Jenkins pipelines. On our dedicated QA Automation team, two engineers rotate weekly to triage failed Playwright tests across Jenkins builds. The work isn't exciting, but it's important and has to be done.
We use the Build Monitor Jenkins plugin, which gives us a single view of the pipelines that run Playwright test stages. But:
- It doesn't show which stage a build failed on: was it Playwright? Unit tests? A static analysis check? You have to check each one with your 👀.
- Getting from the dashboard to the actual Playwright report takes up to five clicks: open the pipeline → select the failed build → open the HTML attachment with links to the external Playwright report → open the link's context menu → open in a new tab. Unfortunately, due to our Jenkins security configuration we can't display HTML files with JavaScript inside Jenkins.
- New pipelines had to be added to the dashboard manually, and more than once we forgot to add one.
None of these is a catastrophic problem – but together they add unnecessary friction.
Then one of my colleagues dropped a link in Slack…
The link was to Superpowers – a workflow for Claude Code that extends it with subagent orchestration, a built-in browser-based design preview, and a set of "skills" the agent can draw on. I'd been curious about unattended agent workflows for a while, and creating a dashboard felt like the right kind of problem to try to solve with Superpowers.
After a quick plugin install (/plugin install superpowers@superpowers-marketplace) I was good to go. Everything starts with a brainstorming session where you describe a general idea and iterate on it interactively until you have a robust spec. During that stage Claude also explores the current state of the project (in my case, an empty folder) and runs a few spec-review subagents.
The second stage is planning, where you dig deeper into the technical details and implementation choices. The result is a detailed implementation plan split into 15–20 tasks, with dependent tasks linked to each other.
And finally, implementation. There are two modes: subagent-driven, or main-session execution with human checkpoints. I went with the subagent approach. Claude spins up subagents – one per task. Each agent implements the changes described in its task, adds tests, performs a self-review, and makes a commit. For more complex tasks Claude triggers a separate review subagent that checks whether the task was implemented according to the spec.
It didn't work out of the box – pipeline paths were generated incorrectly and some environment variables were used wrong. But I still had plenty of context left, so I was able to debug the issues with Claude and get to a first working version.

In the end I had a simple dashboard where:
- All pipelines are discovered automatically (using the gh CLI to scan org repos for Jenkinsfiles containing a Playwright stage – no more manual additions)
- Cards are split into three categories: red (Playwright failures), yellow (failures in other stages), and green (all passing)
- Direct links to Playwright reports right on the card
- Auto-refresh every 30 seconds
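The discovery step can be sketched roughly like this. This is my own illustration, not the project's actual code: the `gh search code` invocation, the org name, and the function names are all assumptions, and the CLI call is kept separate from the parsing so the latter can be exercised without GitHub access.

```typescript
import { execSync } from "node:child_process";

// Sketch of the auto-discovery step, assuming the gh CLI is installed and
// authenticated. The org name passed in is a placeholder.
function findPlaywrightPipelines(org: string): string[] {
  // gh search code matches file contents; --filename narrows it to
  // Jenkinsfiles, so we get only repos whose pipeline mentions Playwright.
  const raw = execSync(
    `gh search code "playwright" --filename Jenkinsfile --owner ${org} ` +
      `--json repository --jq "[.[].repository.nameWithOwner]"`,
    { encoding: "utf8" }
  );
  return dedupeRepos(JSON.parse(raw));
}

// Split out so the post-processing is testable without calling gh:
// one repo can match several times (multiple Jenkinsfiles).
function dedupeRepos(repos: string[]): string[] {
  return [...new Set(repos)].sort();
}
```

Running this on a cron or at dashboard sync time is what removes the "forgot to add the new pipeline" failure mode.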
Under the hood: SQLite for storage and sync, Express as the backend, React + Tailwind on the front end, and Drizzle ORM.
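The red/yellow/green split boils down to one decision per build. A minimal sketch, where the `Stage` shape and the stage-name match are my guesses rather than the dashboard's real data model:

```typescript
// Guessed stage shape; the real dashboard reads these from the Jenkins API.
type Stage = { name: string; status: "SUCCESS" | "FAILED" | "SKIPPED" };
type CardColor = "red" | "yellow" | "green";

function cardColor(stages: Stage[]): CardColor {
  const failed = stages.filter((s) => s.status === "FAILED");
  if (failed.length === 0) return "green"; // everything passed
  // Red is reserved for Playwright failures; any other failed stage
  // (unit tests, static analysis, ...) turns the card yellow.
  return failed.some((s) => /playwright/i.test(s.name)) ? "red" : "yellow";
}
```

The point of pushing this logic server-side is that the triage question ("is this a Playwright failure or something upstream?") is answered before anyone opens the card.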
The second session was about polishing. I had a list of improvements in mind:
- Add a link to the GitHub repo
- Show the statuses of the five previous builds on each card
- An extended card view with a breakdown of all stages
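The five-build history is just a fixed window over recent results. A hypothetical helper (the `BuildResult` type, glyphs, and function name are mine, not the dashboard's):

```typescript
type BuildResult = "SUCCESS" | "FAILURE" | "ABORTED";

// Newest-first build results -> compact status strip for the card.
// The glyphs are a presentation choice for this sketch.
function historyStrip(results: BuildResult[], n = 5): string {
  const glyph: Record<BuildResult, string> = {
    SUCCESS: "✓",
    FAILURE: "✗",
    ABORTED: "·",
  };
  return results.slice(0, n).map((r) => glyph[r]).join(" ");
}
```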
This is where another Superpowers feature earned its keep: the built-in browser design preview. The agent generates a few mockups, you react to them, and iterating toward a final design becomes genuinely fast. Getting UI feedback loops down from "edit → rebuild → check" to something closer to interactive felt like a meaningful quality-of-life improvement.

And yeah I’m one of those weirdos who prefer light mode 😅
The Part I Didn't Expect…
The main implementation session stayed lean throughout. Normally, even just setting up a project can eat half your available context window. With this orchestration approach, that overhead gets pushed down to the subagents. The primary session finished with context to spare.
What stuck with me most from the first session wasn't the dashboard itself — it was the realization that I could step away and trust the process. No side-questing into unrelated refactors, no confident "✅ done" messages that turned out to mean "I've started thinking about it." The subagent review layer seems to be doing real work in keeping the output honest.
For a QA engineer, that's the bar that matters. You want a system that fails loudly and visibly when something's wrong — not one that papers over problems and reports green.
The agent finally built one of those. For itself.