The Problem

Most Playwright reporting solutions give you pass/fail counts and maybe some screenshots. When tests fail, teams spend hours digging through logs, trying to understand what went wrong and whether it's a real bug or test infrastructure noise.

Meanwhile, BDD-style test documentation often lives in separate tools, disconnected from actual test execution. The gap between "what we say we test" and "what actually runs" grows wider over time.

The Solution

playwright-spec-doc-reporter bridges this gap by generating rich, interactive HTML dashboards that combine:

  • BDD Annotations: Extract Given/When/Then steps directly from test code comments, creating living documentation that's always in sync with execution.
  • AI Failure Analysis: When tests fail, the reporter sends context to an LLM (Claude or GPT-4) to generate human-readable explanations of what went wrong.
  • Interactive Dashboards: Filter by suite, status, tags. Drill into individual tests. See trends over time.

Installation

npm install playwright-spec-doc-reporter

Add to your playwright.config.ts:

import { defineConfig } from '@playwright/test';

export default defineConfig({
  reporter: [
    ['playwright-spec-doc-reporter', {
      outputFolder: './test-results',
      aiAnalysis: true,
      aiProvider: 'anthropic' // or 'openai'
    }]
  ]
});

Key Features

BDD Hierarchy Extraction

Write your tests with structured comments, and the reporter extracts them into navigable documentation:

test('user can complete checkout', async ({ page }) => {
  // @feature: Checkout Flow
  // @scenario: Complete purchase with credit card
  // @given: User has items in cart
  // @when: User proceeds to checkout
  // @then: Order confirmation is displayed
  
  await page.goto('/cart');
  // ... test implementation
});
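Under the hood, extracting these annotations amounts to scanning test source for @key: value comments. A minimal sketch of how such a scan could work; the function name and return shape here are illustrative, not the reporter's actual internals:

```typescript
// Sketch of @key: value comment extraction -- illustrative only,
// not the reporter's real implementation.
interface BddAnnotations {
  feature?: string;
  scenario?: string;
  given: string[];
  when: string[];
  then: string[];
}

function extractAnnotations(source: string): BddAnnotations {
  const result: BddAnnotations = { given: [], when: [], then: [] };
  // Match lines like: // @given: User has items in cart
  const pattern = /\/\/\s*@(feature|scenario|given|when|then):\s*(.+)/g;
  for (const match of source.matchAll(pattern)) {
    const [, key, value] = match;
    if (key === 'feature' || key === 'scenario') {
      result[key] = value.trim();
    } else {
      result[key as 'given' | 'when' | 'then'].push(value.trim());
    }
  }
  return result;
}
```

Because the steps live in the test file itself, renaming or deleting a test takes its documentation with it, which is what keeps the docs "living."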

AI Failure Analysis

When a test fails, instead of just a raw TimeoutError: locator.click: Timeout 30000ms exceeded, you get:

"The checkout button wasn't clickable because a promotional modal appeared and blocked the element. This is likely a race condition where the modal loads after the page but before the click action. Consider adding a wait for the modal to dismiss or handling it explicitly in the test setup."
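An analysis like this is only as good as the context sent to the model. A rough sketch of how a failure payload could be assembled into a prompt; the field names and prompt wording are assumptions for illustration, not the reporter's actual format:

```typescript
// Sketch: build an LLM prompt from a failed test's context.
// The FailureContext shape and prompt text are illustrative assumptions.
interface FailureContext {
  testTitle: string;
  errorMessage: string;
  failingStep?: string;
  stdout?: string;
}

function buildAnalysisPrompt(ctx: FailureContext): string {
  const lines = [
    `Test: ${ctx.testTitle}`,
    `Error: ${ctx.errorMessage}`,
  ];
  if (ctx.failingStep) lines.push(`Failing step: ${ctx.failingStep}`);
  if (ctx.stdout) lines.push(`Console output:\n${ctx.stdout}`);
  lines.push(
    'Explain the likely root cause in plain language and suggest a fix. ' +
    'Distinguish product bugs from test-infrastructure issues.'
  );
  return lines.join('\n');
}
```

The same prompt string can then be sent to whichever provider is configured (anthropic or openai), keeping the provider-specific API call isolated from context assembly.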

Screenshot Strategies

Configure when screenshots are captured: on failure only, on every step, or at custom checkpoints. All screenshots are embedded directly in the HTML report.
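Such a strategy would be selected in the reporter options alongside the settings shown earlier. The option name and values below are illustrative assumptions, not confirmed API; check the package's own documentation for the actual keys:

```typescript
import { defineConfig } from '@playwright/test';

export default defineConfig({
  reporter: [
    ['playwright-spec-doc-reporter', {
      outputFolder: './test-results',
      // Hypothetical option -- name and values are assumptions,
      // confirm against the package docs before using.
      screenshotStrategy: 'on-failure', // or 'every-step' | 'checkpoints'
    }]
  ]
});
```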

Roadmap

  • Flakiness Scoring: Track test stability over time and flag unreliable tests
  • Tag Analytics: Understand coverage by feature area, priority, or custom tags
  • Slack/Teams Webhooks: Push failure summaries to team channels
  • Test Ownership: API integration for assigning tests to teams or individuals
  • DORA Metrics: Tie test results to deployment frequency and change failure rate

Contributing

This project was "vibe coded" and is very much a work in progress. Contributions welcome. If you're using it in production, I'd love to hear about your setup.

View on GitHub →