
How to Debug Failed Tests in Azure DevOps Pipelines

A systematic guide to debugging failed tests in Azure DevOps pipelines. Learn to diagnose environment issues, flaky tests, authentication failures, and configuration problems using pipeline logs, artifacts, and tracing tools.

InnovateBits · 5 min read

A pipeline failure at 2 AM blocks the morning deployment. Effective debugging is the skill that separates a QA engineer who resolves issues quickly from one who raises a ticket and waits. This guide gives you a systematic debugging process for pipeline test failures.


The debugging decision tree

Test fails in pipeline
         │
         ▼
Does it fail locally?
    │           │
   YES          NO
    │           │
    ▼           ▼
Code bug    Environment difference?
            │           │
          YES           NO
            │           │
            ▼           ▼
       Config/creds   Timing/resources
       difference     issue (flakiness)

Step 1: Read the failure message

Go to Pipelines → [Run] → [Job] → [Failed step].

Each step shows its full stdout/stderr. The failure message is usually near the bottom:

Error: expect(received).toBe(expected)
Expected: "Welcome, Alice"
Received: "Please sign in"

at LoginTest (tests/auth.spec.ts:34:5)

This tells you the test landed on the sign-in page instead of a logged-in state: login works locally, but the test user's credentials don't work in the pipeline environment.


Step 2: Classify the failure type

Symptom                              | Likely cause
-------------------------------------|--------------------------------------------------------
Cannot connect to... / ECONNREFUSED  | Wrong URL, environment down, VPN required
401 Unauthorized / 403 Forbidden     | Wrong credentials, expired token
Timeout of 30000ms exceeded          | Slow environment, element not appearing, race condition
Element not found / not visible      | Selector changed, feature flag off, race condition
Expected X, received Y               | Data state differs from expected, logic bug
Cannot read properties of undefined  | API returned different structure than expected
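This lookup can double as an automated first-pass triage step. Below is a sketch that maps a raw failure message to the table's likely-cause column; the regexes and the `triage` helper are illustrative, not part of any tool:

```typescript
// First-pass triage: match a failure message against the symptom table.
const SYMPTOM_RULES: Array<[RegExp, string]> = [
  [/ECONNREFUSED|Cannot connect/i, 'Wrong URL, environment down, VPN required'],
  [/401 Unauthorized|403 Forbidden/i, 'Wrong credentials, expired token'],
  [/Timeout of \d+ms exceeded/i, 'Slow environment, element not appearing, race condition'],
  [/not found|not visible/i, 'Selector changed, feature flag off, race condition'],
  [/Cannot read propert/i, 'API returned different structure than expected'],
  [/Expected[\s\S]*Received/i, 'Data state differs from expected, logic bug'],
];

function triage(message: string): string {
  for (const [pattern, likelyCause] of SYMPTOM_RULES) {
    if (pattern.test(message)) return likelyCause;
  }
  return 'Unclassified: read the full log';
}
```

Running the failing step's last error line through something like this turns a wall of red logs into a sorted to-do list.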

Step 3: Compare local vs pipeline environment

Most pipeline failures come from environment differences. Checklist:

☐ Is BASE_URL set correctly in the pipeline? (check the variable)
☐ Is the test database in the expected state?
☐ Are test user accounts created and active in staging?
☐ Is the feature being tested deployed to staging?
☐ Is there a feature flag that's off in staging but on locally?
☐ Does staging have a different config than local (e.g., different timeout)?
☐ Are SSL certificates valid on staging? (--insecure flag may be needed)
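The first checklist item can be enforced automatically: fail fast with a clear message when a required pipeline variable is missing, instead of letting a test die later on a cryptic error. A minimal sketch, assuming you call it from a global setup step (the `assertEnv` helper is hypothetical):

```typescript
// Fail fast if required pipeline variables are missing or empty.
function assertEnv(required: string[], env: Record<string, string | undefined>): void {
  const missing = required.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing pipeline variables: ${missing.join(', ')}`);
  }
}

// In a global setup step:
// assertEnv(['BASE_URL', 'TEST_EMAIL', 'TEST_PASSWORD'], process.env)
```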

Step 4: Add diagnostic logging

When the error message isn't clear, add temporary diagnostic steps:

# Add before the failing test step
- script: |
    echo "=== Environment Debug ==="
    echo "BASE_URL: $(BASE_URL)"
    echo "Node version: $(node --version)"
    echo "NPM version: $(npm --version)"
    curl -v $(BASE_URL)/health || echo "Health check failed"
  displayName: Debug environment

For Playwright, enable verbose tracing:

// playwright.config.ts — enable for debugging
use: {
  trace: 'on',           // Capture for every test (expensive but thorough)
  screenshot: 'on',
  video: 'on',
}

Download the trace artifact and open it locally:

npx playwright show-trace trace.zip

Step 5: Identify flaky tests

Flaky tests fail intermittently without code changes. Signs of flakiness:

  • Test fails in pipeline, passes when you re-run without code changes
  • Test fails on one shard but passes on others
  • Test fails at night (scheduled run) but passes in PR pipeline

Quarantine flaky tests immediately — they destroy trust in the suite:

// Mark as flaky while investigating
test.fixme('TC-204: Wishlist limit — needs investigation', async ({ page }) => {
  // ...
})

# Pipeline: add --retries to catch flakiness
- script: npx playwright test --retries=3

Track retry statistics to identify patterns. Tests that need 3 retries every run have a systemic issue (race condition, timing dependency).
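One way to collect those statistics is to parse Playwright's JSON report (`npx playwright test --reporter=json`). The sketch below assumes the report's nested suites → specs → tests → results shape, where each entry in a test's `results` array beyond the first is a retry:

```typescript
// Minimal shapes for the parts of the Playwright JSON report we read.
interface PwResult { status: string }
interface PwTest { results: PwResult[] }
interface PwSpec { title: string; tests: PwTest[] }
interface PwSuite { suites?: PwSuite[]; specs?: PwSpec[] }

// Walk the report and count retries per spec title.
function retryCounts(suite: PwSuite, acc: Map<string, number> = new Map()): Map<string, number> {
  for (const spec of suite.specs ?? []) {
    const retries = spec.tests.reduce((n, t) => n + Math.max(0, t.results.length - 1), 0);
    if (retries > 0) acc.set(spec.title, retries);
  }
  for (const child of suite.suites ?? []) retryCounts(child, acc);
  return acc;
}
```

Feed it `JSON.parse` of the report file; specs that show up run after run are your systemic offenders.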


Step 6: Debug authentication failures

The most common pipeline-specific failure: tests pass locally because you're already logged in; in CI, the session starts fresh.

// Create a reusable auth state
// setup/auth.ts
import { chromium } from '@playwright/test'
 
async function globalSetup() {
  const browser = await chromium.launch()
  const page = await browser.newPage()
  
  await page.goto(process.env.BASE_URL + '/login')
  await page.fill('[name="email"]', process.env.TEST_EMAIL!)
  await page.fill('[name="password"]', process.env.TEST_PASSWORD!)
  await page.click('[type="submit"]')
  await page.waitForURL('**/dashboard')
  
  // Save auth state
  await page.context().storageState({ path: 'auth-state.json' })
  await browser.close()
}
 
export default globalSetup

// playwright.config.ts
export default defineConfig({
  globalSetup: './setup/auth.ts',
  use: {
    storageState: 'auth-state.json', // Reuse in all tests
  },
})

Step 7: Use re-run diagnostics

Azure DevOps shows run history per test case:

  1. Go to pipeline run → Tests tab
  2. Click a failed test → History tab
  3. See: how many times this test has failed in the last N runs

A test that fails 1 in 10 runs is flaky. A test that starts failing consistently after a specific commit points to a regression introduced by that commit.


Common errors and fixes

Error: Screenshot not captured for failed tests
Fix: Screenshots are only captured if screenshot: 'only-on-failure' is set in playwright.config.ts AND the test artifacts are published with condition: always().

Error: Trace files are too large to download
Fix: Use trace: 'on-first-retry' instead of trace: 'on'. This captures traces only on the first retry, not for every test.

Error: Tests time out on slow pipeline agents
Fix: Increase timeouts for CI: timeout: process.env.CI ? 60000 : 30000. Microsoft-hosted agents can be slower than local machines, especially for I/O-heavy operations.
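A CI-aware config might look like the sketch below; the values are illustrative, so tune them to your agents:

```typescript
// playwright.config.ts — longer budgets on CI agents, strict ones locally
import { defineConfig } from '@playwright/test'

// TF_BUILD is set on Azure DevOps agents; CI covers most other CI systems
const isCI = !!process.env.CI || !!process.env.TF_BUILD

export default defineConfig({
  timeout: isCI ? 60_000 : 30_000,            // per-test budget
  expect: { timeout: isCI ? 10_000 : 5_000 }, // per-assertion budget
  retries: isCI ? 2 : 0,                      // retry only in CI
})
```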

Error: Can't reproduce pipeline failure locally
Fix: Use Docker to match the pipeline environment: docker run --rm -v $(pwd):/work -w /work mcr.microsoft.com/playwright:v1.45.0-jammy npx playwright test. This uses the exact same browser version as the pipeline.
