How to Debug Failed Tests in Azure DevOps Pipelines
A systematic guide to debugging failed tests in Azure DevOps pipelines. Learn to diagnose environment issues, flaky tests, authentication failures, and configuration problems using pipeline logs, artifacts, and tracing tools.
A pipeline failure at 2 AM blocks the morning deployment. Effective debugging is the skill that separates a QA engineer who resolves issues quickly from one who raises a ticket and waits. This guide gives you a systematic debugging process for pipeline test failures.
The debugging decision tree
Test fails in pipeline
          │
          ▼
Does it fail locally?
    │           │
   YES          NO
    │           │
    ▼           ▼
Code bug    Environment difference?
                │             │
               YES            NO
                │             │
                ▼             ▼
          Config/creds   Timing/resources
          difference     issue (flakiness)
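The tree can also be read as a small function. The sketch below is only an illustration of the branching; the names `TriageInput` and `triage` are hypothetical and not part of Azure DevOps or any test framework.

```typescript
// Hypothetical helper that mirrors the decision tree above.
type TriageInput = {
  failsLocally: boolean;        // does the same test fail on your machine?
  environmentDiffers?: boolean; // only consulted when the test passes locally
};

type TriageResult =
  | "code bug"
  | "config/credentials difference"
  | "timing/resources issue (flakiness)";

function triage(input: TriageInput): TriageResult {
  // Left branch: the failure reproduces locally, so look at the code first.
  if (input.failsLocally) return "code bug";
  // Right branch: passes locally, so decide between environment and timing.
  return input.environmentDiffers
    ? "config/credentials difference"
    : "timing/resources issue (flakiness)";
}
```

For example, `triage({ failsLocally: false, environmentDiffers: true })` lands on the config/credentials branch, which is where Step 3 picks up.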
Step 1: Read the failure message
Go to Pipelines → [Run] → [Job] → [Failed step].
Each step shows its full stdout/stderr. The failure message is usually near the bottom:
Error: expect(received).toBe(expected)
Expected: "Welcome, Alice"
Received: "Please sign in"
at LoginTest (tests/auth.spec.ts:34:5)
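This tells you the test landed on a page that still shows the sign-in prompt, so the login did not succeed in the pipeline. If the same test passes locally, suspect the test user's credentials or session state in the pipeline environment.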
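When you triage many failures, it can help to pull the expected value, received value, and source location out of the log text. The following is a minimal sketch that assumes the log follows the `Expected:` / `Received:` layout shown above; `parseExpectFailure` and `firstGroup` are hypothetical names, not Playwright or Azure DevOps APIs.

```typescript
type ExpectFailure = {
  expected: string;
  received: string;
  location: string | undefined; // file:line:column, if the log contains one
};

// Returns the first capture group of `pattern` in `text`, trimmed, if any.
function firstGroup(text: string, pattern: RegExp): string | undefined {
  const match = pattern.exec(text);
  return match?.[1]?.trim();
}

// Assumes the "Expected: ..." / "Received: ..." layout from the log above.
function parseExpectFailure(log: string): ExpectFailure | null {
  const expected = firstGroup(log, /Expected:[ \t]*(.+)/);
  const received = firstGroup(log, /Received:[ \t]*(.+)/);
  if (expected === undefined || received === undefined) return null;
  // Matches a parenthesised "path:line:column" such as (tests/auth.spec.ts:34:5).
  const location = firstGroup(log, /\(([^()\s]+:\d+:\d+)\)/);
  return { expected, received, location };
}
```

The function returns `null` for logs that are not assertion failures, so connection errors and timeouts fall through to the classification in Step 2.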
Step 2: Classify the failure type
| Symptom | Likely cause |
|---|---|
| Cannot connect to... / ECONNREFUSED | Wrong URL, environment down, VPN required |
| 401 Unauthorized / 403 Forbidden | Wrong credentials, expired token |
| Timeout of 30000ms exceeded | Slow environment, element not appearing, race condition |
| Element not found / not visible | Selector changed, feature flag off, race condition |
| Expected X, received Y | Data state differs from expected, logic bug |
| Cannot read properties of undefined | API returned different structure than expected |
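The table can be turned into a first-pass classifier. This is a rough sketch: the regular expressions are my own approximations of the symptoms listed above, and `classifyFailure` is a hypothetical helper, so treat its answer as a starting hint rather than a diagnosis.

```typescript
type Rule = { pattern: RegExp; likelyCause: string };

// One rule per table row; the generic expected/received rule goes last.
const rules: Rule[] = [
  { pattern: /cannot connect to|ECONNREFUSED/i, likelyCause: "Wrong URL, environment down, VPN required" },
  { pattern: /401 Unauthorized|403 Forbidden/i, likelyCause: "Wrong credentials, expired token" },
  { pattern: /timeout of \d+ms exceeded/i, likelyCause: "Slow environment, element not appearing, race condition" },
  { pattern: /element not found|not visible/i, likelyCause: "Selector changed, feature flag off, race condition" },
  { pattern: /cannot read properties of undefined/i, likelyCause: "API returned different structure than expected" },
  { pattern: /expected[\s\S]*received/i, likelyCause: "Data state differs from expected, logic bug" },
];

// Returns the likely cause of the first matching rule, or "Unclassified".
function classifyFailure(message: string): string {
  for (const rule of rules) {
    if (rule.pattern.test(message)) return rule.likelyCause;
  }
  return "Unclassified";
}
```

Rule order matters here: the specific network and auth patterns are checked before the broad expected/received pattern, so a message that mentions both is attributed to the more specific cause.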
Step 3: Compare local vs pipeline environment
Most pipeline failures come from environment differences. Checklist:
☐ Is BASE_URL set correctly in the pipeline? (check the variable)
☐ Is the test database in the expected state?
☐ Are test user accounts created and active in staging?
☐ Is the feature being tested deployed to staging?
☐ Is there a feature flag that's off in staging but on locally?
☐ Does staging have a different config than local (e.g., different timeout)?
☐ Are SSL certificates valid on staging? (with a self-signed certificate, curl needs --insecure and Playwright needs ignoreHTTPSErrors: true)
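Several items on this checklist come down to comparing configuration values between your machine and the pipeline. Here is a minimal sketch of such a comparison, assuming you have captured both sets of values as plain key/value objects; `diffEnv` is a hypothetical helper, and only `BASE_URL` comes from this guide (any other key names are placeholders).

```typescript
type EnvSnapshot = Record<string, string | undefined>;
type EnvDifference = { key: string; local: string; pipeline: string };

// Compares only the listed keys, so secrets can be left out deliberately.
function diffEnv(local: EnvSnapshot, pipeline: EnvSnapshot, keys: string[]): EnvDifference[] {
  return keys
    .filter((key) => local[key] !== pipeline[key])
    .map((key) => ({
      key,
      local: local[key] ?? "<unset>",
      pipeline: pipeline[key] ?? "<unset>",
    }));
}
```

Pass an explicit list of keys rather than dumping the whole environment: it keeps passwords and tokens out of the pipeline log.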
Step 4: Add diagnostic logging
When the error message isn't clear, add temporary diagnostic steps:
# Add before the failing test step
- script: |
    echo "=== Environment Debug ==="
    echo "BASE_URL: $(BASE_URL)"
    echo "Node version: $(node --version)"
    echo "NPM version: $(npm --version)"
    curl -v $(BASE_URL)/health || echo "Health check failed"
  displayName: Debug environment

For Playwright, enable verbose tracing:
// playwright.config.ts (enable for debugging)
import { defineConfig } from '@playwright/test'

export default defineConfig({
  use: {
    trace: 'on', // Capture for every test (expensive but thorough)
    screenshot: 'on',
    video: 'on',
  },
})

Download the trace artifact and open it locally:
npx playwright show-trace trace.zip

Step 5: Identify flaky tests
Flaky tests fail intermittently without code changes. Signs of flakiness:
- Test fails in pipeline, passes when you re-run without code changes
- Test fails on one shard but passes on others
- Test fails at night (scheduled run) but passes in PR pipeline
Quarantine flaky tests immediately — they destroy trust in the suite:
// Quarantine while investigating: test.fixme marks the test as needing a fix and skips it
import { test } from '@playwright/test'

test.fixme('TC-204: Wishlist limit (needs investigation)', async ({ page }) => {
  // ...
})

# Pipeline: add --retries to catch flakiness
- script: npx playwright test --retries=3

Track retry statistics to identify patterns. A test that needs all 3 retries on every run has a systemic issue (race condition, timing dependency) rather than occasional bad luck.
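One way to track retry statistics is to record, for each run, how many attempts each test needed and then rank tests by how often they required a retry. The sketch below assumes you already collect such records yourself; the `TestResult` shape and the `retryStats` function are hypothetical and do not correspond to a Playwright or Azure DevOps API.

```typescript
// One record per test per pipeline run. attempts = 1 means no retry was needed.
type TestResult = { title: string; attempts: number };

type RetryStats = {
  title: string;
  runs: number;          // number of pipeline runs seen for this test
  runsWithRetry: number; // runs in which at least one retry was needed
  retryRate: number;     // runsWithRetry / runs
};

function retryStats(results: TestResult[]): RetryStats[] {
  const stats: RetryStats[] = [];
  for (const result of results) {
    let entry = stats.find((s) => s.title === result.title);
    if (!entry) {
      entry = { title: result.title, runs: 0, runsWithRetry: 0, retryRate: 0 };
      stats.push(entry);
    }
    entry.runs += 1;
    if (result.attempts > 1) entry.runsWithRetry += 1;
    entry.retryRate = entry.runsWithRetry / entry.runs;
  }
  // Most retry-prone tests first.
  return stats.sort((a, b) => b.retryRate - a.retryRate);
}
```

A test near the top of this list with a retry rate close to 1 is the kind that "needs retries every run" and deserves a root-cause investigation rather than more retries.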
Step 6: Debug authentication failures
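A common pipeline-specific failure: tests pass locally because they reuse a session you are already logged in to; in CI, every run starts with no stored session. Logging in once in a global setup and reusing the saved state avoids this: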
// Create a reusable auth state
// setup/auth.ts
import { chromium } from '@playwright/test'

async function globalSetup() {
  const browser = await chromium.launch()
  const page = await browser.newPage()
  await page.goto(process.env.BASE_URL + '/login')
  await page.fill('[name="email"]', process.env.TEST_EMAIL!)
  await page.fill('[name="password"]', process.env.TEST_PASSWORD!)
  await page.click('[type="submit"]')
  await page.waitForURL('**/dashboard')
  // Save auth state
  await page.context().storageState({ path: 'auth-state.json' })
  await browser.close()
}

export default globalSetup

// playwright.config.ts
import { defineConfig } from '@playwright/test'

export default defineConfig({
  globalSetup: './setup/auth.ts',
  use: {
    storageState: 'auth-state.json', // Reuse in all tests
  },
})

Step 7: Use re-run diagnostics
Azure DevOps shows run history per test case:
- Go to pipeline run → Tests tab
- Click a failed test → History tab
- See: how many times this test has failed in the last N runs
A test that fails about 1 run in 10 is flaky. A test that fails on every run after a specific commit points to a regression introduced by that commit.
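The distinction between the two patterns can be sketched as a check over a test's recent outcomes. This is an illustrative heuristic only: `classifyHistory` is a hypothetical function, the history is assumed to be ordered oldest to newest, and the threshold of two consecutive failures is my own assumption.

```typescript
type Outcome = "pass" | "fail";
type HistoryVerdict = "stable" | "flaky" | "possible regression" | "inconclusive";

// history is ordered oldest to newest.
function classifyHistory(history: Outcome[], minConsecutiveFailures = 2): HistoryVerdict {
  const firstFail = history.indexOf("fail");
  if (firstFail === -1) return "stable";
  const sinceFirstFail = history.slice(firstFail);
  // Any pass after the first failure means the failures are intermittent.
  if (sinceFirstFail.some((outcome) => outcome === "pass")) return "flaky";
  // Every run since the first failure has failed: likely a regression,
  // provided there are enough runs to tell.
  return sinceFirstFail.length >= minConsecutiveFailures
    ? "possible regression"
    : "inconclusive";
}
```

When the verdict is "possible regression", the index of the first failing run tells you which pipeline run, and therefore which commit range, to inspect first.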
Common errors and fixes
Error: Screenshot not captured for failed tests
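Fix: Playwright saves failure screenshots only when screenshot is set to 'only-on-failure' (or 'on') in playwright.config.ts. They reach Azure DevOps only if the step that publishes the test artifacts uses condition: always(), because by default later steps are skipped once the test step fails.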
Error: Trace files are too large to download
Fix: Use trace: 'on-first-retry' instead of trace: 'on'. This records a trace only when a failed test is retried for the first time, so retries must be set to at least 1.
Error: Tests time out on slow pipeline agents
Fix: Increase timeouts for CI: timeout: process.env.CI ? 60000 : 30000. Microsoft-hosted agents can be slower than local machines, especially for I/O-heavy operations.
Error: Can't reproduce pipeline failure locally
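Fix: Use Docker to get closer to the pipeline environment: docker run --rm -v $(pwd):/work -w /work mcr.microsoft.com/playwright:v1.45.0-jammy npx playwright test. Choose the image tag that matches your installed @playwright/test version. The browser versions match the pipeline's only if the pipeline runs the same image or the same Playwright version.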
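The trace, screenshot, and timeout fixes above can be gathered into one CI-aware helper. The sketch below returns a plain object using the option values mentioned in this section; `debugSettings` is a hypothetical name, and `retries: 1` in CI is my assumption so that 'on-first-retry' has a retry to record.

```typescript
type DebugSettings = {
  timeout: number; // per-test timeout in milliseconds
  retries: number;
  use: {
    trace: "on" | "on-first-retry";
    screenshot: "on" | "only-on-failure";
  };
};

function debugSettings(isCI: boolean): DebugSettings {
  return {
    // Pipeline agents can be slower than local machines.
    timeout: isCI ? 60000 : 30000,
    // Assumption: one retry in CI so that 'on-first-retry' can record a trace.
    retries: isCI ? 1 : 0,
    use: {
      // Full traces locally; smaller artifacts in CI.
      trace: isCI ? "on-first-retry" : "on",
      screenshot: "only-on-failure",
    },
  };
}
```

In playwright.config.ts the result could be spread into the config, for example `defineConfig({ ...debugSettings(!!process.env.CI) })`, so local and pipeline runs share one source of truth.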