Parallel Test Execution in Azure DevOps Pipelines: Speed Up Testing

How to run tests in parallel in Azure DevOps to cut pipeline execution time. Covers job parallelism, matrix strategies, Playwright sharding, pytest sharding, and calculating the optimal shard count.

InnovateBits · 4 min read

A test suite that takes 90 minutes sequentially takes roughly 15 minutes across 6 parallel agents, before per-agent setup overhead. Parallel execution is often the single highest-impact optimisation you can make to a slow CI pipeline.


Three parallelism strategies

| Strategy | How | Best for |
|---|---|---|
| Job parallelism | Multiple jobs run simultaneously | Independent test suites (unit + API + E2E) |
| Matrix strategy | Same job runs with different parameters | Cross-browser, cross-OS, cross-version |
| Test sharding | Tests split across agents | Single large test suite |

Strategy 1: Job parallelism

Run different test types simultaneously:

stages:
  - stage: Test
    jobs:
      # All three jobs start at the same time
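      - job: Unit
        displayName: Unit tests (2 min)
        steps:
          - script: npm ci
          # package.json scripts are invoked with "npm run <name>"
          - script: npm run test:unit

      - job: API
        displayName: API tests (5 min)
        steps:
          - script: npm ci
          - script: npm run test:api

      - job: E2E
        displayName: E2E tests (15 min)
        steps:
          - script: npm ci
          - script: npx playwright install --with-deps
          - script: npx playwright test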

Total time: 15 minutes (the slowest job) instead of 22 minutes (sequential).
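
Jobs in the same stage start in parallel unless one of them declares `dependsOn`. The sketch below adds a follow-up job that waits for all three test jobs; the `Report` job name and its placeholder step are hypothetical and only illustrate the dependency wiring.

```yaml
jobs:
  - job: Unit
    steps:
      - script: npm run test:unit
  - job: API
    steps:
      - script: npm run test:api
  - job: E2E
    steps:
      - script: npx playwright test

  # Hypothetical summary job: starts only after all three test jobs finish
  - job: Report
    dependsOn:
      - Unit
      - API
      - E2E
    # Run even when a test job failed, so a summary is always produced
    condition: succeededOrFailed()
    steps:
      - script: echo "All test jobs finished"
```

Without the `dependsOn` list, `Report` would start alongside the test jobs instead of after them.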


Strategy 2: Matrix strategy

Run the same tests on multiple configurations simultaneously:

jobs:
  - job: CrossBrowser
    displayName: Cross-browser E2E
    strategy:
      matrix:
        Chrome:
          BROWSER: chromium
        Firefox:
          BROWSER: firefox
        Safari:
          BROWSER: webkit
      maxParallel: 3
 
    steps:
      - script: npm ci
      - script: npx playwright install --with-deps $(BROWSER)
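      # Assumes playwright.config defines projects named chromium, firefox and webkit
      - script: npx playwright test --project=$(BROWSER) --reporter=junit
        displayName: Run $(BROWSER) tests
        env:
          # Write the JUnit XML to the path PublishTestResults reads below
          PLAYWRIGHT_JUNIT_OUTPUT_NAME: results/$(BROWSER)-results.xml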
      - task: PublishTestResults@2
        inputs:
          testResultsFiles: results/$(BROWSER)-results.xml
          testRunTitle: E2E — $(BROWSER) — $(Build.BuildNumber)
        condition: always()

All three browsers run simultaneously when enough agents are available, so the total time is that of the slowest browser.

Matrix for multiple environments

strategy:
  matrix:
    Staging:
      ENV_NAME: staging
      BASE_URL: https://staging.app.com
    UAT:
      ENV_NAME: uat
      BASE_URL: https://uat.app.com
  maxParallel: 2
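
Matrix values are exposed to each generated job as pipeline variables, so steps can reference them with `$(NAME)`. A sketch of a complete job built around this matrix follows; the job name `MultiEnv` is hypothetical, and it assumes the test suite reads its target URL from a `BASE_URL` environment variable.

```yaml
jobs:
  - job: MultiEnv
    displayName: E2E per environment
    strategy:
      matrix:
        Staging:
          ENV_NAME: staging
          BASE_URL: https://staging.app.com
        UAT:
          ENV_NAME: uat
          BASE_URL: https://uat.app.com
      maxParallel: 2
    steps:
      - script: npm ci
      - script: npx playwright install --with-deps
      - script: npx playwright test
        displayName: Run tests against $(ENV_NAME)
        env:
          # Passed explicitly so the dependency on BASE_URL is visible in the step
          BASE_URL: $(BASE_URL)
```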

Strategy 3: Test sharding (Playwright)

Split a single test suite across multiple agents:

jobs:
  - job: Shard
    strategy:
      matrix:
        Shard1: { SHARD_INDEX: 1, SHARD_TOTAL: 4 }
        Shard2: { SHARD_INDEX: 2, SHARD_TOTAL: 4 }
        Shard3: { SHARD_INDEX: 3, SHARD_TOTAL: 4 }
        Shard4: { SHARD_INDEX: 4, SHARD_TOTAL: 4 }
      maxParallel: 4
 
    steps:
      - script: npm ci
      - script: npx playwright install --with-deps chromium
      - script: |
          npx playwright test \
            --shard=$(SHARD_INDEX)/$(SHARD_TOTAL) \
            --reporter=blob
        displayName: Run shard $(SHARD_INDEX)/$(SHARD_TOTAL)
      - task: PublishPipelineArtifact@1
        inputs:
          targetPath: blob-report
          artifact: blob-report-$(SHARD_INDEX)
        condition: always()
 
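  - job: MergeResults
    dependsOn: Shard
    condition: always()
    steps:
      - script: npm ci
      - task: DownloadPipelineArtifact@2
        inputs:
          targetPath: $(Pipeline.Workspace)/all-blob-reports
          patterns: 'blob-report-*/**'
      # Each artifact downloads into its own subfolder, but merge-reports
      # reads the .zip files from a single directory, so flatten them first
      - script: |
          mkdir -p blob-reports
          find "$(Pipeline.Workspace)/all-blob-reports" -name '*.zip' -exec cp {} blob-reports/ \;
          npx playwright merge-reports --reporter=html blob-reports
        displayName: Merge shard reports
      - task: PublishPipelineArtifact@1
        inputs:
          targetPath: playwright-report
          artifact: playwright-merged-report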

A 60-minute suite sharded across 4 agents completes in roughly 15 minutes of test time per shard, plus setup, provided the tests split evenly.
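
Azure Pipelines can also generate the shard jobs itself instead of a hand-written matrix: `strategy: parallel` creates N copies of a job and exposes `System.JobPositionInPhase` (1-based) and `System.TotalJobsInPhase` to each copy. A sketch of the shard job written that way, assuming the same Playwright setup as above:

```yaml
jobs:
  - job: Shard
    strategy:
      # Creates 4 copies of this job; change one number to re-shard
      parallel: 4
    steps:
      - script: npm ci
      - script: npx playwright install --with-deps chromium
      - script: |
          npx playwright test \
            --shard=$(System.JobPositionInPhase)/$(System.TotalJobsInPhase) \
            --reporter=blob
        displayName: Run shard $(System.JobPositionInPhase)/$(System.TotalJobsInPhase)
      - task: PublishPipelineArtifact@1
        inputs:
          targetPath: blob-report
          artifact: blob-report-$(System.JobPositionInPhase)
        condition: always()
```

The trade-off is readability against flexibility: the matrix form allows extra per-shard variables, while the `parallel` form keeps the shard count in one place.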


Strategy 3b: pytest sharding with pytest-split

For Python test suites, the --splits and --group options come from the pytest-split plugin (pytest-xdist parallelises across CPU cores within one agent rather than across agents). Groups are numbered from 1:

jobs:
  - job: PytestShard
    strategy:
      matrix:
        Shard1: { SHARD_NUM: 1, SHARD_TOTAL: 4 }
        Shard2: { SHARD_NUM: 2, SHARD_TOTAL: 4 }
        Shard3: { SHARD_NUM: 3, SHARD_TOTAL: 4 }
        Shard4: { SHARD_NUM: 4, SHARD_TOTAL: 4 }
      maxParallel: 4
 
    steps:
      - script: pip install pytest pytest-split
      - script: |
          pytest tests/ \
            --splits=$(SHARD_TOTAL) \
            --group=$(SHARD_NUM) \
            --junitxml=results/shard-$(SHARD_NUM).xml
      - task: PublishTestResults@2
        inputs:
          testResultsFiles: results/shard-$(SHARD_NUM).xml
          testRunTitle: Shard $(SHARD_NUM)
        condition: always()
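
Splitting with `--splits` and `--group` balances best when pytest-split has recorded test durations to work from; it reads them from a `.test_durations` file in the working directory. A sketch of a job that records that file, assuming the pytest-split plugin; the `RecordDurations` job name and the artifact name are hypothetical, and committing the file back to the repository is left out.

```yaml
jobs:
  - job: RecordDurations
    steps:
      - script: pip install pytest pytest-split
      # Runs the full suite once and writes .test_durations
      - script: pytest tests/ --store-durations
        displayName: Record test durations
      - task: PublishPipelineArtifact@1
        inputs:
          targetPath: .test_durations
          artifact: test-durations
```

Running this occasionally, for example on a schedule, is usually enough, since durations only need to be approximately right for the groups to balance.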

Calculating optimal shard count

Optimal shards = ceiling(Suite duration / Target duration)

Example:
  Suite takes 60 minutes
  Target: < 15 minutes
  Shards needed: ceiling(60/15) = 4 shards

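Consideration: Each shard has setup overhead (~2-3 minutes for
browser install, npm install). Factor this in:
  Effective test time per shard = 60/4 = 15 min
  Plus setup: 15 + 2.5 = 17.5 min total, which misses the 15-minute target
  Adjusted: ceiling(60 / (15 - 2.5)) = ceiling(4.8) = 5 shards
  Check: 60/5 + 2.5 = 14.5 min total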

Beyond 8–10 shards, returns diminish: each extra shard saves less test time but still pays the full setup overhead.
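
The diminishing returns can be made concrete with a simple model. This is a sketch that assumes tests split evenly and that setup is a constant 2.5 minutes per shard; the symbols $T_{\text{suite}}$, $T_{\text{setup}}$ and $N$ are introduced here for illustration.

```latex
T(N) = \frac{T_{\text{suite}}}{N} + T_{\text{setup}},
\qquad
A(N) = N \cdot T(N) = T_{\text{suite}} + N \cdot T_{\text{setup}}

% With T_suite = 60 min and T_setup = 2.5 min:
T(4) = 17.5, \quad T(8) = 10, \quad T(16) = 6.25 \ \text{(minutes of wall-clock time)}

A(4) = 70, \quad A(8) = 80, \quad A(16) = 100 \ \text{(agent-minutes consumed)}
```

Under these assumptions, doubling from 4 to 8 shards saves 7.5 minutes, while doubling again from 8 to 16 saves only 3.75 minutes and costs 20 more agent-minutes.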


Common errors and fixes

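Error: Jobs run sequentially despite matrix strategy
Fix: maxParallel only caps concurrency; it does not grant extra capacity. You need enough parallel jobs in your organisation and enough available agents in your agent pool. Public projects can have up to 10 free Microsoft-hosted parallel jobs, while private projects get 1. Check Organisation Settings → Parallel jobs.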

Error: Test results from shards not appearing in the pipeline Tests tab
Fix: Each shard must publish its own results with PublishTestResults. Ensure the task runs with condition: always() and the file paths are correct per shard.

Error: Playwright shards have uneven distribution (one shard takes much longer)
Fix: By default Playwright distributes tests across shards by file, so one very slow test file dominates its shard. Split large test files into smaller ones, or enable fullyParallel so that individual tests are distributed instead.

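Error: maxParallel is set to 4 but only 2 jobs run at once
Fix: You have probably reached your parallel job limit or run out of free agents in the pool. The free tier for private projects includes 1 Microsoft-hosted parallel job; additional parallel jobs must be purchased. Check Organisation Settings → Billing.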
