Testing Guide

This guide covers how to write and run tests for pyhs3.

Testing Philosophy

We follow Test-Driven Development (TDD) principles:

  1. Write a failing test that validates the desired functionality

  2. Run the test to confirm it fails as expected

  3. Write minimal code to make the test pass

  4. Run the test to confirm success

  5. Refactor if needed while keeping tests green
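
As a minimal illustration of this loop, with a hypothetical normalize helper (not part of pyhs3):

# Step 1: write the failing test first.
def test_normalize_sums_to_one():
    """normalize() should rescale values so they sum to 1."""
    assert sum(normalize([1.0, 3.0])) == 1.0


# Step 3: the minimal implementation that makes the test pass.
def normalize(values):
    total = sum(values)
    return [v / total for v in values]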

Test Requirements

NO EXCEPTIONS: All projects MUST have:

  • Unit tests for individual components

  • Integration tests for component interactions

  • End-to-end tests with real data

The only exception is when you are explicitly authorized to skip tests for a specific change.
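
When a skip is authorized, record it explicitly rather than deleting the test; pytest's skip marker keeps the reason visible (the reason string here is illustrative):

import pytest


@pytest.mark.skip(reason="authorized: tracked in a follow-up issue")
def test_feature_pending():
    ...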

Running Tests

Basic Test Execution

Run all tests using pytest:

pytest

Run tests with coverage reporting:

pytest --cov=pyhs3
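
To iterate on a subset, target a single test file or a keyword expression:

pytest tests/test_distributions.py
pytest -k "gaussian"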

Using hatch

Run tests using hatch:

hatch run test

Run doctests:

hatch run doctest

Using nox

Run tests across multiple Python versions:

nox -s tests

Run specific tests:

nox -s tests -- tests/test_distributions.py
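
List the available sessions:

nox --list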

Test Organization

Test File Structure

Tests are organized in the tests/ directory:

tests/
├── test_distributions.py      # Unit tests for distributions
├── test_functions.py          # Unit tests for functions
├── test_workspace.py          # Integration tests for workspaces
├── test_histfactory.py        # Integration tests for HistFactory
├── test_realworld.py          # End-to-end tests with real data
└── test_histfactory/          # Test data directory
    ├── simplemodel_uncorrelated-background_hifa.json
    └── simplemodel_uncorrelated-background_hs3.json

Test Naming Conventions

  • Test files: Must start with test_ (e.g., test_distributions.py)

  • Test classes: Use descriptive names (e.g., TestGaussianDistribution, TestDiHiggsIssue41Workspace)

  • Test functions: Must start with test_ (e.g., test_gaussian_pdf_evaluation)
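
Putting these conventions together, a test module skeleton looks like this (reusing names from the examples above):

# tests/test_distributions.py — file name starts with test_


class TestGaussianDistribution:  # descriptive class name
    def test_gaussian_pdf_evaluation(self):  # function starts with test_
        ...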

Test Markers

We use pytest markers to categorize tests:

slow

Mark tests that take significant time to run:

import pytest


@pytest.mark.slow
def test_large_scale_optimization():
    """Test that runs for several seconds."""
    ...

Run slow tests:

pytest --runslow

Skip slow tests (default):

pytest

pydot

Mark tests that require the pydot dependency:

import pytest


@pytest.mark.pydot
def test_graph_visualization():
    """Test that requires pydot for graph rendering."""
    ...

Run pydot tests:

pytest --runpydot
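
Both --runslow and --runpydot are custom command-line options. They follow the standard conftest.py pattern from the pytest documentation; here is a sketch for the slow marker (the project's actual conftest.py may differ):

import pytest


def pytest_addoption(parser):
    parser.addoption(
        "--runslow", action="store_true", default=False, help="run slow tests"
    )


def pytest_collection_modifyitems(config, items):
    if config.getoption("--runslow"):
        return  # --runslow given: run everything, skip nothing
    skip_slow = pytest.mark.skip(reason="need --runslow option to run")
    for item in items:
        if "slow" in item.keywords:
            item.add_marker(skip_slow)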

Writing Tests

Unit Tests

Test individual components in isolation:

from __future__ import annotations

import numpy as np
import pytest
from pyhs3.distributions import GaussianDist


class TestGaussianDistribution:
    """Unit tests for Gaussian distribution."""

    def test_gaussian_pdf_at_mean(self):
        """Test that PDF is maximized at the mean."""
        dist = GaussianDist(name="gauss", x="x", mean="mu", sigma="sigma")
        # Test implementation...

    def test_gaussian_invalid_sigma(self):
        """Test that negative sigma raises an error."""
        with pytest.raises(ValueError, match="sigma must be positive"):
            # Test implementation...
            ...

Integration Tests

Test interactions between components:

from __future__ import annotations

import pyhs3
from pyhs3.distributions import GaussianDist
from pyhs3.parameter_points import ParameterSet, ParameterPoint


class TestWorkspaceIntegration:
    """Integration tests for workspace operations."""

    def test_workspace_model_creation(self):
        """Test creating a model from workspace."""
        ws = pyhs3.Workspace(
            metadata={"hs3_version": "0.2"},
            distributions=[GaussianDist(name="model", x="x", mean="mu", sigma="sigma")],
            parameter_points=[
                ParameterSet(
                    name="default",
                    parameters=[
                        ParameterPoint(name="x", value=0.0),
                        ParameterPoint(name="mu", value=0.0),
                        ParameterPoint(name="sigma", value=1.0),
                    ],
                )
            ],
        )
        model = ws.model()
        assert len(model.parameters) == 3

End-to-End Tests

Test with real data and complete workflows:

from __future__ import annotations

import json
import pyhs3


class TestRealWorldWorkspace:
    """End-to-end tests with real data."""

    def test_histfactory_workspace_loading(self):
        """Test loading a real HistFactory workspace."""
        with open(
            "tests/test_histfactory/simplemodel_uncorrelated-background_hs3.json"
        ) as f:
            workspace_data = json.load(f)

        ws = pyhs3.Workspace(**workspace_data)
        model = ws.model()

        # Validate against expected results
        assert model is not None
        # Run full PDF evaluation...

Test Data

For integration and end-to-end tests:

  • Use external test data repositories like scikit-hep-testdata when possible (see the sketch after this list)

  • Store small test files in tests/test_*/ directories

  • Never commit large files to the repository
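
For example, scikit-hep-testdata resolves files by name and downloads them on first use (the file name below is illustrative):

from skhep_testdata import data_path

# Returns a local path to the named file, fetching it if necessary.
path = data_path("example-testdata-file.root")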

Doctests

We support doctests in both code and documentation:

In Code

Add doctests to docstrings:

def calculate_nll(logpdf: float) -> float:
    """Calculate negative log-likelihood from log PDF.

    Parameters
    ----------
    logpdf : float
        The log probability density function value

    Returns
    -------
    float
        The negative log-likelihood (-2 * logpdf)

    Examples
    --------
    >>> calculate_nll(-0.91893853)
    1.83787706
    """
    return -2 * logpdf

In Documentation

Add code examples to .rst files using the .. doctest:: directive, or in code blocks that are tested automatically.
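
A minimal sketch of the directive in an .rst file, reusing the NLL arithmetic from the docstring example above:

.. doctest::

    >>> -2 * -0.91893853
    1.83787706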

Run doctests:

pytest --doctest-modules src/pyhs3
hatch run doctest

Coverage Requirements

Test Coverage Goals

We aim for high test coverage:

  • Overall coverage: >85%

  • Core modules: >90%

  • New code: 100% coverage expected

Viewing Coverage Reports

Generate and view HTML coverage report:

pytest --cov=pyhs3 --cov-report=html
open htmlcov/index.html  # or start htmlcov/index.html on Windows

View coverage in terminal:

pytest --cov=pyhs3 --cov-report=term-missing

Coverage is automatically reported in CI and uploaded to CodeCov.

Test Output Standards

Clean Output

Test output must be pristine for the suite to pass:

  • No unexpected warnings (see the example after this list)

  • No error messages in logs (unless explicitly tested)

  • All expected log messages should be captured and tested
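
pytest can enforce the no-warnings rule per test: the filterwarnings marker promotes any warning raised inside the test into an error:

import pytest


@pytest.mark.filterwarnings("error")
def test_emits_no_warnings():
    """Any warning raised here fails this test."""
    ...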

Capturing Expected Errors

If your code is expected to log errors, capture them in tests:

import logging


def test_expected_error_logging(caplog):
    """Test that expected errors are logged correctly."""
    with caplog.at_level(logging.ERROR):
        # Code that logs an error
        ...

    assert "Expected error message" in caplog.text

Testing Best Practices

Fixtures

Use pytest fixtures for common test setup:

import pytest
from pyhs3 import Workspace


@pytest.fixture
def simple_workspace():
    """Create a simple workspace for testing."""
    return Workspace(
        metadata={"hs3_version": "0.2"},
        distributions=[...],
    )


def test_with_fixture(simple_workspace):
    """Test using the fixture."""
    model = simple_workspace.model()
    assert model is not None

Parametrized Tests

Test multiple cases efficiently:

import numpy as np
import pytest


@pytest.mark.parametrize(
    "value,expected",
    [
        (0.0, 1.83787706),
        (1.0, 2.83787706),
        (-1.0, 2.83787706),
    ],
)
def test_nll_values(value, expected):
    """Test NLL calculation for various inputs."""
    # Sketch of an implementation: the expected values above are
    # NLL = -2 * logpdf for a standard normal evaluated at `value`.
    logpdf = -0.5 * np.log(2 * np.pi) - 0.5 * value**2
    assert -2 * logpdf == pytest.approx(expected)

Mocking

Important: We DO NOT use mocks in end-to-end tests. Always use real data and real APIs.

For unit tests, mocking is acceptable when testing external dependencies:

from unittest.mock import Mock, patch


def test_with_mock():
    """Test with mocked external dependency."""
    with patch("external_module.function") as mock_func:
        mock_func.return_value = 42
        # Test implementation...

Continuous Integration

Our CI pipeline runs:

  1. Linting via pre-commit hooks

  2. Type checking with mypy

  3. Unit tests across multiple Python versions (3.10, 3.11, 3.12, 3.13)

  4. Coverage reporting to CodeCov

  5. Documentation build to ensure docs compile

All checks must pass before a PR can be merged.

Troubleshooting

Common Issues

Tests pass locally but fail in CI
  • Ensure you’ve run pre-commit run --all-files

  • Check for platform-specific issues

  • Verify all dependencies are specified

Coverage too low
  • Add tests for uncovered code paths

  • Use pytest --cov=pyhs3 --cov-report=term-missing to see what’s missing

Slow test suite
  • Mark slow tests with @pytest.mark.slow

  • Run only fast tests during development: pytest -m "not slow"

Flaky tests
  • Investigate timing issues or race conditions

  • Use fixed random seeds for reproducibility (see the sketch after this list)

  • Avoid tests that depend on external services
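
For reproducible random test data, prefer an explicitly seeded generator over global random state:

import numpy as np

rng = np.random.default_rng(42)  # fixed seed: identical samples every run
samples = rng.normal(loc=0.0, scale=1.0, size=100)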

Getting Help

If you’re stuck:

  • Check CI logs for detailed error messages

  • Review similar tests in the codebase

  • Ask in GitHub discussions

  • Read pytest documentation: https://docs.pytest.org/