> ## Documentation Index > Fetch the complete documentation index at: https://docs.adenhq.com/llms.txt > Use this file to discover all available pages before exploring further. # Testing Agents > Validate agent behavior with goal-based and regression testing to build confidence and ensure production readiness ## Overview Testing is the foundation of developer confidence. In Hive, comprehensive testing confirms three critical things: * The agent meets success criteria defined by the goal * Constraints are respected under normal and edge inputs * Failure and escalation paths behave as expected ## Recommended Workflow 1. Generate or refine tests with the coding agent: ```bash theme={null} claude> /hive-test ``` 2. Run focused suites while iterating: ```bash theme={null} PYTHONPATH=exports uv run pytest exports/your_agent/tests/ -v ``` 3. Run goal-based checks before merge: ```bash theme={null} uv run hive test-run exports/your_agent --goal your_goal_id ``` ## Common Commands ### Run all tests for an agent ```bash theme={null} PYTHONPATH=exports uv run pytest exports/your_agent/tests/ -v ``` ### Run a single test ```bash theme={null} PYTHONPATH=exports uv run pytest \ exports/your_agent/tests/test_agent.py::test_happy_path -v ``` ### Run goal-aware CLI test runner ```bash theme={null} uv run hive test-run exports/your_agent --goal your_goal_id ``` ### List generated tests ```bash theme={null} uv run hive test-list exports/your_agent ``` ### Debug a failing test ```bash theme={null} uv run hive test-debug exports/your_agent test_constraint_budget_limit ``` ## What to Test ### Goal Completion * Primary success criteria are satisfied * Weighted criteria do not regress across releases ### Constraints * Hard constraints always fail safely * Soft constraints emit warnings or fallback behavior ### Routing and Retries * Conditional edges take the correct branch * Retry loops terminate and do not stall the graph ### Human-in-the-Loop * Pause/resume paths work * Timeout and escalation behavior match requirements ## CI Example ```yaml theme={null} name: Agent Tests on: [push, pull_request] jobs: test: runs-on: ubuntu-latest steps: - uses: actions/checkout@v4 - name: Setup run: ./quickstart.sh - name: Pytest run: PYTHONPATH=exports uv run pytest exports/*/tests/ -v --tb=short ``` ## Best Practices * Keep unit-level tests deterministic with mocked tool responses * Add regression tests for every production failure you fix * Treat constraints as mandatory API contracts, not optional hints * Track test coverage across success, failure, retry, and HITL branches > **Testing and Debugging**: Testing catches issues before production. Once your agent is live, [debugging](/building/debugging) tools help you diagnose and fix issues based on real-world behavior. ## Next Steps Create and export a new agent package Add supervised decision points to sensitive actions Diagnose and fix issues discovered in testing or production Deploy your tested agent to production