Available Test Containers¶
ASQI provides several pre-built test containers for different testing scenarios. All containers are available on Docker Hub and can be pulled using the installation commands.
Container Overview¶
Mock Tester (asqiengineer/test-container:mock_tester-latest): Basic test container for development and validation
Garak Security Tester (asqiengineer/test-container:garak-latest): LLM security vulnerability assessment with 40+ attack vectors
Chatbot Simulator (asqiengineer/test-container:chatbot_simulator-latest): Persona-based conversational testing with multi-turn dialogue
TrustLLM (asqiengineer/test-container:trustllm-latest): Comprehensive trustworthiness evaluation framework
DeepTeam (asqiengineer/test-container:deepteam-latest): Red teaming library for adversarial robustness testing
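As a minimal sketch of pulling every image listed above in one pass, the snippet below wraps the standard docker pull command in a loop. The helper name pull_all is hypothetical and not part of ASQI; the image tags are the ones from the overview.

```shell
# Image tags copied from the container overview.
images="
asqiengineer/test-container:mock_tester-latest
asqiengineer/test-container:garak-latest
asqiengineer/test-container:chatbot_simulator-latest
asqiengineer/test-container:trustllm-latest
asqiengineer/test-container:deepteam-latest
"

# Hypothetical helper: pull each image in turn and stop at the first failure.
pull_all() {
  for image in $images; do
    docker pull "$image" || return 1
  done
}
```

Calling pull_all requires a running Docker daemon and network access; remove entries from the list if you only need some of the containers.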
Test Container Examples¶
Note: Some tests mount a volume to save detailed logs. You may need to adjust the output volume mount to match your environment.
ASQI provides ready-to-use example configurations for each test container. Download and run these examples to get started quickly:
Mock Tester Example¶
Basic test container for development and validation:
# Download and run the basic demo
curl -O https://raw.githubusercontent.com/asqi-engineer/asqi-engineer/main/config/suites/demo_test.yaml
asqi execute-tests -t demo_test.yaml -s demo_systems.yaml -o results.json
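The run command expects both demo_test.yaml and demo_systems.yaml in the working directory, while only the first is downloaded above. A small pre-flight check, sketched here with a hypothetical preflight helper that is not part of ASQI, can catch a missing file or a missing CLI before the run starts:

```shell
# Hypothetical helper: checks that the asqi CLI and the given files exist.
# Prints one message per missing prerequisite and returns non-zero if any is absent.
preflight() {
  status=0
  if ! command -v asqi >/dev/null 2>&1; then
    echo "asqi not found on PATH" >&2
    status=1
  fi
  for file in "$@"; do
    if [ ! -f "$file" ]; then
      echo "missing file: $file" >&2
      status=1
    fi
  done
  return "$status"
}

# Only start the run when every prerequisite is present.
if preflight demo_test.yaml demo_systems.yaml; then
  asqi execute-tests -t demo_test.yaml -s demo_systems.yaml -o results.json
fi
```

The same check works for the other examples by passing their suite file names instead.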
Garak Security Testing Example¶
LLM security vulnerability assessment with multiple attack probes:
# Download security test configuration
curl -O https://raw.githubusercontent.com/asqi-engineer/asqi-engineer/main/config/suites/garak_test.yaml
# Run security tests (includes encoding attacks and prompt injection)
asqi execute-tests -t garak_test.yaml -s demo_systems.yaml -o security_results.json
Note: Certain tests require an OPENAI_API_KEY, so it is recommended to pass it in via the env_file field of the system config.
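One way to prepare such a file is sketched below. It assumes the env file uses the common KEY=value line format (as Docker env files do); the helper name write_env_file is hypothetical, and the exact placement of the env_file field should be taken from your system config, not from this sketch.

```shell
# Hypothetical helper: writes the key to an env file that only the current
# user can read, and refuses to overwrite an existing file.
write_env_file() {
  env_path=$1
  api_key=$2
  if [ -e "$env_path" ]; then
    echo "refusing to overwrite $env_path" >&2
    return 1
  fi
  # The subshell keeps the restrictive umask from leaking into the caller.
  ( umask 077 && printf 'OPENAI_API_KEY=%s\n' "$api_key" > "$env_path" )
}
```

For example, running write_env_file .env "$OPENAI_API_KEY" creates the file from a key already exported in your shell; the env_file field can then point at .env. Keeping the key in a file with restricted permissions avoids putting it on the command line or in the YAML itself.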
Chatbot Simulator Example¶
Persona-based conversational testing with multi-turn dialogue:
# Download chatbot simulation configuration
curl -O https://raw.githubusercontent.com/asqi-engineer/asqi-engineer/main/config/suites/chatbot_simulator_test.yaml
# Run conversational tests
asqi execute-tests -t chatbot_simulator_test.yaml -s demo_systems.yaml -o chatbot_results.json
TrustLLM Example¶
Comprehensive trustworthiness evaluation across multiple dimensions:
# Download trustworthiness evaluation configuration
curl -O https://raw.githubusercontent.com/asqi-engineer/asqi-engineer/main/config/suites/trustllm_test.yaml
# Run trustworthiness evaluation
asqi execute-tests -t trustllm_test.yaml -s demo_systems.yaml -o trustllm_results.json
DeepTeam Red Teaming Example¶
Advanced adversarial robustness testing:
# Download red teaming configuration
curl -O https://raw.githubusercontent.com/asqi-engineer/asqi-engineer/main/config/suites/deepteam_test.yaml
# Run red teaming tests
asqi execute-tests -t deepteam_test.yaml -s demo_systems.yaml -o redteam_results.json
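The five examples above follow the same download-then-run pattern, so they can be batched. The sketch below is one possible wrapper: the function names and the <suite>_results.json output naming are hypothetical choices made for illustration, while the URLs and the asqi execute-tests flags are the ones used in the examples.

```shell
base_url="https://raw.githubusercontent.com/asqi-engineer/asqi-engineer/main/config/suites"

# Build the download URL for a suite name such as "garak_test".
suite_url() {
  printf '%s/%s.yaml\n' "$base_url" "$1"
}

# Download one suite and run it, writing <suite>_results.json (hypothetical naming).
run_suite() {
  suite=$1
  curl -fsSLO "$(suite_url "$suite")" || return 1
  asqi execute-tests -t "$suite.yaml" -s demo_systems.yaml -o "${suite}_results.json"
}

# Run every example suite; report failures without stopping the loop.
run_all_suites() {
  for suite in demo_test garak_test chatbot_simulator_test trustllm_test deepteam_test; do
    run_suite "$suite" || echo "suite failed: $suite" >&2
  done
}
```

Calling run_all_suites runs the suites one after another. The -f flag makes curl fail on HTTP errors, so a bad download does not leave an error page saved under the suite's file name.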
Evaluating Score Cards¶
Score cards provide automated assessment of test results against business-relevant criteria. ASQI Engineer includes a flexible grading engine that evaluates each test execution individually and provides structured feedback.
How Score Cards Work¶
Score cards consist of indicators that evaluate specific metrics from test results:
Apply to specific tests: Target individual test names from your test suite
Extract metrics: Pull any field from test container JSON output
Assessment criteria: Define pass/fail thresholds with business-friendly outcomes
Individual evaluation: Each test execution is assessed separately (no aggregation)
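The idea behind an indicator can be illustrated with a toy grader. This is a conceptual sketch only, not ASQI's grading engine or its score card syntax: the test names, metric values, thresholds, and outcome labels are all made up for illustration.

```shell
# Map a numeric metric to an outcome label.
# The thresholds and labels are illustrative, not taken from an ASQI score card.
grade() {
  awk -v score="$1" 'BEGIN {
    if (score >= 0.9) { print "Excellent" }
    else if (score >= 0.7) { print "Acceptable" }
    else { print "Needs improvement" }
  }'
}

# Each test execution is graded on its own; nothing is averaged or aggregated.
for result in "mock_test_a:0.95" "mock_test_b:0.72" "mock_test_c:0.40"; do
  name=${result%%:*}
  score=${result#*:}
  printf '%s: %s\n' "$name" "$(grade "$score")"
done
```

In a real score card the metric would be extracted from the test container's JSON output and the thresholds would come from the score card YAML.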
Basic Score Card Example¶
The following commands apply the simple example score card to Mock Tester results:
# First run a test to generate results
curl -O https://raw.githubusercontent.com/asqi-engineer/asqi-engineer/main/config/suites/demo_test.yaml
asqi execute-tests -t demo_test.yaml -s demo_systems.yaml -o test_results.json
# Download and apply basic score card
curl -O https://raw.githubusercontent.com/asqi-engineer/asqi-engineer/main/config/score_cards/example_score_card.yaml
asqi evaluate-score-cards --input-file test_results.json -r example_score_card.yaml -o results_with_grades.json
# Or run end-to-end (tests + score card evaluation)
asqi execute -t demo_test.yaml -s demo_systems.yaml -r example_score_card.yaml -o complete_results.json
For detailed information about specific test containers, see LLM Test Containers.