src.asqi.score_card_engine¶

Attributes¶

logger

Classes¶

`ScoreCardEvaluationResult`	Result of evaluating a single score_card indicator.
`ScoreCardEngine`	Core score_card evaluation engine.

Functions¶

`parse_metric_path`(→ List[str])	Parse a metric path supporting both dot notation and bracket notation.
`get_nested_value`(→ Tuple[Any, Optional[str]])	Extract a nested value from a dictionary using dot/bracket notation.

Module Contents¶

src.asqi.score_card_engine.logger¶

src.asqi.score_card_engine.parse_metric_path(path: str) → List[str]¶

Parse a metric path supporting both dot notation and bracket notation.

Examples::
    'success' -> ['success']
    'vulnerability_stats.Toxicity.overall_pass_rate' -> ['vulnerability_stats', 'Toxicity', 'overall_pass_rate']
    'probe_results["encoding.InjectHex"]["encoding.DecodeMatch"].passed' -> ['probe_results', 'encoding.InjectHex', 'encoding.DecodeMatch', 'passed']
    'probe_results["encoding.InjectHex"].total_attempts' -> ['probe_results', 'encoding.InjectHex', 'total_attempts']
Args:: path: Metric path string to parse
Returns:: List of keys to traverse
Raises:: ValueError: If path contains invalid syntax
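
The parsing rules shown in the examples can be sketched as follows. This is a minimal illustration written for this page, not the module's actual implementation: the name `sketch_parse_metric_path`, the regular expression, and the exact set of inputs rejected with `ValueError` are all assumptions.

```python
import re
from typing import List

# Hypothetical stand-in for parse_metric_path; the real parser may differ.
_TOKEN = re.compile(
    r"""
    \[\s*(?P<q>["'])(?P<bracket>.*?)(?P=q)\s*\]   # quoted key inside brackets
    | (?P<dot>[^.\[\]]+)                          # plain key between dots
    """,
    re.VERBOSE,
)


def sketch_parse_metric_path(path: str) -> List[str]:
    """Split a metric path into the list of keys to traverse."""
    if not path:
        raise ValueError("Metric path must not be empty")
    keys: List[str] = []
    pos = 0
    while pos < len(path):
        if path[pos] == ".":
            # Assumption: a dot is only valid between two plain keys.
            if not keys or pos + 1 >= len(path) or path[pos + 1] in ".[":
                raise ValueError(f"Unexpected '.' at position {pos} in {path!r}")
            pos += 1
            continue
        match = _TOKEN.match(path, pos)
        if match is None:
            raise ValueError(f"Invalid syntax at position {pos} in {path!r}")
        bracket_key = match.group("bracket")
        keys.append(bracket_key if bracket_key is not None else match.group("dot"))
        pos = match.end()
    return keys


keys = sketch_parse_metric_path('probe_results["encoding.InjectHex"].total_attempts')
```

Bracket notation exists so that keys containing dots, such as `encoding.InjectHex`, stay intact instead of being split into two keys.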

src.asqi.score_card_engine.get_nested_value(data: Dict[str, Any], path: str) → Tuple[Any, str | None]¶

Extract a nested value from a dictionary using dot/bracket notation.

Args::
    data: Dictionary to extract the value from
    path: Path to the nested value (e.g., 'a.b.c' or 'a["key.with.dots"].c')
Returns:: Tuple of (value, error_message). If successful, error_message is None. If failed, value is None and error_message describes the issue.
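
The `(value, error_message)` return convention might look like the sketch below. It is an illustration only: `sketch_get_nested_value`, the lenient `_split_path` helper, and the wording of the error messages are assumptions, not the module's code.

```python
import re
from typing import Any, Dict, List, Optional, Tuple

# Hypothetical, lenient tokenizer: quoted bracket keys or plain dotted keys.
_KEY = re.compile(r"""\[\s*["'](.*?)["']\s*\]|([^.\[\]]+)""")


def _split_path(path: str) -> List[str]:
    # findall yields (bracket_key, dot_key) pairs; one of the two is empty.
    return [bracket or dot for bracket, dot in _KEY.findall(path)]


def sketch_get_nested_value(
    data: Dict[str, Any], path: str
) -> Tuple[Any, Optional[str]]:
    """Walk `data` along `path`, reporting failures as a message, not an exception."""
    keys = _split_path(path)
    if not keys:
        return None, f"No keys found in path {path!r}"
    current: Any = data
    walked: List[str] = []
    for key in keys:
        if not isinstance(current, dict):
            return None, f"Value at {walked!r} is not a dictionary"
        if key not in current:
            return None, f"Key {key!r} not found after {walked!r}"
        current = current[key]
        walked.append(key)
    return current, None


data = {"a": {"key.with.dots": {"c": 1}}}
value, error = sketch_get_nested_value(data, 'a["key.with.dots"].c')
missing_value, missing_error = sketch_get_nested_value(data, "a.missing")
```

Returning the error as the second tuple element lets a caller record why a lookup failed without wrapping every call in `try`/`except`.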

class src.asqi.score_card_engine.ScoreCardEvaluationResult(indicator_id: str, indicator_name: str | None, test_id: str)¶

Result of evaluating a single score_card indicator.

indicator_id¶

indicator_name¶

test_id¶

outcome: str | None = None¶

metric_value: Any | None = None¶

test_result_id: str | None = None¶

sut_name: str | None = None¶

computed_value: int | float | bool | None = None¶

details: str = ''¶

description: str | None = None¶

notes: str | None = None¶

error: str | None = None¶

to_dict() → Dict[str, Any]¶

Convert to dictionary for JSON serialization.
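
As a rough picture of how such a result could be serialized, the sketch below mirrors the documented attributes in a stand-in dataclass. `ResultSketch` is hypothetical, the field values are illustrative only, and the assumption that `to_dict()` returns a flat mapping of field names to values is not confirmed by this page.

```python
import json
from dataclasses import asdict, dataclass
from typing import Any, Dict, Optional, Union


@dataclass
class ResultSketch:
    """Hypothetical stand-in mirroring the documented attributes."""

    indicator_id: str
    indicator_name: Optional[str]
    test_id: str
    outcome: Optional[str] = None
    metric_value: Optional[Any] = None
    test_result_id: Optional[str] = None
    sut_name: Optional[str] = None
    computed_value: Union[int, float, bool, None] = None
    details: str = ""
    description: Optional[str] = None
    notes: Optional[str] = None
    error: Optional[str] = None

    def to_dict(self) -> Dict[str, Any]:
        # Assumption: a flat mapping of field name to value.
        return asdict(self)


# Illustrative values only; they do not come from a real evaluation.
result = ResultSketch(indicator_id="ind-1", indicator_name="Pass rate", test_id="test-1")
result.computed_value = 0.97
payload = json.dumps(result.to_dict())
```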

class src.asqi.score_card_engine.ScoreCardEngine¶

Core score_card evaluation engine.

filter_results_by_test_id(test_results: List[asqi.workflow.TestExecutionResult], target_test_id: str) → List[asqi.workflow.TestExecutionResult]¶

Filter test results to only include those with the specified test id.

Args::
    test_results: List of test execution results to filter
    target_test_id: Test id to filter for
Returns:: Filtered list of test results matching the target test id

validate_scorecard_test_ids(test_results: List[asqi.workflow.TestExecutionResult], score_card: asqi.schemas.ScoreCard) → None¶

Check that the score card indicators are applicable to the available test results.

Audit indicators do not reference test_ids and are ignored here.

extract_metric_values(test_results: List[asqi.workflow.TestExecutionResult], metric_path: str) → List[Any]¶

Extract metric values from test results using the specified path.

Supports flat, nested, and bracket-notation metric access:
- Flat: 'success', 'score'
- Nested: 'vulnerability_stats.Toxicity.overall_pass_rate'
- Bracket notation: 'probe_results["encoding.InjectHex"]["encoding.DecodeMatch"].passed'

Args::
    test_results: List of test execution results
    metric_path: Path to the metric within each test result (supports dot and bracket notation)
Returns:: List of extracted metric values

resolve_metric_or_expression(test_result: asqi.workflow.TestExecutionResult, metric_config: str | asqi.schemas.MetricExpression) → Tuple[int | float | None, str | None]¶

Resolve a metric configuration (simple path or expression object).

Args::
    test_result: Test execution result containing metric data
    metric_config: Either a simple metric path string or a MetricExpression object
Returns:: Tuple of (resolved_value, error_message). If successful, error_message is None.

apply_condition_to_value(value: Any, condition: str, threshold: int | float | None = None) → Tuple[bool, str]¶

Apply the specified condition to a single value.

Args::
    value: Value to evaluate
    condition: Condition to apply (e.g., 'equal_to', 'greater_than')
    threshold: Threshold value for comparison (required for most conditions)
Returns:: Tuple of (condition_met, description)
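
A condition check of this shape can be sketched with a lookup table of comparison operators. The sketch is hedged: only `equal_to` and `greater_than` are named on this page, so only those two are included, and both the function name `sketch_apply_condition` and the choice to raise `ValueError` for unknown conditions or a missing threshold are assumptions.

```python
import operator
from typing import Any, Callable, Dict, Optional, Tuple, Union

# Only the two condition names that appear in the documentation.
_COMPARATORS: Dict[str, Callable[[Any, Any], bool]] = {
    "equal_to": operator.eq,
    "greater_than": operator.gt,
}


def sketch_apply_condition(
    value: Any,
    condition: str,
    threshold: Optional[Union[int, float]] = None,
) -> Tuple[bool, str]:
    """Return (condition_met, description) for a single value."""
    compare = _COMPARATORS.get(condition)
    if compare is None:
        # Assumption: unknown conditions are rejected outright.
        raise ValueError(f"Unsupported condition: {condition!r}")
    if threshold is None:
        # Assumption: both conditions in this sketch need a threshold.
        raise ValueError(f"Condition {condition!r} requires a threshold")
    met = bool(compare(value, threshold))
    return met, f"{value!r} {condition} {threshold!r}: {met}"


met, description = sketch_apply_condition(0.9, "greater_than", 0.8)
```

The description string travels with the boolean so that a report can show which comparison produced the outcome.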

evaluate_indicator(test_results: List[asqi.workflow.TestExecutionResult], indicator: asqi.schemas.ScoreCardIndicator) → List[ScoreCardEvaluationResult]¶

Evaluate a single score_card indicator against individual test results.

Args::
    test_results: List of test execution results to evaluate
    indicator: Score card indicator configuration
Returns:: List of evaluation results for each matching test

evaluate_audit_indicator(indicator: asqi.schemas.AuditScoreCardIndicator, audit_responses: asqi.schemas.AuditResponses | None = None, available_suts: Set[str] | None = None) → List[ScoreCardEvaluationResult]¶

Convert manual audit responses for a single audit indicator into evaluation results.

evaluate_scorecard(test_results: List[asqi.workflow.TestExecutionResult], score_card: asqi.schemas.ScoreCard, audit_responses_data: asqi.schemas.AuditResponses | None = None) → List[Dict[str, Any]]¶

Evaluate a complete grading score_card against test results.

Args::
    test_results: List of test execution results to evaluate
    score_card: Complete score card configuration
    audit_responses_data: User-provided audit responses
Returns:: List of evaluation result dictionaries
Raises:: ValueError: If no indicators match any test ids in the test results