src.asqi.score_card_engine

Attributes

Classes

ScoreCardEvaluationResult

Result of evaluating a single score_card indicator.

ScoreCardEngine

Core score_card evaluation engine.

Functions

parse_metric_path(→ List[str])

Parse a metric path supporting both dot notation and bracket notation.

get_nested_value(→ Tuple[Any, Optional[str]])

Extract a nested value from a dictionary using dot/bracket notation.

Module Contents

src.asqi.score_card_engine.logger
src.asqi.score_card_engine.parse_metric_path(path: str) → List[str]

Parse a metric path supporting both dot notation and bracket notation.

Examples:

'success' -> ['success']
'vulnerability_stats.Toxicity.overall_pass_rate' -> ['vulnerability_stats', 'Toxicity', 'overall_pass_rate']
'probe_results["encoding.InjectHex"]["encoding.DecodeMatch"].passed' -> ['probe_results', 'encoding.InjectHex', 'encoding.DecodeMatch', 'passed']
'probe_results["encoding.InjectHex"].total_attempts' -> ['probe_results', 'encoding.InjectHex', 'total_attempts']

Args:

path: Metric path string to parse

Returns:

List of keys to traverse

Raises:

ValueError: If path contains invalid syntax
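
The parsing rules above can be approximated with a small tokenizer. The following is a minimal sketch, not the module's actual implementation: the name parse_metric_path_sketch, the regular expression, and the exact error conditions are assumptions made for illustration. It reproduces the four documented examples.

```python
import re
from typing import List

# Hypothetical tokenizer, written for illustration only (not the asqi source).
_TOKEN = re.compile(
    r"""
    \[\s*(?:"([^"]*)"|'([^']*)')\s*\]   # bracketed, quoted key such as ["a.b"]
    | ([^.\[\]]+)                        # bare key between dots
    """,
    re.VERBOSE,
)


def parse_metric_path_sketch(path: str) -> List[str]:
    """Split a metric path into keys, accepting dot and bracket notation."""
    if not path:
        raise ValueError("metric path is empty")
    keys: List[str] = []
    pos = 0
    while pos < len(path):
        if path[pos] == ".":
            # A dot must sit between two keys and be followed by a bare key.
            if pos == 0 or pos + 1 >= len(path) or path[pos + 1] in ".[":
                raise ValueError(f"unexpected '.' at position {pos} in {path!r}")
            pos += 1
            continue
        match = _TOKEN.match(path, pos)
        if match is None:
            raise ValueError(f"invalid syntax at position {pos} in {path!r}")
        # Exactly one of the three groups is set for any successful match.
        keys.append(next(g for g in match.groups() if g is not None))
        pos = match.end()
    return keys
```

Bracket notation exists so that keys containing dots, such as "encoding.InjectHex", are kept as a single key instead of being split on the dot.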

src.asqi.score_card_engine.get_nested_value(data: Dict[str, Any], path: str) → Tuple[Any, str | None]

Extract a nested value from a dictionary using dot/bracket notation.

Args:

data: Dictionary to extract value from
path: Path to the nested value (e.g., 'a.b.c' or 'a["key.with.dots"].c')

Returns:

Tuple of (value, error_message). If successful, error_message is None. If failed, value is None and error_message describes the issue.
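
The (value, error_message) return convention can be illustrated with a short traversal. This is a sketch under stated assumptions, not the module's code: get_by_keys is a hypothetical helper that takes an already parsed key list instead of a path string, and the error message wording is invented for the example.

```python
from typing import Any, Dict, List, Optional, Tuple


def get_by_keys(data: Dict[str, Any], keys: List[str]) -> Tuple[Any, Optional[str]]:
    """Walk `data` along `keys`; return (value, None) or (None, error message)."""
    current: Any = data
    for depth, key in enumerate(keys):
        if not isinstance(current, dict):
            # Hypothetical message format; the real wording is not documented.
            return None, f"cannot descend into non-dict value below {keys[:depth]}"
        if key not in current:
            return None, f"key {key!r} not found below {keys[:depth]}"
        current = current[key]
    return current, None


# Equivalent to the documented path 'a["key.with.dots"].c'
value, error = get_by_keys({"a": {"key.with.dots": {"c": 3}}}, ["a", "key.with.dots", "c"])
missing_value, missing_error = get_by_keys({"a": {}}, ["a", "b"])
```

Returning an error string instead of raising lets a caller record a per-result failure and continue with the remaining results.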

class src.asqi.score_card_engine.ScoreCardEvaluationResult(indicator_id: str, indicator_name: str | None, test_id: str)

Result of evaluating a single score_card indicator.

indicator_id
indicator_name
test_id
outcome: str | None = None
metric_value: Any | None = None
test_result_id: str | None = None
sut_name: str | None = None
computed_value: int | float | bool | None = None
details: str = ''
description: str | None = None
notes: str | None = None
error: str | None = None
to_dict() → Dict[str, Any]

Convert to dictionary for JSON serialization.

class src.asqi.score_card_engine.ScoreCardEngine

Core score_card evaluation engine.

filter_results_by_test_id(test_results: List[asqi.workflow.TestExecutionResult], target_test_id: str) → List[asqi.workflow.TestExecutionResult]

Filter test results to only include those with the specified test id.

Args:

test_results: List of test execution results to filter
target_test_id: Name of test to filter for

Returns:

Filtered list of test results matching the target test id
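
The filtering step amounts to an equality match on the test id. The sketch below uses a stand-in record, because the fields of asqi.workflow.TestExecutionResult are not listed in this section: FakeResult, its test_id and sut_name attributes, and filter_by_test_id are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FakeResult:
    """Stand-in for a test execution result (assumed shape, not the asqi class)."""
    test_id: str
    sut_name: str


def filter_by_test_id(results: List[FakeResult], target_test_id: str) -> List[FakeResult]:
    """Keep only the results whose test_id equals target_test_id."""
    return [r for r in results if r.test_id == target_test_id]


results = [
    FakeResult("toxicity_scan", "model_a"),
    FakeResult("latency_check", "model_a"),
    FakeResult("toxicity_scan", "model_b"),
]
matching = filter_by_test_id(results, "toxicity_scan")
```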

validate_scorecard_test_ids(test_results: List[asqi.workflow.TestExecutionResult], score_card: asqi.schemas.ScoreCard) → None

Check that the score card indicators are applicable to the available test results.

Audit indicators do not reference test_ids and are ignored here.

extract_metric_values(test_results: List[asqi.workflow.TestExecutionResult], metric_path: str) → List[Any]

Extract metric values from test results using the specified path.

Supports both flat and nested metric access:

- Flat: 'success', 'score'
- Nested: 'vulnerability_stats.Toxicity.overall_pass_rate'
- Bracket notation: 'probe_results["encoding.InjectHex"]["encoding.DecodeMatch"].passed'

Args:

test_results: List of test execution results
metric_path: Path to metric within test results (supports dot and bracket notation)

Returns:

List of extracted metric values

resolve_metric_or_expression(test_result: asqi.workflow.TestExecutionResult, metric_config: str | asqi.schemas.MetricExpression) → Tuple[int | float | None, str | None]

Resolve a metric configuration (simple path or expression object).

Args:

test_result: Test execution result containing metric data
metric_config: Either a simple metric path string or MetricExpression object

Returns:

Tuple of (resolved_value, error_message). If successful, error_message is None.

apply_condition_to_value(value: Any, condition: str, threshold: int | float | None = None) → Tuple[bool, str]

Apply the specified condition to a single value.

Args:

value: Value to evaluate
condition: Condition to apply (e.g., 'equal_to', 'greater_than')
threshold: Threshold value for comparison (required for most conditions)

Returns:

Tuple of (condition_met, description)
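
A condition check of this shape can be sketched as a lookup from condition name to comparison operator. The example covers only the two condition names mentioned above; the function name apply_condition_sketch, the description format, and the choice to raise ValueError for an unknown condition or a missing threshold are assumptions, not documented behavior.

```python
import operator
from typing import Any, Callable, Dict, Optional, Tuple, Union

# Only the two condition names given in the docstring; the real set may be larger.
_OPERATORS: Dict[str, Callable[[Any, Any], bool]] = {
    "equal_to": operator.eq,
    "greater_than": operator.gt,
}


def apply_condition_sketch(
    value: Any,
    condition: str,
    threshold: Optional[Union[int, float]] = None,
) -> Tuple[bool, str]:
    """Return (condition_met, description) for a single value."""
    if condition not in _OPERATORS:
        # Assumed behavior: reject unknown condition names.
        raise ValueError(f"unsupported condition: {condition!r}")
    if threshold is None:
        # Assumed behavior: both conditions shown here need a threshold.
        raise ValueError(f"condition {condition!r} requires a threshold")
    met = bool(_OPERATORS[condition](value, threshold))
    # Hypothetical description format.
    return met, f"{value!r} {condition} {threshold!r}: {met}"


passed, description = apply_condition_sketch(0.9, "greater_than", 0.8)
```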

evaluate_indicator(test_results: List[asqi.workflow.TestExecutionResult], indicator: asqi.schemas.ScoreCardIndicator) → List[ScoreCardEvaluationResult]

Evaluate a single score_card indicator against individual test results.

Args:

test_results: List of test execution results to evaluate
indicator: Score card indicator configuration

Returns:

List of evaluation results for each matching test

evaluate_audit_indicator(indicator: asqi.schemas.AuditScoreCardIndicator, audit_responses: asqi.schemas.AuditResponses | None = None, available_suts: Set[str] | None = None) → List[ScoreCardEvaluationResult]

Convert manual audit responses for a single audit indicator into evaluation results.

evaluate_scorecard(test_results: List[asqi.workflow.TestExecutionResult], score_card: asqi.schemas.ScoreCard, audit_responses_data: asqi.schemas.AuditResponses | None = None) → List[Dict[str, Any]]

Evaluate a complete grading score_card against test results.

Args:

test_results: List of test execution results to evaluate
score_card: Complete score card configuration
audit_responses_data: User-provided audit responses

Returns:

List of evaluation result dictionaries

Raises:

ValueError: If no indicators match any test ids in the test results