Basic Evaluation Steps
The basic evaluation steps cover trajectory analysis and visualization for Jetty workflows: select_trajectories filters and selects trajectories, and visualize_correlation plots how two trajectory metrics relate to each other.
Available Steps (2)
select_trajectories
Filters and selects trajectories using multiple criteria.
Activity Name: select_trajectories
Use Cases: Trajectory filtering, workflow analysis, performance assessment, quality control
visualize_correlation
Creates correlation visualizations and statistical analysis for trajectory data.
Activity Name: visualize_correlation
Use Cases: Performance analysis, metric correlation, trend visualization, statistical insights
select_trajectories
Selects and filters trajectories by criteria such as labels, status, author, and date range.
Activity Name
select_trajectories
Configuration Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| task_name | string | current task | Workflow name in the format collection/name, or just name |
| filter_by | object | {} | Filtering criteria |
| limit | int | 500 | Maximum number of trajectories to return |
| offset | int | 0 | Number of trajectories to skip |
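To read more trajectories than a single call returns, limit and offset can be combined into pages. The sketch below generates one step config per page. It assumes conventional pagination semantics (offset trajectories are skipped first, then at most limit are returned); the helper `page_configs` and the step names it produces are hypothetical and not part of Jetty.

```python
import json

def page_configs(task_name, total, page_size=500):
    """Yield one select_trajectories step config per page of results."""
    for page, offset in enumerate(range(0, total, page_size)):
        yield {
            "name": f"select_page_{page}",  # hypothetical step name
            "activity": "select_trajectories",
            "config": {
                "task_name": task_name,
                "limit": page_size,  # maximum trajectories returned for this page
                "offset": offset,    # trajectories skipped before this page
            },
        }

# Three pages are needed to cover 1200 trajectories at 500 per page.
configs = list(page_configs("my_collection/my_flow", total=1200, page_size=500))
print(json.dumps(configs[-1], indent=2))
```

The last page may return fewer trajectories than limit, so a consumer should not rely on every page being full.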
Filter Criteria
The filter_by object supports these operators:
| Operator | Description | Example |
|---|---|---|
| $in | Value in list | {"labels": {"score": {"$in": ["1", "0"]}}} |
| $eq | Equal to | {"status": {"$eq": "completed"}} |
| $ne | Not equal to | {"status": {"$ne": "failed"}} |
| $exists | Field exists | {"labels": {"$exists": true}} |
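One way to reason about a filter_by object is to evaluate it locally against a sample trajectory before running the step. The following is a minimal sketch of such a matcher, not Jetty's implementation. It assumes that trajectories are plain nested dictionaries, that a bare value is shorthand for equality, and that a missing field satisfies $ne; the functions `matches` and `_check` are hypothetical.

```python
def _check(op, arg, present, value):
    """Evaluate a single operator against a field value."""
    if op == "$in":
        return present and value in arg
    if op == "$eq":
        return present and value == arg
    if op == "$ne":
        return value != arg  # a missing field counts as "not equal" (assumption)
    if op == "$exists":
        return present == bool(arg)
    raise ValueError(f"unsupported operator: {op}")

def matches(record, criteria):
    """Return True if `record` satisfies every condition in `criteria`."""
    for field, cond in criteria.items():
        present = isinstance(record, dict) and field in record
        value = record[field] if present else None
        if not isinstance(cond, dict):
            # Bare value: treated as shorthand for equality (assumption).
            if not present or value != cond:
                return False
        elif any(key.startswith("$") for key in cond):
            # Operator object: every operator in it must hold.
            if not all(_check(op, arg, present, value) for op, arg in cond.items()):
                return False
        elif not present or not matches(value, cond):
            # Plain object: descend into the nested field.
            return False
    return True

trajectory = {"status": "completed", "labels": {"score": "1"}}
print(matches(trajectory, {"labels": {"score": {"$in": ["1", "0"]}}}))
print(matches(trajectory, {"status": {"$ne": "failed"}}))
```

Checking a filter this way against a handful of known trajectories can catch mistakes such as comparing a string label to a number.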
Examples
Basic Selection
{
  "name": "get_trajectories",
  "activity": "select_trajectories",
  "config": {
    "task_name": "my_collection/my_flow",
    "limit": 100
  }
}
Filter by Labels
{
  "name": "filter_by_score",
  "activity": "select_trajectories",
  "config": {
    "task_name": "jettyio/evaluation_flow",
    "filter_by": {
      "labels": {"human_score": {"$in": ["1", "0"]}}
    },
    "limit": 500
  }
}
Filter by Status
{
  "name": "get_completed",
  "activity": "select_trajectories",
  "config": {
    "filter_by": {
      "status": "completed"
    },
    "limit": 1000
  }
}
Output Structure
{
  "outputs": {
    "selected_trajectories": [
      {
        "trajectory_id": "abc123",
        "collection": "my_collection",
        "name": "my_flow",
        "folder": "0001",
        "status": "completed"
      }
    ],
    "total_found": 50,
    "total_available": 500,
    "filter_criteria": {...}
  }
}
visualize_correlation
Creates matplotlib visualizations for correlation analysis between trajectory metrics.
Activity Name
visualize_correlation
Configuration Parameters
| Parameter | Type | Description |
|---|---|---|
| trajectory_path | string | Reference to trajectory list from previous step |
| x | string | Path expression for x-axis values |
| y | string | Path expression for y-axis values |
Path Expressions
Use dot notation to walk nested fields, and a bracketed index such as [0] to pick an item from a list:
| Expression | Description |
|---|---|
| labels[0].value | First label's value |
| steps.step_name.outputs.score | Step output value |
| init_params.field | Init parameter |
| status | Trajectory status |
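To make the path syntax concrete, here is a minimal sketch of a resolver that handles the dotted names and bracketed indexes shown in the table. It is an illustration only: how Jetty actually resolves paths, including what it does when a segment is missing, is not specified here. The function `resolve` is hypothetical, and returning None for a missing segment is an assumption.

```python
import re

# A path segment is either a field name or a bracketed list index such as [0].
_TOKEN = re.compile(r"([^.\[\]]+)|\[(\d+)\]")

def resolve(obj, path):
    """Walk `path` through nested dicts and lists; return None if a segment is missing."""
    for name, index in _TOKEN.findall(path):
        try:
            obj = obj[int(index)] if index else obj[name]
        except (KeyError, IndexError, TypeError):
            return None  # assumption: a missing segment yields no value
    return obj

# Hypothetical trajectory shaped to match the expressions in the table.
trajectory = {
    "status": "completed",
    "labels": [{"value": "1"}],
    "steps": {"judge": {"outputs": {"results": [{"score": 0.75}]}}},
}

print(resolve(trajectory, "labels[0].value"))
print(resolve(trajectory, "steps.judge.outputs.results[0].score"))
print(resolve(trajectory, "init_params.field"))
```

Running the same expression over a few sample trajectories is a quick way to spot paths that resolve for some trajectories but not others.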
Examples
Correlation Plot
{
  "name": "plot_correlation",
  "activity": "visualize_correlation",
  "config": {
    "trajectory_path": "filter_trajectories.outputs.selected_trajectories",
    "x": "labels[0].value",
    "y": "steps.evaluator.outputs.score"
  }
}
Human vs Model Scores
{
  "name": "human_model_correlation",
  "activity": "visualize_correlation",
  "config": {
    "trajectory_path": "select_labeled.outputs.selected_trajectories",
    "x": "labels.human_score.value",
    "y": "steps.judge.outputs.results[0].score"
  }
}
Output Structure
{
  "outputs": {
    "correlation_data": [
      {
        "trajectory_id": "abc123",
        "collection": "my_collection",
        "folder": "0001",
        "name": "my_flow",
        "x": 0.8,
        "y": 0.75
      }
    ],
    "total_processed": 50,
    "errors": [],
    "x_max": 1.0,
    "y_max": 0.95,
    "x_min": 0.2,
    "y_min": 0.1,
    "scatter_plot": {
      "path": "collection/flow/0001/visualize_correlation_0.png",
      "extension": "png"
    },
    "csv_file": {
      "path": "collection/flow/0001/visualize_correlation_1.csv",
      "extension": "csv"
    }
  }
}
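The output shown above lists the paired x and y values but no correlation coefficient, so a downstream consumer may want to compute one from correlation_data. The sketch below calculates Pearson's r in plain Python. The sample points are made-up values for illustration, and the function `pearson_r` is hypothetical rather than part of the step.

```python
import math

def pearson_r(points):
    """Pearson correlation of the x and y fields; None if it is undefined."""
    xs = [p["x"] for p in points]
    ys = [p["y"] for p in points]
    n = len(xs)
    if n < 2:
        return None
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return None  # one axis is constant, so the correlation is undefined
    return sxy / math.sqrt(sxx * syy)

# Made-up entries in the shape of correlation_data (only x and y are used).
correlation_data = [
    {"trajectory_id": "a", "x": 0.2, "y": 0.1},
    {"trajectory_id": "b", "x": 0.5, "y": 0.4},
    {"trajectory_id": "c", "x": 0.8, "y": 0.75},
    {"trajectory_id": "d", "x": 1.0, "y": 0.95},
]
print(round(pearson_r(correlation_data), 3))
```

Returning None for fewer than two points or a constant axis avoids a division by zero and makes the undefined case explicit to the caller.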
Generated Artifacts
| File | Description |
|---|---|
| Scatter plot (PNG) | Visual correlation plot with data points |
| CSV file | Raw correlation data for further analysis |
Common Patterns
Evaluation Pipeline
{
  "steps": ["select", "correlate"],
  "step_configs": {
    "select": {
      "activity": "select_trajectories",
      "task_name": "my_collection/evaluation_flow",
      "filter_by": {
        "labels": {"reviewed": {"$eq": "true"}}
      },
      "limit": 500
    },
    "correlate": {
      "activity": "visualize_correlation",
      "trajectory_path": "select.outputs.selected_trajectories",
      "x": "labels.human_rating.value",
      "y": "steps.auto_scorer.outputs.score"
    }
  }
}
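In a pipeline like this, the trajectory_path of the visualization step has to name a step that produces selected_trajectories. A small pre-flight check can catch a misspelled or misordered step name before the workflow runs. The sketch below assumes that steps run in the order listed and that the first segment of trajectory_path is the producing step's name; `check_references` is a hypothetical helper, not a Jetty function.

```python
def check_references(pipeline):
    """Return a list of problems found in trajectory_path references."""
    problems = []
    seen = []  # steps that appear earlier in the "steps" list
    for step in pipeline["steps"]:
        config = pipeline["step_configs"].get(step)
        if config is None:
            problems.append(f"{step}: no entry in step_configs")
        else:
            path = config.get("trajectory_path")
            if path is not None:
                source = path.split(".", 1)[0]
                if source not in seen:
                    problems.append(f"{step}: '{source}' does not run earlier")
        seen.append(step)
    return problems

pipeline = {
    "steps": ["select", "correlate"],
    "step_configs": {
        "select": {
            "activity": "select_trajectories",
            "task_name": "my_collection/evaluation_flow",
            "filter_by": {"labels": {"reviewed": {"$eq": "true"}}},
            "limit": 500,
        },
        "correlate": {
            "activity": "visualize_correlation",
            "trajectory_path": "select.outputs.selected_trajectories",
            "x": "labels.human_rating.value",
            "y": "steps.auto_scorer.outputs.score",
        },
    },
}
print(check_references(pipeline))
```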
Multi-Flow Comparison
{
  "steps": ["select_v1", "select_v2", "compare_v1", "compare_v2"],
  "step_configs": {
    "select_v1": {
      "activity": "select_trajectories",
      "task_name": "my_collection/model_v1",
      "filter_by": {"status": "completed"},
      "limit": 200
    },
    "select_v2": {
      "activity": "select_trajectories",
      "task_name": "my_collection/model_v2",
      "filter_by": {"status": "completed"},
      "limit": 200
    },
    "compare_v1": {
      "activity": "visualize_correlation",
      "trajectory_path": "select_v1.outputs.selected_trajectories",
      "x": "init_params.difficulty",
      "y": "steps.evaluate.outputs.score"
    },
    "compare_v2": {
      "activity": "visualize_correlation",
      "trajectory_path": "select_v2.outputs.selected_trajectories",
      "x": "init_params.difficulty",
      "y": "steps.evaluate.outputs.score"
    }
  }
}
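Because each flow in this pattern gets the same pair of steps, the configuration can be generated rather than written by hand, which keeps the path expressions identical across flows. The sketch below is one way to do that; `comparison_pipeline` is a hypothetical helper, and the output simply reproduces the structure shown above.

```python
import json

def comparison_pipeline(flows, x, y, limit=200):
    """Build a select step and a visualize step for each flow.

    `flows` maps a short label (used in the step names) to a task_name.
    """
    selects, compares, configs = [], [], {}
    for label, task_name in flows.items():
        select, compare = f"select_{label}", f"compare_{label}"
        configs[select] = {
            "activity": "select_trajectories",
            "task_name": task_name,
            "filter_by": {"status": "completed"},
            "limit": limit,
        }
        configs[compare] = {
            "activity": "visualize_correlation",
            "trajectory_path": f"{select}.outputs.selected_trajectories",
            "x": x,
            "y": y,
        }
        selects.append(select)
        compares.append(compare)
    # All selections are listed first, then all comparisons.
    return {"steps": selects + compares, "step_configs": configs}

pipeline = comparison_pipeline(
    {"v1": "my_collection/model_v1", "v2": "my_collection/model_v2"},
    x="init_params.difficulty",
    y="steps.evaluate.outputs.score",
)
print(json.dumps(pipeline, indent=2))
```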
Best Practices
Trajectory Selection
- Apply filters early to reduce data volume
- Use appropriate limits to avoid memory issues
- Validate filter criteria with small samples first
Correlation Analysis
- Ensure x and y values are numeric or can be converted to numbers
- Check for trajectories where a path expression resolves to no value
- Use consistent path expressions across trajectories
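The first two points can be checked on extracted values before plotting. The sketch below separates rows whose x and y convert cleanly to floats from rows that do not, so the skipped rows can be inspected. The row shape (dictionaries with x and y keys) and the helper `numeric_pairs` are assumptions for illustration.

```python
import math

def numeric_pairs(rows):
    """Split rows into (x, y) float pairs and the rows that could not be converted."""
    kept, skipped = [], []
    for row in rows:
        try:
            pair = (float(row["x"]), float(row["y"]))
        except (KeyError, TypeError, ValueError):
            skipped.append(row)  # missing key, None, or non-numeric text
            continue
        if any(math.isnan(v) for v in pair):
            skipped.append(row)  # NaN converts to float but cannot be plotted meaningfully
            continue
        kept.append(pair)
    return kept, skipped

rows = [
    {"x": "0.8", "y": 0.75},  # numeric string converts cleanly
    {"x": None, "y": 0.5},    # missing value
    {"x": "n/a", "y": 0.1},   # non-numeric text
    {"y": 0.3},               # x was never extracted
]
kept, skipped = numeric_pairs(rows)
print(kept)
print(len(skipped))
```

A large share of skipped rows usually points to a path expression that does not match the trajectory structure rather than to bad data.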
Performance
- Limit trajectory selection to necessary data
- Consider sampling for very large datasets
- Cache visualization outputs for repeated analysis
Related Steps
- Simple Judge - Evaluate trajectory outputs
- Extract From Trajectories - Extract data from trajectories
- List Emit Await - Generate trajectories for analysis