[Pipeline] Build pipeline with conditional step inclusion #331

@ArthurCRodrigues

Description

Currently, the AutograderPipeline is built with a mostly fixed set of steps, and each step (such as SandboxStep or AiBatchStep) decides inside its own execute phase whether it has any work to do. This leads to:

  1. Bloated Logs: Logs show "Executing step: SANDBOX" even for assignments that don't need one.
  2. Unnecessary Result Objects: PipelineExecution is filled with successful but empty StepResult objects.
  3. Performance Overhead: Minor overhead of initializing and calling execute on redundant steps.

The goal is to move this decision logic to the pipeline building phase, ensuring that the pipeline is constructed only with the steps strictly required for the given assignment.

Proposed Implementation Plan

  1. Refactor Template Loading:

    • Modify build_pipeline in autograder/autograder/autograder.py to load templates (using TemplateLibraryService) before initializing the StepRegistry.
    • This allows the builder to inspect template metadata (e.g., requires_sandbox).
  2. Update StepRegistry for Conditional Building:

    • Update StepRegistry.__init__ to accept the loaded templates.
    • Modify build_step (and its internal _build_* methods) to return None if a step is not required.
      • SandboxStep: Only return if any(t.requires_sandbox for t in templates).
      • AiBatchStep: Only return if grading_criteria or templates contain AiTestFunction nodes.
      • PreFlightStep: Only return if setup_config contains actual requirements.
  3. Update TemplateLoaderStep:

    • Refactor TemplateLoaderStep to take the already-loaded templates and simply attach them to the PipelineExecution. This avoids double loading while maintaining the contract that templates are available in the execution context.
  4. Clean up Individual Steps:

    • Remove the "skip" logic from SandboxStep, AiBatchStep, etc. Since these steps will only be added to the pipeline when necessary, they can assume their execution is always required.
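
A minimal sketch of the plan above, to make the intended shape concrete. It is not the project's actual code: only `StepRegistry`, `build_step`, `build_pipeline`, `requires_sandbox`, `setup_config`, and the `SANDBOX` step name come from this issue. The `Template` and `Step` classes, the `uses_ai` flag (standing in for "contains AiTestFunction nodes"), the `ai_in_criteria` parameter, the other step names, and the step order are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Template:
    # Hypothetical stand-in for a loaded template.
    name: str
    requires_sandbox: bool = False
    uses_ai: bool = False  # assumption: proxy for "contains AiTestFunction nodes"


@dataclass
class Step:
    # Hypothetical stand-in for a pipeline step; real steps would carry behavior.
    name: str


class StepRegistry:
    """Builds steps and returns None for any step the assignment does not need."""

    def __init__(self, templates, ai_in_criteria=False, setup_config=None):
        self.templates = list(templates)
        self.ai_in_criteria = ai_in_criteria  # assumption: derived from grading_criteria
        self.setup_config = setup_config or {}
        self._builders = {
            "PRE_FLIGHT": self._build_pre_flight,
            "TEMPLATE_LOADER": self._build_template_loader,
            "SANDBOX": self._build_sandbox,
            "AI_BATCH": self._build_ai_batch,
        }

    def build_step(self, name: str) -> Optional[Step]:
        return self._builders[name]()

    def _build_pre_flight(self) -> Optional[Step]:
        # Assumption: an empty setup_config means "no actual requirements".
        return Step("PRE_FLIGHT") if self.setup_config else None

    def _build_template_loader(self) -> Optional[Step]:
        # Always included: it only attaches the already-loaded templates.
        return Step("TEMPLATE_LOADER")

    def _build_sandbox(self) -> Optional[Step]:
        needed = any(t.requires_sandbox for t in self.templates)
        return Step("SANDBOX") if needed else None

    def _build_ai_batch(self) -> Optional[Step]:
        needed = self.ai_in_criteria or any(t.uses_ai for t in self.templates)
        return Step("AI_BATCH") if needed else None


def build_pipeline(templates, ai_in_criteria=False, setup_config=None):
    # Templates are loaded before the registry exists, so it can inspect them.
    registry = StepRegistry(templates, ai_in_criteria, setup_config)
    order = ["PRE_FLIGHT", "TEMPLATE_LOADER", "SANDBOX", "AI_BATCH"]  # assumed order
    built = (registry.build_step(name) for name in order)
    # Drop the steps that the registry declined to build.
    return [step for step in built if step is not None]
```

With this shape, an assignment whose templates need neither a sandbox nor AI yields a pipeline containing only the template loader, so no empty StepResult objects or "Executing step: SANDBOX" log lines would be produced for it. Filtering out `None` in one place keeps the individual steps free of skip logic.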

Expected Outcome

  • Assignments that don't require a sandbox or AI will have a significantly shorter and cleaner pipeline execution history.
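  • The pipeline architecture becomes more explicit and efficient: the built step list reflects exactly what will run, and redundant steps are never initialized or executed.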
