diff --git a/.agents/README.md b/.agents/README.md new file mode 100644 index 000000000..ef04713ab --- /dev/null +++ b/.agents/README.md @@ -0,0 +1,161 @@ +# Weaver Agent Skills + +This directory contains [Agent Skills](https://agentskills.io/) - a standardized format for describing capabilities that +can be discovered and used by AI agents, IDEs, and automated systems. + +## What are Agent Skills? + +Agent Skills provide a structured way to document and expose Weaver's capabilities in a format that LLMs and AI agents +can easily understand and utilize. Each skill is self-contained with: + +- **Frontmatter metadata** (YAML) describing the skill +- **Markdown documentation** explaining usage +- **Optional supporting files** (scripts, references, assets) + +## Directory Structure + +``` +.agents/ +└── skills/ + ├── weaver-skill-create/ + │ └── SKILL.md + ├── process-deploy/ + │ └── SKILL.md + ├── job-monitor/ + │ └── SKILL.md + └── ... +``` + +## Project Structure & Skill Integration + +For a complete overview of the Weaver project structure and how Agent Skills integrate with the codebase, see [/AGENTS.md](/AGENTS.md). 
+ +### Quick Reference: Skills to Code Mapping + +| Skill Category | Code Location | Interface | +| --------------------------------------- | ----------------------- | -------------------- | +| **job-**, **process-**, **provider-** | `weaver/cli.py` | CLI commands | +| **API skills** | `weaver/wps_restapi/` | REST endpoints | +| **process-** | `weaver/processes/` | Process operations | +| **cwl-** | `weaver/` | CWL support | + +## Available Skills + +All skills are organized by category for easy discovery: + +### API Information + +- **[api-conformance](skills/api-conformance/SKILL.md)** - Check OGC standards conformance +- **[api-info](skills/api-info/SKILL.md)** - Get API metadata and endpoints +- **[api-version](skills/api-version/SKILL.md)** - Get Weaver version information + +### CWL Comprehension + +- **[cwl-create-commandlinetool](skills/cwl-create-commandlinetool/SKILL.md)** - Create CWL CommandLineTool packages +- **[cwl-debug-package](skills/cwl-debug-package/SKILL.md)** - Debug CWL package deployment and execution issues +- **[cwl-optimize-performance](skills/cwl-optimize-performance/SKILL.md)** - Optimize CWL performance and resource usage +- **[cwl-understand-builtin](skills/cwl-understand-builtin/SKILL.md)** - Use Weaver's built-in utility processes +- **[cwl-understand-docker](skills/cwl-understand-docker/SKILL.md)** - Master Docker requirements in CWL packages +- **[cwl-understand-workflow](skills/cwl-understand-workflow/SKILL.md)** - Create multi-step CWL workflows +- **[cwl-use-expressions](skills/cwl-use-expressions/SKILL.md)** - Use JavaScript expressions for dynamic behavior +- **[cwl-validate-package](skills/cwl-validate-package/SKILL.md)** - Validate CWL syntax before deployment + +### Job Operations + +- **[job-dismiss](skills/job-dismiss/SKILL.md)** - Cancel running or pending jobs +- **[job-exceptions](skills/job-exceptions/SKILL.md)** - Get detailed error information +- **[job-execute](skills/job-execute/SKILL.md)** - Run processes with 
inputs (async/sync) +- **[job-inputs](skills/job-inputs/SKILL.md)** - Retrieve job input parameters +- **[job-list](skills/job-list/SKILL.md)** - List jobs with filtering and pagination +- **[job-logs](skills/job-logs/SKILL.md)** - View execution logs for debugging +- **[job-monitor](skills/job-monitor/SKILL.md)** - Wait for job completion with polling +- **[job-provenance](skills/job-provenance/SKILL.md)** - Get W3C PROV lineage metadata +- **[job-results](skills/job-results/SKILL.md)** - Retrieve output results +- **[job-statistics](skills/job-statistics/SKILL.md)** - Retrieve resource usage metrics +- **[job-status](skills/job-status/SKILL.md)** - Check job execution status + +### Process Management + +- **[process-deploy](skills/process-deploy/SKILL.md)** - Deploy CWL application packages +- **[process-describe](skills/process-describe/SKILL.md)** - Get process details and capabilities +- **[process-package](skills/process-package/SKILL.md)** - Retrieve CWL package definitions from a deployed process +- **[process-list](skills/process-list/SKILL.md)** - Discover available processes +- **[process-undeploy](skills/process-undeploy/SKILL.md)** - Remove deployed processes + +### Provider Management + +- **[provider-list](skills/provider-list/SKILL.md)** - List all registered providers +- **[provider-register](skills/provider-register/SKILL.md)** - Register remote WPS/OGC services +- **[provider-unregister](skills/provider-unregister/SKILL.md)** - Remove provider registrations + +### Setup Operations + +- **[weaver-install](skills/weaver-install/SKILL.md)** - Install and configure Weaver (Docker or from source) +- **[weaver-ci-validate](skills/weaver-ci-validate/SKILL.md)** - Run code test and lint checks with Makefile targets +- **[weaver-skill-create](skills/weaver-skill-create/SKILL.md)** - Create new Agent Skills +- **[weaver-skills-update](skills/weaver-skills-update/SKILL.md)** - Maintain and update skills documentation + +### Vault Operations + +- 
**[vault-upload](skills/vault-upload/SKILL.md)** - Store sensitive data securely + +## Using These Skills + +### For AI Agents + +AI agents can read the SKILL.md files to understand: + +1. When to use each capability +2. What parameters are required +3. How to format requests +4. What to expect in responses + +### For IDEs (PyCharm, VS Code) + +Configure your IDE to recognize these skills for autocomplete and suggestions: + +**PyCharm / JetBrains**: Add to `~/.config/github-copilot/intellij/mcp.json`: + +```json +{ + "servers": { + "weaver": { + "type": "filesystem", + "path": "/path/to/weaver/.agents/skills" + } + } +} +``` + +**VS Code**: Add to `.vscode/settings.json`: + +```json +{ + "github.copilot.advanced": { + "contextFiles": [ + "${workspaceFolder}/.agents/skills/**/*.md" + ] + } +} +``` + +## Creating New Skills + +For detailed guidance on creating new Agent Skills, see [weaver-skill-create](skills/weaver-skill-create/SKILL.md). +This includes naming conventions, metadata requirements, structure, examples, and best practices. + +## Skill Metadata + +Each SKILL.md file contains YAML frontmatter with metadata. For complete metadata documentation and best practices, +see [weaver-skill-create](skills/weaver-skill-create/SKILL.md). 
+ +## Additional Resources + +- **Weaver Docs**: [https://pavics-weaver.readthedocs.io/](https://pavics-weaver.readthedocs.io/) +- **Agent Skills Spec**: [https://agentskills.io/specification](https://agentskills.io/specification) + +## Support + +- **Issues**: [https://github.com/crim-ca/weaver/issues](https://github.com/crim-ca/weaver/issues) +- **Discussions**: [https://github.com/crim-ca/weaver/discussions](https://github.com/crim-ca/weaver/discussions) +- **Documentation**: [https://pavics-weaver.readthedocs.io/](https://pavics-weaver.readthedocs.io/) diff --git a/.agents/skills/api-conformance/SKILL.md b/.agents/skills/api-conformance/SKILL.md new file mode 100644 index 000000000..0500824a2 --- /dev/null +++ b/.agents/skills/api-conformance/SKILL.md @@ -0,0 +1,194 @@ +--- +name: api-conformance +description: | + Retrieve the OGC API - Processes conformance classes that the Weaver instance implements. + Shows which parts of the OGC standard are supported. Use to verify compliance and feature availability. +license: Apache-2.0 +compatibility: Requires Weaver API access. +metadata: + category: system-information + version: 1.0.0 + author: CRIM +allowed-tools: http_request +--- + +# Check Conformance + +Retrieve OGC API - Processes conformance classes implemented by Weaver. + +## When to Use + +- Verifying OGC standards compliance +- Checking which features are supported +- Validating integration compatibility +- Testing interoperability +- Documenting system capabilities + +## Parameters + +None required. 
+ +## CLI Usage + +```bash +# Get conformance classes +weaver conformance -u $WEAVER_URL + +# Check specific feature support +weaver conformance -u $WEAVER_URL | grep -i "deploy" +``` + +## Python Usage + +```python +from weaver.cli import WeaverClient + +client = WeaverClient(url="https://weaver.example.com") + +# Get conformance +conformance = client.conformance() + +# Check supported conformance classes +for uri in conformance.body.get("conformsTo", []): + print(uri) + +# Check specific feature +if "http://www.opengis.net/spec/ogcapi-processes-2/1.0/conf/deploy-replace-undeploy" in conformance.body["conformsTo"]: + print("Supports process deployment") +``` + +## API Request + +```bash +curl -X GET \ + "${WEAVER_URL}/conformance" +``` + +## Returns + +```json +{ + "conformsTo": [ + "http://www.opengis.net/spec/ogcapi-processes-1/1.0/conf/core", + "http://www.opengis.net/spec/ogcapi-processes-1/1.0/conf/ogc-process-description", + "http://www.opengis.net/spec/ogcapi-processes-1/1.0/conf/json", + "http://www.opengis.net/spec/ogcapi-processes-1/1.0/conf/job-list", + "http://www.opengis.net/spec/ogcapi-processes-2/1.0/conf/deploy-replace-undeploy", + "http://www.opengis.net/spec/ogcapi-processes-3/0.0/conf/workflows", + "http://www.opengis.net/spec/ogcapi-processes-4/1.0/conf/job-management" + ] +} +``` + +**Note**: Response may include additional fields. See +[API documentation](https://pavics-weaver.readthedocs.io/en/latest/api.html) for complete response schemas. 
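
The conformance URIs in the sample response encode the specification part, its version, and the class name. As a minimal sketch, the helper below groups them by part number. It runs against a copy of the sample payload rather than a live server; the `group_by_part` name and the URI pattern are assumptions made for illustration and are not part of Weaver's client.

```python
import re
from collections import defaultdict

# Sample payload copied from the "Returns" example (not a live response).
SAMPLE = {
    "conformsTo": [
        "http://www.opengis.net/spec/ogcapi-processes-1/1.0/conf/core",
        "http://www.opengis.net/spec/ogcapi-processes-1/1.0/conf/ogc-process-description",
        "http://www.opengis.net/spec/ogcapi-processes-1/1.0/conf/json",
        "http://www.opengis.net/spec/ogcapi-processes-1/1.0/conf/job-list",
        "http://www.opengis.net/spec/ogcapi-processes-2/1.0/conf/deploy-replace-undeploy",
        "http://www.opengis.net/spec/ogcapi-processes-3/0.0/conf/workflows",
        "http://www.opengis.net/spec/ogcapi-processes-4/1.0/conf/job-management",
    ]
}

# Assumed URI layout: .../spec/ogcapi-processes-<part>/<version>/conf/<class>
CONF_PATTERN = re.compile(r"/spec/ogcapi-processes-(\d+)/([^/]+)/conf/(.+)$")


def group_by_part(conforms_to):
    """Map each OGC API - Processes part number to its sorted class names."""
    grouped = defaultdict(list)
    for uri in conforms_to:
        match = CONF_PATTERN.search(uri)
        if match:  # URIs from other specifications are skipped
            part, _version, name = match.groups()
            grouped[int(part)].append(name)
    return {part: sorted(names) for part, names in sorted(grouped.items())}


if __name__ == "__main__":
    for part, names in group_by_part(SAMPLE["conformsTo"]).items():
        print(f"Part {part}: {', '.join(names)}")
```

With a live instance, the same function could be given `client.conformance().body.get("conformsTo", [])` instead of the sample list.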
+ +## Conformance Classes + +### OGC API - Processes Part 1: Core + +- **core**: Basic process execution +- **ogc-process-description**: Standard process descriptions +- **json**: JSON encoding support +- **job-list**: Job listing capability + +### OGC API - Processes Part 2: Deploy, Replace, Undeploy (DRU) + +- **deploy-replace-undeploy**: Dynamic process deployment +- **ogcapppkg**: OGC Application Package support +- **cwl**: Common Workflow Language support + +### OGC API - Processes Part 3: Workflows and Chaining + +- **workflows**: Workflow execution support +- **chaining**: Process chaining capabilities + +### OGC API - Processes Part 4: Job Management + +- **job-management**: Enhanced job operations +- **job-callback**: Notification callbacks +- **job-dismiss**: Job cancellation + +## Feature Detection + +### Check Process Deployment Support + +```python +conformance = client.conformance() +conforms_to = conformance.body.get("conformsTo", []) + +has_deployment = any("deploy" in uri for uri in conforms_to) +if has_deployment: + print("This Weaver supports dynamic process deployment") +``` + +### Check Workflow Support + +```bash +if weaver conformance -u $WEAVER_URL | grep -q "workflows"; then + echo "Workflow chaining is supported" +else + echo "Workflow chaining not available" +fi +``` + +### Check Job Management + +```python +conforms_to = client.conformance().body["conformsTo"] + +features = { + "Job Listing": any("job-list" in uri for uri in conforms_to), + "Job Dismissal": any("dismiss" in uri for uri in conforms_to), + "Job Callbacks": any("callback" in uri for uri in conforms_to), +} + +for feature, supported in features.items(): + status = "✓" if supported else "✗" + print(f"{status} {feature}") +``` + +## Use Cases + +### Compatibility Testing + +```python +# Test if client and server are compatible +required_features = [ + "http://www.opengis.net/spec/ogcapi-processes-1/1.0/conf/core", + 
"http://www.opengis.net/spec/ogcapi-processes-1/1.0/conf/json" +] + +conformance = client.conformance() +supported = conformance.body.get("conformsTo", []) + +compatible = all(feature in supported for feature in required_features) +if compatible: + print("Server is compatible with client requirements") +``` + +### Feature Documentation + +```bash +# Generate feature report +echo "Weaver Capabilities Report" +echo "=========================" +weaver version -u $WEAVER_URL +echo "" +echo "Supported OGC Features:" +weaver conformance -u $WEAVER_URL | grep -o 'processes-[0-9]/.*' | sort -u +``` + +## Related Skills + +- [api-info](../api-info/) - Get general API information +- [api-version](../api-version/) - Check version +- [process-list](../process-list/) - See what processes are available +- [process-deploy](../process-deploy/) - Use deployment if supported + +## Documentation + +- [OGC API - Processes Conformance](https://pavics-weaver.readthedocs.io/en/latest/processes.html) +- [Supported Features](https://pavics-weaver.readthedocs.io/en/latest/index.html#implementations) +- [CLI Reference](https://pavics-weaver.readthedocs.io/en/latest/cli.html) +- [OGC API - Processes Specification](https://docs.ogc.org/is/18-062r2/18-062r2.html) diff --git a/.agents/skills/api-info/SKILL.md b/.agents/skills/api-info/SKILL.md new file mode 100644 index 000000000..502570a08 --- /dev/null +++ b/.agents/skills/api-info/SKILL.md @@ -0,0 +1,188 @@ +--- +name: api-info +description: | + Retrieve general API information including server details, supported endpoints, API version, and + contact information. Use to verify Weaver instance availability and get basic service metadata. +license: Apache-2.0 +compatibility: Requires Weaver API access. +metadata: + author: fmigneault +--- + +# Get API Information + +Retrieve general API information and server metadata. 
+ +## When to Use + +- Verifying Weaver instance availability +- Getting API endpoints and capabilities +- Checking server configuration +- Discovering supported features +- Integration and connectivity testing + +## Parameters + +None required - operates on the base API URL. + +## CLI Usage + +```bash +# Get API information +weaver info -u $WEAVER_URL + +# Check if server is responding +weaver info -u https://weaver.example.com +``` + +## Python Usage + +```python +from weaver.cli import WeaverClient + +client = WeaverClient(url="https://weaver.example.com") + +# Get API info +info = client.info() + +print(f"API Title: {info.body.get('title')}") +print(f"Description: {info.body.get('description')}") +print(f"Contact: {info.body.get('contact')}") + +# Check available endpoints +for link in info.body.get("links", []): + print(f"{link['rel']}: {link['href']}") +``` + +## API Request + +```bash +curl -X GET -H "Accept: application/json" \ + "${WEAVER_URL}/" +``` + +## Returns + +```json +{ + "title": "Weaver", + "description": "Weaver: OGC API - Processes with Workflow Capabilities", + "attribution": "© 2020-2026 CRIM", + "type": "application", + "configuration": "HYBRID", + "contact": { + "name": "CRIM", + "url": "https://crim.ca" + }, + "links": [ + { + "rel": "service-desc", + "type": "application/openapi+json;version=3.0", + "title": "OpenAPI definition", + "href": "https://weaver.example.com/api" + }, + { + "rel": "processes", + "type": "application/json", + "title": "List of processes", + "href": "https://weaver.example.com/processes" + }, + { + "rel": "jobs", + "type": "application/json", + "title": "List of jobs", + "href": "https://weaver.example.com/jobs" + }, + { + "rel": "providers", + "type": "application/json", + "title": "List of providers", + "href": "https://weaver.example.com/providers" + } + ] +} +``` + +**Note**: Response may include additional fields. See +[API documentation](https://pavics-weaver.readthedocs.io/en/latest/api.html) for complete response schemas. 
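
Because the landing page may omit some link relations, a client might check which ones are present before relying on them. The sketch below does this on a trimmed copy of the sample response; `missing_link_relations` is a hypothetical helper, and treating every relation listed under "Links" in this skill as expected is an assumption of the example, not a Weaver guarantee.

```python
# Link relations named under "Links" in this skill. Treating all of them as
# expected is an assumption made for this sketch.
EXPECTED_RELS = ("service-desc", "processes", "jobs", "providers", "conformance")

# Trimmed copy of the sample landing page from "Returns" (not a live response).
SAMPLE_BODY = {
    "title": "Weaver",
    "links": [
        {"rel": "service-desc", "href": "https://weaver.example.com/api"},
        {"rel": "processes", "href": "https://weaver.example.com/processes"},
        {"rel": "jobs", "href": "https://weaver.example.com/jobs"},
        {"rel": "providers", "href": "https://weaver.example.com/providers"},
    ],
}


def missing_link_relations(body, expected=EXPECTED_RELS):
    """Return the expected link relations that the landing page body lacks."""
    present = {link.get("rel") for link in body.get("links", [])}
    return [rel for rel in expected if rel not in present]


if __name__ == "__main__":
    missing = missing_link_relations(SAMPLE_BODY)
    if missing:
        print("Missing link relations:", ", ".join(missing))
    else:
        print("All expected link relations are present")
```

With a live instance, `client.info().body` could be passed in place of `SAMPLE_BODY`.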
+ +## Response Fields + +### Server Information + +- **title**: Service title +- **description**: Service description +- **attribution**: Copyright and attribution +- **configuration**: Weaver mode (EMS, ADES, HYBRID) + +### Contact Information + +- **name**: Organization name +- **url**: Organization website +- **email**: Contact email (if provided) + +### Links + +- **service-desc**: OpenAPI specification +- **processes**: Process listing endpoint +- **jobs**: Job listing endpoint +- **providers**: Provider listing endpoint +- **conformance**: Conformance declaration + +## Configuration Modes + +- **EMS**: Execution Management Service (orchestrates remote ADES) +- **ADES**: Application Deployment and Execution Service (local execution) +- **HYBRID**: Both EMS and ADES capabilities + +## Use Cases + +### Health Check + +```bash +# Quick availability check +if weaver info -u $WEAVER_URL > /dev/null 2>&1; then + echo "Weaver is available" +else + echo "Weaver is not responding" +fi +``` + +### Service Discovery + +```python +# Discover available endpoints +info = client.info() + +endpoints = {link["rel"]: link["href"] for link in info.body["links"]} +print(f"Processes endpoint: {endpoints.get('processes')}") +print(f"Jobs endpoint: {endpoints.get('jobs')}") +``` + +### Configuration Check + +```python +# Verify Weaver mode +info = client.info() +config = info.body.get("configuration") + +if config == "HYBRID": + print("This Weaver supports both local and remote execution") +elif config == "EMS": + print("This Weaver orchestrates remote ADES instances") +elif config == "ADES": + print("This Weaver performs local execution only") +``` + +## Related Skills + +- [api-version](../api-version/) - Get detailed version information +- [api-conformance](../api-conformance/) - Check OGC conformance +- [process-list](../process-list/) - Browse available processes +- [provider-list](../provider-list/) - View registered providers + +## Documentation + +- [API 
Root](https://pavics-weaver.readthedocs.io/en/latest/api.html) +- [Configuration](https://pavics-weaver.readthedocs.io/en/latest/configuration.html) +- [CLI Reference](https://pavics-weaver.readthedocs.io/en/latest/cli.html) diff --git a/.agents/skills/api-version/SKILL.md b/.agents/skills/api-version/SKILL.md new file mode 100644 index 000000000..a6179e3e3 --- /dev/null +++ b/.agents/skills/api-version/SKILL.md @@ -0,0 +1,165 @@ +--- +name: api-version +description: | + Retrieve detailed version information including Weaver version, database schema version, and + deployed commit hash. Use for version verification, troubleshooting, and ensuring compatibility. +license: Apache-2.0 +compatibility: Requires Weaver API access. +metadata: + author: fmigneault +--- + +# Get Version + +Retrieve detailed version information for the Weaver instance. + +## When to Use + +- Verifying Weaver version before deployment +- Troubleshooting compatibility issues +- Checking for available updates +- Bug reporting and support requests +- Ensuring feature availability +- Documenting system configuration + +## Parameters + +None required. + +## CLI Usage + +```bash +# Get version information +weaver version -u $WEAVER_URL + +# Check the version of a specific instance +weaver version -u https://weaver.example.com +``` + +## Python Usage + +```python +from weaver.cli import WeaverClient + +client = WeaverClient(url="https://weaver.example.com") + +# Get version (details are nested in the "versions" list, see Returns) +version = client.version() + +print(f"Version: {version.body['versions'][0]['version']}") +print(f"Database: {version.body['versions'][0]['db_version']}") +print(f"Commit: {version.body['versions'][0].get('commit', 'N/A')}") +``` + +## API Request + +```bash +curl -X GET \ + "${WEAVER_URL}/versions" +``` + +## Returns + +```json +{ + "versions": [ + { + "version": "6.8.3", + "type": "api", + "db_version": "3.31.0", + "commit": "abc123def456" + } + ] +} +``` + +**Note**: Response may include additional fields. 
See +[API documentation](https://pavics-weaver.readthedocs.io/en/latest/api.html) for complete response schemas. + +## Version Information + +### version + +Weaver application version (e.g., "6.8.3") + +- Major version: Breaking changes +- Minor version: New features +- Patch version: Bug fixes + +### db_version + +Database schema version + +- Used for migration compatibility +- Important for upgrades + +### commit + +Git commit hash of deployed version + +- Useful for exact version identification +- Helps with debugging and support + +## Use Cases + +### Version Check + +```bash +# Report the running version +CURRENT=$(weaver version -u $WEAVER_URL -f json | jq -r '.versions[0].version') +echo "Running Weaver v$CURRENT" + +# Compare with required version (sort -V orders versions numerically, so 10.0.0 > 6.0.0) +if [[ "$(printf '%s\n' "$CURRENT" "6.0.0" | sort -V | head -n1)" != "6.0.0" ]]; then + echo "Version too old, upgrade required" +fi +``` + +### Compatibility Verification + +```python +# Check if feature is available +version_info = client.version() +version = version_info.body["versions"][0]["version"] + +major, minor, patch = map(int, version.split('.')) + +# Check for provenance feature (added in 4.0.0) +if major >= 4: + print("Provenance tracking available") +else: + print("Upgrade to 4.0.0+ for provenance") +``` + +### Bug Reporting + +```bash +# Collect version info for bug report +echo "Weaver Version Information:" +weaver version -u $WEAVER_URL +weaver info -u $WEAVER_URL | jq '.configuration' +``` + +## Version History + +Major versions and key features: + +- **6.x**: Enhanced OGC API - Processes Part 4, improved quotation +- **5.x**: Workflow improvements, vault enhancements +- **4.x**: W3C PROV provenance tracking +- **3.x**: OGC API - Processes Part 2 (DRU) +- **2.x**: Enhanced job management +- **1.x**: Initial OGC API - Processes implementation + +## Related Skills + +- [api-info](../api-info/) - Get general API information +- [api-conformance](../api-conformance/) - Check OGC compliance +- [process-describe](../process-describe/) - Check 
process availability + +## Documentation + +- [Version Information](https://pavics-weaver.readthedocs.io/en/latest/api.html) +- [Release Notes](https://pavics-weaver.readthedocs.io/en/latest/changes.html) +- [Installation](https://pavics-weaver.readthedocs.io/en/latest/installation.html) +- [CLI Reference](https://pavics-weaver.readthedocs.io/en/latest/cli.html) diff --git a/.agents/skills/cwl-create-commandlinetool/SKILL.md b/.agents/skills/cwl-create-commandlinetool/SKILL.md new file mode 100644 index 000000000..d20f8651b --- /dev/null +++ b/.agents/skills/cwl-create-commandlinetool/SKILL.md @@ -0,0 +1,552 @@ +--- +name: cwl-create-commandlinetool +description: | + Create CWL CommandLineTool packages from scratch including proper structure, inputs, outputs, and + requirements. Learn best practices for wrapping command-line tools and creating reusable process + definitions. Use when creating new CWL packages for Weaver deployment. +license: Apache-2.0 +compatibility: Requires understanding of command-line tools and CWL basics. Supports CWL v1.0, v1.1, v1.2. +metadata: + author: fmigneault +--- + +# Create CWL CommandLineTool + +Learn to create well-structured CWL CommandLineTool packages from scratch. + +## When to Use + +- Wrapping a command-line tool for Weaver +- Creating a new process definition +- Converting existing scripts to CWL +- Building reusable process components +- Standardizing tool execution + +## Basic Structure + +### Minimal CommandLineTool + +```yaml +cwlVersion: v1.2 +class: CommandLineTool +baseCommand: [echo] + +inputs: + message: + type: string + inputBinding: + position: 1 + +outputs: + output: + type: stdout +``` + +### Complete Template + +```yaml +cwlVersion: v1.2 +class: CommandLineTool + +label: Process Name +doc: | + Detailed description of what this process does. + Include usage examples and expected behavior. 
+ +baseCommand: [command, subcommand] + +requirements: + DockerRequirement: + dockerPull: appropriate-image:version + +inputs: + required_input: + type: File + label: Required input file + doc: Description of this input + inputBinding: + position: 1 + prefix: --input + + optional_param: + type: string? + default: "default_value" + inputBinding: + position: 2 + prefix: --param + +outputs: + output_file: + type: File + label: Output result + doc: Description of output + outputBinding: + glob: "output.txt" + +stdout: output.log +stderr: error.log +``` + +## Building Inputs + +### Simple Literal Input + +```yaml +inputs: + threshold: + type: float + doc: "Threshold value (0.0-1.0)" + inputBinding: + position: 1 + prefix: -t +``` + +### File Input + +```yaml +inputs: + input_file: + type: File + label: "Input data file" + doc: "NetCDF or GeoTIFF file" + format: + - edam:format_3650 # NetCDF + - edam:format_3591 # GeoTIFF + inputBinding: + position: 1 + prefix: --input +``` + +### Array Input + +```yaml +inputs: + input_files: + type: File[] + doc: "Multiple input files" + inputBinding: + position: 1 + prefix: --files + itemSeparator: "," # Creates: --files file1,file2,file3 +``` + +### Optional Input + +```yaml +inputs: + optional_flag: + type: boolean? 
+ default: false + inputBinding: + prefix: --verbose +``` + +### Input with Default + +```yaml +inputs: + output_format: + type: string + default: "netcdf" + inputBinding: + position: 2 + prefix: --format +``` + +## InputBinding Configuration + +### Position + +```yaml +# Command will be: tool input1 input2 output +inputs: + input1: + type: File + inputBinding: + position: 1 # First argument + + input2: + type: File + inputBinding: + position: 2 # Second argument + + output_name: + type: string + inputBinding: + position: 3 # Third argument +``` + +### Prefix + +```yaml +# Creates: tool --input file.txt --format json +inputs: + input_file: + type: File + inputBinding: + prefix: --input + + format: + type: string + inputBinding: + prefix: --format +``` + +### Separate vs Together + +```yaml +# Separate: --input file.txt +inputs: + input_with_space: + type: File + inputBinding: + prefix: --input + separate: true # Default +``` + +```yaml +# Together: --input=file.txt +inputs: + input_no_space: + type: File + inputBinding: + prefix: --input + separate: false +``` + +### Value From Expression + +```yaml +inputs: + input_file: + type: File + inputBinding: + position: 1 + valueFrom: $(self.basename) # Use just filename, not full path +``` + +## Building Outputs + +### File Output + +```yaml +outputs: + output_file: + type: File + outputBinding: + glob: "result.txt" # Exact filename +``` + +### Multiple Files + +```yaml +outputs: + output_files: + type: File[] + outputBinding: + glob: "*.txt" # All .txt files +``` + +### Directory Output + +```yaml +outputs: + output_dir: + type: Directory + outputBinding: + glob: "results/" +``` + +### Standard Streams + +```yaml +outputs: + stdout_output: + type: stdout + + stderr_output: + type: stderr + +stdout: output.log +stderr: error.log +``` + +### Conditional Output + +```yaml +outputs: + optional_output: + type: File? 
# Optional output + outputBinding: + glob: "optional.txt" +``` + +## Requirements + +### Docker + +```yaml +requirements: + DockerRequirement: + dockerPull: python:3.12-slim +``` + +### Initial Work Directory + +```yaml +requirements: + InitialWorkDirRequirement: + listing: + - entryname: script.py + entry: | + #!/usr/bin/env python3 + print("Hello from Python") + + - entryname: config.json + entry: | + {"setting": "value"} + + - $(inputs.input_file) # Stage input file +``` + +### Resource Requirements + +```yaml +requirements: + ResourceRequirement: + coresMin: 2 + coresMax: 4 + ramMin: 4096 # MB + ramMax: 8192 + tmpdirMin: 10240 + outdirMin: 10240 +``` + +### Environment Variables + +```yaml +requirements: + EnvVarRequirement: + envDef: + PATH: "/usr/local/bin:/usr/bin:/bin" # Literal value; $(PATH) is not a valid CWL parameter reference + PYTHONUNBUFFERED: "1" +``` + +### Inline JavaScript + +```yaml +requirements: + InlineJavascriptRequirement: {} + +inputs: + value: + type: int + inputBinding: + valueFrom: $(self * 2) # Double the input value +``` + +## Advanced Patterns + +### Conditional Arguments + +```yaml +inputs: + verbose: + type: boolean? + default: false + +arguments: + - valueFrom: | + ${ + if (inputs.verbose) { + return "--verbose"; + } else { + return null; + } + } +``` + +### Dynamic Output Names + +```yaml +inputs: + input_file: + type: File + +outputs: + output_file: + type: File + outputBinding: + glob: | + ${ + return inputs.input_file.nameroot + "_processed.txt"; + } +``` + +### Capture Success/Exit Codes + +```yaml +successCodes: [0] +temporaryFailCodes: [1, 2] # Retry these +permanentFailCodes: [3, 4] # Don't retry these + +outputs: + exit_code: + type: int + outputBinding: + glob: . 
+ outputEval: $(runtime.exitCode) +``` + +## Complete Examples + +### Simple Python Script + +```yaml +cwlVersion: v1.2 +class: CommandLineTool + +baseCommand: [python] + +requirements: + DockerRequirement: + dockerPull: python:3.12-slim + InitialWorkDirRequirement: + listing: + - entryname: script.py + entry: | + import sys + with open(sys.argv[1]) as f: + data = f.read() + with open('output.txt', 'w') as f: + f.write(data.upper()) + +arguments: + - script.py + - $(inputs.input_file.path) + +inputs: + input_file: + type: File + +outputs: + output_file: + type: File + outputBinding: + glob: output.txt +``` + +### Command with Multiple Options + +```yaml +cwlVersion: v1.2 +class: CommandLineTool + +label: Image Processor +doc: Process images with various filters + +baseCommand: [convert] + +requirements: + DockerRequirement: + dockerPull: dpokidov/imagemagick:7.1.0-57 + +inputs: + input_image: + type: File + doc: "Input image file" + inputBinding: + position: 1 + + resize: + type: string? + doc: "Resize dimensions (e.g., 800x600)" + inputBinding: + prefix: -resize + + quality: + type: int? + default: 90 + doc: "JPEG quality (1-100)" + inputBinding: + prefix: -quality + + output_format: + type: string + default: "jpg" + doc: "Output format" + +arguments: + - valueFrom: "output.$(inputs.output_format)" + position: 100 + +outputs: + output_image: + type: File + outputBinding: + glob: "output.*" +``` + +## Testing Your CWL + +### Local Validation + +```bash +cwltool --validate my-tool.cwl +``` + +### Local Execution + +```bash +# Create test inputs +cat > test-inputs.json << EOF +{ + "input_file": { + "class": "File", + "path": "test.txt" + }, + "threshold": 0.5 +} +EOF + +# Run +cwltool my-tool.cwl test-inputs.json +``` + +### Deploy to Weaver + +```bash +weaver deploy -u $WEAVER_URL -p my-tool -b my-tool.cwl +``` + +## Best Practices + +1. **Use descriptive names**: Clear input/output names +2. **Add documentation**: Use `doc` and `label` fields +3. 
**Specify types clearly**: Be explicit with types +4. **Use proper positions**: Order arguments logically +5. **Pin Docker versions**: Use specific image tags +6. **Test locally first**: Use cwltool before deploying +7. **Handle optional inputs**: Use `?` for optional +8. **Capture all outputs**: Don't lose important files +9. **Use runtime variables**: `$(runtime.outdir)`, etc. +10. **Version your CWL**: Track changes + +## Related Skills + +- [cwl-validate-package](../cwl-validate-package/) - Validate your CWL +- [cwl-understand-docker](../cwl-understand-docker/) - Docker configuration +- [cwl-debug-package](../cwl-debug-package/) - Debug issues +- [process-deploy](../process-deploy/) - Deploy to Weaver +- [cwl-understand-workflow](../cwl-understand-workflow/) - Chain tools + +## Documentation + +- [CWL CommandLineTool Specification](https://www.commonwl.org/v1.2/CommandLineTool.html) +- [CWL User Guide](https://www.commonwl.org/user_guide/) +- [Weaver Package Guide](https://pavics-weaver.readthedocs.io/en/latest/package.html) +- [CWL Examples](https://github.com/common-workflow-language/workflows) + +## Templates + +See the examples in this skill as starting templates for your own CWL packages! diff --git a/.agents/skills/cwl-debug-package/SKILL.md b/.agents/skills/cwl-debug-package/SKILL.md new file mode 100644 index 000000000..5761df809 --- /dev/null +++ b/.agents/skills/cwl-debug-package/SKILL.md @@ -0,0 +1,608 @@ +--- +name: cwl-debug-package +description: | + Debug CWL package issues including deployment failures, execution errors, and validation problems. + Learn systematic troubleshooting approaches, common error patterns, and debugging techniques. Use + when CWL packages fail to deploy or execute correctly. +license: Apache-2.0 +compatibility: Requires cwltool for local testing. Works with CWL v1.0, v1.1, v1.2. +metadata: + author: fmigneault +--- + +# Debug CWL Packages + +Systematic troubleshooting guide for CWL package deployment and execution issues. 
+ +## When to Use + +- CWL package fails to deploy to Weaver +- Process executes but produces errors +- Validation warnings or errors +- Unexpected output or behavior +- Docker-related failures +- Input/output type mismatches + +## Debugging Strategy + +### 1. Validate Locally First + +```bash +# Always validate before deploying +cwltool --validate package.cwl + +# If validation passes, test with sample data +cwltool package.cwl test-inputs.json +``` + +### 2. Check Weaver Deployment + +```bash +# Deploy and capture response +weaver deploy -u $WEAVER_URL -p my-process -b package.cwl + +# Check if process is listed +weaver capabilities -u $WEAVER_URL | grep my-process + +# Get process details +weaver describe -u $WEAVER_URL -p my-process +``` + +### 3. Test Execution + +```bash +# Execute with test inputs +JOB_ID=$(weaver execute -u $WEAVER_URL -p my-process -I inputs.json -f json | jq -r .jobID) + +# Monitor status +weaver status -u $WEAVER_URL -j $JOB_ID + +# Check logs +weaver logs -u $WEAVER_URL -j $JOB_ID + +# Check exceptions +weaver exceptions -u $WEAVER_URL -j $JOB_ID +``` + +## Common Errors and Solutions + +### Validation Errors + +#### Unknown Field + +``` +ERROR: Unknown field `DockerRequirment` +``` + +**Cause**: Typo in field name + +**Solution**: + +```yaml +# Wrong +DockerRequirment: # Missing 'e' + +# ✅ Correct +DockerRequirement: + dockerPull: myimage:latest +``` + +#### Missing Required Field + +``` +ERROR: Missing required field `class` +``` + +**Solution**: + +```yaml +cwlVersion: v1.2 +class: CommandLineTool # Must specify class +baseCommand: [echo] +``` + +#### Type Mismatch + +``` +ERROR: Expected type File, got string +``` + +**Solution**: + +```yaml +# Wrong +inputs: + input_file: string # Should be File +``` + +```yaml +# ✅ Correct +inputs: + input_file: File +``` + +### Deployment Errors + +#### Invalid CWL Version + +```text +ERROR: Unsupported CWL version v2.0 +``` + +**Solution**: + +```yaml +# Use supported version +cwlVersion: v1.2 
+class: CommandLineTool +``` + +#### Docker Image Not Found + +```text +ERROR: Failed to pull Docker image 'myimage:latest' +``` + +**Solutions**: + +```bash +# 1. Verify image exists +docker pull myimage:latest + +# 2. Use full registry path +docker pull docker.io/library/myimage:latest + +# 3. Check image name spelling +docker pull python:3.12-slim +``` + +#### Process ID Conflict + +``` +ERROR: Process 'my-process' already exists +``` + +**Solutions**: + +```bash +# 1. Use different process ID +weaver deploy -u $WEAVER_URL -p my-process-v2 -b package.cwl + +# 2. Undeploy existing process first +weaver undeploy -u $WEAVER_URL -p my-process +weaver deploy -u $WEAVER_URL -p my-process -b package.cwl +``` + +### Execution Errors + +#### Missing Input + +```text +ERROR: Required input 'input_file' not provided +``` + +**Solution**: + +```json +{ + "input_file": { + "class": "File", + "path": "https://example.com/data.txt" + } +} +``` + +#### Input Type Mismatch + +``` +ERROR: Expected File, got string +``` + +**Solution**: + +❌ Incorrect reference for a File (not a string) + +```json +{ + "input_file": "data.txt" +} +``` + +✅ Correct File input reference + +```json +{ + "input_file": { + "class": "File", + "path": "https://example.com/data.txt" + } +} +``` + +#### Command Not Found + +``` +ERROR: /bin/sh: mycommand: command not found +``` + +**Solutions**: + +```yaml +# 1. Install command in Docker image +requirements: + DockerRequirement: + dockerPull: image-with-mycommand:latest +``` + +```yaml +# 2. Use full path +baseCommand: [/usr/local/bin/mycommand] +``` + +```yaml +# 3. Install via InitialWorkDirRequirement +requirements: + InitialWorkDirRequirement: + listing: + - entryname: install.sh + entry: | + #!/bin/bash + apt-get update && apt-get install -y mycommand +``` + +#### Permission Denied + +``` +ERROR: Permission denied: /output/result.txt +``` + +**Solutions**: + +```yaml +# 1. 
Ensure output directory is writable +outputs: + result: + type: File + outputBinding: + glob: "*.txt" # Use glob pattern + +# 2. Write to runtime.outdir +arguments: + - -o + - $(runtime.outdir)/result.txt +``` + +#### Output Not Found + +``` +ERROR: Output file 'result.txt' not found +``` + +**Solutions**: + +```yaml +# 1. Check glob pattern +outputs: + result: + type: File + outputBinding: + glob: "result.txt" # Exact match +``` + +```yaml +# 2. Use wildcard +outputs: + result: + type: File + outputBinding: + glob: "*.txt" # Match any .txt file +``` + +```yaml +# 3. Verify command produces output +baseCommand: [echo, "test"] +stdout: result.txt # Capture stdout +``` + +## Debugging Techniques + +### Enable Verbose Logging + +```bash +# Local testing with debug +cwltool --debug package.cwl inputs.json + +# Weaver execution - check logs +weaver logs -u $WEAVER_URL -j $JOB_ID +``` + +### Test Incrementally + +```yaml +# 1. Start with minimal CWL +cwlVersion: v1.2 +class: CommandLineTool +baseCommand: [echo, "hello"] +outputs: + stdout: stdout +``` + +```yaml +# 2. Add inputs +inputs: + message: string +baseCommand: [echo] +arguments: [$(inputs.message)] +``` + +```yaml +# 3. Add Docker +requirements: + DockerRequirement: + dockerPull: debian:stable-slim +``` + +### Isolate Issues + +```bash +# Test each component separately + +# 1. Test Docker image +docker run --rm myimage:latest mycommand --help + +# 2. Test command locally +echo "test" | mycommand + +# 3. Test with cwltool +cwltool package.cwl inputs.json + +# 4. Test on Weaver +weaver execute -u $WEAVER_URL -p my-process -I inputs.json +``` + +### Check Intermediate Files + +```yaml +# Add intermediate outputs for debugging +outputs: + debug_output: + type: Directory + outputBinding: + glob: . 
# Capture all files in working directory + + final_output: + type: File + outputBinding: + glob: result.txt +``` + +### Use Simple Test Data + +Create minimal test inputs + +```json +{ + "input_file": { + "class": "File", + "path": "test.txt", + "contents": "test data\n" + } +} +``` + +## Workflow-Specific Debugging + +### Check Step Connections + +```yaml +# Verify outputs match inputs +steps: + step1: + run: tool1.cwl + in: {input: workflow_input} + out: [output] # Type: File + + step2: + run: tool2.cwl + in: + input: step1/output # Must expect File + out: [result] +``` + +### Visualize Workflow + +```bash +# Generate workflow diagram +cwltool --print-dot workflow.cwl | dot -Tpng > workflow.png +``` + +### Test Steps Individually + +```bash +# Test each step separately +cwltool step1.cwl step1-inputs.json +cwltool step2.cwl step2-inputs.json + +# Then test complete workflow +cwltool workflow.cwl workflow-inputs.json +``` + +## Docker-Specific Debugging + +### Test Container Locally + +```bash +# Run container interactively +docker run -it --rm myimage:latest /bin/bash + +# Test command in container +docker run --rm myimage:latest mycommand --help + +# Mount test data +docker run --rm -v $(pwd):/data myimage:latest mycommand /data/test.txt +``` + +### Check Image Availability + +```bash +# Pull image +docker pull myimage:latest + +# Inspect image +docker inspect myimage:latest + +# List image tags +curl https://hub.docker.com/v2/repositories/myimage/tags/ +``` + +### Debug Network Issues + +```yaml +# Enable network access +requirements: + NetworkAccess: + networkAccess: true + +# Test with curl/wget +baseCommand: [curl, -O, https://example.com/data.txt] +``` + +## Provenance and Statistics + +### Check Execution Details + +```bash +# Get detailed provenance +weaver provenance -u $WEAVER_URL -j $JOB_ID + +# Get resource usage +weaver statistics -u $WEAVER_URL -j $JOB_ID + +# Get inputs used +weaver inputs -u $WEAVER_URL -j $JOB_ID +``` + +## Common Pitfalls + +### 
1. Using `latest` Tags + +```yaml +# ❌ Avoid +dockerPull: python:latest # Unpredictable +``` + +```yaml +# ✅ Use specific versions +dockerPull: python:3.12.16-slim +``` + +### 2. Missing Output Glob + +```yaml +# ❌ Output not found +outputs: + result: + type: File + # Missing outputBinding! +``` + +```yaml +# ✅ Specify glob +outputs: + result: + type: File + outputBinding: + glob: "result.txt" +``` + +### 3. Incorrect Input Types + +```yaml +# ❌ Type mismatch +inputs: + file_input: string # Should be File +``` + +```yaml +# ✅ Correct type +inputs: + file_input: File +``` + +### 4. Forgetting Runtime Variables + +```yaml +# ❌ Hardcoded path +arguments: ["-o", "/output/result.txt"] +``` + +```yaml +# ✅ Use runtime.outdir +arguments: ["-o", "$(runtime.outdir)/result.txt"] +``` + +### 5. Missing Requirements + +```yaml +# ❌ No DockerRequirement +baseCommand: [python, script.py] # Where does Python come from? +``` + +```yaml +# ✅ Specify Docker image +requirements: + DockerRequirement: + dockerPull: python:3.12-slim +baseCommand: [python, script.py] +``` + +## Debugging Checklist + +- [ ] Validate with `cwltool --validate` +- [ ] Test locally with `cwltool` +- [ ] Check Docker image exists +- [ ] Verify input types match +- [ ] Check output glob patterns +- [ ] Test with minimal inputs +- [ ] Review Weaver logs +- [ ] Check exceptions +- [ ] Verify Docker requirements +- [ ] Test steps independently (workflows) +- [ ] Check runtime variables +- [ ] Verify network access (if needed) + +## Related Skills + +- [cwl-validate-package](../cwl-validate-package/) - Validate before deploying +- [process-deploy](../process-deploy/) - Deploy CWL packages +- [job-logs](../job-logs/) - View execution logs +- [job-exceptions](../job-exceptions/) - Get error details +- [job-status](../job-status/) - Monitor execution +- [cwl-understand-docker](../cwl-understand-docker/) - Docker troubleshooting + +## Documentation + +- [CWL Troubleshooting](https://www.commonwl.org/user_guide/) +- 
[cwltool Documentation](https://github.com/common-workflow-language/cwltool) +- [Weaver Process Deployment](https://pavics-weaver.readthedocs.io/en/latest/processes.html) +- [Docker Debugging](https://docs.docker.com/config/containers/runmetrics/) + +## Tools + +- **cwltool**: Local testing and validation +- **docker**: Container testing +- **jq**: JSON parsing for responses +- **curl**: API debugging + +## Best Practices + +1. ✅ Always validate locally first +2. ✅ Test with minimal data +3. ✅ Debug incrementally +4. ✅ Check logs and exceptions +5. ✅ Test Docker containers independently +6. ✅ Use specific image tags +7. ✅ Document known issues +8. ✅ Keep CWL packages simple +9. ✅ Version control your CWL +10. ✅ Learn from working examples diff --git a/.agents/skills/cwl-optimize-performance/SKILL.md b/.agents/skills/cwl-optimize-performance/SKILL.md new file mode 100644 index 000000000..f85b7809f --- /dev/null +++ b/.agents/skills/cwl-optimize-performance/SKILL.md @@ -0,0 +1,558 @@ +--- +name: cwl-optimize-performance +description: | + Optimize CWL package performance including resource allocation, Docker image selection, scatter + patterns, and execution strategies. Learn techniques to improve execution speed and resource + efficiency. Use when processes are slow, use too many resources, or need optimization. +license: Apache-2.0 +compatibility: Requires Weaver deployment and CWL v1.0+. +metadata: + author: fmigneault +--- + +# Optimize CWL Package Performance + +Techniques to improve CWL package execution speed and resource efficiency. 
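
Right-sizing usually comes down to simple arithmetic on input size. As a rough illustration, the Python sketch below mirrors the kind of rule used for dynamic `ramMin` values in this skill: scale with the input size, but never drop below a floor. The function name `ram_min_mib`, the multiplier of 3, and the 2048 MiB floor are assumptions chosen for illustration, not Weaver defaults.

```python
import math


def ram_min_mib(input_size_bytes, factor=3, floor_mib=2048):
    """Estimate a ramMin value in MiB from an input size in bytes (illustrative rule)."""
    size_mib = input_size_bytes / (1024 * 1024)
    # Round up so the estimate never undershoots, then apply the floor.
    return max(floor_mib, math.ceil(size_mib * factor))


if __name__ == "__main__":
    print(ram_min_mib(100 * 1024 * 1024))  # small input: the floor applies -> 2048
    print(ram_min_mib(2 * 1024 ** 3))      # 2 GiB input: 2048 MiB * 3 -> 6144
```

Trying a few representative input sizes this way helps when choosing the multiplier and floor before encoding the rule as a CWL expression.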
+ +## When to Use + +- Processes take too long to execute +- Docker images are large and slow to pull +- Resource allocation is inefficient +- Parallel processing opportunities exist +- Jobs fail due to resource limits +- Optimizing workflow execution time + +## Docker Image Optimization + +### Use Slim/Alpine Variants + +```yaml +# ❌ Slow - large image (~1GB) +DockerRequirement: + dockerPull: python:3.12 +``` + +```yaml +# ✅ Faster - slim image (~150MB) +DockerRequirement: + dockerPull: python:3.12-slim +``` + +```yaml +# ✅ Smallest - alpine (~50MB) +DockerRequirement: + dockerPull: python:3.12-alpine +``` + +### Pin Specific Versions + +```yaml +# ❌ Slow - always pulls latest +DockerRequirement: + dockerPull: myimage:latest +``` + +```yaml +# ✅ Fast - cached after first pull +DockerRequirement: + dockerPull: myimage:1.2.3 +``` + +### Use Digest for Immutability + +```yaml +# ✅ Best - never changes, cached forever +DockerRequirement: + dockerPull: python@sha256:abc123... +``` + +### Pre-pull Images + +```bash +# Pull images before workflow execution +docker pull python:3.12-slim +docker pull gdal:3.6.0-alpine +``` + +## Resource Requirements + +### Right-size Resources + +```yaml +requirements: + ResourceRequirement: + # ✅ Appropriate for task + coresMin: 2 + coresMax: 4 + ramMin: 4096 # 4GB + ramMax: 8192 # 8GB + tmpdirMin: 10240 # 10GB temp + outdirMin: 20480 # 20GB output +``` + +### Dynamic Resource Allocation + +```yaml +requirements: + InlineJavascriptRequirement: {} + ResourceRequirement: + # Scale RAM with input file size + ramMin: | + ${ + var sizeMB = inputs.input_file.size / (1024 * 1024); + return Math.max(2048, Math.ceil(sizeMB * 3)); + } + + # Scale cores with number of inputs + coresMin: | + ${ + return Math.min(8, Math.max(2, inputs.files.length)); + } +``` + +### Avoid Over-allocation + +```yaml +# ❌ Wastes resources +ResourceRequirement: + ramMin: 64000 # 64GB for a simple task + coresMin: 32 +``` + +```yaml +# ✅ Appropriate allocation 
+ResourceRequirement: + ramMin: 4096 # 4GB + coresMin: 2 +``` + +## Parallel Processing with Scatter + +### Basic Scatter + +```yaml +steps: + process: + run: tool.cwl + scatter: input_file # Process files in parallel + in: + input_file: input_files + out: [output] +``` + +### Scatter Multiple Inputs + +```yaml +steps: + process: + run: tool.cwl + scatter: [file, parameter] + scatterMethod: dotproduct # Pair inputs + in: + file: files + parameter: parameters + out: [output] +``` + +### Optimal Scatter Size + +```yaml +# ❌ Too fine-grained - overhead dominates +scatter: tiny_chunks # 1000s of 1MB files +``` + +```yaml +# ✅ Balanced - good parallelism +scatter: reasonable_chunks # Dozens of 100MB files +``` + +```yaml +# ❌ Too coarse - underutilizes resources +scatter: huge_chunks # 2-3 multi-GB files +``` + +## Input/Output Optimization + +### Minimize Data Transfer + +```yaml +# ❌ Transfers entire large file +inputs: + full_dataset: + type: File # 10GB file +``` + +```yaml +# ✅ Transfer only needed subset +inputs: + subset_params: + type: string # Small parameters +# Tool downloads only needed data +``` + +### Use File References + +```yaml +# ✅ Pass by reference when possible +{ + "input_file": { + "class": "File", + "path": "https://example.com/large-file.nc" # Weaver downloads + } +} +``` + +### Stream When Possible + +```yaml +# ✅ Process streams instead of files +baseCommand: [curl, https://example.com/data.txt] +stdout: processed.txt + +# Pipes directly without intermediate file +``` + +### Efficient Output Patterns + +```yaml +# ❌ Returns many small files +outputs: + results: + type: File[] + outputBinding: + glob: "*.txt" # 1000s of tiny files +``` + +```yaml +# ✅ Returns aggregated results +outputs: + results: + type: File + outputBinding: + glob: "combined-results.tar.gz" # Single archive +``` + +## Workflow Structure Optimization + +### Minimize Steps + +```yaml +# ❌ Too many small steps +steps: + step1: download.cwl + step2: unzip.cwl + step3: 
validate.cwl + step4: process.cwl +``` + +```yaml +# ✅ Combined operations +steps: + process: # Does download, unzip, validate, process + run: optimized-process.cwl +``` + +### Parallel Independent Steps + +```yaml +# ✅ Steps that can run in parallel +steps: + process_a: + run: tool-a.cwl + in: {input: data} + out: [output_a] + + process_b: # Runs simultaneously with process_a + run: tool-b.cwl + in: {input: data} + out: [output_b] + + combine: # Waits for both + run: merge.cwl + in: + a: process_a/output_a + b: process_b/output_b + out: [merged] +``` + +### Cache Intermediate Results + +```yaml +# ✅ Expose intermediate results for reuse +outputs: + preprocessed: # Can be reused + type: File + outputSource: preprocess/output + + final: + type: File + outputSource: analyze/output +``` + +## Command Optimization + +### Efficient Commands + +```yaml +# ❌ Inefficient +baseCommand: [bash, -c] +arguments: + - "cat file.txt | grep pattern | sort | uniq > output.txt" +``` + +```yaml +# ✅ More efficient +baseCommand: [grep, pattern] +stdin: file.txt +stdout: output.txt +``` + +### Avoid Unnecessary Operations + +```yaml +# ❌ Reads entire file into memory +baseCommand: [python, -c] +arguments: + - "open('huge.txt').read()" +``` + +```yaml +# ✅ Streams data +baseCommand: [awk, '{print $1}'] +``` + +### Use Native Tools + +```yaml +# ❌ Python for simple text operations +DockerRequirement: + dockerPull: python:3.12-slim +baseCommand: [python, -c, "print('hello')"] +``` + +```yaml +# ✅ Simple shell command +DockerRequirement: + dockerPull: alpine:latest +baseCommand: [echo, hello] +``` + +## Monitoring and Profiling + +### Track Resource Usage + +```bash +# Get job statistics +weaver statistics -u $WEAVER_URL -j $JOB_ID + +# Check execution time +weaver status -u $WEAVER_URL -j $JOB_ID | jq '.duration' + +# View logs for bottlenecks +weaver logs -u $WEAVER_URL -j $JOB_ID +``` + +### Identify Bottlenecks + +```yaml +# Add timing to steps +steps: + download: + run: download.cwl + 
# Check logs to see how long this takes + + process: + run: process.cwl + # Compare durations +``` + +### Profile Locally + +```bash +# Time local execution +time cwltool process.cwl inputs.json + +# Check resource usage +docker stats +``` + +## Caching Strategies + +### Docker Image Caching + +```yaml +# ✅ Use versioned tags for caching +DockerRequirement: + dockerPull: myimage:1.2.3 # Cached after first use +``` + +### Intermediate File Caching + +```yaml +# ✅ Reuse expensive preprocessing +steps: + expensive_preprocess: + run: preprocess.cwl + in: {input: raw_data} + out: [preprocessed] # Cache this + + analyze: + run: analyze.cwl + in: {input: expensive_preprocess/preprocessed} + out: [result] +``` + +## Common Performance Issues + +### Issue: Slow Docker Pull + +```yaml +# Problem: Large image +DockerRequirement: + dockerPull: tensorflow/tensorflow:latest-gpu # 4GB+ +``` + +```yaml +# Solutions: +# 1. Use smaller base image +DockerRequirement: + dockerPull: tensorflow/tensorflow:2.11.0-gpu-slim + +# 2. Pre-pull images +# 3. Use private registry closer to Weaver + +# 4. 
Build custom minimal image +``` + +### Issue: Memory Overflow + +```yaml +# Problem: Insufficient RAM +ResourceRequirement: + ramMin: 2048 +``` + +```yaml +# Solution: Increase based on data size +ResourceRequirement: + ramMin: | + ${ + var dataSizeMB = inputs.data.size / (1024 * 1024); + return Math.max(4096, dataSizeMB * 4); + } +``` + +### Issue: Slow File I/O + +```yaml +# Problem: Reading entire file +baseCommand: [python, -c] +arguments: + - "data = open('huge.csv').read()" +``` + +```yaml +# Solution: Stream processing +baseCommand: [python, -c] +arguments: + - | + import sys + for line in sys.stdin: + process(line) +stdin: huge.csv +``` + +### Issue: Sequential Processing + +```yaml +# Problem: Processing items one by one +steps: + process: + run: tool.cwl + # No scatter - sequential +``` + +```yaml +# Solution: Scatter for parallelism +steps: + process: + run: tool.cwl + scatter: item + in: {item: items} + out: [output] +``` + +## Benchmarking + +### Compare Approaches + +```bash +# Approach 1 +time weaver execute -u $WEAVER_URL -p approach1 -I inputs.json + +# Approach 2 +time weaver execute -u $WEAVER_URL -p approach2 -I inputs.json + +# Compare statistics +weaver statistics -u $WEAVER_URL -j $JOB1 +weaver statistics -u $WEAVER_URL -j $JOB2 +``` + +### A/B Testing + +```yaml +# Test different resource allocations +# Version A: Conservative +ResourceRequirement: + ramMin: 4096 + coresMin: 2 +``` + +```yaml +# Version B: Generous +ResourceRequirement: + ramMin: 8192 + coresMin: 4 + +# Measure which performs better +``` + +## Best Practices Summary + +1. ✅ Use slim Docker images +2. ✅ Pin image versions for caching +3. ✅ Right-size resource allocations +4. ✅ Use scatter for parallel processing +5. ✅ Minimize data transfer +6. ✅ Combine small steps +7. ✅ Profile and measure +8. ✅ Cache expensive operations +9. ✅ Stream large data when possible +10. 
✅ Use appropriate tools for tasks + +## Related Skills + +- [cwl-understand-docker](../cwl-understand-docker/) - Docker optimization +- [cwl-understand-workflow](../cwl-understand-workflow/) - Workflow patterns +- [cwl-debug-package](../cwl-debug-package/) - Debug performance issues +- [job-statistics](../job-statistics/) - Monitor resource usage + +## Documentation + +- [CWL Resource Requirements](https://www.commonwl.org/v1.2/CommandLineTool.html#ResourceRequirement) +- [Scatter Feature](https://www.commonwl.org/v1.2/Workflow.html#WorkflowStep) +- [Docker Best Practices](https://docs.docker.com/develop/dev-best-practices/) +- [Weaver Performance](https://pavics-weaver.readthedocs.io/en/latest/processes.html) + +## Measurement + +Track these metrics for optimization: + +- **Execution time**: Start to finish duration +- **Docker pull time**: Image download duration +- **Resource usage**: CPU, RAM, Disk I/O +- **Data transfer**: Input/output transfer time +- **Queue time**: Time waiting for resources + +Optimize the biggest bottleneck first! diff --git a/.agents/skills/cwl-understand-builtin/SKILL.md b/.agents/skills/cwl-understand-builtin/SKILL.md new file mode 100644 index 000000000..cea888d03 --- /dev/null +++ b/.agents/skills/cwl-understand-builtin/SKILL.md @@ -0,0 +1,478 @@ +--- +name: cwl-understand-builtin +description: | + Use Weaver's built-in processes including jsonarray2netcdf, file2string_array, and other utility + processes. Learn when to use builtins instead of custom Docker containers for common operations. + Use to simplify workflows and avoid unnecessary Docker complexity. +license: Apache-2.0 +compatibility: Requires Weaver deployment with builtin processes enabled. +metadata: + author: fmigneault +--- + +# Understand Weaver Built-in Processes + +Learn to use Weaver's built-in utility processes for common operations without custom Docker containers. 
+ +## When to Use + +- Converting data formats (JSON to NetCDF) +- Splitting or combining files +- Simple text processing +- Avoiding Docker overhead for simple operations +- Chaining builtins in workflows +- Quick prototyping without custom containers + +## What are Built-in Processes? + +Built-in processes are pre-deployed utility processes in Weaver that perform common operations without requiring custom +Docker images or CWL packages. + +### Benefits + +- ✅ No Docker image required +- ✅ Faster execution (no image pull) +- ✅ Always available in Weaver +- ✅ Well-tested and maintained +- ✅ Simpler CWL definitions + +## Available Built-in Processes + +### jsonarray2netcdf + +Convert JSON array data to NetCDF format. + +**Use Cases**: + +- Converting API responses to NetCDF +- Creating NetCDF from structured data +- Data format conversion in workflows + +**Inputs**: + +- `input`: JSON array or file +- `x_variable`: X-axis variable name +- `y_variable`: Y-axis variable name +- `z_variable`: Data variable name + +**Outputs**: + +- `output`: NetCDF file + +**Example**: + +```yaml +cwlVersion: v1.2 +class: Workflow + +steps: + convert: + run: "https://weaver.example.com/processes/jsonarray2netcdf" + in: + input: json_data + x_variable: {default: "lon"} + y_variable: {default: "lat"} + z_variable: {default: "temperature"} + out: [output] +``` + +### file2string_array + +Convert a file to an array of strings (one per line). 
+ +**Use Cases**: + +- Reading configuration files +- Processing line-based data +- Splitting file content for parallel processing + +**Inputs**: + +- `file`: Input text file + +**Outputs**: + +- `output`: Array of strings + +**Example**: + +```yaml +steps: + read_file: + run: "https://weaver.example.com/processes/file2string_array" + in: + file: config_file + out: [output] + + process_lines: + run: process-line.cwl + scatter: line + in: + line: read_file/output + out: [result] +``` + +## Discovering Built-in Processes + +### List All Built-ins + +```bash +# List processes (built-ins typically don't have visibility tag or have special marker) +weaver capabilities -u $WEAVER_URL + +# Look for processes without custom deployment +``` + +### Describe Built-in Process + +```bash +# Get detailed information +weaver describe -u $WEAVER_URL -p jsonarray2netcdf + +# View inputs and outputs +weaver describe -u $WEAVER_URL -p file2string_array -f json | jq '.inputs' +``` + +## Using Built-ins in Workflows + +### Simple Conversion Workflow + +```yaml +cwlVersion: v1.2 +class: Workflow + +inputs: + json_input: File + +outputs: + netcdf_output: + type: File + outputSource: convert/output + +steps: + convert: + run: + class: CommandLineTool + # Reference built-in by its process ID + id: jsonarray2netcdf + in: + input: json_input + out: [output] +``` + +### Chaining Built-ins + +```yaml +cwlVersion: v1.2 +class: Workflow + +inputs: + data_file: File + +outputs: + final_result: + type: File + outputSource: convert/output + +steps: + # Split file into lines + split: + run: file2string_array + in: + file: data_file + out: [output] + + # Process could continue with custom processing + # Then convert back + convert: + run: string_array2file + in: + input: split/output + out: [output] +``` + +### Mixing Built-ins with Custom Processes + +```yaml +cwlVersion: v1.2 +class: Workflow + +steps: + # Use built-in for format conversion + to_netcdf: + run: jsonarray2netcdf + in: + input: 
json_data + out: [output] + + # Use custom process for analysis + analyze: + run: custom-analysis.cwl # Your custom CWL + in: + input: to_netcdf/output + out: [result] + + # Use built-in for final conversion + to_json: + run: netcdf2jsonarray + in: + input: analyze/result + out: [output] +``` + +## Common Patterns + +### Data Format Pipeline + +```yaml +# JSON → NetCDF → Process → GeoTIFF +steps: + json_to_nc: + run: jsonarray2netcdf + in: {input: raw_json} + out: [output] + + process: + run: analysis.cwl + in: {input: json_to_nc/output} + out: [result] + + nc_to_geotiff: + run: netcdf2geotiff + in: {input: process/result} + out: [output] +``` + +### File Processing Pipeline + +```yaml +# File → Lines → Process Each → Combine +steps: + split_lines: + run: file2string_array + in: {file: input_file} + out: [output] + + process_each: + run: process-line.cwl + scatter: line + in: {line: split_lines/output} + out: [processed] + + combine: + run: array2file + in: {lines: process_each/processed} + out: [output] +``` + +## When to Use Built-ins vs Custom + +### Use Built-ins When: + +- ✅ Simple format conversion +- ✅ Basic string/file operations +- ✅ Quick prototyping +- ✅ Avoiding Docker overhead +- ✅ Operation matches built-in capability exactly + +### Use Custom CWL When: + +- ❌ Complex processing logic +- ❌ Specific software dependencies +- ❌ Custom algorithms +- ❌ Performance-critical operations +- ❌ Unique requirements + +## Built-in Limitations + +### Not Customizable + +Built-ins have fixed behavior - you can't modify them. + +**Workaround**: Chain built-ins with custom processes + +```yaml +steps: + builtin_convert: + run: jsonarray2netcdf + in: {input: data} + out: [output] + + custom_post_process: + run: my-custom-tool.cwl + in: {input: builtin_convert/output} + out: [result] +``` + +### Limited Operations + +Built-ins only cover common operations. 
+ +**Workaround**: Use as pre/post-processing steps + +```yaml +steps: + preprocess: + run: file2string_array # Built-in + in: {file: input} + out: [lines] + + main_process: + run: complex-analysis.cwl # Custom + in: {data: preprocess/lines} + out: [result] +``` + +### Version Locked + +Built-in behavior tied to Weaver version. + +**Workaround**: Document Weaver version requirements + +```yaml +# In process metadata +metadata: + weaverVersion: ">=6.0.0" +``` + +## Executing Built-in Processes + +### Direct Execution + +```bash +# Execute jsonarray2netcdf +weaver execute \ + -u $WEAVER_URL \ + -p jsonarray2netcdf \ + -I inputs.json + +# inputs.json: +{ + "input": {"href": "https://example.com/data.json"}, + "x_variable": "longitude", + "y_variable": "latitude", + "z_variable": "temperature" +} +``` + +### In Workflow Context + +```bash +# Deploy workflow using built-ins +weaver deploy -u $WEAVER_URL -p my-workflow -b workflow.cwl + +# Execute workflow +weaver execute -u $WEAVER_URL -p my-workflow -I workflow-inputs.json +``` + +## Best Practices + +### 1. Check Availability + +```bash +# Verify built-in exists before using +weaver describe -u $WEAVER_URL -p jsonarray2netcdf +``` + +### 2. Document Built-in Usage + +```yaml +# Add comments in CWL +steps: + convert: + # Using Weaver built-in jsonarray2netcdf + # Converts JSON array to NetCDF format + run: jsonarray2netcdf + in: {input: json_data} + out: [output] +``` + +### 3. Handle Built-in Errors + +```yaml +# Built-ins can fail like any process +# Check job status and logs +``` + +### 4. Version Compatibility + +```yaml +# Document Weaver version requirements +# Different Weaver versions may have different built-ins +``` + +### 5. 
Combine with Custom Processes + +```yaml +# Use built-ins for common operations +# Use custom CWL for specific logic +``` + +## Related Skills + +- [process-list](../process-list/) - List all available processes including built-ins +- [process-describe](../process-describe/) - Get built-in process details +- [cwl-understand-workflow](../cwl-understand-workflow/) - Chain built-ins in workflows +- [job-execute](../job-execute/) - Execute built-in processes +- [process-deploy](../process-deploy/) - Deploy workflows using built-ins + +## Documentation + +- [Weaver Built-in Processes](https://pavics-weaver.readthedocs.io/en/latest/package.html) +- [Process Deployment](https://pavics-weaver.readthedocs.io/en/latest/processes.html) +- [Workflow Examples](https://pavics-weaver.readthedocs.io/en/latest/processes.html) + +## Examples + +### Complete Workflow with Built-ins + +```yaml +cwlVersion: v1.2 +class: Workflow + +doc: | + Workflow demonstrating built-in process usage. + 1. Split input file into lines + 2. Process each line in parallel + 3. 
Convert results to NetCDF + +inputs: + input_file: File + config: string + +outputs: + final_output: + type: File + outputSource: to_netcdf/output + +steps: + split: + run: file2string_array + in: + file: input_file + out: [output] + + process: + run: custom-processor.cwl + scatter: line + in: + line: split/output + config: config + out: [result] + + to_netcdf: + run: jsonarray2netcdf + in: + input: process/result + z_variable: {default: "data"} + out: [output] +``` + +## Tips + +- 💡 Built-ins are faster (no Docker pull) +- 💡 Use for simple, common operations +- 💡 Chain with custom processes for complex workflows +- 💡 Check built-in availability in target Weaver instance +- 💡 Document which built-ins your workflow depends on +- 💡 Consider built-ins for prototyping before creating custom Docker images diff --git a/.agents/skills/cwl-understand-docker/SKILL.md b/.agents/skills/cwl-understand-docker/SKILL.md new file mode 100644 index 000000000..f61cb411c --- /dev/null +++ b/.agents/skills/cwl-understand-docker/SKILL.md @@ -0,0 +1,453 @@ +--- +name: cwl-understand-docker +description: | + Understand Docker requirements in CWL packages including DockerRequirement configuration, image + selection, networking, and volume mounting. Learn best practices for containerized process + execution in Weaver. Use when creating Docker-based CWL packages or troubleshooting Docker-related + issues. +license: Apache-2.0 +compatibility: Requires Docker understanding and CWL v1.0+. Docker must be available for local testing. +metadata: + author: fmigneault +--- + +# Understand Docker in CWL + +Master Docker requirements and configuration in CWL packages for containerized process execution. 
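
Much of the Docker advice in this skill depends on whether a `dockerPull` reference is pinned. The sketch below shows one way to check that before deployment. It is a heuristic under stated assumptions, not a Weaver or Docker API: `is_pinned` is a hypothetical name, and only `latest` and tags ending in `-latest` are treated as floating.

```python
def is_pinned(image):
    """Return True when an image reference uses a digest or a non-floating tag."""
    # A digest reference is immutable by definition.
    if "@sha256:" in image:
        return True

    # Look only at the last path segment so a registry port
    # (e.g. localhost:5000/myimage) is not mistaken for a tag.
    last_segment = image.rsplit("/", 1)[-1]
    _name, sep, tag = last_segment.partition(":")

    # No tag at all means Docker falls back to 'latest'.
    if not sep or not tag:
        return False
    return tag != "latest" and not tag.endswith("-latest")


if __name__ == "__main__":
    for ref in ("python:3.12-slim", "python:latest", "localhost:5000/myimage"):
        print(ref, is_pinned(ref))
```

A tag such as `3.12-slim` passes this check even though it can still move between patch releases, so a digest is the safer choice when exact reproducibility matters.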
+ +## When to Use + +- Creating Docker-based CWL packages +- Selecting appropriate Docker images +- Troubleshooting Docker-related failures +- Optimizing Docker image usage +- Understanding container execution in Weaver +- Configuring volume mounts and networking + +## DockerRequirement Basics + +### Simple Docker Requirement + +```yaml +cwlVersion: v1.2 +class: CommandLineTool +baseCommand: [python, script.py] + +requirements: + DockerRequirement: + dockerPull: python:3.12-slim +``` + +### Docker with Specific Tag + +```yaml +requirements: + DockerRequirement: + dockerPull: ubuntu:20.04 # Specific version, not 'latest' +``` + +### Custom Docker Image + +```yaml +requirements: + DockerRequirement: + dockerPull: myregistry.io/myimage:v1.2.3 +``` + +## Docker Image Selection + +### Official Images (Recommended) + +```yaml +DockerRequirement: + dockerPull: python:3.12-slim # Python + dockerPull: node:16-alpine # Node.js + dockerPull: openjdk:11-jre-slim # Java + dockerPull: debian:bullseye-slim # Debian + dockerPull: ubuntu:22.04 # Ubuntu +``` + +### Scientific Images + +```yaml +DockerRequirement: + dockerPull: continuumio/miniconda3:latest # Conda + dockerPull: jupyter/scipy-notebook:latest # Scientific Python + dockerPull: rocker/r-ver:4.2.0 # R +``` + +### Geospatial Images + +```yaml +DockerRequirement: + dockerPull: osgeo/gdal:ubuntu-small-latest # GDAL + dockerPull: ghcr.io/osgeo/proj:9.1.0 # PROJ +``` + +## Docker Image Best Practices + +### Use Specific Tags + +```yaml +# ❌ Bad - unpredictable +dockerPull: python:latest + +# ✅ Good - reproducible +dockerPull: python:3.12.16-slim +``` + +### Prefer Slim/Alpine Variants + +```yaml +# ❌ Large image (~1GB) +dockerPull: python:3.12 + +# ✅ Smaller image (~150MB) +dockerPull: python:3.12-slim + +# ✅ Even smaller (~50MB, but may lack libraries) +dockerPull: python:3.12-alpine +``` + +### Pin Versions for Reproducibility + +```yaml +# ✅ Exact version +dockerPull: myorg/myimage:1.2.3 + +# ✅ SHA256 digest (most precise) 
+dockerPull: myorg/myimage@sha256:abc123... +``` + +## Advanced Docker Configuration + +### Environment Variables + +```yaml +requirements: + DockerRequirement: + dockerPull: myimage:latest + + EnvVarRequirement: + envDef: + PYTHONUNBUFFERED: "1" + TZ: "UTC" + DATA_PATH: "/data" +``` + +### Network Access + +```yaml +requirements: + DockerRequirement: + dockerPull: myimage:latest + + NetworkAccess: + networkAccess: true # Allow internet access +``` + +### Resource Limits + +```yaml +requirements: + DockerRequirement: + dockerPull: myimage:latest + + ResourceRequirement: + coresMin: 2 + coresMax: 4 + ramMin: 4096 # MiB + ramMax: 8192 + tmpdirMin: 1024 + outdirMin: 2048 +``` + +## Working with Files in Docker + +### Input Files + +```yaml +# Files are automatically mounted into container +inputs: + input_file: + type: File + inputBinding: + position: 1 + # File will be available at: /tmp/path/filename +``` + +### Output Files + +```yaml +outputs: + output_file: + type: File + outputBinding: + glob: "output.txt" # Looked for in $(runtime.outdir) +``` + +### Directory Inputs + +```yaml +inputs: + input_dir: + type: Directory + inputBinding: + position: 1 +``` + +## Common Docker Patterns + +### Python Script Execution + +```yaml +cwlVersion: v1.2 +class: CommandLineTool +baseCommand: [python] + +requirements: + DockerRequirement: + dockerPull: python:3.12-slim + InitialWorkDirRequirement: + listing: + - entryname: script.py + entry: | + import sys + print(f"Processing {sys.argv[1]}") + +arguments: + - script.py + - $(inputs.input_file.path) + +inputs: + input_file: File + +outputs: + stdout_log: + type: stdout +``` + +### Installing Dependencies + +```yaml +requirements: + DockerRequirement: + dockerPull: python:3.12-slim + + InitialWorkDirRequirement: + listing: + - entryname: requirements.txt + entry: | + numpy==1.24.0 + pandas==1.5.3 + - entryname: install-deps.sh + entry: | + #!/bin/bash + pip install -r requirements.txt + +# "&&" is only interpreted by a shell, so run both commands through "bash -c" +baseCommand: [bash, -c, "bash install-deps.sh && python script.py"] +``` + +### Running Shell Scripts + +```yaml +requirements: + DockerRequirement: + dockerPull: bash:5.1 + + InitialWorkDirRequirement: + listing: + - entryname: script.sh + entry: | + #!/bin/bash + set -e + echo "Processing..." + # Your script here + +baseCommand: [bash, script.sh] +``` + +## Docker Troubleshooting + +### Image Pull Failures + +**Problem**: Cannot pull image + +``` +Error: Failed to pull image 'myimage:latest' +``` + +**Solutions**: + +```yaml +# 1. Check image exists (run: docker pull myimage:latest) + +# 2. Use full registry path +dockerPull: docker.io/library/myimage:latest + +# 3. Check authentication (for private images) +# Weaver needs registry credentials configured +``` + +### Permission Issues + +**Problem**: Cannot write files + +``` +Error: Permission denied writing to /output +``` + +**Solution**: + +```yaml +# Run as specific user +requirements: + DockerRequirement: + dockerPull: myimage:latest + # Note: User specification is environment-dependent +``` + +### Missing Dependencies + +**Problem**: Command not found in container + +``` +Error: bash: mycommand: command not found +``` + +**Solutions**: + +```yaml +# 1. Use image with command included +dockerPull: image-with-mycommand:latest + +# 2. Install in InitialWorkDirRequirement +InitialWorkDirRequirement: + listing: + - entryname: install.sh + entry: | + apt-get update && apt-get install -y mycommand + +# 3. 
Use conda/pip to install packages +``` + +### Network Access Issues + +**Problem**: Cannot download files + +``` +Error: Unable to connect to remote server +``` + +**Solution**: + +```yaml +requirements: + NetworkAccess: + networkAccess: true # Enable network +``` + +## Docker Security Considerations + +### Use Trusted Images + +```yaml +# ✅ Official images +dockerPull: python:3.12-slim + +# ✅ Verified publishers +dockerPull: bitnami/python:3.12 + +# ⚠️ Be cautious with unknown sources +dockerPull: randomuser/unknownimage:latest +``` + +### Minimize Image Size + +```dockerfile +# Use multi-stage builds in your Dockerfile +FROM python:3.12 AS builder +# Install dependencies + +FROM python:3.12-slim +# Copy only what's needed +``` + +### Keep Images Updated + +```yaml +# Regularly update pinned versions +dockerPull: python:3.12.17-slim # Update from 3.9.16 +``` + +## Integration with Weaver + +### Deploy Docker-based Process + +```bash +# 1. Create CWL with DockerRequirement +cat > process.cwl << 'EOF' +cwlVersion: v1.2 +class: CommandLineTool +baseCommand: [python, -c] +requirements: + DockerRequirement: + dockerPull: python:3.12-slim +arguments: + - "print('Hello from Docker')" +inputs: {} +outputs: + stdout: stdout +EOF + +# 2. Validate +cwltool --validate process.cwl + +# 3. 
Deploy to Weaver +weaver deploy -u $WEAVER_URL -p docker-hello -b process.cwl +``` + +### Monitor Docker Execution + +```bash +# Execute process +JOB_ID=$(weaver execute -u $WEAVER_URL -p docker-hello -f json | jq -r .jobID) + +# Check logs (shows Docker execution) +weaver logs -u $WEAVER_URL -j $JOB_ID +``` + +## Related Skills + +- [process-deploy](../process-deploy/) - Deploy Docker-based packages +- [cwl-validate-package](../cwl-validate-package/) - Validate Docker config +- [job-logs](../job-logs/) - Debug Docker execution +- [job-exceptions](../job-exceptions/) - Handle Docker errors +- [cwl-debug-package](../cwl-debug-package/) - Troubleshoot Docker issues + +## Documentation + +- [CWL DockerRequirement](https://www.commonwl.org/v1.2/CommandLineTool.html#DockerRequirement) +- [Docker Best Practices](https://docs.docker.com/develop/dev-best-practices/) +- [Weaver Docker Support](https://pavics-weaver.readthedocs.io/en/latest/package.html) +- [Docker Hub](https://hub.docker.com/) + +## Tools + +- **docker pull**: Test image availability +- **docker run**: Test container locally +- **dive**: Analyze image layers +- **trivy**: Scan for vulnerabilities + +## Best Practices Summary + +1. ✅ Use specific image tags (not `latest`) +2. ✅ Prefer official and verified images +3. ✅ Use slim/alpine variants when possible +4. ✅ Pin versions for reproducibility +5. ✅ Test locally before deploying +6. ✅ Enable NetworkAccess if needed +7. ✅ Handle file permissions appropriately +8. ✅ Keep images updated and secure +9. ✅ Document why specific images are chosen +10. ✅ Consider image size and pull time diff --git a/.agents/skills/cwl-understand-workflow/SKILL.md b/.agents/skills/cwl-understand-workflow/SKILL.md new file mode 100644 index 000000000..dc18b7314 --- /dev/null +++ b/.agents/skills/cwl-understand-workflow/SKILL.md @@ -0,0 +1,418 @@ +--- +name: cwl-understand-workflow +description: | + Understand CWL Workflow class structures for chaining multiple processing steps. 
Learn how to + connect outputs to inputs, manage data flow between steps, and create complex multi-step + workflows. Use when building workflows that chain multiple processes together. +license: Apache-2.0 +compatibility: Requires understanding of CWL CommandLineTool basics. Supports CWL v1.0, v1.1, v1.2. +metadata: + author: fmigneault +--- + +# Understand CWL Workflows + +Learn to create and understand CWL Workflow class structures for chaining multiple processing steps. + +## When to Use + +- Building multi-step data processing pipelines +- Chaining multiple tools together +- Understanding workflow execution order +- Debugging workflow step connections +- Optimizing data flow between steps + +## Workflow Basics + +### Workflow vs CommandLineTool + +**CommandLineTool**: Single process execution + +```yaml +class: CommandLineTool # Runs one command +baseCommand: [process] +``` + +**Workflow**: Chain multiple steps + +```yaml +class: Workflow # Chains multiple tools +steps: + step1: ... + step2: ... +``` + +## Simple Workflow Example + +```yaml +cwlVersion: v1.2 +class: Workflow + +inputs: + input_file: File + threshold: float + +outputs: + final_output: + type: File + outputSource: step2/output + +steps: + step1: + run: preprocess.cwl + in: + input: input_file + out: [processed] + + step2: + run: analyze.cwl + in: + data: step1/processed + threshold: threshold + out: [output] +``` + +## Workflow Components + +### 1. Inputs + +Workflow-level inputs that can be used by any step: + +```yaml +inputs: + input_files: + type: File[] # Array of files + parameter: + type: string + default: "default_value" + optional_param: + type: string? # Optional input +``` + +### 2. Outputs + +Final outputs from the workflow: + +```yaml +outputs: + result: + type: File + outputSource: final_step/output # From which step + + intermediate: + type: File + outputSource: step1/output # Can expose intermediate results +``` + +### 3. 
Steps + +Individual processing steps: + +```yaml +steps: + step_name: + run: # Inline tool definition + class: CommandLineTool + baseCommand: [echo] + + in: # Map workflow inputs to step inputs + step_input: workflow_input + another: some_step/output + + out: [output1, output2] # Step outputs +``` + +The `run` field can also reference an external CWL file. + +> ⚠️ WARNING: The referenced `tool.cwl` must correspond to a process that is already deployed (with ID `tool`) for +> this CWL to pass [Weaver deployment](../process-deploy/). + +```yaml +steps: + step_name: + run: tool.cwl # CWL file to run +``` + +## Data Flow Patterns + +### Sequential Processing + +```yaml +steps: + download: + run: download.cwl + in: {url: input_url} + out: [file] + + process: + run: process.cwl + in: {input: download/file} # Uses output from download + out: [result] + + upload: + run: upload.cwl + in: {file: process/result} # Uses output from process + out: [location] +``` + +### Parallel Processing + +```yaml +steps: + # These can run in parallel (no dependencies) + process_a: + run: tool-a.cwl + in: {input: input_file} + out: [output_a] + + process_b: + run: tool-b.cwl + in: {input: input_file} # Same input, independent processing + out: [output_b] + + # This waits for both to complete + merge: + run: merge.cwl + in: + file_a: process_a/output_a + file_b: process_b/output_b + out: [merged] +``` + +### Scatter/Gather Pattern + +```yaml +steps: + process_many: + run: process-one.cwl + scatter: input_file # Run on each file + in: + input_file: input_files # Array input + out: [output] # Array output + + combine: + run: combine.cwl + in: + files: process_many/output # Array of outputs + out: [combined] +``` + +## Advanced Workflow Features + +### ScatterMethod + +```yaml +steps: + process: + run: tool.cwl + scatter: [input1, input2] # Scatter over multiple inputs + scatterMethod: dotproduct # How to combine + # dotproduct: pair inputs (1 with 1, 2 with 2) + # nested_crossproduct: all 
combinations + # flat_crossproduct: all combinations, flatten + in: + input1: files_a + input2: files_b + out: [output] +``` + +### Conditional Execution (CWL v1.2+) + +```yaml +steps: + optional_step: + run: tool.cwl + when: $(inputs.do_process) # Only run if true + in: + do_process: run_optional + input: data + out: [output] +``` + +### SubWorkflows + +```yaml +steps: + sub_workflow: + run: another-workflow.cwl # Run a workflow as a step + in: + workflow_input: my_input + out: [workflow_output] +``` + +## Common Workflow Patterns + +### Preprocessing Pipeline + +```yaml +steps: + validate: + run: validate.cwl + in: {input: raw_data} + out: [validated] + + clean: + run: clean.cwl + in: {input: validate/validated} + out: [cleaned] + + transform: + run: transform.cwl + in: {input: clean/cleaned} + out: [transformed] + + analyze: + run: analyze.cwl + in: {data: transform/transformed, params: parameters} + out: [results] +``` + +### Map-Reduce Pattern + +```yaml +steps: + # Map: process each item + map: + run: mapper.cwl + scatter: item + in: {item: input_items} + out: [mapped] + + # Reduce: combine results + reduce: + run: reducer.cwl + in: {items: map/mapped} + out: [result] +``` + +## Debugging Workflows + +### Visualize Workflow + +```bash +# Generate workflow diagram +cwltool --print-dot workflow.cwl | dot -Tpng > workflow.png +``` + +### Check Step Connections + +```bash +# Validate connections +cwltool --validate workflow.cwl + +# Print the CWL documents the workflow depends on +cwltool --print-deps workflow.cwl inputs.json +``` + +### Test Individual Steps + +```bash +# Test each step separately +cwltool step1.cwl step1-inputs.json +cwltool step2.cwl step2-inputs.json +``` + +### Enable Debug Output + +```bash +# See detailed execution +cwltool --debug workflow.cwl inputs.json +``` + +## Requirements for Workflows + +### Subworkflow Feature + +```yaml +requirements: + SubworkflowFeatureRequirement: {} +``` + +### Scatter Feature + +```yaml +requirements: + ScatterFeatureRequirement: {} 
+``` + +### Multiple Input Feature + +```yaml +requirements: + MultipleInputFeatureRequirement: {} +``` + +### Step Input Expression + +```yaml +requirements: + StepInputExpressionRequirement: {} + +steps: + process: + run: tool.cwl + in: + computed_input: + valueFrom: $(inputs.x + inputs.y) # Compute from other inputs +``` + +## Weaver Workflow Example + +Complete workflow for data processing: + +```yaml +cwlVersion: v1.2 +class: Workflow + +requirements: + ScatterFeatureRequirement: {} + +inputs: + netcdf_files: File[] + region: string + +outputs: + statistics: + type: File + outputSource: compute_stats/output + +steps: + subset: + run: subset-by-region.cwl + scatter: input_file + in: + input_file: netcdf_files + region: region + out: [subset_file] + + compute_stats: + run: compute-statistics.cwl + in: + input_files: subset/subset_file + out: [output] +``` + +## Related Skills + +- [process-deploy](../process-deploy/) - Deploy workflow to Weaver +- [job-execute](../job-execute/) - Execute workflow +- [job-provenance](../job-provenance/) - Track workflow execution +- [cwl-validate-package](../cwl-validate-package/) - Validate workflow +- [cwl-debug-package](../cwl-debug-package/) - Debug workflow issues + +## Documentation + +- [CWL Workflows](https://www.commonwl.org/user_guide/23-scatter-workflow/) +- [Weaver Workflows](https://pavics-weaver.readthedocs.io/en/latest/package.html) +- [Workflow Patterns](https://www.commonwl.org/user_guide/) +- [Process Chaining](https://pavics-weaver.readthedocs.io/en/latest/processes.html) + +## Best Practices + +1. **Keep steps modular** - Each step should do one thing well +2. **Use descriptive names** - Clear step and variable names +3. **Document data flow** - Comment complex connections +4. **Test incrementally** - Verify each step works before chaining +5. **Handle errors** - Consider what happens if a step fails +6. **Optimize scatter** - Use appropriate scatterMethod for your use case +7. 
**Version control** - Track workflow changes diff --git a/.agents/skills/cwl-use-expressions/SKILL.md b/.agents/skills/cwl-use-expressions/SKILL.md new file mode 100644 index 000000000..4a045cb7e --- /dev/null +++ b/.agents/skills/cwl-use-expressions/SKILL.md @@ -0,0 +1,546 @@ +--- +name: cwl-use-expressions +description: | + Use CWL expressions and JavaScript for dynamic behavior including parameter transformation, + conditional logic, and computed values. Learn to leverage InlineJavascriptRequirement for powerful + CWL packages. Use when you need dynamic, computed, or conditional behavior in CWL processes. +license: Apache-2.0 +compatibility: Requires CWL v1.0+ with InlineJavascriptRequirement support. +metadata: + author: fmigneault +--- + +# Use CWL Expressions and JavaScript + +Master CWL expressions and JavaScript for dynamic, powerful CWL packages. + +## When to Use + +- Computing values from inputs +- Conditional execution or arguments +- Transforming file names or paths +- Dynamic output naming +- Complex parameter manipulation +- Conditional validation + +## Expression Syntax + +### Parameter References + +```yaml +$(inputs.parameter_name) # Reference input +$(self) # Current value +$(runtime.outdir) # Runtime output directory +$(runtime.tmpdir) # Runtime temp directory +$(runtime.cores) # Available CPU cores +$(runtime.ram) # Available RAM (MB) +``` + +### Simple Expressions + +```yaml +inputs: + value: + type: int + inputBinding: + valueFrom: $(self * 2) # Double the value (arithmetic requires InlineJavascriptRequirement) +``` + +## Enabling JavaScript + +### InlineJavascriptRequirement + +```yaml +requirements: + InlineJavascriptRequirement: {} + +# Now you can use JavaScript expressions +``` + +## JavaScript Expressions + +### Basic Syntax + +```yaml +# Single-line +valueFrom: $(inputs.x + inputs.y) +``` + +```yaml +# Multi-line +valueFrom: | + ${ + return inputs.x + inputs.y; + } +``` + +### String Manipulation + +```yaml +inputs: + filename: + type: string + +outputs: + output: + type: File + 
outputBinding: + glob: | + ${ + return inputs.filename.replace('.txt', '_processed.txt'); + } +``` + +### File Operations + +```yaml +inputs: + input_file: + type: File + +arguments: + # Use just the filename + - valueFrom: $(inputs.input_file.basename) + + # Use filename without extension + - valueFrom: $(inputs.input_file.nameroot) + + # Get file extension + - valueFrom: $(inputs.input_file.nameext) + + # Get directory + - valueFrom: $(inputs.input_file.dirname) + + # Get file size + - valueFrom: $(inputs.input_file.size) +``` + +## Common Patterns + +### Conditional Arguments + +```yaml +requirements: + InlineJavascriptRequirement: {} + +inputs: + verbose: + type: boolean + default: false + + debug: + type: boolean? + +arguments: + # Add --verbose if true + - valueFrom: | + ${ + return inputs.verbose ? "--verbose" : null; + } + + # Add --debug if present and true + - valueFrom: | + ${ + return inputs.debug ? "--debug" : null; + } +``` + +### Computed Output Names + +```yaml +inputs: + input_file: + type: File + prefix: + type: string + default: "processed" + +outputs: + output_file: + type: File + outputBinding: + glob: | + ${ + var base = inputs.input_file.nameroot; + var ext = inputs.input_file.nameext; + return inputs.prefix + "_" + base + ext; + } +``` + +### Array Processing + +```yaml +inputs: + files: + type: File[] + +arguments: + # Join array elements + - valueFrom: | + ${ + return inputs.files.map(function(f) { + return f.path; + }).join(','); + } +``` + +### Conditional Defaults + +```yaml +inputs: + threshold: + type: float? 
+ + auto_threshold: + type: boolean + default: false + +arguments: + - prefix: --threshold + valueFrom: | + ${ + if (inputs.threshold !== null) { + return inputs.threshold; + } else if (inputs.auto_threshold) { + return 0.5; // Auto value + } else { + return null; // No threshold + } + } +``` + +## Advanced Techniques + +### Complex Validation + +```yaml +requirements: + InlineJavascriptRequirement: {} + +inputs: + value: + type: int + +arguments: + - valueFrom: | + ${ + if (inputs.value < 0 || inputs.value > 100) { + throw "Value must be between 0 and 100"; + } + return inputs.value; + } +``` + +### Dynamic Command Building + +```yaml +baseCommand: [python, -c] + +inputs: + operation: + type: string + value_a: + type: float + value_b: + type: float + +arguments: + - valueFrom: | + ${ + var ops = { + "add": inputs.value_a + inputs.value_b, + "subtract": inputs.value_a - inputs.value_b, + "multiply": inputs.value_a * inputs.value_b, + "divide": inputs.value_a / inputs.value_b + }; + return "print(" + ops[inputs.operation] + ")"; + } +``` + +### Format Conversion + +```yaml +inputs: + date_string: + type: string # "2026-02-19" + +arguments: + - valueFrom: | + ${ + // Convert date format + var parts = inputs.date_string.split('-'); + return parts[2] + '/' + parts[1] + '/' + parts[0]; + // Returns: "19/02/2026" + } +``` + +### Resource Calculation + +```yaml +requirements: + ResourceRequirement: + ramMin: | + ${ + // Calculate RAM based on input file size + var fileSize = inputs.input_file.size / (1024 * 1024); // MB + return Math.max(2048, fileSize * 4); // 4x file size, min 2GB + } +``` + +### Array Filtering + +```yaml +inputs: + files: + type: File[] + min_size: + type: int + default: 0 + +arguments: + - valueFrom: | + ${ + // Filter files by size + return inputs.files + .filter(function(f) { + return f.size > inputs.min_size; + }) + .map(function(f) { + return f.path; + }) + .join(' '); + } +``` + +## InitialWorkDirRequirement with Expressions + +### Dynamic File 
Generation + +```yaml +requirements: + InlineJavascriptRequirement: {} + InitialWorkDirRequirement: + listing: + - entryname: config.json + entry: | + ${ + return JSON.stringify({ + "input": inputs.input_file.path, + "threshold": inputs.threshold, + "output": runtime.outdir + "/result.txt" + }, null, 2); + } +``` + +### Conditional File Staging + +```yaml +requirements: + InitialWorkDirRequirement: + listing: | + ${ + var files = [inputs.required_file]; + if (inputs.optional_file !== null) { + files.push(inputs.optional_file); + } + return files; + } +``` + +## Runtime Information + +### Available Runtime Properties + +```yaml +arguments: + # Output directory + - valueFrom: $(runtime.outdir) + + # Temp directory + - valueFrom: $(runtime.tmpdir) + + # CPU cores available + - valueFrom: $(runtime.cores) + + # RAM available (MB) + - valueFrom: $(runtime.ram) +``` + +### Using Runtime in Paths + +```yaml +outputs: + output: + type: File + outputBinding: + glob: | + ${ + return runtime.outdir + "/output.txt"; + } +``` + +## Debugging Expressions + +### Add Logging + +```yaml +arguments: + - valueFrom: | + ${ + console.log("Input value:", inputs.value); + console.log("Computed result:", inputs.value * 2); + return inputs.value * 2; + } +``` + +### Test Locally + +```bash +# Run with cwltool to see expression output +cwltool --debug tool.cwl inputs.json +``` + +## Best Practices + +### 1. Keep Expressions Simple + +```yaml +# ❌ Too complex +valueFrom: | + ${ + var result; + if (inputs.a) { + if (inputs.b) { + result = inputs.a + inputs.b; + } else { + result = inputs.a; + } + } else { + result = 0; + } + return result; + } +``` + +```yaml +# ✅ Better - simplify logic +valueFrom: $(inputs.a + (inputs.b || 0)) +``` + +### 2. Use Null Checks + +```yaml +valueFrom: | + ${ + return inputs.optional !== null ? inputs.optional : "default"; + } +``` + +### 3. 
Document Complex Expressions + +```yaml +inputs: + files: + type: File[] + +arguments: + # Filter files > 1MB and join paths with commas + - valueFrom: | + ${ + return inputs.files + .filter(function(f) { return f.size > 1048576; }) + .map(function(f) { return f.path; }) + .join(','); + } +``` + +### 4. Avoid Side Effects + +```yaml +# ❌ Don't modify inputs +valueFrom: | + ${ + inputs.value = inputs.value * 2; // Bad! + return inputs.value; + } +``` + +```yaml +# ✅ Return new value +valueFrom: $(inputs.value * 2) +``` + +### 5. Handle Errors Gracefully + +```yaml +valueFrom: | + ${ + try { + return someComplexOperation(inputs.value); + } catch (e) { + console.error("Error:", e.message); + throw e; + } + } +``` + +## Common Gotchas + +### Null vs Undefined + +```yaml +# CWL uses null for missing optional inputs +valueFrom: | + ${ + // ✅ Check for null + if (inputs.optional === null) { + return "default"; + } + return inputs.optional; + } +``` + +### File Path vs Object + +The `inputs.file` is an object with properties. The `file` portion is the input ID. 
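+ +As a rough sketch of what that object looks like, the fields below follow the CWL `File` type; the path and size values are hypothetical and only meant for illustration: + +```yaml +# Illustrative File object as seen by an expression (values are made up) +class: File +path: /var/lib/cwl/stage/data.txt +basename: data.txt +nameroot: data +nameext: .txt +size: 2048 +``` + +An expression therefore has to select one of these properties to obtain a plain string or number. 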
+ +```yaml +# ✅ Use .path for the file path +valueFrom: $(inputs.file.path) +``` + +```yaml +# ❌ Returns object +valueFrom: $(inputs.file) +``` + +### String Concatenation + +```yaml +# ✅ Use + for concatenation +valueFrom: $(inputs.prefix + "_" + inputs.suffix) +``` + +```yaml +# ❌ Invalid syntax - adjacent values are not joined implicitly +valueFrom: $(inputs.prefix inputs.suffix) +``` + +## Related Skills + +- [cwl-create-commandlinetool](../cwl-create-commandlinetool/) - Build CWL tools +- [cwl-understand-workflow](../cwl-understand-workflow/) - Use in workflows +- [cwl-debug-package](../cwl-debug-package/) - Debug expressions +- [cwl-validate-package](../cwl-validate-package/) - Validate expressions + +## Documentation + +- [CWL Expressions](https://www.commonwl.org/v1.2/CommandLineTool.html#Expressions) +- [InlineJavascriptRequirement](https://www.commonwl.org/v1.2/CommandLineTool.html#InlineJavascriptRequirement) +- [JavaScript in CWL](https://www.commonwl.org/user_guide/17-expressions/) +- [Runtime Context](https://www.commonwl.org/v1.2/CommandLineTool.html#Runtime_environment) + +## Examples Repository + +Check the CWL examples repository for more expression patterns: +[https://github.com/common-workflow-language/workflows](https://github.com/common-workflow-language/workflows) diff --git a/.agents/skills/cwl-validate-package/SKILL.md b/.agents/skills/cwl-validate-package/SKILL.md new file mode 100644 index 000000000..d503956b3 --- /dev/null +++ b/.agents/skills/cwl-validate-package/SKILL.md @@ -0,0 +1,237 @@ +--- +name: cwl-validate-package +description: | + Validate CWL package syntax and structure before deployment to Weaver. Check for syntax errors, + Docker requirements, input/output definitions, and CWL version compatibility. Use to catch errors + early and ensure package quality before deployment. +license: Apache-2.0 +compatibility: Requires cwltool installed locally. Supports CWL v1.0, v1.1, v1.2. 
+metadata: + author: fmigneault +--- + +# Validate CWL Package + +Validate CWL package syntax and structure before deploying to Weaver. + +## When to Use + +- Before deploying a new process to catch errors early +- When creating or modifying CWL packages +- To verify Docker requirements are properly specified +- When troubleshooting package deployment failures +- To ensure CWL version compatibility + +## Parameters + +### Required + +- **package_file** (path): CWL file to validate (.cwl or .yaml) + +### Optional + +- **strict** (boolean): Enable strict validation mode +- **check_docker** (boolean): Verify Docker images are accessible + +## CLI Usage + +```bash +# Basic validation +cwltool --validate process.cwl + +# Validate with strict mode +cwltool --validate --strict process.cwl + +# Validate and check requirements +cwltool --print-pre --validate process.cwl + +# Validate workflow with dependencies +cwltool --validate workflow.cwl +``` + +## Validation Checks + +### Syntax Validation + +- CWL version compatibility +- YAML/JSON structure +- Required fields present +- Type definitions correct + +### Semantic Validation + +- Input/output types match +- Command line bindings valid +- File paths resolvable +- Expressions syntax correct + +### Docker Validation + +- DockerRequirement properly formatted +- Image names valid +- Tags specified (recommended) + +### Workflow Validation + +- Step names unique +- Input/output connections valid +- No circular dependencies +- All required inputs provided + +## Common Issues and Fixes + +### Issue: "Unknown field 'xyz'" + +```yaml +# ❌ Wrong - typo in field name +DockerRequirment: # Missing 'e' + dockerPull: image + +# ✅ Correct +DockerRequirement: + dockerPull: image +``` + +### Issue: "Type mismatch" + +```yaml +# ❌ Wrong - string where File expected +inputs: + input_file: string +``` + +```yaml +# ✅ Correct +inputs: + input_file: File +``` + +### Issue: "Missing required field" + +```yaml +# ❌ Wrong - missing class 
+cwlVersion: v1.2 +``` + +```yaml +# ✅ Correct +cwlVersion: v1.2 +class: CommandLineTool +``` + +### Issue: "Invalid expression" + +```yaml +# ❌ Wrong - incorrect JavaScript syntax +arguments: ["$(runtime.outdir"] # Missing closing ) +``` + +```yaml +# ✅ Correct +arguments: ["$(runtime.outdir)"] +``` + +## Example: Valid CWL Package + +```yaml +cwlVersion: v1.2 +class: CommandLineTool +baseCommand: [echo] + +requirements: + DockerRequirement: + dockerPull: debian:stable-slim + +inputs: + message: + type: string + inputBinding: + position: 1 + +outputs: + output: + type: stdout + +stdout: output.txt +``` + +## Validation Output + +### Success + +``` +process.cwl is valid CWL +``` + +### Errors + +``` +ERROR process.cwl:5:1: Unknown field `DockerRequirment` + Did you mean `DockerRequirement`? +``` + +## Advanced Validation + +### Check Docker Images + +```bash +# Verify Docker image exists +docker pull $(grep dockerPull process.cwl | cut -d: -f2-) +``` + +### Test Locally + +```bash +# Run with sample inputs +cwltool process.cwl inputs.json +``` + +### Validate Against Schema + +```bash +# Use CWL schema validator +schema-salad-tool --print-jsonld-context process.cwl +``` + +## Integration with Weaver + +After validation, deploy to Weaver: + +```bash +# 1. Validate locally +cwltool --validate process.cwl + +# 2. Test with sample data +cwltool process.cwl test-inputs.json + +# 3. 
Deploy to Weaver +weaver deploy -u $WEAVER_URL -p my-process -b process.cwl +``` + +## Related Skills + +- [process-deploy](../process-deploy/) - Deploy validated package +- [cwl-debug-package](../cwl-debug-package/) - Debug validation failures +- [cwl-understand-docker](../cwl-understand-docker/) - Docker requirements +- [job-exceptions](../job-exceptions/) - Debug deployment errors + +## Documentation + +- [CWL Application Packages](https://pavics-weaver.readthedocs.io/en/latest/package.html) +- [CWL Specification](https://www.commonwl.org/v1.2/) +- [CWL User Guide](https://www.commonwl.org/user_guide/) +- [Process Deployment](https://pavics-weaver.readthedocs.io/en/latest/processes.html) + +## Tools + +- **cwltool**: Reference CWL implementation +- **schema-salad**: CWL schema validator +- **Docker**: For testing Docker-based packages + +## Best Practices + +1. **Always validate** before deploying to Weaver +2. **Use specific Docker tags** (not `latest`) +3. **Test locally** with sample data +4. **Document inputs/outputs** with descriptions +5. **Pin CWL version** explicitly (v1.0, v1.1, v1.2) diff --git a/.agents/skills/job-dismiss/SKILL.md b/.agents/skills/job-dismiss/SKILL.md new file mode 100644 index 000000000..1cb902ef8 --- /dev/null +++ b/.agents/skills/job-dismiss/SKILL.md @@ -0,0 +1,102 @@ +--- +name: job-dismiss +description: | + Cancel a running or pending job. The job status will be updated to "dismissed" and execution will + be terminated. Use when you need to stop a job that is taking too long, was submitted with + incorrect parameters, or is no longer needed. +license: Apache-2.0 +compatibility: Requires Weaver API access with job management permissions. +metadata: + author: fmigneault +--- + +# Dismiss Job + +Cancel a running or pending job and mark it as dismissed. 
+ +## When to Use + +- Stopping a job with incorrect parameters +- Cancelling long-running jobs no longer needed +- Freeing resources from stuck jobs +- Interrupting jobs during testing +- Managing resource allocation + +## Parameters + +### Required + +- **job_id** (string): Job identifier to cancel + +## CLI Usage + +```bash +# Dismiss a job +weaver dismiss -u $WEAVER_URL -j a1b2c3d4-e5f6-7890-abcd-ef1234567890 + +# Check status after dismissal +weaver status -u $WEAVER_URL -j a1b2c3d4-e5f6-7890-abcd-ef1234567890 +``` + +## Python Usage + +```python +from weaver.cli import WeaverClient + +client = WeaverClient(url="https://weaver.example.com") + +# Dismiss job +result = client.dismiss(job_id="a1b2c3d4-e5f6-7890-abcd-ef1234567890") + +if result.success: + print("Job dismissed successfully") + +# Verify status +status = client.status(job_id="a1b2c3d4-e5f6-7890-abcd-ef1234567890") +print(f"Job status: {status.body['status']}") # Should be "dismissed" +``` + +## API Request + +```bash +curl -X DELETE \ + "${WEAVER_URL}/jobs/a1b2c3d4-e5f6-7890-abcd-ef1234567890" +``` + +## Returns + +```json +{ + "jobID": "a1b2c3d4-e5f6-7890-abcd-ef1234567890", + "status": "dismissed", + "message": "Job dismissed by user request" +} +``` + +**Note**: Response may include additional fields. See +[API documentation](https://pavics-weaver.readthedocs.io/en/latest/api.html) for complete response schemas. 
+ +## Behavior + +- **Running jobs**: Will be terminated gracefully if possible +- **Pending jobs**: Removed from queue without execution +- **Completed jobs**: Cannot be dismissed (already finished) +- **Failed jobs**: Cannot be dismissed (already terminated) + +## Error Handling + +- **404 Not Found**: Job does not exist +- **403 Forbidden**: Insufficient permissions +- **410 Gone**: Job already dismissed + +## Related Skills + +- [job-execute](../job-execute/) - Start a job +- [job-status](../job-status/) - Check job status +- [job-monitor](../job-monitor/) - Monitor execution +- [job-list](../job-list/) - Find jobs to dismiss + +## Documentation + +- [Job Dismissal](https://pavics-weaver.readthedocs.io/en/latest/processes.html) +- [CLI Reference](https://pavics-weaver.readthedocs.io/en/latest/cli.html) diff --git a/.agents/skills/job-exceptions/SKILL.md b/.agents/skills/job-exceptions/SKILL.md new file mode 100644 index 000000000..574000f63 --- /dev/null +++ b/.agents/skills/job-exceptions/SKILL.md @@ -0,0 +1,110 @@ +--- +name: job-exceptions +description: | + Retrieve detailed exception and error information for failed jobs including error messages, stack + traces, and debugging information. Use when diagnosing job failures or troubleshooting process + execution issues. +license: Apache-2.0 +compatibility: Requires Weaver API access. +metadata: + author: fmigneault +--- + +# Get Job Exceptions + +Retrieve detailed exception and error information for failed jobs. 
+ +## When to Use + +- Diagnosing why a job failed +- Getting detailed error messages +- Debugging process execution issues +- Reporting bugs or issues +- Understanding failure causes + +## Parameters + +### Required + +- **job_id** (string): Job identifier + +## CLI Usage + +```bash +# Get exceptions for a failed job +weaver exceptions -u $WEAVER_URL -j a1b2c3d4-e5f6-7890-abcd-ef1234567890 + +# Combine with logs for full context +weaver logs -u $WEAVER_URL -j a1b2c3d4-e5f6-7890-abcd-ef1234567890 +weaver exceptions -u $WEAVER_URL -j a1b2c3d4-e5f6-7890-abcd-ef1234567890 +``` + +## Python Usage + +```python +from weaver.cli import WeaverClient + +client = WeaverClient(url="https://weaver.example.com") + +# Get exceptions +exceptions = client.exceptions(job_id="a1b2c3d4-e5f6-7890-abcd-ef1234567890") + +for exception in exceptions.body.get("exceptions", []): + print(f"Error: {exception.get('Text', 'Unknown error')}") + print(f"Code: {exception.get('Code', 'N/A')}") +``` + +## API Request + +```bash +curl -X GET \ + "${WEAVER_URL}/jobs/a1b2c3d4-e5f6-7890-abcd-ef1234567890/exceptions" +``` + +## Returns + +```json +{ + "exceptions": [ + { + "Code": "InvalidParameterValue", + "Text": "Input parameter 'threshold' must be between 0 and 1, got 1.5", + "Locator": "threshold" + } + ] +} +``` + +**Note**: Response may include additional fields. See +[API documentation](https://pavics-weaver.readthedocs.io/en/latest/api.html) for complete response schemas. 
+ +## Exception Information + +Typical exception fields: + +- **Code**: Error code (e.g., InvalidParameterValue, ProcessFailed) +- **Text**: Human-readable error message +- **Locator**: Which parameter or component caused the error +- **StackTrace**: Detailed stack trace (if available) + +## Common Error Codes + +- **InvalidParameterValue**: Invalid input parameter +- **MissingParameterValue**: Required parameter not provided +- **ProcessFailed**: Process execution failed +- **NoApplicableCode**: Generic error +- **StorageQuotaExceeded**: Insufficient storage space +- **NetworkError**: Network connectivity issues + +## Related Skills + +- [job-logs](../job-logs/) - View execution logs +- [job-status](../job-status/) - Check job status +- [job-execute](../job-execute/) - Retry with corrected parameters +- [process-describe](../process-describe/) - Validate input requirements + +## Documentation + +- [Job Exceptions](https://pavics-weaver.readthedocs.io/en/latest/processes.html) +- [Error Handling](https://pavics-weaver.readthedocs.io/en/latest/processes.html) +- [CLI Reference](https://pavics-weaver.readthedocs.io/en/latest/cli.html) diff --git a/.agents/skills/job-execute/SKILL.md b/.agents/skills/job-execute/SKILL.md new file mode 100644 index 000000000..db4039ebf --- /dev/null +++ b/.agents/skills/job-execute/SKILL.md @@ -0,0 +1,215 @@ +--- +name: job-execute +description: | + Execute a deployed process with specified inputs. Supports synchronous and asynchronous execution + modes with various output formats. Use when you need to run a process with specific input data and + retrieve results. +license: Apache-2.0 +compatibility: Requires Weaver API access. Supports async/sync execution modes. +metadata: + author: fmigneault +--- + +# Execute Process + +Execute a deployed process with specified inputs in synchronous or asynchronous mode. 
+ +## When to Use + +- Running a deployed process with input data +- Starting asynchronous long-running jobs +- Getting immediate results from fast processes (sync mode) +- Executing workflows with multiple steps + +## Parameters + +### Required + +- **process_id** (string): Process identifier to execute +- **inputs** (object or file): Process input values + - Can be JSON/YAML file path + - Can be inline key=value pairs + - Can be CWL input format + +### Optional + +- **mode** (string): Execution mode + - `async`: Asynchronous execution (default) - returns job ID immediately + - `sync`: Synchronous execution - waits for completion + - `auto`: Let server decide based on estimated duration +- **response** (string): Response format + - `document`: Full job status document (default) + - `raw`: Direct output results +- **output_transmission** (string): How outputs are returned + - `reference`: URLs to output files (default) + - `value`: Inline output values +- **subscribers** (object): Notification callbacks for job events +- **headers** (object): Custom HTTP headers + +## CLI Usage + +```bash +# Async execution with inputs from file +weaver execute -u $WEAVER_URL -p my-process -I inputs.json + +# Sync execution with inline inputs +weaver execute -u $WEAVER_URL -p echo -i message="Hello World" -M sync + +# With monitoring +weaver execute -u $WEAVER_URL -p my-process -I inputs.json -M + +# Execute and wait for results +JOB_ID=$(weaver execute -u $WEAVER_URL -p my-process -I inputs.json -f json | jq -r .jobID) +weaver monitor -u $WEAVER_URL -j $JOB_ID +``` + +## Python Usage + +```python +from weaver.cli import WeaverClient + +client = WeaverClient(url="https://weaver.example.com") + +# Async execution +result = client.execute( + process_id="my-process", + inputs={"input1": "value1", "input2": "value2"}, + mode="async" +) + +job_id = result.body["jobID"] +print(f"Job started: {job_id}") + +# Sync execution +result = client.execute( + process_id="echo", + 
inputs={"message": "Hello"}, + mode="sync" +) + +print(f"Result: {result.body}") +``` + +## API Request + +```bash +curl -X POST \ + -H "Content-Type: application/json" \ + -H "Prefer: respond-async" \ + -d '{ + "inputs": { + "input1": "value1", + "input2": {"href": "https://example.com/data.txt"} + }, + "outputs": { + "output1": {"transmissionMode": "reference"} + } +}' \ + "${WEAVER_URL}/processes/my-process/execution" +``` + +## Returns + +### For Async Mode + +```json +{ + "jobID": "b2c3d4e5-f6a7-8901-bcde-f12345678901", + "status": "accepted", + "location": "https://weaver.example.com/jobs/b2c3d4e5-f6a7-8901-bcde-f12345678901", + "created": "2026-02-19T10:00:00Z", + "processID": "my-process" +} +``` + +**Note**: Response may include additional fields such as `links`, `message`, `progress`, and execution details. See +[API documentation](https://pavics-weaver.readthedocs.io/en/latest/api.html) for complete response schemas. + +### For Sync Mode + +```json +{ + "outputs": { + "output1": { + "href": "https://weaver.example.com/outputs/result.nc", + "type": "application/netcdf" + } + }, + "status": "succeeded", + "duration": "PT2M30S" +} +``` + +**Note**: Synchronous responses include complete output data or references. Additional fields may include `logs`, +`statistics`, and `provenance`. 
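The two response shapes above can be told apart by their keys. The sketch below relies only on fields shown in the samples (`jobID` for an accepted asynchronous job, `outputs` with `href` entries for a synchronous result). `describe_execution` is a hypothetical helper name, and a real response may carry more fields than it inspects.

```python
def describe_execution(body):
    """Classify an execute response body as sync, async, or unknown.

    Relies only on the keys shown in the sample responses:
    'outputs' for a finished synchronous run, 'jobID' for an accepted job.
    """
    if "outputs" in body:
        # Synchronous result: collect output references (href) where present.
        references = {
            name: value["href"]
            for name, value in body["outputs"].items()
            if isinstance(value, dict) and "href" in value
        }
        return {"mode": "sync", "status": body.get("status"), "references": references}
    if "jobID" in body:
        # Asynchronous submission: the job still has to be monitored.
        return {"mode": "async", "status": body.get("status"), "job_id": body["jobID"]}
    return {"mode": "unknown", "status": body.get("status")}


# Bodies trimmed from the async and sync samples above.
async_body = {"jobID": "b2c3d4e5-f6a7-8901-bcde-f12345678901", "status": "accepted"}
sync_body = {
    "outputs": {
        "output1": {
            "href": "https://weaver.example.com/outputs/result.nc",
            "type": "application/netcdf",
        }
    },
    "status": "succeeded",
}

print(describe_execution(async_body))
print(describe_execution(sync_body))
```

An `async` result would typically be followed by monitoring the returned job ID, while a `sync` result can be consumed directly.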
+ +## Input Format Examples + +### Literal Values + +```json +{ + "inputs": { + "message": "Hello World", + "count": 42, + "enabled": true + } +} +``` + +### File References + +```json +{ + "inputs": { + "input_file": { + "href": "https://example.com/data.nc" + } + } +} +``` + +### Multiple Files (Array) + +```json +{ + "inputs": { + "input_files": [ + {"href": "https://example.com/file1.txt"}, + {"href": "https://example.com/file2.txt"} + ] + } +} +``` + +### Vault References + +```json +{ + "inputs": { + "credentials": { + "href": "vault://my-secret-token" + } + } +} +``` + +## Error Handling + +- **404 Not Found**: Process does not exist +- **400 Bad Request**: Invalid inputs or parameters +- **422 Unprocessable Entity**: Input validation failed +- **503 Service Unavailable**: Execution resources unavailable + +## Related Skills + +- [job-monitor](../job-monitor/) - Wait for job completion +- [job-status](../job-status/) - Check job status +- [job-results](../job-results/) - Retrieve output results +- [job-dismiss](../job-dismiss/) - Cancel running job + +## Documentation + +- [Process Execution](https://pavics-weaver.readthedocs.io/en/latest/processes.html) +- [Input/Output Formats](https://pavics-weaver.readthedocs.io/en/latest/processes.html) +- [CLI Reference](https://pavics-weaver.readthedocs.io/en/latest/cli.html) diff --git a/.agents/skills/job-inputs/SKILL.md b/.agents/skills/job-inputs/SKILL.md new file mode 100644 index 000000000..5a167adae --- /dev/null +++ b/.agents/skills/job-inputs/SKILL.md @@ -0,0 +1,176 @@ +--- +name: job-inputs +description: | + Retrieve the input specifications and values that were provided when a job was executed. Shows + what parameters were used to run the process. Use for debugging, reproducing results, or auditing + job submissions. +license: Apache-2.0 +compatibility: Requires Weaver API access. 
+metadata: + author: fmigneault +--- + +# Get Job Inputs + +Retrieve the input values that were provided when a job was executed. + +## When to Use + +- Reviewing parameters used for a job +- Reproducing job execution with same inputs +- Debugging parameter-related issues +- Auditing job submissions +- Documenting workflow configurations +- Comparing inputs across multiple job runs + +## Parameters + +### Required + +- **job_id** (string): Job identifier + +## CLI Usage + +```bash +# Get job inputs +weaver inputs -u $WEAVER_URL -j a1b2c3d4-e5f6-7890-abcd-ef1234567890 + +# Save inputs for reuse +weaver inputs -u $WEAVER_URL -j a1b2c3d4-e5f6-7890-abcd-ef1234567890 > inputs-to-reuse.json + +# Resubmit with same inputs +weaver execute -u $WEAVER_URL -p my-process -I inputs-to-reuse.json +``` + +## Python Usage + +```python +from weaver.cli import WeaverClient + +client = WeaverClient(url="https://weaver.example.com") + +# Get inputs +inputs = client.inputs(job_id="a1b2c3d4-e5f6-7890-abcd-ef1234567890") + +for input_name, input_value in inputs.body.items(): + print(f"{input_name}: {input_value}") + +# Reuse inputs for another job +new_job = client.execute( + process_id="my-process", + inputs=inputs.body +) +``` + +## API Request + +```bash +curl -X GET \ + "${WEAVER_URL}/jobs/a1b2c3d4-e5f6-7890-abcd-ef1234567890/inputs" +``` + +## Returns + +```json +{ + "input1": "value1", + "input2": { + "href": "https://example.com/input-file.nc", + "type": "application/netcdf" + }, + "threshold": 0.5, + "enabled": true +} +``` + +**Note**: Response may include additional fields. See +[API documentation](https://pavics-weaver.readthedocs.io/en/latest/api.html) for complete response schemas. 
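For reuse or auditing, it can help to separate literal values from file references. The following sketch assumes the response shape shown above, where a reference is an object (or a list of objects) with an `href` key. `split_inputs` is a hypothetical helper, not a Weaver API.

```python
def split_inputs(body):
    """Split a job inputs body into literal values and href references.

    Assumes a reference is a dict with an 'href' key, or a list of such
    dicts; everything else is treated as a literal value.
    """
    literals, references = {}, {}
    for name, value in body.items():
        items = value if isinstance(value, list) else [value]
        hrefs = [item["href"] for item in items if isinstance(item, dict) and "href" in item]
        if hrefs:
            # Keep only the URLs; media types and other metadata are dropped.
            references[name] = hrefs
        else:
            literals[name] = value
    return literals, references


# Sample body copied from the "Returns" example above.
sample_inputs = {
    "input1": "value1",
    "input2": {
        "href": "https://example.com/input-file.nc",
        "type": "application/netcdf",
    },
    "threshold": 0.5,
    "enabled": True,
}

literals, references = split_inputs(sample_inputs)
print("literals:", literals)
print("references:", references)
```

Returning references as lists keeps single files and arrays of files uniform for the caller.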
+ +## Input Types + +### Literal Values + +```json +{ + "parameter": "string value", + "count": 42, + "enabled": true +} +``` + +### File References + +```json +{ + "input_file": { + "href": "https://example.com/data.tif", + "type": "image/tiff" + } +} +``` + +### Arrays + +```json +{ + "files": [ + {"href": "https://example.com/file1.nc"}, + {"href": "https://example.com/file2.nc"} + ] +} +``` + +### Vault References + +```json +{ + "credentials": { + "href": "vault://secret-token" + } +} +``` + +## Use Cases + +### Reproduce Results + +```bash +# Get inputs from successful job +weaver inputs -u $WEAVER_URL -j c3d4e5f6-a7b8-9012-cdef-123456789012 > good-inputs.json + +# Run again with same parameters +weaver execute -u $WEAVER_URL -p my-process -I good-inputs.json +``` + +### Debug Failed Jobs + +```python +# Compare inputs between successful and failed jobs +success_inputs = client.inputs(job_id="success-job-id") +failed_inputs = client.inputs(job_id="d4e5f6a7-b8c9-0123-def1-234567890123") + +# Find differences +for key in success_inputs.body: + if success_inputs.body[key] != failed_inputs.body.get(key): + print(f"Different value for {key}") +``` + +### Audit Trail + +```bash +# Document what inputs were used +weaver inputs -u $WEAVER_URL -j $JOB_ID | tee audit/job-$JOB_ID-inputs.json +``` + +## Related Skills + +- [job-execute](../job-execute/) - Submit jobs with inputs +- [job-results](../job-results/) - Get corresponding outputs +- [job-status](../job-status/) - Check job status +- [process-describe](../process-describe/) - See required/optional inputs + +## Documentation + +- [Job Inputs](https://pavics-weaver.readthedocs.io/en/latest/processes.html) +- [Input Formats](https://pavics-weaver.readthedocs.io/en/latest/processes.html#inputs-outputs) +- [CLI Reference](https://pavics-weaver.readthedocs.io/en/latest/cli.html) diff --git a/.agents/skills/job-list/SKILL.md b/.agents/skills/job-list/SKILL.md new file mode 100644 index 000000000..480fbc618 --- 
/dev/null +++ b/.agents/skills/job-list/SKILL.md @@ -0,0 +1,158 @@ +--- +name: job-list +description: | + List jobs with optional filtering by process, status, date range, tags, and more. Supports + pagination and sorting. Use when you need to find specific jobs, monitor multiple executions, or + generate job reports. +license: Apache-2.0 +compatibility: Requires Weaver API access. +metadata: + author: fmigneault +--- + +# List Jobs + +List jobs with filtering, pagination, and sorting capabilities. + +## When to Use + +- Finding jobs by process or status +- Monitoring multiple job executions +- Generating job reports and statistics +- Debugging failed jobs +- Cleaning up old jobs +- Auditing job history + +## Parameters + +### Optional + +- **process** (string): Filter by process ID +- **provider** (string): Filter by provider ID +- **status** (string): Filter by job status (running, succeeded, failed, etc.) +- **limit** (integer): Maximum number of results (default: 10) +- **page** (integer): Page number for pagination (default: 0) +- **sort** (string): Sort order (e.g., "created:desc") +- **tags** (list): Filter by job tags +- **date** (string): Filter by date range +- **detail** (boolean): Include detailed information + +## CLI Usage + +```bash +# List all jobs +weaver jobs -u $WEAVER_URL + +# Filter by process +weaver jobs -u $WEAVER_URL -p my-process + +# Filter by status +weaver jobs -u $WEAVER_URL -s succeeded + +# Combine filters +weaver jobs -u $WEAVER_URL -p my-process -s failed + +# With pagination +weaver jobs -u $WEAVER_URL --limit 50 --page 2 + +# Sort by creation date +weaver jobs -u $WEAVER_URL --sort created:desc +``` + +## Python Usage + +```python +from weaver.cli import WeaverClient + +client = WeaverClient(url="https://weaver.example.com") + +# List all jobs +jobs = client.jobs() + +for job in jobs.body.get("jobs", []): + print(f"{job['jobID']}: {job['status']}") + +# Filter by process and status +failed_jobs = client.jobs( + 
process="my-process", + status="failed" +) + +# Get detailed information +detailed = client.jobs(detail=True, limit=100) +``` + +## API Request + +```bash +curl -X GET \ + "${WEAVER_URL}/jobs?process=my-process&status=succeeded&limit=20&page=0" +``` + +## Returns + +```json +{ + "jobs": [ + { + "jobID": "12345678-1234-5678-1234-567890abcdef", + "processID": "my-process", + "status": "succeeded", + "created": "2026-02-19T10:00:00Z", + "finished": "2026-02-19T10:05:00Z", + "duration": "PT5M" + } + ], + "total": 150, + "limit": 20, + "page": 0, + "links": { + "next": "/jobs?page=1&limit=20" + } +} +``` + +**Note**: Response may include additional fields. See +[API documentation](https://pavics-weaver.readthedocs.io/en/latest/api.html) for complete response schemas. + +## Job Status Values + +- **accepted**: Job received and queued +- **running**: Job is currently executing +- **succeeded**: Job completed successfully +- **failed**: Job failed with errors +- **dismissed**: Job was cancelled by user + +## Filtering Examples + +### By Date Range + +```bash +weaver jobs -u $WEAVER_URL --date "2026-02-01/2026-02-19" +``` + +### By Multiple Statuses + +```bash +# Get all active jobs (accepted or running) +weaver jobs -u $WEAVER_URL -s accepted,running +``` + +### By Tags + +```bash +weaver jobs -u $WEAVER_URL --tags production,validated +``` + +## Related Skills + +- [job-status](../job-status/) - Check individual job status +- [job-execute](../job-execute/) - Create new jobs +- [job-dismiss](../job-dismiss/) - Cancel jobs +- [job-results](../job-results/) - Retrieve job outputs + +## Documentation + +- [Job Management](https://pavics-weaver.readthedocs.io/en/latest/processes.html) +- [Job Filtering](https://pavics-weaver.readthedocs.io/en/latest/processes.html) +- [CLI Reference](https://pavics-weaver.readthedocs.io/en/latest/cli.html) diff --git a/.agents/skills/job-logs/SKILL.md b/.agents/skills/job-logs/SKILL.md new file mode 100644 index 000000000..4222e70dd --- 
/dev/null +++ b/.agents/skills/job-logs/SKILL.md @@ -0,0 +1,66 @@ +--- +name: job-logs +description: | + Retrieve execution logs for debugging and monitoring job execution. Includes process execution + steps, standard output/error, and timestamps. Use when debugging failed jobs or tracking execution + details. +license: Apache-2.0 +compatibility: Requires Weaver API access. +metadata: + author: fmigneault +--- + +# Get Job Logs + +Retrieve execution logs for debugging and monitoring. + +## When to Use + +- Debugging failed jobs +- Understanding execution flow +- Tracking progress in detail +- Identifying errors and warnings + +## Parameters + +### Required + +- **job_id** (string): Job identifier + +## CLI Usage + +```bash +weaver logs -u $WEAVER_URL -j a1b2c3d4-e5f6-7890-abcd-ef1234567890 +``` + +## Python Usage + +```python +from weaver.cli import WeaverClient + +client = WeaverClient(url="https://weaver.example.com") +logs = client.logs(job_id="a1b2c3d4-e5f6-7890-abcd-ef1234567890") + +for log_entry in logs.body.get("logs", []): + print(log_entry) +``` + +## Returns + +Execution logs including: + +- Process execution steps +- Standard output/error streams +- Timestamps for each step +- Error messages and stack traces +- Resource usage information + +## Related Skills + +- [job-status](../job-status/) - Check status +- [job-exceptions](../job-exceptions/) - Get error details + +## Documentation + +- [Job Logs](https://pavics-weaver.readthedocs.io/en/latest/processes.html) +- [CLI Reference](https://pavics-weaver.readthedocs.io/en/latest/cli.html) diff --git a/.agents/skills/job-monitor/SKILL.md b/.agents/skills/job-monitor/SKILL.md new file mode 100644 index 000000000..f9ce912a1 --- /dev/null +++ b/.agents/skills/job-monitor/SKILL.md @@ -0,0 +1,74 @@ +--- +name: job-monitor +description: | + Continuously monitor a job until completion or timeout. Polls job status at regular intervals and + provides progress updates. 
Use when you need to wait for a job to complete and get final results. +license: Apache-2.0 +compatibility: Requires Weaver API access. +metadata: + author: fmigneault +--- + +# Monitor Job + +Continuously monitor a job until completion or timeout with regular status polling. + +## When to Use + +- Waiting for an asynchronous job to complete +- Tracking long-running workflow execution +- Getting real-time progress updates +- Automatically retrieving results when done + +## Parameters + +### Required + +- **job_id** (string): Job identifier to monitor + +### Optional + +- **timeout** (integer): Maximum time to wait in seconds (default: 60) +- **interval** (integer): Polling interval in seconds (default: 5) + +## CLI Usage + +```bash +# Monitor with defaults +weaver monitor -u $WEAVER_URL -j a1b2c3d4-e5f6-7890-abcd-ef1234567890 + +# Custom timeout and interval +weaver monitor -u $WEAVER_URL -j a1b2c3d4-e5f6-7890-abcd-ef1234567890 -tS 300 -tI 10 + +# Execute and monitor in one command +weaver execute -u $WEAVER_URL -p my-process -I inputs.json -M +``` + +## Python Usage + +```python +from weaver.cli import WeaverClient + +client = WeaverClient(url="https://weaver.example.com") + +# Start monitoring +status = client.monitor( + job_id="a1b2c3d4-e5f6-7890-abcd-ef1234567890", + timeout=300, + interval=10 +) + +print(f"Final status: {status.body['status']}") +``` + +## Returns + +- **status**: Final job status (succeeded, failed, dismissed) +- **progress**: Progress percentage (0-100) +- **duration**: Total execution time +- **message**: Status message or error details + +## Documentation + +- [Job Monitoring](https://pavics-weaver.readthedocs.io/en/latest/processes.html) +- [CLI Reference](https://pavics-weaver.readthedocs.io/en/latest/cli.html) diff --git a/.agents/skills/job-provenance/SKILL.md b/.agents/skills/job-provenance/SKILL.md new file mode 100644 index 000000000..2c7159147 --- /dev/null +++ b/.agents/skills/job-provenance/SKILL.md 
@@ -0,0 +1,188 @@ +--- +name: job-provenance +description: | + Retrieve W3C PROV provenance metadata tracking the complete execution lineage and data derivation. + Includes information about inputs, outputs, processes, agents, and temporal relationships for + reproducibility and data lineage tracking. +license: Apache-2.0 +compatibility: Requires Weaver API access with provenance feature enabled (weaver.cwl_prov=true). +metadata: + author: fmigneault +--- + +# Get Job Provenance + +Retrieve W3C PROV provenance metadata for tracking execution lineage and data derivation. + +## When to Use + +- Tracking data lineage and derivation +- Ensuring reproducibility of results +- Auditing and compliance requirements +- Understanding workflow execution paths +- Documenting research workflows +- Publishing scientific results with provenance + +## Parameters + +### Required + +- **job_id** (string): Job identifier + +### Optional + +- **format** (string): Provenance format + - `json`: JSON-LD format (default) + - `xml`: PROV-XML format + - `turtle`: RDF Turtle format + - `rdf`: RDF/XML format + +## CLI Usage + +```bash +# Get provenance in JSON-LD +weaver provenance -u $WEAVER_URL -j a1b2c3d4-e5f6-7890-abcd-ef1234567890 + +# Get provenance in RDF Turtle +weaver provenance -u $WEAVER_URL -j a1b2c3d4-e5f6-7890-abcd-ef1234567890 -f turtle + +# Save to file +weaver provenance -u $WEAVER_URL -j a1b2c3d4-e5f6-7890-abcd-ef1234567890 -o provenance.jsonld +``` + +## Python Usage + +```python +from weaver.cli import WeaverClient + +client = WeaverClient(url="https://weaver.example.com") + +# Get provenance +prov = client.provenance(job_id="a1b2c3d4-e5f6-7890-abcd-ef1234567890") + +# Access provenance entities (key names follow the sample response in "Returns") +for entity in prov.body.get("entity", []): + print(f"Entity: {entity['@id']}") + +# Get provenance in specific format +prov_turtle = client.provenance( + job_id="a1b2c3d4-e5f6-7890-abcd-ef1234567890", + format="turtle" +) +``` + +## API Request + +```bash +curl -X GET \ 
+ -H "Accept: application/ld+json" \ + "${WEAVER_URL}/jobs/a1b2c3d4-e5f6-7890-abcd-ef1234567890/prov" +``` + +## Returns + +W3C PROV document, for example: + +```json +{ + "@context": "https://www.w3.org/ns/prov", + "entity": [ + { + "@id": "input-file-1", + "@type": "prov:Entity", + "prov:atLocation": "https://example.com/input.nc", + "prov:generatedAtTime": "2026-02-19T10:00:00Z" + } + ], + "activity": [ + { + "@id": "execution-1", + "@type": "prov:Activity", + "prov:startedAtTime": "2026-02-19T10:00:05Z", + "prov:endedAtTime": "2026-02-19T10:05:20Z", + "prov:used": {"@id": "input-file-1"} + } + ], + "wasGeneratedBy": [ + { + "prov:entity": {"@id": "output-file-1"}, + "prov:activity": {"@id": "execution-1"} + } + ] +} +``` + +## Provenance Elements + +### Entities + +- Input files and data +- Output files and results +- Intermediate data products +- Configuration files + +### Activities + +- Process executions +- Workflow steps +- Data transformations +- Service invocations + +### Agents + +- Software tools and versions +- Computing infrastructure +- Users and organizations + +### Relationships + +- **wasGeneratedBy**: Output generated by activity +- **used**: Activity used entity as input +- **wasAssociatedWith**: Activity associated with agent +- **wasDerivedFrom**: Entity derived from another entity +- **wasInformedBy**: Activity informed by another activity + +## Use Cases + +### Scientific Reproducibility + +```python +# Extract complete workflow lineage +prov = client.provenance(job_id="a1b2c3d4-e5f6-7890-abcd-ef1234567890") + +# Document software versions used +# (the 'label' and 'version' field names are illustrative; adjust to the actual document) +for agent in prov.body.get("agent", []): + print(f"Software: {agent['label']} v{agent['version']}") +``` + +### Data Lineage Tracking + +```python +# Trace data derivation chain +for derivation in prov.body.get("wasDerivedFrom", []): + print(f"{derivation['entity']} derived from {derivation['source']}") +``` + +### Compliance and Auditing + +```bash +# Export provenance for archival +weaver provenance -u $WEAVER_URL -j $JOB_ID -f 
xml -o compliance/job-$JOB_ID-prov.xml +``` + +## Related Skills + +- [job-status](../job-status/) - Check job status +- [job-results](../job-results/) - Retrieve outputs +- [job-execute](../job-execute/) - Run processes with provenance tracking + +## Documentation + +- [Provenance Tracking](https://pavics-weaver.readthedocs.io/en/latest/processes.html) +- [W3C PROV Overview](https://www.w3.org/TR/prov-overview/) +- [Configuration](https://pavics-weaver.readthedocs.io/en/latest/configuration.html#weaver-cwl-prov) +- [CLI Reference](https://pavics-weaver.readthedocs.io/en/latest/cli.html) + +## Note + +Provenance tracking must be enabled in Weaver configuration (`weaver.cwl_prov=true`). Jobs executed without this setting +will not have provenance data available. diff --git a/.agents/skills/job-results/SKILL.md b/.agents/skills/job-results/SKILL.md new file mode 100644 index 000000000..9b6fe8cdf --- /dev/null +++ b/.agents/skills/job-results/SKILL.md @@ -0,0 +1,105 @@ +--- +name: job-results +description: | + Retrieve output results from a successfully completed job. Downloads output files or retrieves + inline values. Use when a job has status 'succeeded' and you need to access the outputs. +license: Apache-2.0 +compatibility: Requires Weaver API access. Job must have succeeded status. +metadata: + author: fmigneault +--- + +# Get Job Results + +Retrieve output results from a successfully completed job. 
+ +## When to Use + +- Getting outputs after job completion +- Downloading result files locally +- Retrieving literal output values +- Accessing workflow step outputs + +## Parameters + +### Required + +- **job_id** (string): Job identifier + +### Optional + +- **output_dir** (path): Directory to download output files +- **download** (boolean): Whether to download files locally + +## CLI Usage + +```bash +# View results (URLs) +weaver results -u $WEAVER_URL -j a1b2c3d4-e5f6-7890-abcd-ef1234567890 + +# Download to directory +weaver results -u $WEAVER_URL -j a1b2c3d4-e5f6-7890-abcd-ef1234567890 -oD ./outputs + +# Complete workflow: execute, monitor, get results +JOB_ID=$(weaver execute -u $WEAVER_URL -p my-process -I inputs.json -f json | jq -r .jobID) +weaver monitor -u $WEAVER_URL -j $JOB_ID +weaver results -u $WEAVER_URL -j $JOB_ID -oD ./outputs +``` + +## Python Usage + +```python +from weaver.cli import WeaverClient + +client = WeaverClient(url="https://weaver.example.com") + +# Get results +results = client.results(job_id="a1b2c3d4-e5f6-7890-abcd-ef1234567890") + +for output_name, output_value in results.body.items(): + if isinstance(output_value, dict) and "href" in output_value: + print(f"{output_name}: {output_value['href']}") + else: + print(f"{output_name}: {output_value}") +``` + +## Returns + +Result format depends on output type: + +### File Outputs (Reference Mode) + +```json +{ + "output1": { + "href": "https://weaver.example.com/outputs/job-id/output1.nc", + "type": "application/netcdf" + } +} +``` + +### Literal Outputs (Value Mode) + +```json +{ + "count": 42, + "message": "Processing complete", + "success": true +} +``` + +## Error Handling + +- **404 Not Found**: Job does not exist +- **400 Bad Request**: Job has not completed yet, or it failed + +## Related Skills + +- [job-execute](../job-execute/) - Start the job +- [job-monitor](../job-monitor/) - Wait for completion +- [job-status](../job-status/) - Check status + +## Documentation + +- [Job 
Results](https://pavics-weaver.readthedocs.io/en/latest/processes.html) +- [CLI Reference](https://pavics-weaver.readthedocs.io/en/latest/cli.html) diff --git a/.agents/skills/job-statistics/SKILL.md b/.agents/skills/job-statistics/SKILL.md new file mode 100644 index 000000000..d7a8a9689 --- /dev/null +++ b/.agents/skills/job-statistics/SKILL.md @@ -0,0 +1,152 @@ +--- +name: job-statistics +description: | + Retrieve execution statistics for a job including resource usage (CPU, memory), execution + duration, data transfer metrics, and performance indicators. Use for monitoring resource + consumption and optimizing process configurations. +license: Apache-2.0 +compatibility: Requires Weaver API access with statistics feature enabled. +metadata: + author: fmigneault +--- + +# Get Job Statistics + +Retrieve execution statistics and resource usage for a job. + +## When to Use + +- Monitoring resource consumption +- Optimizing process configurations +- Capacity planning and resource allocation +- Performance analysis and benchmarking +- Identifying resource bottlenecks +- Cost estimation for cloud resources + +## Parameters + +### Required + +- **job_id** (string): Job identifier + +## CLI Usage + +```bash +# Get job statistics +weaver statistics -u $WEAVER_URL -j a1b2c3d4-e5f6-7890-abcd-ef1234567890 + +# Compare statistics for multiple jobs +for job in $(weaver jobs -u $WEAVER_URL -p my-process -f json | jq -r '.jobs[].jobID'); do + echo "Job $job:" + weaver statistics -u $WEAVER_URL -j $job +done +``` + +## Python Usage + +```python +from weaver.cli import WeaverClient + +client = WeaverClient(url="https://weaver.example.com") + +# Get statistics +stats = client.statistics(job_id="a1b2c3d4-e5f6-7890-abcd-ef1234567890") + +# CPU and memory figures are nested under "resource" in the sample response +resource = stats.body.get("resource", {}) +print(f"Duration: {stats.body.get('duration')}") +print(f"CPU Usage: {resource.get('cpuUsage')}") +print(f"Memory Usage: {resource.get('memoryUsage')}") +``` + +## API Request + +```bash +curl -X GET \ + 
"${WEAVER_URL}/jobs/a1b2c3d4-e5f6-7890-abcd-ef1234567890/statistics" +``` + +## Returns + +```json +{ + "jobID": "a1b2c3d4-e5f6-7890-abcd-ef1234567890", + "duration": "PT5M32S", + "executionDuration": "PT5M15S", + "queueDuration": "PT17S", + "resource": { + "cpuUsage": { + "average": "45%", + "peak": "87%" + }, + "memoryUsage": { + "average": "2.3 GB", + "peak": "4.1 GB" + }, + "diskIO": { + "read": "150 MB", + "write": "75 MB" + } + }, + "dataTransfer": { + "inputSize": "500 MB", + "outputSize": "200 MB" + } +} +``` + +**Note**: Response may include additional fields. See +[API documentation](https://pavics-weaver.readthedocs.io/en/latest/api.html) for complete response schemas. + +## Statistics Fields + +### Timing + +- **duration**: Total time from submission to completion +- **executionDuration**: Actual processing time +- **queueDuration**: Time spent waiting in queue + +### Resource Usage + +- **cpuUsage**: CPU utilization (average and peak) +- **memoryUsage**: RAM consumption (average and peak) +- **diskIO**: Disk read/write operations +- **networkIO**: Network transfer (if applicable) + +### Data Metrics + +- **inputSize**: Total size of input data +- **outputSize**: Total size of output data +- **transferredData**: Data transferred between services + +## Use Cases + +### Resource Optimization + +```python +# Analyze resource usage patterns +stats = client.statistics(job_id="a1b2c3d4-e5f6-7890-abcd-ef1234567890") + +# Peak is a string such as "4.1 GB"; compare numerically (assumes the unit is GB) +peak_gb = float(stats.body["resource"]["memoryUsage"]["peak"].split()[0]) +if peak_gb > 8: + print("Consider increasing memory allocation") +``` + +### Cost Estimation + +```python +import re + +# Calculate approximate cloud compute costs +# Assumes an ISO 8601 duration with only H/M/S parts (e.g. "PT5M32S") +hours, minutes, seconds = (int(part or 0) for part in re.fullmatch(r"PT(?:(\d+)H)?(?:(\d+)M)?(?:(\d+)S)?", stats.body["duration"]).groups()) +cpu_hours = hours + minutes / 60 + seconds / 3600 +cost_per_cpu_hour = 0.05 # hypothetical rate; replace with your provider's pricing +estimated_cost = cpu_hours * cost_per_cpu_hour +``` + +## Related Skills + +- [job-status](../job-status/) - Check job status +- [job-logs](../job-logs/) - View execution logs +- [job-monitor](../job-monitor/) - Monitor execution +- [job-list](../job-list/) - 
Compare multiple jobs + +## Documentation + +- [Job Statistics](https://pavics-weaver.readthedocs.io/en/latest/processes.html) +- [Resource Management](https://pavics-weaver.readthedocs.io/en/latest/configuration.html#resource-management) +- [CLI Reference](https://pavics-weaver.readthedocs.io/en/latest/cli.html) diff --git a/.agents/skills/job-status/SKILL.md b/.agents/skills/job-status/SKILL.md new file mode 100644 index 000000000..f94948a47 --- /dev/null +++ b/.agents/skills/job-status/SKILL.md @@ -0,0 +1,81 @@ +--- +name: job-status +description: | + Check the current execution status of a job including progress, timestamps, and state information. + Use when you need to check if a job is still running, has completed, or has failed. +license: Apache-2.0 +compatibility: Requires Weaver API access. +metadata: + author: fmigneault +--- + +# Get Job Status + +Check current execution status of a job with progress and timestamps. + +## When to Use + +- Checking if a job is complete +- Getting progress percentage +- Determining job state (running, succeeded, failed) +- Debugging failed jobs + +## Parameters + +### Required + +- **job_id** (string): Job identifier + +## CLI Usage + +```bash +weaver status -u $WEAVER_URL -j a1b2c3d4-e5f6-7890-abcd-ef1234567890 +``` + +## Python Usage + +```python +from weaver.cli import WeaverClient + +client = WeaverClient(url="https://weaver.example.com") +status = client.status(job_id="a1b2c3d4-e5f6-7890-abcd-ef1234567890") + +print(f"Status: {status.body['status']}") +print(f"Progress: {status.body.get('progress', 0)}%") +``` + +## Returns + +```json +{ + "jobID": "a1b2c3d4-e5f6-7890-abcd-ef1234567890", + "status": "running", + "progress": 45, + "message": "Processing step 2 of 4", + "created": "2026-02-19T10:00:00Z", + "started": "2026-02-19T10:00:05Z", + "processID": "my-process" +} +``` + +**Note**: Response may include additional fields. 
See +[API documentation](https://pavics-weaver.readthedocs.io/en/latest/api.html) for complete response schemas. + +## Job Status Values + +- **accepted**: Job received and queued +- **running**: Job is executing +- **succeeded**: Job completed successfully +- **failed**: Job failed with errors +- **dismissed**: Job was cancelled + +## Related Skills + +- [job-monitor](../job-monitor/) - Wait for completion +- [job-logs](../job-logs/) - View execution logs +- [job-results](../job-results/) - Retrieve outputs + +## Documentation + +- [Job Status](https://pavics-weaver.readthedocs.io/en/latest/processes.html) +- [CLI Reference](https://pavics-weaver.readthedocs.io/en/latest/cli.html) diff --git a/.agents/skills/process-deploy/SKILL.md b/.agents/skills/process-deploy/SKILL.md new file mode 100644 index 000000000..7c516daad --- /dev/null +++ b/.agents/skills/process-deploy/SKILL.md @@ -0,0 +1,117 @@ +--- +name: process-deploy +description: | + Deploy a new process or application package to Weaver using CWL (Common Workflow Language) + definitions. Supports Docker containers, remote WPS references, and workflow definitions. Use when + you need to add a new processing capability to Weaver. +license: Apache-2.0 +compatibility: Requires Weaver API access. Supports CWL v1.0, v1.1, v1.2. +metadata: + author: fmigneault +--- + +# Deploy Process + +Deploy a new process or application package to Weaver using CWL (Common Workflow Language) definitions. 
+ +## When to Use + +- Adding a new processing capability to Weaver +- Deploying Docker-based applications +- Registering remote WPS process references +- Creating workflow processes that chain multiple steps + +## Parameters + +### Required + +- **process_id** (string): Unique identifier for the process (lowercase, hyphens allowed) +- **package** (CWL object or file path): Application package definition + - Can be CWL YAML/JSON file + - Can be reference URL to remote process + - Can be inline CWL document + +### Optional + +- **visibility** (string): Process visibility ("public" or "private"), default: "public" +- **auth** (auth handler): Authentication for protected endpoints + +## CLI Usage + +```bash +# Deploy from local CWL file +weaver deploy -u https://weaver.example.com -p my-process -b process.cwl + +# Deploy with specific visibility +weaver deploy -u $WEAVER_URL -p my-process -b process.cwl --visibility private +``` + +## Python Usage + +```python +from weaver.cli import WeaverClient + +client = WeaverClient(url="https://weaver.example.com") +result = client.deploy( + body="process.cwl", + process_id="my-process", + visibility="public" +) + +print(f"Deployed: {result.body['id']}") +``` + +## API Request + +```bash +curl -X POST \ + -H "Content-Type: application/json" \ + -d '{ + "processDescription": { + "process": { + "id": "my-process" + } + }, + "executionUnit": [{ + "href": "https://example.com/process.cwl" + }] +}' \ + "${WEAVER_URL}/processes" +``` + +## Returns + +```json +{ + "processSummary": { + "id": "my-process", + "version": "1.0.0", + "title": "My Process", + "jobControlOptions": ["async-execute", "sync-execute"], + "outputTransmission": ["value", "reference"], + "processDescriptionURL": "https://weaver.example.com/processes/my-process" + }, + "deploymentDone": true +} +``` + +**Note**: Response may include additional fields such as `links`, `keywords`, and extended `process` details. 
See +[API documentation](https://pavics-weaver.readthedocs.io/en/latest/api.html) for complete response schemas. + +## Error Handling + +- **409 Conflict**: Process with this ID already exists +- **400 Bad Request**: Invalid CWL definition or parameters +- **401 Unauthorized**: Authentication required + +## Related Skills + +- [process-describe](../process-describe/) - Get process details +- [job-execute](../job-execute/) - Run the deployed process +- [process-undeploy](../process-undeploy/) - Remove the process + +## Documentation + +- [Process Operations](https://pavics-weaver.readthedocs.io/en/latest/processes.html) +- [CWL Application Packages](https://pavics-weaver.readthedocs.io/en/latest/package.html) +- [CLI Reference](https://pavics-weaver.readthedocs.io/en/latest/cli.html) diff --git a/.agents/skills/process-describe/SKILL.md b/.agents/skills/process-describe/SKILL.md new file mode 100644 index 000000000..ed555434e --- /dev/null +++ b/.agents/skills/process-describe/SKILL.md @@ -0,0 +1,127 @@ +--- +name: process-describe +description: | + Retrieve detailed information about a deployed process including inputs, outputs, metadata, and + execution requirements. Use when you need to understand process capabilities or validate before + execution. +license: Apache-2.0 +compatibility: Requires Weaver API access. +metadata: + author: fmigneault +--- + +# Describe Process + +Retrieve complete process description including inputs, outputs, and metadata. 
+ +## When to Use + +- Understanding process capabilities before execution +- Discovering required and optional inputs +- Checking expected output formats +- Validating process availability +- Getting CWL package information + +## Parameters + +### Required + +- **process_id** (string): Process identifier to describe + +### Optional + +- **provider** (string): Provider identifier for remote processes +- **schema** (string): Schema format ("OGC", "OLD", "WPS") + +## CLI Usage + +```bash +# Describe local process +weaver describe -u $WEAVER_URL -p my-process + +# Describe remote provider process +weaver describe -u $WEAVER_URL -P my-provider -p remote-process + +# Get specific schema format +weaver describe -u $WEAVER_URL -p my-process --schema OGC +``` + +## Python Usage + +```python +from weaver.cli import WeaverClient + +client = WeaverClient(url="https://weaver.example.com") + +# Get description +description = client.describe(process_id="my-process") + +# Print inputs (media type is read from the first entry of `formats`, as in the example response) +for input_id, input_spec in description.body["inputs"].items(): + print(f"{input_id}: {input_spec.get('title', input_id)}") + print(f" Type: {(input_spec.get('formats') or [{}])[0].get('mediaType', 'literal')}") + print(f" Required: {input_spec.get('minOccurs', 0) > 0}") +``` + +## Returns + +Process description includes: + +- **id**: Process identifier +- **title**: Human-readable name +- **abstract**: Description of what it does +- **version**: Process version +- **inputs**: Input specifications with types and constraints +- **outputs**: Output specifications with formats +- **keywords**: Associated keywords/tags +- **metadata**: Additional metadata +- **jobControlOptions**: Supported execution modes (async, sync) +- **outputTransmission**: Supported output modes (reference, value) + +## Example Response + +```json +{ + "id": "ndvi-calculator", + "title": "NDVI Calculator", + "description": "Calculate Normalized Difference Vegetation Index from satellite imagery", + "version": "1.0.0", + "inputs": { +
"red_band": { + "title": "Red Band Image", + "minOccurs": 1, + "maxOccurs": 1, + "formats": [ + {"mediaType": "image/tiff"} + ] + }, + "nir_band": { + "title": "Near-Infrared Band Image", + "minOccurs": 1, + "maxOccurs": 1, + "formats": [ + {"mediaType": "image/tiff"} + ] + } + }, + "outputs": { + "ndvi": { + "title": "NDVI Output", + "formats": [ + {"mediaType": "image/tiff"} + ] + } + } +} +``` + +## Related Skills + +- [process-list](../process-list/) - Discover available processes +- [process-deploy](../process-deploy/) - Deploy new process +- [job-execute](../job-execute/) - Run the process + +## Documentation + +- [Process Description](https://pavics-weaver.readthedocs.io/en/latest/processes.html) +- [CLI Reference](https://pavics-weaver.readthedocs.io/en/latest/cli.html) diff --git a/.agents/skills/process-list/SKILL.md b/.agents/skills/process-list/SKILL.md new file mode 100644 index 000000000..d508b52b5 --- /dev/null +++ b/.agents/skills/process-list/SKILL.md @@ -0,0 +1,92 @@ +--- +name: process-list +description: | + List all available processes with optional filtering by visibility or provider. Retrieve process + summaries for discovery. Use when you need to find available processing capabilities or explore + what Weaver can do. +license: Apache-2.0 +compatibility: Requires Weaver API access. +metadata: + author: fmigneault +--- + +# List Processes + +List all available processes with optional filtering and pagination. 
+ +## When to Use + +- Discovering available processes +- Finding processes by keyword or category +- Checking deployed processes +- Exploring remote provider capabilities + +## Parameters + +### Optional + +- **provider** (string): Filter by provider ID +- **visibility** (string): Filter by visibility ("public", "private") +- **limit** (integer): Maximum number of results +- **page** (integer): Pagination offset +- **sort** (string): Sort order for results +- **detail** (boolean): Include detailed descriptions + +## CLI Usage + +```bash +# List all processes +weaver capabilities -u $WEAVER_URL + +# List from specific provider +weaver capabilities -u $WEAVER_URL -P my-provider + +# List with details +weaver capabilities -u $WEAVER_URL --detail +``` + +## Python Usage + +```python +from weaver.cli import WeaverClient + +client = WeaverClient(url="https://weaver.example.com") + +# List all +result = client.capabilities() + +for process in result.body.get("processes", []): + print(f"{process['id']}: {process.get('title', process['id'])}") +``` + +## Returns + +```json +{ + "processes": [ + { + "id": "process-1", + "title": "Process 1", + "description": "Brief description", + "version": "1.0.0" + } + ], + "total": 10, + "limit": 10, + "offset": 0 +} +``` + +**Note**: Response may include additional fields. See +[API documentation](https://pavics-weaver.readthedocs.io/en/latest/api.html) for complete response schemas. 
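
As a minimal sketch of how the paging fields might be consumed, the helper below summarizes a listing body shaped like the example response above. The `summarize_listing` name is hypothetical, and the `total` and `limit` fields are assumed to be present as shown in that example; a real response may name or omit them differently.

```python
import math


def summarize_listing(body):
    """Return process IDs and an estimated page count from a listing body."""
    processes = body.get("processes", [])
    total = body.get("total", len(processes))
    # Assumption: fall back to the number of returned items when no limit is reported.
    limit = body.get("limit") or len(processes) or 1
    pages = math.ceil(total / limit) if total else 0
    return {"ids": [p["id"] for p in processes], "total": total, "pages": pages}


# Body copied from the example response; not fetched from a live server.
example_body = {
    "processes": [{"id": "process-1", "title": "Process 1", "version": "1.0.0"}],
    "total": 10,
    "limit": 10,
    "offset": 0,
}
summary = summarize_listing(example_body)
print(summary["ids"], summary["pages"])
```

With a live client, `result.body` from `client.capabilities()` would be passed in place of `example_body`.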
+ +## Related Skills + +- [process-describe](../process-describe/) - Get process details +- [process-deploy](../process-deploy/) - Add new process +- [job-execute](../job-execute/) - Run a process as a job with inputs + +## Documentation + +- [Process Listing](https://pavics-weaver.readthedocs.io/en/latest/processes.html#listing) +- [CLI Reference](https://pavics-weaver.readthedocs.io/en/latest/cli.html#capabilities) diff --git a/.agents/skills/process-package/SKILL.md b/.agents/skills/process-package/SKILL.md new file mode 100644 index 000000000..69d8a8f3b --- /dev/null +++ b/.agents/skills/process-package/SKILL.md @@ -0,0 +1,112 @@ +--- +name: process-package +description: | + Retrieve the CWL application package definition for a deployed process. Returns the complete + Common Workflow Language document describing the process implementation. Use when you need to + inspect, version control, or replicate process definitions. +license: Apache-2.0 +compatibility: Requires Weaver API access. +metadata: + author: fmigneault +--- + +# Get Process Package + +Retrieve the CWL application package definition for a deployed process. 
+ +## When to Use + +- Inspecting process implementation details +- Version controlling process definitions +- Replicating processes to other Weaver instances +- Debugging process execution issues +- Understanding process requirements and dependencies + +## Parameters + +### Required + +- **process_id** (string): Process identifier + +### Optional + +- **provider** (string): Provider for remote processes +- **output** (file path): Save package to file + +## CLI Usage + +```bash +# View package in console +weaver package -u $WEAVER_URL -p my-process + +# Save to file +weaver package -u $WEAVER_URL -p my-process -o process-package.cwl + +# Get package from remote provider +weaver package -u $WEAVER_URL -P my-provider -p remote-process -o remote.cwl +``` + +## Python Usage + +```python +import yaml +from weaver.cli import WeaverClient + +client = WeaverClient(url="https://weaver.example.com") + +# Get package +package = client.package(process_id="my-process") + +print(package.body) # CWL document + +# Save to file +with open("process.cwl", "w") as f: + yaml.dump(package.body, f) +``` + +## API Request + +```http +GET /processes/my-process/package +Accept: application/cwl+yaml +``` + +## Returns + +CWL application package in YAML or JSON format: + +```yaml +cwlVersion: v1.2 +class: CommandLineTool +baseCommand: process-command +inputs: + input1: + type: File + inputBinding: + position: 1 +outputs: + output1: + type: File + outputBinding: + glob: "*.out" +requirements: + DockerRequirement: + dockerPull: myimage:latest +``` + +## Error Handling + +- **404 Not Found**: Process does not exist +- **403 Forbidden**: Insufficient permissions + +## Related Skills + +- [process-deploy](../process-deploy/) - Deploy CWL package +- [process-describe](../process-describe/) - Get process metadata +- [job-execute](../job-execute/) - Run the process + +## Documentation + +- [Process Package](https://pavics-weaver.readthedocs.io/en/latest/package.html) +- [CWL
Specification](https://www.commonwl.org/v1.0/) +- [CLI Reference](https://pavics-weaver.readthedocs.io/en/latest/cli.html) diff --git a/.agents/skills/process-undeploy/SKILL.md b/.agents/skills/process-undeploy/SKILL.md new file mode 100644 index 000000000..d1b21dd10 --- /dev/null +++ b/.agents/skills/process-undeploy/SKILL.md @@ -0,0 +1,85 @@ +--- +name: process-undeploy +description: | + Remove a deployed process from Weaver. This action is irreversible and will delete the process + definition. Use when you need to clean up unused processes or remove deprecated process versions. +license: Apache-2.0 +compatibility: Requires Weaver API access with process management permissions. +metadata: + author: fmigneault +--- + +# Undeploy Process + +Remove a deployed process from Weaver permanently. + +## When to Use + +- Removing unused or deprecated processes +- Cleaning up test processes +- Decommissioning old process versions +- Managing process lifecycle + +## Parameters + +### Required + +- **process_id** (string): Process identifier to remove + +### Optional + +- **provider** (string): Provider identifier for remote processes + +## CLI Usage + +```bash +# Undeploy local process +weaver undeploy -u $WEAVER_URL -p my-process + +# Confirm before undeploying +weaver describe -u $WEAVER_URL -p my-process # Check first +weaver undeploy -u $WEAVER_URL -p my-process +``` + +## Python Usage + +```python +from weaver.cli import WeaverClient + +client = WeaverClient(url="https://weaver.example.com") + +# Undeploy process +result = client.undeploy(process_id="my-process") + +if result.success: + print("Process removed successfully") +``` + +## API Request + +```bash +curl -X DELETE \ + "${WEAVER_URL}/processes/my-process" +``` + +## Returns + +- **status**: Confirmation of removal +- **message**: Success or error message + +## Error Handling + +- **404 Not Found**: Process does not exist +- **403 Forbidden**: Insufficient permissions to undeploy +- **409 Conflict**: Process has
active jobs + +## Related Skills + +- [process-deploy](../process-deploy/) - Deploy new process +- [process-list](../process-list/) - View all processes +- [process-describe](../process-describe/) - Check process details before removal + +## Documentation + +- [Process Undeployment](https://pavics-weaver.readthedocs.io/en/latest/processes.html) +- [CLI Reference](https://pavics-weaver.readthedocs.io/en/latest/cli.html) diff --git a/.agents/skills/provider-list/SKILL.md b/.agents/skills/provider-list/SKILL.md new file mode 100644 index 000000000..ead9897fd --- /dev/null +++ b/.agents/skills/provider-list/SKILL.md @@ -0,0 +1,184 @@ +--- +name: provider-list +description: | + List all registered remote providers including WPS and OGC API - Processes services. Shows + provider URLs, types, and availability status. Use to discover available external services + integrated with Weaver. +license: Apache-2.0 +compatibility: Requires Weaver API access. +metadata: + author: fmigneault +--- + +# List Providers + +List all registered remote providers and their capabilities. 
+ +## When to Use + +- Discovering available external services +- Checking provider connectivity +- Auditing registered integrations +- Finding providers for specific capabilities +- Troubleshooting federation issues + +## Parameters + +### Optional + +- **detail** (boolean): Include detailed provider information +- **check** (boolean): Verify provider connectivity + +## CLI Usage + +```bash +# List all providers +weaver capabilities -u $WEAVER_URL --providers + +# List with details +weaver capabilities -u $WEAVER_URL --providers --detail + +# Check specific provider's processes +weaver capabilities -u $WEAVER_URL -P my-provider +``` + +## Python Usage + +```python +from weaver.cli import WeaverClient + +client = WeaverClient(url="https://weaver.example.com") + +# List providers +result = client.capabilities(providers=True) + +for provider in result.body.get("providers", []): + print(f"{provider['id']}: {provider['url']}") + print(f" Type: {provider['type']}") + print(f" Public: {provider['public']}") + +# Get processes from specific provider +processes = client.capabilities(provider="my-provider") +for process in processes.body.get("processes", []): + print(f" - {process['id']}: {process.get('title', '')}") +``` + +## API Request + +```bash +curl -X GET \ + "${WEAVER_URL}/providers" +``` + +## Returns + +```json +{ + "providers": [ + { + "id": "remote-wps", + "url": "https://remote.example.com/wps", + "type": "wps", + "public": true, + "description": "Remote WPS processing service" + }, + { + "id": "ogc-api-provider", + "url": "https://ogc.example.com/processes", + "type": "ogcapi", + "public": true, + "description": "OGC API - Processes instance" + } + ], + "total": 2 +} +``` + +**Note**: Response may include additional fields. See +[API documentation](https://pavics-weaver.readthedocs.io/en/latest/api.html) for complete response schemas. 
+ +## Provider Information + +Each provider includes: + +- **id**: Unique provider identifier +- **url**: Service endpoint URL +- **type**: Service type (wps, ogcapi, esgf) +- **public**: Public accessibility flag +- **description**: Provider description +- **status**: Connectivity status (if checked) + +## Provider Types + +### WPS + +- Web Processing Service 1.0/2.0 +- XML-based protocols +- GetCapabilities, DescribeProcess, Execute + +### OGC API - Processes + +- RESTful JSON API +- Modern OGC standard +- /processes, /jobs endpoints + +### ESGF + +- Earth System Grid Federation +- Climate data processing +- Specialized scientific workflows + +## Use Cases + +### Service Discovery + +```bash +# Find all available providers +weaver capabilities -u $WEAVER_URL --providers + +# Check what processes each provider offers +for provider in $(weaver capabilities -u $WEAVER_URL --providers -f json | jq -r '.providers[].id'); do + echo "Provider: $provider" + weaver capabilities -u $WEAVER_URL -P $provider +done +``` + +### Provider Health Check + +```python +# Check all providers +providers = client.capabilities(providers=True) + +for provider in providers.body.get("providers", []): + try: + processes = client.capabilities(provider=provider["id"]) + print(f"✓ {provider['id']}: {len(processes.body.get('processes', []))} processes") + except Exception as e: + print(f"✗ {provider['id']}: Unavailable - {e}") +``` + +### Federation Management + +```python +# List providers by type +providers = client.capabilities(providers=True) + +wps_providers = [p for p in providers.body["providers"] if p["type"] == "wps"] +ogc_providers = [p for p in providers.body["providers"] if p["type"] == "ogcapi"] + +print(f"WPS providers: {len(wps_providers)}") +print(f"OGC API providers: {len(ogc_providers)}") +``` + +## Related Skills + +- [provider-register](../provider-register/) - Add new provider +- [provider-unregister](../provider-unregister/) - Remove provider +- 
[process-describe](../process-describe/) - Get provider process details +- [job-execute](../job-execute/) - Run provider processes + +## Documentation + +- [Remote Providers](https://pavics-weaver.readthedocs.io/en/latest/processes.html) +- [Provider Types](https://pavics-weaver.readthedocs.io/en/latest/processes.html) +- [CLI Reference](https://pavics-weaver.readthedocs.io/en/latest/cli.html) diff --git a/.agents/skills/provider-register/SKILL.md b/.agents/skills/provider-register/SKILL.md new file mode 100644 index 000000000..b8f1598f3 --- /dev/null +++ b/.agents/skills/provider-register/SKILL.md @@ -0,0 +1,181 @@ +--- +name: provider-register +description: | + Register an external WPS or OGC API - Processes service as a remote provider, making its processes + available through Weaver. Enables federation of services and distributed workflow execution. Use + when integrating remote processing services. +license: Apache-2.0 +compatibility: Requires Weaver API access with provider registration permissions. +metadata: + author: fmigneault +--- + +# Register Provider + +Register an external WPS or OGC API - Processes service as a remote provider. 
+ +## When to Use + +- Integrating external WPS services +- Connecting to remote OGC API - Processes instances +- Building federated processing networks +- Enabling distributed workflow execution +- Accessing specialized remote processing capabilities +- Implementing multi-organization collaborations + +## Parameters + +### Required + +- **provider_id** (string): Unique provider identifier +- **url** (string): Provider service URL + +### Optional + +- **type** (string): Provider type + - `wps`: WPS 1.0/2.0 service + - `ogcapi`: OGC API - Processes + - `esgf`: ESGF processing service +- **public** (boolean): Whether provider is publicly accessible (default: true) +- **auth** (object): Authentication credentials (if required) + +## CLI Usage + +```bash +# Register WPS provider +weaver register -u $WEAVER_URL -n my-wps-provider -w https://remote-wps.example.com/wps + +# Register OGC API - Processes provider +weaver register -u $WEAVER_URL -n my-ogc-provider -w https://remote-ogc.example.com/processes + +# Register with authentication +weaver register -u $WEAVER_URL -n secure-provider -w https://secure.example.com/wps --auth token:SECRET_TOKEN +``` + +## Python Usage + +```python +from weaver.cli import WeaverClient + +client = WeaverClient(url="https://weaver.example.com") + +# Register provider +result = client.register( + provider_id="my-provider", + url="https://remote.example.com/wps", + type="wps", + public=True +) + +if result.success: + print(f"Provider registered: {result.body['id']}") +``` + +## API Request + +```bash +curl -X POST \ + -H "Content-Type: application/json" \ + -d '{ + "id": "my-provider", + "url": "https://remote.example.com/wps", + "type": "wps", + "public": true +}' \ + "${WEAVER_URL}/providers" +``` + +## Returns + +```json +{ + "id": "my-provider", + "url": "https://remote.example.com/wps", + "type": "wps", + "public": true, + "status": "registered" +} +``` + +**Note**: Response may include additional fields. 
See +[API documentation](https://pavics-weaver.readthedocs.io/en/latest/api.html) for complete response schemas. + +## Provider Types + +### WPS (Web Processing Service) + +- Supports WPS 1.0.0 and 2.0.0 +- Automatic process discovery via GetCapabilities +- Execute operations via WPS Execute + +### OGC API - Processes + +- Modern RESTful API +- JSON-based communication +- Standardized endpoints + +### ESGF + +- Earth System Grid Federation services +- Climate data processing +- Specialized scientific workflows + +## After Registration + +Once registered, you can: + +```bash +# List processes from provider +weaver capabilities -u $WEAVER_URL -P my-provider + +# Describe remote process +weaver describe -u $WEAVER_URL -P my-provider -p remote-process + +# Execute remote process +weaver execute -u $WEAVER_URL -P my-provider -p remote-process -I inputs.json +``` + +## Use Cases + +### Federated Workflows + +```python +# Register multiple providers (`providers` is assumed to map provider IDs to service URLs) +for provider_name, provider_url in providers.items(): + client.register(provider_id=provider_name, url=provider_url) + +# Execute distributed workflow (process inputs elided) +step1 = client.execute(provider="provider1", process_id="preprocess", inputs=...) +step2 = client.execute(provider="provider2", process_id="analyze", inputs=...)
+``` + +### Service Integration + +```bash +# Register institutional services +weaver register -u $WEAVER_URL -n institution-a -w https://inst-a.org/wps +weaver register -u $WEAVER_URL -n institution-b -w https://inst-b.org/processes + +# Access all services through single endpoint +weaver capabilities -u $WEAVER_URL -P institution-a +weaver capabilities -u $WEAVER_URL -P institution-b +``` + +## Error Handling + +- **409 Conflict**: Provider ID already exists +- **400 Bad Request**: Invalid URL or parameters +- **503 Service Unavailable**: Cannot connect to provider URL + +## Related Skills + +- [provider-unregister](../provider-unregister/) - Remove provider +- [provider-list](../provider-list/) - View registered providers +- [job-execute](../job-execute/) - Run remote processes +- [process-describe](../process-describe/) - Get remote process details + +## Documentation + +- [Remote Providers](https://pavics-weaver.readthedocs.io/en/latest/processes.html) +- [Provider Types](https://pavics-weaver.readthedocs.io/en/latest/processes.html) +- [CLI Reference](https://pavics-weaver.readthedocs.io/en/latest/cli.html) diff --git a/.agents/skills/provider-unregister/SKILL.md b/.agents/skills/provider-unregister/SKILL.md new file mode 100644 index 000000000..b11f32f19 --- /dev/null +++ b/.agents/skills/provider-unregister/SKILL.md @@ -0,0 +1,147 @@ +--- +name: provider-unregister +description: | + Remove a registered remote provider from Weaver. This disconnects the external service but does + not affect the remote service itself. Use when decommissioning integrations or removing outdated + provider registrations. +license: Apache-2.0 +compatibility: Requires Weaver API access with provider management permissions. +metadata: + author: fmigneault +--- + +# Unregister Provider + +Remove a registered remote provider from Weaver. 
+ +## When to Use + +- Removing outdated provider registrations +- Decommissioning service integrations +- Cleaning up unused providers +- Updating provider configurations (unregister then re-register) +- Managing provider lifecycle + +## Parameters + +### Required + +- **provider_id** (string): Provider identifier to remove + +## CLI Usage + +```bash +# Unregister provider +weaver unregister -u $WEAVER_URL -n my-provider + +# List providers before removal +weaver capabilities -u $WEAVER_URL --providers +weaver unregister -u $WEAVER_URL -n old-provider + +# Verify removal +weaver capabilities -u $WEAVER_URL --providers +``` + +## Python Usage + +```python +from weaver.cli import WeaverClient + +client = WeaverClient(url="https://weaver.example.com") + +# Unregister provider +result = client.unregister(provider_id="my-provider") + +if result.success: + print("Provider unregistered successfully") +``` + +## API Request + +```bash +curl -X DELETE \ + "${WEAVER_URL}/providers/my-provider" +``` + +## Returns + +```json +{ + "id": "my-provider", + "status": "unregistered", + "message": "Provider successfully removed" +} +``` + +**Note**: Response may include additional fields. See +[API documentation](https://pavics-weaver.readthedocs.io/en/latest/api.html) for complete response schemas. 
+ +## Behavior + +- **Does NOT affect**: The remote service (remains operational) +- **Removes**: Provider registration from Weaver +- **Invalidates**: References to provider processes in workflows +- **Preserves**: Historical job records that used the provider + +## Impact on Existing Jobs + +- **Completed jobs**: Remain accessible with full history +- **Running jobs**: Continue execution (already dispatched) +- **Pending jobs**: May fail if they reference the provider + +## Use Cases + +### Provider Update + +```bash +# Update provider URL or configuration +weaver unregister -u $WEAVER_URL -n my-provider +weaver register -u $WEAVER_URL -n my-provider -w https://new-url.example.com/wps +``` + +### Cleanup + +```python +# Remove unused providers +providers = client.capabilities(providers=True) + +for provider in providers.body.get("providers", []): + # Check if provider is still reachable + try: + processes = client.capabilities(provider=provider["id"]) + if not processes.body.get("processes"): + client.unregister(provider_id=provider["id"]) + except Exception: + client.unregister(provider_id=provider["id"]) +``` + +### Service Migration + +```bash +# Migrate from old to new provider +weaver register -u $WEAVER_URL -n new-provider -w https://new.example.com/wps + +# Test new provider +weaver capabilities -u $WEAVER_URL -P new-provider + +# Remove old provider +weaver unregister -u $WEAVER_URL -n old-provider +``` + +## Error Handling + +- **404 Not Found**: Provider does not exist +- **403 Forbidden**: Insufficient permissions +- **409 Conflict**: Provider has active jobs + +## Related Skills + +- [provider-register](../provider-register/) - Register new provider +- [provider-list](../provider-list/) - View all providers +- [job-execute](../job-execute/) - Use provider processes +- [job-list](../job-list/) - Check for provider jobs before removal + +## Documentation + +- [Provider Management](https://pavics-weaver.readthedocs.io/en/latest/processes.html) +- [CLI 
Reference](https://pavics-weaver.readthedocs.io/en/latest/cli.html) diff --git a/.agents/skills/vault-upload/SKILL.md b/.agents/skills/vault-upload/SKILL.md new file mode 100644 index 000000000..0810ba7fe --- /dev/null +++ b/.agents/skills/vault-upload/SKILL.md @@ -0,0 +1,96 @@ +--- +name: vault-upload +description: | + Securely store files or credentials in Weaver's vault for use in process execution. The vault + provides encrypted storage for sensitive data like authentication tokens, private files, or API + keys. Use when you need to handle sensitive data securely. +license: Apache-2.0 +compatibility: Requires Weaver API access with vault feature enabled. +metadata: + author: fmigneault +--- + +# Upload to Vault + +Securely store files or credentials in Weaver's encrypted vault. + +## When to Use + +- Storing API keys or authentication tokens +- Uploading private input files +- Managing sensitive credentials +- Sharing secrets between jobs securely + +## Parameters + +### Required + +- **file_id** or **vault_token** (string): Unique vault identifier +- **file_path** (path): Local file to upload + +### Optional + +- **encrypted** (boolean): Whether to encrypt the file + +## CLI Usage + +```bash +# Upload credentials +weaver upload -u $WEAVER_URL -vT my-credentials -f api-key.txt + +# Upload private data file +weaver upload -u $WEAVER_URL -vT private-data -f sensitive.nc +``` + +## Python Usage + +```python +from weaver.cli import WeaverClient + +client = WeaverClient(url="https://weaver.example.com") +result = client.upload( + vault_token="my-credentials", + file_path="api-key.txt" +) + +print(f"Uploaded to vault: vault://{result.body['vault_id']}") +``` + +## Using Vault References in Execution + +Once uploaded, reference vault content in job inputs: + +```json +{ + "inputs": { + "auth_token": { + "href": "vault://my-credentials" + }, + "private_file": { + "href": "vault://private-data" + } + } +} +``` + +## Returns + +- **vault_id**: Vault token for referencing 
+- **status**: Upload confirmation +- **reference**: `vault://` URL to use in inputs + +## Security Features + +- End-to-end encryption +- Access control per vault item +- Automatic cleanup after job completion (optional) +- Audit logging + +## Related Skills + +- [job-execute](../job-execute/) - Use vault references in execution + +## Documentation + +- [Vault Feature](https://pavics-weaver.readthedocs.io/en/latest/processes.html) +- [CLI Reference](https://pavics-weaver.readthedocs.io/en/latest/cli.html#upload) diff --git a/.agents/skills/weaver-ci-validate/SKILL.md b/.agents/skills/weaver-ci-validate/SKILL.md new file mode 100644 index 000000000..f3c331d98 --- /dev/null +++ b/.agents/skills/weaver-ci-validate/SKILL.md @@ -0,0 +1,194 @@ +--- +name: weaver-ci-validate +description: | + Run Weaver code test and lint validations through Makefile targets. + Prefer `make check*`, `make fix*` and `make test*` commands over direct tool calls to stay + aligned with project CI behavior and environment setup. +license: Apache-2.0 +compatibility: Requires Make, Python environment dependencies, and Weaver repository access. +metadata: + category: setup-operations + version: "1.0.0" + keywords: + - makefile + - lint + - tests + - validation + - ci + author: fmigneault +--- + +# Validate Tests and Lint with Makefile + +Guide validation tasks toward `Makefile` targets for consistent local and CI checks. + +## When to Use + +- Before opening a pull request. +- After modifying Python modules, tests, docs, or configuration. +- When investigating lint failures reported by CI. +- When selecting focused checks to reduce local turnaround time. + +## Core Rules + +- Use `make` targets first. Do not call `pytest`, `pylint`, `flake8`, or related tools + directly unless a `Makefile` target does not exist for the needed scope. +- Fix lint issues with `make fix*` targets when available to maintain consistency with CI auto-fix behavior. 
+ Only address other issues manually when they cannot be handled through predefined `Makefile` targets. + +## Validation Workflow + +1. Run a focused target for the area you changed. +2. If needed, run broader lint checks. +3. Run aggregate `-only` targets to mirror broader validation without triggering install steps. + +## Recommended Targets + +### Lint and Style + +```shell +make check-only +make check-lint-only +make check-pep8-only +make check-imports-only +make check-docstring-only +make check-docf-only +make check-fstring-only +make check-security-only +make check-security-code-only +make check-security-deps-only +make check-doc8-only +make check-dist-doc-only +make check-links-only +make check-css-only +make check-md-only +``` + +`make check-only` runs all enabled check families through their `-only` variants. + +Fix target counterparts for check targets that support automatic remediation: + +```shell +make fix-only +make fix-imports-only +make fix-lint-only +make fix-docf-only +make fix-fstring-only +make fix-css-only +make fix-md-only +``` + +`make fix-only` runs all enabled fix families through their `-only` variants. + +No automatic `fix-*` targets are defined for remaining `check-*` targets. + +### Test Suites + +```shell +make test-only +make test-unit-only +make test-func-only +make test-cli-only +make test-workflow-only +make test-online-only +make test-offline-only +make test-no-tb14-only +make test-code-sprint-only +make test-spec-only SPEC='pattern' +make test-coverage-only +``` + +`make test-only` runs all tests with no dependency-install pre-step. + +## Pytest Marker Patterns + +The marker registry is defined in `setup.cfg` under `[tool:pytest]`. 
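
To cross-check the marker list against the configuration, the registry can be read programmatically. The sketch below is only an illustration, assuming the markers are declared as a multi-line `markers` option with one `name: description` entry per line; the `SAMPLE_CFG` text and the `read_markers` helper are hypothetical and not part of the repository.

```python
import configparser

# Hypothetical excerpt standing in for the real setup.cfg content.
SAMPLE_CFG = """\
[tool:pytest]
markers =
    cli: mark test as related to the CLI
    online: mark test as requiring network access
    slow: mark test as slow
"""


def read_markers(cfg_text):
    """Return marker names declared under [tool:pytest] in the given config text."""
    # Interpolation is disabled so that any '%' in descriptions is read literally.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(cfg_text)
    raw = parser.get("tool:pytest", "markers", fallback="")
    # Each non-empty line is expected to look like "name: description".
    return [line.split(":", 1)[0].strip() for line in raw.splitlines() if line.strip()]


markers = read_markers(SAMPLE_CFG)
print(markers)
```

For the actual registry, the text of `setup.cfg` would be passed instead of `SAMPLE_CFG`.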
+ +### Predefined Markers + +- `cli` +- `code_sprint` +- `testbed14` +- `functional` +- `server` +- `quotation` +- `workflow` +- `online` +- `slow` +- `remote` +- `builtin` +- `vault` +- `format` +- `html` +- `prov` +- `kvp` +- `oap_part1` +- `oap_part2` +- `oap_part3` +- `oap_part4` +- `openeo` +- `wps` + +### Predefined Target Patterns + +These wrap common marker expressions: + +- `test-unit-only` -> `-m "not slow and not online and not functional"` +- `test-func-only` -> `-m "functional and not code_sprint"` +- `test-cli-only` -> `-m "cli"` +- `test-workflow-only` -> `-m "workflow"` +- `test-online-only` -> `-m "online"` +- `test-offline-only` -> `-m "not online"` +- `test-no-tb14-only` -> `-m "not testbed14"` +- `test-code-sprint-only` -> `-m "code_sprint"` + +### Flexible Marker Expressions + +Use `TEST_XARGS` to append custom pytest expressions while still using `make`: + +```shell +make test-unit-only TEST_XARGS='-m "oap_part1 and functional and not remote"' +make test-func-only TEST_XARGS='-m "(workflow or quotation) and not slow"' +make test-offline-only TEST_XARGS='-m "vault and not online and not remote"' +``` + +## Specific Test Selection Examples + +Select a specific file while keeping a make target wrapper: + +```shell +make test-unit-only TEST_XARGS='tests/functional/code_sprint/test_server.py' +``` + +Select one test function in a file: + +```shell +make test-unit-only TEST_XARGS='tests/functional/code_sprint/test_server.py -k test_landing_page_links' +``` + +Select tests by substring expression only: + +```shell +make test-spec-only SPEC='landing_page and conformance' +``` + +## Target Discovery + +```shell +make help +make check-info +``` + +Use these to discover current targets and any `CHECKS_EXCLUDE` behavior. + +## Notes and Constraints + +- Prefer scoped targets first to keep feedback fast. +- Use `test-code-sprint-only` only when required environment variables (for example `TEST_SERVER`) are defined. 
+- Default pytest options in `setup.cfg` already apply `-m "not online and not remote"` unless overridden. +- Keep validation commands consistent with `Makefile` to preserve CI parity. + +## Related Skills + +- [weaver-install](../weaver-install/SKILL.md) - Setup dependencies and local environment. +- [weaver-skills-update](../weaver-skills-update/SKILL.md) - Maintain skills when Makefile targets evolve. diff --git a/.agents/skills/weaver-install/SKILL.md b/.agents/skills/weaver-install/SKILL.md new file mode 100644 index 000000000..5f2b4732d --- /dev/null +++ b/.agents/skills/weaver-install/SKILL.md @@ -0,0 +1,741 @@ +--- +name: weaver-install +description: | + Install Weaver from Docker or source for local development, testing, or production deployment. + Covers Docker deployment and conda environment setup using Makefile targets. Use when setting up a + new Weaver instance or development environment. +license: Apache-2.0 +compatibility: Requires Python 3.10+, conda (recommended), Docker (for container deployment), Make. +metadata: + author: fmigneault +--- + +# Install Weaver + +Install and set up Weaver for development, testing, or production use. 
+ +## When to Use + +- Setting up a new Weaver development environment +- Installing Weaver for local testing +- Deploying Weaver in production +- Contributing to Weaver development +- Running Weaver services locally + +## Prerequisites + +### System Requirements + +- **Python**: 3.10 or higher +- **Operating System**: Linux, macOS, or Windows (with WSL) +- **Make**: Build tool for using Makefile targets (if using source installation instead of Docker) + +### Required Dependencies + +- **Git**: For cloning repository +- **Conda/Miniconda**: For isolated environment management (recommended) +- **Docker**: For containerized deployment (portable production/testing) + +**Note**: If you prefer to install directly in your current Python environment without conda, you can set `CONDA_CMD=""` +(empty string) before running make commands. This bypasses conda creation/activation/detection and installs using the +detected Python interpreter. + +## Installation Methods + +### Method 1: Docker (Recommended for Production) + +#### Pull Pre-built Image + +```bash +# Latest development version +docker pull pavics/weaver:latest + +# Manager image (API and job management) +docker pull pavics/weaver:latest-manager + +# Worker image (job execution) +docker pull pavics/weaver:latest-worker + +# Specific version (check available tags on DockerHub) +docker pull pavics/weaver:X.Y.Z +docker pull pavics/weaver:X.Y.Z-manager +docker pull pavics/weaver:X.Y.Z-worker +``` + +#### Run Weaver Container + +```bash +# Basic run +docker run -p 4001:4001 pavics/weaver:latest + +# With configuration +docker run -p 4001:4001 \ + -v $(pwd)/config:/config \ + -e WEAVER_INI_FILE=/config/weaver.ini \ + pavics/weaver:latest + +# With docker-compose +cd docker +cp docker-compose.yml.example docker-compose.yml +# Edit docker-compose.yml as needed +docker-compose up -d +``` + +#### Available Docker Tags + +- `pavics/weaver:latest` - Latest development version +- `pavics/weaver:latest-manager` - Manager service 
(latest) +- `pavics/weaver:latest-worker` - Worker service (latest) +- `pavics/weaver:X.Y.Z` - Specific stable version +- `pavics/weaver:X.Y.Z-manager` - Manager service (specific version) +- `pavics/weaver:X.Y.Z-worker` - Worker service (specific version) + +### Method 2: From Source with Makefile (Development) + +#### Clone Repository + +```bash +# Clone from GitHub +git clone https://github.com/crim-ca/weaver.git +cd weaver + +# Checkout specific version (optional) +git checkout X.Y.Z +``` + +#### Install with Makefile Targets + +The Makefile provides several installation targets for different use cases: + +##### Standard Installation (Recommended) + +```bash +# Install everything needed to run Weaver commands +make install +``` + +**This is sufficient for**: + +- Running Weaver CLI commands (`weaver deploy`, `weaver execute`, etc.) +- Starting Weaver server locally +- Deploying and executing processes +- General usage and testing + +This is an alias for `make install-all` and runs: + +- `conda-env` - Creates conda environment +- `conda-install` - Installs conda packages (proj, etc.) 
+- `install-sys` - Installs system dependencies +- `install-pkg` - Installs application packages +- `install-pip` - Installs application as editable package +- `install-dev` - Installs development/test dependencies + +##### Runtime-Only Installation (Minimal) + +```bash +# Install only runtime dependencies (no development tools) +make install-run +``` + +**Use this if you only need**: + +- To run Weaver server in production +- Minimal installation footprint +- No testing or development capabilities + +This runs: + +- `conda-install` - Installs conda packages +- `install-sys` - Installs system dependencies +- `install-pkg` - Installs application packages +- `install-raw` - Installs application without dependencies + +##### Development-Specific Targets + +**These are only needed for Weaver development** (not required for using Weaver): + +```bash +# Install development/test dependencies only +make install-dev +# Required for: make test, make lint, make check-types + +# Install documentation dependencies only +make install-doc +# Required for: make docs + +# Install application as editable package only +make install-pip +# Required for: development with code changes reflected immediately +``` + +##### Individual Component Targets + +```bash +# Just create/update conda environment +make conda-env + +# Install system dependencies (pip, setuptools, etc.) 
+make install-sys + +# Install application package dependencies +make install-pkg + +# Install application without dependencies +make install-raw +``` + +#### Quick Installation + +**For most users** (running Weaver commands and processes): + +```bash +# One command does it all (with conda) +make install + +# Activate environment +conda activate weaver + +# You're ready to use Weaver +weaver --version +pserve config/weaver.ini +``` + +**Alternative**: Without conda (use current Python environment): + +```bash +# Bypass conda and install directly in current Python +CONDA_CMD="" make install + +# No conda activation needed - already in your environment +weaver --version +pserve config/weaver.ini +``` + +#### Manual Step-by-Step Installation + +If you prefer to understand each step: + +```bash +# 1. Create conda environment +make conda-env + +# 2. Activate environment +conda activate weaver + +# 3. Install system dependencies +make install-sys + +# 4. Install application dependencies +make install-pkg + +# 5. Install application itself +make install-pip + +# 6. (Recommended) Install development tools for testing +make install-dev +``` + +**Note**: `make install` does all of the above automatically. 
+ +## Configuration + +### Create Configuration File + +```bash +# Copy example configuration +cp config/weaver.ini.example config/weaver.ini + +# Edit configuration +vim config/weaver.ini +``` + +### Key Configuration Options + +```ini +[app:main] +# Weaver mode: ADES, EMS, or HYBRID +weaver.configuration = HYBRID + +# Service URL and WPS output locations +weaver.url = http://localhost:4001 +weaver.wps_output_url = http://localhost:4001/wpsoutputs +weaver.wps_output_dir = /tmp/weaver-outputs + +# MongoDB connection +weaver.mongodb_connection = mongodb://localhost:27017/weaver + +# Celery broker +celery.broker_url = mongodb://localhost:27017/celery +celery.result_backend = mongodb://localhost:27017/celery +``` + +### Configuration Modes + +**ADES** (Application Deployment and Execution Service): + +- Local process execution +- Direct access to data +- Single-node deployment + +**EMS** (Execution Management Service): + +- Orchestrates remote ADES +- Distributed workflow execution +- Multi-node deployment + +**HYBRID**: + +- Both ADES and EMS capabilities +- Most flexible configuration + +## Running Weaver + +### Development Mode + +```bash +# With pserve (development server) +pserve config/weaver.ini --reload + +# Access API at http://localhost:4001 +``` + +### Production Mode + +```bash +# Using gunicorn +gunicorn --paste config/weaver.ini -b 0.0.0.0:4001 --workers 4 + +# Using docker-compose +docker-compose -f docker/docker-compose.yml up -d +``` + +### Worker Process + +```bash +# Start Celery worker for job execution +celery -A pyramid_celery.celery_app worker \ + --ini config/weaver.ini \ + --loglevel INFO +``` + +## Verification + +### Check Installation + +```bash +# Python package (after activating conda environment) +conda activate weaver +python -c "import weaver; print(weaver.__version__)" + +# CLI tool +weaver --version + +# API (after starting server) +weaver info -u http://localhost:4001 +curl http://localhost:4001/ +``` + +### Test Installation + +```bash +# Run tests
(requires install-dev) +make test + +# Quick test +make test-unit + +# Functional tests +make test-functional +``` + +### Verify Services + +```bash +# Check API +curl http://localhost:4001/ | jq + +# Check processes endpoint +curl http://localhost:4001/processes | jq + +# Check conformance +curl http://localhost:4001/conformance | jq + +# Using CLI +weaver info -u http://localhost:4001 +weaver capabilities -u http://localhost:4001 +``` + +## Common Installation Issues + +### Issue: Python Version Too Old + +```bash +# Error: Requires Python 3.10+ + +# Solution: Specify Python version when creating environment +conda create -n weaver python=3.10 +conda activate weaver +make install +``` + +### Issue: Conda Not Found + +```bash +# Error: conda command not found + +# Solution 1: Install Miniconda +# Download from: https://docs.conda.io/en/latest/miniconda.html +# Or use the Makefile target +make conda-base + +# Solution 2: Bypass conda and use current Python environment +CONDA_CMD="" make install +``` + +### Issue: Missing System Dependencies + +```bash +# Error: gcc not found, or missing libraries + +# Ubuntu/Debian +sudo apt-get update +sudo apt-get install -y build-essential python3-dev libproj-dev + +# macOS (with Homebrew) +brew install proj + +# Then reinstall +make install-sys +make install-pkg +``` + +### Issue: MongoDB Connection Failed + +```bash +# Error: Cannot connect to MongoDB + +# Solution 1: Install and start MongoDB +sudo apt-get install mongodb +sudo systemctl start mongodb + +# Solution 2: Use Docker for MongoDB +docker run -d -p 27017:27017 --name mongodb mongo:latest + +# Solution 3: Update connection string in weaver.ini +weaver.mongodb_connection = mongodb://localhost:27017/weaver +``` + +### Issue: Celery Worker Not Starting + +```bash +# Error: Celery connection refused + +# Solution: Ensure MongoDB is running (used as broker) +sudo systemctl status mongodb + +# Or check Docker container +docker ps | grep mongo + +# Verify broker URL in 
weaver.ini +celery.broker_url = mongodb://localhost:27017/celery +``` + +### Issue: Docker Permission Denied + +```bash +# Error: permission denied while trying to connect to Docker daemon + +# Solution: Add user to docker group +sudo usermod -aG docker $USER +newgrp docker + +# Or run with sudo +sudo docker pull pavics/weaver:latest +``` + +### Issue: Make Target Fails + +```bash +# Error: make install-all fails + +# Solution: Install dependencies step by step +make conda-env +conda activate weaver +make install-sys +make install-pkg +make install-pip + +# Check for specific error messages and resolve +``` + +## Development Setup + +### Standard Setup (Using Weaver) + +```bash +# Clone and setup +git clone https://github.com/crim-ca/weaver.git +cd weaver + +# Install everything needed to use Weaver +make install + +# Activate environment +conda activate weaver + +# Start development server +pserve config/weaver.ini --reload + +# In another terminal, start worker +conda activate weaver +celery -A pyramid_celery.celery_app worker --ini config/weaver.ini +``` + +### Advanced Development Setup (Contributing to Weaver) + +**Only needed if you're modifying Weaver's code**: + +```bash +# After make install, you already have everything +# Additional tools are included in install-dev (part of make install) + +# Run linting +make lint + +# Run type checking +make check-types + +# Format code +make format + +# Generate documentation (requires install-doc) +make install-doc +make docs + +# Clean build artifacts +make clean +``` + +### Updating Installation + +```bash +# Pull latest changes +git pull origin master + +# Reinstall (updates dependencies and application) +make install + +# Alternatively, update specific parts: +# Update dependencies only +make install-pkg + +# Reinstall application only +make install-pip +``` + +## Docker Compose Deployment + +### Setup + +```bash +cd docker + +# Copy and customize configuration +cp docker-compose.yml.example docker-compose.yml 
+cp ../config/weaver.ini.example ../config/weaver.ini + +# Edit as needed +vim docker-compose.yml +vim ../config/weaver.ini +``` + +### Deploy Services + +```bash +# Start all services +docker-compose up -d + +# View logs +docker-compose logs -f weaver + +# Check status +docker-compose ps + +# Stop services +docker-compose down +``` + +### Services Included + +- **weaver-manager**: API and job management +- **weaver-worker**: Job execution worker +- **mongodb**: Database +- **nginx**: Reverse proxy (optional) + +## Environment Variables + +### Common Variables + +```bash +# Set Weaver URL +export WEAVER_URL=http://localhost:4001 + +# Set configuration file +export WEAVER_INI_FILE=/path/to/weaver.ini + +# Set log level +export WEAVER_LOG_LEVEL=DEBUG + +# Set output directory +export WEAVER_WPS_OUTPUT_DIR=/tmp/weaver-outputs + +# MongoDB connection +export WEAVER_MONGODB_CONNECTION=mongodb://localhost:27017/weaver +``` + +## Post-Installation + +### Deploy Sample Processes + +```bash +# Activate environment +conda activate weaver + +# Deploy a test process +weaver deploy -u $WEAVER_URL \ + -p hello-world \ + -b tests/functional/application-packages/DockerCopyImages/deploy.json + +# List processes +weaver capabilities -u $WEAVER_URL + +# Execute test +weaver execute -u $WEAVER_URL \ + -p hello-world \ + -I tests/functional/application-packages/DockerCopyImages/execute.json +``` + +### Configure Data Sources + +```bash +# Copy data sources example +cp config/data_sources.yml.example config/data_sources.yml + +# Edit data sources +vim config/data_sources.yml +``` + +### Set Up Vault (Optional) + +```bash +# Configure vault for secure credentials +cp config/request_options.yml.example config/request_options.yml + +# Edit vault settings +vim config/request_options.yml +``` + +## Production Considerations + +### Security + +- Use HTTPS in production +- Configure authentication (if needed) +- Secure MongoDB connections +- Isolate worker processes +- Restrict Docker socket 
access + +### Performance + +- Use multiple workers for job execution +- Configure appropriate resource limits +- Use external MongoDB for persistence +- Enable result caching +- Monitor resource usage + +### Monitoring + +- Set up log aggregation +- Configure health checks +- Monitor job queue +- Track resource usage +- Set up alerts + +## Makefile Reference + +### Installation Targets + +```bash +make install # Alias for install-all +make install-all # Full development installation +make install-run # Runtime-only installation +make install-dev # Development dependencies only +make install-doc # Documentation dependencies only +make install-pkg # Application packages only +make install-sys # System dependencies only +make install-pip # Application as editable package +make install-raw # Application without dependencies +``` + +### Environment Targets + +```bash +make conda-base # Install conda/miniconda +make conda-env # Create conda environment +make conda-config # Configure conda channels +make conda-install # Install conda packages +``` + +### Testing Targets + +```bash +make test # Run all tests +make test-unit # Unit tests only +make test-functional # Functional tests only +``` + +### Development Targets + +```bash +make start # Start development server +make start-worker # Start Celery worker +make clean # Clean build artifacts +make lint # Run linting +make docs # Generate documentation +``` + +## Related Skills + +- [api-info](../api-info/) - Verify installation +- [api-version](../api-version/) - Check installed version +- [process-deploy](../process-deploy/) - Deploy first process +- [job-execute](../job-execute/) - Run test job +- [api-conformance](../api-conformance/) - Verify OGC compliance + +## Documentation + +- [Installation Guide](https://pavics-weaver.readthedocs.io/en/latest/installation.html) +- [Configuration](https://pavics-weaver.readthedocs.io/en/latest/configuration.html) +- [Docker 
Deployment](https://pavics-weaver.readthedocs.io/en/latest/installation.html#docker-images) +- [GitHub Repository](https://github.com/crim-ca/weaver) +- [DockerHub Images](https://hub.docker.com/r/pavics/weaver) + +## Quick Start Summary + +```bash +# Method 1: Docker (fastest) +docker pull pavics/weaver:latest +docker run -p 4001:4001 pavics/weaver:latest + +# Method 2: From source (for running weaver commands) +git clone https://github.com/crim-ca/weaver.git +cd weaver +make install +conda activate weaver +pserve config/weaver.ini + +# Verify +weaver info -u http://localhost:4001 +``` + +Choose the method that best fits your use case and environment! diff --git a/.agents/skills/weaver-skill-create/SKILL.md b/.agents/skills/weaver-skill-create/SKILL.md new file mode 100644 index 000000000..fd3c7731b --- /dev/null +++ b/.agents/skills/weaver-skill-create/SKILL.md @@ -0,0 +1,325 @@ +--- +name: weaver-skill-create +description: | + Create new Agent Skills for Weaver capabilities. Learn skill structure, naming conventions, + metadata requirements, and best practices for documenting new capabilities. +license: Apache-2.0 +compatibility: Requires understanding of Weaver architecture and Agent Skills specification. +metadata: + category: setup-operations + version: "1.0.0" + keywords: + - skill-development + - documentation + - agent-skills + - capability-exposure + author: fmigneault +--- + +# Create New Agent Skills + +Learn to create new Agent Skills for Weaver capabilities in the standardized Agent Skills format.
+ +## When to Use + +- Adding new Weaver capabilities to the skill library +- Documenting LLM-accessible workflows +- Exposing functionality to AI agents and IDEs +- Creating reusable skill templates +- Contributing to the Weaver skill ecosystem + +## Skill Directory Structure + +All skills follow this standard structure: + +``` +.agents/ +└── skills/ + └── my-new-skill/ + ├── SKILL.md (required) + ├── scripts/ (optional) + │ ├── example.py + │ └── example.sh + └── assets/ (optional) + └── diagram.png +``` + +## Skill Naming Conventions + +Skill names follow one of two patterns depending on their purpose: + +### Repository/Code Management Skills + +Skills that manage the repository, installation, or skill infrastructure use +the **`weaver--`** pattern: + +- **weaver**: Prefix to identify repository/infrastructure management +- **component**: The target domain (e.g., `skill`, `install`) +- **action**: The operation (e.g., `create`, `update`) + +✓ Examples: +- `weaver-skill-create` - Create new Agent Skills +- `weaver-skills-update` - Update skill documentation +- `weaver-install` - Install Weaver + +In a sense, these are "meta-skills" that manage the skill ecosystem itself. +The `"weaver-"` is itself the *component* and the `` is the specific operation on that component.
+ +### Operational Skills + +Skills that perform operations using Weaver or other tools relevant to it simply describe what they do, +**without** the `weaver-` prefix: + +- **component**: The domain or object (e.g., `job`, `process`, `cwl`, `api`) +- **action**: The operation (e.g., `deploy`, `monitor`, `validate`) + +✓ Examples: +- `job-monitor` - Monitor job execution +- `process-deploy` - Deploy processes +- `cwl-validate-package` - Validate CWL packages +- `api-version` - Get API version information + +### Naming Rules + +- Use **lowercase with hyphens** (never underscores) +- Be **descriptive but concise** +- Match directory name to skill name exactly +- Use `weaver-` prefix **only** for repository/infrastructure management skills + +## SKILL.md Frontmatter + +Every SKILL.md must start with YAML frontmatter: + +```yaml +--- +name: skill-name # Unique identifier (matches directory) +description: | # Multi-line description with keywords + Clear description of what it does. + Include keywords for AI discoverability. 
+license: Apache-2.0 # License information +compatibility: Requirements # Environment/system requirements +metadata: + category: category-name # e.g., job-operations, process-management + version: "1.0.0" # Skill version + keywords: # Search keywords + - keyword1 + - keyword2 + author: # Initial/original skill author + contributors: # Optional list of subsequent modifiers + - +--- +``` + +### Metadata Fields + +| Field | Type | Required | Description | +| --- | --- | --- | --- | +| `name` | string | Yes | Unique skill identifier (lowercase, hyphens) | +| `description` | string | Yes | Clear description with keywords (max 1024 chars) | +| `license` | string | Yes | License type (e.g., Apache-2.0) | +| `compatibility` | string | Yes | System/environment requirements | +| `metadata.category` | string | Yes | Skill category for organization | +| `metadata.version` | string | Yes | Skill version (semantic versioning) | +| `metadata.keywords` | array | Yes | Search keywords for discovery | +| `metadata.author` | string | Yes | Original skill author | +| `metadata.contributors` | array | No | Contributors who modified the skill after creation | + +The metadata fields must respect the [Agent Skills Specification](https://agentskills.io/specification). + +## Description Guidelines + +Effective skill descriptions should: + +- **Start with the action**: "Deploy processes", "Monitor jobs", "Validate packages" +- **Include use cases**: When and why to use this skill +- **Add keywords**: For AI agent discovery +- **Keep concise**: Under 1024 characters +- **Avoid redundancy**: Don't repeat operations that can be performed by other skills +- **Avoid ambiguity**: Be specific about what the skill does + +Examples: + +✓ Good: "Deploy CWL application packages to Weaver. Use for registering new processes." + +✗ Poor: "This skill is used to deploy CWL packages to Weaver using deployment." + +## Content Structure + +After frontmatter, structure SKILL.md as follows. 
+Sections that are not applicable can be omitted, but should be provided if available to offer alternatives. + +````markdown +# Skill Title + +One-line summary of the capability. + +## When to Use + +- Use case 1 +- Use case 2 +- Use case 3 + +## Parameters + +- **param_name** (type): Description +- **param_name** (type): Optional description + +## CLI Usage + +```bash +# Example command +command --flag value +``` + +## Python Usage + +```python +from weaver.cli import WeaverClient + +client = WeaverClient(url="...") +result = client.method() +``` + +## API Request + +```bash +curl -X GET \ + "${WEAVER_URL}/endpoint" +``` + +## Returns + +JSON structure or response format. + +## Limitations + +- Issue 1: Solution +- Issue 2: Cause and workaround + +## Related Skills + +- [related-skill](../related-skill/): Brief description of how it relates +- [other-skill](../other-skill/): Brief description of how it relates + +## References + +- [Documentation](https://link-to-docs) +```` + +## Best Practices + +1. **Keep it focused**: One skill = one primary capability +2. **Include examples**: Provide CLI, Python, and API examples +3. **Document parameters**: Every input should be documented with type and purpose +4. **Add use cases**: Help LLMs understand when to invoke this skill +5. **Link related skills**: Create navigable skill graphs +6. **Use clear language**: Avoid jargon; be explicit about requirements +7. **Test examples**: Ensure code examples actually work +8. **Update metadata keywords**: These enable AI agent discovery +9. **Provide skills**: Link to related skills, but only if actually relevant and specific to the skill being documented +10. **Provide references**: Link to documentation and specifications if needed for better understanding or context + +## Step-by-Step Creation + +### 1. Create Directory + +```bash +mkdir -p .agents/skills/my-new-skill +cd .agents/skills/my-new-skill +``` + +### 2.
Create SKILL.md + +```bash +cat > SKILL.md << 'EOF' +--- +name: my-new-skill +description: | + What this skill does in one clear sentence. +license: Apache-2.0 +compatibility: Requirements here. +metadata: + category: operation-type + version: "1.0.0" + keywords: + - keyword1 + - keyword2 + author: + contributors: + - +--- + +# Skill Title + +Content here... +EOF +``` + +### 3. Add Usage Examples + +Include at least three usage methods: +- CLI examples with commands and flags +- Python code with imports and method calls +- Raw API requests with curl or HTTP + +### 4. Document Parameters + +List all inputs with: +- Parameter name and type +- Description +- Default value (if applicable) +- Example values + +### 5. Document Returns + +Show what the skill returns: +- Success response format +- Error response format +- Example output + +### 6. Update Catalogs + +After creating a skill, update the following files to include cross-references. +Note that file references are from the root of the repository. + +- **[AGENTS.md](/AGENTS.md)** - Add skill to appropriate category +- **[.agents/README.md](/.agents/README.md)** - Add skill to appropriate category with description + +### 7. Test Documentation + +Verify the [Validation Checklist](#validation-checklist) to ensure the skill is complete and discoverable. +If validation reports lint/format issues introduced by the new or modified skill, +fix them and rerun validation until all checks pass. 
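
Some of these checks can be automated. As a rough sketch that assumes only the Python standard library, the hypothetical helper `check_skill` below verifies three things: that the frontmatter `name` matches the directory name, that the name uses lowercase words joined by hyphens, and that no line exceeds 120 characters. It is not one of the scripts shipped with the repository, and its line-based frontmatter parsing is a simplification that does not replace a real YAML parser.

```python
import re
from pathlib import Path

# Lowercase alphanumeric words joined by single hyphens (assumed naming rule).
NAME_PATTERN = re.compile(r"^[a-z0-9]+(-[a-z0-9]+)*$")
MAX_LINE_LENGTH = 120


def check_skill(skill_dir: Path) -> list[str]:
    """Return a list of human-readable problems found in ``skill_dir/SKILL.md``."""
    skill_file = skill_dir / "SKILL.md"
    if not skill_file.is_file():
        return [f"{skill_file}: missing SKILL.md"]

    problems = []
    lines = skill_file.read_text(encoding="utf-8").splitlines()

    # Frontmatter is the block between the first line and the next '---' line.
    frontmatter = []
    if lines and lines[0].strip() == "---":
        closing = [i for i, line in enumerate(lines[1:], start=1) if line.strip() == "---"]
        if closing:
            frontmatter = lines[1:closing[0]]
    if not frontmatter:
        problems.append("frontmatter block is missing or not closed")

    # Naive lookup of the top-level 'name' field.
    name = None
    for line in frontmatter:
        match = re.match(r"^name:\s*(\S+)", line)
        if match:
            name = match.group(1).strip("'\"")
            break

    if frontmatter and name is None:
        problems.append("frontmatter has no 'name' field")
    if name is not None:
        if name != skill_dir.name:
            problems.append(f"name '{name}' does not match directory '{skill_dir.name}'")
        if not NAME_PATTERN.match(name):
            problems.append(f"name '{name}' is not lowercase-with-hyphens")

    for number, line in enumerate(lines, start=1):
        if len(line) > MAX_LINE_LENGTH:
            problems.append(f"line {number} exceeds {MAX_LINE_LENGTH} characters")
    return problems
```

Calling `check_skill(Path(".agents/skills/my-new-skill"))` would return an empty list when none of these checks fail; the remaining checklist items still need manual review or the project's lint targets.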
+ +## Validation Checklist + +Before considering a skill complete: + +- [ ] Directory name matches skill name (lowercase, hyphens) +- [ ] `SKILL.md` has complete frontmatter +- [ ] All required metadata fields present +- [ ] Description includes keywords for AI discovery +- [ ] At least 3 usage examples (CLI, Python, API) +- [ ] All examples are syntactically correct +- [ ] All code examples tested +- [ ] Parameters clearly documented with types +- [ ] Return values documented and match expected API behavior +- [ ] Links to related skills are valid +- [ ] Metadata keywords enable discovery +- [ ] Line length ≤ 120 characters +- [ ] No escaped underscores (`\_` → `_`) +- [ ] YAML frontmatter is syntactically valid +- [ ] Lint checks pass for Markdown formatting, line length and skill code examples +- [ ] Any lint issues introduced by this skill were fixed and checks rerun until clean + +## Related Skills + +- [weaver-ci-validate](../weaver-ci-validate/) - Validate code and lint with Makefile targets +- [weaver-skills-update](../weaver-skills-update/) - Maintain and update skills documentation after creation + +## References + +- **Agent Skills Specification**: +- **Weaver Documentation**: +- **Weaver GitHub**: + diff --git a/.agents/skills/weaver-skills-update/SKILL.md b/.agents/skills/weaver-skills-update/SKILL.md new file mode 100644 index 000000000..e7869850c --- /dev/null +++ b/.agents/skills/weaver-skills-update/SKILL.md @@ -0,0 +1,614 @@ +--- +name: weaver-skills-update +description: | + Maintain and update Agent Skills documentation when Weaver codebase changes. + Detect code modifications in weaver/cli.py, Makefile, docs/, and configuration files, + then systematically update relevant skills to keep documentation synchronized. + Use when contributing code changes or maintaining skills framework. +license: Apache-2.0 +compatibility: Requires Python 3.10+, git, access to weaver repository. 
+metadata: + category: setup-operations + version: 1.0.0 + author: fmigneault +allowed-tools: run_command file_read file_write grep_search +--- + +# Update Weaver Agent Skills + +Maintain and update Agent Skills documentation when Weaver codebase changes. + +## When to Use + +- After modifying Weaver CLI commands (`weaver/cli.py`) +- After updating Makefile targets +- After changing configuration options +- After updating API endpoints in `weaver/wps_restapi/` +- After modifying process operations +- When documentation falls out of sync with code +- During major version releases + +## Overview + +Agent Skills must stay synchronized with the Weaver codebase. This skill provides systematic procedures to detect +changes and update relevant skills. + +## Available Scripts + +This skill provides three automation scripts to help maintain Agent Skills: + +### 1. detect-skill-updates.sh + +**Purpose**: Detect which files have changed and which skills need updating + +**Usage**: + +```bash +.agents/skills/weaver-skills-update/scripts/detect-skill-updates.sh "1 week ago" +``` + +**Output**: Reports CLI, Makefile, API, and documentation changes with recommendations + +**See**: [Automated Detection Script](#automated-detection-script) section for details + +### 2. validate-skills.sh + +**Purpose**: Validate YAML frontmatter syntax and cross-references + +**Usage**: + +```bash +.agents/skills/weaver-skills-update/scripts/validate-skills.sh +``` + +**Output**: Checks YAML parsing and skill directory references + +**See**: [Automated Validation](#automated-validation) section for details + +### 3. check_frontmatter.py + +**Purpose**: Verify YAML frontmatter uses proper multiline format + +**Usage**: + +```bash +python3 .agents/skills/weaver-skills-update/scripts/check_frontmatter.py +``` + +**Output**: Ensures all skills use `description: |` format + +**See**: [YAML Frontmatter Format](#yaml-frontmatter-format) section for details + +## Change Detection Strategy + +### 1. 
Identify Changed Files + +```bash +# Check git status for modified files +git status --porcelain + +# Compare with upstream +git diff origin/master --name-only + +# Check specific areas +git diff origin/master -- weaver/cli.py +git diff origin/master -- Makefile +git diff origin/master -- docs/source/ +git diff origin/master -- weaver/wps_restapi/ +``` + +### 2. Analyze Impact Areas + +Based on changed files, determine affected skill categories: + +| Changed File | Affected Skills | Update Priority | +| ------------------------- | ------------------------- | --------------- | +| `weaver/cli.py` | All CLI-referenced skills | **High** | +| `weaver/wps_restapi/*.py` | API, job, process skills | **Medium** | +| `docs/source/*.rst` | Documentation links | **Medium** | +| `config/*.example` | Configuration examples | **Medium** | +| `weaver/processes/*.py` | process-\* skills | **Medium** | +| `Makefile` | weaver-install | **Medium** | +| `CHANGES.rst` | All depending on change | **Low** | + +## Update Procedures + +### Procedure 1: CLI Command Changes + +**When**: `weaver/cli.py` is modified + +**Steps**: + +#### 1. Extract CLI methods + +```bash +# List all CLI commands +grep -E "^def (.*)\(.*\):" weaver/cli.py | grep -v "^def _" + +# Check for new commands +git diff origin/master weaver/cli.py | grep "^+.*def " | grep -v "^+.*def _" + +# Check for modified command signatures +git diff origin/master weaver/cli.py | grep -A5 "def " +``` + +#### 2. Identify affected skills + +```bash +# Map CLI commands to skills +# deploy → process-deploy +# execute → job-execute +# status → job-status +# logs → job-logs +# results → job-results +# etc. +``` + +#### 3. Update skill documentation + +For each affected skill: + +```bash +# a. Update CLI Usage section +# - Review command signature changes +# - Update parameter descriptions +# - Update example commands + +# b. 
Update Python Usage section +# - Check WeaverClient method changes +# - Update parameter names +# - Update return value handling + +# c. Verify examples still work +weaver --help +# Test command with example from skill +``` + +#### 4. Check for new CLI commands + +```bash +# If new command found, create new skill: +# - Determine category (job-*, process-*, provider-*, etc.) +# - Create skill directory and SKILL.md +# - Follow existing skill template +# - Add cross-references +# - Update .agents/README.md +``` + +### Procedure 2: Makefile Target Changes + +**When**: `Makefile` is modified + +**Steps**: + +#### 1. Detect changed targets + +```bash +# List all make targets +grep -E "^[a-z-]+:" Makefile | cut -d: -f1 + +# Check for new or modified targets +git diff origin/master Makefile | grep "^+.*:" | grep -v "^+##" +``` + +#### 2. Update weaver-install skill + +```bash +# Update sections: +# - "Install with Makefile Targets" +# - "Makefile Reference" +# - Installation procedure examples + +# Verify each target description: +make help | grep "install" +``` + +#### 3. Test installation procedures + +```bash +# Verify each documented procedure still works +cd /tmp && git clone https://github.com/crim-ca/weaver.git test-install +cd test-install +make install-all # Test documented procedure +``` + +### Procedure 3: API Endpoint Changes + +**When**: `weaver/wps_restapi/*.py` files are modified + +**Steps**: + +#### 1. Identify endpoint changes + +```bash +# Check route definitions +git diff origin/master weaver/wps_restapi/api.py | grep "@.*route" + +# Check new endpoints +git diff origin/master weaver/wps_restapi/ | grep -E "@.*\.(get|post|put|delete)" +``` + +#### 2. 
Map endpoints to skills + +```markdown +GET /processes → process-list +GET /processes/{id} → process-describe +POST /processes → process-deploy +DELETE /processes/{id} → process-undeploy +POST /processes/{id}/execution → job-execute +GET /jobs/{id} → job-status +GET /jobs/{id}/results → job-results +GET /jobs/{id}/logs → job-logs +etc. +``` + +#### 3. Update affected skills + +For each affected skill: + +```bash +# a. Update API Request section +# - Verify endpoint path +# - Check request parameters +# - Update request body examples + +# b. Update Returns section +# - Check response schema changes +# - Update example responses +# - Note new fields + +# c. Test endpoint +curl -X GET "${WEAVER_URL}/endpoint" | jq +``` + +### Procedure 4: Documentation Link Changes + +**When**: `docs/source/*.rst` files are modified + +**Steps**: + +#### 1. Check for restructured docs + +```bash +# Find moved or renamed files +git diff origin/master --name-status docs/source/ + +# Check for changed anchors +git diff origin/master docs/source/*.rst | grep -E "^\+.*\.\. _" +``` + +#### 2. Update documentation links in skills + +```bash +# Find all documentation links +grep -r "https://pavics-weaver.readthedocs.io" .agents/skills/*/SKILL.md + +# For each skill with doc links: +# - Verify link still works +# - Update path if file moved +# - Keep base URLs without anchors (anchors are auto-generated) +``` + +#### 3. Verify links + +```bash +# Check each link returns 200 +for skill in .agents/skills/*/SKILL.md; do + echo "Checking $skill" + grep -o "https://pavics-weaver[^)]*" "$skill" | while read url; do + curl -s -o /dev/null -w "%{http_code} $url\n" "$url" + done +done +``` + +### Procedure 5: Configuration Changes + +**When**: `config/*.example` files are modified + +**Steps**: + +#### 1. Identify configuration changes + +```bash +# Check example configs +git diff origin/master config/weaver.ini.example +git diff origin/master config/data_sources.yml.example +``` + +#### 2. 
Update weaver-install skill + +```bash +# Update "Configuration" section +# - New configuration options +# - Changed defaults +# - Deprecated options + +# Update "Key Configuration Options" examples +``` + +## Systematic Update Workflow + +### Complete Update Process + +```bash +# 1. Create update branch +git checkout -b update-skills-$(date +%Y%m%d) + +# 2. Identify changes since last update (optional - see what changed) +git log --since="LAST_UPDATE_DATE" --name-only --pretty=format: | sort -u + +# 3. Detect changes and get recommendations +.agents/skills/weaver-skills-update/scripts/detect-skill-updates.sh "1 week ago" + +# 4. For each change category reported, follow the corresponding procedure above +# - CLI changes → Procedure 1 +# - Makefile changes → Procedure 2 +# - API changes → Procedure 3 +# - Documentation changes → Procedure 4 +# - Configuration changes → Procedure 5 + +# 5. Verify all skills with validation scripts +.agents/skills/weaver-skills-update/scripts/validate-skills.sh +python3 .agents/skills/weaver-skills-update/scripts/check_frontmatter.py + +# 6. Update skill count in README (if skills added/removed) +vim .agents/README.md + +# 7. Commit changes +git add .agents/ +git commit -m "Update Agent Skills to match code changes" + +# 8. Create pull request +git push origin update-skills-$(date +%Y%m%d) +``` + +## Automated Detection Script + +A script is provided to detect which files have changed and need skill updates. The script identifies changes but the +actual update procedures are documented in this SKILL.md file (see "Update Procedures" section above). 
+ +**Script**: [`scripts/detect-skill-updates.sh`](scripts/detect-skill-updates.sh) + +**Usage**: + +```bash +# Run from repository root +.agents/skills/weaver-skills-update/scripts/detect-skill-updates.sh + +# Check changes since specific date +.agents/skills/weaver-skills-update/scripts/detect-skill-updates.sh "1 week ago" + +# Check changes since last month (default) +.agents/skills/weaver-skills-update/scripts/detect-skill-updates.sh "1 month ago" +``` + +**What it does**: + +- Analyzes git history for changes to key files +- Reports CLI changes (affects `job-*`, `process-*`, `provider-*` skills) +- Reports Makefile changes (affects weaver-install skill) +- Reports API changes (affects endpoint-related skills) +- Reports documentation changes (affects documentation links) +- Provides update recommendations based on detected changes + +**Note**: This script only *detects* changes. +To perform the actual updates, follow the procedures documented in the +sections above (Procedure 1-5 under "Update Procedures"). 
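
The routing from detected changes to update procedures can be sketched in Python. This is only an illustration under stated assumptions: the path patterns and procedure titles are taken from the "Update Procedures" section, while `PROCEDURES` and `procedures_for` are hypothetical names that are not part of the provided scripts. Unlike `detect-skill-updates.sh`, the sketch also covers `config/*.example` changes.

```python
from fnmatch import fnmatch

# Path patterns and procedure titles come from the "Update Procedures" section.
# The table and function below are hypothetical helpers, not part of the scripts.
PROCEDURES = [
    ("weaver/cli.py", "Procedure 1: CLI Command Changes"),
    ("Makefile", "Procedure 2: Makefile Target Changes"),
    ("weaver/wps_restapi/*", "Procedure 3: API Endpoint Changes"),
    ("docs/source/*", "Procedure 4: Documentation Link Changes"),
    ("config/*.example", "Procedure 5: Configuration Changes"),
]


def procedures_for(changed_paths):
    """Return the procedures triggered by the changed paths, in document order."""
    matched = []
    for pattern, procedure in PROCEDURES:
        # fnmatch's "*" also matches "/" so nested files are covered.
        if any(fnmatch(path, pattern) for path in changed_paths):
            matched.append(procedure)
    return matched


if __name__ == "__main__":
    # Example input, such as the output of `git diff origin/master --name-only`.
    changed = ["weaver/cli.py", "docs/source/cli.rst", "README.rst"]
    for procedure in procedures_for(changed):
        print(procedure)
```

Paths that match no pattern (such as `README.rst` in the example) are simply ignored, which mirrors how the detection script only reports on the key areas it knows about.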
+ +## Skill Quality Checklist + +When updating skills, verify: + +- [ ] **YAML frontmatter** is valid +- [ ] **YAML description** uses multiline format with `description: |` (see YAML Frontmatter Format below) +- [ ] **Name** matches directory name +- [ ] **Author** in frontmatter `metadata.author` preserves the original skill author +- [ ] **Contributors** include the committer in `metadata.contributors` when the skill is modified by someone else +- [ ] **Description** is accurate (1-1024 chars) +- [ ] **Parameters** remain clearly documented with types +- [ ] **Return values** remain documented and aligned with expected API behavior +- [ ] **Scripts** that require a large set of commands are placed in dedicated `scripts/` and referenced by the skill +- [ ] **Returns** section has completeness note +- [ ] **Job IDs** are UUIDs (not simple strings) +- [ ] **Documentation links** work (base URLs without anchors) +- [ ] **Cross-references** point to existing skills +- [ ] **Steps** that are purely procedural without code use numbered lists instead of code blocks +- [ ] **Steps** that need code use code blocks only as needed, not for the entire step (avoid embedded comment lists) +- [ ] **Code blocks** have proper syntax highlighting +- [ ] **Code blocks** do not repeat example keywords making their structure invalid +- [ ] **Examples** are syntactically valid +- [ ] **Python examples** use correct method signatures +- [ ] **CLI examples** use current syntax +- [ ] **API requests** use curl with `${WEAVER_URL}` +- [ ] **Examples** are tested and working if deemed necessary +- [ ] **Markdown** formatting is valid + +### YAML Frontmatter Format + +All skills must use proper YAML frontmatter with multiline descriptions to respect the limit of 120 characters per line. + +**Required format**: + +```yaml +--- +name: skill-name +description: | + Multi-line description that explains what the skill does. + Use the pipe (|) symbol to enable multiline format. 
+ This prevents line wrapping issues and maintains readability. +license: Apache-2.0 +compatibility: Requirements here +metadata: + category: category-name + version: "1.0.0" + author: + contributors: + - +allowed-tools: tool1 tool2 +--- +``` + +**Key points**: + +- **Always use `description: |`** for multiline format +- Indent description content with 2 spaces +- Keep description lines under 100 characters +- Preserve `metadata.author` as the original skill author +- Add/update `metadata.contributors` for anyone modifying the skill +- Ensure valid YAML syntax (no trailing commas, proper indentation) + +**Validation script**: [`scripts/check_frontmatter.py`](scripts/check_frontmatter.py) + +```bash +# Check YAML frontmatter format in all skills +python3 .agents/skills/weaver-skills-update/scripts/check_frontmatter.py +``` + +## Version-Specific Updates + +### Major Version Updates (X.0.0) + +Comprehensive review required: + +1. Check all CLI command changes +2. Review API breaking changes +3. Update all version references +4. Review all examples for compatibility +5. Update configuration examples +6. Check for deprecated features + +### Minor Version Updates (X.Y.0) + +Focus on new features: + +1. Check for new CLI commands → create skills +2. Check for new API endpoints → create skills +3. Update affected skills with new options +4. Add examples for new features + +### Patch Version Updates (X.Y.Z) + +Minimal updates usually needed: + +1. Check CLI help text changes +2. Verify examples still work +3. Update error handling if changed + +## Testing Updated Skills + +### Manual Testing (optional) + +> ⚠️ WARNING: Unless a Weaver instance is running locally, the following tests will fail. Running an instance can be a +> time-consuming process. Therefore, consider whether this is actually needed and there are no simpler workarounds. 
If required, +> ensure you have a test instance available before running these commands using +> [weaver-install](../weaver-install/SKILL.md) +> skill instructions. + +```bash +# Test CLI examples +weaver info -u http://localhost:4001 + +# Test curl commands +export WEAVER_URL=http://localhost:4001 +curl -X GET "${WEAVER_URL}/processes" | jq + +# Test Python examples +python << EOF +from weaver.cli import WeaverClient +client = WeaverClient(url="http://localhost:4001") +print(client.capabilities()) +EOF +``` + +### Automated Validation + +A script is provided to validate YAML frontmatter and cross-references in all skills. +If any validation or linting command reports issues introduced by the update, +fix them and rerun the same checks until all are clean. + +**Script**: [`scripts/validate-skills.sh`](scripts/validate-skills.sh) + +**Usage**: + +```bash +# Run from repository root +.agents/skills/weaver-skills-update/scripts/validate-skills.sh +``` + +**What it validates**: + +- **YAML frontmatter**: Ensures all skills have valid YAML frontmatter between `---` markers +- **Cross-references**: Checks that all relative links to other skills (`../skill-name/SKILL.md`) + point to existing skill directories and their corresponding `SKILL.md` file. + +**Example output**: + +```text +Validating Agent Skills... +========================== + +Checking YAML frontmatter... +✓ .agents/skills/api-conformance/SKILL.md +✓ .agents/skills/api-info/SKILL.md +... + +Checking cross-references... + +========================== +✅ All validations passed +``` + +### Format Validation + +Employ the `make check-md-only` target. +The similar `make fix-md-only` target can be used to automatically fix formatting issues. +Remove the `-only` suffix if installation/updates of dependencies are needed. +Repeat `check-md-only` after each fix until no warnings/errors remain. + +## Best Practices + +1. **Update skills immediately** after code changes +2. **Test examples** before committing +3. 
**Use git diff** to identify all impacts +4. **Maintain consistency** across similar skills +5. **Document breaking changes** prominently +6. **Version control** skills with code +7. **Review related skills** when updating one +8. **Keep examples simple** and focused +9. **Verify links** after documentation restructuring +10. **Update skill count** in README when adding/removing skills + +## Related Skills + +- [weaver-install](../weaver-install/SKILL.md) - Keep installation procedures current +- All skills - Any skill may need updates based on code changes + +## Documentation + +- [Agent Skills Specification](https://agentskills.io/specification) +- [Weaver Contributing Guide](https://pavics-weaver.readthedocs.io/en/latest/contributing.html) +- [Git Workflow](https://pavics-weaver.readthedocs.io/en/latest/contributing.html) + +## Quick Reference + +```bash +# Detect changes +git diff origin/master --name-only + +# Check CLI changes +git diff origin/master weaver/cli.py | grep "def " + +# Check API changes +git diff origin/master weaver/wps_restapi/ + +# Detect changes that require skill updates (automated detection) +.agents/skills/weaver-skills-update/scripts/detect-skill-updates.sh "1 week ago" + +# Update and test +# 1. Update affected skills +# 2. Test examples +# 3. Verify links +# 4. Commit changes +``` + +Keep skills synchronized with code to maintain their value for users and AI agents! diff --git a/.agents/skills/weaver-skills-update/scripts/check_frontmatter.py b/.agents/skills/weaver-skills-update/scripts/check_frontmatter.py new file mode 100755 index 000000000..27c35c83e --- /dev/null +++ b/.agents/skills/weaver-skills-update/scripts/check_frontmatter.py @@ -0,0 +1,63 @@ +#!/usr/bin/env python3 +""" +Check YAML frontmatter format in Agent Skills. +Verifies that descriptions use multiline format with 'description: |'. 
+""" +import os +import sys + +import yaml + +SKILLS_DIR = ".agents/skills" +errors = [] +warnings = [] + +for skill_name in sorted(os.listdir(SKILLS_DIR)): + skill_path = os.path.join(SKILLS_DIR, skill_name, "SKILL.md") + if not os.path.isfile(skill_path): + continue + + with open(skill_path, mode="r", encoding="utf-8") as f: + content = f.read() + + parts = content.split('---') + if len(parts) < 3: + errors.append(f"{skill_name}: Invalid frontmatter structure") + continue + + frontmatter = parts[1] + + try: + data = yaml.safe_load(frontmatter) + + # Check description format (empty or non-mapping frontmatter is reported as an error, not a crash) + if isinstance(data, dict) and isinstance(data.get('description'), str): + if 'description: |' not in frontmatter: + errors.append(f"{skill_name}: needs 'description: |' format") + elif len(data['description']) > 1024: + warnings.append(f"{skill_name}: description exceeds 1024 characters") + else: + print(f"✓ {skill_name}: correct format") + else: + errors.append(f"{skill_name}: missing or invalid description") + + except yaml.YAMLError as e: + errors.append(f"{skill_name}: YAML error - {e}") + +print() +if warnings: + print("Warnings:") + for warning in warnings: + print(f" ⚠️ {warning}") + print() + +if errors: + print("Errors:") + for error in errors: + print(f" ❌ {error}") + print() + print(f"Total errors: {len(errors)}") + sys.exit(1) +else: + print("✅ All skills have properly formatted YAML frontmatter") + sys.exit(0) diff --git a/.agents/skills/weaver-skills-update/scripts/detect-skill-updates.sh b/.agents/skills/weaver-skills-update/scripts/detect-skill-updates.sh new file mode 100755 index 000000000..cee5e5098 --- /dev/null +++ b/.agents/skills/weaver-skills-update/scripts/detect-skill-updates.sh @@ -0,0 +1,38 @@ +#!/bin/bash +# Detect files requiring skill updates + +SINCE_DATE=${1:-"1 month ago"} + +echo "Changes since: $SINCE_DATE" +echo "================================" + +# CLI changes +CLI_CHANGES=$(git log 
--since="$SINCE_DATE" --name-only --pretty=format: weaver/cli.py | sort -u 
| wc -l) +if [ "$CLI_CHANGES" -gt 0 ]; then + echo "⚠️ CLI changes detected: $CLI_CHANGES" + echo " → Review all job-*, process-*, provider-* skills" +fi + +# Makefile changes +MK_CHANGES=$(git log --since="$SINCE_DATE" --name-only --pretty=format: Makefile | sort -u | wc -l) +if [ "$MK_CHANGES" -gt 0 ]; then + echo "⚠️ Makefile changes detected: $MK_CHANGES" + echo " → Update weaver-install skill" +fi + +# API changes +API_CHANGES=$(git log --since="$SINCE_DATE" --name-only --pretty=format: weaver/wps_restapi/ | sort -u | wc -l) +if [ "$API_CHANGES" -gt 0 ]; then + echo "⚠️ API changes detected: $API_CHANGES" + echo " → Review affected endpoint skills" +fi + +# Documentation changes +DOC_CHANGES=$(git log --since="$SINCE_DATE" --name-only --pretty=format: docs/source/ | sort -u | wc -l) +if [ "$DOC_CHANGES" -gt 0 ]; then + echo "⚠️ Documentation changes detected: $DOC_CHANGES" + echo " → Verify documentation links in skills" +fi + +echo "================================" +echo "See .agents/skills/weaver-skills-update/SKILL.md for update procedures" diff --git a/.agents/skills/weaver-skills-update/scripts/validate-skills.sh b/.agents/skills/weaver-skills-update/scripts/validate-skills.sh new file mode 100755 index 000000000..f2e2b04dd --- /dev/null +++ b/.agents/skills/weaver-skills-update/scripts/validate-skills.sh @@ -0,0 +1,60 @@ +#!/bin/bash +# Validate Agent Skills YAML frontmatter and cross-references + +echo "Validating Agent Skills..." +echo "==========================" + +ERRORS=0 + +# Validate YAML frontmatter +echo "" +echo "Checking YAML frontmatter..." 
+for skill in .agents/skills/*/SKILL.md; do + python3 -c " +import yaml +import sys +try: + with open('$skill') as f: + content = f.read() + parts = content.split('---') + if len(parts) >= 3: + yaml.safe_load(parts[1]) + print('✓ $skill') + else: + print('✗ $skill: Invalid frontmatter') + sys.exit(1) +except Exception as e: + print('✗ $skill: ' + str(e)) + sys.exit(1) +" || ERRORS=$((ERRORS + 1)) +done + +# Check for broken cross-references (relative links of the form ../skill-name/SKILL.md) +echo "" +echo "Checking cross-references..." +broken_refs=() +for skill in .agents/skills/*/SKILL.md; do + while IFS= read -r ref; do + target=$(echo "$ref" | sed 's/\.\.\/\([^/]*\).*/\1/') + if [ ! -f ".agents/skills/$target/SKILL.md" ]; then + broken_refs+=("✗ Broken reference in $skill: $ref") + fi + done < <(grep -o '\.\./[^/)]*/SKILL\.md' "$skill" 2>/dev/null) +done + +# Add broken references to the YAML error count instead of overwriting it +ERRORS=$((ERRORS + ${#broken_refs[@]})) +if [ "${#broken_refs[@]}" -gt 0 ]; then + for ref in "${broken_refs[@]}"; do + echo "$ref" + done +fi + +echo "" +echo "==========================" +if [ "$ERRORS" -eq 0 ]; then + echo "✅ All validations passed" + exit 0 +else + echo "❌ Found $ERRORS error(s)" + exit 1 +fi diff --git a/.github/ISSUE_TEMPLATE/bug-report.md b/.github/ISSUE_TEMPLATE/bug-report.md index 7637fdc58..11238591d 100644 --- a/.github/ISSUE_TEMPLATE/bug-report.md +++ b/.github/ISSUE_TEMPLATE/bug-report.md @@ -1,10 +1,8 @@ --- - name: Bug Report about: Create a report to help us improve labels: triage/bug assignees: fmigneault - --- ## Describe the bug @@ -13,7 +11,7 @@ assignees: fmigneault ## How to Reproduce - diff --git a/.github/ISSUE_TEMPLATE/feature-request.md b/.github/ISSUE_TEMPLATE/feature-request.md index 8d5bba95f..8bc80377a 100644 --- a/.github/ISSUE_TEMPLATE/feature-request.md +++ b/.github/ISSUE_TEMPLATE/feature-request.md @@ -1,15 +1,13 @@ --- - name: Feature Request about: Suggest an idea for this project 
labels: triage/feature assignees: fmigneault - --- ## Description -