zilliz-launchpad

From a sample document to a production-deployed search app on Milvus / Zilliz Cloud — in minutes. An AI-guided scaffold delivered as an agent skill.

Whether you're new to Milvus or just tired of boilerplate — pick a file, run six steps, ship.

What you get

| Artifact | Description |
| --- | --- |
| Milvus Collection | Schema auto-designed from your data |
| Vector + sparse fields | OpenAI / Voyage / Cohere / BYOM (text); CLIP / Voyage multimodal (image) |
| `plan.md` | Every decision recorded; reviewable and reproducible |
| Next.js Demo UI | Hybrid search list or thumbnail gallery (auto-detected) at `localhost:3000` |
| `eval_report.md` | recall@10, p50/p95/p99 latency, optional RAG quality metrics |
| `deploy.json` | Zilliz Cloud deployment record, resumable on rerun |

Requirements

Agent — Claude Code, Copilot CLI, Gemini CLI, or any skills-compatible agent
Python ≥ 3.11 + uv
Node.js ≥ 18 + pnpm
Docker (local Standalone) or a Zilliz Cloud account
API key — OpenAI, Voyage, Cohere, or Zilliz BYOM
Optional — zilliz CLI for Cloud auto-discovery and bulk import
Optional — ffmpeg on PATH (needed for video-search frame sampling; scene-change sampling shells out to it directly)

Install

The launchpad ships as an agent skill. Install it with the skills CLI, which discovers skills under skills/ and symlinks them into your agent's skill directory (Claude Code, Copilot CLI, Gemini CLI, Cursor, OpenCode, Codex):

npx skills add zilliztech/zilliz-launchpad

Once installed, open your agent in this repo and say something like "use zilliz-launchpad to index this file". The skill drives the phases end-to-end, installing Python deps, bringing up Milvus, and prompting for any missing API keys as it goes. It stops after Phase 4 by default; Phases 5 and 6 run when you ask for them.

Install options

# Install to a specific agent
npx skills add zilliztech/zilliz-launchpad -a claude-code

# Install globally (available across all projects)
npx skills add zilliztech/zilliz-launchpad -g

# Install to all detected agents
npx skills add zilliztech/zilliz-launchpad --all

# List available skills before installing
npx skills add zilliztech/zilliz-launchpad --list

Other agent flags include -a copilot-cli, -a gemini-cli, -a cursor, -a opencode, and -a codex.

MCP install path

For non-skill-aware hosts (Cursor, Claude Desktop, generic MCP clients), the launchpad also ships an MCP server that exposes the same six phases as MCP tools. Install the mcp extra and launch the stdio server:

uv sync --extra mcp
uv run python -m launchpad_mcp.server

See mcp/README.md for the full tool catalog, the structured error envelope, and a host-registration snippet.

Preflight (optional)

The skill drives these on demand, but you can do them ahead of time to skip a few conversational round-trips:

# Install Python deps
uv sync

# Bring up local Milvus Standalone (skip if using Zilliz Cloud)
./skills/zilliz-launchpad/scripts/start_milvus.sh up

# Export at least one embedding key
export OPENAI_API_KEY=<your-key>

If you plan to drive the CLI directly without an agent, run all three — the Walkthrough below assumes they're done.

Walkthrough — the six phases

The launchpad is organized as six CLI subcommands. Each writes its artifacts to skills/zilliz-launchpad/scripts/runs/<utc-timestamp>/. You can rerun any phase; nothing is destructive until Phase 4 touches Milvus, and Phase 6 is gated behind an explicit --confirm before it creates any Cloud resources.

collect ──▶ configure ──▶ plan ──▶ execute ──▶ evaluate ──▶ deploy
                                       │           │           │
                                       │           │           └─ Zilliz Cloud + deploy.json
                                       │           └─ eval_report.{json,md}
                                       └─ Milvus collection + demo UI

We'll use the bundled movies sample (20 short fictional plot summaries) throughout.

Phase 1 — Collect: analyze your data

Looks at your file, infers field types, picks a candidate primary key and text field, and writes collect.json.

uv run python skills/zilliz-launchpad/scripts/zilliz_ops.py collect --sample movies
# or: --input path/to/your.jsonl
# or: --input path/to/your.pdf      # one record per page (requires `.[documents]` extra)
# or: --input path/to/notes.md      # whole file; add --split-markdown-headings for `## ` sections
# or: --input ./docs/               # directory (recursive) — mixes .jsonl/.pdf/.md/.csv/.txt
# or: --input 'docs/*.pdf'          # shell glob — quote it so the shell doesn't expand it first

Directory or glob inputs produce a source_files[] array in collect.json (one entry per file) and a union schema across files. A field name appearing in only some files is marked nullable: true; the same field name with different JSON types in different files raises input_schema_conflict and refuses to write collect.json.
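
The union rule above can be sketched in a few lines of Python. This is a minimal illustration, not the launchpad's implementation: the function name, the exception class, and the input shape (a mapping of file name to field types) are all assumptions made for the example.

```python
from typing import Dict, List


class InputSchemaConflict(Exception):
    """Hypothetical stand-in for the input_schema_conflict error code."""


def union_schema(per_file_fields: Dict[str, Dict[str, str]]) -> List[dict]:
    """Merge {file: {field_name: json_type}} into a single field list.

    A field missing from at least one file is marked nullable.
    The same field name with two different types raises a conflict.
    """
    merged: Dict[str, dict] = {}
    for path in sorted(per_file_fields):
        for name, ftype in per_file_fields[path].items():
            entry = merged.setdefault(name, {"type": ftype, "files": []})
            if entry["type"] != ftype:
                raise InputSchemaConflict(
                    f"{name}: {entry['type']} vs {ftype} in {path}"
                )
            entry["files"].append(path)

    total_files = len(per_file_fields)
    return [
        {"name": name, "type": e["type"], "nullable": len(e["files"]) < total_files}
        for name, e in merged.items()
    ]
```

Calling `union_schema` with two files where only the second has a `year` field marks `year` as nullable; giving `year` the type `int` in one file and `string` in another raises the conflict instead of returning a schema.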

Output (collect.json, abbreviated):

{
  "data_shape": "jsonl",
  "record_count_estimate": 20,
  "fields": [
    { "name": "id",    "type": "string", "avg_length": 4,   "sample_value": "m001" },
    { "name": "title", "type": "string", "avg_length": 18,  "sample_value": "The Quantum Gardener" },
    { "name": "body",  "type": "string", "avg_length": 126, "sample_value": "An astrophysicist..." },
    { "name": "year",  "type": "int",    "sample_value": 2023 },
    { "name": "genre", "type": "string", "sample_value": "sci-fi" }
  ]
}
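
To make the shape of that fields array concrete, here is a rough sketch of how such a summary could be derived from JSONL text. It is an assumption-laden illustration: `describe_fields` and `json_type` are hypothetical helpers, the type labels other than `string` and `int` are invented for the example, and the real collect phase also picks a primary key and text field, which this sketch does not attempt.

```python
import json
from typing import Any, Dict, List


def json_type(value: Any) -> str:
    """Map a Python value to a coarse type label (labels beyond string/int are assumed)."""
    if isinstance(value, bool):  # check bool first, since bool is a subclass of int
        return "bool"
    if isinstance(value, int):
        return "int"
    if isinstance(value, float):
        return "float"
    if isinstance(value, str):
        return "string"
    return "json"


def describe_fields(jsonl_text: str) -> List[dict]:
    """Summarize each field: type and sample from its first occurrence, avg_length for strings."""
    records = [json.loads(line) for line in jsonl_text.splitlines() if line.strip()]
    fields: Dict[str, dict] = {}
    lengths: Dict[str, List[int]] = {}
    for record in records:
        for name, value in record.items():
            fields.setdefault(
                name, {"name": name, "type": json_type(value), "sample_value": value}
            )
            if isinstance(value, str):
                lengths.setdefault(name, []).append(len(value))
    for name, seen in lengths.items():
        fields[name]["avg_length"] = round(sum(seen) / len(seen))
    return list(fields.values())
```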

For Milvus veterans: this is where you'd normally hand-write a CollectionSchema. Skip it.

Image search? Point --input at a directory of images (.jpg/.png/.webp/.gif):

uv run python skills/zilliz-launchpad/scripts/zilliz_ops.py collect --input ./photos/
uv run python skills/zilliz-launchpad/scripts/zilliz_ops.py configure \
    --use-case image-search --dataset-size 64000 --deployment local-standalone

Phase 1 walks the directory, reads EXIF, and encodes a thumbnail per image. Phase 3 then picks clip-local (open-source ViT-B/32; runs on CPU/MPS/CUDA, no API key) by default. Install the optional extra first: uv pip install -e '.[multimodal]'. The Next.js UI auto-switches to a thumbnail gallery. See issue #14 for the MVP scope.

Search by example (image → image). After Phase 4 Execute builds the collection, you can query with another image instead of a text phrase. In the demo UI (pnpm dev from scripts/ui/), click Search by image… next to the text box, or drop an image anywhere on the page, to find visually similar images in your collection. Uploads are capped at 10 MB per request. For a CLI smoke test:

uv run python skills/zilliz-launchpad/scripts/zilliz_ops.py evaluate \
    --query-image ./query/my_dog.jpg

This prints the top-10 ranked primary keys with scores and is the fastest way to confirm image-to-image search is wired up. For a labelled eval, mix image-to-image rows into your qrels file (one JSON object per line):

{"query_image_path": "query/sunset.jpg", "expected_image_ids": ["photos/sky1.jpg", "photos/beach2.jpg"]}

Then pass the file to evaluate --qrels path/to/qrels.jsonl to get recall / MRR / NDCG against your ground truth. Queries against a Voyage-multimodal-backed collection (embedding_preference: voyage-multimodal-3) call the Voyage API per query and bill to VOYAGE_API_KEY; CLIP-local stays free. See issue #15.

Video search? Point --input at a directory of .mp4 / .mov / .mkv / .webm clips:

uv run python skills/zilliz-launchpad/scripts/zilliz_ops.py collect --input ./clips/
uv run python skills/zilliz-launchpad/scripts/zilliz_ops.py configure \
    --use-case video-search --dataset-size 25 --deployment local-standalone

Phase 1 samples a frame every 2 seconds (default) using PyAV, capped at 600 frames per video, and writes JPEGs under <run-dir>/frames/. Phase 3 reuses the same CLIP path image-search uses (or voyage-multimodal-3 on override) and adds two scalar fields (video_path, t_seconds) so the UI can deep-link playback. Phase 4 fans out per video, embedding frames in batches with resumable per-video tracking (processed_videos[]).

The demo UI renders each unique video as a card with the top-scoring frame as the primary thumbnail plus clustered secondary frames under it; clicking mounts an inline <video> element seeked to the matched timestamp. Sidecar static-serves videos under /videos/…; set LAUNCHPAD_VIDEO_STATIC_ROOT to point at your source tree if it lives outside the run-directory parent.

Tune sampling density with --frame-interval-seconds, --max-frames-per-video, and --sampling-strategy {every_n_seconds, scene_change} (scene-change needs ffmpeg on PATH). Evaluation reports both recall@k (frame) and recall@k (video) so you can see whether the right moment (frame) or the right clip (video) is being retrieved. See issue #16.

Voyage multimodal ingest bills per-frame; for videos, default to CLIP-local unless you explicitly opt in. The sidecar upload cap remains 10 MB per request (inherited from image-search).
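
To get a feel for how the interval and the per-video cap interact, the every-N-seconds strategy can be sketched as below. This is an illustrative assumption, not the launchpad's sampler: it assumes sampling starts at t=0 and that the cap simply truncates the tail of a long video, and `sample_timestamps` is a hypothetical name.

```python
from typing import List


def sample_timestamps(
    duration_seconds: float,
    interval_seconds: float = 2.0,  # README default: one frame every 2 seconds
    max_frames: int = 600,          # README default cap per video
) -> List[float]:
    """Return the timestamps (in seconds) at which frames would be sampled."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    stamps: List[float] = []
    index = 0
    # Multiply instead of accumulating to avoid floating-point drift.
    while index * interval_seconds < duration_seconds and len(stamps) < max_frames:
        stamps.append(index * interval_seconds)
        index += 1
    return stamps
```

Under these assumptions, a one-hour clip at the default interval hits the 600-frame cap after about 20 minutes of footage, which is a reason to raise --frame-interval-seconds or --max-frames-per-video for long videos.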

Phase 2 — Configure: capture your intent

Three knobs decide everything downstream: use case, dataset size, deployment target.

uv run python skills/zilliz-launchpad/scripts/zilliz_ops.py configure \
    --use-case rag \
    --dataset-size 20 \
    --deployment local-standalone

| Flag | Common values |
| --- | --- |
| `--use-case` | `rag`, `semantic-search`, `hybrid-search`, `recommendation`, `image-search`, `video-search` |
| `--dataset-size` | row-count estimate, drives index choice |
| `--deployment` | `local-standalone`, `zilliz-serverless`, `zilliz-dedicated`, `zilliz-byoc` |

Output (configure.json): a normalized requirement profile used by the downstream phases.

Phase 3 — Plan: deterministic decisions, no LLM

Reads collect.json + configure.json and writes both a machine plan (plan.json) and a human-readable explanation (plan.md).

uv run python skills/zilliz-launchpad/scripts/zilliz_ops.py plan

Example plan.md:

# Launchpad Plan

- Collection: `launchpad_collection`
- Target URI: `http://localhost:19530`
- Deployment: `local-standalone`

## Schema
- Primary key: `id`
- Text field: `body`
- Vector field: `embedding` (dim 1536)
- Sparse field: `sparse`
- Extra fields: title, year, genre

## Embedding
- Provider: `openai`
- Model: `text-embedding-3-small`
- Dim: 1536

## Index
- Type: HNSW   Metric: COSINE
- Params: { "M": 16, "efConstruction": 200 }

## Rationale
- Dataset size 20 → HNSW with M=16, ef=200
- Use case 'rag' + hybrid preference 'auto' → sparse=True
- Embedding provider 'openai' model 'text-embedding-3-small' (dim 1536)

Read the rationale, tweak configure.json if you disagree, rerun plan. Nothing has touched Milvus yet.
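
"Deterministic" here means the plan is a pure function of its inputs. As a small illustration, the sparse-field decision could look like the sketch below. Only three behaviours come from this README (rag with 'auto' enables sparse, hybrid-search produces a sparse index, image-search forces hybrid off); the function name, the boolean override, and the fallback for every other use case are assumptions.

```python
from typing import Union


def resolve_sparse(use_case: str, hybrid: Union[bool, str] = "auto") -> bool:
    """Decide whether the plan should include a sparse field."""
    if use_case == "image-search":
        # README: image-search forces hybrid off.
        return False
    if isinstance(hybrid, bool):
        # Explicit override, in the spirit of `hybrid: false` in variants.yaml.
        return hybrid
    # hybrid == "auto": rag and hybrid-search get a sparse field per the README.
    # Returning False for all other use cases is an assumption of this sketch.
    return use_case in {"rag", "hybrid-search"}
```

Because the function has no hidden state, the same configure.json always yields the same answer, which is what makes rerunning plan safe.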

Phase 4 — Execute: create the collection, ingest, search

uv run python skills/zilliz-launchpad/scripts/zilliz_ops.py execute --sample movies
# → connecting to http://localhost:19530
# → creating collection 'launchpad_collection' (HNSW, COSINE, dim=1536)
# → embedding 20 rows with openai/text-embedding-3-small
# → ingested 20 / 20
# → smoke test: query "movie about parallel universes"
# → ✓ Top-1: m001 'The Quantum Gardener' score=0.87

# Ingest a whole folder of weekly exports — `execute` streams files in
# lexicographic order. If the process is killed mid-batch, re-running with
# the same --run-dir resumes from the next file via execute.json.processed_files[]:
uv run python skills/zilliz-launchpad/scripts/zilliz_ops.py execute \
    --run-dir runs/2026-05-20-collect --input ./weekly_exports/
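
The resume behaviour described in the comment above can be sketched as follows. This is a simplified model rather than the real execute code: `ingest_directory` and the `ingest_file` callback are hypothetical, and the only detail taken from the README is that files are processed in lexicographic order and tracked in a processed_files[] list inside execute.json.

```python
import json
from pathlib import Path
from typing import Callable, List


def ingest_directory(
    input_dir: Path,
    state_path: Path,
    ingest_file: Callable[[Path], None],
) -> List[str]:
    """Ingest files in lexicographic order, skipping any already recorded."""
    state = json.loads(state_path.read_text()) if state_path.exists() else {}
    processed = state.setdefault("processed_files", [])
    for path in sorted(p for p in input_dir.iterdir() if p.is_file()):
        if path.name in processed:
            continue  # finished in a previous run
        ingest_file(path)
        processed.append(path.name)
        # Persist after every file so a crash loses at most the file in flight.
        state_path.write_text(json.dumps(state, indent=2))
    return processed
```

Writing the state file after each file, rather than once at the end, is the design choice that makes a rerun cheap: the next invocation skips straight to the first unrecorded file.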

What it does:

- Connects to the URI from plan.json (local Milvus or Zilliz Cloud)
- Creates the collection + index per the plan (idempotent: skips if it matches; errors with schema_conflict if not)
- Embeds and upserts your data (client-side for ≤100k rows, zilliz import for larger corpora when the CLI is installed)
- Runs a smoke query and prints the top-1 result
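
The ingest routing rule in the list above reduces to a small decision function. The sketch below is illustrative only: the function names and the returned mode labels are hypothetical, while the 100k threshold and the "CLI must be installed" condition come from this README.

```python
import shutil

BULK_IMPORT_THRESHOLD = 100_000  # README default for bulk_import_threshold


def zilliz_cli_available() -> bool:
    """True when a `zilliz` executable is found on PATH."""
    return shutil.which("zilliz") is not None


def choose_ingest_mode(
    row_count: int,
    cli_available: bool,
    threshold: int = BULK_IMPORT_THRESHOLD,
) -> str:
    """Pick bulk import only for corpora above the threshold when the CLI exists."""
    if row_count > threshold and cli_available:
        return "bulk-import"   # hypothetical label for the zilliz import path
    return "client-side"       # hypothetical label for client-side upsert
```

A typical call would be `choose_ingest_mode(row_count, zilliz_cli_available())`; note that a corpus of exactly 100,000 rows stays on the client-side path because the rule is "above the threshold".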

Demo UI

cd skills/zilliz-launchpad/scripts/ui
pnpm install
pnpm dev
# → http://localhost:3000

A minimal Next.js app that calls a local /api/search route, which in turn hits your Milvus collection. Uses the latest run directory automatically. Hot-reload friendly — restyle it however you like.

Phase 5 — Evaluate: score retrieval, latency, RAG quality

Runs a query set against the live collection and writes eval_report.{json,md}. Three query-set tiers — pick whichever matches the labels you have:

# Derived smoke eval: no labels required, samples 25 docs from your corpus
uv run python skills/zilliz-launchpad/scripts/zilliz_ops.py evaluate

# Labelled eval: recall@10, MRR@10, NDCG@10
uv run python skills/zilliz-launchpad/scripts/zilliz_ops.py evaluate --qrels qrels.jsonl

# Opt-in RAG quality via ragas (needs a generator-model API key)
uv run python skills/zilliz-launchpad/scripts/zilliz_ops.py evaluate \
    --qrels qrels.jsonl --judge-llm openai:gpt-4o-mini
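
For readers who want to sanity-check the labelled-eval numbers by hand, the three metrics can be sketched with their common binary-relevance definitions. These are textbook formulas, assumed for illustration; the launchpad's exact metric contract lives in its evaluation guide and may differ in details such as how recall is normalized.

```python
import math
from typing import Sequence, Set


def recall_at_k(ranked: Sequence[str], relevant: Set[str], k: int = 10) -> float:
    """Fraction of the relevant IDs that appear in the top k results."""
    if not relevant:
        return 0.0
    return len(set(ranked[:k]) & relevant) / len(relevant)


def mrr_at_k(ranked: Sequence[str], relevant: Set[str], k: int = 10) -> float:
    """Reciprocal rank of the first relevant hit in the top k, else 0."""
    for rank, doc_id in enumerate(ranked[:k], start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0


def ndcg_at_k(ranked: Sequence[str], relevant: Set[str], k: int = 10) -> float:
    """Binary-gain NDCG: DCG of the ranking divided by the ideal DCG."""
    dcg = sum(
        1.0 / math.log2(rank + 1)
        for rank, doc_id in enumerate(ranked[:k], start=1)
        if doc_id in relevant
    )
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(rank + 1) for rank in range(1, ideal_hits + 1))
    return dcg / idcg if idcg else 0.0
```

Each function scores one query; a report-level number would be the mean over all rows in the qrels file.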

Example eval_report.md:

# Evaluation Report — 2026-04-23T10-15-02Z-execute

- Query count: 25
- Derived query set: **true**

## Decision table

| variant | recall@10 | p95 (ms) | faithfulness | cost/query |
| --- | --- | --- | --- | --- |
| base | 0.920 | 42.7 | — | — |

> Note: queries were derived from the corpus...
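
The p50/p95/p99 latency figures are percentiles over per-query timings. One simple way to compute them is the nearest-rank method sketched below; which percentile method the launchpad actually uses is not stated here, so treat this as an assumption for interpreting the numbers rather than a description of the tool.

```python
import math
from typing import Sequence


def percentile(samples_ms: Sequence[float], pct: int) -> float:
    """Nearest-rank percentile: the smallest sample with at least pct% at or below it."""
    if not samples_ms:
        raise ValueError("need at least one latency sample")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples_ms)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]
```

With only 25 derived queries, p95 and p99 both land on one of the slowest two samples, so tail latency from a smoke eval is best read as a rough signal.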

Comparison mode — re-run the same query set against alternative plan variants (swap embedding model, index params, hybrid on/off, reranker on/off) and get a decision table. Capped at 6 variants by default:

uv run python skills/zilliz-launchpad/scripts/zilliz_ops.py evaluate \
    --qrels qrels.jsonl --compare variants.yaml

# variants.yaml
variants:
  - name: small-m
    overrides: { index: { params: { M: 8 } } }
  - name: voyage
    overrides: { embedding: { model: voyage-3 } }
  - name: no-hybrid
    overrides: { hybrid: false }

See skills/zilliz-launchpad/references/knowledge/evaluation_guide.md for the full metric contract.
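
One plausible reading of the overrides blocks is a recursive merge onto the base plan, so that `M: 8` replaces only that one index parameter. The sketch below illustrates that reading; the plan key layout, the function names, and the merge semantics themselves are assumptions, and only the default cap of 6 variants comes from this README.

```python
import copy
from typing import Any, Dict, List

MAX_VARIANTS = 6  # README: comparison mode is capped at 6 variants by default


def apply_overrides(plan: Dict[str, Any], overrides: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of plan with overrides merged in, recursing into nested dicts."""
    merged = copy.deepcopy(plan)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = apply_overrides(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


def expand_variants(
    plan: Dict[str, Any],
    variants: List[Dict[str, Any]],
    max_variants: int = MAX_VARIANTS,
) -> Dict[str, Dict[str, Any]]:
    """Build one derived plan per named variant, leaving the base plan untouched."""
    if len(variants) > max_variants:
        raise ValueError(f"{len(variants)} variants exceeds the cap of {max_variants}")
    return {v["name"]: apply_overrides(plan, v.get("overrides", {})) for v in variants}
```

Under this reading, the small-m variant keeps efConstruction at its planned value and changes only M.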

Phase 6 — Deploy: promote to Zilliz Cloud

Recreates the plan's collection + index on a Zilliz Cloud cluster, ingests data (bulk import above the plan's threshold, client-side below), and writes deploy.json with observability pointers and a resumable state machine. Snapshots after every transition, so a rerun picks up where a failure left off.

# Target an existing cluster
uv run python skills/zilliz-launchpad/scripts/zilliz_ops.py deploy --cluster-id <id>

# Or provision a new one — --confirm is required because this bills real money
uv run python skills/zilliz-launchpad/scripts/zilliz_ops.py deploy --create --confirm

What it does:

- Preflight: checks that the zilliz CLI is installed and authenticated (zilliz auth whoami) and, when targeting an existing cluster, verifies it's RUNNING
- Provision (with --create): calls zilliz cluster create with the plan/region from configure.json, then polls zilliz cluster describe with exponential backoff until RUNNING
- Collection + index: reuses Phase 4's idempotent create_collection + create_index; surfaces schema_conflict if an incompatible collection already exists on the cluster
- Ingest: routes through zilliz import create for corpora above the plan's bulk_import_threshold (default 100k), and falls back to client-side upsert if bulk import fails
- Observability: records the Grafana dashboard URL (from cluster describe) in deploy.json.observability.grafana_dashboard and appends a post-ingest snapshot to observability.json
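
The provisioning step's poll-with-exponential-backoff pattern can be sketched generically. In this illustration the status source is an injected callable rather than a real zilliz cluster describe call, and the delay values, attempt limit, and function name are all assumptions; only the RUNNING target state comes from this README.

```python
import time
from typing import Callable


def wait_until_running(
    describe_status: Callable[[], str],
    initial_delay: float = 1.0,
    factor: float = 2.0,
    max_delay: float = 60.0,
    max_attempts: int = 10,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Poll until the status is RUNNING; return the number of polls it took."""
    delay = initial_delay
    for attempt in range(1, max_attempts + 1):
        if describe_status() == "RUNNING":
            return attempt
        if attempt < max_attempts:
            sleep(delay)
            delay = min(delay * factor, max_delay)  # grow the wait, but cap it
    raise TimeoutError(f"cluster not RUNNING after {max_attempts} polls")
```

Passing `sleep` as a parameter keeps the function testable without real waiting; a caller wrapping the CLI would supply a `describe_status` that shells out and extracts the status field.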

deploy.json carries the fixed schema the spec guarantees: cluster_id, cluster_uri, token_source, collection_name, ingest_mode, ingest_row_count, ingest_status, observability, timestamps. If Phase 6 stops mid-run (cluster provisioned, ingest failing), a rerun with --cluster-id <id> skips the already-ready steps and retries only what's left.
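
If you consume deploy.json from your own tooling, a quick completeness check against the key list above is easy to write. The sketch below assumes those names are top-level keys of the JSON document and makes no claim about their value types; `missing_deploy_keys` is a hypothetical helper.

```python
import json
from typing import List

# Key names as listed in this README; treating them as top-level keys is an assumption.
REQUIRED_KEYS = (
    "cluster_id",
    "cluster_uri",
    "token_source",
    "collection_name",
    "ingest_mode",
    "ingest_row_count",
    "ingest_status",
    "observability",
    "timestamps",
)


def missing_deploy_keys(deploy_json_text: str) -> List[str]:
    """Return the required keys that are absent from a deploy.json document."""
    record = json.loads(deploy_json_text)
    if not isinstance(record, dict):
        raise ValueError("deploy.json must contain a JSON object")
    return [key for key in REQUIRED_KEYS if key not in record]
```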

Example prompts

The CLI is the underlying interface, but day-to-day you'll talk to the skill in natural language inside your agent. The prompts below map cleanly to the six phases, and each one shows what the skill runs under the hood.

1. First time — just run the bundled sample

"Use zilliz-launchpad with the bundled movies sample so I can see the whole flow end-to-end."

Skill runs Phases 1–4 against sample_data/movies.jsonl (--use-case rag --dataset-size 20 --deployment local-standalone) and starts the demo UI. Best first invocation — proves your environment is wired before you point it at real data. Once you've clicked around the UI, ask "now run Phase 5 in derived mode" for a quick latency+recall smoke.

2. RAG over your own JSONL

"I have ~/data/support_tickets.jsonl (~80k rows, fields: ticket_id, subject, body, priority). Index it for RAG on local Milvus."

- collect --input ~/data/support_tickets.jsonl: infers fields, picks body as the text field
- configure --use-case rag --dataset-size 80000 --deployment local-standalone
- plan: sparse field on by default for RAG
- execute: embed + upsert + smoke query

3. Hybrid search over a product catalog

"I want hybrid search over products.jsonl — keyword should match SKUs and brand names exactly, semantic should match descriptions."

Skill writes --use-case hybrid-search into configure.json. Phase 3 produces dense (HNSW) + sparse (BM25) indexes; Phase 4 embeds the description field while keeping SKU/brand as scalar fields you can filter on in the UI. Useful when you've used sparse before but don't want to wire BM25 + dense by hand.

4. Skip local Milvus, go straight to Zilliz Cloud

"Skip local — set up a serverless cluster on Zilliz Cloud and ingest corpus.jsonl (~1.2M rows) into it."

Assumes zilliz auth login has already been run. The skill will:

- Run zilliz cluster list and let you pick a cluster (or auto-select your most recent)
- Run configure --deployment zilliz-serverless --dataset-size 1200000
- Pre-flight the cluster state via zilliz cluster describe
- Detect >100k rows and route Phase 4 through zilliz import create instead of client-side upsert

5. Tweak the plan before touching Milvus

"The current plan picked text-embedding-3-small. Switch to Voyage's voyage-3 and rerun the plan."

Skill edits the embedding section of configure.json in the active run dir, reruns plan, and diffs the new plan.md against the previous one. Phase 3 is deterministic and never touches Milvus, so you can iterate freely until you're happy, then run execute.

6. Append new data to an existing collection

"I already ran the launchpad on docs_v1.jsonl last week. Now I have docs_v2.jsonl — append it to the same collection."

Skill runs execute --append --run-dir <previous-run-dir> --input docs_v2.jsonl. The append path reuses plan.json (no re-plan), confirms the live collection schema matches, and upserts only the new rows. Results land in a fresh execute_append.json artifact so the original execute.json stays untouched (and a second append produces execute_append.2.json, etc.). If the new file's fields don't match the planned schema, it stops with schema_conflict and tells you how to resolve it.
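
The artifact naming scheme for repeated appends can be sketched as a small helper. This is an illustration of the pattern described above (execute_append.json first, then execute_append.2.json), with the continuation to .3, .4, and so on assumed from the "etc."; `next_append_artifact` is a hypothetical name.

```python
from pathlib import Path


def next_append_artifact(run_dir: Path) -> Path:
    """Return the first unused append-artifact path in run_dir."""
    first = run_dir / "execute_append.json"
    if not first.exists():
        return first
    n = 2  # the second append is numbered 2; the first carries no number
    while (run_dir / f"execute_append.{n}.json").exists():
        n += 1
    return run_dir / f"execute_append.{n}.json"
```

Never overwriting an earlier artifact means each append leaves its own audit record next to the untouched execute.json.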

7. Recover from `schema_conflict`

"Phase 4 just failed with schema_conflict. What do I do?"

Skill parses the JSON error envelope on stderr and offers two paths: drop the existing collection (destructive — confirms first), or change collection_name in plan.json and rerun. The general pattern: every CLI error code in docs/TROUBLESHOOTING.md maps to a remediation the skill knows how to drive — so when something goes red, just paste the error back into the conversation.

8. Compare embedding models before committing

"I have qrels.jsonl with 50 labelled queries. Before I commit to an embedding model, run Phase 5 with voyage-3 and text-embedding-3-small side-by-side."

Skill writes a variants.yaml with the two overrides, runs evaluate --qrels qrels.jsonl --compare variants.yaml, and shows the resulting decision table (recall@10, p95, faithfulness per variant). Use this whenever you're about to change a plan axis and want signal, not vibes.

9. Promote the local prototype to Cloud

"The local eval looks good. Promote this run to a new Zilliz Cloud serverless cluster."

Skill confirms the projected cost, runs deploy --create --confirm, streams zilliz cluster describe progress to stderr while the cluster comes up, and tails deploy.json as the state machine advances. If anything fails mid-deploy, a rerun with deploy --cluster-id <id> resumes from the last checkpoint — the skill reads deploy.json to know which steps are already done.

10. Index my photos for text-to-image search

"I have ~5,000 photos in ~/Pictures/library. Make them searchable by typing — I want to type 'sunset' and see sunsets."

Skill installs the [multimodal] extra if it's missing, then runs:

- collect --input ~/Pictures/library: walks the directory and, for each image, reads dimensions and EXIF and encodes a thumbnail capped at 256 px
- configure --use-case image-search --dataset-size 5000 --deployment local-standalone: forces hybrid off and picks clip-local (ViT-B/32, 512 dim) by default
- plan: the schema is dense-only with image_path as the primary key
- execute: downloads CLIP weights once (~150 MB), batches images through the encoder, and ingests; the demo UI auto-detects the modality and renders a thumbnail grid instead of a text list

For Phase 5, add --judge-llm openai:gpt-4o-mini to derive captioned eval queries. The derived queries are cached to derived_image_queries.jsonl, so reruns don't re-spend tokens.

Going to Zilliz Cloud

There are two patterns — pick based on whether you want to iterate locally first.

Pattern A — prototype locally, promote via Phase 6 (recommended)

Run Phases 1–5 against local Standalone, then deploy to Cloud once the eval looks good:

# Phases 1–4 against local Milvus (--deployment local-standalone)
# Phase 5 on local to sanity-check recall + latency
uv run python skills/zilliz-launchpad/scripts/zilliz_ops.py evaluate --qrels qrels.jsonl

# Phase 6 promotes to Cloud. --create provisions a new cluster; --cluster-id targets existing.
uv run python skills/zilliz-launchpad/scripts/zilliz_ops.py deploy --create --confirm

Phase 6 requires the zilliz CLI (≥ 0.3.0) on PATH with zilliz auth login done. Cluster plan (Serverless / Standard / Enterprise) and region are taken from configure.json.

Pattern B — target Cloud from the start

Skip local Standalone entirely by setting the Cloud target in Phase 2:

uv run python skills/zilliz-launchpad/scripts/zilliz_ops.py configure \
    --use-case rag --dataset-size 500000 --deployment zilliz-serverless

With the zilliz CLI on PATH, the launchpad will:

- Discover your clusters (zilliz cluster list) and write cluster_id into configure.json
- Pre-flight the cluster state (zilliz cluster describe) before Phase 4 ingests anything
- Route ingestion through zilliz import create for corpora above the plan's bulk_import_threshold (default 100k rows)

Without the CLI, export ZILLIZ_TOKEN directly and Phase 4 falls back to client-side upsert. Phase 6's --create path requires the CLI — there's no paste-the-token fallback for cluster provisioning.

Troubleshooting

See docs/TROUBLESHOOTING.md for common errors (missing credentials, schema conflicts, Cloud cluster states) and their remediations. Every CLI error is a single-line JSON envelope on stderr — the code field maps to a row in that doc.
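
If you script around the CLI, pulling the error code out of stderr takes only a few lines. The sketch below assumes stderr may also contain ordinary log lines and that the envelope is a JSON object on a single line with a `code` field, as described above; any other envelope fields are unknown here, so the helper reads only `code`. The name `extract_error_code` is hypothetical.

```python
import json
from typing import Optional


def extract_error_code(stderr_text: str) -> Optional[str]:
    """Return the `code` from the last JSON envelope line in stderr, if any."""
    for line in reversed(stderr_text.splitlines()):
        line = line.strip()
        if not line.startswith("{"):
            continue  # plain log output, not an envelope
        try:
            envelope = json.loads(line)
        except json.JSONDecodeError:
            continue  # looked like JSON but was not parseable
        if isinstance(envelope, dict) and "code" in envelope:
            return str(envelope["code"])
    return None
```

The returned string is what you would look up in docs/TROUBLESHOOTING.md, or paste back to the skill along with the rest of the error line.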
