From a sample document to a production-deployed search app on Milvus / Zilliz Cloud — in minutes. An AI-guided scaffold delivered as an agent skill.
Whether you're new to Milvus or just tired of boilerplate — pick a file, run six steps, ship.
| Artifact | Description |
|---|---|
| Milvus Collection | Schema auto-designed from your data |
| Vector + sparse fields | OpenAI / Voyage / Cohere / BYOM (text); CLIP / Voyage multimodal (image) |
| `plan.md` | Every decision recorded, reviewable and reproducible |
| Next.js Demo UI | Hybrid search list or thumbnail gallery (auto-detected) at localhost:3000 |
| `eval_report.md` | recall@10, p50/p95/p99 latency, optional RAG quality metrics |
| `deploy.json` | Zilliz Cloud deployment record, resumable on rerun |
- Agent: Claude Code, Copilot CLI, Gemini CLI, or any skills-compatible agent
- Python ≥ 3.11 + `uv`
- Node.js ≥ 18 + pnpm
- Docker (local Standalone) or a Zilliz Cloud account
- API key: OpenAI, Voyage, Cohere, or Zilliz BYOM
- Optional: `zilliz` CLI for Cloud auto-discovery and bulk import
- Optional: `ffmpeg` on `PATH` (needed for video-search frame sampling; scene-change sampling shells out to it directly)
The launchpad ships as an agent skill. Install it with the skills CLI, which discovers skills under skills/ and symlinks them into your agent's skill directory (Claude Code, Copilot CLI, Gemini CLI, Cursor, OpenCode, Codex):
npx skills add zilliztech/zilliz-launchpad

Once installed, open your agent in this repo and say something like "use zilliz-launchpad to index this file". The skill drives the six phases end-to-end (stopping at Phase 4 by default; Phases 5 and 6 kick in when you ask), installing Python deps, bringing up Milvus, and prompting for any missing API keys as it goes.
# Install to a specific agent
npx skills add zilliztech/zilliz-launchpad -a claude-code
# Install globally (available across all projects)
npx skills add zilliztech/zilliz-launchpad -g
# Install to all detected agents
npx skills add zilliztech/zilliz-launchpad --all
# List available skills before installing
npx skills add zilliztech/zilliz-launchpad --list

Other agent flags include `-a copilot-cli`, `-a gemini-cli`, `-a cursor`, `-a opencode`, and `-a codex`.
For non-skill-aware hosts (Cursor, Claude Desktop, generic MCP clients), the launchpad
also ships an MCP server that exposes the same six phases as MCP tools. Install the
mcp extra and launch the stdio server:
uv sync --extra mcp
uv run python -m launchpad_mcp.server

See mcp/README.md for the full tool catalog, the structured error
envelope, and a host-registration snippet.
The skill drives these on demand, but you can do them ahead of time to skip a few conversational round-trips:
# Install Python deps
uv sync
# Bring up local Milvus Standalone (skip if using Zilliz Cloud)
./skills/zilliz-launchpad/scripts/start_milvus.sh up
# Export at least one embedding key
export OPENAI_API_KEY=<your-key>

If you plan to drive the CLI directly without an agent, run all three; the Walkthrough below assumes they're done.
The launchpad is organized as six CLI subcommands. Each writes a single artifact to skills/zilliz-launchpad/scripts/runs/<utc-timestamp>/. You can rerun any phase; nothing is destructive until Phase 4 touches Milvus, and Phase 6 is gated behind an explicit --confirm before it creates any Cloud resources.
collect ──▶ configure ──▶ plan ──▶ execute ──▶ evaluate ──▶ deploy
                                      │            │           │
                                      │            │           └─ Zilliz Cloud + deploy.json
                                      │            └─ eval_report.{json,md}
                                      └─ Milvus collection + demo UI
We'll use the bundled movies sample (20 short fictional plot summaries) throughout.
Looks at your file, infers field types, picks a candidate primary key and text field, and writes collect.json.
uv run python skills/zilliz-launchpad/scripts/zilliz_ops.py collect --sample movies
# or: --input path/to/your.jsonl
# or: --input path/to/your.pdf # one record per page (requires `.[documents]` extra)
# or: --input path/to/notes.md # whole file; add --split-markdown-headings for `## ` sections
# or: --input ./docs/ # directory (recursive) — mixes .jsonl/.pdf/.md/.csv/.txt
# or: --input 'docs/*.pdf' # shell glob; quote it so the shell doesn't expand it first

Directory or glob inputs produce a `source_files[]` array in `collect.json` (one entry per file) and a union schema across files. A field name appearing in only some files is marked `nullable: true`; the same field name with different JSON types in different files raises `input_schema_conflict` and refuses to write `collect.json`.
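The union-schema rule for directory and glob inputs can be sketched in a few lines of Python. This is an illustrative sketch, not the launchpad's implementation: `merge_field_schemas` and `InputSchemaConflict` are hypothetical names, and each file's schema is assumed to be a plain `{field_name: json_type}` mapping.

```python
class InputSchemaConflict(Exception):
    """Hypothetical stand-in for the `input_schema_conflict` error code."""


def merge_field_schemas(per_file: dict[str, dict[str, str]]) -> list[dict]:
    """Union the fields seen across files.

    per_file maps a file path to {field_name: json_type} (assumed shape).
    """
    merged: dict[str, dict] = {}
    for path in sorted(per_file):
        for name, json_type in per_file[path].items():
            entry = merged.setdefault(name, {"name": name, "type": json_type, "files": []})
            if entry["type"] != json_type:
                # Same field name, different JSON type: refuse to merge.
                raise InputSchemaConflict(
                    f"field {name!r} is {entry['type']} in {entry['files'][0]} but {json_type} in {path}"
                )
            entry["files"].append(path)
    total = len(per_file)
    # A field missing from at least one file becomes nullable.
    return [
        {"name": e["name"], "type": e["type"], "nullable": len(e["files"]) < total}
        for e in merged.values()
    ]


schemas = {
    "docs/a.jsonl": {"id": "string", "body": "string", "year": "int"},
    "docs/b.jsonl": {"id": "string", "body": "string"},
}
fields = merge_field_schemas(schemas)
```

Here `year` appears only in `docs/a.jsonl`, so it comes back with `nullable` set to true, while `id` and `body` do not.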
Output (collect.json, abbreviated):
{
"data_shape": "jsonl",
"record_count_estimate": 20,
"fields": [
{ "name": "id", "type": "string", "avg_length": 4, "sample_value": "m001" },
{ "name": "title", "type": "string", "avg_length": 18, "sample_value": "The Quantum Gardener" },
{ "name": "body", "type": "string", "avg_length": 126, "sample_value": "An astrophysicist..." },
{ "name": "year", "type": "int", "sample_value": 2023 },
{ "name": "genre", "type": "string", "sample_value": "sci-fi" }
]
}

For Milvus veterans: this is where you'd normally hand-write a `CollectionSchema`. Skip it.
Image search? Point `--input` at a directory of images (.jpg/.png/.webp/.gif):

uv run python skills/zilliz-launchpad/scripts/zilliz_ops.py collect --input ./photos/
uv run python skills/zilliz-launchpad/scripts/zilliz_ops.py configure \
--use-case image-search --dataset-size 64000 --deployment local-standalone

Phase 1 walks the directory, reads EXIF, encodes a thumbnail per image. Phase 3 then picks `clip-local` (open-source ViT-B/32, runs on CPU/MPS/CUDA, no API key) by default; install the optional extra first: `uv pip install -e '.[multimodal]'`. The Next.js UI auto-switches to a thumbnail gallery. See issue #14 for the MVP scope.

Search by example (image → image). After Phase 4 Execute builds the collection you can query with another image instead of a text phrase. In the demo UI (`pnpm dev` from `scripts/ui/`) click Search by image… next to the text box, or drop an image anywhere on the page, to find visually similar images in your collection. Uploads are capped at 10 MB per request. For a CLI smoke:

uv run python skills/zilliz-launchpad/scripts/zilliz_ops.py evaluate \
--query-image ./query/my_dog.jpg

This prints the top-10 ranked primary keys with scores and is the fastest way to confirm image-to-image is wired. For a labelled eval, mix image-to-image rows into your qrels file (one row per line):

{"query_image_path": "query/sunset.jpg", "expected_image_ids": ["photos/sky1.jpg", "photos/beach2.jpg"]}

and pass the file to `evaluate --qrels path/to/qrels.jsonl` to get recall / MRR / NDCG against your ground truth. Queries against a Voyage-multimodal-backed collection (`embedding_preference: voyage-multimodal-3`) call the Voyage API per query and bill to `VOYAGE_API_KEY`; CLIP-local stays free. See issue #15.

Video search? Point `--input` at a directory of `.mp4 / .mov / .mkv / .webm` clips:

uv run python skills/zilliz-launchpad/scripts/zilliz_ops.py collect --input ./clips/
uv run python skills/zilliz-launchpad/scripts/zilliz_ops.py configure \
--use-case video-search --dataset-size 25 --deployment local-standalone

Phase 1 samples a frame every 2 seconds (default) using PyAV, capped at 600 frames per video, and writes JPEGs under `<run-dir>/frames/`. Phase 3 reuses the same CLIP path image-search uses (or `voyage-multimodal-3` on override) and adds two scalar fields (`video_path`, `t_seconds`) so the UI can deep-link playback. Phase 4 fans out per video, embedding frames in batches with resumable per-video tracking (`processed_videos[]`).

The demo UI renders each unique video as a card with the top-scoring frame as the primary thumbnail plus clustered secondary frames under it; clicking mounts an inline `<video>` element seeked to the matched timestamp. Sidecar static-serves videos under `/videos/…`; set `LAUNCHPAD_VIDEO_STATIC_ROOT` to point at your source tree if it lives outside the run-directory parent.

Tune sampling density with `--frame-interval-seconds`, `--max-frames-per-video`, and `--sampling-strategy {every_n_seconds, scene_change}` (scene-change needs `ffmpeg` on `PATH`). Evaluation reports both `recall@k (frame)` and `recall@k (video)` so you can see whether the right moment (frame) or the right clip (video) is being retrieved. See issue #16. Voyage multimodal ingest bills per-frame; for videos default to CLIP-local unless you explicitly opt in. The sidecar upload cap remains 10 MB per request (inherited from image-search).
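The default video sampling rule (one frame every 2 seconds, capped at 600 frames per video) can be sketched as below. This is a rough sketch under stated assumptions: `sample_timestamps` is a hypothetical helper, and the text does not say how the real sampler applies the cap, so this version simply truncates once the cap is reached.

```python
import math


def sample_timestamps(duration_s: float, interval_s: float = 2.0, max_frames: int = 600) -> list[float]:
    """Return the timestamps (in seconds) at which frames would be sampled.

    Assumption: frames are taken at t = 0, interval, 2*interval, ... strictly
    before the end of the video, and sampling stops at max_frames.
    """
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    count = math.ceil(max(duration_s, 0.0) / interval_s)
    count = min(count, max_frames)
    return [i * interval_s for i in range(count)]


short_clip = sample_timestamps(10.0)    # 10-second clip, default settings
long_clip = sample_timestamps(3600.0)   # 1-hour video hits the 600-frame cap
```

Under these assumptions a one-hour video yields 600 frames covering only its first 20 minutes, which is one reason to raise `--frame-interval-seconds` or `--max-frames-per-video` for long footage.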
Three knobs decide everything downstream: use case, dataset size, deployment target.
uv run python skills/zilliz-launchpad/scripts/zilliz_ops.py configure \
--use-case rag \
--dataset-size 20 \
--deployment local-standalone

| Flag | Common values |
|---|---|
| `--use-case` | `rag`, `semantic-search`, `hybrid-search`, `recommendation`, `image-search`, `video-search` |
| `--dataset-size` | row-count estimate, drives index choice |
| `--deployment` | `local-standalone`, `zilliz-serverless`, `zilliz-dedicated`, `zilliz-byoc` |
Output (configure.json): a normalized requirement profile used by the downstream phases.
Reads collect.json + configure.json and writes both a machine plan (plan.json) and a human-readable explanation (plan.md).
uv run python skills/zilliz-launchpad/scripts/zilliz_ops.py plan

Example plan.md:
# Launchpad Plan
- Collection: `launchpad_collection`
- Target URI: `http://localhost:19530`
- Deployment: `local-standalone`
## Schema
- Primary key: `id`
- Text field: `body`
- Vector field: `embedding` (dim 1536)
- Sparse field: `sparse`
- Extra fields: title, year, genre
## Embedding
- Provider: `openai`
- Model: `text-embedding-3-small`
- Dim: 1536
## Index
- Type: HNSW Metric: COSINE
- Params: { "M": 16, "efConstruction": 200 }
## Rationale
- Dataset size 20 → HNSW with M=16, ef=200
- Use case 'rag' + hybrid preference 'auto' → sparse=True
- Embedding provider 'openai' model 'text-embedding-3-small' (dim 1536)

Read the rationale, tweak `configure.json` if you disagree, rerun `plan`. Nothing has touched Milvus yet.
uv run python skills/zilliz-launchpad/scripts/zilliz_ops.py execute --sample movies
# → connecting to http://localhost:19530
# → creating collection 'launchpad_collection' (HNSW, COSINE, dim=1536)
# → embedding 20 rows with openai/text-embedding-3-small
# → ingested 20 / 20
# → smoke test: query "movie about parallel universes"
# → ✓ Top-1: m001 'The Quantum Gardener' score=0.87
# Ingest a whole folder of weekly exports — `execute` streams files in
# lexicographic order. If the process is killed mid-batch, re-running with
# the same --run-dir resumes from the next file via execute.json.processed_files[]:
uv run python skills/zilliz-launchpad/scripts/zilliz_ops.py execute \
--run-dir runs/2026-05-20-collect --input ./weekly_exports/

What it does:
- Connects to the URI from `plan.json` (local Milvus or Zilliz Cloud)
- Creates the collection + index per the plan (idempotent: skips if it matches; errors with `schema_conflict` if not)
- Embeds and upserts your data (client-side for ≤100k rows, `zilliz import` for larger corpora when the CLI is installed)
- Runs a smoke query and prints the top-1 result
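The resume behaviour for folder ingests (files streamed in lexicographic order, progress tracked in `execute.json.processed_files[]`) can be illustrated with a small sketch. It assumes `processed_files` stores bare file names, which the text does not specify; `pending_files` and `mark_processed` are hypothetical helpers, not launchpad functions.

```python
import json
import tempfile
from pathlib import Path


def pending_files(input_dir: Path, state_path: Path) -> list[Path]:
    """List files still to ingest, in lexicographic order."""
    done: set[str] = set()
    if state_path.exists():
        done = set(json.loads(state_path.read_text()).get("processed_files", []))
    return [p for p in sorted(input_dir.iterdir()) if p.is_file() and p.name not in done]


def mark_processed(state_path: Path, file_name: str) -> None:
    """Record a finished file so a rerun skips it."""
    state = json.loads(state_path.read_text()) if state_path.exists() else {}
    state.setdefault("processed_files", []).append(file_name)
    state_path.write_text(json.dumps(state, indent=2))


with tempfile.TemporaryDirectory() as tmp:
    exports = Path(tmp) / "weekly_exports"
    exports.mkdir()
    for name in ("week_03.jsonl", "week_01.jsonl", "week_02.jsonl"):
        (exports / name).write_text("{}\n")
    state_file = Path(tmp) / "execute.json"

    first = pending_files(exports, state_file)[0]
    mark_processed(state_file, first.name)  # one file ingested, then the process "dies"
    remaining = [p.name for p in pending_files(exports, state_file)]  # what a rerun would see
```

Writing the state file after each completed file is what makes the rerun safe: a killed process loses at most the file that was in flight.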
cd skills/zilliz-launchpad/scripts/ui
pnpm install
pnpm dev
# → http://localhost:3000

A minimal Next.js app that calls a local `/api/search` route, which in turn hits your Milvus collection. Uses the latest run directory automatically. Hot-reload friendly; restyle it however you like.
Runs a query set against the live collection and writes eval_report.{json,md}. Three query-set tiers — pick whichever matches the labels you have:
# Derived smoke eval: no labels required, samples 25 docs from your corpus
uv run python skills/zilliz-launchpad/scripts/zilliz_ops.py evaluate
# Labelled eval: recall@10, MRR@10, NDCG@10
uv run python skills/zilliz-launchpad/scripts/zilliz_ops.py evaluate --qrels qrels.jsonl
# Opt-in RAG quality via ragas (needs a generator-model API key)
uv run python skills/zilliz-launchpad/scripts/zilliz_ops.py evaluate \
--qrels qrels.jsonl --judge-llm openai:gpt-4o-mini

Example eval_report.md:
# Evaluation Report — 2026-04-23T10-15-02Z-execute
- Query count: 25
- Derived query set: **true**
## Decision table
| variant | recall@10 | p95 (ms) | faithfulness | cost/query |
| --- | --- | --- | --- | --- |
| base | 0.920 | 42.7 | — | — |
> Note: queries were derived from the corpus...

Comparison mode: re-run the same query set against alternative plan variants (swap embedding model, index params, hybrid on/off, reranker on/off) and get a decision table. Capped at 6 variants by default:
uv run python skills/zilliz-launchpad/scripts/zilliz_ops.py evaluate \
--qrels qrels.jsonl --compare variants.yaml

# variants.yaml
variants:
  - name: small-m
    overrides: { index: { params: { M: 8 } } }
  - name: voyage
    overrides: { embedding: { model: voyage-3 } }
  - name: no-hybrid
    overrides: { hybrid: false }

See skills/zilliz-launchpad/references/knowledge/evaluation_guide.md for the full metric contract.
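One plausible way to read the `overrides` blocks in `variants.yaml` is as a recursive merge onto the base plan, so that an override touches only the keys it names. The sketch below assumes that reading; `deep_merge` and `expand_variants` are hypothetical helpers, the base plan keys are illustrative rather than the real `plan.json` layout, and raising an error above 6 variants is an assumption about how the cap is enforced.

```python
import copy

MAX_VARIANTS = 6  # default cap on comparison variants


def deep_merge(base: dict, overrides: dict) -> dict:
    """Return a copy of base with overrides applied recursively."""
    result = copy.deepcopy(base)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            result[key] = deep_merge(result[key], value)
        else:
            result[key] = copy.deepcopy(value)
    return result


def expand_variants(base_plan: dict, variants: list[dict], limit: int = MAX_VARIANTS) -> dict[str, dict]:
    """Build one full plan per named variant."""
    if len(variants) > limit:
        raise ValueError(f"{len(variants)} variants exceeds the cap of {limit}")
    return {v["name"]: deep_merge(base_plan, v.get("overrides", {})) for v in variants}


# Illustrative base plan; the key names are assumed, not taken from plan.json.
base_plan = {
    "index": {"type": "HNSW", "params": {"M": 16, "efConstruction": 200}},
    "embedding": {"provider": "openai", "model": "text-embedding-3-small"},
    "hybrid": True,
}
variants = [
    {"name": "small-m", "overrides": {"index": {"params": {"M": 8}}}},
    {"name": "no-hybrid", "overrides": {"hybrid": False}},
]
plans = expand_variants(base_plan, variants)
```

With a recursive merge, the `small-m` variant changes `M` to 8 while keeping `efConstruction` at 200, and the base plan itself is left untouched for the next variant.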
Recreates the plan's collection + index on a Zilliz Cloud cluster, ingests data (bulk import above the plan's threshold, client-side below), and writes deploy.json with observability pointers and a resumable state machine. Snapshots after every transition, so a rerun picks up where a failure left off.
# Target an existing cluster
uv run python skills/zilliz-launchpad/scripts/zilliz_ops.py deploy --cluster-id <id>
# Or provision a new one — --confirm is required because this bills real money
uv run python skills/zilliz-launchpad/scripts/zilliz_ops.py deploy --create --confirm

What it does:
- Preflight: checks the `zilliz` CLI is installed + authenticated (`zilliz auth whoami`), and when targeting an existing cluster, verifies it's `RUNNING`
- Provision (with `--create`): calls `zilliz cluster create` with plan/region from `configure.json`, polls `zilliz cluster describe` with exponential backoff until `RUNNING`
- Collection + index: reuses Phase 4's idempotent `create_collection` + `create_index`; surfaces `schema_conflict` if an incompatible collection already exists on the cluster
- Ingest: routes through `zilliz import create` for corpora above the plan's `bulk_import_threshold` (default 100k), falls back to client-side upsert if bulk fails
- Observability: records the Grafana dashboard URL (from `cluster describe`) in `deploy.json.observability.grafana_dashboard` and appends a post-ingest snapshot to `observability.json`
deploy.json carries the fixed schema the spec guarantees: cluster_id, cluster_uri, token_source, collection_name, ingest_mode, ingest_row_count, ingest_status, observability, timestamps. If Phase 6 stops mid-run (cluster provisioned, ingest failing), a rerun with --cluster-id <id> skips the already-ready steps and retries only what's left.
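The "snapshot after every transition, resume on rerun" pattern can be sketched generically. This is not the launchpad's state machine: `run_steps`, the step names, and the `completed_steps` key are all invented for illustration, and `completed_steps` is not part of the documented `deploy.json` schema.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable

# Hypothetical step names, loosely following the phases listed above.
STEPS = ["preflight", "provision", "collection_index", "ingest", "observability"]


def run_steps(state_path: Path, actions: dict[str, Callable[[], None]]) -> list[str]:
    """Run pending steps in order, snapshotting after each; return the steps executed."""
    state = json.loads(state_path.read_text()) if state_path.exists() else {"completed_steps": []}
    executed = []
    for step in STEPS:
        if step in state["completed_steps"]:
            continue  # finished in an earlier run
        actions[step]()  # may raise; earlier snapshots are already on disk
        state["completed_steps"].append(step)
        state_path.write_text(json.dumps(state, indent=2))
        executed.append(step)
    return executed


def failing_ingest() -> None:
    raise RuntimeError("simulated ingest failure")


with tempfile.TemporaryDirectory() as tmp:
    state_file = Path(tmp) / "deploy_state.json"
    ok = {step: (lambda: None) for step in STEPS}
    try:
        run_steps(state_file, {**ok, "ingest": failing_ingest})  # first run dies at ingest
    except RuntimeError:
        pass
    after_failure = json.loads(state_file.read_text())["completed_steps"]
    resumed = run_steps(state_file, ok)  # rerun retries only what is left
```

Because the snapshot is written after each successful step, the rerun skips the three completed steps and executes only the ingest and observability steps.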
The CLI is the seam, but day-to-day you'll talk to the skill in natural language inside your agent. Here are concrete prompts that map cleanly to the six phases — each one shows what the skill runs under the hood.
"Use zilliz-launchpad with the bundled
`movies` sample so I can see the whole flow end-to-end."
Skill runs Phases 1–4 against sample_data/movies.jsonl (--use-case rag --dataset-size 20 --deployment local-standalone) and starts the demo UI. Best first invocation — proves your environment is wired before you point it at real data. Once you've clicked around the UI, ask "now run Phase 5 in derived mode" for a quick latency+recall smoke.
"I have
`~/data/support_tickets.jsonl` (~80k rows, fields: `ticket_id`, `subject`, `body`, `priority`). Index it for RAG on local Milvus."
- `collect --input ~/data/support_tickets.jsonl`: infers fields, picks `body` as the text field
- `configure --use-case rag --dataset-size 80000 --deployment local-standalone`
- `plan`: sparse field on by default for RAG
- `execute`: embed + upsert + smoke query
"I want hybrid search over
`products.jsonl`: keyword should match SKUs and brand names exactly, semantic should match descriptions."
Skill writes --use-case hybrid-search into configure.json. Phase 3 produces dense (HNSW) + sparse (BM25) indexes; Phase 4 embeds the description field while keeping SKU/brand as scalar fields you can filter on in the UI. Useful when you've used sparse before but don't want to wire BM25 + dense by hand.
"Skip local — set up a serverless cluster on Zilliz Cloud and ingest
`corpus.jsonl` (~1.2M rows) into it."
Assumes `zilliz auth login` is already done. Skill will:
- Run `zilliz cluster list`, let you pick (or auto-select your most recent)
- `configure --deployment zilliz-serverless --dataset-size 1200000`
- Pre-flight cluster state via `zilliz cluster describe`
- Detect >100k rows and route Phase 4 through `zilliz import create` instead of client-side upsert
"The current plan picked
`text-embedding-3-small`. Switch to Voyage's `voyage-3` and rerun the plan."
Skill edits the embedding section of configure.json in the active run dir, reruns plan, and diffs the new plan.md against the previous one. Phase 3 is deterministic and never touches Milvus, so you can iterate freely until you're happy, then run execute.
"I already ran the launchpad on
`docs_v1.jsonl` last week. Now I have `docs_v2.jsonl`; append it to the same collection."
Skill runs execute --append --run-dir <previous-run-dir> --input docs_v2.jsonl. The append path reuses plan.json (no re-plan), confirms the live collection schema matches, and upserts only the new rows. Results land in a fresh execute_append.json artifact so the original execute.json stays untouched (and a second append produces execute_append.2.json, etc.). If the new file's fields don't match the planned schema, it stops with schema_conflict and tells you how to resolve it.
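The append artifact naming (`execute_append.json` first, then `execute_append.2.json`, and so on) can be sketched as a small helper. `next_append_artifact` is a hypothetical name, and the sketch assumes the numbering simply continues with `.3`, `.4`, and onward.

```python
import tempfile
from pathlib import Path


def next_append_artifact(run_dir: Path) -> Path:
    """Pick the first unused append artifact name in run_dir."""
    first = run_dir / "execute_append.json"
    if not first.exists():
        return first
    n = 2
    while (run_dir / f"execute_append.{n}.json").exists():
        n += 1
    return run_dir / f"execute_append.{n}.json"


with tempfile.TemporaryDirectory() as tmp:
    run_dir = Path(tmp)
    (run_dir / "execute.json").write_text("{}")  # original artifact, never rewritten
    names = []
    for _ in range(3):  # three successive appends
        artifact = next_append_artifact(run_dir)
        artifact.write_text("{}")
        names.append(artifact.name)
    original_untouched = (run_dir / "execute.json").read_text() == "{}"
```

Choosing a fresh file per append keeps every earlier run record intact, which is useful when auditing what was ingested and when.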
"Phase 4 just failed with
`schema_conflict`. What do I do?"
Skill parses the JSON error envelope on stderr and offers two paths: drop the existing collection (destructive — confirms first), or change collection_name in plan.json and rerun. The general pattern: every CLI error code in docs/TROUBLESHOOTING.md maps to a remediation the skill knows how to drive — so when something goes red, just paste the error back into the conversation.
"I have
`qrels.jsonl` with 50 labelled queries. Before I commit to an embedding model, run Phase 5 with `voyage-3` and `text-embedding-3-small` side-by-side."
Skill writes a variants.yaml with the two overrides, runs evaluate --qrels qrels.jsonl --compare variants.yaml, and shows the resulting decision table (recall@10, p95, faithfulness per variant). Use this whenever you're about to change a plan axis and want signal, not vibes.
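For reference, the labelled-eval metrics named in this document (recall@10, MRR@10, NDCG@10) can be sketched with their standard binary-relevance definitions. This is a generic illustration, not the launchpad's scoring code; the authoritative definitions live in the evaluation guide, and the function names and sample IDs here are illustrative.

```python
import math


def recall_at_k(ranked: list[str], relevant: set[str], k: int = 10) -> float:
    """Fraction of relevant IDs that appear in the top k results."""
    if not relevant:
        return 0.0
    return len(set(ranked[:k]) & relevant) / len(relevant)


def mrr_at_k(ranked: list[str], relevant: set[str], k: int = 10) -> float:
    """Reciprocal rank of the first relevant result in the top k (0 if none)."""
    for rank, doc_id in enumerate(ranked[:k], start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0


def ndcg_at_k(ranked: list[str], relevant: set[str], k: int = 10) -> float:
    """Binary-gain NDCG: discounted gain divided by the best achievable gain."""
    dcg = sum(
        1.0 / math.log2(rank + 1)
        for rank, doc_id in enumerate(ranked[:k], start=1)
        if doc_id in relevant
    )
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(rank + 1) for rank in range(1, ideal_hits + 1))
    return dcg / idcg if idcg else 0.0


ranked_ids = ["m003", "m001", "m007"]   # retrieval output, best first
expected_ids = {"m001", "m002"}         # ground truth for this query
recall = recall_at_k(ranked_ids, expected_ids)
mrr = mrr_at_k(ranked_ids, expected_ids)
ndcg = ndcg_at_k(ranked_ids, expected_ids)
```

In this example one of the two expected IDs is retrieved, at rank 2, so recall and MRR are both 0.5 and NDCG falls between 0 and 1. Per-query values like these are typically averaged over the query set to fill a decision table.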
"The local eval looks good. Promote this run to a new Zilliz Cloud serverless cluster."
Skill confirms the projected cost, runs deploy --create --confirm, streams zilliz cluster describe progress to stderr while the cluster comes up, and tails deploy.json as the state machine advances. If anything fails mid-deploy, a rerun with deploy --cluster-id <id> resumes from the last checkpoint — the skill reads deploy.json to know which steps are already done.
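Polling a cluster with exponential backoff until it reports `RUNNING` can be sketched as follows. The status source is injected as a callable because the output format of `zilliz cluster describe` is not shown here; the delay parameters, the attempt limit, and the `CREATING` status string are all assumptions made for the example.

```python
import time
from typing import Callable


def wait_until_running(
    get_status: Callable[[], str],
    initial_delay: float = 1.0,
    factor: float = 2.0,
    max_delay: float = 30.0,
    max_attempts: int = 10,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Poll get_status until it returns "RUNNING"; return the number of polls made."""
    delay = initial_delay
    for attempt in range(1, max_attempts + 1):
        if get_status() == "RUNNING":
            return attempt
        if attempt < max_attempts:
            sleep(delay)
            delay = min(delay * factor, max_delay)  # grow the wait, up to a ceiling
    raise TimeoutError(f"cluster not RUNNING after {max_attempts} polls")


# Simulated status feed; a real caller would query the cluster instead.
statuses = iter(["CREATING", "CREATING", "CREATING", "RUNNING"])
delays: list[float] = []
polls = wait_until_running(lambda: next(statuses), sleep=delays.append)
```

Injecting `sleep` keeps the example instant to run and makes the backoff schedule easy to inspect: the recorded delays double from 1 second upward until the status flips.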
"I have ~5,000 photos in
~/Pictures/library. Make them searchable by typing — I want to type 'sunset' and see sunsets."
Skill installs the [multimodal] extra if it's missing, then runs:
- `collect --input ~/Pictures/library`: walks the directory, reads dimensions / EXIF / capped 256 px thumbnails per image
- `configure --use-case image-search --dataset-size 5000 --deployment local-standalone`: forces hybrid off, picks `clip-local` (ViT-B/32, 512 dim) by default
- `plan`: schema is dense-only with `image_path` as the primary key
- `execute`: downloads CLIP weights once (~150 MB), batches images through the encoder, ingests; the demo UI auto-detects the modality and renders a thumbnail grid instead of a text list
For Phase 5, add `--judge-llm openai:gpt-4o-mini` to derive captioned eval queries; they are cached to `derived_image_queries.jsonl` so reruns don't re-spend tokens.
There are two patterns — pick based on whether you want to iterate locally first.
Run Phases 1–5 against local Standalone, then deploy to Cloud once the eval looks good:
# Phases 1–4 against local Milvus (--deployment local-standalone)
# Phase 5 on local to sanity-check recall + latency
uv run python skills/zilliz-launchpad/scripts/zilliz_ops.py evaluate --qrels qrels.jsonl
# Phase 6 promotes to Cloud. --create provisions a new cluster; --cluster-id targets existing.
uv run python skills/zilliz-launchpad/scripts/zilliz_ops.py deploy --create --confirm

Phase 6 requires the `zilliz` CLI (≥ 0.3.0) on `PATH` with `zilliz auth login` done. Cluster plan (Serverless / Standard / Enterprise) and region are taken from `configure.json`.
Skip local Standalone entirely by setting the Cloud target in Phase 2:
uv run python skills/zilliz-launchpad/scripts/zilliz_ops.py configure \
--use-case rag --dataset-size 500000 --deployment zilliz-serverless

With the `zilliz` CLI on `PATH`, the launchpad will:
- Discover your clusters (`zilliz cluster list`) and write `cluster_id` into `configure.json`
- Pre-flight the cluster state (`zilliz cluster describe`) before Phase 4 ingests anything
- Route ingestion through `zilliz import create` for corpora above the plan's `bulk_import_threshold` (default 100k rows)
Without the CLI, export ZILLIZ_TOKEN directly and Phase 4 falls back to client-side upsert. Phase 6's --create path requires the CLI — there's no paste-the-token fallback for cluster provisioning.
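The ingest routing rule (bulk import above the threshold when the CLI is present, client-side upsert otherwise) reduces to a small decision function. The sketch below is illustrative: `choose_ingest_mode` and the mode strings `bulk_import` and `client_upsert` are hypothetical, and only the 100k default and the CLI requirement come from the text.

```python
import shutil

DEFAULT_BULK_IMPORT_THRESHOLD = 100_000  # default quoted for bulk_import_threshold


def choose_ingest_mode(
    row_count: int,
    cli_available: bool,
    threshold: int = DEFAULT_BULK_IMPORT_THRESHOLD,
) -> str:
    """Pick bulk import only for corpora above the threshold when the CLI exists."""
    if cli_available and row_count > threshold:
        return "bulk_import"
    return "client_upsert"


# Detect the CLI the same way a shell would: by looking it up on PATH.
cli_on_path = shutil.which("zilliz") is not None
mode = choose_ingest_mode(1_200_000, cli_on_path)
```

A corpus of exactly 100k rows stays on the client-side path here, matching the "≤100k rows" wording used for Phase 4.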
See docs/TROUBLESHOOTING.md for common errors (missing credentials, schema conflicts, Cloud cluster states) and their remediations. Every CLI error is a single-line JSON envelope on stderr — the code field maps to a row in that doc.
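If you drive the CLI from your own scripts, the single-line JSON envelope can be picked out of stderr with a few lines of Python. This is a minimal sketch: only the `code` field is documented above, so the `message` field in the sample is made up for illustration, and `parse_error_envelope` is a hypothetical helper.

```python
import json
from typing import Optional


def parse_error_envelope(stderr_text: str) -> Optional[dict]:
    """Return the last stderr line that parses as a JSON object with a `code` field."""
    for line in reversed(stderr_text.splitlines()):
        line = line.strip()
        if not line.startswith("{"):
            continue  # ordinary log output
        try:
            envelope = json.loads(line)
        except json.JSONDecodeError:
            continue
        if isinstance(envelope, dict) and "code" in envelope:
            return envelope
    return None


# Sample stderr; the "message" field is an assumption, not a documented field.
sample_stderr = (
    "connecting to http://localhost:19530\n"
    '{"code": "schema_conflict", "message": "illustrative only"}\n'
)
envelope = parse_error_envelope(sample_stderr)
error_code = envelope["code"] if envelope else None
```

The resulting `error_code` is the value to look up in docs/TROUBLESHOOTING.md.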
Apache-2.0. Copyright 2026 Zilliz.