Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
106 changes: 106 additions & 0 deletions .cursor/skills/fill-model-descriptions/SKILL.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,106 @@
---
name: fill-model-descriptions
description: Fill empty `description: ""` fields in `packages/llm-info/data/models.yml` by querying OpenRouter and provider documentation. Use when the user asks to populate model descriptions, enrich the model catalog, or backfill descriptions after running `pnpm sync-models`.
disable-model-invocation: true
---

# Fill Model Descriptions

## Goal

Replace empty `description: ""` strings in `packages/llm-info/data/models.yml` with concise, accurate one-sentence summaries sourced from OpenRouter first, then provider documentation as a fallback.

## Scope and Contract

- **Only modify entries where `description` is the empty string.** Existing non-empty descriptions are human-curated; never overwrite them.
- The file is a top-level YAML map keyed by provider id (`anthropic:`, `openai:`, …). Each provider's value is an array of model entries.
- Do not modify any other field, reorder entries, or change formatting (flow-style arrays like `roles: [chat, edit]`, blank lines between entries, blank line between provider sections).
- This skill requires network access.

## Workflow

1. **Read** `packages/llm-info/data/models.yml`. Parse it with the `yaml` library's `parseDocument` (NOT `parse`) so formatting and comments are preserved on write-back.

2. **Collect** every entry where `description` is the empty string. Track them as `{ provider, model }` pairs.

3. **Source descriptions** in this priority order:

**a. OpenRouter** β€” `GET https://openrouter.ai/api/v1/models` (no auth required). Build a lookup keyed by lowercased `id`. For each `{ provider, model }`, try candidates in this order:
- `${vendor}/${model}` where `vendor` comes from the Vendor Map below.
- `${vendor}/${model.replaceAll('-', '.')}` (some OR ids use `.` separators, e.g. `claude-3.5-sonnet`).
- For Bedrock entries with region/alias prefixes (`us.`, `eu.`, `jp.`, `global.`, `anthropic.`, `minimax.`, `mistral.`, `zai.`, `nvidia.`), strip the prefix and retry with the appropriate vendor.
- For Google entries with `@default` suffix (e.g. `claude-opus-4-7@default`), strip the suffix and try `anthropic/${rest}`.

**b. Provider docs** β€” if no OpenRouter hit, search the canonical provider's docs page (anthropic.com/news, platform.openai.com/docs/models, ai.google.dev/gemini-api/docs/models, mistral.ai/news, x.ai/blog, etc.) for the model's one-line summary. Use the `WebSearch` or `WebFetch` tool.

**c. Skip** β€” if neither source has a confident answer, leave `description: ""` and record the `{ provider, model }` for the final summary. Empty is better than wrong.

4. **Truncate** each sourced description per the rules in the "Truncation" section.

5. **Write** the YAML back using targeted `Pair` mutations on the parsed `Document`, not text replacement. Reference implementation: `packages/llm-info/src/sync-models.ts` shows how to load a Document, mutate sequence/map nodes, and stringify with `{ lineWidth: 0, flowCollectionPadding: false }` to preserve formatting.

6. **Validate** by running:
```bash
pnpm --filter @marimo-team/llm-info test
```
The `schema.test.ts` test will catch any structural drift.

7. **Report** to the user:
- How many entries got descriptions.
- How many were skipped (with the provider/model list).
- Suggest opening the diff (`git diff packages/llm-info/data/models.yml`) for review.

## Vendor Map

OpenRouter id vendor prefixes for each marimo provider:

| Marimo provider | OpenRouter vendor prefix | Notes |
|---|---|---|
| `anthropic` | `anthropic` | Direct match in most cases |
| `openai` | `openai` | Direct match |
| `google` | `google` | Strip `@default` suffix if present |
| `mistral` | `mistralai` | Note the `ai` suffix |
| `xai` | `x-ai` | Hyphenated |
| `azure` | `openai` | Azure mostly re-exposes OpenAI models |
| `bedrock` | varies | Strip region prefix (`us.`/`eu.`/`jp.`/`global.`) and use the embedded vendor (`anthropic.foo` β†’ `anthropic/foo`) |
| `github` | varies | Ids like `openai/gpt-4.1` carry the vendor; strip the github layer |
| `openrouter` | embedded in id | Ids like `anthropic/claude-opus-4.7-fast` already carry the vendor |
| `wandb` | embedded in id | Ids like `deepseek-ai/DeepSeek-V3.1`; the prefix is the vendor |
| `opencode-go` | rarely on OR | Source from provider docs |
| `ollama` | not on OR | Source from the Ollama model card |

## Truncation

- **One sentence.** Split on `. ` and take the first. Strip trailing whitespace and any markdown formatting (`**`, `_`, backticks).
- **Hard cap at 200 characters.** If the first sentence is longer, cut at the last word boundary before 200 and append `…`.
- **Strip leading articles** for consistency: prefer "Frontier reasoning model optimized for…" over "This is a frontier reasoning model optimized for…".
- **Prefer capability statements over marketing.** If OpenRouter gives "the best model from X" but the provider docs give "optimized for tool use and long-context reasoning", prefer the latter.

## Examples

**Good** (matches the tone of existing hand-curated entries):
- `Latest Opus model, strongest for coding and long-running professional tasks`
- `Most capable Sonnet-class model, with frontier performance across coding, agents, and professional work`
- `Lightweight Gemma model trained by Google, designed to run on a single GPU`
- `Multimodal Mixture-of-Experts model optimized for complex tool use and reasoning`

**Bad** (do not produce these):
- `Claude 3.5 Sonnet delivers better-than-Opus capabilities, faster-than-Sonnet speeds, at the same Sonnet prices. Sonnet is particularly good at coding, vision…` β€” too long, contains marketing
- `The best model from Anthropic` β€” vague, no information content
- `An AI model` β€” useless
- `**Claude 3.5 Sonnet** is a _frontier_ model…` β€” markdown not stripped

## Anti-Patterns

- **Do not run `pnpm sync-models`** as part of this task. Sync adds new entries; this skill only fills descriptions on entries that already exist.
- **Do not invent descriptions** if neither OpenRouter nor provider docs have a credible source.
- **Do not reformat the YAML.** Use library mutations on the parsed `Document`, not string splicing.
- **Do not commit** unless the user explicitly asks. Show the diff first.

## Quick Reference

- Data file: `packages/llm-info/data/models.yml`
- Schema: `packages/llm-info/src/index.ts` (`AiModel` interface)
- YAML write reference: `packages/llm-info/src/sync-models.ts` (`buildEntryNode`, `addProviderSection`)
- OpenRouter API: `https://openrouter.ai/api/v1/models` (public, no auth)
- Validation: `pnpm --filter @marimo-team/llm-info test`
81 changes: 43 additions & 38 deletions frontend/src/components/ai/__tests__/ai-utils.test.ts
Original file line number Diff line number Diff line change
Expand Up @@ -4,46 +4,51 @@ import type { AiModel } from "@marimo-team/llm-info";
import { beforeEach, describe, expect, it, vi } from "vitest";
import type { UserConfig } from "@/core/config/config-schema";

// Mock the models.json import
vi.mock("@marimo-team/llm-info/models.json", () => {
const models: AiModel[] = [
{
name: "GPT-4",
model: "gpt-4",
description: "OpenAI GPT-4 model",
providers: ["openai"],
roles: ["chat", "edit"],
thinking: false,
},
{
name: "Claude 3",
model: "claude-3-sonnet",
description: "Anthropic Claude 3 Sonnet",
providers: ["anthropic"],
roles: ["chat", "edit"],
thinking: false,
},
{
name: "Gemini Pro",
model: "gemini-pro",
description: "Google Gemini Pro model",
providers: ["google"],
roles: ["chat", "edit"],
thinking: false,
},
{
name: "Ollama Model",
model: "llama2",
description: "Ollama Llama 2 model",
providers: ["ollama"],
roles: ["chat", "edit"],
thinking: false,
},
];

return {
models: models,
const make = (
overrides: Partial<AiModel> & Pick<AiModel, "name" | "model">,
): AiModel => ({
description: "",
roles: ["chat", "edit"],
capabilities: [],
input_types: [],
output_types: [],
release_date: "1970-01-01",
...overrides,
});

const models: Record<string, AiModel[]> = {
openai: [
make({
name: "GPT-4",
model: "gpt-4",
description: "OpenAI GPT-4 model",
}),
],
anthropic: [
make({
name: "Claude 3",
model: "claude-3-sonnet",
description: "Anthropic Claude 3 Sonnet",
}),
],
google: [
make({
name: "Gemini Pro",
model: "gemini-pro",
description: "Google Gemini Pro model",
}),
],
ollama: [
make({
name: "Ollama Model",
model: "llama2",
description: "Ollama Llama 2 model",
}),
],
};

return { models };
});

// Must import after mock
Expand Down
4 changes: 2 additions & 2 deletions frontend/src/components/ai/ai-model-dropdown.tsx
Original file line number Diff line number Diff line change
Expand Up @@ -305,7 +305,7 @@ const AiModelDropdownItem = ({
<div className="flex flex-row w-full items-center">
<span>{model.name}</span>
<div className="ml-auto">
{model.thinking && (
{model.capabilities.includes("thinking") && (
<Tooltip content="Reasoning model">
<BrainIcon
className={`h-5 w-5 rounded-md p-1 ${getTagColour("thinking")}`}
Expand Down Expand Up @@ -362,7 +362,7 @@ export const AiModelInfoDisplay = ({
</div>
)}

{model.thinking && (
{model.capabilities.includes("thinking") && (
<div className="flex items-center gap-2">
<div className="w-2 h-2 bg-purple-500 rounded-full animate-pulse" />
<span className="text-xs text-muted-foreground">
Expand Down
2 changes: 1 addition & 1 deletion frontend/src/components/app-config/ai-config.tsx
Original file line number Diff line number Diff line change
Expand Up @@ -509,7 +509,7 @@ const ModelInfoCard = ({ model }: { model: AiModel }) => {
<Tooltip content="Custom model">
{model.custom && <BotIcon className="h-4 w-4" />}
</Tooltip>
{model.thinking && (
{model.capabilities.includes("thinking") && (
<div
className={cn(
"flex items-center gap-1 rounded px-1 py-0.5 w-fit",
Expand Down
Loading
Loading