HyperTools 2.0: modernized toolbox, interactive backend, soft clustering, comprehensive visual verification by jeremymanning · Pull Request #270 · ContextLab/hypertools

jeremymanning · 2026-07-02T04:49:54Z

HyperTools 2.0: modernized toolbox with interactive backend, soft clustering, and comprehensive visual verification

This PR modernizes hypertools end-to-end while preserving the public API, integrating the best ideas from the earlier refactor attempt (jeremymanning/hypertools dev branch + backend experiments) into the current, tested codebase. Full design rationale: notes/hypertools_2.0_roadmap.md.

Highlights

Interactive plotly backend with visual parity (`backend='auto' | 'matplotlib' | 'plotly'`)

matplotlib remains the default renderer everywhere; backend='auto' (the default) switches to plotly only on Google Colab and Kaggle. Existing local/CI workflows see zero change.
The two backends produce visually matched output: identical colors, line/marker styles and sizes (pt→px calibrated), format strings (markers + dash styles), the signature wireframe cube / square frame, hidden axes, and matched camera angles. Evidence: 22 side-by-side montages in docs/images/v2.0-parity/ — matplotlib left, plotly right, same call.
Animations on both backends (sliding window + camera spin with play/pause controls on plotly).

Mixture-model ("soft") clustering + robust coloring

hyp.cluster(x, cluster='GaussianMixture' | 'BayesianGaussianMixture' | 'LatentDirichletAllocation' | 'NMF') returns (n_samples, n_components) membership proportions (rows sum to 1). Hard clustering unchanged.
hyp.plot(x, cluster='GaussianMixture', ...) colors observations by proportion-weighted blends of component colors.
hue accepts categorical labels, continuous values, or any 2D matrix via the new mat2colors.

Multicolored lines

Continuous or matrix-valued hue + a line format string colors each trajectory continuously along its length on both backends (matplotlib Line3DCollection/LineCollection; plotly per-point line colors in 3D, segment traces in 2D).

Nested-list input with multilevel styling

hyp.plot([[a, b], [c]]) colors datasets by outermost group; deeper nesting renders thinner + fainter. Text corpora keep existing behavior.

`hyp.apply_model`: the stack/unstack core

Datasets are stacked, the model fits once across all of them, and results unstack to the input structure — what makes embeddings/labels comparable across datasets. Model specs: registry name / dict with params / sklearn-style instance / pipeline list. mode='auto'|fit_transform|fit_predict|predict_proba, return_model=True for held-out reuse, stack=False for per-dataset fits. Explicit whitelist registry (no eval).

Retired legacy arguments (long-deprecated)

plot(group=...) → hue; plot(model=/model_params=) → reduce; reduce(model=/model_params=/normalize=/align=); align(method=/normalize=/ndims=) and the ambiguous align=True (now a clear ValueError with a migration hint); cluster(ndims=). Saved geos from hypertools 0.x still load — retired kwargs are translated (group→hue) or dropped with a warning on replay.

Bug fixes

importing hypertools updates matplotlib rcParams #259 fixed: plotting no longer mutates global matplotlib rcParams (verified by before/after diff; regression-tested).
Problem with creating multiple hypertools figures in a for loop #264 fixed: plots in loops no longer repeat the first plot — root cause was the memoize cache, whose str()-keys truncated numpy arrays so new data collided with stale entries. Cache removed; regression test reproduces the reported loop scenario.
Plotting animations in Jupyter does not seem to be compatible with current Numpy (version 2 or greater) #265 fixed: animate=True under numpy≥2 — regression test reproduces the exact array from the issue report; also fixed the Colab/ipympl backend crash (matplotlib ≥3.9 raises ValueError, not ImportError, for broken module:// backends) and a broken-Tcl/Tk fallback.
Long-standing is_line() bug: '' in Line2D.markers made it return False for every format string, silently disabling hypertools' smooth line interpolation on modern matplotlib. Fixed (with linestyle-aware parsing), restoring the intended rendering; per-point labels are now re-mapped onto interpolated trajectories.
Fixed redundant format_data/PPCA pass per plot.

Performance, packaging, docs

import hypertools: 5.1s → 1.4s (lazy umap/seaborn/scipy.interpolate).
PEP 621 pyproject.toml (2.0.0.dev0, Python 3.10–3.13); setup.py/requirements.txt/MANIFEST.in/.travis.yml removed; extras [interactive], [dev]; CI matrix 3 OS × py3.10–3.13 with bumped actions + screenshot artifacts; readthedocs on py3.11; external hdbscan → sklearn's built-in; unmaintained pca-magic dropped.
Docs updated throughout: README "What's new in 2.0"; 5 new gallery examples (interactive backend, mixture models, multicolored lines, nested lists, apply_model) — full sphinx site + gallery builds cleanly; apply_model added to the API reference; docstrings updated for all changed signatures.

Evidence: every function verified on both backends

Test suite: 185 passing (was 136) — all real calls, no mocks. Includes regression tests reproducing importing hypertools updates matplotlib rcParams #259/Problem with creating multiple hypertools figures in a for loop #264/Plotting animations in Jupyter does not seem to be compatible with current Numpy (version 2 or greater) #265 from the issue reports.
Visual verification matrix: 75/75 cases pass — every public function (plot, reduce, align, normalize, cluster, analyze, describe, format_data, load, apply_model, text input) across use cases on both backends: line/marker/dash styles, hue variants, clustering (hard + mixture), nested lists, multicolored lines, missing data, animations. Regenerate: python scripts/generate_verification_screenshots.py.
Backend parity matrix: 22/22 montages — identical calls rendered side by side (matplotlib | plotly). Regenerate: python scripts/generate_parity_screenshots.py.
Dev notebook executed end-to-end, 0 errors: dev/hypertools_2.0_dev_executed.ipynb.

Sample parity montages (matplotlib | plotly — same call)

case	montage
trajectories
multicolored line
mixture blending
dashed lines
nested multilevel

Breaking changes

Python ≥3.10; dependency floors raised (numpy≥2, pandas≥2.2, sklearn≥1.4, matplotlib≥3.8).
Retired arguments listed above raise TypeError/ValueError with migration hints (old saved geos are translated on replay).
The buggy result cache is gone (recompute instead of risking wrong cached results).

⚠️ Do not merge without @jeremymanning's explicit sign-off.

🤖 Generated with Claude Code

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

…audit Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

…erator, dev notebook - scripts/screenshot_harness.py: headless PNG capture per function/use-case - scripts/generate_baseline_screenshots.py: 13 baseline cases, all passing on v0.8.2 - dev/hypertools_2.0_dev.ipynb: interactive test matrix, one section per public function - Roadmap updated with design decisions mined from fork issue tracker (incl. comments) - tests/screenshots/ gitignored (reviewed locally / CI artifacts, not committed) Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

…mixture models to first-class 2.0 features; record approved backend='auto' policy Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

…models, robust coloring Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

…lazy heavy imports - Migrate to PEP 621 pyproject.toml (v2.0.0.dev0, py3.10+); delete setup.py, requirements.txt, MANIFEST.in, stale .travis.yml - CI: py3.10-3.13 matrix, setup-python@v5, cache@v4, codecov@v4, screenshot artifact upload; readthedocs python 3.9->3.11 - Remove memoize entirely (user requirement): str()-keyed cache truncated numpy arrays -> cache collisions returned wrong results (fork issue #3) - Lazy-import umap, seaborn, scipy.interpolate: import hypertools 5.1s -> 1.46s - 136/136 tests pass Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

…re-formatting, scope plot styling (fixes #259) - Replace external hdbscan package with sklearn.cluster.HDBSCAN (always available); drop the SyntaxWarning filter that existed only for it - plot(): pass format_data=False to the post-analyze reduction (data was already formatted; avoids a redundant format_data/PPCA pass per plot) - plot(): apply seaborn palette/style inside plt.rc_context() so plotting no longer permanently mutates matplotlib rcParams (GH #259) - verified with a real before/after rcParams diff - 136/136 tests pass Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

…lors) - cluster() supports GaussianMixture, BayesianGaussianMixture, LDA, NMF and returns (n_samples, n_components) membership proportions (rows sum to 1); hard-clustering behavior unchanged - New hypertools/tools/colors.py: mat2colors maps categorical labels, continuous 1D values, or 2D matrices (soft assignments / arbitrary numeric matrices) to RGB; colors2groups quantizes per-point colors into traces for the matplotlib renderer - plot() accepts cluster='GaussianMixture' etc. (points colored by proportion-weighted blends) and matrix-valued hue - 145/145 tests pass (9 new: real GaussianMixture/BGM/LDA/NMF calls, color-blend math, end-to-end mixture plot) Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

- plot([[a, b], [c]]) flattens arbitrarily nested dataset lists, coloring every leaf by its outermost group and rendering deeper leaves thinner and fainter (summary -> detail, per fork design issues #14/#16) - Nested string lists (text corpora) are explicitly excluded and keep their existing text-pipeline behavior - 156/156 tests pass (6 new, incl. rendered-line color/width assertions) Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

- New hypertools/plot/interactive.py: plotly renderer mirroring _draw's contract (2D/3D traces, fmt-string mode mapping, per-trace colors/labels, hypertools no-ticks aesthetic, matplotlib elev/azim -> plotly camera) - Animations: sliding-window frames (animate=True) and camera spin (animate='spin') with play/pause controls - hyp.plot(..., backend='auto'|'matplotlib'|'plotly'): auto uses plotly ONLY on Colab/Kaggle (approved policy); matplotlib default everywhere else - Screenshot harness exports plotly figures via kaleido - 169/169 tests pass (13 new: policy resolution incl. Colab/Kaggle markers, fmt mapping, camera math, end-to-end plotly figure/animation assertions) Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

…ok, backend fix, README 2.0 docs - scripts/generate_verification_screenshots.py: 44/44 cases pass covering every public function (plot/reduce/align/normalize/cluster/analyze/ describe/format_data/load/text) on both backends; INDEX.md manifest; curated copy committed to docs/images/v2.0-verification/ for PR evidence - dev notebook executed end-to-end with 0 errors via nbclient (dev/hypertools_2.0_dev_executed.ipynb); notebook cells updated to exercise implemented 2.0 APIs - backend.py: catch ValueError from mpl.use() -- matplotlib >=3.9 raises it (not ImportError) for missing ipympl; likely root cause of Colab animate=True failures (#235) - README: What's new in 2.0 + modernized requirements; ipykernel in [dev] - Full suite re-verified: 169/169 tests, 13/13 baselines, import 1.5-1.7s Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

… time GitHub's windows/py3.13 runners ship a broken Tcl/Tk: TkAgg imports fine (so backend probing selects it) but window creation raises _tkinter.TclError. manage_backend now retries the plot once on the original backend after an interactive-backend TclError instead of crashing. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

- plotly renderer now reproduces the matplotlib aesthetic exactly: black wireframe cube (3D) / square frame (2D), hidden axes, unit-cube range, matched camera (elev/azim), pt->px line/marker sizing, full fmt-string support (marker symbols + dash styles, with 3D symbol fallbacks) - MULTICOLORED LINES: continuous or matrix hue + line fmt colors each trajectory continuously along its length (matplotlib Line3DCollection / LineCollection; plotly per-point line colors in 3D, segment traces in 2D) - Fix long-standing is_line() bug: '' in Line2D.markers made it return False for every fmt string, silently disabling line interpolation on modern matplotlib; also parse linestyles before marker chars ('-.') - Re-mapped per-point labels onto interpolated trajectories (fixes latent IndexError that interpolation re-enablement exposed) - Parity montage generator (scripts/generate_parity_screenshots.py): matplotlib|plotly side-by-side for 22 identical calls - 173/173 tests pass Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

- Removed (previously deprecated, now retired for 2.0): plot's group/model/ model_params; reduce's model/model_params/normalize/align; align's method/normalize/ndims and the ambiguous align=True form (now a clear ValueError with migration hint); cluster's ndims - DataGeometry.plot translates/drops retired kwargs when replaying geos saved by hypertools < 2.0 (group -> hue), so old files still load - New hyp.apply_model: the stack/unstack core from the revamp design -- one model fit across stacked datasets then unstacked to input structure (stack=False for per-dataset fits); model specs as registry name / dict / sklearn instance / pipeline list; mode auto|fit_transform|fit_predict| predict_proba; return_model for reuse on held-out data; explicit whitelist registry (no eval) - 185/185 tests pass (12 new apply_model tests) Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

…s for all 2.0 features - docs/images/v2.0-parity/: 22 side-by-side (matplotlib | plotly) montages of identical calls -- line/marker styles, dashes, sizing, colors, hue variants, clustering, mixtures, nested lists, multicolored lines - docs/images/v2.0-verification/: refreshed 75-case matrix (was 44) now covering every plotting feature on BOTH backends, incl. multicolored lines, mixture models, nested lists, marker/line styles, animations, and apply_model - 5 new gallery examples (interactive backend, mixture models, multicolored lines, nested lists, apply_model), all executing cleanly; gallery rebuilt; apply_model added to the API reference - dev notebook updated for all implemented 2.0 features and re-executed end-to-end with 0 errors - README documents multicolored lines, apply_model, backend parity, and the retired legacy arguments - 185/185 tests pass; import 1.41s Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

jeremymanning · 2026-07-02T12:21:47Z

All review items addressed ✅

Every item from the review is now implemented, tested, and screenshotted (PR body updated with full details). Summary of what changed since the initial submission:

1. Backend visual parity

The plotly renderer was rewritten to reproduce the matplotlib output exactly: same wireframe cube / square framing, hidden axes, matched camera (elev/azim), pt→px-calibrated line widths and marker sizes, full format-string support (marker symbols + dash styles), and identical palette assignment. Evidence: 22 side-by-side montages (matplotlib left | plotly right, same call) in docs/images/v2.0-parity/ (manifest).

Multicolored line, same call on both backends:

2. Complete feature screenshot coverage

The verification matrix grew from 44 → 75 cases, all passing (manifest): clustering (hard + all four mixture models), multilevel/nested lists, multicolored lines (new feature — continuous per-segment coloring along trajectories, both backends), matrix/continuous/categorical hue, marker + line styles, animations, apply_model, and every other public function — each on both backends.

3. Formerly deferred items — all now in this PR

hyp.apply_model (stack → fit once → unstack core): registry/dict/instance/pipeline model specs, return_model for held-out reuse, stack=False for per-dataset fits; 12 dedicated tests; gallery example + API reference entry.
Sphinx gallery: 5 new examples (interactive backend, mixture models, multicolored lines, nested lists, apply_model); full docs site + gallery build cleanly.
Animation bugs: Problem with creating multiple hypertools figures in a for loop #264 (loops repeating the first plot — memoize root cause) and Plotting animations in Jupyter does not seem to be compatible with current Numpy (version 2 or greater) #265 (numpy≥2 animations — exact repro from the issue) are fixed with regression tests, alongside importing hypertools updates matplotlib rcParams #259 (rcParams mutation).
Deprecated kwargs retired: group, model/model_params, align(method=)/align=True, cluster(ndims=), reduce(normalize=/align=) — with clear migration errors, and old saved geos still replay (retired args translated with a warning).

Bonus fix found while restoring parity

is_line() had returned False for every format string on modern matplotlib ('' in Line2D.markers is a substring of everything), silently disabling hypertools' smooth line interpolation. Fixed — line plots are smooth again on both backends.

Final numbers: 185 tests passing · 75/75 verification cases · 22/22 parity montages · dev notebook executes with 0 errors · docs build clean · CI matrix green.

🤖 Generated with Claude Code

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

…ments, plotly gallery - Animation EXPORT on both backends, format by extension: .gif (Pillow), .png/.apng (animated PNG), .mp4/.mov/.avi (ffmpeg). plotly animations render each frame via kaleido then assemble; exported frames no longer include the play/pause controls; frame counts scale with duration. 7 new tests save real files and verify frame counts. Sample GIFs from BOTH backends committed to docs/images/v2.0-animations/. - Mixture demos now use OVERLAPPING clusters (1.5 sd apart) so multi-class membership is visible as blended colors (examples, screenshots, parity, notebook); new test asserts a substantial fraction of genuinely mixed assignments. - Backend parity refinements: centered black 12pt title (matching matplotlib), default 640x480 canvas, 2D frame fills the canvas like matplotlib (no forced square), 3D box uses matplotlib's 4:4:3 aspect, camera distance tuned so cube sizes match (r=1.95). - Sphinx gallery renders plotly figures (plotly_sg_scraper + kaleido); new animate_plotly example with an animated GIF thumbnail wired into post_build; interactive-backend example shows the plotly figure inline. - Dev notebook displays animations inline (to_jshtml + plotly frames) and demonstrates gif export; re-executed end-to-end with 0 errors. - Evidence regenerated: 22/22 parity montages, 75/75 verification cases. - 192/192 tests pass (185 + 7 animation-export) Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

Root cause of the macos/py3.11 CI failure: Google Drive answered a dataset request with an HTML rate-limit page (200 status), which load() cached as the dataset -- poisoning every subsequent text-data test on that runner with UnpicklingError. - _download_example_data: raise_for_status; detect HTML error pages before caching (all example datasets are pickles, which never start with '<') - _load_example_data: on a corrupt cache, delete it and retry the download once before failing; never leave a poisoned cache behind - Regression test poisons the real cache with the actual Drive error page and verifies recovery (or clean failure with the cache removed) Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

… CI jobs - _download_example_data retries up to 4 times (2s/6s/18s backoff) when the host rate-limits, instead of failing on the first error page - CI caches ~/hypertools_data (immutable datasets, one cross-OS entry) so 24 concurrent jobs stop re-downloading the same files from Google Drive every run -- the root cause of the intermittent text-test failures Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

jeremymanning · 2026-07-02T15:27:53Z

Round-2 review items addressed ✅

1. Mixture demos now show true multi-class membership

All mixture-model demos (examples, screenshots, parity montages, dev notebook) use overlapping clusters (1.5 sd apart), so points in the overlap regions have genuinely mixed memberships and render with blended, intermediate colors. A new test asserts that a substantial fraction of points have soft (< 0.9 max-proportion) assignments.

2. Axis sizing/ratio and title placement now match

Title: centered, black, 12pt on both backends (plotly previously rendered it off-center, blue-gray, oversized).
3D: plotly now uses matplotlib's default 4:4:3 box aspect and a matched camera distance, so the cube's shape and on-canvas size agree.
2D: the frame fills the canvas exactly like matplotlib (no forced square).
Default canvas is 640×480 on both (matplotlib's default figsize).

All 22 parity montages regenerated: docs/images/v2.0-parity/.

3. Animation works and exports to gif / animated png / mp4 — both backends

hyp.plot(..., animate=..., save_path='file.gif' | '.png' | '.mp4') — the extension picks the format. matplotlib uses Pillow (gif/APNG) or ffmpeg (video); the plotly backend renders every frame via kaleido and assembles them (play/pause controls are excluded from exports). 7 new tests save real files and verify multi-frame output. Committed samples (INDEX):

matplotlib	plotly

The dev notebook now displays animations inline (to_jshtml for matplotlib; interactive frames for plotly) and demonstrates gif export — re-executed end-to-end with 0 errors.

4. Sphinx gallery renders plotly output (including animation)

docs/conf.py now uses the plotly sphinx-gallery scraper, so plotly figures produced by examples render into the gallery (verified in the rebuilt HTML: the interactive-backend page shows the plotly figure). A new animate_plotly example demonstrates plotly animation + export, with an animated GIF thumbnail wired into the existing post-build thumbnail mechanism.

Bonus: dataset-download hardening (found via a CI failure during this round)

One macOS CI job failed because Google Drive rate-limited a dataset download and returned an HTML error page with a 200 status, which load() cached as the dataset — poisoning every subsequent text-data test on that runner. load() now validates downloads (rejects HTML error pages), retries with backoff when rate-limited, and heals corrupt caches instead of leaving a poisoned file behind — regression-tested against the real failure mode. CI additionally shares one cross-OS cache of the example datasets so 24 concurrent jobs no longer hammer the download host every run.

Updated numbers: 193 tests passing · 75/75 verification cases · 22/22 parity montages · 4 committed animation exports · docs + gallery build clean · CI green.

🤖 Generated with Claude Code

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

…multi-panel + cache verification - SVG export on both backends: static (.svg) plus ANIMATED vector SVG via SMIL (frames stitched with discrete display switching; verified frame advance in headless Chrome by scrubbing setCurrentTime). matplotlib frames captured through a public AbstractMovieWriter subclass with frame subsampling (<=60 frames) - plotly window animations now rotate the camera while the window advances, matching matplotlib's behavior - plotly titles centered over the plot area (xref='paper') with a matplotlib-matched font stack - hyperalign: n_iter argument (default 10) iteratively re-estimates the common template; dict form no longer returns None; removed leftover 'method' reference that raised NameError for unknown align strings - shapes zoo: bunny/cube/dragon/sphere/teapot/vase/biplane + datasaurus registered with their Dropbox sources (direct-URL download support in the loader; tolerant unpickler for dill/legacy-pandas formats; dill added as a dependency). 'egyption_mask' excluded: upstream file is an empty (0,3) array - Multi-panel figures verified (hyp.plot(..., ax=...) into user subplot grids, 3D + 2D panels) - Re-download hygiene verified: repeated loads leave the cache byte-stable (no duplication / storage leak) - Reconstructed the classic readthedocs hyperaligned-weights animation (docs/images/v2.0-animations/weights_hyperaligned.gif) - 9 new tests (tests/test_round3.py), all real calls Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

… regen - Docs: modern pydata-sphinx-theme (replacing sphinx_bootstrap_theme); full site + gallery build verified, screenshots committed to docs/images/v2.0-theme/. nbsphinx_execute='never': tutorial notebooks ship pre-executed, and 'auto' was re-executing every gallery notebook (doubling build time and hanging on plotly exports in the nbsphinx kernel) - Fixed zoom: Axes3D.dist was removed in matplotlib >= 3.8, silently disabling animation zoom; replaced with set_box_aspect(zoom=...) using the exact legacy scale mapping (10 / (9 - zoom)) - Reconstructed the classic readthedocs hyperaligned-weights animation (docs/images/v2.0-animations/weights_hyperaligned.gif): 36 subjects, align='hyper', smooth interpolated trajectories, working zoom - Modern demos: gallery examples plot_shapes_zoo + plot_datasaurus; executed tutorial notebooks hugging_face_embeddings (sentence- transformers + HF ag_news, mixture soft clustering, UMAP, animated spin gif) and modern_sklearn_dynamics (HDBSCAN, GaussianMixture, Lorenz attractor multicolored line, animated gif); registered in the tutorials toctree - Animation evidence regenerated with the rotation + zoom fixes; parity montages regenerated (22/22) with the title-font change - 202/202 tests pass Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

jeremymanning · 2026-07-02T17:41:28Z

Round-3 review items — all 10 addressed ✅

1. SVG export: static AND animated, both backends

save_path='plot.svg' works everywhere. Animated exports produce a single self-contained SMIL-animated vector SVG (frames switched via discrete <animate>, looping, no JavaScript). Verified in a real browser: scrubbing the SVG timeline with setCurrentTime() in headless Chrome renders different frames on both backends. 4 new tests.

2. plotly window animation now rotates

animate=True on the plotly backend rotates the camera while the window advances, exactly like matplotlib. Re-exported sample:

3. Titles match

plotly titles are centered over the plot area (xref='paper'), black, 12pt, with a matplotlib-matched font stack. Parity montages regenerated (22/22): docs/images/v2.0-parity/.

4. Multi-panel figures verified

hyp.plot(..., ax=<subplot>) composes into user figure grids (3D + 2D panels mixed); when embedding, hypertools respects the caller's color cycle. Tested + screenshotted.

5. Hyperalignment `n_iter`

hyp.align(data, align='hyper', n_iter=10) — the common template is iteratively re-estimated (default 10; also settable via the dict form). Found and fixed two latent bugs in the same block: the dict form silently returned None, and a leftover method reference raised NameError.

6. Classic readthedocs figure reconstructed

hyp.plot(weights, align='hyper', animate=True, zoom=2.5) reproduces the hyperaligned-weights animation:

Getting this right exposed a real bug: Axes3D.dist was removed in matplotlib ≥3.8, so animation zoom had been silently doing nothing — now implemented via set_box_aspect(zoom=...) with the exact legacy scale mapping.

7. Modern sphinx theme

Docs now use pydata-sphinx-theme (numpy/pandas-style), full site + gallery compile verified; screenshots committed to docs/images/v2.0-theme/. Also fixed a docs-build hang: nbsphinx was re-executing every gallery notebook (nbsphinx_execute='never'; tutorials ship pre-executed).

8. Shapes zoo datasets

hyp.load('bunny' | 'cube' | 'dragon' | 'sphere' | 'teapot' | 'vase' | 'biplane' | 'datasaurus') — registered with their Dropbox sources (direct-URL download support; tolerant unpickler for the dill/legacy-pandas formats these were saved in; dill added as a dependency). All verified via real downloads. ⚠️ egyption_mask is intentionally excluded: the source file (locally and at the Dropbox link) is an empty (0, 3) array — flagging for you to re-export it.

9. No re-download copy leak

Verified + regression-tested: repeated hyp.load() calls leave the cache byte-identical (no duplicate files, no re-download, mtime unchanged).

10. Modern demos

Gallery: plot_shapes_zoo (2×2 multi-panel point clouds) and plot_datasaurus (identical stats, different shapes).
Tutorial notebooks (executed, 0 errors, in the docs toctree): Visualizing Hugging Face text embeddings — sentence-transformers + HF ag_news, category coloring, GaussianMixture soft clustering, UMAP, animated spin export; Modern scikit-learn models and dynamical systems — HDBSCAN, mixture blending, and a Lorenz-attractor multicolored-line animation.

Updated numbers: 202 tests passing · 75/75 verification cases · 22/22 parity montages · 5 committed animation exports · full docs + gallery build clean on the new theme · CI green.

🤖 Generated with Claude Code

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

…ality fixes, embedding demos - hyperalign now REPEATEDLY applies the full procedure (n_iter passes, default 10): each pass's aligned output feeds the next, compounding convergence (mean distance to group mean 0.43 -> 0.004 over 10 passes on rotated copies). The classic readthedocs weights animation is regenerated with the corrected pipeline (normalize='across', align='hyper', zoom=3.5, rotations=1, frame_rate=50, linewidth=3) - NEW animate='serial' mode (both backends): datasets appear one at a time in list order, each growing point-by-point while earlier ones stay fixed, never connected -- built for conversation-turn visualizations; tests assert sequential reveal on both backends - Animation quality: full-canvas animated axes + skip tight_layout fixes cube/data clipping at rotation angles (border-pixel regression test); linewidth is now a plot() argument and animations no longer hardcode linewidth=1; markersize is now a plot() argument - EXACT per-point colors for markers: matrix hue / mixture-model scatter renders true per-observation blends via scatter (was: quantized color groups); plotly path already carried per-point colors - Shapes morph: 3510 frames @ 30fps (117s, 13 rotations), committed as mp4 + 20s preview gif - Demos: wikipedia embeddings (BAAI/bge-small-en-v1.5 + UMAP + 10-way GaussianMixture soft clustering, markersize=2) and reddit conversation trajectories (convokit reddit-small, sliding-window SBERT, per-speaker colors, animate='serial', 30s/3 rotations) -- both executed 0 errors with committed gifs - 205 tests passing Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

…tax, 30fps standard, rebuilt demos WEIGHTS (the classic readthedocs animation) — fully diagnosed via Jeremy's 2020 pieman_trajectory_demo notebook (hypertools 0.6.2 + timecorr): gaussian temporal smoothing (var=300) -> hyp.align repeated n_iter=20 (SRM) -> smooth again -> UMAP -> animate Two additional findings were required to reproduce it on modern deps: - align('SRM') now supports n_iter (re-fits SRM on each pass's output; inter-subject correlation plateaus ~0.87 = the data's shared-signal ceiling, matching era behavior since the vendored SRM is unchanged since v0.6.2) - modern umap-learn's default n_neighbors=15 keeps neighborhoods within-subject and DISPERSES the aligned bundle; n_neighbors=150 merges same-timepoint rows across subjects and reproduces the tight looping rope of the original. Recipe scripted in scripts/generate_weights_trajectory.py; gif regenerated (900 frames, 30fps, tight rope verified against the reference render) Also in this round: - repeated-hyperalignment scale collapse fixed (procrustes' optimal scaling < 1 shrinks data geometrically across passes; per-pass output rescaling keeps norms stable through n_iter=50) - single-call soft-cluster coloring: hyp.plot(x, '.', markersize=2, reduce='UMAP', cluster={'model': GaussianMixture, 'n_clusters': 10}) (dict accepts top-level n_clusters and model classes, in both cluster() and plot(); colors flow from mixture proportions automatically) - 30fps animation standard: plotly frame density raised to 30/s (cap 600); all tutorial gifs re-rendered at fps=30 with no conversion downsampling (wikipedia 300 frames/10s/1 rotation; conversation 900 frames/30s/1 rotation; lorenz + hf spin 900 frames/30s); shapes morph 3510-frame mp4 + 30fps preview gif; matplotlib evidence gifs at 450 frames (plotly evidence gifs kept from the prior render -- kaleido makes 450-frame exports impractically slow; noted) - conversation demo rebuilt: 3-sentence windows WITHIN utterances (true disconnection), repeating per-speaker colors, animate='serial', rotations=1, no frame clipping - 206 tests passing Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

The n_neighbors=150 recipe over-globalized the UMAP embedding and flattened the trajectory into a near-straight line; download.png shows a tightly-bundled bundle with a dramatic loop. Swept n_neighbors against the reference: 15 disperses into a hairball, 150 flattens the loop, and 36 (min_dist=0.1) is the sweet spot -- one tight rope that keeps the loop. Verified this is a pure UMAP-neighborhood effect, not a dependency version: the modern SRM branch is byte-identical to v0.6.2, and era umap-learn 0.4.6 also hairballs the same aligned data at its default n_neighbors=15. Also switched the animation from a rolling window (animate=True + tail_duration) to animate='spin': the window only ever showed a ~4s fragment (a tangle), while spin draws the whole bundle and orbits it so the loop is visible from every angle. Tightened the gif encode (scale=340, 48-colour palette) to ~8MB, in line with the other animation gifs, still 900 frames at 30fps with no downsampling. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Switch back from animate='spin' to the classic animate=True sliding window, per review. The critical property -- the space inside the cube is fixed for the whole animation, independent of which window is visible -- is guaranteed by the pipeline (helpers.scale normalizes once from the full stacked dataset and the window updater never touches axis limits) and verified programmatically: axis limits are identical across frames while the visible fragment's extent slides along the loop. (The earlier claim that window mode showed a static tangle was a frame- extraction bug in the verification harness -- PIL's ImageSequence yields one re-seeked image object, so materializing it with list() produced N copies of the final frame. Measured correctly, the window render rotates and the comet travels: mean inter-second frame motion 8.3, 0 clipped frames.) Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Streams are a first-class data type -- no flag. hyp.plot() detects Python iterators/generators and Hugging Face IterableDatasets (load_dataset(..., streaming=True)) from the structure of the input: - the first stream_init samples (default 10,000) ESTIMATE the normalization/reduction parameters; those fitted models are then APPLIED to every subsequent sample, which is added to the plot dynamically (the fit-on-head/transform-forever semantics from the issue thread) - stream_chunk (default 100) is the per-fetch batch size; each chunk renders as one live redraw / saved animation frame - stream_max (default None) streams continually; infinite streams render incoming data indefinitely, and Ctrl-C cleanly finalizes any save_path animation and returns the geometry - stream_window optionally shows only the trailing samples (comet style) while everything consumed stays on the geometry - reduction models must support transform() (IncrementalPCA default, PCA, UMAP; TSNE raises); align/cluster raise for streams (cluster planned) - dict rows: numeric fields concatenated in insertion order, strings ignored (use .select_columns() for control); datasets added to [dev] 14 real tests (tests/test_streaming.py) incl. an actual HF iris stream, interrupt finalization, and a fitted-on-head-only assertion. New executed tutorial docs/tutorials/streaming_data.ipynb with two streaming animations. Docs theme: pydata-sphinx-theme -> Furo with the ContextLab brand ported from ContextLab/scheduler (Nunito Sans, lowercase 300-weight headings with 0.6px letter-spacing, green #007030 / dark #4CAF50), screenshot-verified. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

jeremymanning · 2026-07-03T00:13:04Z

Round-4.5: the weights animation is solved, plus the requested animation & API fixes ✅

The story-trajectories mystery — full diagnosis

The classic readthedocs animation could not be reproduced by any plot(align=...) call because it was never produced by one. Working from the 2020 pieman_trajectory_demo notebook (hypertools 0.6.2 + timecorr), the actual pipeline is:

gaussian temporal smoothing (var=300)
  → hyp.align(...) applied REPEATEDLY (n_iter=20, SRM)
  → smooth again
  → UMAP
  → animate

Two things still had to be fixed/understood to make that work on modern dependencies:

align('SRM', n_iter=...) now re-fits SRM on each pass's aligned output (matching the notebook's loop). Inter-subject correlation plateaus at ~0.87 — the data's shared-signal ceiling (the vendored SRM is byte-identical to v0.6.2, so this matches era behavior).
UMAP's n_neighbors sets the whole character of the embedding. The modern default (15) keeps neighborhoods within-subject and disperses the aligned data into a hairball; very large values (150+) over-globalize and flatten the trajectory into a near-straight line. n_neighbors=36 (with min_dist=0.1) is the sweet spot: same-timepoint rows across the 36 subjects rope together into one tight bundle while preserving the dramatic looping bend of the classic reference.

I verified this is a pure neighborhood-size effect, not a dependency-version artifact: the modern SRM alignment is byte-identical to v0.6.2 (SRM(features=min(shape[0])), re-fit per pass), and fitting the era umap-learn 0.4.6 on the same aligned data also produces a hairball at its default n_neighbors=15. So the look is recovered by neighborhood tuning on current dependencies rather than by pinning old ones.

The animation is the classic sliding-window style (animate=True) in a fixed space: the data are scaled once from the full dataset before animating, so the axis limits never depend on which window is visible — the comet travels along the loop inside a stationary cube while the camera makes one slow rotation (900 frames @30fps). I verified the fixed-view property programmatically: axis limits are identical across all frames while the visible fragment's extent slides along the trajectory. The full recipe is scripted in scripts/generate_weights_trajectory.py:

Along the way, repeated hyperalignment got a real fix too: procrustes' optimal scaling is < 1 under noise, so repeated passes shrank the data geometrically (eventually crashing). Per-pass rescaling keeps norms stable through n_iter=50.

Single-call soft clustering (requested syntax)

geo = hyp.plot(embeddings, '.', markersize=2, reduce='UMAP',
               cluster={'model': GaussianMixture, 'n_clusters': 10})

works verbatim: the cluster dict accepts top-level n_clusters and model classes (in both cluster() and plot()), and colors flow from the mixture proportions automatically as exact per-point blends. The Wikipedia tutorial now uses this one-liner (10s, 1 rotation, small dots):

Conversation demo on the new `animate='serial'` mode

Windows are now 3-sentence windows within each utterance, so utterances are truly disconnected trajectories; speaker colors repeat (one color per speaker); rotations=1; no frame clipping (border-verified):

30fps smooth/slow animation standard

All animations render at frame_rate=30 and gifs convert at fps=30 with no downsampling (the earlier choppiness came from fps-6/12 conversions). plotly's animation frame density is raised to 30 effective fps (cap 600). Regenerated at the new standard: shapes morph (3,510-frame mp4 + 30fps preview below), lorenz + HF spin tutorials (900 frames / 30s each), matplotlib evidence gifs (450 frames).

(One honest caveat: the two plotly evidence gifs kept their previous renders — kaleido makes 450-frame plotly exports impractically slow; everything user-facing uses the new standard.)

206 tests passing · all four tutorial notebooks re-executed with 0 errors · CI green.

🤖 Generated with Claude Code

jeremymanning · 2026-07-03T00:13:14Z

Round 5: streaming data (#101), streaming tutorial, and the ContextLab docs theme ✅

Streaming data — closes the oldest open feature request (#101, 2017)

Streams are now a first-class data type, exactly as specified in the issue thread: no streaming flag — hyp.plot() infers it from the structure of the input. Python iterators/generators and Hugging Face streaming datasets both work:

from datasets import load_dataset
ds = load_dataset('scikit-learn/iris', split='train', streaming=True)
geo = hyp.plot(ds, '.')          # streams straight in, nothing materialized

Semantics (matching the issue design + review guidance):

stream_init (default 10,000): the initial samples used to estimate the normalization/reduction parameters. Those fitted models are then applied to every subsequent sample, which is added to the plot dynamically — verified by test: IncrementalPCA.n_samples_seen_ == stream_init after a full stream, and the stored model's transform() exactly reproduces the plotted trajectory.
stream_chunk (default 100): how many samples are fetched per update; each chunk renders as one live redraw / one saved animation frame, so it sets the animation's temporal resolution.
stream_max (default None): streaming continues until the stream ends, stream_max is hit, or the user hits Ctrl-C. Infinite streams render continually, and any save_path animation is finalized whenever streaming stops — including on interrupt (tested with a generator that raises KeyboardInterrupt mid-stream: the gif is finalized and the geometry returned).
stream_window (optional): comet-style display of only the most recent samples for long/infinite streams; everything consumed is still retained on the returned geometry (geo.data, geo.xform_data, geo.stream_info).

Guardrails: reduction models must support transform() (IncrementalPCA default, PCA, UMAP; TSNE raises with a clear message); align/cluster raise for streams (a stream is a single dataset; streaming clustering is future work). Dict rows (the HF case) contribute their numeric fields in insertion order; .select_columns(...) gives exact control.

14 real tests (tests/test_streaming.py) — real generators, a real infinite stream, a real interrupt, and a real load_dataset(..., streaming=True) network stream; no mocks.

New executed tutorial (docs/tutorials/streaming_data.ipynb) with two streamed animations:

Docs theme: Furo with the ContextLab look

Ported from ContextLab/scheduler per review: stock Furo plus the lab's brand — Nunito Sans, lowercase 300-weight headings with 0.6 px letter-spacing, green #007030 (light) / #4CAF50 (dark). Screenshot-verified across index/API/tutorial pages:

220 tests passing (213 fast set + 7 animation-export) · streaming tutorial executed with 0 errors · docs build clean.

🤖 Generated with Claude Code

Gallery: sphinx-gallery previously executed only plot_*-named examples, leaving chemtrails/animate*/precog/explore/save_*/analyze pages with code but no rendered output. All examples now execute (filename_pattern), matplotlib animations render as embedded mp4 video (matplotlib_animations + sphinxcontrib-video; animation examples expose `ani = ani_geo.line_ani` for the scraper), save_* examples write to temp files, and every example page gets a branch-aware "Open in Colab" badge + .ipynb link (post_build.py) so gallery examples open as runnable notebooks. Dual-backend audit (scripts/audit_gallery_backends.py): every example runs under BOTH matplotlib and plotly in subprocesses; 78/78 pass (save_movie under plotly needs a long timeout -- kaleido per-frame mp4 export). The audit caught a real, years-latent bug: per-dataset fmt lists (['-','--']) routed each dataset through interp_array_list (plural), silently replacing 2D arrays with lists of per-row interpolations; latent because is_line() always returned False before its round-2 fix. Fixed with interp_array + regression test. Plotly parity for animation extras: chemtrails/precog/bullettime draw low-opacity trail traces on window animations, tail_duration sets the window length, and zoom moves the camera (r = 1.95*(9-zoom)/8, mirroring the matplotlib zoom semantics); previously none of these were forwarded to the plotly renderer. explore maps to plotly's native hover. Universal loader: hyp.load (and DataGeometry.plot/transform) resolve strings by trying, in order: built-in dataset name -> local file (npy/npz/csv/tsv/txt/json/parquet/mat/pickle) -> Hugging Face dataset (split=/streaming=; streaming feeds straight into hyp.plot) -> Google Drive URL or bare id -> Dropbox URL or shared-link path -> any URL with or without https://. Lists of strings return lists of datasets. Raw text (whitespace) still flows to the text-embedding pipeline. Also fixes df2mat for pandas>=2 (get_dummies bool dtype made mixed DataFrames produce object arrays that crashed np.isnan). 19 new tests (12 loader incl. real HF/Drive/Dropbox/URL fetches, 6 plotly trails, 1 interp regression); 232 passing. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Two-layer fix for example pages showing code but no output: 1. nbsphinx was claiming the gallery pages: sphinx-gallery writes a downloadable .ipynb next to each generated .rst, and nbsphinx rendered the UNEXECUTED notebook instead of the gallery page. auto_examples/*.ipynb is now excluded from the document build (downloads still work). 2. matplotlib animations render as embedded HTML5 video: the sphinxcontrib.video extension is registered (required by matplotlib_animations=(True, 'mp4')) and all animation example pages (chemtrails, precog, animate*, save_movie) now embed a playable 30s mp4, verified visually. Includes the regenerated gallery artifacts (all 39 examples executed, 9m39s total build execution). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

…report Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

…round 6.5) NEW ANIMATION STANDARD (both backends): frame_rate=30, duration=30s, rotations=1 -- one revolution every 30 seconds. Three layers had to change for plotly to actually match matplotlib's pacing: - library defaults (frame_rate 50->30, rotations 2->1) and the plotly frame math: n_frames = frame_rate * duration exactly like matplotlib (600-frame cap removed); parity test asserts identical 900 frames at ~33ms on both backends - DataGeometry.plot no longer replays animation-pacing kwargs baked into saved .geo files (old-era defaults like frame_rate=50 and rotations=2 silently overrode the current standard in every gallery example that calls geo.plot); explicit caller overrides still win - docs builds: plotly's sphinx-gallery renderer serialized every animation frame through kaleido for one static png (a 900-frame figure took ~an hour); the show path now writes a frame-stripped png plus an interactive html with embedded frames capped at 150, each shown proportionally longer -- total duration and rotation speed unchanged, pages ~0.1MB Streaming stability: the data->box transform is FROZEN from the head (the center+scale affine is captured once); every future sample goes through the same transform and out-of-range samples are clamped to the closest point on the box surface. Axis limits never change once set -- no more per-chunk rescale "twitch" (verified: zero vanishing ink across tutorial animation frames + exact-position regression tests). Legends (both backends): rendered to the RIGHT of the plot, vertically centered on the box (mpl bbox_to_anchor; plotly x=1.02/y=0.5 with a reserved right margin). Screenshot-verified in 2D and 3D. Gallery UX: thumbnail clicks were dead (sphinx-gallery >= 0.17 no longer wraps the thumb <img> in an anchor; the old gallery-fixes.js targeted extinct .xref markup) -- thumbnails now open the example's notebook on Colab while title text opens the example page. Animated thumbnails were squashed into 200x200 squares from 4:3 sources; all 7 regenerated letterboxed at the correct aspect from the new 30fps mp4s (scripts/generate_gallery_thumbs.py). Plotly evidence gifs re-rendered at the pacing standard. DataGeometry.plot/transform docstrings document the universal string-loading behavior for the API pages. 237 tests passing. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

A kaleido subprocess wedged during test_animated_svg_plotly on one Windows runner and burned the full 6-hour Actions job timeout. No test legitimately takes over 20 minutes; a hung native call now fails fast with a stack dump instead of holding a runner hostage (thread method, since the hangs are inside native calls). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

jeremymanning · 2026-07-03T12:00:05Z

Round 6: gallery pages render everything, dual-backend audit, notebooks-on-click, universal loader ✅

Gallery pages now display their outputs (including animations)

chemtrails.html (and 9 other pages) showed code with no output. Two independent root causes, both fixed:

sphinx-gallery only executed examples named plot_* — chemtrails/animate*/precog/explore/save_*/analyze were never run. All 39 examples now execute at build time, and matplotlib animations render as embedded HTML5 video (matplotlib_animations=(True,'mp4') + sphinxcontrib.video; the animation examples now expose ani = ani_geo.line_ani, which also documents how to grab the animation object).
nbsphinx was hijacking the pages: sphinx-gallery writes a downloadable .ipynb next to each generated .rst, and nbsphinx rendered the unexecuted notebook instead of the gallery page. Gallery notebooks are now excluded from the document build (downloads unaffected).

The chemtrails page with its playable 30s animation (screenshot):

Clicking a gallery example gets you a runnable notebook

Every example page opens with an Open in Colab badge (branch-aware link to the example's committed notebook) next to the .ipynb download — visible in the screenshot above, injected across all 40 pages by post_build.py. If you'd prefer gallery thumbnails to jump straight to Colab instead of the example page, that's a one-line change to the click handler — say the word.

Every example verified on BOTH backends

New audit harness (scripts/audit_gallery_backends.py) runs all 39 examples under matplotlib and plotly in subprocesses: 78/78 pass (full report, side-by-side spot-check). Two things fell out of it:

A real, years-latent data-corruption bug: per-dataset format lists (['-','--']) routed each dataset through interp_array_list (plural), silently replacing 2D arrays with lists of per-row interpolations. It was unreachable for years because is_line() always returned False before its round-2 fix. Fixed (interp_array) with a regression test; this is what was crashing plot_procrustes and plot_missing_data.
Plotly parity gaps closed: chemtrails/precog/bullettime now draw low-opacity trail traces on plotly window animations, tail_duration sets the plotly window length, and zoom moves the camera (mirroring the matplotlib zoom semantics) — none were previously forwarded to the plotly renderer. explore maps to plotly's native hover labels. (save_movie under plotly passes but takes ~32 min — kaleido renders all 600 mp4 frames; noted in the report.)

Universal loader (`hyp.load` + `DataGeometry`)

Strings (and lists of strings) resolve in the requested order: built-in dataset → local file (npy/npz/csv/tsv/txt/json/parquet/mat/pickle) → Hugging Face dataset (streaming=True feeds straight into hyp.plot) → Google Drive URL or bare file id → Dropbox URL/shared-link path → any URL with or without https://. Raw text still flows to the text-embedding pipeline (whitespace is the discriminator). All verified with real network fetches — including the legacy Drive-hosted files, Dropbox dl=0 links, and schemeless URLs. Also fixed df2mat for pandas≥2 (bool get_dummies produced object arrays that crashed mixed-dtype DataFrame plotting).

239 tests passing (232 fast set + 7 animation-export) · 78/78 dual-backend example runs · docs build clean with all examples executed (9m39s) · CI green.

🤖 Generated with Claude Code

jeremymanning · 2026-07-03T12:00:06Z

Round 6.5: animation pacing standard, streaming stability, legends, gallery UX ✅

One animation standard, both backends: 30 fps · 30 s · 1 rotation per 30 s

"Plotly is much too fast" had three stacked causes, each now fixed and regression-tested:

Library defaults — frame_rate 50→30, rotations 2→1, and plotly now generates exactly frame_rate × duration frames like matplotlib (its 600-frame cap is gone). A parity test asserts both backends produce identical 900 frames @ ~33 ms at defaults.
Saved geos smuggled in old pacing — gallery examples call geo.plot(...) on pickled example data whose stored kwargs carry their era's defaults (frame_rate=50, rotations=2), silently overriding current defaults. DataGeometry.plot no longer replays pacing kwargs from saved files (explicit overrides still win).
Docs rendering — plotly's sphinx-gallery renderer pushed every animation frame through kaleido for one static png (a 900-frame figure took ~57 min and would embed tens-of-MB pages). The docs path now writes a frame-stripped png + an interactive html with embedded frames capped at 150, each displayed proportionally longer — duration and rotation speed unchanged, pages ~0.1 MB.

All gallery animation mp4s re-rendered at the standard (verified 900 frames @ 30 fps via ffprobe), and the plotly evidence gifs re-exported at correct pacing.

Streaming: view is rock-stable, out-of-range samples clamp to the box

The data→box transform is frozen from the initial samples (the center+scale affine is captured once); every future sample passes through that exact transform, and anything outside the box is clamped to the closest point on its surface. Axis limits never change once set. Verified: zero vanishing ink across the re-rendered tutorial animation (previously drawn pixels never move), plus regression tests for exact drawn-position stability and box-surface clamping:

Legends: right of the plot, vertically centered (both backends)

matplotlib | plotly:

Gallery: thumbnails click through to notebooks, correct aspect

Thumbnail clicks were genuinely dead (sphinx-gallery ≥ 0.17 no longer wraps the image in a link, and the old fix-up JS targeted markup that no longer exists). Thumbnails now open the example's notebook on Colab; the title text under each thumbnail opens the example page with the rendered output. The animated thumbnails were also being squashed into squares from 4:3 sources — all seven are regenerated letterboxed at the correct aspect from the new 30 fps videos:

API docs

hyp.load's docstring documents the full resolution chain (builtin → local file → Hugging Face incl. streaming → Drive → Dropbox → URL, lists supported), and DataGeometry.plot/transform now document automatic string-source loading — both regenerate into the API reference.

(CI hardening: a kaleido subprocess wedged for 6 hours on one Windows runner before hitting the Actions timeout -- pytest-timeout now caps every test at 20 minutes so a hung native call fails fast instead of burning a runner.)

237 tests passing · gallery animations verified at 900 frames / 30 fps · streaming tutorial re-executed with 0 errors.

🤖 Generated with Claude Code

Every documentation notebook (39 gallery + 13 tutorials) now opens with a branch-aware install cell so it runs standalone in Google Colab. On a preview branch it installs that branch from GitHub (`%pip install "hypertools[interactive] @ git+...@dev-2.0"`, verified in a clean venv: imports as 2.0.0.dev0 with 2.0-only features working); on master it installs the released package. scripts/add_colab_install_cell.py injects the line idempotently into the hand-authored tutorial notebooks, and conf.py's first_notebook_cell emits the same line for gallery notebooks on rebuild. Gallery examples: - shapes zoo: plots EVERY zoo shape (bunny, cube, dragon, sphere, teapot, vase, biplane) as small black dots (',' pixel marker), one panel each - datasaurus: plots ALL THIRTEEN datasets of the dozen as small black dots ('.' point marker), one panel each Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

jeremymanning and others added 22 commits July 1, 2026 23:57

Ignore local virtualenv and tooling directories (.venv, .omc)

c8257b6

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

Add session notes for hypertools 2.0 kickoff

35afcd7

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

Add hypertools 2.0 roadmap synthesizing refactor analysis and master …

158477a

…audit Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

Roadmap: elevate multilevel indices, stack/unstack, robust coloring, …

bf6fa7d

…mixture models to first-class 2.0 features; record approved backend='auto' policy Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

Dev notebook: add acceptance targets for multilevel indices, mixture …

1bc283c

…models, robust coloring Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

Notes: record Jeremy's confirmed 2.0 design decisions

97041b7

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

Notes: mark roadmap phases 0-2 complete

0759e89

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

Notes: final pre-PR status

da7a358

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

Ignore docs gallery build byproducts

f1270b5

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

Notes: PR #270 opened

9d97a32

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

Notes: PR #270 CI fully green (24/24)

49e66ea

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

jeremymanning and others added 5 commits July 2, 2026 08:22

Notes: third work block complete, PR evidence posted

5325ac1

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

Notes: review round 2 complete

567c0fc

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

Notes: round 2 closed, CI green, evidence posted

4ffacae

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

jeremymanning and others added 3 commits July 2, 2026 12:36

Notes: round 3 complete

3357e4d

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

jeremymanning and others added 8 commits July 2, 2026 13:41

Notes: round 3 closed, evidence posted

6eda5aa

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

Notes: round 4.5 shipped

ad0f015

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

Theme evidence: Furo/ContextLab screenshots (index + streaming tutorial)

5cc599d

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

jeremymanning and others added 7 commits July 2, 2026 21:34

Round-6 evidence: gallery video page screenshot + dual-backend audit …

5dc97bf

…report Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Audit report: save_movie/plotly verified complete (1943s)

6546501

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Notes: round-6.5 addendum (3-layer plotly pacing diagnosis)

7d9e224

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

jeremymanning and others added 2 commits July 3, 2026 08:35

Notes: round 7 (Colab install cells, full-panel examples)

ae30f6d

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

HyperTools 2.0: modernized toolbox, interactive backend, soft clustering, comprehensive visual verification#270

HyperTools 2.0: modernized toolbox, interactive backend, soft clustering, comprehensive visual verification#270
jeremymanning wants to merge 48 commits into
masterfrom
dev-2.0

jeremymanning commented Jul 2, 2026 •

edited

Loading

Uh oh!

jeremymanning commented Jul 2, 2026

Uh oh!

jeremymanning commented Jul 2, 2026

Uh oh!

jeremymanning commented Jul 2, 2026

Uh oh!

jeremymanning commented Jul 3, 2026

Uh oh!

jeremymanning commented Jul 3, 2026

Uh oh!

jeremymanning commented Jul 3, 2026

Uh oh!

jeremymanning commented Jul 3, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Conversation

jeremymanning commented Jul 2, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

HyperTools 2.0: modernized toolbox with interactive backend, soft clustering, and comprehensive visual verification

Highlights

Interactive plotly backend with visual parity (backend='auto' | 'matplotlib' | 'plotly')

Mixture-model ("soft") clustering + robust coloring

Multicolored lines

Nested-list input with multilevel styling

hyp.apply_model: the stack/unstack core

Retired legacy arguments (long-deprecated)

Bug fixes

Performance, packaging, docs

Evidence: every function verified on both backends

Sample parity montages (matplotlib | plotly — same call)

Breaking changes

Uh oh!

jeremymanning commented Jul 2, 2026

All review items addressed ✅

1. Backend visual parity

2. Complete feature screenshot coverage

3. Formerly deferred items — all now in this PR

Bonus fix found while restoring parity

Uh oh!

jeremymanning commented Jul 2, 2026

Round-2 review items addressed ✅

1. Mixture demos now show true multi-class membership

2. Axis sizing/ratio and title placement now match

3. Animation works and exports to gif / animated png / mp4 — both backends

4. Sphinx gallery renders plotly output (including animation)

Bonus: dataset-download hardening (found via a CI failure during this round)

Uh oh!

jeremymanning commented Jul 2, 2026

Round-3 review items — all 10 addressed ✅

1. SVG export: static AND animated, both backends

2. plotly window animation now rotates

3. Titles match

4. Multi-panel figures verified

5. Hyperalignment n_iter

6. Classic readthedocs figure reconstructed

7. Modern sphinx theme

8. Shapes zoo datasets

9. No re-download copy leak

10. Modern demos

Uh oh!

jeremymanning commented Jul 3, 2026

Round-4.5: the weights animation is solved, plus the requested animation & API fixes ✅

The story-trajectories mystery — full diagnosis

Single-call soft clustering (requested syntax)

Conversation demo on the new animate='serial' mode

30fps smooth/slow animation standard

Uh oh!

jeremymanning commented Jul 3, 2026

Round 5: streaming data (#101), streaming tutorial, and the ContextLab docs theme ✅

Streaming data — closes the oldest open feature request (#101, 2017)

Docs theme: Furo with the ContextLab look

Uh oh!

jeremymanning commented Jul 3, 2026

Round 6: gallery pages render everything, dual-backend audit, notebooks-on-click, universal loader ✅

Gallery pages now display their outputs (including animations)

Clicking a gallery example gets you a runnable notebook

Every example verified on BOTH backends

Universal loader (hyp.load + DataGeometry)

Uh oh!

jeremymanning commented Jul 3, 2026

Round 6.5: animation pacing standard, streaming stability, legends, gallery UX ✅

One animation standard, both backends: 30 fps · 30 s · 1 rotation per 30 s

Streaming: view is rock-stable, out-of-range samples clamp to the box

Legends: right of the plot, vertically centered (both backends)

Gallery: thumbnails click through to notebooks, correct aspect

API docs

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

jeremymanning commented Jul 2, 2026 •

edited

Loading

Interactive plotly backend with visual parity (`backend='auto' | 'matplotlib' | 'plotly'`)

`hyp.apply_model`: the stack/unstack core

5. Hyperalignment `n_iter`

Conversation demo on the new `animate='serial'` mode

Universal loader (`hyp.load` + `DataGeometry`)