Replace legacy images with Nebari-compatible images #283
- Remove old Dockerfiles that don't work with Nebari/DandiHub
- Add dandi/ with Nebari jupyterlab base + DANDI tools
- Add matlab/ with Nebari-compatible MATLAB integration
- Update workflow for new structure

The new images extend upstream Nebari images instead of rebuilding from scratch.

MATLAB fix: clear ENTRYPOINT, add libnss-wrapper for Nebari user identity handling.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Pull request overview
This PR replaces legacy Docker images with new Nebari-compatible images for DandiHub's JupyterHub deployment. The old images were built from scratch and didn't integrate with Nebari; the new images extend upstream Nebari base images and add DANDI-specific tooling.
Key changes:
- New modular image structure with separate dandi/ and matlab/ directories extending Nebari base images
- MATLAB image now clears ENTRYPOINT and installs libnss-wrapper for Nebari compatibility
- Simplified CI/CD workflow using GitHub-hosted runners with separate jobs for dandi and matlab builds
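The libnss-wrapper item above relies on the nss_wrapper library, which lets a process running under an arbitrary UID resolve a user name from a substitute passwd file. The following is a minimal sketch of that mechanism, not Nebari's actual startup code: `passwd_entry` is a hypothetical helper, and the user name and file paths in the comments are assumptions. `NSS_WRAPPER_PASSWD`, `NSS_WRAPPER_GROUP`, and `LD_PRELOAD` are the variables nss_wrapper itself reads.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print a passwd(5) line for an arbitrary runtime user.
# Usage: passwd_entry <name> <uid> <gid> <home>
passwd_entry() {
    local name="$1" uid="$2" gid="$3" home="$4"
    # Fields: name:password:UID:GID:GECOS:home:shell
    printf '%s:x:%s:%s:%s:%s:/bin/bash\n' "$name" "$uid" "$gid" "$name" "$home"
}

# Sketch of how nss_wrapper would consume it (name and paths are assumptions):
#   passwd_entry jovyan "$(id -u)" "$(id -g)" /home/jovyan > /tmp/passwd
#   export NSS_WRAPPER_PASSWD=/tmp/passwd NSS_WRAPPER_GROUP=/etc/group
#   export LD_PRELOAD=libnss_wrapper.so
```

An image-level ENTRYPOINT that assumes a fixed user can interfere with this kind of setup, which is consistent with the PR clearing it.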
Reviewed changes
Copilot reviewed 13 out of 13 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| images/dandi/Dockerfile | New CPU/GPU image extending nebari-jupyterlab with AWS CLI, rclone, datalad, and git-annex-remote-rclone |
| images/dandi/environment.yaml | Conda environment defining rclone and datalad dependencies |
| images/dandi/postBuild | Permission fix script for multi-user conda environment |
| images/matlab/Dockerfile | New MATLAB image with cleared ENTRYPOINT and libnss-wrapper for Nebari compatibility |
| images/README.md | Updated documentation describing new image structure, local build commands, and CI workflow |
| .github/workflows/docker-build.yaml | Refactored workflow with separate jobs for dandi (CPU/GPU variants) and matlab images |
| images/Dockerfile* (7 files deleted) | Removed legacy Dockerfiles that don't work with Nebari |
Comments suppressed due to low confidence (1)
images/dandi/Dockerfile:19
- Downloading from the master branch without pinning to a specific commit creates a security and reproducibility risk. An attacker could compromise the repository or the file could change between builds, leading to different images from the same Dockerfile. Pin to a specific commit SHA or release tag.
RUN wget --quiet https://github.com/git-annex-remote-rclone/git-annex-remote-rclone/raw/refs/heads/master/git-annex-remote-rclone \
&& chmod +x git-annex-remote-rclone \
&& mv git-annex-remote-rclone /opt/conda/bin
-o "awscliv2.zip" \
&& unzip awscliv2.zip \
&& ./aws/install \
&& rm -rf ./aws awscliv2.zip
The AWS CLI installation lacks verification of the download integrity. Consider adding checksum verification or using a pinned URL to a specific version to prevent supply chain attacks and ensure reproducibility.
Suggested change:
-   -o "awscliv2.zip" \
-   && unzip awscliv2.zip \
-   && ./aws/install \
-   && rm -rf ./aws awscliv2.zip
+   -o "awscli-exe-linux-x86_64.zip" \
+   && curl --silent --show-error "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip.sha256" \
+   -o "awscli-exe-linux-x86_64.zip.sha256" \
+   && sha256sum --check awscli-exe-linux-x86_64.zip.sha256 \
+   && unzip awscli-exe-linux-x86_64.zip \
+   && ./aws/install \
+   && rm -rf ./aws awscli-exe-linux-x86_64.zip awscli-exe-linux-x86_64.zip.sha256
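The integrity check discussed in this comment can also be sketched as a small reusable helper. The version below assumes the expected digest is pinned in the build itself rather than fetched alongside the archive; `verify_sha256` and its arguments are hypothetical names, not part of the PR.

```shell
#!/usr/bin/env bash
# Hypothetical helper: verify a file against a pinned SHA-256 digest.
# Usage: verify_sha256 <file> <expected-hex-digest>
verify_sha256() {
    local file="$1" expected="$2" actual
    # sha256sum prints "<digest>  <filename>"; keep only the digest
    actual="$(sha256sum "$file" 2>/dev/null | awk '{print $1}')"
    if [ "$actual" != "$expected" ]; then
        echo "checksum mismatch for $file" >&2
        return 1
    fi
}
```

Pinning the digest next to a versioned download URL means a changed upstream file fails the build instead of silently producing a different image.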
- tag: latest-nebari
  build-args: ""
- tag: latest-nebari-gpu
  build-args: "BASE_IMAGE=quay.io/nebari/nebari-jupyterlab-gpu:2024.11.1"
The Nebari base image version (2024.11.1) is duplicated in both the Dockerfile default and the workflow build-args for the GPU variant. This creates a maintenance burden and risk of version drift. Consider managing the version in one place only, preferably using the ARG in the Dockerfile and letting the workflow override only when necessary.
Suggested change:
-   build-args: "BASE_IMAGE=quay.io/nebari/nebari-jupyterlab-gpu:2024.11.1"
+   build-args: "BASE_IMAGE=quay.io/nebari/nebari-jupyterlab-gpu"
# fix permissions for multi-user environment
COPY postBuild /opt/dandi/postBuild
RUN /opt/dandi/postBuild
The postBuild script is executed directly but may not have execute permissions. Add a RUN command to make it executable before running it, or invoke it through bash explicitly. For example, change line 23 to 'RUN bash /opt/dandi/postBuild' or add 'RUN chmod +x /opt/dandi/postBuild' before line 23.
Suggested change:
-   RUN /opt/dandi/postBuild
+   RUN bash /opt/dandi/postBuild
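The difference between the two invocations can be reproduced outside the image build. A minimal sketch, assuming a throwaway script in a temporary directory (the function name `run_without_exec_bit` is hypothetical): passing the script to `bash` only requires read permission, whereas running it directly requires the execute bit, which `COPY` carries over from the build context and which may therefore be missing.

```shell
#!/usr/bin/env bash
# Hypothetical demo: a script without the execute bit still runs via `bash`.
run_without_exec_bit() {
    local dir script
    dir="$(mktemp -d)"
    script="$dir/postBuild"
    printf '#!/bin/bash\necho "permissions fixed"\n' > "$script"
    chmod 644 "$script"          # readable, but not executable
    bash "$script"               # works: bash only needs to read the file
    rm -rf "$dir"
}
```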
- rclone
- pip:
  - datalad
Dependencies lack version pins, which can lead to non-reproducible builds and potential breakage when upstream packages release breaking changes. Consider pinning at least major versions for rclone and datalad to ensure consistent builds.
Suggested change:
-   - rclone
-   - pip:
-     - datalad
+   - rclone=1
+   - pip:
+     - datalad>=1,<2
USER root

# Install libnss_wrapper for Nebari user identity handling
The comment contains a typo: 'libnss_wrapper' should use a hyphen instead of underscore for consistency with the actual package name 'libnss-wrapper' installed on line 7.
RUN wget --quiet https://github.com/git-annex-remote-rclone/git-annex-remote-rclone/raw/refs/heads/master/git-annex-remote-rclone \
&& chmod +x git-annex-remote-rclone \
&& mv git-annex-remote-rclone /opt/conda/bin
The wget of git-annex-remote-rclone from refs/heads/master introduces a supply chain risk because it pulls an executable directly from a mutable branch head without any pinning or integrity verification. If the upstream repository or its branch is compromised, a malicious binary could be delivered and later executed inside user sessions with access to data and credentials. To reduce this risk, download a specific tagged release or commit (or use a packaged distribution) and verify its integrity (e.g., by checksum or signature) before installing it into PATH.
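One way to act on this is to build the download URL from a full commit SHA instead of a branch name. The sketch below is illustrative only: `pinned_raw_url` is a hypothetical helper, and any SHA passed to it must be looked up from the upstream repository; none is given in this PR.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print a raw.githubusercontent.com URL pinned to a commit.
# Usage: pinned_raw_url <owner/repo> <40-hex-commit-sha> <path-in-repo>
pinned_raw_url() {
    local repo="$1" sha="$2" path="$3"
    # Reject branch names and abbreviated SHAs: only a full 40-character
    # hex commit ID identifies immutable content.
    if ! [[ "$sha" =~ ^[0-9a-f]{40}$ ]]; then
        echo "not a full commit SHA: $sha" >&2
        return 1
    fi
    echo "https://raw.githubusercontent.com/${repo}/${sha}/${path}"
}
```

The resulting URL could replace the `refs/heads/master` URL in the `wget` call, ideally combined with a checksum check on the downloaded file.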
FWIW, unless someone requests it, I do not think we really need git-annex-remote-rclone here. Moreover, rclone has a built-in git-annex special remote now (as of v1.67.0), so if you get a fresh enough rclone, it might (I didn't test) be set already!
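Whether the installed rclone is new enough for the built-in remote could be checked at build time. A minimal sketch, assuming GNU `sort -V` is available and that `rclone version` prints a first line of the form `rclone v1.67.0`; `version_ge` is a hypothetical helper, and the availability of the remote itself is untested here, as in the comment above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if version $1 >= version $2 (e.g. 1.68.2 >= 1.67.0).
version_ge() {
    local have="${1#v}" want="${2#v}"
    # sort -V orders version strings; if the smaller of the two is $want,
    # then $have is at least $want.
    [ "$(printf '%s\n%s\n' "$have" "$want" | sort -V | head -n 1)" = "$want" ]
}

# Example use (assumes rclone is on PATH):
#   installed="$(rclone version | head -n 1 | awk '{print $2}')"
#   version_ge "$installed" "1.67.0" || echo "rclone older than 1.67.0" >&2
```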
I constructed this from the modifications that @satra made to the base nebari image. https://github.com/dandi/nebari-docker-images/blob/enh/dandi/Dockerfile#L165
Not really sure why we need what's in there. I'm happy to remove it, just wanted to start from "parity".
Add fully configured image for MATLAB 2025b
Summary
Replaces the legacy Docker images with new Nebari-compatible images that work with DandiHub.
The old images were built from scratch and didn't integrate with Nebari's JupyterHub deployment. The new images extend upstream Nebari base images, inheriting their configuration and adding DANDI-specific tools.
These images are based on the image customizations that we originally forked from nebari https://github.com/dandi/nebari-docker-images/tree/enh/dandi
What's Changed
New Image Structure
- images/dandi/ - Extends quay.io/nebari/nebari-jupyterlab with AWS CLI, rclone, datalad, git-annex-remote-rclone
- images/matlab/ - MATLAB integration compatible with Nebari (see MATLAB section below)
CI/CD
Images Produced
- dandiarchive/dandihub:latest-nebari
- dandiarchive/dandihub:latest-nebari-gpu
- dandiarchive/dandihub:latest-matlab
MATLAB + Nebari Integration
The MATLAB base image has an ENTRYPOINT that conflicts with how Nebari spawns JupyterHub pods. The fix:
- MW_CONTEXT_TAGS - Environment variable for MATLAB licensing
Tested on staging:
@aranega - This should resolve the ENTRYPOINT/licensing issues you were debugging. The key insight was that Nebari uses libnss_wrapper to handle user identity, and the MATLAB base image's start.sh was conflicting with that. By clearing the ENTRYPOINT and installing libnss-wrapper, both systems can coexist.
TODOs (prior to merge)
TODOs (future PRs)
This PR focuses on Nebari compatibility (the foundation). Vincent's scientific tooling from PR #282 (toolboxes, Python packages, add-ons) should be layered on top once this is merged. These items will be checked off after follow-up issues are created.
Fixes
Related
Test Plan