Make rugged optional, default to pure-Ruby git backend #261

Open
stanhu wants to merge 5 commits into grodowski:master from stanhu:sh-git-adapter

stanhu commented May 6, 2026

Summary

Refactor Undercover::Changeset behind a small adapter interface so the gem no longer requires rugged as a hard dependency. Two adapters live side by side:

  • Undercover::Changeset::GitAdapter — pure Ruby, backed by the git gem (new default)
  • Undercover::Changeset::RuggedAdapter — libgit2-backed, used automatically when rugged is loadable

rugged is dropped from add_dependency in the gemspec. Users who already have rugged in their bundle (directly or transitively) keep using it: Changeset.default_adapter_class prefers rugged whenever it can be required and falls back to the git gem otherwise. No runtime configuration is needed to switch backends; add rugged to your bundle to use it, or remove it to use the git gem.
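For reviewers who want the selection rule in concrete terms, here is a minimal sketch of "prefer rugged when it can be required, otherwise fall back". The `BackendSelection` module and its method names are hypothetical and only illustrate the idea; the real logic lives in Changeset.default_adapter_class and may differ.

```ruby
# Hypothetical sketch of load-time backend selection; not the PR's actual code.
module BackendSelection
  # Returns true when the named library can be required, false otherwise.
  def self.loadable?(feature)
    require feature
    true
  rescue LoadError
    false
  end

  # Picks the first candidate whose library loads; falls back to the default.
  def self.pick(candidates, default:)
    match = candidates.find { |feature, _name| loadable?(feature) }
    match ? match.last : default
  end
end

# Prints RuggedAdapter when rugged is installed, GitAdapter otherwise.
puts BackendSelection.pick({ 'rugged' => :RuggedAdapter }, default: :GitAdapter)
```

Rescuing only LoadError keeps a genuinely broken rugged install (for example, one that raises a different error while loading) visible instead of silently switching backends.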

Why avoid native extensions by default

rugged wraps libgit2, which means every gem install undercover triggers a C compile and pulls libgit2 along for the ride. That's a real cost for a tool whose job is to read git diffs:

  • Install friction. libgit2 has to be present and ABI-compatible at install time. macOS users hit brew install libgit2, Debian/Ubuntu users hit apt install libgit2-dev, Alpine/musl users hit notoriously fiddly cross-compilation issues. New contributors and CI cold-cache builds eat that cost on every fresh environment.
  • Version coupling. rugged's upper bound (< 1.10) exists because its native bindings track libgit2 ABI changes. Bumping system libgit2 can silently break the gem until a matching rugged ships. Pure-Ruby code has no such coupling.
  • Symbol collisions with other native gems. libgit2 1.8+ statically links llhttp, and rugged exports all 119 llhttp_* symbols from its shared object. Loading rugged in the same process as llhttp-ffi (used by the popular http gem) causes hard-to-debug ABI collisions — exactly the kind of failure mode pure-Ruby gems can't produce. See rugged#1000 for the in-flight fix; until it ships, any process that loads both rugged and http/llhttp-ffi is at risk.
  • Build time. Native compilation noticeably slows bundle install on cold caches, which matters most in containers and CI — exactly where undercover runs.
  • Platform reach. Native extensions are the most common reason a gem fails to install on less-common architectures (musl Linux, certain ARM setups, BSDs). A pure-Ruby default means undercover "just works" everywhere Ruby runs.
  • Security surface. libgit2 has had its share of CVEs, and they ride along into every rugged-bundling app whether or not the user benefits from libgit2's features. The git gem shells out to the system git, which is already managed by the OS package manager.
  • It's optional, not gone. rugged remains meaningfully faster on large repos because it reads packfiles in-process instead of forking git. Users who already rely on it (or who want it for monorepo performance) just keep gem 'rugged' in their Gemfile and undercover picks it up automatically.

The net effect: a smaller, lower-friction default install for the common case, with the fast path still one gem 'rugged' line away.

Compatibility caveat

The git gem shells out to the system git binary at runtime, so the default backend introduces a soft runtime requirement on git being on PATH. In practice this is almost always already true:

  • Developer machines have git installed by definition
  • CI runners (GitHub Actions, CircleCI, etc.) ship with git
  • The full ruby:3.x container images include git; bare Debian/Ubuntu and Alpine base images typically need it added with a single package install
  • Bundler itself shells out to git for any git-sourced gem, so projects with such gems already need git present

The realistic edge cases are minimal images (distroless, *-slim variants, scratch-based custom images) where git may have been intentionally stripped. For those users, the README's "Git backend" section points at rugged as the escape hatch (no system git needed, since libgit2 runs in-process). Happy to promote this to a more prominent upgrade note if a paragraph in the backend section is too easy to miss.
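Anyone unsure whether a given image meets this requirement could run a preflight check along these lines. This is a sketch only: `executable_on_path?` is a hypothetical helper, the PR does not say that undercover performs such a check, and the lookup ignores Windows executable extensions such as .exe.

```ruby
# Hypothetical preflight check; not part of the PR.
# Returns true when an executable file called `name` exists in a PATH directory.
def executable_on_path?(name, path_env = ENV.fetch('PATH', ''))
  path_env.split(File::PATH_SEPARATOR).any? do |dir|
    candidate = File.join(dir, name)
    File.file?(candidate) && File.executable?(candidate)
  end
end

unless executable_on_path?('git')
  warn 'git not found on PATH; install git or add rugged to the bundle'
end
```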

Changes

  • lib/undercover/changeset.rb — slimmed down to delegate to an adapter; no longer requires rugged
  • lib/undercover/changeset/rugged_adapter.rb — extracted from the old Changeset body, behaviorally identical
  • lib/undercover/changeset/git_adapter.rb — new, parses unified diff hunks for added line numbers
  • undercover.gemspec — adds git ~> 4.0, removes rugged
  • Gemfile — adds rugged as a dev dependency so both adapters get exercised in CI
  • spec/changeset_spec.rb — converted to a shared example group run against both adapters
  • README.md — documents the backend choice, runtime requirements, and how to opt back into rugged
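To make "parses unified diff hunks for added line numbers" concrete, here is a rough sketch of that technique. `added_lines` and `HUNK_HEADER` are hypothetical names, and the actual GitAdapter may be structured quite differently. The sketch uses the line counts in each `@@` header to know where a hunk ends, so that `---` and `+++` file headers are not mistaken for removed or added lines.

```ruby
# Hypothetical helper, not the PR's GitAdapter: maps each file in a unified
# diff to the new-file line numbers of its added lines.
HUNK_HEADER = /\A@@ -\d+(?:,(\d+))? \+(\d+)(?:,(\d+))? @@/

def added_lines(diff_text)
  result = Hash.new { |hash, key| hash[key] = [] }
  path = nil
  old_left = 0   # old-file lines still expected in the current hunk
  new_left = 0   # new-file lines still expected in the current hunk
  new_lineno = 0

  diff_text.each_line(chomp: true) do |line|
    if old_left.zero? && new_left.zero?
      # Outside a hunk: only file headers and hunk headers matter.
      if line.start_with?('+++ ')
        path = line.delete_prefix('+++ ').delete_prefix('b/')
      elsif (m = HUNK_HEADER.match(line))
        old_left = (m[1] || 1).to_i   # an omitted count means 1
        new_lineno = m[2].to_i
        new_left = (m[3] || 1).to_i
      end
      next
    end

    case line[0]
    when '+'
      result[path] << new_lineno
      new_lineno += 1
      new_left -= 1
    when '-'
      old_left -= 1
    when '\\'
      # "\ No newline at end of file" marker: counts toward neither side.
    else
      # Context line: present in both the old and the new file.
      new_lineno += 1
      new_left -= 1
      old_left -= 1
    end
  end
  result
end

sample = <<~DIFF
  diff --git a/lib/foo.rb b/lib/foo.rb
  --- a/lib/foo.rb
  +++ b/lib/foo.rb
  @@ -1,3 +1,4 @@
   def foo
  -  1
  +  2
  +  3
   end
DIFF

# Reports lines 2 and 3 of lib/foo.rb as added.
p added_lines(sample)
```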

Split into five reviewable commits:

  1. Extract Rugged code into a Changeset adapter (pure refactor)
  2. Add a GitAdapter backed by the git gem (new backend + parameterized specs)
  3. Drop rugged as a required dependency (the gemspec change)
  4. Document the git/rugged backend choice in README
  5. Note the system git CLI requirement of the default backend
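The division of labor behind commit 1 (the adapter handles workdir lookup, file/line iteration, and emptiness checks, while Changeset keeps the filtering) might look roughly like the sketch below. Every class and method name here is an illustrative assumption rather than the PR's API, and the in-memory adapter simply stands in for a real git-backed one.

```ruby
# Illustrative stand-ins only; names are assumptions, not the PR's actual API.
class InMemoryAdapter
  attr_reader :workdir

  # changes: { 'path' => [added line numbers] }
  def initialize(workdir, changes)
    @workdir = workdir
    @changes = changes
  end

  # Yields each (path, line number) pair for added lines.
  def each_added_line
    @changes.each { |path, lines| lines.each { |lineno| yield path, lineno } }
  end

  def empty?
    @changes.all? { |_path, lines| lines.empty? }
  end
end

class SketchChangeset
  def initialize(adapter, only: /\.rb\z/)
    @adapter = adapter
    @only = only
  end

  def empty?
    @adapter.empty?
  end

  # Filtering stays here, so every adapter is subject to the same rules.
  def added_lines
    result = Hash.new { |hash, key| hash[key] = [] }
    @adapter.each_added_line do |path, lineno|
      result[path] << lineno if path.match?(@only)
    end
    result
  end
end

adapter = InMemoryAdapter.new('/tmp/repo', { 'lib/a.rb' => [3, 4], 'README.md' => [1] })
changeset = SketchChangeset.new(adapter)
p changeset.added_lines
```

Keeping the adapter surface this small is what makes it practical to run one shared example group against both backends.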

Test plan

  • bundle exec rspec — all 206 examples pass with both adapters
  • bundle exec rspec spec/changeset_spec.rb — 18 examples (9 specs × 2 adapters), full contract parity
  • bundle exec rubocop — clean
  • Verified fallback: when rugged is unloadable, Changeset.default_adapter_class returns GitAdapter and end-to-end behavior matches
  • CI on the PR

🤖 Generated with Claude Code

stanhu and others added 5 commits May 6, 2026 14:42
Move all direct Rugged usage out of Changeset into a new
RuggedAdapter. Changeset now delegates workdir lookup, file/line
diff iteration, and emptiness checks to an adapter, keeping
filtering and validation logic in Changeset itself. No behavior
change.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Implement a Changeset adapter on top of the pure-Ruby git gem so
undercover can read changes without depending on libgit2. Parses
unified diff hunks to recover added line numbers, mirroring the
RuggedAdapter contract. Parameterize changeset_spec over both
adapters to lock the contract.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Remove rugged from undercover.gemspec so installing undercover no
longer pulls in the libgit2 native extension. RuggedAdapter is
still preferred at runtime when rugged happens to be loadable, so
existing users see no change. Keep rugged in the dev Gemfile so
both adapters get exercised by the test suite.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Explain that undercover ships with the pure-Ruby git gem by
default and opts into rugged automatically when it is in the
user's bundle.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Call out in the Git backend section that the git gem shells out
to a system git binary, which is universally present on dev and
CI machines but may need to be added to minimal production images
(distroless, slim Docker bases). Point users at rugged as the
escape hatch when installing git is not an option.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>