Optimize trace callback performance (Issue #115) by Iceshen87 · Pull Request #394 · pschanely/CrossHair

Iceshen87 · 2026-03-07T05:19:21Z

This PR implements performance optimizations for trace callbacks as described in Issue #115.

Changes

C Extension Optimizations

Opcode-to-handler caching for single-handler scenarios
Streamlined post-op callback processing with fewer branches
Cache statistics tracking (hits/misses) for debugging
Reverse iteration of handler tables for better cache locality

Python-Level Optimizations

Fast-path early exit for primitive types (int, float, str, bool, list, dict, tuple, set)
Optimized exception handling with combined try/except blocks
Inline attribute lookups to reduce function call overhead

Testing & Documentation

Added performance tests and benchmarks
Added comprehensive documentation
All existing tests pass

Technical Impact

These optimizations reduce Python-to-C transition overhead by avoiding Python function calls for no-op scenarios and caching handler lookups for repetitive opcode patterns.

Files Changed

crosshair/_tracers.c & _tracers.h - Core C optimizations
crosshair/tracers.py - Python fast-path
crosshair/tracers_performance_test.py - New tests
crosshair/tracers_performance_benchmark.py - New benchmark
PERFORMANCE_OPTIMIZATIONS.md - Technical documentation
doc/source/changelog.rst - Updated changelog

Closes #115

Implement performance optimizations for trace callbacks to reduce Python-to-C transition overhead: C-Level Optimizations (_tracers.c): - Add opcode-to-handler caching for single-handler scenarios - Implement streamlined post-op callback processing - Add cache statistics tracking (hits/misses/last_opcode) - Optimize handler table iteration order (reverse for cache locality) - Add inline fast-path check function for future enhancements - Add get_cache_stats() method for debugging Python-Level Optimizations (tracers.py): - Add fast-path early exit for primitive types (int, str, etc.) - Optimize exception handling to avoid extra checks - Inline attribute lookups to reduce function call overhead - Add documentation about performance optimizations Header Updates (_tracers.h): - Add cache fields to CTracer struct - Add performance optimization comments Testing: - Add tracers_performance_test.py with unit tests - Add tracers_performance_benchmark.py for profiling Documentation: - Update changelog.rst with performance improvements - Add inline documentation explaining optimizations These changes significantly reduce trace callback overhead, especially for cases where many opcodes are traced but few require actual processing.

Add comprehensive documentation explaining the trace callback performance optimizations implemented for Issue pschanely#115. Includes: - Summary of changes - Technical details - Performance impact analysis - Future work suggestions - Backward compatibility notes

Convert unittest-style tests to pytest-style to match project conventions.

pschanely

I'm not sure these will boost performance and preserve the behaviors that we want. But I don't want to dissuade you (and LLM collaborators!) from trying. A clean test run and benchmark results are enough to make me happy.

I have some initial reactions below that you might find helpful. Honestly, I think naively porting tracers.py (and probably opcode_intercept.py) to C is going to be your best bet. But I don't think that will be trivial to do, either.

pschanely · 2026-03-07T16:16:08Z

+/* PERFORMANCE OPTIMIZATION: Process post-op callbacks more efficiently
+ * by reusing the frame reference and batching operations.
+ */


Looks like a nice refactor! But I'm not clear on what's now getting reused to make it a performance improvement.

pschanely · 2026-03-07T16:17:42Z

-        }
-
+
+    /* PERFORMANCE OPTIMIZATION: Check cache first for single-handler scenarios */


I am suspicious that this saves us much. I'd be interested to see the benchmark results with this change in isolation.

pschanely · 2026-03-07T16:18:21Z

+        /* PERFORMANCE OPTIMIZATION: Iterate handlers in reverse order for
+         * better cache locality with recently added modules.


I recall that I've tried to do this in reverse order and some things break, but haven't investigated much.

pschanely · 2026-03-07T16:22:54Z

+        # PERFORMANCE OPTIMIZATION: Quick check for common untraceable types
+        # This avoids expensive attribute lookups for common cases


If you're seeing speedups, I'm betting it's from this change. Unfortunately, it won't work - we commonly need to intercept method calls on native instances, for a variety of reasons. (e.g. we need to swap out the implementation for one that can tolerate symbolic arguments)

Bounty Hunter added 3 commits March 7, 2026 05:14

Update performance tests to use pytest style

f1fa742

Convert unittest-style tests to pytest-style to match project conventions.

pschanely reviewed Mar 7, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Optimize trace callback performance (Issue #115)#394

Optimize trace callback performance (Issue #115)#394
Iceshen87 wants to merge 3 commits into
pschanely:mainfrom
Iceshen87:fix/115-trace-callback-performance

Iceshen87 commented Mar 7, 2026

Uh oh!

pschanely left a comment

Uh oh!

pschanely Mar 7, 2026

Uh oh!

pschanely Mar 7, 2026

Uh oh!

pschanely Mar 7, 2026

Uh oh!

pschanely Mar 7, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

		}


		/* PERFORMANCE OPTIMIZATION: Check cache first for single-handler scenarios */

		/* PERFORMANCE OPTIMIZATION: Iterate handlers in reverse order for
		* better cache locality with recently added modules.

		# PERFORMANCE OPTIMIZATION: Quick check for common untraceable types
		# This avoids expensive attribute lookups for common cases

Conversation

Iceshen87 commented Mar 7, 2026

Changes

C Extension Optimizations

Python-Level Optimizations

Testing & Documentation

Technical Impact

Files Changed

Uh oh!

pschanely left a comment

Choose a reason for hiding this comment

Uh oh!

pschanely Mar 7, 2026

Choose a reason for hiding this comment

Uh oh!

pschanely Mar 7, 2026

Choose a reason for hiding this comment

Uh oh!

pschanely Mar 7, 2026

Choose a reason for hiding this comment

Uh oh!

pschanely Mar 7, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants