Skip to content

[GR-74799] Make JFR virtual thread registration lazy again#13318

Open
graalvmbot wants to merge 4 commits into
masterfrom
GR-74799
Open

[GR-74799] Make JFR virtual thread registration lazy again#13318
graalvmbot wants to merge 4 commits into
masterfrom
GR-74799

Conversation

@graalvmbot
Copy link
Copy Markdown
Collaborator

@graalvmbot graalvmbot commented Apr 13, 2026

This change restores lazy virtual-thread registration for JFR thread constants by registering vthreads only when a thread id is actually written to JFR data, instead of eagerly on every mount. That avoids retaining unreferenced short-lived virtual threads in RecordingStream consumers while preserving chunk-rotation correctness.

It also adds a regression test that keeps a RecordingStream active, creates short-lived virtual threads that do not emit watched events, and verifies their ids do not appear in the JFR thread constant pool.

Issue described at #13308

@roberttoyonaga
Copy link
Copy Markdown
Collaborator

Thank you for implementing this. I think switching to lazy vthread registration is indeed the correct way to fix the problem described by #13308. Although, I still need some more time to finish reviewing this.

Side note: we never had lazy vthread registration to begin with. I tried to implement it a while ago in this PR #9570 but we ultimately decided against it at the time.

@graalvmbot graalvmbot force-pushed the GR-74799 branch 2 times, most recently from 28481a6 to 93e6445 Compare April 17, 2026 12:38
@thomaswue
Copy link
Copy Markdown
Member

@roberttoyonaga This should be in a mergeable state now.

Copy link
Copy Markdown
Collaborator

@roberttoyonaga roberttoyonaga left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @thomaswue for working on implementing this. I agree with the approach and I think this looks very good. I have left a few minor comments below.


I ran the reproducer with your changes and it looks good.

Image

}

protected RecordingStream startStream(String[] events, Consumer<RecordingStream> configurer) throws Throwable {
RecordingStream stream = new RecordingStream();
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it's better to keep the old way of providing the default configuration. It doesn't make a huge difference, but helps to exercise the JFR infrastructure more.

new RecordingStream(getDefaultConfiguration());

public void test() throws Throwable {
if (!HasJfrSupport.get()) {
/* Prevent that the code below is reachable on platforms that don't support JFR. */
Assert.fail("JFR is not supported on this platform.");
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we don't need this check here (and below too) since this class already extends AbstractJfrTest which has:

    @BeforeClass
    public static void checkForJFR() {
        assumeTrue("skipping JFR tests", !ImageInfo.inImageCode() || HasJfrSupport.get());
    }

emittedBeforeRotation.incrementAndGet();

while (!proceed) {
// Busy-wait so the virtual thread stays mounted on the carrier.
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That means this test doesn't verify any of the new getEventWriter() re-registration code. Since they are still mounted, the vthreads are re-registered after the chunk rotation in the old pre-existing JfrThreadRepository.registerRunningThreads() code.

This is still a useful test. I just want to double check the intention.

String[] events = new String[]{JfrEvent.ThreadStart.getName()};
Recording recording = startRecording(events);

recording.dump(createTempJfrFile());
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the reason for the induced chunk rotation here?

String[] events = new String[]{JfrEvent.ExecuteVMOperation.getName()};
Recording recording = startRecording(events);

recording.dump(createTempJfrFile());
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the reason for the chunk rotation here?

  1. Chunk rotation
  2. Virtual thread created
  3. Emission of ExecuteVMOperationEvent involving re-registration of vthread.

Should the order of 1 and 2 be flipped?

AtomicInteger completedThreads = new AtomicInteger();
VirtualStressor.executeAsync(THREADS, () -> {
try {
long threadId = (Long) Thread.class.getMethod("threadId").invoke(Thread.currentThread());
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not use Thread.currentThread().threadId() ?

JfrNativeEventWriter.putLong(data, SubstrateJVM.get().getStackTraceId(JfrEvent.ThreadStart));
JfrNativeEventWriter.putThread(data, thread);
JfrNativeEventWriter.putThread(data, JavaThreads.getParentThreadId(thread));
JfrNativeEventWriter.putRegisteredThreadId(data, parentThreadId, parentVThreadName);
Copy link
Copy Markdown
Collaborator

@roberttoyonaga roberttoyonaga Apr 17, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not just call putThread(parentThread)? In JfrThreadLocal.beforeThreadStart we are getting the parent Thread instance now anyway. So extracting the name and ID in order to call putRegisteredThreadId here seem a bit unnecessary.

using putThread(parentThread) also has the benefit of updating Target_java_lang_VirtualThread.jfrEpochId (which isn't a matter of correctness, but provides a small optimization)


@Uninterruptible(reason = "Locking without transition requires that the whole critical section is uninterruptible.")
private void registerThread0(long threadId, String name, long osThreadId, boolean isVirtual, ThreadGroup threadGroup) {
registerThread0(threadId, name, osThreadId, isVirtual, threadGroup, null);
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be okay to just inline this? It only has one caller.

return 0;
}

public static String getOptionalCurrentThreadName() {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be clearer to rename this to getOptionalCurrentVThreadName ?

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method is no longer used and can be removed. Afterward, it might be worth renaming registerThreadGroup0 to registerThreadGroup, since registerThreadGroup no longer exists.

* allocating thread id, so virtual threads must be registered while their Thread object
* is still available.
*/
SubstrateJVM.getThreadRepo().registerThread(thread);
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OldObjectSample events are special because they are only emitted at the end of a recording, not every chunk. That means this early registration technique that was used for ExecutionSample events cannot be reused here. Registration may happen in chunk_1, but the events may not be emitted until the recording ends in chunk_2. This would split up the constant data and event data -- leading to broken references.

This is actually a pre-existing problem with platform threads and other constants too. I pointed it out a while ago here: #10206 (comment)

I would suggest not fixing this in the current PR because the fix will be a bit involved. I just wanted to point it out here.

There is an old issue open for this too: #10247
After your PR is integrated. I will close my old PR and work on a fresh fix built on your latest changes.

@graalvmbot graalvmbot force-pushed the GR-74799 branch 3 times, most recently from 6f81e85 to 85fc2c4 Compare May 24, 2026 18:29
Localize thread-start metadata handling to JFR, centralize optional current-thread name lookup, and keep the web-image override aligned with PlatformThreads.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

OCA Verified All contributors have signed the Oracle Contributor Agreement.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants