[GR-74799] Make JFR virtual thread registration lazy again#13318
[GR-74799] Make JFR virtual thread registration lazy again#13318graalvmbot wants to merge 4 commits into
Conversation
|
Thank you for implementing this. I think switching to lazy vthread registration is indeed the correct way to fix the problem described by #13308. Although, I still need some more time to finish reviewing this. Side note: we never had lazy vthread registration to begin with. I tried to implement it a while ago in this PR #9570 but we ultimately decided against it at the time. |
28481a6 to
93e6445
Compare
|
@roberttoyonaga This should be in a mergeable state now. |
There was a problem hiding this comment.
Thank you @thomaswue for working on implementing this. I agree with the approach and I think this looks very good. I have left a few minor comments below.
I ran the reproducer with your changes and it looks good.
| } | ||
|
|
||
| protected RecordingStream startStream(String[] events, Consumer<RecordingStream> configurer) throws Throwable { | ||
| RecordingStream stream = new RecordingStream(); |
There was a problem hiding this comment.
Maybe it's better to keep the old way of providing the default configuration. It doesn't make a huge difference, but helps to exercise the JFR infrastructure more.
new RecordingStream(getDefaultConfiguration());
| public void test() throws Throwable { | ||
| if (!HasJfrSupport.get()) { | ||
| /* Prevent that the code below is reachable on platforms that don't support JFR. */ | ||
| Assert.fail("JFR is not supported on this platform."); |
There was a problem hiding this comment.
Maybe we don't need this check here (and below too) since this class already extends AbstractJfrTest which has:
@BeforeClass
public static void checkForJFR() {
assumeTrue("skipping JFR tests", !ImageInfo.inImageCode() || HasJfrSupport.get());
}
| emittedBeforeRotation.incrementAndGet(); | ||
|
|
||
| while (!proceed) { | ||
| // Busy-wait so the virtual thread stays mounted on the carrier. |
There was a problem hiding this comment.
That means this test doesn't verify any of the new getEventWriter() re-registration code. Since they are still mounted, the vthreads are re-registered after the chunk rotation in the old pre-existing JfrThreadRepository.registerRunningThreads() code.
This is still a useful test. I just want to double check the intention.
| String[] events = new String[]{JfrEvent.ThreadStart.getName()}; | ||
| Recording recording = startRecording(events); | ||
|
|
||
| recording.dump(createTempJfrFile()); |
There was a problem hiding this comment.
What is the reason for the induced chunk rotation here?
| String[] events = new String[]{JfrEvent.ExecuteVMOperation.getName()}; | ||
| Recording recording = startRecording(events); | ||
|
|
||
| recording.dump(createTempJfrFile()); |
There was a problem hiding this comment.
What is the reason for the chunk rotation here?
- Chunk rotation
- Virtual thread created
- Emission of ExecuteVMOperationEvent involving re-registration of vthread.
Should the order of 1 and 2 be flipped?
| AtomicInteger completedThreads = new AtomicInteger(); | ||
| VirtualStressor.executeAsync(THREADS, () -> { | ||
| try { | ||
| long threadId = (Long) Thread.class.getMethod("threadId").invoke(Thread.currentThread()); |
There was a problem hiding this comment.
Why not use Thread.currentThread().threadId() ?
| JfrNativeEventWriter.putLong(data, SubstrateJVM.get().getStackTraceId(JfrEvent.ThreadStart)); | ||
| JfrNativeEventWriter.putThread(data, thread); | ||
| JfrNativeEventWriter.putThread(data, JavaThreads.getParentThreadId(thread)); | ||
| JfrNativeEventWriter.putRegisteredThreadId(data, parentThreadId, parentVThreadName); |
There was a problem hiding this comment.
Why not just call putThread(parentThread)? In JfrThreadLocal.beforeThreadStart we are getting the parent Thread instance now anyway. So extracting the name and ID in order to call putRegisteredThreadId here seem a bit unnecessary.
using putThread(parentThread) also has the benefit of updating Target_java_lang_VirtualThread.jfrEpochId (which isn't a matter of correctness, but provides a small optimization)
|
|
||
| @Uninterruptible(reason = "Locking without transition requires that the whole critical section is uninterruptible.") | ||
| private void registerThread0(long threadId, String name, long osThreadId, boolean isVirtual, ThreadGroup threadGroup) { | ||
| registerThread0(threadId, name, osThreadId, isVirtual, threadGroup, null); |
There was a problem hiding this comment.
Would it be okay to just inline this? It only has one caller.
| return 0; | ||
| } | ||
|
|
||
| public static String getOptionalCurrentThreadName() { |
There was a problem hiding this comment.
Would it be clearer to rename this to getOptionalCurrentVThreadName ?
There was a problem hiding this comment.
This method is no longer used and can be removed. Afterward, it might be worth renaming registerThreadGroup0 to registerThreadGroup, since registerThreadGroup no longer exists.
| * allocating thread id, so virtual threads must be registered while their Thread object | ||
| * is still available. | ||
| */ | ||
| SubstrateJVM.getThreadRepo().registerThread(thread); |
There was a problem hiding this comment.
OldObjectSample events are special because they are only emitted at the end of a recording, not every chunk. That means this early registration technique that was used for ExecutionSample events cannot be reused here. Registration may happen in chunk_1, but the events may not be emitted until the recording ends in chunk_2. This would split up the constant data and event data -- leading to broken references.
This is actually a pre-existing problem with platform threads and other constants too. I pointed it out a while ago here: #10206 (comment)
I would suggest not fixing this in the current PR because the fix will be a bit involved. I just wanted to point it out here.
There is an old issue open for this too: #10247
After your PR is integrated. I will close my old PR and work on a fresh fix built on your latest changes.
6f81e85 to
85fc2c4
Compare
Localize thread-start metadata handling to JFR, centralize optional current-thread name lookup, and keep the web-image override aligned with PlatformThreads.
This change restores lazy virtual-thread registration for JFR thread constants by registering vthreads only when a thread id is actually written to JFR data, instead of eagerly on every mount. That avoids retaining unreferenced short-lived virtual threads in RecordingStream consumers while preserving chunk-rotation correctness.
It also adds a regression test that keeps a RecordingStream active, creates short-lived virtual threads that do not emit watched events, and verifies their ids do not appear in the JFR thread constant pool.
Issue described at #13308