Arm backend: Add event profiling to VGF backend #19703
Signed-off-by: Elena Zhelezina <elena.zhelezina@arm.com>
Change-Id: I26c6c8857744e91911daa4cf52ce7260e452e72e
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/19703
Note: Links to docs will display an error until the docs builds have been completed.
❗ 1 Active SEV: there is 1 currently active SEV.
❌ 1 New Failure: as of commit af840eb with merge base 1feb56c, one job has failed.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
    }

    bool VgfRepr::init_timestamp_queries() {
      const char* enable = std::getenv("EXECUTORCH_VGF_ENABLE_TIMESTAMP_QUERIES");
Did you consider using backend options instead of an environment variable?
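For context on the environment-variable gate being discussed, here is a minimal sketch of how such a flag might be parsed. The helper names `parse_flag_value` and `env_flag_enabled` and the accepted values ("1", "true", "on") are assumptions for illustration; the PR's actual parsing logic is not shown in this hunk.

```cpp
#include <cstdlib>
#include <cstring>

// Hypothetical helper, not taken from the PR: interprets a flag string.
// Assumption: only "1", "true", and "on" enable the feature; a null pointer
// (variable unset) or any other value leaves it disabled.
inline bool parse_flag_value(const char* value) {
  if (value == nullptr) {
    return false;
  }
  return std::strcmp(value, "1") == 0 || std::strcmp(value, "true") == 0 ||
      std::strcmp(value, "on") == 0;
}

// Hypothetical wrapper: reads the named environment variable and parses it.
inline bool env_flag_enabled(const char* name) {
  return parse_flag_value(std::getenv(name));
}
```

Keeping the string parsing separate from the `std::getenv` call means the same parser could be reused if the setting were later supplied some other way, such as a backend option.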
    if (!valid_vgf) {
    #ifdef ET_EVENT_TRACER_ENABLED
      event_tracer_end_profiling_delegate(event_tracer, init_total_event);
Do you see the nested profiling events adding noise? That is, does the time spent recording the inner profiling event get measured as part of the outer one? I understand the nesting is needed for Chrome trace etc.; I just want to make sure the recording overhead isn't too high.
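One way to reason about this question is a small standalone experiment. The sketch below uses a hypothetical `SketchTracer` and `ScopedRange` built on `std::chrono`; these are stand-ins and not the ExecuTorch event tracer API. It shows that whatever it costs to open and close the inner range is attributed to the outer range.

```cpp
#include <chrono>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for an event tracer: stores (name, duration) records.
struct SketchTracer {
  struct Record {
    std::string name;
    std::int64_t duration_ns;
  };
  std::vector<Record> records;
};

// RAII range that appends its own duration to the tracer when destroyed.
class ScopedRange {
  using Clock = std::chrono::steady_clock;

 public:
  ScopedRange(SketchTracer& tracer, const char* name)
      : tracer_(tracer), name_(name), start_(Clock::now()) {}
  ~ScopedRange() {
    const auto elapsed = std::chrono::duration_cast<std::chrono::nanoseconds>(
        Clock::now() - start_);
    tracer_.records.push_back(
        {name_, static_cast<std::int64_t>(elapsed.count())});
  }
  ScopedRange(const ScopedRange&) = delete;
  ScopedRange& operator=(const ScopedRange&) = delete;

 private:
  SketchTracer& tracer_;
  const char* name_;
  Clock::time_point start_;
};

// Returns outer duration minus inner duration in nanoseconds. The difference
// is time charged to the outer range but not to the inner one, which here is
// essentially the cost of starting and recording the inner range.
inline std::int64_t outer_minus_inner_ns(int iterations) {
  SketchTracer tracer;
  {
    ScopedRange outer(tracer, "OUTER");
    {
      ScopedRange inner(tracer, "INNER");
      volatile int sink = 0;  // volatile keeps the loop from being removed
      for (int i = 0; i < iterations; ++i) {
        sink = sink + i;
      }
    }
  }
  // The inner range is destroyed first, so records[0] is INNER.
  return tracer.records[1].duration_ns - tracer.records[0].duration_ns;
}
```

Running `outer_minus_inner_ns` repeatedly and comparing the result against the inner duration gives a rough feel for the relative overhead; the absolute numbers depend on the tracer implementation and the platform.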
## Profiling of VGF Backend

VGF profiling now emits both host-side ExecuTorch event tracer ranges and Vulkan timestamp-query measurements.

The host ranges split init into `VGF_INIT_*` phases, including `VGF_INIT_CREATE_DATA_GRAPH_PIPELINE`, and split execute into `VGF_COPY_INPUTS`, `VGF_QUEUE_SUBMIT`, `VGF_QUEUE_WAIT_IDLE`, `VGF_TIMESTAMP_QUERY_READBACK`, `VGF_DISPATCH_AND_WAIT`, and `VGF_COPY_OUTPUTS`.

Vulkan timestamp queries are inserted into the recorded VGF command buffer around `vkCmdDispatchDataGraphARM()`, producing `VGF_DATA_GRAPH_DEVICE_TIME`, which measures device-side elapsed time for the submitted data-graph command buffer region.

To collect a profile, build the VGF runner with event tracing enabled, run the model with an ETDump path, then convert the ETDump to Chrome trace JSON:
Are you planning to expand this beyond time measurements and Chrome trace, for example to hardware performance counters such as occupancy?
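On the time-measurement side described in the profiling notes above, a pair of Vulkan timestamp-query values is typically turned into elapsed time using `VkPhysicalDeviceLimits::timestampPeriod` (nanoseconds per tick) and the queue family's `timestampValidBits`. The sketch below shows that arithmetic only; the function name `timestamp_delta_ns` is hypothetical, and this is not necessarily how the PR computes `VGF_DATA_GRAPH_DEVICE_TIME`.

```cpp
#include <cstdint>

// Hypothetical helper: converts a begin/end pair of raw timestamp-query
// values into elapsed nanoseconds.
//   timestamp_period_ns: nanoseconds per tick, as reported in
//                        VkPhysicalDeviceLimits::timestampPeriod.
//   valid_bits:          number of meaningful low bits, as reported in
//                        VkQueueFamilyProperties::timestampValidBits.
inline double timestamp_delta_ns(
    std::uint64_t begin_ticks,
    std::uint64_t end_ticks,
    float timestamp_period_ns,
    std::uint32_t valid_bits) {
  // Build a mask of the valid low bits; shifting by 64 would be undefined,
  // so that case is handled separately.
  const std::uint64_t mask = valid_bits >= 64
      ? ~std::uint64_t{0}
      : ((std::uint64_t{1} << valid_bits) - 1);
  // Unsigned subtraction followed by masking tolerates one counter wrap.
  const std::uint64_t delta_ticks = (end_ticks - begin_ticks) & mask;
  return static_cast<double>(delta_ticks) *
      static_cast<double>(timestamp_period_ns);
}
```

For example, with a period of 2.0 ns per tick and 64 valid bits, tick values 100 and 350 correspond to 500 ns of device time. A `valid_bits` of 0 means the queue does not support timestamps, and the helper then returns 0.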
Add event profiling to VGF backend.
cc @digantdesai @freddan80 @per @zingo @oscarandersson8218 @mansnils @Sebastian-Larsson @robell @rascani