Skip to content

sysMon Parser

The sysMon Parser consumes profiling data generated by the sysMonApp profiler service or sysMon DSP profiler to generate profiling reports.

The sysMon Parser executable located in the following folders:

  • Windows version: $HEXAGON_SDK_ROOT/tools/utils/sysmon/parser_win_v2/HTML_Parser
  • Linux version: $HEXAGON_SDK_ROOT/tools/utils/sysmon/parser_linux_v2/HTML_Parser

The parser generates an HTML report along with some CSV files.

Usage

HTML sysMon Parser usage:

sysmon_parser <input_file> [--tlp <input_tlp_file>] [--outdir <output_path>] [--summary]
Argument Description
input_file Input binary generated by the sysMonApp profile service or the sysMon profiler.
--tlp <input_tlp_file> Input TLP binary generated by the sysMonApp TLP service.
--outdir <output_path> Folder in which to generate the profiling reports. Default is current directory.
--summary Generate only a sysMon summary report HTML file and skip the generation of all CSV files.

Non-HTML sysMon Parser usage:

SysmonParser <input_file> <output_path> <mode_type>
Argument Description
input_file Input binary generated by the sysMonApp profile service or the sysMon profiler.
output_path Folder in which to generate the profiling reports.
mode_type Identify which mode was used during the collection of the profiling data. Valid values are default and user.

Files generated

The sysMon Parser generates the following files from the given input_file:

File Description
sysmon_report.html Detailed summary report with plots, pie charts, and an HVX analysis (as applicable).
post_processed_metrics.csv Includes all metrics computed for all the samples read from the input sysMon raw binary file. This file is useful for plotting metrics against time, and correlating these metrics with other logs.
raw_pmu.csv Includes all collected PMU raw values for all samples in the BIN file. This file is useful for computing any metrics relevant to the end user.
pmuStats.csv Includes all PMU raw values accumulated for the entire duration of the profiling. (Because not all PMU data are sampled in any given time, missing data is interpolated from the last known values.)

Following files are generated when STID/MarkerId enabled:

File Description
SysMonMarkerID_STID.html Detailed summary report for STID filtered profiling (if enabled) and Markers (if enabled) data.
STID_<STID_value>_Metrics.csv Detailed postprocessed metrics of the software thread bearing <STID_value> as its STID.
MarkerID_<Marker_value>_PP_Metrics.csv Detailed postprocessed metrics of the marked region bearing <Marker_value> as it's marker identifier.

Following files are generated when TLP binary is provided using --tlp option:

File Description
SysMontlp_report.html Detailed summary report with thread level profiling summary and plots.
SysMontlp_rawdata.csv Detailed postprocessed metrics for all the samples read from the input TLP data.
SysMontlp_summary_report.csv Summary containing averages per thread.

Here are some examples of sysmon_report.html summary report:

screenshot screenshot screenshot screenshot

screenshot screenshot screenshot screenshot screenshot

Here is a sample from a post_processed_metrics.csv file:

Sample Index Sampling period(ms) SysClock Time(ms) TimeStamp Effec Q6 Freq(MHz) QDSP6 clk(MHz) Bus Clk vote(MHz) Measured MEMNOC(MHz) MPPS pCPP AXI_RD_BW_128B AXI_WR_BW_128B IU stalls
1 1 16:9:30:657 13.12 1171.2 806 1555.21 2.96 4.43 0.8 0.48 2.1
1 13.8 16:9:30:671 13.12 1171.2 806 1555.21 2.96 4.43 0.8 0.48 2.1

Starting Lahaina, profiling data can capture samples from aDSP/sDSP while in Island mode as well. The parser generates following additional files for samples collected in Island mode:

File Description
post_processed_metrics_island.csv Includes all Island mode metrics computed from raw samples read from the input sysMon raw binary file.
raw_pmu_island.csv Includes raw PMU values corresponding to Island mode operation

Here is an example of sysmon_report.html with Island report button under 'Detailed report' column to view a summary of Island mode execution.

screenshot screenshot screenshot

STID and Markers data

Here are some examples of SysMonMarkerID_STID.html report generated with STID filtering/Markers enabled:

screenshot screenshot

Here is an example of MarkerID_5_PP_Metrics.csv containing postprocessed metrics associated with the marker ID as 5 (<Marker_value> = 5): screenshot

TLP data

Here is an example of SysMontlp_report.html report: screenshot

Here is an example of SysMontlp_rawdata.csv report containing detailed thread-level profiling data: screenshot

Metric descriptions

Here are some of the common metrics found in the reports:

Metric Unit Description
MPPS Million packets per second Number of DSP packets. Average MPPS of a real-time use case is constant and independent of the core clock. An increase in MPPS for a non-real-time use case for a given clock indicates effective utilization of L1 and L2 cache.
HVX Thread MPPS Million packets per second Number of packets executed by the HVX coprocessor. MPPS metrics captures both scalar core and HVX core packets. The MPPS executed on the scalar DSP threads can be calculated: DSP scalar Thread MPPS = (MPPS - HVX Thread MPPS).
Effective Q6 frequency MHz Actual load on the processor. The ratio of the effective DSP frequency and NPA core clock frequency can be used to get DSP usage: DSP usage = (Effective DSP frequency / NPA core clock). A DSP usage percentage approaching 1 indicates that the DSP core must run at a higher frequency to avoid any glitches or frame drops. MPPS and pCPP metrics together can be used to decide if the DSP core clock vote or bus clock vote must be adjusted in this case.
pCPP Processor cycles per packet Average processor cycles taken per packet. The lower the pCPP factor, the more the work is done for a given core clock frequency. Core stalls due to bus accesses can result in a higher pCPP factor. Increasing the bus clock vote or prefetching data memory before actual usage can help in lowering this factor and increasing the work done for a given core clock frequency.
IU stall frequency MHz Derived from measured cycles that the core has stalled on instruction unit cache accesses due to misses. The higher the IU stall frequency, the higher the pCPP factor can be.
DU stall frequency MHz Derived from measured cycles that the core has stalled on accessing L1 data cache lines due to misses. The higher the DU stall frequency, the higher the pCPP factor can be. DMT (Dynamic Multi Threading) uses DU stalls of stalled thread and schedules other threads for efficient utilization of the core clock.
AXI cached read/write bandwidth MBps AXI bus bandwidth (DDR accesses) generated by read/write access from the core due to L2 cache line misses. Misses include both demand and prefetch misses in L2 cache.
L2 fetch bandwidth MBps Bus bandwidth generated by the L2fetch instruction to prefetch data into L2 cache.
Clock votes MHz Core clock vote for the frequency the DSP is running at. A bus clock vote captures the overall DSP vote for bus clock. Also, the final bus clock frequency (done outside of the DSP) will be based on votes from other subsystems (application processor, modem, and so on).
Static clock votes MHz Aggregated static votes from all clients for core and bus clocks.
DCVS clock votes MHz DCVS vote for core and bus clocks.