sysMonApp

sysMonApp is an Android application that configures and displays data from multiple performance-related services on the cDSP, aDSP, and sDSP. These services are explained in the following sections.

Service        Description
Profiler       Profile any DSP to gather clock information, resource usage, processor and thread load distribution, bus bandwidth metrics, and so on.
Getstate       Get the current clock voting information and heap statistics of each static Protection Domain.
DCVS           Enable or disable the Dynamic Clock Voltage Scaling (DCVS) feature executing on each DSP.
Clocks         Set the DSP core clock and bus clocks.
Thread info    For each active software thread, display their priority, declared stack size, and maximum stack used.
TLP            Thread-level profiling (TLP) that monitors software thread activity.
ETM trace      Enable the ETM to trace the instruction and data memory and provide information about dynamic modules loaded on the cDSP.
HTA profiler   Profile the HTA to gather clock information, DDR bandwidth, and various HTA activity statistics.
getPowerStats  This service reports time spent in power collapse, low power island (LPI) and each core clock level of selected Q6 subsystem since device boot up or last reset.
pinfo          Display the statistics (e.g. thread priority, stack size) of FastRPC-spawned user processes.
getinfo        Display information about the DSPs.
getload        Display DSP load statistics.

sysMonApp supports all targets covered by the Hexagon SDK with the exception of Lahaina. For more details, see the feature matrix.

Setup

The Hexagon SDK includes the following LA and LE variants of sysMonApp. These variants are present at the indicated locations:

  • LA 64-bit version: ${HEXAGON_SDK_ROOT}/tools/utils/sysmon/sysMonApp
  • LA 32-bit version: ${HEXAGON_SDK_ROOT}/tools/utils/sysmon/sysMonApp_32Bit
  • LE 64-bit version: ${HEXAGON_SDK_ROOT}/tools/utils/sysmon/sysMonAppLE
  • LE 32-bit version: ${HEXAGON_SDK_ROOT}/tools/utils/sysmon/sysMonAppLE_32Bit

Push the version for your device to a location of your choice, and then change the permissions to make the file executable. For example, on rooted Android devices, we recommend pushing the executable to a folder on the data partition:

adb push ${HEXAGON_SDK_ROOT}/tools/utils/sysmon/sysMonApp /data/local/tmp/
adb shell chmod 777 /data/local/tmp/sysMonApp

Execute sysMonApp from the ADB shell by entering the following command:

adb shell /data/local/tmp/sysMonApp <service> <arguments related to the service>

When sysMonApp is executed without a service name, the help page is displayed:

adb shell /data/local/tmp/sysMonApp

Profiler service

Use the sysMonApp profiler option to profile services running on one of the Hexagon DSPs. Gather information such as:

  • The clocks voted for
  • Resource usage
  • Load distribution across available hardware threads
  • Load on the processor
  • Bus bandwidth metrics
  • And other profiling metrics

The metrics are useful for measuring performance, debugging performance-related issues, and identifying possible optimizations.

To see the profiler service help page:

adb shell /data/local/tmp/sysMonApp profiler -help

Usage

sysMonApp profiler [options]
Options                   Expected values              Default value                     Description
--samplingPeriod          Integer >= 0                 0 if debugLevel==1, 1 otherwise   Sampling period (in milliseconds) at which the eight PMU events are collected. When samplingPeriod is set to 0, the sampling interval is chipset and subsystem dependent but usually on the order of 1 through 50 ms.
--debugLevel              0/1/2                        Chipset dependent                 Select a sampling mode. 0 - User mode with four customized PMU counters. 1 - Default mode. 2 - User mode with eight customized PMU counters. See below for more details.
--dcvsOption              0/1                          1 if debugLevel==1, 0 otherwise   0 - Disable DCVS for the profiling duration. 1 - No DCVS override; DCVS enablement is controlled by the applications running on the selected DSP. See below for more details.
--profileFastrpcTimeline  0/1                          0                                 Disable (0) or enable (1) the option of collecting the time spent by the FastRPC interface.
--noMeasured              0/1                          0                                 Enable (0) or disable (1) reporting measured bus clock frequencies.
--stidArray               Comma-separated STID values  disabled                          User-provided comma-separated list of Software Thread ID (STID) values. For more details, see below.
--q6                      adsp, cdsp, sdsp             adsp                              Select the Hexagon DSP to profile.

Sampling modes

The 'debugLevel' option controls the PMU sampling modes. Three sampling modes are supported:

  • Default mode ('--debugLevel 1')
  • User mode with four customized PMU counters ('--debugLevel 0')
  • User mode with eight customized PMU counters ('--debugLevel 2')

In Default mode, a fixed set of eight PMU events (MPPS, AXI read and write bandwidth, HVX MPPS, and so on) is monitored. (For more details, see the sysMon parser metric description.)

The eight PMU events chosen in this mode are specific to the chipset and subsystem (aDSP/cDSP/sDSP). You cannot override these events.

In User mode, you can configure which PMU events to assign to four (or eight) of the PMU counters. You can also use the --stidArray option to configure the counters to increment only when the configured events occur on specified software threads. List the PMU events to count, in any number, in the /data/pmu_events.txt file. sysMonApp rotates through the requested events, four (or eight) at a time, to sample all of them. If the --stidArray option confines counting to specified software threads, sysMonApp also rotates through those threads for each set of four (or eight) PMU events currently assigned to the PMU counters.

For example, in User mode 0 (four user-defined PMUs), assume that the STID array specifies two threads and that 10 user-defined events are requested:

  • The sysMon profiler service collects a set of samples for the first four user-defined PMUs for the first STID, and then for the second STID.
  • It then moves on to collect samples for the next four user-defined PMU events for the first STID, and then the second STID.
  • Next, it collects samples for both the last two and again the first two user-defined PMU events, for the first STID, and then the second STID, and so on.
  • Throughout the entire time, the profiler also collects samples for four predefined PMU events. These events are not filtered by STID: the STID mask is 0, so they are collected for all software threads.
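
The rotation order in this example can be modeled with a short Python sketch. It only illustrates the ordering described in the bullets above and is not sysMonApp code: the function name rotation_schedule and the event labels E0 through E9 are hypothetical, and using 0 to stand for "no STID filtering" is an assumption made for the model.

```python
from itertools import islice


def rotation_schedule(events, stids, counters=4):
    """Yield (event_group, stid) pairs in the order described above.

    Illustrative model only: the real scheduling is internal to sysMonApp.
    """
    if not events:
        raise ValueError("at least one PMU event is required")
    filters = list(stids) or [0]  # assumption: 0 models "no STID filtering"
    start = 0
    while True:
        # Take the next `counters` events, wrapping around the event list.
        group = [events[(start + i) % len(events)] for i in range(counters)]
        for stid in filters:
            yield group, stid
        start = (start + counters) % len(events)


events = [f"E{i}" for i in range(10)]  # 10 hypothetical user-defined events
for group, stid in islice(rotation_schedule(events, stids=[5, 6]), 6):
    print(f"STID {stid}: {', '.join(group)}")
```

With 10 events and four counters, the third group wraps around and contains the last two and the first two events, as in the third bullet above.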

DCVS monitors a set of PMU events to make its decisions. When --debugLevel 0 is selected, DCVS is restricted to four PMU counters (the other four are used for user profiling), so its decisions might differ from those made in Default mode, where all eight PMU counters are available to DCVS. DCVS remains disabled during --debugLevel 2 profiling because all available PMU counters are used for user profiling.

PMU selection

As explained earlier, STIDs allow you to filter some PMU events captured by the profiler service.

To filter PMU counters by software thread, follow these steps:

  1. Programmatically assign an STID to each thread that is to be filtered by the profiler.

    Use the qurt_thread_attr_enable_stid API at the time of thread creation. The second parameter of this API is an 8-bit unsigned number that has the following meaning:

    • 0: No STID is assigned. No PMU-based filtering is applied to this thread.
    • 1: QuRT assigns an STID that is not already in use.
    • 2 through 255: This value is used as the STID value. QuRT does not check whether that STID is already in use, and thus multiple threads might share the same STID.

    The following code snippet shows how to use qurt_thread_attr_enable_stid:

    // Standard way of setting some thread attributes
    qurt_thread_attr_t attr;
    qurt_thread_attr_init (&attr);
    qurt_thread_attr_set_name(&attr, p_ThreadName);
    qurt_thread_attr_set_stack_addr (&attr, p_StackBase);
    qurt_thread_attr_set_stack_size (&attr, stackSize);
    
    /* The following line enables STID allocation to the software thread being created by
     * requesting QuRT to assign an available STID to the thread during qurt_thread_create. */
    qurt_thread_attr_enable_stid(&attr, 1);
    
    result = qurt_thread_create(&tid, &attr, (void*)entry, NULL);
    
  2. Use the tinfo service while the application is running to determine or confirm which STIDs are assigned to the threads on which STID filtering is required.

  3. Run the sysMonApp profiler service with the --stidArray option that lists the STIDs.

    For example, to start the profiler service on the cDSP, enter the following command:

    adb shell /data/local/tmp/sysMonApp profiler --debugLevel 0 --stidArray 1,2,3,4 --q6 cdsp
    

    This command starts the profiler in User mode with four customized PMU events (debugLevel 0). The --q6 option overrides the default, which is adsp.

    As explained previously, four user-configured PMU events are selected per sample in User mode, while the other four are the defaults and remain constant. STID filtering only applies to the four user-configured PMU events, and the defaults are not filtered (STID mask is 0) and thus collected continuously. In a profiling sample captured over a provided sampling period, only one STID value at a time is applied as a filter to the four user-configured PMU counters. In this example, a sequence of 50 PMU events for threads with STID values of 1, 2, 3, and 4 are collected one at a time.

NOTES:

  • By default, STID-based filtering is disabled. You can enable it by using the --stidArray option in conjunction with --debugLevel to select one of the two user modes.
  • Not all PMU events are maskable under STID.
  • An STID cannot be assigned to a thread that has already been created.
  • Due to time division multiplexing (iterating over the provided STIDs across profiling samples), selecting multiple STIDs as filters might not provide a complete picture for a given STID (missing time slots where another STID is configured). In cases where continuous (time-domain) filtering is required for an STID, pass only that STID to sysMonApp.
  • The --stidArray list has a limit of eight STIDs.
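
The constraints in these notes can be checked on the host before launching the profiler. The following Python sketch is one possible way to do that; build_profiler_command is a hypothetical helper, not part of the SDK, and the 1 through 255 range check is an assumption derived from the 8-bit STID parameter described earlier.

```python
VALID_DSPS = ("adsp", "cdsp", "sdsp")
MAX_STIDS = 8  # limit stated in the notes above


def build_profiler_command(debug_level, stids, q6="adsp",
                           app="/data/local/tmp/sysMonApp"):
    """Return the adb argument list for an STID-filtered profiler run."""
    if debug_level not in (0, 2):
        raise ValueError("STID filtering requires a user mode (--debugLevel 0 or 2)")
    if not 1 <= len(stids) <= MAX_STIDS:
        raise ValueError(f"--stidArray accepts 1 to {MAX_STIDS} STIDs")
    if any(not 1 <= stid <= 255 for stid in stids):
        # Assumption: STIDs are non-zero 8-bit values.
        raise ValueError("STID values are assumed to be in the range 1..255")
    if q6 not in VALID_DSPS:
        raise ValueError(f"--q6 must be one of {VALID_DSPS}")
    return ["adb", "shell", app, "profiler",
            "--debugLevel", str(debug_level),
            "--stidArray", ",".join(str(stid) for stid in stids),
            "--q6", q6]


# Prints the same command line as step 3 above.
print(" ".join(build_profiler_command(0, [1, 2, 3, 4], q6="cdsp")))
```

The returned list can be passed directly to subprocess.run on a host where adb is installed.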

Usage examples

  1. Run the sysMonApp profiler service in User mode.

    adb shell /data/local/tmp/sysMonApp profiler --debugLevel 0
    
    Starting Profiler with parameters:
    Q6 Processor: adsp
     Sampling Interval in ms : 1
     Total samples :0
     samplesInSet: 50
     Default Mode : 0
     dcvs enable : 0
     no. of stids: 0
    Domain Configured ADSP
    Q6 architecture detected as v66...
    Opening /data/pmu_events.txt file
    /data/pmu_events.txt not found, going ahead with default events, no. of events = 84
    opening outputfile @/sdcard/sysmon.bin
    Enabling DSP SysMon using FastRPC
    Allocating output buffer
    >> Starting thread to Query DSP SysMon for samples
    >> Waiting for a keyboard input...
    
  2. Run the sysMonApp profiler service in Default mode.

    adb shell /data/local/tmp/sysMonApp profiler --debugLevel 1
    
    Starting Profiler with parameters:
    Q6 Processor: adsp
     Sampling Interval in ms : 0
     Total samples :0
     samplesInSet: 50
     Default Mode : 1
     dcvs enable : 1
     no. of stids: 0
    Domain Configured ADSP
    Q6 architecture detected as v66...
    opening outputfile @/sdcard/sysmon.bin
    Enabling DSP SysMon using FastRPC
    Allocating output buffer
    >> Starting thread to Query DSP SysMon for samples
    >> Waiting for a keyboard input...
    
  3. Run the sysMonApp profiler and capture samples at sampling intervals of 10 ms on the cDSP in User mode, and capture the FastRPC timeline packets.

    adb shell /data/local/tmp/sysMonApp profiler --debugLevel 0 --samplingPeriod 10 --q6 cdsp --profileFastrpcTimeline 1
    
    Starting Profiler with parameters:
    Q6 Processor: cdsp
     Sampling Interval in ms : 10
     Total samples :0
     samplesInSet: 50
     Default Mode : 0
     dcvs enable : 0
     no. of stids: 0
    Domain Configured Compute DSP
    Running FastRPC Timeline Profiling in parallel...
    Q6 architecture detected as v66...
    Opening /data/pmu_events.txt file
    /data/pmu_events.txt not found, going ahead with default events, no. of events = 84
    opening outputfile @/sdcard/sysmon_cdsp.bin
    Enabling DSP SysMon using FastRPC
    Allocating output buffer
    >> Starting thread to Query DSP SysMon for samples
    >> Profiling FastRPC Timelines in parallel
    >> Waiting for a keyboard input...
    

Data collection

The sysMonApp profiler stores raw profiling data on the device in either the /sdcard or /data folders:

  • For the aDSP, the file name is sysmon.bin.
  • For the cDSP, the file name is sysmon_cdsp.bin.
  • For the sDSP, the file name is sysmon_sdsp.bin.

The sysMonApp profiler also prints the output file path with the appropriate file name in the standard output. When you finish profiling, pull the file from the device and postprocess it using the sysmon parser on a host machine.

To pull the profiler output file from device using ADB, enter the following command:

adb pull /sdcard/sysmon<DSP>.bin <destination directory>

Where <DSP> takes the value _cdsp or _sdsp to designate the cDSP or sDSP, respectively. For the aDSP, do not use <DSP>.
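
When scripting the pull step, the file naming rule above can be captured in a small lookup. This is a minimal Python sketch; sysmon_pull_command is a hypothetical helper, and /sdcard is assumed as the output folder (the profiler prints the actual path, which might be under /data instead).

```python
SUFFIX = {"adsp": "", "cdsp": "_cdsp", "sdsp": "_sdsp"}


def sysmon_pull_command(q6="adsp", dest=".", device_dir="/sdcard"):
    """Return the adb pull argument list for the profiler output of one DSP."""
    if q6 not in SUFFIX:
        raise ValueError(f"unknown DSP: {q6}")
    return ["adb", "pull", f"{device_dir}/sysmon{SUFFIX[q6]}.bin", dest]


for dsp in ("adsp", "cdsp", "sdsp"):
    print(" ".join(sysmon_pull_command(dsp)))
```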

Data postprocessing

To postprocess the profiling data, see the instructions on how to use the sysmon parser.

Getstate service

Use the getstate service to determine the current clock voting information and heap statistics of each static Protection Domain.

Usage

sysMonApp getstate [--getVotes <0/1>] [--q6 <dsp>]

The --q6 option specifies the DSP to query: adsp, cdsp, or sdsp. Use --getVotes 1 to display the DCVS_V2/V3 votes made through HAP_power_set() calls.

Example

The following command queries the clock and heap statistics of the ADSP:

adb shell /data/local/tmp/sysMonApp getstate

Domain Configured ADSP
DSP Core clock :576.00MHz
SNOC Vote:1.26MHz
MEMNOC Vote:0.00MHz
GuestOS : Total Heap:1792.00KB
Available Heap:514.49KB
Max.Free Bin:433.02KB
Audio PD : Total Heap:9216.00KB
Available Heap:8170.21KB
Max.Free Bin:8059.44KB
Measured SNOC (/clk/snoc) :200.00MHz
Measured BIMC (/clk/bimc) :681.66MHz
Measured CPU L3 clock :1612.80MHz

The following command queries the clock and heap statistics of the CDSP:

adb shell /data/local/tmp/sysMonApp getstate --q6 cdsp

Domain Configured Compute DSP
DSP Core clock :384.00MHz
SNOC Vote:0.00MHz
MEMNOC Vote:0.62MHz
GuestOS : Total Heap:2560.00KB
Available Heap:1622.56KB
Max.Free Bin:1575.28KB
Measured SNOC (/clk/snoc) :200.00MHz
Measured BIMC (/clk/bimc) :681.66MHz
Measured CPU L3 clock :1612.80MHz
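
If you capture the getstate output in a script, the text can be turned into a dictionary for comparison between runs. The following Python sketch assumes the output format shown in the examples above, which might differ between targets and SDK versions; parse_getstate and the key naming scheme are hypothetical.

```python
import re

# Sample text copied from the aDSP example above (trimmed).
SAMPLE = """\
Domain Configured ADSP
DSP Core clock :576.00MHz
SNOC Vote:1.26MHz
GuestOS : Total Heap:1792.00KB
Available Heap:514.49KB
Audio PD : Total Heap:9216.00KB
Available Heap:8170.21KB
Measured SNOC (/clk/snoc) :200.00MHz
"""

VALUE_RE = re.compile(r"([\d.]+)\s*(MHz|KB)")


def parse_getstate(text):
    """Map each reported field to a (value, unit) tuple.

    Heap fields are prefixed with the protection domain that introduced them.
    """
    stats = {}
    current_pd = None
    for line in text.splitlines():
        parts = [part.strip() for part in line.split(":")]
        match = VALUE_RE.fullmatch(parts[-1])
        if len(parts) < 2 or match is None:
            continue  # for example, the "Domain Configured" banner
        if len(parts) == 3:
            current_pd = parts[0]  # "GuestOS : Total Heap:..." starts a new PD
        key = parts[-2]
        if match.group(2) == "KB" and current_pd is not None:
            key = f"{current_pd}/{key}"
        stats[key] = (float(match.group(1)), match.group(2))
    return stats


stats = parse_getstate(SAMPLE)
total, _ = stats["Audio PD/Total Heap"]
free, _ = stats["Audio PD/Available Heap"]
print(f"Audio PD heap in use: {total - free:.2f} KB")
```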

getPowerStats service

This service reports the time spent in power collapse, in low power island (LPI), and at each core clock level of the selected Q6 subsystem since device boot or the last reset.

Usage

sysMonApp getPowerStats [--clear 1] [--q6 <dsp>]

Where the --q6 option allows you to specify the DSP to query: adsp, cdsp, or sdsp. --clear 1 can be provided to clear the stats collected so far and start fresh logging.

Example

The following command queries the power statistics of the ADSP:

adb shell /data/local/tmp/sysMonApp getPowerStats

Domain Configured ADSP

Clock freq.(MHz)                 Active time(seconds)
    307.20                             7.28
    576.00                             0.01
    614.40                             9.60
    768.00                             0.78
    940.80                             1.93
    960.00                             0.10
   1171.20                             0.05
   1324.80                             0.00
   1401.60                             0.03
Power collapse time(seconds): 17181.12
Low Power Island time(seconds): 38.59
Current core clock(MHz): 307.20
Total time(seconds): 17239.49

The following command queries the power statistics of the CDSP:

adb shell /data/local/tmp/sysMonApp getPowerStats --q6 cdsp

Domain Configured Compute DSP

Clock freq.(MHz)                 Active time(seconds)
    364.80                             0.10
    556.80                             0.00
    768.00                             0.08
    960.00                             0.00
   1171.20                             0.18
   1324.80                             0.00
   1382.40                             0.00
Power collapse time(seconds): 17276.43
Low Power Island time(seconds): 0.00
Current core clock(MHz): 364.80
Total time(seconds): 17276.79

The following command queries the power statistics of the SDSP:

adb shell /data/local/tmp/sysMonApp getPowerStats --q6 sdsp

Domain Configured Sensors DSP

Clock freq.(MHz)                 Active time(seconds)
    423.00                             0.23
    557.00                             0.00
    672.00                             0.22
    845.00                             1.18
    960.00                             0.00
   1075.00                             0.01
Power collapse time(seconds): 13988.50
Low Power Island time(seconds): 39.77
Current core clock(MHz): 423.00
Total time(seconds): 14029.91
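
To turn these reports into residency percentages, the output can be parsed on the host. The Python sketch below assumes the text layout shown in the examples above; parse_power_stats is a hypothetical helper. In the sample outputs, the active times, the power collapse time, and the LPI time add up to the reported total, but treat that as an observation about these samples rather than a guarantee.

```python
import re

# Sample text copied from the cDSP example above.
SAMPLE = """\
Clock freq.(MHz)                 Active time(seconds)
    364.80                             0.10
    556.80                             0.00
    768.00                             0.08
    960.00                             0.00
   1171.20                             0.18
   1324.80                             0.00
   1382.40                             0.00
Power collapse time(seconds): 17276.43
Low Power Island time(seconds): 0.00
Current core clock(MHz): 364.80
Total time(seconds): 17276.79
"""


def parse_power_stats(text):
    """Return (active, summary): seconds per clock level and the named totals."""
    active, summary = {}, {}
    for line in text.splitlines():
        row = re.fullmatch(r"\s*(\d+\.\d+)\s+(\d+\.\d+)\s*", line)
        if row:  # "<clock MHz>   <active seconds>" table row
            active[float(row.group(1))] = float(row.group(2))
            continue
        field = re.fullmatch(r"\s*(.+?)\s*:\s*(\d+\.\d+)\s*", line)
        if field:  # "<label>: <value>" summary line
            summary[field.group(1)] = float(field.group(2))
    return active, summary


active, summary = parse_power_stats(SAMPLE)
total = summary["Total time(seconds)"]
collapsed = summary["Power collapse time(seconds)"]
print(f"Active: {sum(active.values()):.2f} s")
print(f"Power collapse residency: {100 * collapsed / total:.3f} %")
```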

pinfo service

This service displays FastRPC-spawned user process statistics.

Usage

sysMonApp pinfo [--maxT] [--q6 <dsp>]
  • Use --q6 to specify the DSP to query (adsp, cdsp, or sdsp)
  • Use the --maxT option to create multiple threads on the DSP subsystem within the sysMonApp process and report the maximum number of successful thread creations.

Example

The following command displays the FastRPC-spawned user process statistics of the CDSP:

adb shell /data/local/tmp/sysMonApp pinfo --q6 cdsp

Domain Configured Compute DSP
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Info of UserPD : 1
    NAME is : /frpc/f05b8b10 sysMonApp    ASID is : 772    PIDA is : 6493    PD State is : InitDone    PD Type is : SignedDynamic
    User heap Used Bytes in KB : 53.58
    RPC Total Memory in KB : 512.00    RPC Memory Used in KB  : 35.28

    UserPD Threads info:
    DSP FastRPC Threads:
    T1 :     Name : /frpc/f05b8b10     tidQ : 5494    tidA : 6493    Thread state is : Running    Priority : 192    Allocated Stack : 16384
    T2 :     Name : /frpc/f05b8b10     tidQ : 7553    tidA : 6495    Thread state is : Running    Priority : 192    Allocated Stack : 16384
    T3 :     Name :     tidQ : 6534    Thread state is : Running
    DSP Non-FastRPC Threads:
    T4 :     Thread Name : user_reaper    Priority : 64    Allocated Stack : 4096
    T5 :     Thread Name : gc_thread    Priority : 192    Allocated Stack : 4096
    T6 :     Thread Name : HAP_par_thread_    Priority : 192    Allocated Stack : 4096
    T7 :     Thread Name : HAP_par_thread_    Priority : 192    Allocated Stack : 4096
    T8 :     Thread Name : exception_handl    Priority : 192    Allocated Stack : 4096
    T9 :     Thread Name : HAP_par_thread_    Priority : 192    Allocated Stack : 4096
    T10 :     Thread Name : HAP_par_thread_    Priority : 192    Allocated Stack : 4096
    T11 :     Thread Name : HAP_par_thread_    Priority : 192    Allocated Stack : 4096
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

**********************************************************************************
    Total memory borrowed from HLOS by all PDs in KB  : 512.00
    Number of times memory borrowed from HLOS : 1
    Total Threads created on CDSP by all PDs: 11
**********************************************************************************

**********************************************************************************
        Maximum Threads Can be Created: 58
**********************************************************************************

getinfo service

Use the getinfo service to display details about a DSP. This service displays the following information for a given DSP:

  • Q6 version
  • Maximum hardware threads
  • Core clock frequencies
  • L2 and VTCM memory size
  • HMX clock frequencies (v75 onwards)

Usage

sysMonApp getinfo [--q6 <dsp>]
  • Use --q6 to specify the DSP to query (adsp, cdsp, or sdsp)

Example

The following command displays details about the ADSP:

adb shell /data/local/tmp/sysMonApp getinfo --q6 adsp

Domain Configured ADSP
Q6 Version                : v66
Max HW Threads            : 4
L2 Cache Size             : 0.5 MB
L2 TCM Size               : 1.5 MB
Core Clock plans(in MHz)  : {
                               562.500 , LOW_MINUS       ,
                               750.000 , LOW             ,
                               918.750 , LOW_PLUS        ,
                              1143.750 , NOMINAL         ,
                              1293.750 , NOMINAL_PLUS    ,
                              1368.750 , HIGH            ,
                            }

getload service

This service displays instantaneous Q6 load statistics of the selected DSP subsystem.

Usage

sysMonApp getload [--time <duration_ms>] [--iterations <num_iter>] [--q6 <dsp>]
  • Use --q6 to specify the DSP to query (adsp, cdsp, or sdsp)
  • Use --time <duration_ms> to specify the duration in ms for making load measurements (default: 1000ms)
  • Use --iterations <num_iter> to specify the number of iterations for which profiling statistics are reported (default: 1)

Example

The following command displays load statistics for the CDSP:

adb shell /data/local/tmp/sysMonApp getload --q6 cdsp

Time programed is:1000
No of Iterations:1
Domain Configured Compute DSP
Q6 architecture detected as v68...
opening outputfile @/sdcard/sysmon_cdsp.bin
Enabling DSP SysMon using FastRPC
Allocating output buffer
>> Starting thread to Query DSP SysMon for samples
>> Waiting for profile duration (2 s) to elapse...

=========================================================
=========================================================
METRICS                    Units          AVG       MAX
=========================================================
MPPS                       Packets/Sec    0.52      80.10
pCPP                       Cycles/Packet  5.22      6.21
AXI Cached Rd BW           MBPS           2.32      282.02
AXI Cached Wr BW           MBPS           0.02      2.88
Eff Q6 Freq                MHz            2.73
=========================================================
=========================================================

***************************EXITING!***************************
>> Time elapsed, sending kill to Query thread...
>> Waiting for the Query thread to join...

The output bin file is placed @ /sdcard/sysmon_cdsp.bin
Domain Configured Compute DSP
DSP Core clock :375.00MHz
SNOC Vote:0.00MHz
MEMNOC Vote:0.62MHz
GuestOS : Total Heap:3072.00KB
Available Heap:1917.71KB
Max.Free Bin:1859.91KB
Measured SNOC (/clk/snoc) :240.00MHz
Measured BIMC (/clk/bimc) :2096.44MHz

DCVS service

This service is used to enable or disable the Dynamic Clock Voltage Scaling (DCVS) feature executing on each DSP.

Usage

sysMonApp dcvs <enable/disable> [--q6 <dsp>]

Where:

  • enable or disable enables or disables DCVS on the selected DSP until the target is rebooted or this setting is modified again.
  • --q6 option allows you to specify the DSP to query: adsp, cdsp, or sdsp.

Example

To enable DCVS on the aDSP:

adb shell /data/local/tmp/sysMonApp dcvs enable

Domain Configured ADSP
Successfully enabled adsp DCVS

Clocks service

Use the clocks service to set or reset the DSP core clock and bus clocks of the specified DSP. For architecture v75 onwards, the clocks service can also be used to set or reset the HMX clock and Q6-CENG bus clock.

Usage

Three clocks service actions are available:

Clocks set

Use the clocks set service to set the minimum clock frequency for a given subsystem or vote for a sleep latency:

sysMonApp clocks set [options]
Options         Values            Description
--coreClock     MHz               Sets the minimum clock speed at which the DSP should run.
--busClock      MHz               Sets the minimum AXI (DSP <-> AXI) bus frequency when the DSP is active.
--hcpBusClock   MHz               Sets the minimum HCP (HCP <-> DDR) bus frequency (cDSP only).
--dmaBusClock   MHz               Sets the minimum DMA (DMA <-> DDR) bus frequency (cDSP only).
--sleepLatency  uSec              Specifies the DSP sleep latency vote. Used by the sleep driver to choose one of the appropriate low power modes that satisfy the latency requirement.
--hmxClock      MHz               Sets the minimum clock speed at which the HMX unit should run (v75 onwards).
--cengClock     MHz               Sets the minimum clock for the Q6-CENG bus interface (v75 onwards).
--q6            adsp, cdsp, sdsp  Execute the service for the selected DSP.

NOTE: A vote of 0 resets the settings for that option.

Clocks limit

Use the clocks limit service to put an upper limit (maximum) on the selected DSP core clock:

sysMonApp clocks limit --coreClock <frequency_MHz> [--q6 <dsp>]

The DSP core clock is capped at the nearest available clock frequency even when the applications request an available higher frequency.

Given that each target has a finite set of available clock rates, you can select exactly one of these clock rates by setting the minimum (with the clocks set service) and maximum (with the clocks limit service) rates. If a selected range does not include a valid clock rate, the system is forced to pick a clock rate outside the requested range.

Starting with v75, the clocks limit service can also be used to limit the HMX clock frequency and Q6-CENG bus clock frequency on the CDSP, using the --hmxClock and --cengClock options as follows:

sysMonApp clocks limit --hmxClock <frequency_MHz> --q6 cdsp
sysMonApp clocks limit --cengClock <frequency_MHz> --q6 cdsp
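
Pairing a minimum (clocks set) with a maximum (clocks limit), as described above, can be scripted to pin the core clock to a single rate. The Python sketch below only builds the two command lines; pin_core_clock_commands is a hypothetical helper, and it assumes the requested frequency is one of the clock rates available on the target.

```python
def pin_core_clock_commands(freq_mhz, q6="cdsp",
                            app="/data/local/tmp/sysMonApp"):
    """Return the two adb commands that set the floor and the cap to freq_mhz."""
    base = ["adb", "shell", app, "clocks"]
    return [
        base + ["set", "--coreClock", str(freq_mhz), "--q6", q6],    # minimum
        base + ["limit", "--coreClock", str(freq_mhz), "--q6", q6],  # maximum
    ]


for command in pin_core_clock_commands(768):
    print(" ".join(command))
```

Run sysMonApp clocks remove afterwards to drop both the vote and the limit.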

Clocks remove

Use the clocks remove service to reset all clock settings to their default values:

sysMonApp clocks remove [--q6 <dsp>]

Examples

NOTE: All sysMonApp invocations are followed by calling the getstate service to show the new DSP clock settings. For the sake of brevity, these calls are not reproduced in the following command output examples.

To set the CDSP core clock speed to 400 MHz and the DMA clock speed to 100 MHz:

adb shell /data/local/tmp/sysMonApp clocks set --coreClock 400 --dmaBusClock 100 --q6 cdsp

Domain Configured Compute DSP
Calling CDSP set clocks function with following parameters:
Core clock : 400 MHz
DMA AXI clock : 100 MHz
Successfully set the required clock configurations, Call the remove API once done...
Example: sysMonApp clocks remove

To reinitialize the clock speeds to their default values:

adb shell /data/local/tmp/sysMonApp clocks remove --q6 cdsp

Domain Configured Compute DSP
Resetting clock votes and limits...
Successful in resetting the votes and limits...

To disable the power collapse of the aDSP, use a low sleep latency:

adb shell /data/local/tmp/sysMonApp clocks set --sleepLatency 10

Domain Configured ADSP
Calling adsp set clocks function with following parameters:
Sleep latency vote : 10usec

tinfo service

For each active software thread, use the tinfo service to display the thread's priority, declared stack size, and maximum stack used.

Usage

sysMonApp tinfo [--getstack <PD>] [--q6 <dsp>]

Where <PD> specifies the protection domain (PD) on which to request information. The values are:

  • audio
  • sensors
  • user
  • all (to display information for all PDs)

Examples

To see the software thread information for the audio PD on the aDSP:

adb shell /data/local/tmp/sysMonApp tinfo --getstack audio

Domain Configured ADSP
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
     ThreadName          ThreadID   ThreadPrio      PID      STID     DeclaredStackSize(Bytes)   MaxUsedStackSize(Bytes)
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
audio_process_r            243          32            1         0                         4096                       484
    user_reaper            240          32            1         0                         4096                       248
         rcinit            239         112            1         0                         6144                       492
  rcinit_worker            238          94            1         0                         6144                      1528
           UIST            237          60            1         0                         1952                         0
     TMR_CLNT_1            236          19            1         0                         4096                       276
       DIAG_LSM            235         237            1         0                         4096                       476
  /frpc/AudioPD            234         192            1         0                         4096                       844
  dog_vir_task1            233          93            1         0                         4096                       476
         dog_hb            232         121            1         0                         4096                       412
NPA_ASYNC_EVENT            231          48            1         0                         8192                       136
NPA_ASYNC_REQUE            230          29            1         0                         8192                       136
    GPIOINT_SRV            229           4            1         0                         4096                       136
     DALTF_TH_0            228         252            1         0                         8192                        48
     DALTF_TH_1            227         253            1         0                         8192                        48
     DALTF_TH_2            226         254            1         0                         8192                        48
     DALTF_TH_3            225         255            1         0                         8192                        48
     IPCRTR_RDR            224         204            1         0                         4096                       640
   QMI_PING_SVC            223          10            1         0                         2560                       540
    gen_cb_ctxt            222         205            1         0                         2048                       204
sr_notif_worker            221          93            1         0                         4096                       960
sr_notif_signal            220          93            1         0                         2048                        80
    UTMR_CLNT_1            218          18            1         0                         4096                       168
                           217          94            1         0                         4096                       248
 qdssc_svc_task            216          10            1         0                         4096                        48
      smlworker            215         152            1         0                         4096                       840
     i2c_qdi_cb            214          14            1         0                         4096                       344
    APR_QDI_USR            213          50            1         0                         8064                       608
          AMDB0            212          69            1         0                         8192                       728
          AMDB1            211          69            1         0                         8192                       548
          AMDB2            210          69            1         0                         8192                       648
          AMDB3            209          69            1         0                         8192                       728
            AVT            208           1            1         0                         4096                       104
      hw_af_ist            207           2            1         0                         1024                        80
            VTM            206          11            1         0                         2560                       184
           VDS1            205          12            1         0                         2560                       208
           VMX1            204          32            1         0                         4096                       296
           VMX2            203          32            1         0                         4096                       296
            VPM            202          52            1         0                        12288                       336
            VSM            201          35            1         0                        16384                       368
           AfeS            200          34            1         0                         5120                      1520
           MXAT            199          38            1         0                         8192                       336
           MXAR            198          38            1         0                         8192                       820
dma_typ_dflt_is            197           2            1         0                         1024                        80
dma_typ_hdmi_is            196           2            1         0                         1024                        80
     AfeDataMgr            195          31            1         0                         4096                       192
   CodecIntHldr            194          37            1         0                         4096                       860
           RXSR            193          38            1         0                         4096                       312
           TXSR            192          38            1         0                         4096                       312
            ADM            191          37            1         0                        12544                       812
            ASM            190          37            1         0                        16640                       736
           VfrD            189          10            1         0                         4096                       480
            LSM            188          71            1         0                         2048                       336
            USM            187          69            1         0                         4096                       384
         AfeANC            186          90            1         0                         5120                       320
            ACS            184          37            1         0                        16384                       792
            MVM            183          50            1         0                         4096                       712
   CVD_CAL_LOGG            182         109            1         0                         2048                       364
  SlimBusQmiSvc            181          39            1         0                         4096                      1068
            usb            180          32            1         0                         2040                       420
          irq11            177           4            1         0                         2048                       344
     SlimBusMsg            176         207            1         0                         4088                       624
  SlimBusMaster            175         207            1         0                         4088                       616
          irq86            173           4            1         0                         2048                        96
     SlimBusMsg            172         207            1         0                         4088                       584
  SlimBusMaster            171         207            1         0                         4088                       544
          irq87           4266           4            1         0                         2048                       336
    err_ex_pd_1            167           1            1         0                         4096                       248
  mem_gc_thread            166         192            1         0                         4096                        56
  /frpc/audiopd            165         192            1         0                        16384                      1172
  /frpc/audiopd            164         192            1         0                        16384                       988
          irq12          94382           4            1         0                         2048                       336
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

To see the software thread information for the user PD on the cDSP:

adb shell /data/local/tmp/sysMonApp tinfo --getstack user --q6 cdsp

Domain Configured Compute DSP
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
     ThreadName          ThreadID   ThreadPrio      PID      STID     DeclaredStackSize(Bytes)   MaxUsedStackSize(Bytes)
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
/frpc/c04d3240           41193          32           12         0                        65536                       940
    user_reaper          41190          32           12         0                         4096                       248
  mem_gc_thread          61666         192           12         0                         4096                        64
HAP_par_thread_          61662         192           12         0                         8192                       540
exception_handl          61669         192           12         0                         4096                       248
/frpc/c04d3240           41187         192           12         0                        16384                      1196
/frpc/c04d3240           61663         192           12         0                         4096                      3136
HAP_par_thread_          61674         192           12         0                         8192                      2220
HAP_par_thread_         209124         192           12         0                         8192                      1008
HAP_par_thread_          41191         192           12         0                         8192                       752
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

NOTE: The PID value allows you to distinguish between the different user PDs that are running.
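
The stack columns above can be checked programmatically. The following Python sketch is not part of sysMonApp. It assumes the output format shown above, where every data row ends with six integer columns, and the names ThreadStack, parse_tinfo, and high_usage are hypothetical helpers introduced only for illustration. It flags threads whose maximum used stack exceeds a chosen fraction of the declared stack size.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ThreadStack:
    name: str
    thread_id: int
    prio: int
    pid: int
    stid: int
    declared: int  # DeclaredStackSize(Bytes)
    max_used: int  # MaxUsedStackSize(Bytes)

    @property
    def usage(self) -> float:
        """Fraction of the declared stack that was used at peak."""
        return self.max_used / self.declared if self.declared else 0.0


def parse_tinfo(text: str) -> List[ThreadStack]:
    """Parse tinfo --getstack output captured as text (assumed format)."""
    rows = []
    for line in text.splitlines():
        tokens = line.split()
        # Data rows end with six integers; headers and '+++' rulers do not.
        if len(tokens) < 6 or not all(t.isdigit() for t in tokens[-6:]):
            continue
        numbers = [int(t) for t in tokens[-6:]]
        name = " ".join(tokens[:-6])  # may be empty: some threads are unnamed
        rows.append(ThreadStack(name, *numbers))
    return rows


def high_usage(rows: List[ThreadStack], threshold: float = 0.75) -> List[ThreadStack]:
    """Return threads whose peak stack usage exceeds the threshold."""
    return [r for r in rows if r.usage > threshold]


SAMPLE = """\
/frpc/c04d3240           41193          32           12         0                        65536                       940
/frpc/c04d3240           61663         192           12         0                         4096                      3136
HAP_par_thread_          61674         192           12         0                         8192                      2220
"""

if __name__ == "__main__":
    for t in high_usage(parse_tinfo(SAMPLE)):
        print(f"{t.name} (tid {t.thread_id}, pid {t.pid}): {t.usage:.0%} of stack used")
```

The 75% threshold is an arbitrary choice for the example; pick a margin that suits your application.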

Thread level profile (TLP) service

The TLP service provides profiling information for specified software threads. The profiling information includes cycles spent and packets executed per software thread.

The service also provides a --profile option to run the sysmon profiler service in parallel to collect all DSP profiling data along with the software thread-specific information.

Usage

sysMonApp tlp [options]
Options Values Description
--samplingPeriod Integer>=1. Default=50 Sampling period in milliseconds for collecting profile statistics for a given software thread
--profile 0 (default)/1 Execute the sysmon profiler in parallel (1). Execute the TLP service only (0).
--tname Comma-separated list of thread names (case sensitive) to profile. Default: profile all active threads.
--duration Integer>=1. Default=10 Duration (in seconds) after which the thread-level profiling stops.
--q6 adsp, cdsp, sdsp Execute the service for the selected DSP.

NOTE: Thread names can be accessed by using the tinfo service.

To end the profiling service, press Enter. Then enter the following command to retrieve the output binary file created in /sdcard (or /data):

adb pull /sdcard/sysmontlp_<x>dsp.bin

Where <x> takes the value a, c, or s to designate the aDSP, cDSP, or sDSP.

If the sysmon profiler was executed in parallel using the --profile 1 option, follow the data collection instructions above to retrieve the generated output.
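
As an illustration of how the options above combine, here is a minimal Python sketch that assembles a TLP command line and the matching output file name. The helper names tlp_command and tlp_output_name are hypothetical, the executable path /data/local/tmp/sysMonApp is taken from the earlier examples, and the order in which the options are passed is an assumption.

```python
VALID_Q6 = ("adsp", "cdsp", "sdsp")


def tlp_command(q6="cdsp", sampling_period_ms=50, duration_s=10,
                profile=False, thread_names=(),
                app="/data/local/tmp/sysMonApp"):
    """Build an adb argument list for the TLP service (illustrative only)."""
    if q6 not in VALID_Q6:
        raise ValueError(f"--q6 must be one of {VALID_Q6}")
    if sampling_period_ms < 1 or duration_s < 1:
        raise ValueError("samplingPeriod and duration must be >= 1")
    cmd = ["adb", "shell", app, "tlp",
           "--q6", q6,
           "--samplingPeriod", str(sampling_period_ms),
           "--duration", str(duration_s),
           "--profile", "1" if profile else "0"]
    if thread_names:
        # Thread names are case sensitive and separated by commas.
        cmd += ["--tname", ",".join(thread_names)]
    return cmd


def tlp_output_name(q6):
    """Name of the TLP output binary: sysmontlp_<x>dsp.bin."""
    if q6 not in VALID_Q6:
        raise ValueError(f"--q6 must be one of {VALID_Q6}")
    return f"sysmontlp_{q6}.bin"


if __name__ == "__main__":
    print(" ".join(tlp_command(thread_names=["AfeS", "ADM"], q6="adsp")))
    print("adb pull /sdcard/" + tlp_output_name("adsp"))
```

The returned list can be passed to subprocess.run on a host that has adb available.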

Postprocessing

Refer to the sysmon parser documentation for postprocessing TLP output binary files. The sysmon parser can parse the output from the TLP service with or without the sysmon profiler output.

ETM trace service

Use the ETM service to trace the instruction and data memory on the cDSP only. The service also provides information about the dynamic modules that are loaded on the cDSP.

Usage

Two ETM trace service commands are available:

DLL command

The DLL command provides information about all the dynamic modules loaded on the cDSP. It also provides the information required to run the Hexagon Trace Analyzer tool.

Usage:

sysMonApp etmTrace --command dll

For each loaded DLL, the following information is provided:

  • ELF_NAME: Name of the dynamic module that was loaded.
  • LOAD_ADDRESS: Virtual address where the dynamic module was loaded.
  • LOAD_TIMESTAMP: Timestamp when the dynamic module was loaded.
  • UNLOAD_TIMESTAMP: Timestamp when the dynamic module was unloaded.

NOTE: This command can take around 30 seconds to complete.

The following example output shows when the FastRPC shell was loaded and unloaded.

data.ELF_NAME = fastrpc_shell_3
data.LOAD_ADDRESS = 0xe0d00000
data.ELF_IDENTIFIER = 0x00000000
data.LOAD_TIMESTAMP = 0x14b46e27fb1a
data.UNLOAD_TIMESTAMP = 0x14b46e3abe76

NOTE: A load address of 0x0 means the library was loaded statically (it is part of the DSP image).
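
Output in this form is straightforward to turn into records for scripting. The Python sketch below is a hedged illustration: it assumes that each record starts with a data.ELF_NAME line and that all numeric fields are hexadecimal, as in the example above. The helper names parse_dll_output, is_static, and loaded_ticks are hypothetical, and the unit of the timestamps is not stated in this document, so the difference is reported in raw ticks.

```python
def parse_dll_output(text):
    """Group 'data.KEY = VALUE' lines into one dict per dynamic module."""
    records, current = [], {}
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("data.") or "=" not in line:
            continue
        key, value = line[len("data."):].split("=", 1)
        key, value = key.strip(), value.strip()
        # Assumption: a new record begins at each ELF_NAME line.
        if key == "ELF_NAME" and current:
            records.append(current)
            current = {}
        current[key] = value
    if current:
        records.append(current)
    return records


def is_static(record):
    """A load address of 0x0 means the library is part of the DSP image."""
    return int(record["LOAD_ADDRESS"], 16) == 0


def loaded_ticks(record):
    """Raw timestamp difference between unload and load (unit unspecified)."""
    return int(record["UNLOAD_TIMESTAMP"], 16) - int(record["LOAD_TIMESTAMP"], 16)


SAMPLE = """\
data.ELF_NAME = fastrpc_shell_3
data.LOAD_ADDRESS = 0xe0d00000
data.ELF_IDENTIFIER = 0x00000000
data.LOAD_TIMESTAMP = 0x14b46e27fb1a
data.UNLOAD_TIMESTAMP = 0x14b46e3abe76
"""

if __name__ == "__main__":
    for rec in parse_dll_output(SAMPLE):
        kind = "static" if is_static(rec) else "dynamic"
        print(rec["ELF_NAME"], kind, loaded_ticks(rec))
```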

ETM command

The ETM command traces instruction and data memory on the cDSP. For information on how to retrieve and parse the generated trace file, see the documentation on the Hexagon Trace Analyzer.

There are three supported trace modes:

Trace mode Description
default ETM does not send any cycle information, which keeps the amount of data sent by ETM to a minimum. This mode is helpful for analyzing the flow trace but not for performance analysis.
cycle-accurate ETM sends cycle information for each packet. This results in a higher data rate per instruction and can occasionally lead to data loss due to buffer overflow. This mode is required to postprocess the trace with the Hexagon Trace Analyzer.
cycle-coarse ETM still tracks cycles but the tracking is not done per instruction, and thus it results in a lower data rate per instruction. It might be a good compromise for some performance analysis where the cycle-accurate mode overflows.

Usage

sysMonApp etmTrace --command etm [--etmType <etm_type>]

Where the following values are supported for <etm_type>:

etm_type Description
pc_mem Trace instruction and data memory. (Default.)
pc Trace instruction memory.
mem Trace data memory.
cc_pc Trace instruction memory with cycle-coarse mode.
cc_mem Trace data memory with cycle-coarse mode.
cc_pc_mem Trace instruction and data memory with cycle-coarse mode.
ca_pc Trace instruction memory with cycle-accurate mode.
ca_mem Trace data memory with cycle-accurate mode.
ca_pc_mem Trace instruction and data memory with cycle-accurate mode.
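
The etm_type values follow a regular pattern: an optional cc_ or ca_ prefix for the trace mode, followed by pc, mem, or pc_mem. The Python sketch below derives the value from that pattern and builds the corresponding command line. The helper names etm_type and etm_command are hypothetical and only illustrate the naming scheme in the table above.

```python
MODE_PREFIX = {
    "default": "",
    "cycle-coarse": "cc_",
    "cycle-accurate": "ca_",
}


def etm_type(instructions=True, data=True, mode="default"):
    """Return the --etmType value for the requested trace content and mode."""
    if mode not in MODE_PREFIX:
        raise ValueError(f"mode must be one of {sorted(MODE_PREFIX)}")
    if not (instructions or data):
        raise ValueError("trace at least one of instructions or data")
    parts = []
    if instructions:
        parts.append("pc")
    if data:
        parts.append("mem")
    return MODE_PREFIX[mode] + "_".join(parts)


def etm_command(instructions=True, data=True, mode="default"):
    """Build the argument list for the ETM command (illustrative only)."""
    return ["sysMonApp", "etmTrace", "--command", "etm",
            "--etmType", etm_type(instructions, data, mode)]


if __name__ == "__main__":
    # Cycle-accurate mode is the one required by the Hexagon Trace Analyzer.
    print(" ".join(etm_command(mode="cycle-accurate")))
```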

HTA profiler service

Use the HTA profiler service to profile services running on the HTA to gather the following information:

  • Clocks voted for
  • DDR read and write bandwidths
  • HTA activity statistics such as HTA active time
  • Measured and maximum inferences per second
  • Number of layers
  • Bus bandwidth metrics

These profiling metrics are useful in measuring performance, debugging performance-related issues, and identifying possible optimizations.

Usage

adb shell /data/local/tmp/sysMonApp profiler --q6 HTA [--cdsp 1]

The optional --cdsp 1 argument profiles the cDSP in parallel with the HTA.

Data collection

The HTA profiler service stores raw profiling data in the sysmon_HTA.bin file when only the HTA is profiled, or in the sysmon_cdsp_HTA.bin file when the cDSP is profiled in parallel with the HTA. These files are stored in the /sdcard/ or /data/ folder.

The sysMonApp profiler also prints the output file path and file name to standard output. When you have finished profiling, pull the file from the device and postprocess it using the sysmon parser on a host machine.

To pull the profiler output file using ADB:

adb pull /sdcard/sysmon[_cdsp]_HTA.bin <destination directory>
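
The choice between the two output file names can be captured in a small helper. This Python sketch is an assumption-laden illustration: the function names are hypothetical, the mapping of sysmon_cdsp_HTA.bin to the --cdsp 1 case is inferred from the file names, and /sdcard is assumed as the default output folder (the profiler prints the actual path when it runs).

```python
def hta_profiler_command(with_cdsp=False, app="/data/local/tmp/sysMonApp"):
    """Build the adb argument list that starts the HTA profiler."""
    cmd = ["adb", "shell", app, "profiler", "--q6", "HTA"]
    if with_cdsp:
        cmd += ["--cdsp", "1"]  # also profile the cDSP in parallel
    return cmd


def hta_output_name(with_cdsp=False):
    """Output file name, assumed from the naming pattern in this section."""
    return "sysmon_cdsp_HTA.bin" if with_cdsp else "sysmon_HTA.bin"


def hta_pull_command(dest, with_cdsp=False, device_dir="/sdcard"):
    """Build the adb pull command for the profiler output file."""
    return ["adb", "pull", f"{device_dir}/{hta_output_name(with_cdsp)}", dest]


if __name__ == "__main__":
    print(" ".join(hta_profiler_command(with_cdsp=True)))
    print(" ".join(hta_pull_command(".", with_cdsp=True)))
```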

Data postprocessing

To postprocess the HTA profiling data, see the instructions on how to use the sysmon parser.