Papi Native Event

  • Upload
    fenz85

DESCRIPTION

Papi event
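
The listing in this document gives each native event as an event name, optionally with unit masks (for example `:CONDITIONAL`) and modifiers (`:e`, `:i`, `:c`, `:t`, `:u`, `:k`). The following is a minimal sketch, not part of PAPI itself, of how such pieces might be joined into one colon-separated event string. The helper name `build_event_name` and its parameters are hypothetical; only the event, unit-mask, and modifier names are taken from the listing.

```python
# Hypothetical helper: joins the pieces shown in the listing into a single
# native event string. It does not call PAPI or validate event names.

# Modifier letters exactly as they appear in the listing.
KNOWN_MODIFIERS = ("e", "i", "c", "t", "u", "k")


def build_event_name(event, umask=None, pmu=None, **modifiers):
    """Return a string such as 'BR_INST_RETIRED:CONDITIONAL:u=1:k=0'."""
    # An optional PMU prefix is separated by '::', as in 'ix86arch::LLC_MISSES'.
    parts = [f"{pmu}::{event}" if pmu else event]
    if umask:
        parts.append(umask)
    for key, value in modifiers.items():
        if key not in KNOWN_MODIFIERS:
            raise ValueError(f"unknown modifier: {key!r}")
        value = int(value)
        if key == "c":
            # The listing gives the counter-mask range as [0-255].
            if not 0 <= value <= 255:
                raise ValueError("counter-mask must be in range 0-255")
        elif value not in (0, 1):
            raise ValueError(f"modifier {key!r} must be 0 or 1")
        parts.append(f"{key}={value}")
    return ":".join(parts)


if __name__ == "__main__":
    print(build_event_name("BR_INST_RETIRED", "CONDITIONAL", u=1, k=0))
    print(build_event_name("UNHALTED_CORE_CYCLES", pmu="ix86arch"))
```

A string built this way is the kind of name one would hand to PAPI's name lookup (for instance `PAPI_event_name_to_code` in the C API). Whether a given combination is accepted depends on the installed PAPI and libpfm4 versions and on the CPU.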

Available native events and hardware information.
--------------------------------------------------------------------------------
PAPI Version             : 5.3.0.0
Vendor string and code   : GenuineIntel (1)
Model string and code    : Intel(R) Core(TM) i3-2330M CPU @ 2.20GHz (42)
CPU Revision             : 7.000000
CPUID Info               : Family: 6  Model: 42  Stepping: 7
CPU Max Megahertz        : 2200
CPU Min Megahertz        : 800
Hdw Threads per core     : 2
Cores per Socket         : 2
Sockets                  : 1
CPUs per Node            : 4
Total CPUs               : 4
Running in a VM          : no
Number Hardware Counters : 11
Max Multiplex Counters   : 64
--------------------------------------------------------------------------------

===============================================================================
 Native Events in Component: perf_event
===============================================================================
ix86arch::UNHALTED_CORE_CYCLES
    count core clock cycles whenever the clock signal on the specific core is running (not halted)
    :e=0  edge level (may require counter-mask >= 1)
    :i=0  invert
    :c=0  counter-mask in range [0-255]
    :t=0  measure any thread
    :u=0  monitor at user level
    :k=0  monitor at kernel level
--------------------------------------------------------------------------------
ix86arch::INSTRUCTION_RETIRED
    count the number of instructions at retirement. For instructions that consists of multiple micro-ops, this event counts the retirement of the last micro-op of the instruction
    :e=0  edge level (may require counter-mask >= 1)
    :i=0  invert
    :c=0  counter-mask in range [0-255]
    :t=0  measure any thread
    :u=0  monitor at user level
    :k=0  monitor at kernel level
--------------------------------------------------------------------------------
ix86arch::UNHALTED_REFERENCE_CYCLES
    count reference clock cycles while the clock signal on the specific core is running. The reference clock operates at a fixed frequency, irrespective of core freqeuncy changes due to performance state transitions
--------------------------------------------------------------------------------
ix86arch::LLC_REFERENCES
    count each request originating from the core to reference a cache line in the last level cache. The count may include speculation, but excludes cache line fills due to hardware prefetch
    :e=0  edge level (may require counter-mask >= 1)
    :i=0  invert
    :c=0  counter-mask in range [0-255]
    :t=0  measure any thread
    :u=0  monitor at user level
    :k=0  monitor at kernel level
--------------------------------------------------------------------------------
ix86arch::LLC_MISSES
    count each cache miss condition for references to the last level cache. The event count may include speculation, but excludes cache line fills due to hardware prefetch
    :e=0  edge level (may require counter-mask >= 1)
    :i=0  invert
    :c=0  counter-mask in range [0-255]
    :t=0  measure any thread
    :u=0  monitor at user level
    :k=0  monitor at kernel level
--------------------------------------------------------------------------------
ix86arch::BRANCH_INSTRUCTIONS_RETIRED
    count branch instructions at retirement. Specifically, this event counts the retirement of the last micro-op of a branch instruction
    :e=0  edge level (may require counter-mask >= 1)
    :i=0  invert
    :c=0  counter-mask in range [0-255]
    :t=0  measure any thread
    :u=0  monitor at user level
    :k=0  monitor at kernel level
--------------------------------------------------------------------------------
ix86arch::MISPREDICTED_BRANCH_RETIRED
    count mispredicted branch instructions at retirement. Specifically, this event counts at retirement of the last micro-op of a branch instruction in the architectural path of the execution and experienced misprediction in the branch prediction hardware
    :e=0  edge level (may require counter-mask >= 1)
    :i=0  invert
    :c=0  counter-mask in range [0-255]
    :t=0  measure any thread
    :u=0  monitor at user level
    :k=0  monitor at kernel level
--------------------------------------------------------------------------------
perf::PERF_COUNT_HW_CPU_CYCLES
    PERF_COUNT_HW_CPU_CYCLES
--------------------------------------------------------------------------------
perf::CYCLES
    PERF_COUNT_HW_CPU_CYCLES
--------------------------------------------------------------------------------
perf::CPU-CYCLES
    PERF_COUNT_HW_CPU_CYCLES
--------------------------------------------------------------------------------
perf::PERF_COUNT_HW_INSTRUCTIONS
    PERF_COUNT_HW_INSTRUCTIONS
--------------------------------------------------------------------------------
perf::INSTRUCTIONS
    PERF_COUNT_HW_INSTRUCTIONS
--------------------------------------------------------------------------------
perf::PERF_COUNT_HW_CACHE_REFERENCES
    PERF_COUNT_HW_CACHE_REFERENCES
--------------------------------------------------------------------------------
perf::CACHE-REFERENCES
    PERF_COUNT_HW_CACHE_REFERENCES
--------------------------------------------------------------------------------
perf::PERF_COUNT_HW_CACHE_MISSES
    PERF_COUNT_HW_CACHE_MISSES
--------------------------------------------------------------------------------
perf::CACHE-MISSES
    PERF_COUNT_HW_CACHE_MISSES
--------------------------------------------------------------------------------
perf::PERF_COUNT_HW_BRANCH_INSTRUCTIONS
    PERF_COUNT_HW_BRANCH_INSTRUCTIONS
--------------------------------------------------------------------------------
perf::BRANCH-INSTRUCTIONS
    PERF_COUNT_HW_BRANCH_INSTRUCTIONS
--------------------------------------------------------------------------------
perf::BRANCHES
    PERF_COUNT_HW_BRANCH_INSTRUCTIONS
--------------------------------------------------------------------------------
perf::PERF_COUNT_HW_BRANCH_MISSES
    PERF_COUNT_HW_BRANCH_MISSES
--------------------------------------------------------------------------------
perf::BRANCH-MISSES
    PERF_COUNT_HW_BRANCH_MISSES
--------------------------------------------------------------------------------
perf::PERF_COUNT_HW_BUS_CYCLES
    PERF_COUNT_HW_BUS_CYCLES
--------------------------------------------------------------------------------
perf::BUS-CYCLES
    PERF_COUNT_HW_BUS_CYCLES
--------------------------------------------------------------------------------
perf::PERF_COUNT_HW_STALLED_CYCLES_FRONTEND
    PERF_COUNT_HW_STALLED_CYCLES_FRONTEND
--------------------------------------------------------------------------------
perf::STALLED-CYCLES-FRONTEND
    PERF_COUNT_HW_STALLED_CYCLES_FRONTEND
--------------------------------------------------------------------------------
perf::IDLE-CYCLES-FRONTEND
    PERF_COUNT_HW_STALLED_CYCLES_FRONTEND
--------------------------------------------------------------------------------
perf::PERF_COUNT_HW_STALLED_CYCLES_BACKEND
    PERF_COUNT_HW_STALLED_CYCLES_BACKEND
--------------------------------------------------------------------------------
perf::STALLED-CYCLES-BACKEND
    PERF_COUNT_HW_STALLED_CYCLES_BACKEND
--------------------------------------------------------------------------------
perf::IDLE-CYCLES-BACKEND
    PERF_COUNT_HW_STALLED_CYCLES_BACKEND
--------------------------------------------------------------------------------
perf::PERF_COUNT_HW_REF_CPU_CYCLES
    PERF_COUNT_HW_REF_CPU_CYCLES
--------------------------------------------------------------------------------
perf::REF-CYCLES
    PERF_COUNT_HW_REF_CPU_CYCLES
--------------------------------------------------------------------------------
perf::PERF_COUNT_SW_CPU_CLOCK
    PERF_COUNT_SW_CPU_CLOCK
--------------------------------------------------------------------------------
perf::CPU-CLOCK
    PERF_COUNT_SW_CPU_CLOCK
--------------------------------------------------------------------------------
perf::PERF_COUNT_SW_TASK_CLOCK
    PERF_COUNT_SW_TASK_CLOCK
--------------------------------------------------------------------------------
perf::TASK-CLOCK
    PERF_COUNT_SW_TASK_CLOCK
--------------------------------------------------------------------------------
perf::PERF_COUNT_SW_PAGE_FAULTS
    PERF_COUNT_SW_PAGE_FAULTS
--------------------------------------------------------------------------------
perf::PAGE-FAULTS
    PERF_COUNT_SW_PAGE_FAULTS
--------------------------------------------------------------------------------
perf::FAULTS
    PERF_COUNT_SW_PAGE_FAULTS
--------------------------------------------------------------------------------
perf::PERF_COUNT_SW_CONTEXT_SWITCHES
    PERF_COUNT_SW_CONTEXT_SWITCHES
--------------------------------------------------------------------------------
perf::CONTEXT-SWITCHES
    PERF_COUNT_SW_CONTEXT_SWITCHES
--------------------------------------------------------------------------------
perf::CS
    PERF_COUNT_SW_CONTEXT_SWITCHES
--------------------------------------------------------------------------------
perf::PERF_COUNT_SW_CPU_MIGRATIONS
    PERF_COUNT_SW_CPU_MIGRATIONS
--------------------------------------------------------------------------------
perf::CPU-MIGRATIONS
    PERF_COUNT_SW_CPU_MIGRATIONS
--------------------------------------------------------------------------------
perf::MIGRATIONS
    PERF_COUNT_SW_CPU_MIGRATIONS
--------------------------------------------------------------------------------
perf::PERF_COUNT_SW_PAGE_FAULTS_MIN
    PERF_COUNT_SW_PAGE_FAULTS_MIN
--------------------------------------------------------------------------------
perf::MINOR-FAULTS
    PERF_COUNT_SW_PAGE_FAULTS_MIN
--------------------------------------------------------------------------------
perf::PERF_COUNT_SW_PAGE_FAULTS_MAJ
    PERF_COUNT_SW_PAGE_FAULTS_MAJ
--------------------------------------------------------------------------------
perf::MAJOR-FAULTS
    PERF_COUNT_SW_PAGE_FAULTS_MAJ
--------------------------------------------------------------------------------
perf::PERF_COUNT_HW_CACHE_L1D
    L1 data cache
    :READ  read access
    :WRITE  write access
    :PREFETCH  prefetch access
    :ACCESS  hit access
    :MISS  miss access
--------------------------------------------------------------------------------
perf::L1-DCACHE-LOADS
    L1 cache load accesses
--------------------------------------------------------------------------------
perf::L1-DCACHE-LOAD-MISSES
    L1 cache load misses
--------------------------------------------------------------------------------
perf::L1-DCACHE-STORES
    L1 cache store accesses
--------------------------------------------------------------------------------
perf::L1-DCACHE-STORE-MISSES
    L1 cache store misses
--------------------------------------------------------------------------------
perf::L1-DCACHE-PREFETCHES
    L1 cache prefetch accesses
--------------------------------------------------------------------------------
perf::L1-DCACHE-PREFETCH-MISSES
    L1 cache prefetch misses
--------------------------------------------------------------------------------
perf::PERF_COUNT_HW_CACHE_L1I
    L1 instruction cache
    :READ  read access
    :PREFETCH  prefetch access
    :ACCESS  hit access
    :MISS  miss access
--------------------------------------------------------------------------------
perf::L1-ICACHE-LOADS
    L1I cache load accesses
--------------------------------------------------------------------------------
perf::L1-ICACHE-LOAD-MISSES
    L1I cache load misses
--------------------------------------------------------------------------------
perf::L1-ICACHE-PREFETCHES
    L1I cache prefetch accesses
--------------------------------------------------------------------------------
perf::L1-ICACHE-PREFETCH-MISSES
    L1I cache prefetch misses
--------------------------------------------------------------------------------
perf::PERF_COUNT_HW_CACHE_LL
    Last level cache
    :READ  read access
    :WRITE  write access
    :PREFETCH  prefetch access
    :ACCESS  hit access
    :MISS  miss access
--------------------------------------------------------------------------------
perf::LLC-LOADS
    Last level cache load accesses
--------------------------------------------------------------------------------
perf::LLC-LOAD-MISSES
    Last level cache load misses
--------------------------------------------------------------------------------
perf::LLC-STORES
    Last level cache store accesses
--------------------------------------------------------------------------------
perf::LLC-STORE-MISSES
    Last level cache store misses
--------------------------------------------------------------------------------
perf::LLC-PREFETCHES
    Last level cache prefetch accesses
--------------------------------------------------------------------------------
perf::LLC-PREFETCH-MISSES
    Last level cache prefetch misses
--------------------------------------------------------------------------------
perf::PERF_COUNT_HW_CACHE_DTLB
    Data Translation Lookaside Buffer
    :READ  read access
    :WRITE  write access
    :PREFETCH  prefetch access
    :ACCESS  hit access
    :MISS  miss access
--------------------------------------------------------------------------------
perf::DTLB-LOADS
    Data TLB load accesses
--------------------------------------------------------------------------------
perf::DTLB-LOAD-MISSES
    Data TLB load misses
--------------------------------------------------------------------------------
perf::DTLB-STORES
    Data TLB store accesses
--------------------------------------------------------------------------------
perf::DTLB-STORE-MISSES
    Data TLB store misses
--------------------------------------------------------------------------------
perf::DTLB-PREFETCHES
    Data TLB prefetch accesses
--------------------------------------------------------------------------------
perf::DTLB-PREFETCH-MISSES
    Data TLB prefetch misses
--------------------------------------------------------------------------------
perf::PERF_COUNT_HW_CACHE_ITLB
    Instruction Translation Lookaside Buffer
    :READ  read access
    :ACCESS  hit access
    :MISS  miss access
--------------------------------------------------------------------------------
perf::ITLB-LOADS
    Instruction TLB load accesses
--------------------------------------------------------------------------------
perf::ITLB-LOAD-MISSES
    Instruction TLB load misses
--------------------------------------------------------------------------------
perf::PERF_COUNT_HW_CACHE_BPU
    Branch Prediction Unit
    :READ  read access
    :ACCESS  hit access
    :MISS  miss access
--------------------------------------------------------------------------------
perf::BRANCH-LOADS
    Branch load accesses
--------------------------------------------------------------------------------
perf::BRANCH-LOAD-MISSES
    Branch load misses
--------------------------------------------------------------------------------
perf::PERF_COUNT_HW_CACHE_NODE
    Node memory access
    :READ  read access
    :WRITE  write access
    :PREFETCH  prefetch access
    :ACCESS  hit access
    :MISS  miss access
--------------------------------------------------------------------------------
perf::NODE-LOADS
    Node load accesses
--------------------------------------------------------------------------------
perf::NODE-LOAD-MISSES
    Node load misses
--------------------------------------------------------------------------------
perf::NODE-STORES
    Node store accesses
--------------------------------------------------------------------------------
perf::NODE-STORE-MISSES
    Node store misses
--------------------------------------------------------------------------------
perf::NODE-PREFETCHES
    Node prefetch accesses
--------------------------------------------------------------------------------
perf::NODE-PREFETCH-MISSES
    Node prefetch misses
--------------------------------------------------------------------------------
AGU_BYPASS_CANCEL
    Number of executed load operations with all the following traits: 1. addressing of the format [base + offset], 2. the offset is between 1 and 2047, 3. the address specified in the base register is in one page and the address [base+offset] is in another page
    :COUNT  1. addressing of the format [base + offset], 2. the offset is between 1 and 2047, 3. the address specified in the base register is in one page and the address [base+offset] is in another page, masks:This event counts executed load operations
    :e=0  1. addressing of the format [base + offset], 2. the offset is between 1 and 2047, 3. the address specified in the base register is in one page and the address [base+offset] is in another page, masks:edge level (may require counter-mask >= 1)
    :i=0  1. addressing of the format [base + offset], 2. the offset is between 1 and 2047, 3. the address specified in the base register is in one page and the address [base+offset] is in another page, masks:invert
    :c=0  1. addressing of the format [base + offset], 2. the offset is between 1 and 2047, 3. the address specified in the base register is in one page and the address [base+offset] is in another page, masks:counter-mask in range [0-255]
    :t=0  1. addressing of the format [base + offset], 2. the offset is between 1 and 2047, 3. the address specified in the base register is in one page and the address [base+offset] is in another page, masks:measure any thread
    :u=0  1. addressing of the format [base + offset], 2. the offset is between 1 and 2047, 3. the address specified in the base register is in one page and the address [base+offset] is in another page, masks:monitor at user level
    :k=0  1. addressing of the format [base + offset], 2. the offset is between 1 and 2047, 3. the address specified in the base register is in one page and the address [base+offset] is in another page, masks:monitor at kernel level
--------------------------------------------------------------------------------
ARITH
    Counts arithmetic multiply operations
    :FPU_DIV_ACTIVE  Cycles that the divider is active, includes integer and floating point
    :FPU_DIV  Number of cycles the divider is activated, includes integer and floating point
    :e=0  edge level (may require counter-mask >= 1)
    :i=0  invert
    :c=0  counter-mask in range [0-255]
    :t=0  measure any thread
    :u=0  monitor at user level
    :k=0  monitor at kernel level
--------------------------------------------------------------------------------
BACLEARS
    Branch resteered
    :ANY  Counts the number of times the front end is resteered, mainly when the BPU cannot provide a correct prediction and this is corrected by other branch handling mechanisms at the front end
    :e=0  edge level (may require counter-mask >= 1)
    :i=0  invert
    :c=0  counter-mask in range [0-255]
    :t=0  measure any thread
    :u=0  monitor at user level
    :k=0  monitor at kernel level
--------------------------------------------------------------------------------
BR_INST_EXEC
    Branch instructions executed
    :NONTAKEN_COND  All macro conditional non-taken branch instructions
    :TAKEN_COND  All macro conditional taken branch instructions
    :NONTAKEN_DIRECT_JUMP  All macro unconditional non-taken branch instructions, excluding calls and indirects
    :TAKEN_DIRECT_JUMP  All macro unconditional taken branch instructions, excluding calls and indirects
    :NONTAKEN_INDIRECT_JUMP_NON_CALL_RET  All non-taken indirect branches that are not calls nor returns
    :TAKEN_INDIRECT_JUMP_NON_CALL_RET  All taken indirect branches that are not calls nor returns
    :TAKEN_RETURN_NEAR  All taken indirect branches that have a return mnemonic
    :TAKEN_DIRECT_NEAR_CALL  All taken non-indirect calls
    :TAKEN_INDIRECT_NEAR_CALL  All taken indirect calls, including both register and memory indirect
    :ALL_BRANCHES  All near executed branches instructions (not necessarily retired)
    :ALL_CONDITIONAL  All macro conditional branch instructions
    :ANY_COND  All macro conditional branch instructions
    :ANY_INDIRECT_JUMP_NON_CALL_RET  All indirect branches that are not calls nor returns
    :ANY_DIRECT_NEAR_CALL  All non-indirect calls
    :e=0  edge level (may require counter-mask >= 1)
    :i=0  invert
    :c=0  counter-mask in range [0-255]
    :t=0  measure any thread
    :u=0  monitor at user level
    :k=0  monitor at kernel level
--------------------------------------------------------------------------------
BR_INST_RETIRED
    Retired branch instructions
    :ALL_BRANCHES  All taken and not taken macro branches including far branches (Precise Event)
    :CONDITIONAL  All taken and not taken macro conditional branch instructions (Precise Event)
    :FAR_BRANCH  Number of far branch instructions retired (Precise Event)
    :NEAR_CALL  All macro direct and indirect near calls, does not count far calls (Precise Event)
    :NEAR_RETURN  Number of near ret instructions retired (Precise Event)
    :NEAR_TAKEN  Number of near branch taken instructions retired (Precise Event)
    :NOT_TAKEN  All not taken macro branch instructions retired (Precise Event)
    :e=0  edge level (may require counter-mask >= 1)
    :i=0  invert
    :c=0  counter-mask in range [0-255]
    :t=0  measure any thread
    :u=0  monitor at user level
    :k=0  monitor at kernel level
--------------------------------------------------------------------------------
BR_MISP_EXEC
    Mispredicted branches executed
    :NONTAKEN_COND  All non-taken mispredicted macro conditional branch instructions
    :TAKEN_COND  All taken mispredicted macro conditional branch instructions
    :NONTAKEN_INDIRECT_JUMP_NON_CALL_RET  All non-taken mispredicted indirect branches that are not calls nor returns
    :TAKEN_INDIRECT_JUMP_NON_CALL_RET  All taken mispredicted indirect branches that are not calls nor returns
    :NONTAKEN_RETURN_NEAR  All non-taken mispredicted indirect branches that have a return mnemonic
    :TAKEN_RETURN_NEAR  All taken mispredicted indirect branches that have a return mnemonic
    :NONTAKEN_DIRECT_NEAR_CALL  All non-taken mispredicted non-indirect calls
    :TAKEN_DIRECT_NEAR_CALL  All taken mispredicted non-indirect calls
    :NONTAKEN_INDIRECT_NEAR_CALL  All nontaken mispredicted indirect calls, including both register and memory indirect
    :TAKEN_INDIRECT_NEAR_CALL  All taken mispredicted indirect calls, including both register and memory indirect
    :ANY_COND  All mispredicted macro conditional branch instructions
    :ANY_RETURN_NEAR  All mispredicted indirect branches that have a return mnemonic
    :ANY_DIRECT_NEAR_CALL  All mispredicted non-indirect calls
    :ANY_INDIRECT_JUMP_NON_CALL_RET  All mispredicted indirect branches that are not calls nor returns
    :ALL_BRANCHES  All mispredicted branch instructions
    :e=0  edge level (may require counter-mask >= 1)
    :i=0  invert
    :c=0  counter-mask in range [0-255]
    :t=0  measure any thread
    :u=0  monitor at user level
    :k=0  monitor at kernel level
--------------------------------------------------------------------------------
BR_MISP_RETIRED
    Mispredicted retired branches
    :ALL_BRANCHES  All mispredicted macro branches (Precise Event)
    :CONDITIONAL  All mispredicted macro conditional branch instructions (Precise Event)
    :NEAR_CALL  All macro direct and indirect near calls (Precise Event)
    :NOT_TAKEN  Number of branch instructions retired that were mispredicted and not-taken (Precise Event)
    :TAKEN  Number of branch instructions retired that were mispredicted and taken (Precise Event)
    :e=0  edge level (may require counter-mask >= 1)
    :i=0  invert
    :c=0  counter-mask in range [0-255]
    :t=0  measure any thread
    :u=0  monitor at user level
    :k=0  monitor at kernel level
--------------------------------------------------------------------------------
BRANCH_INSTRUCTIONS_RETIRED
    Count branch instructions at retirement. Specifically, this event counts the retirement of the last micro-op of a branch instruction
    :e=0  edge level (may require counter-mask >= 1)
    :i=0  invert
    :c=0  counter-mask in range [0-255]
    :t=0  measure any thread
    :u=0  monitor at user level
    :k=0  monitor at kernel level
--------------------------------------------------------------------------------
MISPREDICTED_BRANCH_RETIRED
    Count mispredicted branch instructions at retirement. Specifically, this event counts at retirement of the last micro-op of a branch instruction in the architectural path of the execution and experienced misprediction in the branch prediction hardware
    :e=0  edge level (may require counter-mask >= 1)
    :i=0  invert
    :c=0  counter-mask in range [0-255]
    :t=0  measure any thread
    :u=0  monitor at user level
    :k=0  monitor at kernel level
--------------------------------------------------------------------------------
LOCK_CYCLES
    Locked cycles in L1D and L2
    :SPLIT_LOCK_UC_LOCK_DURATION  Cycles in which the L1D and L2 are locked, due to a UC lock or split lock
    :CACHE_LOCK_DURATION  Cycles in which the L1D is locked
    :e=0  edge level (may require counter-mask >= 1)
    :i=0  invert
    :c=0  counter-mask in range [0-255]
    :t=0  measure any thread
    :u=0  monitor at user level
    :k=0  monitor at kernel level
--------------------------------------------------------------------------------
CPL_CYCLES
    Unhalted core cycles at a specific ring level
    :RING0  Unhalted core cycles the thread was in ring 0
    :RING0_TRANS  Transitions from rings 1, 2, or 3 to ring 0
    :RING123  Unhalted core cycles the thread was in rings 1, 2, or 3
    :e=0  edge level (may require counter-mask >= 1)
    :i=0  invert
    :c=0  counter-mask in range [0-255]
    :t=0  measure any thread
    :u=0  monitor at user level
    :k=0  monitor at kernel level
--------------------------------------------------------------------------------
CPU_CLK_UNHALTED
    Cycles when processor is not in halted state
    :REF_P  Cycles when the core is unhalted (count at 100 Mhz)
    :THREAD_P  Cycles when thread is not halted
    :e=0  edge level (may require counter-mask >= 1)
    :i=0  invert
    :c=0  counter-mask in range [0-255]
    :t=0  measure any thread
    :u=0  monitor at user level
    :k=0  monitor at kernel level
--------------------------------------------------------------------------------
DSB2MITE_SWITCHES
    Number of DSB to MITE switches
    :COUNT  Number of DSB to MITE switches
    :PENALTY_CYCLES  Cycles SB to MITE switches caused delay
    :e=0  edge level (may require counter-mask >= 1)
    :i=0  invert
    :c=0  counter-mask in range [0-255]
    :t=0  measure any thread
    :u=0  monitor at user level
    :k=0  monitor at kernel level
--------------------------------------------------------------------------------
DSB_FILL
    DSB fills
    :ALL_CANCEL  Number of times a valid DSB fill has been cancelled for any reason
    :EXCEED_DSB_LINES  DSB Fill encountered > 3 DSB lines
    :OTHER_CANCEL  Number of times a valid DSB fill has been cancelled not because of exceeding way limit
    :e=0  edge level (may require counter-mask >= 1)
    :i=0  invert
    :c=0  counter-mask in range [0-255]
    :t=0  measure any thread
    :u=0  monitor at user level
    :k=0  monitor at kernel level
--------------------------------------------------------------------------------
DTLB_LOAD_MISSES
    Data TLB load misses
    :MISS_CAUSES_A_WALK  Demand load miss in all TLB levels which causes an page walk of any page size
    :CAUSES_A_WALK  Demand load miss in all TLB levels which causes an page walk of any page size
    :STLB_HIT  Number of DTLB lookups for loads which missed first level DTLB but hit second level DTLB (STLB); No page walk.
    :WALK_COMPLETED  Demand load miss in all TLB levels which causes a page walk that completes for any page size
    :WALK_DURATION  Cycles PMH is busy with a walk
    :e=0  edge level (may require counter-mask >= 1)
    :i=0  invert
    :c=0  counter-mask in range [0-255]
    :t=0  measure any thread
    :u=0  monitor at user level
    :k=0  monitor at kernel level
--------------------------------------------------------------------------------
DTLB_STORE_MISSES
    Data TLB store misses
    :MISS_CAUSES_A_WALK  Miss in all TLB levels that causes a page walk of any page size (4K/2M/4M/1G)
    :CAUSES_A_WALK  Miss in all TLB levels that causes a page walk of any page size (4K/2M/4M/1G)
    :STLB_HIT  First level miss but second level hit; no page walk. Only relevant if multiple levels
    :WALK_COMPLETED  Miss in all TLB levels that causes a page walk that completes of any page size (4K/2M/4M/1G)
    :WALK_DURATION  Cycles PMH is busy with this walk
    :e=0  edge level (may require counter-mask >= 1)
    :i=0  invert
    :c=0  counter-mask in range [0-255]
    :t=0  measure any thread
    :u=0  monitor at user level
    :k=0  monitor at kernel level
--------------------------------------------------------------------------------
FP_ASSIST
    X87 Floating point assists
    :ANY  Cycles with any input/output SSE or FP assists
    :SIMD_INPUT  Number of SIMD FP assists due to input values
    :SIMD_OUTPUT  Number of SIMD FP assists due to output values
    :X87_INPUT  Number of X87 assists due to input value
    :X87_OUTPUT  Number of X87 assists due to output value
    :e=0  edge level (may require counter-mask >= 1)
    :i=0  invert
    :c=0  counter-mask in range [0-255]
    :t=0  measure any thread
    :u=0  monitor at user level
    :k=0  monitor at kernel level
--------------------------------------------------------------------------------
FP_COMP_OPS_EXE
    Counts number of floating point events
    :X87  Number of X87 uops executed
    :SSE_FP_PACKED_DOUBLE  Number of SSE double precision FP packed uops executed
    :SSE_FP_SCALAR_SINGLE  Number of SSE single precision FP scalar uops executed
    :SSE_PACKED_SINGLE  Number of SSE single precision FP packed uops executed
    :SSE_SCALAR_DOUBLE  Number of SSE double precision FP scalar uops executed
    :e=0  edge level (may require counter-mask >= 1)
    :i=0  invert
    :c=0  counter-mask in range [0-255]
    :t=0  measure any thread
    :u=0  monitor at user level
    :k=0  monitor at kernel level
--------------------------------------------------------------------------------
HW_INTERRUPTS
    Number of hardware interrupts received by the processor
    :RECEIVED  Number of hardware interrupts received by the processor
    :e=0  edge level (may require counter-mask >= 1)
    :i=0  invert
    :c=0  counter-mask in range [0-255]
    :t=0  measure any thread
    :u=0  monitor at user level
    :k=0  monitor at kernel level
--------------------------------------------------------------------------------
HW_PRE_REQ
    Hardware prefetch requests
    :L1D_MISS  Hardware prefetch requests that misses the L1D cache. A request is counted each time it accesses the cache and misses it, including if a block is applicable or if it hits the full buffer, for example. This accounts for both L1 streamer and IP-based Hw prefetchers
    :e=0  edge level (may require counter-mask >= 1)
    :i=0  invert
    :c=0  counter-mask in range [0-255]
    :t=0  measure any thread
    :u=0  monitor at user level
    :k=0  monitor at kernel level
--------------------------------------------------------------------------------
ICACHE
    Instruction Cache accesses
    :MISSES  Number of Instruction Cache, Streaming Buffer and Victim Cache Misses. Includes UC accesses
    :e=0  edge level (may require counter-mask >= 1)
    :i=0  invert
    :c=0  counter-mask in range [0-255]
    :t=0  measure any thread
    :u=0  monitor at user level
    :k=0  monitor at kernel level
--------------------------------------------------------------------------------
IDQ
    IDQ operations
    :EMPTY  Cycles IDQ is empty
    :MITE_UOPS  Number of uops delivered to IDQ from MITE path
    :DSB_UOPS  Number of uops delivered to IDQ from DSB path
    :MS_DSB_UOPS  Number of uops delivered to IDQ when MS busy by DSB
    :MS_MITE_UOPS  Number of uops delivered to IDQ when MS busy by MITE
    :MS_UOPS  Number of uops were delivered to IDQ from MS by either DSB or MITE
    :MITE_UOPS_CYCLES  Cycles where uops are delivered to IDQ from MITE (MITE active)
    :DSB_UOPS_CYCLES  Cycles where uops are delivered to IDQ from DSB (DSB active)
    :MS_DSB_UOPS_CYCLES  Cycles where uops delivered to IDQ when MS busy by DSB
    :MS_MITE_UOPS_CYCLES  Cycles where uops delivered to IDQ when MS busy by MITE
    :MS_UOPS_CYCLES  Cycles where uops delivered to IDQ from MS by either BSD or MITE
    :ALL_DSB_UOPS  Number of uops deliver from either DSB paths
    :ALL_DSB_CYCLES  Cycles MITE/MS deliver anything
    :ALL_MITE_UOPS  Number of uops delivered from either MITE paths
    :ALL_MITE_CYCLES  Cycles DSB/MS deliver anything
    :ANY_UOPS  Number of uops delivered to IDQ from any path
    :MS_DSB_UOPS_OCCUR  Occurences of DSB MS going active
    :e=0  edge level (may require counter-mask >= 1)
    :i=0  invert
    :c=0  counter-mask in range [0-255]
    :t=0  measure any thread
    :u=0  monitor at user level
    :k=0  monitor at kernel level
--------------------------------------------------------------------------------
IDQ_UOPS_NOT_DELIVERED
    Uops not delivered
    :CORE  Number of non-delivered uops to RAT (use cmask to qualify further)
    :e=0  edge level (may require counter-mask >= 1)
    :i=0  invert
    :c=0  counter-mask in range [0-255]
    :t=0  measure any thread
    :u=0  monitor at user level
    :k=0  monitor at kernel level
--------------------------------------------------------------------------------
ILD_STALL
    Instruction Length Decoder stalls
    :LCP  Stall caused by changing prefix length of the instruction
    :IQ_FULL  Stall cycles due to IQ full
    :e=0  edge level (may require counter-mask >= 1)
    :i=0  invert
    :c=0  counter-mask in range [0-255]
    :t=0  measure any thread
    :u=0  monitor at user level
    :k=0  monitor at kernel level
--------------------------------------------------------------------------------
INSTS_WRITTEN_TO_IQ
    Instructions written to IQ
    :INSTS  Number of instructions written to IQ every cycle
    :e=0  edge level (may require counter-mask >= 1)
    :i=0  invert
    :c=0  counter-mask in range [0-255]
    :t=0  measure any thread
    :u=0  monitor at user level
    :k=0  monitor at kernel level
--------------------------------------------------------------------------------
INST_RETIRED
    Instructions retired
    :ANY_P  Number of instructions retired
    :PREC_DIST  Precise instruction retired event to reduce effect of PEBS shadow IP distribution (Precise Event)
    :e=0  edge level (may require counter-mask >= 1)
    :i=0  invert
    :c=0  counter-mask in range [0-255]
    :t=0  measure any thread
    :u=0  monitor at user level
    :k=0  monitor at kernel level
--------------------------------------------------------------------------------
INSTRUCTION_RETIRED
    Number of instructions at retirement
    :e=0  edge level (may require counter-mask >= 1)
    :i=0  invert
    :c=0  counter-mask in range [0-255]
    :t=0  measure any thread
    :u=0  monitor at user level
    :k=0  monitor at kernel level
--------------------------------------------------------------------------------
INSTRUCTIONS_RETIRED
    This is an alias for INSTRUCTION_RETIRED
|| :e=0 || edge level (may require counter-mask >= 1) || :i=0 || invert || :c=0 || counter-mask in range [0-255] || :t=0 || measure any thread || :u=0 || monitor at user level || :k=0 || monitor at kernel level |--------------------------------------------------------------------------------| INT_MISC || Miscellaneous internals || :RAT_STALL_CYCLES || Cycles RAT external stall is sent to IDQ for this thread || :RECOVERY_CYCLES || Cycles waiting to be recovered after Machine Clears due to all oth|| er cases except JEClear || :RECOVERY_STALLS_COUNT || Number of times need to wait after Machine Clears due to all other|| cases except JEClear || :e=0 || edge level (may require counter-mask >= 1)