Enable profiling by line number.

Submitted by junyan.he@inbox.com on Dec. 24, 2015, 11:01 a.m.

Details

Reviewer None
Submitted Dec. 24, 2015, 11:01 a.m.
Last Updated Dec. 24, 2015, 11:03 a.m.
Revision 1

Cover Letter(s)

Revision 1
      From: Junyan He <junyan.he@linux.intel.com>

This patch set will let the user to specify the line numbers in the source
code to insert the profiling watch points.
As the first step, we just use the env var OCL_PROFILING_LINES to control
the kernel name and line numbers. The format is:
KERNEL_NAME:PROFILING_MODE:LINE_NUMBER0,LINE_NUMBER1,LINE_NUMBER2,...
for example:
export OCL_PROFILING_LINES="builtin_atanpi_float8:2:2,6,7,8,15"
will insert watch points at 2 6 7 8 15 lines in the kernel named builtin_atanpi_float8.
We have 3 PROFILING_MODE,
  level 1: just brief timestamp with line number.
               Total log number is 6
	       Line        2:    Timestamp:       190  Thread Exec:6
	       Line        6:    Timestamp:      1174  Thread Exec:6
	       Line        7:    Timestamp:      3092  Thread Exec:6
	       Line        8:    Timestamp:      3105  Thread Exec:6
	       Line       15:    Timestamp:      3241  Thread Exec:6

   level 2: timestamp with source, plus:
   Format: Average Timestamp  Exec number Source
                                |  __kernel void builtin_atanpi_float8(__global float *dst,  __global float *src1, __global int *vector) {
   TS:      190  Num:   6 ----> |    int i = get_global_id(0);
                                |    float8 x1 = (float8) (src1[i * (*vector) + 0],src1[i * (*vector) + 1],src1[i * (*vector) + 2],src1[i * (*vector) + 3],src1[i * (*vector) + 4],src1[i * (*vector) + 5],src1[i * (*vector) + 6],src1[i * (*vector) + 7]);
                                |  
                                |    float8 ret;
   TS:     1174  Num:   6 ----> |    ret = atanpi(x1);
   TS:     3092  Num:   6 ----> |    dst[i * (*vector) + 0] = ret[0];
   TS:     3105  Num:   6 ----> |    dst[i * (*vector) + 1] = ret[1];
                                |    dst[i * (*vector) + 2] = ret[2];
                                |    dst[i * (*vector) + 3] = ret[3];
                                |    dst[i * (*vector) + 4] = ret[4];
                                |    dst[i * (*vector) + 5] = ret[5];
                                |    dst[i * (*vector) + 6] = ret[6];
                                |    dst[i * (*vector) + 7] = ret[7];
   TS:     3241  Num:   6 ----> |  };


   level 3: output the detail logs, add all logs as:
   ------------------------ Log 0      -----------------------
   | fix functions id:   7     simd:   16   kernel id:    0  |
   | thread id:    0  EU id:   8  sub slice id: 0 slice id 0 |
   | dispatch Mask:   1 prolog:      6860  epilog:     19548 |
   | globalX:   3~   3  globalY:   0~   0  globalZ:   0~   0 |
   |  ts0 :       201  | ts1 :      1180  | ts2 :     12417  |
   |  ts3 :     12430  | ts4 :     12637  | ts5 :         0  |
   |  ts6 :         0  | ts7 :         0  | ts8 :         0  |
   |  ts9 :         0  | ts10:         0  | ts11:         0  |
   |  ts12:         0  | ts13:         0  | ts14:         0  |
   |  ts15:         0  | ts16:         0  | ts17:         0  |
   |  ts18:         0  | ts19:         0  |                  |
   ------------------------ Log 1      -----------------------
   | fix functions id:   7     simd:   16   kernel id:    0  |
   | thread id:    0  EU id:   8  sub slice id: 1 slice id 0 |
   | dispatch Mask:   1 prolog:      6877  epilog:     19569 |
   | globalX:   4~   4  globalY:   0~   0  globalZ:   0~   0 |
   |  ts0 :       209  | ts1 :      1190  | ts2 :     12423  |
   |  ts3 :     12436  | ts4 :     12643  | ts5 :         0  |
   |  ts6 :         0  | ts7 :         0  | ts8 :         0  |
   |  ts9 :         0  | ts10:         0  | ts11:         0  |
   |  ts12:         0  | ts13:         0  | ts14:         0  |
   |  ts15:         0  | ts16:         0  | ts17:         0  |
   |  ts18:         0  | ts19:         0  |                  |
   .....
   .....



Some problems:
1. On BDW, the timestamp sometimes gives invalid huge value.
   It may be a HW issue or feature, we need to check it further.
2. Sometimes the line number of instruction is different from the
   source code. This is caused by optimization and we can notice
   and analyse it by Gen IR or ASM. I will send a patch to set
   optimization level later.
3. Some line numbers are missing when there are lots of inline
   function call and macro. I will investigate llvm and clang to
   find a better solution.

I will write a detail doc about the profiling later.

With this parch set, please just ignore the previous two patchsets
about debug info.


Signed-off-by: Junyan He <junyan.he@linux.intel.com>
---
 backend/src/backend/gen_context.cpp               |  20 +-
 backend/src/backend/gen_encoder.cpp               |   9 +-
 backend/src/backend/gen_insn_selection.cpp        |  10 +-
 backend/src/backend/gen_insn_selection_output.cpp |   3 +
 backend/src/backend/gen_program.cpp               |  11 +-
 backend/src/backend/gen_register.hpp              |   4 +-
 backend/src/backend/program.cpp                   |  34 ++--
 backend/src/backend/program.h                     |   3 +-
 backend/src/backend/program.hpp                   |   2 +-
 backend/src/ir/context.cpp                        |   4 +-
 backend/src/ir/function.cpp                       |   6 +-
 backend/src/ir/function.hpp                       |  10 +
 backend/src/ir/instruction.cpp                    |  11 +-
 backend/src/ir/instruction.hpp                    |   3 +
 backend/src/ir/lowering.cpp                       |  15 +-
 backend/src/ir/profiling.cpp                      |  72 +++++++-
 backend/src/ir/profiling.hpp                      |  35 +++-
 backend/src/ir/structurizer.cpp                   |  55 ++++--
 backend/src/ir/structurizer.hpp                   |   6 +-
 backend/src/ir/unit.cpp                           |   9 +-
 backend/src/ir/unit.hpp                           |  12 +-
 backend/src/llvm/ExpandConstantExpr.cpp           |   6 +-
 backend/src/llvm/ExpandLargeIntegers.cpp          | 173 +++++++++--------
 backend/src/llvm/llvm_gen_backend.cpp             |  11 +-
 backend/src/llvm/llvm_gen_backend.hpp             |  11 +-
 backend/src/llvm/llvm_intrinsic_lowering.cpp      |   8 +-
 backend/src/llvm/llvm_passes.cpp                  |  13 +-
 backend/src/llvm/llvm_profiling.cpp               | 215 +++++++++++++++++-----
 backend/src/llvm/llvm_sampler_fix.cpp             |   7 +
 backend/src/llvm/llvm_scalarize.cpp               |   5 +-
 backend/src/llvm/llvm_to_gen.cpp                  |   4 +-
 backend/src/llvm/llvm_to_gen.hpp                  |   2 +-
 src/cl_program.c                                  |   3 +-
 33 files changed, 535 insertions(+), 257 deletions(-)
    

Revisions

Patches download mbox

# Name Submitter State A F R T
[01/18] Backend: Refine the CopyDebug template function. junyan.he@inbox.com New
[02/18] Backend: Use CopyDebug when we add llvm instructions. junyan.he@inbox.com New
[03/18] Backend: Store the debug info for gen instruction. junyan.he@inbox.com New
[04/18] Backend: Init the DBGInfo in all constructors. junyan.he@inbox.com New
[05/18] Backend: Add debug info to the return lower. junyan.he@inbox.com New
[06/18] Backend: Add the debug info to IF ELSE instructions. junyan.he@inbox.com New
[07/18] Backend: Add debug info to BRA instruction. junyan.he@inbox.com New
[08/18] Backend: Output line and column number for insn selection. junyan.he@inbox.com New
[09/18] Backend: add line and column to ASM output. junyan.he@inbox.com New
[10/18] Backend: Move profilingInfo from Unit to Function. junyan.he@inbox.com New
[11/18] Backend: Add more APIs to ProfilingInfo class. junyan.he@inbox.com New
[12/18] Backend: Add the profiling output with source. junyan.he@inbox.com New
[13/18] Backend: Disable the welform for profiling instructions. junyan.he@inbox.com New
[14/18] Backend: Add parameter source to all the build functions. junyan.he@inbox.com New
[15/18] Backend: Fix two bugs for Gen context. junyan.he@inbox.com New
[16/18] Backend: Refine llvm_profiling pass. junyan.he@inbox.com New
[17/18] Backend: Use OCL_PROFILING_LINES to replace OCL_PROFILING_LOG var. junyan.he@inbox.com New
[18/18] Make -g a default option arg for clang if output debug. junyan.he@inbox.com New