-
Notifications
You must be signed in to change notification settings - Fork 0
Expand file tree
/
Copy pathREADME
More file actions
150 lines (123 loc) · 6.65 KB
/
README
File metadata and controls
150 lines (123 loc) · 6.65 KB
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
WIICA: Workload ISA-Independent Characterization for Applications
=================================================================
WIICA is a workload characterization tool to characterize the ISA-independent
characteristics of applications in the context of specialized architectures.
If you use WIICA in your research, please cite:
ISA-Independent Workload Characterization and its Implications for Specialized
Architectures,
Yakun Sophia Shao and David Brooks,
International Symposium on Performance Analysis of Systems and Software
(ISPASS), April 2013
==================================================================
0. Build WIICA
1) LLVM 3.4 and Clang 3.4 64-bit
2) LLVM IR Trace Profiler (LLVM-Tracer)
LLVM-Tracer is an LLVM compiler pass that instruments code in LLVM
machine-independent IR. It prints out a dynamic trace of your program, which
then be take as input for WIICA (and Aladdin.)
You can download LLVM-Tracer from here:
[https://github.com/ysshao/LLVM-Tracer]
To build LLVM-Tracer:
1. Set `LLVM_HOME` to where you installed LLVM
```
export LLVM_HOME=/your/path/to/llvm
export PATH=$LLVM_HOME/bin/:$PATH
export LD_LIBRARY_PATH=$LLVM_HOME/lib/:$LD_LIBRARY_PATH
```
2. Go to where you put LLVM-Tracer source code
```
cd /path/to/LLVM-Tracer
cd /path/to/LLVM-Tracer/full-trace
make
cd /path/to/LLVM-Tracer/profile-func
make
```
=================================================================
1. Run WIICA:
After you build LLVM-Tracer, you need to specify the function names you want to instrument in the programs to run in environment variable WORKLOAD. You can use example SHOC programs in the SHOC
director to run WIICA. Here we provide all the function names in setenv.sh.
An example to run wiica is:
source setenv.sh
cd scripts
python run_wiica.py --directory /your/path/to/wiica/SHOC/fft/ --kernel fft --source fft --arguments non --analysis_types memory
=================================================================
Related scripts:
1) run_wiica.py
The interface of wiica.
usage: run_wiica.py [-h] [--directory DIRECTORY] [--kernel KERNEL]
[--source SOURCE]
[--arguments [ARGUMENTS [ARGUMENTS ...]]]
[--test_file TEST_FILE]
[--analysis_types [{opcode,staticinst,memory,branch,basicblock,register} [{opcode,staticinst,memory,branch,basicblock,register} ...]]]
optional arguments:
-h, --help show this help message and exit
--directory DIRECTORY
ABSOLUTE directory of the benchmark
--kernel KERNEL benchmark to analyze. If the benchmark is one of the
algorithms of the kernel, use kernel.algorithm instead
--source SOURCE the name of source file, e.g. fft, md, etc.
--arguments [ARGUMENTS [ARGUMENTS ...]]
the list of arguments to run the program. If no argument, put "non"
--analysis_types [{opcode,staticinst,memory,branch,basicblock,register} [{opcode,staticinst,memory,branch,basicblock,register} ...]]
Type of analysis. Separate multiple values with
spaces. The supported analysis types are shown.
2) compile.py
Compiling the program with LLVM-Tracer to generate a dynamic LLVM IR trace.
3) process_trace.py
For those benchmarks with "llvm.memset" instrinsic. It replaces "llvm.memset" with several non-intrinsic instructions.
4) analysis.py
Performing opcode,staticinst,memory,branch,basicblock,register analysis.
Opcode: Opcode Breakdown into Compute, Memory, and Branch
StaticInst: Number of dynamic executions for each static instruction, sorted
by the dynamic counts
Memory: Memory Footprint, Memory Global/Local Entropy[Shao2013]
Branch: Branch Entropy[Shao2013]
BasicBlock: Size and number of dynamic executions of each basic block
5) mem_analysis.py
Spatial Locality Score, see [Weinberg2005] for more details.
Temporal Locality Scor, see [Weinberg2005] for more details.
6) reg_analysis.py
Register Degrees: The averarge use of registers, equals to the total number of register read divided by the total number of register write, see [Franklin1992] for more details.
Register Distribution: The distribution of the register dependency distance.
Register Lifetime: The distribution of the distance between the creation and the last use of registers, see [Franklin1992] for more details.
Register Number: The number of register required at a certain point. We assume the application is executed 1 instruction per cycle.
=================================================================
2.WIICA Outputs:
Stats files are generated to store the results including:
(These files are generated from analysis.py)
[bench name]_opcode_profile
[bench name]_staticinst_profile
[bench name]_footprint Memory footprint
[bench name]_mem_entropy
[bench name]_branch_entropy
[bench name]_basicblock_profile
(These files are generated from mem_analysis.py)
[bench name]_spatial_locality
[bench name]_temporal_locality
[bench name]_stride_profile Used to compute spatial locality
[bench name]_reuse_profile Used to compute temporal locality
(These files are generated from reg_analysis.py)
[bench name]_reg_degree total read / total write
[bench name]_reg_distribution dependency distance distribution
[bench name]_reg_lifetime the distance (between when the register is created with when it is used for the last time) distribution
[bench name]_reg_number The dynamic register number needed at each cycle (assume 1 cycle / instruction)
[bench name]_reg_maxn The maximun number in [bench name]_reg_number, which is the minimun number of registers needed to run the program
=================================================================
4. Feedback
Feel free to email to archbugreport@gmail.com if you have any questions or
comments.
=================================================================
Yu Emma Wang, Sophia Yakun Shao
VLSI-Arch group
Harvard University
July 26, 2014
=================================================================
References:
[Weinberg2005] J. Weinberg, M.O. McCracken, E. Strohmaier, and A. Snavely.
Quantifying Locality in the Memory Access Patterns of HPC Applications, SC, 2005
[Shao2013] Y.S. Shao and D. Brooks.
ISA-Independent Workload Characterization and its Implications for Specialized
Architecture, ISPASS, 2013
[Franklin1992] Franklin, M., & Sohi, G. S.
Register traffic analysis for streamlining inter-operation communication in fine-grain parallel processors.
In ACM SIGMICRO Newsletter (Vol. 23, No. 1-2, pp. 236-245). IEEE Computer Society Press, 1992.