<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
<html> <head>
<title>Running compiled programs</title>
</head>
<body>
<h1>Running compiled programs</h1>
<h3>Program flags</h3>
A Haskell program compiled with hbc automatically decodes a number of flags:
<ul>
<li>-C<br>
Produce a ``core'' file when a signal occurs.
<li>-f<br>
Prints a description of the flags; the program is not executed.
<li>-H<var>size</var><br>
Set maximum heap size. Default is 8M.
<li>-h<var>size</var><br>
Set minimum heap size. Default is 500k.
<li>-A<var>size</var><br>
Set pointer stack size. Default is 100k.
<li>-V<var>size</var><br>
SPARC only! Set return stack size. Default is 50000.
<li>-T<br>
Enter the runtime tracer. This only works if some of the files
were compiled with the ``-T'' flag. If this flag is used twice,
a trace is produced without any user interaction.
<li>-gc-gen<br>
Use a generational garbage collector.
<li>-gc-slide<br>
Use an in-place compacting garbage collector.
<li>-X<br>
Debug mode (gives some additional messages).
<li>-<br>
Marks the end of decoded arguments.
</ul>
<p>
If the runtime system was compiled with dumping enabled there are some
additional flags:
<ul>
<li>-d<br>
Print a stack dump on error.
<li>-G<br>
Print a stack dump before and after each garbage collection.
<li>-K<var>addr</var><br>
When GC reaches stack location <var>addr</var> the routine debstop is called.
<li>-M<var>n</var><br>
The maximum number of dumped nodes is set to <var>n</var>.
<li>-t<var>n</var><br>
The depth of dump is set to <var>n</var>.
</ul>
<p>
If the runtime system was compiled with GC statistics enabled there
are some additional flags:
<ul>
<li>-B<br>
Sound the bell at the start of garbage collection.
<li>-S<var>file</var><br>
Produce a garbage collection statistics file. If no file name is given,
the statistics are written to ``STAT.<var>program</var>''. If the file name is ``stderr'', they are written
to standard error.
</ul>
<p>
The file ``STAT.<var>program</var>'' (when produced)
will contain various (self-explanatory?) statistics.
<h3>Heap profiling (Authors: Colin Runciman and David Wakeling)</h3>
If the program has been compiled with heap profiling turned on (the -ph flag
to the compiler), it decodes the following flags:
<ul>
<li>-i<var>float</var><br>
Set the sampling interval of the heap profiler to <var>float</var> seconds.
Normally the profiler runs with an exponentially increasing profiling
interval.
<li>-g[{<var>g</var>,...}]<br>
Give profiling information by <var>module group</var>.
When the graph is sampled the space occupied by each node (in bytes) is
charged to the particular group of modules that produced the node. In
some cases, only certain groups of modules may be of interest, and
these groups can be named in an optional restriction set following the
-g flag.
<li>-m[{<var>m</var>,...}]<br>
Give profiling information by <var>module</var>. Similar to the -g flag.
In this case though, the space occupied by each node is charged to the
module that produced the node. Once again, only certain modules may be
of interest, and those can be named in a restriction set.
<li>-p[{<var>p</var>,...}]<br>
Give profiling information by <var>producer</var>.
In this case, the space occupied by each node is charged to the
function that produced the node.
<li>-c[{<var>c</var>,...}]<br>
Give profiling information by <var>construction</var>.
Similar to the -p flag, but in this case the space occupied by each node is charged to the
construction that it represents, with the function component being used
for closures.
<li>-t[{<var>t</var>,...}]<br>
Give profiling information by <var>type</var>.
In this case, the space occupied by each node is charged to the
type of the node.
</ul>
Two or more of the -g, -m, -p, and -c
flags may be used together. In this case, the first flag specifies what
kind of profile is to be produced, and the remainder are used to specify
restrictions. <p>
During reduction the graph is periodically sampled and the samples
are written to a file whose name is that of the program, extended
with a ``.hp'' suffix. This file can be converted to a PostScript
file by the hp2ps program.<p>
A notable feature is that there is absolutely no concept of scope.
Profiles do not distinguish between nested functions with the same
name, or between functions in different modules with the same name.
The only way to make such distinctions is to copy and rename function
bodies. Some names get lost during compilation, and obscure
identifiers appear in their place. This happens most often in programs
that make heavy use of higher-order functions; my apologies.<p>
Examples:<p>
Running the program ``a.out -p'' gives a producer profile.<p>
``a.out -p -c{(.)}'' gives a producer profile, in which the only producers of interest are those
of ``(.)'' nodes. <p>
``a.out -c -m{lex,parse,typecheck}'' gives a construction profile,
but only for constructions produced by the modules ``lex'', ``parse'' and ``typecheck''.<p>
``a.out -c -m{lex,parse,typecheck} -p{tokeniser,syntax,tcheck}''
gives a construction profile, but only for constructions produced by the
modules ``lex'', ``parse'' and ``typecheck'', and only for the functions
``tokeniser'', ``syntax'' and ``tcheck'' within these modules.<p>
<h3>Tracing</h3>
There is no debugger available that can handle programs produced by lmlc/hbc,
but if programs are compiled with the ``-T'' flag there is a simple interactive
tracer that can be used. The tracer is invoked by giving the ``-T'' flag to a
<em>compiled</em> program. Unfortunately the tracer cannot be used together with
the interactive system (yet).
The tracer has an interactive interface where the user can turn tracing on and
off, run until a certain point, etc.
<p>
The following commands are available:
<ul>
<li><tt>help</tt><br>
Print a help message describing the commands.
<li><tt>quit</tt><br>
Quit the tracer and the program.
<li><tt>leave</tt><br>
Leave a recursive invocation of the tracer.
<li><tt>next</tt><br>
Trace (i.e. print messages) until the next function is entered.
<li><tt>cont</tt><br>
Run (i.e. do not print messages) the program to completion.
<li><tt>rcont</tt><br>
Trace the program to completion.
<li><tt>exit</tt><br>
Trace until the current function exits.
<li><tt>rexit</tt><br>
Run until the current function exits.
<li><tt>stop</tt> <var>re</var><br>
Set breakpoints on all functions matching <var>re</var>.
<li><tt>nostop</tt> <var>re</var><br>
Remove breakpoints from all functions matching <var>re</var>.
<li><tt>arg</tt> <var>n</var><br>
Evaluate (to WHNF) and print argument number <var>n</var>.
<li><tt>farg</tt> <var>n</var><br>
Fully evaluate and print argument number <var>n</var>.
<li><tt>on</tt> <var>re</var><br>
Turn on tracing for functions matching <var>re</var>.
<var>Re</var> may contain ``*'' which matches any number of characters.
<li><tt>off</tt> <var>re</var><br>
Turn off tracing for functions matching <var>re</var>.
<var>Re</var> may contain ``*'' which matches any number of characters.
<li><tt>where</tt><br>
Show call stack.
<li><tt>depth</tt> <var>n</var><br>
Set print depth to <var>n</var>. Default value is 3.
<li><tt>file y/n</tt><br>
Turn on/off file (module) name printing.
</ul>
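<p>
The ``*'' wildcard accepted by <tt>on</tt> and <tt>off</tt> can be illustrated with a small sketch.
This is not the tracer's own code: <tt>matchStar</tt> is a hypothetical helper, and it assumes that the
pattern must match the whole function name and that ``*'' may also match zero characters.

```haskell
-- Hypothetical helper, not part of hbc: matches a name against a pattern
-- in which '*' stands for any number of characters (assumed: zero or more).
matchStar :: String -> String -> Bool
matchStar []       s      = null s                 -- pattern used up: name must be too
matchStar ('*':ps) s      = matchStar ps s         -- '*' matches nothing more
                         || (not (null s) && matchStar ('*':ps) (tail s))
                                                   -- or '*' swallows one more character
matchStar (p:ps)   (c:cs) = p == c && matchStar ps cs
matchStar _        []     = False                  -- name used up, pattern is not
```

Under these assumptions a pattern such as ``lex.*'' would select every function whose name starts
with ``lex.'', which fits the module-name prefix described for identifiers.
<p>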
Identifiers may be prefixed by their file (module) name followed by a dot
to make them unique. An empty command will repeat the previous command.
Any kind of error or call to fail will cause the tracer to be entered.
The tracer prints the following messages:
<ul>
<li><tt>Enter</tt> <var>expr</var><br>
A function is just about to be entered. The <var>expr</var> shows the function
with its arguments.
<li><tt>Return</tt> <var>expr</var><br>
A function is just about to return with an evaluated expression.
<li><tt>Return variable</tt><br>
A function is just about to return with a variable that might evaluate
to a function.
<li><tt>Jump (unknown)</tt><br>
A function is just about to tail call another function that was not known at compile time
(or had a number of arguments that did not correspond to its arity).
<li><tt>Jump</tt><br>
A function is just about to tail call a function known at compile time.
</ul>
Each message is indented with a depth corresponding to the call stack depth.
In the case of a tail call, the function being called can be seen from
the following ``Enter'' message, provided that function has been compiled with the
trace flag on. The enter and exit messages always come in pairs, even if
tracing is turned off for a particular function in between.
<p>
The tracer is not able to determine the type for all constructed values.
If it cannot, it uses the name CON<var>n</var> for the <var>n</var>th constructor of a type.
<p>
Traced and non-traced modules can be mixed, but only calls to traced code can be observed.
Each module in a program can be compiled for tracing, but then the trace flag
can be omitted when linking the program. In this case the program will run with
a moderate slowdown (it will take about 25% longer). If it is linked for tracing,
but run without the ``-T'' flag, it may run as much as 5 times slower.
<p>
A problem with understanding the trace messages is that they refer to the program
after the extensive transformations performed by the compiler. An aid to understanding
is to look at the program after all transformations; given the ``-ftransformed'' flag
the compiler will show this.
<p>
<h3>Memory allocation</h3>
The current strategy for memory allocation is as follows:
On startup a heap is allocated; its size never changes during
execution. The size is the heap size
given as an argument, or a default of 8 megabytes.
<p>
Only part of this memory is used during execution, to lower the working
set of the program. How large a part is determined after each garbage
collection. The amount used (i.e., available for allocation) is
the amount that was copied when the collection occurred multiplied by 4.
In this way the working set is
adapted to the amount of heap that is actually in use.
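<p>
As a rough model of this rule, the sketch below computes the usable amount after a collection.
The names are illustrative and not taken from the runtime system. Capping the result at the total
heap size is an assumption (the text only says the heap never grows), as is reading ``8 megabytes''
as 8 * 1024 * 1024 bytes.

```haskell
-- Illustrative model only; sizes are in bytes.

-- Default heap size named in the text (assumed to mean 8 * 2^20 bytes).
defaultHeapSize :: Int
defaultHeapSize = 8 * 1024 * 1024

-- Amount of heap made available for allocation after a collection that
-- copied `copied` bytes: four times the copied amount, capped (assumption)
-- at the fixed size of the heap.
usableAfterGC :: Int -> Int -> Int
usableAfterGC heapSize copied = min heapSize (4 * copied)
```

With these definitions, a collection that copies 512 kilobytes would leave 2 megabytes available,
while one that copies 3 megabytes would be limited by the 8 megabyte heap.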
<h3>Tips to get efficient programs</h3>
NOTHING MUCH WRITTEN YET (maybe it's impossible?)!!!
<h4>Haskell overloading</h4>
Overloaded Haskell functions are nice but slow. The compiler tries to
remove overloading where the types are known, so type signatures help
to improve efficiency. It is a good idea, both from efficiency and
also from a programming point of view, to include type signatures for
all top level functions in a module. With optimization turned on, the
compiler tries to make specialized versions of a function for all the types it
is used at. This is currently impossible across module boundaries, so
for exported functions the <tt>SPECIALIZE</tt> pragma should be used.
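<p>
A minimal sketch of both pieces of advice follows. The function names are made up for
illustration, and the pragma is written in the form used by current Haskell compilers; the exact
syntax accepted by hbc is an assumption here and should be checked against the compiler documentation.

```haskell
-- An overloaded function that is assumed to be exported from its module.
-- The signature documents the overloading; the pragmas ask for specialized
-- versions at the types it is expected to be used at in other modules.
sumSquares :: Num a => [a] -> a
sumSquares xs = sum [x * x | x <- xs]
{-# SPECIALIZE sumSquares :: [Int] -> Int #-}
{-# SPECIALIZE sumSquares :: [Double] -> Double #-}

-- A function that is only ever needed at one type: the signature fixes the
-- type at Int, so no overloading remains to be resolved at run time.
countEven :: [Int] -> Int
countEven xs = length (filter even xs)
```

Without the signature on <tt>countEven</tt>, the inferred type would be overloaded over all
<tt>Integral</tt> types.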
<p>
It is generally not a good idea to worry too much about what the
compiler does, but here are a few general tips.
Elementary functions, as well as certain idioms, on simple basic data
types turn into a few machine instructions.
The number of machine instructions cited below does not include those
that compute the value, check if it is computed, load it into a
register etc. These instructions usually take much longer than the
operation itself, but the list gives an indication of the operations for which
you can expect reasonable efficiency.
<p>
<ul>
<li><tt>Bool</tt><br>
<ul>
<li><tt>==,/=,<,<=,>,>=</tt><br>
turn into a single machine instruction.
</ul>
<li><tt>Char</tt><br>
<ul>
<li><tt>chr,ord</tt><br>
usually turn into nothing at all.
<li><tt>==,/=,<,<=,>,>=</tt><br>
turn into a single machine instruction.
</ul>
<li><tt>Int</tt><br>
<ul>
<li><tt>==,/=,<,<=,>,>=,+,-,*,negate,quot,rem</tt><br>
turn into a single machine instruction.
<li><tt>max,min,abs,signum,div,mod,even,odd</tt><br>
turn into a few machine instructions.
</ul>
<li><tt>Float,Double</tt><br>
<ul>
<li><tt>==,/=,<,<=,>,>=,+,-,*,negate,/,fromRational.toRational,fromInt</tt><br>
turn into single machine instructions. The composition
<tt>fromRational.toRational</tt> turns into a single instruction if the
argument/result type is <tt>Float</tt>/<tt>Double</tt> or <tt>Double</tt>/<tt>Float</tt>.
<li><tt>truncate</tt><br>
turns into a single machine instruction if the result type is <tt>Int</tt>.
<li><tt>max,min,abs,signum</tt><br>
turn into a few machine instructions.
<li><tt>exp,log,sqrt,sin,cos,tan,asin,acos,atan,sinh,cosh,tanh</tt><br>
turn into calls to C. For type <tt>Float</tt> the argument is first
converted to <tt>Double</tt>, then the C function is called, and the
result is converted back again.
</ul>
<li><tt>Integer</tt><br>
All functions involve a call to C routines, so they are likely to be
slow for small values, but for large numbers they are quite efficient
as the actual computations tend to dominate the running time.
<li><tt>Complex</tt><br>
Complex numbers based on <tt>Float</tt> and <tt>Double</tt> have specialized
instances everywhere in the Prelude and are fairly efficient. Part of
the efficiency comes from the strict data constructor for <tt>Complex</tt>.
<li><tt>Rational</tt><br>
Has specialized instances, but is not as efficient as you could make
it by calling C functions to do the arithmetic.
</ul>
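<p>
The idioms mentioned for <tt>Float</tt> and <tt>Double</tt> depend on the types being known, which
is easiest to guarantee with explicit signatures. The sketch below uses made-up function names; it
shows the shape of the code the list refers to and makes no claim about the instructions generated.

```haskell
-- Conversion between the floating types written as the composition named
-- above; the signatures pin the argument/result types to Float/Double.
floatToDouble :: Float -> Double
floatToDouble = fromRational . toRational

doubleToFloat :: Double -> Float
doubleToFloat = fromRational . toRational

-- truncate with the result type fixed at Int, the case singled out above.
wholePart :: Double -> Int
wholePart = truncate
```

If the result type of <tt>truncate</tt> is left overloaded, or defaults to <tt>Integer</tt>, the
case described in the list does not apply.
<p>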
There are a few things that should be currently avoided because they
are very slow:
<ul>
<li><tt>show,read</tt><br> on numeric types in general, and floating
types in particular.
<li><tt>atan2</tt><br>
it is a real beast.
<li><tt>fromRational</tt><br>
is very slow for floating types. An exception is <tt>fromRational</tt>
applied to a constant with a result type which is <tt>Float</tt> or
<tt>Double</tt> which the compiler handles specially.
<li><tt>toRational,fromInteger</tt><br>
can be slow. Again, <tt>fromInteger</tt> on constants is handled
specially.
<li><tt>gcd</tt><br>
uses Euclid's algorithm.
<li><tt>Array</tt><br>
All operations on arrays are worse than you would hope for.
Some operations on arrays with <tt>Int</tt> as index are handled more efficiently.
</ul>
Most Prelude functions involving numeric types have specialized
instances for all the numeric types in the Prelude, e.g.
<tt>sum</tt> can sum lists of any Prelude numeric type efficiently.
<p>
<hr>
<address></address>
<!-- hhmts start -->
Last modified: Mon Jul 22 01:15:26 MET DST 1996
<!-- hhmts end -->
</body> </html>