-
Notifications
You must be signed in to change notification settings - Fork 23
/
Copy patherr.sbl
465 lines (306 loc) · 12.7 KB
/
err.sbl
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
-in80
-title err.spt : compress spitbol error messages
-stitle initialization
* Copyright 1987-2012 Robert B. K. Dewar and Mark Emmer.
* Copyright 2012-2017 David Shields
* this file is part of macro spitbol.
* macro spitbol is free software: you can redistribute it and/or modify
* it under the terms of the gnu general public license as published by
* the free software foundation, either version 2 of the license, or
* (at your option) any later version.
* macro spitbol is distributed in the hope that it will be useful,
* but without any warranty; without even the implied warranty of
* merchantability or fitness for a particular purpose. see the
* gnu general public license for more details.
* you should have received a copy of the gnu general public license
* along with macro spitbol. if not, see <http://www.gnu.org/licenses/>.
* err.spt
* find a near optimum compression for a spitbol error message file.
* spitbol -u "machine" -1<input file> -2<output file> err
* the variable machine is set equal to the uppercase name of the machine
* being processed. usage is optional.
* technique:
* the 159 characters from 1 to 31, and 128 to 255 are used to
* represent words and phrases from the error file. the error
* list is then reduced to two character strings:
* 1. an error message string
* 2. a phrase string
* within each string, the individual messages or phrases are
* separated by '\0' characters. the nth element of either
* string can be found by scanning and counting '\0's.
* when a character with value 1 to 31, or 128 to 255 is encountered,
* it should be mapped to an ordinal number in the range [1-159], and
* the phrase string consulted to obtain the appropriate phrase.
* naturally (this being snobol4), phrases may recursively contain
* other code characters.
* the problem reduces to finding the 159 phrases that will
* provide the greatest compression. for any phrase p of length s
* that occurs n times in the source error message file, replacing
* it with a single character and adding it to the phrase string
* produces a savings of n*(s-1) - s - 1 characters.
* each of the n occurrances of p are replaced by a single character,
* for a savings of n*(s-1) in the error message string. the phrase
* string is lengthened by s + 1 characters (1 for the '\0' at the end).
* for a unique phrase (n=1), the savings is -2, that is, there is no
* savings, but rather a net gain of 2 characters.
* a sneaky pattern is used to generate all single- and multi-word
* phrases in a particular error message. each phrase indexes a table
* entry that counts the cost savings associated with that phrase.
* all error messages are scanned to build this savings table, which
* is then rsorted to obtain the phrase that if chosen, would produce
* the greatest savings. taking that phrase, all occurrances of it
* in all error messages are replaced by a single character, and the
* phrase is added to a phrase array.
* the entire process is repeated again, rather than just taking the
* next best entry from the table. this allows new combinations to
* materialize, as a result of having replaced the phrase with a single
* character, and also eliminates the problem of now obsolete table
* entries that partially overlapped the chosen phrase. the process
* is repeated for all 159 entries that can be accommodated in the
* phrase table.
* as each phrase is selected, the phrase array is also scanned
* to see any substitutions can be made there.
* the program makes considerable use of gimpel's seq function to
* sequence through various arrays.
* the output is a series of data statements in generic 68000 assembly
* language. there are two types of pseudo-ops:
* 1. byte to define decimal data bytes
* 2. ascii to define an ascii string
* needless to say, it does take some time to grind out the results (22
* minutes with 68k spitbol on a unix pc, 22 hours with snobol4+ on
* a pc/xt).
* note: moving the lower case characters (64-95) from the normal
* character set to the special set (thus providing for 191
* phrases instead of 159), only saved 40 bytes in the output
* file, and was deemed unnecessarily restrictive.
* concatenating the error and phrase arrays prior to each call
* produced a less optimal packing, and was discarding
* in favor of just using the error array to pick the next
* phrase for compression.
* set machine definition flag, if any.
ident(host(0)) :s(nomachine)
$replace(host(0),&lcase,&ucase) = 1
output = "machine : " host(0)
nomachine
* tab character
tab = char(9)
* comma delimited string of decimal bytes:
db = 'db'
byte = tab db tab
* delim delimited string of 'normal' ascii characters (32-127)
delim = '"'
ascii = tab 'db' tab
* comment lead-in character
comment = ';'
* data and public definitions
data = tab 'segment' tab '.data'
* setup keywords
&anchor = 0
&trim = 1
* &stlimit = -1
&code = 1
* &fullscan = 1 ;* for snobol4+
* define the characters whose codes will be used as special pointers.
&alphabet len(1) len(31) . f1 len(96) len(128) . f2
specials = f1 f2
* functions
define('asc(c)') :(asc.end)
* asc(c)
* return the integer code for an ascii character c.
asc
&alphabet break(c) @asc :(return)
asc.end
define('seq(arg_s,arg_name)') :(seq.end)
* seq(s,n)
* will sequence through a set of statements until failure is detected.
* the indexing variable is given by the name n. from gimpel, "algorithms
* in snobol4."
* entry point: initialize indexing variable. then convert arg_s to code.
seq
$arg_name = 0
arg_s = code(arg_s ' :s(seq_1)f(seq_2)')
+ :f(error)
* increment indexing variable by 1 and spring off to compiled code.
* return will be to seq_1 or seq_2.
seq_1
$arg_name = $arg_name + 1 :<arg_s>
* control flows to seq_2 if a fail was detected. if first time through
* fail; otherwise succeed.
seq_2
eq($arg_name,1) :s(freturn)f(return)
seq.end
define('spread(spread)') :(spread.end)
* spread(n)
* given n in the range [1-159], spread it out over the ranges [1-31]
* and [128-255].
spread
spread = ge(spread,32) spread + 96 :(return)
spread.end
* shrink(n)
* given n in the ranges [1-31] and [128-255], shrink it back to the
* range [1-159].
define('shrink(shrink)') :(shrink.end)
shrink
shrink = ge(shrink,128) shrink - 96 :(return)
shrink.end
* best(err,n)
* select the best phrase from the error message array -- that is,
* the phrase that will provide the greatest savings if replaced
* by a one character code.
* technique: build a table of the savings associated with all
* possible phrases in the err array, sort it, and select the best.
* then make all possible replacements, using integer n for the code.
define('best(err,n)t,j') :(best.end)
* entry: define a fresh table to hold all the generated phrases.
best
t = table(1001)
* generate phrases for all error messages, then sort and select best.
seq(" generate(err[j],t)", .j)
t = rsort(t,2) :f(freturn)
* abort the caller's sequencing if there is not a net savings possible
* any more.
gt(t[1,2],0) :f(freturn)
best = t[1,1]
* now replace the selected phrase in all error and phrase strings.
seq(" repl(.err[j],best,n)", .j)
seq(" repl(.phrase[j],best,n)", .j) :(return)
best.end
* generate(s,t)
* generate all phrases from a string s, and add their potential
* savings to table t. the first occurence of a phrase produces
* a savings of -2 (net loss). subsequent occurences provide an
* incremental savings of size(phrase) -1.
* the one phrase we do not generate from s is the entire
* string s itself, since moving it to the phrase table is
* meaningless. (assuming unique input strings.)
* this relies upon fail pattern to generate all the permutations.
* put a trace on variable w to watch the results.
define('generate(s,t)w')
gen_pat = fence ("" | (breakx(' ' specials) len(1)))
+ ((breakx(' ' specials) len(1)) | (len(1) rem)) $ w
+ *differ(s,w)
+ *?(t[w] = (ident(t[w]) -2, t[w] + size(w) - 1)) fail
+ :(generate.end)
generate
s ? gen_pat :(return)
generate.end
* repl(x,i,s,n)
* replace string s in $x with char(n).
* this must be isolated into a function to prevent seq from
* terminating early when a pattern match fails.
define('repl(x,s,n)') :(repl.end)
repl
$x ? s = char(n) :f(return)s(repl)
repl.end
* putout(s,i)
* output string s as a series of ascii and byte data statements.
* i is the index of the string being compressed.
define('putout(s,i)c,n,w')
* pattern to break s into normal and special characters.
put_pat = (break(specials) | (len(1) rem)) . w
+ (span(specials) | "") . n
put_one = len(1) . c :(putout.end)
* entry. put out comment line to show what string is being compressed.
putout
outfile =
outfile = comment lpad(i,5) ' "' expand(s) '"'
* main loop. get segments of normal and special characters. each segment
* (or both) may be null.
put1
s ? put_pat =
* put out the normal characters, if any.
outfile = differ(w) ascii delim w delim
outsize = outsize + size(w)
* if s is exhausted, append '\0' to special characters, then
* put it out as a series of byte data statements.
n = ident(s) n char(0)
outsize = outsize + size(n)
put2
n ? put_one = :f(put3)
outfile = byte asc(c) :(put2)
put3
differ(s) :s(put1)f(return)
putout.end
* expand(s)
* recursively expand a string with compressed phrases.
define('expand(s)f,c')
ex_pat = (break(specials) . f len(1) . c) |
+ ((len(1) rem) . f "" . c) :(expand.end)
expand
s ? ex_pat = :f(return)
expand = ident(c) expand f :s(return)
expand = expand f expand(phrase[shrink(asc(c))])
+ :(expand)
expand.end
* readerr(err)
* read in the error message file, building array of messages.
* return maximum error number as function value.
define('readerr(a)s,n,msg')
rde_done = fence notany('0123456789')
rde_pat = fence len(3) . n len(1) rem . msg :(readerr.end)
readerr
s = infile :f(return)
s ? rde_done :s(return)
s ? rde_pat :f(rderr1)
(differ(a[n]) differ(a[n],msg)) :s(rderr2)
a[n] = msg
readerr = gt(n,readerr) n
insize = insize + 1 + size(msg) :(readerr)
rderr1
output = "bad input line: " s :(freturn)
rderr2
output = "inconsistent re-use of message number " n
output = "first use : " a[n]
output = " re-use: " msg :(freturn)
readerr.end
***************************************************************************
* main program
* open channel 1 on command line for infile, channel 2 for outfile
output = ~input(.infile,1) "no input file" :s(end)
output = ~output(.outfile,2) "no output file" :s(end)
* arrays to hold error messages and phrases
in = array(400)
phrase = array(159)
* prepare statistics
insize = 0
outsize = 0
* read input file, create err array of exact size to just hold error messages.
err = array(readerr(in)) :f(end)
* copy in array to err array, then discard in array.
seq(" err[i] = in[i]", .i)
in =
* build phrase array
seq(" phrase[i] = best(err, spread(i))", .i)
* putout title and header information
outfile = comment ' compressed spitbol error messages ' date()
outfile = comment
outfile = ' BITS 64'
outfile = ' DEFAULT REL'
outfile = name
outfile = model
outfile = include
outfile = group
outfile = data
outfile =
* putout label and compressed error message table
outfile = tab 'global' tab 'errors'
outfile = 'errors :' tab 'db' tab '0'
seq(" putout(err[i],i)", .i)
* putout label and phrase table
outfile =
outfile = tab 'global' tab 'phrases'
outfile = 'phrases :' tab 'db' tab '0'
seq(" putout(phrase[i],spread(i))", .i)
* finish up the output file
outfile =
outfile = enddata
outfile = ' section .note.GNU-stack noalloc noexec nowrite progbits'
outfile = endstmt
* print statistics
output = "input message length = " insize
output = "output string size = " outsize
output = "savings = " insize - outsize
output = "spitbol statements executed " &stcount
* clear error indicator for successful return
&code = 0
end