3 - 3 - Lecture 2.3 - Example- Finding Fixed Points (10-46).srt
1
00:00:00,000 --> 00:00:05,046
In this session we're going to apply much
more of what we've learned about higher
2
00:00:05,046 --> 00:00:11,000
order functions in one coherent example.
The task here is to find fixed points of
3
00:00:11,000 --> 00:00:16,047
functions, and the application of the technique
will be to our well-known square root
4
00:00:16,047 --> 00:00:19,028
example.
So what is a fixed point?
5
00:00:19,028 --> 00:00:24,063
A number x is called a fixed point of a
function f if f of x equals x.
6
00:00:24,063 --> 00:00:31,082
Let's see that in a graphical example.
Let's suppose we have the function that
7
00:00:31,082 --> 00:00:38,028
maps x to one plus x over two.
So the graph of that function would look
8
00:00:38,028 --> 00:00:44,079
like this.
And that would be the diagonal.
9
00:00:46,052 --> 00:00:50,088
And that would be one.
And the fixed point would be where the
10
00:00:50,088 --> 00:00:55,003
diagonal hits the graph of the function.
That would be two.
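[Worked check, spelling out the picture: f(2) = 1 + 2/2 = 2, so the diagonal meets the graph at x = 2, and 2 is indeed a fixed point of the function x => 1 + x/2.]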
11
00:00:55,079 --> 00:01:01,030
Turns out that for some functions we can
locate the fixed points by starting with
12
00:01:01,030 --> 00:01:04,086
an initial estimate and then applying f
repeatedly.
13
00:01:04,086 --> 00:01:09,049
So, we start with x, then we compute f of
x, then f of f of x, and so on.
14
00:01:09,049 --> 00:01:14,073
And for these functions at some point the
value will not vary any more, or the
15
00:01:14,073 --> 00:01:20,030
change will be sufficiently small, so that
we can declare that we have approximated
16
00:01:20,030 --> 00:01:28,042
the fixed point.
So for functions that satisfy that
17
00:01:28,042 --> 00:01:32,079
property, we could write the following
fixed point algorithm.
18
00:01:33,018 --> 00:01:37,078
I will go into the worksheet and start it
directly.
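[A sketch of the fixed point algorithm being described; the code is not shown in the transcript, so the tolerance value and the isCloseEnough helper name below are assumptions.]

    import scala.math.abs

    val tolerance = 0.0001   // assumed convergence threshold

    // relative test: is the next value close enough to the current guess?
    def isCloseEnough(x: Double, y: Double): Boolean =
      abs((x - y) / x) < tolerance

    // repeatedly apply f, starting from firstGuess, until the value stabilizes
    def fixedPoint(f: Double => Double)(firstGuess: Double): Double = {
      def iterate(guess: Double): Double = {
        val next = f(guess)
        if (isCloseEnough(guess, next)) next
        else iterate(next)
      }
      iterate(firstGuess)
    }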
19
00:01:38,012 --> 00:01:43,095
So that algorithm, if you look at it
closely, is a close derivative of what
20
00:01:43,095 --> 00:01:48,054
we did for square root.
We have essentially an iteration method
21
00:01:48,054 --> 00:01:54,088
that tests whether our guess is close
enough to the result, and if it's
22
00:01:54,088 --> 00:02:01,015
not, then we iterate, and the improve
method that we had in the square root
23
00:02:01,015 --> 00:02:05,015
would now be this f method that we pass as
a parameter.
24
00:02:06,012 --> 00:02:14,023
So let's try this method out.
For our function that maps x to one
25
00:02:14,023 --> 00:02:19,019
plus x over two.
So let me write fixed point.
26
00:02:19,019 --> 00:02:24,003
X to one plus x over two.
Initial value one.
27
00:02:24,050 --> 00:02:29,021
And you would get something that's very
close to two, as expected.
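[In the worksheet, that call might look like the following sketch.]

    fixedPoint(x => 1 + x / 2)(1.0)   // yields a value very close to 2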
28
00:02:33,065 --> 00:02:39,056
So now that we know how to do fixed
points, can we somehow get new insight
29
00:02:39,056 --> 00:02:45,000
into our square root function?
Previously we gave you Newton's method
30
00:02:45,000 --> 00:02:48,842
essentially as fact.
That's the sequence that you have to use
31
00:02:48,842 --> 00:02:53,081
for your iterations.
Can we somehow derive that sequence?
32
00:02:53,081 --> 00:02:58,019
Well, in fact, yes.
So here's a specification of the square
33
00:02:58,019 --> 00:03:01,019
root function.
What is square root of x?
34
00:03:01,019 --> 00:03:05,011
Well, it would be a number y, such that y
squared equals x.
35
00:03:05,011 --> 00:03:11,049
That's what it means to be a square root.
Now we can do just simple
36
00:03:11,049 --> 00:03:16,025
algebraic manipulation.
So, we divide both sides of this equation
37
00:03:16,025 --> 00:03:21,863
by y, so we get square root of x is the
number y such that y equals x divided by
38
00:03:21,863 --> 00:03:25,964
y.
But what that means is that square root of
39
00:03:25,964 --> 00:03:32,378
x is the fixed point of the function y to
x over y, because if you put y in you
40
00:03:32,378 --> 00:03:38,376
get x over y, and if x over y is the same as y
then that equation is satisfied.
41
00:03:38,376 --> 00:03:45,053
So a solution of the equation is the same
as a fixed point of this function again.
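[The algebra spelled out:]

    sqrt(x)  =  the y such that  y * y = x
             =  the y such that  y = x / y      (dividing both sides by y)
             =  a fixed point of the function   y => x / y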
42
00:03:48,063 --> 00:03:54,026
Now, with that knowledge we can try to
revisit our implementation of square root.
43
00:03:54,026 --> 00:03:59,067
Make it simpler by just plugging it
into our fixed point function.
44
00:03:59,067 --> 00:04:05,030
So, that would give square root of x
double equals fixed point of this function
45
00:04:05,030 --> 00:04:08,093
that maps y to x over y, and some initial
value, 1.0.
46
00:04:08,093 --> 00:04:18,019
Let's try that out in the worksheet.
So we would have def square root of x
47
00:04:18,019 --> 00:04:29,012
double, equals fixed point of y maps to x
over y and one as the initial value.
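[A sketch of that first attempt, reusing the fixedPoint sketch above.]

    // first attempt: as shown next, this does not converge
    def sqrt(x: Double): Double =
      fixedPoint(y => x / y)(1.0)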
48
00:04:30,056 --> 00:04:39,031
And let's try with our square root of two.
And we get, unfortunately, an infinite
49
00:04:39,031 --> 00:04:42,038
computation, this doesn't converge.
Why not?
50
00:04:42,038 --> 00:04:46,042
What happens?
Well, to find out more we can use
51
00:04:46,042 --> 00:04:52,035
println as a debugging help.
So let me just add a println statement
52
00:04:52,035 --> 00:04:56,060
at a good point.
So, I guess a good point is in the iterate
53
00:04:56,060 --> 00:05:00,042
function, where we could say, for each
iteration step,
54
00:05:00,042 --> 00:05:05,041
we want to write what the current guess
is.
55
00:05:05,041 --> 00:05:11,092
So we write guess equals, and then current
guess here.
56
00:05:12,045 --> 00:05:24,854
So if we do that, then we see that
our guess oscillates between one and two
57
00:05:24,854 --> 00:05:29,122
all the time.
And if you do the computations for square
58
00:05:29,122 --> 00:05:34,662
root, I encourage you to do that and you
will find that yes indeed if you trace the
59
00:05:34,662 --> 00:05:42,586
execution using the substitution model
then that's precisely what will happen.
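[One way to add that instrumentation, again building on the fixedPoint sketch above.]

    def fixedPoint(f: Double => Double)(firstGuess: Double): Double = {
      def iterate(guess: Double): Double = {
        println("guess = " + guess)   // debugging output for each step
        val next = f(guess)
        if (isCloseEnough(guess, next)) next
        else iterate(next)
      }
      iterate(firstGuess)
    }

    // For sqrt(2) with the y => x / y iteration, the trace reads
    //   guess = 1.0
    //   guess = 2.0
    //   guess = 1.0
    //   ... oscillating between one and two, as described above.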
60
00:05:42,586 --> 00:05:50,053
So, how can we do better?
One way to control these oscillations that
61
00:05:50,053 --> 00:05:56,023
we were seeing is to prevent the
estimation from varying too much, and one
62
00:05:56,023 --> 00:06:02,000
way to do this is by averaging successive
values of the original sequence.
63
00:06:02,000 --> 00:06:08,009
So instead of going 1, 2, 1, 2, we take the
average of two successive values that
64
00:06:08,009 --> 00:06:13,079
would give us 1.5 and that would set us on
the right path to convergence.
65
00:06:13,079 --> 00:06:24,005
So let's try that.
So what we want to do is, instead of
66
00:06:24,005 --> 00:06:28,769
saying x over y, we want to take the
average of the two values.
67
00:06:28,769 --> 00:06:34,174
So that would be y plus x over y divided
by two.
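[The change, as a sketch: only the function passed to fixedPoint differs from the earlier attempt.]

    // average successive values to damp the oscillation
    def sqrt(x: Double): Double =
      fixedPoint(y => (y + x / y) / 2)(1.0)

    sqrt(2)   // now converges to a good approximation of the square root of two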
68
00:06:34,174 --> 00:06:42,576
And we get back to the square root
function that we've seen in the first
69
00:06:42,576 --> 00:06:46,510
week.
In fact, if we expand the fixed point
70
00:06:46,510 --> 00:06:50,993
function and plug in the
definition of this average damping
71
00:06:50,993 --> 00:06:56,061
function for square root, then
what we will find is that we get back a
72
00:06:56,061 --> 00:07:01,089
very similar square root algorithm to
the one we've seen last week.
73
00:07:01,089 --> 00:07:04,973
So it's no surprise that we get the same
results back.
74
00:07:04,973 --> 00:07:10,571
So the previous examples have shown that
the expressive power of a language is
75
00:07:10,571 --> 00:07:14,459
greatly increased if we can pass
functions as arguments.
76
00:07:14,459 --> 00:07:19,839
I'm going to show you next that it's also
very useful to have functions that return
77
00:07:19,839 --> 00:07:24,941
other functions, and I'm going to
show you that with fixed point as the
78
00:07:24,941 --> 00:07:28,002
example.
Let's go back to our observation that
79
00:07:28,002 --> 00:07:32,086
square root of x is a fixed point of the
function that maps y to x over y.
80
00:07:32,086 --> 00:07:38,013
We have taken that observation, we have
seen that square root doesn't converge
81
00:07:38,013 --> 00:07:42,087
that way, but we could make it converge by
averaging successive values.
82
00:07:42,087 --> 00:07:48,002
This technique of stabilizing by averaging
is general enough to merit being
83
00:07:48,002 --> 00:07:52,096
abstracted into its own function.
Let's see what this function would look
84
00:07:52,096 --> 00:07:56,001
like.
We would have the function average damp.
85
00:07:56,001 --> 00:07:59,060
It takes an arbitrary function from double
to double.
86
00:07:59,060 --> 00:08:04,831
It takes the value x of type double and it
then computes the average
87
00:08:04,831 --> 00:08:10,080
of x and f of x.
Let's do an exercise.
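[A sketch of that function as just described; the name averageDamp and the curried parameter order are taken from the description above.]

    // takes a function f and a value x, and averages x with f(x)
    def averageDamp(f: Double => Double)(x: Double): Double =
      (x + f(x)) / 2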
88
00:08:10,080 --> 00:08:18,031
Given fixed point and average damp, can
you write a square root function that uses
89
00:08:18,031 --> 00:08:24,091
both functions?
So, let's see how we would solve this
90
00:08:24,091 --> 00:08:29,046
example.
I have given you the average damp function
91
00:08:29,046 --> 00:08:34,090
that we developed on the slide.
What would square root be now?
92
00:08:34,090 --> 00:08:42,048
So square root would be a fixed point of a
function that maps y to x over y with
93
00:08:42,048 --> 00:08:46,407
initial value one.
But that was the one we saw that didn't
94
00:08:46,407 --> 00:08:52,087
converge.
So, what we do is we use average damp.
95
00:08:52,087 --> 00:09:00,840
On that function here.
And if we run it, we get the
96
00:09:00,840 --> 00:09:05,471
expected square root result.
So what average damp is, is a
97
00:09:05,471 --> 00:09:11,353
function that takes a function, namely
this function that is at the root of
98
00:09:11,353 --> 00:09:16,936
the square root specification, and it
returns another function, namely the
99
00:09:16,936 --> 00:09:22,398
function that is essentially the same
iteration but with average damping
100
00:09:22,398 --> 00:09:26,057
applied.
So it's a function that takes a function
101
00:09:26,057 --> 00:09:30,056
to another function.
If you look at this definition of square
102
00:09:30,056 --> 00:09:35,023
root, then I think you would agree that
it's probably very difficult to write
103
00:09:35,023 --> 00:09:39,011
something that is clearer and shorter than
this very definition.
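[For reference, a sketch of that definition, using the fixedPoint and averageDamp sketches above.]

    def sqrt(x: Double): Double =
      fixedPoint(averageDamp(y => x / y))(1.0)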
104
00:09:39,011 --> 00:09:43,023
It came from the specification of what it
means to be a square root.
105
00:09:43,023 --> 00:09:48,021
We derived that solving the
equation for square roots means taking
106
00:09:48,021 --> 00:09:52,045
the fixed point of this function.
We determined that we needed to take
107
00:09:52,045 --> 00:09:57,024
averages to dampen the oscillation, so we
use this average damp function in the
108
00:09:57,024 --> 00:10:03,001
middle, and we are done.
So in summary, we saw last week that
109
00:10:03,001 --> 00:10:08,029
functions are essential abstractions.
Because they allow us to introduce general
110
00:10:08,029 --> 00:10:12,097
methods to perform computations, as
explicit and named arguments in our
111
00:10:12,097 --> 00:10:17,417
programming language.
And this week we've seen that these
112
00:10:17,417 --> 00:10:22,736
abstractions can be combined with higher
order functions to create new abstractions.
113
00:10:22,736 --> 00:10:26,479
So, that's a very powerful way to compose
and abstract programs.
114
00:10:26,479 --> 00:10:31,785
And as a programmer it's always good to
look for opportunities to abstract and
115
00:10:31,785 --> 00:10:34,196
reuse.
It's not true that the highest level of
116
00:10:34,196 --> 00:10:38,528
abstraction is always the best one.
Sometimes it's actually better to stay a
117
00:10:38,528 --> 00:10:42,103
little bit more concrete.
But it's important to know the techniques
118
00:10:42,103 --> 00:10:49,031
of abstraction so that you can use them
when they're appropriate.