forked from jupyterhub/zero-to-jupyterhub-k8s
-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathschema.yaml
812 lines (727 loc) · 30 KB
/
schema.yaml
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
765
766
767
768
769
770
771
772
773
774
775
776
777
778
779
780
781
782
783
784
785
786
787
788
789
790
791
792
793
794
795
796
797
798
799
800
801
802
803
804
805
806
807
808
809
810
811
812
title: Config
type: object
properties:
hub:
type: object
properties:
cookieSecret:
type:
- string
- "null"
description: |
A 32-byte cryptographically secure randomly generated string used to sign values of
secure cookies set by the hub. If unset, jupyterhub will generate one on startup and
save it in the file `jupyterhub_cookie_secret` in the `/srv/jupyterhub` directory of
the hub container. A value set here will make JupyterHub overwrite any previous file.
You do not need to set this at all if you are using the default configuration for
storing databases - sqlite on a persistent volume (with `hub.db.type` set to the
default `sqlite-pvc`). If you are using an external database, then you must set this
value explicitly - or your users will keep getting logged out each time the hub pod
restarts.
Changing this value will all user logins to be invalidated. If this secret leaks,
*immediately* change it to something else, or user data can be compromised
```sh
# to generate a value, run
openssl rand -hex 32
```
imagePullPolicy:
type: string
enum:
- IfNotPresent
- Always
- Never
description: |
Set the imagePullPolicy on the hub pod.
See the [Kubernetes docs](https://kubernetes.io/docs/concepts/containers/images/#updating-images)
for more info on what the values mean.
imagePullSecret:
type: object
description: |
Creates an image pull secret for you and makes the hub pod utilize
it, allowing it to pull images from private image registries.
Using this configuration option automates the following steps that
normally is required to pull from private image registries.
```sh
# you won't need to run this manually...
kubectl create secret docker-registry hub-image-credentials \
--docker-server=<REGISTRY> \
--docker-username=<USERNAME> \
--docker-email=<EMAIL> \
--docker-password=<PASSWORD>
```
```yaml
# you won't need to specify this manually...
spec:
imagePullSecrets:
- name: hub-image-credentials
```
To learn the username and password fields to access a gcr.io registry
from a Kubernetes cluster not associated with the same google cloud
credentials, look into [this
guide](http://docs.heptio.com/content/private-registries/pr-gcr.html)
and read the notes about the password.
properties:
enabled:
type: boolean
description: |
Enable the creation of a Kubernetes Secret containing credentials
to access a image registry. By enabling this, the hub pod will also be configured
to use these credentials when it pulls its container image.
registry:
type: string
description: |
Name of the private registry you want to create a credential set
for. It will default to Docker Hub's image registry.
Examples:
- https://index.docker.io/v1/
- quay.io
- eu.gcr.io
- alexmorreale.privatereg.net
username:
type: string
description: |
Name of the user you want to use to connect to your private
registry. For external gcr.io, you will use the `_json_key`.
Examples:
- alexmorreale
- _json_key
password:
type: string
description: |
Password of the user you want to use to connect to your private
registry.
Examples:
- plaintextpassword
- abc123SECRETzyx098
For gcr.io registries the password will be a big JSON blob for a
Google cloud service account, it should look something like below.
```yaml
password: |-
{
"type": "service_account",
"project_id": "jupyter-se",
"private_key_id": "f2ba09118a8d3123b3321bd9a7d6d0d9dc6fdb85",
...
}
```
Learn more in [this
guide](http://docs.heptio.com/content/private-registries/pr-gcr.html).
image:
type: object
description: |
Set custom image name / tag for the hub pod.
Use this to customize which hub image is used. Note that you must use a version of
the hub image that was bundled with this particular version of the helm-chart - using
other images might not work.
properties:
name:
type: string
description: |
Name of the image, without the tag.
```
# example names
yuvipanda/wikimedia-hub
gcr.io/my-project/my-hub
```
tag:
type: string
description: |
The tag of the image to pull.
This is the value after the `:` in your full image name.
```
# example tags
v1.11.1
zhy270a
```
db:
type: object
properties:
type:
type: string
enum:
- sqlite-pvc
- sqlite-memory
- mysql
- postgres
description: |
Type of database backend to use for the hub database.
The Hub requires a persistent database to function, and this lets you specify
where it should be stored.
The various options are:
1. **sqlite-pvc**
Use an `sqlite` database kept on a persistent volume attached to the hub.
By default, this disk is created by the cloud provider using
*dynamic provisioning* configured by a [storage
class](https://kubernetes.io/docs/concepts/storage/storage-classes/).
You can customize how this disk is created / attached by
setting various properties under `hub.db.pvc`.
This is the default setting, and should work well for most cloud provider
deployments.
2. **sqlite-memory**
Use an in-memory `sqlite` database. This should only be used for testing,
since the database is erased whenever the hub pod restarts - causing the hub
to lose all memory of users who had logged in before.
When using this for testing, make sure you delete all other objects that the
hub has created (such as user pods, user PVCs, etc) every time the hub restarts.
Otherwise you might run into errors about duplicate resources.
3. **mysql**
Use an externally hosted mysql database.
You have to specify an sqlalchemy connection string for the mysql database you
want to connect to in `hub.db.url` if using this option.
The general format of the connection string is:
```
mysql+pymysql://<db-username>:<db-password>@<db-hostname>:<db-port>/<db-name>
```
The user specified in the connection string must have the rights to create
tables in the database specified.
Note that if you use this, you *must* also set `hub.cookieSecret`.
4. **postgres**
Use an externally hosted postgres database.
You have to specify an sqlalchemy connection string for the postgres database you
want to connect to in `hub.db.url` if using this option.
The general format of the connection string is:
```
postgres+psycopg2://<db-username>:<db-password>@<db-hostname>:<db-port>/<db-name>
```
The user specified in the connection string must have the rights to create
tables in the database specified.
Note that if you use this, you *must* also set `hub.cookieSecret`.
pvc:
type: object
description: |
Customize the Persistent Volume Claim used when `hub.db.type` is `sqlite-pvc`.
properties:
annotations:
type: object
description: |
Annotations to apply to the PVC containing the sqlite database.
See [the Kubernetes
documentation](https://kubernetes.io/docs/concepts/overview/working-with-objects/annotations/)
for more details about annotations.
selector:
type: object
description: |
Label selectors to set for the PVC containing the sqlite database.
Useful when you are using a specific PV, and want to bind to
that and only that.
See [the Kubernetes
documentation](https://kubernetes.io/docs/concepts/storage/persistent-volumes/#persistentvolumeclaims)
for more details about using a label selector for what PV to
bind to.
storage:
type: string
description: |
Size of disk to request for the database disk.
url:
type:
- string
- "null"
description: |
Connection string when `hub.db.type` is mysql or postgres.
See documentation for `hub.db.type` for more details on the format of this property.
labels:
type: object
description: |
Extra labels to add to the hub pod.
See the [Kubernetes docs](https://kubernetes.io/docs/concepts/overview/working-with-objects/labels/)
to learn more about labels.
extraEnv:
type: list
description: |
Extra environment variables that should be set for the hub pod.
A list of [EnvVar](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.9/#envvar-v1-core)
objects.
These are usually used in two circumstances:
- Passing parameters to some custom code specified with `extraConfig`
- Passing parameters to an authenticator or spawner that can be directly customized
by environment variables (rarer)
extraConfig:
type: object
description: |
Arbitrary extra python based configuration that should be in `jupyterhub_config.py`.
This is the *escape hatch* - if you want to configure JupyterHub to do something specific
that is not present here as an option, you can write the raw Python to do it here.
extraConfig is a *dict*, so there can be multiple configuration snippets
under different names.
The configuration sections are run in alphabetical order.
Non-exhaustive examples of things you can do here:
- Subclass authenticator / spawner to do a custom thing
- Dynamically launch different images for different sets of images
- Inject an auth token from GitHub authenticator into user pod
- Anything else you can think of!
Since this is usually a multi-line string, you want to format it using YAML's
[| operator](http://www.yaml.org/spec/1.2/spec.html#id2795688).
For example:
```yaml
hub:
extraConfig:
myConfig.py: |
c.JupyterHub.something = 'something'
c.Spawner.somethingelse = 'something else'
```
No validation of this python is performed! If you make a mistake here, it will probably
manifest as either the hub pod going into `Error` or `CrashLoopBackoff` states, or in
some special cases, the hub running but... just doing very random things. Be careful!
uid:
type: integer
minimum: 0
description:
The UID the hub process should be running as.
Use this only if you are building your own image & know that a user with this uid
exists inside the hub container! Advanced feature, handle with care!
Defaults to 1000, which is the uid of the `jovyan` user that is present in the
default hub image.
fsGid:
type: integer
minimum: 0
description:
The gid the hub process should be using when touching any volumes mounted.
Use this only if you are building your own image & know that a group with this gid
exists inside the hub container! Advanced feature, handle with care!
Defaults to 1000, which is the gid of the `jovyan` user that is present in the
default hub image.
proxy:
type: object
properties:
secretToken:
type: string
description: |
A 32-byte cryptographically secure randomly generated string used to secure communications
between the hub and the configurable-http-proxy.
```sh
# to generate a value, run
openssl rand -hex 32
```
Changing this value will cause the proxy and hub pods to restart. It is good security
practice to rotate these values over time. If this secret leaks, *immediately* change
it to something else, or user data can be compromised
https:
type: object
description: |
properties:
enabled:
type: boolean
description: |
type:
type: string
enum:
- letsencrypt
- manual
- offload
- secret
description: |
Sets the type of HTTPS encryption that is used.
The type decides on which ports that are used for communication via HTTPS. Setting this to `secret` sets the type to manual HTTPS with a secret.
Setting https.type to letsencrypt together with setting https.enabled to true and providing https.hosts will enable automatic https certificate provision (referred to as autohttps).
letsencrypt:
type: object
description: |
properties:
contactEmail:
type: string
description: |
manual:
type: object
description: |
properties:
key:
type: string
description: |
cert:
type: string
description: |
secret:
type: object
description: |
properties:
name:
type: string
description: |
key:
type: string
description: |
crt:
type: string
description: |
hosts:
type: list
description: |
required:
- secretToken
auth:
type: object
properties:
state:
type: object
properties:
enabled:
type: boolean
description: |
Enable persisting auth_state (if available).
See: http://jupyterhub.readthedocs.io/en/latest/api/auth.html
cryptoKey:
type:
- string
- "null"
description: |
auth_state will be encrypted and stored in the Hub’s database. This can include things like authentication tokens, etc. to be passed to Spawners as environment variables.
Encrypting auth_state requires the cryptography package.
It must contain one (or more, separated by ;) 32-byte encryption keys. These can be either base64 or hex-encoded.
The JUPYTERHUB_CRYPT_KEY environment variable for the hub pod is set using this entry.
```sh
# to generate a value, run
openssl rand -hex 32
```
If encryption is unavailable, auth_state cannot be persisted.
singleuser:
type: object
description: |
Options for customizing the environment that is provided to the users after they log in.
properties:
cpu:
type: object
description: |
Set CPU limits & guarantees that are enforced for each user.
See: https://kubernetes.io/docs/concepts/configuration/manage-compute-resources-container/
properties:
limit:
type:
- string
- "null"
guarantee:
type:
- string
- "null"
memory:
type: object
description: |
Set Memory limits & guarantees that are enforced for each user.
See the [Kubernetes docs](https://kubernetes.io/docs/concepts/configuration/manage-compute-resources-container)
for more info.
properties:
limit:
type:
- string
- "null"
guarantee:
type:
- string
- "null"
description: |
Note that this field is referred to as *requests* by the Kubernetes API.
imagePullSecret:
type: object
description: |
Creates an image pull secret for you and makes the user pods utilize
it, allowing them to pull images from private image registries.
Using this configuration option automates the following steps that
normally is required to pull from private image registries.
```sh
# you won't need to run this manually...
kubectl create secret docker-registry singleuser-image-credentials \
--docker-server=<REGISTRY> \
--docker-username=<USERNAME> \
--docker-email=<EMAIL> \
--docker-password=<PASSWORD>
```
```yaml
# you won't need to specify this manually...
spec:
imagePullSecrets:
- name: singleuser-image-credentials
```
To learn the username and password fields to access a gcr.io registry
from a Kubernetes cluster not associated with the same google cloud
credentials, look into [this
guide](http://docs.heptio.com/content/private-registries/pr-gcr.html)
and read the notes about the password.
properties:
enabled:
type: boolean
description: |
Enable the creation of a Kubernetes Secret containing credentials
to access a image registry. By enabling this, user pods and image
puller pods will also be configured to use these credentials when
they pull their container images.
registry:
type: string
description: |
Name of the private registry you want to create a credential set
for. It will default to Docker Hub's image registry.
Examples:
- https://index.docker.io/v1/
- quay.io
- eu.gcr.io
- alexmorreale.privatereg.net
username:
type: string
description: |
Name of the user you want to use to connect to your private
registry. For external gcr.io, you will use the `_json_key`.
Examples:
- alexmorreale
- _json_key
password:
type: string
description: |
Password of the user you want to use to connect to your private
registry.
Examples:
- plaintextpassword
- abc123SECRETzyx098
For gcr.io registries the password will be a big JSON blob for a
Google cloud service account, it should look something like below.
```yaml
password: |-
{
"type": "service_account",
"project_id": "jupyter-se",
"private_key_id": "f2ba09118a8d3123b3321bd9a7d6d0d9dc6fdb85",
...
}
```
Learn more in [this
guide](http://docs.heptio.com/content/private-registries/pr-gcr.html).
image:
type: object
description: |
Set custom image name / tag used for spawned users.
This image is used to launch the pod for each user.
properties:
name:
type: string
description: |
Name of the image, without the tag.
Examples:
- yuvipanda/wikimedia-hub-user
- gcr.io/my-project/my-user-image
tag:
type: string
description: |
The tag of the image to use.
This is the value after the `:` in your full image name.
pullPolicy:
type: string
enum:
- IfNotPresent
- Always
- Never
description: |
Set the imagePullPolicy on the singleuser pods that are spun up by the hub.
See the [Kubernetes docs](https://kubernetes.io/docs/concepts/containers/images/#updating-images)
for more info.
schedulerStrategy:
type:
- string
- "null"
description: |
Deprecated and no longer does anything. Use the user-scheduler instead
in order to accomplish a good packing of the user pods.
extraTolerations:
type: list
description: |
Tolerations allow a pod to be scheduled on nodes with taints. These
are additional tolerations other than the user pods and core pods
default ones `hub.jupyter.org/dedicated=user:NoSchedule` or
`hub.jupyter.org/dedicated=core:NoSchedule`. Note that a duplicate set
of tolerations exist where `/` is replaced with `_` as the Google
cloud does not support the character `/` yet in the toleration.
See the [Kubernetes docs](https://kubernetes.io/docs/concepts/configuration/taint-and-toleration/)
for more info.
Pass this field an array of
[`Toleration`](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.11/#toleration-v1-core)
objects.
extraNodeAffinity:
type: object
description: |
Affinities describe where pods prefer or require to be scheduled, they
may prefer or require a node where they are to be scheduled to have a
certain label (node affinity). They may also require to be scheduled
in proximity or with a lack of proximity to another pod (pod affinity
and anti pod affinity).
See the [Kubernetes
docs](https://kubernetes.io/docs/concepts/configuration/assign-pod-node/)
for more info.
properties:
required:
type: list
description: |
Pass this field an array of
[`NodeSelectorTerm`](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.11/#nodeselectorterm-v1-core)
objects.
preferred:
type: list
description: |
Pass this field an array of
[`PreferredSchedulingTerm`](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.11/#preferredschedulingterm-v1-core)
objects.
extraPodAffinity:
type: object
description: |
See the description of `singleuser.extraNodeAffinity`.
properties:
required:
type: list
description: |
Pass this field an array of
[`PodAffinityTerm`](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.11/#podaffinityterm-v1-core)
objects.
preferred:
type: list
description: |
Pass this field an array of
[`WeightedPodAffinityTerm`](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.11/#weightedpodaffinityterm-v1-core)
objects.
extraPodAntiAffinity:
type: object
description: |
See the description of `singleuser.extraNodeAffinity`.
properties:
required:
type: list
description: |
Pass this field an array of
[`PodAffinityTerm`](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.11/#podaffinityterm-v1-core)
objects.
preferred:
type: list
description: |
Pass this field an array of
[`WeightedPodAffinityTerm`](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.11/#weightedpodaffinityterm-v1-core)
objects.
scheduling:
type: object
description: |
Objects for customizing the scheduling of various pods on the nodes and
related labels.
properties:
userScheduler:
type: object
description: |
The user scheduler is making sure that user pods are scheduled
tight on nodes, this is useful for autoscaling of user node pools.
properties:
enabled:
type: boolean
description: |
Enables the user scheduler.
replicas:
type: integer
description: |
You can have multiple schedulers to share the workload or improve
availability on node failure.
image:
type: object
description: |
The image containing the [kube-scheduler
binary](https://console.cloud.google.com/gcr/images/google-containers/GLOBAL/kube-scheduler-amd64).
properties:
name:
type: string
tag:
type:
- string
- "null"
podPriority:
type: object
description: |
Generally available since Kubernetes 1.11, Pod Priority is used to
allow real users evict placeholder pods.
properties:
enabled:
type: bool
description: |
Generally available since Kubernetes 1.11, Pod Priority is used to
allow real users evict placeholder pods.
userPlaceholder:
type: object
description: |
User placeholders simulate users but will thanks to PodPriority be
evicted by the cluster autoscaler if a real user shows up. In this way
placeholders allow you to create a headroom for the real users and
reduce the risk of a user having to wait for a node to be added. Be
sure to use the the continuous image puller as well along with
placeholders, so the images are also available when real users arrive.
To test your setup efficiently, you can adjust the amount of user
placeholders with the following command:
```sh
# Configure to have 3 user placeholders
kubectl scale sts/user-placeholder --replicas=3
```
properties:
enabled:
type: bool
replicas:
type: int
description: |
How many placeholder pods would you like to have?
resources:
type: object
description: |
Unless specified here, the placeholder pods will request the same
resources specified for the real singleuser pods.
corePods:
type: object
description: |
These settings influence the core pods like the hub, proxy and
user-scheduler pods.
properties:
nodeAffinity:
type: object
description: |
Where should pods be scheduled? Perhaps on nodes with a certain
label is preferred or even required?
properties:
matchNodePurpose:
type: string
enum:
- ignore
- prefer
- require
description: |
Decide if core pods *ignore*, *prefer* or *require* to
schedule on nodes with this label:
```
hub.jupyter.org/node-purpose=core
```
userPods:
type: object
description: |
These settings influence the user pods like the user-placeholder,
user-dummy and actual user pods named like jupyter-someusername.
properties:
nodeAffinity:
type: object
description: |
Where should pods be scheduled? Perhaps on nodes with a certain
label is preferred or even required?
properties:
matchNodePurpose:
type: string
enum:
- ignore
- prefer
- require
description: |
Decide if user pods *ignore*, *prefer* or *require* to
schedule on nodes with this label:
```
hub.jupyter.org/node-purpose=user
```
custom:
type: object
description: |
Additional values to pass to the Hub.
JupyterHub will not itself look at these,
but you can read values in your own custom config via `hub.extraConfig`.
For example:
```yaml
custom:
myHost: "https://example.horse"
hub:
extraConfig:
myConfig.py: |
c.MyAuthenticator.host = get_config("custom.myHost")
```