Commit 3ab723c

Merge pull request kubernetes#2991 from parispittman/pastsummitnotes

adding historical notes for 2014 contributor summit

2 parents 4e2b324 + 6f72604

2 files changed: +201 -1 lines changed
@@ -0,0 +1,200 @@
# Kubernetes Contributor Conference, 2014-12-03 to 12-05

**Full notes:** https://docs.google.com/document/d/1cQLY9yeFgxlr_SRgaBZYGcJ4UtNhLAjJNwJa8424JMA/edit?usp=sharing (has pictures; shared with the k-dev mailing list)

**Organizers:** thockin and bburns

**26 Attendees from:** Google, Red Hat, CoreOS, Box

**This is a historical document. No typo or grammar correction PRs needed.**

Last modified: Dec. 8, 2014
# Clustering and Cluster Formation

Goal: Decide how clusters should be formed and resized over time

Models for building clusters

* Master in charge - asset DB
  * Dynamic join - ask to join
* How Kelsey Hightower has seen this done on bare metal (a registration sketch follows this list)
  * Use Fleet as a machine database
  * A Fleet agent is run on each node
  * Each node registers its information in etcd when it comes up
  * The only security is that etcd expects the node to have a cert signed by a specific CA
  * Run an etcd proxy on each node
  * Don't run any salt scripts; everything is declarative
  * Just put a daemon (kube-register) on a machine to become part of the cluster
  * brendanburns: basically using Fleet as the cloud provider
* Puppet model - whitelist some cert and/or subnet that you want to trust everything in
  * One problem - if the CA leaks, you have to replace certs on all nodes
* briangrant: we may want to support adding nodes that aren't trusted, only scheduling the node owner's own work on them
* lavalamp: we need to differentiate between node states:
  * In the cluster
  * Ready to accept work
  * Trusted to accept work
* Proposal (an admission-plugin sketch follows this list):
  * New nodes initiate contact with the master
  * Allow multiple config options for how trust can be established - IP, cert, etc.
  * Each new node only needs one piece of information - how to find the master
  * Can support many different auth modes - let anyone in, whitelist IPs, require a particular signed cert, queue up requests for an admin to approve, etc.
  * Default should be auto-register with no auth/approval needed
  * Auth-ing is separate from registering
  * Support switching between permissive and strict auth modes:
    * Each node should register a public key so that if the auth mode is changed to require a cert upon registration, old nodes won't break
  * kelseyhightower: let the minion do the same thing that kube-register currently does
  * Separate adding a node to the cluster from declaring it schedulable
* Use cases:
  * Kick the tires - everything should be automagic
  * Professional who needs security
* Working group for later: Joe, Kelsey, Quintin, Eric Paris
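To make the Fleet-style flow above concrete, here is a minimal, hypothetical sketch of a node registering itself in etcd when it comes up. The etcd v2 keys API endpoint is real, but the key layout (`/registry/minions/<name>`), the payload, and the TTL are illustrative assumptions, not the actual kube-register or Fleet schema.

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
	"strings"
)

// registerNode writes this node's metadata into etcd's v2 keys API with a TTL,
// mimicking the "each node registers its information in etcd when it comes up"
// flow. The key layout (/registry/minions/<name>) and payload are illustrative
// assumptions, not the real kube-register schema.
func registerNode(etcdEndpoint, name, hostIP string, ttlSeconds int) error {
	key := fmt.Sprintf("%s/v2/keys/registry/minions/%s", etcdEndpoint, name)
	body := url.Values{
		"value": {fmt.Sprintf(`{"hostIP":%q}`, hostIP)},
		"ttl":   {fmt.Sprintf("%d", ttlSeconds)},
	}

	req, err := http.NewRequest("PUT", key, strings.NewReader(body.Encode()))
	if err != nil {
		return err
	}
	req.Header.Set("Content-Type", "application/x-www-form-urlencoded")

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode >= 300 {
		return fmt.Errorf("etcd returned %s", resp.Status)
	}
	return nil
}

func main() {
	// A node would re-register periodically (well inside the TTL) so that
	// dead nodes age out of the registry automatically.
	if err := registerNode("http://127.0.0.1:4001", "node-1", "10.240.0.7", 60); err != nil {
		fmt.Println("registration failed:", err)
	}
}
```

Because the entry carries a TTL, a node that stops refreshing it simply ages out of the registry, which is one way "resizing over time" can fall out of the registration mechanism.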
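The registration/auth proposal above could be modeled as a small admission-plugin interface on the master, with one implementation per auth mode (let anyone in, IP/subnet whitelist, cert check, approval queue). This is a hypothetical sketch of the idea; none of these type names existed in the codebase.

```go
package main

import (
	"fmt"
	"net"
)

// RegistrationRequest is what a new node sends when it first contacts the
// master: just enough to find and identify it. The node also submits a public
// key up front so that a later switch to cert-based auth doesn't break it.
// All names here are hypothetical, invented for illustration.
type RegistrationRequest struct {
	Name      string
	Address   net.IP
	PublicKey []byte
}

// NodeApprover decides whether a registering node is trusted. Different
// implementations give the different auth modes discussed above.
type NodeApprover interface {
	Approve(req RegistrationRequest) (bool, error)
}

// AllowAll is the "kick the tires" default: auto-register everything.
type AllowAll struct{}

func (AllowAll) Approve(RegistrationRequest) (bool, error) { return true, nil }

// SubnetWhitelist trusts only nodes whose address falls inside a configured CIDR.
type SubnetWhitelist struct{ Subnet *net.IPNet }

func (w SubnetWhitelist) Approve(req RegistrationRequest) (bool, error) {
	return w.Subnet.Contains(req.Address), nil
}

func main() {
	_, cidr, _ := net.ParseCIDR("10.240.0.0/16")
	policies := []NodeApprover{AllowAll{}, SubnetWhitelist{Subnet: cidr}}

	req := RegistrationRequest{Name: "node-1", Address: net.ParseIP("10.240.0.7")}
	for _, p := range policies {
		ok, _ := p.Approve(req)
		fmt.Printf("%T -> approved=%v\n", p, ok)
	}
}
```

Registering a node and marking it schedulable would stay separate steps, as noted above, so an unapproved node could appear "in the cluster" without being trusted to accept work.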
# Usability

* Getting started
  * Want easy entry for Docker users
  * Library/registry of pod templates
* GUI - visualization of relationships and dependencies, workflows, dashboards, ways to learn, first impressions
  * Will be easiest to start with a read-only UI before worrying about read-write workflows
* Docs
  * Need to refactor getting-started guides so that there's one common guide
  * Each cloud provider will just have its own short guide on how to create a cluster
  * Need a simple test that can verify whether your cluster is healthy or diagnose why it isn't
  * Make it easier to get to the architecture/design doc from the front page of the GitHub project
  * Table of contents for docs?
  * Realistic examples
    * Kelsey has found that doing a tutorial of deploying with a canary helped make the value of labels clear (see the label-matching sketch after this list)
* CLI
  * Annoying when local auth files and config get overwritten when trying to work with multiple clusters
    * Like when running e2e tests
* Common friction points
  * External IPs
  * Image registry
  * Secrets
  * Deployment
  * Stateful services
  * Scheduling
  * Events/status
  * Log access
* Working groups
  * GUI - Jordan, Brian, Max, Satnam
  * CLI - Jeff, Sam, Derek
  * Docs - Proppy, Kelsey, TJ, Satnam, Jeff
  * Features/Experience - Dawn, Rohit, Kelsey, Proppy, Clayton: https://docs.google.com/document/d/1hqn6FtBNMe0sThbciq2PbE_P5BONBgCzHST4gz2zj50/edit
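As a side note on the canary/labels point under Docs above, the mechanics are small enough to show directly: a selector matches any pod whose labels include all of the selector's key/value pairs, so `app=frontend` spans both the stable and canary pods while `track=canary` picks out only the canary. A hypothetical sketch, not the real Kubernetes selector code:

```go
package main

import "fmt"

// matches reports whether a pod's labels satisfy a selector: every key/value
// pair in the selector must be present in the pod's labels (a sketch of the
// label-selector idea, not the actual Kubernetes implementation).
func matches(selector, podLabels map[string]string) bool {
	for k, v := range selector {
		if podLabels[k] != v {
			return false
		}
	}
	return true
}

func main() {
	stable := map[string]string{"app": "frontend", "track": "stable"}
	canary := map[string]string{"app": "frontend", "track": "canary"}

	service := map[string]string{"app": "frontend"}    // spans stable + canary
	canaryOnly := map[string]string{"track": "canary"} // targets just the canary

	fmt.Println(matches(service, stable), matches(service, canary))       // true true
	fmt.Println(matches(canaryOnly, stable), matches(canaryOnly, canary)) // false true
}
```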
# v1beta3 discussion

12-04-2014

Network -- breakout

* Dynamic IP
  * Once we support live migration, the IP assigned to each pod has to move with it, which might break the underlying network.
  * We don't have introspection, which makes supporting various network topologies harder.
  * External IPs are an important part.
  * There's a kick-the-tires mode and a full-on mode (for GCE, AWS - fully featured).
  * How do we select kick-the-tires? Weave, Flannel, Calico: pick one.
  * Someone should do a comparison. thockin@ would like help evaluating these technologies against some benchmarks. Eric Paris can help - has a bare-metal setup. We'll have a benchmark setup for evaluation.
  * We need at least two real use cases - a webserver example; can 10 pods find each other. lavalamp@ is working on a test.
  * If Docker picks up a plugin model, we can use that.
  * The cluster will change dynamically; we need to design a flexible network plugin API to accommodate this.
  * Flannel does two things: network allocation through etcd and traffic routing w/ overlays. It also programs underlay networks (like GCE). Flannel will do IP allocation, not hard-coded.
  * One special use case: per node, only 20 IPs can be allocated. The scheduler might need to know about this limitation: OUT-OF-IP(?)
  * Different cloud providers, but OVS is a common mechanism.
  * We might need network grids in the end.
  * ACTIONS: better docs, tests.
* Public Services
  * Hard problem: have to scale to GCE, and the GCE load balancer cannot target an arbitrary IP; it can only target a VM for now.
  * Until we have an external IP, you cannot build an HA public service.
  * We can run Digital Ocean on top of Kubernetes.
  * Issue: when starting a public service, an internal IP is assigned. It is accessible from nodes within the cluster, but not from outside. Now that we have 3-tier services, how do we access one service from outside? The issue is how to take this internally accessible service and externalize it. General solution: forward outside traffic to the internal IP. First action: teach Kubernetes the mapping.
  * We need a registry of those public IPs. All traffic that comes to such an IP will be forwarded to the proper internal IP.
  * A public service can register with DNS, and an intermediate load balancer outside the cluster / Kubernetes can do the balancing, with a label query identifying the endpoints.
  * The k8s proxy can be an L3 LB that listens on the external IPs and talks to the k8s service DB to find internal services; traffic then goes to an L7 LB, which could be HAProxy scheduled as a pod, which talks to the pods DB and finds a cluster of pods to forward the traffic to. (A minimal forwarding sketch follows this list.)
  * Two types of services: mapping external IPs and an L3 LB that maps to pods. The L7 LB can access the IPs assigned to pods.
  * Policy: add more nodes, and more external IPs can be used.
  * Issue 1: how to map an external IP to a list of pods - the L3 LB part.
  * Issue 2: how to slice those external IPs: general pool vs. private pools.
* IP-per-service, visibility, segmenting
* Scale
* MAC
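To make the "forward outside traffic to the internal IP" idea concrete, here is a minimal sketch of the L3/L4 forwarding step: listen on an externally reachable address and copy bytes to an internal service IP. The addresses are made up, and a real implementation (the k8s proxy or an HAProxy pod, as discussed above) would also pick endpoints via label queries rather than a fixed backend.

```go
package main

import (
	"io"
	"log"
	"net"
)

// forward accepts connections on an externally reachable address and proxies
// each one to an internal (cluster-only) service IP. A minimal sketch of the
// "map external IP -> internal service" idea; addresses are illustrative.
func forward(externalAddr, internalAddr string) error {
	ln, err := net.Listen("tcp", externalAddr)
	if err != nil {
		return err
	}
	for {
		client, err := ln.Accept()
		if err != nil {
			return err
		}
		go func(client net.Conn) {
			defer client.Close()
			backend, err := net.Dial("tcp", internalAddr)
			if err != nil {
				log.Printf("dial %s: %v", internalAddr, err)
				return
			}
			defer backend.Close()
			go io.Copy(backend, client) // client -> service
			io.Copy(client, backend)    // service -> client
		}(client)
	}
}

func main() {
	// e.g. 203.0.113.10 is the registered public IP, 10.0.0.42 the internal
	// service IP assigned by Kubernetes (both made up for this sketch).
	log.Fatal(forward("203.0.113.10:80", "10.0.0.42:80"))
}
```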
# Roadmap

* Should be driven by scenarios / use cases -- breakout
* Storage / stateful services -- breakout
  * Clustered databases / kv stores
    * Mongo
    * MySQL master/slave
    * Cassandra
    * etcd
    * zookeeper
    * redis
    * ldap
  * Alternatives
    * local storage
    * durable volumes
    * identity associated with volumes
    * lifecycle management
    * network storage (ceph, nfs, gluster, hdfs)
    * volume plugin
    * flocker - volume migration
    * “durable” data (as reliable as the host)
* Upgrading Kubernetes
  * master components
  * kubelets
  * OS + kernel + Docker
* Usability
  * Easy cluster startup
    * Minion registration
    * Configuring k8s
      * move away from flags in master
      * node config distribution
      * kubelet config
      * dockercfg
  * Cluster scaling
  * CLI + config + deployment / rolling updates
  * Selected workloads
* Networking
  * External IPs
  * DNS
  * Kick-the-tires networking implementation
* Admission control not required for 1.0
* v1 API + deprecation policy
* Kubelet API well defined and versioned
* Basic resource-aware scheduling -- breakout
  * require limits?
  * auto-sizing
* Registry
  * Predictable deployment (config-time image resolution)
  * Easy code->k8s
  * Simple out-of-the-box setup
    * One or many?
    * Proxy?
    * Service?
  * Configurable .dockercfg
* Productionization
  * Scalability
    * 100 for 1.0
    * 1000 by summer 2015
  * HA master -- not gating 1.0
    * Master election
    * Eliminate global in-memory state
    * IP allocator
    * Operations
    * Sharding
    * Pod getter
  * Kubelets need to coast when the master is down
    * Don't blow away pods when the master is down
  * Testing
    * More/better/easier E2E
    * E2E integration testing w/ OpenShift
    * More non-E2E integration tests
    * Long-term soaking / stress test
  * Backward compatibility
  * Release cadence and artifacts
  * Export monitoring metrics (instrumentation)
  * Bounded disk space on master and kubelets
    * GC of unused images
* Docs
  * Reference architecture
* Auth[nz]
  * plugins + policy
  * admin
  * user->master
  * master component->component: localhost in 1.0
  * kubelet->master

hack/.spelling_failures

+1 -1

@@ -1,4 +1,4 @@
 events/elections/2017/
 vendor/
 sig-contributor-experience/contribex-survey-2018.csv
-
+events/2014
