# Kubernetes Contributor Conference, 2014-12-03 to 12-05
**Full notes** (has pictures; shared with the k-dev mailing list): https://docs.google.com/document/d/1cQLY9yeFgxlr_SRgaBZYGcJ4UtNhLAjJNwJa8424JMA/edit?usp=sharing
**Organizers:** thockin and bburns
**26 Attendees from:** Google, Red Hat, CoreOS, Box
**This is a historical document. No typo or grammar correction PRs needed.**

Last modified: Dec. 8, 2014
# Clustering and Cluster Formation

Goal: Decide how clusters should be formed and resized over time.

Models for building clusters:
* Master in charge - asset DB
  * Dynamic join - ask to join
* How Kelsey Hightower has seen this done on bare metal:
  * Use Fleet as a machine database
  * A Fleet agent runs on each node
  * Each node registers its information in etcd when it comes up
  * The only security is that etcd expects the node to have a cert signed by a specific CA
  * Run an etcd proxy on each node
  * Don't run any Salt scripts; everything is declarative
  * Just put a daemon (kube-register) on a machine to make it part of the cluster
  * brendanburns: basically using Fleet as the cloud provider
* Puppet model - whitelist some cert and/or subnet and trust everything in it
  * One problem - if the CA leaks, you have to replace certs on all nodes
* briangrant: we may want to support adding nodes that aren't trusted, scheduling only the node owner's work on them
* lavalamp: we need to differentiate between node states:
  * In the cluster
  * Ready to accept work
  * Trusted to accept work
* Proposal (a sketch of the registration flow follows this list):
  * New nodes initiate contact with the master
  * Allow multiple config options for how trust can be established - IP, cert, etc.
  * Each new node only needs one piece of information - how to find the master
  * Can support many different auth modes - let anyone in, whitelist IPs, require a particular signed cert, queue up requests for an admin to approve, etc.
  * Default should be auto-register with no auth/approval needed
  * Auth-ing is separate from registering
  * Support switching between permissive and strict auth modes:
    * Each node should register a public key, so that if the auth mode is later changed to require a cert upon registration, old nodes won't break
  * kelseyhightower: let the minion do the same thing that kube-register currently does
  * Separate adding a node to the cluster from declaring it schedulable
* Use cases:
  * Kick the tires - everything should be automagic
  * Professional who needs security
* Working group for later: Joe, Kelsey, Quintin, Eric Paris
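Below is a minimal Go sketch of the registration flow proposed above. The `/register` endpoint, `Registration` payload, and `ApprovalPolicy` hook are invented names for illustration, not an existing API; the point is only how "node contacts the master, a pluggable policy decides, the public key is recorded, schedulability stays separate" could fit together.

```go
// registration.go - sketch of the proposed node self-registration flow.
// All names here are hypothetical and only illustrate the proposal above.
package main

import (
	"encoding/json"
	"log"
	"net"
	"net/http"
)

// Registration is what a new node sends when it first contacts the master.
// The public key is registered up front so that switching to a stricter
// auth mode later does not break nodes that joined under the permissive one.
type Registration struct {
	Hostname  string `json:"hostname"`
	PublicKey string `json:"publicKey"`
}

// ApprovalPolicy decides whether a registering node is admitted. Possible
// implementations: allow-all (the default), an IP whitelist, requiring a
// cert signed by a known CA, or queuing the request for an admin to approve.
type ApprovalPolicy func(remoteIP string, reg Registration) bool

func allowAll(string, Registration) bool { return true }

func ipWhitelist(allowed ...string) ApprovalPolicy {
	ok := map[string]bool{}
	for _, ip := range allowed {
		ok[ip] = true
	}
	return func(remoteIP string, _ Registration) bool { return ok[remoteIP] }
}

func registerHandler(policy ApprovalPolicy) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		var reg Registration
		if err := json.NewDecoder(r.Body).Decode(&reg); err != nil {
			http.Error(w, err.Error(), http.StatusBadRequest)
			return
		}
		ip, _, _ := net.SplitHostPort(r.RemoteAddr)
		if !policy(ip, reg) {
			http.Error(w, "registration denied", http.StatusForbidden)
			return
		}
		// Admitted: the node is now "in the cluster"; marking it ready or
		// schedulable would be separate steps.
		log.Printf("registered node %s (from %s)", reg.Hostname, ip)
		w.WriteHeader(http.StatusCreated)
	}
}

func main() {
	// Default mode: auto-register with no approval needed. Swap in
	// ipWhitelist(...) or a cert check for stricter clusters.
	http.HandleFunc("/register", registerHandler(allowAll))
	log.Fatal(http.ListenAndServe(":8080", nil))
}
```

Stricter modes (IP whitelist, required cert, admin approval queue) would slot in as alternative `ApprovalPolicy` implementations without changing the node side.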
# Usability

* Getting started
  * Want easy entry for Docker users
  * Library/registry of pod templates
* GUI - visualization of relationships and dependencies, workflows, dashboards, ways to learn, first impressions
  * Will be easiest to start with a read-only UI before worrying about read-write workflows
* Docs
  * Need to refactor the getting-started guides so that there's one common guide
  * Each cloud provider will just have its own short guide on how to create a cluster
  * Need a simple test that can verify whether your cluster is healthy, or diagnose why it isn't
  * Make it easier to get to the architecture/design docs from the front page of the GitHub project
  * Table of contents for the docs?
  * Realistic examples
  * Kelsey has found that a tutorial on deploying with a canary helped make the value of labels clear (see the label sketch at the end of this section)
* CLI
  * Annoying when local auth files and config get overwritten when trying to work with multiple clusters
    * e.g. when running e2e tests
* Common friction points
  * External IPs
  * Image registry
  * Secrets
  * Deployment
  * Stateful services
  * Scheduling
  * Events/status
  * Log access
* Working groups
  * GUI - Jordan, Brian, Max, Satnam
  * CLI - Jeff, Sam, Derek
  * Docs - Proppy, Kelsey, TJ, Satnam, Jeff
  * Features/Experience - Dawn, Rohit, Kelsey, Proppy, Clayton: https://docs.google.com/document/d/1hqn6FtBNMe0sThbciq2PbE_P5BONBgCzHST4gz2zj50/edit
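Below is a small Go sketch of the label mechanics behind that canary tutorial: a selector is just a set of key/value pairs and matches any pod whose labels contain all of them, so selecting only `app=frontend` spans both the stable and canary pods while adding `track=canary` picks out the canary alone. Pod labels and values here are invented for illustration.

```go
// labels.go - why labels make the canary pattern easy: equality-based
// selection over arbitrary key/value labels.
package main

import "fmt"

type Labels map[string]string

// matches reports whether every key/value pair in sel is present in podLabels.
func matches(sel, podLabels Labels) bool {
	for k, v := range sel {
		if podLabels[k] != v {
			return false
		}
	}
	return true
}

func main() {
	stable := Labels{"app": "frontend", "track": "stable"}
	canary := Labels{"app": "frontend", "track": "canary"}

	// The service selects only on app, so it spans both tracks; a controller
	// that manages just the canary adds track=canary to its selector.
	service := Labels{"app": "frontend"}
	canaryOnly := Labels{"app": "frontend", "track": "canary"}

	fmt.Println(matches(service, stable), matches(service, canary))       // true true
	fmt.Println(matches(canaryOnly, stable), matches(canaryOnly, canary)) // false true
}
```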
# v1beta3 discussion

12-04-2014

Network -- breakout
* Dynamic IP
  * Once we support live migration, the IP assigned to each pod has to move with it, which the underlying network might not support.
  * We don't have introspection, which makes supporting various network topologies harder.
  * External IPs are an important part.
  * There's a kick-the-tires mode and a full-on mode (for GCE, AWS - fully featured).
  * How do we select the kick-the-tires option? Weave, Flannel, Calico: pick one.
  * Someone should do a comparison. thockin@ would like help evaluating these technologies against some benchmarks. Eric Paris can help - he has a bare-metal setup. We'll have a benchmark setup for evaluation.
  * We need at least two real use cases - a webserver example; can 10 pods find each other? lavalamp@ is working on a test.
  * If Docker picks up a plugin model, we can use that.
  * Clusters will change dynamically, so we need to design a flexible network plugin API to accommodate that.
  * Flannel does two things: network allocation through etcd and traffic routing with overlays. It also programs underlay networks (like GCE). Flannel will do IP allocation, so it isn't hard-coded.
  * One special use case: on a node, only 20 IPs can be allocated. The scheduler might need to know about that limitation: OUT-OF-IP(?)
  * Different cloud providers, but OVS is a common mechanism.
  * We might need Network Grids in the end.
  * ACTIONS: better docs, tests.
* Public Services (a sketch of the external-IP forwarding idea follows this list)
  * Hard problem: this has to scale on GCE, and the GCE load balancer cannot target an arbitrary IP - it can only target a VM for now.
  * Until we have an external IP, you cannot build an HA public service.
  * We can run Digital Ocean on top of Kubernetes.
  * Issue: when a public service is started, an internal IP is assigned. It is accessible from nodes within the cluster, but not from outside. With a 3-tier service, how do we reach one of its services from outside? The issue is how to take this internally accessible service and externalize it. General solution: forward outside traffic to the internal IP. First action: teach Kubernetes the mapping.
  * We need a registry of those public IPs. All traffic that arrives at such an IP will be forwarded to the proper internal IP.
  * A public service can register with DNS, and an intermediate load balancer outside the cluster / Kubernetes can do the balancing, using a label query to find the endpoints.
  * The k8s proxy can act as an L3 LB listening on the external IPs; it talks to the k8s service DB to find internal services. Traffic then goes to an L7 LB, which could be HAProxy scheduled as a pod; it talks to the pods DB and finds a set of pods to forward the traffic to.
  * Two types of services: mapping external IPs, and an L3 LB that maps to pods. The L7 LB can reach the IPs assigned to pods.
  * Policy: add more nodes and more external IPs become usable.
  * Issue 1: how to map an external IP to a list of pods - the L3 LB part.
  * Issue 2: how to slice up those external IPs: general pool vs. private pools.
* IP-per-service, visibility, segmenting
* Scale
* MAC
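Below is a bare-bones Go sketch of the "forward outside traffic to the internal IP" idea from the Public Services discussion: listen on a public address and copy bytes to the cluster-internal service IP. The addresses are invented, and the real proxy/LB layers would also do service lookup and balance across pod endpoints.

```go
// forwarder.go - minimal external-IP-to-internal-service forwarding sketch.
// Addresses are hypothetical; this is not the real kube-proxy.
package main

import (
	"io"
	"log"
	"net"
)

func forward(external, internal string) error {
	ln, err := net.Listen("tcp", external)
	if err != nil {
		return err
	}
	for {
		client, err := ln.Accept()
		if err != nil {
			return err
		}
		go func(c net.Conn) {
			defer c.Close()
			// The internal service IP is only reachable from nodes inside
			// the cluster; this forwarder runs on such a node.
			backend, err := net.Dial("tcp", internal)
			if err != nil {
				log.Printf("dial %s: %v", internal, err)
				return
			}
			defer backend.Close()
			go io.Copy(backend, c) // client -> service
			io.Copy(c, backend)    // service -> client
		}(client)
	}
}

func main() {
	// Hypothetical entry from the public-IP registry:
	// external 203.0.113.10:80 -> internal service IP 10.0.0.42:80.
	log.Fatal(forward("203.0.113.10:80", "10.0.0.42:80"))
}
```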
# Roadmap

* Should be driven by scenarios / use cases -- breakout
* Storage / stateful services -- breakout
  * Clustered databases / kv stores
    * Mongo
    * MySQL master/slave
    * Cassandra
    * etcd
    * zookeeper
    * redis
    * ldap
  * Alternatives
    * local storage
    * durable volumes
    * identity associated with volumes
    * lifecycle management
    * network storage (ceph, nfs, gluster, hdfs)
    * volume plugin
    * flocker - volume migration
    * “durable” data (as reliable as the host)
* Upgrading Kubernetes
  * master components
  * kubelets
  * OS + kernel + Docker
* Usability
  * Easy cluster startup
  * Minion registration
  * Configuring k8s
    * move away from flags in master
    * node config distribution
    * kubelet config
    * dockercfg
  * Cluster scaling
  * CLI + config + deployment / rolling updates
  * Selected workloads
* Networking
  * External IPs
  * DNS
  * Kick-the-tires networking implementation
* Admission control not required for 1.0
* v1 API + deprecation policy
* Kubelet API well defined and versioned
* Basic resource-aware scheduling -- breakout
  * require limits?
  * auto-sizing
* Registry
  * Predictable deployment (config-time image resolution)
  * Easy code->k8s
  * Simple out-of-the-box setup
    * One or many?
    * Proxy?
    * Service?
  * Configurable .dockercfg
* Productionization
  * Scalability
    * 100 nodes for 1.0
    * 1000 nodes by summer 2015
  * HA master -- not gating 1.0
    * Master election
    * Eliminate global in-memory state
    * IP allocator
    * Operations
    * Sharding
    * Pod getter
  * Kubelets need to coast when the master is down
    * Don't blow away pods when the master is down
  * Testing
    * More/better/easier E2E
    * E2E integration testing w/ OpenShift
    * More non-E2E integration tests
    * Long-term soaking / stress test
  * Backward compatibility
  * Release cadence and artifacts
  * Export monitoring metrics (instrumentation)
  * Bounded disk space on master and kubelets
    * GC of unused images
* Docs
  * Reference architecture
* Auth[nz]
  * plugins + policy
  * admin
  * user->master
  * master component->component: localhost in 1.0
  * kubelet->master