adding contributor summit notes
parispittman committed May 14, 2018
1 parent 643702e commit 0761a86
Showing 5 changed files with 436 additions and 0 deletions.
139 changes: 139 additions & 0 deletions events/2018/05-contributor-summit/clientgo-notes.md
@@ -0,0 +1,139 @@
# Client-go
**Lead:** munnerz with assist from lavalamp
**Slides:** combined with the CRD session [here](https://www.dropbox.com/s/n2fczhlbnoabug0/API%20extensions%20contributor%20summit.pdf?dl=0) (CRD is first; client-go is after)
**Thanks to our notetakers:** kragniz, mrbobbytales, directxman12, onyiny-ang

## Goals for the Session

* What is currently painful when building a controller
* Questions around best practices
* As someone new:
* What is hard to grasp?
* As someone experienced:
* What important bits of info do you think are critical


## Pain points when building a controller
* A lot of boilerplate (a minimal skeleton is sketched after this list)
* Work queues
* HasSynced functions
* Re-queuing
* Lack of deep documentation in these areas
  * Some documentation exists, but focused on k/k core
* Securing webhooks & APIServers
* Validation schemas
* TLS, the number of certs is a pain point
    * It is hard right now; the internal k8s CA has been used a bit.
    * OpenShift has a 'serving cert controller' that generates a cert based on an annotation; it might be possible to integrate it upstream.
* Election has been problematic, and the Scaling API is low-level and hard to use. It doesn't work well if a resource has multiple meanings of scale (e.g. multiple pools of nodes).
* Registering CRDs, what's the best way to go about it?
  * There's no single best way to do it, but CRDs have been deployed alongside the application
* Personally, deploy the CRDs first for RBAC reasons
* A declarative API on one end has to be translated to a transactional API on the other end (e.g. ingress), with the controller trying to change quite a few things.
* You can do locking, but it has to be built.
* Q: how do you deal with "rolling back" if the underlying infrastructure
that you're describing says no on an operation?
* A: use validating webhook?
* A: use status to keep track of things?
* A: two types of controllers: `kube --> kube` and `kube --> external`,
they work differently
* A: Need a record that keeps track of things in progress. e.g. status. Need more info on how to properly tackle this problem.
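
Much of that boilerplate follows the same shape. As a rough, non-authoritative sketch (the kubeconfig path, the Pod informer, and the `reconcile` function are placeholders, not anything prescribed in the session), a typical client-go controller wires together a shared informer, a rate-limited work queue, a HasSynced wait, and a re-queue-on-error loop:

```go
package main

import (
	"fmt"
	"time"

	"k8s.io/client-go/informers"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/cache"
	"k8s.io/client-go/tools/clientcmd"
	"k8s.io/client-go/util/workqueue"
)

// reconcile is where real business logic would live; here it only logs the key.
func reconcile(key string) error {
	fmt.Println("reconciling", key)
	return nil
}

func main() {
	// Hypothetical kubeconfig path, for illustration only.
	config, err := clientcmd.BuildConfigFromFlags("", "/home/user/.kube/config")
	if err != nil {
		panic(err)
	}
	client := kubernetes.NewForConfigOrDie(config)

	// Shared informer factory; a non-zero resync period is mainly useful when
	// the controller also reconciles external (non-Kubernetes) resources.
	factory := informers.NewSharedInformerFactory(client, 30*time.Second)
	informer := factory.Core().V1().Pods().Informer()

	// Rate-limited work queue: failed keys are re-queued with backoff.
	queue := workqueue.NewRateLimitingQueue(workqueue.DefaultControllerRateLimiter())

	enqueue := func(obj interface{}) {
		if key, err := cache.MetaNamespaceKeyFunc(obj); err == nil {
			queue.Add(key)
		}
	}
	informer.AddEventHandler(cache.ResourceEventHandlerFuncs{
		AddFunc:    enqueue,
		UpdateFunc: func(_, newObj interface{}) { enqueue(newObj) },
		DeleteFunc: func(obj interface{}) {
			// Tombstones from the cache are handled by this key func.
			if key, err := cache.DeletionHandlingMetaNamespaceKeyFunc(obj); err == nil {
				queue.Add(key)
			}
		},
	})

	stopCh := make(chan struct{})
	defer close(stopCh)
	factory.Start(stopCh)

	// The HasSynced boilerplate: don't process until the cache is warm.
	if !cache.WaitForCacheSync(stopCh, informer.HasSynced) {
		panic("cache never synced")
	}

	// The re-queuing boilerplate: pop keys, reconcile, and requeue on error.
	for {
		key, shutdown := queue.Get()
		if shutdown {
			return
		}
		func() {
			defer queue.Done(key)
			if err := reconcile(key.(string)); err != nil {
				queue.AddRateLimited(key)
				return
			}
			queue.Forget(key)
		}()
	}
}
```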


## Best practices
(Q: marks a question; A: marks an answer from the audience or the session leads)
* How do you keep external resources up to date with Kubernetes resources?
  * A: the original intention was to use the sync period on the controller for this; if you
    watch external resources, use that
* Should you set resync period to never if you're not dealing with
external resources?
    * A: Yes; it's not a bug if the watch doesn't deliver things right away
* A: controller automatically relists on connection issues, resync
interval is *only* for external resources
* maybe should be renamed to make it clear it's for external resources
* How many times should you update status per sync?
  * A: use status conditions to communicate "fluffy" status to the user
    (messages, what might be blocked, etc., as in HPA), and use fields to
    communicate "crunchy" status (last numbers we saw, last metrics, state
    I need later); see the sketch after this list.
* How do I generate nice docs (markdown instead of swagger)
* A: kubebuilder (kubernetes-sigs/kubebuilder) generates docs out of the
box
* A: Want to have IDL pipeline that runs on native types to run on CRDs,
run on docs generator
* Conditions vs fields
  * used to check a pod's state
  * "don't use conditions too much"; other features require the use of conditions, but the guidance for status is unsettled
* What does condition mean in this context
* Additional fields that can have `ready` with a msg, represents `state`.
* Limit on states that the object can be in.
* Use conditions to reflect the state of the world, is something blocked etc.
* Conditions were created to allow for mixed mode of clients, old clients can ignore some conditions while new clients can follow them. Designed to make it easier to extend status without breaking clients.
* Validating webhooks vs OpenAPI schema
* Can we write a test that spins up main API server in process?
    * Can do that currently in some k/k tests, but not easy to consume
* vendoring is hard
* Currently have a bug where you have to serve aggregated APIs on 443,
so that might complicate things
* How are people testing extensions?
* Anyone reusing upstream dind cluster?
* People looking for a good way to test them.
* kube-builder uses the sig-testing framework to bring up a local control plane and use that to test against. (@pwittrock)
* How do you start cluster for e2es?
* Spin up a full cluster with kubeadm and run tests against that
* integration tests -- pull in packages that will build the clusters
* Q: what CIs are you using?
* A: Circle CI and then spin up new VMs to host cluster
  * Mirantis has a tool for a multi-node dind cluster for testing
* #testing-commons channel on Slack; there's a 27-page document on this (link will be put in the slides)
* Deploying and managing Validating/Mutating webhooks?
* how complex should they be?
* When to use subresources?
* Are people switching to api agg to use this today?
* Really just for status and scale
* Why not use subresources today with scale?
* multiple replicas fields
* doesn't fit polymorphic structure that exists
* pwittrock@: kubectl side, scale
* want to push special kubectl verbs into subresources to make kubectl
more tolerant to version skew
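
As a concrete illustration of the "fluffy" vs "crunchy" split above, here is a small, hypothetical status type (the `Widget` names are made up, not from the session): conditions carry human-readable messages for users and old clients, while plain fields carry the values the controller reads back on its next sync.

```go
package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// WidgetStatus is a hypothetical status block: plain fields are the "crunchy"
// state the controller needs later, conditions are the "fluffy" state for users.
type WidgetStatus struct {
	// Crunchy: machine-readable facts the controller reads back on the next sync.
	ObservedGeneration int64 `json:"observedGeneration,omitempty"`
	ReadyReplicas      int32 `json:"readyReplicas,omitempty"`

	// Fluffy: extensible, message-bearing conditions; old clients can ignore
	// condition types they don't understand.
	Conditions []WidgetCondition `json:"conditions,omitempty"`
}

// WidgetCondition follows the usual Kubernetes condition shape.
type WidgetCondition struct {
	Type               string                 `json:"type"`
	Status             corev1.ConditionStatus `json:"status"`
	LastTransitionTime metav1.Time            `json:"lastTransitionTime,omitempty"`
	Reason             string                 `json:"reason,omitempty"`
	Message            string                 `json:"message,omitempty"`
}

func main() {
	status := WidgetStatus{
		ObservedGeneration: 3,
		ReadyReplicas:      2,
		Conditions: []WidgetCondition{{
			Type:    "Progressing",
			Status:  corev1.ConditionTrue,
			Reason:  "ScalingUp",
			Message: "waiting for 1 more replica to become ready",
		}},
	}
	fmt.Printf("%+v\n", status)
}
```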

## Other Questions

* Q: Client-go generated listers, what is the reason for two separate interfaces to retrieve from client and cache?
* A: historical, but some things are better done local vs on the server.
  * issues: the clientset interface allows you to pass special options that enable interesting behavior on the API server, which isn't necessarily possible with the lister.
  * started as the same function call and then diverged
  * the lister gives you a slice of pointers
  * the clientset gives you a slice of values (not pointers)
  * a lot of people would take the return from the clientset and convert it to a slice of pointers, so the listers help avoid deep copies every time. TL;DR: the interfaces are not identical (see the sketch after this list).
* Where should questions go on this topic for now?
* A: most goes to sig-api-machinery right now
  * A: Controller-related stuff would probably be best for sig-apps
* Q: Staleness of data, how are people dealing with keeping data up to date with external data?
* A: Specify sync period on your informer, will put everything through the loop and hit external resources.
* Q: With strictly Kubernetes resources, should your sync period be never? i.e., does the watch return everything?
  * A: The watch should return everything and should be relied on if it's strictly k8s in and k8s out; no need to set the sync period.
* Q: What about controllers in other languages than go?
  * A: [metacontroller](https://github.com/GoogleCloudPlatform/metacontroller). There are client libs in other languages; the missing pieces are the work queue, informer, etc.
* The Cluster API controllers (cluster, machineset, deployment) have a copy of the Deployment code for machines. Can we move this code into a library?
* A: it's a lot of work, someone needs to do it
* A: Janet Kuo is a good person to talk to (worked on getting core workloads
API to GA) about opinions on all of this
* Node name duplication caused issues with AWS and long-term caches
* make sure to store UIDs if you cache across reboot
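
A rough sketch of the lister/clientset divergence described above, assuming the client-go API of this era (List calls without a context argument): the clientset round-trips to the API server and returns a list of values, while the lister serves pointers straight out of the informer cache. Objects returned by the lister should be treated as read-only, since no deep copy is done.

```go
package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/labels"
	"k8s.io/client-go/informers"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Hypothetical kubeconfig path, for illustration only.
	config, err := clientcmd.BuildConfigFromFlags("", "/home/user/.kube/config")
	if err != nil {
		panic(err)
	}
	client := kubernetes.NewForConfigOrDie(config)

	// Clientset: a round trip to the API server; Items is a slice of values.
	podList, err := client.CoreV1().Pods("default").List(metav1.ListOptions{})
	if err != nil {
		panic(err)
	}
	var fromServer []corev1.Pod = podList.Items

	// Lister: reads from the local informer cache; returns a slice of pointers.
	factory := informers.NewSharedInformerFactory(client, 0)
	podLister := factory.Core().V1().Pods().Lister()
	stopCh := make(chan struct{})
	defer close(stopCh)
	factory.Start(stopCh)
	factory.WaitForCacheSync(stopCh)

	fromCache, err := podLister.Pods("default").List(labels.Everything())
	if err != nil {
		panic(err)
	}
	var cached []*corev1.Pod = fromCache

	fmt.Printf("server returned %d pods, cache holds %d pods\n", len(fromServer), len(cached))
}
```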

## Moving Forwards
* How do we share/disseminate knowledge (SIG PlatformDev?)
* Most SIGs maintain their own controllers
* Wiki? Developer Docs working group?
* Existing docs focus on in-tree development. Dedicated 'extending kubernetes' section?
* Git-book being developed for kubebuilder (book.kubebuilder.io); would appreciate feedback @pwittrock
* API extensions authors meetups?
* How do we communicate this knowledge for core kubernetes controllers
* Current-day: code review, hallway conversations
* Working group for platform development kit?
* Q: where should we discuss/have real time conversations?
* A: #sig-apimachinery, or maybe #sig-apps in slack (or mailing lists) for the workloads controllers
92 changes: 92 additions & 0 deletions events/2018/05-contributor-summit/crds-notes.md
@@ -0,0 +1,92 @@
# CRDs - future and pain points
**Lead:** sttts
**Slides:** combined with the client-go session [here](https://www.dropbox.com/s/n2fczhlbnoabug0/API%20extensions%20contributor%20summit.pdf?dl=0)
**Thanks to our notetakers:** mrbobbytales, kragniz, tpepper, and onyiny-ang

## Outlook: Aggregation
* API stable since 1.10. There is a lack of tools and library support.
* GSoC project with @xmudrii: share etcd storage
* `kubectl create etcdstorage your api-server`
* Store custom data in etcd

## Outlook: Custom Resources

1.11:
* alpha: multiple versions with/without conversion
* alpha: pruning - blocker for GA - unspecified fields are removed
* deep change of semantics of custom resources
* from JSON blob store to schema based storage
* alpha: defaulting - defaults from openapi validation schema are applied
* alpha: graceful deletion - (maybe? PR exists)
* alpha: server side printing columns for `kubectl get` customization
* beta: subresources - alpha in 1.10
* will have additionalProperties with extensible string map
* mutually exclusive with properties

1.12:
* multiple versions with declarative field renames
* strict create mode (issue #5889)

Missing from Roadmap:
- Additional Properties: Forbid additional fields
- Unknown fields are silently dropped instead of erroring
- Istio uses CRDs extensively: proto requires some kind of verification and CRDs are JSON
  - currently planning to go to GA without proto support
  - possibly in the longer-term plan
- Resource Quotas for Custom Resources
- doable, we know how but not currently implemented
- Defaulting: mutating webhook will default things when they are written
- Is validation going to be required in the future?
  - poll the audience!
  - gauging the general sense of validation requirements (who wants them, what's missing?); a minimal example of a validation schema follows after this list
- missing: references to core types aren't allowed/can't be defined -- this can lead to versioning complications
- limit CRDs cluster-wide such that they don't affect all namespaces
- no good discussion about how to improve this yet
- feel free to start one!
- Server-side printing columns: the per-resource-type columns need to come from the server -- a client on a different version than the server could highlight the wrong columns
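
As a hedged sketch of what an OpenAPI validation schema looks like when a CRD is registered programmatically (using the apiextensions v1beta1 types current at the time of these notes; the `Widget` group, names, and kubeconfig path are illustrative only):

```go
package main

import (
	apiextensionsv1beta1 "k8s.io/apiextensions-apiserver/pkg/apis/apiextensions/v1beta1"
	apiextensionsclient "k8s.io/apiextensions-apiserver/pkg/client/clientset/clientset"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/tools/clientcmd"
)

func float64Ptr(f float64) *float64 { return &f }

func main() {
	// Hypothetical kubeconfig path, for illustration only.
	config, err := clientcmd.BuildConfigFromFlags("", "/home/user/.kube/config")
	if err != nil {
		panic(err)
	}
	client := apiextensionsclient.NewForConfigOrDie(config)

	crd := &apiextensionsv1beta1.CustomResourceDefinition{
		ObjectMeta: metav1.ObjectMeta{Name: "widgets.example.com"},
		Spec: apiextensionsv1beta1.CustomResourceDefinitionSpec{
			Group:   "example.com",
			Version: "v1alpha1",
			Scope:   apiextensionsv1beta1.NamespaceScoped,
			Names: apiextensionsv1beta1.CustomResourceDefinitionNames{
				Plural: "widgets",
				Kind:   "Widget",
			},
			// OpenAPI v3 validation schema: spec.replicas must be an integer >= 1.
			Validation: &apiextensionsv1beta1.CustomResourceValidation{
				OpenAPIV3Schema: &apiextensionsv1beta1.JSONSchemaProps{
					Properties: map[string]apiextensionsv1beta1.JSONSchemaProps{
						"spec": {
							Properties: map[string]apiextensionsv1beta1.JSONSchemaProps{
								"replicas": {Type: "integer", Minimum: float64Ptr(1)},
							},
						},
					},
				},
			},
		},
	}

	// Era-appropriate Create call (no context argument in 2018-vintage clients).
	if _, err := client.ApiextensionsV1beta1().CustomResourceDefinitions().Create(crd); err != nil {
		panic(err)
	}
}
```

Deploying the CRD before the controller, as suggested in the client-go session, keeps this registration step out of the controller's runtime RBAC.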

Autoscaling is alpha today, hopefully beta in 1.11

## The Future: Versioning
* Most-requested feature; it's coming, but slowly
* two types, "noConversion" and "Declarative Conversion"
* "NoConversion" versioning
* maybe in 1.11
* ONLY change is apiGroup
* Run multiple versions at the same time, they are not converted

* "Declarative Conversion" 1.12
* declarative rename, e.g.
```
spec:
  group: kubecon.io
  version: v1
  conversions:
    declarative:
      renames:
        from: v1alpha1
        to: v1
        old: spec.foo
        new: bar
```
* Support for webhook?
* not currently, very hard to implement
* complex problem for end user
* current need is really only changing for single fields
* Trying to avoid complexity by adding a lot of conversions

## Questions:
* When should someone move to their own API Server
  * At the moment, we're telling people to start with CRDs; move to an aggregated API server if you need custom versioning or other specific use-cases.
* How do I update everything to a new object version?
* Have to touch every object.
* Is protobuf support coming in the future?
* possibly, likely yes
* update on resource quotas for CRDs
  * A PoC PR is currently out; it's doable, just not quite done
* Is validation field going to be required?
* Eventually, yes? Some work being done to make CRDs work well with `kubectl apply`
* Can CRDs be cluster-wide but viewable to only some users?
* It's been discussed, but hasn't been tackled.
* Is there support for CRDs in kubectl output?
  * server-side printing columns will make things easier for client tooling output; this matters because client and server versions can differ.
63 changes: 63 additions & 0 deletions events/2018/05-contributor-summit/devtools-notes.md
@@ -0,0 +1,63 @@
# Developer Tools:
**Leads:** errordeveloper, r2d4
**Slides:** n/a
**Thanks to our notetakers:** mrbobbytales, onyiny-ang

What APIs should we target, what parts of the developer workflow haven't been covered yet?

* Do you think developer tools for Kubernetes are a solved problem?
* A: No

### Long form responses from SIG Apps survey
* Need to talk about developer experience
* Kubernetes Community can do a lot more in helping evangelize Software development workflow, including CI/CD. Just expecting some guidelines on the more productive ways to write software that runs in k8s.
* Although my sentiment is neutral on kube, it is getting better as more tools are emerging to allow my devs to stick to app development and not get distracted by kube items. There is a lot of tooling available which is a dual edge sword, these tools range greatly in usability robustness and security. So it takes a lot of effort to...

### Current State of Developer Experience
* Many Tools
* Mostly incompatible
* Few end-to-end workflows

### Comments and Questions
* Idea from Skaffold to normalize the interface for builders, to be able to swap them out behind the scenes.
* Possible to formalize these as CRDs?
* Lots of choices, helm, other templating, kompose etc..
* So much flexibility in the Kubernetes API that it can become complicated for new developers coming up.
* Debug containers might make things easier for developers to work through building and troubleshooting their app.
* Domains and workflow are so different from companies that everyone has their own opinionated solution.
* Lots of work being done in the app def working group to define what an app is.
* app CRD work should make things easier for developers.
* Break out developer workflow into stages and try and work through expanding them, e.g. develop/debug
* debug containers are looking to be used both in prod and developer workflows
* Tool in sig-cli called kustomize, which was previously called 'kinflate'
* Hard to talk about all these topics as there isn't the language to talk about these classes of tools.
* @jacob investigation into application definition: re: phases, it's not just build, deploy, debug; it's build, deploy, lifecycle, debug. Managing lifecycle is still a problem; '1-click deploy' doesn't handle lifecycle.
* @Bryan Liles: thoughts about why this is hard:
  * kubectl and helm apply objects in different orders
* objects vs abstractions
* some people love [ksonnet](https://ksonnet.io/), some hate it. Kubernetes concepts are introduced differently to different people, so not everyone is starting with the same base; thus, some tools are harder for some people to grasp than others. Shout out to everyone who's trying to work through it.
* Being tied to one tool breaks compatibility across providers.
* Debug containers are great for break-glass scenarios
* CoreOS had an operator that handled the entire stack, additional objects could be created and certain metrics attached.
* Everything is open source now, etcd, prometheus operator
* Tools are applying things in different orders, and this can be a problem across tooling
* People who depend on startup order also tend to have reliability problems, since they have their own operational issues; try to engineer around it.
* Can be hard if going crazy on high-level abstractions, can make things overly complicated and there are a slew of constraints in play.
* Ordering constraints are needed for certain garbage collection tasks, having ordering may actually be useful.
* Some groups have avoided high-level DSLs because people should understand readiness/liveness probes etc. Developers may have a learning curve, but it's worthwhile when troubleshooting and getting into the weeds.
* Lots of people don't want to get into it at all, they want to put in a few details on a db etc and get it.
* Maybe standardize on a set of labels to on things that should be managed as a group. Helm is one implementation, it should go beyond helm.
* There is a PR that is out there that might take care of some of this.
* Everyone has their own "style" when it comes to this space.
* Break the phases and components in the development and deployment workflow into sub-problems and they may actually be able to be tackled. Right now the community seems to be tackling everything at once and developing different tools to do the same thing.
* build UI that displays the whole thing as a list and allows easy creation/destruction of cluster
* avoid tools that would prevent portability
* objects rendered to file somehow: happens at runtime, an additional operator that takes care of the stack
* 3, 4 minor upgrades without breakage
* @Daniel Smith: start up order problems = probably bigger problems, order shouldn't need to matter but in the real world sometimes it does
* platform team, internal paths team (TSL like theme), etc. In some cases it's best to go crazy focusing on the abstractions--whole lot of plumbing that needs to happen to get everything working properly
* Well defined order of creation may not be a bad thing. ie. ensure objects aren't created that are immediately garbage collected.
* Taking a step back from being contributors and putting on developer hats to consider the tool sprawl that exists and is not necessarily compatible across different aspects of Kubernetes: is there any way to consolidate the tools and make them more standardized?
* Split into sub-problems

## How can we get involved?
- SIG-Apps - join the conversation on slack, mailing list, or weekly Monday meeting