Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PR to support RFC2136 multiple hosts. #4653

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

Jeremy-Boyle
Copy link
Contributor

@Jeremy-Boyle Jeremy-Boyle commented Aug 2, 2024

Description

Details in #4651

Fixes #4651
Fixes #3470

Checklist

  • Unit tests updated
  • End user documentation updated

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Aug 2, 2024
@k8s-ci-robot
Copy link
Contributor

Hi @Jeremy-Boyle. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added the size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. label Aug 2, 2024
@Jeremy-Boyle Jeremy-Boyle changed the title PR to support RFC2136 multiple hosts. WIP: PR to support RFC2136 multiple hosts. Aug 2, 2024
@k8s-ci-robot k8s-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Aug 2, 2024
@Jeremy-Boyle Jeremy-Boyle changed the title WIP: PR to support RFC2136 multiple hosts. PR to support RFC2136 multiple hosts. Aug 2, 2024
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Aug 2, 2024
@Jeremy-Boyle
Copy link
Contributor Author

Jeremy-Boyle commented Aug 2, 2024

Work has been completed All tests are succeeding,

I have build and tested the use cases for fail-over and load balancing options within a working lab environment.

Failover

time="2024-08-02T18:27:57Z" level=info msg="Created Dynamic Kubernetes client https://10.96.0.1:443"
time="2024-08-02T18:27:57Z" level=info msg="Configured RFC2136 with zones '[PRIVATE-DNS.com]' and nameservers '[HOST-1 HOST-2]'"
time="2024-08-02T18:27:57Z" level=debug msg="Fetching records for '\"PRIVATE-DNS.com\"'"
time="2024-08-02T18:27:57Z" level=debug msg="Fetching records from nameserver: HOST-1:53"
time="2024-08-02T18:27:59Z" level=warning msg="Last operation failed for nameserver HOST-1:53"
time="2024-08-02T18:27:59Z" level=warning msg="Last operation error message: failed to fetch records via AXFR: failed to connect for transfer: dial tcp HOST-1:53: i/o timeout"
time="2024-08-02T18:27:59Z" level=debug msg="Fetching records from nameserver: HOST-2:53"
time="2024-08-02T18:27:59Z" level=debug msg="Record=PRIVATE-DNS.com.\t86400\tIN\tSOA\topnsense.home.arpa. mail.opnsense.home.arpa. 2407231527 21600 3600 3542400 3600"
time="2024-08-02T18:27:59Z" level=debug msg="Record=PRIVATE-DNS.com.\t86400\tIN\tNS\tns1.PRIVATE-DNS.com."

Load balance

Round Robbin

time="2024-08-02T18:32:43Z" level=info msg="Created Dynamic Kubernetes client https://10.96.0.1:443"
time="2024-08-02T18:32:43Z" level=info msg="Configured RFC2136 with zones '[PRIVATE-DNS.com]' and nameservers '[HOST-1 HOST-2]'"
time="2024-08-02T18:32:43Z" level=debug msg="Fetching records for '\"PRIVATE-DNS.com\"'"
time="2024-08-02T18:32:43Z" level=debug msg="Fetching records from nameserver: HOST-1:53"
time="2024-08-02T18:32:43Z" level=debug msg="Record=PRIVATE-DNS.com.\t86400\tIN\tSOA\topnsense.home.arpa. mail.opnsense.home.arpa. 2407231527 21600 3600 3542400 3600"
time="2024-08-02T18:32:43Z" level=info msg="All records are already up to date"
time="2024-08-02T18:32:53Z" level=debug msg="Fetching records for '\"PRIVATE-DNS.com\"'"
time="2024-08-02T18:32:53Z" level=debug msg="Fetching records from nameserver: HOST-2:53"
time="2024-08-02T18:32:53Z" level=debug msg="Record=PRIVATE-DNS.com.\t86400\tIN\tSOA\topnsense.home.arpa. mail.opnsense.home.arpa. 2407231527 21600 3600 3542400 3600"
time="2024-08-02T18:33:03Z" level=debug msg="Fetching records for '\"PRIVATE-DNS.com\"'"
time="2024-08-02T18:33:03Z" level=debug msg="Fetching records from nameserver: HOST-1:53"
time="2024-08-02T18:33:03Z" level=debug msg="Record=PRIVATE-DNS.com.\t86400\tIN\tSOA\topnsense.home.arpa. mail.opnsense.home.arpa. 2407231527 21600 3600 3542400 3600"
time="2024-08-02T18:33:03Z" level=info msg="All records are already up to date"
time="2024-08-02T18:33:14Z" level=debug msg="Fetching records for '\"PRIVATE-DNS.com\"'"
time="2024-08-02T18:33:14Z" level=debug msg="Fetching records from nameserver: HOST-2:53"
time="2024-08-02T18:33:14Z" level=debug msg="Record=PRIVATE-DNS.com.\t86400\tIN\tSOA\topnsense.home.arpa. mail.opnsense.home.arpa. 2407231527 21600 3600 3542400 3600"
time="2024-08-02T18:33:14Z" level=info msg="All records are already up to date"
time="2024-08-02T18:33:25Z" level=debug msg="Fetching records for '\"PRIVATE-DNS.com\"'"
time="2024-08-02T18:33:25Z" level=debug msg="Fetching records from nameserver: HOST-1:53"
time="2024-08-02T18:33:25Z" level=debug msg="Record=PRIVATE-DNS.com.\t86400\tIN\tSOA\topnsense.home.arpa. mail.opnsense.home.arpa. 2407231527 21600 3600 3542400 3600"
time="2024-08-02T18:33:25Z" level=info msg="All records are already up to date"

###Random

time="2024-08-02T18:38:02Z" level=info msg="Created Dynamic Kubernetes client https://10.96.0.1:443"
time="2024-08-02T18:38:02Z" level=info msg="Configured RFC2136 with zones '[PRIVATE-DNS.com]' and nameservers '[HOST-1 HOST-2 HOST-3]'"
time="2024-08-02T18:38:02Z" level=debug msg="Fetching records for '\"PRIVATE-DNS.com\"'"
time="2024-08-02T18:38:02Z" level=debug msg="Fetching records from nameserver: HOST-3:53"
time="2024-08-02T18:38:02Z" level=debug msg="Record=PRIVATE-DNS.com.\t86400\tIN\tSOA\topnsense.home.arpa. mail.opnsense.home.arpa. 2407231527 21600 3600 3542400 3600"
time="2024-08-02T18:38:02Z" level=info msg="All records are already up to date"
time="2024-08-02T18:38:12Z" level=debug msg="Fetching records for '\"PRIVATE-DNS.com\"'"
time="2024-08-02T18:38:12Z" level=debug msg="Fetching records from nameserver: HOST-2:53"
time="2024-08-02T18:38:12Z" level=debug msg="Record=PRIVATE-DNS.com.\t86400\tIN\tSOA\topnsense.home.arpa. mail.opnsense.home.arpa. 2407231527 21600 3600 3542400 3600"
time="2024-08-02T18:38:12Z" level=info msg="All records are already up to date"
time="2024-08-02T18:38:22Z" level=debug msg="Fetching records for '\"PRIVATE-DNS.com\"'"
time="2024-08-02T18:38:22Z" level=debug msg="Fetching records from nameserver: HOST-2:53"
time="2024-08-02T18:38:22Z" level=debug msg="Record=PRIVATE-DNS.com.\t86400\tIN\tSOA\topnsense.home.arpa. mail.opnsense.home.arpa. 2407231527 21600 3600 3542400 3600"
time="2024-08-02T18:38:22Z" level=info msg="All records are already up to date"
time="2024-08-02T18:38:33Z" level=debug msg="Fetching records for '\"PRIVATE-DNS.com\"'"
time="2024-08-02T18:38:33Z" level=debug msg="Fetching records from nameserver: HOST-3:53"
time="2024-08-02T18:38:33Z" level=debug msg="Record=PRIVATE-DNS.com.\t86400\tIN\tSOA\topnsense.home.arpa. mail.opnsense.home.arpa. 2407231527 21600 3600 3542400 3600"
time="2024-08-02T18:38:33Z" level=info msg="All records are already up to date"
time="2024-08-02T18:38:44Z" level=debug msg="Fetching records for '\"PRIVATE-DNS.com\"'"
time="2024-08-02T18:38:44Z" level=debug msg="Fetching records from nameserver: HOST-3:53"
time="2024-08-02T18:38:44Z" level=debug msg="Record=PRIVATE-DNS.com.\t86400\tIN\tSOA\topnsense.home.arpa. mail.opnsense.home.arpa. 2407231527 21600 3600 3542400 3600"
time="2024-08-02T18:38:55Z" level=info msg="All records are already up to date"

I can provide the image for testing if you would like, however when it goes to staging i believe the image can be used when its build via the ci.

Additionally if all hosts fail then the pod will crash like in its current state in master.

pkg/apis/externaldns/types.go Show resolved Hide resolved
pkg/apis/externaldns/types.go Show resolved Hide resolved
provider/rfc2136/rfc2136.go Show resolved Hide resolved
provider/rfc2136/rfc2136.go Show resolved Hide resolved
@mloiseleur
Copy link
Contributor

Thanks for this PR @Jeremy-Boyle. It seems you are a big user of this provider.
I'm not sure to understand why you didn't use a reverse proxy to load-balance the requests.

Considering the current status of in-tree providers (see README), would you be interested to move this provider out of tree, using webhook ?

@Jeremy-Boyle
Copy link
Contributor Author

@mloiseleur

Unfortunately, without going into any specifics, making it a webhook provider wouldn't work. No issues with that implementation, however our organization wouldn't be allowed or able to use anything other than the official images provided directly.

The loadblancer in front of the dns server is a valid option for some use cases. However we also use Kerberos, thus with a load balancer you would need to sticky session the entire session to the same server once it gets its session token which neglects the whole reason for this, and same with tsig.

This solution logs you into each host to properly handle sending requests to each individual server.

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 15, 2024
@Jeremy-Boyle
Copy link
Contributor Author

@mloiseleur , did you have any other questions or concerns before considering this pr?

@Jeremy-Boyle
Copy link
Contributor Author

Can i please have a update for this please?

@Jeremy-Boyle
Copy link
Contributor Author

@mloiseleur Any update ?

Signed-off-by: Jeremy-Boyle <[email protected]>

Fixed go tests, added checks to ensure multiple hosts, and RFC2136LoadBalancingStrategy is set and can be overritten

Signed-off-by: Jeremy-Boyle <[email protected]>

Documentation to support Multiple Hosts and Load Balancing features

Signed-off-by: Jeremy-Boyle <[email protected]>

WIP, counter not working correctly

Signed-off-by: Jeremy-Boyle <[email protected]>

Make pointers to the rfc2136 provider, fixed counter issue, log out last error.

Signed-off-by: Jeremy-Boyle <[email protected]>

Fix error with failover not working correctly

Signed-off-by: Jeremy-Boyle <[email protected]>

Repoint makefile us.gcr.io/k8s-artifacts-prod/external-dns

Signed-off-by: Jeremy-Boyle <[email protected]>

Repoint makefile us.gcr.io/k8s-artifacts-prod/external-dns

Signed-off-by: Jeremy-Boyle <[email protected]>

Fix changes that arent related directly to this PR

Signed-off-by: Jeremy-Boyle <[email protected]>

Changed comment message details for counter

Signed-off-by: Jeremy-Boyle <[email protected]>
@Jeremy-Boyle Jeremy-Boyle force-pushed the feature/multiple-RFC2136-hosts-4651 branch from 8f9be13 to 16dd5b6 Compare November 14, 2024 21:25
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign raffo for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Nov 14, 2024
@k8s-ci-robot k8s-ci-robot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Nov 14, 2024
@Jeremy-Boyle
Copy link
Contributor Author

Jeremy-Boyle commented Nov 14, 2024

Rebased off master. This is still needed for my ORG.

This also includes fixes and improvements from users reporting issues: #3470 .

Please consider this MR. @mloiseleur, @Raffo, @szuecs , if this is not something that wants to be improved or fixed, can we please have a discussion on it. I do not personally bandwidth to fully support a webhook based plugin, I can however provide support and fixes as needed for the code above. Tests have been written and no breaking changes have been introduced or made that would conflict with existing configurations.

Additionally, I have been using this patch since august with no issues. This has fixed our issues and has improved load on our system by distributing the requests. However, having to maintain our own port is tiresome, and a few other members have requested some of this functionality.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support Multiple RFC2136 Hosts with Load Balancing Options Support list of rfc2136 dns server hosts
3 participants