Fix networking issues within swarm #204

Open
2 of 8 tasks
ryardley opened this issue Dec 11, 2024 · 2 comments · May be fixed by #205
Labels: bug (Something isn't working), chore, Ciphernode (Related to the ciphernode package)

Comments

ryardley commented Dec 11, 2024

Swarm is still not working correctly.

  • Ensure Kademlia is bootstrapping correctly
  • Ensure nodes are sending messages to one another correctly
  • Get nodes to accept multiaddrs and resolve their DNS domains correctly (just about done)
  • Stop nodes from retrying when they dial themselves
  • Supply all nodes with the multiaddrs of all other nodes
  • Ensure we have a way to upgrade individual containers: shut down and update one service at a time
  • Migrate all work on the example to enclave, possibly refactoring to an actor if easy
  • Document everything
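A minimal sketch of the self-dial and peer-list items above, using only the standard library. `NodeId`, `DialFailure`, `peers_to_dial`, and `should_retry` are hypothetical names made up for illustration; they are not libp2p types, and mapping real swarm dial errors onto `DialFailure` is not shown here.

```rust
/// Hypothetical stand-in for a peer identifier (not libp2p's `PeerId`).
#[derive(Clone, Debug, PartialEq, Eq)]
struct NodeId(String);

/// Hypothetical classification of a failed dial attempt.
#[derive(Debug, PartialEq)]
enum DialFailure {
    /// The dialed address turned out to be this node itself.
    SelfDial,
    /// Any other failure, for example the remote node is not up yet.
    Unreachable,
}

/// Every node can be given the full cluster list; the local id is
/// filtered out so the node never tries to dial itself in the first place.
fn peers_to_dial(local: &NodeId, configured: &[NodeId]) -> Vec<NodeId> {
    configured.iter().filter(|p| *p != local).cloned().collect()
}

/// A self-dial can never succeed, so it is the one failure that is not retried.
fn should_retry(failure: &DialFailure) -> bool {
    match failure {
        DialFailure::SelfDial => false,
        DialFailure::Unreachable => true,
    }
}
```

Filtering up front and refusing to retry are complementary: the filter handles the case where the local id is known in advance, while `should_retry` covers a self-dial that is only discovered after an address has been resolved.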
@ryardley ryardley linked a pull request Dec 11, 2024 that will close this issue
@ryardley ryardley self-assigned this Dec 11, 2024
@ryardley ryardley added bug Something isn't working Ciphernode Related to the ciphernode package chore labels Dec 11, 2024
ryardley commented Dec 17, 2024

Shifted work on this over to this repo, which demonstrates exponential backoff when dialing nodes: https://github.com/ryardley/libp2p-kad-gossipsub-quic-example/tree/main. The next step is to backport it to enclave.
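For readers unfamiliar with the technique, exponential backoff around a dial might look roughly like the following. This is a standard-library sketch and not the code from the linked repo: `backoff_delay` and `dial_with_backoff` are hypothetical helpers, the `dial` closure stands in for whatever actually dials a peer, and the blocking `sleep` is a simplification (an async node would use a timer instead).

```rust
use std::time::Duration;

/// Delay before retry number `attempt` (0-based): `base * 2^attempt`, capped at `max`.
fn backoff_delay(base: Duration, max: Duration, attempt: u32) -> Duration {
    // Saturate instead of overflowing for large attempt counts.
    let factor = 2u32.checked_pow(attempt).unwrap_or(u32::MAX);
    base.checked_mul(factor).unwrap_or(max).min(max)
}

/// Call `dial` until it succeeds or `max_attempts` calls have failed.
/// At least one attempt is always made. On success, returns how many
/// failed attempts came before it.
fn dial_with_backoff<F, E>(
    mut dial: F,
    base: Duration,
    max: Duration,
    max_attempts: u32,
) -> Result<u32, E>
where
    F: FnMut() -> Result<(), E>,
{
    let mut attempt = 0;
    loop {
        match dial() {
            Ok(()) => return Ok(attempt),
            // Out of attempts: surface the last error to the caller.
            Err(e) if attempt + 1 >= max_attempts => return Err(e),
            Err(_) => {
                std::thread::sleep(backoff_delay(base, max, attempt));
                attempt += 1;
            }
        }
    }
}
```

With a base of 100 ms and a cap of 5 s, the waits are 100 ms, 200 ms, 400 ms, 800 ms and so on until they reach the cap.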

ryardley commented Dec 18, 2024

TODO list for 19/12:

  1. Finish getting nodes here to accept a multiaddr and swap out the domain for its resolved address (close to done): https://github.com/ryardley/libp2p-kad-gossipsub-quic-example/blob/main/src/main.rs#L176

  2. Handle what happens when a node is asked to dial itself: cover the failure case there and don't retry it.

  3. Supply bootstrap nodes with multiaddrs for all nodes in the cluster. This means that any node can be shut down and will automatically dial all the other nodes when it restarts. Q: do we need to persist the Kademlia routing table?

  4. Fix deployment to shut down and update single nodes one at a time

    So far we have a script over here: https://github.com/ryardley/libp2p-kad-gossipsub-quic-example/blob/main/deploy.sh

    This shuts down the whole stack and restarts it so that the logs come from the newly deployed instances. For some reason I cannot update the instances and have the service logs reflect the update. This needs a more detailed investigation of docker stack to work out how to manage it.

  5. Migrate everything in the example repo to the enclave repo. This might involve refactoring the network peer to an actor.
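As an illustration of item 1, here is a rough sketch of swapping the domain in a textual multiaddr for a resolved IP. It is not the code at the linked line: `swap_resolved_domain` is a hypothetical helper that works on plain strings, the resolver is passed in as a closure so the DNS lookup itself stays out of the example, and it ignores the rule that `dns4` should only yield IPv4 addresses.

```rust
use std::net::IpAddr;

/// If `addr` starts with `/dns/<host>`, `/dns4/<host>` or `/dns6/<host>`,
/// replace that component with `/ip4/<ip>` or `/ip6/<ip>` using `resolve`.
/// Addresses that already start with another protocol are returned unchanged.
/// Returns `None` if the address is malformed or the host does not resolve.
fn swap_resolved_domain<R>(addr: &str, resolve: R) -> Option<String>
where
    R: Fn(&str) -> Option<IpAddr>,
{
    let parts: Vec<&str> = addr.split('/').collect();
    // A textual multiaddr starts with '/', so the first piece is empty.
    if parts.len() < 3 || !parts[0].is_empty() {
        return None;
    }
    match parts[1] {
        "dns" | "dns4" | "dns6" => {
            let ip = resolve(parts[2])?;
            let proto = match ip {
                IpAddr::V4(_) => "ip4",
                IpAddr::V6(_) => "ip6",
            };
            // Keep everything after the host (transport, port, and so on) as-is.
            let rest = parts[3..].join("/");
            if rest.is_empty() {
                Some(format!("/{}/{}", proto, ip))
            } else {
                Some(format!("/{}/{}/{}", proto, ip, rest))
            }
        }
        _ => Some(addr.to_string()),
    }
}
```

A real resolver closure could be built on `std::net::ToSocketAddrs`, for example by calling `(host, 0).to_socket_addrs()` and taking the IP of the first result. Returning `None` on a failed lookup lets the caller decide whether to back off and try again later.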
