[Merged by Bors] - P2P decentralization improvements #5329
Codecov Report

Coverage diff:

| | develop | #5329 | +/- |
|---|---|---|---|
| Coverage | 77.5% | 77.3% | -0.2% |
| Files | 252 | 257 | +5 |
| Lines | 29695 | 30441 | +746 |
| Hits | 23026 | 23546 | +520 |
| Misses | 5211 | 5401 | +190 |
| Partials | 1458 | 1494 | +36 |
    select {
    case <-ctx.Done():
        return nil
    case p, ok := <-peerCh:
`peerCh` could be nil at this point (if `d.disc.FindPeers(ctx, ns)` failed). Reading from it would then block forever.
Forgot to `continue` the loop in the `if` above; fixed, thanks.
Fixed the conflict, created https://github.com/spacemeshos/api/releases/tag/v1.25.0 (also https://github.com/spacemeshos/api/releases/tag/release%2Fgo%2Fv1.25.0)
Allow multiple values
Disabled routing discovery advertisement by default, and made the nodes use bootnodes as relays when routing discovery is disabled, for better consistency with existing behaviour when an old config is used.
@poszu pls check if the README / changelog changes are sufficient (or if they're excessive)
bors merge
## Motivation

This PR implements changes needed for spacemeshos/pm#275, except for measurement

## Changes

* Introduce Routing Discovery to contact peers behind NATs
* Introduce dynamic v2 relay discovery which is needed for hole punching. The idea is to have a wider array of circuit-v2 passive relays which should be much safer than old libp2p active relays (which were disabled in e.g. Filecoin due to security concerns)
* Introduce QUIC transport to improve chances at hole punching, with testnet-mainnet "crosstalk" protection based on a transport-level handshake mechanism
  * the handshake is not used on mainnet. That way, connections between mainnet and testnet nodes are still prevented, as testnet peers expect the handshake, but if/when my libp2p changes are merged (libp2p/go-libp2p#2658) or libp2p gets private network support
* Make it possible to listen on multiple addresses and advertise multiple addresses
* Extend DebugService with additional P2P info needed for hole punching diagnostics (needs spacemeshos/api#285)
* Add `ping-peers` config option to facilitate P2P network issue diagnostics
* Add `force-dht-server` config option that is useful when troubleshooting DHT and hole-punching issues

`ping-peers` and `force-dht-server` were initially considered to be temporary features, but I think it might make sense to keep them for various P2P network troubleshooting scenarios.

All of the changes are disabled in the config by default, except for:

* libp2p Ping service is enabled by default to make diagnostics easier
* DHT Values and Providers, as these will make DHT Routing Peer discovery work efficiently from the beginning when we enable this feature in the configs
* Bootnodes aren't used as relays by default anymore. v2 relays have very limited capacity by default and bootnode relay servers' reservations are very quickly exhausted. Need to either specify a static relay list or enable routing discovery, which searches for more available relays as needed

## Test Plan

* Tested using several k8s clusters with cone NATs enabled via `bridge` CNI plugin (via Multus) -- backported to v1.2.8
* Added a Mac node for testing

## TODO

- [x] Have spacemeshos/api#285 merged and updated to the new `api` release
- [x] Retest using an image based on this branch (not backport)
- [ ] Decide on whether/how to extend systests to include NAT testing
- [ ] To check: TCP holepunching tends to happen more than QUIC (might be related to the handshake mechanism)
- [ ] ~~To consider: try picking up some % (e.g.: 50%) of non-infra peers during routing discovery~~ (doesn't work too well, need something more involved for that)

Maybe as a follow-up (depending on how soon this gets reviewed):

- Include new metrics / check if they're already present
  - NAT type (UDP / TCP)
    - Cone / Symmetric / Unknown
  - Reachability
    - Public / Private / Unknown
  - N of "advertised" peers found via routing discovery
  - N of TCP and UDP (QUIC) peers
  - N of peers reached via relayed connections (these being present for a long time may indicate hole-punching troubles, usually relayed connections go away relatively quickly)
  - N of relay reservations this node managed to obtain
  - Whether routing discovery is active or suspended (e.g. b/c `low-peers` N of peers has been reached)
  - Whether DHT is in the `Server` or `Client` mode
- systests checking NATed connections

Co-authored-by: Ivan Shvedunov <[email protected]>
Pull request successfully merged into develop. Build succeeded: