Skip to content

Conversation

@vinny-pereira
Copy link
Contributor

@vinny-pereira vinny-pereira commented Jan 3, 2025

Issue #294

Problem

Our current peer connection implementation has a potential issue with connection cleanup when peers disconnect. When we split the TcpStream and run the read half in a detached task, we rely entirely on the peer's proper shutdown behavior to terminate the connection. If the peer doesn't close their connection cleanly (which appears to be happening, causing fin-wait2 states on Linux), the read task can remain active indefinitely, potentially leading to resource leaks like file descriptor exhaustion.

Solution

This PR implements a cancellation pattern for more reliable connection cleanup:

  1. Added a oneshot channel to coordinate shutdown between the peer and its read task
  2. When a peer initiates shutdown, it signals the read task through this channel
  3. The read task now uses tokio::select! to respond to either:
    • Normal completion of the read loop
    • Cancellation signal from the peer

Implementation Details

  • Uses tokio::sync::oneshot channel instead of CancellationToken from tokio-utils to minimize dependencies
  • Follows Tokio's recommended shutdown patterns (see Tokio shutdown documentation)
  • Adds graceful error handling for shutdown signal propagation

Impact

This change should prevent connection resource leaks by ensuring read tasks are properly cleaned up, even when remote peers don't follow correct TCP shutdown procedures.

@Davidson-Souza
Copy link
Member

Could you run cargo +nightly clippy --all --fix && cargo +nightly fmt --all to fix the linting error?

@vinny-pereira vinny-pereira force-pushed the timeout_peer_read_loop branch 2 times, most recently from 64092aa to 59bce5e Compare January 3, 2025 18:24
@vinny-pereira vinny-pereira force-pushed the timeout_peer_read_loop branch from 59bce5e to 3c87292 Compare January 3, 2025 18:36
@vinny-pereira
Copy link
Contributor Author

I apologize! I ran the justfile, but missed the linting errors.

Linting errors are fixed now!

Copy link
Member

@Davidson-Souza Davidson-Souza left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tested it over the weekend and seems to fix the problem

ACK 3c87292

@Davidson-Souza Davidson-Souza merged commit 63dcb9a into vinteumorg:master Jan 6, 2025
6 checks passed
Davidson-Souza pushed a commit to Davidson-Souza/Floresta that referenced this pull request Jan 29, 2025
@Davidson-Souza Davidson-Souza mentioned this pull request Jan 29, 2025
15 tasks
Davidson-Souza added a commit that referenced this pull request Jan 29, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants