Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ensure non-empty buffers for large vectored I/O #138879

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

thaliaarchi
Copy link
Contributor

readv and writev are constrained by a platform-specific upper bound on the number of buffers which can be passed. Currently, read_vectored and write_vectored implementations simply truncate to this limit when larger. However, when the only non-empty buffers are at indices above this limit, they will erroneously return Ok(0).

Instead, slice the buffers starting at the first non-empty buffer. This trades a conditional move for a branch, so it's barely a penalty in the common case.

The new method limit_slices on IoSlice and IoSliceMut may be generally useful to users like advance_slices is, but I have left it as pub(crate) for now.

@rustbot
Copy link
Collaborator

rustbot commented Mar 24, 2025

r? @jhpratt

rustbot has assigned @jhpratt.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@rustbot rustbot added O-hermit Operating System: Hermit O-unix Operating system: Unix-like S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Mar 24, 2025
@thaliaarchi
Copy link
Contributor Author

thaliaarchi commented Mar 24, 2025

Since this involves what we guarantee for these methods, I want to make sure T-libs-api sees it too:

r? libs-api

The original question that led to this patch was whether these APIs are guaranteed to return Ok(0) iff the buffers are all empty or EOF is reached. As it stands, the implementation breaks that guarantee. However, it is stated on read and write, but not the vectored variants. Users do not have access to max_iov(), so they cannot do this logic themselves and it would make no sense to push the responsibility to them.

@rustbot rustbot added the T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. label Mar 24, 2025
@rustbot rustbot assigned BurntSushi and unassigned jhpratt Mar 24, 2025
@thaliaarchi thaliaarchi force-pushed the vectored-large-empty branch 2 times, most recently from d4bc926 to 3321ce6 Compare March 29, 2025 02:48
@thaliaarchi
Copy link
Contributor Author

I've added a test for this case. The second assertion fails before this patch.

@thaliaarchi thaliaarchi force-pushed the vectored-large-empty branch from 3321ce6 to 1e21e57 Compare March 29, 2025 02:50
@bors
Copy link
Collaborator

bors commented Mar 29, 2025

☔ The latest upstream changes (presumably #139101) made this pull request unmergeable. Please resolve the merge conflicts.

@thaliaarchi thaliaarchi force-pushed the vectored-large-empty branch from 1e21e57 to 82ebd46 Compare March 29, 2025 19:53
@GrayJack
Copy link
Contributor

GrayJack commented Mar 30, 2025

This test is falling for me on macOS on the first assertion, after a few hours of investigation, I came to the conclusion that this is happening because the new method (limit_slices) will return a empty slice in the case of all IoSlice are also empty. That means that writev is invocked with iovcount of zero, which, if following POSIX, is disallowed and should return -1 with errno = EINVAL. That is the exact behavior I'm getting.

If I'm correct, and if returning ok-zero if &[IoSlice] is empty is the intended behavior for Rust, Rust then should maybe special case this? Maybe not calling the syscall and returning Ok(0) directly.

Another option I though of is maybe letting one empty IoSlice in the bufs if all IoSlices are empty, calling writev with iovcnt of 1 that is an empty iovec.

POSIX reference I read: https://pubs.opengroup.org/onlinepubs/9799919799/functions/writev.html

EDIT: Looks like readv also is specified to same behavior if iovcnt is zero

@thaliaarchi
Copy link
Contributor Author

Thanks for such a helpful diagnosis! Turns out I also encounter that same error locally for the first assertion, also on macOS. With this patch, it fixes it. I decided to slice it to one empty buffer and still invoke writev, because there aren't other cases currently that skip the writev and this is a very rare case.

Also, what you bring up illustrates a larger, preexisting issue. Should we allow zero-length buffer slices or expose the platform's behavior? I'd argue, given that Rust smooths over the case of too-large buffer slices, that this should be similarly handled. We could even handle both cases together by branching on if bufs.len().wrapping_sub(1) > IOV_MAX - 1.

@thaliaarchi thaliaarchi force-pushed the vectored-large-empty branch from 82ebd46 to dc3625d Compare March 30, 2025 01:43
@GrayJack
Copy link
Contributor

I'd argue that Rust should smooth things over whenever possible if it directly affects an public API that is meant to work in a platform independent way, like File is. I expect platform-dependent behavior overall and platform-dependent things on specialized modules (like std::os::* and std::arch::* modules)

@thaliaarchi thaliaarchi force-pushed the vectored-large-empty branch from dc3625d to de0672f Compare March 30, 2025 04:35
`readv` and `writev` are constrained by a platform-specific upper bound
on the number of buffers which can be passed. Currently, `read_vectored`
and `write_vectored` implementations simply truncate to this limit when
larger. However, when the only non-empty buffers are at indices above
this limit, they will erroneously return `Ok(0)`.

Instead, slice the buffers starting at the first non-empty buffer. This
trades a conditional move for a branch, so it's barely a penalty in the
common case.

The new method `limit_slices` on `IoSlice` and `IoSliceMut` may be
generally useful to users like `advance_slices` is, but I have left it
as `pub(crate)` for now.
POSIX requires at least one buffer passed to readv and writev, but we
allow the user to pass an empty slice of buffers. In this case, return a
zero-length read or write.
@thaliaarchi thaliaarchi force-pushed the vectored-large-empty branch from de0672f to 9d3dc92 Compare March 30, 2025 04:51
@thaliaarchi
Copy link
Contributor Author

I implemented the empty slice case, but I'm not satisfied with the approach, so it's a separate commit for now.

I needed to implement it as a macro, because I couldn't figure out how to get this sort of pattern to compile. When given an empty slice, it needs to conjure up a &[IoSlice<'_>] or &mut [IoSliceMut<'_>] which contains a single empty buffer. Since it cannot be modified, I'd think it would be fine to reference a static (though Miri might complain about multiple mutable references to the same location). To circumvent that, I instead have the outer function return Ok(0) in this case (hence the need for a macro).

use std::io::{IoSlice, IoSliceMut};

fn replace<'a>(bufs: &mut &[IoSlice<'a>], bufs_mut: &mut &mut [IoSliceMut<'a>]) {
    *bufs = &[IoSlice::new(&[])];
    *bufs_mut = &mut [IoSliceMut::new(&mut [])];
}
error[E0716]: temporary value dropped while borrowed
 --> src/main.rs:5:14
  |
4 | fn replace<'a>(bufs: &mut &[IoSlice<'a>], bufs_mut: &mut &mut [IoSliceMut<'a>]) {
  |                           - let's call the lifetime of this reference `'1`
5 |     *bufs = &[IoSlice::new(&[])];
  |     ---------^^^^^^^^^^^^^^^^^^^- temporary value is freed at the end of this statement
  |     |        |
  |     |        creates a temporary value which is freed while still in use
  |     assignment requires that borrow lasts for `'1`

error[E0716]: temporary value dropped while borrowed
 --> src/main.rs:6:22
  |
4 | fn replace<'a>(bufs: &mut &[IoSlice<'a>], bufs_mut: &mut &mut [IoSliceMut<'a>]) {
  |                                                          - let's call the lifetime of this reference `'2`
5 |     *bufs = &[IoSlice::new(&[])];
6 |     *bufs_mut = &mut [IoSliceMut::new(&mut [])];
  |     -----------------^^^^^^^^^^^^^^^^^^^^^^^^^^- temporary value is freed at the end of this statement
  |     |                |
  |     |                creates a temporary value which is freed while still in use
  |     assignment requires that borrow lasts for `'2`

For more information about this error, try `rustc --explain E0716`.

@thaliaarchi
Copy link
Contributor Author

I suspect not all platforms have the restriction that !bufs.is_empty() like POSIX. Perhaps Windows or SOLID don't care. In that case, the limit_slices logic should be moved to each platform and made more specific to their requirements.

@ChrisDenton
Copy link
Member

My current feeling is that this is unnecessary for Windows specifically. Having a slice of 4,294,967,295 empty buffers followed by one or more non-empty buffers isn't a particularly realistic scenario and is almost certainly a user error. We could document the limit but I'm not so sure it's worth having special handling for it.

@thaliaarchi
Copy link
Contributor Author

My current feeling is that this is unnecessary for Windows specifically.

That certainly seems reasonable, as it would require 64 GiB to create a slice of that many buffers. I think the only useful case would be if Windows forbids a 0-length slice of buffers. I'll have to look through Windows docs for that.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
O-hermit Operating System: Hermit O-unix Operating system: Unix-like S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-libs Relevant to the library team, which will review and decide on the PR/issue. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants