Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Itertools::join #526

Open
pickfire opened this issue Feb 7, 2021 · 8 comments
Open

Itertools::join #526

pickfire opened this issue Feb 7, 2021 · 8 comments

Comments

@pickfire
Copy link

pickfire commented Feb 7, 2021

Our current join only outputs String, maybe we should consider using the nightly Join in standard library? I created a pull request using that in standard library but I noticed requires allocation so it is less likely to be included in standard std library.

rust-lang/rust#75738

Maybe it can be considered here after it is stable on standard library?

@SkiFire13
Copy link
Contributor

SkiFire13 commented Feb 9, 2021

How about something like:

trait JoinableIterator<S>: Iterator {
    fn join<O>(self, sep: S) -> O
    where 
        Self: Sized,
        O: Join<Self::Item, S>
    {
        O::join(self, sep)
    }
}
impl<I: Iterator, S> JoinableIterator<S> for I {}

trait Join<T, S> {
    fn join<I: IntoIterator<Item = T>>(iter: I, sep: S) -> Self;
}

impl<T: std::fmt::Display, S: std::fmt::Display> Join<T, S> for String {
    fn join<I: IntoIterator<Item = T>>(iter: I, sep: S) -> Self {
        let mut iter = iter.into_iter();
        match iter.next() {
            Some(item) => {
                use std::fmt::Write;
                let mut s = item.to_string();
                iter.for_each(|item| {
                    let _ = write!(s, "{}", sep);
                    let _ = write!(s, "{}", item);
                });
                s
            }
            None => String::new(),
        }
    }
}
// Possibly other implementations

The S generic parameter on JoinableIterator is for better turbofish ergonomics.

@pickfire
Copy link
Author

pickfire commented Feb 9, 2021

I thought about something like that but I haven't been able to figure it out. But maybe not Display, potential footgun if users accidentally put something that is display but they don't want that behavior, it should be all &str or String. But yeah, that would cause a breaking change in itertools, but IMHO I wouldn't want the Display, it is doing write! (which possibly is hard to optimize) as well as there is nothing that guards users from doing side effect when impl ToString which impl Display uses, so limiting it to the original type is better I think.

@SkiFire13
Copy link
Contributor

I based myself on the current implementation which requires Self::Item: Display, so I kept that bound. I think the original idea was that a join with Strings/&strs only would require you to allocate a new String for each iterator element and that would be slower, but I guess we can change that if it's considered a bad API.

As for the .to_string(), it uses Display under the hood and there's a blanket implementation for T: Display so I thought there can't be an user provided implementation that does side effects, however I forgot about specialization.

@pickfire
Copy link
Author

pickfire commented Feb 10, 2021

I based myself on the current implementation which requires Self::Item: Display, so I kept that bound. I think the original idea was that a join with Strings/&strs only would require you to allocate a new String for each iterator element and that would be slower, but I guess we can change that if it's considered a bad API.

User could still use &str if they want it to be more efficient so they don't want to allocate, we probably also does not want to force String, maybe Borrow<str> or AsRef<str>?

As for the .to_string(), it uses Display under the hood and there's a blanket implementation for T: Display so I thought there can't be an user provided implementation that does side effects, however I forgot about specialization.

No, user can override that. Still, that is the reason why I think it is weird and didn't end up using this join from itertools.

Also, if we use Display, what would be for join for &[u8]?

@SkiFire13
Copy link
Contributor

User could still use &str if they want it to be more efficient so they don't want to allocate, we probably also does not want to force String, maybe Borrow or AsRef?

You still need to get that &str. What if I want to join an iterator of numbers into a string? e.g. I want their human representation printed.

No, user can override that. Still, that is the reason why I think it is weird and didn't end up using this join from itertools.

On stable? I don't think so. The following code doesn't compile if you don't enable specialization (however with specialization it does)

struct Foo;

impl std::fmt::Display for Foo {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        todo!()
    }
}

impl std::string::ToString for Foo {
    fn to_string(&self) -> String {
        todo!()
    }
}

Also, if we use Display, what would be for join for &[u8]?

Display is a bound only on the implementation that produces String, it doesn't prevent you from writing other implementations.

@pickfire
Copy link
Author

pickfire commented Feb 10, 2021

You still need to get that &str. What if I want to join an iterator of numbers into a string? e.g. I want their human representation printed.

Yes, they have to convert between different types.

On the other hand, what if there is a secret type than the users does not want to implement Display but only ToString? Does that mean this is not usable? Or they have to do some conversion?

I do think Display is more efficient (since it could do less allocation) but I don't think Display is good, what if it's Vec or any other types that users would not want Display? It would be preferable if they do the conversion themselves or it may be surprising that suddenly a random string appear within the join, at least it is less magic, user (or library authors) can put magic in Display but nothing can go wrong if it's already String or &str, then it may be surprising to see that part being included in join.

Or maybe we should ask in standard library why didn't they do slice::join with Display?

Say, if users have Vec<User> do we want them to implement Display or would we want them to convert it themselves to String? If we allow Display they may have a tendency to implement that rather that doing the conversion themselves, and later they might find out that they need a different Display, once they changed the Display and it may break the join, so it is better to not merge Display for this purpose.

@SkiFire13
Copy link
Contributor

Say, if users have Vec do we want them to implement Display or would we want them to convert it themselves to String? If we allow Display they may have a tendency to implement that rather that doing the conversion themselves, and later they might find out that they need a different Display, once they changed the Display and it may break the join, so it is better to not merge Display for this purpose.

This is a pretty good point. Maybe alternatively we could have a join_with method with its own backing trait that allows the user to specify how the item should be appended using a closure. This way we can make it more general for any type, not just Strings.

@pickfire
Copy link
Author

This is a pretty good point. Maybe alternatively we could have a join_with method with its own backing trait that allows the user to specify how the item should be appended using a closure. This way we can make it more general for any type, not just Strings.

Good idea, or maybe we can have a function but I wonder if we could let users specify their own trait bounds? Do we give the users power to do write! or we only handle that and let the users provide AsRef<&str>?

By the way, happy chinese new year!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants