Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP+RFC: How to treat quoting and escaping on the command line? #1109

Open
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

lucc
Copy link
Collaborator

@lucc lucc commented Aug 9, 2017

I was trying to execute :call "import alot;ui.notify(alot.__version__)" and it did not work because the semicolon was used to split up the command line. This lead to the observation that our command line parsing code needs some hardening regarding quoting and escaping of special chars (actually only ";"). (Note that I have since solved my original problem, see newest entries in the wiki.)

The tests I added show what I believe the command line parsing should do. Do you agree or differ? Please also post more base = '...'; expected = ['...'] examples that should also be put in the tests. I will gradually update the commits. Once we have finished the initial discussion what the parser should do I will turn this into a proper PR (you probably should wait until then when you want to do full reviews).

The relevant part of the shlex docs is here. Some recent issues and PRs that also dealt with the command line parsing are #579, #985, #986 although none of them seems directly related.

The current solution for the first test is not complete. It doesn't work with single quotes and backslashes in quotes and nested quotes are not handled. In general this path (the current solution of doing some special str.replace calls before calling shlex.shlex) seems cumbersome in the long run and difficult to get right for all cases. I had an idea for another way of solving the issue in general: I could try to not set shlex.shlex.whitespace and shlex.shlex.whitespace_split to anything but the default. This will split at normal whitespace and we can then later on "split" the list of strings at entries that are just one semicolon. The lists for each individual command could then be joined with whitespace again. This would potentially change the whitespace so the question for @pazz is: Is this function used to process any strings where the amount of whitespace should not be changed (subject lines, email addresses, ...)?

Copy link
Collaborator

@dcbaker dcbaker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have one nit on the series as is, but otherwise I think this looks good.


@unittest.expectedFailure
def test_singlequoted_semicolons(self):
base = "python -c 'import alot; print(alot)'; echo 'done'"

This comment was marked as resolved.

This comment was marked as resolved.

This comment was marked as resolved.

@lucc lucc added the bug label Aug 10, 2017
@lucc lucc changed the title RFC: How to treat quoting and escaping on the command line? WIP+RFC: How to treat quoting and escaping on the command line? Aug 10, 2017
@pazz
Copy link
Owner

pazz commented Jun 25, 2018

Sorry to keep this hanging. @lucc I guess this is still relevant is it? I generally don't mind the changes and adding tests is certainly valuable. Could you rebase?

Is this function used to process any strings where the amount of whitespace should not be changed (subject lines, email addresses, ...)?

Well yes: it is used by the ui at the point where commandline input is interpreted. So this means in particular if you use set Subject foo bar in envelope mode you probably want to preserve whitespaces.

lucc added 4 commits June 28, 2018 11:22
Currently the command line can not contain any quoted semicolons.  The
python shlex module that is used for splitting the command string does
two things with quotes:

1. it interprets them in the sense that everything between quotes is not
   split into tokens
2. it removes them from the generated tokens

To prevent the second we do escape quotes before splitting the command
string with shlex but that implicitly prevents the first point as well.
No caller did use these function arguments so they are not really
needed.
@lucc lucc force-pushed the fix/split_commandline branch from d400ce1 to e60fcaa Compare June 28, 2018 09:34
@lucc
Copy link
Collaborator Author

lucc commented Jun 28, 2018

I rebased so we can better discuss this. There was some subtile issue with these changes but I have to think about this code thoroughly again to remember it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants