Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow ParameterList to be case insensitive #2111

Open
bartgol opened this issue Jan 3, 2018 · 19 comments
Open

Allow ParameterList to be case insensitive #2111

bartgol opened this issue Jan 3, 2018 · 19 comments
Labels
DO_NOT_AUTOCLOSE This issue should be exempt from auto-closing by the GitHub Actions bot. pkg: Teuchos Issues primarily dealing with the Teuchos Package type: enhancement Issue is an enhancement, not a bug

Comments

@bartgol
Copy link
Contributor

bartgol commented Jan 3, 2018

@trilinos/teuchos

Expectations

I was wondering if there is a reason why ParameterList does not support case-insensitive strings (upon request, of course). I sometimes spend some time debugging a crash, and realize I used a lowercase letter instead of an uppercase (or viceversa) in the name of a parameter/sublist in the xml file. If ParameterList was somehow able to ignore the case of the parameters/sublists names, then this would not be a problem.

Note: I'm not talking about the (string-type) parameters values (although, at this point, the class could ignore that too), since those I can always convert to lower/upper case after fetching them from the list. I'm only talking about the parameter name, since I don't want/can't try all the possible typos when I look for a parameter.

Possible Solution

I guess one could set a bool flag at construction time, or even set it afterwards via a setter (in which case, the parameter list is immediately looped through and all the parameters/sublists names are converted to a predefined lower/upper case). Then, in all the getters/setters, before looking for the parameter/sublist, if the bool flag is on, the input string is first converted to a predefined lower/upper case.

@mhoemmen mhoemmen added pkg: Teuchos Issues primarily dealing with the Teuchos Package type: enhancement Issue is an enhancement, not a bug labels Jan 3, 2018
@mhoemmen
Copy link
Contributor

mhoemmen commented Jan 3, 2018

@trilinos/teuchos

@mhoemmen
Copy link
Contributor

mhoemmen commented Jan 3, 2018

@bartgol Would an alternate solution be that solvers actually check the input ParameterList, and throw if they get parameters they don't understand? Solver packages have resisted this approach for various reasons. In fact, there is an outstanding Belos issue (#2040) complaining about this.

It could make sense for ParameterList to have a "case insensitive" option, but it's really up to the solver packages to decide whether they want case-insensitive parameters.

@ibaned
Copy link
Contributor

ibaned commented Jan 3, 2018

Yea, I agree with @mhoemmen .
I don't like the idea of accepting slightly incorrect names, whether it is case or a typo.
I'd rather get a strong error saying that I've provided a parameter name that was not accepted.
I think this is part of the reason why ParameterList keeps track of which parameters were used or not.
We could just fail if any input parameter goes unused.

@mhoemmen
Copy link
Contributor

mhoemmen commented Jan 3, 2018

@ibaned Fun fact: We spent a whole day during the Trilinos developers' spring meeting in 2010 or 2011 trying to agree on a common interface for things that take ParameterList, and couldn't even reach consensus on const PL& vs. PL& vs. RCP<PL> vs. RCP<const PL>. (Also, nobody was willing to put it to a vote and enforce the plurality opinion.) People have similar feelings about the issue you mentioned. I much prefer your approach, if it helps.

@ibaned
Copy link
Contributor

ibaned commented Jan 3, 2018

Yea, I've definitely written code that just silently passed around a const PL&, but for exactly @bartgol 's reason (misspelled names go undetected), I'm inclined to move to using the system that records used versus unused parameters. In fact I vote for const PL& so that users aren't forced to use RCP.
This is also useful for transitions where a not-so-great name is being replaced with a better name (in fact, I've done transitions that replaced lower case names with camel case, just for consistency across the whole input file).

@bartgol
Copy link
Contributor Author

bartgol commented Jan 3, 2018

In my mind, mispellings are one thing, lower vs upper case is another thing. In the former case, we are talking about two different words, while in the latter we are talking about different "fonts" (in a way). I think uppercase (or lowercase, depending on your tastes) is often a burden, when typing. Sure, they double the set of letters you have to name things. But I do not see a (moral) distinction between "timestep" and "TimeStep", while I do see a distinction between "inner" and "inter", for instance. Besides, they require an extra keystroke. :-P

Besides, I find it too restrictive (and perhaps cumbersome) to check that ALL the inputs are valid. Perhaps the user wants to keep the same input file for different executables, and some parameters are needed by one executable, but not by another. So checking that all the entries in the PL are valid is complicated (if possible at all). In such a case (with no checks), if the executable does not find the value "Max iter", it mayuse a default value, and it may miss the parameter "Max Iter".

Is the above example a "bad practice"? Probably. But the language does not prevent you from shooting yourself in the foot, and neither should a library. It should make it "difficult", and protect you from obvious mistakes, but should not put more barriers than strictly needed. In what I proposed, the default behavior should still be case sensitive, and it would take an explicit request by the user to enable case-insensitive behavior. If you decide to use raw pointers when you know there are smart ones available, then you know what you are doing (and you may have valid reasons for doing it).

That said, if there is a shared feeling that case insensitive means unsafe code (even if it is requested by the user), then I'll just close this, and cope with my struggle with caps :-) .

@tjfulle
Copy link
Contributor

tjfulle commented Jan 3, 2018

I echo @bartgol that it would be nice to have (optional) case insensitive parameter lists. In an xml file, something like <ParamterList ignorecase="true" ...>...</ParameterList> would let the user decide (or is that ignoreCase? Or IgnoreCase?)

I don't think it should be an error to contain unused parameters. It is convenient to define several parameter lists for the same target, changing the name to the expected name to turn the ParameterList on/off, i.e.

<ParameterList name="Expected Name"> ... </ParameterList>
<ParameterList name="Expected Name-OFF1"> ... </ParameterList>
<ParameterList name="Expected Name-OFF2"> ... </ParameterList>

The same could be accomplished by commenting out the unused ParameterLists, but that is inconvenient.

@mhoemmen
Copy link
Contributor

mhoemmen commented Jan 3, 2018

@bartgol @tjfulle I agree that this would be a good idea. I think if we were just to implement this and push it, no one would mind or even need to know about it. I'll be happy to review pull requests :-) .

@bartlettroscoe
Copy link
Member

FYI: The primary reason for passing in a non-const PL object is so that the object can insert missing default values. If the user then writes the updated PL out to a file, that provides a nice way to allow them to see what defaults are used and to tweak them on future runs. NOX pioneered this usage of PLs and I think it is a good workflow.

@bartlettroscoe
Copy link
Member

As for making PLs case-insensitive, that would be possible but it is a big change in behavior that will impact several use cases, especially iterators through the PL. And it will make all find and matching operations more expensive. I don't think it is worth the trouble at this point. And we don't want non-validation of PLs. That is not serving our customers well at all.

@bartgol
Copy link
Contributor Author

bartgol commented Jan 3, 2018

@bartlettroscoe If a PL is accessed so many times that performance is an issue, then yes, case insensitive should not be used. But for other cases, the user may want it (I could have used it a few times).

What happens in Trilinos, stays in Trilinos. I don't argue with how PL's are used inside it. Validation or not, I don't care. I'm sure all the choices made in how PL's are used in Trilinos have a very valid reasoning behind it. But I'm not talking about that. If Trilinos doesn't want non-validation PLs, that's fine. However, Teuchos is a very handy library that can be used in disparate applications, where efficiency may not be crucial (at least not as much as, say, in the kernel of a linear solver).

Now, if allowing case-insensitive would deteriorate performance and/or break code also in the case when it is not enabled, that is a different issue, and would answer the question. But if allowing it (only upon explicit request) would not deteriorate performance nor break code, then I don't see why its implementation should be discouraged. That is, assuming one can come up with a "relatively easy" implementation, of course.

To be clear: I wasn't asking "hey guys, could you please add this feature, cause I think I'd like it?". I was just wondering if there was a reason against its implementation. That's all.

@mhoemmen
Copy link
Contributor

mhoemmen commented Jan 4, 2018

@bartgol btw, I would like the feature :-) .

@bartlettroscoe
Copy link
Member

@bartgol, I don't disagree that case-insensitivity would be a useful feature in general. Many input DEC parsers are case-insensitive but this is not the norm in the computing community. Even XML and YAML are case sensitive. About the only major specs that are not case sensitive are CMake function/command names and Fortran 77.

I worry about performance but what I really worry about is added complexity in the implementation. As the person who did the major refactoring of the Teuchos::ParameterList class, I am sensitive to the impact that case sensitivity would have on the code in general.

The most direct way that I could see to implement this would be:

  • Use a validator DECORATOR or other preprocessor that takes and input PL, compares it to "valid PL" generated by the object, and then replaces all of the existing parameters and values (where possible) with the expected case. (That requires a non-const PL).

Given the current design of the software, that might be quite doable. I can see exactly how you would implement such a thing but would be a lot of work and hard to make 100% general. I can give more details if interested.

@github-actions
Copy link

This issue has had no activity for 365 days and is marked for closure. It will be closed after an additional 30 days of inactivity.
If you would like to keep this issue open please add a comment and/or remove the MARKED_FOR_CLOSURE label.
If this issue should be kept open even with no activity beyond the time limits you can add the label DO_NOT_AUTOCLOSE.
If it is ok for this issue to be closed, feel free to go ahead and close it. Please do not add any comments or change any labels or otherwise touch this issue unless your intention is to reset the inactivity counter for an additional year.

@github-actions github-actions bot added the MARKED_FOR_CLOSURE Issue or PR is marked for auto-closure by the GitHub Actions bot. label May 19, 2021
@jhux2 jhux2 added DO_NOT_AUTOCLOSE This issue should be exempt from auto-closing by the GitHub Actions bot. and removed MARKED_FOR_CLOSURE Issue or PR is marked for auto-closure by the GitHub Actions bot. labels May 19, 2021
@bartgol
Copy link
Contributor Author

bartgol commented Dec 2, 2024

@bartlettroscoe I stumbled across this while cleaning up my open issues, and had a thought. What if, we turned class ParameterList into template<typename Key> class ParameterListImpl, where Key is then used in place of std::string throughout the class implementation to retreive parameters and sublists. Then, we could have

using ParameterList = ParameterListImpl<std::string>;
using ParameterListCI = ParameterList<ci_string>;

where ci_string is a simple implementation of a case-insensitive string (I did it once, and it's a moderate size class that inherits from std::string and overloads comparison ops). This would leave business as usual for ParameterList, with zero overhead compared to the current implementation (I think?).

We would actually not even need to add the second alias, leaving it to the user to work out the impl of a CI string, and then its alias.

@bartlettroscoe
Copy link
Member

@bartlettroscoe I stumbled across this while cleaning up my open issues, and had a thought. What if, we turned class ParameterList into template<typename Key> class ParameterListImpl, where Key is then used in place of std::string throughout the class implementation to retreive parameters and sublists.

@bartgol, anything is possible. With ETI for the object code for ParameterListImpl<std::string> and proper header file setup, this would not increase build costs at all for existing customers. I guess if you want to give this a try, it might be workable. (But this could be a lot of effort since you would also need automated tests for all of the needed use cases.)

@bartgol
Copy link
Contributor Author

bartgol commented Dec 3, 2024

You're right. Also, I suppose this would not change the behavior of existing code that uses ParameterList...

I suppose it's time to let this go...

@bartlettroscoe
Copy link
Member

@bartgol, as a workaround for your application, you might just consider storing only lowercase or uppercase keys in your valid parameter list when you form them and then preprocess input parameter lists input files to uppercase or lowercase keys to match the expected case. In fact, we could extend the parameter list readers in Teuchos to give them an option to uppercase or lowercase the key strings as there been read in. That would be much easier to implement and would have the desired effect. That is magnitudes less code to modify and test. Would that be a reasonable middleground for what you’re trying to achieve?

@bartgol
Copy link
Contributor Author

bartgol commented Dec 3, 2024

@bartlettroscoe I like the idea of modifying parsers rather than ParameterList, and I agree that it's much more affordable in terms of implementation.

I really don't want to put this on your todo list though. I feel like this remains more of a "wishlist" kind of feature, and not really something we need. I can live either way.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
DO_NOT_AUTOCLOSE This issue should be exempt from auto-closing by the GitHub Actions bot. pkg: Teuchos Issues primarily dealing with the Teuchos Package type: enhancement Issue is an enhancement, not a bug
Projects
None yet
Development

No branches or pull requests

6 participants