Feature request: ensure parameter names are valid R identifiers

Issue #8 resolved
Former user created an issue

First, thank you for your useful package. I have a suggestion concerning the naming of the parsed arguments.

In the Unix world, it is a general convention to use the prefixes '-' and '--' for short and long option names, respectively. If a long option name consists of multiple words, these are separated by another '-', for example "--end-time". Using your package, I would write:

parser <- arg_parser( 'Example' )
parser <- add_argument( 
    parser, arg = '--end-time', type = 'double', default = 1e20, 
    help = "Time at which simulation ends"
)
args <- parse_args( parser )

Now, the value of the option is stored in the list 'args' under the name 'end-time'. To access this value, you cannot write

args$end-time    # interpreted as args$end - time

but instead have to write the clumsy

val <- args[['end-time']] 

because words containing '-' are not legal R identifiers. However, it would be much more convenient to write

args$end_time

or something similar. Would it be possible to add an automatic substitution for illegal characters? This could also be made optional. Thanks for your help!
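
A minimal sketch of what such a substitution could look like, using only base R. The helper name `sanitize_names` is hypothetical and not part of the package, and the sketch assumes the parsed values arrive as a named list keyed by the original option names:

```r
# Hypothetical helper (not part of the package): replace hyphens in the
# names of a parsed-argument list with underscores.
sanitize_names <- function(args) {
    names(args) <- gsub("-", "_", names(args), fixed = TRUE)
    args
}

# Stand-in for the list returned by the parser (assumed shape).
args <- list(`end-time` = 1e20, verbose = TRUE)
args <- sanitize_names(args)

args$end_time    # now accessible with the `$` operator
```

Without any substitution, backtick quoting (args$`end-time`) is another way to reach the value in base R.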

Comments (4)

  1. David Shih repo owner

    I've been wanting to do that. It turned out it wasn't a simple fix.

    I tried again. I made several lines of changes, and hopefully I didn't break anything. I found no bugs after light testing, but more extensive testing will be needed (later).

    On the plus side, I found an unrelated bug, and fixed it with the latest commit (it's bad practice, but I am sleepy...)

    Things should work as requested in the latest commit.

    Pull the latest commit, run roxygen2 and build as detailed in the README. (I don't commit auto-generated files to git because they add bulk.)

    Kindly set this issue to resolved if my latest commit solves it.

  2. Felix Kuehnl

    Thanks for your ultra-fast response ;-) I tried the newest version from your repository and after some short testing it seems to work fine. When adding arguments containing hyphens, these are substituted by underscores. Two things I noticed:

    1) You only seem to substitute hyphens ('-') for now. Since you know exactly which characters are allowed in a valid R identifier (letters, digits, dots, and underscores, with no digit or underscore at the beginning), you may want to substitute more aggressively, namely any character that violates the rules. Or is this a bad idea?

    2) When throwing an error message about parameters added twice, you should probably print out the substituted form of the parameter name so the user knows that a substitution is going on. For example,

    parser <- add_argument( 
        parser, arg = '--n_test1', 
        help = "Testing it till it breaks..."    
    )
    parser <- add_argument( 
        parser, arg = '--n-test1', 
        help = "Testing it till it breaks..."    
    )
    

    will lead to an error that says --n-test1 has already been defined. Without knowing about this substitution, the user might be confused.
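
One way the clearer error could be produced is sketched below. Both function names are hypothetical and this is not the package's code; it only assumes that leading dashes are stripped and remaining hyphens become underscores before names are compared:

```r
# Hypothetical sketch (not the package's code): normalise an option name,
# then report both the original and the substituted form on a collision.
normalize_name <- function(arg) {
    stripped <- sub("^-+", "", arg)          # drop the leading '-' or '--'
    gsub("-", "_", stripped, fixed = TRUE)   # remaining hyphens become '_'
}

register_name <- function(known, arg) {
    name <- normalize_name(arg)
    if (name %in% known) {
        stop(sprintf("argument '%s' is stored as '%s', which is already defined",
                     arg, name))
    }
    c(known, name)
}

known <- register_name(character(0), "--n_test1")

# The second registration collides; capture the message instead of aborting.
msg <- tryCatch(register_name(known, "--n-test1"),
                error = function(e) conditionMessage(e))
msg
```

Naming both forms in the message tells the user which earlier argument the new one clashes with.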

    PS: I don't think I can set this issue to resolved since I did not log in when I posted it... sorry!

  3. David Shih repo owner

    Regarding (1), I am not a fan of automatic substitutions, especially R's handling of names with a numeric prefix. I only added the substitution of - to _ because of how commonly "-" appears in argument names. I don't see much utility in other substitutions, and they would create more backwards incompatibility. People who want to use eccentric argument names will just have to live with a more verbose argument indexing syntax.
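
For reference, base R's own sanitiser, `make.names()`, illustrates the kind of rewriting that aggressive substitution implies. This only demonstrates base R behaviour and is not something the package does; the example names are made up:

```r
# Illustration of base R behaviour only; the example names are made up.
raw <- c("end-time", "1st", "ok_name")

# make.names() replaces invalid characters with "." and prepends "X"
# to names that would otherwise start with a digit.
clean <- make.names(raw)
clean            # "end.time" "X1st" "ok_name"

# A name is already syntactically valid when make.names() leaves it alone.
is_valid <- clean == raw
is_valid         # FALSE FALSE TRUE
```

The "X" prefix is an example of a rewrite that users may not expect from their argument names.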

    Even with the - to _ substitution, people who were using the args[['arg-name']] syntax will have their code broken! I documented this incompatibility issue in the README. It is not desirable to emit both arg-name and arg_name, since it adds bloat and complexity.

    Regarding (2), I am now providing a more explicit error message.
