unfactor() deletes leading zeros and converts column to integer instead of characters

Issue #1 resolved
Former user created an issue

R 3.4.2, varhandle 2.0.2

example:

df <- data.frame(postcode=c(rep("01234", 5), rep("09876", 5)), inhabitant=c(rep(30000, 5), rep(100000, 5)), stringsAsFactors=TRUE)

str(df)

'data.frame': 10 obs. of 2 variables:
 $ postcode  : Factor w/ 2 levels "01234","09876": 1 1 1 1 1 2 2 2 2 2
 $ inhabitant: num 30000 30000 30000 30000 30000 100000 100000 100000 100000 100000

df_unfactored <- unfactor(df)

str(df_unfactored)

'data.frame': 10 obs. of 2 variables:
 $ postcode  : num 1234 1234 1234 1234 1234 ...
 $ inhabitant: num 30000 30000 30000 30000 30000 100000 100000 100000 100000 100000

Leading zeros, as used in German postal codes, are deleted because the column is converted to a numeric type. It should be converted to character instead.
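
For comparison, here is a minimal base R sketch (it does not use varhandle) of the behaviour the report expects. Converting a factor with as.character() keeps the level labels as they are, including leading zeros, while converting those labels to numbers drops them. The variable names postcode_chr and postcode_num are only illustrative.

```r
# Rebuild the example data frame from the report
df <- data.frame(
  postcode   = c(rep("01234", 5), rep("09876", 5)),
  inhabitant = c(rep(30000, 5), rep(100000, 5)),
  stringsAsFactors = TRUE
)

# as.character() returns the level labels, so leading zeros survive
postcode_chr <- as.character(df$postcode)

# Converting the labels to numbers is the step that drops the leading zero
postcode_num <- as.numeric(as.character(df$postcode))

unique(postcode_chr)
unique(postcode_num)
```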

Comments (4)

  1. Mehrad Mahmoudian repo owner

    Although this can easily be circumvented by prepending a character to the postal codes ( df$postcode <- paste0("p", df$postcode) ), you are right that the user should be able to disable this built-in numeric detection when they need to (the detection is provided by the check.numeric function of this package). That said, I personally don't consider this a bug. I'll note it on my Trello board as a feature to add to the unfactor function.

  2. Mehrad Mahmoudian repo owner

    This was resolved in v2.0.4 by adding an argument that lets the user choose whether the number conversion should happen.
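
The idea of an opt-out switch can be sketched in base R as follows. This is not varhandle's implementation: the function unfactor_sketch and its argument convert_numbers are hypothetical names chosen for illustration, and the real argument added in v2.0.4 should be looked up in the package documentation.

```r
# Hypothetical helper, not varhandle's implementation: turns every factor
# column of a data frame into character, and optionally into numeric when
# all of its labels parse as numbers.
unfactor_sketch <- function(df, convert_numbers = TRUE) {
  for (col in names(df)) {
    if (!is.factor(df[[col]])) next
    values <- as.character(df[[col]])
    if (convert_numbers) {
      parsed <- suppressWarnings(as.numeric(values))
      # Convert only if no non-missing label failed to parse
      if (!any(is.na(parsed) & !is.na(values))) values <- parsed
    }
    df[[col]] <- values
  }
  df
}

df <- data.frame(postcode = c("01234", "09876"), stringsAsFactors = TRUE)
str(unfactor_sketch(df, convert_numbers = FALSE))  # postcode stays character
str(unfactor_sketch(df))                           # postcode becomes numeric
```

With the switch turned off, the postal codes keep their leading zeros, so no workaround such as a prefixed letter is needed.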
