Variable conflict at stat_fit_tidy

Issue #9 resolved
John Ma created an issue

When the result of broom::tidy is reshaped into a single row by fit_tidy_compute_group_fun, the column names for the values in the estimate field are not given a suffix, unlike those of the other fields. This is implemented in lines 432-435 of the CRAN version of stat-fit-broom.R:

z.estimate <- as.data.frame(t(mf.td[["estimate"]]))
z.std.error <- as.data.frame(t(mf.td[["std.error"]]))
clean.term.names <- gsub("(Intercept)", "Intercept", mf.td[["term"]], fixed = TRUE)
names(z.estimate) <- clean.term.names
names(z.std.error) <- paste(clean.term.names, "se", sep = "_")

Thus, if stat_fit_tidy is run on a model in which one or more terms share a name with a predefined aesthetic (such as x or y), the estimate for that term cannot be retrieved with the ..field.. notation in aes(), because the name is pre-empted by the inherited mapping. For example, in the following code:

library(ggplot2)
library(ggpmisc)
set.seed(4321)
# generate artificial data
x <- 1:100
y <- (x + x^2 + x^3) + rnorm(length(x), mean = 0, sd = mean(x^3) / 4)
my.data <- data.frame(x, 
                      y, 
                      group = c("A", "B"), 
                      y2 = y * c(0.5,2),
                      block = c("a", "a", "b", "b"))
ggplot(aes(x=x, y=y), data=my.data) + 
    geom_smooth(method="lm") +
    stat_fit_tidy(method="lm", method.args=list(formula=y~x), geom="text", mapping=aes(label=paste0("x estimate=", signif(..x.., 3))))

I would expect the number in the label to equal the estimate for the x term in the equivalent lm fit, i.e. 9250:

> summary(lm(y~x, data=my.data))

Call:
lm(formula = y ~ x, data = my.data)

Residuals:
    Min      1Q  Median      3Q     Max 
-212613  -96757  -24449   84044  353130 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)    
(Intercept) -200723.9    25866.0   -7.76 8.21e-12 ***
x              9250.6      444.7   20.80  < 2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 128400 on 98 degrees of freedom
Multiple R-squared:  0.8154,    Adjusted R-squared:  0.8135 
F-statistic: 432.8 on 1 and 98 DF,  p-value: < 2.2e-16

What I get, however, is 1, the value of my.data[1, "x"].

My proposed fix is to also add a suffix to the names of the term estimates, so that they cannot conflict with a known aes variable.
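
A minimal sketch of what that could look like, using base R only. The "_estimate" suffix is my assumption, not something the package defines, and mf.td here is a hand-built stand-in for the tidy() output, filled with the coefficients from the lm summary above:

```r
# Hypothetical stand-in for the data frame returned by broom::tidy():
# one row per model term, values copied from the lm summary above.
mf.td <- data.frame(
  term = c("(Intercept)", "x"),
  estimate = c(-200723.9, 9250.6),
  std.error = c(25866.0, 444.7)
)

# Reshape each field into a single-row data frame, as the package code does.
z.estimate <- as.data.frame(t(mf.td[["estimate"]]))
z.std.error <- as.data.frame(t(mf.td[["std.error"]]))
clean.term.names <- gsub("(Intercept)", "Intercept", mf.td[["term"]], fixed = TRUE)

# Assumed change: suffix the estimate columns too, so that a term called
# "x" becomes "x_estimate" and no longer shares a name with an aesthetic.
names(z.estimate) <- paste(clean.term.names, "estimate", sep = "_")
names(z.std.error) <- paste(clean.term.names, "se", sep = "_")

z <- cbind(z.estimate, z.std.error)
names(z)
```

With names like these, the label in the example above would presumably be mapped as ..x_estimate.. instead of ..x.., though that depends on the suffix the package ends up using.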
