Identifies aliased (linearly dependent) variables in a data frame by fitting a linear model to the data and passing the fitted model to stats::alias().
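The approach described above can be sketched in a few lines of base R. This is an illustration only, not the package's actual implementation: the helper name sketch_detect_alias and the constant dummy response .response are assumptions made for the example, since the source does not say which response the function fits against. Aliasing depends only on the predictors, so any response serves for the sketch.

```r
# Hypothetical helper illustrating the idea; NOT the package's implementation
sketch_detect_alias <- function(data) {
  # Assumption: a constant dummy response is enough, because aliasing
  # is a property of the predictor columns only
  data$.response <- 1
  fit <- stats::lm(.response ~ ., data = data)

  # stats::alias()$Complete has one row per aliased term (NULL if none)
  aliased <- rownames(stats::alias(fit)$Complete)

  if (length(aliased) == 0) invisible(NULL) else aliased
}

x1 <- c(1, 2, 3, 4, 5, 6)
x2 <- 2 * x1                 # exact linear function of x1
x3 <- c(2, 7, 1, 8, 2, 8)
sketch_detect_alias(data.frame(x1, x2, x3))
```

Because lm() pivots linearly dependent columns to the end of the QR decomposition, the later of two dependent columns (here x2) is the one reported as aliased.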

Usage

detect_alias(data, verbose = FALSE)

Arguments

data

A data frame or tibble containing the variables to be checked for aliasing.

verbose

Logical. If TRUE, the aliased variables found (if any) are printed to the console. Defaults to FALSE.

Value

A character vector of aliased variable names if any are found; otherwise NULL, returned invisibly. If verbose is TRUE, the aliased variables are also printed to the console.

Author

Ahmed El-Gabbas

Examples

load_packages(car)

x1 <- rnorm(100)
x2 <- 2 * x1
x3 <- rnorm(100)
y <- rnorm(100)

model <- lm(y ~ x1 + x2 + x3)
summary(model)
#> 
#> Call:
#> lm(formula = y ~ x1 + x2 + x3)
#> 
#> Residuals:
#>     Min      1Q  Median      3Q     Max 
#> -3.1728 -0.6343 -0.1377  0.8417  2.3770 
#> 
#> Coefficients: (1 not defined because of singularities)
#>             Estimate Std. Error t value Pr(>|t|)
#> (Intercept)  0.12272    0.10481   1.171    0.245
#> x1           0.17776    0.11061   1.607    0.111
#> x2                NA         NA      NA       NA
#> x3           0.03323    0.10549   0.315    0.753
#> 
#> Residual standard error: 1.029 on 97 degrees of freedom
#> Multiple R-squared:  0.02599,	Adjusted R-squared:  0.00591 
#> F-statistic: 1.294 on 2 and 97 DF,  p-value: 0.2788
#> 

# car::vif() fails because the model contains aliased coefficients
try(car::vif(model))
#> Error in vif.default(model) : there are aliased coefficients in the model

# The function identifies the aliased variables
detect_alias(data = cbind.data.frame(x1, x2, x3))
#> [1] "x2"

detect_alias(data = cbind.data.frame(x1, x2, x3), verbose = TRUE)
#> aliased variables: x2
#> [1] "x2"

# exclude x2 and refit the model
model <- lm(y ~ x1 + x3)

summary(model)
#> 
#> Call:
#> lm(formula = y ~ x1 + x3)
#> 
#> Residuals:
#>     Min      1Q  Median      3Q     Max 
#> -3.1728 -0.6343 -0.1377  0.8417  2.3770 
#> 
#> Coefficients:
#>             Estimate Std. Error t value Pr(>|t|)
#> (Intercept)  0.12272    0.10481   1.171    0.245
#> x1           0.17776    0.11061   1.607    0.111
#> x3           0.03323    0.10549   0.315    0.753
#> 
#> Residual standard error: 1.029 on 97 degrees of freedom
#> Multiple R-squared:  0.02599,	Adjusted R-squared:  0.00591 
#> F-statistic: 1.294 on 2 and 97 DF,  p-value: 0.2788
#> 

try(car::vif(model))
#>       x1       x3 
#> 1.022448 1.022448
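
The refit above drops x2 by hand. With many predictors, the returned names could drive the same step programmatically. The sketch below is one possible way to do that and is not part of the package; the data frame df is made up for illustration, and the vector aliased is a hard-coded stand-in for the value that detect_alias() would return.

```r
set.seed(1)
df <- data.frame(x1 = rnorm(20), x3 = rnorm(20))
df$x2 <- 2 * df$x1
df$y <- rnorm(20)

predictors <- c("x1", "x2", "x3")

# Stand-in (assumption) for: aliased <- detect_alias(df[, predictors])
aliased <- "x2"

# Keep only the predictors that were not flagged as aliased
keep <- setdiff(predictors, aliased)

# Build the formula y ~ x1 + x3 from the remaining names and refit
model <- lm(reformulate(keep, response = "y"), data = df)
names(coef(model))
```

setdiff() leaves the predictor list unchanged when the stand-in is NULL, so the same code also covers the case where no aliased variables are found.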