stargazer
stargazer is one of the most commonly used packages for
creating beautiful tables in R. It allows exporting tables to LaTeX,
HTML, or plain text, which can be easily copied to Word.
#if (FALSE) install.packages("stargazer")
# Load necessary libraries
library(stargazer)
library(tidyverse)
You can summarize the key variables used in the regression models to give a sense of the sample distribution. This step is important for providing context before presenting regression results.
This will output a summary table that includes the mean, standard deviation (sd), minimum (min), maximum (max), and number of observations (n) for each of the variables used in the regression models.
# Example data
data(mtcars)
# Summary table
stargazer(mtcars, type = "text") # You can change 'text' to 'html' or 'latex' to suit your output needs
##
## ============================================
## Statistic N Mean St. Dev. Min Max
## --------------------------------------------
## mpg 32 20.091 6.027 10.400 33.900
## cyl 32 6.188 1.786 4 8
## disp 32 230.722 123.939 71.100 472.000
## hp 32 146.688 68.563 52 335
## drat 32 3.597 0.535 2.760 4.930
## wt 32 3.217 0.978 1.513 5.424
## qsec 32 17.849 1.787 14.500 22.900
## vs 32 0.438 0.504 0 1
## am 32 0.406 0.499 0 1
## gear 32 3.688 0.738 3 5
## carb 32 2.812 1.615 1 8
## --------------------------------------------
# Summary table with selected variables
stargazer(mtcars[,c("cyl","disp","hp")], type = "text") # You can change 'text' to 'html' or 'latex' to suit your output needs
##
## ============================================
## Statistic N Mean St. Dev. Min Max
## --------------------------------------------
## cyl 32 6.188 1.786 4 8
## disp 32 230.722 123.939 71.100 472.000
## hp 32 146.688 68.563 52 335
## --------------------------------------------
# Same table, selecting the variables in a dplyr pipeline
mtcars %>%
  select(cyl, disp, hp) %>%
  stargazer(type = "text")
##
## ============================================
## Statistic N Mean St. Dev. Min Max
## --------------------------------------------
## cyl 32 6.188 1.786 4 8
## disp 32 230.722 123.939 71.100 472.000
## hp 32 146.688 68.563 52 335
## --------------------------------------------
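As a cross-check, the five statistics in the summary tables above can be computed directly in base R. This is a minimal sketch and not a description of how stargazer works internally; the helper name `describe_column` is hypothetical.

```r
# Hypothetical helper: the five statistics shown in the stargazer summary table
describe_column <- function(x) {
  c(n    = sum(!is.na(x)),
    mean = mean(x, na.rm = TRUE),
    sd   = sd(x, na.rm = TRUE),
    min  = min(x, na.rm = TRUE),
    max  = max(x, na.rm = TRUE))
}

# One row per variable, rounded to three decimals like the tables above
manual_summary <- t(sapply(mtcars[, c("cyl", "disp", "hp")], describe_column))
round(manual_summary, 3)
```

If the rounded values differ from the stargazer table, the usual cause is missing values or a non-numeric column.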
# Fit a regression model
model1 <- lm(mpg ~ cyl + disp, data = mtcars)
model2 <- lm(mpg ~ cyl + disp + hp, data = mtcars)
# Regression table
stargazer(model1,model2, type = "text") # Change 'text' to 'html' for Word compatibility
##
## =================================================================
## Dependent variable:
## ---------------------------------------------
## mpg
## (1) (2)
## -----------------------------------------------------------------
## cyl -1.587** -1.227
## (0.712) (0.797)
##
## disp -0.021* -0.019*
## (0.010) (0.010)
##
## hp -0.015
## (0.015)
##
## Constant 34.661*** 34.185***
## (2.547) (2.591)
##
## -----------------------------------------------------------------
## Observations 32 32
## R2 0.760 0.768
## Adjusted R2 0.743 0.743
## Residual Std. Error 3.055 (df = 29) 3.055 (df = 28)
## F Statistic 45.808*** (df = 2; 29) 30.877*** (df = 3; 28)
## =================================================================
## Note: *p<0.1; **p<0.05; ***p<0.01
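The stars in the regression table follow the p-value thresholds listed in the table note. A minimal base R sketch of that mapping is shown here, refitting the first model under the assumed name `fit`. The `cut()` call is only an illustration of the convention, not stargazer's own code.

```r
fit <- lm(mpg ~ cyl + disp, data = mtcars)

# Coefficient table: estimate, standard error, t value, p-value
coef_table <- summary(fit)$coefficients
p_values <- coef_table[, "Pr(>|t|)"]

# Map p-values to the star convention in the table note:
# *** for p < 0.01, ** for p < 0.05, * for p < 0.1
stars <- cut(p_values,
             breaks = c(-Inf, 0.01, 0.05, 0.1, Inf),
             labels = c("***", "**", "*", ""),
             right = FALSE)

data.frame(estimate = round(coef_table[, "Estimate"], 3),
           p_value  = round(p_values, 4),
           stars    = stars)
```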
# Export to a file Word can open (HTML saved with a .doc extension)
stargazer(mtcars, type = "html", out = "summary_table.doc")
##
## <table style="text-align:center"><tr><td colspan="6" style="border-bottom: 1px solid black"></td></tr><tr><td style="text-align:left">Statistic</td><td>N</td><td>Mean</td><td>St. Dev.</td><td>Min</td><td>Max</td></tr>
## <tr><td colspan="6" style="border-bottom: 1px solid black"></td></tr><tr><td style="text-align:left">mpg</td><td>32</td><td>20.091</td><td>6.027</td><td>10.400</td><td>33.900</td></tr>
## <tr><td style="text-align:left">cyl</td><td>32</td><td>6.188</td><td>1.786</td><td>4</td><td>8</td></tr>
## <tr><td style="text-align:left">disp</td><td>32</td><td>230.722</td><td>123.939</td><td>71.100</td><td>472.000</td></tr>
## <tr><td style="text-align:left">hp</td><td>32</td><td>146.688</td><td>68.563</td><td>52</td><td>335</td></tr>
## <tr><td style="text-align:left">drat</td><td>32</td><td>3.597</td><td>0.535</td><td>2.760</td><td>4.930</td></tr>
## <tr><td style="text-align:left">wt</td><td>32</td><td>3.217</td><td>0.978</td><td>1.513</td><td>5.424</td></tr>
## <tr><td style="text-align:left">qsec</td><td>32</td><td>17.849</td><td>1.787</td><td>14.500</td><td>22.900</td></tr>
## <tr><td style="text-align:left">vs</td><td>32</td><td>0.438</td><td>0.504</td><td>0</td><td>1</td></tr>
## <tr><td style="text-align:left">am</td><td>32</td><td>0.406</td><td>0.499</td><td>0</td><td>1</td></tr>
## <tr><td style="text-align:left">gear</td><td>32</td><td>3.688</td><td>0.738</td><td>3</td><td>5</td></tr>
## <tr><td style="text-align:left">carb</td><td>32</td><td>2.812</td><td>1.615</td><td>1</td><td>8</td></tr>
## <tr><td colspan="6" style="border-bottom: 1px solid black"></td></tr></table>
# Export to HTML (converted to PDF with pandoc afterwards)
stargazer(model1,model2, type = "html", out = "regression_table.html")
##
## <table style="text-align:center"><tr><td colspan="3" style="border-bottom: 1px solid black"></td></tr><tr><td style="text-align:left"></td><td colspan="2"><em>Dependent variable:</em></td></tr>
## <tr><td></td><td colspan="2" style="border-bottom: 1px solid black"></td></tr>
## <tr><td style="text-align:left"></td><td colspan="2">mpg</td></tr>
## <tr><td style="text-align:left"></td><td>(1)</td><td>(2)</td></tr>
## <tr><td colspan="3" style="border-bottom: 1px solid black"></td></tr><tr><td style="text-align:left">cyl</td><td>-1.587<sup>**</sup></td><td>-1.227</td></tr>
## <tr><td style="text-align:left"></td><td>(0.712)</td><td>(0.797)</td></tr>
## <tr><td style="text-align:left"></td><td></td><td></td></tr>
## <tr><td style="text-align:left">disp</td><td>-0.021<sup>*</sup></td><td>-0.019<sup>*</sup></td></tr>
## <tr><td style="text-align:left"></td><td>(0.010)</td><td>(0.010)</td></tr>
## <tr><td style="text-align:left"></td><td></td><td></td></tr>
## <tr><td style="text-align:left">hp</td><td></td><td>-0.015</td></tr>
## <tr><td style="text-align:left"></td><td></td><td>(0.015)</td></tr>
## <tr><td style="text-align:left"></td><td></td><td></td></tr>
## <tr><td style="text-align:left">Constant</td><td>34.661<sup>***</sup></td><td>34.185<sup>***</sup></td></tr>
## <tr><td style="text-align:left"></td><td>(2.547)</td><td>(2.591)</td></tr>
## <tr><td style="text-align:left"></td><td></td><td></td></tr>
## <tr><td colspan="3" style="border-bottom: 1px solid black"></td></tr><tr><td style="text-align:left">Observations</td><td>32</td><td>32</td></tr>
## <tr><td style="text-align:left">R<sup>2</sup></td><td>0.760</td><td>0.768</td></tr>
## <tr><td style="text-align:left">Adjusted R<sup>2</sup></td><td>0.743</td><td>0.743</td></tr>
## <tr><td style="text-align:left">Residual Std. Error</td><td>3.055 (df = 29)</td><td>3.055 (df = 28)</td></tr>
## <tr><td style="text-align:left">F Statistic</td><td>45.808<sup>***</sup> (df = 2; 29)</td><td>30.877<sup>***</sup> (df = 3; 28)</td></tr>
## <tr><td colspan="3" style="border-bottom: 1px solid black"></td></tr><tr><td style="text-align:left"><em>Note:</em></td><td colspan="2" style="text-align:right"><sup>*</sup>p<0.1; <sup>**</sup>p<0.05; <sup>***</sup>p<0.01</td></tr>
## </table>
# Convert the HTML table to PDF with pandoc, if it is installed
if (nzchar(Sys.which("pandoc"))) {
  system("pandoc -s regression_table.html -o regression_table.pdf")
} else {
  message("pandoc not available in this runtime; skipping PDF conversion.")
}
# Install necessary packages
#if (FALSE) install.packages("stargazer")
#if (FALSE) install.packages("Ecdat") # Contains the Wages dataset
# Load required libraries
library(stargazer)
library(Ecdat)
# Load the wages dataset
data(Wages)
head(Wages)
Question: Why are we missing some variables in the table? How would you fix it?
# Create a summary table of sample statistics
stargazer(Wages, type = "text", title = "Sample Statistics for Wages Dataset",
summary.stat = c("mean", "sd", "min", "max", "n"))
##
## Sample Statistics for Wages Dataset
## ===========================================
## Statistic Mean St. Dev. Min Max N
## -------------------------------------------
## exp 19.854 10.966 1 51 4,165
## wks 46.812 5.129 5 52 4,165
## ind 0.395 0.489 0 1 4,165
## ed 12.845 2.788 4 17 4,165
## lwage 6.676 0.462 4.605 8.537 4,165
## -------------------------------------------
class(Wages$sex)
## [1] "factor"
# Note: as.numeric() on a factor returns its integer level codes (1, 2), not a 0/1 dummy
Wages$sex <- as.numeric(Wages$sex)
class(Wages$sex)
## [1] "numeric"
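Converting a factor with `as.numeric()` yields level codes starting at 1. For a two-level factor this leaves the slope in a regression unchanged but shifts the intercept, and labels such as "= 1" no longer describe the coding. A minimal sketch of an explicit 0/1 indicator follows, using a made-up toy factor rather than the Wages data. Keeping the column as a factor also works, since `lm()` then creates the dummy itself.

```r
# Toy factor standing in for a column such as Wages$sex (values are made up)
sex <- factor(c("female", "male", "male", "female"))

levels(sex)       # levels are sorted alphabetically: "female" "male"
as.numeric(sex)   # integer level codes: 1 2 2 1

# Explicit 0/1 indicator: 1 for "male", 0 otherwise
male <- as.numeric(sex == "male")
male              # 0 1 1 0
```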
# Fit multiple regression models using the wages dataset
model1 <- lm(lwage ~ ed + exp, data = Wages)
model2 <- lm(lwage ~ ed + exp + sex, data = Wages)
model3 <- lm(lwage ~ ed + exp + sex + union, data = Wages)
# Create a basic regression table with multiple models quickly in console
stargazer(model1, model2, model3, type = "text")
##
## =================================================================================================
## Dependent variable:
## -----------------------------------------------------------------------------
## lwage
## (1) (2) (3)
## -------------------------------------------------------------------------------------------------
## ed 0.076*** 0.075*** 0.079***
## (0.002) (0.002) (0.002)
##
## exp 0.013*** 0.012*** 0.012***
## (0.001) (0.001) (0.001)
##
## sex 0.436*** 0.421***
## (0.019) (0.019)
##
## unionyes 0.085***
## (0.013)
##
## Constant 5.436*** 4.652*** 4.597***
## (0.034) (0.046) (0.047)
##
## -------------------------------------------------------------------------------------------------
## Observations 4,165 4,165 4,165
## R2 0.247 0.335 0.342
## Adjusted R2 0.246 0.335 0.342
## Residual Std. Error 0.401 (df = 4162) 0.376 (df = 4161) 0.374 (df = 4160)
## F Statistic 681.552*** (df = 2; 4162) 698.837*** (df = 3; 4161) 541.007*** (df = 4; 4160)
## =================================================================================================
## Note: *p<0.1; **p<0.05; ***p<0.01
# You can also add a style, title, labels, and notes
stargazer(model1, model2, model3, type = "text",
style = 'aer',
title = "Basic Regression Results Using Wages Dataset",
column.labels = c("Model 1", "Model 2", "Model 3"),
dep.var.labels = "Hourly Wage",
covariate.labels = c("Years of Education", "Years of Experience", "Gender (Male = 1)", "Union Membership (Yes = 1)"),
notes = "Standard errors in parentheses")
##
## Basic Regression Results Using Wages Dataset
## ========================================================================================================
## Hourly Wage
## Model 1 Model 2 Model 3
## (1) (2) (3)
## --------------------------------------------------------------------------------------------------------
## Years of Education 0.076*** 0.075*** 0.079***
## (0.002) (0.002) (0.002)
##
## Years of Experience 0.013*** 0.012*** 0.012***
## (0.001) (0.001) (0.001)
##
## Gender (Male = 1) 0.436*** 0.421***
## (0.019) (0.019)
##
## Union Membership (Yes = 1) 0.085***
## (0.013)
##
## Constant 5.436*** 4.652*** 4.597***
## (0.034) (0.046) (0.047)
##
## Observations 4,165 4,165 4,165
## R2 0.247 0.335 0.342
## Adjusted R2 0.246 0.335 0.342
## Residual Std. Error 0.401 (df = 4162) 0.376 (df = 4161) 0.374 (df = 4160)
## F Statistic 681.552*** (df = 2; 4162) 698.837*** (df = 3; 4161) 541.007*** (df = 4; 4160)
## --------------------------------------------------------------------------------------------------------
## Notes: ***Significant at the 1 percent level.
## **Significant at the 5 percent level.
## *Significant at the 10 percent level.
## Standard errors in parentheses
stargazer(model1, model2, model3, type = "html",out = "3models.doc",
style = "aer",
title = "Basic Regression Results Using Wages Dataset",
column.labels = c("Model 1", "Model 2", "Model 3"),
dep.var.labels = "Hourly Wage",
covariate.labels = c("Years of Education", "Years of Experience", "Gender (Male = 1)", "Union Membership (Yes = 1)"),
notes = "Standard errors in parentheses")
##
## <table style="text-align:center"><caption><strong>Basic Regression Results Using Wages Dataset</strong></caption>
## <tr><td colspan="4" style="border-bottom: 1px solid black"></td></tr><tr><td style="text-align:left"></td><td colspan="3">Hourly Wage</td></tr>
## <tr><td style="text-align:left"></td><td>Model 1</td><td>Model 2</td><td>Model 3</td></tr>
## <tr><td style="text-align:left"></td><td>(1)</td><td>(2)</td><td>(3)</td></tr>
## <tr><td colspan="4" style="border-bottom: 1px solid black"></td></tr><tr><td style="text-align:left">Years of Education</td><td>0.076<sup>***</sup></td><td>0.075<sup>***</sup></td><td>0.079<sup>***</sup></td></tr>
## <tr><td style="text-align:left"></td><td>(0.002)</td><td>(0.002)</td><td>(0.002)</td></tr>
## <tr><td style="text-align:left"></td><td></td><td></td><td></td></tr>
## <tr><td style="text-align:left">Years of Experience</td><td>0.013<sup>***</sup></td><td>0.012<sup>***</sup></td><td>0.012<sup>***</sup></td></tr>
## <tr><td style="text-align:left"></td><td>(0.001)</td><td>(0.001)</td><td>(0.001)</td></tr>
## <tr><td style="text-align:left"></td><td></td><td></td><td></td></tr>
## <tr><td style="text-align:left">Gender (Male = 1)</td><td></td><td>0.436<sup>***</sup></td><td>0.421<sup>***</sup></td></tr>
## <tr><td style="text-align:left"></td><td></td><td>(0.019)</td><td>(0.019)</td></tr>
## <tr><td style="text-align:left"></td><td></td><td></td><td></td></tr>
## <tr><td style="text-align:left">Union Membership (Yes = 1)</td><td></td><td></td><td>0.085<sup>***</sup></td></tr>
## <tr><td style="text-align:left"></td><td></td><td></td><td>(0.013)</td></tr>
## <tr><td style="text-align:left"></td><td></td><td></td><td></td></tr>
## <tr><td style="text-align:left">Constant</td><td>5.436<sup>***</sup></td><td>4.652<sup>***</sup></td><td>4.597<sup>***</sup></td></tr>
## <tr><td style="text-align:left"></td><td>(0.034)</td><td>(0.046)</td><td>(0.047)</td></tr>
## <tr><td style="text-align:left"></td><td></td><td></td><td></td></tr>
## <tr><td style="text-align:left">Observations</td><td>4,165</td><td>4,165</td><td>4,165</td></tr>
## <tr><td style="text-align:left">R<sup>2</sup></td><td>0.247</td><td>0.335</td><td>0.342</td></tr>
## <tr><td style="text-align:left">Adjusted R<sup>2</sup></td><td>0.246</td><td>0.335</td><td>0.342</td></tr>
## <tr><td style="text-align:left">Residual Std. Error</td><td>0.401 (df = 4162)</td><td>0.376 (df = 4161)</td><td>0.374 (df = 4160)</td></tr>
## <tr><td style="text-align:left">F Statistic</td><td>681.552<sup>***</sup> (df = 2; 4162)</td><td>698.837<sup>***</sup> (df = 3; 4161)</td><td>541.007<sup>***</sup> (df = 4; 4160)</td></tr>
## <tr><td colspan="4" style="border-bottom: 1px solid black"></td></tr><tr><td style="text-align:left"><em>Notes:</em></td><td colspan="3" style="text-align:left"><sup>***</sup>Significant at the 1 percent level.</td></tr>
## <tr><td style="text-align:left"></td><td colspan="3" style="text-align:left"><sup>**</sup>Significant at the 5 percent level.</td></tr>
## <tr><td style="text-align:left"></td><td colspan="3" style="text-align:left"><sup>*</sup>Significant at the 10 percent level.</td></tr>
## <tr><td style="text-align:left"></td><td colspan="3" style="text-align:left">Standard errors in parentheses</td></tr>
## </table>
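The next example shows how to report heteroskedasticity-robust standard errors in the table. Clustered standard errors can be passed to stargazer in the same way, through the se argument.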
library(sandwich)
robust_se <- list(sqrt(diag(vcovHC(model1, type = "HC1"))),
sqrt(diag(vcovHC(model2, type = "HC1"))),
sqrt(diag(vcovHC(model3, type = "HC1"))))
stargazer(model1, model2, model3, type = "text",
se = robust_se,
title = "Regression with Robust Standard Errors")
##
## Regression with Robust Standard Errors
## =================================================================================================
## Dependent variable:
## -----------------------------------------------------------------------------
## lwage
## (1) (2) (3)
## -------------------------------------------------------------------------------------------------
## ed 0.076*** 0.075*** 0.079***
## (0.002) (0.002) (0.002)
##
## exp 0.013*** 0.012*** 0.012***
## (0.001) (0.001) (0.001)
##
## sex 0.436*** 0.421***
## (0.018) (0.018)
##
## unionyes 0.085***
## (0.012)
##
## Constant 5.436*** 4.652*** 4.597***
## (0.037) (0.046) (0.046)
##
## -------------------------------------------------------------------------------------------------
## Observations 4,165 4,165 4,165
## R2 0.247 0.335 0.342
## Adjusted R2 0.246 0.335 0.342
## Residual Std. Error 0.401 (df = 4162) 0.376 (df = 4161) 0.374 (df = 4160)
## F Statistic 681.552*** (df = 2; 4162) 698.837*** (df = 3; 4161) 541.007*** (df = 4; 4160)
## =================================================================================================
## Note: *p<0.1; **p<0.05; ***p<0.01
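Clustered standard errors can be built along the same lines with `vcovCL()` from the sandwich package. The sketch below uses `mtcars` and clusters on `carb` purely to show the mechanics. That grouping is an arbitrary assumption with very few clusters, not a recommendation, and the names `fit_cl`, `cluster_vcov`, and `cluster_se` are hypothetical.

```r
library(sandwich)

# Illustrative model on mtcars; clustering on `carb` is an arbitrary choice
fit_cl <- lm(mpg ~ cyl + disp, data = mtcars)

# Cluster-robust covariance matrix, then standard errors from its diagonal
cluster_vcov <- vcovCL(fit_cl, cluster = ~ carb)
cluster_se <- sqrt(diag(cluster_vcov))
cluster_se

# The vector can be handed to stargazer like the robust ones above:
# stargazer(fit_cl, type = "text", se = list(cluster_se))
```

With real data, the cluster variable should be the unit within which errors are expected to be correlated, for example the individual in a panel.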
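Why Robust Standard Errors?
In regression analysis, the standard errors of your coefficients are used to calculate test statistics and confidence intervals. However, conventional OLS standard errors assume that the error terms are homoscedastic, meaning they have a constant variance. When that assumption fails (heteroskedasticity), the conventional standard errors are unreliable, while robust standard errors such as HC1 remain valid in large samples.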
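To make the idea concrete, here is a minimal base R sketch of the HC1 "sandwich" formula on a small `mtcars` model. It is an illustration under the usual textbook definition, where HC1 is the White estimator scaled by n / (n - k), and the variable names are hypothetical. In practice `vcovHC()` does this for you.

```r
fit <- lm(mpg ~ cyl + disp, data = mtcars)

X <- model.matrix(fit)   # n x k design matrix
e <- residuals(fit)      # OLS residuals
n <- nrow(X)
k <- ncol(X)

bread <- solve(crossprod(X))   # (X'X)^{-1}
meat  <- crossprod(X * e)      # sum over i of e_i^2 * x_i x_i'

# HC1 = HC0 scaled by n / (n - k)
vcov_hc1 <- (n / (n - k)) * bread %*% meat %*% bread
manual_robust_se <- sqrt(diag(vcov_hc1))

# Compare with the conventional standard errors
cbind(conventional = sqrt(diag(vcov(fit))), robust_HC1 = manual_robust_se)
```

The two columns differ because the robust version weights each observation by its own squared residual instead of assuming one common error variance.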
Install and Load the coefplot Package:
# Install the package if it's not already installed
if (FALSE) install.packages("coefplot")
# Load the package
library(coefplot)
Fit a Regression Model:
# Example regression model
model3 <- lm(lwage ~ ed + exp + sex + union, data = Wages)
Plot Coefficients:
# Create a coefficient plot
coefplot(model3, intercept = FALSE)
Customization:
# Customized coefficient plot
# (stored under its own name so the object is not confused with the coefplot() function)
coef_plot <- coefplot(model3,
title = "Coefficient Plot with Customization", # Add a title
xlab = "Coefficient Estimates", # Label for x-axis
ylab = "Variables", # Label for y-axis
color = "black",
intercept = FALSE,
innerCI = 1.96,
grid = FALSE) # Remove gridlines for a cleaner look
coef_plot # Print the stored plot
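A coefficient plot shows each estimate together with an interval that is a multiple of its standard error. The numbers behind such a plot can be sketched in base R as below. This uses `mtcars` so it runs on its own, the 1.96 multiplier mirrors the `innerCI` value above, and the names `fit` and `plot_data` are hypothetical. It is not coefplot's internal code.

```r
fit <- lm(mpg ~ cyl + disp + hp, data = mtcars)

est <- coef(fit)                 # point estimates
se  <- sqrt(diag(vcov(fit)))     # conventional standard errors

# Estimate plus/minus 1.96 standard errors for each term
plot_data <- data.frame(
  term      = names(est),
  estimate  = est,
  lower     = est - 1.96 * se,
  upper     = est + 1.96 * se,
  row.names = NULL
)

# Drop the intercept, as intercept = FALSE does in the plots above
plot_data <- plot_data[plot_data$term != "(Intercept)", ]
plot_data
```

An interval that does not cross zero corresponds to a coefficient that is significant at roughly the 5 percent level.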