class: center, middle, inverse, title-slide

# Interpreting Coefficients

---

## Data Overview

Data set on GDP per capita (`bip`) and capital stock per capita (`k`) for 133 countries worldwide, in USD, for the year 2014. The data come from the [Penn World Tables](https://www.rug.nl/ggdc/productivity/pwt/).

- In addition: a dummy variable (`dummy_k`) for each country, defined in the code below together with the quartile dummies `dummy_k1` to `dummy_k4`:

.tiny[
```r
library(dplyr)  # mutate(), select(), %>%
library(skimr)  # skim(), yank()

pwt <- pwt %>%
  mutate(
    # 1 if the capital stock per capita is above the sample mean
    dummy_k  = ifelse(k > mean(k), 1, 0),
    # one dummy per quartile of the capital stock per capita
    dummy_k1 = ifelse(k <= quantile(k, probs = 0.25), 1, 0),
    dummy_k2 = ifelse(k > quantile(k, probs = 0.25) & k <= quantile(k, probs = 0.5), 1, 0),
    dummy_k3 = ifelse(k > quantile(k, probs = 0.5) & k <= quantile(k, probs = 0.75), 1, 0),
    dummy_k4 = ifelse(k > quantile(k, probs = 0.75), 1, 0)
  ) %>%
  select(country, bip, k, dummy_k, dummy_k1, dummy_k2, dummy_k3, dummy_k4)

# summary statistics for the numeric columns
skim(pwt) %>%
  yank("numeric")
```

```
##
## ── Variable type: numeric ──────────────────────────────────────────────────────
##   skim_variable n_missing complete_rate      mean        sd    p0    p25    p50
## 1 bip                   0             1 22009.    23156.     570.  7106. 15913.
## 2 k                     0             1 82935.    80522.    1105. 17785. 51825.
## 3 dummy_k               0             1     0.361     0.482    0      0      0
## 4 dummy_k1              0             1     0.256     0.438    0      0      0
## 5 dummy_k2              0             1     0.248     0.434    0      0      0
## 6 dummy_k3              0             1     0.248     0.434    0      0      0
## 7 dummy_k4              0             1     0.248     0.434    0      0      0
##       p75    p100 hist
## 1  30794. 163294. ▇▂▁▁▁
## 2 141022. 423284. ▇▂▂▁▁
## 3      1       1  ▇▁▁▁▅
## 4      1       1  ▇▁▁▁▃
## 5      0       1  ▇▁▁▁▂
## 6      0       1  ▇▁▁▁▂
## 7      0       1  ▇▁▁▁▂
```
]

---

### Linear-Linear Model (Standard Case)

`$$y = \beta_0 + \beta_1 * x + u$$`

<table style="text-align:center"><tr><td colspan="2" style="border-bottom: 1px solid black"></td></tr><tr><td style="text-align:left"></td><td><em>Dependent variable:</em></td></tr>
<tr><td></td><td colspan="1" style="border-bottom: 1px solid black"></td></tr>
<tr><td style="text-align:left"></td><td>bip</td></tr>
<tr><td colspan="2" style="border-bottom: 1px solid black"></td></tr><tr><td style="text-align:left">k</td><td>0.244<sup>***</sup></td></tr>
<tr><td style="text-align:left"></td><td>(0.013)</td></tr>
<tr><td style="text-align:left"></td><td></td></tr>
<tr><td style="text-align:left">Constant</td><td>1,768.036</td></tr>
<tr><td style="text-align:left"></td><td>(1,533.344)</td></tr>
<tr><td style="text-align:left"></td><td></td></tr>
<tr><td colspan="2" style="border-bottom: 1px solid black"></td></tr><tr><td style="text-align:left">Observations</td><td>133</td></tr>
<tr><td style="text-align:left">R<sup>2</sup></td><td>0.720</td></tr>
<tr><td style="text-align:left">Adjusted R<sup>2</sup></td><td>0.718</td></tr>
<tr><td style="text-align:left">Residual Std. Error</td><td>12,294.180 (df = 131)</td></tr>
<tr><td style="text-align:left">F Statistic</td><td>337.270<sup>***</sup> (df = 1; 131)</td></tr>
<tr><td colspan="2" style="border-bottom: 1px solid black"></td></tr><tr><td style="text-align:left"><em>Note:</em></td><td style="text-align:right"><sup>*</sup>p<0.1; <sup>**</sup>p<0.05; <sup>***</sup>p<0.01</td></tr>
</table>

--

An increase in `\(x\)` by one unit is associated, on average, with an increase in `\(y\)` of `\(\beta_1\)` units.
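
As a small worked example, the sketch below applies this reading to the slope of 0.244 reported in the table. The change of 10,000 USD in `k` is an arbitrary value chosen for illustration, and the variable names are not part of the original slides.

```r
# Reported slope from the regression table (bip on k)
beta_1 <- 0.244

# Hypothetical change in capital stock per capita in USD (illustrative assumption)
delta_k <- 10000

# Linear-linear: the expected change in y is beta_1 times the change in x
delta_bip <- beta_1 * delta_k
delta_bip
```

Under these assumptions, 10,000 USD more capital per capita goes along with roughly 2,440 USD more GDP per capita on average.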
---

### Log-Log Model (Log-Transformed Dependent and Explanatory Variable)

`$$log(y) = \beta_0 + \beta_1 * log(x) + u$$`

<table style="text-align:center"><tr><td colspan="2" style="border-bottom: 1px solid black"></td></tr><tr><td style="text-align:left"></td><td><em>Dependent variable:</em></td></tr>
<tr><td></td><td colspan="1" style="border-bottom: 1px solid black"></td></tr>
<tr><td style="text-align:left"></td><td>log(bip)</td></tr>
<tr><td colspan="2" style="border-bottom: 1px solid black"></td></tr><tr><td style="text-align:left">log(k)</td><td>0.815<sup>***</sup></td></tr>
<tr><td style="text-align:left"></td><td>(0.024)</td></tr>
<tr><td style="text-align:left"></td><td></td></tr>
<tr><td style="text-align:left">Constant</td><td>0.776<sup>***</sup></td></tr>
<tr><td style="text-align:left"></td><td>(0.263)</td></tr>
<tr><td style="text-align:left"></td><td></td></tr>
<tr><td colspan="2" style="border-bottom: 1px solid black"></td></tr><tr><td style="text-align:left">Observations</td><td>133</td></tr>
<tr><td style="text-align:left">R<sup>2</sup></td><td>0.895</td></tr>
<tr><td style="text-align:left">Adjusted R<sup>2</sup></td><td>0.894</td></tr>
<tr><td style="text-align:left">Residual Std. Error</td><td>0.368 (df = 131)</td></tr>
<tr><td style="text-align:left">F Statistic</td><td>1,113.512<sup>***</sup> (df = 1; 131)</td></tr>
<tr><td colspan="2" style="border-bottom: 1px solid black"></td></tr><tr><td style="text-align:left"><em>Note:</em></td><td style="text-align:right"><sup>*</sup>p<0.1; <sup>**</sup>p<0.05; <sup>***</sup>p<0.01</td></tr>
</table>

--

An increase in `\(x\)` by one percent is associated, on average, with an increase in `\(y\)` of `\(\beta_1\)` percent.
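
The percent-for-percent reading is an approximation that works well for small changes. The sketch below compares it with the exact implied change, `(1 + p)^beta_1 - 1`, using the elasticity of 0.815 from the table. The percentage increases in `k` are illustrative assumptions.

```r
# Reported elasticity from the regression table (log(bip) on log(k))
beta_1 <- 0.815

# Hypothetical percentage increases in k (illustrative assumption)
pct_increase_k <- c(1, 10, 50)

# Rule of thumb: percentage change in y is beta_1 times percentage change in x
approx_pct <- beta_1 * pct_increase_k

# Exact change implied by the model: y1 / y0 = (x1 / x0)^beta_1
exact_pct <- 100 * ((1 + pct_increase_k / 100)^beta_1 - 1)

data.frame(pct_increase_k, approx_pct, exact_pct)
```

For a one percent increase the two values are nearly identical; the gap widens as the change in `k` grows.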
---

### Log-Linear Model (Log-Transformed Dependent Variable)

`$$log(y) = \beta_0 + \beta_1 * x + u$$`

<table style="text-align:center"><tr><td colspan="2" style="border-bottom: 1px solid black"></td></tr><tr><td style="text-align:left"></td><td><em>Dependent variable:</em></td></tr>
<tr><td></td><td colspan="1" style="border-bottom: 1px solid black"></td></tr>
<tr><td style="text-align:left"></td><td>log(bip)</td></tr>
<tr><td colspan="2" style="border-bottom: 1px solid black"></td></tr><tr><td style="text-align:left">k</td><td>0.00001<sup>***</sup></td></tr>
<tr><td style="text-align:left"></td><td>(0.00000)</td></tr>
<tr><td style="text-align:left"></td><td></td></tr>
<tr><td style="text-align:left">Constant</td><td>8.556<sup>***</sup></td></tr>
<tr><td style="text-align:left"></td><td>(0.085)</td></tr>
<tr><td style="text-align:left"></td><td></td></tr>
<tr><td colspan="2" style="border-bottom: 1px solid black"></td></tr><tr><td style="text-align:left">Observations</td><td>133</td></tr>
<tr><td style="text-align:left">R<sup>2</sup></td><td>0.642</td></tr>
<tr><td style="text-align:left">Adjusted R<sup>2</sup></td><td>0.639</td></tr>
<tr><td style="text-align:left">Residual Std. Error</td><td>0.680 (df = 131)</td></tr>
<tr><td style="text-align:left">F Statistic</td><td>234.609<sup>***</sup> (df = 1; 131)</td></tr>
<tr><td colspan="2" style="border-bottom: 1px solid black"></td></tr><tr><td style="text-align:left"><em>Note:</em></td><td style="text-align:right"><sup>*</sup>p<0.1; <sup>**</sup>p<0.05; <sup>***</sup>p<0.01</td></tr>
</table>

--

An increase in `\(x\)` by one unit is associated, on average, with an increase in `\(y\)` of approximately `\(\beta_1 * 100\)` percent.
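
Because one unit of `k` is a single USD, the coefficient looks tiny. The sketch below scales it to larger changes and compares the rule of thumb with the exact implied change, `exp(beta_1 * delta_x) - 1`. It assumes the coefficient is exactly 0.00001, although the table only shows a rounded value, and the changes in `k` are illustrative.

```r
# Coefficient as displayed in the table (rounded there, so treat results as rough)
beta_1 <- 0.00001

# Hypothetical changes in capital stock per capita in USD (illustrative assumption)
delta_k <- c(1, 1000, 10000)

# Rule of thumb: percentage change in y is 100 * beta_1 per unit of x
approx_pct <- 100 * beta_1 * delta_k

# Exact change implied by the model: y1 / y0 = exp(beta_1 * delta_x)
exact_pct <- 100 * (exp(beta_1 * delta_k) - 1)

data.frame(delta_k, approx_pct, exact_pct)
```

For a change of 10,000 USD the rule of thumb gives about 10 percent, while the exact value is slightly larger.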
---

### Linear-Log Model (Log-Transformed Explanatory Variable)

`$$y = \beta_0 + \beta_1 * log(x) + u$$`

<table style="text-align:center"><tr><td colspan="2" style="border-bottom: 1px solid black"></td></tr><tr><td style="text-align:left"></td><td><em>Dependent variable:</em></td></tr>
<tr><td></td><td colspan="1" style="border-bottom: 1px solid black"></td></tr>
<tr><td style="text-align:left"></td><td>bip</td></tr>
<tr><td colspan="2" style="border-bottom: 1px solid black"></td></tr><tr><td style="text-align:left">log(k)</td><td>12,422.670<sup>***</sup></td></tr>
<tr><td style="text-align:left"></td><td>(1,092.941)</td></tr>
<tr><td style="text-align:left"></td><td></td></tr>
<tr><td style="text-align:left">Constant</td><td>-110,876.500<sup>***</sup></td></tr>
<tr><td style="text-align:left"></td><td>(11,778.320)</td></tr>
<tr><td style="text-align:left"></td><td></td></tr>
<tr><td colspan="2" style="border-bottom: 1px solid black"></td></tr><tr><td style="text-align:left">Observations</td><td>133</td></tr>
<tr><td style="text-align:left">R<sup>2</sup></td><td>0.497</td></tr>
<tr><td style="text-align:left">Adjusted R<sup>2</sup></td><td>0.493</td></tr>
<tr><td style="text-align:left">Residual Std. Error</td><td>16,493.040 (df = 131)</td></tr>
<tr><td style="text-align:left">F Statistic</td><td>129.192<sup>***</sup> (df = 1; 131)</td></tr>
<tr><td colspan="2" style="border-bottom: 1px solid black"></td></tr><tr><td style="text-align:left"><em>Note:</em></td><td style="text-align:right"><sup>*</sup>p<0.1; <sup>**</sup>p<0.05; <sup>***</sup>p<0.01</td></tr>
</table>

--

An increase in `\(x\)` by one percent is associated, on average, with an increase in `\(y\)` of approximately `\(\frac {\beta_1}{100}\)` units.
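
To make the division by 100 concrete, the sketch below applies it to the coefficient of 12,422.67 from the table and compares the result with the exact implied change, `beta_1 * log(1 + p)`. The one percent increase matches the interpretation above; the variable names are illustrative.

```r
# Reported coefficient from the regression table (bip on log(k))
beta_1 <- 12422.670

# Rule of thumb for a one percent increase in x: beta_1 / 100 units of y
approx_change <- beta_1 / 100

# Exact change implied by the model for a one percent increase in x
exact_change <- beta_1 * log(1.01)

c(approx = approx_change, exact = exact_change)
```

Both values are close to 124 USD, so a one percent higher capital stock per capita goes along with roughly 124 USD more GDP per capita on average.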
---

### Dummy Variable as Explanatory Variable

`$$y = \beta_0 + \beta_1 * I_x+ u$$`

<table style="text-align:center"><tr><td colspan="2" style="border-bottom: 1px solid black"></td></tr><tr><td style="text-align:left"></td><td><em>Dependent variable:</em></td></tr>
<tr><td></td><td colspan="1" style="border-bottom: 1px solid black"></td></tr>
<tr><td style="text-align:left"></td><td>bip</td></tr>
<tr><td colspan="2" style="border-bottom: 1px solid black"></td></tr><tr><td style="text-align:left">dummy_k</td><td>32,933.830<sup>***</sup></td></tr>
<tr><td style="text-align:left"></td><td>(3,054.932)</td></tr>
<tr><td style="text-align:left"></td><td></td></tr>
<tr><td style="text-align:left">Constant</td><td>10,122.740<sup>***</sup></td></tr>
<tr><td style="text-align:left"></td><td>(1,835.255)</td></tr>
<tr><td style="text-align:left"></td><td></td></tr>
<tr><td colspan="2" style="border-bottom: 1px solid black"></td></tr><tr><td style="text-align:left">Observations</td><td>133</td></tr>
<tr><td style="text-align:left">R<sup>2</sup></td><td>0.470</td></tr>
<tr><td style="text-align:left">Adjusted R<sup>2</sup></td><td>0.466</td></tr>
<tr><td style="text-align:left">Residual Std. Error</td><td>16,920.210 (df = 131)</td></tr>
<tr><td style="text-align:left">F Statistic</td><td>116.220<sup>***</sup> (df = 1; 131)</td></tr>
<tr><td colspan="2" style="border-bottom: 1px solid black"></td></tr><tr><td style="text-align:left"><em>Note:</em></td><td style="text-align:right"><sup>*</sup>p<0.1; <sup>**</sup>p<0.05; <sup>***</sup>p<0.01</td></tr>
</table>

--

Observations with `\(I_x = 1\)` are associated, on average, with a `\(y\)` that is `\(\beta_1\)` units higher than for observations with `\(I_x = 0\)`.
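
With a single dummy regressor, the constant is the mean of `\(y\)` in the group with dummy = 0 and the coefficient is the difference between the two group means. The sketch below first reads the two group means off the reported estimates and then checks the general claim on a tiny made-up data set. The `toy` data frame is purely hypothetical and unrelated to the Penn World Tables data.

```r
# Reported estimates from the regression table (bip on dummy_k)
beta_0 <- 10122.740
beta_1 <- 32933.830

# Implied average GDP per capita in each group
mean_below <- beta_0           # countries with dummy_k = 0
mean_above <- beta_0 + beta_1  # countries with dummy_k = 1

# Check on hypothetical toy data: the slope equals the difference in group means
toy <- data.frame(y = c(2, 4, 6, 10, 14), d = c(0, 0, 0, 1, 1))
fit <- lm(y ~ d, data = toy)
slope <- unname(coef(fit)["d"])
diff_in_means <- mean(toy$y[toy$d == 1]) - mean(toy$y[toy$d == 0])

c(slope = slope, diff_in_means = diff_in_means)
```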
---

### Multiple Dummy Variables as Explanatory Variables

`$$y = \beta_0 + \beta_1 * I_{x1} + \beta_2 * I_{x2} + \beta_3 * I_{x3} + u$$`

.pull-left[
<table style="text-align:center"><tr><td colspan="2" style="border-bottom: 1px solid black"></td></tr><tr><td style="text-align:left"></td><td><em>Dependent variable:</em></td></tr>
<tr><td></td><td colspan="1" style="border-bottom: 1px solid black"></td></tr>
<tr><td style="text-align:left"></td><td>bip</td></tr>
<tr><td colspan="2" style="border-bottom: 1px solid black"></td></tr><tr><td style="text-align:left">dummy_k1</td><td>-44,545.740<sup>***</sup></td></tr>
<tr><td style="text-align:left"></td><td>(3,919.828)</td></tr>
<tr><td style="text-align:left"></td><td></td></tr>
<tr><td style="text-align:left">dummy_k2</td><td>-36,450.900<sup>***</sup></td></tr>
<tr><td style="text-align:left"></td><td>(3,948.972)</td></tr>
<tr><td style="text-align:left"></td><td></td></tr>
<tr><td style="text-align:left">dummy_k3</td><td>-25,008.320<sup>***</sup></td></tr>
<tr><td style="text-align:left"></td><td>(3,948.972)</td></tr>
<tr><td style="text-align:left"></td><td></td></tr>
<tr><td style="text-align:left">Constant</td><td>48,645.540<sup>***</sup></td></tr>
<tr><td style="text-align:left"></td><td>(2,792.345)</td></tr>
<tr><td style="text-align:left"></td><td></td></tr>
<tr><td colspan="2" style="border-bottom: 1px solid black"></td></tr><tr><td style="text-align:left">Observations</td><td>133</td></tr>
<tr><td style="text-align:left">R<sup>2</sup></td><td>0.531</td></tr>
<tr><td style="text-align:left">Adjusted R<sup>2</sup></td><td>0.520</td></tr>
<tr><td style="text-align:left">Residual Std. Error</td><td>16,040.800 (df = 129)</td></tr>
<tr><td style="text-align:left">F Statistic</td><td>48.690<sup>***</sup> (df = 3; 129)</td></tr>
<tr><td colspan="2" style="border-bottom: 1px solid black"></td></tr><tr><td style="text-align:left"><em>Note:</em></td><td style="text-align:right"><sup>*</sup>p<0.1; <sup>**</sup>p<0.05; <sup>***</sup>p<0.01</td></tr>
</table>
]

--

.pull-right[
Observations with `\(I_{x1} = 1\)` are associated, on average, with a `\(y\)` that differs from the base level by `\(\beta_1\)` units: higher if `\(\beta_1\)` is positive, lower if it is negative. The base level is the omitted category, here `dummy_k4`.
]
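
As a rough illustration, the sketch below turns the reported estimates into implied average GDP per capita for each quartile group. It relies on the table omitting `dummy_k4`, so the constant is read as the mean of that base group. The variable names are illustrative.

```r
# Reported estimates from the regression table (bip on dummy_k1 to dummy_k3)
beta_0 <- 48645.540
betas <- c(dummy_k1 = -44545.740, dummy_k2 = -36450.900, dummy_k3 = -25008.320)

# Implied group means: base level (dummy_k4) plus the respective coefficient
group_means <- c(beta_0 + betas, dummy_k4 = beta_0)
group_means
```

Each coefficient is the gap between a quartile group and the top quartile, which is why all three are negative here.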