R Markdown
This is an R Markdown
Notebook. When you execute code within the notebook, the results appear
beneath the code.
Try executing this chunk by clicking the Run button within
the chunk or by placing your cursor inside it and pressing
Ctrl+Shift+Enter.
plot(cars)

Add a new chunk by clicking the Insert Chunk button on the
toolbar or by pressing Ctrl+Alt+I.
When you save the notebook, an HTML file containing the code and
output will be saved alongside it (click the Preview button or
press Ctrl+Shift+K to preview the HTML file).
The preview shows you a rendered HTML copy of the contents of the
editor. Consequently, unlike Knit, Preview does not
run any R code chunks. Instead, the output of the chunk when it was last
run in the editor is displayed.
RStudio can show the final formatting live if you switch from
“Source” to “Visual” in the editor toolbar above.
Linear Regression
Description of data set
The file module1-video_reading.csv contains the following data:
- participant: unique ID for each participant
- score_reading: number of points the participant scored in a reading
  test
- hours_video: average number of hours per day the participant spends
  watching video streams (TV, movies, etc.)
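To make the column layout described above concrete, here is a minimal sketch of reading such a file and checking its structure. The values are hypothetical stand-ins, not the course data, and the sketch assumes a comma-separated file with a header row. With the real file, the path passed to read.csv() would be "module1-video_reading.csv".

```r
# Hypothetical stand-in values; the real module1-video_reading.csv is not
# reproduced here.
example <- data.frame(
  participant   = 1:5,
  score_reading = c(52, 47, 60, 38, 55),
  hours_video   = c(2.0, 3.5, 1.0, 5.0, 1.5)
)

# Write the stand-in to a temporary CSV so that the read step runs anywhere.
csv_path <- tempfile(fileext = ".csv")
write.csv(example, csv_path, row.names = FALSE)

# Read it back the same way the real file would be read
# (assumption: comma-separated with a header row).
dat <- read.csv(csv_path)

# Check column names and types, then look at a quick numeric summary.
str(dat)
summary(dat)
```

If str() reports a single column, the file probably uses a different separator; read.csv2() or the sep argument of read.csv() would then be worth trying.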
Tasks
1. Read the data file module1-video_reading.csv into R and assign it to
   a variable called “dat”.
2. Draw a scatter plot of “hours_video” and “score_reading”.
3. Estimate a linear regression model with “score_reading” as the target
   (dependent variable) and “hours_video” as the feature
   (independent/explanatory variable) using the lm() function.
4. Redo task 3 using the mlr3verse package. Does your final model output
   (applying the summary() function to the fitted model object) differ
   from the one in task 3, which was estimated using base R
   functionality?
5. Add the regression line from task 4 to the plot from task 2.
6. Identify and exclude the outlier, and redo tasks 4 and 5 (i.e.,
   estimate the model again and add the new line to the scatter plot).
7. What reading score (“score_reading”) does the model predict for a
   participant whose video consumption (“hours_video”) is at the 95th
   percentile? (Hint: you can use the predict_newdata() method on the
   fitted model object, which behaves similarly to the predict()
   function in base R.)
8. Add the prediction from task 7 to the scatter plot from task 6.
9. Bonus: What is the effect of “hours_video” on “score_reading” in
   standardized units? How would you interpret this effect? (Hint: You
   can use a linear regression model to determine the correlation
   between “hours_video” and “score_reading”.)
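One possible way through tasks 2 to 9 is sketched below. It is not the reference solution. The data are simulated stand-ins (one implausible point is added on purpose), and the outlier rule (largest Cook's distance) is an assumption; the task may simply intend spotting the point in the scatter plot. The mlr3verse part only runs if the package is installed.

```r
# Simulated stand-in data (NOT the course data): 30 participants plus one
# deliberately implausible observation that plays the role of the outlier.
set.seed(1)
n <- 30
hours <- runif(n, min = 0, max = 6)
dat <- data.frame(
  participant   = seq_len(n + 1),
  score_reading = c(70 - 5 * hours + rnorm(n, sd = 4), 95),
  hours_video   = c(hours, 12)
)

# Tasks 2, 3, 5: scatter plot, base R fit, regression line.
plot(score_reading ~ hours_video, data = dat)
fit_all <- lm(score_reading ~ hours_video, data = dat)
abline(fit_all, lty = 2)

# Task 6 (assumed rule): drop the row with the largest Cook's distance.
outlier_row <- which.max(cooks.distance(fit_all))
dat_clean <- dat[-outlier_row, ]
fit_clean <- lm(score_reading ~ hours_video, data = dat_clean)
plot(score_reading ~ hours_video, data = dat_clean)
abline(fit_clean, col = "blue")

# Task 7: prediction at the 95th percentile of hours_video.
q95 <- quantile(dat_clean$hours_video, probs = 0.95)
new_obs <- data.frame(hours_video = unname(q95))
pred_q95 <- predict(fit_clean, newdata = new_obs)

# Task 8: mark the predicted point on the current plot.
points(new_obs$hours_video, pred_q95, pch = 19, col = "red")

# Task 9: with one feature, the slope on z-standardized variables equals
# the Pearson correlation between the two variables.
dat_std <- data.frame(
  score_z = as.numeric(scale(dat_clean$score_reading)),
  hours_z = as.numeric(scale(dat_clean$hours_video))
)
fit_std <- lm(score_z ~ hours_z, data = dat_std)
beta_std <- unname(coef(fit_std)["hours_z"])
r <- cor(dat_clean$hours_video, dat_clean$score_reading)

# Task 4 (and 5 to 7 again) with mlr3verse, skipped if it is not installed.
if (requireNamespace("mlr3verse", quietly = TRUE)) {
  library(mlr3verse)
  # Keep only target and feature so the participant ID is not used as a feature.
  task <- as_task_regr(
    dat_clean[, c("score_reading", "hours_video")],
    target = "score_reading", id = "reading"
  )
  learner <- lrn("regr.lm")
  learner$train(task)
  # learner$model holds the underlying lm object, so summary() and abline()
  # can be applied to it just as to the base R fit.
  print(summary(learner$model))
  abline(learner$model, col = "darkgreen", lty = 3)
  print(learner$predict_newdata(new_obs)$response)
}
```

Because the regr.lm learner wraps lm(), comparing summary(learner$model) with summary(fit_clean) is a direct way to approach the question in task 4.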