=====R for Statistics=====
====Data Wrangling====
* Install tidyr (part of the tidyverse): https://www.rdocumentation.org/packages/tidyr/versions/0.8.3
* Load the library and read in the data:
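library(tidyr)
getwd() # show the current working directory
setwd("path/to/data") # placeholder path: point this at the folder that holds vsl1314.csv
vsl <- read.csv("vsl1314.csv", fileEncoding = "UTF-8-BOM") # strips the byte-order mark that otherwise garbles the first column name
> vsl
year sex dom exp ra1pat ra1time ra5pat ra5time rv1pat rv1time rv5pat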
1 2014 M R 7.0 Y 41 Y 35 Y 40 Y
2 2014 M R 3.0 Y 45 Y 55 N 45 Y
3 2014 M R 4.0 Y 70 Y 45 Y 50 Y
4 2014 M R 15.0 N 90 Y 50 Y 70 Y
vsl2 <- subset(vsl, select = -c(totnum, success, rate)) # drop the summary statistics
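The histogram that follows uses a data frame called ''vsltime'', whose construction is not recorded on this page. One plausible route, sketched here purely as an assumption, is to gather the per-task time columns into a single ''time'' column with tidyr. The toy data frame ''vsl_demo'' and the key name ''task'' are hypothetical; only the column names ''ra1time'', ''ra5time'' and ''rv1time'' come from the table above.

```r
library(tidyr)

# Hypothetical toy data using the same column naming as vsl (values are made up).
vsl_demo <- data.frame(
  exp     = c(7, 3, 4),
  ra1time = c(41, 45, 70),
  ra5time = c(35, 55, 45),
  rv1time = c(40, 45, 50)
)

# Gather the three wide time columns into one long "time" column;
# "task" records which original column each value came from.
vsltime_demo <- gather(vsl_demo, key = "task", value = "time",
                       ra1time, ra5time, rv1time)

nrow(vsltime_demo)  # 3 rows x 3 time columns = 9 observations
head(vsltime_demo)
```

With tidyr 1.0 or later, ''pivot_longer()'' is the recommended replacement for ''gather()''; ''gather()'' is used here because the notes link to tidyr 0.8.3.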
hist(vsltime$time, main="Anastomosis time (all) 2013-2014 n=512")
{{:pe:vsl_success.png?200|}}
Histogram to check the shape of the distribution: it looks right-skewed (long tail towards longer times).
> shapiro.test(vsltime$time)
Shapiro-Wilk normality test
data: vsltime$time
W = 0.91157, p-value = 1.388e-14
Shapiro-Wilk test for normality: the p-value is far below 0.05, so the null hypothesis of a normal distribution is rejected.
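To use the test result in a script rather than reading it off the console, the p-value can be pulled out of the object that ''shapiro.test'' returns, and a Q-Q plot gives a visual check. A minimal sketch on simulated right-skewed data (the vector ''x'' is made up and merely stands in for ''vsltime$time''):

```r
set.seed(1)
x <- rexp(200, rate = 1 / 50)  # simulated right-skewed "times", not the real data

res <- shapiro.test(x)
res$statistic                  # the W statistic
res$p.value                    # a small p-value is evidence against normality
reject_normality <- res$p.value < 0.05

qqnorm(x)                      # sample quantiles against normal quantiles
qqline(x)                      # points bending away from this line indicate skew
```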
> hist(vsl$success)
{{:pe:vsl_success.png?200|}}
> shapiro.test(vsl$success)
Shapiro-Wilk normality test
data: vsl$success
W = 0.9117, p-value = 0.0002277
Some other analyses:
> plot(vsl$exp,vsl$rate,main="Anastomosis success rate vs experience (in years)")
> plot(vsl$exp,vsl$totnum)
> plot(vsl$exp,((vsl$totnum/8)+(vsl$rate))/2)
{{:pe:successrate_number_vs_exp.png?400|}}
The plots suggest that the first few years of experience make little difference to performance, whereas many years of experience do (possibly because of self-selection or earlier training). //Some form of correlation analysis might be helpful here.//
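Because the Shapiro-Wilk results above indicate non-normal data, a rank-based correlation is one reasonable choice for the suggested analysis. The sketch below uses Spearman's rho via ''cor.test'' on made-up numbers; with the real data the call would presumably take ''vsl$exp'' and ''vsl$rate'' instead of the hypothetical vectors ''exp_years'' and ''rate_demo''.

```r
# Made-up example values standing in for vsl$exp and vsl$rate.
exp_years <- c(1, 2, 3, 3, 4, 5, 7, 10, 15, 20)
rate_demo <- c(0.50, 0.67, 0.60, 0.75, 0.67, 0.80, 0.83, 0.88, 1.00, 1.00)

# Spearman rank correlation; exact = FALSE avoids the warning about ties.
ct <- cor.test(exp_years, rate_demo, method = "spearman", exact = FALSE)
ct$estimate  # rho, between -1 and 1
ct$p.value   # p-value for the null hypothesis rho = 0
```

Spearman's rho only measures monotonic association, so it would not by itself separate "no effect in the first few years" from "a large effect later"; plotting the data alongside the test remains useful.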
===Spread===
> stem(((vsl$totnum/8)+(vsl$rate))/2)
The decimal point is 1 digit(s) to the left of the |
2 | 5
3 |
4 |
5 |
6 | 1333
7 | 1111111111112559999999999
8 | 1111177777888888888888888
9 | 44444444
10 | 0
> stem(vsl$rate)
The decimal point is 1 digit(s) to the left of the |
0 | 0
2 |
4 | 00007
6 | 07777771111155555
8 | 0000003333366666888
10 | 0000000000000000000000
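Stem-and-leaf plots show spread visually; for skewed data the same information can be summarised numerically with the median and interquartile range. A short sketch on made-up rates (the vector ''rate_demo'' is hypothetical); ''scale'' is an argument of ''stem'' that stretches the plot over more lines.

```r
# Made-up success rates, not the real vsl$rate values.
rate_demo <- c(0, 0.40, 0.47, 0.60, 0.67, 0.75, 0.80, 0.83, 0.88, 1, 1, 1)

stem(rate_demo, scale = 2)  # scale = 2 makes the plot roughly twice as long
summary(rate_demo)          # min, quartiles, median, mean, max
IQR(rate_demo)              # interquartile range, robust to outliers
mad(rate_demo)              # median absolute deviation
```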
See also [[r:20190423|Combined 13-14]]
====Sources====
* Normality tests in R: http://www.sthda.com/english/wiki/normality-test-in-r
* hist documentation: https://www.rdocumentation.org/packages/graphics/versions/3.5.3/topics/hist
* Using subset to drop columns: https://www.listendata.com/2015/06/r-keep-drop-columns-from-data-frame.html
* tidyr unite documentation: https://www.rdocumentation.org/packages/tidyr/versions/0.8.3/topics/unite
* Importing data into a data frame: http://www.r-tutor.com/r-introduction/data-frame/data-import
* tidyr package documentation: https://www.rdocumentation.org/packages/tidyr/versions/0.8.3
* tidyr tutorial: https://uc-r.github.io/tidyr