===== R for Statistics =====

==== Data Wrangling ====

  * Install tidyverse (tidyr documentation: https://www.rdocumentation.org/packages/tidyr/versions/0.8.3)
  * Load the library, set the working directory, and read the data:

  library(tidyr)
  getwd()
  setwd()   # pass the directory that contains vsl1314.csv
  # fileEncoding = "UTF-8-BOM" strips the byte-order mark that otherwise
  # turns the first column name into "ï..year"
  vsl <- read.csv("vsl1314.csv", fileEncoding = "UTF-8-BOM")

  > vsl
    year sex dom  exp ra1pat ra1time ra5pat ra5time rv1pat rv1time rv5pat
  1 2014   M   R  7.0      Y      41      Y      35      Y      40      Y
  2 2014   M   R  3.0      Y      45      Y      55      N      45      Y
  3 2014   M   R  4.0      Y      70      Y      45      Y      50      Y
  4 2014   M   R 15.0      N      90      Y      50      Y      70      Y

  # drop the summary statistics columns
  vsl2 <- subset(vsl, select = -c(totnum, success, rate))

  hist(vsltime$time, main="Anastomosis time (all) 2013-2014 n=512")

{{:pe:vsl_success.png?200|}}

Histogram to check the shape of the distribution: it looks skewed to the right.

  > shapiro.test(vsltime$time)
          Shapiro-Wilk normality test
  data:  vsltime$time
  W = 0.91157, p-value = 1.388e-14

Shapiro-Wilk test for normality: the p-value is far below 0.05, so the anastomosis times are not normally distributed.

  > hist(vsl$success)

{{:pe:vsl_success.png?200|}}

  > shapiro.test(vsl$success)
          Shapiro-Wilk normality test
  data:  vsl$success
  W = 0.9117, p-value = 0.0002277

Here too the p-value is below 0.05, so ''success'' is not normally distributed either.

Some other analyses:

  > plot(vsl$exp,vsl$rate,main="Anastomosis success rate vs experience (in years)")
  > plot(vsl$exp,vsl$totnum)
  > plot(vsl$exp,((vsl$totnum/8)+(vsl$rate))/2)

{{:pe:successrate_number_vs_exp.png?400|}}

The plots suggest that the first few years of experience make little difference to performance, whereas many years of experience do (possibly self-selection, or some training already received).
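One way to put a number on the experience/performance relationship is a rank correlation. The sketch below uses Spearman's rho, chosen here because the Shapiro-Wilk results above argue against assuming normality. The data frame ''demo'' and its values are hypothetical placeholders so the snippet runs on its own; they are not the vsl1314.csv data. Only the column names (''exp'', ''totnum'', ''rate'') and the composite score formula are taken from the plots above.

```r
# Placeholder values for illustration only; NOT the real vsl1314.csv data.
demo <- data.frame(
  exp    = c(1, 2, 3, 4, 7, 15, 20),
  totnum = c(4, 6, 5, 6, 7, 8, 8),
  rate   = c(0.50, 0.67, 0.60, 0.67, 0.86, 1.00, 1.00)
)

# Same composite score as in the third plot above.
demo$score <- ((demo$totnum / 8) + demo$rate) / 2

# Spearman's rank correlation does not assume normally distributed data.
# exact = FALSE requests the asymptotic p-value, which avoids a warning
# when the data contain ties.
res <- cor.test(demo$exp, demo$score, method = "spearman", exact = FALSE)

print(res$estimate)  # Spearman's rho
print(res$p.value)   # p-value for the null hypothesis rho = 0
```

With the real data, ''demo'' would be replaced by ''vsl''. A rank correlation only captures a monotonic trend, so it would not by itself show whether the effect appears only after many years of experience.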
//Some form of correlation analysis might be helpful here.//

=== Spread ===

  > stem(((vsl$totnum/8)+(vsl$rate))/2)
    The decimal point is 1 digit(s) to the left of the |
     2 | 5
     3 |
     4 |
     5 |
     6 | 1333
     7 | 1111111111112559999999999
     8 | 1111177777888888888888888
     9 | 44444444
    10 | 0

  > stem(vsl$rate)
    The decimal point is 1 digit(s) to the left of the |
     0 | 0
     2 |
     4 | 00007
     6 | 07777771111155555
     8 | 0000003333366666888
    10 | 0000000000000000000000

See also [[r:20190423|Combined 13-14]]

==== Sources ====

  * http://www.sthda.com/english/wiki/normality-test-in-r
  * https://www.rdocumentation.org/packages/graphics/versions/3.5.3/topics/hist
  * Using subset to drop columns: https://www.listendata.com/2015/06/r-keep-drop-columns-from-data-frame.html
  * https://www.rdocumentation.org/packages/tidyr/versions/0.8.3/topics/unite
  * http://www.r-tutor.com/r-introduction/data-frame/data-import
  * https://www.rdocumentation.org/packages/tidyr/versions/0.8.3
  * https://uc-r.github.io/tidyr