class: center, middle, inverse, title-slide

# Research Methods: Open Science and Reproducible Research in Linguistics
## Registered reports<br>Communicating/sharing I: RMarkdown
### Joseph V. Casillas, PhD
### Rutgers University<br>Spring 2019<br>Last update: 2019-02-07

---
background-image: url(./assets/img/psych_fail.png)

---
background-image: url(https://www.sbs.com.au/guide/sites/sbs.com.au.guide/files/styles/body_image/public/rocky.jpg?itok=D1SCRjAe&mtime=1528594528)
background-size: contain
background-color: black

---
background-image: url(https://static01.nyt.com/images/2016/08/05/us/05onfire1_xp/05onfire1_xp-superJumbo-v2.jpg?quality=90&auto=webp)
background-size: contain

---
class: center, middle

# The current model

--

Generate and specify hypotheses

--

⬇︎ Design study

--

⬇︎ Conduct study and collect data

--

⬇︎ Analyze data and test hypotheses

--

⬇︎ Interpret results

--

⬇︎ Publishing pipeline

---
background-image: url(https://cdn.cos.io/media/images/Hypothetico-deductive_scientific_method-1.original.png)
background-size: contain

---
background-image: url(./assets/img/publish_pipeline.png)
background-size: contain

# Publishing pipeline

---
class: title-slide-section-grey, middle, center

# Attempts at reform
### .lightgrey[(the comeback)]

--

<iframe src="https://giphy.com/embed/YKkV9Yq25RUGc" width="480" height="270" frameBorder="0" class="giphy-embed" allowFullScreen></iframe>

---

# Attempts at reform
### Meta-analysis

.pull-left[
- Statistical approach
- Aggregation of results from many studies
- Inferences based on larger and potentially more diverse samples
- Attempt to increase power (over individual studies)
- Improve estimates of the size of the effect
- Resolve uncertainty when reports disagree
- Promotes collaboration among scientists, and incentivizes more systematic research programs
]

--

.pull-right[
<iframe src="https://giphy.com/embed/d2ZfqZY5eSCR0rza" width="480" height="270" frameBorder="0" class="giphy-embed" allowFullScreen></iframe>
]

---

# Attempts at reform
### Meta-analysis

- Cannot fix problems of p-hacking, reporting errors, and fraud
- Some researchers believe it dramatically
exacerbates them
- May increase type I error
- Researcher is "further" from data, difficult to judge quality (especially when attempting to include unpublished studies)
- The popularity of meta-analysis has served to emphasize the size of effects and by thus raising the consciousness of behavioral scientists has promoted the cause of power analysis
- If we didn't have the problems we currently have, this approach would be more productive

--

... we will see more meta-analyses in the future

---
background-image: url(./assets/img/ma1.png), url(./assets/img/ma2.png)
background-position: 5% 50%, 95% 50%
background-size: 500px, 600px

---
background-image: url(./assets/img/ma3.png), url(./assets/img/ma4.png)
background-position: 5% 50%, 95% 50%
background-size: 550px, 550px

---
background-image: url(./assets/img/ma5.png), url(./assets/img/ma6.png)
background-position: 5% 50%, 95% 50%
background-size: 500px, 600px

---

# Attempts at reform
### p-bashing
#### Old statistics critiques

.pull-left[
- "a meaningless ordeal of pedantic computations"<sup>1</sup>
- "the test of significance has been carrying too much of the burden of scientific inference"<sup>2</sup>
- "A potent but sterile intellectual rake who leaves in his merry path a long train of ravished maidens but no viable scientific offspring"<sup>3</sup>
]

.pull-right[
- "perhaps the least important attribute of a good experiment"<sup>4</sup>
- "that .grey[a great deal of mischief has been associated] with the test of significance[...]
is .grey[what everybody knows]"<sup>2</sup>
- "one of the worst things that ever happened in the history of psychology"<sup>5</sup>
]

.footnote[
<sup>1</sup>Stevens (1960), <sup>2</sup>Bakan (1966), <sup>3</sup>Meehl (1967), <sup>4</sup>Lykken (1968), <sup>5</sup>Meehl (1978)
]

---

# Attempts at reform
### p-bashing
#### New(ish) statistics critiques<sup>1</sup>

.footnote[
<sup>[1](https://www.jstor.org/stable/pdf/20182143.pdf?refreqid=excelsior%3Aaa46f47b6cb642339d0f407c02d3070c)</sup>Cohen (1994), <sup>[2](https://psyarxiv.com/mky9j/)</sup>Benjamin et al. (2018), <sup>[3](https://psyarxiv.com/9s3y6)</sup>Lakens et al. (2018), <sup>4</sup>Cumming (2014), <sup>5</sup>Kruschke (2013), <sup>6</sup>Wagenmakers et al. (2011)
]

.pull-left[
- "Because NHST p-values have become the coin of the realm in much of psychology, they have served to inhibit its development as a science."
- Lower alpha to 0.005<sup>2</sup>
- Justify your alpha<sup>3</sup>
- Use point estimation and confidence intervals instead of p-values<sup>4</sup>
- Bayesian estimation instead of p-values<sup>5</sup>
- Bayes factors over p-values<sup>6</sup>
]

--

.pull-right[
<iframe src="https://gifer.com/embed/qA7" width=480 height=270.000 frameBorder="0" allowFullScreen></iframe>
]

---
background-image: url(./assets/img/prr.png)
background-size: contain

https://osf.io/2dxu5/

---

# Attempts at reform
### Registered reports

.pull-left[
- Started in medicine
- Current trend in psych
- Coming to ling (Language Learning, Language and Speech)
- Some journals have started offering badges
]

.pull-right[
<iframe src="https://giphy.com/embed/1iTH1WIUjM0VATSw" width="480" height="270" frameBorder="0" class="giphy-embed" allowFullScreen></iframe>
]

---
background-image: url(./assets/img/roettger_2019_00.png)
background-size: contain

.footnote[Roettger (2019)]

---
class: center, middle

# What is a registered report? <br>What is it designed to do?

---

# Attempts at reform
### Registered reports

- Publish your experimental design first
- Receive open peer review based on the theoretical grounds and methods
- Reviewers suggest amendments that can still be incorporated before the study is run
- The peer review process grants In Principle Acceptance (IPA)
- Only then carry out the experiment, analyze data, finish manuscript
- Resubmit for second peer review
- Publish results, regardless of the outcome
- Goal: reduce the number of papers reporting statistically significant results that are actually false positives

---
background-image: url(https://cdn.cos.io/media/images/registered_reports.width-800.png)
background-size: contain

# New publishing pipeline

---

# Attempts at reform
### Unreviewed pre-registration

- A second type of pre-registration
- Does not involve reviewers before data collection
- Authors write plan and it is time-stamped before conducting the study
- In theory the process is similar to the standard model, but one can be (more) confident that there is no HARKing/p-hacking
- The two models (registered report vs. unreviewed pre-registration) are not mutually exclusive
- They can have different priorities in the research cycle (i.e., exploratory vs. confirmatory research)

---
background-image: url(./assets/img/reg_report_template.png)
background-size: contain

---

# Attempts at reform
### Pre-registration

#### Negatives

- More work?
- Too restrictive?
- Idea theft?

#### Limitations

- Flexibility (sometimes we think of things post facto)?
- Fraud (multiple pre-registrations?)?
- Irrelevant for certain types of research?

---

# Attempts at reform
### Registered reports
#### How?
Online platforms for pre-registration

- the Open Science Framework (OSF)
- aspredicted.org

---
background-image: url(./assets/img/osf.png)
background-size: 900px

.footnote[https://cos.io/rr/]

---

<iframe src="https://www.aspredicted.org" style="border:none;" height="600" width="100%"></iframe>

---
class: title-slide-section-grey, center, middle

# **Where are we now?**

---
background-image: url(./assets/img/roettger_2019_01.png)
background-size: contain

.footnote[Roettger (2019)]

---
background-image: url(./assets/img/roettger_2019_02.png)
background-size: contain

.footnote[Roettger (2019)]

---
background-image: url(./assets/img/roettger_2019_03.png)
background-size: contain

.footnote[Roettger (2019)]

---
class: title-slide-section-grey, center, middle

# **Where are we now?**

--
background-image: url(https://justseriesandstuff.files.wordpress.com/2015/07/618_movies_rocky_10.jpg)
background-size: contain

---

# Where are we now?

- Registered reports are becoming the norm in Psychology
- Linguistics slow to follow

--

<iframe src="https://docs.google.com/spreadsheets/d/17dLaqKXcjyWk1thG8y5C3_fHXXNEqQMcGWDY62BOc0Q/edit#gid=1374958043" style="border:none;" height="400" width="100%"></iframe>

---
class: title-slide-section-grey, center, middle

# **Where are we heading?**

--
background-image: url(https://www.telegraph.co.uk/content/dam/films/2018/11/23/rocky_trans_NvBQzQNjv4Bq04zWM7lESoHlZcET6IbVrgsXXAS7_VrfHdozeI5gQBU.PNG?imwidth=1400)
background-size: contain

---

# Where are we heading?

### What does it mean for the field?

- Researchers have to adapt, learn new methods of open science
- Journals have to adapt, adjust model of publishing

--

### What does it mean for you?

- Registered dissertation experiments?
- Published replications before graduation?
- Increased knowledge of coding?
- Increased sharing of materials (code, data, stimuli)?

---
class: title-slide-section-grey, middle

.big[
<ru-blockquote>
That said, I have two main reservations about the manuscript.
First, the potential value of this manuscript to serve as a fully-worked-out example of GAMMs or Bayesian analysis is limited by the unavailability of the raw data and R code. I actually think the primary value of this manuscript is its demonstration of these statistical techniques, and so if the authors are unable or unwilling to make their raw data and R code available, I cannot recommend this manuscript for publication.
</ru-blockquote>
]

---
class: title-slide-section-red
background-image: url(https://cdn-images-1.medium.com/max/1600/1*gYQhlM7v6GyRuxaL8JtPIQ.png), url(https://upload.wikimedia.org/wikipedia/commons/thumb/4/48/Markdown-mark.svg/2000px-Markdown-mark.svg.png), url(https://www.rstudio.com/wp-content/uploads/2017/05/rmarkdown.png), url(https://upload.wikimedia.org/wikipedia/commons/thumb/7/7d/Tab_plus.svg/2000px-Tab_plus.svg.png), url(https://upload.wikimedia.org/wikipedia/commons/thumb/c/cf/Kennzeichnung_für_Äquivalenzglied.svg/2000px-Kennzeichnung_für_Äquivalenzglied.svg.png)
background-position: 5% 60%, 43% 60%, 95% 60%, 27% 60%, 62% 63%
background-size: 250px, 250px, 375px, 100px, 175px

# Communicating/sharing I

---

# What is markdown?

- Markdown is a language used to format text
- Rather than click a button to format (like in Word), you use markdown syntax
- Lightweight markup language (like HTML but simpler)
- Easy to read and write because it uses simple tags (e.g. #)

--

.pull-left[
```
## This is a subsection header

This is **bold** text.

This is *italic* text.

- This is
- a list

1. This is a
2. numbered list
```
]

--

.pull-right[
## This is a subsection header

This is **bold** text.

This is *italic* text.

- This is
- a list

1. This is a
2. numbered list
]

---

# Exercise I

- Open RStudio
- File > New file > RMarkdown (then click "ok")
- Select all (cmd + a) and delete everything
- Type "hello world"
- Click "knit" (You will be asked to save.
Save the file to your desktop)

--

- Try to add the following:
  - a section header
  - bold text
  - an ordered list
  - an unordered list
  - a link to your favorite website

---
background-image: url(https://learn.r-journalism.com/publishing/rmarkdown/images/rmdfiles.png)
background-position: 95% 50%

# What is R Markdown?

- An authoring format that combines markdown syntax and R code (R + markdown)

--

- An RMarkdown file consists of 3 components...
  - front matter
  - plain text
  - R code

--

- How does it work?

---
background-image: url(https://cdn-images-1.medium.com/max/1600/1*gYQhlM7v6GyRuxaL8JtPIQ.png), url(https://upload.wikimedia.org/wikipedia/commons/thumb/4/48/Markdown-mark.svg/2000px-Markdown-mark.svg.png), url(https://www.rstudio.com/wp-content/uploads/2017/05/rmarkdown.png), url(https://upload.wikimedia.org/wikipedia/commons/thumb/7/7d/Tab_plus.svg/2000px-Tab_plus.svg.png), url(https://upload.wikimedia.org/wikipedia/commons/thumb/c/cf/Kennzeichnung_für_Äquivalenzglied.svg/2000px-Kennzeichnung_für_Äquivalenzglied.svg.png), url(https://www.rstudio.com/wp-content/uploads/2014/04/knitr-200x232.png)
background-position: 5% 30%, 43% 30%, 95% 10%, 27% 30%, 62% 23%, 30% 70%
background-size: 250px, 250px, 375px, 100px, 175px, 150px

--

.footnote[.big[`knitr` is used to 'knit' R code into the markdown text file]]

---

# What is R Markdown?

- An authoring format that combines markdown syntax and R code (R + markdown)
- An RMarkdown file consists of 3 components...
  - front matter
  - plain text
  - R code
- How does it work?
- **What can it do**?

---

# Exercise II

- Open RStudio
- File > New file > RMarkdown (then click "ok")
- Take a look at the text. What markup do you see?
- Click "knit" (You will be asked to save. Save the file to your desktop)
- What do you see? What section represents the `front matter`? How can you distinguish plain markdown text from R code?

--

- Create a new `knitr` code chunk, add the following and click `knit`: `x <- 5; 2 * x`

--

- Create a new `knitr` code chunk and add the following (note: you may have to install the package):

.pull-left[
```
library(tidyverse)  # provides %>% and ggplot2

# Scatterplot of rear axle ratio (drat) against mpg, with a smoother
mtcars %>%
  ggplot(aes(x = drat, y = mpg)) +
  geom_point() +
  geom_smooth()
```
]

--

.pull-right[
<img src="index_files/figure-html/unnamed-chunk-2-1.png" width="504" />
]

---
class: center, middle

# Why use it?

### An RMarkdown file is a **dynamic document** that is fully reproducible
### It can be regenerated automatically whenever the R code or data changes
### It allows you to easily share your results

---
class: middle
background-image: url(./assets/img/rmd_01.png)
background-size: contain
background-position: 100% 50%

.pull-left[
.big[
RMarkdown allows you to write simple text documents that can be converted to many different output formats

- HTML
- PDF
- Word
- HTML5 slides
- websites/blogs
- .grey[Beamer]
- .grey[Tufte handouts]
- .grey[Books]
- .grey[dashboards]
]
]

---

# Exercise III

- Open the GitHub Desktop app
- You should still have the `github_practice` repo
- Pull in the newest changes (click 'pull')
- **If you don't have the `github_practice` repo, go to github.com, search for `jvcasillas`, search for the `github_practice` repo, fork it again, and clone it to your desktop**.

--

- Open the `rmarkdown_ex` folder and double click "rmarkdown_ex.Rproj"

--

- Find the "Files" tab in one of the 4 window panes, click on `ex3.Rmd`
- Inspect the file, notice the front matter and the code chunks.
- Click 'knit'

--

- Change the front matter from what you see on the left to what you see on the right and click `knit`:

.pull-left[
```r
---
title: "More complex RMarkdown example"
author: "Joseph Casillas"
date: "`r Sys.Date()`"
output:
  html_document:
    highlight: kate
    number_sections: yes
    theme: spacelab
    toc: true
    toc_float:
      toc_collapsed: true
---
```
]

.pull-right[
```r
---
title: "More complex RMarkdown example"
author: "Joseph Casillas"
date: "`r Sys.Date()`"
output: word_document
---
```
]

---
background-image: url(https://www.r-project.org/Rlogo.png), url(../assets/img/prohibited.png), url(https://www.mcdwayne.com/wp-content/uploads/2018/05/I-love-markdown-syntax-language.png)
background-size: 200px, 350px, contain
background-position: 0% 80%, 66% 26%, 60% 50%
background-color: #e6e6e6

---
class: title-slide-final, middle
background-image: url(https://github.com/jvcasillas/ru_xaringan/raw/master/img/logo/ru_shield.png), url(https://www.r-project.org/Rlogo.png)
background-size: 55px, 100px
background-position: 9% 15%, 89% 15%

# Getting help

## If you have problems using RMarkdown (or GitHub)
## ask for help in the Slack channel

### You can find some very basic tutorials related to
### R, RStudio, RMarkdown, GitHub, and Slack [here][here]

[here]: http://www.jvcasillas.com/ru_teaching/ru_spanish_589/589_01_s2018/sources/tuts/index.html
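
---

# Knitting from the console

A minimal sketch of roughly what clicking "knit" does, written as plain R. It builds a tiny `.Rmd` file with the three components from the exercises (front matter, plain text, R code). If the `rmarkdown` package and pandoc are installed, it then renders the file to HTML and to Word. The file name `hello.Rmd`, its contents, and the temporary folder are illustrative assumptions, not part of the course materials.

```r
# Hypothetical example file: front matter, plain text, and one R chunk.
fence <- strrep("`", 3)  # three backticks, built here so this block stays readable

rmd_lines <- c(
  "---",
  "title: \"Hello RMarkdown\"",
  "output: html_document",
  "---",
  "",
  "## A subsection header",
  "",
  "This is **bold** text.",
  "",
  paste0(fence, "{r}"),
  "x <- 5",
  "2 * x",
  fence
)

# Write the file to a temporary folder (an assumption; any folder works).
rmd_path <- file.path(tempdir(), "hello.Rmd")
writeLines(rmd_lines, rmd_path)

# Render only if the tools are available; otherwise just report the path.
can_render <- requireNamespace("rmarkdown", quietly = TRUE) &&
  rmarkdown::pandoc_available()

if (can_render) {
  html_file <- rmarkdown::render(rmd_path, output_format = "html_document", quiet = TRUE)
  word_file <- rmarkdown::render(rmd_path, output_format = "word_document", quiet = TRUE)
  message("Created: ", html_file, " and ", word_file)
} else {
  message("rmarkdown or pandoc not found; the source file is at ", rmd_path)
}
```

Changing the `output_format` argument plays the same role as editing the `output:` field of the front matter by hand.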