4 Navigating the Software
Introduction
Both R and RStudio are big chunks of software, first and foremost. You will inevitably spend time doing what one does with any big piece of software: configuring it, customizing it, updating it, and fitting it into your computing environment. This chapter will help you perform those tasks. There is nothing here about numerics, statistics, or graphics. This is all about dealing with R and RStudio as software.
Download and load packages
if (!require("pacman")) install.packages("pacman")
pacman::p_load(tidyverse) # All purpose wrangling for dataframes
4.1 Getting and Setting the Working Directory
Your computer is a maze of folders and files. Outside of R, when you want to open a specific file, you probably open up an explorer window that allows you to visually search through the folders on your computer. Or, maybe you select recent files, or type the name of the file in a search box to let your computer do the searching for you. While this system usually works for non-programming tasks, it is a no-go for R. Why? Well, the main problem is that all of these methods require you to visually scan your folders and move your mouse to select folders and files that match what you are looking for. When you are programming in R, you need to specify all steps in your analyses in a way that can be easily replicated by others and your future self. This means you can’t just say: “Find this one file I emailed to myself a week ago” or “Look for a file that looks something like experimentAversion3.txt
.” Instead, you need to be able to write R code that tells R exactly where to find critical files – either on your computer or on the web.
To make this job easier, R uses working directories. A working directory is where everything starts and ends. Your working directory is important because it is the default location for all file input and output—including reading and writing data files, opening and saving script files, and saving your workspace image. Many of you who previously worked with SPSS using the point-click interface, would often wonder 1) where did my saved SPSS file went? or 2) where did my exported image go to? This confusion arises because often you do not know what was the default working directory. Rather than relying on the default, we specify it explicitly so we know where to store our files, and where this software goes looking for files.
The easiest and recommended way to set your working directory is using RStudio projects. For every piece of work/assessment, you create a project. The project is a folder that can live anywhere on your computer - your desktop, downloads folder, documents folder, etc. In this folder, it contains everything from your files to be analyzed, codes, and exported files and images. Everything is self-contained, there is no confusion.
4.2 Creating a new Rstudio project
You want to create a new RStudio project to keep all your files related to a specific project. Click File → New Project as in Figure 4.2. ** I ALWAYS use this approach, please use it too**
This will open the New Project dialog box and allow you to choose which type of project you would like to create, as shown in Figure 4.3.
Projects are a powerful concept that’s specific to RStudio. They help you by doing the following:
- Setting your working directory to the project directory.
- Preserving window state in RStudio so when you return to a project your windows are all as you left them. This includes opening any files you had open when you last saved your project.
- Preserving RStudio project settings.
To hold your project settings, RStudio creates a project file with an .Rproj extension in the project directory. If you open the project file in RStudio, it works like a shortcut for opening the project. In addition, RStudio creates a hidden directory named .Rproj.user to house temporary files related to your project.
Any time you’re working on something nontrivial in R we recommend creating an RStudio project. Projects help you stay organized and make your project workflow easier.
4.3 Installing Packages
When you download and install R for the first time, you are installing the Base R software. Base R will contain most of the functions you’ll use on a daily basis like mean()
and hist()
. However, only functions written by the original authors of the R language will appear here. If you want to access data and code written by other people, you’ll need to install it as a package. An R package is simply a bunch of data, from functions, to help menus, to vignettes (examples), stored in one neat package.
A package is like a light bulb. In order to use it, you first need to order it to your house (i.e.; your computer) by installing it. Once you’ve installed a package, you never need to install it again. However, every time you want to actually use the package, you need to turn it on by loading it. Here’s how to do it.
4.3.1 Installing a new package
Installing a package simply means downloading the package code onto your personal computer. There are two main ways to install new packages. The first, and most common, method is to download them from the Comprehensive R Archive Network (CRAN). CRAN is the central repository for R packages. To install a new R package from CRAN, you can simply run the code install.packages("name")
, where “name” is the name of the package. For example, to download the tidyverse
package, which contains several functions we will use in this book, you should run the following:
# Install the tidyverse package from CRAN
# You only need to install a package once!
install.packages("tidyverse")
When you run install.packages("name")
R will download the package from CRAN. If everything works, you should see some information about where the package is being downloaded from, in addition to a progress bar.
Like ordering a light bulb, once you’ve installed a package on your computer you never need to install it again (unless you want to try to install a new version of the package). However, every time you want to use it, you need to turn it on by loading it.
When downloading some packages for the first time, you may receive two prompts below. For now, the simplest options are to click No for both. Downloading and installation should progress till the end.
4.3.2 Loading a package
Once you’ve installed a package, it’s on your computer. However, just because it’s on your computer doesn’t mean R is ready to use it. If you want to use something, like a function or dataset, from a package you always need to load the package in your R session first. Just like a light bulb, you need to turn it on to use it!
To load a package, you use the library()
function. For example, now that we’ve installed the tidyverse
package, we can load it with library("tidyverse")
:
# Load the tidyverse package so I can use it!
# You have to load a package in every new R session!
library("tidyverse")
Now that you’ve loaded the tidyverse
package, you can use any of its functions! Let us create a very simple histogram plot using a default dataset found within R. Don’t worry about the specifics of the code below, you’ll learn more about how all this works later. For now, just run the code and marvel at your plot in TWO LINES.
# Make a pirateplot using the pirateplot() function
# from the yarrr package!
ggplot(mtcars,aes(x=mpg)) +
geom_histogram(binwidth=5)
4.3.3 A simple approach to package
For novices, the pacman
package can be used. All you need to do is to type in the name of the package in the function pacman::p_load()
. In the example below, I want pacman
to load the package tidverse
– notice how ""
are not used. If tidverse
is not found in your computer, pacman
will download it first, than automatically load it. I will use this command from now on when loading packages.
if (!require("pacman")) install.packages("pacman")
pacman::p_load(tidyverse) # All purpose wrangling for dataframes
4.4 Learning check
Create a folder called
se201
on your computer desktop.Create a new project inside the folder
se201
, click File – New Project – New Directory – New Project – Browse, search forse201
folder – under Directory name, typepractice
.Create a new R script, click File – New File – R Script.
Save the new R script, click File – Save As. Save it inside the se201/practice folder. Use the file name
practice_script
. It will have the extension.R
assigned to it automatically.Enter the code below, run it, to install and load the packages.
if (!require("pacman")) install.packages("pacman")
pacman::p_load(tidyverse, # All purpose wrangling for dataframes
openxlsx) # moving average for vo2
Close RStudio, reopen RStudio via the
.Rproj
symbol Figure 1.3. In the Files tab on the bottom right, you should see the script you createdpractice_script.R
. Click on it to open and you should see the code you typed. Run the codes you typed above again.Download the solution to this learning check below.