D Appendix: Loading External Data

In this appendix, we’ll cover the basics of how to load external data into R.

D.1 Working with External Data

Often, when we want to analyze data, we need to get it from an external source. As we’ve seen, R has a few built-in data sets, and more are available in various libraries (more on that later). Usually, though, we want to work with our own data, and to do this we’ll need to pull it into R from somewhere on our computer. One common way to do this is to load a CSV file.

This appendix answers two questions in particular:

  • What is a CSV file?
  • What is a working directory?

D.2 CSV Files and your Working Directory

CSV stands for “comma separated values”, and it’s a common format for storing data in files. In particular, if we have data in Excel, it’s easy to save it as a CSV file. Common attributes of CSV files include:

  • Each row is a different line
  • Each cell or element in a row is separated by a comma
  • There may or may not be a header row.

Figure D.1: The format of CSV files.
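To make those attributes concrete, here is a minimal sketch that writes a tiny CSV file to a temporary location and reads it back. The column names and values are made up for illustration (they are not taken from the wildfire data), and `tempfile()` is used only so the example does not depend on your working directory.

```r
# Hypothetical data: one header row followed by three data rows.
csv_lines <- c(
  "name,year,acres",
  "Fire A,2019,120.5",
  "Fire B,2020,80",
  "Fire C,2021,300.25"
)

# Write the lines to a temporary file so the example is self-contained.
csv_path <- tempfile(fileext = ".csv")
writeLines(csv_lines, csv_path)

# By default read.csv() treats the first line as a header row.
fires <- read.csv(csv_path)
str(fires)

# If the file had no header row, we would say so explicitly.
# Here the header line is read as data, so we get one extra row.
fires_no_header <- read.csv(csv_path, header = FALSE)
nrow(fires_no_header)
```

Opening the temporary file in a text editor would show exactly the four comma-separated lines above, one row per line.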

Let’s assume we have our file in CSV format stored on our computer somewhere. But that exact somewhere is pretty important. For R to be able to load it, R needs to know where exactly it is.

Your working directory is the directory or folder on your computer that R is currently “looking at”. You can see this with the following command:

getwd()
## [1] "C:/Users/bloosmore/OneDrive - Eastside Preparatory School/Desktop/Statistics/Lecture Notes/fall"

(Yours will certainly be different.)

Now, to load a file using only its name, the file must be located in that exact directory. In that case, we can simply call

WA_wildfire <- read.csv("WA-wildfires.csv")

where we enclose the file name in quotation marks. The WA_wildfire data frame is now available for us to use as if it were a built-in object.

So the key steps here are to:

  • figure out where your current working directory is using the getwd() function
  • put the file in the same directory
  • load the file using the read.csv() function
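The steps above can be sketched as a short script. This is one possible way to do it, not the only one: the `file.exists()` check and the fallback message are additions for illustration, so that the script tells you what went wrong instead of stopping with an error when the file is somewhere else.

```r
# Step 1: see which directory R is currently looking at.
getwd()

# Step 2: check that the file really is in that directory.
file_name <- "WA-wildfires.csv"

if (file.exists(file_name)) {
  # Step 3: load the file and take a quick look at its structure.
  WA_wildfire <- read.csv(file_name)
  str(WA_wildfire)
} else {
  # The file is not here; list the CSV files that R can see instead.
  message("Could not find ", file_name, " in ", getwd())
  print(list.files(pattern = "\\.csv$"))
}
```

If the message appears, either move the file into the directory shown or change the working directory.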

D.3 Changing your Working Directory

It is possible to change your working directory. You can do this either with the setwd() command or with the Session -> Set Working Directory menu command in RStudio.

Note: Be careful about whether you are doing this in an R Markdown document or in the console. Changing the working directory in the console does not guarantee that read.csv() commands in an R Markdown file will work, because by default a knitted R Markdown document uses the folder containing the .Rmd file as its working directory.
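As a minimal sketch of setwd() in the console, the example below switches to another folder and then switches back. The folder used here is just `tempdir()`, a stand-in chosen so the code runs on any computer; in practice you would use the folder that holds your data. The last lines show an alternative that leaves the working directory alone and builds a full path with `file.path()` instead.

```r
# Remember where we started so we can return afterwards.
old_wd <- getwd()

# Hypothetical target folder; replace with the folder holding your CSV file.
target_dir <- tempdir()
setwd(target_dir)
getwd()   # now reports the new folder

# read.csv("WA-wildfires.csv") would now look for the file in target_dir.

# Return to the original working directory.
setwd(old_wd)

# Alternative: keep the working directory and spell out the full path.
full_path <- file.path(target_dir, "WA-wildfires.csv")
full_path
```

Passing a full path such as `full_path` to read.csv() works regardless of the current working directory, as long as the file exists at that location.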

D.4 Choosing the File with a Dialog Box

You can also avoid any issues with the working directory and choose the file using a dialog box. I’ll leave it to you to decide if this is harder or easier.

WA_wildfire <- read.csv(file.choose())
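A small variation, offered as a sketch: storing the result of file.choose() in a variable first lets you see exactly which file was picked. The name `chosen_path` is made up for this example, and the `interactive()` check is there because a dialog box can only be shown when R is running interactively (for instance in the console, not while knitting a document).

```r
if (interactive()) {
  # Open the dialog box and remember the path of the selected file.
  chosen_path <- file.choose()
  print(chosen_path)

  # Load the selected file just as before.
  WA_wildfire <- read.csv(chosen_path)
} else {
  message("file.choose() needs an interactive R session.")
}
```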