nytimes/2009/01/07/technology/business-computing/07program.html?pagewanted=all


Presentation Transcript


  1. http://www.nytimes.com/2009/01/07/technology/business-computing/07program.html?pagewanted=all

  2. Workspace • Fewer Lines of Code • Efficiency • Capability • Package • Code • Documentation • Datasets • Source Code • Tons of Lines of Code Simplified

  3. The next data visual was produced with about 150 lines of R code

  4. R Installation Already Includes Several Libraries • Workflow: Data Analysis Goals • Data Input: Input a Comma Separated Values File, Enter Manually • Data Management: Combine Variables, Add Variable, Select a Subset • Statistics & Analysis • Visualization & Reporting
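
The data management steps named on slide 4 (add a variable, combine variables, select a subset) can be sketched in base R. The data frame `people` and its columns are hypothetical and only serve as illustration; they do not come from the presentation.

```r
# Hypothetical data, entered manually (not from the slides)
people <- data.frame(
  name      = c("A", "B", "C"),
  height_in = c(60, 66, 72),
  weight_lb = c(120, 150, 200)
)

# Add a variable: height in centimetres
people$height_cm <- people$height_in * 2.54

# Combine variables: body mass index from weight (lb) and height (in)
people$bmi <- 703 * people$weight_lb / people$height_in^2

# Select a subset: rows taller than 62 inches, two columns only
tall <- subset(people, height_in > 62, select = c(name, bmi))

# Statistics: a simple summary of one variable
summary(people$height_cm)
```

A comma separated values file would be read with `read.csv()` instead of typing the values in by hand.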

  5. Integrated Development Environment (IDE) • Write Code / Program • Input Data • Analyze • Graphics • Datasets, etc. • Enter Commands • View Results

  6. The R Graphics Package • library(help="graphics") • Graphing Parameters • Basic Chart Types • Titles, X-Axis Title, Y-Axis Title, Legend, Scales, Color, Gridlines
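
As a rough illustration of the chart elements listed on slide 6, the following sketch sets a title, axis titles, scale, colors, gridlines, and a legend with base graphics. The vectors `x`, `y1`, and `y2` are made-up data, not taken from the presentation.

```r
# Hypothetical data for illustration
x  <- 1:10
y1 <- x^2
y2 <- 10 * x

# Basic chart with title, axis titles, color, and an explicit y scale
plot(x, y1, type = "b", col = "blue",
     main = "Chart Title", xlab = "X-Axis Title", ylab = "Y-Axis Title",
     ylim = c(0, 100))

# Second series, gridlines, and a legend
lines(x, y2, type = "b", col = "red")
grid()
legend("topleft", legend = c("x squared", "10 times x"),
       col = c("blue", "red"), lty = 1, pch = 1)
```

Further graphing parameters are documented under `?par`.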

  7. Currently, how many R Packages? At the command line enter: • dim(available.packages()) • available.packages()

  8. Correlation Matrix • library(car) • scatterplotMatrix(h)
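
The slide does not show how `h` is defined. A minimal sketch, assuming `h` is an all-numeric data frame (here three columns of the built-in `mtcars` data set, chosen only for illustration), using base R in place of the car package:

```r
# Assumption: `h` is a numeric data frame; mtcars ships with R
h <- mtcars[, c("mpg", "hp", "wt")]

# Correlation matrix (Pearson by default)
r <- cor(h)
round(r, 2)

# Base-R scatterplot matrix; car::scatterplotMatrix(h) draws an enhanced version
pairs(h)
```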

  9. ggplot2 • In ggplot2, a plot is made up of layers. • Plot
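
To make the idea of layers concrete, here is a small sketch that assumes the ggplot2 package is installed. The choice of the built-in `mtcars` data and of these particular layers is an illustration, not the plot shown in the presentation.

```r
library(ggplot2)

# Base of the plot: data and aesthetic mappings
p <- ggplot(mtcars, aes(x = wt, y = mpg))

# Each `+` adds a layer or another plot component
p <- p +
  geom_point() +                            # layer 1: points
  geom_smooth(method = "lm", se = FALSE) +  # layer 2: fitted straight line
  labs(title = "Fuel economy vs. weight",
       x = "Weight (1000 lbs)", y = "Miles per gallon")

print(p)
```

Because the plot is an ordinary R object, layers can be added step by step and the plot printed only when it is complete.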

  10. ggplot2

  11. Data Structures (Framework Source: Hadley Wickham) • Numeric Vector: a <- c(1,2,5.3,6,-2,4) • Character Vector: b <- c("one","two","three") • Matrix: y <- matrix(1:20, nrow=5, ncol=4) • Dataframe: d <- c(1,2,3,4); e <- c("red", "white", "red", NA); f <- c(TRUE,TRUE,TRUE,FALSE); mydata <- data.frame(d,e,f); names(mydata) <- c("ID","Color","Passed") • List: w <- list(name="Fred", age=5.3)
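
The statements from slide 11 can be run together as follows. The inspection calls at the end (`class()`, `dim()`, `str()`) are additions for illustration and are not part of the slide.

```r
a <- c(1, 2, 5.3, 6, -2, 4)             # numeric vector
b <- c("one", "two", "three")           # character vector
y <- matrix(1:20, nrow = 5, ncol = 4)   # 5 x 4 matrix, filled column by column

d <- c(1, 2, 3, 4)
e <- c("red", "white", "red", NA)
f <- c(TRUE, TRUE, TRUE, FALSE)
mydata <- data.frame(d, e, f)           # columns may have different types
names(mydata) <- c("ID", "Color", "Passed")

w <- list(name = "Fred", age = 5.3)     # list: elements may differ in type

class(a)      # "numeric"
class(b)      # "character"
dim(y)        # 5 4
y[2, 3]       # 12
str(mydata)   # structure of the data frame
w$name        # "Fred"
```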

  12. Actor Heights 1) Create Vectors of Actor Names, Heights, Date of Birth, Gender 2) Combine the 4 Vectors into a DataFrame

  13. Variable Types • Numeric: e.g. heights • String: e.g. names • Dates: e.g. "12-03-2013" • Factor: e.g. gender • Boolean: TRUE, FALSE

  14. Creating a Character / String Vector • We use the c() function and list all values in quotation marks so that R knows that it is string data. • Create a variable called ActorNames as follows: ActorNames <- c("John", "Meryl", "Jennifer", "Andre")

  15. Class, Length, Index class(ActorNames) length(ActorNames) ActorNames[2]
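
A short runnable version of the three calls on slide 15, with the vector from slide 14 repeated so the block stands alone. The last two indexing lines are extra illustrations, not from the slides.

```r
ActorNames <- c("John", "Meryl", "Jennifer", "Andre")

class(ActorNames)      # "character"
length(ActorNames)     # 4
ActorNames[2]          # "Meryl" (R indexes from 1)

# Additional indexing forms, added for illustration
ActorNames[c(1, 4)]    # "John" "Andre"
ActorNames[-1]         # everything except the first element
```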

  16. Creating a Numeric Vector / Variable • Create a variable called ActorHeights (inches): ActorHeights <- c(77, 66, 70, 90)
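
Once the numeric vector exists, arithmetic and comparisons apply to every element at once. The operations below are illustrative additions, not taken from the slides.

```r
ActorHeights <- c(77, 66, 70, 90)   # inches, as on the slide

class(ActorHeights)                 # "numeric"
mean(ActorHeights)                  # 75.75
ActorHeights * 2.54                 # vectorized conversion to centimetres
ActorHeights[ActorHeights > 70]     # 77 90
```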

  17. Creating a Date Variable • Use the as.Date() function: ActorDoB <- as.Date(c("1930-10-27", "1949-06-22", "1990-08-15", "1946-05-19")) • Each date has been entered as a text string (in quotation marks) in the appropriate format (yyyy-mm-dd). • By enclosing these data in the as.Date() function, these strings are converted to date objects.
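
The conversion described on slide 17 can be checked as below. The last call is an added illustration for strings that are not in yyyy-mm-dd order; reading "12-03-2013" as month-day-year is an assumption, since the slides do not say which order is meant.

```r
ActorDoB <- as.Date(c("1930-10-27", "1949-06-22", "1990-08-15", "1946-05-19"))

class(ActorDoB)             # "Date"
format(ActorDoB[1], "%Y")   # "1930"
diff(range(ActorDoB))       # span between earliest and latest birth date

# Other layouts need an explicit format string.
# Assumption: month-day-year; use "%d-%m-%Y" for day-month-year instead.
as.Date("12-03-2013", format = "%m-%d-%Y")
```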

  18. Creating a Categorical / Factor Variable • Use the factor() function: ActorGender <- c("male", "female", "female", "male") • ActorGender <- factor(ActorGender)
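
A runnable version of the factor example, followed by a few inspection calls that are added here to show what `factor()` changes:

```r
ActorGender <- c("male", "female", "female", "male")
ActorGender <- factor(ActorGender)

class(ActorGender)        # "factor"
levels(ActorGender)       # "female" "male" (sorted alphabetically by default)
table(ActorGender)        # counts per level
as.integer(ActorGender)   # 2 1 1 2, the underlying integer codes
```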

  19. Vectors and DataFrames • Actor.DF <- data.frame(Name=ActorNames, Height=ActorHeights, BirthDate=ActorDoB, Gender=ActorGender) • dim(Actor.DF) • Actor.DF[2] • Actor.DF[2,] • Actor.DF[1,3] • Actor.DF[2,2] • Actor.DF[2:3,]
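
Putting the four vectors from slides 14 to 18 together, the indexing calls on slide 19 behave as annotated below. The comments are explanatory additions.

```r
ActorNames   <- c("John", "Meryl", "Jennifer", "Andre")
ActorHeights <- c(77, 66, 70, 90)
ActorDoB     <- as.Date(c("1930-10-27", "1949-06-22", "1990-08-15", "1946-05-19"))
ActorGender  <- factor(c("male", "female", "female", "male"))

Actor.DF <- data.frame(Name = ActorNames, Height = ActorHeights,
                       BirthDate = ActorDoB, Gender = ActorGender)

dim(Actor.DF)      # 4 4: four rows, four columns
Actor.DF[2]        # second column (Height), still a data frame
Actor.DF[2, ]      # second row, all columns
Actor.DF[1, 3]     # row 1, column 3: the first birth date
Actor.DF[2, 2]     # row 2, column 2: 66
Actor.DF[2:3, ]    # rows 2 and 3, all columns
```

The position of the comma decides the meaning: `[i, ]` selects rows, `[, j]` or `[j]` selects columns, and `[i, j]` selects a single cell.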

  20. getwd() setwd() > getwd() [1] "C:/Users/johnp_000/Documents" > setwd()
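
`setwd()` needs the target directory as its argument; the slide leaves it out. A minimal sketch, using `tempdir()` purely as a stand-in for a real project folder:

```r
old <- getwd()       # remember the current working directory
setwd(tempdir())     # assumption: tempdir() stands in for a project folder
getwd()              # now reports the temporary directory
setwd(old)           # restore the original working directory
```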

  21. Write / Create a File • write.table(Actor.DF, "ActorData.txt", sep="\t", row.names = TRUE) • write.csv(Actor.DF, "ActorData.csv")
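
The two write calls can be tried without touching the working directory. In this sketch the small data frame and the use of `tempdir()` for the output paths are assumptions made so the block runs on its own; the read-back step is an added check.

```r
# Small stand-in data frame so the block is self-contained
Actor.DF <- data.frame(Name = c("John", "Meryl"), Height = c(77, 66))

# Assumption: write into the session's temporary directory
txt_path <- file.path(tempdir(), "ActorData.txt")
csv_path <- file.path(tempdir(), "ActorData.csv")

write.table(Actor.DF, txt_path, sep = "\t", row.names = TRUE)   # tab-separated
write.csv(Actor.DF, csv_path, row.names = FALSE)                # no row-number column

# Read the CSV back to check the round trip
check <- read.csv(csv_path)
check
```

With the default `row.names = TRUE`, `write.csv()` adds a leading column of row numbers, which is why it is switched off here.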
