Research Question What determines a person’s height?
Hypothesis Brainstorming • Genetics • Nutrition • Immigration / Origins • Disease Hypotheses: • Sons will be similar in height to their fathers • Daughters will be similar in height to their mothers
Literature Review: Article #1 Francis Galton • Invented Regression • When Mid-Parents are taller than mediocrity, their Children tend to be shorter than they • When Mid-Parents are shorter than mediocrity, their Children tend to be taller than they
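Galton's two observations can be reproduced with a small simulation. The sketch below uses simulated values only, not Galton's data; the mean of 68 inches, the slope of 0.6, and the noise level are assumptions chosen for illustration. Any slope between 0 and 1 gives the same qualitative pattern.

```r
# Synthetic illustration only: simulated values, not Galton's measurements.
set.seed(1)
n <- 5000
midparent <- rnorm(n, mean = 68, sd = 2)   # simulated mid-parent heights (inches)

# Assumed slope of 0.6 (less than 1) plus noise, so children sit closer to the mean.
child <- 68 + 0.6 * (midparent - 68) + rnorm(n, sd = 2)

tall  <- midparent > 70   # mid-parents well above "mediocrity"
short <- midparent < 66   # mid-parents well below "mediocrity"

# Average child height minus average mid-parent height within each group.
gap_tall  <- mean(child[tall])  - mean(midparent[tall])    # expected to be negative
gap_short <- mean(child[short]) - mean(midparent[short])   # expected to be positive

# The fitted slope should fall between 0 and 1.
fit   <- lm(child ~ midparent)
slope <- unname(coef(fit)["midparent"])
```

A negative `gap_tall` and a positive `gap_short` correspond to the two bullet points above: children of tall mid-parents are shorter than their parents on average, and children of short mid-parents are taller.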
Literature Review: Article #2 • Variables: • Genes • First two years of life • Illnesses • Infant mortality rates • Smaller Families • Higher income • Better education
Literature Review: Article #3 “we find that a 54-loci genomic profile explained 4–6% of the sex- and age-adjusted height variance” “the Galtonian mid-parental prediction method explained 40% of the sex- and age-adjusted height variance”
Variables • Dependent Variable (Y): Height • Independent Variables (X’s): X1, X2, X3, X4
Height Dataset Variables heights <- read.csv("GaltonFamilies.csv")
Dataset Variables: Type Data Types: Numbers and Factors/Categorical
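A quick way to check which columns are numbers and which are factors is sketched below. The four rows are made-up stand-in values, and the column names `father` and `mother` are assumptions about the CSV; only `childHeight` and `gender` appear in the code on these slides.

```r
# Hypothetical stand-in rows, NOT taken from GaltonFamilies.csv.
heights_demo <- data.frame(
  father      = c(70.0, 68.5, 72.0, 66.0),   # assumed column name
  mother      = c(64.0, 65.5, 63.0, 62.5),   # assumed column name
  gender      = c("male", "female", "male", "female"),
  childHeight = c(71.0, 64.5, 72.5, 62.0),
  stringsAsFactors = FALSE
)

# Store the categorical column as a factor.
heights_demo$gender <- factor(heights_demo$gender)

# Class of every column, and the names of the numeric ones.
col_types <- sapply(heights_demo, class)
num_cols  <- names(heights_demo)[sapply(heights_demo, is.numeric)]

str(heights_demo)   # compact overview of the column types
```

On the real data, the same `sapply()` calls can be applied to `heights` directly.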
Frequency Distribution, Histogram hist(heights$childHeight)
Mode, Bimodal
hist(heights$childHeight, freq = FALSE, breaks = 25, ylim = c(0, 0.14))
curve(dnorm(x, mean = mean(heights$childHeight), sd = sd(heights$childHeight)), col = "red", add = TRUE)
Bimodal: two modes
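One plausible source of two modes is that the sample mixes two groups with different centers. The sketch below uses simulated heights; the group means of 69 and 64 inches and the spread of 2.5 are assumptions, not values computed from the dataset.

```r
# Simulated heights (inches); group means and spread are assumed for illustration.
set.seed(42)
sim <- data.frame(
  gender      = factor(rep(c("male", "female"), each = 500)),
  childHeight = c(rnorm(500, mean = 69, sd = 2.5),
                  rnorm(500, mean = 64, sd = 2.5))
)

# Bin the pooled heights without drawing; remove plot = FALSE to draw the histogram.
pooled <- hist(sim$childHeight, breaks = 25, plot = FALSE)

# The mean of each group shows where the two humps would be centered.
group_means <- tapply(sim$childHeight, sim$gender, mean)
group_means
```

Running `tapply(heights$childHeight, heights$gender, mean)` on the real data would show whether the two groups are centered at different heights.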
Correlation Matrix for Continuous Variables PerformanceAnalytics package
library(PerformanceAnalytics)
chart.Correlation(num2)
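The same pairwise correlations can be computed as numbers with base R's `cor()`. This is a sketch on simulated data: the column names `father` and `mother` and the coefficients used to generate `childHeight` are assumptions.

```r
# Simulated data; coefficients and column names are assumptions for illustration.
set.seed(7)
n <- 200
father      <- rnorm(n, mean = 69, sd = 2.5)
mother      <- rnorm(n, mean = 64, sd = 2.3)
childHeight <- 20 + 0.4 * father + 0.3 * mother + rnorm(n, sd = 2)
gender      <- factor(sample(c("male", "female"), n, replace = TRUE))
sim <- data.frame(father, mother, childHeight, gender)

# Keep only the numeric (continuous) columns; cor() cannot handle factors.
num <- sim[sapply(sim, is.numeric)]

# Pearson correlation matrix, rounded for display.
cm <- cor(num)
round(cm, 2)
```

Dropping the factor column first is the key step, since correlation is only defined here for the continuous variables.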
Correlations Matrix: Both Types Zoom in on Gender
library(car)
scatterplotMatrix(heights)
Categorical: Revisit Box Plot • Correlation will depend on spread of distributions • Note there is an equation here: Y = mx + b
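With a two-level categorical predictor, the line Y = mx + b has a simple reading: b is the mean of the reference group and m is the difference between the two group means. A minimal sketch on simulated data follows; the group means and spread are assumptions.

```r
# Simulated heights; group means (64, 69) and spread are assumed for illustration.
set.seed(3)
gender      <- factor(rep(c("female", "male"), each = 100))
childHeight <- ifelse(gender == "male", 69, 64) + rnorm(200, sd = 2.5)

# Regress height on the two-level factor; "female" is the reference level.
fit <- lm(childHeight ~ gender)
b <- unname(coef(fit)["(Intercept)"])   # intercept: mean of the reference group
m <- unname(coef(fit)["gendermale"])    # slope: male mean minus female mean

# Group means computed directly, for comparison with b and m.
means <- tapply(childHeight, gender, mean)
```

Here `b` matches `means["female"]` and `m` matches `means["male"] - means["female"]`, which is the line connecting the centers of the two boxes in the box plot.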
Linear Regression: Model 1 Child’s Height = f(Father’s Height)
Linear Regression: Model 2 Child’s Height = f(Gender) model.5 <- lm(childHeight ~ gender, data = heights)
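One way to compare the models is by the share of variance each explains (R squared). The sketch below runs on simulated data: the column name `father` and every coefficient in the data-generating line are assumptions, so the resulting numbers say nothing about the real dataset.

```r
# Simulated data; all effect sizes are assumptions chosen for illustration.
set.seed(11)
n <- 400
father      <- rnorm(n, mean = 69, sd = 2.5)
gender      <- factor(sample(c("female", "male"), n, replace = TRUE))
childHeight <- 36 + 0.4 * father + ifelse(gender == "male", 5, 0) + rnorm(n, sd = 2)
sim <- data.frame(father, gender, childHeight)

# Three candidate models: father only, gender only, and both together.
m_father <- lm(childHeight ~ father, data = sim)
m_gender <- lm(childHeight ~ gender, data = sim)
m_both   <- lm(childHeight ~ father + gender, data = sim)

# Proportion of variance explained by each model.
r2 <- c(
  father = summary(m_father)$r.squared,
  gender = summary(m_gender)$r.squared,
  both   = summary(m_both)$r.squared
)
round(r2, 3)
```

Because the combined model nests the other two, its R squared can never be lower than either single-predictor model; adjusted R squared is a fairer comparison when the number of predictors differs.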
Linear Regression: Additional Models • Mom • MidParent Height
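Mid-parent height combines both parents into one predictor. The sketch below assumes the definition documented for the `GaltonFamilies` data in the HistData package, (father + 1.08 × mother) / 2, where 1.08 is Galton's scaling of female heights; the three rows are made-up values.

```r
# Made-up parent heights (inches), for illustration only.
father <- c(70.0, 68.5, 72.0)
mother <- c(64.0, 65.5, 63.0)

# Mid-parent height: scale the mother's height by 1.08, then average with the father's.
# Assumption: this matches the midparentHeight definition in HistData::GaltonFamilies.
midparent <- (father + 1.08 * mother) / 2
midparent
```

If the CSV already contains a mid-parent column, it can be used directly as the predictor in `lm()` instead of recomputing it.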
Discussion Summary • Key Findings: • Gender was the biggest factor • Parents’ height played a lesser role • Downsides • Dataset used did not include more variables of interest • Dataset for X Country for 1877
Future Research • Include More Predictor Variables • Literature review of a few articles suggests several important factors: • Nutrition • Analyze a Contemporary DataSet • DataSet used was from 18?? • Location Specific as Well