
Lecture 1: Introduction Math Boot Camp Will Terry Department of Political Science

University of Oregon, September 16, 2013


Presentation Transcript


  1. Lecture 1: Introduction. Math Boot Camp. Will Terry, Department of Political Science, University of Oregon. September 16, 2013

  2. Objectives of Math Camp Have a good time learning about the wonders of math(s)! Get ready for PS545-546….

  3. Objectives of PS545-546 • The objectives of our sequence are twofold: (1) to improve your ability to read mainstream quantitative research, and (2) to provide a broad overview of the main tools of quantitative analysis. • We will focus on the linear regression model. • You will become familiar with Stata.

  4. Statistical software • This course will focus on practical computing skills that you might find useful in your future research. • There are reasons to spend some time with R to appreciate the capability of statistical computing. • Given the limited time, we will focus on developing Stata skills as much as possible. • We will master the basic components of statistical computing: data management, estimating regression models, and graphing.

  5. The standard political science stats education I. Basic probability theory: random variables, PDFs, CDFs II. Statistical inference theory: confidence intervals, hypothesis testing, p-values, etc. III. Linear regression analysis: the workhorse model of the social sciences IV. Binary Outcome Models & Other Extensions of the Basic Linear Model V. Time Series Cross Sectional Models

  6. First, some key terms… Causality: Phenomenon Y (e.g., income) is affected by factor X (e.g., gender). Statistical inference: Drawing conclusions about the world based on characteristics of sample data. Typically we are interested in understanding “population parameters.” Independent variable (syn. “regressor”, RHS var): The variable that is exogenously manipulated or changed. Dependent variable (syn. “regressand”, LHS var): Its value “depends” on the value taken by the independent variables.

  7. Random variables and hypothesis testing Random Variable (RV): A variable whose values are determined by chance. Probability Density Function (PDF): Describes how an RV is “distributed”, i.e., how likely it is that the RV takes values near any particular point. Parameter: Characteristic or measure that describes a population. Statistic (not to be confused with Statistics): Characteristic or measure obtained from a sample.

  8. Common ways to distinguish variables Qualitative Variables: Variables that take non-numerical values (e.g., eye color; gun ownership). Quantitative Variables: Variables that take numerical values (e.g., number of credit cards in one’s wallet; time elapsed since the Compromise of 1877). Discrete Variables: Variables which assume a finite or countable number of possible values. Usually obtained by counting (e.g., the number of credit cards in one’s wallet). Continuous Variables: Variables which can assume any value within an interval, so the set of possible values is uncountably infinite. Usually obtained by measurement (e.g., time elapsed since the Compromise of 1877).

  9. Hypothesis testing terminology Population: All subjects possessing a common characteristic that is being studied. Sample: A subgroup or subset of the population. Statistics: Collection of methods for planning experiments, obtaining data, and then organizing, summarizing, presenting, analyzing, interpreting, and drawing conclusions.

  10. Hypothesis testing

  11. Research design • Research design is the means by which we attempt to uncover causal relationships between variables using data that we collect. • In the jargon of the trade, the objective is to “identify” the effect of a “treatment.” • Conceptually, one wants to make a comparison between two identical subjects: one who received the treatment, and one who did not. • A pure experiment is the gold standard. Unfortunately, this ideal is generally infeasible in the social sciences.

  12. Language of research design Treatment group: The group that receives the treatment. Control group: The group that does not receive the treatment. Experimental data: Data derived from a process whereby the researcher determines the receipt of the treatment. Non-experimental data (syn. “observational data”): Data in which the administration of the treatment is determined by factors beyond the researcher’s control.

  13. The standard political science stats education I. Basic probability theory: random variables, PDFs, CDFs II. Statistical inference theory: confidence intervals, hypothesis testing, p-values, etc. III. Linear regression analysis: the workhorse model of the social sciences IV. Binary Outcome Models & Other Extensions of the Basic Linear Model V. Time Series Cross Sectional Models

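  14. Linear regression analysis A. Univariate regression model: $y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$ (there is one independent variable, or IV) B. Multivariate regression model: $y_i = \beta_0 + \beta_1 x_i + \beta_2 z_i + \varepsilon_i$ (there are two IVs); $y_i = \beta_0 + \beta_1 x_{1i} + \cdots + \beta_N x_{Ni} + \varepsilon_i$ (there are N IVs)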
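
A minimal sketch of how the coefficients of the univariate model above can be estimated by ordinary least squares. The function name `ols_univariate` and the data values are made up for illustration and are not from the lecture, which uses Stata for estimation.

```python
def ols_univariate(x, y):
    """Return (b0, b1) minimizing the sum of squared residuals of y = b0 + b1*x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Slope: sum of cross-deviations divided by sum of squared deviations of x.
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    # Intercept: the fitted line passes through the point of means.
    b0 = y_bar - b1 * x_bar
    return b0, b1


# Hypothetical data generated exactly by y = 1 + 2x, so every error term is 0.
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [1.0, 3.0, 5.0, 7.0, 9.0]
b0, b1 = ols_univariate(x, y)
print(b0, b1)
```

With real data the error terms are not zero, and the estimates only approximate the population coefficients.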

  15. IV. Binary dependent variable models Used when the dependent variable takes one of two possible values: $\text{Democrat}_i = 1$ if citizen i is a Democrat, and $\text{Democrat}_i = 0$ if citizen i is not a Democrat. The outcome is modeled as $\text{Democrat}_i = f(\text{gender}_i, \text{income}_i, \text{race}_i, \text{age}_i)$.

  16. V. Time series cross sectional models Used when the researcher observes the objects of analysis at multiple points in time. (These data have both time series and cross section features.)

  17. What we won’t cover in PS545-6 but might be useful in your dissertation, future research, etc. A. Maximum likelihood estimation (MLE) and other procedures B. Model selection C. Simultaneous equations/instrumental variables estimation D. Matching E. Non-parametric models F. Case study selection for qualitative research And much, much more!

  18. Causality and research design • Causality is often difficult to determine (wait for the next slide); that’s why research design is important. • An experiment is the gold standard. • If a treated subject and a control subject are the same in every respect (as they are in a perfect experiment), we can logically attribute any difference in the observed outcome to receipt of the treatment. • In the social sciences, we generally can’t run experiments, so we use statistical techniques to make the treatment and control groups as alike as we can.

  19. Common difficulties in determining causality One variable causes another, but how do you know which is causal? (Example: Douglas firs and rainfall.) Two variables cause each other. (Example: expected closeness of race and candidate expenditures.)

  20. Common difficulties in determining causality An omitted third variable causes both. (One reason correlation ≠ causation.) Example: Old Age causes both Bad Driving and Gray Hair. If one were to look only at the relationship between Bad Driving and Gray Hair, one might be led to the erroneous conclusion that Gray Hair causes people to drive badly (or that Bad Driving causes one to have Gray Hair). How could one test these competing hypotheses? Recall the relationship between ice cream consumption and the NY homicide rate…
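
One possible way to probe the omitted-variable story is sketched below with simulated data. Everything here is an assumption made for illustration: the data are randomly generated so that age drives both outcomes, and the helper names `corr` and `residuals` are hypothetical. The idea is that if age is the common cause, the association between gray hair and bad driving should largely vanish once age is held fixed.

```python
import random


def corr(a, b):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    sab = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    saa = sum((u - ma) ** 2 for u in a)
    sbb = sum((v - mb) ** 2 for v in b)
    return sab / (saa * sbb) ** 0.5


def residuals(y, x):
    """Residuals from a least-squares regression of y on x (with intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b1 = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
          / sum((xi - mx) ** 2 for xi in x))
    b0 = my - b1 * mx
    return [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]


rng = random.Random(0)  # fixed seed so the simulation is reproducible
n = 5000
# Simulated (not real) data: age affects both variables; they do not affect each other.
age = [rng.uniform(20, 80) for _ in range(n)]
gray_hair = [a + rng.gauss(0, 10) for a in age]
bad_driving = [a + rng.gauss(0, 10) for a in age]

raw = corr(gray_hair, bad_driving)
# Correlation after removing the part of each variable explained by age.
partial = corr(residuals(gray_hair, age), residuals(bad_driving, age))
print(round(raw, 2), round(partial, 2))
```

In this simulation the raw correlation is strong while the age-adjusted correlation is near zero, which is the pattern one would expect if the third variable accounts for the relationship.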

  21. A research design schematic R denotes randomized assignment. N denotes non-randomized assignment. X denotes receipt of the treatment. O denotes that the subject is tested.

  22. Some basic mathematical tools We will review some basic mathematical tools: - Functions - Summation operators - Differential Calculus

  23. Functions A function is a rule that assigns exactly one value to each input of a specified type. A function expresses the intuitive idea that one quantity (the argument of the function, also known as the input) completely determines another quantity (the value, or the output).
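
The "exactly one value per input" requirement can be checked mechanically for a finite list of (input, output) pairs. The sketch below is illustrative only; the helper name `is_function` and the example pairs are assumptions, not part of the lecture.

```python
def is_function(pairs):
    """Return True if no input appears with two different outputs."""
    seen = {}
    for x, y in pairs:
        if x in seen and seen[x] != y:
            return False  # the same input maps to two different values
        seen[x] = y
    return True


# Squaring: different inputs may share an output, which is allowed.
square_pairs = [(-2, 4), (-1, 1), (0, 0), (1, 1), (2, 4)]
# "Numbers whose square is x": the input 4 maps to both 2 and -2.
root_pairs = [(4, 2), (4, -2), (1, 1), (1, -1)]

print(is_function(square_pairs), is_function(root_pairs))
```

The second relation fails because a single input is paired with more than one output, so it is not a function.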

  24. Summation operators Summation operators are a useful way to represent the sum of a large set of numbers: The index i indicates which numbers in the set are to be included in the sum. The product operator works in a similar fashion.
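
The slide's formula did not survive in this transcript. As a reference point, the standard notation for the summation and product operators over $x_1, \dots, x_n$ (assumed here, not recovered from the slide) is:

```latex
\sum_{i=1}^{n} x_i = x_1 + x_2 + \cdots + x_n,
\qquad
\prod_{i=1}^{n} x_i = x_1 \cdot x_2 \cdots x_n
```

The lower and upper limits on the operator give the first and last values of the index i.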

  25. Summation operators Suppose your data were $\{x_1, x_2, x_3, x_4, x_5, x_6, x_7\} = \{-100, -10, -1, 0, 1, 10, 100\}$. Compute the following:
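
The expressions the slide asked for are not preserved in this transcript. The sketch below applies a few typical summation and product operations to the slide's data; the particular expressions chosen are assumptions for illustration.

```python
import math

# Data from the slide: x_1, ..., x_7.
x = [-100, -10, -1, 0, 1, 10, 100]

# Sum over all i from 1 to 7.
total = sum(x)

# Sum of squares over all i from 1 to 7.
total_sq = sum(xi ** 2 for xi in x)

# Sum over i from 5 to 7 only (Python lists are 0-indexed, so x_i is x[i - 1]).
partial = sum(x[i - 1] for i in range(5, 8))

# Product over all i from 1 to 7.
product = math.prod(x)

print(total, total_sq, partial, product)  # 0 20202 111 0
```

The product is zero because one of the data points is zero.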

  26. Sample mean and sample variance Every population has a mean (μ) and a variance (σ²); note that this implies it has a standard deviation (σ) as well. The population mean tells you where the population is “centered.” There’s a sense in which the mean is the middle of the data. The population variance (or standard deviation) measures how far “spread out” individuals in the population are. (Obviously, these are always non-negative.) The sample mean and sample variance are two fundamental statistics. They estimate the parameters of the population the data were drawn from.
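
A short sketch of computing the two statistics by hand. The data are made up, and the sample variance here uses the common n − 1 denominator; the slide does not say which convention the course adopts, so treat that as an assumption.

```python
import statistics

# Hypothetical sample of five observations.
sample = [1, 2, 3, 4, 5]
n = len(sample)

# Sample mean: the sum of the observations divided by n.
mean = sum(sample) / n

# Sample variance: sum of squared deviations from the mean, divided by n - 1.
squared_deviations = sum((xi - mean) ** 2 for xi in sample)
sample_variance = squared_deviations / (n - 1)

# Sample standard deviation: the square root of the sample variance.
sample_sd = sample_variance ** 0.5

print(mean, sample_variance)  # 3.0 2.5
# The standard library's statistics.variance also divides by n - 1.
print(statistics.variance(sample))
```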

  27. Derivatives Loosely speaking, a derivative can be thought of as how much one quantity is changing in response to changes in some other quantity.
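
To make "how much one quantity changes in response to another" concrete, a derivative can be approximated by a difference quotient over a small step. The function names and the choice of f(x) = x² below are illustrative assumptions.

```python
def difference_quotient(f, x, h=1e-5):
    """Approximate f'(x) by the change in f over a small interval around x."""
    return (f(x + h) - f(x - h)) / (2 * h)


def square(x):
    return x ** 2


# The exact derivative of x^2 is 2x, so the slope at x = 3 should be close to 6.
slope_at_3 = difference_quotient(square, 3.0)
print(slope_at_3)
```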

  28. Integrals A definite integral of a function can be represented as the signed area of the region bounded by its graph and the horizontal axis over the interval of integration.
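
The "signed" part means that area below the horizontal axis counts as negative. A minimal numerical sketch, using a midpoint Riemann sum; the function names and the example f(x) = x on [-1, 2] are assumptions chosen for illustration.

```python
def midpoint_integral(f, a, b, n=1000):
    """Approximate the definite integral of f on [a, b] with n midpoint rectangles."""
    width = (b - a) / n
    return sum(f(a + (k + 0.5) * width) for k in range(n)) * width


def identity(x):
    return x


# On [-1, 0] the graph lies below the axis (area -0.5); on [0, 2] it lies above (area 2).
area = midpoint_integral(identity, -1.0, 2.0)
print(area)  # approximately 1.5
```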

  29. Math Camp game plan: Time to get down to business… In the remainder of this lecture we will discuss some elementary results in a branch of mathematics called Real Analysis—i.e., the branch of math that studies real numbers. Q: Why do we care about Real Analysis? A: Because it provides the logical structure that undergirds the math we use as social scientists. The next few slides follow a text that is slightly more advanced than we need, but let’s follow along to develop a few ideas about the real number line…

  30. The set of real numbers: Special symbols

  31. The real number line

  32. The set of real numbers: Properties

  33. Inequalities

  34. Inequalities

  35. Inequalities

  36. Roots

  37. A cheat sheet of handy rules re real numbers (see the Math Camp website for the complete sheet)

  38. Quadratic equations

  39. Quadratic equations (cont.)

  40. Quadratic equations (cont.)

  41. Absolute value

  42. Achilles and the tortoise

  43. Achilles and the tortoise

  44. Achilles and the tortoise

  45. Achilles and the tortoise

  46. Bounds

  47. Bounds

  48. Bounds

  49. Bounds

  50. Intervals
