Class 4: Types of Data and Data Collection

Class 4: Types of Data and Data Collection. บธบ 151 Statistics and Business Research Methodology, Semester 1, Academic Year 2556 B.E. (2013). Learning Objectives. Know the difference between primary and secondary data and their sources. Know the advantages and disadvantages of each data collection and sampling method.

marisa

Presentation Transcript


  1. Class 4: Types of Data and Data Collection • บธบ 151 Statistics and Business Research Methodology • Semester 1, Academic Year 2556 B.E. (2013)

  2. Learning Objectives • Know the difference between primary and secondary data and their sources • Know the advantages and disadvantages of each data collection and sampling method • Design questionnaires to collect different variables • Evaluate questionnaires

  3. Scales of Measurement • Nominal: classification • Ordinal: ranking • Interval: equal intervals • Ratio: absolute zero

  4. Nominal Measurement • Nominal: observations are put into categories based on some criterion • Classifies; categorizes • Dichotomous variable: has two values; e.g. male/female, yes/no • Multichotomous variable: has more than two values; e.g. ethnicity, marital status • No numerical value (even when observations are coded as numbers) • Permissible arithmetic operations: counting

  5. Ordinal Measurement • Ordinal: a basic form of quantitative measurement that indicates a numerical order; the intervals between adjacent scale values are undetermined or unequal. • Examples: team/individual standing, socioeconomic status, level of education, Likert scales, any type of rating or ranking • Permissible arithmetic operations: greater than/less than

  6. Interval Measurement • Interval: intervals between adjacent scale values are equal; the scale has an arbitrary zero • Hint: If a score can go below zero, or if no true zero exists, measurement is interval. • Examples: Celsius and Fahrenheit temperature scales, IQ scores, most psychological measures • Permissible arithmetic operations: addition and subtraction (differences between values are meaningful); ratio statements cannot be made, so multiplying or dividing scale values is not meaningful

  7. Ratio Measurement • Ratio: a measurement scale that has equal units of measurement and a rational zero point for the scale (absolute zero) • Hint: An absolute zero indicates a complete absence of the attribute being measured. • Examples: Kelvin temperature scale, income in dollars, length, area or volume, height, weight • Permissible arithmetic operations: any, including ratios
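The difference between interval and ratio scales can be checked numerically. The following is a minimal sketch with two made-up temperature readings (20 and 10 degrees Celsius are illustrative values, not from the slides): differences agree on both scales, but the ratio only means something on the Kelvin scale, which has an absolute zero.

```python
def celsius_to_kelvin(celsius: float) -> float:
    """Convert an interval-scale reading (Celsius) to a ratio-scale one (Kelvin)."""
    return celsius + 273.15


# Two hypothetical readings, chosen only for illustration.
a_c, b_c = 20.0, 10.0
a_k, b_k = celsius_to_kelvin(a_c), celsius_to_kelvin(b_c)

# Differences are meaningful on an interval scale: both scales agree.
diff_c = a_c - b_c
diff_k = a_k - b_k

# Ratios are not: "20 C is twice 10 C" disappears once the zero is absolute.
ratio_c = a_c / b_c   # depends on the arbitrary zero of the Celsius scale
ratio_k = a_k / b_k   # the physically meaningful ratio

print(f"difference: {diff_c:.2f} C vs {diff_k:.2f} K")
print(f"ratio: {ratio_c:.3f} (Celsius) vs {ratio_k:.3f} (Kelvin)")
```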

  8. Where do data come from? • Secondary data • data someone else has collected • Primary data • data you collect

  9. Secondary Data • Data gathered by another source (e.g. research study, survey, interview) • Secondary data is gathered BEFORE primary data. WHY? • Because you want to find out what is already known about a subject before you dive into your own investigation. WHY? • Because some of your questions can possibly have been already answered by other investigators or authors. Why “reinvent the wheel”?

  10. Primary Data • Data never gathered before • Advantage: find data you need to suit your purpose • Disadvantage: usually more costly and time consuming than collecting secondary data • Collected after secondary data is collected

  11. Methods of Collecting Data… • There are many methods used to collect or obtain data. Popular methods are: • Direct Observation • Interview • Experiments • Surveys

  12. Observations • Observing behaviors in their settings is one of the most direct ways to collect data. • Observation can range from complete participant observation, where a researcher becomes a member of the group under study, to more detached observation, where the researcher casually observes and notes occurrences of specific kinds of behavior.

  13. Advantages of Observation • Observations are free of the biases inherent in self-report data. • They put a researcher directly in touch with the behaviors in question. • They involve real-time data, describing behavior occurring in the present rather than the past. • They are adaptive in that they can be modified depending on what is being observed.

  14. Problems with Observation • Difficulties interpreting the meaning underlying the observations. • Observers must decide which people to observe; choose time periods, territory and events • Failure to attend to these sampling issues can result in a biased sample of data.

  15. Interviews • They permit the interviewer to ask the respondent direct questions. • Further probing and clarification is possible as the interview proceeds. • This flexibility is invaluable for gaining private views and feelings about the organization and exploring new issues that emerge during the interview. • Interviews may be highly structured, resembling questionnaires, or highly unstructured, starting with general questions that allow the respondent to lead the way. • Interviews are usually conducted one-to-one but can be carried out in a group.

  16. Drawbacks of Interviews • They can consume a great deal of time if interviewers take full advantage of the opportunity to hear respondents out and change their questions accordingly. • Personal biases can also distort the data. • The nature of the question and the interactions between the interviewer and the respondent may discourage or encourage certain kinds of responses. • It takes considerable skill to gather valid data.

  17. Surveys… • A survey solicits information from people; e.g. Gallup polls; pre-election polls; marketing surveys. • The Response Rate (i.e. the proportion of all people selected who complete the survey) is a key survey parameter. • Surveys may be administered in a variety of ways, e.g. • Personal Interview, • Telephone Interview, • Self Administered Questionnaire, and • Internet
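The response rate defined above is a simple proportion. A minimal sketch, with hypothetical counts (1,200 people selected, 312 completed surveys) that are not from the slides:

```python
def response_rate(completed: int, selected: int) -> float:
    """Proportion of all people selected who completed the survey."""
    if selected <= 0:
        raise ValueError("selected must be positive")
    if not 0 <= completed <= selected:
        raise ValueError("completed must be between 0 and selected")
    return completed / selected


# Hypothetical counts, for illustration only.
rate = response_rate(completed=312, selected=1200)
print(f"response rate: {rate:.1%}")
```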

  18. Questionnaire Design… • Over the years, a lot of thought has been put into the science of designing survey questions. Key design principles: • Keep the questionnaire as short as possible. • Ask short, simple, and clearly worded questions. • Start with demographic questions to help respondents get started comfortably. • Use dichotomous (yes/no) and multiple choice questions. • Use open-ended questions cautiously. • Avoid leading questions. • Pretest the questionnaire on a small number of people. • Think about the way you intend to use the collected data when preparing the questionnaire.

  19. Experimentation • Experimentation explores cause and effect relationships by manipulating independent variables in order to see if there is a corresponding effect on a dependent variable

  20. Experimentation • Pure experimentation requires both a controlled environment and the use of a randomly assigned control group • This can be difficult to achieve in human centred experiments conducted in the real-world

  21. Real-World Experiments • There are many experiments that can only be carried out in the messy uncontrolled environments of the real-world, so the search for cause and effect will require tradeoffs between real-world contexts and a controlled environment

  22. Sampling… • Recall that statistical inference permits us to draw conclusions about a population based on a sample. • Sampling (i.e. selecting a sub-set of a whole population) is often done for reasons of cost (it’s less expensive to sample 1,000 television viewers than 100 million TV viewers) and practicality (e.g. performing a crash test on every automobile produced is impractical). • In any case, the sampled population and the target population should be similar to one another.

  23. Classification of Sampling Methods • Probability samples: Simple Random, Systematic, Stratified, Cluster • Non-probability samples: Convenience, Snowball, Quota, Judgment

  24. Simple Random Sampling… • A government income tax auditor must choose a sample of 5 of 11 returns to audit…[Can do many different ways]
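One of the "many different ways" to draw the auditor's sample is a pseudo-random draw without replacement. This is a minimal sketch: the return IDs 1 to 11 and the seed are assumptions made for reproducibility, not part of the slide.

```python
import math
import random

# Hypothetical IDs for the 11 tax returns.
returns = list(range(1, 12))

# A seeded generator makes the draw repeatable; the seed value is arbitrary.
rng = random.Random(2556)

# Simple random sample: every subset of 5 returns is equally likely.
audit_sample = sorted(rng.sample(returns, k=5))

# Number of distinct samples of size 5 that could have been drawn.
possible_samples = math.comb(len(returns), 5)

print("returns to audit:", audit_sample)
print("possible samples:", possible_samples)
```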

  25. Simple random sampling • Advantages • Simple • Sampling error easily measured • Disadvantages • Need complete list of units • Units may be scattered and poorly accessible • In a heterogeneous population, important minorities might not be represented in the sample

  26. Systematic Random Sampling… • Select sampling units at regular intervals (e.g. every 20th unit)
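Selecting every 20th unit can be sketched as a slice with a random starting point. The population of 200 numbered units and the helper name `systematic_sample` are illustrative assumptions.

```python
import random


def systematic_sample(units: list, interval: int, rng: random.Random) -> list:
    """Pick a random start within the first interval, then every interval-th unit."""
    if interval <= 0:
        raise ValueError("interval must be positive")
    start = rng.randrange(interval)  # random offset in [0, interval)
    return units[start::interval]


# Hypothetical sampling frame of 200 numbered units.
population = list(range(1, 201))
sample = systematic_sample(population, interval=20, rng=random.Random(1))
print(sample)
```

If the list has a pattern that repeats every 20 units, every selected unit shares that pattern, which is the periodicity problem mentioned on the next slide.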

  27. Systematic sampling • Advantages • Ensures representativeness across the list • Easy to implement • Disadvantages • Need complete list of units • Periodicity: an underlying pattern in the list (characteristics occurring at regular intervals) may be a problem

  28. Stratified Random Sampling… • A stratified random sample is obtained by separating the population into mutually exclusive sets, or strata, and then drawing simple random samples from each stratum. • Strata 1: Gender (Male, Female) • Strata 2: Age (< 20, 20-30, 31-40, 41-50, 51-60, > 60) • Strata 3: Occupation (professional, clerical, blue collar, other) • We can acquire information about the total population, make inferences within a stratum, or make comparisons across strata.

  29. Stratified Random Sampling… • After the population has been stratified, we can use simple random sampling to generate the complete sample: If we only have sufficient resources to sample 400 people total, we would draw 100 of them from the low income group… …if we are sampling 1000 people, we’d draw 50 of them from the high income group.
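The two figures on this slide imply proportional allocation: 100 of 400 is a 25% share for the low income group, and 50 of 1000 is a 5% share for the high income group. The sketch below reproduces those numbers; the remaining 70% is lumped into a single placeholder stratum, which is an assumption, since the slide's full income table did not survive in the transcript.

```python
def proportional_allocation(shares: dict[str, float], total: int) -> dict[str, int]:
    """Allocate a total sample across strata in proportion to population share."""
    if abs(sum(shares.values()) - 1.0) > 1e-9:
        raise ValueError("stratum shares must sum to 1")
    # Rounding can make the pieces miss the total by one in awkward cases.
    return {name: round(share * total) for name, share in shares.items()}


# Shares for low and high income are implied by the slide;
# "other income groups" is a placeholder for the strata not shown.
shares = {
    "low income": 0.25,
    "other income groups": 0.70,
    "high income": 0.05,
}

alloc_400 = proportional_allocation(shares, 400)
alloc_1000 = proportional_allocation(shares, 1000)
print(alloc_400)
print(alloc_1000)
```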

  30. Stratified sampling • Advantages • Can acquire information about whole population and individual strata • Precision increased if variability within strata is smaller (homogeneous) than between strata • Disadvantages • Sampling error is difficult to measure • Different strata can be difficult to identify • Loss of precision if small numbers in individual strata (resolved by sampling proportional to stratum population)

  31. Cluster Sampling… • A cluster sample is a simple random sample of groups or clusters of elements (vs. a simple random sample of individual objects). • This method is useful when it is difficult or costly to develop a complete list of the population members or when the population elements are widely dispersed geographically. Used more in the “old days”. • Cluster sampling may increase sampling error due to similarities among cluster members.

  32. Cluster sampling • Advantages • Simple as complete list of sampling units within population not required • Less travel/resources required • Disadvantages • Cluster members may be more alike than those in another cluster (homogeneous) • This needs to be taken into account in the sample size and in the analysis (“design effect”)
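The "design effect" can be illustrated with a widely used approximation for equal-sized clusters, DEFF = 1 + (m - 1) × ICC, where m is the cluster size and ICC is the intra-cluster correlation. This formula is brought in here as an assumption; the slides do not state which one the course uses, and the numbers below are hypothetical.

```python
import math


def design_effect(cluster_size: int, icc: float) -> float:
    """Approximate design effect for equal-sized clusters: 1 + (m - 1) * ICC."""
    return 1 + (cluster_size - 1) * icc


def inflate_sample_size(n_simple_random: int, deff: float) -> int:
    """Sample size needed under cluster sampling to match a simple random sample."""
    # The small epsilon guards against floating-point noise before rounding up.
    return math.ceil(n_simple_random * deff - 1e-9)


# Hypothetical inputs: 11 people per cluster, ICC of 0.1,
# and 400 subjects required under simple random sampling.
deff = design_effect(cluster_size=11, icc=0.1)
n_cluster = inflate_sample_size(400, deff)
print(f"design effect: {deff:.2f}, cluster sample size: {n_cluster}")
```

The more alike the members of a cluster are (higher ICC), the larger the design effect and the more subjects are needed.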

  33. Sampling and Non-Sampling Errors… • Two major types of error can arise when a sample of observations is taken from a population: sampling error and nonsampling error. • Sampling error refers to differences between the sample and the population that exist only because of the observations that happened to be selected for the sample. It is random, and we have no control over it. • Nonsampling errors are more serious and are due to mistakes made in the acquisition of data or due to the sample observations being selected improperly. Most likely caused by poor planning, sloppy work, act of the Goddess of Statistics, etc.

  34. Sampling Error… • Sampling error refers to differences between the sample and the population that exist only because of the observations that happened to be selected for the sample. • Increasing the sample size will reduce this type of error.
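Why a larger sample reduces sampling error can be seen from the standard error of a sample mean, σ/√n. A minimal sketch, assuming a population standard deviation of 10 (an illustrative value):

```python
import math


def standard_error(sigma: float, n: int) -> float:
    """Standard error of the sample mean for population SD sigma and sample size n."""
    if n <= 0:
        raise ValueError("n must be positive")
    return sigma / math.sqrt(n)


sigma = 10.0  # hypothetical population standard deviation
errors = {n: standard_error(sigma, n) for n in (25, 100, 400)}

for n, se in errors.items():
    print(f"n = {n:>3}: standard error = {se:.2f}")
```

Quadrupling the sample size only halves the standard error, so precision gets more expensive as it improves.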

  35. Nonsampling Error… • Nonsampling errors are more serious and are due to mistakes made in the acquisition of data or due to the sample observations being selected improperly. Three types of nonsampling errors: • Errors in data acquisition, • Nonresponse errors, and • Selection bias. • Note: increasing the sample size will not reduce this type of error.

  36. Errors in data acquisition… …arise from the recording of incorrect responses, due to: • incorrect measurements being taken because of faulty equipment, • mistakes made during transcription from primary sources, • inaccurate recording of data due to misinterpretation of terms, or • inaccurate responses to questions concerning sensitive issues.

  37. Nonresponse Error… • …refers to error (or bias) introduced when responses are not obtained from some members of the sample, i.e. the sample observations that are collected may not be representative of the target population. • As mentioned earlier, the Response Rate (i.e. the proportion of all people selected who complete the survey) is a key survey parameter and helps in understanding the validity of the survey and the sources of nonresponse error.

  38. Type 1 error • The probability of finding a difference between our sample and the population when there really isn’t one. • Known as α (or “type 1 error”) • Usually set at 5% (or 0.05)

  39. Type 2 error • The probability of not finding a difference between our sample and the population when one actually exists. • Known as β (or “type 2 error”) • Power is (1 - β) and is usually set at 80%
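The meaning of α can be checked by simulation: when samples really do come from the population being tested against, a test at α = 0.05 should still "find a difference" about 5% of the time. This is a sketch under assumed conditions (normal population with mean 100 and SD 15, samples of 30, two-sided z-test with known SD); none of these values come from the slides.

```python
import random
from statistics import NormalDist, fmean


def rejects_null(sample: list[float], mu0: float, sigma: float, alpha: float = 0.05) -> bool:
    """Two-sided z-test of H0: population mean == mu0, with known sigma."""
    z = (fmean(sample) - mu0) / (sigma / len(sample) ** 0.5)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return abs(z) > z_crit


rng = random.Random(42)  # arbitrary seed for repeatability
trials = 2000

# Every sample is drawn from the null population, so each rejection is a type 1 error.
false_alarms = sum(
    rejects_null([rng.gauss(100, 15) for _ in range(30)], mu0=100, sigma=15)
    for _ in range(trials)
)
type1_rate = false_alarms / trials
print(f"observed type 1 error rate: {type1_rate:.3f}")
```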

  40. Sample size • Quantitative • Qualitative

  41. Problem 1 • A study is to be performed to determine a certain parameter in a community. From a previous study, an SD of 46 was obtained. If a sampling error of up to 4 is to be accepted, how many subjects should be included in this study at the 99% level of confidence?

  42. Answer
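The worked answer for this slide did not survive in the transcript. One way to approach Problem 1, assuming the usual formula for estimating a mean, n = (z × SD / E)², is sketched below; the function name is illustrative and this is not necessarily the calculation shown on the original slide.

```python
import math
from statistics import NormalDist


def n_for_mean(sd: float, margin: float, confidence: float) -> int:
    """Sample size to estimate a mean within +/- margin at the given confidence."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # two-sided critical value
    return math.ceil((z * sd / margin) ** 2)


n_problem_1 = n_for_mean(sd=46, margin=4, confidence=0.99)
print(n_problem_1)
```

With the exact critical value (about 2.576) this gives 878 subjects; using the rounded table value 2.58 gives 881.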

  43. Problem 2 • A study is to be done to determine the effect of 2 drugs (A and B) on blood glucose level (BGL). From previous studies using those drugs, SDs of BGL of 8 and 12 g/dl were obtained respectively. • A confidence level of 95% (significance level of 5%) and a power of 90% are required to detect a mean difference between the two groups of 3 g/dl. How many subjects should be included in each group?

  44. Answer
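The worked answer for this slide did not survive in the transcript. A sketch for Problem 2, assuming the common formula for comparing two means, n per group = (z_α/2 + z_β)² × (SD₁² + SD₂²) / Δ², with a two-sided test; the original slide may have used a different variant.

```python
import math
from statistics import NormalDist


def n_per_group_means(sd1: float, sd2: float, diff: float,
                      confidence: float, power: float) -> int:
    """Subjects per group to detect a mean difference `diff` between two groups."""
    z_alpha = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # two-sided
    z_beta = NormalDist().inv_cdf(power)
    return math.ceil((z_alpha + z_beta) ** 2 * (sd1 ** 2 + sd2 ** 2) / diff ** 2)


n_problem_2 = n_per_group_means(sd1=8, sd2=12, diff=3, confidence=0.95, power=0.90)
print(n_problem_2)
```

Under these assumptions the result is 243 subjects in each group.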

  45. Problem 3 • It was desired to estimate the proportion of anaemic children in a certain preparatory school. In a similar study at another school, a proportion of 30% was detected. Compute the minimal sample size required at a confidence level of 95%, accepting a difference of up to 4% from the true population proportion.

  46. Answer
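The worked answer for this slide did not survive in the transcript. A sketch for Problem 3, assuming the usual formula for estimating a proportion, n = z² × p(1 - p) / d², and reading the accepted difference as 4 percentage points (0.04):

```python
import math
from statistics import NormalDist


def n_for_proportion(p: float, margin: float, confidence: float) -> int:
    """Sample size to estimate a proportion within +/- margin."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # two-sided critical value
    return math.ceil(z ** 2 * p * (1 - p) / margin ** 2)


n_problem_3 = n_for_proportion(p=0.30, margin=0.04, confidence=0.95)
print(n_problem_3)
```

Under these assumptions the minimal sample is 505 children.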

  47. Problem 4 • In previous studies, the percentage of hypertensives was 70% among diabetics and 40% among non-diabetics in a certain community. A researcher wants to perform a comparative study of hypertension among diabetics and non-diabetics at a confidence level of 95% and a power of 80%. What is the minimal sample to be taken from each group with 4% accepted difference of true value?

  48. Answer
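The worked answer for this slide did not survive in the transcript, and the problem statement is ambiguous about how the "4% accepted difference" enters the calculation. The sketch below uses one common formula for comparing two proportions, n per group = (z_α/2 + z_β)² × [p₁(1 - p₁) + p₂(1 - p₂)] / (p₁ - p₂)², which takes the 70% versus 40% gap as the difference to detect and does not use the 4% figure. The original slide may have intended a different formula.

```python
import math
from statistics import NormalDist


def n_per_group_proportions(p1: float, p2: float,
                            confidence: float, power: float) -> int:
    """Subjects per group to detect the difference between proportions p1 and p2."""
    z_alpha = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # two-sided
    z_beta = NormalDist().inv_cdf(power)
    variance_sum = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_beta) ** 2 * variance_sum / (p1 - p2) ** 2)


n_problem_4 = n_per_group_proportions(p1=0.70, p2=0.40, confidence=0.95, power=0.80)
print(n_problem_4)
```

Under these assumptions the result is 40 subjects in each group.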
