SAMPLING

POSTGRADUATE METHODOLOGY COURSE SAMPLING Hairul Hafiz Mahsol Institute for Tropical Biology & Conservation School of Science & Technology

A Brief Introduction to Sampling • Researchers usually cannot make direct observations of every individual in the population they are studying. • Instead, they collect data from a subset of individuals – a sample – and use those observations to make inferences about the entire population. • Ideally, the sample corresponds to the larger population on the characteristic(s) of interest. • In that case, the researcher's conclusions from the sample are probably applicable to the entire population.

Sample & Population • Even though Demark-Wahnefried et al. want to draw conclusions about all men in the U.S., they have not studied all of those men in their research. They were forced to choose a manageable subset to study. • That subset is a sample, and the number of subjects in the sample is called the sample size. • Notation: The sample size is typically denoted n, although that is not the only symbol one can use, nor is it the only use for that symbol.

The group that the sample is meant to represent is called the population. • (NOTE: The statistical concept of population is distinct from the biological one.) In studying the sample, we attempt to draw valid conclusions about the population. • Notation: If the population size is both finite and well-defined, then we will typically denote it N. • However, the population is usually not both finite and well-defined.

Individual observations • observations or measurements taken on the smallest sampling unit • Sample of observations • a collection of individual observations selected by a specified procedure • Population • the totality of individual observations about which inferences are to be made, existing anywhere in the world or at least within a definitely specified sampling area limited in space & time

History of sampling • In 1786 Pierre Simon Laplace estimated the population of France by using a sample, along with a ratio estimator. He also computed probabilistic estimates of the error. These were not expressed as modern confidence intervals but as the sample size that would be needed to achieve a particular upper bound on the sampling error with probability 1000/1001. His estimates used Bayes' theorem with a uniform prior probability and assumed that his sample was random. • The theory of small-sample statistics developed by William Sealy Gosset put the subject on a more rigorous basis in the 20th century. • However, the importance of random sampling was not universally appreciated, and in the USA the 1936 Literary Digest prediction of a Republican win in the presidential election went badly awry, due to severe bias. More than two million responses were obtained from a mailing based on magazine subscription lists and telephone directories. It was not appreciated that these lists were heavily biased towards Republicans, and the resulting sample, though very large, was deeply flawed.

Sampling • Sampling is that part of statistical practice concerned with the selection of individual observations intended to yield some knowledge about a population of concern, especially for the purposes of statistical inference. • In particular, results from probability theory and statistical theory are employed to guide practice.

Sampling is the process of selecting units (e.g., people, organizations etc) from a population of interest so that by studying the sample we may fairly generalize our results back to the population from which they were chosen.

The sampling process consists of seven stages: • Defining the population of concern • Specifying a sampling frame, a set of items or events that it is possible to measure • Specifying a sampling method for selecting items or events from the frame • Determining the sample size • Implementing the sampling plan • Sampling and data collection • Reviewing the sampling process

Sampling frame • In the most straightforward case, such as the sentencing of a batch of material from production (acceptance sampling by lots), it is possible to identify and measure every single item in the population and to include any one of them in our sample. • However, in the more general case this is not possible. • There is no way to identify all rats in the set of all rats. • There is no way to identify every voter at a forthcoming election (in advance of the election).

Having established the frame, there are a number of ways of organizing it to improve efficiency and effectiveness. • Simple sampling • In this case, all elements of the frame are treated equally and it is not subdivided or partitioned.

Stratified sampling • Where the population embraces a number of distinct categories, the frame can be organized by these categories into separate strata or demographics. • Typically, strata should be chosen to: • have means which differ substantially from one another • minimise variance within strata and maximise variance between strata.

Cluster sampling • Sometimes it is cheaper to 'cluster' the sample in some way, e.g. by selecting respondents from certain areas only, or certain time periods only. (Nearly all samples are in some sense 'clustered' in time, although this is rarely taken into account in the analysis.) • Cluster sampling is an example of 'two-stage sampling' or 'multi-stage sampling': in the first stage a sample of areas is chosen; in the second stage a sample of respondents within those areas is selected.
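The two stages described above can be sketched in Python. This is a minimal illustration, not a prescribed procedure: the function name, the area names and the respondent labels are hypothetical, and both stages are assumed to use simple random selection.

```python
import random

def two_stage_cluster_sample(clusters, n_clusters, n_per_cluster, seed=None):
    """Stage 1: pick areas at random. Stage 2: pick respondents within each chosen area."""
    rng = random.Random(seed)
    # Stage 1: a simple random sample of areas (the clusters).
    chosen_areas = rng.sample(sorted(clusters), n_clusters)
    # Stage 2: a simple random sample of respondents inside each chosen area.
    return {
        area: rng.sample(clusters[area], min(n_per_cluster, len(clusters[area])))
        for area in chosen_areas
    }

# Hypothetical frame: four areas with ten respondents each.
clusters = {
    area: [f"{area}-{i}" for i in range(1, 11)]
    for area in ["North", "South", "East", "West"]
}
cluster_sample = two_stage_cluster_sample(clusters, n_clusters=2, n_per_cluster=3, seed=7)
print(cluster_sample)
```

Only the chosen areas need to be visited, which is where the cost saving comes from.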

Quota sampling • In quota sampling, the population is first segmented into mutually exclusive sub-groups, just as in stratified sampling. • Then judgement is used to select the subjects or units from each segment based on a specified proportion. • In quota sampling the selection of the sample is non-random. • This non-random element is its greatest weakness and quota versus probability has been a matter of controversy for many years.

Sampling method • Within any of the types of frame identified above, a variety of sampling methods can be employed, individually or in combination. • Random sampling • In random sampling, also known as probability sampling, every combination of items from the frame, or stratum, has a known probability of occurring, but these probabilities are not necessarily equal. • With any form of sampling there is a risk that the sample may not adequately represent the population but with random sampling there is a large body of statistical theory which quantifies the risk and thus enables an appropriate sample size to be chosen.

The simplest form of random sampling is called simple random sampling. Pretty tricky, huh? Here's the quick description of simple random sampling: • Objective: To select n units out of N such that each of the NCn (N choose n) possible samples of size n has an equal chance of being selected. • Procedure: Use a table of random numbers, a computer random number generator, or a mechanical device to select the sample.
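As a rough sketch of the "computer random number generator" option, Python's standard library can draw n units out of N without replacement. The frame of 100 numbered units and the function name are assumptions made for illustration.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n units without replacement; every subset of size n is equally likely."""
    units = list(frame)
    if not 0 <= n <= len(units):
        raise ValueError("n must be between 0 and N")
    rng = random.Random(seed)  # seed is optional and only makes the draw repeatable
    return rng.sample(units, n)

# Hypothetical frame: units numbered 1..N with N = 100.
frame = list(range(1, 101))
srs = simple_random_sample(frame, n=10, seed=42)
print(srs)
```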

Matched Random Sampling • A method of assigning participants to groups in which pairs of participants are first matched on some characteristic and then individually assigned randomly to groups. • (Brown, Cozby, Kee, & Worden, 1999, p.371).
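One possible reading of this definition, sketched in Python: participants are sorted on the matching characteristic, adjacent participants form a pair, and a random shuffle decides which member of each pair goes to which group. Pairing by sorted order is an assumption made here for illustration, and the participant data are hypothetical.

```python
import random

def matched_random_assignment(participants, key, seed=None):
    """Pair participants on a characteristic, then randomly split each pair."""
    rng = random.Random(seed)
    ordered = sorted(participants, key=key)  # neighbours are the closest matches
    group_a, group_b = [], []
    # Walk through the sorted list two at a time; an odd participant is left out.
    for i in range(0, len(ordered) - 1, 2):
        pair = [ordered[i], ordered[i + 1]]
        rng.shuffle(pair)  # random assignment within the matched pair
        group_a.append(pair[0])
        group_b.append(pair[1])
    return group_a, group_b

# Hypothetical participants as (name, age), matched on age.
participants = [("Ana", 21), ("Ben", 22), ("Cid", 35),
                ("Dee", 36), ("Eli", 50), ("Fay", 52)]
group_a, group_b = matched_random_assignment(participants, key=lambda p: p[1], seed=5)
print(group_a, group_b)
```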

Stratified Random Sampling • Stratified Random Sampling, also sometimes called proportional or quota random sampling, involves dividing your population into homogeneous subgroups and then taking a simple random sample in each subgroup. • In more formal terms: • Objective: Divide the population into non-overlapping groups (i.e., strata) of sizes N1, N2, N3, ... Ni, such that N1 + N2 + N3 + ... + Ni = N. Then take a simple random sample from each stratum, using the same sampling fraction f = n/N in every stratum.
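A minimal Python sketch of the objective above, assuming the same sampling fraction f = n/N is applied in every stratum and each stratum size is rounded to a whole number of units. The habitat names, unit labels and function name are hypothetical.

```python
import random

def stratified_random_sample(strata, n, seed=None):
    """Take a simple random sample from each stratum with sampling fraction f = n / N."""
    rng = random.Random(seed)
    N = sum(len(units) for units in strata.values())  # N1 + N2 + ... = N
    f = n / N
    sample = {}
    for name, units in strata.items():
        n_i = round(f * len(units))  # units to draw from this stratum
        sample[name] = rng.sample(list(units), n_i)
    return sample

# Hypothetical population of N = 100 units in three strata.
strata = {
    "grassland": [f"G{i}" for i in range(60)],
    "woodland": [f"W{i}" for i in range(30)],
    "wetland": [f"T{i}" for i in range(10)],
}
stratified = stratified_random_sample(strata, n=10, seed=1)
print({name: len(units) for name, units in stratified.items()})
```

Because of the rounding, the stratum samples may not add up to exactly n when the stratum sizes are not multiples of 1/f.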

Systematic sampling • Selecting (say) every 10th name from the telephone directory is called an every 10th sample, which is an example of systematic sampling. • It is a type of nonprobability sampling unless the starting point is chosen at random or the directory itself is randomized. • It is easy to implement and the stratification induced can make it efficient, but it is especially vulnerable to periodicities in the list.

Systematic Random Sampling • Here are the steps you need to follow in order to achieve a systematic random sample: • number the units in the population from 1 to N • decide on the n (sample size) that you want or need • compute the interval size k = N/n • randomly select an integer between 1 and k as the starting unit • then take every kth unit from that starting unit onwards
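The steps above can be sketched as a short Python function. It assumes units are numbered 1 to N and uses integer division for k when N is not an exact multiple of n; the function name is hypothetical.

```python
import random

def systematic_random_sample(N, n, seed=None):
    """Return the unit numbers chosen by a systematic random sample of size n."""
    if not 0 < n <= N:
        raise ValueError("n must be between 1 and N")
    k = N // n                   # interval size (rounded down if N/n is not whole)
    rng = random.Random(seed)
    start = rng.randint(1, k)    # random integer between 1 and k, inclusive
    return [start + i * k for i in range(n)]  # every kth unit after the start

systematic = systematic_random_sample(N=100, n=20, seed=3)
print(systematic)
```

With N = 100 and n = 20 the interval is k = 5, so the sample is one of the five possible sequences starting at unit 1, 2, 3, 4 or 5.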

Mechanical sampling • Mechanical sampling is typically used in sampling solids, liquids and gases, using devices such as grabs, scoops, thief probes, the COLIWASA and the riffle splitter. • Care is needed in ensuring that the sample is representative of the frame. • Much work in this area was developed by Pierre Gy.

Convenience sampling • Sometimes called grab or opportunity sampling, this is the method of choosing items arbitrarily and in an unstructured manner from the frame. • Though almost impossible to treat rigorously, it is the method most commonly employed in many practical situations. • In social science research, snowball sampling is a similar technique, where existing study subjects are used to recruit more subjects into the sample.

Multi-Stage Sampling • The four methods we've covered so far -- simple, stratified, systematic and cluster -- are the simplest random sampling strategies. • In most real applied research, we would use sampling methods that are considerably more complex than these simple variations. The most important principle here is that we can combine the simple methods described earlier in a variety of useful ways that help us address our sampling needs in the most efficient and effective manner possible. When we combine sampling methods, we call this multi-stage sampling.

Sampling and data collection • Good data collection involves: • Following the defined sampling process • Keeping the data in time order • Noting comments and other contextual events • Recording non-responses

Review of sampling process • After sampling, a review should be held of the exact process followed in sampling, rather than that intended, in order to study any effects that any divergences might have on subsequent analysis. • A particular problem is that of non-responses.

Non-responses • In survey sampling, many of the individuals identified as part of the sample may be unwilling to participate or impossible to contact. • In this case, there is a risk of differences, between (say) the willing and unwilling, leading to selection bias in conclusions. • This is often addressed by follow-up studies which make a repeated attempt to contact the unresponsive and to characterise their similarities and differences with the rest of the frame.

Ecological Sampling • If we want to know what kind of plants and animals are in a particular habitat, and how many there are of each species, it is usually impossible to go and count each and every one present. It would be like trying to count different sizes and colours of grains of sand on the beach. • Samples are usually taken using a standard sampling unit of some kind. This ensures that all of the samples represent the same area or volume (water) of the habitat each time. • There are three main ways of taking samples. • Random Sampling. • Systematic Sampling (includes line transect and belt transect methods). • Stratified Sampling.

RANDOM SAMPLING • Random sampling is usually carried out when the area under study is fairly uniform or very large, and/or when there is limited time available. • When using random sampling techniques, large numbers of samples/records are taken from different positions within the habitat. • A quadrat frame is most often used for this type of sampling. • The frame is placed on the ground (or on whatever is being investigated) and the animals and/or plants inside it are counted, measured, or collected, depending on what the survey is for. This is done many times at different points within the habitat to give a large number of different samples.

A better method of random sampling is to map the area and then to lay a numbered grid over the map. • A (computer-generated) random number table is then used to select which squares to sample in. • For example, if we have mapped our habitat and have then laid a numbered grid over it (see figure), we could then choose which squares to sample in by using the random number table.
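As a sketch of this grid method, the Python snippet below uses a pseudo-random number generator in place of a printed random number table. The 10 by 10 grid, its row-by-row numbering from 1, and the function name are assumptions made for illustration.

```python
import random

def choose_grid_squares(rows, cols, n_squares, seed=None):
    """Pick distinct numbered grid squares and report each as (number, row, column)."""
    rng = random.Random(seed)
    # Squares are numbered 1..rows*cols, row by row from the top-left corner.
    numbers = rng.sample(range(1, rows * cols + 1), n_squares)
    squares = []
    for number in sorted(numbers):
        row, col = divmod(number - 1, cols)
        squares.append((number, row + 1, col + 1))  # 1-based row and column
    return squares

# Hypothetical 10 x 10 grid laid over the habitat map; sample five squares.
quadrat_squares = choose_grid_squares(rows=10, cols=10, n_squares=5, seed=3)
print(quadrat_squares)
```

A quadrat would then be placed in each selected square.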

Random number Table • 2017422823175966386102108610515592524425744904190304103353701154486394609449573894704931386723422965408878713718486406572215781569843252325415125402013738371293932912182730305591875057585149361253964045047797361499455295698503835187855622374491994989399460484906776472592608512557162391021996475989652784309263372624236604506504

SYSTEMATIC SAMPLING • Systematic sampling is when samples are taken at fixed intervals, usually along a line. • This normally involves doing transects, where a sampling line is set up across areas where there are clear environmental gradients. • For example, you might use a transect to show the changes in plant species as you move from grassland into woodland, or to investigate the effect on species composition of a pollutant radiating out from a particular source. • Methods used in this type of sampling: • a) Line Transect Method • b) Belt Transect Method
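Taking samples at fixed intervals along a line amounts to simple arithmetic, sketched here in Python. The 50 m transect, the 5 m interval and the function name are hypothetical values chosen for illustration.

```python
def transect_stations(length_m, interval_m, start_m=0):
    """Return the distances along a transect line at which samples are taken."""
    if interval_m <= 0:
        raise ValueError("interval_m must be positive")
    # Number of stations that fit between the start and the end of the line.
    count = int((length_m - start_m) // interval_m) + 1
    return [start_m + i * interval_m for i in range(max(count, 0))]

# Hypothetical 50 m transect from grassland into woodland, sampled every 5 m.
stations = transect_stations(length_m=50, interval_m=5)
print(stations)
```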

STRATIFIED SAMPLING • Stratified sampling is used to take into account different areas (or strata) which are identified within the main body of a habitat. • These strata are sampled separately from the main part of the habitat. • The name 'stratified sampling' comes from the term 'strata' (plural) or stratum (singular). • For ease of understanding, the term 'unit' will be used in the following explanation, rather than stratum.


THANK YOU
