Sampling In Research

A sample is a finite part of a statistical population whose properties are studied to gain information about the whole (Webster, 1985). When dealing with people, it can be defined as a set of respondents (people) selected from a larger population for the purpose of a survey.

Sampling is the act, process, or technique of selecting a suitable, representative sample of a population for the purpose of determining parameters or characteristics of the whole population.

Sampling is done to draw conclusions about populations from selected representative samples. Inferential statistics are used because they make it possible to determine a population's characteristics by directly observing only a portion (sample) of the population. In a survey, only a representative sample is asked the questions; this is what distinguishes a survey from a census, in which every unit of the population is covered.

Probability and nonprobability sampling

In a probability sampling scheme, every unit in the population has a nonzero chance of being selected in the sample, and this probability can be accurately determined. Together, these two traits make it possible to produce unbiased estimates of population totals by weighting sampled units according to their probability of selection.
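The weighting idea can be sketched in a few lines of Python. This is a minimal illustration, assuming each sampled unit is weighted by the inverse of its selection probability (the usual way such weights are built); the function name `estimate_total` and the numbers are hypothetical and are not taken from any study described here.

```python
def estimate_total(sampled_values, selection_probs):
    """Estimate a population total by weighting each sampled value by 1/p,
    where p is that unit's known probability of selection."""
    if len(sampled_values) != len(selection_probs):
        raise ValueError("need one selection probability per sampled value")
    total = 0.0
    for value, p in zip(sampled_values, selection_probs):
        if not 0 < p <= 1:
            raise ValueError("selection probabilities must be in (0, 1]")
        total += value / p  # a unit chosen with p = 0.25 stands for 4 units
    return total


# Hypothetical sample: three units with their known selection probabilities
values = [10, 20, 30]
probs = [0.5, 0.25, 0.5]
print(estimate_total(values, probs))  # prints 160.0
```

A unit that was unlikely to be selected counts for more, because it stands in for many similar units that were not drawn.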

Probability sampling includes Stratified Sampling, Probability Proportional to Size Sampling, Simple Random Sampling, Systematic Sampling, and Cluster or Multistage Sampling. All of these methods have two things in common: first, every element has a known nonzero probability of being sampled, and second, each involves random selection at some point.

In nonprobability sampling, some elements of the population have no chance of selection (they are ‘out of coverage’ or ‘undercovered’), or the probability of selection cannot be accurately determined. Elements are selected on the basis of assumptions about the population of interest, and these assumptions form the criteria for selection. Because the selection of elements is nonrandom, nonprobability sampling does not allow sampling errors to be estimated. This places limits on how much information a sample can provide about the population. Information about the relationship between sample and population is also limited, making it difficult to extrapolate from the sample to the population.

Sampling methods

Simple random sampling: In this type, all subsets of the frame of a given size are given an equal probability of selection; the frame is not subdivided or partitioned. Moreover, any given pair of elements has the same chance of selection as any other such pair (and similarly for triples, and so on). This equal opportunity minimizes bias and simplifies analysis of results. In particular, the variance between individual results within the sample is a good indicator of variance in the overall population, which makes it relatively easy to estimate the accuracy of results.

Simple random sampling is always an equal probability of selection (EPS) design, but not all EPS designs are simple random sampling.
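As a rough sketch, simple random sampling without replacement can be done with Python's standard `random.sample`, which draws distinct elements so that every subset of the requested size is equally likely. The frame of household IDs and the `seed` parameter below are assumptions made for illustration only.

```python
import random


def simple_random_sample(frame, n, seed=None):
    """Draw n distinct units from the frame, with every subset of size n
    equally likely. The seed only makes the illustration reproducible."""
    rng = random.Random(seed)
    return rng.sample(list(frame), n)


# Hypothetical sampling frame: household IDs 1 to 100
frame = range(1, 101)
sample = simple_random_sample(frame, 10, seed=42)
print(sorted(sample))
```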

Systematic sampling: Systematic sampling relies on arranging the target population according to some ordering scheme and then selecting elements at regular intervals through that ordered list. It begins with a random start and then proceeds with the selection of every kth element from then onwards, where k = (population size / sample size). It is important that the starting point is not automatically the first in the list, but is instead randomly chosen from within the first to the kth element in the list.
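The procedure just described can be sketched as follows. This is an illustrative version, not a definitive one: it assumes the interval k is rounded down when the population size is not an exact multiple of the sample size, and the function name and `seed` parameter are hypothetical.

```python
import random


def systematic_sample(ordered_frame, n, seed=None):
    """Select every kth element of an ordered list after a random start
    chosen among the first k elements."""
    items = list(ordered_frame)
    if n <= 0 or n > len(items):
        raise ValueError("sample size must be between 1 and the frame size")
    k = len(items) // n        # sampling interval (rounded down: an assumption)
    rng = random.Random(seed)
    start = rng.randrange(k)   # random position within the first k elements
    return [items[start + i * k] for i in range(n)]


# Hypothetical ordered frame of 100 units, sample of 10, so k = 10
print(systematic_sample(range(100), 10, seed=1))
```

Because the start is random, each unit in this example has the same 1-in-k chance of selection, even though only k distinct samples are possible.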

Stratified sampling: When the population embraces a number of distinct categories, the frame can be organized by these categories into separate “strata.” Each stratum is then sampled as an independent sub-population, out of which individual elements can be randomly selected. There are several potential benefits to stratified sampling.

•  Dividing the population into distinct, independent strata enables researchers to draw inferences about specific subgroups that may be lost in a more generalized random sample.

•  Utilizing a stratified sampling method can lead to more efficient statistical estimates (provided that strata are selected based upon relevance to the criterion in question, instead of availability of the samples).

•  Sometimes data are more readily available for individual, pre-existing strata within a population than for the overall population; in such cases, using a stratified sampling approach may be more convenient than aggregating data across groups.

•  Since each stratum is treated as an independent population, different sampling approaches can be applied to different strata, potentially enabling researchers to use the approach best suited (or most cost-effective) for each identified subgroup within the population.
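A minimal sketch of stratified sampling is shown below, assuming a simple random sample is drawn independently within each stratum. The function name, the urban/rural households, and the per-stratum sample sizes are all hypothetical choices made for illustration.

```python
import random
from collections import defaultdict


def stratified_sample(units, stratum_of, n_per_stratum, seed=None):
    """Group units into strata, then draw a simple random sample of the
    requested size from each stratum independently."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)
    return {
        name: rng.sample(members, n_per_stratum[name])
        for name, members in strata.items()
    }


# Hypothetical frame: 60 urban and 40 rural households
households = [("urban", i) for i in range(60)] + [("rural", i) for i in range(40)]
sample = stratified_sample(
    households,
    stratum_of=lambda h: h[0],              # the stratum is the first field
    n_per_stratum={"urban": 6, "rural": 4},
    seed=7,
)
for name, chosen in sample.items():
    print(name, chosen)
```

Since each stratum is handled separately, the call to `rng.sample` could be replaced by a different selection method for any one stratum without affecting the others.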

Sampling error

Sampling error consists of the differences between the sample and the population that are due solely to the particular units that happen to have been selected.

There are two basic causes of sampling error. One is chance: the error that occurs simply because of bad luck, which may result in untypical choices. Unusual units do exist in a population, and there is always a possibility that an abnormally large number of them will be chosen. For example, in a recent study in which I was looking at the number of trees, I selected a sample of households randomly but, strangely enough, the two households in the whole population that had the highest numbers of trees (10,018 and 6,345) were both selected, making the sample average higher than it should have been. The average with these two extremes removed was 828 trees. The main protection against this kind of error is to use a large enough sample. The second cause of sampling error is sampling bias.
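The effect of sample size on chance error can be illustrated with a small simulation. The population below is entirely made up (mostly moderate values plus two extreme ones); it is not the data from the tree study, and the function name, repeat count, and seeds are assumptions for the sketch.

```python
import random
import statistics


def spread_of_sample_means(population, n, repeats=300, seed=0):
    """Draw many simple random samples of size n and return the standard
    deviation of their means, a rough measure of chance sampling error."""
    rng = random.Random(seed)
    means = [statistics.mean(rng.sample(population, n)) for _ in range(repeats)]
    return statistics.stdev(means)


# Made-up population of 1,000 units: 998 moderate values and two extremes
pop_rng = random.Random(1)
population = [pop_rng.randint(0, 2000) for _ in range(998)] + [10018, 6345]

for n in (25, 100, 400):
    print(n, round(spread_of_sample_means(population, n), 1))
```

The sample means vary less from one draw to the next as n grows, which is why a large enough sample protects against a few unusual units dominating the result.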

Sampling bias is a tendency to favour the selection of units that have particular characteristics. It is usually the result of a poor sampling plan. The most notable form is non-response bias, which arises when for some reason some units have no chance of appearing in the sample.

Non-sampling error (measurement error): The other main cause of unrepresentative samples is non-sampling error. This type of error can occur whether a census or a sample is being used. Like sampling error, non-sampling error may either be produced by participants in the statistical study or be an innocent by-product of the sampling plans and procedures.

A non-sampling error is an error that results solely from the manner in which the observations are made. Biased observations due to inaccurate measurement can be innocent but very damaging. A story is told of a French astronomer who once proposed a new theory based on spectroscopic measurements of light emitted by a particular star. When his colleagues discovered that the measuring instrument had been contaminated by cigarette smoke, they rejected his findings.

In surveys of personal characteristics, unintended errors may result from:

•  The manner in which the response is elicited

•  The social desirability of the persons surveyed

•  The purpose of the study

•  The personal biases of the interviewer or survey writer

The interviewer effect: No two interviewers are alike, and the same person may provide different answers to different interviewers. The manner in which a question is formulated can also result in inaccurate responses. Individuals tend to provide false answers to particular questions. For example, some people want to appear younger or older for reasons known only to themselves. If you ask such a person their age in years, it is easy for them to lie by misstating their age by one or more years. If you ask instead which year they were born, giving a false date requires a bit of quick arithmetic, so a date of birth will tend to be more accurate.

The respondent effect: Respondents might also give incorrect answers to impress the interviewer. This type of error is the most difficult to prevent because it results from outright deceit on the part of the respondent. An example is what I witnessed in my recent study, in which I was asking farmers how much maize they had harvested last year (1995). In most cases, the men tended to lie by giving the recommended expected yield, which is 25 bags per acre. The responses from the men looked so uniform that I became suspicious. I compared them with the responses of the men's wives, which were all different. To decide which one was right, whenever possible I tactfully verified with an older son or daughter. It is important to acknowledge that certain psychological factors induce incorrect responses, and great care must be taken to design a study that minimizes their effect.

Knowing the study purpose: Knowing why a study is being conducted may create incorrect responses. A classic example is the question: What is your income? If a government agency is asking, a different figure may be provided than the respondent would give on an application for a home mortgage. One way to guard against such bias is to camouflage the study's goals. Another remedy is to make the questions very specific, allowing no room for personal interpretation. For example, “Where are you employed?” could be followed by “What is your salary?” and “Do you have any extra jobs?” A sequence of such questions may produce more accurate information.

Induced bias: Finally, it should be noted that the personal prejudices of either the designer of the study or the data collector may tend to induce bias. In designing a questionnaire, questions may be slanted in such a way that a particular response will be obtained even though it is inaccurate. For example, an agronomist may apply fertilizer to certain key plots, knowing that they will provide more favourable yields than others. To protect against induced bias, the advice of an individual trained in statistics should be sought in the design, and someone else who is aware of research pitfalls should serve in an auditing capacity.