Psychological Research: Sampling, Bias and Measurement (16-Jan-2003)

Sampling

A hypothesis may be quite general about the population it describes, e.g. "watching violent TV programmes causes children to have nightmares". It's not possible to involve all children in an experiment to test this hypothesis, and so a sample must be obtained. Sampling means obtaining a subset of the target population.

There are different ways of obtaining samples:

A random sample is one where everyone in the target population has an equal chance of being selected.
Random sampling would be the ideal way to obtain participants for an experiment, but it is usually not practical other than for small target populations. For example, it would be extremely difficult to devise a scheme that would randomly select children from the group "all children".
a quota (or quasi-random, or stratified random) sample is obtained by splitting the target population into categories, and then selecting people from those categories in the same ratio as they appear in the target population. For example, if 20% of children are anxious avoidant, 50% are securely attached, and 30% are anxious resistant, a quota sample of ten children based on this categorisation would contain 2 AA, 5 SA, and 3 AR children.
The problem with this technique is that it may be difficult to come up with an ideal categorisation.
a convenience, or opportunity sample is one obtained by picking whoever happens to be at hand - e.g. all the people who you find in the room
This has the advantage of being very easy to do, although obviously it's a long way from giving everyone in the target population an equal chance of participation (unless your target population is "anyone nearby").
a systematic sample is one obtained by selecting, e.g. every fifth house in a road
A systematic sample will typically not give everyone in the target population an equal chance of participation (e.g. the occupants of the third house won't be included).
a self-selecting sample is one where people volunteer to participate.
Milgram's obedience experiments are a classic example of this - Milgram advertised in a newspaper for volunteers. The problem with this technique is that the people who volunteer for experiments may have characteristics that mean they're not representative of the target population. E.g. Ora (1965) showed that volunteers had a tendency to a particular personality type.
In fact, because of ethical issues, it could be argued that nearly all experiments have some degree of self-selection, since however the sample is chosen, the Ps have to agree to take part.
a haphazard sample is one obtained where there is no conscious bias by the selector, but not everyone has an equal chance of being selected. For example, interviewing people at random on a street corner.
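
As a rough illustration of three of the sampling methods above (random, quota and systematic), here is a minimal Python sketch. The function names, the `seed` parameter and the example population of 100 children are assumptions made for illustration only; they are not taken from the textbook.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Random sample: every member has an equal chance of selection."""
    rng = random.Random(seed)
    return rng.sample(population, n)

def quota_sample(population, category_of, n, seed=None):
    """Quota sample: pick from each category in proportion to its share
    of the population. Rounding means the total can differ slightly
    from n when the shares do not divide evenly."""
    rng = random.Random(seed)
    groups = {}
    for person in population:
        groups.setdefault(category_of(person), []).append(person)
    sample = []
    for members in groups.values():
        share = round(n * len(members) / len(population))
        sample.extend(rng.sample(members, share))
    return sample

def systematic_sample(population, step, start=0):
    """Systematic sample: every step-th member, beginning at index start."""
    return population[start::step]

# Hypothetical population: 20 AA, 50 SA and 30 AR children.
children = (
    [(i, "AA") for i in range(20)]
    + [(i, "SA") for i in range(20, 70)]
    + [(i, "AR") for i in range(70, 100)]
)
quota = quota_sample(children, lambda child: child[1], 10, seed=1)

# Every fifth house in a road of 20 houses.
houses = systematic_sample(list(range(1, 21)), 5, start=4)
```

With the 20/50/30 split, a quota sample of ten contains 2 AA, 5 SA and 3 AR children, and the systematic sample never includes the third house.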

While random sampling is probably the best way of finding a representative sample of a larger population, it is not always practical. Studies have shown that the large majority of psychological studies have been carried out using self-selecting samples of university students (Dobson et al, 1981).

To counter criticisms of bias, experimental results should be confirmed by replication: run the same experiment again with a different set of participants and check whether the results are the same.

Bias

The results of a psychological experiment may be confounded by the behaviour of the people involved, since both participants and experimenter may (albeit unconsciously) alter their behaviour as a result of their involvement. In the case of participants, this is commonly caused by:

Demand characteristics, which refers to the way in which participants may alter their behaviour because they know they're participants in an experiment. For example, they may want to "help the experiment succeed", or they may want to appear to be "doing the right thing". In any case, their behaviour has been affected by the experimental setting itself.
In some cases, a single-blind procedure may be used to overcome demand characteristics. For example, if a P doesn't know whether he's been given water or alcohol, he won't know whether he is "expected" to become inebriated or not. In other cases, it may be necessary to employ some form of deception.
Orne (1962) demonstrated how demand characteristics could affect experimental results, and used his research in his criticisms of Milgram.

An experiment may also be confounded by the experimenter's behaviour:

Experimenter expectation may affect the results: if the experimenter knows that one set of Ps has been given placebos, then he may treat them differently from a set of Ps who he knows have been given drugs.
To counter this, it may be possible to use a double-blind (set the experiment up in such a way so that neither participant nor experimenter knows who's taken the drugs and who's taken placebos).
Rosenthal and Fode (1963) demonstrated how experimenter expectancy can affect results.

Measuring Results

To have value, results from an experiment must be analysed and measured in some way. There are some standard techniques used for this. See pp. 162-166 in the A Level book.

There are three Measures of Central Tendency, which are ways to find the "average" result:
1. the mean is obtained by totalling the scores and then dividing by the number of scores. The mean of (10,2,2,5,4,12,3,15,2,5) is (60/10) = 6.
  The mean is the most sensitive MoCT: it is influenced by all the scores in the results. However, it is sensitive to "rogue" values. For example the mean of (1,1,1,1,1,1,1,1,2,1000) is (1010/10) = 101.
2. the mode is the score that occurs most frequently. The mode of (10,2,2,5,4,12,3,15,2,5) is 2. If there are two modes, the result is bi-modal. For example the series (1,2,2,3,3,4) is bi-modal(2,3). More than two modes generally means that there is no meaningful mode.
  This is the only MoCT that can be used when the results are not numeric. For example, if the results were "blue","blue","red","green". However, like the mean, it could be influenced by "rogue" values. For example, the mode of (1,2,3,4,5,6,7,8,9,10,1000,1000) is 1000.
3. the median is the middle score when all the scores have been sorted into order. If there is an even number of scores, the median is the mean of the two middle scores. The median of (10,2,2,5,4,12,3,15,2,5) is 4.5
  The median is less likely to be affected by the odd "rogue" result. However, it depends only on the middle one or two scores and so may not be representative.
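
The three measures above can be checked with Python's standard `statistics` module. This is only a sketch using the example scores from these notes; the variable names are arbitrary.

```python
import statistics

scores = [10, 2, 2, 5, 4, 12, 3, 15, 2, 5]

mean = statistics.mean(scores)        # total of the scores / number of scores
median = statistics.median(scores)    # middle of the sorted scores
modes = statistics.multimode(scores)  # list of the most frequent score(s)

# A "rogue" value drags the mean upwards but leaves the median alone.
rogue = [1, 1, 1, 1, 1, 1, 1, 1, 2, 1000]
rogue_mean = statistics.mean(rogue)
rogue_median = statistics.median(rogue)

# A bi-modal series has two modes.
bimodal = statistics.multimode([1, 2, 2, 3, 3, 4])

# The mode also works for non-numeric results.
colour_mode = statistics.multimode(["blue", "blue", "red", "green"])
```

`multimode` needs Python 3.8 or later; it returns a list so that bi-modal results are reported in full.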
There are two Measures of Dispersion, which are ways to state how much variation was found in the results:
1. the range is obtained by subtracting the lowest score from the highest score. The range of (10,2,2,5,4,12,3,15,2,5) is (15-2) = 13.
  The range has the advantage that it's quick and easy to calculate. However, it only uses two of the scores and so may be misleading.
2. Standard deviation provides a measure of how much scattering there is of the results around the mean value.
  This has the advantage of using all the values in the set of results, but the disadvantage of being more complicated to compute. Use a calculator.
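
For anyone using a computer rather than a calculator, the sketch below works out both measures of dispersion for the example scores. The notes do not say whether the population or the sample form of the standard deviation is intended, so both are shown; treating the ten scores as either one is an assumption.

```python
import math
import statistics

scores = [10, 2, 2, 5, 4, 12, 3, 15, 2, 5]

# Range: highest score minus lowest score.
score_range = max(scores) - min(scores)

# Standard deviation by hand: square each score's distance from the mean,
# average those squares, then take the square root.
mean = sum(scores) / len(scores)
squared_deviations = [(s - mean) ** 2 for s in scores]
population_sd = math.sqrt(sum(squared_deviations) / len(scores))

# The sample form divides by (n - 1) instead of n.
sample_sd = math.sqrt(sum(squared_deviations) / (len(scores) - 1))

# The standard library gives the same answers.
library_population_sd = statistics.pstdev(scores)
library_sample_sd = statistics.stdev(scores)
```

Many calculators offer both forms, so it is worth checking which one a given key computes.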

Graphical Representation of Data

See pp 166-170 in A Level book for pictures; common types of graph are:

a bar chart is a way to display a series of results by having a column for each result whose height varies according to that result. For example, a bar chart might represent answers to the question "what is your favourite drink?" with columns for tea, coffee, etc. In a bar chart, the columns have no order (apart from whatever order makes most sense to show the results).
A bar chart can be used to contrast different values.
a histogram is a particular type of bar chart, where the columns are ordered and represent frequencies of particular results. For example, a histogram might represent "how many times participants sneezed" with columns for all values from zero to the maximum. Some columns might have zero height.
A histogram can be used to show a pattern in the overall set of results.
a line graph is drawn by having straight lines joining the points at the centre of the top of each column in a histogram - it uses the same data as a histogram but represents it in a slightly different form.
a pie chart is a circle which is divided into segments as a graphical way to represent percentages. For example, a pie chart showing eye-colour might be coloured one sixth green, one half brown and one third blue to represent values of roughly 17%, 50% and 33%.
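
To show how the frequencies behind a histogram are tallied, here is a small Python sketch. The sneeze counts are invented for illustration, and the function names are hypothetical.

```python
from collections import Counter

def histogram_counts(values):
    """Return (value, frequency) pairs for every value from zero to the
    maximum, including values that never occurred (frequency 0)."""
    counts = Counter(values)
    return [(v, counts[v]) for v in range(0, max(values) + 1)]

def text_histogram(values):
    """Draw one row of '#' marks per value."""
    return "\n".join(
        f"{value:>2} | {'#' * count}" for value, count in histogram_counts(values)
    )

# Hypothetical data: number of sneezes recorded for ten participants.
sneezes = [0, 1, 1, 3, 0, 1, 5, 3, 1, 0]
columns = histogram_counts(sneezes)
print(text_histogram(sneezes))
```

Unlike a bar chart of favourite drinks, the columns here are kept in numeric order, and the columns for 2 and 4 sneezes appear with zero height.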

References

Books

Psychology: A New Introduction for A Level (2nd edition), Gross et al., pp. 159-173

Homework

Methods and Statistics Test

Two groups of subjects were given a memory test. They each memorised a set of words and then one group was tested for immediate recall and the other group was tested on delayed recall of the words (after 10 minutes). Their recall scores are given below:

Immediate Recall	Delayed Recall
6	8
5	9
6	10
7	4
5	6
4	8
4	6
7	11
5	9
3	3

State an appropriate two tailed hypothesis for this experiment (2 marks)
There will be a difference between the number of words recalled by subjects who are tested immediately after being shown the words and the number recalled by subjects tested 10 minutes after having been shown the words.
State an appropriate null hypothesis (2 marks)
There will be no difference between the number of words recalled by subjects who are tested immediately after being shown the words and the number recalled by subjects tested 10 minutes after having been shown the words.
What was the Independent Variable in this study? (1 mark)
The length of time between being shown the words and being asked to recall them.
What was the dependent variable in this study? (1 mark)
The number of words recalled.
What is the difference between a one and a two-tailed hypothesis? (1 mark)
A two-tailed hypothesis says that a change in a specific IV will result in a change in a specific DV, without stating the direction of that change. A one-tailed hypothesis predicts the direction of the change.
Under what circumstances would one wish to make a one-tailed hypothesis? (2 marks)
If one has some prior evidence to suggest that a particular phenomenon has a specific cause and is looking for confirmation.
The Independent Subjects Design was used in this experiment. Explain what is meant by this and explain one difficulty with this design. (2 marks)
In an Independent Subjects Design, each variant of the experiment (immediate vs. delayed) is performed by a separate group of subjects. One difficulty with this design is that differences between the participants (e.g. their innate memory skills) may cause differences in the results.
How are Ps ideally allocated to their groups when using the Independent Subjects Design? (1 mark)
Ideally, Ps are randomly allocated to one group or another, where a random selection guarantees each P an equal chance of being in either group.
Name two other experimental designs that can be used with groups of subjects. (2 marks)
Repeated Measures; Matched Pairs
Name any two variables or factors, apart from subject variables, that would have to be controlled when carrying out the above experiment. (2 marks)
The list of words being memorised, and the time allowed to memorise the words.
Calculate the mean for the Immediate recall group. (1 mark)
5.2
What measure of dispersion could be used to describe these scores? (1 mark)
Range
The experimenter wishes to choose her subjects from the population of full-time students at a local college of technology. This college has 300 such students. What would be an appropriate method of obtaining a random sample from such a population? (3 marks)
Use a computer to generate a random list of twenty names from the list of enrolled students.
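
A minimal Python sketch of the last answer, combined with the random allocation of Ps to groups described earlier in this test. The student names are placeholders and the function name is hypothetical; a real study would use the college's actual enrolment list.

```python
import random

def draw_and_allocate(names, sample_size, seed=None):
    """Draw a random sample, then split it into two equal groups.

    random.sample returns the chosen names in random order, so taking
    the first half and the second half is itself a random allocation.
    """
    rng = random.Random(seed)
    chosen = rng.sample(names, sample_size)
    half = sample_size // 2
    return chosen[:half], chosen[half:]

# Placeholder enrolment list for the 300 full-time students.
students = [f"student_{i:03d}" for i in range(1, 301)]

immediate_group, delayed_group = draw_and_allocate(students, 20, seed=42)
```

Passing a `seed` makes the draw repeatable for checking; leaving it as `None` gives a fresh random draw each time.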