Realistic and Accessible Statistics for Archaeologists about this module Start here I’ll come back later
Aims
• The aim of this module is to introduce archaeologists to some basic ideas of statistical thought and to some basic techniques of statistical analysis, so that they will be able to:
  • assess the relevance of a range of statistical techniques to archaeological problems,
  • undertake analyses based on some of the more widely used approaches,
  • assess critically published examples of the use of statistical techniques.
What’s here? • The main vehicles for getting these ideas across are a series of worked examples, all based on real archaeological problems. • Each is presented as an annotated set of worksheets in an Excel workbook. • There are also suggestions for further reading, and some notes for teachers. Using Excel notes for teachers reading
Using Excel • The Excel workbooks are designed to open at the top left (cell A1) of the sheet entitled “Background – start here”. • If one of the workbooks does not, then click on the tab “Background – start here” (the left-most tab; it may not be visible initially), and scroll up to the top of the sheet. Why Excel?
Why Excel? • Excel was chosen in preference to more recognised statistical packages, such as SPSS and Minitab, on the grounds of cost and accessibility. • There is no reason why the data should not be imported into another package for analysis, if that is felt to be desirable.
Which tests are available • Only a very selective range of techniques is presented here. chi-squared test binomial test t-test regression general advice I’m not yet ready to decide
General advice • You should resist the temptation to ‘shoehorn’ a problem into one of the scenarios presented here; if it does not ‘fit’, then another technique may well be the right way to go. • In such cases, you may wish to consult the recommended reading, or perhaps a friendly statistician. If the latter, make sure that (s)he really understands the nature of your data, e.g. that sherds are not independent objects in their own right, but are parts of broken pots.
Choosing a technique • As each technique is presented in the context of an archaeological problem, it may not be immediately apparent which is relevant to a problem that concerns you. For this reason, you are encouraged to read them all, and to look for similarities and parallels with other problems. • However, in case you do not have time to do this, some guidance is provided. guidance
Choosing a technique to suit a problem • The first step is to decide the ‘type’ of the data with which you are dealing. • Broadly speaking, there are four types of data that you are likely to meet: nominal ordinal interval ratio Once we know the type of our data, we can look at some techniques that have been developed to analyse such data.
Moving on • If you have got this far, you should know the difference between nominal, ordinal, interval and ratio data. • You can now move on and see some of the things you can do with these types of data. nominal or ordinal data interval or ratio data
Nominal data • These are simple ‘labels’, with no natural order to them. • For example, ‘types’ of artefacts would be nominal data, even if the types were numbered. what can I do with nominal data?
Ordinal data • These are similar to nominal data, but the values have a meaningful order to them. • For example, phases on a site would be ordinal data, because we could say that Phase 1 comes before Phase 2, which comes before Phase 3, and so on. what can I do with ordinal data?
Interval data • For this sort of data, it makes sense to talk about the differences between values; in other words, it is reasonable to subtract one value from another to express a difference. • For example, dates BC/AD belong to this type: we can say that the difference between AD 600 and AD 900 is 300 years. what can I do with interval data?
Ratio data • This sort of data is like interval data, but in addition it makes sense to divide one value by another. • Length is a good example: it makes sense to say that 200mm is twice as long as 100mm (but it would not make sense to say that 2000 BC is twice as old as 1000 BC). what can I do with ratio data?
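The contrast between interval and ratio data can be checked with a little arithmetic. The sketch below is illustrative only: the offset of 500 years is a hypothetical change of calendar origin, chosen for convenience, and is not taken from any of the worked examples.

```python
# Interval data: calendar years are counted from an arbitrary origin.
year_a, year_b = 600, 900   # AD 600 and AD 900, as in the interval-data example
offset = 500                # hypothetical: count years from AD 500 instead

diff_original = year_b - year_a                        # 300 years
diff_shifted = (year_b - offset) - (year_a - offset)   # still 300 years

ratio_original = year_b / year_a                       # 1.5
ratio_shifted = (year_b - offset) / (year_a - offset)  # 4.0: depends on the origin

# Ratio data: lengths have a true zero, so ratios survive a change of units.
length_a_mm, length_b_mm = 100, 200
ratio_mm = length_b_mm / length_a_mm                   # 2.0
ratio_cm = (length_b_mm / 10) / (length_a_mm / 10)     # 2.0

print(diff_original, diff_shifted)
print(ratio_original, ratio_shifted)
print(ratio_mm, ratio_cm)
```

Differences between dates are unaffected by where the calendar starts, but their ratios are not; ratios of lengths stay the same whatever the unit.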
Working with nominal or ordinal data • Two things that we might want to do with nominal or ordinal data are: • to test the proportion of a sub-category in a population, • to compare the values of two nominal variables on the same objects. see this see this back to nominal data back to ‘choices’ back to ordinal data
Testing for proportions • We could use the binomial test. For example, if we think that the proportion of decorated pottery on a site should be 20% (based perhaps on data from other local sites), we could test whether this is actually the case. • The worked example for this technique is survival.xls. see this other tests reading
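For readers who want to see the arithmetic behind the binomial test outside Excel, here is a minimal sketch in Python. It is not the method used in survival.xls. The counts (16 decorated sherds out of 50) are invented for illustration, and the two-sided p-value is calculated by one common convention: summing the probabilities of all outcomes that are no more likely than the one observed.

```python
from math import comb


def binomial_pmf(k, n, p):
    """Probability of exactly k 'successes' in n trials with success rate p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binomial_test_two_sided(k, n, p0):
    """Exact two-sided p-value for observing k out of n when the true rate is p0."""
    observed = binomial_pmf(k, n, p0)
    total = 0.0
    for i in range(n + 1):
        prob = binomial_pmf(i, n, p0)
        # The small tolerance guards against floating-point ties.
        if prob <= observed * (1 + 1e-9):
            total += prob
    return min(1.0, total)


# Hypothetical data: 16 decorated sherds in a sample of 50; expected rate 20%.
p_value = binomial_test_two_sided(16, 50, 0.20)
print(f"two-sided p-value: {p_value:.4f}")
```

A small p-value would suggest that the proportion on the site differs from the expected 20%. Remember the earlier caution that sherds from the same pot are not independent observations.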
Cross-classified data • We could use the chi-squared (χ²) test. For example, if we had a table showing how many pots of different forms on a site were made in different fabrics, we could test whether any form was preferentially made in any particular fabric (the variables would then be fabric and form). • The worked example for this technique is Winchester.xls. see this more other tests reading
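The calculation behind the chi-squared test for a cross-classified table can be sketched as follows. This is an illustration, not a copy of the Winchester.xls worksheets. The table of forms by fabrics is invented, and the function only returns the test statistic and its degrees of freedom, which would then be compared with a tabulated critical value.

```python
def chi_squared_independence(table):
    """Return (statistic, degrees of freedom) for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if form and fabric were unrelated.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(col_totals) - 1)
    return statistic, df


# Hypothetical counts: rows are two forms, columns are three fabrics.
pots = [
    [30, 10, 5],
    [12, 25, 18],
]
statistic, df = chi_squared_independence(pots)
print(f"chi-squared = {statistic:.2f} on {df} degrees of freedom")
```

The larger the statistic relative to its degrees of freedom, the stronger the evidence that form and fabric are associated.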
More on chi-squared • This technique can also be used to examine how well our data (usually expressed as counts) fit a given model. For example, if some theory predicts that there should be so many sites of different size ranges in a region, we could test whether this is supported by the data or not. • Two worked examples are given: Noviodunum.xls and Qau.xls. see this see this
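The goodness-of-fit version of the calculation compares observed counts with the counts a model predicts. In this sketch the site counts and the model proportions are invented. It also assumes that no model parameters were estimated from the data, so the degrees of freedom are the number of categories minus one.

```python
def chi_squared_goodness_of_fit(observed, proportions):
    """Return (statistic, degrees of freedom) for counts against model proportions."""
    total = sum(observed)
    statistic = 0.0
    for count, share in zip(observed, proportions):
        expected = total * share  # count predicted by the model
        statistic += (count - expected) ** 2 / expected
    return statistic, len(observed) - 1


# Hypothetical data: sites in three size ranges, against a model's predicted shares.
site_counts = [30, 50, 20]
model_shares = [0.25, 0.50, 0.25]
fit_statistic, fit_df = chi_squared_goodness_of_fit(site_counts, model_shares)
print(f"chi-squared = {fit_statistic:.2f} on {fit_df} degrees of freedom")
```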
Working with interval or ratio data • Two things that we might want to do with interval or ratio data are: • to compare the same measurement on two groups of ‘objects’, • to compare the values of two different variables on the same objects. see this see this back to interval data back to ‘choices’ back to ratio data
Comparing two groups of objects • We could use the t-test. For example, if we had two contemporary assemblages of flint flakes, one from a region where flint was plentiful, and one from a region where it was scarce, we could test whether this had any effect on the lengths of the flakes, using this technique. • The worked example for this technique is Roman pot.xls. see this other tests reading
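As a rough guide to what the t-test calculates, the sketch below works out the two-sample t statistic for two sets of flake lengths. The lengths are invented. The sketch also assumes the pooled-variance (Student's) form of the test, which may or may not be the variant used in the worked example.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t_statistic(group_a, group_b):
    """Return (t, degrees of freedom) for a two-sample pooled-variance t-test."""
    n_a, n_b = len(group_a), len(group_b)
    # Pooled estimate of the common variance of the two groups.
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / (
        n_a + n_b - 2
    )
    standard_error = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / standard_error, n_a + n_b - 2


# Hypothetical flake lengths in mm from a flint-rich and a flint-poor region.
plentiful = [52.0, 48.5, 61.0, 55.5, 58.0, 49.0]
scarce = [41.0, 44.5, 39.0, 47.0, 42.5, 45.0]
t_value, t_df = pooled_t_statistic(plentiful, scarce)
print(f"t = {t_value:.2f} on {t_df} degrees of freedom")
```

The t value is then compared with the tabulated value of the t distribution for the same degrees of freedom.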
Comparing two measurements on the same objects • We could use regression. For example, we might want to see how the diameter of a certain type of pot changed over time (the variables would then be diameter and time); we could use this technique to examine the relationship and see if there is any pattern. • The worked example for this technique is Dating the Demolition.xls. see this other tests reading
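The simplest form of regression fits a straight line by least squares. The sketch below shows that calculation. The dates and diameters are invented for illustration, and a straight line is only the simplest possible model of change over time.

```python
from statistics import mean


def linear_regression(xs, ys):
    """Return (slope, intercept, r_squared) of the least-squares line y = a + b*x."""
    x_bar, y_bar = mean(xs), mean(ys)
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    r_squared = s_xy**2 / (s_xx * s_yy)  # share of variation explained by the line
    return slope, intercept, r_squared


# Hypothetical data: date (years AD) and rim diameter (mm) of one type of pot.
dates = [1550, 1575, 1600, 1625, 1650]
diameters = [182.0, 176.0, 171.0, 163.0, 160.0]
slope, intercept, r_squared = linear_regression(dates, diameters)
print(f"diameter changes by {slope:.3f} mm per year (r squared = {r_squared:.2f})")
```

A slope close to zero, or a low value of r squared, would suggest little or no linear relationship between the two variables.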
Reading • This section is in two parts: • Background reading to provide context to the worked examples, • General reading on the uses of statistics in archaeology. see this see this
Background reading (1)
• Dating the Demolition.xls
  • Biddle, M. and Webster, J. ‘Green glass bottles’ in Biddle, M. Nonsuch Palace (2005) 266–301. Oxford: Oxbow Books.
• Noviodunum.xls
  • This work is still in progress, and no publications are yet available.
• Qau.xls
  • These data are taken from my Master’s dissertation, and have not been published. The original excavations of Qau and Badari were published in the 1920s by Guy Brunton; my work was on behalf of Barry Kemp.
more
Background reading (2)
• romanpot.xls
  • Bedwin, O. and Orton, C. ‘The excavation of the eastern terminal of the Devil’s Ditch (Chichester Dykes), Boxgrove, West Sussex, 1982’ Sussex Archaeol Collect 122 (1984) 63–74 and microfiche.
  • Orton, C. ‘Two useful parameters for pottery research’ Computer Applications in Archaeology 1985 (1986) 114–120.
• survival.xls
  • Keene, S. and Orton, C. ‘Stability of treated archaeological iron: an assessment’ Studies in Conservation 30 (1985) 136–42.
• Winchester.xls
  • Barclay, K., Biddle, M. and Orton, C. ‘The chronological and spatial distribution of the objects’ in Biddle, M. Object and Economy in Medieval Winchester Winchester Studies 7ii (1990) 42–73. Oxford: Clarendon Press.
General reading
• There are four textbooks, any one of which should give the information that you need. They are listed below in my personal order of preference:
  • Shennan, S. 1997 (2nd edn) Quantifying Archaeology Edinburgh: Edinburgh University Press.
  • Baxter, M. J. 2003 Statistics in Archaeology London: Hodder Arnold.
  • Drennan, R. D. 1996 Statistics for Archaeologists. A Commonsense Approach New York: Plenum Press.
  • Fletcher, M. and Lock, G. R. 2005 (2nd edn) Digging Numbers Oxford: Oxford University School of Archaeology.
which should I choose?
Which should I choose? • The choice of a textbook is a personal one; what suits one reader may not suit another. • So it’s worth looking at as many of these as you can, and choosing the one with which you are most comfortable.
Notes for teachers • This section contains discussions of some questions that may arise in the use of this module. It is divided into two parts: • Frequently Asked Questions: the sorts of questions that may arise from basic misunderstandings or plain curiosity, • Infrequently Asked Questions: questions that demonstrate a deeper perception of the issues involved, and which deserve a full and honest answer. FAQs IAQs
Frequently Asked Questions • These questions are grouped according to the example to which they relate. Demolition Noviodunum Qau Roman pot survival Winchester
FAQs about survival.xls • Q1. In the survival example, why are there different numbers of artefacts given different treatments? • Q2. see answer
Frequent answer S1 • A. The data did not come from a designed experiment, but were collected from laboratory records after the event. Therefore, they just reflect laboratory practice at the time.
Infrequently Asked Questions • These questions are grouped according to the example to which they relate. Demolition Noviodunum Qau Roman pot survival Winchester
IAQs about Dating the Demolition.xls • Q1. In the Dating the Demolition example, why do we assume that if shape changes over time, then it does so linearly (i.e. why do we use linear regression)? • Q2. see answer
Answer D1 • Answer D1. There is no theoretical reason for this. We are just applying the simplest possible model of change, and seeing how it fits the data. If it turned out to be a bad fit, we would go on to try other forms of regression. In this example, there are so few data points that we can’t really tell.
IAQs about Roman pot.xls • Q1. In the Roman pot example, is it legitimate to carry out a comparative test such as the t-test after we have inspected the data graphically? • Q2. see answer
Answer R1 • Answer R1. That’s a good point. By choosing where to make the break between the ‘lower’ and ‘upper’ contexts on the basis of the data, we are maximising our chances of finding a statistically significant difference. This is sometimes called data snooping, and is a common pitfall.
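The effect of data snooping can be demonstrated by simulation. The sketch below is an illustration under stated assumptions, not a re-analysis of the Roman pot data. It draws 20 values from a single population, so there is no real difference to find. It then compares a break chosen in advance (the midpoint) with the break that gives the largest t statistic. The value 2.101 is the tabulated two-sided 5% critical value of t on 18 degrees of freedom.

```python
import random
from math import sqrt
from statistics import mean, variance


def abs_t(group_a, group_b):
    """Absolute pooled-variance t statistic for two groups."""
    n_a, n_b = len(group_a), len(group_b)
    pooled = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / (
        n_a + n_b - 2
    )
    return abs(mean(group_a) - mean(group_b)) / sqrt(pooled * (1 / n_a + 1 / n_b))


def simulate(n_datasets=500, n_values=20, critical=2.101, seed=1):
    """Return the rates of 'significant' results for a fixed and a snooped break."""
    rng = random.Random(seed)
    fixed_hits = snooped_hits = 0
    for _ in range(n_datasets):
        # All values come from one population: any 'difference' is chance.
        values = [rng.gauss(0.0, 1.0) for _ in range(n_values)]
        half = n_values // 2
        fixed = abs_t(values[:half], values[half:])
        # Snooping: try every break and keep the most impressive result.
        snooped = max(abs_t(values[:k], values[k:]) for k in range(2, n_values - 1))
        fixed_hits += fixed > critical
        snooped_hits += snooped > critical
    return fixed_hits / n_datasets, snooped_hits / n_datasets


fixed_rate, snooped_rate = simulate()
print(f"fixed break: {fixed_rate:.3f}  break chosen from the data: {snooped_rate:.3f}")
```

Because the midpoint is one of the breaks tried, the snooped rate can never be lower than the fixed rate. How much higher it is shows how far the nominal 5% level is exceeded when the break is chosen after looking at the data.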
IAQs about survival.xls • Q1. In the survival example, doesn’t the phrase ‘stable at the end of the experiment’ imply that the time between treatment and inspection varied from artefact to artefact (because presumably not all the artefacts were treated at the same time)? And doesn’t this create the possibility of bias in the data? • Q2. see answer
Answer S1 • Answer S1. Well spotted; the answer is yes in both cases. If one treatment had been used earlier than another, then its artefacts would have had to survive longer in order to qualify as ‘stable’, and this would have created a bias against the earlier treatment. This highlights the benefits of a controlled experiment in comparison with ‘collected’ data.
Farewell • We hope that you have enjoyed this module, and that you have found the examples useful. • If you have any questions that you would like to be added to the FAQs or the IAQs of a later version, please send them to: • Clive Orton, email@example.com quit
About this module • This module was written by Clive Orton and Steve Ellwood, partly supported by a Teaching Development Grant from the Higher Education Academy. • It was designed to meet the needs of the second-year core course ARCL2038 Research and Presentation Skills in Archaeology for the BA/BSc in Archaeology at UCL Institute of Archaeology. • The advice of Penny Everett, UCL Disability IT Support Officer, is gratefully acknowledged.