Data Processing
• Data processing means putting data into a proper format so that it becomes information that can be understood easily.
• Put simply, data processing is the process of converting data into information.
• Data collected during research are processed with a view to reducing them to manageable proportions.
• Careful and systematic processing highlights the important characteristics of the data, facilitates comparisons, and renders the data suitable for further statistical analysis and interpretation.
1) Editing the Primary Data
After the data have been collected by the primary method, the next step is to edit them:
1) Editing for completeness
2) Editing for accuracy
3) Editing for uniformity
4) Editing for deciphering (unreadable entries)
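As a rough illustration of the first three editing checks, the sketch below uses Python with made-up survey records. The field names, the 0 to 120 age range and the helper edit_record are all assumptions for illustration, not a prescribed procedure. Deciphering unreadable entries needs human judgment and is not shown.

```python
# Hypothetical survey records; field names and values are invented for illustration.
records = [
    {"id": 1, "age": 34, "sex": "F", "rating": "Good"},
    {"id": 2, "age": None, "sex": "m", "rating": "Excellent"},
    {"id": 3, "age": 210, "sex": "M", "rating": "good "},
]

REQUIRED = ("id", "age", "sex", "rating")  # assumed list of mandatory fields


def edit_record(rec):
    """Return (cleaned_record, problems) for one raw record."""
    problems = []
    cleaned = dict(rec)

    # Editing for completeness: every required field must be filled in.
    for field in REQUIRED:
        if cleaned.get(field) is None:
            problems.append(f"missing {field}")

    # Editing for accuracy: flag values outside an assumed plausible range.
    age = cleaned.get("age")
    if age is not None and not (0 <= age <= 120):
        problems.append("age out of range")

    # Editing for uniformity: record the same answer in the same way everywhere.
    if cleaned.get("sex") is not None:
        cleaned["sex"] = cleaned["sex"].strip().upper()
    if cleaned.get("rating") is not None:
        cleaned["rating"] = cleaned["rating"].strip().title()

    return cleaned, problems


edited = [edit_record(r) for r in records]
for cleaned, problems in edited:
    print(cleaned["id"], problems)
```

Records with a non-empty problem list would normally go back to the investigator for correction rather than being dropped silently.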
2) Coding
• After the collected data have been edited, the next step is coding.
• Coding refers to assigning numbers, letters, or both to the various responses so that the information can be tabulated easily.
• The purpose of coding is to classify the answers to a question into meaningful categories, which is essential for tabulation.
• In most surveys, and certainly whenever results are to be put in quantitative form, the immediate stage is the coding of the answers.
• Example, for feedback:
  Excellent - 5
  Very Good - 4
  Good - 3
  Average - 2
  Below Average - 1
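The feedback coding scheme above can be sketched in Python as a simple lookup table. The list of responses is hypothetical; only the five codes come from the example.

```python
# Coding scheme taken from the feedback example.
FEEDBACK_CODES = {
    "Excellent": 5,
    "Very Good": 4,
    "Good": 3,
    "Average": 2,
    "Below Average": 1,
}

# Hypothetical raw responses, invented for illustration.
responses = ["Good", "Excellent", "Average", "Good", "Very Good"]

# Replace each verbal response with its numeric code.
coded = [FEEDBACK_CODES[r] for r in responses]

# Once coded, the answers can be summarised numerically.
mean_score = sum(coded) / len(coded)
print(coded, mean_score)
```

A response that is not in the scheme raises a KeyError here, which is a deliberate choice in this sketch: an uncoded answer should be noticed, not silently skipped.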
Classification
• Once the data are collected and edited, the first task of the statistician is to organize the figures in such a form that their significance and their comparison with masses of similar data may be facilitated, and further analysis may be possible.
• This is done through classification and tabulation.
• Classification is the process of arranging the data into groups or classes according to their common characteristics, or of separating them into different but related parts.
Example
The number of students registered in Delhi University during the academic year 2009-2010 may be classified on the basis of any of the following criteria:
• Sex
• Age
• The state to which they belong
• Religion
• Different faculties like Arts, Science and Commerce
• Heights
• Institutions
Family budget data on the nature, quality and quantity of the commodities consumed, together with the expenditure on different items of consumption, may be classified under the following heads:
• Food
• Clothing
• Fuel and Lighting
• House Rent
• Miscellaneous (including items like education, recreation, medical expenses, gifts, newspapers, laundry, etc.)
Objects of Classification
• To present the facts in a simple form
• To bring out clearly points of similarity and dissimilarity
• To facilitate comparison
• To bring out relationships
• To present a mental picture
• To prepare a basis for tabulation
Types of Classification
• Classification based on differences in kind
• Classification based on differences of degree of a given characteristic
• Geographical classification
• Chronological classification
• Alphabetical classification
I) Classification Based on Differences in Kind
This is also called qualitative classification; classes are set up on the basis of qualitative differences.
1) Showing the data classified according to one attribute (unemployment)
2) Showing the data classified according to unemployment and sex
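A minimal Python sketch of qualitative classification by one attribute and by two attributes follows. The list of persons is invented for illustration and does not reproduce any table from the original slides.

```python
from collections import Counter

# Hypothetical persons recorded as (employment status, sex).
persons = [
    ("Employed", "Male"),
    ("Unemployed", "Female"),
    ("Employed", "Female"),
    ("Unemployed", "Male"),
    ("Employed", "Male"),
    ("Unemployed", "Female"),
]

# Classification according to one attribute: employment status only.
one_way = Counter(status for status, _sex in persons)

# Classification according to two attributes: employment status and sex.
two_way = Counter(persons)

print(one_way)
print(two_way)
```

Each person falls into exactly one class in either scheme, so the class counts always add up to the total number of persons.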
II) Classification Based on Differences of Degree of a Given Characteristic
The classification of statistical data based on differences of degree of a given characteristic is also called quantitative classification.
1) Showing the Number of Persons According to Income
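Quantitative classification by income might be sketched as below. The incomes, the class boundaries and the class labels are all assumptions chosen for illustration; the sketch also assumes that a value equal to a boundary belongs to the higher class.

```python
from bisect import bisect_right
from collections import Counter

# Hypothetical incomes, invented for illustration.
incomes = [1200, 2500, 4800, 3100, 700, 2999, 5000, 1500]

# Assumed class boundaries and matching labels (one more label than boundaries).
LIMITS = [1000, 2500, 5000]
LABELS = ["below 1000", "1000-2500", "2500-5000", "5000 and above"]


def income_class(income):
    """Return the label of the class containing this income.

    bisect_right counts how many boundaries are <= income, so a value equal
    to a boundary is placed in the higher class.
    """
    return LABELS[bisect_right(LIMITS, income)]


# Number of persons in each income class.
freq = Counter(income_class(x) for x in incomes)
for label in LABELS:
    print(f"{label:<15} {freq[label]}")
```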
III) Geographical Classification
• In this type of classification, the data are classified according to geographical location, such as continents, countries, states, districts and other subdivisions.
IV) Chronological Classification
• When the given data are classified on the basis of time, it is called chronological classification. The data may be classified on the basis of time, i.e. years, months, weeks, days or hours.
V) Alphabetical Classification
• When the data are arranged in alphabetical order, it is called alphabetical classification.
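The same set of records can be classified geographically, chronologically or alphabetically, depending on which key is used. The Python sketch below assumes made-up (state, year, units) records and a hypothetical helper total_by.

```python
from collections import defaultdict

# Hypothetical records: (state, year, units). Values are invented.
sales = [
    ("Punjab", 2009, 40),
    ("Kerala", 2010, 25),
    ("Assam", 2009, 30),
    ("Kerala", 2009, 15),
    ("Punjab", 2010, 50),
]


def total_by(records, key_index):
    """Sum the units (position 2) for each distinct value at key_index."""
    totals = defaultdict(int)
    for rec in records:
        totals[rec[key_index]] += rec[2]
    return dict(totals)


# Geographical classification: group by state.
by_state = total_by(sales, 0)

# Chronological classification: group by year, listed in time order.
by_year = dict(sorted(total_by(sales, 1).items()))

# Alphabetical classification: the state classes listed in alphabetical order.
alphabetical = sorted(by_state)

print(by_state)
print(by_year)
print(alphabetical)
```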
Statistical Series
• The table in which a classification is given is known as a statistical series.
• Types of Series
• Individual series: when the measurements of individual items are arranged in ascending order, in descending order, or according to some other scientific order, it is known as an individual series.
• Discrete series: when we count the number of times (frequency) each value of the variable occurs, it is known as a discrete series.
• Continuous series: when the data are presented in continuous class intervals along with the corresponding frequencies, it is known as a continuous series. The basic components of a continuous series are the class intervals and the class limits.
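The three kinds of series can be built from the same raw observations, as the Python sketch below tries to show. The marks are hypothetical, and the class width of 10 with the upper limit excluded from each class is an assumption.

```python
from collections import Counter

# Hypothetical raw marks of ten students.
marks = [12, 7, 15, 7, 22, 12, 7, 18, 25, 15]

# Individual series: every observation listed, here in ascending order.
individual = sorted(marks)

# Discrete series: each distinct value with its frequency.
discrete = dict(sorted(Counter(marks).items()))

# Continuous series: class intervals with their frequencies.
WIDTH = 10  # assumed class width; each class runs from lower up to, not including, upper
continuous = {}
for m in marks:
    lower = (m // WIDTH) * WIDTH
    limits = (lower, lower + WIDTH)  # (lower class limit, upper class limit)
    continuous[limits] = continuous.get(limits, 0) + 1
continuous = dict(sorted(continuous.items()))

print(individual)
print(discrete)
print(continuous)
```

In all three forms the frequencies account for the same ten observations; only the level of detail kept about each value changes.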
Tabulation
• After the data have been classified, the next step is to arrange them in the form of tables.
• Tabulation involves the orderly and systematic presentation of numerical data in a form designed to elucidate the problem under consideration.
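As one possible illustration of tabulation, the Python sketch below arranges classified counts into a two-way table with row and column totals. The student records and the helper build_table are hypothetical.

```python
from collections import Counter

# Hypothetical classified data: (faculty, sex) for six students.
students = [
    ("Arts", "Female"),
    ("Science", "Male"),
    ("Arts", "Male"),
    ("Commerce", "Female"),
    ("Science", "Female"),
    ("Arts", "Female"),
]

counts = Counter(students)
rows = sorted({faculty for faculty, _sex in students})
cols = sorted({sex for _faculty, sex in students})


def build_table(counts, rows, cols):
    """Return the table as a list of rows, including a header and totals."""
    table = [["Faculty"] + cols + ["Total"]]
    for r in rows:
        cells = [counts[(r, c)] for c in cols]
        table.append([r] + cells + [sum(cells)])
    col_totals = [sum(counts[(r, c)] for r in rows) for c in cols]
    table.append(["Total"] + col_totals + [sum(col_totals)])
    return table


table = build_table(counts, rows, cols)
for line in table:
    # Left-align every cell in a fixed-width column.
    print("".join(f"{str(cell):<10}" for cell in line))
```

The totals row and column make it easy to check that every classified item appears in the table exactly once.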