Technique of Developing Measurement Tools

a)  Concept development: This is the first step. In this case, the researcher should have a complete understanding of all the important concepts relevant to his study. This step is more applicable to theoretical studies compared to practical studies where the basic concepts are already established beforehand.

b)  Specification of concept dimensions: Here, the researcher is required to specify the dimensions of the concepts, which were developed in the first stage. This is achieved either by adopting an intuitive approach or by an empirical correlation of the individual dimensions with that concept and/or other concepts.

c)  Indicator selection: In this step, the researcher has to develop the indicators that help in measuring the elements of the concept. These indicators include questionnaires, scales, and other devices, which help to measure the respondents' opinions, mindset, knowledge, etc. Using more than one indicator lends stability and improves the validity of the scores.

d)  Index formation: Here, the researcher combines the different indicators into an index. Where a concept has several dimensions, the researcher needs to combine them as well.
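The index-formation step can be illustrated with a minimal sketch. It assumes that every indicator has already been scored on the same 1-to-5 scale and that indicators and dimensions carry equal weight; the dimension names, the scores and the equal-weight rule are hypothetical choices made only for illustration.

```python
from statistics import mean

# Hypothetical indicator scores for one respondent, grouped by concept dimension.
# All indicators are assumed to share the same 1-5 response scale.
responses = {
    "job_content": [4, 5, 3],     # three indicators for this dimension
    "pay_and_benefits": [2, 3],   # two indicators for this dimension
}

def dimension_scores(resp):
    """Average the indicators within each dimension."""
    return {dim: mean(items) for dim, items in resp.items()}

def index_score(resp):
    """Combine the dimension scores into one index (equal weights assumed)."""
    return mean(list(dimension_scores(resp).values()))

scores = dimension_scores(responses)
overall = index_score(responses)
```

Averaging is only one possible combination rule; a weighted sum would serve where some dimensions are judged more important than others.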


Test of Practicality of a measuring instrument

The practicality of a measuring instrument can be judged in terms of its economy, convenience and interpretability. From the operational point of view, the measuring instrument needs to be practical. In other words, it should be economical, convenient and interpretable.

Economy consideration suggests that some trade-off is required between the ideal research project and that which the budget can afford. The length of the measuring instrument is an important area where economic pressures are quickly felt. Even though more items give better reliability, in the interest of limiting the interview or observation time, we have to take only a few items for the study. Similarly, the choice of data-collection method at times depends on economic factors.

Convenience test suggests that the measuring instrument should be easy to administer. For this purpose, one should pay proper attention to the layout of the measuring instrument. For example, a questionnaire with clear instructions and illustrated examples is more effective and easier to complete than one that lacks these features.

Interpretability consideration is especially important when persons other than the designers of the test are to interpret the results. In order to be interpretable, the measuring instrument must be supplemented by the following:

  1. detailed instructions for administering the test,
  2. scoring keys,
  3. evidence about the reliability, and
  4. guides for using the test and interpreting results.

Test of Reliability

Reliability is an essential element of test quality. A measuring instrument is reliable if it provides consistent results. But a reliable instrument need not be valid. For example, a clock that consistently runs ten minutes fast is reliable, but that does not mean it is showing the correct time. Reliability deals with the consistency, or reproducibility, of results: if a test is administered to the same subject on two occasions, the same conclusions are reached both times. A test with poor reliability, by contrast, will give remarkably different scores each time for the same test and the same examinee.

If a test is valid then it has to be reliable, but the reverse is not true. Although reliability is not as valuable as validity, it is easier to assess. Reliability has two key aspects: stability and equivalence. The degree of stability can be determined by comparing the results of repeated measurements with the same candidate and the same instrument. Equivalence concerns how much error is introduced by different investigators or by different samples of the items being studied. A good way to test the equivalence of measurements by two investigators is to compare their observations of the same events. Reliability can be improved in the following ways:

(i) By standardizing the conditions under which measurement takes place, so that external sources of variation such as boredom and fatigue are reduced; this improves stability.

(ii) By using detailed directions for measurement that can be applied uniformly, by using trained and motivated persons to conduct the research, and by broadening the sample of items used; this improves equivalence.
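The comparison of two investigators' observations of the same events can be sketched as a simple percent-agreement calculation. The function name and the codings below are hypothetical, and percent agreement is only one possible measure of equivalence; it does not correct for agreement that would occur by chance.

```python
def percent_agreement(obs_a, obs_b):
    """Share of events that two investigators coded identically."""
    if len(obs_a) != len(obs_b) or not obs_a:
        raise ValueError("both investigators must code the same non-empty set of events")
    matches = sum(a == b for a, b in zip(obs_a, obs_b))
    return matches / len(obs_a)

# Hypothetical codings of the same eight events by two investigators.
investigator_1 = ["yes", "no", "yes", "yes", "no", "no", "yes", "no"]
investigator_2 = ["yes", "no", "no", "yes", "no", "no", "yes", "yes"]

agreement = percent_agreement(investigator_1, investigator_2)
```

Here the two investigators agree on six of the eight events, so the agreement is 0.75.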


Tests of Sound Measurement

While evaluating a measurement tool, three major considerations must be taken into account: validity, reliability and practicality. A sound measurement should fulfill all of these tests.

Test of Validity

Validity is the most important criterion. It indicates the degree to which an instrument measures what it is supposed to measure. There are three types of validity: content validity, criterion-related validity, and construct validity.

Content validity refers to the extent to which a measuring instrument adequately covers the topic under study. Its determination is mainly judgmental and intuitive, and it cannot be expressed in numerical terms. It can also be determined by a panel of persons who judge how well the measuring instrument meets the standards.

Criterion-related validity refers to our ability to predict some outcome or to estimate the existence of some current condition. It reflects the success of measures used for empirical estimating purposes. Criterion-related validity is expressed as the coefficient of correlation between the test scores and the criterion measure. The criterion concerned must possess the following characteristics:

  • Relevance: When a criterion is defined in terms judged to be the proper measures, it is known to be relevant.
  • Freedom from bias: When the criterion gives each subject an equal opportunity to score well, it is unbiased.
  • Reliability: When a criterion is stable or reproducible, it is considered reliable.
  • Availability: The information specified by the criterion should be easily available.
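The coefficient of correlation used to express criterion-related validity can be sketched as follows. This is a minimal illustration that assumes the Pearson product-moment coefficient is the one intended; the test scores and criterion ratings are invented for the example.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between test scores x and criterion scores y."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must be paired and contain at least two values")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return sxy / sqrt(sxx * syy)

# Hypothetical data: an aptitude test and a later performance rating.
test_scores = [52, 61, 70, 74, 85]
criterion = [2.1, 2.8, 3.0, 3.4, 3.9]

validity_coefficient = pearson_r(test_scores, criterion)
```

A coefficient close to 1 would suggest that the test scores track the criterion closely; a value near 0 would suggest little criterion-related validity.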

Construct validity is the most complex and abstract. It is the extent to which the scores can be accounted for by the explanatory constructs of a sound theory. Its determination requires association of a set of other propositions with the results obtained from using the measurement instrument. If the measurements correlate with the other propositions as predicted, it can be concluded that there is some degree of construct validity.

If the above criteria are met, we may conclude that our measuring instrument is valid and provides correct measurement; if not, we may have to look for more information and/or depend on judgment.


Sources of Error in Measurement

Measurement should be precise and unambiguous in an ideal research study. However, this objective is often not met in its entirety. The researcher must therefore be aware of the sources of error in measurement. The possible sources of error are listed below.

a) Respondent: At times the respondent may be reluctant to express strong negative feelings or it is just possible that he may have very little knowledge, but may not admit his ignorance. All this reluctance is likely to result in an interview of ‘guesses.’ Transient factors like fatigue, boredom, anxiety, etc. may limit the ability of the respondent to respond accurately and fully.

b) Situation: Situational factors may also come in the way of correct measurement. Any condition which places a strain on the interview can have serious effects on the interviewer-respondent rapport. For example, if someone else is present, he can distort responses by joining in or merely by being present. If the respondent feels that anonymity is not assured, he may be reluctant to express certain feelings.

c) Measurer: The interviewer can distort responses by rewording or reordering questions. His behavior, style and looks may encourage or discourage certain replies from respondents. Careless mechanical processing may distort the findings. Errors may also creep in because of incorrect coding, faulty tabulation and/or statistical calculations, particularly in the data-analysis stage.

d) Instrument: Error may arise because of a defective measuring instrument. The use of complex words beyond the comprehension of the respondent, ambiguous meanings, poor printing, inadequate space for replies, response choice omissions, etc. are a few things that make the measuring instrument defective and may result in measurement errors.

Hence, the researcher must know that correct measurement depends on successfully dealing with all of the issues mentioned above. He must, as far as possible, try to eliminate, neutralize or otherwise deal with all the possible sources of error so that the final results are not contaminated.


Measurement Scales

The most commonly used measurement scales are: (i) Nominal scale; (ii) Ordinal scale; (iii) Interval scale; and (iv) Ratio scale.

(i) Nominal scale: In this scale, symbols, events or attributes are numbered in order to identify them. The numbers are symbolic and not quantitative; they are just convenient labels. Nominal scales are convenient ways to track people, objects and events. Although the nominal scale is the least powerful measurement level, it is very useful and is used routinely in surveys and ex-post-facto research for classifying major sub-groups of the population.

(ii) Ordinal scale: The ordinal scale places events, objects or emotions in order rather than measuring them quantitatively. The scale measures qualitative phenomena and ranks them from highest to lowest. Ordinal measures have no absolute values, and the real differences between adjacent ranks may not be equal. The use of an ordinal scale implies ‘greater than’ or ‘less than’ without our being able to state how much greater or less. Measures of statistical significance are restricted to non-parametric methods.

(iii) Interval scale: In an interval scale, the intervals are made equal according to some established rule rather than being fixed by a true zero. Interval scales have an arbitrary zero. The Fahrenheit scale and calendar time are examples of interval scales.

(iv) Ratio scale: Ratio scales are those scales of measurement which have an absolute or true zero of measurement. Examples of ratio scales are mass, length, duration, energy, etc. A ratio scale has equal distances and a true zero.
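The practical difference between an arbitrary zero and a true zero can be shown with a short sketch. It assumes the usual Celsius-to-Fahrenheit conversion and an approximate kilogram-to-pound factor; the specific temperatures and masses are arbitrary illustrative values.

```python
def celsius_to_fahrenheit(c):
    """Convert between two interval scales that have different arbitrary zeros."""
    return c * 9 / 5 + 32

def kg_to_pounds(kg):
    """Convert between two ratio scales that share the same true zero."""
    return kg * 2.20462  # approximate conversion factor

# Interval scale: the ratio of two temperatures depends on the unit chosen.
ratio_celsius = 20.0 / 10.0
ratio_fahrenheit = celsius_to_fahrenheit(20.0) / celsius_to_fahrenheit(10.0)

# Ratio scale: the ratio of two masses is the same in any unit.
ratio_kg = 20.0 / 10.0
ratio_pounds = kg_to_pounds(20.0) / kg_to_pounds(10.0)
```

The temperature ratio is 2 in Celsius but 68/50 = 1.36 in Fahrenheit, so "twice as hot" has no fixed meaning on an interval scale, whereas the mass ratio stays at 2 in both units.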


Measurement in Research

The core of any research is measurement. It can be defined as the method of assigning numbers to things. It is essential in research as everything has to be reduced to numbers.

Assigning numbers to the properties of some things is easy; for others it is quite difficult. Measuring social conformity or intelligence is much more complex than measuring weight, age or financial assets, which can be measured directly with some standard unit of measurement. Measurement tools for abstract or qualitative concepts are not standardized, and the results are less accurate.

A clear understanding of the level of measurement of variables is important in research because it is the level that determines what type of statistical analysis can be conducted. The collected data can be classified into distinct categories. If there are limited categories, the variables are known as discrete variables. If there are unlimited categories, they are known as continuous variables. The nominal level of measurement describes categorical variables. Nominal variables include demographic properties like sex, race, religion, etc. This is considered the most basic level of measurement. No ranking or hierarchy is present at this level.

The variables that can be sequenced in some order of importance can be described by the ordinal level. Opinions and attitude scales or indexes in the social sciences are ordinal in nature. Ex.: Upper, middle, and lower class. In this case, the order is known; however, the interval between the values is not meaningful.

Variables that have more or less equal intervals are described by the interval level of measurement. Crime rates come under this measurement level. Temperature measured in degrees Fahrenheit is also an interval variable. Here, the intervals between values can be interpreted, but ratios are not meaningful.

Ratio level describes variables that have equal intervals and a true zero as a reference point. Measurements of physical dimensions such as weight, height, distance, etc. fall under this level.
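How the level of measurement constrains the analysis can be sketched with Python's standard `statistics` module. The pairing used here (mode for nominal, median for ordinal, mean for interval, ratios for ratio data) is the conventional one, and all the data values are hypothetical.

```python
from statistics import mean, median_low, mode

# Hypothetical data at each level of measurement.
religion = ["A", "B", "A", "C", "A"]        # nominal: labels only
social_class = [1, 2, 2, 3, 1, 2]           # ordinal: 1=lower, 2=middle, 3=upper
temperature_f = [68.0, 70.5, 71.0, 69.5]    # interval: equal units, arbitrary zero
weight_kg = [55.0, 62.5, 70.0]              # ratio: equal units, true zero

most_common_religion = mode(religion)         # counting is the only valid operation
middle_class_rank = median_low(social_class)  # order is meaningful, distances are not
mean_temperature = mean(temperature_f)        # differences are meaningful
heaviest_to_lightest = max(weight_kg) / min(weight_kg)  # ratios are meaningful
```

`median_low` is used for the ordinal data so that the result is always one of the observed ranks rather than an average of two ranks.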


Random Sample From an Infinite Universe

It is relatively difficult to explain the concept of a random sample from an infinite population. However, a few examples will show the basic characteristics of such a sample. Suppose we consider 10 throws of a fair die as a sample from the hypothetically infinite population that consists of the results of all possible throws of the die. If the probability of getting a particular number, say 1, is the same for each throw and the 10 throws are all independent, then we say that the sample is random. Similarly, we would be said to be sampling from an infinite population if we sample with replacement from a finite population, and our sample would be considered a random sample if in each draw all elements of the population have the same probability of being selected and successive draws are independent. In brief, one can say that the selection of each item in a random sample from an infinite population is controlled by the same probabilities and that successive selections are independent of one another.
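The die example can be simulated with a short sketch. It assumes that Python's pseudo-random generator is an adequate stand-in for a fair die; the seed and the number of throws in the long run are arbitrary choices.

```python
import random

def throw_die(n_throws, rng):
    """Simulate independent throws of a fair six-sided die."""
    return [rng.randint(1, 6) for _ in range(n_throws)]

rng = random.Random(42)   # fixed seed so the run can be repeated

sample = throw_die(10, rng)        # a random sample of 10 throws
long_run = throw_die(60000, rng)   # many throws, to check the probabilities
share_of_ones = long_run.count(1) / len(long_run)
```

Each throw uses the same probabilities and does not depend on earlier throws, so the 10 results form a random sample in the sense described above; over many throws the share of ones should settle close to 1/6.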

As a practical illustration, if we have to take a sample of grain from a bag, it is not possible to assign a number to each grain or particle constituting the universe, and so the methods of constructing a card population or of using random sampling numbers cannot be applied. In such cases the grain may be thoroughly mixed and, by dividing and sub-dividing the lot into parts, a sample of adequate size can be obtained. After thorough mixing, the contents of the bag may be divided into two equal parts, of which one is selected; this part is mixed again and further divided into two parts. The process is continued till one of the sub-divisions is equal to the size of the desired sample.
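The mixing-and-halving procedure can be imitated in code, although numbering the grains is exactly what is impractical in reality. In this sketch the lot is a list of hypothetical grain ids, shuffling stands in for thorough mixing, and the function name is an assumption made for the example.

```python
import random

def mix_and_halve(lot, sample_size, rng):
    """Repeatedly mix the lot and keep one half until the desired size is reached."""
    lot = list(lot)
    while len(lot) // 2 >= sample_size:
        rng.shuffle(lot)                  # thorough mixing
        half = len(lot) // 2
        # Select one of the two parts at random and discard the other.
        lot = lot[:half] if rng.random() < 0.5 else lot[half:]
    return lot

rng = random.Random(1)
bag = range(1024)                         # hypothetical grain ids
grain_sample = mix_and_halve(bag, 64, rng)
```

When the lot size is not the sample size times a power of two, the last sub-division will be somewhat larger than the desired sample rather than exactly equal to it.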


Complex Random Sampling Designs

Complex random sampling designs are probability sampling procedures carried out under restricted sampling techniques. They are also called mixed sampling designs, as many of them combine probability and non-probability sampling procedures in selecting a sample.

Some of the popular complex random sampling designs are as follows:

(i) Systematic sampling: Researchers sometimes select every i-th item from a list; this is known as systematic sampling. The first unit is chosen at random, and the remaining units are then selected at the same fixed interval.

(ii) Stratified sampling: In a very diverse universe, stratified sampling is used, where the population is divided into several groups (strata) that are internally more similar, and items are then selected from each stratum to form the sample. The strata are formed on the basis of the researcher's experience and judgment, and items are drawn from each stratum by simple random sampling.

(iii) Cluster sampling: In cluster sampling, the population is divided into a number of small subdivisions that are similar to one another, called clusters, and some of these clusters are then randomly selected as the sample. Cluster sampling is highly economical. The difference between stratified sampling and cluster sampling is that in stratified sampling a random sample is drawn from each of the strata, whereas in cluster sampling only the selected clusters are studied.

(iv) Area sampling: In area sampling, a large area is divided into smaller parts and samples are then selected randomly. This is a type of cluster sampling where the clusters of units are based on geographic area.

(v) Multi-stage sampling: Multi-stage sampling is a more complex form of cluster sampling. It is used in research where the universe is very large, for example an entire country, and the researcher selects samples at successive levels. After selecting clusters from the universe, the researcher then randomly selects elements from each chosen cluster. This type of sampling is cost-effective and easy to administer.

(vi) Probability proportional to size (PPS) sampling: Sometimes the cluster sampling units do not contain equal numbers of elements; in such cases the researcher uses a random selection process in which the probability of selecting each cluster is proportional to its size. The numbers drawn indicate which clusters are chosen rather than individual elements. PPS avoids under-representation of any one group.

(vii) Sequential sampling: This is a complex sampling design in which the size of the sample is not fixed in advance but is determined as the study progresses, according to the needs of the researcher. The researcher studies a particular sample and, if not satisfied, takes another sample unit, and so on. The researcher keeps fine-tuning the experiment and decides only after examining the results so far whether more samples are needed.
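Several of the designs above can be sketched in a few lines each. This is a minimal illustration, not a survey-grade implementation: the function names, the village data and the equal allocation per stratum are assumptions made for the example, and the PPS routine uses one common approach (stepping through cumulative size totals at a fixed interval from a random start).

```python
import random
from bisect import bisect_left
from itertools import accumulate

def systematic_sample(items, k, rng):
    """Take every k-th item after a random start among the first k items."""
    start = rng.randrange(k)
    return items[start::k]

def stratified_sample(strata, n_per_stratum, rng):
    """Draw a simple random sample of equal size from every stratum."""
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}

def cluster_sample(clusters, n_clusters, rng):
    """Pick whole clusters at random; all members of a picked cluster are studied."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: clusters[name] for name in chosen}

def two_stage_sample(clusters, n_clusters, n_per_cluster, rng):
    """Multi-stage sketch: pick clusters first, then elements within each one."""
    chosen = cluster_sample(clusters, n_clusters, rng)
    return {name: rng.sample(members, n_per_cluster)
            for name, members in chosen.items()}

def pps_sample(cluster_sizes, n_draws, rng):
    """Select clusters with probability proportional to size (cumulative totals)."""
    names = list(cluster_sizes)
    cumulative = list(accumulate(cluster_sizes[name] for name in names))
    interval = cumulative[-1] / n_draws
    start = rng.random() * interval
    picks = []
    for i in range(n_draws):
        point = start + i * interval
        # The clamp guards against floating-point overshoot on the last point.
        index = min(bisect_left(cumulative, point), len(names) - 1)
        picks.append(names[index])
    return picks

# Hypothetical universe: household ids grouped by village.
villages = {
    "north": list(range(0, 100)),
    "south": list(range(100, 400)),
    "east": list(range(400, 1000)),
}
rng = random.Random(0)
every_tenth = systematic_sample(villages["north"], 10, rng)
per_village = stratified_sample(villages, 5, rng)
whole_villages = cluster_sample(villages, 2, rng)
within_villages = two_stage_sample(villages, 2, 5, rng)
sizes = {name: len(members) for name, members in villages.items()}
pps_picks = pps_sample(sizes, 5, rng)
```

In `pps_picks` a large cluster can appear more than once, which indicates how many selections fall to that cluster; the elements themselves would then be drawn within it by simple random or systematic sampling.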


Selecting a Random Sample

Simple random sampling is the basic sampling method. Its main advantage is that each member of the group is given an equal chance of being chosen. Thus, the statistical conclusions drawn from the analysis of a random sample are deemed to be valid. Though it sounds easy, the process of selecting a random sample is quite complex.

Lottery Method: This is the most commonly used method. Every member is assigned a unique number. These numbers are put in a jar and thoroughly mixed. The researcher then picks some numbers without looking at them, and the people with those numbers are included in the study.

Random Number Table: This table consists of a series of digits (0-9) that are generated randomly. The numbers are arranged in rows and columns and can be read in any direction. All the digits are equally probable.

Computer: In the case of a large population, selecting random samples manually becomes tedious and very time-consuming. In these cases, computer software is used to generate numbers randomly. This process is very fast and easy.

With and Without Replacement: When a population element is given the chance to be chosen more than once, it is known as sampling with replacement; when it can be chosen only once, it is known as sampling without replacement.
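Computer selection, with and without replacement, can be sketched using Python's standard `random` module. The population of 20 numbered members, the sample size and the seed are hypothetical values chosen for illustration.

```python
import random

population = list(range(1, 21))   # hypothetical members numbered 1 to 20
rng = random.Random(7)            # fixed seed so the draw can be repeated

# Without replacement: no member can appear more than once.
without_replacement = rng.sample(population, 5)

# With replacement: the same member may be drawn again.
with_replacement = rng.choices(population, k=5)
```

`sample` behaves like the lottery method, where a drawn number is not returned to the jar, while `choices` corresponds to returning each number to the jar before the next draw.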