What Is Data Weighting | How to Weight Research Data

Statistical data adjustment procedures consist of weighting, variable redefinition, and scale conversion. These procedures can sometimes significantly improve the quality of data analysis.

Weighting.

This method of statistical data adjustment assigns each respondent a weight that reflects how much that respondent's answers should count relative to those of other respondents. The sum of the weights is equal to the total number of respondents.

If weighting is not performed, we can treat each respondent's weight as equal to one. If it is performed, each respondent's answers enter all calculations with that respondent's weight. For example, the average is replaced by the weighted average. The proportion of respondents who gave a particular answer to a question is replaced by the weighted proportion: the sum of the weights of those respondents divided by the sum of the weights of all respondents.
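The two replacements described above can be sketched in a few lines of Python. The ratings, answers, and weights below are made-up illustration values, and the function names are hypothetical.

```python
# Hypothetical data: one rating, one yes/no answer, and one weight per respondent.
ratings = [4, 5, 3, 2]
said_yes = [True, False, True, True]
weights = [0.5, 1.5, 1.0, 1.0]  # sums to 4, the number of respondents


def weighted_mean(values, weights):
    """Sum of weight * value divided by the sum of the weights."""
    return sum(w * v for v, w in zip(values, weights)) / sum(weights)


def weighted_share(flags, weights):
    """Sum of the weights of flagged respondents over the sum of all weights."""
    return sum(w for f, w in zip(flags, weights) if f) / sum(weights)


unweighted_mean = sum(ratings) / len(ratings)   # 3.5
mean_w = weighted_mean(ratings, weights)        # 3.625
share_w = weighted_share(said_yes, weights)     # 0.625 (unweighted: 0.75)
```

With all weights equal to one, both functions reduce to the ordinary mean and the ordinary proportion.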

How are weights set?

Most often, weights are chosen so that the sample better reflects the structure of the studied population on its main characteristics. For example, after weighting, the proportions of men and women in each of three age categories should be the same as in the studied population. To achieve this, the weight for each group is set equal to the ratio of the group's share in the studied population to its share in the sample. For instance, if a group's share in the sample is half its share in the studied population, then each respondent from that group counts in the calculations not as one person but as two.
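As a minimal sketch of this rule, the Python snippet below computes one weight per sex-by-age cell as population share divided by sample share. The population shares and sample counts are invented for illustration.

```python
from collections import Counter

# Hypothetical population shares for each sex-by-age cell (they sum to 1).
population_share = {
    ("m", "18-34"): 0.15, ("m", "35-54"): 0.18, ("m", "55+"): 0.15,
    ("f", "18-34"): 0.15, ("f", "35-54"): 0.18, ("f", "55+"): 0.19,
}

# Hypothetical sample of 100 respondents, each described by its cell.
sample_cells = (
    [("m", "18-34")] * 10 + [("m", "35-54")] * 20 + [("m", "55+")] * 15
    + [("f", "18-34")] * 15 + [("f", "35-54")] * 20 + [("f", "55+")] * 20
)

n = len(sample_cells)
counts = Counter(sample_cells)

# Weight of a cell = share in the population / share in the sample.
cell_weight = {
    cell: population_share[cell] / (counts[cell] / n) for cell in counts
}

# Every respondent receives the weight of his or her cell.
weights = [cell_weight[cell] for cell in sample_cells]
```

Young men make up 10% of this sample but 15% of the assumed population, so each of them gets a weight of 1.5. Because the population shares sum to one and every cell is present in the sample, the weights sum to the number of respondents.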

Other approaches are sometimes used.

One option is to give more weight to respondents who provide better-quality data. Another is to assign weights according to the value of some marketing characteristic. For example, the opinions of "heavy users" of the product can be counted with a weight of 3.0, those of "average users" with a weight of 2.0, and those of "light users" and non-users with a weight of 1.0.
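A short Python sketch of usage-based weights follows. The respondent list is hypothetical, and the final rescaling step is an assumption added here so that the weights sum to the number of respondents, as stated earlier for weights in general.

```python
# Weights from the example in the text.
raw_weight_by_usage = {
    "heavy": 3.0, "average": 2.0, "light": 1.0, "non-user": 1.0,
}

# Hypothetical usage category of each respondent.
usage = ["heavy", "light", "average", "non-user", "light"]

raw = [raw_weight_by_usage[u] for u in usage]   # [3.0, 1.0, 2.0, 1.0, 1.0]

# Assumed extra step: rescale so the weights sum to the sample size.
scale = len(raw) / sum(raw)                     # 5 / 8 = 0.625
weights = [w * scale for w in raw]
```

Rescaling keeps the 3 : 2 : 1 ratios between the groups while leaving the weighted sample size equal to the actual one.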

When analyzing weighted data, keep in mind that weighting can increase the statistical error of the resulting estimates.
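One common way to gauge this loss of precision, which the text above does not describe, is Kish's approximate effective sample size: the squared sum of the weights divided by the sum of the squared weights. The sketch below uses made-up weights, and the function name is hypothetical.

```python
def kish_effective_sample_size(weights):
    """Kish's approximation: (sum of w)^2 / (sum of w^2)."""
    return sum(weights) ** 2 / sum(w * w for w in weights)


weights = [0.5, 1.5, 1.0, 1.0]          # hypothetical weights, n = 4

n_eff = kish_effective_sample_size(weights)   # 16 / 4.5, about 3.56
design_effect = len(weights) / n_eff          # 1.125

# With equal weights nothing is lost: the effective size equals n.
n_eff_equal = kish_effective_sample_size([1.0, 1.0, 1.0, 1.0])  # 4.0
```

The more the weights vary, the smaller the effective sample size becomes relative to the actual number of respondents, and the wider the error margins of the weighted estimates.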

Variable redefinition is the creation of new variables or the modification of existing ones in accordance with the researcher's goals. Here are some examples of such transformations.

The first type of transformation is collapsing a scale into fewer categories.

Let's say that the level of product usage was originally measured on a ten-point scale. After the transformation, you get a variable that has not ten but only four possible values: "heavy user", "average user", "light user", and "non-user".
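A recoding of this kind might look as follows in Python. The cut points (1 for non-users, 2 to 4 light, 5 to 7 average, 8 to 10 heavy) are assumptions chosen for illustration; the text does not specify them.

```python
def recode_usage(score):
    """Collapse a 1-10 usage score into four categories (assumed cut points)."""
    if not 1 <= score <= 10:
        raise ValueError("score must be between 1 and 10")
    if score == 1:
        return "non-user"
    if score <= 4:
        return "light user"
    if score <= 7:
        return "average user"
    return "heavy user"


usage_scores = [1, 3, 6, 10, 8]                  # hypothetical answers
usage_group = [recode_usage(s) for s in usage_scores]
```

In practice the cut points are usually chosen from the distribution of the answers or from what the business considers heavy or light use.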

Another type of transformation summarizes information contained in several columns of a data table. For instance, respondents are often asked where they found information about a product. By counting the number of different sources of information (TV advertisements, friends, etc.) named by each respondent, you can form a new indicator, the Index of Information Search (IIS), which is then added to the data table. Sometimes a new metric is the ratio of two other metrics. For example, dividing the total amount of a product a respondent purchased in a month by the number of purchases of this product in that month gives the average size of one purchase.
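Both derived variables can be sketched in Python as below. The source names, field names, and figures are hypothetical, and returning `None` for respondents with no purchases is an assumption made to avoid division by zero.

```python
# Hypothetical 0/1 columns, one per information source.
SOURCES = ["tv_ads", "friends", "internet", "in_store"]

respondents = [
    {"tv_ads": 1, "friends": 1, "internet": 0, "in_store": 1,
     "monthly_amount": 12.0, "monthly_purchases": 4},
    {"tv_ads": 0, "friends": 1, "internet": 0, "in_store": 0,
     "monthly_amount": 0.0, "monthly_purchases": 0},
]

for r in respondents:
    # Index of Information Search: number of sources the respondent named.
    r["iis"] = sum(r[s] for s in SOURCES)

    # Average purchase size: amount bought per month / purchases per month.
    if r["monthly_purchases"] > 0:
        r["avg_purchase_size"] = r["monthly_amount"] / r["monthly_purchases"]
    else:
        r["avg_purchase_size"] = None  # undefined when nothing was bought
```

The first respondent gets an IIS of 3 and an average purchase size of 3.0; the second gets an IIS of 1 and no purchase size.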

In other cases, to obtain an adequate model of the relationship between indicators, a variable is transformed by taking its logarithm, its square root, and so on.

An important case of variable transformation is recoding a single-choice (categorical) column with three or more possible values by adding several auxiliary columns of zeros and ones to the data table. Each new auxiliary column is "responsible" for one of the possible values of the original column: one means that the respondent selected this value, and zero means that the respondent did not. Such auxiliary (dummy) variables are useful in subsequent analysis. For example, if the original column records each respondent's most preferred brand of a product, then each auxiliary variable can be used to construct an integral indicator of the attitude towards a particular brand.
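The recoding into zero-one columns can be illustrated with a small Python sketch. The brand answers and the `prefers_` column names are hypothetical.

```python
# Hypothetical single-choice column: each respondent's preferred brand.
preferred_brand = ["A", "C", "B", "A", "C"]

# One auxiliary 0/1 column per possible value of the original column.
brands = sorted(set(preferred_brand))
dummies = {
    f"prefers_{b}": [1 if choice == b else 0 for choice in preferred_brand]
    for b in brands
}

# dummies["prefers_A"] is [1, 0, 0, 1, 0]
# dummies["prefers_B"] is [0, 0, 1, 0, 0]
# dummies["prefers_C"] is [0, 1, 0, 0, 1]
```

Because each respondent chose exactly one brand, the auxiliary columns sum to one in every row.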

Scale conversion.

Scale conversion is used to make the ratings of different parameters comparable and to make the data more suitable for analysis. Suppose, for example, that the variables characterizing image are measured on a seven-point semantic differential scale, the variables characterizing attitude on a continuous rating scale, and the variables characterizing lifestyle on a five-point Likert scale. To compare the ratings given by the same respondent on different scales, the scales are transformed so that they share the same range of possible values.

Standardization is often used for this.

To standardize a scale X, first subtract the mean of X over all respondents from the rating each respondent gave on this parameter. Then divide the result by the standard deviation of X. The resulting variable has a mean of zero and a standard deviation of one; its values are exactly z-scores. Standardization makes it possible to treat measurements made on different types of scales on an equal footing. This is important, for example, if we intend to create an integral scale by averaging these scales.
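A minimal Python sketch of standardization is shown below with made-up ratings. It uses the population standard deviation (`statistics.pstdev`); the text does not say whether the population or the sample formula is meant, so that choice is an assumption.

```python
from statistics import mean, pstdev

# Hypothetical ratings of one parameter X on a seven-point scale.
x = [2, 4, 4, 5, 7, 7, 6]

x_mean = mean(x)
x_sd = pstdev(x)   # assumption: population standard deviation

# z-score: subtract the mean, then divide by the standard deviation.
z = [(value - x_mean) / x_sd for value in x]

z_mean = mean(z)   # zero, up to floating-point rounding
z_sd = pstdev(z)   # one, up to floating-point rounding
```

After this step, a z-score from a seven-point scale and a z-score from a five-point scale are both expressed in standard deviations from their own mean, so they can be averaged together.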

Sometimes the scales are converted for other reasons.

For example, when assessing the importance of different criteria for choosing a product, one takes into account that some respondents (usually the less well-off) name relatively many selection criteria, while others (usually the better-off) name only a few. Suppose we have ratings of the importance of 18 factors in choosing a product on a three-point scale: very important, rather important, not at all important. For each respondent, calculate the average importance rating across the criteria and subtract it (say, 1.8 for some respondent) from all the ratings given by that respondent. Then, to avoid negative values, add the same constant to the ratings of all respondents: the absolute value of the most negative rating obtained.

Another variant of the transformation also accounts for the different number of items named by different respondents. In the authors' experience, this option is preferable. Suppose that the respondents did not rate the importance of each of the 18 factors but simply marked the factors that are important to them. Then we are dealing with a single multiple-response question: a marked factor is coded as one and an unmarked factor as zero. For each respondent, sum all the items of this question (i.e. count the number of factors the respondent marked), divide each one by that count, and multiply, for example, by 100, thereby creating a set of 18 new quantitative variables. The resulting scales look as if they had been obtained by the constant-sum method. This eliminates the effect of respondents' differing "talkativeness".
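This constant-sum conversion can be sketched in Python as below, again with 4 factors instead of 18 and invented answers. Leaving an all-zero row unchanged for a respondent who marked nothing is an assumption made to avoid division by zero.

```python
# Hypothetical marks: 1 = factor marked as important, 0 = not marked.
marks = [
    [1, 1, 0, 0],   # marked two factors
    [1, 0, 0, 0],   # marked one factor
    [1, 1, 1, 1],   # marked all four
]

TOTAL = 100  # the constant each respondent's scores add up to

shares = []
for row in marks:
    n_marked = sum(row)
    if n_marked == 0:
        shares.append([0.0] * len(row))   # assumption: nothing to distribute
    else:
        shares.append([TOTAL * m / n_marked for m in row])

# shares[0] is [50.0, 50.0, 0.0, 0.0]
# shares[1] is [100.0, 0.0, 0.0, 0.0]
# shares[2] is [25.0, 25.0, 25.0, 25.0]
```

Each respondent now distributes the same total of 100 points, so a respondent who marked everything no longer outweighs one who marked a single factor.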
