Guide Research with Diverse Groups: Research Designs and Multivariate Latent Modeling for Equivalence

At the research design stage, three additional issues need to be considered: translation, calibration and metric equivalence. Traditionally encompassed within the concept of measure equivalence, these issues are interrelated with construct equivalence as the measures involve the operational definition of the construct. Translation equivalence is concerned with the translation of the research instrument into another language so it can be understood by respondents in different countries and has the same meaning in each context.

The goal of translation equivalence is commonality in understanding the instrument. Therefore, equivalence of meaning, rather than literal translation, is most important. For instance, sometimes terms cannot be directly translated without losing their meaning, and sometimes a term does not exist in the other language. As discussed later, different translation techniques have been proposed to deal with this.

Calibration equivalence refers to the equivalence of monetary units, measures of weight, distance and volume, and other perceptual cues, such as colors and shapes. For example, if the distance between two points is measured in kilometers in one country and miles in another, then questionnaire items relating to this measure should be converted. Calibration equivalence therefore provides assurance that the units of measurement and other perceptual cues are comparable across populations. Finally, two aspects have to be considered when determining metric equivalence: scalar equivalence and the equivalence of the scale or scoring procedure.

Scalar equivalence refers to whether a score obtained through a certain scale in one country or culture has the same meaning and interpretation in another. As such, this type of equivalence implies that two individuals from different countries or cultures with the same value for a variable (for example, the same likelihood of purchasing a product) would give the same score on the same scale (for example, a value of 4 on a 5-point Likert scale).
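One common way to formalize this idea is a multi-group factor model. The notation below is an assumption introduced for illustration and does not appear in the text: x_{ig} is the observed response to item i in group g, \xi_g is the latent variable (for example, likelihood of purchase), \tau_{ig} is the item intercept, \lambda_{ig} is the loading, and \varepsilon_{ig} is an error term with mean zero.

```latex
x_{ig} = \tau_{ig} + \lambda_{ig}\,\xi_g + \varepsilon_{ig},
\qquad
\text{scalar equivalence: } \lambda_{i1} = \lambda_{i2}
\ \text{and}\ \tau_{i1} = \tau_{i2} \ \text{for every item } i .
```

Under these constraints, the expected response given a latent value c is \tau_i + \lambda_i c in both groups, which is the sense in which two respondents with the same underlying value would be expected to give the same score.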

Scaling or scoring procedures refer to the use of equivalent scales or scoring procedures in different contexts. Inconsistencies in this facet may arise from different levels of familiarity with scaling and scoring formats. Category rating scales are frequently used in survey research. Similarly, the use of nonverbal scales requires detailed analysis to determine the degree of comparability between countries and cultures. The main consideration in measure equivalence is translation equivalence. Different techniques have been proposed in the literature, including direct translation, back-translation, parallel translation, decentering and the committee approach.

To overcome the problems of direct translation, in which a bilingual translator simply translates an instrument from one language to another, researchers employ more sophisticated methods. The procedure most commonly suggested is back-translation.

In this iterative method, a bilingual translator translates a research instrument into another language. Then, the instrument is translated back into the original language by a second, independent bilingual translator. If discrepancies are noted in this process, corrections are made. This process can be repeated until equivalence is achieved. Because back-translation focuses on semantics, the resulting translations may lack naturalness and comprehensibility. In addition, it assumes an etic approach, which can be problematic as equivalent words or constructs may not exist in the other language. Parallel translation is a similar procedure, albeit using two translators with a greater emphasis on wording. Under this approach, two translators independently translate the questionnaire.

Then, the translations are compared and modified until agreement is reached on a final version. Other procedures include the decentering approach, in which research instruments are developed by collaborators in each culture. After an initial translation, this procedure allows words and phrases to be changed to provide greater accuracy. An alternative collaborative approach is the committee approach, where a committee of bilingual translators and experts discusses alternative versions of a questionnaire, the meaning of items, and so on.

This approach starts with an initial translation, generally using the parallel translation approach with members of the team working independently.

Modifications are made until consensus is reached. The cooperative effort between people with different areas of expertise working together is the main strength of this procedure. Furthermore, pilot studies and pretests are recommended. To guarantee calibration equivalence, researchers should independently check conversions of the different measurement units and other perceptual cues.
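An independent check of unit conversions can be partly automated. The following is a minimal sketch, assuming questionnaire items that state a distance in miles in one version and in kilometers in the other. The function name, the item identifiers and the 5% relative tolerance are hypothetical choices made for illustration; only the conversion factor (1 mile = 1.609344 km) is a fixed definition.

```python
KM_PER_MILE = 1.609344  # exact, by definition of the international mile


def flag_bad_conversions(items, tolerance=0.05):
    """Return ids of items whose kilometer figure deviates from the exact
    conversion of the mile figure by more than the relative tolerance.

    items: iterable of (item_id, miles, km) tuples with miles > 0.
    tolerance: hypothetical threshold; rounded figures such as 8 km for
    5 miles should pass, while real conversion errors should not.
    """
    flagged = []
    for item_id, miles, km in items:
        expected_km = miles * KM_PER_MILE
        if abs(km - expected_km) / expected_km > tolerance:
            flagged.append(item_id)
    return flagged


# Illustrative items only (not taken from any real questionnaire).
items = [("Q3", 5, 8), ("Q7", 10, 15), ("Q9", 50, 80)]
print(flag_bad_conversions(items))
```

A relative tolerance is used here because translated items usually round the converted figure. Flagged items would still need review by a person, since perceptual cues such as colors and shapes cannot be checked this way.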

Therefore, suggestions and recommendations will be presented in the data analysis section. At this stage, however, preliminary research conducted in each context may provide guidelines regarding the selection of scales, response patterns and measurement methods. Another important issue at the design stage is the sampling design. Two main levels of sampling can be identified: sampling of cultures or countries (discussed in the problem definition section) and sampling of the individual respondents. This section will focus on the latter.

Problems regarding sampling at this level fall into three areas: the choice of respondents, the conflict between comparability and representativeness, and the sampling methods. The choice of relevant respondents is a key issue in sampling, since these can vary across cultures or countries. For instance, women can be suitable respondents in some countries but not in others (that is, male-dominated societies).

Similarly, senior managers may play a key role in the organizational decision-making process of Asian or Latin countries, whereas middle managers may have this role in Anglo-Saxon cultures. Therefore, balancing the two extremes of comparability and representativeness represents one of the most important dilemmas in cross-cultural research. Finally, the use of probabilistic methods (for example, random and stratified sampling) enhances the likelihood of obtaining a representative sample. However, such methods are often not a viable choice.

For instance, lists or directories are not usually available in emerging country markets. Therefore, in much cross-cultural research, non-probabilistic methods, such as quota sampling and judgmental sampling, are used. These procedures facilitate the control of extraneous variables that could potentially confound the results.
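Quota sampling can be sketched as follows. This is an illustrative outline only: the function name, the respondent records and the quota cells (here, managerial role) are hypothetical, and respondents are simply accepted in the order they are encountered until each cell is full.

```python
from collections import Counter


def quota_sample(respondents, quotas, key):
    """Accept respondents in arrival order until every quota cell is full.

    respondents: iterable of respondent records.
    quotas: dict mapping a cell label to the number of respondents wanted.
    key: function returning the cell label of a respondent.
    """
    counts = Counter()
    sample = []
    for person in respondents:
        cell = key(person)
        # Respondents from full cells, or cells with no quota, are skipped.
        if counts[cell] < quotas.get(cell, 0):
            sample.append(person)
            counts[cell] += 1
        if all(counts[c] >= n for c, n in quotas.items()):
            break  # every quota is filled
    return sample


# Illustrative data only.
respondents = [
    {"id": 1, "role": "senior"},
    {"id": 2, "role": "senior"},
    {"id": 3, "role": "middle"},
    {"id": 4, "role": "senior"},
    {"id": 5, "role": "middle"},
]
sample = quota_sample(respondents, {"senior": 2, "middle": 1}, key=lambda r: r["role"])
print([r["id"] for r in sample])
```

Because the same quotas can be applied in every country, the composition of the samples is held constant on the quota variables, which is how such a procedure helps control extraneous variables. It does not make the sample representative.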

Hult et al. [12] suggest enlisting parallel respondents for each unit of analysis. This can be useful to describe and compare their position, role and responsibility in relation to the subject under study in each country or culture of analysis. Based on the type of research conducted, Reynolds et al. [38] propose a framework that provides interesting implications for the conflict of representativeness versus comparability, noted above.

When the objective of the study is to examine attitudes and behaviors within specific countries or attributes of a cross-national group, representativeness of the country or specific population of interest is required. Thus, probability-sampling techniques are preferred. By contrast, when the objective of the study is to examine differences or similarities between cultures or countries and to examine the cross-national generalizability of a theory, model or construct, between-country comparability is the most important sampling objective. Therefore, non-probabilistic methods are preferred.

Importantly, if matched samples are used to ensure comparability, the homogeneous samples selected should be suitable and relevant for the investigation. Similarly, the matching variables need to be relevant, logical and based on theory. In addition, researchers should be aware that this procedure may mask cultural differences and that results are limited to the specific groups analyzed.
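Matching can be illustrated with a simple exact-matching sketch. The function name, the matching variables (industry and firm size) and the records below are hypothetical; in practice the matching variables would be chosen on theoretical grounds, as noted above.

```python
from collections import defaultdict


def match_samples(group_a, group_b, match_vars):
    """Pair each respondent in group_a with an unused respondent in group_b
    who has identical values on every matching variable.

    Respondents without an exact counterpart are left out of the result.
    """
    pool = defaultdict(list)
    for person in group_b:
        pool[tuple(person[v] for v in match_vars)].append(person)

    pairs = []
    for person in group_a:
        candidates = pool[tuple(person[v] for v in match_vars)]
        if candidates:
            # Each group_b respondent is used at most once.
            pairs.append((person, candidates.pop(0)))
    return pairs


# Illustrative data only.
country_a = [
    {"id": "A1", "industry": "retail", "size": "small"},
    {"id": "A2", "industry": "banking", "size": "large"},
    {"id": "A3", "industry": "retail", "size": "small"},
]
country_b = [
    {"id": "B1", "industry": "retail", "size": "small"},
    {"id": "B2", "industry": "retail", "size": "large"},
]
pairs = match_samples(country_a, country_b, ["industry", "size"])
print([(a["id"], b["id"]) for a, b in pairs])
```

In this example only one pair survives, which illustrates the limitation mentioned above: the matched sample describes only the groups that could be matched, and any differences tied to the matching variables themselves are removed from view.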

To enhance the comparability of the data collected, attention must be paid to the following aspects: equivalence of administration, equivalence of response, status and authority of the researcher and timing of data collection. Equivalence of administration refers to the fact that the research settings and the instructions must be equivalent, not identical.

For example, whether a survey is administered individually or in groups could affect the results. Response equivalence is concerned with the design and administration of the research in such a way that people's responses to the questionnaire are equivalent on several dimensions, such as the respondent's familiarity with the test instruments, their levels of anxiety and other psychological reactions.

In this situation, the response does not indicate what it was intended to measure, seriously threatening the validity of the findings. In interviews, the status and authority of the researchers can also influence the results. Finally, the timing of data collection is also important. Data should be collected from different countries within acceptable time frames to enhance comparability. The recommendations mainly focus on the adequate selection, training, supervision and evaluation of interviewers. Some authors suggest assigning interviewers randomly and recording their characteristics.

The use of local agents is also advised.
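The suggestion to assign interviewers randomly and record their characteristics can be sketched as below. This is one possible procedure under stated assumptions, not a prescribed one: the function name and the interviewer attributes (gender, local) are hypothetical, and workloads are balanced by dealing a shuffled respondent list out to interviewers in turn.

```python
import random
from collections import Counter


def assign_interviewers(respondent_ids, interviewers, seed=None):
    """Randomly assign respondents to interviewers with balanced workloads.

    Returns one record per respondent that also stores the interviewer's
    characteristics, so they can be examined as controls in later analysis.
    """
    rng = random.Random(seed)  # a seed makes the assignment reproducible
    order = list(respondent_ids)
    rng.shuffle(order)  # random order breaks any link with list position
    records = []
    for i, rid in enumerate(order):
        interviewer = interviewers[i % len(interviewers)]  # deal out in turn
        records.append({"respondent": rid, **interviewer})
    return records


# Illustrative data only.
interviewers = [
    {"interviewer": "I1", "gender": "F", "local": True},
    {"interviewer": "I2", "gender": "M", "local": False},
]
records = assign_interviewers(range(1, 8), interviewers, seed=1)
print(Counter(r["interviewer"] for r in records))
```

Storing the interviewer attributes alongside each response makes it possible to test afterwards whether answers differ systematically by interviewer.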