Introduction
Academic Self-Efficacy and its Importance
Becoming a higher-education student involves a process of change and integration into a new social and academic world. This process can result in academic difficulties and may even lead students to withdraw from college altogether (Medrano et al., 2010), because factors other than cognitive abilities are related to the difficulties encountered in higher education. Therefore, being able to successfully face academic life requires behaviors that facilitate a high level of performance as well as a belief that the student’s own skills and aptitudes are enough for them to successfully complete their college studies. Hence, academic self-efficacy (ASE) is a crucial component.
ASE refers to the beliefs that people have about their own ability to learn and acquire behaviors at defined educational levels, such as high school (Khan, 2013) or university (Zander et al., 2018). In this sense, it is important to know the role that ASE has in learning and academic performance, since it responds to the emotional expectations of students, cognitively activating performance and attention to situational demands.
Thus, the context where the teaching-learning process takes place is of utmost importance because self-efficacy is known to be formed through four sources of information that inevitably appear in higher education. First, there is direct experience: students who interpret the outcomes of their own performance as positive strengthen their perception of self-efficacy. Second, there is vicarious experience, in which students learn new strategies for executing certain tasks successfully by observing models similar to themselves. Third, there is social persuasion, that is, when the people around the student promote the perception that he or she possesses the capabilities to solve various situations. Finally, physical and emotional states are also informative, since sleep, anxiety, stress, and similar conditions help people interpret their own capacity and competence to solve problems (Bandura, 2001).
Thus, self-efficacy influences students' academic motivation (Borzone-Valdebenito, 2017; Rosario et al., 2012) and their learning behaviors in different educational contexts (Alegre, 2014). It is considered a self-regulatory mechanism of learning because beliefs about one's own ability influence the achievement attained and the behaviors chosen, and because the environment can modify those beliefs through persuasion or vicarious learning (Zamora-Araya, 2020). In addition, self-efficacy protects against the negative effect of anxiety (Green, 2022) and remains stable over time in students, although it tends to increase more in women than in men (Bulfone et al., 2021).
For these reasons, a college student's personal judgment about what they can do with the capabilities they possess is as relevant as knowing what strengths they have. For example, self-esteem predicts ASE over time in college students (Luo et al., 2022). Therefore, the lack of knowledge about ASE and the inadequate selection of learning strategies have been considered relevant variables in the adaptation process of university students, and addressing them could favorably impact their academic performance and thus favor their permanence in the university (Borzone-Valdebenito, 2017; Gomes & Soares, 2013).
In Latin America, the study of ASE in college students has demonstrated a sustained rise, with a variety of findings that help us understand the importance of ASE in academic behavior. For example, the ASE had a direct effect on engagement (β = .39; Mesurado et al., 2016), flow (β = .49; Mesurado et al., 2016), attention in classes (β = .15; Sánchez-Rosas & Esquivel, 2016), academic motivation (β = .337; Montes de Oca & Moreta-Herrera, 2019), academic progress goals (β = .49; Moran et al., 2019), and academic performance (β = .21; Llorca et al., 2017).
Likewise, there was a positive correlation with self-regulated learning, both from a one-dimensional perspective (r = .650; Alegre, 2014) as well as in the review dimension (r = .15), formulation (r = .21), organization (r = .32), critical thinking (r = .32), metacognitive self-regulation (r = .33), time management and study environment (r= .34), effort regulation (r = .23), and help seeking (r = .19; Ventura et al., 2017). Similarly, there was a positive correlation with autonomy (r = .77; Buadas et al., 2017), academic performance (r = .325; Alegre, 2014; r = .409; Ávalos et al., 2018; ω2 = .037; Ferrel-Ortega et al., 2017), pre-exam coping strategies (r = .344; Dominguez-Lara, 2018a), learning strategies (r = .59; Martins & Santos, 2019), college adjustment (r > .20; Borzone-Valdebenito, 2017), wellbeing (r = .336; Bueno-Pacheco et al., 2018; r = .64; Espinoza & Barra, 2018), and resiliency (r = .34; León et al., 2019). By contrast, there is a negative correlation with stress (r = -.37; Gutiérrez-García & Landeros-Velázquez, 2018) and academic procrastination (r = -.32; Moreta-Herrera et al., 2019).
Likewise, motivational scaffolding (η2 = .65) and cognitive scaffolding (η2 = .68) within an e-learning environment had a significant effect on ASE (Valencia-Vallejo et al., 2019a, 2019b). In addition, individual variables such as mood (d = .85; Medrano et al., 2016), structural empowerment (β = .18; Tumino et al., 2020), and psychological empowerment (β = .20; Tumino et al., 2020) affect levels of self-efficacy.
Although what has previously been described may reveal the theoretical importance (correlation with related constructs) and the empirical importance (moderate or high effect size) of ASE, because it is modeled through vicarious learning within socialization processes (Bandura, 1977), it will be dependent upon the cultural group to which the assessed subject belongs. In addition, according to some academic development and career selection models (Brown & Lent, 2019), which are related with Bandura’s socio-cognitive theory, learning experiences have had an impact on the formulation of ASE, like the scientific background that they could have (Bulfone et al., 2021). For example, the ASE levels are higher in advanced semester students than the first semester students (Hernández-Jacquez, 2022). These experiences are influenced by ethnic characteristics, parental rearing practices, sex, and access to healthcare and education. This means that there may be significant differences between experiences in different Latin American countries, and there could be significant variations within a single country. Additionally, some experiences may be susceptible to sociodemographic, political, and economic changes, and these variations may affect measurements in a construct such as ASE.
In particular, Latin American countries share characteristics that allow for an assumption of similarity. However, some differences do exist. These differences could create biases in the measurements if the instruments that are used do not consider them. This is because an implicit cultural equivalency does not exist just because these cultures share a common language, ethnic features, or certain characteristics related to socioeconomic and political development. For example, Latin American countries have stood out for their collectivist values (Hofstede, 1980), although there are discrepancies as to how the levels of these are manifested. An example of this is Argentina, a country that has been assessed as the most individualistic country of the region, with individualism rankings close to those of the United States (Hofstede, 1989). The Latin American region is also characterized by a predominance of traditional values (prominence of religion, bonds between children and parents, and respect for authority and for traditional families) and self-expression (environmental protection, trust, and tolerance), although countries such as Argentina, Chile, and Uruguay have a lower level of traditionalist values when compared with Ecuador, Guatemala, Colombia, and Mexico, while Mexico, Uruguay, and Colombia have greater levels of self-expression than Peru, Brazil, and Guatemala do. Therefore, and given that the ASE is an individualized variable, it is probable that differences will exist among Latin American countries in terms of measurement constructs as well as scoring.
Evaluation of Academic Self-Efficacy
The ASE has usually been assessed as part of a general model of academic burnout with the Maslach Burnout Inventory-Student Survey (MBI-SS; Schaufeli et al., 2002), either as academic inefficacy or academic efficacy. However, because efficacy and inefficacy are not perfect opposites (Bresó et al., 2007; Morgan et al., 2014; Schaufeli & Salanova, 2007), a low score in inefficacy does not necessarily imply an elevated perception of ASE. Furthermore, the reliability coefficients for that dimension of burnout model have produced inconsistent results, which have generally not been reliable in either the inefficacy dimension (Bresó et al., 2007; Schaufeli & Salanova, 2007) or the efficacy dimension (Atalayin et al., 2015; Galán et al., 2011; Hu & Schaufeli, 2009), except in some cases, where these were, in fact, acceptable (Yavuz & Dogan, 2014; Zhang et al., 2007). Because of this inconsistency with regard to the reliability assessment and its potentially negative impact on empirical results (Zimmerman & Zumbo, 2015), we must continue to search for alternative measuring tools.
Other options in the Spanish language are the Inventory of Expectations and Academic Self-Efficacy (Barraza, 2010), the Inventory of Self-Efficacy for Study (Pérez & Delgado, 2006), and the Academic Situations Specific Perceived Self-Efficacy Scale (ASSPSES; Palenzuela, 1983). Unlike the first two inventories, the ASSPSES offers greater evidence of replicability at a structural factorial level. Given its small number of items (10 in the original version), it is an attractive option for a brief one-dimensional evaluation of ASE (Navarro-Loli & Dominguez-Lara, 2019), particularly when several constructs are evaluated simultaneously to analyze explanatory models.
ASSPSES has received attention in Ibero-American countries, and the psychometric indicators have been favorable in Peru (Dominguez-Lara, 2016), Argentina (Tumino et al., 2020), Spain (Palenzuela, 1983), Mexico (Dominguez-Lara & Campos-Uscanga, 2020), Ecuador (Moreta-Herrera et al., 2021), and Chile (Del Valle et al., 2018; Escobar & Pérez, 2017) in terms of internal structure (one-dimensional structure) and internal consistency (coefficients greater than .80), but there are still no reports from Brazil or Colombia. This scale offers various advantages: it is brief, there is unrestricted access (payment is not required), it was originally developed in Spanish, and there are consistent psychometric findings in the countries where it was analyzed. However, the comparison between countries in intercultural studies is a pending task because there are still no reports addressing its measurement invariance among Latin American countries that would allow this comparison to be made accurately. It is normally assumed that the evaluation of a construct is independent of the group to which the respondent belongs (Byrne & van de Vijver, 2010), but when differences are found, it is not possible to know whether they are real differences or the product of the different representation of the construct in each of the groups (Meredith, 1993).
In this sense, measurement invariance analysis provides for an approximation to structural equivalence of the measurement instruments, something that is required when applied to comparative cross-cultural research (Byrne et al., 2009). Structural equivalence refers to the similarity in composition of the dimensions of the construct being evaluated across different cultural groups (Byrne et al., 2009) because assuming the equivalence of psychological constructs between groups is questionable in light of the ongoing critiques of the generalized investigative model based on reported findings concerning populations living in western countries that are educated, industrialized, rich, and democratic (WEIRD; Henrich et al., 2010). This phenomenon occasionally contradicts what we see in other countries, particularly in countries labeled developing nations.
It is worth pointing out that there are some studies of invariance that involve the MBI-SS wherein there is an (in)efficacy scale, but these studies focus on comparing students from different European countries (Schaufeli et al., 2002) or comparing European with Latin American students (Campos & Maroco, 2012; Charry et al., 2018). We have not found research with characteristics similar to those of the present study.
Relationship between ASE and Academic Emotional Exhaustion
Because ASE plays a continuous role in the educational environment, impairments in ASE affect learning processes and contribute to the development of emotional problems related to the academic environment. For example, feelings of being unable to perform adequately facilitate the use of ineffective strategies that confirm this idea of low competence (Khan, 2013), which can lead to academic emotional exhaustion (AEE) and subsequently to academic burnout (AB; Yu et al., 2016).
AB refers to exhaustion resulting from academic demands, being pessimistic and losing interest in academic tasks, and feelings of incompetence as a student, including absenteeism, low participation, lack of meaning in activities, and inability to learn (Charkhabi et al., 2013).
The core element of AB is academic emotional exhaustion (AEE), which is defined as the feeling of being emotionally overburdened and drained by others (Greenglass, 2007) and can influence academic performance, student-teacher relationships, and affect students' enthusiasm toward education (Charkhabi et al., 2013). In this sense, AEE could be considered as reflecting an ASE crisis. For these reasons, studies show that the relationships between self-efficacy and AB are negative (Charkhabi et al., 2013; Kong et al., 2021).
The Present Study
The aim of this investigation was to analyze the measurement invariance of ASSPSES among college students from five Latin American countries (Peru, Colombia, Mexico, Argentina, and Brazil) from the perspective of classical test theory. The study seeks to contribute to the teaching-learning process through a scale widely used in Spanish-speaking countries, because the ASE influences the environment, behaviors, and attitudes in the school environment and, in turn, is influenced by them (Bandura, 1997). Students' self-efficacy can be increased by the reinforcement obtained through positive comments received from teachers and family members and by fulfilling learning goals resulting from academic efforts (Schunk, 2012). As a complement, increased ASE should affect the likelihood of students' acquisition of self-regulated learning patterns and the provision of adequate learning spaces (Schunk & DiBenedetto, 2016).
As has already been mentioned, most psychological constructs are dependent upon sociocultural aspects, like the ASE (Thomas et al., 2022), indicating the need to verify the measurement equivalence of the construct in relation to the country being evaluated. This is relevant because in the scales relying upon verbal language (be it oral or written), words can have different meanings, depending on the manner and context wherein they are used (Fernández et al., 2010). Therefore, the evaluation instruments cannot be immediately applied to different groups of subjects from those for which the evaluation was originally designed because there is a chance of conceptual errors occurring when there is a direct translation or when a test that was designed for one cultural subgroup is applied to another subgroup without revisions. In this sense, the differences found in the meaning and structure of the psychological construct measurements tend to originate in the variety of social institutions, values, and socialization practices.
In this order of ideas, the ASE has been studied in the countries that participated in this investigation, but with different measurement scales, such as MBI-SS in Colombia (e.g., Ferrel-Ortega et al., 2017), ASSPSES in Peru (e.g., Dominguez-Lara, 2018), or the Academic Behavior Self-Efficacy Scale (ABSES) in Mexico validated by Blanco-Vega et al. (2011) (e.g., Borzone-Valdebenito, 2017), just to name a few. In this sense, notwithstanding the information provided in relation to the construct, and given that ASE was not evaluated using the same instrument in each country, direct comparison is not possible. Therefore, we must rely upon instruments with measurement invariance evidence, given the great cultural and subcultural diversity that exists in Latin America. Such instruments would allow us to contrast the findings directly (using the same instrument) and exactly (with equivalent measurement) such that the cultural distinctions that emerge are considered.
Additionally, this study also benefits non-Latin American readers, as they could analyze the psychometric properties of the instrument in their own cultural context and replicate the procedure developed here, given that only one study was found that assesses the measurement invariance of an ASE instrument among people from countries on different continents (Yildirim & Aybek, 2019).
Method
Design
This is an instrumental study (Ato et al., 2013) focused on the psychometric properties of the Academic Situations Specific Perceived Self-Efficacy Scale.
Participants
The most salient demographic characteristics can be seen below (Table 1). It should be stressed that most students are unmarried, almost a fifth of them work, and Mexican students primarily attend public institutions, while the Peruvian, Argentine, and Colombian students primarily attend private universities. The size of the sample is appropriate in all cases, considering that this is a one-dimensional model with fewer than 10 items and with anticipated factor loadings of .70 (Wolf et al., 2013). Finally, to homogenize the comparisons, only participants younger than 36 were included, as is the case in other studies where data from different sources are considered (Potthoff et al., 2016).
Instruments
Academic Situations Specific Perceived Self-Efficacy Scale (ASSPSES). This is a one-dimensional measure of ASE (Palenzuela, 1983) that consists of nine items with four response options (from never to always), in which the highest score corresponds with a higher ASE. The items were elaborated according to a logical strategy based on Bandura's theory (1977). The version used was the one adapted to Peruvian college students (Dominguez-Lara, 2016).
Emotional Exhaustion Scale (EES). The EES (Fontana, 2011) is a unidimensional measure of AEE consisting of 10 Likert-type items with five response options ranging from rarely (1) to always (5). The version adapted in Mexico, Colombia, and Peru (Dominguez-Lara et al., 2021) was used for students of those nationalities. This scale was not applied to students from Brazil and Argentina.
Procedure
This study is associated with a research project approved by the first author’s university. The research was carried out according to the principles of the Declaration of Helsinki (World Medical Association, 1964) and the ethical code of the Colegio de Psicólogos del Perú (Colegio de Psicólogos del Perú, 2017). Participants signed an informed consent form and received specific instructions. The voluntary nature of participation and the anonymity of responses were stressed.
The researchers from Colombia and Mexico were invited later, and because this was a project that did not involve invasive or potentially harmful procedures, it was not necessary to bring it before evaluation committees in the various universities. Instead, permission was obtained by approaching authorities directly. Data originating from Brazil and Argentina were gathered as parts of separate projects led by authors from those countries, following all of the ethical guidelines required by their respective institutions, and using the Portuguese version of the ASSPSES in the case of Brazil. In all cases, the questionnaires were administered in paper-and-pencil format, except in the case of Colombia, where a Google Docs form was used.
Data Analysis
Prior to analyzing measurement invariance, a series of descriptive and factorial analyses were carried out for each sample. In particular, univariate normality was approximated through the skewness (< |3|) and kurtosis (< |10|; Kline, 2016) of each item, and multivariate normality was assessed with Mardia’s coefficient (G2 < 70; Rodríguez & Ruiz, 2008). Given the number of items, some content overlap among them is likely, so it is important to evaluate it. For that purpose, the multicollinearity criterion used in the context of factor analysis was adopted: inter-item correlations (r_ii) above .90 suggest multicollinearity (Brown, 2015).
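The screening rules just described can be illustrated with a minimal sketch. It assumes moment-based skewness and excess kurtosis and uses Pearson correlations for the inter-item check. These are simplifying assumptions, since the text does not state which estimators were used. The function names and the toy responses are hypothetical, and Mardia's coefficient is not covered.

```python
from math import sqrt


def central_moment(values, order):
    """Population central moment of the given order."""
    m = sum(values) / len(values)
    return sum((v - m) ** order for v in values) / len(values)


def skewness(values):
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5


def excess_kurtosis(values):
    return central_moment(values, 4) / central_moment(values, 2) ** 2 - 3.0


def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def screen_items(items, skew_cut=3.0, kurt_cut=10.0, r_cut=0.90):
    """items: dict mapping item name -> responses (same respondent order).

    Returns the items exceeding the normality cutoffs and the item pairs
    whose correlation exceeds the multicollinearity cutoff.
    """
    non_normal = [name for name, x in items.items()
                  if abs(skewness(x)) >= skew_cut
                  or abs(excess_kurtosis(x)) >= kurt_cut]
    names = list(items)
    collinear = [(a, b) for i, a in enumerate(names) for b in names[i + 1:]
                 if pearson(items[a], items[b]) > r_cut]
    return non_normal, collinear


# Hypothetical responses on a four-option scale (not data from the study).
toy = {"a": [1, 2, 3, 4, 2, 3], "b": [1, 2, 3, 4, 2, 3], "c": [4, 1, 3, 2, 2, 3]}
non_normal, collinear = screen_items(toy)
```

In this toy example, items "a" and "b" are identical, so the pair is flagged as collinear, while no item exceeds the normality cutoffs.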
After that, a confirmatory factor analysis (CFA) was carried out for each of the samples. The estimation method was weighted least squares mean and variance adjusted (WLSMV) with polychoric correlations. WLSMV was used because it is appropriate for ordinal items (Ledesma et al., 2021) and estimates factor loadings more accurately than methods based on maximum likelihood (Li, 2016a, 2016b).
The model was evaluated at both a general and a specific level. In general terms, the magnitude of some fit indices was examined, such as the CFI (> .90; McDonald & Ho, 2002), the upper limit of the confidence interval (CI) of RMSEA (< .10; West et al., 2012), and WRMR (< 1; DiStefano et al., 2018). At the specific level, the magnitude of each factor loading was evaluated (> .50; Dominguez-Lara, 2018b), as was the difference between the highest and lowest loading within each structure (< |.10|: trivial; ≥ |.10|: small; ≥ |.20|: moderate; ≥ |.30|: large; ≥ |.40|: very large; Finch & French, 2008) and the average variance extracted by the factor (AVE > .50; Fornell & Larcker, 1981).
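The item-level criteria can be sketched as follows. The example assumes standardized loadings from a one-factor model, for which the AVE is the mean of the squared loadings. The loadings and function names are hypothetical and are not taken from the study's tables.

```python
def average_variance_extracted(loadings):
    """AVE for one factor: mean of the squared standardized loadings."""
    return sum(l ** 2 for l in loadings) / len(loadings)


def loading_gap_label(loadings):
    """Label the gap between the highest and lowest loading using the cutoffs in the text."""
    gap = round(max(loadings) - min(loadings), 6)  # rounding avoids float noise at the cutoffs
    for cutoff, label in ((0.40, "very large"), (0.30, "large"),
                          (0.20, "moderate"), (0.10, "small")):
        if gap >= cutoff:
            return label
    return "trivial"


# Hypothetical standardized loadings (illustrative only).
example = [0.72, 0.85, 0.78]
ave = average_variance_extracted(example)
label = loading_gap_label(example)
meets_criteria = all(l > 0.50 for l in example) and ave > 0.50
```

For these illustrative loadings, the AVE is about .62 and the gap of .13 falls in the "small" band.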
With respect to reliability, in each model, the assumption of tau-equivalence was evaluated prior to the calculation of the coefficient alpha (Dunn et al., 2014), and the assumption was considered satisfied if ΔCFI > −.01 and ΔRMSEA < .015 (Chen, 2007). Likewise, construct reliability was assessed with the ordinal alpha coefficient (Dominguez-Lara, 2018c) and ω (> .70; Hunsley & Marsh, 2008). The difference between the coefficients alpha and omega (Δω-α) was considered significant when exceeding |.06| (Gignac et al., 2007).
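As a rough illustration of the comparison between ω and α, the sketch below computes both from standardized loadings of a one-factor model. It assumes uncorrelated residuals and uses the model-implied inter-item correlations to obtain a standardized alpha, which is a simplification and not the ordinal alpha procedure cited in the text. The loadings and function names are hypothetical.

```python
from itertools import combinations


def omega(loadings):
    """Omega for a one-factor model with standardized loadings and uncorrelated residuals."""
    common = sum(loadings) ** 2
    unique = sum(1 - l ** 2 for l in loadings)
    return common / (common + unique)


def standardized_alpha(loadings):
    """Standardized alpha from the model-implied correlations r_ij = l_i * l_j."""
    k = len(loadings)
    pairs = list(combinations(loadings, 2))
    r_bar = sum(a * b for a, b in pairs) / len(pairs)
    return k * r_bar / (1 + (k - 1) * r_bar)


def coefficients_differ(loadings, cutoff=0.06):
    """True when |omega - alpha| exceeds the cutoff mentioned in the text."""
    return abs(omega(loadings) - standardized_alpha(loadings)) > cutoff


# Hypothetical standardized loadings (illustrative only).
example = [0.72, 0.85, 0.78]
w = omega(example)
a = standardized_alpha(example)
differ = coefficients_differ(example)
```

When the loadings are similar in size, as in this example, the two coefficients are nearly identical, which is what approximate tau-equivalence implies.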
Following these initial steps, a measurement invariance analysis of the one-dimensional model for ASSPSES was conducted among the five samples through a multi-group factor analysis (MGFA). The statistical equivalence of the internal structure of the ASSPSES (configural invariance) was examined first, followed by the factor loadings (metric invariance), thresholds (strong invariance), and finally, residuals (strict invariance; Pendergast et al., 2017). Evidence regarding the level of measurement invariance was initially evaluated in general terms through the examination of variation in the CFI and RMSEA. Thus, the level of metric invariance was considered favorable according to the variation in RMSEA (Δ ≤ .05; Rutkowski & Svetina, 2017) and CFI (Δ ≥ −.004; Rutkowski & Svetina, 2017). With regard to strong invariance, the RMSEA (Δ ≤ .01; Rutkowski & Svetina, 2017) was considered as well as the CFI (Δ ≥ −.004; Rutkowski & Svetina, 2017). Subsequently, potential non-invariant parameters (according to the modification indices) were examined individually, with a focus on misspecifications (Saris et al., 2009). Finally, measurement invariance was considered to hold if fewer than 20% of the parameters were non-invariant (Dimitrov, 2010).
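The decision rules for the nested models can be expressed as a small sketch. It assumes that each change is computed as the more constrained model minus the less constrained one, and it encodes only the cutoffs stated above. The fit values, dictionary keys, and function names are hypothetical.

```python
# (maximum allowed change in RMSEA, minimum allowed change in CFI) per step
CUTOFFS = {"metric": (0.05, -0.004), "strong": (0.01, -0.004)}


def step_supported(level, less_constrained, more_constrained):
    """Check one invariance step given the fit indices of two nested models."""
    max_d_rmsea, min_d_cfi = CUTOFFS[level]
    d_cfi = more_constrained["cfi"] - less_constrained["cfi"]
    d_rmsea = more_constrained["rmsea"] - less_constrained["rmsea"]
    return d_rmsea <= max_d_rmsea and d_cfi >= min_d_cfi


def within_partial_invariance_limit(n_noninvariant, n_parameters, limit=0.20):
    """True when fewer than 20% of the tested parameters are non-invariant."""
    return n_noninvariant / n_parameters < limit


# Hypothetical fit indices (not the values reported in the study).
configural = {"cfi": 0.990, "rmsea": 0.080}
metric = {"cfi": 0.988, "rmsea": 0.085}
strong = {"cfi": 0.987, "rmsea": 0.083}

metric_ok = step_supported("metric", configural, metric)
strong_ok = step_supported("strong", metric, strong)
```

The second helper applies the 20% rule to whatever set of parameters (loadings or thresholds) is being counted, which is left to the analyst.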
Since different methods of assessing measurement equivalence do not converge perfectly due to differences in the sample size of the groups compared, the magnitude of the factor loadings, and the degree of item invariance, the probability of false positives and false negatives was reduced by implementing a complementary approach to measurement invariance. Accordingly, a nonparametric approach based on contingency tables was introduced for the detection of differential item functioning (DIF) in more than two groups. The method was the generalized Mantel-Haenszel statistics (Fidalgo & Madeira, 2008), based on the Mantel-Haenszel statistic for contingency tables organized as Q: R x C (Q: strata; R: rows; C: columns; Landis et al., 1978). This method allows us to obtain a full statistical test of the null hypothesis of absence of DIF in more than two groups simultaneously.
The specifications for this procedure were as follows: a) the matching variable was the ASSPSES total observed score, transformed into quantiles; to address the sensitivity of DIF detection, the total score was transformed into five (quintiles) and ten intervals (deciles); b) the number of groups compared was five, one for each country; c) the alpha for testing the null hypothesis of absence of DIF was set at .05 and corrected with Bonferroni's method (.01/9 = .005) to reduce the effect of repeated statistical testing (Fidalgo, 2011; Penfield, 2001); d) items detected with possible DIF were analyzed post hoc by paired comparisons between groups (Fidalgo & Scalon, 2009), and the alpha (.05) of these paired comparisons was also adjusted with the Bonferroni method due to its effectiveness (Kim & Oshima, 2013); and e) the Generalized Ordinal Mantel-Haenszel statistic QGMH(2) (df = number of groups − 1) was used, in which the response variable is ordinal (ASSPSES items) and the grouping variable is nominal (in this study, a five-level factor, one level for each country). The QGMH(2) statistic tests the null hypothesis of no differences in item mean scores across group factor levels, and is sensitive to DIF of similar direction and magnitude across response categories (Fidalgo, 2011; Fidalgo & Scalon, 2012).
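Specification (a) and the Bonferroni adjustment can be sketched as follows. The example assumes that strata are formed from inclusive quantile cut points of the observed total score, which is one of several reasonable ways to build quintiles or deciles. The function names and the toy scores are hypothetical, and the generalized Mantel-Haenszel statistic itself is not implemented here.

```python
from bisect import bisect_left
from statistics import quantiles


def stratify(total_scores, n_strata):
    """Assign each total score to a stratum (0 .. n_strata - 1) using quantile cut points.

    Equal scores always fall in the same stratum. With heavily tied scores,
    some cut points can coincide and leave a stratum empty.
    """
    cuts = quantiles(total_scores, n=n_strata, method="inclusive")
    return [bisect_left(cuts, score) for score in total_scores]


def bonferroni(alpha, n_tests):
    """Per-test alpha after a Bonferroni correction."""
    return alpha / n_tests


# Hypothetical total scores covering the possible range of a nine-item,
# four-option scale (9 to 36); not data from the study.
scores = list(range(9, 37))
quintile_strata = stratify(scores, 5)
decile_strata = stratify(scores, 10)
```

Each stratum then defines one R x C table (groups by response categories) for a given item.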
Because the p-value approach is likely to reject the null hypothesis in large samples (Halsey, 2019), regardless of the magnitude of the DIF, the partial gamma coefficient (γp; Davis, 1967) was used as an effect size estimator. This statistic is applicable to the Q: R x C tables characteristic of DIF for ordinal variables (Schnohr et al., 2008) and was interpreted according to this scale: between 0 and .15, weak; between .16 and .30, moderate; greater than .30, strong (Schnohr et al., 2008).
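A minimal sketch of the partial gamma coefficient is shown below. It assumes the usual definition, in which concordant and discordant pairs are counted within each stratum, summed across strata, and combined as (C − D) / (C + D). Because gamma requires ordered rows, the sketch is most natural for pairwise comparisons of two groups. The tables and function names are hypothetical.

```python
def concordant_discordant(table):
    """Count concordant and discordant pairs in one R x C table (list of rows)."""
    concordant = discordant = 0
    n_rows, n_cols = len(table), len(table[0])
    for i in range(n_rows):
        for j in range(n_cols):
            for i2 in range(i + 1, n_rows):
                for j2 in range(n_cols):
                    if j2 > j:
                        concordant += table[i][j] * table[i2][j2]
                    elif j2 < j:
                        discordant += table[i][j] * table[i2][j2]
    return concordant, discordant


def partial_gamma(strata_tables):
    """Gamma pooled over strata: pair counts are summed across the stratum tables."""
    c_total = d_total = 0
    for table in strata_tables:
        c, d = concordant_discordant(table)
        c_total += c
        d_total += d
    if c_total + d_total == 0:
        raise ValueError("no concordant or discordant pairs")
    return (c_total - d_total) / (c_total + d_total)


def gamma_label(gamma):
    """Interpretation bands stated in the text (applied to the absolute value)."""
    g = abs(gamma)
    if g > 0.30:
        return "strong"
    if g > 0.15:
        return "moderate"
    return "weak"


# Hypothetical 2 x 2 tables for two strata (rows: groups; columns: ordered responses).
stratum_1 = [[10, 5], [5, 10]]
stratum_2 = [[4, 4], [4, 4]]
g = partial_gamma([stratum_1, stratum_2])
```

In this toy example the pooled coefficient is 75/157, or about .48, which the scale above would label as strong.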
The analyses were conducted with Mplus Version 7 (Muthén & Muthén, 1998-2015), and the misspecifications were analyzed with a specific module (Dominguez-Lara & Merino-Soto, 2018). The DIF analysis was carried out with the GMHDIF software (Fidalgo, 2009, 2011) and the R package iarm (Mueller, 2020).
Finally, regarding the evidence of validity due to its relationship with other variables, the association between the measure of ASE and that of AEE was explored with Pearson correlation coefficient under an effect size approach, where a value between .20 and .50 is considered low, between .50 and .80, moderate, and greater than .80, high (Ferguson, 2009).
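The effect size reading of the correlation can be sketched as follows. The bands are those stated above, applied to the absolute value of r. The label for values below .20 is an assumption, since the text does not name that range, and the function names are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equally long sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def effect_size_label(r):
    """Classify |r| with the bands in the text; 'below minimum' is an assumed label."""
    magnitude = abs(r)
    if magnitude > 0.80:
        return "high"
    if magnitude >= 0.50:
        return "moderate"
    if magnitude >= 0.20:
        return "low"
    return "below minimum"


# Hypothetical paired scores (illustrative only).
ase_scores = [1, 2, 3, 4]
aee_scores = [8, 6, 4, 2]
r = pearson(ase_scores, aee_scores)
```

A negative sign indicates that higher ASE goes with lower AEE, while the label depends only on the magnitude.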
Results
With regard to the descriptive analysis, all five samples showed acceptable magnitudes of skewness and kurtosis in each of the items, as well as acceptable multivariate normality (Argentina: G2 = 43.08; Brazil: G2 = 6.82; Colombia: G2 = 36.66; Mexico: G2 = 22.79; Peru: G2 = 29.16). As for the inter-item correlations (r_ii) by country, in Peru they ranged between .474 and .694 (M = .58), in Mexico between .48 and .69 (M = .58), in Colombia between .56 and .75 (M = .64), in Argentina between .40 and .57 (M = .47), and in Brazil between .26 and .64 (M = .47), which indicates the absence of multicollinearity between items.
The structural analysis reveals the fulfillment of a one-dimensional structure, with respect to the fit indices, the factor loadings and the AVE (> .50), and with minor differences between the highest and lowest factor loading (Table 3). With respect to reliability, in most of the samples (except for Brazil and Peru) the assumption of tau-equivalence was met for an appropriate estimate of the coefficient α. In addition, all of the reliability indicators were favorable (> .80) and virtually the same in terms of magnitude (Table 2).
As far as the evidence for measurement invariance is concerned, the variation of the fit indices fell within the expected range following the incremental restriction of parameters, both for CFI and RMSEA, until reaching strong invariance (Table 2). In addition, an individual analysis of the estimated parameters reveals complementary information. As has already been suggested, the one-dimensional structure provides satisfactory indicators when analyzed simultaneously (configural invariance), but when the factor loadings are fixed to a designated reference value (RV) to analyze metric invariance, some differences emerge among the groups. For example, the factor loadings of several items in the Colombian and Argentine samples are, respectively, higher and lower than the RV (Table 3). Meanwhile, the Peruvian, Brazilian, and Mexican samples do not reveal differences (except for Item 1 for the Brazilian students). However, after modeling the strong invariance, these differences disappeared, and only some thresholds appeared as non-invariant in an isolated way (i.e., one threshold per country; Table 3).
Source: Elaborated by authors. Note. PER: Peru; COL: Colombia; ARG: Argentina; MEX: Mexico; BRA: Brazil. %At: percentage of attenuation.
As for DIF (Table 4), when the total score was divided into quintiles, the QGMH(2) statistic detected three items with probable differential functioning (items 1, 3, and 6), with QGMH(2) > 28.0 (p < .0005). The partial gamma coefficients (γp), both overall and for the items detected with DIF, can be considered weak. Similarly, when the total score was divided into deciles, all items (except item 9) were detected with DIF (p < .0005), but the γp coefficients were also weak in all cases.
Source: Elaborated by authors. Note. γp: Partial gamma coefficient. QGMH(2): Generalized Ordinal Mantel-Haenszel statistic (df = 4). * Bonferroni correction p = .0005.
Thus, taking into account the information from the two approaches reviewed, the evidence favors measurement invariance.
Regarding the association with the measure of AEE, small negative correlations were found in the samples from Mexico (r = −.29), Colombia (r = −.21), and Peru (r = −.34).
Discussion
The aim of the study was to analyze measurement invariance of ASSPSES among college students from five Latin American countries. The research approach emerges from the premise that any kind of intercultural research involves an overlapping of cultures (Sarmento, 2014), allowing for the recognition of differential and common elements in diverse groups with respect to a particular phenomenon.
With regard to the structural results for each group, the psychometric indicators do not differ from previous findings regarding the internal structure and reliability of ASSPSES in the samples from Peru (Dominguez-Lara, 2016), Mexico (Dominguez-Lara & Campos-Uscanga, 2020), and Argentina (Tumino et al., 2020). The factorial parameters from Colombia are more favorable (i.e., higher factor loads), and in Brazil, the scenario was similar to that in the first three countries, with the exception of item 1, which obtained an acceptable factor loading, but one that was lower than those of the others. In this sense, the internal structure of ASSPSES also receives favorable evidence for Colombia and Brazil. Regarding reliability, the level of precision attained is notable in all cases, including cases where the tau-equivalence was not met.
In relation to model fit, it is worth noting that the RMSEA was elevated in all cases, which is explained by the magnitude of the observed factor loadings (> .70; Mahler, 2016; Savalei, 2012). However, an RMSEA at the margin of what is acceptable, together with a good CFI in all cases, does not necessarily indicate a poor model (Lai & Green, 2016).
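For reference, the RMSEA is a function of the model chi-square, its degrees of freedom, and the sample size. The following Python sketch uses a common single-group formula, sqrt(max(χ² − df, 0) / (df · (N − 1))). Some software uses N instead of N − 1 or applies corrections for ordinal estimators and multiple groups, so the function and the numbers below are illustrative assumptions, not values from the study.

```python
import math


def rmsea(chi_square, df, n):
    """Point estimate of RMSEA for a single group (N - 1 convention)."""
    if df <= 0 or n <= 1:
        raise ValueError("df must be positive and n must exceed 1")
    # Misfit per degree of freedom and per observation, truncated at zero.
    return math.sqrt(max(chi_square - df, 0.0) / (df * (n - 1)))


# Hypothetical fit results (not taken from the study).
print(round(rmsea(chi_square=100.0, df=27, n=501), 3))  # 0.074
print(rmsea(chi_square=20.0, df=27, n=501))             # 0.0
```

Because the index depends on the excess of chi-square over its degrees of freedom, anything that raises the chi-square for a given amount of misspecification also raises the RMSEA.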
With regard to the invariance analysis, it is possible to conclude with a high level of certainty that ASSPSES is invariant among the groups, at least up to the level of strong invariance, according to both the MGFA and the DIF analysis. This indicates that the construct is measured similarly among the students surveyed and with tolerable measurement error (reliability coefficients > .85), which makes a joint study of the construct more feasible and even supports individual educational diagnostic use, given the high degree of reliability.
Although this is the first study to analyze ASSPSES simultaneously in different samples, and with favorable findings at that, some caution is advisable with regard to the descriptive aspects of the findings. For example, Colombia has the highest factor loadings, while Argentina has the lowest. This would indicate that, although ASE is assessed one-dimensionally in both countries (and in the other countries), the proportion of true variance differs between them, and it is probable that, if a comparison were carried out, scores in the Argentine sample would contain more measurement error than those in the Colombian sample.
In this vein, it is noteworthy that in all of the countries item 3 (“I feel confident about approaching situations that test my academic capability”) is one of the two most representative items. The next most representative item is item 6 (“I believe that I am a capable and competent person in my academic life”) for Peru, Colombia, and Mexico, item 5 (“I don’t care that the professors are demanding and tough because I trust my own academic capability”) for Argentina, and item 7 (“I believe I am sufficiently capable of having a good academic record if I set out to do so”) for Brazil. The least representative item is item 7 for Peru and Mexico, and item 8 (“I think I can pass the classes fairly easily and even get good grades”) for Colombia, Argentina, and Brazil.
Regarding the association between ASE and AB, the negative correlation is compatible with previous evidence (Charkhabi et al., 2013; Kong et al., 2021) and would indicate that cognitive and social demands could affect ASE beliefs and thereby decrease achievement expectations, planning ability, and academic performance by increasing the experience of stress. This information could be useful to professionals guiding students because, for students with high levels of ASE, such demands should be tolerable and should not lead to problems in achieving learning goals.
These results have various practical applications. To begin with, being able to rely on an invariant measure of ASE allows for a more precise comparison among groups. Likewise, it is possible to consider the use of ASSPSES in comparative research of an intercultural nature. From a broad, evidence-based vantage point, this approach would make it possible, among other things, to identify subject areas in the region where further investigation is needed. It is also possible to use the questionnaire to evaluate the efficacy of intervention programs geared toward ASE (e.g., Roshangar et al., 2020; Sohrabi et al., 2022), particularly in instances of low ASE, given the demonstrated correlation with various variables that directly affect the academic achievement of students. In addition, it is possible to encourage research on the promotion of health and wellbeing in young people, because ASE correlates positively with autonomy (Buadas et al., 2017), resilience (León et al., 2019), moods (Medrano et al., 2016), and structural and psychological empowerment (Tumino et al., 2020), and negatively with stress (Gutiérrez-García & Landeros-Velázquez, 2018) and academic procrastination (Moreta-Herrera et al., 2019). In other cultural contexts, self-efficacy has proven to be a direct predictor of academic resilience (Martin & Marsh, 2006) and of academic performance (Ansong et al., 2019). All of this is relevant at the time of writing because of the generalized uncertainty resulting from the global health emergency caused by the COVID-19 pandemic. Given that Latin America is one of the regions with the greatest numbers of people infected, students of all educational levels will be prioritized with regard to intervention (Moreta-Herrera et al., 2022).
In that sense, possessing a valid and reliable measure of ASE for a sizable portion of Latin American countries could help consolidate and standardize programs that help students succeed academically and overcome adversity, considering the importance of ASE in these processes. Because of its brevity and ease of administration, the same instrument can also be used to evaluate the effectiveness of such programs.
This study has various strengths. The first is that, in addition to the change in the fit indices, individual parameters (e.g., modification indices) were also examined as a basis for decision making in the analysis of measurement invariance. The second strength concerns the criteria used to assess measurement invariance. Traditionally, cut-off points for the change in fit indices (e.g., ΔRMSEA) are based on quantitative variables and two groups (e.g., Chen, 2007). However, because this study involved a greater number of groups (five) and a one-dimensional model of ordinal variables, the criteria used were more appropriate (Rutkowski & Svetina, 2017).
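The comparison of nested invariance models by change in fit indices can be sketched as follows. The Python function below reports whether a more constrained model is retained given user-supplied cut-offs. The fit values and the default cut-offs (ΔCFI ≥ −.010, ΔRMSEA ≤ .015, values commonly associated with the two-group, continuous-variable setting) are illustrative assumptions; they are not the criteria applied in this study, for which the cut-offs of the cited source (Rutkowski & Svetina, 2017) would need to be substituted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fit:
    """Fit indices of one model in a nested invariance sequence."""
    cfi: float
    rmsea: float


def invariance_supported(less_constrained, more_constrained,
                         min_delta_cfi=-0.010, max_delta_rmsea=0.015):
    """Return True if added constraints do not worsen fit beyond the cut-offs.

    The default cut-offs are illustrative assumptions, not the study's criteria.
    """
    delta_cfi = more_constrained.cfi - less_constrained.cfi
    delta_rmsea = more_constrained.rmsea - less_constrained.rmsea
    return delta_cfi >= min_delta_cfi and delta_rmsea <= max_delta_rmsea


# Hypothetical fit values for three nested models (not the study's results).
configural = Fit(cfi=0.990, rmsea=0.080)
metric = Fit(cfi=0.985, rmsea=0.085)
scalar = Fit(cfi=0.960, rmsea=0.110)

print(invariance_supported(configural, metric))  # True
print(invariance_supported(metric, scalar))      # False
```

Keeping the cut-offs as parameters makes the dependence on the chosen criteria explicit, which matters when the number of groups or the type of indicator differs from the setting in which the cut-offs were derived.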
Regarding the limitations, it must be remembered that the samples gathered are not representative of each of the participating countries and were collected in different independent studies; therefore, it is not possible to generalize the results to the entire population of Latin American students, and in future studies data collection should be planned jointly. Second, a majority of the participants were women in all five countries, with percentages ranging from 88.9% for Colombia to 59.9% for Mexico. Third, the study did not examine other contextual factors (van de Vijver, 2009) that may have differed among countries and could have affected the results (e.g., educational models at Latin American institutions of higher education, the educational quality of participating institutions, how these institutions are ranked within their respective countries as well as in the Latin American context). For example, a majority of the Mexican students attended a public university, while students from all the other countries attended private universities. Fourth, the evaluation of the ESA was only carried out in the Peruvian, Mexican, and Colombian samples, since they shared the same evaluation format, while the applications in Argentina and Brazil were developed in the framework of independent projects that did not include this construct, although in all cases the institutional ethical procedures and the presentation of informed consent were respected. Finally, it is true that noisy data is a potential problem for any psychometric study, but it would manifest as an apparently adequate model (favorable fit indices, acceptable factor loadings, etc.) combined with low reliability coefficients (Clayson & Miller, 2017), which is not compatible with the findings of the present manuscript, because reliability coefficients are above .88 in all cases.
In conclusion, ASSPSES is an invariant measure of ASE among students from five Latin American countries, and the findings contribute to the understanding of the internal structure of the scale, something that had been awaiting evaluation given that the scale is used in various contexts.
Nonetheless, further research is recommended where groups are balanced in terms of sex and field of study to minimize the likelihood of introducing biases resulting from the characteristics of the participants. To that end, it would be useful to carry out invariance analyses according to gender, type of institution (public or private), and work situation (whether the student works) to determine whether the factorial structure is maintained. It would also be appropriate to supplement such an investigation with an invariance analysis from the perspective of item response theory models wherein parameters of difficulty and discrimination are used to evaluate the latent space and specific contribution levels of the items.