After a desk rejection from JPSP, my co-author and I submitted our manuscript to PSPB (see blog https://replicationindex.com/2021/07/28/the-race-implicit-association-test-is-biased/). After several months, we received the expected rejection. But it was not all in vain. We received a signed review by John Jost, and for the sake of open science, I am pleased to share it with everybody. My comments are highlighted in bold.
Warning. The content may be graphic and is not suitable for all audiences.
Back in July 2021, the authors sent me a draft of the present paper. I am glad that they did so, because it gave us an opportunity to exchange our opinions and interpretations and to try to correct any misunderstanding or misinterpretations. Unfortunately, however, I see that in the present submission many of those misinterpretations (including false and misleading statements) remain. Thus, I am forced to conclude, reluctantly, that we are not dealing with misunderstandings here but with strategic misrepresentations that seem willful. To be honest, this saddens me, because I thought we could make progress through mutual dialogue. But I don’t see how it serves the goals of science to engage in hyperbole and dismissiveness and to misrepresent so egregiously the views of professional colleagues.
For all of these reasons, and those enumerated below, I am afraid that I cannot support publication of this paper in PSPB.
John Jost
(1) On p. 3 the authors write: “IAT scores close to zero for African Americans have been interpreted as evidence that “sizable proportions of members of disadvantaged groups – often 40% to 50% or even more exhibit implicit (or indirect) biases against their own group and in favor of more advantaged groups” (Jost, 2019, p. 277).” This is not true. We did not “interpret” the mean-level scores in terms of frequency distributions (or vice versa). We looked at both. So these are two separate observations; one observation was not used to explain the other. For African Americans the mean-level scores were close to zero (no preference) and, using a procedure described in the note to Figure 1 for Jost et al. (2004, p. 898), we concluded that 39.3% exhibited a pro-White/anti-Black bias. (The 40-50% figure comes from other intergroup comparisons included in the original article).
It is not important how you arrived at a precise percentage of unconsciously self-hating African Americans. We used this quote to make clear that you treated the race IAT as a perfectly valid measure of unconscious bias to arrive at the conclusion that a large percentage of African Americans (and a much larger percentage than White Americans) have a preference for the White out-group over the Black in-group. This is the key claim of your article and this is the claim that we challenge. At issue is the validity of the race IAT, which is required to make valid claims about African Americans, not the statistical procedure used to estimate a percentage.
(2) On p. 4, the authors write: “Jost et al.’s (2004) claims about African Americans follow a long tradition of psychological research on African Americans by mostly White psychologists. Often this research ignores the lived experience of African Americans, which often leads to false claims…” There are two very big problems with this section of the paper, which I have already pointed out to the authors (and they have apparently chosen to ignore them).
(a) The first is that this is an ad hominem critique, directed at me because of a personal characteristic, namely my race. For centuries philosophers have rejected this as a fallacious form of reasoning: whether something is true or false has nothing to do with the personal characteristics of the person making this claim. Furthermore, the senior author (Uli Schimmack) is obviously wielding this critique in bad faith; he, too, is White, so if he took his own objection seriously he would refrain from making any claims about the psychology of African Americans, but he obviously has not refrained from doing so in this submission or in other forums.
It is a general observation that White researchers have often speculated about African Americans’ self-esteem and mental states without consulting African Americans (see our quote of Adams). And I, Ulrich Schimmack, collaborated with my African American wife on this paper to avoid this very mistake.
(b) The second problem with this claim, which I have also already pointed out to the authors, is that the very same hypotheses about internalization of inferiority advanced by Jost et al. (2004) in the article in question were, in fact, made by a number of Black scholars, including W.E.B. DuBois, Frantz Fanon, Steve Biko, and Kenneth and Mamie Clark. These influences are discussed in considerable detail in my 2020 book, A Theory of System Justification.
Kenneth and Mamie Clark are the authors of the famous doll studies from the 1940s. Are we supposed to believe that nothing has changed over the past 80 years and that we can just use a study with children in the 1940s to make claims about adult African Americans’ attitudes in 2014? What kind of social psychologist would ignore the influence of situations and culture on attitudes?
(3) On the next page the authors write: “Just like White theorists’ claims about self-esteem, Jost et al.’s claims about African Americans’ unconscious are removed from African Americans’ own understanding of their culture and identity and disconnected from other findings that are in conflict with the theory’s predictions. The only empirical support for the theory is the neutral score of African Americans on the race IAT.” Now, this claim is absurd. The book cited above describes hundreds of studies providing empirical support for the theory that have nothing to do with the IAT.
Over the past 10 years, we have seen this gaslighting again and again. When one study is criticized, it is defended by pointing to the vast literature of other studies that also support the claim. There may be other evidence, but it is not clear how this other evidence could reveal something about the unconscious. The whole appeal of the IAT was that it shows something that explicit measures cannot show. In fact, explicit ratings often show stronger in-group favoritism among African Americans. To dismiss this finding, Jost has to appeal to the unconscious, which supposedly reveals a hidden preference for Whites.
(4) They go on: “We are skeptical about the claim that most African-Americans secretly favor the outgroup based on the lived experience of the second author” (p. 5). But this was not our claim. As noted above, we found that 39.3% of African Americans (not “most”) exhibited a pro-White/anti-Black bias on the IAT. But, of course, the theory is about relations among variables, not about the specific percentage of Black people who do X, Y, or Z (which is, of course, affected by historical factors, among many other things).
Back to the game with percentages. We do not care whether you wrote 40% or 50%. We care about the fact that you make claims about African Americans’ unconscious based on an invalid measure.
(5) On p. 6 the authors write: “the mean score of African Americans on the race IAT may be shifted towards a pro-White bias because negative cultural stereotypes persist in US American culture. The same influence of cultural stereotypes would also enhance the pro-White bias for White Americans. Thus, an alternative explanation for the greater in-group bias for White Americans than for African Americans on the race IAT is that attitudes and cultural stereotypes act together for White Americans, whereas they act in opposite directions for African Americans” (p. 6).
As noted above, in July 2021 I wrote to the authors in an attempt to clarify that, from the perspective of SJT, the effects of “cultural stereotypes” in no way support “an alternative explanation” for out-group favoritism, because stereotypes (since the very first article by Jost & Banaji, 1994) have been considered to be system-justifying devices. Here is what I wrote to them: You describe the influence of “cultural stereotypes” as some kind of an alternative to system justification processes, but they are not. The theory started as a way of understanding the origins and consequences of cultural stereotypes. None of this contradicts SJT at all: “The nature of the task may activate cultural stereotypes that are normally not activated when African Americans interact with each other. As a result, the mean score of African Americans on the race IAT may be shifted towards a pro-White bias because negative cultural stereotypes persist in US American culture. The same influence of cultural stereotypes would also enhance the pro-White bias for White Americans.” Yes, this is perfectly consistent with SJT. In fact, it is part of our point. And the purpose of SJT is not to explain what happens “when African Americans interact with each other,” although it may shed some light on intragroup dynamics. I think of the scene in Spike Lee’s (a Black film director, as you well know) movie, School Daze, when the light-skinned and dark-skinned African Americans are fighting/dancing with each other. There is plenty of system justification going on there, it seems to me.
We may (or may not) disagree in our interpretation of the social dynamics in School Daze, but I feel that the authors are now willfully misrepresenting system justification theory on the issue of “cultural stereotypes,” even after I explicitly sought to clarify their misrepresentation months ago: The activation of cultural stereotypes IS part of what we are trying to understand in terms of SJT.
Jost ignores that many other social psychologists have raised concerns about the validity of the race IAT because it may conflate knowledge of negative stereotypes with endorsement of these stereotypes and attitudes (Olson & Fazio, DOI: 10.1037/0022-3514.86.5.653). For anybody who cares, please ask yourself why Jost does not address the key point of our criticism, namely the use of race IAT scores to make inferences about African Americans’ unconscious without evidence that it can measure conscious or unconscious preferences of African Americans.
(6) It has been a while since I read the Bar-Anan and Nosek (2014) article, but my memory for it is incompatible with the claim that those authors were foolish enough to simply assume that the most valid implicit measure was the one that produced the biggest difference between Whites and Blacks in terms of in-group bias, as the present authors claim (pp. 7-8). As I recall, Bar-Anan and Nosek made a series of serious and comprehensive comparisons between the IAT and other tasks and concluded on the basis of those comparisons, not the one graphed in Figure 1 here, that the validity of the IAT was superior. I feel that, in addition to seriously misrepresenting my own work, they are also seriously misrepresenting the work of Bar-Anan and Nosek. Those authors should also have the opportunity to review and/or respond to the present claims being made about the (in)validity of the IAT.

So, the reviewer relies on his foggy memory to question our claim instead of retrieving a PDF file and checking for himself. New York University should be proud of this display of scholarship. I hope Jost made sure to get his Publons credit. Here is the relevant section from Bar-Anan and Nosek (2014, p. 675; https://link.springer.com/article/10.3758/s13428-013-0410-6).

(7) One methodological improvement of this paper over the previous draft that I saw is that this version now includes other implicit measures, including the single category IAT. However, the hypothesis stated on p. 9, allegedly on behalf of SJT, is incorrect: “System justification theory predicts a score close to zero that would reflect an overall neutral attitude and at least 50% of participants who may hold negative views of the in-group.” This is wrong on several counts and indicates a real lack of familiarity with SJT, which predicts that (to varying degrees) people are motivated to hold favorable attitudes toward themselves (ego justification), their in-group (group justification), and toward the overarching social system (system justification). This last motive – in a departure from the first two – implies that, based on the strength of system justification tendencies, advantaged group members’ attitudes toward the ingroup will become more favorable and disadvantaged group members’ attitudes toward the in-group will become less favorable. As noted above, SJT is not about making predictions about absolute scores or frequency counts – these are all subject to historical and many other contextual factors. It would be foolish to predict that African Americans have a neutral (near zero) attitude toward their own group or that 50% have a negative attitude. This is not what the theory says at all. Unless you have separate individual-level estimates of ego, group, and system justification scores, the most one could hypothesize is that on the single category IAT African Americans would have a more favorable evaluation of the out-group than European Americans would, and European Americans would have a more favorable evaluation of the in-group than African Americans would. Note that I am writing this before looking at the results.
We are interested in African Americans’ and White Americans’ attitudes towards their in-groups and out-groups. If System Justification Theory (SJT) makes no clear predictions about these attitudes, we do not care about SJT. However, we do care about an article that has been cited over 1,000 times, that claims that many African Americans have unconscious negative attitudes towards their in-group, and that supports this claim by computing the percentage of African Americans who scored above zero on a White-Black IAT (i.e., slower responses when African American is paired with good than when African American is paired with bad). We show that the race IAT lacks convergent validity with other implicit measures and that other implicit measures show different results. Thus, Jost has to justify why we should focus on the IAT results and ignore the results from other IAT tasks. So far, he has avoided talking about our actual empirical results.
(8) On pp. 10-11 the authors concede: “The model was developed iteratively using the data. Thus, all results are exploratory and require validation in a separate sample. Due to the small number of Black participants, it was not possible to cross-validate the model with half of the sample. Moreover, tests of group differences have low power and a study with a larger sample of African Americans is needed to test equivalence of parameters… models with low coverage (many missing data) may overestimate model fit. A follow-up study that administers all tasks to all participants should be conducted to provide a stronger test of the model.” These seem like serious limitations that, in the absence of replication with much larger samples, undermine the very strong conclusions the authors wish to draw.
So Jost can make strong claims (40% of African Americans have unconscious negative attitudes towards their group) based on an unvalidated measure, but when we actually show that the measure lacks validity, we need to replicate our findings first? This is not how science works. Rather, Jost needs to explain why other implicit measures, including the single category IAT, do not show the same pattern as the race IAT that was used in the 2004 article.
(9) There is a peculiar paragraph on p. 13 in the “Results” section, even though it goes well beyond the reporting of results: “Most important is the finding that race IAT scores for African Americans were unrelated to the attitudes towards the in-group and out-group factors. Thus, scores on the race IAT do not appear to be valid measures of African Americans’ attitudes. This finding has important implications for Jost et al.’s (2004) reliance on race IAT scores to make inferences about African Americans’ unconscious attitudes towards their in-group. This interpretation assumed that race IAT scores do provide valid information about African Americans’ attitudes towards the ingroup, but no evidence for this assumption was provided. The present results show 20 years later that this fundamental assumption is wrong. The race-IAT does not provide information about African Americans’ attitudes towards the in-group as reflected in other implicit measures.”
First of all, I don’t know if one can conclude, even in principle, that the race IAT is invalid for African Americans on the basis of a single study carried out with approximately 200 African American participants. There have been dozens, if not more, studies conducted (see Essien et al., 2020, JPSP), so it seems that any attempt to claim invalidity across the board should be based on a far more comprehensive analysis of larger data sets. Second, if I understand the specific methodological claim here it is that African Americans’ race IAT scores are not correlated with whatever the common factor is that is shared by the other implicit attitude measures (AMP, evaluative priming, and SC-IAT) and one explicit attitude measure (feeling thermometer). At most, it seems to me that one could conclude, on the basis of this, that the race IAT is measuring something different than the other things. This is not all that surprising; indeed, the IAT was supposed to measure something different from feeling thermometers. It seems like a stretch to conclude that the IAT is invalid and the other measures are valid simply because they appear to be measuring somewhat or even completely different things.
Third, the hyperbolic and misleading language implies that something about the IAT is a “fundamental assumption” of SJT, but this is false. The IAT was simply considered to be the best implicit measure at that time (20 years ago), so that is what we used. But it is silly to assume that hypotheses, especially “fundamental” ones, should be forever tied to specific operationalizations. Fourth, the attacking, debunking nature of this paragraph—against the IAT as a methodological instrument and against SJT as a theoretical framework—makes it clear that the authors are not really very interested in the dynamics of ingroup and outgroup favoritism among members of advantaged and disadvantaged groups (measured in different ways). It’s as if the real issue doesn’t even come up here.
Finally, we get to the substantive issue. First, let’s get the gaslighting out of the way. There have not been dozens of studies trying to validate the race IAT for African Americans. There have been zero. This is not surprising because there have also been no serious attempts to validate the race IAT for White respondents or IATs in general (Schimmack, 2021; https://journals.sagepub.com/doi/abs/10.1177/1745691619863798). The key problem is that social psychologists are poorly trained in psychometrics (i.e. the science of psychological measurement and construct validation; Schimmack, 2021, https://open.lnu.se/index.php/metapsychology/article/view/1645).
Now on to the substantive issue. We are the first to show that among African Americans, several implicit measures (e.g., evaluative priming, AMP, single category IAT) show some (modest) convergent validity with each other. Not surprisingly, they also show convergent validity with explicit measures because all measures mostly reflect a common attitude (rather than one conscious and one unconscious attitude) (Schimmack, 2021; https://journals.sagepub.com/doi/abs/10.1177/1745691619863798). All of these measures show as much (or more) positivity in in-group attitudes for African Americans as for White Americans. This is an interesting finding because positive attitudes on explicit measures were dismissed by Jost. But now several implicit measures show the same result. Thus, it is not a simple rating bias. Now the race IAT and its variants are the odd ones with a different pattern. Why? That remains to be examined, but to make claims about African Americans’ attitudes we would need to know the answer to this question. Maybe it is just a method artifact? Just raising this possibility is a noteworthy contribution to science.
(10) Eventually, a few pages later, the authors get around to telling us what they really found with respect to the actual research question: “Also expected was the finding that out-group attitudes of African Americans, d = .42, 95%CI , are more favorable than out-group attitudes of White Americans, d = .20, 95%CI.” So, um, African Americans exhibited more favorable attitudes toward Whites than Whites exhibited toward African Americans. This is precisely what system justification theory would have predicted, as I noted above (before looking at the results). It is, perhaps, an interesting discovery – if it is replicated with larger samples – that out-group attitudes are unrelated to in-group attitudes for both groups and that in-group attitudes were equally positive for both groups. But, with respect to the key question of out-group favoritism, the authors actually obtained support for SJT but refuse to even acknowledge it. Is this really what science is about? On the contrary, they draw this outrageous conclusion: “Thus, support for the system justification theory rests on a measurement artifact.” In point of fact, when the authors return to the comparative ingroup vs. outgroup measure they arrive at a conclusion that is virtually the same as Jost et al. (2004): “White Americans’ scores on the race IAT are systematically biased towards a pro-White score, d = .78, whereas African Americans’ scores are only slightly biased towards a pro-Black score, d = -.19.” Yes, advantaged groups tend to show reasonably strong in-group favoritism, whereas disadvantaged groups tend to show weak in-group favoritism, with substantial proportions showing out-group favoritism. This is precisely what we found 20 years ago. The authors and I already had this exchange back in July, but their paper contains the same misleading statements as before. Here is our exchange:
You write: “Proponents of system justification theory might argue that attitudes towards the in-group have to be evaluated in relative terms. Viewed from this perspective, the results still show relatively more in-group favoritism for White Americans, d = .62 – .20 = .42 than African Americans, d = .54 – .40 = .14. However, out-group attitudes contribute more to this difference, d = .40 – .20 = .20, than in-group differences, d = .62 – .54 = .08. Thus, one reason for the difference in relative preferences is that African Americans attitudes towards Whites are more positive than White Americans’ attitudes towards African Americans.”
My response: Yes, this is key. We are talking about the ways in which people respond to relative status, power, and wealth, etc. rankings within a given social system (or society). The fact that “African Americans attitudes towards Whites are more positive than White Americans’ attitudes towards African Americans” is supportive of SJT.
Oh boy, sorry if you had to read all of this. Does it make sense to make a distinction between in-group attitudes and out-group attitudes? I hope we can agree that it does. Would we be surprised if Black girls liked White dolls more than White girls liked Black dolls? Not really, and it doesn’t tell us anything about internalizing stereotypes. The important and classic doll study did not care about the comparison of out-group attitudes. The issue was whether Black children preferred White dolls over Black dolls, and Jost et al. (2004) claimed that many African Americans internalized negative stereotypes of their group and positive stereotypes of Whites so that they have a relatively greater preference for White over Black. The problem is that the race IAT confounds in-group and out-group attitudes and that measures that avoid this confound, like the single category IAT, don’t show the same result.
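To make the arithmetic in the quoted passage easy to check, here is a minimal sketch in Python. It uses only the four effect sizes quoted in the exchange above (in-group d = .62 and .54, out-group d = .20 and .40). The dictionary layout, the variable names, and the function name are illustrative assumptions and are not taken from the manuscript or from any analysis script.

```python
# Effect sizes (Cohen's d) as quoted in the exchange above.
# The data structure and all names are hypothetical, chosen for illustration.
attitudes = {
    "White Americans": {"in_group": 0.62, "out_group": 0.20},
    "African Americans": {"in_group": 0.54, "out_group": 0.40},
}


def relative_preference(group):
    """In-group attitude minus out-group attitude for one group."""
    scores = attitudes[group]
    return round(scores["in_group"] - scores["out_group"], 2)


white_pref = relative_preference("White Americans")
black_pref = relative_preference("African Americans")

# Gap between the two groups in relative in-group preference.
gap = round(white_pref - black_pref, 2)

# Split the gap into the part due to in-group attitudes and the part
# due to out-group attitudes.
in_group_part = round(
    attitudes["White Americans"]["in_group"]
    - attitudes["African Americans"]["in_group"],
    2,
)
out_group_part = round(
    attitudes["African Americans"]["out_group"]
    - attitudes["White Americans"]["out_group"],
    2,
)

print(f"Relative preference, White Americans:   {white_pref:.2f}")
print(f"Relative preference, African Americans: {black_pref:.2f}")
print(f"Gap in relative preference:             {gap:.2f}")
print(f"  due to in-group attitudes:            {in_group_part:.2f}")
print(f"  due to out-group attitudes:           {out_group_part:.2f}")
```

The two components add up to the gap in relative preference, and the out-group component is the larger one. A difference score of this kind cannot tell the two components apart, which is why in-group and out-group attitudes need to be measured separately.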
(11) Another huge problem with this whole research program is that it ignores completely the strongest piece of evidence for SJT in this context, namely that the degree of out-group favoritism among disadvantaged groups is positively associated with support for the status quo, measured in terms of political conservatism and individual difference measures of system-justifying beliefs (e.g., see Ashburn-Nardo et al., 2003; Essien et al., 2020; Jost et al., 2004). If Blacks’ responses on the IAT were random or meaningless, I see no reason why they would be consistently correlated with other measures of system justification. But the voluminous literature shows that they are (Essien et al., 2020). Although I have pointed this out to the authors before, they have simply ignored the issue once again, even though this is a key piece of evidence that supports the SJT interpretation of implicit attitudes about advantaged and disadvantaged groups.
Back to gaslighting. Let’s say there are some studies that show this pattern. How does Jost explain the pattern of results in the present study? He doesn’t. That is the point.
(12) All of the above problems are repeated in the General Discussion, so there is no need to address them again point by point. But I will say that other key issues that the authors and I discussed in July are also ignored in the present submission: I wrote: This statement is interesting but far too categorical, in my opinion: “It would be a mistake to interpret this difference in evaluations of the out-group as evidence that African Americans have internalized negative stereotypes about their in-group.” First, it is not an either/or situation, as if people either love their group or hate it. This is not how people are. There are multiple, conflicting motives involving ego, group, and system justification, and ambivalence is part of what interests us as system justification theorists. Second, there is plenty of other evidence suggesting that – again, to some degree – African Americans and other groups “internalize” negative stereotypes. Are you really suggesting that there are NO psychological consequences for African Americans living in a society in which they are systematically devalued? I’m still waiting for an answer to that last question. The purpose of this submission, it seems to me, is not to illuminate anything, really, and indeed very little, if anything, is illuminated. The purpose of the paper, it seems, is to create the appearance of something scandalous and awful and perhaps even racist in the research literature when, in fact, the substantive results obtained here are very similar to what has been found before. And if the authors really want to declare that the race-based IAT is a completely useless measure, they have a lot more work to do than re-analyzing previously published data from one relatively small study.
With the confidence of a peer-reviewer in the role of an expert, Jost feels confident enough to lie when he writes “In fact, the substantive results obtained here are very similar to what has been found before.” Really? Nobody has examined convergent validity of various implicit measures among African Americans before. Bar-Anan and Nosek collected the data, but they didn’t analyze them. Instead, they simply concluded that the race IAT is the best measure because it shows the strongest differences between groups. Here we show that implicit measures that can be scored to distinguish in-group and out-group attitudes do not show that African Americans hold negative views of their in-group. Does it matter? Yes it does. Where do African Americans want to live? Who do they want to marry? Would they want other African Americans as colleagues? The answers to these questions depend on their in-group attitudes. So, if Jost cared about African Americans rather than about his theory that made him famous, he might be a bit more interested in our results. However, Jost just displays the same level of curiosity about disconfirming and distressing evidence as many of his colleagues; that is, none. Instead, he fights like a cornered animal to defend his system of ideas against criticism. You might even call this behavior system justification.