What’s Wrong with the Enneagram

Kyle Whitaker
Feb 20, 2021

I’m fond of making fun of pop psychology. Mostly, it’s in good fun, and I respect the friends I have who find value in it. However, I’m also a fan of having justified beliefs about things, preferably based on good evidence. This is a bar that typically proves too high for pop psychology to meet. That’s the case with the Enneagram, which is the latest in a line of “personality inventories” or “typologies” to make the rounds amongst a certain type of white middle class religious person. For my fellow 90s kids out there, think Myers-Briggs on steroids.

For reasons I haven’t quite figured out, the Enneagram has really captured the imaginations of progressive evangelical and post-evangelical Christians, even though there isn’t anything particularly Christian about it, except the general fact that wisdom and self-knowledge are good things, and Christianity is in favor of those. It seems to have just the right blend of science, mysticism, and New Age self-centeredness to tick all the boxes of those who are done with the rote and rigid practices of old-school American evangelicalism but who are still into worship services and an intellectualized spirituality. A couple clarifications here: (1) “Intellectualized” does not imply “intellectual,” as will become clear below. What the Enneagram offers is not braininess, but rather the sense that one has thought through who one is, and that all the various aspects of oneself are being brought into integration. (2) “Self-centeredness” is not a snub. The Enneagram is explicitly and unapologetically a tool for self-knowledge. To this extent, it has much in common with the New Age spiritualism of white America in the late 20th century.

Now, let me be clear at the outset that I think self-knowledge is a laudable goal. Hell, it’s positively Socratic. “Know thyself,” the man said. I also think that there is nothing whatsoever wrong with many uses of pop psychology and its insatiable penchant for categorizing things. I have friends and acquaintances who view the Enneagram as one amongst many tools for self-discovery that may or may not be valuable to particular people in particular situations, and who insist on its careful and realistic application, preferably with the guidance of someone more experienced. I also know people who view it as a kind of parlor game, akin to (non-serious) astrology or the Sorting Hat or some uses of the MBTI before it. I have almost nothing to say to these people, and I wish them well.

Let me also put to rest one kind of objection that the Enneagram has encountered: that it is somehow anti-Christian or “pagan.” I have no interest whatsoever in this sort of worry, whether it’s well-founded or not (it isn’t).

My reservations are more concrete. Enneagram practitioners are often straddling the line between “helpful but dispensable tool for self-knowledge” and “pseudoscientific, anti-intellectual propaganda.” Let’s talk about the second thing.

Problems with the Enneagram

The main issue I have with the Enneagram and its advocates is, somewhat ironically, the same issue I have with evangelicals more generally: they really want to be intellectually respectable. Serious proponents of the Enneagram want it to be more than a parlor game. For example, Anna Sutton, a Senior Lecturer in Organisational Psychology at the University of Waikato and an Enneagram researcher and practitioner, says:

…as professional Enneagram practitioners, part of our role is to demonstrate that the Enneagram is not just another fad, that the stories and experiences we use to flesh out the types are not just convenient but are real illustrations of the similarities and differences between people.

The Enneagram certainly takes itself more seriously than astrology typically does, and most Enneagram advocates I’ve come across would bristle at that comparison. But as far as I can tell, the only thing that could justify that distinction is science, with which the Enneagram either refuses, or is unable, to engage seriously.

Let me give you an analogy. Imagine that I, an epistemologist, encountered someone talking publicly about beliefs and knowing things — and trying to get people to see the value of their thoughts on this and generally presenting their ideas as insightful and novel and beneficial to humans in general. Maybe they even teach seminars and collect public speaking and consulting fees for educating people about these topics. But also imagine that they have no awareness of the field of epistemology. And when I point out that there’s a whole scholarly literature on these topics going back millennia, the person either claims to be doing something else — so that they’re justified in not engaging the literature — or else they very superficially dip their toe into the literature, at perhaps a basic introductory undergraduate level, which suffices for most of their audience who don’t know (or care) any better, but which is disingenuous, ineffective, and irresponsible from the perspective of informed scholarship. I chose my own field here for the analogy, but any other would do just as well.

This is what Enneagram practitioners and advocates often do. They make rather grand claims about human psychology and personality — such as that all humans can be meaningfully categorized according to nine discrete types — but when called to justify these claims with evidence, they take one of the two routes mentioned above. They may claim that their typology is not beholden to science, that they’re doing something else entirely. But what the Enneagram does and what it claims to be good for are the topics of extensive scientific study. It’s about a domain for which there is an established scientific field (personality psychology & psychometrics). So if its advocates don’t see themselves as beholden to knowledge of that field, they are in the same predicament as our imagined pseudo-epistemologist (and, it’s worth noting, they have very little recourse to object to the comparison with astrology).

On the other hand, they may attempt to take the high road and make the Enneagram scientific. This is common amongst the practitioners I have observed and read. Unfortunately, most of them are overconfident about the validity of the Enneagram. Their confidence in its results far outstrips what has been established, as even its own more scientific practitioners sometimes admit. For example, Sutton, quoted above, implies in her writing about the Enneagram that any of the standard tests are merely beginnings to discovering one’s type:

Enneagram personality questionnaires like the RHETI and WEPSS are reasonably reliable but we should still be cautious about their validity. That means, we’re likely to get the same results on different occasions but there’s no guarantee that’s really our type.

This is decidedly not how most Enneagram test-takers understand their results. Sutton also admits that there’s no evidence — in fact, there’s negative evidence — that “movement to” different types under stressful or peaceful conditions is real, raising a huge red flag about a favorite feature of the Enneagram, its “wings.” More damning still, she acknowledges that the primary method of testing for Enneagram type — questionnaires — is problematic:

I would suggest that using self-report questionnaires on people who have never come across the Enneagram before is unlikely to provide convincing evidence for the model as a whole.

These quotes come from an article that Sutton published in the 2012 issue of The Enneagram Journal: “‘But Is It Real?’ A Review of Research on the Enneagram.” This paper is the best attempt I’ve found (and I’ve looked everywhere I can think to look) at dealing with the science. And the most she can say for it is:

…there is a relatively small pool of research dissertations and peer-reviewed papers to review. However, what we have so far makes for an interesting and convincing beginning to the research base for the Enneagram.

This is followed by some vague complaints about bias against the Enneagram. Such complaints are echoed elsewhere in the journal. For example, CJ Fitzsimmons and Jack Killen say in their 2013 article “How Science Can Help Solve the Enneagram’s Credibility Problem”:

The Enneagram exists today mainly in backwaters of the mainstream of contemporary psychological science and mental health practice. However, most in the Enneagram community probably believe it has something of value to add to ‘mainstream’ science-based disciplines and models of mental health and illness. Why, then, has it not been widely embraced? The first and most obvious reasons relate to its alleged roots in ancient wisdom traditions, and its more contemporary development which has taken place in popular and transpersonal psychology circles. There can be no doubt that some degree of academic snobbishness looks at these facts and turns its attention elsewhere.

This is not fair to the academic community, which has good reason to ignore the Enneagram (if its members even know about it), even just based on its past, to say nothing of its inability to pass scientific muster. None of these authors explores why such bias might exist, even though the most probable reasons are in plain view: its strange history, the lack of need for it (more on this below), and the dubious claims that have been made for it so far, combined with at best neutral and at times negative experimental results in even the few peer-reviewed treatments that exist. Sutton’s paper is admittedly nine (gasp!) years old at the time of this writing, but I can’t find any comparable scientific review since, except one very critical 2015 book that classifies the Enneagram as pseudoscience (Thyer & Pignotti, Science and Pseudoscience in Social Work Practice). And the journal devoted specifically to this issue is no longer extant.

Other research reported in that journal, by the way, exhibits tendencies to exaggerate the validity of the test while downplaying negative results. For example, Maxon & Daniels’ 2008 twin study yielded mostly negative or neutral results that were somehow interpreted as mostly positive. And while we’re on the subject, the existence of this journal itself exemplifies my main contention above: instead of trying to converse with mainstream science, they created their own interior discussion which has the appearance of scientific validity, and then within its pages spent considerably more time parsing the details and applications of the theory than actually validating or critically discussing the theory (and even when this came up, the results were shaky and glossed over).

Let’s dig into the science a bit more.

The version of the test I took, the WEPSS, invented by Jerome Wagner, is among the most scientifically rigorous versions available, and it is plagued with problems. The only scholarly mention of it that I can find that isn’t from a pro-Enneagram institution is in a volume that includes two relatively brief, not entirely positive reviews by practicing clinical psychologists. Wagner’s own research on it seems to stem from his experience teaching it, with an overall small, non-representative sample, and he hasn’t done any recent peer-reviewed work to validate it that I can find. When I took it, I got the following very detailed result:

As already mentioned, the evidential support for the existence of “wings” or “auxiliary styles,” to say nothing of the effect of the contexts of “stressful” or “relaxed” conditions (sometimes referred to as “health” and “unhealth”), is nonexistent. But no indication is made of this in the results, nor is there any sense that this aspect of the profile is any less scientifically supported than the main type result.

The test comprises 200 phrases that the test-taker rates with respect to fit. Several of the phrases appeared to me to be nearly incoherent. The normal usage of many of them contradicts the descriptions that accompany the phrases in the test. Moreover, the descriptions themselves are often internally contradictory and occasionally include non-sequiturs. Situational variables are entirely ignored by the inventory. Often my answer to a question would be very different based on circumstance, but this is not accounted for in the test. More deeply, a lot of empirical psychological research suggests that character traits are deeply situationally affected (possibly to the point that the common lists of character traits are not very indicative of whatever actual traits there may be, though this is controversial). The WEPSS inventory displays no sensitivity to this fact. Moreover, the numbers you “go to” in various contexts (stress, health, etc.) cannot possibly be predicted from the questions in the assessment, because one of the main weaknesses of those questions is their blindness to context. Many Enneagram practitioners recognize that personality is situational, but how the situational effects on the numbers are determined is mysterious (or at least not made explicit, as far as I’m aware). Indeed, as noted above, Sutton says that the few studies that have looked specifically for this “movement” phenomenon have not found it.

Also, note that while Wagner describes himself on his website as “a faculty member in the Department of Psychology and the Institute of Pastoral Studies at Loyola University, Chicago,” he is not listed in either the department faculty roster or the institute faculty roster. This is…concerning.

More generally, Sutton notes that several studies have illustrated that the Enneagram exhibits broad agreement with other existing personality inventories, which provides evidence that the types are real. This is a fair point, but depends on the strengths of those other inventories (e.g., similarities with MBTI are not necessarily a point in the Enneagram’s favor, since scientific psychology does not view the MBTI very highly either). Also, it is not terribly surprising or informative that there are broad ways to classify people and that independent tests can be devised to illustrate many of these similar ways. It’s also likely (as is mentioned in the brief peer review of Wagner’s test) that fewer than nine types are actually strongly supported by the data. Sutton also admits that:

…there was only weak agreement (42%) between a person’s type as identified by the RHETI and the Wagner Enneagram Personality Styles Scale (WEPSS). In addition, relationships between the RHETI and the Adjective Checklist, which asks respondents to choose adjectives to describe themselves, were also not strong. This indicates that the two Enneagram questionnaires are not describing the types consistently, either with each other or in a way that can be captured clearly by an outside measure.

A still larger issue is that there are numerous other psychometric typologies with which the Enneagram must compete for scientific attention, prominent among them the NEO, which utilizes the “Big Five” personality typology, and the MMPI (for more detail on how the Enneagram compares to these, see the “scientific evidence” portions of this article). These tests are still debated, critiqued, and revised by professionals in the relevant fields, and they are far more rigorous than the Enneagram. In fact, one method of validating the Enneagram is by checking it against other such tests to see if it yields similar results. On this, Sutton says:

Further work on the RHETI, with a sample of 287 participants who completed the RHETI and a measure of the Big Five, showed that the nine type scales generally had theoretically predicted relationships with the Big Five (Newgent et al., 2004). Although there is still room for improvement in the RHETI, as some of the scales are less reliable than others, this provides some evidence that the differences between the Enneagram personality types can be demonstrated on the “industry standard” measure of personality traits.

Taking her word for this result, we’re still left with the question of why we need the Enneagram in addition to the Big Five or any of the others. If I have a wristwatch that works just fine, even if it needs resetting occasionally, I don’t need another, cheaper wristwatch that keeps worse time.

Takeaways

Given that the Enneagram is easily bested in the scientific arena by other, better-studied, well-established tests that perform essentially identical functions, we have to ask: why bother? What is unique or distinctive to the Enneagram that makes it more valuable than any other personality inventory, than horoscopes, than Nostradamus, than simple armchair introspection?

Proponents of the Enneagram are fond of touting its “explanatory power.” But I see none that makes it superior to any of the other forms of typing mentioned above. Sufficiently clever people could easily reproduce the level of explanatory power that the Enneagram possesses by setting out to intentionally invent a new typing system (which is exactly what the Enneagram’s popularizers — Gurdjieff, Ichazo, Naranjo, etc. — did).

I myself could probably invent a system that has as much hope of scientific validation as the Enneagram. And if I threw in a fun diagram and a mystical backstory, I could probably get people to use it. This isn’t a testament to my genius, but rather to the simplicity and predictability of the types. They’re all commonsense, broad categories that all humans occupy at various times. If I had gotten any other result besides 5 on my test (except perhaps 6), I would have believed it almost as strongly as the result that I did get, although I will admit that pegging me as the intellectual type is pretty easy. But I also find just as much that I would consider to be “core” aspects of my personality in most of the other numbers as I do in my own, a commonly reported experience with the Enneagram.

So why do the types seem accurate? Here I defer to the psychologists, who are adept at explaining why people believe false and unjustified things. One reason, which incidentally also commonly arises in discussions of astrology, is what’s known as the Forer effect. Give someone a vague enough statement, and they’ll see themselves in it, especially if it comes from a trusted source and they want to believe it. I knew I was a Ravenclaw before I ever took the test, and I answered the questions accordingly. When I saw the result and read the description, it’s no surprise that it felt like it was about me.

Another reason is that people are bad at understanding complexity. Enneagram advocates see the typology as more realistic than other inventories because of its greater complexity (nine types, wings, health/unhealth, etc.). But this reveals an understandable confusion about the sorts of large numbers involved in human complexity, analogous to common misunderstandings about exponential growth. For example, many people don’t understand intuitively that the difference between 10⁵ and 10⁶ is greater than the difference between 0 and 10⁵, or that a virus that spreads to an average of two people per infected person (i.e., with an R0 of 2) will infect on the order of a billion people in a month, assuming one round of transmission per day, as opposed to thirty people if it spreads to only one person per infected person (i.e., an R0 of 1). In the same way, many people simply have no intuitive grasp of the variety of personalities that there really are in the human population, only a tiny fraction of which could be accurately described by any single personality profile, including the Enneagram. Note that this is a problem for all attempts to group people according to a discrete set of categories, which is why even the best personality typologies are of limited usefulness, and should only be employed for serious purposes by professionals as a small part of a holistic therapeutic approach.
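For anyone who wants to check the arithmetic in those two examples, here is a rough sketch. It rests on a toy model that is my own simplifying assumption, not real epidemiology: one round of transmission per day for thirty days, every infected person passing the virus to exactly R0 new people, and nobody infected twice. The function name `cumulative_infections` is hypothetical and exists only for this illustration.

```python
def cumulative_infections(r0: int, rounds: int) -> int:
    """Total people infected after `rounds` rounds, starting from one case.

    Toy model (an assumption for illustration only): each infected person
    infects exactly `r0` new people in the next round, and nobody is
    infected twice. The original index case is not counted in the total.
    """
    new_cases = 1  # the index case
    total = 0      # people infected through transmission
    for _ in range(rounds):
        new_cases *= r0      # each current case produces r0 new cases
        total += new_cases   # add this round's new cases to the tally
    return total


# The gap between 10^5 and 10^6 (900,000) dwarfs the gap between 0 and 10^5.
print(10**6 - 10**5 > 10**5 - 0)     # True

# Thirty daily rounds of transmission under each R0.
print(cumulative_infections(1, 30))  # 30
print(cumulative_infections(2, 30))  # 2147483646
```

Under these assumptions the final round alone adds 2³⁰ (a little over a billion) new cases when R0 is 2, and the running total is roughly two billion, while an R0 of 1 yields thirty cases in all. The exact figure depends on how you count, but the gulf between the two outcomes is the point.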

What’s the big deal?

So the Enneagram isn’t sufficiently scientifically valid for the claims that are made for it or the uses to which it is put. What’s the harm?

Hopefully it’s clear by now that I think basing our beliefs on the evidence and being honest about what the evidence is are intrinsically good things. So even without a harm, the Enneagram’s situation is hopeless outside of some narrow, thoroughly casual contexts. But there are, sadly, real possible harms.

One is that it, like other personality inventories, can encourage rigidity. The claim that these types are exhaustive and universal is completely unmotivated and restrictive. The idea that there are only nine types of humans is arbitrary and silly, and no more motivated than any other number. It’s only nine (rather than five or three or whatever) for weird occult reasons. Granted, the more thoughtful Enneagram practitioners avoid these extreme claims, but it’s natural for the layperson to think this way, given how the Enneagram is usually presented.

This rigidity could keep someone from attempting to change their traits. The Liturgists podcast did an episode on the Enneagram which featured proponents Ian Cron and Suzanne Stabile (authors of the book The Road Back to You). At one point in the interview, Stabile expresses that she had had to become okay with not receiving the kind of appreciation she’d like (or in the ways that she’d like) from her husband and best friend, because they are “fives.” She calls this “compassion,” but it makes me (a “five”) sad. People can change their character with effort, and often they are obligated to do so.

Further, as I gestured at in my Twitter joke, personality typologies can encourage dismissiveness and condescension. Any time I express something critical or give reasons against a view, the Enneagram advocate could reply with something like “Well you are a 5…” This is epistemically dangerous, not to mention rude.

More dangerously still, some might see the Enneagram as a substitute for therapy, or as self-therapy that isn’t backed by much empirical evidence. This essentially makes it a kind of alternative medicine.

Now, let’s be fair: the careful Enneagram proponent would probably caution against these sorts of uses, and I know several such careful people. But here’s the thing: the tool lends itself to these uses nonetheless. Here’s a melodramatic analogy: it’s well-known that the prevalence of guns is strongly correlated with increased gun violence. Gun proponents argue that the guns are a tool, and that it’s bad actors misusing them who are responsible for the violence and tragic deaths. Nonetheless, more guns equals more violence and death, because guns lend themselves to being used that way. The Enneagram isn’t killing anyone (that I know of), but given that it’s an unnecessary tool for purposes for which we already have better tools, and that it lends itself to this misuse, why bother?

There’s also a good deal of hokeyness and woo amongst Enneagram fans. For example, in that Liturgists episode, Stabile talks about there being a “true self” or an “essence” beneath the numbers — a real you without personality. There’s probably no such thing, and there’s certainly no objective evidence for it (because everyone has a personality). Similar points could be made about some of the Enneagram’s more mystical or religious interpretations. While this sort of thing isn’t strictly a feature of the Enneagram itself, we can’t ignore the fact that this phenomenon seems to be common, and that the test itself does little to prevent these interpretations. (We don’t, for example, see similar interpretations of the Big Five.)

It’s worth noting that one of the Liturgists co-hosts at the time, Mike McHargue, provides a more measured overview of the Enneagram:

The Enneagram is a non-rigorous, pragmatic personality modeling system that lets you identify the relative neurological expense of different social interactions and emotional expressions so you can best typify love and grace in a way that’s most neuro-palatable in your context.

Obviously, that’s one guy’s opinion, but I like it pretty well, so long as it stays in its place and doesn’t presume to be uncovering deep truths, nor to be any better at this task than a myriad of other potential “pragmatic modeling systems,” nor to be necessary or beneficial for everyone to think about. If all the Enneagram stans approached it like McHargue, I wouldn’t have written this article.

Knowledge matters.

The risk that’s nearest to my philosopher’s heart is somewhat different. The fact that the Enneagram’s strengths and what it can add to your spiritual life are matched or exceeded by existing, more scientific tests and psychological evaluative methods raises the question: why aren’t more religious people just as interested in those other methods? Access is probably a factor, as the most rigorously tested metrics are expensive or require professional administration. But let’s not kid ourselves: the Enneagram has created a cottage industry of “authorized” tests, complete with in-house debate about which is best, and none of them are free.

I think a more likely reason that these same people don’t tend to be as interested in more mainstream tests is precisely that they are mainstream. This makes them less exciting to a lot of spiritual people who view demonstrated expertise as either mundane and therefore non-spiritual, or worse, as suspicious. So ultimately, tests like the Enneagram can exacerbate the problem of distrusting expertise, which is a full-scale epidemic in American culture with devastating consequences. At the end of the day, the Enneagram is probably a relatively unimportant fad. But the psychological and epistemic tendencies that made it popular and preferable to other more valid tests are a legitimate threat to the health of our societies. The Enneagram won’t bring down democracy, but ignoring experts will.

Kyle Whitaker

philosopher writing about disagreement, public discussion, trust, expertise, and (occasionally) politics and religion