The Poppies of Terra #61 - Star Trek's Intellectual Integrity
By Alvaro Zinos-Amaro
2025-07-30 09:00:58
As I write this, the first three episodes of the third season of Star Trek: Strange New Worlds (SNW) have been released. A thought I’ve had a couple of times with some recent Trek outings, and one that returned recently, was whether there might be any way to approximately assess the intellectual quality of Star Trek’s dozen incarnations.
After all, we often talk of IPs without paying much attention to the first word of that phrase, “Intellectual,” but I think it’s always been a significant strand in Trek’s pop storytelling weave.
In terms of televised/streamed story count, Trek is a juggernaut. The current tally of all Star Trek episodes is 943, a number that will reach 950 by the end of SNW's current season. With the announcements of various other forthcoming seasons and series, sometime between 2028 and 2030 we’ll likely be hitting the 1,000-episode milestone.
Here’s a summary (adapted from the Wikipedia entry):
Yellow represents the original Gene Roddenberry era (about 11% of episodes), blue what I’ll call the Rick Berman reign (66%; Roddenberry was still somewhat involved at TNG's inception, so it's not a perfect split, but Berman is really the lead overseer of the bulk of this material), and light orange the output of the current Alex Kurtzman administration (23%).
Given its immensely storied past, and how it has intersected with at least three different generations, Star Trek has come to mean a lot of different things to different people. I want to be clear: what follows isn’t an attempt to assess Trek holistically. There are plenty of dedicated books that already undertake that Sisyphean task. Instead, my focus here is on the single element I mentioned earlier, namely Star Trek’s intellectual caliber.
There’s no way of directly measuring such a thing, of course. But with a little creativity we can design a framework that taps into somewhat objective proxies—measurable indicators that capture changes in complexity, coherence, and thematic depth across the franchise over time.
I’m going to call the weighted combination of these measures the franchise’s Intellectual Integrity Score (IIS), because it sounds like something one might overhear in Engineering.
What follows is by no means intended to be definitive. It’s exploratory, and based on educated guesswork, not science. If it sparks other ideas or alternative approaches, so much the better. I’ll be glad to have contributed to the conversation.
Below I’ll get into the nerdy details for the curious, but after creating the IIS and running my model through four different AIs (I’ve excluded Short Treks and the Section 31 film from everything that follows), here are the results I’ve obtained, deriving era-combined scores by weighting each series’ score by its number of episodes:
I’ve run many variations on the metric sub-components, tweaking their respective contributions to the whole, etc., and have tried several other AIs to produce more estimates. The results tend to be consistent:
- Berman-era Trek generally upheld, or even elevated, the intellectual quality established by its predecessors, TOS and TAS.
- Kurtzman-era Trek appears to show a considerable decline in the franchise’s intellectual integrity, to the tune of about 25%.
It’s worth calling out two things that are sort of buried in these results:
- Within Berman-era Trek, ENT is the outlier, representing a significant drop in IIS from the previous three shows.
- Most models place Kurtzman’s SNW at comparable or even higher values than ENT. So just because a given era rates higher/lower than another doesn't mean its constituent shows all do.
- Within Kurtzman-era Trek, LD and PRO attain the highest IISs and therefore raise the combined Kurtzman-era results.
- On the flip side, and unsurprisingly given its limitations in terms of runtime per episode and so on, TAS rates lower than TOS, slightly reducing the Roddenberry-era combined score.
With the second and third observations in mind, I thought it might be worth redoing the calculations but this time restricting ourselves to just the live-action series.
Here are those results:
Findings:
- Berman-era Trek is at a similar level as TOS (1% lower). Remove ENT, and the TNG/DS9/VOY aggregate surpasses TOS.
- Meanwhile, Kurtzman-era Trek, in this view no longer reaping the benefits of LD and PRO, is down about 30% from the previous eras, an even steeper decline than previously found.
One more comment. Folks tend to argue that while DSC and PIC were heavily serialized, may have been inconsistent quality-wise, and were geared towards different demographics than TOS, SNW is a fairer point of comparison. It’s largely episodic, set at a similar point in the timeline, and it harkens back to the spirit of the pioneer show, even repurposing a number of its characters.
Here’s what the comparison looks like for just those two series:
The drop in intellectual integrity is now reduced to about 14% on average.
Still, I think it’s reasonable that one might hope for more. SNW came into being after approximately 900 episodes of other Trek had already been made. That experience should count for something.
It’s fair to ask what lies behind the drop in intellectual integrity. A deep dive would likely make for a doctoral thesis. It seems likely that part of the Kurtzman-era decline in IIS stems from a combination of streaming-driven serialization (which tends to prioritize emotionally driven narratives over standalone episodes exploring complex ethical or philosophical issues) and a shift in creative vision.
Alex Kurtzman, as a showrunner and executive producer, brings a different sensibility to bear than Gene Roddenberry’s utopian, at times dogmatic, idealism and Rick Berman’s doctrinaire, sometimes plodding, devotion to consistent world-building. Kurtzman’s background in more action-driven franchises (e.g., Transformers, the Star Trek Kelvin timeline feature films, Hawaii Five-0) suggests a preference for visceral intensity and visual spectacle over the more measured tone of the prior eras. This change in vision may manifest in choices like emphasizing personal trauma (see, for instance, Michael Burnham’s arc in DSC) over more abstract quandaries. It’s probably also just a reflection of broader trends in populist entertainment.
Sociopolitical messaging has also become less nuanced, these days often a storytelling taskmaster rather than an organically emergent property, less, to use Berman-era imagery, bio-neural gel packs and more hectoring hologram. At times current Trek can come across as smug in its certainty about its values. It’s a symptom of the times. But part of Trek’s enlightenment-ideas mold was, even when unabashedly moralizing, to be self-reflexive, self-critical, and, like science itself, self-correcting. Past courage to be progressive seems to have been mistaken for present permission to be authoritative.
I’m sure nostalgia-fueled fan expectations also play a role, and skew my analysis, because I included audience and critic feedback (10% weight) in my calculations. We tend to glorify what came before because the joy it purveyed is a salutary certainty, while the effects of what’s to come are unknown, and therefore inherently un-reassuring. If, as Cyril Connolly observed, “imagination is nostalgia for the past, the absent; it is the liquid solution in which art develops the snapshot of reality,” our nostalgia for imagined futures may obstruct our vision like a double cocoon.
And hop onto X or other social media platforms, and you’ll find plenty of fans extolling the virtues of SNW or LD. Again, an era's overall trend doesn’t represent every one of its iterations.
But clearly, I don’t believe romanticized perceptions of the past, or an under-accounting of current praise, tell the full story, or else I wouldn’t have written this article. I’ll end on this thought. In connection with next year’s Starfleet Academy show, Alex Kurtzman has been quoted as saying: “We wanted to create a show that anchored us back to [Gene] Roddenberry’s essential vision of hope. How do you find it, how do you rebuild it?”
How and where was it lost, I wonder, that it needs such rehabilitation?
Methodology (Theoretical)
The notes below summarize my thoughts on possible systematic, comprehensive ways of calculating each of these scores. I’ve not undertaken these myself, due to time and resource limitations, but I encourage other more ambitious and intrepid souls out there to take a stab at it if so inclined!
1. Dialogue Complexity (Lexical & Syntactic Analysis)
Metric: Average sentence length, lexical diversity (type-token ratio), use of technical/scientific jargon.
Method: Use NLP tools (e.g., spaCy or TextBlob) to analyze transcripts of episodes.
2. Problem-Solving vs. Action Sequences
Metric: Ratio of runtime dedicated to discussion-based problem-solving & diplomacy or ethical debates versus action/fight scenes or explosions.
Method: Use manual scene tagging or computer vision models to analyze a statistically significant sample of episodes.
3. Scientific Accuracy and Use
Metric: Frequency and correctness of scientific/technological terminology per episode.
Method: Analyze scripts and rate them for plausibility and rigor.
4. Character Decision-Making Coherence
Metric: Number of plot-driving decisions made by main characters that contradict established Starfleet principles or internal logic.
Method: Code and count major character decisions per episode and assess whether they: a) Have clear motivation b) Align with prior character behavior c) Reflect Starfleet training.
5. Moral and Philosophical Content
Metric: Frequency and depth of ethical dilemmas per season.
Method: Tag episodes with presence of: a) Prime Directive conflicts b) Morally challenging decisions c) Philosophical debates (e.g., AI rights, time travel ethics).
6. Audience and Critical Metrics
Use this as a secondary/correlative measure.
Sources:
- IMDb episode ratings over time
- Rotten Tomatoes critic vs. audience splits
- Viewer rates where available
7. Continuity and Canon Violations
Metric: Number of established lore contradictions per season.
Method: Track lore inconsistencies cited in fan-maintained wikis or detailed franchise chronologies.
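As an illustration of how the first metric above (Dialogue Complexity) might be approximated, here is a minimal Python sketch. It is not the analysis behind the numbers in this article, which were AI-estimated. It uses only the standard library in place of spaCy or TextBlob, so the sentence splitting and tokenization are deliberately naive; the function name `dialogue_complexity` and the sample line are made up for the example, and technical jargon is not measured at all.

```python
import re


def dialogue_complexity(transcript: str) -> dict:
    """Rough lexical stats for a transcript (hypothetical helper).

    Assumption: sentences end at '.', '!' or '?'. A real pipeline would
    use a proper NLP tokenizer instead of regular expressions.
    """
    # Split on runs of sentence-ending punctuation and drop empty pieces.
    sentences = [s for s in re.split(r"[.!?]+", transcript) if s.strip()]
    # Lowercased word tokens; straight apostrophes are kept inside words.
    tokens = re.findall(r"[a-z']+", transcript.lower())
    if not sentences or not tokens:
        return {"avg_sentence_length": 0.0, "type_token_ratio": 0.0}
    return {
        # Mean number of word tokens per sentence.
        "avg_sentence_length": len(tokens) / len(sentences),
        # Distinct words divided by total words (lexical diversity).
        "type_token_ratio": len(set(tokens)) / len(tokens),
    }


# Made-up sample text, not a quotation from any episode.
sample = "Space is the final frontier. The frontier is vast!"
stats = dialogue_complexity(sample)
print(stats)
```

One caveat with this approach: type-token ratio falls as texts get longer, so comparing series fairly would probably require sampling transcripts of equal length.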
Formula
Putting all this together into a single package, with what I’d call reasonable weightings based on more heavily emphasizing the dimensions susceptible to quantification and weighing the more subjective factors less, we get:
Intellectual Integrity Score = 0.2 × Dialogue Complexity Score + 0.2 × Problem Solving Ratio + 0.15 × Scientific Accuracy Rating + 0.15 × Decision Coherence Index + 0.1 × Moral Dilemma Frequency + 0.1 × Lore Continuity Score + 0.1 × Fan Rating Adjustment
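For readers who want to experiment with the weightings, the formula above, together with the episode-weighted era aggregation described earlier in the article, can be sketched in a few lines of Python. The weights come straight from the formula; everything else is an assumption made for illustration: the dictionary keys, the function names `iis` and `era_score`, the idea that every component has been normalized to a common 0 to 100 scale where higher is better, and the placeholder inputs, which are not results from this article.

```python
# Weights taken from the formula above; the key names are hypothetical.
WEIGHTS = {
    "dialogue_complexity": 0.20,
    "problem_solving_ratio": 0.20,
    "scientific_accuracy": 0.15,
    "decision_coherence": 0.15,
    "moral_dilemma_frequency": 0.10,
    "lore_continuity": 0.10,
    "fan_rating_adjustment": 0.10,
}


def iis(components: dict) -> float:
    """Weighted sum of the seven component scores.

    Assumption: each component is already on the same 0-100 scale,
    with higher meaning better.
    """
    missing = WEIGHTS.keys() - components.keys()
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    return sum(WEIGHTS[name] * components[name] for name in WEIGHTS)


def era_score(series: list) -> float:
    """Combine (series_score, episode_count) pairs into one era score,
    weighting each series by its number of episodes."""
    total_episodes = sum(count for _, count in series)
    if total_episodes == 0:
        raise ValueError("an era needs at least one episode")
    return sum(score * count for score, count in series) / total_episodes


# Placeholder inputs for illustration only.
example_series = iis({name: 80.0 for name in WEIGHTS})
example_era = era_score([(80.0, 100), (60.0, 50)])
print(round(example_series, 2), round(example_era, 2))
```

Because the weights sum to 1, a series that scores the same on every component gets that same value as its IIS, which makes it easy to sanity-check alternative weightings.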
Methodology (Applied)
I arrived at the numbers in the tables above leveraging ChatGPT, Grok, Claude and Gemini, asking them to perform various extrapolations and interpolations in the absence of full quantitative data for each metric. Everything I’ve shared is therefore based on estimations, as I said before.
Within each of these applications, I tried to refine the calculations as much as possible by prompting with examples and counter-examples, keeping my language neutral, and staying alert to questions phrased in a way that invites confirmation bias. I also ran the models at different times outside my user sessions to see what kind of results ensued. The values I’ve shared are representative of what I encountered.
Clear Disclosure: I acknowledge that using AI estimations as my primary data source makes this analysis not particularly robust. I essentially made educated guesses at the metrics instead of precisely measuring them, and there could be significant variances as a result. But context is king. This is a thought experiment, intended to show directional patterns, not a scientific paper meant for peer-reviewed publication.