Types considered harmful

It is broadly claimed that static type safety provides a range of advantages over “weak” or “dynamic” types. However, many of the commonly stated advantages either lack evidence or are demonstrably false.

The debate has been done to death over the years, but I want to address it from my own perspective. More than just being oversold, I think that static (compile-time) “type safety” reinforces a mindset that prevents many software engineers from reaching their true potential, alienating them from their user and the value they may otherwise have been able to provide.

What I’m not trying to say is that types are completely useless, or inevitably harmful. I, myself, use types all the time - and there are some benefits of static type safety that are hard to ignore, for example the potential for compile-time optimisations, improved autocomplete, self-documenting code, and so on.

What I am saying is that “type safe” languages (particularly C#, TypeScript and Java, ubiquitous as they are) conspire to keep many software engineers focused on the wrong thing - that is, the code and its structure - and off the right thing, which is its behaviour and the value that it can produce for another human being.

First things first: definitions

Ok, so let’s set the playing field and start with a bit of a definition because classifying type systems is a nightmare. We’ll go with a very early definition that captures the essence of what we expect from types:

“whenever an object is passed from a calling function to a called function, its type must be compatible with the type declared in the called function.” [3]

What we’re describing here is typical of many general purpose languages, and particularly ubiquitous ones as I mentioned above (C#, TS, Java). If a function asks for a string parameter, we can’t call it with an integer without causing a compile error, and so on.

At first glance, and indeed second glance, this seems like a really good idea. And it is! However, let me return to the first part of my original claim:

many of the commonly stated advantages [of type safety] either lack evidence or are demonstrably false

So, let’s get to lacking and demonstrating.

We’ll use JavaScript and TypeScript as our comparison languages. JavaScript because it’s my absolute favourite language, and TypeScript because it’s much nearer the opposite.

For those playing at home, JavaScript does type checks at runtime (i.e. dynamically), which is why we’re using it for comparison: TypeScript is supposed to save us pain by doing type checks at compile time (i.e. statically) instead.

Lie #1: Static type safety reduces bugs

Let’s start with a simple example. This one’s taken from TypeScript’s homepage:

function compact(arr) {  
 if (orr.length > 10)  
   return arr.trim(0, 10)  
 return arr  
}

What’s wrong with this code?

Firstly, there is a typo (orr instead of arr). Secondly we’re calling trim on the array instead of slice.

Here’s how we fix this without types: we test it. Like, seriously. We run it once, or even two or three times if we’re not paying attention (so we can catch both bugs and confirm they are fixed). Or if we’re really playing for keeps we’ve written unit tests that exercise this code properly.
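As a rough sketch of what “running it” looks like, the snippet below calls the buggy function and records which error each run produces. The helper errorNameOf and the intermediate compactTypoFixed are illustrative names of my own, not part of the original example.

```javascript
// The buggy function from above, unchanged.
function compact(arr) {
  if (orr.length > 10)
    return arr.trim(0, 10)
  return arr
}

// Hypothetical helper: run fn and report the name of any error it throws.
function errorNameOf(fn) {
  try {
    fn();
    return null;
  } catch (err) {
    return err.name;
  }
}

const elevenItems = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k'];

// First run: the typo surfaces immediately, whatever the input.
const firstRun = errorNameOf(() => compact(elevenItems));

// Hypothetical intermediate version: typo fixed, wrong method still in place.
function compactTypoFixed(arr) {
  if (arr.length > 10)
    return arr.trim(0, 10)
  return arr
}

// Second run: the bad method call only surfaces with more than 10 items.
const secondRunShort = errorNameOf(() => compactTypoFixed(['a']));
const secondRunLong = errorNameOf(() => compactTypoFixed(elevenItems));

console.log(firstRun, secondRunShort, secondRunLong);
```

Note that the second bug only appears for inputs longer than 10 items, so a run has to exercise that branch to see it.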

These bugs should never make it to production if you are testing your code ~~properly~~ at all. And if you’re not testing your code, then you’re not focusing on the value that your software is providing to another human being.

Let’s look at the typed example, where TypeScript has come to the “rescue”:

function compact(arr: string[]) {  
 if (arr.length > 10)  
   return arr.slice(0, 10)  
 return arr  
}

This isn’t so bad. We’ve got a function that works. It’s hard to break by accident - we can’t pass in null (assuming strictNullChecks is enabled), an empty array would work fine, as would an array full of empty strings, etc. All gravy.

But what value did this bring us? All it’s done is say this method is structurally sound based on the types. If we’re testing our code (and we should be) then we still have to write tests to ensure it actually does what we want it to do, and that its behaviour doesn’t unintentionally change over time. And we have to test any consumer code to ensure that whatever is passed into compact is actually the value we wanted to compact, and not just any old string[].

So, sure, TypeScript saved us from runtime errors - but if we are testing our code (and we should be) then we are running it before it goes into production. So what value am I getting from these types, exactly?

It’s right there in the description - type safety allows you to find type errors that would otherwise be found at runtime. So if they’ve made it into production, this implies you never ran your code before production. Is that really how we want to be building software?

The problematic mindset I claim type safety reinforces begins right here, with the subtle lie that runtime and production are equivalent, and begins to snowball…

Lie #2: Type safety reduces the number of tests required

Here are the tests:

it('returns an empty array when it is empty', () => {  
  expect(compact([])).toEqual([]);  
});

it('returns all items when it has fewer than 10 items', () => {  
  expect(compact(['a','b','c'])).toEqual(['a','b','c']);  
});

it('returns the first 10 items when it has more than 10', () => {  
  expect(compact(['a','b','c','d','e','f','g','h','i','j','k'])).toEqual(['a','b','c','d','e','f','g','h','i','j']);  
});

These three tests are required in both languages, and in the JavaScript case they’ve caught both our bugs (typo and bad method call) at runtime when we run the tests, but before we’ve made it to production. So where are all these tests I supposedly no longer have to write? I challenge you to find a single, legitimate example of a test that only needs to be written due to not using static, compile-time type safety. I’ll wait.

The big secret is these additional tests I’m supposedly being saved from writing don’t exist, because by verifying behaviour we are implicitly verifying structure. So why not just skip writing out all the types and focus on the thing that actually matters to your user?

Any test you can produce that might be covered by static type safety is either:

an I/O boundary test you should write regardless
a behavioural test you should write regardless
a duplicate of the compiler (which is pointless and you wouldn’t keep)

This claim simply boils down to engineers not wanting to write tests, and the type checker acts as an enabler.

But what about null?

Lie #3: You have to write more tests without static type safety

Yes, this lie comes in two parts! Lie #2 and #3 are two sides of the same coin, and both are tails.

Degenerate test cases are often the kinds of tests that give unit testing a bad name. These are the ones that check what happens when you pass null, 0, "", -1, and so on.

Often engineers will jump to claim here that techniques such as fuzzing are required when you have no static type safety, in order to cover all these crazy cases that would otherwise be caught by a type system. This “problem”, however, is completely self-imposed.

The only required tests are the ones for which you are intentionally defining a behaviour. If you don’t need to know or care what happens when null is passed, then don’t write a test for it - simple. Leave the behaviour undefined.

Returning to our above example, courtesy of the TypeScript homepage, if the tests I’ve written for this function don’t specify what happens when null is passed as a parameter, then that behaviour is undefined. If you call this function with null, then that’s on you. Clearly, you haven’t tested your code.

The only place where we need to sanitise parameters, check data types match, and so on, is at the I/O boundary. And you have to do this at runtime. No compiler checks for you!

So again, “degenerate” test cases and “fuzzing” can only be considered required at the I/O boundary, and this is true irrespective of the language you are using.
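To make the boundary idea concrete, here is a minimal sketch, assuming the data arrives as a JSON string. parseStringArray is a hypothetical name of my own; the point is only that the shape check happens once, at runtime, where the data enters, and that compact itself stays free of defensive checks.

```javascript
// Hypothetical boundary function: the only place the shape of the data is checked.
function parseStringArray(json) {
  const value = JSON.parse(json); // throws SyntaxError on malformed JSON
  if (!Array.isArray(value) || !value.every((item) => typeof item === 'string')) {
    throw new TypeError('expected a JSON array of strings');
  }
  return value;
}

// Interior code trusts its callers; behaviour for null is left undefined.
function compact(arr) {
  if (arr.length > 10)
    return arr.slice(0, 10)
  return arr
}

// Valid input passes the boundary and flows through untouched.
const accepted = compact(parseStringArray('["a","b","c"]'));

// Invalid input is rejected at the boundary, before compact is ever called.
let rejected = null;
try {
  compact(parseStringArray('{"not":"an array"}'));
} catch (err) {
  rejected = err.name;
}

console.log(accepted, rejected);
```

The tests for parseStringArray are the place for the degenerate cases; the tests for compact don’t need them.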

Let’s jump to some typical C# code, just so we can really clear up this null thing:

private string[] Compact(string[] arr)  
{  
    if (arr == null) {  
       // we have a couple of options  
       // throw new ArgumentNullException(nameof(arr));  
       // return null;  
       // return new string[0];  
    }  
    // … implementation … //  
}

What’s the value of this code? Either we’re introducing new, defined functionality by returning something valid when null is passed, or we’re throwing an Exception.

If we’re wanting to define what happens when null is passed, fine! This can certainly be done in the JS and TS examples, and again tests would be required for both - so still no reduction in required tests, and lie #2 is still a lie.

If we’re throwing then it’s still a runtime error, whether it’s thrown by us or not. So we haven’t eliminated any runtime errors, either! Lie #1 is still a lie. So, again, what are these types doing for me?

You might ask: what about @Nullable in Java, for example? Well, it’s effectively the same as the TS case - you’re broadening or narrowing your interface, sure. But there’s no reason you can’t cover that with behavioural tests (that you should be writing anyway) if it was actually important to the code you were writing.

You might ask, aren’t we just moving these tests to the caller? Now they have to check for null, right? Wrong. If the caller doesn’t care about null either, then they don’t write a test for it either. Simple. That calling function now leaves the null case as undefined behaviour as well. This continues onwards and upwards until one of two things happens:

We hit a case where the behaviour of the null case legitimately needs to be defined; or
We’ve hit an I/O boundary

In either case you’ll be wanting a test regardless of static type safety. This is why I say this problem is self-imposed: many of these degenerate cases are low value “defensive programming” that don’t even need to be defined, let alone tested. What’s more, these are often described as “edge cases” when really they’re nonsense cases, and perhaps they’re absorbing your attention while the real edge cases sneak right on by.

To generalise the above, I claim:

any line of code that contains an error that can be caught by a type system at compile time will also be caught by proper testing, prior to finding its way into production

So it seems, then, that all static type safety is affording me here is the ability to sometimes skip some kinds of tests - or to hire and retain engineers who don’t test their work. This is very different from the claim that it reduces the number of tests required, with the implication that it achieves the same level of confidence in the program’s correctness.

I hope you’re starting to see the mindset problem that I opened with - if proper testing catches all the bugs the type system would anyway, then why should we think static type safety is so important? (perhaps, someone’s skipping testing… 😱)

Lie #4: Static type safety prevents typographical errors

What about this claim from Eric Lippert:

[JavaScript programmers on large codebases need to] write test cases for every identifier ever used in the program. In a world where misspellings are silently ignored, this is necessary. This is a cost.

Ok, let’s work it through! I struggle to picture what writing “test cases for every identifier” looks like because I’ve never seen it nor required it, but perhaps a good candidate is a mapping function, something like:

function mapFromAToB(a) {  
  return {  
    fieldA: a.field1,  
    fieldB: a.field2,  
    …  
  };  
}

According to Eric, we have to write more tests in JavaScript than TypeScript to ensure the correctness of something like this, because of typographical errors. But this is demonstrably false.

Rather than demean Markdown by writing out all the tests, let me demonstrate by showing you a class of typographical errors that static type safety does not save us from.

Let’s rewrite our mapping function in TS:

type MyContrivedType = { field1: string; field2: string; /* … */ };  
type MyOtherContrivedType = { fieldA: string; fieldB: string; /* … */ };

function mapFromAToB(a: MyContrivedType): MyOtherContrivedType {  
  return {  
    fieldB: a.field1,  
    fieldA: a.field2,  
    …  
  };    
}

Did you spot the bug? We’ve assigned fieldA and fieldB in the wrong order - they’re both strings being assigned strings, so as far as TS is concerned this is all good. And there are lots of variations on this bug that copy/paste will happily and tirelessly supply you with.

Only actual testing would catch this problem, and if you’re in a world where tests are silently ignored, this is a cost (see what I did there?).

A unit test, for example, would have verified both behaviour and structure with just a single test - saving me the effort of writing out MyContrivedTypes altogether. So, on balance, the type unsafe method actually requires less code for the same outcome.
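Here is one way that single test might look, sketched in plain JavaScript without a test framework. Only the two fields shown above are mapped (the elided ones are left out), and checkMapping and mapFromAToBFixed are illustrative names of my own.

```javascript
// The mapping function with the same swapped-field bug as the TS version.
function mapFromAToB(a) {
  return {
    fieldB: a.field1,
    fieldA: a.field2,
  };
}

// The intended mapping, for comparison.
function mapFromAToBFixed(a) {
  return {
    fieldA: a.field1,
    fieldB: a.field2,
  };
}

// One behavioural check. Distinct input values make a swap visible,
// and the key count confirms the shape of the result as a side effect.
function checkMapping(mapFn) {
  const actual = mapFn({ field1: 'one', field2: 'two' });
  return (
    Object.keys(actual).length === 2 &&
    actual.fieldA === 'one' &&
    actual.fieldB === 'two'
  );
}

const buggyPasses = checkMapping(mapFromAToB);
const fixedPasses = checkMapping(mapFromAToBFixed);

console.log(buggyPasses, fixedPasses);
```

The check fails for the swapped version and passes for the intended one. It would also fail on a misspelled field name, since the misspelled field would come back undefined.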

So, not only is the claim incorrect, it’s actually the opposite that’s true - only behavioural tests can save you from typos! This “type safe” mindset is so ingrained that what even seems to be the most obvious argument against dynamic types is trivially false.

Lie #5: There is significant empirical evidence for static type safety

Firstly, I will concede this is far from an exhaustive study of the literature. But for something claimed so often and so confidently, you’d expect it wouldn’t be too hard to find some convincing evidence¹. Unfortunately, this is not the case.

There is a great write-up by Dan Luu² which covers many of the oft-quoted papers. I encourage you to read his article, but I will summarise here:

empirical evidence for type safe languages carrying advantages for productivity, understandability and correctness is limited at best
that which does exist is often contradictory, and not well replicated
even studies that do seem to find an effect have design flaws that are fairly criticised
even accepting these flaws, the results are far from generalisable to real world scenarios

There appears to be a repeated pattern with these studies: design, methodological and analytical flaws are all readily visible. This shows up again in a more recent paper from 2022, “To Type or Not to Type?”. The study claims to find that “bug proneness and bug resolution time” were not significantly lower for TS than JS. Yes, that’s right, this paper seems to support my claim that types don’t help with bugs. But it’s still problematic.

The methodology uses a number of metrics, however the interesting one is:

bug fix commit ratio, i.e., for a given project, the number of bug fix commits is divided by the total number of commits

This is used to infer “bug proneness”, or how susceptible a given language is to bugs. The problem here is that in order to fix bugs we have to find them first - this is a kind of survivorship bias. It’s very possible there are bugs living in the codebases that haven’t been found, and therefore cannot have had a bug fix commit. So this metric is clearly not measuring what the paper claims it’s measuring. In fact, it may be telling us the exact opposite - that TS code had a higher number of bug fix commits because finding and fixing bugs was actually easier than it was in JS.
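A toy calculation shows how the metric can point the wrong way. Every number below is invented purely for illustration, and the model is deliberately crude: each project has the same number of commits and the same number of latent bugs, each found bug gets exactly one fix commit, and the only difference is how many of the bugs get found.

```javascript
// The metric as described: bug fix commits divided by total commits.
function bugFixCommitRatio(bugFixCommits, totalCommits) {
  return bugFixCommits / totalCommits;
}

// Hypothetical model: one fix commit per bug found, nothing else varies.
function simulate({ totalCommits, latentBugs, detectionRate }) {
  const bugsFound = Math.round(latentBugs * detectionRate);
  return {
    ratio: bugFixCommitRatio(bugsFound, totalCommits),
    bugsRemaining: latentBugs - bugsFound,
  };
}

// Invented numbers: identical projects except for how easy bugs are to find.
const easyToFind = simulate({ totalCommits: 1000, latentBugs: 100, detectionRate: 0.8 });
const hardToFind = simulate({ totalCommits: 1000, latentBugs: 100, detectionRate: 0.4 });

console.log(easyToFind, hardToFind);
```

Under these assumptions the project where bugs are easier to find has the higher ratio, so it looks more “bug prone”, while actually shipping with fewer bugs left in it.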

As much as I’d love to cherry pick this study, scream “JS rules!”, and then do some victory donuts on the school oval, I’d rather we take a step back and realise: there simply is no overwhelming body of evidence that type safe languages deliver on any of their promises. And there are obviously some real world limitations preventing us from drawing a sound conclusion, that keep reappearing in study after study.

We may interpret this as static type safety is complete BS (all but certainly incorrect). We may interpret this as a draw (maybe correct). We may interpret this as whatever the benefit is, it’s dwarfed by other factors, and so very hard to tease out in any real world study (sounds the most reasonable to me).

Whatever the case, we cannot fairly conclude that static type safety is empirically more productive, or produces fewer defects on average, than the alternative.

Lie #6: Languages without static type safety do not scale to large teams / products

It’s often claimed that more dynamic languages such as JavaScript do not scale to large software teams. It’s very easy to refute this claim: Google, Facebook, Slack, Electron, Airbnb, Shopify, Netflix, Uber, Trello, Discord, WordPress, eBay - all these names use PHP, JavaScript and even Ruby heavily, and have for a long time.

While it does seem to be true that many of these teams are increasingly adopting TypeScript over JavaScript, this is not necessarily evidence that static type safe languages scale better to larger teams, or can unlock further scale. Although it is very interesting - and perhaps we’ll see some empirical evidence for types at some point as more and more teams make this change!

My interpretation of the evidence is that it represents a largely cultural shift in terms of popularity and hiring. Nearly every software engineer I’ve coached - including a number of “senior engineers” - was lacking core skills when it came to testing. And without a strong understanding of testing, working without a type checker is certainly going to feel like hell.

It could also simply mean that there are problems with JavaScript, specifically. But I refuse to entertain that idea in service of preserving my ego.

Lie #7: Refactoring is harder without static type safety

The claim is that with static type safety you can refactor fearlessly - further encouraged by the abundant “refactoring” options built into modern IDEs. The compiler will tell you if you’ve broken something, so big changes come with lower risk, right?

This sounds great in theory, but it collapses under inspection. Types don’t guarantee behavioural correctness. They only guarantee that your code is structurally consistent.

You can shuffle methods, rename fields, and fix all the compiler errors - and still completely break what the software is supposed to do. The compiler won’t tell you if you’ve swapped two fields, inverted a conditional, or misapplied a business rule. Only tests catch that.
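As a small sketch of the inverted conditional case: the business rule, the function names and the numbers below are all hypothetical. Both versions take a number and return a number, so a type checker would have nothing to complain about, but a behavioural check pinned to the rule tells them apart.

```javascript
// Hypothetical business rule: orders of 50 or more ship free, otherwise shipping is 5.
function shippingCost(orderTotal) {
  if (orderTotal >= 50) {
    return 0;
  }
  return 5;
}

// Hypothetical "refactor" to a one-liner, with the comparison accidentally inverted.
// Same parameter, same return type, different behaviour.
function shippingCostRefactored(orderTotal) {
  return orderTotal < 50 ? 0 : 5;
}

// A behavioural check pinned to the rule, including the boundary value.
function meetsSpec(costFn) {
  return costFn(49) === 5 && costFn(50) === 0 && costFn(120) === 0;
}

const originalOk = meetsSpec(shippingCost);
const refactoredOk = meetsSpec(shippingCostRefactored);

console.log(originalOk, refactoredOk);
```

The original passes the check and the refactored version fails it, which is exactly the signal you want before the change ships.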

In fact, type systems often make refactoring more painful. Every change explodes into a cascade of compiler errors across the codebase. You end up spending hours “fixing” things that weren’t broken, just to satisfy the type checker. Not only is that not “safe refactoring”, it’s actively distracting you from the real hazard of introducing subtle bugs.

Of course, you can readily flip this one: you’ll often hear that tests make refactoring more difficult. The difference is that the tests (assuming they’re designed well) are catching necessary difficulty - they’re telling you something’s not behaving the way it used to. Types don’t, and can’t, do this. So while yes, tests may slow down a refactor when compared with no tests, what that price buys you is knowing what you’re changing - the slowdown is a good thing. Whereas with types, you’re paying a time cost for nothing but the illusion of safety.

Real safety in refactoring comes from a robust suite of behavioural tests. Tests prove that the observable outcomes of the code (i.e. the things that matter) haven’t changed. Types only prove that your paperwork is in order.

Relying on static types for refactoring is like driving at night with no headlights - the real hazard is on the road in front of you but you’re too busy trying to clear the warning lights on the dash.

How might this article be misinterpreted?

Perhaps some will take away that I think types in general are pointless. This is not the case.

Some may think I’m claiming that poor software or endless refactoring are inevitable outcomes of using static types, and that’s absolutely not the case, again.

Many will think that while I may make a great philosophical or theoretical point, it immediately breaks down in real organisations with real demands. Perhaps that’s true, to an extent - perhaps static types strike a happy medium between speed and quality that allows (most) engineers to move faster in a way that’s good enough for most businesses. But my goal in this article is to coach engineers, not businesses.

Some readers will immediately shout “But what about Rust?” And it’s true: Rust’s type system enforces ownership and borrowing rules that do prevent entire classes of runtime errors (segfaults, data races). That’s a real and impressive achievement.

But those are completely different claims from the ones I’m addressing here. Rust is about memory correctness in systems programming. The mainstream hype around TypeScript, Java, and C# is that types make business logic safer and easier to maintain. On that front, the evidence is absent. Almost all the bugs that actually matter to your users are still logic bugs - and those are only caught by testing.

Tests can be wrong too, you know!

Of course it’s true: tests aren’t perfect. You can write bad tests. You can miss cases. You can test the wrong thing. But here’s the key difference - every test forces you to think about behaviour. Even a mediocre test is an act of asking, “what should this code actually do?” It asks about function.

Types never ask that question. They only ever ask, “what shape should this code take?” They ask about form. That might keep the compiler happy, but it doesn’t say a word about whether the software is valuable, usable, or correct in any meaningful sense.

So yes, tests don’t guarantee correctness. But they cultivate the right mindset. They orient engineers toward behaviour and outcomes. Types, by contrast, keep engineers fixated on structure. And structure without behaviour is just paperwork.

No author thinks their novel is brilliant just because their word processor has no more red squiggles. Type safety is spell-check. Testing is reading the story.

Why should I care?

Many engineers (not all!) push for these languages because the type checker makes them more productive - that is, because it compensates for an underdeveloped core skill set. They don’t realise this is the case - they simply see themselves getting more wins with the type checker enabled, and reason it must be the better way to code.

Types are like training wheels on a bike - until you take them off you’re not really learning how to ride. They reinforce an obsession with code structure and warp ideas about correctness in a way that stunts the growth of many software engineers.

The bigger problem is that this obsession with structure doesn’t stop at types. OOP is cut from the same cloth - a “function follows form” paradigm that often keeps engineers focused on modelling problems rather than building solutions.

The bottom line is: unless engineers confront these weaknesses, they will never reach their full potential, and the longer they stay in this bubble the more entrenched this mindset of structure over behaviour (form over function) becomes.

You’ve convinced me. Where would I start?

If you’ve never seriously used a “loosely typed” language before, then I’d say start with learning JavaScript on Exercism. If you like it, maybe try building something non-trivial with it.

As mentioned at the beginning, classifying type systems is a nightmare. So while languages like Ruby and Python are not statically checked, they are still strongly typed (they don’t feature such flexible type coercion as JavaScript does, for example). However, I still think the lack of static type checking is a boon here.
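For anyone who hasn’t run into it, this is the kind of coercion I mean. These are all standard JavaScript behaviours; the variable names are just labels for the example.

```javascript
const concatenated = '1' + 1;     // the number is coerced to a string
const multiplied = '3' * '4';     // both strings are coerced to numbers
const looselyEqual = 1 == '1';    // == coerces before comparing
const strictlyEqual = 1 === '1';  // === does not coerce
const nullPlusOne = null + 1;     // null is coerced to 0

console.log(concatenated, multiplied, looselyEqual, strictlyEqual, nullPlusOne);
// prints: 11 12 true false 1
```

In Python or Ruby, adding a string to an integer as in the first line raises a type error at runtime instead of quietly converting.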

Python is obviously very mainstream nowadays, so if you want something a little different perhaps try out Elixir, which is Ruby-like but runs on the BEAM.

Try C. Or even Assembly. Take the training wheels off.

¹ Argumentum ex silentio - yes, very good, I also enjoyed the Harry Potter films.
² It’s hard to find an exact date on this article, but based on some references within and backlinks it seems it’s from 2014.
