Types are the basic tool of software design
January 3, 2025

One of my pet peeves is when someone purports to talk about software design and then… talks about comments. Or focuses on the minutiae of how to write a function.

At the risk of using a tired metaphor, it’s a bit like an architect fussing over interior decoration. That’s not to say that the inside of a room is unimportant (a bad architect can definitely create a room that humans have no use for), but it’s not the important part either. Furniture can be rearranged; load-bearing walls cannot.

A function is the archetypal abstraction boundary. It is one of the few places where we really should be able to ignore the details. (Performance still matters, of course, but even that can be understood in a black-box way.) We can write functions of better or worse quality, but their internals are not really relevant to the overall design of the software. They should be isolated.

I think design is about what’s left after we remove all the function bodies. It’s worth taking a moment to think about what that looks like. Here are some of my immediate thoughts:

  1. Everything that’s left is just types.
    Even in dynamic languages there are types, just less detailed ones: classes with certain methods, functions that take a certain number of arguments, and the files these live in.
  2. The classic prohibition on global state is mainly about making sure these types mean something. Any global state can be used anywhere, so it’s hard to guess how any function that touches it will behave. Without global state, the only things a function can work with are the ones named in its type signature.
  3. The more modern aversion to deeply chained method calls (e.g. the “Law of Demeter”) is likewise an attempt to make a function’s signature say something meaningful about what it does. I don’t really like these rules because they get misused. (Note to self: write a post about this later.) But there’s a solid kernel of a good idea here: functions should tend to operate on their actual declared parameters, rather than on “literally anything that can be reached from those parameters”.
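
To make the second point concrete, here is a minimal Python sketch. Every name in it is hypothetical and invented for illustration. The first function depends on module-level state that its signature never mentions; the second can only use what its declared parameters hand it.

```python
from dataclasses import dataclass

# Hypothetical global state: any function in this module may read or change it.
_tax_rate = 0.25


def price_with_hidden_state(net: float) -> float:
    # The signature says float -> float, but the result also depends on
    # _tax_rate, which the type does not mention.
    return net * (1 + _tax_rate)


@dataclass(frozen=True)
class TaxPolicy:
    rate: float


def price_with_tax(net: float, policy: TaxPolicy) -> float:
    # Everything this function depends on appears in its signature.
    return net * (1 + policy.rate)
```

With the bodies deleted, only the second signature still tells a reader what the price depends on.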

Humans love to name things. It’s part of how we understand the world. In one of Feynman’s books, he quips that merely knowing the name of something is useless, but I think he was wrong. The name is the beginning: it is the handle to which you start attaching your understanding. It is a tool for communicating with other people and for finding information. In programming, everything we name is a type, or is associated with one.

I was recently helping a kid learn to write Python. They hit a situation that called for an interface, but Python doesn’t really have those. Inheritance wasn’t needed, because there wasn’t any shared implementation to speak of. After feeling a bit stupid for a minute (and, after a quick Google, deciding that abc was too much to cover), I ended up suggesting they write a “template” class in a comment. At least that gives them something to copy and paste when starting each new implementation of that non-interface.
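
One way to give such a “non-interface” a name in current Python is typing.Protocol (available since Python 3.8). The sketch below is only an illustration, not what was suggested in the anecdote above; Shape, Square, Circle, and total_area are made-up names.

```python
import math
from typing import Iterable, Protocol


class Shape(Protocol):
    """A named interface: anything with an area() method returning a float."""

    def area(self) -> float: ...


class Square:
    # No inheritance from Shape: having a matching area() method is enough
    # for a static type checker to accept a Square where a Shape is expected.
    def __init__(self, side: float) -> None:
        self.side = side

    def area(self) -> float:
        return self.side ** 2


class Circle:
    def __init__(self, radius: float) -> None:
        self.radius = radius

    def area(self) -> float:
        return math.pi * self.radius ** 2


def total_area(shapes: Iterable[Shape]) -> float:
    # Uses only what the Shape interface promises.
    return sum(shape.area() for shape in shapes)
```

Note that the conformance check happens in a static type checker such as mypy; at runtime Python still just calls area() on whatever it is given.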

It’s really annoying when you can’t name something important.

Of course, it’s equally annoying when you’re forced to come up with a name for something that is really structural. Java’s lack of function types comes to mind.

Thinking about design in terms of types

For all its shortcomings, UML gets at least this part right. At design time, we want to figure out what the types are, what their relationships are, and what to name them. (I have mentioned before where UML goes wrong, but in short: it’s too object-oriented, and design is iterative whereas UML is very up-front. As we know, good design is the result of refactoring.)

C programmers tend to end up with a similar design approach, which I often call “representation first.” C’s type system is primitive enough that thinking about types is really more about representation than anything else.

One of the things that poisoned the really early “static typing” debates was that most people had only ever been exposed to types in the service of performance. C, pre-template C++, pre-generics Java, and even Common Lisp compilers basically use types just to make the code faster. For a long time, the idea that types could do anything else was foreign to many programmers.

Kernighan and Pike write: “The design of the data structures is the central decision in the creation of a program. Once the data structures are laid out, the algorithms tend to fall into place, and the coding is comparatively easy.”

This comes from “The Practice of Programming”, which goes on to say:

“One aspect of this point of view is that the choice of programming language is relatively unimportant to the overall design. We will design the program in the abstract and then write it in C, Java, C++, Awk, and Perl.”

Here, I start to disagree. First of all, equating “data structures” with “types” is very much a C way of thinking. The examples that follow in the book are too simple to really criticize, but overall the point they make here is sometimes derided as “You can write C in any language!” (a joking reference to the earlier complaint “You can write Fortran in any language.” I guess it’s not surprising that I now sometimes hear people complain that some people can write Java in any language.)

The functional programming tradition, especially the branch that gave us Standard ML and Haskell, also places a strong emphasis on type-first programming. One of Haskell’s most fundamental innovations (well, it probably wasn’t the first to do it, but it was among relatively mainstream languages) is the ability to write a function’s type separately from the function body.

map :: (a -> b) -> [a] -> [b]
...

On the one hand, I often feel the benefits of this language design. It’s very nice to write down the type of a function (or, more typically, the types of a whole collection of functions) before writing a single line of a body. It lets you think about the problem and plan usefully. There’s a reason Haskell programmers start by writing function types, even though they could simply be inferred.

On the other hand, I occasionally lament the lack of clear names for function parameters. I probably won’t have trouble remembering the argument order of map specifically, but it’s nice when your IDE can show you map(fn, lst) as a quick reminder of a function’s arguments and their order. The Haskell declaration style has no canonical name for each parameter of a function, because the parameters are pattern matched immediately. Trade-offs, I guess.
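
For comparison, here is a rough Python rendering of the same signature. It is only a sketch: map_list is a made-up name (chosen to avoid shadowing the built-in map), and fn and lst are simply the parameter names used in the paragraph above.

```python
import inspect
from typing import Callable, Iterable, List, TypeVar

A = TypeVar("A")
B = TypeVar("B")


def map_list(fn: Callable[[A], B], lst: Iterable[A]) -> List[B]:
    # Same shape as (a -> b) -> [a] -> [b], but each parameter carries a
    # canonical name that tooling can display.
    return [fn(item) for item in lst]


# The names are part of the declaration, so tools can read them back.
parameter_names = list(inspect.signature(map_list).parameters)
```

Here parameter_names comes out as the list of declared names, fn followed by lst, which is exactly the reminder an IDE would show.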

Deeper, more complex types

Even with dynamic languages, we design programs in terms of types. However, simply moving to a static language is arguably a step backwards. We actually have research showing this: the most famous papers on static vs. dynamic typing and productivity are about rather poor static type systems.

For static types to come into their own and help us design our programs, we need them to be powerful. A good starting point is support for all three kinds of types we might want to design with. But since no language does that, we’ll have to settle for some of them. Languages of this kind include plain C and pre-version-5 Java.

The benefit of languages with these kinds of types is that more and more of the programming task is reflected in the types. For interfaces, we know that an implementation must meet certain minimum requirements. For data types, we know that functions will work by pattern matching against a specific, fixed set of cases. This also lets machines understand the code better, so that automated refactoring can work reliably.

The next step up in types that help us design programs is parameterization. It arrived with Java generics, C++ templates, and so on. The feature originated in functional programming languages and was eventually ported over.

Contrary to what many people focus on (power isn’t always what we’re after; we should want properties), parameterized functions (and types) are not just about “generic programming”. It’s not just the flexibility of using a function with many different types. It’s also about properties: when we parameterize over a type, we limit the operations we can perform on values of that abstract type.

I think this one-two punch (more generally useful code and stronger correctness properties) is responsible for static typing taking over the functional programming community. It’s very impressive that showing map is correct takes just a type check and one property test.
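
Here is a hedged Python sketch of that idea. The names my_map, identity, and check_identity_law are invented for this example, and the property used is the identity law (mapping the identity function changes nothing). In Haskell the type of map plus this one property pins the function down; Python’s type checkers are weaker than that (isinstance can peek at elements), so treat this as an illustration of the idea rather than a proof.

```python
import random
from typing import Callable, List, TypeVar

A = TypeVar("A")
B = TypeVar("B")


def my_map(fn: Callable[[A], B], lst: List[A]) -> List[B]:
    # The body knows nothing about A or B, so (as far as the type checker
    # is concerned) it cannot inspect, compare, or invent elements. All it
    # can do is pass each one to fn and collect the results.
    return [fn(item) for item in lst]


def identity(x: A) -> A:
    return x


def check_identity_law(trials: int = 100) -> bool:
    # Property: mapping the identity function leaves every list unchanged.
    rng = random.Random(0)  # fixed seed so the check is repeatable
    for _ in range(trials):
        xs = [rng.randint(-1000, 1000) for _ in range(rng.randint(0, 20))]
        if my_map(identity, xs) != xs:
            return False
    return True
```

A version that dropped, duplicated, or reordered elements would still type check, but it would fail this one property.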

Researchers continue to look for more ways that types can help build programs. Rust’s type system has something akin to linear types for managing lifetimes. With so many modern languages doing garbage collection and exception handling, it’s easy to forget that an important part of design is questions like “How do we do resource management?” and “How do we handle errors?” For many languages, the right answer is to settle these questions once, globally, so they no longer come up. But when we need to be more nuanced, encoding more of these decisions in types means we can understand the design better just by thinking about the types. It also helps with automation: having to allocate space for a function to pass data back is just extra bookkeeping in C, but is (effectively) handled for us in Rust.
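
As one small example of putting the “How do we handle errors?” decision into a type, here is a Python sketch. ParseError, parse_port, and describe are hypothetical names, and this is only loosely in the spirit of Rust-style result types, not a rendering of any particular library.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class ParseError:
    reason: str


def parse_port(text: str) -> Union[int, ParseError]:
    # Failure is part of the return type, so callers see it in the signature
    # instead of discovering an exception at runtime.
    try:
        port = int(text)
    except ValueError:
        return ParseError(f"not a number: {text!r}")
    if not 0 < port < 65536:
        return ParseError(f"out of range: {port}")
    return port


def describe(text: str) -> str:
    result = parse_port(text)
    # A type checker makes the caller rule out the error case before
    # treating the result as an int.
    if isinstance(result, ParseError):
        return f"error: {result.reason}"
    return f"port {result}"
```

With the bodies removed, the signature of parse_port still tells you that parsing can fail and what a failure looks like.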

There is also a lot of active research on dependent types. My favorite reason to be excited about this area is the ability to generate an implementation (or at least part of one) from the type. If you have never seen this kind of interaction, I recommend watching Edwin Brady’s video about Idris 2. It demonstrates a level of IDE support for a language that we don’t really get to experience today. Part of that is down to the lack of good support for such things in IDEs (and language tools), but part of it is new capability enabled by adding (even fairly unsophisticated) dependent types to the language.

While I often complain that programmers focus too much on power rather than properties, type systems are one place where I suspect the opposite often happens. People sometimes act as if types are only about correctness: they focus too much on properties. A type is a machine-readable description of a program’s design. That’s powerful.

End notes

  • I mean for today’s point to be universal, which is why I stressed that types are central to design even in dynamically typed languages. But then I went on to praise the virtues of static typing. I hope that doesn’t obscure the larger point for those who don’t care for it.
  • In retrospect, maybe I should have said something about the “duck typing design aesthetic”. The general idea is to support ad hoc/retroactive creation of unnamed structural types.
  • More from the “thoughts I had shortly after posting” department: it’s also interesting to see how type classes can be implemented in Haskell using logic programming.
  • Edwin Brady also wrote a book called “Type-Driven Development with Idris”. I haven’t had a chance to read it yet. Might be fun.

2024-12-31 10:00:15
