little cubes

Why software doesn't work anymore

We deserve better Results

No one expects software to work anymore

What was your reaction when that error dialog just popped up? Did you read it? Or did you just click cancel like you do for every annoying little box that pops up on your computer several times a day? Did you even realize that I insulted your mother?

You didn’t read it because your brain has been programmed over time to expect things like that. Turn it off and turn it back on again. Refresh the page. Clear cache and cookies. We’ve gotten used to it. We’re not surprised by any of it. And our expectations of our software just continue to fall as time goes on.

Does this look familiar?

async function updateUserProfile(req, reply) {
  try {
    await Auth.check(req.headers.Authorization)
    const user = await Db.getUser(req.user.id)
    if (!user.isInitialized) throw new Error('User is not initialized')
    await Db.updateUser({phoneNumber: req.phone})
  } catch (err) {
    const msg = err instanceof Error ? err.message : 'Failed to update user profile'
    req.log.error({err}, msg)
    reply.code(500).send({error: msg})
  }
}

It is quite seductive (and quite common) to write code that focuses on the “happy path”, how you would like things to work, without taking the (considerable) extra time to stop and address each and every way that things can go off the rails.

The reality is that our software is incredibly error-prone, and yet the amount of error handling in any given codebase tends to be pretty minimal. In the snippet above (which I’m certain is quite similar to many apps you’ve worked on), every single line in the try-block can fail. But we don’t bother with any of the details and just wrap it all in a generic error message at the end. At least in this case there is a catch block; sometimes you don’t even get that.

In order to properly handle those errors you would first have to know which type of error each function can throw and what that error means:

  • Is it fatal?
  • Is it retry-able?
  • Do we need exponential backoff?
  • Does the user need to be notified?
  • What should we tell the user?
  • Or even, do we need to issue a refund?
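One way to picture the answers to those questions is as data attached to each kind of error. This is only a sketch: the error kinds, the `FailurePolicy` shape, and the default delays are all made up for illustration, not taken from any real codebase.

```typescript
// Hypothetical error kinds for a payment flow.
type ErrorKind = 'Timeout' | 'CardDeclined' | 'CorruptState'

// Hypothetical policy record answering: fatal? retry-able? what do we tell the user?
type FailurePolicy = {fatal: boolean; retryable: boolean; userMessage: string | null}

const policies: Record<ErrorKind, FailurePolicy> = {
  Timeout: {fatal: false, retryable: true, userMessage: null},
  CardDeclined: {fatal: false, retryable: false, userMessage: 'Your card was declined.'},
  CorruptState: {fatal: true, retryable: false, userMessage: 'Something went wrong on our side.'},
}

// Exponential backoff: baseMs * 2^attempt, capped at maxMs. Attempts are counted from 0.
function backoffDelayMs(attempt: number, baseMs = 100, maxMs = 5000): number {
  return Math.min(maxMs, baseMs * 2 ** attempt)
}

// Returns how long to wait before retrying, or null if this kind of error should not be retried.
function nextDelayMs(kind: ErrorKind, attempt: number): number | null {
  return policies[kind].retryable ? backoffDelayMs(attempt) : null
}
```

None of this is possible unless you know which `ErrorKind` you are holding, and that is exactly the information a thrown exception does not give you.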

Because the e in catch (e) is typed as unknown (or any, in a non-strict TypeScript config), the compiler can tell you nothing about what was actually thrown. Even having answers to some of those questions in the existing documentation would demonstrate strong discipline by your predecessors. But let’s be honest, that’s unlikely.
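Here is a minimal sketch of what that looks like in practice. `describeThrown`, `riskyParse`, and `parseOrReport` are hypothetical names invented for this example; the point is only that nothing in a function’s signature tells you what it throws, so the catch block has to guess.

```typescript
// Hypothetical helper: turn whatever was thrown into a loggable message.
function describeThrown(e: unknown): string {
  // `e` could be an Error, a string, a number, or anything else.
  if (e instanceof Error) return e.message
  if (typeof e === 'string') return e
  return 'Unknown failure'
}

// Hypothetical function whose signature says nothing about how it fails.
function riskyParse(input: string): number {
  const parsed = Number(input)
  if (Number.isNaN(parsed)) throw new Error(`Not a number: ${input}`)
  return parsed
}

function parseOrReport(input: string): string {
  try {
    return `Parsed ${riskyParse(input)}`
  } catch (e: unknown) {
    // The type of riskyParse gives no hint about `e`, so we narrow by hand.
    return describeThrown(e)
  }
}
```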

A high degree of discipline is required to scale TypeScript, especially in large engineering organizations. But unfortunately, discipline does not scale.

Error handling in programming has a long history of being done poorly. First we had error codes, which people forgot to check. Later we got exceptions, which solved the forgetfulness problem by crashing the app. In HTTP we still have error codes, which the popular axios library checks by throwing exceptions that people forget to catch. What a compromise.

Any approach that makes it possible for a developer to forget to address the error is inherently flawed and bound to fail because (again) discipline doesn’t scale; it doesn’t scale over an org, a team, or even just over time.

The popular Rust programming language doesn’t contain exceptions; throw is not a keyword. Instead it has Result<T, E> for recoverable errors and panic! for unrecoverable errors.

Result<T, E> represents one of two possible states: success or failure, the cat is alive or dead. It can either be Ok and contain your intended return type T, or it can be Err and contain a specific type of error E.

In order to get your data out of the Result, the compiler forces you to check whether the operation was successful or not. The discipline is enforced by the compiler, so it scales!

From the Rust book:

Rust requires you to acknowledge the possibility of an error and take some action before your code will compile. This requirement makes your program more robust by ensuring that you’ll discover errors and handle them appropriately before deploying your code to production!

As you can see, Results have a number of advantages over throwing exceptions:

  • Checking errors becomes a compiler requirement
    • in TS implementations it becomes more of a strong suggestion but such is the nature of JavaScript
  • No hidden control flow (exception bubbling)
  • Crystal clear guarantees of how the code succeeds, and how it fails, at a glance.
  • Possible errors are documented by the type system

How might we implement Results in TypeScript?

type Ok<T> = {type: 'Success'; value: T}
type Err<E> = {type: 'Failure'; error: E}
type Result<T, E> = Ok<T> | Err<E>

By using a discriminated union on type, when a function returns a Result<T, E> you’re now forced to check which variant you have in order to access the value:

const userResult = await Db.getUser(req.user.id)
if (userResult.type === 'Failure')
  return reply.code(404).send({type: 'Failure', error: userResult.error.message})
const user = userResult.value

Well, that was easy!
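To see the narrowing and the “errors are documented by the type system” idea working together, here is a small self-contained sketch. The `GetUserError` union, the in-memory `users` map, and the status codes are assumptions made up for this example; a real `Db.getUser` would look different.

```typescript
type Ok<T> = {type: 'Success'; value: T}
type Err<E> = {type: 'Failure'; error: E}
type Result<T, E> = Ok<T> | Err<E>

// Hypothetical error union: the signature now lists every way the lookup can fail.
type GetUserError =
  | {kind: 'NotFound'; id: string}
  | {kind: 'ConnectionLost'; retryAfterMs: number}

type User = {id: string; name: string}

// Stand-in for a database, backed by an in-memory map.
const users = new Map<string, User>([['u1', {id: 'u1', name: 'Ada'}]])

function getUser(id: string): Result<User, GetUserError> {
  if (id === 'offline') return {type: 'Failure', error: {kind: 'ConnectionLost', retryAfterMs: 500}}
  const user = users.get(id)
  if (!user) return {type: 'Failure', error: {kind: 'NotFound', id}}
  return {type: 'Success', value: user}
}

function statusFor(error: GetUserError): number {
  switch (error.kind) {
    case 'NotFound':
      return 404
    case 'ConnectionLost':
      return 503
    default: {
      // If a new error kind is added to the union, this assignment stops compiling.
      const unreachable: never = error
      return unreachable
    }
  }
}

function greet(id: string): string {
  const result = getUser(id)
  // Reading result.value here would be a compile error, because Err has no `value`.
  if (result.type === 'Failure') return `HTTP ${statusFor(result.error)}`
  return `Hello, ${result.value.name}`
}
```

The `never` trick in `statusFor` is what keeps the error handling honest over time: adding a third failure mode to `GetUserError` breaks the build until someone decides what to do with it.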

We could even create some helper functions to make Results easier to work with:

const ResultHelpers = {
  succeed<T>(value: T): Ok<T> {
    return {type: 'Success', value}
  },
  fail<E>(error: E): Err<E> {
    return {type: 'Failure', error}
  },
  isSuccess<T, E>(result: Result<T, E>): result is Ok<T> {
    return result.type === 'Success'
  },
  isFailure<T, E>(result: Result<T, E>): result is Err<E> {
    return result.type === 'Failure'
  },
  /**
   * If the result is successful, transform the output value
   * If the result has failed, return that original failure
   */
  map<T, E, T2>(result: Result<T, E>, fn: (value: T) => T2): Result<T2, E> {
    if (result.type === 'Failure') return result
    return {type: 'Success', value: fn(result.value)}
  },
  /**
   * If the result is successful, pass the success value to a function that performs another action
   * If the result has failed, return that original failure
   */
  andThen<T, E, T2, E2>(result: Result<T, E>, fn: (value: T) => Result<T2, E2>): Result<T2, E | E2> {
    if (result.type === 'Failure') return result
    return fn(result.value)
  },
  /**
   * If the result is successful, return that original success
   * If the result has failed, return a success containing the fallback value
   */
  unWrapOr<T, E, T2>(result: Result<T, E>, orValue: T2): Ok<T | T2> {
    if (result.type === 'Success') return result
    return {type: 'Success', value: orValue}
  },
  /**
   * If the result is successful, perform some side effect, then return the original success
   * If the result has failed, return that original failure
   */
  inspect<T, E>(result: Result<T, E>, fn: (value: T) => unknown): Result<T, E> {
    if (result.type === 'Success') fn(result.value)
    return result
  },
  /**
   * If the result is successful, return the original success
   * If the result has failed, perform some side effect, then return the original failure
   */
  inspectError<T, E>(result: Result<T, E>, fn: (error: E) => unknown): Result<T, E> {
    if (result.type === 'Failure') fn(result.error)
    return result
  },
}

Most operations that might fail are async. How could we handle those situations?

type ResultAsync<T, E> = Promise<Result<T, E>>
type ResultMaybeAsync<T, E> = Result<T, E> | ResultAsync<T, E>

We could then adapt our helper functions to operate on ResultMaybeAsync instead, allowing for a unified API for both sync and async results.
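As a rough sketch of that adaptation, here is `andThen` rewritten to accept either flavor. This is a simplification I’m assuming for illustration: it always returns a promise, even when both inputs are sync, whereas a more careful implementation would preserve the sync case in its types. `parsePort` and `reservePort` are hypothetical steps.

```typescript
type Ok<T> = {type: 'Success'; value: T}
type Err<E> = {type: 'Failure'; error: E}
type Result<T, E> = Ok<T> | Err<E>
type ResultAsync<T, E> = Promise<Result<T, E>>
type ResultMaybeAsync<T, E> = Result<T, E> | ResultAsync<T, E>

// Awaiting a non-promise value just yields the value, so one implementation covers both cases.
async function andThen<T, E, T2, E2>(
  result: ResultMaybeAsync<T, E>,
  fn: (value: T) => ResultMaybeAsync<T2, E2>,
): Promise<Result<T2, E | E2>> {
  const settled = await result
  if (settled.type === 'Failure') return settled
  return fn(settled.value)
}

// Hypothetical sync step.
const parsePort = (raw: string): Result<number, string> => {
  const port = Number(raw)
  return Number.isInteger(port) && port > 0
    ? {type: 'Success', value: port}
    : {type: 'Failure', error: `Invalid port: ${raw}`}
}

// Hypothetical async step, standing in for a network or database call.
const reservePort = async (port: number): Promise<Result<string, string>> =>
  port < 1024
    ? {type: 'Failure', error: `Port ${port} is privileged`}
    : {type: 'Success', value: `Reserved ${port}`}

// A sync Result flows straight into an async step with the same helper.
const reserve = (raw: string) => andThen(parsePort(raw), reservePort)
```

Calling `reserve('8080')` resolves to a success, while `reserve('abc')` short-circuits on the first failure and never calls `reservePort` at all.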

Luckily, someone has already done this

The neverthrow npm package is essentially a port of the Rust Result<T, E> implementation. It is the most popular library in this space, but in my opinion has a fundamental flaw because it’s based around class instances, which are not serializable. This means that neverthrow is difficult to use close to client-server boundaries, which is just about everywhere in a web app.

There is a new library called @praha/byethrow that solves this problem by having Results just be POJOs (plain old JavaScript objects) like we’ve seen above. Not only are byethrow’s Results serializable, but they can also be sent from client ↔ server and validated with a schema. This means that you can return Results from your API endpoints, from your React server functions, from your React Query hooks. From anywhere to anywhere.
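The difference is easy to demonstrate with nothing but JSON. In this sketch, `ClassBackedOk` is a hypothetical stand-in for a class-based result (it is not neverthrow’s actual class), and `isStringResult` is a hand-written guard playing the role a schema library would normally play.

```typescript
type Ok<T> = {type: 'Success'; value: T}
type Err<E> = {type: 'Failure'; error: E}
type Result<T, E> = Ok<T> | Err<E>

// Hypothetical runtime check for a Result whose value and error are both strings.
function isStringResult(data: unknown): data is Result<string, string> {
  if (typeof data !== 'object' || data === null) return false
  const candidate = data as {type?: unknown; value?: unknown; error?: unknown}
  if (candidate.type === 'Success') return typeof candidate.value === 'string'
  if (candidate.type === 'Failure') return typeof candidate.error === 'string'
  return false
}

// A POJO Result survives a trip over the wire unchanged.
const original: Result<string, string> = {type: 'Failure', error: 'User is not initialized'}
const wire = JSON.stringify(original)
const received: unknown = JSON.parse(wire)

// Hypothetical class-based result: its methods live on the prototype, not in the JSON.
class ClassBackedOk {
  value: string
  constructor(value: string) {
    this.value = value
  }
  isOk(): boolean {
    return true
  }
}
const revived = JSON.parse(JSON.stringify(new ClassBackedOk('hi')))
// `revived` is now a plain object with a `value` field and no `isOk` method.
```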

Byethrow’s API is quite similar to what we saw above. Their generics are more advanced and support a unified interface for Result and ResultAsync which is really nice.

Additionally, they have a Result.pipe() function that allows you to create a very natural pipeline of result operations:

import {Result} from '@praha/byethrow'
const result = Result.pipe(
  // Start with a success value of 5
  Result.succeed(5),
  // Then perform a side effect of logging that value
  Result.inspect((value) => console.log('Debug:', value)),
  // Then transform that success value by multiplying it by two
  Result.andThen((x) => Result.succeed(x * 2)),
)
// Console output: "Debug: 5"
// result: { type: 'Success', value: 10 }

Let’s take the original example from earlier, but now with Results (and a full Fastify route).

import {Result} from '@praha/byethrow'
import type {FastifyInstance} from 'fastify'
import type {ZodTypeProvider} from 'fastify-type-provider-zod'
import z from 'zod'
function resultSchema(args: {value: z.ZodSchema; error: z.ZodSchema}) {
  return z.discriminatedUnion('type', [
    z.object({type: z.literal('Success'), value: args.value}),
    z.object({type: z.literal('Failure'), error: args.error}),
  ])
}
export async function updateUserProfile(fastify: FastifyInstance) {
  fastify.withTypeProvider<ZodTypeProvider>().route({
    method: 'PATCH',
    url: '/user',
    preHandler: fastify.auth([fastify.verifyToken]),
    schema: {
      response: {
        default: resultSchema({value: z.string(), error: z.string()}),
      },
    },
    async handler(req, reply) {
      const authResult = await Auth.check(req.headers.Authorization)
      if (Result.isFailure(authResult)) {
        // Because I have explicitly checked for a failure, the compiler allows me to access .error
        // Because authResult.error is strongly typed, I can be certain it has a .message
        return reply.code(401).send(Result.fail(authResult.error.message))
      }
      const userResult = await Db.getUser(req.user.id)
      // 1. Because I have explicitly checked for a failure here
      if (Result.isFailure(userResult)) {
        return reply.code(404).send(Result.fail(userResult.error.message))
      }
      // 2. The compiler allows me to access .value here
      const user = userResult.value
      if (!user.isInitialized) return reply.code(400).send(Result.fail('User is not initialized'))
      const updateResult = await Db.updateUser({phoneNumber: req.phone})
      if (Result.isFailure(updateResult)) {
        return reply.code(500).send(Result.fail(updateResult.error.message))
      }
      return Result.succeed('User profile updated successfully')
    },
  })
}

Here we’ve seen an API endpoint that both uses Results internally, and unconditionally returns Results to the client. Assuming that the client types their responses correctly, this will force the client to also check for the possibility of an error, even if they don’t have byethrow installed!
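On the client side, that can be as small as three type aliases and one `if`. This is a sketch under my own assumptions: `UpdateProfileResponse` and `toBannerText` are hypothetical names, and the `body` argument stands in for the already-parsed JSON of the PATCH /user response.

```typescript
// The client only needs the shape of the union, not the library that produced it.
type Ok<T> = {type: 'Success'; value: T}
type Err<E> = {type: 'Failure'; error: E}
type Result<T, E> = Ok<T> | Err<E>

// Mirrors the response schema of the route: a string on success, a string on failure.
type UpdateProfileResponse = Result<string, string>

// Hypothetical UI helper: decide what to show the user after saving their profile.
function toBannerText(body: UpdateProfileResponse): string {
  // The compiler will not let us read body.value until the failure case is ruled out.
  if (body.type === 'Failure') return `Could not save: ${body.error}`
  return body.value
}
```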

We have taken a 12-line function and turned it into a 21-line function that will operate almost identically almost all of the time.

But I’m sure you recognize that in doing so we’ve also turned this endpoint into a piece of code that will function predictably every time, instead of just most of the time.

Programming like this might feel arduous if you haven’t done it before, but your users, and your future self on prod support, will thank you.

It’s important to recognize that there are two kinds of errors:

  1. Errors that are possible to anticipate and recover from
    • Most errors fall into this bucket
    • e.g. file not found or database connection failed
  2. Truly “exceptional” errors that can’t be anticipated and can’t be recovered from
    • e.g. AWS is having an outage or the API I’m trying to hit has crashed

These two types of errors must be handled differently in code. The Results that we’ve discussed in this post are meant to handle type 1. But what about type 2 errors?

Fortunately (in terms of work for you, dear programmer) because by definition the only thing that we know about type 2 errors is that they will happen eventually, there’s not terribly much that can be done to handle them. This is where generic error handling at the boundaries of your application comes into play.

For React apps this means Error Boundaries either around your whole app or independent sections of your app that will catch the errant exception and display something to your users that isn’t a blank page.

For APIs this means some mechanism around each route that will catch an exception and automatically return a 500. Fastify, for example, has a generic error handler that you can customize should you desire.
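Stripped of any framework, the idea looks roughly like this. To be clear, `withBoundary`, `HttpResponse`, and `Handler` are hypothetical names for this sketch and not Fastify’s API; in Fastify you would reach for its own hook, `setErrorHandler`, instead.

```typescript
type HttpResponse = {status: number; body: unknown}
type Handler<Req> = (req: Req) => Promise<HttpResponse>

// Hypothetical wrapper: the last line of defence for errors nobody anticipated.
function withBoundary<Req>(handler: Handler<Req>, log: (err: unknown) => void): Handler<Req> {
  return async (req) => {
    try {
      return await handler(req)
    } catch (err) {
      log(err)
      // Deliberately generic: the details go to the log, not to the caller.
      return {status: 500, body: {type: 'Failure', error: 'Internal server error'}}
    }
  }
}

// Usage: a route that handles its expected failures with Results, but can still blow up.
const logged: unknown[] = []
const flaky = withBoundary(
  async (req: {explode: boolean}) => {
    if (req.explode) throw new Error('AWS is having an outage')
    return {status: 200, body: {type: 'Success', value: 'ok'}}
  },
  (err) => {
    logged.push(err)
  },
)
```

The boundary knows nothing about what went wrong, and that’s the point: anything you do know how to handle should have been a Result long before it got here.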