How to tame your GraphQL schema

31 Oct, 2023

It’s 2023 and GraphQL is not as popular as it was a couple of years ago.

The hype is definitely dead. Try Googling “graphql is dying” to convince yourself.

There are always new alternatives, and when Theo floated the question of whether tRPC is a GraphQL KILLER??!, the stars were shooting through the roof for the new kid on the block:

GraphQL vs tRPC - star history

Source: https://star-history.com/#graphql/graphql&trpc/trpc&Date

However, the real picture is more nuanced.

GraphQL remains a very powerful technology and teams keep adopting it. Netflix did so recently, according to ThePrimeagen.

Beware, not all stars shine equally. While GraphQL is stalling on GitHub, the trend on Stack Overflow is positive (though, in fairness, there is no trend for tRPC yet):

GraphQL vs REST vs gRPC - Stack Overflow trends

Source: https://insights.stackoverflow.com/trends?tags=graphql%2Crest%2Cgrpc

Such opposite takes exist because GraphQL's core advantage is also its weakness.

On one hand, GraphQL is extremely flexible and lets you combine REST and RPC practices in one place so you can move fast and scale.

On the other hand, this flexibility can be a deal-breaker if we want to maintain our graph.

In Theo's words, it comes at a cost.

I am going to show you an opinionated approach on how to tame this weakness and distribute this cost between backend and frontend without losing the touted "no brainer" feeling for the UI consumer.

To give you an idea, I can turn a tangled graph into an untangled one.

The REST blueprint

I like to build the schema around a base blueprint which delivers a REST-like API with a CRUD naming convention for operations on each resource (or Object types in GQL lingo).

As a second step I usually extend this blueprint with queries and mutations with RPC-inspired names which reflect client-side needs.

My aim here is to prevent unintentionally tangling the graph relations into a hairy mess, the looks of which you've only seen in the shower drain of a derelict motel.

To that end, I will provide examples, and for each, a comparison of two directions.

One which favours the minimal base blueprint and pushes the naming responsibility to the client.

The other, which adopts a more business declarative API.

Finding the balance between the two is a bit like toggling between speed of execution and maintainability.

Finally, something to keep in mind for picking the right intensity is how public you want your API to be.

Mutations - the CUD in CRUD

I like to namespace mutations by the object name:

extend type Mutation {
  objectName: ObjectNameMutations!
}

type ObjectNameMutations {
  create(input: ObjectNameCreateInput!): ObjectName!
  update(id: ID!, input: ObjectNameUpdateInput!): ObjectName!
  delete(id: ID!): ObjectName!
}

This approach gives me three mutations per object, sets clear expectations and simplifies graph exploration.

I can still name the client-side operations whatever I want.

This is how I would update the name of a User:

mutation UpdateUserName($id: ID!, $newName: String!) {
  user {
    update(id: $id, input: { newName: $newName }) {
      id
      name
    }
  }
}

It's the name of the operation - UpdateUserName - that meets the UX/UI requirements, not the API.
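On the server, the namespace field itself does no work; the real resolver lives one level down. Here is a minimal sketch, assuming a resolver-map style server (one object per type, one function per field) and a hypothetical in-memory users store. The exact resolver signature depends on your server library.

```typescript
// Hypothetical in-memory store standing in for a real data layer.
type User = { id: string; name: string };
type UserUpdateInput = { newName?: string };

const users = new Map<string, User>([["1", { id: "1", name: "Ada" }]]);

const resolvers = {
  Mutation: {
    // The namespace field only returns a non-null placeholder
    // so that execution continues into UserMutations.
    user: () => ({}),
  },
  UserMutations: {
    update: (_parent: unknown, args: { id: string; input: UserUpdateInput }): User => {
      const existing = users.get(args.id);
      if (!existing) throw new Error(`User ${args.id} not found`);
      // Keep the stored name when the client did not send a new one.
      const updated: User = { ...existing, name: args.input.newName ?? existing.name };
      users.set(args.id, updated);
      return updated;
    },
  },
};
```

Adding create and delete follows the same shape, so every object ends up with one small, predictable resolver group.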

Queries - the R in CRUD

Here's the bare minimum for read operations:

extend type Query {
  objectName(id: ID!): ObjectName!
  objectNames: [ObjectName!]!
}

where querying a single object should ideally never be null if the API is private, i.e. not intended for an open search by the user.

And querying multiple objects avoids nullability of the whole response and nullability of the items, i.e. [] instead of null and [1] instead of [null, 1].

That's it! ¹

The rest is in the RPC-inspired extensions. ²
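The list contract above ([] instead of null, [1] instead of [null, 1]) can be enforced with a small helper before a resolver returns. This is a sketch; toNonNullList is a hypothetical name, and silently dropping empty items is an assumption that may not suit every API.

```typescript
// Normalises whatever the data layer returns into a list that satisfies [ObjectName!]!.
function toNonNullList<T>(
  items: ReadonlyArray<T | null | undefined> | null | undefined
): T[] {
  // A missing list becomes an empty one; empty items are dropped.
  return (items ?? []).filter(
    (item): item is T => item !== null && item !== undefined
  );
}
```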

No verbs in queries

It's probably worth mentioning that I don't stick with the CRUD convention verbatim.

After all, software engineering is the art of trade-offs.

Unlike for mutations, I don't prepend a get or a find to obtain a getObjectName. The GraphQL spec doesn't do it in its own examples either.

Compare without and with prefixes:


query GetUser($id: ID!) {
  user(id: $id) {
    id
    name
  }
}


query GetUserWithAddress($id: ID!) {
  getUser(id: $id) {
    id
    name
    getAddress {
      road
    }
  }
}

The nested getAddress field is there to highlight how unnatural the graph traversal reads with verbs instead of nouns.

I did initially try the get and find prefixes but I reverted to idiomatic GraphQL. I wouldn't hold the prefixes against anyone who can be consistent about them.

Extending the blueprint

Keeping the API clean can be hard. Sometimes it's faster to add a query or mutation than to change a resolver or your mum's code. ³

I'm pretty sure there's a lot more to life than having a really, really, ridiculously clean API.

Zoolander knows

Like actually creating value to the end user.

And, if you're building for developers, a clean and minimal API might be it.

So, I am going to address a few use cases that come up often, and give you a sense of what the choices are.

Example - Querying by an attribute

In this example I have a user profile which I can get by id but now I want to search users by the slug.

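Here are two ways to accomplish that in the API:

extend type Query {
  user(id: ID!): User!

  # Option 1: reuse the field name with a different argument.
  # Shown next to the id lookup for comparison only; a type cannot declare user twice.
  user(slug: String): User
  # Option 2: a dedicated field.
  userBySlug(slug: String): User
}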

I find it easier to spot the BySlug suffix than the change in the argument.

Compare how the two read as a client-side operation:


query SearchUserBySlug($slug: String!) {
  userBySlug(slug: $slug) {
    id
    name
  }
}


query SearchUserBySlug($slug: String!) {
  user(slug: $slug) {
    id
    name
  }
}

If I had to choose, I can see myself flipping a coin on this one.

Example - Updating an attribute

This one happens quite often.

For example, let's change the name of my User. I could simply add the mutation updateName:

extend type Mutation {
  user: UserMutations!
}

type UserMutations {
  update(id: ID!, input: UserUpdateInput!): User!

  updateName(id: ID!, newName: String!): User!
}

But as you can imagine this approach proliferates mutations.

And I guarantee you, there will be a graveyard of stale and deprecated mutations in your API.

The alternative is to push the naming responsibility to the client-side operation and re-use update.

Now let's compare how both approaches look on the client side:


mutation UpdateUserName($id: ID!, $newName: String!) {
  user {
    updateName(id: $id, newName: $newName) {
      id
      name
    }
  }
}


mutation UpdateUserName($id: ID!, $newName: String!) {
  user {
    update(id: $id, input: { newName: $newName }) {
      id
      name
    }
  }
}

Re-using the update is easy, right?

Well... there are actually a few things happening here. I am not creating a new mutation, but in order to achieve a minimal API:

every field of the UserUpdateInput input has to be optional
the resolver needs to handle the empty case
the data layer has to skip updating missing fields and has to erase those which are explicitly null
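The three requirements above can be sketched in a few lines. This is an illustration under assumptions: the nullable bio field and the applyUserUpdate helper are hypothetical, and rejecting an explicit null for the required name is one possible policy, not the only one.

```typescript
type User = { id: string; name: string; bio: string | null };

// Every input field is optional. A client may omit a field or send an explicit null.
type UserUpdateInput = { newName?: string | null; newBio?: string | null };

function applyUserUpdate(user: User, input: UserUpdateInput): User {
  const next: User = { ...user };

  // name is required on User, so an explicit null cannot erase it.
  if (input.newName === null) throw new Error("name cannot be erased");
  // undefined means the field was not sent: keep the stored value.
  if (input.newName !== undefined) next.name = input.newName;

  // For a nullable field, an explicit null erases and a value overwrites.
  if (input.newBio !== undefined) next.bio = input.newBio;

  return next;
}
```

An empty input returns the user unchanged, which covers the empty case the resolver has to handle.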

This is a meaty decision with an upfront cost. Nonetheless, most of the time I prefer re-using update to creating ad hoc mutations.

Example - More expressive mutations

Similar to the previous example, sometimes I want a mutation to be part of the API because it reads better.

I know it feels like I am messing around talking about one approach and then suggesting the opposite in the next example. Designing an API is hard.

For example, suppose I want to block a user:

type UserMutations {
  update(id: ID!, input: UserUpdateInput!): User!

  block(id: ID!): User!
}

Pretty self-explanatory but let's compare this approach with the update:


mutation BlockUser($id: ID!) {
  user {
    block(id: $id) {
      id
      isBlocked
    }
  }
}


mutation BlockUser($id: ID!) {
  user {
    update(id: $id, input: { newIsBlocked: true }) {
      id
      isBlocked
    }
  }
}

That's one of those cases where the critical mass of the engineering debates will break the time-space continuum.

If I were to go for a public API, I'd use block. If the API is private and the team has successfully adopted the convention of a minimalist API, then I'd pick the update.
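Whichever name the API exposes, the backend can keep a single code path. A sketch, assuming hypothetical updateUser and blockUser functions and an in-memory store:

```typescript
type User = { id: string; isBlocked: boolean };
type UserUpdateInput = { newIsBlocked?: boolean };

// Hypothetical in-memory store standing in for a real data layer.
const users = new Map<string, User>([["1", { id: "1", isBlocked: false }]]);

function updateUser(id: string, input: UserUpdateInput): User {
  const existing = users.get(id);
  if (!existing) throw new Error(`User ${id} not found`);
  const updated: User = {
    ...existing,
    isBlocked: input.newIsBlocked ?? existing.isBlocked,
  };
  users.set(id, updated);
  return updated;
}

// block adds a name to the API, not a second implementation.
function blockUser(id: string): User {
  return updateUser(id, { newIsBlocked: true });
}
```

Seen this way, the debate is about what the schema communicates rather than about what the server has to do.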

Example - Querying arrays of attributes

It can be tempting to be essential when fetching data.⁴ So, for this example, let's fetch a list of user slugs to pre-render profile pages by slug.

In addition to the base query of user profiles I now have:

extend type Query {
  users: [User!]!

  userSlugs: [String!]!
}

The query reads well, but the schema will blow up if you start making one for each field. I try to keep my habits at bay.

You know the drill, let's compare on the client-side:


query GetUserSlugs {
  userSlugs
}


query GetUserSlugs {
  users {
    id
    slug
  }
}

The first query is essential and feels like dark magic. It will corrupt your soul so handle with care!

The second approach, which traverses the user, is longer and has a minor inconvenience: I now need to map() over a list of objects to get an array of strings.

However, the slug is cached for each user and I often find that desirable.
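The map() step is small. A sketch, assuming the response shape of the second operation and a made-up payload:

```typescript
// Shape of the data returned when selecting users { id slug }.
type GetUserSlugsData = { users: { id: string; slug: string }[] };

function toSlugs(data: GetUserSlugsData): string[] {
  return data.users.map((user) => user.slug);
}

// Hypothetical response payload.
const data: GetUserSlugsData = {
  users: [
    { id: "1", slug: "ada" },
    { id: "2", slug: "grace" },
  ],
};

const slugs = toSlugs(data);
```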

Example - Adding to an array

This happens quite a lot. Let's add a Post to my User.

Now, the first time I tried to stick with the base blueprint I found this a tad mind-bending. The User becomes secondary and I am going to create a post with an authorId.

The alternative is to have an addPost mutation under the User's part of the API.

Let's first compare the client-side operations and break it down from there:


mutation AddPostToUser($input: CreatePostInput!) {
  post {
    create(input: $input) {
      id
      content
      authorId
    }
  }
}


mutation AddPostToUser($input: AddPostToUserInput!) {
  user {
    addPost(input: $input) {
      id
      content
      authorId
    }
  }
}

In the first operation, the minimal API approach, the connection to the user is indirect and hidden in the CreatePostInput. The operation name says it all, but naming is hard and I've lost count of the times I renamed things wondering what my past self meant.

In the second operation, the additional mutation makes the connection explicit. The concern that the number of mutations will blow up is somewhat smaller. To put the magnitude in perspective, we had mutations per field and now we have mutations per model.

Pick your poison.

Conclusion

Ultimately, I presented a mental model (my opinionated approach) of how I organise the GraphQL schema (API) and how I think about trade-offs.

On one hand I use an essential base blueprint, inspired by the CRUD naming convention. On the other hand I extend the API with more expressive RPC-like queries and mutations.

I combine the two to meet the needs of our users.

I toggle between minimalism, ease of discovery and finite expectations on one side, and clarity and expressiveness on the other.

As I toggle towards the base blueprint, I shift the maintenance and implementation responsibilities (cost) from the backend to the frontend.

This means the frontend needs to think about the names of the operations, which at times can feel mind-bending, like in the example of adding a Post to a User through a Post creation.

But it keeps the API clean.

Then, when I toggle towards the RPC-like queries and mutations, I go fast but I don't lose sight of the surrounding baseline.

The schema becomes hairier but not to the point where the ease for the frontend is overshadowed by the burdens of the backend.

Sometimes the optimum is in the middle, like with the block (user) mutation. And it's by toggling between the two that I get the most clarity.

¹ Organising inputs, e.g. for filtering and pagination, is a big topic which does affect the number of queries. However, I wanted to focus on the conceptual bare minimum.
² Sorry, could not restist. I am also dyslexic...
³ Which is usually weak as your spaghetti. You should salt pasta water. Too much GraphQL can harm your mental health.
⁴ The topic of over-fetching deserves its own post but also belongs in the debate between sticking to a convention at all costs and being pragmatic.

#graphql #programming