Valstar.dev

Logging Slow Queries in Apollo GraphQL

13 min read


Sometimes when you’re developing you need to know what is slow so you can fix it. With Apollo your only options right now are to use Apollo Studio which has limitations or to hook your system into a telemetry system like prometheus, but what if you want something simpler?

I have released this as an NPM module as well: @mikevalstar/apolloslowquery

The Problem

Often while I’m working on a project I’ll find that a page or component will run slow, often times this ends up being the frontend waiting for the backend to respond. When this happens I will inspect the service call and try to track down what is slow.

When I was largely working with REST services this was much simpler, you would first check your database query time(s) as thats usually where the issue lies and most libraries (or the database itself) as a simple way to report on query times and can help to trace your issues. But if that wasn’t the issue, you were generally reading through a single simple path through the application to find what the issue is.

Enter GraphQL and your application is now running multiple resolvers on any given request (hopefully in parallel) and ultimately combining the data together. This can present an issue when debugging why a request is slow. Which of the resolvers is having the problem? is it because too many resolvers are being triggered in sequence? so being able to first find which queries your application is firing that are slow and tracing down which parts of the query are causing the issue can be your first step to improving the speed of your API.

So lets build an Apollo plugin that will both tell you which queries specifically are slow, but also give you insight into which resolver is the slow one!

Creating an Apollo Plugin

Our first step is to create a basic Apollo plugin and warn ourselves not to run this in production, this level of tracing will hurt overall performance of our app if it were to run in a heavy use environment.

With our plugin lets send in what we consider a slow query threshold so we’re not logging every query and start by getting the high resolution time on each request and keeping it for later so we can be extra precise.

const aServer = new ApolloServer({
  schema: schemaWithResolvers,
  plugins: [
    apolloSlowQuery({slow: 20}),
  ],
});
const apolloSlowQuery = ({slow}) => {

  return {

    // Inform the user they are doing something dangerous
    async serverWillStart() {
      console.warn('🦥 Slow query log plugin enabled! do not use in production');
    },

    async requestDidStart() {
      const queryDate = new Date();
      const reqStart = process.hrtime.bigint();
    },

  };
};

export default apolloSlowQuery;

Logging Slow Queries

Now for the most basic version we will log the time that we started the execution so we know how long the server spent parsing and waiting to run the query. and we’ll also grab the time it’s about to send the results to the client.

Additionally lets grab some additional things we’ll want to know later: what was the query?, what variables were sent?, what was our response? and the operation name from the client.

Since we’re using high resolution time, lets convert it from nano seconds to milliseconds and round to 4 decimal places so we have some reasonably easy numbers to read.

Lastly lets output the operation name and how long it took to the console when we’ve exceeded our slow query threshold.

function endTimer(time) {
  function roundTo(decimalPlaces, numberToRound) {
    const factorOfTen = Math.pow(10, decimalPlaces);

    return +(Math.round(numberToRound * factorOfTen)  / factorOfTen);
  }
  const diff = process.hrtime.bigint() - time;
  const elapsed = Number(diff) / 1000000; // convert from bigint to regular number, it'll be small enough

  return roundTo(4, elapsed); // Result in milliseconds to 4 decimal places
}

const apolloSlowQuery = ({slow, folder, format, ignoreOperations}) => {

  // make into an array so teh includes will always work
  const ignore = Array.isArray(ignoreOperations) ? ignoreOperations : [];

  return {

    // Inform the user they are doing somethign dangerous
    async serverWillStart() {
      console.warn('🦥 Slow query log plugin enabled! do not use in production');
    },

    async requestDidStart() {
      const queryDate = new Date();
      const reqStart = process.hrtime.bigint();
      const errors = [];

      let timeUntilExecutionMs = null;

      return {

        async executionDidStart() {
          timeUntilExecutionMs = endTimer(reqStart);
        },

        async didEncounterErrors({errors}) {
          errors.push(errors);
        },

        async willSendResponse(sendContext) {

          const durationMs = endTimer(reqStart);

          if(durationMs > slow) {
            const query = sendContext.source; // use instead of requestContext.request.query in case the client is hashing queries
            const reqOp = sendContext.operationName || 'Unknown';
            const variables = sendContext.request.variables;
            const resp = JSON.stringify(sendContext.response.data, null, 2);

            const output = {
              durationMs,
              timeUntilExecutionMs,
              requestedAt: queryDate,
              operation: reqOp,
              query,
              variables,
              response: resp,
              errors,
            };

            console.info(`🦥 Slow Query: ${reqOp} - ${durationMs} ms`);
            //console.info(output);
          }
        },
      };
    },

  };
};

export default apolloSlowQuery;

Tracing Resolvers

The last item we’ll want to track is a timing for each field resolver, both when did we start resolving, and how long did that resolve take:

return {
  willResolveField({info}) {
    if(!info || !info.path) {
      return; // just being safe
    }

    const fieldStart = endTimer(reqStart);
    const fieldTimer = process.hrtime.bigint();

    let p = info.path;

    let res = `${p.key}`;

    while ((p = p.prev) !== undefined) {
      res = `${p.key}.${res}`;
    }

    return () => {
      trace.push({
        path: res,
        start: fieldStart,
        duration: endTimer(fieldTimer),
      });
    };
  },
};

Detailed Output

Now that we have all of the details we need to help us debug our slow queries. Personally I like YAML format for reading these types of logs, so we’ll convert the json version to YAML and output it to a file if you pass in a folder and format.

Additionally I’m going to sneak into this last update a way to ignore specific operations for queries you know are slow and don’t expect to fix (maybe you’re calling an external API that is slow and hates you).

const apolloSlowQuery = ({slow, folder, format, ignoreOperations}) => {

  // make into an array so teh includes will always work
  const ignore = Array.isArray(ignoreOperations) ? ignoreOperations : [];

  // ...
  if(!ignore.includes(sendContext.operationName) && durationMs > slow) {
    const query = sendContext.source; // use instead of requestContext.request.query in case the client is hashing queries
    const reqOp = sendContext.operationName || 'Unknown';
    const variables = sendContext.request.variables;
    const resp = JSON.stringify(sendContext.response.data, null, 2);

    const output = {
      durationMs,
      timeUntilExecutionMs,
      requestedAt: queryDate,
      operation: reqOp,
      query,
      variables,
      response: resp,
      trace,
      errors,
    };

    console.info(`🦥 Slow Query: ${reqOp} - ${durationMs} ms`);

    if(folder) {
      const prefix = `${queryDate.getTime()}.${Math.floor(Math.random() * 1000 + 1)}.${reqOp}`;

      if(format === 'json') {
        fs.writeFile(path.join(folder, prefix + '.json'), JSON.stringify(output), fileWriteErrHandler);
      }else{
        fs.writeFile(path.join(folder, prefix + '.yaml'), stringify(output), fileWriteErrHandler);
      }
    }else{
      console.info(output);
    }
  }
  // ...
}

The Final Plugin

Lets put it all together and we get this:

import fs from 'fs';
import path from 'path';
import {stringify} from 'yaml';

function endTimer(time) {
  function roundTo(decimalPlaces, numberToRound) {
    const factorOfTen = Math.pow(10, decimalPlaces);

    return +(Math.round(numberToRound * factorOfTen)  / factorOfTen);
  }
  const diff = process.hrtime.bigint() - time;
  const elapsed = Number(diff) / 1000000; // convert from bigint to regular number, it'll be small enough

  return roundTo(4, elapsed); // Result in milliseconds to 4 decimal places
}

// Simple error handler for files, we're just going to try our best and not error out
const fileWriteErrHandler = err => {
  if(err) {
    console.error('Slow Query logger unable to write file', err);
  }
};

const apolloSlowQuery = ({slow, folder, format, ignoreOperations}) => {

  // make into an array so teh includes will always work
  const ignore = Array.isArray(ignoreOperations) ? ignoreOperations : [];

  return {

    // Inform the user they are doing somethign dangerous
    async serverWillStart() {
      console.warn('🦥 Slow query log plugin enabled! do not use in production');
    },

    async requestDidStart() {
      const queryDate = new Date();
      const reqStart = process.hrtime.bigint();
      const trace = [];
      const errors = [];

      let timeUntilExecutionMs = null;

      return {

        async executionDidStart() {
          timeUntilExecutionMs = endTimer(reqStart);

          return {
            willResolveField({info}) {
              if(!info || !info.path) {
                return; // just being safe
              }

              const fieldStart = endTimer(reqStart);
              const fieldTimer = process.hrtime.bigint();

              let p = info.path;

              let res = `${p.key}`;

              while ((p = p.prev) !== undefined) {
                res = `${p.key}.${res}`;
              }

              return () => {
                trace.push({
                  path: res,
                  start: fieldStart,
                  duration: endTimer(fieldTimer),
                });
              };
            },
          };
        },

        async didEncounterErrors({errors}) {
          errors.push(errors);
        },

        async willSendResponse(sendContext) {

          const durationMs = endTimer(reqStart);

          if(!ignore.includes(sendContext.operationName) && durationMs > slow) {
            const query = sendContext.source; // use instead of requestContext.request.query in case the client is hashing queries
            const reqOp = sendContext.operationName || 'Unknown';
            const variables = sendContext.request.variables;
            const resp = JSON.stringify(sendContext.response.data, null, 2);

            const output = {
              durationMs,
              timeUntilExecutionMs,
              requestedAt: queryDate,
              operation: reqOp,
              query,
              variables,
              response: resp,
              trace,
              errors,
            };

            console.info(`🦥 Slow Query: ${reqOp} - ${durationMs} ms`);

            if(folder) {
              const prefix = `${queryDate.getTime()}.${Math.floor(Math.random() * 1000 + 1)}.${reqOp}`;

              if(format === 'json') {
                fs.writeFile(path.join(folder, prefix + '.json'), JSON.stringify(output), fileWriteErrHandler);
              }else{
                fs.writeFile(path.join(folder, prefix + '.yaml'), stringify(output), fileWriteErrHandler);
              }
            }else{
              console.info(output);
            }
          }
        },
      };
    },

  };
};

export default apolloSlowQuery;

NPM Module

I heave released this as an npm module if you’re not looking to customize it: @mikevalstar/apolloslowquery