
Dataloader

What is it

Dataloader sits on top of your data access layer. It de-dupes outgoing requests by batching (loads requested in the same tick are debounced into a single call), and it caches the data response per key using memoization.
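To make the batching and memoization ideas concrete, here is a minimal sketch of a loader written from scratch. It is not the `dataloader` package's implementation; `TinyLoader` and its fields are hypothetical names used only for illustration, and the real library handles many more cases (cache clearing, batch size limits, and so on).

```javascript
// Simplified illustration only; the real dataloader package is more thorough.
class TinyLoader {
  constructor(batchFn) {
    this.batchFn = batchFn; // receives an array of keys, resolves to an array of values
    this.cache = new Map(); // key -> Promise (memoization)
    this.queue = [];        // loads waiting for the next batch
  }

  load(key) {
    // Memoization: a repeated key reuses the promise from the first call.
    if (this.cache.has(key)) return this.cache.get(key);

    const promise = new Promise((resolve, reject) => {
      this.queue.push({ key, resolve, reject });
      // The first queued key schedules one dispatch for the whole batch.
      if (this.queue.length === 1) {
        queueMicrotask(() => this.dispatch());
      }
    });
    this.cache.set(key, promise);
    return promise;
  }

  async dispatch() {
    const batch = this.queue;
    this.queue = [];
    try {
      // One call to the data source for every key queued so far.
      const values = await this.batchFn(batch.map(item => item.key));
      batch.forEach((item, i) => item.resolve(values[i]));
    } catch (err) {
      batch.forEach(item => item.reject(err));
    }
  }
}
```

With this sketch, calling `load(1)`, `load(2)`, and `load(1)` synchronously results in a single `batchFn` call with the keys `[1, 2]`; the second `load(1)` is served from the cache.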

A new dataloader is created for every incoming request, so cached data is never shared between requests.

Purpose: Solves the N+1 problem in GraphQL.

  • Because data is fetched at the field level, we run the risk of overfetching and N+1 queries.

/* in middleware file */
const DataLoader = require('dataloader');

const userLoader = new DataLoader(keys => myBatchGetUsers(keys));
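The batch function passed to `new DataLoader(...)` is expected to resolve to an array with one entry per key, in the same order as the keys it received. A query such as `WHERE id IN (...)` guarantees neither order nor presence, so the batch function usually has to re-align the rows. The following is a rough sketch of what `myBatchGetUsers` might look like; `findUsersByIds` and the `USERS` data are hypothetical stand-ins for a real data source.

```javascript
// Hypothetical stand-in for a real data source (e.g. SELECT ... WHERE id IN (...)).
// Rows come back in arbitrary order and unknown ids are simply absent.
const USERS = [
  { id: 3, name: 'Cy' },
  { id: 1, name: 'Sam' },
];

async function findUsersByIds(ids) {
  return USERS.filter(user => ids.includes(user.id));
}

// Batch function: resolves to one entry per key, in the same order as `keys`.
async function myBatchGetUsers(keys) {
  const rows = await findUsersByIds(keys);
  const byId = new Map(rows.map(row => [row.id, row]));
  // A missing row becomes an Error for that key only, not for the whole batch.
  return keys.map(key => byId.get(key) ?? new Error(`No user with id ${key}`));
}
```

Returning an `Error` instance in a given slot lets the loader reject only the `load()` call for that key, while the other keys in the batch still resolve.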

Because a dataloader is created, and batches, per data entity (e.g. User table, Books table), each entity can come from a different source.

  • e.g. User from SQL DB, Books from NoSQL DB

Architecture Layer

Dataloader(s) are defined in the context or the middleware layer.

The loaders defined there are then called from the resolvers.

/* middleware.js or in `context` definition */
function dataLoadersMiddleware(ctx, next) {
  // dataloader definition as a middleware function
  // create loaders, put into context
}

const server = new ApolloServer({
  typeDefs,
  resolvers,
  context: () => {
    return {
      someFieldLoader: new DataLoader(async keys => {
        // NOTE: a batch function must resolve to an array with one
        // result per key, in the same order as `keys`.
        const someField = await fetch('someSource');
        return someField;
      }),
    };
  },
});

/* app.ts */
// middleware approach
app
  // .use(otherMiddleware)
  .use(dataLoadersMiddleware)
  // .use(someOtherMiddlewares)

/* resolver.js */
async function f(parent, args, ctx) {
  // Validations
  const f = await ctx.dataLoaders.fLoader.load(args.fId);
  return {
    ...f,
    gs: await ctx.dataLoaders.gLoader.loadMany(f.gs),
  };
}

async function g(self, args, ctx) {
  return {
    ...self,
    hs: await ctx.dataLoaders.hLoader.loadMany(self.hs),
  };
}

// etc...

Illustration

You can have a situation where a nested field could cause duplicate parent-level field requests.

query {
  name
  attendee {
    events {
      title
      attendees {
        name
      }
    }
  }
}

If attendees can attend more than one event, we overfetch by making duplicate requests for the same attendee across events.

  • e.g. if attendee Sam attends two events, GraphQL queries for Sam's name twice, once per event.

We can fix this by making the resolver lighter at the parent level and shifting the resolver function to the fields.

  • e.g. a resolver for each of the events, title, and attendees fields instead of a single resolver at events.
    • this makes each child field responsible for fetching its own data.

Dataloader leverages this structure breakdown to batch and de-dupe requests.
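As a rough illustration of the difference, the sketch below compares the lookups a naive per-field fetch would make with the keys a batched, de-duped load would send. The `events` data and both function names are hypothetical; only the counting is the point.

```javascript
// Hypothetical data: Sam (id 1) attends both events.
const events = [
  { title: 'Conference', attendeeIds: [1, 2] },
  { title: 'Meetup', attendeeIds: [1, 3] },
];

// Naive: one lookup per attendee reference, so id 1 is looked up twice.
function naiveLookups(eventList) {
  return eventList.flatMap(event => event.attendeeIds);
}

// Batched and de-duped: the unique keys are collected into a single lookup.
function batchedLookup(eventList) {
  return [...new Set(eventList.flatMap(event => event.attendeeIds))];
}
```

Here the naive approach issues four separate lookups (`1, 2, 1, 3`), while the batched approach issues one lookup carrying three unique keys (`1, 2, 3`).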

Field name differences between the service-layer data response and the GraphQL data response are mapped in the dataloader.

  • This works because the dataloader only cares about the actual queries generated from its keys.
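One way this mapping could look is sketched below. The snake_case field names, `fetchAttendeeRows`, `toGraphqlAttendee`, and `batchGetAttendees` are all hypothetical; the idea is only that the renaming happens once, inside the batch function handed to the dataloader.

```javascript
// Hypothetical service-layer rows use snake_case field names.
async function fetchAttendeeRows(ids) {
  const rows = [
    { attendee_id: 1, full_name: 'Sam' },
    { attendee_id: 2, full_name: 'Alex' },
  ];
  return rows.filter(row => ids.includes(row.attendee_id));
}

// Maps one service-layer row to the shape the GraphQL schema exposes.
function toGraphqlAttendee(row) {
  return { id: row.attendee_id, name: row.full_name };
}

// Batch function: looks rows up by key, then renames fields in one place.
async function batchGetAttendees(keys) {
  const rows = await fetchAttendeeRows(keys);
  const byId = new Map(rows.map(row => [row.attendee_id, toGraphqlAttendee(row)]));
  // Unknown keys resolve to null so the result stays aligned with `keys`.
  return keys.map(key => byId.get(key) ?? null);
}
```

Resolvers that call the loader then only ever see the GraphQL field names and need no knowledge of the service-layer naming.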