Dataloader
What is it
Dataloader sits on top of your data access layer. It de-dupes outgoing requests by batching them (a form of debouncing) and caches the data response using memoization.
Every new request creates a new dataloader, so cached data is not shared between requests.
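To make the batching and memoization idea concrete, here is a minimal sketch of a loader written from scratch. It is a simplified stand-in for illustration only, not the dataloader library's actual implementation; `TinyLoader`, `calls`, and the fake user objects are hypothetical names.

```javascript
// Simplified stand-in for illustration; NOT the dataloader library's code.
class TinyLoader {
  constructor(batchFn) {
    this.batchFn = batchFn; // (keys) => Promise of values, in the same order as keys
    this.cache = new Map(); // key -> Promise, memoizes repeated loads
    this.queue = [];        // keys waiting for the next batch
  }

  load(key) {
    // De-dupe: a key that was already requested reuses the same promise.
    if (this.cache.has(key)) return this.cache.get(key);
    const promise = new Promise((resolve, reject) => {
      this.queue.push({ key, resolve, reject });
      // The first queued key schedules a single flush after the current tick.
      if (this.queue.length === 1) process.nextTick(() => this.flush());
    });
    this.cache.set(key, promise);
    return promise;
  }

  async flush() {
    const pending = this.queue;
    this.queue = [];
    try {
      const values = await this.batchFn(pending.map(p => p.key));
      pending.forEach((p, i) => p.resolve(values[i]));
    } catch (err) {
      pending.forEach(p => p.reject(err));
    }
  }
}

// Hypothetical batch function that records the keys of every call it receives.
const calls = [];
const loader = new TinyLoader(async keys => {
  calls.push(keys);
  return keys.map(id => ({ id, name: `user-${id}` }));
});

// Three loads in the same tick: one batch call, and key 1 is fetched only once.
const demo = Promise.all([loader.load(1), loader.load(2), loader.load(1)]);
```

The three `load` calls collapse into a single batch call with keys `[1, 2]`, and both loads of key `1` resolve to the same object.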
Purpose: Solves the N+1 problem in GraphQL.
- Because data is fetched at the field level, we run the risk of overfetching and N+1 queries.
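The N+1 pattern can be shown by counting queries against a fake data source. This is a minimal sketch; the `db` object, its data, and both functions are hypothetical and only exist to show where the count comes from.

```javascript
// Hypothetical in-memory data source that counts every query it serves.
const db = {
  queries: 0,
  books: [
    { id: 1, title: 'A', authorId: 10 },
    { id: 2, title: 'B', authorId: 11 },
    { id: 3, title: 'C', authorId: 10 },
  ],
  users: new Map([
    [10, { id: 10, name: 'Sam' }],
    [11, { id: 11, name: 'Ana' }],
  ]),
  getBooks() { this.queries += 1; return this.books; },
  getUser(id) { this.queries += 1; return this.users.get(id); },
  getUsers(ids) { this.queries += 1; return ids.map(id => this.users.get(id)); },
};

// Field-level fetching without batching: 1 query for the list + 1 per book.
function naiveQueryCount() {
  db.queries = 0;
  const books = db.getBooks();
  books.forEach(book => db.getUser(book.authorId));
  return db.queries;
}

// Batched and de-duped: 1 query for the list + 1 for all unique authors.
function batchedQueryCount() {
  db.queries = 0;
  const books = db.getBooks();
  const uniqueIds = [...new Set(books.map(book => book.authorId))];
  db.getUsers(uniqueIds);
  return db.queries;
}
```

With three books, the naive version issues 4 queries (N+1) while the batched version issues 2, regardless of how many books there are.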
/* in middleware file */
const DataLoader = require('dataloader');

const userLoader = new DataLoader(keys => (
  myBatchGetUsers(keys)
));
Because a dataloader is created, and batches, per data entity (e.g. User table, Books table), each entity can come from a different source.
- e.g. User from a SQL DB, Books from a NoSQL DB
Architecture Layer
Dataloader(s) are defined in the resolvers.
If you have a frequently run query, you can also place them in the middleware layer.
Then the loaded definitions are called in the resolvers.
Middleware in Express example:
/* module */
function dataLoadersMiddleware(ctx, next) {
  // dataloader definition as a middleware function
  // create loaders, put into context
}

/* app.ts */
app
  // .use(otherMiddleware)
  .use(dataLoadersMiddleware)
  // .use(someOtherMiddlewares)

const server = new ApolloServer({
  typeDefs,
  resolvers,
  // the context function runs per request, so each request gets fresh loaders
  context: () => {
    return {
      someFieldLoader: new DataLoader(async keys => {
        // the batch function must resolve to an array with one entry per key, in key order
        const someField = await fetch('someSource')
        return someField;
      })
    }
  }
})

/* resolver.js */
async function f(parent, args, ctx) {
  const f = await ctx.dataLoaders.fLoader.load(args.fId);
  return {
    ...f,
    gs: await ctx.dataLoaders.gLoader.loadMany(f.gs),
  }
}

async function g(self, args, ctx) {
  return {
    ...self,
    hs: await ctx.dataLoaders.hLoader.loadMany(self.hs),
  }
}
// etc...
Illustration
You can have a situation where a nested field could cause duplicate parent-level field requests.
query {
  name
  attendee {
    events {
      title
      attendees {
        name
      }
    }
  }
}
If attendees can attend more than one event, we'd overfetch by making duplicate requests for the same attendees across those events.
- e.g. attendee Sam attends two events, so GraphQL would query for Sam's name twice (or more, if Sam attends more events).
We can fix this by making the resolver lighter at the parent level and shifting the resolver functions to the fields.
- e.g. a resolver for each of the events, title, and attendees fields instead of one resolver at events.
- this makes each child field responsible for fetching its own data.
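The split into field-level resolvers could look roughly like the sketch below. The resolver map shape follows the usual `Type: { field: fn }` convention; `ctx.db`, `ctx.loaders.attendee`, and `attendeeIds` are hypothetical names assumed to be set up per request elsewhere.

```javascript
// Hypothetical resolver map; `ctx.db` and `ctx.loaders` are assumptions.
const resolvers = {
  Query: {
    // Parent-level resolver stays light: it returns the event rows only.
    events: (_parent, _args, ctx) => ctx.db.getEvents(),
  },
  Event: {
    // `title` needs no resolver here: the default resolver reads it off the row.
    // The child field fetches its own data through a loader, which is what
    // gives the loader a chance to batch and de-dupe across events.
    attendees: (event, _args, ctx) =>
      ctx.loaders.attendee.loadMany(event.attendeeIds),
  },
};
```

Because every `attendees` field goes through the same loader, an attendee who appears in several events is only requested once per batch.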
Dataloader leverages this structure breakdown to batch and de-dupe requests.
Field name differences between the service layer data response and the GraphQL data response are mapped in the dataloader.
- Because the dataloader only cares about the actual queries generated from the keys.
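One place to do this mapping is the batch function itself. The sketch below assumes a hypothetical service call `fetchUsersByIds` that returns snake_case rows in arbitrary order; the field names `user_id` and `full_name` are made up for illustration. A function like `batchGetUsers` could then be passed to `new DataLoader(...)`.

```javascript
// Hypothetical service-layer fetch: rows come back in arbitrary order,
// use service-side field names, and unknown ids are simply missing.
async function fetchUsersByIds(ids) {
  const table = [
    { user_id: 2, full_name: 'Ana' },
    { user_id: 1, full_name: 'Sam' },
  ];
  return table.filter(row => ids.includes(row.user_id));
}

// Batch function: maps service field names to the GraphQL field names and
// returns exactly one entry per key, in the same order as the keys.
async function batchGetUsers(keys) {
  const rows = await fetchUsersByIds(keys);
  const byId = new Map(
    rows.map(row => [row.user_id, { id: row.user_id, name: row.full_name }])
  );
  // Keys with no matching row resolve to null so positions stay aligned.
  return keys.map(key => byId.get(key) ?? null);
}
```

Keeping the renaming inside the batch function means resolvers only ever see GraphQL-shaped objects, and the key-to-row alignment is handled in one place.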