
4 min read
Andreas Marek

We follow a fundamental rule in GraphQL Java regarding Threads: GraphQL Java never creates Threads or interacts with Thread pools. We do this because we want to give the user full control, and whatever GraphQL Java did on its own would not be correct for every use case.

In addition to being strictly unopinionated regarding Threads, GraphQL Java is also fully reactive, implemented via CompletableFuture (CF). These two constraints together mean we rely on the CF returned by the user. Specifically, we piggyback on the CF returned by the DataFetcher (or by other async methods which can be implemented by the user, but we focus here on DataFetcher as it is by far the most important one).

// Pseudo code in GraphQL Java

CompletableFuture<Object> dataFetcherResult = invokeDataFetcher();
dataFetcherResult.thenApply(result -> {
    // which Thread this code runs in is controlled by the CF returned by the DataFetcher
    return continueExecutingQuery(result);
});
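This piggybacking can be demonstrated with plain JDK classes. The following self-contained sketch (the thread name "db-thread" is made up for illustration) shows that a non-async dependent stage registered before completion runs on whatever Thread completes the CF:

```java
import java.util.concurrent.CompletableFuture;

public class CfThreadDemo {

    static String run() throws Exception {
        CompletableFuture<String> dataFetcherResult = new CompletableFuture<>();

        // register the dependent stage before the CF is completed
        CompletableFuture<String> continuation =
                dataFetcherResult.thenApply(result ->
                        result + " handled on " + Thread.currentThread().getName());

        // complete the CF from a thread we control, as a DataFetcher would
        Thread fetcherThread = new Thread(
                () -> dataFetcherResult.complete("data"), "db-thread");
        fetcherThread.start();
        fetcherThread.join();

        // the continuation ran on the completing thread, not the main thread
        return continuation.join();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(run()); // prints "data handled on db-thread"
    }
}
```

This is exactly why the CF returned by the DataFetcher controls where GraphQL Java's subsequent work happens.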

Blocking DataFetcher

Let's assume you are accessing a DB in a blocking way in your DataFetcher:

String get(DataFetchingEnvironment env) {
    return getValueFromDb(env); // blocks the Thread until the value is read from the DB
}

This is not completely wrong, but it is not recommended in general: the consequence of this kind of DataFetcher is that GraphQL Java can't execute the query in the most efficient way.

Take the following query as an example:

{
dbData1
dbData2
dbData3
}

If the DataFetchers for these dbData fields don't return a CF, but block the Thread until the data is read, GraphQL Java will not work with maximum efficiency.

GraphQL Java can invoke the DataFetchers for all three fields in parallel. But if your DataFetcher for dbData1 is blocking, GraphQL Java will also be blocked and will only invoke the next DataFetcher once dbData1 is finished. The recommended solution to this problem is to offload your blocking code onto a separate Thread pool, as shown here:

CompletableFuture<String> get(DataFetchingEnvironment env) {
    return CompletableFuture.supplyAsync(() -> getValueFromDb(env), dbThreadPool);
}

This code maximizes performance by allowing all three fields to be fetched in parallel.
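To see why offloading restores parallelism, here is a runnable sketch (getValueFromDb is a hypothetical stand-in for a blocking DB call). It uses a CountDownLatch so that each simulated fetch only succeeds if all three invocations are running at the same time:

```java
import java.util.List;
import java.util.concurrent.*;
import java.util.stream.Collectors;

public class ParallelFetchDemo {

    // stand-in for a blocking DB call; it only returns "row" if all three
    // invocations are in flight concurrently, proving parallel execution
    static String getValueFromDb(CountDownLatch allRunning) {
        allRunning.countDown();
        try {
            boolean parallel = allRunning.await(5, TimeUnit.SECONDS);
            return parallel ? "row" : "timeout";
        } catch (InterruptedException e) {
            throw new CompletionException(e);
        }
    }

    static List<String> fetchAll() {
        ExecutorService dbThreadPool = Executors.newFixedThreadPool(3);
        CountDownLatch allRunning = new CountDownLatch(3);
        // one offloaded fetch per field, as the DataFetcher above would do
        List<CompletableFuture<String>> fields =
                List.of("dbData1", "dbData2", "dbData3").stream()
                        .map(field -> CompletableFuture.supplyAsync(
                                () -> getValueFromDb(allRunning), dbThreadPool))
                        .collect(Collectors.toList());
        List<String> results = fields.stream()
                .map(CompletableFuture::join)
                .collect(Collectors.toList());
        dbThreadPool.shutdown();
        return results;
    }

    public static void main(String[] args) {
        System.out.println(fetchAll()); // [row, row, row] only if fetched in parallel
    }
}
```

If the fetches ran one after another, the latch would never reach zero and the calls would time out instead.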

Different pools for different work

The subsequent work done by GraphQL Java will be executed on the same dbThreadPool until GraphQL Java encounters a new CF returned by user code; that new CF then dictates the Thread used for the subsequent work.

If you want separate pools for different kinds of work, one for the actual DataFetchers (which normally involve IO) and one for the actual GraphQL Java work (which is pure CPU), you need to switch back from your offloaded pool to a dedicated GraphQL Java pool before returning the CF. You can achieve this with code like this:

CompletableFuture<String> get(DataFetchingEnvironment env) {
    return CompletableFuture.supplyAsync(() -> getValueFromDb(env), dbThreadPool)
        .handleAsync((result, exception) -> {
            if (exception != null) throw new CompletionException(exception);
            return result;
        }, graphqlJavaPool);
}

Notice the .handleAsync, which doesn't do anything except forward the result, but on a different pool (graphqlJavaPool).

This way you have different pools for different kinds of work (one for the CPU bound GraphQL Java work and one or more for the IO bound work), which can be configured and monitored independently.
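The pool switch can be verified with a small self-contained sketch (the pool names are made up for illustration) that tags each stage with the name of the Thread it ran on:

```java
import java.util.concurrent.*;

public class PoolSwitchDemo {

    // single-threaded pool whose one thread carries a recognizable name
    static ExecutorService named(String name) {
        return Executors.newFixedThreadPool(1, r -> new Thread(r, name));
    }

    static String fetch() {
        ExecutorService dbThreadPool = named("db-pool");
        ExecutorService graphqlJavaPool = named("graphql-java-pool");
        try {
            return CompletableFuture
                    .supplyAsync(() -> "fetched on " + Thread.currentThread().getName(), dbThreadPool)
                    .handleAsync((result, exception) -> {
                        if (exception != null) throw new CompletionException(exception);
                        // subsequent work now runs on the GraphQL Java pool
                        return result + ", forwarded on " + Thread.currentThread().getName();
                    }, graphqlJavaPool)
                    .join();
        } finally {
            dbThreadPool.shutdown();
            graphqlJavaPool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(fetch()); // fetched on db-pool, forwarded on graphql-java-pool
    }
}
```

Because handleAsync is given an explicit Executor, the forwarding stage is guaranteed to run on graphqlJavaPool, which is exactly the switch described above.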

In a fully reactive system

If your system is fully reactive, your DataFetcher will look more like this:

CompletableFuture<String> get(DataFetchingEnvironment env) {
    return callAnotherServiceNonBlocking(env); // returns CompletableFuture
}

The code above could be implemented via Async Http Client or WebFlux WebClient. Both provide fully reactive HTTP clients.

Because the code is non-blocking, there is no need to offload anything onto a dedicated Thread pool to avoid blocking GraphQL Java.

You still might want to consider using a dedicated GraphQL Java pool, as you would otherwise use Threads which are dedicated to IO. How relevant this really is depends highly on your use case.

For example, Async Http Client (AHC) uses by default 2 * #cores Threads (this value actually comes from Netty). If you don't use a dedicated Thread pool for GraphQL Java, you might encounter situations under load where all AHC Threads are either busy or blocked by GraphQL Java code, and as a result your system is not as performant as it could be. Normally only load tests in production-like environments can show the relevance of different Thread pools.

Feedback or questions

We use GitHub Discussions for general feedback and questions.

5 min read
Brad Baker

Today we are looking into the graphql.schema.DataFetchingFieldSelectionSet and graphql.execution.DataFetcherResult objects as means to build efficient data fetchers.

The scenario

But first let's set the scene. Imagine we have a system that can return issues and the comments on those issues:

{
issues {
key
summary
comments {
text
}
}
}

Nominally we would have a graphql.schema.DataFetcher on issues that returns a list of issues and one on the field comments that returns the list of comments for each issue source object.

As you can see, this naively creates an N+1 problem, where we need to fetch data multiple times, once for each issue object in isolation.

We could attack this using the org.dataloader.DataLoader pattern, but there is another way, which we will discuss in this article.
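To make the N+1 shape concrete, here is a toy sketch (all names hypothetical, backed by an in-memory "database") that counts backend calls for the naive resolution:

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

public class NPlusOneDemo {

    static final AtomicInteger dbCalls = new AtomicInteger();

    // hypothetical backend: each method costs one backend call
    static List<String> fetchIssues() {
        dbCalls.incrementAndGet();
        return List.of("ISSUE-1", "ISSUE-2", "ISSUE-3");
    }

    static List<String> fetchCommentsFor(String issue) {
        dbCalls.incrementAndGet();
        return List.of(issue + "/comment");
    }

    // naive resolution: one comments query per issue -> 1 + N calls total
    static int naive() {
        dbCalls.set(0);
        for (String issue : fetchIssues()) {
            fetchCommentsFor(issue);
        }
        return dbCalls.get();
    }

    public static void main(String[] args) {
        System.out.println(naive()); // 4 calls for 3 issues (1 + N)
    }
}
```

The techniques below aim to collapse those 1 + N calls into a single fetch.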

Look ahead via DataFetchingFieldSelectionSet

The data fetcher behind the issues field is able to look ahead and see what sub fields are being asked for. In this case it knows that comments are being asked for and hence it could prefetch them at the same time.

graphql.schema.DataFetchingEnvironment#getSelectionSet (aka graphql.schema.DataFetchingFieldSelectionSet) can be used by data fetcher code to get the selection set of fields for a given parent field.

DataFetcher issueDataFetcher = environment -> {
    DataFetchingFieldSelectionSet selectionSet = environment.getSelectionSet();
    if (selectionSet.contains("comments")) {
        List<IssueAndCommentsDTO> data = getAllIssuesWithComments(environment, selectionSet.getFields());
        return data;
    } else {
        List<IssueDTO> issues = getAllIssuesWithNoComments(environment);
        return issues;
    }
};

Imagining this is backed by an SQL system, we might be able to use this field look-ahead to produce the following SQL:

SELECT Issues.Key, Issues.Summary, Comments.Text
FROM Issues
INNER JOIN Comments ON Issues.CommentID=Comments.ID;

So we have looked ahead and returned different data depending on the field sub selection. We have made our system more efficient by using look-ahead to fetch data just 1 time and not N+1 times.

Code Challenges

The challenge with this code design is that the shape of the returned data is now field sub selection specific. We needed an IssueAndCommentsDTO for one sub selection path and a simpler IssueDTO for another path.

With enough paths this becomes problematic, as it adds new DTO classes per path and makes our child data fetchers more complex.

Also, the standard graphql pattern is that the returned object becomes the source (i.e. graphql.schema.DataFetchingEnvironment#getSource) of the next child data fetcher. But we might have pre-fetched data that is needed 2 levels deep, and this is challenging to do since each data fetcher would need to capture and copy that data down to the layers below via new DTO classes per level.

Passing Data and Local Context

GraphQL Java offers a capability that helps with this pattern. It goes beyond what the reference graphql-js system gives you, where the object you return is automatically the source of the next child fetcher and that's all it can be.

In GraphQL Java you can use the well known graphql.execution.DataFetcherResult to return three sets of values:

  • data - which will be used as the source on the next set of sub fields
  • errors - allowing you to return data as well as errors
  • localContext - which allows you to pass down field specific context

When the engine sees the graphql.execution.DataFetcherResult object, it automatically unpacks it and handles its three classes of data in specific ways.

In our example case we will use data and localContext to communicate between fields easily.

DataFetcher issueDataFetcher = environment -> {
    DataFetchingFieldSelectionSet selectionSet = environment.getSelectionSet();
    if (selectionSet.contains("comments")) {
        List<IssueAndCommentsDTO> data = getAllIssuesWithComments(environment, selectionSet.getFields());

        List<IssueDTO> issues = data.stream().map(dto -> dto.getIssue()).collect(toList());

        Map<IssueDTO, List<CommentDTO>> preFetchedComments = mkMapOfComments(data);

        return DataFetcherResult.newResult()
                .data(issues)
                .localContext(preFetchedComments)
                .build();
    } else {
        List<IssueDTO> issues = getAllIssuesWithNoComments(environment);
        return DataFetcherResult.newResult()
                .data(issues)
                .build();
    }
};
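The mkMapOfComments helper above is left undefined; a minimal sketch of one possible shape, using hypothetical record DTOs and plain streams, might be:

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class PreFetchDemo {

    // hypothetical DTOs standing in for the article's IssueDTO / CommentDTO
    record CommentDTO(String text) {}
    record IssueDTO(String key) {}
    record IssueAndCommentsDTO(IssueDTO issue, List<CommentDTO> comments) {}

    // one possible shape of the mkMapOfComments helper: index the pre-fetched
    // comments by their issue so child fetchers can look them up by source object
    static Map<IssueDTO, List<CommentDTO>> mkMapOfComments(List<IssueAndCommentsDTO> data) {
        return data.stream()
                .collect(Collectors.toMap(IssueAndCommentsDTO::issue, IssueAndCommentsDTO::comments));
    }

    static String demo() {
        IssueDTO issue = new IssueDTO("ISSUE-1");
        List<IssueAndCommentsDTO> data = List.of(
                new IssueAndCommentsDTO(issue, List.of(new CommentDTO("looks good"))));
        Map<IssueDTO, List<CommentDTO>> preFetched = mkMapOfComments(data);
        // the comments data fetcher would do this lookup with its source object
        return preFetched.get(issue).get(0).text();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // looks good
    }
}
```

Any structure keyed by the source object works here; a Map is simply the most direct fit for the lookup the child fetcher performs.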

If you look now you will see that our data fetcher returns a DataFetcherResult object that contains data for the child data fetchers, which is the list of IssueDTO objects as per usual. It will be their source object when they run.

It also passes down field specific localContext which is the pre-fetched comment data.

Unlike the global context object, local context objects are passed down from a specific field to its children and are not shared with peer fields. This means a parent field has a "back channel" to talk to its child fields without having to "pollute" the DTO source objects with that information. It is "local" in the sense that it is given only to this field and its children, and not to any other field in the query.

Now let's look at the comments data fetcher and how it consumes this back channel of data:

DataFetcher commentsDataFetcher = environment -> {
    IssueDTO issueDTO = environment.getSource();
    Map<IssueDTO, List<CommentDTO>> preFetchedComments = environment.getLocalContext();
    List<CommentDTO> commentDTOS = preFetchedComments.get(issueDTO);
    return DataFetcherResult.newResult()
            .data(commentDTOS)
            .localContext(preFetchedComments)
            .build();
};

Notice how it got the IssueDTO as its source object as expected, but it also got a local context object, which is our pre-fetched comments. It can choose to pass on a new local context, OR if it passes nothing, then the previous value will bubble down to the next lot of child fields. So you can think of localContext as being inherited unless a field's data fetcher explicitly overrides it.

Our data fetcher is a bit more complex because of the data pre-fetching, but localContext gives us a nice back channel to pass data without modifying the DTO objects that are used in the simpler data fetchers.

Passing back Errors or Data or Both

For completeness, we will show that you can also pass down errors, data, local context, or all of them at once.

It is perfectly valid in graphql to fetch data and ALSO send back errors. It's not common, but it's valid. Some data is better than no data.

GraphQLError error = mkSpecialError("Its Tuesday");

return DataFetcherResult.newResult()
        .data(commentDTOS)
        .error(error)
        .build();

4 min read
Andreas Marek

Welcome to the new series "GraphQL deep dive" where we will explore advanced or unknown GraphQL topics. The plan is to discuss things mostly in a language and implementation neutral way, even if it is hosted on graphql-java.com.

Merged Fields

First thing we are looking at is "merged fields".

GraphQL allows for a field to be declared multiple times in a query as long as it can be merged.

Valid GraphQL queries are:

{
foo
foo
}
{
foo(id: "123")
foo(id: "123")
foo(id: "123")
}
{
foo(id: "123") {
id
}
foo(id: "123") {
name
}
foo(id: "123") {
id
name
}
}

Each of these queries will produce a result with just one "foo" key, not two or three.

Invalid queries are:

{
foo
foo(id: "123")
}
{
foo(id: "123")
foo(id: "456", id2: "123")
}
{
foo(id: "123")
foo: foo2
}

The reason why they are not valid is that the fields are different: in the first two examples the arguments differ, and the third query actually has two different fields under the same key.

Motivation

The examples so far don't seem really useful, but it all makes sense when you add fragments:

{
...myFragment1
...myFragment2
}

fragment myFragment1 on Query {
foo(id: "123") {
name
}
}
fragment myFragment2 on Query {
foo(id: "123") {
url
}
}

Fragments are designed to be written by different parties (for example, different components in a UI) which should not know anything about each other. Requiring that every field be declared only once would make this objective infeasible.

But field merging, as long as the fields are the same, allows fragments to be authored independently of each other.

Rules when fields can be merged

The specific details when fields can be merged are written down in Field Selection Merging in the spec.

The rules are what you would expect in general: they basically say that the fields must be the same. The following examples are taken from the spec and they are all valid:

fragment mergeIdenticalFields on Dog {
name
name
}
fragment mergeIdenticalAliasesAndFields on Dog {
otherName: name
otherName: name
}
fragment mergeIdenticalFieldsWithIdenticalArgs on Dog {
doesKnowCommand(dogCommand: SIT)
doesKnowCommand(dogCommand: SIT)
}
fragment mergeIdenticalFieldsWithIdenticalValues on Dog {
doesKnowCommand(dogCommand: $dogCommand)
doesKnowCommand(dogCommand: $dogCommand)
}

The most complex case happens when you have fields in fragments on different types:

fragment safeDifferingFields on Pet {
... on Dog {
volume: barkVolume
}
... on Cat {
volume: meowVolume
}
}

This is normally invalid because volume is an alias for two different fields, barkVolume and meowVolume. But because only one of them is actually resolved, and they both return a value of the same type (we assume here that barkVolume and meowVolume are both of the same type), it is valid.

fragment safeDifferingArgs on Pet {
... on Dog {
doesKnowCommand(dogCommand: SIT)
}
... on Cat {
doesKnowCommand(catCommand: JUMP)
}
}

This is again a valid case: even though the first doesKnowCommand has a different argument than the second doesKnowCommand, only one of them is actually resolved.

In the next example someValue has different types (we assume that nickname is a String and meowVolume is an Int) and therefore the query is not valid:

fragment conflictingDifferingResponses on Pet {
... on Dog {
someValue: nickname
}
... on Cat {
someValue: meowVolume
}
}

Sub selections and directives

One thing to keep in mind is that the sub selections of fields are merged together. For example, here foo is resolved once and then id and name are resolved:

{
foo(id: "123") {
id
}
foo(id: "123") {
name
}
}

This query is the same as:

{
foo(id: "123") {
id
name
}
}

The second thing to keep in mind is that different directives can be on each field:

{
foo(id: "123") @myDirective {
id
}
foo(id: "123") @myOtherDirective {
name
}
}

So if you want to know all the directives for the current field you are resolving, you actually need to look at all of the merged fields from the query.

Merged fields in graphql-js and GraphQL Java

In graphql-js merged fields are relevant when you implement a resolver and need access to the specific AST field of the query. The info object has a property fieldNodes which gives you access to all AST fields which were merged together.

In GraphQL Java, depending on the version you are running, you have List<Field> getFields() in the DataFetchingEnvironment, or for GraphQL Java newer than 12.0 you also have MergedField getMergedField(), which is the recommended way to access all merged fields.

One min read
Andreas Marek
caution

Spring for GraphQL is the official and current Spring integration. The integration is a collaboration between the Spring and GraphQL Java teams, and is maintained by the Spring team.

We recommend using Spring for GraphQL, rather than the older Spring project mentioned in this blog post.

See our Spring for GraphQL tutorial for how to get started.

We are happy to release the first version of the GraphQL Java Spring (Boot) project.

As described before, this project complements the GraphQL Java core project if you want to build a fully operational GraphQL server with Spring.

Currently it supports GET and POST requests and allows for some basic customization.

In the future we are looking into supporting more advanced features like file upload or subscriptions.

As always, contributions are more than welcome and we hope to grow this project together with the community: please open a new issue or leave a comment on spectrum chat about your wishes.

More details on how to use it can be found on the GitHub page: https://github.com/graphql-java/graphql-java-spring

One min read
Brad Baker

One of the most common questions we get in GraphQL Java land is "can we have a datetime scalar".

This is not defined by the graphql specification per se, so we are reluctant to add it to the core library and then have it turn up later as an officially specified type.

But it really is a badly needed type in your GraphQL arsenal, and hence graphql-java-extended-scalars was born:

https://github.com/graphql-java/graphql-java-extended-scalars

This will be a place where we can add non standard but useful extensions to GraphQL Java.

The major scalars we have added on day one are:

  • The aforementioned DateTime scalar, as well as a Date and a Time scalar

  • An Object scalar, sometimes known as a JSON scalar, that allows a map of values to be returned as a scalar value

  • Some numeric scalars that constrain the values allowed, such as PositiveInt

  • A Regex scalar that constrains a string to match a regular expression

  • A Url scalar that produces java.net.URL objects at runtime

  • And finally an aliasing technique that allows you to create more meaningfully named scalar values

We hope you find them useful.

Cheers,

Brad