At CodingClave, we aren't dogmatic. We use the right tool for the job. For a long time, that tool appeared to be GraphQL. It promised to solve over-fetching, eliminate versioning headaches, and give clients the power to ask for exactly what they needed.

Internally, for our React and mobile clients, GraphQL remains a powerhouse. However, exposing a GraphQL endpoint as a Public API to third-party integrators became an architectural liability.

The flexibility that makes GraphQL great for internal frontend teams becomes a security nightmare and a performance bottleneck when exposed to the public. After six months of battling query complexity, non-deterministic latency, and CDN incompatibilities, we migrated our public-facing layer back to REST.

Here is the engineering breakdown of why we reverted, and the architecture we built to replace it.

The High-Stakes Problem: The "Graph" is a Black Box

The fundamental issue with Public GraphQL is the inversion of control. You are giving third-party developers—whose code quality you cannot control—the ability to construct arbitrary queries against your database schema.

We encountered three critical failures:

  1. Complexity Analysis Failures: It is incredibly difficult to calculate the complexity cost of a query before execution (see the sketch after this list). A nested query (e.g., User -> Posts -> Comments -> Author -> Posts...) can bring a database to its knees.
  2. CDN Incompatibility: GraphQL operates primarily over HTTP POST. This renders standard HTTP caching mechanisms (CDNs, browser caches, proxy caches) useless because the intent is buried in the request body, not the URL.
  3. Rate Limiting Ambiguity: How do you rate limit a request? By count? A simple query costs the same "1 request" as a query fetching 5,000 nodes. Complexity-based rate limiting is possible but introduces significant overhead.
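
To make the first failure concrete, here is a minimal sketch of pre-execution cost estimation over a parsed query, using graphql-js. It assumes every list field declares a first argument (defaulting to 1 when absent); real public schemas rarely guarantee this, which is exactly why pre-execution costing is so fragile.

// Sketch: pre-execution cost estimation with graphql-js. Assumption:
// list fields declare `first`, defaulting to 1 when absent. A real
// server must instead assume the schema's maximum page size.
import { parse, Kind, SelectionSetNode } from 'graphql';

// Multiply page sizes down the tree, counting every object fetched.
function costOf(set: SelectionSetNode | undefined, multiplier: number): number {
  if (!set) return 0;
  let cost = 0;
  for (const sel of set.selections) {
    if (sel.kind !== Kind.FIELD) continue; // ignore fragments for brevity
    const first = sel.arguments?.find(a => a.name.value === 'first');
    const pageSize =
      first && first.value.kind === Kind.INT ? Number(first.value.value) : 1;
    if (sel.selectionSet) {
      cost += multiplier * pageSize; // objects fetched at this level
      cost += costOf(sel.selectionSet, multiplier * pageSize); // their children
    }
  }
  return cost;
}

const doc = parse(`query {
  users(first: 100) {
    posts(first: 50) { comments(first: 50) { author { posts(first: 50) { title } } } }
  }
}`);

const op = doc.definitions[0];
if (op.kind === Kind.OPERATION_DEFINITION) {
  console.log(costOf(op.selectionSet, 1)); // ~13,000,000 objects for one request
}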

Technical Deep Dive: The Pivot

We moved from a single /graphql endpoint to a strictly typed, OpenAPI-compliant REST architecture. The goal was to enforce predictability.

The Vulnerability (GraphQL)

Consider this simplified schema allowing deep nesting. Even with graphql-depth-limit, calculating the "cost" of the following query dynamically is expensive:

# The "Malicious" Recursive Query
query {
  users(first: 100) {
    id
    posts(first: 50) {
      comments(first: 50) {
        author {
          posts(first: 50) {
            title
            # The database is now performing massive joins
          }
        }
      }
    }
  }
}

In a public API, you cannot trust the consumer to be a "good citizen."

The Solution (REST with Strict Expansion)

We implemented a RESTful pattern using sparse fieldsets and strict expansion limits. Instead of allowing arbitrary nesting, we pre-optimize specific access patterns.

Here is a simplified TypeScript implementation of how we handle resource expansion safely in our Controller layer, preventing the N+1 explosion:

// REST Controller: strictly typed expansion
import { Request, Response } from 'express';
import { getRepository } from 'typeorm';
import { User } from './entities/User';

export const getUser = async (req: Request, res: Response) => {
  const { id } = req.params;
  const { expand } = req.query; // e.g., ?expand=posts

  // 1. Strict allow-list for expansion
  const allowedExpansions = ['posts', 'profile'];
  const relations = (typeof expand === 'string' ? expand : '')
    .split(',')
    .filter(rel => allowedExpansions.includes(rel));

  try {
    // 2. Optimized Repository Call
    const user = await getRepository(User).findOne({
      where: { id },
      relations, 
      // 3. Disable eager relations so only the allow-listed
      //    relations above are loaded, preventing recursive expansion
      loadEagerRelations: false
    });

    if (!user) return res.status(404).json({ error: 'Not found' });

    // 4. Leveraging standard HTTP Caching
    res.set('Cache-Control', 'public, max-age=300, s-maxage=600');
    res.set('ETag', `"${user.version}_${user.id}"`);

    return res.json(user);
  } catch (error) {
    return res.status(500).json({ error: 'Internal Server Error' });
  }
};
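
For illustration (the ID is hypothetical), GET /users/42?expand=posts returns the user with posts populated, while ?expand=comments is silently dropped by the allow-list. Dropping unknown expansions instead of rejecting them keeps older clients working as the allow-list evolves.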

Architecture & Performance Benefits

By switching back to REST, we regained control over our infrastructure.

1. HTTP Caching & Edge Performance

This was the biggest win. With GraphQL, every request hit our origin servers because it was an uncacheable POST. With REST, we use GET requests for data retrieval.

  • Result: We offloaded 85% of public read traffic to Cloudflare and Varnish.
  • Mechanism: We use s-maxage to tell the CDN to hold content for 10 minutes, while max-age tells the client to cache for 5 minutes (a revalidation sketch follows this list).
  • Latency: Average response time for cached endpoints dropped from 240ms (processing GQL AST) to 25ms (CDN hit).
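
The ETag set in the controller above adds revalidation on top of expiry: once a cached copy goes stale, the client can confirm it is still current without re-downloading the body. Here is a minimal sketch of the conditional-request handling, assuming the same ETag format as the controller; sendCachedJson is an illustrative helper, not our production code.

// Sketch: conditional requests in Express, reusing the ETag format
// from the controller above. sendCachedJson is an illustrative helper.
import { Request, Response } from 'express';

export function sendCachedJson(req: Request, res: Response, etag: string, body: unknown) {
  res.set('Cache-Control', 'public, max-age=300, s-maxage=600');
  res.set('ETag', etag);

  // The client echoes the ETag back in If-None-Match; when it still
  // matches, a 304 lets it reuse its cached copy with no body transfer.
  if (req.headers['if-none-match'] === etag) {
    return res.status(304).end();
  }
  return res.json(body);
}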

2. Trivial Rate Limiting

We no longer need to calculate "query cost scores." We use the Token Bucket algorithm based on API keys and endpoints.

  • GET /users = 1 token.
  • POST /users = 5 tokens.

This is transparent to the user and O(1) complexity for our API Gateway to enforce, as the sketch below shows.
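
The bucket itself, assuming in-memory state keyed by API key (a production gateway would hold this state in Redis; the names and limits here are illustrative):

// Sketch: token-bucket rate limiting keyed by API key. In-memory
// state for illustration only.
interface Bucket {
  tokens: number;
  lastRefill: number; // epoch millis
}

const CAPACITY = 100;      // maximum burst, in tokens
const REFILL_PER_SEC = 10; // sustained rate, tokens per second
const buckets = new Map<string, Bucket>();

// O(1) per request: one map lookup plus constant arithmetic.
export function consume(apiKey: string, cost: number): boolean {
  const now = Date.now();
  const bucket = buckets.get(apiKey) ?? { tokens: CAPACITY, lastRefill: now };

  // Refill proportionally to elapsed time, capped at capacity.
  const elapsedSec = (now - bucket.lastRefill) / 1000;
  bucket.tokens = Math.min(CAPACITY, bucket.tokens + elapsedSec * REFILL_PER_SEC);
  bucket.lastRefill = now;

  const allowed = bucket.tokens >= cost;
  if (allowed) bucket.tokens -= cost;
  buckets.set(apiKey, bucket);
  return allowed;
}

// Usage: GET costs 1 token, POST costs 5.
// if (!consume(apiKey, req.method === 'GET' ? 1 : 5)) return res.status(429).end();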

3. Developer Onboarding (DX)

While tools like GraphiQL are excellent, curl remains the lingua franca of the web. Third-party integrators no longer need to learn our GraphQL Schema Definition Language (SDL); they read the OpenAPI (Swagger) docs and hit a URL. The reduction in support tickets about "malformed queries" was immediate.

How CodingClave Can Help

Implementing a high-performance public API isn't just about choosing between REST and GraphQL—it’s about understanding traffic patterns, edge caching strategies, and security perimeters.

While the logic above outlines why we switched, executing a migration like this without downtime or breaking existing integrations is a high-risk engineering challenge. Misconfiguring your CDN or failing to properly version your REST endpoints can lead to outages and lost customer trust.

CodingClave specializes in high-scale architecture modernization.

We help engineering teams:

  • Audit existing API performance and security vulnerabilities.
  • Architect hybrid solutions (Internal GraphQL + Public REST).
  • Implement aggressive Edge Caching strategies to reduce infrastructure costs.
  • Design strict OpenAPI specifications that govern your data flow.

If your team is struggling with API scalability or debating the right architectural direction, let’s talk.

Book a Technical Consultation with CodingClave