The Hidden Tax on Shopify Development: Rate Limits

Every Shopify API call has a cost. Not in dollars, but in rate limit quota.

You're building a bulk import script to sync 10,000 products from your ERP to Shopify. Naïve approach: loop through each product, make one API call per product = 10,000 calls. Shopify's REST API allows 2 requests/second for most apps = 5,000 seconds = 83 minutes to import.

But hit rate limiting midway, and your script hangs. No retry logic, no exponential backoff, no queue. The script fails. You manually restart it. You burn engineering time.

This is avoidable. Understanding Shopify's rate limiting—and how to handle it gracefully—is the difference between a friction-free integration and production incidents.

Shopify's Two Rate Limit Models

Shopify uses different models for REST and GraphQL APIs.

REST API: Bucket-Based Rate Limiting

The REST API uses a leaky bucket model:

  • You have a bucket with capacity (default: 40 requests per app, refills at 2 requests/second)
  • Each request consumes 1 unit
  • If the bucket is empty, requests are throttled (HTTP 429)
  • The bucket refills automatically

API Tier                  | Requests/Second | Burst Capacity | Refill Rate
Standard apps (default)   | 2 req/s         | 40 requests    | 2 req/s
Public apps (verified)    | 4 req/s         | 80 requests    | 4 req/s
Enterprise / Shopify Plus | 4 req/s         | 80 requests    | 4 req/s

Example: You make 10 requests in 1 second. Bucket had 40, now has 30. In 5 seconds, 10 more units refill (2 req/s × 5s = 10). Bucket now has 40 again.

Key insight: REST is a simple bucket. You can "burst" up to your bucket size, but sustained throughput is limited by refill rate. If you need 5 requests/second sustained, REST won't work (max 2 req/s). You need GraphQL.
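The burst-versus-sustained trade-off is easy to see in a toy simulation of the leaky bucket (illustrative only; the real limiter lives on Shopify's side):

```python
class LeakyBucket:
    """Toy model of Shopify's REST rate limiter (illustrative only)."""

    def __init__(self, capacity=40, refill_rate=2.0):
        self.capacity = capacity        # bucket size (burst limit)
        self.refill_rate = refill_rate  # units restored per second
        self.available = capacity

    def tick(self, seconds):
        """Refill the bucket after `seconds` of idle time."""
        self.available = min(self.capacity, self.available + self.refill_rate * seconds)

    def request(self):
        """Consume one unit; return False if the request would be throttled (429)."""
        if self.available < 1:
            return False
        self.available -= 1
        return True

bucket = LeakyBucket()

# Burst: the first 40 requests succeed, request 41 is throttled
results = [bucket.request() for _ in range(41)]
print(results.count(True))   # 40 succeed
print(results[-1])           # False -> would be an HTTP 429

# After 5 idle seconds, 10 units refill (2 req/s x 5 s), matching the example above
bucket.tick(5)
print(bucket.available)      # 10.0
```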

GraphQL API: Cost-Based Rate Limiting

GraphQL uses cost-based rate limiting:

  • Each query has a cost (1–100+ depending on what data you request)
  • You have a cost budget (for standard apps, a 1,000-point bucket that restores at 50 points/second)
  • Requests are throttled if you exceed budget
  • The budget refills automatically

Example costs:

Query                                                                 | Approximate Cost
{ shop { name } }                                                     | 1
{ products(first: 100) { edges { node { id title } } } }              | ~70
{ orders(first: 250) { edges { node { id lineItems(first: 10) { edges { node { product { id } } } } } } } } | 300+ (too expensive for a single request)

Key insight: GraphQL costs scale with query complexity and pagination. A simple product query is cheap (1–5 cost); a deep query with nested relationships is expensive (50–100+ cost). The cost is calculated before execution; if it exceeds your available budget, you get a THROTTLED error immediately.

Advantage of GraphQL: You can request more data per query. Example: with REST, fetching 10K products at 250 per page takes 40 calls for the products alone, and nested data like variants means additional calls; with GraphQL, one query per page of 250 products can include the nested data too, so the whole job stays at roughly 40 calls (each around 70 cost).
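Every GraphQL response carries this cost accounting in its extensions, so you can track your budget as you go. A minimal sketch of pulling it out (the sample payload follows the shape of Shopify's documented extensions.cost block):

```python
def extract_cost(body):
    """Pull the query-cost metadata Shopify attaches under extensions.cost."""
    cost = body.get("extensions", {}).get("cost", {})
    throttle = cost.get("throttleStatus", {})
    return {
        "actual": cost.get("actualQueryCost"),
        "requested": cost.get("requestedQueryCost"),
        "remaining": throttle.get("currentlyAvailable"),
    }

# Sample response body, shaped like Shopify's documented cost extension
sample = {
    "data": {"shop": {"name": "Demo"}},
    "extensions": {"cost": {
        "requestedQueryCost": 1,
        "actualQueryCost": 1,
        "throttleStatus": {"maximumAvailable": 1000.0,
                           "currentlyAvailable": 999.0,
                           "restoreRate": 50.0},
    }},
}
print(extract_cost(sample))  # {'actual': 1, 'requested': 1, 'remaining': 999.0}
```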

How to Detect Rate Limiting

When you hit a rate limit, the REST API returns HTTP 429 Too Many Requests; the GraphQL API signals throttling with a THROTTLED error in the response body.

REST API response:

HTTP/1.1 429 Too Many Requests
X-Request-Id: abc123xyz
X-Shopify-Shop-Api-Call-Limit: 40/40
Retry-After: 1

{
  "errors": {
    "error": ["API call limit exceeded. Please retry your request later."]
  }
}

Key headers:

  • X-Shopify-Shop-Api-Call-Limit: Shows current usage (e.g., "39/40" = 39 bucket slots used, 1 remaining; "40/40" = bucket full, the next request gets throttled).
  • Retry-After: Seconds to wait before retrying. Shopify typically says 1–2 seconds; play it safe and wait 2s.

GraphQL API response:

HTTP/1.1 200 OK

{
  "errors": [
    {
      "message": "Throttled",
      "extensions": {
        "code": "THROTTLED"
      }
    }
  ],
  "extensions": {
    "cost": {
      "requestedQueryCost": 105,
      "actualQueryCost": null,
      "throttleStatus": {
        "maximumAvailable": 1000.0,
        "currentlyAvailable": 55.0,
        "restoreRate": 50.0
      }
    }
  }
}

Key fields:

  • requestedQueryCost: What the query would have cost
  • throttleStatus.maximumAvailable: Your total cost budget
  • throttleStatus.currentlyAvailable: Points available right now
  • throttleStatus.restoreRate: Points restored per second (use it to compute how long to wait)

Note: unlike REST, a throttled GraphQL request comes back as HTTP 200 with a THROTTLED error in the body, so check the errors array, not just the status code.
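Putting those fields together, here is a small sketch that detects throttling and estimates the wait from the restore rate (it assumes the extensions.cost / throttleStatus shape from Shopify's documented GraphQL responses):

```python
def is_throttled(body):
    """Return True if a GraphQL response body carries a THROTTLED error."""
    for err in body.get("errors", []):
        if err.get("extensions", {}).get("code") == "THROTTLED":
            return True
    return False

def seconds_until_affordable(body):
    """Estimate how long to wait until the requested cost fits the budget."""
    cost = body.get("extensions", {}).get("cost", {})
    throttle = cost.get("throttleStatus", {})
    needed = cost.get("requestedQueryCost", 0) - throttle.get("currentlyAvailable", 0)
    restore = throttle.get("restoreRate", 50.0)  # points restored per second
    return max(0.0, needed / restore)

# Sample throttled body, shaped like Shopify's documented response
throttled = {
    "errors": [{"message": "Throttled",
                "extensions": {"code": "THROTTLED"}}],
    "extensions": {"cost": {
        "requestedQueryCost": 105,
        "actualQueryCost": None,
        "throttleStatus": {"maximumAvailable": 1000.0,
                           "currentlyAvailable": 55.0,
                           "restoreRate": 50.0},
    }},
}
print(is_throttled(throttled))              # True
print(seconds_until_affordable(throttled))  # 1.0 -> wait ~1s for 50 points to restore
```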

The Right Way to Handle 429: Exponential Backoff

Naive approach: When you get a 429, wait exactly Retry-After seconds and retry.

Problem: If multiple clients hit the same endpoint, they all retry at the same moment and collide. Everyone gets throttled again.

Solution: Exponential backoff with jitter.

import random
import time

import requests

def call_shopify_api_with_backoff(url, headers, max_retries=5):
    for attempt in range(max_retries):
        response = requests.get(url, headers=headers)
        
        if response.status_code == 429:
            # Shopify may send a fractional Retry-After (e.g. "2.0"), so parse as float
            retry_after = float(response.headers.get("Retry-After", "1"))
            # Exponential backoff: 2^attempt * retry_after
            # Add random jitter to avoid thundering herd
            backoff = (2 ** attempt) * retry_after + random.uniform(0, 1)
            print(f"Rate limited. Waiting {backoff:.2f}s before retry...")
            time.sleep(backoff)
            continue
        
        if response.status_code == 200:
            return response.json()
        
        # Other errors (500, 403, etc.)
        raise Exception(f"API error: {response.status_code}")
    
    raise Exception(f"Failed after {max_retries} retries")

Exponential backoff timeline:

  • Attempt 1: Wait 1 × 2^0 + jitter = 1–2 seconds
  • Attempt 2: Wait 1 × 2^1 + jitter = 2–3 seconds
  • Attempt 3: Wait 1 × 2^2 + jitter = 4–5 seconds
  • Attempt 4: Wait 1 × 2^3 + jitter = 8–9 seconds
  • Attempt 5: Wait 1 × 2^4 + jitter = 16–17 seconds (give up after this)
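Jitter aside, the base delays in that timeline are just powers of two times the Retry-After value:

```python
def backoff_schedule(retry_after=1, max_retries=5):
    """Base exponential backoff delays in seconds, before jitter is added."""
    return [(2 ** attempt) * retry_after for attempt in range(max_retries)]

print(backoff_schedule())  # [1, 2, 4, 8, 16]
```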

Why this works:

  • First retries happen quickly (1–2s)
  • Each retry waits longer, reducing collision probability
  • Jitter prevents synchronized retries (multiple clients all retrying at the same instant)
  • Max wait is ~17s; if you're still failing, something is wrong

Proactive Rate Limit Management: Don't Get Throttled

The best strategy is not getting throttled at all.

REST API: Stay Below Burst Capacity

Bad approach: Make 40 requests as fast as possible, then wait.

# DON'T DO THIS
for i in range(40):
    requests.get(url, headers=headers)  # Burst all 40 at once
# Now the bucket is empty; next request hangs

Good approach: Throttle yourself to sustainable rate.

# DO THIS
import time

import requests

rate_limit = 2  # match the 2 req/s refill rate; drop to 1.5 for extra headroom
for i in range(10000):
    requests.get(url, headers=headers)
    time.sleep(1 / rate_limit)  # Wait 0.5s between requests

Advanced approach: Monitor the X-Shopify-Shop-Api-Call-Limit header and adjust dynamically.

def get_with_rate_awareness(url, headers):
    response = requests.get(url, headers=headers)
    
    # Extract bucket state, e.g. "32/40"
    limit_header = response.headers.get("X-Shopify-Shop-Api-Call-Limit", "0/40")
    used, capacity = map(int, limit_header.split("/"))
    
    # Check the tighter threshold first, otherwise the 95% branch can never run
    if used > capacity * 0.95:  # 95% full
        print("Bucket 95% full, backing off aggressively...")
        time.sleep(2)
    elif used > capacity * 0.8:  # 80% full
        print("Bucket 80% full, slowing down...")
        time.sleep(0.5)
    
    return response.json()

GraphQL API: Use Cost Awareness + Bulk Operations

Cost-based limiting is trickier because queries have variable costs.

Problem: You write a query expecting cost 50, but it actually costs 150 (nested fields, deep pagination). Boom, throttled.

Solution 1: Estimate costs before executing.

The GraphQL API returns cost info under extensions.cost in every response, even when throttled:

requestedQueryCost: 250
throttleStatus.currentlyAvailable: 80
throttleStatus.restoreRate: 50

Use this to adjust your query:

def graphql_query_safe(run_query, query):
    # run_query is assumed to POST to the GraphQL endpoint and return the parsed JSON body
    body = run_query(query)
    
    throttled = any(
        err.get("extensions", {}).get("code") == "THROTTLED"
        for err in body.get("errors", [])
    )
    if throttled:
        cost = body["extensions"]["cost"]
        requested = cost["requestedQueryCost"]
        available = cost["throttleStatus"]["currentlyAvailable"]
        
        # If the query costs 250 but only 80 points are available, reduce pagination
        print(f"Query costs {requested}, only {available} points available")
        print("Reducing pagination: first: 100 -> first: 50")
        # Modify the query, retry
    
    return body

Solution 2: Use the Bulk Operations API (sidesteps query cost limits).

The Bulk Operations API lets you queue up operations without hitting rate limits:

mutation {
  bulkOperationRunQuery(query: """
    query {
      products(first: 250) {
        edges {
          node {
            id
            title
            variants(first: 250) {
              edges {
                node {
                  id
                }
              }
            }
          }
        }
      }
    }
  """) {
    bulkOperation {
      id
      status
      url
    }
  }
}

Advantage: Bulk operations run outside the normal query cost budget (the main constraint is that only one bulk operation can run per shop at a time). You queue the operation, Shopify processes it in the background, and sends you a webhook when the JSONL result file is ready. Perfect for high-volume data syncs.

Downside: Async. You can't get instant results.
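Besides the completion webhook, you can poll the currentBulkOperation query for status. Here is a sketch of the polling loop with the network call stubbed out so the logic is testable (the fetch_status callable stands in for a real GraphQL request returning the currentBulkOperation payload):

```python
import time

def classify_bulk_status(op):
    """Map a currentBulkOperation payload to the polling loop's next action."""
    status = op.get("status")
    if status == "COMPLETED":
        return "download"    # op["url"] points at the JSONL result file
    if status in ("FAILED", "CANCELED"):
        return "abort"
    return "wait"            # CREATED / RUNNING: keep polling

def wait_for_bulk_operation(fetch_status, poll_interval=5):
    """Poll until the bulk operation finishes; returns the JSONL download URL."""
    while True:
        op = fetch_status()
        action = classify_bulk_status(op)
        if action == "download":
            return op["url"]
        if action == "abort":
            raise RuntimeError(f"Bulk operation ended as {op['status']}")
        time.sleep(poll_interval)  # status checks are cheap, but don't hammer

# Demo with canned statuses instead of a live API call
statuses = iter([{"status": "RUNNING"},
                 {"status": "COMPLETED", "url": "https://example.com/result.jsonl"}])
url = wait_for_bulk_operation(lambda: next(statuses), poll_interval=0)
print(url)  # https://example.com/result.jsonl
```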

Optimization Tactics: Minimize API Calls

1. Batch Requests (REST)

Instead of 10,000 individual calls to fetch 10,000 products, paginate through them in batches:

# Bad: 10,000 calls (one per product)
for i in range(1, 10001):
    product = requests.get(f"/admin/api/2024-01/products/{i}.json").json()
    print(product["title"])

# Good: 40 calls (250 products per page, cursor-based pagination)
# Note: REST pagination uses Link-header cursors, not offset, since API version 2019-07
url = "/admin/api/2024-01/products.json?limit=250"
while url:
    response = requests.get(url)
    for product in response.json()["products"]:
        print(product["title"])
    url = response.links.get("next", {}).get("url")  # follow the Link header cursor

Impact: 10,000 → 40 calls. 99.6% reduction in API calls.

2. Use GraphQL for Complex Queries

REST requires multiple calls for nested data. GraphQL gets it in one query.

REST: Fetch product + variants + metafields = 3 calls per product = 30K calls for 10K products.

GraphQL: Fetch product + variants + metafields in one query = 1 call per 250 products = 40 calls for 10K products.

Impact: 30,000 → 40 calls. 99.9% reduction.

3. Use Webhooks Instead of Polling

Bad: Poll the API every 1 minute to check for new orders = 1,440 calls/day even if no orders exist.

Good: Subscribe to order webhooks. Shopify pushes new orders to you = 0 calls when no orders exist.

Implementation:

import json

from flask import Flask, request

app = Flask(__name__)

@app.route("/webhooks/orders/create", methods=["POST"])
def handle_order_created():
    order = json.loads(request.data)
    # Process the order (verify the HMAC header first in production)
    return "", 200

Impact: 1,440 polling calls/day drop to zero; you only do work when an order actually arrives (10 orders/day = 10 webhook deliveries instead of 1,440 polls).
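One caveat: verify each webhook delivery before trusting it. Shopify signs the raw request body with your app's shared secret and sends the base64-encoded HMAC-SHA256 digest in the X-Shopify-Hmac-Sha256 header. A minimal verification sketch (the secret and body here are dummy values):

```python
import base64
import hashlib
import hmac

def verify_shopify_webhook(raw_body: bytes, header_hmac: str, secret: str) -> bool:
    """Check the X-Shopify-Hmac-Sha256 header against the raw request body."""
    digest = hmac.new(secret.encode("utf-8"), raw_body, hashlib.sha256).digest()
    expected = base64.b64encode(digest).decode("utf-8")
    # Constant-time comparison to avoid timing attacks
    return hmac.compare_digest(expected, header_hmac)

# Self-contained check with a dummy secret (real values come from your app settings)
secret = "my-app-secret"
body = b'{"id": 12345}'
good = base64.b64encode(
    hmac.new(secret.encode(), body, hashlib.sha256).digest()
).decode()
print(verify_shopify_webhook(body, good, secret))        # True
print(verify_shopify_webhook(body, "tampered", secret))  # False
```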

4. Use Bulk Operations for Data Exports

Exporting 100K orders? Don't fetch them via REST pagination. Use Bulk Operations.

mutation {
  bulkOperationRunQuery(query: """
    query {
      orders(first: 250) {
        edges {
          node {
            id
            createdAt
            email
            totalPrice
          }
        }
      }
    }
  """) {
    bulkOperation {
      id
      status
    }
  }
}

Wait for the webhook, download the JSONL result file. Zero rate limiting headaches.

5. Cache Aggressively

Product details change infrequently. Cache them locally for 1 hour.

import time

import requests

cache = {}
cache_ttl = 3600  # 1 hour

def get_product(product_id):
    if product_id in cache and time.time() - cache[product_id]["timestamp"] < cache_ttl:
        return cache[product_id]["data"]
    
    # Fetch from API
    product = requests.get(f"/admin/api/2024-01/products/{product_id}.json").json()
    cache[product_id] = {"data": product, "timestamp": time.time()}
    return product

Impact: If you call get_product() 1,000 times in 1 hour for same 100 products, you make 100 calls instead of 1,000.

Rate Limit Quota Table (Quick Reference)

Scenario                            | API     | Rate Limit Reality                                   | Optimization
Fetch 10K products                  | REST    | 2 req/s, 40-request burst                            | Paginate 250/page: 40 calls
Fetch 10K products + variants       | REST    | 400+ calls (pagination only batches one resource)   | Switch to GraphQL
Fetch 10K products + variants       | GraphQL | Cost-based budget                                    | One nested query per 250 products: ~40 calls
Poll for new orders (10 orders/day) | REST    | 1,440 calls/day at 1x/min polling                    | Webhooks: ~10 deliveries
Export 100K orders for analytics    | REST    | Impractical (rate limiting blocks you)               | Bulk Operations: 1 call + JSONL download
Real-time inventory sync            | REST    | 80+ calls/min (unsustainable)                        | Webhooks + polling hybrid: ~20 calls/min

Common Mistakes (And How to Fix Them)

Mistake                                  | Symptom                                 | Fix
No retry logic                           | Script crashes on 429                   | Add exponential backoff (see code example)
Ignoring X-Shopify-Shop-Api-Call-Limit   | Blind to bucket state, can't optimize   | Monitor the header, adjust request rate dynamically
Fetching too much data per query         | High cost, frequent throttling          | Reduce pagination (first: 50 instead of 250), use Bulk Operations
Polling instead of webhooks              | 1,440 calls/day for 1 event/day         | Subscribe to webhooks; scale to zero calls when no events
Sequential requests without pagination   | 10,000 calls for 10K products           | Paginate: 40 calls
No caching                               | Re-fetching the same data repeatedly    | Cache locally for 1 hour, invalidate on webhook

Ready to Build Reliable Shopify Integrations?

Rate limiting isn't a bug to work around—it's a feature to design for. Apps that handle throttling gracefully are fast, reliable, and scale beyond what naive implementations can achieve.

At Tenten, we've built 50+ Shopify integrations. The ones that run smoothly all share one trait: they treat rate limits as a first-class design constraint, not an afterthought.

Ready to optimize your Shopify API integration? Schedule a technical strategy session with our team.


Editorial Note

Rate limiting is the difference between "works in testing, breaks in production" and "scales to 100K products." The merchants who understand Shopify's rate limiting architecture—and design for it from the start—build integrations that scale without incident. Those who ignore it end up with production firefighting and late-night debugging. The choice is yours, but the cost difference is significant.

Frequently Asked Questions

What's the difference between REST and GraphQL rate limiting?

REST uses a leaky bucket (2 req/s sustained, with 40-request bursts for standard apps). GraphQL uses cost-based limiting (standard apps get a 1,000-point budget that restores at 50 points/second), where each query costs from 1 to several hundred points depending on complexity. GraphQL is more flexible for complex queries; REST is simpler for simple data fetches.

Should I use REST or GraphQL?

Use GraphQL if you need nested/complex data in one call (products + variants + metafields together). Use REST for simple, single-resource fetches. Most modern integrations favor GraphQL because it's more efficient for real-world scenarios.

How do I know if I'm about to hit rate limiting?

Monitor the X-Shopify-Shop-Api-Call-Limit header (REST) or the extensions.cost block in each response (GraphQL). If the bucket is more than 80% full, slow down; above 95%, back off aggressively. Don't wait for a 429; proactive throttling is cleaner than reactive retry logic.

What if I need to make thousands of API calls?

Use the Bulk Operations API (exempt from query cost limits, async), webhooks (push instead of poll), or pagination (reduce call count). If you absolutely need synchronous real-time access, you'll hit limits. Accept that and design with rate limiting as a constraint.

Can I request a higher rate limit?

Shopify increases rate limits for verified public apps and Shopify Plus partners. Standard apps are stuck at 2 req/s (REST) or 50 cost points/s (GraphQL). Apply for higher limits if you're a Shopify Plus partner or public app; otherwise, optimize your integration to work within standard limits.

What's the best strategy for syncing large datasets (100K+ records)?

Use Bulk Operations API (no rate limiting, outputs to CSV/JSONL). Bulk ops are async but can process millions of records in minutes. Perfect for one-time data migrations, large exports, and batch processing. Combine with webhooks for ongoing incremental syncs.