From 15 Minutes to 3: Million-Row Excel Exports in Node.js

Cut export time from ~15 minutes to ~3 using queues and workers.

March 30, 2026 • 3 min read

Generating Excel files sounds trivial until your dataset hits a million rows.

In my current role, I worked on a system where users frequently exported large datasets. The implementation worked, but performance was a major issue: ~15 minutes per export under real-world load.

At that scale, it’s not just slow; it’s risky:

  • Long-running requests → timeouts
  • Memory spikes → unstable servers
  • Poor UX → users lose trust

This wasn’t something query optimization could fix.
It required rethinking the execution model.

The Original Approach (And Why It Failed)

The export pipeline was entirely request-driven:

  • Fetch data
  • Transform rows
  • Generate Excel
  • Write file
  • Upload and return URL

What went wrong

  • CPU-heavy work blocked the event loop
    XLSX generation and transformations ran on the main thread.

  • High memory usage
    Large datasets stayed in-process, leading to GC pressure and instability.

  • Fragile execution model
    A single request handled everything; any failure meant restarting the entire process.

Even after optimizing queries, performance barely improved.
The architecture itself was the bottleneck.

The Shift: Moving Work Out of the Request Cycle

To address this, I redesigned the export pipeline to move heavy processing into background jobs.

The new flow:

  • API validates request
  • Creates a job payload (filters, metadata, format)
  • Enqueues the job using BullMQ
  • Returns immediately

This made the API fast and reliable, regardless of dataset size.

Using BullMQ for Orchestration

We used BullMQ to handle:

  • Background processing
  • Retry strategies with backoff
  • Concurrency control
  • Progress tracking

This allowed failed jobs to retry safely without affecting user requests.
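A hypothetical set of job options for this kind of export job is shown below. The option names (attempts, backoff, removeOnComplete, removeOnFail) follow BullMQ's JobsOptions; the specific values are illustrative, not the ones used in production.

```javascript
// Retry and backoff configuration for the export job (values are examples).
const exportJobOptions = {
  attempts: 5,                                   // retry a failed job up to 5 times
  backoff: { type: 'exponential', delay: 5000 }, // 5s, 10s, 20s, ... between attempts
  removeOnComplete: true,                        // keep Redis tidy after success
  removeOnFail: false,                           // keep failed jobs for inspection
};

// In a real system these options are passed when enqueuing, e.g.:
// await exportQueue.add('export', payload, exportJobOptions);
```

Exponential backoff matters here: transient failures (a slow database, a storage hiccup) get progressively more breathing room instead of hammering the same dependency five times in a row.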

Parallel Processing with Worker Threads

The biggest bottleneck was CPU-heavy transformation.

To solve this, I implemented parallel processing using Node.js worker threads:

  • Split large datasets into batches
  • Process batches in parallel across worker threads
  • Merge results before writing to the final output

This removed the event loop bottleneck and allowed us to utilize multiple CPU cores effectively.

Stream-Based Excel Generation

Another key improvement was switching to a streaming approach for Excel generation:

  • Read data in batches
  • Process batches incrementally
  • Write rows directly to a stream
  • Upload progressively to storage

Why this mattered

  • Controlled memory usage
  • No large in-memory workbook
  • Much better stability under load

The Result

  • Export time: ~15 minutes → ~2–3 minutes
  • System remained stable under large workloads
  • No impact on API responsiveness
  • Failures became isolated and recoverable

Key Takeaways

  • Background jobs improve both reliability and performance
  • Node.js handles heavy workloads well if CPU-bound work is offloaded from the event loop
  • Batch size tuning is critical for performance vs memory trade-offs
  • Observability is essential for scaling systems like this

Closing Thoughts

This wasn’t about optimizing a function; it was about choosing the right execution model.

By moving heavy work out of the request cycle, parallelizing CPU-intensive tasks, and adopting streaming, we significantly improved both performance and system reliability.

If your Node.js application handles large report generation inside request handlers, this is one of the highest-impact architectural improvements you can make.

CC BY-NC 4.0 • 2026 © Gautam Suthar

Built with love <3

Gautam Suthar @ gautamsuthar.in