Handling Millions of Requests With PHP

October 27, 2025

When people think of high-performance web systems, PHP doesn’t always come to mind. Yet some of the world’s busiest platforms run on PHP or started there, including Facebook (which now uses Hack, a PHP-derived language), Wikipedia, WordPress.com, Etsy, Tumblr, Slack (some backend services), Mailchimp, and Drupal.org. These platforms handle millions to billions of requests daily, which shows that PHP can scale to very high traffic when it is used correctly.

Understand the PHP Runtime Model

PHP was traditionally designed for short-lived request-response cycles. Each request is handled by a PHP process (or thread, depending on your server) that starts from a clean state, runs the code, and discards that state when the response is sent. This simplicity makes PHP easy to deploy but presents scaling challenges.

PHP Scaling Challenges

Each request starts fresh, with no shared memory between requests.
Persistent connections and heavy computation can be costly.
Concurrency must often be achieved horizontally (via multiple processes or servers).

Optimize the Application Layer

Enable Opcache

PHP’s Opcache caches compiled bytecode in memory, eliminating the need to recompile scripts on every request.

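; php.ini
; Turn on the opcode cache
opcache.enable=1
; Shared memory for compiled scripts, in megabytes
opcache.memory_consumption=256
; Upper limit on the number of cached script files
opcache.max_accelerated_files=20000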

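Opcache alone can improve throughput substantially, by as much as 2x–3x on some high-traffic applications, although the gain depends on how much time the application spends compiling scripts rather than waiting on I/O.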
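To confirm that the cache is actually being hit, PHP exposes opcache_get_status(). The following is a minimal sketch: the helper name summarizeOpcache is hypothetical, and the script assumes it runs under the same SAPI that serves your traffic (the CLI usually has Opcache disabled).

```php
<?php
// Hypothetical helper: condenses the array returned by opcache_get_status()
// into the few numbers worth watching.
function summarizeOpcache(array $status): array
{
    $stats  = $status['opcache_statistics'] ?? [];
    $memory = $status['memory_usage'] ?? [];

    $hits   = (int) ($stats['hits'] ?? 0);
    $misses = (int) ($stats['misses'] ?? 0);
    $total  = $hits + $misses;

    return [
        'enabled'        => (bool) ($status['opcache_enabled'] ?? false),
        'hit_rate'       => $total > 0 ? round($hits / $total * 100, 2) : 0.0,
        'cached_scripts' => (int) ($stats['num_cached_scripts'] ?? 0),
        'free_memory_mb' => round(($memory['free_memory'] ?? 0) / 1048576, 1),
    ];
}

// opcache_get_status() returns false when Opcache is disabled, and the
// function does not exist when the extension is not loaded.
$status = function_exists('opcache_get_status') ? opcache_get_status(false) : false;
if (is_array($status)) {
    print_r(summarizeOpcache($status));
} else {
    echo "Opcache is not enabled for this SAPI\n";
}
```

A hit rate that stays low, or free memory that approaches zero, suggests raising opcache.memory_consumption or opcache.max_accelerated_files.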

Use Efficient Autoloading

Composer’s autoloader is powerful, but can slow down under large dependency graphs. Use:

composer dump-autoload -o

The -o flag generates an optimized class map, which reduces lookup time.
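To see why a class map helps, compare the two lookup styles. This is a simplified illustration, not Composer's actual ClassLoader; the function names and the /srv/app path are assumptions made for the example.

```php
<?php
// Simplified illustration only; Composer's real ClassLoader is more involved.

// PSR-4 style: derive a candidate path from the class name, then ask the
// filesystem whether that file exists.
function psr4Path(string $class, array $prefixes): ?string
{
    foreach ($prefixes as $prefix => $baseDir) {
        if (strncmp($class, $prefix, strlen($prefix)) === 0) {
            $relative = substr($class, strlen($prefix));
            $file = $baseDir . '/' . str_replace('\\', '/', $relative) . '.php';
            if (is_file($file)) {   // filesystem check for each candidate
                return $file;
            }
        }
    }
    return null;
}

// Class-map style: a single array lookup, no filesystem check.
function classMapPath(string $class, array $classMap): ?string
{
    return $classMap[$class] ?? null;
}

// Hypothetical class and path, standing in for a generated class map.
$classMap = ['App\\Service\\Mailer' => '/srv/app/src/Service/Mailer.php'];
var_dump(classMapPath('App\\Service\\Mailer', $classMap));
```

The class map trades a larger generated file for fewer filesystem calls, which is why it has to be regenerated whenever classes are added or moved.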

Cache Everything You Can

Implement multiple layers of caching:

Application-level cache: Use Redis or Memcached for query results, API responses, or computed data.
Full-page cache: Tools like Varnish or Nginx FastCGI cache can serve cached HTML without invoking PHP at all.
Opcode cache: As mentioned, ensures PHP code doesn’t recompile unnecessarily.
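The application-level layer usually follows the cache-aside pattern: look in the cache first, and compute and store the value only on a miss. Below is a minimal sketch. ArrayCache is an in-memory stand-in for Redis or Memcached so the example runs on its own, and remember is a hypothetical helper name.

```php
<?php
// In-memory stand-in for Redis or Memcached, so the sketch runs on its own.
// A real deployment would swap this class for a client of the cache server.
final class ArrayCache
{
    private array $items = [];

    public function get(string $key, int $now): mixed
    {
        $item = $this->items[$key] ?? null;
        if ($item === null || $item['expires'] <= $now) {
            return null;   // missing or expired
        }
        return $item['value'];
    }

    public function set(string $key, mixed $value, int $ttl, int $now): void
    {
        $this->items[$key] = ['value' => $value, 'expires' => $now + $ttl];
    }
}

// Cache-aside: return the cached value if present; otherwise compute it,
// store it with a TTL, and return it. $now is injectable to keep the demo
// deterministic.
function remember(ArrayCache $cache, string $key, int $ttl, callable $compute, ?int $now = null): mixed
{
    $now ??= time();
    $cached = $cache->get($key, $now);
    if ($cached !== null) {
        return $cached;
    }
    $value = $compute();
    $cache->set($key, $value, $ttl, $now);
    return $value;
}

$cache = new ArrayCache();
$calls = 0;
$loadUser = function () use (&$calls) {
    $calls++;   // stands in for a slow database query
    return ['id' => 42, 'name' => 'Ada'];
};

remember($cache, 'user:42', 60, $loadUser, 1000);   // miss: runs the query
remember($cache, 'user:42', 60, $loadUser, 1030);   // hit: served from cache
remember($cache, 'user:42', 60, $loadUser, 1061);   // expired: runs it again
echo "query executed {$calls} times\n";
```

Because null is used as the "not found" signal, this sketch cannot cache a null result; a production version would need a separate way to mark misses.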

Horizontal Scaling with Load Balancing

When vertical scaling (adding CPU/RAM) hits its limits, distribute the load horizontally.

Architecture Example

              +-------------------+
              |   Load Balancer   |
              +---------+---------+
                        |
          +-------------+-------------+
          |                           |
          v                           v
+---------+----------+     +----------+---------+
|  PHP-FPM Server A  |     |  PHP-FPM Server B  |
+---------+----------+     +----------+---------+
          |                           |
          +-------------+-------------+
                        |
                        v
                 +-------------+
                 |  Database   |
                 +-------------+

Best Practices

Use Nginx or HAProxy for load balancing.
Store sessions in Redis or a database so that any server can handle any request.
Centralize uploads and media in shared object storage (e.g., AWS S3 or similar).
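Moving sessions off the local disk can be sketched with PHP's SessionHandlerInterface. In the example below, KeyValueStore, ArrayStore, and SharedSessionHandler are hypothetical names: the array-backed store only simulates a shared Redis instance, and the 1440-second TTL is an assumed default.

```php
<?php
// Hypothetical key-value contract; a Redis client would implement it in
// production. The array-backed version exists only for this sketch.
interface KeyValueStore
{
    public function get(string $key): ?string;
    public function set(string $key, string $value, int $ttl): void;
    public function delete(string $key): void;
}

final class ArrayStore implements KeyValueStore
{
    private array $data = [];
    public function get(string $key): ?string { return $this->data[$key] ?? null; }
    public function set(string $key, string $value, int $ttl): void { $this->data[$key] = $value; }
    public function delete(string $key): void { unset($this->data[$key]); }
}

// Session handler that keeps no state on the web server itself.
final class SharedSessionHandler implements SessionHandlerInterface
{
    public function __construct(private KeyValueStore $store, private int $ttl = 1440) {}

    public function open(string $path, string $name): bool { return true; }
    public function close(): bool { return true; }

    public function read(string $id): string|false
    {
        return $this->store->get("sess:$id") ?? '';   // '' means no data yet
    }

    public function write(string $id, string $data): bool
    {
        $this->store->set("sess:$id", $data, $this->ttl);
        return true;
    }

    public function destroy(string $id): bool
    {
        $this->store->delete("sess:$id");
        return true;
    }

    // Expiry is delegated to the store's TTL, so there is nothing to collect.
    public function gc(int $max_lifetime): int|false { return 0; }
}

// Two handlers sharing one store simulate two web servers behind the balancer.
$store   = new ArrayStore();
$serverA = new SharedSessionHandler($store);
$serverB = new SharedSessionHandler($store);

$serverA->write('abc123', 'cart|a:1:{i:0;i:7;}');
echo $serverB->read('abc123'), "\n";   // server B sees what server A wrote
```

In a real application the handler would be registered with session_set_save_handler($handler, true) before session_start() is called.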

Database Scaling

Database bottlenecks are often the limiting factor in PHP scalability.

Read Replicas and Sharding

Offload read queries to replicas.
Split datasets by region, user ID, or category (sharding).
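Both ideas reduce to a routing decision made before a connection is opened. The sketch below is deliberately naive: the host names are hypothetical, the read check is a simple regular expression, and sharding uses a plain modulo on a non-negative user ID.

```php
<?php
// Hypothetical host names; replace them with real connection settings.
const PRIMARY  = 'db-primary';
const REPLICAS = ['db-replica-1', 'db-replica-2'];

// Route writes to the primary and spread reads across the replicas.
function hostForQuery(string $sql): string
{
    $isRead = preg_match('/^\s*(SELECT|SHOW)\b/i', $sql) === 1;
    if (!$isRead) {
        return PRIMARY;
    }
    return REPLICAS[array_rand(REPLICAS)];
}

// Map a non-negative user ID to one of $shardCount shards.
function shardForUser(int $userId, int $shardCount): int
{
    if ($shardCount < 1) {
        throw new InvalidArgumentException('shardCount must be at least 1');
    }
    return $userId % $shardCount;
}

echo hostForQuery('UPDATE users SET name = ? WHERE id = ?'), "\n";   // primary
echo 'user 10 lives on shard ', shardForUser(10, 4), "\n";
```

Two caveats apply to this sketch: replicas can lag behind the primary, so a read that must see a just-completed write may still need to go to the primary, and modulo sharding forces data to move whenever the shard count changes.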

Query Optimization

Avoid SELECT *.
Use proper indexes.
Profile queries with EXPLAIN and tools like Percona Toolkit or Blackfire.

Use Connection Pools

Persistent connections or a pooling proxy such as PgBouncer (for PostgreSQL) can reduce connection overhead in high-traffic environments.

PHP-FPM Tuning

PHP-FPM (FastCGI Process Manager) manages PHP worker processes. Its pool settings determine how many requests a server can handle at once and how much memory the workers consume.

Example Configuration

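; Scale the number of workers between the spare limits below
pm = dynamic
; Hard cap on simultaneous worker processes
pm.max_children = 100
; Workers created at startup
pm.start_servers = 10
; Minimum and maximum number of idle workers to keep
pm.min_spare_servers = 5
pm.max_spare_servers = 20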

Size pm.max_children to your server’s CPU and memory: a common starting point is the memory available to PHP divided by the average memory of one worker. Enable pm.status_path to monitor real-time load and concurrency.
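A rough way to pick a starting value for pm.max_children is to divide the memory you can give to PHP by the memory one worker typically uses. The helper below is a hypothetical sketch, and every figure in it is an assumption to be replaced with measurements from your own server.

```php
<?php
// Rough starting point: memory reserved for PHP divided by the average
// memory of one worker. All figures are assumptions, not recommendations.
function estimateMaxChildren(int $totalRamMb, int $reservedMb, int $avgWorkerMb): int
{
    if ($avgWorkerMb < 1) {
        throw new InvalidArgumentException('avgWorkerMb must be positive');
    }
    $availableMb = max(0, $totalRamMb - $reservedMb);
    return max(1, intdiv($availableMb, $avgWorkerMb));
}

// 8 GB server, 2 GB kept for the OS, Nginx, and other services,
// about 60 MB per PHP-FPM worker.
echo estimateMaxChildren(8192, 2048, 60), "\n";   // prints 102
```

Treat the result as an upper bound for memory only; CPU-bound workloads may saturate the cores long before that many workers are busy.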

Async and Event-Driven PHP

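If your workload involves I/O-bound tasks (e.g., APIs, WebSockets), consider long-running or asynchronous PHP runtimes:

Swoole: coroutine-based, long-lived workers that can handle thousands of concurrent connections.
RoadRunner: a Go application server that keeps PHP workers alive between requests, avoiding per-request bootstrap.
ReactPHP: event-driven, non-blocking architecture.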

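For I/O-bound workloads, these runtimes can outperform traditional FPM setups because they skip the per-request bootstrap and can overlap time spent waiting on I/O. In exchange, long-lived workers require care with memory leaks and with state that persists between requests.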
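The core idea behind these runtimes, one worker interleaving several tasks while each waits on I/O, can be illustrated with PHP's built-in Fiber class (PHP 8.1+). This is a sketch of cooperative scheduling only: it is not how Swoole or ReactPHP are invoked, and suspending stands in for a real non-blocking I/O wait.

```php
<?php
// Requires PHP 8.1+ for Fiber. Illustrates cooperative scheduling only.

$log = [];

// Each task "waits" $steps times; suspending stands in for non-blocking I/O.
function makeTask(string $name, int $steps, array &$log): Fiber
{
    return new Fiber(function () use ($name, $steps, &$log): void {
        for ($i = 1; $i <= $steps; $i++) {
            $log[] = "$name waiting ($i)";
            Fiber::suspend();   // hand control back to the loop
        }
        $log[] = "$name done";
    });
}

$tasks = [makeTask('A', 2, $log), makeTask('B', 1, $log)];

// Minimal round-robin loop: advance every unfinished task one step per pass.
while ($tasks) {
    foreach ($tasks as $key => $fiber) {
        if (!$fiber->isStarted()) {
            $fiber->start();
        } elseif ($fiber->isSuspended()) {
            $fiber->resume();
        }
        if ($fiber->isTerminated()) {
            unset($tasks[$key]);
        }
    }
}

echo implode("\n", $log), "\n";
```

The log shows task B finishing while task A is still waiting, which is the behavior a single blocking FPM worker cannot provide.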

Observability and Monitoring

Tools

Prometheus + Grafana: metrics and visualization.
Blackfire / Tideways: code profiling and tracing.
ELK Stack: centralized logging.

Track metrics like:

Request rate (RPS)
Average response time
Error rate
Worker utilization
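The first three metrics can be derived from raw request samples. The sketch below uses made-up data and a hypothetical helper name; it counts only 5xx responses as errors, which is an assumption you may want to change.

```php
<?php
// Each sample is [durationMs, httpStatus]; the data is made up for the sketch.
function summarizeRequests(array $samples, float $windowSeconds): array
{
    $count = count($samples);
    if ($count === 0 || $windowSeconds <= 0) {
        return ['rps' => 0.0, 'avg_ms' => 0.0, 'error_rate' => 0.0];
    }
    $totalMs = 0.0;
    $errors  = 0;
    foreach ($samples as [$durationMs, $status]) {
        $totalMs += $durationMs;
        if ($status >= 500) {
            $errors++;   // count server errors only
        }
    }
    return [
        'rps'        => $count / $windowSeconds,
        'avg_ms'     => $totalMs / $count,
        'error_rate' => $errors / $count,
    ];
}

// Four requests observed over a two-second window.
$samples = [[120, 200], [80, 200], [300, 500], [100, 404]];
print_r(summarizeRequests($samples, 2.0));
```

Averages hide slow outliers, so in practice it is worth tracking percentiles alongside the average response time.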