12.2.3.3 Gunicorn Runtime
A focused guide to Gunicorn Runtime, connecting core concepts with practical Docker and container operations.
Gunicorn serves a Python WSGI application in production through multiple worker processes. It provides the robust, concurrent request handling that a development server (such as Flask's built-in one) is not designed for, which makes it a standard choice for running a Python web application's container in production.
Configuring Gunicorn as the Container's Runtime Command
Gunicorn replaces a framework's built-in development server as the actual process running inside the production container.
RUN pip install gunicorn
CMD ["gunicorn", "--bind", "0.0.0.0:8000", "--workers", "4", "app:app"]
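This runs the application with four worker processes. With Gunicorn's default synchronous worker class, each worker handles one request at a time, so up to four requests can be processed concurrently. Binding to 0.0.0.0 makes the server reachable from outside the container, and the app:app argument names the module (app) followed by the WSGI callable inside it (also app).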
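To make the app:app target concrete, here is a minimal sketch of a module that Gunicorn could load. It assumes a file named app.py in the container's working directory and uses a plain WSGI callable rather than a framework. The response body is purely illustrative.

```python
# app.py (assumed file name, matching the module half of "app:app")


def app(environ, start_response):
    """Minimal WSGI callable; Gunicorn workers call this once per request."""
    body = b"Hello from Gunicorn\n"  # illustrative response body
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    # Send the status line and headers, then return the body as an iterable of bytes.
    start_response("200 OK", headers)
    return [body]
```

A Flask application object exposed as app in the same module would be loaded in the same way, since it is also a WSGI callable.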
Why a Development Server Is Inappropriate for Production
A framework's built-in development server is usually documented as unsuitable for production use. It lacks the performance, concurrency, and stability of a production-grade WSGI server like Gunicorn.
flask run
gunicorn --bind 0.0.0.0:8000 app:app
The first command starts Flask's development server and belongs in local development only. The second is the appropriate production runtime.
Choosing an Appropriate Number of Worker Processes
The number of worker processes should generally reflect the container's allocated CPU resources, following a common guideline of roughly twice the CPU count plus one.
gunicorn --workers 5 --bind 0.0.0.0:8000 app:app
For a container allocated 2 CPU cores, the guideline gives (2 × 2) + 1 = 5 workers, which is the count used here. Treat this as a starting point and adjust it based on observed load and memory use.
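The arithmetic behind the guideline can be sketched in a few lines of Python. The function name recommended_workers and the CONTAINER_CPUS environment variable are hypothetical, introduced only for this illustration. The variable is read first because os.cpu_count() may report the host's CPUs rather than the container's allocation.

```python
import os


def recommended_workers(cpu_count: int) -> int:
    """Apply the common guideline: (2 x CPU cores) + 1."""
    if cpu_count < 1:
        raise ValueError("cpu_count must be at least 1")
    return 2 * cpu_count + 1


# CONTAINER_CPUS is a hypothetical variable you would set yourself to the
# container's CPU allocation; os.cpu_count() is only a fallback.
cpus = int(os.environ.get("CONTAINER_CPUS", os.cpu_count() or 1))
workers = recommended_workers(cpus)
```

For example, recommended_workers(2) returns 5, matching the command above.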
Combining Gunicorn With an Async Worker Class for I/O-Bound Applications
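For an application with substantial I/O-bound work, such as waiting on databases or external APIs, an async worker class can provide better concurrency than Gunicorn's default synchronous worker, because each worker can serve other requests while one is waiting. The gevent worker class used below requires the gevent package to be installed in the image alongside Gunicorn.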
gunicorn --worker-class gevent --workers 4 --bind 0.0.0.0:8000 app:app
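The same settings can also live in a Gunicorn configuration file, which is itself plain Python. The sketch below assumes a file named gunicorn.conf.py and mirrors the flags from the command above. The worker_connections value shown is Gunicorn's documented default and is included only to show where per-worker concurrency would be tuned.

```python
# gunicorn.conf.py (assumed file name)

# Listen on all interfaces so the port can be published from the container.
bind = "0.0.0.0:8000"

# Number of worker processes, as in the --workers flag.
workers = 4

# Async worker class; requires the gevent package to be installed.
worker_class = "gevent"

# Maximum simultaneous clients per worker (applies to async worker types).
worker_connections = 1000
```

With this file in place, the command could be shortened to gunicorn -c gunicorn.conf.py app:app.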
Why Gunicorn Runtime Matters
A container that runs a framework's built-in development server is not equipped for production traffic. Running a production-grade WSGI server like Gunicorn instead gives the application reliable request handling with concurrency suited to the container's resources.