Deep Learning Rest API in Production using Docker, NGINX and Flask

Abhishek Saurabh
Oct 13, 2018

This is an attempt to explain how deep learning models can be deployed in production. While there are multiple ways to achieve this, in this blog I've explained one approach using Docker, NGINX, Gunicorn and Flask, which can be used as a template for deploying ML models.

For face detection, I have used the face detection model in the dnn module of OpenCV. This model performs reasonably well. For example, it detects even occluded face(s), but it doesn't do well when the face is poorly illuminated. That said, please note that the focus of this post is not to develop a kick-ass face detection system. Should you want a better model for face detection, I'd recommend trying out different models from the TensorFlow Object Detection API with the WIDER FACE dataset.
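
For reference, the two model files that ship in the repository's api/model directory (the deploy.prototxt.txt network definition and the res10_300x300_ssd_iter_140000.caffemodel weights, both listed in the directory tree below) are loaded roughly like this; the exact relative paths are an assumption based on that tree:

# Illustrative sketch: loading the OpenCV dnn face detector
# (paths are assumed from the repo layout, adjust as needed)
import cv2

net = cv2.dnn.readNetFromCaffe("model/deploy.prototxt.txt",
                               "model/res10_300x300_ssd_iter_140000.caffemodel")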

I’ve tried to keep this post short. If need be, I’ll write another post explaining the concepts involved in greater detail. For those who already know the concepts and just want to browse through the code, here’s the link to the GitHub repository.

This post is organized in the following sections:

  1. Prerequisite
  2. Software Tools Used
  3. Pipeline Overview
  4. Steps to Run and Expected Output
  5. ‘Dockerfile’ for Flask and Gunicorn
  6. ‘Dockerfile’ for NGINX
  7. Docker Compose
  8. Conclusion
  9. References

At some places, I’ve detailed the issue(s) I encountered during development and provided an explanation. Most of the issues can be attributed to my limited understanding of Docker and Gunicorn.

Prerequisite

  1. Designing RESTful APIs with Flask
  2. Basics of Docker: I, myself, am a noob in Docker. If you know what Docker is and know commands like docker ps -a, docker run etc., you are good to go.

There are wonderful tutorials available on these topics in case you’re uninitiated. It wouldn’t take much time to learn enough to appreciate this post.

I’m not expecting you to know what NGINX is. I’ve explained its configuration in sufficient detail.

Software Tools Used

Assumption: Python-based frameworks and deployment on Linux servers

  1. Flask: Python framework for developing REST APIs. However, its built-in server is not suitable for deployment in production, as it can handle only one request at a time on its own.
  2. NGINX: Highly stable web server, which provides benefits such as load balancing, SSL configuration etc. We’ll be using NGINX as a reverse proxy, i.e. to direct HTTP requests to an upstream server such as Gunicorn. The other capabilities of NGINX haven’t been used in this project.
  3. Gunicorn: A WSGI HTTP server which will be used to run the Flask application. NGINX alone cannot directly interface with a Flask app, so Gunicorn sits in between, translating the proxied HTTP requests into WSGI calls that the Flask application handles. A minimal sketch of such an app follows this list.
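
To make the division of labour concrete, here is a stripped-down sketch of the kind of Flask app Gunicorn serves in this project. It is illustrative only: the real run_fd_server.py in the repository also loads the face detector, and the endpoint body below is just a placeholder.

# Illustrative sketch of run_fd_server.py (the real file also runs face detection)
from flask import Flask, jsonify

# "app" is the WSGI callable that Gunicorn is pointed at via "run_fd_server:app"
app = Flask(__name__)

@app.route("/predict", methods=["POST"])
def predict():
    # The real endpoint reads the uploaded image and runs the OpenCV detector here
    return jsonify(success=True)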

Pipeline Overview

The pipeline, in brief: client → NGINX (port 80) → Gunicorn (port 8000) → Flask app.

NGINX will be the interface handling HTTP requests from clients. It reverse proxies each request to the WSGI server (Gunicorn in this case), which in turn invokes a callable object in our Flask app.

Traffic sent to port 80 on the host machine is forwarded by the Docker proxy to the NGINX container, while the app container exposes port 8000 for NGINX to proxy requests to.

In this post, we’ll assume that the Flask app and the NGINX server each run inside their own Docker container.

Steps to Run and Expected Output

Directory Structure

.
|--api
| |-- model
| | +-- deploy.prototxt.txt
| | +-- res10_300x300_ssd_iter_140000.caffemodel
| +-- __init__.py
| +-- demo_request.py
| +-- Dockerfile
| +-- run_fd_server.py
| +-- settings.py
| +-- wsgi.py
|-- nginx
| +-- Dockerfile
| +-- nginx.conf
+-- .gitignore
+-- docker-compose.yml
+-- README.md
+-- requirements.txt
+-- stress_test.py
+-- image.jpg # Not included in the repo

There are a couple of files which aren’t being used, such as stress_test.py and wsgi.py. Don’t be concerned by that. I’ve retained them because I was playing around with the API and may, in future, leverage Redis along with message queueing/message brokering paradigms to efficiently batch-process incoming requests. For the purposes of this post, you can safely ignore these files.

Steps to run

  1. Open a terminal and run docker-compose up --build. Wait until the following output shows up on the terminal.
Attaching to api, nginx
api | [2018-10-08 17:31:36 +0000] [1] [INFO] Starting gunicorn 19.9.0
api | [2018-10-08 17:31:36 +0000] [1] [INFO] Listening at: http://0.0.0.0:8000 (1)
api | [2018-10-08 17:31:36 +0000] [1] [INFO] Using worker: sync
api | [2018-10-08 17:31:36 +0000] [9] [INFO] Booting worker with pid: 9

2. Using curl: In another terminal, run the following command:

curl -X POST -i http://0.0.0.0:8000/predict -F 'image=@/path/to/the/image/image.jpg'

3. Using the Python requests module: In another terminal, cd into api and run python demo_request.py.

Expected Output

  • Output contains the confidence score for each face detection.
  • The bounding box coordinates are the coordinates of the corresponding faces. The coordinates are in the following order: (top_left_x_coordinate, top_left_y_coordinate, bottom_right_x_coordinate, bottom_right_y_coordinate).
  • The length of either of these lists equals the number of faces detected in the image.
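
For the curious, demo_request.py boils down to something like the sketch below. The JSON field names ("confidence", "bounding_boxes") are assumptions made for illustration; check the repository for the authoritative keys.

# Illustrative sketch of demo_request.py (response field names are assumed)
import requests

URL = "http://0.0.0.0:8000/predict"

# The same multipart POST as the curl command above
with open("image.jpg", "rb") as f:
    response = requests.post(URL, files={"image": f}).json()

# One confidence score and one bounding box per detected face
for conf, box in zip(response.get("confidence", []), response.get("bounding_boxes", [])):
    print("face at {} with confidence {:.2f}".format(tuple(box), conf))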

‘Dockerfile’ for Flask and Gunicorn

The snippet below shows only the part of the Dockerfile concerned with creating the REST interface. The base image is taken from the Deep Learning and OpenCV Dockerfile which I had created earlier. It can be found in this GitHub repo.

# flask, redis, gunicorn
RUN pip3 install redis
RUN pip3 install gunicorn
# make directories suited to your application
RUN mkdir -p /home/project/face_detection_REST_API
# RUN mkdir -p /home/project/face_detection_REST_API/model
WORKDIR /home/project/face_detection_REST_API
# copy and install packages for flask
# ADD requirements.txt /home/project/face_detection_REST_API
# RUN pip3 install --no-cache-dir -r requirements.txt
# copy contents from your local to your docker container
COPY . /home/project/face_detection_REST_API
# ADD ./model /home/project/face_detection_REST_API/
# COPY . .
# RUN pip3 install --no-cache-dir -r requirements.txt

Issue Encountered: Understanding Docker ‘build context’

In the code snippet above, I’ve included the comments to explain the Docker build context. To understand the issue, look at the directory tree once again. I have two Dockerfiles, one each inside the api and the nginx directory. When I run docker-compose up --build, I specify the directory in which each Dockerfile resides (please refer to docker-compose.yml). However, when I ran this command, Docker was unable to find ./model/ and ./requirements.txt in the project root on the local machine, and hence the COPY and ADD commands failed, resulting in a build failure.

When we issue a docker build command, the directory we point it at is called the build context. The Dockerfile is assumed to reside in this directory, although one can specify a different path using the -f flag. Regardless of where that file is located, all recursive contents of the files and directories in the build context are sent to the Docker daemon. In my case, ./model and ./requirements.txt were located outside of the build context.

For more details, please refer this official post on best practices for writing Dockerfiles.
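
To make that concrete, both of the following commands are run from the project root, but only the first keeps ./model and requirements.txt inside the context that is shipped to the daemon:

# context = project root; ./model and requirements.txt are visible to COPY/ADD
docker build -f api/Dockerfile .

# context = ./api only; files living in the project root are NOT sent to the daemon
docker build api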

NGINX configuration

The configuration parameters in NGINX are grouped into blocks such as main, events, http and server. Following this blog, I’ve used minimal configuration parameters, mainly the http and server blocks (plus the mandatory, empty events block, without which NGINX refuses to start). If you want to delve into the details, I’d recommend reading this post, which perhaps has the best explanation of defining the configurations in the nginx.conf file.

# An (empty) events block is mandatory; NGINX will not start without it.
events {}

# The http block defines how NGINX should handle HTTP traffic.
http {
    # Timeout value for keep-alive connections with the client
    keepalive_timeout 65;
    # The following configuration acts as a reverse proxy.
    server {
        listen 80;
        # Reverse proxy HTTP requests to the upstream server (Gunicorn, the WSGI server)
        location / {
            # "api" is the Compose service name of the Gunicorn container (see docker-compose.yml)
            proxy_pass http://api:8000;
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        }
    }
}
  • The listen 80 directive means that NGINX will listen for requests on port 80.
  • The proxy_pass directive is what makes this configuration a reverse proxy. It specifies that all requests matching the location block (in this case the root /) should be forwarded to port 8000 of the api service, where Gunicorn is listening.
  • proxy_set_header modifies or adds headers that are forwarded along with the proxied requests. This configuration uses the built-in $remote_addr variable to send the IP address of the original client to the upstream host.
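
If the Flask app ever needs the real client address, those forwarded headers can be read back inside a request handler. A minimal sketch (this snippet is not in the repository's code):

from flask import request

# Inside a request handler: prefer the header set by NGINX, fall back to the direct peer
client_ip = request.headers.get("X-Real-IP", request.remote_addr)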

‘Dockerfile’ for NGINX

# Dockerfile for NGINX
FROM nginx:1.15.2
RUN rm /etc/nginx/nginx.conf
COPY nginx.conf /etc/nginx/

In the NGINX Dockerfile, we first take the base image nginx:1.15.2, either stored locally or pulled from Docker Hub. We then remove the default nginx.conf and replace it with the local copy we created above.
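
Once the services are wired together with Compose (next section), an optional sanity check is NGINX’s built-in configuration test. Running it through docker-compose run starts the api dependency first, so the api hostname in proxy_pass resolves:

docker-compose run --rm nginx nginx -t  # reports whether /etc/nginx/nginx.conf parses cleanly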

Docker Compose

Below is a snippet of the docker-compose.yml file, which defines the configuration of the application’s services. Compose is a tool for defining how the containers of a multi-container application run and communicate with each other. When we run docker-compose up, it creates and starts all the services defined in the file.

version: '2.0'
services:
  api:
    container_name: api # Name can be anything
    restart: always
    build: ./api
    ports:
      - "8000:8000"
    command: gunicorn -w 1 -b :8000 run_fd_server:app
  nginx:
    container_name: nginx
    restart: always
    build: ./nginx
    ports:
      - "80:80" # NGINX listens on port 80 inside the container (see nginx.conf)
    depends_on:
      - api

That’s it. Now run docker-compose as explained earlier in the post.
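
If everything came up correctly, docker ps in another terminal should list both containers defined above, with their respective port mappings (illustrative, abridged output):

docker ps --format "table {{.Names}}\t{{.Ports}}"
# NAMES   PORTS
# nginx   0.0.0.0:80->80/tcp
# api     0.0.0.0:8000->8000/tcp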

Conclusion

It was an exciting experience to create a fully functioning web application running locally on my computer. This isn’t just a development version but something which can be deployed in production, of course with some changes. I hope it whets the appetite of those who want to delve deeper. Needless to say, feel free to share your comments, suggestions and feedback.

As mentioned earlier, the source code can be found on GitHub.

References

  1. Patrick’s software blog
  2. PyImageSearch
  3. Adam Constanza’s blog
  4. https://towardsdatascience.com/deploying-machine-learning-models-with-docker-5d22a4dacb5
