How to Deploy a TensorFlow Model as a RESTful API Service
Learn how to Deploy a TensorFlow CNN Model to Heroku using Python and Docker
This article is a reupload; it was originally published on FreeCodeCamp.
Introduction
If you're like me, you've probably watched and read a number of tutorials on creating machine learning models with TensorFlow, PyTorch, Scikit-Learn, or any other framework out there.
But there's one thing these tutorials tend to leave out: model deployment.
In this tutorial, I'll show you how to deploy a TensorFlow CNN model that classifies food images to Heroku, using FastAPI and Docker.
Tech We'll Be Using
If you're unfamiliar with it, FastAPI is a Python web framework for building fast API applications. In my opinion, it's the easiest to learn of all the Python web frameworks out there.
FastAPI also comes with Swagger documentation built in, and makes it easy to configure and update.
Docker, on the other hand, is an industry staple in software engineering, as it is one of the most popular containerization softwares out there. Docker is used for developing, deploying, and managing applications in virtualized environments called containers.
The main selling point of Docker is that it solves the problem of "it works on my machine, so why not on yours?". Coincidentally, I faced this exact issue while working on this very project, and ultimately fixed it by switching to Docker.
Heroku, lastly, is a cloud platform where you can deploy, manage, and scale web applications. It works with back-end applications, front-end applications, or full-stack applications.
Prerequisites
Before we begin, you'll first need the following:
- A Docker account
- A Heroku account, and the Heroku CLI
- A Python installation
The Application We're Building
We're going to be building a RESTful API service for a TensorFlow CNN model that classifies food images.
After building the API service, I'll show you how to dockerize the application, and then deploy it to Heroku.
Downloading Necessities
You'll first need to clone the GitHub repository from this link:
git clone https://github.com/eRuaro/food-vision-api.git
There are two branches in this repository - you'll use the start-here branch, as main is the completed branch.
Once you've cloned the repository, you'll need to install Docker on your local system, along with the Heroku CLI.
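You can verify both installs from your terminal (the exact version numbers will vary):
$ docker --version
$ heroku --version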
You'll also need to install the following packages with pip:
- FastAPI
- TensorFlow
- Numpy
- Uvicorn
- Image
To do so, create a requirements.txt file on the start-here branch and put in the following. Note that you can use other versions of the listed packages, as long as they still work together.
fastapi==0.73.0
numpy==1.19.5
uvicorn==0.15.0
image==1.5.33
tensorflow-cpu==2.7.0
You can then install the packages with the command pip install -r requirements.txt.
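As an optional sanity check (my addition, not part of the original tutorial), you can confirm the packages installed correctly by importing them and printing their versions:
import fastapi
import numpy
import uvicorn
import tensorflow

# Each import should succeed without errors
print(fastapi.__version__, numpy.__version__, uvicorn.__version__, tensorflow.__version__)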
Currently, our start-here branch has the saved model file and the Jupyter notebook used to create the model. The notebook also has the code that implements our API feature - that is, predicting the food class of an image based on its URL link.
Brief introduction to FastAPI
With that in mind, let's start writing the code! In the root directory, create a main.py file, and add the following lines of code to it.
from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware
from uvicorn import run
import os

app = FastAPI()

origins = ["*"]
methods = ["*"]
headers = ["*"]

app.add_middleware(
    CORSMiddleware,
    allow_origins = origins,
    allow_credentials = True,
    allow_methods = methods,
    allow_headers = headers
)

@app.get("/")
async def root():
    return {"message": "Welcome to the Food Vision API!"}

if __name__ == "__main__":
    port = int(os.environ.get('PORT', 5000))
    run(app, host="0.0.0.0", port=port)
Running the command python -m uvicorn main:app --reload runs the app and listens for any changes we make to it. Alternatively, you can use python main.py, which runs the app on port 5000, courtesy of the last three lines of code. However, this won't pick up changes automatically, so you'll have to re-run the app every time you want to see your changes.
We also added the CORSMiddleware, which essentially allows us to access the API from a different host. That means we can extend the app further by creating a front-end interface for it. We won't cover that in this article, but I've included the middleware here in case you want to create a front-end that interacts with the API as well.
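As a side note, if you do build a front-end later, you'd typically replace the wildcard with only the origins you trust. A minimal sketch, assuming a hypothetical front-end served at http://localhost:3000:
origins = [
    "http://localhost:3000",  # hypothetical local front-end origin
]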
Navigating to the port where the app is running, you'll get this:
{
"message": "Welcome to the Food Vision API!"
}
The command python -m uvicorn main:app --reload refers to the following:
- main -> the file main.py
- app -> the object created inside main.py with the line app = FastAPI()
- --reload -> makes the server restart after code changes
Let's dissect the code we've written so far.
@app.get("/")
async def root():
return {"message": "Welcome to the Food Vision API!"}
@app is needed for FastAPI route declarations. get is an HTTP method, while "/" is the URL path of that specific API request. Below that, we define a function that returns something; here, we just return a simple JSON message. In other words, we have a template for writing API endpoints with FastAPI:
@app.http_method("url_path")
async def function_name():
    return something
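For example, filling in this template with a hypothetical health-check endpoint (purely an illustration, not part of our API) would look like this:
@app.get("/health")
async def health_check():
    return {"status": "ok"}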
Writing the API functionality
Let's write the main API functionality: taking a food image URL from the internet and predicting the name of that food. First, let's extend the code we wrote earlier, import all the required functions, and load the model itself.
from fastapi import FastAPI
from tensorflow.keras.models import load_model
from tensorflow.keras.utils import get_file
from tensorflow.keras.utils import load_img
from tensorflow.keras.utils import img_to_array
from tensorflow import expand_dims
from tensorflow.nn import softmax
from numpy import argmax
from numpy import max
from numpy import array
from json import dumps
from uvicorn import run
import os
app = FastAPI()
model_dir = "food-vision-model.h5"
model = load_model(model_dir)
...
...
...
if __name__ == "__main__":
    port = int(os.environ.get('PORT', 5000))
    run(app, host="0.0.0.0", port=port)
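As an optional sanity check (my addition, not in the original code), you can confirm the model loaded correctly and inspect the input shape it expects:
# Print the model's layers and verify the expected input shape
model.summary()
print(model.input_shape)  # should show (None, 224, 224, 3) for 224x224 RGB images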
After loading in the model, let's add in the food classes, which are based on the Food 101 dataset.
class_predictions = array([
'apple pie',
'baby back ribs',
'baklava',
'beef carpaccio',
'beef tartare',
'beet salad',
'beignets',
'bibimbap',
'bread pudding',
'breakfast burrito',
'bruschetta',
'caesar salad',
'cannoli',
'caprese salad',
'carrot cake',
'ceviche',
'cheesecake',
'cheese plate',
'chicken curry',
'chicken quesadilla',
'chicken wings',
'chocolate cake',
'chocolate mousse',
'churros',
'clam chowder',
'club sandwich',
'crab cakes',
'creme brulee',
'croque madame',
'cup cakes',
'deviled eggs',
'donuts',
'dumplings',
'edamame',
'eggs benedict',
'escargots',
'falafel',
'filet mignon',
'fish and chips',
'foie gras',
'french fries',
'french onion soup',
'french toast',
'fried calamari',
'fried rice',
'frozen yogurt',
'garlic bread',
'gnocchi',
'greek salad',
'grilled cheese sandwich',
'grilled salmon',
'guacamole',
'gyoza',
'hamburger',
'hot and sour soup',
'hot dog',
'huevos rancheros',
'hummus',
'ice cream',
'lasagna',
'lobster bisque',
'lobster roll sandwich',
'macaroni and cheese',
'macarons',
'miso soup',
'mussels',
'nachos',
'omelette',
'onion rings',
'oysters',
'pad thai',
'paella',
'pancakes',
'panna cotta',
'peking duck',
'pho',
'pizza',
'pork chop',
'poutine',
'prime rib',
'pulled pork sandwich',
'ramen',
'ravioli',
'red velvet cake',
'risotto',
'samosa',
'sashimi',
'scallops',
'seaweed salad',
'shrimp and grits',
'spaghetti bolognese',
'spaghetti carbonara',
'spring rolls',
'steak',
'strawberry shortcake',
'sushi',
'tacos',
'takoyaki',
'tiramisu',
'tuna tartare',
'waffles'
])
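Since Food 101 has exactly 101 classes, a quick assertion (my addition) can catch a missed or duplicated entry:
# The Food 101 dataset defines exactly 101 food classes
assert len(class_predictions) == 101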
Now that we have the food classes, let's write the main API functionality.
@app.post("/net/image/prediction/")
async def get_net_image_prediction(image_link: str = ""):
    if image_link == "":
        return {"message": "No image link provided"}

    img_path = get_file(
        origin = image_link
    )
    img = load_img(
        img_path,
        target_size = (224, 224)
    )

    img_array = img_to_array(img)
    img_array = expand_dims(img_array, 0)

    pred = model.predict(img_array)
    score = softmax(pred[0])

    class_prediction = class_predictions[argmax(score)]
    model_score = round(max(score) * 100, 2)

    return {
        "model-prediction": class_prediction,
        "model-prediction-confidence-score": model_score
    }
Here, we make a POST request to the endpoint /net/image/prediction/ and provide the image_link as a query parameter. That is, the full endpoint when posting an image URL would be /net/image/prediction/?image_link=<image-url>.
For simplicity's sake, we give image_link a default value of "", and when no link is passed to the endpoint, we simply return a message saying that no image link was provided.
get_file() downloads the image from the provided URL, while load_img() loads the image in PIL format and resizes it to the size the model expects. img_to_array() converts the loaded image to a NumPy array, and expand_dims() expands the dimensions of the array by one at the zeroth index.
We then use model.predict() to get the model's prediction on the loaded image, and get the model's confidence scores for that prediction using softmax(). I used softmax here as that's the activation function used in creating the model.
We then get the food type by applying argmax() to the model's confidence scores, and use the result as the index into the class_predictions array, which contains our various food classes.
Lastly, we multiply the model's confidence score by 100 so that it ranges from 0 to 100, and return the model's prediction along with that confidence score.
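To sanity-check the endpoint locally, you could call it with a short script like the one below. This is a sketch that assumes the requests package is installed (pip install requests), the app is running locally on port 5000, and the image URL is just a placeholder:
import requests

# POST to the prediction endpoint, passing the image link as a query parameter
response = requests.post(
    "http://localhost:5000/net/image/prediction/",
    params = {"image_link": "https://example.com/some-food-image.jpg"}  # placeholder URL
)
print(response.json())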
Why we need to use Docker to deploy this app
You could actually deploy this app as-is on Heroku, using the usual method of defining a Procfile. However, when I tried this method, I kept getting a ValueError: Out of range float values are not JSON compliant error. I also got this error when running the app on Windows Subsystem for Linux (WSL). When I ran it on Windows, however, the error disappeared.
You can avoid this error by adding the following line of code after the initial assignment of the model_score variable.
model_score = dumps(model_score.tolist())
This lets the app run on both Heroku and WSL. However, it would then only return these values when making the POST request:
{
"model-prediction": "apple pie",
"model-prediction-confidence-score": NaN,
}
So it works on my machine (Windows), but not on Heroku (using a Procfile), nor on WSL. This is exactly the kind of problem that Docker solves!
Dockerizing the application
Let's start dockerizing the application. Create a Dockerfile in the project's root directory and put in the following contents.
FROM python:3.7.3-stretch
# Maintainer info
LABEL maintainer="your-email-address"
# Make working directories
RUN mkdir -p /food-vision-api
WORKDIR /food-vision-api
# Upgrade pip with no cache
RUN pip install --no-cache-dir -U pip
# Copy application requirements file to the created working directory
COPY requirements.txt .
# Install application dependencies from the requirements file
RUN pip install -r requirements.txt
# Copy every file in the source folder to the created working directory
COPY . .
# Run the python application
CMD ["python", "main.py"]
This pulls the Python 3.7.3 image and installs all the necessary packages defined in the requirements.txt file. Then it runs the application with the command python main.py, as defined in the last line of the file.
You can then build, then run the application using the following CLI commands.
$ docker image build -t <app-name> .
$ docker run -p 5000:5000 -d <app-name>
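With the container running, you can check that the API is up by hitting the root endpoint (assuming you kept the 5000:5000 port mapping from above):
$ curl http://localhost:5000/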
Then you can stop the app and free up system resources by running the following.
$ docker container stop <container-id>
$ docker system prune
The container-id is returned when you run the docker run command above.
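If you no longer have the ID handy, docker ps lists the running containers along with their IDs.
$ docker ps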
Deploying to Heroku
With the app now dockerized, we can deploy it to Heroku. I'm assuming you already have the Heroku CLI installed and have logged the CLI into your Heroku account.
Let's first create the app on Heroku through the CLI.
$ heroku create <app-name>
Then we can push and release the app through the Docker container we made earlier, with the following commands.
$ heroku container:push web --app <app-name>
$ heroku container:release web --app <app-name>
After this, you can go to your Heroku dashboard and open the app. You should be greeted with the JSON message we return at the "/" route of the application.
Navigating to /docs, you'll be greeted with the Swagger documentation of the application. Here, you can play around with the POST request we created and see if the model's predictions are correct.
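You can also hit the endpoint directly instead of going through Swagger, for example with curl (replace the placeholders with your app's name and an actual image URL):
$ curl -X POST "https://<app-name>.herokuapp.com/net/image/prediction/?image_link=<image-url>"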
Let's try this out using a picture of a chocolate cake; its URL link is this.
Paste the link into the text box on the /docs page, then press Execute.
After pressing the Execute button, it will take a few seconds to get the model's prediction. That's because we're using tensorflow-cpu, since we're limited by the RAM and slug size of our application on Heroku's free tier.
After the execution is finished, you should be greeted with a response like this:
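{
"model-prediction": "chocolate cake",
"model-prediction-confidence-score": 2.65
}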
As you can see, the model predicted it correctly, with a confidence score of 2.65%. This low confidence score is alright, since we're not measuring model accuracy here (which would require knowing the true label beforehand), and the model is dealing with data it hasn't seen before.
Conclusion
In this article, you learned how to deploy a TensorFlow CNN model to Heroku, by serving it as a RESTful API, and by using Docker.
If you found this article helpful, feel free to leave a review, and let's connect on Twitter! You can also support me by buying me a coffee.