Your First Containerized Machine Learning Deployment with Docker and FastAPI



Introduction

Deploying machine learning models can seem complex, but modern tools can streamline the process. FastAPI is a high-performance web framework for building APIs, while Docker allows you to run applications in isolated, containerized environments. Combining these two technologies simplifies deployment across different systems, ensures scalability, and makes maintenance easier. This approach helps avoid dependency conflicts in production, creating a reliable pipeline for serving ML models.

In this article, you’ll learn how to deploy a machine learning model using FastAPI and Docker.

Preparation

Before you start, ensure that you have the following installed on your system:

  • Python 3.8+ – Required for training the model and running the FastAPI server
  • pip – The package installer for Python, used to manage dependencies
  • Docker – A container platform used to build and run the application consistently across environments

You should also be comfortable with basic Python programming, have an understanding of machine learning concepts, and be familiar with RESTful APIs.

Here’s the recommended structure for your project:
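
The following layout reflects the files used throughout this article (the model file name, model.joblib, is a convention adopted here rather than a requirement):

```text
project/
├── app/
│   ├── main.py          # FastAPI application
│   └── model.joblib     # Serialized model, created by train_model.py
├── train_model.py       # Model training script
├── Dockerfile           # Container build instructions
└── requirements.txt     # Python dependencies
```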

Training the Machine Learning Model

We’ll begin by training a simple random forest classifier using Scikit-learn’s Iris dataset. The script below, which you should save as train_model.py, handles loading the data, training the classifier, and serializing the model to a file using joblib. This saved model will be placed in the app/ directory, as defined in our project structure.
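
Here is a minimal sketch of that script; the random forest hyperparameters shown are illustrative defaults:

```python
# train_model.py
import joblib
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

# Load the Iris dataset (150 samples, 4 features, 3 classes)
X, y = load_iris(return_X_y=True)

# Train a simple random forest classifier
model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X, y)

# Serialize the trained model into the app/ directory
joblib.dump(model, "app/model.joblib")
print("Model saved to app/model.joblib")
```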

To train and save your model, run this script from your terminal:
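
```bash
python train_model.py
```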

Creating a FastAPI Application

The next step is to expose the model through an API so it can be accessed by other applications or users. FastAPI makes this easy with minimal boilerplate and excellent support for type checking, validation, and documentation.

We’ll build a simple FastAPI application that loads the trained model and offers a single endpoint, /predict, to return predictions based on user input.
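
The sketch below is one way to structure app/main.py; the request field names and the response key predicted_class are design choices for this example rather than anything FastAPI mandates:

```python
# app/main.py
import joblib
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

# Load the trained model once at startup, not on every request
model = joblib.load("app/model.joblib")

# Input schema: the four Iris flower measurements, validated by Pydantic
class IrisInput(BaseModel):
    sepal_length: float
    sepal_width: float
    petal_length: float
    petal_width: float

@app.post("/predict")
def predict(data: IrisInput):
    features = [[
        data.sepal_length,
        data.sepal_width,
        data.petal_length,
        data.petal_width,
    ]]
    prediction = model.predict(features)
    # Return the predicted class index (0, 1, or 2 for the Iris species)
    return {"predicted_class": int(prediction[0])}
```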

This app exposes a single endpoint, /predict, that accepts flower measurements and returns the predicted class.

Writing the Dockerfile

To run this FastAPI application in a containerized environment, you need to create a Dockerfile. This file contains instructions for Docker to build an image that packages your application and its dependencies. Create a file in your project’s root directory with the following contents and name it Dockerfile, with no file extension.
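
Here is a reasonable sketch of that Dockerfile; the python:3.9-slim base image is one sensible choice among several:

```dockerfile
# Use a slim Python base image to keep the final image small
FROM python:3.9-slim

WORKDIR /code

# Install dependencies first so Docker can cache this layer
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy the application code and the serialized model
COPY app/ ./app

# Start the FastAPI app with Uvicorn, listening on all interfaces
CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8000"]
```

Copying requirements.txt and installing dependencies before copying the application code lets Docker cache the dependency layer, so rebuilds after code changes stay fast.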

Creating the requirements.txt File

The requirements.txt file is used by pip to install all the necessary dependencies in your Docker container. It should include all the libraries used in your project:
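
For this project, that means FastAPI, a server to run it (Uvicorn), scikit-learn, and joblib. Pinning exact versions is a good habit for reproducible builds; the unpinned list below is a minimal starting point:

```text
fastapi
uvicorn[standard]
scikit-learn
joblib
```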

You can generate this file manually or by running:
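
```bash
pip freeze > requirements.txt
```

Keep in mind that pip freeze captures every package installed in your environment, so trim the output down to the libraries your application actually imports.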

Building and Running the Docker Container

Once you have your FastAPI application, model, and Dockerfile ready, the next step is to containerize the application using Docker and then run it. This process ensures that your app can run reliably across any environment.

First, build the Docker image:
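
Run this from the project root (the directory containing the Dockerfile). The tag iris-api is an arbitrary name used in the commands that follow:

```bash
docker build -t iris-api .
```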

Then, run the container:
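
```bash
docker run -p 8000:8000 iris-api
```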

This command maps the container’s port 8000 to your local port 8000.

Testing the API Endpoint

Now that your FastAPI app is running in a Docker container, you can test the API locally.

You can verify the endpoint using your browser or a tool like Postman, or you can do so with curl:
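
The JSON body below matches the IrisInput schema from the application sketch earlier:

```bash
curl -X POST "http://localhost:8000/predict" \
  -H "Content-Type: application/json" \
  -d '{"sepal_length": 5.1, "sepal_width": 3.5, "petal_length": 1.4, "petal_width": 0.2}'
```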

Expected output:
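
With the example application above, the response contains the predicted class index; these measurements correspond to Iris setosa, which is class 0:

```json
{"predicted_class": 0}
```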

FastAPI also provides interactive documentation at http://localhost:8000/docs. You can use this Swagger UI to test and troubleshoot the /predict endpoint directly in your browser.

Improving Model Serving

While the basic setup works well for initial deployment, real-world scenarios often require enhancements to improve the development experience and manage environment-specific configurations. Here are a few pointers to keep in mind.

Enabling Live Reload During Development

When developing locally, it’s helpful to enable automatic reloading so that your API restarts whenever you make changes to the code. FastAPI runs on Uvicorn, which supports this feature. For development, modify the CMD line in your Dockerfile (or override the command via docker-compose) like this:
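
```dockerfile
# Development only: restart the server whenever the source code changes
CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8000", "--reload"]
```

Note that --reload only helps if the container can see your edits, so mount your source directory as a volume during development (for example, docker run -v "$(pwd)/app:/code/app" ...); the image itself contains a fixed copy of the code.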

Note: The --reload flag is intended for development only. Avoid using it in production environments.

Using Environment Variables

Instead of hardcoding paths or configuration values, use environment variables. This makes your app more flexible and production-ready.

For example, you can refactor model loading in main.py:
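
A minimal refactor, assuming an environment variable named MODEL_PATH:

```python
import os

import joblib

# Read the model path from the environment, falling back to the bundled default
MODEL_PATH = os.getenv("MODEL_PATH", "app/model.joblib")
model = joblib.load(MODEL_PATH)
```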

You can also pass environment variables in your Docker run command:
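
```bash
docker run -p 8000:8000 -e MODEL_PATH=app/model.joblib iris-api
```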

Conclusion

Deploying machine learning models with FastAPI and Docker is an efficient and scalable approach. FastAPI offers a high-performance method for exposing models as APIs, while Docker ensures consistent behavior across all environments. Together, they create a powerful workflow that simplifies development, testing, and deployment, and helps your machine learning service become more robust and ready for production.

