
Your First Containerized Machine Learning Deployment with Docker and FastAPI
Introduction
Deploying machine learning models can seem complex, but modern tools can streamline the process. FastAPI is a high-performance web framework for building APIs, while Docker allows you to run applications in isolated, containerized environments. Combining these two technologies simplifies deployment across different systems, ensures scalability, and makes maintenance easier. This approach helps avoid dependency conflicts in production, creating a reliable pipeline for serving ML models.
In this article, you’ll learn how to deploy a machine learning model using FastAPI and Docker.
Preparation
Before you start, ensure that you have the following installed on your system:
- Python 3.8+ – Required for training the model and running the FastAPI server
- pip – The package installer for Python, used to manage dependencies
- Docker – A container platform used to build and run the application consistently across environments
You should also be comfortable with basic Python programming, have an understanding of machine learning concepts, and be familiar with RESTful APIs.
Here’s the recommended structure for your project:
```
iris-fastapi-app/
├── app/
│   ├── __init__.py
│   └── iris_model.pkl    # Trained model
├── main.py               # FastAPI app
├── train_model.py        # Script to train and save the model
├── requirements.txt      # Dependencies
└── Dockerfile            # Docker build file
```
Training the Machine Learning Model
We’ll begin by training a simple random forest classifier on scikit-learn’s Iris dataset. The script below, which you should save as train_model.py, handles loading the data, training the classifier, and serializing the model to a file using joblib. The saved model is placed in the app/ directory, as defined in our project structure.
```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
import joblib
import os

def train_and_save_model():
    # Ensure the 'app' directory exists
    os.makedirs('app', exist_ok=True)

    iris = load_iris()
    X, y = iris.data, iris.target

    model = RandomForestClassifier()
    model.fit(X, y)

    joblib.dump(model, 'app/iris_model.pkl')
    print("Model trained and saved to app/iris_model.pkl")

if __name__ == "__main__":
    train_and_save_model()
```
To train and save your model, run this script from your terminal:
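```bash
python train_model.py
```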
Creating a FastAPI Application
The next step is to expose the model through an API so it can be accessed by other applications or users. FastAPI makes this easy with minimal boilerplate and excellent support for type checking, validation, and documentation.
We’ll build a simple FastAPI application that loads the trained model and offers a single endpoint, /predict, to return predictions based on user input. Save the following as main.py in the project root, as shown in the structure above.
```python
from fastapi import FastAPI
from pydantic import BaseModel
import joblib
import numpy as np

app = FastAPI()
model = joblib.load("app/iris_model.pkl")

class IrisInput(BaseModel):
    sepal_length: float
    sepal_width: float
    petal_length: float
    petal_width: float

@app.post("/predict")
def predict(data: IrisInput):
    input_data = np.array([[data.sepal_length, data.sepal_width,
                            data.petal_length, data.petal_width]])
    prediction = model.predict(input_data)
    return {"prediction": int(prediction[0])}
```
This app exposes a single endpoint, /predict, that accepts flower measurements and returns the predicted class.
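Before containerizing, you can optionally sanity-check the app directly on your machine, assuming the dependencies (fastapi, uvicorn, scikit-learn, joblib, numpy) are installed in your local environment:

```bash
uvicorn main:app --reload
```

If the server starts without errors and responds on http://localhost:8000, you know any issues that appear later are Docker-related rather than application bugs.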
Writing the Dockerfile
To run this FastAPI application in a containerized environment, you need to create a Dockerfile. This file contains instructions for Docker to build an image that packages your application and its dependencies. Create a file named Dockerfile (no file extension) in your project’s root directory with the following contents.
```dockerfile
# Use an official Python runtime as a base image
FROM python:3.10-slim

# Set the working directory
WORKDIR /app

# Copy requirements and install dependencies
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy the rest of the application's code
COPY . .

# Expose the port
EXPOSE 8000

# Run the application
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]
```
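Because COPY . . copies the entire build context into the image, you can optionally add a .dockerignore file next to the Dockerfile to keep the image lean. A minimal sketch:

```
__pycache__/
*.pyc
.git/
```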
Creating the requirements.txt File
The requirements.txt file is used by pip to install all the necessary dependencies in your Docker container. It should include all the libraries used in your project:
```
fastapi
uvicorn
scikit-learn
joblib
numpy
```
You can generate this file manually or by running:
```bash
pip freeze > requirements.txt
```

Note that pip freeze pins every package installed in your current environment, so trim the output down to just the libraries your app actually needs.
Building and Running the Docker Container
Once you have your FastAPI application, model, and Dockerfile ready, the next step is to containerize the application using Docker and run it. This process ensures that your app can run reliably across any environment.
First, build the Docker image:
```bash
docker build -t iris-fastapi-app .
```
Then, run the container:
```bash
docker run -d -p 8000:8000 iris-fastapi-app
```
The -d flag runs the container in detached (background) mode, and -p 8000:8000 maps the container’s port 8000 to port 8000 on your local machine.
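Since the container runs in the background, it is worth confirming that it is actually up and, if anything looks wrong, checking its startup logs:

```bash
docker ps                    # list running containers; yours should appear here
docker logs <container-id>   # replace <container-id> with the ID shown by docker ps
```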
Testing the API Endpoint
Now that your FastAPI app is running in a Docker container, you can test the API locally.
You can verify it using your browser or a tool like Postman, or from the command line with curl:
```bash
curl -X POST "http://localhost:8000/predict" \
  -H "Content-Type: application/json" \
  -d '{"sepal_length": 5.1, "sepal_width": 3.5, "petal_length": 1.4, "petal_width": 0.2}'
```
These measurements correspond to an Iris setosa sample, so the expected output is:
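```json
{"prediction": 0}
```

Class 0 maps to setosa in the Iris dataset’s label encoding.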
FastAPI also provides interactive documentation at http://localhost:8000/docs. You can use this Swagger UI to test and troubleshoot the /predict endpoint directly in your browser.
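If you prefer to call the endpoint from Python rather than curl, here is a minimal client sketch using the requests library (not part of requirements.txt, so install it separately with pip install requests):

```python
import requests

# Same sample flower measurements as in the curl example above
payload = {
    "sepal_length": 5.1,
    "sepal_width": 3.5,
    "petal_length": 1.4,
    "petal_width": 0.2,
}

response = requests.post("http://localhost:8000/predict", json=payload)
response.raise_for_status()  # fail loudly on any non-2xx status
print(response.json())       # expected: {'prediction': 0}
```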
Improving Model Serving
While the basic setup works well for initial deployment, real-world scenarios often require enhancements to improve the development experience and manage environment-specific configurations. Here are a few pointers to keep in mind.
Enabling Live Reload During Development
When developing locally, it’s helpful to enable automatic reloading so that your API restarts whenever you make changes to the code. FastAPI uses Uvicorn, which supports this feature. Modify the CMD line in your Dockerfile like this, or use docker-compose for development (see the sketch after the note below):
```dockerfile
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000", "--reload"]
```
Note: The --reload flag is intended for development only. Avoid using it in production environments.
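For the docker-compose route, a minimal development sketch might look like the following. The service name api is arbitrary, and the volume mount maps your local source tree into the container so that --reload can pick up edits:

```yaml
# docker-compose.yml (development only)
services:
  api:
    build: .
    ports:
      - "8000:8000"
    volumes:
      - .:/app  # mount local source so code changes trigger a reload
    command: uvicorn main:app --host 0.0.0.0 --port 8000 --reload
```

Start it with docker compose up; the override of the image’s CMD via command keeps the production Dockerfile unchanged.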
Using Environment Variables
Instead of hardcoding paths or configuration values, use environment variables. This makes your app more flexible and production-ready.
For example, you can refactor the model loading in main.py:
```python
import os

model_path = os.getenv("MODEL_PATH", "app/iris_model.pkl")
model = joblib.load(model_path)
```
You can also pass environment variables in your Docker run command:
```bash
docker run -d -p 8000:8000 -e MODEL_PATH=app/iris_model.pkl iris-fastapi-app
```
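You can also bake a default into the image with an ENV instruction in the Dockerfile, which the -e flag above overrides at run time:

```dockerfile
# Default model location; override with `docker run -e MODEL_PATH=...`
ENV MODEL_PATH=app/iris_model.pkl
```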
Conclusion
Deploying machine learning models with FastAPI and Docker is an efficient and scalable approach. FastAPI offers a high-performance method for exposing models as APIs, while Docker ensures consistent behavior across all environments. Together, they create a powerful workflow that simplifies development, testing, and deployment, helping your machine learning service become more robust and production-ready.