Your First Containerized Machine Learning Deployment with Docker and FastAPI



Introduction

Deploying machine learning models can seem complex, but modern tools can streamline the process. FastAPI is a high-performance web framework for building APIs, while Docker allows you to run applications in isolated, containerized environments. Combining these two technologies simplifies deployment across different systems, ensures scalability, and makes maintenance easier. This approach helps avoid dependency conflicts in production, creating a reliable pipeline for serving ML models.

In this article, you’ll learn how to deploy a machine learning model using FastAPI and Docker.

Preparation

Before you start, ensure that you have the following installed on your system:

  • Python 3.8+ – Required for training the model and running the FastAPI server
  • pip – The package installer for Python, used to manage dependencies
  • Docker – A container platform used to build and run the application consistently across environments

You should also be comfortable with basic Python programming, have an understanding of machine learning concepts, and be familiar with RESTful APIs.

Here’s the recommended structure for your project:
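
The following layout reflects the files used throughout this article (the model file name, model.joblib, is a convention adopted here rather than a requirement):

```text
project/
├── app/
│   ├── main.py          # FastAPI application
│   └── model.joblib     # Serialized model, created by train_model.py
├── train_model.py       # Model training script
├── Dockerfile           # Container build instructions
└── requirements.txt     # Python dependencies
```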

Training the Machine Learning Model

We’ll begin by training a simple random forest classifier using Scikit-learn’s Iris dataset. The script below, which you should save as train_model.py, handles loading the data, training the classifier, and serializing the model to a file using joblib. This saved model will be placed in the app/ directory, as defined in our project structure.
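
Here is a minimal sketch of that script; the random forest hyperparameters shown are illustrative defaults:

```python
# train_model.py
import joblib
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

# Load the Iris dataset (150 samples, 4 features, 3 classes)
X, y = load_iris(return_X_y=True)

# Train a simple random forest classifier
model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X, y)

# Serialize the trained model into the app/ directory
joblib.dump(model, "app/model.joblib")
print("Model saved to app/model.joblib")
```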

To train and save your model, run this script from your terminal:
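
```bash
python train_model.py
```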

Creating a FastAPI Application

The next step is to expose the model through an API so it can be accessed by other applications or users. FastAPI makes this easy with minimal boilerplate and excellent support for type checking, validation, and documentation.

We’ll build a simple FastAPI application that loads the trained model and offers a single endpoint, /predict, to return predictions based on user input.
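
The sketch below is one way to structure app/main.py; the request field names and the response key predicted_class are design choices for this example rather than anything FastAPI mandates:

```python
# app/main.py
import joblib
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

# Load the trained model once at startup, not on every request
model = joblib.load("app/model.joblib")

# Input schema: the four Iris flower measurements, validated by Pydantic
class IrisInput(BaseModel):
    sepal_length: float
    sepal_width: float
    petal_length: float
    petal_width: float

@app.post("/predict")
def predict(data: IrisInput):
    features = [[
        data.sepal_length,
        data.sepal_width,
        data.petal_length,
        data.petal_width,
    ]]
    prediction = model.predict(features)
    # Return the predicted class index (0, 1, or 2 for the Iris species)
    return {"predicted_class": int(prediction[0])}
```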

This app exposes a single endpoint, /predict, that accepts flower measurements and returns the predicted class.

Writing the Dockerfile

To run this FastAPI application in a containerized environment, you need to create a Dockerfile. This file contains instructions for Docker to build an image that packages your application and its dependencies. Create a file in your project’s root directory with the following contents and name it Dockerfile, with no file extension.
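
Here is a reasonable sketch of that Dockerfile; the python:3.9-slim base image is one sensible choice among several:

```dockerfile
# Use a slim Python base image to keep the final image small
FROM python:3.9-slim

WORKDIR /code

# Install dependencies first so Docker can cache this layer
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy the application code and the serialized model
COPY app/ ./app

# Start the FastAPI app with Uvicorn, listening on all interfaces
CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8000"]
```

Copying requirements.txt and installing dependencies before copying the application code lets Docker cache the dependency layer, so rebuilds after code changes stay fast.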

Creating the requirements.txt File

The requirements.txt file is used by pip to install all the necessary dependencies in your Docker container. It should include all the libraries used in your project:
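
For this project, that means FastAPI, a server to run it (Uvicorn), scikit-learn, and joblib. Pinning exact versions is a good habit for reproducible builds; the unpinned list below is a minimal starting point:

```text
fastapi
uvicorn[standard]
scikit-learn
joblib
```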

You can generate this file manually or by running:
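
```bash
pip freeze > requirements.txt
```

Keep in mind that pip freeze captures every package installed in your environment, so trim the output down to the libraries your application actually imports.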

Building and Running the Docker Container

Once you have your FastAPI application, model, and Dockerfile ready, the next step is to containerize the application using Docker and then run it. This process ensures that your app can run reliably across any environment.

First, build the Docker image:
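
Run this from the project root (the directory containing the Dockerfile). The tag iris-api is an arbitrary name used in the commands that follow:

```bash
docker build -t iris-api .
```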

Then, run the container:
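
```bash
docker run -p 8000:8000 iris-api
```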

This command maps the container’s port 8000 to your local port 8000.

Testing the API Endpoint

Now that your FastAPI app is running in a Docker container, you can test the API locally.

You can verify the endpoint using your browser or a tool like Postman, or you can do so with curl:
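
The JSON body below matches the IrisInput schema from the application sketch earlier:

```bash
curl -X POST "http://localhost:8000/predict" \
  -H "Content-Type: application/json" \
  -d '{"sepal_length": 5.1, "sepal_width": 3.5, "petal_length": 1.4, "petal_width": 0.2}'
```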

Expected output:
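
With the example application above, the response contains the predicted class index; these measurements correspond to Iris setosa, which is class 0:

```json
{"predicted_class": 0}
```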

FastAPI also provides interactive documentation at http://localhost:8000/docs. You can use this Swagger UI to test and troubleshoot the /predict endpoint directly in your browser.

Improving Model Serving

While the basic setup works well for initial deployment, real-world scenarios often require enhancements to improve the development experience and manage environment-specific configurations. Here are a few pointers to keep in mind.

Enabling Live Reload During Development

When developing locally, it’s helpful to enable automatic reloading so that your API restarts whenever you make changes to the code. FastAPI runs on Uvicorn, which supports this feature. For development, modify the CMD line in your Dockerfile (or override the command via docker-compose) like this:
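
```dockerfile
# Development only: restart the server whenever the source code changes
CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8000", "--reload"]
```

Note that --reload only helps if the container can see your edits, so mount your source directory as a volume during development (for example, docker run -v "$(pwd)/app:/code/app" ...); the image itself contains a fixed copy of the code.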

Note: The --reload flag is intended for development only. Avoid using it in production environments.

Using Environment Variables

Instead of hardcoding paths or configuration values, use environment variables. This makes your app more flexible and production-ready.

For example, you can refactor model loading in main.py:
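
A minimal refactor, assuming an environment variable named MODEL_PATH:

```python
import os

import joblib

# Read the model path from the environment, falling back to the bundled default
MODEL_PATH = os.getenv("MODEL_PATH", "app/model.joblib")
model = joblib.load(MODEL_PATH)
```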

You can also pass environment variables in your Docker run command:
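
```bash
docker run -p 8000:8000 -e MODEL_PATH=app/model.joblib iris-api
```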

Conclusion

Deploying machine learning models with FastAPI and Docker is an efficient and scalable approach. FastAPI offers a high-performance method for exposing models as APIs, while Docker ensures consistent behavior across all environments. Together, they create a powerful workflow that simplifies development, testing, and deployment, and helps your machine learning service become more robust and ready for production.

