Safer model formats and interop
Case Study
Recently I trained a model with transformers v4.37.2. It produced the following structure:
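The exact contents depend on the model and tokenizer, but a typical package saved by transformers v4.37 looks roughly like this (file names other than model.safetensors are illustrative):

```
my-model/
├── config.json
├── model.safetensors
├── special_tokens_map.json
├── tokenizer_config.json
└── tokenizer.json
```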
Training creates a model.safetensors file, which is preferred for its higher security, lower memory footprint (zero-copy loading), and faster loading speed. The pickle format used by PyTorch's default serialization has a known security vulnerability. From the pickle documentation:

> Warning: The pickle module is not secure. Only unpickle data you trust.
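To make the risk concrete, here is a minimal sketch of why unpickling untrusted data is dangerous: a pickle payload can execute arbitrary code the moment it is loaded. The `Exploit` class below is a contrived illustration (it only calls `print`), not anything from transformers or PyTorch:

```python
import pickle

class Exploit:
    """Contrived class whose unpickling triggers a function call."""
    def __reduce__(self):
        # Whatever is returned here is executed by pickle.loads();
        # a real attacker would call something far worse than print.
        return (print, ("arbitrary code ran during unpickling!",))

payload = pickle.dumps(Exploit())
pickle.loads(payload)  # merely loading the bytes executes code
```

This is why a format like safetensors, which stores only raw tensor data and metadata, is the safer default for distributing model weights.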
Problem
This newer and recommended model package didn't load in the target environment because that environment runs transformers 4.11.3, an older version that does not support the safetensors format.
Workaround
A simple workaround (not recommended as a long-term fix) is to load the model.safetensors
file into a model and re-save it in the traditional pickle-based format:
Once the .bin file is created, you can delete the model.safetensors
file (or move it elsewhere) and create a package that looks like:
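The resulting package, with illustrative file names, might then be:

```
my-model/
├── config.json
├── pytorch_model.bin
├── special_tokens_map.json
├── tokenizer_config.json
└── tokenizer.json
```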
While this works in the interim, we propose a better long-term solution.
Long Term Solution
When training models, it is important to align early with all parties and systems involved in the productionization process on the model runtime environment, exact library versions, and model packaging. This avoids several problems:
- The environment in which you train might produce a model artifact that is incompatible with the target system.
- Surprises surface during implementation instead of being caught during planning.
- Expectations about formats and folder structures can differ. For example, a model.tar.gz produced by a SageMaker training job might overwrite files in the target system if uncompressed in the same location.
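As a guard against the model.tar.gz pitfall above, always unpack the artifact into its own directory rather than the current one. A minimal sketch using only the Python standard library (the file names are hypothetical):

```python
import pathlib
import tarfile

# Build a stand-in model.tar.gz the way a training job might produce one.
scratch = pathlib.Path("scratch")
scratch.mkdir(exist_ok=True)
(scratch / "config.json").write_text("{}")
with tarfile.open("model.tar.gz", "w:gz") as tar:
    tar.add(scratch / "config.json", arcname="config.json")

# Extract into a dedicated directory so an existing config.json
# in the working directory is never silently overwritten.
dest = pathlib.Path("model_dir")
dest.mkdir(exist_ok=True)
with tarfile.open("model.tar.gz") as tar:
    tar.extractall(dest)
```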
Our recommendation is to work with the target system's owners to upgrade their transformers version so that everyone can use the more secure and forward-looking model packaging format.
Other Tips
A training script might also save an optimizer state file in the model package. This file is only needed to resume training, not for inference, and can be deleted to reduce the size of the model package.
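A small sketch of pruning training-only files before shipping a package. The file names (optimizer.pt, scheduler.pt, and so on) are typical of transformers Trainer checkpoints but may differ in your setup:

```python
import pathlib

# Stand-in package containing both inference and training-only files.
pkg = pathlib.Path("package")
pkg.mkdir(exist_ok=True)
for name in ("config.json", "model.safetensors", "optimizer.pt", "scheduler.pt"):
    (pkg / name).touch()

# Files needed only to resume training, never for inference.
TRAINING_ONLY = {"optimizer.pt", "scheduler.pt", "rng_state.pth", "trainer_state.json"}

for path in pkg.iterdir():
    if path.name in TRAINING_ONLY:
        path.unlink()  # drop it to shrink the package
```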