Safer model formats and interop
Case Study
Recently I trained a model with transformers v4.37.2. It produced the following structure:
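The exact contents depend on the model and tokenizer, but a typical package saved by transformers v4.37 looks roughly like this (file names other than model.safetensors are illustrative):

```
my-model/
├── config.json
├── model.safetensors
├── special_tokens_map.json
├── tokenizer_config.json
└── tokenizer.json
```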
Training creates a model.safetensors file, which is preferred for its higher security, lower memory footprint (zero-copy loading), and faster loading speed. The pickle format used by PyTorch's default serialization has a known security vulnerability. From the pickle documentation:

> Warning: The pickle module is not secure. Only unpickle data you trust.
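To make the risk concrete, here is a minimal sketch of why unpickling untrusted data is dangerous: a pickle payload can execute arbitrary code the moment it is loaded. The `Exploit` class below is a contrived illustration (it only calls `print`), not anything from transformers or PyTorch:

```python
import pickle

class Exploit:
    """Contrived class whose unpickling triggers a function call."""
    def __reduce__(self):
        # Whatever is returned here is executed by pickle.loads();
        # a real attacker would call something far worse than print.
        return (print, ("arbitrary code ran during unpickling!",))

payload = pickle.dumps(Exploit())
pickle.loads(payload)  # merely loading the bytes executes code
```

This is why a format like safetensors, which stores only raw tensor data and metadata, is the safer default for distributing model weights.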
Problem
This newer and recommended model package didn't load in the target environment because that environment runs transformers 4.11.3, an older version that does not support the safetensors format.
Workaround
A simple workaround (not recommended as a long-term fix) is to load the model.safetensors
file into a model and re-save it in the traditional pickle-based format:
Once the .bin file is created, you can delete the model.safetensors
file (or move it elsewhere) and create a package that looks like:
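The resulting package, with illustrative file names, might then be:

```
my-model/
├── config.json
├── pytorch_model.bin
├── special_tokens_map.json
├── tokenizer_config.json
└── tokenizer.json
```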
While this works in the interim, we propose a better long-term solution.
Long Term Solution
When training models, it is important to align early with all parties and systems involved in the productionization process on the model runtime environment, exact library versions, and model packaging. This avoids several problems:
- The environment in which you train might produce a model artifact that is incompatible with the target system.
- Surprises surface during implementation instead of being caught during planning.
- Expectations about formats and folder structures can differ. For example, a model.tar.gz produced by a SageMaker training job might overwrite files in the target system if uncompressed in the same location.
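As a guard against the model.tar.gz pitfall above, always unpack the artifact into its own directory rather than the current one. A minimal sketch using only the Python standard library (the file names are hypothetical):

```python
import pathlib
import tarfile

# Build a stand-in model.tar.gz the way a training job might produce one.
scratch = pathlib.Path("scratch")
scratch.mkdir(exist_ok=True)
(scratch / "config.json").write_text("{}")
with tarfile.open("model.tar.gz", "w:gz") as tar:
    tar.add(scratch / "config.json", arcname="config.json")

# Extract into a dedicated directory so an existing config.json
# in the working directory is never silently overwritten.
dest = pathlib.Path("model_dir")
dest.mkdir(exist_ok=True)
with tarfile.open("model.tar.gz") as tar:
    tar.extractall(dest)
```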
Our recommendation is to work with the target system's owners to upgrade their transformers version so that everyone can use the more secure and forward-looking model packaging format.
Other Tips
A training script might also save an optimizer state file in the model package. This file is only needed to resume training, not for inference, and can be deleted to reduce the size of the model package.
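A small sketch of pruning training-only files before shipping a package. The file names (optimizer.pt, scheduler.pt, and so on) are typical of transformers Trainer checkpoints but may differ in your setup:

```python
import pathlib

# Stand-in package containing both inference and training-only files.
pkg = pathlib.Path("package")
pkg.mkdir(exist_ok=True)
for name in ("config.json", "model.safetensors", "optimizer.pt", "scheduler.pt"):
    (pkg / name).touch()

# Files needed only to resume training, never for inference.
TRAINING_ONLY = {"optimizer.pt", "scheduler.pt", "rng_state.pth", "trainer_state.json"}

for path in pkg.iterdir():
    if path.name in TRAINING_ONLY:
        path.unlink()  # drop it to shrink the package
```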