Safer model formats and interop

Content

  1. Case Study
  2. Problem
  3. Workaround
  4. Long Term Solution
  5. Other Tips

Case Study

Recently I trained a model with transformers v4.37.2. It produced the following structure:

(pytorch_p310) sh-4.2$ tar -czvf model.tar.gz ./model
./model/
./model/config.json
./model/model.safetensors
./model/rng_state.pth
./model/training_args.bin
./model/tokenizer_config.json
./model/vocab.txt
./model/special_tokens_map.json
./model/scheduler.pt
./model/tokenizer.json
./model/trainer_state.json

The trainer saves the weights as model.safetensors (line 4 of the listing), which is preferred over the older formats for its stronger security guarantees, lower memory footprint (zero-copy loading), and faster load times. The pickle-based formats that PyTorch uses (.bin, .pt, .pth) carry a well-known security risk. From the pickle documentation:

Warning: The pickle module is not secure. Only unpickle data you trust.
It is possible to construct malicious pickle data which will execute
arbitrary code during unpickling. Never unpickle data that could have
come from an untrusted source, or that could have been tampered with.
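To make the risk concrete, here is a minimal sketch of how a pickle payload can run arbitrary code at load time. The `Malicious` class and the `record` helper are illustrative inventions, not anything from the model package above:

```python
import pickle

executed = []  # records the side effect so we can observe it


def record(message):
    executed.append(message)


class Malicious:
    # pickle calls __reduce__ to learn how to rebuild an object; it may
    # return any callable plus arguments, and that callable is invoked
    # during unpickling.
    def __reduce__(self):
        return (record, ("arbitrary code ran",))


payload = pickle.dumps(Malicious())
pickle.loads(payload)  # record(...) fires here; it could just as well be os.system
```

This is why loading a pytorch_model.bin from an untrusted source is dangerous, while safetensors stores only raw tensors and metadata.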

Problem

This newer, recommended model package failed to load in the target environment because its runtime was pinned to transformers 4.11.3, which predates safetensors support.
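A mismatch like this can be caught before deployment by comparing version tuples. The helpers below are an illustrative sketch (for real version handling, prefer `packaging.version.parse`, which also understands pre-release suffixes):

```python
def version_tuple(version: str) -> tuple:
    # "4.11.3" -> (4, 11, 3); assumes plain dotted release numbers
    return tuple(int(part) for part in version.split(".")[:3])


def meets_minimum(installed: str, required: str) -> bool:
    # Tuple comparison is element-wise, so (4, 37, 2) >= (4, 11, 3)
    return version_tuple(installed) >= version_tuple(required)


meets_minimum("4.37.2", "4.11.3")   # True  -> training env is newer
meets_minimum("4.11.3", "4.37.2")   # False -> target runtime is too old
```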

Workaround

A simple workaround (not recommended as a long-term fix) is to load the weights from model.safetensors and re-save them in the traditional pickle-based format:

import torch
from transformers import AutoModel

model = AutoModel.from_pretrained("./model")  # picks up model.safetensors
torch.save(model.state_dict(), "./model/pytorch_model.bin")

Once the pytorch_model.bin file exists, delete or move the model.safetensors file somewhere else and create a package that looks like:

(pytorch_p310) sh-4.2$ tar -czvf model.tar.gz ./model
./model/
./model/config.json
./model/pytorch_model.bin
./model/rng_state.pth
./model/training_args.bin
./model/tokenizer_config.json
./model/vocab.txt
./model/special_tokens_map.json
./model/scheduler.pt
./model/tokenizer.json
./model/trainer_state.json

While this works in the interim, we propose a better long term solution.

Long Term Solution

When training models, it is important to align early with all parties and systems involved in the productionization process on the model runtime environment, exact library versions, and model packaging format. This avoids several problems:

  1. The environment in which you train may produce a model artifact that is incompatible with the target system.
  2. Surprises surface during implementation rather than during planning.
  3. Expectations about formats and folder structures can differ. For example, a model.tar.gz produced by a SageMaker training job might overwrite files in the target system if uncompressed in the same location.
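One lightweight way to enforce this alignment is a single pinned requirements file shared by the training and serving environments. The pins below are illustrative (only the transformers version comes from this write-up):

```
# requirements.txt -- shared by training and inference environments
transformers==4.37.2
torch==2.1.2
safetensors==0.4.2
```

With both environments installing from the same file, a model saved in one is guaranteed to be loadable in the other.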

Our recommendation is to work with the owners of the target system to upgrade their transformers version so that the more secure, forward-looking model packaging format can be used.

Other Tips

A training script might also save an optimizer state file in the model package. It is not required for pure inference and can be deleted to reduce the size of the model package.
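The cleanup can be scripted. This is a minimal sketch; the set of training-only filenames is an assumption based on a typical transformers Trainer output directory (optimizer.pt is the name the Trainer usually uses for the optimizer state):

```python
from pathlib import Path

# Files produced during training that pure inference does not need.
# These names are assumptions; adjust the set to your own package.
TRAINING_ONLY = {
    "optimizer.pt",
    "scheduler.pt",
    "rng_state.pth",
    "trainer_state.json",
    "training_args.bin",
}


def prune_for_inference(model_dir: str) -> list:
    """Delete training-only artifacts; return the sorted names removed."""
    removed = []
    for path in Path(model_dir).iterdir():
        if path.name in TRAINING_ONLY:
            path.unlink()
            removed.append(path.name)
    return sorted(removed)
```

Run this against the model directory before building the tarball so the training-only files never reach the target system.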