At OpenSooq, we take AI seriously. TensorFlow and Scikit-Learn are two popular machine learning packages that we use at OpenSooq for different aspects of our e-commerce platform, its supporting services, and its moderation tools. Supervised machine learning has two phases: training (fitting the model) and prediction. Training is done offline; models are then evaluated and persisted so they can be deployed to production. TensorFlow models can be persisted as frozen protocol buffer files. Scikit-Learn models, on the other hand, do not have a good way to do that. The usual suggestion is Pickle (Python's built-in serialization), which has known issues, namely:
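- security: a pickle stream contains instructions that are executed on load, so loading an untrusted file can run arbitrary code
- maintainability: loading requires the same version of sklearn that saved the model
- slow: the file describes whole Python objects, not only the trained weights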
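To make the security point concrete, here is a minimal sketch showing that unpickling can call a function chosen by whoever produced the file. The `Payload` class is hypothetical and exists only for this illustration; it uses a harmless arithmetic `eval`.

```python
import pickle


class Payload:
    """Hypothetical object whose pickle asks the loader to call eval()."""

    def __reduce__(self):
        # pickle records "call this callable with these arguments"
        # and performs the call when the stream is loaded.
        return (eval, ("6 * 7",))


blob = pickle.dumps(Payload())
loaded = pickle.loads(blob)  # eval("6 * 7") runs during loading
print(loaded)  # 42
```

The loaded value is the result of the call, not a `Payload` instance, which is why a pickle file from an untrusted source should never be loaded.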
Regarding the maintainability point, when you try to load a saved model you might get a warning like this:
UserWarning: Trying to unpickle estimator MLPClassifier from version 0.18 when using version 0.19.1. This might lead to breaking code or invalid results. Use at your own risk.
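It is very scary that at some point, when you type “yum update” or “apt-get dist-upgrade”, your trained models might stop working. This happens because Pickle does not store just the trained weights: it stores the full internal state of the Python objects together with references to the classes needed to rebuild them, and both can change between library versions.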
Reliable, and an order of magnitude faster and smaller models? Is it possible?
We have achieved more than 25x faster loading of MLPClassifier (the multi-layer neural network in Scikit-Learn) and 150x faster loading of TfidfVectorizer (used for text in Scikit-Learn) compared to plain Pickle. In terms of size, we got 7x smaller files for MLPClassifier and 50x smaller files for TfidfVectorizer. We were also able to load the models on different Linux distros with different versions of NumPy.
You can see this in the graph, which shows relative loading time and file size.
Instead of “pickling” the model instance, we store the information needed to reconstruct it: the parameters passed to the constructor, some other properties and attributes, and the values of the trainable parameters (weights and biases). This information is stored in a dictionary, that is, a JSON-like object:
{
  "class_name": "something",
  "types": {key1: value1, key2: value2, ...},
  "params": {key1: value1, key2: value2, ...},
  "xtra": {key1: value1, key2: value2, ...},
  "weights": {key1: value1, key2: value2, ...}
}
“params” is what we get from “instance.get_params(True)”, and it can be applied to a target instance using “target.set_params(**params)”. Similarly, “xtra” holds extra attributes that are needed but not returned by “get_params()”. Lastly, “weights” holds the trainable properties of the model instance.
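The round trip described above could be sketched roughly as follows. This is not our production code: `TinyModel` is a hypothetical stand-in that only mimics the `get_params`/`set_params` interface, and the `XTRA_ATTRS`, `WEIGHT_ATTRS` and `CLASSES` tables, as well as the `to_dict`/`from_dict` names, are assumptions made for the illustration. Type handling (the “types” entry) is left out here.

```python
class TinyModel:
    """Hypothetical stand-in mimicking the get_params/set_params interface."""

    def __init__(self, hidden_layer_sizes=(100,), alpha=0.0001):
        self.hidden_layer_sizes = hidden_layer_sizes
        self.alpha = alpha

    def get_params(self, deep=True):
        return {"hidden_layer_sizes": self.hidden_layer_sizes, "alpha": self.alpha}

    def set_params(self, **params):
        for key, value in params.items():
            setattr(self, key, value)
        return self


# Assumed lookup tables: which attributes to save for each class name.
XTRA_ATTRS = {"TinyModel": ["n_layers_"]}
WEIGHT_ATTRS = {"TinyModel": ["coefs_", "intercepts_"]}
CLASSES = {"TinyModel": TinyModel}


def to_dict(instance):
    """Collect constructor params, extra attributes and trained weights."""
    name = type(instance).__name__
    return {
        "class_name": name,
        "params": instance.get_params(True),
        "xtra": {k: getattr(instance, k) for k in XTRA_ATTRS[name]},
        "weights": {k: getattr(instance, k) for k in WEIGHT_ATTRS[name]},
    }


def from_dict(d):
    """Rebuild an instance: construct, set params, then restore attributes."""
    target = CLASSES[d["class_name"]]()
    target.set_params(**d["params"])
    for section in ("xtra", "weights"):
        for key, value in d[section].items():
            setattr(target, key, value)
    return target


model = TinyModel(hidden_layer_sizes=(3,), alpha=0.01)
model.n_layers_ = 3
model.coefs_ = [[[0.1, 0.2, 0.3]], [[0.4], [0.5], [0.6]]]
model.intercepts_ = [[0.0, 0.0, 0.0], [0.1]]

restored = from_dict(to_dict(model))
```

Looking classes up in an explicit table, rather than importing whatever name the file contains, keeps loading limited to classes we chose to support.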
In JSON, keys are strings, and values have to be primitive types (strings, numbers, booleans, or null) or lists or dictionaries composed of such types. When a property is not of such a type, we record the name of its type in “types” and cast the value to a similar, simpler type. For example, we store tuples as lists, but we record the property name and indicate that it should be cast back to a tuple. NumPy’s “ndarray” is handled like a tuple: we use “a.tolist()” to cast it to a list.
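A minimal sketch of this type bookkeeping, assuming only top-level values need casting. The helper names `simplify` and `restore` and the tag strings are hypothetical choices for this example.

```python
import json


def simplify(props):
    """Return (plain, types): JSON-friendly values plus the recorded type names."""
    plain, types = {}, {}
    for key, value in props.items():
        if isinstance(value, tuple):
            plain[key] = list(value)
            types[key] = "tuple"
        elif hasattr(value, "tolist"):  # e.g. a NumPy ndarray
            plain[key] = value.tolist()
            types[key] = "ndarray"
        else:
            plain[key] = value
    return plain, types


def restore(plain, types):
    """Cast the recorded keys back to their original types."""
    out = dict(plain)
    for key, type_name in types.items():
        if type_name == "tuple":
            out[key] = tuple(out[key])
        elif type_name == "ndarray":
            import numpy  # only needed when an array was actually stored
            out[key] = numpy.array(out[key])
    return out


params = {"hidden_layer_sizes": (100, 50), "alpha": 0.0001}
plain, types = simplify(params)

# Simulate saving and loading through JSON text.
loaded = json.loads(json.dumps({"params": plain, "types": types}))
round_tripped = restore(loaded["params"], loaded["types"])
```

Without the “types” record, `hidden_layer_sizes` would come back as a list, and the reconstructed instance would no longer report the same parameters as the original.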
The Compressed Sparse Row matrix (“csr_matrix”) is stored as:
{
  "class_name": "csr_matrix",
  "shape": [...],
  "arg1": [...],
  "dtype": "<NAME>"
}
We can persist this dictionary directly to a JSON file, or to any JSON-like format such as MsgPack.
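As a rough sketch, persisting and reloading such a dictionary with the standard `json` module could look like this. The dictionary contents and the file name are made up for the example.

```python
import json
import os
import tempfile

# Made-up model dictionary containing only JSON-friendly values.
model_dict = {
    "class_name": "something",
    "types": {"hidden_layer_sizes": "tuple"},
    "params": {"hidden_layer_sizes": [3], "alpha": 0.01},
    "xtra": {"n_layers_": 3},
    "weights": {"intercepts_": [[0.0, 0.0, 0.0], [0.1]]},
}

path = os.path.join(tempfile.mkdtemp(), "model.json")

# Save: the file is plain text and holds data only, no executable instructions.
with open(path, "w", encoding="utf-8") as f:
    json.dump(model_dict, f)

# Load: parsing JSON yields plain dicts, lists, strings and numbers.
with open(path, encoding="utf-8") as f:
    loaded = json.load(f)
```

With the third-party msgpack package, `msgpack.packb` and `msgpack.unpackb` would play the same role while producing a more compact binary file.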