How Should You Protect Your Machine Learning Models and IP?

Over the last decade I’ve helped hundreds of product teams ship ML-based products, inside and outside of Google, and one of the most frequent questions I’ve been asked is “How do I protect my models?” It usually came from executives, and digging deeper it became clear they were most worried about competitors gaining an advantage from what we released. This worry is completely understandable, because modern machine learning has become essential for many applications so quickly that best practices haven’t had time to settle and spread. The answers are complex and depend to some extent on your exact threat model, but if you want a summary of the advice I usually give, it boils down to:

  • Treat your training data like you do your traditional source code.
  • Treat your model files like compiled executables.

To explain why I ended up with these conclusions, I’ll need to dive into some of the ways that malicious actors could potentially harm a company based on how ML materials are released. I’ve spent a lot of my time focused on edge deployments, but many of the points are applicable to cloud applications too.

The most concerning threat is frequently “Will releasing this make it easy for my main competitor to copy this new feature and hurt our differentiation in the market?” If you haven’t spent time personally engineering ML features, you might think that releasing a model file, for example as part of a phone app, would make this easy, especially if it’s in a common format like a TensorFlow Lite flatbuffer. In practice, I recommend thinking about these model files like the binary executables that contain your application code. By releasing one you make it possible to inspect the final result of your product engineering process, but trying to do anything useful with it is usually like trying to turn a hamburger back into a cow. Just as you can disassemble an executable to see its overall structure, you can load a model file into a tool like Netron and learn something about the architecture, but just like disassembling machine code it won’t actually give you much help reproducing the results. Knowing the model architecture is mildly useful, but most architectures are well known in the field anyway, and only differ from each other incrementally.
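To make that concrete, here’s a minimal sketch of what a third party can recover from a released TensorFlow Lite model using nothing but the standard Python interpreter; the file name is a placeholder. The graph structure is visible, but the training recipe isn’t.

    # A sketch of inspecting a shipped .tflite file. The file name
    # "released_model.tflite" is hypothetical.
    import tensorflow as tf

    interpreter = tf.lite.Interpreter(model_path="released_model.tflite")
    interpreter.allocate_tensors()

    # Tensor names, shapes, and dtypes reveal the rough architecture...
    for tensor in interpreter.get_tensor_details():
        print(tensor["name"], tensor["shape"], tensor["dtype"])

    # ...but nothing here tells you what data the model was trained on,
    # how it was labeled, or how the training pipeline was tuned.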

What about just copying the model file itself and using it in an application? That’s not as useful as you might think, for a lot of reasons. First off, it’s a clear copyright violation, just like copying an executable, so it’s easy to spot and challenge legally. If you are still worried about this, you can take some simple steps like encrypting the model file in the app bundle and only unpacking it into memory when the app is running. This won’t stop a determined attacker, but it makes it harder. To help catch copycats, you can also add text strings into your files that say something like “Copyright Foo, Inc.”, or get more elaborate and add canaries, also more poetically called Mountweazels, by modifying your training data so that the model produces distinctive and unlikely results in rare circumstances. For example, an image model could be trained so that a Starbucks logo always returns “Duck” as the prediction. Your application could ignore this result, but even if the attacker got clever and added small perturbations to the model weights to prevent obvious binary comparisons, the behavior would be likely to persist and prove that the copy was directly derived from the original.
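As an illustration of the encryption step, here’s a minimal sketch of encrypting a model at build time and only decrypting it into memory at load time. It assumes the third-party cryptography package; the file names are placeholders, and in a real app the key would live somewhere much better protected than alongside the model.

    # Build step (run once): encrypt the trained model before bundling it.
    import tensorflow as tf
    from cryptography.fernet import Fernet

    key = Fernet.generate_key()  # store this securely, not in the app bundle
    with open("model.tflite", "rb") as f:
        encrypted = Fernet(key).encrypt(f.read())
    with open("model.tflite.enc", "wb") as f:
        f.write(encrypted)

    # App runtime: decrypt into memory and hand the bytes straight to the
    # interpreter, so the plain model never touches disk.
    with open("model.tflite.enc", "rb") as f:
        model_bytes = Fernet(key).decrypt(f.read())
    interpreter = tf.lite.Interpreter(model_content=model_bytes)
    interpreter.allocate_tensors()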

Even if you don’t detect the copying, having a static model is not actually that useful. The world keeps changing, you’ll want to keep improving the model and adapting to new needs, and that’s very hard to do if all you have is the end result of training. It’s also unlikely that a competitor will have exactly the same requirements as you, whether because they’re targeting different hardware or serving a user population that differs from yours. You might be able to hack a bit of transfer learning to modify a model file, but at that point you’re probably better off starting with a publicly-released model, since you’ll have very limited ability to make changes to a model that’s already been optimized (for example using quantization).
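For comparison, here’s a minimal sketch of that usual alternative, assuming Keras and an ImageNet-pretrained MobileNetV2 as the public starting point; the number of output classes and the dataset name are placeholders.

    # Fine-tune a publicly-released, unquantized model on your own data,
    # rather than trying to adapt someone else's frozen artifact.
    import tensorflow as tf

    base = tf.keras.applications.MobileNetV2(
        input_shape=(224, 224, 3), include_top=False, weights="imagenet")
    base.trainable = False  # keep the pretrained features, retrain only the head

    model = tf.keras.Sequential([
        base,
        tf.keras.layers.GlobalAveragePooling2D(),
        tf.keras.layers.Dense(5, activation="softmax"),  # hypothetical class count
    ])
    model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
    # model.fit(your_dataset, epochs=10)  # your_dataset is hypothetical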

A lot of these properties are very analogous to a compiled executable, hence my advice at the start. You’ve got an artifact that’s the end result of a complex process, and any attacker is almost certain to want modifications that aren’t feasible without access to the intermediate steps that were required to produce it in the first place. From my experience, by far the most crucial, and so most valuable, part of the recipe for a machine learning feature is the training data. I could copy most features much more quickly if I were given nothing but the dataset used to train the model than if I had access to the training script, feature generation, optimization, and deployment code without that data. The training data is what actually sets out the detailed requirements for what the model needs to do, and usually goes through a long process of refinement as the engineers involved learn more about what’s actually needed in the product.

This is why I recommend treating the dataset in the same way that you treat source code for your application. It’s a machine-readable specification of exactly how to tackle your problem, and as such requires a lot of time, resources, and expertise to reproduce. People in other industries often ask me why big tech companies give away so much ML software as open source, because they’re used to thinking about code as the crown jewels that need to be protected at all costs. This is true for your application code, but in machine learning having access to libraries like TensorFlow or PyTorch doesn’t get you that much closer to achieving what Google or Meta can do with machine learning. It’s actually the training data that’s the biggest barrier, so if you have built something using ML that’s a competitive advantage, make sure you keep your dataset secure.

Personally, I’m a big fan of opening up datasets for research purposes, but if you look around you’ll see that most releases are for comparatively generic problems within speech or vision, rather than more specific predictions that are useful for features in commercial products. Public datasets can be useful as starting points for training a more targeted model, but the process usually involves adding data that’s specific to your deployment environment, relabeling to highlight the things you really want to recognize, and removing irrelevant or poorly-tagged data. All these steps take time and resources, and form a barrier to any competitor who wants to do the same thing.
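To give a flavor of what that adaptation looks like in practice, here’s a minimal sketch using tf.data and TensorFlow Datasets; the label mapping and the private-data path are hypothetical, and a real curation pipeline would involve far more relabeling and cleaning than this.

    # Turn a generic public dataset into a more targeted one: keep only the
    # classes your product cares about, remap their labels, and mix in
    # examples from your own deployment environment.
    import tensorflow as tf
    import tensorflow_datasets as tfds

    public = tfds.load("speech_commands", split="train", as_supervised=True)

    # Hypothetical mapping from public label ids to the classes you ship.
    LABEL_REMAP = {0: 0, 3: 1, 7: 2}
    table = tf.lookup.StaticHashTable(
        tf.lookup.KeyValueTensorInitializer(
            tf.constant(list(LABEL_REMAP.keys()), dtype=tf.int64),
            tf.constant(list(LABEL_REMAP.values()), dtype=tf.int64)),
        default_value=tf.constant(-1, dtype=tf.int64))

    targeted = (public
                .map(lambda audio, label: (audio, table.lookup(label)))
                .filter(lambda audio, label: label >= 0))  # drop irrelevant classes

    # Examples recorded on your own hardware, saved earlier with Dataset.save().
    private = tf.data.Dataset.load("my_field_recordings")  # hypothetical path
    combined = targeted.concatenate(private).shuffle(10_000)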

My experience has largely been with on-device ML, so these recommendations are focused on the cases I’m most familiar with. Machine learning models deployed behind a cloud API have different challenges, but are easier to protect in a lot of ways because the model file itself isn’t accessible. You may still want to put in terms-of-use clauses that bar people from using the service to train their own models, as all the commercial speech recognition APIs I know of do, but this approach to copying isn’t as effective as you might expect anyway. It suffers from the Multiplicity problem, where copies inevitably seem to lose quality compared to their originals.

Anyway, I am very definitely Not A Lawyer, so don’t take any of this as legal advice, but I hope it helps you understand some sensible responses to typical threat models, and at least gives you my perspective on the best practices I’ve seen emerge. I’ll be interested to hear if there are any papers or other publications around these questions too, so please do get in touch if you know of anything I should check out!