badook for data scientists

It is no secret that models are only as good as the data you train them on and the data you run inference and predictions against. As a data scientist, you build models on data that, in most cases, was collected, cleansed, transformed, and labelled by others, leaving you with only a partial view of the reality that data came from.

Often the first sign that something is wrong is a poorly performing model, which leaves you with all of the responsibility and none of the tools to find the root cause of the problem.

badook is built to help data science teams gain confidence in their data and models throughout the whole lifecycle. With badook, you can have the right controls in place from the moment data enters your organization, so you learn of issues well before they affect your models.


We help data scientists trust data 


A central concern of modern MLOps is ensuring high-quality, reliable data throughout the whole lifecycle of an ML system. Using badook, you can automate tests that drive decisions in your ML pipeline based on the behaviour of your data.


badook is built by and for data scientists, giving you the tools you need to discover issues and to author and automate tests with ease. With badook, you can write advanced statistical and model-based tests and run them automatically wherever you build your models.

Eliminate data cascades

When data issues slip under the radar or, worse, are mistreated upstream, their effect is often amplified by the time the data reaches downstream systems such as AI/ML models. badook allows data quality to be communicated across teams and systems without exposing the data itself, giving you the proper controls and making it easy to find the root cause if something goes wrong.


badook allows you to apply the same test logic to all your datasets, including training, inference, and prediction data. This enables you to easily enforce the same behaviour across different