One of the best ways to find new drugs for human or animal health is to leverage knowledge of previous attempts, successful and otherwise. With a big enough collection of facts, it becomes possible to learn about the patterns that make some molecules able to treat or cure diseases, whereas others have no effect or have unintended toxic consequences that stop them from being useful in the clinic.

Building computational models from the results of experiments that other scientists have run against targets or specific pathogens (bacteria, parasites, viruses, etc.) is a way to leverage machine learning to guide the selection of new drug candidates, and large pharmaceutical companies have used this approach for decades. Many computational modeling techniques are available, and they are implemented in powerful software. But this power comes with barriers that make the software prohibitive for non-experts.

At Collaborations Pharmaceuticals Inc. we have gathered a large collection of openly available data representing many thousands of different biological targets, and for each of these targets we have already built and validated computational models. These targets include both those that represent diseases and medical conditions and off-targets that fall under the ADME/tox categories and are to be avoided. These computational models can be made available through our Assay Central software, which is easy to use, even by the standards of consumer software, and requires no expertise.

Assay Central supports and facilitates early discovery work by enabling the curation of high quality datasets from public and/or proprietary data. It can be used to build machine learning models that in turn can be used to filter and score compounds prior to testing. Assay Central lowers the entry point for accessing powerful cheminformatics technologies for drug discovery, makes models built from the burgeoning public datasets accessible, and provides these models to the end user through an intuitive interface.

Some example screenshots are shown below.

Model predictions can be visualized with a heatmap.

Atom-level contributions to predicted activity can be viewed through color coding.

Predictions can also be assessed using a hex plot based on k-nearest neighbors, with Tanimoto similarity values computed using ECFP6 fingerprints.
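To make the similarity measure behind the hex plot concrete, here is a minimal sketch of Tanimoto similarity and a k-nearest-neighbor lookup. It is not Assay Central's implementation. It assumes each compound's fingerprint (for example ECFP6) has already been computed by a cheminformatics toolkit and is given as a set of on-bit indices. The compound names, bit sets and function names are hypothetical.

```python
from typing import Dict, List, Set, Tuple


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Tanimoto similarity: shared on-bits divided by total distinct on-bits."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty fingerprints
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def nearest_neighbors(
    query: Set[int], training: Dict[str, Set[int]], k: int = 3
) -> List[Tuple[str, float]]:
    """Return the k training compounds most similar to the query."""
    scored = [(name, tanimoto(query, bits)) for name, bits in training.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical fingerprints: sets of on-bit indices, assumed to be precomputed.
training = {
    "cmpd_A": {1, 2, 3, 4},
    "cmpd_B": {3, 4, 5, 6},
    "cmpd_C": {7, 8, 9},
}
query = {1, 2, 3, 5}

neighbors = nearest_neighbors(query, training, k=2)
for name, similarity in neighbors:
    print(f"{name}: {similarity:.2f}")
```

A query whose nearest training compounds all have low similarity lies far from the data the model was built on, so its prediction deserves less confidence.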


  • Making data accessible to machine learning
  • Data intensive visualization resulting from these many models
  • Closing the loop between experimentalist and data repositories
  • Using git for data management in cheminformatics
  • Ease of deployment, without the need for IT support
  • Built on industry standard technology
  • Graphical display of models – instant feedback
  • Model applicability – multiple methods to assess it, with scores and graphics


  • Human health
  • Animal health
  • Environmental health
  • Defense health

Success stories

We have used the software in many collaborative projects to date including:

  • Estrogen receptor – worked with a major US consumer product company to collate public ER data and use Assay Central to deliver models.
  • Neglected disease 1 – used ADME models to predict properties of lead compounds.
  • Neglected disease 2 – multiple collaborations on whole cell and target specific models, which have identified novel inhibitors.
  • Neglected disease 3 – model building and testing with an academic collaborator who had access to previously unpublished data.
  • ADME models – Built transporter models for different probes and shared models.

What we may not have in there already is your data, but we can change that by working with you to integrate your content and build better models. This can be done either privately (i.e. only you get to benefit) or publicly (i.e. the models get better for everyone). Please contact us if you would like a demonstration of what Assay Central can do.

Recent papers on the technologies involved include: 

Comparison of Deep Learning With Multiple Learning Methods and Metrics Using Diverse Drug Discovery Data Sets

Open Source Bayesian Models. 2. Mining a "big dataset" to create and validate models with ChEMBL

The Next Era: Deep Learning in Pharmaceutical Research