One of the best ways to find new drugs is to leverage knowledge of previous attempts, successful and otherwise. With a big enough collection of facts, it becomes possible to learn about the patterns that make some molecules able to treat or cure diseases, whereas others have no effect or have unintended toxic consequences that stop them from being useful in the clinic.

Building computational models from the results of experiments done by other scientists is a way to leverage the power of machine learning to guide the selection of new drug candidates, and large pharmaceutical companies have done so for decades. Many techniques are available for computational modeling, and the software that implements them is powerful. But this power comes with barriers that put it out of reach of non-experts.

Collaborations Pharmaceuticals Inc. has gathered a large collection of openly available data that represents thousands of different biological targets, and for each of these targets we have already built and validated computational models. The targets include both those that represent diseases and medical conditions, and off-targets that fall under the ADME/tox categories and are to be avoided. These computational models can be made available through our Assay Central software, which is easy to use, even by the standards of consumer software, and requires no expertise. We can also readily share the models as Java jar files.

Assay Central supports and facilitates early discovery work by enabling the curation of high-quality datasets from public and/or proprietary data. Assay Central can be used to build machine learning models that in turn can be used to filter and score compounds prior to testing. Assay Central lowers the entry point for accessing powerful cheminformatics technologies that can be used for drug discovery, and makes models built from the burgeoning public datasets accessible. Assay Central provides these models with an intuitive interface for the end user.

Some example screenshots are shown below.

Model ROC and other statistics can be viewed.
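
As an illustration of what a ROC statistic summarizes, the sketch below computes the area under the ROC curve (AUC) for a set of compounds with known activity and model scores. It is a minimal example and not the Assay Central implementation; the function name `roc_auc` and the sample data are assumptions made for this illustration only.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve, computed by pairwise comparison.

    labels: 1 for an active compound, 0 for an inactive one.
    scores: model scores, where higher means "more likely active".
    The AUC equals the probability that a randomly chosen active
    scores higher than a randomly chosen inactive (ties count half).
    """
    actives = [s for y, s in zip(labels, scores) if y == 1]
    inactives = [s for y, s in zip(labels, scores) if y == 0]
    if not actives or not inactives:
        raise ValueError("need at least one active and one inactive compound")

    wins = 0.0
    for a in actives:
        for i in inactives:
            if a > i:
                wins += 1.0
            elif a == i:
                wins += 0.5
    return wins / (len(actives) * len(inactives))


# Hypothetical example: four compounds, two of them active.
example_labels = [1, 0, 1, 0]
example_scores = [0.9, 0.8, 0.3, 0.1]
print(roc_auc(example_labels, example_scores))
```

An AUC of 1.0 means every active is ranked above every inactive, while 0.5 is what random ranking would give.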

Molecules can be input through an integrated sketcher or dragged in from an SDF file.

Model predictions can be visualized with a heatmap.

Atom-level contributions to predicted activity can be viewed through color coding.

Predictions can also be assessed using a hex plot based on k-nearest neighbors, with Tanimoto similarity values calculated using ECFP6 fingerprints.
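
The idea behind a nearest-neighbor similarity check can be sketched in a few lines. The example below represents each fingerprint as a set of "on" bit indices, computes Tanimoto similarity between a query and each training compound, and returns the k most similar ones. It is a sketch under stated assumptions, not the Assay Central code: the names `tanimoto` and `nearest_neighbors`, the compound names, and the bit sets are hypothetical, and real ECFP6 fingerprints would be generated by a cheminformatics toolkit.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto similarity between two fingerprints given as sets of on-bits."""
    if not fp_a and not fp_b:
        return 0.0  # convention chosen here for two empty fingerprints
    shared = len(fp_a & fp_b)
    return shared / (len(fp_a) + len(fp_b) - shared)


def nearest_neighbors(query_fp, training_fps, k=3):
    """Return the k training compounds most similar to the query.

    training_fps maps a compound name to its fingerprint (set of on-bits).
    The result is a list of (similarity, name) pairs, most similar first;
    ties are broken alphabetically by name.
    """
    scored = [(tanimoto(query_fp, fp), name) for name, fp in training_fps.items()]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return scored[:k]


# Hypothetical fingerprints (bit indices are made up for illustration).
training = {
    "cmpd_A": {1, 2, 3, 4},
    "cmpd_B": {3, 4, 5, 6},
    "cmpd_C": {7, 8, 9},
}
query = {1, 2, 3, 5}

for similarity, name in nearest_neighbors(query, training, k=2):
    print(f"{name}: {similarity:.2f}")
```

A query whose nearest neighbors all have low similarity lies far from the training data, so its prediction deserves less confidence.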

Benefits

  • Making data accessible to machine learning
  • Data intensive visualization resulting from these many models
  • Closing the loop between experimentalist and data repositories
  • Using git for data management in cheminformatics
  • Ease of deployment – executable Java files run by users without the need for IT support
  • Built on industry-standard technology
  • Graphical display of models – instant feedback
  • Model applicability – multiple methods to assess with scores and graphics

Success stories

We have used the software in many projects to date including:

  • Estrogen receptor – worked with a major US consumer product company to collate public ER data and use Assay Central to deliver models.
  • Neglected disease 1 – used ADME models to predict properties of lead compounds.
  • Neglected disease 2 – multiple collaborations on whole cell and target specific models, which have identified novel inhibitors.
  • Neglected disease 3 – model building and testing with an academic collaborator who had access to previously unpublished data.
  • ADME models – built transporter models for different probes and used a jar file to share the models.

What we may not have in there already is your data, but we can change that by working with you to integrate content and build better models. This can be done either privately (i.e. only you get to benefit) or publicly (i.e. the models get better for everyone). Please contact us if you would like a demonstration of what Assay Central can do.

Recent papers on the technologies involved include:

Open Source Bayesian Models. 2. Mining a "big dataset" to create and validate models with ChEMBL

The Next Era: Deep Learning in Pharmaceutical Research