eClean

Software for molecule dataset preparation, semi-automated curation, and data visualization.

Features:

  • Converts units
  • Converts between SMILES and MolFiles
  • Handles duplicate compounds
  • Adds InChI keys and molecular weights
  • Removes special characters from the header
  • Removes NA activities and values
  • Neutralizes charges
  • Removes salts / other chemical fragments
  • Uses decision boundary to binarize values
  • Can use >, <, and = qualifiers to filter out ambiguous values
  • Removes duplicate values that fall below the agreement ratio (the fraction of duplicates that share the same binary value)
  • Returns rows with matching or mismatched values between two datasets
  • Can search by InChI key conversion or raw values
  • Uses ECFP (adjustable radius and bit length) or MACCS fingerprints to generate similarity matrix values and a graphic
  • Can use same dataset for each axis or upload a different one
  • Generates t-SNE for ECFP (adjustable radius and bit length), MACCS, other quantifiable descriptors, or ECFP + other descriptors
  • Other descriptors are z-normalized
  • Generates a plot that can be edited and downloaded as SVG or PNG
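
To make the binarization and duplicate-handling features more concrete, here is a minimal Python sketch. It is not eClean's implementation: the function names, the compound keys, the "at or above the boundary counts as 1" convention, the way qualifiers are interpreted, and the 0.8 agreement ratio are all assumptions chosen for illustration.

```python
from collections import defaultdict


def binarize(value, qualifier, boundary):
    """Return 1 (at or above the boundary), 0 (below), or None if ambiguous.

    Assumed convention: a '>' qualifier is only informative when the reported
    value is already at or above the boundary, and '<' only when it is at or
    below it. Otherwise the true value could lie on either side.
    """
    if qualifier == "=":
        return 1 if value >= boundary else 0
    if qualifier == ">":
        return 1 if value >= boundary else None
    if qualifier == "<":
        return 0 if value <= boundary else None
    raise ValueError(f"unknown qualifier: {qualifier!r}")


def resolve_duplicates(records, boundary, min_agreement=0.8):
    """Collapse duplicate compounds into one binary label per key.

    records: iterable of (compound_key, qualifier, value) tuples.
    A compound is kept only if the fraction of its binary labels that agree
    with the majority label is at least min_agreement.
    """
    labels = defaultdict(list)
    for key, qualifier, value in records:
        label = binarize(value, qualifier, boundary)
        if label is not None:  # drop ambiguous measurements
            labels[key].append(label)

    resolved = {}
    for key, values in labels.items():
        ones = sum(values)
        majority = 1 if ones * 2 >= len(values) else 0
        agreement = max(ones, len(values) - ones) / len(values)
        if agreement >= min_agreement:
            resolved[key] = majority
    return resolved


# Hypothetical records; in practice the key could be an InChI key.
records = [
    ("CPD-A", "=", 7.2), ("CPD-A", "=", 6.8), ("CPD-A", ">", 6.5),
    ("CPD-B", "=", 6.4), ("CPD-B", "=", 5.1),  # duplicates disagree
    ("CPD-C", "<", 5.0),
    ("CPD-D", ">", 5.0),                       # ambiguous, dropped
]
result = resolve_duplicates(records, boundary=6.0)
print(result)  # {'CPD-A': 1, 'CPD-C': 0}
```

In this sketch CPD-B is removed because its two labels split evenly (agreement 0.5), and CPD-D is removed because ">5.0" does not say which side of 6.0 the true value falls on.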

Access

  • We can run eClean for you as fee-for-service work.
  • We can provide an annual license for you to access this software on your own server.
  • We provide maintenance and customization options.