Engineering package

QSAR Plugin

Features & Capabilities

Compute thousands of descriptors with ALVADESC. Classify molecules into similarity groups with hierarchical clustering (13 metrics, 8 methods). Verify regression prerequisites with univariate and multivariate descriptor statistics. Build QSAR models and apply them in your projects.

Summary

QSAR modeling is organized around a Study Table which contains i) the training and test sets of molecules, ii) the modeled activity/property and iii) the set of molecular descriptors (generated by ALVADESC or PADEL). Experimental values and descriptors can also be entered manually or via copy-paste from other sources.
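
As an illustration only, the layout of a Study Table can be sketched as a list of rows, each holding a molecule, its set membership, the measured activity and its descriptor values. The names used here (`StudyRow`, `split_sets`, the descriptor labels) and all numeric values are hypothetical and are not taken from the plugin.

```python
from dataclasses import dataclass, field


@dataclass
class StudyRow:
    """One molecule in a hypothetical Study Table."""
    name: str
    subset: str        # "training" or "test"
    activity: float    # modeled activity/property (experimental value)
    descriptors: dict = field(default_factory=dict)


def split_sets(rows):
    """Return (training_rows, test_rows) according to each row's subset label."""
    training = [r for r in rows if r.subset == "training"]
    test = [r for r in rows if r.subset == "test"]
    return training, test


# Made-up values, for illustration only.
study_table = [
    StudyRow("mol_1", "training", 5.2, {"MW": 180.2, "logP": 1.3}),
    StudyRow("mol_2", "training", 6.1, {"MW": 206.3, "logP": 3.5}),
    StudyRow("mol_3", "test", 4.8, {"MW": 151.2, "logP": 0.5}),
]

training, test = split_sets(study_table)
print(len(training), "training molecules,", len(test), "test molecule(s)")
```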

Molecules can be classified and labeled by similarity in the chosen descriptor space using the Clustering tool. The hierarchical results can be inspected in the Clustering canvas in several layout styles.
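
To make the idea of hierarchical clustering in descriptor space concrete, here is a minimal sketch of the agglomerative procedure. It assumes Euclidean distance and single linkage, which are only one possible combination of metric and method; the function names and descriptor values are hypothetical and do not reflect how the plugin is implemented.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two descriptor vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def single_linkage(points, n_clusters):
    """Merge the two closest clusters until n_clusters remain.

    Returns a list of clusters, each a sorted list of indices into points.
    """
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        best = None  # (distance, index_a, index_b)
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Single linkage: distance between the closest pair of members.
                d = min(euclidean(points[i], points[j])
                        for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = sorted(clusters[a] + clusters[b])
        del clusters[b]
    return clusters


# Made-up two-descriptor vectors for five molecules.
descriptors = [
    [0.10, 0.20], [0.15, 0.22], [0.90, 0.80], [0.85, 0.85], [0.50, 0.10],
]
groups = single_linkage(descriptors, n_clusters=2)
print(groups)  # [[0, 1, 4], [2, 3]]
```

Stopping the merging at a different number of clusters corresponds to cutting the hierarchy at a different level.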

Models are created by PLS regression and can be optimized by a Genetic Algorithm; both steps use parameters controlled by the user. All created models are listed in the Models table, which includes the validation data and the linear regression formula for each model.
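
The following is a minimal sketch of PLS regression for a single response (PLS1), using NIPALS-style deflation in plain Python. It is meant only to show how latent components are extracted and used for prediction; the function names (`fit_pls1`, `predict_pls1`) and the synthetic data are assumptions, and the plugin's own algorithm, its parameters and its Genetic Algorithm optimization are not reproduced here.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def fit_pls1(X, y, n_components):
    """Fit a PLS1 model (one response) by successive deflation."""
    n, m = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(m)]
    y_mean = sum(y) / n
    Xc = [[v - mu for v, mu in zip(row, x_mean)] for row in X]  # centered copy
    yc = [v - y_mean for v in y]
    components = []  # one (weights, loadings, y_loading) tuple per component
    for _ in range(n_components):
        # Weights: direction of maximum covariance between descriptors and y.
        w = [dot([row[j] for row in Xc], yc) for j in range(m)]
        norm = math.sqrt(dot(w, w))
        if norm < 1e-12:  # nothing left to explain
            break
        w = [v / norm for v in w]
        t = [dot(row, w) for row in Xc]  # scores
        tt = dot(t, t)
        p = [dot([row[j] for row in Xc], t) / tt for j in range(m)]  # loadings
        q = dot(yc, t) / tt
        # Deflate X and y before extracting the next component.
        Xc = [[v - ti * pj for v, pj in zip(row, p)] for row, ti in zip(Xc, t)]
        yc = [v - q * ti for v, ti in zip(yc, t)]
        components.append((w, p, q))
    return x_mean, y_mean, components


def predict_pls1(model, x):
    """Predict the response for one descriptor vector."""
    x_mean, y_mean, components = model
    xc = [v - mu for v, mu in zip(x, x_mean)]
    y_hat = y_mean
    for w, p, q in components:
        t = dot(xc, w)
        y_hat += q * t
        xc = [v - t * pj for v, pj in zip(xc, p)]
    return y_hat


# Synthetic, noise-free data generated from y = 1 + 2*x1 + 3*x2.
X = [[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0], [5.0, 5.0]]
y = [9.0, 8.0, 19.0, 18.0, 26.0]
model = fit_pls1(X, y, n_components=2)
print(predict_pls1(model, [6.0, 2.0]))
```

Because the synthetic data are exactly linear and the number of components equals the number of descriptors, the prediction should recover the value given by the generating equation (19) up to rounding error. With real, noisy data the number of components is a tuning choice.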

The Status page enables inspection of the descriptor, modeling and validation statistics. Status text is interactive and lets you create new models from a set of highlighted descriptors. All tables can be exported in CSV format. Predictions using several models can be made in a single step.
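
As a rough sketch of what applying several linear models in one step and exporting the result as CSV might look like, the example below evaluates two regression formulas on the same molecules and writes one prediction column per model. The model names, coefficients, descriptor labels and helper functions are all hypothetical; they do not describe the plugin's file format or interface.

```python
import csv
import io

# Hypothetical linear models: intercept plus one coefficient per descriptor.
models = {
    "model_A": {"intercept": 1.0, "coefficients": {"MW": 0.01, "logP": 0.5}},
    "model_B": {"intercept": 0.5, "coefficients": {"logP": 0.8}},
}

# Made-up descriptor values.
molecules = [
    {"name": "mol_1", "MW": 100.0, "logP": 2.0},
    {"name": "mol_2", "MW": 200.0, "logP": 1.0},
]


def predict(model, molecule):
    """Evaluate a linear regression formula for one molecule."""
    return model["intercept"] + sum(
        coef * molecule[name] for name, coef in model["coefficients"].items()
    )


def predictions_to_csv(models, molecules):
    """Return CSV text with one row per molecule and one column per model."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["name"] + list(models))
    for mol in molecules:
        row = [mol["name"]]
        row += [f"{predict(m, mol):.3f}" for m in models.values()]
        writer.writerow(row)
    return buffer.getvalue()


csv_text = predictions_to_csv(models, molecules)
print(csv_text)
```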


Summary 

Build quantitative structure-activity relationships (QSAR/QSPR) from thousands of molecular descriptors to describe biological activity or empirical molecular properties.