From Descriptors to Predicted Properties: Experimental Design by Using Applicability Domain Estimation

Stefan Brandmaier, Sergii Novotarskyi, Iurii Sushko and Igor V. Tetko

Reliable methods for representative sub-sampling are crucial for experimental design and risk assessment within the European Registration, Evaluation, Authorisation and Restriction of Chemicals (REACH) system. We developed experimental design approaches that use predicted properties and the ‘distance to model’ parameter to estimate how much individual compounds would benefit the quality of the resulting model. A statistical evaluation on four regression data sets and one classification data set showed that the adaptive concept of iteratively refining the representation of the chemical space yields a more efficient and more reliable selection than traditional approaches. Evaluating compounds with regard to the uncertainty and the correlation of predictions is beneficial, in particular for regression data sets of sufficient size, whereas using predicted properties to define the chemical space is beneficial for classification models.
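
The adaptive idea summarised above, selecting next the compounds about which the current model is least certain, can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the 'distance to model' is computed, so the sketch assumes it is the standard deviation of predictions from a bootstrap ensemble of simple linear fits on a single descriptor. The names `fit_line`, `ensemble_std`, `adaptive_select` and the `measure` callback (standing in for an experimental measurement) are hypothetical.

```python
import random
import statistics


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:  # degenerate bootstrap sample: fall back to the mean
        return my, 0.0
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - b * mx, b


def ensemble_std(train, x, n_models, rng):
    """Assumed 'distance to model': spread of bootstrap-ensemble predictions at x."""
    preds = []
    for _ in range(n_models):
        sample = [rng.choice(train) for _ in train]  # resample with replacement
        a, b = fit_line([s[0] for s in sample], [s[1] for s in sample])
        preds.append(a + b * x)
    return statistics.pstdev(preds)


def adaptive_select(pool, measure, n_initial=3, n_total=8, n_models=20, seed=0):
    """Iteratively 'measure' the candidate the ensemble disagrees on most."""
    rng = random.Random(seed)
    candidates = list(pool)
    rng.shuffle(candidates)
    # Start from a small random set of measured compounds.
    train = [(x, measure(x)) for x in candidates[:n_initial]]
    candidates = candidates[n_initial:]
    while len(train) < n_total and candidates:
        scores = [ensemble_std(train, x, n_models, rng) for x in candidates]
        best = max(range(len(candidates)), key=scores.__getitem__)
        x = candidates.pop(best)
        train.append((x, measure(x)))  # refine the model with the new point
    return train


if __name__ == "__main__":
    # Toy pool of one-dimensional descriptors and a made-up property.
    pool = [i / 10 for i in range(30)]
    for x, y in adaptive_select(pool, lambda x: x * x):
        print(f"{x:.1f} -> {y:.2f}")
```

In a realistic setting the single descriptor and the linear fits would be replaced by the actual descriptor set and model type; only the loop structure (fit, score candidates by uncertainty, measure the top candidate, refit) is the point of the sketch.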