A Comparative Study of Machine Learning Algorithms Applied to Predictive Toxicology Data Mining

Daniel C. Neagu, Gongde Guo, Paul R. Trundle and Mark T.D. Cronin

This paper reports the results of a comparative study of widely used machine learning algorithms applied to predictive toxicology data mining. The algorithms were chosen for their representativeness and diversity, and were extensively evaluated on seven toxicity data sets taken from real-world applications. We highlight results from visual analysis of the correlations between individual descriptors and the class values of chemical compounds, and of the relationship between the range of chosen descriptors and the performance of the machine learning algorithms. Some interesting findings relating to the data and the quality of the models are presented: for example, no specific algorithm appears best for all seven toxicity data sets, and up to five descriptors are sufficient to build classification models with good accuracy for each toxicity data set. We suggest that, for a specific data set, model accuracy is affected by both the feature selection method and the model development technique. Models built with too many or too few descriptors are undesirable, and finding the optimal feature subset appears at least as important as selecting an appropriate algorithm with which to build the final model.
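The interplay between descriptor selection and choice of algorithm described above can be illustrated with a minimal sketch. It is not the procedure used in the study: the data are synthetic (eight random descriptors, with the class depending on the first two), the two classifiers (3-nearest-neighbour and nearest centroid) are stand-ins chosen for brevity, and all function names are hypothetical. Greedy forward selection, scored by leave-one-out accuracy and capped at five descriptors, is assumed as the feature selection method.

```python
import random

def make_data(n=60, n_desc=8, seed=0):
    """Synthetic 'compounds': the class depends only on descriptors 0 and 1 (an assumption)."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        row = [rng.gauss(0, 1) for _ in range(n_desc)]
        X.append(row)
        y.append(1 if row[0] + row[1] > 0 else 0)
    return X, y

def sq_dist(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b))

def knn_predict(train_X, train_y, x, k=3):
    """Majority vote among the k nearest training compounds."""
    nearest = sorted(range(len(train_X)), key=lambda i: sq_dist(train_X[i], x))[:k]
    votes = sum(train_y[i] for i in nearest)
    return 1 if 2 * votes > k else 0

def centroid_predict(train_X, train_y, x):
    """Assign the class whose mean descriptor vector is closest."""
    best_label, best_dist = None, float("inf")
    for label in sorted(set(train_y)):
        members = [r for r, t in zip(train_X, train_y) if t == label]
        centroid = [sum(col) / len(members) for col in zip(*members)]
        dist = sq_dist(centroid, x)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label

def loo_accuracy(predict, X, y, features):
    """Leave-one-out accuracy using only the given descriptor indices."""
    Xs = [[row[j] for j in features] for row in X]
    hits = 0
    for i in range(len(Xs)):
        train_X = Xs[:i] + Xs[i + 1:]
        train_y = y[:i] + y[i + 1:]
        hits += predict(train_X, train_y, Xs[i]) == y[i]
    return hits / len(Xs)

def forward_select(predict, X, y, max_features=5):
    """Greedily add the descriptor that most improves accuracy; stop when none helps."""
    selected, best_acc = [], 0.0
    remaining = list(range(len(X[0])))
    while remaining and len(selected) < max_features:
        acc, j = max((loo_accuracy(predict, X, y, selected + [j]), j) for j in remaining)
        if acc <= best_acc:
            break
        selected.append(j)
        remaining.remove(j)
        best_acc = acc
    return selected, best_acc

X, y = make_data()
results = {}
for name, predict in [("3-NN", knn_predict), ("nearest centroid", centroid_predict)]:
    results[name] = forward_select(predict, X, y)
    print(name, "descriptors:", results[name][0], "LOO accuracy:", round(results[name][1], 3))
```

Running the selection separately for each classifier makes it possible to see whether the chosen descriptor subset, and the resulting accuracy, differ between algorithms on the same data. Because the subset is chosen and scored on the same compounds, the reported accuracy is optimistic; a held-out set would be needed for an unbiased estimate.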
Adverse Outcome Pathway-based Screening Strategies for an Animal-free Safety Assessment of Chemicals

Brigitte Landesmann, Milena Mennecozzi, Elisabet Berggren and Maurice Whelan

Currently, the assessment of risk to human health from exposure to manufactured chemicals is mainly based on experiments performed on living animals (in vivo). Substantial efforts are under way to develop alternatives to in vivo toxicity testing. The new paradigm behind these efforts, based on the Mode-of-Action (MoA) framework, postulates that any adverse human health effect caused by exposure to an exogenous substance can be described by a series of causally linked biochemical or biological key events with measurable parameters. Elaborating mechanistic knowledge through literature research is necessary for the MoA-driven design of integrated testing strategies that use in vitro methods for in vivo predictions. The objective of our ongoing research is to demonstrate the feasibility of an integrated approach to predicting human toxicity that follows the Adverse Outcome Pathway (AOP) framework. In our previous work on MoA with the HepaRG cell model, we developed a strategy to identify hepatotoxic chemicals. This pioneered an innovative way of using data from in vitro experiments to group chemicals according to their MoA, which is likely to be an important step in a toxicity testing strategy.
