Interpretable multi-species QSAR modeling for ecotoxicological hazard prediction of aquatic pollutants.

Ze-Jun Wang, Rui Sun, Xiao Han, Ting-ting Ding, Peng Huang, Jin Yan, Qiuhui Qian, Chang Wu, Hui-li Wang, Kai Li, Shu-Shen Liu

Published 2026 in Environment International

ABSTRACT

As global industrialization accelerates, the variety of emerging aquatic pollutants is increasing, producing complex toxic effects and posing regional ecological risks. However, toxicity data for native aquatic species remain limited, hindering effective pollutant identification and risk management. Traditional single-species QSAR models lack cross-species generalization, sample efficiency, and mechanistic interpretability. To address these limitations under small-sample conditions, this study developed two biologically interpretable linear multi-species QSAR models based on multi-task learning: a multi-species multiple linear regression model (MS_MLR) and a multi-species stepwise regression model (MS_SR). These models predict acute toxicity across five representative native species: Carassius auratus, Leuciscus idus, Oryzias latipes, Culex quinquefasciatus, and Lemna minor. Compared with single-species models, the cross-species models show the expected performance trade-off: some evaluation metrics decrease slightly, while robustness and generalizability improve. Five key shared molecular descriptors (P_VSA_LogP_5, R8s, ATS4m, Eta_betaS_A, BLTA96), related to structural features such as hydrophobicity, topology, symmetry, and baseline toxicity, were identified. Additionally, six high-risk pollutants (e.g., TCDD, Chlorpyrifos) were consistently detected, with literature-supported mechanisms including AhR activation, AChE inhibition, and oxidative stress. This interpretable multi-task QSAR framework enhances structural sharing and mechanistic understanding, supporting early screening, prioritization, ecological risk assessment, and green chemistry design of emerging pollutants in data-limited scenarios. Future work may improve the framework by expanding species coverage, increasing data availability, and integrating advanced multi-task learning strategies.
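
The abstract does not give the functional form of MS_MLR, so the following is only a minimal sketch of one common way to build a linear multi-species model: descriptor coefficients are shared across species and each species gets its own intercept. The function names (fit_shared_linear, predict), the shared-slope assumption, and the synthetic data are all hypothetical illustrations, not the authors' method or data.

```python
import numpy as np

def fit_shared_linear(X, y, species, n_species):
    """Least-squares fit of y = X @ w + b[species].

    ASSUMPTION: w (descriptor coefficients) is shared by all species,
    while b holds one intercept per species.
    """
    onehot = np.eye(n_species)[species]      # (n_samples, n_species) indicators
    design = np.hstack([X, onehot])          # shared slopes + per-species intercepts
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    n_desc = X.shape[1]
    return coef[:n_desc], coef[n_desc:]

def predict(X, species, w, b):
    """Predict toxicity from descriptors and integer species labels."""
    return X @ w + b[species]

# Synthetic stand-in data: 5 species, 5 descriptors, 12 compounds per species.
rng = np.random.default_rng(0)
n_species, n_desc, n_per = 5, 5, 12
true_w = rng.normal(size=n_desc)
true_b = rng.normal(size=n_species)
species = np.repeat(np.arange(n_species), n_per)
X = rng.normal(size=(species.size, n_desc))
y = X @ true_w + true_b[species] + 0.05 * rng.normal(size=species.size)

w, b = fit_shared_linear(X, y, species, n_species)
residual = y - predict(X, species, w, b)
r2 = 1.0 - np.sum(residual**2) / np.sum((y - y.mean())**2)
print("shared coefficients:", np.round(w, 3))
print("species intercepts:", np.round(b, 3))
print("pooled R^2:", round(float(r2), 4))
```

Pooling all species into one design matrix is what lets a species with few measurements borrow strength from the others; the shared coefficients can then be read as descriptor effects common to every species in the set.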
