On modeling and utilizing chemical compound information with deep learning technologies: A task-oriented approach

Sangsoo Lim, Sangseon Lee, Yinhua Piao, Min Gyu Choi, Dongmin Bang, Jeonghyeon Gu, Sun Kim

Research output: Contribution to journal › Review article › peer-review

11 Scopus citations

Abstract

A large number of chemical compounds are available in databases such as PubChem and ZINC. However, the currently known compounds, though numerous, represent only a fraction of all possible compounds, the set known as chemical space. Many of the compounds in these databases are annotated with properties and assay data that can be used for drug discovery efforts. To this end, a number of machine learning algorithms have been developed, and recent deep learning technologies can be used effectively to navigate chemical space, especially for unknown chemical compounds, in terms of drug-related tasks. In this article, we survey how deep learning technologies can model and utilize chemical compound information in a task-oriented way by exploiting the annotated properties and assay data in chemical compound databases. We first compile the kinds of tasks that machine learning methods aim to accomplish. Then, we survey deep learning technologies to show their modeling power and current applications for accomplishing drug-related tasks. Next, we survey deep learning techniques that address the insufficiency of annotated data for more effective navigation of chemical space. Chemical compound information alone may not be powerful enough for drug-related tasks; thus, we survey what kinds of information, such as assay and gene expression data, can be used to improve the predictive power of deep learning models. Finally, we conclude this survey with four important newly developed technologies that are yet to be fully incorporated into the computational analysis of chemical information.

Original language: English
Pages (from-to): 4288-4304
Number of pages: 17
Journal: Computational and Structural Biotechnology Journal
Volume: 20
DOIs:
State: Published - Jan 2022

Keywords

  • Chemical information modeling
  • Chemical space
  • Computer-aided drug discovery
  • Data augmentation
  • Deep learning

