-
A 3D generation framework using diffusion model and reinforcement learning to generate multi-target compounds with desired properties J. Cheminform. (IF 7.1) Pub Date : 2025-06-04
Yongna Yuan, Xiaohang Pan, Xiaohong Li, Ruisheng Zhang, Wei Su
Deep generative models provide a powerful solution for the de novo design of molecules. However, the majority of existing methods only generate molecules for a single target. Generating molecules with biological activities against multiple specific targets and desired properties remains an extremely difficult challenge. In this study, we propose a novel 3D molecule generation framework based on reinforcement
-
RLSuccSite: succinylation sites prediction based on reinforcement learning dynamic with balanced reward mechanism and three-peaks enhanced method for physicochemical property scores J. Cheminform. (IF 7.1) Pub Date : 2025-06-02
Lun Zhu, Qingchao Zhang, Sen Yang
Recent progress in computational biology has driven the development of machine learning models for predicting protein post-translational modification sites. However, challenges such as data imbalance and limited sequence-context representation continue to hinder prediction accuracy, particularly for less frequent modifications like succinylation. In this study, we propose RLSuccSite, a reinforcement
-
Representation of chemistry transport models simulations using knowledge graphs J. Cheminform. (IF 7.1) Pub Date : 2025-05-31
Eduardo Illueca Fernández, Antonio Jesús Jara Valera, Jesualdo Tomás Fernández Breis
Persistent air quality pollution poses a serious threat to human health, and is one of the action points that policy makers should monitor according to the Directive 2008/50/EC. While deploying a massive network of hyperlocal sensors could provide extensive monitoring, this approach cannot generate geospatially continuous data and presents several challenges in terms of logistics. Thus, developing accurate
-
Equivariant diffusion for structure-based de novo ligand generation with latent-conditioning J. Cheminform. (IF 7.1) Pub Date : 2025-05-31
Tuan Le, Julian Cremer, Djork-Arné Clevert, Kristof T. Schütt
We introduce PoLiGenX, a novel generative model for de novo ligand design that employs latent-conditioned, target-aware equivariant diffusion. Our approach leverages the conditioning of the ligand generation process on reference molecules located within a specific protein pocket. By doing so, PoLiGenX generates shape-similar ligands that are adapted to the target pocket, enabling effective applications
-
Higher education in chemoinformatics: achievements and challenges J. Cheminform. (IF 7.1) Pub Date : 2025-05-31
Alexandre Varnek, Gilles Marcou, Dragos Horvath
While chemoinformatics is a well-established scientific field, its integration into university curricula is rarely discussed. In this work, we share our experience in developing a chemoinformatics curriculum at the University of Strasbourg and highlight the main challenges in higher education for this discipline.
-
Semi-supervised prediction of protein fitness for data-driven protein engineering J. Cheminform. (IF 7.1) Pub Date : 2025-05-31
Alicia Olivares-Gil, José A. Barbero-Aparicio, Juan J. Rodríguez, José F. Díez-Pastor, César García-Osorio, Mehdi D. Davari
Protein fitness prediction plays a crucial role in the advancement of protein engineering endeavours. However, the combinatorial complexity of the protein sequence space and the limited availability of assay-labelled data hinder the efficient optimization of protein properties. Data-driven strategies utilizing machine learning methods have emerged as a promising solution, yet their dependence on labelled
-
Enhancing atom mapping with multitask learning and symmetry-aware deep graph matching J. Cheminform. (IF 7.1) Pub Date : 2025-05-30
Maryam Astero, Juho Rousu
Atom mapping involves identifying the correspondence between individual atoms in reactant molecules and their counterparts in product molecules. This process is crucial for gaining deeper insight into reaction mechanisms, such as defining reaction templates and determining which chemical bonds are formed or broken during a reaction. However, reliable atom mapping data are often limited or incomplete
-
ELNdataBridge: facilitating data exchange and collaboration by linking Electronic Lab Notebooks via API J. Cheminform. (IF 7.1) Pub Date : 2025-05-26
Martin Starman, Fabian Kirchner, Martin Held, Catriona Eschke, Sayed-Ahmad Sahim, Regine Willumeit-Römer, Nicole Jung, Stefan Bräse
Electronic Lab Notebooks (ELNs) have become indispensable tools for modern research laboratories, facilitating data management, collaboration, and documentation of scientific experiments. However, the proliferation of diverse ELN platforms poses challenges for researchers who need to seamlessly exchange data between different systems. In this paper, we present ELNdataBridge, a novel server-based solution
-
Moldrug algorithm for an automated ligand binding site exploration by 3D aware molecular enumerations J. Cheminform. (IF 7.1) Pub Date : 2025-05-26
Alejandro Martínez León, Benjamin Ries, Jochen S. Hub, Aniket Magarkar
We present Moldrug, a computational tool for accelerating the hit-to-lead phase in structure-based drug design. Moldrug explores the chemical space using structural modifications suggested by the CReM library and by optimizing an adaptable fitness function with a genetic algorithm. Moldrug is complemented by Moldrug-Dashboard, a cross-platform and user-friendly graphical interface tailored for the
-
Surfactant representation using COSMO screened charge density for adsorption isotherm prediction using Physics-Informed Neural Network (PINN) J. Cheminform. (IF 7.1) Pub Date : 2025-05-26
Achmad Anggawirya Alimin, Kattariya Srasamran, Wanutchaya Yuenyong, Ampira Charoensaeng, Bor-Jier Shiau, Uthaiporn Suriyapraphadilok
Predicting surfactant adsorption using the currently available isotherm model is limited to one or two independent variables: equilibrium concentration and temperature. This study aims to develop an adsorption model that includes molecular features, testing conditions, and solid properties in the model. A Physics-Informed Neural Network (PINN) was structured by integrating adsorption isotherm into
-
Context-dependent similarity searching for small molecular fragments J. Cheminform. (IF 7.1) Pub Date : 2025-05-26
Atsushi Yoshimori, Jürgen Bajorath
Similarity searching is a mainstay in cheminformatics that is generally used to identify compounds with desired properties. For small molecular fragments, similarity calculations based on standard descriptors often have limited utility for establishing meaningful similarity relationships due to feature sparseness. As an alternative, we have adapted the concept of context-depending word pair similarity
-
Chemical characteristics vectors map the chemical space of natural biomes from untargeted mass spectrometry data J. Cheminform. (IF 7.1) Pub Date : 2025-05-26
Pilleriin Peets, Aristeidis Litos, Kai Dührkop, Daniel R. Garza, Justin J. J. van der Hooft, Sebastian Böcker, Bas E. Dutilh
Untargeted metabolomics can comprehensively map the chemical space of a biome, but is limited by low annotation rates (< 10%). We used chemical characteristics vectors, consisting of molecular fingerprints or chemical compound classes, predicted from mass spectrometry data, to characterize compounds and samples. These chemical characteristics vectors (CCVs) estimate the fraction of compounds with specific
-
Enhancing molecular property prediction with quantized GNN models J. Cheminform. (IF 7.1) Pub Date : 2025-05-26
Areen Rasool, Jamshaid Ul Rahman, Rongin Uwitije
Efficient and reliable prediction of molecular properties, such as water solubility, hydration free energy, lipophilicity, and quantum mechanical properties, is essential for rational compound design in the chemical and pharmaceutical industries. While Graph Neural Networks (GNNs) have significantly advanced molecular property prediction tasks, their high memory footprint, computational demands, and
-
Benchmarking molecular conformer augmentation with context-enriched training: graph-based transformer versus GNN models J. Cheminform. (IF 7.1) Pub Date : 2025-05-22
Cecile Valsecchi, Jose A. Arjona-Medina, Natalia Dyubankova, Ramil Nugmanov
The field of molecular representation has witnessed a shift towards models trained on molecular structures represented by strings or graphs, with chemical information encoded in nodes and bonds. Graph-based representations offer a more realistic depiction and support 3D geometry and conformer-based augmentation. Graph Neural Networks (GNNs) and Graph-based Transformer models (GTs) represent two paradigms
-
A resource description framework (RDF) model of named entity co-occurrences in biomedical literature and its integration with PubChemRDF J. Cheminform. (IF 7.1) Pub Date : 2025-05-21
Qingliang Li, Sunghwan Kim, Leonid Zaslavsky, Tiejun Cheng, Bo Yu, Evan E. Bolton
Named entities, such as chemicals/drugs, genes/proteins, and diseases, and their associations are not only important components of biomedical literature, but also the foundation of creating biomedical knowledgebases and knowledge graphs. This work addresses the challenges of expressing co-occurrence associations between named entities extracted from a biomedical literature corpus in a machine-readable
-
Machine learning-driven generation and screening of potential ionic liquids for cellulose dissolution J. Cheminform. (IF 7.1) Pub Date : 2025-05-21
Mengyang Qu, Gyanendra Sharma, Naoki Wada, Hisaki Ikebata, Shigeyuki Matsunami, Kenji Takahashi
Cellulose, a highly versatile material, faces challenges in processing due to its limited solubility in common solvents. Ionic liquids have been found to possess high solvating capacities for cellulose. However, the experimental development of ionic liquids with optimal cellulose solubilities remains a time-consuming trial-and-error process. In this work, a virtual molecular library containing billions
-
Advantages of two quantum programming platforms in quantum computing and quantum chemistry J. Cheminform. (IF 7.1) Pub Date : 2025-05-19
Pei-Hua Wang, Wei-Yeh Wu, Che-Yu Lee, Jia-Cheng Hong, Yufeng Jane Tseng
Quantum computing is at the forefront of technological advancement and has the potential to revolutionize various fields, including quantum chemistry. Choosing an appropriate quantum programming language becomes critical as quantum education and research increase. In this paper, we comprehensively compare two leading quantum programming languages, Qiskit and PennyLane, focusing on their suitability
-
Assessing interaction recovery of predicted protein-ligand poses J. Cheminform. (IF 7.1) Pub Date : 2025-05-19
David Errington, Constantin Schneider, Cédric Bouysset, Frédéric A. Dreyer
The field of protein-ligand pose prediction has seen significant advances in recent years, with machine learning-based methods now being commonly used in lieu of classical docking methods or even to predict all-atom protein-ligand complex structures. Most contemporary studies focus on the accuracy and physical plausibility of ligand placement to determine pose quality, often neglecting a direct assessment
-
Addressing standardization and semantics in an electronic lab notebook for multidisciplinary use: LabIMotion J. Cheminform. (IF 7.1) Pub Date : 2025-05-14
Chia-Lin Lin, Pei-Chi Huang, Christof Wöll, Patrick Théato, Christian Kübel, Lena Pilz, Nicole Jung, Stefan Bräse
This work presents the LabIMotion extension for the Chemotion Electronic Lab Notebook (ELN), expanding its capabilities from organic chemistry to support interdisciplinary research and enabling the description of workflows. LabIMotion enhances documentation by introducing customizable components structured across three levels (Elements, Segments, and Datasets), enabling flexible, hierarchical organization
-
Fate-tox: fragment attention transformer for E(3)-equivariant multi-organ toxicity prediction J. Cheminform. (IF 7.1) Pub Date : 2025-05-14
Sumin Ha, Dongmin Bang, Sun Kim
Toxicity is a critical hurdle in drug development, often causing the late-stage failure of promising compounds. Existing computational prediction models often focus on single-organ toxicity. However, avoiding toxicity of an organ, such as reducing gastrointestinal side effects, may inadvertently lead to toxicity in another organ, as seen in the real case of rofecoxib, which was withdrawn due to increased
-
Generalizable, fast, and accurate DeepQSPR with fastprop J. Cheminform. (IF 7.1) Pub Date : 2025-05-13
Jackson W. Burns, William H. Green
Quantitative Structure–Property Relationship studies (QSPR), often referred to interchangeably as QSAR, seek to establish a mapping between molecular structure and an arbitrary target property. Historically this was done on a target-by-target basis with new descriptors being devised to specifically map to a given target. Today software packages exist that calculate thousands of these descriptors, enabling
-
Generating diversity and securing completeness in algorithmic retrosynthesis J. Cheminform. (IF 7.1) Pub Date : 2025-05-13
Florian Mrugalla, Christopher Franz, Yannic Alber, Georg Mogk, Martín Villalba, Thomas Mrziglod, Kevin Schewior
Chemical synthesis planning has considerably benefited from advances in the field of machine learning. Neural networks can reliably and accurately predict reactions leading to a given, possibly complex, molecule. In this work we focus on algorithms for assembling such predictions to a full synthesis plan that, starting from simple building blocks, produces a given target molecule, a procedure known
-
The published role of artificial intelligence in drug discovery and development: a bibliometric and social network analysis from 1990 to 2023 J. Cheminform. (IF 7.1) Pub Date : 2025-05-08
Murat Koçak, Zafer Akçalı
Today, drug discovery and development is one of the fields where Artificial Intelligence (AI) is used extensively. Therefore, this study aims to systematically analyze the scientific literature on the application of AI in drug discovery and development to understand the evolution, trends, and key contributors within this rapidly growing field. By leveraging various bibliometric indicators and visualization
-
Application of 3D atom pair map in an attention model for enhanced drug virtual screening J. Cheminform. (IF 7.1) Pub Date : 2025-05-05
Gina Ryu, Wankyu Kim
Machine learning and artificial intelligence (AI) are actively applied in drug discovery, such as virtual screening, wherein appropriate molecular representation is critical. Conventional compound representations have limited use because they cannot encode the 3D spatial arrangement of atoms. An atom pair map (APM) represents a compound using a numerical matrix that encodes the physicochemical properties
-
Predicting inhibitors of OATP1B1 via heterogeneous OATP-ligand interaction graph neural network (HOLIgraph) J. Cheminform. (IF 7.1) Pub Date : 2025-05-05
Mehrsa Mardikoraem, Joelle N. Eaves, Theodore Belecciu, Nathaniel Pascual, Alexander Aljets, Bruno Hagenbuch, Erik M. Shapiro, Benjamin J. Orlando, Daniel R. Woldring
Organic anion transporting polypeptides (OATPs) are membrane transporters crucial for drug uptake and distribution in the human body. OATPs can mediate drug-drug interactions (DDIs) in which the interaction of one drug with an OATP impairs the uptake of another drug, resulting in potentially fatal pharmacological effects. Predicting OATP-mediated DDIs is challenging, due to limited information on OATP
-
Prediction of blood–brain barrier and Caco-2 permeability through the Enalos Cloud Platform: combining contrastive learning and atom-attention message passing neural networks J. Cheminform. (IF 7.1) Pub Date : 2025-05-05
Nikoletta-Maria Koutroumpa, Andreas Tsoumanis, Haralambos Sarimveis, Iseult Lynch, Georgia Melagraki, Antreas Afantitis
In this study, we introduce a novel approach for predicting two key drug properties, blood–brain barrier (BBB) permeability and human intestinal absorption via Caco-2 permeability. Our methodology centers around a specialized neural network, the atom transformer-based Message Passing Neural Network (MPNN), which we have combined with contrastive learning techniques to enhance the process of representing
-
Leveraging AI to explore structural contexts of post-translational modifications in drug binding J. Cheminform. (IF 7.1) Pub Date : 2025-05-04
Kirill E. Medvedev, R. Dustin Schaeffer, Nick V. Grishin
Post-translational modifications (PTMs) play a crucial role in allowing cells to expand the functionality of their proteins and adaptively regulate their signaling pathways. Defects in PTMs have been linked to numerous developmental disorders and human diseases, including cancer, diabetes, heart, neurodegenerative and metabolic diseases. PTMs are important targets in drug discovery, as they can significantly
-
Improving the accuracy of prediction models for small datasets of Cytochrome P450 inhibition with deep learning J. Cheminform. (IF 7.1) Pub Date : 2025-04-30
Elpri Eka Permadi, Reiko Watanabe, Kenji Mizuguchi
The cytochrome P450 (CYP) superfamily metabolises a wide range of compounds; however, drug-induced CYP inhibition can lead to adverse interactions. Identifying potential CYP inhibitors is crucial for safe drug administration. This study investigated the application of deep learning techniques to the prediction of CYP inhibition, focusing on the challenges posed by limited datasets for CYP2B6 and CYP2C8
-
LAGNet: better electron density prediction for LCAO-based data and drug-like substances J. Cheminform. (IF 7.1) Pub Date : 2025-04-29
Konstantin Ushenin, Kuzma Khrabrov, Artem Tsypin, Anton Ber, Egor Rumiantsev, Artur Kadurin
The electron density is an important object in quantum chemistry that is crucial for many downstream tasks in drug design. Recent deep learning approaches predict the electron density around a molecule from atom types and atom positions. Most of these methods use the plane wave (PW) numerical method as a source of ground-truth training data. However, the drug design field mostly uses the Linear Combination
-
E-GuARD: expert-guided augmentation for the robust detection of compounds interfering with biological assays J. Cheminform. (IF 7.1) Pub Date : 2025-04-29
Vincenzo Palmacci, Yasmine Nahal, Matthias Welsch, Ola Engkvist, Samuel Kaski, Johannes Kirchmair
Assay interference caused by small organic compounds continues to pose formidable challenges to early drug discovery. Various computational methods have been developed to identify compounds likely to cause assay interference. However, due to the scarcity of data available for model development, the predictive accuracy and applicability of these approaches are limited. In this work, we present E-GuARD
-
SMILES all around: structure to SMILES conversion for transition metal complexes J. Cheminform. (IF 7.1) Pub Date : 2025-04-28
Maria H. Rasmussen, Magnus Strandgaard, Julius Seumer, Laura K. Hemmingsen, Angelo Frei, David Balcells, Jan H. Jensen
We present a method for creating RDKit-parsable SMILES for transition metal complexes (TMCs) based on xyz-coordinates and overall charge of the complex. This can be viewed as an extension to the program xyz2mol that does the same for organic molecules. The only dependency is RDKit, which makes it widely applicable. One thing that has been lacking when it comes to generating SMILES from structure for
-
Translating community-wide spectral library into actionable chemical knowledge: a proof of concept with monoterpene indole alkaloids J. Cheminform. (IF 7.1) Pub Date : 2025-04-28
Sarah Szwarc, Adriano Rutz, Kyungha Lee, Yassine Mejri, Olivier Bonnet, Hazrina Hazni, Adrien Jagora, Rany B. Mbeng Obame, Jin Kyoung Noh, Elvis Otogo N’Nang, Stephenie C. Alaribe, Khalijah Awang, Guillaume Bernadat, Young Hae Choi, Vincent Courdavault, Michel Frederich, Thomas Gaslonde, Florian Huber, Toh-Seok Kam, Yun Yee Low, Erwan Poupon, Justin J. J. van der Hooft, Kyo Bin Kang, Pierre Le Pogam
With over 3000 representatives, the monoterpene indole alkaloids (MIAs) class is among the most diverse families of plant natural products. The MS/MS spectral space exploration of these complex compounds using chemoinformatic and computational mass spectrometry tools offers a valuable opportunity to extract and share chemical insights from this emblematic family of natural products (NPs). In this work
-
Moldina: a fast and accurate search algorithm for simultaneous docking of multiple ligands J. Cheminform. (IF 7.1) Pub Date : 2025-04-28
Radek Halfar, Jiří Damborský, Sérgio M. Marques, Jan Martinovič
Protein-ligand docking is a computational method routinely used in many structural biology applications. It usually involves one receptor and one ligand. The docking of multiple ligands, however, can be important in several situations, such as the study of synergistic effects, substrate and product inhibition, or competitive binding. This can be a challenging and computationally demanding process.
-
Visualising lead optimisation series using reduced graphs J. Cheminform. (IF 7.1) Pub Date : 2025-04-24
Jessica Stacey, Baptiste Canault, Stephen D. Pickett, Valerie J. Gillet
The typical way in which lead optimisation (LO) series are represented in the medicinal chemistry literature is as Markush structures and associated R-group tables. The Markush structure shows a central core or molecular scaffold that is common to the series with R groups that indicate the points of variability that have been explored in the series. The associated R-group table shows the substituent
-
High-throughput screening data generation, scoring and FAIRification: a case study on nanomaterials J. Cheminform. (IF 7.1) Pub Date : 2025-04-23
Gergana Tancheva, Vesa Hongisto, Konrad Patyra, Luchesar Iliev, Nikolay Kochev, Penny Nymark, Pekka Kohonen, Nina Jeliazkova, Roland Grafström
In vitro-based high-throughput screening (HTS) technology is applicable to hazard-based ranking and grouping of diverse agents, including nanomaterials (NMs). We present a standardized HTS-derived human cell-based testing protocol which combines the analysis of five assays into a broad toxic mode-of-action-based hazard value, termed Tox5-score. The overall protocol includes automated data FAIRification
-
Molecular property prediction using pretrained-BERT and Bayesian active learning: a data-efficient approach to drug design J. Cheminform. (IF 7.1) Pub Date : 2025-04-23
Muhammad Arslan Masood, Samuel Kaski, Tianyu Cui
In drug discovery, prioritizing compounds for experimental testing is a critical task that can be optimized through active learning by strategically selecting informative molecules. Active learning typically trains models on labeled examples alone, while unlabeled data is only used for acquisition. This fully supervised approach neglects valuable information present in unlabeled molecular data, impairing
-
GESim: ultrafast graph-based molecular similarity calculation via von Neumann graph entropy J. Cheminform. (IF 7.1) Pub Date : 2025-04-22
Hiroaki Shiokawa, Shoichi Ishida, Kei Terayama
Representing molecules as graphs is a natural approach for capturing their structural information, with atoms depicted as nodes and bonds as edges. Although graph-based similarity calculation approaches, such as the graph edit distance, have been proposed for calculating molecular similarity, these approaches are nondeterministic polynomial (NP)-hard and thus computationally infeasible for routine
-
Learning motif features and topological structure of molecules for metabolic pathway prediction J. Cheminform. (IF 7.1) Pub Date : 2025-04-21
Jianguo Hu, Yiqing Zhang, Jinxin Xie, Zhen Yuan, Zhangxiang Yin, Shanshan Shi, Honglin Li, Shiliang Li
Metabolites serve as crucial biomarkers for assessing disease progression and understanding underlying pathogenic mechanisms. However, when the metabolic pathway category of metabolites is unknown, researchers face challenges in conducting metabolomic analyses. Due to the complexity of wet laboratory experimentation for pathway identification, there is a growing demand for predictive methods. Various
-
Prediction of the water solubility by a graph convolutional-based neural network on a highly curated dataset J. Cheminform. (IF 7.1) Pub Date : 2025-04-21
Nadin Ulrich, Karsten Voigt, Anton Kudria, Alexander Böhme, Ralf-Uwe Ebert
Water solubility is a relevant physico-chemical property in environmental chemistry, toxicology, and drug design. Although water solubility is, besides the octanol–water partition coefficient, melting point, and boiling point, a property with a large amount of available experimental data, there are still more compounds in the chemical universe for which information on their water solubility is lacking
-
Activity cliff-aware reinforcement learning for de novo drug design J. Cheminform. (IF 7.1) Pub Date : 2025-04-21
Xiuyuan Hu, Guoqing Liu, Yang Zhao, Hao Zhang
The integration of artificial intelligence (AI) in drug discovery offers promising opportunities to streamline and enhance the traditional drug development process. One core challenge in de novo molecular design is modeling complex structure-activity relationships (SAR), such as activity cliffs, where minor molecular changes yield significant shifts in biological activity. In response to the limitations
-
The pucke.rs toolkit to facilitate sampling the conformational space of biomolecular monomers J. Cheminform. (IF 7.1) Pub Date : 2025-04-17
Jérôme Rihon, Sten Reynders, Vitor Bernardes Pinheiro, Eveline Lescrinier
Understanding of the structural and dynamic behaviour of molecules is a major objective in molecular modeling research. Sampling through the torsional space is an efficient way to map their behaviour. However, generating a landscape of possible conformations relies on multiple formalisms whose mathematics are often difficult to convert to code. Here we present a command line tool and a scripting module
-
Integrating QSAR modelling with reinforcement learning for Syk inhibitor discovery J. Cheminform. (IF 7.1) Pub Date : 2025-04-15
Maria Zavadskaya, Anastasia Orlova, Andrei Dmitrenko, Vladimir Vinogradov
Spleen tyrosine kinase (Syk) is a crucial mediator of inflammatory processes and a promising therapeutic target for the management of autoimmune disorders, such as immune thrombocytopenia. While several Syk inhibitors are known to date, their efficacy and safety profiles remain suboptimal, necessitating the exploration of novel compounds. The study introduces a novel deep reinforcement learning strategy
-
Enhancing chemical reaction search through contrastive representation learning and human-in-the-loop J. Cheminform. (IF 7.1) Pub Date : 2025-04-10
Youngchun Kwon, Hyunjeong Jeon, Joonhyuk Choi, Youn-Suk Choi, Seokho Kang
In synthesis planning, identifying and optimizing chemical reactions are important for the successful design of synthetic pathways to target substances. Chemical reaction databases assist chemists in gaining insights into this process. Traditionally, searching for relevant records from a reaction database has relied on the manual formulation of queries by chemists based on their search purposes, which
-
Unveiling polyphenol-protein interactions: a comprehensive computational analysis J. Cheminform. (IF 7.1) Pub Date : 2025-04-10
Samo Lešnik, Marko Jukić, Urban Bren
Our study investigates polyphenol-protein interactions, analyzing their structural diversity and dynamic behavior. Analysis of the entire Protein Data Bank reveals diverse polyphenolic structures, engaging in various noncovalent interactions with proteins. Interactions observed across crystal structures among diverse polyphenolic classes reveal similarities, underscoring consistent patterns across
-
InertDB as a generative AI-expanded resource of biologically inactive small molecules from PubChem J. Cheminform. (IF 7.1) Pub Date : 2025-04-10
Seungchan An, Yeonjin Lee, Junpyo Gong, Seokyoung Hwang, In Guk Park, Jayhyun Cho, Min Ju Lee, Minkyu Kim, Yun Pyo Kang, Minsoo Noh
The development of robust artificial intelligence (AI)-driven predictive models relies on high-quality, diverse chemical datasets. However, the scarcity of negative data and a publication bias toward positive results often hinder accurate biological activity prediction. To address this challenge, we introduce InertDB, a comprehensive database comprising 3,205 curated inactive compounds (CICs) identified
-
HepatoToxicity Portal (HTP): an integrated database of drug-induced hepatotoxicity knowledgebase and graph neural network-based prediction model J. Cheminform. (IF 7.1) Pub Date : 2025-04-08
Jiyeon Han, Wonho Zhung, Insoo Jang, Joongwon Lee, Min Ji Kang, Timothy Dain Lee, Seung Jun Kwack, Kyu-Bong Kim, Daehee Hwang, Byungwook Lee, Hyung Sik Kim, Woo Youn Kim, Sanghyuk Lee
Liver toxicity poses a critical challenge in drug development due to the liver's pivotal role in drug metabolism and detoxification. Accurately predicting liver toxicity is crucial but is hindered by scattered information sources, a lack of curation standards, and the heterogeneity of data perspectives. To address these challenges, we developed the HepatoToxicity Portal (HTP), which integrates an expert-curated
-
A beginner’s approach to deep learning applied to VS and MD techniques J. Cheminform. (IF 7.1) Pub Date : 2025-04-08
Stijn D’Hondt, José Oramas, Hans De Winter
It has become impossible to imagine the fields of biochemistry and medicinal chemistry without computational chemistry and molecular modelling techniques. In many steps of the drug development process in silico methods have become indispensable. Virtual screening (VS) can tremendously expedite the early discovery phase, whilst the use of molecular dynamics (MD) simulations forms a powerful additional
-
AI/ML methodologies and the future-will they be successful in designing the next generation of new chemical entities? J. Cheminform. (IF 7.1) Pub Date : 2025-04-06
Rachelle J. Bienstock
Cheminformatics and chemical databases are essential to drug discovery. However, machine learning (ML) and artificial intelligence (AI) methodologies are changing the way in which chemical data is used. How will the use of chemical data change in drug discovery moving forward? How do the new ML methods in molecular property prediction, hit and lead and target identification and structure prediction
-
Clc-db: an open-source online database of chiral ligands and catalysts J. Cheminform. (IF 7.1) Pub Date : 2025-04-03
Gufeng Yu, Kaiwen Yu, Xi Wang, Chenxi Zhang, Yicong Luo, Xiaohong Huo, Yang Yang
The design and optimization of chiral ligands and catalysts are fundamental to advancing asymmetric catalysis, a critical area in organic chemistry with wide-ranging impacts across scientific disciplines. Traditional experimental approaches, while essential, are often hindered by their slow pace and complexity. Recent advancements have demonstrated that computational methods, particularly machine learning
-
The evolution of open science in cheminformatics: a journey from closed systems to collaborative innovation J. Cheminform. (IF 7.1) Pub Date : 2025-04-03
Christoph Steinbeck
Cheminformatics has significantly transformed over the past four decades, evolving from a field dominated by proprietary systems to one increasingly embracing open science principles. In its early years, cheminformatics was characterised by commercial software and restricted data access, limiting collaboration and reproducibility. The advent of open-source software in the late 1990s and early 2000s
-
Correction: APBIO: bioactive profiling of air pollutants through inferred bioactivity signatures and prediction of novel target interactions J. Cheminform. (IF 7.1) Pub Date : 2025-04-01
Eva Viesi, Ugo Perricone, Patrick Aloy, Rosalba Giugno
Correction: Journal of Cheminformatics (2025) 17:13 https://doi.org/10.1186/s13321-025-00961-1 Following publication of the original article [1], the authors identified the following errors: The incorrect Acknowledgements is: The authors would like to thank the ‘National Biodiversity Future Center’ (identification code CN00000033, CUP B73C21001300006) on ‘Biodiversity’, financed under the National
-
Predictive modeling of visible-light azo-photoswitches’ properties using structural features J. Cheminform. (IF 7.1) Pub Date : 2025-04-01
Said Byadi, P. K. Hashim, Pavel Sidorov
In this manuscript we present the strategy for modeling photoswitch properties (maximum absorption wavelength and thermal half-life of photoisomers) of visible-light azo-photoswitches using structural data. We compile a comprehensive data set from literature sources and perform a rigorous benchmark to select the best feature type and modeling approach. The fragment counts have demonstrated the best
-
Generate what you can make: achieving in-house synthesizability with readily available resources in de novo drug design J. Cheminform. (IF 7.1) Pub Date : 2025-03-28
Alan Kai Hassen, Martin Šícho, Yorick J. van Aalst, Mirjam C. W. Huizenga, Darcy N. R. Reynolds, Sohvi Luukkonen, Andrius Bernatavicius, Djork-Arné Clevert, Antonius P. A. Janssen, Gerard J. P. van Westen, Mike Preuss
Computer-Aided Synthesis Planning (CASP) and CASP-based approximated synthesizability scores have rarely been used as generation objectives in Computer-Aided Drug Design despite facilitating the in-silico generation of synthesizable molecules. However, these synthesizability approaches are disconnected from the reality of small laboratory drug design, where building block resources are limited, thus
-
Three pillars for ensuring public access and integrity of chemical databases powering cheminformatics J. Cheminform. (IF 7.1) Pub Date : 2025-03-28
Antony J. Williams, Ann M. Richard
Since the inception of the Internet, public databases disseminating chemistry data to the community have proliferated and helped to support and encourage a burgeoning interest in cheminformatics. This has been supported by a shift in open science, exemplified by Open Data, Open Source, and Open Standards (ODOSOS) for chemistry [1], as well as by the increasing sophistication and availability of free
-
Protecting your skin: a highly accurate LSTM network integrating conjoint features for predicting chemical-induced skin irritation J. Cheminform. (IF 7.1) Pub Date : 2025-03-27
Huynh Anh Duy, Tarapong Srisongkram
Skin irritation is a significant adverse effect associated with chemicals and drug substances. Quantitative structure-activity relationship (QSAR) is an alternative method bypassing in vivo assay for filling data gaps in chemical risk assessment. In this study, we developed QSAR models based on recurrent neural networks (RNNs) to classify skin irritation caused by chemical compounds. We utilized chemical
-
Publishing neural networks in drug discovery might compromise training data privacy J. Cheminform. (IF 7.1) Pub Date : 2025-03-26
Fabian P. Krüger, Johan Östman, Lewis Mervin, Igor V. Tetko, Ola Engkvist
This study investigates the risks of exposing confidential chemical structures when machine learning models trained on these structures are made publicly available. We use membership inference attacks, a common method to assess privacy that is largely unexplored in the context of drug discovery, to examine neural networks for molecular property prediction in a black-box setting. Our results reveal
-
A unified approach to inferring chemical compounds with the desired aqueous solubility J. Cheminform. (IF 7.1) Pub Date : 2025-03-26
Muniba Batool, Naveed Ahmed Azam, Jianshen Zhu, Kazuya Haraguchi, Liang Zhao, Tatsuya Akutsu
Aqueous solubility (AS) is a key physicochemical property that plays a crucial role in drug discovery and material design. We report a novel unified approach to predict and infer chemical compounds with the desired AS based on simple deterministic graph-theoretic descriptors, multiple linear regression (MLR), and mixed integer linear programming (MILP). Selected descriptors based on a forward stepwise
-
Large language models open new way of AI-assisted molecule design for chemists J. Cheminform. (IF 7.1) Pub Date : 2025-03-24
Shoichi Ishida, Tomohiro Sato, Teruki Honma, Kei Terayama
Recent advancements in artificial intelligence (AI)-based molecular design methodologies have offered synthetic chemists new ways to design functional molecules with their desired properties. While various AI-based molecule generators have significantly advanced toward practical applications, their effective use still requires specialized knowledge and skills concerning AI techniques. Here, we develop
-
An interpretable deep geometric learning model to predict the effects of mutations on protein–protein interactions using large-scale protein language model J. Cheminform. (IF 7.1) Pub Date : 2025-03-21
Caiya Zhang, Yan Sun, Pingzhao Hu
Protein–protein interactions (PPIs) are central to the mechanisms of signaling pathways and immune responses, which can help us understand disease etiology. Therefore, there is a significant need for efficient and rapid automated approaches to predict changes in PPIs. In recent years, there has been a significant increase in applying deep learning techniques to predict changes in binding affinity between
-
Anticipating protein evolution with successor sequence predictor J. Cheminform. (IF 7.1) Pub Date : 2025-03-21
Rayyan Tariq Khan, Pavel Kohout, Milos Musil, Monika Rosinska, Jiri Damborsky, Stanislav Mazurenko, David Bednar
The quest to predict and understand protein evolution has been hindered by limitations on both the theoretical and the experimental fronts. Most existing theoretical models of evolution are descriptive, rather than predictive, leaving the final modifications in the hands of researchers. Existing experimental techniques to help probe the evolutionary sequence space of proteins, such as directed evolution