A Concise Bond-Distance Summation Descriptor for Effective Melting Point Prediction of Boronic Acids
DOI:
https://doi.org/10.52280/v0xqtz70
Keywords:
molecular descriptors, boronic acids, melting point, machine learning algorithms
Abstract
Predicting the melting points of boronic acids is crucial for guiding synthetic strategies and understanding their physicochemical behaviors. In this study, we introduce a novel bond-distance summation descriptor, a concise 20-component vector that numerically encodes the molecular structure by summing atomic numbers over the shortest paths from the boron atom. We benchmarked this descriptor against four established feature extraction methods (Coulomb Matrix, Mordred, Morgan Fingerprints, and Molecular ACCess System (MACCS)) and evaluated the predictive accuracy of five machine learning models: Decision Tree, Random Forest, XGBoost, LightGBM, and Support Vector Machine. Despite having far fewer features than the high-dimensional Mordred and Morgan representations, our 20-component descriptor achieves competitive results, particularly when paired with XGBoost, which consistently exhibits superior performance in terms of Mean Absolute Error (MAE) and R² score. These findings underscore the potential of a concise, interpretable descriptor for effective melting point prediction, paving the way for the future integration of this scheme into broader cheminformatics applications.
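To make the descriptor concrete, the following is a minimal sketch of one possible reading of the abstract, not the authors' implementation. It assumes that component d of the vector is the sum of the atomic numbers of all atoms whose shortest bond path from the boron atom has exactly d bonds (d = 1 to 20), and that hydrogens are omitted. The function name `bond_distance_descriptor`, the input format, and the hand-written phenylboronic acid graph are all hypothetical and chosen for illustration only.

```python
from collections import deque


def bond_distance_descriptor(atomic_numbers, bonds, length=20):
    """Sum atomic numbers per shortest-path bond distance from boron.

    ASSUMPTION: entry d-1 holds the sum over atoms exactly d bonds away
    from the first boron atom; the paper may define the components
    differently.
    """
    # Build an undirected adjacency list from the bond pairs.
    neighbors = {i: [] for i in range(len(atomic_numbers))}
    for a, b in bonds:
        neighbors[a].append(b)
        neighbors[b].append(a)

    # Use the first boron atom (Z = 5) as the root; raises ValueError if absent.
    boron = atomic_numbers.index(5)

    # Breadth-first search gives the shortest bond distance to every atom.
    dist = {boron: 0}
    queue = deque([boron])
    while queue:
        cur = queue.popleft()
        for nxt in neighbors[cur]:
            if nxt not in dist:
                dist[nxt] = dist[cur] + 1
                queue.append(nxt)

    # Accumulate atomic numbers into the fixed-length vector.
    vec = [0] * length
    for atom, d in dist.items():
        if 1 <= d <= length:
            vec[d - 1] += atomic_numbers[atom]
    return vec


# Phenylboronic acid, heavy atoms only (assumed): B, two O, six ring C.
pba_atoms = [5, 8, 8, 6, 6, 6, 6, 6, 6]
pba_bonds = [(0, 1), (0, 2), (0, 3),            # B-O, B-O, B-C
             (3, 4), (4, 5), (5, 6),            # aromatic ring
             (6, 7), (7, 8), (8, 3)]
pba_vec = bond_distance_descriptor(pba_atoms, pba_bonds)
print(pba_vec[:5])  # [22, 12, 12, 6, 0]
```

Under this reading, the first entry (22) collects the two oxygens and the ipso carbon bonded to boron, and the remaining nonzero entries trace the ring outward; all entries beyond the molecule's extent stay zero, which keeps the vector at a fixed length of 20 regardless of molecule size.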
License
Copyright (c) 2025 Muhammad Zia Afzal, Shahid Saeed Siddiqi

This work is licensed under a Creative Commons Attribution 4.0 International License.
