
This repository provides the public PyTorch implementation of CHAMP (Coupled Hierarchical Atom-Motif Predictor) for molecular property prediction. The code released here serves as the main maintained implementation accompanying the manuscript. It contains the core CHAMP model components, motif-construction modules, configuration utilities, and the public training entry point currently documented for the released pipeline.

## Overview

CHAMP is a hierarchical graph neural network framework designed to combine, within a unified molecular representation learning pipeline:

- fine-grained atomic structure,
- coarse-grained motif semantics, and
- motif-guided cross-scale fusion.

The framework is organized around three conceptual stages:

1. Motif construction and structural encoding: CHAMP builds motif-level representations on top of atom-level molecular graphs and models internal motif topology to preserve structural information.
2. Function-aware motif refinement: CHAMP refines motif embeddings through supervised contrastive constraints so that structurally similar motifs with different functional roles can be distinguished more effectively.
3. Hierarchical atom-motif fusion: CHAMP uses motif-level semantics to guide atom-level aggregation and performs cross-scale fusion through gating and inter-head interaction mechanisms.

The current public release focuses on the core modules and the main training workflow implemented in this repository.

## Repository Scope

The released codebase includes:

- the core model components in `Model/`,
- motif extraction and motif-graph construction in `motif_extract/`,
- shared helper utilities in `utils/`,
- argument configuration in `Args.py`,
- motif-aware dataset preparation in `motif_spilit.py`,
- the main public training script in `main_classification.py`,
- the dependency specification in `requirements.txt`.

Local folders such as `dataset/`, `best_model/`, `.idea/`, and `__pycache__/` may appear in the working directory. They are local resources or development artifacts, not part of the released source implementation.

## Repository Structure

The current directory structure of the released code is:

```
Code/
├── Args.py
├── main_classification.py
├── motif_spilit.py
├── overview.png
├── README.md
├── requirements.txt
├── Model/
│   ├── HMSAF.py
│   ├── atom_motif_attention.py
│   ├── contrastive_learning.py
│   └── motif_embedding.py
└── motif_extract/
    ├── mol_motif.py
    └── motif_graph.py
```

For readers who only want to understand or reuse the main implementation, the primary source files are:

- `main_classification.py`
- `Args.py`
- `motif_spilit.py`
- `Model/*.py`
- `motif_extract/*.py`

## Installation

```bash
pip install -r requirements.txt
```

Main dependencies:

- PyTorch (1.12.0+cu113)
- PyTorch Geometric (2.6.1)
- RDKit (2024.9.3)
- scikit-learn (1.7.2)
- UMAP-learn (0.5.7)

## Usage

### Parameter Configuration

Training parameters can be configured via `Args.py`:

- `--dataset`: dataset name
- `--data_dir`: dataset directory
- `--node_feature_dim`: atom feature dimension
- `--edge_feature_dim`: edge feature dimension
- `--hidden_dim`: hidden representation dimension
- `--batch_size`: batch size
- `--epochs`: number of epochs
- `--lr`: learning rate
- `--weight_decay`: optimizer weight decay
- `--patience`: scheduler patience
- `--factor`: scheduler decay factor
- `--loss_fn`: loss function option
- `--alpha`: ring-level contrastive loss weight
- `--beta`: non-ring contrastive loss weight
- `--Pair_MLP`: whether to enable the pairwise motif encoder option
- `--is_contrastive`: whether to enable contrastive learning
- `--use_Guide`: whether to enable motif guidance
- `--use_gating`: whether to enable contextual gating
- `--use_head_interaction`: whether to enable inter-head interaction
- `--label_thresh_ratio`: threshold ratio used in motif comparison
- `--save_dir`: checkpoint directory
- `--log_dir`: log directory
- `--device`: execution device

### Running Experiments

```bash
# Example for a classification task
python main_classification.py --dataset MUTAG --use_head_interaction True --use_gating True
```

## Supported Datasets

The framework supports a range of molecular benchmark datasets, including several from MoleculeNet:

- Regression tasks: ESOL, FreeSolv, Lipophilicity.
- Classification tasks: MUTAG, HIV, BACE, Tox21.

Datasets are expected in a standard graph format, containing node features, edge connectivity, and molecular labels.
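
As an illustration of the graph format described above, the following minimal sketch encodes a three-atom molecule as node features, edge connectivity, and a label. Everything in it is an assumption made for illustration: the `MoleculeGraph` container, the `build_graph` helper, the one-hot atom vocabulary, and the bond-order edge feature are hypothetical and are not taken from this repository. NumPy is used only to keep the sketch light. In a PyTorch Geometric pipeline the same four fields are conventionally stored on a `torch_geometric.data.Data` object as `x`, `edge_index`, `edge_attr`, and `y`.

```python
from dataclasses import dataclass

import numpy as np

# Hypothetical atom vocabulary used for one-hot node features (illustration only).
ATOM_TYPES = ["C", "N", "O"]


@dataclass
class MoleculeGraph:
    """Hypothetical container mirroring the usual graph fields."""
    x: np.ndarray           # node features, shape [num_atoms, node_feature_dim]
    edge_index: np.ndarray  # connectivity, shape [2, num_directed_edges]
    edge_attr: np.ndarray   # edge features, shape [num_directed_edges, edge_feature_dim]
    y: np.ndarray           # molecular label(s), shape [num_tasks]


def build_graph(atoms, bonds, label):
    """Build a graph from atom symbols and (i, j, bond_order) tuples."""
    # One-hot encode each atom symbol against the small vocabulary above.
    x = np.zeros((len(atoms), len(ATOM_TYPES)), dtype=np.float32)
    for i, symbol in enumerate(atoms):
        x[i, ATOM_TYPES.index(symbol)] = 1.0

    # Store every undirected bond as two directed edges (i -> j and j -> i).
    src, dst, attr = [], [], []
    for i, j, order in bonds:
        src += [i, j]
        dst += [j, i]
        attr += [[order], [order]]

    edge_index = np.array([src, dst], dtype=np.int64).reshape(2, -1)
    edge_attr = np.array(attr, dtype=np.float32).reshape(-1, 1)
    y = np.array([label], dtype=np.float32)
    return MoleculeGraph(x=x, edge_index=edge_index, edge_attr=edge_attr, y=y)


# Heavy atoms of ethanol (C-C-O) with two single bonds and an arbitrary label.
ethanol = build_graph(["C", "C", "O"], [(0, 1, 1.0), (1, 2, 1.0)], label=1.0)
print(ethanol.x.shape, ethanol.edge_index.shape, ethanol.edge_attr.shape)
```

Each undirected bond appears as two directed edges, which is the usual convention for message-passing layers. In a layout like this, the second dimensions of `x` and `edge_attr` would presumably be the values passed as `--node_feature_dim` and `--edge_feature_dim`, although the exact featurization used by the released code is not specified here.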
