StarMoonWang/SeisMoLLM

🔥 Introduction

Large-scale pretraining for seismic monitoring remains challenging due to the lack of effective pretraining algorithms, the high cost of training, and the significant variations across existing datasets. These obstacles have seriously hindered the development of domain foundation models. SeisMoLLM is the first work to explore a cross-modality transfer strategy for seismic monitoring, unleashing the power of pretraining by adapting a pretrained LLM (GPT-2 here) into a powerful and general-purpose framework.
With a unified network architecture, SeisMoLLM can handle various seismic monitoring tasks, including back-azimuth estimation, epicentral distance estimation, magnitude estimation, phase picking, and first-motion polarity classification, demonstrating its potential as a framework for seismic foundation models.

✨ Highlights

1. Surprising performance: With standard supervised training on the DiTing-light and STEAD datasets, SeisMoLLM achieves state-of-the-art performance across the five tasks above, taking 36 best results out of 43 task metrics, with many relative improvements ranging from 10% to 50% over advanced baselines.

2. Excellent generalization: Using only 10% of the data as the training set, SeisMoLLM consistently attains better results than the train-from-scratch baselines, with 12 top scores out of 16 metrics.

3. Modest cost: Despite introducing a large language model, SeisMoLLM maintains a low training cost and an inference speed comparable to or even better than lightweight baselines. Training takes only 3-36 hours on 4× RTX-4090 GPUs.

⚡️ Usage

🛠️ Preparation

  • Install the required environment by running pip install -r requirements.txt
  • Download the required datasets: STEAD is available from the STEAD repo, while the DiTing dataset requires a request for access; please contact the authors of its paper. Then change --data in the bash scripts to your local data path.
  • Prepare the pretrained GPT-2 model files from the huggingface gpt2 repo and place them in the your_dir/GPT2 directory, then change GPT_file_path in models/SeisMoLLM.py to your_dir. We suggest downloading manually because huggingface is blocked in many areas.
  • Get all the checkpoints, including baselines, from the huggingface repo if you want to use our trained models.
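Before training, it can save time to verify that the GPT-2 directory is complete. The sketch below is a small convenience check, not part of the repository; the file list assumes the standard huggingface gpt2 repo contents, and your_dir/GPT2 is the placeholder path from the step above:

```python
from pathlib import Path

# Files typically present in the huggingface "gpt2" repository; adjust the
# list if you downloaded a different set (e.g. safetensors weights).
EXPECTED_FILES = ["config.json", "vocab.json", "merges.txt", "pytorch_model.bin"]

def missing_gpt2_files(gpt2_dir):
    """Return the expected GPT-2 files that are absent from gpt2_dir."""
    gpt2_dir = Path(gpt2_dir)
    return [name for name in EXPECTED_FILES if not (gpt2_dir / name).is_file()]

if __name__ == "__main__":
    # Replace "your_dir/GPT2" with your actual local path.
    missing = missing_gpt2_files("your_dir/GPT2")
    if missing:
        print("Missing GPT-2 files:", ", ".join(missing))
    else:
        print("GPT-2 files look complete.")
```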

🚀 Running

We provide inference_demo.ipynb to show how SeisMoLLM works across all tasks. For single-sample inference, please take a look at this notebook. (Sorry for any inconvenience: the comments are in Chinese, as the notebook was originally prepared for my talk at the China Earthquake Administration.) We also plan to add a short tutorial on integrating custom models or datasets into this codebase, along with key caveats, so please stay tuned.
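The notebook's exact preprocessing is not reproduced here, but single-sample input to seismic networks typically involves demeaning and amplitude normalization of a 3-component waveform. The following is a generic sketch of that idea, assuming numpy; the shapes and normalization scheme are illustrative, not the repository's exact pipeline:

```python
import numpy as np

def preprocess_waveform(wave):
    """Demean and max-normalize a (channels, samples) seismic waveform.

    Generic preprocessing sketch (not the exact pipeline of
    inference_demo.ipynb): remove the per-trace offset, then scale by the
    global absolute maximum so values fall in [-1, 1].
    """
    wave = np.asarray(wave, dtype=np.float32)
    wave = wave - wave.mean(axis=-1, keepdims=True)  # per-trace demean
    peak = np.abs(wave).max()
    if peak > 0:
        wave = wave / peak
    return wave

# Example: a synthetic 3-component record (6000 samples, e.g. 60 s at 100 Hz).
x = preprocess_waveform(np.random.randn(3, 6000) + 5.0)
```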

To start training or evaluation, run the scripts in run_scripts. Try python main.py --help to see a description of every hyperparameter so you can tune the arguments. For model selection, the task abbreviations used in the model names are listed in the table below:

| Task | Abbreviation |
| --- | --- |
| Detection & Phase Picking | dpk |
| First-Motion Polarity Classification | pmp |
| Back-Azimuth Estimation | baz |
| Magnitude Estimation | emg |
| Epicentral Distance Estimation | dis |
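For scripting convenience, the mapping above can be kept in a small dict when iterating over tasks. The checkpoint naming below is purely illustrative; check the actual model file names in the huggingface repo:

```python
# Task-name -> abbreviation mapping, copied from the table above.
TASK_ABBR = {
    "Detection & Phase Picking": "dpk",
    "First-Motion Polarity Classification": "pmp",
    "Back-Azimuth Estimation": "baz",
    "Magnitude Estimation": "emg",
    "Epicentral Distance Estimation": "dis",
}

def model_name(task, base="SeisMoLLM"):
    """Build an identifier like 'SeisMoLLM_dpk' (naming is illustrative)."""
    return f"{base}_{TASK_ABBR[task]}"
```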

If you want to use a custom dataset or model, or to change the task settings, then in addition to implementing your code following the provided examples, please remember to modify config.py.
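The actual layout of config.py is not shown in this README. Purely to illustrate the kind of registration step meant above, a custom dataset might be wired into a name-based registry like this; every name here is hypothetical and is not the repository's API:

```python
# Hypothetical registry pattern illustrating why a config module must be
# updated: the training loop can only construct components it knows by name.
DATASETS = {}

def register_dataset(name):
    """Decorator that records a dataset class under a lookup name."""
    def decorator(cls):
        DATASETS[name] = cls
        return cls
    return decorator

@register_dataset("my_regional_dataset")  # hypothetical dataset name
class MyRegionalDataset:
    def __init__(self, data_path):
        self.data_path = data_path

def build_dataset(name, data_path):
    """Instantiate a registered dataset by name."""
    return DATASETS[name](data_path)
```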

👍 Acknowledgement

Our code is developed based on the SeisT codebase. Thanks for their great work.

🎓 Citation

If our work helps, please give us a star ⭐ or cite it with:

@article{wang2026seismollm,
author = {Wang, Xinghao and Liu, Feng and Su, Rui and Wang, Zhihui and Fang, Lihua and Zhou, Lianqing and Bai, Lei and Ouyang, Wanli},
title = {SeisMoLLM: Advancing Seismic Monitoring via Cross-Modal Transfer With Pretrained Large Language Model},
journal = {Geophysical Research Letters},
volume = {53},
number = {8},
pages = {e2025GL118505},
keywords = {seismic monitoring, deep learning, foundation model, phase picking, earthquake location, magnitude estimation},
doi = {10.1029/2025GL118505},
url = {https://agupubs.onlinelibrary.wiley.com/doi/abs/10.1029/2025GL118505},
eprint = {https://agupubs.onlinelibrary.wiley.com/doi/pdf/10.1029/2025GL118505},
note = {e2025GL118505 2025GL118505},
abstract = {Abstract Recent advances in deep learning have transformed seismic monitoring, yet most existing methods remain task-specific and data-limited, restricting performance on challenging scenarios and generalization to unseen data. Large-scale pretraining has addressed similar limitations in other fields, but its application to seismic data faces challenges, including the absence of effective pretraining algorithms, fragmented data sets, and prohibitive computational costs. Here, we propose SeisMoLLM, a novel approach that cross-modally transfers the sequence modeling knowledge of pretrained large language models into a unified framework adaptable to various tasks, unlocking pretraining benefits for seismic monitoring. Evaluations on STEAD and DiTing data sets demonstrates SeisMoLLM outperforms leading methods and generalizes strongly across multiple tasks, with notable improvements of 10\%–50\%, while maintaining training costs comparable to small baselines and faster inference than the smallest baseline. These results establish SeisMoLLM as a promising framework for foundation models and highlight cross-modal transfer as a compelling direction for advancing seismic monitoring.},
year = {2026}
}