Deep learning for computer vision in pulse-like ground motion identification, Computer-Aided Civil and Infrastructure Engineering



Deep learning for computer vision in pulse-like ground motion identification
Computer-Aided Civil and Infrastructure Engineering (IF 8.5), Pub Date: 2025-05-28, DOI: 10.1111/mice.13521
Lu Han, Zhengru Tao

Near-fault pulse-like ground motions can cause severe damage to long-period engineering structures, so seismic design needs a rapid and accurate method for identifying them. Deep learning offers one by framing pulse-like motion identification as an image classification task. Applying deep learning models to this task, however, faces challenges in both the data and the models. This study addresses both through a comprehensive strategy that optimizes the input images and the model architecture. Diverse datasets are built by converting the original time history into a Morlet wavelet time-frequency (TF) diagram, an anomaly-marked velocity time history, a Fourier amplitude spectrum and its smoothed diagram, and pixel fusion diagrams. Two types of deep learning models are constructed to classify these datasets. The first is a convolutional neural network (CNN) enhanced with a self-attention mechanism (SAM) so that it concentrates on local image features; a seismic parameter layer is added to this enhanced model to reduce its reliance on input data features. The second is a pair of Visual Transformers, the Vision Transformer (ViT) and the Swin Transformer (SwinT). With the enhanced CNN, TF images outperform the other image types in both classification accuracy and convergence speed, while dual-input images perform worse. For every input dataset, accuracy is higher under the single-parameter constraint of moment magnitude (M_w) than under that of rupture distance (R_rup), and higher still under the two-parameter constraint of M_w and R_rup; under the two-parameter constraint, TF gives the highest accuracy and the accuracy for dual-input data improves. For single-input images, SwinT performs similarly to CNN+SAM and better than ViT, with TF again giving the highest accuracy. For dual-input images, ViT is better than SwinT, and both are better than CNN+SAM. In a resource-limited environment, the enhanced CNN with single-input TF images is the best strategy, and the physical constraint of M_w and R_rup is more effective, especially for dual-input images.
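
As a rough illustration of the Morlet wavelet time-frequency input mentioned in the abstract, the sketch below computes a scalogram for a synthetic velocity pulse using only the Python standard library. It is not the authors' pipeline: the abstract gives no wavelet parameters, so the function name `morlet_scalogram`, the nondimensional frequency `omega0 = 6`, the period grid, the 4-scale truncation, and the synthetic pulse are all assumptions made for this example. The scale-to-period conversion is the standard one for the Morlet wavelet.

```python
import cmath
import math


def morlet_scalogram(signal, dt, periods, omega0=6.0):
    """Return |W| as one row per period, computed by direct convolution."""
    n = len(signal)
    scalogram = []
    for period in periods:
        # Morlet scale whose equivalent Fourier period equals `period`
        scale = period * (omega0 + math.sqrt(2.0 + omega0 ** 2)) / (4.0 * math.pi)
        # Assumption: truncate the Gaussian envelope at 4 scales on each side
        half = math.ceil(4.0 * scale / dt)
        # Complex conjugate of the Morlet wavelet, sampled on the time grid
        kernel = [
            math.pi ** -0.25
            * cmath.exp(-1j * omega0 * k * dt / scale)
            * math.exp(-0.5 * (k * dt / scale) ** 2)
            for k in range(-half, half + 1)
        ]
        norm = math.sqrt(dt / scale)
        row = []
        for i in range(n):
            lo = max(0, i - half)
            hi = min(n, i + half + 1)
            acc = sum(signal[j] * kernel[j - i + half] for j in range(lo, hi))
            row.append(abs(acc) * norm)
        scalogram.append(row)
    return scalogram


# Synthetic velocity pulse (illustrative only, not a recorded ground motion)
dt = 0.05
pulse_period = 2.0
time = [i * dt for i in range(400)]
velocity = [
    math.exp(-((t - 10.0) / 2.0) ** 2)
    * math.cos(2.0 * math.pi * (t - 10.0) / pulse_period)
    for t in time
]
periods = [0.5, 1.0, 2.0, 4.0, 8.0]
scalogram = morlet_scalogram(velocity, dt, periods)

# Locate the strongest time-frequency cell
peak_value, peak_row, peak_col = max(
    (value, r, c) for r, row in enumerate(scalogram) for c, value in enumerate(row)
)
print(f"peak at period {periods[peak_row]} s, time {time[peak_col]:.2f} s")
```

Rendering `scalogram` as an image (period on one axis, time on the other) would give a TF-style diagram of the kind a classifier could take as input. A practical implementation would more likely use a numerical library with FFT-based convolution and a much denser period grid.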


Updated: 2025-06-02
