TY - JOUR
T1 - Synthetic generated data for intelligent corrosion classification in oil and gas pipelines
AU - Ramos, Leo Thomas
AU - Casas, Edmundo
AU - Rivas-Echeverría, Francklin
N1 - Publisher Copyright: © 2024 The Authors
PY - 2025/3
Y1 - 2025/3
N2 - This research presents the K-Pipelines dataset, a pioneering synthetic image collection designed specifically for the classification of corrosion in oil and gas pipelines. Instead of training custom generative architectures, our research used an online image generation tool powered by Stable Diffusion. This choice leveraged the platform's robust capability to quickly produce a high volume of diverse and detailed images, saving significant time and resources. The dataset was carefully constructed using a sequence of refined prompts, derived from a review of pipeline characteristics including material types, environments, and corrosion forms. K-Pipelines consists of 600 PNG images at 512 × 512 resolution. Furthermore, an augmented version was developed, totaling 1080 images. Our evaluation employed state-of-the-art deep learning classifiers, specifically VGG16, ResNet50, EfficientNet, InceptionV3, MobileNetV2, and ConvNeXt-base, to test the integrity of the K-Pipelines dataset. These models showcased its robustness by consistently achieving accuracies around the 90% mark, illustrating the dataset's substantial promise as a resource for both AI research and real-world applications in the oil and gas industry. The dataset is publicly available for access and use within the scientific community.
AB - This research presents the K-Pipelines dataset, a pioneering synthetic image collection designed specifically for the classification of corrosion in oil and gas pipelines. Instead of training custom generative architectures, our research used an online image generation tool powered by Stable Diffusion. This choice leveraged the platform's robust capability to quickly produce a high volume of diverse and detailed images, saving significant time and resources. The dataset was carefully constructed using a sequence of refined prompts, derived from a review of pipeline characteristics including material types, environments, and corrosion forms. K-Pipelines consists of 600 PNG images at 512 × 512 resolution. Furthermore, an augmented version was developed, totaling 1080 images. Our evaluation employed state-of-the-art deep learning classifiers, specifically VGG16, ResNet50, EfficientNet, InceptionV3, MobileNetV2, and ConvNeXt-base, to test the integrity of the K-Pipelines dataset. These models showcased its robustness by consistently achieving accuracies around the 90% mark, illustrating the dataset's substantial promise as a resource for both AI research and real-world applications in the oil and gas industry. The dataset is publicly available for access and use within the scientific community.
KW - Artificial intelligence
KW - Computer vision
KW - Corrosion
KW - Deep learning
KW - Generative models
KW - Image classification
KW - Oil and gas
KW - Stable diffusion
UR - http://www.scopus.com/inward/record.url?scp=85211983975&partnerID=8YFLogxK
U2 - 10.1016/j.iswa.2024.200463
DO - 10.1016/j.iswa.2024.200463
M3 - Article
AN - SCOPUS:85211983975
SN - 2667-3053
VL - 25
JO - Intelligent Systems with Applications
JF - Intelligent Systems with Applications
M1 - 200463
ER -