SNAME is pleased to offer (mt) magazine as a feature of membership. (mt) is full of cutting edge technical articles on topics of interest across all marine disciplines. Included are vessel reports, in-depth features, historical notes, public policy briefings, profiles of SNAME founders and leaders, abstracts of highlighted technical papers, and reviews of maritime books and media.




The Integration of ML and DL

A convergence of AI technologies is changing the way we understand the ocean



By Yong Bai, Zihan Yang, and Yasutaka Narazaki


The use of ML and DL can help minimize downtime and significantly reduce operational costs, ensuring the longevity and resilience of offshore installations.



The ocean, covering more than two-thirds of the Earth’s surface in vast expanses of blue, harbors rich and mysterious life while holding an abundance of resources and challenges. In this vast domain, ocean engineering aims to overcome the forces of nature, explore the depths of the sea, and harness marine resources for sustainable development. However, the field faces numerous challenges, including the complexity of the marine environment, the durability of structural designs, and the efficient use of resources. A major challenge in ocean engineering lies in monitoring and understanding the dynamic changes in the marine environment. The ocean is a complex system in which factors such as climate, tides, and ocean currents constantly fluctuate, introducing significant uncertainties to ocean engineering. Additionally, the design and maintenance of offshore structures must address durability, corrosion resistance, and structural safety.


In recent years, emerging artificial intelligence (AI) methods have been integrated with ocean engineering, offering the prospect of addressing these challenges. Machine learning (ML) is a branch of AI that enables systems to learn from data. ML algorithms identify patterns in data, enabling them to improve performance as they encounter more information. Deep learning (DL), a subset of ML, uses neural networks with multiple layers, loosely inspired by the structure of the human brain, to handle complex tasks such as image and speech recognition. The versatility of ML and DL lies in their ability to be applied across disciplines. By processing vast datasets and recognizing intricate patterns, these technologies provide solutions in fields ranging from healthcare to finance.


In the domain of ocean engineering, the integration of ML and DL unfolds a number of possibilities. These advanced technologies have the ability to revolutionize our approach to understanding, harnessing, and safeguarding the oceans. Through the analysis of satellite and sensor data, ML and DL empower us to predict and interpret complex oceanic phenomena, from forecasting storm surges to understanding intricate currents. These technologies contribute to a more nuanced comprehension of the dynamic marine environment. This predictive capability is not only important for ensuring the safety of maritime activities but also holds immense potential in optimizing the design of new offshore structures. In addition, the convergence of ML and DL with ocean engineering is expected to optimize maintenance strategies. By analyzing real-time data and historical records, these technologies predict structural vulnerabilities, enabling proactive maintenance interventions. This not only minimizes downtime but also significantly reduces operational costs, ensuring the longevity and resilience of offshore installations.


The integration of cutting-edge technologies in ocean engineering signifies a shift in our approach to the oceans. This amalgamation changes the way we interact with the marine environment and reshapes our perspective on harnessing the oceans’ resources.


Advances in ML and DL

Computer vision is the primary approach in engineering when augmentation or potential replacement of humans’ visual recognition capabilities is desired. For these purposes, image data is most typically used as input data, and algorithms such as image classification, object detection, and semantic segmentation are applied to detect, localize, and characterize important patterns in the target application scenario. Such image recognition methods are often applied to different forms of images, such as thermal images, sonar images, and any other feature representations described by 2D maps (for example, audio spectrograms and structural vibration data).


Another category of visual recognition applies to 3D data. PointNet is a pioneering algorithm for point cloud classification and segmentation, and it has been followed by improvements and extensions to related tasks. Those algorithms and their successful applications in various engineering fields provide the basis for their further adoption in ocean engineering research and practice.


Besides recognition (or discriminative) tasks, generative models have shown remarkable advances recently, obtaining widespread attention and popularity. The early success in this category was marked by generative adversarial networks and related technologies. The mechanism introduced in these studies has been further applied to various image generation tasks, such as image-to-image translation, unpaired image-to-image translation, and their extension to structural damage generation tasks. Recently, image generation based on natural language descriptions (prompts) has been realized at a remarkable level by Midjourney, an independent research lab, where this approach was scaled up to tens of millions of images, artworks, photographs and other media data.


Another mainstream of generative modeling originates from more direct modeling of the probability density function by, for example, variational autoencoders. The refinement of this type of approach has led to diffusion models, which are key technical components of many of the most successful image (or map) generation algorithms, such as the text-to-image models Stable Diffusion and DALL-E 3, and which have also been applied to monocular depth estimation. Those generative models, together with large language models as critical building blocks, approach or exceed human capabilities in some selected areas, and their further incorporation to solve engineering problems is anticipated.


Critical to all of those technologies is the attempt to generalize a model across a broad range of tasks. One of the classical approaches toward that goal is transfer learning, where a model for a specific problem is trained starting from weights pretrained on a large related dataset (pretraining on the ImageNet dataset is common). The use of pretrained models and weights has been extended to various other forms of regularization, frequently in the framework of domain adaptation or domain generalization. For larger-scale problems, it has become common to leverage the abundance of data available on the Internet through multimodal learning, for example, contrastive language-image pretraining (CLIP) and its extension to six different modalities in ImageBind. One of the ultimate goals in this line of research is to realize artificial general intelligence, which is often contrasted with the “narrow AI” that is designed to perform specific tasks.


The use of synthetic data is another active area of research for efficiently developing AI methodologies for domain-specific problems, including those in ocean engineering. This topic has been investigated extensively in, for example, visual recognition for autonomous driving. Driven by the success in those related fields, the application to civil and structural engineering has also been investigated, producing promising results. The use of synthetic data in civil and structural engineering can be further optimized with the help of recent advances in unsupervised domain adaptation techniques. This line of research has demonstrated that, in addition to providing significant data augmentation effects, synthetic data can potentially replace annotated real-world data, reducing and eventually eliminating time-consuming manual annotation.


Predicting and interpreting

Understanding and predicting the complex phenomena of the oceans is a challenge for scientists. ML and DL offer powerful capabilities for decoding the mysteries of the marine environment. At the core of ML and DL is their ability to process and analyze vast amounts of data. The oceans generate an immense volume of information, from sea surface temperatures and ocean currents to atmospheric conditions. Traditional methods struggle to handle such large amounts of data. ML algorithms, designed for pattern recognition, sift through historical datasets to discern trends and correlations that elude traditional analysis. This historical context forms the basis for predictive models, enabling scientists to anticipate a wide array of oceanic phenomena. By identifying patterns in sea surface temperatures, atmospheric pressure, and ocean currents, ML algorithms create models that forecast events such as storms, underwater currents, and temperature fluctuations. In addition, because ML models can be retrained as new observations arrive, their accuracy can continue to improve over time.
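
As a minimal sketch of the "learn from history, then forecast" idea, the Python snippet below fits a least-squares linear trend to a short series and extrapolates it. The function names and the temperature values are hypothetical and synthetic, chosen only for illustration; operational ocean forecasting models are far richer than a straight line.

```python
def fit_linear_trend(values):
    """Least-squares fit of value = slope * t + intercept for t = 0, 1, 2, ..."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    t_mean = (n - 1) / 2
    v_mean = sum(values) / n
    cov = sum((t - t_mean) * (v - v_mean) for t, v in enumerate(values))
    var = sum((t - t_mean) ** 2 for t in range(n))
    slope = cov / var
    intercept = v_mean - slope * t_mean
    return slope, intercept


def forecast(values, steps_ahead):
    """Extrapolate the fitted trend for the next `steps_ahead` time steps."""
    slope, intercept = fit_linear_trend(values)
    last_t = len(values) - 1
    return [slope * (last_t + k) + intercept for k in range(1, steps_ahead + 1)]


# Synthetic sea surface temperatures in degrees Celsius (not real measurements).
sst_history = [18.0, 18.2, 18.4, 18.6, 18.8]
print(forecast(sst_history, 2))
```

Refitting the trend whenever a new observation is appended to the history is the simplest version of a model that keeps updating as data accumulate.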


Meanwhile, the visual complexity of the oceans requires more than one-dimensional numerical analysis. DL is also capable of learning two-dimensional or even three-dimensional patterns from extensive image datasets. Through neural networks with multiple layers, DL is adept at image recognition and pattern detection. Applied to satellite imagery, DL can unravel intricate oceanic patterns—identifying currents, monitoring sea ice dynamics, and even tracking the movements of marine life. DL’s visual acuity supplements the numerical predictions of ML, offering a holistic understanding of the oceans’ dynamic behavior.


In a broader way, ML and DL contribute to understanding the impact of climate change on the oceans. By analyzing extensive climate datasets, these technologies identify global trends such as rising sea levels, alterations in ocean currents, and shifts in marine ecosystems. The predictive capabilities of ML and the visual interpretative strengths of DL combine to provide a comprehensive understanding of the multifaceted effects of climate change on oceanic dynamics. This understanding is crucial for adapting to and mitigating the consequences of global climate shifts.


ML and DL in structural optimization

The combined influence of ML and DL not only enhances design efficiency but also reshapes structural engineering practice. At the heart of this shift is the precision these methods offer. Trained on historical performance data, the algorithms decipher intricate patterns associated with sea surface dynamics, environmental variables, and structural responses. For pipeline structures, understanding the response to varying pressures is central to optimization. ML and DL models, trained on extensive datasets of pressure responses, forecast how pipelines will behave under diverse scenarios. This level of precision enables engineers to refine designs, ensuring optimal performance and longevity, especially in regions susceptible to seismic activity or abrupt pressure changes. Another application in pipeline engineering is cross-sectional design. DL’s capability to discern complex relationships within datasets plays a crucial role here: by analyzing historical data on cross-sectional performance, algorithms can recommend changes that enhance structural integrity or streamline fluid flow, aligning cross-sectional designs with operational requirements.


In the realm of floating platforms, DL emerges as a force in advancing structural engineering. DL’s strength lies in its ability to discern intricate load-bearing patterns and stress distribution within these platforms. By analyzing extensive datasets that encompass wave patterns, tidal forces, and historical structural responses, DL algorithms provide engineers with profound insights. This nuanced understanding empowers engineers to design floating platforms capable of navigating the dynamic forces of the ocean with unparalleled resilience and stability. In essence, DL becomes a critical tool, enabling precision in load-bearing design and fortifying these offshore structures to withstand the challenges posed by the ever-changing marine environment.


Beyond individual components, DL contributes to topological optimization, redefining the fabric of maritime structures. Analyzing datasets related to material properties, construction costs, and structural performance, DL algorithms recommend optimal topological configurations. This approach not only addresses the challenges posed by the marine environment but also sets the stage for a new paradigm in maritime architecture, where precision engineering harmonizes with the fluid forces of the sea.


Oceanic health monitoring

The fusion of DL and ML technologies is reshaping the paradigm of health monitoring for maritime structures. This integration offers greater precision engineering, where advanced algorithms and data-driven insights converge to enhance the longevity, safety, and efficiency of oceanic infrastructure.


ML’s predictive analytics features are instrumental in optimizing maintenance schedules. Processing real-time and historical data, ML algorithms discern patterns indicative of potential structural issues. This not only facilitates early intervention but also streamlines maintenance practices, contributing to the longevity and operational efficiency of maritime structures. Complementing ML, DL offers a new dimension to health monitoring with its image recognition and pattern analysis capabilities. It enhances the precision and depth of damage identification processes. In critical components such as offshore platforms, underwater pipelines, and marine bridges, strategically positioned sensors and data collection devices continuously capture real-time information. DL algorithms process this wealth of data, detecting structural changes that might elude conventional methods. One of DL’s strengths is its ability to analyze images from various sources, such as those obtained from remotely operated vehicles. In the case of underwater pipelines, DL facilitates early detection of corrosion spots or damages along the pipeline’s surface. The visual acuity brought by DL to health monitoring systems ensures heightened awareness, enabling engineers to respond promptly to evolving structural conditions.
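
One simple way to picture how sensor data can flag a developing problem is a rolling statistical check: a reading is marked as suspicious when it deviates strongly from the recent history. The Python sketch below is an illustrative assumption, not the method used by any system described here. The function name, window length, threshold, and strain values are all hypothetical.

```python
from statistics import mean, stdev


def flag_anomalies(readings, window=5, threshold=3.0):
    """Return indices of readings that deviate from the mean of the preceding
    `window` readings by more than `threshold` sample standard deviations."""
    flagged = []
    for i in range(window, len(readings)):
        history = readings[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            # Perfectly flat history: any change at all counts as a deviation.
            if readings[i] != mu:
                flagged.append(i)
            continue
        if abs(readings[i] - mu) / sigma > threshold:
            flagged.append(i)
    return flagged


# Synthetic strain-gauge readings (made up); index 6 contains a sudden jump.
strain = [100.0, 101.0, 99.0, 100.0, 101.0, 100.0, 130.0, 100.0]
print(flag_anomalies(strain))  # [6]
```

Learned models replace the fixed threshold with patterns inferred from historical failures, but the workflow of comparing live readings against an expectation built from past data is the same.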


The continuous monitoring facilitated by ML ensures that potential issues are identified in their infancy, preventing unexpected failures and minimizing downtime. Concurrently, DL’s image recognition capabilities contribute to a proactive approach by detecting structural changes early on, allowing for timely intervention and optimized maintenance strategies. This integration not only enhances the safety and resilience of maritime structures but also has broader implications for sustainability. The reduction of reactive, large-scale maintenance operations ensures that oceanic structures are not only robust and efficient but also aligned with the principles of environmental conservation.


Pipeline detection

The oil and gas industry anticipates growing demand, necessitating an expansion of production infrastructure. Current subsea operations for oil and gas extraction are intricate, requiring many structures and systems to work together to extract resources from reservoirs beneath the seabed. Subsea pipelines, often described as the “lifeline” of marine oil and gas production, play a pivotal role in transporting gas or liquid directly, efficiently, safely, and with regard for the environment. Despite this reputation for safety and convenience, offshore pipelines still pose environmental risks, and leakage from underwater pipelines is a significant hazard.




Traditional methods for detecting underwater pipeline leaks rely on sensor measurements and data analysis, using devices such as sonars and pressure sensors. However, their accuracy and real-time capabilities are constrained. In contrast, the potential of DL and ML in leak detection is significant. This integrated approach provides a solution for more reliable, real-time underwater pipeline leak detection.


One promising direction is to enhance existing DL models so that they detect leak locations in pipeline images more accurately. Harsh underwater conditions, such as turbulent deep-sea currents and low lighting, make acquiring seabed images notably challenging. Consequently, constructing datasets of underwater pipelines and leaks is difficult, and at present there is a lack of publicly available pipeline datasets for researchers.


In one case, this challenge was addressed by using underwater robots to capture 558 authentic pipeline images from the seabed. Image augmentation techniques, such as mirroring, flipping, adding salt-and-pepper noise, Gaussian blurring, and histogram equalization, were applied to diversify the dataset with images from various angles and with simulated effects, generating 600 augmented images. All 1,158 images were then manually annotated with three categories: intact pipelines were labeled as the “pipeline” class, leaking pipelines as the “leak” class, and damaged pipelines as the “damage” class.
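
To make the augmentation step concrete, the Python sketch below implements four of the listed operations on a grayscale image stored as a nested list of 8-bit pixel values. It is an illustration under simplifying assumptions, not the pipeline used in the study: the function names are hypothetical, Gaussian blurring is omitted for brevity, and real projects would typically use an image library rather than plain lists.

```python
import random


def mirror(image):
    """Horizontal mirror: reverse the pixels within each row."""
    return [row[::-1] for row in image]


def flip(image):
    """Vertical flip: reverse the order of the rows."""
    return [row[:] for row in image[::-1]]


def salt_and_pepper(image, fraction, rng):
    """Replace roughly `fraction` of pixels with 0 (pepper) or 255 (salt)."""
    noisy = [row[:] for row in image]
    for r in range(len(noisy)):
        for c in range(len(noisy[r])):
            if rng.random() < fraction:
                noisy[r][c] = rng.choice((0, 255))
    return noisy


def equalize_histogram(image):
    """Histogram equalization for an 8-bit grayscale image."""
    pixels = [p for row in image for p in row]
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)
    cdf_min = next(v for v in cdf if v > 0)
    total = len(pixels)
    if total == cdf_min:  # constant image: nothing to stretch
        return [row[:] for row in image]
    # Map each intensity through the normalized cumulative distribution.
    lut = [max(0, round((cdf[v] - cdf_min) * 255 / (total - cdf_min)))
           for v in range(256)]
    return [[lut[p] for p in row] for row in image]


sample = [[10, 20], [30, 40]]
rng = random.Random(0)  # fixed seed so the noise is reproducible
augmented = [mirror(sample), flip(sample),
             salt_and_pepper(sample, 0.1, rng), equalize_histogram(sample)]
print(len(augmented), "augmented variants from one image")
```

Each function returns a new image and leaves its input unchanged, so one captured frame can yield several training variants.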


The YOLOv5 algorithm, a real-time object detection algorithm widely adopted by researchers and engineers, was used to detect leakage defects in the pipeline dataset. YOLOv5 is distinguished by its fast processing speed, which enables real-time detection, and its compact model size of 27 MB makes it practical to port to embedded devices. It also offers good accuracy. Testing a series of YOLOv5 and enhanced algorithms on the underwater pipeline dataset showed defect detection accuracy consistently reaching or surpassing 90%, with response speeds of several dozen frames per second (FPS). These results are promising.
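
The text does not say how detection accuracy was scored. A common building block when evaluating object detectors is intersection over union (IoU), which measures how well a predicted bounding box overlaps a hand-annotated one. The Python sketch below shows that calculation as background only. The function name and the box format are assumptions, not details from the study.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x_min, y_min, x_max, y_max)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap rectangle, clamped at zero when disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


# Hypothetical predicted and annotated "leak" boxes, in pixel coordinates.
predicted = (0, 0, 2, 2)
annotated = (1, 1, 3, 3)
print(iou(predicted, annotated))  # 1/7, roughly 0.143
```

A detection is usually counted as correct only when its IoU with an annotated box of the same class exceeds a chosen threshold.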


The next step is to deploy these pipeline leak detection algorithms on underwater robots or autonomous underwater vehicles. This requires running them on small embedded platforms, which calls for a balance between detection accuracy, speed, and power efficiency. The Jetson Nano, an embedded AI platform made by Nvidia, supports a range of computer vision algorithms, including image classification, object detection, and image segmentation, and it works with many leading DL development frameworks.


When the model, trained on a computer, was transferred directly to the Jetson Nano, accuracy was unchanged but the speed proved insufficient for real-time detection. Consequently, a streamlined version of YOLOv5 was devised, denoted YOLOv5-mv3s, in which MobileNetV3s replaces YOLOv5’s backbone network. This change reduced the model size to just 7.4 MB. On the Jetson Nano, with an input image size of 416 × 416, the detection speed increased to 11.5 FPS while detection accuracy remained around 91%. This demonstrated the efficacy and practicality of the approach.
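
The reported figures can be put in perspective with a little arithmetic. The Python sketch below assumes that the 27 MB quoted earlier for YOLOv5 is the baseline against which the 7.4 MB YOLOv5-mv3s model should be compared. The helper names are hypothetical.

```python
def size_reduction_percent(original_mb, reduced_mb):
    """Percentage by which the model file shrinks."""
    return (1 - reduced_mb / original_mb) * 100


def frame_latency_ms(fps):
    """Average time available per frame, in milliseconds, at a given frame rate."""
    return 1000.0 / fps


print(round(size_reduction_percent(27, 7.4), 1))  # 72.6
print(round(frame_latency_ms(11.5), 1))           # 87.0
```

Under that assumption, the lighter backbone cuts the model size by roughly 73%, and 11.5 FPS corresponds to about 87 ms of processing per frame on the Jetson Nano.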


This case demonstrates some of the capabilities of AI technologies in diverse facets of ocean engineering. Scientists are exploring the application of AI in even more intricate environments, to complex structural designs, and to the handling of vast multimodal datasets. This includes developing computational models that go beyond conventional boundaries, delivering not only higher precision but also greater efficiency.


Yong Bai is with the College of Civil Engineering and Architecture at Zhejiang University and is a SNAME Fellow. Zihan Yang is with the College of Civil Engineering and Architecture at Zhejiang University. Yasutaka Narazaki is with the University of Illinois Urbana-Champaign Institute at Zhejiang University.