WO2024078112A1 - Intelligent identification method for outfitting parts, and computer device - Google Patents

Intelligent identification method for outfitting parts, and computer device

Info

Publication number
WO2024078112A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
neural network
convolutional neural
network model
set data
Prior art date
Application number
PCT/CN2023/112528
Other languages
English (en)
French (fr)
Inventor
甄希金
续爱民
张盈彬
郭威
明星
骆晓萌
Original Assignee
上海船舶工艺研究所(中国船舶集团有限公司第十一研究所)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 上海船舶工艺研究所(中国船舶集团有限公司第十一研究所)
Publication of WO2024078112A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/50 Context or environment of the image
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774 Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks

Definitions

  • The present invention relates to the technical field of intelligent ship manufacturing and to a feature extraction method, and in particular to an intelligent identification method for outfitting parts for intelligent ship outfitting warehouses, and to a computer device.
  • In shipbuilding enterprises, the stacking of outfitting equipment involves a series of operations such as collection and distribution, logistics transport, information marking, planning of yard storage locations, warehousing, and outbound use. Some equipment is imported and its documentation is written in English, so on-site material-collection and installation personnel find it difficult to identify promptly and accurately the detailed information of the equipment, which project it is used for, and how to install and commission it. The accuracy of outfitting equipment information is therefore very important.
  • Leading enterprises in the industry have introduced intelligent three-dimensional warehouses for outfitting parts to realize unmanned warehousing management. This requires error-free information identification of outfitting parts at the receiving and warehousing stage, and rapid identification by image recognition during use, so as to improve management efficiency.
  • The purpose of the present invention is to provide an intelligent identification method for outfitting parts for ship outfitting warehouses, and a computer device, for accurately identifying in a very short time whether a picture contains the target object (an outfitting part), thereby solving the technical problem that current intelligent three-dimensional outfitting warehouses require error-free information identification of outfitting parts at the receiving and warehousing stage.
  • An intelligent identification method for outfitting parts for ship outfitting warehouses, characterized in that the method comprises the following steps:
  • Step S11) Image acquisition step: acquiring images in the ship outfitting warehouse, the acquired images being a large number of images that contain or do not contain the target object, the target object including outfitting parts;
  • Step S12) Image preprocessing step: processing the acquired images by image linear operations, image logical operations, image spatial operations, and image transformation to form input sample tensor data;
  • Step S13) Sample data set acquisition step: receiving the input sample tensor data and simplifying it so as to integrate it into input sample data, the input sample data being divided into training set data samples and test set data samples obtained from the acquired images;
  • Step S14) Feature extraction step: establishing a convolutional neural network model and calling the sample data set to train and evaluate the model, obtaining a convolutional neural network model that matches the training set data samples;
  • Step S15) Model calibration step: performing optimization training on the convolutional neural network model that matches the training set data samples, so as to improve its accuracy;
  • Step S16) Image recognition step: identifying the target object with the calibrated convolutional neural network model that matches the training set data samples.
  • Further, step S12 also includes image arithmetic operations: the pixel values of the acquired images are stored in arrays, and array addition, subtraction, multiplication, and division each operate directly on the two elements at the same position; addition is used for image noise reduction, subtraction for enhancing the differences between images, and multiplication or division for shading correction.
  • Further, in step S12, the image linear operation is defined as follows: suppose an operator $H$ acting on an image $f(x,y)$ gives $H[f(x,y)] = g(x,y)$ and satisfies $H[a_i f_i(x,y) + a_j f_j(x,y)] = a_i H[f_i(x,y)] + a_j H[f_j(x,y)] = a_i g_i(x,y) + a_j g_j(x,y)$, where $a_i$ and $a_j$ are arbitrary constants and $f_i$ and $f_j$ are any two images of the same size; then $H$ is a linear operator.
  • The image logical operations include set operations on images, namely intersection, union, and complement; the logical operations include AND, OR, NOT, and XOR.
  • The image spatial operations include single-pixel operations, neighborhood operations, and geometric spatial transformations.
  • Further, in step S12, the image preprocessing step also includes image cropping, image resizing, converting image data into tensors, and data standardization.
  • Further, in step S14, the feature extraction step includes:
  • constructing a training model for the training set data samples;
  • calling the training set data samples to train the training model until convergence, so as to generate a convolutional neural network model to be evaluated that is suited to the training set data samples;
  • evaluating the convolutional neural network model to be evaluated and, once it meets the preset evaluation criteria, obtaining the convolutional neural network model that matches the training set data samples;
  • inputting the test set data samples into the convolutional neural network model that matches the training set data samples, so as to predict the attribute features of the test set data samples and obtain the accuracy of that model.
  • Further, in step S14, the constructed data training model includes: an input layer, a convolutional layer below the input layer, a pooling layer below the convolutional layer, a fully connected layer below the pooling layer, a dropout layer below the fully connected layer, and an output layer below the dropout layer.
  • Further, in step S14, the process of calling the training set data samples to train the training model until convergence also includes optimizing the data training model with a pre-stored optimization function.
  • Further, in step S14, the preset evaluation criteria include a loss function. Evaluating the convolutional neural network model to be evaluated includes: calculating its loss function; and comparing the value of that loss function with a preset loss threshold to obtain the convolutional neural network model that matches the training set data samples, that model being the data training model corresponding to the minimum value of the loss function.
  • Further, in step S15, the convolutional neural network model that matches the training set data samples is trained a second or third time by adding to and correcting the data samples, so as to improve its accuracy.
  • Further, a final aspect of the present invention provides a device comprising a processor and a memory; the memory is used to store a computer program, and the processor is used to execute the computer program stored in the memory so that the device performs the intelligent identification method for outfitting parts.
  • The intelligent identification method for outfitting parts for ship outfitting warehouses described in the present invention has the following beneficial effect: it can accurately identify, in a very short time, whether the content of a picture is the target object.
  • FIG. 1 is a flow chart of the intelligent identification method for outfitting parts of the present invention.
  • FIG. 2 is a schematic diagram of the principle and structure of a convolutional layer of a convolutional neural network.
  • FIG. 3 is a schematic diagram of the principle and structure of a pooling layer of a convolutional neural network.
  • FIG. 4 is a schematic diagram of the principle and structure of a fully connected layer of a convolutional neural network.
  • FIG. 5 is a flow chart of the feature extraction step of the present invention.
  • FIG. 6 is a diagram showing the internal structure of a computer device in one embodiment.
  • This embodiment provides an intelligent identification method for outfitting parts, including:
  • Step S11) Image acquisition step: acquiring images in the ship outfitting warehouse, the acquired images being a large number of images that contain or do not contain the target object, the target object including outfitting parts;
  • Step S12) Image preprocessing step: processing the acquired images by image linear operations, image logical operations, image spatial operations, and image transformation to form input sample tensor data;
  • Step S13) Sample data set acquisition step: receiving the input sample tensor data and simplifying it so as to integrate it into input sample data, the input sample data being divided into training set data samples and test set data samples obtained from the acquired images;
  • Step S14) Feature extraction step: establishing a convolutional neural network model and calling the sample data set to train and evaluate the model, obtaining a convolutional neural network model that matches the training set data samples;
  • Step S15) Model calibration step: performing optimization training on the convolutional neural network model that matches the training set data samples, so as to improve its accuracy;
  • Step S16) Image recognition step: identifying the target object with the calibrated convolutional neural network model that matches the training set data samples.
  • In a specific implementation, the identification method of the present invention comprises the following steps:
  • Step S11 includes image acquisition: acquiring a large number of pictures that contain or do not contain the target object.
  • To improve recognition accuracy and to distinguish background images effectively, the images are acquired in the ship outfitting warehouse, and the target object includes outfitting parts.
  • Step S12 includes image preprocessing: image arithmetic operations, image linear operations, image logical operations, image spatial operations, and image transformation.
  • Specifically, the pixel values of an image are stored in an array.
  • Array addition, subtraction, multiplication, and division each operate directly on the two elements at the same position; in array multiplication, for example, each output element is the product of the two input elements at that position.
  • Among the image arithmetic operations, addition can be used for image noise reduction; subtraction can be used to enhance the differences between images; multiplying an image by a template (mask) image keeps only the region of interest (ROI); and multiplication or division can be used for shading correction.
  • The image logical operations include set operations on images, namely intersection, union, and complement; the logical operations include AND, OR, NOT, and XOR.
  • The image spatial operations include single-pixel operations, neighborhood operations, and geometric spatial transformations.
  • The image preprocessing step also includes image cropping, image resizing, converting image data into tensors, and data standardization. Specifically, transforms.CenterCrop() crops the image about its center, its size parameter giving the crop size; transforms.Resize() resizes the image; transforms.ToTensor converts the image data into a tensor; and transforms.Normalize() standardizes the data, which can speed up the convergence of the model.
  • Step S13 includes: receiving the input sample tensor data; specifically, simplifying the input sample tensor data so as to integrate it into input sample data; and labeling the expected output for each input sample image: an image containing the target object is labeled "1", and an image not containing the target object is labeled "0".
  • The input sample data is thus divided into training set data samples and test set data samples obtained from the acquired images.
  • The training model constructed in step S14 includes: an input layer, a convolutional layer below the input layer, a pooling layer below the convolutional layer, a fully connected layer below the pooling layer, a dropout layer below the fully connected layer, and an output layer below the dropout layer.
  • In this embodiment, the training model is a convolutional neural network.
  • The input layer is used to process multi-dimensional data. Specifically, the input layer applies a matrix transformation (the reshape process) to the input matrix sample tensor before passing it on.
  • The convolutional layer is used to extract attribute features from the sample data processed by the input layer and to output a feature map.
  • Each convolutional layer is composed of a number of convolution units, and the parameters of each convolution unit are optimized by the backpropagation algorithm.
  • The purpose of the convolution operation is to extract different features of the input.
  • The first convolutional layer may only extract low-level features such as edges, lines, and corners; further convolutional layers can iteratively extract more complex features from the low-level ones. A convolutional layer contains multiple convolution kernels, each element of which corresponds to a weight coefficient and a bias.
  • The values of a convolution kernel are set artificially and can be adjusted; the kernel values are the parameters of the network.
  • The feature image and the convolution kernel both exist as matrices, so a convolution can be computed between them.
  • The convolution kernel first performs the convolution operation on the first region of the feature image, and the result becomes one point of the output feature map.
  • An example of the convolution calculation is shown in FIG. 2: the feature map is input, multiplied by the convolution kernel, and output.
  • After the convolution process a new feature map is output. This requires the kernel and the feature map to be convolved many times: the kernel slides over the input feature map, generally from left to right and from top to bottom. Different sliding strides produce different output feature maps.
  • The pooling layer is used, after the convolutional layer has extracted features, to perform feature selection and filtering on the feature map output by the convolutional layer.
  • The selection and filtering of the feature map by the pooling layer is essentially a form of downsampling.
  • There are many forms of nonlinear pooling function; in this embodiment, max pooling is used.
  • Max pooling splits the matrix into regions, and each output element is the maximum value of the corresponding region.
  • An example of the max pooling process is shown in FIG. 3: the left side is a feature map, and the right side is the features retained after max pooling.
  • Max pooling works like applying a filter of size 2, that is, a 2*2 region with a stride of 2. If a feature is detected within the filter, its maximum value is retained. If the feature is not detected, it probably does not exist in that region, and the maximum value there remains small.
  • The pooling layer progressively reduces the spatial size of the data, so the number of parameters and the amount of computation also decrease, which controls overfitting to a certain extent.
  • The fully connected layer generally consists of two parts, a linear part and a nonlinear part.
  • In the data training model, one or more fully connected layers follow the convolutional and pooling layers. Each neuron in a fully connected layer is connected to all neurons of the previous layer.
  • The fully connected layer can integrate the class-discriminative local information from the convolutional or pooling layers.
  • The overall fully connected network is shown in FIG. 4: the transformation from the input layer to hidden layer 1 is already a complete fully connected layer. In the linear part, the input-layer vector $x$ ($n$-dimensional) is turned into the hidden layer 1 output $z$ ($m$-dimensional) by multiplying it by an $m \times n$ matrix $W$ and adding a bias $b$, that is, $Wx + b = z$.
  • The meaning of this process is that a set of weights is applied to every pixel to compute the final value. The weights are random at the start and are subsequently learned through backpropagation.
  • The nonlinear part is generally called the activation function.
  • To improve the performance of the convolutional neural network, the activation functions between the neurons of the fully connected layers in the present invention are the ReLU function and the Sigmoid function respectively, as shown in FIG. 4.
  • The dropout process is used to prevent overfitting. Overfitting means that training appears very effective and the loss function value can be driven very low, but performance on the test data set is not as good, because the model depends too heavily on the characteristics of the existing training data set.
  • During model training, the dropout process randomly sets some activations to zero (so that the weights of some hidden-layer nodes take no part), to avoid overfitting. During forward propagation, the activation value of a neuron stops working with a certain probability p, which makes the model generalize better because it does not rely too much on particular local features.
  • Because dropout behaves differently during training and prediction, the output layer has two different output variables that share the same weights: the output variable in training mode (output_train) and the output variable in prediction mode (output_test).
  • As shown in FIG. 5, in step S14 the feature extraction step includes:
  • Step S141) constructing a training model for the training set data samples;
  • Step S142) calling the training set data samples to train the training model until convergence, so as to generate a convolutional neural network model to be evaluated that is suited to the training set data samples; the process of training until convergence also includes optimizing the data training model with a pre-stored optimization function;
  • Step S143) evaluating the convolutional neural network model to be evaluated and, once it meets the preset evaluation criteria, obtaining the convolutional neural network model that matches the training set data samples;
  • Step S144) inputting the test set data samples into the convolutional neural network model that matches the training set data samples, so as to predict the attribute features of the test set data samples and obtain the accuracy of that model.
  • In step S14, the convolutional neural network model to be evaluated is generated by calling the established neural network model and adjusting its parameters until a suitable model is obtained.
  • In this embodiment, the tf.estimator.inputs.numpy_input_fn() function is used to load data when training the model.
  • The parameters of this function are x, y, batch_size, num_epochs, and shuffle.
  • Here x is the training data x_train and y is the labels, y = y_train;
  • batch_size is the number of samples selected in each iteration, and num_epochs=None means that training stops when the set number of iterations is reached.
  • In this embodiment, the preset evaluation criteria include a loss function, and the loss function includes a cross-entropy loss function.
  • Step S14 includes: calculating the loss function (loss) of the convolutional neural network model to be evaluated, where loss is the cross-entropy loss (Cross Entropy Loss).
  • The value of the loss function of the model to be evaluated is compared with a preset loss threshold to obtain the convolutional neural network model that matches the training set data samples; that model is the data training model corresponding to the minimum value of the loss function.
  • In this embodiment, the tf.estimator.inputs.numpy_input_fn() function is also used to load data when evaluating the model,
  • where x is the prediction data x_test and y is the labels, y = y_test;
  • batch_size remains unchanged, and shuffle=False.
  • In step S15, the convolutional neural network model that matches the training set data samples is trained a second or third time by adding to and correcting the data samples, so as to improve its accuracy.
  • The intelligent identification method for outfitting parts described in this embodiment can accurately identify, in a very short time, whether an image contains the target object, and offers a new approach to image recognition.
  • This embodiment also provides a computer device, on which a computer program is stored, and when the computer program is executed by a processor, the above-mentioned intelligent identification method for outfitting parts is implemented.
  • the computer device may be a server, and its internal structure diagram may be shown in FIG6 .
  • the computer device includes a processor, a memory, a network interface, and a database connected via a system bus.
  • the processor of the computer device is used to provide computing and control capabilities.
  • the memory of the computer device includes a non-volatile storage medium and an internal memory.
  • the non-volatile storage medium stores an operating system, a computer program, and a database.
  • the internal memory provides an environment for the operation of the operating system and the computer program in the non-volatile storage medium.
  • the database of the computer device is used to store data on the intelligent identification method of outfitting parts.
  • the network interface of the computer device is used to communicate with an external terminal via a network connection.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Image Analysis (AREA)

Abstract

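The present invention provides an intelligent identification method for outfitting parts for ship outfitting warehouses. The identification method comprises: an image acquisition step of acquiring a large number of images that contain or do not contain the target object; an image preprocessing step of processing the acquired images by image linear operations, image logical operations, image spatial operations, and image transformation to form input sample tensor data; a sample data set acquisition step of simplifying the received input sample tensor data and integrating it into input sample data, which is divided into training set data samples and test set data samples obtained from the acquired images; a feature extraction step of establishing a convolutional neural network model and training and evaluating it to obtain a convolutional neural network model that matches the training set data samples; a model calibration step of performing optimization training to improve accuracy; and an image recognition step of identifying the target object. The present invention can accurately identify, in a very short time, whether the content of a picture is the target object.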

Description

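Intelligent identification method for outfitting parts, and computer device
Technical Field
The present invention relates to the technical field of intelligent ship manufacturing and to a feature extraction method, and in particular to an intelligent identification method for outfitting parts for intelligent ship outfitting warehouses, and to a computer device.
Background Art
In shipbuilding enterprises, the stacking of outfitting equipment involves a series of operations such as collection and distribution, logistics transport, information marking, planning of yard storage locations, warehousing, and outbound use. Some equipment is imported and its documentation is written in English, so on-site material-collection and installation personnel find it difficult to identify promptly and accurately the detailed information of the equipment, which project it is used for, and how to install and commission it. The accuracy of outfitting equipment information is therefore very important. Leading enterprises in the industry have introduced intelligent three-dimensional warehouses for outfitting parts to realize unmanned warehousing management. This requires error-free information identification of outfitting parts at the receiving and warehousing stage, and rapid identification by image recognition during use, so as to improve management efficiency.
Summary of the Invention
The purpose of the present invention is to provide an intelligent identification method for outfitting parts for ship outfitting warehouses, and a computer device, for accurately identifying in a very short time whether a picture contains the target object (an outfitting part), thereby solving the technical problem that current intelligent three-dimensional outfitting warehouses require error-free information identification of outfitting parts at the receiving and warehousing stage.
To achieve the above purpose, the technical solution of the present invention is as follows:
An intelligent identification method for outfitting parts for ship outfitting warehouses, characterized in that the method comprises the following steps: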
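Step S11) Image acquisition step: acquiring images in the ship outfitting warehouse, the acquired images being a large number of images that contain or do not contain the target object, the target object including outfitting parts;
Step S12) Image preprocessing step: processing the acquired images by image linear operations, image logical operations, image spatial operations, and image transformation to form input sample tensor data;
Step S13) Sample data set acquisition step: receiving the input sample tensor data and simplifying it so as to integrate it into input sample data, the input sample data being divided into training set data samples and test set data samples obtained from the acquired images;
Step S14) Feature extraction step: establishing a convolutional neural network model and calling the sample data set to train and evaluate the model, obtaining a convolutional neural network model that matches the training set data samples;
Step S15) Model calibration step: performing optimization training on the convolutional neural network model that matches the training set data samples, so as to improve its accuracy;
Step S16) Image recognition step: identifying the target object with the calibrated convolutional neural network model that matches the training set data samples.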
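Further, step S12 also includes image arithmetic operations: the pixel values of the acquired images are stored in arrays, and array addition, subtraction, multiplication, and division each operate directly on the two elements at the same position; addition is used for image noise reduction, subtraction for enhancing the differences between images, and multiplication or division for shading correction.
Further, in step S12, the image linear operation is defined as follows: suppose an operator $H$ acting on an image $f(x,y)$ gives $H[f(x,y)] = g(x,y)$ and satisfies $H[a_i f_i(x,y) + a_j f_j(x,y)] = a_i H[f_i(x,y)] + a_j H[f_j(x,y)] = a_i g_i(x,y) + a_j g_j(x,y)$, where $a_i$ and $a_j$ are arbitrary constants and $f_i$ and $f_j$ are any two images of the same size; then $H$ is a linear operator;
the image logical operations include set operations on images, namely intersection, union, and complement, and the logical operations include AND, OR, NOT, and XOR;
the image spatial operations include single-pixel operations, neighborhood operations, and geometric spatial transformations.
Further, in step S12, the image preprocessing step also includes image cropping, image resizing, converting image data into tensors, and data standardization.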
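Further, in step S14, the feature extraction step includes:
constructing a training model for the training set data samples;
calling the training set data samples to train the training model until convergence, so as to generate a convolutional neural network model to be evaluated that is suited to the training set data samples;
evaluating the convolutional neural network model to be evaluated and, once it meets the preset evaluation criteria, obtaining the convolutional neural network model that matches the training set data samples;
inputting the test set data samples into the convolutional neural network model that matches the training set data samples, so as to predict the attribute features of the test set data samples and obtain the accuracy of that model.
Further, in step S14, the constructed data training model includes: an input layer, a convolutional layer below the input layer, a pooling layer below the convolutional layer, a fully connected layer below the pooling layer, a dropout layer below the fully connected layer, and an output layer below the dropout layer.
Further, in step S14, the process of calling the training set data samples to train the training model until convergence also includes optimizing the data training model with a pre-stored optimization function.
Further, in step S14, the preset evaluation criteria include a loss function. Evaluating the convolutional neural network model to be evaluated includes: calculating its loss function; and comparing the value of that loss function with a preset loss threshold to obtain the convolutional neural network model that matches the training set data samples, that model being the data training model corresponding to the minimum value of the loss function.
Further, in step S15, the convolutional neural network model that matches the training set data samples is trained a second or third time by adding to and correcting the data samples, so as to improve its accuracy.
Further, a final aspect of the present invention provides a device comprising a processor and a memory; the memory is used to store a computer program, and the processor is used to execute the computer program stored in the memory so that the device performs the intelligent identification method for outfitting parts.
The intelligent identification method for outfitting parts for ship outfitting warehouses described in the present invention has the following beneficial effect: it can accurately identify, in a very short time, whether the content of a picture is the target object.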
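Brief Description of the Drawings
FIG. 1 is a flow chart of the intelligent identification method for outfitting parts of the present invention;
FIG. 2 is a schematic diagram of the principle and structure of a convolutional layer of a convolutional neural network;
FIG. 3 is a schematic diagram of the principle and structure of a pooling layer of a convolutional neural network;
FIG. 4 is a schematic diagram of the principle and structure of a fully connected layer of a convolutional neural network;
FIG. 5 is a flow chart of the feature extraction step of the present invention;
FIG. 6 is a diagram showing the internal structure of a computer device in one embodiment.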
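Detailed Description
The embodiments of the present invention are described below through a specific ship-part identification example; those skilled in the art can readily understand other advantages and effects of the invention from the content disclosed in this specification. The invention can also be implemented or applied through other, different specific embodiments, and the details in this specification can be modified or changed in various ways, based on different viewpoints and applications, without departing from the spirit of the invention. It should be noted that, where no conflict arises, the following embodiments and the features in them can be combined with one another.
It should also be noted that the drawings provided with the following embodiments illustrate the basic concept of the invention only schematically. They show only the components related to the invention and are not drawn according to the number, shape, and size of the components in an actual implementation, where the form, number, and proportion of each component may vary and the component layout may be more complex.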
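This embodiment provides an intelligent identification method for outfitting parts, including:
Step S11) Image acquisition step: acquiring images in the ship outfitting warehouse, the acquired images being a large number of images that contain or do not contain the target object, the target object including outfitting parts;
Step S12) Image preprocessing step: processing the acquired images by image linear operations, image logical operations, image spatial operations, and image transformation to form input sample tensor data;
Step S13) Sample data set acquisition step: receiving the input sample tensor data and simplifying it so as to integrate it into input sample data, the input sample data being divided into training set data samples and test set data samples obtained from the acquired images;
Step S14) Feature extraction step: establishing a convolutional neural network model and calling the sample data set to train and evaluate the model, obtaining a convolutional neural network model that matches the training set data samples;
Step S15) Model calibration step: performing optimization training on the convolutional neural network model that matches the training set data samples, so as to improve its accuracy;
Step S16) Image recognition step: identifying the target object with the calibrated convolutional neural network model that matches the training set data samples.
In a specific implementation, the identification method of the present invention comprises the following steps: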
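Step S11 includes image acquisition: acquiring a large number of pictures that contain or do not contain the target object. To improve recognition accuracy and to distinguish background images effectively, the images are acquired in the ship outfitting warehouse, and the target object includes outfitting parts.
Step S12 includes image preprocessing: image arithmetic operations, image linear operations, image logical operations, image spatial operations, and image transformation.
Specifically, the pixel values of an image are stored in an array, and array addition, subtraction, multiplication, and division each operate directly on the two elements at the same position; in array multiplication, for example, each output element is the product of the two input elements at that position.
Among the image arithmetic operations, addition can be used for image noise reduction; subtraction can be used to enhance the differences between images; multiplying an image by a template (mask) image keeps only the region of interest (ROI); and multiplication or division can be used for shading correction.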
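The four element-wise operations can be sketched in a few lines. This is a minimal illustration rather than the implementation used in the invention: plain nested lists stand in for image arrays, and the function names and the `eps` guard against division by zero are assumptions made for this example.

```python
def average_frames(frames):
    """Addition: average equally sized noisy frames pixel by pixel."""
    n = len(frames)
    return [[sum(px) / n for px in zip(*rows)] for rows in zip(*frames)]


def abs_difference(a, b):
    """Subtraction: highlight where two images differ."""
    return [[abs(x - y) for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def apply_mask(image, mask):
    """Multiplication: keep only the region of interest (mask holds 0 or 1)."""
    return [[x * m for x, m in zip(ri, rm)] for ri, rm in zip(image, mask)]


def shading_correct(image, shading, eps=1e-6):
    """Division: divide out a known shading pattern."""
    return [[x / max(s, eps) for x, s in zip(ri, rs)]
            for ri, rs in zip(image, shading)]


frame_a = [[10, 20], [30, 40]]
frame_b = [[12, 18], [28, 42]]
denoised = average_frames([frame_a, frame_b])
changed = abs_difference(frame_a, frame_b)
roi = apply_mask(frame_a, [[1, 0], [0, 1]])
flat = shading_correct([[50, 100]], [[0.5, 1.0]])
```

Every function visits the two inputs position by position, which is exactly the "same position" rule stated above.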
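Image linear operation: suppose an operator $H$ acting on an image $f(x,y)$ gives $H[f(x,y)] = g(x,y)$ and satisfies $H[a_i f_i(x,y) + a_j f_j(x,y)] = a_i H[f_i(x,y)] + a_j H[f_j(x,y)] = a_i g_i(x,y) + a_j g_j(x,y)$, where $a_i$ and $a_j$ are arbitrary constants and $f_i$ and $f_j$ are any two images of the same size; then $H$ is a linear operator.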
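The linearity condition can be checked numerically on small images. The sketch below is illustrative only: the two operators (`horizontal_difference`, `clip_at_five`) and the test constants are assumptions chosen for the example, not operators named in the source.

```python
def combine(a, f, b, g):
    """Return the image a*f + b*g, computed pixel by pixel."""
    return [[a * x + b * y for x, y in zip(rf, rg)] for rf, rg in zip(f, g)]


def horizontal_difference(image):
    """H1: each pixel minus its left neighbour (0 outside the image)."""
    return [[row[c] - (row[c - 1] if c > 0 else 0) for c in range(len(row))]
            for row in image]


def clip_at_five(image):
    """H2: saturate every pixel at 5."""
    return [[min(x, 5) for x in row] for row in image]


def is_linear(H, f, g, a=2, b=-3):
    """Test H[a*f + b*g] == a*H[f] + b*H[g] for one choice of a, b, f, g."""
    return H(combine(a, f, b, g)) == combine(a, H(f), b, H(g))


f = [[1, 2, 3], [4, 5, 6]]
g = [[6, 5, 4], [3, 2, 1]]
difference_is_linear = is_linear(horizontal_difference, f, g)
clip_is_linear = is_linear(clip_at_five, f, g)
```

A single passing case does not prove linearity, but a single failing case (as for the clipping operator) is enough to show that an operator is not linear.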
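The image logical operations include set operations on images, namely intersection, union, and complement; the logical operations include AND, OR, NOT, and XOR.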
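For binary images (pixels 0 or 1) the set operations and the logical operations coincide: intersection is AND, union is OR, and complement is NOT. A minimal sketch, with function names assumed for illustration:

```python
def logical_and(a, b):
    """Intersection: 1 only where both images are 1."""
    return [[x & y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def logical_or(a, b):
    """Union: 1 where either image is 1."""
    return [[x | y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def logical_not(a):
    """Complement: swap 0 and 1."""
    return [[1 - x for x in row] for row in a]


def logical_xor(a, b):
    """1 where exactly one of the two images is 1."""
    return [[x ^ y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


a = [[1, 1], [0, 0]]
b = [[1, 0], [1, 0]]
both = logical_and(a, b)
either = logical_or(a, b)
inverse = logical_not(a)
exactly_one = logical_xor(a, b)
```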
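The image spatial operations include single-pixel operations, neighborhood operations, and geometric spatial transformations.
The image preprocessing step also includes image cropping, image resizing, converting image data into tensors, and data standardization. Specifically, transforms.CenterCrop() crops the image about its center, its size parameter giving the crop size; transforms.Resize() resizes the image; transforms.ToTensor converts the image data into a tensor; and transforms.Normalize() standardizes the data, which can speed up the convergence of the model.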
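What cropping, tensor conversion, and standardization compute can be shown without any library. The sketch below does not call the transforms functions named above; it only mimics their arithmetic on a single-channel image under stated assumptions: the helper names are hypothetical, resizing (which needs interpolation) is left out, and the mean and standard deviation of 0.5 are example values.

```python
def center_crop(image, size):
    """Keep the size x size window around the image centre."""
    top = (len(image) - size) // 2
    left = (len(image[0]) - size) // 2
    return [row[left:left + size] for row in image[top:top + size]]


def to_unit_range(image):
    """Scale 8-bit pixel values from [0, 255] to [0.0, 1.0]."""
    return [[pixel / 255 for pixel in row] for row in image]


def normalize(image, mean, std):
    """Standardize every pixel as (pixel - mean) / std."""
    return [[(pixel - mean) / std for pixel in row] for row in image]


image = [
    [0, 0, 0, 0],
    [0, 255, 51, 0],
    [0, 102, 204, 0],
    [0, 0, 0, 0],
]
cropped = center_crop(image, 2)
scaled = to_unit_range(cropped)
sample = normalize(scaled, mean=0.5, std=0.5)
```

With these example values the pixels end up in the range [-1, 1]; centring the inputs around zero in this way is the usual reason standardization helps convergence.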
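Step S13 includes: receiving the input sample tensor data; specifically, simplifying the input sample tensor data so as to integrate it into input sample data; and labeling the expected output for each input sample image: an image containing the target object is labeled "1", and an image not containing the target object is labeled "0". The input sample data is thus divided into training set data samples and test set data samples obtained from the acquired images.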
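The labeling and the division into two sets can be sketched as follows. The source does not state a split ratio or how samples are assigned, so the 20% test fraction, the seeded shuffle, and the placeholder sample names are all assumptions for illustration.

```python
import random


def label_samples(positives, negatives):
    """Pair each sample with its label: 1 = contains the target, 0 = does not."""
    return [(s, 1) for s in positives] + [(s, 0) for s in negatives]


def split_samples(samples, test_fraction=0.2, seed=0):
    """Shuffle reproducibly, then cut into (training set, test set)."""
    shuffled = samples[:]
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


samples = label_samples(
    ["part_%d" % i for i in range(6)],        # images with an outfitting part
    ["background_%d" % i for i in range(4)],  # images without one
)
train_set, test_set = split_samples(samples)
```

Shuffling before the cut keeps both labels represented in each set on average; a stratified split would guarantee it.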
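The training model constructed in step S14 includes: an input layer, a convolutional layer below the input layer, a pooling layer below the convolutional layer, a fully connected layer below the pooling layer, a dropout layer below the fully connected layer, and an output layer below the dropout layer. In this embodiment, the training model is a convolutional neural network.
In this embodiment, the input layer is used to process multi-dimensional data. Specifically, the input layer applies a matrix transformation (the reshape process) to the input matrix sample tensor before passing it on.
The convolutional layer is used to extract attribute features from the sample data processed by the input layer and to output a feature map. Each convolutional layer is composed of a number of convolution units, and the parameters of each convolution unit are optimized by the backpropagation algorithm. The purpose of the convolution operation is to extract different features of the input: the first convolutional layer may only extract low-level features such as edges, lines, and corners, while further convolutional layers can iteratively extract more complex features from the low-level ones. A convolutional layer contains multiple convolution kernels, each element of which corresponds to a weight coefficient and a bias.
The values of a convolution kernel are set artificially and can be adjusted; the kernel values are the parameters of the network. The feature image and the convolution kernel both exist as matrices, so a convolution can be computed between them. The convolution kernel first performs the convolution operation on the first region of the feature image, and the result becomes one point of the output feature map.
An example of the convolution calculation is shown in FIG. 2: the feature map is input, multiplied by the convolution kernel, and output. After the convolution process a new feature map is output. This requires the kernel and the feature map to be convolved many times: the kernel slides over the input feature map, generally from left to right and from top to bottom. Different sliding strides produce different output feature maps.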
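The sliding-window computation and the effect of the stride can be sketched as below. This is an illustrative sketch, not the layer used in the invention: it uses no padding and, following common CNN practice, does not flip the kernel; the kernel values and the function name are example assumptions.

```python
def conv2d(feature_map, kernel, stride=1, bias=0.0):
    """Slide the kernel left to right, top to bottom; no padding."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = (len(feature_map) - kh) // stride + 1
    out_w = (len(feature_map[0]) - kw) // stride + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            top, left = i * stride, j * stride
            total = bias
            # Multiply the kernel with the region it covers and sum up.
            for di in range(kh):
                for dj in range(kw):
                    total += feature_map[top + di][left + dj] * kernel[di][dj]
            row.append(total)
        output.append(row)
    return output


feature_map = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
kernel = [[1, 0], [0, -1]]
stride_one = conv2d(feature_map, kernel, stride=1)  # 3 x 3 output
stride_two = conv2d(feature_map, kernel, stride=2)  # 2 x 2 output
```

The same input and kernel give a 3 x 3 map with stride 1 and a 2 x 2 map with stride 2, which is the dependence on stride described above.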
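The pooling layer is used, after the convolutional layer has extracted features, to perform feature selection and filtering on the feature map output by the convolutional layer.
The selection and filtering of the feature map by the pooling layer is essentially a form of downsampling. There are many forms of nonlinear pooling function; in this embodiment, max pooling is used. Max pooling splits the matrix into regions, and each output element is the maximum value of the corresponding region. An example of the max pooling process is shown in FIG. 3:
The left side is a feature map, and the right side is the features retained after max pooling. Max pooling works like applying a filter of size 2, that is, a 2*2 region with a stride of 2. If a feature is detected within the filter, its maximum value is retained. If the feature is not detected, it probably does not exist in that region, and the maximum value there remains small.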
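Max pooling with a 2*2 region and a stride of 2 can be sketched as follows; the input values are made up for the example and are not those of FIG. 3.

```python
def max_pool(feature_map, size=2, stride=2):
    """Keep the largest value of every size x size region."""
    out_h = (len(feature_map) - size) // stride + 1
    out_w = (len(feature_map[0]) - size) // stride + 1
    return [
        [
            max(
                feature_map[i * stride + di][j * stride + dj]
                for di in range(size)
                for dj in range(size)
            )
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


feature_map = [
    [1, 3, 2, 1],
    [4, 6, 5, 0],
    [7, 2, 9, 8],
    [1, 0, 3, 4],
]
pooled = max_pool(feature_map)
```

The 4 x 4 input shrinks to 2 x 2, so each later layer sees a quarter of the values.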
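The pooling layer progressively reduces the spatial size of the data, so the number of parameters and the amount of computation also decrease, which controls overfitting to a certain extent.
The fully connected layer generally consists of two parts, a linear part and a nonlinear part. In the data training model, one or more fully connected layers follow the convolutional and pooling layers. Each neuron in a fully connected layer is connected to all neurons of the previous layer.
The fully connected layer can integrate the class-discriminative local information from the convolutional or pooling layers. The overall fully connected network is shown in FIG. 4: the transformation from the input layer to hidden layer 1 is already a complete fully connected layer. Linear part: the input-layer vector $x$ ($n$-dimensional) is turned into the hidden layer 1 output $z$ ($m$-dimensional) by multiplying it by an $m \times n$ matrix $W$ and adding a bias $b$, that is, $Wx + b = z$. The meaning of this process is that a set of weights is applied to every pixel to compute the final value; the weights are random at the start and are subsequently learned through backpropagation. Nonlinear part: generally called the activation function.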
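The linear part $Wx + b = z$ followed by a nonlinearity can be sketched directly from its definition. The weights, bias, and input below are arbitrary example values, with $n = 3$ and $m = 2$.

```python
import math


def linear(W, x, b):
    """z = W x + b, with W an m x n matrix, x of length n, b of length m."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]


def relu(z):
    """ReLU: negative values become 0."""
    return [max(0.0, v) for v in z]


def sigmoid(z):
    """Sigmoid: squash each value into (0, 1)."""
    return [1.0 / (1.0 + math.exp(-v)) for v in z]


x = [1.0, 2.0, 3.0]                       # n = 3 inputs
W = [[1.0, 0.0, -1.0], [0.5, 0.5, 0.5]]   # m x n = 2 x 3 weights
b = [0.5, -1.0]                           # m = 2 biases
z = linear(W, x, b)
hidden = relu(z)
midpoint = sigmoid([0.0])
```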
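To improve the performance of the convolutional neural network, the activation functions between the neurons of the fully connected layers in the present invention are the ReLU function and the Sigmoid function respectively, as shown in FIG. 4.
The dropout process is used to prevent overfitting. Overfitting means that training appears very effective and the loss function value can be driven very low, but performance on the test data set is not as good, because the model depends too heavily on the characteristics of the existing training data set. During model training, the dropout process randomly sets some activations to zero (so that the weights of some hidden-layer nodes take no part), to avoid overfitting. During forward propagation, the activation value of a neuron stops working with a certain probability p, which makes the model generalize better because it does not rely too much on particular local features.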
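The difference between training and prediction behaviour can be sketched as below. The rescaling of the surviving activations by 1/(1 - p) ("inverted dropout") is a common convention and an assumption here; the source only states that activations are dropped with probability p.

```python
import random


def dropout(activations, p, training, rng=None):
    """Zero each activation with probability p while training; identity otherwise."""
    if not training or p == 0.0:
        return list(activations)
    if rng is None:
        rng = random.Random()
    keep = 1.0 - p
    # Survivors are scaled up so the expected sum stays unchanged (assumption).
    return [a / keep if rng.random() >= p else 0.0 for a in activations]


activations = [1.0] * 1000
train_out = dropout(activations, p=0.5, training=True, rng=random.Random(0))
test_out = dropout(activations, p=0.5, training=False)
```

The same weights therefore yield two different outputs depending on the mode, which is why separate training and prediction outputs are kept.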
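Because dropout behaves differently during training and prediction, the output layer has two different output variables that share the same weights: the output variable in training mode (output_train) and the output variable in prediction mode (output_test).
As shown in FIG. 5, in step S14 the feature extraction step includes:
Step S141) constructing a training model for the training set data samples;
Step S142) calling the training set data samples to train the training model until convergence, so as to generate a convolutional neural network model to be evaluated that is suited to the training set data samples; the process of training until convergence also includes optimizing the data training model with a pre-stored optimization function;
Step S143) evaluating the convolutional neural network model to be evaluated and, once it meets the preset evaluation criteria, obtaining the convolutional neural network model that matches the training set data samples;
Step S144) inputting the test set data samples into the convolutional neural network model that matches the training set data samples, so as to predict the attribute features of the test set data samples and obtain the accuracy of that model.
In step S14, the convolutional neural network model to be evaluated is generated by calling the established neural network model and adjusting its parameters until a suitable model is obtained.
In this embodiment, the tf.estimator.inputs.numpy_input_fn() function is used to load data when training the model. The parameters of this function are x, y, batch_size, num_epochs, and shuffle, where x is the training data x_train, y is the labels (y = y_train), batch_size is the number of samples selected in each iteration, and num_epochs=None means that training stops when the set number of iterations is reached.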
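The role of batch_size, num_epochs, and shuffle can be illustrated with a small generator. This sketch does not call tf.estimator.inputs.numpy_input_fn() and does not reproduce its internal queueing; the function name `input_batches`, the `seed` parameter, and the sample data are assumptions for illustration.

```python
import itertools
import random


def input_batches(x, y, batch_size, num_epochs=None, shuffle=True, seed=0):
    """Yield (x_batch, y_batch) pairs; num_epochs=None repeats without end."""
    rng = random.Random(seed)
    epochs = itertools.count() if num_epochs is None else range(num_epochs)
    for _ in epochs:
        order = list(range(len(x)))
        if shuffle:
            rng.shuffle(order)
        for start in range(0, len(order), batch_size):
            idx = order[start:start + batch_size]
            yield [x[i] for i in idx], [y[i] for i in idx]


x_train = list(range(10))
y_train = [value % 2 for value in x_train]

# Training: the data never runs out, so the step count decides when to stop.
train_steps = list(itertools.islice(
    input_batches(x_train, y_train, batch_size=4), 5))

# Evaluation: one ordered pass over the data.
eval_pass = list(input_batches(
    x_train, y_train, batch_size=4, num_epochs=1, shuffle=False))
```

With num_epochs=None the generator is endless, so the caller must fix the number of iterations, as `itertools.islice` does here with 5 steps.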
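In this embodiment, the preset evaluation criteria include a loss function, and the loss function includes a cross-entropy loss function. Step S14 includes: calculating the loss function (loss) of the convolutional neural network model to be evaluated, where loss is the cross-entropy loss (Cross Entropy Loss). The value of the loss function of the model to be evaluated is compared with a preset loss threshold to obtain the convolutional neural network model that matches the training set data samples; that model is the data training model corresponding to the minimum value of the loss function.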
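For the two-class labels "1" and "0", the cross-entropy loss and the comparison with a threshold can be sketched as follows. How the threshold test and the minimum-loss rule combine is an interpretation of the text; the candidate names, predicted probabilities, and threshold values are hypothetical.

```python
import math


def cross_entropy(labels, probs, eps=1e-12):
    """Mean binary cross-entropy; probs are predicted P(label == 1)."""
    total = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1.0 - eps)  # keep log() finite
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(labels)


def select_model(candidate_losses, loss_threshold):
    """Return the lowest-loss candidate if it meets the threshold, else None."""
    name = min(candidate_losses, key=candidate_losses.get)
    return name if candidate_losses[name] <= loss_threshold else None


labels = [1, 0, 1, 0]
losses = {
    "run_a": cross_entropy(labels, [0.9, 0.1, 0.8, 0.2]),
    "run_b": cross_entropy(labels, [0.6, 0.4, 0.5, 0.5]),
}
best = select_model(losses, loss_threshold=0.3)
none_good_enough = select_model(losses, loss_threshold=0.1)
```

Confident, correct predictions give a small loss; when no candidate meets the threshold, training would continue rather than accept a model.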
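In this embodiment, the tf.estimator.inputs.numpy_input_fn() function is also used to load data when evaluating the model, where x is the prediction data x_test, y is the labels (y = y_test), batch_size remains unchanged, and shuffle=False.
In step S15, the convolutional neural network model that matches the training set data samples is trained a second or third time by adding to and correcting the data samples, so as to improve its accuracy.
The intelligent identification method for outfitting parts described in this embodiment can accurately identify, in a very short time, whether an image contains the target object, and offers a new approach to image recognition.
This embodiment also provides a computer device on which a computer program is stored; when the computer program is executed by a processor, the above intelligent identification method for outfitting parts is implemented.
The computer device may be a server, and its internal structure may be as shown in FIG. 6. The computer device includes a processor, a memory, a network interface, and a database connected via a system bus. The processor provides computing and control capabilities. The memory includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for running the operating system and the computer program held in the non-volatile storage medium. The database stores the data of the intelligent identification method for outfitting parts. The network interface is used to communicate with external terminals over a network connection.
The technical features of the above embodiments can be combined arbitrarily. For brevity, not every possible combination of these technical features has been described; however, as long as a combination contains no contradiction, it should be considered within the scope of this specification.
The above embodiments express only several implementations of the present application. Their description is relatively specific and detailed, but it should not be understood as limiting the scope of the invention patent. It should be pointed out that those of ordinary skill in the art can make a number of variations and improvements without departing from the concept of the present application, and these all fall within its scope of protection. The scope of protection of this patent application shall therefore be determined by the appended claims.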

Claims (10)

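  1. An intelligent recognition method for outfitting parts, characterized by comprising the steps of:
    Step S11) an image acquisition step: acquiring images of a ship outfitting warehouse, the acquired images being a large number of images that do or do not contain a target object, the target object including outfitting parts;
    Step S12) an image preprocessing step: processing the acquired images using linear operations on images, logical operations on images, spatial operations on images, and image transforms to form input sample tensor data;
    Step S13) a sample data set acquisition step: receiving the input sample tensor data and simplifying it so as to consolidate it into input sample data, the input sample data being divided into training-set data samples and test-set data samples obtained from the acquired images;
    Step S14) a feature extraction step: establishing a convolutional neural network model, and calling the sample data set to train and evaluate the model, to obtain a convolutional neural network model matched to the training-set data samples;
    Step S15) a model calibration step: performing optimization training on the convolutional neural network model matched to the training-set data samples, so as to improve the accuracy of that model;
    Step S16) an image recognition step: recognizing the target object using the calibrated convolutional neural network model matched to the training-set data samples.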
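  2. The intelligent recognition method for outfitting parts according to claim 1, characterized in that step S12 further includes arithmetic operations on images, in which the pixel values of the acquired images are stored in arrays, and array addition, subtraction, multiplication, and division are each performed directly between the two elements at the same position, wherein addition is used for image noise reduction, subtraction is used to enhance differences between images, and image multiplication or division is used for shading correction.
  3. The intelligent recognition method for outfitting parts according to claim 1, characterized in that, in step S12, the linear operations on images are defined as follows: suppose an operator $H$ acting on an image $f(x,y)$ gives $H[f(x,y)] = g(x,y)$ and satisfies $H[a_i f_i(x,y) + a_j f_j(x,y)] = a_i H[f_i(x,y)] + a_j H[f_j(x,y)] = a_i g_i(x,y) + a_j g_j(x,y)$, where $a_i$ and $a_j$ are arbitrary constants and $f_i$ and $f_j$ are any two images of the same size; then $H$ is a linear operation;
    the logical operations on images include set operations on images, namely intersection, union, and complement, and the logical operation modes include AND, OR, NOT, and XOR;
    the spatial operations on images include single-pixel operations, neighborhood operations, and geometric spatial transformations.
  4. The intelligent recognition method for outfitting parts according to claim 1, characterized in that, in step S12, the image preprocessing step further includes: image cropping, image resizing, converting the image data into tensors, and data standardization.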
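  5. The intelligent recognition method for outfitting parts according to claim 1, characterized in that, in step S14, the feature extraction step comprises:
    constructing a training model for the training-set data samples;
    calling the training-set data samples to train the training model until convergence, so as to generate a to-be-evaluated convolutional neural network model suited to the training-set data samples;
    evaluating the to-be-evaluated convolutional neural network model and, once it meets the preset evaluation criterion, obtaining the convolutional neural network model matched to the training-set data samples;
    inputting the test-set data samples into the convolutional neural network model matched to the training-set data samples, so as to predict the attribute features of the test-set data samples and obtain the accuracy of that model.
  6. The intelligent recognition method for outfitting parts according to claim 5, characterized in that, in step S14, the constructed training model comprises: an input layer, a convolutional layer arranged below the input layer, a pooling layer arranged below the convolutional layer, a fully connected layer arranged below the pooling layer, a dropout layer arranged below the fully connected layer, and an output layer arranged below the dropout layer.
  7. The intelligent recognition method for outfitting parts according to claim 5, characterized in that, in step S14, while the training model is being trained to convergence on the training-set data samples, the training model is also optimized using a pre-stored optimization function.
  8. The intelligent recognition method for outfitting parts according to claim 5, characterized in that, in step S14, the preset evaluation criterion includes a loss function; the step of evaluating the to-be-evaluated convolutional neural network model comprises: computing the loss function of the to-be-evaluated convolutional neural network model; and comparing the value of that loss function with a preset loss threshold to obtain the convolutional neural network model matched to the training-set data samples; wherein the convolutional neural network model matched to the training-set data samples is the training model corresponding to the minimum value of the loss function.
  9. The intelligent recognition method for outfitting parts according to claim 1, characterized in that, in step S15, by adding to and correcting the data samples, the convolutional neural network model matched to the training-set data samples is trained a second or third time, so as to improve its accuracy.
  10. A computer device storing a computer program, characterized in that the computer program, when executed by a processor, implements the intelligent recognition method for outfitting parts according to any one of claims 1 to 9.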
PCT/CN2023/112528 2022-10-10 2023-08-11 Intelligent recognition method for outfitting parts, and computer device WO2024078112A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202211235668.5 2022-10-10
CN202211235668.5A CN115565115A (zh) 2022-10-10 2022-10-10 Intelligent recognition method for outfitting parts, and computer device

Publications (1)

Publication Number Publication Date
WO2024078112A1 true WO2024078112A1 (zh) 2024-04-18

Family

ID=84745884

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/112528 WO2024078112A1 (zh) 2022-10-10 2023-08-11 Intelligent recognition method for outfitting parts, and computer device

Country Status (2)

Country Link
CN (1) CN115565115A (zh)
WO (1) WO2024078112A1 (zh)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115565115A (zh) * 2022-10-10 2023-01-03 上海船舶工艺研究所(中国船舶集团有限公司第十一研究所) Intelligent recognition method for outfitting parts, and computer device

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
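CN112766232A (zh) * 2021-02-19 2021-05-07 南京邮电大学 Road risk target recognition method based on a reconfigurable convolutional neural network
CN112906795A (zh) * 2021-02-23 2021-06-04 江苏聆世科技有限公司 Method for identifying honking vehicles based on a convolutional neural network
CN114926691A (zh) * 2022-05-31 2022-08-19 中国计量大学 Intelligent pest identification method and system based on a convolutional neural network
CN115565115A (zh) * 2022-10-10 2023-01-03 上海船舶工艺研究所(中国船舶集团有限公司第十一研究所) Intelligent recognition method for outfitting parts, and computer device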

Also Published As

Publication number Publication date
CN115565115A (zh) 2023-01-03

Similar Documents

Publication Publication Date Title
CN110175671B (zh) Neural network construction method, image processing method, and apparatus
US20230095606A1 (en) Method for training classifier, and data processing method, system, and device
CN110222718B (zh) Image processing method and apparatus
US20220148291A1 (en) Image classification method and apparatus, and image classification model training method and apparatus
CN112215332B (zh) Neural network architecture search method, image processing method, and apparatus
US20230048405A1 (en) Neural network optimization method and apparatus
CN111291809A (zh) Processing apparatus, method, and storage medium
CN114998695B (zh) Method and system for improving image recognition speed
CN113095370B (zh) Image recognition method and apparatus, electronic device, and storage medium
CN113191489B (zh) Training method for binary neural network model, image processing method, and apparatus
CN110879982A (zh) Crowd counting system and method
US20220157046A1 (en) Image Classification Method And Apparatus
WO2024078112A1 (zh) Intelligent recognition method for outfitting parts, and computer device
Öztürk et al. Transfer learning and fine‐tuned transfer learning methods' effectiveness analyse in the CNN‐based deep learning models
US20210042613A1 (en) Techniques for understanding how trained neural networks operate
CN111223128A (zh) Target tracking method, apparatus, device, and storage medium
US20230401838A1 (en) Image processing method and related apparatus
Chew et al. Large-scale 3D point-cloud semantic segmentation of urban and rural scenes using data volume decomposition coupled with pipeline parallelism
US20240078428A1 (en) Neural network model training method, data processing method, and apparatus
CN111179270A (zh) Image co-segmentation method and apparatus based on an attention mechanism
US20220222934A1 (en) Neural network construction method and apparatus, and image processing method and apparatus
Defriani et al. Recognition of regional traditional house in Indonesia using Convolutional Neural Network (CNN) method
Jiao et al. Non-local duplicate pooling network for salient object detection
Resti et al. Performance improvement of decision tree model using fuzzy membership function for classification of corn plant diseases and pests
CN113256556A (zh) Image selection method and apparatus

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23876317

Country of ref document: EP

Kind code of ref document: A1