CN117422709B - Slump prediction method and slump prediction device based on RGB image and depth image - Google Patents

Slump prediction method and slump prediction device based on RGB image and depth image

Info

Publication number
CN117422709B
Authority
CN
China
Prior art keywords
image sequence
rgb
slump
images
branch
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202311709417.0A
Other languages
Chinese (zh)
Other versions
CN117422709A (en
Inventor
杨建红
林柏宏
黄文景
张宝裕
黄骁民
陈焕森
黄伟晴
韩明芝
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huaqiao University
Fujian South Highway Machinery Co Ltd
Original Assignee
Huaqiao University
Fujian South Highway Machinery Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huaqiao University and Fujian South Highway Machinery Co Ltd
Priority to CN202311709417.0A
Publication of CN117422709A
Application granted
Publication of CN117422709B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/0002 Inspection of images, e.g. flaw detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/0464 Convolutional networks [CNN, ConvNet]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/048 Activation functions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/80 Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/50 Context or environment of the image
    • G06V20/52 Surveillance or monitoring of activities, e.g. for recognising suspicious objects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10024 Color image
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10028 Range image; Depth image; 3D point clouds
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20084 Artificial neural networks [ANN]
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00 Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/30 Computing systems specially adapted for manufacturing

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Quality & Reliability (AREA)
  • Image Processing (AREA)

Abstract

The invention discloses a slump prediction method and a slump prediction device based on RGB images and depth images, and relates to the field of slump prediction. The method comprises: acquiring an original RGB image sequence and an original depth image sequence of a concrete surface collected during the concrete mixing stage, and preprocessing the original RGB image sequence and the original depth image sequence to obtain an RGB image sequence and a depth image sequence; building and training a slump prediction model based on bimodal feature fusion to obtain a trained slump prediction model; and inputting each RGB image in the RGB image sequence, together with the corresponding depth image in the depth image sequence, into the trained slump prediction model to obtain a slump prediction value, thereby solving the problem of real-time slump detection during concrete mixing.

Description

Slump prediction method and slump prediction device based on RGB image and depth image
Technical Field
The invention relates to the field of slump prediction, in particular to a slump prediction method and device based on RGB images and depth images.
Background
Concrete slump is one of the indicators used to evaluate the fluidity of concrete and to guide construction work. At present, slump is measured mainly by hand with a slump cone. The measurement is made only after the concrete has been produced, so the result lags behind production: once a batch is found to have unsatisfactory slump, resources have already been wasted and construction is delayed.
Because the water content of the raw materials fluctuates widely, the mix formula alone cannot keep the slump within requirements, and the slump must be measured periodically so that the formula can be adjusted. Manual, after-the-fact measurement cannot support real-time adjustment of the concrete formula, so a method that can detect slump in real time during concrete mixing is needed.
Disclosure of Invention
An objective of the embodiments of the present application is to provide a slump prediction method and device based on RGB images and depth images that solve the technical problems described in the Background section.
In a first aspect, the present invention provides a slump prediction method based on an RGB image and a depth image, comprising the steps of:
acquiring an original RGB image sequence and an original depth image sequence of a concrete surface collected during the concrete mixing stage, and preprocessing the original RGB image sequence and the original depth image sequence to obtain an RGB image sequence and a depth image sequence;
building and training a slump prediction model based on bimodal feature fusion to obtain a trained slump prediction model;
and inputting each RGB image in the RGB image sequence, together with the corresponding depth image in the depth image sequence, into the trained slump prediction model to obtain a slump prediction value.
Preferably, the slump prediction model comprises a first convolution layer, a maximum pooling layer, a first residual module, a second residual module, a third residual module, a fourth residual module, an average pooling layer, a flattening layer and a fully connected layer connected in sequence, wherein each RGB image in the RGB image sequence is concatenated with the corresponding depth image in the depth image sequence, the concatenated result is input into the first convolution layer, and the slump prediction value is output by the fully connected layer.
Preferably, the slump prediction model comprises a first branch and a second branch, each comprising a first convolution layer, a maximum pooling layer, a first residual module, a second residual module, a third residual module and a fourth residual module connected in sequence; the output of the first residual module of the first branch and the output of the first residual module of the second branch are added and then input into the second residual module of the first branch; the output of the second residual module of the first branch and the output of the second residual module of the second branch are added and then input into the third residual module of the first branch; the output of the third residual module of the first branch and the output of the third residual module of the second branch are added and then input into the fourth residual module of the first branch; the output of the fourth residual module of the first branch and the output of the fourth residual module of the second branch are added and then input into an average pooling layer, a flattening layer and a fully connected layer connected in sequence; and an RGB image in the RGB image sequence and the corresponding depth image in the depth image sequence are input into the first convolution layer of the first branch and the first convolution layer of the second branch, respectively, and the slump prediction value is output by the fully connected layer.
Preferably, the slump prediction model comprises a first branch and a second branch, each comprising a first convolution layer, a maximum pooling layer, a first residual module, a second residual module, a third residual module, a fourth residual module, an average pooling layer and a flattening layer connected in sequence, wherein the output of the flattening layer of the first branch and the output of the flattening layer of the second branch are concatenated and then input into a fully connected layer, an RGB image in the RGB image sequence and the corresponding depth image in the depth image sequence are input into the first convolution layer of the first branch and the first convolution layer of the second branch, respectively, and the slump prediction value is output by the fully connected layer.
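The late-stage fusion described in the preceding paragraph can be illustrated with a minimal NumPy sketch. It shows only the concatenation of the two flattened branch outputs and the fully connected layer. The feature width of 512 per branch, the random feature values and the random weights are assumptions made for illustration; the source does not state channel counts or trained parameters.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical flattened outputs of the two branches. The width of 512 is an
# assumption; the source does not give channel counts.
rgb_features = rng.standard_normal(512)    # first branch (RGB image)
depth_features = rng.standard_normal(512)  # second branch (depth image)

# Late fusion: concatenate (splice) the two flattened feature vectors.
fused = np.concatenate([rgb_features, depth_features])

# Fully connected layer mapping the fused vector to a single slump value.
# Random weights stand in for trained parameters.
weights = rng.standard_normal((1, fused.size)) * 0.01
bias = np.zeros(1)
slump_prediction = weights @ fused + bias
```

In a trained model the two feature vectors would come from the two convolutional branches rather than from a random generator.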
Preferably, the first residual module, the second residual module, the third residual module and the fourth residual module comprise 3, 4, 6 and 3 sequentially connected residual blocks, respectively; each residual block comprises a second convolution module and a third convolution module joined by a residual connection; the second convolution module and the third convolution module each comprise a second convolution layer and a ReLU activation function layer connected in sequence; the convolution kernel size of the first convolution layer is 7×7, the convolution kernel size of the second convolution layer is 3×3, the pooling kernel size of the maximum pooling layer is 3×3, and the pooling kernel size of the average pooling layer is 1×1.
Preferably, preprocessing the original RGB image sequence and the original depth image sequence to obtain the RGB image sequence and the depth image sequence specifically includes:
removing noise points from the original depth images in the original depth image sequence to obtain a processed original depth image sequence;
increasing the brightness of the original RGB images in the original RGB image sequence to obtain a processed original RGB image sequence;
aligning the processed original depth image sequence and the processed original RGB image sequence according to the corresponding stirring time to obtain an aligned original depth image sequence and an aligned original RGB image sequence;
intercepting a time period from the aligned original depth image sequence and the aligned original RGB image sequence to obtain an intercepted original depth image sequence and an intercepted original RGB image sequence;
and performing size transformation and normalization on the intercepted original depth image sequence and the intercepted original RGB image sequence to obtain the RGB image sequence and the depth image sequence.
Preferably, the images retained by the time period interception are the images covering one rotation of the concrete mixing shaft.
In a second aspect, the present invention provides a slump prediction apparatus based on an RGB image and a depth image, comprising:
the image processing module is configured to acquire an original RGB image sequence and an original depth image sequence of the concrete surface acquired in a concrete mixing stage, and preprocess the original RGB image in the original RGB image sequence and the original depth image in the original depth image sequence to acquire an RGB image sequence and a depth image sequence;
the model construction module is configured to construct a slump prediction model based on bimodal feature fusion and train the slump prediction model to obtain a trained slump prediction model;
and the execution module is configured to input the RGB images in the RGB image sequence and the depth images in the corresponding depth image sequence into the trained slump prediction model to obtain a slump prediction value.
In a third aspect, the present invention provides an electronic device comprising one or more processors; and storage means for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the method as described in any of the implementations of the first aspect.
In a fourth aspect, the present invention provides a computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements a method as described in any of the implementations of the first aspect.
Compared with the prior art, the invention has the following beneficial effects:
(1) The slump prediction method based on RGB images and depth images provided by the invention builds the slump prediction model by fusing RGB images and depth images, so that slump no longer has to be measured by manual testing and can be detected in real time during mixing.
(2) The slump prediction method based on RGB images and depth images provided by the invention acquires both an RGB image sequence and a depth image sequence. Because the fluid motion of concrete is a compound of transverse and longitudinal motion, the two sequences together record the motion in both directions during mixing. The slump prediction model based on bimodal feature fusion learns more feature information and establishes the relationship between the RGB and depth images and the slump value, which can effectively improve prediction accuracy. Feature fusion can be performed at an early, middle or late stage according to requirements, which gives greater flexibility.
(3) With the slump prediction method based on RGB images and depth images provided by the invention, real-time slump detection requires only a camera installed on the mixer, which makes it convenient to adjust the slump in real time to meet engineering requirements.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings that are needed in the description of the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a diagram of an exemplary device architecture to which an embodiment of the present application may be applied;
FIG. 2 is a flow chart of a slump prediction method based on RGB images and depth images according to an embodiment of the present application;
FIG. 3 is a schematic diagram of an RGB image aligned with a depth image in the slump prediction method based on RGB images and depth images according to an embodiment of the present application;
FIG. 4 is a schematic structural diagram of the slump prediction model of the slump prediction method based on RGB images and depth images according to the first embodiment of the present application;
FIG. 5 is a schematic structural diagram of the slump prediction model of the slump prediction method based on RGB images and depth images according to the second embodiment of the present application;
FIG. 6 is a schematic structural diagram of the slump prediction model of the slump prediction method based on RGB images and depth images according to the third embodiment of the present application;
FIG. 7 is a schematic diagram of a slump prediction device based on RGB images and depth images according to an embodiment of the present application;
FIG. 8 is a schematic structural diagram of a computer device suitable for implementing the embodiments of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be described in further detail below with reference to the accompanying drawings, and it is apparent that the described embodiments are only some embodiments of the present invention, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Fig. 1 illustrates an exemplary device architecture 100 in which the RGB image and depth image-based slump prediction method or the RGB image and depth image-based slump prediction device of the embodiments of the present application may be applied.
As shown in fig. 1, the device architecture 100 may include a first terminal device 101, a second terminal device 102, a third terminal device 103, a network 104, and a server 105. The network 104 serves as the medium that provides communication links between the first terminal device 101, the second terminal device 102, the third terminal device 103, and the server 105. The network 104 may include various connection types, such as wired or wireless communication links or fiber optic cables.
A user may use the first terminal device 101, the second terminal device 102, or the third terminal device 103 to interact with the server 105 via the network 104, for example to receive or send messages. Various applications, such as data processing applications and file processing applications, may be installed on the first terminal device 101, the second terminal device 102, and the third terminal device 103.
The first terminal device 101, the second terminal device 102 and the third terminal device 103 may be hardware or software. When they are hardware, they may be various electronic devices, including but not limited to smart phones, tablet computers, laptop computers, and desktop computers. When they are software, they can be installed in the electronic devices listed above, and may be implemented as multiple pieces of software or software modules (e.g., software or software modules for providing distributed services) or as a single piece of software or software module. The present invention is not particularly limited in this respect.
The server 105 may be a server that provides various services, such as a background data processing server that processes files or data uploaded by the first terminal device 101, the second terminal device 102, and the third terminal device 103. The background data processing server can process the acquired files or data to generate a processing result.
It should be noted that, the slump prediction method based on the RGB image and the depth image provided in the embodiment of the present application may be executed by the server 105, or may be executed by the first terminal device 101, the second terminal device 102, or the third terminal device 103, and accordingly, the slump prediction device based on the RGB image and the depth image may be set in the server 105, or may be set in the first terminal device 101, the second terminal device 102, or the third terminal device 103.
It should be understood that the numbers of terminal devices, networks and servers in fig. 1 are merely illustrative. There may be any number of terminal devices, networks, and servers, as required by the implementation. When the data to be processed does not need to be acquired remotely, the device architecture described above may omit the network and include only a server or a terminal device.
Fig. 2 shows a slump prediction method based on RGB image and depth image according to an embodiment of the present application, including the steps of:
S1, acquiring an original RGB image sequence and an original depth image sequence of the concrete surface collected during the concrete mixing stage, and preprocessing the original RGB image sequence and the original depth image sequence to obtain an RGB image sequence and a depth image sequence.
In a specific embodiment, preprocessing the original RGB image sequence and the original depth image sequence to obtain the RGB image sequence and the depth image sequence specifically includes:
removing noise points from the original depth images in the original depth image sequence to obtain a processed original depth image sequence;
increasing the brightness of the original RGB images in the original RGB image sequence to obtain a processed original RGB image sequence;
aligning the processed original depth image sequence and the processed original RGB image sequence according to the corresponding stirring time to obtain an aligned original depth image sequence and an aligned original RGB image sequence;
intercepting a time period from the aligned original depth image sequence and the aligned original RGB image sequence to obtain an intercepted original depth image sequence and an intercepted original RGB image sequence;
and performing size transformation and normalization on the intercepted original depth image sequence and the intercepted original RGB image sequence to obtain the RGB image sequence and the depth image sequence.
In a specific embodiment, the images retained by the time period interception are the images covering one rotation of the concrete mixing shaft.
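The alignment and interception steps listed above can be sketched as follows. This is only one possible reading: it assumes each frame carries a stirring-time stamp in seconds, pairs every RGB frame with the depth frame nearest in time, and keeps the frames that fall inside a window of one shaft rotation. The function name `align_and_crop`, the timestamps and the rotation period are hypothetical; the source does not say how the rotation period is determined.

```python
import numpy as np


def align_and_crop(rgb_times, depth_times, t_start, period):
    """Pair RGB and depth frames by nearest stirring time, then keep one rotation.

    Returns (rgb_index, depth_index) pairs whose RGB time lies in
    [t_start, t_start + period).
    """
    rgb_times = np.asarray(rgb_times, dtype=float)
    depth_times = np.asarray(depth_times, dtype=float)
    pairs = []
    for i, t in enumerate(rgb_times):
        if t_start <= t < t_start + period:
            # Index of the depth frame closest in time to this RGB frame.
            j = int(np.argmin(np.abs(depth_times - t)))
            pairs.append((i, j))
    return pairs


# Made-up stirring times (seconds) for the two cameras.
rgb_t = [0.00, 0.10, 0.20, 0.30, 0.40, 0.50]
depth_t = [0.02, 0.12, 0.19, 0.31, 0.41, 0.52]

# Assumed rotation window: starts at 0.10 s and lasts 0.25 s.
pairs = align_and_crop(rgb_t, depth_t, t_start=0.10, period=0.25)
```

Nearest-timestamp pairing is a substitute chosen for simplicity; a real system might instead rely on hardware-synchronized cameras or on detecting the blade position in the frames.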
Specifically, a clear video is captured during the concrete mixing stage, its video frames are extracted, and a suitable region in the frames is selected as the region of interest. The region of interest should contain rich concrete surface texture features and should not contain objects unrelated to the mixing process, such as the cylinder wall. The size of the region of interest is chosen case by case: in general, a larger region of interest carries more feature information but also raises the computational cost, so cost and benefit must be weighed together. The region of interest selected from the video captured by an ordinary color camera is taken as the original RGB image, and the original RGB images are arranged in time order to form the original RGB image sequence; the region of interest selected from the video captured by a depth camera is taken as the original depth image, and the original depth images are arranged in time order to form the original depth image sequence.
In the depth direction, the tumbling motion of the concrete fluctuates within a certain depth range; values outside that range are likely caused by splashed particles and floating dust. Based on the size of the mixing device used in the experiment, the depth range is set to 0.4-4 meters. When noise is removed from an original depth image in the original depth image sequence, the depth value of every pixel that falls outside this range is set to 0.
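The depth noise removal just described amounts to a range mask. The sketch below assumes the depth image is available as a floating-point array in meters; the function name `remove_depth_noise` and the sample values are illustrative only, while the 0.4-4 meter range comes from the description.

```python
import numpy as np

# Depth range from the description, in meters; values outside it are noise.
DEPTH_MIN_M = 0.4
DEPTH_MAX_M = 4.0


def remove_depth_noise(depth_m, d_min=DEPTH_MIN_M, d_max=DEPTH_MAX_M):
    """Return a copy of the depth image with out-of-range pixels set to 0."""
    cleaned = np.array(depth_m, dtype=np.float32, copy=True)
    out_of_range = (cleaned < d_min) | (cleaned > d_max)
    cleaned[out_of_range] = 0.0
    return cleaned


# Toy 2x3 depth image in meters (made-up values).
depth = np.array([[0.1, 0.5, 1.2],
                  [3.9, 4.5, 2.0]], dtype=np.float32)
cleaned = remove_depth_noise(depth)
```

Here the pixels at 0.1 m and 4.5 m are zeroed and the rest are left unchanged; the input array is not modified.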
In addition, the brightness of the RGB images in the original RGB image sequence is increased by a certain amount to improve image quality. Because the blade positions in the first frames of the recorded videos differ, the sequences must be aligned. The aligned original depth image sequence and the aligned original RGB image sequence are then intercepted, usually to obtain the image sequence for one rotation of the concrete mixing shaft, so that the mixing blade states in the intercepted original depth image sequence and the intercepted original RGB image sequence are consistent, as shown in fig. 3. The intercepted sequences then undergo size transformation and normalization to suit the input of the subsequent slump prediction model; the parameters for size transformation and normalization can follow the parameters suggested for the particular slump prediction model. Here the images are resized to 224 × 224 pixels, which fits the slump prediction models proposed in the three embodiments described below, and the pixel values are normalized proportionally to values between 0 and 1.
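The brightness increase, size transformation and normalization can be sketched in NumPy as below. Several details are assumptions rather than values from the source: the brightness gain of 1.2, nearest-neighbor resizing, an 8-bit RGB input divided by 255, and a depth image in meters divided by the 4 meter upper limit. The helper names are hypothetical.

```python
import numpy as np


def brighten(rgb_u8, gain=1.2):
    """Scale 8-bit RGB values by `gain` (an assumed value) and clip to [0, 255]."""
    out = rgb_u8.astype(np.float32) * gain
    return np.clip(out, 0, 255).astype(np.uint8)


def resize_nearest(img, size=(224, 224)):
    """Nearest-neighbor resize of an H x W (x C) array to `size`."""
    h, w = img.shape[:2]
    rows = np.arange(size[0]) * h // size[0]
    cols = np.arange(size[1]) * w // size[1]
    return img[rows][:, cols]


def normalize(img, max_value):
    """Scale values proportionally into the range [0, 1]."""
    return img.astype(np.float32) / float(max_value)


# Random stand-ins for one region-of-interest frame from each camera.
rgb = np.random.default_rng(0).integers(0, 256, size=(480, 640, 3), dtype=np.uint8)
depth = np.random.default_rng(1).uniform(0.4, 4.0, size=(480, 640)).astype(np.float32)

rgb_in = normalize(resize_nearest(brighten(rgb)), 255.0)
depth_in = normalize(resize_nearest(depth), 4.0)
```

An image library with bilinear or bicubic interpolation would normally give smoother results than this nearest-neighbor helper, which is used here only to keep the sketch dependency-free.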
S2, building and training a slump prediction model based on bimodal feature fusion to obtain a trained slump prediction model.
Specifically, the slump prediction model based on bimodal feature fusion uses one of the network models proposed in the first, second and third embodiments, which are described in detail below and are not repeated here. The principle is as follows: during mixing, concretes with different slumps differ in how thin or thick they appear, in texture, and in other characteristics, and these characteristics are intricate. The slump prediction model based on bimodal feature fusion extracts features from each RGB image in the RGB image sequence and from the corresponding depth image in the depth image sequence and fuses them, so as to establish the relationship between the images and the slump. The loss function used to train the slump prediction model is the mean squared error (MSE) loss.
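For reference, the MSE loss named above is the mean of the squared differences between predicted and measured slump values. The short sketch below computes it with NumPy; the slump values are made up for illustration and the unit (millimeters) is an assumption.

```python
import numpy as np


def mse_loss(predicted, target):
    """Mean squared error between predicted and measured slump values."""
    predicted = np.asarray(predicted, dtype=float)
    target = np.asarray(target, dtype=float)
    return float(np.mean((predicted - target) ** 2))


# Made-up slump values (assumed to be in millimeters), for illustration only.
pred = [180.0, 200.0, 150.0]
true = [170.0, 200.0, 160.0]
loss = mse_loss(pred, true)  # (10^2 + 0^2 + 10^2) / 3
```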
S3, inputting the RGB images in the RGB image sequence and the depth images in the corresponding depth image sequence into a trained slump prediction model to obtain a slump prediction value.
Specifically, a trained slump prediction model is deployed in an industrial machine, and RGB images of an RGB image sequence and depth images in a depth image sequence corresponding to the RGB images obtained in real time in the concrete mixing process are input into the trained slump prediction model for prediction, so that a slump prediction value is obtained.
The network structure of the slump prediction model proposed in the embodiment of the present application will be described below using specific embodiments.
Example 1
Referring to fig. 4, the slump prediction model adopted in the first embodiment of the present application comprises a first convolution layer, a maximum pooling layer, a first residual module, a second residual module, a third residual module, a fourth residual module, an average pooling layer, a flattening layer and a fully connected layer connected in sequence. Each RGB image in the RGB image sequence is concatenated with the corresponding depth image in the depth image sequence and input into the first convolution layer; the result then passes in turn through the maximum pooling layer, the first residual module, the second residual module, the third residual module, the fourth residual module, the average pooling layer and the flattening layer, and the slump prediction value is finally output by the fully connected layer. In this model, feature fusion takes place at an early stage: the RGB image and the corresponding depth image are concatenated before they are input into the slump prediction model.
In a specific embodiment, the first, second, third and fourth residual modules comprise 3, 4, 6 and 3 sequentially connected residual blocks, respectively. Each residual block comprises a second convolution module and a third convolution module that form a residual connection, and each of these two convolution modules comprises a second convolution layer and a ReLU activation function layer connected in sequence. The convolution kernel size of the first convolution layer is 7×7, the convolution kernel size of the second convolution layer is 3×3, the pooling kernel size of the maximum pooling layer is 3×3, and the pooling kernel size of the average pooling layer is 1×1.
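The block counts above can be tallied with a short sketch. It assumes one convolution layer per convolution module and counts no convolutions on the shortcut paths; under those assumptions the total is 34 weighted layers, the same layout as a ResNet-34-style backbone, although the present application does not name one. The constant and function names are illustrative.

```python
# Block counts of the four residual modules, as stated in the text.
BLOCKS_PER_MODULE = (3, 4, 6, 3)
# Each residual block holds a second and a third convolution module.
CONVS_PER_BLOCK = 2


def count_weighted_layers(blocks_per_module=BLOCKS_PER_MODULE):
    """Count convolution layers plus the single fully connected layer."""
    stem_convs = 1  # the first (7x7) convolution layer
    block_convs = CONVS_PER_BLOCK * sum(blocks_per_module)
    fully_connected = 1
    return stem_convs + block_convs + fully_connected


total_layers = count_weighted_layers()
print(total_layers)  # 34
```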
Specifically, after image preprocessing and before input into the slump prediction model, the RGB images in the RGB image sequence are fused with the corresponding depth images in the depth image sequence, so that each RGB image and its depth image are combined into a single RGB-D image. Features are then extracted from the RGB-D image; the extracted features carry feature information from both the RGB image and the depth image. Finally, the features are fed into the fully connected layer to establish the mathematical relationship between slump and the RGB-D images with slump values. Once the slump prediction model has been trained, it outputs a slump prediction value when the RGB images in the RGB image sequence and the corresponding depth images in the depth image sequence are input.
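The early fusion step can be sketched as follows. The sketch assumes a channel-first layout (channels × height × width), a single-channel depth image, and concatenation along the channel axis; the text only states that the two images are concatenated, so these details and the function name are assumptions.

```python
def early_fusion(rgb, depth):
    """Stack an RGB image (3 x H x W) and a depth image (1 x H x W) into one RGB-D input (4 x H x W)."""
    height, width = len(rgb[0]), len(rgb[0][0])
    for channel in rgb + depth:
        if len(channel) != height or any(len(row) != width for row in channel):
            raise ValueError("RGB and depth images must share the same height and width")
    return rgb + depth  # channel-wise concatenation


# Hypothetical 2 x 2 images with normalized values (illustrative only).
rgb = [[[0.1, 0.2], [0.3, 0.4]] for _ in range(3)]
depth = [[[0.5, 0.6], [0.7, 0.8]]]
rgbd = early_fusion(rgb, depth)
print(len(rgbd), len(rgbd[0]), len(rgbd[0][0]))  # 4 2 2
```

Under these assumptions the first convolution layer would take four input channels instead of three.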
Example 2
Referring to fig. 5, the second embodiment of the present application differs from the first embodiment in that the slump prediction model comprises a first branch and a second branch, each of which comprises a first convolution layer, a maximum pooling layer, a first residual module, a second residual module, a third residual module and a fourth residual module connected in sequence. The output of the first residual module of the first branch and the output of the first residual module of the second branch are added and then input into the second residual module of the first branch. The output of the second residual module of the first branch and the output of the second residual module of the second branch are added and then input into the third residual module of the first branch. The output of the third residual module of the first branch and the output of the third residual module of the second branch are added and then input into the fourth residual module of the first branch. The output of the fourth residual module of the first branch and the output of the fourth residual module of the second branch are added and then input into an average pooling layer, a flattening layer and a fully connected layer, which are connected in sequence. The RGB images in the RGB image sequence and the corresponding depth images in the depth image sequence are input into the first convolution layer of the first branch and the first convolution layer of the second branch, respectively, and the predicted slump value is output through the fully connected layer.
In this model, feature fusion takes place at the middle stage: the outputs of the corresponding residual modules in the two branches are added, and the sum is input into the next residual module of the first branch.
Specifically, the parameters of the first convolution layer, the maximum pooling layer, the first residual module, the second residual module, the third residual module, the fourth residual module, the average pooling layer, the flattening layer and the fully connected layer in each branch are the same as in the first embodiment and are not repeated here.
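The data flow of this middle-stage fusion can be sketched as below. Feature maps are simplified to flat lists, and the residual modules are replaced by stand-in functions; both simplifications, and all names, are assumptions for illustration. The two input features stand for the outputs of each branch's first convolution layer and maximum pooling layer.

```python
def add_features(a, b):
    """Element-wise sum of two equally sized feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature sizes must match for addition")
    return [x + y for x, y in zip(a, b)]


def middle_fusion_forward(rgb_feat, depth_feat, rgb_modules, depth_modules):
    """After each residual module, add the second-branch output to the first-branch output."""
    for rgb_module, depth_module in zip(rgb_modules, depth_modules):
        depth_feat = depth_module(depth_feat)  # the second branch keeps its own path
        rgb_feat = add_features(rgb_module(rgb_feat), depth_feat)  # the sum feeds the first branch
    return rgb_feat  # passed on to average pooling, flattening and the fully connected layer


# Stand-in "residual modules" (illustrative only, not real residual blocks).
def double(values):
    return [2.0 * x for x in values]


def plus_one(values):
    return [x + 1.0 for x in values]


fused = middle_fusion_forward([1.0], [0.0], [double] * 4, [plus_one] * 4)
print(fused)  # [42.0]
```

Element-wise addition requires both branches to produce feature maps of identical shape at every stage, which holds here because the two branches share the same layer parameters.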
Example 3
Referring to fig. 6, the third embodiment of the present application differs from the first embodiment in that the slump prediction model comprises a first branch and a second branch, each of which comprises a first convolution layer, a maximum pooling layer, a first residual module, a second residual module, a third residual module, a fourth residual module, an average pooling layer and a flattening layer connected in sequence. The RGB images in the RGB image sequence and the corresponding depth images in the depth image sequence are input into the first convolution layer of the first branch and the first convolution layer of the second branch, respectively. The outputs of the flattening layers of the two branches are concatenated and input into the fully connected layer, which outputs the predicted slump value. In this model, feature fusion takes place at a late stage: the outputs of the flattening layers of the two branches are concatenated and then input into the fully connected layer.
Specifically, the parameters of the first convolution layer, the maximum pooling layer, the first residual module, the second residual module, the third residual module, the fourth residual module, the average pooling layer, the flattening layer and the fully connected layer in each branch are the same as in the first embodiment and are not repeated here.
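The late-stage fusion can be sketched as a concatenation followed by one fully connected layer with a single output. The weights, bias and feature values below are made-up numbers for illustration, and the function name is an assumption.

```python
def late_fusion_predict(rgb_flat, depth_flat, weights, bias):
    """Concatenate the two flattened feature vectors and apply one fully connected layer."""
    fused = rgb_flat + depth_flat  # list concatenation, not element-wise addition
    if len(weights) != len(fused):
        raise ValueError("weight vector must match the fused feature length")
    return sum(w * x for w, x in zip(weights, fused)) + bias


# Hypothetical flattened features and layer parameters (illustrative only).
prediction = late_fusion_predict([1.0, 2.0], [3.0], [0.5, 0.5, 1.0], 10.0)
print(prediction)  # 14.5
```

Unlike the addition used in the second embodiment, concatenation doubles the input width of the fully connected layer but does not require the two branches to agree on feature size.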
The slump prediction models proposed in Examples 1 to 3 were tested; the results are given in Tables 1 to 4. Table 1 gives the prediction accuracy of the three models, and Tables 2, 3 and 4 compare the predicted slump values with the measured values for the models proposed in Examples 1, 2 and 3, respectively.
Table 1 Prediction accuracy of the slump prediction models proposed in Examples 1 to 3
Note: a sample is considered correctly predicted when the measured value and the predicted value differ by no more than 30 mm.
Table 2 Comparison of the predicted and measured values for the slump prediction model proposed in Example 1
Table 3 Comparison of the predicted and measured values for the slump prediction model proposed in Example 2
Table 4 Comparison of the predicted and measured values for the slump prediction model proposed in Example 3
In summary, the slump prediction models proposed in the first to third embodiments meet the slump deviation requirement, where the slump prediction model proposed in the first embodiment has the highest accuracy and the smallest average deviation.
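The two figures of merit mentioned here, accuracy and average deviation, can be computed as in the sketch below. It reads the note under Table 1 as an absolute difference of at most 30 mm and takes "average deviation" to mean the mean absolute deviation; both readings are assumptions, and the sample values are illustrative rather than taken from the tables.

```python
TOLERANCE_MM = 30.0  # threshold from the note under Table 1


def evaluate(predicted, measured, tolerance=TOLERANCE_MM):
    """Return (accuracy, mean absolute deviation) for paired slump values in mm."""
    if len(predicted) != len(measured) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    deviations = [abs(p - m) for p, m in zip(predicted, measured)]
    accuracy = sum(d <= tolerance for d in deviations) / len(deviations)
    mean_deviation = sum(deviations) / len(deviations)
    return accuracy, mean_deviation


# Hypothetical predictions and measurements (illustrative only).
accuracy, mean_deviation = evaluate(
    [180.0, 150.0, 210.0, 120.0],
    [170.0, 185.0, 210.0, 140.0],
)
print(accuracy, mean_deviation)  # 0.75 16.25
```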
With further reference to fig. 7, as an implementation of the method shown in the foregoing figures, the present application provides an embodiment of a slump prediction apparatus based on RGB images and depth images, which corresponds to the embodiment of the method shown in fig. 2, and which is particularly applicable to various electronic devices.
The embodiment of the application provides a slump prediction device based on RGB image and depth image, which comprises:
the image processing module 1 is configured to acquire an original RGB image sequence and an original depth image sequence of the concrete surface acquired in a concrete mixing stage, and preprocess an original RGB image in the original RGB image sequence and an original depth image in the original depth image sequence to acquire an RGB image sequence and a depth image sequence;
the model construction module 2 is configured to construct a slump prediction model based on bimodal feature fusion and train the slump prediction model to obtain a trained slump prediction model;
the execution module 3 is configured to input the RGB images in the RGB image sequence and the depth images in the corresponding depth image sequence into the trained slump prediction model to obtain a slump predicted value.
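One way the three modules might be wired together is sketched below. The class name, method names and the use of plain callables are hypothetical; the sketch also assumes one prediction per RGB/depth image pair, which the text does not specify.

```python
from typing import Callable, List


class SlumpPredictionDevice:
    """Wires the three modules together; modules 1 and 2 are passed in as callables."""

    def __init__(self, image_processing: Callable, model_construction: Callable):
        self.image_processing = image_processing      # image processing module 1
        self.model_construction = model_construction  # model construction module 2
        self.model = None

    def train(self, training_data) -> None:
        """Build and train the model, keeping the trained model for later use."""
        self.model = self.model_construction(training_data)

    def execute(self, raw_rgb, raw_depth) -> List[float]:
        """Execution module 3: preprocess both sequences, then predict per image pair."""
        if self.model is None:
            raise RuntimeError("train() must be called before execute()")
        rgb_seq, depth_seq = self.image_processing(raw_rgb, raw_depth)
        return [self.model(rgb, depth) for rgb, depth in zip(rgb_seq, depth_seq)]


# Stand-in modules (illustrative only): identity preprocessing and a toy "model".
device = SlumpPredictionDevice(
    image_processing=lambda rgb, depth: (rgb, depth),
    model_construction=lambda data: (lambda rgb, depth: 100.0 + rgb[0] + depth[0]),
)
device.train(training_data=None)
predictions = device.execute([[1.0], [2.0]], [[0.5], [0.5]])
print(predictions)  # [101.5, 102.5]
```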
Referring now to fig. 8, there is illustrated a schematic diagram of a computer apparatus 800 suitable for use in implementing an electronic device (e.g., a server or terminal device as illustrated in fig. 1) of an embodiment of the present application. The electronic device shown in fig. 8 is only an example and should not impose any limitation on the functionality and scope of use of the embodiments of the present application.
As shown in fig. 8, the computer apparatus 800 includes a Central Processing Unit (CPU) 801 and a Graphics Processor (GPU) 802, which can perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) 803 or a program loaded from a storage section 809 into a Random Access Memory (RAM) 804. In the RAM 804, various programs and data required for the operation of the device 800 are also stored. The CPU801, GPU802, ROM 803, and RAM 804 are connected to each other through a bus 805. An input/output (I/O) interface 806 is also connected to bus 805.
The following components are connected to the I/O interface 806: an input section 807 including a keyboard, a mouse, and the like; an output section 808 including a liquid crystal display (LCD), a speaker, and the like; a storage section 809 including a hard disk or the like; and a communication section 810 including a network interface card such as a LAN card, a modem, and the like. The communication section 810 performs communication processing via a network such as the Internet. A drive 811 may also be connected to the I/O interface 806 as needed. A removable medium 812, such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory, is mounted on the drive 811 as needed, so that a computer program read from it can be installed into the storage section 809 as needed.
In particular, according to embodiments of the present disclosure, the processes described above with reference to flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer-readable medium, the computer program comprising program code for performing the method shown in the flowcharts. In such embodiments, the computer program may be downloaded and installed from a network via the communication section 810, and/or installed from the removable medium 812. The above-described functions defined in the method of the present application are performed when the computer program is executed by the Central Processing Unit (CPU) 801 and the Graphics Processor (GPU) 802.
It should be noted that the computer-readable medium described in the present application may be a computer-readable signal medium, a computer-readable storage medium, or any combination of the two. A computer-readable storage medium can be, for example, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer-readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present application, a computer-readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device. A computer-readable signal medium, by contrast, may include a data signal propagated in baseband or as part of a carrier wave, with computer-readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, an electromagnetic signal, an optical signal, or any suitable combination of the foregoing. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer-readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
Computer program code for carrying out operations of the present application may be written in one or more programming languages, including object-oriented programming languages such as Java, Smalltalk and C++, and conventional procedural programming languages such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or it may be connected to an external computer (for example, through the Internet using an Internet service provider).
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of apparatus, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based devices which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The modules involved in the embodiments described in the present application may be implemented by software, or may be implemented by hardware. The described modules may also be provided in a processor.
As another aspect, the present application also provides a computer-readable medium that may be contained in the electronic device described in the above embodiment; or may exist alone without being incorporated into the electronic device. The computer readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to: the method comprises the steps of obtaining an original RGB image sequence and an original depth image sequence of a concrete surface collected in a concrete mixing stage, and preprocessing the original RGB image sequence and the original depth image sequence to obtain an RGB image sequence and a depth image sequence; building and training a slump prediction model based on bimodal feature fusion to obtain a trained slump prediction model; and inputting the RGB images in the RGB image sequence and the depth images in the corresponding depth image sequence into a trained slump prediction model to obtain a slump prediction value.
The foregoing description covers only the preferred embodiments of the present application and explains the technical principles employed. It will be appreciated by persons skilled in the art that the scope of the invention referred to in this application is not limited to the specific combinations of features described above, but is intended to cover other embodiments formed by any combination of the features described above or their equivalents without departing from the spirit of the invention. For example, it covers technical solutions formed by replacing the above features with technical features having similar functions disclosed in (but not limited to) the present application.

Claims (7)

1. The slump prediction method based on the RGB image and the depth image is characterized by comprising the following steps of:
the method comprises the steps of obtaining an original RGB image sequence and an original depth image sequence of the concrete surface collected in a concrete mixing stage, preprocessing the original RGB image sequence and the original depth image sequence to obtain an RGB image sequence and a depth image sequence, and specifically comprises the following steps:
removing noise points of the original depth images in the original depth image sequence to obtain a processed original depth image sequence;
performing brightness improvement on the original RGB image in the original RGB image sequence to obtain a processed original RGB image sequence;
aligning the processed original depth image sequence and the processed original RGB image sequence according to the corresponding stirring time to obtain an aligned original depth image sequence and an aligned original RGB image sequence;
intercepting the aligned original depth image sequence and the aligned original RGB image sequence in a time period to obtain an intercepted original depth image sequence and an intercepted original RGB image sequence, wherein an image corresponding to the time period interception is an image of one rotation of a concrete stirring shaft;
performing size transformation and normalization processing on the intercepted original depth image sequence and the intercepted original RGB image sequence to obtain an RGB image sequence and a depth image sequence;
building and training a slump prediction model based on bimodal feature fusion to obtain a trained slump prediction model;
inputting the RGB images in the RGB image sequence and the depth images corresponding to the RGB images in the depth image sequence into the trained slump prediction model to obtain a slump prediction value, wherein the slump prediction model comprises a first convolution layer, a maximum pooling layer, a first residual module, a second residual module, a third residual module, a fourth residual module, an average pooling layer, a flattening layer and a fully connected layer which are sequentially connected; the RGB images in the RGB image sequence and the depth images corresponding to the RGB images are concatenated and then input into the first convolution layer, and the slump prediction value is output through the fully connected layer; before input into the slump prediction model, the RGB images in the RGB image sequence and the depth images corresponding to the RGB images undergo feature fusion, whereby they are combined into RGB-D images; feature extraction is then carried out on the RGB-D images, the extracted features having feature information of both the RGB images and the depth images; and finally the features are fed into the fully connected layer to establish the mathematical relationship between slump and the RGB-D images with slump values.
2. The slump prediction method based on RGB image and depth image according to claim 1, wherein the slump prediction model comprises a first branch and a second branch, each of the first branch and the second branch comprises a first convolution layer, a maximum pooling layer, a first residual module, a second residual module, a third residual module and a fourth residual module which are sequentially connected, wherein the output of the first residual module of the first branch and the output of the first residual module of the second branch are added and then input into the second residual module of the first branch, the output of the second residual module of the first branch and the output of the second residual module of the second branch are added and then input into the third residual module of the first branch, the output of the third residual module of the first branch and the output of the third residual module of the second branch are added and then input into the fourth residual module of the first branch, the output of the fourth residual module of the first branch and the output of the fourth residual module of the second branch are added and then input into an average pooling layer, a flattening layer and a fully connected layer which are sequentially connected, and the RGB images in the RGB image sequence and the depth images in the depth image sequence corresponding to the RGB images are respectively input into the first convolution layer of the first branch and the first convolution layer of the second branch, and the predicted value of the slump is output through the fully connected layer.
3. The slump prediction method based on RGB images and depth images according to claim 1, wherein the slump prediction model comprises a first branch and a second branch, each of the first branch and the second branch comprises a first convolution layer, a maximum pooling layer, a first residual module, a second residual module, a third residual module, a fourth residual module, an average pooling layer and a flattening layer which are sequentially connected, the output of the flattening layer of the first branch and the output of the flattening layer of the second branch are concatenated and then input into a fully connected layer, and the RGB images in the RGB image sequence and the corresponding depth images in the depth image sequence are respectively input into the first convolution layer of the first branch and the first convolution layer of the second branch, and the predicted value of the slump is output through the fully connected layer.
4. A slump prediction method based on an RGB image and a depth image according to any one of claims 2 to 3, wherein the first residual module, the second residual module, the third residual module, and the fourth residual module respectively include 3 residual blocks connected in sequence, 4 residual blocks connected in sequence, 6 residual blocks connected in sequence, and 3 residual blocks connected in sequence, each residual block includes a second convolution module and a third convolution module that constitute a residual connection, each of the second convolution module and the third convolution module includes a second convolution layer and a ReLU activation function layer connected in sequence, a convolution kernel size of the first convolution layer is 7×7, a convolution kernel size of the second convolution layer is 3×3, a pooling kernel size of the maximum pooling layer is 3×3, and a pooling kernel size of the average pooling layer is 1×1.
5. A slump predicting device based on an RGB image and a depth image, employing the slump predicting method based on an RGB image and a depth image according to any one of claims 1 to 4, comprising:
the image processing module is configured to acquire an original RGB image sequence and an original depth image sequence of the concrete surface acquired in a concrete mixing stage, and preprocess the original RGB image in the original RGB image sequence and the original depth image in the original depth image sequence to acquire an RGB image sequence and a depth image sequence;
the model construction module is configured to construct a slump prediction model based on bimodal feature fusion and train the slump prediction model to obtain a trained slump prediction model;
and the execution module is configured to input the RGB images in the RGB image sequence and the depth images in the depth image sequence corresponding to the RGB images into the trained slump prediction model to obtain a slump prediction value.
6. An electronic device, comprising:
one or more processors;
storage means for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any of claims 1-4.
7. A computer readable storage medium, on which a computer program is stored, characterized in that the program, when being executed by a processor, implements the method according to any of claims 1-4.
CN202311709417.0A 2023-12-13 2023-12-13 Slump prediction method and slump prediction device based on RGB image and depth image Active CN117422709B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311709417.0A CN117422709B (en) 2023-12-13 2023-12-13 Slump prediction method and slump prediction device based on RGB image and depth image

Publications (2)

Publication Number Publication Date
CN117422709A CN117422709A (en) 2024-01-19
CN117422709B true CN117422709B (en) 2024-04-16

Family

ID=89528632

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311709417.0A Active CN117422709B (en) 2023-12-13 2023-12-13 Slump prediction method and slump prediction device based on RGB image and depth image

Country Status (1)

Country Link
CN (1) CN117422709B (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112927184A (en) * 2021-01-15 2021-06-08 重庆交通大学 Self-compacting concrete performance detection method and device based on deep learning
CN113902725A (en) * 2021-10-19 2022-01-07 中国联合网络通信集团有限公司 Slump measuring method, device, equipment and storage medium
CN114266989A (en) * 2021-11-15 2022-04-01 北京建筑材料科学研究总院有限公司 Concrete mixture workability determination method and device
CN115908271A (en) * 2022-10-26 2023-04-04 北京建筑材料科学研究总院有限公司 Online monitoring method, device and equipment for slump of pumped concrete
KR20230049410A (en) * 2021-10-06 2023-04-13 주식회사 에스에이치엘에이비 Method, measuring device and system for measuring slump
CN116862883A (en) * 2023-07-13 2023-10-10 西安理工大学 Concrete slump detection method based on image semantic segmentation
CN117218118A (en) * 2023-11-07 2023-12-12 福建南方路面机械股份有限公司 Slump monitoring method and device based on image sequence and readable medium

Also Published As

Publication number Publication date
CN117422709A (en) 2024-01-19

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant