CN115359055A - Conveyor belt edge detection method, conveyor belt edge detection device, electronic equipment and storage medium - Google Patents


Info

Publication number
CN115359055A
Authority
CN
China
Prior art keywords
conveyor belt
image
characteristic image
edge detection
controlling
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202211279319.3A
Other languages
Chinese (zh)
Other versions
CN115359055B (en)
Inventor
杨志方
郝博南
张立亚
孟庆勇
吴文臻
王超
彭丽
赵青
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
CCTEG China Coal Research Institute
Original Assignee
CCTEG China Coal Research Institute
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by CCTEG China Coal Research Institute filed Critical CCTEG China Coal Research Institute
Priority to CN202211279319.3A priority Critical patent/CN115359055B/en
Publication of CN115359055A publication Critical patent/CN115359055A/en
Application granted granted Critical
Publication of CN115359055B publication Critical patent/CN115359055B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 - Image analysis
    • G06T7/0002 - Inspection of images, e.g. flaw detection
    • G06T7/0004 - Industrial image inspection
    • G06T7/10 - Segmentation; Edge detection
    • G06T7/11 - Region-based segmentation
    • G06T7/13 - Edge detection
    • G06T7/136 - Segmentation; Edge detection involving thresholding
    • G06T9/00 - Image coding
    • G06T9/002 - Image coding using neural networks
    • G06T2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T2207/30 - Subject of image; Context of image processing
    • G06T2207/30108 - Industrial image inspection

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Multimedia (AREA)
  • Quality & Reliability (AREA)
  • Image Analysis (AREA)

Abstract

The present disclosure relates to the field of computer technologies, and in particular, to a conveyor belt edge detection method and apparatus, an electronic device, and a storage medium. The conveyor belt edge detection method includes the following steps: acquiring an initial conveyor belt image; controlling an encoder in a double-flow conveyor belt edge detection network model to encode the initial conveyor belt image to obtain a conveyor belt characteristic image set, wherein the double-flow conveyor belt edge detection network model includes a depth self-attention transformation network and a convolutional neural network; and controlling a decoder in the double-flow conveyor belt edge detection network model to decode the conveyor belt characteristic image set to obtain a target conveyor belt image. By adopting the method and the apparatus, the accuracy and the real-time performance of conveyor belt edge detection can be improved.

Description

Conveyor belt edge detection method, conveyor belt edge detection device, electronic equipment and storage medium
Technical Field
The present disclosure relates to the field of computer technologies, and in particular, to a method and an apparatus for detecting an edge of a conveyor belt, an electronic device, and a storage medium.
Background
In industrial production, conveyor belts are required to transport articles. A belt deviation accident can cause huge economic losses to an enterprise and, in serious cases, casualties. Detecting belt deviation in a timely and accurate manner, so that effective measures can be taken, effectively avoids such accidents and is significant for production safety. The key technology for real-time monitoring of belt deviation is edge detection of the conveyor belt. However, in the related art, the accuracy and real-time performance of conveyor belt edge detection are low.
Disclosure of Invention
The disclosure provides a conveyor belt edge detection method, a conveyor belt edge detection device, electronic equipment and a storage medium, and mainly aims to improve the accuracy and the real-time performance of conveyor belt edge detection.
According to an aspect of the present disclosure, there is provided a conveyor belt edge detection method including:
acquiring an initial conveyor belt image;
controlling an encoder in a double-flow conveyor belt edge detection network model to encode the initial conveyor belt image to obtain a conveyor belt characteristic image set, wherein the double-flow conveyor belt edge detection network model includes a depth self-attention transformation network and a convolutional neural network;
and controlling a decoder in the double-flow conveyor belt edge detection network model to decode the conveyor belt characteristic image set to obtain a target conveyor belt image.
Optionally, the encoder includes a depth self-attention transformation network basic module and a convolutional neural network basic module, and the controlling of the encoder in the double-flow conveyor belt edge detection network model to encode the initial conveyor belt image to obtain a conveyor belt characteristic image set includes:
controlling the depth self-attention transformation network basic module to convert the initial conveyor belt image into a first conveyor belt characteristic image and a second conveyor belt characteristic image;
controlling the convolutional neural network basic module to convert the initial conveyor belt image into a third conveyor belt characteristic image, a fourth conveyor belt characteristic image and a fifth conveyor belt characteristic image;
and determining the conveyor belt characteristic image set according to the first conveyor belt characteristic image, the second conveyor belt characteristic image, the third conveyor belt characteristic image, the fourth conveyor belt characteristic image and the fifth conveyor belt characteristic image.
Optionally, the depth self-attention transformation network basic module includes an image block partition layer, a linear embedding layer, a first depth layered visual self-attention transformation network basic module using moving windows, an image block merging layer, and a second depth layered visual self-attention transformation network basic module using moving windows, and the controlling of the depth self-attention transformation network basic module to convert the initial conveyor belt image into a first conveyor belt characteristic image and a second conveyor belt characteristic image includes:
controlling the initial conveyor belt image to sequentially pass through the image block partition layer, the linear embedding layer and the first depth layered visual self-attention transformation network basic module using moving windows to obtain the first conveyor belt characteristic image;
and controlling the first conveyor belt characteristic image to sequentially pass through the image block merging layer and the second depth layered visual self-attention transformation network basic module using moving windows to obtain the second conveyor belt characteristic image.
Optionally, the convolutional neural network basic module includes a trunk layer, a first downsampling layer and a second downsampling layer, and the controlling of the convolutional neural network basic module to convert the initial conveyor belt image into a third conveyor belt characteristic image, a fourth conveyor belt characteristic image and a fifth conveyor belt characteristic image includes:
inputting the initial conveyor belt image to the trunk layer to obtain the third conveyor belt characteristic image;
controlling the first downsampling layer to downsample the third conveyor belt characteristic image to obtain a fourth conveyor belt characteristic image;
and controlling the second down-sampling layer to down-sample the fourth conveyor belt characteristic image to obtain the fifth conveyor belt characteristic image.
Optionally, the decoder includes an adder, a first up-convolution unit, a second up-convolution unit, a third up-convolution unit and a linear mapping layer, and the controlling of the decoder in the double-flow conveyor belt edge detection network model to decode the conveyor belt characteristic image set to obtain the target conveyor belt image includes:
controlling the adder to perform element-by-element addition on the first conveyor belt characteristic image and the fifth conveyor belt characteristic image to obtain a sixth conveyor belt characteristic image;
controlling the first up-convolution unit to up-sample the sixth conveyor belt characteristic image and the second conveyor belt characteristic image to obtain a seventh conveyor belt characteristic image;
controlling the second up-convolution unit to up-sample the seventh conveyor belt characteristic image and the fourth conveyor belt characteristic image to obtain an eighth conveyor belt characteristic image;
controlling the third up-convolution unit to up-sample the eighth conveyor belt characteristic image and the third conveyor belt characteristic image to obtain a ninth conveyor belt characteristic image;
and controlling the linear mapping layer to perform segmentation prediction on the ninth conveyor belt characteristic image to obtain the target conveyor belt image.
Optionally, after the target conveyor belt image is obtained, the method further includes:
performing belt deviation detection on the target conveyor belt image to obtain a belt deviation result.
Optionally, the performing belt deviation detection on the target conveyor belt image to obtain a belt deviation result includes:
determining an offset corresponding to the edge center line in the target conveyor belt image;
and determining the belt deviation result corresponding to the target conveyor belt image according to the offset and an offset threshold.
According to another aspect of the present disclosure, there is provided a conveyor belt edge detecting device including:
the image acquisition unit is used for acquiring an initial conveyor belt image;
the image encoding unit is used for controlling an encoder in a double-flow conveyor belt edge detection network model to encode the initial conveyor belt image to obtain a conveyor belt characteristic image set, wherein the double-flow conveyor belt edge detection network model includes a depth self-attention transformation network and a convolutional neural network;
and the image decoding unit is used for controlling a decoder in the double-flow conveyor belt edge detection network model to decode the conveyor belt characteristic image set to obtain a target conveyor belt image.
Optionally, the encoder includes a depth self-attention transformation network basic module and a convolutional neural network basic module, the image encoding unit includes a first image conversion subunit, a second image conversion subunit and a set determination subunit, and when the image encoding unit controls the encoder in the double-flow conveyor belt edge detection network model to encode the initial conveyor belt image to obtain the conveyor belt characteristic image set:
the first image conversion subunit is configured to control the depth self-attention transformation network base module to convert the initial conveyor belt image into a first conveyor belt characteristic image and a second conveyor belt characteristic image;
the second image conversion subunit is configured to control the convolutional neural network base module to convert the initial conveyor belt image into a third conveyor belt characteristic image, a fourth conveyor belt characteristic image and a fifth conveyor belt characteristic image;
the set determining subunit is configured to determine the set of conveyor belt feature images according to the first conveyor belt feature image, the second conveyor belt feature image, the third conveyor belt feature image, the fourth conveyor belt feature image, and the fifth conveyor belt feature image.
Optionally, the depth self-attention transformation network basic module includes an image block partition layer, a linear embedding layer, a first depth layered visual self-attention transformation network basic module using moving windows, an image block merging layer, and a second depth layered visual self-attention transformation network basic module using moving windows, and the first image conversion subunit, when controlling the depth self-attention transformation network basic module to convert the initial conveyor belt image into the first conveyor belt characteristic image and the second conveyor belt characteristic image, is specifically configured to:
control the initial conveyor belt image to sequentially pass through the image block partition layer, the linear embedding layer and the first depth layered visual self-attention transformation network basic module using moving windows to obtain the first conveyor belt characteristic image;
and control the first conveyor belt characteristic image to sequentially pass through the image block merging layer and the second depth layered visual self-attention transformation network basic module using moving windows to obtain the second conveyor belt characteristic image.
Optionally, the convolutional neural network basic module includes a trunk layer, a first downsampling layer and a second downsampling layer, and the second image conversion subunit, when controlling the convolutional neural network basic module to convert the initial conveyor belt image into the third conveyor belt characteristic image, the fourth conveyor belt characteristic image and the fifth conveyor belt characteristic image, is specifically configured to:
input the initial conveyor belt image to the trunk layer to obtain the third conveyor belt characteristic image;
control the first downsampling layer to downsample the third conveyor belt characteristic image to obtain the fourth conveyor belt characteristic image;
and control the second downsampling layer to downsample the fourth conveyor belt characteristic image to obtain the fifth conveyor belt characteristic image.
Optionally, the decoder includes an adder, a first up-convolution unit, a second up-convolution unit, a third up-convolution unit and a linear mapping layer, and the image decoding unit, when controlling the decoder in the double-flow conveyor belt edge detection network model to decode the conveyor belt characteristic image set to obtain the target conveyor belt image, is specifically configured to:
control the adder to perform element-by-element addition on the first conveyor belt characteristic image and the fifth conveyor belt characteristic image to obtain a sixth conveyor belt characteristic image;
control the first up-convolution unit to up-sample the sixth conveyor belt characteristic image and the second conveyor belt characteristic image to obtain a seventh conveyor belt characteristic image;
control the second up-convolution unit to up-sample the seventh conveyor belt characteristic image and the fourth conveyor belt characteristic image to obtain an eighth conveyor belt characteristic image;
control the third up-convolution unit to up-sample the eighth conveyor belt characteristic image and the third conveyor belt characteristic image to obtain a ninth conveyor belt characteristic image;
and control the linear mapping layer to perform segmentation prediction on the ninth conveyor belt characteristic image to obtain the target conveyor belt image.
Optionally, after the target conveyor belt image is obtained, the device further includes:
a deviation detection unit, used for performing belt deviation detection on the target conveyor belt image to obtain a belt deviation result.
Optionally, the deviation detection unit, when performing belt deviation detection on the target conveyor belt image to obtain the belt deviation result, is specifically configured to:
determine the offset corresponding to the edge center line in the target conveyor belt image;
and determine the belt deviation result corresponding to the target conveyor belt image according to the offset and an offset threshold.
According to another aspect of the present disclosure, there is provided an electronic device including:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of the preceding aspects.
According to another aspect of the present disclosure, there is provided a non-transitory computer readable storage medium having stored thereon computer instructions for causing the computer to perform the method of any one of the preceding aspects.
According to another aspect of the present disclosure, there is provided a computer program product comprising a computer program which, when executed by a processor, implements the method of any one of the preceding aspects.
In one or more embodiments of the present disclosure, an initial conveyor belt image is obtained; an encoder in a double-flow conveyor belt edge detection network model is controlled to encode the initial conveyor belt image to obtain a conveyor belt characteristic image set, wherein the double-flow conveyor belt edge detection network model includes a depth self-attention transformation network and a convolutional neural network; and a decoder in the double-flow conveyor belt edge detection network model is controlled to decode the conveyor belt characteristic image set to obtain a target conveyor belt image. By using a double-flow conveyor belt edge detection network model that includes both a depth self-attention transformation network and a convolutional neural network, the ability of convolution to extract local features is combined with the global, long-distance information perception ability of the depth self-attention transformation network structure, which improves belt edge detection precision and suppresses belt image noise and background interference. Meanwhile, by designing a feature fusion module for the depth self-attention transformation network and the convolutional neural network, an encoder-decoder structure is formed, global context information can be fused better, pre-training of the depth self-attention transformation network structure on a large-scale data set is avoided, and the network structure can be adjusted flexibly, so that the accuracy and real-time performance of conveyor belt edge detection can be improved.
It should be understood that the statements in this section do not necessarily identify key or critical features of the embodiments of the present disclosure, nor do they limit the scope of the present disclosure. Other features of the present disclosure will become apparent from the following description.
Drawings
The drawings are included to provide a better understanding of the present solution and are not to be construed as limiting the present disclosure. Wherein:
FIG. 1 is a schematic flowchart of a first conveyor belt edge detection method provided by an embodiment of the present disclosure;
FIG. 2 is a schematic flowchart of a second conveyor belt edge detection method provided by an embodiment of the present disclosure;
FIG. 3 is a network structure diagram of the second conveyor belt edge detection method provided by an embodiment of the present disclosure;
FIG. 4 is a schematic structural diagram of a depth layered visual self-attention transformation network basic module using moving windows provided by an embodiment of the present disclosure;
FIG. 5 is a schematic structural diagram of a convolutional neural network basic module provided by an embodiment of the present disclosure;
FIG. 6 is a detection schematic diagram of belt deviation detection provided by an embodiment of the present disclosure;
FIG. 7 is a scene schematic diagram of a conveyor belt edge detection method provided by an embodiment of the present disclosure;
FIG. 8 is a schematic structural diagram of a first conveyor belt edge detection device provided by an embodiment of the present disclosure;
FIG. 9 is a schematic structural diagram of a second conveyor belt edge detection device provided by an embodiment of the present disclosure;
FIG. 10 is a schematic structural diagram of a third conveyor belt edge detection device provided by an embodiment of the present disclosure;
FIG. 11 is a block diagram of an electronic device for implementing a conveyor belt edge detection method of an embodiment of the present disclosure.
Detailed Description
Exemplary embodiments of the present disclosure are described below with reference to the accompanying drawings, in which various details of the embodiments of the disclosure are included to assist understanding, and which are to be considered as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
In an actual industrial production scene, conveyor belt edge deviation usually begins as deviation in a certain local area, which then develops into an overall deviation fault.
According to some embodiments, conventional conveyor belt deviation detection techniques are largely classified into contact and non-contact detection. The contact detection technology generally performs qualitative detection on belt deviation by means of a mechanical device; the non-contact detection technology mainly uses a sensor, a single chip microcomputer, a collection chip and the like to carry out quantitative detection, and the two modes need to establish a special communication facility to transmit data to a background for processing.
In some embodiments, the traditional conveyor belt deviation detection technology is time-consuming to install, high in cost, and poor in stability and adaptability. Relying only on signals from sensors and the like, it cannot adapt to belt detection in multiple scenes, has poor anti-interference capability, ages with service time, and has low accuracy. Moreover, the detection device can only detect deviation after the conveyor belt deviation exceeds a certain limit, so the reaction time for handling deviation is long, which greatly reduces the utilization rate of the equipment.
It is easy to understand that manual inspection and the traditional conveyor belt deviation detection technology have problems such as high cost, instability, and susceptibility to false detection or missed detection, and are not suitable for industrial belt scenes over long distances and in complex environments. With the popularization of cameras in industry, the quantity of industrial belt image data is increasing, and research on conveyor belt image data and active exploration of related application technologies have become a new trend and a new breakthrough.
According to some embodiments, as can be seen from images of belt deviation on industrial sites, belt deviation tends to appear clearly in the image as a change of area or a change of light and shadow. Therefore, under suitable illumination, obvious image area changes can be captured in time with a suitable technique, a conveyor belt deviation detection technology based on visual images can be realized, and the conveyor belt edge can be detected continuously.
In some embodiments, the conveyor belt deviation detection technology based on visual images is mainly divided into two types: traditional edge detection algorithms and edge detection algorithms based on deep learning. The traditional edge detection algorithm mainly relies on the principle that, when the belt deviates, the gray value of the image changes near the belt edge, and it extracts the edge by a differential method. The edge detection algorithm based on deep learning mainly applies networks such as the Fully Convolutional Network (FCN), DeepLab and Holistically-Nested Edge Detection (HED) to the conveyor belt edge detection task, compresses the model structure on the basis of the HED network, and simplifies the final network output.
It is easy to understand that although the traditional edge detection algorithm has advantages such as high detection speed and convenient implementation, selecting a proper gradient threshold is difficult and the choice of threshold strongly influences the result, so the accuracy of conveyor belt edge detection is low. The edge detection algorithm based on deep learning produces rough edges and processes slowly: because the convolution operation only operates locally, it is difficult to construct long-distance dependencies among pixels, the real-time performance is low, and the detection speed does not meet the real-time requirement of 25 frames per second.
The present disclosure is described in detail below with reference to specific examples.
In a first embodiment, as shown in fig. 1, fig. 1 shows a schematic flow chart of a first method for detecting an edge of a conveyor belt provided by an embodiment of the present disclosure, which may be implemented by relying on a computer program and may be run on an apparatus for performing the method for detecting an edge of a conveyor belt. The computer program may be integrated into the application or may run as a separate tool-like application.
The conveyor belt edge detection device may be an electronic device with a conveyor belt edge detection function, including but not limited to: a server, a wearable device, a handheld device, a personal computer, a tablet, an in-vehicle device, a smartphone, a computing device, or another processing device connected to a wireless modem, and so forth. Electronic devices in different networks may be called different names, such as: user equipment, access terminal, subscriber unit, subscriber station, mobile station, remote terminal, mobile device, user terminal, wireless communication device, user agent or user equipment, cellular telephone, cordless telephone, personal digital assistant (PDA), or an electronic device in a fifth-generation (5G) network, a fourth-generation (4G) network, a third-generation (3G) network or a future evolution network, and the like.
Specifically, the conveyor belt edge detection method comprises the following steps:
s101, acquiring an initial conveyor belt image;
according to some embodiments, the initial conveyor belt image refers to an unprocessed conveyor belt image. The initial conveyor belt image does not refer specifically to a fixed image. For example, the initial conveyor belt image may change when the conveyor belt changes.
It will be readily appreciated that the electronic device may acquire an initial conveyor belt image when the electronic device performs conveyor belt edge detection.
S102, controlling an encoder in a double-flow conveyor belt edge detection network model to encode an initial conveyor belt image to obtain a conveyor belt characteristic image set;
According to some embodiments, the double-flow conveyor belt edge detection network model (DFTNet) refers to a network model that includes a depth self-attention transformation network (Transformer) and a convolutional neural network (CNN). The double-flow conveyor belt edge detection network model fuses the Transformer and the CNN to construct a corresponding encoder-decoder structure, which improves the network model's perception of global and long-distance information, strengthens its perception of context information while retaining rich detail information, and fuses global and local information with each other. Moreover, with fewer parameters, it improves the accuracy of conveyor belt edge detection and realizes accurate detection of the conveyor belt edge.
In some embodiments, the Transformer structure is not limited to local operations, can model global context information, and has excellent performance on natural language processing tasks.
In some embodiments, the CNN may be, for example, a CNN dual-stream convergence network.
In some embodiments, an encoder-decoder structure refers to a structure in which an encoder converts an original input signal into an intermediate format, and a decoder then converts the intermediate format into a target signal. The encoder-decoder structure does not refer to a fixed structure. For example, when the encoder changes, the encoder-decoder structure may change. When the decoder changes, the encoder-decoder structure may also change.
According to some embodiments, the conveyor belt characteristic image refers to an image obtained by an encoder after encoding an initial conveyor belt image. The characteristic image of the conveyor belt does not refer to a fixed image. For example, the conveyor belt signature image may change when the initial conveyor belt image changes. The conveyor belt signature image may also change when the encoder changes.
In some embodiments, a set of conveyor belt characteristic images refers to a set of at least one conveyor belt characteristic image aggregated. The set of conveyor belt signature images does not refer to a fixed set. For example, when a conveyor belt feature image changes, the set of conveyor belt feature images may change. The set of conveyor belt characteristic images may also change when the initial conveyor belt image changes.
It is easy to understand that when the electronic device acquires the initial conveyor belt image, the electronic device may control an encoder in the double-flow conveyor belt edge detection network model to encode the initial conveyor belt image, so as to obtain a conveyor belt feature image set.
S103, controlling a decoder in the double-flow conveyor belt edge detection network model to decode the conveyor belt characteristic image set to obtain a target conveyor belt image.
According to some embodiments, the target conveyor belt image refers to an image obtained by decoding a conveyor belt characteristic image set. The target conveyor belt image contains a conveyor belt edge detection result.
It is easy to understand that when the electronic device obtains the conveyor belt feature image set, the electronic device may control a decoder in the double-flow conveyor belt edge detection network model to decode the conveyor belt feature image set, so as to obtain the target conveyor belt image.
In summary, the method provided by the embodiment of the present disclosure obtains an initial conveyor belt image; controls an encoder in a double-flow conveyor belt edge detection network model to encode the initial conveyor belt image to obtain a conveyor belt characteristic image set; and controls a decoder in the double-flow conveyor belt edge detection network model to decode the conveyor belt characteristic image set to obtain a target conveyor belt image. In view of the characteristics of the initial conveyor belt image, i.e. indistinct edges, a complex background, abundant interfering information and continuous change, conveyor belt edge detection is performed with a double-flow conveyor belt edge detection network model that includes a depth self-attention transformation network and a convolutional neural network, so that the ability of convolution to extract local features is combined with the global, long-distance information perception ability of the depth self-attention transformation network structure, which improves belt edge detection precision and suppresses belt image noise and background interference. Meanwhile, by designing the feature fusion module of the depth self-attention transformation network and the convolutional neural network, an encoder-decoder structure is formed, global context information can be fused better, pre-training of the depth self-attention transformation network structure on a large-scale data set can be avoided, and the network structure can be adjusted flexibly, so that the accuracy and real-time performance of conveyor belt edge detection can be improved.
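As an illustration of steps S101 to S103, the following is a minimal sketch of how an electronic device might drive such a double-flow encoder-decoder model; the function and argument names are illustrative assumptions and are not prescribed by the disclosure.

```python
import torch

def detect_belt_edges(initial_image: torch.Tensor,
                      encoder: torch.nn.Module,
                      decoder: torch.nn.Module) -> torch.Tensor:
    """S101-S103: encode the initial conveyor belt image into a conveyor belt
    characteristic image set, then decode the set into the target image."""
    # S102: the encoder's two branches (Transformer + CNN) yield five
    # conveyor belt characteristic images.
    feature_set = encoder(initial_image)   # assumed: tuple of 5 tensors
    # S103: the decoder fuses the characteristic image set into the
    # segmented target conveyor belt image.
    return decoder(feature_set)            # e.g. shape (N, 2, H, W)
```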
Referring to fig. 2 and fig. 3, fig. 2 is a schematic flow chart illustrating a second method for detecting an edge of a conveyor belt according to an embodiment of the present disclosure, and fig. 3 is a network structure diagram illustrating the second method for detecting an edge of a conveyor belt according to an embodiment of the present disclosure. The method may be performed by an electronic device. Specifically, the conveyor belt edge detection method comprises the following steps:
s201, acquiring an initial conveyor belt image;
according to some embodiments, when the electronic device obtains an initial conveyor belt image, the electronic device may control the at least one camera to perform image acquisition on a conveyor belt on a factory site to obtain a conveyor belt site image, perform data processing on the conveyor belt site image to obtain an initial conveyor belt image, and then may control the network transmission device to input the initial conveyor belt image to the electronic device.
In some embodiments, the camera can be installed in the corresponding area of the belt conveyor at different positions in the industrial scene according to the actual situation.
In some embodiments, the network transmission device may be, for example, a data transmission network cable, so that the electronic device can connect to the camera through the industrial Ethernet to obtain actual running images of the belt, monitor the running state of the conveyor belt in real time, and display the monitored running state of the conveyor belt on the display screen.
It will be readily appreciated that the electronic device may acquire an initial conveyor belt image when the electronic device performs conveyor belt edge detection.
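As a hedged sketch of this acquisition step, the snippet below grabs one frame from a networked belt camera with OpenCV; the RTSP address is a hypothetical placeholder for the industrial Ethernet configuration described above.

```python
import cv2

RTSP_URL = "rtsp://192.168.1.64/stream1"  # hypothetical camera address

def grab_initial_belt_image():
    """Grab one frame from the belt camera as the initial conveyor belt image."""
    cap = cv2.VideoCapture(RTSP_URL)
    ok, frame = cap.read()   # BGR image of shape (H, W, 3)
    cap.release()
    if not ok:
        raise RuntimeError("failed to read a frame from the belt camera")
    return frame
```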
S202, controlling a depth self-attention transformation network basic module to convert the initial conveyor belt image into a first conveyor belt characteristic image and a second conveyor belt characteristic image;
According to some embodiments, since the double-flow conveyor belt edge detection network model fuses the Transformer and the CNN to construct a corresponding encoder-decoder structure, the encoder may include two branches: a depth self-attention transformation network basic module and a convolutional neural network basic module.
In some embodiments, the depth self-attention transformation network basic module includes an image block partition (Patch Partition) layer, a linear embedding (Linear Embedding) layer, a first depth layered visual self-attention transformation network basic module using moving windows (SwinT), an image block merging (Patch Merging) layer, and a second depth layered visual self-attention transformation network basic module using moving windows (SwinT).
According to some embodiments, as shown in fig. 3, when the electronic device controls the depth self-attention transformation network basic module to convert the initial conveyor belt image into the first conveyor belt characteristic image and the second conveyor belt characteristic image, the electronic device may control the initial conveyor belt image to sequentially pass through the Patch Partition layer, the Linear Embedding layer and the first SwinT basic module to obtain the first conveyor belt characteristic image. Then, the electronic device may control the first conveyor belt characteristic image to sequentially pass through the Patch Merging layer and the second SwinT basic module to obtain the second conveyor belt characteristic image.
In some embodiments, when the electronic device controls the initial conveyor belt image to sequentially pass through the Patch Partition layer, the Linear Embedding layer and the first SwinT basic module to obtain the first conveyor belt characteristic image, first, the electronic device may convert the initial conveyor belt image into non-overlapping 4 × 4 patches through the Patch Partition layer, changing the number of channels of the belt image features to 4 × 4 × 3 = 48. Then, the electronic device may control the Linear Embedding layer to convert the conveyor belt characteristic image into an arbitrary dimension, obtaining a conveyor belt characteristic image with an image size (resolution) of H/4 × W/4 × C, wherein H denotes the height of the initial conveyor belt image, W denotes its width, and C denotes the number of channels of the belt image features. Finally, the electronic device may convert the conveyor belt characteristic image of size H/4 × W/4 × C into the first conveyor belt characteristic image through the first SwinT basic module, wherein the image size of the first conveyor belt characteristic image may be expressed as H/4 × W/4 × 32C.
In some embodiments, when the electronic device controls the first conveyor belt characteristic image to sequentially pass through the Patch Merging layer and the second SwinT basic module to obtain the second conveyor belt characteristic image, the electronic device may convert the first conveyor belt characteristic image with an image size of H/4 × W/4 × 32C into the second conveyor belt characteristic image with an image size of H/8 × W/8 × 64C through the Patch Merging layer and the second SwinT basic module.
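A minimal PyTorch sketch of the Patch Partition / Linear Embedding and Patch Merging stages described above; treating the partition plus embedding as one strided convolution is a standard equivalence, and the channel widths are left as parameters rather than fixed to the 32C/64C values of the embodiment.

```python
import torch
import torch.nn as nn

class PatchEmbed(nn.Module):
    """Patch Partition + Linear Embedding: split the H x W x 3 image into
    non-overlapping 4 x 4 patches (48 values each) and project them to C
    channels."""
    def __init__(self, embed_dim: int):
        super().__init__()
        # A 4x4 convolution with stride 4 is equivalent to flattening each
        # 4x4x3 patch (48 channels) and applying a linear layer.
        self.proj = nn.Conv2d(3, embed_dim, kernel_size=4, stride=4)

    def forward(self, x):               # x: (N, 3, H, W)
        return self.proj(x)             # (N, C, H/4, W/4)

class PatchMerging(nn.Module):
    """Concatenate each 2x2 neighborhood (4C channels) and project to 2C,
    halving the spatial resolution, as between the two SwinT stages."""
    def __init__(self, dim: int):
        super().__init__()
        self.reduction = nn.Linear(4 * dim, 2 * dim, bias=False)

    def forward(self, x):               # x: (N, C, H, W), H and W even
        tl = x[:, :, 0::2, 0::2]
        bl = x[:, :, 1::2, 0::2]
        tr = x[:, :, 0::2, 1::2]
        br = x[:, :, 1::2, 1::2]
        x = torch.cat([tl, bl, tr, br], dim=1)          # (N, 4C, H/2, W/2)
        x = self.reduction(x.permute(0, 2, 3, 1))       # linear on channels
        return x.permute(0, 3, 1, 2)                    # (N, 2C, H/2, W/2)
```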
According to some embodiments, fig. 4 shows a schematic structural diagram of a SwinT basic module provided by an embodiment of the present disclosure. As shown in fig. 4, unlike convolution, the SwinT basic module adopted in the embodiment of the present disclosure, that is, each of the first SwinT basic module and the second SwinT basic module, is composed of a first multi-head self-attention (MSA) module and a second multi-head self-attention module. The output end of the first multi-head self-attention module is connected with the input end of the second multi-head self-attention module, the input end of the first multi-head self-attention module is the input end of the SwinT basic module, and the output end of the second multi-head self-attention module is the output end of the SwinT basic module.
In some embodiments, as shown in fig. 4, the first multi-head self-attention module includes two layer normalization (LN) layers, a window-based multi-head self-attention mechanism (W-MSA), two residual connections and a multi-layer perceptron (MLP), wherein the input end of the first LN is connected with the first input end of the first residual connection, the output end of the first LN is connected with the input end of the W-MSA, the output end of the W-MSA is connected with the second input end of the first residual connection, the output end of the first residual connection is connected with the input end of the second LN and the first input end of the second residual connection respectively, the output end of the second LN is connected with the input end of the MLP, the output end of the MLP is connected with the second input end of the second residual connection, and the output end of the second residual connection is connected with the input end of the second multi-head self-attention module.
In some embodiments, as shown in fig. 4, the second multi-head self-attention module includes two LNs, a window-based multi-head self-attention mechanism incorporating a shifted-window operation (SW-MSA), two residual connections and a multi-layer perceptron (MLP), wherein the input end of the first LN is connected with the first input end of the first residual connection and the output end of the first multi-head self-attention module respectively, the output end of the first LN is connected with the input end of the SW-MSA, the output end of the SW-MSA is connected with the second input end of the first residual connection, the output end of the first residual connection is connected with the input end of the second LN and the first input end of the second residual connection, the output end of the second LN is connected with the input end of the MLP, the output end of the MLP is connected with the second input end of the second residual connection, and the output end of the second residual connection is the output end of the SwinT basic module.
In some embodiments, the MLP may consist of two Linear layers and a GELU (Gaussian Error Linear Unit) activation function.
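A sketch of that MLP, assuming the common 4× hidden expansion (the disclosure does not state the hidden width):

```python
import torch.nn as nn

class SwinMLP(nn.Module):
    """The MLP inside a SwinT basic module: two linear layers with a GELU
    activation in between; the hidden_ratio of 4 is an assumption."""
    def __init__(self, dim: int, hidden_ratio: int = 4):
        super().__init__()
        self.fc1 = nn.Linear(dim, hidden_ratio * dim)
        self.act = nn.GELU()
        self.fc2 = nn.Linear(hidden_ratio * dim, dim)

    def forward(self, x):
        return self.fc2(self.act(self.fc1(x)))
```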
According to some embodiments, the SwinT basic module may reduce the computational complexity of the conventional self-attention mechanism by using the W-MSA, and the overall computation process may refer to the following formulas:

$$\hat{z}^{l} = \text{W-MSA}(\text{LN}(z^{l-1})) + z^{l-1}$$
$$z^{l} = \text{MLP}(\text{LN}(\hat{z}^{l})) + \hat{z}^{l}$$
$$\hat{z}^{l+1} = \text{SW-MSA}(\text{LN}(z^{l})) + z^{l}$$
$$z^{l+1} = \text{MLP}(\text{LN}(\hat{z}^{l+1})) + \hat{z}^{l+1}$$

wherein $z^{l-1}$ denotes the features input into the first SwinT basic module, $\hat{z}^{l}$ denotes the features output after the W-MSA and residual connection, $z^{l}$ denotes the output features of the $l$-th SwinT basic module (i.e. the first SwinT basic module in the disclosed embodiment) after the MLP and residual connection, $\hat{z}^{l+1}$ denotes the features output after the SW-MSA and residual connection, and $z^{l+1}$ denotes the output features of the $(l+1)$-th SwinT basic module (i.e. the second SwinT basic module in the disclosed embodiment) after the MLP and residual connection.
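The four formulas map directly onto a pair of residual blocks; the sketch below assumes externally supplied W-MSA, SW-MSA and MLP modules and only fixes the wiring.

```python
import torch.nn as nn

class SwinTBlockPair(nn.Module):
    """One W-MSA block followed by one SW-MSA block, mirroring the four
    equations above. w_msa, sw_msa and the two MLPs are any modules with
    matching shapes; this is a structural sketch, not the full model."""
    def __init__(self, dim, w_msa, sw_msa, mlp1, mlp2):
        super().__init__()
        self.ln1, self.ln2 = nn.LayerNorm(dim), nn.LayerNorm(dim)
        self.ln3, self.ln4 = nn.LayerNorm(dim), nn.LayerNorm(dim)
        self.w_msa, self.sw_msa = w_msa, sw_msa
        self.mlp1, self.mlp2 = mlp1, mlp2

    def forward(self, z_prev):                              # z^{l-1}
        z_hat = self.w_msa(self.ln1(z_prev)) + z_prev       # \hat{z}^l
        z_l = self.mlp1(self.ln2(z_hat)) + z_hat            # z^l
        z_hat2 = self.sw_msa(self.ln3(z_l)) + z_l           # \hat{z}^{l+1}
        return self.mlp2(self.ln4(z_hat2)) + z_hat2         # z^{l+1}
```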
In some embodiments, when the multi-head self-attention mechanism is calculated, each head is calculated as follows:

$$\text{Attention}(Q, K, V) = \text{SoftMax}\!\left(\frac{QK^{T}}{\sqrt{d}} + B\right)V$$

wherein $Q$, $K$ and $V$ denote the Query matrix, the Key matrix and the Value matrix respectively, $M^{2}$ and $d$ denote the number of image blocks in a window and the feature dimension of the Query or Key matrix respectively, $K^{T}$ denotes the transposed matrix of the Key matrix, and $B$ is taken from the bias matrix $\hat{B} \in \mathbb{R}^{(2M-1) \times (2M-1)}$.
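For a single head, the formula corresponds to the following sketch, where the relative-position bias is assumed to have already been gathered from the (2M−1) × (2M−1) table into an M² × M² matrix:

```python
import math
import torch

def window_attention(q, k, v, bias):
    """Single-head windowed self-attention per the formula above.
    q, k, v: (num_windows, M*M, d); bias: (M*M, M*M) relative-position
    bias gathered from the (2M-1) x (2M-1) bias table."""
    d = q.size(-1)
    scores = q @ k.transpose(-2, -1) / math.sqrt(d) + bias  # (nW, M^2, M^2)
    return torch.softmax(scores, dim=-1) @ v                # (nW, M^2, d)
```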
It is easy to understand that, when the electronic device acquires the initial conveyor belt image, the electronic device may control the depth self-attention transformation network (Transformer) basic module to convert the initial conveyor belt image into the first conveyor belt characteristic image and the second conveyor belt characteristic image.
S203, controlling the convolutional neural network basic module to convert the initial conveyor belt image into a third conveyor belt characteristic image, a fourth conveyor belt characteristic image and a fifth conveyor belt characteristic image;
according to some embodiments, fig. 5 shows a schematic structural diagram of a convolutional neural network infrastructure module provided by an embodiment of the present disclosure. As shown in fig. 5, the convolutional neural network base module includes a trunk (Stem) layer, a first downsampling layer, and a second downsampling layer.
In some embodiments, as shown in fig. 5, the stem layer includes two sub-stem layers, wherein each sub-stem layer consists of a convolution with a kernel size of 3 × 3, a stride of 1 and a padding of 1, followed by batch normalization (BN) and a linear rectification function (ReLU). The output end of the first sub-stem layer is connected with the input end of the second sub-stem layer, the input end of the first sub-stem layer is the input end of the stem layer, and the output end of the second sub-stem layer is the output end of the stem layer.
In some embodiments, the stem layer may further include a convolution with convolution kernel size 3 × 3, step size 2, and padding number 1.
In some embodiments, the first and second downsampling layers may implement a max pooling (MaxPooling) operation of size 2 × 2 and stride 2, i.e. a downsampling operation. The number of channels of the belt image features is doubled after each downsampling.
According to some embodiments, as shown in fig. 3, when the electronic device controls the convolutional neural network basic module to convert the initial conveyor belt image into the third conveyor belt characteristic image, the fourth conveyor belt characteristic image and the fifth conveyor belt characteristic image, first, the electronic device may input the initial conveyor belt image into the stem layer to obtain the third conveyor belt characteristic image, whose image size may be H × W × 8C. Next, the electronic device may control the first downsampling layer to downsample the third conveyor belt characteristic image to obtain the fourth conveyor belt characteristic image, whose image size may be H/2 × W/2 × 16C. Finally, the electronic device may control the second downsampling layer to downsample the fourth conveyor belt characteristic image to obtain the fifth conveyor belt characteristic image, whose image size may be H/4 × W/4 × 32C.
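A sketch of this CNN branch under the shapes given above; where exactly the channel-doubling convolutions sit relative to the pooling layers is an assumption, since the disclosure only states that the channel count doubles after each downsampling.

```python
import torch.nn as nn

def conv_bn_relu(in_ch, out_ch):
    """One sub-stem layer: 3x3 convolution (stride 1, padding 1) + BN + ReLU."""
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, stride=1, padding=1),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
    )

class CNNBranch(nn.Module):
    """Stem layer (two sub-stem layers) plus two 2x2/stride-2 max-pooling
    downsampling layers; channel counts follow the 8C/16C/32C pattern above."""
    def __init__(self, base_ch: int):
        super().__init__()
        c = 8 * base_ch
        self.stem = nn.Sequential(conv_bn_relu(3, c), conv_bn_relu(c, c))
        self.pool = nn.MaxPool2d(kernel_size=2, stride=2)
        self.conv1 = conv_bn_relu(c, 2 * c)     # channel doubling (assumed placement)
        self.conv2 = conv_bn_relu(2 * c, 4 * c)

    def forward(self, x):                        # x: (N, 3, H, W)
        f3 = self.stem(x)                        # (N, 8C,  H,   W)
        f4 = self.conv1(self.pool(f3))           # (N, 16C, H/2, W/2)
        f5 = self.conv2(self.pool(f4))           # (N, 32C, H/4, W/4)
        return f3, f4, f5
```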
It is easy to understand that when the electronic device acquires the initial conveyor belt image, the electronic device may control the convolutional neural network base module to convert the initial conveyor belt image into a third conveyor belt characteristic image, a fourth conveyor belt characteristic image and a fifth conveyor belt characteristic image.
S204, determining a conveyor belt characteristic image set according to the first conveyor belt characteristic image, the second conveyor belt characteristic image, the third conveyor belt characteristic image, the fourth conveyor belt characteristic image and the fifth conveyor belt characteristic image;
it is easy to understand that when the electronic device acquires the first, second, third, fourth, and fifth conveyor belt feature images, the electronic device may determine a set of conveyor belt feature images, where the set of conveyor belt feature images may include the first, second, third, fourth, and fifth conveyor belt feature images.
S205, controlling an adder to add the first conveyor belt characteristic image and the fifth conveyor belt characteristic image element by element to obtain a sixth conveyor belt characteristic image, and controlling a first up-convolution unit to up-sample the sixth conveyor belt characteristic image and the second conveyor belt characteristic image to obtain a seventh conveyor belt characteristic image;
According to some embodiments, the up-convolution units provided in the embodiments of the present disclosure, such as the first up-convolution unit, the second up-convolution unit and the third up-convolution unit, may be formed by bilinear interpolation and a 1 × 1 convolution.
In some embodiments, the number of channels of the characteristic image is reduced by half after the up-sampling operation of the up-convolution unit.
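A sketch of one up-convolution unit consistent with this description (bilinear interpolation plus a channel-halving 1 × 1 convolution); fusing with the encoder feature by concatenation is an assumption, since the disclosure does not specify the fusion operation:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class UpConvUnit(nn.Module):
    """Up-convolution unit: bilinear up-sampling followed by a 1x1
    convolution that halves the channel count, then fusion with the
    encoder skip feature (fusion by concatenation is an assumption)."""
    def __init__(self, in_ch: int):
        super().__init__()
        self.proj = nn.Conv2d(in_ch, in_ch // 2, kernel_size=1)

    def forward(self, x, skip):
        x = F.interpolate(x, size=skip.shape[-2:], mode="bilinear",
                          align_corners=False)   # bilinear up-sampling
        x = self.proj(x)                         # halve the channels
        return torch.cat([x, skip], dim=1)       # fuse with encoder feature
```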
It is easy to understand that when the electronic device obtains the conveyor belt characteristic image set, the electronic device may control the adder to perform element-by-element addition on the first conveyor belt characteristic image and the fifth conveyor belt characteristic image to obtain a sixth conveyor belt characteristic image, and control the first up-convolution unit to up-sample the sixth conveyor belt characteristic image and the second conveyor belt characteristic image to obtain a seventh conveyor belt characteristic image.
S206, controlling a second up-convolution unit to up-sample the seventh conveyor belt characteristic image and the fourth conveyor belt characteristic image to obtain an eighth conveyor belt characteristic image;
It is easy to understand that when the electronic device acquires the seventh conveyor belt characteristic image, the electronic device may control the second up-convolution unit to up-sample the seventh conveyor belt characteristic image and the fourth conveyor belt characteristic image to obtain an eighth conveyor belt characteristic image.
S207, controlling a third up-convolution unit to up-sample the eighth conveyor belt characteristic image and the third conveyor belt characteristic image to obtain a ninth conveyor belt characteristic image;
according to some embodiments, when the conveyor belt characteristic image set passes through the first up-convolution unit, the second up-convolution unit and the third up-convolution unit, each up-sampling step is composed of an up-convolution operation capable of halving the number of image characteristic channels, and the extracted context characteristics are fused with the multi-scale characteristics of the encoder, so that the loss of spatial information caused by the CNN convolution operation can be offset.
It is easy to understand that, when the electronic device acquires the eighth conveyor belt characteristic image, the electronic device may control the third up-convolution unit to up-sample the eighth conveyor belt characteristic image and the third conveyor belt characteristic image to obtain a ninth conveyor belt characteristic image, wherein the image size of the ninth conveyor belt characteristic image may be set to H × W × 3.
S208, controlling a linear mapping layer to perform segmentation prediction on the ninth conveyor belt characteristic image to obtain a target conveyor belt image;
According to some embodiments, the linear mapping (Linear Projection) layer may perform pixel-level segmentation prediction on the up-sampled features obtained by the up-sampling operations, finally achieving accurate detection of the belt edge.
According to some embodiments, when the electronic device acquires the target conveyor belt image, the electronic device may also predict the target conveyor belt image according to a loss function in DFTNet.
In some embodiments, the loss function may be, for example, a cross-entropy loss function (CrossEntropy Loss), which may be used to measure the discrepancy between the predicted probability distribution and the label distribution. A smaller value of the cross-entropy loss function indicates that the predicted result is closer to the target result. The specific calculation formula is as follows:

$$\text{loss}(x, \text{class}) = -\log\!\left(\frac{\exp(x[\text{class}])}{\sum_{j}\exp(x[j])}\right) = -x[\text{class}] + \log\!\left(\sum_{j}\exp(x[j])\right)$$

wherein $x$ denotes a sample, class denotes its label value, with the positive class being 1 and the negative class being 0, and $j$ ranges over the classes.
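The formula can be checked against the standard library implementation with a few lines; the tensor shapes follow the per-pixel two-class segmentation setting above.

```python
import torch
import torch.nn.functional as F

# Minimal check that the formula above matches the library implementation:
# loss(x, class) = -x[class] + log(sum_j exp(x[j])), averaged over pixels.
logits = torch.randn(1, 2, 4, 4)              # (N, classes, H, W) prediction
target = torch.randint(0, 2, (1, 4, 4))       # per-pixel labels (0 or 1)

manual = (-logits.gather(1, target.unsqueeze(1)).squeeze(1)
          + torch.logsumexp(logits, dim=1)).mean()
builtin = F.cross_entropy(logits, target)
assert torch.allclose(manual, builtin)
```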
According to some embodiments, when the DFTNet was constructed, an industrial belt data set was also constructed, and the DFTNet, the traditional conveyor belt deviation detection technology and the conveyor belt deviation detection technology based on visual images were verified through a large number of experiments on this data set. The verification results show that, compared with the traditional conveyor belt deviation detection technology and the conveyor belt deviation detection technology based on visual images, the DFTNet obtains the best mIoU, ACC, mF1 and mRecall scores; its floating-point operations and parameter quantity are superior while detection precision is guaranteed; and its processed image frame rate reaches 53.07 fps, making the DFTNet more suitable for actual industrial scenes.
In some embodiments, the DFTNet can be further improved and optimized, and the industrial belt data set can be expanded at the same time, so that the DFTNet has higher generalization and can be applied more effectively to belt edge detection in industrial scenes with more complex backgrounds.
It is easy to understand that, when the electronic device acquires the ninth conveyor belt characteristic image, the electronic device may control the linear mapping layer to perform segmentation prediction on the ninth conveyor belt characteristic image to obtain a target conveyor belt image, wherein the image size of the target conveyor belt image may be set to H × W × 2.
S209, performing belt deviation detection on the target conveyor belt image to obtain a belt deviation result.
According to some embodiments, when the electronic device performs belt deviation detection on the target conveyor belt image to obtain a belt deviation result, first, the electronic device may determine the offset corresponding to the edge center line in the target conveyor belt image. Then, the electronic device may determine the belt deviation result corresponding to the target conveyor belt image according to the offset and an offset threshold.
In some embodiments, fig. 6 shows a detection schematic diagram of belt deviation detection provided by the embodiment of the disclosure. As shown in fig. 6, $x_{l}^{s}$ is the abscissa of the left edge starting point in the target conveyor belt image, $x_{r}^{s}$ is the abscissa of the right edge starting point, $x_{l}^{e}$ is the abscissa of the left edge end point, and $x_{r}^{e}$ is the abscissa of the right edge end point; $\bar{x}^{s}$ is the average of the abscissas of the starting points on both sides, and $\bar{x}^{e}$ is the average of the abscissas of the end points on both sides, wherein

$$\bar{x}^{s} = \frac{x_{l}^{s} + x_{r}^{s}}{2}, \qquad \bar{x}^{e} = \frac{x_{l}^{e} + x_{r}^{e}}{2}$$
in some embodiments, the average of the abscissa of the two side start points and the average of the abscissa of the two side end points are used to determine the offset corresponding to the edge centerline in the image of the target conveyor belt according to the following equationδ
Figure 355785DEST_PATH_IMAGE016
Wherein the content of the first and second substances,
Figure 298333DEST_PATH_IMAGE017
represents the average value of the abscissa of the starting points on both sides in the label graph,
Figure 736268DEST_PATH_IMAGE018
represents the average of the abscissa of both side end points in the label graph.
In some embodiments, the label graph refers to a conveyor belt image in which the belt is in a non-deviating state.
In some embodiments, when the offset $\delta$ is larger than the offset threshold, the belt deviation result may be considered to be that the conveyor belt has deviated or shows a sign of deviation.
In some embodiments, when the electronic device determines that the belt deviation result is belt deviation, or that the belt shows a sign of deviation, the electronic device may issue an abnormality warning.
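Putting the offset computation and the threshold test together, a sketch of the deviation decision might look as follows; the averaging form of the offset follows the reconstruction above, and the edge coordinates are assumed to have been extracted from the target conveyor belt image.

```python
def belt_deviation(xs_l, xs_r, xe_l, xe_r,
                   label_xs_avg, label_xe_avg, threshold):
    """Return True if the belt is deviating (or shows a sign of deviation).
    The averaging form of the offset is a reconstruction, not verbatim
    from the disclosure."""
    xs_avg = (xs_l + xs_r) / 2   # average abscissa of the two starting points
    xe_avg = (xe_l + xe_r) / 2   # average abscissa of the two end points
    delta = (abs(xs_avg - label_xs_avg) + abs(xe_avg - label_xe_avg)) / 2
    return delta > threshold
```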
According to some embodiments, fig. 7 shows a scene schematic diagram of a conveyor belt edge detection method provided by an embodiment of the disclosure. As shown in fig. 7, the camera may input a field picture of the conveyor belt to the server through the data transmission network, and the server may determine the target conveyor belt image and the belt deviation result according to the field picture, store the belt deviation result in a log, and display the target conveyor belt image, the belt deviation result and the stored log on the display screen.
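A minimal sketch of the fig. 7 scene as glue code is shown below, assuming an OpenCV camera source and a trained segmentation model; the extraction of edge coordinates from the mask is elided because it depends on the deployment, and every name here is illustrative rather than part of the disclosure.

import logging

import cv2
import torch

logging.basicConfig(filename="belt_deviation.log", level=logging.INFO)

def monitor(model, threshold, source=0):
    """Read field pictures, segment the belt, then log and display results."""
    cap = cv2.VideoCapture(source)
    while cap.isOpened():
        ok, frame = cap.read()
        if not ok:
            break
        x = torch.from_numpy(frame).permute(2, 0, 1).float().unsqueeze(0) / 255.0
        with torch.no_grad():
            mask = model(x).argmax(dim=1)[0].byte().numpy()  # 0 = background, 1 = belt
        # ...derive the edge start/end abscissas from `mask`, compute
        # delta = edge_offset(...), and compare it with `threshold`...
        logging.info("frame processed, belt pixels = %d", int(mask.sum()))
        cv2.imshow("target conveyor belt image", mask * 255)
        if cv2.waitKey(1) & 0xFF == ord("q"):
            break
    cap.release()
    cv2.destroyAllWindows()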
It is easy to understand that when the electronic device acquires the target conveyor belt image, the electronic device can perform belt deviation detection on the target conveyor belt image to obtain the belt deviation result.
In summary, in the method provided by the embodiment of the disclosure, an initial conveyor belt image is obtained; the depth self-attention transformation network basic module is controlled to convert the initial conveyor belt image into a first conveyor belt characteristic image and a second conveyor belt characteristic image; the convolutional neural network basic module is controlled to convert the initial conveyor belt image into a third conveyor belt characteristic image, a fourth conveyor belt characteristic image and a fifth conveyor belt characteristic image; a conveyor belt characteristic image set is determined according to the first to fifth conveyor belt characteristic images; the adder is controlled to add the first conveyor belt characteristic image and the fifth conveyor belt characteristic image element by element to obtain a sixth conveyor belt characteristic image; the first up-convolution unit is controlled to up-sample the sixth conveyor belt characteristic image and the second conveyor belt characteristic image to obtain a seventh conveyor belt characteristic image; the second up-convolution unit is controlled to up-sample the seventh conveyor belt characteristic image and the fourth conveyor belt characteristic image to obtain an eighth conveyor belt characteristic image; the third up-convolution unit is controlled to up-sample the eighth conveyor belt characteristic image and the third conveyor belt characteristic image to obtain a ninth conveyor belt characteristic image; the linear mapping layer is controlled to perform segmentation prediction on the ninth conveyor belt characteristic image to obtain the target conveyor belt image; and belt deviation detection is performed on the target conveyor belt image to obtain the belt deviation result. Therefore, a dual-stream conveyor belt edge detection network model comprising a depth self-attention transformation network and a convolutional neural network is adopted to detect the conveyor belt edge, and the respective advantages of the two networks are used to extract local and global features: the local feature extraction capability of the convolution in the convolutional neural network is combined with the perception capability of the depth self-attention transformation network structure for global and long-distance information, so that the belt edge detection precision can be improved and the interference of belt image noise and background can be suppressed. In addition, an interactive fusion state is reached through channel addition and up-sampling operations, the context dependency relationship among the belt characteristic images is established, local detail information is enriched, and the feature extraction capability of the network is enhanced.
Meanwhile, by designing the feature fusion module of the depth self-attention transformation network and the convolutional neural network, an encoder-decoder structure is formed, global context information can be fused better, pre-training of the depth self-attention transformation network structure on a large-scale data set is avoided, and the network structure can be adjusted flexibly, so that the accuracy and real-time performance of conveyor belt edge detection can be improved.
In the technical scheme of the present disclosure, the collection, storage, use, processing, transmission, provision, disclosure and other handling of the personal information of the users involved all comply with the provisions of relevant laws and regulations and do not violate public order and good customs.
The following are embodiments of the disclosed apparatus that may be used to perform embodiments of the disclosed methods. For details not disclosed in the embodiments of the apparatus of the present disclosure, refer to the embodiments of the method of the present disclosure.
Please refer to fig. 8, which illustrates a schematic structural diagram of a first conveyor belt edge detecting device according to an embodiment of the present disclosure. The conveyor belt edge detection device may be implemented as all or part of a device, in software, hardware, or a combination of both. The conveyor belt edge detection apparatus 800 includes an image acquisition unit 801, an image encoding unit 802, and an image decoding unit 803, in which:
an image acquisition unit 801 for acquiring an initial conveyor belt image;
an image encoding unit 802, configured to control an encoder in a dual-stream conveyor belt edge detection network model to encode the initial conveyor belt image to obtain a conveyor belt feature image set, where the dual-stream conveyor belt edge detection network model includes a depth self-attention transformation network and a convolutional neural network;
and an image decoding unit 803, configured to control a decoder in the dual-stream conveyor belt edge detection network model to decode the conveyor belt feature image set to obtain a target conveyor belt image.
Optionally, fig. 9 shows a schematic structural diagram of a second conveyor belt edge detection device provided in the embodiment of the present disclosure. As shown in fig. 9, the encoder includes a depth self-attention transformation network basic module and a convolutional neural network basic module, and the image encoding unit 802 includes a first image conversion subunit 812, a second image conversion subunit 822 and a set determination subunit 832. When controlling the encoder in the dual-stream conveyor belt edge detection network model to encode the initial conveyor belt image to obtain the conveyor belt feature image set:
a first image conversion subunit 812 is configured to control the depth self-attention transformation network basic module to convert the initial conveyor belt image into a first conveyor belt feature image and a second conveyor belt feature image;
a second image conversion subunit 822 is configured to control the convolutional neural network basic module to convert the initial conveyor belt image into a third conveyor belt feature image, a fourth conveyor belt feature image and a fifth conveyor belt feature image;
and a set determination subunit 832 is configured to determine the conveyor belt feature image set from the first conveyor belt feature image, the second conveyor belt feature image, the third conveyor belt feature image, the fourth conveyor belt feature image and the fifth conveyor belt feature image.
Optionally, the depth self-attention transformation network basic module includes an image block segmentation layer, a linear embedding layer, a first depth layered visual self-attention transformation network basic module using moving windows, an image block merging layer, and a second depth layered visual self-attention transformation network basic module using moving windows. When controlling the depth self-attention transformation network basic module to convert the initial conveyor belt image into the first conveyor belt feature image and the second conveyor belt feature image, the first image conversion subunit 812 is specifically configured for:
controlling the initial conveyor belt image to sequentially pass through the image block segmentation layer, the linear embedding layer and the first depth layered visual self-attention transformation network basic module using moving windows to obtain the first conveyor belt feature image;
and controlling the first conveyor belt feature image to sequentially pass through the image block merging layer and the second depth layered visual self-attention transformation network basic module using moving windows to obtain the second conveyor belt feature image.
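The structure of this subunit can be sketched in PyTorch as below; for brevity, the windowed attention of the depth layered visual self-attention transformation network is stood in for by plain multi-head self-attention, the image block merging layer by a strided convolution, and the embedding width of 96 is an assumption rather than a value taken from this disclosure.

import torch.nn as nn

class TransformerStage(nn.Module):
    """Stand-in for one attention stage; a real implementation would use
    shifted-window attention, which is far cheaper on large token grids."""
    def __init__(self, dim, heads=4):
        super().__init__()
        self.norm1 = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm2 = nn.LayerNorm(dim)
        self.mlp = nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(),
                                 nn.Linear(4 * dim, dim))

    def forward(self, x):                       # x: (B, N, C)
        y = self.norm1(x)
        x = x + self.attn(y, y, y)[0]
        return x + self.mlp(self.norm2(x))

class TransformerBranch(nn.Module):
    def __init__(self, in_ch=3, dim=96, patch=4):
        super().__init__()
        # Image block segmentation + linear embedding as one strided conv.
        self.embed = nn.Conv2d(in_ch, dim, kernel_size=patch, stride=patch)
        self.stage1 = TransformerStage(dim)
        # Image block merging approximated by a strided convolution.
        self.merge = nn.Conv2d(dim, 2 * dim, kernel_size=2, stride=2)
        self.stage2 = TransformerStage(2 * dim)

    def _run(self, stage, fmap):                # (B, C, H, W) -> same shape
        b, c, h, w = fmap.shape
        tokens = stage(fmap.flatten(2).transpose(1, 2))
        return tokens.transpose(1, 2).reshape(b, c, h, w)

    def forward(self, x):
        f1 = self._run(self.stage1, self.embed(x))   # first feature image, H/4
        f2 = self._run(self.stage2, self.merge(f1))  # second feature image, H/8
        return f1, f2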
Optionally, the convolutional neural network basic module includes a trunk layer, a first downsampling layer and a second downsampling layer. When controlling the convolutional neural network basic module to convert the initial conveyor belt image into the third conveyor belt feature image, the fourth conveyor belt feature image and the fifth conveyor belt feature image, the second image conversion subunit 822 is specifically configured for:
inputting the initial conveyor belt image to the trunk layer to obtain the third conveyor belt feature image;
controlling the first downsampling layer to downsample the third conveyor belt feature image to obtain the fourth conveyor belt feature image;
and controlling the second downsampling layer to downsample the fourth conveyor belt feature image to obtain the fifth conveyor belt feature image.
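Correspondingly, the trunk layer and the two downsampling layers can be sketched as follows; the channel widths (48, 64, 96) and the full-resolution trunk are assumptions chosen only so that the fifth feature image matches the first (transformer-branch) feature image for the element-wise addition described below.

import torch.nn as nn

def conv_block(cin, cout, stride):
    return nn.Sequential(
        nn.Conv2d(cin, cout, 3, stride=stride, padding=1, bias=False),
        nn.BatchNorm2d(cout),
        nn.ReLU(inplace=True))

class CNNBranch(nn.Module):
    def __init__(self, in_ch=3):
        super().__init__()
        self.trunk = conv_block(in_ch, 48, stride=1)  # third feature image, H
        self.down1 = conv_block(48, 64, stride=2)     # fourth feature image, H/2
        self.down2 = conv_block(64, 96, stride=2)     # fifth feature image, H/4

    def forward(self, x):
        f3 = self.trunk(x)
        f4 = self.down1(f3)
        f5 = self.down2(f4)
        return f3, f4, f5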
Optionally, the decoder includes an adder, a first up-convolution unit, a second up-convolution unit, a third up-convolution unit and a linear mapping layer. When controlling the decoder in the dual-stream conveyor belt edge detection network model to decode the conveyor belt feature image set to obtain the target conveyor belt image, the image decoding unit 803 is specifically configured for:
controlling the adder to add the first conveyor belt feature image and the fifth conveyor belt feature image element by element to obtain a sixth conveyor belt feature image;
controlling the first up-convolution unit to up-sample the sixth conveyor belt feature image and the second conveyor belt feature image to obtain a seventh conveyor belt feature image;
controlling the second up-convolution unit to up-sample the seventh conveyor belt feature image and the fourth conveyor belt feature image to obtain an eighth conveyor belt feature image;
controlling the third up-convolution unit to up-sample the eighth conveyor belt feature image and the third conveyor belt feature image to obtain a ninth conveyor belt feature image;
and controlling the linear mapping layer to perform segmentation prediction on the ninth conveyor belt feature image to obtain the target conveyor belt image.
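Putting the pieces together, the decoder and the full dual-stream model can be sketched as follows, reusing the TransformerBranch and CNNBranch sketches above; interpreting the up-convolution unit as transposed-convolution upsampling followed by concatenation and a 3×3 fusion convolution is an assumption about what up-sampling two feature images means, not a detail fixed by this disclosure.

import torch
import torch.nn as nn

class UpFuse(nn.Module):
    """Up-convolution unit: upsample the coarser map 2x, concatenate the
    skip map, and fuse with a 3x3 convolution."""
    def __init__(self, cin_coarse, cin_skip, cout):
        super().__init__()
        self.up = nn.ConvTranspose2d(cin_coarse, cin_coarse, 2, stride=2)
        self.fuse = nn.Sequential(
            nn.Conv2d(cin_coarse + cin_skip, cout, 3, padding=1, bias=False),
            nn.BatchNorm2d(cout),
            nn.ReLU(inplace=True))

    def forward(self, coarse, skip):
        return self.fuse(torch.cat([self.up(coarse), skip], dim=1))

class DFTNetSketch(nn.Module):
    def __init__(self, num_classes=2):
        super().__init__()
        self.t = TransformerBranch()        # f1: (96, H/4), f2: (192, H/8)
        self.c = CNNBranch()                # f3: (48, H), f4: (64, H/2), f5: (96, H/4)
        self.up1 = UpFuse(192, 96, 96)
        self.up2 = UpFuse(96, 64, 64)
        self.up3 = UpFuse(64, 48, 48)
        self.head = nn.Conv2d(48, num_classes, 1)   # linear mapping layer

    def forward(self, x):
        f1, f2 = self.t(x)
        f3, f4, f5 = self.c(x)
        f6 = f1 + f5                 # adder: element-wise addition
        f7 = self.up1(f2, f6)        # second + sixth -> seventh
        f8 = self.up2(f7, f4)        # seventh + fourth -> eighth
        f9 = self.up3(f8, f3)        # eighth + third -> ninth
        return self.head(f9)         # (B, 2, H, W) segmentation logits

# Usage: a (1, 3, 256, 256) input yields (1, 2, 256, 256) logits,
# matching the H×W×2 target conveyor belt image described above.
model = DFTNetSketch()
logits = model(torch.randn(1, 3, 256, 256))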
Optionally, fig. 10 shows a schematic structural diagram of a third conveyor belt edge detection device provided in the embodiment of the disclosure. As shown in fig. 10, after the target conveyor belt image is obtained, the conveyor belt edge detection apparatus 800 further includes:
a deviation detection unit 804, configured to perform belt deviation detection on the target conveyor belt image to obtain a belt deviation result.
Optionally, when performing belt deviation detection on the target conveyor belt image to obtain the belt deviation result, the deviation detection unit 804 is specifically configured for:
determining the offset corresponding to the edge center line in the target conveyor belt image;
and determining the belt deviation result corresponding to the target conveyor belt image according to the offset and the offset threshold.
It should be noted that, when the conveyor belt edge detection apparatus provided in the foregoing embodiment executes the conveyor belt edge detection method, only the division of the above functional modules is taken as an example, and in practical applications, the above function distribution may be completed by different functional modules according to needs, that is, the internal structure of the apparatus is divided into different functional modules, so as to complete all or part of the above described functions. In addition, the conveyor belt edge detection device provided by the above embodiment and the conveyor belt edge detection method embodiment belong to the same concept, and the detailed implementation process is shown in the method embodiment and is not described herein again.
In summary, in the apparatus provided by the embodiment of the present disclosure, the image acquisition unit acquires an initial conveyor belt image; the image encoding unit controls the encoder in the dual-stream conveyor belt edge detection network model to encode the initial conveyor belt image to obtain a conveyor belt feature image set, wherein the dual-stream conveyor belt edge detection network model includes a depth self-attention transformation network and a convolutional neural network; and the image decoding unit controls the decoder in the dual-stream conveyor belt edge detection network model to decode the conveyor belt feature image set to obtain a target conveyor belt image. Therefore, a dual-stream conveyor belt edge detection network model comprising a depth self-attention transformation network and a convolutional neural network is adopted to detect the conveyor belt edge, the local feature extraction capability of the convolution in the convolutional neural network is combined with the perception capability of the depth self-attention transformation network structure for global and long-distance information, the belt edge detection precision can be improved, and belt image noise and background interference can be suppressed. Meanwhile, by designing the feature fusion module of the depth self-attention transformation network and the convolutional neural network, an encoder-decoder structure is formed, global context information can be fused better, pre-training of the depth self-attention transformation network structure on a large-scale data set is avoided, and the network structure can be adjusted flexibly, so that the accuracy and real-time performance of conveyor belt edge detection can be improved.
In the technical scheme of the present disclosure, the collection, storage, use, processing, transmission, provision, disclosure and other handling of the personal information of the users involved all comply with the provisions of relevant laws and regulations and do not violate public order and good customs.
The present disclosure also provides an electronic device, a readable storage medium, and a computer program product according to embodiments of the present disclosure.
FIG. 11 shows a schematic block diagram of an example electronic device 1100 that may be used to implement embodiments of the present disclosure. The components shown herein, their connections and relationships, and their functions, are meant to be examples only, and are not meant to limit implementations of the disclosure described and/or claimed herein.
As shown in fig. 11, the electronic device 1100 includes a computing unit 1101, which can perform various appropriate actions and processes according to a computer program stored in a Read-Only Memory (ROM) 1102 or a computer program loaded from a storage unit 1108 into a Random Access Memory (RAM) 1103. In the RAM 1103, various programs and data necessary for the operation of the electronic device 1100 may also be stored. The computing unit 1101, the ROM 1102 and the RAM 1103 are connected to each other by a bus 1104. An input/output (I/O) interface 1105 is also connected to the bus 1104.
A number of components in electronic device 1100 connect to I/O interface 1105, including: an input unit 1106 such as a keyboard, a mouse, and the like; an output unit 1107 such as various types of displays, speakers, and the like; a storage unit 1108, such as a magnetic disk, optical disk, or the like; and a communication unit 1109 such as a network card, a modem, a wireless communication transceiver, and the like. The communication unit 1109 allows the electronic device 1100 to exchange information/data with other devices through a computer network such as the internet and/or various telecommunication networks.
The computing unit 1101 can be a variety of general purpose and/or special purpose processing components having processing and computing capabilities. Some examples of the computing unit 1101 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various dedicated Artificial Intelligence (AI) computing chips, various computing units running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, microcontroller, and the like. The computing unit 1101 performs the various methods and processes described above, such as the conveyor belt edge detection method. For example, in some embodiments, the conveyor belt edge detection method may be implemented as a computer software program tangibly embodied on a machine-readable medium, such as the storage unit 1108. In some embodiments, part or all of the computer program may be loaded and/or installed onto the electronic device 1100 via the ROM 1102 and/or the communication unit 1109. When the computer program is loaded into the RAM 1103 and executed by the computing unit 1101, one or more steps of the conveyor belt edge detection method described above may be performed. Alternatively, in other embodiments, the computing unit 1101 may be configured to perform the conveyor belt edge detection method by any other suitable means (e.g., by means of firmware).
Various implementations of the systems and techniques described here above may be implemented in digital electronic circuitry, integrated circuitry, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), Systems on Chip (SOCs), Complex Programmable Logic Devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include: implemented in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, receiving data and instructions from, and transmitting data and instructions to, a storage system, at least one input device, and at least one output device.
Program code for implementing the methods of the present disclosure may be written in any combination of one or more programming languages. These program codes may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the program codes, when executed by the processor or controller, cause the functions/operations specified in the flowcharts and/or block diagrams to be performed. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package partly on the machine and partly on a remote machine, or entirely on the remote machine or electronic device.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user can be received in any form, including acoustic, speech, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: Local Area Networks (LANs), Wide Area Networks (WANs), the internet, and blockchain networks.
The computer system may include a client and a server. The client and the server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server may be a cloud server, also called a cloud computing server or cloud host, which is a host product in the cloud computing service system and overcomes the defects of high management difficulty and weak service extensibility in conventional physical hosts and VPS ("Virtual Private Server") services. The server may also be a server of a distributed system, or a server combined with a blockchain.
It should be understood that various forms of the flows shown above may be used, with steps reordered, added, or deleted. For example, the steps described in the present disclosure may be executed in parallel, sequentially, or in different orders, as long as the desired results of the technical solutions disclosed in the present disclosure can be achieved, and the present disclosure is not limited herein.
The above detailed description should not be construed as limiting the scope of the disclosure. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made in accordance with design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present disclosure should be included in the scope of protection of the present disclosure.

Claims (10)

1. A conveyor belt edge detection method, comprising:
acquiring an initial conveyor belt image;
controlling an encoder in a dual-stream conveyor belt edge detection network model to encode the initial conveyor belt image to obtain a conveyor belt characteristic image set, wherein the dual-stream conveyor belt edge detection network model comprises a depth self-attention transformation network and a convolutional neural network;
and controlling a decoder in the dual-stream conveyor belt edge detection network model to decode the conveyor belt characteristic image set to obtain a target conveyor belt image.
2. The method of claim 1, wherein the encoder comprises a depth self-attention transformation network basic module and a convolutional neural network basic module, and the controlling the encoder in the dual-stream conveyor belt edge detection network model to perform encoding processing on the initial conveyor belt image to obtain a conveyor belt characteristic image set comprises:
controlling the depth self-attention transformation network basic module to convert the initial conveyor belt image into a first conveyor belt characteristic image and a second conveyor belt characteristic image;
controlling the convolutional neural network basic module to convert the initial conveyor belt image into a third conveyor belt characteristic image, a fourth conveyor belt characteristic image and a fifth conveyor belt characteristic image;
and determining the conveyor belt characteristic image set according to the first conveyor belt characteristic image, the second conveyor belt characteristic image, the third conveyor belt characteristic image, the fourth conveyor belt characteristic image and the fifth conveyor belt characteristic image.
3. The method of claim 2, wherein the depth self-attention transformation network basic module comprises an image block segmentation layer, a linear embedding layer, a first depth layered visual self-attention transformation network basic module using moving windows, an image block merging layer, and a second depth layered visual self-attention transformation network basic module using moving windows, and the controlling the depth self-attention transformation network basic module to convert the initial conveyor belt image into the first conveyor belt characteristic image and the second conveyor belt characteristic image comprises:
controlling the initial conveyor belt image to sequentially pass through the image block segmentation layer, the linear embedding layer and the first depth layered visual self-attention transformation network basic module using moving windows to obtain the first conveyor belt characteristic image;
and controlling the first conveyor belt characteristic image to sequentially pass through the image block merging layer and the second depth layered visual self-attention transformation network basic module using moving windows to obtain the second conveyor belt characteristic image.
4. The method of claim 2, wherein the convolutional neural network basic module comprises a trunk layer, a first downsampling layer and a second downsampling layer, and the controlling the convolutional neural network basic module to convert the initial conveyor belt image into the third conveyor belt characteristic image, the fourth conveyor belt characteristic image and the fifth conveyor belt characteristic image comprises:
inputting the initial conveyor belt image to the trunk layer to obtain the third conveyor belt characteristic image;
controlling the first downsampling layer to downsample the third conveyor belt characteristic image to obtain a fourth conveyor belt characteristic image;
and controlling the second down-sampling layer to down-sample the fourth conveyor belt characteristic image to obtain the fifth conveyor belt characteristic image.
5. The method of claim 2, wherein the decoder comprises an adder, a first up-convolution unit, a second up-convolution unit, a third up-convolution unit and a linear mapping layer, and the controlling the decoder in the dual-stream conveyor belt edge detection network model to decode the conveyor belt characteristic image set to obtain the target conveyor belt image comprises:
controlling the adder to add the first conveyor belt characteristic image and the fifth conveyor belt characteristic image element by element to obtain a sixth conveyor belt characteristic image;
controlling the first up-convolution unit to up-sample the sixth conveyor belt characteristic image and the second conveyor belt characteristic image to obtain a seventh conveyor belt characteristic image;
controlling the second up-convolution unit to up-sample the seventh conveyor belt characteristic image and the fourth conveyor belt characteristic image to obtain an eighth conveyor belt characteristic image;
controlling the third up-convolution unit to up-sample the eighth conveyor belt characteristic image and the third conveyor belt characteristic image to obtain a ninth conveyor belt characteristic image;
and controlling the linear mapping layer to perform segmentation prediction on the ninth conveyor belt characteristic image to obtain the target conveyor belt image.
6. The method of claim 1, further comprising, after said obtaining a target conveyor belt image:
and carrying out belt deviation detection on the target conveyor belt image to obtain a belt deviation result.
7. The method of claim 6, wherein the performing belt deviation detection on the target conveyor belt image to obtain a belt deviation result comprises:
determining the offset corresponding to the edge center line in the target conveyor belt image;
and determining the belt deviation result corresponding to the target conveyor belt image according to the offset and an offset threshold.
8. A conveyor belt edge detection device, comprising:
the image acquisition unit is used for acquiring an initial conveyor belt image;
the image encoding unit is used for controlling an encoder in a dual-stream conveyor belt edge detection network model to encode the initial conveyor belt image to obtain a conveyor belt characteristic image set, wherein the dual-stream conveyor belt edge detection network model comprises a depth self-attention transformation network and a convolutional neural network;
and the image decoding unit is used for controlling a decoder in the dual-stream conveyor belt edge detection network model to decode the conveyor belt characteristic image set to obtain a target conveyor belt image.
9. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein the content of the first and second substances,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-7.
10. A non-transitory computer readable storage medium having stored thereon computer instructions for causing the computer to perform the method of any one of claims 1-7.
CN202211279319.3A 2022-10-19 2022-10-19 Conveyor belt edge detection method, conveyor belt edge detection device, electronic equipment and storage medium Active CN115359055B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211279319.3A CN115359055B (en) 2022-10-19 2022-10-19 Conveyor belt edge detection method, conveyor belt edge detection device, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN115359055A (en) 2022-11-18
CN115359055B CN115359055B (en) 2023-02-07

Family

ID=84007655

Country Status (1)

Country Link
CN (1) CN115359055B (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6377347B1 (en) * 2000-01-20 2002-04-23 Xerox Corporation Belt edge sensor
US6523400B1 (en) * 2000-03-17 2003-02-25 Adel Abdel Aziz Ahmed Method and apparatus for detecting timing belt damage using link-coupled feedback
CN103502121A (en) * 2011-02-15 2014-01-08 得利捷Ip科技有限公司 A method for image acquisition
CN114742864A (en) * 2022-03-18 2022-07-12 国能网信科技(北京)有限公司 Belt deviation detection method and device
CN114772208A (en) * 2022-03-31 2022-07-22 东北大学 Non-contact belt tearing detection system and method based on image segmentation
CN114926733A (en) * 2022-05-13 2022-08-19 太原理工大学 Conveyor belt tearing detection method for improved regional convolutional neural network
CN115082403A (en) * 2022-06-22 2022-09-20 南京北新智能科技有限公司 Belt deviation detection algorithm based on semantic segmentation
CN115170527A (en) * 2022-07-05 2022-10-11 武汉理工大学 Visual detection method and device for deviation of conveying belt, electronic equipment and storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
刘振东 (Liu Zhendong) et al.: "Defect detection algorithm based on X-ray images of steel-cord conveyor belts", 《测试技术学报》 (Journal of Test and Measurement Technology) *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant