CN118379472A - Automatic orientation method and equipment for safety shell appearance acquisition equipment based on deep learning

Automatic orientation method and equipment for safety shell appearance acquisition equipment based on deep learning

Info

Publication number
CN118379472A
Authority
CN
China
Prior art keywords
prism
center
image
data set
identification data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202410540806.3A
Other languages
Chinese (zh)
Inventor
邢诚
虞剑
黄晶晶
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuhan University WHU
Original Assignee
Wuhan University WHU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuhan University WHU
Priority to CN202410540806.3A
Publication of CN118379472A
Legal status: Pending

Landscapes

  • Image Analysis (AREA)

Abstract

The invention provides an automatic orientation method and equipment for containment appearance acquisition equipment based on deep learning, comprising: erecting the acquisition equipment and an orientation prism on two known points respectively; controlling the containment appearance acquisition equipment to roughly aim at the orientation prism and photographing the orientation prism; collecting pictures in advance to construct a prism identification data set, cropping images of the prism region out of the prism identification data set to construct a center identification data set, and building an automatic prism center detection model comprising two stages; inputting the photograph of the orientation prism into the trained automatic prism center detection model, extracting the image-space coordinates of the prism center point, and calculating the image-space offset distance between those coordinates and the image principal point; and calculating the object-space offset distance and offset angle between the center of the orientation mark and the image principal point, controlling the pan-tilt head to rotate so that the center of the orientation mark coincides with the image principal point so as to aim precisely at the prism, and photographing the prism again after re-aiming to check whether the prism center coincides with the image principal point.

Description

Automatic orientation method and equipment for safety shell appearance acquisition equipment based on deep learning
Technical Field
The invention belongs to the field of nuclear power plant containment inspection, and in particular relates to an automatic orientation technical solution for containment appearance acquisition equipment based on deep learning.
Background
After long-term operation, the containment of a nuclear power plant develops appearance defects due to environmental effects. To prevent further damage to the internal structure of the containment, these appearance defects must be discovered in time and addressed with effective reinforcement.
In the past, containment appearance inspection was mainly performed by manual visual inspection, which is time-consuming, risky and inefficient. With the development of technology, close-range image acquisition devices have been applied to containment visual inspection tasks; however, such equipment is structurally complex and heavy, and is inconvenient to install and carry. To solve these problems, Wuhan University developed a containment remote acquisition device that realizes containment appearance inspection through image acquisition at multiple stations and image stitching. The remote acquisition device has the advantages of light weight, simple structure, convenient installation and high acquisition speed. It drives the camera to rotate for photographing through a pan-tilt head; unlike a total station, however, the pan-tilt head has no sighting telescope and no automatic orientation function. To unify the pan-tilt coordinate systems across multiple stations, i.e., to bring the image data from different stations into the same coordinate system, the photographic equipment must be erected on a known point while another known point serves as the zero direction for orientation. From the direction of the line connecting the known station and the orientation point, the rotation and translation between the pan-tilt coordinate systems on different stations can be calculated, and the systems unified. A method for automatically returning the photographic equipment to the zero direction works as follows: the photographic equipment is erected at one known point and a prism at another; the camera captures an image of the prism; the prism center point is automatically extracted using feature extraction methods from image processing to obtain its coordinates; the offset between the center point and the image principal point is calculated; the object-space offset distance and angle between the prism center and the principal optical axis are then computed; and the pan-tilt head is rotated so that the principal optical axis coincides with the prism center.
In view of the foregoing, a more accurate and efficient automatic orientation method for the containment remote acquisition device is strongly needed. The traditional method adopts threshold segmentation: the 3 triangles on the prism target are extracted according to their yellow-green color and area features, the left and right triangle vertices are connected into a horizontal line, a vertical line is drawn through the top triangle vertex perpendicular to that horizontal line, and the intersection of the two lines is taken as the prism center. This way of extracting the prism center is greatly affected by the external environment and its accuracy is mediocre. A sketch of this prior-art approach is given below.
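For concreteness, the following OpenCV sketch illustrates the traditional threshold-segmentation approach just described; the HSV color range and the use of contour centroids as stand-ins for the triangle vertices are assumptions for illustration, not values from the patent.

```python
import cv2

def prism_center_threshold(img_bgr):
    """Prior-art sketch: segment the yellow-green triangles by color and area,
    then intersect the horizontal line through the left/right vertices with
    the vertical line through the top vertex."""
    hsv = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2HSV)
    mask = cv2.inRange(hsv, (30, 80, 80), (90, 255, 255))  # hypothetical yellow-green range
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    if len(contours) < 3:
        return None
    tris = sorted(contours, key=cv2.contourArea, reverse=True)[:3]  # area feature
    pts = [cv2.minEnclosingCircle(c)[0] for c in tris]  # centroid stands in for each vertex
    pts.sort(key=lambda p: p[1])                        # topmost triangle first
    top = pts[0]
    left, right = sorted(pts[1:], key=lambda p: p[0])
    # vertical line through the top vertex meets the horizontal line here
    return top[0], (left[1] + right[1]) / 2.0
```

As the Background notes, such color thresholds drift with lighting and weather, which is precisely the weakness the deep-learning approach below is meant to remove.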
Disclosure of Invention
The invention aims to solve the problem of automatic orientation of containment appearance acquisition equipment.
In order to overcome the defects of the prior art, the technical solution provided by the invention comprises an automatic orientation method for containment appearance acquisition equipment based on deep learning, comprising the following steps:
Step 1, erecting the acquisition equipment and the orientation prism on two known points respectively;
Step 2, controlling the containment appearance acquisition equipment to roughly aim at the orientation prism, and photographing the orientation prism;
Step 3, collecting pictures of the orientation prism in different environments in advance and constructing a prism identification data set; cropping images of the prism region out of the prism identification data set to construct a center identification data set; and constructing an automatic prism center detection model, wherein the prism center detection model comprises two stages: the first stage identifies the prism in the image, and the detected partial image containing the prism is cropped and passed to the second stage, which further identifies the prism center from the partial image; the automatic prism center detection model is trained with the prism identification data set and the center identification data set;
Step 4, inputting the picture of the orientation prism taken in step 2 into the automatic prism center detection model trained in step 3, extracting the image-space coordinates of the prism center point, and calculating the image-space offset distance between those coordinates and the image principal point;
and Step 5, calculating the object-space offset distance and offset angle between the center of the orientation mark and the image principal point, controlling the pan-tilt head to rotate so that the center of the orientation mark coincides with the image principal point so as to aim precisely at the prism, and photographing the prism again after re-aiming to check whether the prism center coincides with the image principal point.
Moreover, the prism identification data set and the center identification data set are each expanded using image augmentation.
Moreover, the first stage employs the conventional YOLOv8 network model.
In addition, the second stage adopts an improved YOLOv8 network model: on the basis of the conventional YOLOv8 network structure, a multi-scale fusion framework is added to improve the feature extraction capability of the model, and a multiple feature encoding module and a small-object detection head are added to improve the detection performance on small targets such as the prism center.
Furthermore, the improved YOLOv8 model detects the prism center and outputs a rectangular box; the center of the rectangular box is taken as the prism center and the corresponding pixel coordinates are returned. Compared with the principal point coordinates, the object-space offset distances and offset angles in the horizontal and vertical directions are calculated according to photogrammetric principles, and the pan-tilt head is controlled to re-aim according to the calculated offset angles, completing the orientation of the acquisition equipment.
Moreover, the containment appearance acquisition equipment comprises a tripod, a base, a pan-tilt fixing chassis, a high-precision pan-tilt head, a camera connecting plate and a camera; the camera is connected to the high-precision pan-tilt head through the camera connecting plate, the pan-tilt fixing chassis is attached to the bottom of the high-precision pan-tilt head and connected to the base, and the whole assembly is finally fixed on the tripod through the base.
In another aspect, the invention also provides an electronic device, including a memory, a processor, and a computer program stored on the memory and executable on the processor, where the processor, when executing the program, implements the automatic orientation method for containment appearance acquisition equipment based on deep learning as described above.
In another aspect, the invention also provides a non-transitory computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the automatic orientation method for containment appearance acquisition equipment based on deep learning as described above.
In another aspect, the invention also provides a computer program product comprising a computer program which, when executed by a processor, implements the automatic orientation method for containment appearance acquisition equipment based on deep learning as described above.
Further, the invention provides an apparatus including a processor and a memory, the memory storing program instructions and the processor calling the stored instructions in the memory to perform the automatic orientation method for containment appearance acquisition equipment based on deep learning as described above.
Alternatively, the apparatus comprises a readable storage medium on which a computer program is stored, the computer program, when executed, implementing the automatic orientation method for containment appearance acquisition equipment based on deep learning as described above.
The automatic orientation scheme for containment appearance acquisition equipment based on deep learning realizes accurate and efficient resolution of the zero direction of the containment appearance acquisition equipment.
The scheme of the invention is simple and convenient to implement and highly practical; it solves the problems of low practicability and inconvenient practical application in the related art, can improve user experience, and has important market value.
Drawings
FIG. 1 is an overall flow chart of the automatic orientation of the containment appearance acquisition equipment according to an embodiment of the invention;
FIG. 2 is a structural diagram of the YOLOv8 network used for prism detection according to an embodiment of the invention;
FIG. 3 is a structural diagram of the improved YOLOv8 network used for prism center detection according to an embodiment of the invention;
FIG. 4 is a schematic diagram of the detection process according to an embodiment of the invention;
FIG. 5 is a diagram of the offsets of the positioning results according to an embodiment of the invention.
Detailed Description
The technical solution of the invention is described in detail below with reference to the accompanying drawings and embodiments.
The invention discloses an automatic orientation scheme for containment appearance acquisition equipment based on deep learning. First, the acquisition equipment and the orientation prism are erected on two known points respectively. The pan-tilt camera is then controlled to roughly aim at the orientation prism, and a picture of the prism is taken. The collected pictures are used to construct a prism identification data set and a center identification data set. Once the data sets are made, they are used to train a YOLOv8 model and a modified YOLOv8 model respectively, yielding trained models for orienting the acquisition equipment. After the acquisition equipment is roughly aimed at the prism and a picture is taken, the picture is transmitted to a computer; the computer first identifies the prism in the picture with the trained model, then detects the prism center within the identified prism image, taking the center of the detection box as the detected pixel coordinates of the prism center. The image-space offset distance between this point and the principal point is calculated, the object-space offset distance and angle between the prism center and the principal optical axis are computed according to photogrammetric principles, and the pan-tilt head is rotated so that the prism center coincides with the principal optical axis, completing the zero-direction orientation of the photographic equipment.
The embodiment of the invention provides an overall flow for the automatic orientation of containment appearance acquisition equipment based on deep learning, shown in FIG. 1, with the following specific steps:
s1, erecting acquisition equipment and a directional prism on a known point;
In this step, the acquisition equipment and the orientation prism are erected on two known points respectively. In the embodiment this is implemented as follows:
(1) Two marks are laid out on the ground as station points, and a total station is used to acquire the coordinate information of the two points.
(2) The acquisition equipment and the orientation prism are erected on the known points with tripods, and centering and leveling are completed.
In specific implementation, the coordinates of the two points can be acquired with a total station, and the acquisition equipment and the prism are then erected on the two points respectively. The acquisition equipment comprises a tripod, a base, a pan-tilt fixing chassis, a high-precision pan-tilt head, a camera connecting plate and a camera. The camera is connected to the high-precision pan-tilt head through the camera connecting plate, the pan-tilt fixing chassis is attached to the bottom of the high-precision pan-tilt head and connected to the base, and the whole assembly is finally fixed on the tripod through the base.
S2, controlling the acquisition equipment to capture a prism image;
This step controls the acquisition equipment to rotate so as to roughly aim at the orientation prism and photograph it. In implementation, a computer can be connected to the high-precision pan-tilt head and the camera by wire; the pan-tilt head is controlled to rotate so that the camera aims at the prism, and the camera is then controlled to take a picture, which is transmitted back to the computer in real time.
The implementation in the examples is as follows:
(1) The pan-tilt head and the camera are connected to a notebook computer through data transmission lines.
(2) The pan-tilt head is controlled to rotate so that the prism enters the shooting range of the camera.
(3) The camera is controlled to capture the prism image, which is transmitted back to the notebook computer.
S3, constructing a data set and a network model;
In this step, pictures of the orientation prism in different environments are collected in advance, the prism center identification data sets are established, and an automatic prism center detection model is constructed. The prism center detection model comprises two stages: the first stage identifies the prism in the full image, yielding a partial image containing the prism; the second stage then accurately identifies the prism center from that partial image. The purpose of the first stage is to narrow the range of the detection image and thereby improve the detection precision and efficiency of the second stage.
The first stage preferably uses a YOLOv8 network model to identify the prism in the image; the detected partial image containing the prism is cropped and passed to the second stage, which preferably uses a modified YOLOv8 network model to further identify the center of the prism from the partial image.
The prism center identification data comprise a prism identification data set and a center identification data set. The prism identification data set is used to train YOLOv8 to identify the prism from the whole image; the center identification data set is used to train the modified YOLOv8 model to further detect the center of the prism from the recognized prism image. The YOLOv8 model used in the first stage, i.e., the conventional YOLOv8 network structure, is an improvement on YOLOv5: it uses the more efficient C2f module instead of the C3 module and adjusts the channel counts, and the detection head is modified to use a decoupled-head design for classification and regression. The YOLOv8 network structure is simplified, with faster detection speed and higher detection accuracy. Meanwhile, the improved YOLOv8 model for detecting the prism center adds a multi-scale fusion framework on the basis of the conventional YOLOv8 network structure to improve the model's feature extraction capability, and adds a multiple feature encoding module and a small-object detection head to improve the detection of small targets, namely the prism center.
Further, the invention proposes employing image augmentation to expand the data sets after the prism image data set is acquired and processed.
The preferred implementation of the embodiment is as follows:
(1) The resolution of the original pictures taken in the embodiment is 6000 × 4000. The prism center occupies only a small proportion of the original image, so identifying the center directly on the original image yields poor precision; the invention therefore uses a two-stage detection model to identify the prism center. To improve the usability of the model, the embodiment of the invention collects prism images in different environments and constructs a prism identification data set whose images have a resolution of 6000 × 4000. Images of the prism region are then cropped from the full pictures to construct a center identification data set with a resolution of 320 × 320. Data augmentation is applied: it is preferably suggested to simulate different weather conditions by adding image noise and changing image brightness, expanding both data sets to improve the training effect of the model.
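The following sketch shows one way to generate such variants with OpenCV; the brightness factors, noise levels and file name are illustrative assumptions, not values specified by the patent.

```python
import cv2
import numpy as np

def augment(img, brightness=1.0, noise_sigma=0.0):
    """Simulate lighting/weather changes: scale brightness, then add Gaussian noise."""
    out = cv2.convertScaleAbs(img, alpha=brightness, beta=0)  # brightness change
    if noise_sigma > 0:
        noise = np.random.normal(0.0, noise_sigma, img.shape)
        out = np.clip(out.astype(np.float32) + noise, 0, 255).astype(np.uint8)
    return out

# Expand the data set with a grid of hypothetical variants per source image
img = cv2.imread("prism_0001.jpg")  # hypothetical file name
variants = [augment(img, b, s) for b in (0.6, 1.0, 1.4) for s in (0, 8, 16)]
```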
(2) Construction of the prism recognition model. The detection model preferably adopts the existing YOLOv8 model; its structure is shown in FIG. 2 for reference. The model comprises four parts: input, backbone network, neck network and detection. The input image size in the prism recognition model is 6000 × 4000, and the model comprises a 23-layer network structure:
First layer: a CBS module converts the input feature map to 64 channels with stride 2, giving the P1 output feature map.
Second layer: a further CBS module converts the feature map from 64 to 128 channels with stride 2, giving the P2 output feature map.
Third layer: a C2f module keeps the output at 128 channels.
Fourth layer: a CBS module converts the feature map from 128 to 256 channels with stride 2, giving the P3 output feature map.
Fifth layer: a C2f module adjusts the output to 256 channels.
Sixth layer: a CBS module converts the feature map from 256 to 512 channels with stride 2, giving the P4 output feature map.
Seventh layer: a C2f module adjusts the feature map to 512 channels.
Eighth layer: a CBS module converts the feature map from 512 to 1024 channels.
Ninth layer: a C2f module adjusts the feature map to 1024 channels.
Tenth layer: an SPPF module then processes the feature map.
Eleventh layer: the feature map is upsampled to increase its size by a factor of 2 using an upsampling (Upsample) operation. Twelfth layer: and splicing the characteristic diagram of the last step with the characteristic diagram generated in the seventh layer. Thirteenth layer: the C2f module is used to process the feature map, and the number of channels of the input feature map is adjusted to 512. Fourteenth layer: the feature map is upsampled to increase its size by a factor of 2 using an upsampling (Upsample) operation. Fifteenth layer: and splicing the characteristic diagram of the last step with the characteristic diagram generated in the fifth layer. Sixteenth layer: the feature map is processed by using a C2f module, and the channel number of the input feature map is adjusted to 256, so that a small-size feature map is obtained. Seventeenth layer: using the CBS module, the output is 256 channels. Eighteenth layer: and splicing the characteristic diagram of the last step with the characteristic diagram generated in the tenth layer. Nineteenth layer: the C2f module is used to process the feature map, and the channel number of the input feature map is adjusted to 512, so as to obtain the medium-size feature map. Twentieth layer: using CBS module, the output result is 512 channels. Twenty-first layer: and splicing the characteristic diagram of the last step with the characteristic diagram generated in the tenth layer. Twenty-second layer: the C2f module is used to process the feature map, and the channel number of the input feature map is adjusted to 512, so that a large-size feature map is obtained. Twenty-third layer: and finally, inputting the three-scale feature images into a detection module for target detection. YOLOv8 also replaces the original coupling head with the currently prevailing decoupling head, separating the classification head from the regression detection head.
The structure of the CBS module is shown in FIG. 2; it consists of a convolution layer (Conv2d), a normalization layer (BatchNorm) and a SiLU activation function. The structure of the C2f module is also shown in FIG. 2; this module realizes cross-stage feature fusion and partial feature reuse. Through the combination of convolution layers (Conv2d), bottleneck blocks (Bottleneck), split connections and feature concatenation (Concat), low-level and high-level features are effectively integrated, improving the perception and generalization ability of the network and providing richer, more useful feature representations for the target detection task. Each bottleneck block (Bottleneck) in the C2f module is typically composed of a series of convolution layers, including 1×1 and 3×3 convolutions, usually in a bottleneck structure: the channel count is first reduced, features are extracted through a 3×3 convolution, and the channel count is finally restored, which reduces computation while strengthening feature expression. The Concat module is a feature fusion operation that splices two tensors along a specified dimension to generate a new tensor; it is typically used to fuse feature maps with different channel counts to extend the representational capacity of the features. The SPPF module divides the feature map into grids of different scales, performs maximum pooling (MaxPool) on each grid, and finally splices all pooling results through a Concat layer to form a fixed-length feature vector, providing more comprehensive spatial information.
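For reference alongside these descriptions, the following PyTorch sketch gives minimal versions of the CBS, Bottleneck, C2f and SPPF blocks; it mirrors the publicly known YOLOv8 building blocks rather than any code disclosed in the patent, so the exact hyperparameters should be treated as assumptions.

```python
import torch
import torch.nn as nn

class CBS(nn.Module):
    """Conv2d + BatchNorm + SiLU, the basic block described above."""
    def __init__(self, c_in, c_out, k=3, s=1):
        super().__init__()
        self.conv = nn.Conv2d(c_in, c_out, k, s, k // 2, bias=False)
        self.bn = nn.BatchNorm2d(c_out)
        self.act = nn.SiLU()
    def forward(self, x):
        return self.act(self.bn(self.conv(x)))

class Bottleneck(nn.Module):
    """Two stacked convolutions with an optional residual connection."""
    def __init__(self, c, shortcut=True):
        super().__init__()
        self.cv1 = CBS(c, c, k=3)
        self.cv2 = CBS(c, c, k=3)
        self.add = shortcut
    def forward(self, x):
        y = self.cv2(self.cv1(x))
        return x + y if self.add else y

class C2f(nn.Module):
    """Split the features, run n bottlenecks, and concatenate all branches."""
    def __init__(self, c_in, c_out, n=1):
        super().__init__()
        self.c = c_out // 2
        self.cv1 = CBS(c_in, 2 * self.c, k=1)
        self.m = nn.ModuleList(Bottleneck(self.c) for _ in range(n))
        self.cv2 = CBS((2 + n) * self.c, c_out, k=1)
    def forward(self, x):
        y = list(self.cv1(x).chunk(2, dim=1))
        for m in self.m:
            y.append(m(y[-1]))
        return self.cv2(torch.cat(y, dim=1))

class SPPF(nn.Module):
    """Three chained max-poolings concatenated for multi-scale context."""
    def __init__(self, c_in, c_out, k=5):
        super().__init__()
        self.cv1 = CBS(c_in, c_in // 2, k=1)
        self.pool = nn.MaxPool2d(k, stride=1, padding=k // 2)
        self.cv2 = CBS(c_in // 2 * 4, c_out, k=1)
    def forward(self, x):
        x = self.cv1(x)
        y1 = self.pool(x)
        y2 = self.pool(y1)
        y3 = self.pool(y2)
        return self.cv2(torch.cat([x, y1, y2, y3], dim=1))
```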
(3) Construction of the center recognition model. The detection model adopted by the invention improves on the YOLOv8 model to better realize center recognition: feature maps of different scales are fused to better acquire local and global feature information, and a Zoom_cat module is constructed to spatially splice features of different sizes so as to capture the local fine details of small targets, improving the detection of small targets such as the prism center. The model structure is shown in FIG. 3. The improved YOLOv8 has 31 layers; the first 10 backbone layers are the same as in YOLOv8. The eleventh and twelfth layers are CBS modules that apply a 1×1 convolution to the features; the eleventh layer processes the features of the fifth layer of the backbone network. The thirteenth layer is a Zoom_cat module that splices the features from the eleventh, twelfth and seventh layers at the channel level. The fourteenth layer is a C2f module that extracts and transforms the spliced features. The fifteenth and sixteenth layers are CBS modules; the sixteenth-layer CBS module processes the features of the third layer of the backbone network. The seventeenth layer is a Zoom_cat module that splices the features from the sixteenth, fifth and tenth layers at the channel level. The eighteenth layer is a C2f module that extracts and transforms the spliced features. The nineteenth-layer CBS module processes the features of the eighteenth layer. The twentieth layer is a Concat module that merges the features of the nineteenth and fifteenth layers. The twenty-first layer is a C2f module that processes the features of the twentieth layer. The twenty-second layer is a CBS module that applies a 3×3 convolution to the features of the twenty-first layer. The twenty-third-layer Concat module then splices the features of this layer with the features of the tenth layer of the backbone network at the channel level, increasing the channel count. The twenty-fourth layer uses a C2f module to extract and transform the spliced features, keeping 512 channels with the feature map size unchanged. Next, the twenty-fifth layer uses a ScalSeq module to scale the features of the fifth, seventh and ninth layers of the backbone network and adjusts the channel count to 256. The twenty-sixth layer then adds the ScalSeq-adjusted features element-wise to the features output by the eighteenth layer through an Add module to enhance feature expression. The twenty-seventh layer performs an upsampling operation through Upsample to double the feature map size. The twenty-eighth layer splices the upsampled features with the features of the third layer of the backbone network at the channel level through a Concat module, increasing the channel count. The twenty-ninth layer then uses a C2f module to extract and transform the spliced features, adjusting the channel count to 128 with the feature map size unchanged.
Finally, the thirty-first layer uses a ScalSeq module to adjust the dimensions of the third, twenty-sixth and twenty-first layers, adjusting the channel count to 128, and an Add module adds the ScalSeq-adjusted features element-wise to the features output by the C2f module of the twenty-ninth layer to enhance feature expression. In the detection head part (Detect), the original YOLOv8 structure has only 3 detection heads; to improve the detection of small targets, one detection head for small targets is added. As in YOLOv8, each detection head adopts a decoupled head consisting of two convolution modules that extract the target position and category information respectively. The small-target detection head and the other 3 detection heads adopt decoupled heads with the same structure but different input data.
The Add operation adds two tensors element-wise to generate a new tensor; it is typically used to fuse feature maps with the same spatial dimensions to enhance the correlation and complementarity between features. The ScalSeq structure is shown in FIG. 3: the module receives three input feature maps representing features from different levels, adjusts the channel counts of two of the features with 1×1 convolution layers (Conv2d), uses interpolation to bring the adjusted features to the same size as the third feature, splices the three features along the channel dimension (Concat), performs further feature fusion with a 3D convolution layer (Conv3d), applies batch normalization (BatchNorm) and the SiLU activation function for feature enhancement, and finally downsamples the features with 3D max pooling (MaxPool3d) to generate the final output feature. The Zoom_cat module adopted at the 13th and 17th layers fuses feature maps of different levels; its structure is prior art and is not repeated here. It takes a list of three feature maps, representing feature maps of large, medium and small sizes respectively. In forward propagation, the large-size feature map is first adjusted to the same size as the medium-size feature map through adaptive pooling, and the adjusted map is then added to the max-pooling and average-pooling results of the large-size feature map to realize feature enhancement. Next, the small-size feature map is adjusted to the same size as the medium-size feature map using an interpolation operation. Finally, the adjusted large-size, medium-size and adjusted small-size feature maps are spliced along the channel dimension, and the spliced feature map is returned.
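Based on the forward-propagation description above, a minimal PyTorch sketch of the Zoom_cat fusion might look as follows; the pooling and interpolation modes follow the text where stated and are otherwise assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ZoomCat(nn.Module):
    """Splice large/medium/small feature maps at the medium size (a sketch)."""
    def forward(self, feats):
        large, medium, small = feats
        h, w = medium.shape[2:]
        # adjust the large map to the medium size via adaptive pooling, then add
        # the max- and average-pooled versions of the large map for enhancement
        adjusted = F.adaptive_avg_pool2d(large, (h, w))  # pooling mode assumed
        enhanced = (adjusted
                    + F.adaptive_max_pool2d(large, (h, w))
                    + F.adaptive_avg_pool2d(large, (h, w)))
        # adjust the small map to the medium size by interpolation
        up = F.interpolate(small, size=(h, w), mode="nearest")
        # splice the three maps along the channel dimension
        return torch.cat([enhanced, medium, up], dim=1)

# Shape check: channels add up, spatial size follows the medium map
l, m, s = torch.rand(1, 256, 80, 80), torch.rand(1, 256, 40, 40), torch.rand(1, 256, 20, 20)
print(ZoomCat()([l, m, s]).shape)  # torch.Size([1, 768, 40, 40])
```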
(4) After the prism identification model and the center identification model are built, the prism identification data set and the center identification data set are respectively adopted to train the two models.
S4, acquiring the image-space coordinates of the prism center;
The detection process is shown in FIG. 4. After the two trained models are obtained, the picture taken in step 2 with the acquisition equipment roughly aimed at the prism is input into the prism identification model, the prism is detected in the image, and the prism image inside the detection box is cropped from the original image and passed to the center identification model. The output of the center detection model is a small rectangular box framing the prism center. The center of that rectangle is taken as the prism center, giving the image-space coordinates of the prism center on the original image.
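To make the two-stage flow concrete, here is a minimal inference sketch; it assumes the models were trained with the ultralytics YOLOv8 framework, and the weight file names, single-detection indexing and the 320×320 crop size are illustrative assumptions rather than details disclosed in the patent.

```python
import cv2
from ultralytics import YOLO  # assumption: models trained with the ultralytics YOLOv8 framework

prism_model = YOLO("prism_detect.pt")    # stage 1 weights (hypothetical file name)
center_model = YOLO("center_detect.pt")  # stage 2 weights (hypothetical file name)

def detect_prism_center(image_path):
    """Two-stage detection: crop the prism, then locate its center in the crop."""
    img = cv2.imread(image_path)                          # e.g. a 6000x4000 frame
    r1 = prism_model(img)[0]
    if len(r1.boxes) == 0:
        return None                                       # prism not found
    x1, y1, x2, y2 = map(int, r1.boxes.xyxy[0].tolist())  # stage-1 prism box
    crop = cv2.resize(img[y1:y2, x1:x2], (320, 320))      # crop size per the embodiment
    r2 = center_model(crop)[0]
    if len(r2.boxes) == 0:
        return None
    cx1, cy1, cx2, cy2 = r2.boxes.xyxy[0].tolist()        # small box framing the center
    # map the box center from crop coordinates back to full-image pixels
    u = x1 + (cx1 + cx2) / 2.0 * (x2 - x1) / 320.0
    v = y1 + (cy1 + cy2) / 2.0 * (y2 - y1) / 320.0
    return u, v
```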
S5, calculating an offset angle;
Since the prism center is detected with the modified YOLOv8 model, the result is a rectangular box. The center of the rectangular box is taken as the prism center and its pixel coordinates (u, v) are returned; compared with the image principal point coordinates (u_o, v_o), the object-space offset distances L_x, L_y and offset angles θ_x, θ_y in the horizontal and vertical directions can then be calculated according to photogrammetric principles.
In an embodiment, after the image-space coordinates are obtained, the offset between the detected point and the image principal point is computed, giving the pixel offsets in the horizontal and vertical directions respectively. The object-space offset distances L_x, L_y and offset angles θ_x, θ_y in the horizontal and vertical directions are then calculated according to photogrammetric principles. The calculation formulas are as follows:
l_x = (u − u_o) · w / W,  l_y = (v − v_o) · w / W
L_x = D · l_x / f,  L_y = D · l_y / f
θ_x = arctan(l_x / f),  θ_y = arctan(l_y / f)
where (u, v) are the pixel coordinates of the detected prism center, (u_o, v_o) are the image principal point coordinates, L_x and L_y are the object-space offset distances in the horizontal and vertical directions, θ_x and θ_y are the corresponding offset angles, D is the photographing distance, l_x and l_y are the image-space offset distances, f is the principal distance, w is the camera sensor width, and W is the image width in pixels. The pan-tilt head is controlled to re-aim according to the calculated offset angles, which completes the orientation of the acquisition equipment.
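As a concrete check of these relations, the following Python sketch implements them; the sensor width, principal distance and photographing distance values are hypothetical examples, not parameters from the patent.

```python
import math

def offset_angles(u, v, u_o, v_o, w_mm, W_px, f_mm, D_m):
    """Pixel offset -> image-space offset (mm) -> object-space offset (m) and angles."""
    pixel_mm = w_mm / W_px              # physical size of one pixel on the sensor
    lx = (u - u_o) * pixel_mm           # image-space offset l_x
    ly = (v - v_o) * pixel_mm           # image-space offset l_y
    Lx = D_m * lx / f_mm                # object-space offset L_x (similar triangles)
    Ly = D_m * ly / f_mm                # object-space offset L_y
    theta_x = math.degrees(math.atan2(lx, f_mm))  # offset angle theta_x
    theta_y = math.degrees(math.atan2(ly, f_mm))  # offset angle theta_y
    return Lx, Ly, theta_x, theta_y

# Hypothetical numbers: 35.9 mm sensor width, 6000 px image width, 50 mm principal
# distance, 30 m photographing distance, prism center 120 px right of the principal point
Lx, Ly, tx, ty = offset_angles(3120, 2000, 3000, 2000, 35.9, 6000, 50.0, 30.0)
print(f"L_x={Lx:.3f} m, theta_x={tx:.4f} deg")  # ~0.431 m, ~0.823 deg
```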
In specific implementation, the prism identification data set and the center identification data set can be used in advance to train the automatic prism center detection model. When orientation is actually needed, the containment appearance acquisition equipment is roughly aimed at the prism, a photo is taken and input into the trained automatic prism center detection model, i.e., the prism center in the image is obtained through deep-learning-based detection; the offset angles in the horizontal and vertical directions are calculated, the pan-tilt head is then rotated according to the result so as to aim precisely at the prism, and after re-aiming the prism is photographed again to check whether the prism center coincides with the image principal point.
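Putting the pieces together, the loop below sketches one such orientation pass; `pantilt` and `camera` stand in for whatever control objects the high-precision pan-tilt head and camera actually expose, and `detect_prism_center` and `offset_angles` are the hypothetical helpers sketched earlier.

```python
def orient(pantilt, camera, u_o=3000.0, v_o=2000.0, tol_px=2.0):
    """One orientation pass: shoot, detect, compute angles, re-aim, verify."""
    camera.capture("shot.jpg")                    # step 2: rough aim already done
    u, v = detect_prism_center("shot.jpg")        # steps 3-4: two-stage detection
    _, _, tx, ty = offset_angles(u, v, u_o, v_o,
                                 w_mm=35.9, W_px=6000, f_mm=50.0, D_m=30.0)
    pantilt.rotate(horizontal=-tx, vertical=-ty)  # step 5: re-aim by the offset angles
    camera.capture("verify.jpg")                  # re-shoot after re-aiming
    u2, v2 = detect_prism_center("verify.jpg")    # check coincidence with the principal point
    return abs(u2 - u_o) <= tol_px and abs(v2 - v_o) <= tol_px
```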
In particular, the method of the technical solution of the invention can be implemented by those skilled in the art as an automatic operation flow using computer software technology; a system or apparatus implementing the method, such as a computer-readable storage medium storing the corresponding computer program or a computer device running the corresponding computer program, should also fall within the protection scope of the invention.
The automatic orientation device for containment appearance acquisition equipment based on deep learning provided by the invention is described below; this device and the automatic orientation method described above may be referred to in correspondence with each other.
The electronic device provided by the invention can comprise: a processor (processor), a communication interface (Communications Interface), a memory (memory), and a communication bus, wherein the processor, the communication interface, and the memory communicate with each other through the communication bus. The processor may invoke logic instructions in the memory to perform a method of automatically orienting a deep learning based containment appearance acquisition device, the method comprising:
Step 1, erecting the acquisition equipment and the orientation prism on two known points respectively;
Step 2, controlling the containment appearance acquisition equipment to roughly aim at the orientation prism, and photographing the orientation prism;
Step 3, collecting pictures of the orientation prism in different environments in advance and constructing a prism identification data set; cropping images of the prism region out of the prism identification data set to construct a center identification data set; and constructing an automatic prism center detection model, wherein the prism center detection model comprises two stages: the first stage identifies the prism in the image, and the detected partial image containing the prism is cropped and passed to the second stage, which further identifies the prism center from the partial image; the automatic prism center detection model is trained with the prism identification data set and the center identification data set;
Step 4, inputting the picture of the orientation prism taken in step 2 into the automatic prism center detection model trained in step 3, extracting the image-space coordinates of the prism center point, and calculating the image-space offset distance between those coordinates and the image principal point;
and Step 5, calculating the object-space offset distance and offset angle between the center of the orientation mark and the image principal point, controlling the pan-tilt head to rotate so that the center of the orientation mark coincides with the image principal point so as to aim precisely at the prism, and photographing the prism again after re-aiming to check whether the prism center coincides with the image principal point.
Further, the logic instructions in the memory may be implemented in the form of software functional units and, when sold or used as a stand-alone product, stored in a computer-readable storage medium. Based on this understanding, the technical solution of the invention, in essence or in the part contributing to the prior art, may be embodied in the form of a software product stored in a storage medium and comprising several instructions for causing a computer device (a personal computer, a server, a network device, etc.) to perform all or part of the steps of the methods according to the embodiments of the invention. The aforementioned storage medium includes media capable of storing program code, such as a USB disk, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disk.
In another aspect, the present invention also provides a computer program product, the computer program product comprising a computer program, the computer program being storable on a non-transitory computer readable storage medium, the computer program, when executed by a processor, being capable of executing the method for automatically orienting a deep learning based containment appearance collection device provided by the above methods, the method comprising:
Step 1, erecting the acquisition equipment and the orientation prism on two known points respectively;
Step 2, controlling the containment appearance acquisition equipment to roughly aim at the orientation prism, and photographing the orientation prism;
Step 3, collecting pictures of the orientation prism in different environments in advance and constructing a prism identification data set; cropping images of the prism region out of the prism identification data set to construct a center identification data set; and constructing an automatic prism center detection model, wherein the prism center detection model comprises two stages: the first stage identifies the prism in the image, and the detected partial image containing the prism is cropped and passed to the second stage, which further identifies the prism center from the partial image; the automatic prism center detection model is trained with the prism identification data set and the center identification data set;
Step 4, inputting the picture of the orientation prism taken in step 2 into the automatic prism center detection model trained in step 3, extracting the image-space coordinates of the prism center point, and calculating the image-space offset distance between those coordinates and the image principal point;
and Step 5, calculating the object-space offset distance and offset angle between the center of the orientation mark and the image principal point, controlling the pan-tilt head to rotate so that the center of the orientation mark coincides with the image principal point so as to aim precisely at the prism, and photographing the prism again after re-aiming to check whether the prism center coincides with the image principal point.
In yet another aspect, the present invention also provides a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, is implemented to perform the method for automatically orienting a deep learning based containment appearance collection device provided by the above methods, the method comprising:
Step 1, erecting the acquisition equipment and the orientation prism on two known points respectively;
Step 2, controlling the containment appearance acquisition equipment to roughly aim at the orientation prism, and photographing the orientation prism;
Step 3, collecting pictures of the orientation prism in different environments in advance and constructing a prism identification data set; cropping images of the prism region out of the prism identification data set to construct a center identification data set; and constructing an automatic prism center detection model, wherein the prism center detection model comprises two stages: the first stage identifies the prism in the image, and the detected partial image containing the prism is cropped and passed to the second stage, which further identifies the prism center from the partial image; the automatic prism center detection model is trained with the prism identification data set and the center identification data set;
Step 4, inputting the picture of the orientation prism taken in step 2 into the automatic prism center detection model trained in step 3, extracting the image-space coordinates of the prism center point, and calculating the image-space offset distance between those coordinates and the image principal point;
and Step 5, calculating the object-space offset distance and offset angle between the center of the orientation mark and the image principal point, controlling the pan-tilt head to rotate so that the center of the orientation mark coincides with the image principal point so as to aim precisely at the prism, and photographing the prism again after re-aiming to check whether the prism center coincides with the image principal point.
The apparatus embodiments described above are merely illustrative. Units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; they may be located in one place or distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art can understand and implement this without creative effort.
To verify the performance of the proposed model in the prism center point detection task, the YOLOv5s, YOLOv7, YOLOv8s, YOLOv3 and CenterNet models were trained and tested on the prism center data set, and their test results were compared with those of the proposed model. To ensure a fair comparison, the runtime environment parameters were kept consistent during training, and all methods were trained until convergence to achieve optimal performance. Table 1 shows the results of each model on the test set. As can be seen from Table 1, apart from the model proposed by the invention, the YOLOv8s model performs best: its F1 value of 90.66% is the highest of all models, but its parameter count and GFLOPs are large. The model proposed by the invention is superior to these models in the mAP@0.5 and mAP@0.5:0.95 indices, which increase by 1.68% and 2.01% respectively, while the parameters and GFLOPs are also significantly reduced compared with YOLOv8s. These results show that the proposed network model has good detection performance in the prism center detection task.
Table 1. Comparison of the results of the different models.
To verify the accuracy of the prism center detected by the proposed method, the automatic detection results were compared with manual calibration results under different environments, and the offset pixel values between the two were calculated. The results are shown in FIG. 5: most offsets are within 2 pixels. Table 2 summarizes the error statistics of the test results: the maximum offsets are -2.688 pixels and -2.188 pixels in the horizontal and vertical directions, and the root-mean-square errors are 1.079 pixels and 0.923 pixels respectively. The results show that the proposed method can accurately locate the pixel coordinates of the prism center point. Converted to angles with the angle calculation formula, the detection deviations of the horizontal and vertical angles are 4.51″ and 3.67″ respectively, smaller than the minimum rotation angle of the high-precision pan-tilt head, and the precision far exceeds the requirements of practical application.
TABLE 2 positioning error information
From the above description of the embodiments, it will be apparent to those skilled in the art that the embodiments may be implemented by means of software plus necessary general hardware platforms, or of course may be implemented by means of hardware. Based on this understanding, the foregoing technical solution may be embodied essentially or in a part contributing to the prior art in the form of a software product, which may be stored in a computer readable storage medium, such as ROM/RAM, a magnetic disk, an optical disk, etc., including several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the method described in the respective embodiments or some parts of the embodiments.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present invention, and are not limiting; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention.
The above embodiments are merely illustrative of the technical solution of the invention. The automatic orientation method, equipment and storage medium for containment appearance acquisition equipment based on deep learning according to the invention are not limited to the above embodiments; the scope of the invention is defined by the claims. Any modifications, additions or equivalent substitutions made by those skilled in the art on the basis of these embodiments fall within the scope of the invention as claimed.

Claims (9)

1. An automatic orientation method for containment appearance acquisition equipment based on deep learning, characterized by comprising the following steps:
Step 1, erecting the acquisition equipment and the orientation prism on two known points respectively;
Step 2, controlling the containment appearance acquisition equipment to roughly aim at the orientation prism, and photographing the orientation prism;
Step 3, collecting pictures of the orientation prism in different environments in advance and constructing a prism identification data set; cropping images of the prism region out of the prism identification data set to construct a center identification data set; and constructing an automatic prism center detection model, wherein the prism center detection model comprises two stages: the first stage identifies the prism in the image, and the detected partial image containing the prism is cropped and passed to the second stage, which further identifies the prism center from the partial image; the automatic prism center detection model is trained with the prism identification data set and the center identification data set;
Step 4, inputting the picture of the orientation prism taken in step 2 into the automatic prism center detection model trained in step 3, extracting the image-space coordinates of the prism center point, and calculating the image-space offset distance between those coordinates and the image principal point;
and Step 5, calculating the object-space offset distance and offset angle between the center of the orientation mark and the image principal point, controlling the pan-tilt head to rotate so that the center of the orientation mark coincides with the image principal point so as to aim precisely at the prism, and photographing the prism again after re-aiming to check whether the prism center coincides with the image principal point.
2. The automatic orientation method for containment appearance acquisition equipment based on deep learning according to claim 1, wherein: the prism identification data set and the center identification data set are each expanded using image augmentation.
3. The automatic orientation method for containment appearance acquisition equipment based on deep learning according to claim 1, wherein: the first stage employs the conventional YOLOv8 network model.
4. The automatic orientation method for containment appearance acquisition equipment based on deep learning according to claim 1, wherein: the second stage adopts an improved YOLOv8 network model; on the basis of the conventional YOLOv8 network structure, a multi-scale fusion framework is added to improve the feature extraction capability of the model, and a multiple feature encoding module and a small-object detection head are added to improve the detection performance on small targets such as the prism center.
5. The automatic orientation method for containment appearance acquisition equipment based on deep learning according to claim 1, wherein: the improved YOLOv8 model detects the prism center and outputs a rectangular box; the center of the rectangular box is taken as the prism center and the corresponding pixel coordinates are returned; compared with the principal point coordinates, the object-space offset distances and offset angles in the horizontal and vertical directions are calculated according to photogrammetric principles, and the pan-tilt head is controlled to re-aim according to the calculated offset angles, completing the orientation of the acquisition equipment.
6. The automatic orientation method for containment appearance acquisition equipment based on deep learning according to claim 1, 2, 3, 4 or 5, wherein: the containment appearance acquisition equipment comprises a tripod, a base, a pan-tilt fixing chassis, a high-precision pan-tilt head, a camera connecting plate and a camera; the camera is connected to the high-precision pan-tilt head through the camera connecting plate, the pan-tilt fixing chassis is attached to the bottom of the high-precision pan-tilt head and connected to the base, and the whole assembly is finally fixed on the tripod through the base.
7. An electronic device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, characterized in that: the processor, when executing the program, implements the automatic orientation method for containment appearance acquisition equipment based on deep learning according to any one of claims 1 to 6.
8. A non-transitory computer-readable storage medium having a computer program stored thereon, characterized in that: the computer program, when executed by a processor, implements the automatic orientation method for containment appearance acquisition equipment based on deep learning according to any one of claims 1 to 6.
9. A computer program product comprising a computer program, characterized in that: the computer program, when executed by a processor, implements the automatic orientation method for containment appearance acquisition equipment based on deep learning according to any one of claims 1 to 6.
CN202410540806.3A 2024-04-30 2024-04-30 Automatic orientation method and equipment for safety shell appearance acquisition equipment based on deep learning Pending CN118379472A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410540806.3A CN118379472A (en) 2024-04-30 2024-04-30 Automatic orientation method and equipment for safety shell appearance acquisition equipment based on deep learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202410540806.3A CN118379472A (en) 2024-04-30 2024-04-30 Automatic orientation method and equipment for safety shell appearance acquisition equipment based on deep learning

Publications (1)

Publication Number Publication Date
CN118379472A true CN118379472A (en) 2024-07-23

Family

ID=91901505

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410540806.3A Pending CN118379472A (en) 2024-04-30 2024-04-30 Automatic orientation method and equipment for safety shell appearance acquisition equipment based on deep learning

Country Status (1)

Country Link
CN (1) CN118379472A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination