CN113936179A

CN113936179A - Target detection method and device, electronic equipment and storage medium

Info

Publication number: CN113936179A
Application number: CN202111234641.XA
Authority: CN
Inventors: 丁顺意; 席林; 许毅; 曾旭; 何慧钧
Original assignee: Shanghai Thermal Image Science And Technology Co ltd
Current assignee: Shanghai Thermal Image Science And Technology Co ltd
Priority date: 2021-10-22
Filing date: 2021-10-22
Publication date: 2022-01-14

Abstract

The embodiment of the invention discloses a target detection method, a target detection device, electronic equipment and a storage medium. The method comprises the following steps: determining a current prior frame for target detection; determining a target detection regression loss function between the current prior frame and a real frame subjected to switch target labeling in advance, to obtain the distance between the prior frame and the real frame; grouping the real frames subjected to switch target labeling in advance according to the target detection regression loss function, to obtain the real frames belonging to the current prior frame; updating the width and the height of the current prior frame according to the width and height data of the real frames under the current prior frame, to obtain an updated prior frame; repeating the update until a preset updating condition is met; and carrying out target detection according to the prior frame obtained after the iterative updating is finished. The technical scheme of the embodiment addresses the training speed and recognition precision problems of prior-art models and improves both the training speed of the target detection model and the accuracy of switch target detection.

Description

Target detection method and device, electronic equipment and storage medium

Technical Field

The embodiment of the invention relates to the technical field of image processing for power systems, and in particular to a target detection method, a target detection device, electronic equipment and a storage medium.

Background

With the rapid development of the electric power industry in China, more and more high-voltage switch cabinets are being installed. Misoperation of switch cabinet equipment is among the most serious and most frequent safety accidents in the power industry. At best such an accident damages the power system; at worst it seriously endangers personal safety. An automatic recognition system that detects and identifies the switches of high-voltage switch cabinets is therefore urgently needed.

Rapid progress in computer technology has broadened the application of target detection. Existing target detection algorithms have improved target localization to some extent, but detection precision is still not high. How to improve recognition accuracy is therefore a technical problem that those skilled in the art urgently need to solve.

Disclosure of Invention

The embodiment of the invention provides a target detection method, a target detection device, electronic equipment and a storage medium, and aims to improve the accuracy of target detection.

In a first aspect, an embodiment of the present invention provides a target detection method, including:

determining a current prior frame for target detection; the current prior frame comprises an initialized prior frame selected from a real frame subjected to switch target labeling in advance or an updated prior frame obtained by last updating;

determining a target detection regression loss function between a current prior frame and a real frame subjected to switch target labeling in advance;

grouping real frames subjected to switch target labeling in advance according to the target detection regression loss function to obtain real frames belonging to the current prior frame;

updating the width and the height of the current prior frame according to the width and the height data of the real frame under the current prior frame to obtain an updated prior frame; returning to continue updating until a preset updating condition is met;

and carrying out target detection according to the prior frame after the iteration updating is finished.

In a second aspect, an embodiment of the present invention further provides an object detection apparatus, including:

a current prior frame determination module, configured to determine a current prior frame for target detection; the current prior frame comprises an initialized prior frame selected from a real frame subjected to switch target labeling in advance or an updated prior frame obtained by last updating;

the distance determining module is used for determining a target detection regression loss function between the current prior frame and a real frame which is subjected to switch target labeling in advance;

the real frame acquisition module is used for grouping real frames subjected to switch target marking in advance according to the target detection regression loss function to obtain real frames belonging to the current prior frame;

the updated prior frame acquisition module is used for updating the width and the height of the current prior frame according to the width and the height data of the real frame under the current prior frame to obtain an updated prior frame; returning to continue updating until a preset updating condition is met;

and the target detection module is used for carrying out target detection according to the prior frame after the iteration update is finished.

In a third aspect, an embodiment of the present invention further provides an electronic device, where the electronic device includes:

one or more processors;

storage means for storing one or more programs;

the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the target detection method described in any embodiment of the invention.

In a fourth aspect, the embodiment of the present invention further provides a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the object detection method according to any embodiment of the present invention.

The embodiment of the invention provides a target detection method, a target detection device, electronic equipment and a storage medium. A current prior frame for target detection is determined; a target detection regression loss function between the current prior frame and a real frame subjected to switch target labeling in advance is determined to obtain the distance between the prior frame and the real frame; the real frames subjected to switch target labeling in advance are grouped according to the target detection regression loss function to obtain the real frames belonging to the current prior frame; the width and the height of the current prior frame are updated according to the width and height data of the real frames under the current prior frame to obtain an updated prior frame; updating is repeated until a preset updating condition is met; and target detection is carried out according to the prior frame obtained after the iterative updating is finished. With this technical scheme, the real frames of the switch target labels are normalized; an initialization prior frame is selected by drawing a scatter diagram and applying the gift wrapping method; the target detection regression loss functions between the current prior frame and the real frames are determined and used for grouping to obtain the real frames belonging to the current prior frame; and the prior frame is iteratively updated until the condition is met, after which the target is identified. This addresses the training speed and recognition precision problems of prior-art models and improves both the training speed of the target detection model and the accuracy of switch target detection.

Drawings

Fig. 1 is a flowchart of a target detection method according to an embodiment of the present invention;

fig. 2 is a flowchart of a target detection method according to a second embodiment of the present invention;

fig. 3 is a schematic structural diagram of a target detection apparatus according to a third embodiment of the present invention;

fig. 4 is a schematic structural diagram of an electronic device according to a fourth embodiment of the present invention.

Detailed Description

The present invention will be described in further detail with reference to the accompanying drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and are not limiting of the invention. It should be further noted that, for the convenience of description, only some of the structures related to the present invention are shown in the drawings, not all of the structures.

Example one

Fig. 1 is a flowchart of a target detection method according to an embodiment of the present invention, where this embodiment is applicable to a case of detecting a target switch, and the method of this embodiment may be executed by a target detection apparatus, and the apparatus may be implemented in a hardware and/or software manner. The apparatus may be configured in a server for object detection. The method specifically comprises the following steps:

S110, determining a current prior frame for target detection.

The prior frames may be rectangular frames of different sizes and aspect ratios that are preset on the image. For example, when target detection is performed on a switch, the preset prior frames are used to predict the position of the switch: each prior frame in the image is traversed, and classification and fine adjustment are then performed to complete the target detection.

Optionally, the current prior frame includes, but is not limited to, an initialized prior frame selected from real frames in which switch target labeling is performed in advance, or a prior frame updated after last updating.

The step of labeling the real frames may be to uniformly scale all the real frames to a fixed size along with the switch picture, label the scaled real frames, and acquire coordinate data.

Wherein scaling to a fixed size facilitates training of the sample data, the fixed size including, but not limited to, 320 x 320.
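The scaling of a labeled real frame together with its picture can be sketched in Python as follows. This is only an illustration: the function name `scale_box`, the `(x1, y1, x2, y2)` corner layout, and the choice to stretch the picture to 320 x 320 without preserving its aspect ratio are assumptions, not details stated in the embodiment.

```python
def scale_box(box, image_size, target_size=(320, 320)):
    """Scale a real frame along with its picture.

    box         -- (x1, y1, x2, y2): upper-left and lower-right corners (assumed layout)
    image_size  -- (width, height) of the original picture
    target_size -- (width, height) after scaling; 320 x 320 is the example size
    """
    x1, y1, x2, y2 = box
    src_w, src_h = image_size
    dst_w, dst_h = target_size
    # Assumption: the picture is stretched independently along each axis.
    sx = dst_w / src_w
    sy = dst_h / src_h
    return (x1 * sx, y1 * sy, x2 * sx, y2 * sy)


scaled = scale_box((100, 50, 300, 250), image_size=(640, 480))
print(scaled)
```

If the pictures were instead padded to keep their aspect ratio, the offsets introduced by the padding would also have to be added to the coordinates.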

S120, determining a target detection regression loss function between the current prior frame and a real frame subjected to switch target labeling in advance.

The target detection regression loss function can be used for determining the difference between a real frame and a prior frame through the overlapping area, the central point distance, the aspect ratio and the like of the real frame and the prior frame, and can better reflect the detection effect of the prior frame and the real frame; the target detection regression loss functions include, but are not limited to, IOU, GIOU, DIOU, CIOU, and EIOU.

The embodiment of the invention adopts EIOU, which compares the overlapping area and the center distance between the real frame and the prior frame together with the width and height of the minimum closure area that contains both the current prior frame and the real frame. This resolves the ambiguous aspect-ratio definition of CIOU, and adding Focal Loss addresses the problem of sample imbalance in regression.

S130, grouping the real frames subjected to switch target labeling in advance according to the target detection regression loss function to obtain the real frames belonging to the current prior frame.

The grouping can be realized by determining the error between each prior frame and a real frame, selecting the prior frame with the minimum error, and attributing the real frame to that prior frame. For example, the distance between the center points of the real frame and the prior frame is determined; frames whose center points are closest are placed in one group, frames whose center points are farthest apart are placed in another group, and the corresponding real frame is attributed to the current prior frame.

S140, updating the width and the height of the current prior frame according to the width and the height data of the real frame under the current prior frame to obtain an updated prior frame; and returning to continue updating until the preset updating condition is met.

Updating the width and height of the current prior frame may consist of calculating the median or the average of the width and height data of the real frames belonging to each current prior frame and using the result as the new width and height data of that prior frame, thereby obtaining an updated prior frame. For example, if the widths of the real frames under the current prior frame are 23.2, 21.5, 24, 23.6 and 22.7, the new width obtained from the median is 23.2; alternatively, the new width can be obtained from the average, which is 23.0 for these values.
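The width example above can be checked with a few lines of Python. The variable names are illustrative only; the standard-library `statistics` module is used for the median and the mean.

```python
from statistics import mean, median

# Widths of the real frames belonging to the current prior frame (from the example).
widths = [23.2, 21.5, 24, 23.6, 22.7]

new_width_by_median = median(widths)  # middle value of the sorted widths
new_width_by_mean = mean(widths)      # arithmetic average of the widths

print(new_width_by_median)
print(round(new_width_by_mean, 6))
```

The median is less sensitive to a few unusually large or small labeled frames, while the mean uses every value; the embodiment allows either.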

The preset updating condition can mean that the width and height data of the prior frame are not changed any more; it may also mean that a preset number of iterations, such as 10, is met.

S150, carrying out target detection according to the prior frame after the iteration updating is finished.

Once the width and height data of the prior frame no longer change, or the preset number of iterations is reached, the prior frame is no longer iteratively updated, and the final width and height data of the prior frame are obtained.

Optionally, the intersection-over-union (IoU) between the obtained prior frames and each real frame is calculated, the highest IoU is selected for each real frame, and the IoUs of all real frames are averaged to determine the accuracy for target detection.
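A minimal Python sketch of this accuracy measure is given below. It assumes that frames are compared by size only, that is, each frame is a `(width, height)` pair and the two frames are aligned at a common corner, as is common when prior frames are evaluated; the embodiment does not state this, and the names `wh_iou` and `average_best_iou` are hypothetical.

```python
def wh_iou(a, b):
    """IoU of two (width, height) frames aligned at a common corner (assumption)."""
    inter = min(a[0], b[0]) * min(a[1], b[1])
    union = a[0] * a[1] + b[0] * b[1] - inter
    return inter / union


def average_best_iou(real_sizes, priors):
    """For each real frame keep its highest IoU over all prior frames, then average."""
    best = [max(wh_iou(real, prior) for prior in priors) for real in real_sizes]
    return sum(best) / len(best)


real_sizes = [(10, 10), (20, 20)]
priors = [(10, 10), (20, 10)]
print(average_best_iou(real_sizes, priors))
```

In this toy input the first real frame matches a prior frame exactly (IoU 1.0) and the second reaches at most 0.5, so the average is 0.75.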

The embodiment of the invention provides a target detection method. A current prior frame for target detection is determined, and the real frames of the switch target labels are normalized. EIOU is adopted to compare the overlapping area and the center distance between the real frame and the prior frame together with the width and height of the minimum closure area that contains both, which resolves the ambiguous aspect-ratio definition and the sample imbalance in regression. The real frames subjected to switch target labeling in advance are grouped according to the target detection regression loss function to obtain the real frames belonging to the current prior frame. The width and height of the current prior frame are updated by calculating the median or the average of the width and height data of the real frames belonging to it and using the result as the new width and height data of the prior frame, giving an updated prior frame; updating is repeated until a preset updating condition is met. Target detection is then carried out according to the prior frame obtained after the iterative updating is finished. This addresses the training speed and recognition precision problems of prior-art models, improves the training speed of the target detection model and the accuracy of switch target detection, and reduces misoperation accidents of switch equipment.

Example two

Fig. 2 is a flowchart of a target detection method according to a second embodiment of the present invention. Embodiments of the present invention are further optimized on the basis of the above-mentioned embodiments, and the embodiments of the present invention may be combined with various alternatives in one or more of the above-mentioned embodiments. As shown in fig. 2, the target detection method provided in the embodiment of the present invention may include the following steps:

S210, selecting an initialization prior frame from real frames subjected to switch target labeling in advance.

First, all real frames are uniformly scaled to a fixed size, for example 320 × 320, along with the picture, and the coordinates of all scaled real frames are extracted; the rectangular frames of all pictures are simply extracted and pooled together without distinguishing between them. The data are then processed to obtain the width and height of the real frames of all training data: the coordinate data of each real frame are converted into its width and height as follows:

Width = lower-right-corner abscissa - upper-left-corner abscissa

Height = lower-right-corner ordinate - upper-left-corner ordinate
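The coordinate-to-size conversion above amounts to two subtractions per frame. A small Python sketch, assuming each real frame is stored as `(x1, y1, x2, y2)` with the upper-left corner first (the tuple layout and the name `box_size` are illustrative):

```python
def box_size(box):
    """Return the (horizontal extent, vertical extent) of a real frame."""
    x1, y1, x2, y2 = box  # upper-left corner, then lower-right corner
    return (x2 - x1, y2 - y1)


# Pool the frames of all pictures and keep only their sizes.
real_boxes = [(10, 20, 50, 80), (0, 0, 32, 16)]
real_sizes = [box_size(b) for b in real_boxes]
print(real_sizes)  # [(40, 60), (32, 16)]
```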

Optionally, the initialization method selects k values from all the real frames at random as the initial values of the k prior frames, and includes:

obtaining width and height values of all real frames through data processing, and drawing a scatter diagram;

calculating the convex hull of the scatter points by using a convex hull algorithm such as the gift wrapping algorithm;

and uniformly generating a certain number of real frames in the convex hull polygon range, wherein the real frames serve as iteration initial values of the prior frames.
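The three initialization steps can be sketched in Python as below. The hull is computed with the gift wrapping (Jarvis march) algorithm on `(width, height)` points. The embodiment does not say how frames are "uniformly generated" inside the hull, so uniform random rejection sampling is used here as one possible reading; the function names, the seed, and the sampling strategy are all assumptions.

```python
import random


def cross(o, a, b):
    """Z-component of (a - o) x (b - o); positive when o -> a -> b turns left."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def dist2(a, b):
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2


def gift_wrapping(points):
    """Return the convex hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) < 3:
        return pts
    start = pts[0]  # leftmost point (lowest on ties) always lies on the hull
    hull = []
    current = start
    while True:
        hull.append(current)
        candidate = pts[0] if pts[0] != current else pts[1]
        for p in pts:
            if p == current:
                continue
            turn = cross(current, candidate, p)
            # Move to any point lying to the right of current -> candidate;
            # among collinear points keep the farthest one.
            if turn < 0 or (turn == 0 and dist2(current, p) > dist2(current, candidate)):
                candidate = p
        current = candidate
        if current == start:
            return hull


def inside_hull(hull, p):
    """True if p lies inside or on a counter-clockwise convex polygon."""
    n = len(hull)
    return all(cross(hull[i], hull[(i + 1) % n], p) >= 0 for i in range(n))


def sample_initial_priors(sizes, k, seed=0):
    """Draw k (width, height) starting values uniformly at random inside the hull."""
    hull = gift_wrapping(sizes)
    if len(hull) < 3:
        raise ValueError("need at least three non-collinear (width, height) points")
    rng = random.Random(seed)
    xs = [p[0] for p in hull]
    ys = [p[1] for p in hull]
    priors = []
    while len(priors) < k:
        p = (rng.uniform(min(xs), max(xs)), rng.uniform(min(ys), max(ys)))
        if inside_hull(hull, p):  # reject samples outside the polygon
            priors.append(p)
    return priors


sizes = [(0, 0), (4, 0), (4, 4), (0, 4), (2, 2), (2, 0)]
print(gift_wrapping(sizes))  # [(0, 0), (4, 0), (4, 4), (0, 4)]
```

Sampling inside the hull keeps every starting value within the range of sizes that actually occur in the labeled data.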

S220, determining an overlapping loss function, a center distance loss function and a width-height loss function between the current prior frame and a real frame subjected to switch target labeling in advance.

Here, the EIoU value between each real frame and each prior frame is obtained. Traditional clustering methods measure the difference using the Euclidean distance or IoU. However, the error of the Euclidean distance grows as the prior frame and the real frame become larger, while IoU only measures the intersection-over-union of the prior frame and the real frame and cannot compare their widths, heights and center distance.
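The size dependence of the Euclidean distance can be seen in a small Python comparison. Two pairs of frames with the same relative mismatch are used, one pair ten times larger than the other. Frames are treated as `(width, height)` pairs aligned at a common corner, which is an assumption made for this illustration; `wh_iou` is a hypothetical helper name.

```python
import math


def wh_iou(a, b):
    """IoU of two (width, height) frames aligned at a common corner (assumption)."""
    inter = min(a[0], b[0]) * min(a[1], b[1])
    return inter / (a[0] * a[1] + b[0] * b[1] - inter)


small_pair = ((10, 10), (12, 12))
large_pair = ((100, 100), (120, 120))  # same shapes, ten times larger

small_dist = math.dist(*small_pair)  # Euclidean distance between the size vectors
large_dist = math.dist(*large_pair)

print(small_dist, large_dist)                       # the distance grows with frame size
print(wh_iou(*small_pair), wh_iou(*large_pair))     # the IoU does not
```

The Euclidean distance is ten times larger for the large pair even though the relative mismatch is identical, whereas IoU is the same for both pairs.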

Optionally, the overlapping area of the current prior frame and the real frame is obtained, and the overlapping loss function $L_{IoU}$ of the current prior frame and the real frame is determined;

Obtaining the central distance between the current prior frame and the real frame, and determining the central distance loss function of the current prior frame and the real frame;

and acquiring width and height data of a minimum closure area which can simultaneously contain the current prior frame and the real frame, and determining a width and height loss function of the current prior frame and the real frame.

The distance between the prior frame and the real frame is calculated with the following formula:

$$L_{EIoU} = 1 - IoU + \frac{\rho^2(b, b^{gt})}{c^2} + \frac{\rho^2(w, w^{gt})}{c_w^2} + \frac{\rho^2(h, h^{gt})}{c_h^2}$$

where $IoU$ is the intersection-over-union of the current prior frame and the real frame; $b$ and $b^{gt}$ represent the center points of the prior frame and the real frame, respectively; $w$ and $w^{gt}$ represent the widths of the prior frame and the real frame, respectively; $h$ and $h^{gt}$ represent the heights of the prior frame and the real frame, respectively; $\rho$ represents the Euclidean distance between two points; $c$ represents the diagonal length of the minimum closure area that can contain both the prior frame and the real frame; and $c_w$ and $c_h$ are the width and height of that minimum closure area. The term $1 - IoU$ is the overlapping loss $L_{IoU}$, the second term is the center distance loss, and the last two terms form the width-height loss.
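The quantities defined above can be combined in a short Python sketch. It assumes the commonly published EIoU form, namely 1 - IoU plus a squared center distance divided by the squared diagonal of the minimum closure area, plus squared width and height differences divided by the squared width and height of that area. The Focal weighting mentioned earlier is not included, and the function name and `(x1, y1, x2, y2)` layout are illustrative.

```python
def eiou_loss(prior, real):
    """EIoU distance between two frames given as (x1, y1, x2, y2)."""
    px1, py1, px2, py2 = prior
    gx1, gy1, gx2, gy2 = real
    pw, ph = px2 - px1, py2 - py1
    gw, gh = gx2 - gx1, gy2 - gy1

    # Overlap term: 1 - IoU.
    iw = max(0.0, min(px2, gx2) - max(px1, gx1))
    ih = max(0.0, min(py2, gy2) - max(py1, gy1))
    inter = iw * ih
    union = pw * ph + gw * gh - inter
    iou = inter / union

    # Minimum closure area containing both frames.
    cw = max(px2, gx2) - min(px1, gx1)
    ch = max(py2, gy2) - min(py1, gy1)
    c2 = cw ** 2 + ch ** 2  # squared diagonal

    # Squared Euclidean distance between the two center points.
    rho2 = ((px1 + px2) / 2 - (gx1 + gx2) / 2) ** 2 \
         + ((py1 + py2) / 2 - (gy1 + gy2) / 2) ** 2

    center_term = rho2 / c2
    size_term = (pw - gw) ** 2 / cw ** 2 + (ph - gh) ** 2 / ch ** 2
    return (1 - iou) + center_term + size_term


print(eiou_loss((0, 0, 2, 2), (0, 0, 2, 2)))  # identical frames give 0.0
print(eiou_loss((0, 0, 2, 2), (0, 0, 4, 2)))
```

For the second call the three parts are 0.5 (overlap), 0.05 (center distance) and 0.25 (width difference), so the distance is about 0.8.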

S230, grouping the real frames subjected to switch target labeling in advance according to the overlapping loss function, the center distance loss function and the width-height loss function to obtain the real frames belonging to the current prior frame.

For each real frame, the error $L_{EIoU}(n, k)$ of the corresponding prior frame is determined. The errors of the prior frames corresponding to each real frame, $\{L_{EIoU}(1,k), L_{EIoU}(2,k), \ldots, L_{EIoU}(i,k)\}$, are compared, the prior frame with the minimum error is selected, and the corresponding real frame is attributed to that prior frame;

and repeating the classification operation on each real frame to finally obtain the real frame which belongs to each prior frame.
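The grouping step can be sketched in Python as follows. Frames are represented by `(width, height)` only and are assumed to share a common center, so the center distance term of the EIoU distance is zero; this simplification, and the names `wh_eiou` and `group_real_frames`, are assumptions made for the illustration rather than details from the embodiment.

```python
def wh_eiou(prior, real):
    """EIoU distance for (width, height) frames that share a center (assumption)."""
    pw, ph = prior
    gw, gh = real
    inter = min(pw, gw) * min(ph, gh)
    union = pw * ph + gw * gh - inter
    cw, ch = max(pw, gw), max(ph, gh)  # minimum closure area
    return (1 - inter / union) + (pw - gw) ** 2 / cw ** 2 + (ph - gh) ** 2 / ch ** 2


def group_real_frames(real_sizes, priors):
    """Attribute every real frame to the prior frame with the minimum error."""
    groups = [[] for _ in priors]
    for size in real_sizes:
        errors = [wh_eiou(prior, size) for prior in priors]
        groups[errors.index(min(errors))].append(size)
    return groups


real_sizes = [(10, 12), (11, 10), (40, 42), (38, 40)]
priors = [(10, 10), (40, 40)]
print(group_real_frames(real_sizes, priors))
```

Here the two small frames are attributed to the first prior frame and the two large frames to the second.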

S240, updating the width and the height of the current prior frame according to the width and the height data of the real frame under the current prior frame to obtain an updated prior frame; and returning to continue updating until the preset updating condition is met.

Calculating a median or an average value of the width and height data of the real frame under each current prior frame to be used as new width and height data of the prior frame;

iteratively updating the width and height data of the prior frame until the width and height data of the prior frame are not changed; or satisfy a preset number of iterations.
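Putting grouping and updating together gives an iteration that resembles k-means clustering. The Python sketch below uses the median update and stops when the prior frames no longer change or after a preset number of iterations (10, as in the example of the first embodiment). It again compares `(width, height)` pairs under a shared-center assumption; keeping a prior frame unchanged when no real frame is attributed to it is also an assumption, as are all names.

```python
from statistics import median


def wh_eiou(prior, real):
    """EIoU distance for (width, height) frames that share a center (assumption)."""
    pw, ph = prior
    gw, gh = real
    inter = min(pw, gw) * min(ph, gh)
    union = pw * ph + gw * gh - inter
    cw, ch = max(pw, gw), max(ph, gh)
    return (1 - inter / union) + (pw - gw) ** 2 / cw ** 2 + (ph - gh) ** 2 / ch ** 2


def cluster_priors(real_sizes, initial_priors, max_iterations=10):
    """Iteratively regroup real frames and update each prior frame's width and height."""
    priors = list(initial_priors)
    for _ in range(max_iterations):
        # Group: each real frame goes to the prior frame with the minimum error.
        groups = [[] for _ in priors]
        for size in real_sizes:
            errors = [wh_eiou(prior, size) for prior in priors]
            groups[errors.index(min(errors))].append(size)
        # Update: median width and height of the frames in each group.
        updated = []
        for prior, members in zip(priors, groups):
            if members:
                updated.append((median(w for w, _ in members),
                                median(h for _, h in members)))
            else:
                updated.append(prior)  # assumption: keep a prior with no members
        if updated == priors:  # width and height data no longer change
            break
        priors = updated
    return priors


real_sizes = [(10, 12), (11, 10), (12, 11), (40, 42), (38, 40), (42, 41)]
print(cluster_priors(real_sizes, initial_priors=[(10, 12), (40, 42)]))
# [(11, 11), (40, 41)]
```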

S250, carrying out target detection according to the prior frame after the iteration updating is finished.

The embodiment of the invention provides a target detection method, which comprises the steps of uniformly scaling all real frames to a fixed size along with a picture, extracting all scaled real frame coordinates, drawing a scatter diagram, generating the real frames according to convex hulls of the scatter points, and using the real frames as iteration initial values of prior frames; grouping real frames subjected to switch target labeling in advance according to the obtained overlapping loss function, the center distance loss function and the width-height loss function to obtain real frames belonging to the current prior frame; updating the width and the height of the current prior frame according to the width and the height data of the real frame under the current prior frame to obtain an updated prior frame; returning to continue updating until a preset updating condition is met; and calculating the intersection ratio of the obtained prior frame and each real frame, selecting the highest intersection ratio of each real frame, calculating the average value of the intersection ratios of all the real frames, and determining the accuracy to detect the target. By adopting the technical scheme of the embodiment of the invention, the training speed of the target detection model and the accuracy of the switch target detection are improved, the occurrence of misoperation accidents of the switch equipment is reduced, and the problems of the training speed and the recognition accuracy of the model in the prior art are solved.

Example three

Fig. 3 is a schematic structural diagram of a target detection apparatus according to a third embodiment of the present invention, where the apparatus includes: a current prior frame determination module 310, a distance determination module 320, a real frame acquisition module 330, an updated prior frame acquisition module 340, and a target detection module 350. Wherein:

a current prior frame determination module 310, configured to determine a current prior frame for target detection; the current prior frame comprises an initialized prior frame selected from a real frame subjected to switch target labeling in advance or an updated prior frame obtained by last updating;

a distance determining module 320, configured to determine a target detection regression loss function between the current prior frame and a real frame subjected to switch target labeling in advance;

a real frame obtaining module 330, configured to group real frames subjected to switch target labeling in advance according to the target detection regression loss function, so as to obtain a real frame belonging to a current prior frame;

an updated prior frame obtaining module 340, configured to update the width and height of the current prior frame according to the width and height data of the real frame belonging to the current prior frame, so as to obtain an updated prior frame; returning to continue updating until a preset updating condition is met;

and an object detection module 350, configured to perform object detection according to the prior frame after the iteration update is finished.

On the basis of the above embodiment, optionally, the current prior frame determining module 310 includes:

On the basis of the foregoing embodiment, optionally, the distance determining module 320 includes:

acquiring the overlapping area of the current prior frame and the real frame, and determining the overlapping loss function of the current prior frame and the real frame;

On the basis of the foregoing embodiment, optionally, the real frame acquiring module 330 includes:

determining the error of a prior frame corresponding to each real frame, selecting the prior frame with the minimum error by comparing the error of the prior frame corresponding to each real frame, and attributing the corresponding real frame to the current prior frame;

On the basis of the foregoing embodiment, optionally, the updated prior frame obtaining module 340 includes:

calculating the median or average value of the width and height data of the real frame under each current prior frame to be used as new width and height data of the prior frame;

On the basis of the above embodiment, optionally, the target detection module 350 includes:

and calculating the intersection ratio of the obtained prior frame and each real frame, selecting the highest intersection ratio of each real frame, calculating the average value of the intersection ratios of all the real frames, and determining the accuracy to detect the target.

On the basis of the foregoing embodiment, optionally, the current prior frame determining module 310 further includes:

uniformly scaling all the real frames to a fixed size along with the picture, labeling the scaled real frames, and acquiring coordinate data.

The device can execute the target detection method provided by any embodiment of the invention, and has corresponding functional modules and beneficial effects for executing the target detection method.

Example four

Fig. 4 is a schematic structural diagram of an electronic device according to a fourth embodiment of the present application. The embodiment of the application provides electronic equipment, and the interactive device for target detection provided by the embodiment of the application can be integrated in the electronic equipment. As shown in fig. 4, the present embodiment provides an electronic device 400, which includes: one or more processors 420; the storage device 410 is configured to store one or more programs, and when the one or more programs are executed by the one or more processors 420, the one or more processors 420 implement the method for object detection provided by the embodiment of the present application, the method includes:

Of course, those skilled in the art will understand that the processor 420 also implements the technical solution of the target detection method provided in any embodiment of the present application.

The electronic device 400 shown in fig. 4 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present application.

As shown in fig. 4, the electronic device 400 includes a processor 420, a storage device 410, an input device 430, and an output device 440; the number of the processors 420 in the electronic device may be one or more, and one processor 420 is taken as an example in fig. 4; the processor 420, the storage device 410, the input device 430, and the output device 440 in the electronic device may be connected by a bus or by other means, and connection by a bus 450 is taken as an example in fig. 4.

The storage device 410 is a computer-readable storage medium, and can be used for storing software programs, computer-executable programs, and module units, such as program instructions corresponding to the object detection method in the embodiment of the present application.

The storage device 410 may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function; the storage data area may store data created according to the use of the terminal, and the like. Further, the storage 410 may include high speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid state storage device. In some examples, storage 410 may further include memory located remotely from processor 420, which may be connected via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.

The input means 430 may be used to receive input numbers, character information, or voice information, and to generate key signal inputs related to user settings and function control of the electronic device. The output device 440 may include a display screen, speakers, or other electronic equipment.

The electronic equipment provided by the embodiment of the application can achieve the technical effect of effectively improving the target detection accuracy.

Example five

An embodiment of the present invention further provides a storage medium containing computer-executable instructions, which when executed by a computer processor, perform a method for object detection, the method including:

Optionally, the program, when executed by the processor, may be further configured to perform the object detection method provided in any embodiment of the present invention.

Computer storage media for embodiments of the invention may employ any combination of one or more computer-readable media. The computer readable medium may be a computer readable signal medium or a computer readable storage medium. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a Read Only Memory (ROM), an Erasable Programmable Read Only Memory (EPROM), a flash Memory, an optical fiber, a portable CD-ROM, an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. A computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.

A computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take a variety of forms, including, but not limited to: an electromagnetic signal, an optical signal, or any suitable combination of the foregoing. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.

Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, Radio Frequency (RF), etc., or any suitable combination of the foregoing.

Computer program code for carrying out operations for aspects of the present invention may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, C++ or the like and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).

In the description herein, references to the description of the term "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, the schematic representations of the terms used above do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.

It is to be noted that the foregoing is only illustrative of the preferred embodiments of the present invention and the technical principles employed. It will be understood by those skilled in the art that the present invention is not limited to the particular embodiments described herein, but is capable of various obvious changes, rearrangements and substitutions as will now become apparent to those skilled in the art without departing from the scope of the invention. Therefore, although the present invention has been described in greater detail by the above embodiments, the present invention is not limited to the above embodiments, and may include other equivalent embodiments without departing from the spirit of the present invention, and the scope of the present invention is determined by the scope of the appended claims.

Claims

1. A method of object detection, the method comprising:

2. The method of claim 1, wherein determining a current prior box for target detection comprises:

3. The method of claim 1, wherein determining the target detection regression loss function between the current prior frame and the real frame with the switch target labeling performed in advance comprises:

4. The method according to claim 1, wherein the grouping the real frames subjected to the switch target labeling in advance according to the target detection regression loss function to obtain the real frames belonging to the current prior frame comprises:

5. The method according to claim 1, wherein the width and height of the current prior frame are updated according to width and height data of a real frame under the current prior frame to obtain an updated prior frame; and returning to continue updating until the preset updating condition is met, wherein the updating comprises the following steps:

6. The method of claim 1, wherein the performing object detection according to the prior frame after the iterative update is finished comprises:

7. The method of claim 1, wherein the current prior frame comprises an initialized prior frame selected from real frames previously labeled with switch targets or an updated prior frame obtained from a last update comprises:

8. An object detection apparatus, characterized in that the apparatus comprises:

9. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the object detection method as claimed in any one of claims 1-7 when executing the program.

10. A storage medium containing computer-executable instructions for performing the object detection method of any one of claims 1-7 when executed by a computer processor.