CN113378857A - Target detection method and device, electronic equipment and storage medium - Google Patents
- Publication number
- CN113378857A (application CN202110721556.XA)
- Authority
- CN
- China
- Prior art keywords
- detection
- module
- detection frames
- processing
- image
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Withdrawn
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/047—Probabilistic or stochastic networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Abstract
The disclosure provides a target detection method, a target detection device, an electronic device and a storage medium, relating to the technical fields of computer vision and deep learning. The specific implementation scheme is as follows: feature extraction is performed on an image to obtain a plurality of feature maps; the feature maps are classified to obtain a plurality of heat maps; target detection processing is performed according to the heat maps to obtain a plurality of first detection frames; and the first detection frames are screened to obtain a plurality of second detection frames, which are used to identify target objects of different classes in the image. The method and device improve the speed of target detection.
Description
Technical Field
The present disclosure relates to the field of artificial intelligence, in particular to computer vision and deep learning technology, and is specifically applicable to smart city and intelligent traffic scenarios.
Background
With the development of technology and the improvement of hardware performance, artificial intelligence has become applicable to a wide variety of scenarios, such as image processing, video processing, face recognition, target positioning and target detection. In application scenarios that require real-time performance in particular, demands on detection speed keep rising, and how to improve detection speed as much as possible while guaranteeing detection accuracy is a problem in urgent need of a solution. The related art offers no effective solution to this problem.
Disclosure of Invention
The disclosure provides a target detection method, a target detection device, an electronic device and a storage medium.
According to an aspect of the present disclosure, a target detection method is provided, including:
performing feature extraction on an image to obtain a plurality of feature maps;
classifying the plurality of feature maps to obtain a plurality of heat maps;
performing target detection processing according to the heat maps to obtain a plurality of first detection frames; and
screening the plurality of first detection frames to obtain a plurality of second detection frames, wherein the plurality of second detection frames are used to identify target objects of different classes in the image.
According to another aspect of the present disclosure, a target detection apparatus is provided, including:
a first processing module, configured to perform feature extraction on an image to obtain a plurality of feature maps;
a second processing module, configured to classify the plurality of feature maps to obtain a plurality of heat maps;
a third processing module, configured to perform target detection processing according to the heat maps to obtain a plurality of first detection frames; and
a fourth processing module, configured to screen the plurality of first detection frames to obtain a plurality of second detection frames, the plurality of second detection frames being used to identify target objects of different classes in the image.
According to another aspect of the present disclosure, there is provided an electronic device including:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform a method provided by any one of the embodiments of the present disclosure.
According to another aspect of the present disclosure, there is provided a non-transitory computer readable storage medium storing computer instructions for causing a computer to perform a method provided by any one of the embodiments of the present disclosure.
According to another aspect of the present disclosure, there is provided a computer program product comprising computer instructions which, when executed by a processor, implement the method provided by any one of the embodiments of the present disclosure.
With the method and device, feature extraction is performed on an image to obtain a plurality of feature maps; the feature maps are classified to obtain a plurality of heat maps; target detection processing is performed according to the heat maps to obtain a plurality of first detection frames; and the first detection frames are screened to obtain a plurality of second detection frames, which are used to identify target objects of different classes in the image. The speed of target detection is thereby improved.
It should be understood that the statements in this section do not necessarily identify key or critical features of the embodiments of the present disclosure, nor do they limit the scope of the present disclosure. Other features of the present disclosure will become apparent from the following description.
Drawings
The drawings are included to provide a better understanding of the present solution and are not to be construed as limiting the present disclosure. Wherein:
FIG. 1 is a schematic flow diagram of a target detection method according to an embodiment of the present disclosure;
FIG. 2 is a schematic diagram of an application scenario of a target detection method according to an embodiment of the present disclosure;
FIG. 3 is a schematic diagram of a target detection method in an application example according to an embodiment of the present disclosure;
FIG. 4 is a schematic diagram of the structure of the target detection device according to an embodiment of the present disclosure;
fig. 5 is a block diagram of an electronic device for implementing the object detection method of the embodiments of the present disclosure.
Detailed Description
Exemplary embodiments of the present disclosure are described below with reference to the accompanying drawings, in which various details of the embodiments of the disclosure are included to assist understanding, and which are to be considered as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
The term "and/or" herein merely describes an association between associated objects and indicates that three relationships may exist; for example, A and/or B may mean: A alone, both A and B, or B alone. The term "at least one" herein means any one of multiple items, or any combination of at least two of them; for example, "including at least one of A, B and C" may mean including any one or more elements selected from the set consisting of A, B and C. The terms "first" and "second" are used to refer to and distinguish similar objects; they do not necessarily imply a particular order, nor do they limit the number to two, and there may be one or more of each.
Furthermore, in the following detailed description, numerous specific details are set forth in order to provide a better understanding of the present disclosure. It will be understood by those skilled in the art that the present disclosure may be practiced without some of these specific details. In some instances, methods, means, elements and circuits that are well known to those skilled in the art have not been described in detail so as not to obscure the present disclosure.
With the development of technology, artificial intelligence technology with neural networks at its core has been widely applied in computer vision scenarios, such as face recognition, image classification, optical character recognition (OCR), object detection, image segmentation, object tracking, event detection, autonomous driving, and smart city and intelligent traffic scenarios.
Taking an intelligent traffic scene as an example, target detection in such a scene can be implemented, based on computer vision and artificial intelligence technology, through anchors or pixel-wise traversal. For example, Faster R-CNN, a candidate-region method based on a convolutional neural network, performs target detection using predefined boxes such as anchors together with a region proposal network (RPN) and region-of-interest pooling (ROI pooling); its detection speed is slow but its accuracy is high. YOLO (You Only Look Once) regresses the position and class of a bounding box directly from pixel points and can identify objects and their positions in an image. SSD (Single Shot MultiBox Detector), a one-stage method, performs target detection by scanning a feature pyramid pixel by pixel; its detection speed is fast but its accuracy is low.
In summary, target detection methods in the related art, whether one-stage (one-stage detection generally needs no predefined anchors and is fast but less accurate) or two-stage (two-stage detection generally needs a large amount of predefined anchor data and is accurate but slow), either require a large amount of preset anchor data or traverse the pixels, and cannot achieve both detection accuracy and detection speed.
To solve these problems, the present disclosure requires neither a large amount of preset anchor data nor pixel-wise traversal; instead, it performs multi-class target detection based mainly on heat maps. This improves detection speed as much as possible while guaranteeing detection accuracy, and is particularly suited to detection scenarios with high real-time requirements, such as intelligent traffic and autonomous driving, where it can meet real-time detection requirements while maintaining high detection accuracy.
According to an embodiment of the present disclosure, a target detection method is provided. Fig. 1 is a flowchart of a target detection method according to an embodiment of the present disclosure. The method may be applied to a target detection apparatus; for example, the apparatus may be deployed in a terminal, a server or other processing device to perform image processing, video processing and the like. The terminal may be user equipment (UE), a mobile device, a cellular phone, a cordless phone, a personal digital assistant (PDA), a handheld device, a computing device, a vehicle-mounted device, a wearable device, and so on. In some possible implementations, the method may also be implemented by a processor calling computer-readable instructions stored in a memory. As shown in fig. 1, the method includes:
and S101, extracting the features of the image to obtain a plurality of feature maps.
And S102, classifying the plurality of feature maps to obtain a plurality of thermodynamic diagrams.
And S103, carrying out target detection processing according to the thermodynamic diagrams to obtain a plurality of first detection frames.
S104, screening the plurality of first detection frames to obtain a plurality of second detection frames, wherein the plurality of second detection frames are used for identifying target objects of different classifications in the image.
In an example of S101-S104, a convolutional neural network (CNN) may be used to extract features from the input image; the resulting feature maps are classified to obtain a plurality of heat maps; a connected-domain (connected-component) operation is performed on the heat maps to implement the target detection processing and obtain a plurality of first detection frames; and a screening process comprising regression and classification is applied to the first detection frames to obtain a plurality of second detection frames.
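As a minimal, hypothetical sketch of the S101-S104 pipeline (the toy image, function names and thresholds are illustrative only; a trivial intensity transform stands in for the CNN, and a single thresholded region stands in for the connected-domain step):

```python
import numpy as np

def extract_features(image: np.ndarray) -> np.ndarray:
    """Step S101 (stand-in for a CNN backbone): returns a C x H x W tensor."""
    # Two toy 'channels': raw intensity and an inverted copy.
    return np.stack([image, 1.0 - image])

def classify_to_heatmaps(features: np.ndarray) -> np.ndarray:
    """Step S102: squash each channel into a per-class heat map in [0, 1]."""
    return 1.0 / (1.0 + np.exp(-4.0 * (features - 0.5)))  # element-wise sigmoid

def detect_boxes(heatmap: np.ndarray, thresh: float = 0.5):
    """Step S103 (crude stand-in for connected-domain analysis):
    one bounding box around all above-threshold pixels."""
    ys, xs = np.nonzero(heatmap > thresh)
    if len(ys) == 0:
        return []
    return [(int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max()))]

def screen_boxes(boxes, min_side: int = 2):
    """Step S104: keep boxes whose width and height exceed a minimum."""
    return [(x0, y0, x1, y1) for x0, y0, x1, y1 in boxes
            if x1 - x0 + 1 >= min_side and y1 - y0 + 1 >= min_side]

image = np.zeros((6, 6))
image[1:4, 2:5] = 1.0            # a bright 3x3 'object'
feats = extract_features(image)  # C x H x W, here C = 2
heatmaps = classify_to_heatmaps(feats)
first_boxes = detect_boxes(heatmaps[0])
second_boxes = screen_boxes(first_boxes)
print(second_boxes)              # [(2, 1, 4, 3)]
```

The real method replaces each stand-in with a learned module; only the data flow (image to features to heat maps to first frames to screened second frames) mirrors the text.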
By adopting the method and the device, the image is subjected to feature extraction, a plurality of feature maps can be obtained, the feature maps are classified, a plurality of thermodynamic diagrams can be obtained, target detection processing is carried out according to the thermodynamic diagrams, a plurality of first detection frames can be obtained, the first detection frames are subjected to screening processing, a plurality of second detection frames can be obtained, and the second detection frames are used for identifying target objects of different classifications in the image, so that the target detection speed is improved.
In one embodiment, performing feature extraction on the image to obtain a plurality of feature maps includes: inputting the image into a feature extraction module; and performing the feature extraction in the feature extraction module and outputting the plurality of feature maps. For example, the feature extraction module uses a CNN to extract features from the input image and outputs a CxHxW feature tensor, where C denotes the category (channel) dimension of the feature map, H its height, and W its width. With this embodiment, the feature extraction module can extract and process image features accurately, improving detection accuracy.
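To make the CxHxW layout concrete, the following sketch builds a small feature tensor by stacking the outputs of three hand-written filters; the filters and the random image are made up for the example and are not the patented backbone:

```python
import numpy as np

def conv2d_valid(image: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    """Naive 'valid' 2-D convolution (no padding, stride 1)."""
    kh, kw = kernel.shape
    h = image.shape[0] - kh + 1
    w = image.shape[1] - kw + 1
    out = np.empty((h, w))
    for i in range(h):
        for j in range(w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# Hypothetical 3-filter 'backbone': each filter yields one channel,
# so stacking gives the C x H x W layout described in the text.
filters = [
    np.ones((3, 3)) / 9.0,                     # blur
    np.array([[-1, 0, 1]] * 3, dtype=float),   # horizontal gradient
    np.array([[-1, 0, 1]] * 3, dtype=float).T, # vertical gradient
]
image = np.random.default_rng(0).random((8, 8))
features = np.stack([conv2d_valid(image, f) for f in filters])
print(features.shape)  # (3, 6, 6): C=3 channels, H=6, W=6
```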
In one embodiment, classifying the plurality of feature maps to obtain a plurality of heat maps includes: inputting the plurality of feature maps into a classification module; and performing the classification in the classification module and outputting the plurality of heat maps, which are used to characterize a plurality of classes of image features. For example, the classification module classifies the CxHxW features output by the CNN to generate a plurality of heat maps, which fall into a plurality of categories characterizing different image features. With this embodiment, the classification module can classify image features accurately, improving detection accuracy.
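The per-class heat map generation can be sketched as a per-pixel softmax over the class axis; the patent does not specify the classifier's form, so this is only an illustrative stand-in:

```python
import numpy as np

def logits_to_heatmaps(logits: np.ndarray) -> np.ndarray:
    """Per-pixel softmax over the class axis: N x H x W logits in,
    N x H x W heat maps out, one H x W map per class, values in [0, 1]."""
    shifted = logits - logits.max(axis=0, keepdims=True)  # numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum(axis=0, keepdims=True)

rng = np.random.default_rng(1)
logits = rng.normal(size=(4, 5, 5))            # N=4 classes over a 5x5 grid
heatmaps = logits_to_heatmaps(logits)
print(heatmaps.shape)                          # (4, 5, 5)
print(np.allclose(heatmaps.sum(axis=0), 1.0))  # True: a distribution per pixel
```

Each of the four HxW slices then plays the role of one class's heat map in the subsequent connected-domain step.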
In one embodiment, performing target detection processing according to the heat maps to obtain a plurality of first detection frames includes: inputting the plurality of heat maps into a connected domain module; and, in the connected domain module, performing a connected-component operation on each heat map to implement the target detection processing and outputting the plurality of first detection frames. For example, the connected domain module performs connected-component calculation on the regions of each heat map to obtain a plurality of first detection frames; in an intelligent traffic scenario, these may include detection frames for vehicles, lane lines, objects around the road, and the like. With this embodiment, the connected domain module enables accurate detection of target objects, improving detection accuracy and, correspondingly, detection speed.
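One possible form of the connected-domain step, sketched as a breadth-first flood fill over a thresholded heat map mask (the patent does not prescribe this particular algorithm at this point; 4-connectivity is an assumption):

```python
from collections import deque

def heatmap_to_boxes(mask):
    """One bounding box per 4-connected region of truthy pixels (BFS flood
    fill). Returns (x_min, y_min, x_max, y_max) tuples: one 'first
    detection frame' per connected region."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for sy in range(h):
        for sx in range(w):
            if not mask[sy][sx] or seen[sy][sx]:
                continue
            x0 = x1 = sx
            y0 = y1 = sy
            queue = deque([(sy, sx)])
            seen[sy][sx] = True
            while queue:                      # grow the region, track extents
                y, x = queue.popleft()
                x0, x1 = min(x0, x), max(x1, x)
                y0, y1 = min(y0, y), max(y1, y)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((x0, y0, x1, y1))
    return boxes

mask = [
    [1, 1, 0, 0, 0],
    [1, 1, 0, 0, 1],
    [0, 0, 0, 0, 1],
]
print(heatmap_to_boxes(mask))  # [(0, 0, 1, 1), (4, 1, 4, 2)]
```

Running this per class heat map yields the per-category first detection frames the text describes.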
In one embodiment, screening the plurality of first detection frames to obtain a plurality of second detection frames, where the plurality of second detection frames are used to identify target objects of different classes in the image, includes: inputting the plurality of first detection frames into a screening module comprising a regression branch and a classification branch; and, in the screening module, regressing the coordinates of the first detection frames based on the positions of the detection frames of the corresponding classes through the regression branch, extracting and then classifying the features at the positions corresponding to the first detection frames through the classification branch, and outputting the plurality of second detection frames. The regression branch obtains a more accurate position for each detection frame; for example, a regressor may directly regress the coordinates of the first detection frame corresponding to a category. The classification branch performs a more accurate classification; for example, a region-of-interest alignment (ROI Align) algorithm may be used to extract the features at the position corresponding to a first detection frame, which are then sent to a binary classifier that determines whether the class is present. With this embodiment, the regression branch and the classification branch screen the first detection frames, improving detection accuracy and, correspondingly, detection speed.
According to an embodiment of the present disclosure, a target detection method is provided. Fig. 2 is a schematic diagram of an application scenario of the target detection method according to an embodiment of the present disclosure. The method may be implemented by a target detection network, which may include a feature extraction module, a classification module, a connected domain module and a screening module. The target detection network may be deployed on various terminals, such as the server 11, the terminals 21 to 22, the testing machine 31 and the desktop 41 shown in fig. 2, and is not limited to the architecture shown in fig. 2; any terminal may implement the processing logic 200 according to the above embodiments of the present disclosure, which specifically includes:
S201, perform feature extraction on the image using the feature extraction module, and output a plurality of feature maps.
S202, classify the plurality of feature maps using the classification module, and output a plurality of heat maps.
S203, using the connected domain module, perform connected-component calculation on the regions of each heat map, and output a plurality of first detection frames.
S204, screen the plurality of first detection frames using the screening module, and output a plurality of second detection frames.
By adopting the embodiment, the target detection network comprising the feature extraction module, the classification module, the connected domain module and the screening module can be deployed at various required terminals according to actual requirements, and accurate and rapid target detection can be realized.
Application example:
Fig. 3 is a schematic diagram of a target detection method in an application example according to an embodiment of the present disclosure. In this application example, for target detection in an intelligent traffic scene, detection is performed based on heat maps, so that both detection accuracy and detection speed can be achieved. The method includes:
Firstly, a CNN is used to extract features from the input image; the CNN may be ResNet, MobileNet or another network, and the output feature tensor is CxHxW, where C denotes the category (channel) dimension of the feature map, H its height, and W its width.
Secondly, a classifier is used to classify the CxHxW features output by the CNN. For example, if there are N classes in total, an NxHxW heat map is generated, and each HxW matrix represents the heat map of one class.
Thirdly, a connected domain module is used to perform connected-component analysis on the regions of each heat map; the connected-component algorithm used may be the Two-Pass algorithm. This generates the corresponding first detection frames for each category.
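The Two-Pass algorithm mentioned above can be sketched as follows: a first pass assigns provisional labels and records label equivalences in a union-find structure, and a second pass rewrites each pixel to its root label (4-connectivity is an assumption; the patent names the algorithm but not its parameters):

```python
def two_pass_label(mask):
    """Classic Two-Pass connected-component labelling (4-connectivity)."""
    h, w = len(mask), len(mask[0])
    parent = [0]  # parent[i]: representative of label i; label 0 = background

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]  # path compression
            a = parent[a]
        return a

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[max(ra, rb)] = min(ra, rb)

    labels = [[0] * w for _ in range(h)]
    next_label = 1
    for y in range(h):                       # pass 1: provisional labels
        for x in range(w):
            if not mask[y][x]:
                continue
            up = labels[y - 1][x] if y > 0 else 0
            left = labels[y][x - 1] if x > 0 else 0
            if up == 0 and left == 0:        # new region
                parent.append(next_label)
                labels[y][x] = next_label
                next_label += 1
            elif up and left:                # two regions meet: merge
                labels[y][x] = min(up, left)
                union(up, left)
            else:                            # continue the single neighbour
                labels[y][x] = up or left
    for y in range(h):                       # pass 2: resolve equivalences
        for x in range(w):
            if labels[y][x]:
                labels[y][x] = find(labels[y][x])
    return labels

mask = [
    [1, 0, 1],
    [1, 0, 1],
    [1, 1, 1],
]
print(two_pass_label(mask))  # [[1, 0, 1], [1, 0, 1], [1, 1, 1]]
```

The two vertical strips receive different provisional labels in pass 1 and are merged into one component when the bottom row connects them; a bounding box per final label then gives the first detection frames.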
Fourthly, the generated first detection frames are processed by the regression branch and the classification branch respectively to screen the detection frames and obtain the final second detection frames. Specifically, for the regression branch, a regressor may directly regress the coordinates of the first detection frame corresponding to a category; for the classification branch, the ROI Align algorithm may be used to extract the features at the position corresponding to a first detection frame, which are then sent to a binary classifier for classification (the binary classifier determines whether the class is present), finally yielding the required second detection frames.
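An illustrative sketch of this screening step, with a plain crop-and-mean-pool standing in for ROI Align and a hand-set logistic function standing in for the learned binary classifier (the weight and bias are arbitrary, and the regression branch is omitted for brevity):

```python
import numpy as np

def roi_mean_pool(feature_map, box):
    """Crude stand-in for ROI Align: crop the box from the H x W feature map
    and mean-pool it to a single scalar descriptor."""
    x0, y0, x1, y1 = box
    return feature_map[y0:y1 + 1, x0:x1 + 1].mean()

def screen(feature_map, boxes, weight=6.0, bias=-3.0):
    """Hypothetical classification branch: a logistic binary classifier on the
    pooled descriptor decides whether the class is really present in a box."""
    kept = []
    for box in boxes:
        score = 1.0 / (1.0 + np.exp(-(weight * roi_mean_pool(feature_map, box) + bias)))
        if score > 0.5:          # binary decision: class present or not
            kept.append(box)
    return kept

fmap = np.zeros((6, 6))
fmap[1:4, 1:4] = 0.9             # strong response inside the true object
boxes = [(1, 1, 3, 3),           # covers the response: should survive
         (4, 4, 5, 5)]           # empty background: should be screened out
print(screen(fmap, boxes))       # [(1, 1, 3, 3)]
```

In the actual method the pooled descriptor would be a feature vector from ROI Align and the classifier would be trained, but the keep/discard decision per first detection frame has the same shape.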
With this application example, a large amount of anchor data need not be preset, and target detection does not require pixel-by-pixel traversal.
According to an embodiment of the present disclosure, a target detection apparatus is provided. Fig. 4 is a schematic structural diagram of the target detection apparatus according to an embodiment of the present disclosure. As shown in fig. 4, the target detection apparatus 400 includes: a first processing module 401, configured to perform feature extraction on an image to obtain a plurality of feature maps; a second processing module 402, configured to classify the plurality of feature maps to obtain a plurality of heat maps; a third processing module 403, configured to perform target detection processing according to the heat maps to obtain a plurality of first detection frames; and a fourth processing module 404, configured to screen the plurality of first detection frames to obtain a plurality of second detection frames, where the plurality of second detection frames are used to identify target objects of different classes in the image.
In one embodiment, the first processing module is configured to input the image into a feature extraction module; and performing the feature extraction in the feature extraction module, and outputting the plurality of feature maps.
In one embodiment, the second processing module is configured to input the plurality of feature maps into a classification module; the classification is performed in the classification module, and the plurality of heat maps, used to characterize a plurality of classes of image features, are output.
In one embodiment, the third processing module is configured to input the heat maps into a connected domain module; in the connected domain module, a connected-component operation is performed on each heat map to implement the target detection processing, and the plurality of first detection frames are output.
In one embodiment, the fourth processing module is configured to input the plurality of first detection boxes into a screening module including a regression branch and a classification branch; in the screening module, the coordinates of the first detection frames are subjected to regression processing based on the positions of the detection frames of the corresponding classes through the regression branch, the features of the positions corresponding to the first detection frames are extracted through the classification branch and then subjected to classification processing, and the second detection frames are output.
The functions of each module in each apparatus in the embodiments of the present disclosure may refer to the corresponding description in the above method, and are not described herein again.
The present disclosure also provides an electronic device, a readable storage medium, and a computer program product according to embodiments of the present disclosure.
Fig. 5 is a block diagram of an electronic device for implementing the target detection method, the image processing method and the video processing method of the embodiments of the present disclosure. The electronic device may be the aforementioned deployment device or proxy device. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. Electronic devices may also represent various forms of mobile devices, such as personal digital assistants, cellular phones, smart phones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions are meant as examples only and are not meant to limit implementations of the disclosure described and/or claimed herein.
As shown in fig. 5, the electronic device 500 includes a computing unit 501, which can perform various appropriate actions and processes according to a computer program stored in a read-only memory (ROM) 502 or a computer program loaded from a storage unit 508 into a random access memory (RAM) 503. The RAM 503 can also store various programs and data required for the operation of the electronic device 500. The computing unit 501, the ROM 502 and the RAM 503 are connected to one another by a bus 504. An input/output (I/O) interface 505 is also connected to the bus 504.
A number of components in the electronic device 500 are connected to the I/O interface 505, including: an input unit 506 such as a keyboard, a mouse, or the like; an output unit 507 such as various types of displays, speakers, and the like; a storage unit 508, such as a magnetic disk, optical disk, or the like; and a communication unit 509 such as a network card, modem, wireless communication transceiver, etc. The communication unit 509 allows the electronic device 500 to exchange information/data with other devices through a computer network such as the internet and/or various telecommunication networks.
The computing unit 501 may be any of various general-purpose and/or special-purpose processing components having processing and computing capabilities. Some examples of the computing unit 501 include, but are not limited to, a central processing unit (CPU), a graphics processing unit (GPU), various dedicated artificial intelligence (AI) computing chips, various computing units running machine learning model algorithms, a digital signal processor (DSP), and any suitable processor, controller or microcontroller. The computing unit 501 performs the methods and processes described above, such as the target detection method and the image and video processing methods. For example, in some embodiments, these methods may be implemented as a computer software program tangibly embodied in a machine-readable medium, such as the storage unit 508. In some embodiments, part or all of the computer program may be loaded and/or installed onto the electronic device 500 via the ROM 502 and/or the communication unit 509. When the computer program is loaded into the RAM 503 and executed by the computing unit 501, one or more steps of the methods described above may be performed. Alternatively, in other embodiments, the computing unit 501 may be configured to perform the target detection, image processing and video processing methods in any other suitable way (for example, by means of firmware).
Various implementations of the systems and techniques described above may be implemented in digital electronic circuitry, integrated circuitry, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), systems on chip (SOCs), Complex Programmable Logic Devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include being implemented in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special-purpose or general-purpose, and which receives data and instructions from, and transmits data and instructions to, a storage system, at least one input device, and at least one output device.
Program code for implementing the methods of the present disclosure may be written in any combination of one or more programming languages. The program code may be provided to a processor or controller of a general-purpose computer, a special-purpose computer, or another programmable data processing apparatus, such that the program code, when executed by the processor or controller, causes the functions/operations specified in the flowcharts and/or block diagrams to be performed. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package partly on the machine and partly on a remote machine, or entirely on the remote machine or server.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a Read-Only Memory (ROM), an Erasable Programmable Read-Only Memory (EPROM or flash memory), an optical fiber, a portable Compact Disc Read-Only Memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic input, speech input, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local Area Networks (LANs), Wide Area Networks (WANs), and the Internet.
The computer system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.
It should be understood that various forms of the flows shown above may be used, with steps reordered, added, or deleted. For example, the steps described in the present disclosure may be executed in parallel or sequentially or in different orders, and are not limited herein as long as the desired results of the technical solutions disclosed in the present disclosure can be achieved.
The above detailed description should not be construed as limiting the scope of the disclosure. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made in accordance with design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present disclosure should be included in the scope of protection of the present disclosure.
Claims (13)
1. A method of target detection, the method comprising:
performing feature extraction on an image to obtain a plurality of feature maps;
classifying the plurality of feature maps to obtain a plurality of heat maps;
performing target detection processing according to the plurality of heat maps to obtain a plurality of first detection boxes;
and screening the plurality of first detection boxes to obtain a plurality of second detection boxes, wherein the plurality of second detection boxes are used for identifying target objects of different classifications in the image.
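Read as a data flow, the four steps of claim 1 form a straight pipeline. The sketch below is illustrative only: the callables `extract`, `classify`, `connected_domains`, and `screen` are assumed stand-ins for the modules introduced in claims 2-5, not the patent's actual implementation, and "heat map" is used for what the original translation renders as "thermodynamic diagram".

```python
# Hypothetical sketch of the claimed four-step flow; all names are
# illustrative stand-ins, not taken from the patent.

def detect(image, extract, classify, connected_domains, screen):
    """Run the claimed pipeline end to end."""
    feature_maps = extract(image)               # step 1: feature extraction
    heat_maps = classify(feature_maps)          # step 2: one heat map per class
    first_boxes = connected_domains(heat_maps)  # step 3: candidate detection boxes
    second_boxes = screen(first_boxes)          # step 4: regression + classification screening
    return second_boxes
```

Each stage corresponds to one of the modules recited in claims 2-5.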
2. The method of claim 1, wherein performing feature extraction on the image to obtain the plurality of feature maps comprises:
inputting the image into a feature extraction module;
and performing the feature extraction in the feature extraction module, and outputting the plurality of feature maps.
3. The method of claim 1 or 2, wherein classifying the plurality of feature maps to obtain the plurality of heat maps comprises:
inputting the plurality of feature maps into a classification module;
and performing the classification in the classification module and outputting the plurality of heat maps, the plurality of heat maps being used for characterizing image features of a plurality of classes.
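One common way to realize such a classification module is a per-pixel classifier over the feature channels, akin to a 1x1 convolution followed by a sigmoid, producing one heat map (rendered above as "thermodynamic diagram") per class. The claim does not specify this; the sketch below is a minimal pure-Python assumption with illustrative names:

```python
import math


def heat_maps_from_features(feature_maps, class_weights):
    """Per-pixel classification sketch (assumed, not from the patent).

    feature_maps:  C channel maps, each H x W (nested lists).
    class_weights: one length-C weight vector per class.
    Returns one H x W heat map per class, with sigmoid-activated scores.
    """
    c = len(feature_maps)
    h, w = len(feature_maps[0]), len(feature_maps[0][0])

    def sigmoid(z):
        return 1.0 / (1.0 + math.exp(-z))

    heat_maps = []
    for weights in class_weights:  # one weight vector per class
        hm = [[sigmoid(sum(weights[k] * feature_maps[k][y][x] for k in range(c)))
               for x in range(w)]
              for y in range(h)]
        heat_maps.append(hm)
    return heat_maps
```

In a real model the weights would be learned; here they simply show how C feature channels collapse into per-class heat maps.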
4. The method of claim 1 or 2, wherein performing target detection processing according to the plurality of heat maps to obtain the plurality of first detection boxes comprises:
inputting the plurality of heat maps into a connected domain module;
and in the connected domain module, performing a connected domain operation on each heat map to realize the target detection processing, and outputting the plurality of first detection boxes.
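The connected domain (connected component) operation can be sketched as a breadth-first flood fill over a thresholded heat map, emitting one bounding box per 4-connected region. The threshold value and the (x_min, y_min, x_max, y_max) box format are assumptions for illustration, not recited in the claim:

```python
from collections import deque


def boxes_from_heat_map(heat_map, threshold=0.5):
    """Return one (x_min, y_min, x_max, y_max) box per 4-connected
    region of the heat map whose values reach the threshold."""
    h, w = len(heat_map), len(heat_map[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if heat_map[y][x] >= threshold and not seen[y][x]:
                # BFS flood fill collecting the component's coordinates
                q = deque([(y, x)])
                seen[y][x] = True
                ys, xs = [y], [x]
                while q:
                    cy, cx = q.popleft()
                    for ny, nx in ((cy - 1, cx), (cy + 1, cx),
                                   (cy, cx - 1), (cy, cx + 1)):
                        if (0 <= ny < h and 0 <= nx < w
                                and not seen[ny][nx]
                                and heat_map[ny][nx] >= threshold):
                            seen[ny][nx] = True
                            ys.append(ny)
                            xs.append(nx)
                            q.append((ny, nx))
                boxes.append((min(xs), min(ys), max(xs), max(ys)))
    return boxes
```

Production code would typically use a library routine such as OpenCV's `connectedComponentsWithStats` instead of a hand-rolled BFS.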
5. The method of claim 1 or 2, wherein screening the plurality of first detection boxes to obtain the plurality of second detection boxes, the plurality of second detection boxes being used for identifying target objects of different classifications in the image, comprises:
inputting the plurality of first detection boxes into a screening module comprising a regression branch and a classification branch;
and in the screening module, performing, through the regression branch, regression processing on the coordinates of the plurality of first detection boxes based on the positions of the detection boxes of the corresponding classes, extracting, through the classification branch, features at the positions corresponding to the plurality of first detection boxes and performing classification processing on them, and outputting the plurality of second detection boxes.
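A minimal sketch of this screening, assuming the two branches behave as callables: the regression branch refines each candidate box's coordinates, the classification branch scores the refined box, and low-scoring candidates are dropped. The names `regress` and `score` and the threshold are illustrative assumptions, not the patent's implementation:

```python
def screen_boxes(first_boxes, regress, score, score_threshold=0.5):
    """Screening sketch: regression branch refines coordinates,
    classification branch scores; keep boxes above the threshold."""
    second_boxes = []
    for box in first_boxes:
        refined = regress(box)                  # regression branch: adjust coordinates
        if score(refined) >= score_threshold:   # classification branch: confidence check
            second_boxes.append(refined)
    return second_boxes
```

In a trained model both branches would be learned network heads; here they are stubs showing only the control flow of the claimed screening.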
6. An object detection apparatus, the apparatus comprising:
the first processing module is used for performing feature extraction on an image to obtain a plurality of feature maps;
the second processing module is used for classifying the plurality of feature maps to obtain a plurality of heat maps;
the third processing module is used for performing target detection processing according to the plurality of heat maps to obtain a plurality of first detection boxes;
and the fourth processing module is used for screening the plurality of first detection boxes to obtain a plurality of second detection boxes, the plurality of second detection boxes being used for identifying target objects of different classifications in the image.
7. The apparatus of claim 6, wherein the first processing module is configured to:
inputting the image into a feature extraction module;
and performing the feature extraction in the feature extraction module, and outputting the plurality of feature maps.
8. The apparatus of claim 6 or 7, wherein the second processing module is configured to:
inputting the plurality of feature maps into a classification module;
and performing the classification in the classification module and outputting the plurality of heat maps, the plurality of heat maps being used for characterizing image features of a plurality of classes.
9. The apparatus of claim 6 or 7, wherein the third processing module is configured to:
inputting the plurality of heat maps into a connected domain module;
and in the connected domain module, performing a connected domain operation on each heat map to realize the target detection processing, and outputting the plurality of first detection boxes.
10. The apparatus of claim 6 or 7, wherein the fourth processing module is configured to:
inputting the plurality of first detection boxes into a screening module comprising a regression branch and a classification branch;
and in the screening module, performing, through the regression branch, regression processing on the coordinates of the plurality of first detection boxes based on the positions of the detection boxes of the corresponding classes, extracting, through the classification branch, features at the positions corresponding to the plurality of first detection boxes and performing classification processing on them, and outputting the plurality of second detection boxes.
11. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-5.
12. A non-transitory computer readable storage medium having stored thereon computer instructions for causing a computer to perform the method of any one of claims 1-5.
13. A computer program product comprising computer instructions which, when executed by a processor, implement the method of any one of claims 1-5.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110721556.XA CN113378857A (en) | 2021-06-28 | 2021-06-28 | Target detection method and device, electronic equipment and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN113378857A true CN113378857A (en) | 2021-09-10 |
Family
ID=77579619
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110721556.XA Withdrawn CN113378857A (en) | 2021-06-28 | 2021-06-28 | Target detection method and device, electronic equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113378857A (en) |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111310770A (en) * | 2020-02-21 | 2020-06-19 | 集美大学 | Target detection method and device |
CN111461182A (en) * | 2020-03-18 | 2020-07-28 | 北京小米松果电子有限公司 | Image processing method, image processing apparatus, and storage medium |
CN111814755A (en) * | 2020-08-18 | 2020-10-23 | 深延科技(北京)有限公司 | Multi-frame image pedestrian detection method and device for night motion scene |
CN112101195A (en) * | 2020-09-14 | 2020-12-18 | 腾讯科技(深圳)有限公司 | Crowd density estimation method and device, computer equipment and storage medium |
CN112183435A (en) * | 2020-10-12 | 2021-01-05 | 河南威虎智能科技有限公司 | Two-stage hand target detection method |
CN112967315A (en) * | 2021-03-02 | 2021-06-15 | 北京百度网讯科技有限公司 | Target tracking method and device and electronic equipment |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113989568A (en) * | 2021-10-29 | 2022-01-28 | 北京百度网讯科技有限公司 | Target detection method, training method, device, electronic device and storage medium |
CN114549874A (en) * | 2022-03-02 | 2022-05-27 | 北京百度网讯科技有限公司 | Training method of multi-target image-text matching model, image-text retrieval method and device |
CN114549874B (en) * | 2022-03-02 | 2024-03-08 | 北京百度网讯科技有限公司 | Training method of multi-target image-text matching model, image-text retrieval method and device |
CN115049954A (en) * | 2022-05-09 | 2022-09-13 | 北京百度网讯科技有限公司 | Target identification method, device, electronic equipment and medium |
CN115049954B (en) * | 2022-05-09 | 2023-09-22 | 北京百度网讯科技有限公司 | Target identification method, device, electronic equipment and medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN113379718B (en) | Target detection method, target detection device, electronic equipment and readable storage medium | |
CN112633276B (en) | Training method, recognition method, device, equipment and medium | |
CN112597837B (en) | Image detection method, apparatus, device, storage medium, and computer program product | |
CN112560862B (en) | Text recognition method and device and electronic equipment | |
CN113378857A (en) | Target detection method and device, electronic equipment and storage medium | |
CN115880536B (en) | Data processing method, training method, target object detection method and device | |
CN113392794B (en) | Vehicle line crossing identification method and device, electronic equipment and storage medium | |
CN114359932B (en) | Text detection method, text recognition method and device | |
CN113657483A (en) | Model training method, target detection method, device, equipment and storage medium | |
CN113947188A (en) | Training method of target detection network and vehicle detection method | |
CN113378969A (en) | Fusion method, device, equipment and medium of target detection results | |
CN113326773A (en) | Recognition model training method, recognition method, device, equipment and storage medium | |
CN113239807A (en) | Method and device for training bill recognition model and bill recognition | |
CN114443794A (en) | Data processing and map updating method, device, equipment and storage medium | |
CN114724133A (en) | Character detection and model training method, device, equipment and storage medium | |
CN113706705B (en) | Image processing method, device, equipment and storage medium for high-precision map | |
CN113344121B (en) | Method for training a sign classification model and sign classification | |
CN113569911A (en) | Vehicle identification method and device, electronic equipment and storage medium | |
CN114724113B (en) | Road sign recognition method, automatic driving method, device and equipment | |
CN114429631B (en) | Three-dimensional object detection method, device, equipment and storage medium | |
CN115761698A (en) | Target detection method, device, equipment and storage medium | |
CN113887394A (en) | Image processing method, device, equipment and storage medium | |
CN115147814A (en) | Recognition method of traffic indication object and training method of target detection model | |
CN112818972B (en) | Method and device for detecting interest point image, electronic equipment and storage medium | |
CN113936158A (en) | Label matching method and device |
Legal Events

Date | Code | Title | Description
---|---|---|---
 | PB01 | Publication | |
 | SE01 | Entry into force of request for substantive examination | |
 | WW01 | Invention patent application withdrawn after publication | Application publication date: 20210910 |