CN109886208B - Object detection method and device, computer equipment and storage medium - Google Patents

Object detection method and device, computer equipment and storage medium

Info

Publication number
CN109886208B
Authority
CN
China
Prior art keywords
detection
point
anchor point
determining
feature
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910137428.3A
Other languages
Chinese (zh)
Other versions
CN109886208A (en)
Inventor
杨帆
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Dajia Internet Information Technology Co Ltd
Original Assignee
Beijing Dajia Internet Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Dajia Internet Information Technology Co Ltd
Priority to CN201910137428.3A
Publication of CN109886208A
Application granted
Publication of CN109886208B
Legal status: Active (current)
Anticipated expiration

Landscapes

  • Image Analysis (AREA)

Abstract

The disclosure relates to an object detection method and apparatus, a computer device, and a storage medium, and belongs to the technical field of computer vision. The method comprises the following steps: determining a feature map of a target image; determining a plurality of feature points in the feature map; for each feature point, determining a plurality of reference points and determining at least one anchor point centered on each reference point; and performing object detection on the feature map based on the determined anchor points to obtain the position information and detection object type corresponding to each detection object in the target image. With the method and apparatus, missed objects can be reduced when dense small objects are detected.

Description

Object detection method and device, computer equipment and storage medium
Technical Field
The present disclosure relates to the field of computer vision technologies, and in particular, to a method and an apparatus for object detection, a computer device, and a storage medium.
Background
Object detection is a core problem in the field of computer vision. It consists of detecting whether an image contains objects to be detected and, if so, determining the position and type of each object.
The object detection method in the related art comprises the following steps: first, a plurality of anchor points are determined centered on a feature point in a feature map. Then, detection is performed for each anchor point, and when a detection object exists within an anchor point, the position information and detection object type of that object are output.
When the anchor points corresponding to one feature point contain a plurality of objects, the anchor points share the same center point, so the detection areas they are responsible for overlap heavily. As a result, detection on these anchor points can only find the same object, and the other objects are missed.
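As a rough numeric illustration (not part of the patent), the Python sketch below compares the intersection-over-union of two concentric anchors with that of two anchors whose centers are offset; the (x1, y1, x2, y2) box format is an assumption for illustration:

# Illustrative only: overlap of concentric vs. offset anchors.
def iou(a, b):
    # a, b: boxes as (x1, y1, x2, y2)
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter)

# Two concentric anchors of areas 1 and 2.25 overlap heavily ...
print(round(iou((0, 0, 1, 1), (-0.25, -0.25, 1.25, 1.25)), 2))  # 0.44
# ... while shifting the second anchor's center leaves little overlap.
print(round(iou((0, 0, 1, 1), (0.75, 0.75, 1.75, 1.75)), 2))    # 0.03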
Disclosure of Invention
The present disclosure provides an object detection method and apparatus, a computer device, and a storage medium, which can solve the technical problem that objects are often missed when existing object detection methods are applied to the detection of dense small objects.
According to a first aspect of embodiments of the present disclosure, there is provided a method of object detection, including:
determining a feature map of a target image;
determining a plurality of feature points in the feature map;
corresponding to each feature point, respectively determining a plurality of reference points, and respectively determining at least one anchor point by taking each reference point as a center;
and performing object detection on the feature map based on the determined anchor points to obtain position information and detection object types corresponding to the detection objects in the target image.
Optionally, the determining a plurality of reference points respectively corresponding to each feature point, and determining at least one anchor point respectively centering on each reference point includes:
determining at least one initial anchor point corresponding to each feature point;
and determining a plurality of reference points in each initial anchor point, and respectively determining at least one anchor point by taking each reference point as a center.
Optionally, the determining a plurality of reference points in each initial anchor point, and with each reference point as a center, respectively determining at least one anchor point, includes:
determining a plurality of uniformly distributed reference points in each initial anchor point, dividing each initial anchor point into a plurality of anchor points respectively based on the reference points in each initial anchor point, and taking the central point of each anchor point obtained by division as one reference point.
Optionally, the determining a plurality of reference points respectively corresponding to each feature point, and determining at least one anchor point respectively centering on each reference point includes:
and corresponding to each feature point, respectively determining a plurality of reference points based on the preset position information of the plurality of reference points relative to the feature point, and respectively determining at least one anchor point by taking each reference point as a center.
Optionally, the determining the feature map of the target image includes:
and determining a plurality of feature maps with different scales of the target image.
Optionally, after the object detection is performed on the feature map based on the determined anchor points to obtain the position information and detection object type corresponding to each detection object included in the target image, the method further includes:
and displaying the target image, and adding a mark to each detection object in the target image based on the position information and the detection object type corresponding to each detection object.
Optionally, the performing object detection on the feature map based on the determined anchor points to obtain position information and detection object types corresponding to the detection objects included in the target image includes:
inputting the determined feature map region contained in each anchor point into the detection models corresponding to different detection object types to obtain the detection result of each anchor point corresponding to different detection models;
and determining the position information and the detection object type corresponding to each detection object in the target image based on the detection result of each anchor point corresponding to different detection models.
According to a second aspect of the embodiments of the present disclosure, there is provided an apparatus for object detection, comprising:
The determining unit is configured to determine a feature map of a target image, determine a plurality of feature points in the feature map, respectively determine a plurality of reference points corresponding to each feature point, and respectively determine at least one anchor point by taking each reference point as a center;
and the detection unit is configured to perform object detection on the feature map based on the determined anchor points to obtain position information and detection object types corresponding to the detection objects included in the target image.
Optionally, the determining unit is configured to:
determining at least one initial anchor point corresponding to each feature point;
and determining a plurality of reference points in each initial anchor point, and respectively determining at least one anchor point by taking each reference point as a center.
Optionally, the determining unit is configured to:
determining a plurality of uniformly distributed reference points in each initial anchor point, dividing each initial anchor point into a plurality of anchor points respectively based on the reference points in each initial anchor point, and taking the central point of each anchor point obtained by division as one reference point.
Optionally, the determining unit is configured to:
and corresponding to each feature point, respectively determining a plurality of reference points based on the preset position information of the plurality of reference points relative to the feature point, and respectively determining at least one anchor point by taking each reference point as a center.
Optionally, the determining unit is configured to:
and determining a plurality of feature maps with different scales of the target image.
Optionally, the apparatus further comprises:
a marking unit configured to display the target image in which a mark is added to each detection object based on the position information and the detection object type corresponding to each detection object.
Optionally, the detection unit is configured to:
inputting the determined feature map region contained in each anchor point into the detection models corresponding to different detection object types to obtain the detection result of each anchor point corresponding to different detection models;
and determining the position information and the detection object type corresponding to each detection object in the target image based on the detection result of each anchor point corresponding to different detection models.
According to a third aspect of embodiments of the present disclosure, there is provided a computer device comprising:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to:
the method of the first aspect of the embodiments of the present disclosure is performed.
According to a fourth aspect of the embodiments of the present disclosure, there is provided a non-transitory computer-readable storage medium, wherein instructions in the storage medium, when executed by a processor of a computer device, enable the computer device to perform the method of the first aspect of the embodiments of the present disclosure.
According to a fifth aspect of embodiments of the present disclosure, there is provided an application program comprising one or more instructions executable by a processor of a server to perform the method of the first aspect of embodiments of the present disclosure.
The technical scheme provided by the embodiment of the disclosure can have the following beneficial effects:
in the embodiment of the present disclosure, a plurality of reference points are first determined based on each feature point, and then at least one anchor point is generated centering on each reference point. Such that each feature point corresponds to a plurality of non-concentric anchor points.
Compared with the technical solution in the related art, each feature point corresponds to a plurality of anchor points with different centers, so that anchor points at different positions are responsible for object detection in their respective areas and the detection areas for which anchor points at different positions are responsible overlap less. Therefore, when the method provided by the embodiments of the present disclosure is applied to the detection of dense small objects, fewer objects are missed.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the invention and together with the description, serve to explain the principles of the invention.
FIG. 1 is a flow chart illustrating a method of object detection according to an exemplary embodiment.
FIG. 2 is a block diagram illustrating an apparatus for object detection in accordance with an exemplary embodiment.
Fig. 3 is a block diagram illustrating a structure of a terminal according to an exemplary embodiment.
FIG. 4 is a block diagram illustrating a configuration of a computer device according to an example embodiment.
FIG. 5 is a feature diagram of a target image shown in accordance with an exemplary embodiment.
FIG. 6 is a feature diagram illustrating inclusion of anchor points in accordance with an example embodiment.
FIG. 7 is a feature diagram illustrating inclusion of anchor points in accordance with an example embodiment.
Detailed Description
Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The embodiments described in the following exemplary embodiments do not represent all embodiments consistent with the present invention. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the invention, as detailed in the appended claims.
The disclosed embodiments provide a method for object detection, which may be implemented by a computer device. The computer device may be a mobile terminal such as a mobile phone, a tablet computer, a notebook computer, or a monitoring device, or a fixed terminal such as a desktop computer, or a server.
The method provided by the embodiments of the present disclosure can be applied to scenarios of object detection on images, for example, intelligent traffic systems, intelligent monitoring systems, military target detection, and medically navigated surgery. Moreover, the method is particularly suitable for detecting and recognizing images in which many small objects exist, such as face detection in large group photos, dense head detection in public places, and fish-school density estimation.
FIG. 1 is a flow chart illustrating a method of object detection according to an exemplary embodiment. As shown in FIG. 1, the method is used in a computer device and includes the following steps.
In step 101, a feature map of the target image is determined.
The target image is an image to be subjected to object detection.
In implementation, the target image needs to be acquired before its feature map is determined. The target image can be acquired in real time; this mode mainly applies to computer devices such as monitoring devices, which capture surveillance video in real time and continuously take image frames from the video as target images. The target image may also be acquired by extracting image material or video material previously stored on the computer device.
After the target image is acquired, it may be input into a neural network model to generate a feature map. The neural network model in the present embodiment may be a CNN (Convolutional Neural Network) model or a VGG (Visual Geometry Group) model. Further, in order to reduce the amount of calculation, the target image may first be scaled, and the scaled target image may then be input into the neural network model for object detection.
The neural network model comprises multiple stages of convolutional layers. After the target image is input into the neural network model, the model performs convolution on the target image stage by stage, successively producing the feature map of each convolutional layer. One of the feature maps of the convolutional layers is then selected as the feature map of the target image.
In the case where image frames are continuously obtained from a surveillance video as target images, each newly acquired target image is input into the neural network model, so that the feature map corresponding to each target image is obtained.
Optionally, in order to make the object detection result more accurate, object detection may be performed using feature maps of the target image at different scales, so that feature maps at different scales are respectively responsible for detecting objects of different sizes. The corresponding processing may be as follows: determining a plurality of feature maps of the target image at different scales.
In implementation, the neural network model includes multiple convolutional layers; after the target image is input into the model, it is convolved layer by layer, and the feature map of each convolutional layer is obtained in turn. Feature maps from earlier convolutional layers have a larger scale and are better suited to detecting objects of smaller size; feature maps from later convolutional layers have a smaller scale and are better suited to detecting objects of larger size.
A plurality of feature maps at different scales are selected from the feature maps of the convolutional layers and determined as the feature maps of the target image, so that the feature maps at different scales are respectively responsible for detecting objects of different sizes, thereby improving the accuracy of object detection.
The specific operation may be as follows: first, the target image is input into a VGG16 neural network model, and then the three feature maps conv3_3, conv4_3, and conv5_3 are extracted using the single shot multibox detector (SSD) framework as the feature maps of the target image, so as to improve the accuracy of object detection.
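As a minimal sketch of this step, the following Python snippet taps the three named layers of a standard torchvision VGG16 (torchvision >= 0.13 assumed); the layer indices follow torchvision's VGG16 layout, and the 300x300 input size is an SSD-style assumption, not fixed by the patent:

# A hedged sketch of multi-scale feature extraction with VGG16.
import torch
from torchvision.models import vgg16

backbone = vgg16(weights=None).features.eval()
TAPS = {15: "conv3_3", 22: "conv4_3", 29: "conv5_3"}  # ReLU outputs of the three layers

def multi_scale_features(image: torch.Tensor) -> dict:
    # Run the image through the backbone and collect feature maps at three scales.
    feats, x = {}, image
    for idx, layer in enumerate(backbone):
        x = layer(x)
        if idx in TAPS:
            feats[TAPS[idx]] = x
    return feats

target_image = torch.randn(1, 3, 300, 300)  # SSD-style input, illustrative
with torch.no_grad():
    for name, fmap in multi_scale_features(target_image).items():
        print(name, tuple(fmap.shape))  # earlier layers yield larger feature maps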
In step 102, a plurality of feature points in a feature map are determined.
In step 103, a plurality of reference points are determined corresponding to each feature point, and at least one anchor point is determined centering on each reference point.
Anchor points may also be referred to as preselection boxes, anchors, and the like.
In implementation, each feature point may correspond to n reference points and each reference point to m anchor points. If the number of feature points is p, then p × n × m anchor points are determined in the feature map, and these anchor points divide the feature map into p × n × m feature map regions.
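For example, with purely illustrative numbers (none of these values are fixed by the patent), the anchor count described above works out as follows:

# Hypothetical sizes: a 38x38 grid of feature points (p), four reference
# points per feature point (n), one anchor per reference point (m).
p, n, m = 38 * 38, 4, 1
print(p * n * m)  # 5776 anchor points, i.e. 5776 feature map regions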
In the embodiment of the present disclosure, a plurality of reference points are first determined based on each feature point, and then at least one anchor point is generated centering on each reference point. Such that each feature point corresponds to a plurality of non-concentric anchor points.
Compared with the technical solution in the related art, each feature point corresponds to a plurality of anchor points with different centers, so that anchor points at different positions are responsible for object detection in their respective areas and the detection areas for which anchor points at different positions are responsible overlap less. Therefore, when the method provided by the embodiments of the present disclosure is applied to the detection of dense small objects, fewer objects are missed.
Optionally, initial anchor points may be generated in advance, and the anchor points may then be generated by dividing the initial anchor points. The corresponding processing may be as follows: determining at least one initial anchor point corresponding to each feature point; and determining a plurality of reference points in each initial anchor point, and respectively determining at least one anchor point centered on each reference point.
In implementation, for each feature point, at least one initial anchor point center point is determined first, and then at least one initial anchor point is generated centered on each such center point. When generating an initial anchor point, its scale information and proportion information also need to be designed, where the scale information represents the area of the initial anchor point and the proportion information represents its aspect ratio (the size of the initial anchor point in the horizontal direction may be taken as the length, and its size in the vertical direction as the width). A plurality of initial anchor points may be generated based on the initial anchor point center points and the scale and proportion information of the initial anchor points. For example, the center of the feature point may be determined as the center point of the initial anchor point, with the area of the initial anchor point set to 1 and the aspect ratio set to 1:1, as shown in Fig. 6.
After the initial anchor point is generated, a plurality of reference points need to be selected within it. When selecting the reference points, the coordinates of each reference point can be determined by taking one of the four vertices of the initial anchor point, or its center point, as the origin, with the horizontal direction as the x-axis and the vertical direction as the y-axis.
After the reference points are determined, the area and aspect ratio of the anchor points are designed (the size of an anchor point in the horizontal direction may be taken as the length, and its size in the vertical direction as the width), and at least one anchor point is determined centered on each reference point.
Optionally, each initial anchor point may be uniformly divided into several anchor points, and the corresponding processing procedure may be as follows: determining a plurality of uniformly distributed reference points in each initial anchor point, dividing each initial anchor point into a plurality of anchor points respectively based on the reference points in each initial anchor point, and taking the central point of each anchor point obtained by division as one reference point.
In implementation, after an initial anchor point is generated, several reference points are uniformly determined within it, and one anchor point is then determined centered on each reference point. Assuming that the number of reference points determined in one initial anchor point is k, the initial anchor point is divided into k anchor points of identical shape, each with an area equal to 1/k of the area of the initial anchor point.
For example, as shown in Fig. 6, the center of each feature point is determined as an initial anchor point center point, and an initial anchor point is then generated centered on each such point. The initial anchor point has an area of 1 and an aspect ratio of 1:1; that is, each feature point corresponds to one initial anchor point, which is a square frame with an area of 1. Within the initial anchor point, four reference points are uniformly selected. Taking the upper-left corner of the initial anchor point as the origin, the horizontal direction as the x-axis (positive to the right) and the vertical direction as the y-axis (positive downward), the coordinates of the four reference points are (0.25, 0.25), (0.25, 0.75), (0.75, 0.25), and (0.75, 0.75). Taking these four reference points as centers, the initial anchor point is divided into 4 square anchor points of equal size, each with an area of 0.25.
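A minimal sketch of this division, assuming the unit coordinates of the example above (function and variable names are illustrative, not from the patent):

def split_initial_anchor(cx, cy, size=1.0, grid=2):
    # Split a square initial anchor centered at (cx, cy) into grid*grid
    # equal sub-anchors; each sub-anchor's center is one reference point.
    # Returns (ref_x, ref_y, w, h) tuples; each area is size**2 / grid**2.
    sub = size / grid
    left, top = cx - size / 2, cy - size / 2
    anchors = []
    for i in range(grid):       # rows (y direction, positive downward)
        for j in range(grid):   # columns (x direction, positive right)
            ref_x = left + (j + 0.5) * sub
            ref_y = top + (i + 0.5) * sub
            anchors.append((ref_x, ref_y, sub, sub))
    return anchors

# A feature point at (0.5, 0.5) with a unit initial anchor yields the
# reference points (0.25, 0.25), (0.75, 0.25), (0.25, 0.75), (0.75, 0.75).
for anchor in split_initial_anchor(0.5, 0.5):
    print(anchor)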
Alternatively, a plurality of reference points may be determined directly, and at least one anchor point may then be determined centered on each reference point. The corresponding processing may be as follows: for each feature point, determining a plurality of reference points based on preset position information of the reference points relative to the feature point, and respectively determining at least one anchor point centered on each reference point.
In implementation, the position information of the reference points relative to the feature points may be preset, and the plurality of reference points may then be determined based on this position information.
The coordinates of the reference points may be determined with the center of each feature point as the origin, the horizontal direction as the x-axis (positive to the right), and the vertical direction as the y-axis (positive downward). For example, as shown in Fig. 7, if the coordinates of the reference points are determined to be (0.25, -0.25), (-0.25, 0.25), (-0.25, -0.25), and (0.25, 0.25), then the reference points are located around their corresponding feature point, each at a distance of 0.25 from the feature point in the horizontal direction and 0.25 in the vertical direction.
After the reference points are determined, the area information and proportion information of the anchor points are designed. If the area of an anchor point is preset to 0.25 and the aspect ratio to 1:1, then each reference point corresponds to one anchor point, as shown in Fig. 7.
A plurality of areas and aspect ratios can also be preset to increase the number of anchor points. For example, if the anchor areas are designed to be 1 and 2 and the aspect ratios to be 1:2 and 2:1, then each reference point corresponds to four anchor points: one with an area of 1 and an aspect ratio of 1:2, one with an area of 1 and an aspect ratio of 2:1, one with an area of 2 and an aspect ratio of 1:2, and one with an area of 2 and an aspect ratio of 2:1.
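This variant can be sketched as follows (a hedged illustration: the offsets, areas, and ratios mirror the examples above, and the function name is hypothetical):

import itertools, math

# Four preset reference-point offsets relative to each feature point,
# matching the Fig. 7 example above.
OFFSETS = [(0.25, -0.25), (-0.25, 0.25), (-0.25, -0.25), (0.25, 0.25)]

def anchors_for_feature_point(fx, fy, areas=(1.0, 2.0), ratios=(0.5, 2.0)):
    # For area a and aspect ratio r = length/width: length = sqrt(a*r),
    # width = sqrt(a/r). Returns (cx, cy, length, width) per anchor.
    boxes = []
    for (dx, dy), a, r in itertools.product(OFFSETS, areas, ratios):
        boxes.append((fx + dx, fy + dy, math.sqrt(a * r), math.sqrt(a / r)))
    return boxes

# 4 reference points x 2 areas x 2 ratios = 16 anchors per feature point.
print(len(anchors_for_feature_point(0.0, 0.0)))  # 16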
In step 104, object detection is performed on the feature map based on the determined anchor points, and position information and detection object types corresponding to the detection objects included in the target image are obtained.
In practice, the determined anchor points divide the feature map into a plurality of different feature map regions, and the number of feature map regions equals the number of determined anchor points.
The feature map regions contained in the anchor points are detected in sequence, and a detection result is obtained for each region. Each detection result includes the position information and detection object type corresponding to the detection objects contained in that feature map region. All detection results are then integrated to finally obtain the position information and detection object type corresponding to each detection object in the target image.
Optionally, the determined feature map region contained in each anchor point may be detected using different detection models, and the corresponding processing may be as follows: inputting the determined feature map region contained in each anchor point into the detection models corresponding to different detection object types to obtain the detection result of each anchor point corresponding to different detection models; and determining the position information and the detection object type corresponding to each detection object in the target image based on the detection result of each anchor point corresponding to different detection models.
Here, different types of detection models are responsible for detecting different types of objects; the detection models may be classifiers.
In implementation, the feature map regions determined by all the anchor points are input in sequence into the different types of detection models. Each detection model examines the feature map region contained in each anchor point and produces one detection result per region. The detection result contains the position information of any detection object belonging to the object type corresponding to that model; if the region contains no such object, the position information is null. Each type of detection model then performs de-duplication on the position information across the detection results of all feature map regions.
Finally, the position information and detection object type corresponding to each detection object in the target image are obtained from the detection results of all the detection models.
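An illustrative outline of this per-class detection and de-duplication (a sketch under assumptions: `classifiers` maps a detection object type to a scoring function, boxes are (x1, y1, x2, y2), and greedy IoU-based suppression stands in for the unspecified de-duplication step):

def box_iou(a, b):
    # Intersection-over-union of two (x1, y1, x2, y2) boxes.
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0

def dedupe(candidates, iou_thresh=0.5):
    # Greedy de-duplication: keep the highest-scoring box, drop overlaps.
    kept = []
    for box, score in sorted(candidates, key=lambda c: -c[1]):
        if all(box_iou(box, k[0]) < iou_thresh for k in kept):
            kept.append((box, score))
    return kept

def detect(anchors, feature_regions, classifiers, score_thresh=0.5):
    # feature_regions[i] is the feature map region inside anchors[i].
    results = []
    for obj_type, score_fn in classifiers.items():
        candidates = []
        for box, region in zip(anchors, feature_regions):
            score = score_fn(region)      # confidence for this object type
            if score >= score_thresh:     # below threshold -> null result
                candidates.append((box, score))
        for box, score in dedupe(candidates):
            results.append({"type": obj_type, "box": box, "score": score})
    return results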
Optionally, after the position information and detection object type corresponding to each detection object included in the target image are determined, the position and type of each detection object may be marked in the target image. The corresponding processing may be as follows: displaying the target image, and adding a mark to each detection object in the target image based on the position information and detection object type corresponding to each detection object.
In implementation, in scenarios where the target image can be displayed, the detection objects may be marked in the displayed image. A mark can consist of a position mark and a type mark for each detection object; when the type mark is not needed, for example when marking faces in a large group photo, the mark may be a position mark only.
The position mark may take the form of a rectangular frame around the detection object in the target image. The type mark may add, to the rectangular frame of the position mark, text indicating the category to which the detection object belongs.
For example, when marking a criminal in an intelligent monitoring scenario, each image frame of the surveillance video undergoes the above object detection processing; when the criminal is detected in an image frame, the criminal is framed with a rectangle in that frame, and the processed frame is then displayed.
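A minimal marking sketch, assuming OpenCV and a `detections` list shaped like the output of the detection sketch above (both the box format and the dictionary keys are illustrative):

import cv2

def mark_detections(image, detections, with_type_mark=True):
    for det in detections:
        x1, y1, x2, y2 = (int(v) for v in det["box"])
        # Position mark: a rectangular frame around the detection object.
        cv2.rectangle(image, (x1, y1), (x2, y2), (0, 255, 0), 2)
        if with_type_mark:  # omit for position-only marks, e.g. faces in a group photo
            cv2.putText(image, det["type"], (x1, max(y1 - 5, 0)),
                        cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 1)
    return image

# e.g. cv2.imshow("target image", mark_detections(frame, results)); cv2.waitKey(1)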
FIG. 2 is a block diagram illustrating an apparatus for object detection in accordance with an exemplary embodiment. Referring to fig. 2, the apparatus includes a determination unit 201 and a detection unit 202.
A determining unit 201 configured to determine a feature map of a target image, determine a plurality of feature points in the feature map, respectively determine a plurality of reference points corresponding to each feature point, and respectively determine at least one anchor point with each reference point as a center;
a detecting unit 202, configured to perform object detection on the feature map based on the determined anchor points, so as to obtain position information and a detected object type corresponding to each detected object included in the target image.
Optionally, the determining unit 201 is configured to:
determining at least one initial anchor point corresponding to each feature point;
and determining a plurality of reference points in each initial anchor point, and respectively determining at least one anchor point by taking each reference point as a center.
Optionally, the determining unit 201 is configured to:
determining a plurality of uniformly distributed reference points in each initial anchor point, dividing each initial anchor point into a plurality of anchor points respectively based on the reference points in each initial anchor point, and taking the central point of each anchor point obtained by division as one reference point.
Optionally, the determining unit 201 is configured to:
and corresponding to each feature point, respectively determining a plurality of reference points based on the preset position information of the plurality of reference points relative to the feature point, and respectively determining at least one anchor point by taking each reference point as a center.
Optionally, the determining unit 201 is configured to:
and determining a plurality of feature maps with different scales of the target image.
Optionally, the apparatus further comprises:
a marking unit 203 configured to display the target image in which a mark is added to each detection object based on the position information and the detection object type corresponding to each detection object.
Optionally, the detecting unit 202 is configured to:
inputting the determined feature map region contained in each anchor point into the detection models corresponding to different detection object types to obtain the detection result of each anchor point corresponding to different detection models;
and determining the position information and the detection object type corresponding to each detection object in the target image based on the detection result of each anchor point corresponding to different detection models.
With regard to the apparatus in the above-described embodiment, the specific manner in which each module performs the operation has been described in detail in the embodiment related to the method, and will not be elaborated here.
Fig. 3 is a block diagram illustrating a structure of a terminal according to an exemplary embodiment. The terminal 300 may be a portable mobile terminal such as a smart phone or a tablet computer. The terminal 300 may also be referred to by other names such as user equipment or portable terminal.
Generally, the terminal 300 includes: a processor 301 and a memory 302.
The processor 301 may include one or more processing cores, such as a 4-core processor, a 9-core processor, and so on. The processor 301 may be implemented in at least one hardware form of a DSP (Digital Signal Processing), an FPGA (Field-Programmable Gate Array), and a PLA (Programmable Logic Array). The processor 301 may also include a main processor and a coprocessor, where the main processor is a processor for Processing data in an awake state, and is also called a Central Processing Unit (CPU); a coprocessor is a low power processor for processing data in a standby state. In some embodiments, the processor 301 may be integrated with a GPU (Graphics Processing Unit), which is responsible for rendering and drawing the content required to be displayed on the display screen. In some embodiments, the processor 301 may further include an AI (Artificial Intelligence) processor for processing computing operations related to machine learning.
Memory 302 may include one or more computer-readable storage media, which may be tangible and non-transitory. Memory 302 may also include high speed random access memory, as well as non-volatile memory, such as one or more magnetic disk storage devices, flash memory storage devices. In some embodiments, a non-transitory computer readable storage medium in memory 302 is used to store at least one instruction for execution by processor 301 to implement the methods of object detection provided herein.
In some embodiments, the terminal 300 may further include: a peripheral interface 303 and at least one peripheral. Specifically, the peripheral device includes: at least one of radio frequency circuitry 304, touch display screen 305, camera 306, audio circuitry 307, positioning components 308, and power supply 309.
The peripheral interface 303 may be used to connect at least one peripheral related to I/O (Input/Output) to the processor 301 and the memory 302. In some embodiments, processor 301, memory 302, and peripheral interface 303 are integrated on the same chip or circuit board; in some other embodiments, any one or two of the processor 301, the memory 302 and the peripheral interface 303 may be implemented on a separate chip or circuit board, which is not limited by the embodiment.
The radio frequency circuit 304 is used for receiving and transmitting RF (Radio Frequency) signals, also called electromagnetic signals. The radio frequency circuitry 304 communicates with communication networks and other communication devices via electromagnetic signals. The RF circuit 304 converts an electrical signal into an electromagnetic signal to transmit, or converts a received electromagnetic signal into an electrical signal. Optionally, the radio frequency circuit 304 comprises: an antenna system, an RF transceiver, one or more amplifiers, a tuner, an oscillator, a digital signal processor, a codec chipset, a subscriber identity module card, and so forth. The radio frequency circuitry 304 may communicate with other terminals via at least one wireless communication protocol. The wireless communication protocols include, but are not limited to: the World Wide Web, metropolitan area networks, intranets, the various generations of mobile communication networks (2G, 3G, 4G, and 5G), wireless local area networks, and/or WiFi (Wireless Fidelity) networks. In some embodiments, the RF circuit 304 may further include NFC (Near Field Communication) related circuits, which are not limited in this application.
The touch display screen 305 is used to display a UI (User Interface). The UI may include graphics, text, icons, video, and any combination thereof. Touch display screen 305 also has the ability to capture touch signals on or over the surface of touch display screen 305. The touch signal may be input to the processor 301 as a control signal for processing. The touch screen display 305 is used to provide virtual buttons and/or a virtual keyboard, also referred to as soft buttons and/or a soft keyboard. In some embodiments, the touch display screen 305 may be one, providing the front panel of the terminal 300; in other embodiments, the touch display screen 305 may be at least two, respectively disposed on different surfaces of the terminal 300 or in a folded design; in still other embodiments, the touch display 305 may be a flexible display disposed on a curved surface or on a folded surface of the terminal 300. Even more, the touch screen display 305 may be arranged in a non-rectangular irregular pattern, i.e., a shaped screen. The touch Display screen 305 may be made of LCD (Liquid Crystal Display), OLED (Organic Light-Emitting Diode), and the like.
The camera assembly 306 is used to capture images or video. Optionally, camera assembly 306 includes a front camera and a rear camera. Generally, a front camera is used for realizing video call or self-shooting, and a rear camera is used for realizing shooting of pictures or videos. In some embodiments, the number of the rear cameras is at least two, and each of the rear cameras is any one of a main camera, a depth-of-field camera and a wide-angle camera, so that the main camera and the depth-of-field camera are fused to realize a background blurring function, and the main camera and the wide-angle camera are fused to realize a panoramic shooting function and a VR (Virtual Reality) shooting function. In some embodiments, camera assembly 306 may also include a flash. The flash lamp can be a monochrome temperature flash lamp or a bicolor temperature flash lamp. The double-color-temperature flash lamp is a combination of a warm-light flash lamp and a cold-light flash lamp, and can be used for light compensation at different color temperatures.
Audio circuit 307 is used to provide an audio interface between the user and terminal 300. Audio circuitry 307 may include a microphone and a speaker. The microphone is used for collecting sound waves of a user and the environment, converting the sound waves into electric signals, and inputting the electric signals to the processor 301 for processing or inputting the electric signals to the radio frequency circuit 304 to realize voice communication. The microphones may be provided in plural numbers, respectively, at different portions of the terminal 300 for the purpose of stereo sound collection or noise reduction. The microphone may also be an array microphone or an omni-directional pick-up microphone. The speaker is used to convert electrical signals from the processor 301 or the radio frequency circuitry 304 into sound waves. The loudspeaker can be a traditional film loudspeaker or a piezoelectric ceramic loudspeaker. When the speaker is a piezoelectric ceramic speaker, the speaker can be used for purposes such as converting an electric signal into a sound wave audible to a human being, or converting an electric signal into a sound wave inaudible to a human being to measure a distance. In some embodiments, audio circuitry 307 may also include a headphone jack.
The positioning component 308 is used to locate the current geographic location of the terminal 300 to implement navigation or LBS (Location Based Service). The positioning component 308 may be based on the United States' Global Positioning System (GPS), China's BeiDou system, or Europe's Galileo system.
The power supply 309 is used to supply power to the various components in the terminal 300. The power source 309 may be alternating current, direct current, disposable batteries, or rechargeable batteries. When the power source 309 includes a rechargeable battery, the rechargeable battery may be a wired rechargeable battery or a wireless rechargeable battery. The wired rechargeable battery is a battery charged through a wired line, and the wireless rechargeable battery is a battery charged through a wireless coil. The rechargeable battery may also be used to support fast charge technology.
In some embodiments, the terminal 300 also includes one or more sensors 310. The one or more sensors 310 include, but are not limited to: acceleration sensor 311, gyro sensor 312, pressure sensor 313, fingerprint sensor 314, optical sensor 315, and proximity sensor 316.
The acceleration sensor 311 may detect the magnitude of acceleration in three coordinate axes of a coordinate system established with the terminal 300. For example, the acceleration sensor 311 may be used to detect components of the gravitational acceleration in three coordinate axes. The processor 301 may control the touch display screen 305 to display the user interface in a landscape view or a portrait view according to the gravitational acceleration signal collected by the acceleration sensor 311. The acceleration sensor 311 may also be used for acquisition of motion data of a game or a user.
The gyro sensor 312 may detect a body direction and a rotation angle of the terminal 300, and the gyro sensor 312 may cooperate with the acceleration sensor 311 to acquire a 3D motion of the user on the terminal 300. The processor 301 may implement the following functions according to the data collected by the gyro sensor 312: motion sensing (such as changing the UI according to a user's tilting operation), image stabilization at the time of photographing, game control, and inertial navigation.
The pressure sensor 313 may be disposed on a side bezel of the terminal 300 and/or an underlying layer of the touch display screen 305. When the pressure sensor 313 is disposed at the side frame of the terminal 300, a user's grip signal of the terminal 300 can be detected, and left-right hand recognition or shortcut operation can be performed according to the grip signal. When the pressure sensor 313 is disposed at the lower layer of the touch display screen 305, the operability control on the UI interface can be controlled according to the pressure operation of the user on the touch display screen 305. The operability control comprises at least one of a button control, a scroll bar control, an icon control and a menu control.
The fingerprint sensor 314 is used for collecting a fingerprint of a user to identify the identity of the user according to the collected fingerprint. Upon identifying that the user's identity is a trusted identity, processor 301 authorizes the user to perform relevant sensitive operations including unlocking the screen, viewing encrypted information, downloading software, paying, and changing settings, etc. The fingerprint sensor 314 may be disposed on the front, back, or side of the terminal 300. When a physical button or a vendor Logo is provided on the terminal 300, the fingerprint sensor 314 may be integrated with the physical button or the vendor Logo.
The optical sensor 315 is used to collect the ambient light intensity. In one embodiment, the processor 301 may control the display brightness of the touch screen display 305 based on the ambient light intensity collected by the optical sensor 315. Specifically, when the ambient light intensity is high, the display brightness of the touch display screen 305 is increased; when the ambient light intensity is low, the display brightness of the touch display screen 305 is turned down. In another embodiment, the processor 301 may also dynamically adjust the shooting parameters of the camera head assembly 306 according to the ambient light intensity collected by the optical sensor 315.
A proximity sensor 316, also known as a distance sensor, is typically provided on the front face of the terminal 300. The proximity sensor 316 is used to collect the distance between the user and the front surface of the terminal 300. In one embodiment, when the proximity sensor 316 detects that the distance between the user and the front surface of the terminal 300 gradually decreases, the processor 301 controls the touch display screen 305 to switch from the bright-screen state to the screen-off state; when the proximity sensor 316 detects that the distance between the user and the front surface of the terminal 300 gradually increases, the processor 301 controls the touch display screen 305 to switch from the screen-off state to the bright-screen state.
Those skilled in the art will appreciate that the configuration shown in fig. 3 is not intended to be limiting of terminal 300 and may include more or fewer components than those shown, or some components may be combined, or a different arrangement of components may be used.
Fig. 4 is a schematic structural diagram illustrating a computer device according to an exemplary embodiment, where the computer device may be a server in the above embodiments. The computer device 400 may have a relatively large difference due to different configurations or performances, and may include one or more processors (CPUs) 401 and one or more memories 402, where the memory 402 stores at least one instruction, and the at least one instruction is loaded and executed by the processor 401 to implement the method for object detection.
In an embodiment of the present disclosure, a non-transitory computer-readable storage medium is further provided, and when executed by a processor of a computer device, instructions in the storage medium enable the computer device to execute the method for detecting an object described above.
In an embodiment of the present disclosure, an application program is further provided, which includes one or more instructions that can be executed by a processor of a server to perform the above object detection method.
Other embodiments of the invention will be apparent to those skilled in the art from consideration of the specification and practice of the invention disclosed herein. This application is intended to cover any variations, uses, or adaptations of the invention following, in general, the principles of the invention and including such departures from the present disclosure as come within known or customary practice within the art to which the invention pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the invention being indicated by the following claims.
It will be understood that the invention is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the invention is limited only by the appended claims.

Claims (12)

1. A method of object detection, comprising:
determining a feature map of a target image;
determining a plurality of feature points in the feature map;
corresponding to each feature point, respectively determining a plurality of reference points, and respectively determining at least one anchor point by taking each reference point as a center, wherein the method comprises the following steps: determining at least one initial anchor point corresponding to each feature point; determining a plurality of uniformly distributed reference points in each initial anchor point, dividing each initial anchor point into a plurality of anchor points respectively based on the reference points in each initial anchor point, wherein the central point of each anchor point obtained by division is a reference point, and each anchor point obtained by division is positioned in the corresponding initial anchor point;
and performing object detection on the feature map based on the determined anchor points to obtain position information and detection object types corresponding to the detection objects in the target image.
2. The method of claim 1, wherein a plurality of reference points are respectively determined corresponding to each feature point, and at least one anchor point is respectively determined centering on each reference point, further comprising:
and corresponding to each feature point, respectively determining a plurality of reference points based on the preset position information of the plurality of reference points relative to the feature point, and respectively determining at least one anchor point by taking each reference point as a center.
3. The method of claim 1, wherein determining the feature map of the target image comprises:
and determining a plurality of feature maps with different scales of the target image.
4. The method according to claim 1, wherein after the object detection is performed on the feature map based on the determined anchor points to obtain the position information and the detected object type corresponding to each detected object included in the target image, the method further comprises:
and displaying the target image, and adding a mark to each detection object in the target image based on the position information and the detection object type corresponding to each detection object.
5. The method according to claim 1, wherein the performing object detection on the feature map based on the determined anchor points to obtain position information and a detected object type corresponding to each detected object included in the target image includes:
inputting the determined feature map region contained in each anchor point into the detection models corresponding to different detection object types to obtain the detection result of each anchor point corresponding to different detection models;
and determining the position information and the detection object type corresponding to each detection object in the target image based on the detection result of each anchor point corresponding to different detection models.
6. An apparatus for object detection, comprising:
a determining unit configured to determine a feature map of a target image, determine a plurality of feature points in the feature map, respectively determine a plurality of reference points corresponding to each feature point, respectively determine at least one anchor point with each reference point as a center, wherein the determining unit respectively determines a plurality of reference points corresponding to each feature point, and respectively determines at least one anchor point with each reference point as a center includes: determining at least one initial anchor point corresponding to each feature point; determining a plurality of uniformly distributed reference points in each initial anchor point, dividing each initial anchor point into a plurality of anchor points respectively based on the reference points in each initial anchor point, wherein the central point of each anchor point obtained by division is a reference point, and each anchor point obtained by division is positioned in the corresponding initial anchor point;
and the detection unit is configured to perform object detection on the feature map based on the determined anchor points to obtain position information and detection object types corresponding to the detection objects included in the target image.
7. The apparatus of claim 6, wherein the determining unit is further configured to:
and corresponding to each feature point, respectively determining a plurality of reference points based on the preset position information of the plurality of reference points relative to the feature point, and respectively determining at least one anchor point by taking each reference point as a center.
8. The apparatus of claim 6, wherein the determining unit is configured to:
and determining a plurality of feature maps with different scales of the target image.
9. The apparatus of claim 6, further comprising:
a marking unit configured to display the target image in which a mark is added to each detection object based on the position information and the detection object type corresponding to each detection object.
10. The apparatus of claim 6, wherein the detection unit is configured to:
inputting the determined feature map region contained in each anchor point into the detection models corresponding to different detection object types to obtain the detection result of each anchor point corresponding to different detection models;
and determining the position information and the detection object type corresponding to each detection object in the target image based on the detection result of each anchor point corresponding to different detection models.
11. A computer device, comprising:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to:
performing the method of any one of claims 1-5.
12. A non-transitory computer readable storage medium, wherein instructions in the storage medium, when executed by a processor of a computer device, enable the computer device to perform the method of any of claims 1-5.
CN201910137428.3A 2019-02-25 2019-02-25 Object detection method and device, computer equipment and storage medium Active CN109886208B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910137428.3A CN109886208B (en) 2019-02-25 2019-02-25 Object detection method and device, computer equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910137428.3A CN109886208B (en) 2019-02-25 2019-02-25 Object detection method and device, computer equipment and storage medium

Publications (2)

Publication Number Publication Date
CN109886208A CN109886208A (en) 2019-06-14
CN109886208B (en) 2020-12-18

Family

ID=66929163

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910137428.3A Active CN109886208B (en) 2019-02-25 2019-02-25 Object detection method and device, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN109886208B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115035407A (en) * 2019-11-06 2022-09-09 支付宝(杭州)信息技术有限公司 Method, device and equipment for identifying object in image
CN111476306B (en) * 2020-04-10 2023-07-28 腾讯科技(深圳)有限公司 Object detection method, device, equipment and storage medium based on artificial intelligence
CN112199987A (en) * 2020-08-26 2021-01-08 北京贝思科技术有限公司 Multi-algorithm combined configuration strategy method in single area, image processing device and electronic equipment
CN113076955A (en) * 2021-04-14 2021-07-06 上海云从企业发展有限公司 Target detection method, system, computer equipment and machine readable medium
CN114596706B (en) * 2022-03-15 2024-05-03 阿波罗智联(北京)科技有限公司 Detection method and device of road side perception system, electronic equipment and road side equipment

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106529527A (en) * 2016-09-23 2017-03-22 北京市商汤科技开发有限公司 Object detection method and device, data processing deice, and electronic equipment
CN107316001A (en) * 2017-05-31 2017-11-03 天津大学 Small and intensive method for traffic sign detection in a kind of automatic Pilot scene
CN108304808A (en) * 2018-02-06 2018-07-20 广东顺德西安交通大学研究院 A kind of monitor video method for checking object based on space time information Yu depth network

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108681718B (en) * 2018-05-20 2021-08-06 北京工业大学 Unmanned aerial vehicle low-altitude target accurate detection and identification method

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106529527A (en) * 2016-09-23 2017-03-22 北京市商汤科技开发有限公司 Object detection method and device, data processing deice, and electronic equipment
CN107316001A (en) * 2017-05-31 2017-11-03 天津大学 Small and intensive method for traffic sign detection in a kind of automatic Pilot scene
CN108304808A (en) * 2018-02-06 2018-07-20 广东顺德西安交通大学研究院 A kind of monitor video method for checking object based on space time information Yu depth network

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
"SSD: Single shot multibox detector";Liu W 等;《European Conference on Computer Vision》;20160930;第21-37页 *
"基于深度卷积神经网络的汽车驾驶场景目标检测算法研究";陈康;《中国优秀硕士学位论文全文数据库 信息科技辑》;20180615(第06期);全文 *
"目标检测网络SSD的区域候选框的设置问题研究";翁昕;《中国优秀硕士学位论文全文数据库 信息科技辑》;20180415(第04期);全文 *

Also Published As

Publication number Publication date
CN109886208A (en) 2019-06-14


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant