CN110310301A - Method and device for detecting a target image - Google Patents
- Publication number
- CN110310301A (application number CN201810258574.7A)
- Authority
- CN
- China
- Prior art keywords
- candidate frame
- target
- configuration information
- image
- picture
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/20—Analysis of motion
- G06T7/246—Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/20—Analysis of motion
- G06T7/246—Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
- G06T7/251—Analysis of motion using feature-based methods, e.g. the tracking of corners or segments involving models
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/60—Analysis of geometric attributes
- G06T7/62—Analysis of geometric attributes of area, perimeter, diameter or volume
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10016—Video; Image sequence
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20021—Dividing image into blocks, subimages or windows
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30196—Human being; Person
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30232—Surveillance
Abstract
The present application discloses a method and device for detecting a target image, and belongs to the communications field. The method includes: obtaining a foreground moving image corresponding to a picture to be detected, and obtaining a first feature picture produced by performing convolution operations on the picture to be detected; detecting the target image in the first feature picture to obtain a first candidate frame configuration information set, the set including the configuration information of each candidate frame in at least one candidate frame; filtering, from the first candidate frame configuration information set according to the foreground moving image, the configuration information of any candidate frame whose included target image is only a partial target image, to obtain a second candidate frame configuration information set; and adding, according to the second candidate frame configuration information set, detection frames to the picture to be detected, each detection frame including at least one target image in the picture to be detected. The present application can improve detection accuracy.
Description
Technical field
The present application relates to the communications field, and in particular to a method and device for detecting a target image.
Background technique
With the construction of safe cities, a large number of surveillance cameras are currently deployed, and these cameras are used to capture surveillance video. For the video captured by each camera, the target image in every frame needs to be detected; the target image may be a human body image and/or a vehicle image that is in motion in the surveillance video. After the detection operation is performed on a frame, the human body image or vehicle image in motion is framed in the picture with a rectangular frame, so that a person or a vehicle can subsequently be tracked.
At present, the target image in a picture can be detected as follows: the picture is input to a convolutional neural network (Convolutional Neural Network, CNN), and after multiple convolution operations in the CNN, a first feature picture and a second feature picture are obtained, where the number of convolution operations that produced the first feature picture is smaller than the number that produced the second feature picture. The first feature picture is input to a region proposal network (Region Proposal Network, RPN), and the RPN outputs the location information and a confidence score of each candidate frame in at least one candidate frame in the first feature picture. Each candidate frame encloses one target image in the first feature picture; the location information of a candidate frame includes the positions of a pair of diagonal corner points of the frame, and the confidence score of a candidate frame indicates the probability that the state of the enclosed target image is a motion state. According to the location information of each of the N candidate frames with the largest confidence scores and the second feature picture, a detection frame corresponding to each candidate frame, together with the type of the target image included in the detection frame, is added to the picture.
In the process of implementing the present application, the inventor found that the prior art has at least the following problem: among the N candidate frames with the largest confidence scores, some candidate frames enclose a target image that is not complete; for example, a candidate frame may enclose only part of a human body image or part of a vehicle image. The detection frame added to the picture according to such a candidate frame then also contains an incomplete target image, which reduces detection accuracy.
Summary of the invention
In order to improve detection accuracy, the embodiments of the present application provide a method and device for detecting a target image. The technical solution is as follows:
In a first aspect, the present application provides a method for detecting a target image. The method obtains a foreground moving image corresponding to a picture to be detected, and obtains a first feature picture produced by performing convolution operations on the picture to be detected, where the foreground moving image includes the target image that is in motion in the picture to be detected and the background image other than the target image; detects the target image in the first feature picture to obtain a first candidate frame configuration information set, where the set includes the configuration information of each candidate frame in at least one candidate frame, each candidate frame in the first feature picture includes at least one target image, and the target image in the first feature picture is the same as the target image detected in the picture to be detected; filters, from the first candidate frame configuration information set according to the foreground moving image, the configuration information of any candidate frame whose included target image is only a partial target image, to obtain a second candidate frame configuration information set; and adds, according to the second candidate frame configuration information set, detection frames to the picture to be detected, each detection frame including at least one target image in the picture to be detected. Since the configuration information of candidate frames that include an incomplete target object is filtered out of the first candidate frame configuration information set to obtain the second candidate frame configuration information set, adding detection frames to the picture to be detected according to the second candidate frame configuration information set can improve detection accuracy.
In a possible implementation of the first aspect, mixture-of-Gaussians background modeling is performed on the picture to be detected to obtain the corresponding foreground moving image. The foreground moving image can then be used to filter, from the first candidate frame configuration information set, the configuration information of candidate frames that include only a partial target object.
In a possible implementation of the first aspect, an integral image corresponding to the foreground moving image is computed from the foreground moving image, and the configuration information of candidate frames whose included target image is only a partial target image is filtered from the first candidate frame configuration information set according to the integral image. Because the integral image corresponding to the foreground moving image is obtained, the configuration information in the first candidate frame configuration information set can be filtered according to the integral image; the integral image increases the filtering speed and thereby improves detection efficiency.
In a possible implementation of the first aspect, according to the configuration information of a target candidate frame, the integral-image region corresponding to the target candidate frame is obtained in the integral image, where the configuration information of the target candidate frame is the configuration information of any one candidate frame in the first candidate frame configuration information set; the ratio between the target-image area inside the target candidate frame and the area of the target candidate frame is computed from the integral-image region; and when the ratio is smaller than a preset ratio threshold, the configuration information of the target candidate frame is filtered from the first candidate frame configuration information set. Because the integral-image region corresponding to the target candidate frame is obtained, the amount of computation needed to calculate the ratio from the integral-image region is reduced, which improves the calculation speed.
In a possible implementation of the first aspect, the integral values of the pixels located at the four vertex positions of the integral-image region are obtained; the target-image area inside the target candidate frame is calculated from these integral values; the area of the target candidate frame is calculated from the configuration information of the target candidate frame; and the ratio between the target-image area and the area of the target candidate frame is calculated. Since the target-image area is obtained from the integral values of the pixels at the four vertices, the required amount of computation is small, so the target-image area can be calculated quickly, improving calculation efficiency.
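As an illustration of the integral-image filtering described above, the following is a minimal NumPy sketch: the function names, the representation of a candidate frame as inclusive pixel coordinates (x1, y1, x2, y2), and the ratio threshold of 0.5 are assumptions for illustration, not part of the application.

```python
import numpy as np

def integral_image(mask):
    # Cumulative sums along both axes turn the binary foreground moving
    # image into an integral image (summed-area table).
    return mask.cumsum(axis=0).cumsum(axis=1)

def box_foreground_ratio(ii, x1, y1, x2, y2):
    """Foreground-pixel area inside the inclusive box [x1,x2] x [y1,y2],
    read from just the four corner values of the integral image, divided
    by the area of the box."""
    total = ii[y2, x2]
    if x1 > 0:
        total -= ii[y2, x1 - 1]
    if y1 > 0:
        total -= ii[y1 - 1, x2]
    if x1 > 0 and y1 > 0:
        total += ii[y1 - 1, x1 - 1]
    area = (x2 - x1 + 1) * (y2 - y1 + 1)
    return total / area

def filter_boxes(boxes, mask, ratio_threshold=0.5):
    # Keep only the candidate frames whose foreground coverage reaches
    # the preset ratio threshold (an assumed value here).
    ii = integral_image(mask)
    return [b for b in boxes
            if box_foreground_ratio(ii, *b) >= ratio_threshold]
```

Whatever the frame size, each ratio needs only the four corner reads plus one division, which is the constant-cost property the application relies on.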
In a possible implementation of the first aspect, according to the configuration information of a target candidate frame, the image region corresponding to the target candidate frame is obtained in the foreground moving image, where the configuration information of the target candidate frame is the configuration information of any one candidate frame in the first candidate frame configuration information set; the ratio between the target-image area inside the target candidate frame and the area of the target candidate frame is computed from the image region; and when the ratio is smaller than a preset ratio threshold, the configuration information of the target candidate frame is filtered from the first candidate frame configuration information set. Determining whether to filter out the target candidate frame directly from the image region simplifies the implementation logic of the solution.
In a possible implementation of the first aspect, the number of pixels in the image region that belong to the target image and the total number of pixels in the image region are counted, and the ratio between the two is calculated, yielding the ratio between the target-image area inside the target candidate frame and the area of the target candidate frame.
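The pixel-counting variant above can be sketched directly on the foreground moving image; this is a minimal NumPy illustration, and the function name and the inclusive-coordinate box convention are assumptions.

```python
import numpy as np

def coverage_ratio(mask, x1, y1, x2, y2):
    # Crop the candidate frame's image region from the foreground moving
    # image, count the pixels belonging to the target image (value 1),
    # and divide by the region's total pixel count.
    region = mask[y1:y2 + 1, x1:x2 + 1]
    return int(region.sum()) / region.size
```

Compared with the integral-image variant, this recounts every pixel of the region per frame, but it needs no precomputation and the logic is simpler.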
In a possible implementation of the first aspect, a second feature picture obtained by performing convolution operations on the picture to be detected is obtained, where the number of convolution operations performed to obtain the first feature picture is smaller than the number performed to obtain the second feature picture; and detection frames, together with the types of the target images in the detection frames, are added to the picture to be detected according to the second feature picture and the second candidate frame configuration information set. Because the configuration information of a large number of candidate frames has been filtered out of the second candidate frame configuration information set, adding detection frames to the picture to be detected according to that set reduces the amount of computation and thereby improves detection efficiency.
In a second aspect, the present application provides a device for detecting a target image, configured to perform the method of the first aspect or any possible implementation of the first aspect. Specifically, the device includes modules for performing that method.
In a third aspect, the present application provides a device for detecting a target image, the device including at least one processor and at least one memory. The at least one memory stores one or more programs configured to be executed by the at least one processor, and the one or more programs include instructions for performing the method of the first aspect or any possible implementation of the first aspect.
In a fourth aspect, the present application provides a device for detecting a target image, the device including a transceiver, a processor, and a memory, which may be connected by a bus system. The memory is configured to store programs, instructions, or code, and the processor is configured to execute the programs, instructions, or code in the memory to complete the method of the first aspect or any possible implementation of the first aspect.
In a fifth aspect, the present application provides a computer program product. The computer program product includes a computer program stored in a computer-readable storage medium, and the computer program is loaded by a processor to implement the method of the first aspect or any possible implementation of the first aspect.
In a sixth aspect, the present application provides a non-volatile computer-readable storage medium for storing a computer program. The computer program is loaded by a processor to execute the instructions of the method of the first aspect or any possible implementation of the first aspect.
In a seventh aspect, an embodiment of the present application provides a chip. The chip includes a programmable logic circuit and/or program instructions, and when the chip runs, it implements the method of the first aspect or any possible implementation of the first aspect.
Brief description of the drawings
Fig. 1 is a schematic diagram of a network architecture provided by an embodiment of the present application;
Fig. 2-1 is a flowchart of a method for detecting a target image provided by an embodiment of the present application;
Fig. 2-2 is a module diagram of an RPN device provided by an embodiment of the present application;
Fig. 2-3 is a schematic diagram of an RPN device adding sliding windows, provided by an embodiment of the present application;
Fig. 2-4 is a flowchart of a method for filtering configuration information provided by an embodiment of the present application;
Fig. 2-5 is a schematic diagram of an integral-image region provided by an embodiment of the present application;
Fig. 2-6 is a flowchart of another method for filtering configuration information provided by an embodiment of the present application;
Fig. 2-7 is a module diagram of a Fast Rcnn device provided by an embodiment of the present application;
Fig. 2-8 is a software system module diagram for detecting a target image provided by an embodiment of the present application;
Fig. 3-1 is a schematic structural diagram of a device for detecting a target image provided by an embodiment of the present application;
Fig. 3-2 is a schematic structural diagram of another device for detecting a target image provided by an embodiment of the present application;
Fig. 3-3 is a schematic structural diagram of another device for detecting a target image provided by an embodiment of the present application;
Fig. 4 is a schematic structural diagram of yet another device for detecting a target image provided by an embodiment of the present application.
Detailed description of embodiments
The embodiments of the present application are described in further detail below with reference to the accompanying drawings.
Referring to Fig. 1, an embodiment of the present application provides a network architecture, including a camera device and a server. A network connection is established between the camera device and the server, and the connection may be wireless or wired.
The camera device may be installed in places such as shopping malls and roads, and is used to capture pictures and send the captured pictures to the server. Optionally, the network architecture can be applied to scenarios such as video surveillance: in a video surveillance scenario, the camera device captures pictures frame by frame and may send the captured pictures to the server.
The pictures captured by the camera device include a foreground moving image that is in motion and a background image that remains stationary. The foreground moving image in motion may be a human body image and/or a vehicle image in motion, and the stationary background image may be a building image, a tree image, and/or a stationary vehicle image.
When the camera device captures pictures frame by frame, the target image in each picture can be detected; the target image may be one or more of the foreground moving images in motion in the picture. When the target image in a picture is detected, the target image is also framed in the picture with a detection frame, so that a target can subsequently be tracked. For example, when the target is a person or a vehicle, the person or the vehicle can be tracked across the pictures to which detection frames have been added.
Optionally, the process of detecting the target image in a picture may be performed by the camera device; that is, after capturing a frame, the camera device may detect the target image in that frame. To improve the detection efficiency of the camera device, it can be configured with ample computing resources, which may include at least one of a central processing unit (Central Processing Unit, CPU), a graphics processing unit (Graphics Processing Unit, GPU), and memory capacity.
Optionally, the detection process may instead be performed by the server rather than the camera device: after receiving a frame sent by the camera device, the server may detect the target image in that frame; alternatively, the server reads a frame from memory, which may be a picture captured by the camera device, and detects the target image in it. When receiving a picture sent by the camera device, the server may first store the picture in memory. The camera device may be a device such as a surveillance camera or a mobile phone with a camera.
Referring to Fig. 2-1, an embodiment of the present application provides a method for detecting a target image. The method can be applied to the network architecture provided by the embodiment shown in Fig. 1, and the execution body of the method may be the camera device or the server in that architecture. The method includes:
Step 201: Obtain the foreground moving image corresponding to a picture to be detected, and obtain a first feature picture produced by performing convolution operations on the picture to be detected. The foreground moving image includes the target image that is in motion in the picture to be detected and the background image other than the target image.
The picture to be detected may be any picture in the video captured by the camera device. When the execution body of this embodiment is the camera device, the camera device may take a captured frame as the picture to be detected. When the execution body is the server, the server may take a frame received from the camera device as the picture to be detected, or may read a frame from memory as the picture to be detected; when the server receives a picture sent by the camera device, it may store the picture in memory.
The foreground moving image corresponding to the picture to be detected can be obtained by performing mixture-of-Gaussians background modeling on the picture to be detected. In the embodiment of the present application, a Gaussian mixture model device and a fast region-based convolutional neural network (Fast Region-based Convolution Neural Network, Fast Rcnn) device are preset, and the Fast Rcnn device includes a CNN. In this step, the picture to be detected may be input separately to the Gaussian mixture model device and to the CNN in the Fast Rcnn device; the Gaussian mixture model device then performs mixture-of-Gaussians background modeling on the picture to be detected to obtain the corresponding foreground moving image, and the CNN performs convolution operations on the picture to be detected to obtain the corresponding first feature picture.
The foreground moving image corresponding to the picture to be detected is a black-and-white picture in which the value of each pixel is 1 or 0, and its size is equal to the size of the picture to be detected. Each pixel of the picture to be detected has a corresponding pixel in the foreground moving image. If a pixel in the picture to be detected belongs to the target image that is in motion, the value of the corresponding pixel in the foreground moving image is 1; if a pixel belongs to the background image that remains stationary, the value of the corresponding pixel is 0. In this embodiment, the target image may be a human body image and/or a vehicle image in the picture to be detected.
Optionally, the mixture-of-Gaussians background modeling of the picture to be detected can be divided into the following operations 2011 to 2014:
2011: Create a blank foreground moving image whose size is equal to the size of the picture to be detected.
2012: Read the value of a pixel of the picture to be detected, the value including the R-channel, G-channel, and B-channel components, and then calculate, by the following formula (1), the probability that the pixel belongs to the target image that is in motion:

P(x_j) = Σ_{i=1}^{K} ŵ_{i,t} · η(x_j; μ̂_{i,t}, Σ̂_{i,t})    (1)

In formula (1), P(x_j) is the probability for the j-th pixel of the picture to be detected, that is, the probability that the j-th pixel belongs to the target image that is in motion; x_j is the value of the j-th pixel, x_j = [x_jR, x_jG, x_jB], where x_jR, x_jG, and x_jB are the R-channel, G-channel, and B-channel components; t is the time corresponding to the picture to be detected (in an implementation, the frame number of the picture can be used as the time t); ŵ_{i,t} denotes the estimated weight coefficient of the i-th Gaussian distribution in the Gaussian mixture model device at time t; μ̂_{i,t} and Σ̂_{i,t} respectively denote the mean vector and covariance matrix of the i-th Gaussian distribution at time t (it is assumed here that the red, green, and blue components of a pixel are mutually independent); and η denotes the Gaussian probability density function.
K is a preset value. Before the probability of the pixel is calculated with formula (1), the preset Gaussian mixture model device has, from the probabilities already calculated for the j-th pixel in the pictures at times 0, 1, …, t-1, obtained the estimated weight coefficients of the K Gaussian distributions at time t, the K mean vectors, and the K covariance matrices, which are respectively ŵ_{1,t}, ŵ_{2,t}, …, ŵ_{K,t}; μ̂_{1,t}, μ̂_{2,t}, …, μ̂_{K,t}; and Σ̂_{1,t}, Σ̂_{2,t}, …, Σ̂_{K,t}.
2013: If the calculated probability is greater than a preset probability threshold, determine that the pixel belongs to the target image that is in motion in the picture to be detected, and, according to the position of the pixel in the picture to be detected, fill the corresponding pixel of the created foreground moving image with the value 1.
2014: If the calculated probability is less than or equal to the preset probability threshold, determine that the pixel belongs to the background image that remains stationary in the picture to be detected, and, according to the position of the pixel in the picture to be detected, fill the corresponding pixel of the created foreground moving image with the value 0.
Each pixel of the picture to be detected is filled into the corresponding pixel of the created foreground moving image in the above manner, yielding the foreground moving image corresponding to the picture to be detected.
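Operations 2011 to 2014 can be sketched as follows. This is a minimal NumPy illustration: the mixture parameters and the probability threshold are assumed values, and in practice the weights, means, and variances would come from the Gaussian mixture model device's online estimates at time t.

```python
import numpy as np

def gaussian_pdf(x, mean, var):
    # Diagonal-covariance Gaussian density eta: the R, G, B components
    # are treated as mutually independent, as formula (1) assumes.
    return float(np.prod(np.exp(-0.5 * (x - mean) ** 2 / var)
                         / np.sqrt(2.0 * np.pi * var)))

def foreground_moving_image(picture, weights, means, variances,
                            threshold=1e-6):
    """Create a blank foreground moving image of the same size as the
    picture (operation 2011), then fill each pixel with 1 if P(x_j)
    from formula (1) exceeds the threshold, and 0 otherwise
    (operations 2012 to 2014)."""
    h, w, _ = picture.shape
    mask = np.zeros((h, w), dtype=np.uint8)
    for r in range(h):
        for c in range(w):
            x = picture[r, c].astype(float)
            # Formula (1): P(x_j) = sum_i w_{i,t} * eta(x_j; mu_{i,t}, Sigma_{i,t})
            p = sum(wi * gaussian_pdf(x, mi, vi)
                    for wi, mi, vi in zip(weights, means, variances))
            mask[r, c] = 1 if p > threshold else 0
    return mask
```

The per-pixel loop is written for clarity; a practical implementation would vectorize the density evaluation over the whole picture.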
The CNN includes multiple convolutional layers. The first convolutional layer performs convolution processing on the picture to be detected that is input to the CNN; the input of every other convolutional layer is the output of the adjacent previous convolutional layer, on which it performs convolution processing. The result output by each convolutional layer of the CNN is a feature picture corresponding to the picture to be detected, and for each convolutional layer, the level of abstraction of the feature picture it outputs is greater than that of the feature picture output by the adjacent previous convolutional layer.
In this step, the process by which the CNN performs convolution processing on the picture to be detected may be as follows: the picture to be detected is input to the CNN; the first convolutional layer performs convolution processing on it to obtain a corresponding feature picture, which is input to the second convolutional layer; the second convolutional layer performs convolution processing on this feature picture, again obtaining a feature picture corresponding to the picture to be detected, whose level of abstraction is greater than that of the feature picture output by the first convolutional layer, and inputs it to the third convolutional layer; and so on, until the last convolutional layer of the CNN outputs a feature picture corresponding to the picture to be detected.
In this step, the feature picture of the picture to be detected output by a first target convolutional layer is obtained as the first feature picture, where the first target convolutional layer is a convolutional layer of the CNN other than the first and the last convolutional layer. Optionally, a convolutional layer located at the middle of the CNN can be chosen as the first target convolutional layer, and the feature picture it outputs is taken as the first feature picture.
Optionally, in this step, a second feature picture obtained by performing convolution processing on the picture to be detected can also be obtained, where the number of convolution operations that produced the first feature picture is smaller than the number that produced the second feature picture. Optionally, a convolutional layer located toward the rear of the CNN can be chosen as a second target convolutional layer, and the feature picture it outputs is taken as the second feature picture; the layer index of the second target convolutional layer is greater than that of the first target convolutional layer. A convolutional layer toward the rear of the CNN means that one of the last N convolutional layers of the CNN may be selected as the second target convolutional layer, where N is a preset value, for example 5, 4, 3, 2, or 1. Optionally, the last convolutional layer of the CNN can be chosen as the second target convolutional layer, that is, the feature picture output by the last convolutional layer is taken as the second feature picture.
Step 202: detect the target images in the first feature picture to obtain a first candidate frame configuration information set, where the first candidate frame configuration information set includes the configuration information of each candidate frame in at least one candidate frame.
Each candidate frame in the first feature picture contains at least one target image, and the target images in the first feature picture are the same as the target images in the picture to be detected.
The configuration information of a candidate frame includes at least the position information and the confidence score of the candidate frame. The position information of the candidate frame may include the positions of a pair of diagonal corner points of the candidate frame; the pair of diagonal corner points may be the two corner points on either diagonal of the candidate frame, and the position of a corner point may be its position in the first feature picture. Alternatively, the position information of the candidate frame may include the position of one vertex of the candidate frame together with the size of the candidate frame; the vertex may be any vertex of the candidate frame, its position is its position in the first feature picture, and the size of the candidate frame may include the width and the height of the candidate frame.
Optionally, a candidate frame may be a rectangular frame, and the confidence score of a candidate frame may indicate the probability that the state of the target object in the candidate frame is a motion state.
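The two equivalent position representations described above (a pair of diagonal corner points, or one vertex plus a width and height) can be sketched as follows. The function names and the (row, column) convention are illustrative assumptions, not part of the patent.

```python
# Sketch of the two candidate-frame position representations.
# All names are illustrative; the patent does not prescribe an API.

def corners_to_vertex_size(r1, c1, r2, c2):
    """Convert a pair of diagonal corner points to (top-left vertex, width, height)."""
    top, left = min(r1, r2), min(c1, c2)
    height = abs(r2 - r1)
    width = abs(c2 - c1)
    return (top, left), width, height

def vertex_size_to_corners(top, left, width, height):
    """Convert (top-left vertex, width, height) back to diagonal corner points."""
    return top, left, top + height, left + width

# A candidate frame given by diagonal corner (row, col) pairs:
vertex, width, height = corners_to_vertex_size(10, 20, 50, 80)
assert vertex == (10, 20) and width == 60 and height == 40
assert vertex_size_to_corners(10, 20, 60, 40) == (10, 20, 50, 80)
```

Either form fixes the same rectangle, which is why the later filtering operations accept both.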
Optionally, in this embodiment of the application, an RPN device is preset. In this step, the first feature picture may be input to the RPN device, which processes the first feature picture to obtain the configuration information of each candidate frame in at least one candidate frame; the configuration information of the candidate frames forms the first candidate frame configuration information set.
It should be understood that the first feature picture may contain at least one target image, and a target image may be a human body image and/or a vehicle image, etc. Upon receiving the input first feature picture, the RPN device uses its Proposals layer to add, in the first feature picture, candidate frames that frame the target images, obtains the position information of each candidate frame in the first feature picture, and estimates a confidence score indicating the probability that the state of the target image is a motion state, thereby obtaining the configuration information of the candidate frame.
Referring to the module diagram of the RPN device shown in Fig. 2-2, when the first feature picture is input to the RPN device, the RPN device adds a sliding window to the first feature picture. By moving the position of the sliding window and enlarging or shrinking its size, multiple different sliding windows are obtained; the feature vector of each sliding window is encoded by a convolutional layer, and a fully connected layer outputs, from the feature vector of each sliding window, the position information of at least one candidate frame and the confidence score of the at least one candidate frame.
Referring to Fig. 2-3, after the sliding window is added to the first feature picture, multiple candidate frames are obtained and the confidence score of each candidate frame is output by moving the sliding window and enlarging or shrinking it.
In this step, some of the obtained candidate frames contain target images that are incomplete target images, and such candidate frames also contain background images of relatively large area.
Step 203: filter, from the first candidate frame configuration information set and according to the foreground moving image, the configuration information of candidate frames whose contained target image is an incomplete target image, to obtain a second candidate frame configuration information set.
There are many filtering methods for implementing this step. For example, the first candidate frame configuration information set may be filtered according to the integral image of the foreground moving image; as another example, the first candidate frame configuration information set may be filtered directly according to the foreground moving image. Other filtering methods are not enumerated one by one.
Referring to Fig. 2-4, the above process of filtering the first candidate frame configuration information set according to the integral image of the foreground moving image can be completed by the following operations 2031 to 2034:
2031: compute, from the foreground moving image, the integral image corresponding to the foreground moving image.
First, a blank integral image of the same size as the foreground moving image is created. For any pixel in the foreground moving image, assumed to be the pixel in row M and column N, the integral value of that pixel can be calculated by the following formula (2), and, according to the position of the pixel in the foreground moving image, the integral value is filled into the created integral image, i.e., at the position in row M and column N of the created integral image:
Integral(M, N) = Σ(i=1..M) Σ(j=1..N) image(i, j) ……(2);
In the above formula (2), Integral(M, N) is the integral value of the pixel in row M and column N, and image(i, j) is the pixel value of the pixel in row i and column j of the foreground moving image.
For every other pixel of the foreground moving image, its integral value is filled into the created integral image in the manner described above, yielding the integral image corresponding to the foreground moving image.
Since, in the foreground moving image, the pixel value of a pixel of the moving foreground is 1 and the pixel value of a pixel of the stationary background image is 0, the integral value of the pixel in row M and column N equals the area of the moving foreground within an image region of the foreground moving image; this image region contains both the pixel in the first row and first column of the foreground moving image and the pixel in row M and column N, and its size is M × N.
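Operation 2031 can be sketched as follows; the NumPy-based layout and function name are assumptions for illustration, not part of the patent.

```python
import numpy as np

def integral_image(mask):
    """Compute the integral image of a binary foreground mask.

    Matching formula (2): entry (M, N) is the sum of mask[i, j] over all
    i <= M and j <= N, i.e. the foreground area of the rectangle from the
    top-left pixel down to (M, N)."""
    return np.cumsum(np.cumsum(mask, axis=0), axis=1)

# A 3x4 mask: pixels in motion have value 1, stationary background is 0.
mask = np.array([[0, 1, 1, 0],
                 [1, 1, 0, 0],
                 [0, 0, 1, 0]])
integ = integral_image(mask)
# The bottom-right entry equals the total foreground area of the whole mask.
assert integ[-1, -1] == mask.sum() == 5
# Entry (1, 1) covers the top-left 2x2 region: 0 + 1 + 1 + 1 = 3.
assert integ[1, 1] == 3
```

Filling every entry via two running sums costs one pass over the image, which is what makes the later per-frame area queries cheap.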
Next, the configuration information of candidate frames whose contained target image is an incomplete target image can be filtered from the first candidate frame configuration information set according to the integral image; a detailed implementation may include the following operations 2032 to 2034.
2032: obtain, according to the configuration information of a target candidate frame, the integral image region corresponding to the target candidate frame in the integral image, where the configuration information of the target candidate frame is the configuration information of any candidate frame in the first candidate frame configuration information set.
Optionally, the integral image region corresponding to the target candidate frame may be obtained in the integral image according to the position information of the target candidate frame.
When the position information of the target candidate frame includes the positions of a pair of diagonal corner points of the target candidate frame, the integral image region corresponding to the target candidate frame is obtained in the integral image according to the position of each of the pair of diagonal corner points.
When the position information of the target candidate frame includes the position of one vertex of the target candidate frame and the size of the target candidate frame, the integral image region corresponding to the target candidate frame is obtained in the integral image according to the position of the vertex and the size.
For example, referring to Fig. 2-5, assume the position information of the target candidate frame includes the positions of a pair of diagonal corner points of the target candidate frame, where one corner point is at row i1, column j1 and the other corner point is at row i2, column j2. According to the positions of these two corner points, the integral image region corresponding to the target candidate frame is obtained in the integral image shown in Fig. 2-5.
2033: calculate, according to the integral image region, the ratio between the target image area located within the target candidate frame and the area of the target candidate frame.
Optionally, to implement this step, the integral values of the pixels located at the four vertex positions of the integral image region may be obtained, and the target image area located within the target candidate frame is calculated from the obtained integral values.
Referring to Fig. 2-5, the pixel at the top-left vertex of the integral image region is the pixel in row i1, column j1; the pixel at the bottom-right vertex is the pixel in row i2, column j2; the pixel at the bottom-left vertex is the pixel in row i2, column j1; and the pixel at the top-right vertex is the pixel in row i1, column j2. From the integral values of these four pixels, the target image area Area located within the target candidate frame is calculated by the following formula (3):
Area = Integral(i2, j2) − Integral(i1, j2) − Integral(i2, j1) + Integral(i1, j1) ……(3);
In the above formula (3), Integral(i2, j2) is the integral value of the pixel in row i2, column j2; Integral(i1, j2) is the integral value of the pixel in row i1, column j2; Integral(i2, j1) is the integral value of the pixel in row i2, column j1; and Integral(i1, j1) is the integral value of the pixel in row i1, column j1.
The area of the target candidate frame is calculated according to the configuration information of the target candidate frame, and the ratio between the target image area and the area of the target candidate frame is then calculated.
Optionally, the area of the target candidate frame may be calculated according to its position information.
When the position information of the target candidate frame includes the positions of a pair of diagonal corner points of the target candidate frame, the area of the target candidate frame is calculated from the position of each of the pair of diagonal corner points.
When the position information of the target candidate frame includes the position of one vertex of the target candidate frame and the size of the target candidate frame, the area of the target candidate frame is calculated from the size.
In this step, the target image area can be calculated from the integral values of the pixels at only four vertex positions, so the required amount of computation is small. This reduces the computation required by the filtering operation, increases the filtering rate, and thereby improves the efficiency of detecting target images.
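Operations 2032–2033 can be sketched on top of the integral image from operation 2031. One assumption to note: with a 0-indexed inclusive integral image, formula (3) as written sums the region that excludes row i1 and column j1 (the patent uses a 1-indexed convention), and the sketch simply adopts the formula as-is.

```python
import numpy as np

def integral_image(mask):
    """Integral image per formula (2): running sums along both axes."""
    return np.cumsum(np.cumsum(mask, axis=0), axis=1)

def area_in_frame(integ, i1, j1, i2, j2):
    """Foreground area inside a candidate frame via formula (3):
    four integral-value reads instead of summing every pixel."""
    return integ[i2, j2] - integ[i1, j2] - integ[i2, j1] + integ[i1, j1]

# Binary foreground mask: 1 = moving foreground, 0 = stationary background.
mask = np.array([[0, 0, 0, 0, 0],
                 [0, 1, 1, 0, 0],
                 [0, 1, 1, 1, 0],
                 [0, 0, 1, 1, 0],
                 [0, 0, 0, 0, 0]])
integ = integral_image(mask)

# Candidate frame with diagonal corners (i1, j1) = (0, 0), (i2, j2) = (3, 3).
area = area_in_frame(integ, 0, 0, 3, 3)
# Direct check: the same region summed pixel by pixel.
assert area == mask[1:4, 1:4].sum() == 7

# Ratio used in operation 2033: foreground area over frame area.
frame_area = (3 - 0) * (3 - 0)
ratio = area / frame_area
assert ratio == 7 / 9
```

The point of the formula is visible here: the frame's foreground area costs four array reads regardless of frame size, which is what the step's efficiency claim rests on.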
2034: when the ratio is less than a preset ratio threshold, filter the configuration information of the target candidate frame out of the first candidate frame configuration information set.
When the ratio is greater than or equal to the preset ratio threshold, the configuration information of the target candidate frame is retained in the first candidate frame configuration information set.
Referring to Fig. 2-6, the above process of filtering the first candidate frame configuration information set directly according to the foreground moving image can be completed by the following operations 2131 to 2134:
2131: obtain, according to the configuration information of a target candidate frame, the image region corresponding to the target candidate frame in the foreground moving image, where the configuration information of the target candidate frame is the configuration information of any candidate frame in the first candidate frame configuration information set.
Optionally, the image region corresponding to the target candidate frame may be obtained in the foreground moving image according to the position information of the target candidate frame.
When the position information of the target candidate frame includes the positions of a pair of diagonal corner points of the target candidate frame, the image region corresponding to the target candidate frame is obtained in the foreground moving image according to the position of each of the pair of diagonal corner points.
When the position information of the target candidate frame includes the position of one vertex of the target candidate frame and the size of the target candidate frame, the image region corresponding to the target candidate frame is obtained in the foreground moving image according to the position of the vertex and the size.
Next, the ratio between the target image area located within the target candidate frame and the area of the target candidate frame can be calculated according to the image region; the implementation may include the following operations 2132 to 2134.
2132: count the number of pixels in the image region that belong to target images, and the total number of pixels in the image region.
The pixels of the foreground moving image fall into two classes: one class belongs to the target images that are in motion, and each such pixel has the pixel value 1; the other class belongs to the stationary background image, and each such pixel has the pixel value 0.
Optionally, the number of pixels in the image region whose pixel value is 1 may be counted, yielding the number of pixels in the image region that belong to target images.
2133: calculate the ratio between the counted pixel number and the total pixel number, obtaining the ratio between the target image area located within the target candidate frame and the area of the target candidate frame.
2134: when the ratio is less than the preset ratio threshold, filter the configuration information of the target candidate frame out of the first candidate frame configuration information set.
When the ratio is greater than or equal to the preset ratio threshold, the configuration information of the target candidate frame is retained in the first candidate frame configuration information set.
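Operations 2131–2134 can be sketched directly on the binary foreground mask. Names are illustrative, and the 0.5 threshold is an assumed example value, not one taken from the patent.

```python
import numpy as np

def keep_frame(mask, i1, j1, i2, j2, threshold):
    """Operations 2132-2134: count foreground pixels inside the frame's
    image region, divide by the region's total pixel count, and retain the
    frame only if the ratio reaches the threshold."""
    region = mask[i1:i2 + 1, j1:j2 + 1]          # 2131: crop the region
    foreground = int((region == 1).sum())        # 2132: pixels with value 1
    ratio = foreground / region.size             # 2133: foreground ratio
    return ratio >= threshold                    # 2134: filter or retain

mask = np.array([[1, 1, 0, 0],
                 [1, 1, 0, 0],
                 [0, 0, 0, 0],
                 [0, 0, 0, 0]])

# A frame tightly around the moving object is mostly foreground: retained.
assert keep_frame(mask, 0, 0, 1, 1, threshold=0.5) is True
# A frame covering the whole mask is only 4/16 foreground: filtered out.
assert keep_frame(mask, 0, 0, 3, 3, threshold=0.5) is False
```

Compared with the integral-image variant, this one re-counts pixels for every frame; it trades the one-time integral-image pass for simpler per-frame logic.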
Step 204: add, according to the second candidate frame configuration information set, detection frames in the picture to be detected, where the detection frames contain at least one target image of the picture to be detected.
In this embodiment, the configuration information of each candidate frame in the second candidate frame configuration information set may be sorted according to the confidence score of each candidate frame, yielding a first configuration information sequence.
Optionally, the configuration information of a preset number of candidate frames with the highest confidence scores may be selected from the first configuration information sequence, and a detection frame is added for each selected candidate frame in the picture to be detected according to the candidate frame's position information; the detection frame of a candidate frame is equal in size to the candidate frame.
Optionally, a non-maximum suppression operation may also be performed on the first configuration information sequence to obtain a second configuration information sequence; the number of candidate frame configuration information entries in the second configuration information sequence is less than or equal to the number in the first configuration information sequence. The configuration information of a preset number of candidate frames with the highest confidence scores may then be selected from the second configuration information sequence, and a detection frame is added for each selected candidate frame in the picture to be detected according to the candidate frame's position information; the detection frame of a candidate frame is equal in size to the candidate frame.
The so-called non-maximum suppression operation identifies, in the first configuration information sequence, any two candidate frames whose overlapping area exceeds a preset threshold, and either filters out the configuration information of one of the two candidate frames, or merges the two candidate frames into one candidate frame and obtains the configuration information of the merged candidate frame.
Since a large number of candidate frame configuration information entries were filtered out of the first candidate frame configuration information set in step 203, the non-maximum suppression operation on the configuration information of the candidate frames in the first configuration information sequence has fewer candidate frame configuration information entries to process. This improves the efficiency of the operation and thereby further improves the efficiency of detecting target images.
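The variant of non-maximum suppression that discards one of two heavily overlapping frames can be sketched as follows. The overlap measure used here is intersection-over-union, a common choice; the patent only requires that the overlapping area exceed a preset threshold, so this is one assumed instantiation.

```python
def iou(a, b):
    """Intersection-over-union of two frames given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def nms(frames, scores, threshold):
    """Keep the highest-scoring frame, drop any frame overlapping it
    beyond the threshold, and repeat on the remainder."""
    order = sorted(range(len(frames)), key=lambda i: scores[i], reverse=True)
    kept = []
    for i in order:
        if all(iou(frames[i], frames[k]) <= threshold for k in kept):
            kept.append(i)
    return kept

frames = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
# The second frame overlaps the first heavily and is suppressed.
assert nms(frames, scores, threshold=0.5) == [0, 2]
```

Because step 203 has already removed many candidate frames, the pairwise overlap checks inside `nms` run over a much smaller set, which is the efficiency gain described above.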
Optionally, when a detection frame is added to the picture to be detected, the type of the target image in the detection frame may also be added. In implementation, the detection frame and the type of the target image in the detection frame may be added to the picture to be detected according to the second feature picture and the configuration information of each selected candidate frame.
In implementation, the second feature picture and the configuration information of each selected candidate frame may be input to the region-of-interest (Region of Interest, RoI) pooling layer of a Fast Rcnn device; the RoI pooling layer of the Fast Rcnn device outputs the target image type in each selected candidate frame, and the detection frame and the type of the target image in the detection frame are added to the picture to be detected according to the position information of each candidate frame.
The above processing may be performed on every frame of picture, thereby adding, in every frame of picture, the detection frames and the types of the target images in the detection frames.
Referring to the module diagram of the Fast Rcnn device shown in Fig. 2-7, the Fast Rcnn device includes a shared convolutional layer, a dedicated convolutional layer, a RoI pooling layer, and a fully connected layer. The picture to be detected is processed by the shared convolutional layer and the dedicated convolutional layer; the resulting second feature picture, together with the second candidate frame configuration information set, is input to the RoI pooling layer, and after processing by the RoI pooling layer and the fully connected layer, the detection frames and the type of the target image in each detection frame are output.
Referring to Fig. 2-8, from the above process it can be seen that this embodiment of the application is applied to the following software system, which may be executed by a device to perform the above method; the device may be the camera device or the server in the embodiment shown in Fig. 1. The software system may include a filter device, a Gaussian mixture model device, an RPN device, and a Fast Rcnn device.
The picture to be detected is separately input to the CNN in the Fast Rcnn device and to the Gaussian mixture model device. The CNN in the Fast Rcnn device inputs the first feature picture to the RPN device; the Gaussian mixture model device outputs the foreground moving image to the filter device, and the RPN device inputs the first candidate frame configuration information set to the filter device. The filter device obtains the second candidate frame configuration information set through the operations of step 203 above and inputs it to the Fast Rcnn device, and the Fast Rcnn device adds the detection frames and the types of the target images in the detection frames to the picture to be detected.
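The data flow between the four devices can be sketched end to end. Every function here is a stand-in placeholder for the corresponding device, not an implementation of it; only the filter device carries real logic (step 203), and all names and the 0.5 threshold are assumptions.

```python
# End-to-end data flow of the software system described above.

def gaussian_mixture_model(picture):
    """Stand-in: returns a binary foreground moving image."""
    return [[1 if px > 0 else 0 for px in row] for row in picture]

def cnn_first_feature(picture):
    """Stand-in: first feature picture from an intermediate CNN layer."""
    return picture

def rpn(feature_picture):
    """Stand-in: candidate frames as (i1, j1, i2, j2, confidence)."""
    return [(0, 0, 1, 1, 0.9), (0, 0, 3, 3, 0.8)]

def filter_device(candidates, foreground, threshold=0.5):
    """Step 203: keep frames whose foreground ratio reaches the threshold."""
    kept = []
    for (i1, j1, i2, j2, conf) in candidates:
        region = [row[j1:j2 + 1] for row in foreground[i1:i2 + 1]]
        total = sum(len(r) for r in region)
        fg = sum(sum(r) for r in region)
        if fg / total >= threshold:
            kept.append((i1, j1, i2, j2, conf))
    return kept

picture = [[5, 5, 0, 0],
           [5, 5, 0, 0],
           [0, 0, 0, 0],
           [0, 0, 0, 0]]
foreground = gaussian_mixture_model(picture)
candidates = rpn(cnn_first_feature(picture))
second_set = filter_device(candidates, foreground)
# Only the tight frame around the moving object survives the filter.
assert second_set == [(0, 0, 1, 1, 0.9)]
```

The surviving second set would then be handed to the Fast Rcnn device's RoI pooling layer to attach detection frames and target image types.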
In this embodiment of the application, the foreground moving image of the picture to be detected is obtained, and according to the foreground moving image, the configuration information of candidate frames containing incomplete target images is filtered out of the first candidate frame configuration information set to obtain the second candidate frame configuration information set; detection frames are then added to the picture to be detected according to the second candidate frame configuration information set, which improves the detection precision. Because a large number of candidate frame configuration information entries have been filtered out, when the non-maximum suppression operation is performed on the second candidate frame configuration information set, fewer candidate frame configuration information entries need to be processed; this reduces the amount of computation, increases the processing speed, and thereby improves the detection efficiency.
Referring to Fig. 3-1, an embodiment of the application provides a device 300 for detecting target images. The device 300 can be used to implement the embodiment shown in Fig. 2-1, and can also implement the functions of the server or the camera device in the embodiment shown in Fig. 1. It includes:
an acquiring unit 301, configured to obtain the foreground moving image corresponding to a picture to be detected and to obtain a first feature picture obtained by performing convolution operations on the picture to be detected, where the foreground moving image includes the target images that are in motion in the picture to be detected and the background image other than the target images;
a detection unit 302, configured to detect the target images in the first feature picture and obtain a first candidate frame configuration information set, where the first candidate frame configuration information set includes the configuration information of each candidate frame in at least one candidate frame, each candidate frame in the first feature picture contains at least one target image, and the target images in the first feature picture are the same as the target images in the picture to be detected;
a filter unit 303, configured to filter, from the first candidate frame configuration information set and according to the foreground moving image, the configuration information of candidate frames whose contained target image is an incomplete target image, to obtain a second candidate frame configuration information set; and
an adding unit 304, configured to add, according to the second candidate frame configuration information set, detection frames in the picture to be detected, where the detection frames contain at least one target image of the picture to be detected.
Optionally, referring to Fig. 3-2, the device 300 further includes at least one of a transceiver unit 305 and a storage unit 306.
The picture to be detected may be a picture received by the transceiver unit 305, or a picture stored in the storage unit 306.
Optionally, referring to Fig. 3-3, when the device 300 is used to implement the functions of the camera device, the device 300 may further include a camera unit 307, which may be a camera or the like; the picture to be detected may be a picture captured by the camera unit 307. The device 300 may also include the transceiver unit 305 and/or the storage unit 306; the transceiver unit 305 may be used to send the pictures captured by the camera unit 307, and the storage unit 306 may be used to store the pictures captured by the camera unit 307.
Optionally, when the device 300 is used to implement the functions of the server, the device 300 may include the transceiver unit 305 and/or the storage unit 306.
Optionally, the acquiring unit 301 is configured to obtain the foreground moving image corresponding to the picture to be detected by performing Gaussian mixture background modeling on the picture to be detected.
Optionally, the filter unit 303 is configured to:
compute, from the foreground moving image, the integral image corresponding to the foreground moving image; and
filter, from the first candidate frame configuration information set and according to the integral image, the configuration information of candidate frames whose contained target image is an incomplete target image.
Optionally, the filter unit 303 is configured to:
obtain, according to the configuration information of a target candidate frame, the integral image region corresponding to the target candidate frame in the integral image, where the configuration information of the target candidate frame is the configuration information of any candidate frame in the first candidate frame configuration information set;
calculate, according to the integral image region, the ratio between the target image area located within the target candidate frame and the area of the target candidate frame; and
when the ratio is less than a preset ratio threshold, filter the configuration information of the target candidate frame from the first candidate frame configuration information set.
Optionally, the filter unit 303 is configured to:
obtain the integral values of the pixels located at the four vertex positions of the integral image region;
calculate, from the obtained integral values, the target image area located within the target candidate frame;
calculate the area of the target candidate frame according to the configuration information of the target candidate frame; and
calculate the ratio between the target image area and the area of the target candidate frame.
Optionally, the filter unit 303 is configured to:
obtain, according to the configuration information of a target candidate frame, the image region corresponding to the target candidate frame in the foreground moving image, where the configuration information of the target candidate frame is the configuration information of any candidate frame in the first candidate frame configuration information set;
calculate, according to the image region, the ratio between the target image area located within the target candidate frame and the area of the target candidate frame; and
when the ratio is less than a preset ratio threshold, filter the configuration information of the target candidate frame from the first candidate frame configuration information set.
Optionally, the filter unit 303 is configured to:
count the number of pixels in the image region that belong to target images and the total number of pixels in the image region; and
calculate the ratio between the counted pixel number and the total pixel number, obtaining the ratio between the target image area located within the target candidate frame and the area of the target candidate frame.
Optionally, the adding unit 304 is configured to:
obtain a second feature picture obtained by performing convolution operations on the picture to be detected, where the number of convolution operations performed to obtain the first feature picture is less than the number of convolution operations performed to obtain the second feature picture; and
add, according to the second feature picture and the second candidate frame configuration information set, the detection frames and the types of the target images in the detection frames to the picture to be detected.
In this embodiment of the application, because the foreground moving image of the picture to be detected is obtained, the configuration information of candidate frames containing incomplete target images can be filtered out of the first candidate frame configuration information set according to the foreground moving image to obtain the second candidate frame configuration information set, and detection frames are added to the picture to be detected according to the second candidate frame configuration information set, which can improve the detection precision.
Referring to Fig. 4, Fig. 4 is a schematic diagram of a device 400 for detecting target images provided by an embodiment of the application. The device 400 includes at least one processor 401, a bus system 402, a memory 403, and at least one transceiver 404.
The device 400 is a device of hardware structure and can be used to implement the functional modules of the device described in Fig. 3-1. For example, those skilled in the art will appreciate that the acquiring unit 301, the detection unit 302, the filter unit 303, and/or the adding unit 304 of the device 300 shown in Fig. 3-1 can be implemented by the at least one processor 401 calling code in the memory 403, and the transceiver unit 305 of the device 300 shown in Fig. 3-1 can be implemented by the at least one transceiver 404.
Optionally, the device 400 can also be used to implement the functions of the camera device in the embodiment described in Fig. 1, or to implement the functions of the server in the embodiment shown in Fig. 1.
When the device 400 is used for the functions of the camera device, the device 400 may further include a camera 407; the camera unit 307 of the device 300 shown in Fig. 3-1 can be implemented by the camera 407.
Optionally, the above processor 401 may be a general-purpose central processing unit (central processing unit, CPU), a microprocessor, an application-specific integrated circuit (application-specific integrated circuit, ASIC), or one or more integrated circuits for controlling the execution of the programs of the solution of this application.
The above bus system 402 may include a path for transmitting information between the above components.
The above transceiver 404 is used to communicate with other devices or communication networks, such as Ethernet, a radio access network (radio access network, RAN), or a wireless local area network (wireless local area networks, WLAN).
The above memory 403 may be a read-only memory (read-only memory, ROM) or another type of static storage device capable of storing static information and instructions, a random access memory (random access memory, RAM) or another type of dynamic storage device capable of storing information and instructions, an electrically erasable programmable read-only memory (electrically erasable programmable read-only memory, EEPROM), a compact disc read-only memory (compact disc read-only memory, CD-ROM) or other optical disc storage (including compact discs, laser discs, optical discs, digital versatile discs, Blu-ray discs, etc.), a magnetic disk storage medium or other magnetic storage device, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer, but is not limited thereto. The memory may exist independently and be connected to the processor through the bus, or the memory may be integrated with the processor.
The memory 403 is used to store the application program code for executing the solution of this application, and execution is controlled by the processor 401. The processor 401 is used to execute the application program code stored in the memory 403, so as to realize the functions in the method of this patent.
In a specific implementation, as an embodiment, the processor 401 may include one or more CPUs, such as CPU0 and CPU1 in Fig. 4.
In a specific implementation, as an embodiment, the device 400 may include multiple processors, such as the processor 401 and the processor 408 in Fig. 4. Each of these processors may be a single-core (single-CPU) processor or a multi-core (multi-CPU) processor. A processor here may refer to one or more devices, circuits, and/or processing cores for processing data (such as computer program instructions).
In a specific implementation, as an embodiment, when the device 400 is used to implement the functions of the server, the device 400 may further include an output device 405 and an input device 406. The output device 405 communicates with the processor 401 and can display information in multiple ways; for example, the output device 405 may be a liquid crystal display (liquid crystal display, LCD), a light emitting diode (light emitting diode, LED) display device, a cathode ray tube (cathode ray tube, CRT) display device, or a projector (projector), etc. The input device 406 communicates with the processor 401 and can receive user input in multiple ways; for example, the input device 406 may be a mouse, a keyboard, a touch screen device, or a sensing device, etc.
The serial numbers of the above embodiments of the application are for description only and do not represent the superiority or inferiority of the embodiments.
Those of ordinary skill in the art will appreciate that all or part of the steps of the above embodiments may be completed by hardware, or by a program instructing the relevant hardware; the program may be stored in a computer-readable storage medium, and the storage medium mentioned above may be a read-only memory, a magnetic disk, an optical disc, or the like.
The foregoing is merely alternative embodiments of the application and is not intended to limit the application; any modification, equivalent replacement, improvement, and the like made within the spirit and principles of the application shall be included within the scope of protection of the application.
Claims (18)
1. A method for detecting a target image, wherein the method comprises:
obtaining a foreground motion image corresponding to a picture to be detected, and obtaining a first feature picture obtained by performing a convolution operation on the picture to be detected, wherein the foreground motion image comprises a target image that is in a moving state in the picture to be detected and a background image other than the target image;
detecting the target image in the first feature picture to obtain a first candidate frame configuration information set, wherein the first candidate frame configuration information set comprises configuration information of each of at least one candidate frame in the first feature picture, each candidate frame comprises at least one target image, and the target image in the first feature picture is the same as the target image in the picture to be detected;
filtering, from the first candidate frame configuration information set according to the foreground motion image, the configuration information of candidate frames whose included target image is an incomplete target image, to obtain a second candidate frame configuration information set; and
adding a detection frame in the picture to be detected according to the second candidate frame configuration information set, wherein the detection frame comprises at least one target image in the picture to be detected.
2. The method according to claim 1, wherein the obtaining a foreground motion image corresponding to a picture to be detected comprises:
obtaining the foreground motion image corresponding to the picture to be detected by performing Gaussian mixture background modeling on the picture to be detected.
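As a rough illustration of the per-pixel background modeling named in claim 2, the following is a deliberately simplified, single-Gaussian sketch of the mixture-of-Gaussians idea; a real implementation would keep several Gaussians per pixel (for example, OpenCV's `createBackgroundSubtractorMOG2`). The class name, parameters, and API below are illustrative assumptions, not taken from the patent.

```python
class SingleGaussianBackground:
    """Simplified sketch: one running Gaussian (mean/variance) per pixel.

    A pixel whose value falls far from its background mean is flagged as
    foreground (a moving-target pixel); the model then adapts slowly.
    """

    def __init__(self, width, height, alpha=0.05, k=2.5):
        self.alpha = alpha  # learning rate of the running estimates
        self.k = k          # threshold in standard deviations
        self.mean = [[0.0] * width for _ in range(height)]
        self.var = [[225.0] * width for _ in range(height)]  # initial variance

    def apply(self, frame):
        """Update the model with a grayscale frame (list of rows of ints)
        and return a binary foreground mask of the same shape."""
        mask = []
        for y, row in enumerate(frame):
            mask_row = []
            for x, v in enumerate(row):
                m, var = self.mean[y][x], self.var[y][x]
                d = v - m
                # foreground if the squared deviation exceeds k^2 * variance
                mask_row.append(1 if d * d > (self.k ** 2) * var else 0)
                # slowly adapt the background statistics toward the new frame
                self.mean[y][x] = m + self.alpha * d
                self.var[y][x] = max(1.0, var + self.alpha * (d * d - var))
            mask.append(mask_row)
        return mask
```

After the model has seen enough frames of a static scene, a suddenly bright region is reported as foreground while unchanged pixels are not; the resulting mask plays the role of the foreground motion image in claim 1.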
3. The method according to claim 1 or 2, wherein the filtering, from the first candidate frame configuration information set according to the foreground motion image, the configuration information of candidate frames whose included target image is an incomplete target image comprises:
calculating, according to the foreground motion image, an integral image corresponding to the foreground motion image; and
filtering, from the first candidate frame configuration information set according to the integral image, the configuration information of candidate frames whose included target image is an incomplete target image.
4. The method according to claim 3, wherein the filtering, from the first candidate frame configuration information set according to the integral image, the configuration information of candidate frames whose included target image is an incomplete target image comprises:
obtaining, according to configuration information of a target candidate frame, an integral image region corresponding to the target candidate frame in the integral image, wherein the configuration information of the target candidate frame is the configuration information of any one candidate frame in the first candidate frame configuration information set;
calculating, according to the integral image region, a ratio between the target image area located within the target candidate frame and the area of the target candidate frame; and
filtering the configuration information of the target candidate frame from the first candidate frame configuration information set when the ratio is less than a preset ratio threshold.
5. The method according to claim 4, wherein the calculating, according to the integral image region, a ratio between the target image area located within the target candidate frame and the area of the target candidate frame comprises:
obtaining the integral values of the pixels located at the four vertex positions of the integral image region;
calculating, according to the obtained integral value of each pixel, the target image area located within the target candidate frame;
calculating the area of the target candidate frame according to the configuration information of the target candidate frame; and
calculating the ratio between the target image area and the area of the target candidate frame.
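The four-vertex computation in claims 3 to 5 is the classic summed-area-table trick, and can be sketched as follows. The mask is a binary foreground motion image (1 = moving-target pixel); the foreground-pixel count inside any candidate frame is recovered in O(1) from the integral values at its four corners, and frames whose foreground ratio falls below a threshold are dropped as covering an incomplete target. Function names, the half-open box convention, and the 0.5 default threshold are illustrative assumptions, not values from the patent.

```python
def integral_image(mask):
    """Summed-area table with a one-pixel zero border, so
    ii[y][x] = sum of mask[0..y-1][0..x-1]."""
    h, w = len(mask), len(mask[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        for x in range(w):
            ii[y + 1][x + 1] = mask[y][x] + ii[y][x + 1] + ii[y + 1][x] - ii[y][x]
    return ii

def region_sum(ii, x0, y0, x1, y1):
    """Foreground-pixel count in the box [x0, x1) x [y0, y1), computed from
    the integral values at the region's four vertices."""
    return ii[y1][x1] - ii[y0][x1] - ii[y1][x0] + ii[y0][x0]

def keep_frame(ii, frame, ratio_threshold=0.5):
    """Keep the candidate frame only if the ratio of target-image area to
    frame area reaches the threshold (otherwise it is filtered out)."""
    x0, y0, x1, y1 = frame
    area = (x1 - x0) * (y1 - y0)
    return region_sum(ii, x0, y0, x1, y1) / area >= ratio_threshold
```

The appeal of the integral image here is that the first candidate frame set may contain many overlapping frames, yet each ratio test costs only four lookups regardless of frame size.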
6. The method according to claim 1 or 2, wherein the filtering, from the first candidate frame configuration information set according to the foreground motion image, the configuration information of candidate frames whose included target image is an incomplete target image comprises:
obtaining, according to configuration information of a target candidate frame, an image region corresponding to the target candidate frame in the foreground motion image, wherein the configuration information of the target candidate frame is the configuration information of any one candidate frame in the first candidate frame configuration information set;
calculating, according to the image region, a ratio between the target image area located within the target candidate frame and the area of the target candidate frame; and
filtering the configuration information of the target candidate frame from the first candidate frame configuration information set when the ratio is less than a preset ratio threshold.
7. The method according to claim 6, wherein the calculating, according to the image region, a ratio between the target image area located within the target candidate frame and the area of the target candidate frame comprises:
counting, in the image region, the number of pixels that belong to the target image and the total number of pixels in the image region; and
calculating the ratio between the number of pixels and the total number of pixels, to obtain the ratio between the target image area located within the target candidate frame and the area of the target candidate frame.
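The direct-counting alternative of claims 6 and 7 can be sketched without an integral image: count the foreground pixels inside each candidate frame's region of the foreground motion image and compare the ratio against the threshold. Names, the box convention, and the 0.5 default are illustrative assumptions only.

```python
def foreground_ratio(fg_mask, frame):
    """fg_mask: binary foreground motion image (list of rows, 1 = target pixel).
    frame: (x0, y0, x1, y1) candidate frame with a half-open convention.
    Returns (pixels belonging to the target image) / (total pixels in region)."""
    x0, y0, x1, y1 = frame
    total = (x1 - x0) * (y1 - y0)
    hits = sum(fg_mask[y][x] for y in range(y0, y1) for x in range(x0, x1))
    return hits / total

def filter_candidates(fg_mask, frames, ratio_threshold=0.5):
    """Second candidate frame set: keep only frames whose foreground ratio
    reaches the threshold; the rest are treated as incomplete targets."""
    return [f for f in frames if foreground_ratio(fg_mask, f) >= ratio_threshold]
```

This costs O(frame area) per candidate rather than O(1), which is why the integral-image variant of claims 3 to 5 pays off when there are many or large candidate frames.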
8. The method according to any one of claims 1 to 7, wherein the adding a detection frame in the picture to be detected according to the second candidate frame configuration information set comprises:
obtaining a second feature picture obtained by performing a convolution operation on the picture to be detected, wherein the number of convolution operations performed on the first feature picture is less than the number of convolution operations performed on the second feature picture; and
adding, according to the second feature picture and the second candidate frame configuration information set, the detection frame in the picture to be detected and the type of the target image in the detection frame.
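Claim 8 distinguishes the two feature pictures only by how many convolution operations each has passed through. The toy below makes that concrete: the same input is convolved a different number of times, so the "second feature picture" has undergone more convolutions than the first. In a real system each operation would be a learned CNN layer; here a 3x3 box filter stands in for one convolution operation, purely as an assumed illustration.

```python
def conv_box3(img):
    """One stand-in 'convolution operation': 3x3 box average with
    border handling (the kernel is clipped at image edges)."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = n = 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < h and 0 <= xx < w:
                        acc += img[yy][xx]
                        n += 1
            out[y][x] = acc / n
    return out

def feature_picture(img, num_convs):
    """Apply num_convs convolution operations; per claim 8, the first
    feature picture uses fewer operations than the second."""
    for _ in range(num_convs):
        img = conv_box3(img)
    return img
```

For example, `feature_picture(img, 2)` and `feature_picture(img, 4)` would play the roles of the first and second feature pictures respectively: the deeper one aggregates context from a wider neighborhood of the picture to be detected.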
9. An apparatus for detecting a target image, wherein the apparatus comprises:
an acquiring unit, configured to obtain a foreground motion image corresponding to a picture to be detected and to obtain a first feature picture obtained by performing a convolution operation on the picture to be detected, wherein the foreground motion image comprises a target image that is in a moving state in the picture to be detected and a background image other than the target image;
a detection unit, configured to detect the target image in the first feature picture to obtain a first candidate frame configuration information set, wherein the first candidate frame configuration information set comprises configuration information of each of at least one candidate frame in the first feature picture, each candidate frame comprises at least one target image, and the target image in the first feature picture is the same as the target image in the picture to be detected;
a filtering unit, configured to filter, from the first candidate frame configuration information set according to the foreground motion image, the configuration information of candidate frames whose included target image is an incomplete target image, to obtain a second candidate frame configuration information set; and
an adding unit, configured to add a detection frame in the picture to be detected according to the second candidate frame configuration information set, wherein the detection frame comprises at least one target image in the picture to be detected.
10. The apparatus according to claim 9, wherein the acquiring unit is configured to obtain the foreground motion image corresponding to the picture to be detected by performing Gaussian mixture background modeling on the picture to be detected.
11. The apparatus according to claim 9 or 10, wherein the filtering unit is configured to:
calculate, according to the foreground motion image, an integral image corresponding to the foreground motion image; and
filter, from the first candidate frame configuration information set according to the integral image, the configuration information of candidate frames whose included target image is an incomplete target image.
12. The apparatus according to claim 11, wherein the filtering unit is configured to:
obtain, according to configuration information of a target candidate frame, an integral image region corresponding to the target candidate frame in the integral image, wherein the configuration information of the target candidate frame is the configuration information of any one candidate frame in the first candidate frame configuration information set;
calculate, according to the integral image region, a ratio between the target image area located within the target candidate frame and the area of the target candidate frame; and
filter the configuration information of the target candidate frame from the first candidate frame configuration information set when the ratio is less than a preset ratio threshold.
13. The apparatus according to claim 12, wherein the filtering unit is configured to:
obtain the integral values of the pixels located at the four vertex positions of the integral image region;
calculate, according to the obtained integral value of each pixel, the target image area located within the target candidate frame;
calculate the area of the target candidate frame according to the configuration information of the target candidate frame; and
calculate the ratio between the target image area and the area of the target candidate frame.
14. The apparatus according to claim 9 or 10, wherein the filtering unit is configured to:
obtain, according to configuration information of a target candidate frame, an image region corresponding to the target candidate frame in the foreground motion image, wherein the configuration information of the target candidate frame is the configuration information of any one candidate frame in the first candidate frame configuration information set;
calculate, according to the image region, a ratio between the target image area located within the target candidate frame and the area of the target candidate frame; and
filter the configuration information of the target candidate frame from the first candidate frame configuration information set when the ratio is less than a preset ratio threshold.
15. The apparatus according to claim 14, wherein the filtering unit is configured to:
count, in the image region, the number of pixels that belong to the target image and the total number of pixels in the image region; and
calculate the ratio between the number of pixels and the total number of pixels, to obtain the ratio between the target image area located within the target candidate frame and the area of the target candidate frame.
16. The apparatus according to any one of claims 9 to 15, wherein the adding unit is configured to:
obtain a second feature picture obtained by performing a convolution operation on the picture to be detected, wherein the number of convolution operations performed on the first feature picture is less than the number of convolution operations performed on the second feature picture; and
add, according to the second feature picture and the second candidate frame configuration information set, the detection frame in the picture to be detected and the type of the target image in the detection frame.
17. An apparatus for detecting a target image, wherein the apparatus comprises:
at least one processor; and
at least one memory,
wherein the at least one memory stores one or more programs configured to be executed by the at least one processor, and the one or more programs comprise instructions for performing the method according to any one of claims 1 to 8.
18. A non-volatile computer-readable storage medium for storing a computer program, wherein the computer program is loaded by a processor to execute the instructions of the method according to any one of claims 1 to 8.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810258574.7A CN110310301B (en) | 2018-03-27 | 2018-03-27 | Method and device for detecting target object |
PCT/CN2019/074761 WO2019184604A1 (en) | 2018-03-27 | 2019-02-11 | Method and device for detecting target image |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110310301A true CN110310301A (en) | 2019-10-08 |
CN110310301B CN110310301B (en) | 2021-07-16 |
Family
ID=68062170
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810258574.7A Active CN110310301B (en) | 2018-03-27 | 2018-03-27 | Method and device for detecting target object |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN110310301B (en) |
WO (1) | WO2019184604A1 (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113836985A (en) * | 2020-06-24 | 2021-12-24 | 富士通株式会社 | Image processing apparatus, image processing method, and computer-readable storage medium |
CN114511694B (en) * | 2022-01-28 | 2023-05-12 | 北京百度网讯科技有限公司 | Image recognition method, device, electronic equipment and medium |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106557778A (en) * | 2016-06-17 | 2017-04-05 | 北京市商汤科技开发有限公司 | Generic object detection method and device, data processing equipment and terminal device |
CN106709447A (en) * | 2016-12-21 | 2017-05-24 | 华南理工大学 | Abnormal behavior detection method in video based on target positioning and characteristic fusion |
US20170169315A1 (en) * | 2015-12-15 | 2017-06-15 | Sighthound, Inc. | Deeply learned convolutional neural networks (cnns) for object localization and classification |
CN107256225A (en) * | 2017-04-28 | 2017-10-17 | 济南中维世纪科技有限公司 | A kind of temperature drawing generating method and device based on video analysis |
CN107730553A (en) * | 2017-11-02 | 2018-02-23 | 哈尔滨工业大学 | A kind of Weakly supervised object detecting method based on pseudo- true value search method |
CN107833213A (en) * | 2017-11-02 | 2018-03-23 | 哈尔滨工业大学 | A kind of Weakly supervised object detecting method based on pseudo- true value adaptive method |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9965719B2 (en) * | 2015-11-04 | 2018-05-08 | Nec Corporation | Subcategory-aware convolutional neural networks for object detection |
CN107133974B (en) * | 2017-06-02 | 2019-08-27 | 南京大学 | Gaussian Background models the vehicle type classification method combined with Recognition with Recurrent Neural Network |
CN107590489A (en) * | 2017-09-28 | 2018-01-16 | 国家新闻出版广电总局广播科学研究院 | Object detection method based on concatenated convolutional neutral net |
Non-Patent Citations (4)
Title |
---|
C. LAWRENCE ZITNICK ET AL: "Edge Boxes: Locating Object Proposals from Edges", ECCV 2014 * |
XIAOJIANG PENG ET AL: "Multi-region Two-Stream R-CNN for Action Detection", ECCV 2016 * |
XUE YUAN ET AL: "A Graph-Based Vehicle Proposal Location and Detection Algorithm", IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS * |
ZHANG JIAOJIAO: "Research on Multi-Video Target Extraction Algorithms Based on Co-segmentation", China Master's Theses Full-text Database, Information Science and Technology Series * |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110852321A (en) * | 2019-11-11 | 2020-02-28 | 北京百度网讯科技有限公司 | Candidate frame filtering method and device and electronic equipment |
CN110852321B (en) * | 2019-11-11 | 2022-11-22 | 北京百度网讯科技有限公司 | Candidate frame filtering method and device and electronic equipment |
CN111815570A (en) * | 2020-06-16 | 2020-10-23 | 浙江大华技术股份有限公司 | Regional intrusion detection method and related device thereof |
Also Published As
Publication number | Publication date |
---|---|
WO2019184604A1 (en) | 2019-10-03 |
CN110310301B (en) | 2021-07-16 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109559320B (en) | Method and system for realizing visual SLAM semantic mapping function based on hole convolution deep neural network | |
CN110310301A (en) | A kind of method and device detecting target image | |
CN110135243B (en) | Pedestrian detection method and system based on two-stage attention mechanism | |
WO2018107910A1 (en) | Method and device for fusing panoramic video images | |
CN101908231B (en) | Reconstruction method and system for processing three-dimensional point cloud containing main plane scene | |
US11270441B2 (en) | Depth-aware object counting | |
CN109918977A (en) | Determine the method, device and equipment of free time parking stall | |
CN107274445A (en) | A kind of image depth estimation method and system | |
CN106599805A (en) | Supervised data driving-based monocular video depth estimating method | |
CN107220603A (en) | Vehicle checking method and device based on deep learning | |
CN108510504A (en) | Image partition method and device | |
CN110648363A (en) | Camera posture determining method and device, storage medium and electronic equipment | |
CN110009675A (en) | Generate method, apparatus, medium and the equipment of disparity map | |
CN109840982A (en) | It is lined up recommended method and device, computer readable storage medium | |
CN112287824A (en) | Binocular vision-based three-dimensional target detection method, device and system | |
CN110458128A (en) | A kind of posture feature acquisition methods, device, equipment and storage medium | |
CN111105351B (en) | Video sequence image splicing method and device | |
CN117197388A (en) | Live-action three-dimensional virtual reality scene construction method and system based on generation of antagonistic neural network and oblique photography | |
CN115482523A (en) | Small object target detection method and system of lightweight multi-scale attention mechanism | |
CN114037087A (en) | Model training method and device, depth prediction method and device, equipment and medium | |
Zhu et al. | PairCon-SLAM: Distributed, online, and real-time RGBD-SLAM in large scenarios | |
US10706499B2 (en) | Image processing using an artificial neural network | |
US10861174B2 (en) | Selective 3D registration | |
CN115527189A (en) | Parking space state detection method, terminal device and computer readable storage medium | |
CN115171011A (en) | Multi-class building material video counting method and system and counting equipment |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||