CN109977912A - Method, apparatus, computer device and storage medium for detecting human keypoints in video - Google Patents
- Publication number
- CN109977912A (application CN201910276687.4A / CN201910276687A)
- Authority
- CN
- China
- Prior art keywords
- image
- detected
- flow field
- optical flow
- feature
- Prior art date
- Legal status
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/20—Analysis of motion
- G06T7/269—Analysis of motion using gradient-based methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/46—Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
- G06V10/462—Salient features, e.g. scale invariant feature transforms [SIFT]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
- G06V20/46—Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10016—Video; Image sequence
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30196—Human being; Person
Abstract
The present invention relates to a method for detecting human keypoints in video, comprising: extracting multiple frames of images to be detected from a video to be detected; obtaining the optical flow fields between the multiple frames of images to be detected, and obtaining a feature map of each image to be detected; fusing the feature maps according to the optical flow fields to obtain an enhanced feature map of the image to be detected; and inputting the enhanced feature map into a preset neural network to obtain the human keypoints in the image to be detected. By extracting the optical flow between frames, the features of the image to be detected are enhanced, which in turn improves the accuracy of keypoint detection in video.
Description
Technical field
The present invention relates to the field of image processing, and in particular to a method, apparatus, computer device and storage medium for detecting human keypoints in video.
Background art
Human keypoint detection studies how to accurately identify and localize each keypoint of the human body in an image; it is the basis of many computer vision applications such as action recognition and human-computer interaction.

At present, video human keypoint detection generally uses one of two approaches, "bottom-up" or "top-down". Both, however, simply decompose the video into individual frames and process them one by one with a single-frame algorithm. Because the temporal information between frames is not exploited, the accuracy of human keypoint detection is low.
Summary of the invention
The purpose of the present invention is to provide a method, apparatus, computer device and readable storage medium for detecting human keypoints in video, which can effectively improve the accuracy of human keypoint detection in video.
The purpose of the present invention is achieved through the following technical solutions:
A method for detecting human keypoints in video, the method comprising:

extracting multiple frames of images to be detected from a video to be detected;

obtaining the optical flow fields between the multiple frames of images to be detected, and obtaining a feature map of each image to be detected;

fusing the feature maps according to the optical flow fields to obtain an enhanced feature map of the image to be detected;

inputting the enhanced feature map into a preset neural network to obtain the human keypoints in the image to be detected.
In one embodiment, the images to be detected include a current frame image and at least one historical frame image, and the step of extracting multiple frames of images to be detected from the video to be detected comprises:

extracting the current frame image from the video to be detected;

extracting at least one historical frame image from the video to be detected, the capture time of the historical frame image being before, and adjacent to, that of the current frame image.
In one embodiment, the step of obtaining the optical flow fields between the multiple frames of images to be detected and obtaining the feature map of each image to be detected comprises:

obtaining the optical flow field between the current frame image and the historical frame image;

obtaining the current feature map of the current frame image, and obtaining the history feature map of the historical frame image.
In one embodiment, the step of obtaining the optical flow field between the current frame image and the historical frame image comprises:

inputting the current frame image and the historical frame image into a preset neural optical flow network to obtain the optical flow field between the current frame image and the historical frame image.
In one embodiment, the step of fusing the feature maps according to the optical flow fields to obtain the enhanced feature map of the image to be detected comprises:

warping the history feature map into alignment with the current feature map according to the optical flow field to obtain an aligned feature map;

performing temporal fusion of the aligned feature map and the current feature map to obtain the enhanced feature map.
In one embodiment, the step of obtaining the optical flow field between the current frame image and the historical frame image comprises:

inputting the current frame image and the historical frame image into a preset neural optical flow network to obtain the optical flow field between the current frame image and the historical frame image, and also obtaining a scale field, wherein the scale field has the same dimensions as the feature map.
In one embodiment, the step of fusing the feature maps according to the optical flow fields to obtain the enhanced feature map of the image to be detected comprises:

warping the history feature map into alignment with the current feature map according to the optical flow field to obtain an aligned feature map;

multiplying the aligned feature map by the scale field to obtain a refined feature map;

performing temporal fusion of the refined feature map and the current feature map to obtain the enhanced feature map.
An apparatus for detecting human keypoints in video, the apparatus comprising:

an image extraction module, configured to extract multiple frames of images to be detected from a video to be detected;

an optical flow feature extraction module, configured to obtain the optical flow fields between the multiple frames of images to be detected and to obtain a feature map of each image to be detected;

an image enhancement module, configured to fuse the feature maps according to the optical flow fields to obtain an enhanced feature map of the image to be detected;

a keypoint detection module, configured to input the enhanced feature map into a preset neural network to obtain the human keypoints in the image to be detected.
A computer device, comprising a memory and a processor, the memory storing a computer program which, when executed by the processor, implements the steps above.

A computer-readable storage medium on which a computer program is stored, the computer program implementing the steps above when executed by a processor.
The method for detecting human keypoints in video provided by the present invention extracts multiple frames of images to be detected from a video to be detected; obtains the optical flow fields between the multiple frames of images to be detected and the feature map of each image to be detected; fuses the feature maps according to the optical flow fields to obtain an enhanced feature map of the image to be detected; and inputs the enhanced feature map into a preset neural network to obtain the human keypoints in the image to be detected. By extracting the optical flow between frames, the features of the image to be detected are enhanced, which in turn improves the accuracy of keypoint detection in video.
Brief description of the drawings
Fig. 1 is a diagram of the application environment of the method for detecting human keypoints in video in one embodiment;

Fig. 2 is a flow diagram of the method for detecting human keypoints in video in one embodiment;

Fig. 3 is a flow diagram of the method for detecting human keypoints in video in another embodiment;

Fig. 4 is a structural block diagram of the apparatus for detecting human keypoints in video in another embodiment;

Fig. 5 is a diagram of the internal structure of the computer device in one embodiment.
Detailed description of the embodiments
To make the objectives, technical solutions and advantages of the present invention clearer, the invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here only serve to explain the present invention and do not limit its scope of protection.
The method for detecting human keypoints in video provided by the present application can be applied in the application environment shown in Fig. 1. This environment includes a server 104 and a camera device 102. The server 104 obtains a video to be detected from the camera device 102 and extracts multiple frames of images to be detected from it; the server 104 obtains the optical flow fields between the multiple frames of images to be detected and the feature map of each image to be detected; the server 104 fuses the feature maps according to the optical flow fields to obtain an enhanced feature map of the image to be detected; the server 104 inputs the enhanced feature map into a preset neural network to obtain the human keypoints in the image to be detected. The server may be implemented as a standalone server or as a cluster of multiple servers; the camera device may be any device with a camera function, such as a camera, a webcam or a mobile phone.
In one embodiment, as shown in Fig. 2, a method for detecting human keypoints in video is provided. Taking its application to the server in Fig. 1 as an example, the method comprises the following steps:
Step S202: extract multiple frames of images to be detected from the video to be detected.

In this step, the images to be detected include a current frame image and at least one historical frame image.

In a specific implementation, the extraction of multiple frames of images to be detected in step S202 comprises:

1) extracting the current frame image from the video to be detected;

2) extracting at least one historical frame image from the video to be detected, the capture time of the historical frame image being before, and adjacent to, that of the current frame image.

For example, three adjacent frames can be extracted, taking the last frame as the current frame image and the preceding two frames as historical frame images.
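As a minimal illustration of this frame selection (the function name and the three-frame window are assumptions for the example, not limitations of the method):

```python
def select_frames(frames, current_index, num_history=2):
    """Pick the current frame plus the adjacent preceding frames.

    frames: decoded video frames in temporal order.
    current_index: index of the frame in which keypoints are detected.
    num_history: how many adjacent earlier frames to use as history.
    """
    start = max(0, current_index - num_history)
    history = frames[start:current_index]   # the frames just before the current one
    current = frames[current_index]
    return current, history

# Toy "video" of five frames labelled by index: the last frame is the
# current frame image, the two before it are the historical frame images.
video = ["f0", "f1", "f2", "f3", "f4"]
current, history = select_frames(video, current_index=4)
```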
Step S204: obtain the optical flow fields between the multiple frames of images to be detected, and obtain the feature map of each image to be detected.

In this step, optical flow estimation is a method of computing the motion of objects from the changes in surface appearance, shape and so on observed between two moments. Optical flow characterizes the motion information between two images: it reflects the instantaneous velocity with which each pixel of the previous frame moves into the next frame.

In this embodiment, inter-frame optical flow estimation is carried out with the FlowNet2-S network.
As shown in Fig. 3, in one embodiment, the step S204 of obtaining the optical flow fields between the multiple frames of images to be detected and obtaining the feature map of each image to be detected comprises:

Step S410: obtain the optical flow field between the current frame image and the historical frame image;

Step S420: obtain the current feature map of the current frame image, and obtain the history feature map of the historical frame image.
In one embodiment, the step S410 of obtaining the optical flow field between the current frame image and the historical frame image may comprise: inputting the current frame image and the historical frame image into a preset neural optical flow network to obtain the optical flow field between the current frame image and the historical frame image.
Specifically, M_{i→k} denotes the two-dimensional optical flow field from the i-th frame to the k-th frame computed by FlowNet2-S. Suppose a pixel at position p in the i-th frame moves to position q in the k-th frame; then q = p + δp, where δp = M_{i→k}(p). Before feature alignment, the optical flow must be scaled by bilinear interpolation to the size of the feature map. Since the δp above is usually fractional, feature alignment is realized by formula (1):

f'_c(p) = Σ_q G(q, p + δp) · f_c(q)    (1)

where c denotes one channel of the feature map f, q traverses every coordinate of the feature map, and G is the bilinear interpolation kernel. Since G is two-dimensional, it can be decomposed into the product of two one-dimensional kernels, as in formula (2):

G(q, p + δp) = g(q_x, p_x + δp_x) · g(q_y, p_y + δp_y)    (2)

where g(a, b) = max(0, 1 - |a - b|). Because only very few terms of the sum in formula (1) are non-zero, the computation is fast.
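Formulas (1) and (2) describe a bilinear warp of each feature channel along the flow. The following is a minimal NumPy sketch of that warp (the function name, array layout and the toy flow fields are illustrative assumptions, not part of the disclosure):

```python
import numpy as np

def warp_feature(feat, flow):
    """Warp a feature map along a flow field, per formulas (1)-(2).

    feat: (C, H, W) feature map f.
    flow: (2, H, W) displacement field; flow[:, y, x] = (dx, dy) = delta p.
    The output at (y, x) samples feat bilinearly at (x + dx, y + dy):
    only the four integer neighbours of the fractional target contribute,
    each weighted by g(a, b) = max(0, 1 - |a - b|) per axis.
    """
    C, H, W = feat.shape
    out = np.zeros_like(feat, dtype=float)
    for y in range(H):
        for x in range(W):
            sx = x + flow[0, y, x]                    # p_x + delta p_x
            sy = y + flow[1, y, x]                    # p_y + delta p_y
            x0, y0 = int(np.floor(sx)), int(np.floor(sy))
            for qy in (y0, y0 + 1):                   # the only non-zero terms
                for qx in (x0, x0 + 1):
                    if 0 <= qx < W and 0 <= qy < H:
                        g = max(0.0, 1 - abs(qx - sx)) * max(0.0, 1 - abs(qy - sy))
                        out[:, y, x] += g * feat[:, qy, qx]
    return out

feat = np.arange(12, dtype=float).reshape(1, 3, 4)
identity = warp_feature(feat, np.zeros((2, 3, 4)))   # zero flow: unchanged

shift_flow = np.zeros((2, 3, 4))
shift_flow[0] = 1.0                                  # every pixel moved +1 in x
shifted = warp_feature(feat, shift_flow)             # content shifts; border becomes zero
```

With integer flow the warp reduces to an exact pixel shift; with fractional flow each output blends the four neighbours, which is why δp being a decimal is not a problem.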
In another embodiment, so that the aligned features are more favourable to detection, the step S410 of obtaining the optical flow field between the current frame image and the historical frame image may further comprise: inputting the current frame image and the historical frame image into the preset neural optical flow network to obtain the optical flow field between the current frame image and the historical frame image, and also to obtain a scale field, wherein the scale field has the same dimensions as the feature map.

Specifically, FlowNet2-S outputs not only the optical flow field but, at the same time, a scale field S_{i→k} of the same dimensions as the feature map.
Step S206: fuse the feature maps according to the optical flow fields to obtain the enhanced feature map of the image to be detected.

In this step, feature-map fusion is carried out with a GRU (Gated Recurrent Unit); specifically, temporal feature fusion uses the convolutional form of the GRU, i.e. ConvGRU.
In one embodiment, when only the optical flow field is obtained in step S410, the step S206 of fusing the feature maps according to the optical flow field to obtain the enhanced feature map of the image to be detected comprises:

1) warping the history feature map into alignment with the current feature map according to the optical flow field to obtain an aligned feature map;

2) performing temporal fusion of the aligned feature map and the current feature map to obtain the enhanced feature map.

Specifically, inside a GRU unit the input is processed according to formulas (3). The new state h_t of the GRU unit is a weighted sum of the preceding state h_{t-1} and the memory state h'_t. The update gate z_t decides how much of the memory state is used in computing the new state h_t, while the reset gate r_t controls how strongly the preceding state h_{t-1} influences the memory state. Unlike the fully connected GRU, * here denotes convolution, ⊙ denotes element-wise multiplication, σ is the sigmoid function, the w are weights to be learned, and the b are bias terms:

z_t = σ(x_t * w_xz + h_{t-1} * w_hz + b_z),
r_t = σ(x_t * w_xr + h_{t-1} * w_hr + b_r),
h'_t = tanh(x_t * w_xh + (r_t ⊙ h_{t-1}) * w_hh + b_h),
h_t = (1 - z_t) ⊙ h_{t-1} + z_t ⊙ h'_t    (3)
In another embodiment, when both the optical flow field and the scale field are obtained in step S410, the step S206 of fusing the feature maps according to the optical flow field to obtain the enhanced feature map of the image to be detected comprises:

1) warping the history feature map into alignment with the current feature map according to the optical flow field to obtain an aligned feature map;

2) multiplying the aligned feature map by the scale field to obtain a refined feature map; specifically, the scale field S_{i→k} is multiplied element-wise with the spatially aligned feature map to obtain the refined feature map;

3) performing temporal fusion of the refined feature map and the current feature map to obtain the enhanced feature map.
Step S208: input the enhanced feature map into the preset neural network to obtain the human keypoints in the image to be detected.

In this step, image-based human keypoint detection is carried out with Mask R-CNN. The network structure of Mask R-CNN is composed of three main parts: a feature extraction network at the bottom, a candidate-box generation network in the middle, and task-specific sub-networks at the head.

The bottom feature extraction network extracts rich features from the image: its input is the original image and its output is a feature map. To extract better features, the VGG network used in Faster R-CNN is replaced with a residual network, whose feature representation ability is stronger. Meanwhile, because targets of different sizes and scales often appear in an image, detecting from a feature map of only a single scale easily causes missed detections. With ResNet as the backbone, shallow features have high resolution but a low semantic level, while deep features have a high semantic level but low resolution. Using an FPN as the backbone network fuses the information of different scales, and the multi-scale feature maps it outputs are of great significance for the subsequent target detection, semantic segmentation and keypoint detection.

The intermediate candidate-box generation network distinguishes targets from the background and generates target candidate boxes, according to which the feature map is then cropped. The method used in Faster R-CNN is RoI Pooling, which maps a region of the original image onto the convolutional feature region and finally pools it to a fixed size, normalizing the size of the region to the input size of the convolutional network. Mask R-CNN instead uses a RoIAlign layer to calibrate the extracted features against the input. It avoids quantizing the boundaries or bins of each RoI, and instead uses bilinear interpolation to compute the input feature values at four fixed sampling locations within each RoI bin and aggregates the results. The RoIAlign layer finally outputs a 7 × 7 feature map to the subsequent sub-task networks.

The task-specific sub-network sits at the head; for human keypoint detection it consists of eight 3 × 3 convolutional layers. Since the accuracy of keypoint detection is very sensitive to the resolution of the feature map, a cascade of one deconvolution layer and one bilinear interpolation layer brings the scale of the final output to 56 × 56.
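The RoIAlign computation described above can be sketched as follows: each bin of the RoI is sampled at four regularly spaced locations by bilinear interpolation and the samples are averaged, avoiding any quantization of the RoI boundaries (the exact sampling grid here is an assumption; implementations differ in detail):

```python
import numpy as np

def bilinear(feat, y, x):
    """Sample feat (H, W) at a fractional location (y, x) by bilinear interpolation."""
    H, W = feat.shape
    y0, x0 = int(np.floor(y)), int(np.floor(x))
    val = 0.0
    for qy in (y0, y0 + 1):
        for qx in (x0, x0 + 1):
            if 0 <= qy < H and 0 <= qx < W:
                val += max(0, 1 - abs(qy - y)) * max(0, 1 - abs(qx - x)) * feat[qy, qx]
    return val

def roi_align(feat, roi, out_size=7):
    """Crop roi = (y1, x1, y2, x2) from feat into an out_size x out_size map.

    Coordinates stay fractional throughout: each output bin averages
    2 x 2 bilinear samples instead of rounding the RoI boundaries.
    """
    y1, x1, y2, x2 = roi
    bh, bw = (y2 - y1) / out_size, (x2 - x1) / out_size
    out = np.zeros((out_size, out_size))
    for i in range(out_size):
        for j in range(out_size):
            samples = [bilinear(feat, y1 + (i + fy) * bh, x1 + (j + fx) * bw)
                       for fy in (0.25, 0.75) for fx in (0.25, 0.75)]
            out[i, j] = np.mean(samples)
    return out

# A constant feature map must come back constant: no quantization artefacts.
feat = np.ones((16, 16))
crop = roi_align(feat, (2.3, 3.1, 11.8, 12.6))
```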
The above method for detecting human keypoints in video extracts multiple frames of images to be detected from a video to be detected; obtains the optical flow fields between the multiple frames of images to be detected and the feature map of each image to be detected; fuses the feature maps according to the optical flow fields to obtain an enhanced feature map of the image to be detected; and inputs the enhanced feature map into a preset neural network to obtain the human keypoints in the image to be detected. By extracting the optical flow between frames, the features of the image to be detected are enhanced, which in turn improves the accuracy of keypoint detection in video.
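Putting steps S202 through S208 together, the overall control flow can be sketched as follows (every callable argument is a placeholder standing in for one of the networks or operations described above; this illustrates only the orchestration, not the actual models):

```python
def detect_keypoints(frames, current_index, flow_net, warp, backbone, conv_gru, head):
    """Orchestration of steps S202-S208 with placeholder callables."""
    # S202: current frame image plus the adjacent historical frame images.
    current = frames[current_index]
    history = frames[max(0, current_index - 2):current_index]
    # S204: feature map of the current frame.
    current_feat = backbone(current)
    state = None
    for hist in history:
        flow = flow_net(hist, current)        # S204: optical flow field
        aligned = warp(backbone(hist), flow)  # align the history feature map
        state = conv_gru(state, aligned)      # S206: temporal fusion
    enhanced = conv_gru(state, current_feat)  # S206: enhanced feature map
    return head(enhanced)                     # S208: human keypoints

# Toy stand-ins, purely to exercise the control flow.
flow_net = lambda a, b: 0
warp = lambda feat, flow: feat + flow
backbone = lambda img: img * 10
conv_gru = lambda state, feat: feat if state is None else (state + feat) / 2
head = lambda feat: ("keypoints", feat)

result = detect_keypoints([1, 2, 3], 2, flow_net, warp, backbone, conv_gru, head)
```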
As shown in Fig. 4, Fig. 4 is a structural diagram of the apparatus for detecting human keypoints in video in one embodiment. This embodiment provides an apparatus for detecting human keypoints in video, comprising an image extraction module 401, an optical flow feature extraction module 402, an image enhancement module 403 and a keypoint detection module 404, in which:
the image extraction module 401 is configured to extract multiple frames of images to be detected from a video to be detected;

the optical flow feature extraction module 402 is configured to obtain the optical flow fields between the multiple frames of images to be detected and to obtain the feature map of each image to be detected;

the image enhancement module 403 is configured to fuse the feature maps according to the optical flow fields to obtain an enhanced feature map of the image to be detected;

the keypoint detection module 404 is configured to input the enhanced feature map into a preset neural network to obtain the human keypoints in the image to be detected.
For the specific limitations of the apparatus for detecting human keypoints in video, refer to the limitations of the method above; they are not repeated here. Each module in the above apparatus may be realized wholly or partly by software, hardware or a combination of the two. The modules may be embedded in, or independent of, the processor of a computer device in hardware form, or stored in the memory of the computer device in software form, so that the processor can invoke them to execute the operations corresponding to each module.
As shown in Fig. 5, Fig. 5 is a schematic diagram of the internal structure of the computer device in one embodiment. The computer device includes a processor, a non-volatile storage medium, a memory and a network interface connected through a system bus. The non-volatile storage medium of the computer device stores an operating system, a database and computer-readable instructions; the database may store a control information sequence, and the computer-readable instructions, when executed by the processor, cause the processor to implement a method for detecting human keypoints in video. The processor of the computer device provides the computing and control capability that supports the operation of the entire device. Computer-readable instructions may be stored in the memory of the computer device and, when executed by the processor, cause the processor to execute the method for detecting human keypoints in video. The network interface of the computer device is used to communicate with a terminal via a network connection. Those skilled in the art will understand that the structure shown in Fig. 5 is only a block diagram of the part of the structure relevant to the solution of the present application and does not limit the computer device to which the solution is applied; a specific computer device may include more or fewer components than shown in the figure, combine certain components, or have a different arrangement of components.
In one embodiment, a computer device is proposed, comprising a memory, a processor, and a computer program stored in the memory and runnable on the processor; when executing the computer program, the processor implements the following steps: extracting multiple frames of images to be detected from a video to be detected; obtaining the optical flow fields between the multiple frames of images to be detected and the feature map of each image to be detected; fusing the feature maps according to the optical flow fields to obtain an enhanced feature map of the image to be detected; and inputting the enhanced feature map into a preset neural network to obtain the human keypoints in the image to be detected.
In one of the embodiments, when the processor executes the computer program, the images to be detected include a current frame image and at least one historical frame image, and the step of extracting multiple frames of images to be detected from the video to be detected comprises: extracting the current frame image from the video to be detected; and extracting at least one historical frame image from the video to be detected, the capture time of the historical frame image being before, and adjacent to, that of the current frame image.
In one of the embodiments, when the processor executes the computer program, the step of obtaining the optical flow fields between the multiple frames of images to be detected and obtaining the feature map of each image to be detected comprises: obtaining the optical flow field between the current frame image and the historical frame image; obtaining the current feature map of the current frame image; and obtaining the history feature map of the historical frame image.
In one of the embodiments, when the processor executes the computer program, the step of obtaining the optical flow field between the current frame image and the historical frame image comprises: inputting the current frame image and the historical frame image into a preset neural optical flow network to obtain the optical flow field between the current frame image and the historical frame image.
In one of the embodiments, when the processor executes the computer program, the step of fusing the feature maps according to the optical flow field to obtain the enhanced feature map of the image to be detected comprises: warping the history feature map into alignment with the current feature map according to the optical flow field to obtain an aligned feature map; and performing temporal fusion of the aligned feature map and the current feature map to obtain the enhanced feature map.
In one of the embodiments, when the processor executes the computer program, the step of obtaining the optical flow field between the current frame image and the historical frame image comprises: inputting the current frame image and the historical frame image into a preset neural optical flow network to obtain the optical flow field between the current frame image and the historical frame image, and also obtaining a scale field, wherein the scale field has the same dimensions as the feature map.
In one of the embodiments, when the processor executes the computer program, the step of fusing the feature maps according to the optical flow field to obtain the enhanced feature map of the image to be detected comprises: warping the history feature map into alignment with the current feature map according to the optical flow field to obtain an aligned feature map; multiplying the aligned feature map by the scale field to obtain a refined feature map; and performing temporal fusion of the refined feature map and the current feature map to obtain the enhanced feature map.
In one embodiment, a storage medium storing computer-readable instructions is proposed; when the computer-readable instructions are executed by one or more processors, the one or more processors execute the following steps: extracting multiple frames of images to be detected from a video to be detected; obtaining the optical flow fields between the multiple frames of images to be detected and the feature map of each image to be detected; fusing the feature maps according to the optical flow fields to obtain an enhanced feature map of the image to be detected; and inputting the enhanced feature map into a preset neural network to obtain the human keypoints in the image to be detected.
In one of the embodiments, when the computer-readable instructions are executed by the processor, the images to be detected include a current frame image and at least one historical frame image, and the step of extracting multiple frames of images to be detected from the video to be detected comprises: extracting the current frame image from the video to be detected; and extracting at least one historical frame image from the video to be detected, the capture time of the historical frame image being before, and adjacent to, that of the current frame image.
In one of the embodiments, when the computer-readable instructions are executed by the processor, the step of obtaining the optical flow fields between the multiple frames of images to be detected and obtaining the feature map of each image to be detected comprises: obtaining the optical flow field between the current frame image and the historical frame image; obtaining the current feature map of the current frame image; and obtaining the history feature map of the historical frame image.
In one of the embodiments, when the computer-readable instructions are executed by the processor, the step of obtaining the optical flow field between the current frame image and the historical frame image comprises: inputting the current frame image and the historical frame image into a preset neural optical flow network to obtain the optical flow field between the current frame image and the historical frame image.
In one of the embodiments, when the computer-readable instructions are executed by the processor, the step of fusing the feature maps according to the optical flow field to obtain the enhanced feature map of the image to be detected comprises: aligning the history feature map to the current feature map according to the optical flow field to obtain an aligned feature map; and performing temporal fusion on the aligned feature map and the current feature map to obtain the enhanced feature map.
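The align-then-fuse step (warping the history feature map to the current frame along the optical flow field, then temporally fusing it with the current feature map) can be illustrated as below. This is a minimal sketch, not the patent's implementation: nearest-neighbour backward warping stands in for the differentiable bilinear sampling a real pipeline would use, plain averaging stands in for the temporal fusion, and the function names are hypothetical:

```python
import numpy as np

def warp_features(history_feat, flow):
    """Backward-warp a (C, H, W) history feature map onto the current
    frame: each output location (y, x) samples the history features at
    (y + flow_y, x + flow_x). Nearest-neighbour sampling keeps the
    sketch short; real pipelines typically use bilinear sampling."""
    _, h, w = history_feat.shape
    ys, xs = np.mgrid[0:h, 0:w]
    src_y = np.clip(np.round(ys + flow[1]).astype(int), 0, h - 1)
    src_x = np.clip(np.round(xs + flow[0]).astype(int), 0, w - 1)
    return history_feat[:, src_y, src_x]

def temporal_fusion(aligned_feat, current_feat):
    """Plain averaging as a stand-in for temporal fusion; weighted or
    learned fusion would slot in here instead."""
    return 0.5 * (aligned_feat + current_feat)

# A single feature that moved one pixel to the right between frames
history_feat = np.zeros((1, 4, 4), dtype=np.float32)
history_feat[0, 1, 1] = 1.0
current_feat = np.zeros((1, 4, 4), dtype=np.float32)
current_feat[0, 1, 2] = 1.0
flow = np.zeros((2, 4, 4), dtype=np.float32)
flow[0] = -1.0  # current (y, x) corresponds to history (y, x - 1)
aligned = warp_features(history_feat, flow)
enhanced = temporal_fusion(aligned, current_feat)
```

After warping, the history feature lands at the same position as the current one, so the fused map reinforces the feature instead of smearing it.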
In one of the embodiments, when the computer-readable instructions are executed by the processor, the step of obtaining the optical flow field between the current frame image and the historical frame image comprises: inputting the current frame image and the historical frame image into a preset neural optical flow network to obtain the optical flow field between the current frame image and the historical frame image, and also obtaining a scale field, wherein the scale field has the same dimensions as the feature map.
In one of the embodiments, when the computer-readable instructions are executed by the processor, the step of fusing the feature maps according to the optical flow field to obtain the enhanced feature map of the image to be detected comprises: aligning the history feature map to the current feature map according to the optical flow field to obtain an aligned feature map; multiplying the aligned feature map by the scale field to obtain a refined feature map; and performing temporal fusion on the refined feature map and the current feature map to obtain the enhanced feature map.
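The scale-field refinement described above (elementwise multiplication of the aligned feature map by a scale field of matching dimensions before temporal fusion) can be sketched as follows; the scale values and function name are hypothetical, and averaging again stands in for whatever temporal fusion the preset network uses:

```python
import numpy as np

def refine_with_scale(aligned_feat, scale_field):
    """Elementwise multiplication of the aligned feature map by the
    scale field, which has the same dimensions as the feature map; the
    scale field can down-weight regions where flow-based alignment is
    unreliable."""
    assert aligned_feat.shape == scale_field.shape
    return aligned_feat * scale_field

aligned_feat = np.ones((1, 4, 4), dtype=np.float32)
scale_field = np.full((1, 4, 4), 0.5, dtype=np.float32)  # hypothetical confidence values
refined = refine_with_scale(aligned_feat, scale_field)
current_feat = np.ones((1, 4, 4), dtype=np.float32)
enhanced = 0.5 * (refined + current_feat)  # averaging stands in for temporal fusion
```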
It should be understood that although the steps in the flowcharts of the drawings are shown sequentially as indicated by the arrows, these steps are not necessarily executed in the order indicated. Unless explicitly stated otherwise herein, the execution of these steps is not strictly ordered, and they may be executed in other orders. Moreover, at least some of the steps in the flowcharts of the drawings may include multiple sub-steps or stages, which are not necessarily completed at the same moment but may be executed at different times, and whose execution order is not necessarily sequential; they may be executed in turn or alternately with at least part of the sub-steps or stages of other steps.
The above are only some embodiments of the present invention. It should be noted that, for those of ordinary skill in the art, various improvements and modifications may be made without departing from the principle of the present invention, and these improvements and modifications shall also be regarded as falling within the protection scope of the present invention.
Claims (10)
1. A video human body key point detection method, characterized in that the method comprises:
extracting multiple frames of images to be detected from a video to be detected;
obtaining optical flow fields between the multiple frames of images to be detected, and obtaining a feature map of each of the images to be detected;
fusing the feature maps according to the optical flow fields to obtain an enhanced feature map of the image to be detected; and
inputting the enhanced feature map into a preset neural network to obtain human body key points in the image to be detected.
2. The method according to claim 1, characterized in that the image to be detected comprises a current frame image and at least one historical frame image, and the step of extracting multiple frames of images to be detected from the video to be detected comprises:
extracting the current frame image from the video to be detected; and
extracting at least one historical frame image from the video to be detected, wherein the capture time of the historical frame image is before, and adjacent to, that of the current frame image.
3. The method according to claim 2, characterized in that the step of obtaining the optical flow fields between the multiple frames of images to be detected and obtaining the feature map of each of the images to be detected comprises:
obtaining the optical flow field between the current frame image and the historical frame image; and
obtaining a current feature map of the current frame image and a history feature map of the historical frame image.
4. The method according to claim 3, characterized in that the step of obtaining the optical flow field between the current frame image and the historical frame image comprises:
inputting the current frame image and the historical frame image into a preset neural optical flow network to obtain the optical flow field between the current frame image and the historical frame image.
5. The method according to claim 4, characterized in that the step of fusing the feature maps according to the optical flow field to obtain the enhanced feature map of the image to be detected comprises:
aligning the history feature map to the current feature map according to the optical flow field to obtain an aligned feature map; and
performing temporal fusion on the aligned feature map and the current feature map to obtain the enhanced feature map.
6. The method according to claim 3, characterized in that the step of obtaining the optical flow field between the current frame image and the historical frame image comprises:
inputting the current frame image and the historical frame image into a preset neural optical flow network to obtain the optical flow field between the current frame image and the historical frame image, and also obtaining a scale field, wherein the scale field has the same dimensions as the feature map.
7. The method according to claim 6, characterized in that the step of fusing the feature maps according to the optical flow field to obtain the enhanced feature map of the image to be detected comprises:
aligning the history feature map to the current feature map according to the optical flow field to obtain an aligned feature map;
multiplying the aligned feature map by the scale field to obtain a refined feature map; and
performing temporal fusion on the refined feature map and the current feature map to obtain the enhanced feature map.
8. A video human body key point detection device, characterized in that the device comprises:
an image extraction module, configured to extract multiple frames of images to be detected from a video to be detected;
an optical flow feature extraction module, configured to obtain optical flow fields between the multiple frames of images to be detected and obtain a feature map of each of the images to be detected;
an image enhancement module, configured to fuse the feature maps according to the optical flow fields to obtain an enhanced feature map of the image to be detected; and
a key point detection module, configured to input the enhanced feature map into a preset neural network to obtain human body key points in the image to be detected.
9. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor, when executing the computer program, implements the steps of the method according to any one of claims 1 to 7.
10. A computer-readable storage medium on which a computer program is stored, characterized in that the computer program, when executed by a processor, implements the steps of the method according to any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910276687.4A CN109977912B (en) | 2019-04-08 | 2019-04-08 | Video human body key point detection method and device, computer equipment and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109977912A true CN109977912A (en) | 2019-07-05 |
CN109977912B CN109977912B (en) | 2021-04-16 |
Family
ID=67083370
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910276687.4A Active CN109977912B (en) | 2019-04-08 | 2019-04-08 | Video human body key point detection method and device, computer equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109977912B (en) |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8116624B1 (en) * | 2007-01-29 | 2012-02-14 | Cirrex Systems Llc | Method and system for evaluating an optical device |
CN106529419A (en) * | 2016-10-20 | 2017-03-22 | 北京航空航天大学 | Automatic detection method for significant stack type polymerization object in video |
CN108229336A (en) * | 2017-12-13 | 2018-06-29 | 北京市商汤科技开发有限公司 | Video identification and training method and device, electronic equipment, program and medium |
CN108242062A (en) * | 2017-12-27 | 2018-07-03 | 北京纵目安驰智能科技有限公司 | Method for tracking target, system, terminal and medium based on depth characteristic stream |
CN108776974A (en) * | 2018-05-24 | 2018-11-09 | 南京行者易智能交通科技有限公司 | A kind of real-time modeling method method suitable for public transport scene |
CN109117701A (en) * | 2018-06-05 | 2019-01-01 | 东南大学 | Pedestrian's intension recognizing method based on picture scroll product |
CN109508643A (en) * | 2018-10-19 | 2019-03-22 | 北京陌上花科技有限公司 | Image processing method and device for porny |
Cited By (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112199978A (en) * | 2019-07-08 | 2021-01-08 | 北京地平线机器人技术研发有限公司 | Video object detection method and device, storage medium and electronic equipment |
CN110853074A (en) * | 2019-10-09 | 2020-02-28 | 天津大学 | Video target detection network system for enhancing target by utilizing optical flow |
CN110853074B (en) * | 2019-10-09 | 2023-06-27 | 天津大学 | Video target detection network system for enhancing targets by utilizing optical flow |
CN111160237A (en) * | 2019-12-27 | 2020-05-15 | 智车优行科技(北京)有限公司 | Head pose estimation method and apparatus, electronic device, and storage medium |
CN111914756A (en) * | 2020-08-03 | 2020-11-10 | 北京环境特性研究所 | Video data processing method and device |
CN112053327A (en) * | 2020-08-18 | 2020-12-08 | 南京理工大学 | Video target detection method and system, storage medium and server |
CN112053327B (en) * | 2020-08-18 | 2022-08-23 | 南京理工大学 | Video target detection method and system, storage medium and server |
CN113901909A (en) * | 2021-09-30 | 2022-01-07 | 北京百度网讯科技有限公司 | Video-based target detection method and device, electronic equipment and storage medium |
CN113901909B (en) * | 2021-09-30 | 2023-10-27 | 北京百度网讯科技有限公司 | Video-based target detection method and device, electronic equipment and storage medium |
CN115909508A (en) * | 2023-01-06 | 2023-04-04 | 浙江大学计算机创新技术研究院 | Image key point enhancement detection method under single-person sports scene |
Also Published As
Publication number | Publication date |
---|---|
CN109977912B (en) | 2021-04-16 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109977912A (en) | Video human body key point detection method and apparatus, computer equipment and storage medium | |
Chen et al. | Learning spatial attention for face super-resolution | |
CN112733797B (en) | Method, device and equipment for correcting sight of face image and storage medium | |
WO2020134818A1 (en) | Image processing method and related product | |
Nasrollahi et al. | Extracting a good quality frontal face image from a low-resolution video sequence | |
CN111160164A (en) | Action recognition method based on human body skeleton and image fusion | |
CN111428664B (en) | Computer vision real-time multi-person gesture estimation method based on deep learning technology | |
WO2020233427A1 (en) | Method and apparatus for determining features of target | |
CN112580521B (en) | Multi-feature true and false video detection method based on MAML (maximum likelihood markup language) element learning algorithm | |
Zhou et al. | A lightweight hand gesture recognition in complex backgrounds | |
KR102551835B1 (en) | Active interaction method, device, electronic equipment and readable storage medium | |
CN110853039B (en) | Sketch image segmentation method, system and device for multi-data fusion and storage medium | |
CN111914756A (en) | Video data processing method and device | |
Vieriu et al. | On HMM static hand gesture recognition | |
CN112712019A (en) | Three-dimensional human body posture estimation method based on graph convolution network | |
CN116092178A (en) | Gesture recognition and tracking method and system for mobile terminal | |
Hua et al. | Dynamic scene deblurring with continuous cross-layer attention transmission | |
CN110021036A (en) | Infrared target detection method, apparatus, computer equipment and storage medium | |
CN116309983A (en) | Training method and generating method and device of virtual character model and electronic equipment | |
CN109492755B (en) | Image processing method, image processing apparatus, and computer-readable storage medium | |
CN111476868B (en) | Animation generation model training and animation generation method and device based on deep learning | |
Xiong et al. | Extraction of hand gestures with adaptive skin color models and its applications to meeting analysis | |
Qian et al. | Multi-Scale tiny region gesture recognition towards 3D object manipulation in industrial design | |
TWI734297B (en) | Multi-task object recognition system sharing multi-range features | |
KR102591082B1 (en) | Method and apparatus for creating deep learning-based synthetic video contents |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||