CN107180224A - Finger motion detection and localization method based on spatio-temporal filtering and joint space Kmeans
- Publication number: CN107180224A
- Application number: CN201710231824.3A
- Authority: CN (China)
- Prior art keywords: kmeans, finger, frame, space, video
- Legal status: Granted (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/20—Movements or behaviour, e.g. gesture recognition
- G06V40/28—Recognition of hand or arm movements, e.g. recognition of deaf sign language
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
- G06F18/232—Non-hierarchical techniques
- G06F18/2321—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
- G06F18/23213—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
- G06V20/41—Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
- G06V20/42—Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items of sport video content
Abstract
The present invention discloses a finger motion detection and localization method based on spatio-temporal filtering and joint space Kmeans. First, labels in ten different colors (excluding black and white) are attached to the player's fingers, and a video of the player's fingers playing a keyboard instrument is shot. Then, for each input video frame, the moving finger targets are detected by spatio-temporal filtering, with the spatial-domain filtering result fed back to guide the dynamic background update. The finger targets are located with joint space Kmeans, in which the number of clusters and the initial class centers are determined adaptively from the statistical properties of the R, G, and B histograms. This achieves fingering recognition and recording with low computational complexity, fast convergence, high localization accuracy, and good real-time performance.
Description
Technical field
The present invention relates to technical fields such as visual monitoring and digital image processing, and in particular to a finger motion detection and localization method based on spatio-temporal filtering and joint space Kmeans.
Background art
Correct fingering is essential for a player of the piano (or another keyboard instrument) to perform fluently and to interpret the music. Good fingering reflects the player's understanding and interpretation of the composer's style and of the content of the work, and it also saves effort and time and makes playing more efficient. Although fingering follows general rules, it varies from piece to piece, which makes fingering practice harder for beginners and makes it harder to imitate the fingering of a virtuoso. Recording fingering by hand requires considerable musical training and is time-consuming and laborious. Automatic, intelligent fingering recognition by machine has therefore become an inevitable trend in fingering research and learning.
The key to fingering recognition is the combination of moving-target detection and moving-target localization.
Commonly used moving-target detection methods include background modeling, frame differencing, and optical flow.
1) Background modeling: a static scene without intruding objects is assumed to have some regular properties, and the background is modeled by a weighted mixture of statistical models. Once the background model is known, intruding objects can be detected by marking the parts of the scene image that do not fit the model. Common background modeling methods include the single Gaussian model, the Gaussian mixture model, and kernel density estimation. Although these methods yield fairly accurate moving-target regions, they are computationally expensive and relatively slow, and they are sensitive to changes in illumination and background.
2) Frame differencing: the moving region in the image is extracted from the temporal difference between adjacent frames. Frame differencing is fast and stable, but when the finger moves slowly the moving target's pixel values are very close between two frames, and the region where the target overlaps itself in the two frames cannot be detected.
3) Optical flow: motion is detected from the optical-flow characteristics of the moving target as they change over time. Optical flow needs no background model and can detect independently moving objects without any prior information about the scene. However, it is computationally complex, requires special hardware, and struggles to meet real-time requirements; motion boundaries, motion occlusion, and multiple motions (including transparent and translucent motion) are further bottlenecks of optical flow.
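The slow-motion weakness of frame differencing described in item 2) can be illustrated with a minimal Python sketch. The one-row grayscale frames, the helper names `frame_difference` and `make_frame`, and the threshold of 30 are assumptions made for this illustration and do not come from the patent text.

```python
def frame_difference(prev, curr, threshold=30):
    """Binary mask: 1 where the absolute gray-level change exceeds threshold."""
    return [
        [1 if abs(c - p) > threshold else 0 for p, c in zip(prev_row, curr_row)]
        for prev_row, curr_row in zip(prev, curr)
    ]


def make_frame(width, start, size, value=200):
    """One-row grayscale frame with a bright object of `size` pixels at `start`."""
    return [[value if start <= x < start + size else 0 for x in range(width)]]


# A 4-pixel-wide object moves only 1 pixel to the right between two frames.
prev = make_frame(10, start=2, size=4)
curr = make_frame(10, start=3, size=4)
mask = frame_difference(prev, curr)
print(mask)
```

In this toy case only the trailing and leading edge pixels of the object are flagged; the three pixels where the object overlaps itself in both frames are missed, which is the failure the text describes for slow finger motion.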
Moving-target localization, meanwhile, is usually based on edge detection. Edge detection replaces simplified position information with an accurate representation of the target contour. However, when the fingering is complex or the labels of two or more fingers overlap, edge detection loses a large amount of information and may even merge two moving targets into one. Because edge detection can only locate and cannot classify, the detected contours cannot be matched correctly to fingers. Edge detection is also strongly affected by the background and has no filtering capability, so noise points are detected as well and interfere with finger localization.
In the fingering-recognition scenario, the moving-target detection and localization methods above therefore have various problems, for example missed detection of slow-moving targets, incorrect matching between fingers and located positions because classification is difficult, and severe noise interference. The present invention proposes a finger motion detection and localization method based on spatio-temporal filtering and joint space Kmeans, which recognizes fingering by analyzing video of the player's fingers playing the piano (or another keyboard instrument). Moving-target detection uses spatio-temporal filtering, which overcomes the influence of illumination and background changes and effectively avoids missed detection of slow-moving targets. Moving-target localization uses joint space Kmeans, which makes full use of the statistical properties of the image for adaptive decisions and improves the accuracy of localization and clustering.
Summary of the invention
The purpose of the method is to overcome the shortcomings of existing moving-target detection and localization methods when applied to fingering recognition, by proposing a finger motion detection and localization method based on spatio-temporal filtering and joint space Kmeans.
To achieve this purpose, the method of the present invention consists of three modules: labeling and video shooting, moving-target detection, and moving-target localization.
The labeling and video shooting module generates the video file processed by the subsequent modules. Labels in ten different colors (excluding black and white) are first attached to the player's fingers, and the playing process is then shot as video while the player plays the piano normally.
The moving-target detection module detects moving targets by spatio-temporal filtering. The input video frame is first filtered in the spatial domain to obtain an accurate moving-target region. The spatial-domain filtering result is then fed back to guide the spatial recombination of the temporal band-pass filtering result and the temporal low-pass filtering result at the foreground (finger motion) positions and the background positions, which completes the dynamic background update. This overcomes the influence of illumination changes, camera shake, and background changes, and effectively avoids missed detection when the fingers move slowly. The finger motion detection result is converted from the RGB color space to the YCrCb and HSV color spaces for band-pass filtering, which removes skin color and shadows; the labels are then extracted by a foreground threshold decision.
The specific implementation steps of moving-target detection are shown in Fig. 2.
Step 1: Spatial-domain filtering, comprising the following steps:
1.1 Search for the moving-target region. The current input video frame is compared with the background image point by point, pixel by pixel, in the spatial domain to find the moving-target region.
1.2 Determine the foreground and background. Pixels in the moving-target region are set to the pixels at the corresponding positions of the current input video frame, and pixels in the background region are set to white (in RGB space, white is (255, 255, 255)).
1.3 Feed back the foreground and background. The foreground (moving-target region) and the background are fed back for the background update of the next frame.
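The three sub-steps of Step 1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the source does not say how two pixels are compared, so the criterion used here (largest per-channel absolute difference above a threshold of 40) and the function name `spatial_filter` are hypothetical. Plain nested lists are used to keep the sketch dependency-free.

```python
WHITE = (255, 255, 255)


def spatial_filter(frame, background, threshold=40):
    """Compare a frame with the background pixel by pixel.

    frame, background: lists of rows, each row a list of (R, G, B) tuples.
    Returns (foreground_image, mask). Moving pixels keep their frame value,
    all other pixels are set to white; mask is True at moving pixels.
    """
    foreground_image, mask = [], []
    for frame_row, bg_row in zip(frame, background):
        fg_row, mask_row = [], []
        for pixel, bg_pixel in zip(frame_row, bg_row):
            # Assumed criterion: largest per-channel absolute difference.
            moving = max(abs(p - b) for p, b in zip(pixel, bg_pixel)) > threshold
            mask_row.append(moving)
            fg_row.append(pixel if moving else WHITE)
        foreground_image.append(fg_row)
        mask.append(mask_row)
    return foreground_image, mask


background = [[(10, 10, 10), (10, 10, 10)]]
frame = [[(12, 9, 10), (200, 30, 30)]]
foreground_image, mask = spatial_filter(frame, background)
```

The returned mask is what would be fed back in step 1.3 to tell the background update which positions are foreground and which are background.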
Step 2: Dynamic background update, comprising the following steps:
2.1 Feed back the spatial-domain filtering result. The previous spatial-domain filtering result is fed back to guide the dynamic background update. Determine whether the current input video frame is the 2nd frame. If it is, the background is not updated and the 1st frame is used directly as the background; otherwise, proceed to the next step.
2.2 Spatial recombination. The temporal band-pass filtering result and the temporal low-pass filtering result are recombined in the spatial domain at the foreground (finger motion) positions and the background positions to complete the background update.
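The recombination in Step 2 might look like the following Python sketch. The source does not specify the temporal filters, so the two filter outputs are simply passed in as arguments. The sketch also assumes one reading of step 2.2, namely that the band-pass result is used at foreground positions and the low-pass result at background positions. The first-order low-pass `low_pass` and all function names are hypothetical, and frames are single-channel nested lists for brevity.

```python
def low_pass(previous_output, frame, alpha=0.05):
    """One possible temporal low-pass: per-pixel exponential moving average."""
    return [
        [(1 - alpha) * o + alpha * f for o, f in zip(o_row, f_row)]
        for o_row, f_row in zip(previous_output, frame)
    ]


def recombine(band_pass_result, low_pass_result, foreground_mask):
    """Take band-pass values at foreground positions and low-pass elsewhere."""
    return [
        [bp if fg else lp for bp, lp, fg in zip(bp_row, lp_row, mask_row)]
        for bp_row, lp_row, mask_row in zip(
            band_pass_result, low_pass_result, foreground_mask
        )
    ]


def update_background(frame_index, first_frame, band_pass_result,
                      low_pass_result, foreground_mask):
    """frame_index is 1-based; for the 2nd frame the 1st frame is the background."""
    if frame_index == 2:
        return first_frame
    return recombine(band_pass_result, low_pass_result, foreground_mask)
```

The foreground mask is the one fed back from the previous spatial-domain filtering pass, which is how the spatial result guides the temporal update.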
Step 3: Label extraction, comprising the following steps:
3.1 Remove skin color. Convert from RGB space to YCrCb space and determine whether the coordinates (Cr, Cb) fall inside the skin-color ellipse model. If a pixel falls inside the ellipse model, set it to white.
3.2 Remove shadows. Convert from RGB space to HSV space and band-pass filter the histogram of the V component.
3.3 Identify labels. In HSV space, compute the foreground mean threshold of the S component, and set to white those pixels of the extracted moving target whose S component is below the foreground saturation mean threshold.
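A minimal Python sketch of Step 3 follows. Everything numeric in it is an assumption: the source gives no ellipse parameters and no V pass band, so `SKIN_CENTER`, `SKIN_AXES`, and `V_BAND` are placeholder values, the ellipse is taken as axis-aligned, and "band-pass filtering the V histogram" is read as keeping pixels whose V lies inside a range. The RGB to CrCb conversion uses the full-range BT.601 coefficients.

```python
import colorsys

WHITE = (255, 255, 255)
# Placeholder skin ellipse in the (Cr, Cb) plane; real values must be fitted.
SKIN_CENTER = (150.0, 110.0)   # (Cr, Cb) center, assumed
SKIN_AXES = (20.0, 15.0)       # semi-axes along Cr and Cb, assumed
V_BAND = (0.25, 1.0)           # pass band for V in [0, 1], assumed


def rgb_to_crcb(r, g, b):
    """Cr and Cb of an 8-bit RGB pixel (full-range BT.601)."""
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    return cr, cb


def is_skin(pixel):
    """True if (Cr, Cb) falls inside the axis-aligned placeholder ellipse."""
    cr, cb = rgb_to_crcb(*pixel)
    dx = (cr - SKIN_CENTER[0]) / SKIN_AXES[0]
    dy = (cb - SKIN_CENTER[1]) / SKIN_AXES[1]
    return dx * dx + dy * dy <= 1.0


def hsv(pixel):
    """(H, S, V), each in [0, 1], for an 8-bit RGB pixel."""
    return colorsys.rgb_to_hsv(*(c / 255.0 for c in pixel))


def extract_labels(pixels):
    """Whiten skin, shadow, and low-saturation pixels in a flat pixel list."""
    kept = []
    for p in pixels:
        v = hsv(p)[2]
        is_shadow = not (V_BAND[0] <= v <= V_BAND[1])
        # Steps 3.1 and 3.2: skin and shadow pixels become white.
        kept.append(WHITE if is_skin(p) or is_shadow else p)
    # Step 3.3: threshold S at the mean saturation of the remaining foreground.
    saturations = [hsv(p)[1] for p in kept if p != WHITE]
    if not saturations:
        return kept
    mean_s = sum(saturations) / len(saturations)
    return [WHITE if p != WHITE and hsv(p)[1] < mean_s else p for p in kept]
```

For example, `extract_labels([(224, 172, 150), (220, 20, 20), (120, 110, 100), (30, 25, 25)])` keeps only the saturated red pixel: with these placeholder parameters the first pixel falls inside the skin ellipse, the third is below the mean saturation, and the fourth is too dark.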
The moving-target localization module locates the moving targets using joint space Kmeans. Joint space Kmeans can classify as well as locate, so that different fingers are matched correctly to their labels, and localization errors caused by color overlap, complex fingering, and noise interference are effectively avoided. First, the peaks of the low-pass-filtered R, G, and B component histograms are identified, and the number of clusters K is determined adaptively, which makes the classification more accurate and intelligent. The clusters are then initialized adaptively from the statistical properties of the histograms, which helps avoid local optima, speeds up convergence, and improves the efficiency and accuracy of the algorithm. Running Kmeans jointly over the color space (R, G, B) and the geometric space (x, y), in 5 dimensions, makes full use of the prior knowledge that pixels of the same color lie close together, and improves the accuracy of clustering and localization. The cluster centers are also subjected to random perturbation and simulated annealing, which avoids local optima as far as possible while improving the stability of the algorithm. Finally, the clustering result is classified and located to determine the position of each finger on the keyboard in every frame, which gives the player's fingering.
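The adaptive choice of K from histogram peaks can be sketched in Python as follows. The source does not name the low-pass filter or the peak criterion, so the moving-average filter, the plateau-aware interior peak test, the `min_height` cut-off, and the exclusion of white background pixels are all assumptions, as are the function names.

```python
WHITE = (255, 255, 255)


def channel_histogram(pixels, channel):
    """256-bin histogram of one channel, ignoring white background pixels."""
    hist = [0] * 256
    for p in pixels:
        if p != WHITE:
            hist[p[channel]] += 1
    return hist


def smooth(hist, radius=2):
    """Moving-average low-pass filter over a histogram."""
    n = len(hist)
    out = []
    for i in range(n):
        lo, hi = max(0, i - radius), min(n, i + radius + 1)
        out.append(sum(hist[lo:hi]) / (hi - lo))
    return out


def find_peaks(hist, min_height=1.0):
    """Centers of interior local maxima; a flat plateau counts as one peak."""
    peaks, i, n = [], 1, len(hist)
    while i < n - 1:
        if hist[i] > hist[i - 1] and hist[i] >= min_height:
            j = i
            while j + 1 < n and hist[j + 1] == hist[i]:
                j += 1                      # walk to the end of the plateau
            if j + 1 < n and hist[j + 1] < hist[i]:
                peaks.append((i + j) // 2)  # plateau is followed by a drop
            i = j + 1
        else:
            i += 1
    return peaks


def adaptive_cluster_count(pixels, radius=2, min_height=1.0):
    """Return (K, peaks per channel); K is the largest peak count of R, G, B."""
    peaks = [
        find_peaks(smooth(channel_histogram(pixels, c), radius), min_height)
        for c in range(3)
    ]
    return max(len(p) for p in peaks), peaks


pixels = [(220, 20, 20)] * 50 + [(120, 200, 40)] * 50 + [(30, 90, 230)] * 50
k, peaks = adaptive_cluster_count(pixels)
```

Note that this rule can undercount when several labels share a level in every channel; the sketch does not try to correct for that, since the source does not describe such a correction.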
The specific implementation steps of moving-target localization are shown in Fig. 3.
Step 1: Adaptive joint space Kmeans, comprising the following steps:
1.1 Compute R, G, B histogram statistics. Low-pass filter the R, G, and B component histograms of the moving-target detection result and adaptively identify the histogram peaks.
1.2 Adaptively determine the number of clusters K. Take the largest number of peaks among the R, G, and B histograms as the number of clusters for joint space Kmeans.
1.3 Adaptively initialize the clusters. Initialize the cluster centers from the peak positions of the R, G, and B histograms.
1.4 Iterate until convergence. Repeat the following operations until convergence: (a) compute the centers of the K classes; the center of the k-th class (1 ≤ k ≤ K) is the mean of the 5-dimensional observation vectors (R, G, B, x, y) in class k. (b) Assign each observation to the class whose center is nearest ("nearest" is defined by Euclidean distance).
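The iteration in step 1.4 is ordinary K-means applied to 5-dimensional (R, G, B, x, y) vectors. The Python sketch below takes the initial centers as an argument, because the source does not spell out how the per-channel peak positions are combined into 5-D centers. The function names, the `max_iter` cap, and the choice to leave an empty class's center unchanged are assumptions. It requires Python 3.8 or later for `math.dist`.

```python
import math


def assign(points, centers):
    """Index of the nearest center (Euclidean distance) for every point."""
    return [
        min(range(len(centers)), key=lambda k: math.dist(p, centers[k]))
        for p in points
    ]


def update_centers(points, labels, centers):
    """Mean 5-D vector of each class; an empty class keeps its old center."""
    new_centers = []
    for k, old in enumerate(centers):
        members = [p for p, lab in zip(points, labels) if lab == k]
        if members:
            new_centers.append(
                tuple(sum(col) / len(members) for col in zip(*members))
            )
        else:
            new_centers.append(old)
    return new_centers


def objective(points, labels, centers):
    """Kmeans objective J: sum of squared distances to the assigned centers."""
    return sum(math.dist(p, centers[lab]) ** 2 for p, lab in zip(points, labels))


def joint_kmeans(points, init_centers, max_iter=100):
    """Alternate steps (a) and (b) until the assignment stops changing."""
    centers = [tuple(float(v) for v in c) for c in init_centers]
    labels = assign(points, centers)
    for _ in range(max_iter):
        centers = update_centers(points, labels, centers)   # step (a)
        new_labels = assign(points, centers)                # step (b)
        if new_labels == labels:
            break
        labels = new_labels
    return centers, labels, objective(points, labels, centers)


# Two red pixels near (11, 10) and two blue pixels near (101, 50).
points = [(220, 20, 20, 10, 10), (220, 20, 20, 12, 10),
          (20, 20, 220, 100, 50), (20, 20, 220, 102, 50)]
centers, labels, cost = joint_kmeans(
    points, [(220, 20, 20, 0, 0), (20, 20, 220, 0, 0)]
)
```

The last two components of each returned center are the (x, y) location of that label, so the same clustering pass yields both the class and the position. Color and pixel coordinates are left unscaled here, as in the source.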
Step 2: Random perturbation and simulated annealing, comprising the following steps:
2.1 Compute the 5-dimensional perturbation radius of each class. For each class, take the distance from the class center to the farthest point of that class as the perturbation radius $r_K$ (a five-dimensional vector; K is the number of clusters).
2.2 Random perturbation. Take a random number $\mathrm{random}_0$ between -1 and 1 and perturb each class center by $r_K \cdot \mathrm{random}_0$. Use the perturbed class centers as the new initial class centers and rerun adaptive joint space Kmeans. Compute the difference between the new objective function and the current objective function, $\Delta J = J' - J$. If $\Delta J < 0$, accept the new solution as the current solution and update the perturbation radius. The objective function is the Kmeans objective function.
2.3 Simulated annealing. Modify the random number used in the perturbation to $\mathrm{random}_0 \cdot a^{-t}$, where $a$ is the annealing speed, $a > 1$, and $t$ is the number of annealing steps, and continue with the operations of 2.1 and 2.2.
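Step 2 can be sketched in Python as a loop around any K-means routine. The interpretation choices here are assumptions: the 5-D radius is taken as the absolute offset vector from the class center to its farthest member, one random number in [-1, 1] is drawn per annealing step and scaled by a^(-t), and the clustering routine is passed in as a hypothetical callable `run_kmeans(points, centers)` returning `(centers, labels, cost)`.

```python
import math
import random


def perturbation_radii(points, labels, centers):
    """Per class: absolute 5-D offset from the center to its farthest member."""
    radii = []
    for k, center in enumerate(centers):
        members = [p for p, lab in zip(points, labels) if lab == k]
        if not members:
            radii.append(tuple(0.0 for _ in center))
            continue
        farthest = max(members, key=lambda p: math.dist(p, center))
        radii.append(tuple(abs(f - c) for f, c in zip(farthest, center)))
    return radii


def perturb_and_anneal(points, centers, labels, cost, run_kmeans,
                       a=1.5, steps=10, rng=None):
    """Perturb the centers, rerun Kmeans, and keep only improving solutions."""
    rng = rng or random.Random()
    for t in range(steps):
        radii = perturbation_radii(points, labels, centers)    # step 2.1
        scale = rng.uniform(-1.0, 1.0) * a ** (-t)             # random_0 * a^(-t)
        trial = [
            tuple(c + r * scale for c, r in zip(center, radius))
            for center, radius in zip(centers, radii)
        ]
        new_centers, new_labels, new_cost = run_kmeans(points, trial)
        if new_cost - cost < 0:                                # delta J < 0
            centers, labels, cost = new_centers, new_labels, new_cost
    return centers, labels, cost
```

Because a > 1, the factor a^(-t) shrinks the perturbation as t grows, so late steps only nudge the centers. Since only solutions with a negative ΔJ are accepted, the returned cost is never higher than the starting cost.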
Step 3: Fingering recognition, comprising the following steps:
3.1 Moving-target localization. From the coordinates of the adaptive joint space Kmeans cluster centers, determine the position of each finger on the keyboard in every video frame, which gives the fingering.
3.2 Fingering output. Store the fingering of every video frame together in a CSV file for later fingering study and research.
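The output in step 3.2 could be written with Python's standard `csv` module, as in the sketch below. The column layout (frame index, finger identifier, x, y) and the function name `write_fingering` are hypothetical; the source only states that the fingering of every frame is stored in a CSV file.

```python
import csv
import io


def write_fingering(rows, stream):
    """Write (frame, finger, x, y) rows with a header to a text stream."""
    writer = csv.writer(stream)
    writer.writerow(["frame", "finger", "x", "y"])
    for frame, finger, x, y in rows:
        writer.writerow([frame, finger, f"{x:.1f}", f"{y:.1f}"])


# Usage with an in-memory buffer; a real run would open a file instead.
buffer = io.StringIO()
write_fingering([(1, 3, 120.5, 48.0), (2, 3, 121.0, 48.25)], buffer)
```

When writing to disk, the `csv` documentation recommends opening the file with `newline=""` so that line endings are not translated twice.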
Compared with the prior art, the present invention has the following advantages and technical effects:
1) In moving-target detection, the spatial-domain filtering result is fed back to guide the dynamic background update, so that the updated background is as close as possible to the background of the input video frame being filtered. Compared with conventional moving-target detection techniques, this overcomes the influence of illumination changes, camera shake, and background changes, effectively prevents background degradation and missed detection of slow-moving fingers, and benefits the detection and extraction of moving targets.
2) In moving-target detection, spatial-domain filtering determines the moving target by comparing the input video frame with the background image point by point, pixel by pixel, in the spatial domain, which yields a more accurate moving-target region than conventional moving-target detection techniques.
3) In moving-target localization, the adaptive joint space Kmeans method combines localization with adaptive clustering, so that different fingers are matched correctly to their labels, and localization errors caused by color overlap, complex fingering, and noise interference are effectively avoided. Running Kmeans jointly over the color space (R, G, B) and the geometric space (x, y), in 5 dimensions, makes full use of the prior knowledge that pixels of the same color lie close together, and improves the accuracy of clustering and localization.
4) In the joint space Kmeans method used for moving-target localization, the number of clusters K is determined adaptively as the largest number of peaks among the low-pass-filtered R, G, and B component histograms, which makes the classification more accurate and intelligent. Initializing the clusters adaptively from the statistical properties of the histograms helps avoid local optima, speeds up convergence, and improves the efficiency and accuracy of the algorithm.
In summary, the present invention overcomes the shortcomings of existing moving-target detection and localization methods when applied to fingering recognition. It is insensitive to illumination and background changes and has low computational complexity, fast convergence, high localization accuracy, and good real-time performance; with suitable adaptation it can also be widely applied to gesture recognition and other fields.
Brief description of the drawings
Fig. 1 is the overall flow chart of the finger motion detection and localization method of the present invention based on spatio-temporal filtering and joint space Kmeans;
Fig. 2 is the flow chart of the moving-target detection module of the present invention;
Fig. 3 is the flow chart of the moving-target localization module of the present invention.
Embodiment
In the present invention, labels in ten different colors (excluding black and white) are first attached to the player's fingers, and a video of the player's fingers playing the piano is shot.
Then the video undergoes moving-target detection. For each input video frame, spatio-temporal filtering determines the finger motion region and extracts the labels. Spatial-domain filtering detects moving targets by comparing the input video frame with the background image point by point, pixel by pixel, in the spatial domain, which yields a fairly accurate moving-target region. The spatial-domain filtering result is then fed back to guide the dynamic background update, so that the updated background is as close as possible to the background of the input video frame being filtered; this effectively prevents background degradation and benefits the detection and extraction of moving targets. In YCrCb space, the projection of skin information onto the two-dimensional CrCb plane is approximately elliptical, so the skin pixels in the finger motion target are removed by testing whether the coordinates (Cr, Cb) fall inside the skin-color ellipse model. Moving-target extraction inevitably includes shadow pixels. In HSV space, the value V represents brightness: the darker the pixel, the smaller V. Shadows have the lowest brightness of any part of the finger motion target, so they can be removed by band-pass filtering the V component histogram. S represents the saturation of the color: the deeper and more vivid the color, the higher the saturation. The labels have the highest saturation of any part of the finger motion target, so they can be extracted with the foreground saturation mean threshold.
Finally, the video undergoes moving-target localization. Joint space Kmeans classifies and locates the finger motion targets, which gives fingering recognition and recording. Kmeans clusters around K points in the space and assigns each object to the nearest center. The cluster centers are updated step by step by iteration, so the error decreases continually, and the algorithm has converged to a solution once the error no longer changes. In moving-target localization, the optimal solution lies near the R, G, B values of the 10 label colors. Using the peaks of the low-pass-filtered R, G, and B component histograms to initialize the clusters adaptively therefore places the initial Kmeans centers closer to the optimal solution, which speeds up convergence and improves the efficiency of the algorithm. Unlike randomly initialized Kmeans, which may end in a local optimum rather than the global optimum, adaptive initialization helps avoid local optima. Running Kmeans jointly over the color space (R, G, B) and the geometric space (x, y), in 5 dimensions, also makes full use of the prior knowledge that pixels of the same color lie close together, and improves the accuracy of clustering and localization. Simulated-annealing Kmeans is a heuristic iterative algorithm with asymptotic convergence; it has been proved in theory to converge to the globally optimal solution with probability 1. The cluster centers are therefore subjected to random perturbation and simulated annealing, which avoids local optima while improving the stability of the algorithm.
The present invention combines methods from machine learning and digital signal processing, and detects and locates finger motion based on spatio-temporal filtering and joint space Kmeans. The invention is described in further detail below with reference to specific implementation steps and the accompanying drawings, but its implementation is not limited to them.
Fig. 1 shows one embodiment of the present invention, which mainly comprises three modules: labeling and video shooting, moving-target detection, and moving-target localization. Labels in ten different colors (excluding black and white) are first attached to the player's fingers, and a video of the player's fingers playing the piano is shot. Then, for each input video frame, spatio-temporal filtering detects the finger motion target region and extracts the labels, and joint space Kmeans classifies and locates the finger motion targets, which gives fingering recognition and recording.
With the above, the present invention can be realized well and the aforementioned effects obtained; with suitable adaptation, the present example can be widely applied to gesture recognition and other fields.
Claims (4)
1. A finger motion detection and localization method based on spatio-temporal filtering and joint space Kmeans, characterized by comprising: first attaching labels in ten different colors, excluding black and white, to the player's fingers, and shooting a video of the player's fingers playing a keyboard instrument; then, for each input video frame, determining the finger motion region by spatio-temporal filtering, feeding back the spatial-domain filtering result to guide the dynamic background update, converting the finger motion detection result from RGB space to YCrCb and HSV spaces for band-pass filtering to remove skin color and shadows, and extracting the labels by a foreground threshold decision; and finally locating the finger motion targets with joint space Kmeans, adaptively identifying the number of peaks of the low-pass-filtered R, G, and B histograms to determine the number of clusters K, adaptively initializing the cluster centers from the statistical properties of the histograms, and applying random perturbation and simulated annealing to the clustering result, thereby achieving fingering recognition and recording.
2. The finger motion detection and localization method based on spatio-temporal filtering and joint-space Kmeans according to claim 1, characterized in that it is implemented by a labelling and video shooting module, a moving-target detection module and a moving-target localization module;
The labelling and video shooting module generates the video file processed by the subsequent modules: labels of ten different colors, excluding black and white, are first attached to the player's fingers, and a video is then shot while the player plays the piano normally;
The moving-target detection module detects the moving targets by spatio-temporal filtering: the input video frame is first spatially filtered to obtain an accurate moving-target region; the spatial filtering result is then fed back to guide the spatial recombination of the temporal band-pass filtering result and the temporal low-pass filtering result, that is, the finger motion positions in the foreground and the background positions, which completes the dynamic background update, overcomes the influence of illumination changes, camera shake and background changes, and avoids missed detection of moving targets when the fingers move slowly; the finger moving-target detection result is converted from RGB space to YCrCb and HSV space for band-pass filtering to remove skin color and shadow, and the labels are extracted by a foreground threshold decision;
The moving-target localization module localizes the moving targets with joint-space Kmeans: the peaks of the low-pass-filtered R, G and B component histograms are first determined and the number of clusters K is set adaptively from them, making the classification more accurate and intelligent; the clustering is then initialized adaptively from the statistical properties of the histograms; a 5-dimensional Kmeans over the joint color space (R, G, B) and geometric space (x, y) makes full use of the prior knowledge that pixels of the same color lie close together, improving the accuracy of clustering and localization; the cluster centers are subjected to random perturbation and simulated annealing, which avoids local optima and improves the stability of the algorithm; finally, the clustering result is classified and localized to determine the position of each finger on the keyboard in every frame, thereby obtaining the player's fingering.
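The adaptive choice of K from histogram peaks can be illustrated with the sketch below. The source does not name the low-pass filter or the peak criterion, so the moving-average filter, the window `width`, the `min_height` threshold and the function names `smooth`, `find_peaks` and `choose_k` are assumptions made for this example.

```python
def smooth(hist, width=5):
    """Moving-average low-pass filter over a histogram (list of counts)."""
    half = width // 2
    out = []
    for i in range(len(hist)):
        window = hist[max(0, i - half): i + half + 1]
        out.append(sum(window) / len(window))
    return out


def find_peaks(hist, min_height=0.0):
    """Return indices of local maxima, one index per flat-topped peak.

    Peaks touching the first or last bin are ignored in this sketch.
    """
    peaks = []
    n = len(hist)
    i = 1
    while i < n - 1:
        if hist[i] > hist[i - 1] and hist[i] > min_height:
            j = i
            # Walk across a plateau of equal values.
            while j < n - 1 and hist[j + 1] == hist[i]:
                j += 1
            # It is a peak only if the histogram falls after the plateau.
            if j < n - 1 and hist[j + 1] < hist[i]:
                peaks.append((i + j) // 2)
            i = j + 1
        else:
            i += 1
    return peaks


def choose_k(hist_r, hist_g, hist_b, width=5, min_height=0.0):
    """K is the largest peak count among the three smoothed histograms."""
    peak_lists = [find_peaks(smooth(h, width), min_height)
                  for h in (hist_r, hist_g, hist_b)]
    return max(len(p) for p in peak_lists), peak_lists
```

The returned peak positions are also what the method uses to initialize the cluster centers; how the per-channel peaks are combined into full centers is not detailed in the source and is left out here.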
3. The finger motion detection and localization method based on spatio-temporal filtering and joint-space Kmeans according to claim 1, characterized in that the moving-target detection is implemented by the following steps:
Step 1: Spatial filtering, comprising the following steps:
1.1 Search for the moving-target region: the current input video frame is compared pixel by pixel in the spatial domain with the background image to find the moving-target region;
1.2 Determine foreground and background: pixels in the moving-target region are set to the pixels at the corresponding positions of the current input frame, and pixels in the background region are set to white, which is (255, 255, 255) in RGB space;
1.3 Feed back foreground and background: the foreground (the moving-target region) and the background are fed back for the background update of the next frame;
Step 2: Dynamic background update, comprising the following steps:
2.1 Feed back the spatial filtering result: the previous spatial filtering result is fed back to guide the dynamic background update; it is determined whether the current input video frame is the 2nd frame; if it is, the background is not updated and the first frame is used directly as the background; otherwise, the next step is carried out;
2.2 Spatial recombination: the temporal band-pass filtering result and the temporal low-pass filtering result, that is, the finger motion positions in the foreground and the background positions, are recombined in the spatial domain to complete the background update;
Step 3: Label extraction, comprising the following steps:
3.1 Remove skin color: RGB space is converted to YCrCb space, and it is determined whether the coordinates (Cr, Cb) fall within the elliptical skin-color model; if a pixel falls within the model, it is set to white;
3.2 Remove shadow: RGB space is converted to HSV space, and the V-component histogram is band-pass filtered;
3.3 Identify labels: in HSV space, the foreground mean threshold of the S component is calculated, and pixels of the extracted moving target whose S component is below this foreground mean saturation threshold are set to white.
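Steps 1.1, 1.2 and 3.3 of the detection stage can be sketched in plain Python as below. This is only an illustration under stated assumptions: frames are small 2-D lists of (R, G, B) tuples, the pixel comparison uses a sum of absolute channel differences with a hypothetical tolerance `tol`, and the function names are invented for the example. The elliptical skin model of step 3.1 and the V-histogram band-pass filter of step 3.2 are not shown.

```python
import colorsys

WHITE = (255, 255, 255)


def spatial_filter(frame, background, tol=30):
    """Keep pixels that differ from the background; set the rest to white.

    tol is a hypothetical threshold on the summed absolute channel difference.
    """
    out = []
    for row_f, row_b in zip(frame, background):
        out_row = []
        for pf, pb in zip(row_f, row_b):
            diff = sum(abs(a - b) for a, b in zip(pf, pb))
            out_row.append(pf if diff > tol else WHITE)
        out.append(out_row)
    return out


def saturation(pixel):
    """S component of an (R, G, B) pixel with 0-255 channels."""
    r, g, b = pixel
    return colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)[1]


def keep_saturated(foreground):
    """Whiten foreground pixels whose S is below the mean foreground S."""
    sats = [saturation(p) for row in foreground for p in row if p != WHITE]
    if not sats:
        return foreground
    threshold = sum(sats) / len(sats)
    return [[p if p != WHITE and saturation(p) >= threshold else WHITE
             for p in row] for row in foreground]


# Usage: one background pixel, one red label pixel, one skin-like pixel.
background = [[(100, 100, 100)] * 3]
frame = [[(100, 100, 100), (255, 0, 0), (200, 180, 170)]]
labels_only = keep_saturated(spatial_filter(frame, background))
```

Using white as the background marker means a genuinely white foreground pixel cannot be told apart from background, which is consistent with the method excluding black and white labels.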
4. The finger motion detection and localization method based on spatio-temporal filtering and joint-space Kmeans according to claim 1, characterized in that the moving-target localization is implemented by the following steps:
Step 1: Joint-space adaptive Kmeans, comprising the following steps:
1.1 Compute R, G, B histogram statistics: the R, G and B component histograms of the moving-target detection result are low-pass filtered, and the histogram peaks are determined adaptively;
1.2 Adaptively determine the number of clusters K: the largest peak count among the R, G and B histograms is taken as the number of clusters of the joint-space Kmeans;
1.3 Adaptive cluster initialization: the cluster centers are initialized from the peak positions of the R, G and B histograms;
1.4 Iterate until convergence: the following operations are repeated until convergence: (a) the class centers of the K classes are calculated; the class center of the k-th class is the mean vector of the 5-dimensional observation vectors (R, G, B, x, y) in the k-th class, 1 ≤ k ≤ K; (b) each observation is assigned to the class whose center is nearest, where "nearest" is determined by Euclidean distance;
Step 2: Random perturbation and simulated annealing, comprising the following steps:
2.1 Calculate the 5-dimensional perturbation radius of each class: for each class, the distance from the class center to the farthest point of that class is taken as the perturbation radius r_K, where r_K is a five-dimensional vector and K is the number of clusters;
2.2 Random perturbation: a random number random0 between -1 and 1 is taken, and the class center is perturbed by r_K * random0; the perturbed class centers are used as new initial class centers and the joint-space adaptive Kmeans is run again; the difference ΔJ = J' - J between the new objective function J' and the current objective function J is calculated; if ΔJ < 0, the new solution is accepted as the current solution, the perturbation radius is updated, and step 3 is entered; otherwise step 2.3 is entered;
2.3 Simulated annealing: the random number used in the perturbation is modified to random0 * a^(-t), where a is the annealing speed, a > 1, and t is the number of annealing iterations; steps 2.1 and 2.2 are then repeated;
Step 3: Fingering recognition, comprising the following steps:
3.1 Moving-target localization: from the coordinates of the joint-space adaptive Kmeans cluster centers, the position of each finger on the keyboard is determined for every video frame, thereby obtaining the fingering;
3.2 Fingering output: the fingering of every video frame is stored in a single csv file for subsequent fingering study and research.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710231824.3A CN107180224B (en) | 2017-04-10 | 2017-04-10 | Finger motion detection and positioning method based on space-time filtering and joint space Kmeans |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710231824.3A CN107180224B (en) | 2017-04-10 | 2017-04-10 | Finger motion detection and positioning method based on space-time filtering and joint space Kmeans |
Publications (2)
Publication Number | Publication Date |
---|---|
CN107180224A (en) | 2017-09-19 |
CN107180224B (en) | 2020-06-19 |
Family
ID=59830915
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710231824.3A (Expired - Fee Related), CN107180224B (en) | 2017-04-10 | 2017-04-10 | Finger motion detection and positioning method based on space-time filtering and joint space Kmeans |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107180224B (en) |
Patent Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102368290A (en) * | 2011-09-02 | 2012-03-07 | 华南理工大学 | Hand gesture identification method based on finger advanced characteristic |
CN105335711A (en) * | 2015-10-22 | 2016-02-17 | 华南理工大学 | Fingertip detection method in complex environment |
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109960980A (en) * | 2017-12-22 | 2019-07-02 | 北京市商汤科技开发有限公司 | Dynamic gesture identification method and device |
US11221681B2 (en) | 2017-12-22 | 2022-01-11 | Beijing Sensetime Technology Development Co., Ltd | Methods and apparatuses for recognizing dynamic gesture, and control methods and apparatuses using gesture interaction |
CN109960980B (en) * | 2017-12-22 | 2022-03-15 | 北京市商汤科技开发有限公司 | Dynamic gesture recognition method and device |
CN109063781A (en) * | 2018-08-14 | 2018-12-21 | 浙江理工大学 | A kind of fuzzy image Fabric Design method of imitative natural colour function and form |
CN109063781B (en) * | 2018-08-14 | 2021-12-03 | 浙江理工大学 | Design method of fuzzy image fabric imitating natural color function and form |
CN109451634A (en) * | 2018-10-19 | 2019-03-08 | 厦门理工学院 | Method and its intelligent electric lamp system based on gesture control electric light |
CN111105398A (en) * | 2019-12-19 | 2020-05-05 | 昆明能讯科技有限责任公司 | Transmission line component crack detection method based on visible light image data |
WO2022052941A1 (en) * | 2020-09-09 | 2022-03-17 | 桂林智神信息技术股份有限公司 | Intelligent identification method and system for giving assistance with piano teaching, and intelligent piano training method and system |
Also Published As
Publication number | Publication date |
---|---|
CN107180224B (en) | 2020-06-19 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107180224A (en) | Finger motion detection and localization method based on spatio-temporal filtering and joint space Kmeans | |
CN110135314B (en) | Multi-target tracking method based on depth track prediction | |
CN102324025B (en) | Human face detection and tracking method based on Gaussian skin color model and feature analysis | |
Neves et al. | An efficient omnidirectional vision system for soccer robots: From calibration to object detection | |
Žemgulys et al. | Recognition of basketball referee signals from real-time videos | |
CN109598268A (en) | A kind of RGB-D well-marked target detection method based on single flow depth degree network | |
CN107872644A (en) | Video frequency monitoring method and device | |
CN108198221A (en) | A kind of automatic stage light tracking system and method based on limb action | |
CN104834916A (en) | Multi-face detecting and tracking method | |
CN102194108A (en) | Smiley face expression recognition method based on clustering linear discriminant analysis of feature selection | |
CN113592911B (en) | Apparent enhanced depth target tracking method | |
CN106952294A (en) | A kind of video tracing method based on RGB D data | |
CN109460764A (en) | A kind of satellite video ship monitoring method of combination brightness and improvement frame differential method | |
CN110399888B (en) | Weiqi judging system based on MLP neural network and computer vision | |
Yang et al. | Robust player detection and tracking in broadcast soccer video based on enhanced particle filter | |
CN113643278A (en) | Confrontation sample generation method for unmanned aerial vehicle image target detection | |
CN109993052A (en) | The method for tracking target and system of dimension self-adaption under a kind of complex scene | |
CN105046721A (en) | Camshift algorithm for tracking centroid correction model on the basis of Grabcut and LBP (Local Binary Pattern) | |
Zha et al. | Distractor-aware visual tracking by online Siamese network | |
CN107330918B (en) | Football video player tracking method based on online multi-instance learning | |
Arbués-Sangüesa et al. | Single-camera basketball tracker through pose and semantic feature fusion | |
Shiting et al. | Clustering-based shadow edge detection in a single color image | |
Lee et al. | Efficient Face Detection and Tracking with extended camshift and haar-like features | |
CN112614161A (en) | Three-dimensional object tracking method based on edge confidence | |
CN103996207A (en) | Object tracking method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee | ||
Granted publication date: 20200619 |