CN107180224A - Finger motion detection and localization method based on spatio-temporal filtering and joint space Kmeans - Google Patents

Finger motion detection and localization method based on spatio-temporal filtering and joint space Kmeans

Info

Publication number
CN107180224A
Authority
CN
China
Prior art keywords
kmeans
finger
frame
space
video
Prior art date
Legal status
Granted
Application number
CN201710231824.3A
Other languages
Chinese (zh)
Other versions
CN107180224B (en)
Inventor
韦岗
梁舒
马碧云
李增
Current Assignee
South China University of Technology SCUT
Original Assignee
South China University of Technology SCUT
Priority date
Filing date
Publication date
Application filed by South China University of Technology (SCUT)
Priority to CN201710231824.3A
Publication of CN107180224A
Application granted
Publication of CN107180224B
Status: Expired - Fee Related
Anticipated expiration

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 40/00 - Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/20 - Movements or behaviour, e.g. gesture recognition
    • G06V 40/28 - Recognition of hand or arm movements, e.g. recognition of deaf sign language
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 - Pattern recognition
    • G06F 18/20 - Analysing
    • G06F 18/23 - Clustering techniques
    • G06F 18/232 - Non-hierarchical techniques
    • G06F 18/2321 - Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F 18/23213 - Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00 - Scenes; Scene-specific elements
    • G06V 20/40 - Scenes; Scene-specific elements in video content
    • G06V 20/41 - Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
    • G06V 20/42 - Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items of sport video content

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Evolutionary Computation (AREA)
  • Computational Linguistics (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Software Systems (AREA)
  • Probability & Statistics with Applications (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Psychiatry (AREA)
  • Social Psychology (AREA)
  • Human Computer Interaction (AREA)
  • Image Analysis (AREA)

Abstract

The present invention discloses a finger motion detection and localization method based on spatio-temporal filtering and joint space Kmeans. First, labels of ten different colours (excluding black and white) are attached to the player's fingers and a video of the fingers playing a keyboard instrument is recorded. Then, for each input video frame, the finger motion targets are detected by spatio-temporal filtering, and the spatial-filtering result is fed back to guide the dynamic background update. The finger motion targets are then localized with joint space Kmeans, using the statistical properties of the R, G and B histograms to adaptively determine the number of clusters and to initialize the cluster centres. The method thereby realizes fingering recognition and recording with low computational complexity, fast convergence, high localization accuracy and good real-time performance.

Description

Finger motion detection and localization method based on spatio-temporal filtering and joint space Kmeans
Technical field
The present invention relates to the technical fields of visual monitoring and digital image processing, and in particular to a finger motion detection and localization method based on spatio-temporal filtering and joint space Kmeans.
Background art
Correct fingering is essential for a piano (or other keyboard-instrument) player to perform flexibly and interpret the music. Good fingering reflects the player's understanding and interpretation of the composer's style and of the content of the work, and at the same time saves energy and time and improves playing efficiency. Although fingering follows general rules, the flexibility of fingering across different pieces makes fingering practice difficult for beginners and makes it hard to imitate the fingering of virtuosos. Recording fingering manually not only requires considerable musical skill but is also time-consuming and laborious. Automatic, intelligent machine recognition of fingering is therefore an inevitable trend in fingering research and learning.
The key to fingering recognition is the combination of moving-target detection and moving-target localization.
Conventional moving-target detection methods include background modeling, frame differencing and optical flow.
1) Background modeling: it is assumed that a static scene without intruding objects has certain regular properties that can be described by a statistical model (for example a weighted mixture of distributions). Once the background model is known, an intruding object can be detected by marking the parts of the scene image that do not fit the model. Common background modeling methods include the single Gaussian model, the Gaussian mixture model and kernel density estimation. Although these methods can obtain fairly accurate motion-target regions, they are computationally heavy, relatively slow, and sensitive to illumination and background changes.
2) Frame differencing: the moving regions of an image are extracted from the temporal difference between adjacent frames. Frame differencing is fast, but when the finger motion is slow the moving-target pixels of two consecutive frames are so close that the overlapping part cannot be detected (see the sketch after this list).
3) Optical flow: motion is detected from the optical-flow field of the moving target as it changes over time. Optical flow needs no background model and can detect independently moving objects even when no prior information about the scene is available. However, it is computationally complex, usually requires dedicated hardware to meet real-time requirements, and problems such as motion boundaries, occlusion and multiple motions (including transparent and translucent motion) remain bottlenecks of the method.
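As a rough illustration of the frame-differencing baseline discussed above (not the method of the present invention), a minimal sketch in Python using OpenCV; the threshold value 25 is an assumption:

import cv2

def frame_difference_mask(prev_frame, curr_frame, thresh=25):
    """Binary motion mask from the absolute difference of two consecutive frames."""
    prev_gray = cv2.cvtColor(prev_frame, cv2.COLOR_BGR2GRAY)
    curr_gray = cv2.cvtColor(curr_frame, cv2.COLOR_BGR2GRAY)
    diff = cv2.absdiff(curr_gray, prev_gray)
    # When the fingers move slowly the difference stays below the threshold,
    # which is exactly the missed-detection problem described above.
    _, mask = cv2.threshold(diff, thresh, 255, cv2.THRESH_BINARY)
    return mask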
Moving-target localization, in turn, is usually based on edge detection, which replaces an exact object contour with simplified location information. However, when the fingering is complex or the labels of two or more fingers overlap, edge detection loses a great deal of information and may even merge two moving targets into one. Because edge detection can only localize and cannot classify, the detected contours cannot be matched correctly to the fingers. Edge detection is also strongly affected by the background and has no filtering capability, so noise points are detected as well and disturb the localization of the fingers.
Under the fingering-recognition application scenario, the above detection and localization methods therefore suffer from various problems, for example missed detection of slow-moving targets, inability to classify (so fingers cannot be matched correctly with their positions) and severe noise interference. The present invention proposes a finger motion detection and localization method based on spatio-temporal filtering and joint space Kmeans, which recognizes fingering by analysing a video of the player's fingers playing a piano (or other keyboard instrument). The moving-target detection of the present invention uses spatio-temporal filtering, which overcomes illumination and background changes and effectively avoids missed detection of slow-moving targets; the moving-target localization uses joint space Kmeans, which makes full use of the statistical properties of the image for adaptive decisions and improves the accuracy of localization and clustering.
Summary of the invention
The present method aims to overcome the deficiencies of existing moving-target detection and localization methods when applied to the fingering-recognition scenario, and proposes a finger motion detection and localization method based on spatio-temporal filtering and joint space Kmeans.
To achieve the above objective, the finger motion detection and localization method based on spatio-temporal filtering and joint space Kmeans of the present invention consists of three modules: labelling and video recording, moving-target detection, and moving-target localization.
The labelling and video-recording module generates the video file processed by the subsequent modules: labels of ten different colours (excluding black and white) are first attached to the player's fingers, and the player's normal piano playing is then recorded as a video.
The moving-target detection module detects moving targets using spatio-temporal filtering. The input video frame is first spatially filtered to obtain an accurate motion-target region. The spatial-filtering result then guides the dynamic background update: the temporal band-pass filtering result and the temporal low-pass filtering result are recombined spatially at the foreground (finger motion) positions and the background positions. This overcomes the influence of illumination changes, camera shake and background changes, and effectively avoids missing the moving target when the fingers move slowly. The finger-motion detection result is converted from the RGB colour space into the YCrCb and HSV colour spaces for band-pass filtering, the skin colour and shadows are removed, and the labels are extracted by foreground thresholding.
The detailed steps of the moving-target detection are shown in Fig. 2.
Step 1: spatial filtering, comprising the following steps:
1.1 Search the motion-target region. The current input video frame is compared with the background image pixel by pixel in the spatial domain to find the motion-target region.
1.2 Determine foreground and background. The motion-target region is set to the pixels at the corresponding positions of the current input frame; the background region is set to white (in RGB space, white is (255, 255, 255)).
1.3 Feed back foreground and background. The foreground (motion-target region) and the background are fed back for the background update of the next frame.
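A minimal sketch of the pixel-by-pixel spatial comparison of steps 1.1 to 1.3, assuming 8-bit frames; the threshold value 30 and the use of the maximum channel difference are assumptions, not values from the patent:

import numpy as np

def spatial_filter(frame, background, thresh=30):
    """Compare the input frame with the background image pixel by pixel.

    Returns the foreground image (moving pixels kept, the rest set to white)
    and the boolean motion mask that is fed back to the background update.
    """
    diff = np.abs(frame.astype(np.int16) - background.astype(np.int16))
    motion_mask = diff.max(axis=2) > thresh        # pixel differs from the background
    foreground = np.full_like(frame, 255)          # background region set to white
    foreground[motion_mask] = frame[motion_mask]   # motion-target region keeps the current pixels
    return foreground, motion_mask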
Step 2: dynamic background update, comprising the following steps:
2.1 Feed back the spatial-filtering result. The previous spatial-filtering result guides the dynamic background update. Determine whether the current input frame is the second frame; if it is, the background is not updated and the first frame is used directly as the background; otherwise, proceed to the next step.
2.2 Spatial recombination. The temporal band-pass filtering result and the temporal low-pass filtering result are recombined spatially at the foreground (finger motion) positions and the background positions to complete the background update.
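A minimal sketch of the spatial recombination of step 2.2, in which a running average stands in for the temporal low-pass filter and keeping the previous background at the finger positions stands in for the band-pass branch; both substitutions and the smoothing constant alpha are assumptions for illustration only:

import numpy as np

def update_background(background, frame, motion_mask, alpha=0.05):
    """Recombine the temporal filtering results at foreground and background positions."""
    lowpass = (1 - alpha) * background.astype(np.float32) + alpha * frame.astype(np.float32)
    new_bg = background.astype(np.float32)
    new_bg[~motion_mask] = lowpass[~motion_mask]   # background positions slowly follow the scene
    # foreground (finger) positions keep the previous background, so a slowly moving
    # finger is not absorbed into the background and missed in the next frame
    return new_bg.astype(background.dtype)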
Step 3: label extraction, comprising the following steps:
3.1 Remove skin colour. Convert the RGB space to the YCrCb space and test whether the coordinate (Cr, Cb) lies inside the skin-colour ellipse model; if a pixel lies inside the ellipse model, it is set to white.
3.2 Remove shadows. Convert the RGB space to the HSV space and band-pass filter the V-component histogram.
3.3 Identify the labels. In HSV space, compute the foreground mean threshold of the S component; pixels of the extracted moving target whose S value is below the foreground saturation mean threshold are set to white.
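A minimal sketch of steps 3.1 to 3.3, assuming BGR input from OpenCV; the skin-ellipse parameters, the shadow value threshold and the simplified shadow test (a plain V threshold instead of band-pass filtering the V histogram) are illustrative assumptions:

import cv2
import numpy as np

# Skin-colour ellipse in the (Cb, Cr) plane: centre, axes and rotation angle are
# illustrative assumptions, not values taken from the patent.
SKIN_CENTER = (113.0, 155.6)   # (Cb, Cr)
SKIN_AXES = (23.4, 15.2)
SKIN_ANGLE_DEG = 43.0

def extract_labels(foreground):
    """Remove skin and shadow pixels and keep the highly saturated colour labels."""
    ycrcb = cv2.cvtColor(foreground, cv2.COLOR_BGR2YCrCb)
    hsv = cv2.cvtColor(foreground, cv2.COLOR_BGR2HSV)
    cr = ycrcb[..., 1].astype(np.float32)
    cb = ycrcb[..., 2].astype(np.float32)

    # 3.1 skin test: a pixel whose (Cr, Cb) falls inside the ellipse is set to white
    ca, sa = np.cos(np.radians(SKIN_ANGLE_DEG)), np.sin(np.radians(SKIN_ANGLE_DEG))
    u = (cb - SKIN_CENTER[0]) * ca + (cr - SKIN_CENTER[1]) * sa
    v = -(cb - SKIN_CENTER[0]) * sa + (cr - SKIN_CENTER[1]) * ca
    skin = (u / SKIN_AXES[0]) ** 2 + (v / SKIN_AXES[1]) ** 2 <= 1.0

    # 3.2 shadow test: shadows are the darkest part of the target (low V)
    shadow = hsv[..., 2] < 50

    # 3.3 label test: keep only pixels whose saturation exceeds the foreground mean
    fg = np.any(foreground != 255, axis=2)
    s_mean = hsv[..., 1][fg].mean() if fg.any() else 0
    below_mean_saturation = hsv[..., 1] < s_mean

    out = foreground.copy()
    out[skin | shadow | below_mean_saturation] = 255
    return out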
The moving-target localization module localizes the moving targets using joint space Kmeans. Joint space Kmeans not only localizes but also classifies, so that different fingers are matched correctly with their label colours, and localization errors caused by colour overlap, complex fingering and noise points are effectively avoided. First, the peaks of the low-pass-filtered R, G and B component histograms are found and the number of clusters K is determined adaptively, which makes the classification more accurate and intelligent. The cluster centres are then initialized adaptively from the histogram statistics, which avoids getting trapped in local optima, accelerates the convergence of the iteration, and improves the efficiency and accuracy of the algorithm. A joint 5-dimensional Kmeans over the colour space (R, G, B) and the geometric space (x, y) makes full use of the prior knowledge that pixels of the same colour are spatially close, which improves the accuracy of clustering and localization. The cluster centres are further subjected to random perturbation and simulated annealing, which improves the stability of the algorithm while avoiding local optima as far as possible. Finally, the clustering result is classified and localized, and the position of each finger on the keyboard is determined for each frame, which yields the player's fingering.
The detailed steps of the moving-target localization are shown in Fig. 3.
Step 1: adaptive joint space Kmeans, comprising the following steps:
1.1 Compute the R, G, B histogram statistics. Low-pass filter the R, G and B component histograms of the moving-target detection result and find the histogram peaks adaptively.
1.2 Adaptively determine the number of clusters K. The largest of the peak counts of the R, G and B histograms is taken as the number of clusters of the joint space Kmeans.
1.3 Adaptive cluster initialization. The cluster centres are initialized from the R, G and B histogram peak locations.
1.4 Iterate until convergence. Repeat the following until convergence: (a) compute the K class centres, where the centre of class k (1 ≤ k ≤ K) is the mean of the 5-dimensional observation vectors (R, G, B, x, y) in class k; (b) assign each observation to the class whose centre is nearest, where "nearest" is defined by the Euclidean distance.
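A minimal sketch of steps 1.1 to 1.4 in plain numpy, assuming the non-white pixels of the detection result are the observations and that the channel order is (R, G, B); the histogram smoothing width, the simple local-maximum peak test and initializing all centres from the channel with the most peaks are assumptions:

import numpy as np

def histogram_peaks(channel, bins=256, smooth=7):
    """Low-pass filter a colour histogram and return the locations of its peaks."""
    hist, _ = np.histogram(channel, bins=bins, range=(0, 256))
    hist = np.convolve(hist, np.ones(smooth) / smooth, mode="same")
    return np.array([i for i in range(1, bins - 1)
                     if hist[i] > hist[i - 1] and hist[i] >= hist[i + 1] and hist[i] > 0])

def joint_space_kmeans(label_img, iters=50):
    """Adaptive 5-dimensional Kmeans over (R, G, B, x, y) of the label pixels."""
    ys, xs = np.where(np.any(label_img != 255, axis=2))
    rgb = label_img[ys, xs].astype(np.float32)
    obs = np.column_stack([rgb, xs, ys]).astype(np.float32)      # (R, G, B, x, y)

    # 1.1-1.2 adaptive cluster number: largest peak count of the R, G, B histograms
    peak_sets = [histogram_peaks(rgb[:, c]) for c in range(3)]
    best_c = int(np.argmax([len(p) for p in peak_sets]))
    peaks = peak_sets[best_c]
    k = len(peaks)

    # 1.3 adaptive initialization: one centre per histogram peak
    centers = np.array([obs[np.argmin(np.abs(rgb[:, best_c] - p))] for p in peaks])

    # 1.4 iterate until convergence
    for _ in range(iters):
        d = np.linalg.norm(obs[:, None, :] - centers[None, :, :], axis=2)
        assign = d.argmin(axis=1)                                # nearest centre (Euclidean)
        new_centers = np.array([obs[assign == j].mean(axis=0) if np.any(assign == j)
                                else centers[j] for j in range(k)])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return centers, assign, obs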
Step 2: random perturbation and simulated annealing, comprising the following steps:
2.1 Compute the 5-dimensional perturbation radius of each class. The distance from each class centre to the farthest point of that class is taken as the perturbation radius r_k (a 5-dimensional vector; K is the number of clusters).
2.2 Random perturbation. Take a random number random0 between -1 and 1 and perturb each class centre by r_k * random0. Use the perturbed class centres as the new initial class centres and run the adaptive joint space Kmeans again. Compute the difference ΔJ = J' - J between the new objective function and the current objective function. If ΔJ < 0, accept the new solution as the current solution and update the perturbation radius. The objective function here is the Kmeans objective function.
2.3 Simulated annealing. The random number used for the perturbation is modified to random0 * a^(-t), where a is the annealing speed (a > 1) and t is the number of annealing steps; operations 2.1 and 2.2 are then repeated.
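A minimal sketch of steps 2.1 to 2.3, assuming a helper run_kmeans(obs, init_centers) that re-runs the joint space Kmeans from the given initial centres and returns (centers, assign); the number of annealing steps and the annealing speed a = 1.5 are assumptions:

import numpy as np

def kmeans_objective(obs, centers, assign):
    """Standard Kmeans objective: sum of squared distances to the assigned centres."""
    return float(((obs - centers[assign]) ** 2).sum())

def perturb_and_anneal(obs, centers, assign, run_kmeans, steps=10, a=1.5):
    """Randomly perturb the cluster centres with simulated annealing."""
    J = kmeans_objective(obs, centers, assign)
    for t in range(steps):
        # 2.1 perturbation radius: per-dimension distance from each centre to its farthest member
        radius = np.array([np.abs(obs[assign == k] - centers[k]).max(axis=0)
                           if np.any(assign == k) else np.zeros(obs.shape[1])
                           for k in range(len(centers))])
        # 2.2 perturb by r_k * random0, with the annealing factor a**(-t) of step 2.3
        random0 = np.random.uniform(-1.0, 1.0) * a ** (-t)
        new_centers, new_assign = run_kmeans(obs, centers + radius * random0)
        J_new = kmeans_objective(obs, new_centers, new_assign)
        if J_new - J < 0:                      # accept the better solution as the current one
            centers, assign, J = new_centers, new_assign, J_new
    return centers, assign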
Step 3: fingering recognition, comprising the following steps:
3.1 Localize the moving targets. From the coordinates of the adaptive joint space Kmeans cluster centres, determine the position of each finger on the keyboard in every video frame, thereby obtaining the fingering.
3.2 Output the fingering. The fingering of every video frame is stored in a CSV file for subsequent fingering learning and research.
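A minimal sketch of steps 3.1 and 3.2, assuming the keyboard spans the full frame width with equally spaced white keys (a strong simplification) and a hypothetical output file name fingering.csv:

import csv

def x_to_key(x, frame_width, n_keys=52):
    """Map a cluster-centre x coordinate to a white-key index (0-based)."""
    return min(int(x / frame_width * n_keys), n_keys - 1)

def write_fingering(per_frame_centers, frame_width, path="fingering.csv"):
    """per_frame_centers: one list of (R, G, B, x, y) cluster centres per video frame."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["frame", "r", "g", "b", "key_index"])
        for frame_idx, centers in enumerate(per_frame_centers):
            for r, g, b, x, y in centers:
                writer.writerow([frame_idx, int(r), int(g), int(b), x_to_key(x, frame_width)])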
Compared with the prior art, the present invention has the following advantages and technical effects:
1) In moving-target detection, the spatial-filtering result is fed back to guide the dynamic background update, so that the updated background stays closest to the background of the spatially filtered input frame. Compared with conventional moving-target detection methods, this overcomes the influence of illumination changes, camera shake and background changes, effectively prevents background degradation and the missed detection of slowly moving fingers, and benefits the detection and extraction of the moving targets.
2) In moving-target detection, spatial filtering determines the moving-target pixels by comparing the input video frame with the background image pixel by pixel in the spatial domain, which yields a more accurate motion-target region than conventional moving-target detection methods.
3) In moving-target localization, the adaptive joint space Kmeans method combines localization with adaptive clustering, so that different fingers are matched correctly with their label colours, and localization errors caused by colour overlap, complex fingering and noise interference are effectively avoided. The joint 5-dimensional Kmeans over the colour space (R, G, B) and the geometric space (x, y) makes full use of the prior knowledge that pixels of the same colour are spatially close, which improves the accuracy of clustering and localization.
4) In the joint space Kmeans localization, the number of clusters K is determined adaptively from the largest peak count of the low-pass-filtered R, G and B component histograms, which makes the classification more accurate and intelligent. The clusters are initialized adaptively from the histogram statistics, which avoids getting trapped in local optima, accelerates the convergence of the iteration, and improves the efficiency and accuracy of the algorithm.
In summary, the present invention overcomes the deficiencies of existing moving-target detection and localization methods in the fingering-recognition scenario; it is insensitive to illumination and background changes, has low computational complexity, fast convergence, high localization accuracy and good real-time performance, and with suitable modification it can also be widely applied to gesture recognition and other fields.
Brief description of the drawings
Fig. 1 is the overall flow chart of the finger motion detection and localization method based on spatio-temporal filtering and joint space Kmeans of the present invention;
Fig. 2 is the flow chart of the moving-target detection module of the present invention;
Fig. 3 is the flow chart of the moving-target localization module of the present invention.
Embodiment
The present invention first attaches labels of ten different colours (excluding black and white) to the player's fingers and records a video of the player's fingers playing the piano.
The video is then subjected to moving-target detection. For each input video frame, the finger motion regions are first determined by spatio-temporal filtering and the labels are extracted. The spatial filtering compares the input frame with the background image pixel by pixel in the spatial domain to detect the moving targets, which yields a more accurate motion-target region. The spatial-filtering result then guides the dynamic background update, so that the updated background stays closest to the background of the spatially filtered input frame, which effectively prevents background degradation and benefits the detection and extraction of the moving targets. In YCrCb space, the projection of skin colour onto the CrCb plane is approximately elliptical, so the skin pixels in the finger motion target are removed by testing whether the coordinate (Cr, Cb) lies inside the skin-colour ellipse model. Shadow pixels are inevitably included in the extracted moving target; in HSV space, the value V represents brightness and is smaller for darker pixels. Shadows are the darkest part of the finger motion target, so they can be removed by band-pass filtering the V-component histogram. S represents the saturation of a colour: the deeper and more vivid the colour, the higher the saturation. The labels have the highest saturation among the parts of the finger motion target, so they can be extracted by thresholding with the foreground saturation mean.
Finally, the video is subjected to moving-target localization. Joint space Kmeans classifies and localizes the finger motion targets, which realizes fingering recognition and recording. Kmeans partitions the observations in the observation space around K cluster centres, assigning each object to a centre; by iteratively updating each cluster centre, the error decreases steadily, and when the error stops changing the algorithm has converged to a solution. In the moving-target localization, the optimal solution lies near the R, G, B values of the ten label colours. Therefore the peaks of the low-pass-filtered R, G and B component histograms are used to initialize the clusters adaptively, which places the initial Kmeans centres close to the optimal solution, accelerates convergence and improves efficiency; unlike randomly initialized Kmeans, which may return a locally optimal rather than a globally optimal solution, the adaptive initialization avoids getting trapped in local optima. The joint 5-dimensional Kmeans over the colour space (R, G, B) and the geometric space (x, y) makes full use of the prior knowledge that pixels of the same colour are spatially close, which improves the accuracy of clustering and localization. The simulated-annealing Kmeans algorithm is a heuristic iterative algorithm with asymptotic convergence; it has been proved in theory to converge with probability 1 to the globally optimal solution. The cluster centres are therefore subjected to random perturbation and simulated annealing, which improves the stability of the algorithm while avoiding local optima.
The present invention organically combines machine learning and digital signal processing; based on spatio-temporal filtering and joint space Kmeans, it realizes the detection and localization of finger motion. The invention is described in further detail below with reference to the specific implementation steps and the accompanying drawings, but the implementation of the present invention is not limited thereto.
Fig. 1 shows an embodiment of the present invention, which mainly comprises three modules: labelling and video recording, moving-target detection, and moving-target localization. Labels of ten different colours (excluding black and white) are first attached to the player's fingers and a video of the player's fingers playing the piano is recorded. Then, for each input video frame, the finger motion target regions are detected by spatio-temporal filtering and the labels are extracted; the finger motion targets are classified and localized by joint space Kmeans, thereby realizing fingering recognition and recording.
The labelling and video-recording module generates the video file processed by the subsequent modules: labels of ten different colours (excluding black and white) are first attached to the player's fingers, and the player's normal piano playing is then recorded as a video.
The moving-target detection module detects moving targets using spatio-temporal filtering. The input video frame is first spatially filtered to obtain an accurate motion-target region. The spatial-filtering result then guides the dynamic background update: the temporal band-pass filtering result and the temporal low-pass filtering result are recombined spatially at the foreground (finger motion) positions and the background positions. This overcomes the influence of illumination changes, camera shake and background changes, and effectively avoids missing the moving target when the fingers move slowly. The finger-motion detection result is converted from the RGB space into the YCrCb and HSV spaces for band-pass filtering, the skin colour and shadows are removed, and the labels are extracted by foreground thresholding.
The detailed steps of the moving-target detection are shown in Fig. 2.
Step 1: spatial filtering, comprising the following steps:
1.1 Search the motion-target region. The current input video frame is compared with the background image pixel by pixel in the spatial domain to find the motion-target region.
1.2 Determine foreground and background. The motion-target region is set to the pixels at the corresponding positions of the current input frame; the background region is set to white.
1.3 Feed back foreground and background. The foreground (motion-target region) and the background are fed back for the background update of the next frame.
Step 2: dynamic background update, comprising the following steps:
2.1 Feed back the spatial-filtering result. The previous spatial-filtering result guides the dynamic background update. Determine whether the current input frame is the second frame; if it is, the background is not updated and the first frame is used directly as the background; otherwise, proceed to the next step.
2.2 Spatial recombination. The temporal band-pass filtering result and the temporal low-pass filtering result are recombined spatially at the foreground (finger motion) positions and the background positions to complete the background update.
Step 3: label extraction, comprising the following steps:
3.1 Remove skin colour. Convert the RGB space to the YCrCb space and test whether the coordinate (Cr, Cb) lies inside the skin-colour ellipse model; if a pixel lies inside the ellipse model, it is set to white.
3.2 Remove shadows. Convert the RGB space to the HSV space and band-pass filter the V-component histogram.
3.3 Identify the labels. In HSV space, compute the foreground mean threshold of the S component; pixels of the extracted moving target whose S value is below the foreground saturation mean threshold are set to white.
The moving-target localization module localizes the moving targets using joint space Kmeans. Joint space Kmeans not only localizes but also classifies, so that different fingers are matched correctly with their label colours, and localization errors caused by colour overlap, complex fingering and noise points are effectively avoided. First, the peaks of the low-pass-filtered R, G and B component histograms are found and the number of clusters K is determined adaptively, which makes the classification more accurate and intelligent. The cluster centres are then initialized adaptively from the histogram statistics, which avoids getting trapped in local optima, accelerates the convergence of the iteration, and improves the efficiency and accuracy of the algorithm. A joint 5-dimensional Kmeans over the colour space (R, G, B) and the geometric space (x, y) makes full use of the prior knowledge that pixels of the same colour are spatially close, which improves the accuracy of clustering and localization. The cluster centres are further subjected to random perturbation and simulated annealing, which improves the stability of the algorithm while avoiding local optima as far as possible. Finally, the clustering result is classified and localized, and the position of each finger on the keyboard is determined for each frame, which yields the player's fingering.
The detailed steps of the moving-target localization are shown in Fig. 3.
Step 1: adaptive joint space Kmeans, comprising the following steps:
1.1 Compute the R, G, B histogram statistics. Low-pass filter the R, G and B component histograms of the moving-target detection result and find the histogram peaks adaptively.
1.2 Adaptively determine the number of clusters K. The largest of the peak counts of the R, G and B histograms is taken as the number of clusters of the joint space Kmeans.
1.3 Adaptive cluster initialization. The cluster centres are initialized from the R, G and B histogram peak locations.
1.4 Iterate until convergence. Repeat the following until convergence: (a) compute the K class centres, where the centre of class k (1 ≤ k ≤ K) is the mean of the 5-dimensional observation vectors (R, G, B, x, y) in class k; (b) assign each observation to the class whose centre is nearest.
Step 2: random perturbation and simulated annealing, comprising the following steps:
2.1 Compute the 5-dimensional perturbation radius of each class. The distance from each class centre to the farthest point of that class is taken as the perturbation radius r_k (a 5-dimensional vector; K is the number of clusters).
2.2 Random perturbation. Take a random number random0 between -1 and 1 and perturb each class centre by r_k * random0. Use the perturbed class centres as the new initial class centres and run the adaptive joint space Kmeans again. Compute the difference ΔJ = J' - J between the new objective function and the current objective function. If ΔJ < 0, accept the new solution as the current solution and update the perturbation radius.
2.3 Simulated annealing. The random number used for the perturbation is modified to random0 * a^(-t), where a is the annealing speed (a > 1) and t is the set number of annealing steps; operations 2.1 and 2.2 are then repeated.
Step 3: fingering recognition, comprising the following steps:
3.1 Localize the moving targets. From the coordinates of the adaptive joint space Kmeans cluster centres, determine the position of each finger on the keyboard in every video frame, thereby obtaining the fingering.
3.2 Output the fingering. The fingering of every video frame is stored in a CSV file for subsequent fingering learning and research.
As described above, the present invention can be implemented well and achieves the aforementioned effects; with suitable modification, the present example can also be widely applied to gesture recognition and other fields.

Claims (4)

1. A finger motion detection and localization method based on spatio-temporal filtering and joint space Kmeans, characterized by comprising: first attaching labels of ten different colours other than black and white to the player's fingers and recording a video of the player's fingers playing a keyboard instrument; then, for each input video frame, determining the finger motion regions by spatio-temporal filtering, feeding the spatial-filtering result back to guide the dynamic background update, converting the finger-motion detection result from the RGB space into the YCrCb and HSV spaces for band-pass filtering, removing the skin colour and shadows, and extracting the labels by foreground thresholding; finally localizing the finger motion targets with joint space Kmeans, adaptively finding the peaks of the low-pass-filtered R, G and B histograms to determine the number of clusters K, adaptively initializing the cluster centres from the histogram statistics, and subjecting the clustering result to random perturbation and simulated annealing, thereby realizing fingering recognition and recording.
2. The finger motion detection and localization method based on spatio-temporal filtering and joint space Kmeans according to claim 1, characterized in that it is realized by a labelling and video-recording module, a moving-target detection module and a moving-target localization module;
the labelling and video-recording module generates the video file processed by the subsequent modules: labels of ten different colours other than black and white are first attached to the player's fingers, and the player's normal piano playing is then recorded as a video;
the moving-target detection module detects moving targets using spatio-temporal filtering: the input video frame is first spatially filtered to obtain an accurate motion-target region; the spatial-filtering result then guides the dynamic background update, in which the temporal band-pass filtering result and the temporal low-pass filtering result are recombined spatially at the foreground (finger motion) positions and the background positions, overcoming the influence of illumination changes, camera shake and background changes and avoiding missed detection of the moving target when the fingers move slowly; the finger-motion detection result is converted from the RGB space into the YCrCb and HSV spaces for band-pass filtering, the skin colour and shadows are removed, and the labels are extracted by foreground thresholding;
the moving-target localization module localizes the moving targets using joint space Kmeans: the peaks of the low-pass-filtered R, G and B component histograms are first found and the number of clusters K is determined adaptively, making the classification more accurate and intelligent; the clusters are then initialized adaptively from the histogram statistics; a joint 5-dimensional Kmeans over the colour space (R, G, B) and the geometric space (x, y) makes full use of the prior knowledge that pixels of the same colour are spatially close, improving the accuracy of clustering and localization; the cluster centres are subjected to random perturbation and simulated annealing, improving the stability of the algorithm while avoiding local optima; finally, the clustering result is classified and localized, and the position of each finger on the keyboard is determined for each frame, thereby obtaining the player's fingering.
3. The finger motion detection and localization method based on spatio-temporal filtering and joint space Kmeans according to claim 1, characterized in that the detailed implementation steps of the moving-target detection comprise:
Step 1, spatial filtering, comprising the following steps:
1.1 Search the motion-target region: the current input video frame is compared with the background image pixel by pixel in the spatial domain to find the motion-target region;
1.2 Determine foreground and background: the motion-target region is set to the pixels at the corresponding positions of the current input frame, and the background region is set to white; in RGB space, white is (255, 255, 255);
1.3 Feed back foreground and background: the foreground, i.e. the motion-target region, and the background are fed back for the background update of the next frame;
Step 2, dynamic background update, comprising the following steps:
2.1 Feed back the spatial-filtering result: the previous spatial-filtering result guides the dynamic background update; determine whether the current input frame is the second frame; if it is, the background is not updated and the first frame is used directly as the background; otherwise, proceed to the next step;
2.2 Spatial recombination: the temporal band-pass filtering result and the temporal low-pass filtering result are recombined spatially at the foreground, i.e. finger motion, positions and the background positions to complete the background update;
Step 3: label extraction, comprising the following steps:
3.1 Remove skin colour: convert the RGB space to the YCrCb space and test whether the coordinate (Cr, Cb) lies inside the skin-colour ellipse model; if a pixel lies inside the ellipse model, set it to white;
3.2 Remove shadows: convert the RGB space to the HSV space and band-pass filter the V-component histogram;
3.3 Identify the labels: in HSV space, compute the foreground mean threshold of the S component; pixels of the extracted moving target whose S value is below the foreground saturation mean threshold are set to white.
4. The finger motion detection and localization method based on spatio-temporal filtering and joint space Kmeans according to claim 1, characterized in that the detailed implementation steps of the moving-target localization comprise:
Step 1, adaptive joint space Kmeans, comprising the following steps:
1.1 Compute the R, G, B histogram statistics: low-pass filter the R, G and B component histograms of the moving-target detection result and find the histogram peaks adaptively;
1.2 Adaptively determine the number of clusters K: the largest of the peak counts of the R, G and B histograms is taken as the number of clusters of the joint space Kmeans;
1.3 Adaptive cluster initialization: the cluster centres are initialized from the R, G and B histogram peak locations;
1.4 Iterate until convergence: repeat the following until convergence: (a) compute the K class centres, where the centre of class k is the mean of the 5-dimensional observation vectors (R, G, B, x, y) in class k, 1 ≤ k ≤ K; (b) assign each observation to the class whose centre is nearest, where "nearest" is determined by the Euclidean distance;
Step 2: random perturbation and simulated annealing, comprising the following steps:
2.1 Compute the 5-dimensional perturbation radius of each class: the distance from each class centre to the farthest point of that class is taken as the perturbation radius r_k, where r_k is a 5-dimensional vector and K is the number of clusters;
2.2 Random perturbation: take a random number random0 between -1 and 1 and perturb each class centre by r_k * random0; use the perturbed class centres as the new initial class centres and run the adaptive joint space Kmeans again; compute the difference ΔJ = J' - J between the new objective function J' and the current objective function J; if ΔJ < 0, accept the new solution as the current solution, update the perturbation radius and go to Step 3; otherwise go to step 2.3;
2.3 Simulated annealing: the random number used for the perturbation is modified to random0 * a^(-t), where a is the annealing speed, a > 1, and t is the number of annealing steps; operations 2.1 and 2.2 are then repeated;
Step 3: fingering recognition, comprising the following steps:
3.1 Localize the moving targets: from the coordinates of the adaptive joint space Kmeans cluster centres, determine the position of each finger on the keyboard in every video frame, thereby obtaining the fingering;
3.2 Output the fingering: the fingering of every video frame is stored in a CSV file for subsequent fingering learning and research.
CN201710231824.3A (priority date 2017-04-10, filing date 2017-04-10): Finger motion detection and positioning method based on space-time filtering and joint space Kmeans; granted as CN107180224B; status: Expired - Fee Related.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710231824.3A CN107180224B (en) 2017-04-10 2017-04-10 Finger motion detection and positioning method based on space-time filtering and joint space Kmeans


Publications (2)

Publication Number Publication Date
CN107180224A true CN107180224A (en) 2017-09-19
CN107180224B CN107180224B (en) 2020-06-19

Family

ID=59830915

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710231824.3A Expired - Fee Related CN107180224B (en) 2017-04-10 2017-04-10 Finger motion detection and positioning method based on space-time filtering and joint space Kmeans

Country Status (1)

Country Link
CN (1) CN107180224B (en)



Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102368290A (en) * 2011-09-02 2012-03-07 华南理工大学 Hand gesture identification method based on finger advanced characteristic
CN105335711A (en) * 2015-10-22 2016-02-17 华南理工大学 Fingertip detection method in complex environment

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109960980A (en) * 2017-12-22 2019-07-02 北京市商汤科技开发有限公司 Dynamic gesture identification method and device
US11221681B2 (en) 2017-12-22 2022-01-11 Beijing Sensetime Technology Development Co., Ltd Methods and apparatuses for recognizing dynamic gesture, and control methods and apparatuses using gesture interaction
CN109960980B (en) * 2017-12-22 2022-03-15 北京市商汤科技开发有限公司 Dynamic gesture recognition method and device
CN109063781A (en) * 2018-08-14 2018-12-21 浙江理工大学 A kind of fuzzy image Fabric Design method of imitative natural colour function and form
CN109063781B (en) * 2018-08-14 2021-12-03 浙江理工大学 Design method of fuzzy image fabric imitating natural color function and form
CN109451634A (en) * 2018-10-19 2019-03-08 厦门理工学院 Method and its intelligent electric lamp system based on gesture control electric light
CN111105398A (en) * 2019-12-19 2020-05-05 昆明能讯科技有限责任公司 Transmission line component crack detection method based on visible light image data
WO2022052941A1 (en) * 2020-09-09 2022-03-17 桂林智神信息技术股份有限公司 Intelligent identification method and system for giving assistance with piano teaching, and intelligent piano training method and system

Also Published As

Publication number Publication date
CN107180224B (en) 2020-06-19

Similar Documents

Publication Publication Date Title
CN107180224A (en) Finger motion detection and localization method based on spatio-temporal filtering and joint space Kmeans
CN110135314B (en) Multi-target tracking method based on depth track prediction
CN102324025B (en) Human face detection and tracking method based on Gaussian skin color model and feature analysis
Neves et al. An efficient omnidirectional vision system for soccer robots: From calibration to object detection
Žemgulys et al. Recognition of basketball referee signals from real-time videos
CN109598268A (en) A kind of RGB-D well-marked target detection method based on single flow depth degree network
CN107872644A (en) Video frequency monitoring method and device
CN108198221A (en) A kind of automatic stage light tracking system and method based on limb action
CN104834916A (en) Multi-face detecting and tracking method
CN102194108A (en) Smiley face expression recognition method based on clustering linear discriminant analysis of feature selection
CN113592911B (en) Apparent enhanced depth target tracking method
CN106952294A (en) A kind of video tracing method based on RGB D data
CN109460764A (en) A kind of satellite video ship monitoring method of combination brightness and improvement frame differential method
CN110399888B (en) Weiqi judging system based on MLP neural network and computer vision
Yang et al. Robust player detection and tracking in broadcast soccer video based on enhanced particle filter
CN113643278A (en) Confrontation sample generation method for unmanned aerial vehicle image target detection
CN109993052A (en) The method for tracking target and system of dimension self-adaption under a kind of complex scene
CN105046721A (en) Camshift algorithm for tracking centroid correction model on the basis of Grabcut and LBP (Local Binary Pattern)
Zha et al. Distractor-aware visual tracking by online Siamese network
CN107330918B (en) Football video player tracking method based on online multi-instance learning
Arbués-Sangüesa et al. Single-camera basketball tracker through pose and semantic feature fusion
Shiting et al. Clustering-based shadow edge detection in a single color image
Lee et al. Efficient Face Detection and Tracking with extended camshift and haar-like features
CN112614161A (en) Three-dimensional object tracking method based on edge confidence
CN103996207A (en) Object tracking method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20200619