CN109448131A

CN109448131A - Construction method for a Kinect-based virtual piano playing system

Info

Publication number: CN109448131A
Application number: CN201811243690.8A
Authority: CN
Inventors: 吴俊�; 张子涵; 王凯; 王家霈; 张瑶; 何贵青; 蒋晓悦; 谢红梅; 夏召强; 冯晓毅; 李会方
Original assignee: Northwestern Polytechnical University
Current assignee: Northwestern Polytechnical University
Priority date: 2018-10-24
Filing date: 2018-10-24
Publication date: 2019-03-08
Anticipated expiration: 2038-10-24
Also published as: CN109448131B

Abstract

The present invention provides a construction method for a Kinect-based virtual piano playing system. A Kinect device is used to complete the three-dimensional reconstruction of the scene; a region for the virtual keyboard is selected on a suitable plane in the scene and virtual keys are generated; after key-press detection, the corresponding note is played, which realizes the function of playing a virtual piano. As a mode of human-machine interaction, the simple and convenient virtual keyboard can be extended to fields such as smart homes, games, and robotics. The OpenGL library is used for display, and the fingertip position is combined with it to judge the state of each key, which improves the accuracy of key playing and provides a good user experience. A three-dimensional model is built when implementing the virtual piano, so that the picture has a stronger sense of depth and gives the user an immersive experience.

Description

Construction method for a Kinect-based virtual piano playing system

Technical field

The present invention relates to the field of electronics and computers, and in particular to a construction method based on a Kinect device.

Background art

In the current market, most virtual piano products are programs that simulate piano keys with a computer keyboard, and playing through motion-sensing interaction is used less and less. Although current virtual piano technology can reproduce piano playing to some extent, most of it relies on computer keyboard input, and it is difficult to imitate the effect of playing real piano keys with a computer keyboard. In addition, the sound quality produced by current techniques is poor and falls far short of the sound of a real piano, so it is difficult to meet the needs of professional users.

Summary of the invention

To overcome the deficiencies of the prior art, the present invention provides a construction method for a Kinect-based virtual piano playing system. It is designed to give ordinary users a simple virtual piano playing system, and therefore satisfies the most basic requirement of playing connected music. The system is easy to use: the user only needs to select a region on a suitable plane in which to create the virtual keyboard, and can then play the virtual piano immediately.

The steps of the technical solution adopted by the present invention to solve this technical problem are as follows:

Besides meeting the demand for a virtual piano, this system can also be used for the medical rehabilitation of patients with bone injuries. During rehabilitation, patients need to repeat a large number of movements under a doctor's guidance. With the Kinect device, patients can perform rehabilitation movements on their own, aided by the displayed picture and without a doctor's help. Because the patient is in a game-like environment, the patient's mood is more pleasant, which can also help the patient recover faster.

The goal is that the user only needs to select a region on a suitable plane in which to create the virtual keys, and can then play the virtual piano immediately. The system is built as follows:

Step 1: Three-dimensional reconstruction of the scene

The Kinect camera is turned on to capture spatial depth information. Using the depth information of the real scene obtained by the camera, the point cloud coordinates of every pixel in each depth frame are calculated, the point cloud coordinates are triangulated, and the normal vector of each pixel is calculated from its coordinate information and depth information after triangulation, thereby realizing three-dimensional reconstruction;

The detailed steps are as follows:

Step 1.1: Depth map acquisition

The Kinect is controlled through Microsoft's Kinect SDK, and three-dimensional drawing uses openFrameworks. When three-dimensional reconstruction starts, the total number of points in the point cloud is set for OpenGL and the three-dimensional scene, and the key region of the virtual piano is marked on the screen with mouse clicks. The depth information captured by the Kinect camera is obtained with the NuiImageStreamGetNextFrame() function in the SDK and updated frame by frame in update(), and the depth information of each frame is stored in the configured buffer;

Step 1.2: Point cloud acquisition

A class is created that computes the point cloud from the depth information obtained in step 1.1 using the NuiTransformDepthImageToSkeleton() function in the Kinect SDK. The resulting point cloud is stored as a matrix in which each element represents one point of the cloud, and each element corresponds to the pixel at the same row and column coordinates in the depth image;
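The back-projection that step 1.2 delegates to the Kinect SDK can be illustrated with a simple pinhole camera model. The following Python sketch is not the SDK's implementation: the function name `depth_to_point_cloud` and the intrinsic parameters `fx`, `fy`, `cx`, `cy` are hypothetical, chosen only for illustration. It shows how each depth pixel can be mapped to a 3D point stored at the same row and column of a matrix.

```python
def depth_to_point_cloud(depth, fx, fy, cx, cy):
    """Back-project a depth image (rows of depths) into a matrix of
    (x, y, z) points with the same row/column layout."""
    cloud = []
    for v, row in enumerate(depth):
        cloud_row = []
        for u, z in enumerate(row):
            if z <= 0:  # invalid depth reading: keep a placeholder
                cloud_row.append(None)
                continue
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy
            cloud_row.append((x, y, z))
        cloud.append(cloud_row)
    return cloud

# Hypothetical 2x3 depth image and intrinsics, for illustration only.
depth = [[1.0, 1.0, 0.0],
         [2.0, 2.0, 2.0]]
cloud = depth_to_point_cloud(depth, fx=2.0, fy=2.0, cx=1.0, cy=0.5)
```

Keeping invalid pixels as placeholders preserves the one-to-one correspondence between matrix elements and depth pixels that the later steps rely on.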

Step 1.3: Point cloud triangulation

After the point cloud data matrix is obtained, the point cloud is triangulated:

The point cloud is treated as an image and traversed so as to obtain all the neighbors of any given point in point cloud space. Using OpenGL's line-drawing function, each point is joined, in order, with the two points preceding it to form a triangular facet, so that the point cloud is triangulated row by row;
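Because the point cloud is stored on a regular grid, triangulation reduces to index bookkeeping. The sketch below is an assumption-laden illustration rather than the patented drawing routine: it uses the common scheme of splitting every grid cell into two triangles, and the function name `triangulate_grid` is hypothetical.

```python
def triangulate_grid(rows, cols):
    """Return triangles as triples of (row, col) indices for a
    rows x cols structured point cloud."""
    triangles = []
    for r in range(rows - 1):
        for c in range(cols - 1):
            # Split each grid cell into two triangles.
            triangles.append(((r, c), (r + 1, c), (r, c + 1)))
            triangles.append(((r, c + 1), (r + 1, c), (r + 1, c + 1)))
    return triangles

triangles = triangulate_grid(3, 4)
```

A 3 x 4 grid has 2 x 3 cells, so the sketch yields 12 triangles; no neighbor search is needed, which is why a structured cloud is cheap to triangulate.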

Step 1.4: Vertex normal map

The present invention adopts the convention that the positive direction of every normal faces the camera. A best-fit plane is fitted by the least squares method to all the points adjacent to a given point, and the normal of the best-fit plane is taken as the normal of that point. The plane is chosen so that the sum of squared distances from the point and its surrounding neighbors to the plane is minimized, that is, the following expression is minimized:

where i is the index of the point being calculated, M is the sum of squared distances from the point and its surrounding neighbors to the best-fit plane, (a, b, c) is the normal vector of the plane, and x_i, y_i, z_i are the coordinate values of the point. Taking the partial derivative with respect to each of the three parameters of the normal vector gives:

For formula (1) to attain its minimum, the following three equations must hold:

According to Cramer's rule, the solution that minimizes formula (1) is

where D denotes the gradient. This completes the reconstruction of the environment scene of the virtual piano system;
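The formulas referred to in step 1.4 do not appear in this text. As a reading aid only, one standard least-squares plane fit that is consistent with the definitions given there can be written as follows. The plane model $ax + by + cz = 1$, the use of the algebraic residual in place of the true point-to-plane distance, and the reading of $D$ as the determinant of the coefficient matrix are all assumptions and are not taken from the source.

```latex
% Assumed plane model: a x + b y + c z = 1 (not taken from the source)
M = \sum_{i} \left( a x_i + b y_i + c z_i - 1 \right)^2 \tag{1}

% Partial derivatives with respect to the three parameters
\frac{\partial M}{\partial a} = 2 \sum_{i} \left( a x_i + b y_i + c z_i - 1 \right) x_i, \quad
\frac{\partial M}{\partial b} = 2 \sum_{i} \left( a x_i + b y_i + c z_i - 1 \right) y_i, \quad
\frac{\partial M}{\partial c} = 2 \sum_{i} \left( a x_i + b y_i + c z_i - 1 \right) z_i

% Setting the three derivatives to zero gives the normal equations
\begin{aligned}
a \sum_i x_i^2   + b \sum_i x_i y_i + c \sum_i x_i z_i &= \sum_i x_i \\
a \sum_i x_i y_i + b \sum_i y_i^2   + c \sum_i y_i z_i &= \sum_i y_i \\
a \sum_i x_i z_i + b \sum_i y_i z_i + c \sum_i z_i^2   &= \sum_i z_i
\end{aligned}

% Cramer's rule: D_a, D_b, D_c are D with the corresponding column
% replaced by the right-hand side (sum x_i, sum y_i, sum z_i)
a = \frac{D_a}{D}, \quad b = \frac{D_b}{D}, \quad c = \frac{D_c}{D}, \qquad
D = \begin{vmatrix}
\sum_i x_i^2   & \sum_i x_i y_i & \sum_i x_i z_i \\
\sum_i x_i y_i & \sum_i y_i^2   & \sum_i y_i z_i \\
\sum_i x_i z_i & \sum_i y_i z_i & \sum_i z_i^2
\end{vmatrix}
```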

Step 2: Generation of the virtual keys

In the scene reconstructed by the Kinect, on any unobstructed plane, the specific region in which keys are to be generated is selected by clicking on the PC screen with the mouse, and virtual keys are drawn in this region with OpenGL;

After the virtual piano environment scene has been built, the position at which the keys will be generated is selected in the constructed virtual environment. The region is selected with mouse clicks, which are monitored with mousePressed() to obtain the two-dimensional screen coordinates of the key region chosen by the user; these two-dimensional coordinates are screen coordinates whose origin is the center of the computer screen. To build three-dimensional keys, the point cloud coordinates obtained in step 1.2 are used, which lie in the depth image space whose origin is the Kinect camera. The previously calculated point cloud data is stored as a matrix, and the corresponding point cloud coordinates are obtained from the row and column correspondence between screen coordinates and depth image coordinates;

The point cloud coordinates of the key region are obtained in the Update() function, and the createkey() function is called to calculate the three-dimensional spatial coordinates of each key. The number of keys into which the region is divided is set according to the device screen size and frame rate. The total number of points initially set in step 1.1 divided by the currently set number of keys gives the number of points in each key. In the draw() function, the three-dimensional keys are drawn with OpenGL from the viewpoint of the Kinect camera, so that the key range is cropped directly on the screen with the mouse. After the virtual keys are set, each key position is mapped to the note file for that position using the play command;
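The division of the selected region into keys can be sketched as follows. This is a minimal illustration under stated assumptions: the keys are taken to be of equal width along one screen axis, and the function name `split_into_keys` and all numeric values are hypothetical. It mirrors the rule that the total point count divided by the number of keys gives the points per key.

```python
def split_into_keys(x_left, x_right, num_keys, total_points):
    """Divide a selected horizontal span into equal-width keys and
    assign each key an equal share of the region's point count."""
    width = (x_right - x_left) / num_keys
    points_per_key = total_points // num_keys
    keys = []
    for k in range(num_keys):
        keys.append({
            "index": k,
            "x_min": x_left + k * width,
            "x_max": x_left + (k + 1) * width,
            "points": points_per_key,
        })
    return keys

# Hypothetical selection: 280 pixels wide, 7 keys, 7000 points in total.
keys = split_into_keys(x_left=100, x_right=380, num_keys=7, total_points=7000)
```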

Step 3: Key-press detection

Once the piano has been set up, the system detects in real time whether the user presses a key, that is, it performs press detection;

The finger point cloud value is compared with the number of points in the key region to judge whether the state of a key has changed. Once a change in the point count of a response region is detected, and at that moment the finger is located in the key region and the range of the change matches the finger, the key state is deemed to have changed, that is, the key is deemed to have been pressed;
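The two conditions in step 3 can be combined in a single predicate. The sketch below simplifies the criterion to "the point count changed by at least a threshold, and the fingertip lies inside the key's rectangle"; the function name `key_pressed`, the rectangular key bounds, and the numbers are assumptions made for illustration.

```python
def key_pressed(baseline_count, current_count, threshold,
                fingertip, key_bounds):
    """Apply both criteria: the key's point count has changed by at
    least `threshold`, and the fingertip lies inside the key region."""
    x, y = fingertip
    x_min, x_max, y_min, y_max = key_bounds
    count_changed = abs(current_count - baseline_count) >= threshold
    finger_inside = x_min <= x <= x_max and y_min <= y <= y_max
    return count_changed and finger_inside

key = (100, 140, 0, 100)  # hypothetical bounds: x_min, x_max, y_min, y_max
pressed = key_pressed(1000, 850, 100, (120, 50), key)
```

Requiring both conditions means that a change in point count alone, for example from an object passing over the key while the fingertip is elsewhere, does not trigger a note.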

Step 4: Playing the corresponding note

If key-press detection finds that a key has been pressed, the corresponding note is played and the virtual key changes color to show the user that the press has been detected.

To achieve real-time playback, the present invention first uses asynchronous playback so that playing a note does not affect the graphics display, that is, the graphics are displayed at the same time as the key press is detected; second, a separate thread is started to play the sound, so that sound playback and display do not affect each other;

The present invention plays sound with the MIDI player built into the Windows system, sending sound playback commands to the virtual key system through the midiOutShortMsg() function. When multiple notes are played at the same time, the Acoustica Pianissimo sound library is used, and when multiple keys are pressed simultaneously, a thread pool is used to avoid the abuse of individual threads.
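The asynchronous, pooled playback described above can be sketched with Python's standard thread pool. This is only an illustration of the pattern: `play_note` is a hypothetical stand-in that returns a label instead of producing sound, and the pool size of 4 is an arbitrary assumption.

```python
from concurrent.futures import ThreadPoolExecutor

def play_note(note):
    """Stand-in for the real audio call; returns a label instead of
    producing sound so the sketch runs anywhere."""
    return f"played {note}"

def play_chord(pool, notes):
    """Submit one playback task per pressed key without blocking the
    caller (for example, the drawing loop)."""
    return [pool.submit(play_note, n) for n in notes]

with ThreadPoolExecutor(max_workers=4) as pool:
    futures = play_chord(pool, [60, 64, 67])   # MIDI notes C4, E4, G4
    results = [f.result() for f in futures]
```

Reusing a fixed set of worker threads avoids creating and destroying a thread for every key press, which is the overhead the thread pool is meant to remove.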

The present invention applies median filtering to the image: in the update() function, the acquired depth data is median-filtered with a 5x5 window.

In the normal map calculation of step 1.4, the present invention takes the 8-neighborhood points around the point of interest as neighbors and checks the depth of each of them; an 8-neighborhood point is used as a neighbor in the subsequent calculation only when the difference between its depth and the depth of the center point is within 5% of the center point's depth value.

When a moving object passes through the key selection region, the number of points in the region changes. The present invention sets a global variable that records the number of frames refreshed by the camera during the object's movement, takes the point counts of at least the first 20 frames, removes the highest and lowest point counts, and averages the remaining counts to obtain the current point count.

The present invention first obtains the point count of each key when it is not touched, taken over 20 or more frames and averaged after the maximum and minimum are removed. Each key uses 10 percent of its own total point count as the threshold; when the fluctuation of a key's average point count is smaller than the threshold, the key neither changes color nor sounds.

The beneficial effects of the present invention are as follows. Using the currently popular Kinect device, a virtual piano playing system is realized that meets users' entertainment needs. To some extent the system can serve as a general-purpose input device: as a mode of human-machine interaction, the simple and convenient virtual keyboard can be extended to fields such as smart homes, games, and robotics. The depth map provided by the Kinect has low resolution and its depth values are susceptible to noise, which hampers accurate piano playing; to solve this problem, the present invention uses the OpenGL library for display and combines it with the fingertip position to judge the state of each key, which improves the accuracy of key playing and provides a good user experience. A three-dimensional model is built when implementing the virtual piano, so that the picture has a stronger sense of depth and gives the user an immersive experience.

Brief description of the drawings

Fig. 1 is a flow chart of the present invention.

Fig. 2 is a schematic diagram of the triangulation used in the present invention.

Fig. 3 shows the environment scene of the virtual piano system of the present invention.

Fig. 4 is a schematic diagram of the keys constructed by the present invention.

Fig. 5 is a press detection diagram of the present invention.

Fig. 6 compares the image before and after filtering, where Fig. 6(a) shows the image before filtering and Fig. 6(b) shows the image after filtering.

Fig. 7 shows the live computer screen while a user plays with the present invention.

Specific embodiment

The present invention is further described below with reference to the accompanying drawings and embodiments.

Step 1: Three-dimensional reconstruction of the scene

The detailed steps are as follows:

Step 1.1: Depth map acquisition

Step 1.2: Point cloud acquisition

Step 1.3: Point cloud triangulation

The point cloud returned by the Kinect is structured rather than unordered, so triangulating it is not too computationally expensive. In step 1.2 the point cloud is stored in a matrix whose size is exactly the same as that of the depth map (rows x columns). The point cloud is treated as an image and traversed so as to obtain all the neighbors of any given point in point cloud space. Using OpenGL's line-drawing function, each point is joined, in order, with the two points preceding it to form a triangular facet, so that the point cloud is triangulated row by row. As shown in Fig. 2, the numbers beside the points give the order in which the points are drawn:

Step 1.4: Vertex normal map

There are many ways to compute a normal. One approach is, for each point, to form a triangle with two points adjacent to it and take the normal of that triangle as the normal of the point. This approach is too inaccurate, however: a slight change in the coordinates of one of the points changes the direction of the resulting normal, and the lighting becomes very unstable. To obtain a more accurate result, the present invention adopts the convention that the positive direction of every normal faces the camera. A best-fit plane is fitted by the least squares method to all the points adjacent to a given point, and the normal of the best-fit plane is taken as the normal of that point. The plane is chosen so that the sum of squared distances from the point and its surrounding neighbors to the plane is minimized, that is, the following expression is minimized:

where i is the index of the point being calculated, M is the sum of squared distances from the point and its surrounding neighbors to the best-fit plane, (a, b, c) is the normal vector of the plane, and x_i, y_i, z_i are the coordinate values of the point. Taking the partial derivative with respect to each of the three parameters of the normal vector gives:

For formula (1) to attain its minimum, the following three equations must hold:

According to Cramer's rule, the solution that minimizes formula (1) is

where D denotes the gradient. This completes the reconstruction of the environment scene of the virtual piano system, as shown in Fig. 3.
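A least-squares plane fit solved with Cramer's rule can be sketched in a few lines. The sketch below makes several assumptions that are not stated in the source: the plane is modeled as a*x + b*y + c*z = 1, the camera sits at the origin of the point cloud coordinates, and the helper names `det3` and `fit_plane_normal` are hypothetical.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def fit_plane_normal(points):
    """Fit a*x + b*y + c*z = 1 to the points by least squares and
    return the unit normal, flipped to face a camera at the origin."""
    # Normal equations A @ (a, b, c) = rhs.
    A = [[sum(p[i] * p[j] for p in points) for j in range(3)]
         for i in range(3)]
    rhs = [sum(p[i] for p in points) for i in range(3)]
    D = det3(A)
    if abs(D) < 1e-12:
        return None  # degenerate neighborhood (e.g. collinear points)
    coeffs = []
    for k in range(3):
        # Cramer's rule: replace column k of A with the right-hand side.
        Ak = [[rhs[i] if j == k else A[i][j] for j in range(3)]
              for i in range(3)]
        coeffs.append(det3(Ak) / D)
    length = sum(v * v for v in coeffs) ** 0.5
    normal = [v / length for v in coeffs]
    # A normal that faces the origin has a negative dot product with
    # the vector from the origin to the centroid of the points.
    centroid = [sum(p[i] for p in points) / len(points) for i in range(3)]
    if sum(n * c for n, c in zip(normal, centroid)) > 0:
        normal = [-n for n in normal]
    return normal

# Points on the plane z = 2, i.e. 0*x + 0*y + 0.5*z = 1.
pts = [(0, 0, 2), (1, 0, 2), (0, 1, 2), (1, 1, 2), (2, 1, 2)]
normal = fit_plane_normal(pts)
```

For these sample points the fitted normal is parallel to the z axis and, after the flip, points back toward the camera.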

Step 2: Generation of the virtual keys

The point cloud coordinates of the key region are obtained in the Update() function, and the createkey() function is called to calculate the three-dimensional spatial coordinates of each key. The number of keys into which the region is divided is set according to the device screen size and frame rate. The total number of points initially set in step 1.1 divided by the currently set number of keys gives the number of points in each key. In the draw() function, the three-dimensional keys are drawn with OpenGL from the viewpoint of the Kinect camera, so that the key range is cropped directly on the screen with the mouse. After the virtual keys are set, each key position is mapped to the note file for that position using the play command, in preparation for subsequent note playback;

Step 3: Key-press detection

Once the piano has been set up, the system detects in real time whether the user presses a key, that is, it performs press detection.

The finger point cloud value is compared with the number of points in the key region to judge whether the state of a key has changed. Once a change in the point count of a response region is detected, and at that moment the finger is located in the key region and the range of the change matches the finger, the key state is deemed to have changed, that is, the key is deemed to have been pressed. This dual criterion improves the accuracy of piano playing and gives the user a good experience;

Step 4: Playing the corresponding note

When a virtual key is detected as pressed, the virtual piano responds immediately: it plays the note sound file corresponding to the key, and the key in the interface changes color accordingly to show that the corresponding key has been pressed. Because the user may play quickly, the system has a demanding real-time requirement. To meet it, asynchronous playback is first used so that playing a note does not affect the graphics display, that is, the graphics are displayed at the same time as the key press is detected; second, a separate thread is started to play the sound, so that sound playback and display do not affect each other;

The present invention plays sound with the MIDI player built into the Windows system, sending sound playback commands to the virtual key system through the midiOutShortMsg() function. When multiple notes are played at the same time, the sound quality drops, so a higher-quality open professional piano sound library, the Acoustica Pianissimo sound library, is used to make the key sounds more pleasant. Because multiple keys may be pressed at the same time and the playback of different notes must not conflict, a thread pool is used to avoid the abuse of individual threads and to ensure that multiple notes can be played simultaneously without affecting one another.
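The message passed to midiOutShortMsg() is a single 32-bit value with the status byte in the low-order byte, followed by the two data bytes. The Python sketch below only builds that value for a Note On event and does not call the Windows API; the function name `pack_note_on` and the chosen note and velocity are illustrative assumptions.

```python
def pack_note_on(note, velocity, channel=0):
    """Pack a MIDI Note On event into the 32-bit value expected by
    midiOutShortMsg: status in the low byte, then note, then velocity."""
    if not (0 <= note <= 127 and 0 <= velocity <= 127 and 0 <= channel <= 15):
        raise ValueError("note/velocity must be 0-127, channel 0-15")
    status = 0x90 | channel          # 0x9n = Note On for channel n
    return status | (note << 8) | (velocity << 16)

msg = pack_note_on(note=60, velocity=100)   # MIDI note 60
```

With note 60 and velocity 100 on channel 0 the packed value is 0x643C90; each virtual key would map to its own note number.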

Because the acquired depth image, and the depth map drawn with OpenCV, contain many holes and much noise, and their edges are not smooth, the present invention filters the image. To avoid blurring the edges, median filtering is used: in the update() function, the acquired depth data is median-filtered with a 5x5 window.
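A 5x5 median filter over a depth image can be sketched as follows. This is a plain, unoptimized illustration; the function name `median_filter` and the border handling (using only the part of the window that lies inside the image) are assumptions, since the source does not say how borders are treated.

```python
from statistics import median

def median_filter(depth, size=5):
    """Median-filter a 2D depth image with a size x size window.
    Border pixels use only the part of the window inside the image."""
    half = size // 2
    rows, cols = len(depth), len(depth[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            window = [depth[rr][cc]
                      for rr in range(max(0, r - half), min(rows, r + half + 1))
                      for cc in range(max(0, c - half), min(cols, c + half + 1))]
            out[r][c] = median(window)
    return out

# A flat 5x5 depth patch with one hole (depth 0) in the center.
depth = [[100] * 5 for _ in range(5)]
depth[2][2] = 0
filtered = median_filter(depth)
```

The isolated hole is replaced by the surrounding depth, while a genuine step edge would survive because the median picks an existing value rather than averaging across the edge.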

In the normal map calculation of step 1.4, if for each point a triangle is simply formed with its adjacent points and the normal of that triangle is calculated, a change in any one of the points affects the final normal direction. Using more of the surrounding points makes the normal more stable, but taking too many makes the computation too expensive. Borrowing the idea of a two-dimensional image filter, the 8-neighborhood points around the point of interest are taken as neighbors. Because the pixels in the image carry depth information, the depth of the 8-neighborhood points must also be checked: an 8-neighborhood point is used as a neighbor in the subsequent calculation only when the difference between its depth and the depth of the center point is within 5% of the center point's depth value.
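The 5% depth test on the 8-neighborhood can be sketched directly. The function name `valid_neighbors` and the sample depth values are hypothetical; the tolerance follows the 5% figure given in the text.

```python
def valid_neighbors(depth, r, c, tolerance=0.05):
    """Return the 8-neighborhood pixels of (r, c) whose depth differs
    from the center depth by at most `tolerance` of the center depth."""
    center = depth[r][c]
    neighbors = []
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if dr == 0 and dc == 0:
                continue
            rr, cc = r + dr, c + dc
            if 0 <= rr < len(depth) and 0 <= cc < len(depth[0]):
                if abs(depth[rr][cc] - center) <= tolerance * center:
                    neighbors.append((rr, cc))
    return neighbors

# Hypothetical depth patch; the center pixel has depth 1000.
depth = [[1000, 1010, 1500],
         [ 990, 1000, 1040],
         [1060, 1000,  960]]
near = valid_neighbors(depth, 1, 1)
```

Here the pixels with depths 1500 and 1060 are rejected because they differ from the center by more than 50, so points across a depth discontinuity do not distort the fitted plane.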

The finger discrimination process is further explained as follows. When a moving object passes through the key selection region, the number of points in the region changes. A global variable is therefore set to record the number of frames refreshed by the camera during the object's movement; the point counts of at least the first 20 frames are taken, the highest and lowest point counts are removed, and the remaining counts are averaged to obtain the current point count, which reduces the error while using as few resources as possible.
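The averaging rule above is a trimmed mean over per-frame point counts. A minimal sketch, assuming that exactly one maximum and one minimum are dropped and using a hypothetical function name `trimmed_mean` and made-up counts:

```python
def trimmed_mean(counts):
    """Average the per-frame point counts after dropping one maximum
    and one minimum value."""
    if len(counts) < 3:
        raise ValueError("need at least three frames")
    ordered = sorted(counts)
    kept = ordered[1:-1]
    return sum(kept) / len(kept)

# 20 hypothetical per-frame point counts for one key region.
frame_counts = [1000] * 18 + [400, 1600]
current = trimmed_mean(frame_counts)
```

Dropping the extremes keeps a single noisy frame from shifting the estimate, at the cost of one sort per update.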

Playing a single note by calling MIDI gives acceptable results, but when multiple notes are played at the same time the sounds blend too poorly. Professionals define one audio file for each note so that they can freely use their own sound library. A thread pool is created, and for each key a thread is created to call the audio file; the thread pool solves the thread reuse problem and reduces overhead.

Different environments, for example different lighting, call for different thresholds, and even when different keys are close together their point counts still differ, so fixing the point count of the key space and using a fixed variation range is clearly not ideal. The present invention first obtains the point count of each key when it is not touched, taken over 20 or more frames and averaged after the maximum and minimum are removed. Each key uses 10 percent of its own total point count as the threshold; when the fluctuation of a key's average point count is smaller than the threshold, the key neither changes color nor sounds. As shown in Fig. 7, an object blocking the view in front of the keys does not affect the key state.
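The per-key adaptive threshold can be sketched as a comparison against each key's own untouched baseline. The function name `key_active`, the key labels, and the baseline values are hypothetical; the 10% ratio is the figure given in the text.

```python
def key_active(baseline, current_avg, ratio=0.10):
    """Return True only when the averaged point count deviates from
    the key's untouched baseline by at least ratio * baseline."""
    threshold = ratio * baseline
    return abs(current_avg - baseline) >= threshold

# Hypothetical untouched baselines for two adjacent keys.
baselines = {"C4": 1000.0, "D4": 800.0}
states = {
    "C4": key_active(baselines["C4"], 950.0),   # fluctuation of 50
    "D4": key_active(baselines["D4"], 700.0),   # fluctuation of 100
}
```

Because the threshold scales with each key's baseline, a fluctuation of 50 points leaves the first key silent while a fluctuation of 100 points activates the second.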

To verify the feasibility of the technical solution of the present invention, a Kinect device was connected to a computer with the system of the present invention installed, and several public demonstrations were given: playing by hand on a classroom desktop and playing by foot on a dormitory floor. The present invention achieved good timbre and sound quality and a fast response.

Claims

1. A construction method for a Kinect-based virtual piano playing system, characterized by:

Step 1: Three-dimensional reconstruction of the scene

The Kinect camera is turned on to capture spatial depth information. Using the depth information of the real scene obtained by the camera, the point cloud coordinates of every pixel in each depth frame are calculated, the point cloud coordinates are triangulated, and the normal vector of each pixel is calculated from its coordinate information and depth information after triangulation, thereby realizing three-dimensional reconstruction;

The detailed steps are as follows:

Step 1.1: Depth map acquisition

The Kinect is controlled through Microsoft's Kinect SDK, and three-dimensional drawing uses openFrameworks. When three-dimensional reconstruction starts, the total number of points in the point cloud is set for OpenGL and the three-dimensional scene, and the key region of the virtual piano is marked on the screen with mouse clicks. The depth information captured by the Kinect camera is obtained with the NuiImageStreamGetNextFrame() function in the SDK and updated frame by frame in update(), and the depth information of each frame is stored in the configured buffer;

Step 1.2: Point cloud acquisition

A class is created that computes the point cloud from the depth information obtained in step 1.1 using the NuiTransformDepthImageToSkeleton() function in the Kinect SDK. The resulting point cloud is stored as a matrix in which each element represents one point of the cloud, and each element corresponds to the pixel at the same row and column coordinates in the depth image;

Step 1.3: Point cloud triangulation

Step 1.4: Vertex normal map

The present invention adopts the convention that the positive direction of every normal faces the camera. A best-fit plane is fitted by the least squares method to all the points adjacent to a given point, and the normal of the best-fit plane is taken as the normal of that point. The plane is chosen so that the sum of squared distances from the point and its surrounding neighbors to the plane is minimized, that is, the following expression is minimized:

where i is the index of the point being calculated, M is the sum of squared distances from the point and its surrounding neighbors to the best-fit plane, (a, b, c) is the normal vector of the plane, and x_i, y_i, z_i are the coordinate values of the point. Taking the partial derivative with respect to each of the three parameters of the normal vector gives:

For formula (1) to attain its minimum, the following three equations must hold:

According to Cramer's rule, the solution that minimizes formula (1) is

Step 2: Generation of the virtual keys

In the scene reconstructed by the Kinect, on any unobstructed plane, the specific region in which keys are to be generated is selected by clicking on the PC screen with the mouse, and virtual keys are drawn in this region with OpenGL;

After the virtual piano environment scene has been built, the position at which the keys will be generated is selected in the constructed virtual environment. The region is selected with mouse clicks, which are monitored with mousePressed() to obtain the two-dimensional screen coordinates of the key region chosen by the user; these two-dimensional coordinates are screen coordinates whose origin is the center of the computer screen. To build three-dimensional keys, the point cloud coordinates obtained in step 1.2 are used, which lie in the depth image space whose origin is the Kinect camera. The previously calculated point cloud data is stored as a matrix, and the corresponding point cloud coordinates are obtained from the row and column correspondence between screen coordinates and depth image coordinates;

The point cloud coordinates of the key region are obtained in the Update() function, and the createkey() function is called to calculate the three-dimensional spatial coordinates of each key. The number of keys into which the region is divided is set according to the device screen size and frame rate. The total number of points initially set in step 1.1 divided by the currently set number of keys gives the number of points in each key. In the draw() function, the three-dimensional keys are drawn with OpenGL from the viewpoint of the Kinect camera, so that the key range is cropped directly on the screen with the mouse. After the virtual keys are set, each key position is mapped to the note file for that position using the play command;

Step 3: Key-press detection

The finger point cloud value is compared with the number of points in the key region to judge whether the state of a key has changed. Once a change in the point count of a response region is detected, and at that moment the finger is located in the key region and the range of the change matches the finger, the key state is deemed to have changed, that is, the key is deemed to have been pressed;

Step 4: Playing the corresponding note

If key-press detection finds that a key has been pressed, the corresponding note is played and the virtual key changes color to show the user that the press has been detected.

2. The construction method for a Kinect-based virtual piano playing system according to claim 1, characterized in that:

To achieve real-time playback, the present invention first uses asynchronous playback so that playing a note does not affect the graphics display, that is, the graphics are displayed at the same time as the key press is detected; second, a separate thread is started to play the sound, so that sound playback and display do not affect each other.

3. The construction method for a Kinect-based virtual piano playing system according to claim 1, characterized in that:

The present invention plays sound with the MIDI player built into the Windows system, sending sound playback commands to the virtual key system through the midiOutShortMsg() function. When multiple notes are played at the same time, the Acoustica Pianissimo sound library is used, and when multiple keys are pressed simultaneously, a thread pool is used to avoid the abuse of individual threads.

4. The construction method for a Kinect-based virtual piano playing system according to claim 1, characterized in that:

The present invention applies median filtering to the image: in the update() function, the acquired depth data is median-filtered with a 5x5 window.

5. The construction method for a Kinect-based virtual piano playing system according to claim 1, characterized in that:

In the normal map calculation of step 1.4, the present invention takes the 8-neighborhood points around the point of interest as neighbors and checks the depth of each of them; an 8-neighborhood point is used as a neighbor in the subsequent calculation only when the difference between its depth and the depth of the center point is within 5% of the center point's depth value.

6. The construction method for a Kinect-based virtual piano playing system according to claim 1, characterized in that:

When a moving object passes through the key selection region, the number of points in the region changes. The present invention sets a global variable that records the number of frames refreshed by the camera during the object's movement, takes the point counts of at least the first 20 frames, removes the highest and lowest point counts, and averages the remaining counts to obtain the current point count.

7. The construction method for a Kinect-based virtual piano playing system according to claim 1, characterized in that:

The present invention first obtains the point count of each key when it is not touched, taken over 20 or more frames and averaged after the maximum and minimum are removed. Each key uses 10 percent of its own total point count as the threshold; when the fluctuation of a key's average point count is smaller than the threshold, the key neither changes color nor sounds.