CN106095109B - Method for robot online teaching based on gestures and voice - Google Patents

Method for robot online teaching based on gestures and voice

Info

Publication number
CN106095109B
CN106095109B (application CN201610459874.2A)
Authority
CN
China
Prior art keywords
robot
gesture
voice
instruction
control
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201610459874.2A
Other languages
Chinese (zh)
Other versions
CN106095109A (en)
Inventor
杜广龙
邵亨康
陈燕娇
林思洁
姜思君
黄凯鹏
叶玉琦
雷颖仪
张平
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
South China University of Technology SCUT
Original Assignee
South China University of Technology SCUT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by South China University of Technology SCUT filed Critical South China University of Technology SCUT
Priority to CN201610459874.2A priority Critical patent/CN106095109B/en
Publication of CN106095109A publication Critical patent/CN106095109A/en
Application granted granted Critical
Publication of CN106095109B publication Critical patent/CN106095109B/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/017Gesture based interaction, e.g. based on a set of recognized hand gestures
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/16Sound input; Sound output
    • G06F3/167Audio in a user interface, e.g. using voice commands for navigating, audio feedback

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Health & Medical Sciences (AREA)
  • Manipulator (AREA)
  • Numerical Control (AREA)

Abstract

The present invention discloses a method for robot online teaching based on gestures and voice, comprising the following steps: S1, a coarse-tuning process based on gestures; S2, a voice-based fine-tuning process; S3, robot teaching combining gestures and voice. The invention proposes a robot online-teaching method that includes gesture teaching and voice teaching; the operator can guide the robot to complete the corresponding movements by combining coarse adjustment via gestures with fine adjustment via voice. Compared with existing technology it is more natural, flexible and convenient to operate: the operator need not concentrate on how to operate the robot and can control it wholeheartedly to complete tasks.

Description

Method for robot online teaching based on gestures and voice
Technical field
The invention belongs to the field of robot teaching and relates to a human-robot interaction method based on gestures and voice.
Background technique
With the continuous development of robotics, research on robots has deepened and the demands on intelligent technology have grown ever higher. Intelligent interaction and cooperation between humans and robots has become a hot research direction, and robot teaching-playback technology is the foundation of human-robot collaboration. Teaching playback means that the operator first demonstrates the operations and tasks the robot is to complete; the robot learns and memorizes them, and then reproduces the operations. Teaching-playback technology started early abroad and has produced many research results, such as teaching with wearable metal frameworks and other intelligent robot-teaching approaches. Although domestic teaching-playback technology started later, it has developed rapidly: besides traditional teaching through a joystick or teach pendant, there are also offline teaching and virtual teaching. However, these methods all rely on a specific environment and have certain limitations.
This invention proposes a robot teaching-playback technique based on three-dimensional gestures and speech recognition, which offers high flexibility and is untethered. Gesture teaching uses a Leap Motion sensor to obtain the position and attitude data of the hand; interval Kalman filtering and particle filtering are used to process and optimize the acquired gesture data. Through gesture teaching, the robot obtains the position the operator expects it to reach and moves there quickly. Voice teaching uses the Microsoft Speech SDK to recognize speech, converts the operator's natural language into instructions the robot can identify, and then executes those instructions against a pre-built control-instruction corpus. In this teaching-playback technique, the robot uses gesture recognition to coarsely position the operating point, then uses speech recognition to finely position it, and finally performs the relevant operation according to the recognized speech content. The technique has high flexibility, accuracy and efficiency: the operator only needs to issue natural speech and gesture instructions, and the robot can complete the online teaching task quickly and accurately. The inventors therefore believe that teaching playback based on three-dimensional gestures and speech recognition will be the inevitable choice for the development of future intelligent robots and will effectively push intelligent human-machine interaction to a higher level.
Summary of the invention
This invention proposes a method of robot online teaching that includes gesture teaching and voice teaching; the operator can guide the robot to complete the corresponding movements by combining coarse adjustment via gestures with fine adjustment via voice.
The method of the present invention for robot online teaching based on gestures and voice comprises the following steps:
S1, a coarse-tuning process based on gestures;
S2, a voice-based fine-tuning process;
S3, robot teaching combining gestures and voice.
Step S1 comprises the following steps:
Human gestures are natural, intuitive and flexible; an operator can easily express his or her intention through gestures, which gives them an obvious advantage in robot teaching, so gestures are used for the coarse-adjustment operation of the robot. The operator directly controls robot motion through hand movements: data such as hand position and orientation are acquired by the Leap Motion, and after this position and orientation data is processed it is used to control the robot's motion.
1) gesture coordinate system
The Leap Motion captures data such as hand position and orientation through a gesture tracking system in which three coordinate systems are defined:
1: the world coordinate system X_W Y_W Z_W
2: the Leap Motion coordinate system X_L Y_L Z_L
3: the palm coordinate system X_H Y_H Z_H
The transformation from the palm coordinate system X_H Y_H Z_H to the Leap Motion coordinate system X_L Y_L Z_L represents the hand position. Assuming the palm coordinate system X_H Y_H Z_H is rotated relative to the Leap Motion coordinate system X_L Y_L Z_L by angles φ, θ, ψ about the X, Y and Z axes respectively, these rotation angles (φ, θ, ψ) represent the hand orientation.
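As a concrete illustration, the palm-to-Leap-Motion transformation can be sketched in Python. The Z-Y-X rotation order and the function names are assumptions for illustration; the patent does not fix a rotation convention:

```python
import math

def rotation_matrix(phi, theta, psi):
    """Rotation matrix for rotations phi, theta, psi about the X, Y, Z axes
    (composed as Rz(psi) @ Ry(theta) @ Rx(phi); this order is an assumption)."""
    cx, sx = math.cos(phi), math.sin(phi)
    cy, sy = math.cos(theta), math.sin(theta)
    cz, sz = math.cos(psi), math.sin(psi)
    return [
        [cz * cy, cz * sy * sx - sz * cx, cz * sy * cx + sz * sx],
        [sz * cy, sz * sy * sx + cz * cx, sz * sy * cx - cz * sx],
        [-sy,     cy * sx,                cy * cx],
    ]

def palm_to_leap(point_h, angles, origin_hl):
    """Express a point given in the palm frame X_H Y_H Z_H in the Leap Motion
    frame X_L Y_L Z_L: rotate by (phi, theta, psi), then translate by the palm
    origin measured in the Leap frame."""
    R = rotation_matrix(*angles)
    rotated = [sum(R[i][j] * point_h[j] for j in range(3)) for i in range(3)]
    return [rotated[i] + origin_hl[i] for i in range(3)]

# A point 10 mm along the palm X axis, with the palm yawed 90 degrees about Z:
p = palm_to_leap([10.0, 0.0, 0.0], (0.0, 0.0, math.pi / 2), [0.0, 0.0, 0.0])
```

With zero rotation the transform is the identity plus the translation, which is a quick sanity check on the matrix composition.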
2) location estimation is carried out by interval Kalman filtering
The hand position obtained from the Leap Motion is not exact; errors such as hand tremor may appear, which can strongly affect online teaching. Although a Kalman filter can be used for position estimation, in relatively complex environments the data obtained by the system is uncertain and the error of Kalman-filter position estimation may be larger; an interval Kalman filter solves this problem.
The model of the interval Kalman filter is expressed as follows:

x_k^I = A_k^I x_(k-1)^I + B_k^I u_k + w_k   (1)

z_k^I = H_k^I x_k^I + v_k   (2)

Here x_k^I is the n × 1 state vector at time k, A_k^I is the n × n state transition matrix, B_k^I is the n × l control input matrix, u_k is the l × 1 input vector, and w_k and v_k are the noise vectors; z_k^I is the m × 1 measurement vector and H_k^I is the m × n observation matrix, the superscript I denoting interval quantities.
Through the position estimation of the interval Kalman filter, the gesture state x'_k at time k can be expressed as follows:

x'_k = [p_x,k, V_x,k, A_x,k, p_y,k, V_y,k, A_y,k, p_z,k, V_z,k, A_z,k]   (3)

where p, V and A denote the position, velocity and acceleration of the palm along each axis.
In this process, the noise vector is expressed as:

w'_k = [0, 0, w'_x, 0, 0, w'_y, 0, 0, w'_z]^T   (4)
where (w'_x, w'_y, w'_z) is the process noise of the palm acceleration. The gesture position data fused by the interval Kalman filter is more accurate and can be used for the coarse-adjustment control operation of the robot.
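A minimal Python sketch of the filtering idea: a standard 1-D constant-velocity Kalman filter run at the two extremes of the noise-variance intervals, yielding an envelope on the position estimate. True interval Kalman filtering propagates interval matrices through the full recursion; this envelope, and all names and values here, are illustrative assumptions only:

```python
def kalman_1d(zs, q, r, dt=0.01):
    """Minimal 1-D constant-velocity Kalman filter; zs are noisy position
    measurements, q and r the process and measurement noise variances."""
    x, v = zs[0], 0.0                          # state: position, velocity
    P = [[1.0, 0.0], [0.0, 1.0]]               # state covariance
    estimates = []
    for z in zs[1:]:
        # predict: x <- x + v*dt, covariance grows by the process noise q
        x = x + v * dt
        p00 = P[0][0] + dt * (P[0][1] + P[1][0]) + dt * dt * P[1][1] + q
        p01 = P[0][1] + dt * P[1][1]
        p10 = P[1][0] + dt * P[1][1]
        p11 = P[1][1] + q
        # update with the position measurement z
        s = p00 + r
        k0, k1 = p00 / s, p10 / s
        y = z - x
        x, v = x + k0 * y, v + k1 * y
        P = [[(1 - k0) * p00, (1 - k0) * p01],
             [p10 - k1 * p00, p11 - k1 * p01]]
        estimates.append(x)
    return estimates

def interval_kalman_1d(zs, q, dq, r, dr, dt=0.01):
    """Crude interval estimate: run the filter at the extremes of the noise
    intervals [q-dq, q+dq] and [r-dr, r+dr] and report the envelope of the
    two position tracks."""
    lo = kalman_1d(zs, q - dq, r - dr, dt)
    hi = kalman_1d(zs, q + dq, r + dr, dt)
    return [(min(a, b), max(a, b)) for a, b in zip(lo, hi)]

est = kalman_1d([0.0] * 20, q=1e-4, r=1e-2)
band = interval_kalman_1d([0.0] * 20, q=1e-4, dq=5e-5, r=1e-2, dr=5e-3)
```

For a stationary hand the estimate stays at the measurement and the interval band collapses, which is the expected degenerate case.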
3) Attitude estimation is carried out by improved particle filter
A quaternion algorithm can be used to estimate the direction of a rigid body; to reduce the error introduced by the quaternion algorithm, an improved particle filter is used to enhance the data fusion. At time t_k, the approximation of the posterior density is defined as:

p(x_k | z_1:k) ≈ Σ_(i=1)^N w_k^(i) δ(x_k − x_k^(i))   (5)

where x_k^(i) is the i-th state particle at time t_k, N is the number of samples, w_k^(i) is the normalized weight of the i-th particle at time t_k, and δ(·) is the Dirac function.
The resampled particles can therefore be calculated as follows:

x_k^(i) ~ π(x_k | x_(k-1)^(i), z_k)   (6)

At time t_(k+1) the quaternion component of each particle can be expressed as follows:

q_(k+1)^(i) = q_k^(i) ⊗ [cos(|ω_axis,k| t / 2), (ω_axis,k / |ω_axis,k|) sin(|ω_axis,k| t / 2)]   (7)

where ω_axis,k is the angular velocity and t is the sample time. Estimating the hand attitude by the improved particle filter also greatly improves its accuracy, so it too can be used for the coarse-adjustment control operation of the robot.
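The quaternion propagation step of the attitude particle filter can be sketched as follows. The per-particle rate jitter stands in for the proposal distribution; its magnitude and the helper names are assumptions for illustration:

```python
import math
import random

def quat_mul(a, b):
    """Hamilton product of two quaternions (w, x, y, z)."""
    w1, x1, y1, z1 = a
    w2, x2, y2, z2 = b
    return (w1*w2 - x1*x2 - y1*y2 - z1*z2,
            w1*x2 + x1*w2 + y1*z2 - z1*y2,
            w1*y2 - x1*z2 + y1*w2 + z1*x2,
            w1*z2 + x1*y2 - y1*x2 + z1*w2)

def quat_from_rate(omega, t):
    """Unit quaternion for rotating at angular rate omega (rad/s, 3-vector)
    for t seconds (axis-angle exponential)."""
    angle = math.sqrt(sum(w * w for w in omega)) * t
    if angle < 1e-12:
        return (1.0, 0.0, 0.0, 0.0)
    ax = [w * t / angle for w in omega]
    s = math.sin(angle / 2)
    return (math.cos(angle / 2), ax[0] * s, ax[1] * s, ax[2] * s)

def propagate_particles(particles, weights, omega, t, noise=0.01):
    """One prediction step: each particle quaternion q_k^(i) is advanced by
    the measured angular rate plus Gaussian jitter (the proposal)."""
    new = []
    for q in particles:
        jitter = [w + random.gauss(0.0, noise) for w in omega]
        new.append(quat_mul(q, quat_from_rate(jitter, t)))
    return new, weights

particles = [(1.0, 0.0, 0.0, 0.0)] * 100
weights = [1.0 / 100] * 100
particles, weights = propagate_particles(particles, weights,
                                         (0.0, 0.0, 0.0), 0.01, noise=0.0)
```

With zero angular rate and no jitter the particles are unchanged, which checks that the exponential map returns the identity quaternion at zero angle.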
Step S2 comprises the following steps:
Voice is natural, simple and easy to use. Controlling the robot directly by voice commands is simple and direct, so voice is used for the fine-adjustment operation of the robot. The invention uses the Microsoft Speech SDK to acquire speech: when the user issues a spoken instruction, the Microsoft Speech SDK extracts keywords from the voice input and converts the voice information into natural-language text; the text is then processed, the user intention contained in it is converted into a robot control instruction, the control instruction is converted into the corresponding robot operation, and finally the robot completes that operation. The process can therefore be divided into four stages: voice input, speech recognition, intention understanding and operation completion, of which speech recognition and intention understanding are the most important and are discussed next. Before teaching playback is carried out, a complete control-command system and the corresponding voice-control-command corpus are designed in advance. Since the invention studies teaching playback based on three-dimensional gestures as well as speech recognition, the designed control-instruction corpus contains gesture control instructions in addition to voice control instructions. Likewise, when designing the voice-command system, the operator may accompany a spoken instruction with a gesture instruction, so five parameters (C_dir, C_opt, C_hand, C_val, C_unit) are used to identify an instruction. When the operator issues a voice command, the robot first judges whether the spoken instruction involves a gesture instruction: if so, C_hand is set to 1 and the robot switches to executing the gesture instruction; if not, C_hand is set to NULL, the speech is recognized to obtain the direction, operation, characteristic value and unit parameters of the instruction sentence, and the corresponding operation is carried out.
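The extraction of the five parameters from a recognized utterance can be sketched as keyword matching. The vocabularies below are invented for illustration; the patent's actual command corpus is not published:

```python
import re

# Illustrative vocabularies; the actual corpus in the patent is not published.
DIRECTIONS = {"left", "right", "forward", "backward", "up", "down"}
OPERATIONS = {"move", "rotate", "stop", "grip"}
UNITS = {"cm", "mm", "degree", "degrees"}

def parse_command(text):
    """Split a recognized utterance into the five parameters
    (C_dir, C_opt, C_hand, C_val, C_unit) described in the patent.
    C_hand is 1 when the utterance refers to a gesture, else None (NULL)."""
    tokens = text.lower().split()
    c_hand = 1 if "gesture" in tokens else None
    c_dir = next((t for t in tokens if t in DIRECTIONS), None)
    c_opt = next((t for t in tokens if t in OPERATIONS), None)
    c_unit = next((t for t in tokens if t in UNITS), None)
    m = re.search(r"\d+(\.\d+)?", text)
    c_val = float(m.group()) if m else None
    return c_dir, c_opt, c_hand, c_val, c_unit

cmd = parse_command("move left 5 cm")
```

"move left 5 cm" yields direction "left", operation "move", C_hand NULL, value 5.0 and unit "cm", matching the parameter roles the text describes.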
After speech recognition comes the intention-understanding part, which mainly transforms the natural-language instruction into the corresponding robot control instruction. Before the recognized natural-language instruction can be converted, a maximum-entropy classification model is built: text features are first extracted from the training corpus, then TF-IDF feature-vector weighting is applied so that each text is represented as a text feature vector (a text with n words becomes an n-dimensional feature vector). The maximum-entropy algorithm then models the conditional probability of the intended output label given the text feature vector, yielding the most uniformly distributed model, using the formula:

p(y | x) = (1 / Z(x)) exp( Σ_i λ_i f_i(x, y) )

which gives the maximum-entropy probability distribution and completes the maximum-entropy modelling. Here f_i(x, y) is the i-th feature function: f_i(x, y) equals 1 if the text vector and the corresponding output label occur in the same sample and 0 otherwise; λ_i is the weight corresponding to f_i(x, y), and Z(x) is the normalization factor. Once the maximum-entropy classification model is established, a natural-language instruction under test can be converted: text features are extracted from the text to be tested, the text is represented as a text feature vector by the method above, the established maximum-entropy model classifies the feature vector, and the robot control instruction is finally obtained.
There are two modelling patterns: unified attribute modelling and independent attribute modelling. Unified attribute modelling combines all attributes into one instruction, builds the maximum-entropy model on that instruction, and then tests the text. Independent attribute modelling builds a maximum-entropy model for each of the four attributes separately, tests the samples, and finally combines the test results into one output instruction. Unified attribute modelling takes the mutual association between attributes into account and improves the accuracy of the model, but makes the number of combinations very large and classification very difficult. Independent attribute modelling has far fewer combinations, but the lack of association between attributes lowers the accuracy of the model. Unified attribute modelling is adopted to guarantee the accuracy of the model. With the model established above, accurate recognition of speech is achieved, and hence the fine-adjustment control of the robot by voice is realized.
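The maximum-entropy probability computation described above can be sketched directly from the formula. The feature functions and the λ weights below are toy values, not the patent's trained model:

```python
import math

def maxent_prob(features, weights, labels):
    """p(y|x) = exp(sum_i lambda_i f_i(x, y)) / Z(x): the maximum-entropy
    distribution over output labels.  features(y) returns the binary feature
    values f_i(x, y) for a fixed input x; weights are the lambda_i."""
    def score(y):
        return math.exp(sum(l * f for l, f in zip(weights, features(y))))
    z = sum(score(y) for y in labels)          # normalization factor Z(x)
    return {y: score(y) / z for y in labels}

# Two binary features, one firing for each candidate intent label:
probs = maxent_prob(lambda y: [1.0 if y == "move_left" else 0.0,
                               1.0 if y == "move_right" else 0.0],
                    weights=[2.0, 0.5],
                    labels=["move_left", "move_right"])
```

A higher λ on the "move_left" feature makes that label more probable, and the probabilities sum to one by construction of Z(x).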
Step S3 comprises the following steps:
Robot online teaching is carried out in two ways, gesture and voice: gestures are responsible for coarse adjustment and voice for fine adjustment. Voice control is divided into two kinds of orders, controlling orders and commanding orders. Through a controlling order the operator can start or end the online-teaching process and can also switch between gesture instructions and commanding orders. The flow chart of online teaching is shown in Fig. 1:
First, the operator issues the start voice command; after receiving it the robot is in a standby state, ready to receive new instructions at any time. The operator can then switch to the gesture-control state, in which the robot's movements in all directions are controlled by gesture. The amplitudes of these movements are large, so they are easier to control by gesture. This is called the coarse-tuning process of the robot.
In some cases, however, the movement the robot must make is small, and controlling it by gesture is then difficult, because small distances are hard to control with the human hand. At this point the operator can switch to voice control by voice and direct the robot's movements in all directions by speech. This is called the fine-tuning process of the robot.
In most cases the coarse adjustment and fine adjustment of the robot are bound together. For example, the operator points in a direction while commanding the robot by voice to move that way; the robot reads the content of the voice instruction and at the same time reads the direction the gesture indicates, and performs the correct operation. Overall control here is carried out with IF-THEN rules: a series of rules is designed to realize the combination of gesture instructions and voice instructions.
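The rule-based combination of the two channels can be sketched as follows. The individual rules are illustrative assumptions; the patent states only that a series of IF-THEN rules combines gesture and voice instructions:

```python
def combine(gesture_dir, voice_cmd):
    """IF-THEN rules combining the coarse gesture channel with the fine voice
    channel.  voice_cmd is the five-parameter tuple
    (C_dir, C_opt, C_hand, C_val, C_unit); the rules below are illustrative."""
    c_dir, c_opt, c_hand, c_val, c_unit = voice_cmd
    if c_opt == "stop":
        return ("stop",)                      # IF voice says stop THEN stop
    if c_hand == 1 and gesture_dir:           # IF voice defers to the gesture
        return ("move", gesture_dir, c_val, c_unit)
    if c_dir:                                 # IF voice names a direction
        return ("move", c_dir, c_val, c_unit)
    return ("hold",)                          # ELSE wait for a usable command

action = combine("left", (None, "move", 1, 5.0, "cm"))
```

Here the voice command defers to the pointed direction ("move left 5 cm"), while an explicit spoken direction would override the gesture, reflecting voice's fine-adjustment role.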
Finally, when the operation is over, the operator issues the end command, the robot ends the corresponding operation, and the whole teaching process is complete.
This way of combining gestures and voice to control robot motion is natural, flexible and convenient to operate.
Compared with the prior art, the present invention has the following advantages and effects:
The invention proposes a method of robot online teaching that includes gesture teaching and voice teaching; the operator can guide the robot to complete the corresponding movements by combining coarse adjustment via gestures with fine adjustment via voice. Compared with existing technology it is more natural, flexible and convenient to operate: the operator need not concentrate on how to operate the robot and can control it wholeheartedly to complete tasks.
Detailed description of the invention
Fig. 1 is the flow chart of on-line teaching.
Specific embodiment
The present invention is described in further detail below with reference to an embodiment; embodiments of the present invention are not limited thereto.
Embodiment:
The method of the present invention for robot online teaching based on gestures and voice comprises the following steps:
S1, a coarse-tuning process based on gestures;
S2, a voice-based fine-tuning process;
S3, robot teaching combining gestures and voice.
Step S1 comprises the following steps:
Human gestures are natural, intuitive and flexible; an operator can easily express his or her intention through gestures, which gives them an obvious advantage in robot teaching, so gestures are used for the coarse-adjustment operation of the robot. The operator directly controls robot motion through hand movements: data such as hand position and orientation are acquired by the Leap Motion, and after this position and orientation data is processed it is used to control the robot's motion.
1) gesture coordinate system
The Leap Motion captures data such as hand position and orientation through a gesture tracking system in which three coordinate systems are defined:
1: the world coordinate system X_W Y_W Z_W
2: the Leap Motion coordinate system X_L Y_L Z_L
3: the palm coordinate system X_H Y_H Z_H
The transformation from the palm coordinate system X_H Y_H Z_H to the Leap Motion coordinate system X_L Y_L Z_L represents the hand position. Assuming the palm coordinate system X_H Y_H Z_H is rotated relative to the Leap Motion coordinate system X_L Y_L Z_L by angles φ, θ, ψ about the X, Y and Z axes respectively, these rotation angles (φ, θ, ψ) represent the hand orientation.
2) location estimation is carried out by interval Kalman filtering
The hand position obtained from the Leap Motion is not exact; errors such as hand tremor may appear, which can strongly affect online teaching. Although a Kalman filter can be used for position estimation, in relatively complex environments the data obtained by the system is uncertain and the error of Kalman-filter position estimation may be larger; an interval Kalman filter solves this problem.
The model of the interval Kalman filter is expressed as follows:

x_k^I = A_k^I x_(k-1)^I + B_k^I u_k + w_k   (1)

z_k^I = H_k^I x_k^I + v_k   (2)

Here x_k^I is the n × 1 state vector at time k, A_k^I is the n × n state transition matrix, B_k^I is the n × l control input matrix, u_k is the l × 1 input vector, and w_k and v_k are the noise vectors; z_k^I is the m × 1 measurement vector and H_k^I is the m × n observation matrix, the superscript I denoting interval quantities.
Through the position estimation of the interval Kalman filter, the gesture state x'_k at time k can be expressed as follows:

x'_k = [p_x,k, V_x,k, A_x,k, p_y,k, V_y,k, A_y,k, p_z,k, V_z,k, A_z,k]   (3)

where p, V and A denote the position, velocity and acceleration of the palm along each axis.
In this process, the noise vector is expressed as:

w'_k = [0, 0, w'_x, 0, 0, w'_y, 0, 0, w'_z]^T   (4)
where (w'_x, w'_y, w'_z) is the process noise of the palm acceleration. The observation matrix of the position estimation can then be defined as follows (only the position components of the state are measured):

H = [ 1 0 0 0 0 0 0 0 0 ; 0 0 0 1 0 0 0 0 0 ; 0 0 0 0 0 0 1 0 0 ]

Through the interval Kalman filter, the covariances of the model error and the observation error are obtained as:

Q_t^I = [Q_t − ΔQ_t, Q_t + ΔQ_t],  R_t^I = [R_t − ΔR_t, R_t + ΔR_t]

where ΔQ_t and ΔR_t are non-negative constant matrices. The gesture position data fused by the interval Kalman filter is more accurate and can be used for the coarse-adjustment control operation of the robot.
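Constructing the covariance intervals from the nominal matrices and the constant perturbation matrices ΔQ_t and ΔR_t is elementwise arithmetic, sketched below (matrix sizes and values are illustrative):

```python
def interval_covariance(Q, dQ, R, dR):
    """Elementwise interval bounds [Q - dQ, Q + dQ] and [R - dR, R + dR] on
    the process- and measurement-noise covariances, with dQ, dR playing the
    role of the non-negative constant matrices Delta-Q_t and Delta-R_t."""
    q_lo = [[q - d for q, d in zip(qr, dr)] for qr, dr in zip(Q, dQ)]
    q_hi = [[q + d for q, d in zip(qr, dr)] for qr, dr in zip(Q, dQ)]
    r_lo = [[r - d for r, d in zip(rr, dr)] for rr, dr in zip(R, dR)]
    r_hi = [[r + d for r, d in zip(rr, dr)] for rr, dr in zip(R, dR)]
    return (q_lo, q_hi), (r_lo, r_hi)

q_bounds, r_bounds = interval_covariance([[1.0]], [[0.1]], [[2.0]], [[0.2]])
```

Each recursion step of the interval filter then propagates these bounds instead of point covariances, which is what widens the state estimate into an interval.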
3) Attitude estimation is carried out by improved particle filter
A quaternion algorithm can be used to estimate the direction of a rigid body; to reduce the error introduced by the quaternion algorithm, an improved particle filter is used to enhance the data fusion. At time t_k, the approximation of the posterior density is defined as:

p(x_k | z_1:k) ≈ Σ_(i=1)^N w_k^(i) δ(x_k − x_k^(i))

where x_k^(i) is the i-th state particle at time t_k, N is the number of samples, w_k^(i) is the normalized weight of the i-th particle at time t_k, and δ(·) is the Dirac function.
The resampled particles can therefore be calculated as follows:

x_k^(i) ~ π(x_k | x_(k-1)^(i), z_k)

At time t_(k+1) the quaternion component of each particle can be expressed as follows:

q_(k+1)^(i) = q_k^(i) ⊗ [cos(|ω_axis,k| t / 2), (ω_axis,k / |ω_axis,k|) sin(|ω_axis,k| t / 2)]

where ω_axis,k is the angular velocity and t is the sample time. Estimating the hand attitude by the improved particle filter also greatly improves its accuracy, so it too can be used for the coarse-adjustment control operation of the robot.
Step S2 comprises the following steps:
Voice is natural, simple and easy to use. Controlling the robot directly by voice commands is simple and direct, so voice is used for the fine-adjustment operation of the robot. The invention uses the Microsoft Speech SDK to acquire speech: when the user issues a spoken instruction, the Microsoft Speech SDK extracts keywords from the voice input and converts the voice information into natural-language text; the text is then processed, the user intention contained in it is converted into a robot control instruction, the control instruction is converted into the corresponding robot operation, and finally the robot completes that operation. The process can therefore be divided into four stages: voice input, speech recognition, intention understanding and operation completion, of which speech recognition and intention understanding are the most important and are discussed next.
Before teaching playback is carried out, a complete control-command system and the corresponding voice-control-command corpus are designed in advance. Since the invention studies teaching playback based on three-dimensional gestures as well as speech recognition, the designed control-instruction corpus contains gesture control instructions in addition to voice control instructions. Likewise, when designing the voice-command system, the operator may accompany a spoken instruction with a gesture instruction, so five parameters (C_dir, C_opt, C_hand, C_val, C_unit) are used to identify an instruction. When the operator issues a voice command, the robot first judges whether the spoken instruction involves a gesture instruction: if so, C_hand is set to 1 and the robot switches to executing the gesture instruction; if not, C_hand is set to NULL, the speech is recognized to obtain the direction, operation, characteristic value and unit parameters of the instruction sentence, and the corresponding operation is carried out.
After speech recognition comes the intention-understanding part, which mainly transforms the natural-language instruction into the corresponding robot control instruction. Before the recognized natural-language instruction can be converted, a maximum-entropy classification model is built: text features are first extracted from the training corpus, then TF-IDF feature-vector weighting is applied so that each text is represented as a text feature vector (a text with n words becomes an n-dimensional feature vector). The maximum-entropy algorithm then models the conditional probability of the intended output label given the text feature vector, yielding the most uniformly distributed model, using the formula:

p(y | x) = (1 / Z(x)) exp( Σ_i λ_i f_i(x, y) )

which gives the maximum-entropy probability distribution and completes the maximum-entropy modelling. Here f_i(x, y) is the i-th feature function: f_i(x, y) equals 1 if the text vector and the corresponding output label occur in the same sample and 0 otherwise; λ_i is the weight corresponding to f_i(x, y), and Z(x) is the normalization factor. Once the maximum-entropy classification model is established, a natural-language instruction under test can be converted: text features are extracted from the text to be tested, the text is represented as a text feature vector by the method above, the established maximum-entropy model classifies the feature vector, and the robot control instruction is finally obtained.
There are two modelling patterns: unified attribute modelling and independent attribute modelling. Unified attribute modelling combines all attributes into one instruction, builds the maximum-entropy model on that instruction, and then tests the text. Independent attribute modelling builds a maximum-entropy model for each of the four attributes separately, tests the samples, and finally combines the test results into one output instruction. Unified attribute modelling takes the mutual association between attributes into account and improves the accuracy of the model, but makes the number of combinations very large and classification very difficult. Independent attribute modelling has far fewer combinations, but the lack of association between attributes lowers the accuracy of the model. The present invention uses unified attribute modelling to guarantee the accuracy of the model. With the model established above, accurate recognition of speech is achieved, and hence the fine-adjustment control of the robot by voice is realized.
Step S3 comprises the following steps:
Robot online teaching is carried out in two ways, gesture and voice: gestures are responsible for coarse adjustment and voice for fine adjustment. Voice control is divided into two kinds of orders, controlling orders and commanding orders. Through a controlling order the operator can start or end the online-teaching process and can also switch between gesture instructions and commanding orders. The flow chart of online teaching is shown in Fig. 1:
First, the operator issues the start voice command; after receiving it the robot is in a standby state, ready to receive new instructions at any time. The operator can then switch to the gesture-control state, in which the robot's movements in all directions are controlled by gesture. The amplitudes of these movements are large, so they are easier to control by gesture. This is called the coarse-tuning process of the robot.
In some cases, however, the movement the robot must make is small, and controlling it by gesture is then difficult, because small distances are hard to control with the human hand. At this point the operator can switch to voice control by voice and direct the robot's movements in all directions by speech. This is called the fine-tuning process of the robot.
In most cases the coarse adjustment and fine adjustment of the robot are bound together. For example, the operator points in a direction while commanding the robot by voice to move that way; the robot reads the content of the voice instruction and at the same time reads the direction the gesture indicates, and performs the correct operation. Overall control here is carried out with IF-THEN rules: a series of rules is designed to realize the combination of gesture instructions and voice instructions.
Finally, when the operation is over, the operator issues the end command, the robot ends the corresponding operation, and the whole teaching process is complete.
This way of combining gestures and voice to control robot motion is natural, flexible and convenient to operate.
The above embodiment is a preferred embodiment of the present invention, but embodiments of the present invention are not limited by it; any other change, modification, substitution, combination or simplification made without departing from the spirit and principle of the present invention is an equivalent replacement and is included within the scope of protection of the present invention.

Claims (3)

1. A method for robot online teaching based on gestures and voice, characterized by comprising the following steps:
S1, a coarse-tuning process based on gestures;
S2, a voice-based fine-tuning process;
S3, robot teaching combining gestures and voice; step S1 comprises:
1) Gesture coordinate systems
The Leap Motion captures hand position and orientation data through its gesture-tracking system, and three coordinate systems are defined:
1: the world coordinate system XWYWZW;
2: the Leap Motion coordinate system XLYLZL;
3: the palm coordinate system XHYHZH;
The transformation from the palm coordinate system XHYHZH to the Leap Motion coordinate system XLYLZL represents the hand position. Assuming the rotations of the Leap Motion coordinate system XLYLZL relative to the palm coordinate system XHYHZH about the X, Y and Z axes are φ, θ and ψ respectively, these rotation angles (φ, θ, ψ) represent the gesture orientation;
2) Position estimation by interval Kalman filtering
The hand position obtained from the Leap Motion is inaccurate; interval Kalman filtering is used to solve this problem. The interval Kalman filter model is expressed as follows:
x′k = A′k x′k−1 + B′k u′k + w′k−1 (1)
z′k = H′k x′k + v′k (2)
Here x′k is the n × 1 state vector at time k, A′k is the n × n state transition matrix, B′k is the n × l control input matrix, u′k is the l × 1 input vector, and w′k and v′k are the noise vectors; z′k is the m × 1 measurement vector at time k and H′k is the m × n observation matrix.
Through the position estimation of the interval Kalman filter, the gesture state x′k at time k is expressed as follows:
x′k=[px,k,Vx,k,Ax,k,py,k,Vy,k,Ay,k,pz,k,Vz,k,Az,k] (3)
In this process, the noise vector is expressed as:
w′k=[0,0,w′x,0,0,w′y,0,0,w′z]T (4)
where (w′x, w′y, w′z) is the process noise of the palm acceleration. The gesture position data fused by the interval Kalman filter is more accurate and can be used for the coarse-tuning control of the robot;
3) Attitude estimation by an improved particle filter
A quaternion algorithm is used to estimate the rigid-body orientation, and an improved particle filter is used to enhance the data fusion. At time tk, the approximation of the posterior density is defined as:
p(xk|z1:k) ≈ Σ(i=1..N) ωik δ(xk − x̂ik) (5)
where x̂ik is the i-th state particle at time tk, N is the number of samples, ωik is the normalized weight of the i-th particle at time tk, and δ(·) is the Dirac function;
Therefore, the analyzed particles can be calculated as follows:
ωik ∝ ωik−1 p(zk|x̂ik) (6)
At time tk+1, the quaternion component of each particle can be expressed as follows:
qik+1 = qik ⊗ [cos(|ωaxis,k|t/2), (ωaxis,k/|ωaxis,k|) sin(|ωaxis,k|t/2)] (7)
where ωaxis,k is the angular velocity and t is the sample time.
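The Kalman recursion described above for the gesture position can be illustrated with a plain (non-interval) single-axis filter over the state [p, V] (position, velocity). This is a sketch under stated assumptions: the patent's interval variant additionally propagates interval bounds on the matrices, which is omitted here, and all numeric values (dt, q, r, the measurement sequence) are illustrative.

```python
# Minimal single-axis Kalman filter sketch for a gesture state [p, V].
# The interval arithmetic of the patent's interval Kalman filter is omitted;
# only the core predict/update recursion is shown.

def kalman_1d(measurements, dt=0.01, q=1e-3, r=1e-2):
    # State x = [p, V]; transition A = [[1, dt], [0, 1]]; observation H = [1, 0].
    p, v = 0.0, 0.0
    P = [[1.0, 0.0], [0.0, 1.0]]  # state covariance
    out = []
    for z in measurements:
        # Predict: x = A x,  P = A P A^T + Q (Q = diag(q, q))
        p, v = p + dt * v, v
        P = [[P[0][0] + dt * (P[1][0] + P[0][1]) + dt * dt * P[1][1] + q,
              P[0][1] + dt * P[1][1]],
             [P[1][0] + dt * P[1][1],
              P[1][1] + q]]
        # Update with a position measurement z: K = P H^T / (H P H^T + R)
        s = P[0][0] + r
        k0, k1 = P[0][0] / s, P[1][0] / s
        innov = z - p
        p, v = p + k0 * innov, v + k1 * innov
        # P = (I - K H) P
        P = [[(1 - k0) * P[0][0], (1 - k0) * P[0][1]],
             [P[1][0] - k1 * P[0][0], P[1][1] - k1 * P[0][1]]]
        out.append(p)
    return out

est = kalman_1d([0.10, 0.11, 0.12, 0.13])
```

With a small measurement-noise variance the filtered positions track the measurements closely, which is the behavior the coarse-tuning control relies on.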
2. The method for online robot teaching based on gesture and voice according to claim 1, characterized in that the step S2 comprises:
The operator controls the robot directly by voice commands, voice being used for fine-tuning operations. The Microsoft Speech SDK is used to acquire the voice: when the user issues a voice instruction, the Microsoft Speech SDK extracts keywords from the voice input and converts the voice information into natural-language text; the natural-language text information is then processed, the user intention contained in it is converted into a robot control instruction, and the robot control instruction is finally converted into the corresponding robot operation, which the robot then completes. The process can therefore be divided into four stages: voice input, speech recognition, intention understanding and operation execution;
A control command system and the corresponding voice control command corpus are designed in advance, before the robot teaching and reproduction. Besides voice control commands, the control instruction corpus also contains gesture control instructions. When the voice control command system is designed, it is taken into account that a gesture instruction may accompany a voice instruction issued by the operator, so five parameters (Cdir, Copt, Chand, Cval, Cunit) are used to identify an instruction. When the operator gives a voice command to the robot, the robot first judges whether the voice instruction contains a gesture instruction: if it does, Chand is set to 1 and the gesture instruction is executed; if not, Chand is set to NULL, the voice is recognized, the direction, operation, characteristic value and unit parameters in the instruction sentence are obtained, and the corresponding operation is performed;
After speech recognition comes the intention-understanding part, which converts the natural-language instruction into the corresponding robot control instruction. Before the recognized natural-language instruction is converted, a maximum entropy classification model is built: text features are first extracted from the training corpus, the text features are then weighted with TF-IDF to form feature vectors, and each text is represented as a text feature vector, n words being represented as an n-dimensional feature vector. The maximum entropy algorithm is then used to model the conditional probability between the text feature vector and the corresponding intention output label, obtaining the most uniformly distributed model, using the formula:
p(y|x) = (1/Z(x)) exp(Σi λi fi(x,y)) (8)
to obtain the maximum entropy probability distribution and complete the maximum entropy modeling. Here fi(x, y) is the i-th feature function: if the text vector and the corresponding output label occur in the same sample, fi(x, y) equals 1, otherwise 0; λi is the weight corresponding to fi(x, y), and Z(x) is the normalization factor. After the maximum entropy classification model is established, the natural-language instruction to be tested can be converted: text features are first extracted from the text to be tested, the text is represented as a text feature vector by the method described above, the established maximum entropy classification model is then used to classify the text feature vector, and the robot control instruction is finally obtained;
There are two modeling patterns: unified attribute modeling and independent attribute modeling. Unified attribute modeling combines all attributes into one instruction, performs maximum entropy modeling on that instruction, and then tests the text; independent attribute modeling performs maximum entropy modeling on the four attributes separately, then tests the samples, and finally combines the test results into one output instruction.
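A toy version of the maximum entropy distribution described above can be written directly from its definition. The word-label feature functions and the weights λi below are invented for illustration; they are not the patent's trained model or corpus.

```python
# Sketch of the maximum entropy conditional distribution p(y|x):
# binary feature functions f_i(x, y), weights lambda_i, normalizer Z(x).
# Feature functions are keyed as (word, label) pairs; f_i = 1 when the word
# appears in the text and the label matches, else 0.
import math

def maxent_prob(words, label, labels, weights):
    """p(label | words) = exp(sum_i lambda_i * f_i(x, y)) / Z(x)."""
    def score(y):
        return sum(weights.get((w, y), 0.0) for w in words)
    z = sum(math.exp(score(y)) for y in labels)  # normalization factor Z(x)
    return math.exp(score(label)) / z

labels = ["move_left", "move_right"]
weights = {("left", "move_left"): 2.0, ("right", "move_right"): 2.0}
p = maxent_prob(["move", "left"], "move_left", labels, weights)
```

Here the word "left" fires only the feature tied to the `move_left` intention, so that label receives most of the probability mass; training would fit the λi so that the distribution is maximally uniform subject to the feature constraints.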
3. The method for online robot teaching based on gesture and voice according to claim 1, characterized in that the step S3 comprises:
Online robot teaching is performed through the two modalities of gesture and voice, gesture being responsible for coarse tuning and voice for fine tuning. The voice control system is divided into two kinds of orders: controlling orders and commanding orders; through controlling orders the operator can start or terminate the online robot teaching process, and can also switch between gesture instructions and commanding orders;
First, the operator issues a start voice command; after receiving the order, the robot enters a standby state, ready to accept new instructions. The operator then sets the command mode to the gesture-control state, in which the forward, backward, left and right movements of the robot are controlled by gesture; the amplitude of movement in this state is large, so it is easier to control with gestures. This is called the coarse-tuning process of the robot;
However, when the range of movement required of the robot is small, it is relatively difficult for the operator to control the robot by gesture, because human gestures are hard to control precisely over small distances; the operator can then switch to voice control by a spoken command and direct the forward, backward, left and right movements of the robot by voice. This is called the fine-tuning process of the robot;
In most cases, the coarse tuning and fine tuning of the robot are combined: the operator points a finger in some direction and commands the robot by voice to move in that direction; the robot reads the content of the voice instruction and, at the same time, the direction indicated by the gesture, and performs the correct operation. The overall control is carried out with IF-THEN rules, a series of rules being designed to realize the combination of gesture instructions and voice instructions;
Finally, when the operation ends, the operator issues a stop order, the robot terminates the corresponding operation, and the whole teaching process is complete.
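The five-parameter instruction (Cdir, Copt, Chand, Cval, Cunit) of claim 2 suggests a simple keyword-extraction step after speech recognition. The sketch below is hypothetical: the keyword lists and the phrase used to detect an accompanying gesture are illustrative assumptions, not the patent's actual command corpus.

```python
# Hypothetical parser filling the five instruction parameters
# (Cdir, Copt, Chand, Cval, Cunit). Keyword sets are illustrative only.

DIRECTIONS = {"forward", "backward", "left", "right", "up", "down"}
OPERATIONS = {"move", "rotate", "stop"}
UNITS = {"cm", "mm", "degree"}

def parse_command(text):
    tokens = text.lower().split()
    cmd = {"Cdir": None, "Copt": None, "Chand": None, "Cval": None, "Cunit": None}
    # A deictic phrase such as "that way" signals an accompanying gesture:
    # Chand is set to 1 and the direction is taken from the hand instead.
    if "that" in tokens and "way" in tokens:
        cmd["Chand"] = 1
    for t in tokens:
        if t in DIRECTIONS:
            cmd["Cdir"] = t
        elif t in OPERATIONS:
            cmd["Copt"] = t
        elif t in UNITS:
            cmd["Cunit"] = t
        elif t.replace(".", "", 1).isdigit():
            cmd["Cval"] = float(t)
    return cmd

cmd = parse_command("move left 5 cm")
```

A fully specified command like "move left 5 cm" leaves Chand as NULL (None here) and is executed from voice alone, while "move that way" sets Chand and defers the direction to the gesture, as described in claim 2.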
CN201610459874.2A 2016-06-20 2016-06-20 The method for carrying out robot on-line teaching based on gesture and voice Expired - Fee Related CN106095109B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610459874.2A CN106095109B (en) 2016-06-20 2016-06-20 The method for carrying out robot on-line teaching based on gesture and voice


Publications (2)

Publication Number Publication Date
CN106095109A CN106095109A (en) 2016-11-09
CN106095109B true CN106095109B (en) 2019-05-14

Family

ID=57252259

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610459874.2A Expired - Fee Related CN106095109B (en) 2016-06-20 2016-06-20 The method for carrying out robot on-line teaching based on gesture and voice

Country Status (1)

Country Link
CN (1) CN106095109B (en)

Families Citing this family (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108986801B (en) * 2017-06-02 2020-06-05 腾讯科技(深圳)有限公司 Man-machine interaction method and device and man-machine interaction terminal
CN107351058A (en) * 2017-06-08 2017-11-17 华南理工大学 Robot teaching method based on augmented reality
CN107150347B (en) * 2017-06-08 2021-03-30 华南理工大学 Robot perception and understanding method based on man-machine cooperation
CN108247633B (en) * 2017-12-27 2021-09-03 珠海格力节能环保制冷技术研究中心有限公司 Robot control method and system
JP2019126902A (en) * 2018-01-25 2019-08-01 川崎重工業株式会社 Robot teaching device
CN108447477A (en) * 2018-01-30 2018-08-24 华南理工大学 A kind of robot control method based on natural language understanding
CN108334198B (en) * 2018-02-09 2021-05-14 华南理工大学 Virtual sculpture method based on augmented reality
CN109358747B (en) * 2018-09-30 2021-11-30 平潭诚信智创科技有限公司 Companion robot control method, system, mobile terminal and storage medium
JP7063844B2 (en) * 2019-04-26 2022-05-09 ファナック株式会社 Robot teaching device
CN110473535A (en) * 2019-08-15 2019-11-19 网易(杭州)网络有限公司 Teaching playback method and device, storage medium and electronic equipment
CN110815210A (en) * 2019-08-26 2020-02-21 华南理工大学 Novel remote control method based on natural human-computer interface and augmented reality
CN110815258B (en) * 2019-10-30 2023-03-31 华南理工大学 Robot teleoperation system and method based on electromagnetic force feedback and augmented reality
CN110788860A (en) * 2019-11-11 2020-02-14 路邦科技授权有限公司 Bionic robot action control method based on voice control
CN110992777B (en) * 2019-11-20 2020-10-16 华中科技大学 Multi-mode fusion teaching method and device, computing equipment and storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1302056A (en) * 1999-12-28 2001-07-04 索尼公司 Information processing equiopment, information processing method and storage medium
JP2006297531A (en) * 2005-04-20 2006-11-02 Fujitsu Ltd Service robot
CN104936748A (en) * 2012-12-14 2015-09-23 Abb技术有限公司 Bare hand robot path teaching

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2005088179A (en) * 2003-09-22 2005-04-07 Honda Motor Co Ltd Autonomous mobile robot system



Similar Documents

Publication Publication Date Title
CN106095109B (en) The method for carrying out robot on-line teaching based on gesture and voice
Du et al. Online robot teaching with natural human–robot interaction
Sun et al. Object classification and grasp planning using visual and tactile sensing
Li Human–robot interaction based on gesture and movement recognition
Yang et al. Human action learning via hidden Markov model
CN105807926A (en) Unmanned aerial vehicle man-machine interaction method based on three-dimensional continuous gesture recognition
Jingqiu et al. An ARM-based embedded gesture recognition system using a data glove
CN104732752A (en) Advanced control device for home entertainment utilizing three dimensional motion technology
Yongda et al. Research on multimodal human-robot interaction based on speech and gesture
Chang et al. A kinect-based gesture command control method for human action imitations of humanoid robots
Nguyen et al. Reinforcement learning based navigation with semantic knowledge of indoor environments
Aleotti et al. Trajectory clustering and stochastic approximation for robot programming by demonstration
CN106055244B (en) Man-machine interaction method based on Kinect and voice
Huang et al. Language-driven robot manipulation with perspective disambiguation and placement optimization
Lin et al. Action recognition for human-marionette interaction
Axyonov et al. Method of multi-modal video analysis of hand movements for automatic recognition of isolated signs of Russian sign language
Dhamanskar et al. Human computer interaction using hand gestures and voice
Zhou et al. Intelligent grasping with natural human-robot interaction
Kwolek GAN-based data augmentation for visual finger spelling recognition
Kulecki Intuitive robot programming and interaction using RGB-D perception and CNN-based objects detection
Nguyen et al. A fully automatic hand gesture recognition system for human-robot interaction
Kenshimov et al. Development of a Verbal Robot Hand Gesture Recognition System
Cutugno et al. Interacting with robots via speech and gestures, an integrated architecture.
Di Benedetto et al. A hidden markov model-based approach to grasping hand gestures classification
Mathew et al. Multi-modal intent classification for assistive robots with large-scale naturalistic datasets

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20190514