CN107146619A - Intelligent voice interaction robot - Google Patents

Intelligent voice interaction robot

Info

Publication number
CN107146619A
Authority
CN
China
Prior art keywords
cavity
voice
robot
sound
speech
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201710579708.0A
Other languages
Chinese (zh)
Other versions
CN107146619B (en)
Inventor
臧红彬
周颖玥
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Southwest University of Science and Technology
Original Assignee
Southwest University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Southwest University of Science and Technology
Priority to CN201710579708.0A
Publication of CN107146619A
Application granted
Publication of CN107146619B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • B PERFORMING OPERATIONS; TRANSPORTING
    • B25 HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
    • B25J MANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
    • B25J11/00 Manipulators not otherwise provided for
    • B25J11/0005 Manipulators having means for high-level communication with users, e.g. speech generator, face recognition means
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G10L15/10 Speech classification or search using distance or distortion measures between unknown speech and reference templates
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L2015/221 Announcement of recognition results
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L2015/225 Feedback of the input speech

Landscapes

  • Engineering & Computer Science (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Acoustics & Sound (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Mechanical Engineering (AREA)
  • Robotics (AREA)
  • Manipulator (AREA)
  • Toys (AREA)

Abstract

The invention discloses an intelligent voice interaction robot. It aims to solve the problem that existing intelligent voice interaction robots can only be controlled in a question-and-answer manner, so that the friendliness and safety of human-machine interaction cannot be guaranteed. Through improvements to its structure, the robot of the invention can effectively perform two-way speech recognition, overcoming this defect of the prior art. In addition, the improved internal structure effectively solves the voice interaction problems caused by background noise from equipment such as stepper motors while the robot is moving. The invention enables two-way interactive communication between a person and the robot and effectively improves the friendliness of human-machine interaction, representing significant progress. In actual tests, the recognition accuracy of the invention reached more than 95%, effectively realizing simultaneous two-way voice input and output between the speaker and the robot and greatly enhancing the friendliness and interactivity between them.

Description

Intelligent voice interaction robot
Technical field
The present invention relates to the field of robots, in particular to voice interaction robots, and specifically to an intelligent voice interaction robot. By improving the structure of the robot, the invention provides a new intelligent voice interaction robot. It adopts a panda-like external design and, through improvements to the internal structure, effectively solves the problem that existing voice interaction robots can only handle voice input or voice output one at a time. This is of great significance for promoting the development of voice interaction robots and the progress of robot voice interaction technology.
Background art
Speech, an ability unique to humans, is an important tool and channel for communication between people and for acquiring external information, and is of great significance to the development of human civilization. Speech recognition technology, an important branch of human-machine interaction, is an important interface for human-machine interaction and has practical significance for the development of artificial intelligence. After decades of development, speech recognition technology has made remarkable progress and is gradually moving from the laboratory to the market. At present, speech recognition systems for specific speakers already achieve high recognition accuracy and are widely used in fields such as industry, household appliances, communications, automotive electronics, medical care, home services and consumer electronics.
In recent years, as speech recognition technology has been applied to robot control, the application fields of robots have continued to expand. Meanwhile, research at home and abroad on robot control technology based on speech recognition has also made some progress. For example, in China, Bai Lin, in research on robot control technology based on speech recognition, improved the method for extracting speech feature parameters by combining traditional MFCC feature parameters with formant parameters, proposing a new speech feature parameter extraction method.
At present, most existing voice interaction products are based on a dedicated speech recognition chip whose core is a single-chip microcomputer or a digital signal processor. In essence, the voice signal input through the microphone is sampled and encoded, the internal processor matches it against pre-recorded voice information, and the on-chip module then outputs the corresponding voice information through an external loudspeaker. For example, Chinese patent CN201620720668.8 discloses a robot system with a voice interaction function. It includes a robot composed of a robot head, a robot body and a base. A PCB is provided in the robot body, and the PCB is connected to a single-chip microcomputer, which is connected to a signal transmitting circuit. The robot head is provided with an image acquisition sensor and a voice receiver, and the signal transmitting circuit is connected to the voice receiver and the image acquisition sensor. The signal transmitting circuit is also connected to a mobile terminal. The single-chip microcomputer is further connected to a signal receiving circuit and a voice player, and the signal receiving circuit is connected to the mobile terminal and the voice player respectively. The signal transmitting circuit and the signal receiving circuit are each connected to a filter. The robot body includes a robot arm, a display device and input buttons, the input buttons being connected to the display device. This system can realize voice interaction.
However, the applicant has found that existing speech recognition robots have good one-way recognition capability but weak two-way speech recognition capability. There are two main problems:
1) While the robot is moving, interference from the background noise of equipment such as stepper motors can produce unpredictable results in voice interaction;
2) While the robot is speaking or playing music, it has difficulty recognizing instructions issued by the user, and its two-way speech recognition capability is almost lost. This is also the main reason why existing robots are mainly controlled in a question-and-answer manner.
Because of these defects in existing voice interaction robots, the friendliness and safety of human-machine interaction cannot be guaranteed, which runs counter to the Three Laws of Robotics. A new device is therefore urgently needed to solve the above problems.
Summary of the invention
The object of the present invention is to provide an intelligent voice interaction robot that addresses the problem that existing intelligent voice interaction robots can only be controlled in a question-and-answer manner, so that the friendliness and safety of human-machine interaction cannot be guaranteed. Through improvements to its structure, the robot of the invention can effectively perform two-way speech recognition, overcoming this defect of the prior art. In addition, the improved internal structure effectively solves the voice interaction problems caused by background noise from equipment such as stepper motors while the robot is moving. The invention enables two-way interactive communication between a person and the robot and effectively improves the friendliness of human-machine interaction, representing significant progress.
To achieve the above object, the present invention adopts the following technical solution:
An intelligent voice interaction robot comprises a bottom support frame, a drive mechanism, a first cavity, a second cavity and a control system. The drive mechanism is arranged on the bottom support frame and can drive the robot to move through the bottom support frame. The first cavity and the second cavity are connected to form a robot body, and the robot body is arranged on the bottom support frame;
Two third cavities are symmetrically arranged on the second cavity, and the first cavity, the second cavity and the third cavities are each hollow structures;
A first support frame is provided in the cavity of the first cavity and is connected to the bottom support frame. A first voice playing device and a first opening are provided on the wall of the first cavity. A first sound insulation board is provided below the first cavity. An upper soundproof drawer and a lower soundproof drawer are stacked vertically in the cavity of the first cavity, and the first support frame supports both drawers. The first sound insulation board is located between the bottom support frame and the lower soundproof drawer;
A second sound insulation board is provided between the first cavity and the second cavity. The third cavities are provided with third voice playing devices, speaker holes matched with the third voice playing devices, and a speech recognition device. The third cavities are spherical. There are two third voice playing devices, one arranged on each third cavity. There are several speaker holes, arranged in a fan-shaped annular band. The speech recognition device is located between the third voice playing devices;
The control system is connected to the first voice playing device, the third voice playing devices and the speech recognition device respectively.
Several heat dissipation holes are provided in the lower part of the robot body.
The heat dissipation holes are arranged in a rectangular pattern in the lower part of the body.
The first cavity is further provided with a groove, and one or more of a signal receiver connected to the control system and a handrail are provided in the groove.
The signal receiver is arranged on the first support frame.
The robot further comprises a display connected to the control system. The display is arranged on the side wall of the second cavity, between the two third cavities, and the speech recognition device is arranged below the display.
The angle between the display and the horizontal plane is 15° to 90°.
A third sound insulation board is provided between the upper soundproof drawer and the lower soundproof drawer.
The speech recognition device is located on the center line between the third voice playing devices.
The robot further comprises a camera tracking mechanism and an obstacle avoidance mechanism, which are arranged on the robot body and each connected to the control system. The control system can receive and process the image signal transmitted by the camera tracking mechanism and the position signal detected by the obstacle avoidance mechanism, and thereby control the action of the drive mechanism.
The robot further comprises a navigation mechanism connected to the control system.
A method for the interactive system of the aforementioned intelligent voice interaction robot comprises the following steps:
(I) Determine the voice input type
1) Determine the voice input type: for an input-output two-way recognition system, perform step (II); for an input one-way recognition system, perform step (III);
(II) Predefine the input-output two-way recognition system;
2) Predefine a voice output table, and collect the output of the voice playing devices according to the predefined voice output table to form an output sample set and an output test set;
3) Predefine a voice vocabulary table, and collect voice sample data according to the vocabulary table to form an input sample set and an input test set;
4) Fully permute the N speech samples in the output sample set and the M speech samples in the input sample set respectively, obtaining N!·M! arrangements; input each arrangement into the training system to obtain one trained speech vector center; finally, compute the mean vector and variance parameters of the N!·M! speech vector centers to obtain the final voice training template, where N and M are integers greater than 1;
5) Test simultaneously with the speech samples in the output test set and the input test set as the speech to be tested, obtaining the robustness of each speech sample, including the correct recognition rate of each speech sample and the average correct recognition rate over the speech samples;
6) Sort the speech samples by correct recognition rate, and select the speech samples whose word correct recognition rate is greater than the average correct recognition rate to form a two-way candidate vocabulary;
7) For the two-way candidate vocabulary, train voice templates again using step 4), obtaining the mean vector μ1 and mean square deviation σ1 of each voice template;
8) When speech to be tested is input, calculate the matching distance between the speech and each voice template, and select the voice template with the minimum matching distance as the recognition result;
9) Output the recognition result of the speech to be tested;
(III) Predefine the input one-way recognition system;
10) Fully permute the M speech samples in the input sample set from step 3), obtaining M! arrangements; input each arrangement into the training system to obtain one trained speech vector center; finally, compute the mean vector and variance parameters of the M! speech vector centers to obtain the final voice training template, where M is an integer greater than 1;
11) Test with the speech samples in the input test set as the speech to be tested, obtaining the robustness of each speech sample, including the correct recognition rate of each speech sample and the average correct recognition rate over the speech samples;
12) Sort the speech samples by correct recognition rate, and select the speech samples whose word correct recognition rate is greater than the average correct recognition rate to form a one-way candidate vocabulary;
13) For the one-way candidate vocabulary, train voice templates again using step 10), obtaining the mean vector μ2 and mean square deviation σ2 of each voice template;
14) When speech to be tested is input, calculate the matching distance between the speech and each voice template, and select the voice template with the minimum matching distance as the recognition result;
15) Output the recognition result of the speech to be tested.
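The permutation-based template training in steps 4) and 10) can be illustrated with a minimal Python sketch. The patent does not specify the training system, so `train_center` below is a hypothetical, order-sensitive stand-in (an exponential running average of feature vectors); the function names and the feature representation (plain lists of floats) are likewise assumptions made for illustration only.

```python
from itertools import permutations, product
from statistics import fmean, pvariance


def train_center(ordered_samples, rate=0.5):
    """Hypothetical trainer: fold the samples, in order, into one center vector."""
    center = list(ordered_samples[0])
    for sample in ordered_samples[1:]:
        center = [(1 - rate) * c + rate * x for c, x in zip(center, sample)]
    return center


def arrangements(input_samples, output_samples=None):
    """Return M! orderings (one-way) or N!*M! orderings (two-way)."""
    input_orders = permutations(input_samples)
    if output_samples is None:
        return [list(order) for order in input_orders]
    output_orders = permutations(output_samples)
    # Each output ordering is paired with each input ordering.
    return [list(out) + list(inp) for out, inp in product(output_orders, input_orders)]


def train_template(input_samples, output_samples=None):
    """Train once per arrangement, then pool the centers into mean and variance."""
    centers = [train_center(a) for a in arrangements(input_samples, output_samples)]
    columns = list(zip(*centers))  # one tuple of values per feature dimension
    mean = [fmean(col) for col in columns]
    variance = [pvariance(col) for col in columns]
    return mean, variance
```

Because the number of arrangements grows factorially, a sketch like this is only practical for small N and M; how the patent's training system handles larger sample sets is not stated.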
In existing designs, control is mainly performed in a question-and-answer manner, chiefly because the robot's own voice output greatly affects speech recognition. At present, this problem is generally addressed by improving the chip. In the present invention, by contrast, the overall structure of the robot is improved, which effectively reduces the interference of voice output with voice input and thereby achieves two-way interaction of voice input and output.
The structure comprises a bottom support frame, a drive mechanism, a first cavity, a second cavity and a control system. The bottom support frame supports the other components; the drive mechanism is connected to the bottom support frame and drives the bottom support frame and the components on it to move. The drive mechanism comprises a set of driving wheels and a driven wheel, and the driving wheels are each connected to the control system. In the present invention, the driven wheel may be a universal wheel, and there are two driving wheels, each rotated by its own motor. Further, the driving wheels may be Mecanum wheels, and the driving wheels and driven wheel are distributed in an isosceles triangle.
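As an illustration of how two independently driven wheels and one passive universal wheel can steer the robot, the following Python sketch applies standard differential-drive kinematics. This is an assumption about the drive scheme, not something the patent specifies; the function name, the parameter names and the units (metres, radians, seconds) are hypothetical, and the sketch does not cover the Mecanum-wheel variant.

```python
def wheel_speeds(linear, angular, track_width):
    """Convert a body velocity command into left and right wheel speeds.

    linear:      forward speed of the robot body (m/s)
    angular:     turn rate, positive counterclockwise (rad/s)
    track_width: distance between the two driving wheels (m)
    """
    half_track = track_width / 2.0
    left = linear - angular * half_track
    right = linear + angular * half_track
    return left, right
```

Equal wheel speeds move the robot straight ahead, while opposite speeds turn it in place; the universal wheel simply follows.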
The first cavity and the second cavity are connected to form the robot body and are arranged in sequence from bottom to top. Two third cavities are symmetrically arranged on the second cavity. The first, second and third cavities are each hollow structures, and the third cavities are spherical. The first cavity is larger than the second cavity, and the second cavity is larger than the third cavities. This structure forms a panda-shaped intelligent robot that is large at the bottom and small at the top, with two ears on top.
A first voice playing device and a first opening are provided on the wall of the first cavity, and a first support frame is provided in the cavity of the first cavity to support the other components inside it. The first voice playing device provides the robot's voice output.
In the present invention, an upper soundproof drawer and a lower soundproof drawer are stacked vertically in the cavity of the first cavity and provide sound insulation and vibration damping. A first sound insulation board is provided below the first cavity, between the bottom support frame and the lower soundproof drawer, and a second sound insulation board is provided between the first cavity and the second cavity. The third cavities are provided with third voice playing devices, speaker holes matched with the third voice playing devices, and a speech recognition device. There are two third voice playing devices, one on each third cavity; the speaker holes are distributed in an eccentric annular band; and the speech recognition device is located between the two third voice playing devices, preferably on the center line.
After analysis, the applicant believes that the reason existing greeter-type or small household humanoid or animal-shaped intelligent robots cannot achieve two-way voice input and output lies in the structure of the robot itself. Such robots use a single-cavity structure, which forms one large acoustic cavity inside the robot, and this acoustic cavity seriously degrades speech recognition. The present invention therefore makes the following structural improvements: 1) the single-cavity structure of the prior art is replaced by two separate cavities, the first cavity and the second cavity; 2) a second sound insulation board is provided between the first cavity and the second cavity, blocking the influence of the first cavity's acoustic cavity on the recognition device; 3) an upper soundproof drawer and a lower soundproof drawer are arranged in the cavity of the first cavity; on the one hand, the drawers give the user a place to store articles, and on the other hand, they break up the original acoustic cavity of the first cavity and reduce as far as possible the influence of the first voice playing device on the speech recognition device; 4) the third voice playing devices are arranged symmetrically on the third cavities and the speaker holes form a fan-shaped annular band, so that the third voice playing devices produce a symmetrical voice output, which greatly reduces their influence on the speech recognition device. With these structural improvements, the invention achieves two-way interaction between the robot's voice output and the user's voice input, greatly improves the efficiency of two-way speech recognition, and effectively solves the problems and defects of the prior art.
Further, several heat dissipation holes are provided in the lower part of the robot body. The heat dissipation holes effectively release the heat inside the robot and ensure its normal operation.
Further, the first cavity is also provided with a groove, in which one or more of a signal receiver connected to the control system and a handrail are provided. In this way, the user can handle the robot by the handrail; in addition to using voice directly, the user can also send control instructions to the robot through the signal receiver.
Further, the robot also includes a display connected to the control system. The display is arranged on the side wall of the second cavity, between the two third voice playing devices, and the speech recognition device is arranged below the display. In this way, the display forms a cute panda-like face and the third cavities form the panda's ears, giving a friendlier appearance and making interaction more approachable for the user.
Further, the angle between the display and the horizontal plane is 15° to 90°, which makes the display easy for the user to view.
Further, to improve sound insulation, the invention provides a third sound insulation board between the upper and lower soundproof drawers, further reducing the influence of the first voice playing device on the speech recognition device.
Further, the robot also includes a camera tracking mechanism and an obstacle avoidance mechanism, which are arranged on the robot body and each connected to the control system. The control system can receive and process the image signal transmitted by the camera tracking mechanism and the position signal detected by the obstacle avoidance mechanism, and thereby control the action of the drive mechanism. In this way, the robot recognizes the user's movement trajectory through the camera tracking mechanism and passes the image information to the control system, while the obstacle avoidance mechanism passes the detected position signal to the control system. The control system receives and processes both signals and controls the action of the drive mechanism, so that the robot intelligently follows the user.
Further, the robot also includes a navigation mechanism connected to the control system. Through the navigation mechanism, the invention can provide navigation guidance to the user; based on the navigation mechanism, the robot can also move automatically to a set position.
The present invention also provides a method of implementing an interactive system based on the aforementioned intelligent voice interaction robot. The method distinguishes between different voice scenarios and performs the corresponding recognition operation based on the result. The method can recognize speech with a single speech recognition device, without needing multiple speech recognition devices for voice noise reduction. The method also does not depend on a particular speaker: by defining the voice output table and the voice vocabulary table, and through subsequent processing such as template training, the invention improves noise robustness and speaker independence, weakening the individual characteristics of different speakers. In addition, the method can be used for online, networked recognition as well as offline recognition without a network, with a high recognition rate and good recognition results.
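Two parts of the method, selecting the candidate vocabulary (steps 6 and 12) and choosing the template with the minimum matching distance (steps 8 and 14), can be sketched as follows. The patent does not define the matching distance, so the variance-normalized Euclidean distance used here is an assumption; the function names and the dictionary layouts are also hypothetical.

```python
import math


def select_candidates(recognition_rates):
    """Keep the words whose correct recognition rate exceeds the average rate.

    recognition_rates maps each word to its measured correct recognition rate.
    The result is sorted from the best-recognized word downwards.
    """
    average = sum(recognition_rates.values()) / len(recognition_rates)
    ranked = sorted(recognition_rates, key=recognition_rates.get, reverse=True)
    return [word for word in ranked if recognition_rates[word] > average]


def match(features, templates, eps=1e-9):
    """Return the label of the template closest to the feature vector.

    templates maps each label to a (mean_vector, variance_vector) pair.
    Distance (assumed): Euclidean distance scaled by the per-dimension variance.
    """
    def distance(mean, variance):
        return math.sqrt(sum((x - m) ** 2 / (v + eps)
                             for x, m, v in zip(features, mean, variance)))

    return min(templates, key=lambda label: distance(*templates[label]))
```

Filtering by the average rate keeps only words that are recognized reliably, which is why the templates are retrained on the candidate vocabulary before matching.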
With this method, recognition results can be effectively corrected, and both one-way voice input and two-way voice input and output are realized, with good recognition in both cases. In actual tests, the recognition accuracy of the invention reached more than 95%, effectively realizing simultaneous two-way voice input and output between the speaker and the robot and greatly enhancing the friendliness and interactivity between them, which represents significant progress.
Brief description of the drawings
Examples of the present invention will be described by way of reference to the accompanying drawings, wherein:
Fig. 1 is a side view of the device in Embodiment 1.
Fig. 2 is a rear view of the device in Embodiment 1.
Reference numerals in the figures: 1 is the drive mechanism, 2 is the first cavity, 3 is the second cavity, 4 is the third cavity, 6 is the upper soundproof drawer, 7 is the lower soundproof drawer, 8 is the speaker hole, 9 is the signal receiver, 10 is the display, and 11 is the heat dissipation hole.
Detailed description of the embodiments
All features disclosed in this specification, and all steps of any method or process disclosed, may be combined in any manner, except for mutually exclusive features and/or steps.
Unless specifically stated otherwise, any feature disclosed in this specification may be replaced by an alternative feature that is equivalent or serves a similar purpose. That is, unless specifically stated otherwise, each feature is only one example of a series of equivalent or similar features.
Embodiment 1
The intelligent sound interaction robot of the present embodiment includes bottom support frame, drive mechanism, the first cavity, the second cavity, control System processed.Wherein, drive mechanism is arranged on the support frame of bottom, and the first cavity, the second cavity, which are connected, constitutes robot body, machine Device human agent is arranged on the support frame of bottom.Two the 3rd cavitys, the first cavity, the second chamber are symmetrically arranged with second cavity Body, the 3rd cavity are respectively hollow structure.
In the present embodiment, drive mechanism includes a driven pulley and two driving wheels, two drivings for being connected with driving wheel Motor.Using the structure, drive mechanism can relatively face be moved with mobile robot.
Meanwhile, the first support frame is provided with the cavity of the first cavity, the first support frame is connected with bottom support frame, first It is respectively arranged with below the first voice playing device, the first cavity, the first cavity and is provided with positioned at bottom support on cavity wall Sound insulation is disposed with from bottom to up in the first sound panel between frame and lower sound insulation drawer, the first cavity of the first cavity to take out Drawer, lower sound insulation drawer, the first support frame can be respectively upper sound insulation drawer, the offer support of lower sound insulation drawer.
In the present embodiment, it is provided with the second sound panel, the 3rd cavity and sets respectively between the first cavity and the second cavity Bellmouth orifice, the speech recognition equipment for have the 3rd voice playing device, being engaged with the 3rd voice playing device, the 3rd cavity are in ball Shape, the 3rd voice playing device is two and is separately positioned on the 3rd cavity that bellmouth orifice is several and bellmouth orifice is in sector Ring-band shape(As shown in the figure), speech recognition equipment is between the 3rd voice playing device.
In the present embodiment, motor, the first voice playing device, the 3rd voice playing device, speech recognition equipment point It is not connected with control system.
In the present embodiment, several heat emission holes are additionally provided with below the first cavity, heat emission hole is in rectangular layout;First cavity On be additionally provided with the signal receiver for being provided with and being connected with control system in groove, groove;Also set up on the side wall of second cavity The display being connected with control system, display is located between two the 3rd cavitys, and speech recognition equipment is located at below display. In the present embodiment, the angle between display and horizontal plane is 45 °, and speech recognition equipment is located at two the 3rd voice playing devices Between center line on.
In the present embodiment, in addition to camera tracking mechanism, avoidance mechanism, navigation sector, camera tracking mechanism, avoidance Mechanism is separately positioned on robot body.Camera tracking mechanism, avoidance mechanism, navigation sector respectively with control system phase Even, control system can receive, handle the position that the picture signal and avoidance mechanism of the transmission of camera tracking mechanism are detected Signal, and then control the action of drive mechanism.
In this way, the robot of this embodiment recognizes the user's movement trajectory through the camera tracking mechanism and passes the image information to the control system. At the same time, the obstacle avoidance mechanism passes the position signal it detects to the control system. The control system receives and processes the image signal from the camera tracking mechanism and the position signal from the obstacle avoidance mechanism, and controls the action of the drive mechanism so that the robot intelligently follows the user. In addition, based on the navigation mechanism, the control system can make the robot of this embodiment move automatically to a set position.
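The following behaviour can be illustrated with a minimal sketch of one control cycle. The patent does not specify signal formats or a control law, so everything here is an assumption made for illustration: the camera tracking mechanism is assumed to report the user's horizontal offset and distance, the obstacle avoidance mechanism is assumed to report the distance to the nearest obstacle, and the names `DriveCommand` and `follow_step` and all gain values are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DriveCommand:
    linear: float   # forward speed, arbitrary units
    angular: float  # turn rate, positive means turn left


def follow_step(target_offset: Optional[float],
                target_distance: float,
                obstacle_distance: float,
                follow_distance: float = 1.0,
                stop_distance: float = 0.3,
                k_turn: float = 1.0,
                k_forward: float = 0.5) -> DriveCommand:
    """One control cycle (assumed proportional law, not from the patent)."""
    if target_offset is None:
        # User not visible in the image: stand still.
        return DriveCommand(0.0, 0.0)
    # Positive offset means the user is to the right, so turn right.
    angular = -k_turn * target_offset
    # Drive forward only while farther away than the follow distance.
    linear = max(0.0, k_forward * (target_distance - follow_distance))
    if obstacle_distance < stop_distance:
        # Obstacle too close: stop forward motion but keep turning.
        linear = 0.0
    return DriveCommand(linear, angular)


command = follow_step(target_offset=0.5, target_distance=2.0, obstacle_distance=5.0)
print(command)
```

A real controller would also need filtering of the image signal and a strategy for re-acquiring a lost user; the sketch only shows how the two signals could be combined into one drive command.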
In this embodiment, speech recognition is performed through the iFlytek speech recognition interface. The recognition accuracy for bidirectional voice input and output exceeds 88%, and the recognition accuracy for one-way voice input is about 95%, which is a good result.
Embodiment 2
Based on the device of Embodiment 1, this embodiment provides a method for implementing the different voice interaction systems, comprising the following steps:
(One) Determine the voice input type
1) Determine the voice input type: if it is the bidirectional input/output recognition system, perform step (Two); if it is the one-way input recognition system, perform step (Three);
(Two) Predefine the bidirectional input/output recognition system;
2) Predefine a voice output table, and collect the speech played by the voice playing devices according to the predefined voice output table to form an output sample set and an output test set;
3) Predefine a voice vocabulary table, and collect speech sample data according to the voice vocabulary table to form an input sample set and an input test set;
4) Fully permute the N speech samples in the output sample set and the M speech samples in the input sample set, respectively, to obtain N!M! arrangements; feed each arrangement into the training system to obtain one trained speech vector center; finally compute the mean vector and variance parameters of the N!M! speech vector centers to obtain the final speech training template; where N and M are integers greater than 1;
5) Test with the speech samples in the output test set and the input test set simultaneously as the speech under test, obtaining the degree of robustness for the different speech samples, including the correct recognition rate of each speech sample and the average correct recognition rate of the speech samples;
6) Sort the speech samples by correct recognition rate, and select the speech samples whose word correct recognition rate is greater than the average correct recognition rate to form the bidirectional candidate vocabulary;
7) For the bidirectional candidate vocabulary, repeat step 4) to train the speech templates, obtaining the mean vector μ1 and the mean variance σ1 of each speech template;
8) When speech under test is input, compute the matching distance between the speech under test and each speech template, and select the speech template with the minimum matching distance as the recognition result;
9) Output the recognition result of the speech under test;
(Three) Predefine the one-way input recognition system;
10) Fully permute the M speech samples in the input sample set of step 3) to obtain M! arrangements; feed each arrangement into the training system to obtain one trained speech vector center; finally compute the mean vector and variance parameters of the M! speech vector centers to obtain the final speech training template; where M is an integer greater than 1;
11) Test with the speech samples in the input test set as the speech under test, obtaining the degree of robustness of the corresponding speech samples, including the correct recognition rate of each speech sample and the average correct recognition rate of the speech samples;
12) Sort the speech samples by correct recognition rate, and select the speech samples whose word correct recognition rate is greater than the average correct recognition rate to form the one-way candidate vocabulary;
13) For the one-way candidate vocabulary, repeat step 10) to train the speech templates, obtaining the mean vector μ2 and the mean variance σ2 of each speech template;
14) When speech under test is input, compute the matching distance between the speech under test and each speech template, and select the speech template with the minimum matching distance as the recognition result;
15) Output the recognition result of the speech under test.
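Steps 10), 13) and 14) can be sketched as follows for the one-way case. The patent does not say what the training system computes or how the matching distance is defined, so both are stand-ins chosen for illustration: `train_center` is a hypothetical order-dependent moving average, and `matching_distance` is an assumed variance-scaled Euclidean distance. Feature vectors are plain lists of floats, and all function names are hypothetical.

```python
from itertools import permutations
from math import sqrt


def train_center(ordered_samples, rate=0.5):
    """Hypothetical stand-in for the 'training system': an order-dependent
    exponential moving average over the feature vectors of one arrangement."""
    center = list(ordered_samples[0])
    for sample in ordered_samples[1:]:
        center = [c + rate * (s - c) for c, s in zip(center, sample)]
    return center


def train_template(samples, rate=0.5):
    """Train on all M! arrangements, then return the mean vector and the
    per-dimension variance of the resulting speech vector centers."""
    centers = [train_center(order, rate) for order in permutations(samples)]
    count, dim = len(centers), len(centers[0])
    mean = [sum(c[d] for c in centers) / count for d in range(dim)]
    var = [sum((c[d] - mean[d]) ** 2 for c in centers) / count for d in range(dim)]
    return mean, var


def matching_distance(feature, template, floor=1e-6):
    """Assumed distance: Euclidean distance scaled by the template variance,
    with a floor so that a zero variance cannot cause division by zero."""
    mean, var = template
    return sqrt(sum((f - m) ** 2 / max(v, floor)
                    for f, m, v in zip(feature, mean, var)))


def recognize(feature, templates):
    """Return the word whose template has the minimum matching distance."""
    return min(templates, key=lambda word: matching_distance(feature, templates[word]))


# Toy two-dimensional features for two vocabulary words.
templates = {
    "hello": train_template([[0.0, 0.0], [0.2, 0.1], [0.1, 0.2]]),
    "stop": train_template([[5.0, 5.0], [5.2, 5.1], [5.1, 5.2]]),
}
print(recognize([0.1, 0.1], templates))
```

The number of arrangements grows factorially (8 samples already give 40320 orderings), so a practical implementation would probably need to cap M or sample a subset of arrangements. The bidirectional case of step 4) would combine the N! output orderings with the M! input orderings in the same way.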
In this embodiment, the recognition accuracy for bidirectional voice input and output exceeds 95%, and the recognition accuracy for one-way voice input is about 97%, which is a good result.
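The candidate vocabulary selection of steps 5) and 6) (and likewise 11) and 12)) can be sketched as below. This is an illustration under assumptions: the per-word correct recognition rates are taken as already measured on the test set, the rates shown are made-up toy values rather than results from the patent, and the function name `select_candidates` is hypothetical.

```python
def select_candidates(rates):
    """Sort words by correct recognition rate and keep those whose rate is
    strictly greater than the average rate.

    rates: dict mapping each word to its correct recognition rate (0 to 1).
    Returns the candidate vocabulary (best first) and the average rate.
    """
    average = sum(rates.values()) / len(rates)
    ranked = sorted(rates, key=rates.get, reverse=True)
    candidates = [word for word in ranked if rates[word] > average]
    return candidates, average


# Toy rates for illustration only.
toy_rates = {"forward": 0.9, "back": 0.6, "stop": 0.99, "left": 0.7}
candidates, average = select_candidates(toy_rates)
print(candidates, average)
```

Words that fall below the average are dropped from the vocabulary before the templates are retrained, which is how the method trades vocabulary size for robustness.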
The invention is not limited to the foregoing embodiments. The invention extends to any new feature or any new combination of features disclosed in this specification, and to the steps of any new method or process disclosed, or any new combination thereof.

Claims (10)

1. An intelligent voice interaction robot, comprising a bottom support frame, a drive mechanism, a first cavity, a second cavity, and a control system, wherein the drive mechanism is arranged on the bottom support frame and can drive the robot to move via the bottom support frame, the first cavity and the second cavity are connected to form a robot body, and the robot body is arranged on the bottom support frame;
characterized in that two third cavities are symmetrically arranged on the second cavity, and the first cavity, the second cavity, and the third cavities are each hollow structures;
a first support frame is arranged inside the first cavity and is connected to the bottom support frame, first voice playing devices are respectively arranged on the wall of the first cavity, a first sound insulation panel is arranged below the first cavity, an upper sound insulation drawer and a lower sound insulation drawer are stacked one above the other inside the first cavity, the first support frame supports both the upper sound insulation drawer and the lower sound insulation drawer, and the first sound insulation panel is located between the bottom support frame and the lower sound insulation drawer;
a second sound insulation panel is arranged between the first cavity and the second cavity, the third cavities are respectively provided with third voice playing devices, horn-shaped openings matched with the third voice playing devices, and a speech recognition device, each third cavity is spherical, there are two third voice playing devices, one arranged on each third cavity, there are several horn-shaped openings arranged in a sector-shaped annular band, and the speech recognition device is located between the third voice playing devices;
the control system is connected to the first voice playing devices, the third voice playing devices, and the speech recognition device, respectively.
2. The intelligent voice interaction robot according to claim 1, characterized in that a groove is further provided on the first cavity, and one or more of a signal receiver connected to the control system and a handrail are arranged in the groove.
3. The intelligent voice interaction robot according to claim 2, characterized in that the signal receiver is arranged on the first support frame.
4. The intelligent voice interaction robot according to claim 1, characterized by further comprising a display connected to the control system, wherein the display is arranged on the side wall of the second cavity, the display is located between the two third cavities, and the speech recognition device is arranged below the display.
5. The intelligent voice interaction robot according to claim 4, characterized in that the angle between the display and the horizontal plane is 15° to 90°.
6. The intelligent voice interaction robot according to any one of claims 1 to 5, characterized in that a third sound insulation panel is arranged between the upper sound insulation drawer and the lower sound insulation drawer.
7. The intelligent voice interaction robot according to any one of claims 1 to 6, characterized in that the speech recognition device is located on the center line between the third voice playing devices.
8. The intelligent voice interaction robot according to any one of claims 1 to 7, characterized by further comprising a camera tracking mechanism and an obstacle avoidance mechanism, wherein the camera tracking mechanism and the obstacle avoidance mechanism are each arranged on the robot body and each connected to the control system, and the control system can receive and process the image signal transmitted by the camera tracking mechanism and the position signal detected by the obstacle avoidance mechanism, and then control the action of the drive mechanism.
9. The intelligent voice interaction robot according to any one of claims 1 to 8, characterized by further comprising a navigation mechanism connected to the control system.
10. A method for the interaction system of the intelligent voice interaction robot according to any one of claims 1 to 9, characterized by comprising the following steps:
(One) Determine the voice input type
1) Determine the voice input type: if it is the bidirectional input/output recognition system, perform step (Two); if it is the one-way input recognition system, perform step (Three);
(Two) Predefine the bidirectional input/output recognition system;
2) Predefine a voice output table, and collect the speech played by the voice playing devices according to the predefined voice output table to form an output sample set and an output test set;
3) Predefine a voice vocabulary table, and collect speech sample data according to the voice vocabulary table to form an input sample set and an input test set;
4) Fully permute the N speech samples in the output sample set and the M speech samples in the input sample set, respectively, to obtain N!M! arrangements; feed each arrangement into the training system to obtain one trained speech vector center; finally compute the mean vector and variance parameters of the N!M! speech vector centers to obtain the final speech training template; where N and M are integers greater than 1;
5) Test with the speech samples in the output test set and the input test set simultaneously as the speech under test, obtaining the degree of robustness for the different speech samples, including the correct recognition rate of each speech sample and the average correct recognition rate of the speech samples;
6) Sort the speech samples by correct recognition rate, and select the speech samples whose word correct recognition rate is greater than the average correct recognition rate to form the bidirectional candidate vocabulary;
7) For the bidirectional candidate vocabulary, repeat step 4) to train the speech templates, obtaining the mean vector μ1 and the mean variance σ1 of each speech template;
8) When speech under test is input, compute the matching distance between the speech under test and each speech template, and select the speech template with the minimum matching distance as the recognition result;
9) Output the recognition result of the speech under test;
(Three) Predefine the one-way input recognition system;
10) Fully permute the M speech samples in the input sample set of step 3) to obtain M! arrangements; feed each arrangement into the training system to obtain one trained speech vector center; finally compute the mean vector and variance parameters of the M! speech vector centers to obtain the final speech training template; where M is an integer greater than 1;
11) Test with the speech samples in the input test set as the speech under test, obtaining the degree of robustness of the corresponding speech samples, including the correct recognition rate of each speech sample and the average correct recognition rate of the speech samples;
12) Sort the speech samples by correct recognition rate, and select the speech samples whose word correct recognition rate is greater than the average correct recognition rate to form the one-way candidate vocabulary;
13) For the one-way candidate vocabulary, repeat step 10) to train the speech templates, obtaining the mean vector μ2 and the mean variance σ2 of each speech template;
14) When speech under test is input, compute the matching distance between the speech under test and each speech template, and select the speech template with the minimum matching distance as the recognition result;
15) Output the recognition result of the speech under test.
CN201710579708.0A 2017-07-17 2017-07-17 Intelligent voice interaction robot Active CN107146619B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710579708.0A CN107146619B (en) 2017-07-17 2017-07-17 Intelligent voice interaction robot

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710579708.0A CN107146619B (en) 2017-07-17 2017-07-17 Intelligent voice interaction robot

Publications (2)

Publication Number Publication Date
CN107146619A true CN107146619A (en) 2017-09-08
CN107146619B CN107146619B (en) 2020-11-13

Family

ID=59776377

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710579708.0A Active CN107146619B (en) 2017-07-17 2017-07-17 Intelligent voice interaction robot

Country Status (1)

Country Link
CN (1) CN107146619B (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101577118A (en) * 2009-06-12 2009-11-11 北京大学 Implementation method of voice interaction system facing intelligent service robot
WO2012020858A1 (en) * 2010-08-11 2012-02-16 (주) 퓨처로봇 Intelligent driving robot for providing customer service and calculation in restaurants
CN203522988U (en) * 2013-09-26 2014-04-02 深圳市金立通信设备有限公司 Microphone apparatus and terminal
CN105425799A (en) * 2015-12-03 2016-03-23 昆山穿山甲机器人有限公司 Bank self-service robot system and automatic navigation method thereof
KR101651493B1 (en) * 2010-07-15 2016-08-26 현대모비스 주식회사 Apparatus for two way voice recognition
CN106737760A (en) * 2017-03-01 2017-05-31 深圳市爱维尔智能科技有限公司 A kind of human-like intelligent robot and man-machine communication's system
CN106378786B (en) * 2016-11-30 2018-12-21 北京百度网讯科技有限公司 Robot based on artificial intelligence

Also Published As

Publication number Publication date
CN107146619B (en) 2020-11-13

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant