CN106409283A - Audio frequency-based man-machine mixed interaction system and method - Google Patents

Audio frequency-based man-machine mixed interaction system and method Download PDF

Info

Publication number
CN106409283A
CN106409283A CN201610791966.0A CN201610791966A CN106409283A CN 106409283 A CN106409283 A CN 106409283A CN 201610791966 A CN201610791966 A CN 201610791966A CN 106409283 A CN106409283 A CN 106409283A
Authority
CN
China
Prior art keywords
message
unit
module
intervention
voice
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201610791966.0A
Other languages
Chinese (zh)
Other versions
CN106409283B (en
Inventor
俞凯
石开宇
郑达
陈露
常成
曹迪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sipic Technology Co Ltd
Original Assignee
Shanghai Jiaotong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Jiaotong University filed Critical Shanghai Jiaotong University
Priority to CN201610791966.0A priority Critical patent/CN106409283B/en
Publication of CN106409283A publication Critical patent/CN106409283A/en
Application granted granted Critical
Publication of CN106409283B publication Critical patent/CN106409283B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L13/00Speech synthesis; Text to speech systems
    • G10L13/08Text analysis or generation of parameters for speech synthesis out of text, e.g. grapheme to phoneme translation, prosody generation or stress or intonation determination
    • G10L13/10Prosody rules derived from text; Stress or intonation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/30Semantic analysis
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/22Procedures used during a speech recognition process, e.g. man-machine dialogue

Abstract

The invention discloses an audio frequency-based man-machine mixed interaction system, a voice recognition module and a semantic recognition module are interconnected and used for transmitting word information corresponding to voice, an exception handling module is connected with the voice recognition module and the semantic recognition module, the voice recognition module is used for transmitting the word information to the exception handling module, the semantic recognition module is used for transmitting semantic parsing results to the exception handling module, and the exception handling module and a voice synthesis module are interconnected and used for transmitting intervention information. The invention also discloses an audio frequency-based man-machine mixed interaction method, voice information is converted into the word information via the voice recognition module, the word information is output to a semantic recognition unit, a user aim and corresponding key information are extracted from the word information via the semantic recognition unit, and whether anomaly occurs during a current human computer dialogue is determined by the exception handling module according to the word information of the voice recognition module and semantic information of the semantic recognition module, and an information reply for the anomaly is given. The audio frequency-based man-machine mixed interaction system and method disclosed in a technical solution of the invention can provide unified human computer dialogue experience.

Description

Man-machine mixing interactive system based on audio frequency and method
Technical field
The present invention relates to technical field of information processing, more particularly, to a kind of man-machine mixing interactive system based on audio frequency and side Method.
Background technology
As shown in figure 1, the interactive system being currently based on audio frequency all replies as finally replying using machine and presents to User, when machine decision system can not specify user view, most of conversational system selects to assume " pardon " etc The input to allow user carry out again for the reply, wherein part interactive system introduces the manual intervention based on telephone traffic center Method.
Existing human computer conversation's abnormality processing is mainly realized by telephone traffic center form at present, cannot process user in machine defeated Enter audio frequency or user explicitly indicate that need manual service when, the intervention of business center, now user and traffic if request is artificial Set up man-to-man call between member to connect, operator is directly exchanged with user, know the demand of user and pass through traffic Platform issues corresponding instruction.
The problem that the manual intervention mode at live traffic center exists mainly has:Man efficiency is low, and intervening teacher is needed with user Set up man-to-man speech exchange, in the time period waiting user input, cannot be serviced other people;High cost, exhales on a large scale Make center need a series of telecommunication apparatus and respective service integrated, simultaneously because efficiency is low, need more intervention Shi Jinhang Intervene service, thus indirectly improve human cost;Big by network environment influence:Directly transmit audio frequency needs using Internet resources Stable network connection, the fluctuation of network environment can lead to audio quality to decline thus affecting dialogue experience, or even interrupts man-machine Conversation process.
Therefore, those skilled in the art is devoted to developing a kind of man-machine mixing interactive system based on audio frequency and method, Manual intervention reply is replied with machine and combines, thus unified interactive flow process and lifting Consumer's Experience.
Content of the invention
In view of the drawbacks described above of prior art, the technical problem to be solved be how to improve customer service during Interactive efficiency and Consumer's Experience.
For achieving the above object, the invention provides a kind of man-machine mixing interactive system based on audio frequency, know including voice Other module, voice synthetic module, semantics recognition module and exception processing module, wherein, described sound identification module is configured It is to be connected with described semantics recognition module and transmit the corresponding Word message of voice, described exception processing module is configured to and institute State sound identification module to be connected with described semantics recognition module, described sound identification module is configured to transmit Word message to institute State exception processing module, described semantics recognition module is configured to transmit semantic analysis result to described exception processing module;Institute State exception processing module to be configured to be connected with described voice synthetic module and transmit intervention information.
Further, described sound identification module includes signal processing and feature extraction unit, acoustic model, language model And decoder, wherein, described signal processing and feature extraction unit are configured to be connected with described acoustic model and transmit sound Learn characteristic information, described decoder is configured to be connected with described acoustic model and described language model and export recognition result.
Further, described voice synthetic module includes text analysis unit, prosodic control unit and synthesis voice list Unit, wherein, described text analysis unit is configured to receive text message and described text message is processed, and will process knot Fruit is transferred to described prosodic control unit and described synthesis voice unit, and described prosodic control unit is configured to and described synthesis Voice unit is connected, and transmits pitch, the duration of a sound, loudness of a sound, pause and prosody information, and described synthesis voice unit is configured to institute State the voice of the analysis result receiving text analysis unit and the control parameter synthesis output of described prosodic control unit.
Further, described semantics recognition module includes field mark unit, intention determination unit, information extraction unit, Wherein, described field mark unit is configured to be connected with described intention determination unit and transmission field information, and described intention is sentenced Disconnected unit is configured to be connected with described information extraction unit and transmit user intent information, and described information extraction unit exports language The result of justice analysis.
Further, described exception processing module includes abnormality detecting unit, data base querying unit and intervenes Shi Dan Unit, wherein, described abnormality detecting unit is configured to receive the output of described sound identification module and described semantics recognition module, And deciding whether to take intervening measure, described data base querying unit is configured to receive the intervention letter of described abnormality detecting unit Number, and receive the semantic information of described semantics recognition module, inquire about and export intervention message, described intervention Shi Danyuan is configured to Using intervene teacher's described intervention message that described data base querying unit export carry out necessary preferentially and change, finally defeated Go out replying message to user.
Present invention also offers a kind of man-machine mixing exchange method based on audio frequency, comprise the following steps:
Step 1, offer sound identification module, voice synthetic module, semantics recognition module and exception processing module;
Voice messaging is converted to Word message and exports to described semantics recognition list by step 2, described sound identification module Unit;
Step 3, described semantics recognition unit extract customer objective and corresponding key message from Word message;
Step 4, described exception processing module are according to the Word message of described sound identification module and described semantics recognition The semantic information of module judges whether human computer conversation abnormal and for abnormality processing message reply currently.
Further, in step 2, following steps are specifically included:
Step 2.1, from input audio stream extract feature supply acoustic model process, simultaneously reduce environment noise, channel The impact described feature being caused with speaker's factor;
Step 2.2, decoder, according to acoustics, linguistic model and dictionary, the result to described acoustic model, are found The word string of described audio stream can be exported with maximum of probability, as the recognition result of voice.
Further, in step 3, following steps are specifically included:
Step 3.1, using the significant field belonging to keyword tag current session in Word message;
Step 3.2, rule-based in described field user view is judged;
Step 3.3, according to described field and described user view, binding rule, specific key message is carried Take.
Further, in step 4, following steps are specifically included:
Step 4.1, abnormality detecting unit are according to the Word message of described sound identification module and described semantics recognition mould The semantic information of block judges whether current human computer conversation exception, if abnormal, takes over human computer conversation by intervening Shi Danyuan;
Step 4.2, data base querying unit carry out the inquiry of data base according to semantic information, obtain thering is the dry of recommendation degree Pre- message, if the recommendation degree of intervention message is higher, is directly intervened using this intervention message, if recommendation degree is relatively low, Then intervention required Shi Jinhang manpower intervention;
Step 4.3, when machine algorithm cannot find the intervention message of high recommendations degree, intervene teacher and intervene and carry out intervention message Selection and modification, subsequently amended intervention message is sent to client.
Further, described key message includes dialogue field, dialogue key word, and described dialogue key word includes content and closes Keyword and emotion key word.
Compared with prior art, the technique effect of the present invention includes:
1st, efficiency improves:Take full advantage of the time intervening teacher's wait user input so that intervention teacher can be simultaneously to multiple User carries out intervening service, improves the efficiency intervened.
2nd, cost reduces:A series of related telecommunication apparatus of telephone traffic center need not be purchased, using existing computer and Server can build Interference service platform.
3rd, operative scenario is enriched:Employ B/S (Browser/Server browser/server) knot due to intervening teacher interface Structure, intervention teacher opens the corresponding website of browser login and can carry out intervention operation it is not necessary to receive calls on station, permissible The mobile terminals such as PAD, smart mobile phone, personal notebook carry out intervening service.
4th, network requirement is low:The data volume very little of File Transfer, thus the requirement to network reduces, user's uppick simultaneously Voice by locally synthesizing, do not affected by network condition.
5th, unified human computer conversation's experience:For user, it is transparent for intervening teacher, and the experience of user is as filled with one " machine " that divide intelligence, can be with the current man-machine conversation mode of seamless connection in dialogue.
Technique effect below with reference to design, concrete structure and generation to the present invention for the accompanying drawing is described further, with It is fully understood from the purpose of the present invention, feature and effect.
Brief description
Fig. 1 is the intervention mode schematic diagram of existing tradition telephone traffic center;
Fig. 2 is the system module schematic diagram of the present invention;
Fig. 3 is the system flow schematic diagram of a preferred embodiment of the present invention;
Fig. 4 is the part dialog schematic flow sheet of a preferred embodiment of the present invention.
Specific embodiment
The present invention is achieved by the following technical solutions:
As shown in Fig. 2 the present invention relates to a kind of human computer conversation's abnormality processing system based on audio frequency, including:Speech recognition Module, voice synthetic module, semantics recognition module and exception processing module, wherein:Sound identification module and semantics recognition mould Block is connected and transmits the corresponding Word message of voice, sound identification module and semantics recognition module all with exception processing module phase Even, and respectively Word message and semantic analysis result are transmitted, exception processing module is connected with voice synthetic module and transmits intervention Information.
Described sound identification module includes:Signal processing and feature extraction unit, acoustic model, language model and solution Code device, wherein:Signal processing and feature extraction unit are connected with acoustic model and transmit acoustic featuress information, decoder and acoustics Model is connected with language model, exports recognition result to external world.
Described voice synthetic module includes:Text analysis unit, prosodic control unit and synthesis voice unit, its In:Text analysis unit receives text message and it is processed, and result is transferred to prosodic control unit and synthesis Voice unit, prosodic control unit is connected with synthesis voice unit, and the pitch of transmission objectives, the duration of a sound, loudness of a sound, pause and intonation Etc. information, synthesize voice unit and receive the analysis result of text analysis unit and the control parameter of prosodic control unit, to external world The voice of output synthesis.
Described semantics recognition module includes:Field mark unit, intention determination unit, information extraction unit, wherein:Neck Domain mark unit is connected with intention determination unit and transmission field information is it is intended that judging unit is connected with information extraction unit and passes Defeated user intent information, information unit is connected with the external world and transmits the information of semantic analysis.
Described exception processing module includes:Abnormality detecting unit, data base querying unit, intervention Shi Danyuan are with wherein: Abnormality detecting unit receives the output of sound identification module and semantics recognition module, and decides whether to take intervening measure, data The intervention signal of library inquiry unit reception abnormality detecting unit, and receive the semantic information of semantics recognition module, inquire about and export Intervention message, intervene Shi Danyuan using intervene teacher's intervention message that data base querying unit is exported carry out necessary preferentially and Modification, final output user reply message.
The present invention relates to human computer conversation's abnormality eliminating method of said system, specifically include following steps:
Step 1, offer sound identification module, voice synthetic module, semantics recognition module and exception processing module.
Voice messaging is converted to Word message and exports to semantics recognition unit by step 2, sound identification module, concrete step Rapid inclusion:
2.1 front-end processing audio streams, extract feature from input signal, process for acoustic model.Reduce as far as possible simultaneously The impact that the factors such as environment noise, channel, speaker cause to feature.
To the signal inputting according to acoustics, linguistic model and dictionary, searching can be exported 2.2 decoders with maximum of probability The word string of this signal, as the recognition result of voice.
Step 3, semantics recognition unit extract customer objective and corresponding key message, concrete steps from Word message Including:
3.1 utilize the significant field belonging to keyword tag current session in Word message.
3.2 in specific field the rule-based intention to user judge.
3.3 according to field and user view, binding rule, and template for example set in advance, to specific key message Extracted.
Step 4, exception processing module are believed according to the Word message of sound identification module and the semantic of semantics recognition module Breath judges whether human computer conversation currently exception and carry out the reply of abnormal process and message, and concrete steps include:
4.1 abnormality detecting unit are sentenced according to the Word message of sound identification module and the semantic information of semantics recognition module Whether current human computer conversation of breaking exception.Abnormal then processed by local client, abnormal then by intervention server Adapter human computer conversation.
4.2 data base querying units carry out the inquiry of data base according to semantic information, obtain the intervention message recommended, if The recommendation degree of intervention message is higher, then directly intervened using this intervention message, if recommendation degree is relatively low, intervention required teacher Carry out manpower intervention.
4.3, when machine algorithm cannot find the intervention message of high recommendation degree, intervene the choosing that teacher's intervention carries out intervention message Select and change, subsequently send amended intervention message to client.
During human computer conversation's abnormality processing, the speech recognition of machine and semantic solution are passed through in the phonetic entry of user After analysis, the result of the recognition result of voice and semantic parsing can be passed to intervention Shi Duan in a text form, intervene teacher and accept Can select to send conversation message or transmitting order to lower levels message to after message.Conversation message is transferred to machine in a text form Device, subsequently synthesizes voice by speech synthesis system (TTS) and plays to user, command messages are then directly to be executed by machine Order.
The present embodiment comprises the following steps, as shown in Figure 3 and Figure 4, i.e. user input -->Intervention message generates -->Client Machine pushes the introduction that three steps of intervention message carry out technical scheme respectively:
1) user input
During user carries out phonetic entry, using speech recognition system the phonetic entry audio frequency of user is converted to Word, this word is carried out with semantic analysis simultaneously, and (result of semantic analysis includes the current dialogue field of user, user's request Key message of service etc.), finally the result of word and semantic analysis is passed through in the form of text the POST side of http protocol Method is transferred to exception processing module.
2) intervention message generates
Exception processing module in abnormal cases, inquire about by the text message according to speech recognition and the semantic groove of semantics recognition Data base, obtains alternative intervention message.If the recommendation degree of intervention message is higher, directly done using this intervention message In advance, if recommendation degree is relatively low, intervention required Shi Jinhang manpower intervention.Intervene teacher can see by abnormality processing mould on interface Result of the recognition result of assistance data such as user input of block offer and semantic analysis etc., intervenes teacher's energy in conjunction with these information Enough more quickly and accurately candidate's intervention message is screened and changed.Intervention message is divided into conversation message and command messages, all Be transmitted using unified Websocket agreement in a text form, its difference from the different of transferring content and machine Processing mode different.
3) client computer pushes intervention message
Client computer immediately returns to after receiving intervention message intervene the confirmation of teacher's " message has been received by ", and by intervention message It is buffered in message queue.Client computer can be monitored current human computer conversation's state and attempt under certain condition from message queue Take out message to push to user, specific push opportunity includes:1st, intervention message reach when, 2, TTS synthesis speech message When report completes;Need the condition meeting be 1, message queue be not empty, 2, the audio player current idle of client computer.If Intervention message successfully pushes, and returns the confirmation intervening teacher's " intervention message pushes ".
For example:
1st, user A sends phonetic order " I will go to a joyful place ".
2nd, phonetic entry is converted to word by sound identification module.
3rd, semantic module obtains user view after processing is " navigation ", and the label on the target ground of navigation is " joyful ".
4th, the abnormality detecting unit in exception processing module receives the service request of user A, comprises complete speech recognition Result " I will go to a joyful place ", and the result " navigation " of semantic analysis, " joyful ", are detected simultaneously by current dialogue State occurs abnormal.
5th, the data base querying unit in exception processing module according to " navigation ", " joyful " carry out data base querying, obtain Some alternate message are such as " may I ask the joyful snack in your Suzhou to be gone?", " be you look for 5 to joyful related place ", Recommendation degree all ratios of this two message are relatively low, therefore the manpower intervention of intervention required Shi Danyuan.Intervene Shi Liyong exception processing module The text results of the database query result obtaining and semantic analysis result and speech recognition carry out intervention message selection and Modification, intervention message is changed to " may I ask what kind of entertainment way you want?", send text message to user.
6th, client computer is deposited into message queue after receiving intervention message, sends " message has been received by " to exception processing module Feedback, and attempt being pushed.
7th, condition carries out the speech synthesis system synthesis of intervention message and reports after meeting, and user hears that audio frequency " be may I ask What kind of entertainment way you want ", client computer sends " message pushes " feedback to exception processing module.
8th, client carries out further phonetic entry " I will go to sing "
9th, phonetic entry is converted to word by ASR system
10th, semantic analysis obtain user view is " navigation ", and the target of navigation is " KTV "
11st, abnormality detecting unit obtains the specific service demand of user A, and " I will go to comprise complete voice identification result Sing ", and the result of semantic analysis " navigation ", " KTV ".
12nd, data base querying unit according to " navigation ", " KTV " and the relevant information of user carries out the search of data base, Obtain alternative intervention message " recommend xxx may I ask for you whether to go to?", simultaneously because recommendation degree is very high, therefore bypass intervention Shi Dan Unit, directly sends word message to client computer " recommend xxx may I ask for you whether to go to?“
13rd, user confirms to go to
14th, abnormality processing system user pushes the intervention message of command type, comprises command type " navigation " and purpose The POI on ground.
15th, client computer takes out the message of command type " navigation " and corresponding POI from message queue, is led Boat operation, client computer sends " message pushes " feedback to exception processing module, and interaction terminates.
The preferred embodiment of the present invention described in detail above.It should be appreciated that the ordinary skill of this area need not be created The property made work just can make many modifications and variations according to the design of the present invention.Therefore, all technical staff in the art Pass through the available technology of logical analysis, reasoning, or a limited experiment under this invention's idea on the basis of existing technology Scheme, all should be in the protection domain being defined in the patent claims.

Claims (10)

1. a kind of man-machine mixing interactive system based on audio frequency is it is characterised in that include sound identification module, phonetic synthesis mould Block, semantics recognition module and exception processing module, wherein, described sound identification module is configured to and described semantics recognition mould Block is connected and transmits the corresponding Word message of voice, and described exception processing module is configured to and described sound identification module and institute Predicate justice identification module is connected, and described sound identification module is configured to transmit Word message to described exception processing module, institute Predicate justice identification module is configured to transmit semantic analysis result to described exception processing module;Described exception processing module is joined It is set to and be connected with described voice synthetic module and transmit intervention information.
2. the man-machine mixing interactive system based on audio frequency as claimed in claim 1 is it is characterised in that described sound identification module Including signal processing and feature extraction unit, acoustic model, language model and decoder, wherein, described signal processing and spy Levy extraction unit to be configured to be connected and transmit acoustic featuress information with described acoustic model, described decoder is configured to and institute State acoustic model to be connected with described language model and export recognition result.
3. the man-machine mixing interactive system based on audio frequency as claimed in claim 1 is it is characterised in that described voice synthetic module Including text analysis unit, prosodic control unit and synthesis voice unit, wherein, described text analysis unit is configured to connect Receive text message and described text message is processed, result is transferred to described prosodic control unit and described synthesis Voice unit, described prosodic control unit be configured to described synthesis voice unit be connected, and transmit pitch, the duration of a sound, loudness of a sound, Pause and prosody information, described synthesis voice unit be configured to by described receive text analysis unit analysis result with described The voice of the control parameter synthesis output of prosodic control unit.
4. the man-machine mixing interactive system based on audio frequency as claimed in claim 1 is it is characterised in that described semantics recognition module Including field mark unit, intention determination unit, information extraction unit, wherein, described field mark unit is configured to and institute State intention determination unit to be connected and transmission field information, described intention determination unit is configured to and described information extraction unit phase Connect and transmit user intent information, described information extraction unit exports the result of semantic analysis.
5. the man-machine mixing interactive system based on audio frequency as claimed in claim 1 is it is characterised in that described exception processing module Including abnormality detecting unit, data base querying unit and intervention Shi Danyuan, wherein, described abnormality detecting unit is configured to connect Receive described sound identification module and the output of described semantics recognition module, and decide whether to take intervening measure, described data base Query unit is configured to receive the intervention signal of described abnormality detecting unit, and receives the semantic letter of described semantics recognition module Breath, inquires about and exports intervention message, and it is defeated to described data base querying unit that described intervention Shi Danyuan is configured to, with intervention teacher The described intervention message going out carry out necessary preferentially and modification, final output replying message to user.
6. a kind of man-machine mixing exchange method based on audio frequency is it is characterised in that comprise the following steps:
Step 1, offer sound identification module, voice synthetic module, semantics recognition module and exception processing module;
Voice messaging is converted to Word message and exports to described semantics recognition unit by step 2, described sound identification module;
Step 3, described semantics recognition unit extract customer objective and corresponding key message from Word message;
Step 4, described exception processing module are according to the Word message of described sound identification module and described semantics recognition module Semantic information judge whether human computer conversation reply that is abnormal and being directed to abnormality processing message currently.
7. the man-machine mixing exchange method based on audio frequency as claimed in claim 6 it is characterised in that in step 2, specifically wraps Include following steps:
Step 2.1, extract feature from the audio stream of input and supply acoustic model to process, reduce environment noise, channel and say simultaneously The impact that words people's factor causes to described feature;
, according to acoustics, linguistic model and dictionary, the result to described acoustic model, searching can for step 2.2, decoder Export the word string of described audio stream with maximum of probability, as the recognition result of voice.
8. the man-machine mixing exchange method based on audio frequency as claimed in claim 6 it is characterised in that in step 3, specifically wraps Include following steps:
Step 3.1, using the significant field belonging to keyword tag current session in Word message;
Step 3.2, rule-based in described field user view is judged;
Step 3.3, according to described field and described user view, binding rule, specific key message is extracted.
9. the man-machine mixing exchange method based on audio frequency as claimed in claim 6 it is characterised in that in step 4, specifically wraps Include following steps:
Step 4.1, abnormality detecting unit are according to the Word message of described sound identification module and described semantics recognition module Semantic information judges whether current human computer conversation exception, if abnormal, takes over human computer conversation by intervening Shi Danyuan;
Step 4.2, data base querying unit carry out the inquiry of data base according to semantic information, and the intervention obtaining having recommendation degree disappears Breath, if the recommendation degree of intervention message is higher, is directly intervened using this intervention message, if recommendation degree is relatively low, please Seek intervention Shi Jinhang manpower intervention;
Step 4.3, when machine algorithm cannot find the intervention message of high recommendations degree, intervene teacher and intervene the choosing carrying out intervention message Select and change, subsequently send amended intervention message to client.
10. the man-machine mixing exchange method based on audio frequency as described in claim 6 or 8 is it is characterised in that described key message Including dialogue field, dialogue key word, described dialogue key word includes content keyword and emotion key word.
CN201610791966.0A 2016-08-31 2016-08-31 Man-machine mixed interaction system and method based on audio Active CN106409283B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610791966.0A CN106409283B (en) 2016-08-31 2016-08-31 Man-machine mixed interaction system and method based on audio

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610791966.0A CN106409283B (en) 2016-08-31 2016-08-31 Man-machine mixed interaction system and method based on audio

Publications (2)

Publication Number Publication Date
CN106409283A true CN106409283A (en) 2017-02-15
CN106409283B CN106409283B (en) 2020-01-10

Family

ID=58001464

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610791966.0A Active CN106409283B (en) 2016-08-31 2016-08-31 Man-machine mixed interaction system and method based on audio

Country Status (1)

Country Link
CN (1) CN106409283B (en)

Cited By (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107122807A (en) * 2017-05-24 2017-09-01 努比亚技术有限公司 A kind of family's monitoring method, service end and computer-readable recording medium
CN107733780A (en) * 2017-09-18 2018-02-23 上海量明科技发展有限公司 Task smart allocation method, apparatus and JICQ
CN107992587A (en) * 2017-12-08 2018-05-04 北京百度网讯科技有限公司 A kind of voice interactive method of browser, device, terminal and storage medium
CN109697226A (en) * 2017-10-24 2019-04-30 上海易谷网络科技股份有限公司 Text silence seat monitoring robot interactive method
CN110069607A (en) * 2017-12-14 2019-07-30 株式会社日立制作所 For the method, apparatus of customer service, electronic equipment, computer readable storage medium
CN110602334A (en) * 2019-09-03 2019-12-20 上海航动科技有限公司 Intelligent outbound method and system based on man-machine cooperation
WO2020057446A1 (en) * 2018-09-17 2020-03-26 Huawei Technologies Co., Ltd. Method and system for generating a semantic point cloud map
CN110926493A (en) * 2019-12-10 2020-03-27 广州小鹏汽车科技有限公司 Navigation method, navigation device, vehicle and computer readable storage medium
CN110970017A (en) * 2018-09-27 2020-04-07 北京京东尚科信息技术有限公司 Human-computer interaction method and system and computer system
CN111125384A (en) * 2018-11-01 2020-05-08 阿里巴巴集团控股有限公司 Multimedia answer generation method and device, terminal equipment and storage medium
CN111540353A (en) * 2020-04-16 2020-08-14 重庆农村商业银行股份有限公司 Semantic understanding method, device, equipment and storage medium
CN112509575A (en) * 2020-11-26 2021-03-16 上海济邦投资咨询有限公司 Financial consultation intelligent guiding system based on big data
CN112735410A (en) * 2020-12-25 2021-04-30 中国人民解放军63892部队 Automatic voice interactive force model control method and system
CN112735427A (en) * 2020-12-25 2021-04-30 平安普惠企业管理有限公司 Radio reception control method and device, electronic equipment and storage medium
CN107204185B (en) * 2017-05-03 2021-05-25 深圳车盒子科技有限公司 Vehicle-mounted voice interaction method and system and computer readable storage medium
CN116453540A (en) * 2023-06-15 2023-07-18 山东贝宁电子科技开发有限公司 Underwater frogman voice communication quality enhancement processing method

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1920948A (en) * 2005-08-24 2007-02-28 富士通株式会社 Voice recognition system and voice processing system
CN101276584A (en) * 2007-03-28 2008-10-01 株式会社东芝 Prosody-pattern generating apparatus, speech synthesizing apparatus, and computer program product and method thereof
CN102509483A (en) * 2011-10-31 2012-06-20 苏州思必驰信息科技有限公司 Distributive automatic grading system for spoken language test and method thereof
CN102982799A (en) * 2012-12-20 2013-03-20 中国科学院自动化研究所 Speech recognition optimization decoding method integrating guide probability
CN104678868A (en) * 2015-01-23 2015-06-03 贾新勇 Business and equipment operation and maintenance monitoring system
CN105227790A (en) * 2015-09-24 2016-01-06 北京车音网科技有限公司 A kind of voice answer method, electronic equipment and system
CN105723362A (en) * 2013-10-28 2016-06-29 余自立 Natural expression processing method, processing and response method, device, and system

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1920948A (en) * 2005-08-24 2007-02-28 富士通株式会社 Voice recognition system and voice processing system
CN101276584A (en) * 2007-03-28 2008-10-01 株式会社东芝 Prosody-pattern generating apparatus, speech synthesizing apparatus, and computer program product and method thereof
CN102509483A (en) * 2011-10-31 2012-06-20 苏州思必驰信息科技有限公司 Distributive automatic grading system for spoken language test and method thereof
CN102982799A (en) * 2012-12-20 2013-03-20 中国科学院自动化研究所 Speech recognition optimization decoding method integrating guide probability
CN105723362A (en) * 2013-10-28 2016-06-29 余自立 Natural expression processing method, processing and response method, device, and system
CN104678868A (en) * 2015-01-23 2015-06-03 贾新勇 Business and equipment operation and maintenance monitoring system
CN105227790A (en) * 2015-09-24 2016-01-06 北京车音网科技有限公司 A kind of voice answer method, electronic equipment and system

Cited By (23)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107204185B (en) * 2017-05-03 2021-05-25 深圳车盒子科技有限公司 Vehicle-mounted voice interaction method and system and computer readable storage medium
CN107122807A (en) * 2017-05-24 2017-09-01 努比亚技术有限公司 A kind of family's monitoring method, service end and computer-readable recording medium
CN107122807B (en) * 2017-05-24 2021-05-21 努比亚技术有限公司 Home monitoring method, server and computer readable storage medium
CN107733780B (en) * 2017-09-18 2020-07-03 上海量明科技发展有限公司 Intelligent task allocation method and device and instant messaging tool
CN107733780A (en) * 2017-09-18 2018-02-23 上海量明科技发展有限公司 Task smart allocation method, apparatus and JICQ
CN109697226A (en) * 2017-10-24 2019-04-30 上海易谷网络科技股份有限公司 Text silence seat monitoring robot interactive method
CN107992587A (en) * 2017-12-08 2018-05-04 北京百度网讯科技有限公司 A kind of voice interactive method of browser, device, terminal and storage medium
CN110069607A (en) * 2017-12-14 2019-07-30 株式会社日立制作所 For the method, apparatus of customer service, electronic equipment, computer readable storage medium
CN110069607B (en) * 2017-12-14 2024-03-05 株式会社日立制作所 Method, apparatus, electronic device, and computer-readable storage medium for customer service
WO2020057446A1 (en) * 2018-09-17 2020-03-26 Huawei Technologies Co., Ltd. Method and system for generating a semantic point cloud map
US10983526B2 (en) 2018-09-17 2021-04-20 Huawei Technologies Co., Ltd. Method and system for generating a semantic point cloud map
CN110970017A (en) * 2018-09-27 2020-04-07 北京京东尚科信息技术有限公司 Human-computer interaction method and system and computer system
CN111125384A (en) * 2018-11-01 2020-05-08 阿里巴巴集团控股有限公司 Multimedia answer generation method and device, terminal equipment and storage medium
CN111125384B (en) * 2018-11-01 2023-04-07 阿里巴巴集团控股有限公司 Multimedia answer generation method and device, terminal equipment and storage medium
CN110602334A (en) * 2019-09-03 2019-12-20 上海航动科技有限公司 Intelligent outbound method and system based on man-machine cooperation
CN110926493A (en) * 2019-12-10 2020-03-27 广州小鹏汽车科技有限公司 Navigation method, navigation device, vehicle and computer readable storage medium
CN111540353A (en) * 2020-04-16 2020-08-14 重庆农村商业银行股份有限公司 Semantic understanding method, device, equipment and storage medium
CN112509575A (en) * 2020-11-26 2021-03-16 上海济邦投资咨询有限公司 Financial consultation intelligent guiding system based on big data
CN112735427A (en) * 2020-12-25 2021-04-30 平安普惠企业管理有限公司 Radio reception control method and device, electronic equipment and storage medium
CN112735410A (en) * 2020-12-25 2021-04-30 中国人民解放军63892部队 Automatic voice interactive force model control method and system
CN112735427B (en) * 2020-12-25 2023-12-05 海菲曼(天津)科技有限公司 Radio reception control method and device, electronic equipment and storage medium
CN116453540A (en) * 2023-06-15 2023-07-18 山东贝宁电子科技开发有限公司 Underwater frogman voice communication quality enhancement processing method
CN116453540B (en) * 2023-06-15 2023-08-29 山东贝宁电子科技开发有限公司 Underwater frogman voice communication quality enhancement processing method

Also Published As

Publication number Publication date
CN106409283B (en) 2020-01-10

Similar Documents

Publication Publication Date Title
CN106409283A (en) Audio frequency-based man-machine mixed interaction system and method
CN110266899B (en) Client intention identification method and customer service system
US10911596B1 (en) Voice user interface for wired communications system
US10412206B1 (en) Communications for multi-mode device
CN101207586B (en) Method and system for real-time automatic communication
US20190332679A1 (en) Auto-translation for multi user audio and video
EP2411977B1 (en) Service oriented speech recognition for in-vehicle automated interaction
US20180054506A1 (en) Enabling voice control of telephone device
CN109961792B (en) Method and apparatus for recognizing speech
US20120253823A1 (en) Hybrid Dialog Speech Recognition for In-Vehicle Automated Interaction and In-Vehicle Interfaces Requiring Minimal Driver Processing
CN101207655A (en) Method and system switching between voice and text exchanging forms in a communication conversation
CN104010267A (en) Method and system for supporting a translation-based communication service and terminal supporting the service
US10194023B1 (en) Voice user interface for wired communications system
CN106230689A (en) Method, device and the server that a kind of voice messaging is mutual
US9390426B2 (en) Personalized advertisement device based on speech recognition SMS service, and personalized advertisement exposure method based on partial speech recognition SMS service
US10326886B1 (en) Enabling additional endpoints to connect to audio mixing device
CN110119514A (en) The instant translation method of information, device and system
CN112866086A (en) Information pushing method, device, equipment and storage medium for intelligent outbound
CN111554280A (en) Real-time interpretation service system for mixing interpretation contents using artificial intelligence and interpretation contents of interpretation experts
CN108881507B (en) System comprising voice browser and block chain voice DNS unit
CN116863935B (en) Speech recognition method, device, electronic equipment and computer readable medium
KR20090076318A (en) Realtime conversational service system and method thereof
CN110740212B (en) Call answering method and device based on intelligent voice technology and electronic equipment
US20020072916A1 (en) Distributed speech recognition for internet access
CN1427394A (en) Speech sound browsing network

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
TR01 Transfer of patent right
TR01 Transfer of patent right

Effective date of registration: 20200619

Address after: Room 105G, 199 GuoShoujing Road, Pudong New Area, Shanghai, 200120

Patentee after: Shanghai Jiaotong University Intellectual Property Management Co.,Ltd.

Address before: 200240 Dongchuan Road, Shanghai, No. 800, No.

Patentee before: SHANGHAI JIAO TONG University

TR01 Transfer of patent right
TR01 Transfer of patent right

Effective date of registration: 20201105

Address after: 215123 14 Tengfei Innovation Park, 388 Xinping street, Suzhou Industrial Park, Suzhou, Jiangsu.

Patentee after: AI SPEECH Ltd.

Address before: Room 105G, 199 GuoShoujing Road, Pudong New Area, Shanghai, 200120

Patentee before: Shanghai Jiaotong University Intellectual Property Management Co.,Ltd.

CP01 Change in the name or title of a patent holder
CP01 Change in the name or title of a patent holder

Address after: 215123 building 14, Tengfei Innovation Park, 388 Xinping street, Suzhou Industrial Park, Suzhou City, Jiangsu Province

Patentee after: Sipic Technology Co.,Ltd.

Address before: 215123 building 14, Tengfei Innovation Park, 388 Xinping street, Suzhou Industrial Park, Suzhou City, Jiangsu Province

Patentee before: AI SPEECH Ltd.