CN110428816A - Method and device for training and sharing a voice cell bank - Google Patents

Method and device for training and sharing a voice cell bank

Info

Publication number
CN110428816A
CN110428816A (application CN201910706841.7A)
Authority
CN
China
Prior art keywords
voice
cell bank
user
label
tag set
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910706841.7A
Other languages
Chinese (zh)
Other versions
CN110428816B (en)
Inventor
龚思颖
赵晓朝
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Suddenly Cognitive Technology Co ltd
Original Assignee
Beijing Suddenly Cognitive Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Suddenly Cognitive Technology Co Ltd filed Critical Beijing Suddenly Cognitive Technology Co Ltd
Publication of CN110428816A publication Critical patent/CN110428816A/en
Application granted granted Critical
Publication of CN110428816B publication Critical patent/CN110428816B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/06 Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G10L15/063 Training
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/28 Constructional details of speech recognition systems
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/06 Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G10L15/063 Training
    • G10L2015/0631 Creating reference templates; Clustering
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L2015/223 Execution procedure of a spoken command

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Artificial Intelligence (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Telephonic Communication Services (AREA)

Abstract

An embodiment of the invention provides a method for training and sharing a personal voice bank, comprising: step 1, a voice assistant obtains a voice command input by a user; step 2, the voice command input by the user is recognized and parsed, and one or more labels are marked for the voice command; step 3, statistical analysis is performed on the user's voice commands; step 4, the voice assistant creates and/or updates a voice cell bank. The method makes interaction between the voice assistant and the user more intelligent and humanized, improves interaction efficiency, and improves the user experience.

Description

Method and device for training and sharing a voice cell bank
Technical field
Embodiments of the present invention relate to the technical field of information processing, and in particular to a method, device, equipment, and computer-readable storage medium for training and sharing a voice bank.
Background technique
With the development of technology, artificial intelligence has gradually permeated people's lives, and voice assistants, as a bridge for human-computer interaction, play a very important role. By interacting with a voice assistant through speech, a user can on the one hand free both hands and on the other hand communicate at will. Existing voice assistants, however, have some problems. Because users are highly diverse, with different languages, accents, dialects, and interests, a voice assistant needs a relatively wide range of voice banks to communicate with users. Existing voice banks are usually generated by collecting and training on large corpora, most of which come from official corpora or large-scale sampling. Such voice banks are not targeted at an individual user, which limits good interaction between the voice assistant and the user and leads to a poor user experience.
How to provide a voice assistant with a voice bank that better meets a user's individual requirements has become an urgent problem to be solved.
Summary of the invention
In view of the above problems in the prior art, the present invention proposes a method and device for training and sharing a voice cell bank, to overcome those problems.
An embodiment of the invention provides a method for training and sharing a voice cell bank, comprising:
Step 1: a voice assistant obtains a voice command input by a user;
Step 2: the voice command input by the user is recognized and parsed, and one or more labels are marked for the voice command;
Step 3: statistical analysis is performed on the user's voice commands;
Step 4: the voice assistant creates and/or updates a voice cell bank.
Preferably, the method further comprises:
Step 5: sharing the created and/or updated voice cell bank with an intelligent interaction platform and/or the user's friends.
Preferably, the method further comprises:
Step 6: judging the frequency of use of each voice cell bank the voice assistant holds, and deleting a voice cell bank when its frequency of use is below a threshold.
Preferably, step 2 specifically comprises:
When the voice assistant has the user's personal voice cell bank, recognition is first performed based on that cell bank and it is judged whether the voice command can be recognized; if so, the voice command is parsed and one or more labels are marked for it.
When the voice assistant does not have the user's personal voice cell bank, speech recognition is performed based on a general-purpose library; if the command can be recognized, it is parsed and one or more labels are marked. If it cannot be recognized based on the general-purpose library, the voice assistant searches for the voice command or sends it over the network to a cloud server, which performs recognition and parsing. If recognition and parsing succeed, one or more labels are marked for the voice command; if they fail, the user marks one or more labels manually for the voice command.
Preferably, in step 3, the number of occurrences of each label and/or tag set of the voice commands is counted, where a tag set is the set formed by multiple labels marked together on one voice command.
Preferably, step 3 is executed when a predetermined condition is satisfied.
The predetermined condition is that a predetermined time interval has elapsed or the number of voice commands has reached a predetermined quantity; when either condition is met, step 3 is executed.
Preferably, creating a voice cell bank comprises the following steps:
Step 41: judge whether a friend of the user has a voice cell bank with the label and/or tag set feature; if so, send a request to the friend for the voice cell bank; if not, execute step 42;
Step 42: send a query request to the intelligent interaction platform and judge whether it has a voice cell bank with the label and/or tag set feature; if so, obtain the voice cell bank from the platform; if not, execute step 43;
Step 43: group the user's voice commands according to each label and/or tag set, train on the grouped voice commands separately, and create a voice cell bank corresponding to each label and/or tag set feature.
Updating a voice cell bank means updating, based on the labeled user commands, the voice cell bank corresponding to the respective label and/or tag set feature.
Preferably, when a label and/or tag set satisfies any of the following conditions, a voice cell bank with that label and/or tag set feature is created and/or updated:
Condition (1): the number of occurrences of the i-th tag set Ci satisfies the formula

N_Ci ≥ n    (1)

Condition (2): when a label Li belongs to a tag set satisfying formula (1), the count of label Li minus the number of times it occurs within tag sets satisfying that condition is greater than or equal to m;
Condition (3): when a label Li does not belong to any tag set satisfying formula (1), the count of label Li is greater than or equal to k;
where N_Ci denotes the number of occurrences of the i-th tag set Ci, and n, m, k are non-negative integers.
This embodiment further provides a device for training and sharing a voice cell bank, the device comprising:
an obtaining module, for obtaining the voice command input by the user to the voice assistant;
a marking module, for recognizing and parsing the voice command input by the user and marking one or more labels for the voice command;
a statistics module, for performing statistical analysis on the user's voice commands;
a creation/update module, for creating and/or updating a voice cell bank.
Preferably, which further includes
Sharing module is good to intelligent interaction platform and/or user for sharing creation and/or the voice cell bank updated Friend.
Preferably, which further includes
Removing module, for judging the frequency of use of voice cell bank that voice assistant includes, when frequency of use is lower than threshold When value, which is deleted.
Preferably, the marking module is specifically configured to:
when the voice assistant has the user's personal voice cell bank, first perform recognition based on that cell bank and judge whether the voice command can be recognized; if so, parse the voice command and mark one or more labels for it;
when the voice assistant does not have the user's personal voice cell bank, perform speech recognition based on a general-purpose library; if the command can be recognized, parse it and mark one or more labels. If it cannot be recognized based on the general-purpose library, the voice assistant searches for the voice command or sends it over the network to a cloud server, which performs recognition and parsing. If recognition and parsing succeed, one or more labels are marked for the voice command; if they fail, the user marks one or more labels manually for the voice command.
Preferably, the statistics module is specifically configured to count the number of occurrences of each label and/or tag set of the voice commands, where a tag set is the set formed by multiple labels marked together on one voice command.
Preferably, the statistics module is triggered when a predetermined condition is satisfied.
The predetermined condition is that a predetermined time interval has elapsed or the number of voice commands has reached a predetermined quantity; when either condition is met, the statistics module is triggered.
Preferably, creating a voice cell bank comprises the following steps:
Step 41: judge whether a friend of the user has a voice cell bank with the label and/or tag set feature; if so, send a request to the friend for the voice cell bank; if not, execute step 42;
Step 42: send a query request to the intelligent interaction platform and judge whether it has a voice cell bank with the label and/or tag set feature; if so, obtain the voice cell bank from the platform; if not, execute step 43;
Step 43: group the user's voice commands according to each label and/or tag set, train on the grouped voice commands separately, and create a voice cell bank corresponding to each label and/or tag set feature.
Updating a voice cell bank means updating, based on the labeled user commands, the voice cell bank corresponding to the respective label and/or tag set feature.
Preferably, when a label and/or tag set satisfies any of the following conditions, a voice cell bank with that label and/or tag set feature is created and/or updated:
Condition (1): the number of occurrences of the i-th tag set Ci satisfies the formula

N_Ci ≥ n    (1)

Condition (2): when a label Li belongs to a tag set satisfying formula (1), the count of label Li minus the number of times it occurs within tag sets satisfying that condition is greater than or equal to m;
Condition (3): when a label Li does not belong to any tag set satisfying formula (1), the count of label Li is greater than or equal to k;
where N_Ci denotes the number of occurrences of the i-th tag set Ci, and n, m, k are non-negative integers.
The present invention further provides a voice assistant comprising the above device.
The present invention further provides a terminal comprising the above voice assistant.
The present invention further provides a computer equipment comprising a processor and a memory, the memory storing computer instructions executable by the processor; when the processor executes the computer instructions, the method described above is implemented.
The present invention further provides a computer-readable storage medium storing computer instructions for implementing the method described above.
Detailed description of the invention
Fig. 1 is a schematic diagram of an intelligent interaction platform provided by an embodiment of the present invention.
Fig. 2 shows a method for training and sharing a personal voice bank in an embodiment of the present invention.
Fig. 3 shows a device for training and sharing a personal voice bank in an embodiment of the present invention.
Specific embodiment
To make the objectives, technical solutions, and advantages of the present invention clearer, the embodiments of the present invention are described in further detail below with reference to the accompanying drawings. The specific features of the embodiments are a detailed description of the technical solutions of the embodiments, not a restriction of the technical solutions of the invention; in the absence of conflict, the technical features of the embodiments may be combined with each other.
Referring to Fig. 1, Fig. 1 is a schematic diagram of the intelligent interaction platform of the present invention, which mainly comprises a human-computer interaction interface 101, a processing module 102, a database 103, and the like. The processing module comprises a plurality of interaction engines 112, and an interaction engine 112 may comprise a semantic understanding module 201, a dialogue management and control module 202, a dialogue generation module 203, and a command execution module 204. The processing module 102 and the human-computer interaction interface 101 are interconnected: data input by the user can be received through the human-computer interaction interface 101, and interaction data can be output to the user through it. That is, the human-computer interaction interface 101 can on the one hand receive the dialogue data the processing module 102 feeds back to the user, and on the other hand receive the command execution process and result data fed back by the processing module 102. For an intelligent voice interaction platform, the processing module 102 may further comprise a speech recognition module 210 and a voice output module 211; the speech recognition module 210 and the voice output module 211 may also be configured within an interaction engine 112. In addition, an interaction engine 112 may be a single interaction engine or may be composed of one or more interaction sub-engines.
One of the main points in optimizing the interaction engines of an intelligent interaction platform is to improve their processing capability: enhancing the engines' semantic understanding, improving the efficiency of dialogue interaction, and raising task execution accuracy. All of these require an accurate understanding of the user's intent, achieved by enriching the slots and slot parsing within the interaction engines and improving the engines' control and management of the interaction.
Referring to Fig. 2, Fig. 2 shows a method for training and sharing a voice cell bank provided by embodiment one of the present invention. The method includes, but is not limited to:
Step 1: the voice assistant obtains the voice command input by the user.
Specifically, in step 1, when the user converses with the voice assistant, the voice assistant obtains the user's voice commands.
For example, when the user talks with the voice assistant in the Sichuan dialect about the topic of "Honor of Kings", the voice assistant obtains the user's voice commands.
Step 2: the voice command input by the user is recognized and parsed, and one or more labels are marked for the voice command.
In step 2, the voice assistant recognizes the user's voice command. When the voice assistant has the user's personal voice cell bank, recognition is first performed based on that cell bank and it is judged whether the voice command can be recognized; if so, the voice command is parsed and one or more labels are marked for it. The labels relate to the user's voice command, such as the topic of the command ("Honor of Kings" in this example) or the user's vocal features (the Sichuan dialect in this example). If the voice assistant does not have the user's personal voice cell bank, speech recognition is performed based on a general-purpose library; if the command can be recognized, it is parsed and labels are marked. If it cannot be recognized based on the general-purpose library, the voice assistant searches for the voice command or sends it over the network to a cloud server, which performs recognition and parsing. If recognition and parsing succeed, one or more labels are marked for the voice command; if they fail, the user marks one or more labels manually for the voice command.
Further, labels are marked in set form, e.g. {label 1, label 2}.
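As a sketch, the tiered recognition and labeling flow of step 2 (personal voice cell bank, then general-purpose library, then cloud server, then manual labeling by the user) might look as follows; every function name here is a hypothetical placeholder, not an API defined by this patent:

```python
def recognize_and_label(audio, personal_bank, recognizers, ask_user_for_labels):
    """Tiered recognition per step 2.

    recognizers: dict of hypothetical callables 'bank', 'general', 'cloud',
    each returning (parsed_text, labels) on success or None on failure.
    ask_user_for_labels: fallback when every recognizer fails (manual labeling).
    """
    # First try the user's personal voice cell bank, if one exists.
    if personal_bank is not None:
        result = recognizers["bank"](audio, personal_bank)
        if result is not None:
            return result
    # Fall back to the general-purpose library.
    result = recognizers["general"](audio)
    if result is not None:
        return result
    # Search for the command / send it over the network to a cloud server.
    result = recognizers["cloud"](audio)
    if result is not None:
        return result
    # Recognition and parsing failed everywhere: the user labels manually.
    return None, ask_user_for_labels(audio)
```

Labels are carried as a set, matching the {label 1, label 2} set form described above.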
Step 3: statistical analysis is performed on the user's voice commands.
In this step, performing statistical analysis on the user's voice commands includes performing statistical analysis on the labels of the user's voice commands.
The statistical analysis of the labels includes counting the number of occurrences of each label and/or tag set, where a tag set is the set formed by multiple labels marked together on one voice command.
For example, the user's label statistics are as follows:

Label Li    Count
Label 1     12
Label 2     13
Label 3     5
Label 4     9

The tag set statistics are counted in the same way.
Preferably, step 3 is executed when a predetermined condition is satisfied. The predetermined condition is that a predetermined time interval has elapsed or the number of voice commands has reached a predetermined quantity; when either condition is met, the statistical analysis step is executed. The predetermined time interval and predetermined quantity are set by the user, set adaptively by the voice assistant, or configured by the intelligent interaction platform. By statistically analyzing the labels and/or tag sets of the voice commands obtained within the predetermined time interval, or analyzing once enough user voice commands have been obtained, the voice assistant can determine the characteristics of the user's voice input.
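A minimal sketch of the counting in step 3, using Python's standard `Counter`; each voice command is represented only by its label set, and the trigger check mirrors the "time interval or command count" condition (all names are illustrative, not defined by the patent):

```python
from collections import Counter

def count_labels_and_tag_sets(commands):
    """commands: one label set per voice command,
    e.g. {"Honor of Kings", "Sichuan dialect"}."""
    label_counts = Counter()
    tag_set_counts = Counter()
    for labels in commands:
        label_counts.update(labels)
        # A tag set is the set of labels marked together on one command;
        # a frozenset makes it usable as a counter key.
        tag_set_counts[frozenset(labels)] += 1
    return label_counts, tag_set_counts

def should_analyze(elapsed_seconds, interval_seconds, n_commands, min_commands):
    """Step 3 runs when either predetermined condition is met."""
    return elapsed_seconds >= interval_seconds or n_commands >= min_commands
```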
Step 4: the voice assistant creates and/or updates a voice cell bank.
Through the statistical analysis of the user's voice commands, the voice assistant can determine the features of the user's voice during interaction, the user's interests, and the like. Based on the above labels and/or tag sets, the voice assistant creates and/or updates a voice cell bank with the label and/or tag set feature: when the voice assistant does not have a voice cell bank with the label and/or tag set feature, it creates one; otherwise it updates the corresponding voice cell bank.
Creating a voice cell bank comprises the following steps:
Step 41: judge whether a friend of the user has a voice cell bank with the label and/or tag set feature; if so, send a request to the friend for the voice cell bank; if not, execute step 42.
The user's friends include the people in the user's contact lists across communication channels, and people who have established a friend relationship with the user's voice assistant. By sending query requests to the user's friends, a voice cell bank meeting the condition can be obtained relatively easily.
Step 42: send a query request to the intelligent interaction platform and judge whether it has a voice cell bank with the label and/or tag set feature; if so, obtain the voice cell bank from the platform; if not, execute step 43.
The intelligent interaction platform holds a general voice bank, voice cell banks uploaded by developers, voice cell banks trained by other people, and so on. When the voice assistant cannot obtain the needed voice cell bank from a friend, sending a request to the intelligent interaction platform improves the probability of obtaining a voice bank meeting the condition.
Step 43: group the user's voice commands according to each label and/or tag set, train on the grouped voice commands separately, and create a voice cell bank corresponding to each label and/or tag set feature.
If the voice assistant cannot obtain a voice cell bank meeting the condition, it trains on the user's own voice commands to form a voice cell bank meeting its own needs.
The voice commands used for training may come from the user's dialogues with the voice assistant, or from the user's calls with other people.
Updating a voice cell bank means updating, based on the labeled user commands, the voice cell bank corresponding to the respective label and/or tag set feature.
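The acquisition cascade of steps 41-43 (ask friends first, then the platform, then train locally on the user's own grouped commands) can be sketched as below; `query_friends`, `query_platform`, and `train` are hypothetical callables standing in for the network requests and training procedure the patent leaves unspecified:

```python
def obtain_cell_bank(feature, query_friends, query_platform, user_commands, train):
    """Steps 41-43: try friends, then the intelligent interaction platform,
    then train a new voice cell bank from the user's own commands.

    feature: a label or tag set feature, given as a set of labels.
    user_commands: list of dicts with a "labels" set (and, in practice, audio).
    """
    bank = query_friends(feature)          # step 41
    if bank is not None:
        return bank
    bank = query_platform(feature)         # step 42
    if bank is not None:
        return bank
    # Step 43: group the commands whose labels contain the feature, train on them.
    group = [cmd for cmd in user_commands if feature <= cmd["labels"]]
    return train(group)
```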
Preferably, when a label and/or tag set satisfies any of the following conditions, a voice cell bank with that label and/or tag set feature is created and/or updated:
Condition (1): the number of occurrences of the i-th tag set Ci satisfies the formula

N_Ci ≥ n    (1)

Condition (2): when a label Li belongs to a tag set satisfying formula (1), the count of label Li minus the number of times it occurs within tag sets satisfying that condition is greater than or equal to m;
Condition (3): when a label Li does not belong to any tag set satisfying formula (1), the count of label Li is greater than or equal to k;
where N_Ci denotes the number of occurrences of the i-th tag set Ci; n, m, k are non-negative integers whose values are set by the user, set adaptively by the voice assistant, or configured by the intelligent interaction platform.
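One possible reading of conditions (1)-(3) in code, operating on the counters produced in step 3. The interpretation of condition (2), subtracting a label's occurrences inside qualifying tag sets and comparing the remainder with m, is an assumption where the patent text is ambiguous:

```python
def qualifying_features(label_counts, tag_set_counts, n, m, k):
    """Return the tag sets and labels whose voice cell banks should be
    created or updated. n, m, k are non-negative integer thresholds."""
    # Condition (1): tag set Ci occurs at least n times.
    sets_ok = {ts for ts, count in tag_set_counts.items() if count >= n}
    labels_ok = set()
    for label, count in label_counts.items():
        # Occurrences of this label inside qualifying tag sets.
        in_sets = sum(c for ts, c in tag_set_counts.items()
                      if ts in sets_ok and label in ts)
        if in_sets > 0:
            # Condition (2): remaining count after subtraction is >= m.
            if count - in_sets >= m:
                labels_ok.add(label)
        elif count >= k:
            # Condition (3): label outside any qualifying tag set.
            labels_ok.add(label)
    return sets_ok, labels_ok
```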
Further, the method also comprises step 5: sharing the created and/or updated voice cell bank with the intelligent interaction platform and/or the user's friends.
Sharing the created and/or updated voice cell bank with the intelligent interaction platform can improve the variety of voice cell banks on the platform. Furthermore, a voice cell bank can also be shared with friends whose voice commands have the labels and/or tag set features of that voice cell bank, so that a friend obtains the voice cell bank or uses it, together with the friend's own voice cell banks, for training and updating. The sharing step may be performed by a user operation or by the voice assistant.
Further, the method also comprises step 6: judging the frequency of use of each voice cell bank the voice assistant holds and, when the frequency of use is below a threshold, deleting that voice cell bank, which saves space on the user terminal.
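The pruning in step 6 is a straightforward filter; a sketch follows, where the bank names and the usage-count bookkeeping are illustrative, since the patent does not define how usage is tracked:

```python
def prune_cell_banks(banks, usage_counts, threshold):
    """Step 6: drop voice cell banks whose frequency of use is below the
    threshold, freeing space on the user terminal."""
    return {name: bank for name, bank in banks.items()
            if usage_counts.get(name, 0) >= threshold}
```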
With the above method, the voice assistant can have voice cell banks that fit the user's characteristics, meeting personalized demands, further improving the intelligence of the voice assistant, and improving the user experience.
The present invention further proposes a device for training and sharing a voice cell bank, as shown in Fig. 3, for executing the above method. The device comprises:
an obtaining module, for obtaining the voice command input by the user to the voice assistant;
a marking module, for recognizing and parsing the voice command input by the user and marking one or more labels for the voice command.
Specifically, the marking module recognizes the user's voice command. When the voice assistant has the user's personal voice cell bank, recognition is first performed based on that cell bank and it is judged whether the voice command can be recognized; if so, the voice command is parsed and one or more labels are marked for it. If the voice assistant does not have the user's personal voice cell bank, speech recognition is performed based on a general-purpose library; if the command can be recognized, it is parsed and one or more labels are marked. If it cannot be recognized based on the general-purpose library, the voice assistant searches for the voice command or sends it over the network to a cloud server, which performs recognition and parsing. If recognition and parsing succeed, one or more labels are marked for the voice command; if they fail, the user marks one or more labels manually for the voice command.
Further, labels are marked in set form, e.g. {label 1, label 2}.
The statistics module performs statistical analysis on the user's voice commands.
Performing statistical analysis on the user's voice commands includes performing statistical analysis on the labels of the user's voice commands.
The statistical analysis of the labels includes counting the number of occurrences of each label and/or tag set, where a tag set is the set formed by multiple labels marked together on one voice command.
Preferably, the statistics module is triggered when a predetermined condition is satisfied. The predetermined condition is that a predetermined time interval has elapsed or the number of voice commands has reached a predetermined quantity; when either condition is met, the statistics module is triggered. The predetermined time interval and predetermined quantity are set by the user, set adaptively by the voice assistant, or configured by the intelligent interaction platform.
The creation/update module creates and/or updates voice cell banks.
Based on the above labels and/or tag sets, the creation/update module creates and/or updates for the voice assistant a voice cell bank with the label and/or tag set feature: when the voice assistant does not have a voice cell bank with the label and/or tag set feature, it creates one; otherwise it updates the corresponding voice cell bank.
Creating a voice cell bank comprises the following steps:
Step 41: judge whether a friend of the user has a voice cell bank with the label and/or tag set feature; if so, send a request to the friend for the voice cell bank; if not, execute step 42;
Step 42: send a query request to the intelligent interaction platform and judge whether it has a voice cell bank with the label and/or tag set feature; if so, obtain the voice cell bank from the platform; if not, execute step 43;
Step 43: group the user's voice commands according to each label and/or tag set, train on the grouped voice commands separately, and create a voice cell bank corresponding to each label and/or tag set feature.
Updating a voice cell bank means updating, based on the labeled user commands, the voice cell bank corresponding to the respective label and/or tag set feature.
Preferably, when a label and/or tag set satisfies any of the following conditions, a voice cell bank with that label and/or tag set feature is created and/or updated:
Condition (1): the number of occurrences of the i-th tag set Ci satisfies the formula

N_Ci ≥ n    (1)

Condition (2): when a label Li belongs to a tag set satisfying formula (1), the count of label Li minus the number of times it occurs within tag sets satisfying that condition is greater than or equal to m;
Condition (3): when a label Li does not belong to any tag set satisfying formula (1), the count of label Li is greater than or equal to k;
where N_Ci denotes the number of occurrences of the i-th tag set Ci; n, m, k are non-negative integers whose values are set by the user, set adaptively by the voice assistant, or configured by the intelligent interaction platform.
Further, the device also comprises a sharing module, for sharing the created and/or updated voice cell bank with the intelligent interaction platform and/or the user's friends.
The sharing module shares the created and/or updated voice cell bank with the intelligent interaction platform, which can improve the variety of voice cell banks on the platform. Furthermore, a voice cell bank can also be shared with friends whose voice commands have the labels and/or tag set features of that voice cell bank, so that a friend obtains the voice cell bank or uses it, together with the friend's own voice cell banks, for training and updating. The sharing may be performed by a user operation or by the voice assistant.
Further, the device also comprises a deletion module, for judging the frequency of use of each voice cell bank the voice assistant holds and deleting a voice cell bank when its frequency of use is below a threshold.
The present invention further provides a voice assistant comprising the above device.
The present invention further provides a terminal comprising the above voice assistant.
Specifically, the terminal device may be a computer, a tablet computer, a mobile phone, an intelligent assistant, a vehicle-mounted terminal, or the like.
The present invention also provides a kind of computer equipment, the computer equipment includes processor and memory, the storage The computer instruction that device storage can be executed by processor is realized as described above when processor executes above-mentioned computer instruction Method.
The present invention also provides a kind of computer readable storage mediums, store computer instruction, and the computer instruction is used for Realize method as described above.
Any combination of one or more computer-readable media may be used. A computer-readable medium may be a computer-readable signal medium or a computer-readable storage medium. A computer-readable storage medium may be, for example but not limited to, an electric, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. Computer-readable storage media include: an electrical connection having one or more wires, a portable computer diskette, a hard disk, random access memory (RAM), read-only memory (ROM), flash memory, erasable programmable read-only memory (EPROM), optical fiber, portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In this document, a computer-readable storage medium may be any tangible medium that contains or stores a program for use by, or in connection with, an instruction execution system, apparatus, or device.
Program code for carrying out operations of the present invention may be written in one or more programming languages, or a combination thereof.
The description above provides examples intended only to facilitate understanding of the present invention and is not intended to limit its scope. In a specific implementation, those skilled in the art may change, add, or remove components of the apparatus according to the actual situation, and may change, add, remove, or reorder steps of the method, provided that the functions achieved by the method are not affected.
Although embodiments of the present invention have been shown and described, those skilled in the art will understand that various changes, modifications, substitutions, and variations may be made to these embodiments without departing from the principle and spirit of the invention. The scope of the invention is defined by the claims and their equivalents; improvements introduced without creative work shall fall within the protection scope of the invention.

Claims (7)

1. A method for training and sharing a voice cell bank, characterized in that the method comprises the following steps:
Step 1: the voice assistant obtains a voice command input by the user;
Step 2: the voice command input by the user is recognized and parsed, and one or more labels are attached to the voice command;
Step 3: the user's voice commands are statistically analyzed;
Step 4: the voice assistant creates and/or updates a voice cell bank.
2. The method according to claim 1, characterized in that the method further comprises:
Step 5: sharing the created and/or updated voice cell bank with an intelligent interaction platform and/or the user's friends.
3. The method according to claim 1 or 2, characterized in that the method further comprises:
Step 6: monitoring the usage frequency of each voice cell bank held by the voice assistant, and deleting a voice cell bank when its usage frequency falls below a threshold.
4. The method according to any one of claims 1 to 3, characterized in that step 2 specifically comprises:
when the voice assistant has the user's personal voice cell bank, first performing recognition on the basis of that cell bank to judge whether the voice command can be recognized; if it can be recognized, parsing the voice command and attaching one or more labels to it;
when the voice assistant does not have the user's personal voice cell bank, performing speech recognition on the basis of a general-purpose library; if the voice command can be recognized, parsing it and attaching one or more labels; if the voice command cannot be recognized on the basis of the general-purpose library, the voice assistant searches over the network or sends the voice command to a cloud server for recognition and parsing; if recognition and parsing succeed, one or more labels are attached to the voice command; if recognition and parsing fail, the user labels the voice command manually with one or more labels.
5. The method according to any one of claims 1 to 4, characterized in that in step 3, the numbers of occurrences of the labels and/or tag sets of the voice commands are counted, where a tag set is the set formed by multiple labels attached simultaneously to one voice command.
6. The method according to any one of claims 1 to 5, characterized in that step 3 is executed when a predetermined condition is satisfied, the predetermined condition being that a predetermined time interval has elapsed or that the number of voice commands has reached a predetermined quantity; step 3 is executed when either condition is satisfied.
7. The method according to any one of claims 1 to 6, characterized in that creating a voice cell bank comprises the following steps:
Step 41: judging whether a friend of the user has a voice cell bank with the label and/or tag-set feature; if so, sending a request to the friend to obtain the voice cell bank; if not, executing step 42;
Step 42: sending a query to the intelligent interaction platform to judge whether it has a voice cell bank with the label and/or tag-set feature; if so, obtaining the voice cell bank from the intelligent interaction platform; if not, executing step 43;
Step 43: grouping the user's voice commands by label and/or tag set, training the grouped voice commands separately, and creating a voice cell bank corresponding to each label and/or tag-set feature;
and updating a voice cell bank comprises updating, on the basis of labeled user commands, the voice cell bank corresponding to the respective label and/or tag-set feature.
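The fallback chain of steps 41 through 43 can be sketched as follows. All interfaces here (`request_bank`, `query_bank`, the `train` callable) are hypothetical stand-ins introduced for illustration, not APIs disclosed by the invention.

```python
def obtain_cell_bank(feature, friends, platform, user_commands, train):
    """Sketch of steps 41-43: try the user's friends, then the
    intelligent interaction platform, then train locally from the
    user's own labeled commands.

    feature: set of labels (a label or tag-set feature).
    user_commands: list of (command_text, labels) pairs.
    """
    # Step 41: ask friends who hold a bank with this feature.
    for friend in friends:
        bank = friend.request_bank(feature)
        if bank is not None:
            return bank
    # Step 42: query the intelligent interaction platform.
    bank = platform.query_bank(feature)
    if bank is not None:
        return bank
    # Step 43: group the user's commands matching the feature and
    # train a new bank locally.
    group = [cmd for cmd, labels in user_commands if feature <= set(labels)]
    return train(feature, group)
```

When neither a friend nor the platform can supply the bank, only the commands whose labels cover the requested feature are passed to local training.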
CN201910706841.7A 2019-02-26 2019-08-01 Method and device for training and sharing voice cell bank Active CN110428816B (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910141638 2019-02-26
CN201910141638X 2019-02-26

Publications (2)

Publication Number Publication Date
CN110428816A true CN110428816A (en) 2019-11-08
CN110428816B CN110428816B (en) 2022-06-03

Family

ID=68412081

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910706841.7A Active CN110428816B (en) 2019-02-26 2019-08-01 Method and device for training and sharing voice cell bank

Country Status (1)

Country Link
CN (1) CN110428816B (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110853676A (en) * 2019-11-18 2020-02-28 广州国音智能科技有限公司 Audio comparison method, device and equipment
CN110930986A (en) * 2019-12-06 2020-03-27 北京明略软件系统有限公司 Voice processing method and device, electronic equipment and storage medium
CN111048088A (en) * 2019-12-26 2020-04-21 北京蓦然认知科技有限公司 Voice interaction method and device for multiple application programs

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105096940A (en) * 2015-06-30 2015-11-25 百度在线网络技术(北京)有限公司 Method and device for voice recognition
CN105810200A (en) * 2016-02-04 2016-07-27 深圳前海勇艺达机器人有限公司 Man-machine dialogue apparatus and method based on voiceprint identification
CN105957525A (en) * 2016-04-26 2016-09-21 珠海市魅族科技有限公司 Interactive method of a voice assistant and user equipment
CN106328139A (en) * 2016-09-14 2017-01-11 努比亚技术有限公司 Voice interaction method and voice interaction system
CN106462608A (en) * 2014-05-16 2017-02-22 微软技术许可有限责任公司 Knowledge source personalization to improve language models
US20180166067A1 (en) * 2016-12-14 2018-06-14 International Business Machines Corporation Using recurrent neural network for partitioning of audio data into segments that each correspond to a speech feature cluster identifier
CN109272995A (en) * 2018-09-26 2019-01-25 出门问问信息科技有限公司 Audio recognition method, device and electronic equipment





Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right
Effective date of registration: 20220112
Address after: 310024 floor 5, zone 2, building 3, Hangzhou cloud computing Industrial Park, Zhuantang street, Xihu District, Hangzhou City, Zhejiang Province
Applicant after: Hangzhou suddenly Cognitive Technology Co.,Ltd.
Address before: Room 401, gate 2, block a, Zhongguancun 768 Creative Industry Park, 5 Xueyuan Road, Haidian District, Beijing 100083
Applicant before: BEIJING MORAN COGNITIVE TECHNOLOGY Co.,Ltd.
GR01 Patent grant