CN109036406A - Voice information processing method, apparatus, device, and storage medium
- Publication number
- CN109036406A (application CN201810864520.5A)
- Authority
- CN
- China
- Prior art keywords
- information
- text information
- current speech
- user
- voice information
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
Abstract
Embodiments of the present invention disclose a voice information processing method, apparatus, device, and storage medium. The method includes: after a voice function is turned on, receiving current voice information input by a user; if the current voice information does not match the reference voice information stored in a voice library, converting the current voice information into text information and displaying it; obtaining an edit instruction for the text information, performing an editing operation on the text information according to the edit instruction, and taking the new text information obtained after the editing operation as target text information; and storing the target text information in the voice library in correspondence with the current voice information. This technical solution addresses the problem that, when an electrical appliance is controlled by voice, recognition of the different accents of different users is limited. It improves the user experience and also helps voice input methods for electrical appliances become widely adopted.
Description
Technical field
Embodiments of the present invention relate to the field of speech recognition, and in particular to a voice information processing method, apparatus, device, and storage medium.
Background
With the development of science and technology, making electrical appliances intelligent and user-friendly has become a matter of common concern, and intelligent, user-friendly appliances make operation far more convenient.
For the various electrical appliances currently on the market, such as televisions and set-top boxes, human-computer interaction between the appliance and the user is generally limited to key-based interaction; that is, the appliance is controlled through a conventional soft-keyboard input mode. Soft-keyboard input is currently a common and popular input mode on the market. However, it is cumbersome to use: for example, to enter Chinese characters the user has to type the pinyin of each character one by one. Users who are unfamiliar with pinyin or the Wubi (five-stroke) input method cannot use soft-keyboard input at all.
Another popular input mode is voice input. Although voice input is very convenient for users, users from different regions speak with different accents, and electrical appliances have difficulty recognizing these accents, which has kept voice input methods from becoming widespread.
Summary of the invention
Embodiments of the present invention provide a voice information processing method, apparatus, device, and storage medium, to address the problem that recognition of the different accents of different users is limited when an electrical appliance is controlled by voice.
In a first aspect, an embodiment of the present invention provides a voice information processing method, the method including:
after a voice function is turned on, receiving current voice information input by a user;
if the current voice information does not match the reference voice information stored in a voice library, converting the current voice information into text information and displaying it;
obtaining an edit instruction for the text information, performing an editing operation on the text information according to the edit instruction, and taking the new text information obtained after the editing operation as target text information; and
storing the target text information in the voice library in correspondence with the current voice information.
In a second aspect, an embodiment of the present invention further provides a voice information processing apparatus, the apparatus including:
a current voice information acquisition module, configured to receive current voice information input by a user after a voice function is turned on;
a first display module, configured to, if the current voice information does not match the reference voice information stored in a voice library, convert the current voice information into text information and display it;
a text information editing module, configured to obtain an edit instruction for the text information, perform an editing operation on the text information according to the edit instruction, and take the new text information obtained after the editing operation as target text information; and
a storage module, configured to store the target text information in the voice library in correspondence with the current voice information.
In a third aspect, an embodiment of the present invention further provides a device, the device including:
one or more processors; and
a storage apparatus for storing one or more programs,
where, when the one or more programs are executed by the one or more processors, the one or more processors implement the voice information processing method provided by any embodiment of the present invention.
In a fourth aspect, an embodiment of the present invention further provides a computer-readable storage medium storing a computer program that, when executed by a processor, implements the voice information processing method provided by any embodiment of the present invention.
In the embodiments of the present invention, after the voice function is turned on, the current voice information input by the user is received; if it is determined that the current voice information does not match the reference voice information stored in the voice library, the current voice information is converted into text information and displayed. If the user then sees from the displayed text information that it does not correspond to the voice information the user uttered, the user can edit the text information so that it matches that voice information. After obtaining the user's edit instruction for the text information, the electrical appliance performs the editing operation on the text information according to the edit instruction and takes the new text information obtained after the editing operation as the target text information. Because the target text information is stored in the voice library in correspondence with the current voice information, when the user utters the same voice information again, the corresponding text information can be found in the voice library; if that text information corresponds to a control instruction of the electrical appliance, the appliance can be controlled to execute the control operation corresponding to the text information. With this technical solution, the electrical appliance can recognize the different accents of different users and act on the recognition result, so that users from different regions with different accents can all control the appliance by voice. This improves the user experience and also helps voice input methods for electrical appliances become widely adopted.
Brief description of the drawings
Fig. 1 is a flowchart of a voice information processing method provided by Embodiment 1 of the present invention;
Fig. 2 is a flowchart of a voice information processing method provided by Embodiment 2 of the present invention;
Fig. 3 is a structural block diagram of a voice information processing apparatus provided by Embodiment 3 of the present invention;
Fig. 4 is a schematic structural diagram of a device provided by Embodiment 4 of the present invention.
Detailed description of the embodiments
The present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the present invention and do not limit it. It should also be noted that, for ease of description, the drawings show only the parts related to the present invention rather than the entire structure.
Embodiment 1
Fig. 1 is a flowchart of a voice information processing method provided by Embodiment 1 of the present invention. The method may be performed by a voice information processing apparatus, which may be implemented in software and/or hardware and may be integrated into an electrical appliance such as a television or an air conditioner, or into a mobile terminal such as a smartphone or a tablet computer. Referring to Fig. 1, the method of this embodiment specifically includes:
S110, after the voice function is turned on, receive the current voice information input by the user.
The voice input function has two states: active and inactive. When the user needs to communicate or interact by voice input, the user can turn on the voice input function by pressing the voice button on the remote control of the electrical appliance.
For example, an identifier can be assigned to the state of the voice input function: in the active state the flag bit of the identifier is set to 1, and in the inactive state it is set to 0. In this embodiment, whether the voice function is active can be determined by reading the value of this flag bit.
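The flag-based state check described above might be sketched as follows. This is a minimal illustration, not the patented implementation: the names `VoiceInputState`, `VoiceFunction`, and `on_voice_button_pressed` are hypothetical, and only the flag values 1 (active) and 0 (inactive) come from the text.

```python
from enum import IntEnum


class VoiceInputState(IntEnum):
    INACTIVE = 0  # flag value for the inactive state
    ACTIVE = 1    # flag value for the active state


class VoiceFunction:
    """Holds the flag bit that records whether voice input is active."""

    def __init__(self):
        self.flag = VoiceInputState.INACTIVE

    def on_voice_button_pressed(self):
        # Hypothetical handler for the voice button on the remote control.
        self.flag = VoiceInputState.ACTIVE

    def is_active(self):
        # The state is determined by reading the value of the flag bit.
        return self.flag == VoiceInputState.ACTIVE
```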
For example, detecting that the voice function has been turned on indicates that the user intends to control the electrical appliance by voice. The appliance then activates the voice input panel to receive the current voice information input by the user.
S120, if the current voice information does not match the reference voice information stored in the voice library, convert the current voice information into text information and display it.
For example, the user inputs voice information mainly to make the electrical appliance perform an action, such as switching channels or adjusting the volume, in place of manual operation, which improves the user experience. When voice information is used to control the appliance, different control instructions are generally defined for different voice information, and the appliance performs the action corresponding to each control instruction.
In general, the electrical appliance recognizes the user's voice information as Mandarin, and the voice control instructions available on the appliance generally correspond to Mandarin. If the voice information received is not Mandarin speech but a regional dialect spoken where the user lives, the appliance cannot carry out the corresponding control. Therefore, to ensure that different voice information is recognized correctly, the appliance can display the recognition result as text for the user to confirm. In this embodiment there is also a preset correspondence between text information and the control instructions of the appliance: once the user has confirmed that the text information is correct and has sent a confirmation instruction, the appliance can execute the control action corresponding to that text information after receiving the confirmation instruction.
For example, in this embodiment, techniques such as speech recognition, semantic parsing, and speech synthesis can be used to convert the voice information input by the user into text information automatically. The text conversion serves the following purposes. From the converted text, the user can judge whether the electrical appliance has recognized the current voice information correctly, that is, whether what the appliance recognized is what the user originally meant to say. In addition, if the user needs to add to or modify something after inputting the voice information, the user can do so promptly, which avoids sending incomplete voice information.
For example, if the user finds that the displayed text information does not match what the user meant to say, the user can modify the text information so that it corresponds to the voice information uttered.
It should be noted that the voice library in this embodiment is mainly used to store the user's voice information and the corresponding text information, where the text information is text that matches the voice information uttered by the user. For example, if the user finds that the text information shown on the current interface of the electrical appliance does not match what the user meant to say, the text information needs to be modified; the text information stored in the voice library is the text that the user has modified and that matches the voice information.
For example, in this embodiment, each time the electrical appliance receives current voice information, it matches that information against the voice information stored in the preset voice library. If the current voice information is something the current user has input before, the preset voice library already holds that voice information and the corresponding text information. Even if the voice information is not in the Mandarin that the appliance recognizes by default, the appliance can find it in the preset voice library, look up the corresponding text information, and display it. Once the user has confirmed that the text information is correct, that is, once the user's confirmation instruction is received, the appliance can execute the control operation corresponding to the text information.
Preferably, before matching the current voice information against the reference voice information stored in the voice library, the electrical appliance first preprocesses the voice information, for example with VAD (Voice Activity Detection) and echo cancellation. VAD cuts off the silence at the beginning and end of the voice signal to reduce interference with the subsequent speech recognition. After preprocessing, features can be extracted from the voice signal with an algorithm such as linear prediction cepstral coefficients (LPCC) or Mel-frequency cepstral coefficients (MFCC), and acoustic-model and speech-model techniques are then used to match the speech segments against the voice information stored in the voice library.
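As a rough illustration of the silence-trimming step, the sketch below removes leading and trailing low-energy frames from a list of samples. It assumes a simple energy threshold, which is only one possible way to perform voice activity detection; the function name, frame length, and threshold value are hypothetical, and echo cancellation and LPCC/MFCC feature extraction are not shown.

```python
def trim_silence(samples, frame_len=4, energy_threshold=0.01):
    """Drop leading and trailing frames whose mean energy is below the threshold.

    Assumption: a frame counts as speech when the mean of its squared
    samples reaches `energy_threshold`. Silence between speech frames is kept.
    """
    frames = [samples[i:i + frame_len] for i in range(0, len(samples), frame_len)]

    def is_speech(frame):
        return sum(x * x for x in frame) / len(frame) >= energy_threshold

    voiced = [i for i, frame in enumerate(frames) if is_speech(frame)]
    if not voiced:
        return []  # nothing but silence
    first, last = voiced[0], voiced[-1]
    return [x for frame in frames[first:last + 1] for x in frame]
```

Only the head and tail are cut, so pauses inside the utterance survive for the later matching stage.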
S130, obtain an edit instruction for the text information, perform an editing operation on the text information according to the edit instruction, and take the new text information obtained after the editing operation as the target text information.
For example, the preset voice library of this embodiment can store the voice information the user has input together with the corresponding text information. Whenever the electrical appliance detects voice information, it can recognize the received voice information, convert it into text information, and display it for the user to confirm and correct. Receiving an edit instruction from the user indicates that the appliance's recognition result does not match what the user meant to say. In that case, the editing operation is performed on the text according to the edit instruction, the new text information obtained after the editing operation is taken as the target text information, and the user's current voice information is stored in the preset voice library together with the corresponding target text information.
S140, store the target text information in the voice library in correspondence with the current voice information.
For example, in this embodiment the target text information is stored in the voice library in correspondence with the current voice information. If the electrical appliance later receives voice information identical to the current voice information, it can recognize that voice information accurately on the basis of the voice library, find the corresponding text information in the voice library, and display it. This addresses the difficulty electrical appliances have in recognizing different accents and helps promote the voice input function of electrical appliances.
For example, if the voice information received by the electrical appliance is being input by the user for the first time, it does not match any voice information stored in the voice library. The voice information can then be converted into text in the manner provided by this embodiment of the invention; if the user edits the converted text information, the edited target text information and the corresponding voice information are stored in the voice library in correspondence with each other.
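The store-and-reuse behaviour of steps S120 to S140 can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: `voice_key` is a hashable stand-in for the voice information and is matched exactly (the embodiments describe similarity-based matching), and `recognize` and `edit` are hypothetical callbacks standing in for speech-to-text conversion and the user's edit instruction. `edit` is assumed to return `None` when the user makes no edit.

```python
class VoiceLibrary:
    """Maps a stand-in key for voice information to its target text."""

    def __init__(self):
        self._entries = {}

    def find(self, voice_key):
        return self._entries.get(voice_key)

    def store(self, voice_key, target_text):
        self._entries[voice_key] = target_text


def process_voice(library, voice_key, recognize, edit):
    """Return the text to display for `voice_key`, learning from user edits."""
    stored = library.find(voice_key)
    if stored is not None:
        return stored              # seen before: reuse the stored text
    text = recognize(voice_key)    # no match: convert the voice to text
    edited = edit(text)            # the user may correct the displayed text
    if edited is None:
        return text                # no edit instruction was received
    library.store(voice_key, edited)
    return edited
```

Once a correction has been stored, a later utterance with the same key is answered from the library without going through recognition again.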
In the technical solution of this embodiment, the electrical appliance matches the received current voice information against the reference voice information stored in the voice library. If the two do not match, the current voice information is converted into text information and displayed, and the target text information obtained after the editing operation is stored in the voice library in correspondence with the current voice information. As a result, when the appliance receives that voice information again, it can recognize it and perform the corresponding action even if the voice information carries the user's accent. With this technical solution, users from different regions with different accents can control the electrical appliance by voice, which improves the user experience and also helps voice input methods for electrical appliances become widely adopted.
Embodiment 2
Fig. 2 is a flowchart of a voice information processing method provided by Embodiment 2 of the present invention. This embodiment refines the foregoing embodiment; terms that are the same as or correspond to those of the foregoing embodiment are not explained again here. Referring to Fig. 2, the method provided by this embodiment includes:
S210, after the voice function is turned on, receive the current voice information input by the user.
For example, because voice input offers a better user experience than manual operation, the electrical appliance can recommend voice input first in order to improve the user experience. However, if no current voice information is received from the user within a set time after the voice function is turned on, the current voice input interface is switched to a text input interface so that the user can enter text.
The set time may be a value configured before the electrical appliance leaves the factory, such as 30 seconds, or a value that the user sets as needed.
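The timeout-based fallback to text input might look like the sketch below. It assumes that captured audio arrives on a `queue.Queue`; the function name and the returned mode labels are hypothetical, and the 30-second default simply mirrors the example value given in the text.

```python
import queue


def wait_for_voice(voice_queue, timeout_seconds=30.0):
    """Wait for voice input; fall back to text input if none arrives in time.

    Returns ("voice", audio) when input is received within the set time,
    otherwise ("text_input", None) to signal the switch of interface.
    """
    try:
        audio = voice_queue.get(timeout=timeout_seconds)
    except queue.Empty:
        return ("text_input", None)
    return ("voice", audio)
```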
S220, determine whether the current voice information matches the reference voice information stored in the voice library; if so, perform step S230; otherwise, perform step S250.
For example, determining whether the current voice information matches the reference voice information stored in the voice library may proceed as follows:
The voice information is preprocessed with a preset speech recognition algorithm to obtain multiple speech segments; on the basis of a preset acoustic model, the similarity between the speech segments and the voice information stored in the voice library is computed; and if the similarity reaches a set threshold, the current voice information is determined to match the voice information stored in the voice library.
The preset speech recognition algorithm comprises algorithms such as VAD, echo cancellation, and speech segmentation, which together yield the multiple speech segments. The preset acoustic model may be a hidden Markov model, which is used to extract acoustic features from the speech segments and to compare those features for similarity with the voice information stored in the voice library. The set threshold is an empirical value, preferably 95%.
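The threshold test can be illustrated with a small sketch. Note that it swaps in cosine similarity between fixed-length feature vectors as a stand-in for the hidden-Markov-model comparison named in the text, purely to show how a 95% threshold gates the match; the function names and the `(features, text)` layout of the library are assumptions.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length feature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def find_match(features, library, threshold=0.95):
    """Return the text of the most similar stored entry, or None if no entry
    reaches the threshold. `library` is a list of (reference_features, text) pairs."""
    best_text, best_score = None, threshold
    for ref_features, text in library:
        score = cosine_similarity(features, ref_features)
        if score >= best_score:
            best_text, best_score = text, score
    return best_text
```

A `None` result corresponds to the "otherwise" branch of S220, where the voice information is converted to text and offered for editing.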
S230, display the text information corresponding to the current voice information, then proceed to step S240.
S240, if a confirmation instruction is received from the user, control the current device to execute the control operation corresponding to the text information.
Optionally, the text information corresponding to the current voice information may be displayed as follows: the voice library is queried for the text information corresponding to the reference voice information that matches the current voice information, and that text information is displayed. Alternatively, if the current voice information is Mandarin speech, it can be converted directly into text information and displayed. In this embodiment, the text information is displayed so that the user can confirm that the electrical appliance has recognized the voice information accurately; the appliance executes the control operation corresponding to the text information only after receiving the user's confirmation instruction.
For example, the confirmation instruction may be issued by the user through the remote control, for instance by pressing the confirm key on the remote control. Alternatively, it may be voice information that the electrical appliance recognizes as containing a confirmation mark, that is, the user utters voice information containing a confirmation mark such as "OK" or "confirm".
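The two kinds of confirmation instruction, and the rule that a control operation runs only after confirmation, might be sketched as follows. The event tuples, key names, command table, and the exact set of confirmation marks are all hypothetical; the text only states that confirmation may come from the remote control or from voice information containing a confirmation mark.

```python
CONFIRM_MARKS = ("ok", "confirm")  # assumed confirmation marks


def is_confirmation(event):
    """`event` is a (kind, payload) pair, e.g. ("remote_key", "confirm")."""
    kind, payload = event
    if kind == "remote_key":
        return payload == "confirm"
    if kind == "voice_text":
        # Compare whole words so that "book" does not count as "ok".
        words = payload.lower().split()
        return any(mark in words for mark in CONFIRM_MARKS)
    return False


def control_operation(text, event, commands):
    """Return the control operation mapped to `text`, but only after confirmation."""
    if is_confirmation(event):
        return commands.get(text)
    return None
```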
S250, convert the current voice information into text information and display it, then proceed to step S260.
S260, obtain an edit instruction for the text information, perform an editing operation on the text information according to the edit instruction, and take the new text information obtained after the editing operation as the target text information.
For example, the text input method provided in this embodiment has an intelligent memory function: the electrical appliance stores vocabulary according to how frequently the user uses it. When entering text, the user only needs to input the first character. If the appliance detects that this first character matches the first character of several locally stored target words, it displays those target words in descending order of usage frequency. The target words are words whose usage frequency reaches a preset frequency.
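The first-character lookup can be sketched as below. The function name, the dictionary of usage counts, and the `min_count` parameter (standing in for the preset frequency) are assumptions made for illustration.

```python
def suggest(first_char, usage_counts, min_count=2):
    """List stored words starting with `first_char`, most frequently used first.

    Only words whose usage count reaches `min_count` qualify as target words.
    """
    candidates = [
        (word, count)
        for word, count in usage_counts.items()
        if word.startswith(first_char) and count >= min_count
    ]
    candidates.sort(key=lambda item: item[1], reverse=True)
    return [word for word, _ in candidates]
```

For instance, with counts `{"channel": 9, "change": 4, "cartoon": 1}`, typing `c` would offer "channel" before "change" and omit "cartoon", which falls below the assumed minimum count.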
Further, because local storage space is limited, the electrical appliance automatically removes words or phrases with a low usage frequency before the amount of stored data reaches a preset maximum storage capacity. A preferred removal rule is as follows: the appliance sorts the locally stored words and phrases by usage frequency and removes those ranked last first; the words or phrases removed each time account for 20 percent of the total vocabulary capacity.
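A minimal sketch of this removal rule follows. It assumes that the 20 percent is measured in number of entries rather than in bytes, which the text does not specify; the function name and the in-memory dictionary are likewise hypothetical.

```python
import math


def evict_least_used(usage_counts, fraction=0.2):
    """Remove the least frequently used entries in place and return them.

    Assumption: `fraction` of the entries (rounded down) is removed per pass.
    """
    n_remove = math.floor(len(usage_counts) * fraction)
    by_use = sorted(usage_counts, key=usage_counts.get)  # ascending frequency
    removed = by_use[:n_remove]
    for word in removed:
        del usage_counts[word]
    return removed
```

In practice the caller would invoke this shortly before the store reaches its preset maximum capacity, as the paragraph above describes.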
S270, store the target text information in the voice library in correspondence with the current voice information.
Building on the foregoing embodiment, this embodiment combines voice input with an input method that has an intelligent memory function. When speech recognition goes wrong, the electrical appliance records the voice information it recognized and the correct target text information as modified by the user, and stores the correct text information in the voice library in correspondence with the voice information. When the user inputs that voice information again, the appliance can automatically recognize it on the basis of the content stored in the voice library and, after receiving the user's confirmation instruction, control the current electrical appliance to execute the control operation corresponding to the target text information.
Embodiment 3
Fig. 3 is a structural block diagram of a voice information processing apparatus provided by Embodiment 3 of the present invention. As shown in Fig. 3, the apparatus includes a current voice information acquisition module 310, a first display module 320, a text information editing module 330, and a storage module 340.
The current voice information acquisition module 310 is configured to receive the current voice information input by the user after the voice function is turned on;
the first display module 320 is configured to, if the current voice information does not match the reference voice information stored in the voice library, convert the current voice information into text information and display it;
the text information editing module 330 is configured to obtain an edit instruction for the text information, perform an editing operation on the text information according to the edit instruction, and take the new text information obtained after the editing operation as the target text information; and
the storage module 340 is configured to store the target text information in the voice library in correspondence with the current voice information.
In the technical solution of this embodiment, the electrical appliance matches the received current voice information against the reference voice information stored in the voice library. If the two do not match, the current voice information is converted into text information and displayed, and the target text information obtained after the editing operation is stored in the voice library in correspondence with the current voice information. As a result, when the appliance receives that voice information again, it can recognize it and perform the corresponding action even if the voice information carries the user's accent. With this technical solution, users from different regions with different accents can control the electrical appliance by voice, which improves the user experience and also helps voice input methods for electrical appliances become widely adopted.
On the basis of the foregoing embodiment, the apparatus further includes:
A second display module, configured to display the text information corresponding to the current voice information if the current voice information matches the reference voice information stored in the voice library;
A control module, configured to, if a confirmation instruction is received from the user, control the current device to execute the control operation corresponding to the text information.
On the basis of the foregoing embodiment, the apparatus further includes:
An interface switching module, configured to, after the voice function is turned on, switch the current voice input interface to a text input interface if no current voice information is received from the user within a set time, so that the user can enter text.
On the basis of the foregoing embodiment, the second display module is specifically configured to:
Preprocess the voice information with a preset speech recognition algorithm to obtain multiple speech segments;
On the basis of a preset acoustic model, compute the similarity between the multiple speech segments and the reference voice information stored in the voice library;
If the similarity reaches a set threshold, determine that the current voice information matches the reference voice information stored in the voice library;
Display the text information corresponding to the current voice information.
On the basis of the foregoing embodiment, the confirmation instruction is:
A confirmation instruction issued by the user through the remote control; or
Voice information containing a confirmation mark.
On the basis of the foregoing embodiment, the apparatus further includes:
A first-character identification module, configured to identify the first character that the user inputs during text input;
A vocabulary display module, configured to, if the first character matches the first character of multiple locally stored target words, display the multiple target words in descending order of usage frequency, where the multiple target words are words whose usage frequency reaches a preset frequency.
The voice information processing apparatus provided by this embodiment of the present invention can perform the voice information processing method provided by any embodiment of the present invention, and has the functional modules and beneficial effects corresponding to that method. For technical details not described in this embodiment, refer to the voice information processing method provided by any embodiment of the present invention.
Embodiment 4
Fig. 4 is a schematic structural diagram of a device provided by Embodiment 4 of the present invention. Fig. 4 shows a block diagram of an exemplary device 12 suitable for implementing embodiments of the present invention. The device 12 shown in Fig. 4 is only an example and should not impose any limitation on the functions or scope of use of the embodiments of the present invention.
As shown in Fig. 4, the device 12 takes the form of a general-purpose computing device. The components of the device 12 may include, but are not limited to, one or more processors or processing units 16, a system memory 28, and a bus 18 connecting the different system components (including the system memory 28 and the processing unit 16).
The bus 18 represents one or more of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus architectures. By way of example, these architectures include, but are not limited to, the Industry Standard Architecture (ISA) bus, the Micro Channel Architecture (MCA) bus, the Enhanced ISA bus, the Video Electronics Standards Association (VESA) local bus, and the Peripheral Component Interconnect (PCI) bus.
The device 12 typically includes a variety of computer system readable media. These media can be any available media that can be accessed by the device 12, including volatile and non-volatile media, and removable and non-removable media.
The system memory 28 may include computer system readable media in the form of volatile memory, such as random access memory (RAM) 30 and/or cache memory 32. The device 12 may further include other removable/non-removable, volatile/non-volatile computer system storage media. By way of example only, a storage system 34 can be used to read from and write to a non-removable, non-volatile magnetic medium (not shown in Fig. 4, commonly called a "hard disk drive"). Although not shown in Fig. 4, a magnetic disk drive for reading from and writing to a removable non-volatile magnetic disk (such as a "floppy disk") and an optical disk drive for reading from and writing to a removable non-volatile optical disk (such as a CD-ROM, DVD-ROM, or other optical media) can be provided. In these cases, each drive can be connected to the bus 18 through one or more data media interfaces. The memory 28 may include at least one program product having a set of (for example, at least one) program modules configured to perform the functions of the embodiments of the present invention.
A program/utility 40 having a set of (at least one) program modules 42 may be stored in, for example, the memory 28. Such program modules 42 include, but are not limited to, an operating system, one or more application programs, other program modules, and program data; each of these examples, or some combination of them, may include an implementation of a network environment. The program modules 42 generally perform the functions and/or methods of the embodiments described in the present invention.
The device 12 may also communicate with one or more external devices 14 (such as a keyboard, a pointing device, a display 24, etc.), with one or more devices that enable a user to interact with the device 12, and/or with any device (such as a network card, a modem, etc.) that enables the device 12 to communicate with one or more other computing devices. Such communication can take place through an input/output (I/O) interface 22. In addition, the device 12 can communicate with one or more networks (such as a local area network (LAN), a wide area network (WAN), and/or a public network such as the Internet) through a network adapter 20. As shown, the network adapter 20 communicates with the other modules of the device 12 through the bus 18. It should be understood that, although not shown in the figure, other hardware and/or software modules can be used in conjunction with the device 12, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems.
The processing unit 16 runs the programs stored in the system memory 28 to execute various functional applications and data processing, for example to implement the voice information processing method provided by any embodiment of the present invention, the method comprising:
After the voice function is turned on, receiving current voice information input by a user;
If the current voice information does not match the reference voice information stored in a voice library, converting the current voice information into text information and displaying it;
Obtaining an edit instruction for the text information, performing an editing operation on the text information according to the edit instruction, and taking the new text information obtained after the editing operation as target text information;
Storing the target text information in the voice library in correspondence with the current voice information.
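The four steps above can be sketched roughly as follows. This is only an illustration under stated assumptions: the voice library is modeled as an in-memory dictionary keyed by the raw audio bytes (so "matching" is reduced to exact lookup), and the recognizer and the user's edit are passed in as callables. The names `VoiceLibrary`, `process_voice`, `recognize`, and `get_edit` are hypothetical and do not come from the patent.

```python
class VoiceLibrary:
    """Hypothetical in-memory voice library: voice information -> text."""

    def __init__(self):
        self._entries = {}

    def lookup(self, voice):
        # Simplification: exact-key lookup stands in for acoustic matching.
        return self._entries.get(voice)

    def store(self, voice, text):
        self._entries[voice] = text


def process_voice(voice, library, recognize, get_edit):
    """Return the text for `voice`, learning a corrected entry when unmatched."""
    known_text = library.lookup(voice)
    if known_text is not None:
        # Matched branch: the stored text is used directly.
        return known_text

    draft_text = recognize(voice)        # convert voice to text for display
    target_text = get_edit(draft_text)   # apply the user's edit instruction
    library.store(voice, target_text)    # store text together with the voice
    return target_text


library = VoiceLibrary()
accented_audio = b"\x01\x02\x03"  # placeholder bytes for a recorded utterance

# First utterance: recognition is wrong, the user corrects it, the pair is stored.
first = process_voice(
    accented_audio,
    library,
    recognize=lambda voice: "turn on the tee vee",
    get_edit=lambda draft: "turn on the TV",
)

# Second utterance of the same audio: answered from the library, no editing needed.
second = process_voice(
    accented_audio,
    library,
    recognize=lambda voice: "unused",
    get_edit=lambda draft: draft,
)
print(first, "|", second)
```

Passing the recognizer and the editor as parameters keeps the sketch independent of any particular speech engine or user interface.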
Embodiment five
Embodiment five of the present invention further provides a computer-readable storage medium on which a computer program is stored. When the program is executed by a processor, it implements the voice information processing method provided by any embodiment of the present invention, the method comprising:
After the voice function is turned on, receiving current voice information input by a user;
If the current voice information does not match the reference voice information stored in a voice library, converting the current voice information into text information and displaying it;
Obtaining an edit instruction for the text information, performing an editing operation on the text information according to the edit instruction, and taking the new text information obtained after the editing operation as target text information;
Storing the target text information in the voice library in correspondence with the current voice information.
The computer storage medium of the embodiments of the present invention may be any combination of one or more computer-readable media. A computer-readable medium may be a computer-readable signal medium or a computer-readable storage medium. A computer-readable storage medium may be, for example but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples (a non-exhaustive list) of computer-readable storage media include: an electrical connection having one or more wires, a portable computer diskette, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In this document, a computer-readable storage medium may be any tangible medium that contains or stores a program for use by or in connection with an instruction execution system, apparatus, or device.
A computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, carrying computer-readable program code. Such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium that can send, propagate, or transmit a program for use by or in connection with an instruction execution system, apparatus, or device.
The program code contained on a computer-readable medium may be transmitted over any suitable medium, including but not limited to wireless, wire, optical cable, RF, etc., or any suitable combination of the above.
Computer program code for carrying out the operations of the present invention may be written in one or more programming languages or a combination thereof, including object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. Where a remote computer is involved, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).
Note that the above are only preferred embodiments of the present invention and the technical principles applied. Those skilled in the art will understand that the present invention is not limited to the specific embodiments described herein, and that various obvious changes, readjustments, and substitutions can be made by those skilled in the art without departing from the protection scope of the present invention. Therefore, although the present invention has been described in further detail through the above embodiments, it is not limited to them; it may also include other equivalent embodiments without departing from the inventive concept, and its scope is determined by the scope of the appended claims.
Claims (10)
1. A voice information processing method, characterized by comprising:
After the voice function is turned on, receiving current voice information input by a user;
If the current voice information does not match the reference voice information stored in a voice library, converting the current voice information into text information and displaying it;
Obtaining an edit instruction for the text information, performing an editing operation on the text information according to the edit instruction, and taking the new text information obtained after the editing operation as target text information;
Storing the target text information in the voice library in correspondence with the current voice information.
2. The method according to claim 1, characterized by further comprising:
If the current voice information matches the reference voice information stored in the voice library, displaying the text information corresponding to the current voice information;
If a confirmation instruction from the user is received, controlling the current device to perform the control operation corresponding to the text information.
3. The method according to claim 1, characterized by further comprising:
If no current voice information input by the user is received within a set time after the voice function is turned on, switching the current voice input interface to a text input interface so that the user can perform text input.
4. The method according to claim 2, characterized in that the current voice information matching the reference voice information stored in the voice library comprises:
Preprocessing the voice information based on a preset speech recognition algorithm to obtain multiple voice segments;
Comparing, based on a preset acoustic model, the similarity between the multiple voice segments and the reference voice information stored in the voice library;
If the similarity reaches a set threshold, determining that the current voice information matches the reference voice information stored in the voice library.
5. The method according to claim 2, characterized in that the confirmation instruction is:
A confirmation instruction issued by the user through a remote control; or,
Voice information that includes a confirmation identifier.
6. The method according to claim 3, characterized by further comprising:
When the user performs text input, identifying the initial character input by the user;
If the initial character matches the initial characters of multiple locally stored target words, displaying the multiple target words in descending order of usage frequency; wherein the multiple target words are words whose usage frequency reaches a preset frequency.
7. A voice information processing apparatus, characterized by comprising:
A current voice information obtaining module, configured to receive current voice information input by a user after the voice function is turned on;
A first display module, configured to convert the current voice information into text information and display it if the current voice information does not match the reference voice information stored in a voice library;
A text information editing module, configured to obtain an edit instruction for the text information, perform an editing operation on the text information according to the edit instruction, and take the new text information obtained after the editing operation as target text information;
A storage module, configured to store the target text information in the voice library in correspondence with the current voice information.
8. The apparatus according to claim 7, characterized by further comprising:
A second display module, configured to display the text information corresponding to the current voice information if the current voice information matches the reference voice information stored in the voice library;
A control module, configured to control the current device to perform the control operation corresponding to the text information if a confirmation instruction from the user is received.
9. A device, characterized in that the device comprises:
One or more processors;
A storage apparatus, configured to store one or more programs,
When the one or more programs are executed by the one or more processors, the one or more processors implement the voice information processing method according to any one of claims 1 to 6.
10. A computer-readable storage medium on which a computer program is stored, characterized in that the program, when executed by a processor, implements the voice information processing method according to any one of claims 1 to 6.
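As an illustration of the matching recited in claim 4, the following minimal sketch compares per-segment feature vectors against stored references and accepts a match only when the similarity reaches a threshold. The patent does not specify the features, the acoustic model, or the similarity measure; the use of cosine similarity averaged over segments, the threshold of 0.9, and the names `cosine_similarity`, `utterance_similarity`, and `find_match` are all assumptions made for this example.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length feature vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def utterance_similarity(segments, reference):
    """Average segment-by-segment similarity; 0.0 if the segment counts differ."""
    if not segments or len(segments) != len(reference):
        return 0.0
    total = sum(cosine_similarity(s, r) for s, r in zip(segments, reference))
    return total / len(segments)


def find_match(segments, library, threshold=0.9):
    """Return the text of the best reference reaching the threshold, else None.

    library maps text -> list of reference feature vectors (one per segment).
    """
    best_text, best_score = None, threshold
    for text, reference in library.items():
        score = utterance_similarity(segments, reference)
        if score >= best_score:
            best_text, best_score = text, score
    return best_text


# Hypothetical two-segment references with two-dimensional features.
library = {
    "volume up": [[1.0, 0.0], [0.0, 1.0]],
    "power off": [[0.0, 1.0], [1.0, 0.0]],
}

print(find_match([[0.9, 0.1], [0.1, 0.9]], library))  # close to "volume up"
print(find_match([[1.0, 1.0], [1.0, 1.0]], library))  # ambiguous, below threshold
```

When `find_match` returns None, the flow of claim 1 applies: the voice information is converted to text, edited by the user, and stored.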
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810864520.5A CN109036406A (en) | 2018-08-01 | 2018-08-01 | A kind of processing method of voice messaging, device, equipment and storage medium |
PCT/CN2019/082706 WO2020024620A1 (en) | 2018-08-01 | 2019-04-15 | Voice information processing method and device, apparatus, and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810864520.5A CN109036406A (en) | 2018-08-01 | 2018-08-01 | A kind of processing method of voice messaging, device, equipment and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109036406A true CN109036406A (en) | 2018-12-18 |
Family ID: 64648341
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810864520.5A Pending CN109036406A (en) | 2018-08-01 | 2018-08-01 | A kind of processing method of voice messaging, device, equipment and storage medium |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN109036406A (en) |
WO (1) | WO2020024620A1 (en) |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109215638A (en) * | 2018-10-19 | 2019-01-15 | 珠海格力电器股份有限公司 | A kind of phonetic study method, apparatus, speech ciphering equipment and storage medium |
CN109584875A (en) * | 2018-12-24 | 2019-04-05 | 珠海格力电器股份有限公司 | A kind of speech ciphering equipment control method, device, storage medium and speech ciphering equipment |
WO2020024620A1 (en) * | 2018-08-01 | 2020-02-06 | 深圳创维-Rgb电子有限公司 | Voice information processing method and device, apparatus, and storage medium |
CN111261155A (en) * | 2019-12-27 | 2020-06-09 | 北京得意音通技术有限责任公司 | Speech processing method, computer-readable storage medium, computer program, and electronic device |
CN112927693A (en) * | 2021-03-03 | 2021-06-08 | 立讯电子科技(昆山)有限公司 | Control method, device and system based on voice control |
CN113674743A (en) * | 2021-08-20 | 2021-11-19 | 云知声(上海)智能科技有限公司 | ASR result replacement processing device and processing method used in natural language processing |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103187056A (en) * | 2011-12-28 | 2013-07-03 | 上海博泰悦臻电子设备制造有限公司 | Voice processing system based on vehicle-mounted application |
WO2014024751A1 (en) * | 2012-08-10 | 2014-02-13 | エイディシーテクノロジー株式会社 | Voice response system |
CN104346127A (en) * | 2013-08-02 | 2015-02-11 | 腾讯科技(深圳)有限公司 | Realization method, realization device and terminal for voice input |
CN105408952A (en) * | 2013-02-21 | 2016-03-16 | 谷歌技术控股有限责任公司 | Recognizing accented speech |
CN106384593A (en) * | 2016-09-05 | 2017-02-08 | 北京金山软件有限公司 | Voice information conversion and information generation method and device |
CN106790942A (en) * | 2016-12-28 | 2017-05-31 | 努比亚技术有限公司 | Voice messaging intelligence store method and device |
Family Cites Families (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101276514A (en) * | 2008-03-31 | 2008-10-01 | 深圳创维-Rgb电子有限公司 | Method, system and apparatus for controlling electronic equipment |
DK201670539A1 (en) * | 2016-03-14 | 2017-10-02 | Apple Inc | Dictation that allows editing |
CN107146607B (en) * | 2017-04-10 | 2021-06-18 | 北京猎户星空科技有限公司 | Method, device and system for correcting interaction information of intelligent equipment |
CN108154878A (en) * | 2017-12-12 | 2018-06-12 | 北京小米移动软件有限公司 | Control the method and device of monitoring device |
CN108806688A (en) * | 2018-07-16 | 2018-11-13 | 深圳Tcl数字技术有限公司 | Sound control method, smart television, system and the storage medium of smart television |
CN109036406A (en) * | 2018-08-01 | 2018-12-18 | 深圳创维-Rgb电子有限公司 | A kind of processing method of voice messaging, device, equipment and storage medium |
- 2018-08-01 CN CN201810864520.5A patent/CN109036406A/en active Pending
- 2019-04-15 WO PCT/CN2019/082706 patent/WO2020024620A1/en active Application Filing
Non-Patent Citations (4)
Title |
---|
(英)柯林森: "《航空电子系统导论 原书第3版》", 31 October 2013, 北京:国防工业出版社 * |
九天科技: "《中老年人学电脑与上网傻瓜书(Windows 10+Office 2016版)》", 31 January 2018, 北京:中国铁道出版社 * |
创客诚品: "《五笔打字新手速成》", 31 July 2017, 北京希望电子出版社 * |
周学君等: "《计算机基础教程》", 30 September 2005, 武汉:华中科技大学出版社 * |
Also Published As
Publication number | Publication date |
---|---|
WO2020024620A1 (en) | 2020-02-06 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | PB01 | Publication | |
 | SE01 | Entry into force of request for substantive examination | |
 | RJ01 | Rejection of invention patent application after publication | Application publication date: 20181218 |