CN109036406A - Voice information processing method, apparatus, device, and storage medium - Google Patents

Voice information processing method, apparatus, device, and storage medium

Publication number
CN109036406A
Authority
CN
China
Prior art keywords
information
text information
current speech
user
voice messaging
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201810864520.5A
Other languages
Chinese (zh)
Inventor
干晓萍
范思越
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Skyworth RGB Electronics Co Ltd
Original Assignee
Shenzhen Skyworth RGB Electronics Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Skyworth RGB Electronics Co Ltd
Priority to CN201810864520.5A (patent CN109036406A)
Publication of CN109036406A
Priority to PCT/CN2019/082706 (patent WO2020024620A1)
Legal status: Pending

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue

Abstract

Embodiments of the invention disclose a voice information processing method, apparatus, device, and storage medium. The method comprises: receiving the current voice information input by a user after the voice function is turned on; if the current voice information does not match the reference voice information stored in a voice library, converting the current voice information into text information and displaying it; obtaining an edit instruction for the text information, performing an edit operation on the text information according to the edit instruction, and taking the new text information obtained after the edit operation as target text information; and storing the target text information in the voice library in correspondence with the current voice information. This technical solution addresses the limited recognition of different users' accents when an electrical appliance is controlled by voice. It improves the user experience and also helps popularize voice input methods for electrical appliances.

Description

Voice information processing method, apparatus, device, and storage medium
Technical field
Embodiments of the present invention relate to the field of speech recognition, and in particular to a voice information processing method, apparatus, device, and storage medium.
Background
With the development of science and technology, making electrical appliances intelligent and user-friendly has become a common concern, since intelligent and user-friendly appliances make operation far more convenient.
For the electrical appliances currently on the market, such as television sets and set-top boxes, human-computer interaction between the appliance and the user generally relies on key presses alone, i.e., the appliance is controlled through a traditional soft-keyboard input mode. Soft-keyboard input is currently the most common and popular input mode on the market. In use, however, it is cumbersome: for example, to enter Chinese characters, the user must type the pinyin of each character one by one. Users who know neither pinyin nor the Wubi (five-stroke) input method cannot use soft-keyboard input at all.
Another popular mode is voice input. Although voice input is very convenient for users, accents differ between users from different regions, and appliances have difficulty recognizing different accents. As a result, voice input methods have been hard to popularize.
Summary of the invention
Embodiments of the present invention provide a voice information processing method, apparatus, device, and storage medium, to address the limited recognition of different users' accents when an electrical appliance is controlled by voice.
In a first aspect, an embodiment of the invention provides a voice information processing method, comprising:
receiving the current voice information input by a user after the voice function is turned on;
if the current voice information does not match the reference voice information stored in a voice library, converting the current voice information into text information and displaying it;
obtaining an edit instruction for the text information, performing an edit operation on the text information according to the edit instruction, and taking the new text information obtained after the edit operation as target text information;
storing the target text information in the voice library in correspondence with the current voice information.
In a second aspect, an embodiment of the invention further provides a voice information processing apparatus, comprising:
a current voice information acquisition module, configured to receive the current voice information input by a user after the voice function is turned on;
a first display module, configured to convert the current voice information into text information and display it if the current voice information does not match the reference voice information stored in a voice library;
a text information editing module, configured to obtain an edit instruction for the text information, perform an edit operation on the text information according to the edit instruction, and take the new text information obtained after the edit operation as target text information;
a storage module, configured to store the target text information in the voice library in correspondence with the current voice information.
In a third aspect, an embodiment of the invention further provides a device, comprising:
one or more processors;
a storage apparatus, configured to store one or more programs,
wherein, when the one or more programs are executed by the one or more processors, the one or more processors implement the voice information processing method provided by any embodiment of the invention.
In a fourth aspect, an embodiment of the invention further provides a computer-readable storage medium on which a computer program is stored, wherein the program, when executed by a processor, implements the voice information processing method provided by any embodiment of the invention.
In embodiments of the present invention, after the voice function is turned on, the current voice information input by the user is received; if it is determined that the current voice information does not match the reference voice information stored in the voice library, the current voice information is converted into text information and displayed. If the user then finds from the displayed text that it does not correspond to what the user said, the user can edit the text so that it matches the voice information the user uttered. After obtaining the user's edit instruction for the text information, the appliance performs an edit operation on the text according to the edit instruction and takes the new text obtained after the edit operation as target text information. Because the target text information is stored in the voice library in correspondence with the current voice information, when the user utters the same voice information again, the corresponding text information can be found in the preset voice library; if that text corresponds to a control instruction of the appliance, the appliance can be controlled to perform the control operation corresponding to the text. With this technical solution, the appliance can recognize the different accents of different users and perform the corresponding action according to the recognition result, so that users from different regions with different accents can all control the appliance by voice. This improves the user experience and also helps popularize voice input methods for electrical appliances.
Brief description of the drawings
Fig. 1 is a flowchart of a voice information processing method provided by Embodiment One of the present invention;
Fig. 2 is a flowchart of a voice information processing method provided by Embodiment Two of the present invention;
Fig. 3 is a structural block diagram of a voice information processing apparatus provided by Embodiment Three of the present invention;
Fig. 4 is a schematic structural diagram of a device provided by Embodiment Four of the present invention.
Detailed description
The present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the invention, not to limit it. It should also be noted that, for ease of description, the drawings show only the parts related to the invention rather than the entire structure.
Embodiment One
Fig. 1 is a flowchart of a voice information processing method provided by Embodiment One of the present invention. The method may be performed by a voice information processing apparatus, which may be implemented in software and/or hardware and may be integrated in an electrical appliance such as a television or an air conditioner, or in a mobile terminal such as a smartphone or a tablet computer. Referring to Fig. 1, the method of this embodiment specifically comprises:
S110: After the voice function is turned on, receive the current voice information input by the user.
The voice input function has two states: activated and inactive. When the user needs to communicate or interact by voice input, the user can turn on the voice input function by pressing the voice button on the appliance's remote control.
Illustratively, an identifier may be provided for the state of the voice input function. In the activated state, the flag bit of the identifier may be set to 1; in the inactive state, it may be set to 0. In this embodiment, whether the voice function is activated can be determined by reading the value of the flag bit.
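The flag-bit check described above can be sketched as follows. This is only an illustration: the names VoiceInputState, VoiceFunction, and on_voice_button_pressed are hypothetical and do not come from the patent.

```python
from enum import IntEnum


class VoiceInputState(IntEnum):
    """Flag values for the voice input function (1 = activated, 0 = inactive)."""
    INACTIVE = 0
    ACTIVE = 1


class VoiceFunction:
    """Hypothetical holder for the state flag of the voice input function."""

    def __init__(self) -> None:
        # The function starts in the inactive state.
        self.flag = VoiceInputState.INACTIVE

    def on_voice_button_pressed(self) -> None:
        # Pressing the voice button on the remote control activates voice input.
        self.flag = VoiceInputState.ACTIVE

    def is_active(self) -> bool:
        # Activation is detected by reading the value of the flag bit.
        return self.flag == VoiceInputState.ACTIVE
```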
Illustratively, detecting that the voice function has been turned on indicates that the user intends to control the appliance by voice. The appliance then activates the voice input method panel to receive the current voice information input by the user.
S120: If the current voice information does not match the reference voice information stored in the voice library, convert the current voice information into text information and display it.
Illustratively, the user inputs voice information mainly to make the appliance perform a related action, such as switching channels or adjusting the volume, which replaces manual operation and improves the user experience. When an appliance is controlled by voice, different control instructions are generally defined for different voice information, and the appliance performs the corresponding action according to each control instruction.
In general, an appliance recognizes user voice information using recognition designed for Mandarin, and the voice control instructions present in the appliance generally correspond to Mandarin. If the received voice information is not Mandarin speech but a regional dialect spoken where the user lives, the appliance cannot perform the corresponding control based on it. Therefore, to ensure that different voice information is recognized correctly, the appliance may display the recognition result as text for the user to confirm. In this embodiment, there is also a preset correspondence between text information and the appliance's control instructions: once the user has confirmed that the text information is correct and sent a confirmation instruction, the appliance can, upon receiving that instruction, perform the control action corresponding to the text information.
Illustratively, in this embodiment, techniques such as speech recognition, semantic parsing, and speech synthesis can be used to convert the user's voice information into text automatically. The text conversion serves two purposes. First, from the converted text the user can determine whether the appliance recognized the current voice information correctly, i.e., whether what the appliance recognized is consistent with what the user originally meant to express. Second, if the user needs to add to or modify something after speaking, the user can do so promptly, which avoids sending incomplete voice information.
Illustratively, if the user finds that the displayed text does not match what the user meant to say, the user can modify the text so that it corresponds to the voice information uttered.
It should be noted that the voice library in this embodiment mainly stores the user's voice information and the corresponding text information, where the text information is text that matches the voice information uttered by the user. For example, if the user finds that the text shown on the appliance's current interface does not match what the user meant to express, the text must be modified; what is stored in the voice library is the text that the user has modified so that it matches the voice information.
Illustratively, in this embodiment, each time the appliance receives current voice information, it must match that information against the voice information stored in the preset voice library. If the current voice information has been input by the current user before, the preset voice library already stores it together with the corresponding text information. In that case, even if the voice information is not the Mandarin that the appliance supports by default, the appliance can find it in the preset voice library, look up the corresponding text information, and display it. After the user confirms that the text is correct, i.e., once the user's confirmation instruction is received, the control operation corresponding to the text information can be performed.
Preferably, when matching the current voice information against the reference voice information stored in the voice library, the appliance may first preprocess the voice information, for example with VAD (Voice Activity Detection) and echo cancellation. VAD cuts the silence from the head and tail of the voice signal to reduce interference with subsequent speech recognition. After preprocessing, algorithms such as linear prediction cepstral coefficients (LPCC) and Mel-frequency cepstral coefficients (MFCC) may be used to extract features from the voice signal, and acoustic-model and language-model techniques are then used to match the sound segments against the voice information stored in the voice library.
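The head-and-tail silence trimming that VAD performs can be illustrated with a minimal energy-based sketch. The patent does not specify a VAD algorithm; the frame length, the energy threshold, and the function names below are assumptions chosen for illustration, and a production VAD would be considerably more robust.

```python
from typing import List, Sequence


def frame_energies(samples: Sequence[float], frame_len: int) -> List[float]:
    """Mean squared amplitude of each non-overlapping frame."""
    energies = []
    for start in range(0, len(samples), frame_len):
        frame = samples[start:start + frame_len]
        energies.append(sum(s * s for s in frame) / len(frame))
    return energies


def trim_silence(samples: Sequence[float],
                 frame_len: int = 160,
                 threshold: float = 1e-4) -> List[float]:
    """Drop leading and trailing frames whose energy is at or below threshold.

    frame_len and threshold are illustrative values, not taken from the patent.
    """
    energies = frame_energies(samples, frame_len)
    voiced = [i for i, e in enumerate(energies) if e > threshold]
    if not voiced:
        return []  # the whole signal is treated as silence
    start = voiced[0] * frame_len
    end = min((voiced[-1] + 1) * frame_len, len(samples))
    return list(samples[start:end])
```

The trimmed samples would then be passed on to feature extraction (for example MFCC), which is outside the scope of this sketch.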
S130: Obtain an edit instruction for the text information, perform an edit operation on the text information according to the edit instruction, and take the new text information obtained after the edit operation as target text information.
Illustratively, the preset voice library of this embodiment may store the voice information that the user has input and the corresponding text information. Whenever the appliance detects voice information, it can recognize the received voice information and convert it into text for display, so that the user can confirm and correct it. Receiving an edit instruction from the user indicates that the appliance's recognition result does not match what the user meant to express. In that case, the text is edited according to the edit instruction, the new text obtained after editing is taken as the target text information, and the user's current voice information and the target text information are stored in the preset voice library in correspondence with each other.
S140: Store the target text information in the voice library in correspondence with the current voice information.
Illustratively, in this embodiment the target text information is stored in the voice library in correspondence with the current voice information. If the appliance later receives voice information identical to the current voice information, it can recognize it accurately based on the voice library, find the corresponding text information in the voice library, and display it. This addresses the appliance's difficulty in recognizing different accents and helps promote the appliance's voice input function.
Illustratively, if the voice information received by the appliance is being input by the user for the first time, it will not match any voice information stored in the voice library. The voice information can then be converted into text in the manner provided by this embodiment; if the user edits the converted text, the edited target text information and the corresponding voice information are stored in the voice library in correspondence with each other.
In the technical solution of this embodiment, the appliance matches the received current voice information against the reference voice information stored in the voice library. If the two do not match, the current voice information is converted into text and displayed, and the target text information obtained after the text is edited is stored in the voice library in correspondence with the current voice information. When the appliance receives the same voice information again, it can recognize it and perform the corresponding action even if the speech carries the user's accent. With this technical solution, users with different regional accents can all control the appliance by voice, which improves the user experience and also helps popularize voice input methods for electrical appliances.
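Steps S110 to S140 can be summarized in a short sketch. It makes several simplifying assumptions that are not in the patent: a voice utterance is represented by a string key (voice_key), an exact dictionary lookup stands in for acoustic matching, and the recognizer and the user's edit are passed in as callables. The names VoiceLibrary and process_voice are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional


@dataclass
class VoiceLibrary:
    """Maps a stored voice entry (here a string key) to its text information."""
    entries: Dict[str, str] = field(default_factory=dict)

    def lookup(self, voice_key: str) -> Optional[str]:
        return self.entries.get(voice_key)

    def store(self, voice_key: str, target_text: str) -> None:
        self.entries[voice_key] = target_text


def process_voice(voice_key: str,
                  library: VoiceLibrary,
                  recognize: Callable[[str], str],
                  ask_user_to_edit: Callable[[str], str]) -> str:
    """Return the text associated with the utterance, learning it if unknown."""
    stored = library.lookup(voice_key)
    if stored is not None:
        # Matched a reference entry: reuse the text stored earlier.
        return stored
    displayed = recognize(voice_key)       # S120: convert to text and display
    target = ask_user_to_edit(displayed)   # S130: apply the user's edit
    library.store(voice_key, target)       # S140: store text with the voice
    return target
```

On the first call for an utterance the recognizer's output is corrected by the user and stored; on later calls the stored text is returned without consulting the recognizer again.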
Embodiment Two
Fig. 2 is a flowchart of a voice information processing method provided by Embodiment Two of the present invention. This embodiment is optimized on the basis of the above embodiment; explanations of terms that are the same as or correspond to those in the above embodiment are not repeated here. Referring to Fig. 2, the method provided by this embodiment comprises:
S210: After the voice function is turned on, receive the current voice information input by the user.
Illustratively, because voice input offers a better user experience than manual operation, the appliance may recommend voice input first to improve the user experience. However, if no current voice information is received from the user within a set time after the voice function is turned on, the current voice input interface is switched to a text input interface so that the user can enter text.
The set time may be a time configured before the appliance leaves the factory, such as 30 seconds, or a time set by the user as needed.
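The timeout fallback can be sketched with a blocking queue read. The queue-based delivery of audio, the function name wait_for_voice, and the interface labels are assumptions for illustration; only the 30-second default comes from the text.

```python
import queue
from typing import Optional, Tuple


def wait_for_voice(voice_queue: "queue.Queue",
                   timeout_s: float = 30.0) -> Tuple[str, Optional[bytes]]:
    """Wait up to timeout_s for voice input, else fall back to text input.

    Returns the interface to show and the captured audio (None on timeout).
    """
    try:
        audio = voice_queue.get(timeout=timeout_s)
    except queue.Empty:
        # Nothing was spoken within the set time: switch to text input.
        return ("text_input", None)
    return ("voice_input", audio)
```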
S220: Determine whether the current voice information matches the reference voice information stored in the voice library. If so, perform step S230; otherwise, perform step S250.
Illustratively, whether the current voice information matches the reference voice information stored in the voice library may be determined as follows:
The voice information is preprocessed based on a preset speech recognition algorithm to obtain multiple speech segments; based on a preset acoustic model, the speech segments are compared for similarity with the voice information stored in the voice library; if the similarity reaches a set threshold, the current voice information is determined to match the voice information stored in the voice library.
The preset speech recognition algorithm comprises algorithms such as VAD, echo cancellation, and speech segmentation, which yield the multiple speech segments. The preset acoustic model may be a hidden Markov model, with which acoustic features can be extracted from the speech segments and compared for similarity with the voice information stored in the voice library. The set threshold is an empirical value, preferably 95%.
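The threshold decision can be illustrated with a simplified stand-in. The patent compares features through a hidden Markov model; the sketch below instead uses cosine similarity between fixed-length feature vectors, which is a deliberate substitution to keep the example short. The function names and the dictionary layout of the references are assumptions; only the 0.95 threshold comes from the text.

```python
import math
from typing import Dict, Optional, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two feature vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_match(features: Sequence[float],
               references: Dict[str, Sequence[float]],
               threshold: float = 0.95) -> Optional[str]:
    """Return the text of the most similar reference, or None if below threshold."""
    best_text: Optional[str] = None
    best_score = threshold
    for text, ref_features in references.items():
        score = cosine_similarity(features, ref_features)
        if score >= best_score:
            best_text, best_score = text, score
    return best_text
```

A return value of None corresponds to the "no match" branch (step S250), while a returned text corresponds to the "match" branch (step S230).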
S230: Display the text information corresponding to the current voice information, then proceed to step S240.
S240: If a confirmation instruction is received from the user, control the current device to perform the control operation corresponding to the text information.
Optionally, the text information corresponding to the current voice information may be displayed as follows: query the voice library for the text information corresponding to the reference voice information that matches the current voice information, and display that text. Alternatively, if the current voice information is Mandarin speech, it may be converted directly into text and displayed. In this embodiment, the text is displayed so that the user can confirm the accuracy of the appliance's recognition; the appliance performs the control operation corresponding to the text only after receiving the user's confirmation instruction.
Illustratively, the confirmation instruction may be issued by the user through the remote control, for example by pressing the confirm key on the remote control. It may also be voice information that the appliance recognizes as containing a confirmation marker, i.e., the user utters voice information that includes a confirmation marker such as "OK" or "confirm".
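The two kinds of confirmation instruction can be sketched as a single check. The source labels "remote" and "voice", the key code "CONFIRM_KEY", and the function name are hypothetical; the markers "OK" and "confirm" follow the examples in the text.

```python
CONFIRM_MARKERS = ("ok", "confirm")


def is_confirmation(source: str, payload: str) -> bool:
    """Decide whether an input event counts as a confirmation instruction.

    source is "remote" (payload is a key code) or "voice" (payload is the
    recognized text); both labels are assumptions made for this sketch.
    """
    if source == "remote":
        return payload == "CONFIRM_KEY"
    if source == "voice":
        # Match whole words so that e.g. "look" does not count as "ok".
        words = payload.lower().split()
        return any(word in CONFIRM_MARKERS for word in words)
    return False
```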
S250: Convert the current voice information into text information and display it, then proceed to step S260.
S260: Obtain an edit instruction for the text information, perform an edit operation on the text information according to the edit instruction, and take the new text information obtained after the edit operation as target text information.
Illustratively, the text input method provided in this embodiment has an intelligent memory function, i.e., the appliance can store vocabulary according to how frequently the user uses it. When entering text, the user only needs to input the initial character. If the appliance detects that the initial character matches the initial character of multiple locally stored target words, it displays those target words in descending order of usage frequency, where the target words are words whose usage frequency reaches a preset frequency.
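The initial-character lookup can be sketched as a filter followed by a sort. The dictionary of usage counts, the function name suggest, and the default min_count are assumptions; the patent only states that candidates must reach a preset frequency and are shown in descending order of usage.

```python
from typing import Dict, List


def suggest(first_char: str,
            usage_counts: Dict[str, int],
            min_count: int = 2) -> List[str]:
    """Words starting with first_char whose usage reaches min_count,
    ordered from most to least frequently used (ties broken alphabetically)."""
    candidates = [
        word for word, count in usage_counts.items()
        if word.startswith(first_char) and count >= min_count
    ]
    return sorted(candidates, key=lambda word: (-usage_counts[word], word))
```

For example, with counts of 9 for "volume up", 5 for "volume down", and 1 for "video", typing "v" would list "volume up" before "volume down" and omit "video".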
Further, because local storage space is limited, the appliance automatically removes less frequently used words or phrases before the used capacity reaches a preset maximum storage capacity. The removal rule is preferably as follows: the appliance sorts the locally stored words and phrases by usage frequency and removes those at the end of the ranking first; each removal deletes 20 percent of the total vocabulary.
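The removal rule can be sketched as follows, under the assumption that capacity and the 20 percent share are both measured in number of entries (the text does not say whether it means entries or bytes). The function name and the choice to remove at least one entry are also assumptions.

```python
from typing import Dict, List


def evict_least_used(usage_counts: Dict[str, int],
                     capacity: int,
                     fraction: float = 0.2) -> List[str]:
    """When the vocabulary has grown to capacity, remove the least-used
    fraction of entries in place and return the removed words."""
    if len(usage_counts) < capacity:
        return []  # still below the preset maximum, nothing to remove
    n_remove = max(1, int(len(usage_counts) * fraction))
    # Sort ascending by usage so the least-used entries come first.
    ordered = sorted(usage_counts, key=lambda word: (usage_counts[word], word))
    removed = ordered[:n_remove]
    for word in removed:
        del usage_counts[word]
    return removed
```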
S270: Store the target text information in the voice library in correspondence with the current voice information.
On the basis of the above embodiment, this embodiment combines voice input with an input method that has an intelligent memory function. When the appliance misrecognizes speech, it intelligently records the voice information and the correct target text information as modified by the user, and stores the correct text and the voice information in the voice library in correspondence with each other. When the user inputs the same voice information again, the appliance can recognize it automatically based on the contents of the voice library and, after receiving the user's confirmation instruction, control the current appliance to perform the control operation corresponding to the target text information.
Embodiment Three
Fig. 3 is a structural block diagram of a voice information processing apparatus provided by Embodiment Three of the present invention. As shown in Fig. 3, the apparatus comprises: a current voice information acquisition module 310, a first display module 320, a text information editing module 330, and a storage module 340.
The current voice information acquisition module 310 is configured to receive the current voice information input by the user after the voice function is turned on;
the first display module 320 is configured to convert the current voice information into text information and display it if the current voice information does not match the reference voice information stored in the voice library;
the text information editing module 330 is configured to obtain an edit instruction for the text information, perform an edit operation on the text information according to the edit instruction, and take the new text information obtained after the edit operation as target text information;
the storage module 340 is configured to store the target text information in the voice library in correspondence with the current voice information.
In the technical solution of this embodiment, the appliance matches the received current voice information against the reference voice information stored in the voice library. If the two do not match, the current voice information is converted into text and displayed, and the target text information obtained after the text is edited is stored in the voice library in correspondence with the current voice information. When the appliance receives the same voice information again, it can recognize it and perform the corresponding action even if the speech carries the user's accent. With this technical solution, users with different regional accents can all control the appliance by voice, which improves the user experience and also helps popularize voice input methods for electrical appliances.
On the basis of the above embodiment, the apparatus further comprises:
a second display module, configured to display the text information corresponding to the current voice information if the current voice information matches the reference voice information stored in the voice library;
a control module, configured to control the current device to perform the control operation corresponding to the text information if a confirmation instruction is received from the user.
On the basis of the above embodiment, the apparatus further comprises:
an interface switching module, configured to switch the current voice input interface to a text input interface, so that the user can enter text, if no current voice information is received from the user within a set time after the voice function is turned on.
On the basis of the above embodiment, the second display module is specifically configured to:
preprocess the voice information based on a preset speech recognition algorithm to obtain multiple speech segments;
compare, based on a preset acoustic model, the multiple speech segments for similarity with the reference voice information stored in the voice library;
if the similarity reaches a set threshold, determine that the current voice information matches the reference voice information stored in the voice library;
display the text information corresponding to the current voice information.
On the basis of the above embodiment, the confirmation instruction is:
a confirmation instruction issued by the user through the remote control; or
voice information that includes a confirmation marker.
On the basis of the above embodiment, the apparatus further comprises:
an initial character recognition module, configured to recognize the initial character of the text input by the user;
a vocabulary display module, configured to display multiple target words in descending order of usage frequency if the initial character matches the initial character of multiple locally stored target words, where the target words are words whose usage frequency reaches a preset frequency.
The voice information processing apparatus provided by this embodiment of the invention can perform the voice information processing method provided by any embodiment of the invention, and has the functional modules and beneficial effects corresponding to the method. For technical details not described in detail in the above embodiment, refer to the voice information processing method provided by any embodiment of the invention.
Embodiment Four
Fig. 4 is a schematic structural diagram of a device provided by Embodiment Four of the present invention. Fig. 4 shows a block diagram of an exemplary device 12 suitable for implementing embodiments of the present invention. The device 12 shown in Fig. 4 is only an example and should not impose any limitation on the functions or scope of use of embodiments of the invention.
As shown in Fig. 4, device 12 takes the form of a general-purpose computing device. The components of device 12 may include, but are not limited to: one or more processors or processing units 16, a system memory 28, and a bus 18 connecting the different system components (including the system memory 28 and the processing unit 16).
The bus 18 represents one or more of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus architectures. For example, these architectures include, but are not limited to, the Industry Standard Architecture (ISA) bus, the Micro Channel Architecture (MCA) bus, the Enhanced ISA (EISA) bus, the Video Electronics Standards Association (VESA) local bus, and the Peripheral Component Interconnect (PCI) bus.
The device 12 typically includes a variety of computer-system-readable media. These media may be any available media that can be accessed by the device 12, including volatile and non-volatile media, and removable and non-removable media.
The system memory 28 may include computer-system-readable media in the form of volatile memory, such as a random access memory (RAM) 30 and/or a cache memory 32. The device 12 may further include other removable/non-removable, volatile/non-volatile computer system storage media. By way of example only, a storage system 34 may be used to read from and write to a non-removable, non-volatile magnetic medium (not shown in Fig. 4, commonly referred to as a "hard disk drive"). Although not shown in Fig. 4, a magnetic disk drive for reading from and writing to a removable non-volatile magnetic disk (such as a "floppy disk") and an optical disc drive for reading from and writing to a removable non-volatile optical disc (such as a CD-ROM, DVD-ROM, or other optical medium) may be provided. In these cases, each drive may be connected to the bus 18 through one or more data media interfaces. The memory 28 may include at least one program product having a set of (for example, at least one) program modules configured to perform the functions of the embodiments of the present invention.
A program/utility 40 having a set of (at least one) program modules 42 may be stored, for example, in the memory 28. Such program modules 42 include, but are not limited to, an operating system, one or more application programs, other program modules, and program data; each of these examples, or some combination of them, may include an implementation of a network environment. The program modules 42 generally perform the functions and/or methods of the embodiments described in the present invention.
The device 12 may also communicate with one or more external devices 14 (such as a keyboard, a pointing device, a display 24, etc.), with one or more devices that enable a user to interact with the device 12, and/or with any device (such as a network card, a modem, etc.) that enables the device 12 to communicate with one or more other computing devices. Such communication may take place through an input/output (I/O) interface 22. In addition, the device 12 may communicate with one or more networks (such as a local area network (LAN), a wide area network (WAN), and/or a public network such as the Internet) through a network adapter 20. As shown in the figure, the network adapter 20 communicates with the other modules of the device 12 through the bus 18. It should be understood that, although not shown in the figure, other hardware and/or software modules may be used in conjunction with the device 12, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems.
The processing unit 16 runs the programs stored in the system memory 28 to execute various functional applications and data processing, for example to implement the voice information processing method provided by any embodiment of the present invention. The method comprises:
After the voice function is turned on, receiving the current voice information input by the user;
If the current voice information does not match the reference voice information stored in the voice library, converting the current voice information into text information and displaying it;
Obtaining an edit instruction for the text information, performing an editing operation on the text information according to the edit instruction, and taking the new text information obtained after the editing operation as the target text information;
Storing the target text information in the voice library in correspondence with the current voice information.
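The overall flow of the method can be sketched as below. This is a simplified illustration, not the patented implementation: an exact dictionary lookup stands in for acoustic matching, and the `recognize` and `get_edited_text` callables are hypothetical placeholders for speech-to-text conversion and for the display-and-edit step.

```python
def process_voice(voice_key, voice_library, recognize, get_edited_text):
    """Handle one piece of voice input after the voice function is turned on.

    voice_key:       hashable stand-in for the current voice information
    voice_library:   dict mapping stored voice information to its text
    recognize:       hypothetical callable converting voice input to text
    get_edited_text: hypothetical callable that displays the text and
                     returns the version edited by the user
    """
    # Matched case: reuse the text already stored for this voice input.
    if voice_key in voice_library:
        return voice_library[voice_key]

    # Unmatched case: convert to text and let the user correct it.
    text = recognize(voice_key)
    target_text = get_edited_text(text)

    # Store the corrected text in correspondence with the voice input,
    # so the same input matches the next time it is received.
    voice_library[voice_key] = target_text
    return target_text
```

The point of the design is that a correction made once by the user is reused afterwards: the second time the same voice input arrives, the stored target text is returned without another recognition or editing step.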
Embodiment five
Embodiment five of the present invention further provides a computer-readable storage medium on which a computer program is stored. When executed by a processor, the program implements the voice information processing method provided by any embodiment of the present invention. The method comprises:
After the voice function is turned on, receiving the current voice information input by the user;
If the current voice information does not match the reference voice information stored in the voice library, converting the current voice information into text information and displaying it;
Obtaining an edit instruction for the text information, performing an editing operation on the text information according to the edit instruction, and taking the new text information obtained after the editing operation as the target text information;
Storing the target text information in the voice library in correspondence with the current voice information.
The computer storage medium of the embodiments of the present invention may be any combination of one or more computer-readable media. A computer-readable medium may be a computer-readable signal medium or a computer-readable storage medium. A computer-readable storage medium may be, for example, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples (a non-exhaustive list) of computer-readable storage media include: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In this document, a computer-readable storage medium may be any tangible medium that contains or stores a program that can be used by or in connection with an instruction execution system, apparatus, or device.
A computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, in which computer-readable program code is carried. Such a propagated data signal may take a variety of forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium, and that medium can send, propagate, or transmit a program for use by or in connection with an instruction execution system, apparatus, or device.
The program code contained on a computer-readable medium may be transmitted over any suitable medium, including but not limited to wireless, wire, optical cable, RF, etc., or any suitable combination of the above.
The computer program code for carrying out the operations of the present invention may be written in one or more programming languages or a combination thereof, including object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In situations involving a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).
Note that the above are only preferred embodiments of the present invention and the technical principles applied. Those skilled in the art will understand that the present invention is not limited to the specific embodiments described herein, and that various obvious changes, readjustments, and substitutions can be made by those skilled in the art without departing from the protection scope of the present invention. Therefore, although the present invention has been described in some detail through the above embodiments, it is not limited to them; it may include further equivalent embodiments without departing from the inventive concept, and its scope is determined by the scope of the appended claims.

Claims (10)

1. A method for processing voice information, characterized by comprising:
After the voice function is turned on, receiving the current voice information input by the user;
If the current voice information does not match the reference voice information stored in the voice library, converting the current voice information into text information and displaying it;
Obtaining an edit instruction for the text information, performing an editing operation on the text information according to the edit instruction, and taking the new text information obtained after the editing operation as the target text information;
Storing the target text information in the voice library in correspondence with the current voice information.
2. The method according to claim 1, characterized by further comprising:
If the current voice information matches the reference voice information stored in the voice library, displaying the text information corresponding to the current voice information;
If a confirmation instruction from the user is received, controlling the current device to execute the control operation corresponding to the text information.
3. The method according to claim 1, characterized by further comprising:
If no current voice information input by the user is received within a set time after the voice function is turned on, switching the current voice input interface to a text input interface so that the user can perform text input.
4. The method according to claim 2, characterized in that the current voice information matching the reference voice information stored in the voice library comprises:
Preprocessing the voice information based on a preset speech recognition algorithm to obtain multiple speech segments;
Based on a preset acoustic model, comparing the similarity between the multiple speech segments and the reference voice information stored in the voice library;
If the similarity reaches a set threshold, determining that the current voice information matches the reference voice information stored in the voice library.
5. The method according to claim 2, characterized in that the confirmation instruction is:
A confirmation instruction issued by the user through a remote control; or,
Voice information that contains a confirmation identifier.
6. The method according to claim 3, characterized by further comprising:
When the user performs text input, recognizing the first character input by the user;
If the first character matches the first character of multiple locally stored target words, displaying the multiple target words in descending order of usage frequency; wherein the multiple target words are words whose usage frequency reaches a preset frequency.
7. A device for processing voice information, characterized by comprising:
A current voice information acquisition module, configured to receive the current voice information input by the user after the voice function is turned on;
A first display module, configured to convert the current voice information into text information and display it if the current voice information does not match the reference voice information stored in the voice library;
A text information editing module, configured to obtain an edit instruction for the text information, perform an editing operation on the text information according to the edit instruction, and take the new text information obtained after the editing operation as the target text information;
A storage module, configured to store the target text information in the voice library in correspondence with the current voice information.
8. The device according to claim 7, characterized by further comprising:
A second display module, configured to display the text information corresponding to the current voice information if the current voice information matches the reference voice information stored in the voice library;
A control module, configured to control the current device to execute the control operation corresponding to the text information if a confirmation instruction from the user is received.
9. A device, characterized in that the device comprises:
One or more processors;
A storage apparatus, configured to store one or more programs,
When the one or more programs are executed by the one or more processors, the one or more processors implement the method for processing voice information according to any one of claims 1 to 6.
10. A computer-readable storage medium on which a computer program is stored, characterized in that, when executed by a processor, the program implements the method for processing voice information according to any one of claims 1 to 6.
CN201810864520.5A 2018-08-01 2018-08-01 A kind of processing method of voice messaging, device, equipment and storage medium Pending CN109036406A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201810864520.5A CN109036406A (en) 2018-08-01 2018-08-01 A kind of processing method of voice messaging, device, equipment and storage medium
PCT/CN2019/082706 WO2020024620A1 (en) 2018-08-01 2019-04-15 Voice information processing method and device, apparatus, and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810864520.5A CN109036406A (en) 2018-08-01 2018-08-01 A kind of processing method of voice messaging, device, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN109036406A true CN109036406A (en) 2018-12-18

Family

ID=64648341

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810864520.5A Pending CN109036406A (en) 2018-08-01 2018-08-01 A kind of processing method of voice messaging, device, equipment and storage medium

Country Status (2)

Country Link
CN (1) CN109036406A (en)
WO (1) WO2020024620A1 (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109215638A (en) * 2018-10-19 2019-01-15 珠海格力电器股份有限公司 A kind of phonetic study method, apparatus, speech ciphering equipment and storage medium
CN109584875A (en) * 2018-12-24 2019-04-05 珠海格力电器股份有限公司 A kind of speech ciphering equipment control method, device, storage medium and speech ciphering equipment
WO2020024620A1 (en) * 2018-08-01 2020-02-06 深圳创维-Rgb电子有限公司 Voice information processing method and device, apparatus, and storage medium
CN111261155A (en) * 2019-12-27 2020-06-09 北京得意音通技术有限责任公司 Speech processing method, computer-readable storage medium, computer program, and electronic device
CN112927693A (en) * 2021-03-03 2021-06-08 立讯电子科技(昆山)有限公司 Control method, device and system based on voice control
CN113674743A (en) * 2021-08-20 2021-11-19 云知声(上海)智能科技有限公司 ASR result replacement processing device and processing method used in natural language processing

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103187056A (en) * 2011-12-28 2013-07-03 上海博泰悦臻电子设备制造有限公司 Voice processing system based on vehicle-mounted application
WO2014024751A1 (en) * 2012-08-10 2014-02-13 エイディシーテクノロジー株式会社 Voice response system
CN104346127A (en) * 2013-08-02 2015-02-11 腾讯科技(深圳)有限公司 Realization method, realization device and terminal for voice input
CN105408952A (en) * 2013-02-21 2016-03-16 谷歌技术控股有限责任公司 Recognizing accented speech
CN106384593A (en) * 2016-09-05 2017-02-08 北京金山软件有限公司 Voice information conversion and information generation method and device
CN106790942A (en) * 2016-12-28 2017-05-31 努比亚技术有限公司 Voice messaging intelligence store method and device

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101276514A (en) * 2008-03-31 2008-10-01 深圳创维-Rgb电子有限公司 Method, system and apparatus for controlling electronic equipment
DK201670539A1 (en) * 2016-03-14 2017-10-02 Apple Inc Dictation that allows editing
CN107146607B (en) * 2017-04-10 2021-06-18 北京猎户星空科技有限公司 Method, device and system for correcting interaction information of intelligent equipment
CN108154878A (en) * 2017-12-12 2018-06-12 北京小米移动软件有限公司 Control the method and device of monitoring device
CN108806688A (en) * 2018-07-16 2018-11-13 深圳Tcl数字技术有限公司 Sound control method, smart television, system and the storage medium of smart television
CN109036406A (en) * 2018-08-01 2018-12-18 深圳创维-Rgb电子有限公司 A kind of processing method of voice messaging, device, equipment and storage medium

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
(英)柯林森: "《航空电子系统导论 原书第3版》", 31 October 2013, 北京:国防工业出版社 *
九天科技: "《中老年人学电脑与上网傻瓜书(Windows 10+Office 2016版)》", 31 January 2018, 北京:中国铁道出版社 *
创客诚品: "《五笔打字新手速成》", 31 July 2017, 北京希望电子出版社 *
周学君等: "《计算机基础教程》", 30 September 2005, 武汉:华中科技大学出版社 *

Also Published As

Publication number Publication date
WO2020024620A1 (en) 2020-02-06

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20181218