CN107888479A

CN107888479A - Voice communication method, device, computer equipment and storage medium

Info

Publication number: CN107888479A
Application number: CN201711043205.8A
Authority: CN
Inventors: 关华; 李杨晶; 钟伟杰; 钟华健
Original assignee: Shenzhen Yunzhijia Network Co Ltd
Current assignee: Shenzhen Yunzhijia Network Co Ltd
Priority date: 2017-10-31
Filing date: 2017-10-31
Publication date: 2018-04-06

Abstract

The present invention relates to a voice communication method, device, computer equipment and storage medium. The method includes: obtaining a speech message sent by a first terminal through an instant messaging chat interface, where the speech message includes speech data and text information, the speech data is obtained by the first terminal triggering voice collection through the instant messaging chat interface, and the text information is converted from the speech data; and sending the speech message to a second terminal, so that the second terminal displays the speech message in the instant messaging chat interface. Because the text information contained in the speech message is sent along with it, the second terminal, on receiving the speech message, can learn what the message says simply by reading the text information, without playing the message and without tapping through each speech message one by one. This saves operating steps and the time cost of user operation.

Description

Voice communication method, device, computer equipment and storage medium

Technical field

The present invention relates to the field of Internet technology, and in particular to a voice communication method, device, computer equipment and storage medium.

Background art

In existing instant messaging chat tools, speech messages are a common way to communicate. Well-known instant messaging chat tools on the market, such as WeChat and QQ, all support speech messages. However, when a chat contains many speech messages and the user wants to learn their content quickly, the messages have to be played one by one. The operation is cumbersome and costs time.

Summary of the invention

In view of the above technical problem, it is necessary to provide a voice communication method, device, computer equipment and storage medium that can save the time cost of user operation.

A voice communication method, the method comprising:

obtaining a speech message sent by a first terminal through an instant messaging chat interface, the speech message including speech data and text information, the speech data being obtained by the first terminal triggering voice collection through the instant messaging chat interface, and the text information being converted from the speech data; and

sending the speech message to a second terminal, so that the second terminal displays the speech message in the instant messaging chat interface.

In one of the embodiments, sending the speech message to the second terminal includes:

sending metadata corresponding to the speech message, together with the text information, to the second terminal, where the metadata includes a speech message identifier and a duration corresponding to the speech data.

In one of the embodiments, after the speech message is sent to the second terminal, the method further includes:

obtaining a text correction instruction triggered by the first terminal through the instant messaging chat interface, the text correction instruction carrying the corrected text information of the speech message; and

sending a speech message containing the speech data and the corrected text information to the second terminal, so that the second terminal displays the re-sent speech message in the instant messaging chat interface.

In one of the embodiments, sending the speech message to the second terminal includes:

matching the words contained in the text information against the user names corresponding to the instant messaging chat interface, to obtain a user name that matches a word;

replacing the word in the text information that matches the user name with the user name; and

sending a speech message containing the speech data and the text information after the replacement to the second terminal, so that the second terminal displays the speech message in the instant messaging chat interface.

In one of the embodiments, the voice communication method further includes:

matching the words contained in the text information against the user name of the second terminal in the instant messaging chat interface; and

if the match succeeds, marking the speech message, and/or sending a prompt message to the second terminal.

In one of the embodiments, the voice communication method further includes:

matching the words contained in the text information against the names of the applications included in the group corresponding to the instant messaging chat interface; and

if the match succeeds, sending an application prompt message to the second terminal.

A voice communication method, the method comprising:

triggering voice collection in an instant messaging chat interface and collecting speech data;

converting the speech data into text information;

generating a speech message from the speech data and the text information; and

sending the speech message.

In one of the embodiments, after obtaining an instruction, returned by a server, for sending the speech message and displaying the speech message according to that instruction, the method further includes:

triggering a text correction request through the instant messaging chat interface;

entering a text correction operation interface according to the text correction request, and correcting the text information in the message through the text correction operation interface;

generating a speech message again from the speech data and the corrected text information; and

sending the regenerated speech message.

In one of the embodiments, sending the regenerated speech message includes:

recalling the speech message that was displayed in the instant messaging chat interface before the correction.

In one of the embodiments, generating the speech message from the speech data and the text information includes:

generating the speech message from the speech data and the text information in which the user name has been substituted.

A voice communication device, the device comprising:

an acquisition module, configured to obtain a speech message sent by a first terminal through an instant messaging chat interface, the speech message including speech data and text information, the speech data being obtained by the first terminal triggering voice collection through the instant messaging chat interface, and the text information being converted from the speech data; and

a sending module, configured to send the speech message to a second terminal, so that the second terminal displays the speech message in the instant messaging chat interface.

A voice communication device, the device comprising:

a speech data collection module, configured to trigger voice collection in an instant messaging chat interface and collect speech data;

a speech data conversion module, configured to convert the speech data into text information;

a speech message generation module, configured to generate a speech message from the speech data and the text information; and

a speech message sending module, configured to send the speech message.

Computer equipment, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor implements the following steps when executing the computer program:

converting the speech data into text information;

generating a speech message from the speech data and the text information; and

sending the speech message.

A computer-readable storage medium storing a computer program, where the computer program implements the following steps when executed by a processor:

converting the speech data into text information;

generating a speech message from the speech data and the text information; and

sending the speech message.

With the above voice communication method, device, computer equipment and storage medium, a speech message sent by a first terminal through an instant messaging chat interface is obtained, where the speech message includes speech data and text information, the speech data is obtained by the first terminal triggering voice collection through the instant messaging chat interface, and the text information is converted from the speech data. The speech message is then sent to a second terminal, so that the second terminal displays the speech message in the instant messaging chat interface. Because the text information contained in the speech message is sent along with it, the second terminal, on receiving the speech message, can learn what the message says simply by reading the text information, without playing the message and without tapping through each speech message one by one. This saves operating steps and the time cost of user operation.

Brief description of the drawings

Fig. 1 is a diagram of the application environment of the voice communication method in one embodiment;

Fig. 2 is a schematic flowchart of the voice communication method in one embodiment;

Fig. 3 is a schematic interface diagram of recalling the original speech message corresponding to a corrected speech message in one embodiment;

Fig. 4 is a schematic flowchart of the voice communication method in another embodiment;

Fig. 5 is a schematic interface diagram of correcting the text information in a speech message in one embodiment;

Fig. 6 is a schematic interface diagram of the second terminal receiving a related message in one embodiment;

Fig. 7 is a schematic flowchart of the voice communication method in one embodiment;

Fig. 8 is a schematic flowchart of correcting a speech message in one embodiment;

Fig. 9 is a diagram of the text correction operation interface in one embodiment;

Fig. 10 is a structural block diagram of the voice communication device in one embodiment;

Fig. 11 is a structural block diagram of the voice communication device in one embodiment;

Fig. 12 is a schematic diagram of the internal structure of the computer equipment in one embodiment.

Detailed description

To make the objects, technical solutions and advantages of the present invention clearer, the invention is described in further detail below with reference to the drawings and embodiments. It should be understood that the embodiments described here serve only to explain the invention and do not limit its scope of protection.

Fig. 1 shows the application environment of the voice communication method in one embodiment. Referring to Fig. 1, the voice communication method can be applied in a system that implements voice communication. The system includes a terminal 110, a server 120 and a terminal 130; the terminal 110 and the terminal 130 are each connected to the server 120 through a network. The terminal 110 and the terminal 130 may be, but are not limited to, any personal computer, notebook computer, personal digital assistant, smartphone, tablet computer or similar device that can run an instant messaging application. The server 120 may be a server that implements a single function or a server that implements multiple functions, and may specifically be an independent physical server or a cluster of physical servers. The terminal 110 can display the instant messaging chat interface through a specific application. Voice collection is triggered in the instant messaging chat interface; after speech data is collected, the speech data is converted into text information; a speech message is then generated from the speech data and the text information and sent to the terminal 130 through the server 120. After obtaining the speech message sent by the terminal 110 through the instant messaging chat interface, the server 120 sends the speech message to the terminal 130.

As shown in Fig. 2, in one embodiment a voice communication method is provided. The method is illustrated as applied to the terminal shown in Fig. 1, and includes:

Step 202: obtain a speech message sent by a first terminal through an instant messaging chat interface, where the speech message includes speech data and text information, the speech data is obtained by the first terminal triggering voice collection through the instant messaging chat interface, and the text information is converted from the speech data.

After obtaining a login request from the first terminal, the server returns a login-success notification to the first terminal so that the first terminal completes login. The server obtains the speech message sent by the first terminal through the instant messaging chat interface, where the speech message includes the speech data obtained by the first terminal triggering voice collection through the instant messaging chat interface, and the text information converted from that speech data.

Converting the speech data into text information can be handled by a speech recognition module, which may be placed in the client's configuration files or in the server's files. When the server obtains the speech data that the first terminal collected by triggering voice collection through the instant messaging chat interface, a recognition mode can be selected according to the current network state. For example, on a wireless network or a 4G network, recognition on the server may be selected; on a 3G or 2G network, recognition on the client may be selected. Recognition on the server is online recognition, whereas recognition on the client is local recognition; comparatively, the recognition rate of the server is generally better than that of the client. For the collected speech data, the recognition mode can be set by the system developer or chosen with user habits in mind.
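The selection rule in the example above can be sketched as follows. This is only an illustration of the example given in the text, not a prescribed implementation; the names `Network`, `RecognitionMode` and `choose_recognition_mode` are hypothetical.

```python
from enum import Enum


class Network(Enum):
    WIRELESS = "wireless"
    G4 = "4g"
    G3 = "3g"
    G2 = "2g"


class RecognitionMode(Enum):
    ONLINE = "server"  # speech is recognized on the server
    LOCAL = "client"   # speech is recognized on the client


def choose_recognition_mode(network: Network) -> RecognitionMode:
    """Select where speech-to-text runs from the current network state."""
    # Assumption: only the four network states named in the text exist.
    if network in (Network.WIRELESS, Network.G4):
        return RecognitionMode.ONLINE
    return RecognitionMode.LOCAL
```

A developer could just as well drive this choice from a configuration value or a user preference, as the paragraph above allows.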

Step 204: send the speech message to a second terminal, so that the second terminal displays the speech message in the instant messaging chat interface.

The server can send the speech message, which contains the speech data and the text information converted from it, to the second terminal. The speech message that the second terminal receives therefore also contains the speech data and the text information converted from it, so when the second terminal displays the speech message in the instant messaging chat interface it can show the speech data and the text information at the same time.

The server sends the speech data collected by the first terminal together with the text information converted from it to the second terminal, so that the second terminal receives not only the speech data but also the corresponding text information. When reading the message, the second terminal can learn what the speech data expresses directly by reading the text information, without tapping to play the speech data. Especially when there are many messages, this saves many tap-to-play operations and thus the time cost of user operation.
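As a minimal sketch of the message described above, the speech message can be modeled as one record that carries both the audio and its text, so that a receiver can render the text without playing anything. The field names and the `render_for_chat` helper are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpeechMessage:
    message_id: str     # hypothetical identifier
    sender: str         # hypothetical sender label
    speech_data: bytes  # audio collected by the first terminal
    text: str           # text information converted from speech_data


def render_for_chat(msg: SpeechMessage) -> str:
    """Build the line a second terminal could show without playing the audio."""
    return f"[voice {len(msg.speech_data)} bytes] {msg.sender}: {msg.text}"
```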

In one embodiment, step 204 includes: sending metadata corresponding to the speech message, together with the text information, to the second terminal, where the metadata includes a speech message identifier and a duration corresponding to the speech data.

When sending the speech message to the second terminal, the server actually sends the metadata, which includes the speech message identifier and the duration of the speech data, together with the text information. When the second terminal receives the speech message, it has therefore not actually received the speech audio collected by the first terminal, that is, the speech data. Only after the second terminal triggers the corresponding speech message through the instant messaging chat interface does the server receive the voice read instruction generated by that trigger. The voice read instruction contains the speech message identifier (for example, a speech message ID). The server returns the corresponding notification to the second terminal according to that identifier, and the second terminal can play the speech data only after it has received the notification and downloaded the speech data normally.

When the second terminal receives a speech message containing metadata, the corresponding speech data has not yet been downloaded; the second terminal must trigger the corresponding operation to download it. Receiving such a speech message therefore consumes little data traffic, and once the second terminal has learned what the speech data expresses by reading the text information in the speech message, it can choose not to download the speech data at all. This design greatly reduces traffic consumption, and because the speech data is not downloaded to the client automatically, it also reduces the client memory occupied by unneeded speech data.
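The metadata-first delivery described above could look roughly like the sketch below. It is a simplification under stated assumptions: the class `VoiceServer` and its method names are hypothetical, and the notification-then-download exchange is collapsed into a single `handle_voice_read` call.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class SpeechMetadata:
    message_id: str    # speech message identifier
    duration_s: float  # duration of the speech data, in seconds


class VoiceServer:
    """Keeps the audio and hands it out only when a terminal asks for it."""

    def __init__(self) -> None:
        self._audio: Dict[str, bytes] = {}

    def store(self, message_id: str, speech_data: bytes) -> None:
        self._audio[message_id] = speech_data

    def push(self, message_id: str, duration_s: float, text: str) -> Tuple[SpeechMetadata, str]:
        # Only the metadata and the text travel to the second terminal here.
        return SpeechMetadata(message_id, duration_s), text

    def handle_voice_read(self, message_id: str) -> bytes:
        # Called when the second terminal taps the message and requests the audio.
        return self._audio[message_id]
```

In this sketch the audio bytes leave the server only inside `handle_voice_read`, which mirrors the point that a recipient who is satisfied with the text never pays for the download.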

In one embodiment, step 204 includes: matching the words contained in the text information against the user names corresponding to the instant messaging chat interface, to obtain a user name that matches a word; replacing the word in the text information that matches the user name with the user name; and sending a speech message containing the speech data and the text information after the replacement to the second terminal, so that the second terminal displays the speech message in the instant messaging chat interface.

When sending the speech message to the second terminal, the server can first analyze the text information contained in the speech message. When a word contained in the text information matches a user name corresponding to the instant messaging chat interface, the server automatically replaces that word with the user name, and then sends the speech message containing the replaced text information and the speech data to the second terminal. This automatic matching reduces cases of wrongly transcribed names caused by speech analysis, reduces the need for the user to edit the text manually, and saves communication cost.
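The replacement step can be illustrated with a small sketch. The text does not say how the match is computed at this point, so the example below assumes an approximate string comparison using Python's standard `difflib`, with single-word user names and whitespace-separated text; the function name and the cutoff value are hypothetical.

```python
import difflib
from typing import Iterable


def replace_user_names(text: str, user_names: Iterable[str], cutoff: float = 0.85) -> str:
    """Replace words that closely resemble a chat member's name with that name."""
    by_lower = {name.lower(): name for name in user_names}
    candidates = list(by_lower)
    out = []
    for word in text.split():
        # Assumption: similarity above the cutoff counts as a match.
        hit = difflib.get_close_matches(word.lower(), candidates, n=1, cutoff=cutoff)
        out.append(by_lower[hit[0]] if hit else word)
    return " ".join(out)
```

For example, with members `["Johnathan", "Mei"]`, a transcript containing "Jonathan" would have that word replaced by the member's own spelling, while unrelated words stay untouched.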

In one embodiment, after step 204, the method further includes: obtaining a text correction instruction triggered by the first terminal through the instant messaging chat interface, the text correction instruction carrying the corrected text information of the speech message; and sending a speech message containing the speech data and the corrected text information to the second terminal, so that the second terminal displays the re-sent speech message in the instant messaging chat interface.

After sending the speech message, which contains the collected speech data and the text information converted from it, to the second terminal, the server obtains the text correction instruction triggered by the first terminal through the instant messaging chat interface; the instruction carries the corrected text information of the speech message. The first terminal can trigger the instruction, for example, by tapping the text information to be corrected in the instant messaging chat interface or by tapping a correction button; the triggering mode can be set by the system developer or chosen with user habits in mind. After triggering the text correction instruction, the first terminal can correct the text information as needed. The corrected text information is saved temporarily and sent to the second terminal together with the speech data, so the re-sent speech message contains the speech data and the corrected text information. After receiving the speech message containing the speech data and the corrected text information, the second terminal can display it in the instant messaging chat interface.

After sending a speech message, the first terminal, as the sender, may find that the text information in the message does not match what was said in the speech data or contains wrong characters. It can then trigger a text correction instruction through the instant messaging chat interface, edit and correct the text information in the speech message, and send the corrected text information and the speech data to the second terminal. When the second terminal receives the speech message containing the speech data and the corrected text information, it can learn what the first terminal actually meant by reading the corrected text. In this way the first terminal can promptly fix inaccurate or ambiguous content in the text information of a sent speech message without recording the speech data again. Moreover, although the speech data is also re-sent along with the corrected text information, the speech data is in fact stored on the server, so re-sending does not consume traffic for the speech data itself. Only the corrected text information is actually updated in the transmitted data, so re-sending does not cause serious traffic consumption.
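A server-side handler for the correction instruction might be sketched as follows. The record layout, the in-memory `store` dictionary and the idea of giving the re-sent message a fresh identifier are all assumptions for illustration; the point shown is that the stored audio is reused and only the text changes.

```python
from dataclasses import dataclass, replace
from typing import Dict


@dataclass(frozen=True)
class SpeechMessage:
    message_id: str
    speech_data: bytes
    text: str
    corrected: bool = False


@dataclass(frozen=True)
class CorrectionInstruction:
    message_id: str      # identifies the message whose text is being corrected
    corrected_text: str  # text information after correction


def apply_correction(
    store: Dict[str, SpeechMessage], instr: CorrectionInstruction, new_id: str
) -> SpeechMessage:
    """Build the message to re-send, reusing the audio already held by the server."""
    original = store[instr.message_id]
    # Assumption: the re-sent message gets its own identifier (new_id).
    resent = replace(original, message_id=new_id, text=instr.corrected_text, corrected=True)
    store[new_id] = resent
    return resent
```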

Further, in one embodiment, after the speech message containing the speech data and the corrected text information is sent, the original speech message corresponding to the corrected speech message can be recalled.

After the speech message containing the speech data and the corrected text information has been sent, the server looks up the original speech message corresponding to it by the identifier of the re-sent speech message and recalls the original speech message, so that neither the first terminal nor the second terminal displays the uncorrected speech message any longer. When the uncorrected original speech message is recalled, a prompt can be shown to the first terminal, such as the text "You have modified the automatically recognized speech content", so that the first terminal knows the correction succeeded, as shown in Fig. 3.

When both the uncorrected original speech message and the re-sent corrected speech message are present, it is hard for the second terminal to tell which message matters and which one is accurately expressed. For the second terminal, recalling the uncorrected original speech message reduces repeated reading of invalid messages, and no time is spent working out which of the many messages displayed in the instant messaging chat interface is the one the first terminal corrected.
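The recall step can be sketched on a simple in-memory chat log. The dictionary layout and the function name are assumptions, and the notice string paraphrases the prompt quoted in the description.

```python
from typing import Dict, List, Tuple

# Prompt paraphrased from the description; shown on the first terminal.
RECALL_NOTICE = "You have modified the automatically recognized speech content"


def resend_and_recall(
    chat: List[Dict[str, str]], original_id: str, corrected: Dict[str, str]
) -> Tuple[List[Dict[str, str]], str]:
    """Drop the uncorrected message from the chat and append the corrected one."""
    remaining = [m for m in chat if m["id"] != original_id]
    return remaining + [corrected], RECALL_NOTICE
```

The function returns a new list rather than mutating the input, which keeps the original log available if the recall has to be audited or undone.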

In one embodiment, the above voice communication method further includes: matching the words contained in the text information against the user name of the second terminal in the instant messaging chat interface; and, if the match succeeds, marking the speech message, and/or sending a prompt message to the second terminal.

When chatting through the instant messaging chat interface, each participating user has a user name. The user name may be a real name the user has set, or a social nickname the user has chosen for personal reasons; in a group chat, it may also be the in-group remark name the user has set for that group. When the first terminal sends a speech message containing speech data and text information to the second terminal, the server can analyze the text information contained in the speech message and match the words in it against the user name of the second terminal. If the match succeeds, the server marks the speech message, sends a prompt message to the second terminal, or does both.

When many messages (for example, speech messages or text messages) are received in the instant messaging chat interface, the messages that concern a user often cannot be found at once, and every message has to be checked one by one. When the analysis matches the user name that the user goes by in the chat, and the message is then marked or a prompt message is sent to say that the message concerns the user, the relevant content can be seen quickly and the time spent filtering unneeded information is saved.
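A minimal sketch of the mark-and/or-prompt decision follows. It assumes plain substring matching on the recipient's user name, and the type and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class OutgoingMessage:
    text: str
    marked: bool = False  # e.g. rendered as a highlight on the second terminal


def tag_for_recipient(text: str, recipient_name: str) -> Tuple[OutgoingMessage, Optional[str]]:
    """Mark the message and build a prompt when the recipient's name appears in the text."""
    message = OutgoingMessage(text)
    if recipient_name and recipient_name in text:
        message.marked = True
        return message, "Someone @ you"
    return message, None
```

This sketch does both actions on a match; an implementation could equally choose only the mark or only the prompt, as the text permits.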

In one embodiment, the above voice communication method further includes: matching the words contained in the text information against the names of the applications included in the group corresponding to the instant messaging chat interface; and, if the match succeeds, sending an application prompt message to the second terminal.

In a group chat through the instant messaging chat interface, the group can include applications added by group members or custom applications preconfigured in the group; the applications included in a group are generally called light applications. When the first terminal sends a speech message containing speech data and text information to the second terminal, the server can analyze the text information contained in the speech message and match the words in it against the names of the applications included in the current chat group. If the match succeeds, an application prompt message can be sent to the second terminal to tell the group members that a message in the group concerns an application.

Adding a mark to the application prompt or sending an additional prompt message quickly tells the recipient that a message relates to an application. In particular, when a group activity needs to be initiated through a specific light application in the group, group members can learn of it immediately from the marked application message and handle the message accordingly. This saves the time of filtering through many messages and lets members get the important message quickly.
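The application-name check can be sketched in the same spirit. The example assumes case-insensitive substring matching; the function names, the application names and the prompt wording are hypothetical.

```python
from typing import Iterable, List, Optional


def find_mentioned_apps(text: str, group_apps: Iterable[str]) -> List[str]:
    """Return the group's application names that occur in the text."""
    lowered = text.lower()
    return [app for app in group_apps if app.lower() in lowered]


def build_application_prompt(text: str, group_apps: Iterable[str]) -> Optional[str]:
    """Build an application prompt message, or None when nothing matches."""
    hits = find_mentioned_apps(text, group_apps)
    if not hits:
        return None
    return "This message mentions group applications: " + ", ".join(hits)
```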

In one embodiment, as shown in Fig. 4, a voice communication method is provided. The method is illustrated as applied to the terminal shown in Fig. 1, and includes:

Step 402: obtain a speech message sent by a first terminal through an instant messaging chat interface, where the speech message includes speech data and text information, the speech data is obtained by the first terminal triggering voice collection through the instant messaging chat interface, and the text information is converted from the speech data.

When sending a speech message through the instant messaging chat interface, the first terminal can trigger the recording function by tapping the talk button on the interface. After a segment of speech data is recorded, it can be converted into text information on the terminal, or uploaded to the server, which converts it to obtain the corresponding text information. In general, the choice between client recognition and server recognition is not made by the user; the instant messaging chat tool decides automatically based on the current user's network conditions, recognition rate requirements and so on. For example, server recognition is used when the network is good or a high recognition rate is required, and client recognition is used when the network is poor or the recognition rate requirement is low. Alternatively, the choice can be designed to depend on the application scenario; the specific recognition scheme can depend on the developer or on user needs. The advantages of client recognition are that it consumes no data traffic, is fast, and does not depend on the network. The advantages of server recognition are a higher recognition rate and, because it does not depend on a local recognition library, a smaller memory footprint on the client.

Step 404: detect whether a text correction instruction triggered by the first terminal through the instant messaging chat interface has been obtained. If so, perform step 406; if not, perform step 408.

After the first terminal has sent a speech message containing speech data and the corresponding text information to the second terminal, and the speech message is displayed on the first terminal, the first terminal can trigger a text correction instruction by tapping the corresponding text information. On the first terminal, a mark can be displayed on the uncorrected original speech message to indicate that its text information can be corrected, such as a "click to correct" mark. After the first terminal taps the "click to correct" button and edits the text information, a mark can be displayed on the re-sent corrected speech message to indicate that the message has been corrected, such as the text "corrected", as shown in Fig. 5.

Step 406: obtain the speech data and the corrected text information carried in the correction instruction.

When a text correction instruction triggered by the first terminal through the instant messaging chat interface is obtained, the corresponding original speech message can be found using the speech message identifier carried in the instruction, and the text information in the original speech message is replaced with the corrected text information carried in the instruction. Because the server can find the original speech message, and hence the speech data it contains, from the message identifier carried in the instruction, the first terminal does not need to upload the speech data again when it re-sends the corrected speech message to the second terminal. When the server sends the corrected speech message to the second terminal, only the corrected text information is updated; since text takes few bytes, the traffic spent re-sending the corrected speech message is essentially negligible.

Step 408, intellectual analysis is carried out to text information.

Step 410, when the word included in text information and second terminal are in the corresponding user in instant messaging chat interface During name-matches success, then step 412 is performed.

Step 412, send the speech message to the second terminal so that the second terminal displays the speech message in the instant messaging chat interface, and mark the speech message and/or additionally send a prompt message to the second terminal.

Regardless of whether the first terminal has corrected the speech message, the server performs an intelligent analysis when it sends the speech message to the second terminal. The server obtains the text information contained in the speech message and, when the analysis finds a word in the text information that is suspected to be a person's name, matches this word against the user names corresponding to the instant messaging chat interface. The matching can be an exact match on the Chinese characters, or a fuzzy match after the word is converted to pinyin, to determine whether a name is present in the text information. If one is present, that is, the match succeeds, the user corresponding to the matched user name can be prompted. One way is to mark the speech message when it is sent, for example by displaying a red dot or a text mark such as "someone @ you". Alternatively, after the match succeeds and the speech message has been sent, a prompt message is additionally sent to the second terminal; the prompt message can be a system message or an ordinary text message, as shown in Figure 6.
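The exact and pinyin-based matching described above might look like the following sketch. The tiny `PINYIN` table, the example characters and the function names are all illustrative assumptions; a real system would use a full pinyin dictionary, and the specification does not prescribe how candidate names are located in the text. Here every window of the same length as a user name is simply compared.

```python
from typing import Dict, List, Optional

# Toy pinyin table (assumption): covers only the characters used in the example.
PINYIN: Dict[str, str] = {
    "周": "zhou", "杰": "jie", "节": "jie", "李": "li", "明": "ming",
}


def to_pinyin(word: str) -> Optional[str]:
    """Return space-separated pinyin, or None if any character is unknown."""
    syllables = [PINYIN.get(ch) for ch in word]
    if None in syllables:
        return None
    return " ".join(syllables)


def find_mentioned_users(text: str, user_names: List[str]) -> List[str]:
    """Return user names that appear in the text exactly or as homophones."""
    mentioned = []
    for name in user_names:
        if name in text:                      # exact character match
            mentioned.append(name)
            continue
        name_py = to_pinyin(name)
        if name_py is None:
            continue
        n = len(name)
        for i in range(len(text) - n + 1):    # fuzzy match on pinyin
            if to_pinyin(text[i:i + n]) == name_py:
                mentioned.append(name)
                break
    return mentioned


def build_delivery(text: str, recipient_name: str, group_names: List[str]) -> dict:
    """Attach a prompt mark when the recipient's name is matched in the text."""
    hit = recipient_name in find_mentioned_users(text, group_names)
    return {"text": text, "mark": "someone @ you" if hit else None}
```

With this sketch, a recipient whose name is matched receives the message with the mark set, while other group members receive it unmarked.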

Step 414, when a word contained in the text information successfully matches an application name contained in the group corresponding to the instant messaging chat interface, perform step 416.

Step 416, send the speech message to the second terminal so that the second terminal displays the speech message in the instant messaging chat interface, and additionally send an application prompt message to the second terminal.

As in step 410, the server obtains the text information contained in the speech message. While analyzing whether the text information contains a word suspected to be a person's name, the server can also analyze whether it contains a word that matches the name of a light application in the group. The matching can be an exact match on the Chinese characters or a fuzzy match after the word is converted to pinyin. If the match succeeds, then after the speech message has been sent to the second terminal and displayed in its instant messaging chat interface, an application prompt message can additionally be sent to the second terminal; the prompt message can be a system message or an ordinary text message. When the second terminal receives a speech message containing a light application name, the name can be displayed highlighted, and clicking it on the second terminal opens the corresponding light application operation interface, where related operations can be performed.
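As a rough sketch of steps 414 and 416, the following code locates application names in the text (exact matching only, for brevity), returns the spans a client could highlight, and builds one prompt message per matched application. The function names, the message dictionary layout and the example application names are assumptions for illustration.

```python
from typing import List, Tuple


def find_app_mentions(text: str, app_names: List[str]) -> List[Tuple[str, int, int]]:
    """Return (app_name, start, end) spans for every exact occurrence in the text."""
    mentions = []
    for app in app_names:
        if not app:                 # skip empty names to avoid an endless loop
            continue
        start = text.find(app)
        while start != -1:
            mentions.append((app, start, start + len(app)))
            start = text.find(app, start + len(app))
    return sorted(mentions, key=lambda m: m[1])


def app_prompt_messages(text: str, app_names: List[str]) -> List[dict]:
    """Build one hypothetical system prompt per distinct matched application."""
    seen: List[str] = []
    for app, _, _ in find_app_mentions(text, app_names):
        if app not in seen:
            seen.append(app)
    return [
        {"type": "system", "body": f"Speech message mentions the application: {app}"}
        for app in seen
    ]
```

The returned spans give the receiving client enough information to highlight the name and attach a click handler that opens the application.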

When a speech message is sent, the text information is sent to the recipient together with it, so the recipient can learn what the speech data expresses by reading the text information, without downloading and playing the speech data. Furthermore, in a group chat, speech data cannot directly show whether it concerns a particular member of the group. By matching the text information, if a member of the group is matched, the speech message displayed on that member's instant messaging chat interface carries a prompt mark indicating that the message concerns that member; if an application is matched, a prompt message is additionally received. The second terminal can therefore quickly learn the content of the speech data when it receives a speech message and can promptly pick out the messages that concern it, which reduces message filtering. Not having to download and play the speech data each time also reduces traffic consumption and the memory that the client would otherwise use to cache played speech data. The first terminal, for its part, can edit and correct the text information in a speech message it has already sent, so errors in a sent message can be fixed promptly; resending the corrected message conveys the intended content more accurately and in time, which reduces unnecessary disputes in communication.

As shown in Figure 7, in one embodiment, a voice communication method is provided. The method is described as applied to the terminal shown in Figure 1, and includes:

Step 702, trigger voice collection in the instant messaging chat interface and collect speech data.

After a login request succeeds, the first terminal can carry out instant messaging with other terminals through the instant messaging chat interface. Instant messaging generally supports both voice communication and text communication. When voice communication is selected, the voice collection function can be triggered on the chat interface to collect speech data.

Step 704, convert the speech data into text information.

After the speech data is collected, it can be converted into text information by a speech recognition function. An offline speech recognition package is integrated in the application package stored on the client, so speech recognition can be performed either on the client or on the server.

Step 706, generate a speech message from the speech data and the text information.

Step 708, send the speech message.

A speech message is generated from the speech data and the text information converted from it, and is then sent to the second terminal through the server. When the second terminal receives this speech message, it therefore receives both the speech data and the text information converted from the speech data.
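Steps 704 to 706 can be illustrated with a short sketch. The recognizer is passed in as a plain callable because the specification does not name a concrete speech recognition API; `SpeechMessage`, `generate_speech_message` and the use of a random hexadecimal identifier are assumptions for the example.

```python
import uuid
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class SpeechMessage:
    """Hypothetical speech message: audio plus the text converted from it."""
    message_id: str
    speech_data: bytes
    text: str


def generate_speech_message(speech_data: bytes,
                            recognize: Callable[[bytes], str]) -> SpeechMessage:
    """Convert the collected audio to text and bundle both into one message.

    `recognize` stands in for either the offline recognition package on the
    client or a server-side recognizer.
    """
    text = recognize(speech_data)
    return SpeechMessage(message_id=uuid.uuid4().hex,
                         speech_data=speech_data,
                         text=text)
```

A caller would pass the bytes collected in step 702 and hand the resulting message to whatever transport sends it to the server in step 708.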

When the second terminal receives the speech message sent by the first terminal, it can learn what the speech data conveys by reading the text information contained in the speech message directly, without playing the speech data. This way of communicating reduces the traffic the receiving second terminal would spend on receiving and playing speech data and, because the speech data need not be played separately, lowers the demands on network quality.

In one embodiment, after step 708, the method further includes a step of correcting the text in the sent speech message. As shown in Figure 8, this step includes:

Step 802, trigger a word correction request through the instant messaging chat interface.

Step 804, enter a word correction operation interface according to the word correction request, and correct the text information in the message through the word correction operation interface.

Step 806, generate a speech message again from the speech data and the corrected text information.

Step 808, send the regenerated speech message.

After the first terminal has sent the second terminal, through the instant messaging chat interface, a speech message containing the speech data and the text information converted from it, the first terminal can trigger a word correction request through the instant messaging chat interface if it finds that the converted text information differs in meaning from what the speech data expresses, or if the text information needs to be edited for other reasons. After the request is triggered, the word correction operation interface shown in Figure 9 is entered. The first terminal can edit and correct the text information through this interface. Once the edit is confirmed, the first terminal generates a new speech message from the speech data contained in the original speech message and the edited text information, and sends it to the second terminal through the server. When the first terminal finds an error in a speech message, it can thus correct the text information promptly, which improves the accuracy of the speech message.

In one embodiment, step 808 above includes: recalling the speech message that was displayed in the instant messaging chat interface before the correction.

When the first terminal resends the corrected speech message to the second terminal, the original, uncorrected speech message sent before the correction can be recalled, so that it is no longer displayed in the instant messaging chat interface of either the first terminal or the second terminal. Recalling the speech message sent before the correction keeps invalid messages from piling up on the interface. The second terminal also has fewer useless messages to filter out when reading, which saves time in finding the useful ones.
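Steps 806 and 808, together with the recall, can be sketched on a simple chat log. The list-of-dictionaries representation, the field names and the function name are assumptions for illustration; the point is that the new message reuses the original speech data and the uncorrected message disappears from the displayed list.

```python
from typing import List


def resend_corrected(chat_log: List[dict], original_id: str,
                     corrected_text: str, new_id: str) -> List[dict]:
    """Return a new chat log in which the uncorrected message is recalled
    and a regenerated message (same audio, corrected text) is appended."""
    original = None
    for message in chat_log:
        if message["message_id"] == original_id:
            original = message
            break
    if original is None:
        raise KeyError(f"unknown message id: {original_id}")

    corrected = {
        "message_id": new_id,
        "speech_data": original["speech_data"],  # audio is reused, not re-recorded
        "text": corrected_text,
        "corrected": True,                       # lets the UI show a "corrected" mark
    }
    remaining = [m for m in chat_log if m["message_id"] != original_id]
    return remaining + [corrected]
```

Applying the same function to the chat logs of both terminals keeps their displayed histories consistent after a correction.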

In one embodiment, step 806 above includes: matching the words contained in the text information against the user names corresponding to the instant messaging chat interface to obtain the user name that matches a word; replacing the word in the text information that matches the user name with the user name; and generating the speech message from the speech data and the text information after the user name replacement.

When a speech message is generated from the speech data and the text information, the offline speech recognition package integrated in the client can match the words contained in the text information against the user names corresponding to the instant messaging chat interface. The matching can be divided into two stages. First, basic grammatical correction is applied to the text information according to Chinese grammar, similar to an English grammar correction system, to fix obvious recognition errors in common expressions, subject-verb-object structure and the like. Second, the offline speech recognition package analyzes the nouns in the recognition result that are suspected to be names and matches them against the group member names, either by an exact match on the Chinese characters or by a fuzzy match after converting the noun to pinyin. After a successful match the name is corrected, which prevents names written with homophonous but wrong characters from appearing in the recognition result. For example, if the text information is found to contain a homophone of "Zhou Jie" written with different characters, and the user names corresponding to the instant messaging chat interface include "Zhou Jie", the homophone is automatically corrected to "Zhou Jie". That is, the word in the text information that matches the user name is replaced with the user name, and a new speech message is generated from the speech data and the text information after the replacement. This way of correcting greatly reduces wrong characters, largely avoids getting names wrong during communication, and reduces unnecessary later corrections of the speech message.
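The second stage, replacing a homophone with the matching member name, can be sketched as below. The toy `PINYIN` table, the specific characters and the function names are illustrative assumptions; the grammar-correction stage is not shown, and a production system would need a complete pinyin dictionary and a better way of locating candidate names than a sliding window.

```python
from typing import Dict, List, Optional

# Toy pinyin table (assumption): only the characters needed for the example.
PINYIN: Dict[str, str] = {"周": "zhou", "杰": "jie", "节": "jie"}


def to_pinyin(word: str) -> Optional[str]:
    """Return space-separated pinyin, or None if any character is unknown."""
    syllables = [PINYIN.get(ch) for ch in word]
    if None in syllables:
        return None
    return " ".join(syllables)


def correct_names(text: str, user_names: List[str]) -> str:
    """Replace any same-length window that sounds like a user name
    but is written with different characters by the user name itself."""
    chars = list(text)
    for name in user_names:
        name_py = to_pinyin(name)
        if name_py is None:
            continue
        n = len(name)
        i = 0
        while i <= len(chars) - n:
            window = "".join(chars[i:i + n])
            if window != name and to_pinyin(window) == name_py:
                chars[i:i + n] = list(name)   # swap in the correct characters
                i += n
            else:
                i += 1
    return "".join(chars)
```

The corrected string would then be combined with the unchanged speech data to generate the speech message.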

In one embodiment, a voice communication device is provided. As shown in Figure 10, the device includes:

An acquisition module 1002, configured to obtain the speech message sent by the first terminal through the instant messaging chat interface, where the speech message includes speech data and text information, the speech data is obtained by the first terminal triggering voice collection through the instant messaging chat interface, and the text information is converted from the speech data;

A sending module 1004, configured to send the speech message to the second terminal so that the second terminal displays the speech message in the instant messaging chat interface.

In one embodiment, the sending module 1004 is further configured to send the metadata corresponding to the speech message, together with the text information, to the second terminal, where the metadata includes the speech message identifier and the duration of the speech data.
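A payload of this kind might be assembled as in the sketch below. It assumes, purely for illustration, that the speech data is raw 16 kHz, 16-bit, mono PCM (so the duration follows from the byte count) and that the audio bytes are fetched separately by identifier when the recipient chooses to play them; neither the audio format nor the field names come from the specification.

```python
def pcm_duration_seconds(speech_data: bytes, sample_rate: int = 16000,
                         sample_width: int = 2, channels: int = 1) -> float:
    """Duration of raw PCM audio: bytes / (rate * bytes per sample * channels)."""
    return len(speech_data) / (sample_rate * sample_width * channels)


def build_metadata_payload(message_id: str, speech_data: bytes, text: str) -> dict:
    """Payload sent to the second terminal: metadata plus text, no audio bytes."""
    return {
        "metadata": {
            "message_id": message_id,
            "duration_seconds": round(pcm_duration_seconds(speech_data), 1),
        },
        "text": text,
    }
```

The identifier in the metadata is what a client would later use to request the speech data for playback.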

In one embodiment, the voice communication device above further includes a correction instruction acquisition module, configured to obtain the word correction instruction triggered by the first terminal through the instant messaging chat interface, where the word correction instruction carries the corrected text information of the speech message. The sending module 1004 is further configured to send the speech message containing the speech data and the corrected text information to the second terminal, so that the second terminal displays the resent speech message in the instant messaging chat interface.

In one embodiment, the sending module 1004 above includes a matching module and a replacement module. The matching module is configured to match the words contained in the text information against the user names corresponding to the instant messaging chat interface to obtain the user name that matches a word. The replacement module is configured to replace the word in the text information that matches the user name with the user name, and to send the speech message containing the speech data and the text information after the user name replacement to the second terminal, so that the second terminal displays the speech message in the instant messaging chat interface.

In one embodiment, the voice communication device above further includes a user name matching module, configured to match the words contained in the text information against the user name corresponding to the second terminal in the instant messaging chat interface and, if the match succeeds, to mark the speech message and/or send a prompt message to the second terminal.

In one embodiment, the voice communication device above further includes an application name matching module, configured to match the words contained in the text information against the application names contained in the group corresponding to the instant messaging chat interface and, if the match succeeds, to send an application prompt message to the second terminal.

In one embodiment, a voice communication device is also provided. As shown in Figure 11, the device includes:

A speech data collection module 1102, configured to trigger voice collection in the instant messaging chat interface and collect speech data;

A speech data conversion module 1104, configured to convert the speech data into text information;

A speech message generation module 1106, configured to generate a speech message from the speech data and the text information;

A speech message sending module 1108, configured to send the speech message.

In one embodiment, the speech message generation module 1106 is further configured to match the words contained in the text information against the user names corresponding to the instant messaging chat interface to obtain the user name that matches a word; replace the word in the text information that matches the user name with the user name; and generate the speech message from the speech data and the text information after the user name replacement.

In one embodiment, the voice communication device above further includes a word correction module, configured to trigger a word correction request through the instant messaging chat interface; enter a word correction operation interface according to the word correction request and correct the text information in the message through the word correction operation interface; generate a speech message again from the speech data and the corrected text information; and send the regenerated speech message.

In one embodiment, the voice communication device above further includes a recall module, configured to recall the speech message displayed in the instant messaging chat interface before the correction when a speech message is generated again from the speech data and the corrected text information.

Figure 12 is a schematic diagram of the internal structure of a computer device in one embodiment. Referring to Figure 12, the computer device includes a processor, a non-volatile storage medium, an internal memory, a display screen and a network interface connected through a system bus. The non-volatile storage medium of the computer device can store an operating system and a computer program implementing the voice communication device; when executed, the computer program can cause the processor to perform a voice communication method. The processor provides computing and control capability and supports the operation of the whole computer device. The internal memory can also store a computer program which, when executed by the processor, can cause the processor to perform the voice communication method. The network interface is used for network communication. The display screen is used to display application interfaces, for example the instant messaging chat interface or the word correction operation interface. The display screen can be a liquid crystal display or an electronic ink display. The input device of the computer device can be a touch layer covering the display screen, a button, trackball or trackpad on the housing of the computer device, or an external keyboard, trackpad or mouse. The touch layer and the display screen together form a touch screen.

Those skilled in the art will understand that the structure shown in Figure 12 is only a block diagram of the part of the structure related to the solution of the present invention and does not limit the terminal to which the solution is applied. A specific terminal may include more or fewer components than shown in the figure, combine certain components, or have a different arrangement of components.

In one embodiment, the processor above performs the following steps when executing the computer program: obtaining the speech message sent by the first terminal through the instant messaging chat interface, where the speech message includes speech data and text information, the speech data is obtained by the first terminal triggering voice collection through the instant messaging chat interface, and the text information is converted from the speech data; and sending the speech message to the second terminal so that the second terminal displays the speech message in the instant messaging chat interface.

In another embodiment, the processor above performs the following steps when executing the computer program: triggering voice collection in the instant messaging chat interface and collecting speech data; converting the speech data into text information; generating a speech message from the speech data and the text information; and sending the speech message.

In one embodiment, a computer-readable storage medium is provided, on which a computer program is stored. When executed by a processor, the computer program implements the following steps: obtaining the speech message sent by the first terminal through the instant messaging chat interface, where the speech message includes speech data and text information, the speech data is obtained by the first terminal triggering voice collection through the instant messaging chat interface, and the text information is converted from the speech data; and sending the speech message to the second terminal so that the second terminal displays the speech message in the instant messaging chat interface.

In one embodiment, after the step of sending the speech message to the second terminal, the computer program, when executed by the processor, further implements: obtaining the word correction instruction triggered by the first terminal through the instant messaging chat interface, where the word correction instruction carries the corrected text information of the speech message; and sending the speech message containing the speech data and the corrected text information to the second terminal, so that the second terminal displays the resent speech message in the instant messaging chat interface.

In one embodiment, the step of sending the speech message to the second terminal, performed when the computer program is executed by the processor, includes: sending the metadata corresponding to the speech message, together with the text information, to the second terminal, where the metadata includes the speech message identifier and the duration of the speech data.

In one embodiment, the computer program, when executed by the processor, further implements the following steps: matching the words contained in the text information against the user name corresponding to the second terminal in the instant messaging chat interface; and, if the match succeeds, marking the speech message and/or sending a prompt message to the second terminal.

In one embodiment, the step of sending the speech message to the second terminal, performed when the computer program is executed by the processor, includes: matching the words contained in the text information against the user names corresponding to the instant messaging chat interface to obtain the user name that matches a word; replacing the word in the text information that matches the user name with the user name; and sending the speech message containing the speech data and the text information after the user name replacement to the second terminal, so that the second terminal displays the speech message in the instant messaging chat interface.

In one embodiment, the computer program, when executed by the processor, further implements the following steps: matching the words contained in the text information against the application names contained in the group corresponding to the instant messaging chat interface; and, if the match succeeds, sending an application prompt message to the second terminal.

In one embodiment, a computer-readable storage medium is also provided, on which a computer program is stored. When executed by a processor, the computer program implements the following steps: triggering voice collection in the instant messaging chat interface and collecting speech data; converting the speech data into text information; generating a speech message from the speech data and the text information; and sending the speech message.

In one embodiment, after the step of sending the speech message above, the computer program, when executed by the processor, further implements: triggering a word correction request through the instant messaging chat interface; entering a word correction operation interface according to the word correction request and correcting the text information in the message through the word correction operation interface; generating a speech message again from the speech data and the corrected text information; and sending the regenerated speech message.

In one embodiment, the step of sending the regenerated speech message above, performed when the computer program is executed by the processor, includes: recalling the speech message that was displayed in the instant messaging chat interface before the correction.

In one embodiment, the step of generating a speech message from the speech data and the text information above, performed when the computer program is executed by the processor, includes: matching the words contained in the text information against the user names corresponding to the instant messaging chat interface to obtain the user name that matches a word; replacing the word in the text information that matches the user name with the user name; and generating the speech message from the speech data and the text information after the user name replacement.

Those of ordinary skill in the art will understand that all or part of the processes of the methods in the embodiments above can be completed by a computer program instructing the relevant hardware. The computer program can be stored in a computer-readable storage medium and, when executed, may include the processes of the method embodiments above. The storage medium can be a non-volatile storage medium such as a magnetic disk, an optical disc or a read-only memory (Read-Only Memory, ROM), or a random access memory (Random Access Memory, RAM), etc.

The technical features of the embodiments above can be combined arbitrarily. For brevity, not all possible combinations of the technical features in the embodiments above have been described; however, as long as a combination of these technical features involves no contradiction, it should be considered within the scope of this specification.

The embodiments above express only several implementations of the present invention. Their description is relatively specific and detailed, but they should not therefore be construed as limiting the scope of the patent. It should be noted that those of ordinary skill in the art can make various modifications and improvements without departing from the concept of the present invention, and these all fall within the scope of protection of the present invention. The scope of protection of this patent shall therefore be determined by the appended claims.

Claims

1. A voice communication method, the method comprising:

obtaining a speech message sent by a first terminal through an instant messaging chat interface, wherein the speech message includes speech data and text information, the speech data is obtained by the first terminal triggering voice collection through the instant messaging chat interface, and the text information is converted from the speech data;

sending the speech message to a second terminal, so that the second terminal displays the speech message in the instant messaging chat interface.

2. The method according to claim 1, characterized in that sending the speech message to the second terminal comprises:

sending the metadata corresponding to the speech message, together with the text information, to the second terminal, wherein the metadata includes the speech message identifier and the duration of the speech data.

3. The method according to claim 1, characterized in that, after sending the speech message to the second terminal, the method further comprises:

obtaining a word correction instruction triggered by the first terminal through the instant messaging chat interface, wherein the word correction instruction carries the corrected text information of the speech message;

sending a speech message containing the speech data and the corrected text information to the second terminal, so that the second terminal displays the resent speech message in the instant messaging chat interface.

4. The method according to claim 1, characterized in that sending the speech message to the second terminal comprises:

matching the words contained in the text information against the user names corresponding to the instant messaging chat interface to obtain the user name that matches a word;

sending the speech message containing the speech data and the text information after replacement with the user name to the second terminal, so that the second terminal displays the speech message in the instant messaging chat interface.

5. The method according to claim 1, characterized in that the method further comprises:

matching the words contained in the text information against the user name corresponding to the second terminal in the instant messaging chat interface;

6. The method according to claim 1, characterized in that the method further comprises:

matching the words contained in the text information against the application names contained in the group corresponding to the instant messaging chat interface;

7. A voice communication method, the method comprising:

converting the speech data into text information;

generating a speech message from the speech data and the text information;

sending the speech message.

8. The method according to claim 7, characterized in that, after sending the speech message, the method further comprises:

entering a word correction operation interface according to the word correction request, and correcting the text information in the message through the word correction operation interface;

sending the regenerated speech message.

9. The method according to claim 8, characterized in that sending the regenerated speech message comprises:

10. The method according to claim 7, characterized in that generating a speech message from the speech data and the text information comprises:

11. A voice communication device, characterized in that the device comprises:

an acquisition module, configured to obtain a speech message sent by a first terminal through an instant messaging chat interface, wherein the speech message includes speech data and text information, the speech data is obtained by the first terminal triggering voice collection through the instant messaging chat interface, and the text information is converted from the speech data;

a sending module, configured to send the speech message to a second terminal, so that the second terminal displays the speech message in the instant messaging chat interface.

12. A voice communication device, characterized in that the device comprises:

a speech message sending module, configured to send the speech message.

13. A computer device, comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor, when executing the computer program, implements the steps of the method according to any one of claims 1 to 8.

14. A computer-readable storage medium on which a computer program is stored, characterized in that the computer program, when executed by a processor, implements the steps of the method according to any one of claims 1 to 8.