CN108492833A

CN108492833A - Voice information acquisition method, instant messaging system, mobile terminal and storage medium

Info

Publication number: CN108492833A
Application number: CN201810277436.3A
Authority: CN
Inventors: 杨威
Original assignee: Jiangxi University of Technology
Current assignee: Jiangxi University of Technology
Priority date: 2018-03-30
Filing date: 2018-03-30
Publication date: 2018-09-04

Abstract

The present invention provides a voice information acquisition method, an instant messaging system, a mobile terminal and a storage medium. The method includes: starting a timer when a voice input instruction is received, acquiring current voice data in real time, and dynamically displaying a voice input screen according to the current voice data; when a voice input stop instruction is received, closing the voice input screen and stopping acquisition of the current voice data to obtain voice input data, performing format conversion on the voice input data, and storing the converted voice input data; obtaining a locally stored preset icon image, and obtaining the sound wave data and the voice input time of the voice input data; and displaying the sound wave data as a sound spectrum image and the voice input time as a numerical value on the preset icon image. By combining a sound spectrum image with a numerical value, the invention gives each item of voice input data a distinct display, making it easier for the user to tell different voice input data apart.

Description

Voice information acquisition method, instant messaging system, mobile terminal and storage medium

Technical field

The present invention relates to the field of communication technology, and in particular to a voice information acquisition method, an instant messaging system, a mobile terminal and a storage medium.

Background technology

Voice acquisition is the stage that precedes speech recognition: the user's speech is captured as voice data, voice features are extracted from the captured data, and speech recognition is performed on those features in order to determine what the user said or to identify the user. At present, voice is acquired by a voice acquisition device (such as a microphone) on a terminal device (user equipment such as a smartphone or tablet computer), which captures the user's speech to obtain voice data; feature extraction is then performed on the captured voice data. Voice information acquisition is used especially frequently in instant messaging systems, so users place ever higher demands on its convenience.

In existing voice information acquisition methods, the screen displayed during voice input is monotonous, which degrades the user experience. Moreover, once voice data has been recorded and stored, every stored item of voice data is shown with the same display icon, which makes it difficult for the user to tell different voice data apart.

Summary of the invention

In view of this, an object of the embodiments of the present invention is to provide a voice information acquisition method, an instant messaging system, a mobile terminal and a storage medium that make it easy for the user to distinguish different voice data.

In a first aspect, the present invention provides a voice information acquisition method, the method including:

starting a timer when a voice input instruction is received, acquiring current voice data in real time, and dynamically displaying a voice input screen according to dynamic features in the current voice data;

when a voice input stop instruction is received, closing the voice input screen and stopping acquisition of the current voice data to obtain voice input data, performing format conversion on the voice input data, and storing the converted voice input data;

obtaining a locally stored preset icon image, and obtaining the sound wave data and the voice input time of the voice input data;

displaying the sound wave data as a sound spectrum image and the voice input time as a numerical value, respectively, on the preset icon image.

In the above voice information acquisition method, displaying the sound wave data as a sound spectrum image gives each item of voice input data a distinct display, which makes it easier for the user to later tell different voice input data apart. Obtaining the voice input time and displaying it as a numerical value makes this distinction easier still. Dynamically displaying the voice input screen improves the user experience and keeps voice input from feeling tedious.

Further, the step of dynamically displaying the voice input screen according to the dynamic features in the current voice data includes:

obtaining, in real time, the current elapsed time together with a locally stored preset background image and recording icon, and displaying an image according to the current elapsed time, the preset background image and the recording icon;

obtaining, in real time, the voice decibel information in the current voice data, and dynamically rendering the recording icon according to the voice decibel information.

Further, the step of dynamically rendering the recording icon according to the voice decibel information includes:

determining the decibel level of the voice decibel information, and determining a render region according to the decibel level;

obtaining a locally stored render color, and coloring the render region according to the render color.

Further, after the step of acquiring the current voice data in real time, the method further includes:

determining whether the current decibel value in the current voice data received within a preset time stays below a preset decibel value;

if so, closing the voice input screen and stopping acquisition of the current voice data.

Further, the step of performing format conversion on the voice input data and storing the converted voice input data includes:

converting the voice input data to AMR format, and performing text recognition on the voice input data to obtain a feature word, the feature word being the word repeated most often in the voice input data;

renaming the format-converted voice input data according to the feature word.

Alternatively, the step of performing format conversion on the voice input data and storing the converted voice input data includes:

converting the voice input data to AMR format, and obtaining the current time;

renaming the format-converted voice input data according to the current time.

In a second aspect, the present invention provides an instant messaging system, including:

an instant messaging platform, configured to receive operation instructions from a user;

a voice acquisition device, communicatively connected to the instant messaging platform and configured to acquire or play voice data according to the operation instruction received by the instant messaging platform;

the voice acquisition device including:

a first acquisition module, configured to start a timer when a voice input instruction is received, acquire current voice data in real time, and dynamically display a voice input screen according to dynamic features in the current voice data;

a storage module, configured to, when a voice input stop instruction is received, close the voice input screen and stop acquisition of the current voice data to obtain voice input data, perform format conversion on the voice input data, and store the converted voice input data;

a second acquisition module, configured to obtain a locally stored preset icon image, and obtain the sound wave data and the voice input time of the voice input data;

a display module, configured to display the sound wave data as a sound spectrum image and the voice input time as a numerical value, respectively, on the preset icon image.

In the above instant messaging system, the display module displays the sound wave data as a sound spectrum image, which gives each item of voice input data a distinct display and makes it easier for the user to later tell different voice input data apart. Displaying the obtained voice input time as a numerical value through the display module makes this distinction easier still. The first acquisition module dynamically displays the voice input screen, which improves the user experience and keeps voice input from feeling tedious.

Further, the first acquisition module includes:

a first acquisition unit, configured to obtain, in real time, the current elapsed time together with a locally stored preset background image and recording icon, and to display an image according to the current elapsed time, the preset background image and the recording icon;

a second acquisition unit, configured to obtain, in real time, the voice decibel information in the current voice data, and to dynamically render the recording icon according to the voice decibel information.

In a third aspect, the present invention provides a mobile terminal including a memory and a processor, the memory being configured to store a computer program, and the processor running the computer program so that the mobile terminal executes the above voice information acquisition method.

In a fourth aspect, the present invention provides a storage medium storing the computer program used in the above mobile terminal.

Description of the drawings

Fig. 1 is a flow chart of the voice information acquisition method provided by the first embodiment of the present invention;

Fig. 2 is a flow chart of the voice information acquisition method provided by the second embodiment of the present invention;

Fig. 3 is a flow chart of the specific implementation steps of step S21 in Fig. 2;

Fig. 4 is a structural schematic diagram of the instant messaging system provided by the third embodiment of the present invention;

Fig. 5 is a structural schematic diagram of the instant messaging system provided by the fourth embodiment of the present invention.

Description of main reference numerals

Instant messaging system	100	Instant messaging platform	101
Voice acquisition device	102	First acquisition module	10
First acquisition unit	11	Second acquisition unit	12
Judging unit	13	Rendering unit	14
Storage module	20	First conversion unit	21
First naming unit	22	Second conversion unit	23
Second naming unit	24	Second acquisition module	30
Display module	31	Judgment module	40
Stop module	41

Detailed description

To facilitate a better understanding of the present invention, the invention is further explained below with reference to the accompanying drawings of the related embodiments. The drawings show embodiments of the invention, but the invention is not limited to these preferred embodiments. Rather, the embodiments are provided so that the disclosure of the invention will be more thorough.

Referring to Fig. 1, a flow chart of the voice information acquisition method provided by the first embodiment of the present invention, the method includes steps S10 to S50.

Step S10: start a timer when a voice input instruction is received, acquire current voice data in real time, and dynamically display a voice input screen according to dynamic features in the current voice data;

The voice input instruction may be transmitted as an electrical signal, a key signal, a wireless signal or a voice signal, and the current voice data is acquired by activating a microphone. Specifically, in this embodiment dynamic features are defined to control the dynamic display of the voice input screen on the display device; the dynamic features may be a time feature, a voice decibel feature, or the like.

Step S20: when a voice input stop instruction is received, close the voice input screen and stop acquisition of the current voice data to obtain voice input data;

The voice input stop instruction is transmitted in the same way as the voice input instruction. When the voice input stop instruction issued by the user is received, the display device is controlled to stop displaying the voice input screen and the microphone is turned off to stop acquiring the current voice data. The data captured from the start of voice input until it stops is the voice input data.
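Steps S10 and S20 can be pictured as a small recording state object. The following is a minimal sketch, not the patented implementation: the class name `RecordingSession`, its methods and the injectable `clock` parameter are all hypothetical, and real microphone capture is replaced by chunks passed to `feed`.

```python
import time


class RecordingSession:
    """Hypothetical recorder state: times the session and buffers audio chunks."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock          # injectable so the timing can be tested
        self._started_at = None
        self._chunks = []
        self.recording = False

    def start(self):
        # Voice input instruction received: start timing and begin capture.
        self._started_at = self._clock()
        self._chunks = []
        self.recording = True

    def feed(self, chunk: bytes):
        # Audio that arrives while not recording is ignored.
        if self.recording:
            self._chunks.append(chunk)

    def elapsed(self) -> float:
        # Seconds since start(); 0.0 if the session was never started.
        if self._started_at is None:
            return 0.0
        return self._clock() - self._started_at

    def stop(self):
        # Stop instruction received: take the duration, end capture,
        # and return everything captured between start and stop.
        duration = self.elapsed()
        self.recording = False
        return b"".join(self._chunks), duration
```

The pair returned by `stop()` corresponds to the voice input data and the voice input time used in the later steps.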

Step S30: perform format conversion on the voice input data, and store the converted voice input data;

Because the raw file of the voice input data is large, format conversion is used to reduce the size of the voice input data file, which facilitates subsequent storage of the voice input data.
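A rough calculation shows the scale of the saving. The figures below are assumptions, not values from the patent: the raw recording is taken to be 8 kHz, 16-bit, mono PCM, and the target is taken to be the AMR format named in the second embodiment, at the 12.2 kbit/s narrowband mode. File headers and frame overhead are ignored.

```python
def pcm_bytes(seconds, sample_rate=8000, bits_per_sample=16, channels=1):
    """Size of uncompressed PCM audio (assumed recording parameters)."""
    return seconds * sample_rate * bits_per_sample * channels // 8


def coded_bytes(seconds, bitrate_bps=12200):
    """Approximate payload size at a fixed codec bit rate (assumed AMR-NB 12.2 kbit/s)."""
    return seconds * bitrate_bps // 8


raw = pcm_bytes(60)        # 960000 bytes for one minute
coded = coded_bytes(60)    # 91500 bytes for one minute
ratio = raw / coded        # roughly a tenfold reduction
```

Under these assumptions a one-minute recording shrinks from about 960 kB to about 92 kB.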

Preferably, once format conversion and storage of the voice input data have been completed locally, a background thread uploads the voice input data to a server. The server-side receiving program encrypts the recording file as it is received and, when encryption is complete, stores it on the server as a file. If the local voice input data is later lost or damaged, it can be recovered by downloading the file from the server, which improves the security of the voice information acquisition method.

Step S40: obtain a locally stored preset icon image, and obtain the sound wave data and the voice input time of the voice input data;

The preset icon image is a picture set in advance by the user, which may be a local picture or a picture downloaded over the network. The sound wave data is obtained by a sound spectrograph, and the voice input time is obtained by a timer.
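The text does not say how the sound spectrograph derives its data. One common approach, assumed here purely for illustration, is a short-time discrete Fourier transform: the samples are cut into overlapping frames and the magnitude of each frequency bin is computed per frame. The function names, frame size and hop size below are hypothetical, and the plain-Python DFT is chosen for clarity rather than speed.

```python
import cmath
import math


def magnitude_spectrum(frame):
    """Magnitudes of DFT bins 0..n/2 for one frame of samples."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * i / n)
                for i, x in enumerate(frame)))
        for k in range(n // 2 + 1)
    ]


def spectrogram(samples, frame_size=64, hop=32):
    """One magnitude spectrum per overlapping frame (rows = time, columns = frequency)."""
    starts = range(0, len(samples) - frame_size + 1, hop)
    return [magnitude_spectrum(samples[s:s + frame_size]) for s in starts]
```

Each row of the result could then be mapped to one column of pixels when the sound spectrum image is drawn on the icon.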

Step S50: display the sound wave data as a sound spectrum image and the voice input time as a numerical value, respectively, on the preset icon image;

Because different sound wave data produce different sound spectrum images, this embodiment uses the sound spectrum image to give each item of voice input data a distinct display, making it easy for the user to tell the voice input data apart.

Preferably, because uncontrollable factors cause the voice input time to differ between different voice input data, the voice input data can also be distinguished by displaying the voice input time as a numerical value.
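The numerical display of the voice input time only needs a formatting rule. The patent does not specify one; the sketch below assumes a minutes:seconds label rounded to whole seconds, and `format_duration` is a hypothetical helper name.

```python
def format_duration(seconds: float) -> str:
    """Format a recording length as m:ss, rounded to whole seconds."""
    total = int(round(seconds))
    minutes, secs = divmod(total, 60)
    return f"{minutes}:{secs:02d}"
```

For example, a recording of 7.4 seconds would be labelled `0:07` and one of 75 seconds `1:15`.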

In this embodiment, displaying the sound wave data as a sound spectrum image gives each item of voice input data a distinct display, which makes it easier for the user to later tell different voice input data apart. Obtaining the voice input time and displaying it as a numerical value makes this distinction easier still. Dynamically displaying the voice input screen based on the dynamic features improves the user experience and keeps voice input from feeling tedious.

Referring to Fig. 2, a flow chart of the voice information acquisition method provided by the second embodiment of the present invention, the method includes steps S11 to S71.

Step S11: start a timer when a voice input instruction is received, acquire current voice data in real time, obtain in real time the current elapsed time together with a locally stored preset background image and recording icon, and display an image according to the current elapsed time, the preset background image and the recording icon;

The voice input instruction may be transmitted as an electrical signal, a key signal, a wireless signal or a voice signal, and the current voice data is acquired by activating a microphone. Obtaining and displaying the current elapsed time makes the voice input operation easier for the user.

Step S21: obtain, in real time, the voice decibel information in the current voice data, and dynamically render the recording icon according to the voice decibel information;

In this embodiment, dynamic features are defined to control the dynamic display of the voice input screen on the display device; the dynamic features may be a time feature, a voice decibel feature, or the like. Because the voice decibel information changes in real time while the user is speaking, this embodiment dynamically renders the recording icon in step with that change, so that voice input does not feel tedious. Preferably, the recording icon may be dynamically rendered by color rendering or by image change, to improve the user's perceptual experience.
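How the voice decibel information is derived from the audio is not stated. A common choice, assumed here, is the root-mean-square level of a block of 16-bit PCM samples expressed in decibels relative to full scale (dBFS). The function name `rms_dbfs`, the full-scale constant and the floor value are illustrative assumptions.

```python
import math


def rms_dbfs(samples, full_scale=32768.0, floor_db=-96.0):
    """RMS level of a block of PCM samples in dB relative to full scale.

    Returns floor_db for an empty or all-zero block, so the result is always finite.
    """
    if not samples:
        return floor_db
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    if rms == 0:
        return floor_db
    return max(floor_db, 20 * math.log10(rms / full_scale))
```

Calling this on each newly captured block gives a value that changes in real time as the user speaks, which is what the dynamic rendering of the recording icon needs as input.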

Referring to Fig. 3, the specific implementation steps of step S21 in Fig. 2 are as follows:

Step S210: determine the decibel level of the voice decibel information, and determine a render region according to the decibel level;

A render region table is stored locally. It stores the render coordinates corresponding to each decibel level, and a different region is rendered according to the render coordinates looked up. For example, when the decibel level is 0, grey arcs are shown on both sides of the recording icon; as the determined decibel level increases, different regions are progressively colored, producing a dynamic display effect and improving the user experience.

Step S211: obtain a locally stored render color, and color the render region according to the render color;

The render color can be configured by the user as needed; in this embodiment it is blue. Specifically, as the decibel level increases, the grey arcs are replaced by blue arcs one by one from the inside out, producing a dynamic display effect on the recording icon.
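Steps S210 and S211 amount to a lookup from a decibel reading to a level, and from the level to the arcs that should be colored. The sketch below assumes four arcs and four dBFS thresholds; both numbers, and the function names, are hypothetical, and the render region table is reduced to a list in which index 0 is the innermost arc.

```python
GREY, BLUE = "grey", "blue"


def decibel_level(db, thresholds=(-50.0, -35.0, -20.0, -10.0)):
    """Level = number of thresholds the reading meets or exceeds (0 means silence)."""
    return sum(db >= t for t in thresholds)


def arc_colors(level, arc_count=4, color=BLUE):
    """Colors of the arcs from innermost to outermost for a given level.

    Level 0 leaves every arc grey; each higher level colors one more arc,
    working from the inside out.
    """
    level = max(0, min(level, arc_count))
    return [color] * level + [GREY] * (arc_count - level)
```

A reading of -30 dBFS, for instance, maps to level 2 under these thresholds, so the two inner arcs turn blue while the two outer arcs stay grey.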

Continuing with Fig. 2, step S31: when a voice input stop instruction is received, close the voice input screen and stop acquisition of the current voice data to obtain voice input data;

Step S41: convert the voice input data to AMR format, and perform text recognition on the voice input data to obtain a feature word, the feature word being the word repeated most often in the voice input data;

Because the raw file of the voice input data is large, format conversion is used to reduce the size of the voice input data file, which facilitates subsequent storage of the voice input data. Performing text recognition on the voice input data makes it possible to obtain the feature word afterwards.

Step S51: rename the format-converted voice input data according to the feature word;

Because the feature word occurs most often in the voice input data, it can be used to represent the corresponding voice input data. Renaming gives the file corresponding to the voice input data a name that displays the feature word.
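Once a transcript is available, picking the feature word is a frequency count. The sketch below assumes the text recognition step has already produced a transcript string (speech recognition itself is out of scope here) and that words can be split on non-word characters, which would not hold for unsegmented Chinese text. The function names and the fallback file name are hypothetical.

```python
import re
from collections import Counter


def feature_word(transcript):
    """Most frequently repeated word in the transcript, or None if it is empty."""
    words = re.findall(r"\w+", transcript.lower())
    if not words:
        return None
    return Counter(words).most_common(1)[0][0]


def renamed_file(transcript, extension=".amr", fallback="recording"):
    """File name built from the feature word, with a fallback for empty transcripts."""
    return f"{feature_word(transcript) or fallback}{extension}"
```

A practical implementation would probably also filter out stop words such as "the", since the most repeated word in ordinary speech is often not the most descriptive one; the patent does not address this.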

Preferably, the step of performing format conversion on the voice input data and storing the converted voice input data may alternatively include:

converting the voice input data to AMR format, and obtaining the current time;

renaming the format-converted voice input data according to the current time.
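Renaming by the current time can be as simple as formatting a timestamp. The pattern below (year, month, day, then hour, minute, second) is an assumption chosen so that names sort chronologically; the patent does not prescribe a format, and `timestamp_name` is a hypothetical helper.

```python
from datetime import datetime


def timestamp_name(now: datetime, extension=".amr"):
    """File name derived from the given time, e.g. 20180330_090507.amr."""
    return now.strftime("%Y%m%d_%H%M%S") + extension
```

In use, the caller would pass `datetime.now()` at the moment the conversion finishes.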

Step S61: obtain a locally stored preset icon image, and obtain the sound wave data and the voice input time of the voice input data;

Step S71: display the sound wave data as a sound spectrum image and the voice input time as a numerical value, respectively, on the preset icon image;

Preferably, after the step of acquiring the current voice data in real time, the method further includes:

determining whether the current decibel value in the current voice data received within a preset time stays below a preset decibel value;

if so, closing the voice input screen and stopping acquisition of the current voice data;

Determining whether the current decibel value in the current voice data received within the preset time stays below the preset decibel value effectively prevents the battery drain caused by the user forgetting to stop voice input.
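The automatic stop condition can be sketched as a small detector that is fed one decibel reading at a time. The threshold of -45 dB, the 3-second timeout and the class name `SilenceDetector` are illustrative assumptions; the patent only speaks of a preset decibel value and a preset time.

```python
class SilenceDetector:
    """Reports when the level has stayed below a threshold for a full timeout."""

    def __init__(self, threshold_db=-45.0, timeout_s=3.0):
        self.threshold_db = threshold_db
        self.timeout_s = timeout_s
        self._quiet_since = None   # time at which the current quiet run began

    def update(self, db, now):
        """Feed one reading taken at time `now` (seconds); True means stop recording."""
        if db >= self.threshold_db:
            # Any loud reading resets the quiet run.
            self._quiet_since = None
            return False
        if self._quiet_since is None:
            self._quiet_since = now
        return now - self._quiet_since >= self.timeout_s
```

When `update` returns True, the caller would close the voice input screen and stop acquisition, exactly as it does for an explicit stop instruction.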

In this embodiment, displaying the sound wave data as a sound spectrum image gives each item of voice input data a distinct display, which makes it easier for the user to later tell different voice input data apart. Obtaining the voice input time and displaying it as a numerical value makes this distinction easier still. Dynamically displaying the voice input screen improves the user experience and keeps voice input from feeling tedious.

Referring to Fig. 4, a structural schematic diagram of the instant messaging system 100 provided by the third embodiment of the present invention, the system includes:

an instant messaging platform 101, configured to receive operation instructions from a user;

a voice acquisition device 102, communicatively connected to the instant messaging platform 101 and configured to acquire or play voice data according to the operation instruction received by the instant messaging platform;

The voice acquisition device 102 includes:

a first acquisition module 10, configured to start a timer when a voice input instruction is received, acquire current voice data in real time, and dynamically display a voice input screen according to dynamic features in the current voice data;

a storage module 20, configured to, when a voice input stop instruction is received, close the voice input screen and stop acquisition of the current voice data to obtain voice input data, perform format conversion on the voice input data, and store the converted voice input data;

Specifically, because the raw file of the voice input data is large, format conversion is used to reduce the size of the voice input data file, which facilitates subsequent storage of the voice input data;

a second acquisition module 30, configured to obtain a locally stored preset icon image, and obtain the sound wave data and the voice input time of the voice input data;

The preset icon image is a picture set in advance by the user, which may be a local picture or a picture downloaded over the network. The sound wave data is obtained by a sound spectrograph, and the voice input time is obtained by a timer.

a display module 31, configured to display the sound wave data as a sound spectrum image and the voice input time as a numerical value, respectively, on the preset icon image;

The first acquisition module 10 includes:

a first acquisition unit 11, configured to obtain, in real time, the current elapsed time together with a locally stored preset background image and recording icon, and to display an image according to the current elapsed time, the preset background image and the recording icon;

a second acquisition unit 12, configured to obtain, in real time, the voice decibel information in the current voice data, and to dynamically render the recording icon according to the voice decibel information.

The second acquisition unit 12 includes:

a judging unit 13, configured to determine the decibel level of the voice decibel information, and to determine a render region according to the decibel level;

a rendering unit 14, configured to obtain a locally stored render color, and to color the render region according to the render color.

The storage module 20 includes:

a first conversion unit 21, configured to convert the voice input data to AMR format, and to perform text recognition on the voice input data to obtain a feature word, the feature word being the word repeated most often in the voice input data;

a first naming unit 22, configured to rename the format-converted voice input data according to the feature word;

a second conversion unit 23, configured to convert the voice input data to AMR format, and to obtain the current time;

a second naming unit 24, configured to rename the format-converted voice input data according to the current time.

In this embodiment, the display module 31 displays the sound wave data as a sound spectrum image, which gives each item of voice input data a distinct display and makes it easier for the user to later tell different voice input data apart. Displaying the obtained voice input time as a numerical value through the display module 31 makes this distinction easier still. The first acquisition module 10 dynamically displays the voice input screen, which improves the user experience and keeps voice input from feeling tedious.

Referring to Fig. 5, a structural schematic diagram of the instant messaging system 100 provided by the fourth embodiment of the present invention, the structure of the fourth embodiment is largely the same as that of the third embodiment. The difference is that in this embodiment the voice acquisition device 102 further includes:

a judgment module 40, configured to determine whether the current decibel value in the current voice data received within a preset time stays below a preset decibel value;

a stop module 41, configured to close the voice input screen and stop acquisition of the current voice data when the judgment result of the judgment module 40 is yes.

In this embodiment, the judgment module 40 determines whether the current decibel value in the current voice data received within the preset time stays below the preset decibel value, which effectively prevents the battery drain caused by the user forgetting to stop voice input.

This embodiment also provides a mobile terminal including a memory and a processor, the memory being configured to store a computer program, and the processor running the computer program so that the mobile terminal executes the above voice information acquisition method.

This embodiment also provides a storage medium storing the computer program used in the above mobile terminal. When executed, the program performs steps including:

displaying the sound wave data as a sound spectrum image and the voice input time as a numerical value, respectively, on the preset icon image. The storage medium may be, for example, a ROM/RAM, a magnetic disk or an optical disc.

The above embodiments describe the technical principles of the present invention. The description serves only to explain the principles of the invention and shall not be construed in any way as limiting its scope of protection. Based on the explanation herein, those skilled in the art can conceive of other specific implementations of the invention without creative effort, and these implementations fall within the scope of protection of the invention.

Claims

1. A voice information acquisition method, characterized in that the method includes:

starting a timer when a voice input instruction is received, acquiring current voice data in real time, and dynamically displaying a voice input screen according to dynamic features in the current voice data;

when a voice input stop instruction is received, closing the voice input screen and stopping acquisition of the current voice data to obtain voice input data, performing format conversion on the voice input data, and storing the converted voice input data;

obtaining a locally stored preset icon image, and obtaining the sound wave data and the voice input time of the voice input data;

displaying the sound wave data as a sound spectrum image and the voice input time as a numerical value, respectively, on the preset icon image.

2. The voice information acquisition method according to claim 1, characterized in that the step of dynamically displaying the voice input screen according to the dynamic features in the current voice data includes:

obtaining, in real time, the current elapsed time together with a locally stored preset background image and recording icon, and displaying an image according to the current elapsed time, the preset background image and the recording icon;

obtaining, in real time, the voice decibel information in the current voice data, and dynamically rendering the recording icon according to the voice decibel information.

3. The voice information acquisition method according to claim 2, characterized in that the step of dynamically rendering the recording icon according to the voice decibel information includes:

determining the decibel level of the voice decibel information, and determining a render region according to the decibel level;

obtaining a locally stored render color, and coloring the render region according to the render color.

4. The voice information acquisition method according to claim 1, characterized in that, after the step of acquiring the current voice data in real time, the method further includes:

determining whether the current decibel value in the current voice data received within a preset time stays below a preset decibel value;

if so, closing the voice input screen and stopping acquisition of the current voice data.

5. The voice information acquisition method according to claim 1, characterized in that the step of performing format conversion on the voice input data and storing the converted voice input data includes:

converting the voice input data to AMR format, and performing text recognition on the voice input data to obtain a feature word, the feature word being the word repeated most often in the voice input data;

renaming the format-converted voice input data according to the feature word.

6. The voice information acquisition method according to claim 1, characterized in that the step of performing format conversion on the voice input data and storing the converted voice input data includes:

converting the voice input data to AMR format, and obtaining the current time;

renaming the format-converted voice input data according to the current time.

7. An instant messaging system, characterized by including:

an instant messaging platform, configured to receive operation instructions from a user;

a voice acquisition device, communicatively connected to the instant messaging platform and configured to acquire or play voice data according to the operation instruction received by the instant messaging platform;

the voice acquisition device including:

a first acquisition module, configured to start a timer when a voice input instruction is received, acquire current voice data in real time, and dynamically display a voice input screen according to dynamic features in the current voice data;

a storage module, configured to, when a voice input stop instruction is received, close the voice input screen and stop acquisition of the current voice data to obtain voice input data, perform format conversion on the voice input data, and store the converted voice input data;

a second acquisition module, configured to obtain a locally stored preset icon image, and obtain the sound wave data and the voice input time of the voice input data;

a display module, configured to display the sound wave data as a sound spectrum image and the voice input time as a numerical value, respectively, on the preset icon image.

8. The instant messaging system according to claim 7, characterized in that the first acquisition module includes:

a first acquisition unit, configured to obtain, in real time, the current elapsed time together with a locally stored preset background image and recording icon, and to display an image according to the current elapsed time, the preset background image and the recording icon;

a second acquisition unit, configured to obtain, in real time, the voice decibel information in the current voice data, and to dynamically render the recording icon according to the voice decibel information.

9. A mobile terminal, characterized by including a memory and a processor, the memory being configured to store a computer program, and the processor running the computer program so that the mobile terminal executes the voice information acquisition method according to any one of claims 1 to 6.

10. A storage medium, characterized in that it stores the computer program used in the mobile terminal according to claim 9.