CN110798393B - Voiceprint bubble display method and terminal using voiceprint bubbles

Info

Publication number: CN110798393B
Application number: CN201810873112.6A
Authority: CN
Inventors: 段迪; 王旭飞; 徐晓鑫
Original assignee: Tencent Technology Shenzhen Co Ltd
Current assignee: Tencent Technology Shenzhen Co Ltd
Priority date: 2018-08-02
Filing date: 2018-08-02
Publication date: 2021-10-26
Anticipated expiration: 2038-08-02
Also published as: CN110798393A

Abstract

The invention discloses a method for displaying voiceprint bubbles in instant messaging. The method comprises: acquiring voice information in instant messaging; determining, from the voice information, key information related to it; and matching a corresponding voiceprint bubble to the voice information according to the key information and displaying that bubble. The invention also discloses a voiceprint bubble display method and a terminal using voiceprint bubbles. Because the voiceprint bubble is matched according to key information related to the voice information and then displayed, the voice information can be presented in a personalized way, which makes the chat more engaging and improves the user experience.

Description

Voiceprint bubble display method and terminal using voiceprint bubbles

Technical Field

The present invention relates to the field of communication information processing technologies, and in particular, to a method for displaying voiceprint bubbles in instant messaging, a method for displaying voiceprint bubbles, and a terminal using voiceprint bubbles.

Background

In the related art, an instant messenger (IM) typically presents voice information sent by a user in the form of a chat bubble. However, the chat bubble usually has a plain solid-color background and can only show the playback state and the duration of the voice information. Such a display is dull, cannot meet users' further needs, and results in a poor user experience.

Disclosure of Invention

The embodiment of the invention provides a voiceprint bubble display method in instant messaging, a voiceprint bubble display method and a terminal using the voiceprint bubbles.

The method for displaying the voiceprint bubbles in the instant messaging comprises the following steps:

acquiring voice information in instant messaging;

determining key information related to the voice information according to the voice information;

and matching and displaying the corresponding voiceprint bubbles for the voice information according to the key information.

In this method, the voiceprint bubble is matched according to key information related to the voice information and then displayed, so that the voice information can be presented in a personalized way, which makes the chat more engaging and improves the user experience.

The voiceprint bubble display method provided by an embodiment of the invention uses a voiceprint bubble to present voice information, the voiceprint bubble being matched with key information of the voice information. The method comprises the following steps:

statically displaying the voiceprint bubble when the voice information is received; and

dynamically displaying the voiceprint bubble when the voice information is played.

The voiceprint bubble display method provided by the embodiment of the invention uses a voiceprint bubble matched with the key information of the voice information to present the voice information in a personalized way. The voiceprint bubble is displayed statically when the voice information is received and dynamically when the voice information is played, which makes the display more engaging and improves the user experience.

In the terminal using voiceprint bubbles according to an embodiment of the invention, the voiceprint bubble is used to present voice information and is matched with key information of the voice information. The terminal comprises a display module, and the display module is used for: statically displaying the voiceprint bubble when the voice information is received; and dynamically displaying the voiceprint bubble when the voice information is played.

The terminal using voiceprint bubbles according to the embodiment of the invention uses a voiceprint bubble matched with the key information of the voice information to present the voice information in a personalized way. The voiceprint bubble is displayed statically when the voice information is received and dynamically when the voice information is played, which makes the display more engaging and improves the user experience.

Additional aspects and advantages of the invention will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the invention.

Drawings

The above and/or additional aspects and advantages of the present invention will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings of which:

FIG. 1 is a block diagram illustrating an implementation environment related to a method for displaying voiceprint bubbles in instant messaging according to an embodiment of the present invention;

FIG. 2 is a flowchart illustrating a method for displaying voiceprint bubbles in instant messaging according to an embodiment of the present invention;

FIG. 3 is another flow chart illustrating a method for displaying voiceprint bubbles in instant messaging according to an embodiment of the present invention;

FIGS. 4-5 are schematic diagrams of a voiceprint bubble display according to embodiments of the invention;

FIG. 6 is a flowchart illustrating a method for displaying voiceprint bubbles in instant messaging according to an embodiment of the present invention;

FIG. 7 is a schematic illustration of a voiceprint bubble display of an embodiment of the present invention;

FIG. 8 is a schematic flow chart illustrating a method for displaying voiceprint bubbles in instant messaging according to an embodiment of the present invention;

FIGS. 9-11 are schematic diagrams of a voiceprint bubble display according to embodiments of the invention;

FIG. 12 is a flowchart illustrating a method for displaying voiceprint bubbles in instant messaging according to an embodiment of the present invention;

FIGS. 13-14 are schematic diagrams of a voiceprint bubble display according to an embodiment of the invention;

FIG. 15 is a schematic flow chart of a method of displaying a voiceprint bubble in accordance with an embodiment of the invention;

FIG. 16 is a block diagram of a terminal using a voiceprint bubble in accordance with an embodiment of the present invention.

Description of the main element symbols:

server 00, sender client 01, receiver client 02, terminal 10, display module 12, voice acquisition module 14, determination module 16, and matching module 18.

Detailed Description

Reference will now be made in detail to embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below with reference to the accompanying drawings are illustrative only for the purpose of explaining the present invention, and are not to be construed as limiting the present invention.

FIG. 1 shows an implementation environment related to the method for displaying voiceprint bubbles in instant messaging according to an embodiment of the present invention. The implementation environment includes an instant messaging (IM) system having a voice function, such as WeChat or Tencent QQ. The instant messaging system includes a sender client 01, a receiver client 02, and a server 00. The sender client 01 and the receiver client 02 are instant messaging software. It can be understood that the sender client 01 and the receiver client 02 are distinguished only for convenience of description; in practical applications, they may be the same terminal having both functions. The terminal is an electronic terminal with a network connection function, including but not limited to a smartphone, a computer, a multimedia player, an e-reader, or a wearable electronic device. The server 00 may be a single server, a server cluster composed of several servers, or a cloud computing service center.

The voiceprint bubble is used to present voice information. It applies a special display effect to the voice information during instant voice communication. The voiceprint bubble comprises a background picture and a voiceprint picture frame, and the voiceprint picture frame is related to the background picture. The voiceprint bubble is provided with a text label, and the background picture and the voiceprint picture frame correspond to the text label. The voiceprint picture frame includes a group of dynamic picture frames. In other embodiments, the voiceprint bubble can further include a bubble border in which the background picture is disposed. Furthermore, the voiceprint bubble can show the recording length of the voice information on the background picture, and the recording length and the voiceprint picture frame can be arranged side by side along the length direction of the background picture.

Referring to fig. 2, a method for displaying voiceprint bubbles in instant messaging according to an embodiment of the present invention includes:

Step S110: acquiring voice information in instant messaging.

Specifically, when voice input is received, the sender client 01 records it, thereby obtaining the voice information together with the time at which it was generated and its intonation, timbre, speech rate, recording length, and the like.

In one example, the sounds of different notes may each be treated as an individual word, and the time interval between adjacent words is calculated. If the average time interval is short, the speech rate is considered fast; if the average time interval is long, the speech rate is considered slow. An interval threshold may be set, and the average time interval is compared with this threshold to decide whether it is short or long and thereby determine the speech rate of the voice information.
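The threshold comparison described above can be sketched as follows. This is only an illustration: the function name `classify_speech_rate`, the representation of the input as a list of word onset times in seconds, and the default threshold of 0.35 seconds are all assumptions that do not come from the text.

```python
from statistics import mean


def classify_speech_rate(word_onsets, interval_threshold=0.35):
    """Classify speech as "fast" or "slow" from word onset times in seconds.

    interval_threshold (seconds) is an assumed tuning value, not from the text.
    """
    if len(word_onsets) < 2:
        # Assumption: with fewer than two words no interval exists; call it slow.
        return "slow"
    # Time gaps between consecutive words.
    intervals = [later - earlier
                 for earlier, later in zip(word_onsets, word_onsets[1:])]
    return "fast" if mean(intervals) < interval_threshold else "slow"
```

A caller would pass the onset times produced by whatever word detection the client uses; only the comparison of the average interval against a threshold is taken from the description.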

Step S120: determining key information related to the voice information according to the voice information.

Specifically, referring to fig. 3, in one embodiment of the present invention, step S120 includes: converting the voice information into text information, and determining the key information according to keywords in the text information.

It can be understood that after the voice information is acquired, it can be converted into text information through a preset voice recognition algorithm. Word segmentation can then be performed on the text information to determine a keyword in the voice information, and the key information that has a correspondence with a voiceprint bubble is determined according to the keyword. Further, the words obtained by segmenting the text information may be compared with the words in a preset keyword library to determine whether the voice information includes keywords.
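As a minimal sketch of the keyword-library comparison, assume speech recognition and word segmentation have already produced a list of words. The library contents, the mapping of an associated word such as "drive" to the key information "car", and the name `find_key_information` are hypothetical and serve only as an illustration.

```python
# Hypothetical keyword library: segmented word -> key information (text label).
KEYWORD_LIBRARY = {
    "car": "car",
    "drive": "car",  # assumed example of a word associated with a keyword
    "watermelon": "watermelon",
}


def find_key_information(words, library=KEYWORD_LIBRARY):
    """Return the key information for library words, in order of first appearance."""
    found = []
    for word in words:
        key = library.get(word.lower())
        # Skip words outside the library and avoid duplicate key information.
        if key is not None and key not in found:
            found.append(key)
    return found
```

An empty result means the voice information contains no keyword, in which case another kind of key information (time, intonation, timbre) could be used instead.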

Step S130: matching a corresponding voiceprint bubble to the voice information according to the key information, and displaying the voiceprint bubble.

Specifically, the corresponding voiceprint bubble is found according to the key information and displayed in the instant messaging interface to present the voice information. The voiceprint bubble comprises a background picture and a voiceprint picture frame. When the voiceprint bubble is displayed statically (i.e., the voice information is not being played), either only the background picture is displayed or a voiceprint picture frame is superimposed on the background picture; in this case, the voiceprint picture frame is a single static picture selected from the group of dynamic picture frames. When the voiceprint bubble is displayed dynamically (i.e., the voice information is being played), the dynamic picture frames are displayed on the background picture in a carousel; in this case, the voiceprint picture frame is the whole group of dynamic picture frames. The carousel speed of the voiceprint picture frames can be determined according to the speech rate of the voice information and is directly proportional to it. The duration of the carousel display of the voiceprint picture frames is the same as the playback duration of the voice information.
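The static and dynamic display rules can be illustrated with a small frame-selection function. It is a sketch under assumptions: the name `carousel_frame`, the use of frame 0 as the static picture, and the proportionality constant `frames_per_word` are not specified in the text, and `speech_rate` is assumed to be expressed in words per second.

```python
def carousel_frame(elapsed, duration, frame_count, speech_rate, frames_per_word=1.0):
    """Return the index of the voiceprint picture frame to draw.

    elapsed is the playback position in seconds, or None when not playing.
    """
    if frame_count <= 0:
        raise ValueError("frame_count must be positive")
    # Static display: not playing, or playback already finished.
    if elapsed is None or not (0 <= elapsed < duration):
        return 0  # assumption: frame 0 is the static picture
    # Carousel speed is proportional to the speech rate.
    frames_per_second = frames_per_word * speech_rate
    return int(elapsed * frames_per_second) % frame_count
```

Because the carousel only advances while `0 <= elapsed < duration`, its display time matches the playback time of the voice information.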

Further, the voiceprint picture frame can include a first portion and a second portion. When the voiceprint bubble is displayed and the voice information is not played, statically displaying a first part and a second part of the voiceprint picture frame on the background picture; when the voiceprint bubble is displayed and the voice information is played, the first part of the voiceprint picture frame is statically displayed on the background picture, and the second part of the voiceprint picture frame is dynamically displayed on the background picture.

To make the display more vivid, the length of the background picture can be proportional to the recording length of the voice information. A fixed maximum recording length corresponds to the maximum display length of the background picture. When the recording length of the voice information is less than the maximum recording length, the background picture is shortened or cropped in proportion to the ratio of the recording length to the maximum recording length, and is then rendered and displayed. To improve user experience, the instant messaging system may limit the maximum recording length of voice information. In one example, the maximum recording length is 59 seconds.
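The proportional sizing can be worked through in a few lines. The 59-second maximum comes from the example in the text; the function name `background_length`, the pixel unit, and the clamping of out-of-range inputs are assumptions made for this sketch.

```python
MAX_RECORDING_SECONDS = 59  # maximum recording length from the example above


def background_length(recording_seconds, max_display_length,
                      max_seconds=MAX_RECORDING_SECONDS):
    """Scale the background picture length in proportion to the recording length."""
    # Assumption: clamp to the valid range so the picture never exceeds its maximum.
    clamped = min(max(recording_seconds, 0), max_seconds)
    return round(max_display_length * clamped / max_seconds)
```

For instance, with a maximum display length of 236 pixels, a 29.5-second recording occupies half of that length.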

When the voiceprint bubble comprises the bubble border, the size (including the length and the width) of the bubble border can be determined according to the recording length of the voice information, and then the background picture is rendered into the bubble border according to the size of the bubble border. The length of the bubble border is proportional to the recording length of the voice message. The width of the bubble borders is generally fixed.

Referring to fig. 4 and 5, after determining the key information according to the keyword, the corresponding voiceprint bubble can be found by finding a text label (the keyword or a word associated with the keyword). Usually, the search is preferentially performed in the database local to the terminal, and when the corresponding voiceprint bubble cannot be searched, the search request is sent to the server 00, and the server 00 searches the voiceprint bubble. After the server 00 finds the corresponding voiceprint bubbles, the server 00 issues the corresponding voiceprint bubbles to the terminal.
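The local-first lookup with a server fallback might look like the sketch below. The dictionary standing in for the terminal's local database, the `fetch_from_server` callable, and the caching of a downloaded bubble are assumptions; the text only states that the server searches for the bubble and sends it to the terminal.

```python
def find_voiceprint_bubble(label, local_db, fetch_from_server):
    """Look up a bubble by text label, locally first and then on the server."""
    bubble = local_db.get(label)
    if bubble is None:
        # Not found locally: send the search request to the server.
        bubble = fetch_from_server(label)
        if bubble is not None:
            # Assumption: keep the downloaded bubble for later local lookups.
            local_db[label] = bubble
    return bubble
```

If neither the local database nor the server has a match, the function returns `None`, and the client could fall back to an ordinary chat bubble.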

In one example, referring to fig. 4, the voice information recorded by the sender is "I'm driving home myself". After word segmentation of the corresponding text, the keyword "car" is determined as the key information of this voice information. The voiceprint bubble of the car is then obtained by searching for the text label "car" and is displayed on the instant messaging interface. In the example of fig. 4, the background picture is a picture of a car, the voiceprint picture frame is a plurality of square patterns resembling car windows, and the recording length of the voice information, 3 seconds, is displayed near the right side of the background picture. When the voice information is played, the square patterns on the background picture are displayed in a carousel, during which they fluctuate up and down or left and right. The key information corresponding to the voiceprint bubble shown in fig. 5 is watermelon: the background picture is a picture of a watermelon, and the voiceprint picture frame is a plurality of strip-shaped patterns resembling watermelon seeds.

In summary, the method for displaying voiceprint bubbles in instant messaging according to the embodiments of the present invention matches the voiceprint bubbles according to the key information related to the voice information and displays the voiceprint bubbles, so that the voice information can be displayed in a personalized manner, thereby increasing interest and improving user experience.

In the invention, two interacting parties (such as a sender and a receiver) can carry out a voice chat through the instant messaging system (such as the sender client 01 and the receiver client 02). A voice input icon is displayed on the chat interface of the sender client 01, and the user can click the voice input icon to record voice information. Further, prompt text such as "Hold to talk" is displayed on the voice input icon; when the user presses and holds the icon and starts speaking, the sender client 01 records the voice through a microphone to acquire the voice information in the instant messaging.

The method for displaying voiceprint bubbles in instant messaging can be implemented by an instant messaging system. The instant messaging system includes a sender client 01, a receiver client 02, and a server 00. Preferably, steps S110, S120, and S130 are implemented by the sender client 01. Specifically, the sender client 01 may acquire voice information through a microphone, determine key information according to the voice information, and match and display a voiceprint bubble according to the key information. Further, the sender client 01 may combine the voiceprint bubble matched with the key information and the voice information into a data packet and send the data packet to the receiver client 02 through the server 00, so that the receiver client 02 can also display the voiceprint bubble directly.

Of course, in some embodiments, step S110, step S120, and step S130 may also be implemented by the recipient client 02. The sender client 01 records the voice information and then sends the voice information to the server 00, and the server 00 sends the voice information to the receiver client 02. The receiver client 02 acquires the voice information from the server 00, determines key information according to the voice information, matches voiceprint bubbles according to the key information and displays the voiceprint bubbles. In such an embodiment, the sender client 01 may also display the voiceprint bubble.

In other embodiments, the server 00 may obtain the voice information from the sender client 01, determine key information related to the voice information, match corresponding voiceprint bubbles for the voice information according to the key information, synthesize the voiceprint bubbles and the voice information into a data packet, and then send the data packet to the sender client 01 and/or the receiver client 02. The sender client 01 and/or the receiver client 02 directly display the voiceprint bubble.
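The text does not specify how the voiceprint bubble and the voice information are combined into a data packet. Purely as an assumption, one simple layout is a length-prefixed JSON header carrying the bubble's text label, followed by the raw audio bytes; the function names and the header field `bubble` below are hypothetical.

```python
import json
import struct


def pack_voice_message(bubble_label, audio):
    """Bundle a bubble label and raw audio bytes into one packet."""
    header = json.dumps({"bubble": bubble_label}).encode("utf-8")
    # 4-byte big-endian header length, then the JSON header, then the audio.
    return struct.pack(">I", len(header)) + header + audio


def unpack_voice_message(packet):
    """Recover (bubble_label, audio) from a packet built by pack_voice_message."""
    (header_length,) = struct.unpack(">I", packet[:4])
    header = json.loads(packet[4:4 + header_length].decode("utf-8"))
    return header["bubble"], packet[4 + header_length:]
```

With such a packet, whichever party performs the matching (sender client, receiver client, or server) does so once, and every other party can display the bubble directly after unpacking.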

Referring to fig. 6, in another embodiment of the present invention, step S120 includes: acquiring the time at which the voice information was generated and, when that time is a specific time, determining the key information according to the specific time.

In one example, the specific time may be a holiday. The time at which the voice information was generated is compared with holiday dates to judge whether it falls on a holiday. When it does, the key information that has a correspondence with a voiceprint bubble is determined. Referring to fig. 7, unlike the above embodiment, after the key information is determined according to the holiday, the corresponding voiceprint bubble can be found by searching for the text label (the holiday name). The key information corresponding to the voiceprint bubble shown in fig. 7 is Christmas: the background picture is a picture of a Christmas scene, and the voiceprint picture frame is a plurality of circular patterns resembling snowflakes.

In another example, the specific time may be a time set by the sender or the receiver. Such a time may be a date of personal significance to the sender or the receiver, such as the sender's own birthday, a wedding anniversary, a child's birthday, a parent's birthday, or another memorable day. The setting of the specific time may be implemented in the instant messaging system.
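A minimal sketch of the time-based matching follows. The holiday table, the `(month, day)` key format, the function name `key_information_for_time`, and the choice to let user-set dates take precedence over holidays are all assumptions made for illustration.

```python
from datetime import datetime

# Hypothetical holiday table keyed by (month, day).
HOLIDAYS = {(12, 25): "Christmas"}


def key_information_for_time(generated_at, holidays=HOLIDAYS, personal_dates=None):
    """Return the key information for a specific time, or None for ordinary days."""
    day = (generated_at.month, generated_at.day)
    # Assumption: dates set by the user take precedence over holidays.
    if personal_dates and day in personal_dates:
        return personal_dates[day]
    return holidays.get(day)
```

For example, `key_information_for_time(datetime(2018, 12, 25, 9, 30))` yields the text label under which a Christmas bubble would be looked up.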

Referring to fig. 8, in another embodiment of the present invention, step S120 includes: acquiring the intonation of the voice information, and determining the key information according to the intonation.

It can be understood that the voice information carries the speaker's intonation. After the voice information is acquired, the intonation can be judged from the fluctuation amplitude and frequency of the sound wave in the voice information, so as to determine the emotion information contained in the voice information. For example, when the fluctuation amplitude of the sound wave is large and the frequency increases, the emotion may be considered to be anger. Alternatively, a clustering algorithm may be trained on a large amount of voice information with different intonations to obtain different emotion models, which are stored in the terminal or the server 00. The emotion information contained in the voice information can then be determined by analyzing its intonation with the emotion models. The emotion models include anger, joy, fear, sadness, cuteness, and the like. In this case, the key information is a word indicating an emotion, such as anger, joy, fear, sadness, or cuteness. Referring to figs. 9-11, unlike the above embodiment, after the key information is determined according to the intonation, the corresponding voiceprint bubble can be found by searching for the text label (the word representing the emotion). The key information corresponding to the voiceprint bubble shown in fig. 9 is anger: the background picture is a picture in heavy tones, and the voiceprint picture frame is a plurality of square patterns. The key information corresponding to the voiceprint bubble shown in fig. 10 is happy: the background picture is a picture in cheerful tones, and the voiceprint picture frame is a plurality of note patterns. The key information corresponding to the voiceprint bubble shown in fig. 11 is dusting: the background picture is a chicken picture, and the voiceprint picture frame is a plurality of oval patterns resembling eggs.
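One way to use emotion models obtained by clustering is to keep a centroid per emotion and assign new voice information to the nearest one. The sketch below assumes each utterance has already been reduced to two features normalized to the range 0 to 1 (amplitude fluctuation and frequency level); the centroid values and the name `classify_emotion` are invented for illustration and are not taken from the text.

```python
import math

# Hypothetical cluster centroids: (amplitude fluctuation, frequency level), both in [0, 1].
EMOTION_CENTROIDS = {
    "anger": (0.9, 0.8),
    "joy": (0.6, 0.7),
    "sadness": (0.2, 0.2),
}


def classify_emotion(features, centroids=EMOTION_CENTROIDS):
    """Return the emotion whose centroid is closest to the feature vector."""
    return min(centroids, key=lambda name: math.dist(features, centroids[name]))
```

The returned word then serves as the text label for looking up the voiceprint bubble.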

Referring to fig. 12, in yet another embodiment of the present invention, step S120 includes: acquiring the timbre of the voice information, judging the gender of the speaker according to the timbre, and determining the key information according to the gender of the speaker.

It is understood that the speech information includes the timbre information of the speaker. The frequency band of the male voice is generally lower, and the frequency band of the female voice is generally higher. After the voice information is acquired, the tone can be judged according to the frequency band of the sound in the voice information, and then the gender information contained in the voice information is determined. Alternatively, a large number of different timbres (male and female) of voice information may be trained by a clustering algorithm to obtain male and female voice models and stored in the terminal or server 00. The voice information can be analyzed through the male voice model and the female voice model to determine the gender information contained in the voice information. At this time, the key information is male or female. Referring to fig. 13 and 14, unlike the above embodiment, after determining the key information according to the gender of the speaker, the corresponding voiceprint bubble can be found by finding the text label (male or female). The key information corresponding to the voiceprint bubble shown in fig. 13 is a male, the background picture is a blue compact picture, and the voiceprint picture frame is a plurality of strip-shaped patterns. The key information corresponding to the voiceprint bubble shown in fig. 14 is female, the background picture is a picture of a pearl gemstone, and the voiceprint picture frame is a plurality of gemstone patterns.
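The frequency-band judgment can be reduced to a rough heuristic on an estimated fundamental frequency. The 165 Hz boundary and the function name `gender_key_information` are assumptions for this sketch; the text only states that male voices generally occupy a lower band than female voices, and a trained voice model would replace this threshold in practice.

```python
def gender_key_information(fundamental_hz, threshold_hz=165.0):
    """Map an estimated fundamental frequency to the key information label.

    threshold_hz is an assumed boundary between the lower and higher bands.
    """
    return "male" if fundamental_hz < threshold_hz else "female"
```

The result is used only as a text label for selecting a bubble style, so a misclassification affects presentation rather than message content.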

In some embodiments, there are multiple pieces of key information, and multiple voiceprint bubbles are matched to them correspondingly. The display method comprises: when the voice information is played, switching between the voiceprint bubbles in a preset order and displaying the voiceprint picture frames in a carousel.

Specifically, there may be two or more pieces of key information related to the voice information. The key information may be a keyword in the voice information or a word associated with the keyword, a holiday, a word representing an emotion, or male or female. When the key information includes all four types, then while the voice information is played, the voiceprint bubble matched with the keyword (or the associated word), the one matched with the holiday, the one matched with the emotion, and the one matched with the gender can be switched in that order on the instant messaging interface, each displayed in a carousel. The voiceprint bubbles matched with the four pieces of key information may also be presented in other orders.

Of course, it is also possible to display on the instant messaging interface only the voiceprint bubble matched with one piece of key information, and to display its voiceprint picture frames in a carousel while the voice information is played; for example, only the voiceprint bubble matched with the keyword or a word associated with the keyword is displayed. Likewise, the voiceprint bubbles matched with two or three pieces of key information can be displayed on the instant messaging interface, switched in a preset order and displayed in a carousel while the voice information is played. Embodiments of the present invention are not limited in this respect: the selection may be set by the user according to preference, or made according to a preset priority of the key information.
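Selecting which voiceprint bubbles to show, and in what order, can be expressed as a priority filter. The type names, the default order, and the `limit` parameter in this sketch are assumptions; the text only says the order is preset and that one, two, three, or all four types may be shown.

```python
# Assumed default order; the text allows other orders.
DEFAULT_PRIORITY = ["keyword", "holiday", "emotion", "gender"]


def select_key_information(found, priority=DEFAULT_PRIORITY, limit=None):
    """Order the detected key information by priority and optionally truncate it.

    found maps a key information type to its text label.
    """
    ordered = [(kind, found[kind]) for kind in priority if kind in found]
    return ordered if limit is None else ordered[:limit]
```

A user preference could be implemented by passing a different `priority` list or a smaller `limit`.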

Further, when the key information is keywords and there are two or more of them, for example "I'll drive to pick up a friend to play billiards" contains the two keywords "car" and "billiards", then when the voice information is played, the voiceprint bubble of the car can be dynamically displayed first, followed by a switch to the voiceprint bubble of the billiards table, which is then dynamically displayed. In one example, when the voice information has not been played, the instant messaging interface statically displays the voiceprint bubble matched with the first keyword; after the voice information has been played, the instant messaging interface statically displays the voiceprint bubble matched with the last keyword. Preferably, when the voice information includes two or more keywords, the display switches to the voiceprint bubble matched with the next keyword, and displays it dynamically, at the moment playback reaches that keyword.
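Switching bubbles when playback reaches the next keyword amounts to a lookup of the current playback position in a sorted list of keyword start times. In this sketch, the name `bubble_at`, the `(start_seconds, label)` tuple format, and the example timestamps are assumptions; the speech recognizer is assumed to supply when each keyword is spoken.

```python
from bisect import bisect_right


def bubble_at(playback_time, keyword_times):
    """Return the bubble label to show at a playback position.

    keyword_times is a non-empty list of (start_seconds, label) sorted by start.
    """
    if not keyword_times:
        raise ValueError("keyword_times must not be empty")
    starts = [start for start, _ in keyword_times]
    # Index of the last keyword whose start is at or before playback_time.
    index = bisect_right(starts, playback_time) - 1
    # Before the first keyword is reached, show the first keyword's bubble.
    return keyword_times[max(index, 0)][1]
```

With `keyword_times = [(0.8, "car"), (2.5, "table")]`, the car bubble is shown until 2.5 seconds into playback, after which the second bubble takes over.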

In some embodiments, the display method comprises: determining the switching speed of the multiple voiceprint bubbles and the carousel speed of the voiceprint picture frames according to the speech rate. This makes the display of the voice information more engaging. When the speech rate is fast, the voiceprint bubbles switch quickly and the voiceprint picture frames cycle quickly; when the speech rate is slow, the voiceprint bubbles switch slowly and the voiceprint picture frames cycle slowly.

Referring to fig. 15 and 16, a method for displaying a voiceprint bubble according to an embodiment of the present invention can be implemented by the terminal 10 using a voiceprint bubble according to an embodiment of the present invention. The voiceprint bubble is used for presenting the voice information, and the voiceprint bubble is matched with the key information of the voice information. The display method comprises the following steps:

Step S210: statically displaying the voiceprint bubble when voice information is received; and

Step S220: dynamically displaying the voiceprint bubble when the voice information is played.

Specifically, the voiceprint bubble includes a background picture and a voiceprint picture frame. The voiceprint picture frame is related to the background picture and includes a group of dynamic picture frames. Dynamically displaying the voiceprint bubble comprises: displaying the voiceprint picture frames on the background picture in a carousel.

It can be understood that, when the voice message is received, the static voiceprint bubble display is performed by displaying only the background picture or displaying the voiceprint picture frame on the background picture in an overlapping manner, and at this time, the voiceprint picture frame is a frame of voiceprint picture selected from a group of dynamic picture frames. When voice information is played, the voiceprint bubbles are dynamically displayed, namely dynamic picture frames are displayed on the background picture in a carousel mode, and at the moment, the voiceprint picture frames are a group of dynamic picture frames. Further, the voiceprint picture frame can include a first portion and a second portion. When the voiceprint bubble is displayed and the voice information is not played, statically displaying a first part and a second part of the voiceprint picture frame on the background picture; when the voiceprint bubble is displayed and the voice information is played, the first part of the voiceprint picture frame is statically displayed on the background picture, and the second part of the voiceprint picture frame is dynamically displayed on the background picture.

It should be noted that the explanation and the beneficial effects of the method for displaying voiceprint bubbles in instant messaging according to the above embodiment are also applicable to the method for displaying voiceprint bubbles according to the present embodiment, and are not detailed here to avoid redundancy.

In some embodiments, the display length of the background picture is proportional to the recording length of the voice information, and the duration of the carousel display of the voiceprint picture frame is consistent with the playback duration of the voice information.
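
As a rough illustration of these two relationships, the following Python sketch computes a background width proportional to the recording length and selects the carousel frame for a given playback position, stopping once playback ends. The function names, the pixel scale `px_per_second`, and the fixed frame interval `interval_ms` are hypothetical values chosen for the example, not parameters taken from the embodiment.

```python
from typing import Optional


def background_width_px(duration_ms: int, px_per_second: float = 4.0) -> float:
    """Display length of the background picture, proportional to the
    recording length (the scale factor is an assumed value)."""
    return px_per_second * duration_ms / 1000


def frame_at(t_ms: int, duration_ms: int, n_frames: int,
             interval_ms: int = 200) -> Optional[int]:
    """Index of the frame to show at playback position t_ms.

    Returns None outside the playback window, so the carousel lasts
    exactly as long as the voice information is being played.
    """
    if n_frames <= 0 or t_ms < 0 or t_ms >= duration_ms:
        return None
    # Integer milliseconds avoid floating-point rounding at frame boundaries.
    return (t_ms // interval_ms) % n_frames
```

Doubling the recording length doubles the returned width, and `frame_at` returns `None` as soon as the playback position reaches the duration of the voice information.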

In some embodiments, the carousel speed of the voiceprint picture frame is proportional to the speech rate of the voice information.
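
A minimal sketch of this proportionality is given below. It assumes the speech rate is expressed in syllables per second and uses a hypothetical constant `fps_per_unit_rate`; neither the unit nor the constant is specified by the embodiment.

```python
def carousel_fps(speech_rate: float, fps_per_unit_rate: float = 1.25) -> float:
    """Carousel speed in frames per second, proportional to the speech rate."""
    if speech_rate <= 0:
        raise ValueError("speech_rate must be positive")
    return fps_per_unit_rate * speech_rate


def frame_interval_ms(speech_rate: float, fps_per_unit_rate: float = 1.25) -> float:
    """Time each frame stays on screen; inversely proportional to speech rate."""
    return 1000.0 / carousel_fps(speech_rate, fps_per_unit_rate)
```

Under these assumptions, a speaker who talks twice as fast produces a carousel that advances twice as fast, so each frame is shown for half as long.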

Referring to fig. 16, the terminal 10 of the present embodiment uses voiceprint bubbles. The voiceprint bubble is used for presenting the voice information, and the voiceprint bubble is matched with the key information of the voice information. The terminal 10 includes a display module 12. The display module 12 is used for: when voice information is received, displaying voiceprint bubbles statically; and dynamically displaying the voiceprint bubbles when the voice information is played.

That is, step S210 and step S220 of the method for displaying a voiceprint bubble according to the embodiment of the present invention may be implemented by the display module 12 of the terminal 10 using a voiceprint bubble according to the embodiment of the present invention.

The terminal 10 of the embodiment of the present invention displays the voice information in a personalized manner by using the voiceprint bubble matched with the key information of the voice information. The voiceprint bubble is displayed statically when the voice information is received and dynamically when the voice information is played, which makes the display more engaging and improves the user experience.

It is understood that the terminal 10 is an electronic terminal having a network connection function, including but not limited to a smart phone, a computer, a multimedia player, an electronic reader, or a wearable electronic device. For example, when the terminal 10 is a smart phone, the voiceprint bubbles can be displayed in clients such as WeChat, Tencent QQ, and the short message (SMS) client.

It should be noted that the above explanation and beneficial effects of the embodiment of the method for displaying voiceprint bubbles in instant messaging also apply to the terminal 10 using voiceprint bubbles according to the embodiment of the present invention, and are not repeated here to avoid redundancy.

In some embodiments, the terminal 10 further includes a voice acquisition module 14, a determination module 16, and a matching module 18. The voice acquisition module 14 is configured to acquire voice information in instant messaging. The determination module 16 is configured to determine key information related to the voice information according to the voice information. The matching module 18 is configured to match the corresponding voiceprint bubble for the voice information according to the key information.
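
The cooperation of the determination module 16 and the matching module 18 can be sketched in Python as follows. The sketch covers only the keyword-based way of determining key information and assumes the voice information has already been converted to text. The `VoiceInfo` type, the keyword table, the bubble identifiers, and the fallback to a plain bubble are all hypothetical and are not taken from the embodiment.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class VoiceInfo:
    transcript: str    # text assumed to come from a speech-to-text step
    duration_ms: int   # recording length


# Hypothetical tables used only for this illustration.
KEYWORD_TO_KEY_INFO: Dict[str, str] = {"birthday": "birthday", "love": "love"}
KEY_INFO_TO_BUBBLE: Dict[str, str] = {"birthday": "bubble_cake", "love": "bubble_heart"}
DEFAULT_BUBBLE = "bubble_plain"  # assumed fallback when nothing matches


def determine_key_info(voice: VoiceInfo) -> Optional[str]:
    """Determination step: derive key information from keywords in the text."""
    text = voice.transcript.lower()
    for keyword, key_info in KEYWORD_TO_KEY_INFO.items():
        if keyword in text:
            return key_info
    return None


def match_bubble(key_info: Optional[str]) -> str:
    """Matching step: pick the voiceprint bubble for the key information."""
    if key_info is None:
        return DEFAULT_BUBBLE
    return KEY_INFO_TO_BUBBLE.get(key_info, DEFAULT_BUBBLE)
```

For instance, `match_bubble(determine_key_info(VoiceInfo("Happy birthday!", 3000)))` selects the hypothetical `bubble_cake` entry, which the display module 12 would then show statically or dynamically.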

In some embodiments, the voiceprint bubble includes a background picture and a voiceprint picture frame. The voiceprint picture frame is related to the background picture and includes a group of dynamic picture frames. The display module 12 is configured to display the voiceprint picture frame on the background picture in a carousel manner.

In the description of the embodiments of the present invention, the terms "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or as implicitly indicating the number of technical features referred to. Thus, a feature defined as "first" or "second" may explicitly or implicitly include one or more of that feature. In the description of the embodiments of the present invention, "a plurality" means two or more unless specifically limited otherwise.

In the description of the embodiments of the present invention, it should be noted that, unless otherwise explicitly specified or limited, the terms "mounted," "coupled," and "connected" are to be construed broadly: a connection may be fixed, detachable, or integral; it may be mechanical, electrical, or communicative; and it may be direct, indirect through an intervening medium, internal to two elements, or an interaction between two elements. The specific meanings of the above terms in the embodiments of the present invention can be understood by those of ordinary skill in the art according to the specific situation.

In the description herein, references to the description of the terms "one embodiment," "some embodiments," "an illustrative embodiment," "an example," "a specific example" or "some examples" or the like mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, schematic representations of the above terms do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.

Any process or method descriptions in flow charts or otherwise described herein may be understood as representing modules, segments, or portions of code which include one or more executable instructions for implementing specific logical functions or steps of the process, and alternate implementations are included within the scope of the preferred embodiment of the present invention in which functions may be executed out of order from that shown or discussed, including substantially concurrently or in reverse order, depending on the functionality involved, as would be understood by those reasonably skilled in the art of the present invention.

The logic and/or steps represented in the flowcharts or otherwise described herein, such as an ordered listing of executable instructions that can be considered to implement logical functions, can be embodied in any computer-readable medium for use by or in connection with an instruction execution system, apparatus, or device, such as a computer-based system, processing module-containing system, or other system that can fetch the instructions from the instruction execution system, apparatus, or device and execute the instructions. For the purposes of this description, a "computer-readable medium" can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device. More specific examples (a non-exhaustive list) of the computer-readable medium would include the following: an electrical connection (electronic terminal) having one or more wires, a portable computer diskette (magnetic device), a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CDROM). Additionally, the computer-readable medium could even be paper or another suitable medium upon which the program is printed, as the program can be electronically captured, via for instance optical scanning of the paper or other medium, then compiled, interpreted or otherwise processed in a suitable manner if necessary, and then stored in a computer memory.

It should be understood that portions of embodiments of the present invention may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, the various steps or methods may be implemented in software or firmware stored in memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, any one or combination of the following techniques, which are known in the art, may be used: a discrete logic circuit having a logic gate circuit for implementing a logic function on a data signal, an application specific integrated circuit having an appropriate combinational logic gate circuit, a Programmable Gate Array (PGA), a Field Programmable Gate Array (FPGA), or the like.

It will be understood by those skilled in the art that all or part of the steps of the methods in the above embodiments may be carried out by a program instructing the relevant hardware. The program may be stored in a computer-readable storage medium and, when executed, performs one or a combination of the steps of the method embodiments.

In addition, functional units in the embodiments of the present invention may be integrated into one processing module, or each unit may exist alone physically, or two or more units are integrated into one module. The integrated module can be realized in a hardware mode, and can also be realized in a software functional module mode. The integrated module, if implemented in the form of a software functional module and sold or used as a stand-alone product, may also be stored in a computer readable storage medium.

The storage medium mentioned above may be a read-only memory, a magnetic or optical disk, etc.

Although embodiments of the present invention have been shown and described above, it is understood that the above embodiments are exemplary and should not be construed as limiting the present invention, and that variations, modifications, substitutions and alterations can be made in the above embodiments by those of ordinary skill in the art within the scope of the present invention.

Claims

1. A method for displaying voiceprint bubbles in instant messaging, characterized by comprising the following steps:

acquiring voice information in instant messaging;

determining key information related to the voice information according to the voice information;

matching corresponding voiceprint bubbles for the voice information according to the key information, wherein the voiceprint bubbles comprise a background picture and a voiceprint picture frame, the voiceprint picture frame is related to the background picture, and the voiceprint picture frame comprises a group of dynamic picture frames;

and statically displaying the voiceprint bubble when the voice information is not being played, and dynamically displaying the voiceprint bubble when the voice information is played.

2. The method according to claim 1, wherein the step of determining key information related to the voice information according to the voice information comprises:

converting the voice information into text information, and determining the key information according to key words in the text information;

or,

acquiring the time at which the voice information was generated and, when that time is a specific time, determining the key information according to the specific time;

or,

acquiring the tone of the voice information, and determining the key information according to the tone;

or,

acquiring the tone of the voice information, determining the gender of the speaker according to the tone, and determining the key information according to the gender of the speaker.

3. The method according to claim 1, wherein matching the corresponding voiceprint bubble for the voice information according to the key information and displaying the voiceprint bubble comprises: matching the corresponding background picture and voiceprint picture frame according to the key information, and displaying the voiceprint picture frame on the background picture in a carousel manner.

4. The method according to claim 3, wherein acquiring the voice information in instant messaging includes acquiring a recording length of the voice information, the display length of the background picture is proportional to the recording length of the voice information, and the duration of the carousel display of the voiceprint picture frame is consistent with the playback duration of the voice information.

5. The method according to claim 3 or 4, wherein acquiring the voice information in instant messaging includes acquiring a speech rate of the voice information, and displaying the voiceprint picture frame in a carousel manner includes:

determining the carousel speed of the voiceprint picture frame according to the speech rate of the voice information.

6. A method for displaying a voiceprint bubble, wherein the voiceprint bubble is used for presenting voice information, the voiceprint bubble is matched with key information of the voice information, the voiceprint bubble comprises a background picture and a voiceprint picture frame, the voiceprint picture frame is related to the background picture, the voiceprint picture frame comprises a group of dynamic picture frames, and the method comprises the following steps:

7. The method of claim 6, wherein dynamically displaying the voiceprint bubble comprises: displaying the voiceprint picture frame on the background picture in a carousel manner.

8. The method of claim 7, wherein the display length of the background picture is proportional to the recording length of the voice information, and the duration of the carousel display of the voiceprint picture frame is consistent with the playback duration of the voice information.

9. The method of claim 8, wherein the carousel speed of the voiceprint picture frame is proportional to the speech rate of the voice information.

10. A terminal using a voiceprint bubble, wherein the voiceprint bubble is used for presenting voice information, the voiceprint bubble is matched with key information of the voice information, the voiceprint bubble comprises a background picture and a voiceprint picture frame, the voiceprint picture frame is related to the background picture, the voiceprint picture frame comprises a set of dynamic picture frames, the terminal comprises a display module, and the display module is used for:

11. The terminal of claim 10, wherein the terminal further comprises a voice acquisition module, a determination module, and a matching module;

the voice acquisition module is configured to acquire voice information in instant messaging;

the determination module is configured to determine the key information related to the voice information according to the voice information;

and the matching module is configured to match the corresponding voiceprint bubble for the voice information according to the key information.

12. The terminal of claim 10, wherein the display module is configured to display the voiceprint picture frame on the background picture in a carousel manner.

13. The terminal of claim 12, wherein the display length of the background picture is proportional to the recording length of the voice information, and the duration of the carousel display of the voiceprint picture frame is consistent with the playback duration of the voice information.

14. The terminal of claim 13, wherein the carousel speed of the voiceprint picture frame is proportional to a speech rate of the voice information.