CN107040452B - Information processing method and device and computer readable storage medium - Google Patents

Information processing method and device and computer readable storage medium

Info

Publication number
CN107040452B
CN107040452B (application CN201710068757.8A)
Authority
CN
China
Prior art keywords
voice
information
user
instant messaging
messaging client
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201710068757.8A
Other languages
Chinese (zh)
Other versions
CN107040452A (en
Inventor
俞悦
帅颖斌
张书超
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang Yixin Technology Co Ltd
Original Assignee
Zhejiang Yixin Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang Yixin Technology Co Ltd
Priority to CN201710068757.8A
Publication of CN107040452A
Application granted
Publication of CN107040452B
Legal status: Active

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L51/00: User-to-user messaging in packet-switching networks, transmitted according to store-and-forward or real-time protocols, e.g. e-mail
    • H04L51/04: Real-time or near real-time messaging, e.g. instant messaging [IM]
    • H04L51/046: Interoperability with other network applications or services
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00: Speech recognition
    • G10L15/02: Feature extraction for speech recognition; Selection of recognition unit
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00: Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/48: Speech or voice analysis techniques specially adapted for particular use
    • G10L25/51: Speech or voice analysis techniques specially adapted for particular use for comparison or discrimination
    • G10L25/63: Speech or voice analysis techniques for comparison or discrimination, for estimating an emotional state
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L51/00: User-to-user messaging in packet-switching networks, transmitted according to store-and-forward or real-time protocols, e.g. e-mail
    • H04L51/06: Message adaptation to terminal or network requirements
    • H04L51/063: Content adaptation, e.g. replacement of unsuitable content
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L51/00: User-to-user messaging in packet-switching networks, transmitted according to store-and-forward or real-time protocols, e.g. e-mail
    • H04L51/21: Monitoring or handling of messages
    • H04L51/212: Monitoring or handling of messages using filtering or selective blocking

Landscapes

  • Engineering & Computer Science (AREA)
  • Signal Processing (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Health & Medical Sciences (AREA)
  • Acoustics & Sound (AREA)
  • Computational Linguistics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • General Health & Medical Sciences (AREA)
  • Hospice & Palliative Care (AREA)
  • Psychiatry (AREA)
  • Child & Adolescent Psychology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • User Interface Of Digital Computer (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

An embodiment of the invention provides an information processing method. The method comprises the following steps: a second instant messaging client receives first information content sent by a first instant messaging client; and a voice information set suitable for interacting with the user who sent the first information content is searched out from a predetermined information repository according to the first information content, so that the user of the second instant messaging client can select one or more voice options to interact with the user who sent the first information content, where each voice option comprises at least one piece of voice content. The method allows users of instant messaging to interact with one another through "voice battles" (exchanging expressive voice clips, by analogy with image or meme battles), which makes the modes of information interaction between users richer and more entertaining and brings a better user experience. Embodiments of the invention further provide an information processing apparatus and a computer-readable storage medium.

Description

Information processing method and device and computer readable storage medium
Technical Field
Embodiments of the present invention relate to the field of information processing, and more particularly, to an information processing method, an information processing apparatus, and a computer-readable storage medium.
Background
This section is intended to provide a background or context to the embodiments of the invention that are recited in the claims. The description herein is not admitted to be prior art by inclusion in this section.
At present, when a user communicates with other users via an instant messaging interface, the user often interacts through "image battles" (doutu) in order to express himself or herself better, that is, by exchanging pictures and emoticon packs that closely fit the conversational context. Although the prior art supports this image-battle style of interaction, what a picture can express is still limited, and it sometimes cannot convey rich human emotion well. Meanwhile, existing instant messaging technology only allows a user to send his or her own voice recordings or song-like audio files to other users, or provides voice calls between two users; this still cannot satisfy users' diversified expression needs and their need to interact with other users, and it also lacks entertainment value.
Disclosure of Invention
From the above, it can be seen that the prior art offers only the image-battle style of interaction in the instant messaging field, and sometimes this style cannot express the user's emotion well, which is a frustrating experience.
Therefore, an improved information processing method and apparatus are highly desirable, to overcome these defects of the prior art and to provide a better interactive experience between users.
In this context, embodiments of the present invention are intended to provide an information processing method, apparatus, and computer-readable storage medium.
In a first aspect of embodiments of the present invention, there is provided an information processing method, including: receiving, by a second instant messaging client, first information content sent by a first instant messaging client; and searching out, from a predetermined information repository and according to the first information content, a voice information set suitable for interacting with the user who sent the first information content, so that the user of the second instant messaging client can select one or more voice options to interact with that user, where each voice option comprises at least one piece of voice content.
In an embodiment of the present invention, the information processing method further includes: in response to receiving a preview operation by the user of the second instant messaging client on a selected voice option, executing a preview event for the selected voice option on the second instant messaging client, where the preview event comprises playing the voice content corresponding to the selected voice option and/or presenting the text content corresponding to that voice option.
In another embodiment of the present invention, the information processing method further includes: in response to receiving a sending operation by the user of the second instant messaging client on the selected voice option, sending the selected voice option to the first instant messaging client.
In yet another embodiment of the present invention, the voice information set includes one or more different types of sound packs, each type of sound pack including one or more voice options.
In yet another embodiment of the present invention, the first information content comprises voice information, and searching out the voice information set suitable for interaction from the predetermined information repository according to the first information content includes: in response to receiving a specific operation on the first information content, searching out from the predetermined information repository, according to a preset information matching rule, a voice information set that matches the first information content, the set being suitable for interacting with the user who sent the first information content.
In yet another embodiment of the present invention, searching out the voice information set matching the first information content according to the preset information matching rule includes: computing, according to a preset computation rule, the matching degree between the first information content and the voice information in the predetermined repository, based on the emotion determined from the first information content and on the voice features and/or sound-pack types of the voice information in the repository; and taking, as the voice information set matching the first information content, the voice information whose matching degree exceeds a preset threshold, or the voice information ranked within a specific top number by matching degree.
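The threshold-or-top-k selection described above can be sketched in a few lines. The data model, scoring formula, weights and threshold below are illustrative assumptions, not taken from the patent itself:

```python
# Illustrative sketch of the matching-degree rule: every voice entry in the
# predetermined repository carries an emotion label and a sound-pack type;
# the score combines emotion agreement with a per-pack-type weight, and the
# result set is either everything above a threshold or the top-k ranked entries.
from dataclasses import dataclass

@dataclass
class VoiceEntry:
    text: str
    emotion: str      # e.g. "angry", "amused" (hypothetical labels)
    pack_type: str    # e.g. "tv-drama", "celebrity"

def match_score(msg_emotion: str, entry: VoiceEntry, type_weights: dict) -> float:
    """Toy matching degree: 1.0 for an emotion match, scaled by the pack-type weight."""
    emotion_part = 1.0 if entry.emotion == msg_emotion else 0.0
    return emotion_part * type_weights.get(entry.pack_type, 0.5)

def select_voice_set(msg_emotion: str, repo: list, type_weights: dict,
                     threshold: float = 0.6, top_k: int = 3) -> list:
    """Return entries above the threshold, falling back to the top-k ranked ones."""
    scored = sorted(repo,
                    key=lambda e: match_score(msg_emotion, e, type_weights),
                    reverse=True)
    above = [e for e in scored
             if match_score(msg_emotion, e, type_weights) > threshold]
    return above if above else scored[:top_k]
```

The fallback to the top-k mirrors the claim's "greater than a preset threshold or ranked in the top specific number" alternative.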
In some embodiments of the present invention, according to the above further embodiment of the present invention, the information processing method further includes: presenting a plurality of voice options in the voice information set to a user of a second instant messaging client in a specific form; in response to receiving a selection of at least one of the plurality of voice options, making the selected at least one voice option a voice option to be sent to the first instant messaging client.
In some embodiments of the invention, according to the above further embodiment of the invention, the specific operation on the first information content comprises a press (for example a long press or force press) on the first information content; presenting the plurality of voice options in the specific form comprises presenting the plurality of voice options as a stack of cards; and before receiving the selection of at least one of the plurality of voice options, the method further comprises receiving a sliding operation on at least a part of the plurality of voice options so as to present the voice option to be selected.
In some embodiments of the present invention, the information processing method further includes: in response to receiving an operation by the user of the second instant messaging client for producing voice information, presenting a plurality of to-be-imitated voice options in the voice information set for the user to select; and in response to receiving the user's selection of at least one to-be-imitated voice option together with a voice-recording operation that imitates the voice content of the selected option, obtaining a voice file corresponding to the selected voice option.
In some embodiments of the present invention, the information processing method further includes: acquiring a picture to be processed; acquiring voice information recorded by the user of the second instant messaging client for that picture; processing the picture to be processed according to the recorded voice information to obtain a target picture; combining the target picture and the voice information according to a preset rule to obtain a combined voice-picture file; and sending the voice-picture file to the first instant messaging client.
In some embodiments of the present invention, according to the above embodiments of the present invention, processing the picture to be processed according to the voice features of the recorded voice information to obtain the target picture includes: performing one or more of rotating, warping and stretching on the picture to be processed according to at least one of the pitch, timbre, volume, duration and rhythm of the voice information, to obtain the target picture; and/or configuring, for the picture to be processed, text information corresponding to the voice information, to obtain a target picture that includes the configured text information.
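As a toy illustration of this feature-driven distortion step, the sketch below treats a picture as a 2-D list of pixel values and applies a stretch, a rotation (the "turning") and a mirror (standing in for the "twisting") depending on hypothetical normalized feature values; the thresholds and the choice of transforms are assumptions for illustration only:

```python
# Toy illustration of distorting a picture according to voice features.
# The picture is a 2-D list of pixel values; volume/pitch/duration are assumed
# to be normalized features extracted elsewhere, and the thresholds below are
# invented for illustration.
def transform_picture(pixels, volume: float, pitch: float, duration: float):
    """Return a transformed copy of `pixels` driven by the given voice features."""
    out = [row[:] for row in pixels]
    if volume > 0.8:                          # loud voice: stretch horizontally 2x
        out = [[p for p in row for _ in (0, 1)] for row in out]
    if pitch > 0.7:                           # high pitch: "turn" (rotate 90 degrees)
        out = [list(col) for col in zip(*out[::-1])]
    if duration > 5.0:                        # long clip: mirror as a simple "twist"
        out = [row[::-1] for row in out]
    return out
```

A real implementation would operate on image bitmaps with an imaging library, but the control flow (feature thresholds selecting transforms) would be the same.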
In some embodiments of the present invention, according to the above embodiments of the present invention, after the step of acquiring the recorded voice information for the picture, the method further includes: determining, according to at least one of the pitch, timbre, volume, duration and rhythm of the voice information, the vibration intensity and/or vibration time with which the first instant messaging client vibrates after receiving the voice information; and the step of sending the voice-picture file to the first instant messaging client includes: sending the voice-picture file to the first instant messaging client, so that the first instant messaging client vibrates with the determined vibration intensity and/or for the determined vibration time when it receives the voice-picture file.
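A minimal sketch of mapping voice features to the receiver's vibration parameters follows; the weights, the 0-to-1 feature scale, the 2-second cap and the millisecond conversion are all invented for illustration and are not specified by the patent:

```python
# Minimal sketch: derive the receiving client's vibration intensity and
# vibration time from voice features, as described above. All constants are
# illustrative assumptions.
def vibration_params(volume: float, pitch: float, duration_s: float):
    """Return (intensity in 0..1, vibration time in ms) for the receiving client."""
    intensity = min(1.0, 0.6 * volume + 0.4 * pitch)  # louder/higher -> stronger
    vibe_ms = int(min(duration_s, 2.0) * 500)         # cap vibration at 1000 ms
    return intensity, vibe_ms
```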
In some embodiments of the invention, according to the above embodiments of the invention, the target picture comprises an animated picture.
In a second aspect of embodiments of the present invention, there is provided an information processing apparatus comprising: a receiving unit, configured to receive first information content sent by a first instant messaging client;
and a searching unit, configured to search out, from a predetermined information repository and according to the first information content, a voice information set suitable for interacting with the user who sent the first information content, so that the user of the information processing apparatus can select one or more voice options to interact with that user, where each voice option comprises at least one piece of voice content.
In a third aspect of embodiments of the present invention, there is provided a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of: receiving, by a second instant messaging client, first information content sent by a first instant messaging client; and searching out, from a predetermined information repository and according to the first information content, a voice information set suitable for interacting with the user who sent the first information content, so that the user of the second instant messaging client can select one or more voice options to interact with that user, where each voice option comprises at least one piece of voice content.
With the information processing method and apparatus of the embodiments, a user can interact with other users in a voice-battle manner based on the instant messaging client, that is, by replying with voice clips that closely fit the chat context, which provides a brand-new interaction mode. Because voice as an expressive medium can convey exaggeration and infectious emotion well, this mode can satisfy the user's needs for emotional expression and for interaction with other users, significantly increases the fun of the interaction, and brings a new and better experience to the user.
Drawings
The above and other objects, features and advantages of exemplary embodiments of the present invention will become readily apparent from the following detailed description read in conjunction with the accompanying drawings. Several embodiments of the invention are illustrated by way of example, and not by way of limitation, in the figures of the accompanying drawings and in which:
fig. 1 schematically shows an application scenario diagram of an information processing method according to an embodiment of the present invention;
fig. 2 is a schematic view of an application scenario of an information processing method according to another embodiment of the present invention;
FIG. 3 schematically shows a flow diagram of an information processing method according to one embodiment of the invention;
FIG. 4 schematically shows a flowchart specifically describing step S102 in FIG. 3, according to an embodiment of the present invention;
fig. 5 schematically shows an application scenario diagram of an information processing method according to another embodiment of the present invention;
fig. 6 schematically shows a flowchart of an information processing method according to another embodiment of the present invention;
FIG. 7 schematically illustrates a first diagram of the application scenario described with respect to FIG. 6, in accordance with one embodiment of the present invention;
FIG. 8 schematically illustrates a second diagram of the application scenario described with respect to FIG. 6, in accordance with an embodiment of the present invention;
fig. 9 schematically shows a flowchart of an information processing method according to still another embodiment of the present invention;
FIG. 10 schematically illustrates a schematic diagram of an application scenario described with respect to FIG. 9, in accordance with one embodiment of the present invention;
fig. 11 schematically shows a schematic block diagram of an information processing apparatus according to an embodiment of the present invention;
fig. 12 schematically shows a schematic block diagram of an information processing apparatus according to another embodiment of the present invention;
fig. 13 schematically shows a schematic block diagram of an information processing apparatus according to still another embodiment of the present invention;
fig. 14 schematically shows a structural diagram of an information processing apparatus according to still another embodiment of the present invention;
FIG. 15 schematically shows a program product diagram of information processing according to one embodiment of the invention;
in the drawings, the same or corresponding reference numerals indicate the same or corresponding parts.
Detailed Description
The principles and spirit of the present invention will be described with reference to a number of exemplary embodiments. It is understood that these embodiments are given solely for the purpose of enabling those skilled in the art to better understand and to practice the invention, and are not intended to limit the scope of the invention in any way. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
As will be appreciated by one skilled in the art, embodiments of the present invention may be embodied as a system, apparatus, device, method, or computer program product. Accordingly, the present disclosure may be embodied in the form of: entirely hardware, entirely software (including firmware, resident software, micro-code, etc.), or a combination of hardware and software.
According to an embodiment of the invention, an information processing method, an information processing device and a computer-readable storage medium are provided.
In this document, it is to be understood that any number of elements in the figures are provided by way of illustration and not limitation, and any nomenclature is used for differentiation only and not in any limiting sense.
The principles and spirit of the present invention are explained in detail below with reference to several representative embodiments of the invention.
Summary of The Invention
The inventor has found that, in the prior art, a user can interact with other users only by means of image battles. When no picture closely expresses the intended emotion, the user has no more suitable expressive medium to fall back on and can only compensate by pairing the picture with text or other supplements. Even then the emotion may still not be expressed closely, which makes for a poor interactive experience; or the user may feel that supplementing a picture with text falls short of expressing the emotion properly in one step.
In embodiments of the invention, after the second instant messaging client receives the first information content sent by the first instant messaging client, voice options for interacting with the user who sent the first information content can be provided, so that the user of the second instant messaging client can select a suitable voice option with which to interact with that user.
Having described the general principles of the invention, various non-limiting embodiments of the invention are described in detail below.
Application scene overview
Referring to fig. 1, fig. 1 is a schematic view of an application scenario of an information processing method according to an embodiment of the present invention. The scenario presents an instant messaging interactive interface that offers a plurality of voice options for the user to select, so as to satisfy the need to interact with other users in a voice-battle manner. As shown in fig. 1, a user whose nickname is "Big Broken Bowl" is exchanging information with another user on the instant messaging interactive interface. When this user wants to interact with the other user through a voice battle, he or she may click a "sound" button on the interactive interface (for example, the "sound" button to the right of the input box in fig. 1) and is then shown a plurality of voice options, such as "Haha", "Oh...", "You are unreasonable...", "Right, you are heartless...", and "The water of West Lake..." in the example of fig. 1. When the user operates one of these options, for example by long-pressing "Right, you are heartless...", he or she can preview the sound of that option, and while previewing can also see the full text corresponding to it, "Right, you are heartless, you are cruel, you are unreasonable".
It should be noted that fig. 1 illustrates an example in which each sound pack includes only one voice option; in practical applications, a sound pack may include one or more voice options.
Optionally, referring to fig. 2, fig. 2 is a schematic view of an application scenario of an information processing method according to another embodiment of the present invention, where the application scenario exemplarily presents that a plurality of voice preview pages are provided for a user.
In fig. 2, a sound pack may include a plurality of voice options. When the user long-presses a sound pack in the option box below the input area, the voice preview pages corresponding to the voice options included in that pack are presented on the information processing interface as gradually enlarging bubbles. Optionally, after the bubble-enlarging animation finishes, the voice content of one of the preview pages is played directly by default. Preferably, during playback, a countdown of the remaining play time and/or an animation that varies with the sound frequency (such as a waveform representing the changing frequency) is also presented. Preferably, each time the user long-presses a sound pack in the option box below the input area, the voice preview pages corresponding to the voice options in that pack are presented in a stacked-card style. Preferably, the stacking order of the preview pages is determined by a default numbering: the preview page for voice option No. 1 is on the top layer by default, the page for option No. 2 is on the second layer, and the page for option No. n is on the bottom layer. Preferably, the user can slide the cards to switch between different voice preview pages, with different voice information played as the cards are switched.
Preferably, when the user wants to preview the pages of another sound pack, he or she may select that pack in the option box, whereupon the preview pages of the original pack disappear and the preview pages of the newly selected pack are presented. Further, with continued reference to fig. 2, the user may also short-press the send button in one of the voice preview pages to trigger sending the corresponding voice option directly to the other user, thereby achieving better expression and interaction by using emotionally expressive classic voice clips as the interaction medium. Further preferably, still with reference to fig. 2, when the user triggers sending of the voice option by a short press, the option sent to the other party is presented as a shrinking bubble on the current interactive page.
Preferably, the pronunciation of one and the same text may be available in a plurality of series or types of voices. A series or type may refer to the voices of different people, in particular different celebrities or well-known drama characters, used to express the same text. For example, the same line may be offered in the voice of one TV-drama character, in the voice of a particular celebrity, and so on.
It should be noted that the presentation of the voice options shown in fig. 1 is only an example; they may also be presented in other ways, for example grouped by voice type rather than by the text content of the voices. For instance, a sound pack for one celebrity, containing a plurality of voice options, may be presented at a first position of the instant messaging interactive interface, a pack for a second celebrity at a second position, and a pack for a third at a third position; when the user operates one of these sound packs, for example by clicking it, the voice options corresponding to the plurality of different text contents included in that pack are presented.
Exemplary method
An information processing method according to an exemplary embodiment of the present invention is described below with reference to fig. 3 in conjunction with the application scenarios of fig. 1 and 2. It should be noted that the above application scenarios are merely illustrated for the convenience of understanding the spirit and principles of the present invention, and the embodiments of the present invention are not limited in this respect. Rather, embodiments of the present invention may be applied to any scenario where applicable.
Fig. 3 schematically shows a flow diagram of an information processing method according to an embodiment of the present invention. In this embodiment, the execution subject of the information processing method may be a user terminal, and/or an instant messaging client installed on the user terminal, and/or a server communicatively connected with the user terminal. The user terminal may include, but is not limited to, a mobile phone, a tablet computer, a notebook computer, a personal computer, and so on; the instant messaging client may include, but is not limited to, a WeChat client, a QQ client, a Yixin client, and so on; and the server may include any server device capable of receiving and processing information sent by the user terminal and/or the instant messaging client.
As shown in fig. 3, the information processing method may include the steps of:
step S101, the second instant messaging client receives the first information content sent from the first instant messaging client.
The first instant messaging client and the second instant messaging client refer, for example, to clients of the same type (such as two WeChat clients, two QQ clients or two Yixin clients). In addition, the first and second instant messaging clients may be located on different user terminals, or on the same user terminal while being associated with different client accounts.
The first information content includes, but is not limited to, text, pictures, voice, video, animation, and the like. Receiving the first information content may be carried out over the Internet, a wide area network, a metropolitan area network, a local area network, and so on.
Step S102, searching out a voice information set suitable for interacting with the user sending the first information content from a predetermined information base according to the first information content, so that the user of the second instant messaging client selects one or more voice options from the voice information set to interact with the user sending the first information content, where each voice option may include at least one piece of voice content.
The predetermined information base includes, but is not limited to, a local voice base of a user terminal where the instant messaging client is located, a voice base in an instant messaging server with which the instant messaging client communicates when the instant messaging client is networked, or a combination of the two.
The searching out of the set of voice information suitable for interaction with the user sending the first information content from the predetermined information base may be performed according to preset search rules. For example, voice training samples are obtained from information exchanged by a large number of people in a specific environment (such as an ordinary conversation environment) or from a large number of classic character dialogues in television dramas, and a voice library is established from the voice training samples (the voice library may be stored in an instant messaging server). Each piece of voice content in the library has corresponding keywords/phrases/sentences; for example, for the piece of voice content "You are heartless, you are cruel, you make unreasonable trouble!", the corresponding entries include: keywords — "heartless", "cruel", "unreasonable trouble"; key sentence — "you are heartless, you are cruel, you make unreasonable trouble"; and the like. When the keyword "heartless" is found in the first information content, the pieces of voice content whose keywords/phrases/sentences contain "heartless" are searched out of the voice library according to that keyword and used as the voice information set suitable for interacting with the user who sent the first information content.
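The keyword lookup described above can be sketched as follows. This is an illustrative assumption of how the voice library might be indexed — the `VoiceClip` structure, the clip IDs, and the English renderings of the keywords are hypothetical and not taken from the patent:

```python
from dataclasses import dataclass, field

@dataclass
class VoiceClip:
    """One piece of voice content in the library, with its key words/phrases."""
    clip_id: str
    text: str
    keywords: set = field(default_factory=set)

# A toy voice library; in the patent this may live on the instant messaging server.
VOICE_LIBRARY = [
    VoiceClip("c1",
              "You are heartless, you are cruel, you make unreasonable trouble!",
              {"heartless", "cruel", "unreasonable trouble"}),
    VoiceClip("c2", "I never expected it.", {"expect", "surprise"}),
]

def search_voice_set(first_information_content: str) -> list:
    """Return every clip whose keywords appear in the incoming message text."""
    text = first_information_content.lower()
    return [clip for clip in VOICE_LIBRARY
            if any(kw in text for kw in clip.keywords)]

matches = search_voice_set("Well, aren't you heartless!")
```

In practice the match would presumably run over stemmed or segmented tokens rather than raw substrings, but the lookup shape — message text in, candidate voice information set out — is the same.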
It should be noted that, since the first information content includes, but is not limited to, text, picture, voice, video, animation, etc., information content in non-text form may first be converted into text form so that the search of the voice library can be performed better. Pieces of voice content corresponding to the keywords/phrases/sentences of that text are then searched out of the voice library as the voice information set suitable for interacting with the user who sent the first information content.
Of course, the above-mentioned search rule for searching out the voice information set suitable for interacting with the user sending the first information content from the predetermined information base is only an example and is not meant to be limiting, and the present application may use other suitable search rules to search out the voice information set suitable for interacting with the user sending the first information content from the predetermined information base.
Wherein the voice information set includes one or more different types of sound packets, and each type of sound packet includes one or more voice options. As described above, the type of sound packet may refer to a voice style, such as a "princess" voice type, a "great waves" voice type, a "lingering" voice type, and the like. The one or more voice options may refer to the voice content corresponding to one or more pieces of text content in the same voice style. Optionally, each voice option includes at least one piece of voice content; for example, a certain voice option may include three pieces of voice content separated by equal time intervals.
Optionally, with continuing reference to fig. 3, the information processing method further includes:
step S103, in response to receiving the preview operation of the user of the second instant messaging client on the selected voice option, executing a preview event of the selected voice option on the second instant messaging client, wherein the preview event comprises playing the voice content corresponding to the selected voice option or/and presenting the text content corresponding to the voice option.
The preview operation may refer to a long-press operation performed by the user on the voice option. For example, if the user long-presses the voice option "Right, you are heartless … …" in fig. 1, a preview of the voice option is triggered. The preview mode includes, but is not limited to, playing part or all of the voice content corresponding to the voice option or/and presenting the text content corresponding to the voice option, i.e., "Right, you are heartless, you are cruel, you make unreasonable trouble."
Optionally, with continuing reference to fig. 3, the information processing method further includes:
and step S104, responding to the received sending operation of the user of the second instant messaging client on the selected voice option, and sending the selected voice option to the first instant messaging client.
The sending operation may refer to a short-press operation on the voice option by the user. For example, when the user short-presses the voice option "Right, you are heartless … …" in fig. 1, the voice option is triggered to be sent to the first instant messaging client, so that, with classic voice options full of emotional tension as the interactive medium, the users can express themselves and interact better.
Optionally, referring to fig. 4 and fig. 5, fig. 4 provides a flowchart specifically describing step S102 in fig. 3 according to an embodiment of the present invention, and fig. 5 provides an application scenario diagram of an information processing method according to another embodiment of the present invention, which corresponds to the flowchart in fig. 4. According to fig. 4, if the first information content includes voice information, the step S102 of searching a predetermined information base for a voice information set suitable for interacting with the first information content according to the first information content includes:
step S1021, responding to the fact that specific operation on the first information content is received, searching out a voice information set matched with the first information content from a preset information base according to a preset information matching rule, wherein the voice information set is used as a voice information set suitable for interacting with a user sending the first information content.
The specific operation on the first information content may refer to a press operation on the first information content.
Wherein the preset information matching rule may include: according to a preset calculation rule, calculating the matching degree between the first information content and the voice information in the predetermined information base on the basis of the emotion determined from voice features or/and the type of the sound packet; and taking the voice information whose matching degree is greater than a preset threshold value, or the voice information whose matching degree ranks within a specific top number, as the voice information set matched with the first information content.
Wherein, the emotion determined based on the voice feature in the first information content, for example, refers to the emotion contained in the first information content determined based on the tone of voice and the comparison between the pitch and the given reference value. For example, when the tone of speech is rising and the decibel value of the sound exceeds the reference value M, it is determined that the first information content includes an angry emotion. Further, for example, on the basis of the difference between the decibel value of the sound and the reference value, it may be further determined that the first information content includes an amplitude value (or a likelihood value or a weight score) of an angry emotion, for example, when the difference is smaller than a given value C1, the amplitude value (or the likelihood value or the weight score) is 20%; when the difference is greater than C1 and less than C2, the amplitude value is 60%. Of course, the above-mentioned method of determining the emotion magnitude value or emotion weight score based on the voice features is only an example, and other methods suitable for the present invention may be adopted to determine the weight score of a certain emotion or several emotions included in the voice information. In addition, emotion related values may also be manually set for voice information directly by a user or system administrator.
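A minimal sketch of the decibel-comparison rule just described. The reference value `M`, the cut-offs `C1` and `C2`, and the 90% top bracket are assumed parameters — the patent names the quantities but gives no concrete values:

```python
# Hypothetical thresholds, in decibels; the patent only names M, C1, C2.
M, C1, C2 = 60.0, 10.0, 25.0

def anger_amplitude(decibel: float, rising_tone: bool) -> float:
    """Return the amplitude value (weight score) of an angry emotion
    for one utterance, following the interval rule in the text."""
    if not rising_tone or decibel <= M:
        return 0.0                 # no anger detected
    diff = decibel - M
    if diff < C1:
        return 0.20                # "when the difference is smaller than C1 … 20%"
    if diff < C2:
        return 0.60                # "when the difference is greater than C1 and less than C2 … 60%"
    return 0.90                    # assumed top bracket (not specified in the text)
```

As the text notes, this rule is only one example; the same weight score could equally come from a trained classifier or from values set manually by the user or a system administrator.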
Wherein the type of the sound packet to which the first information content belongs is as described above, for example, a sound packet belonging to a "Chinese imperial concubine" voice type.
The text content corresponding to the first information content is obtained by converting the voice information into the corresponding language text based on voice recognition.
One example of calculating the matching degree between the first information content and the pieces of voice information in the predetermined information base according to the preset calculation rule is as follows.

One way to define the matching degree (here called the suitability) is:

suitability = (total emotion matching degree X) × (matching degree Y of the type of sound packet).

It should be noted that if one of the two values in the above calculation is 0, the matching degree is determined by the other, non-zero value. For example, if the first information content is a sound recorded by the user, Y takes the value 0 and suitability = X. If both terms are 0, the result is 0 or is determined according to other predetermined rules.
Each of the above calculation processes is detailed next:
first, the total emotional matching degree of the first information content n1 and a certain piece of speech information n2 in a predetermined information base is calculated.
Suppose the first information content n1 is "Right, you are heartless, you are cruel, you make unreasonable trouble!". The weight scores of the emotions contained in n1 are obtained according to a known method for determining the weight scores of emotions contained in voice information (for example, according to the method of determining the amplitude value of an angry emotion described above, or according to emotion-related values manually set for the voice information by the user or a system administrator): joy a1 = 50%, anger b1 = 5%, sorrow c1 = 5%, happiness d1 = 20%, surprise e1 = 10%, fear f1 = 5%, contemplation g1 = 5%, where a1 + b1 + c1 + d1 + e1 + f1 + g1 = 100%.
It is assumed that the weight scores of the same emotions contained in a certain piece of voice information n2 in the predetermined information base are obtained by calculation and are denoted a2, b2, c2, d2, e2, f2 and g2, respectively.
If a2 ≥ a1, the matching degree of n1 and n2 for the joy emotion is a = a1/a2; otherwise, a = a2/a1. Preferably, if at least one of a1 and a2 has the value 0, then a takes the value 0.
Similarly, the matching values of n1 and n2 in other emotions are calculated and are respectively represented by b, c, d, e, f and g.
Further, if none of a, b, c, d, e, f and g is 0, the total emotion matching degree of the first information content n1 with n2 is X = a × b × c × d × e × f × g; otherwise, if one or more of a, b, c, d, e, f and g are 0, the total emotion matching degree X equals the product of the matching degrees whose values are not 0.

Next, the matching degree Y between the first information content n1 and the type of the sound packet to which a certain piece of voice information n2 in the predetermined information base belongs is calculated.
In one example, if two pieces of speech information belong to the same sound package and the respective corresponding language texts are similar or identical in language structure, the matching degree of the two pieces of speech information is defined as P1, for example, 100%; if two pieces of speech information do not belong to the same sound package, but the language texts corresponding to the two pieces of speech information are similar or identical in language structure, the matching degree of the two pieces of speech information is defined as P2, for example, 80%; otherwise, the matching degree of the two pieces of speech information is defined as P3, for example, 0.
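The whole calculation — per-emotion ratios, their product X, the sound-packet matching degree Y, and suitability = X × Y with the zero-term fallback — can be sketched as follows (function names are illustrative):

```python
def emotion_match(w1: float, w2: float) -> float:
    """Per-emotion matching degree: min/max ratio; 0 if either weight is 0."""
    if w1 == 0 or w2 == 0:
        return 0.0
    return min(w1, w2) / max(w1, w2)

def total_emotion_match(weights1, weights2) -> float:
    """Total emotion matching degree X: product of the non-zero per-emotion
    ratios (a, b, c, d, e, f, g); 0 if every ratio is 0."""
    ratios = [emotion_match(w1, w2) for w1, w2 in zip(weights1, weights2)]
    nonzero = [r for r in ratios if r != 0]
    result = 1.0
    for r in nonzero:
        result *= r
    return result if nonzero else 0.0

def suitability(x: float, y: float) -> float:
    """suitability = X * Y; if one term is 0, use the other non-zero term."""
    if x == 0.0:
        return y
    if y == 0.0:
        return x
    return x * y
```

For the example weights above (joy 50%, anger 5%, sorrow 5%, happiness 20%, surprise 10%, fear 5%, contemplation 5%), a clip with identical weights gives X = 1.0, and if the two clips share a sound packet and language structure (Y = 1.0) the suitability is 1.0.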
Optionally, in order to provide voice options better suited to the user after first information content including voice information is received, the voice options the user frequently uses may be taken into account together with the matching degree obtained above: from the pieces of voice information with a high matching degree, those used relatively frequently by the user are further screened out, so that the user can interact with the other party in a style with more personal character.
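One hypothetical way to combine the matching degree with the user's usage frequency, as just suggested; the 0.7/0.3 weighting and the linear normalization are purely assumptions, since the patent does not specify how the two signals are combined:

```python
def rerank(candidates, usage_counts):
    """Re-rank candidate clips by matching degree blended with usage frequency.

    candidates:   list of (clip_id, match_degree) pairs
    usage_counts: dict mapping clip_id -> times the user has sent that clip
    """
    max_use = max(usage_counts.values(), default=1) or 1

    def score(item):
        clip_id, match = item
        # Assumed blend: 70% matching degree, 30% normalized usage frequency.
        return 0.7 * match + 0.3 * (usage_counts.get(clip_id, 0) / max_use)

    return sorted(candidates, key=score, reverse=True)

ranked = rerank([("a", 0.9), ("b", 0.85)], {"a": 1, "b": 10})
```

Here clip "b", though a slightly weaker match, outranks "a" because the user sends it far more often — the "more personal character" effect described above.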
Based on the press operation, the second instant messaging client can be triggered to search out the voice information set interacted with the user sending the first information content from a preset information base according to the type of the voice packet to which the first information content belongs or/and the text content corresponding to the voice information. The search may be performed according to dimensions such as a voice type (or a voice series) to which the operated first information content belongs, a text meaning corresponding to the voice content, and an emotion corresponding to the voice content (for example, the emotion is determined by intonation, decibel value, speed of speech, and the like), and based on a search rule for the dimensions.
Optionally, the information processing method of the present application further includes:
step S1022, presenting a plurality of voice options in the voice information set to the user of the second instant messaging client in a specific form.
Wherein presenting the plurality of voice options in a specific form may mean presenting the plurality of voice options in the form of a stack of cards. As shown in fig. 5, each card may present a voice play icon with a play-duration prompt (e.g., 12 seconds) for the voice option, the text corresponding to the voice content (e.g., "Then are you not heartless, not cruel, not making unreasonable trouble!?"), a send icon that triggers sending the voice option to the opposite user, and so on.
Step S1024, in response to receiving the selection of at least one voice option in the plurality of voice options, making the selected at least one voice option become a voice option to be sent to the first instant messaging client.
For example, the user selects the card corresponding to the voice option "Then are you not heartless, not cruel, not making unreasonable trouble!?" shown in fig. 5, and clicks the send-sound icon on that card to send the voice option to the opposite user.
Optionally, with continuing reference to fig. 4, before the selecting of at least one of the plurality of voice options, further comprising:
in step S1023, a sliding operation on at least a part of the voice options in the plurality of voice options is received to present the voice options to be selected.
For example, when a plurality of voice options are presented as a stack of cards as shown in fig. 5, the user can slide the cards up and down to shift the cards in the upper or lower layers up or down by one layer, and can thereby select a satisfactory voice option from among the voice options and send it to the opposite user.
Optionally, referring to fig. 6, fig. 7 and fig. 8, fig. 6 provides a schematic flow chart of an information processing method according to another embodiment of the present invention, and fig. 7 and fig. 8 provide a first schematic diagram and a second schematic diagram of an application scenario described with reference to fig. 6 according to an embodiment of the present invention, respectively. According to fig. 6, the information processing method further includes:
step S105, responding to the operation of receiving the user of the second instant messaging client for making the voice information, and presenting a plurality of voice options to be imitated in the voice information set for the user to select.
As shown in fig. 7, the user operates the "add sound" icon for the voice information to be made by clicking, double-clicking, or the like, thereby triggering the instant messaging application interface to present a plurality of voice options to be imitated in a voice information set for the user to select. For example, the user may select any one of the voice options in fig. 7 and expand it to obtain the sound card shown on the right side of fig. 7; fig. 7 presents a drama-themed voice option in the first position and the "unreasonable trouble" voice option in the second position. Further, while the client presents the sound card shown on the right side of fig. 7, the user may click another voice option, whereupon the originally presented sound card is folded up or hidden and the sound card corresponding to the newly clicked voice option is presented. The user can slide left or right through the cards corresponding to the voice options in the sound packet to play the sound, text, and other content of the corresponding card, and can click the recording button in a card and then imitate, record and save the voice content corresponding to that card.
It should be noted that the voice information set here and the "voice information set suitable for interacting with the user sending the first information content" described above with reference to fig. 3 may be different logical concepts. The voice information set here includes a plurality of voice options to be imitated, and each voice option to be imitated may include the voice content corresponding to one or more texts spoken in the same voice style, for example, the voice content of a classic line spoken in a particular drama character's voice. The voice information set of fig. 3, by contrast, is a set matched to the first information content according to that content and a specific information matching rule, from which the user selects a voice option in order to interact with the other user.
Step S106, responding to the received selection of the user of the second instant messaging client to at least one voice option to be simulated and the voice recording operation for simulating the voice content of the selected voice option, and obtaining a voice file corresponding to the selected voice option.
For example, when the user clicks the record button, the client displays a page entering the record state shown in the left side of fig. 8, the original voice content in the voice option is not played temporarily, and the page may also display a countdown schedule and/or an animation of the frequency change or volume change of the real-time recorded sound (for example, a waveform representing the frequency change of the sound). When the countdown is finished, indicating that the recording is finished, a play button on the page is activated, and the recorded sound can be played by clicking the play button. Further, if the user is not satisfied with the currently recorded sound, the sound may be re-recorded. Further, a "save and send" button may also be presented on the page to save and send the recorded sound to the corresponding user.
It should be noted that the sound recorded in this embodiment may be automatically merged with the background sound in the voice option on the basis of the prior art, so as to obtain a sound that has both the sound features of the voice option and the unique sound features of the individual, further improving the interactive experience. Specifically, the voice options of the present application store the original voice content and the background sound in distinguishable formats; when a user selects any voice option for imitation recording, the embodiment of the present application can automatically extract the background sound from the voice option on the basis of the prior art and combine that background sound with the imitation-recorded voice content to generate a new voice option.
Therefore, based on the embodiment, the user can obtain a unique voice file by combining the characteristics of the voice of the user and the characteristics of the original pronunciation of the voice option in the voice packet, so as to better interact with the user of the other party.
Optionally, in order to comprehensively express the emotion of the user in the interaction process by combining the respective characteristics of the picture and the voice, in the embodiment of the present application, a comprehensive file including the picture and the voice may be provided for the user, please refer to fig. 9 and fig. 10, fig. 9 provides a flowchart of an information processing method according to another embodiment of the present invention, and fig. 10 provides a schematic view of an application scenario described with reference to fig. 9 according to an embodiment of the present invention. According to fig. 9, the information processing method further includes:
step S107, acquiring a picture to be processed.
The method for acquiring the pictures includes, but is not limited to, selecting one or more pictures from a large number of pictures or photos stored in other servers locally or in a networked state by the user terminal, taking the pictures in real time, and the like.
Step S108, acquiring voice information recorded by the user of the second instant messaging client aiming at the target picture.
For example, after the user selects a certain photo from the terminal locally based on the instant messaging client of the embodiment, an interface as shown in the left side of fig. 10 is presented for the user, and in the interface, the user can record sound. Optionally, during the recording of the sound for the photo by the user, the status information of "recording" is displayed below the photo. Alternatively, the user may listen on trial after recording the sound and, if not satisfied, re-record. Alternatively, the user may record a plurality of different types of sounds for the same picture, for example, a sound file recording the same text with various types of sounds such as a normal sound, a roaring sound, a gentle sound, a sound imitating a character of a television show, and the like, thereby forming a plurality of voice picture files later.
And step S110, processing the picture to be processed according to the recorded voice information to obtain a target picture.
Specifically, the picture to be processed may be processed by an existing algorithm.
More specifically, according to at least one feature of pitch, tone, volume, duration, rhythm, etc. of the voice information and an existing algorithm, the to-be-processed picture which is originally static may be subjected to one or more of turning, twisting, and stretching to obtain a target picture which is a moving picture, or/and text information corresponding to the voice information may be configured for the to-be-processed picture to obtain a target picture including the configured text information.
Specifically, the processing rules of the algorithm existing here include, but are not limited to, at least one of the following ways:
turning the picture to be processed to a corresponding degree according to the pitch or/and rhythm of the voice information;
-according to the timbre or/and volume of the speech information, twisting the picture to be processed to a corresponding degree;
-matching the picture to be processed with a corresponding text; for example, according to the existing voice recognition technology, a text corresponding to the voice information is recognized, and the text is presented together with the picture in a specific form.
-stretching the picture to be processed to a corresponding degree according to the timbre or/and duration of the voice information.
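The processing rules listed above could be sketched as a feature-to-transformation mapping like the following. Every threshold and degree formula here is an illustrative assumption — the patent defers the actual transformations to existing algorithms:

```python
def plan_picture_effects(pitch_hz: float, volume_db: float,
                         duration_s: float, rhythm_bpm: float) -> list:
    """Map voice features to (effect, degree) pairs for the picture processor.

    Thresholds (300 Hz, 120 BPM, 70 dB, 5 s) and the degree normalizations
    are hypothetical; the patent only says each feature drives a
    'corresponding degree' of turning, twisting, or stretching.
    """
    effects = []
    if pitch_hz > 300 or rhythm_bpm > 120:
        effects.append(("flip", min(pitch_hz / 600.0, 1.0)))
    if volume_db > 70:
        effects.append(("twist", min(volume_db / 100.0, 1.0)))
    if duration_s > 5.0:
        effects.append(("stretch", min(duration_s / 10.0, 1.0)))
    return effects
```

An actual implementation would hand each (effect, degree) pair to an image library to synthesize the frames of the resulting moving picture.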
And step S111, obtaining a combined voice picture file according to a preset rule aiming at combining the target picture and the voice information.
The rule for combining the picture and the voice information can be implemented with the prior art. The combined voice picture file may be, for example, a moving picture whose accompanying voice can be played.
Step S112, sending the voice picture file to the first instant messaging client.
For example, a picture file containing a recorded utterance is sent to the first instant messaging client; when the user of the first instant messaging client downloads and opens the voice picture file, that user sees the picture while hearing the recorded voice content.
Optionally, with continuing to refer to fig. 9, after the step S108 of obtaining the voice information recorded by the user of the second instant messaging client for the target picture, the method further includes:
step S109, determining the vibration intensity or/and the vibration time of the first instant messaging client when receiving the voice message according to at least one characteristic of the pitch, the tone, the volume, the duration and the rhythm of the voice message.
For example, vibrations with different vibration intensities are set according to the pitch of the voice message, and when the pitch is within a preset first interval, the vibration intensity is weak; when the pitch is within a predetermined second interval, the vibration intensity is medium; when the pitch is within a predetermined third interval, the intensity of the vibration is strong. That is, if the pitch of the voice message received by the first instant messaging client is within the third interval, the first instant messaging client vibrates strongly. Similarly, the time when the first instant messaging client vibrates when receiving the piece of voice information can be determined according to at least one characteristic of pitch, tone, volume, duration and rhythm.
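The interval-based vibration mapping just described can be sketched as follows. The interval boundaries are assumptions, since the patent only speaks of first/second/third preset intervals:

```python
# Hypothetical pitch intervals (Hz) -> vibration intensity; the patent
# defines only "first", "second", and "third" preset intervals.
INTERVALS = [
    (0.0, 150.0, "weak"),
    (150.0, 300.0, "medium"),
    (300.0, 10_000.0, "strong"),
]

def vibration_intensity(pitch_hz: float) -> str:
    """Return the vibration intensity the receiving client should use."""
    for lo, hi, intensity in INTERVALS:
        if lo <= pitch_hz < hi:
            return intensity
    return "weak"   # fallback for out-of-range pitch
```

The same table-lookup shape extends naturally to vibration duration, or to other features such as volume or rhythm, as the text suggests.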
Accordingly, according to the above, the step S112 of sending the voice picture file to the first instant messaging client in fig. 9 may include:
-sending the voice picture file to the first instant messaging client so that the first instant messaging client vibrates at the determined vibration intensity or/and vibration time when receiving the voice picture file.
Therefore, based on this embodiment, interaction between users can be realized in a picture-battle (meme-exchange) manner, and vibrations of different intensities or durations can be produced when the voice information is sent to the opposite user's client, according to the different features of the voice information, thereby further enhancing the information interaction between the two or more parties.
Exemplary device
Having described the method of the exemplary embodiment of the present invention, next, an information processing apparatus of the exemplary embodiment of the present invention is explained with reference to fig. 11.
Fig. 11 schematically shows a structural diagram of an information processing apparatus provided according to an embodiment of the present invention. As shown in fig. 11, the information processing apparatus may include:
the receiving unit 11 is configured to receive the first information content sent from the first instant messaging client.
The search unit 12 is configured to search, according to the first information content, a set of voice information suitable for interaction with a user who sends the first information content from a predetermined information library, so that the user of the information processing apparatus selects one or more voice options from the set of voice information to interact with the user who sends the first information content, where each voice option includes at least one piece of voice content.
Wherein the set of speech information may include one or more different types of sound packets, each type of sound packet may include one or more speech options therein.
Optionally, the information processing apparatus further includes:
and the preview unit 13 is configured to, in response to receiving a preview operation of the selected voice option by the user of the second instant messaging client, execute a preview event of the selected voice option on the second instant messaging client, where the preview event includes playing voice content corresponding to the selected voice option or/and presenting text content corresponding to the voice option.
Optionally, the information processing apparatus further includes:
and the sending unit 14 is used for responding to the sending operation of the user of the second instant messaging client on the selected voice option, and sending the selected voice option to the first instant messaging client.
Optionally, when the first information content includes voice information, the searching unit 12 searches a voice information set suitable for interacting with the first information content from a predetermined information base according to the first information content, including:
-in response to receiving a specific action on the first information content, searching out a set of speech information matching the first information content from a predetermined information repository as a set of speech information suitable for interaction with the user sending the first information content according to a preset information matching rule. Wherein the specific operation of the search unit 12 on the first information content comprises: and carrying out a pressing operation on the first information content.
The searching out the voice information set matched with the first information content from the preset information base according to the preset information matching rule comprises the following steps: according to a preset calculation rule, and according to the emotion determined by the first information content and the voice information in the preset information base based on the voice characteristics or/and the type of the voice packet, calculating the matching degree of the first information content and the voice information in the preset information base; and taking the voice information with the matching degree larger than a preset threshold value or the voice information with the matching degree ranked in the top specific number in a preset information base as the voice information set matched with the first information content.
Optionally, the search unit 12 further comprises a presentation module for presenting a plurality of voice options in the voice information set to a user of the second instant messaging client in a specific form. For example, multiple voice options are presented in a stack of multiple cards.
Optionally, the search unit 12 further includes a selection response module, configured to, in response to receiving a selection of at least one voice option in the plurality of voice options, make the selected at least one voice option be a voice option to be sent to the first instant messaging client.
Optionally, before the search unit 12 receives the selection of at least one of the voice options, the search unit 12 presents the voice option to be selected according to the received sliding operation of at least a part of the voice options.
Optionally, referring to fig. 12, the information processing apparatus further includes:
and the voice making operation responding unit 15 is used for responding to the operation of receiving the user voice making information of the second instant messaging client, and presenting a plurality of voice options to be imitated in the voice information set for the user to select.
And the imitation operation responding unit 16 is configured to, in response to receiving a selection of at least one voice option to be imitated by the user of the second instant messaging client and a voice recording operation for imitating the voice content of the selected voice option, obtain a voice file corresponding to the selected voice option.
Optionally, referring to fig. 13, in order to comprehensively express the emotion of the user in the interaction process by combining the respective characteristics of the picture and the voice, the information processing apparatus further includes:
a picture acquiring unit 21, configured to acquire a picture to be processed.
And a recorded voice information obtaining unit 22, configured to obtain voice information, recorded by the user of the second instant messaging client, for the target picture.
And the target picture acquiring unit 23 is configured to process the picture to be processed according to the recorded voice information to obtain a target picture. Wherein the target picture includes, but is not limited to, a moving picture.
And the voice picture file acquiring unit 24 is configured to obtain a combined voice picture file according to a predetermined rule for combining the target picture and the voice information.
And the voice picture file sending unit 25 is configured to send the voice picture file to the first instant messaging client.
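The "predetermined rule" by which unit 24 combines the target picture and the voice information into one file is left open. One minimal illustrative container, assumed here purely for the sketch, is a pair of length-prefixed byte fields:

```python
# Illustrative container format for a combined voice-picture file.
# The 4-byte big-endian length prefixes are an assumption; the text above
# only requires that picture and voice be combined by a predetermined rule.
import struct

def pack_voice_picture(picture: bytes, voice: bytes) -> bytes:
    """Concatenate picture and voice payloads, each preceded by its length."""
    return (struct.pack(">I", len(picture)) + picture +
            struct.pack(">I", len(voice)) + voice)

def unpack_voice_picture(blob: bytes):
    """Recover the two payloads from a packed voice-picture file."""
    pic_len = struct.unpack_from(">I", blob, 0)[0]
    picture = blob[4:4 + pic_len]
    voice_off = 4 + pic_len
    voice_len = struct.unpack_from(">I", blob, voice_off)[0]
    voice = blob[voice_off + 4:voice_off + 4 + voice_len]
    return picture, voice

blob = pack_voice_picture(b"GIF89a...", b"AMR voice data")
pic, voc = unpack_voice_picture(blob)
print(pic == b"GIF89a...", voc == b"AMR voice data")  # True True
```

A real implementation would more likely reuse an existing media container, but the round-trip above shows the essential property the rule must have: the receiving client can split the file back into its picture and voice parts.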
Optionally, the target picture obtaining unit 23 processing the picture to be processed according to the voice features of the recorded voice information to obtain the target picture specifically includes:
-according to at least one characteristic of pitch, timbre, volume, duration and rhythm of the voice information, performing one or more of flipping, twisting and stretching on the picture to be processed to obtain a target picture; or/and
-configuring text information corresponding to the voice information for the picture to be processed, to obtain a target picture comprising the configured text information.
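A toy sketch of driving picture transformations from voice features, representing the picture as a list of pixel rows. Which feature triggers which operation, and the thresholds used, are assumptions; the text only names flip/twist/stretch-style operations as examples:

```python
# Hypothetical mapping from voice features to picture operations.
# Thresholds and feature-to-operation pairings are assumptions.

def transform(picture, pitch, volume, duration):
    """picture: list of rows (each a list of pixels). Returns a new picture."""
    out = [row[:] for row in picture]
    if pitch > 300:              # high pitch -> flip vertically
        out = out[::-1]
    if volume > 0.8:             # loud voice -> stretch horizontally
        out = [[px for px in row for _ in range(2)] for row in out]
    return out

pic = [[1, 2], [3, 4]]
print(transform(pic, pitch=400, volume=0.9, duration=1.2))
```

In practice the same idea would be applied frame by frame with an imaging library to produce the moving target picture.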
Optionally, the information processing apparatus further includes a vibration determining unit, configured to determine, according to at least one characteristic of pitch, timbre, volume, duration and rhythm of a piece of voice information, the vibration strength or/and the vibration time with which the first instant messaging client vibrates when receiving that piece of voice information. Further, after the recorded voice information obtaining unit 22 obtains the voice information recorded by the user of the second instant messaging client for the target picture, the vibration determining unit determines the vibration strength or/and the vibration time accordingly.
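The mapping from voice features to vibration strength and time is likewise left unspecified; below is a minimal sketch under assumed linear mappings and clamping ranges (all constants are invented for illustration):

```python
# Sketch of deriving vibration strength and time from voice features.
# The linear mapping, units, and clamping ranges are assumptions.

def vibration_params(volume, duration, pitch):
    """volume in [0, 1], duration in seconds, pitch in Hz (assumed units).

    Returns (strength in [0, 1], vibration time in milliseconds)."""
    strength = min(1.0, 0.3 + 0.7 * volume)          # louder -> stronger
    vib_ms = min(1000, int(200 + 100 * duration))    # longer voice -> longer buzz
    if pitch > 500:                                  # sharp voice -> short pulse
        vib_ms = min(vib_ms, 300)
    return strength, vib_ms

print(vibration_params(volume=0.5, duration=3.0, pitch=600))
```

On a real handset these two values would feed the platform vibration API (e.g. amplitude and duration on Android's `VibrationEffect`, assumed here as the target).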
Optionally, the sending, by the voice picture file sending unit 25, the voice picture file to the first instant messaging client specifically includes:
-sending the voice picture file to the first instant messaging client so that the first instant messaging client vibrates at the determined vibration intensity or/and vibration time when receiving the voice picture file.
Exemplary device
Having described the method and apparatus of an exemplary embodiment of the present invention, next, an information processing apparatus according to another exemplary embodiment of the present invention is described.
As will be appreciated by one skilled in the art, aspects of the present invention may be embodied as a system, method or program product. Thus, various aspects of the invention may be embodied in the form of: an entirely hardware embodiment, an entirely software embodiment (including firmware, microcode, etc.), or an embodiment combining hardware and software aspects, all of which may generally be referred to herein as a "circuit," "module," or "system."
In some possible embodiments, an information processing apparatus according to the present invention may include at least one processing unit, and at least one storage unit. Wherein the storage unit stores program code that, when executed by the processing unit, causes the processing unit to perform the steps in the information processing method according to various exemplary embodiments of the present invention described in the above section "exemplary method" of the present specification. For example, the processing unit may execute step S101 shown in fig. 3, and receive, by the second instant messaging client, the first information content sent from the first instant messaging client; step S102, according to the first information content, searching out a voice information set suitable for interacting with a user sending the first information content from a preset information base so as to enable the user of the second instant messaging client to select one or more voice options to interact with the user sending the first information content, wherein each voice option comprises at least one piece of voice content.
The information processing apparatus 80 of this embodiment of the present invention is described below with reference to fig. 14. The information processing apparatus 80 shown in fig. 14 is only an example, and should not bring any limitation to the functions and the range of use of the embodiment of the present invention.
As shown in fig. 14, the information processing apparatus 80 is represented in the form of a general-purpose computing device. The components of the information processing apparatus 80 may include, but are not limited to: the at least one processing unit 81, the at least one memory unit 82, and a bus 83 connecting the various system components (including the processing unit 81 and the memory unit 82).
Bus 83 represents one or more of any of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus architectures.
The storage unit 82 may include readable media in the form of volatile memory, such as Random Access Memory (RAM) 8201 and/or cache memory 8202, and may further include read-only memory (ROM) 8203.
The storage unit 82 may also include a program/utility 821 having a set (at least one) of program modules 8204 including, but not limited to: an operating system, one or more application programs, other program modules, and program data, each of which, or some combination thereof, may comprise an implementation of a network environment.
The information processing apparatus 80 may also communicate with one or more external devices 84 (e.g., keyboard, pointing device, bluetooth device, etc.), with one or more devices that enable a user to interact with the information processing apparatus 80, and/or with any devices (e.g., router, modem, etc.) that enable the information processing apparatus 80 to communicate with one or more other computing devices. Such communication may be through input/output (I/O) interfaces 85. Also, the information processing apparatus may communicate with one or more networks (e.g., a local area network, a wide area network, etc.) through the network adapter 86. As shown, the network adapter 86 communicates with the other modules of the information processing apparatus 80 via a bus. It should be understood that although not shown in the figures, other hardware and/or software modules may be used in conjunction with the information processing apparatus, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems, among others.
Exemplary program product
In some possible embodiments, the various aspects of the present invention may also be implemented in the form of a program product including program code for causing a terminal device to perform the steps of the information processing method according to various exemplary embodiments of the present invention described in the "exemplary method" section above in this specification when the program product is run on the terminal device, for example, the terminal device may perform step S101 shown in fig. 3, and receive, by a second instant messaging client, a first information content sent from a first instant messaging client; step S102, according to the first information content, searching out a voice information set suitable for interacting with a user sending the first information content from a preset information base so as to enable the user of the second instant messaging client to select one or more voice options to interact with the user sending the first information content, wherein each voice option comprises at least one piece of voice content.
The program product may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium include: an electrical connection having one or more wires, a portable disk, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
As shown in fig. 15, an information processing program product 90 according to an embodiment of the present invention is described, which can employ a portable compact disc read only memory (CD-ROM) and include program codes, and can be run on a terminal device, such as a personal computer. However, the program product of the present invention is not limited in this regard and, in the present document, a readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
A readable signal medium may include a propagated data signal with readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A readable signal medium may also be any readable medium that is not a readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Program code for carrying out operations for aspects of the present invention may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, C++, or the like, as well as conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server. In situations involving remote computing devices, the remote computing devices may be connected to the user computing device over any kind of network, including a local area network or a wide area network, or may be connected to external computing devices (e.g., over the internet using an internet service provider).
It should be noted that although several devices or sub-devices of the information processing device are mentioned in the above detailed description, this division is merely exemplary and not mandatory. Indeed, the features and functions of two or more of the devices described above may be embodied in one device, according to embodiments of the invention. Conversely, the features and functions of one apparatus described above may be further divided so as to be embodied by a plurality of apparatuses.
Moreover, while the operations of the method of the invention are depicted in the drawings in a particular order, this does not require or imply that the operations must be performed in this particular order, or that all of the illustrated operations must be performed, to achieve desirable results. Additionally or alternatively, certain steps may be omitted, multiple steps combined into one step execution, and/or one step broken down into multiple step executions.
While the spirit and principles of the invention have been described with reference to several particular embodiments, it is to be understood that the invention is not limited to the disclosed embodiments. The division into aspects is for convenience of description only and does not mean that features in these aspects cannot be combined to advantage. The invention is intended to cover various modifications and equivalent arrangements included within the spirit and scope of the appended claims.

Claims (11)

1. An information processing method comprising:
receiving, by a second instant messaging client, first information content sent by a first instant messaging client;
searching a voice information set suitable for interacting with a user sending the first information content from a preset information base according to the first information content, so that the user of the second instant messaging client can select one or more voice options to interact with the user sending the first information content, wherein each voice option comprises at least one piece of voice content;
the voice information set comprises one or more sound packets of different types, and each sound packet of the different types comprises one or more voice options;
acquiring a picture to be processed;
acquiring voice information, recorded by the user of the second instant messaging client, for a target picture;
processing the picture to be processed correspondingly according to voice features of the recorded voice information, to obtain a target picture comprising a moving picture;
obtaining a combined voice picture file according to a preset rule aiming at combining the target picture and the voice information;
and sending the voice picture file to a first instant messaging client.
2. The information processing method according to claim 1, further comprising:
and in response to receiving a preview operation of the user of the second instant messaging client on the selected voice option, executing a preview event of the selected voice option on the second instant messaging client, wherein the preview event comprises playing voice content corresponding to the selected voice option or/and presenting text content corresponding to the voice option.
3. The information processing method according to claim 1, further comprising:
and in response to receiving the sending operation of the user of the second instant messaging client to the selected voice option, sending the selected voice option to the first instant messaging client.
4. The information processing method of claim 3, wherein the first information content comprises voice information, and the searching for a set of voice information from a predetermined information base suitable for interacting with the first information content according to the first information content comprises:
in response to receiving a specific operation on the first information content, searching out a voice information set matched with the first information content from a preset information base according to a preset information matching rule, wherein the voice information set is suitable for interacting with a user sending the first information content.
5. The information processing method according to claim 4, wherein the searching out the voice information set matching the first information content from the predetermined information library according to the preset information matching rule comprises:
according to a preset calculation rule, and according to the emotion determined by the first information content and the voice information in the preset information base based on the voice characteristics or/and the type of the voice packet, calculating the matching degree of the first information content and the voice information in the preset information base;
and taking the voice information with the matching degree larger than a preset threshold value or the voice information with the matching degree ranked in the top specific number in a preset information base as the voice information set matched with the first information content.
6. The information processing method according to claim 4 or 5, further comprising:
presenting a plurality of voice options in the voice information set to a user of a second instant messaging client in a specific form;
in response to receiving a selection of at least one of the plurality of voice options, making the selected at least one voice option a voice option to be sent to the first instant messaging client.
7. The information processing method of claim 6, wherein the specific operation on the first information content comprises: long-pressing the first information content;
the presenting of the plurality of voice options in the specific form includes: presenting the plurality of voice options as a stack of multiple cards;
before the receiving of the selection of at least one of the plurality of voice options, further comprising: and receiving sliding operation of at least one part of the voice options in the plurality of voice options to present the voice options to be selected.
8. The information processing method according to claim 1, further comprising:
responding to the operation of receiving the user of the second instant messaging client for making the voice information, and presenting a plurality of voice options to be imitated in the voice information set for the user to select;
and responding to the received selection of the user of the second instant messaging client to at least one voice option to be imitated and voice recording operation for imitating the voice content of the selected voice option, and obtaining a voice file corresponding to the selected voice option.
9. The information processing method according to claim 1, wherein processing the picture to be processed according to the voice feature of the recorded voice information to obtain a target picture specifically comprises:
according to at least one characteristic of pitch, tone quality, volume, duration and rhythm of the voice information, carrying out one or more of turning, twisting and stretching on the picture to be processed to obtain a target picture; or/and
and configuring text information corresponding to the voice information for the picture to be processed to obtain a target picture comprising the configured text information.
10. The information processing method according to claim 1, further comprising, after the step of acquiring the recorded voice information for the target picture:
determining the vibration intensity or/and the vibration time of the first instant messaging client when the first instant messaging client vibrates after receiving the voice message according to at least one characteristic of pitch, tone, volume, duration and rhythm of the voice message;
the step of sending the voice picture file to a first instant messaging client specifically comprises:
-sending the voice picture file to the first instant messaging client so that the first instant messaging client vibrates at the determined vibration intensity or/and vibration time when receiving the voice picture file.
11. An information processing apparatus comprising:
the receiving unit is used for receiving first information content sent by a first instant messaging client;
the searching unit is used for searching a voice information set which is suitable for interacting with a user sending the first information content from a preset information base according to the first information content, so that the user of the information processing device can select one or more voice options to interact with the user sending the first information content, wherein each voice option comprises at least one piece of voice content;
the voice information set comprises one or more sound packets of different types, and each sound packet of the different types comprises one or more voice options;
the picture acquisition unit is used for acquiring a picture to be processed;
the recorded voice information acquisition unit is used for acquiring the voice information, recorded by the user of the second instant messaging client, for the target picture;
the target picture acquisition unit is used for processing the picture to be processed correspondingly according to voice features of the recorded voice information, to obtain a target picture comprising a moving picture;
the voice picture file acquisition unit is used for acquiring a combined voice picture file according to a preset rule aiming at the combination of the target picture and the voice information;
and the voice picture file sending unit is used for sending the voice picture file to the first instant messaging client.
CN201710068757.8A 2017-02-08 2017-02-08 Information processing method and device and computer readable storage medium Active CN107040452B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710068757.8A CN107040452B (en) 2017-02-08 2017-02-08 Information processing method and device and computer readable storage medium

Publications (2)

Publication Number Publication Date
CN107040452A CN107040452A (en) 2017-08-11
CN107040452B true CN107040452B (en) 2020-08-04

Family

ID=59533272

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710068757.8A Active CN107040452B (en) 2017-02-08 2017-02-08 Information processing method and device and computer readable storage medium

Country Status (1)

Country Link
CN (1) CN107040452B (en)

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107818787B (en) * 2017-10-31 2021-02-05 努比亚技术有限公司 Voice information processing method, terminal and computer readable storage medium
CN108712323B (en) * 2018-05-02 2021-05-28 广州市百果园网络科技有限公司 Voice transmission method, system, computer storage medium and computer device
CN111259181B (en) * 2018-12-03 2024-04-12 连尚(新昌)网络科技有限公司 Method and device for displaying information and providing information
CN109979440B (en) * 2019-03-13 2021-05-11 广州市网星信息技术有限公司 Keyword sample determination method, voice recognition method, device, equipment and medium
CN110417641B (en) * 2019-07-23 2022-05-17 上海盛付通电子支付服务有限公司 Method and equipment for sending session message
CN110311858B (en) * 2019-07-23 2022-06-07 上海盛付通电子支付服务有限公司 Method and equipment for sending session message
CN112398890A (en) * 2019-08-16 2021-02-23 北京搜狗科技发展有限公司 Information pushing method and device and information pushing device
CN110795593A (en) * 2019-10-12 2020-02-14 百度在线网络技术(北京)有限公司 Voice packet recommendation method and device, electronic equipment and storage medium
CN115174506B (en) * 2019-10-14 2024-03-22 腾讯科技(深圳)有限公司 Session information processing method, apparatus, readable storage medium and computer device

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1747390A (en) * 2005-10-14 2006-03-15 北京金山软件有限公司 Method and system for processing real-time multi-media information in instant telecommunication
CN101184061A (en) * 2007-12-17 2008-05-21 腾讯科技(深圳)有限公司 Prompting method, device and instant communication terminal in instant communication
CN101741953A (en) * 2009-12-21 2010-06-16 中兴通讯股份有限公司 Method and equipment to display the speech information by application of cartoons
CN102377692A (en) * 2011-11-28 2012-03-14 上海量明科技发展有限公司 Method, terminal and system for mapping output of voice messages in instant messaging
CN104144097A (en) * 2013-05-07 2014-11-12 百度在线网络技术(北京)有限公司 Voice message transmission system, sending end, receiving end and voice message transmission method
CN102780653B (en) * 2012-08-09 2016-03-09 上海量明科技发展有限公司 Quick method, client and the system communicated in instant messaging
CN106161215A (en) * 2016-08-31 2016-11-23 维沃移动通信有限公司 A kind of method for sending information and mobile terminal

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103902630B (en) * 2012-12-31 2017-08-18 华为技术有限公司 Handle method, terminal and the system of message
KR20160035884A (en) * 2014-09-24 2016-04-01 삼성전자주식회사 Conference advance appratus and method for advancing conference

Also Published As

Publication number Publication date
CN107040452A (en) 2017-08-11

Similar Documents

Publication Publication Date Title
CN107040452B (en) Information processing method and device and computer readable storage medium
WO2022121601A1 (en) Live streaming interaction method and apparatus, and device and medium
CN110517689B (en) Voice data processing method, device and storage medium
US9190052B2 (en) Systems and methods for providing information discovery and retrieval
JP2021103328A (en) Voice conversion method, device, and electronic apparatus
JP5829000B2 (en) Conversation scenario editing device
CN107832434A (en) Method and apparatus based on interactive voice generation multimedia play list
CN110189754A (en) Voice interactive method, device, electronic equipment and storage medium
CN112040263A (en) Video processing method, video playing method, video processing device, video playing device, storage medium and equipment
CN110602516A (en) Information interaction method and device based on live video and electronic equipment
KR101628050B1 (en) Animation system for reproducing text base data by animation
CN115082602A (en) Method for generating digital human, training method, device, equipment and medium of model
CN108885869A (en) The playback of audio data of the control comprising voice
CN109710799B (en) Voice interaction method, medium, device and computing equipment
WO2019047850A1 (en) Identifier displaying method and device, request responding method and device
CN109948151A (en) The method for constructing voice assistant
CN110808038A (en) Mandarin assessment method, device, equipment and storage medium
CN112765460A (en) Conference information query method, device, storage medium, terminal device and server
US20120053937A1 (en) Generalizing text content summary from speech content
US20220301250A1 (en) Avatar-based interaction service method and apparatus
KR102135077B1 (en) System for providing topics of conversation in real time using intelligence speakers
US11057332B2 (en) Augmented expression sticker control and management
CN113573128B (en) Audio processing method, device, terminal and storage medium
CN114064943A (en) Conference management method, conference management device, storage medium and electronic equipment
CN117959703A (en) Interactive method, device, computer readable storage medium and computer program product

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant