WO2023087888A1 - Emoticon package display and associated sound acquisition method, apparatus, device, and storage medium - Google Patents

Emoticon package display and associated sound acquisition method, apparatus, device, and storage medium

Info

Publication number
WO2023087888A1
Authority
WO
WIPO (PCT)
Prior art keywords: emoticon, emoticon package, package, sound information, information
Prior art date
Application number
PCT/CN2022/119778
Other languages
English (en)
French (fr)
Inventor
陈晓丹 (Chen Xiaodan)
Original Assignee
腾讯科技(深圳)有限公司 (Tencent Technology (Shenzhen) Company Limited)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 腾讯科技(深圳)有限公司 (Tencent Technology (Shenzhen) Company Limited)
Priority to US 18/201,614 (published as US20230300095A1)
Publication of WO2023087888A1

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00 - Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/01 - Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F 3/048 - Interaction techniques based on graphical user interfaces [GUI]
    • G06F 3/0487 - Interaction techniques based on graphical user interfaces [GUI] using specific features provided by the input device, e.g. functions controlled by the rotation of a mouse with dual sensing arrangements, or of the nature of the input device, e.g. tap gestures based on pressure sensed by a digitiser
    • G06F 3/0488 - Interaction techniques based on graphical user interfaces [GUI] using specific features provided by the input device, e.g. functions controlled by the rotation of a mouse with dual sensing arrangements, or of the nature of the input device, e.g. tap gestures based on pressure sensed by a digitiser, using a touch-screen or digitiser, e.g. input of commands through traced gestures
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04L - TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 51/00 - User-to-user messaging in packet-switching networks, transmitted according to store-and-forward or real-time protocols, e.g. e-mail
    • H04L 51/07 - User-to-user messaging in packet-switching networks, transmitted according to store-and-forward or real-time protocols, e.g. e-mail, characterised by the inclusion of specific contents
    • H04L 51/10 - Multimedia information
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/60 - Information retrieval; Database structures therefor; File system structures therefor, of audio data
    • G06F 16/68 - Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F 16/683 - Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually, using metadata automatically derived from the content
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00 - Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/01 - Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F 3/048 - Interaction techniques based on graphical user interfaces [GUI]
    • G06F 3/0481 - Interaction techniques based on graphical user interfaces [GUI] based on specific properties of the displayed interaction object or a metaphor-based environment, e.g. interaction with desktop elements like windows or icons, or assisted by a cursor's changing behaviour or appearance
    • G06F 3/0482 - Interaction with lists of selectable items, e.g. menus
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00 - Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/01 - Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F 3/048 - Interaction techniques based on graphical user interfaces [GUI]
    • G06F 3/0484 - Interaction techniques based on graphical user interfaces [GUI] for the control of specific functions or operations, e.g. selecting or manipulating an object, an image or a displayed text element, setting a parameter value or selecting a range
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00 - Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/01 - Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F 3/048 - Interaction techniques based on graphical user interfaces [GUI]
    • G06F 3/0484 - Interaction techniques based on graphical user interfaces [GUI] for the control of specific functions or operations, e.g. selecting or manipulating an object, an image or a displayed text element, setting a parameter value or selecting a range
    • G06F 3/04842 - Selection of displayed objects or displayed text elements
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00 - Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/16 - Sound input; Sound output
    • G06F 3/165 - Management of the audio stream, e.g. setting of volume, audio stream path
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04M - TELEPHONIC COMMUNICATION
    • H04M 1/00 - Substation equipment, e.g. for use by subscribers
    • H04M 1/72 - Mobile telephones; Cordless telephones, i.e. devices for establishing wireless links to base stations without route selection
    • H04M 1/724 - User interfaces specially adapted for cordless or mobile telephones
    • H04M 1/72403 - User interfaces specially adapted for cordless or mobile telephones with means for local support of applications that increase the functionality
    • H04M 1/7243 - User interfaces specially adapted for cordless or mobile telephones with means for local support of applications that increase the functionality, with interactive means for internal management of messages

Definitions

  • The present application relates to the technical field of computers and the Internet, and in particular to a method, apparatus, device, and storage medium for emoticon package display and associated sound acquisition.
  • When communicating with other users, a user may select a specific emoticon package to send; after sending, the emoticon package sent by the user is displayed on the chat session interface.
  • Embodiments of the present application provide a method, apparatus, device, and storage medium for displaying emoticon packages and acquiring their associated sounds, which can support the display of voiced messages corresponding to emoticon packages, so that communication based on emoticon packages is not limited to images and becomes more diverse, thereby providing users with a better chat atmosphere. The technical scheme is as follows:
  • A method for displaying emoticon packages is provided; the method is executed by a terminal device and includes:
  • A method for acquiring an associated sound of an emoticon package is provided; the method is executed by a computer device and includes:
  • the first sound information associated with the first emoticon package is obtained by matching from the sound information database;
  • An apparatus for displaying emoticon packages is provided, comprising:
  • an interface display module configured to display a chat session interface, where the chat session interface is used to display chat messages between at least two users;
  • an emoticon package display module configured to display an emoticon package selection interface in response to an emoticon package selection operation on the chat session interface, where at least one emoticon package is displayed in the emoticon package selection interface;
  • a message display module configured to, in response to a sending operation for a first emoticon package among the at least one emoticon package, display a voiced emoticon message corresponding to the first emoticon package in the chat session interface; where the voiced emoticon message is used to display the first emoticon package and the associated sound information of the first emoticon package, and the associated sound information is sound information associated with the first emoticon package obtained by matching from a sound information database.
  • An apparatus for acquiring associated sounds of emoticon packages is provided, comprising:
  • a feature acquisition module configured to acquire feature information of the first emoticon package;
  • a sound matching module configured to match, according to the feature information, the first sound information associated with the first emoticon package from the sound information database;
  • a sound generation module configured to generate the associated sound information of the first emoticon package based on the first sound information, where the associated sound information is used to generate the voiced emoticon message corresponding to the first emoticon package.
  • A computer device is provided; the computer device includes a processor and a memory, the memory stores a computer program, and the computer program is loaded and executed by the processor to implement the above method for displaying emoticon packages, or the above method for acquiring associated sounds of emoticon packages.
  • Optionally, the computer device includes a terminal device or a server.
  • A computer-readable storage medium is provided; a computer program is stored in the readable storage medium, and the computer program is loaded and executed by a processor to implement the above method for displaying emoticon packages, or the above method for acquiring associated sounds of emoticon packages.
  • A computer program product is provided; the computer program product includes a computer program stored in a computer-readable storage medium, and a processor reads the computer program from the computer-readable storage medium and executes it to implement the above method for displaying emoticon packages, or the above method for acquiring associated sounds of emoticon packages.
  • In the technical solutions provided by the embodiments of the present application, the first emoticon package and its associated sound information are displayed through the corresponding voiced emoticon message, so that information can be exchanged through both at the same time; communication based on emoticon packages is thus not limited to images and becomes more diverse, providing users with a better chat atmosphere. Moreover, the associated sound information of the first emoticon package is sound information associated with the first emoticon package obtained by matching from the sound information database; that is, there is no need to record sound for the first emoticon package in advance or in real time, and the voiced emoticon message corresponding to the first emoticon package can be generated by matching against existing sound information, which reduces the acquisition overhead and time cost of the associated sound information, and thereby the generation overhead and time cost of voiced emoticon messages. The sound information in the sound information database is also applicable to multiple emoticon packages, so voiced emoticon messages need not be created by recording each emoticon package individually.
  • Fig. 1 is a schematic diagram of an emoticon package display system provided by an embodiment of the present application;
  • Fig. 2 exemplarily shows a schematic diagram of an emoticon package display system;
  • Fig. 3 is a flowchart of an emoticon package display method provided by an embodiment of the present application;
  • Figs. 4 and 5 exemplarily show schematic diagrams of the chat session interface;
  • Fig. 6 is a flowchart of an emoticon package display method provided by another embodiment of the present application;
  • Fig. 7 exemplarily shows a schematic diagram of an emoticon package selection interface;
  • Fig. 8 exemplarily shows a schematic diagram of another chat session interface;
  • Fig. 9 is a flowchart of a method for acquiring associated sounds of emoticon packages provided by an embodiment of the present application;
  • Fig. 10 exemplarily shows a schematic diagram of a function setting interface;
  • Fig. 11 exemplarily shows a schematic diagram of a flow of emoticon package display;
  • Fig. 12 is a block diagram of an emoticon package display device provided by an embodiment of the present application;
  • Fig. 13 is a block diagram of an emoticon package display device provided by another embodiment of the present application;
  • Fig. 14 is a block diagram of a device for acquiring associated sounds of emoticon packages provided by an embodiment of the present application;
  • Fig. 15 is a block diagram of a device for acquiring associated sounds of emoticon packages provided by another embodiment of the present application;
  • Fig. 16 is a structural block diagram of a terminal device provided by an embodiment of the present application;
  • Fig. 17 is a structural block diagram of a server provided by an embodiment of the present application.
  • FIG. 1 shows a schematic diagram of an emoticon package display system provided by an embodiment of the present application.
  • The emoticon package display system may include a terminal 10 and a server 20.
  • The terminal 10 may be an electronic device such as a mobile phone, a tablet computer, a game console, an e-book reader, a multimedia playback device, a wearable device, a vehicle-mounted terminal, or a PC (personal computer).
  • A client of an application program may be installed in the terminal 10.
  • the application program refers to any application program having a function of displaying emoticons, such as a social application program, a shopping application program, a game application program, and the like.
  • the application program may be an application program that needs to be downloaded and installed, or a click-to-run application program, which is not limited in this embodiment of the present application.
  • the above emoticon package may be a static image or a dynamic image, which is not limited in this embodiment of the present application.
  • a terminal device may also be referred to as a terminal.
  • the server 20 is used to provide background services for the clients of applications in the terminal 10.
  • the server 20 may be a background server of the above-mentioned application program.
  • the server 20 may be one server, or a server cluster composed of multiple servers, or a cloud computing service center.
  • the server 20 provides background services for applications in multiple terminals 10 at the same time.
  • the terminal 10 and the server 20 can communicate with each other through the network.
  • the server 20 provides the terminal 10 with at least one function among data storage, data processing and data transmission functions.
  • In one embodiment, the server 20 includes a server 21 having a database for storing sound information (i.e., a sound information database), a server 22 for generating associated sound information for emoticon packages, and a server 23 for providing data transmission for multiple terminals 10.
  • Exemplarily, when the user of the first terminal 11 switches the sending mode of the first emoticon package to the first sending mode (sending in the form of a voiced emoticon message), the first terminal 11 sends an associated sound information acquisition instruction to the server 22; after receiving the instruction, the server 22 matches the first sound information associated with the first emoticon package from the sound information in the sound information database of the server 21, generates the associated sound information of the first emoticon package according to the first sound information, and sends the associated sound information to the first terminal 11. Afterward, when the user of the first terminal 11 sends the first emoticon package, the first terminal 11 sends the message to be sent to the server 23, and the server 23 forwards it to the second terminal 12.
  • the message to be sent is a message for displaying the first emoticon package and associated sound information of the first emoticon package.
  • server 21, server 22, and server 23 may be the same server or different servers, which is not limited in this embodiment of the present application.
  • FIG. 3 shows a flowchart of a method for displaying emoticons provided by an embodiment of the present application.
  • This method can be applied to the terminal 10 of the emoticon package display system shown in FIG. 1.
  • the method may include at least one of the following steps (301-303):
  • Step 301 displaying a chat session interface.
  • the chat session interface is used to display chat messages between at least two users.
  • the chat message includes but is not limited to at least one of the following: text message, image message, voice message, video message and so on.
  • different application programs correspond to different chat session interfaces.
  • When the user sends a message, the client displays the message sent by the user in the chat session interface.
  • the chat session interface includes a sent chat message
  • the chat session interface displays the identification information of the sender account of the sent chat message.
  • the identification information may include at least one of the following: account name, account avatar, and account level.
  • When displaying real-time chat messages between users, the chat session interface may also display historical chat messages between those users.
  • the chat session interface includes the above-mentioned historical chat messages.
  • the client acquires historical chat messages between the aforementioned users, and displays the historical chat messages in the chat session interface.
  • the historical chat message can be a historical message acquired in real time, or can be a historical message pre-stored in the client.
  • the chat session interface does not include the foregoing historical chat messages.
  • When displaying the chat session interface, the client does not need to obtain the historical chat messages between the above users and may directly display the chat session interface.
  • Step 302 displaying an emoticon package selection interface in response to the emoticon package selection operation on the chat session interface.
  • After displaying the chat session interface, the client monitors the interface, and displays the emoticon package selection interface when an emoticon package selection operation on the chat session interface is detected.
  • the above emoticon pack selection interface refers to an interface for displaying emoticon packs for the user to select.
  • at least one emoticon pack is displayed in the emoticon pack selection interface.
  • It should be noted that the emoticon packages in this application may also take other forms, such as video emoticon packages, animated emoticon packages, and video animation emoticon packages.
  • In one embodiment, if the emoticon package selection interface and the chat session interface have the same display elements, the display of the chat session interface is canceled while keeping those same display elements unchanged, and the display elements of the emoticon package selection interface are displayed; if no same display elements exist in the two interfaces, the display elements of the chat session interface are directly canceled and the display elements of the emoticon package selection interface are displayed. In this way, the influence of the chat session interface on the display and selection of emoticon packages can be avoided, thereby improving the display effect of emoticon packages and the intuitiveness of emoticon package selection.
  • the above emoticon pack selection operation is an operation for calling the emoticon pack selection interface.
  • In one embodiment, the chat session interface includes an emoticon package selection control, and the emoticon package selection operation is a trigger operation on that control; the user performs the trigger operation on the emoticon package selection control, so that the client displays the emoticon package selection interface.
  • the above operation may be a click operation, a long press operation, a slide operation, etc., which is not limited in this embodiment of the present application.
  • the above-mentioned chat session interface may also include other operation controls, such as a chat message sending control, a historical message search control, a chat message sharing control, and the like.
  • In another embodiment, the emoticon package selection operation is a specific operation performed on the chat session interface itself; that is, the chat session interface does not need to display a dedicated control, and the user performs the specific operation in the interface so that the client displays the emoticon package selection interface.
  • the above-mentioned specific operations may be click operations of a specific number of times, long-press operations of a specific duration, sliding operations of a specific trajectory, pressing operations of a specific key position, etc., which are not limited in this embodiment of the present application.
  • the user may also perform other operations through other specific operations on the chat session interface, such as sending a chat message, searching for historical messages, sharing a chat message, and the like.
  • Step 303 in response to a sending operation for the first emoticon package among the at least one emoticon package, displaying the voiced emoticon message corresponding to the first emoticon package in the chat session interface.
  • The emoticon package selection interface includes selection items for emoticon packages, and different emoticon packages correspond to different selection items.
  • Optionally, the selection item may be the emoticon package itself, or may be a thumbnail, cover image, name, etc. of the emoticon package, which is not limited in this embodiment of the present application.
  • The user triggers different operations for an emoticon package by performing different operations on its selection item. Exemplarily, clicking the selection item triggers the sending operation for the corresponding emoticon package; long-pressing the selection item triggers the selection operation; dragging the selection item triggers the position movement operation.
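  • As an illustration, the gesture-to-operation mapping just described can be sketched in TypeScript as follows; this is a minimal sketch and all names are assumptions, not part of the patent:

```typescript
// Illustrative sketch only; the mapping follows the example in the text
// (click = send, long press = select, drag = move), but all names are assumptions.
type EmoticonOperation = "send" | "select" | "move";

function dispatchSelectionGesture(gesture: "click" | "longPress" | "drag"): EmoticonOperation {
  switch (gesture) {
    case "click":
      return "send";   // triggers the sending operation for the emoticon package
    case "longPress":
      return "select"; // triggers the selection operation (e.g. to show the mode switch)
    case "drag":
      return "move";   // triggers the position movement operation
  }
}
```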
  • After the client displays the emoticon package selection interface, it monitors the interface, and when it detects a sending operation for the first emoticon package among the at least one emoticon package, it displays the voiced emoticon message corresponding to the first emoticon package in the chat session interface.
  • the above-mentioned first emoticon package may be any one of the above-mentioned at least one emoticon package.
  • The voiced emoticon message corresponding to the first emoticon package is used to display the first emoticon package and the associated sound information of the first emoticon package; the associated sound information is sound information associated with the first emoticon package obtained by matching from a sound information database, in which a plurality of sound information is pre-stored.
  • In one embodiment, the voiced emoticon message includes the first emoticon package and a sound playback control for playing the associated sound information of the first emoticon package.
  • When the client detects the sending operation for the first emoticon package, it sends the first emoticon package and the associated sound information of the first emoticon package to the recipient account, and displays the first emoticon package and the corresponding sound playback control in the chat session interface.
  • Exemplarily, as shown in FIG. 4, the first emoticon package 41 and the sound playback control 42 are displayed on the chat session interface 40.
  • In another embodiment, the voiced emoticon message includes a voiced video of the first emoticon package.
  • When the client detects the sending operation for the first emoticon package, it generates the voiced video of the first emoticon package according to the first emoticon package and its associated sound information, sends the voiced video to the recipient account, and displays the voiced video of the first emoticon package in the chat session interface.
  • Optionally, the voiced emoticon message also includes a video playback control for playing the voiced video, as exemplarily shown in FIG. 5.
  • In this way, the emoticon package is not limited to the display form of an image, which enriches the display diversity of emoticon packages and further improves user experience.
  • the above-mentioned voice emoticon message further includes subtitle information.
  • the subtitle information is text information in the first emoticon package.
  • the text information may be the text set in the first emoticon pack by the first emoticon pack maker, or the text input by the sender account of the voice emoticon message, which is not limited in this embodiment of the present application.
  • the subtitle information is a mark of the first emoticon package, and feature information of the first emoticon package can be acquired based on the mark.
  • the mark may be set by the creator of the first emoticon package, or may be input by the account of the sender of the voice emoticon message, which is not limited in this embodiment of the present application. It should be noted that the above-mentioned marks may also be called identification, description, definition, etc.
  • When sending the voiced emoticon message, the client may directly send the first emoticon package and the associated sound information to the corresponding device; alternatively, the client may send the first emoticon package and the identification information of the associated sound information to the corresponding device, and the device then obtains the associated sound information according to the identification information and generates the voiced emoticon message.
  • the above-mentioned device may be a terminal where the receiver's account is located, or may be a message transfer server, which is not limited in this embodiment of the present application.
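  • A minimal TypeScript sketch of the two delivery variants just described, sending the sound itself or only its identification information to be resolved by the receiving device; all field and function names are assumptions:

```typescript
// Illustrative message payload; either the associated sound information is carried
// inline or only its identification information is carried, per the text above.
interface VoicedEmoticonMessage {
  emoticonId: string;  // the first emoticon package (static or dynamic image)
  subtitle?: string;   // optional subtitle information
  sound?: { kind: "inline"; data: ArrayBuffer }   // sound information sent directly
        | { kind: "byId"; soundId: string };      // identification info to be resolved
}

// Hypothetical resolver on the receiving device or the message transfer server.
async function resolveSound(
  msg: VoicedEmoticonMessage,
  fetchSound: (id: string) => Promise<ArrayBuffer>,
): Promise<ArrayBuffer | undefined> {
  if (!msg.sound) return undefined;               // silent emoticon message
  if (msg.sound.kind === "inline") return msg.sound.data;
  return fetchSound(msg.sound.soundId);           // look up by identification information
}
```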
  • To sum up, in the technical solution provided by this embodiment, the first emoticon package and its associated sound information are displayed through the voiced emoticon message corresponding to the first emoticon package; that is, when the user sends the first emoticon package, information can be exchanged through the first emoticon package and its associated sound information at the same time, so that communication based on emoticon packages is not limited to images and becomes more diverse, thereby providing users with a better chat atmosphere.
  • Moreover, the associated sound information of the first emoticon package is sound information associated with the first emoticon package obtained by matching from the sound information database; that is, there is no need to record sound for the first emoticon package in advance or in real time, and the voiced emoticon message corresponding to the first emoticon package can be generated by matching against existing sound information, which reduces the acquisition overhead and time cost of the associated sound information, and thereby the generation overhead and time cost of voiced emoticon messages. The sound information in the sound information database is also applicable to multiple emoticon packages, so there is no need to record sound for each emoticon package one by one; when the number of emoticon packages is large, this effectively improves the efficiency of generating voiced emoticon messages.
  • FIG. 6 shows a flowchart of a method for displaying emoticons provided by another embodiment of the present application.
  • This method can be applied to the terminal 10 of the emoticon package display system shown in FIG. 1.
  • the method may include at least one of the following steps (601-608):
  • Step 601 displaying a chat session interface.
  • Step 602 displaying an emoticon package selection interface in response to an emoticon package selection operation on the chat session interface.
  • Steps 601 and 602 are the same as steps 301 and 302 in the embodiment in FIG. 3; for details, refer to that embodiment, which are not repeated here.
  • Step 603 in response to a selection operation on the first emoticon package, displaying the sending mode switching control of the first emoticon package.
  • After displaying the emoticon package selection interface, the client monitors the interface, and displays the sending mode switching control of the first emoticon package when a selection operation for the first emoticon package is detected.
  • The emoticon package selection interface includes selection items for emoticon packages, with different emoticon packages corresponding to different selection items; the user triggers the selection operation for the first emoticon package through the selection item of the first emoticon package.
  • the above sending mode switching control is used to control the switching of the sending mode of the first emoticon package.
  • In one embodiment, after displaying the sending mode switching control, the client monitors the control and switches the sending mode of the first emoticon package upon receiving an operation on the sending mode switching control.
  • If the sending mode of the first emoticon package is the second sending mode, the client switches the sending mode from the second sending mode to the first sending mode after receiving the operation on the sending mode switching control; if the sending mode of the first emoticon package is the first sending mode, the client switches the sending mode from the first sending mode to the second sending mode after receiving the operation.
  • The first sending mode refers to sending the first emoticon package in the form of a voiced emoticon message; the second sending mode refers to sending the first emoticon package as the emoticon package itself, without associated sound information.
  • Exemplarily, as shown in FIG. 7, the emoticon package selection interface 70 includes a plurality of emoticon package selection items; the user triggers the selection operation for the first emoticon package by long-pressing the first emoticon package selection item 71, and the emoticon package selection interface 70 then displays the sending mode switching control 72 of the first emoticon package; further, the user can switch the sending mode of the first emoticon package through the sending mode switching control 72.
  • In this way, the user can flexibly set the sending mode of the first emoticon package as required, improving the flexibility of sending emoticon packages.
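  • A minimal TypeScript sketch of the per-emoticon sending-mode switch described above; treating the second (plain) mode as the default is an assumption:

```typescript
// "voiced" = first sending mode (voiced emoticon message);
// "plain"  = second sending mode (the emoticon package itself).
type SendingMode = "voiced" | "plain";

const sendingModes = new Map<string, SendingMode>(); // per emoticon package

// Invoked when the user operates the sending mode switching control (e.g. control 72).
function toggleSendingMode(emoticonId: string): SendingMode {
  const current = sendingModes.get(emoticonId) ?? "plain"; // assumed default
  const next: SendingMode = current === "plain" ? "voiced" : "plain";
  sendingModes.set(emoticonId, next);
  return next;
}
```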
  • Step 604 in response to the sending operation for the first emoticon package, acquiring the sending mode of the first emoticon package.
  • After displaying the emoticon package selection interface, the client monitors the interface and acquires the sending mode of the first emoticon package after detecting a sending operation for the first emoticon package.
  • the user triggers the generation of a sending operation for the first emoticon package through the selection item of the first emoticon package.
  • Step 605 sending the first emoticon package to the recipient account in the chat session interface according to the sending mode of the first emoticon package.
  • After obtaining the sending mode, the client sends the first emoticon package to the recipient account in the chat session interface according to that mode.
  • If the sending mode is the first sending mode, the client sends the voiced emoticon message corresponding to the first emoticon package to the recipient account in the chat session interface, and displays the voiced emoticon message in the chat session interface.
  • If the sending mode is the second sending mode, the client only sends the first emoticon package to the recipient account in the chat session interface, and displays the first emoticon package in the chat session interface.
  • By supporting sending emoticon packages in either the first sending mode or the second sending mode, the client further improves the flexibility of sending emoticon packages.
  • In some embodiments, if no associated sound information is successfully matched for the first emoticon package, the client sends the silent emoticon message corresponding to the first emoticon package to the recipient account in the chat session interface, and displays the silent emoticon message in the chat session interface.
  • Optionally, the silent emoticon message includes the first emoticon package and a sound matching failure identifier.
  • Exemplarily, as shown in FIG. 8, the first emoticon package 81 and the sound matching failure identifier 83 are displayed on the chat session interface 82.
  • In some embodiments, the user can control the playing, pausing, or replacement of the associated sound information according to the actual situation.
  • Step 606 in response to a sound playing operation on the voiced emoticon message, playing the associated sound information of the first emoticon package.
  • After displaying the voiced emoticon message, the client monitors it and plays the associated sound information of the first emoticon package after detecting a sound playing operation for the voiced emoticon message.
  • Optionally, the sound playing operation may be an operation on a first specific control, or a first specific operation for the voiced emoticon message, which is not limited in this embodiment of the present application.
  • That is, the sound playing operation is used to trigger playback of the associated sound information of the first emoticon package.
  • In some embodiments, after detecting the sound playing operation for the voiced emoticon message, the client plays the video animation of the first emoticon package while playing the associated sound information.
  • Step 607 in response to a mute operation on the voiced emoticon message, stopping the playing of the associated sound information of the first emoticon package.
  • After displaying the voiced emoticon message, the client monitors it and stops playing the associated sound information of the first emoticon package after detecting a mute operation on the voiced emoticon message.
  • the mute operation may be an operation for the second specific control, or a second specific operation for the audible emoticon message, which is not limited in this embodiment of the present application.
  • the above-mentioned first specific control and the above-mentioned second specific control may be the same operation control, or may be different operation controls, which is not limited in this embodiment of the present application.
  • When the first specific control and the second specific control are the same operation control, the sound playing operation and the mute operation are different operations on that same control.
  • Exemplarily, the user triggers the mute operation by double-clicking the sound playback control 42 in FIG. 4, so as to stop playing the associated sound information of the first emoticon package.
  • In some embodiments, after detecting the mute operation for the voiced emoticon message, the client stops playing the associated sound information but still plays the video animation of the first emoticon package.
  • Step 608 in response to a sound replacement operation for the voiced emoticon message, changing the associated sound information of the first emoticon package.
  • After displaying the voiced emoticon message, the client monitors it and changes the associated sound information of the first emoticon package after detecting a sound replacement operation for the voiced emoticon message.
  • Optionally, the sound replacement operation may be an operation on a third specific control, or a third specific operation for the voiced emoticon message, which is not limited in this embodiment of the present application.
  • Exemplarily, a sound change control 43 is displayed in the chat session interface 40, and the user clicks the sound change control 43 to change the associated sound information of the first emoticon package.
  • the above-mentioned first specific control, the above-mentioned second specific control, and the above-mentioned third specific control may be the same operation control, or may be different operation controls, which is not limited in this embodiment of the present application.
  • When the first specific control, the second specific control, and the third specific control are the same operation control, the sound playing operation, the mute operation, and the sound replacement operation are different operations on that same control.
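  • A minimal TypeScript sketch of one operation control dispatching the three operations above; only the double-click-to-mute example is given in the text, so the other gesture assignments are assumptions:

```typescript
interface SoundControlHandlers {
  play: () => void;    // play the associated sound information
  mute: () => void;    // stop playing (a video animation may keep playing)
  replace: () => void; // change the associated sound information
}

function onSoundControlGesture(
  gesture: "click" | "doubleClick" | "longPress",
  handlers: SoundControlHandlers,
): void {
  if (gesture === "click") handlers.play();            // assumed
  else if (gesture === "doubleClick") handlers.mute(); // per the double-click example
  else handlers.replace(); // assumed; the text also shows a separate change control 43
}
```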
  • the client may automatically modify the associated sound information, or may modify the associated sound information based on the user's selection.
  • In one embodiment, the client automatically modifies the associated sound information.
  • Exemplarily, the client selects candidate sound information satisfying a first condition from at least one piece of candidate sound information, generates the replacement sound information of the first emoticon package from it, and replaces the associated sound information of the first emoticon package with the replacement sound information.
  • the above-mentioned candidate sound information is obtained according to the feature information of the first emoticon package and the tags corresponding to each sound information in the sound information database;
  • The first condition is a selection condition for the candidate sound information; exemplarily, the first condition is having the highest matching degree with the feature information of the first emoticon package.
  • the replacement sound information for the first emoticon package may also be randomly selected from at least one candidate sound information.
  • In another embodiment, the client modifies the associated sound information based on the user's selection.
  • Exemplarily, the client displays at least one piece of candidate sound information and monitors each piece; after detecting a selection operation for target sound information among the at least one piece of candidate sound information, the client generates the replacement sound information of the first emoticon package according to the target sound information, and replaces the associated sound information of the first emoticon package with the replacement sound information.
  • Optionally, the candidate sound information does not include the current associated sound information or the historical associated sound information of the first emoticon package.
  • The historical associated sound information refers to sound information that previously served as the associated sound information of the first emoticon package.
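  • A minimal TypeScript sketch of selecting replacement sound information: the current and historical associated sounds are excluded from the candidates, and the first condition (highest matching degree) picks the winner; the numeric score is an assumed stand-in for the matching degree:

```typescript
interface CandidateSound {
  soundId: string;
  matchScore: number; // matching degree with the emoticon package's feature information
}

function pickReplacement(
  candidates: CandidateSound[],
  currentId: string,
  historicalIds: Set<string>,
): CandidateSound | undefined {
  return candidates
    .filter(c => c.soundId !== currentId && !historicalIds.has(c.soundId))
    .sort((a, b) => b.matchScore - a.matchScore)[0]; // highest matching degree first
}
```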
  • It should be noted that the changed associated sound information, or the identification information of the changed associated sound information, needs to be synchronized to the recipient account.
  • To sum up, when the sending mode of the first emoticon package is the first sending mode, the voiced emoticon message corresponding to the first emoticon package is sent to the recipient account in the chat session interface, and the sending mode can be flexibly switched through the sending mode switching control, which can meet the needs of different users.
  • In addition, the associated sound information of the first emoticon package can be changed through the sound replacement operation, so that when acquiring the associated sound information, it can be flexibly changed with reference to the user's opinion, improving the accuracy of the acquired associated sound information.
  • The user selects the associated sound information of the first emoticon package from the candidate sound information, which improves the accuracy of the associated sound information and strengthens the connection between the associated sound information and the first emoticon package, so that the voiced emoticon message can better express the user's wishes.
  • FIG. 9 shows a flowchart of a method for acquiring associated sounds of emoticons provided by an embodiment of the present application.
  • This method can be applied to the terminal 10 of the emoticon package display system shown in FIG. 1, or to the server 20 of that system, or be implemented interactively by the terminal 10 and the server 20.
  • This embodiment of the present application does not limit the execution subject (hereinafter, the execution subject of the method for acquiring the associated sound of the emoticon package is collectively referred to as the "server").
  • the method may include at least one of the following steps (901-903):
  • Step 901 acquiring feature information of a first emoticon package.
  • The first emoticon package refers to the emoticon package for which sound information is to be matched, and may be any one of the multiple emoticon packages provided by the application program.
  • In one embodiment, before matching sound information for the first emoticon package, the server acquires the feature information of the first emoticon package.
  • the above characteristic information may be generated in real time or in advance, which is not limited in this embodiment of the present application.
  • the above feature information is generated in real time.
  • the server generates feature information of the first emoticon package in real time when determining to match the sound information to the first emoticon package.
  • the above feature information is pre-generated.
  • After acquiring the first emoticon package, the server generates the feature information of the first emoticon package and stores it; when matching is later needed, the server directly obtains the feature information from its storage location.
  • Optionally, the feature information includes but is not limited to at least one of the following: text feature information, scene feature information, emotion feature information, and the like.
  • The text feature information is used to indicate the text contained in the first emoticon package; the scene feature information is used to indicate the likely usage scene of the first emoticon package, for example, the scene feature information of a "good night" emoticon package may be: before going to bed at night; the emotion feature information is used to indicate the emotion the user may have when using the first emoticon package, for example, if the emoticon package includes the words "so difficult", the emotion feature information may be: anxiety and sadness.
  • In some embodiments, the feature information includes the above text feature information.
  • When acquiring the feature information of the first emoticon package, the server extracts text from the text information in the first emoticon package to obtain the text feature information of the first emoticon package.
  • Optionally, the text information in the first emoticon package includes at least one of the following: text in the first emoticon package, and input text for the first emoticon package.
  • The text in the first emoticon package refers to the text pre-existing in the first emoticon package; the input text for the first emoticon package refers to the text entered for the first emoticon package (for example, by the sender account).
  • the text in the first emoticon package can be ignored.
  • the feature information includes the above scene feature information.
  • When acquiring the feature information of the first emoticon package, the server performs feature extraction on the first emoticon package, the associated chat messages of the first emoticon package, and the associated chat scene of the first emoticon package, to obtain the scene feature information of the first emoticon package.
  • the associated chat message of the first emoticon package refers to a historical chat message whose time difference between the sending time and the current moment is less than a threshold, and the associated chat scene of the first emoticon package is used to indicate the current chat time and at least one current chat account.
  • the number of the aforementioned associated chat messages may or may not be set in advance, which is not limited in this embodiment of the present application; the aforementioned current chat account may be understood as the aforementioned recipient account.
  • the feature information includes the above-mentioned emotion feature information.
  • When acquiring the feature information of the first emoticon package, the server performs feature extraction on the first emoticon package and the associated chat messages of the first emoticon package to obtain the emotion feature information of the first emoticon package.
  • It should be noted that the first emoticon package may be any emoticon package, or an emoticon package that meets a specific requirement.
  • Exemplarily, the specific requirement may be: emoticon packages from which text can be extracted.
  • In this embodiment of the present application, the feature information of the emoticon package is set to include but not be limited to at least one of the following: text feature information, scene feature information, and emotion feature information, so that the feature information characterizes the emoticon package more accurately, thereby helping to improve the matching accuracy of the first sound information.
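  • A minimal TypeScript sketch of the feature information structure described above; the extractor functions are placeholders, since the text does not specify how text, scene, or emotion features are computed:

```typescript
interface EmoticonFeatures {
  textFeatures: string[];    // text contained in, or entered for, the emoticon package
  sceneFeatures: string[];   // likely usage scene, e.g. "before going to bed at night"
  emotionFeatures: string[]; // likely emotion, e.g. "anxiety", "sadness"
}

// Placeholder extractors; real systems might use OCR, chat-context analysis, and
// emotion analysis, none of which are fixed by the text.
const extractText = (_emoticonId: string): string[] => [];
const inferScene = (_emoticonId: string, _recentMessages: string[]): string[] => [];
const inferEmotion = (_emoticonId: string, _recentMessages: string[]): string[] => [];

function getFeatures(emoticonId: string, recentMessages: string[]): EmoticonFeatures {
  return {
    textFeatures: extractText(emoticonId),
    sceneFeatures: inferScene(emoticonId, recentMessages),
    emotionFeatures: inferEmotion(emoticonId, recentMessages),
  };
}
```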
  • Step 902 According to the characteristic information, the first sound information associated with the first emoticon package is obtained through matching from the sound information database.
  • According to the feature information, the server matches from the sound information database to obtain the first sound information associated with the first emoticon package.
  • the sound information database pre-stores a plurality of sound information.
  • In one embodiment, the plurality of sound information stored in the sound information database is historical sound information from the sender account of the first emoticon package.
  • In another embodiment, the plurality of sound information stored in the sound information database is historical sound information from different accounts.
  • The historical sound information may be generated during a chat session or during a recording scenario, which is not limited in this embodiment of the present application.
  • Step 903 based on the first sound information, generate associated sound information of the first emoticon package.
  • After acquiring the first sound information, the server generates the associated sound information of the first emoticon package based on the first sound information.
  • the associated sound information of the first emoticon package is used to generate the voiced emoticon message corresponding to the first emoticon package.
  • the server may directly use the first sound information as the associated sound information, or edit the first sound information to obtain the associated sound information.
  • In one example, the server directly uses the first sound information as the associated sound information.
  • Exemplarily, the server acquires the text information contained in the first emoticon package and compares it with the text information contained in the first sound information; if the two are consistent, the first sound information is directly used as the associated sound information.
  • In another example, the server clips the first sound information to obtain the associated sound information.
  • Exemplarily, the server acquires the text information contained in the first emoticon package and compares it with the text information contained in the first sound information; if the text information contained in the first emoticon package is part of the text information contained in the first sound information, the server intercepts from the first sound information the sound segment corresponding to the text information of the first emoticon package, and generates the associated sound information based on that sound segment.
  • the server may directly use the sound clip as the associated sound information, or edit the sound clip to obtain the associated sound information.
  • In one case, the server directly uses the sound segment as the associated sound information.
  • In another case, the server edits the sound segment to obtain the associated sound information.
  • Exemplarily, the server adjusts the playing duration of the sound segment to obtain the associated sound information of the first emoticon package.
  • Optionally, the playing duration of the associated sound information of the first emoticon package is the same as the playing duration of the first emoticon package.
  • Exemplarily, the server adjusts the playing duration of the sound segment by adjusting the sound playback frequency (i.e., the playback speed).
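  • A minimal TypeScript sketch of the duration adjustment just described: it computes the playback rate that makes the sound segment's playing duration equal the emoticon package's playing duration; how the rate is applied to the audio (e.g. time-stretching) is left out:

```typescript
// Playing a clip at rate r divides its duration by r, so the rate that matches the
// emoticon's duration is soundDuration / emoticonDuration.
function playbackRateFor(soundDurationMs: number, emoticonDurationMs: number): number {
  if (soundDurationMs <= 0 || emoticonDurationMs <= 0) {
    throw new Error("durations must be positive");
  }
  return soundDurationMs / emoticonDurationMs; // e.g. 3000 ms / 2000 ms = 1.5x speed
}
```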
  • To sum up, the first sound information associated with the first emoticon package is obtained by matching according to the feature information of the first emoticon package, which improves the matching degree between the first sound information and the first emoticon package, so that the associated sound information generated based on the first sound information is highly accurate. Moreover, the associated sound information of the first emoticon package can be generated from the existing sound information in the sound information database, so there is no need to specially dub and record sound for the first emoticon package, and the sound information in the sound information database is also applicable to multiple emoticon packages.
  • step 902 includes the following steps:
  • In one embodiment, when matching the first sound information for the first emoticon package, the server acquires the tags corresponding to each piece of sound information in the sound information database.
  • the above tags may be generated in real time or in advance, which is not limited in this embodiment of the present application.
  • In one embodiment, the tags are generated in real time.
  • When the server determines to match sound information for the first emoticon package, it acquires each piece of sound information in the sound information database and generates the tag corresponding to each piece.
  • In another embodiment, the tags are pre-generated.
  • When the server acquires a piece of sound information, it generates and stores the tag of that sound information; then, when it determines to match sound information for the first emoticon package, it obtains the tag from the tag's storage location.
  • In yet another embodiment, the tags of some sound information are generated in real time, while the tags of other sound information are pre-generated.
  • Exemplarily, when matching sound information for the first emoticon package, the server obtains each piece of sound information in the sound information database, detects whether it has a tag, generates a tag in real time for any sound information without one, and stores that tag in the corresponding location for subsequent use.
  • the above tags include but are not limited to at least one of the following: text tags, scene tags, emotion tags, and the like.
  • The text tag is used to indicate the text corresponding to the sound information; the scene tag is used to indicate the sending scene corresponding to the sound information, for example: sent to the target user in the first chat group at 20:11 in the evening; the emotion tag is used to indicate the emotion corresponding to the sound information, that is, the emotion contained in the sound information.
  • the user can set whether to allow the server to collect its own historical sound information and store it in the sound information database according to the actual situation.
  • Exemplarily, as shown in FIG. 10, the function setting interface 100 includes a voice recognition switch 101, through which the user controls the enabling and disabling of the function of collecting historical sound information.
  • In some embodiments, the server collects a plurality of pieces of historical sound information sent by the sender account of the first emoticon package; further, the speech contained in each piece of historical sound information is converted to text to obtain the text tag corresponding to each piece; the scene tag corresponding to each piece is obtained based on its sending scene; and the emotion tag corresponding to each piece is obtained based on its speech emotion.
  • In one embodiment, when collecting the plurality of pieces of historical sound information sent by the sender account of the first emoticon package, the server collects the historical sound information sent by the sender account within a target time period.
  • the target time period may be a time period composed of times whose difference with the current time is less than the target value, or may be a time period during which messages are frequently sent, which is not limited in this embodiment of the present application.
  • different sender accounts correspond to different target time periods.
  • In another embodiment, when collecting the plurality of pieces of historical sound information sent by the sender account of the first emoticon package, the server collects historical sound information sent by the sender account whose total playing duration is less than a threshold.
  • the threshold value may be any value, such as 10s, 7s, 5s, 2s, etc., which is not limited in this embodiment of the present application.
  • The sound information database is constructed from the historical sound information sent by the sender account, and sound information is selected from it as the associated sound information of the emoticon packages sent by that account, so that the voiced emoticon message corresponding to an emoticon package better conforms to the chatting style of the sender account, thereby further improving the user's chat experience.
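  • A minimal TypeScript sketch of collecting a sender account's historical sound information under the two filters above (a target time period and a playing-duration threshold); reading the threshold as per-item is an assumption (a cap on the combined total is another reading), and the tag fields would be filled by speech-to-text, sending-scene, and speech-emotion analysis:

```typescript
interface HistoricalSound {
  soundId: string;
  sentAtMs: number;    // when the sender account sent it (epoch ms)
  durationMs: number;  // playing duration
  textTag?: string;    // from speech-to-text
  sceneTag?: string;   // from the sending scene (time, recipients)
  emotionTag?: string; // from speech emotion analysis
}

function collectForDatabase(
  history: HistoricalSound[],
  periodStartMs: number,
  periodEndMs: number,
  maxDurationMs: number,
): HistoricalSound[] {
  return history.filter(s =>
    s.sentAtMs >= periodStartMs &&
    s.sentAtMs <= periodEndMs &&    // within the target time period
    s.durationMs < maxDurationMs);  // below the duration threshold (e.g. 5000 ms)
}
```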
  • after acquiring the tags corresponding to each piece of sound information, the server selects, according to those tags, at least one candidate sound information matching the feature information from the sound information database.
  • optionally, according to the text feature information in the feature information and the text tags of the individual pieces of sound information, the server selects from the sound information database at least one candidate sound information matching the text feature information.
  • optionally, according to the scene feature information in the feature information and the scene tags of the individual pieces of sound information, the server selects from the sound information database at least one candidate sound information matching the scene feature information.
  • optionally, according to the emotional feature information in the feature information and the emotion tags of the individual pieces of sound information, the server selects from the sound information database at least one candidate sound information matching the emotional feature information.
  • by providing several ways to select candidate sound information (text feature matching, scene feature matching, and emotional feature matching), this embodiment lets the server obtain a more comprehensive candidate set, which helps improve the soundness of the acquisition of the first sound information, as sketched below.
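  • a minimal scoring sketch of this tag-based candidate selection is given below; exact string equality between feature fields and tags is a deliberate simplification, as the application does not specify how the matching degree is computed:

```python
from typing import Dict, List, Tuple

def select_candidates(features: Dict[str, str],
                      tagged_clips: Dict[str, Dict[str, str]],
                      min_score: int = 1) -> List[Tuple[str, int]]:
    """Score each clip by how many of the emoticon package's feature
    fields (text / scene / emotion) its tags match, and keep every clip
    that matches at least `min_score` fields as a candidate."""
    candidates: List[Tuple[str, int]] = []
    for clip_id, tags in tagged_clips.items():
        score = sum(1 for key in ("text", "scene", "emotion")
                    if key in features and tags.get(key) == features[key])
        if score >= min_score:
            candidates.append((clip_id, score))
    return candidates
```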
  • after acquiring the at least one candidate sound information, the server selects from it the candidate sound information satisfying a second condition as the first sound information.
  • the second condition is a selection condition applied to the candidate sound information.
  • for example, the second condition may be that the candidate has the highest degree of matching with the feature information of the first emoticon package; that is, when obtaining the first sound information, the server selects the candidate sound information that best matches the feature information.
  • the server may also select the first sound information for the first emoticon package at random from the at least one candidate, so that a first sound information can still be matched when several candidates have the same matching degree.
  • because the first sound information is selected from multiple candidates that were matched against the emoticon package's feature information and the sound tags, its degree of matching with the emoticon package is higher, and the associated sound information generated from it is accordingly more accurate; the selection with tie-break is sketched below.
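  • one plausible rendering of the second condition, including the random tie-break described above, is the following:

```python
import random
from typing import List, Tuple

def pick_first_sound(candidates: List[Tuple[str, int]]) -> str:
    """Second condition: take the candidate with the highest match score,
    breaking ties at random so a clip is still chosen when several
    candidates match the emoticon package's features equally well."""
    if not candidates:
        raise ValueError("no candidate sound information matched")
    best = max(score for _, score in candidates)
    top = [clip_id for clip_id, score in candidates if score == best]
    return random.choice(top)
```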
  • the complete solution of the present application is introduced below from the perspective of the interaction between the client and the server.
  • the specific procedure includes at least one of the following steps:
  • Step 1101: the client displays a chat session interface.
  • Step 1102: upon receiving an emoticon package selection operation on the chat session interface, the client displays the emoticon package selection interface, in which at least one emoticon package is displayed.
  • Step 1103: upon receiving a sending operation for the first emoticon package while the sending mode of the first emoticon package is the first sending mode, the client acquires the feature information of the first emoticon package.
  • Step 1104: the client sends a sound matching instruction to the server.
  • the sound matching instruction includes the feature information of the first emoticon package.
  • Step 1105: the server acquires the tags corresponding to each piece of sound information in the sound information database.
  • Step 1106: according to those tags, the server selects from the sound information database at least one candidate sound information matching the feature information of the first emoticon package.
  • Step 1107: the server selects, from the at least one candidate, the candidate sound information satisfying the second condition as the first sound information.
  • Step 1108: based on the first sound information, the server generates the associated sound information of the first emoticon package.
  • Step 1109: the server sends the associated sound information to the client.
  • Step 1110: the client generates the voiced emoticon message corresponding to the first emoticon package from the first emoticon package and the associated sound information, and sends the voiced emoticon message to the recipient account in the chat session interface.
  • Step 1111: the client displays the voiced emoticon message corresponding to the first emoticon package in the chat session interface; the client of the recipient account displays it in its chat session interface as well.
  • Step 1112: upon receiving a sound playing operation for the voiced emoticon message, the client plays the associated sound information of the first emoticon package; the client of the recipient account does likewise upon receiving such an operation.
  • Step 1113: upon receiving a mute operation for the voiced emoticon message, the client stops playing the associated sound information of the first emoticon package; the client of the recipient account does likewise.
  • Step 1114: upon receiving a voice change operation for the voiced emoticon message, the client sends a voice change instruction for the first emoticon package to the server.
  • Step 1115: the server generates replacement sound information for the first emoticon package based on the at least one candidate sound information.
  • Step 1116: the server sends the replacement sound information to the client.
  • Step 1117: the client replaces the associated sound information of the first emoticon package with the replacement sound information and synchronizes the change to the client of the recipient account, which applies the same replacement. A condensed sketch of steps 1101 to 1110 follows.
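  • a condensed, self-contained sketch of the happy path (steps 1101 to 1110) follows; SoundServer and ChatClient are hypothetical names, and the returned dictionary stands in for the real voiced emoticon message format, which the application does not define:

```python
from typing import Dict

class SoundServer:
    def __init__(self, sound_db: Dict[str, Dict[str, str]]):
        self.sound_db = sound_db  # clip_id -> tags

    def match(self, features: Dict[str, str]) -> str:
        # steps 1105-1108: read tags, pick the best-matching candidate;
        # the clip id stands in for the generated associated sound.
        scored = [(cid, sum(tags.get(k) == v for k, v in features.items()))
                  for cid, tags in self.sound_db.items()]
        clip_id, _ = max(scored, key=lambda p: p[1])
        return clip_id

class ChatClient:
    def __init__(self, server: SoundServer):
        self.server = server

    def send_emoticon(self, emoticon_id: str,
                      features: Dict[str, str]) -> Dict[str, str]:
        # steps 1103-1104: first sending mode -> request a sound match
        clip_id = self.server.match(features)
        # steps 1109-1110: assemble and send the voiced emoticon message
        return {"emoticon": emoticon_id, "sound": clip_id}

db = {"clip-1": {"text": "good night", "emotion": "calm"},
      "clip-2": {"text": "so hard", "emotion": "anxious"}}
message = ChatClient(SoundServer(db)).send_emoticon(
    "emo-42", {"text": "good night", "emotion": "calm"})
print(message)  # {'emoticon': 'emo-42', 'sound': 'clip-1'}
```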
  • FIG. 12 shows a block diagram of an emoticon package display device provided by an embodiment of the present application.
  • the apparatus has the function of implementing the emoticon package display method described above; the function may be implemented by hardware, or by hardware executing corresponding software.
  • the apparatus may be a terminal device, or may be provided in a terminal device.
  • the apparatus 1200 may include: an interface display module 1210, an expression display module 1220, and a message display module 1230.
  • the interface display module 1210 is configured to display a chat session interface, and the chat session interface is used to display chat messages between at least two users.
  • the emoticon display module 1220 is configured to display an emoticon pack selection interface in response to the emoticon pack selection operation on the chat session interface, and at least one emoticon pack is displayed in the emoticon pack selection interface.
  • the message display module 1230 is configured to, in response to the sending operation for the first emoticon package among the at least one emoticon package, display in the chat session interface the voiced emoticon message corresponding to the first emoticon package; wherein the voiced emoticon message is used to present the first emoticon package and its associated sound information, the associated sound information being sound information associated with the first emoticon package obtained by matching from the sound information database.
  • the message display module 1230 is configured to acquire the sending mode of the first emoticon package in response to the sending operation for the first emoticon package; and, if the sending mode is the first sending mode, to send the voiced emoticon message corresponding to the first emoticon package to the recipient account in the chat session interface and display that voiced emoticon message in the chat session interface.
  • the device 1200 further includes: a control display module 1240, an operation receiving module 1250, and a mode switching module 1260.
  • the control display module 1240 is configured to display a control for switching the sending mode of the first emoticon package in response to the selection operation on the first emoticon package.
  • the operation receiving module 1250 is configured to receive an operation on the sending mode switching control.
  • the mode switching module 1260 is configured to: if the sending mode of the first emoticon package is the second sending mode, control the sending mode to switch from the second sending mode to the first sending mode; and if the sending mode of the first emoticon package is the first sending mode, control the sending mode to switch from the first sending mode to the second sending mode.
  • the device 1200 further includes: a sound control module 1270.
  • the sound control module 1270 is configured to play the associated sound information of the first emoticon package in response to a sound playing operation for the voiced emoticon message; or, in response to a mute operation for the voiced emoticon message, to stop playing the associated sound information of the first emoticon package; or, in response to a voice replacement operation for the voiced emoticon message, to change the associated sound information of the first emoticon package.
  • the sound control module 1270 is configured to select, from at least one candidate sound information, the candidate satisfying the first condition to generate the replacement sound information of the first emoticon package; wherein the candidate sound information is obtained by matching according to the feature information of the first emoticon package and the tags corresponding to each piece of sound information in the sound information database, and the replacement sound information of the first emoticon package is used to replace its associated sound information.
  • the sound control module 1270 is configured to display at least one candidate sound information; in response to a selection operation for a target sound information among the at least one candidate, to generate the replacement sound information of the first emoticon package from the target sound information; and to replace the associated sound information of the first emoticon package with that replacement sound information.
  • the voiced emoticon message includes the first emoticon package and a sound playback control for playing its associated sound information; or, the voiced emoticon message includes an audio video of the first emoticon package and a video playback control for playing that audio video.
  • the voiced emoticon message corresponding to the first emoticon package presents the first emoticon package together with its associated sound information; that is, when sending the first emoticon package, the user can communicate through the emoticon package and its associated sound simultaneously, so that emoticon-based communication is not limited to images. This makes the ways of communicating with emoticon packages more diverse and provides users with a better chat atmosphere. Moreover, the associated sound information of the first emoticon package is matched from the sound information database, so there is no need to record sound for the first emoticon package in advance or in real time: the voiced emoticon message can be generated by matching against existing sound information, which reduces the acquisition cost and time cost of the associated sound information, and hence the generation cost and time cost of the voiced emoticon message. The sound information in the database is also applicable to multiple emoticon packages, so there is no need to record each emoticon package one by one to obtain their voiced emoticon messages; when the number of emoticon packages is large, this effectively improves the efficiency of generating voiced emoticon messages.
  • FIG. 14 shows a block diagram of an apparatus for acquiring associated sounds of emoticons provided by an embodiment of the present application.
  • the apparatus has the function of implementing the above-described method for acquiring the associated sound of an emoticon package; the function may be implemented by hardware, or by hardware executing corresponding software.
  • the apparatus may be a server, or may be provided in a server.
  • the apparatus 1400 may include: a feature acquiring module 1410, a sound matching module 1420, and a sound generating module 1430.
  • a feature acquiring module 1410 configured to acquire feature information of the first emoticon package.
  • the sound matching module 1420 is configured to match the first sound information associated with the first emoticon package from the sound information database according to the feature information.
  • the sound generating module 1430 is configured to generate the associated sound information of the first emoticon package based on the first sound information; wherein the associated sound information of the first emoticon package is used to generate the voiced emoticon message corresponding to the first emoticon package.
  • the sound matching module 1420 includes: a tag acquisition unit 1421, a sound matching unit 1422, and a sound selection unit 1423.
  • the tag acquisition unit 1421 is configured to obtain the tags corresponding to each piece of sound information in the sound information database.
  • the sound matching unit 1422 is configured to select at least one candidate sound information matching the feature information from the sound information database according to the labels corresponding to each of the sound information.
  • the sound selection unit 1423 is configured to select, from the at least one candidate sound information, candidate sound information satisfying a second condition as the first sound information.
  • the sound matching unit 1422 is configured to: select, from the sound information database according to the text feature information in the feature information and the text tags respectively corresponding to the individual pieces of sound information, at least one candidate sound information matching the text feature information, wherein a text tag indicates the text corresponding to the sound information; or, select, according to the scene feature information in the feature information and the scene tags respectively corresponding to the individual pieces of sound information, at least one candidate sound information matching the scene feature information, wherein a scene tag indicates the sending scene corresponding to the sound information; or, select, according to the emotional feature information in the feature information and the emotion tags respectively corresponding to the individual pieces of sound information, at least one candidate sound information matching the emotional feature information, wherein an emotion tag indicates the emotion corresponding to the sound information.
  • the feature acquisition module 1410 is configured to: perform text extraction on the text information in the first emoticon package to obtain the text feature information of the first emoticon package, the feature information including the text feature information; or, perform feature extraction on the first emoticon package, the associated chat messages of the first emoticon package, and the associated chat scene of the first emoticon package to obtain the scene feature information of the first emoticon package, the feature information including the scene feature information; or, perform feature extraction on the first emoticon package and its associated chat messages to obtain the emotional feature information of the first emoticon package, the feature information including the emotional feature information; a feature-extraction sketch follows.
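  • a sketch of this feature extraction follows; ocr_text and infer_emotion are hypothetical helpers, since the application does not specify how text is pulled from the image or how emotion is inferred; per the description, text the sender typed for the emoticon package, when present, takes precedence over text baked into it:

```python
from typing import Dict, List, Optional

# Hypothetical helpers: the application does not specify how text is
# extracted from the image or how emotion is inferred.
def ocr_text(emoticon_image: bytes) -> str:
    return "good night"

def infer_emotion(text: str, recent_messages: List[str]) -> str:
    return "calm"

def extract_features(emoticon_image: bytes,
                     recent_messages: List[str],
                     chat_time: str,
                     peer_account: str,
                     input_text: Optional[str] = None) -> Dict[str, str]:
    """Build the three feature fields. Text the sender typed for the
    emoticon package, when present, overrides text baked into it."""
    text = input_text if input_text else ocr_text(emoticon_image)
    return {
        "text": text,
        "scene": f"{chat_time} -> {peer_account}",        # time + peer account
        "emotion": infer_emotion(text, recent_messages),  # text + chat context
    }
```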
  • the sound generation module 1430 includes: a text acquisition unit 1431, a sound interception unit 1432, and a sound generation unit 1433.
  • a text acquisition unit 1431 configured to acquire text information included in the first emoticon package.
  • the sound intercepting unit 1432 is configured to, according to the text information, intercept a sound segment containing the text information from the first sound information.
  • the sound generating unit 1433 is configured to generate associated sound information of the first emoticon package based on the sound segment.
  • the sound generating unit 1433 is configured to, if the first emoticon package is a video animation, adjust the playing duration of the sound segment based on the playing duration of the first emoticon package to obtain the associated sound information of the first emoticon package; wherein the playing duration of the associated sound information is the same as that of the first emoticon package, as sketched below.
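  • the interception and duration adjustment might look as follows; word-level timestamps and a playback-rate change are assumptions, since the application only states that a segment containing the text is intercepted and that its playing duration is adjusted to match the animation:

```python
from typing import List, Tuple

def cut_clip(words: List[Tuple[str, float, float]],
             wanted: str) -> Tuple[float, float]:
    """Given word-level timestamps (word, start_s, end_s) from the
    transcript, return the span covering the emoticon package's text.
    Word-level alignment is an assumption of this sketch."""
    hits = [(s, e) for w, s, e in words if w in wanted.split()]
    return (min(s for s, _ in hits), max(e for _, e in hits))

def playback_rate(clip_span: Tuple[float, float], animation_s: float) -> float:
    """For a video-animation emoticon package, stretch or squeeze the clip
    so its playback duration equals the animation's (one plausible reading
    of 'adjusting the playing duration'). A rate above 1 plays faster."""
    start, end = clip_span
    return (end - start) / animation_s

words = [("good", 0.4, 0.8), ("night", 0.9, 1.5), ("everyone", 1.6, 2.4)]
span = cut_clip(words, "good night")            # (0.4, 1.5)
print(round(playback_rate(span, 2.2), 3))       # 0.5 -> play at half speed
```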
  • the device 1400 further includes: a sound collection module 1440.
  • the sound collection module 1440 is configured to collect multiple pieces of historical sound information sent by the sender account of the first emoticon package; to convert the voice contained in each piece of historical sound information to text to obtain the corresponding text tags; to derive the corresponding scene tags from the sending scenes of the individual pieces; and to derive the corresponding emotion tags from the voice emotions of the individual pieces.
  • the first sound information associated with the first emoticon package is obtained by matching against the feature information of the first emoticon package, which improves the degree of matching between the first sound information and the first emoticon package and makes the associated sound information generated from the first sound information highly accurate; moreover, the associated sound information can be generated from existing sound information in the sound information database, so no dubbing or recording needs to be created specifically for the first sound information, and the sound information in the database is also applicable to multiple emoticon packages.
  • the division into the functional modules described above is only an example; in practical applications, the functions can be allocated to different functional modules as needed, that is, the internal structure of the device can be divided into different functional modules to complete all or part of the functions described above.
  • the apparatus provided by the above embodiment and the method embodiments belong to the same concept; its specific implementation process is detailed in the method embodiments and will not be repeated here.
  • FIG. 16 shows a structural block diagram of a terminal device 1600 provided by an embodiment of the present application.
  • the terminal device 1600 may be an electronic device such as a mobile phone, a tablet computer, a game console, an e-book reader, a multimedia playback device, a wearable device, a vehicle terminal, or a PC.
  • the terminal device is used to implement the method for displaying emoticons provided in the above embodiments, or the method for acquiring associated sounds of emoticons. Specifically:
  • the terminal device 1600 includes: a processor 1601 and a memory 1602 .
  • the processor 1601 may include one or more processing cores, for example a 4-core or 8-core processor.
  • the processor 1601 may be implemented in at least one hardware form among DSP (Digital Signal Processing), FPGA (Field Programmable Gate Array), and PLA (Programmable Logic Array).
  • the processor 1601 may also include a main processor and a coprocessor: the main processor, also called the CPU (Central Processing Unit), processes data in the wake-up state; the coprocessor is a low-power processor that processes data in the standby state.
  • in some embodiments, the processor 1601 may integrate a GPU (Graphics Processing Unit), which is responsible for rendering and drawing the content to be displayed on the display screen.
  • in some embodiments, the processor 1601 may also include an AI (Artificial Intelligence) processor, which handles computing operations related to machine learning.
  • the memory 1602 may include one or more computer-readable storage media, which may be non-transitory.
  • the memory 1602 may also include high-speed random access memory and non-volatile memory, such as one or more magnetic disk storage devices and flash memory storage devices.
  • the non-transitory computer-readable storage medium in the memory 1602 is used to store at least one instruction, at least one program, a code set, or an instruction set, configured to be executed by one or more processors to implement the above emoticon package display method or the above method for acquiring the associated sound of an emoticon package.
  • the terminal device 1600 may optionally further include: a peripheral device interface 1603 and at least one peripheral device.
  • the processor 1601, the memory 1602, and the peripheral device interface 1603 may be connected through buses or signal lines.
  • Each peripheral device can be connected to the peripheral device interface 1603 through a bus, a signal line or a circuit board.
  • the peripheral devices include at least one of: a radio frequency circuit 1604, a display screen 1605, a camera assembly 1606, an audio circuit 1607, and a power supply 1608.
  • the structure shown in FIG. 16 does not constitute a limitation on the terminal device 1600, which may include more or fewer components than shown, combine certain components, or adopt a different component arrangement.
  • FIG. 17 shows a structural block diagram of a server provided by an embodiment of the present application.
  • the server is used to implement the method for acquiring associated sounds of emoticons provided in the above embodiments. Specifically:
  • the server 1700 includes a CPU (Central Processing Unit) 1701, a system memory 1704 including a RAM (Random Access Memory) 1702 and a ROM (Read-Only Memory) 1703, and a system bus 1705 connecting the system memory 1704 and the central processing unit 1701.
  • the server 1700 also includes a basic I/O (Input/Output) system 1706 that helps transfer information between the devices in the computer, and a mass storage device 1707 for storing an operating system 1713, application programs 1714, and other program modules 1715.
  • the basic input/output system 1706 includes a display 1708 for displaying information and input devices 1709, such as a mouse and a keyboard, for users to input information; the display 1708 and the input devices 1709 are both connected to the central processing unit 1701 through an input/output controller 1710 connected to the system bus 1705.
  • the basic input/output system 1706 may also include the input/output controller 1710 for receiving and processing input from a keyboard, a mouse, an electronic stylus, and other devices; similarly, the input/output controller 1710 also provides output to a display screen, a printer, or another type of output device.
  • the mass storage device 1707 is connected to the central processing unit 1701 through a mass storage controller (not shown) connected to the system bus 1705.
  • the mass storage device 1707 and its associated computer-readable media provide non-volatile storage for the server 1700; that is, the mass storage device 1707 may include a computer-readable medium (not shown) such as a hard disk or a CD-ROM (Compact Disc Read-Only Memory) drive.
  • Computer-readable media may comprise computer storage media and communication media.
  • Computer storage media includes volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer readable instructions, data structures, program modules or other data.
  • computer storage media include RAM, ROM, EPROM (Erasable Programmable Read-Only Memory), EEPROM (Electrically Erasable Programmable Read-Only Memory), flash memory or other solid-state memory technology, CD-ROM, DVD (Digital Video Disc) or other optical storage, cassettes, magnetic tape, magnetic disk storage, or other magnetic storage devices; of course, those skilled in the art will appreciate that computer storage media are not limited to the above. The system memory 1704 and the mass storage device 1707 may be collectively referred to as memory.
  • according to various embodiments of the present application, the server 1700 may also run on a remote computer connected to a network such as the Internet; that is, the server 1700 may be connected to the network 1712 through a network interface unit 1711 connected to the system bus 1705, and the network interface unit 1711 may likewise be used to connect to other types of networks or remote computer systems (not shown).
  • a computer-readable storage medium is also provided, in which a computer program is stored; when executed by a processor, the computer program implements the above emoticon package display method or the above method for acquiring the associated sound of an emoticon package.
  • the computer-readable storage medium may include: ROM (Read-Only Memory), RAM (Random Access Memory), SSD (Solid State Drive), an optical disc, and the like.
  • the random access memory may include ReRAM (Resistive Random Access Memory) and DRAM (Dynamic Random Access Memory).
  • a computer program product is also provided, comprising a computer program stored in a computer-readable storage medium; a processor reads the computer program from the computer-readable storage medium and executes it to implement the above emoticon package display method or the above method for acquiring the associated sound of an emoticon package.
  • the information (including but not limited to target device information, target personal information, etc.), data (including but not limited to data used for analysis, stored data, displayed data, etc.), and signals involved in this application are all authorized by the subjects or fully authorized by all parties, and the collection, use, and processing of the relevant data must comply with the relevant laws, regulations, and standards of the relevant countries and regions.
  • for example, the sender accounts, recipient accounts, identification information, historical voice information, and the like involved in this application are all obtained with full authorization.
  • the “plurality” mentioned herein refers to two or more.
  • “and/or” describes three possible relationships; for example, A and/or B may mean: A exists alone, A and B exist simultaneously, or B exists alone.
  • the character “/” generally indicates that the objects before and after it are in an “or” relationship.
  • the numbering of the steps described herein only exemplarily shows one possible order of execution among the steps; in some other embodiments, the steps may be executed out of numerical order, for example two differently numbered steps may be executed simultaneously, or two differently numbered steps may be executed in the reverse of the illustrated order, which is not limited in the embodiments of the present application.
  • the introduction of the present application through the above embodiments is merely exemplary and explanatory; new embodiments formed by arbitrary combinations of the steps in the above embodiments also fall within the scope of protection of the present application.
  • the above are only exemplary embodiments of the present application and are not intended to limit it; any modification, equivalent replacement, improvement, and the like made within the spirit and principles of the present application shall be included within the scope of protection of the present application.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Library & Information Science (AREA)
  • Business, Economics & Management (AREA)
  • General Business, Economics & Management (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Health & Medical Sciences (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

A method, apparatus, device, and storage medium for emoticon package display and associated sound acquisition, belonging to the field of computer and Internet technologies. The method includes: displaying a chat session interface (301); in response to an emoticon package selection operation on the chat session interface, displaying an emoticon package selection interface (302); and, in response to a sending operation for a first emoticon package, displaying in the chat session interface a voiced emoticon message corresponding to the first emoticon package, the associated sound information of the first emoticon package being sound information associated with the first emoticon package obtained by matching from a sound information database (303). By supporting the display of voiced emoticons corresponding to emoticon packages, the present application makes the ways of communicating with emoticon packages more diverse and provides users with a better chat atmosphere; in addition, since the associated sound information can be acquired without any recording operation, the generation cost of voiced emoticon messages is reduced, and when the number of emoticon packages is large, the generation efficiency of voiced emoticon messages can be effectively improved.


Claims (19)

  1. 一种表情包显示方法,所述方法由终端设备执行,所述方法包括:
    显示聊天会话界面,所述聊天会话界面用于展示至少两个用户之间的聊天消息;
    响应于针对所述聊天会话界面的表情包选择操作,显示表情包选择界面,所述表情包选择界面中显示有至少一个表情包;
    响应于针对所述至少一个表情包中的第一表情包的发送操作,在所述聊天会话界面中显示所述第一表情包对应的有声表情消息;其中,所述第一表情包对应的有声表情消息用于展示所述第一表情包以及所述第一表情包的关联声音信息,所述第一表情包的关联声音信息是从声音信息数据库中匹配得到的与所述第一表情包相关联的声音信息。
  2. 根据权利要求1所述的方法,其中,所述响应于针对所述至少一个表情包中的第一表情包的发送操作,在所述聊天会话界面中显示所述第一表情包对应的有声表情消息,包括:
    响应于针对所述第一表情包的发送操作,获取所述第一表情包的发送方式;
    若所述发送方式为第一发送方式,则向所述聊天会话界面中的接收方帐号发送所述第一表情包对应的有声表情消息,以及在所述聊天会话界面中显示所述第一表情包对应的有声表情消息。
  3. 根据权利要求2所述的方法,其中,所述显示表情包选择界面之后,还包括:
    响应于针对所述第一表情包的选择操作,显示所述第一表情包的发送方式切换控件;
    接收针对所述发送方式切换控件的操作;
    若所述第一表情包的发送方式为第二发送方式,则控制所述发送方式由所述第二发送方式切换至所述第一发送方式;
    若所述第一表情包的发送方式为所述第一发送方式,则控制所述发送方式由所述第一发送方式切换至所述第二发送方式。
  4. 根据权利要求1至3任一项所述的方法,其中,所述在所述聊天会话界面中显示所述第一表情包对应的有声表情消息之后,还包括:
    响应于针对所述有声表情消息的声音播放操作,播放所述第一表情包的关联声音信息;
    或者,
    响应于针对所述有声表情消息的静音操作,停止播放所述第一表情包的关联声音信息;
    或者,
    响应于针对所述有声表情消息的声音更换操作,更改所述第一表情包的关联声音信息。
  5. 根据权利要求4所述的方法,其中,所述更改所述第一表情包的关联声音信息,包括:
    从至少一个候选声音信息中,选择满足第一条件的候选声音信息生成所述第一表情包的替换声音信息;其中,所述候选声音信息是根据所述第一表情包的特征信息,以及所述声音信息数据库中各个声音信息分别对应的标签匹配得到的;
    采用所述第一表情包的替换声音信息,替换所述第一表情包的关联声音信息。
  6. The method according to claim 4, wherein the changing the associated sound information of the first emoticon package comprises:
    displaying at least one piece of candidate sound information;
    generating, in response to a selection operation on target sound information among the at least one piece of candidate sound information, replacement sound information of the first emoticon package based on the target sound information;
    replacing the associated sound information of the first emoticon package with the replacement sound information of the first emoticon package.
  7. The method according to any one of claims 1 to 3, wherein
    the audio emoticon message comprises the first emoticon package and a sound playback control for playing the associated sound information of the first emoticon package;
    or,
    the audio emoticon message comprises an audio video of the first emoticon package and a video playback control for playing the audio video.
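The two message layouts enumerated in claim 7 can be pictured with a minimal client-side sketch in Python; every field and type name below is hypothetical and stands in for whatever the chat client actually uses:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class AudioEmoticonMessage:
    """A chat message bundling an emoticon package with its associated sound."""
    emoticon_url: Optional[str] = None     # layout 1: the emoticon itself
    sound_url: Optional[str] = None        # layout 1: the associated sound information
    audio_video_url: Optional[str] = None  # layout 2: the emoticon pre-rendered as an audio video

    def control_to_render(self) -> str:
        # Layout 2 gets a video playback control; layout 1 gets a sound playback control.
        return "video_playback" if self.audio_video_url else "sound_playback"
```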
  8. A method for acquiring associated sound of an emoticon package, wherein the method is executed by a computer device and comprises:
    acquiring feature information of a first emoticon package;
    obtaining, by matching from a sound information database based on the feature information, first sound information associated with the first emoticon package;
    generating associated sound information of the first emoticon package based on the first sound information; wherein the associated sound information of the first emoticon package is used to generate an audio emoticon message corresponding to the first emoticon package.
  9. The method according to claim 8, wherein the obtaining, by matching from a sound information database based on the feature information, first sound information associated with the first emoticon package comprises:
    acquiring tags respectively corresponding to the pieces of sound information in the sound information database;
    selecting, from the sound information database based on the tags respectively corresponding to the pieces of sound information, at least one piece of candidate sound information matching the feature information;
    selecting, from the at least one piece of candidate sound information, candidate sound information satisfying a second condition as the first sound information.
  10. The method according to claim 9, wherein the selecting, from the sound information database based on the tags respectively corresponding to the pieces of sound information, at least one piece of candidate sound information matching the feature information comprises at least one of the following:
    selecting, from the sound information database based on text feature information in the feature information and text tags respectively corresponding to the pieces of sound information, at least one piece of candidate sound information matching the text feature information; wherein the text tag indicates the text corresponding to the sound information;
    selecting, from the sound information database based on scene feature information in the feature information and scene tags respectively corresponding to the pieces of sound information, at least one piece of candidate sound information matching the scene feature information; wherein the scene tag indicates the sending scene corresponding to the sound information;
    selecting, from the sound information database based on emotion feature information in the feature information and emotion tags respectively corresponding to the pieces of sound information, at least one piece of candidate sound information matching the emotion feature information; wherein the emotion tag indicates the emotion corresponding to the sound information.
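As a rough illustration of the matching in claims 9 and 10 above, the sketch below keeps the sound information database as an in-memory list of tagged records and treats "satisfying a second condition" as "most frequently used"; the claims leave both the storage and the condition open, so all names and the tie-breaking rule here are assumptions:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class SoundRecord:
    audio_path: str     # where the stored clip lives
    text_tag: str       # text spoken in the clip (see claim 14)
    scene_tag: str      # sending scene, e.g. "greeting"
    emotion_tag: str    # e.g. "happy", "angry"
    use_count: int = 0  # popularity, used here as the "second condition"

def select_candidates(database: list[SoundRecord], features: dict) -> list[SoundRecord]:
    """Claim 10: a record is a candidate if its text, scene, or emotion tag
    matches the corresponding feature of the emoticon package."""
    candidates = []
    for record in database:
        text_hit = bool(features.get("text")) and features["text"] in record.text_tag
        scene_hit = features.get("scene") == record.scene_tag
        emotion_hit = features.get("emotion") == record.emotion_tag
        if text_hit or scene_hit or emotion_hit:
            candidates.append(record)
    return candidates

def select_first_sound(candidates: list[SoundRecord]) -> Optional[SoundRecord]:
    """Claim 9: pick the candidate satisfying the "second condition"
    (here, simply the most frequently used clip)."""
    return max(candidates, key=lambda r: r.use_count) if candidates else None
```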
  11. The method according to claim 8, wherein the acquiring feature information of a first emoticon package comprises at least one of the following:
    performing text extraction on text information in the first emoticon package to obtain text feature information of the first emoticon package; wherein the feature information comprises the text feature information;
    performing feature extraction on the first emoticon package, a chat message associated with the first emoticon package, and a chat scene associated with the first emoticon package to obtain scene feature information of the first emoticon package; wherein the feature information comprises the scene feature information;
    performing feature extraction on the first emoticon package and the chat message associated with the first emoticon package to obtain emotion feature information of the first emoticon package; wherein the feature information comprises the emotion feature information.
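Claim 11 does not prescribe how the text is extracted; one plausible realization of its first branch is plain OCR over the emoticon image, for example with the Tesseract engine via pytesseract (an assumption of this sketch, not something the application names):

```python
from PIL import Image
import pytesseract  # requires a local Tesseract OCR installation

def extract_text_feature(emoticon_path: str) -> str:
    """Pull the caption text out of an emoticon image so it can be
    matched against the text tags in the sound information database."""
    image = Image.open(emoticon_path)
    # "chi_sim" targets simplified-Chinese captions; swap the lang
    # argument for emoticons captioned in other languages.
    return pytesseract.image_to_string(image, lang="chi_sim").strip()
```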
  12. The method according to claim 8, wherein the generating associated sound information of the first emoticon package based on the first sound information comprises:
    acquiring text information contained in the first emoticon package;
    clipping, from the first sound information based on the text information, a sound segment containing the text information;
    generating the associated sound information of the first emoticon package based on the sound segment.
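For claim 12, locating "the segment containing the text information" presupposes word-level timestamps for the first sound information; the sketch below assumes a speech recognizer has already produced (word, start, end) triples and simply cuts the covering span out of the sample array:

```python
def clip_text_segment(samples, sample_rate, word_timings, caption_text):
    """Cut the span of `samples` whose speech covers `caption_text`.

    samples: 1-D array of audio samples (e.g. a NumPy array)
    word_timings: list of (word, start_sec, end_sec) triples from any
                  speech recognizer with word-level timestamps
    """
    hits = [(start, end) for word, start, end in word_timings
            if word in caption_text]
    if not hits:
        return samples  # no overlap found: fall back to the whole clip
    first = min(start for start, _ in hits)
    last = max(end for _, end in hits)
    return samples[int(first * sample_rate):int(last * sample_rate)]
```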
  13. The method according to claim 12, wherein the generating the associated sound information of the first emoticon package based on the sound segment comprises:
    adjusting, if the first emoticon package is a video animation, the playback duration of the sound segment based on the playback duration of the first emoticon package to obtain the associated sound information of the first emoticon package;
    wherein the playback duration of the associated sound information of the first emoticon package is the same as the playback duration of the first emoticon package.
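Claim 13's duration adjustment amounts to time-stretching the clipped segment until it lasts exactly as long as the animation; librosa's phase-vocoder stretch is one off-the-shelf way to do this (the application itself names no library, so this is a sketch under that assumption):

```python
import librosa

def match_animation_duration(samples, sample_rate, animation_seconds):
    """Stretch or squeeze the sound segment so its playback duration
    equals the playback duration of the animated emoticon package."""
    current_seconds = len(samples) / sample_rate
    # librosa semantics: rate > 1 speeds the audio up (shorter output),
    # rate < 1 slows it down (longer output).
    rate = current_seconds / animation_seconds
    return librosa.effects.time_stretch(samples, rate=rate)
```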
  14. The method according to any one of claims 8 to 13, wherein the method further comprises:
    collecting a plurality of pieces of historical sound information sent by a sender account of the first emoticon package;
    performing text conversion on the sound contained in each piece of historical sound information to obtain text tags respectively corresponding to the pieces of historical sound information;
    obtaining, based on sending scenes respectively corresponding to the pieces of historical sound information, scene tags respectively corresponding to the pieces of historical sound information;
    obtaining, based on sound emotions respectively corresponding to the pieces of historical sound information, emotion tags respectively corresponding to the pieces of historical sound information.
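Claim 14's offline tagging pass could look roughly like the sketch below; `transcribe` and `classify_emotion` are deliberately taken as injected callables because the application does not commit to any particular speech-to-text or emotion model:

```python
from typing import Callable, Iterable

def build_tags(
    history: Iterable[tuple],              # (samples, sample_rate, sending_scene) triples
    transcribe: Callable[..., str],        # any speech-to-text model or service
    classify_emotion: Callable[..., str],  # any speech-emotion classifier
) -> list[dict]:
    """Derive text / scene / emotion tags for each historical voice
    message sent from the sender account (claim 14)."""
    tagged = []
    for samples, sample_rate, sending_scene in history:
        tagged.append({
            "text_tag": transcribe(samples, sample_rate),
            "scene_tag": sending_scene,  # read off the message's sending context
            "emotion_tag": classify_emotion(samples, sample_rate),
        })
    return tagged
```

A caller would pass its own models, e.g. `build_tags(history, transcribe=my_asr, classify_emotion=my_emotion_model)`, and store the resulting tags alongside each clip in the sound information database.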
  15. An emoticon package display apparatus, comprising:
    an interface display module, configured to display a chat session interface, wherein the chat session interface is used to present chat messages between at least two users;
    an emoticon display module, configured to display, in response to an emoticon package selection operation on the chat session interface, an emoticon package selection interface, wherein at least one emoticon package is displayed in the emoticon package selection interface;
    a message display module, configured to display, in response to a sending operation on a first emoticon package among the at least one emoticon package, an audio emoticon message corresponding to the first emoticon package in the chat session interface; wherein the audio emoticon message corresponding to the first emoticon package is used to present the first emoticon package and associated sound information of the first emoticon package, and the associated sound information of the first emoticon package is sound information associated with the first emoticon package and obtained by matching from a sound information database.
  16. An apparatus for acquiring associated sound of an emoticon package, comprising:
    a feature acquisition module, configured to acquire feature information of a first emoticon package;
    a sound matching module, configured to obtain, by matching from a sound information database based on the feature information, first sound information associated with the first emoticon package;
    a sound generation module, configured to generate associated sound information of the first emoticon package based on the first sound information; wherein the associated sound information of the first emoticon package is used to generate an audio emoticon message corresponding to the first emoticon package.
  17. A computer device, comprising a processor and a memory, wherein the memory stores a computer program, and the computer program is loaded and executed by the processor to implement the emoticon package display method according to any one of claims 1 to 7, or the method for acquiring associated sound of an emoticon package according to any one of claims 8 to 14.
  18. A computer-readable storage medium, wherein the storage medium stores a computer program, and the computer program is loaded and executed by a processor to implement the emoticon package display method according to any one of claims 1 to 7, or the method for acquiring associated sound of an emoticon package according to any one of claims 8 to 14.
  19. A computer program product, comprising a computer program stored in a computer-readable storage medium, wherein a processor reads the computer program from the computer-readable storage medium and executes the computer program to implement the emoticon package display method according to any one of claims 1 to 7, or the method for acquiring associated sound of an emoticon package according to any one of claims 8 to 14.
PCT/CN2022/119778 2021-11-17 2022-09-20 Emoticon package display and associated sound acquisition method, apparatus, device, and storage medium WO2023087888A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US18/201,614 US20230300095A1 (en) 2021-11-17 2023-05-24 Audio-enabled messaging of an image

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202111362112.8 2021-11-17
CN202111362112.8A CN116137617B (zh) 2021-11-17 2021-11-17 Emoticon package display and associated sound acquisition method, apparatus, device, and storage medium

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US18/201,614 Continuation US20230300095A1 (en) 2021-11-17 2023-05-24 Audio-enabled messaging of an image

Publications (1)

Publication Number Publication Date
WO2023087888A1 (zh)

Family

ID=86332579

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/119778 2021-11-17 2022-09-20 Emoticon package display and associated sound acquisition method, apparatus, device, and storage medium WO2023087888A1 (zh)

Country Status (3)

Country Link
US (1) US20230300095A1 (zh)
CN (1) CN116137617B (zh)
WO (1) WO2023087888A1 (zh)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106888158A (zh) * 2017-02-28 2017-06-23 Nubia Technology Co., Ltd. Instant messaging method and apparatus
US20180025004A1 (en) * 2016-07-19 2018-01-25 Eric Koenig Process to provide audio/video/literature files and/or events/activities ,based upon an emoji or icon associated to a personal feeling
CN112883181A (zh) * 2021-02-26 2021-06-01 Tencent Technology (Shenzhen) Co., Ltd. Session message processing method and apparatus, electronic device, and storage medium
CN112910761A (zh) * 2021-01-29 2021-06-04 Beijing Baidu Netcom Science and Technology Co., Ltd. Instant messaging method and apparatus, device, storage medium, and program product
CN113538628A (zh) * 2021-06-30 2021-10-22 Guangzhou Kugou Computer Technology Co., Ltd. Emoticon package generation method and apparatus, electronic device, and computer-readable storage medium

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190221208A1 (en) * 2018-01-12 2019-07-18 Kika Tech (Cayman) Holdings Co., Limited Method, user interface, and device for audio-based emoji input
US10306395B1 (en) * 2018-02-05 2019-05-28 Philip Scott Lyren Emoji that indicates a location of binaural sound
WO2020062014A1 (zh) * 2018-09-28 2020-04-02 Huawei Technologies Co., Ltd. Method for inputting information into an input box, and electronic device
CN110379430B (zh) * 2019-07-26 2023-09-22 Tencent Technology (Shenzhen) Co., Ltd. Voice-based animation display method and apparatus, computer device, and storage medium
CN112073204B (zh) * 2020-08-27 2021-09-03 Tencent Technology (Shenzhen) Co., Ltd. Group control method and apparatus, electronic device, and storage medium
CN112907703A (zh) * 2021-01-18 2021-06-04 深圳全民吃瓜科技有限公司 Emoticon package generation method and system


Also Published As

Publication number Publication date
CN116137617A (zh) 2023-05-19
US20230300095A1 (en) 2023-09-21
CN116137617B (zh) 2024-03-22

Similar Documents

Publication Publication Date Title
US11368575B2 (en) Management of calls and media content associated with a caller on mobile computing devices
US10944863B2 (en) Management of media content derived from natural language processing on mobile computing devices
US11005990B2 (en) Methods and systems for contact firewalls on mobile computing devices
US10979558B2 (en) Management of media content associated with time-sensitive offers on mobile computing devices
US11381679B2 (en) Management of media content associated with call context on mobile computing devices
US10979559B2 (en) Management of calls on mobile computing devices based on call participants
US10951755B2 (en) Management of media content for caller IDs on mobile computing devices
US10965809B2 (en) Management of media content associated with a call participant on mobile computing devices
US10931819B2 (en) Management of media content associated with a user of a mobile computing device
US8515255B2 (en) Systems and methods for enhancing media with supplemental content
US20200053211A1 (en) Management of media content associated with ending a call on mobile computing devices
EP3725105B1 (en) Methods and systems for management of media content associated with message context on mobile computing devices
EP1968266A1 (en) Device, system, and method of electronic communication utilizing audiovisual clips
CN105554027A (zh) Resource sharing method and apparatus
US20140164371A1 (en) Extraction of media portions in association with correlated input
US20150121248A1 (en) System for effectively communicating concepts
CN111314204A (zh) Interaction method and apparatus, terminal, and storage medium
US20140161423A1 (en) Message composition of media portions in association with image content
CN114880062B (zh) Chat emoticon display method, device, electronic device, and storage medium
US20140163956A1 (en) Message composition of media portions in association with correlated text
WO2023246275A1 (zh) Voice message playback method and apparatus, terminal, and storage medium
US11864066B2 (en) Complex computing network for improving establishment and streaming of audio communication among mobile computing devices
WO2023087888A1 (zh) Emoticon package display and associated sound acquisition method, apparatus, device, and storage medium
KR20190094080A (ko) Interactive AI agent system, method, and computer-readable recording medium for actively providing an ordering or reservation service based on monitoring of a conversation session between users
WO2019156535A1 (ko) Interactive AI agent system, method, and computer-readable recording medium for actively providing an ordering or reservation service based on monitoring of a conversation session between users by using previous history information in the conversation session

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22894426

Country of ref document: EP

Kind code of ref document: A1