WO2022000991A1 - Emoticon package generation method and device, electronic device, and medium - Google Patents

Emoticon package generation method and device, electronic device, and medium

Info

Publication number
WO2022000991A1
WO2022000991A1 (Application PCT/CN2020/133649)
Authority
WO
WIPO (PCT)
Prior art keywords
target video
target
video
exclusive
feedback information
Prior art date
Application number
PCT/CN2020/133649
Other languages
English (en)
French (fr)
Inventor
徐传任
Original Assignee
北京百度网讯科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京百度网讯科技有限公司
Priority to EP20922491.4A (published as EP3955131A4)
Priority to KR1020217029278A (published as KR20210118203A)
Priority to JP2021552794A (published as JP7297084B2)
Priority to US17/471,086 (published as US20210407166A1)
Publication of WO2022000991A1

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/73Querying
    • G06F16/738Presentation of query results
    • G06F16/739Presentation of query results in form of a video summary, e.g. the video summary being a video sequence, a composite still image or having synthesized frames
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/73Querying
    • G06F16/732Query formulation
    • G06F16/7328Query by example, e.g. a complete video frame or video sequence
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/73Querying
    • G06F16/735Filtering based on additional data, e.g. user or group profiles
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/78Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/783Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • G06F16/7847Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using low-level visual features of the video content
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/431Generation of visual interfaces for content selection or interaction; Content or additional data rendering
    • H04N21/4312Generation of visual interfaces for content selection or interaction; Content or additional data rendering involving specific graphical features, e.g. screen layout, special fonts or colors, blinking icons, highlights or animations
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/441Acquiring end-user identification, e.g. using personal code sent by the remote control or by inserting a card
    • H04N21/4415Acquiring end-user identification, e.g. using personal code sent by the remote control or by inserting a card using biometric characteristics of the user, e.g. by voice recognition or fingerprint scanning
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/45Management operations performed by the client for facilitating the reception of or the interaction with the content or administrating data related to the end-user or to the client device itself, e.g. learning user preferences for recommending movies, resolving scheduling conflicts
    • H04N21/466Learning process for intelligent management, e.g. learning user preferences for recommending movies
    • H04N21/4668Learning process for intelligent management, e.g. learning user preferences for recommending movies for recommending content, e.g. movies

Definitions

  • the present disclosure relates to the field of multimedia technologies, in particular to the field of video and image processing, and more particularly to an emoticon package generation method and device, an electronic device, and a medium.
  • in the related art, multimedia platforms push a unified emoticon package to all users, and the emoticons in these packages mainly use celebrities, animations, films and so on as material.
  • a method for generating an emoticon package, including: acquiring at least one piece of target feedback information of multiple related videos of a target video, the target video and the multiple related videos involving the same video host; matching the at least one piece of target feedback information with the target video; determining at least one target video segment from the target video based on the matching result; and generating an exclusive emoticon package based at least on the at least one target video segment.
  • a device for generating an emoticon package, comprising: an acquisition unit configured to acquire at least one piece of target feedback information of multiple related videos of a target video; a matching unit configured to match the at least one piece of target feedback information with the target video; a determining unit configured to determine at least one target video segment from the target video based on a matching result of the matching unit; and a generating unit configured to generate an exclusive emoticon package based at least on the at least one target video segment.
  • an electronic device comprising: a processor; and a memory storing a program, the program including instructions that, when executed by the processor, cause the processor to execute the above emoticon package generation method.
  • a computer-readable storage medium storing a program, the program comprising instructions that, when executed by a processor of an electronic device, cause the electronic device to perform the above emoticon package generation method.
  • FIG. 1 is a flowchart illustrating a method for generating an emoticon package according to an exemplary embodiment
  • FIG. 2 is a flowchart illustrating a method for obtaining at least one piece of target feedback information of a plurality of related videos according to an exemplary embodiment
  • FIG. 3 is a flowchart illustrating a method for generating an emoticon package according to an exemplary embodiment
  • FIG. 4 is a schematic block diagram showing the composition of an emoticon package generating device according to an exemplary embodiment
  • FIG. 5 is a block diagram illustrating an example computing device that can be applied to example embodiments.
  • the use of the terms "first", "second", etc. to describe various elements is not intended to limit the positional relationship, temporal relationship or importance relationship of these elements; such terms are only used to distinguish one element from another.
  • the first element and the second element may refer to the same instance of the element, while in some cases, based on the context of the description, they may refer to different instances.
  • in the related art, multimedia platforms push a unified emoticon package to all users, and the emoticons in these packages mainly use celebrities, animations, films and so on as material. When commenting or making videos, users can only use the emoticons in this uniformly pushed package. Such undifferentiated, uniform emoticons have no association with the user who makes the videos (hereinafter referred to as the video host) and can only be used as generic emoticons.
  • based on research into big data, the inventor realized that the video segments related to the most frequently occurring comments (which may refer to multiple comments with high semantic similarity) are essentially the video segments that users are interested in; these segments are the most attractive to users and often best reflect the personality of the video host.
  • a video segment related to a comment may refer to a segment on which other users have made the comment, or to a segment that matches the comment.
  • a video segment matching a comment may mean that the comment has a high semantic similarity with at least a part of the text corresponding to the subtitles or audio of the video segment.
  • based on this, the present disclosure provides an emoticon package generation method: target feedback information of multiple related videos of a target video is acquired and matched with the target video, so as to determine at least one target video segment from the target video. The users' feedback information can thus be mapped into the target video, and target video segments related to the users' feedback can be obtained. The target video and the multiple related videos involve the same video host. Then, an exclusive emoticon package of the video host is generated based on the obtained target video segments; it can be pushed to the video host and to other users, so that other users can use the exclusive emoticons in it to comment on videos uploaded by the video host, and the video host can select suitable exclusive emoticons from it to add to produced videos. An exclusive emoticon package generated by the above technical solution can match the interests of other users and reflect the personality of the video host; using it can deepen other users' impression of the video host and enhance the video host's recognition and influence.
  • the feedback information of a video may include, for example, at least one of the following: bullet-screen comments, comments in the video comment area, likes, and shares.
  • the target video may be a live video or a recorded video.
  • the target video and the multiple related videos may, for example, be produced by the same user (whether live-streamed or recorded).
  • a video produced by a user may mean that the produced video includes the audio and/or video of the user.
  • FIG. 1 is a flowchart illustrating a method for generating an emoticon package according to an exemplary embodiment of the present disclosure.
  • the method for generating an emoticon package may include: step S101, acquiring at least one piece of target feedback information of multiple related videos of a target video, where the target video and the multiple related videos involve the same video host; step S102, matching the at least one piece of target feedback information with the target video; step S103, determining at least one target video segment from the target video based on the matching result; and step S104, generating an exclusive emoticon package based at least on the at least one target video segment.
  • the multiple related videos may be obtained from, but are not limited to, a video library; for example, they may also be obtained by crawling the web.
  • at least one piece of feedback information that occurs most frequently among the pieces of feedback information of the multiple related videos may be determined as the target feedback information. That is, among the multiple pieces of feedback information of the multiple related videos, the at least one piece of target feedback information occurs more often than the remaining feedback information. The target video segments of most interest to users can therefore be determined based on the at least one piece of target feedback information.
  • the target video segment may include multiple consecutive video frames, or may be a single video frame, which is not limited herein.
  • step S101 may include: step S1011, acquiring at least one piece of feedback information of each of the multiple related videos; step S1012, performing semantic matching on the multiple pieces of feedback information of the multiple related videos; step S1013, dividing the multiple pieces of feedback information into multiple feedback information groups based on the semantic matching result; step S1014, determining, as target feedback information groups, at least one feedback information group whose number of included pieces of feedback information is greater than a threshold; and step S1015, determining, based on the multiple pieces of feedback information in each target feedback information group, the target feedback information corresponding to that group. In this way, the at least one piece of target feedback information is the feedback information with the most occurrences, as sketched below.
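To make steps S1011 to S1015 concrete, the following sketch greedily groups comments by embedding similarity and keeps the groups whose size exceeds a threshold. It is a minimal illustration rather than the patented implementation: the embed function (any sentence-embedding model), the 0.8 similarity cutoff, and the choice of the group member closest to the centroid as the representative are all assumptions.

```python
import numpy as np

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def group_feedback(comments, embed, sim_threshold=0.8, min_group_size=50):
    """Steps S1012-S1015: cluster comments by semantic similarity (greedy,
    single pass), keep the groups larger than min_group_size, and return one
    representative comment per kept group as the target feedback information.

    `embed` maps a string to a fixed-size vector; plug in any sentence
    embedding model (assumed here, not specified by the disclosure).
    """
    groups = []  # each group: {"centroid": vector, "members": [comment, ...]}
    for text in comments:
        vec = embed(text)
        for g in groups:
            if cosine(vec, g["centroid"]) >= sim_threshold:
                g["members"].append(text)
                # incremental mean keeps the centroid representative
                n = len(g["members"])
                g["centroid"] = g["centroid"] + (vec - g["centroid"]) / n
                break
        else:
            groups.append({"centroid": vec, "members": [text]})
    # S1014: groups with more members than the threshold become target groups
    target_groups = [g for g in groups if len(g["members"]) > min_group_size]
    # S1015: the member closest to the centroid represents its group
    return [max(g["members"], key=lambda t: cosine(embed(t), g["centroid"]))
            for g in target_groups]
```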
  • the keyword with the highest semantic similarity across the multiple pieces of feedback information in each target feedback information group may be, but is not limited to being, taken as the target feedback information.
  • the keyword may be, for example, "666", "like", "rooting for you", and so on.
  • one piece of feedback information in each target feedback information group may also be determined as the target feedback information.
  • the emoticon package generation method may further include: before acquiring the at least one piece of target feedback information of the multiple related videos of the target video, determining whether the total number of pieces of feedback information of the multiple related videos is not less than a set value; and, in response to determining that the total number of pieces of feedback information of the multiple related videos is less than the set value, pushing guidance information to guide users to input feedback information for the target video. Acquiring the at least one piece of target feedback information of the multiple related videos may be performed in response to determining that the total number of pieces of feedback information of the multiple related videos is not less than the set value. The required target feedback information can therefore be obtained from a sufficiently large amount of feedback, so that the obtained target feedback information better indicates the interests of most users.
  • pushing the guidance information may be, for example, sending a guiding bullet-screen comment (for example, "a large wave of bullet comments is about to hit").
  • after the at least one piece of target feedback information of the multiple related videos of the target video is acquired, steps S102 and S103 may be performed to obtain at least one target video segment.
  • at least one target video segment may match at least one of the at least one piece of target feedback information.
  • matching a target video segment with target feedback information may mean that, in the target video, the semantic similarity between the text corresponding to the subtitles or audio of the target video segment and the corresponding target feedback information is the highest, and the semantic similarity score is greater than a set threshold.
  • step S102 may include: semantically matching each piece of target feedback information in the at least one piece of target feedback information with the text corresponding to at least part of the subtitles or at least part of the audio of the target video. Determining at least one target video segment from the target video is performed based on the semantic matching result; a sketch of this matching follows.
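The matching of steps S102 and S103 can be pictured as scoring each subtitle line of the target video against each piece of target feedback and keeping the best-scoring lines that clear a threshold. A minimal sketch under the same assumptions as above (it reuses the hypothetical embed and cosine helpers); the subtitle tuple format and the 0.75 cutoff are likewise assumptions.

```python
def find_target_segments(subtitles, target_feedback, embed, score_threshold=0.75):
    """Steps S102-S103: for each piece of target feedback, find the subtitle
    line of the target video with the highest semantic similarity, and keep
    it as a target video segment only if the score clears the threshold.

    `subtitles` is a list of (start_sec, end_sec, text) tuples.
    """
    segments = []
    for feedback in target_feedback:
        fb_vec = embed(feedback)
        scored = [(cosine(fb_vec, embed(text)), start, end, text)
                  for (start, end, text) in subtitles]
        score, start, end, text = max(scored)
        if score >= score_threshold:
            segments.append({"start": start, "end": end,
                             "text": text, "feedback": feedback})
    return segments
```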
  • the exclusive emoticon package generated based on the target video segments in the present disclosure may include exclusive animated emoticons and/or exclusive stickers.
  • the exclusive emoticon package can be pushed to other users and to the video host, so that other users can use the emoticons in the video host's exclusive emoticon package to comment when watching videos uploaded by the video host, and the video host can add emoticons from the exclusive emoticon package when making videos.
  • the target video segments may include a first target video segment for generating an exclusive animated emoticon.
  • the emoticon package generation method may further include: step S201, before the exclusive emoticon package is generated, performing target recognition (for example, face recognition) on each of the target video segments; step S202, determining, based on the recognition result, whether the target video segment includes the video host; and step S203, in response to determining that the target video segment includes the video host, determining the target video segment as the first target video segment.
  • step S104 may include: step S1041, generating an exclusive animated emoticon based at least on the first target video segment. An exclusive animated emoticon of the video host can thus be generated based on video segments that include the video host, and using it can raise the video host's recognition; a sketch of this segment classification follows.
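Steps S201 to S204 amount to running face recognition on each candidate segment and routing segments that show the host to the animated-emoticon pipeline and the rest to the sticker pipeline. The sketch below is one possible realization, assuming the open-source face_recognition and OpenCV packages and a single sampled frame per segment; the disclosure does not prescribe a particular recognizer.

```python
import cv2
import face_recognition

def classify_segments(video_path, segments, host_encoding, tolerance=0.6):
    """Steps S201-S204: sample one frame from the middle of each segment,
    run face recognition, and split the segments into first target segments
    (host present, S203) and second target segments (host absent, S204).

    `host_encoding` is a precomputed face encoding of the video host.
    """
    cap = cv2.VideoCapture(video_path)
    fps = cap.get(cv2.CAP_PROP_FPS)
    first_targets, second_targets = [], []
    for seg in segments:
        mid_frame = int(fps * (seg["start"] + seg["end"]) / 2)
        cap.set(cv2.CAP_PROP_POS_FRAMES, mid_frame)
        ok, frame = cap.read()
        if not ok:
            continue
        rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
        encodings = face_recognition.face_encodings(rgb)
        if any(face_recognition.compare_faces([host_encoding], enc,
                                              tolerance=tolerance)[0]
               for enc in encodings):
            first_targets.append(seg)   # S203: host is on screen
        else:
            second_targets.append(seg)  # S204: no host face found
    cap.release()
    return first_targets, second_targets
```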
  • for example, a certain food host often says "Today, let's eat something good" when cooking something particularly nice, with an exaggerated expression and a comic air, and users often tease this in the comments. With the technical solution of the present disclosure, a corresponding animated emoticon can be generated based on that video segment (which may be the segment whose subtitles or audio correspond to the line "Today, let's eat something good") and pushed to the video host and other users. Using this animated emoticon can greatly deepen viewers' impression of the host and quickly increase the host's popularity.
  • in step S104, generating an exclusive animated emoticon based at least on the first target video segment may include: determining first text information corresponding to the first target video segment; and generating the exclusive animated emoticon based on the first target video segment and the corresponding first text information.
  • the first text information may be determined based on text corresponding to subtitles or audio in the first target video segment.
  • the first text information may be the text corresponding to a line of subtitles or a sentence of audio when the first target video segment is played.
  • the first text information may also be determined based on target feedback information matching the first target video segment.
  • generating an exclusive animated emoticon based at least on the first target video segment may be, but is not limited to being, executed in response to receiving a first trigger instruction, so that the animated emoticon can be generated selectively according to the video host's trigger instruction, providing better flexibility.
  • the target video segment may further include a second target video segment for generating an exclusive sticker.
  • the emoticon package generation method may further include: step S204, in response to determining that a target video segment does not include the video host (for example, does not include the video host's face), determining the target video segment as the second target video segment.
  • generating the exclusive emoticon package may further include: step S1042, determining second text information corresponding to the second target video segment; and step S1043, generating an exclusive sticker based at least on the second text information. This makes it possible to generate exclusive stickers based on video segments that do not include a face.
  • the second text information may be determined based on text corresponding to subtitles or audio in the second target video segment.
  • the second text information may be text corresponding to a sentence of subtitles or a sentence of audio when the second target video segment is played.
  • the second text information may also be determined based on target feedback information matched with the second target video segment.
  • generating the exclusive sticker based at least on the second text information may include: acquiring a face image related to the target video; and generating the exclusive sticker based on the second text information and the face image. This makes it possible to generate exclusive stickers that include a face image.
  • the generated exclusive sticker may use, for example, an avatar specified by the video host, or the video host's avatar obtained from the target video. An exclusive sticker including the video host's avatar can thus be generated, and using it can raise the video host's recognition; a sketch of the compositing follows.
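To make the sticker generation of steps S1042 and S1043 concrete, the sketch below composites a face image and a caption onto a transparent canvas with Pillow. The canvas size, font file, and layout are arbitrary illustrative choices, not values taken from the disclosure.

```python
from PIL import Image, ImageDraw, ImageFont

def make_sticker(face_path, caption, out_path,
                 size=(240, 240), font_path="DejaVuSans-Bold.ttf"):
    """Steps S1042-S1043: compose an exclusive sticker from the host's face
    image with the second text information drawn underneath it."""
    canvas = Image.new("RGBA", size, (255, 255, 255, 0))  # transparent
    face = Image.open(face_path).convert("RGBA")
    face.thumbnail((size[0], size[1] - 50))  # leave room for the caption
    canvas.paste(face, ((size[0] - face.width) // 2, 0), face)

    draw = ImageDraw.Draw(canvas)
    # font_path is an assumption; point this at any TrueType font on disk
    font = ImageFont.truetype(font_path, 28)
    text_w = draw.textlength(caption, font=font)
    draw.text(((size[0] - text_w) / 2, size[1] - 44), caption,
              font=font, fill=(0, 0, 0, 255))
    canvas.save(out_path)
```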
  • it can be understood that an exclusive sticker can also be generated based at least on the first target video segment, which includes a face. The specific generation method is the same as for the second target video segment and is not detailed here.
  • the above describes generating the exclusive emoticon package based on the target video segments that match the target feedback information.
  • the emoticon package generation method may further include: in response to receiving a second trigger instruction, generating exclusive emoticons based on a selected set video segment. The user can thus actively select a designated video segment from which to generate exclusive emoticons (which may include exclusive animated emoticons and/or exclusive stickers; the specific generation method is the same as generating exclusive emoticons based on target video segments described above), giving higher flexibility and further improving the user experience.
  • a one-key conversion icon can be displayed at a designated position of the video, and the video host can input the second trigger instruction by clicking the one-key conversion icon.
  • the video host may, but is not limited to, select the set video segment from an already recorded video for generating exclusive emoticons.
  • the above describes that, after the exclusive emoticon package is produced, it can be pushed to the video host and other users for selection and use.
  • an appropriate exclusive emoticon can also be selected automatically from the exclusive emoticon package and added to a preset video.
  • the emoticon package generation method may further include: matching at least a part of the exclusive emoticons in the exclusive emoticon package with a preset video; determining, based on the matching result, at least one exclusive emoticon matching the preset video; and determining, based on the matching result, a matching video segment in the preset video for each of the at least one exclusive emoticon, so that, while the preset video is playing, the exclusive emoticon corresponding to a matching video segment can be pushed when that segment is played. By establishing an association between the preset video and the exclusive emoticon package, the corresponding exclusive emoticon can be pushed automatically when the matching video segment is played.
  • the preset video may be related to the target video.
  • the preset video and the target video may involve the same video host. An association can therefore be established between the video host's preset videos and the video host's exclusive emoticon package, and the corresponding exclusive emoticon can be pushed automatically when a preset video is played.
  • the preset video may be a historical video stored in the video library, or a newly acquired video currently uploaded by the video host.
  • the multimedia platform can periodically obtain newly added videos in the video library and match at least a part of the exclusive emoticons in the corresponding video host's exclusive emoticon package with the newly added videos, thereby establishing an association between the videos uploaded by the video host and the host's exclusive emoticon package.
  • the emoticon package generation method may further include: establishing an association database for the exclusive emoticon package, the association database including the correspondence between the at least one exclusive emoticon and at least one matching video segment. This makes it convenient, when a matching video segment is subsequently played, to obtain and push the corresponding exclusive emoticon from the association database.
  • the emoticon package generation method may further include: acquiring play time information of the at least one matching video segment.
  • the association database may further include the correspondence between the at least one matching video segment and at least one piece of play time information (i.e., the play time information corresponding to each of the at least one matching video segment). During playback of the preset video, the corresponding exclusive emoticon can therefore be pushed according to the play time information, and the play time corresponding to the at least one exclusive emoticon can be matched quickly, improving pushing efficiency.
  • the play time information may be the time period during which the corresponding matching video segment is played during playback of the preset video; a sketch of such an association database follows.
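One way to realize the association database and the time-based pushing described above is a small table keyed by play-time window, as in the sketch below; the SQLite schema and the interval-lookup rule are illustrative assumptions, not the disclosed design. On this schema, pushing during playback reduces to calling emoticons_at with the current playback position.

```python
import sqlite3

def build_assoc_db(db_path, entries):
    """Create the association database: each row links one exclusive
    emoticon to a matching video segment and its play-time window.

    `entries` is an iterable of (emoticon_id, clip_id, start_sec, end_sec).
    """
    conn = sqlite3.connect(db_path)
    conn.execute("""CREATE TABLE IF NOT EXISTS assoc (
                        emoticon_id TEXT, clip_id TEXT,
                        start_sec REAL, end_sec REAL)""")
    conn.executemany("INSERT INTO assoc VALUES (?, ?, ?, ?)", entries)
    conn.commit()
    return conn

def emoticons_at(conn, t):
    """During playback, look up the exclusive emoticons to push at time t."""
    rows = conn.execute("SELECT emoticon_id FROM assoc "
                        "WHERE start_sec <= ? AND end_sec >= ?", (t, t))
    return [r[0] for r in rows]
```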
  • the emoticon package generating device 100 may include: an acquisition unit 101 configured to acquire at least one piece of target feedback information of multiple related videos of a target video, where the target video and the multiple related videos involve the same video host; a matching unit 102 configured to match the at least one piece of target feedback information with the target video; a determining unit 103 configured to determine at least one target video segment from the target video based on the matching result of the matching unit; and a generating unit 104 configured to generate an exclusive emoticon package based at least on the at least one target video segment.
  • the operations of the above units 101-104 of the emoticon package generating device 100 are respectively similar to the operations of steps S101-S104 described above and are not repeated here.
  • an electronic device may include: a processor; and a memory storing a program, the program including instructions that, when executed by the processor, cause the processor to execute the above emoticon package generation method.
  • a computer-readable storage medium storing a program, the program comprising instructions that, when executed by a processor of an electronic device, cause the electronic device to perform the above emoticon package generation method.
  • Computing device 2000 may be any machine configured to perform processing and/or computation, which may be, but is not limited to, a workstation, server, desktop computer, laptop computer, tablet computer, personal digital assistant, robot, smartphone, vehicle-mounted computer, or any combination thereof.
  • the above emoticon package generation method may be implemented in whole or at least in part by the computing device 2000 or a similar device or system.
  • Computing device 2000 may include elements connected to or in communication with bus 2002 (possibly via one or more interfaces).
  • computing device 2000 may include a bus 2002 , one or more processors 2004 , one or more input devices 2006 , and one or more output devices 2008 .
  • the one or more processors 2004 may be any type of processor, and may include, but are not limited to, one or more general-purpose processors and/or one or more special-purpose processors (e.g., dedicated processing chips).
  • Input device 2006 may be any type of device capable of inputting information to computing device 2000, and may include, but is not limited to, a mouse, keyboard, touch screen, microphone, and/or remote control.
  • Output device 2008 may be any type of device capable of presenting information, and may include, but is not limited to, a display, speakers, video/audio output terminals, vibrators, and/or printers.
  • Computing device 2000 may also include, or be connected to, non-transitory storage device 2010, which may be any storage device that is non-transitory and enables data storage, and may include, but is not limited to, disk drives, optical storage devices, solid-state memory, floppy disks, flexible disks, hard disks, magnetic tape or any other magnetic medium, optical discs or any other optical medium, ROM (read-only memory), RAM (random access memory), cache memory and/or any other memory chip or cartridge, and/or any other medium from which a computer can read data, instructions and/or code.
  • the non-transitory storage device 2010 may be detachable from an interface.
  • the non-transitory storage device 2010 may have data/programs (including instructions)/code for implementing the methods and steps described above.
  • Computing device 2000 may also include communication device 2012 .
  • Communication device 2012 may be any type of device or system that enables communication with external devices and/or with a network, and may include, but is not limited to, modems, network cards, infrared communication devices, wireless communication devices and/or chipsets, such as Bluetooth™ devices, 802.11 devices, WiFi devices, WiMax devices, cellular communication devices and/or the like.
  • Computing device 2000 may also include working memory 2014, which may be any type of working memory that can store programs (including instructions) and/or data useful for the operation of processor 2004, and may include, but is not limited to, random access memory and/or read-only memory devices.
  • Software elements (programs) may be located in working memory 2014, including, but not limited to, an operating system 2016, one or more application programs 2018, drivers and/or other data and code. Instructions for performing the above methods and steps may be included in the one or more application programs 2018, and the above emoticon package generation method may be implemented by the processor 2004 reading and executing the instructions of the one or more application programs 2018. More specifically, steps S101 to S104 of the above emoticon package generation method may be implemented, for example, by the processor 2004 executing an application program 2018 having the instructions of steps S101 to S104. Other steps of the above emoticon package generation method may likewise be implemented, for example, by the processor 2004 executing an application program 2018 having instructions for executing the corresponding steps.
  • the executable code or source code of the instructions of the software elements (programs) may be stored in a non-transitory computer-readable storage medium (such as the storage device 2010 described above) and, when executed, may be loaded into the working memory 2014 (and possibly compiled and/or installed).
  • the executable code or source code of the instructions for the software elements (programs) may also be downloaded from remote locations.
  • custom hardware may also be used, and/or particular elements may be implemented in hardware, software, firmware, middleware, microcode, hardware description languages, or any combination thereof.
  • some or all of the disclosed methods and devices may be implemented by programming hardware (e.g., programmable logic circuits including field programmable gate arrays (FPGA) and/or programmable logic arrays (PLA)) in an assembly language or a hardware programming language (such as VERILOG, VHDL, C++) using logic and algorithms according to the present disclosure.
  • a client may receive data entered by a user and send the data to a server.
  • the client can also receive the data input by the user, perform part of the processing in the foregoing method, and send the data obtained from the processing to the server.
  • the server may receive data from the client, execute the aforementioned method or another part of the aforementioned method, and return the execution result to the client.
  • the client can receive the execution result of the method from the server, and can present it to the user, for example, through an output device.
  • computing device 2000 may be distributed over a network. For example, some processing may be performed using one processor, while other processing may be performed by another processor remote from the one processor. Other components of computing system 2000 may be similarly distributed. As such, computing device 2000 may be interpreted as a distributed computing system that performs processing in multiple locations.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Signal Processing (AREA)
  • Computational Linguistics (AREA)
  • Library & Information Science (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • General Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Mathematical Physics (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

An emoticon package generation method and device, an electronic device, and a medium, relating to the field of multimedia technologies and in particular to the field of video and image processing. The emoticon package generation method includes: acquiring at least one piece of target feedback information of multiple related videos of a target video, the target video and the multiple related videos involving the same video host (S101); matching the at least one piece of target feedback information with the target video (S102); determining at least one target video segment from the target video based on the matching result (S103); and generating an exclusive emoticon package based at least on the at least one target video segment (S104).

Description

Emoticon package generation method and device, electronic device, and medium
Cross-reference to related applications
This application claims priority to Chinese patent application No. 202010601966.6, filed on June 28, 2020, the entire contents of which are incorporated herein by reference.
Technical field
The present disclosure relates to the field of multimedia technologies, in particular to the field of video and image processing, and particularly to an emoticon package generation method and device, an electronic device, and a medium.
Background
In recent years, with the rapid spread of smartphones and the marked increase in mobile network speeds, more and more users have begun to use video-oriented multimedia software. Through multimedia software, users can upload videos they have made themselves and watch videos uploaded by other users. When making a video, many users like to select a suitable emoticon from an emoticon package and add it to the video. When watching videos uploaded by other users, a user can comment, add a suitable emoticon from an emoticon package to the comment, or post a comment using only an emoticon selected from an emoticon package.
In the related art, multimedia platforms push a unified emoticon package to all users, and the emoticons in these packages mainly use celebrities, animations, films and so on as material.
The methods described in this section are not necessarily methods that have previously been conceived or employed. Unless otherwise indicated, it should not be assumed that any of the methods described in this section qualify as prior art merely by virtue of their inclusion in this section. Similarly, unless otherwise indicated, the problems mentioned in this section should not be assumed to have been recognized in any prior art.
Summary
According to one aspect of the present disclosure, an emoticon package generation method is provided, including: acquiring at least one piece of target feedback information of multiple related videos of a target video, the target video and the multiple related videos involving the same video host; matching the at least one piece of target feedback information with the target video; determining at least one target video segment from the target video based on the matching result; and generating an exclusive emoticon package based at least on the at least one target video segment.
According to another aspect of the present disclosure, an emoticon package generation device is further provided, including: an acquisition unit configured to acquire at least one piece of target feedback information of multiple related videos of a target video; a matching unit configured to match the at least one piece of target feedback information with the target video; a determining unit configured to determine at least one target video segment from the target video based on a matching result of the matching unit; and a generating unit configured to generate an exclusive emoticon package based at least on the at least one target video segment.
According to another aspect of the present disclosure, an electronic device is further provided, including: a processor; and a memory storing a program, the program including instructions that, when executed by the processor, cause the processor to execute the above emoticon package generation method.
According to another aspect of the present disclosure, a computer-readable storage medium storing a program is further provided, the program including instructions that, when executed by a processor of an electronic device, cause the electronic device to perform the above emoticon package generation method.
Brief description of the drawings
The accompanying drawings exemplarily show embodiments, constitute a part of the specification, and together with the text of the specification serve to explain exemplary implementations of the embodiments. The embodiments shown are for illustration only and do not limit the scope of the claims. Throughout the drawings, the same reference numerals refer to similar but not necessarily identical elements.
FIG. 1 is a flowchart showing an emoticon package generation method according to an exemplary embodiment;
FIG. 2 is a flowchart showing a method for acquiring at least one piece of target feedback information of multiple related videos according to an exemplary embodiment;
FIG. 3 is a flowchart showing an emoticon package generation method according to an exemplary embodiment;
FIG. 4 is a schematic block diagram showing the composition of an emoticon package generation device according to an exemplary embodiment;
FIG. 5 is a structural block diagram showing an exemplary computing device that can be applied to exemplary embodiments.
Detailed description
In the present disclosure, unless otherwise stated, the use of the terms "first", "second", etc. to describe various elements is not intended to limit the positional relationship, temporal relationship or importance relationship of these elements; such terms are only used to distinguish one element from another. In some examples, the first element and the second element may refer to the same instance of the element, while in some cases, based on the context of the description, they may refer to different instances.
The terms used in the description of the various examples in the present disclosure are for the purpose of describing particular examples only and are not intended to be limiting. Unless the context clearly indicates otherwise, if the number of an element is not specifically limited, there may be one or more of that element. Moreover, the term "and/or" used in the present disclosure covers any one of the listed items and all possible combinations thereof.
In the related art, multimedia platforms push a unified emoticon package to all users, and the emoticons in these packages mainly use celebrities, animations, films and so on as material. When commenting or making videos, users can only use the emoticons in this uniformly pushed package. Such undifferentiated, uniform emoticon packages have no association with the user who makes the videos (hereinafter referred to as the video host) and can only be used as generic emoticons.
Based on research into big data, the inventor realized that the video segments related to the most frequently occurring comments (which may refer to multiple comments with high semantic similarity) are essentially the video segments that users are interested in; these segments are the most attractive to users and often best reflect the personality of the video host. A video segment related to a comment may refer to a segment on which other users have made the comment, or to a segment that matches the comment. A video segment matching a comment may mean that the comment has a high semantic similarity with at least a part of the text corresponding to the subtitles or audio of the video segment.
Based on this, the present disclosure provides an emoticon package generation method: target feedback information of multiple related videos of a target video is acquired and matched with the target video, so as to determine at least one target video segment from the target video. The users' feedback information can thus be mapped into the target video, and target video segments related to the users' feedback can be obtained. The target video and the multiple related videos involve the same video host. Then, an exclusive emoticon package of the video host is generated based on the obtained target video segments; it can be pushed to the video host and to other users, so that other users can use the exclusive emoticons in it to comment on videos uploaded by the video host, and the video host can select suitable exclusive emoticons from it to add to produced videos. An exclusive emoticon package generated by the above technical solution can match the interests of other users and reflect the personality of the video host; using it can deepen other users' impression of the video host and enhance the video host's recognition and influence.
The feedback information of a video may include, for example, at least one of the following: bullet-screen comments, comments in the video comment area, likes, and shares.
The target video may be a live video or a recorded video. The target video and the multiple related videos may, for example, be produced by the same user (whether live-streamed or recorded).
In the present disclosure, a video produced by a user may mean that the produced video includes the audio and/or video of that user.
The emoticon package generation method of the present disclosure will be further described below with reference to the accompanying drawings.
FIG. 1 is a flowchart showing an emoticon package generation method according to an exemplary embodiment of the present disclosure. As shown in FIG. 1, the emoticon package generation method may include: step S101, acquiring at least one piece of target feedback information of multiple related videos of a target video, where the target video and the multiple related videos involve the same video host; step S102, matching the at least one piece of target feedback information with the target video; step S103, determining at least one target video segment from the target video based on the matching result; and step S104, generating an exclusive emoticon package based at least on the at least one target video segment.
The multiple related videos may be obtained from, but are not limited to, a video library; for example, they may also be obtained by crawling the web.
According to some embodiments, at least one piece of feedback information that occurs most frequently among the multiple pieces of feedback information of the multiple related videos may be determined as the target feedback information. That is, among the multiple pieces of feedback information of the multiple related videos, the at least one piece of target feedback information occurs more often than the remaining feedback information. The target video segments of most interest to users can therefore be determined based on the at least one piece of target feedback information.
A target video segment may include multiple consecutive video frames, or may be a single video frame, which is not limited herein.
In an exemplary embodiment, as shown in FIG. 2, step S101 may include: step S1011, acquiring at least one piece of feedback information of each of the multiple related videos; step S1012, performing semantic matching on the multiple pieces of feedback information of the multiple related videos; step S1013, dividing the multiple pieces of feedback information into multiple feedback information groups based on the semantic matching result; step S1014, determining, as target feedback information groups, at least one feedback information group whose number of included pieces of feedback information is greater than a threshold; and step S1015, determining, based on the multiple pieces of feedback information in each target feedback information group, the target feedback information corresponding to that group. In this way, the at least one piece of target feedback information is the feedback information with the most occurrences.
The keyword with the highest semantic similarity across the multiple pieces of feedback information in each target feedback information group may be, but is not limited to being, taken as the target feedback information. The keyword may be, for example, "666", "like", "rooting for you", and so on. For example, one piece of feedback information in each target feedback information group may also be determined as the target feedback information.
According to some embodiments, the emoticon package generation method may further include: before acquiring the at least one piece of target feedback information of the multiple related videos of the target video, determining whether the total number of pieces of feedback information of the multiple related videos is not less than a set value; and, in response to determining that the total number of pieces of feedback information of the multiple related videos is less than the set value, pushing guidance information to guide users to input feedback information for the target video. Acquiring the at least one piece of target feedback information of the multiple related videos may be performed in response to determining that the total number is not less than the set value. The required target feedback information can therefore be obtained from a sufficiently large amount of feedback, so that the obtained target feedback information better indicates the interests of most users.
Pushing the guidance information may be, for example, sending a guiding bullet-screen comment (for example, "a large wave of bullet comments is about to hit").
After the at least one piece of target feedback information of the multiple related videos of the target video is acquired, steps S102 and S103 may be performed to obtain at least one target video segment. According to some embodiments, the at least one target video segment may match at least one of the at least one piece of target feedback information. Matching a target video segment with target feedback information may mean that, in the target video, the semantic similarity between the text corresponding to the subtitles or audio of the target video segment and the corresponding target feedback information is the highest, and the semantic similarity score is greater than a set threshold. Accordingly, step S102 may include: semantically matching each piece of target feedback information in the at least one piece of target feedback information with the text corresponding to at least part of the subtitles or at least part of the audio of the target video. Determining at least one target video segment from the target video is performed based on the semantic matching result.
The exclusive emoticon package generated based on the target video segments in the present disclosure may include exclusive animated emoticons and/or exclusive stickers. The exclusive emoticon package can be pushed to other users and to the video host, so that other users can use the emoticons in the video host's exclusive emoticon package to comment when watching videos uploaded by the video host, and the video host can add emoticons from the exclusive emoticon package when making videos.
With the technical solution of the present disclosure, an exclusive emoticon package can be generated for each video host based on the videos uploaded by that host.
According to some embodiments, the target video segments may include a first target video segment for generating an exclusive animated emoticon. In this case, as shown in FIG. 3, the emoticon package generation method may further include: step S201, before the exclusive emoticon package is generated, performing target recognition (for example, face recognition) on each of the target video segments; step S202, determining, based on the recognition result, whether the target video segment includes the video host; and step S203, in response to determining that the target video segment includes the video host, determining the target video segment as the first target video segment. Accordingly, step S104 may include: step S1041, generating an exclusive animated emoticon based at least on the first target video segment. An exclusive animated emoticon of the video host can thus be generated based on video segments that include the video host, and using it can raise the video host's recognition.
For example, a certain food host often says "Today, let's eat something good" when cooking something particularly nice, with an exaggerated expression and a comic air, and users often tease this in the comments. With the technical solution of the present disclosure, a corresponding animated emoticon can be generated based on that video segment (which may be the segment whose subtitles or audio correspond to the line "Today, let's eat something good") and pushed to the video host and other users. Using this animated emoticon can greatly deepen viewers' impression of the host and quickly increase the host's popularity.
According to some embodiments, in step S104, generating an exclusive animated emoticon based at least on the first target video segment may include: determining first text information corresponding to the first target video segment; and generating the exclusive animated emoticon based on the first target video segment and the corresponding first text information. This can make the generated exclusive animated emoticon more vivid. The first text information may be determined based on the text corresponding to the subtitles or audio in the first target video segment. For example, the first text information may be the text corresponding to a line of subtitles or a sentence of audio when the first target video segment is played. Of course, the first text information may also be determined based on the target feedback information matching the first target video segment.
Generating an exclusive animated emoticon based at least on the first target video segment may be, but is not limited to being, executed in response to receiving a first trigger instruction, so that the animated emoticon can be generated selectively according to the video host's trigger instruction, providing better flexibility.
According to some embodiments, the target video segments may further include a second target video segment for generating an exclusive sticker. In this case, as shown in FIG. 3, the emoticon package generation method may further include: step S204, in response to determining that a target video segment does not include the video host (for example, does not include the video host's face), determining the target video segment as the second target video segment. Accordingly, in step S104, generating the exclusive emoticon package may further include: step S1042, determining second text information corresponding to the second target video segment; and step S1043, generating an exclusive sticker based at least on the second text information. This makes it possible to generate exclusive stickers based on video segments that do not include a face. The second text information may be determined based on the text corresponding to the subtitles or audio in the second target video segment. For example, the second text information may be the text corresponding to a line of subtitles or a sentence of audio when the second target video segment is played. Of course, the second text information may also be determined based on the target feedback information matching the second target video segment.
Generating the exclusive sticker based at least on the second text information may include: acquiring a face image related to the target video; and generating the exclusive sticker based on the second text information and the face image. This makes it possible to generate exclusive stickers that include a face image. The generated exclusive sticker may use, for example, an avatar specified by the video host, or the video host's avatar obtained from the target video. An exclusive sticker including the video host's avatar can thus be generated, and using it can raise the video host's recognition.
It can be understood that an exclusive sticker can also be generated based at least on the first target video segment, which includes a face. The specific generation method is the same as for the second target video segment and is not detailed here.
The above generates the exclusive emoticon package based on the target video segments that match the target feedback information.
According to other embodiments, the emoticon package generation method may further include: in response to receiving a second trigger instruction, generating exclusive emoticons based on a selected set video segment. The user can thus actively select a designated video segment from which to generate exclusive emoticons (which may include exclusive animated emoticons and/or exclusive stickers; the specific generation method is the same as generating exclusive emoticons based on target video segments described above), giving higher flexibility and further improving the user experience. For example, a one-key conversion icon can be displayed at a designated position of the video, and the video host can input the second trigger instruction by clicking the one-key conversion icon.
The video host may, but is not limited to, select the set video segment from an already recorded video for generating exclusive emoticons.
The above describes that, after the exclusive emoticon package is produced, it can be pushed to the video host and other users for selection and use.
According to other embodiments, an appropriate exclusive emoticon can also be selected automatically from the exclusive emoticon package and added to a preset video. In this case, the emoticon package generation method may further include: matching at least a part of the exclusive emoticons in the exclusive emoticon package with a preset video; determining, based on the matching result, at least one exclusive emoticon matching the preset video; and determining, based on the matching result, a matching video segment in the preset video for each of the at least one exclusive emoticon, so that, while the preset video is playing, the exclusive emoticon corresponding to a matching video segment can be pushed when that segment is played. By establishing an association between the preset video and the exclusive emoticon package, the corresponding exclusive emoticon can be pushed automatically when the matching video segment is played.
The preset video may be related to the target video. For example, the preset video and the target video may involve the same video host. An association can therefore be established between the video host's preset videos and the video host's exclusive emoticon package, and the corresponding exclusive emoticon can be pushed automatically when a preset video is played.
The preset video may be a historical video stored in the video library, or a newly acquired video currently uploaded by the video host. The multimedia platform can periodically obtain newly added videos in the video library and match at least a part of the exclusive emoticons in the corresponding video host's exclusive emoticon package with the newly added videos, thereby establishing an association between the videos uploaded by the video host and the host's exclusive emoticon package.
According to some embodiments, the emoticon package generation method may further include: establishing an association database for the exclusive emoticon package, the association database including the correspondence between the at least one exclusive emoticon and at least one matching video segment. This makes it convenient, when a matching video segment is subsequently played, to obtain and push the corresponding exclusive emoticon from the association database.
In an exemplary embodiment, the emoticon package generation method may further include: acquiring play time information of the at least one matching video segment. The association database may further include the correspondence between the at least one matching video segment and at least one piece of play time information (i.e., the play time information corresponding to each of the at least one matching video segment). During playback of the preset video, the corresponding exclusive emoticon can therefore be pushed according to the play time information, and the play time corresponding to the at least one exclusive emoticon can be matched quickly, improving pushing efficiency. The play time information may be the time period during which the corresponding matching video segment is played during playback of the preset video.
According to another aspect of the present disclosure, an emoticon package generation device is further provided. As shown in FIG. 4, the emoticon package generation device 100 may include: an acquisition unit 101 configured to acquire at least one piece of target feedback information of multiple related videos of a target video, where the target video and the multiple related videos involve the same video host; a matching unit 102 configured to match the at least one piece of target feedback information with the target video; a determining unit 103 configured to determine at least one target video segment from the target video based on the matching result of the matching unit; and a generating unit 104 configured to generate an exclusive emoticon package based at least on the at least one target video segment.
Here, the operations of the above units 101-104 of the emoticon package generation device 100 are respectively similar to the operations of steps S101-S104 described above and are not repeated here.
According to another aspect of the present disclosure, an electronic device is further provided, which may include: a processor; and a memory storing a program, the program including instructions that, when executed by the processor, cause the processor to execute the above emoticon package generation method.
According to another aspect of the present disclosure, a computer-readable storage medium storing a program is further provided, the program including instructions that, when executed by a processor of an electronic device, cause the electronic device to perform the above emoticon package generation method.
Referring to FIG. 5, a computing device 2000 will now be described, which is an example of a hardware device (electronic device) to which aspects of the present disclosure may be applied. Computing device 2000 may be any machine configured to perform processing and/or computation, which may be, but is not limited to, a workstation, server, desktop computer, laptop computer, tablet computer, personal digital assistant, robot, smartphone, vehicle-mounted computer, or any combination thereof. The above emoticon package generation method may be implemented in whole or at least in part by computing device 2000 or a similar device or system.
Computing device 2000 may include elements connected to or in communication with bus 2002 (possibly via one or more interfaces). For example, computing device 2000 may include a bus 2002, one or more processors 2004, one or more input devices 2006, and one or more output devices 2008. The one or more processors 2004 may be any type of processor, and may include, but are not limited to, one or more general-purpose processors and/or one or more special-purpose processors (e.g., dedicated processing chips). Input device 2006 may be any type of device capable of inputting information to computing device 2000, and may include, but is not limited to, a mouse, keyboard, touch screen, microphone and/or remote control. Output device 2008 may be any type of device capable of presenting information, and may include, but is not limited to, a display, speakers, video/audio output terminals, vibrators and/or printers. Computing device 2000 may also include, or be connected to, a non-transitory storage device 2010, which may be any storage device that is non-transitory and enables data storage, and may include, but is not limited to, disk drives, optical storage devices, solid-state memory, floppy disks, flexible disks, hard disks, magnetic tape or any other magnetic medium, optical discs or any other optical medium, ROM (read-only memory), RAM (random access memory), cache memory and/or any other memory chip or cartridge, and/or any other medium from which a computer can read data, instructions and/or code. The non-transitory storage device 2010 may be detachable from an interface. The non-transitory storage device 2010 may have data/programs (including instructions)/code for implementing the above methods and steps. Computing device 2000 may also include a communication device 2012. Communication device 2012 may be any type of device or system that enables communication with external devices and/or with a network, and may include, but is not limited to, modems, network cards, infrared communication devices, wireless communication devices and/or chipsets, such as Bluetooth™ devices, 802.11 devices, WiFi devices, WiMax devices, cellular communication devices and/or the like.
Computing device 2000 may also include working memory 2014, which may be any type of working memory that can store programs (including instructions) and/or data useful for the operation of processor 2004, and may include, but is not limited to, random access memory and/or read-only memory devices.
Software elements (programs) may be located in working memory 2014, including, but not limited to, an operating system 2016, one or more application programs 2018, drivers and/or other data and code. Instructions for performing the above methods and steps may be included in the one or more application programs 2018, and the above emoticon package generation method may be implemented by the processor 2004 reading and executing the instructions of the one or more application programs 2018. More specifically, steps S101 to S104 of the above emoticon package generation method may be implemented, for example, by the processor 2004 executing an application program 2018 having the instructions of steps S101 to S104. Other steps of the above emoticon package generation method may likewise be implemented, for example, by the processor 2004 executing an application program 2018 having instructions for executing the corresponding steps. The executable code or source code of the instructions of the software elements (programs) may be stored in a non-transitory computer-readable storage medium (such as the storage device 2010 described above) and, when executed, may be loaded into the working memory 2014 (and possibly compiled and/or installed). The executable code or source code of the instructions of the software elements (programs) may also be downloaded from a remote location.
It should also be understood that various modifications may be made according to specific requirements. For example, custom hardware may also be used, and/or particular elements may be implemented in hardware, software, firmware, middleware, microcode, hardware description languages, or any combination thereof. For example, some or all of the disclosed methods and devices may be implemented by programming hardware (e.g., programmable logic circuits including field programmable gate arrays (FPGA) and/or programmable logic arrays (PLA)) in an assembly language or a hardware programming language (such as VERILOG, VHDL, C++) using logic and algorithms according to the present disclosure.
It should also be understood that the foregoing method may be implemented in a server-client mode. For example, a client may receive data input by a user and send the data to a server. The client may also receive data input by the user, perform part of the processing in the foregoing method, and send the resulting data to the server. The server may receive the data from the client, execute the foregoing method or another part of the foregoing method, and return the execution result to the client. The client may receive the execution result of the method from the server and may, for example, present it to the user through an output device.
It should also be understood that the components of computing device 2000 may be distributed over a network. For example, some processing may be performed using one processor while other processing may be performed by another processor remote from that processor. Other components of computing system 2000 may be similarly distributed. As such, computing device 2000 may be interpreted as a distributed computing system that performs processing in multiple locations.
Although embodiments or examples of the present disclosure have been described with reference to the accompanying drawings, it should be understood that the above methods, systems, and devices are merely exemplary embodiments or examples, and the scope of the present invention is not limited by these embodiments or examples but only by the allowed claims and their equivalents. Various elements in the embodiments or examples may be omitted or replaced by equivalent elements. Furthermore, the steps may be performed in an order different from that described in the present disclosure. Further, the various elements in the embodiments or examples may be combined in various ways. Importantly, as technology evolves, many of the elements described herein may be replaced by equivalent elements that appear after the present disclosure.

Claims (19)

  1. An emoticon package generation method, comprising:
    acquiring at least one piece of target feedback information of multiple related videos of a target video, the target video and the multiple related videos involving the same video host;
    matching the at least one piece of target feedback information with the target video;
    determining at least one target video segment from the target video based on a matching result; and
    generating an exclusive emoticon package based at least on the at least one target video segment.
  2. The emoticon package generation method of claim 1, wherein the exclusive emoticon package comprises an exclusive animated emoticon, and the target video segments comprise a first target video segment;
    the emoticon package generation method further comprising:
    before the generating of the exclusive emoticon package, performing target recognition on each of the target video segments;
    determining, based on a recognition result, whether the target video segment includes the video host; and
    in response to determining that the target video segment includes the video host, determining the target video segment as the first target video segment,
    wherein the generating of the exclusive emoticon package comprises: generating an exclusive animated emoticon based at least on the first target video segment.
  3. The emoticon package generation method of claim 2, wherein generating an exclusive animated emoticon based at least on the first target video segment comprises:
    determining first text information corresponding to the first target video segment;
    generating the exclusive animated emoticon based on the first target video segment and the corresponding first text information.
  4. The emoticon package generation method of claim 3, wherein the first text information is determined based on text corresponding to subtitles or audio in the first target video segment.
  5. The emoticon package generation method of claim 2, wherein generating an exclusive animated emoticon based at least on the first target video segment is executed in response to receiving a first trigger instruction.
  6. The emoticon package generation method of claim 2, wherein the exclusive emoticons further comprise an exclusive sticker, and the target video segments comprise a second target video segment;
    the emoticon package generation method further comprising:
    in response to determining that a target video segment does not include the video host, determining the target video segment as the second target video segment,
    wherein the generating of the exclusive emoticon package further comprises:
    determining second text information corresponding to the second target video segment; and
    generating an exclusive sticker based at least on the second text information.
  7. The emoticon package generation method of claim 6, wherein generating an exclusive sticker based at least on the second text information comprises:
    acquiring a face image related to the target video;
    generating the exclusive sticker based on the second text information and the face image.
  8. The emoticon package generation method of claim 6, wherein the second text information is determined based on text corresponding to subtitles or audio in the second target video segment.
  9. The emoticon package generation method of any one of claims 1-8, further comprising:
    in response to receiving a second trigger instruction, generating exclusive emoticons based on a selected set video segment.
  10. The emoticon package generation method of any one of claims 1-8, wherein acquiring at least one piece of target feedback information of multiple related videos of a target video comprises:
    acquiring at least one piece of feedback information of each of the multiple related videos;
    performing semantic matching on the multiple pieces of feedback information of the multiple related videos;
    dividing the multiple pieces of feedback information into multiple feedback information groups based on a semantic matching result;
    determining, as target feedback information groups, at least one feedback information group whose number of included pieces of feedback information is greater than a threshold; and
    determining, based on the multiple pieces of feedback information in each target feedback information group, the target feedback information corresponding to that group.
  11. The emoticon package generation method of any one of claims 1-8, further comprising:
    before the acquiring of the at least one piece of target feedback information of the multiple related videos of the target video, determining whether the total number of pieces of feedback information of the multiple related videos is not less than a set value;
    in response to determining that the total number of pieces of feedback information of the multiple related videos is less than the set value, pushing guidance information to guide users to input feedback information for the target video.
  12. The emoticon package generation method of any one of claims 1-8, wherein matching the at least one piece of target feedback information with the target video comprises:
    semantically matching each piece of target feedback information in the at least one piece of target feedback information with text corresponding to at least part of the subtitles or at least part of the audio of the target video,
    wherein determining at least one target video segment from the target video is performed based on a semantic matching result.
  13. The emoticon package generation method of any one of claims 1-8, further comprising:
    matching at least a part of the exclusive emoticons in the exclusive emoticon package with a preset video;
    determining, based on a matching result, at least one exclusive emoticon matching the preset video; and
    determining, based on the matching result, a matching video segment in the preset video that matches each of the at least one exclusive emoticon, so that, while the preset video is playing, the exclusive emoticon corresponding to a matching video segment can be pushed when that segment is played.
  14. The emoticon package generation method of claim 13, further comprising:
    establishing an association database for the exclusive emoticon package, the association database comprising the correspondence between the at least one exclusive emoticon and at least one matching video segment.
  15. The emoticon package generation method of claim 14, further comprising:
    acquiring play time information of the at least one matching video segment,
    wherein the association database further comprises the correspondence between the at least one matching video segment and at least one piece of play time information.
  16. The emoticon package generation method of claim 13, wherein the preset video and the target video involve the same video host.
  17. An emoticon package generation device, comprising:
    an acquisition unit configured to acquire at least one piece of target feedback information of multiple related videos of a target video, the target video and the multiple related videos involving the same video host;
    a matching unit configured to match the at least one piece of target feedback information with the target video;
    a determining unit configured to determine at least one target video segment from the target video based on a matching result of the matching unit; and
    a generating unit configured to generate an exclusive emoticon package based at least on the at least one target video segment.
  18. An electronic device, comprising:
    a processor; and
    a memory storing a program, the program comprising instructions that, when executed by the processor, cause the processor to execute the emoticon package generation method of any one of claims 1-16.
  19. A computer-readable storage medium storing a program, the program comprising instructions that, when executed by a processor of an electronic device, cause the electronic device to perform the emoticon package generation method of any one of claims 1-16.
PCT/CN2020/133649 2020-06-28 2020-12-03 Emoticon package generation method and device, electronic device and medium WO2022000991A1 (zh)

Priority Applications (4)

Application Number Priority Date Filing Date Title
EP20922491.4A EP3955131A4 (en) 2020-06-28 2020-12-03 METHOD AND DEVICE FOR CREATING MEME PACKAGE, ELECTRONIC DEVICE AND MEDIUM
KR1020217029278A KR20210118203A (ko) 2020-06-28 2020-12-03 Emoticon package generation method and device, electronic device and medium
JP2021552794A JP7297084B2 (ja) 2020-06-28 2020-12-03 Internet meme generation method and device, electronic device and medium
US17/471,086 US20210407166A1 (en) 2020-06-28 2021-09-09 Meme package generation method, electronic device, and medium

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010601966.6 2020-06-28
CN202010601966.6A CN111753131A (zh) 2020-06-28 2020-06-28 Emoticon package generation method and device, electronic device and medium

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US17/471,086 Continuation US20210407166A1 (en) 2020-06-28 2021-09-09 Meme package generation method, electronic device, and medium

Publications (1)

Publication Number Publication Date
WO2022000991A1 true WO2022000991A1 (zh) 2022-01-06

Family

ID=72677757

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/133649 WO2022000991A1 (zh) 2020-06-28 2020-12-03 表情包生成方法及设备、电子设备和介质

Country Status (2)

Country Link
CN (1) CN111753131A (zh)
WO (1) WO2022000991A1 (zh)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114445896A (zh) * 2022-01-28 2022-05-06 北京百度网讯科技有限公司 视频中人物陈述内容可置信度的评估方法及装置
CN114827648A (zh) * 2022-04-19 2022-07-29 咪咕文化科技有限公司 动态表情包的生成方法、装置、设备和介质

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111753131A (zh) * 2020-06-28 2020-10-09 北京百度网讯科技有限公司 Emoticon package generation method and device, electronic device and medium
CN113542801B (zh) * 2021-06-29 2023-06-06 北京百度网讯科技有限公司 Method, apparatus, device, storage medium and program product for generating host identifiers

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160328877A1 (en) * 2014-03-13 2016-11-10 Tencent Technology (Shenzhen) Company Limited Method and apparatus for making personalized dynamic emoticon
CN106358087A (zh) * 2016-10-31 2017-01-25 北京小米移动软件有限公司 表情包生成方法及装置
CN108200463A (zh) * 2018-01-19 2018-06-22 上海哔哩哔哩科技有限公司 弹幕表情包的生成方法、服务器及弹幕表情包的生成系统
CN110049377A (zh) * 2019-03-12 2019-07-23 北京奇艺世纪科技有限公司 表情包生成方法、装置、电子设备及计算机可读存储介质
CN110719525A (zh) * 2019-08-28 2020-01-21 咪咕文化科技有限公司 弹幕表情包的生成方法、电子设备和可读存储介质
CN111753131A (zh) * 2020-06-28 2020-10-09 北京百度网讯科技有限公司 表情包生成方法及设备、电子设备和介质

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10331661B2 (en) * 2013-10-23 2019-06-25 At&T Intellectual Property I, L.P. Video content search using captioning data
CN106951856A (zh) * 2017-03-16 2017-07-14 腾讯科技(深圳)有限公司 表情包提取方法及装置
CN110162670B (zh) * 2019-05-27 2020-05-08 北京字节跳动网络技术有限公司 用于生成表情包的方法和装置
CN110889379B (zh) * 2019-11-29 2024-02-20 深圳先进技术研究院 表情包生成方法、装置及终端设备

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160328877A1 (en) * 2014-03-13 2016-11-10 Tencent Technology (Shenzhen) Company Limited Method and apparatus for making personalized dynamic emoticon
CN106358087A (zh) * 2016-10-31 2017-01-25 北京小米移动软件有限公司 表情包生成方法及装置
CN108200463A (zh) * 2018-01-19 2018-06-22 上海哔哩哔哩科技有限公司 弹幕表情包的生成方法、服务器及弹幕表情包的生成系统
CN110049377A (zh) * 2019-03-12 2019-07-23 北京奇艺世纪科技有限公司 表情包生成方法、装置、电子设备及计算机可读存储介质
CN110719525A (zh) * 2019-08-28 2020-01-21 咪咕文化科技有限公司 弹幕表情包的生成方法、电子设备和可读存储介质
CN111753131A (zh) * 2020-06-28 2020-10-09 北京百度网讯科技有限公司 表情包生成方法及设备、电子设备和介质

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114445896A (zh) * 2022-01-28 2022-05-06 北京百度网讯科技有限公司 视频中人物陈述内容可置信度的评估方法及装置
CN114445896B (zh) * 2022-01-28 2024-04-05 北京百度网讯科技有限公司 视频中人物陈述内容可置信度的评估方法及装置
CN114827648A (zh) * 2022-04-19 2022-07-29 咪咕文化科技有限公司 动态表情包的生成方法、装置、设备和介质
CN114827648B (zh) * 2022-04-19 2024-03-22 咪咕文化科技有限公司 动态表情包的生成方法、装置、设备和介质

Also Published As

Publication number Publication date
CN111753131A (zh) 2020-10-09

Similar Documents

Publication Publication Date Title
WO2022000991A1 (zh) Emoticon package generation method and device, electronic device and medium
US10324940B2 (en) Approximate template matching for natural language queries
US10528623B2 (en) Systems and methods for content curation in video based communications
US11709887B2 (en) Systems and methods for digitally fetching music content
WO2022121601A1 (zh) Live-streaming interaction method, apparatus, device and medium
US9854305B2 (en) Method, system, apparatus, and non-transitory computer readable recording medium for extracting and providing highlight image of video content
US9715901B1 (en) Video preview generation
WO2017161776A1 (zh) Bullet-screen comment pushing method and apparatus
WO2017124116A1 (en) Searching, supplementing and navigating media
US20140255003A1 (en) Surfacing information about items mentioned or presented in a film in association with viewing the film
US20130007787A1 (en) System and method for processing media highlights
US9754626B1 (en) Mobile device video personalization
US11677711B2 (en) Metrics-based timeline of previews
US20230280966A1 (en) Audio segment recommendation
CN108491178B (zh) 信息浏览方法、浏览器和服务器
US20140163956A1 (en) Message composition of media portions in association with correlated text
JP7297084B2 (ja) Internet meme generation method and device, electronic device and medium
US9066135B2 (en) System and method for generating a second screen experience using video subtitle data
US9084011B2 (en) Method for advertising based on audio/video content and method for creating an audio/video playback application
US20240086141A1 (en) Systems and methods for leveraging soundmojis to convey emotion during multimedia sessions
US20230100140A1 (en) Method and system for searching for media message using keyword extracted from media file
WO2024036979A9 (zh) Multimedia resource playback method and related apparatus
US20230412885A1 (en) Automatic identification of video series
CN115981769A (zh) Page display method, apparatus, device, computer-readable storage medium and product

Legal Events

Date Code Title Description
ENP Entry into the national phase

Ref document number: 2021552794

Country of ref document: JP

Kind code of ref document: A

ENP Entry into the national phase

Ref document number: 20217029278

Country of ref document: KR

Kind code of ref document: A

ENP Entry into the national phase

Ref document number: 2020922491

Country of ref document: EP

Effective date: 20210909

NENP Non-entry into the national phase

Ref country code: DE