WO2021104097A1 - Meme generation method and apparatus, and terminal device - Google Patents
- Publication number
- WO2021104097A1 (PCT/CN2020/129209; priority application CN2020129209W)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- image
- emoticon
- package
- emoticon package
- face
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/174—Facial expression recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/50—Information retrieval; Database structures therefor; File system structures therefor of still image data
- G06F16/51—Indexing; Data structures therefor; Storage structures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/22—Matching criteria, e.g. proximity measures
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Definitions
- This application belongs to the technical field of video applications, and in particular relates to a method, device, terminal device, and computer-readable storage medium for generating emoticons.
- Emoticons are a way of using pictures to express feelings.
- Early emoticons were mostly designed by professionals, such as emoji and QQ emoticons.
- Later, emoticons combining pictures and text became popular. However, because producing an emoticon package requires manually extracting expression images or adding text information, the process is time-consuming and labor-intensive, which leads to low efficiency when generating emoticon packages from character videos.
- the embodiments of the present application provide a method, a device and a terminal device for generating an emoticon package to solve the problem of low efficiency of extracting a face image from a character video and generating a corresponding emoticon package in the prior art.
- an embodiment of the present application provides a method for generating an emoticon package, including:
- the determining the target emoticon package image and emoticon package material based on the expression similarity includes:
- when the expression similarity between a first face image and a first emoticon package image is greater than or equal to a first preset similarity threshold, the first emoticon package image is used as the target emoticon package image, and the first face image is used as a static emoticon package material, where the first face image is any face image in the portrait image and the first emoticon package image is any emoticon package image in the emoticon package image library.
- the determining the target emoticon package image and emoticon package material based on the expression similarity includes:
- the first emoticon package image is used as the target emoticon package image.
- the face image in the first image is used as a static emoticon package material, where the first image is any one of the multiple frames of images.
- the method further includes:
- the first emoticon package image is used as the target emoticon package image, and the face image in the first image and the face image in the second image are used as dynamic emoticon package materials, where the second image is at least one frame of image adjacent to the first image among the multiple frames of images.
- the method further includes:
- the second emoticon package image is used as the target emoticon package image, and the face image in the third image is used as a static emoticon package material, where the third image is at least one frame of image adjacent to the first image among the multiple frames of images.
- the method further includes:
- the second emoticon package image is used as the target emoticon package image, and the face images in the third image and the fourth image are used as dynamic emoticon package materials, where the fourth image is at least two consecutive frames of images in the multiple frames of images, and at least one of the at least two frames is adjacent to the first image.
- the calculating the expression similarity between the face image and the emoticon package image in the emoticon package image library includes:
- extracting facial expression features of the face image through a preset deep neural network, and comparing the facial expression features with the expression features of the emoticon package image to obtain the expression similarity.
- an emoticon package generation device including:
- an obtaining module, configured to obtain at least one portrait image from a character video to be processed, the portrait image including a face image;
- a calculation module, configured to calculate the expression similarity between the face image in the portrait image and the emoticon package image in a preset emoticon package image library;
- a determining module, configured to determine a target emoticon package image and emoticon package material based on the expression similarity, where the target emoticon package image belongs to the emoticon package image library and the emoticon package material belongs to the portrait image;
- a generating module, configured to extract the text information of the target emoticon package image and integrate the text information with the emoticon package material to generate the target emoticon package.
- an embodiment of the present application also provides a terminal device, including a memory, a processor, and a computer program stored in the memory and runnable on the processor; when the processor executes the computer program, the steps of the method in the first aspect are realized.
- an embodiment of the present application also provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of the method in the first aspect.
- the emoticon package generation method, device, and terminal equipment provided by the embodiments of the present application have the following beneficial effects:
- at least one portrait image containing a face image is obtained from the character video to be processed;
- the target emoticon package image and the emoticon package material are determined based on the expression similarity, where the target emoticon package image belongs to the emoticon package image library and the emoticon package material belongs to the portrait image; the text information of the target emoticon package image is extracted and integrated with the emoticon package material to generate the target emoticon package.
- the present invention can determine the target emoticon package image and emoticon package material according to the expression similarity and automatically integrate the text information of the emoticon package image with the emoticon package material. The user does not need to manually select an emoticon package image, which reduces the user's operational burden and allows users to quickly and easily make their own emoticons from a video.
- based on the expression similarity, a face image or video fragment is extracted from the character video as the emoticon package material, which realizes emoticon package generation based on expression similarity calculation and improves the efficiency of emoticon generation.
- FIG. 1 is a schematic diagram of an implementation process of a method for generating an emoticon package provided by an embodiment of the present application
- FIG. 2 is a schematic diagram of the implementation process of the emoticon package generation method provided by another embodiment of the present application.
- FIG. 3 is a schematic structural diagram of an emoticon package generating apparatus provided by an embodiment of the present application.
- Fig. 4 is a schematic structural diagram of a terminal device provided by an embodiment of the present application.
- 300 - emoticon package generation device; 310 - acquisition module; 320 - calculation module; 330 - determination module; 340 - generation module; 400 - terminal device; 410 - memory; 420 - processor; 430 - computer program.
- depending on the context, the term "if" can be construed as "when", "once", "in response to determining", or "in response to detecting".
- similarly, the phrase "if determined" or "if [described condition or event] is detected" can be construed, depending on the context, as "once determined", "in response to determining", "once [described condition or event] is detected", or "in response to detecting [described condition or event]".
- FIG. 1 shows a method for generating an emoticon package provided by an embodiment of the present application, and the method for generating an emoticon package may include the following S101 to S104.
- S101 Obtain at least one portrait image from a character video to be processed, where the portrait image includes a face image;
- the execution subject of the emoticon package generation method may take each frame of the character video as a portrait image, or may extract frames from the character video at a preset step size (for example, 1 or 2) and use the extracted frames as portrait images.
- the images in the portrait image set are arranged in their playback order in the character video.
- the preset step size can be set according to actual needs and is not specifically limited here.
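The per-step frame extraction described above is straightforward to sketch in Python; `sample_frames` is an illustrative name of ours, not from the patent:

```python
def sample_frames(num_frames, step=2):
    """Indices of the frames extracted from a video at a preset step size.

    num_frames: total frame count of the character video.
    step: the preset step size (1 keeps every frame).
    """
    return list(range(0, num_frames, step))
```

For example, a 10-frame video sampled with step 2 yields frames 0, 2, 4, 6, 8, preserving playback order as the text requires.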
- the execution body may receive, in real time, an emoticon package generation request sent by the user through an electronic device, and the character video may be the video included in the received request.
- the character video can be pre-stored in the electronic device, or it can be a video saved on the website for playback, or it can be recorded in real time through the electronic device.
- face recognition technology is used to process the frames of the person video and extract at least one portrait image containing a face image.
- the portrait image includes background, text, etc.
- the face image can be one person's face or multiple people's faces.
- when the electronic device is a terminal device, it can meet the user's need to extract a specific face image from a video; when the electronic device is a server, running the face-image extraction apparatus on it can meet the extraction requirements of platforms such as video websites.
- a face detection algorithm is used to detect faces in the multiple portrait images, and the detected face images can be marked, which facilitates extracting facial expression features from them.
- the face detection algorithm is used to detect at least one portrait image in the person video to obtain the corresponding face detection result.
- the face detection result is used to indicate whether a human face is displayed in the portrait image.
- the face detection algorithm may be a multi-task convolutional neural network (MTCNN).
- MTCNN is a multi-task neural network model for face detection tasks.
- the model mainly uses three cascaded networks and the idea of candidate boxes plus classifiers to perform fast and efficient face detection.
- the cascaded networks are P-Net, which quickly generates candidate windows; R-Net, which performs high-precision candidate window filtering and selection; and O-Net, which generates the final bounding boxes and face key points.
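The candidate-window filtering in such cascades is typically done with non-maximum suppression (NMS), which keeps the highest-scoring box among heavily overlapping candidates. A minimal pure-Python sketch (not the patent's own code; the `(x1, y1, x2, y2)` box format is an assumption):

```python
def iou(a, b):
    """Intersection-over-union of two boxes in (x1, y1, x2, y2) form."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def nms(boxes, scores, iou_threshold=0.5):
    """Indices of boxes kept after suppressing lower-scoring overlaps."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) < iou_threshold for j in keep):
            keep.append(i)
    return keep
```

Here two near-identical candidate windows collapse to the higher-scoring one, while a distant face's window survives untouched.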
- the human face in the video of the person is detected by a multi-task convolutional neural network to obtain a corresponding face image.
- S102 Calculate the expression similarity between the face image in the portrait image and the emoticon package image in the preset emoticon package image library;
- a deep neural network is used to extract facial expression features in the facial image.
- the deep neural network may be pre-trained on databases such as ImageNet, face recognition data, or facial expression data, and the extracted facial expression features are compared with those of the pre-processed emoticon packages in the emoticon package image library.
- the features used for comparison may be convolutional-layer features or fully-connected-layer features of the deep neural network; the classification probability of the network may also be used as the feature.
- feature comparison can be based on the cosine similarity of the features, or the Euclidean distance can be calculated after feature normalization; the calculated result is used as the expression similarity.
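The two measures named above can be sketched as follows (plain-Python illustration; the function names are ours, not the patent's). After L2 normalization, the squared Euclidean distance and the cosine similarity are related by d² = 2 − 2·cos, so either can serve as the expression similarity:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def l2_normalize(v):
    """Scale a vector to unit length (the 'feature normalization' step)."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def normalized_euclidean(a, b):
    """Euclidean distance after L2 normalization of both vectors."""
    a, b = l2_normalize(a), l2_normalize(b)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
```

In practice the vectors would be the convolutional- or fully-connected-layer features extracted by the pre-trained network.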
- the emoticon package image library can be downloaded or collected from the Internet, or consist of emoticon package images manually added or created by the user, classified according to the text information, format, etc. of the emoticon package images. A text extraction method, such as OCR text recognition software, is used to obtain the text in each emoticon package. If an emoticon package contains no text, its name and format can be obtained, or the text surrounding it can be extracted, to form the emoticon package's text information. The emoticon package image library is then established from the emoticon packages and their corresponding text information.
- S103 Determine a target emoticon pack image and emoticon pack material based on the expression similarity, where the target emoticon pack image belongs to the emoticon pack image library, and the emoticon pack material belongs to the portrait image;
- the calculated expression similarities can be marked or sorted, or compared one by one, to obtain the one or more with the largest similarity values.
- the determined target emoticon pack image, emoticon pack material, and corresponding emoticon similarity can also be associated, so as to quickly find the target emoticon pack image and emoticon pack material.
- S104 Extract text information of the target emoticon package image, and integrate the text information with the emoticon package material to generate a target emoticon package.
- the emoticon package image is used as the target emoticon package image, its text information is extracted, the face image in the portrait image is extracted as the emoticon package material, and the text information is added by subtitling or naming and integrated with the emoticon package material to generate the target emoticon package.
- the expression similarity between the face images in the multiple portrait images and the emoticon packages in the emoticon package image library is calculated separately, and the emoticon package images and face images whose similarity exceeds a preset threshold are selected. This achieves accurate matching between the target emoticon package image and the emoticon package material, and also improves the efficiency of emoticon package generation.
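The thresholded matching just described might look like the following hypothetical helper, which keeps, for each face image, the best-scoring emoticon package above the threshold (names and data layout are our assumptions):

```python
def select_pairs(similarity, threshold=0.5):
    """similarity maps (face_id, pack_id) -> expression-similarity score.

    Returns {face_id: (pack_id, score)} keeping only pairs at or above
    the preset threshold, and for each face only the highest score.
    """
    best = {}
    for (face, pack), score in similarity.items():
        if score >= threshold and score > best.get(face, (None, -1.0))[1]:
            best[face] = (pack, score)
    return best
```

A face whose best match falls below the threshold is simply absent from the result, i.e. no emoticon package is generated for it.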
- the determining the target emoticon package image and the emoticon package material based on the expression similarity includes:
- when the expression similarity between the first face image and the first emoticon package image is greater than or equal to the first preset similarity threshold, the first emoticon package image is used as the target emoticon package image, and the first face image is used as a static emoticon package material, where the first face image is any face image in the portrait image and the first emoticon package image is any emoticon package image in the emoticon package image library.
- the emoticon package image is used as the target emoticon package image.
- the target emoticon package image can be understood as an emoticon package template, and the template is used to extract text information.
- emoticon package images and corresponding face images that meet the requirements are precisely screened out so as to quickly generate new emoticon packages, and the generated emoticon packages are saved in the emoticon package image library, where they can be downloaded or forwarded by users, thereby improving user activity.
- FIG. 2 shows a schematic diagram of an implementation process of a method for generating an emoticon package according to another embodiment of the present application, wherein the above S103 includes the following S201 to S204.
- the expression similarity can be set to a value in [0,100]
- the first preset similarity threshold can be set to 50
- the expression similarity between the first face image and the first emoticon package image is 55.
- the attribute information of the first face image in the character video is obtained.
- the attribute information includes the position and time of the first face image in the entire video, so as to facilitate subsequently intercepting the related video clips from the character video based on the attribute information.
- this effectively prevents background information from interfering with the normal extraction of face images; a portrait image includes the face image, background information, subtitles, etc., so this improves the accuracy of extracting emoticon package materials.
- S202 Acquire multiple frames of images corresponding to the first face image from the person video according to the attribute information
- the first face image may appear one or more times in the person video.
- the corresponding multiple frames of images, that is, a video clip, are intercepted from the person video.
- the face region can be detected in the obtained video frames; after detection, a face frame set is obtained.
- based on the face frame set, face images, that is, the image areas within the face frames, are extracted from the obtained video clips, and the facial expression in each frame is compared with the first emoticon package image to obtain the corresponding expression similarity, so as to select emoticon package material that meets the requirements.
- the above execution subject may select material based on the expression similarities corresponding to the multiple frames of images.
- the expression similarity may be represented by a numerical value within [0, 100]; in practice, the smaller the value, the lower the expression similarity, and the larger the value, the higher the similarity.
- the first preset similarity threshold may be set to 50
- the attribute information of the first face image in the character video is obtained; it includes the time and position of the first face image. The multiple frames of images corresponding to the first face image can be located according to the attribute information, and a video clip covering a certain period (such as 3 seconds) before and after that time in the character video is obtained.
- the images around the face position in the character video are cropped to form a condensed video clip, which removes irrelevant background information from the clip.
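Locating the clip around the face's timestamp reduces to a frame-index computation. A sketch under the assumption of a constant frame rate (`fps`); the function name and signature are ours, not the patent's:

```python
def clip_frame_range(timestamp_s, fps, window_s=3.0, total_frames=None):
    """Frame-index range covering window_s seconds before and after
    the timestamp (in seconds) at which the face appears."""
    start = max(0, int((timestamp_s - window_s) * fps))
    end = int((timestamp_s + window_s) * fps)
    if total_frames is not None:
        end = min(end, total_frames - 1)
    return start, end
```

For a face seen at t = 10 s in a 30 fps video, a 3-second window on each side spans frames 210 through 390; clamping handles faces near the start or end of the video.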
- a frame of image whose expression similarity is greater than or equal to the second preset similarity threshold is selected, and the face image in that frame is used as a static emoticon package material, that is, an emoticon package picture.
- the size of the image around the face position can be set to four times the size of the original face image, which makes it easy to distinguish the face image from the background information.
- the first preset similarity threshold and the second similarity threshold may be the same or different, and are set according to actual conditions, which are not specifically limited here.
- one frame of image may contain one or more face regions, or no face region; when there are one or more face regions, a corresponding number of face frames, and hence face images, can be obtained.
- the above face frame set is a set of at least one face frame; a face frame is a rectangular frame surrounding a face region. Since a face frame can be characterized by coordinate information, the face frame set may specifically include at least one piece of coordinate information, each of which determines one face frame.
- in order to extract the complete expression for the emoticon package material, the face frame can be expanded outward to obtain a new rectangle, and the image block enclosed by the new rectangle is taken out to obtain the face image.
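Expanding the face frame outward can be sketched as scaling the box about its center and clamping to the image bounds. If "four times the size" mentioned earlier means four times the area, that corresponds to doubling each side (`scale=2.0`); all names here are illustrative assumptions:

```python
def expand_box(box, scale=2.0, img_w=10**9, img_h=10**9):
    """Expand a face frame (x1, y1, x2, y2) about its center by `scale`
    per side, clamped so the new rectangle stays inside the image."""
    x1, y1, x2, y2 = box
    cx, cy = (x1 + x2) / 2, (y1 + y2) / 2
    hw = (x2 - x1) * scale / 2
    hh = (y2 - y1) * scale / 2
    return (max(0, cx - hw), max(0, cy - hh),
            min(img_w, cx + hw), min(img_h, cy + hh))
```

The image block enclosed by the returned rectangle is then cropped out as the emoticon package material, capturing the full expression rather than a tightly cropped face.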
- a pre-trained face detection model can be used to detect the face area from the acquired multi-frame images, so as to quickly and accurately extract the emoticon package material.
- the first emoticon package image is used as the target emoticon package image, and the face image in the first image and the face image in the second image are used as dynamic emoticon package materials, where the second image is at least one frame of image adjacent to the first image among the multiple frames of images.
- the expression similarity between the first image and the second image and the first emoticon package image respectively meet the condition, that is, the expression similarity is greater than or equal to the second preset similarity threshold.
- the face image in the first image and the face image in the second image can be used as dynamic emoticon package materials.
- the face images in three or more frames can also be used as dynamic emoticon package materials, that is, a dynamic emoticon package can be generated, which expands the types of emoticon packages that can be generated and adds to the fun of using them.
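Selecting consecutive qualifying frames for a dynamic (animated) emoticon can be sketched as grouping the frame indices whose similarity meets the threshold into contiguous runs; the helper and its minimum-run-length parameter are our assumptions, not the patent's:

```python
def dynamic_material_runs(similarities, threshold=0.5, min_len=2):
    """Group consecutive frame indices whose expression similarity meets
    the threshold into runs usable as dynamic emoticon material.

    similarities: per-frame similarity scores, in playback order.
    """
    runs, current = [], []
    for i, score in enumerate(similarities):
        if score >= threshold:
            current.append(i)
        else:
            if len(current) >= min_len:
                runs.append(current)
            current = []
    if len(current) >= min_len:
        runs.append(current)
    return runs
```

Each returned run is a set of adjacent frames, so the cropped face images from one run play back as a short animation.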
- the second emoticon package image is used as the target emoticon package image, and the face image in the third image is used as a static emoticon package material, where the third image is at least one frame of image adjacent to the first image among the multiple frames of images.
- when the expression similarity between the face image in the third image and the second emoticon package image is the greatest, the second emoticon package image is taken as the target emoticon package image, and the face image in the third image is used as the static emoticon package material, that is, the emoticon package picture.
- the second emoticon package image is used as the target emoticon package image, and the face images in the third image and the fourth image are used as dynamic emoticon package materials, where the fourth image is at least two consecutive frames of images in the multiple frames of images, and at least one of the at least two frames is adjacent to the first image.
- the expression similarity between the face image in the portrait image and the first emoticon package image is recorded as the first expression similarity, and the expression similarity between the face image in the portrait image and the second emoticon package image is recorded as the second expression similarity.
- when the first expression similarity is less than the second expression similarity, the second emoticon package image corresponding to the second expression similarity is selected as the target emoticon package image.
- the continuous face images in the preceding and following frames can be used as the dynamic emoticon package material to improve the display effect of the emoticon package.
- in order to improve the display effect of the target emoticon package, the user can edit it according to his own wishes or add prop effects, such as hats, mushroom heads, and other effect props, and/or add artistic words, watermarks, etc.
- the target emoticon package is made into a preset format and stored in the emoticon package image library for use in interfaces, chat tools, and the like, so that users can operate on it.
- the preset format is set as needed; for example, it can be the GIF format.
- the GIF format can store multiple images; the multiple images saved in one file can be read out and displayed on the screen in turn to form a simple animation, improving user operability and experience.
- at least one portrait image containing a face image is obtained from the character video to be processed;
- the target emoticon package image and the emoticon package material are determined based on the expression similarity, where the target emoticon package image belongs to the emoticon package image library and the emoticon package material belongs to the portrait image; the text information of the target emoticon package image is extracted and integrated with the emoticon package material to generate the target emoticon package.
- the present invention can determine the target emoticon package image and emoticon package material according to the expression similarity and automatically integrate the text information of the emoticon package image with the emoticon package material. The user does not need to manually select an emoticon package image, which reduces the user's operational burden and allows users to quickly and easily make their own emoticons from a video.
- based on the expression similarity, a face image or video fragment is extracted from the character video as the emoticon package material, which realizes emoticon package generation based on expression similarity calculation and improves the efficiency of emoticon generation.
- FIG. 3 shows a schematic structural diagram of an emoticon package generating apparatus 300 provided by the present application, as shown in FIG. 3, including:
- the obtaining module 310 is configured to obtain at least one portrait image from a character video to be processed, where the portrait image includes a face image;
- the calculation module 320 is configured to calculate the expression similarity between the face image in the portrait image and the emoticon package image in the preset emoticon package image library;
- the determining module 330 is configured to determine a target emoticon pack image and emoticon pack material based on the expression similarity, wherein the target emoticon pack image belongs to the emoticon pack image library, and the emoticon pack material belongs to the portrait image;
- the generating module 340 is configured to extract text information of the target emoticon package image, and integrate the text information with the emoticon package material to generate a target emoticon package.
- the determining module 330 is specifically configured to: when the expression similarity between the first face image and the first emoticon package image is greater than or equal to a first preset similarity threshold, use the first emoticon package image as the target emoticon package image and the first face image as a static emoticon package material, where the first face image is any face image in the portrait image and the first emoticon package image is any emoticon package image in the emoticon package image library.
- the determining module 330 specifically includes:
- a first acquiring unit, configured to acquire the attribute information of the first face image in the person video when the expression similarity between the first face image and the first emoticon package image is greater than or equal to the first preset similarity threshold;
- a second acquiring unit configured to acquire multiple frames of images corresponding to the first face image from the person video according to the attribute information
- a calculation unit configured to calculate the expression similarity between each frame of the image and the first expression pack image
- a first material determining unit, configured to use the first emoticon package image as the target emoticon package image and the face image in the first image as a static emoticon package material when the expression similarity between the face image in the first image and the first emoticon package image is greater than or equal to a second preset similarity threshold, where the first image is any one of the multiple frames of images.
- the determining module 330 may further include:
- a second material determining unit, configured to: after the calculation unit calculates the expression similarity between each frame of image and the first emoticon package image, when the expression similarity between the face image in the second image and the first emoticon package image is greater than or equal to the second preset similarity threshold, use the first emoticon package image as the target emoticon package image and the face image in the second image as a dynamic emoticon package material, where the second image is at least one frame of image adjacent to the first image among the multiple frames of images.
- the determining module 330 may further include:
- the third material determining unit is configured to: after the calculation unit calculates the expression similarity between each frame and the first emoticon package image, when the expression similarity between the face image in a third image and a second emoticon package image is the greatest, use the second emoticon package image as the target emoticon package image and use the face image in the third image as static emoticon package material, wherein the third image is at least one frame adjacent to the first image among the multiple frames of images.
- the determining module 330 may further include:
- the fourth material determining unit is configured to: after the calculation unit calculates the expression similarity between each frame and the first emoticon package image, when the expression similarity between the face image in the third image and the second emoticon package image is the greatest, use the second emoticon package image as the target emoticon package image and use the face image in the third image together with a fourth image as dynamic emoticon package material, wherein the fourth image is at least two consecutive frames among the multiple frames of images, and at least one of the at least two frames is adjacent to the first image.
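The adjacency condition used by the second and fourth material determining units amounts to growing a run of consecutive frames around a matching anchor frame. A hedged sketch, where the frame objects, the similarity function, and the threshold are all illustrative stand-ins:

```python
def collect_dynamic_material(frames, anchor, similarity, pack_image,
                             threshold=0.85):
    """Extend a run of consecutive frames around `anchor` whose faces stay
    similar to the emoticon package image; a run of two or more frames can
    serve as dynamic (animated) emoticon package material."""
    run = [frames[anchor]]
    i = anchor - 1
    while i >= 0 and similarity(frames[i], pack_image) >= threshold:
        run.insert(0, frames[i])   # grow the run backwards
        i -= 1
    j = anchor + 1
    while j < len(frames) and similarity(frames[j], pack_image) >= threshold:
        run.append(frames[j])      # grow the run forwards
        j += 1
    return run if len(run) >= 2 else None  # need >= 2 frames to animate

# toy frames that carry their own similarity score
scores = [0.9, 0.95, 0.9, 0.2]
run = collect_dynamic_material(scores, 1, lambda f, p: f, None)
```

Here the run grows to the three adjacent high-similarity frames and stops at the dissimilar fourth frame; a single matching frame would fall back to the static case.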
- the acquisition module 310 is further configured to extract facial expression features of the face image through a preset deep neural network, and the calculation module 320 is configured to compare the facial expression features with the expression features of the emoticon package images to obtain the expression similarity.
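The feature comparison step is not spelled out in the text; one common choice (an assumption on my part, not stated in the patent) is cosine similarity between the deep-network embeddings:

```python
import math

def cosine_similarity(a, b):
    """Similarity between two expression-feature vectors, e.g. embeddings
    produced by a convolutional network; 1.0 means identical direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

face_feat = [0.2, 0.7, 0.1]   # hypothetical embedding of the face image
pack_feat = [0.2, 0.7, 0.1]   # hypothetical embedding of an emoticon image
score = cosine_similarity(face_feat, pack_feat)
```

The resulting score can then be checked against the preset similarity thresholds described earlier.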
- FIG. 4 is a schematic structural diagram of a terminal device provided by an embodiment of the present application.
- the terminal device 400 includes a memory 410, at least one processor 420, and a computer program 430 stored in the memory 410 and executable on the processor 420; when the processor 420 executes the computer program 430, the aforementioned emoticon package generation method is implemented.
- the terminal device 400 may be a desktop computer, a mobile phone, a tablet computer, a wearable device, a vehicle-mounted device, an augmented reality (AR)/virtual reality (VR) device, a notebook computer, an ultra-mobile personal computer (UMPC), a netbook, a personal digital assistant (PDA), or the like; the embodiments of this application do not impose any restrictions on the specific type of the terminal device.
- the terminal device 400 may include, but is not limited to, the processor 420 and the memory 410. Those skilled in the art can understand that FIG. 4 is only an example of the terminal device 400 and does not constitute a limitation on it; the terminal device may include more or fewer components than those shown in the figure, combine certain components, or use different components, and may, for example, also include input and output devices.
- the processor 420 may be a central processing unit (CPU), or may be another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like.
- the general-purpose processor may be a microprocessor or the processor may also be any conventional processor or the like.
- the memory 410 may be an internal storage unit of the terminal device 400, such as a hard disk or internal memory of the terminal device 400. In other embodiments, the memory 410 may also be an external storage device of the terminal device 400, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a flash card equipped on the terminal device 400. Further, the memory 410 may include both an internal storage unit and an external storage device of the terminal device 400. The memory 410 is used to store an operating system, application programs, a boot loader, data, and other programs, such as the program code of the computer program; it may also be used to temporarily store data that has been output or will be output.
- the embodiments of the present application also provide a computer-readable storage medium, where the computer-readable storage medium stores a computer program, and when the computer program is executed by a processor, the steps in each of the foregoing method embodiments can be realized.
- the embodiments of the present application also provide a computer program product; when the computer program product runs on a mobile terminal, the steps in the foregoing method embodiments can be realized.
- if the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it can be stored in a computer-readable storage medium.
- the computer program can be stored in a computer-readable storage medium, and when the computer program is executed by a processor, the steps of the foregoing method embodiments can be implemented.
- the computer program includes computer program code, which may be in the form of source code, object code, an executable file, or some intermediate form.
- the computer-readable medium may include at least: any entity or device capable of carrying the computer program code to the photographing device/terminal device, a recording medium, a computer memory, a read-only memory (ROM), a random access memory (RAM), an electrical carrier signal, a telecommunications signal, and a software distribution medium. In some jurisdictions, according to legislation and patent practice, computer-readable media cannot include electrical carrier signals and telecommunications signals.
- the disclosed apparatus/network equipment and method may be implemented in other ways.
- the device/network device embodiments described above are merely illustrative. For example, the division of the modules or units is only a logical function division; in actual implementation there may be other divisions, for example, multiple units or components can be combined or integrated into another system, or some features can be omitted or not implemented.
- the displayed or discussed mutual coupling or direct coupling or communication connection may be indirect coupling or communication connection through some interfaces, devices or units, and may be in electrical, mechanical or other forms.
- the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, they may be located in one place, or they may be distributed on multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
Claims (10)
- 1. A method for generating an emoticon package, comprising: acquiring at least one portrait image from a person video to be processed, the portrait image containing a face image; calculating the expression similarity between the face image in the portrait image and emoticon package images in a preset emoticon package image library; determining a target emoticon package image and emoticon package material based on the expression similarity, wherein the target emoticon package image belongs to the emoticon package image library and the emoticon package material belongs to the portrait image; and extracting the text information of the target emoticon package image and integrating the text information with the emoticon package material to generate a target emoticon package.
- 2. The emoticon package generation method according to claim 1, wherein determining the target emoticon package image and the emoticon package material based on the expression similarity comprises: when the expression similarity between a first face image and a first emoticon package image is greater than or equal to a first preset similarity threshold, using the first emoticon package image as the target emoticon package image and using the first face image as static emoticon package material, wherein the first face image is any face image in the portrait image, and the first emoticon package image is any emoticon package image in the emoticon package image library.
- 3. The emoticon package generation method according to claim 1, wherein determining the target emoticon package image and the emoticon package material based on the expression similarity comprises: when the expression similarity between a first face image and a first emoticon package image is greater than or equal to a first preset similarity threshold, acquiring attribute information of the first face image in the person video; acquiring multiple frames of images corresponding to the first face image from the person video according to the attribute information; calculating the expression similarity between each of the frames and the first emoticon package image; and when the expression similarity between the face image in a first image and the first emoticon package image is greater than or equal to a second preset similarity threshold, using the first emoticon package image as the target emoticon package image and using the face image in the first image as static emoticon package material, wherein the first image is any one of the multiple frames of images.
- 4. The emoticon package generation method according to claim 3, wherein after calculating the expression similarity between each of the frames and the first emoticon package image, the method further comprises: when the expression similarity between the face image in a second image and the first emoticon package image is greater than or equal to the second preset similarity threshold, using the first emoticon package image as the target emoticon package image and using the face images in the first image and the second image as dynamic emoticon package material, wherein the second image is at least one frame adjacent to the first image among the multiple frames of images.
- 5. The emoticon package generation method according to claim 3, wherein after calculating the expression similarity between each of the frames and the first emoticon package image, the method further comprises: when the expression similarity between the face image in a third image and a second emoticon package image is the greatest, using the second emoticon package image as the target emoticon package image and using the face image in the third image as static emoticon package material, wherein the third image is at least one frame adjacent to the first image among the multiple frames of images.
- 6. The emoticon package generation method according to claim 3, wherein after calculating the expression similarity between each of the frames and the first emoticon package image, the method further comprises: when the expression similarity between the face image in a third image and a second emoticon package image is the greatest, using the second emoticon package image as the target emoticon package image and using the face image in the third image together with a fourth image as dynamic emoticon package material, wherein the fourth image is at least two consecutive frames among the multiple frames of images, and at least one of the at least two frames is adjacent to the first image.
- 7. The emoticon package generation method according to any one of claims 1 to 6, wherein calculating the expression similarity between the face image and the emoticon package images in the emoticon package image library comprises: extracting facial expression features of the face image through a preset deep neural network; and comparing the facial expression features with the expression features of the emoticon package images to obtain the expression similarity.
- 8. An emoticon package generating apparatus, comprising: an acquisition module, configured to acquire at least one portrait image from a person video to be processed, the portrait image containing a face image; a calculation module, configured to calculate the expression similarity between the face image in the portrait image and emoticon package images in a preset emoticon package image library; a determining module, configured to determine a target emoticon package image and emoticon package material based on the expression similarity, wherein the target emoticon package image belongs to the emoticon package image library and the emoticon package material belongs to the portrait image; and a generating module, configured to extract the text information of the target emoticon package image and integrate the text information with the emoticon package material to generate a target emoticon package.
- 9. A terminal device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the emoticon package generation method according to any one of claims 1 to 7.
- 10. A computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the emoticon package generation method according to any one of claims 1 to 7.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911197094.5 | 2019-11-29 | ||
CN201911197094.5A CN110889379B (en) | 2019-11-29 | 2019-11-29 | Expression package generation method and device and terminal equipment |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2021104097A1 true WO2021104097A1 (en) | 2021-06-03 |
Family
ID=69749402
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2020/129209 WO2021104097A1 (en) | 2019-11-29 | 2020-11-17 | Meme generation method and apparatus, and terminal device |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN110889379B (en) |
WO (1) | WO2021104097A1 (en) |
Families Citing this family (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110889379B (en) * | 2019-11-29 | 2024-02-20 | 深圳先进技术研究院 | Expression package generation method and device and terminal equipment |
CN111372141B (en) * | 2020-03-18 | 2024-01-05 | 腾讯科技(深圳)有限公司 | Expression image generation method and device and electronic equipment |
CN111586466B (en) * | 2020-05-08 | 2021-05-28 | 腾讯科技(深圳)有限公司 | Video data processing method and device and storage medium |
CN111768481A (en) * | 2020-05-19 | 2020-10-13 | 北京奇艺世纪科技有限公司 | Expression package generation method and device |
CN111753131A (en) * | 2020-06-28 | 2020-10-09 | 北京百度网讯科技有限公司 | Expression package generation method and device, electronic device and medium |
CN111881776B (en) * | 2020-07-07 | 2023-07-07 | 腾讯科技(深圳)有限公司 | Dynamic expression acquisition method and device, storage medium and electronic equipment |
CN113436297A (en) * | 2021-07-15 | 2021-09-24 | 维沃移动通信有限公司 | Picture processing method and electronic equipment |
CN117150063B (en) * | 2023-10-26 | 2024-02-06 | 深圳慢云智能科技有限公司 | Image generation method and system based on scene recognition |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106803909A (en) * | 2017-02-21 | 2017-06-06 | 腾讯科技(深圳)有限公司 | The generation method and terminal of a kind of video file |
CN107239535A (en) * | 2017-05-31 | 2017-10-10 | 北京小米移动软件有限公司 | Similar pictures search method and device |
CN110162670A (en) * | 2019-05-27 | 2019-08-23 | 北京字节跳动网络技术有限公司 | Method and apparatus for generating expression packet |
US20190354791A1 (en) * | 2018-05-17 | 2019-11-21 | Idemia Identity & Security France | Character recognition method |
CN110889379A (en) * | 2019-11-29 | 2020-03-17 | 深圳先进技术研究院 | Expression package generation method and device and terminal equipment |
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107369196B (en) * | 2017-06-30 | 2021-08-24 | Oppo广东移动通信有限公司 | Expression package manufacturing method and device, storage medium and electronic equipment |
US10593087B2 (en) * | 2017-10-23 | 2020-03-17 | Paypal, Inc. | System and method for generating emoji mashups with machine learning |
CN109508399A (en) * | 2018-11-20 | 2019-03-22 | 维沃移动通信有限公司 | A kind of facial expression image processing method, mobile terminal |
CN110321845B (en) * | 2019-07-04 | 2021-06-18 | 北京奇艺世纪科技有限公司 | Method and device for extracting emotion packets from video and electronic equipment |
CN110458916A (en) * | 2019-07-05 | 2019-11-15 | 深圳壹账通智能科技有限公司 | Expression packet automatic generation method, device, computer equipment and storage medium |
- 2019-11-29: CN CN201911197094.5A patent/CN110889379B/en active Active
- 2020-11-17: WO PCT/CN2020/129209 patent/WO2021104097A1/en active Application Filing
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106803909A (en) * | 2017-02-21 | 2017-06-06 | 腾讯科技(深圳)有限公司 | The generation method and terminal of a kind of video file |
CN107239535A (en) * | 2017-05-31 | 2017-10-10 | 北京小米移动软件有限公司 | Similar pictures search method and device |
US20190354791A1 (en) * | 2018-05-17 | 2019-11-21 | Idemia Identity & Security France | Character recognition method |
CN110162670A (en) * | 2019-05-27 | 2019-08-23 | 北京字节跳动网络技术有限公司 | Method and apparatus for generating expression packet |
CN110889379A (en) * | 2019-11-29 | 2020-03-17 | 深圳先进技术研究院 | Expression package generation method and device and terminal equipment |
Also Published As
Publication number | Publication date |
---|---|
CN110889379A (en) | 2020-03-17 |
CN110889379B (en) | 2024-02-20 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2021104097A1 (en) | Meme generation method and apparatus, and terminal device | |
CN108833973B (en) | Video feature extraction method and device and computer equipment | |
US20220350842A1 (en) | Video tag determination method, terminal, and storage medium | |
WO2019041521A1 (en) | Apparatus and method for extracting user keyword, and computer-readable storage medium | |
CN111489290B (en) | Face image super-resolution reconstruction method and device and terminal equipment | |
WO2019153504A1 (en) | Group creation method and terminal thereof | |
CN108961267B (en) | Picture processing method, picture processing device and terminal equipment | |
CN108898082B (en) | Picture processing method, picture processing device and terminal equipment | |
CN111814770A (en) | Content keyword extraction method of news video, terminal device and medium | |
CN111209970A (en) | Video classification method and device, storage medium and server | |
WO2020259449A1 (en) | Method and device for generating short video | |
JP2021034003A (en) | Human object recognition method, apparatus, electronic device, storage medium, and program | |
CN112532882B (en) | Image display method and device | |
WO2023197648A1 (en) | Screenshot processing method and apparatus, electronic device, and computer readable medium | |
WO2020135756A1 (en) | Video segment extraction method, apparatus and device, and computer-readable storage medium | |
CN111818385B (en) | Video processing method, video processing device and terminal equipment | |
WO2021135286A1 (en) | Video processing method, video searching method, terminal device, and computer-readable storage medium | |
CN113205047A (en) | Drug name identification method and device, computer equipment and storage medium | |
CN109886239B (en) | Portrait clustering method, device and system | |
CN111128233A (en) | Recording detection method and device, electronic equipment and storage medium | |
CN110232267B (en) | Business card display method and device, electronic equipment and storage medium | |
CN108932704B (en) | Picture processing method, picture processing device and terminal equipment | |
WO2023173659A1 (en) | Face matching method and apparatus, electronic device, storage medium, computer program product, and computer program | |
CN115544214A (en) | Event processing method and device and computer readable storage medium | |
CN113361486A (en) | Multi-pose face recognition method and device, storage medium and electronic equipment |
Legal Events
- 121 — Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 20893238; Country of ref document: EP; Kind code of ref document: A1)
- NENP — Non-entry into the national phase (Ref country code: DE)
- 122 — Ep: pct application non-entry in european phase (Ref document number: 20893238; Country of ref document: EP; Kind code of ref document: A1)
- 32PN — Ep: public notification in the ep bulletin as address of the addressee cannot be established (Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205N DATED 19/01/2023))