CN110933354A - Customizable multi-style multimedia processing method and terminal thereof - Google Patents

Customizable multi-style multimedia processing method and terminal thereof Download PDF

Info

Publication number
CN110933354A
Authority
CN
China
Prior art keywords
style
target
template
multimedia processing
target style
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201911128692.7A
Other languages
Chinese (zh)
Other versions
CN110933354B (en)
Inventor
孙文君
李立
赵柯莹
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Microphone Holdings Co Ltd
Original Assignee
Shenzhen Microphone Holdings Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Microphone Holdings Co Ltd filed Critical Shenzhen Microphone Holdings Co Ltd
Priority to CN201911128692.7A priority Critical patent/CN110933354B/en
Publication of CN110933354A publication Critical patent/CN110933354A/en
Application granted granted Critical
Publication of CN110933354B publication Critical patent/CN110933354B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00 Television systems
    • H04N7/14 Systems for two-way working
    • H04N7/141 Systems for two-way working between two video terminals, e.g. videophone
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/47 End-user applications
    • H04N21/478 Supplemental services, e.g. displaying phone caller identification, shopping application
    • H04N21/4788 Supplemental services, e.g. displaying phone caller identification, shopping application communicating with other users, e.g. chatting
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N5/00 Details of television systems
    • H04N5/222 Studio circuitry; Studio devices; Studio equipment
    • H04N5/262 Studio circuits, e.g. for mixing, switching-over, change of character of image, other special effects; Cameras specially adapted for the electronic generation of special effects
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • General Engineering & Computer Science (AREA)
  • Processing Or Creating Images (AREA)

Abstract

The invention discloses a customizable multi-style multimedia processing method and a terminal thereof. The method comprises: acquiring a picture, capturing key information of the picture, obtaining a target style characteristic template according to the key information, and calling the target style characteristic template to perform multimedia processing; the multimedia processing objects comprise videos, photos and pictures. When the multimedia processing is a multi-style video call, the method further comprises: collecting multi-angle, multi-expression portrait pictures of the user; applying the target style characteristic template to the multi-angle, multi-expression portrait pictures to realize frame data collection; encoding the collected frame data to obtain a target style video template; and calling the target style video template to carry out the video call. The method and the device customize the target style video template according to the user's preferences, call it in real time as required during a video call, meet the user's different style requirements when video-calling different people, and improve the user experience.

Description

Customizable multi-style multimedia processing method and terminal thereof
Technical Field
The invention relates to the technical field of video call, in particular to a customizable multi-style multimedia processing method and a terminal thereof.
Background
With the popularization of smartphones and the rapid upgrading of wireless and data networks, video chat has gradually expanded into, and even replaced, traditional telephone and text chat, and has become a standard feature of instant messaging.
At present, image and short-video processing software is abundant, but most of it offers only preset, fixed templates that cannot be customized to user requirements, so users cannot choose a suitable style according to their own preferences. For example, existing video call applications such as QQ or WeChat can beautify the video, but the templates are limited in number and fixed, and cannot be customized to the user's needs. In addition, because users want different styles when video-calling different people, existing templates cannot meet these requirements; in other words, no technical solution currently exists to solve this pain point for users. For the foregoing reasons, it is necessary to develop a customizable, multi-style multimedia processing method.
Disclosure of Invention
The object of the invention is to provide a customizable multi-style multimedia processing method and a terminal thereof, which customize a target style characteristic template (for example a target style video template) according to the user's preferences and can call it in real time, as required, within a multimedia application request (for example a video call), thereby meeting the user's different style requirements (for example when video-calling different people) and improving the user experience.
To achieve the above object, the invention is realized by the following technical solution:
a method of customizable multi-style multimedia processing, the method comprising: acquiring a picture, capturing key information of the picture, obtaining a target style characteristic template according to the key information, and calling the target style characteristic template to perform multimedia processing.
Preferably, the multimedia processing objects comprise videos, photos and pictures.
Preferably, the target style characteristic template is generated in real time or stored in a terminal and/or a cloud.
Preferably, the step of acquiring pictures comprises: the user selects one or more favorite pictures to capture.
Preferably, the key information is a favorite picture sub-region, and the favorite picture sub-region includes at least a part of the target face sub-region.
Preferably, the customizable multi-style multimedia processing method further comprises the step of analyzing common features of the target face sub-regions to obtain the target style characteristic template.
Preferably, the common features of the target face sub-regions are one or more of facial-feature features, filter features, hairstyle features, face and body beautification features, clothing color features and lighting features.
Preferably, the step of obtaining the target style characteristic template includes optimized image processing, specifically one or more of skin smoothing, skin whitening, face plumping, face slimming, eye brightening, blemish (acne) removal, wrinkle removal, filter, makeup, hairstyle change, mosaic, fill-light, contrast, saturation, sharpening and background blurring processing.
Preferably, the customizable multi-style multimedia processing method further comprises: automatically or manually calling the target style characteristic template by capturing matching information corresponding to the target style characteristic template.
Preferably, capturing the matching information comprises: the name assigned when the target style characteristic template is stored and retrieved, face recognition, voice recognition, and address book search and matching.
Preferably, when the multimedia processing is a multi-style video call, the method further comprises: collecting multi-angle, multi-expression portrait pictures of the user; applying the target style characteristic template to the multi-angle, multi-expression portrait pictures to realize frame data collection; processing the collected frame data to obtain a target style video template; and calling the target style video template to carry out a video call.
The invention also provides a customizable multi-style multimedia processing terminal adopting the multi-style multimedia processing method described above, comprising: an input module for acquiring a picture to be processed; an extraction module for capturing key information of the picture; and a multimedia processing module that obtains a target style characteristic template according to the key information and calls the target style characteristic template to perform the corresponding multimedia processing.
Preferably, the multimedia processing module comprises a video processing module, a photographing processing module and a picture processing module.
Preferably, the target style characteristic template is generated in real time or stored in a terminal and/or a cloud.
The invention further provides a customizable multi-style multimedia processing system adopting the multi-style multimedia processing terminal, comprising a first terminal, a server and a second terminal;
the first terminal and/or the second terminal comprises: an input module for acquiring a picture to be processed, an extraction module for capturing key information of the picture, and a multimedia processing module that obtains a target style characteristic template according to the key information and calls the target style characteristic template to perform the corresponding multimedia processing;
when one terminal initiates a multimedia application request to the other terminal through the server, the server relays the multimedia application request between the two terminals; when the other terminal accepts the request, the two terminals establish a connection, and one or both terminals realize the multi-style multimedia application by calling the target style video template.
Preferably, the target style characteristic template is generated in real time or stored in a terminal and/or a cloud.
Compared with the prior art, the invention has the beneficial effects that:
(1) The method customizes a target style characteristic template (for example a target style video template) according to the user's preferences and calls it in real time, as required, within a multimedia application request (for example a video call), thereby meeting the user's different style requirements (for example when the user video-calls different people) and improving the user experience; (2) the invention can meet the user's different style requirements automatically, without manual adjustment by the user, and is simple and convenient; (3) the invention can process pictures by calling image processing algorithms according to the user's preferences and the desired customized style; since the image processing algorithms are written and invoked independently, a wide variety of customized target style characteristic templates can be obtained, giving the invention a wide range of application.
Drawings
FIG. 1 is a flow chart of a customizable multi-style multimedia processing method according to the present invention;
FIG. 1a is a schematic flow chart of a customizable multi-style photo processing method according to the present invention;
FIG. 1b is a schematic flow chart of a customizable multi-style picture processing method of the present invention;
FIG. 1c is a schematic flow chart of a customizable multi-style video call method of the present invention;
FIG. 2 is a flow chart of an automatic-mode customized multi-style video call method of the present invention;
FIG. 3 is a flow chart of another automatic-mode customized multi-style video call method of the present invention;
FIG. 4 is a flow chart of a manual-mode customized multi-style video call method of the present invention;
FIG. 5 is a flow chart of another manual-mode customized multi-style video call method of the present invention.
Detailed Description
To clarify the features, content, advantages and effects achieved by the invention, the customizable multi-style multimedia processing method, terminal and system provided by the invention are described in detail below in the form of embodiments with reference to the accompanying drawings.
The advantages, features and technical measures of the invention will be more readily understood from the following detailed description of exemplary embodiments and the accompanying drawings. The invention may, however, be embodied in many different forms and should not be construed as limited to the embodiments set forth herein; rather, these embodiments are provided so that this disclosure will be thorough and complete and will fully convey the scope of the invention to those skilled in the art, the invention being defined only by the appended claims.
As shown in fig. 1, the present invention provides a customizable multi-style multimedia processing method, comprising the steps of: step 1, collecting pictures; step 2, capturing the key information of the picture; step 3, obtaining a target style characteristic template according to the key information; step 4, calling a target style characteristic template to perform multimedia processing; the multimedia processing objects comprise videos, photos and pictures.
Example one:
As shown in fig. 1a, the present invention provides a customizable photographing method, comprising: step 1, inputting a target face picture; step 2, obtaining a target face sub-region; step 3, obtaining a target style characteristic template based on the target face sub-region; and step 4, calling the obtained target style characteristic template to take a photo.
The target style characteristic template is generated in real time or stored in the terminal and/or the cloud. The step of acquiring the picture comprises: the user selects one or more favorite pictures to capture. The key information is the favorite picture sub-region, which comprises at least part of a target face sub-region, and the target style characteristic template is obtained by analyzing common features of the target face sub-regions. The common features of the target face sub-regions are one or more of facial-feature features, filter features, hairstyle features, face and body beautification features, clothing color features and lighting features.
To obtain the target style characteristic template, the image processing can be optimized; specifically, it comprises one or more of skin smoothing, skin whitening, face plumping, face slimming, eye brightening, blemish (acne) removal, wrinkle removal, filter, makeup, hairstyle change, mosaic, fill-light, contrast, saturation, sharpening and background blurring processing.
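As an illustration only, a few of the optimizations listed above could be approximated with off-the-shelf OpenCV primitives. The specific operators chosen here (bilateral filtering as a stand-in for skin smoothing, a brightness lift for whitening) and all parameter values are assumptions for the sketch, not the patent's algorithms.

```python
import cv2
import numpy as np


def smooth_skin(img: np.ndarray, strength: int = 9) -> np.ndarray:
    # Edge-preserving blur is a common stand-in for "skin smoothing".
    return cv2.bilateralFilter(img, d=strength, sigmaColor=75, sigmaSpace=75)


def whiten(img: np.ndarray, amount: float = 0.15) -> np.ndarray:
    # Simple brightness lift as a placeholder whitening step.
    return cv2.convertScaleAbs(img, alpha=1.0, beta=255 * amount)


def adjust_saturation(img: np.ndarray, scale: float = 1.1) -> np.ndarray:
    hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV).astype(np.float32)
    hsv[..., 1] = np.clip(hsv[..., 1] * scale, 0, 255)
    return cv2.cvtColor(hsv.astype(np.uint8), cv2.COLOR_HSV2BGR)


def blur_background(img: np.ndarray, face_box) -> np.ndarray:
    # Blur the whole frame, then paste the sharp face region back.
    x, y, w, h = face_box
    blurred = cv2.GaussianBlur(img, (21, 21), 0)
    blurred[y:y + h, x:x + w] = img[y:y + h, x:x + w]
    return blurred
```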
The target style characteristic template is called automatically or manually by capturing the matching information corresponding to it; capturing the matching information further comprises: the name assigned when the target style characteristic template is stored and retrieved, face recognition, voice recognition, and address book search and matching.
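A minimal sketch of how stored templates might be resolved from such matching information is given below; the TemplateStore class, its methods, the storage paths and the example contact are hypothetical placeholders, not the terminal's actual interface.

```python
from typing import Dict, Optional


class TemplateStore:
    """Hypothetical store that resolves a target style template from matching information."""

    def __init__(self) -> None:
        self._by_name: Dict[str, str] = {}     # template name -> template path
        self._by_contact: Dict[str, str] = {}  # contact / recognized face id -> template name

    def save(self, template_name: str, path: str, contact: Optional[str] = None) -> None:
        self._by_name[template_name] = path
        if contact:
            self._by_contact[contact] = template_name

    def resolve(self, *, template_name: Optional[str] = None,
                contact: Optional[str] = None) -> Optional[str]:
        """Manual call: look up by the name given when the template was stored.
        Automatic call: look up by contact (e.g. from the address book or face/voice recognition)."""
        if template_name:
            return self._by_name.get(template_name)
        if contact and contact in self._by_contact:
            return self._by_name.get(self._by_contact[contact])
        return None


store = TemplateStore()
store.save("mom_style", "/templates/mom_style.bin", contact="Mom")
print(store.resolve(contact="Mom"))  # automatic match -> "/templates/mom_style.bin"
```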
Example two:
As shown in fig. 1b, the present invention provides a customizable picture processing method, comprising: step 1, inputting a target face picture; step 2, obtaining a target face sub-region; step 3, obtaining a target style characteristic template based on the target face sub-region; and step 4, processing the picture to be beautified by calling the obtained target style characteristic template.
The target style characteristic template is generated in real time or stored in the terminal and/or the cloud. The step of acquiring the picture comprises: the user selects one or more favorite pictures to capture. The key information is the favorite picture sub-region, which comprises at least part of a target face sub-region, and the target style characteristic template is obtained by analyzing common features of the target face sub-regions. The common features of the target face sub-regions are one or more of facial-feature features, filter features, hairstyle features, face and body beautification features, clothing color features and lighting features.
To obtain the target style characteristic template, the image processing can be optimized; specifically, it comprises one or more of skin smoothing, skin whitening, face plumping, face slimming, eye brightening, blemish (acne) removal, wrinkle removal, filter, makeup, hairstyle change, mosaic, fill-light, contrast, saturation, sharpening and background blurring processing.
The target style characteristic template is called automatically or manually by capturing the matching information corresponding to it; capturing the matching information further comprises: the name assigned when the target style characteristic template is stored and retrieved, face recognition, voice recognition, and address book search and matching.
Example three:
As shown in fig. 1c, the present invention provides a customizable multi-style video call method, comprising: step 1, inputting a target face picture; step 2, obtaining a target face sub-region; step 3, obtaining a target style characteristic template based on the target face sub-region; step 4, collecting multi-angle, multi-expression portrait pictures of the user; step 5, applying the target style characteristic template to the multi-angle, multi-expression portrait pictures to realize frame data collection; step 6, encoding the collected frame data to obtain a target style video template; and step 7, calling the obtained target style video template to carry out the video call.
The target style characteristic template is generated in real time or stored in the terminal and/or the cloud. The step of acquiring the picture comprises: the user selects one or more favorite pictures to capture. The key information is the favorite picture sub-region, which comprises at least part of a target face sub-region, and the target style characteristic template is obtained by analyzing common features of the target face sub-regions. The common features of the target face sub-regions are one or more of facial-feature features, filter features, hairstyle features, face and body beautification features, clothing color features and lighting features.
To obtain the target style characteristic template, the image processing can be optimized; specifically, it comprises one or more of skin smoothing, skin whitening, face plumping, face slimming, eye brightening, blemish (acne) removal, wrinkle removal, filter, makeup, hairstyle change, mosaic, fill-light, contrast, saturation, sharpening and background blurring processing.
The target style characteristic template is called automatically or manually by capturing the matching information corresponding to it; capturing the matching information further comprises: the name assigned when the target style characteristic template is stored and retrieved, face recognition, voice recognition, and address book search and matching.
Example four:
As shown in fig. 2, the present invention provides an automatic-mode customized multi-style video call method, comprising the following steps:
s1, acquiring a target picture: inputting N candidate pictures with satisfactory face styles.
In step S1, the N candidate facial images received by the terminal may be facial images of the user, or may be any images selected by the user in the image database.
S2, obtaining M target face subregions: and carrying out region selection on the input N candidate pictures to obtain M sub-pictures corresponding to the M target sub-regions.
In step S2, the M generated sub-pictures correspond to different regions of each target face, such as face shape, nose, eyes, eyebrows, forehead, bangs, inter-eye distance, chin, teeth, lips, skin color, lip color, and so on. The M sub-pictures may be facial-feature regions taken from several of the N candidate pictures, or may all be facial-feature regions from a single picture.
In this embodiment, the M target face sub-regions are face sub-regions selected from at least two candidate face images, each sub-picture corresponds to one face sub-region, and each candidate face image includes at least one face sub-region; n, M are each integers greater than 1. M may be greater than N, or M may be less than N.
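One possible way to obtain such sub-pictures is sketched below using OpenCV's bundled Haar cascades for face and eye detection; the choice of detector and the two region types shown are assumptions made only for illustration (the patent does not prescribe a particular detector).

```python
import cv2

# Bundled Haar cascades; a production system would likely use a facial-landmark detector instead.
face_detector = cv2.CascadeClassifier(cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
eye_detector = cv2.CascadeClassifier(cv2.data.haarcascades + "haarcascade_eye.xml")


def extract_sub_regions(candidate_paths):
    """Return a list of (label, cropped image) sub-pictures from the N candidate pictures."""
    sub_pictures = []
    for path in candidate_paths:
        img = cv2.imread(path)
        if img is None:
            continue
        gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
        for (x, y, w, h) in face_detector.detectMultiScale(gray, 1.1, 5):
            face = img[y:y + h, x:x + w]
            sub_pictures.append(("face", face))
            for (ex, ey, ew, eh) in eye_detector.detectMultiScale(gray[y:y + h, x:x + w], 1.1, 5):
                sub_pictures.append(("eye", face[ey:ey + eh, ex:ex + ew]))
    return sub_pictures  # M sub-pictures, possibly drawn from several of the N candidates
```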
And S3, obtaining a target style characteristic template based on the face style characteristics of the M sub-pictures.
For example, when customizing a "boyfriend" style, the input pictures are analyzed and integrated to obtain the preferred facial-feature characteristics of the favorite pictures: eyes slightly larger than the actual eyes, light-brown pupils, light-colored eye shadow, curled and well-defined eyelashes, a preferred eyebrow shape, and so on; facial-shape analysis of the favorite pictures: a face slightly slimmer than the actual face, an oval (goose-egg) face; skin-tone analysis of the favorite pictures: slightly whiter skin, and so on. These preferences are then integrated into the boyfriend style template.
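The integration described above can be viewed as aggregating per-picture measurements into one parameter set. The sketch below assumes hypothetical numeric features (eye_scale, skin_tone, face_slim) and a simple average / majority-vote rule; neither the feature names nor the aggregation rule come from the patent.

```python
from statistics import mean

# Hypothetical per-picture measurements extracted from the favorite pictures.
favorite_features = [
    {"eye_scale": 1.08, "skin_tone": 0.72, "face_slim": 0.05, "pupil_color": "light_brown"},
    {"eye_scale": 1.12, "skin_tone": 0.78, "face_slim": 0.08, "pupil_color": "light_brown"},
    {"eye_scale": 1.05, "skin_tone": 0.75, "face_slim": 0.06, "pupil_color": "brown"},
]


def integrate(features):
    """Average numeric features and majority-vote categorical ones into a style template."""
    numeric_keys = [k for k, v in features[0].items() if isinstance(v, (int, float))]
    template = {k: round(mean(f[k] for f in features), 3) for k in numeric_keys}
    for k in (k for k in features[0] if k not in numeric_keys):
        values = [f[k] for f in features]
        template[k] = max(set(values), key=values.count)
    return template


boyfriend_template = integrate(favorite_features)
print(boyfriend_template)
# e.g. {'eye_scale': 1.083, 'skin_tone': 0.75, 'face_slim': 0.063, 'pupil_color': 'light_brown'}
```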
And S4, collecting X multi-angle multi-expression portrait pictures of the user.
And S5, applying the target style characteristic template to the acquired X pictures to realize frame data acquisition.
And S6, encoding the collected frame data to obtain the target style video template.
It should be noted that the video style template is only one form of the template: all processing is based on single-frame image processing. The collected multi-angle, multi-expression portrait pictures become multi-angle, multi-expression feature templates of the customized style, and the matching feature template is then applied frame by frame and the frames are connected to form the video.
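A minimal sketch of the frame-encoding step, assuming the styled frames are already available as NumPy images, follows; it uses OpenCV's VideoWriter, and the mp4v codec, output file name and frame rate are arbitrary choices for the example.

```python
import cv2


def encode_style_video(styled_frames, out_path="target_style_template.mp4", fps=25):
    """Encode the collected frame data (already styled single frames) into a video template."""
    if not styled_frames:
        raise ValueError("no frame data collected")
    height, width = styled_frames[0].shape[:2]
    fourcc = cv2.VideoWriter_fourcc(*"mp4v")
    writer = cv2.VideoWriter(out_path, fourcc, fps, (width, height))
    for frame in styled_frames:
        writer.write(frame)  # each frame was produced by applying the style template
    writer.release()
    return out_path
```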
For example, to customize a "mom" style: first, 8 candidate pictures preferred by the user's mom are input to the terminal; the terminal receives the 8 candidate pictures, performs region selection on them to obtain 10 sub-pictures corresponding to 10 target sub-regions, and analyzes the sub-pictures to obtain the target style characteristic template. Then, 5 multi-angle, multi-expression portrait pictures of the user are collected, the target style characteristic template is applied to the 5 portrait pictures to collect frame data, and the frame data are encoded to obtain the target style video template required by the user.
For example, to customize a "dad" style: first, 6 candidate pictures that the user's dad likes are input to the terminal; the terminal receives the 6 candidate pictures, performs region selection on them to obtain 5 sub-pictures corresponding to 5 target sub-regions, and analyzes the sub-pictures to obtain the target style characteristic template. Then, 4 multi-angle, multi-expression portrait pictures of the user are collected, the target style characteristic template is applied to the 4 portrait pictures to collect frame data, and the frame data are encoded to obtain the target style video template required by the user.
In the embodiment of the invention, the terminal receives the N candidate face pictures, obtains the M sub-pictures of the M selected target face sub-regions, and generates the target style characteristic template based on the M sub-pictures; it then collects several multi-angle, multi-expression portrait pictures of the user, applies the target style characteristic template to them to collect frame data, and encodes the frame data to obtain the target style video template required by the specific user.
The target style video templates of the embodiment can be named after being generated and stored.
And S7, when a video call needs to be made, after the video call is opened in the third-party application, the obtained target style video template is called according to preference to complete the video call.
Example five:
As shown in fig. 3, the present invention provides another automatic-mode customized multi-style video call method, comprising the following steps:
p1, acquiring a target picture: inputting N candidate pictures with satisfactory face styles.
P2, acquiring M target face subregions: and carrying out region selection on the input N candidate pictures to obtain M sub-pictures corresponding to the M target sub-regions.
And P3, obtaining a target style characteristic template based on the face style characteristics of the M sub-pictures.
And P4, collecting X multi-angle multi-expression portrait pictures of the user.
And P5, applying the target style characteristic template to the acquired X pictures to realize frame data acquisition.
And P6, encoding the collected frame data to obtain the target style video template.
And P7, a target style video template is set directly for the video contact according to preference, so that when the video call is opened in the third-party application, the target style video template is called automatically and the video call is completed.
Example six:
As shown in fig. 4, the manual-mode customized multi-style video call method of the present invention comprises the following steps:
t1, acquiring a target picture: inputting N candidate pictures with satisfactory face styles.
In the step T1, the N candidate facial images received by the terminal may be facial images of the user, or may be any images selected by the user in the image database.
T2, acquiring M target face subregions: and carrying out region selection on the input N candidate pictures to obtain M sub-pictures corresponding to the M target sub-regions.
In step T2, the M generated sub-pictures correspond to different regions of each target face, such as face shape, nose, eyes, eyebrows, forehead, bangs, inter-eye distance, chin, teeth, lips, skin color, lip color, and so on. The M sub-pictures may be facial-feature regions taken from several of the N candidate pictures, or may all be facial-feature regions from a single picture.
In this embodiment, the M target face sub-regions are face sub-regions selected from at least two candidate face images, each sub-picture corresponds to one face sub-region, and each candidate face image includes at least one face sub-region; n, M are each integers greater than 1. M may be greater than N, or M may be less than N.
And T3, one or more image processing algorithms are called to process the M sub-pictures according to the user's preferences and the desired customized style, so as to obtain the target style characteristic template.
In step T3, the image processing algorithms include a skin-smoothing algorithm, a skin-whitening algorithm, a face-plumping algorithm, a face-slimming algorithm, an eye-brightening algorithm, a blemish-removal algorithm, a wrinkle-removal algorithm, a filter algorithm, a makeup algorithm, a hairstyle-change algorithm, a mosaic algorithm, a fill-light algorithm, a contrast algorithm, a saturation algorithm, a sharpening algorithm, a background-blurring algorithm, and so on. Different image processing algorithms may also have different calling modes, such as the object the algorithm acts on (e.g., a particular object or position in the picture) and parameter adjustment within the algorithm (e.g., the strength or level of the skin-smoothing algorithm).
In this embodiment, the above-mentioned image processing algorithm is independently written.
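A configurable pipeline of independently written algorithms, each invoked with its own parameters, might look like the sketch below; the three placeholder operators, the "plump face" approximation and all parameter values are illustrative assumptions rather than the patent's algorithms.

```python
import cv2
import numpy as np


def smooth(img, strength=9, **_):
    return cv2.bilateralFilter(img, strength, 75, 75)


def plump_face(img, scale=1.03, **_):
    # Placeholder "face plumping": slight horizontal stretch of the whole frame.
    h, w = img.shape[:2]
    return cv2.resize(img, (int(w * scale), h))[:, :w]


def whiten(img, beta=20, **_):
    return cv2.convertScaleAbs(img, alpha=1.0, beta=beta)


ALGORITHMS = {"smooth": smooth, "plump_face": plump_face, "whiten": whiten}

# User-chosen calling mode: (algorithm name, parameters), e.g. smoothing level 7.
mom_style_pipeline = [("plump_face", {"scale": 1.03}),
                      ("whiten", {"beta": 25}),
                      ("smooth", {"strength": 7})]


def run_pipeline(img: np.ndarray, pipeline) -> np.ndarray:
    for name, params in pipeline:
        img = ALGORITHMS[name](img, **params)
    return img
```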
And T4, collecting X multi-angle multi-expression portrait pictures of the user.
And T5, applying the target style characteristic template to the acquired X pictures to realize frame data acquisition.
And T6, encoding the collected frame data to obtain the target style video template.
For example, when customizing the "mom" style, the face shape can be adjusted slightly (a slightly plumper face), the skin made slightly whiter, the hairstyle changed to a ponytail, and so on. First, 8 candidate pictures preferred by the user's mom are input to the terminal; the terminal receives the 8 candidate pictures and performs region selection on them to obtain 10 sub-pictures corresponding to 10 target sub-regions; the 10 sub-pictures are then processed by calling the face-plumping algorithm, the skin-whitening algorithm, the hairstyle-change algorithm, and so on, to obtain the target style characteristic template. Then, 5 multi-angle, multi-expression portrait pictures of the user are collected, the target style characteristic template is applied to the 5 portrait pictures to collect frame data, and the frame data are encoded to obtain the target style video template required by the user.
For example, when customizing the "boyfriend" style, face slimming, eye enlarging, lip tinting, cosmetic pupil (colored-contact) effects, hairstyle changes, and so on can be applied. First, 6 candidate pictures preferred by the user's boyfriend are input to the terminal; the terminal receives the 6 candidate pictures and performs region selection on them to obtain 5 sub-pictures corresponding to 5 target sub-regions; the face-slimming algorithm, the eye-brightening algorithm, the makeup algorithm, the hairstyle-change algorithm, and so on are then called to process the 5 sub-pictures to obtain the target style characteristic template. Then, 4 multi-angle, multi-expression portrait pictures of the user are collected, the target style characteristic template is applied to the 4 portrait pictures to collect frame data, and the frame data are encoded to obtain the target style video template required by the user.
The target style video templates of the embodiment can be named after being generated and stored.
And T7, when a video call needs to be made, after the video call is opened in the third-party application, the obtained target style video template is called according to preference to complete the video call.
Example seven:
As shown in fig. 5, the present invention provides another manual-mode customized multi-style video call method, comprising the following steps:
q1, acquiring a target picture: inputting N candidate pictures with satisfactory face styles.
Q2, acquiring M target face subregions: and carrying out region selection on the input N candidate pictures to obtain M sub-pictures corresponding to the M target sub-regions.
And Q3, one or more image processing algorithms are called to process the M sub-pictures according to the user's preferences and the desired customized style, so as to obtain the target style characteristic template.
And Q4, collecting X multi-angle multi-expression portrait pictures of the user.
And Q5, applying the target style characteristic template to the acquired X pictures to realize frame data acquisition.
And Q6, encoding the collected frame data to obtain the target style video template.
And Q7, a target style video template is set directly for the video contact according to preference, so that when the video call is opened in the third-party application, the target style video template is called automatically to complete the video call.
Example eight:
In step 3 of the multi-style video call method of the invention, changes of appearance and gender can also be realized through feature adjustment (i.e., this replaces steps S3, T3, etc.; the rest of the method is the same), so as to obtain the required target style characteristic template. The specific adjusted style parameters are then retained and used to collect frame data of the specific style when the X multi-angle, multi-expression portrait pictures are taken, and finally the collected frame data are encoded to obtain the target style video template.
Example nine:
The invention also provides a multi-style video terminal, comprising: a candidate picture input module for the user to input N candidate pictures of a face style the user likes; an image processing module for receiving the N candidate face pictures, obtaining M target face sub-regions and the corresponding M sub-pictures, and generating a target style characteristic template based on the M sub-pictures, where N and M are integers greater than 1; an expression capture module for collecting X multi-angle, multi-expression portrait pictures of the user; a frame data collection module for collecting frame data when the target style characteristic template is applied to the X collected pictures; and a video template generation module for encoding the collected frame data to obtain a target style video template.
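The module structure described above could be wired together as in the following sketch; every class and method name is an illustrative placeholder rather than the terminal's actual interface, and the template object is assumed to expose an apply() method.

```python
class CandidatePictureInput:
    def collect(self):
        """Return N candidate pictures the user likes (e.g. chosen from the gallery)."""
        raise NotImplementedError


class ImageProcessingModule:
    def build_template(self, candidates):
        """Select M target face sub-regions and derive the target style characteristic template."""
        raise NotImplementedError


class ExpressionCaptureModule:
    def capture(self, x: int):
        """Collect X multi-angle, multi-expression portrait pictures of the user."""
        raise NotImplementedError


class VideoTemplateGenerator:
    def encode(self, styled_frames, out_path: str) -> str:
        """Encode the collected frame data into the target style video template."""
        raise NotImplementedError


class MultiStyleVideoTerminal:
    """Wires the modules together in the order described in this embodiment."""

    def __init__(self, picture_input, image_processor, expression_capture, template_generator):
        self.picture_input = picture_input
        self.image_processor = image_processor
        self.expression_capture = expression_capture
        self.template_generator = template_generator

    def build_style_video_template(self, x: int, out_path: str) -> str:
        candidates = self.picture_input.collect()
        style_template = self.image_processor.build_template(candidates)
        portraits = self.expression_capture.capture(x)
        # Frame data collection: apply the style template to each portrait picture.
        styled_frames = [style_template.apply(p) for p in portraits]
        return self.template_generator.encode(styled_frames, out_path)
```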
Example ten:
The invention also provides a customizable multi-style video call system, comprising a first client, a server and a second client, wherein the first client comprises, but is not limited to: a candidate picture input module for the user to input N candidate pictures of a face style the user likes; an image processing module for receiving the N candidate face pictures, obtaining M target face sub-regions and the corresponding M sub-pictures, and generating a target style characteristic template based on the M sub-pictures, where N and M are integers greater than 1; an expression capture module for collecting X multi-angle, multi-expression portrait pictures of the user; a frame data collection module for collecting frame data when the target style characteristic template is applied to the X collected pictures; and a video template generation module for encoding the collected frame data to obtain a target style video template.
When the first client opens a video call in a third-party application and initiates a video call request to the second client through the server, the server relays the video call request between the first and second clients; when the second client accepts the request, the video call between the two clients is established, and the first client calls the obtained target style video template according to preference, realizing the customizable multi-style video call. Alternatively, the first client first sets a target style video template directly for the video contact according to preference and then opens the video call in the third-party application, so that the target style video template is called automatically; the first client then initiates the video call request to the second client through the server, the server relays the request, and when the second client accepts it the video call is established, again realizing the customizable multi-style video call.
In the invention, not only the first client but also the second client can generate and apply a target style video template; its functional modules and functions are the same as those of the first client and are not repeated here. When the first client initiates a video call request, the second client accepts it and the call is established, the second client may likewise call its own target style video template according to preference to realize the customizable multi-style video call. In addition, the second client may also actively initiate a video call request to the first client; when the first client accepts the request and the call is established, the second client may call the obtained target style video template according to preference, or the second client may first set the target style video template directly for the video contact according to preference and then open the video call in the third-party application, so that the template is called automatically and the customizable multi-style video call is realized.
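The call flow described in this embodiment (a request relayed by the server, with the template called automatically for the contact once the call is established) can be summarized in a toy sketch; the Server and Client classes, the in-memory relay and the example names are assumptions made purely for illustration.

```python
class Server:
    """Toy relay: forwards a video call request from one registered client to another."""

    def __init__(self):
        self.clients = {}

    def register(self, client):
        self.clients[client.name] = client

    def relay_call_request(self, caller: str, callee: str) -> bool:
        return self.clients[callee].on_call_request(caller)


class Client:
    def __init__(self, name, templates, server):
        self.name, self.templates, self.server = name, templates, server
        server.register(self)

    def on_call_request(self, caller: str) -> bool:
        return True  # accept every request in this sketch

    def start_video_call(self, callee: str):
        if self.server.relay_call_request(self.name, callee):
            # Automatic call of the template set for this contact, if any.
            template = self.templates.get(callee)
            print(f"{self.name} -> {callee}: call established using template {template!r}")


server = Server()
first = Client("first_client", {"second_client": "mom_style.mp4"}, server)
second = Client("second_client", {}, server)
first.start_video_call("second_client")
```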
In conclusion, the invention customizes the target style characteristic template according to the user's preferences and calls it in real time, as required, within a multimedia application request, thereby meeting the user's different style requirements and improving the user experience.
While the present invention has been described in detail with reference to the preferred embodiments, it should be understood that the above description should not be taken as limiting the invention. Various modifications and alterations to this invention will become apparent to those skilled in the art upon reading the foregoing description. Accordingly, the scope of the invention should be determined from the following claims.

Claims (10)

1. A customizable multi-style multimedia processing method, the method comprising:
acquiring a picture, capturing key information of the picture, obtaining a target style characteristic template according to the key information, and calling the target style characteristic template to perform multimedia processing.
2. The customizable multi-style multimedia processing method according to claim 1, wherein said step of capturing a picture comprises: the user selects one or more favorite pictures to capture.
3. The customizable multi-style multimedia processing method according to claim 1, wherein the key information is a favorite picture sub-region, and the favorite picture sub-region comprises at least one target face sub-region.
4. The customizable multi-style multimedia processing method according to claim 3, further comprising the step of: analyzing common features of the target face sub-regions to obtain the target style characteristic template.
5. The customizable multi-style multimedia processing method according to claim 3, wherein the common features of the target face sub-regions are one or more of facial-feature features, filter features, hairstyle features, face and body beautification features, clothing color features and lighting features.
6. The customizable multi-style multimedia processing method according to claim 1, further comprising: automatically or manually calling the target style feature template by capturing matching information corresponding to the target style feature template.
7. The customizable multi-style multimedia processing method according to claim 5, wherein
capturing the matching information comprises: the name assigned when the target style characteristic template is stored and retrieved, face recognition, voice recognition, and address book search and matching.
8. The customizable multi-style multimedia processing method according to any one of claims 1 to 7,
when the multimedia processing is a multi-style video call, further comprising:
collecting multi-angle and multi-expression portrait pictures of a user;
the target style characteristic template is applied to a multi-angle multi-expression portrait picture to realize frame data acquisition;
processing the collected frame data to obtain a target style video template;
and calling the target style video template to carry out video call.
9. A customizable multi-style multimedia processing terminal, comprising:
the input module is used for acquiring a picture to be processed;
the extraction module is used for capturing key information of the picture;
and a multimedia processing module that obtains a target style characteristic template according to the key information and calls the target style characteristic template to perform the corresponding multimedia processing.
10. The customizable multi-style multimedia processing terminal according to claim 9, wherein
the multimedia processing module comprises a video processing module, a photographing processing module and a picture processing module.
CN201911128692.7A 2019-11-18 2019-11-18 Customizable multi-style multimedia processing method and terminal thereof Active CN110933354B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911128692.7A CN110933354B (en) 2019-11-18 2019-11-18 Customizable multi-style multimedia processing method and terminal thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911128692.7A CN110933354B (en) 2019-11-18 2019-11-18 Customizable multi-style multimedia processing method and terminal thereof

Publications (2)

Publication Number Publication Date
CN110933354A true CN110933354A (en) 2020-03-27
CN110933354B CN110933354B (en) 2023-09-01

Family

ID=69854148

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911128692.7A Active CN110933354B (en) 2019-11-18 2019-11-18 Customizable multi-style multimedia processing method and terminal thereof

Country Status (1)

Country Link
CN (1) CN110933354B (en)

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130129210A1 (en) * 2010-11-02 2013-05-23 Sk Planet Co., Ltd. Recommendation system based on the recognition of a face and style, and method thereof
WO2017177768A1 (en) * 2016-04-13 2017-10-19 腾讯科技(深圳)有限公司 Information processing method, terminal, and computer storage medium
WO2019000777A1 (en) * 2017-06-27 2019-01-03 五邑大学 Internet-based face beautification system
CN107705248A (en) * 2017-10-31 2018-02-16 广东欧珀移动通信有限公司 Image processing method, device, electronic equipment and computer-readable recording medium
CN108881782A (en) * 2018-08-23 2018-11-23 维沃移动通信有限公司 A kind of video call method and terminal device
CN109448069A (en) * 2018-10-30 2019-03-08 维沃移动通信有限公司 A kind of template generation method and mobile terminal

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113420244A (en) * 2020-07-20 2021-09-21 阿里巴巴集团控股有限公司 Dynamic effect template generation method, dynamic picture display method and device and electronic equipment
CN112597320A (en) * 2020-12-09 2021-04-02 上海掌门科技有限公司 Social information generation method, device and computer readable medium

Also Published As

Publication number Publication date
CN110933354B (en) 2023-09-01

Similar Documents

Publication Publication Date Title
CN107730444B (en) Image processing method, image processing device, readable storage medium and computer equipment
JP4449723B2 (en) Image processing apparatus, image processing method, and program
KR101733512B1 (en) Virtual experience system based on facial feature and method therefore
CN107862653B (en) Image display method, image display device, storage medium and electronic equipment
CN111583415B (en) Information processing method and device and electronic equipment
CN108337465A (en) Method for processing video frequency and device
CN110933354B (en) Customizable multi-style multimedia processing method and terminal thereof
EP4116923A1 (en) Auxiliary makeup method, terminal device, storage medium and program product
CN113850726A (en) Image transformation method and device
CN113453027B (en) Live video and virtual make-up image processing method and device and electronic equipment
KR101827998B1 (en) Virtual experience system based on facial feature and method therefore
KR20220026252A (en) Mobile terminal, server and method for composing beauty style
CN114187166A (en) Image processing method, intelligent terminal and storage medium
CN108337427B (en) Image processing method and electronic equipment
US11812183B2 (en) Information processing device and program
CN113284229A (en) Three-dimensional face model generation method, device, equipment and storage medium
CN106327588B (en) Intelligent terminal and image processing method and device thereof
CN108600614B (en) Image processing method and device
KR101507410B1 (en) Live make-up photograpy method and apparatus of mobile terminal
CN114972014A (en) Image processing method and device and electronic equipment
CN114998115A (en) Image beautification processing method and device and electronic equipment
CN112488965A (en) Image processing method and device
KR20000054209A (en) A Face Model Enhancement Service on the Internet and Mobile Video Phone
CN112714251A (en) Shooting method and shooting terminal
CN110798614A (en) Photo shooting method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant