CN110636377A - Video processing method, device, storage medium, terminal and server - Google Patents

Video processing method, device, storage medium, terminal and server

Info

Publication number
CN110636377A
CN110636377A
Authority
CN
China
Prior art keywords
video
face
image
face image
user
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910894995.3A
Other languages
Chinese (zh)
Inventor
朱凌
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Dajia Internet Information Technology Co Ltd
Original Assignee
Beijing Dajia Internet Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Dajia Internet Information Technology Co Ltd filed Critical Beijing Dajia Internet Information Technology Co Ltd
Priority to CN201910894995.3A priority Critical patent/CN110636377A/en
Publication of CN110636377A publication Critical patent/CN110636377A/en
Pending legal-status Critical Current

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/442Monitoring of processes or resources, e.g. detecting the failure of a recording device, monitoring the downstream bandwidth, the number of times a movie has been viewed, the storage space available from the internal hard disk
    • H04N21/44213Monitoring of end-user related data
    • H04N21/44218Detecting physical presence or behaviour of the user, e.g. using sensors to detect if the user is leaving the room or changes his face expression during a TV program
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/47End-user applications
    • H04N21/472End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content
    • H04N21/47202End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content for requesting content on demand, e.g. video on demand
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/47End-user applications
    • H04N21/475End-user interface for inputting end-user data, e.g. personal identification number [PIN], preference data
    • H04N21/4756End-user interface for inputting end-user data, e.g. personal identification number [PIN], preference data for rating content, e.g. scoring a recommended movie

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Human Computer Interaction (AREA)
  • Databases & Information Systems (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Social Psychology (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

The present disclosure provides a video processing method, apparatus, storage medium, terminal and server for automatically performing video processing operations. The video processing method includes: receiving a playing instruction, input by a user, for a target video, and playing the target video; acquiring a face image of the user; and displaying a video processing interface corresponding to the target video according to the acquired face image.

Description

Video processing method, device, storage medium, terminal and server
Technical Field
The present disclosure relates to the field of video processing technologies, and in particular, to a video processing method and apparatus, a storage medium, a terminal, and a server.
Background
Currently, when a user watches a video through a video application, the playback screen provides some operable icons, such as a like icon.
However, the above video processing methods all require the user to operate by hand. When it is inconvenient for the user to operate with fingers (for example, in cold weather, when wearing gloves, when both hands are holding things, or when performing other tasks), it is inconvenient to perform video processing operations on the video.
Disclosure of Invention
The disclosure provides a video processing method, a video processing device, a storage medium, a terminal and a server, which are used for automatically executing video processing operation. The technical scheme of the disclosure is as follows:
according to a first aspect of embodiments of the present disclosure, there is provided a video processing method, the method including:
receiving a playing instruction, input by a user, for a target video, and playing the target video;
acquiring a face image of the user;
and displaying a video processing interface corresponding to the target video according to the acquired face image.
In one possible implementation, the acquiring the face image of the user includes:
and acquiring a face video of the user, wherein the face video comprises a plurality of frames of face images.
In a possible implementation manner, the displaying a video processing interface corresponding to the target video according to the acquired face image includes:
according to the collected face image, identifying face action information corresponding to the face image, wherein the face action information is used for representing the face action of a user in the identified face image;
if the face action information is a preset face action, acquiring a video processing instruction corresponding to the preset face action;
and displaying a video processing interface corresponding to the target video according to the video processing instruction.
In a possible implementation manner, when the acquired face image is a face image in a face video of a user, the identifying, according to the acquired face image, face action information corresponding to the face image includes:
if the face action information corresponding to a preset number of consecutive frames of face images is the same, or if the face action information corresponding to all the face images within a preset time period is the same, confirming that same face action information as the recognized face action information corresponding to the face images.
In a possible implementation manner, the displaying a video processing interface corresponding to the target video according to the acquired face image includes:
acquiring preset video comment information corresponding to a first face image according to the acquired first face image;
and displaying the video comment information on a video processing interface of the target video.
In a possible implementation manner, the displaying a video processing interface corresponding to the target video according to the acquired face image includes:
and according to the acquired second face image, adding the video content corresponding to the video author of the target video to the user's follow list, and canceling display of the follow function item on the video processing interface of the target video.
In a possible implementation manner, after the display of the follow function item is canceled on the video processing interface of the target video, the method further includes:
according to the acquired third face image, deleting the video content corresponding to the video author of the target video from the follow list, and displaying the follow function item on the video processing interface of the target video.
In a possible implementation manner, the displaying a video processing interface corresponding to the target video according to the acquired face image includes:
displaying a like mark on the video processing interface of the target video according to the acquired fourth face image;
or canceling the display of the like mark according to the acquired fifth face image.
In a possible implementation manner, the displaying a video processing interface corresponding to the target video according to the acquired face image includes:
according to the acquired sixth face image, jumping from the video playing interface of the target video to a sharing interface for sharing the target video;
or canceling the sharing interface according to the acquired seventh face image.
According to a second aspect of the embodiments of the present disclosure, there is provided a video processing method, the method including:
receiving a face image sent by a video client, wherein the face image is the face image of a user watching a target video played by the video client;
performing image processing on the face image, and identifying face action information corresponding to the face image, wherein the face action information is used for representing the face action of a user in the identified face image;
based on the face action information, executing video processing operation corresponding to the target video;
and sending operation display information to the video client to instruct the video client to display, according to the operation display information, a video processing result corresponding to the video processing operation on a video processing interface of the target video.
In a possible implementation manner, the receiving a facial image sent by a video client includes: receiving a face image sent by the video client and a video identifier of the target video;
the video processing operation corresponding to the target video is executed based on the face action information, and the video processing operation comprises the following steps: and according to the video identification, obtaining the target video from a video database, and executing video processing operation corresponding to the target video based on the face action information.
In a possible implementation manner, the face action information indicates that the face action of the user in the face image is a smile; alternatively, it indicates that the tilt angle of the user's head in the face image is within a preset angle range.
In a possible implementation manner, the executing of a video processing operation corresponding to the target video includes: increasing or decreasing a statistical count of like identification information of the target video;
the sending of operation display information to the video client includes: sending operation display information to the video client, so that the video client displays the like mark corresponding to the like identification information, or cancels the like mark, on the video processing interface of the target video.
In a possible implementation manner, the executing a video processing operation corresponding to the target video includes: corresponding to the target video, executing preset video sharing operation processing;
the sending operation display information to the video client includes: and sending operation display information to the video client so that the video client displays a sharing interface of a target video corresponding to the video sharing operation processing.
In a possible implementation manner, the performing image processing on the face image to identify face motion information corresponding to the face image includes:
and inputting the face image into a pre-trained image recognition model to obtain the face action information output by the image recognition model.
In a possible implementation, before inputting the facial image into a pre-trained image recognition model, the method further includes:
obtaining a training sample set, the training sample set comprising: an action image and a non-action image; the user in the motion image executes a preset face action, and the user in the non-motion image does not execute the preset face action;
and training the image recognition model through the training sample set, so that the facial action information output by the image recognition model comprises first identification information and second identification information, wherein the first identification information indicates that the image input to the image recognition model is the action image, and the second identification information indicates that the image input to the image recognition model is the non-action image.
In a possible implementation manner, the receiving a facial image sent by a video client includes: receiving one frame of face image in a face video of a user sent by the video client;
the image processing of the face image and the recognition of the face action information corresponding to the face image comprise:
if the face action information corresponding to a preset number of consecutive frames of face images is the same, or if the face action information corresponding to all the face images within a preset time period is the same, confirming that same face action information as the recognized face action information corresponding to the face images.
According to a third aspect of embodiments of the present disclosure, there is provided a video processing apparatus including means for performing the video processing method of the first aspect or any possible implementation manner of the first aspect.
According to a fourth aspect of embodiments of the present disclosure, there is provided a video processing apparatus including means for performing the video processing method of the second aspect or any possible implementation manner of the second aspect.
According to a fifth aspect of embodiments of the present disclosure, there is provided a storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of a video processing method in any possible implementation of the present disclosure.
According to a sixth aspect of the embodiments of the present disclosure, there is provided a terminal, including a memory, a processor, and a computer program stored on the memory and executable on the processor, where the processor implements the steps of the video processing method in the first aspect or any possible implementation manner of the first aspect when executing the program.
According to a seventh aspect of embodiments of the present disclosure, there is provided a server comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the steps of the video processing method of the second aspect or any possible implementation manner of the second aspect when executing the program.
According to an eighth aspect of embodiments of the present disclosure, there is provided a computer program product comprising a computer program which, when executed by a processor, implements the steps of the video processing method in any possible implementation of the present disclosure.
The technical scheme provided by the embodiment of the disclosure at least brings the following beneficial effects:
the method comprises the steps of collecting a face image of a user watching a target video played by a video client, identifying face action information corresponding to the face image according to the collected face image, and executing video processing operation corresponding to the target video based on the face action information, so that automatic video processing operation is realized.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and, together with the description, serve to explain the principles of the disclosure and are not to be construed as limiting the disclosure.
FIG. 1 is a network architecture diagram of a video processing system according to an exemplary embodiment;
FIG. 2 is a flow diagram illustrating a video processing method according to an exemplary embodiment;
FIG. 3 is a flow diagram illustrating a video processing method according to another exemplary embodiment;
FIG. 4 is a schematic block diagram of a video processing apparatus according to an example embodiment;
FIG. 5 is a block diagram illustrating a first processing module in a video processing apparatus according to an example embodiment;
FIG. 6 is a first schematic structural diagram of a video processing apparatus according to another exemplary embodiment;
FIG. 7 is a second schematic structural diagram of a video processing apparatus according to another exemplary embodiment;
FIG. 8 is a block diagram illustrating a terminal in accordance with an exemplary embodiment;
FIG. 9 is a schematic structural diagram of a server according to an exemplary embodiment.
Detailed Description
In order to make the technical solutions of the present disclosure better understood by those of ordinary skill in the art, the technical solutions in the embodiments of the present disclosure will be clearly and completely described below with reference to the accompanying drawings.
It should be noted that the terms "first," "second," and the like in the description and claims of the present disclosure and in the above-described drawings are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the disclosure described herein are capable of operation in sequences other than those illustrated or otherwise described herein. The implementations described in the exemplary embodiments below are not intended to represent all implementations consistent with the present disclosure. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the present disclosure, as detailed in the appended claims.
First, a network architecture of an application scenario of the technical solution provided by the embodiment of the present disclosure is introduced.
Fig. 1 is a diagram illustrating a network architecture of a video processing system according to an exemplary embodiment, which includes a terminal 10, a server 20, and an image capture device 30, as shown in fig. 1.
The terminal 10 may be a smartphone, a tablet computer, etc., and the terminal 10 is an electronic device running a client of at least one video Application (APP).
The server 20 is a background server of the video APP, and is configured to receive data sent by the terminal 10 or send data to the terminal 10.
The image capturing device 30, such as a camera, is used to capture images of a user viewing a video while the user is viewing the video. The image capturing device 30 may be disposed on the terminal 10, such as a front camera of a mobile phone, but may be disposed at any position where the face image of the user watching the video can be captured.
The video processing method provided by the embodiment of the present disclosure may be applied to the terminal 10 or the server 20.
The first embodiment is as follows:
fig. 2 is a flowchart illustrating a video processing method, as shown in fig. 2, for use in a terminal according to an exemplary embodiment, the method including the following steps.
S101, receiving a playing instruction, input by a user, for a target video, and playing the target video;
S102, acquiring a face image of the user;
in this embodiment, the image may be captured once every several seconds to capture the face image of the user viewing the target video, or the face video of the user may be captured, where the face video includes multiple frames of the face image.
S103, displaying a video processing interface corresponding to the target video according to the collected face image.
In some embodiments, the displaying a video processing interface corresponding to the target video according to the acquired face image in step S103 may include:
according to the collected face image, identifying face action information corresponding to the face image, wherein the face action information is used for representing the face action of a user in the identified face image;
if the face action information is a preset face action, acquiring a video processing instruction corresponding to the preset face action;
and displaying a video processing interface corresponding to the target video according to the video processing instruction.
For example, if the face action information indicates that the user's face action in the face image is a smile, and a smile corresponds to a video like instruction, a like mark is displayed on the video processing interface of the target video.
It should be noted that the correspondence between face actions and video processing instructions is preset and stored according to actual needs.
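Such a correspondence can be held in a simple lookup table. A minimal sketch in Python; the smile-to-like pairing comes from the example above, while the head-tilt-to-share pairing and all identifier names are illustrative assumptions:

```python
# Hypothetical mapping; the disclosure leaves the concrete correspondence
# to be configured according to actual needs. Smile -> like is the example
# given in the text; head_tilt -> share is an assumed entry.
FACE_ACTION_TO_INSTRUCTION = {
    "smile": "like",
    "head_tilt": "share",
}

def instruction_for(face_action):
    """Return the preset video processing instruction for a recognized
    face action, or None when the action has no preset mapping."""
    return FACE_ACTION_TO_INSTRUCTION.get(face_action)
```

An unmapped action (for example, a frown) simply yields no instruction, so no interface change occurs.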
In some embodiments, the identifying, according to the acquired face image, face motion information corresponding to the face image may include:
and inputting the face image into a pre-trained image recognition model to obtain the face action information output by the image recognition model.
In this embodiment, the image recognition model may be, for example, a Convolutional Neural Network (CNN) model.
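Whatever network is used, its output can be reduced to the binary decision the disclosure needs. A hedged sketch; `score_fn` is a hypothetical stand-in for the pre-trained CNN's output probability, and the label names are assumptions:

```python
ACTION_IMAGE = "first_identification"       # preset face action present
NON_ACTION_IMAGE = "second_identification"  # preset face action absent

def classify_face_image(score_fn, image, threshold=0.5):
    """Reduce a trained model's score (probability that the preset face
    action is present in `image`) to the two identification labels the
    disclosure describes. `score_fn` stands in for the CNN forward pass;
    the 0.5 threshold is an assumed default."""
    return ACTION_IMAGE if score_fn(image) >= threshold else NON_ACTION_IMAGE
```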
In some scenes, inaccurate recognition of the user's action intent easily causes misoperation. For example, the user may happen to laugh while watching the video without intending to like it; if the user's face action in a single face image is recognized as a smile and the video is liked directly, an unintended like results.
Then, in order to ensure accurate recognition of the user action intention to prevent misoperation, in some embodiments, when the collected face image is a face image in a face video of the user, the above-mentioned recognizing the face action information corresponding to the face image according to the collected face image may include:
if the face action information corresponding to a preset number of consecutive frames of face images is the same, or if the face action information corresponding to all the face images within a preset time period is the same, confirming that same face action information as the recognized face action information corresponding to the face images.
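The frame-consistency check described above can be sketched as follows; the default of five consecutive frames is an assumption, since the disclosure leaves the preset frame count unspecified:

```python
def confirm_action(frame_actions, min_consecutive=5):
    """Confirm a face action only when the same action is recognized in
    `min_consecutive` consecutive frames of the face video; otherwise
    return None. This guards against a one-off smile triggering an
    unintended like. `min_consecutive` is an assumed default."""
    run_action, run_len = None, 0
    for action in frame_actions:
        if action is not None and action == run_action:
            run_len += 1
        else:
            # Different action (or no action) recognized: restart the run.
            run_action, run_len = action, 1
        if run_action is not None and run_len >= min_consecutive:
            return run_action
    return None
```

The variant that requires the same action across all frames within a preset time period reduces to the same check with `min_consecutive` set to the number of frames in that period.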
In the embodiment of the present disclosure, different processing operations may be performed on a target video according to different facial actions of a user in a captured facial image, which is described below by way of example.
The first condition is as follows:
if the acquired face image is a first face image and the video processing instruction corresponding to the user face action in the first face image is a video comment instruction, at this time, displaying a video processing interface corresponding to the target video according to the acquired face image in step S103 may include:
acquiring preset video comment information corresponding to a first face image according to the acquired first face image;
and displaying the video comment information on a video processing interface of the target video.
Wherein the video comment information includes at least one of: text comment information or expression comment information.
Case two:
if the acquired face image is a second face image, and the video processing instruction corresponding to the user face action in the second face image is a video follow instruction, at this time, displaying a video processing interface corresponding to the target video according to the acquired face image in step S103 may include:
and according to the acquired second face image, adding the video content corresponding to the video author of the target video to the user's follow list, and canceling display of the follow function item on the video processing interface of the target video.
Case three:
after focusing attention, if the acquired face image is a third face image, and the video processing instruction corresponding to the user face action in the third face image is a video attention canceling instruction, at this time, the method may further include:
according to the collected third face image, deleting the video content corresponding to the video author of the target video from the video attention list, and displaying the attention function item on a video processing interface of the target video.
Case four:
if the acquired face image is a fourth face image, and the video processing instruction corresponding to the user face action in the fourth face image is a video like command, at this time, the displaying the video processing interface corresponding to the target video according to the acquired face image in step S103 may include:
and displaying the approved mark on a video processing interface of the target video according to the acquired fourth face image.
Case five:
after the video processing command is received, if the collected face image is a fifth face image, the video processing command corresponding to the user face action in the fifth face image is a video disapproval command, and at this time, the method may further include:
and canceling the display of the approved mark according to the acquired fifth face image.
Case six:
if the acquired face image is a sixth face image, and the video processing instruction corresponding to the user face action in the sixth face image is a video sharing instruction, at this time, displaying the video processing interface corresponding to the target video according to the acquired face image in step S103 may include:
and skipping to display a sharing interface for sharing the target video through a video playing interface of the target video according to the collected sixth facial image.
Case seven:
after jumping to the sharing interface, if the collected face image is a seventh face image, and the video processing instruction corresponding to the user face action in the seventh face image is a video sharing cancelling instruction, at this time, the method may further include:
and canceling the sharing interface according to the collected seventh face image.
Example two:
fig. 3 is a flowchart illustrating a video processing method, as shown in fig. 3, for use in a server, according to another exemplary embodiment, the method including the following steps.
S201, receiving a face image sent by a video client, wherein the face image is the face image of a user watching a target video played by the video client;
it should be noted that before the video client sends the facial image of the user to the server, the facial image of the user needs to be acquired only if the authorization of the user is obtained, and the acquired facial image is sent to the server.
S202, carrying out image processing on the face image, and identifying face action information corresponding to the face image, wherein the face action information is used for representing the face action of a user in the identified face image;
as an example, the face action information may indicate that the user face action in the face image is a user smile. As another example, the face motion information may indicate that the motion in the face image is that the tilt angle of the head of the user is in a preset angle range.
In some embodiments, the image processing on the face image in step S202, and identifying the face action information corresponding to the face image, may include:
and inputting the face image into a pre-trained image recognition model to obtain the face action information output by the image recognition model.
In some embodiments, before inputting the facial image into a pre-trained image recognition model, the method further comprises:
obtaining a training sample set, the training sample set comprising: an action image and a non-action image; the user in the action image performs a preset facial action (e.g., the user in the image smiles), the user in the non-action image does not perform the preset facial action (e.g., the user in the image does not smile);
and training the image recognition model through the training sample set, so that the facial action information output by the image recognition model comprises first identification information and second identification information, wherein the first identification information indicates that the image input to the image recognition model is the action image, and the second identification information indicates that the image input to the image recognition model is the non-action image.
It should be noted that the training sample set may consist of images acquired under different conditions, including at least one of: different cameras, lighting, angles, degrees of blur, or degrees of occlusion.
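One way to organize collection across such conditions is to enumerate their combinations. A sketch; the concrete condition levels below are illustrative assumptions, as the disclosure names only the condition axes:

```python
import itertools

# Illustrative levels for each condition axis; the disclosure names the
# axes (camera, lighting, angle, blur, occlusion) but not the levels.
CAPTURE_CONDITIONS = {
    "lighting": ("bright", "dim"),
    "angle": ("frontal", "tilted"),
    "blur": ("sharp", "blurred"),
    "occlusion": ("none", "partial"),
}

def condition_grid(conditions=CAPTURE_CONDITIONS):
    """Enumerate every combination of capture conditions to cover when
    collecting action and non-action training images."""
    keys = list(conditions)
    return [dict(zip(keys, combo))
            for combo in itertools.product(*(conditions[k] for k in keys))]
```

Collecting both action and non-action samples across each combination helps the trained model stay robust to the capture conditions listed above.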
Of course, the pre-training of the image recognition model may also be completed on other devices, and then the pre-trained image recognition model is used on the device, which is not limited by the embodiment of the present disclosure.
S203, executing video processing operation corresponding to the target video based on the face action information;
the corresponding relation between the face action and the video processing operation is preset and stored according to actual needs.
For example, a smile by the user in the face image may correspond to a video like operation; as another example, a tilt of the user's head within the preset angle range may correspond to a video sharing operation.
And S204, sending operation display information to the video client to instruct the video client to display, according to the operation display information, a video processing result corresponding to the video processing operation on the video processing interface of the target video.
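Steps S201 to S204 can be sketched end to end as a single server-side handler. All names are hypothetical, `recognize` stands in for the image recognition of S202, and the smile and head-tilt mappings follow the examples given earlier:

```python
def handle_face_image(image, video_id, recognize, video_db, like_counts):
    """Recognize the face action in `image` (S202), perform the
    corresponding video processing operation on the target video (S203),
    and return the operation display information to send back to the
    video client (S204)."""
    if video_id not in video_db:                 # look up the target video
        return {"display": "none"}
    action = recognize(image)                    # S202: identify face action
    if action == "smile":                        # S203: like operation
        like_counts[video_id] = like_counts.get(video_id, 0) + 1
        return {"display": "show_like_mark", "video_id": video_id}
    if action == "head_tilt":                    # S203: share operation
        return {"display": "show_share_interface", "video_id": video_id}
    return {"display": "none"}                   # no preset action recognized
```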
In some embodiments, the receiving of the face image sent by the video client in step S201 includes: receiving the face image sent by the video client together with a video identifier (e.g., a video ID) of the target video;
in step S203, executing a video processing operation corresponding to the target video based on the face action information includes: obtaining the target video from a video database according to the video identifier, and executing the video processing operation corresponding to the target video based on the face action information.
In some embodiments, performing a video processing operation corresponding to the target video in step S203 includes: increasing the statistical count of like identification information of the target video;
in step S204, sending operation display information to the video client includes: sending operation display information to the video client so that the video client displays the like mark corresponding to the like identification information on the video processing interface of the target video.
In some embodiments, performing a video processing operation corresponding to the target video in step S203 includes: decreasing the statistical count of like identification information of the target video;
in step S204, sending operation display information to the video client includes: sending operation display information to the video client so that the video client removes the existing like mark on the video processing interface of the target video.
In some embodiments, performing a video processing operation corresponding to the target video in step S203 includes: executing preset video sharing operation processing corresponding to the target video;
in step S204, sending operation display information to the video client includes: sending operation display information to the video client so that the video client displays the sharing interface of the target video corresponding to the video sharing operation processing.
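Taken together, steps S203 and S204 describe a server-side dispatch: look up the target video by its identifier, apply the operation implied by the recognized face action, and return operation display information for the client to render. The sketch below is a hedged illustration under assumed names: the mapping of specific actions to like, unlike, and share is an assumption, and plain dictionaries stand in for a real video database and response message.

```python
# Illustrative server-side dispatch for steps S203/S204; all names are assumptions.
class VideoServer:
    def __init__(self, video_db):
        self.video_db = video_db   # assumed store: video_id -> record with a like counter

    def handle(self, video_id, face_action):
        video = self.video_db[video_id]            # S203: fetch target video by its ID
        if face_action == "smile":                 # assumed: smile -> increase like count
            video["likes"] += 1
            return {"op": "like", "show": "liked_mark", "likes": video["likes"]}
        if face_action == "frown":                 # assumed: frown -> decrease like count
            video["likes"] = max(0, video["likes"] - 1)
            return {"op": "unlike", "show": "clear_liked_mark", "likes": video["likes"]}
        if face_action == "head_tilt":             # assumed: tilt -> show sharing interface
            return {"op": "share", "show": "share_interface"}
        return {"op": "none"}                      # no preset action recognized
```

The returned dictionary plays the role of the operation display information: the client would use the `show` field to decide what to render on the video processing interface.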
In some embodiments, the receiving of the face image sent by the video client in step S201 includes: receiving one frame of a face image from a face video of the user sent by the video client;
in step S202, performing image processing on the face image and identifying the face action information corresponding to the face image includes:
if the face action information corresponding to a preset number of consecutive frames of face images is the same, or if the face action information corresponding to all face images within a preset time period is the same, confirming the common face action information as the recognized face action information corresponding to the face image.
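The per-frame confirmation rule above can be implemented as a small sliding-window check: a face action is only confirmed once the same action has been recognized for N consecutive frames, which suppresses single-frame misrecognitions. The class below is an illustrative sketch; the window size is an assumed parameter, and the preset-time-period variant would compare frame timestamps instead of frame counts.

```python
from collections import deque

class ActionConfirmer:
    """Confirm a face action only after N identical consecutive per-frame results."""

    def __init__(self, required_frames=5):   # window size is an assumed parameter
        self.required = required_frames
        self.recent = deque(maxlen=required_frames)   # sliding window of results

    def update(self, frame_action):
        """Feed one frame's recognized action; return it once confirmed, else None."""
        self.recent.append(frame_action)
        if (len(self.recent) == self.required
                and len(set(self.recent)) == 1
                and frame_action is not None):
            return frame_action               # same action for N straight frames
        return None
```

Feeding the confirmer a differing frame resets the agreement, so a momentary misrecognition in the middle of a sequence prevents confirmation until N matching frames accumulate again.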
Based on the same inventive concept, the disclosed embodiments further provide a video processing apparatus, and fig. 4 is a block diagram of a video processing apparatus according to an exemplary embodiment. Referring to fig. 4, the video processing apparatus is used in a terminal and includes: a video playing module 11, an image acquisition module 12, and a first processing module 13.
The video playing module 11 is configured to receive a playing instruction for a target video input by a user and play the target video;
an image acquisition module 12 configured to acquire an image of the face of the user;
and the first processing module 13 is configured to display a video processing interface corresponding to the target video according to the acquired face image.
In one possible implementation, the image acquisition module 12 is configured to:
and acquiring a face video of the user, wherein the face video comprises a plurality of frames of face images.
In a possible implementation, as shown in fig. 5, the first processing module 13 includes:
the action recognition sub-module 131 is configured to recognize face action information corresponding to the face image according to the collected face image, wherein the face action information is used for representing the face action of the user in the recognized face image;
a processing instruction obtaining sub-module 132 configured to obtain, if the face action information matches a preset face action, a video processing instruction corresponding to the preset face action;
the display sub-module 133 is configured to display a video processing interface corresponding to the target video according to the video processing instruction.
In a possible implementation, when the acquired face image is a face image in a video of the face of the user, the action recognition sub-module 131 is configured to:
if the face action information corresponding to a preset number of consecutive frames of face images is the same, or if the face action information corresponding to all face images within a preset time period is the same, confirming the common face action information as the recognized face action information corresponding to the face image.
In a possible implementation, the first processing module 13 is configured to:
acquiring preset video comment information corresponding to a first face image according to the acquired first face image;
and displaying the video comment information on a video processing interface of the target video.
In a possible implementation, the first processing module 13 is configured to:
and according to the acquired second face image, adding the video content corresponding to the video author of the target video to the user's video follow list, and hiding the follow function item on the video processing interface of the target video.
In a possible implementation, after the follow function item has been hidden on the video processing interface of the target video, the first processing module 13 is further configured to:
according to the acquired third face image, delete the video content corresponding to the video author of the target video from the video follow list, and display the follow function item again on the video processing interface of the target video.
In a possible implementation, the first processing module 13 is configured to:
displaying a like mark on the video processing interface of the target video according to the acquired fourth face image;
or removing the displayed like mark according to the acquired fifth face image.
In a possible implementation, the first processing module 13 is configured to:
according to the collected sixth facial image, skipping to display a sharing interface for sharing the target video through a video playing interface of the target video;
or canceling the sharing interface according to the collected seventh face image.
Based on the same inventive concept, the disclosed embodiment further provides a video processing apparatus, and fig. 6 is a block diagram of a video processing apparatus according to an exemplary embodiment. Referring to fig. 6, the video processing apparatus is used in a server, the video processing apparatus including: a receiving module 21, an image processing and recognition module 22, a second processing module 23 and a sending module 24.
A receiving module 21 configured to receive a face image sent by a video client, where the face image is a face image of a user watching a target video played by the video client;
an image processing and recognition module 22 configured to perform image processing on the face image and recognize the face action information corresponding to the face image, the face action information being used to represent the recognized face action of the user in the face image;
a second processing module 23 configured to execute a video processing operation corresponding to the target video based on the face action information;
the sending module 24 is configured to send operation display information to the video client to instruct the video client to display a video processing result corresponding to the video processing operation on a video processing interface of the target video according to the operation display information.
In a possible implementation, the receiving module 21 is configured to: receiving a face image sent by the video client and a video identifier of the target video;
the second processing module 23 is configured to: and according to the video identification, obtaining the target video from a video database, and executing video processing operation corresponding to the target video based on the face action information.
In one possible implementation, the face action information indicates that the face action of the user in the face image is a smile, or indicates that the tilt angle of the user's head in the face image is within a preset angle range.
In a possible implementation, the second processing module 23 is configured to: increase or decrease the statistical count of like identification information of the target video;
the sending module 24 is configured to: send operation display information to the video client, so that the video client displays the like mark corresponding to the like identification information, or removes the like mark, on the video processing interface of the target video.
In a possible implementation, the second processing module 23 is configured to: execute preset video sharing operation processing corresponding to the target video;
the sending module 24 is configured to: send operation display information to the video client so that the video client displays the sharing interface of the target video corresponding to the video sharing operation processing.
In one possible implementation, the image processing and recognition module 22 is configured to:
and inputting the face image into a pre-trained image recognition model to obtain the face action information output by the image recognition model.
In a possible implementation manner, as shown in fig. 7, the apparatus may further include:
a training sample acquisition module 25 configured to acquire a training sample set, the training sample set including action images and non-action images, where the user in an action image performs a preset face action and the user in a non-action image does not;
a model training module 26 configured to train the image recognition model on the training sample set, so that the face action information output by the image recognition model includes first identification information, indicating that the image input to the image recognition model is an action image, and second identification information, indicating that the input image is a non-action image.
In a possible implementation, the receiving module 21 is configured to: receive one frame of a face image from a face video of the user sent by the video client;
the image processing and recognition module 22 is configured to:
if the face action information corresponding to a preset number of consecutive frames of face images is the same, or if the face action information corresponding to all face images within a preset time period is the same, confirming the common face action information as the recognized face action information corresponding to the face image.
The implementation process of the functions and actions of each unit in the above device is specifically described in the implementation process of the corresponding step in the above method, and is not described herein again.
For the device embodiments, since they substantially correspond to the method embodiments, reference may be made to the partial description of the method embodiments for relevant points. The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules can be selected according to actual needs to achieve the purpose of the disclosed solution. One of ordinary skill in the art can understand and implement it without inventive effort.
Based on the same inventive concept, the disclosed embodiments also provide a storage medium on which a computer program is stored, which when executed by a processor implements the steps of the video processing method in any of the possible implementations described above.
Alternatively, the storage medium may be a non-transitory computer readable storage medium, which may be, for example, a ROM, a Random Access Memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, and the like.
Based on the same inventive concept, the disclosed embodiments also provide a computer program product, which includes a computer program, and when the program is executed by a processor, the steps of the video processing method in any possible implementation manner are implemented.
Based on the same inventive concept, the disclosed embodiments also provide a terminal, which includes a memory, a processor, and a computer program stored on the memory and operable on the processor;
wherein the processor is configured to:
receiving a playing instruction aiming at a target video input by a user, and playing the target video;
acquiring a face image of the user;
and displaying a video processing interface corresponding to the target video according to the acquired face image.
As shown in fig. 8, fig. 8 is a schematic structural diagram of a terminal 1700 according to an exemplary embodiment of the present disclosure. For example, terminal 1700 may be a mobile telephone with routing capabilities, a computer, a digital broadcast terminal, a messaging device, a game console, a tablet device, a medical device, a fitness device, a personal digital assistant, and the like.
Referring to fig. 8, terminal 1700 may include one or more of the following components: processing component 1702, memory 1704, power component 1706, multimedia component 1708, audio component 1710, input/output (I/O) interface 1712, sensor component 1714, and communications component 1716.
The processing component 1702 generally controls the overall operation of the terminal 1700, such as operations associated with display, telephone calls, data communications, camera operations, and recording operations. Processing component 1702 may include one or more processors 1720 to execute instructions to perform all or a portion of the steps of the above-described method. Further, processing component 1702 may include one or more modules that facilitate interaction between processing component 1702 and other components. For example, processing component 1702 may include a multimedia module to facilitate interaction between multimedia component 1708 and processing component 1702.
Memory 1704 is configured to store various types of data to support operations at terminal 1700. Examples of such data include instructions for any application or method operating on terminal 1700, contact data, phonebook data, messages, pictures, videos, and so forth. The memory 1704 may be implemented by any type or combination of volatile or non-volatile memory devices such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disks.
Power supply component 1706 provides power to the various components of terminal 1700. The power components 1706 may include a power management system, one or more power supplies, and other components associated with generating, managing, and distributing power for the terminal 1700.
The multimedia component 1708 includes a screen providing an output interface between the terminal 1700 and a user. In some embodiments, the screen may include a Liquid Crystal Display (LCD) and a Touch Panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive an input signal from a user. The touch panel includes one or more touch sensors to sense touch, slide, and gestures on the touch panel. The touch sensor may not only sense the boundary of a touch or slide action, but also detect the duration and pressure associated with the touch or slide operation. In some embodiments, the multimedia component 1708 includes a front facing camera and/or a rear facing camera. The front camera and/or the rear camera may receive external multimedia data when the terminal 1700 is in an operation mode, such as a photographing mode or a video mode. Each front camera and rear camera may be a fixed optical lens system or have a focal length and optical zoom capability.
Audio component 1710 is configured to output and/or input audio signals. For example, audio component 1710 includes a Microphone (MIC) configured to receive external audio signals when terminal 1700 is in an operational mode, such as a call mode, a record mode, and a voice recognition mode. The received audio signal may further be stored in the memory 1704 or transmitted via the communication component 1716. In some embodiments, audio component 1710 also includes a speaker for outputting audio signals.
The I/O interface 1712 provides an interface between the processing component 1702 and peripheral interface modules, such as a keyboard, click wheel, buttons, and the like. These buttons may include, but are not limited to: a home button, a volume button, a start button, and a lock button.
Sensor assembly 1714 includes one or more sensors for providing various aspects of state assessment for terminal 1700. For example, sensor assembly 1714 can detect the open/closed state of terminal 1700 and the relative positioning of components, such as the display and keypad of terminal 1700. Sensor assembly 1714 can also detect a change in the position of terminal 1700 or of a component of terminal 1700, the presence or absence of user contact with terminal 1700, the orientation or acceleration/deceleration of terminal 1700, and a change in the temperature of terminal 1700. The sensor assembly 1714 may include a proximity sensor configured to detect the presence of nearby objects without any physical contact. The sensor assembly 1714 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor assembly 1714 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor, a microwave sensor, or a temperature sensor.
The communication component 1716 is configured to facilitate communications between the terminal 1700 and other devices in a wired or wireless manner. The terminal 1700 may access a wireless network based on a communication standard, such as WiFi, 2G or 3G, or a combination thereof. In an exemplary embodiment, the communication component 1716 receives broadcast signals or broadcast related information from an external broadcast management system via a broadcast channel. In an exemplary embodiment, the communication component 1716 further includes a Near Field Communication (NFC) module to facilitate short-range communications. For example, the NFC module may be implemented based on Radio Frequency Identification (RFID) technology, infrared data association (IrDA) technology, Ultra Wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
In an exemplary embodiment, the terminal 1700 may be implemented by one or more Application Specific Integrated Circuits (ASICs), Digital Signal Processors (DSPs), Digital Signal Processing Devices (DSPDs), Programmable Logic Devices (PLDs), Field Programmable Gate Arrays (FPGAs), controllers, micro-controllers, microprocessors or other electronic components for performing the video processing method as shown in fig. 2.
In an exemplary embodiment, a non-transitory computer readable storage medium including instructions, such as the memory 1704 including instructions, executable by the processor 1720 of the terminal 1700 to perform the above-described method is also provided. For example, the non-transitory computer readable storage medium may be a ROM, a Random Access Memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, and the like.
Based on the same inventive concept, the embodiment of the present disclosure further provides a server, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor;
wherein the processor is configured to:
receiving a face image sent by a video client, wherein the face image is the face image of a user watching a target video played by the video client;
performing image processing on the face image, and identifying face action information corresponding to the face image, wherein the face action information is used for representing the face action of a user in the identified face image;
based on the face action information, executing video processing operation corresponding to the target video;
and sending operation display information to the video client to indicate the video client to display a video processing result corresponding to the video processing operation on a video processing interface of the target video according to the operation display information.
As shown in fig. 9, fig. 9 is a schematic diagram illustrating a structure of a server 1800 according to an example embodiment. Referring to FIG. 9, the server 1800 includes a processing component 1802 that further includes one or more processors and memory resources, represented by memory 1804, for storing instructions, such as application programs, that are executable by the processing component 1802. The application programs stored in memory 1804 may include one or more modules that each correspond to a set of instructions. Further, the processing component 1802 is configured to execute instructions to perform the video processing method as shown in fig. 3.
The server 1800 may also include a power component 1806 configured to perform power management for the server 1800, a wired or wireless network interface 1808 configured to connect the server 1800 to a network, and an input/output (I/O) interface 1810. The server 1800 may operate based on an operating system stored in memory 1804, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, or the like.
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This disclosure is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.
It will be understood that the present disclosure is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.

Claims (10)

1. A method of video processing, the method comprising:
receiving a playing instruction aiming at a target video input by a user, and playing the target video;
acquiring a face image of the user;
and displaying a video processing interface corresponding to the target video according to the acquired face image.
2. The method of claim 1, wherein the capturing the image of the face of the user comprises:
and acquiring a face video of the user, wherein the face video comprises a plurality of frames of face images.
3. The method according to claim 1, wherein the displaying a video processing interface corresponding to the target video according to the acquired facial image comprises:
according to the collected face image, identifying face action information corresponding to the face image, wherein the face action information is used for representing the face action of a user in the identified face image;
if the face action information is a preset face action, acquiring a video processing instruction corresponding to the preset face action;
and displaying a video processing interface corresponding to the target video according to the video processing instruction.
4. The method according to claim 3, wherein when the acquired face image is a face image in a face video of a user, the identifying, according to the acquired face image, face motion information corresponding to the face image comprises:
if the face action information corresponding to a preset number of consecutive frames of face images is the same, or if the face action information corresponding to all face images within a preset time period is the same, confirming the common face action information as the recognized face action information corresponding to the face image.
5. A method of video processing, the method comprising:
receiving a face image sent by a video client, wherein the face image is the face image of a user watching a target video played by the video client;
performing image processing on the face image, and identifying face action information corresponding to the face image, wherein the face action information is used for representing the face action of a user in the identified face image;
based on the face action information, executing video processing operation corresponding to the target video;
and sending operation display information to the video client to indicate the video client to display a video processing result corresponding to the video processing operation on a video processing interface of the target video according to the operation display information.
6. A video processing apparatus, characterized in that the apparatus comprises:
the video playing module is configured to receive a playing instruction aiming at a target video input by a user and play the target video;
an image acquisition module configured to acquire a facial image of the user;
and the first processing module is configured to display a video processing interface corresponding to the target video according to the acquired face image.
7. A video processing apparatus, characterized in that the apparatus comprises:
the receiving module is configured to receive a face image sent by a video client, wherein the face image is the face image of a user watching a target video played by the video client;
the image processing and identifying module is configured to perform image processing on the face image, identify face action information corresponding to the face image, wherein the face action information is used for representing the identified face action of the user in the face image;
the second processing module is configured to execute video processing operation corresponding to the target video based on the face action information;
the sending module is configured to send operation display information to the video client to instruct the video client to display a video processing result corresponding to the video processing operation on a video processing interface of the target video according to the operation display information.
8. A storage medium having a computer program stored thereon, which, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 5.
9. A terminal comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the steps of the method according to any of claims 1-4 are implemented when the processor executes the program.
10. A server comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the steps of the method of claim 5 are performed when the processor executes the program.
CN201910894995.3A 2019-09-20 2019-09-20 Video processing method, device, storage medium, terminal and server Pending CN110636377A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910894995.3A CN110636377A (en) 2019-09-20 2019-09-20 Video processing method, device, storage medium, terminal and server


Publications (1)

Publication Number Publication Date
CN110636377A true CN110636377A (en) 2019-12-31

Family

ID=68972107


Country Status (1)

Country Link
CN (1) CN110636377A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111541951A (en) * 2020-05-08 2020-08-14 腾讯科技(深圳)有限公司 Video-based interactive processing method and device, terminal and readable storage medium
CN112866810A (en) * 2021-01-05 2021-05-28 三星电子(中国)研发中心 Video playing method and video playing device

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103369288A (en) * 2012-03-29 2013-10-23 深圳市腾讯计算机系统有限公司 Instant communication method based on network video and system thereof
CN104007826A (en) * 2014-06-17 2014-08-27 合一网络技术(北京)有限公司 Video control method and system based on face movement identification technology
CN104244101A (en) * 2013-06-21 2014-12-24 三星电子(中国)研发中心 Method and device for commenting multimedia content
CN104238983A (en) * 2014-08-05 2014-12-24 联想(北京)有限公司 Control method and electronic equipment
CN106550276A (en) * 2015-09-22 2017-03-29 阿里巴巴集团控股有限公司 The offer method of multimedia messages, device and system in video display process
CN109309878A (en) * 2017-07-28 2019-02-05 Tcl集团股份有限公司 The generation method and device of barrage
CN109361814A (en) * 2018-09-25 2019-02-19 联想(北京)有限公司 A kind of control method and electronic equipment



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 20191231)