CN107105342B - Video playing control method and mobile terminal - Google Patents

Info

Publication number
CN107105342B
Authority
CN
China
Prior art keywords
image
frame
video
images
video segment
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201710286861.4A
Other languages
Chinese (zh)
Other versions
CN107105342A (en)
Inventor
杨章
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Vivo Mobile Communication Co Ltd
Original Assignee
Vivo Mobile Communication Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Vivo Mobile Communication Co Ltd filed Critical Vivo Mobile Communication Co Ltd
Priority to CN201710286861.4A priority Critical patent/CN107105342B/en
Publication of CN107105342A publication Critical patent/CN107105342A/en
Application granted granted Critical
Publication of CN107105342B publication Critical patent/CN107105342B/en

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream, rendering scenes according to MPEG-4 scene graphs
    • H04N21/4402 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream, rendering scenes according to MPEG-4 scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/47 End-user applications
    • H04N21/472 End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content
    • H04N21/47217 End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content for controlling playback functions for recorded or on-demand content, e.g. using progress bars, mode or play-point indicators or bookmarks
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/60 Network structure or processes for video distribution between server and client or between remote clients; Control signalling between clients, server and network components; Transmission of management data between server and client, e.g. sending from server to client commands for recording incoming content stream; Communication details between server and client
    • H04N21/65 Transmission of management data between client and server
    • H04N21/658 Transmission by the client directed to the server
    • H04N21/6587 Control parameters, e.g. trick play commands, viewpoint selection

Abstract

The invention provides a video playing control method and a mobile terminal. The video playing control method comprises the following steps: acquiring continuous N frames of images forming a video; acquiring a target object of each frame of image in the continuous N frames of images; dividing the continuous N frames of images into T video segments according to the target object contained in each frame of image; and, when a playing control instruction for the video is detected, controlling the playing progress of the video to be switched among the T video segments, in units of video segments, according to the playing control instruction; wherein N is an integer greater than 0, and T is an integer greater than 0. With this technical scheme, the video is divided into a plurality of video segments according to the target objects it contains, and the content played in each video segment is closely related to the target object contained in that segment, so the user can avoid missing the plot related to the target object when adjusting the video progress, and the playing effect of the video is optimized.

Description

Video playing control method and mobile terminal
Technical Field
The present invention relates to the field of communications technologies, and in particular, to a video playing control method and a mobile terminal.
Background
With the development of mobile terminal technology and the light, portable nature of mobile terminals, more and more people use mobile terminals (such as mobile phones and tablet computers) to watch videos. When watching a video on a mobile terminal, a user sometimes wants to skip ahead to increase the watching speed; in this case, the playing progress of the video is generally adjusted by fast forwarding or fast rewinding.
In the prior art, video fast-forward units are generally divided by time; that is, a fixed skip interval is specified. However, the development of character-driven videos, such as television series, movies, scene clips, or variety shows, is often tied to the characters in the video. With the prior-art fast-forward or fast-rewind approach, if the operation is not performed carefully, the appearance of key characters or the development of key plot points may be missed, and it is difficult to locate the desired playing position that contains the key characters the user wants to watch. The user then has to adjust the video progress again, which is very inconvenient.
Disclosure of Invention
The embodiment of the invention provides a video playing control method and a mobile terminal, and aims to solve the problem that video positioning is inconvenient when video progress adjustment is carried out in the prior art.
In a first aspect, a video playing control method is provided, including:
acquiring continuous N frames of images forming a video;
acquiring a target object of each frame of image in the continuous N frames of images;
dividing the continuous N frames of images into T video segments according to the target object contained in each frame of image;
when a playing control instruction for the video is detected, controlling the playing progress of the video to be switched among the T video segments by taking the video segments as units according to the playing control instruction;
wherein N is an integer greater than 0, and T is an integer greater than 0.
In a second aspect, a mobile terminal is provided, including:
the first acquisition module is used for acquiring continuous N frames of images forming a video;
the second acquisition module is used for acquiring a target object of each frame of image in the continuous N frames of images acquired by the first acquisition module;
the dividing module is used for dividing the continuous N frames of images into T video segments according to the target object contained in each frame of image acquired by the second acquiring module;
the control module is used for controlling the playing progress of the video to be switched among the T video segments divided by the dividing module by taking the video segments as units according to the playing control instruction when the playing control instruction of the video is detected;
wherein N is an integer greater than 0, and T is an integer greater than 0.
The invention has the beneficial effects that:
according to the technical scheme, the video is divided into the plurality of video segments according to the target object in the video, and the content played by each video segment is closely related to the target object contained in the video segment, so that a user can avoid missing a plot related to the target object when the video progress is adjusted, and the playing effect of the video is optimized. In addition, when the user adjusts the video progress, the user adjusts the video progress according to the video segment, so that the accuracy of fast forwarding can be improved, the speed of paying attention to important information by the user can be increased, and better use experience is provided for the user.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed to be used in the description of the embodiments will be briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without inventive labor.
Fig. 1 is a flowchart illustrating a video playing control method according to an embodiment of the present invention;
FIG. 2 is a flow chart illustrating the sub-steps of step 103 provided by an embodiment of the present invention;
fig. 3 is a block diagram of a mobile terminal according to an embodiment of the present invention;
fig. 4 is a second block diagram of a mobile terminal according to an embodiment of the present invention;
fig. 5 is a third block diagram of a mobile terminal according to an embodiment of the present invention;
fig. 6 is a fourth block diagram of the mobile terminal according to the embodiment of the present invention.
Detailed Description
Exemplary embodiments of the present invention will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the invention are shown in the drawings, it should be understood that the invention can be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the invention to those skilled in the art.
According to an aspect of an embodiment of the present invention, a video playback control method is provided.
As shown in fig. 1, the video playback control method includes:
step 101, acquiring continuous N frames of images forming a video.
The N consecutive images obtained in this step may be all images constituting the video, or may be partial images constituting the video, and the specific situation may be selected according to actual needs, which is not limited in this embodiment of the present invention.
Wherein N is an integer greater than 0.
And 102, acquiring a target object of each frame of image in the continuous N frames of images.
For the consecutive N frames of images acquired in step 101, this step identifies and confirms the target object for each frame of image therein.
Wherein the target object is generally an object closely related to the development of the plot in the video. For example, if the video is a movie and the scenario of movie development is a scenario with human activities as main lines, the target object according to which the video is divided is a face image; if the video is a documentary related to an animal, dividing the video according to a target object which is an animal image; if the video is a documentary related to the plant, the target object based on which the video is divided is a plant image. Therefore, in the embodiment of the present invention, the target object is a human face image, an animal image, a plant image, or the like, and the specific situation may be determined according to the specific content of the video.
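As a minimal sketch of step 102 (not part of the patent, which does not prescribe any particular recognition algorithm), the per-frame acquisition of target objects can be modeled as mapping each frame to the set of object identities that a detector reports; the function name `get_target_objects` and the toy detector are our assumptions:

```python
# Hypothetical sketch: map each frame to the set of target-object identities
# a detector finds in it. In practice, `detector` could be a face, animal,
# or plant recognizer, depending on the video's content.

def get_target_objects(frames, detector):
    """Return, for each frame, the set of target objects it contains."""
    return [set(detector(frame)) for frame in frames]

# Toy stand-in: each "frame" here is simply a list of labels already known,
# so "detection" is the identity function.
frames = [["A", "B"], ["A"], []]
object_sets = get_target_objects(frames, lambda frame: frame)
print(object_sets)  # [{'A', 'B'}, {'A'}, set()]
```

The third frame yields an empty set, the case the embodiment later handles by grouping consecutive frames without any target object into their own segment.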
Step 103, dividing the continuous N frames of images into T video segments according to the target objects contained in each frame of images.
In this step, the acquired continuous N frames of images are divided into T video segments according to the target object in the images. Wherein T is an integer greater than 0.
The target object is related to the plot-development information in the video, and the plot development corresponding to different playing positions may relate to different target objects. Taking a movie as an example: suppose the currently played scene is a conversation between the female lead and the male lead, and after a while the scene jumps to a conversation between supporting actress No. 1 and supporting actress No. 2. In the former scene, no matter whether other actors appear, the female lead and the male lead are always present, and the development of the plot relates to them; the frame images in which the female lead and the male lead continuously appear can therefore be divided into one video segment. The latter scene relates to supporting actress No. 1 and supporting actress No. 2, and the female lead and the male lead do not appear in it, so the frame images in which the two supporting actresses continuously appear can be divided into another video segment. In summary, the common target object contained in a run of continuous frame images is likely to be key information, so the embodiment of the present invention divides the video into T video segments according to the relationship between the target objects contained in each frame image, making it convenient for the user to focus on key information.
And 104, when a playing control instruction for the video is detected, controlling the playing progress of the video to be switched among the T video segments by taking the video segments as units according to the playing control instruction.
After the video is divided, if a playing control instruction for the video (such as a fast-forward or fast-rewind operation instruction) is detected, the playing progress of the video is controlled to switch among the T video segments, in units of video segments, according to the playing control instruction. In this way, the user can avoid missing the plot related to the target object when adjusting the video progress, and the playing effect of the video is optimized. In addition, because the progress is adjusted in units of video segments, the accuracy of fast forwarding can be improved, the speed at which the user reaches important information can be increased, and a better use experience is provided for the user.
Specifically, in an embodiment of the present invention, the play control instruction includes: a fast forward operation instruction and a fast backward operation instruction.
When a fast-forward operation instruction is detected, the playing progress of the video is controlled to jump to the first frame image of the video segment after the currently played one; when a fast-rewind operation instruction is detected, the playing progress is controlled to jump to the first frame image of the video segment before the currently played one. This improves the accuracy of video progress adjustment, increases the speed at which the user reaches important information during fast-forward or fast-rewind operations, and provides a better use experience.
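The jump behavior described here can be sketched as follows, assuming segments are represented as `(start_frame, end_frame)` index pairs; the clamping at the first and last segment is our assumption, since the embodiment does not specify behavior at the ends:

```python
def jump_target(segments, current, instruction):
    """Return the frame index to resume playback from, given segment
    boundaries [(start, end), ...] and the index of the currently playing
    segment. Fast forward jumps to the first frame of the next segment;
    rewind jumps to the first frame of the previous one."""
    if instruction == "fast_forward":
        idx = min(current + 1, len(segments) - 1)  # clamp at last segment (assumed)
    elif instruction == "rewind":
        idx = max(current - 1, 0)                  # clamp at first segment (assumed)
    else:
        idx = current                              # unknown instruction: stay put
    return segments[idx][0]

segments = [(0, 9), (10, 19), (20, 29)]
print(jump_target(segments, 1, "fast_forward"))  # 20
print(jump_target(segments, 1, "rewind"))        # 0
```

Because the jump lands on a segment's first frame, the user always resumes at the start of a plot unit rather than at an arbitrary time offset.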
Further, as shown in fig. 2, when the consecutive N frames of images are divided into T video segments according to the target object contained in each frame of image in step 103, the following steps can be performed:
and step 1031, determining the first frame image containing the target object in the continuous N frame images as the first frame image of the first video segment.
In this step, the first image, in playing order, among the acquired consecutive N frames that contains the target object (i.e., the first frame image described in step 1031) is determined and set as the first frame image of the first video segment; at the same time, the target objects in this first frame image are detected and identified. For example, the frame image in which a face image appears for the first time is set as the first frame image of the first video segment containing that face image, and the faces in that frame image are recognized to determine the person corresponding to each face.
The first video segment is the earliest video segment containing the target object; that is, in chronological playing order, it is the first among all the video segments containing the target object.
Step 1032 determines a first set of all target objects contained in the first frame image.
This step determines, from the target objects in the first frame image, a first set formed by all the target objects contained in the first frame image. For example, if the first frame image includes a target object a and a target object B, a first set X formed by all the target objects included in the first frame image is { a, B }.
Step 1033, in M frames of images after the first frame of image, dividing the M frames of image and the first frame of image into a first video segment according to the relationship between the set formed by all the target objects contained in each frame of image and the first set.
That is, after the first frame image of the first video segment containing the target object is determined, the target objects in the next frame image adjacent to it (hereinafter, the second frame image) are detected and identified, and the relationship between the set formed by the target objects contained in the second frame image and the first set is judged from the recognition result. If the preset set relation is satisfied, the second frame image and the first frame image are divided into the same video segment, and the judgment continues with the relation between the set formed by the target objects in the third frame image, adjacent to the second, and the first set. This continues until the set formed by the target objects contained in the (M+1)th frame image after the first frame image no longer satisfies the preset set relation with the first set, at which point the first frame image and the M frames of images after it are divided into one video segment.
Wherein M is an integer greater than 0.
The preset set relation can be designed according to actual requirements.
Step 1034, sequentially dividing the images from the (M+1)th frame image after the first frame image to the Nth frame image of the continuous N frames of images into T-1 video segments according to the dividing mode of the first video segment.
Wherein, the dividing mode is as follows: and dividing the images with the preset set relation into the same video segment.
After the division of the first video segment is completed, the images from the (M+1)th frame image after the first frame image to the Nth frame image of the continuous N frames of images are sequentially divided into T-1 video segments according to the division method of the first video segment, so as to complete the division of the consecutive N frames of images into video segments. The Nth frame image of the continuous N frames of images is the last of those images.
Specifically, a second set formed by all target objects contained in the (M+1)th frame image is determined, and when the preset set relation does not hold between the second set and the first set, the (M+1)th frame image is determined as the first frame image of the second video segment. Among the P frame images after the (M+1)th frame image, the P frame images and the (M+1)th frame image are divided into the second video segment according to the relationship between the set formed by all the target objects contained in each frame image and the second set. The division of the remaining images is completed in the same manner as for the second video segment. Wherein P is an integer greater than 0.
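Steps 1031 to 1034 taken together can be sketched as a single pass over the per-frame target-object sets. This is a simplified reading that ignores the special handling of frames without any target object; the function and variable names are ours:

```python
def split_into_segments(object_sets, related):
    """Divide consecutive frames, given as per-frame sets of target objects,
    into segments. `related(first_set, current_set)` is the preset set
    relation: a frame joins the current segment while the relation holds
    with the segment's first frame; otherwise it opens a new segment."""
    segments = []  # list of (start_index, end_index) pairs
    for i, objs in enumerate(object_sets):
        if segments and related(object_sets[segments[-1][0]], objs):
            segments[-1] = (segments[-1][0], i)  # extend the current segment
        else:
            segments.append((i, i))              # frame i starts a new segment
    return segments

# Preferred relation from the embodiment: one set is a subset of the other.
subset_relation = lambda a, b: a <= b or b <= a

sets = [{"A", "B"}, {"A"}, {"C", "D"}, {"C"}]
print(split_into_segments(sets, subset_relation))  # [(0, 1), (2, 3)]
```

Frames 0 and 1 share the subset relation ({A} is a subset of {A, B}) and form one segment; frame 2 breaks the relation and opens the second segment.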
Preferably, in the embodiment of the present invention, the preset set relationship is a subset relationship. The subset relationship is: in the two images subjected to set relation comparison, the third set formed by all the target objects in the previous image is a subset of the fourth set formed by all the target objects in the next image, or the fourth set is a subset of the third set.
Generally, when the sets formed by the target objects in two frames of images stand in a subset relation, the two frames have a common target object, and the corresponding plot development should be closely related to that common target object; therefore, each video segment obtained by judging whether two frames stand in the subset relation has a key target object, which makes it convenient for the user to focus on key information. Of course, the subset relation also holds trivially when one of the sets is the empty set.
According to the embodiment of the invention, the video is divided according to the subset relation among the sets formed by the target objects in each frame of image, so that the plot with the relatively close association relation is divided into one video segment, thereby being convenient for a user to avoid missing the plot related to the target objects when the video progress is adjusted, and optimizing the playing effect of the video.
Based on the above description, step 1033 may specifically be: in the M frames of images after the first frame image, if the set formed by all the target objects contained in each frame image has a subset relation with the first set, the M frames of images and the first frame image are divided into the first video segment.
For example, suppose the target object is a face image, and face recognition determines that the faces in the first frame image are A, B, C and D, so that the corresponding set is S1 = {A, B, C, D}. If the faces in the second frame image are A, B and C, the corresponding set is S2 = {A, B, C}; S2 is a subset of S1, i.e. the subset relation holds between the set formed by the target objects contained in the second frame image and the first set, so the second frame image and the first frame image are divided into the same video segment. Alternatively, when S1 = {A, B, C, D} and S2 = {A, B, C, D, F, G}, two persons have been added in the second frame image compared with the first; now S1 is a subset of S2, and the two frames are likewise divided into the same video segment. When S2 is the empty set (i.e. the second frame image contains no target object), or S2 is an improper subset of S1 (i.e. the target objects in the second frame image are identical to those in the first frame image), the subset relation still holds, and the second frame image and the first frame image are again divided into the same video segment.
As another example, if S1 = {A, B, C, D} and S2 = {B, C, E}, then S2 and S1 have no subset relation, i.e. the set formed by the target objects contained in the second frame image and the first set do not satisfy the preset set relation, so the second frame image is set as the first frame image of another video segment.
Further, in the embodiment of the present invention, the preset set relationship may also be: a subset relation holds, and the difference between the number of elements in the set corresponding to the earlier frame and the number of elements in the set corresponding to the later frame, among the two frames being compared, is smaller than or equal to a preset difference value.
This preset condition not only keeps key target objects in view; by adding a limit on the difference in the number of target objects, it also prevents a divided video segment from containing so many frame images that the user's video progress adjustment is affected.
Based on the above description, step 1033 may specifically be: in the M frames of images after the first frame image, if the set formed by all the target objects contained in each frame image has a subset relation with the first set, and the number of elements in the set corresponding to each of the M frames of images meets a preset condition, the M frames of images and the first frame image are divided into the first video segment.
Wherein the preset conditions include: the difference value between the element number of the set corresponding to each frame image in the M frames of images and the element number of the first set is smaller than or equal to a preset threshold value.
It should be noted that the difference between the number of elements in the set corresponding to each of the M frames of images and the number of elements in the set corresponding to the first frame image is not restricted to one direction of subtraction: it may be the first frame's count subtracted from the later frame's count, or the later frame's count subtracted from the first frame's count. In other words, the absolute value of the difference between the two element counts must be smaller than or equal to the preset difference value, the difference being greater than or equal to 0.
Continuing with the above example, assume that the preset difference is 3. The set of persons corresponding to the faces in the first frame image is S1 = {A, B, C, D}, and the set of persons corresponding to the faces in the second frame image is S2 = {A, B}. S2 is a subset of S1, and the difference between the element counts of the two sets is 2, which is less than 3; the judgment condition is therefore met, and the second frame image and the first frame image are divided into the same video segment.
As another example, if S1 = {A, B, C, D} and S2 = {B, C, E}, then S2 and S1 have no subset relation, the judgment condition is not met, and the second frame image is set as the first frame image of another video segment.
As another example, if S1 = {A, B, C, E, D, F, G} and S2 = {B, C, E}, then S2 is a subset of S1, but the difference between the element counts of the two sets is 4, which is larger than 3; the judgment condition is not met, so the second frame image is set as the first frame image of another video segment.
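The extended preset relation used in these examples (subset relation plus an element-count limit) can be written as follows; the threshold of 3 matches the worked examples above, and the function name is ours:

```python
def subset_with_count_limit(first_set, current_set, max_diff=3):
    """Preset relation from this embodiment: the two sets stand in a subset
    relation AND the absolute difference of their element counts is at most
    `max_diff` (3 here, matching the worked examples; the value is a
    configurable preset difference)."""
    is_subset = first_set <= current_set or current_set <= first_set
    return is_subset and abs(len(first_set) - len(current_set)) <= max_diff

s1 = {"A", "B", "C", "D"}
print(subset_with_count_limit(s1, {"A", "B"}))       # True: subset holds, diff 2 <= 3
print(subset_with_count_limit(s1, {"B", "C", "E"}))  # False: no subset relation
print(subset_with_count_limit({"A", "B", "C", "D", "E", "F", "G"}, {"B", "C", "E"}))
# False: subset holds, but diff 4 > 3
```

Passing this function as the `related` predicate to a segmentation routine reproduces the three example outcomes above.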
Furthermore, in the embodiment of the present invention, consecutive S frame images that do not include the target object in consecutive N frame images may also be divided into the same video segment.
In a video segment not containing the target object, the first image that does not contain the target object is determined as the first frame image of that video segment.
Wherein S is an integer greater than 0.
For example, if, among the acquired consecutive N frames of images, at least one image not containing the target object precedes the first image that contains the target object (i.e., the first frame image in step 1031), the images not containing the target object may be divided into a video segment of their own, with the first of them set as the first frame image of that video segment.
It is understood that, of course, the images not containing the target object may instead be divided into the same video segment as the first frame image containing the target object, with the first of the images not containing the target object set as the first frame image of that video segment.
By the division mode, the video can be divided comprehensively, and the processing of partial frame images is prevented from being neglected.
Further, in the embodiment of the present invention, in order to avoid the playback time of a video segment being too long, the number of image frames of a video segment may be limited, that is, the maximum number of image frames of a video segment is set, and when the number of image frames of a video segment exceeds the maximum number of image frames during the division, the exceeding image frames are divided into another video segment.
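A minimal sketch of that cap follows. The representation of segments as frame-index pairs and the splitting into consecutive equal-sized pieces are our assumptions; the embodiment only states that frames beyond the maximum go into another segment:

```python
def cap_segment_length(segments, max_frames):
    """Split any segment longer than `max_frames` frames into consecutive
    pieces of at most `max_frames` frames each, preserving order."""
    capped = []
    for start, end in segments:
        while end - start + 1 > max_frames:
            capped.append((start, start + max_frames - 1))
            start += max_frames
        capped.append((start, end))
    return capped

print(cap_segment_length([(0, 9)], 4))  # [(0, 3), (4, 7), (8, 9)]
```

A 10-frame segment with a 4-frame cap becomes three segments, so no single segment exceeds the maximum playback length.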
In summary, the video playing control method provided by the embodiments of the present invention divides a video into a plurality of video segments based on the target objects in the video, and the content played in each video segment is related to the target object contained in that segment, so the user can avoid missing the plot related to the target object when adjusting the video progress, and the playing effect of the video is optimized. In addition, because the video progress is adjusted in units of video segments, the accuracy of fast forwarding can be improved, the speed at which the user reaches important information can be increased, and a better use experience is provided for the user.
According to another aspect of an embodiment of the present invention, a mobile terminal 300 is provided.
As shown in fig. 3, the mobile terminal 300 includes:
a first obtaining module 301, configured to obtain N consecutive images constituting a video.
The N consecutive images obtained in this step may be all images constituting the video, or may be partial images constituting the video, and the specific situation may be selected according to actual needs, which is not limited in this embodiment of the present invention.
Wherein N is an integer greater than 0.
A second obtaining module 302, configured to obtain a target object of each frame of image in the consecutive N frames of images obtained by the first obtaining module 301.
For the consecutive N frames of images acquired by the first acquisition module 301, the second acquisition module 302 identifies and confirms the target object for each frame of image therein.
The target object is generally an object closely related to plot development in the video. For example, if the video is a movie whose plot centers on human activity, the target object according to which the video is divided is a face image; if the video is a documentary about animals, the target object is an animal image; if the video is a documentary about plants, the target object is a plant image. Thus, in the embodiment of the present invention, the target object may be a face image, an animal image, a plant image, or the like, determined by the specific content of the video.
A dividing module 303, configured to divide the consecutive N frames of images into T video segments according to the target object included in each frame of image acquired by the second acquiring module 302.
The dividing module 303 divides the acquired N consecutive frames of images into T video segments according to the target object in the images.
Wherein T is an integer greater than 0.
The target object is related to plot development in the video, and different playing progress points may correspond to different target objects. Taking a movie as an example, suppose the currently played scene is a conversation between the leading actress and the leading actor, after which the scene jumps to a conversation between the first and second supporting actresses. In the former scene, regardless of whether other actors join, the leading actress and the leading actor are present throughout and the plot development relates to them, so the frame images in which the two leads continuously appear can be divided into one video segment. The latter scene relates to the two supporting actresses, with neither lead present, so the frame images in which the two supporting actresses continuously appear can be divided into another video segment. In short, a target object common to a run of consecutive frame images is likely to be key information, so the embodiment of the present invention divides the video into T video segments according to the relationship between the target objects contained in each frame image, making it easier for the user to focus on key information.
The control module 304 is configured to, when a playing control instruction for the video is detected, control the playing progress of the video to be switched among the T video segments divided by the dividing module in units of video segments according to the playing control instruction.
After the dividing module 303 finishes dividing the video, if a play control instruction for the video (such as a fast-forward or fast-rewind operation instruction) is detected, the control module 304 controls the playing progress of the video to be switched among the T video segments, taking the video segment as a unit, according to the play control instruction. In this way, the user can avoid missing plot related to the target object when adjusting the video progress, optimizing the playing effect of the video. In addition, because progress adjustments are made in units of video segments, the accuracy of fast forwarding is improved, the user can reach important information more quickly, and a better user experience is provided.
Further, as shown in fig. 4, the dividing module 303 includes:
a first determining unit 3031, configured to determine a first frame image containing a target object in N consecutive frame images as a first frame image of a first video segment.
The first determining unit 3031 determines, in playing order, the first image among the acquired N consecutive frames of images that contains the target object (i.e., the first frame image described in step 1021), sets it as the first frame image of the first video segment, and identifies the target object in that image. For example, the frame image in which a face image first appears is set as the first frame image of the first video segment containing the face image, and the faces in that frame are recognized to determine the person corresponding to each face.
The first video segment is the earliest video segment containing the target object; that is, in chronological playing order, it comes first among all the video segments containing the target object.
A second determining unit 3032 is configured to determine a first set of all target objects included in the first frame image.
Wherein the second determination unit 3032 determines the first set formed by all the target objects contained in the first frame image, based on the target objects in the first frame image. For example, if the first frame image includes a target object a and a target object B, a first set X formed by all the target objects included in the first frame image is { a, B }.
A first dividing unit 3033, configured to divide, in M frames of images subsequent to the first frame of image, the M frames of image and the first frame of image into a first video segment according to a relationship between a set formed by all target objects included in each frame of image and the first set.
That is, after the first frame image of the first video segment containing the target object is determined, the first dividing unit 3033 identifies the target object in the next frame image adjacent to the first frame image (hereinafter the second frame image) and judges the relationship between the set formed by the target objects contained in the second frame image and the first set. If the preset set relationship is satisfied, the second frame image and the first frame image are divided into the same video segment; the unit then continues by judging the relationship between the set formed by the target objects contained in the third frame image, adjacent to the second frame image, and the first set, and so on, until the set formed by the target objects contained in the (M+1)th frame image after the first frame image does not satisfy the preset set relationship with the first set, at which point the first frame image and the M frame images after it are divided into one video segment.
Wherein M is an integer greater than 0.
The preset set relation can be designed according to actual requirements.
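The growth of the first video segment described above can be sketched as a loop over per-frame target-object sets with a pluggable preset set relation. The names and the one-set-per-frame representation are illustrative assumptions, not the embodiment's actual implementation.

```python
# Hypothetical sketch: extend the first video segment frame by frame while the
# preset set relation holds between each frame's set and the first frame's set.
def divide_first_segment(frame_sets, related):
    """frame_sets: per-frame sets of target objects; frame_sets[0] belongs to
    the first frame image of the segment. 'related' is the preset set relation.
    Returns the number of frames in the first segment (1 + M)."""
    first = frame_sets[0]
    m = 0
    for s in frame_sets[1:]:
        if not related(first, s):
            break  # the (M+1)th frame starts the next segment
        m += 1
    return m + 1
```

With a subset-style relation, the segment stops growing at the first frame whose set no longer satisfies the relation with the first set.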
The second dividing unit 3034 is configured to sequentially divide the images from the (M+1)th frame image after the first frame image through the Nth frame image of the N consecutive frame images into T-1 video segments according to the division manner of the first video segment.
Wherein, the dividing mode is as follows: and dividing the images with the preset set relation into the same video segment.
After completing the division of the first video segment, the second dividing unit 3034 sequentially divides the images from the (M+1)th frame image after the first frame image through the Nth frame image into T-1 video segments, following the division method of the first video segment, thereby completing the division of the N consecutive frame images into video segments. The Nth frame image is the last of the N consecutive frame images.
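Repeating that division method over the remaining frames yields the full partition into T video segments. The greedy loop below is a sketch under the same assumed representation (one target-object set per frame), with illustrative names.

```python
# Hypothetical sketch: partition all N frames into T segments by restarting a
# segment whenever a frame's set fails the preset relation with the current
# segment's first-frame set.
def divide_segments(frame_sets, related):
    segments, current = [], []
    for s in frame_sets:
        if current and related(current[0], s):
            current.append(s)  # same segment as its first frame image
        else:
            if current:
                segments.append(current)
            current = [s]  # this frame becomes a new segment's first frame
    if current:
        segments.append(current)
    return segments
```

Each returned segment begins with its first frame image, and the number of segments is T.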
Further, as shown in fig. 4, the second dividing unit 3034 includes:
a first determining subunit 30341, configured to determine a second set formed by all target objects included in the M +1 th frame image.
A second determining subunit 30342, configured to determine the (M+1)th frame image as the first frame image of the second video segment when the second set does not have the preset set relationship with the first set.
A first dividing subunit 30343, configured to, in the P frame images after the (M+1)th frame image, divide the P frame images and the (M+1)th frame image into the second video segment according to the relationship between the set formed by all the target objects contained in each frame image and the second set.
In this way, after the division of the first video segment is completed, the images from the (M+1)th frame image after the first frame image through the Nth frame image are sequentially divided into T-1 video segments according to the division method of the first video segment.
Wherein P is an integer greater than 0.
Further, as shown in fig. 4, the first dividing unit 3033 includes:
a second dividing subunit 30331, configured to, in M frames of images subsequent to the first frame of image, divide the M frames of images and the first frame of image into the first video segment if a subset relationship exists between the first set and a set formed by all target objects included in each frame of image.
Preferably, in the embodiment of the present invention, the preset set relationship is a subset relationship: of the two images being compared, the third set formed by all the target objects in the earlier image is a subset of the fourth set formed by all the target objects in the later image, or the fourth set is a subset of the third set.
Generally, when the sets formed by the target objects in two frame images have a subset relationship, the two images share a common target object, and the corresponding plot development should be closely related to that common target object; therefore, by dividing segments according to whether the subset relationship holds, each divided video segment has a key target object, which helps the user focus on key information. Note that the subset relationship also holds when one of the sets is empty.
For example, suppose the target object is a face image, and face recognition determines the face images in the first frame image to be A, B, C, and D, forming the set S1 = {A, B, C, D}, while the face images in the second frame image are A, B, and C, forming the set S2 = {A, B, C}. Then S2 is a subset of S1, that is, the subset relationship exists between the set formed by the target objects contained in the second frame image and the first set, so the second frame image and the first frame image are divided into the same video segment. Likewise, when S1 = {A, B, C, D} and S2 = {A, B, C, D, F, G}, two persons have been added in the second frame image compared with the first, and S1 is now a subset of S2, so the two images are again divided into the same video segment. When S2 is the empty set (i.e., the second frame image contains no target object), or S2 is an improper subset of S1 (i.e., the target objects in the second frame image are identical to those in the first), the subset relationship still holds, and the second frame image and the first frame image are likewise divided into the same video segment.
As another example, when S1 = {A, B, C, D} and S2 = {B, C, E}, S2 and S1 have no subset relationship, that is, the set formed by the target objects contained in the second frame image and the first set do not satisfy the subset relationship, so the second frame image is set as the first frame image of another video segment.
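The subset relation of this embodiment, including the empty-set and improper-subset cases discussed above, reduces to a one-line check in Python's set algebra; the function name is an assumption for the sketch.

```python
# Hypothetical sketch of the preset subset relation: true when either set is a
# subset of the other (empty and equal sets therefore qualify).
def subset_related(s1, s2):
    return s1 <= s2 or s2 <= s1
```

So {A, B, C} and {A, B, C, D} satisfy the relation, while {B, C, E} and {A, B, C, D} do not.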
According to the embodiment of the present invention, the video is divided according to the subset relationship among the sets formed by the target objects in each frame image, so that closely related plot is grouped into one video segment; this makes it easier for the user to avoid missing plot related to the target object when adjusting the video progress, and optimizes the playing effect of the video.
Further, as shown in fig. 4, the first dividing unit 3033 includes:
a third dividing subunit 30332, configured to, in M frames of images subsequent to the first frame of image, divide the M frames of images and the first frame of image into the first video segment if a subset relationship exists between a set formed by all target objects included in each frame of image and the first set, and the number of elements in the set corresponding to each frame of image in the M frames of images satisfies a preset condition.
Wherein the preset conditions include: the difference value between the element number of the set corresponding to each frame image in the M frames of images and the element number of the first set is smaller than or equal to a preset threshold value.
In the embodiment of the present invention, the preset set relationship may also be: a subset relationship exists, and the difference between the numbers of elements in the sets corresponding to the two compared frame images is less than or equal to a preset difference.
This preset condition not only takes the key target objects into account but, by limiting the difference in the number of target objects, also prevents a divided video segment from containing so many frame images that the user's progress adjustment is affected.
It should be noted that the difference between the number of elements in the set corresponding to each of the M frame images and the number of elements in the first set may be computed in either direction (each minus the other); equivalently, the absolute value of this difference, which is always greater than or equal to 0, must be less than or equal to the preset difference.
Continuing with the above example, assume the preset difference is 3. The set of persons corresponding to the faces in the first frame image is S1 = {A, B, C, D}, and the set of persons corresponding to the faces in the second frame image is S2 = {A, B}. S2 is a subset of S1, and the difference between the numbers of elements in the two sets is 2, which is less than 3, so the judgment condition is met and the second frame image and the first frame image are divided into the same video segment.
As another example, when S1 = {A, B, C, D} and S2 = {B, C, E}, S2 and S1 have no subset relationship, the judgment condition is not met, and the second frame image is set as the first frame image of another video segment.
As another example, when S1 = {A, B, C, E, D, F, G} and S2 = {B, C, E}, S2 is a subset of S1, but the difference between the numbers of elements in the two sets is 4, which is greater than 3; the judgment condition is not met, so the second frame image is set as the first frame image of another video segment.
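The stricter relation of this variant (a subset relationship plus an element-count difference no greater than a preset threshold) can be sketched as follows. The default threshold of 3 matches the worked examples above; the function name and default are illustrative assumptions.

```python
# Hypothetical sketch: subset relation AND the absolute difference between the
# two sets' element counts must not exceed the preset threshold.
def related_with_limit(s1, s2, max_diff=3):
    subset = s1 <= s2 or s2 <= s1
    return subset and abs(len(s1) - len(s2)) <= max_diff
```

This reproduces the three examples: {A, B} passes against {A, B, C, D}, while {B, C, E} fails against {A, B, C, D} (no subset) and against a 7-element superset (difference 4 > 3).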
Further, as shown in fig. 4, the dividing module 303 includes:
a third dividing unit 3035, configured to divide consecutive S frame images, which do not include the target object, in consecutive N frame images, into the same video segment.
In a video segment that does not contain the target object, the first image not containing the target object is determined as the first frame image of that video segment; wherein S is an integer greater than 0.
For example, if, among the acquired N consecutive frame images, at least one frame image not containing the target object precedes the first frame image containing the target object (i.e., the first frame image), those images may be divided into a separate video segment, with the first of them set as that segment's first frame image.
It is understood that the images not containing the target object and the first frame image containing the target object may alternatively be divided into the same video segment, with the first of the images not containing the target object set as the first frame image of that video segment.
This division method ensures that the video is divided comprehensively and that no frame images are left unprocessed.
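Grouping the consecutive frames that contain no target object into their own segments, as described, can be sketched as a scan for empty-set runs. The names and the (start, length) return format are assumptions for illustration.

```python
# Hypothetical sketch: each maximal run of consecutive frames whose
# target-object set is empty forms one video segment; the run's first frame is
# that segment's first frame image.
def group_empty_runs(frame_sets):
    runs, start = [], None
    for i, s in enumerate(frame_sets):
        if not s:                      # frame contains no target object
            if start is None:
                start = i              # a new run begins here
        elif start is not None:
            runs.append((start, i - start))
            start = None
    if start is not None:              # run extends to the last frame
        runs.append((start, len(frame_sets) - start))
    return runs
```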
Further, in the embodiment of the present invention, the play control instruction includes: a fast forward operation instruction and a fast backward operation instruction.
As shown in fig. 4, the control module 304 includes:
the first control unit 3041 is configured to, when the fast forward operation instruction is detected, control the playing progress of the video to jump to a first frame image of a video segment subsequent to a currently playing video segment for playing.
The second control unit 3042 is configured to, when the fast-rewinding operation instruction is detected, control the playing progress of the video to jump to a first frame image of a video segment before a currently playing video segment for playing.
Therefore, when the user carries out fast forward or fast backward operation, the accuracy of video progress adjustment can be improved, the speed of paying attention to important information by the user can be increased, and better use experience is provided for the user.
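The fast-forward and fast-rewind behavior of the two control units, jumping to the first frame image of the adjacent segment, can be sketched as below. Segment boundaries are assumed to be given as sorted first-frame indices; the names are illustrative, not from the embodiment.

```python
import bisect

# Hypothetical sketch: jump the playing progress to the first frame image of
# the next segment (fast forward) or the previous segment (fast rewind).
def jump(segment_starts, position, fast_forward=True):
    """segment_starts: sorted frame indices of each segment's first frame."""
    i = bisect.bisect_right(segment_starts, position) - 1  # current segment
    i = min(i + 1, len(segment_starts) - 1) if fast_forward else max(i - 1, 0)
    return segment_starts[i]
```

Clamping at the first and last segment keeps the progress within the video when the user fast-forwards past the final segment or rewinds before the first.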
In summary, the mobile terminal provided in the embodiments of the present invention divides a video into a plurality of video segments based on a target object in the video. Specifically, N consecutive images forming a video are acquired by the first acquiring module 301, a target object of each image in the N consecutive images is acquired by the second acquiring module 302, the N consecutive images are divided into T video segments by the dividing module 303 according to the target object included in each image acquired by the second acquiring module 302, and when a playing control instruction for the video is detected, the control module 304 controls the playing progress of the video to be switched among the T video segments divided by the dividing module according to the playing control instruction, taking the video segment as a unit.
The content played by each divided video segment is closely related to the target object contained in the video segment, so that when the video progress is adjusted, a user can avoid missing the plot related to the target object, and the playing effect of the video is optimized. In addition, when the user adjusts the video progress, the user adjusts the video progress according to the video segment, so that the accuracy of fast forwarding can be improved, the speed of paying attention to important information by the user can be increased, and better use experience is provided for the user.
In accordance with another aspect of an embodiment of the present invention, a mobile terminal 500 is provided.
As shown in fig. 5, the mobile terminal 500 includes: at least one processor 501, memory 502, at least one network interface 504, and a user interface 503. The various components in the mobile terminal 500 are coupled together by a bus system 505. It is understood that the bus system 505 is used to enable connection communications between these components. The bus system 505 includes a power bus, a control bus, and a status signal bus in addition to a data bus. For clarity of illustration, however, the various buses are labeled as bus system 505 in FIG. 5.
The user interface 503 may include, among other things, a display, a keyboard, or a pointing device (e.g., a mouse, trackball, touch pad, or touch screen).
It is to be understood that the memory 502 in embodiments of the present invention may be volatile memory or non-volatile memory, or may include both. The non-volatile memory may be Read-Only Memory (ROM), Programmable ROM (PROM), Erasable PROM (EPROM), Electrically Erasable PROM (EEPROM), or flash memory. The volatile memory may be Random Access Memory (RAM), which acts as an external cache. By way of example and not limitation, many forms of RAM are available, such as Static RAM (SRAM), Dynamic RAM (DRAM), Synchronous DRAM (SDRAM), Double Data Rate SDRAM (DDR SDRAM), Enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), and Direct Rambus RAM (DRRAM). The memory 502 of the systems and methods described herein is intended to comprise, without being limited to, these and any other suitable types of memory.
In some embodiments, memory 502 stores elements, executable modules or data structures, or a subset thereof, or an expanded set thereof as follows: an operating system 5021 and application programs 5022.
The operating system 5021 includes various system programs, such as a framework layer, a core library layer, a driver layer, and the like, and is used for implementing various basic services and processing hardware-based tasks. The application 5022 includes various applications, such as a Media Player (Media Player), a Browser (Browser), and the like, for implementing various application services. The program for implementing the method according to the embodiment of the present invention may be included in the application program 5022.
In the embodiment of the present invention, by calling a program or an instruction stored in the memory 502, specifically, a program or an instruction stored in the application 5022, the processor 501 is configured to obtain N consecutive frames of images constituting a video, obtain a target object of each frame of image in the N consecutive frames of images, and divide the N consecutive frames of images into T video segments according to the target object contained in each frame of image; when a playing control instruction for the video is detected, the processor controls the playing progress of the video to be switched among the T video segments, taking the video segment as a unit, according to the playing control instruction; wherein N is an integer greater than 0, and T is an integer greater than 0.
The method disclosed by the above embodiments of the present invention may be applied to, or implemented by, the processor 501. The processor 501 may be an integrated circuit chip having signal processing capability. In implementation, the steps of the above method may be performed by integrated logic circuits of hardware or by instructions in the form of software in the processor 501. The processor 501 may be a general-purpose processor, a Digital Signal Processor (DSP), an Application-Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or discrete hardware components, and may implement or perform the methods, steps, and logic blocks disclosed in the embodiments of the present invention. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like. The steps of the method disclosed in connection with the embodiments of the present invention may be directly embodied as being executed by a hardware decoding processor, or executed by a combination of hardware and software modules in a decoding processor. The software module may be located in a storage medium well known in the art, such as RAM, flash memory, ROM, PROM, EPROM, or registers. The storage medium is located in the memory 502, and the processor 501 reads the information in the memory 502 and completes the steps of the above method in combination with its hardware.
It is to be understood that the embodiments described herein may be implemented in hardware, software, firmware, middleware, microcode, or any combination thereof. For a hardware implementation, the Processing units may be implemented within one or more Application Specific Integrated Circuits (ASICs), Digital Signal Processors (DSPs), Digital Signal Processing Devices (DSPDs), Programmable Logic Devices (PLDs), Field Programmable Gate Arrays (FPGAs), general purpose processors, controllers, micro-controllers, microprocessors, other electronic units configured to perform the functions described herein, or a combination thereof.
For a software implementation, the techniques described herein may be implemented with modules (e.g., procedures, functions, and so on) that perform the functions described herein. The software codes may be stored in a memory and executed by a processor. The memory may be implemented within the processor or external to the processor.
Optionally, the processor 501 is further configured to: determining a first frame image containing a target object in the continuous N frame images as a first frame image of a first video segment; determining a first set formed by all target objects contained in the first frame image; in M frames of images after the first frame of image, dividing the M frames of images and the first frame of image into a first video segment according to the relation between a set formed by all target objects contained in each frame of image and the first set; sequentially dividing the M +1 frame image after the first frame image to the N frame image in the continuous N frame images into T-1 video segments according to the dividing mode of the first video segment; wherein M is an integer greater than 0, and the division mode is as follows: and dividing the images with the preset set relation into the same video segment.
Optionally, the processor 501 is further configured to: determining a second set formed by all target objects contained in the M +1 frame image; when the second set and the first set do not have a preset set relationship, determining the M +1 frame image as the first frame image of the second video segment; in the P frame images after the M +1 frame image, dividing the P frame image and the M +1 frame image into a second video segment according to the relation between a set formed by all target objects contained in each frame image and a second set; wherein P is an integer greater than 0.
Optionally, the processor 501 is further configured to: in M frames of images after the first frame of image, if a set formed by all target objects contained in each frame of image has a subset relation with the first set, dividing the M frames of images and the first frame of image into a first video segment.
Optionally, the processor 501 is further configured to: in M frames of images after the first frame of image, if a subset relation exists between a set formed by all target objects contained in each frame of image and the first set, and the number of elements in the set corresponding to each frame of image in the M frames of image meets a preset condition, dividing the M frames of image and the first frame of image into a first video segment. Wherein the preset conditions include: the difference value between the element number of the set corresponding to each frame image in the M frames of images and the element number of the first set is smaller than or equal to a preset threshold value.
Optionally, the processor 501 is further configured to: dividing continuous S frame images which do not contain the target object in the continuous N frame images into the same video segment; in a video segment which does not contain a target object, determining an image of a first frame which does not contain the target object as a first frame image of the video segment; wherein S is an integer greater than 0.
Optionally, the playback control instruction includes: a fast forward operation instruction and a fast backward operation instruction; the processor 501 is further configured to: when a fast forward operation instruction is detected, controlling the playing progress of the video to jump to the first frame image of the next video segment of the currently played video segment for playing; and when the fast-backward operation instruction is detected, controlling the playing progress of the video to jump to the first frame image of the previous video segment of the currently played video segment for playing.
The mobile terminal 500 can implement the processes implemented by the mobile terminal in the foregoing embodiments, and in order to avoid repetition, the detailed description is omitted here.
The mobile terminal 500 provided by the embodiment of the present invention divides the video into a plurality of video segments based on the target object in the video, and the content played by each video segment is closely related to the target object included in the video segment, so that when the user performs video schedule adjustment, the user can avoid missing the plot related to the target object, and the playing effect of the video is optimized. In addition, when the user adjusts the video progress, the user adjusts the video progress according to the video segment, so that the accuracy of fast forwarding can be improved, the speed of paying attention to important information by the user can be increased, and better use experience is provided for the user.
In accordance with another aspect of an embodiment of the present invention, a mobile terminal 600 is provided.
The mobile terminal 600 may be a mobile phone, a tablet computer, a Personal Digital Assistant (PDA), or a vehicle-mounted computer.
As shown in fig. 6, the mobile terminal 600 includes a Radio Frequency (RF) circuit 601, a memory 602, an input unit 603, a display unit 604, a processor 606, an audio circuit 607, a WiFi (Wireless Fidelity) module 608, and a power supply 609.
Among other things, the input unit 603 may be used to receive numeric or character information input by a user and to generate signal inputs related to user settings and function control of the mobile terminal 600. Specifically, in the embodiment of the present invention, the input unit 603 may include a touch panel 6031. The touch panel 6031, also referred to as a touch screen, may collect touch operations of a user on or near it (e.g., operations performed by the user on the touch panel 6031 with a finger, a stylus, or any other suitable object or accessory) and drive the corresponding connection device according to a preset program. Optionally, the touch panel 6031 may include two parts: a touch detection device and a touch controller. The touch detection device detects the touch position of the user, detects the signal produced by the touch operation, and transmits the signal to the touch controller; the touch controller receives the touch information from the touch detection device, converts it into touch point coordinates, and sends them to the processor 606, and can also receive and execute commands from the processor 606. In addition, the touch panel 6031 may be implemented in various types, such as resistive, capacitive, infrared, and surface acoustic wave. In addition to the touch panel 6031, the input unit 603 may further include other input devices 6032, which may include, but are not limited to, one or more of a physical keyboard, function keys (such as volume control keys and switch keys), a trackball, a mouse, a joystick, and the like.
Among other things, the display unit 604 may be used to display information input by the user or information provided to the user and various menu interfaces of the mobile terminal 600. The display unit 604 may include a display panel 6041, and the display panel 6041 may be configured in the form of an LCD or an Organic Light-Emitting Diode (OLED).
It should be noted that the touch panel 6031 may cover the display panel 6041 to form a touch display screen. When the touch display screen detects a touch operation on or near it, the touch operation is transmitted to the processor 606 to determine the type of the touch event, and the processor 606 then provides a corresponding visual output on the touch display screen according to the type of the touch event.
The touch display screen comprises an application program interface display area and a common control display area. The arrangement of the two display areas is not limited, and may be any arrangement that distinguishes them, such as a top-bottom or left-right arrangement. The application program interface display area may be used to display the interface of an application. Each interface may contain at least one interface element, such as an application icon and/or a widget desktop control. The application program interface display area may also be an empty interface containing no content. The common control display area is used to display controls with a high usage rate, such as setting buttons, interface numbers, scroll bars, and frequently used application icons such as the phone book icon.
The processor 606 is the control center of the mobile terminal 600. It connects the various parts of the entire mobile terminal using various interfaces and lines, and performs the various functions of the mobile terminal 600 and processes data by running or executing software programs and/or modules stored in the first memory 6021 and calling data stored in the second memory 6022, thereby monitoring the mobile terminal 600 as a whole. Optionally, the processor 606 may include one or more processing units.
In the embodiment of the present invention, the processor 606 is configured, by calling a software program and/or a module stored in the first memory 6021 and/or data stored in the second memory 6022, to obtain consecutive N frames of images constituting a video and a target object of each frame of image in the consecutive N frames of images, and to divide the consecutive N frames of images into T video segments according to the target object contained in each frame of image; when a playing control instruction for the video is detected, to control the playing progress of the video to be switched among the T video segments by taking the video segments as units according to the playing control instruction; wherein N is an integer greater than 0, and T is an integer greater than 0.
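The segmentation flow in this paragraph can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: each frame is assumed to have already been reduced (e.g. by object detection) to the set of target-object identifiers it contains, plain set equality stands in for the "preset set relation", and the names `divide_into_segments` and `related` are hypothetical.

```python
# Hypothetical sketch: consecutive frames are grouped into the same video
# segment while each frame's object set keeps a preset relation with the
# object set of the segment's first frame.
def divide_into_segments(frame_objects, related):
    """frame_objects: one set of target-object IDs per frame.
    related(first_set, current_set): the preset set relation."""
    segments = []  # each segment is a list of consecutive frame indices
    for i, objs in enumerate(frame_objects):
        # compare against the set of the current segment's first frame
        if segments and related(frame_objects[segments[-1][0]], objs):
            segments[-1].append(i)  # frame stays in the current segment
        else:
            segments.append([i])  # this frame opens a new segment
    return segments  # T = len(segments)
```

With equality as the relation, five frames whose object sets are `{1}, {1}, {2}, {2}, {2}` yield T = 2 segments.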
Optionally, the processor 606 is further configured to: determine a first frame image containing a target object in the continuous N frame images as a first frame image of a first video segment; determine a first set formed by all target objects contained in the first frame image; in M frames of images after the first frame of image, divide the M frames of images and the first frame of image into the first video segment according to the relation between a set formed by all target objects contained in each frame of image and the first set; and sequentially divide the M +1 frame image after the first frame image to the N frame image in the continuous N frame images into T-1 video segments according to the dividing mode of the first video segment; wherein M is an integer greater than 0, and the dividing mode is as follows: dividing images having a preset set relation into the same video segment.
Optionally, the processor 606 is further configured to: determining a second set formed by all target objects contained in the M +1 frame image; when the second set and the first set do not have a preset set relationship, determining the M +1 frame image as the first frame image of the second video segment; in the P frame images after the M +1 frame image, dividing the P frame image and the M +1 frame image into a second video segment according to the relation between a set formed by all target objects contained in each frame image and a second set; wherein P is an integer greater than 0.
Optionally, the processor 606 is further configured to: in M frames of images after the first frame of image, if a set formed by all target objects contained in each frame of image has a subset relation with the first set, dividing the M frames of images and the first frame of image into a first video segment.
Optionally, the processor 606 is further configured to, in M frames of images subsequent to the first frame of image, divide the M frames of image and the first frame of image into the first video segment if a subset relationship exists between a set formed by all target objects included in each frame of image and the first set, and the number of elements in the set corresponding to each frame of image in the M frames of image satisfies a preset condition. Wherein the preset conditions include: the difference value between the element number of the set corresponding to each frame image in the M frames of images and the element number of the first set is smaller than or equal to a preset threshold value.
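The preset condition above can be sketched as a membership predicate: a subset relation must hold (in either direction, consistent with the definition in claim 3), and the element-count difference between the two sets must not exceed a preset threshold. The default threshold value and the name `in_same_segment` are assumptions for illustration only.

```python
# Hypothetical sketch of the "preset set relation plus preset condition"
# check described above.
def in_same_segment(first_set, frame_set, threshold=1):
    """first_set: target objects of the segment's first frame image.
    frame_set: target objects of the frame being tested.
    threshold: preset upper bound on the element-count difference."""
    # subset relation in either direction (per the subset-relation definition)
    subset_ok = frame_set <= first_set or first_set <= frame_set
    # element-count difference no greater than the preset threshold
    count_ok = abs(len(frame_set) - len(first_set)) <= threshold
    return subset_ok and count_ok
```

So a frame that loses one target object still belongs to the segment, while a frame that loses two (with threshold 1) or gains an unrelated object starts a new segment.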
Optionally, the processor 606 is further configured to: dividing continuous S frame images which do not contain the target object in the continuous N frame images into the same video segment; in a video segment which does not contain a target object, determining an image of a first frame which does not contain the target object as a first frame image of the video segment; wherein S is an integer greater than 0.
Optionally, the playback control instruction includes: a fast forward operation instruction and a fast backward operation instruction; processor 606 is also configured to: when a fast forward operation instruction is detected, controlling the playing progress of the video to jump to the first frame image of the next video segment of the currently played video segment for playing; and when the fast-backward operation instruction is detected, controlling the playing progress of the video to jump to the first frame image of the previous video segment of the currently played video segment for playing.
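The fast-forward and fast-backward behaviour above reduces to a jump to the first frame of the adjacent video segment. A sketch under stated assumptions: segments are the index lists produced by a segmentation step, and clamping at the first and last segment is an assumption, since the paragraph does not specify boundary behaviour.

```python
# Hypothetical sketch of segment-wise playback switching.
def jump_target(segments, current_index, instruction):
    """segments: list of segments, each a list of frame indices.
    Returns the frame index playback should jump to."""
    # locate the segment that contains the currently played frame
    seg = next(k for k, s in enumerate(segments) if current_index in s)
    if instruction == "fast_forward":
        k = min(seg + 1, len(segments) - 1)  # clamp at the last segment
    elif instruction == "fast_backward":
        k = max(seg - 1, 0)  # clamp at the first segment
    else:
        raise ValueError("unknown play control instruction")
    return segments[k][0]  # first frame image of the chosen segment
```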
As can be seen, the mobile terminal 600 provided in the embodiment of the present invention divides the video into a plurality of video segments according to the target objects in the video, and the content played in each video segment is related to the target objects it contains, so that when the user adjusts the video progress, the user avoids missing plot points related to the target objects, optimizing the playing effect of the video. In addition, because the progress is adjusted in units of video segments, the accuracy of fast forwarding is improved, the user reaches important information faster, and a better use experience is provided for the user.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described systems, apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the units is only one logical division, and other divisions may be realized in practice, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program codes, such as a USB flash drive, a removable hard disk, a ROM, a RAM, a magnetic disk, or an optical disk.
The above description is only for the specific embodiments of the present invention, but the scope of the present invention is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present invention, and all the changes or substitutions should be covered within the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (12)

1. A video playback control method, comprising:
acquiring continuous N frames of images forming a video;
acquiring a target object of each frame of image in the continuous N frames of images;
dividing the continuous N frames of images into T video segments according to the target object contained in each frame of image;
when a playing control instruction for the video is detected, controlling the playing progress of the video to be switched among the T video segments by taking the video segments as units according to the playing control instruction;
wherein N is an integer greater than 0, and T is an integer greater than 0;
the step of dividing the N consecutive frames of images into T video segments according to the target object contained in each frame of image comprises:
determining a first frame image containing a target object in the continuous N frame images as a first frame image of a first video segment;
determining a first set of all target objects contained in the first frame image;
in M frames of images after the first frame of image, dividing the M frames of images and the first frame of image into the first video segment according to the relation between the first set and a set formed by all target objects contained in each frame of image;
sequentially dividing the M +1 frame image after the first frame image to the N frame image in the continuous N frame images into T-1 video segments according to the dividing mode of the first video segment;
wherein M is an integer greater than 0, and the division mode is as follows: dividing the images with a preset set relation into the same video segment;
the step of dividing, in M frames of images subsequent to the first frame of image, the M frames of images and the first frame of image into the first video segment according to a relationship between a set formed by all target objects included in each frame of image and the first set, includes:
in M frames of images after the first frame of image, if a subset relation exists between a set formed by all target objects contained in each frame of image and the first set, and the number of elements in the set corresponding to each frame of image in the M frames of images meets a preset condition, dividing the M frames of images and the first frame of image into the first video segment;
wherein the preset conditions include: the difference value between the number of elements of the set corresponding to each frame image in the M frames of images and the number of elements of the first set is smaller than or equal to a preset threshold value;
if the number of image frames of one video segment exceeds the preset maximum number of image frames, the excess frame images are divided into another video segment.
2. The video playback control method according to claim 1, wherein the step of sequentially dividing the M +1 th frame of image after the first frame of image to the N th frame of image among the consecutive N frames of image into T-1 video segments according to the division manner of the first video segment comprises:
determining a second set formed by all target objects contained in the M +1 frame image;
when the preset set relation does not exist between the second set and the first set, determining the M +1 frame image as a first frame image of a second video segment;
in the P frame images after the M +1 frame image, dividing the P frame image and the M +1 frame image into the second video segment according to the relation between the set formed by all target objects contained in each frame image and the second set;
wherein P is an integer greater than 0.
3. The video playback control method according to claim 1, wherein the subset relation comprises: in the two images subjected to set relation comparison, the third set formed by all the target objects in the previous image is a subset of the fourth set formed by all the target objects in the next image, or the fourth set is a subset of the third set.
4. The video playback control method according to claim 1, wherein said step of dividing said N consecutive frames of images into T video segments according to said target object contained in each frame of image comprises:
dividing continuous S frame images which do not contain the target object in the continuous N frame images into the same video segment;
in the video segment without the target object, determining an image of a first frame without the target object as a first frame image of the video segment;
wherein S is an integer greater than 0.
5. The video playback control method according to claim 1, wherein the playback control instruction includes: a fast forward operation instruction and a fast backward operation instruction;
the step of controlling the playing progress of the video to be switched among the T video segments by taking the video segments as units according to the playing control instruction when the playing control instruction for the video is detected includes:
when a fast forward operation instruction is detected, controlling the playing progress of the video to jump to the first frame image of the next video segment of the currently played video segment for playing;
and when a fast-backward operation instruction is detected, controlling the playing progress of the video to jump to the first frame image of the previous video segment of the currently played video segment for playing.
6. The video playback control method according to claim 1, wherein the target object is a face image, an animal image, or a plant image.
7. A mobile terminal, comprising:
the first acquisition module is used for acquiring continuous N frames of images forming a video;
the second acquisition module is used for acquiring a target object of each frame of image in the continuous N frames of images acquired by the first acquisition module;
the dividing module is used for dividing the continuous N frames of images into T video segments according to the target object contained in each frame of image acquired by the second acquiring module;
the control module is used for controlling the playing progress of the video to be switched among the T video segments divided by the dividing module by taking the video segments as units according to the playing control instruction when the playing control instruction of the video is detected;
wherein N is an integer greater than 0, and T is an integer greater than 0;
the dividing module includes:
a first determining unit, configured to determine a first frame image containing a target object in the consecutive N frame images as a first frame image of a first video segment;
a second determination unit configured to determine a first set formed by all target objects included in the first frame image;
a first dividing unit, configured to divide, in M frames of images subsequent to the first frame of image, the M frames of images and the first frame of image into the first video segment according to a relationship between a set formed by all target objects included in each frame of image and the first set;
a second dividing unit, configured to sequentially divide an M +1 th frame image after the first frame image to an nth frame image in the consecutive N frame images into T-1 video segments according to a dividing manner of the first video segment;
wherein M is an integer greater than 0, and the division mode is as follows: dividing the images with a preset set relation into the same video segment;
the first division unit includes:
a third dividing subunit, configured to, in M frames of images subsequent to the first frame of image, divide the M frames of images and the first frame of image into the first video segment if a subset relationship exists between a set formed by all target objects included in each frame of image and the first set, and a number of elements in a set corresponding to each frame of image in the M frames of images satisfies a preset condition;
wherein the preset conditions include: the difference value between the number of elements of the set corresponding to each frame image in the M frames of images and the number of elements of the first set is smaller than or equal to a preset threshold value;
the mobile terminal is further configured to:
if the number of image frames of a video segment exceeds a preset maximum number of image frames, the excess frame images are divided into another video segment.
8. The mobile terminal of claim 7, wherein the second dividing unit comprises:
a first determining subunit, configured to determine a second set formed by all target objects included in the M +1 th frame image;
a second determining subunit, configured to determine, when the preset set relationship does not exist between the second set and the first set, the M +1 th frame image as a first frame image of a second video segment;
a first dividing unit, configured to divide, in a P-frame image subsequent to the M +1 th frame image, the P-frame image and the M +1 th frame image into the second video segment according to a relationship between a set formed by all target objects included in each frame image and the second set;
wherein P is an integer greater than 0.
9. The mobile terminal of claim 7, wherein the subset relationship comprises: in the two images subjected to set relation comparison, the third set formed by all the target objects in the previous image is a subset of the fourth set formed by all the target objects in the next image, or the fourth set is a subset of the third set.
10. The mobile terminal of claim 7, wherein the partitioning module comprises:
the third dividing unit is used for dividing the continuous S frame images which do not contain the target object in the continuous N frame images into the same video segment;
in the video segment without the target object, determining an image of a first frame without the target object as a first frame image of the video segment;
wherein S is an integer greater than 0.
11. The mobile terminal of claim 7, wherein the play control instruction comprises: a fast forward operation instruction and a fast backward operation instruction;
the control module includes:
the first control unit is used for controlling the playing progress of the video to jump to the first frame image of the next video segment of the currently played video segment for playing when the fast forward operation instruction is detected;
and the second control unit is used for controlling the playing progress of the video to jump to the first frame image of the previous video segment of the currently played video segment for playing when the fast-backward operation instruction is detected.
12. The mobile terminal of claim 7, wherein the target object is a human face image, an animal image, or a plant image.
CN201710286861.4A 2017-04-27 2017-04-27 Video playing control method and mobile terminal Active CN107105342B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710286861.4A CN107105342B (en) 2017-04-27 2017-04-27 Video playing control method and mobile terminal


Publications (2)

Publication Number Publication Date
CN107105342A CN107105342A (en) 2017-08-29
CN107105342B true CN107105342B (en) 2020-04-17



Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108012164B (en) * 2017-12-05 2021-07-30 北京奇艺世纪科技有限公司 Video playing method and device and electronic equipment
CN110545470A (en) * 2018-05-29 2019-12-06 北京字节跳动网络技术有限公司 Media file loading method and device and storage medium
CN110545480A (en) * 2018-05-29 2019-12-06 北京字节跳动网络技术有限公司 Preloading control method and device of media file and storage medium
CN108848332A (en) * 2018-08-27 2018-11-20 深圳艺达文化传媒有限公司 The segmentation image pickup method and Related product of promotion video
CN111277915B (en) * 2018-12-05 2022-08-12 阿里巴巴集团控股有限公司 Video conversion method and device
CN110166830A (en) * 2019-05-27 2019-08-23 航美传媒集团有限公司 The monitoring system of advertisement machine electronic curtain

Citations (4)

Publication number Priority date Publication date Assignee Title
CN102163201A (en) * 2010-02-24 2011-08-24 腾讯科技(深圳)有限公司 Multimedia file segmentation method, device thereof and code converter
CN103347167A (en) * 2013-06-20 2013-10-09 上海交通大学 Surveillance video content description method based on fragments
CN104902312A (en) * 2015-05-28 2015-09-09 成都市斯达鑫辉视讯科技有限公司 Method for controlling fast forward through remote controller of set top box (STB)
CN105979267A (en) * 2015-12-03 2016-09-28 乐视致新电子科技(天津)有限公司 Video compression and play method and device

Family Cites Families (3)

Publication number Priority date Publication date Assignee Title
JP5845801B2 (en) * 2011-10-18 2016-01-20 ソニー株式会社 Image processing apparatus, image processing method, and program
CN104780388B (en) * 2015-03-31 2018-03-09 北京奇艺世纪科技有限公司 The cutting method and device of a kind of video data
CN106162223B (en) * 2016-05-27 2020-06-05 北京奇虎科技有限公司 News video segmentation method and device




Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant