CN111491206A - Video processing method, video processing device and electronic equipment - Google Patents

Video processing method, video processing device and electronic equipment Download PDF

Info

Publication number
CN111491206A
CN111491206A CN202010306988.XA CN202010306988A CN111491206A CN 111491206 A CN111491206 A CN 111491206A CN 202010306988 A CN202010306988 A CN 202010306988A CN 111491206 A CN111491206 A CN 111491206A
Authority
CN
China
Prior art keywords
target
video
description information
input
target video
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010306988.XA
Other languages
Chinese (zh)
Other versions
CN111491206B (en
Inventor
孙鑫
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Vivo Mobile Communication Co Ltd
Original Assignee
Vivo Mobile Communication Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Vivo Mobile Communication Co Ltd filed Critical Vivo Mobile Communication Co Ltd
Priority to CN202010306988.XA priority Critical patent/CN111491206B/en
Publication of CN111491206A publication Critical patent/CN111491206A/en
Application granted granted Critical
Publication of CN111491206B publication Critical patent/CN111491206B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream, rendering scenes according to MPEG-4 scene graphs
    • H04N21/44008Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream, rendering scenes according to MPEG-4 scene graphs involving operations for analysing video streams, e.g. detecting features or characteristics in the video stream
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/431Generation of visual interfaces for content selection or interaction; Content or additional data rendering
    • H04N21/4312Generation of visual interfaces for content selection or interaction; Content or additional data rendering involving specific graphical features, e.g. screen layout, special fonts or colors, blinking icons, highlights or animations
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/435Processing of additional data, e.g. decrypting of additional data, reconstructing software from modules extracted from the transport stream
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/47End-user applications
    • H04N21/472End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content
    • H04N21/47205End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content for manipulating displayed content, e.g. interacting with MPEG-4 objects, editing locally

Abstract

The invention provides a video processing method, a video processing device and electronic equipment, wherein the method comprises the following steps: acquiring a target video characteristic label of a target video clip; and displaying target description information in the target video clip based on the target video feature tag. The method and the device can automatically display the matched description information in the video clip according to the video feature label of the video clip, so that the video can be processed without learning and familiar professional knowledge, professional software and professional operation of video processing by a user, thereby reducing the difficulty of video processing and helping the user add the description information to the video more conveniently and efficiently.

Description

Video processing method, video processing device and electronic equipment
Technical Field
The present invention relates to the field of communications technologies, and in particular, to a video processing method, a video processing apparatus, and an electronic device.
Background
In daily life, people often use electronic equipment to shoot various videos to record things around, and often share the shot videos on a social network. In order to obtain better sharing effect, some users process the video before sharing the video to the social network. In the prior art, the requirement on professional literacy of video processing personnel is high, and particularly, when a user needs to process a video, the user often needs to know the related professional knowledge, professional software and professional operation of video processing, which causes great difficulty in realizing video processing.
Disclosure of Invention
The embodiment of the invention provides a video processing method, a video processing device and electronic equipment, which can solve the problem of high difficulty in video processing.
In order to solve the technical problem, the invention is realized as follows:
in a first aspect, an embodiment of the present invention provides a video processing method, including:
acquiring a target video characteristic label of a target video clip;
and displaying target description information in the target video clip based on the target video feature tag.
In a second aspect, an embodiment of the present invention further provides a video processing apparatus, including:
the first acquisition module is used for acquiring a target video characteristic label of a target video clip;
and the first display module is used for displaying the target description information in the target video clip based on the target video feature tag.
In a third aspect, an embodiment of the present invention further provides an electronic device, which includes a processor, a memory, and a computer program stored on the memory and executable on the processor, where the computer program, when executed by the processor, implements the steps of the video processing method.
In a fourth aspect, an embodiment of the present invention further provides a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and when the computer program is executed by a processor, the steps of the video processing method are implemented.
In the embodiment of the invention, the target video characteristic label of the target video segment is obtained, and the target description information is displayed in the target video segment based on the target video characteristic label, so that the matched description information can be automatically displayed in the video segment according to the video characteristic label of the video segment, and thus, a user can process the video without learning and familiar professional knowledge, professional software and professional operation of video processing, and the difficulty of video processing can be reduced.
Drawings
Fig. 1 is a flow chart of a video processing method provided by an embodiment of the invention;
fig. 2 is one of interface schematic diagrams of a playing interface of a target video segment in a video processing method according to an embodiment of the present invention;
fig. 3 is a second schematic interface diagram of a playing interface of a target video segment in the video processing method according to the embodiment of the present invention;
fig. 4 is a third schematic interface diagram of a playing interface of a target video segment in the video processing method according to the embodiment of the present invention;
fig. 5 is a fourth schematic interface diagram of a playing interface of a target video segment in the video processing method according to the embodiment of the present invention;
fig. 6 is a fifth schematic interface diagram of a playing interface of a target video clip in the video processing method according to the embodiment of the present invention;
fig. 7 is a sixth schematic interface diagram of a playing interface of a target video segment in the video processing method according to the embodiment of the present invention;
fig. 8 is a seventh schematic interface diagram of a playing interface of a target video segment in the video processing method according to the embodiment of the present invention;
fig. 9 is an eighth schematic interface diagram of a playing interface of a target video segment in the video processing method according to the embodiment of the present invention;
fig. 10 is a block diagram of a video processing apparatus provided in an embodiment of the present invention;
fig. 11 is a block diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Here, the electronic Device may be, but not limited to, a Mobile phone, a Tablet Personal Computer (Tablet Personal Computer), a laptop Computer (L ap Computer), a Personal Digital Assistant (PDA), a Mobile Internet Device (MID), a Wearable Device (Wearable Device), or the like.
The video processing method in the embodiment of the invention can comprise the following steps:
step 101, obtaining a target video feature tag of a target video clip.
In the embodiment of the present invention, the target video segment may be one of video segments of a target video acquired by an electronic device. Here, the target video may be a video obtained by calling a camera of the electronic device to shoot, or may be a video sent from another network device; the target video may include only the target video segment, and may also include the target video segment and other video segments.
The number of target video segments may be one or more.
The target video feature tag can be used to indicate feature information such as style or hue of the target video segment. Specifically, when the target video feature tag is used to indicate the style of the target video segment, the target video feature tag may be a beauty, a movie, or a video blog (vlog for short), etc.; when the target video feature tag is used to indicate the tone of the target video segment, the target video feature tag may be red, orange, blue, or the like.
And 102, displaying target description information in the target video clip based on the target video feature tag.
In the embodiment of the invention, the target description information can be characters, images, symbols, stickers or the like.
In practical application, after the target video feature tag is acquired, the description information corresponding to the target video feature tag may be directly used as the target description information to be displayed in the target video segment, or the target description information may be obtained by editing on the basis of the description information corresponding to the target video feature tag and displayed in the target video segment. That is, in practice, step 102 may include at least the following two cases:
case one, step 102 may comprise:
determining description information corresponding to the target video feature tag as target description information;
and displaying the target description information in the target video clip.
For ease of understanding, the caption is exemplified as the explanatory information:
assuming that the playing interface of the target video segment is as shown in fig. 2, the target video feature tag is used to indicate the style of the target video segment, and the target video feature tag is unique, the correspondence between the video feature tag and the subtitle is preset as follows: the method comprises the following steps that (1) in a mottle city, namely "" Eimeria "" corresponds to "" cloud cyan; rain, water \28601; \28601 "", if you have a footprint and "" vlog "" corresponds to "" beautiful one day starts from a vlog "", then after a target video feature tag, namely the Eimeria is obtained, electronic equipment can store the corresponding relation between the preset video feature tag and subtitles according to a storage medium in a data form such as a corresponding relation table, the corresponding relation is not limited, and subtitles corresponding to the Eimeria are automatically obtained: the method comprises the steps of cloud cyan; intended rain, water \28601.
Case two, step 102 may also include:
editing description information corresponding to the target video feature tag to obtain target description information;
and displaying the target description information in the target video clip.
For ease of understanding, the caption is still exemplified as the explanatory information:
assuming that the target video feature tag is used for indicating the style of the target video segment, and the target video feature tag is vlog, the correspondence between the video feature tag and the subtitle is preset as follows: "beautiful" corresponds "the willow eye in the Fei of thin rain is opened, cloud and smoke is wriggling around the fairy platform", "the movie" corresponds "life is like time light, always walk in the careless time" and "the vlog" corresponds "and let the life fresher", then, after obtaining the target video characteristic label promptly the vlog, electronic equipment can be according to the corresponding relation between the video characteristic label that the aforesaid sets up in advance and the subtitle, automatic acquisition corresponds with the vlog the subtitle: and (3) enabling the life to be fresher, and then editing the obtained subtitles to obtain the target subtitles: you are around my, make life fresher, each moment is wonderful, and display the target caption in the target video segment.
The video processing method in the embodiment of the present invention may be automatically triggered by the electronic device, for example, the video processing method may be triggered when the electronic device detects a target video to which the target video segment belongs. The video processing method in the embodiment of the present invention may also be triggered by an input of a user, for example, as shown in fig. 2, a target control 21 may be displayed in a target video segment, so that the user may trigger the video processing method by clicking the target control 21 displayed in the target video segment.
In the embodiment of the invention, the target video characteristic label of the target video segment is obtained, and the target description information is displayed in the target video segment based on the target video characteristic label, so that the matched description information can be automatically displayed in the video segment according to the video characteristic label of the video segment, and thus, the video can be processed without the need of learning and familiar professional knowledge, professional software and professional operation of video processing by a user, so that the difficulty of video processing can be reduced, and the user can be helped to add the description information to the video more conveniently and efficiently.
Optionally, the obtaining a target video feature tag of a target video segment includes:
identifying the video content type of each frame of video image in the target video;
dividing the target video into N video segments based on the video content type of each frame of video image, and adding a video feature tag to each of the N video segments;
determining a target video clip from the N video clips, and acquiring a target video feature tag of the target video clip;
the N is a positive integer, the video content types of each frame of video image in each of the N video clips are the same, and the video feature tag of each of the N video clips corresponds to the video content type of any frame of video image in the video clip.
In the embodiment of the present invention, for the explanation of the target video, reference may be made to the description of the target video in step 101, and thus, details are not repeated here.
The above video content type may be a natural landscape, a street building, a daily life, or the like.
The video content type of each frame of video image of the recognition target video may specifically be: the display content of each frame of video image of the target video is identified, and the video content type of each frame of video image of the target video is determined based on the display content of each frame of video image of the target video. For example, when the display content of one frame of video image of the target video is identified as a mountain, a grassland or a forest, the video content type of the frame of video image can be determined as a natural landscape; when the display content of one frame of video image of the target video is identified as a street, a building or a bridge, the video content type of the frame of video image can be determined as a street building.
The video feature tag can be used for indicating feature information such as style or color tone of the video segment. Specifically, when the video feature tag is used to indicate the genre of the video clip, the video feature tag may be a beauty, a movie, a vlog, or the like; when a video feature tag is used to indicate the hue of a video clip, the video feature tag may be red, orange, blue, or the like.
The target video segment may be any one of the N video segments. Specifically, the target video segment may be a first video segment in the N video segments, a last video segment in the N video segments, or an intermediate video segment in the N video segments.
In practical applications, the correspondence between the video feature tags and the video content types can be set as required, for example, if the video feature tags include beauty, movies and vlog, and the video content types include natural scenery, street buildings or daily life, then the beauty can be set for the natural scenery, the movies for the street buildings and the vlog for the daily life.
In this way, by identifying the video content type of each frame of video image in the target video, dividing the target video into N video segments based on the video content type of each frame of video image, adding a video feature tag to each of the N video segments, determining the target video segment from the N video segments, and acquiring the target video feature tag of the target video segment, automatic acquisition of the target video feature tag of the target video segment can be realized, and the matching degree of the target description information and the content of the target video segment can be improved.
Optionally, before displaying the target description information in the target video segment based on the target video feature tag, the method further includes:
acquiring a first target multimedia file collected in advance, wherein the first target multimedia file comprises first description information;
the displaying of target description information in the target video clip based on the target video feature tag includes:
acquiring second description information corresponding to the target video feature tag;
and displaying target description information in the target video clip based on the first description information and the second description information.
In this embodiment of the present invention, the first target multimedia file may be any one of a plurality of multimedia files collected by a user in advance.
The first description information may be characters, images, symbols, stickers, or the like. The second description information may be characters, images, symbols, stickers, or the like.
In this way, by acquiring a first target multimedia file collected in advance, the first target multimedia file including first description information and acquiring second description information corresponding to the target video feature tag, and based on the first description information and the second description information, displaying the target description information in the target video clip, the target description information displayed in the target video clip can take into account the matching degree with the target video feature tag of the target video clip and the preference of the user.
Optionally, the displaying the target description information in the target video segment based on the first description information and the second description information includes:
determining target description information from the first description information and the second description information, and displaying the target description information in the target video clip;
or generating target description information based on the first description information and the second description information, and displaying the target description information in the target video clip.
In the embodiment of the present invention, the specification of the target specification information from the first specification information and the second specification information may be understood as selecting one of the first specification information and the second specification information as the target specification information. For ease of understanding, the following are exemplified herein:
suppose that the first target multimedia file is an image collected by the user in advance, and the image comprises characters: yunquqing; intended rain, water _28601; \28601; smoke, i.e., the first explanatory information is: the second description information corresponding to the target video feature tag of the target video segment is: the water and light billow is good in sunny side and also peculiar in mountainous sky and rainy, then at this time, the electronic device can select one from the first description information and the second description information as the target description information according to the algorithm, and the target description information is displayed in the target video segment, and it is assumed here that the electronic device selects the first description information, namely, yunququyu, water 28601; smoke generation, then in the target video segment, characters will be displayed: yunquqing; intended rain, water \28601.
The above generation of the target specification information based on the first specification information and the second specification information may be understood as generating a new specification information based on the understanding of the first specification information and the second specification information, and determining the generated new specification information as the target specification information. For ease of understanding, the following are exemplified herein:
suppose that the first target multimedia file is an image collected by the user in advance, and the image comprises characters: the information of the heavy rain in the fei, i.e., the first specification information, is: in the fei of heavy rains, the second description information corresponding to the target video feature tag of the target video clip is: cloudy-grey, then at this time, the electronic device may generate verse based on the first explanatory information, namely, the fewest of fine rain, and the second explanatory information, namely cloudy-grey: the verse generated by the willow in the middle and the Fei in the thin rain and the cloud and the smoke turning around the fairy tale is used as target description information and displayed in a target video segment, that is, the verse is displayed in the target video segment: the willow eye is open in the fine rain and the Fei and the Yunyan looks like the fairy tale.
In this way, by specifying the target description information from the first description information and the second description information and displaying the target description information in the target video clip, it is possible to further increase the speed of acquiring the target description information while taking into account the degree of matching with the target video feature tag and the preference of the user, and further more efficiently add the description information to the video clip.
By generating the target description information based on the first description information and the second description information and displaying the target description information in the target video clip, the target description information added to the target video clip can better take the matching degree with the target video feature tag of the target video clip and the preference of the user into consideration.
Optionally, after displaying the target description information in the target video segment based on the target video feature tag, the method further includes:
displaying a first track, the first track including the target specification information;
receiving a first input of a user to the first track;
adjusting a display range of the target specification information in response to the first input;
wherein the display range of the target specification information is used for indicating all video images in the target video clip displaying the target specification content.
In this embodiment of the present invention, the step of displaying the first track may be automatically triggered by the electronic device, for example, the first track may be automatically displayed after the electronic device displays the target description information in the target video clip based on the target video feature tag. The step of displaying the first track may also be triggered by a user input, for example, a target control may be displayed in the target video segment, so that when the user double-clicks the target control of the playing interface of the target video segment, the electronic device may be triggered to display the first track.
The first input may be a touch input such as clicking, double clicking, long pressing, sliding, or dragging, or may be a voice control input.
In order to facilitate understanding of the display range of the above object description information, here, an example is given in which:
assuming that the target video segment includes 4 consecutive video images, the 4 consecutive video images are an a video image, a B video image, a C video image, and a D video image in sequence, and the display target specification information in the target video segment is that the target specification information is displayed in all the video images of the target video segment, the display range of the target specification information at this time may be considered as: the video image processing method comprises the following steps of A, B, C and D video images, wherein the A video image can be called a starting video image of target description information, and the D video image can be called an ending video image of the target description information;
next, if the target specification information is displayed only in the B video image and the C video image of the target video clip after the display range of the target specification information is adjusted as described above, it is considered that the adjusted display range of the target specification information is: a B video image and a C video image, wherein the B video image may be referred to as a start video image of the object description information, and the C video image may be referred to as an end video image of the object description information.
The operation of adjusting the display range of the target specification information may be understood as an operation of adjusting which video images of the target video clip the target specification information is displayed in.
In this way, by displaying a first track including the object description information, receiving a first input of the first track by a user, and adjusting a display range of the object description information in response to the first input, the display range of the object description information being used for indicating all video images in the object video in which the object description content is displayed, after automatically adding description information to a video clip, the user can adjust the display range of the object description information as needed, so that the flexibility of use of the video processing method can be further improved.
Optionally, the adjusting the display range of the target specification information in response to the first input includes:
when the first input is used for adjusting the display range of the first track to a first display range, adjusting the starting video image of the target description information to a video image corresponding to a first position of the first display range, and adjusting the ending video image of the target description information to a video image corresponding to a second position of the first display range;
and in the case that the input parameters of the first input comprise at least one of input position, input times and input time, adjusting at least one of a start video image and an end video image of the target description information to be a video image corresponding to the input parameters of the first input.
In this embodiment of the present invention, the starting video image of the target specification information may be a frame video image of the target video segment for starting displaying the target specification information; and the ending video image of the target specification information may be the frame video image of the target video clip in which the display of the target specification information is ended.
The first track may be a strip track, a circular track or a track with other shapes.
For ease of understanding, the following are exemplified herein:
application case one
Assuming that the target video clip comprises 7 frames of video images in total, the target description information is: based on the target video feature tag, after the target description information is displayed in the target video segment, the display range of the description detail information is as follows: first, second, third, fourth, fifth, sixth and seventh frames of video images in the target video clip, that is, target description information, are displayed from the first frame of video image in the target video clip to the seventh frame of video image in the target video clip, and a playing interface of the target video clip is shown in fig. 3, where 21 shown in fig. 3 is a target control, then, when a user clicks the target control 21 shown in fig. 3, as shown in fig. 4, a bar-shaped first track 41 is displayed on the playing interface of the target video clip, where the first track 41 includes target description information, that is, cloud-cyan-raining, water 28601; smoke is generated, a third position 411 of a display range of the first track 41, that is, a third position of a frame of text "cloud-cyan-raining, water 28601; left side of a frame of the first track 601; smoke generation" corresponds to a start video image of the target description information, that is, the first frame of the target video image in the target video clip, a fourth position 412 of the display range of the first track 41, i.e., a position indicated by a right side edge of a box of the text "cloud-herlongitudinally, water \28601; \; smoke _;
next, in the display interface shown in fig. 4, the user may adjust the display range of the first track 41 by dragging the third position 411 and the fourth position 412, may adjust the display range of the first track 41 by clicking any position of the display range of the first track 41 and controlling the number of clicks, and may adjust the display range of the first track 41 by pressing the first track 41 and controlling the pressing time.
If the first track 41 with the display range adjusted by the user is shown in fig. 5, the starting video image of the target specification content is adjusted to: a video image corresponding to the fifth position 413 of the display area of the first track 41 shown in fig. 5, i.e. the position indicated by the left side edge of the box of the text "cloud-herno-rain, water \28601 | _ smoke _ |, shown in fig. 5, and the ending video image of the target specification content is adjusted to: the video image corresponding to the sixth position 414 of the display area of the first track 41 shown in fig. 5, i.e. the position indicated by the right side edge of the box of the text "cloud image, water \28601 | _ 28601smoke" shown in fig. 5, assuming that the video image corresponding to the fifth position 413 is the second frame video image of the target video clip, and the video image corresponding to the sixth position 414 is the sixth frame video image of the target video clip, the display range of the target description content at this time is adjusted to: the second, third, fourth, fifth and sixth frame images in the target video segment, that is, the target specification content is only displayed from the second frame image of the target video segment to the sixth frame image of the target video segment; next, in the display interface shown in fig. 5, the user may click on the target control 21 shown in fig. 5 to save the adjustment of the display range of the target specification content; the user may also double-click the target control 21 shown in fig. 5 to undo the adjustment to the display range of the target specification content.
Application case two
Assuming that the target video clip comprises 7 frames of video images in total, the target description information is: based on the target video feature tag, after the target description information is displayed in the target video segment, the display range of the description detail information is as follows: first, second, third, fourth, fifth, sixth and seventh frames of video images in the target video segment, that is, at this time, the starting video image of the target description information is the first frame of video image in the target video segment, the ending video image of the target description information is the seventh frame of video image in the target video segment, the target description information is displayed from the first frame of video image in the target video segment to the seventh frame of video image in the target video segment, and the playing interface of the target video segment is as shown in fig. 3, where 21 shown in fig. 3 is the target control, then, when the user clicks the target 21 shown in fig. 3, as shown in fig. 4, a first strip-shaped track 41 is displayed on the playing interface of the target video segment, and the first track 41 includes the target description information, that is, yunquqing koy, water 28601; smoke generation;
next, in the display interface shown in fig. 4, the user may adjust the display range of the target specification information in various ways, including but not limited to the following ways:
in a first mode
The user may double-click on the first track 41 shown in fig. 4 and control the double-click position to adjust the starting video image of the target specification information; for example, assuming that the user double-clicks a first target position of the first track 41 shown in fig. 4, the starting video image of the target specification information will be adjusted to the video image corresponding to the first target position;
similarly, the user may press the first track 41 shown in fig. 4 for a long time and control the pressing position for a long time to adjust the ending video image of the target specification information; for example, assuming that the user has pressed the second target position of the first track 41 shown in fig. 4 for a long time, the ending video image of the target specification information will be adjusted to the video image corresponding to the second target position.
Mode two
The user may adjust the starting video image of the target specification information by clicking the third position 411 of the display range of the first track 41 shown in fig. 4, i.e., the position indicated by the left side edge of the box of the text "cloud turmerium, water \28601; smoke \" shown in fig. 4, and controlling the number of clicks, thereby implementing adjustment of the display range of the target specification information; for example, it may be set that the starting video image of the object description information is shifted backward by one view image every time the user clicks the third position 411, so that when the user clicks the third position 411 once, the starting video image of the object description information is adjusted to be the second frame video image in the object video clip, that is, the display range of the object description information is adjusted to be: when the user clicks the third position 411 twice, the starting video image of the target description information is adjusted to be the third frame video image of the target video clip, that is, the display range of the target description information is adjusted to be: the third, fourth, fifth, sixth and seventh frames of images in the target video clip, and so on, when the user clicks the third position 411 twice or more;
similarly, the user may adjust the ending video image of the target description information by clicking the fourth position 412 of the display range of the first track 41 shown in fig. 4, that is, the position indicated by the right side edge of the box of the text "cloud turpentine rain, water \28601; smoke generation" shown in fig. 4, and controlling the number of clicks, thereby implementing the adjustment of the display range of the target description information; for example, it may be set that the starting video image of the object description information moves forward by one view image every time the user clicks the fourth position 412, so that when the user clicks the fourth position 412 once, the starting video image of the object description information is adjusted to be the sixth frame video image in the object video clip, that is, the display range of the object description information is adjusted to be: when the user clicks the fourth position 412 twice, the starting video image of the target specification information is adjusted to be the fifth video image of the target video segment, that is, the display range of the target specification information is adjusted to be: the first, second, third, fourth, and fifth frames of images in the target video segment, and so on, with the user clicking the fourth location 412 more than twice.
Mode III
The user may adjust the start video image of the target specification information by pressing the third position 411 of the display range of the first track 41 shown in fig. 4, i.e., the position indicated by the left side edge of the box of the text "cloud turmerium, water \28601; smoke \"; for example, it may be set that the starting video image of the object description information is shifted back by one view image every two seconds the user presses the third position 411;
similarly, the user may adjust the ending video image of the target description information by pressing the fourth position 412 of the display range of the first track 41 shown in fig. 4, that is, the position indicated by the right side edge of the box of the text "cloud turpentine rain, water \28601; \ 28601; smoke generation" shown in fig. 4, and controlling the pressing time, thereby implementing the adjustment of the display range of the target description information; for example, it may be set that the end video image of the object description information is advanced by one view image every two seconds when the user presses the fourth position 412.
Mode IV
The user can click any position of the first track 41 shown in fig. 4, control the number of clicks, and simultaneously adjust the starting video image of the target description information and the ending video image of the target description information, thereby adjusting the display range of the target description information; for example, it may be set that each time the user clicks an arbitrary position of the first track 41, the start video image of the object description information is moved backward by one view image, and at the same time, the end video image of the object description information is also moved forward by one view image.
It should be noted that, in the process of adjusting the display range of the target specification information through the above manner or the other manners, the display range of the first track 41 shown in fig. 4 may be adjusted accordingly, or the display range of the first track 41 shown in fig. 4 may not be adjusted, which is not limited in the embodiment of the present invention.
In this way, when the first input is used for adjusting the display range of the first track to the first display range, the starting video image of the target description information is adjusted to the video image corresponding to the first position of the first display range, and the ending video image of the target description information is adjusted to the video image corresponding to the second position of the first display range, so that the user can adjust the display range of the target description information by adjusting the display range of the first track, the operation is simple and convenient, and convenience and flexibility in adjusting the display range of the target description information can be improved.
Under the condition that the input parameters of the first input include at least one of input position, input times and input time, at least one of the starting video image and the ending video image of the target description information is adjusted to be the video image corresponding to the input parameters of the first input, so that a user can adjust the display range of the target description information by controlling the input position, the input times or the input time, the operation is simple and convenient, and convenience and flexibility in adjusting the display range of the target description information can be improved.
Optionally, after displaying the target description information in the target video segment based on the target video feature tag, the method further includes:
displaying a second track including a video image thumbnail of each video image in the target video clip and a slider control that is displayed suspended over the second track;
receiving a second input of the sliding control by a user;
and under the condition that the second input is used for sliding the sliding control to be above the target video image thumbnail in the second track, updating the starting video image or the ending video image of the target description information to the video image corresponding to the target video image thumbnail.
In this embodiment of the present invention, the step of displaying the second track and the sliding control may be automatically triggered by the electronic device, for example, the second track and the sliding control may be automatically displayed after the electronic device displays the target description information in the target video clip based on the target video feature tag. The step of displaying the second track and the sliding control may also be triggered by an input of a user, for example, the target control may be displayed in the target video clip, so that when the user double-clicks the target control of the playing interface of the target video clip, the electronic device may be triggered to display the second track and the sliding control.
The second input may be a click input or a drag input.
For ease of understanding, the following are exemplified herein:
assuming that the target video segment includes 7 frames of video images, the target specification information is: based on the target video feature tag, after the target description information is displayed in the target video segment, a playing interface of the target video segment is shown in fig. 3, where 21 shown in fig. 3 is a target control, then, when the user clicks the target control 21 shown in fig. 3, as shown in fig. 4, a second track 42 and a sliding control 43 are displayed on the playing interface of the target video segment, where the second track 42 includes a video image thumbnail of each video image in the target video segment, and the sliding control 43 is displayed in a floating manner on a first video image thumbnail in the second track, that is, at this time, a starting video image of the target description information is a video image corresponding to a first video image thumbnail in the second track, that is, a first video image in the target video segment, and it is assumed that an ending video image of the target description information is default to be a last video image in the target video segment Video, then at this point, the display range of the target specification content is: first, second, third, fourth, fifth, sixth and seventh frame video images in the target video segment, that is, the target specification information is displayed from the first frame video image in the target video segment to the seventh frame video image in the target video segment;
next, in the display interface shown in fig. 4, the user may slide the sliding control 43 onto any video image thumbnail, and assuming that the user may slide the sliding control 43 onto the fourth video image thumbnail in the second track, as shown in fig. 6, 61 shown in fig. 6 is the target description information, after the sliding, the starting video image of the target description information is updated to the video image corresponding to the fourth video image thumbnail in the second track, that is, the fourth frame video image in the target video segment, and the ending video image of the target description information is not adjusted, that is, at this time, the display range of the target description content is adjusted to: the fourth, fifth, sixth and seventh frames of video images in the target video segment, that is, the target specification information will be displayed from the fourth frame of video image in the target video segment to the seventh frame of video image in the target video segment;
after the adjustment, in the display interface shown in fig. 6, the user may click the target control 21 shown in fig. 6 to save the adjustment of the display range of the target specification content; the user may also double-click the target control 21 shown in fig. 6 to undo the adjustment to the display range of the target specification content.
In this way, a second input of the user to the sliding control is received by displaying the second track and the sliding control, and the starting video image or the ending video image of the target description information is updated to the video image corresponding to the target video image thumbnail under the condition that the second input is used for sliding the sliding control onto the target video image thumbnail in the second track, so that the user can adjust the display range of the target description information by operating the sliding control, the operation is simple and convenient, and convenience and flexibility in adjusting the display range of the target description information can be improved.
Optionally, after displaying the target description information in the target video segment based on the target video feature tag, the method further includes:
displaying at least one of an edit control, a delete control, and a change control;
under the condition that the editing control is displayed, receiving sixth input of a user to the editing control, and responding to the sixth input, displaying a description information editing frame which is used for editing the target description information displayed in the target video clip;
receiving a seventh input of the user to the deletion control under the condition that the deletion control is displayed, and deleting the target description information displayed in the target video clip in response to the seventh input;
receiving an eighth input of a user to the replacement control in the condition that the replacement control is displayed, and replacing the description information displayed in the target video clip from the target description information to fourth description information in response to the eighth input.
In an embodiment of the present invention, the fourth description information may be one of a plurality of preset description information, and the fourth description information and the target description information may be different description information.
For further understanding, the description herein is by way of example:
assume that the target specification information is: based on the target video feature tag, after target description information is displayed in the target video segment, a playing interface of the target video segment is shown in fig. 3, wherein 21 shown in fig. 3 is a target control, so that when a user clicks the target control 21 shown in fig. 3, as shown in fig. 4, an editing control 44, a deleting control 45 and a replacing control 46 are displayed on the playing interface of the target video segment;
in the interface shown in fig. 4, the user may click on the edit control 44 to trigger the display of the description information edit box, where the user may edit the target description information displayed in the target video clip;
in the interface shown in fig. 4, the user also clicks the delete control 45 to delete the target description information displayed in the target video clip;
in the interface shown in fig. 4, the user may further click the replacement control 46 to replace the description information in the target video segment with "yunquqing; 28601, \ 28601; alternatively, the user may double-click the target control 21 shown in fig. 7 to undo the replacement of the explanatory information and return to the play interface of the target video clip.
In this way, at least one of the editing control, the deleting control and the replacing control is displayed, so that after the description information is automatically added to the video clip, a user can edit or delete or replace the description information in the target video clip according to the requirement, and the flexibility of the video processing method can be further improved.
Optionally, after displaying the target description information in the target video segment based on the target video feature tag, the method further includes:
displaying an import control;
receiving a third input of the user to the import control;
responding to the third input, displaying P multimedia files, wherein P is a positive integer, and each multimedia file in the P multimedia files comprises description information;
receiving a fourth input of a user to a second target multimedia file of the P multimedia files;
and responding to the fourth input, acquiring third description information in the second target multimedia file, and updating the description information displayed in the target video clip from the target description information to the third description information.
In this embodiment of the present invention, the step of displaying the import control may be automatically triggered by the electronic device, for example, the import control may be automatically displayed after the electronic device displays the target description information in the target video segment based on the target video feature tag. The step of displaying the import control may also be triggered by an input of a user, for example, the target control may be displayed in the target video segment, so that when the user double-clicks the target control of the playing interface of the target video segment, the electronic device may be triggered to display the import control.
The third input may be a touch input such as clicking, double clicking, long pressing, sliding, or dragging, or a voice control input. The fourth input may be a touch input such as clicking, double clicking, long pressing, sliding, or dragging, or a voice control input.
The multimedia file can be an image, text or video.
The P multimedia files may be P multimedia files that are downloaded or collected by the user in advance. The second target multimedia file may be any one of the P multimedia files.
For ease of understanding, the following are exemplified herein:
assume that the target specification information is: based on the target video feature tag, after the target description information is displayed in the target video segment, a playing interface of the target video segment is shown in fig. 3, where 21 shown in fig. 3 is a target control, then, when the user clicks the target control 21 shown in fig. 3, as shown in fig. 8, an import control 81 is displayed on the playing interface of the target video segment, 82 shown in fig. 8 is target description information, and when the user clicks the import control 81 shown in fig. 8, P multimedia files collected by the user in advance are displayed, each multimedia file in the P multimedia files includes description information, and the user can select one multimedia file from the P multimedia files, where it is assumed that the description information included in the multimedia file selected by the user is: the water light billed good sunny side and the mountain sky rain, then, when the user selects the multimedia file, the description information in the target video clip will be updated as: water light billed good sunny sides and mountain sky rains, as shown in fig. 9, 91 shown in fig. 9 is target description information; thereafter, the user may click on the target control 21 shown in fig. 9 to save the update of the explanatory content and jump to the play interface of the target video segment; alternatively, the user may double-click the target control 21 shown in fig. 9 to undo the update to the description information and return to the play interface of the target video clip.
In this way, by displaying an import control, receiving a third input of a user to the import control, displaying P multimedia files in response to the third input, receiving a fourth input of the user to a second target multimedia file in the P multimedia files, acquiring third description information in the second target multimedia file in response to the fourth input, and updating the description information in the target video clip from the target description information to the third description information, the user can, after description information has been automatically added to a video clip, import preferred description information as needed and replace the description information in the target video clip with it. This further improves the flexibility of the video processing method. A minimal sketch of this replacement step is given below.
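The sketch models the replacement step in Python, assuming each multimedia file is represented as a dict with a "description" field; the data model and the function name are illustrative assumptions, not the patent's implementation.

def import_description(clip: dict, multimedia_files: list, index: int) -> dict:
    """Replace the clip's displayed description with an imported one."""
    # The selected file's description information plays the role of the
    # third description information in the embodiment above.
    third_info = multimedia_files[index]["description"]
    clip["description"] = third_info  # previously: the target description information
    return clip

For example, import_description(clip, files, selected) applied after the fourth input leaves the clip displaying the description information carried by the selected multimedia file.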
In some embodiments, the first track may also serve as the import control; specifically, the user may trigger display of the P multimedia files by performing a fifth input on the first track, so as to subsequently update the description information in the target video clip.
According to the embodiment of the invention, the target video feature tag of the target video segment is acquired, and the target description information is displayed in the target video segment based on the target video feature tag, so that matched description information can be automatically displayed in a video segment according to its video feature tag. A video can thus be processed without the user having to learn specialized video-processing knowledge, software, or operations, which reduces the difficulty of video processing and helps the user add description information to a video more conveniently and efficiently.
It should be noted that, various optional implementations described in the embodiments of the present invention may be implemented in combination with each other or implemented separately, and the embodiments of the present invention are not limited thereto.
Referring to fig. 10, fig. 10 is a structural diagram of a video processing apparatus according to an embodiment of the present invention, and as shown in fig. 10, the video processing apparatus 1000 includes:
a first obtaining module 1001, configured to obtain a target video feature tag of a target video segment;
a first display module 1002, configured to display the target description information in the target video segment based on the target video feature tag.
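To make the division of labor between these two modules concrete, here is a brief Python sketch; the tag-to-description lookup table is a purely illustrative assumption, since the embodiments above do not specify how the mapping is stored.

from typing import Dict

# Hypothetical mapping from video feature tags to candidate description
# information; the entries are placeholders.
DESCRIPTIONS_BY_TAG: Dict[str, str] = {
    "landscape": "Hills and water as far as the eye can see.",
    "food": "A feast for the eyes as much as for the palate.",
}

def display_description(clip: dict) -> dict:
    """Attach description info matching the clip's feature tag."""
    tag = clip["feature_tag"]                               # first obtaining module 1001
    clip["description"] = DESCRIPTIONS_BY_TAG.get(tag, "")  # first display module 1002
    return clip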
Optionally, the first obtaining module 1001 includes:
the identification unit is used for identifying the video content type of each frame of video image in the target video;
the dividing unit is used for dividing the target video into N video segments based on the video content type of each frame of video image, and adding a video feature tag to each of the N video segments;
the first acquisition unit is used for determining a target video clip from the N video clips and acquiring a target video feature tag of the target video clip;
wherein N is a positive integer, the video content type of each frame of video image within each of the N video clips is the same, and the video feature tag of each of the N video clips corresponds to the video content type of any frame of video image in that video clip.
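The identification and dividing units can be sketched as follows, assuming a hypothetical classify_frame() callback that returns a content-type label for a single frame; this illustrates only the grouping logic, not the patent's actual implementation.

from itertools import groupby

def split_into_segments(frames, classify_frame):
    """Group consecutive frames sharing one content type into N segments."""
    # Identification unit: label every frame with its video content type.
    labeled = [(classify_frame(frame), frame) for frame in frames]
    # Dividing unit: consecutive frames with the same type form one segment,
    # and each segment's video feature tag equals that shared content type.
    segments = []
    for content_type, group in groupby(labeled, key=lambda pair: pair[0]):
        segments.append({
            "feature_tag": content_type,
            "frames": [frame for _, frame in group],
        })
    return segments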
Optionally, the video processing apparatus 1000 further includes:
the second acquisition module is used for acquiring a first target multimedia file which is collected in advance, wherein the first target multimedia file comprises first description information;
the first display module 1002 includes:
the second acquisition unit is used for acquiring second description information corresponding to the target video feature tag;
and the display unit is used for displaying target description information in the target video clip based on the first description information and the second description information.
Optionally, the display unit is configured to:
determining target description information from the first description information and the second description information, and displaying the target description information in the target video clip;
or generating target description information based on the first description information and the second description information, and displaying the target description information in the target video clip.
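Both display strategies can be sketched briefly; the preference flag and the concatenation rule below are stand-in assumptions, since the embodiments leave the exact selection and generation rules open.

def determine_target_description(first_info: str, second_info: str,
                                 prefer_collected: bool = True) -> str:
    # Strategy 1: determine the target description info by picking one
    # of the two candidate strings.
    return first_info if prefer_collected else second_info

def generate_target_description(first_info: str, second_info: str) -> str:
    # Strategy 2: generate new target description info from both
    # candidates; plain concatenation stands in for the generation rule.
    return f"{first_info} {second_info}"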
Optionally, the video processing apparatus 1000 further includes:
the second display module is used for displaying a first track, and the first track comprises the target description information;
the first receiving module is used for receiving a first input of a user to the first track;
the first adjusting module is used for responding to the first input and adjusting the display range of the target description information;
wherein the display range of the target description information is used for indicating all video images in the target video clip in which the target description information is displayed.
Optionally, the first adjusting module is configured to:
when the first input is used for adjusting the display range of the first track to a first display range, adjusting the starting video image of the target description information to a video image corresponding to a first position of the first display range, and adjusting the ending video image of the target description information to a video image corresponding to a second position of the first display range;
and in the case that the input parameters of the first input comprise at least one of input position, input times and input time, adjusting at least one of a start video image and an end video image of the target description information to be a video image corresponding to the input parameters of the first input.
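As an illustration of the first adjustment case, the sketch below assumes the first track maps linearly onto the frame indices of the target video clip and that track positions are normalized to the range [0, 1]; both assumptions are illustrative rather than stated in the embodiments.

def adjust_display_range(total_frames: int,
                         first_position: float,
                         second_position: float) -> tuple:
    """Map two normalized track positions to (start_frame, end_frame)."""
    start = round(first_position * (total_frames - 1))
    end = round(second_position * (total_frames - 1))
    # The earlier position becomes the starting video image and the later
    # one the ending video image of the target description information.
    return (min(start, end), max(start, end))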
Optionally, the video processing apparatus 1000 further includes:
a third display module, configured to display a second track and a sliding control, where the second track includes a video image thumbnail of each video image in the target video segment, and the sliding control is displayed in a floating manner on the second track;
the second receiving module is used for receiving a second input of the sliding control by the user;
and a first updating module, configured to update a starting video image or an ending video image of the target description information to a video image corresponding to a target video image thumbnail in the second track when the second input is used to slide the slide control over the target video image thumbnail.
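The second-track interaction reduces to moving one boundary of the display range to the frame behind the hovered thumbnail. In the sketch below, the update_start flag, which selects the boundary being edited, is an assumed parameter not taken from the embodiments.

def on_slider_over_thumbnail(display_range: tuple,
                             thumbnail_index: int,
                             update_start: bool) -> tuple:
    """Move one boundary of the display range to the hovered thumbnail."""
    start_frame, end_frame = display_range
    if update_start:
        return (thumbnail_index, end_frame)  # new starting video image
    return (start_frame, thumbnail_index)    # new ending video image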
Optionally, the video processing apparatus 1000 further includes:
the fourth display module is used for displaying the import control;
the third receiving module is used for receiving a third input of the user to the import control;
a fifth display module, configured to respond to the third input, and display P multimedia files, where P is a positive integer, and each of the P multimedia files includes description information;
a fourth receiving module, configured to receive a fourth input of a second target multimedia file in the P multimedia files from the user;
and the second updating module is used for responding to the fourth input, acquiring third description information in the second target multimedia file, and updating the description information displayed in the target video clip from the target description information to the third description information.
The video processing apparatus 1000 can implement each process implemented by the electronic device in the method embodiments of fig. 1 to fig. 9, and details are not repeated here to avoid repetition.
According to the video processing apparatus 1000 of the embodiment of the present invention, the target video feature tag of the target video clip is acquired, and the target description information is displayed in the target video clip based on the target video feature tag, so that matched description information can be automatically displayed in a video clip according to its video feature tag. A video can thus be processed without the user having to learn specialized video-processing knowledge, software, or operations, which reduces the difficulty of video processing and helps the user add description information to a video more conveniently and efficiently.
Fig. 11 is a schematic diagram of a hardware structure of an electronic device 1100 for implementing various embodiments of the present invention, where the electronic device 1100 includes, but is not limited to: radio frequency unit 1101, network module 1102, audio output unit 1103, input unit 1104, sensor 1105, display unit 1106, user input unit 1107, interface unit 1108, memory 1109, processor 1110, and power supply 1111. Those skilled in the art will appreciate that the electronic device configuration shown in fig. 11 does not constitute a limitation of electronic devices, which may include more or fewer components than shown, or some components may be combined, or a different arrangement of components. In the embodiment of the present invention, the electronic device includes, but is not limited to, a mobile phone, a tablet computer, a notebook computer, a palm computer, a vehicle-mounted terminal, a wearable device, a pedometer, and the like.
Wherein processor 1110 is configured to: acquiring a target video feature tag of a target video clip;
the display unit 1106 is used for: and displaying target description information in the target video clip based on the target video feature tag.
Optionally, the obtaining of the target video feature tag of the target video segment performed by the processor 1110 includes:
identifying the video content type of each frame of video image in the target video;
dividing the target video into N video segments based on the video content type of each frame of video image, and adding a video feature tag to each of the N video segments;
determining a target video clip from the N video clips, and acquiring a target video feature tag of the target video clip;
wherein N is a positive integer, the video content type of each frame of video image within each of the N video clips is the same, and the video feature tag of each of the N video clips corresponds to the video content type of any frame of video image in that video clip.
Optionally, the processor 1110 is further configured to:
acquiring a first target multimedia file collected in advance, wherein the first target multimedia file comprises first description information;
The displaying, performed by the display unit 1106, of target description information in the target video segment based on the target video feature tag includes:
acquiring second description information corresponding to the target video feature tag;
and displaying target description information in the target video clip based on the first description information and the second description information.
Optionally, the displaying, performed by the display unit 1106, of target description information in the target video segment based on the first description information and the second description information includes:
determining target description information from the first description information and the second description information, and displaying the target description information in the target video clip;
or generating target description information based on the first description information and the second description information, and displaying the target description information in the target video clip.
Optionally, the display unit 1106 is further configured to: displaying a first track, the first track including the target description information;
the user input unit 1107 is used to: receiving a first input of a user to the first track;
the display unit 1106 is further configured to: adjusting a display range of the target description information in response to the first input;
wherein the display range of the target description information is used for indicating all video images in the target video clip in which the target description information is displayed.
Optionally, the adjusting, performed by the display unit 1106, of the display range of the target description information in response to the first input includes:
when the first input is used for adjusting the display range of the first track to a first display range, adjusting the starting video image of the target description information to a video image corresponding to a first position of the first display range, and adjusting the ending video image of the target description information to a video image corresponding to a second position of the first display range;
and in the case that the input parameters of the first input comprise at least one of input position, input times and input time, adjusting at least one of a start video image and an end video image of the target description information to be a video image corresponding to the input parameters of the first input.
Optionally, the display unit 1106 is further configured to: displaying a second track including a video image thumbnail of each video image in the target video clip and a slider control that is displayed suspended over the second track;
the user input unit 1107 is also used to: receiving a second input of the sliding control by a user;
the display unit 1106 is further configured to: and under the condition that the second input is used for sliding the sliding control to be above the target video image thumbnail in the second track, updating the starting video image or the ending video image of the target description information to the video image corresponding to the target video image thumbnail.
Optionally, the display unit 1106 is further configured to: displaying an import control;
the user input unit 1107 is also used to: receiving a third input of the user to the import control;
the display unit 1106 is further configured to: responding to the third input, displaying P multimedia files, wherein P is a positive integer, and each multimedia file in the P multimedia files comprises description information;
the user input unit 1107 is also used to: receiving a fourth input of a user to a second target multimedia file of the P multimedia files;
the display unit 1106 is further configured to: and responding to the fourth input, acquiring third description information in the second target multimedia file, and updating the description information displayed in the target video clip from the target description information to the third description information.
The electronic device 1100 is capable of implementing the processes implemented by the electronic device in the foregoing embodiments, and details are not repeated here to avoid repetition.
According to the electronic device 1100 of the embodiment of the present invention, the target video feature tag of the target video clip is acquired, and the target description information is displayed in the target video clip based on the target video feature tag, so that matched description information can be automatically displayed in a video clip according to its video feature tag. A video can thus be processed without the user having to learn specialized video-processing knowledge, software, or operations, which reduces the difficulty of video processing and helps the user add description information to a video more conveniently and efficiently.
It should be understood that, in the embodiment of the present invention, the radio frequency unit 1101 may be configured to receive and transmit signals during a message transmission or a call; specifically, after receiving downlink data from a base station, it sends the data to the processor 1110 for processing, and it also transmits uplink data to the base station. In general, the radio frequency unit 1101 includes, but is not limited to, an antenna, at least one amplifier, a transceiver, a coupler, a low noise amplifier, a duplexer, and the like. In addition, the radio frequency unit 1101 may also communicate with a network and other devices through a wireless communication system.
The electronic device provides wireless broadband internet access to the user via the network module 1102, such as to assist the user in sending and receiving e-mail, browsing web pages, and accessing streaming media.
The audio output unit 1103 may convert audio data received by the radio frequency unit 1101 or the network module 1102 or stored in the memory 1109 into an audio signal and output as sound. Also, the audio output unit 1103 may also provide audio output related to a specific function performed by the electronic device 1100 (e.g., a call signal reception sound, a message reception sound, etc.). The audio output unit 1103 includes a speaker, a buzzer, a receiver, and the like.
The input unit 1104 is used to receive audio or video signals. The input unit 1104 may include a Graphics Processing Unit (GPU) 11041 and a microphone 11042. The graphics processor 11041 processes image data of still pictures or video obtained by an image capturing device, such as a camera, in a video capturing mode or an image capturing mode. The processed image frames may be displayed on the display unit 1106. The image frames processed by the graphics processor 11041 may be stored in the memory 1109 (or other storage medium) or transmitted via the radio frequency unit 1101 or the network module 1102. The microphone 11042 may receive sound and process it into audio data. In the phone call mode, the processed audio data may be converted into a format that can be transmitted to a mobile communication base station via the radio frequency unit 1101.
The electronic device 1100 also includes at least one sensor 1105, such as a light sensor, motion sensor, and other sensors. Specifically, the light sensor includes an ambient light sensor that adjusts the brightness of the display panel 11061 according to the brightness of ambient light, and a proximity sensor that turns off the display panel 11061 and/or the backlight when the electronic device 1100 is moved to the ear. As one type of motion sensor, an accelerometer sensor can detect the magnitude of acceleration in each direction (generally three axes), detect the magnitude and direction of gravity when stationary, and can be used to identify the posture of an electronic device (such as horizontal and vertical screen switching, related games, magnetometer posture calibration), and vibration identification related functions (such as pedometer, tapping); the sensors 1105 may also include fingerprint sensors, pressure sensors, iris sensors, molecular sensors, gyroscopes, barometers, hygrometers, thermometers, infrared sensors, etc., and will not be described in detail herein.
The display unit 1106 may include a display panel 11061, and the display panel 11061 may be configured in the form of a Liquid Crystal Display (LCD), an Organic Light-Emitting Diode (OLED) display, or the like.
The user input unit 1107 may be used to receive input numeric or character information and to generate key signal inputs related to user settings and function control of the electronic apparatus. Specifically, the user input unit 1107 includes a touch panel 11071 and other input devices 11072. The touch panel 11071, also referred to as a touch screen, may collect touch operations by a user on or near it (e.g., operations by a user on or near the touch panel 11071 using a finger, a stylus, or any other suitable object or attachment). The touch panel 11071 may include two parts: a touch detection device and a touch controller. The touch detection device detects the user's touch position and the signal generated by the touch operation, and transmits the signal to the touch controller; the touch controller receives the touch information from the touch detection device, converts it into touch point coordinates, sends the coordinates to the processor 1110, and receives and executes commands sent from the processor 1110. The touch panel 11071 may be implemented in various types, such as resistive, capacitive, infrared, and surface acoustic wave. In addition to the touch panel 11071, the user input unit 1107 may include other input devices 11072, which may include, but are not limited to, a physical keyboard, function keys (e.g., volume control keys and switch keys), a trackball, a mouse, and a joystick, and are not described in detail here.
Further, the touch panel 11071 can be overlaid on the display panel 11061, and when the touch panel 11071 detects a touch operation thereon or nearby, the touch operation is transmitted to the processor 1110 to determine the type of the touch event, and then the processor 1110 provides a corresponding visual output on the display panel 11061 according to the type of the touch event. Although the touch panel 11071 and the display panel 11061 are shown in fig. 11 as two separate components to implement the input and output functions of the electronic device, in some embodiments, the touch panel 11071 and the display panel 11061 may be integrated to implement the input and output functions of the electronic device, and the embodiment is not limited herein.
The interface unit 1108 is an interface for connecting an external device to the electronic apparatus 1100. For example, the external device may include a wired or wireless headset port, an external power supply (or battery charger) port, a wired or wireless data port, a memory card port, a port for connecting a device having an identification module, an audio input/output (I/O) port, a video I/O port, an earphone port, and the like. Interface unit 1108 may be used to receive input (e.g., data information, power, etc.) from external devices and transmit the received input to one or more elements within electronic device 1100 or may be used to transmit data between electronic device 1100 and external devices.
The memory 1109 may be used to store software programs as well as various data. The memory 1109 may mainly include a storage program area and a storage data area, where the storage program area may store an operating system, an application program (such as a sound playing function, an image playing function, etc.) required by at least one function, and the like; the storage data area may store data (such as audio data, a phonebook, etc.) created according to the use of the cellular phone, and the like. In addition, the memory 1109 may include high speed random access memory and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other volatile solid state storage device.
The processor 1110 is a control center of the electronic device, connects various parts of the entire electronic device using various interfaces and lines, performs various functions of the electronic device and processes data by operating or executing software programs and/or modules stored in the memory 1109 and calling data stored in the memory 1109, thereby integrally monitoring the electronic device. Processor 1110 may include one or more processing units; preferably, the processor 1110 may integrate an application processor, which mainly handles operating systems, user interfaces, application programs, etc., and a modem processor, which mainly handles wireless communications. It will be appreciated that the modem processor described above may not be integrated into processor 1110.
The electronic device 1100 may further include a power supply 1111 (e.g., a battery) for supplying power to various components, and preferably, the power supply 1111 may be logically connected to the processor 1110 via a power management system, so as to manage charging, discharging, and power consumption management functions via the power management system.
In addition, the electronic device 1100 includes some functional modules that are not shown, and thus are not described in detail herein.
Preferably, an embodiment of the present invention further provides an electronic device, which includes a processor 1110, a memory 1109, and a computer program that is stored in the memory 1109 and is executable on the processor 1110, and when the computer program is executed by the processor 1110, the processes of the video processing method embodiment are implemented, and the same technical effect can be achieved, and details are not repeated here to avoid repetition.
The embodiment of the present invention further provides a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and when the computer program is executed by a processor, the computer program implements each process of the video processing method embodiment, and can achieve the same technical effect, and in order to avoid repetition, details are not repeated here. The computer-readable storage medium may be a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
Through the above description of the embodiments, those skilled in the art will clearly understand that the method of the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but in many cases, the former is a better implementation manner. Based on such understanding, the technical solutions of the present invention may be embodied in the form of a software product, which is stored in a storage medium (such as ROM/RAM, magnetic disk, optical disk) and includes instructions for enabling an electronic device (such as a mobile phone, a computer, a server, an air conditioner, or a network device) to execute the method according to the embodiments of the present invention.
While the present invention has been described with reference to the embodiments shown in the drawings, the present invention is not limited to the embodiments, which are illustrative and not restrictive, and it will be apparent to those skilled in the art that various changes and modifications can be made therein without departing from the spirit and scope of the invention as defined in the appended claims.

Claims (10)

1. A video processing method, comprising:
acquiring a target video feature tag of a target video clip;
and displaying target description information in the target video clip based on the target video feature tag.
2. The method of claim 1, wherein obtaining the target video feature tag of the target video clip comprises:
identifying the video content type of each frame of video image in the target video;
dividing the target video into N video segments based on the video content type of each frame of video image, and adding a video feature tag to each of the N video segments;
determining a target video clip from the N video clips, and acquiring a target video feature tag of the target video clip;
wherein N is a positive integer, the video content type of each frame of video image within each of the N video clips is the same, and the video feature tag of each of the N video clips corresponds to the video content type of any frame of video image in that video clip.
3. The method of claim 1, wherein before displaying target description information in the target video segment based on the target video feature tag, the method further comprises:
acquiring a first target multimedia file collected in advance, wherein the first target multimedia file comprises first description information;
the displaying of target description information in the target video clip based on the target video feature tag includes:
acquiring second description information corresponding to the target video feature tag;
and displaying target description information in the target video clip based on the first description information and the second description information.
4. The method of claim 3, wherein displaying the target description information in the target video segment based on the first description information and the second description information comprises:
determining target description information from the first description information and the second description information, and displaying the target description information in the target video clip;
or generating target description information based on the first description information and the second description information, and displaying the target description information in the target video clip.
5. The method of claim 1, wherein after displaying target description information in the target video segment based on the target video feature tag, the method further comprises:
displaying a first track, the first track including the target description information;
receiving a first input of a user to the first track;
adjusting a display range of the target description information in response to the first input;
wherein the display range of the target description information is used for indicating all video images in the target video segment in which the target description information is displayed.
6. The method of claim 5, wherein said adjusting a display range of said target description information in response to said first input comprises:
when the first input is used for adjusting the display range of the first track to a first display range, adjusting the starting video image of the target description information to a video image corresponding to a first position of the first display range, and adjusting the ending video image of the target description information to a video image corresponding to a second position of the first display range;
and in the case that the input parameters of the first input comprise at least one of input position, input times and input time, adjusting at least one of a start video image and an end video image of the target description information to be a video image corresponding to the input parameters of the first input.
7. The method of claim 1, wherein after displaying target description information in the target video segment based on the target video feature tag, the method further comprises:
displaying a second track including a video image thumbnail of each video image in the target video clip and a slider control that is displayed suspended over the second track;
receiving a second input of the sliding control by a user;
and under the condition that the second input is used for sliding the sliding control to be above the target video image thumbnail in the second track, updating the starting video image or the ending video image of the target description information to the video image corresponding to the target video image thumbnail.
8. The method of claim 1, wherein after displaying target description information in the target video segment based on the target video feature tag, the method further comprises:
displaying an import control;
receiving a third input of the user to the import control;
responding to the third input, displaying P multimedia files, wherein P is a positive integer, and each multimedia file in the P multimedia files comprises description information;
receiving a fourth input of a user to a second target multimedia file of the P multimedia files;
and responding to the fourth input, acquiring third description information in the second target multimedia file, and updating the description information displayed in the target video clip from the target description information to the third description information.
9. A video processing apparatus, comprising:
the first acquisition module is used for acquiring a target video feature tag of a target video clip;
and the first display module is used for displaying the target description information in the target video clip based on the target video feature tag.
10. An electronic device, comprising a processor, a memory and a computer program stored on the memory and executable on the processor, the computer program, when executed by the processor, implementing the steps of the video processing method according to any one of claims 1 to 8.
CN202010306988.XA 2020-04-17 2020-04-17 Video processing method, video processing device and electronic equipment Active CN111491206B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010306988.XA CN111491206B (en) 2020-04-17 2020-04-17 Video processing method, video processing device and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010306988.XA CN111491206B (en) 2020-04-17 2020-04-17 Video processing method, video processing device and electronic equipment

Publications (2)

Publication Number Publication Date
CN111491206A true CN111491206A (en) 2020-08-04
CN111491206B CN111491206B (en) 2023-03-24

Family

ID=71812801

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010306988.XA Active CN111491206B (en) 2020-04-17 2020-04-17 Video processing method, video processing device and electronic equipment

Country Status (1)

Country Link
CN (1) CN111491206B (en)

Citations (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102622451A (en) * 2012-04-16 2012-08-01 上海交通大学 System for automatically generating television program labels
CN102638658A (en) * 2012-03-01 2012-08-15 盛乐信息技术(上海)有限公司 Method and system for editing audio-video
CN103780973A (en) * 2012-10-17 2014-05-07 三星电子(中国)研发中心 Video label adding method and video label adding device
CN105025330A (en) * 2014-04-30 2015-11-04 深圳Tcl新技术有限公司 Media file playing control method and apparatus based on DASH protocol
CN105988369A (en) * 2015-02-13 2016-10-05 上海交通大学 Content-driving-based intelligent household control method
CN106303726A (en) * 2016-08-30 2017-01-04 北京奇艺世纪科技有限公司 The adding method of a kind of video tab and device
CN106649855A (en) * 2016-12-30 2017-05-10 中广热点云科技有限公司 Video label adding method and adding system
CN107402985A (en) * 2017-07-14 2017-11-28 广州爱拍网络科技有限公司 Special video effect output control method, device and computer-readable recording medium
CN108933970A (en) * 2017-05-27 2018-12-04 北京搜狗科技发展有限公司 The generation method and device of video
CN109635158A (en) * 2018-12-17 2019-04-16 杭州柚子街信息科技有限公司 For the method and device of video automatic labeling, medium and electronic equipment
CN110139159A (en) * 2019-06-21 2019-08-16 上海摩象网络科技有限公司 Processing method, device and the storage medium of video material
CN110297943A (en) * 2019-07-05 2019-10-01 联想(北京)有限公司 Adding method, device, electronic equipment and the storage medium of label
CN110381371A (en) * 2019-07-30 2019-10-25 维沃移动通信有限公司 A kind of video clipping method and electronic equipment
CN110472098A (en) * 2019-08-20 2019-11-19 北京达佳互联信息技术有限公司 Determination method, apparatus, electronic equipment and the storage medium of video content topic
CN110582018A (en) * 2019-09-16 2019-12-17 腾讯科技(深圳)有限公司 Video file processing method, related device and equipment

Also Published As

Publication number Publication date
CN111491206B (en) 2023-03-24

Similar Documents

Publication Publication Date Title
CN108762954B (en) Object sharing method and mobile terminal
CN110557565B (en) Video processing method and mobile terminal
CN109857905B (en) Video editing method and terminal equipment
CN108920239B (en) Long screen capture method and mobile terminal
CN109361867B (en) Filter processing method and mobile terminal
CN109660728B (en) Photographing method and device
CN111010610B (en) Video screenshot method and electronic equipment
CN109683777B (en) Image processing method and terminal equipment
CN108646960B (en) File processing method and flexible screen terminal
CN111147779B (en) Video production method, electronic device, and medium
CN110909524B (en) Editing method and electronic equipment
CN109819168B (en) Camera starting method and mobile terminal
CN108763540B (en) File browsing method and terminal
CN111445927B (en) Audio processing method and electronic equipment
CN110557683A (en) Video playing control method and electronic equipment
CN109684277B (en) Image display method and terminal
CN111491205B (en) Video processing method and device and electronic equipment
CN109257649B (en) Multimedia file generation method and terminal equipment
CN108984143B (en) Display control method and terminal equipment
CN109271262B (en) Display method and terminal
WO2019076377A1 (en) Image viewing method and mobile terminal
CN110913261A (en) Multimedia file generation method and electronic equipment
CN111176526B (en) Picture display method and electronic equipment
CN109542307B (en) Image processing method, device and computer readable storage medium
CN110321449B (en) Picture display method and terminal

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant