CN116521925A - Video recording, playing, retrieving and playback method and device, electronic equipment and medium


Info

Publication number
CN116521925A
Authority
CN
China
Prior art keywords
video
user
determining
voice
character
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310489598.4A
Other languages
Chinese (zh)
Inventor
方斌
段克
马起礼
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Shitong Science And Technology Co ltd
Original Assignee
Beijing Shitong Science And Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Shitong Science And Technology Co ltd filed Critical Beijing Shitong Science And Technology Co ltd
Priority to CN202310489598.4A
Publication of CN116521925A


Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50 - Information retrieval of still image data
    • G06F16/53 - Querying
    • G06F16/532 - Query formulation, e.g. graphical querying
    • G06F16/538 - Presentation of query results
    • G06F16/58 - Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/583 - Retrieval using metadata automatically derived from the content
    • G06F16/5866 - Retrieval using information manually generated, e.g. tags, keywords, comments, manually generated location and time information
    • Y - GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 - TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D - CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 - Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The present invention relates to the field of video playback control technologies, and in particular to a video recording, playing, retrieving and playback method and device, electronic equipment, and a medium. The method comprises the following steps: processing an original video based on an acquired first playback instruction of a user and an invoked preset processing rule, and determining video clips based on the processing result, where the first playback instruction is the instruction indicating the content the user needs to play back; generating prompt information based on the video clips, where the prompt information asks the user whether the video clips need to be screened again; and, if a second playback instruction fed back by the user based on the prompt information is acquired, screening the video clips based on the second playback instruction, determining a target video clip, and feeding it back for display. The method and device can locate the target video clip the user wants quickly and accurately, improving the user's viewing experience.

Description

Video recording, playing, retrieving and playback method and device, electronic equipment and medium
Technical Field
The present invention relates to the field of data query technologies, and in particular, to a video recording and playing retrieval playback method, device, electronic equipment, and medium.
Background
With the development of network technology, video has become an important channel for learning and entertainment: people can watch content of interest without leaving home, and for recordings of live events they can learn what happened on site through video playback.
Conventional video playback offers two approaches when a user wants to rewatch part of a video. The first is to watch from the beginning at an increased playback speed and switch back to normal speed once the target segment is reached; the second is to drag the progress bar, relying on the user's memory of the video, until the target segment is found.
Both approaches share a defect: repeatedly dragging the progress bar to confirm the target video clip is too cumbersome, and the target clip cannot be located accurately in a short time, which degrades the user's viewing experience.
Disclosure of Invention
In order to improve the viewing experience of a user, the application provides a video recording and playing retrieval playback method, a video recording and playing retrieval playback device, electronic equipment and a medium.
In a first aspect, the present application provides a video recording, playing, retrieving and playing method, which adopts the following technical scheme:
a video recording and playing retrieval playback method comprises the following steps:
Processing the original video based on the acquired first playback instruction of the user and the called preset processing rule, and determining a video clip based on a processing result;
the first playback instruction of the user is an instruction indicating the content that the user needs to play back;
generating prompt information based on the video clip;
the prompt information is information for informing a user whether the video clips need to be screened again;
and if a second playback instruction fed back by the user based on the prompt information is acquired, screening the video clips based on the second playback instruction, determining a target video clip and feeding back and displaying the target video clip.
By adopting this technical scheme, a first playback instruction indicating the content the user needs to play back is first acquired, the original video is processed according to the preset processing rule, and video clips are determined. Prompt information is then generated asking the user whether the clips need to be searched further; if the user chooses further screening, a second playback instruction is acquired, the video clips are screened accordingly, and the target video clip the user needs is finally determined. The target video clip the user wants can thus be located quickly and accurately, improving the user's viewing experience.
In one possible implementation manner, the processing the original video based on the acquired first playback instruction of the user and the invoked preset processing rule includes:
determining character features in the original video based on the first playback instruction of the user and the preset processing rule;
determining appearance time points and disappearance time points corresponding to the character features;
and determining a character video clip corresponding to the character feature based on the appearance time point and the disappearance time point, and setting the character video clip as a first video clip.
By adopting this technical scheme, all character features in the original video are first determined with the user's first playback instruction and the preset processing rule as preconditions. Then, taking each appearing character feature as a reference, its appearance time point and disappearance time point are determined, so that each appearance time point and the corresponding disappearance time point delimit one video clip. Finally, the clips corresponding to all character features are determined and set as the first video clips, providing the user with a retrieval mode that takes people as the main retrieval target.
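The application does not specify how the appearance and disappearance time points are computed. As a minimal, hypothetical Python sketch, one can assume per-frame person-detection results (timestamp, visible) and merge consecutive detections into (appearance, disappearance) intervals, tolerating brief detection gaps:

```python
# Hypothetical sketch: merge per-frame person detections into
# (appearance, disappearance) intervals; one interval per first sub-clip.
def character_clips(detections, max_gap=0.5):
    """detections: sorted (timestamp_seconds, person_visible) pairs.
    max_gap: longest invisible stretch still counted as the same clip."""
    clips = []
    start = prev = None
    for t, visible in detections:
        if visible:
            if start is None:
                start = t                # appearance time point
            prev = t
        elif start is not None and t - prev > max_gap:
            clips.append((start, prev))  # disappearance time point
            start = None
    if start is not None:
        clips.append((start, prev))
    return clips

# A person visible around 1-3 s and again around 10-12 s:
frames = [(0.0, False), (1.0, True), (2.0, True), (3.0, True),
          (5.0, False), (10.0, True), (11.0, True), (12.0, True)]
print(character_clips(frames))  # [(1.0, 3.0), (10.0, 12.0)]
```

Each returned pair delimits one first sub-video clip; the detection step itself (a face or person detector) is outside the scope of this sketch.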
In one possible implementation manner, the processing the original video based on the acquired first playback instruction of the user and the invoked preset processing rule includes:
Determining voice characteristics in the original video based on the first playback instruction of the user and the preset processing rule;
determining an appearance time point and a disappearance time point corresponding to the voice feature;
and determining a voice video segment corresponding to the voice feature based on the occurrence time point and the disappearance time point, and setting the voice video segment as a second video segment.
By adopting this technical scheme, all voice features in the original video are first determined with the user's first playback instruction and the preset processing rule as preconditions. Then, taking a preset voice pause duration as the reference, the occurrence time point and disappearance time point of each voice feature are determined, so that each occurrence time point and the corresponding disappearance time point delimit one video clip. Finally, the clips corresponding to all voice features are determined and set as the second video clips, providing the user with a retrieval mode that takes voice as the main retrieval target.
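The preset-pause rule can be sketched in the same hypothetical form (an illustration, not the application's implementation): timestamps at which speech is detected are grouped into one clip until the silence between them exceeds the preset pause duration:

```python
# Hypothetical sketch: group speech timestamps into second sub-clips,
# starting a new clip whenever the silence exceeds the preset pause.
def voice_clips(speech_times, pause=1.0):
    clips = []
    for t in speech_times:
        if clips and t - clips[-1][1] <= pause:
            clips[-1][1] = t          # speech continues: extend the clip
        else:
            clips.append([t, t])      # new occurrence time point
    return [tuple(c) for c in clips]

print(voice_clips([0.0, 0.5, 1.2, 5.0, 5.4]))  # [(0.0, 1.2), (5.0, 5.4)]
```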
In one possible implementation manner, the processing the original video based on the acquired first playback instruction of the user and the invoked preset processing rule includes:
determining character features and voice features in the original video based on the first playback instruction of the user and the preset processing rule;
Determining a time node at which the character feature and the voice feature coexist;
and determining a character voice video segment based on the time node, and setting the character voice video segment as a third video segment.
By adopting this technical scheme, all character features and voice features in the original video are first determined with the user's first playback instruction and the preset processing rule as preconditions. Then, taking the character features as references, the time nodes at which character features and voice features exist simultaneously are determined, each such node corresponding to one video clip. Finally, the clips in which character and voice features coexist are determined and set as the third video clips, providing the user with a retrieval mode that takes both characters and voice as the main retrieval targets.
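Determining the time nodes where character and voice features coexist amounts to intersecting the two sets of intervals. A minimal sketch, assuming interval lists like those produced by the two rules above:

```python
# Hypothetical sketch: each non-empty overlap of a character interval and a
# voice interval is a time node for a third (character-and-voice) clip.
def coexist_nodes(char_clips, voice_clips):
    nodes = []
    for cs, ce in char_clips:
        for vs, ve in voice_clips:
            s, e = max(cs, vs), min(ce, ve)
            if s < e:                  # the two features coexist on [s, e]
                nodes.append((s, e))
    return sorted(nodes)

print(coexist_nodes([(1.0, 4.0)], [(2.0, 6.0)]))  # [(2.0, 4.0)]
```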
In one possible implementation manner, the target video segment includes at least one sub-target video segment, and if a second playback instruction fed back by the user based on the prompt information is obtained, filtering the video segment based on the second playback instruction of the user, determining the target video segment, and feeding back and displaying the target video segment, including:
generating a search box and feeding back and displaying the search box based on the second playback instruction;
Acquiring keyword characteristics input by a user in the search box;
and screening the video clips based on the keyword features to determine target video clips.
By adopting this technical scheme, once a second playback instruction indicating that the user needs further screening of the video clips is acquired, a search box is automatically generated and displayed. After the user enters the keyword features corresponding to the desired clip into the search box, the video clips are screened by those keyword features, and the target video clip is finally determined, locating the clip the user wants more precisely.
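The keyword screening step can be illustrated with a hypothetical clip structure in which each clip carries the text already extracted from its character and/or voice features:

```python
# Hypothetical sketch: keep only the clips whose extracted text contains the
# keyword features typed into the search box.
def filter_clips(clips, keyword):
    return [c for c in clips if keyword in c["text"]]

clips = [
    {"start": 10, "end": 25, "text": "opening remarks by the host"},
    {"start": 40, "end": 90, "text": "product demo and questions"},
]
print(filter_clips(clips, "demo"))  # only the second clip survives
```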
In one possible implementation manner, the keyword features are character text features and/or voice text features, and the acquiring the keyword features input by the user in the search box further includes:
if the user inputs character text features, extracting the character features in the video clips, representing the character features in a text form, and determining all the character text features;
if the user inputs the voice text features, performing text conversion on the voice features in the video clips, and determining all the voice text features;
If the user inputs character text features and voice text features, extracting all character text features expressed in a text form in the video segment, and extracting all voice text features in the video segment.
By adopting this technical scheme, if the user enters character text features, all character features in the video clips are extracted and represented in text form; if the user enters voice text features, all voice features in the clips are extracted and converted to text; and if the user enters both, the character text features and voice text features in the clips are extracted simultaneously. Different combinations are thereby provided for the user to retrieve the desired target video clip.
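The three input combinations can be sketched as a small dispatcher. The helpers `ocr_character_text` and `transcribe_speech` are hypothetical stand-ins for the character-to-text and speech-to-text steps, which the application does not detail:

```python
def ocr_character_text(clip):
    # Placeholder for extracting character features and rendering them as text.
    return clip.get("character_text", "")

def transcribe_speech(clip):
    # Placeholder for converting the clip's voice features to text.
    return clip.get("voice_text", "")

def extract_text(clip, want_character, want_voice):
    # Gather only the text forms the user asked for, in a fixed order.
    parts = []
    if want_character:
        parts.append(ocr_character_text(clip))
    if want_voice:
        parts.append(transcribe_speech(clip))
    return " ".join(p for p in parts if p)

clip = {"character_text": "slide title", "voice_text": "welcome everyone"}
print(extract_text(clip, True, True))  # "slide title welcome everyone"
```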
In one possible implementation, the target video clip includes at least one sub-target video clip; if a second playback instruction fed back by the user based on the prompt information is acquired, screening the video clips based on the second playback instruction, determining a target video clip and feeding back and displaying the target video clip, including:
generating a time starting point corresponding to the sub-target video segment;
If the number of the sub-target video clips is the preset number, directly jumping to a time starting point corresponding to the sub-target video clips for playing;
if the number of the sub-target video clips is greater than the preset number, preferentially jumping to the time starting point corresponding to the sub-target video clip with the earliest time starting point for playing, and performing highlighting processing on the time starting points corresponding to the rest sub-target video clips contained in the target video clip.
By adopting this technical scheme, after the electronic device completes the screening process, time starting points are generated from the determined occurrence times of all sub-target video clips, and the number of sub-target clips is checked. If it equals the preset number, the screened target clip is unique, and the device jumps directly to its time starting point to start playing. If it exceeds the preset number, several sub-target clips satisfy the conditions: following the order of the starting points, the device jumps to the earliest one to begin playback and highlights the remaining starting points, helping the user see clearly where the target clips sit on the progress bar of the original video.
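The jump-or-highlight decision can be sketched as follows, with the preset number assumed to be one (a unique match), as the surrounding text suggests:

```python
# Hypothetical sketch: jump straight to a unique sub-target clip; otherwise
# jump to the earliest start and highlight the remaining starts.
def playback_action(sub_clips, preset=1):
    starts = sorted(start for start, _end in sub_clips)
    if len(starts) == preset:
        return {"jump_to": starts[0], "highlight": []}
    return {"jump_to": starts[0], "highlight": starts[1:]}

print(playback_action([(30, 45)]))              # unique clip: jump to 30
print(playback_action([(120, 140), (30, 45)]))  # jump to 30, highlight 120
```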
In a second aspect, the present application provides a video recording, playing, retrieving and playing device, which adopts the following technical scheme:
a video recording, playback and retrieval device, comprising: a video clip determination module, a prompt module, and a target video clip determination module, wherein,
the video segment determining module is used for processing the original video based on the acquired first playback instruction of the user and the called preset processing rule and determining a video segment based on the processing result;
the first playback instruction of the user is an instruction indicating the content that the user needs to play back;
the prompt module is used for generating prompt information based on the video clips;
the prompt information is information for informing a user whether the video clips need to be screened again;
and the target video segment determining module is used for screening the video segments based on the second playback instruction if the second playback instruction fed back by the user based on the prompt information is acquired, determining the target video segments and feeding back and displaying the target video segments.
By adopting this technical scheme, the video clip determining module first acquires the first playback instruction indicating what the user needs to play back, processes the original video according to the preset processing rule, and determines the video clips. The prompt module then generates prompt information asking the user whether the clips need to be searched further; if the user chooses further screening, the target video clip determining module acquires the user's second playback instruction, screens the video clips accordingly, and finally determines the target video clip the user needs. The target clip the user wants can thus be located quickly and accurately, improving the user's viewing experience.
In one possible implementation, the video clip determining module includes: a person determining unit, a person time determining unit, and a first video clip unit, wherein,
the character determining unit is used for determining character characteristics in the original video based on the first playback instruction of the user and the preset processing rule;
a person time determining unit, configured to determine an appearance time point and a disappearance time point corresponding to the person feature;
and the first video segment unit is used for determining the character video segment corresponding to the character feature based on the appearance time point and the disappearance time point, and setting the character video segment as the first video segment.
In one possible implementation, the video clip determining module includes: a voice determination unit, a voice time determination unit, and a second video clip unit, wherein,
the voice determining unit is used for determining voice characteristics in the original video based on the first playback instruction of the user and the preset processing rule;
a voice time determining unit, configured to determine an occurrence time point and a disappearance time point corresponding to the voice feature;
and the second video segment unit is used for determining the voice video segment corresponding to the voice feature based on the occurrence time point and the disappearance time point, and setting the voice video segment as a second video segment.
In one possible implementation, the video clip determining module includes: a character voice determination unit, a time node determination unit, and a third video clip unit, wherein,
a character voice determining unit, configured to determine character features and voice features in the original video based on the first playback instruction of the user and the preset processing rule;
a time node determining unit, configured to determine a time node in which the character feature and the voice feature coexist;
and the third video segment unit is used for determining the character voice video segment based on the time node and setting the character voice video segment as the third video segment.
In one possible implementation, the target video clip determination module includes: a search box unit, an input unit and a screening unit, wherein,
the search box unit is used for generating a search box and feeding back and displaying the search box based on the second playback instruction;
the input unit is used for acquiring keyword characteristics input by a user in the search box;
and the screening unit is used for screening the video clips based on the keyword characteristics to determine target video clips.
In one possible implementation manner, the video recording and playing retrieval playback device further includes: a persona text module, a voice text module, and persona and voice text modules, wherein,
the character text module is used for extracting character features in the video clips if the user inputs the character text features, representing the character features in a text form and determining all the character text features;
the voice text module is used for carrying out text conversion on the voice features in the video clips if the user inputs the voice text features, and determining all the voice text features;
and the character and voice text module is used for extracting all character text features expressed in a character form in the video segment and extracting all voice text features in the video segment if the character text features and the voice text features are input by a user.
In one possible implementation, the target video clip determination module includes: a time starting point unit, a first jumping unit and a second jumping unit, wherein,
a time starting point unit, configured to generate a time starting point corresponding to the sub-target video segment;
the first jumping unit is used for jumping to a time starting point corresponding to the sub-target video clips directly to play if the number of the sub-target video clips is a preset number;
And the second jumping unit is used for jumping to the time starting point corresponding to the sub-target video segment with the earliest time starting point to play if the number of the sub-target video segments is larger than the preset number, and highlighting the time starting points corresponding to the rest sub-target video segments contained in the target video segment.
In a third aspect, the present application provides an electronic device, which adopts the following technical scheme:
an electronic device, the electronic device comprising:
at least one processor;
a memory;
at least one application program, wherein the at least one application program is stored in the memory and configured to be executed by the at least one processor to perform the video recording, playing, retrieval and playback method described above.
In a fourth aspect, the present application provides a computer readable storage medium, which adopts the following technical scheme:
a computer-readable storage medium storing a computer program that can be loaded by a processor to perform the video recording, retrieval and playback method described above.
In summary, the present application includes the following beneficial technical effects:
First, a first playback instruction indicating the content the user needs to play back is acquired, the original video is processed according to the preset processing rule, and video clips are determined. Prompt information is then generated asking the user whether the clips need to be searched further; if the user chooses further screening, a second playback instruction is acquired, the video clips are screened accordingly, and the target video clip the user needs is finally determined. The target video clip the user wants can thus be located quickly and accurately, improving the user's viewing experience.
Drawings
Fig. 1 is a schematic flow chart of a video recording, searching and playback method according to an embodiment of the present application;
FIG. 2 is a block diagram of a video recording-based retrieval playback device according to an embodiment of the present application;
fig. 3 is a schematic diagram of an electronic device according to an embodiment of the present application.
Detailed Description
The present application is described in further detail below in conjunction with figures 1-3.
For the purposes of making the objects, technical solutions and advantages of the embodiments of the present application more clear, the technical solutions of the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is apparent that the described embodiments are some embodiments of the present application, but not all embodiments. All other embodiments, which can be made by one of ordinary skill in the art based on the embodiments herein without making any inventive effort, are intended to be within the scope of the present application.
The embodiment of the application provides a video recording, playing, retrieving and playback method executed by an electronic device. The electronic device may be a server or a terminal device: the server may be an independent physical server, a server cluster or distributed system formed by multiple physical servers, or a cloud server providing cloud computing services; the terminal device may be a smart phone, tablet computer, notebook computer, desktop computer, or the like, but is not limited thereto. The terminal device and the server may be connected directly or indirectly through wired or wireless communication, which is not limited herein.
Referring to fig. 1, the method includes: step S101, step S102, and step S103, wherein:
s101, processing an original video based on the acquired first playback instruction of the user and the called preset processing rule, and determining a video clip based on a processing result.
For the embodiment of the application, the first playback instruction of the user indicates an instruction that the user needs to play back, and the preset processing rule is a rule for splitting the original video, including a character rule, a voice rule, and a character and voice rule.
Specifically, after the electronic device acquires the instruction that the user needs to play back, it invokes the preset processing rules and splits the original video into three sets of video clips based on the character rule, the voice rule, and the combined character-and-voice rule respectively. The device then acquires the user's choice of mode and displays the corresponding clip set; for example, if the user selects the voice rule, the device displays the clips produced by the voice-rule split.
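The three-way split and user selection described above can be sketched as a dispatch table; the three splitter functions are hypothetical placeholders for the character rule, voice rule, and combined rule:

```python
def split_by_character(video):
    return [("character", 0.0, 5.0)]   # placeholder result

def split_by_voice(video):
    return [("voice", 1.0, 4.0)]       # placeholder result

def split_by_character_and_voice(video):
    return [("both", 1.0, 4.0)]        # placeholder result

RULES = {
    "character": split_by_character,
    "voice": split_by_voice,
    "character_and_voice": split_by_character_and_voice,
}

def clips_for_selection(video, selection):
    # Split under all three rules up front, then show the chosen clip set.
    precomputed = {name: rule(video) for name, rule in RULES.items()}
    return precomputed[selection]

print(clips_for_selection("meeting.mp4", "voice"))  # [('voice', 1.0, 4.0)]
```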
Further, the preset processing rules may be stored in at least one of the following ways: in the first mode, in the electronic device's own storage space; in the second mode, in a hardware storage device connected to the electronic device; in the third mode, in a cloud storage space, so that the rules can be fetched at any time.
S102, generating prompt information based on the video clips.
For the embodiment of the application, the prompt information is information for informing the user whether the video clip needs to be screened again.
Specifically, the electronic device generates the prompt information at a position that does not interfere with viewing of the clips displayed to the user, asking whether the clips need to be screened again. The responses to the prompt also indicate how satisfied users are with the clips produced by the preset processing rules, so the rules can be improved continually based on this feedback.
It should be noted that, for the prompting mode and the prompting position of the prompting information, the embodiment of the application is not specifically limited.
And S103, if a second playback instruction fed back by the user based on the prompt information is acquired, screening the video clips based on the second playback instruction, determining a target video clip and feeding back and displaying the target video clip.
For the embodiments of the present application, the screening process includes character feature and/or speech feature screening.
Specifically, the user responds to the generated prompt information. If the user chooses not to screen further, the prompt is closed so the user can watch the video clips. If further screening is needed, the electronic device generates a second playback instruction, screens the clips against it by comparing character features and/or voice features, finally determines the target video clips, and displays them. The target video clip may comprise several clips, and the further screening gives the user a more precise search mode to obtain the clips they need.
The embodiment of the application provides a video recording, playing and retrieval playback method. When the electronic device receives a first playback instruction indicating what the user needs to play back, it first splits the original video by the character rule, the voice rule, and the combined character-and-voice rule, and then, based on the user's choice of splitting mode, displays the corresponding video clips as the processing result. At the same time it generates prompt information asking whether the current clips need further screening. If not, the prompt is closed so the user can watch the clips of the first processing result. If so, the electronic device receives the user's second playback instruction, compares the features the user entered with the character features and/or voice features extracted from the current clips, determines the target video clips as the final processing result, and displays them to the user. The target video clips the user wants to play back can thus be located quickly and accurately, improving the user's viewing experience.
In step S101, processing an original video based on the acquired first playback instruction of the user and the invoked preset processing rule specifically includes: determining character features in the original video based on a first playback instruction of a user and a preset processing rule; determining appearance time points and disappearance time points corresponding to the character features; and determining a character video clip corresponding to the character feature based on the appearance time point and the disappearance time point, and setting the character video clip as a first video clip.
For the embodiment of the present application, there may be multiple appearance time points and disappearance time points, and likewise the first video segment may include multiple first sub-video segments. It should be noted that this is stated here only to illustrate that a video segment may contain multiple sub-video segments; repeated descriptions of this point are omitted below.
Specifically, when receiving a playback instruction from the user, the electronic device collects character image information in the original video and records each appearance time point and the corresponding disappearance time point of that character image information, marking them on the progress bar of the original video; this determines a first sub-video segment. The operation is repeated until the entire original video has been processed, ensuring that each first sub-video segment contains character image information. All first sub-video segments are stored separately and gathered into the first video segment, and the first video segment, containing a plurality of time nodes, is displayed.
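The bookkeeping described above — turning per-frame character detections into (appearance, disappearance) time-point pairs — can be sketched as follows. The frame-level detector is abstracted as a list of boolean presence flags, which is an assumption made for illustration; the patent does not specify a particular detection method.

```python
# Sketch: derive (appear, disappear) time points from per-frame person
# detections. `person_present[i]` is True iff a character is detected in
# frame i (a stand-in for whatever detector the device actually uses).

def person_segments(person_present, fps):
    """Group consecutive frames containing a person into
    (appear_time, disappear_time) pairs, in seconds."""
    segments = []
    start = None
    for i, present in enumerate(person_present):
        if present and start is None:
            start = i                                # character appears
        elif not present and start is not None:
            segments.append((start / fps, i / fps))  # character disappears
            start = None
    if start is not None:                            # still visible at video end
        segments.append((start / fps, len(person_present) / fps))
    return segments

flags = [False, True, True, True, False, False, True, True]
print(person_segments(flags, fps=1))  # [(1.0, 4.0), (6.0, 8.0)]
```

Each returned pair corresponds to one first sub-video segment; the full list is the first video segment with its plurality of time nodes.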
In step S101, the processing of the original video based on the obtained first playback instruction of the user and the invoked preset processing rule specifically further includes: determining voice characteristics in the original video based on a first playback instruction of a user and a preset processing rule; determining an appearance time point and a disappearance time point corresponding to the voice features; and determining the voice video clip corresponding to the voice feature based on the appearance time point and the disappearance time point, and setting the voice video clip as a second video clip.
For the embodiment of the application, when the electronic device receives a playback instruction from the user, it determines the voice audio information in the original video. At the same time, the electronic device sequentially records each appearance time point and the corresponding disappearance time point of the voice audio information and marks them on the progress bar of the original video; this determines a second sub-video segment. The operation is repeated until the entire original video has been processed, ensuring that each second sub-video segment contains voice audio information. All second sub-video segments are stored separately and gathered into the second video segment, and the second video segment, containing a plurality of time nodes, is displayed.
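The voice-segment step can be sketched in the same spirit, under a simplifying assumption: voice "appearance" is approximated here by short-time audio energy exceeding a threshold, standing in for whatever voice-activity detection the device actually uses.

```python
# Sketch: derive (appear, disappear) time points for voice audio.
# Energy thresholding is an illustrative stand-in for a real voice
# activity detector; frame_len and threshold are arbitrary.

def voice_segments(samples, sample_rate, frame_len=400, threshold=0.01):
    """Return (appear, disappear) time points, in seconds, for runs of
    audio frames whose mean energy exceeds `threshold`."""
    segments, start = [], None
    n_frames = len(samples) // frame_len
    for i in range(n_frames):
        frame = samples[i * frame_len:(i + 1) * frame_len]
        energy = sum(s * s for s in frame) / frame_len
        t = i * frame_len / sample_rate                  # frame start time
        if energy > threshold and start is None:
            start = t                                    # voice appears
        elif energy <= threshold and start is not None:
            segments.append((start, t))                  # voice disappears
            start = None
    if start is not None:                                # voice continues to end
        segments.append((start, n_frames * frame_len / sample_rate))
    return segments
```

Each pair again corresponds to one second sub-video segment marked on the progress bar.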
In step S101, the processing of the original video based on the obtained first playback instruction of the user and the invoked preset processing rule specifically further includes: determining character features and voice features in the original video based on a first playback instruction of a user and a preset processing rule; determining a time node in which the character feature and the voice feature coexist; based on the time node, a person voice video clip is determined and set as a third video clip.
For the embodiment of the application, when the electronic device receives a playback instruction from the user, it collects character image information and voice audio information in the original video. At the same time, the electronic device sequentially records the time nodes at which the character image information and the voice audio information appear and disappear together and marks them on the progress bar of the original video; this determines a third sub-video segment. The operation is repeated until the entire original video has been processed, ensuring that each third sub-video segment contains both character image information and voice audio information. All third sub-video segments are stored separately and gathered into the third video segment, and the electronic device displays the third video segment containing a plurality of time nodes.
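The time nodes where character and voice features coexist are simply the overlaps between the two kinds of intervals above; a minimal sketch of that intersection, assuming both lists are sorted and non-overlapping:

```python
# Sketch: a third video clip covers only the spans where a character
# interval and a voice interval overlap (the "coexist" time nodes).

def coexist_segments(person_segs, voice_segs):
    """Intersect two sorted lists of (start, end) intervals."""
    out, i, j = [], 0, 0
    while i < len(person_segs) and j < len(voice_segs):
        start = max(person_segs[i][0], voice_segs[j][0])
        end = min(person_segs[i][1], voice_segs[j][1])
        if start < end:                       # both features present here
            out.append((start, end))
        # advance whichever interval finishes first
        if person_segs[i][1] < voice_segs[j][1]:
            i += 1
        else:
            j += 1
    return out

print(coexist_segments([(0, 5), (8, 12)], [(3, 9)]))  # [(3, 5), (8, 9)]
```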
In step S103, if a second playback instruction fed back by the user based on the prompt information is obtained, the video clip is filtered based on the second playback instruction of the user, and a target video clip is determined and displayed in a feedback manner, and specifically further includes: generating a search box and feeding back and displaying the search box based on the second playback instruction; acquiring keyword characteristics input by a user in a search box; and screening the video clips based on the keyword characteristics to determine target video clips.
In the embodiment of the application, the keyword features represent character text features and/or voice text features in the target video segment which the user wants to query.
Specifically, the search box is generated and displayed on the condition that the second playback instruction received by the electronic device indicates the user has chosen further screening. After the user enters keyword features in the search box, the electronic device screens the character text features and/or voice text features of the previously selected video segments against those keyword features and extracts the segments that match as target video segments. For example, if the keyword feature entered by the user is the voice text feature "good learning", all video segments in the corresponding second video segment that contain the speech "good learning" are extracted as the final target video segments.
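The keyword-screening step can be sketched as a substring match over per-segment text; the segment records and field names below are illustrative assumptions, not the patent's actual data model.

```python
# Sketch: filter candidate segments by a keyword against their speech
# transcript and/or character-description text (illustrative fields).

def filter_segments(segments, keyword):
    """Keep segments whose speech text or character text contains keyword."""
    return [
        seg for seg in segments
        if keyword in seg.get("speech_text", "")
        or keyword in seg.get("character_text", "")
    ]

clips = [
    {"start": 10, "speech_text": "good learning every day"},
    {"start": 55, "speech_text": "lunch break"},
]
print(filter_segments(clips, "good learning"))
# [{'start': 10, 'speech_text': 'good learning every day'}]
```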
Further, the method for acquiring the keyword features input by the user in the search box comprises the following steps: if the user inputs character text features, extracting character features in the video clips, representing the character features in a text form, and determining all character text features; if the user inputs the voice text features, performing text conversion on the voice features in the video clips, and determining all the voice text features; if the user inputs character text features and voice text features, extracting all character text features expressed in a text form in the video segment, and extracting all the voice text features in the video segment.
In embodiments of the present application, the character features may include clothing features and character names, and the voice features may include audio information.
Specifically, if the user inputs character text features, the clothing features and character names in the video segments are expressed in text form. For example, if the video contains a clip of a man in a business suit giving a lecture, the user can enter the keyword feature "business suit lecture" and the corresponding video segment can be retrieved. If the user inputs voice text features, all speech in the video is converted into voice text so that the user can later search by text. For example, if a section of audio in the video is "good learning every day", the user can enter the keyword feature "learning" or "good learning" and the corresponding video segment can be retrieved.
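Expressing character features in text form can be sketched as joining the detected clothing labels and recognized name into one searchable string; the field names here are hypothetical, chosen only to mirror the clothing-feature and character-name examples above.

```python
# Sketch: turn structured character features (clothing labels, name) into
# a single text string that keyword search can match against.
# The "clothing"/"name" fields are illustrative assumptions.

def character_text(features):
    """Join detected clothing labels and a recognized name (if any)
    into one searchable text string."""
    parts = list(features.get("clothing", []))
    if features.get("name"):
        parts.append(features["name"])
    return " ".join(parts)

print(character_text({"clothing": ["business suit"], "name": "lecturer"}))
# business suit lecturer
```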
It should be noted that the keyword features may be in any language; this is not specifically limited in this application.
In step S103, the target video segment includes at least one sub-target video clip. If a second playback instruction fed back by the user based on the prompt information is obtained, screening the video segments based on the second playback instruction, determining the target video segment, and feeding it back for display specifically further includes: generating a time starting point corresponding to each sub-target video clip; if the number of sub-target video clips equals the preset number, jumping directly to the time starting point corresponding to that sub-target video clip for playing; and if the number of sub-target video clips is greater than the preset number, jumping first to the time starting point of the sub-target video clip with the earliest time starting point for playing, and highlighting the time starting points corresponding to the remaining sub-target video clips contained in the target video segment.
In the embodiment of the present application, the preset number is 1.
Specifically, after the electronic device completes the screening process, it generates time starting points from the determined appearance time points of all sub-target video clips and then checks the number of sub-target video clips. If there is one sub-target video clip, the target video segment has only one corresponding time starting point, and the electronic device positions playback directly at that time starting point. If there is more than one, the target video segment contains several qualifying sub-target video clips and therefore several time starting points; the electronic device positions playback at the earliest time starting point according to their chronological order and highlights the remaining time starting points, helping the user see clearly where the target video clips lie on the progress bar of the original video.
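The jump-or-highlight decision above can be sketched as follows; the `player` object with `seek` and `highlight` methods is a hypothetical interface, not a real playback API.

```python
# Sketch of the playback step: with one matching sub-clip, seek straight
# to its time starting point; with several, seek to the earliest and
# highlight the rest on the progress bar. `player` is hypothetical.

def play_targets(player, start_times, preset_count=1):
    starts = sorted(start_times)
    if not starts:
        return                            # nothing matched the screening
    if len(starts) <= preset_count:
        player.seek(starts[0])            # single match: jump directly
    else:
        player.seek(starts[0])            # earliest match plays first
        for t in starts[1:]:
            player.highlight(t)           # mark the rest on the progress bar
```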
The video recording and retrieving playback device 20 specifically may include: a video clip determination module 201, a prompt module 202, and a target video clip determination module 203, wherein,
the video clip determining module 201 is configured to process an original video based on the obtained first playback instruction of the user and the invoked preset processing rule, and determine a video clip based on a processing result;
the first playback instruction of the user is an instruction that the user needs to play back;
a prompt module 202, configured to generate prompt information based on the video clip;
the prompt information is information for informing a user whether the video clips need to be screened again;
and the target video segment determining module 203 is configured to, if a second playback instruction fed back by the user based on the prompt information is acquired, screen the video segment based on the second playback instruction, determine a target video segment, and feed back and display the target video segment.
In one possible implementation manner of the embodiment of the present application, the video clip determining module 201 includes a person determining unit, a person time determining unit, and a first video clip unit, where,
the character determining unit is used for determining character characteristics in the original video based on the first playback instruction of the user and a preset processing rule;
A person time determining unit for determining an appearance time point and a disappearance time point corresponding to the person feature;
and the first video clip unit is used for determining the character video clip corresponding to the character feature based on the appearance time point and the disappearance time point, and setting the character video clip as the first video clip.
In one possible implementation manner of the embodiment of the present application, the video clip determining module 201 further includes: a voice determination unit, a voice time determination unit, and a second video clip unit, wherein,
the voice determining unit is used for determining voice characteristics in the original video based on a first playback instruction of a user and a preset processing rule;
a voice time determining unit, configured to determine an occurrence time point and a disappearance time point corresponding to the voice feature;
and the second video clip unit is used for determining the voice video clip corresponding to the voice feature based on the appearance time point and the disappearance time point, and setting the voice video clip as the second video clip.
In one possible implementation manner of the embodiment of the present application, the video clip determining module 201 further includes: a character voice determination unit, a time node determination unit, and a third video clip unit, wherein,
The character voice determining unit is used for determining character features and voice features in the original video based on a first playback instruction of a user and a preset processing rule;
a time node determining unit for determining a time node in which the character feature and the voice feature coexist;
and a third video clip unit configured to determine a person voice video clip based on the time node, and set the person voice video clip as the third video clip.
In one possible implementation manner of the embodiment of the present application, the target video clip determining module 203 further includes: a search box unit, an input unit and a screening unit, wherein,
the search box unit is used for generating a search box and feeding back and displaying the search box based on the second playback instruction;
the input unit is used for acquiring keyword characteristics input by a user in the search box;
and the screening unit is used for screening the video fragments based on the keyword characteristics and determining target video fragments.
In one possible implementation manner of the embodiment of the present application, the apparatus 20 for retrieving and playing back a video recording and playing back further includes: a persona text module, a voice text module, and persona and voice text modules, wherein,
the character text module is used for extracting character features in the video clips if the user inputs the character text features, representing the character features in a character form and determining all the character text features;
The voice text module is used for carrying out text conversion on the voice features in the video clips if the user inputs the voice text features, and determining all the voice text features;
and the character and voice text module is used for extracting all character text features expressed in a text form in the video segment and extracting all voice text features in the video segment if the character text features and the voice text features are input by a user.
In one possible implementation manner of the embodiment of the present application, the target video clip determining module 203 further includes: a time starting point unit, a first jumping unit and a second jumping unit, wherein,
the time starting point unit is used for generating a time starting point corresponding to the sub-target video clip;
the first jumping unit is used for jumping to a time starting point corresponding to the sub-target video clips directly for playing if the number of the sub-target video clips is a preset number;
and the second jumping unit is used for jumping to the time starting point corresponding to the sub-target video segment with the earliest time starting point to play if the number of the sub-target video segments is larger than the preset number, and highlighting the time starting points corresponding to the rest sub-target video segments contained in the target video segment.
It will be clear to those skilled in the art that, for convenience and brevity of description, specific working procedures of the above-described systems, apparatuses and units may refer to corresponding procedures in the foregoing method embodiments, and are not repeated herein.
The embodiment of the application also describes an electronic device from the perspective of the entity apparatus, as shown in fig. 3, the electronic device 30 shown in fig. 3 includes: a processor 301 and a memory 303. Wherein the processor 301 is coupled to the memory 303, such as via a bus 302. Optionally, the electronic device 30 may also include a transceiver 304. It should be noted that, in practical applications, the transceiver 304 is not limited to one, and the structure of the electronic device 30 is not limited to the embodiment of the present application.
The processor 301 may be a CPU (Central Processing Unit), a general-purpose processor, a DSP (Digital Signal Processor), an ASIC (Application Specific Integrated Circuit), an FPGA (Field Programmable Gate Array) or other programmable logic device, a transistor logic device, a hardware component, or any combination thereof. It may implement or execute the various exemplary logic blocks, modules, and circuits described in connection with this disclosure. The processor 301 may also be a combination that implements computing functionality, for example a combination of one or more microprocessors, or a combination of a DSP and a microprocessor.
Bus 302 may include a path for transferring information between the components. Bus 302 may be a PCI (Peripheral Component Interconnect) bus, an EISA (Extended Industry Standard Architecture) bus, or the like. Bus 302 may be divided into an address bus, a data bus, a control bus, and so on. For ease of illustration, only one thick line is shown in fig. 3, but this does not mean there is only one bus or only one type of bus.
The memory 303 may be, but is not limited to, a ROM (Read Only Memory) or other type of static storage device that can store static information and instructions, a RAM (Random Access Memory) or other type of dynamic storage device that can store information and instructions, an EEPROM (Electrically Erasable Programmable Read Only Memory), a CD-ROM (Compact Disc Read Only Memory) or other optical disc storage (including compact discs, laser discs, optical discs, digital versatile discs, Blu-ray discs, etc.), magnetic disk storage media or other magnetic storage devices, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer.
The memory 303 is used for storing application program codes for executing the present application and is controlled to be executed by the processor 301. The processor 301 is configured to execute the application code stored in the memory 303 to implement what is shown in the foregoing method embodiments.
Among them, electronic devices include, but are not limited to: mobile terminals such as mobile phones, notebook computers, digital broadcast receivers, PDAs (personal digital assistants), PADs (tablet computers), PMPs (portable multimedia players), in-vehicle terminals (e.g., in-vehicle navigation terminals), and the like, and stationary terminals such as digital TVs, desktop computers, and the like. But may also be a server or the like. The electronic device shown in fig. 3 is only an example and should not be construed as limiting the functionality and scope of use of the embodiments herein.
It should be understood that, although the steps in the flowcharts of the figures are shown in order as indicated by the arrows, these steps are not necessarily performed in order as indicated by the arrows. The steps are not strictly limited in order and may be performed in other orders, unless explicitly stated herein. Moreover, at least some of the steps in the flowcharts of the figures may include a plurality of sub-steps or stages that are not necessarily performed at the same time, but may be performed at different times, the order of their execution not necessarily being sequential, but may be performed in turn or alternately with other steps or at least a portion of the other steps or stages.
The foregoing describes only some embodiments of the present application. It should be noted that a person skilled in the art can make several improvements and modifications without departing from the principle of the present application, and these improvements and modifications should also fall within the protection scope of the present application.

Claims (10)

1. A video recording, playback and retrieval method, comprising:
processing the original video based on the acquired first playback instruction of the user and the called preset processing rule, and determining a video clip based on a processing result;
the first playback instruction of the user is an instruction that the user needs to play back;
generating prompt information based on the video clips;
the prompt information is information for informing a user whether the video clips need to be screened again;
and if a second playback instruction fed back by the user based on the prompt information is acquired, screening the video clips based on the second playback instruction, determining a target video clip and feeding back and displaying the target video clip.
2. The method for video recording, playback and retrieval according to claim 1, wherein the processing the original video based on the acquired user first playback instruction and the invoked preset processing rule comprises:
Determining character features in the original video based on the first playback instruction of the user and the preset processing rule;
determining appearance time points and disappearance time points corresponding to the character features;
and determining a character video clip corresponding to the character feature based on the appearance time point and the disappearance time point, and setting the character video clip as a first video clip.
3. The method for video recording, playback and retrieval according to claim 1, wherein the processing the original video based on the acquired user first playback instruction and the invoked preset processing rule comprises:
determining voice characteristics in the original video based on the first playback instruction of the user and the preset processing rule;
determining an appearance time point and a disappearance time point corresponding to the voice feature;
and determining a voice video segment corresponding to the voice feature based on the occurrence time point and the disappearance time point, and setting the voice video segment as a second video segment.
4. The method for video recording, playback and retrieval according to claim 1, wherein the processing the original video based on the acquired user first playback instruction and the invoked preset processing rule comprises:
Determining character features and voice features in the original video based on the first playback instruction of the user and the preset processing rule;
determining a time node at which the character feature and the voice feature coexist;
and determining a character voice video segment based on the time node, and setting the character voice video segment as a third video segment.
5. The method of claim 1, wherein if a second playback instruction fed back by a user based on prompt information is obtained, screening the video segments based on the second playback instruction of the user, determining a target video segment, and feeding back and displaying the target video segment, comprising:
generating a search box and feeding back and displaying the search box based on the second playback instruction;
acquiring keyword characteristics input by a user in the search box;
and screening the video clips based on the keyword features to determine target video clips.
6. The method for video recording, searching and replaying according to claim 5, wherein the keyword features are human text features and/or voice text features, and the step of obtaining the keyword features input by the user in the search box further comprises the steps of:
If the user inputs character text features, extracting the character features in the video clips, representing the character features in a text form, and determining all the character text features;
if the user inputs the voice text features, performing text conversion on the voice features in the video clips, and determining all the voice text features;
if the user inputs character text features and voice text features, extracting all character text features expressed in a text form in the video segment, and extracting all voice text features in the video segment.
7. The video recording, retrieving and playback method as set forth in claim 1, wherein the target video clip comprises at least one sub-target video clip; if a second playback instruction fed back by the user based on the prompt information is acquired, screening the video clips based on the second playback instruction, determining a target video clip and feeding back and displaying the target video clip, including:
generating a time starting point corresponding to the sub-target video segment;
if the number of the sub-target video clips is the preset number, directly jumping to a time starting point corresponding to the sub-target video clips for playing;
If the number of the sub-target video clips is greater than the preset number, preferentially jumping to the time starting point corresponding to the sub-target video clip with the earliest time starting point for playing, and performing highlighting processing on the time starting points corresponding to the rest sub-target video clips contained in the target video clip.
8. A video recording, playback and retrieval device, comprising:
the video segment determining module is used for processing the original video based on the acquired first playback instruction of the user and the called preset processing rule and determining a video segment based on the processing result;
the first playback instruction of the user is an instruction that the user needs to play back;
the prompt module is used for generating prompt information based on the video clips;
the prompt information is information for informing a user whether the video clips need to be screened again;
and the target video segment determining module is used for screening the video segments based on the second playback instruction if the second playback instruction fed back by the user based on the prompt information is acquired, determining the target video segments and feeding back and displaying the target video segments.
9. An electronic device, comprising:
At least one processor;
a memory;
at least one application program, wherein the at least one application program is stored in the memory and configured to be executed by the at least one processor, the at least one application program being configured to: perform the video recording, retrieving and playback method according to any one of claims 1 to 7.
10. A computer readable storage medium having stored thereon a computer program, characterized in that the computer program, when executed in a computer, causes the computer to perform a video recording, retrieving and playback method according to any one of claims 1 to 7.
CN202310489598.4A 2023-05-04 2023-05-04 Video recording, playing, retrieving and playback method and device, electronic equipment and medium Pending CN116521925A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310489598.4A CN116521925A (en) 2023-05-04 2023-05-04 Video recording, playing, retrieving and playback method and device, electronic equipment and medium

Publications (1)

Publication Number Publication Date
CN116521925A true CN116521925A (en) 2023-08-01

Family

ID=87393645



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination