CN111400553A - Video searching method, video searching device and terminal equipment


Info

Publication number: CN111400553A
Application number: CN202010338121.2A
Authority: CN (China)
Prior art keywords: video, target, keyword information, frame, image
Legal status: Pending
Other languages: Chinese (zh)
Inventor: 吴恒刚
Current Assignee: Guangdong Oppo Mobile Telecommunications Corp Ltd
Original Assignee: Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority date / Filing date: 2020-04-26
Publication date: 2020-07-10

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70 Information retrieval of video data
    • G06F16/78 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/7867 Retrieval using information manually generated, e.g. tags, keywords, comments, title and artist information, manually generated time, location and usage information, user ratings
    • G06F16/73 Querying
    • G06F16/738 Presentation of query results
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/40 Scenes; Scene-specific elements in video content

Abstract

The video searching method provided by the application comprises the following steps: after a search request is acquired, matching each stored video according to keyword information in the search request, wherein each stored video comprises a feature tag of at least one frame of feature image in the stored video, and the feature tag is used for matching with the keyword information; taking each detected stored video matched with the keyword information as a target video, and taking each feature image matched with the keyword information in the target video as a target frame; and obtaining a search result according to the target videos and the target frames, wherein the search result is used for pointing to at least part of the target videos and/or indicating the playing position, in the corresponding target video, of at least part of the target frames. By this method, videos can be searched conveniently and efficiently.

Description

Video searching method, video searching device and terminal equipment
Technical Field
The present application belongs to the field of video search technology, and in particular, relates to a video search method, a video search apparatus, a terminal device, and a computer-readable storage medium.
Background
With the wide popularization of various video applications, the amount of video material on the network has grown explosively. In daily use, people often want to quickly find videos containing specific content, whether for playback or for various video creation tasks. A method for searching videos conveniently and efficiently is therefore needed.
Disclosure of Invention
The embodiments of the present application provide a video search method, a video search apparatus, a terminal device, and a computer-readable storage medium, which make it possible to search videos conveniently and efficiently.
In a first aspect, an embodiment of the present application provides a video search method, including:
after a search request is acquired, matching each stored video according to keyword information in the search request, wherein each stored video comprises a feature tag of at least one frame of feature image in the stored video, and the feature tag is used for matching with the keyword information;
taking each detected stored video matched with the keyword information as a target video, and taking each feature image matched with the keyword information in the target video as a target frame;
and obtaining a search result according to the target videos and the target frames, wherein the search result is used for pointing to at least part of the target videos and/or indicating the playing position, in the corresponding target video, of at least part of the target frames.
In a second aspect, an embodiment of the present application provides a video search apparatus, including:
a matching module, used for matching each stored video according to the keyword information in a search request after the search request is acquired, wherein each stored video comprises a feature tag of at least one frame of feature image in the stored video, and the feature tag is used for matching with the keyword information;
a first processing module, used for taking each detected stored video matched with the keyword information as a target video and taking each feature image matched with the keyword information in the target video as a target frame;
and a second processing module, used for obtaining a search result according to the target videos and the target frames, wherein the search result is used for pointing to at least part of the target videos and/or indicating the playing position, in the corresponding target video, of at least part of the target frames.
In a third aspect, an embodiment of the present application provides a terminal device, which includes a memory, a processor, a display, and a computer program stored in the memory and executable on the processor, where the processor executes the computer program to implement the video search method according to the first aspect.
In a fourth aspect, the present application provides a computer-readable storage medium, where a computer program is stored, and the computer program, when executed by a processor, implements the video search method as described in the first aspect.
In a fifth aspect, the present application provides a computer program product, which when run on a terminal device, causes the terminal device to execute the video search method described in the first aspect.
Compared with the prior art, the embodiments of the present application have the following advantages: after a search request is acquired, each stored video is matched according to the keyword information in the search request, wherein each stored video comprises a feature tag of at least one frame of feature image in the stored video, and the feature tag is used for matching with the keyword information. Because the feature tags reflect the specific content of the corresponding feature images, different parts of each video can be searched separately through the feature tags, so that both the stored videos matching the keyword information (the target videos) and the matching feature images within them (the target frames) can be determined and a search result obtained. The search result is used for pointing to at least part of the target videos and/or indicating the playing position of at least part of the target frames in the corresponding target video, so that a user can not only find the stored videos matching the keyword information but also learn the specific playing position of the matching content within them. This makes it convenient to operate on the matching content, for example to view or clip it, improves the efficiency of the related operations, and improves the user experience.
Drawings
In order to illustrate the technical solutions in the embodiments of the present application more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description show only some embodiments of the present application, and those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a schematic flowchart of a video search method according to an embodiment of the present application;
fig. 2 is a schematic structural diagram of a video search apparatus according to an embodiment of the present application;
fig. 3 is a schematic structural diagram of a terminal device according to an embodiment of the present application.
Detailed Description
In the following description, for purposes of explanation and not limitation, specific details are set forth, such as particular system structures, techniques, etc. in order to provide a thorough understanding of the embodiments of the present application. It will be apparent, however, to one skilled in the art that the present application may be practiced in other embodiments that depart from these specific details. In other instances, detailed descriptions of well-known systems, devices, circuits, and methods are omitted so as not to obscure the description of the present application with unnecessary detail.
It will be understood that the terms "comprises" and/or "comprising," when used in this specification and the appended claims, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
It should also be understood that the term "and/or" as used in this specification and the appended claims refers to and includes any and all possible combinations of one or more of the associated listed items.
As used in this specification and the appended claims, the term "if" may be interpreted contextually as "when", "upon", "in response to determining" or "in response to detecting". Similarly, the phrase "if it is determined" or "if a [described condition or event] is detected" may be interpreted contextually to mean "upon determining", "in response to determining", "upon detecting [the described condition or event]" or "in response to detecting [the described condition or event]".
Reference throughout this specification to "one embodiment" or "some embodiments," or the like, means that a particular feature, structure, or characteristic described in connection with the embodiment is included in one or more embodiments of the present application. Thus, appearances of the phrases "in one embodiment," "in some embodiments," "in other embodiments," or the like, in various places throughout this specification are not necessarily all referring to the same embodiment, but rather "one or more but not all embodiments" unless specifically stated otherwise. The terms "comprising," "including," "having," and variations thereof mean "including, but not limited to," unless expressly specified otherwise.
Specifically, fig. 1 shows a flowchart of a video search method provided in an embodiment of the present application, where the video search method can be applied to a terminal device.
The video search method provided by the embodiments of the present application can be applied to terminal devices such as mobile phones, tablet computers, wearable devices, vehicle-mounted devices, Augmented Reality (AR)/Virtual Reality (VR) devices, notebook computers, ultra-mobile personal computers (UMPC), netbooks, and Personal Digital Assistants (PDA); the embodiments of the present application do not limit the specific type of the terminal device.
In an application scenario, for example, the video search method may be applied to a cloud server, and at this time, the stored video may be stored in the cloud server. The cloud server may be coupled to a client device (e.g., a mobile phone, a tablet computer, a laptop computer, a desktop computer, etc.) for information transmission. For example, the client device may send a search request to the cloud server and receive a search result returned by the cloud server to obtain the searched related video information.
For example, in another application scenario, the video search method may also be applied to a client device to search for a video locally stored in the client device. The specific types of the terminal devices implementing the video search method can be various, and the specific application scenarios can be determined according to actual requirements.
The video searching method comprises the following steps:
step S101, after a search request is acquired, matching each stored video according to keyword information in the search request, wherein the stored video comprises a feature tag of at least one frame of feature image in the stored video, and the feature tag is used for matching with the keyword information.
In this embodiment of the present application, the search request may be generated after the terminal device executing the video search method receives a specific search operation from a user, or may be received from another device through a specific information transmission method. For example, a cloud server may receive a search request sent by a client. The search request may include corresponding keyword information, and the keyword information may include a search term input by the user; in some examples, it may also include words related to the search term, such as synonyms or words whose frequency of association with the search term is higher than a preset frequency.
The feature tags described above may be used to characterize features of the corresponding feature images. For example, the features may include one or more of color features, texture features, shape features, spatial relationships, target features, scene features, and the like; the feature tag may include at least one of a feature value, a feature vector, a target category, a scene category, and target coordinates. The feature values, feature vectors, and the like may be extracted from the corresponding feature images in advance through image processing methods such as wavelet transformation, Markov random field models, or convolutional neural networks, and the target categories, scene categories, target coordinates, and the like may be obtained by identifying the corresponding feature images through a neural network model.
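By way of illustration only (this sketch is not part of the patent's disclosure), a feature tag holding a feature vector and a target category could be produced for a single frame with a pretrained convolutional network; the model choice (ResNet-18 via torchvision) and the tag layout below are assumptions made for the example.

import torch
import torchvision.models as models
import torchvision.transforms as T
from PIL import Image

# Pretrained CNN used here purely as an illustrative feature extractor.
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
model.eval()

preprocess = T.Compose([
    T.Resize(256), T.CenterCrop(224), T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

def feature_tag(frame: Image.Image) -> dict:
    # Returns a hypothetical feature tag: a feature vector plus a
    # predicted target-category index.
    x = preprocess(frame).unsqueeze(0)      # shape (1, 3, 224, 224)
    with torch.no_grad():
        logits = model(x)                   # (1, 1000) class scores
    return {
        "feature_vector": logits.squeeze(0).tolist(),
        "target_category": int(logits.argmax(dim=1)),
    }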
The feature images in a stored video can be determined according to the actual application scenario. In some examples, if each stored video is detected frame by frame, every frame image in the stored video may serve as a feature image. In other examples, one frame image may be acquired every preset number of frames, so that one or more frame images are obtained as the feature images of the stored video; in this case the feature images and the corresponding feature tags require less storage space, which also makes fast matching more convenient.
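As a concrete illustration of the sampling variant just described (an editorial sketch assuming the OpenCV library, not part of the patent text):

import cv2

def sample_frames(path: str, step: int = 30):
    # Yield (frame_index, frame) for every `step`-th frame, so only a
    # subset of the frames becomes candidate feature images.
    cap = cv2.VideoCapture(path)
    index = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if index % step == 0:
            yield index, frame
        index += 1
    cap.release()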
In this embodiment, each stored video may be matched against the keyword information in the search request in a plurality of ways. For example, the keyword information may be compared with the feature tags of the feature images in any stored video, and if a feature tag contains content corresponding to at least part of the keyword information (for example, the keyword information is the same as a target category in the feature tag, or the same as a feature indicated by its feature vector), it is determined that a feature image matching the keyword information exists in that stored video. In addition, the name of the stored video and similar metadata may also be matched against the keyword information.
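The matching idea can be sketched as follows; the dictionary layout of a stored video and its tags is a hypothetical assumption for illustration only.

def video_matches(keyword: str, video: dict) -> bool:
    # A stored video matches if its name contains the keyword, or if any
    # feature tag of any feature image contains content corresponding to
    # the keyword (here simplified to category labels).
    if keyword in video.get("name", ""):
        return True
    return any(keyword in tag.get("labels", [])
               for tag in video.get("tags", []))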
In some embodiments, before matching each stored video according to the keyword information in the search request, the method further includes:
for any stored video, starting from the second frame image of the stored video, sequentially comparing the similarity between each frame image and the corresponding previous frame image;
if the similarity between an image and the corresponding previous frame image is not greater than a preset similarity threshold, taking the image as a feature image of the stored video;
and performing feature extraction on the feature image to obtain the feature tag of the feature image.
In this embodiment of the application, the similarity between each frame image in a stored video and the corresponding previous frame image is compared in sequence, and an image is taken as a feature image of the stored video only when its similarity to the previous frame image does not exceed the preset similarity threshold. In this way, images whose content closely repeats neighboring frames are screened out, which reduces redundancy in the feature images and the corresponding feature tags and helps speed up the matching of the feature tags against the keyword information.
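A minimal sketch of this screening step, assuming OpenCV and a grayscale-histogram correlation as the similarity measure (both of these are illustrative assumptions; the patent does not fix a similarity measure):

import cv2

def select_feature_frames(frames, sim_threshold=0.95):
    # Keep a frame as a feature image only when its histogram similarity
    # to the previous frame does not exceed the threshold, so that
    # near-duplicate frames are screened out.
    kept = []
    prev_hist = None
    for i, frame in enumerate(frames):
        hist = cv2.calcHist([frame], [0], None, [64], [0, 256])
        cv2.normalize(hist, hist)
        if prev_hist is not None:
            sim = cv2.compareHist(prev_hist, hist, cv2.HISTCMP_CORREL)
            if sim <= sim_threshold:   # sufficiently different: keep it
                kept.append(i)
        prev_hist = hist
    return kept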
In some embodiments, before matching each stored video according to the keyword information in the search request, the method further includes:
and performing frame-by-frame feature extraction on each stored video to obtain the feature tags of the feature images in each stored video, wherein every frame image in a stored video is a feature image.
In this embodiment of the application, performing frame-by-frame feature extraction on each stored video captures the content of every frame image completely and comprehensively, so that subsequent processing can both evaluate whether content matching the keyword exists in a stored video and determine the specific position of that content within the stored video.
And step S102, taking each detected stored video matched with the keyword information as a target video, and taking each feature image matched with the keyword information in the target video as a target frame.
In this embodiment, the stored videos matching the keyword information may include, for example, stored videos in which the number of feature images matching the keyword information is greater than a preset number. They may also include stored videos with a high degree of matching with the keyword information: for example, each feature tag of a stored video may carry a corresponding weight value, and if the weighted sum over the feature tags that match the keyword information exceeds a preset matching threshold, the stored video may be considered to match the keyword information to a high degree.
In some embodiments, if there is content corresponding to at least part of the keyword information in the feature tag (for example, the keyword information is the same as the target category in the feature tag or the keyword information is the same as the feature indicated by the feature vector, etc.), it is determined that there is a feature image matching the keyword information in the stored video.
In some embodiments, the taking the detected stored video matching the keyword information as a target video and taking the feature image matching the keyword information in the target video as a target frame includes:
and if the number of feature images matching the keyword information in any stored video is greater than a preset number, determining that the stored video is a target video, and taking the feature images matching the keyword information in the target video as the target frames.
In this embodiment of the application, if the number of feature images matching the keyword information in a stored video is greater than the preset number, the content matching the keyword information in that stored video can be considered to reach a certain proportion, that is, the degree of matching between the stored video and the keyword information reaches a preset degree; the stored video can therefore be determined to be a target video.
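Continuing the hypothetical data layout from the earlier sketches, target-video selection by matching-image count could look like the following (the preset number is a free parameter):

def find_targets(keyword: str, videos: list, min_matches: int = 3):
    # Return (video, target_frame_indices) pairs for stored videos whose
    # count of matching feature images exceeds the preset number.
    results = []
    for video in videos:
        hits = [i for i, tag in enumerate(video["tags"])
                if keyword in tag.get("labels", [])]
        if len(hits) > min_matches:
            results.append((video, hits))
    return results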
Step S103, obtaining a search result according to the target videos and the target frames, where the search result is used to point to at least part of the target videos and/or indicate the playing position, in the corresponding target video, of at least part of the target frames.
In this embodiment of the application, the playing position may be represented, for example, by the frame index of the target frame in the corresponding stored video, or by the playing time of the target frame in the corresponding stored video.
The specific presentation and storage form of the search results is not limited here. In some embodiments, the search results may be transmitted and presented in a specific application; for example, the cloud server executing this embodiment may send the search results to the client device that sent the search request, so that they are presented on a browser page or another application page of that client device. The search results may be presented as a search result list or the like. For example, the search result may mark, for each stored video, the playing position of the target frame with the earliest playing position, so that when the user views any stored video from the search result, the user can quickly locate the portion associated with the search request for subsequent operations such as viewing and editing.
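For example, with a known frame rate the two representations of a playing position are interchangeable, and the earliest target frame gives the mark used in the result list (an editorial sketch, not part of the patent text):

def frame_to_time(frame_index: int, fps: float) -> float:
    # Playing time, in seconds, of a given frame index.
    return frame_index / fps

def earliest_position(target_frames: list, fps: float) -> float:
    # Timestamp used to mark a stored video in the search result.
    return frame_to_time(min(target_frames), fps)

# e.g. earliest_position([450, 900, 1320], fps=30.0) -> 15.0 seconds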
In some embodiments, after the search result is obtained, the method further comprises:
and if a first selection instruction for any target video is acquired, generating a playing instruction, wherein the playing instruction is used for indicating that the target video selected by the first selection instruction starts playing from the target frame with the earliest playing position.
In this embodiment of the present application, the first selection instruction may be generated by the terminal device executing this embodiment after receiving a user's selection operation on any target video, or may be received from a client device through a specific data transmission manner (e.g., over a 4G or 5G cellular network connection).
After the first selection instruction for any target video is acquired, a playing instruction can be generated, and the playing instruction can instruct playback of the selected stored video to start from the playing position of the earliest target frame in that stored video. The playing instruction may, for example, be sent to a client device coupled to the cloud server executing this embodiment, or, if a client device executes this embodiment, it may instruct a corresponding playing application to play the selected stored video.
In some embodiments, after taking the detected stored video matching the keyword information as a target video and taking the feature image matching the keyword information in the target video as a target frame, the method further includes:
for any target video, determining a target video segment matching the keyword information in the target video according to the playing positions of the target frames of that target video;
after obtaining the search result, the method further comprises:
and if a video composition instruction is acquired together with a second selection instruction for at least two target videos in the search result, splicing the video segments to be spliced that correspond to the second selection instruction to obtain a composite video, wherein the video segments to be spliced are the target video segments of the target videos selected by the second selection instruction.
In this embodiment of the application, the target video segment matching the keyword information in a target video may be determined according to the playing positions of the target frames in that target video. In this way, the portion most associated with the keyword information, that is, the target video segment, can be extracted from the stored video, which facilitates subsequent video composition and other editing operations.
The target video segment matching the keyword information can be determined from the playing positions of the target frames in various ways. For example, the video segment between the target frame with the earliest playing position and the target frame with the latest playing position in a stored video may be taken as the target video segment of that stored video. Alternatively, the target video segments may be determined from the intervals between target frames. For example, in one scenario, the target frames are traversed in playing order, and for each target frame it is judged whether the number of frames between it and the previous target frame is greater than a preset frame-count difference; if not, the two target frames are determined to lie in the same target video segment, and if so, they are determined to lie in different target video segments. In this case, several target video segments may exist in the same stored video.
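The gap-based variant can be sketched as follows (an editorial illustration; the preset frame-count difference is a free parameter):

def group_into_segments(target_frames: list, max_gap: int = 60):
    # Group sorted target frames into (start_frame, end_frame) clips:
    # frames spaced within `max_gap` stay in one clip, a larger gap
    # closes the current clip and starts a new one.
    if not target_frames:
        return []
    frames = sorted(target_frames)
    segments = []
    start = prev = frames[0]
    for f in frames[1:]:
        if f - prev > max_gap:        # gap too large: close the clip
            segments.append((start, prev))
            start = f
        prev = f
    segments.append((start, prev))
    return segments

# e.g. group_into_segments([100, 120, 150, 800, 820], max_gap=60)
#      -> [(100, 150), (800, 820)]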
In some embodiments, if the video composition instruction is acquired together with the second selection instruction for at least two target videos in the search result, splicing the video segments to be spliced that correspond to the second selection instruction to obtain the composite video includes:
if a video composition instruction and a second selection instruction for at least two target videos in the search result are acquired, determining the splicing order of the video segments to be spliced and the splicing-mode setting information;
and splicing the video segments to be spliced according to the splicing order and the splicing-mode setting information to obtain a composite video.
In this embodiment of the application, the second selection instruction may indicate the target videos to which the video composition instruction applies. The splicing order of the video segments to be spliced may be determined, for example, according to the order of the user's selection operations on the target videos, or according to the user's selections in a menu bar of a preset application. The splicing-mode setting information may include, for example, transition animations preset by the user.
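An illustrative splicing sketch, assuming the moviepy library and omitting transition animations (both assumptions; the patent does not prescribe a particular library or format):

from moviepy.editor import VideoFileClip, concatenate_videoclips

def compose(clips_to_splice, out_path="composite.mp4"):
    # clips_to_splice: [(video_path, start_seconds, end_seconds)],
    # already arranged in the desired splicing order.
    parts = [VideoFileClip(p).subclip(s, e) for p, s, e in clips_to_splice]
    composite = concatenate_videoclips(parts, method="compose")
    composite.write_videofile(out_path)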
In this embodiment of the application, the target video segments matching the keyword information in the target videos can be determined from the target frames, so that content highly relevant to the user's search request can be extracted from the stored videos, and after a video composition instruction is received, the videos can be edited automatically according to the user's requirements. This reduces the complexity of video creation and improves operating efficiency.
In the embodiments of the present application, after a search request is acquired, each stored video is matched according to the keyword information in the search request, wherein each stored video comprises a feature tag of at least one frame of feature image in the stored video, and the feature tag is used for matching with the keyword information. The search result is used for pointing to at least part of the target videos and/or indicating the playing position of at least part of the target frames in the corresponding target video, so that a user can find the stored videos matching the keyword information and learn the specific playing position of the matching content within them. This makes it convenient to operate on the matching content, for example to view or clip it, improves the efficiency of the related operations, and improves the user experience.
It should be understood that, the sequence numbers of the steps in the foregoing embodiments do not imply an execution sequence, and the execution sequence of each process should be determined by its function and inherent logic, and should not constitute any limitation to the implementation process of the embodiments of the present application.
Fig. 2 shows a block diagram of a video search apparatus according to an embodiment of the present application, which corresponds to the video search method described above in the foregoing embodiment, and only shows portions related to the embodiment of the present application for convenience of description.
Referring to fig. 2, the video search apparatus 2 includes:
a matching module 201, configured to match, after a search request is acquired, each stored video according to the keyword information in the search request, where each stored video comprises a feature tag of at least one frame of feature image in the stored video, and the feature tag is used for matching with the keyword information;
a first processing module 202, configured to take each detected stored video matching the keyword information as a target video, and to take each feature image matching the keyword information in the target video as a target frame;
and a second processing module 203, configured to obtain a search result according to the target videos and the target frames, where the search result is used to point to at least part of the target videos and/or indicate the playing position, in the corresponding target video, of at least part of the target frames.
Optionally, the video search apparatus 2 further includes:
and a generating module, configured to generate a playing instruction if a first selection instruction for any target video is acquired, where the playing instruction is used to indicate that the target video selected by the first selection instruction starts playing from the target frame with the earliest playing position.
Optionally, the video search apparatus 2 further includes:
a determining module, configured to determine, for any target video, a target video segment matching the keyword information in the target video according to the playing positions of the target frames of that target video;
and a composition module, configured to splice, if a video composition instruction is acquired together with a second selection instruction for at least two target videos in the search result, the video segments to be spliced that correspond to the second selection instruction to obtain a composite video, where the video segments to be spliced are the target video segments of the target videos selected by the second selection instruction.
Optionally, the synthesis module specifically includes:
a determining unit, configured to determine the splicing order of the video segments to be spliced and the splicing-mode setting information if the video composition instruction and the second selection instruction for at least two target videos in the search result are acquired;
and a composition unit, configured to splice the video segments to be spliced according to the splicing order and the splicing-mode setting information to obtain a composite video.
Optionally, the video search apparatus 2 further includes:
a comparison module, configured to sequentially compare, for any stored video and starting from the second frame image of the stored video, the similarity between each frame image and the corresponding previous frame image;
a third processing module, configured to take an image as a feature image of the stored video if the similarity between the image and the corresponding previous frame image is not greater than a preset similarity threshold;
and a first feature extraction module, configured to perform feature extraction on the feature images to obtain the feature tags of the feature images.
Optionally, the video search apparatus 2 further includes:
and a second feature extraction module, configured to perform frame-by-frame feature extraction on each stored video to obtain the feature tags of the feature images in each stored video, where every frame image in a stored video is a feature image.
Optionally, the first processing module is specifically configured to:
and if the number of feature images matching the keyword information in any stored video is greater than a preset number, determine that the stored video is a target video, and take the feature images matching the keyword information in the target video as the target frames.
In the embodiments of the present application, after a search request is acquired, each stored video is matched according to the keyword information in the search request, wherein each stored video comprises a feature tag of at least one frame of feature image in the stored video, and the feature tag is used for matching with the keyword information. The search result is used for pointing to at least part of the target videos and/or indicating the playing position of at least part of the target frames in the corresponding target video, so that a user can find the stored videos matching the keyword information and learn the specific playing position of the matching content within them. This makes it convenient to operate on the matching content, for example to view or clip it, improves the efficiency of the related operations, and improves the user experience.
It should be noted that, for the information interaction, execution process, and other contents between the above-mentioned devices/units, the specific functions and technical effects thereof are based on the same concept as those of the embodiment of the method of the present application, and specific reference may be made to the part of the embodiment of the method, which is not described herein again.
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-mentioned division of the functional units and modules is illustrated, and in practical applications, the above-mentioned functions may be distributed as different functional units and modules according to needs, that is, the internal structure of the apparatus may be divided into different functional units or modules to implement all or part of the above-mentioned functions. Each functional unit and module in the embodiments may be integrated in one processing unit, or each unit may exist alone physically, or two or more units are integrated in one unit, and the integrated unit may be implemented in a form of hardware, or in a form of software functional unit. In addition, specific names of the functional units and modules are only for convenience of distinguishing from each other, and are not used for limiting the protection scope of the present application. The specific working processes of the units and modules in the system may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
Fig. 3 is a schematic structural diagram of a terminal device according to an embodiment of the present application. As shown in fig. 3, the terminal device 3 of this embodiment includes: at least one processor 30 (only one is shown in fig. 3), a memory 31, and a computer program 32 stored in the memory 31 and executable on the at least one processor 30, wherein the processor 30 implements the steps of any of the various video search method embodiments when executing the computer program 32.
The terminal device 3 may be a server, a mobile phone, a wearable device, an Augmented Reality (AR)/Virtual Reality (VR) device, a desktop computer, a notebook computer, a palmtop computer, or another computing device. The terminal device may include, but is not limited to, the processor 30 and the memory 31. Those skilled in the art will appreciate that fig. 3 is only an example of the terminal device 3 and does not constitute a limitation on the terminal device 3, which may include more or fewer components than those shown, combine some components, or use different components; for example, it may also include input devices, output devices, and network access devices. The input devices may include a keyboard, a touch pad, a fingerprint sensor (for collecting fingerprint information of a user and direction information of the fingerprint), a microphone, a camera, and the like, and the output devices may include a display, a speaker, and the like.
The processor 30 may be a central processing unit (CPU), another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
In some embodiments, the memory 31 may be an internal storage unit of the terminal device 3, such as a hard disk or memory of the terminal device 3. In other embodiments, the memory 31 may also be an external storage device of the terminal device 3, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a flash card. Further, the memory 31 may include both an internal storage unit and an external storage device of the terminal device 3. The memory 31 is used for storing an operating system, applications, a boot loader (BootLoader), data, and other programs, such as the program code of the computer program; it may also be used for temporarily storing data that has been or will be output.
In addition, although not shown, the terminal device 3 may further include network connection modules, such as a Bluetooth module, a Wi-Fi module, or a cellular network module, which are not described here again.
In this embodiment, when the processor 30 executes the computer program 32 to implement the steps of any of the above video search method embodiments, each stored video is matched according to the keyword information in the search request after the search request is acquired, where each stored video comprises a feature tag of at least one frame of feature image in the stored video and the feature tag is used for matching with the keyword information. Because the feature tags reflect the specific content of the corresponding feature images, different parts of each video can be searched separately through the feature tags, so that the stored videos matching the keyword information (the target videos) can be obtained and, at the same time, the feature images matching the keyword information in the target videos (the target frames) can be determined. A search result can therefore be obtained, realizing fast and efficient video search. The search result is used for pointing to at least part of the target videos and/or indicating the playing position of at least part of the target frames in the corresponding target video, so that the user can find the stored videos matching the keyword information and learn the specific playing position of the matching content within them. This makes it convenient to operate on the matching content, for example to view or clip it, improves the efficiency of the related operations, and improves the user experience.
The embodiments of the present application further provide a computer-readable storage medium, where a computer program is stored, and when the computer program is executed by a processor, the computer program implements the steps in the above method embodiments.
The embodiments of the present application provide a computer program product which, when run on a terminal device, causes the terminal device to implement the steps in the above method embodiments.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on such understanding, all or part of the processes in the methods of the above embodiments may be implemented by a computer program, which may be stored in a computer-readable storage medium and, when executed by a processor, implements the steps of the above method embodiments. The computer program includes computer program code, which may be in source code form, object code form, an executable file, some intermediate form, or the like. The computer-readable medium may include at least: any entity or device capable of carrying the computer program code to a photographing apparatus or terminal device, a recording medium, a computer memory, a read-only memory (ROM), a random-access memory (RAM), an electrical carrier signal, a telecommunications signal, and a software distribution medium, for example a USB flash disk, a removable hard disk, a magnetic disk, or an optical disk. In certain jurisdictions, in accordance with legislation and patent practice, computer-readable media may not be electrical carrier signals or telecommunications signals.
In the above embodiments, the descriptions of the respective embodiments have respective emphasis, and reference may be made to the related descriptions of other embodiments for parts that are not described or illustrated in a certain embodiment.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus/network device and method may be implemented in other ways. For example, the above-described apparatus/network device embodiments are merely illustrative, and for example, the division of the above modules or units is only one logical function division, and there may be other divisions when actually implemented, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
The above embodiments are only used for illustrating the technical solutions of the present application, and not for limiting the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; such modifications and substitutions do not substantially depart from the spirit and scope of the embodiments of the present application and are intended to be included within the scope of the present application.

Claims (10)

1. A video search method, comprising:
after a search request is acquired, matching each stored video according to keyword information in the search request, wherein the stored video comprises a feature tag of at least one frame of feature image in the stored video, and the feature tag is used for matching with the keyword information;
taking each detected stored video matched with the keyword information as a target video, and taking each feature image matched with the keyword information in the target video as a target frame;
and obtaining a search result according to the target videos and the target frames, wherein the search result is used for pointing to at least part of the target videos and/or indicating the playing position, in the corresponding target video, of at least part of the target frames.
2. The video search method of claim 1, after obtaining the search result, further comprising:
and if a first selection instruction for any target video is acquired, generating a playing instruction, wherein the playing instruction is used for indicating that the target video selected by the first selection instruction starts playing from the target frame with the earliest playing position.
3. The video search method according to claim 1, further comprising, after taking the detected stored video matching the keyword information as a target video and taking a feature image matching the keyword information in the target video as a target frame:
for any target video, determining a target video segment matching the keyword information in the target video according to the playing positions of the target frames of that target video;
after obtaining the search result, the method further comprises:
and if a video composition instruction is acquired together with a second selection instruction for at least two target videos in the search result, splicing the video segments to be spliced that correspond to the second selection instruction to obtain a composite video, wherein the video segments to be spliced are the target video segments of the target videos selected by the second selection instruction.
4. The video search method according to claim 3, wherein, if the video composition instruction is acquired together with the second selection instruction for at least two target videos in the search result, splicing the video segments to be spliced that correspond to the second selection instruction to obtain the composite video comprises:
if a video composition instruction and a second selection instruction for at least two target videos in the search result are acquired, determining the splicing order of the video segments to be spliced and the splicing-mode setting information;
and splicing the video segments to be spliced according to the splicing order and the splicing-mode setting information to obtain a composite video.
5. The video search method of claim 1, further comprising, before matching each stored video according to the keyword information in the search request:
for any stored video, starting from the second frame image of the stored video, sequentially comparing the similarity between each frame image and the corresponding previous frame image;
if the similarity between an image and the corresponding previous frame image is not greater than a preset similarity threshold, taking the image as a feature image of the stored video;
and performing feature extraction on the feature image to obtain the feature tag of the feature image.
6. The video search method of claim 1, further comprising, before matching each stored video according to the keyword information in the search request:
performing frame-by-frame feature extraction on each stored video to obtain the feature tags of the feature images in each stored video, wherein every frame image in a stored video is a feature image.
7. The video search method according to any one of claims 1 to 6, wherein the step of taking the detected stored video matching the keyword information as a target video and taking a feature image matching the keyword information in the target video as a target frame comprises:
if the number of feature images matching the keyword information in any stored video is greater than a preset number, determining that the stored video is a target video, and taking the feature images matching the keyword information in the target video as the target frames.
8. A video search apparatus, comprising:
a matching module, used for matching each stored video according to the keyword information in a search request after the search request is acquired, wherein each stored video comprises a feature tag of at least one frame of feature image in the stored video, and the feature tag is used for matching with the keyword information;
a first processing module, used for taking each detected stored video matched with the keyword information as a target video and taking each feature image matched with the keyword information in the target video as a target frame;
and a second processing module, used for obtaining a search result according to the target videos and the target frames, wherein the search result is used for pointing to at least part of the target videos and/or indicating the playing position, in the corresponding target video, of at least part of the target frames.
9. A terminal device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, characterized in that the processor implements the video search method according to any one of claims 1 to 7 when executing the computer program.
10. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out a video search method according to any one of claims 1 to 7.
CN202010338121.2A 2020-04-26 2020-04-26 Video searching method, video searching device and terminal equipment Pending CN111400553A (en)

Priority Applications (1)

Application Number: CN202010338121.2A; Priority Date: 2020-04-26; Filing Date: 2020-04-26; Title: Video searching method, video searching device and terminal equipment

Publications (1)

Publication Number: CN111400553A; Publication Date: 2020-07-10

Family ID: 71431696

Family Applications (1)

Application Number: CN202010338121.2A; Title: Video searching method, video searching device and terminal equipment; Priority Date: 2020-04-26; Filing Date: 2020-04-26

Country Status (1)

CN: CN111400553A (en)


Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105677735A (en) * 2015-12-30 2016-06-15 腾讯科技(深圳)有限公司 Video search method and apparatus
CN109189987A (en) * 2017-09-04 2019-01-11 优酷网络技术(北京)有限公司 Video searching method and device
CN110019933A (en) * 2018-01-02 2019-07-16 阿里巴巴集团控股有限公司 Video data handling procedure, device, electronic equipment and storage medium
CN110855904A (en) * 2019-11-26 2020-02-28 Oppo广东移动通信有限公司 Video processing method, electronic device and storage medium

Cited By (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112004163A (en) * 2020-08-31 2020-11-27 北京市商汤科技开发有限公司 Video generation method and device, electronic equipment and storage medium
CN111984823A (en) * 2020-09-01 2020-11-24 咪咕文化科技有限公司 Video searching method, electronic device and storage medium
CN112732646A (en) * 2021-01-22 2021-04-30 维沃移动通信有限公司 File searching method and device and electronic equipment
CN113709529A (en) * 2021-04-13 2021-11-26 腾讯科技(深圳)有限公司 Video synthesis method and device, electronic equipment and computer readable medium
CN113301385A (en) * 2021-05-21 2021-08-24 北京大米科技有限公司 Video data processing method and device, electronic equipment and readable storage medium
CN113301385B (en) * 2021-05-21 2023-02-28 北京大米科技有限公司 Video data processing method and device, electronic equipment and readable storage medium
CN113709571A (en) * 2021-07-30 2021-11-26 北京搜狗科技发展有限公司 Video display method and device, electronic equipment and readable storage medium
CN113709571B (en) * 2021-07-30 2023-03-14 北京搜狗科技发展有限公司 Video display method and device, electronic equipment and readable storage medium
CN113672764A (en) * 2021-09-03 2021-11-19 海信电子科技(武汉)有限公司 Video data retrieval method, device, equipment, medium and product
CN114297433A (en) * 2021-12-28 2022-04-08 北京字节跳动网络技术有限公司 Method, device and equipment for searching question and answer results and storage medium
CN114297433B (en) * 2021-12-28 2024-04-19 抖音视界有限公司 Method, device, equipment and storage medium for searching question and answer result
CN114449346A (en) * 2022-02-14 2022-05-06 腾讯科技(深圳)有限公司 Video processing method, device, equipment and storage medium
CN114449346B (en) * 2022-02-14 2023-08-15 腾讯科技(深圳)有限公司 Video processing method, device, equipment and storage medium
CN114885188A (en) * 2022-04-20 2022-08-09 百度在线网络技术(北京)有限公司 Video processing method, device, equipment and storage medium

Similar Documents

Publication Publication Date Title
CN111400553A (en) Video searching method, video searching device and terminal equipment
CN106294798B (en) Image sharing method and terminal based on thumbnail
CN107633066B (en) Information display method and device, electronic equipment and storage medium
CN110889379B (en) Expression package generation method and device and terminal equipment
CN111581423B (en) Target retrieval method and device
US20170154240A1 (en) Methods and systems for identifying an object in a video image
CN112766284B (en) Image recognition method and device, storage medium and electronic equipment
CN113569740B (en) Video recognition model training method and device, and video recognition method and device
CN111818385B (en) Video processing method, video processing device and terminal equipment
CN109710801A (en) A kind of video searching method, terminal device and computer storage medium
CN111797266B (en) Image processing method and apparatus, storage medium, and electronic device
CN110717452B (en) Image recognition method, device, terminal and computer readable storage medium
CN108052506B (en) Natural language processing method, device, storage medium and electronic equipment
CN114827702B (en) Video pushing method, video playing method, device, equipment and medium
CN113343069A (en) User information processing method, device, medium and electronic equipment
CN112214639A (en) Video screening method, video screening device and terminal equipment
CN112596846A (en) Method and device for determining interface display content, terminal equipment and storage medium
CN107145314B (en) Display processing method and device for display processing
CN111325656B (en) Image processing method, image processing device and terminal equipment
CN110942065B (en) Text box selection method, text box selection device, terminal equipment and computer readable storage medium
CN112764601B (en) Information display method and device and electronic equipment
CN114697761B (en) Processing method, processing device, terminal equipment and medium
CN111061854B (en) Interaction method and device of intelligent conversation and electronic equipment
WO2023134756A1 (en) Object recommendation method and apparatus, and electronic device
CN109522449B (en) Searching method and device

Legal Events

PB01 Publication
SE01 Entry into force of request for substantive examination