CN111368143A - Video similarity retrieval method and device, electronic equipment and storage medium


Info

Publication number
CN111368143A
CN111368143A
Authority
CN
China
Prior art keywords
video
retrieved
feature vector
feature
retrieval
Prior art date
Legal status
Pending
Application number
CN202010177728.7A
Other languages
Chinese (zh)
Inventor
李沁 (Li Qin)
Current Assignee
Beijing QIYI Century Science and Technology Co Ltd
Original Assignee
Beijing QIYI Century Science and Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing QIYI Century Science and Technology Co Ltd
Priority to CN202010177728.7A
Publication of CN111368143A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/70: Information retrieval of video data
    • G06F 16/78: Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F 16/783: Retrieval using metadata automatically derived from the content

Landscapes

  • Engineering & Computer Science (AREA)
  • Library & Information Science (AREA)
  • Theoretical Computer Science (AREA)
  • Multimedia (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application relates to a video similarity retrieval method and apparatus, an electronic device, and a storage medium. The method includes: acquiring a video clip to be retrieved from a video to be retrieved; analyzing the video clip to be retrieved to obtain a first feature vector; and comparing the first feature vector with a video feature library to obtain a retrieval result for the video to be retrieved. The technical scheme performs retrieval by extracting a feature vector of the video clip to be retrieved in place of the traditional image feature vector. On the basis of ensuring feasibility, the stability and robustness of the features are strengthened, and retrieval precision is also improved.

Description

Video similarity retrieval method and device, electronic equipment and storage medium
Technical Field
The present application relates to the field of video retrieval, and in particular, to a method and an apparatus for retrieving video similarity, an electronic device, and a storage medium.
Background
Video retrieval is an important means of solving copyright detection, infringement inquiry, and video duplication checking. Current video retrieval methods operate on individual images in a video: key frames are extracted from the video clip to be retrieved and compared against the key frames of videos in a library to measure similarity. This approach has the following drawbacks:
First, a single frame cannot represent the content of the whole video clip to be retrieved, and the mapping between video clips is limited to the comparison results between frames. This reduces the robustness of the algorithm; even slight displacement or distortion of a key frame makes the result inaccurate.
Second, with key-frame extraction, retrieval precision and retrieval speed are in tension: maintaining high precision requires shrinking the key-frame extraction interval, and comparing more frames consumes more time. Conversely, increasing retrieval speed requires enlarging the extraction interval and reducing the number of comparisons, which raises the false detection rate.
Third, influenced by the image feature extraction method and the feature comparison method, features must be tuned case by case, which increases hyper-parameter complexity.
Disclosure of Invention
To solve, or at least partially solve, the above technical problem, the present application provides a video similarity retrieval method, apparatus, electronic device, and storage medium.
In a first aspect, an embodiment of the present application provides a video similarity retrieval method, including:
acquiring a video clip to be retrieved of a video to be retrieved;
analyzing the video clip to be retrieved to obtain a first feature vector;
and comparing the first feature vector with a video feature library to obtain a retrieval result of the video to be retrieved.
Optionally, the analyzing the video segment to be retrieved to obtain a first feature vector includes:
inputting the video clip to be retrieved into a feature extraction model, wherein the feature extraction model comprises: a plurality of 3D convolutional layers;
and sequentially convolving the video clip to be retrieved by the plurality of 3D convolution layers to obtain the first feature vector.
Optionally, the video feature library includes a second feature vector, where the second feature vector is obtained by inputting video segments from a video library into the feature extraction model.
Optionally, the comparing the first feature vector with a video feature library to obtain a retrieval result of the video to be retrieved includes:
acquiring a third feature vector matched with the first feature vector from the second feature vector;
acquiring a video set corresponding to the third feature vector;
establishing a mapping relation between the video to be retrieved and all videos in the video set;
and taking the mapping relation as a retrieval result of the video to be retrieved.
Optionally, the obtaining a third feature vector matched with the first feature vector from the second feature vector includes:
calculating the similarity of the first feature vector and the second feature vector;
and taking the second feature vector with the similarity meeting a preset condition as the third feature vector.
Optionally, the method further includes:
acquiring at least two adjacent first feature vectors;
and when the video in the video set comprises at least two adjacent third feature vectors which are matched with the at least two adjacent first feature vectors and have the same time sequence information, confirming that the video to be retrieved is the same as or partially the same as the video in the video set.
Optionally, the method further includes:
determining the repetition rate of the video to be retrieved according to the retrieval result;
and when the repetition rate is greater than or equal to a preset threshold value, executing corresponding processing operation on the video to be retrieved.
In a second aspect, an embodiment of the present application provides a video similarity retrieval apparatus, including:
the determining module is used for determining a video clip to be retrieved of the video to be retrieved;
the extraction module is used for extracting at least one first feature vector based on the video to be retrieved;
and the retrieval module is used for comparing the first characteristic vector with a preset video library to obtain a retrieval result of the video to be retrieved.
In a third aspect, the present application provides an electronic device, comprising: the system comprises a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory complete mutual communication through the communication bus;
the memory is used for storing a computer program;
the processor is configured to implement the above method steps when executing the computer program.
In a fourth aspect, the present application provides a computer-readable storage medium having stored thereon a computer program which, when being executed by a processor, carries out the above-mentioned method steps.
Compared with the prior art, the technical solution provided by the embodiments of the application has the following advantage: retrieval is performed by extracting a feature vector of the video clip to be retrieved instead of the traditional image feature vector. On the basis of ensuring feasibility, the stability and robustness of the features are strengthened, and retrieval precision is also improved.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the invention and together with the description, serve to explain the principles of the invention.
To illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below; those skilled in the art can obtain other drawings from these without inventive effort.
Fig. 1 is a flowchart of a video similarity retrieval method according to an embodiment of the present disclosure;
FIG. 2 is a schematic diagram of a feature extraction model provided in an embodiment of the present application;
fig. 3 is a schematic diagram of a mapping relationship between a video to be retrieved and a video set according to an embodiment of the present application;
fig. 4 is a flowchart of a video similarity retrieval method according to another embodiment of the present application;
fig. 5 is a flowchart of a video similarity retrieval method according to another embodiment of the present application;
fig. 6 is a block diagram of a video similarity retrieval apparatus according to an embodiment of the present disclosure;
fig. 7 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some embodiments of the present application, but not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
The method provided by the embodiments of the invention can be applied to any suitable electronic device, such as a server or a terminal. This is not specifically limited here; for convenience of description, the device is hereinafter referred to simply as the electronic device.
First, a video similarity retrieval method provided by an embodiment of the present invention is described below.
Fig. 1 is a flowchart of a video similarity retrieval method according to an embodiment of the present disclosure. As shown in fig. 1, the method comprises the steps of:
step S11, acquiring a video clip to be retrieved of a video to be retrieved;
step S12, analyzing the video clip to be retrieved to obtain a first feature vector;
and step S13, comparing the first feature vector with the video feature library to obtain a retrieval result of the video to be retrieved.
The video similarity retrieval method provided by the embodiment replaces the traditional image feature retrieval mode by extracting the feature vector of the video segment to be retrieved, so that the stability and robustness of the feature are enhanced on the basis of ensuring the feasibility, and the retrieval precision is improved.
In this embodiment, a video to be retrieved is obtained and then segmented according to a preset number of consecutive frames to obtain the video clips to be retrieved.
The preset number of frames in this embodiment is 16: images of 16 consecutive frames form one video clip to be retrieved, so a clip of less than one second serves as the retrieval unit. This significantly reduces retrieval granularity, improves identification precision, and allows retrieval results to be given for different segments of the same video.
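The segmentation step above can be sketched as follows. This is a minimal illustration, not the authors' implementation: frames are represented abstractly, though in practice each would be a 112 × 112 3-channel image as described later in this embodiment.

```python
def segment_into_clips(frames, clip_len=16):
    """Split a frame sequence into consecutive clips of clip_len frames each."""
    return [frames[i:i + clip_len]
            for i in range(0, len(frames) - clip_len + 1, clip_len)]

frames = list(range(48))          # a 48-frame video, frames numbered 0..47
clips = segment_into_clips(frames)
print(len(clips))                 # 3 clips of 16 frames each
print(clips[1][0], clips[1][-1])  # 16 31: the second clip spans frames 16..31
```

Each clip returned here is one retrieval unit, matching the "less than one second" granularity the text describes.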
In this embodiment, the first feature vector is obtained by analyzing the video segment to be retrieved, which is specifically implemented in the following manner: inputting a video clip to be retrieved into a feature extraction model, wherein the feature extraction model comprises the following steps: a plurality of 3D convolutional layers; and carrying out convolution on the video segment to be retrieved by a plurality of 3D convolution layers in sequence to obtain a first feature vector.
Fig. 2 is a schematic diagram of the feature extraction model provided in the embodiment of the present application. As shown in fig. 2, images of 16 consecutive frames are input into the pre-trained feature extraction model, which performs the convolution calculation and outputs a 256-dimensional array; this 256-dimensional output is used as the first feature vector.
In this embodiment, each of the 16 consecutive frames is a 3-channel image 112 pixels long and 112 pixels wide. The feature extraction model adopted in this embodiment comprises 6 convolution layers, which perform feature extraction on the input data. Each convolution layer contains multiple convolution kernels, and every element of a convolution kernel has a weight coefficient and a bias vector, analogous to a neuron of a feedforward neural network. The parameters of a convolution layer include the kernel size, the stride, and the padding, which together determine the size of the layer's output feature map. The kernel size may be set to any value smaller than the input image size; the larger the kernel, the more complex the input features that can be extracted.
The convolution layers adopted in this embodiment are 3D convolution layers. Compressing the data with 3D convolution keeps feature extraction fast and compression efficiency high, saving running time and reducing storage cost. The parameters of each layer are as follows:
conv1: convolution kernel 1 × 3 × 3 × 3 × 64, stride [1, 1, 2, 2, 1], padding 0;
conv2: convolution kernel 1 × 5 × 5 × 64 × 128, stride [1, 1, 5, 5, 1], padding 0;
conv3: convolution kernel 3 × 1 × 1 × 128 × 256, stride [1, 2, 1, 1, 1], padding 0;
conv4: convolution kernel 3 × 1 × 1 × 256 × 512, stride [1, 2, 1, 1, 1], padding 0;
conv5: convolution kernel 2 × 2 × 2 × 512 × 1536, stride [1, 2, 2, 2, 1], padding 0;
conv6: convolution kernel 2 × 7 × 7 × 1536 × 256, stride [1, 1, 1, 1, 1], padding 0.
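The shape bookkeeping of this 6-layer stack can be checked with the sketch below. Two assumptions are labelled in the comments: kernel shapes are read as (kT, kH, kW, C_in, C_out) and strides as [batch, T, H, W, channel], and — since the listed zero paddings do not quite close the chain under conv6's 2 × 7 × 7 kernel — conv5 is given a small padding of (1, 2, 2), which yields exactly the 256-dimensional output the text describes.

```python
# Shape propagation for the 6-layer 3D-CNN described above (a sketch, not
# the authors' code).  Kernels are (kT, kH, kW, C_in, C_out); strides are
# the T/H/W components of the patent's 5-element stride lists.  The conv5
# padding of (1, 2, 2) is an assumption needed to close the shape chain.

def conv3d_out(shape, kernel, stride, pad=(0, 0, 0)):
    """Output (T, H, W, C) of an unpadded 3D convolution with optional padding."""
    t, h, w, _ = shape
    kt, kh, kw, _, c_out = kernel
    st, sh, sw = stride
    return ((t + 2 * pad[0] - kt) // st + 1,
            (h + 2 * pad[1] - kh) // sh + 1,
            (w + 2 * pad[2] - kw) // sw + 1,
            c_out)

layers = [  # (kernel, (sT, sH, sW), padding)
    ((1, 3, 3, 3, 64),     (1, 2, 2), (0, 0, 0)),  # conv1
    ((1, 5, 5, 64, 128),   (1, 5, 5), (0, 0, 0)),  # conv2
    ((3, 1, 1, 128, 256),  (2, 1, 1), (0, 0, 0)),  # conv3
    ((3, 1, 1, 256, 512),  (2, 1, 1), (0, 0, 0)),  # conv4
    ((2, 2, 2, 512, 1536), (2, 2, 2), (1, 2, 2)),  # conv5 (assumed padding)
    ((2, 7, 7, 1536, 256), (1, 1, 1), (0, 0, 0)),  # conv6
]

shape = (16, 112, 112, 3)  # 16 consecutive 112x112 3-channel frames
for kernel, stride, pad in layers:
    shape = conv3d_out(shape, kernel, stride, pad)

print(shape)  # (1, 1, 1, 256): one 256-dimensional first feature vector
```

Flattening the final 1 × 1 × 1 × 256 volume gives the 256-dimensional first feature vector used for comparison against the video feature library.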
The feature extraction model is trained in the following way: and acquiring a training sample, wherein the training sample can be a continuous 16-frame image, and training a preset convolutional neural network model by adopting the training sample to obtain a feature extraction model.
In this embodiment, the first feature vector is compared with the video feature library to obtain a retrieval result of the video to be retrieved, and the retrieval result is specifically obtained by the following steps:
First, a video feature library and a video library are obtained. The video library is a set of complete videos, and the video feature library stores second feature vectors, which are obtained as follows: the videos in the video library are segmented into clips of 16 frames each, and each resulting clip is input into the pre-trained feature extraction model to obtain a second feature vector.
It should be noted that, in this embodiment, the 3D convolution layer in the feature extraction model is used to perform convolution on a continuous 16-frame video segment, so that on one hand, it is ensured that the video feature extraction speed is high, the convolution efficiency is high, and the time cost is saved. On the other hand, the present embodiment uses a video segment of 16 consecutive frames as a retrieval unit, which significantly reduces the retrieval granularity and improves the accuracy of the retrieval result compared with the prior art.
And then, acquiring a third feature vector matched with the first feature vector from the second feature vector, acquiring a video set corresponding to the third feature vector, establishing a mapping relation between the video to be retrieved and all videos in the video set, and taking the mapping relation as a retrieval result of the video to be retrieved.
Optionally, the third feature vector matched with the first feature vector is obtained from the second feature vectors by calculating the similarity between the first feature vector and each second feature vector, and taking each second feature vector whose similarity meets a preset condition as a third feature vector. As an example, the similarity between the first feature vector and all second feature vectors in the video feature library is calculated, and each second feature vector whose similarity is greater than or equal to a preset threshold is taken as a third feature vector. The similarity used in this embodiment is the cosine similarity. Given a 256-dimensional first feature vector A and second feature vector B, the cosine similarity is calculated as follows:
cos(A, B) = (A · B) / (||A|| ||B||)
where A · B is the dot product of A and B, ||A|| is the modulus of vector A, and ||B|| is the modulus of vector B.
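The cosine-similarity comparison can be sketched with only the standard library; this is an illustrative implementation of the formula above, not the authors' code.

```python
import math

def cosine_similarity(a, b):
    """cos(A, B) = (A . B) / (||A|| ||B||) for two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Identical vectors have similarity 1.0; orthogonal vectors have 0.0.
print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))  # 1.0
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # 0.0
```

In the embodiment the vectors would be the 256-dimensional first and second feature vectors; the two-dimensional vectors here are only for illustration.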
After the third feature vectors are obtained, determining a video segment corresponding to each third feature vector, querying a video to which each video segment belongs, and obtaining a video set according to the video to which each video segment belongs, namely the video set corresponding to the third feature vectors.
A mapping relation between the video to be retrieved and all videos in the video set is then established: the library segments matched to each video clip to be retrieved are collected, sorted by similarity, and any video whose similarity falls below a preset similarity is removed. This yields the mapping relation between each video clip to be retrieved and all videos in the video set.
For example: the video M to be retrieved includes: the video clip to be retrieved M1, the video clip to be retrieved M2, the video clip to be retrieved M3 and the video clip to be retrieved M4, the video set comprises: video Q, video P, video O, video G, video K, and video R.
The mapping relation is as follows: the video clip M1 to be retrieved corresponds to the video clip Q2 (similarity 95%) and the video clip P2 (similarity 90%).
The video clip M2 to be retrieved corresponds to the video clip Q3 (similarity 99%) and the video clip P3 (similarity 96%).
The video clip M3 to be retrieved corresponds to the video clip O1 (similarity 98%) and the video clip G1 (similarity 94%).
The video clip M4 to be retrieved corresponds to the video clip K2 (similarity 97%) and the video clip R3 (similarity 90%).
Therefore, the mapping relation between the video M to be retrieved and all videos in the video set is obtained.
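The mapping construction above can be sketched as follows. The clip and segment names mirror the example in the text; the 0.9 similarity cut-off stands in for the "preset similarity" and is an assumption.

```python
def build_mapping(matches, min_similarity=0.9):
    """Keep library matches at or above min_similarity, sorted best-first.

    matches: {query_clip: [(library_segment, similarity), ...]}
    """
    mapping = {}
    for clip, candidates in matches.items():
        kept = [(seg, sim) for seg, sim in candidates if sim >= min_similarity]
        mapping[clip] = sorted(kept, key=lambda pair: pair[1], reverse=True)
    return mapping

# The example mapping for video M from the text:
matches = {
    "M1": [("Q2", 0.95), ("P2", 0.90)],
    "M2": [("Q3", 0.99), ("P3", 0.96)],
    "M3": [("O1", 0.98), ("G1", 0.94)],
    "M4": [("K2", 0.97), ("R3", 0.90)],
}
mapping = build_mapping(matches)
print(mapping["M2"])  # [('Q3', 0.99), ('P3', 0.96)]
```

The resulting dictionary is the retrieval result: for each clip of video M, the matching library segments in descending order of similarity.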
Fig. 4 is a flowchart of a video similarity retrieval method according to another embodiment of the present application. As shown in fig. 4, the method further comprises the steps of:
step S21, acquiring at least two adjacent first feature vectors;
step S22, when the video in the video set includes at least two adjacent third feature vectors that match with the at least two adjacent first feature vectors and have the same timing information, it is determined that the video to be retrieved is the same as or partially the same as the video in the video set.
In the embodiment, the video corresponding to a certain part of the video to be retrieved can be obtained by verifying the adjacent first feature vector and the third feature vector in the video library, so that the effectiveness and the accuracy of the retrieval result are improved.
Fig. 3 is a schematic diagram of the mapping relationship between a video to be retrieved and a video set provided in an embodiment of the present application. As shown in fig. 3, the video to be retrieved comprises a first part Z1, a second part Z2, and a third part Z3.
Suppose the first part of the video to be retrieved comprises: clip Z11 to be retrieved, clip Z12 to be retrieved, clip Z13 to be retrieved, and clip Z14 to be retrieved. Retrieval shows that clip Z11 corresponds to segment Aj1 of video A in the video set, clip Z12 corresponds to segment Aj2 of video A, clip Z13 corresponds to segment Aj3 of video A, and clip Z14 corresponds to segment Aj4 of video A.
The time-sequence information is then used for judgment: if the time sequence of the four clips in the first part of the video to be retrieved is the same as that of the four matched segments of video A, the first part of the video to be retrieved is determined to be the same as the corresponding part of video A in the video set.
At the same time, clip Z11 corresponds to segment Bj1 of video B in the video set, clip Z12 corresponds to segment Bj2 of video B, clip Z13 corresponds to segment Bj3 of video B, and clip Z14 corresponds to segment Bj4 of video B.
Judging likewise with the time-sequence information: if the time sequence of the four clips in the first part of the video to be retrieved is the same as that of the four matched segments of video B, the first part of the video to be retrieved is determined to be the same as the corresponding part of video B in the video set.
The second part of the video to be retrieved comprises: clip Z21 to be retrieved, clip Z22 to be retrieved, clip Z23 to be retrieved, and clip Z24 to be retrieved. Retrieval shows that clip Z21 corresponds to segment Cj1 of video C in the video set, clip Z22 corresponds to segment Cj2 of video C, and clip Z23 corresponds to segment Cj3 of video C.
Judging with the time-sequence information: if the time sequence of the matched clips in the second part of the video to be retrieved is the same as that of the matched segments of video C, the second part of the video to be retrieved is determined to be the same as the corresponding part of video C in the video set.
The third part of the video to be retrieved comprises: clip Z31 to be retrieved and clip Z32 to be retrieved. Retrieval shows that clip Z31 corresponds to segment Dj1 of video D in the video set, and clip Z32 corresponds to segment Dj2 of video D.
Judging with the time-sequence information: if the time sequence of the matched clips in the third part of the video to be retrieved is the same as that of the matched segments of video D, the third part of the video to be retrieved is determined to be the same as the corresponding part of video D in the video set.
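The time-sequence judgment can be sketched as below. Integer segment indices stand in for the time-sequence information of the matched library segments; that representation is an illustrative assumption, not the patent's own encoding.

```python
def same_temporal_order(query_indices, matched_indices):
    """True when adjacent matches come in the same temporal order as the query.

    query_indices:   time-ordered indices of the adjacent clips to be retrieved
    matched_indices: indices of the matched segments within one library video
    """
    if len(query_indices) != len(matched_indices):
        return False
    pairs = zip(matched_indices, matched_indices[1:])
    return all(a < b for a, b in pairs)

# Clips Z11..Z14 matched to segments Aj1..Aj4 of video A, in order:
print(same_temporal_order([1, 2, 3, 4], [1, 2, 3, 4]))  # True
# Matches exist but are out of order -> the parts are not confirmed the same:
print(same_temporal_order([1, 2, 3, 4], [3, 1, 4, 2]))  # False
```

Only when adjacent matched segments preserve the query's temporal order is the part of the video to be retrieved confirmed to be the same as the corresponding part of the library video.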
Fig. 5 is a flowchart of a video similarity retrieval method according to another embodiment of the present application. As shown in fig. 5, the method further comprises the steps of:
step 31, determining the repetition rate of the video to be retrieved according to the retrieval result;
and step 32, when the repetition rate is greater than or equal to the preset threshold value, executing corresponding processing operation on the video to be retrieved.
In this embodiment, the repetition rate of the video to be retrieved is determined from the retrieval result, and whether the video to be retrieved is redundant is decided according to that rate. As an example: the video to be retrieved comprises a first part, a second part, and a third part, where the first and second parts correspond to some video in the video set. The durations of the first and second parts are summed, and the repetition rate is calculated from that duration and the total duration of the video to be retrieved. For example, if the first and second parts total 4.5 min and the video to be retrieved is 15 min long, the repetition rate is 30%.
When the repetition rate is greater than or equal to a preset threshold, a corresponding processing operation is performed on the video to be retrieved. As an example: when the repetition rate exceeds a preset threshold of 90%, the video to be retrieved is confirmed to be a redundant video and is rejected, preventing large numbers of redundant videos from consuming video library resources.
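The repetition-rate arithmetic above can be sketched directly. Durations are in minutes; the 90% threshold follows the example in the text.

```python
def repetition_rate(repeated_durations, total_duration):
    """Fraction of the query video that repeats videos already in the library."""
    return sum(repeated_durations) / total_duration

def is_redundant(rate, threshold=0.9):
    """A video is treated as redundant when its repetition rate meets the threshold."""
    return rate >= threshold

# First and second parts total 4.5 min of a 15 min video:
rate = repetition_rate([4.5], 15.0)
print(rate)                # 0.3, i.e. a 30% repetition rate
print(is_redundant(rate))  # False: below the 90% threshold, so the video is kept
```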
The video similarity retrieval method disclosed in this embodiment may also be applied to video dotting. For example, a 24-hour live stream contains several programs the user is interested in, along with other ordinary videos, and the goal is to automatically locate the start time of each program of interest so the user can be reminded to watch or perform other operations. Since a program of interest usually has a relatively fixed opening title, that title can be extracted and stored in the video library as a sample; the live stream is then treated as the video to be retrieved and searched against the library. When the similarity between some segment and a sample in the library exceeds a threshold, the current segment is taken to be the program's opening title, and the subsequent user-prompting operation is started.
In addition, the video similarity retrieval method disclosed in this embodiment may also be used for similar-video search recommendation. For example, after watching a video, a user may wish to see related content. The watched video then serves as the video to be retrieved and is compared against the entire video library; the retrieved videos are sorted from high to low similarity, and the sorted result is recommended to the user as video content.
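The dotting use case can be sketched as a scan over per-window similarities: slide a 16-frame window over the stream, compare each window's feature vector with the stored opening-title sample, and report the first window that clears the threshold. The feature extraction is assumed to have already happened; the 0.9 threshold is an assumption.

```python
def find_program_start(window_similarities, threshold=0.9):
    """Index of the first 16-frame window matching the title sample, or None."""
    for i, sim in enumerate(window_similarities):
        if sim >= threshold:
            return i
    return None

# Pre-computed similarities of successive 16-frame windows vs. the sample:
sims = [0.12, 0.08, 0.31, 0.95, 0.97, 0.40]
print(find_program_start(sims))  # 3: the fourth window starts the program
```

Multiplying the returned window index by the window duration (under one second for 16 frames) gives the approximate start time at which to prompt the user.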
Fig. 6 is a block diagram of a video similarity retrieval apparatus provided in an embodiment of the present application, which may be implemented as part or all of an electronic device through software, hardware, or a combination of the two. As shown in fig. 6, the apparatus includes:
the determining module 41 is configured to determine a video segment to be retrieved of a video to be retrieved;
an extraction module 42, configured to extract at least one first feature vector based on a video to be retrieved;
and the retrieval module 43 is configured to compare the first feature vector with a preset video library to obtain a retrieval result of the video to be retrieved.
In this embodiment, the extracting module 42 is specifically configured to obtain a pre-trained feature extraction model, and input the video segment to be retrieved into the feature extraction model to obtain a first feature vector.
In this embodiment, the video feature library includes second feature vectors, obtained by inputting the video segments in the video library into the feature extraction model.
In this embodiment, the retrieving module 43 includes:
the first obtaining submodule is used for obtaining a third feature vector matched with the first feature vector from the second feature vector;
the second obtaining submodule is used for obtaining a video set corresponding to the third feature vector;
and the storage submodule is used for establishing a mapping relation between the video to be retrieved and all videos in the video set, and taking the mapping relation as a retrieval result of the video to be retrieved.
The first obtaining submodule is specifically configured to: and calculating the similarity of the first feature vector and the second feature vector, and taking the second feature vector with the similarity meeting a preset condition as a third feature vector.
The video similarity retrieval device provided by the embodiment further comprises: and the acquisition module is used for acquiring at least two adjacent first feature vectors, and when the video in the video set comprises at least two adjacent third feature vectors which are matched with the at least two adjacent first feature vectors and have the same time sequence information, the video to be retrieved is confirmed to be the same as or partially the same as the video in the video set.
The video similarity retrieval device provided by the embodiment further comprises: and the processing module is used for determining the repetition rate of the video to be retrieved according to the retrieval result, and executing corresponding processing operation on the video to be retrieved when the repetition rate is greater than or equal to a preset threshold value.
An embodiment of the present application further provides an electronic device, as shown in fig. 7, the electronic device may include: the system comprises a processor 1501, a communication interface 1502, a memory 1503 and a communication bus 1504, wherein the processor 1501, the communication interface 1502 and the memory 1503 complete communication with each other through the communication bus 1504.
A memory 1503 for storing a computer program;
the processor 1501 is configured to implement the steps of the above embodiments when executing the computer program stored in the memory 1503.
The communication bus mentioned for the electronic device may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The communication bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one thick line is shown, but this does not mean that there is only one bus or one type of bus.
The communication interface is used for communication between the electronic device and other devices.
The memory may include a Random Access Memory (RAM) or a Non-Volatile Memory (NVM), such as at least one disk memory. Optionally, the memory may also be at least one storage device located remotely from the processor.
The processor may be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; it may also be a Digital Signal Processor (DSP), an Application-Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
The present application also provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of the above-described embodiments.
It should be noted that, for the above apparatus, electronic device, and computer-readable storage medium embodiments, since they are substantially similar to the method embodiments, their description is relatively brief; for relevant details, reference may be made to the corresponding parts of the method embodiment descriptions.
It is further noted that, herein, relational terms such as "first" and "second" may be used solely to distinguish one entity or action from another, without necessarily requiring or implying any actual such relationship or order between those entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed, or elements inherent to such a process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.
The foregoing are merely exemplary embodiments of the present invention, which enable those skilled in the art to understand or practice the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (10)

1. A video similarity retrieval method is characterized by comprising the following steps:
acquiring a video clip to be retrieved of a video to be retrieved;
analyzing the video clip to be retrieved to obtain a first feature vector;
and comparing the first feature vector with a video feature library to obtain a retrieval result of the video to be retrieved.
2. The method of claim 1, wherein the parsing the video segment to be retrieved to obtain a first feature vector comprises:
inputting the video clip to be retrieved into a feature extraction model, wherein the feature extraction model comprises: a plurality of 3D convolutional layers;
and sequentially convolving the video clip to be retrieved by the plurality of 3D convolution layers to obtain the first feature vector.
3. The method of claim 2, wherein the video feature library comprises second feature vectors, the second feature vectors being obtained by inputting video segments in a video library into the feature extraction model.
4. The method according to claim 3, wherein the comparing the first feature vector with a video feature library to obtain the retrieval result of the video to be retrieved comprises:
acquiring a third feature vector matched with the first feature vector from the second feature vector;
acquiring a video set corresponding to the third feature vector;
establishing a mapping relation between the video to be retrieved and all videos in the video set;
and taking the mapping relation as a retrieval result of the video to be retrieved.
5. The method of claim 4, wherein the obtaining a third feature vector matching the first feature vector from the second feature vector comprises:
calculating the similarity of the first feature vector and the second feature vector;
and taking the second feature vector with the similarity meeting a preset condition as the third feature vector.
6. The method of claim 4, further comprising:
acquiring at least two adjacent first feature vectors;
and when the video in the video set comprises at least two adjacent third feature vectors which are matched with the at least two adjacent first feature vectors and have the same time sequence information, confirming that the video to be retrieved is the same as or partially the same as the video in the video set.
7. The method of claim 1, further comprising:
determining the repetition rate of the video to be retrieved according to the retrieval result;
and when the repetition rate is greater than or equal to a preset threshold value, executing corresponding processing operation on the video to be retrieved.
8. A video similarity retrieval apparatus, comprising:
the determining module is used for determining a video clip to be retrieved of the video to be retrieved;
the extraction module is used for extracting at least one first feature vector based on the video to be retrieved;
and the retrieval module is used for comparing the first characteristic vector with a preset video library to obtain a retrieval result of the video to be retrieved.
9. An electronic device, comprising: the system comprises a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory complete mutual communication through the communication bus;
the memory is used for storing a computer program;
the processor, when executing the computer program, implementing the method steps of any of claims 1-7.
10. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the method steps of any one of claims 1 to 7.
CN202010177728.7A 2020-03-13 2020-03-13 Video similarity retrieval method and device, electronic equipment and storage medium Pending CN111368143A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010177728.7A CN111368143A (en) 2020-03-13 2020-03-13 Video similarity retrieval method and device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN111368143A true CN111368143A (en) 2020-07-03

Family

ID=71208959

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010177728.7A Pending CN111368143A (en) 2020-03-13 2020-03-13 Video similarity retrieval method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN111368143A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111814922A (en) * 2020-09-07 2020-10-23 成都索贝数码科技股份有限公司 Video clip content matching method based on deep learning
CN111831852A (en) * 2020-07-07 2020-10-27 北京灵汐科技有限公司 Video retrieval method, device, equipment and storage medium
CN111966859A (en) * 2020-08-27 2020-11-20 司马大大(北京)智能系统有限公司 Video data processing method and device and readable storage medium
CN116188815A (en) * 2022-12-12 2023-05-30 北京数美时代科技有限公司 Video similarity detection method, system, storage medium and electronic equipment

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20020009286A1 (en) * 2000-06-10 2002-01-24 Nec Corporation Image retrieving apparatus, image retrieving method and recording medium for recording program to implement the image retrieving method
CN107748750A (en) * 2017-08-30 2018-03-02 百度在线网络技术(北京)有限公司 Similar video lookup method, device, equipment and storage medium
CN109697434A (en) * 2019-01-07 2019-04-30 腾讯科技(深圳)有限公司 A kind of Activity recognition method, apparatus and storage medium
CN110688524A (en) * 2019-09-24 2020-01-14 深圳市网心科技有限公司 Video retrieval method and device, electronic equipment and storage medium


Similar Documents

Publication Publication Date Title
CN111368143A (en) Video similarity retrieval method and device, electronic equipment and storage medium
US10977307B2 (en) Method and apparatus for multi-dimensional content search and video identification
KR101887002B1 (en) Systems and methods for image-feature-based recognition
WO2019020049A1 (en) Image retrieval method and apparatus, and electronic device
WO2019148729A1 (en) Luxury goods identification method, electronic device, and storage medium
JP2010515991A (en) Improved image identification
CN111476256A (en) Model training method and device based on semi-supervised learning and electronic equipment
WO2021237570A1 (en) Image auditing method and apparatus, device, and storage medium
CN108881947A (en) A kind of infringement detection method and device of live stream
CN110688524B (en) Video retrieval method and device, electronic equipment and storage medium
CN111079816A (en) Image auditing method and device and server
CN112580668A (en) Background fraud detection method and device and electronic equipment
US20130121598A1 (en) System and Method for Randomized Point Set Geometry Verification for Image Identification
CN111738173B (en) Video clip detection method and device, electronic equipment and storage medium
CN111339368A (en) Video retrieval method and device based on video fingerprints and electronic equipment
CN111291807A (en) Fine-grained image classification method and device and storage medium
CN109740621B (en) Video classification method, device and equipment
CN116016365B (en) Webpage identification method based on data packet length information under encrypted flow
CN112784691B (en) Target detection model training method, target detection method and device
CN112214639A (en) Video screening method, video screening device and terminal equipment
CN110765291A (en) Retrieval method and device and electronic equipment
Ren et al. Visual words based spatiotemporal sequence matching in video copy detection
CN112148723B (en) Abnormal data optimization method and device based on electronic purse net and electronic equipment
CN111723240A (en) Image retrieval method and device and electronic equipment
CN116796022A (en) Image recognition method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination