CN112183328A

CN112183328A - Video identification method, device, equipment and storage medium

Info

Publication number: CN112183328A
Application number: CN202011035017.2A
Authority: CN
Inventors: 张庆锋; 张晶
Original assignee: Beijing Novel Supertv Digital Tv Technology Co ltd
Current assignee: Beijing Novel Supertv Digital Tv Technology Co ltd
Priority date: 2020-09-27
Filing date: 2020-09-27
Publication date: 2021-01-05

Abstract

The invention discloses a video identification method, device, equipment and storage medium. The method comprises the following steps: decompressing and decoding a video to be identified to obtain a video frame sequence; determining a characteristic value string of the video to be identified based on the frame sequence; and identifying the video to be identified based on the result of matching its characteristic value string against a pre-stored characteristic value string. The technical scheme provided by the embodiments of the application computes statistics over a group of consecutive frames rather than a single key frame, so attacks such as frame insertion and deletion, bit-rate reduction and interception have little influence on the statistical result. Robustness was good in practical tests, and a homologous video can be correctly identified from an intercepted clip only a few minutes long.

Description

Video identification method, device, equipment and storage medium

Technical Field

The embodiment of the invention relates to the technical field of video identification, in particular to a video identification method, a video identification device, video identification equipment and a storage medium.

Background

With the rapid rise of short-video applications, the authentication, management and copyright protection of digital video content have become problems to be solved urgently; meanwhile, the explosive growth in the number of original videos makes it urgent for the industry to identify massive numbers of videos automatically and efficiently. A video fingerprint algorithm extracts a video characteristic value string from the video content. This string serves as the unique 'identity' of a video, so homologous videos can be identified from their content, and operations such as 'carrying' (reposting) and 'cutting' of a video can be tracked, which addresses the above problems at the technical level.

There are many existing video fingerprint algorithms, mainly including the following: 1. searching a characteristic image of video content, and taking related characteristics as video fingerprints; 2. dividing the video key frame image into areas, counting the average value of Y components of each area, and recording the related information of the average value of each area as a video fingerprint; 3. according to the change of the shot boundary, the duration of each shot is calculated, and a time slice sequence is formed to be used as a video fingerprint and the like.

The 1st algorithm requires a training and learning process, and local image comparison is performed frame by frame during fingerprint identification. The algorithm is computationally complex and slow, and is not suitable for processing massive numbers of Internet videos in real time. Moreover, when only one segment of the source video has been intercepted, the source video may not be correctly identified. In the 2nd algorithm, after the video is attacked by frame insertion and deletion, bit-rate reduction, interception and the like, the key frames used at fingerprint acquisition and at fingerprint identification are not necessarily the same frame image, so the homologous video cannot be correctly identified. In the 3rd algorithm, after the video is subjected to geometric attacks such as interception, rotation and the addition of black borders, the shot boundaries at fingerprint acquisition and at fingerprint identification are completely different, the duration of each shot cannot be correctly calculated, and the homologous video therefore cannot be correctly identified.

Disclosure of Invention

The invention provides a video identification method, a video identification device, video identification equipment and a storage medium, so that a homologous video can be correctly identified from an arbitrarily intercepted video clip, with good robustness in testing.

In a first aspect, an embodiment of the present invention provides a video identification method, including:

decompressing and decoding a video to be identified to obtain a video frame sequence;

determining a characteristic value string of the video to be identified based on the video frame sequence;

and identifying the video to be identified based on the matching result of the characteristic value string of the video to be identified and the pre-stored characteristic value string.

In a second aspect, an embodiment of the present invention further provides a video identification apparatus, including:

the decoding module is used for decompressing and decoding the video to be identified to obtain a video frame sequence;

a determining module, configured to determine a feature value string of the video to be identified based on the video frame sequence;

and the identification module is used for identifying the video to be identified based on the matching result of the characteristic value string of the video to be identified and a pre-stored characteristic value string.

In a third aspect, an embodiment of the present invention further provides an apparatus, where the apparatus includes:

one or more processors;

a memory for storing one or more programs;

when the one or more programs are executed by the one or more processors, the one or more processors implement the video identification method described in any of the embodiments of the present application.

In a fourth aspect, an embodiment of the present invention further provides a computer-readable storage medium, on which a computer program is stored, where the computer program is configured to, when executed by a processor, implement a video identification method as described in any of the embodiments of the present application.

The video identification method, apparatus, device and storage medium provided by the above embodiments include: decompressing and decoding a video to be identified to obtain a video frame sequence; determining a characteristic value string of the video to be identified based on the video frame sequence; and identifying the video to be identified based on the result of matching its characteristic value string against the pre-stored characteristic value string. The technical scheme provided by the embodiments of the application computes statistics over a group of consecutive frames rather than a single key frame, so attacks such as frame insertion and deletion, bit-rate reduction and interception have little influence on the statistical result. Robustness was good in practical tests, and the video can be correctly identified from an intercepted clip only a few minutes long.

Drawings

Fig. 1 is a flowchart of a video recognition method according to an embodiment of the present invention;

Fig. 2 is a flowchart of a video recognition method according to a second embodiment of the present invention;

Fig. 3 is a flowchart of video fingerprint acquisition according to an embodiment of the present application;

Fig. 4 is a flowchart of video fingerprint identification provided in the second embodiment of the present application;

Fig. 5 is a schematic structural diagram of a video recognition apparatus according to a third embodiment of the present invention;

Fig. 6 is a schematic structural diagram of an apparatus according to a fourth embodiment of the present invention.

Detailed Description

The present invention will be described in further detail with reference to the accompanying drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and are not limiting of the invention. It should be further noted that, for the convenience of description, only some of the structures related to the present invention are shown in the drawings, not all of the structures.

Example one

Fig. 1 is a flowchart of a video identification method according to an embodiment of the present invention. The present embodiment is applicable to identifying a video; the method may be executed by a video identification device, and the device may be implemented in software and/or hardware.

As shown in fig. 1, the video identification method provided in the embodiment of the present application mainly includes the following steps:

S11, decompressing and decoding the video to be identified to obtain a video frame sequence.

In this embodiment, the characteristic value string of the video to be identified can be efficiently calculated and matched against the pre-stored video characteristic value string, so that whether the video is a homologous video can be judged.

The video to be identified may be any section of continuous video intercepted on the Internet. It may be a video that has undergone processing such as frame insertion and deletion, bit-rate reduction or interception, or a video without any processing; the type of the video is not limited in this embodiment. A frame sequence may also be referred to as a video sequence and may be understood as a sequence of video images arranged frame by frame.

Generally, to meet display requirements, the data size of a video is relatively large, and directly storing or transmitting large amounts of video is difficult, so compression technology is needed to reduce the bit rate. The video to be identified in the present embodiment is a video that has been encoded and compressed, so it is decompressed and decoded to obtain a video frame sequence. It should be noted that any technology capable of decompressing and decoding video may be used; the decompression and decoding method is not limited in this embodiment.

S12, determining the characteristic value string of the video to be identified based on the video frame sequence.

In the embodiment, the characteristic value string serves as the unique 'identity' of the video, and the homologous video can be identified according to the video content. That is, the video content is analyzed to obtain a feature value string that can uniquely identify a segment of video, and this feature value string may also be referred to as a video fingerprint.

The characteristic value string of the video to be identified is obtained as follows: the frame sequence of the video to be identified is divided into groups by length; each group of frame sequences is statistically analyzed to obtain the characteristic value string of that group; and the characteristic value strings of all groups are ordered by time to obtain the characteristic value string of the video to be identified, which is used to identify that video.

This embodiment provides a method for determining the characteristic value string of a video to be identified. The video image is divided into 2M regions that are equal in size and do not overlap one another. The source video is decoded to obtain a frame sequence in YUV format. For each of the 2M regions, the average value of the luminance signal Y over each consecutive N-second frame sequence is calculated. The averages of each 2 adjacent regions are compared to obtain the magnitude relationship of the luminance signal Y of the 2M regions within the N-second frame sequence, from which a K-bit characteristic value is generated. The string of characteristic values obtained after processing the complete video is the fingerprint of the video.
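The per-region averaging step can be sketched as follows. This is a minimal illustration rather than the patented implementation: the function name `region_means`, the nested-list representation of a frame's Y plane, and the choice of equal-width vertical strips are all assumptions, since the text only requires regions of equal size that do not overlap.

```python
from typing import List

# Assumed layout: one frame's Y plane as rows of luminance samples.
Frame = List[List[int]]


def region_means(frames: List[Frame], num_regions: int) -> List[float]:
    """Average Y value of each region over all frames of one group.

    Assumption: regions are equal-width vertical strips, left to right.
    """
    height = len(frames[0])
    width = len(frames[0][0])
    if width % num_regions != 0:
        raise ValueError("image width must divide evenly into the regions")
    strip = width // num_regions
    sums = [0] * num_regions
    for frame in frames:
        for row in frame:
            for r in range(num_regions):
                sums[r] += sum(row[r * strip:(r + 1) * strip])
    samples = len(frames) * height * strip  # samples per region in the group
    return [s / samples for s in sums]


# Two tiny 2x4 frames forming one group, split into 2 regions.
frames = [
    [[10, 10, 200, 200], [10, 10, 200, 200]],
    [[30, 30, 100, 100], [30, 30, 100, 100]],
]
means = region_means(frames, 2)
```

Because the average is taken over every frame in the group, inserting or dropping a few frames shifts each mean only slightly, which is the robustness property the text relies on.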

This embodiment describes only one method for determining the characteristic value string of the video; the determination is not limited to this method.

S13, identifying the video to be identified based on the matching result of the characteristic value string of the video to be identified and the pre-stored characteristic value string.

In this embodiment, the pre-stored feature value string refers to a feature value string obtained after video fingerprint acquisition is performed on a source video. The string of characteristic values is pre-stored in a fingerprint database for identifying the identity of the source video.

In this embodiment, a video identifier and the characteristic value string stored paired with it are read from the fingerprint database, and the characteristic value string of the video to be identified is compared with the pre-stored characteristic value string. After each comparison, the pre-stored characteristic value string slides backwards by 1 bit. The matching rate obtained by each comparison is compared with a set threshold; if any matching rate is greater than or equal to the set threshold, the comparison ends and the video identifier corresponding to the pre-stored characteristic value string is output. If all matching rates are smaller than the set threshold, it is judged whether all characteristic value strings pre-stored in the fingerprint database have been compared. If so, it is determined that no homologous video of the video to be identified is found in the fingerprint database. If not, the next video identifier and its paired characteristic value string are read from the fingerprint database, and the characteristic value string of the video to be identified is compared with the currently read characteristic value string.

The video identification method provided by the embodiment of the application comprises the following steps: decompressing and decoding a video to be identified to obtain a video frame sequence; determining a characteristic value string of the video to be identified based on the video frame sequence; and identifying the video to be identified based on the result of matching the characteristic value string against the pre-stored characteristic value string. The technical scheme provided by the embodiment of the application computes statistics over a group of consecutive frames rather than a single key frame, so attacks such as frame insertion and deletion, bit-rate reduction and interception have little influence on the statistical result. Robustness was good in practical tests, and the video can be correctly identified from an intercepted clip only a few minutes long.

On the basis of the above embodiment, the method further includes: acquiring a characteristic value string of a source video according to a video fingerprint algorithm; acquiring a video identifier corresponding to a source video; storing the pairing of the string of feature values of the source video and the video identification in a fingerprint database.

Fig. 3 is a flowchart of video fingerprint acquisition provided in an embodiment of the present application. Video fingerprint acquisition must analyze the entire source video to calculate its characteristic value string, and store the source video identifier paired with that characteristic value string in a video fingerprint database. Video fingerprint acquisition refers to the process of analyzing a given video frame sequence with a video fingerprint algorithm to obtain a video fingerprint, and storing the video identifier paired with the video fingerprint in the video fingerprint database. As shown in Fig. 3, video fingerprint acquisition mainly includes: acquiring an input source video, decompressing and decoding the source video to obtain a video frame sequence, and processing the video frame sequence with the fingerprint algorithm to obtain a characteristic value string; and acquiring an input source video identifier, and storing the source video identifier paired with the characteristic value string in the fingerprint database.

In this embodiment, the principle of the fingerprint algorithm is as follows. The video image is divided into 2M regions that are equal in size and do not overlap one another. The source video is decoded to obtain a frame sequence in YUV format. For each of the 2M regions, the average value of the luminance signal Y over each consecutive N-second frame sequence is calculated. The averages of each 2 adjacent regions are compared to obtain the magnitude relationship of the luminance signal Y of the 2M regions within the N-second frame sequence, from which a K-bit characteristic value is generated. The string of characteristic values obtained after processing the complete video is the fingerprint of the video. It should be noted that M, N and K are set according to the application scenario.

It should be noted that, in this embodiment, the fingerprint algorithm used for determining the feature value string of the video to be identified and the feature value string of the source video is the same fingerprint algorithm.

Example two

On the basis of the above embodiment, the video identification method is further optimized. Fig. 2 is a flowchart of the video identification method provided in the second embodiment of the present application. As shown in Fig. 2, the video identification method provided in the embodiment of the present application mainly includes the following steps:

S21, decompressing and decoding the video to be identified to obtain a video frame sequence, and dividing the video frame sequence, in time order, into a plurality of groups of frame sequences of equal length.
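The grouping in S21 can be illustrated with a short sketch. The function name `group_frames`, the `fps` parameter, and the decision to drop a trailing incomplete group are assumptions made for illustration; the text does not say how a final partial group is handled.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def group_frames(frames: Sequence[T], fps: int, seconds: int) -> List[Sequence[T]]:
    """Split decoded frames, in time order, into groups of `seconds` seconds each.

    Assumption: a trailing group shorter than fps * seconds is discarded.
    """
    size = fps * seconds
    if size <= 0:
        raise ValueError("fps and seconds must be positive")
    full_groups = len(frames) // size
    return [frames[i * size:(i + 1) * size] for i in range(full_groups)]


# Ten frames at 2 frames per second, grouped into 2-second groups.
groups = group_frames(list(range(10)), fps=2, seconds=2)
```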

S22, for each group of frame sequences, dividing the image of the video to be identified into a plurality of image areas.

In the present embodiment, the number of image areas is even. The image of the video to be identified may be divided evenly, in any manner, into an even number of image areas.

S23, for each image area, calculating the average value of the luminance signal over the group of frame sequences.

In this embodiment, 2 image areas are taken as an example: the average value of the luminance signal of the first image area over the group of frame sequences is calculated, and the average value of the luminance signal of the second image area over the group of frame sequences is calculated.

S24, determining the characteristic value string of each group of frame sequences based on the average values of the luminance signals of the image areas in that group, and splicing the characteristic value strings of all groups into the characteristic value string of the video to be identified.

Determining the characteristic value string of the video to be identified based on the average values of the luminance signals of the plurality of image areas in each group of frame sequences comprises: for each group of frame sequences, determining in turn the difference between the average luminance values of a first image area and a second image area, where the first image area and the second image area are any two adjacent image areas; and determining the characteristic value string of the video to be identified based on the result of comparing each difference with a first threshold.

In this embodiment, the video fingerprint is an effective feature of the video itself, collected from the video content, and can uniquely represent the 'identity' of the video. During video fingerprint acquisition and identification, no operations such as encryption or encoding are performed on the video; the video is only read once, which is similar to the acquisition and identification of human fingerprints in everyday life.

After the video to be identified is decompressed and decoded, a frame sequence in YUV format is obtained. The image is divided into 2M regions that are equal in size and do not overlap one another. For each of the 2M regions, the average value of the luminance signal Y over each consecutive N-second frame sequence is calculated. The averages of the luminance signal Y of each 2 adjacent regions are compared, and a K-bit characteristic value is calculated for the N-second frame sequence according to a preset first threshold. The string of characteristic values obtained after processing the complete video is the fingerprint of the video.

In the following, M = 1, N = 1 and K = 2 are taken as an example. The image is divided into 2 regions that are equal in size and do not overlap; the average value of the luminance signal Y of each of the two regions within 1 second is calculated; the averages of the 2 regions are compared; and a 2-bit characteristic value is calculated according to a threshold Δ1. A 2-bit length can represent four cases. The following formula represents the calculation of the 2-bit characteristic value:

wherein: w is the image width, h is the image height, Δ1 is the threshold, and fval is the calculated characteristic value of 2-bit length.
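For the M = 1, N = 1, K = 2 case, one possible encoding of the four cases is sketched below. The mapping is an assumption, not the formula of the application: the surrounding text only establishes that 00 and 01 mark differences whose magnitude exceeds the threshold, that 10 and 11 mark differences close to 0, and that 00 pairs with 10 and 01 pairs with 11. Which sign is assigned to 00 is a guess, and `two_bit_feature` is a hypothetical name.

```python
def two_bit_feature(mean_a: float, mean_b: float, delta: float) -> str:
    """Encode the relation of two region means as a 2-bit string.

    Assumed mapping (not confirmed by the source text):
      "00": mean_a exceeds mean_b by more than delta   (reliable)
      "01": mean_b exceeds mean_a by more than delta   (reliable)
      "10": 0 <= mean_a - mean_b <= delta              (unreliable)
      "11": -delta <= mean_a - mean_b < 0              (unreliable)
    """
    diff = mean_a - mean_b
    if diff > delta:
        return "00"
    if diff < -delta:
        return "01"
    return "10" if diff >= 0 else "11"


examples = [
    two_bit_feature(150.0, 20.0, 10.0),
    two_bit_feature(20.0, 150.0, 10.0),
    two_bit_feature(25.0, 20.0, 10.0),
    two_bit_feature(20.0, 25.0, 10.0),
]
```

Under this mapping the first bit flags reliability and the second bit carries the sign of the difference.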

Determining the characteristic value string of a group of frame sequences based on the result of comparing each difference with the first threshold comprises: determining a plurality of bit values based on the result of comparing each difference with the first threshold; ordering the bit values according to the relative positions of the image areas to obtain the characteristic value string of the group of frame sequences; and splicing the characteristic value strings of the successive groups of frame sequences, in time order, into the characteristic value string of the video to be identified.
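The ordering and splicing described above might look like the following sketch. It assumes that regions are paired disjointly as (0, 1), (2, 3) and so on, which gives 2 bits per pair, and it uses an assumed 2-bit encoding in which the first bit flags a near-zero difference and the second bit gives its sign. Neither the pairing nor the bit assignment is stated in the text, and both function names are hypothetical.

```python
from typing import List


def pair_bits(mean_a: float, mean_b: float, delta: float) -> str:
    """Assumed 2-bit code: first bit flags |difference| <= delta, second bit the sign."""
    diff = mean_a - mean_b
    unreliable = "1" if abs(diff) <= delta else "0"
    sign = "0" if diff >= 0 else "1"
    return unreliable + sign


def video_feature_string(group_means: List[List[float]], delta: float) -> str:
    """Concatenate per-pair bits by region position, then per group in time order."""
    parts = []
    for means in group_means:                   # groups are already in time order
        for i in range(0, len(means) - 1, 2):   # assumed pairing: (0,1), (2,3), ...
            parts.append(pair_bits(means[i], means[i + 1], delta))
    return "".join(parts)


# Three one-second groups, each with the means of 2 regions.
fingerprint = video_feature_string([[150.0, 20.0], [20.0, 25.0], [20.0, 150.0]], 10.0)
```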

In this embodiment, the video to be fingerprinted may have been attacked intentionally or unintentionally (that is, processed by transcoding, scaling, cropping, rotating, mirroring, bit-rate reduction and the like, as mentioned above). For the processed video, the difference between the average luminance values of the two regions may differ from that of the source video, so the characteristic values collected from the two videos are not necessarily the same. In particular, when the difference between the two averages is close to 0, the opposite result is likely to be extracted. That is, a 2-bit characteristic value of 10 or 11 collected from an N-second frame sequence is somewhat less reliable, while the other set of characteristic values, 00 and 01, is relatively more reliable. Therefore, during video fingerprint identification, only the relatively reliable parts of the source video's characteristic value string, those with the value 00 or 01, are compared. The unreliable parts with the value 10 or 11 are not compared, but they act as placeholders and must still be stored in the fingerprint database.

Because the first threshold is greater than the second threshold, if a set of frame sequence feature values at the time of video fingerprinting is 00, then the feature values of the set of frame sequences at the time of video fingerprinting are also generally 00; if a set of frame sequence feature values at video fingerprinting is 10, then the set of frame sequence feature values at video fingerprinting may also be 00. Similarly, the case where the characteristic value is 01 is also true.

As described above, even if the video is attacked and the difference between the average luminance values of the two regions changes, when the characteristic value of a given group of frame sequences is 00 at video fingerprint acquisition, the characteristic value of that group at video fingerprint identification may be 00 or 10 but is almost never 01 or 11, which provides a guarantee for correct identification of video fingerprints. The same holds when the characteristic value is 01.
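The reliability rule above suggests a comparison that skips placeholder positions. The sketch below is one interpretation, not the method of the application: it treats the fingerprint as a list of 2-bit symbols, compares only positions where the stored source symbol is 00 or 01, and counts a match when the second bit agrees (so a source 00 matches a query 00 or 10). The symbol-level comparison, the second-bit rule and the name `masked_match_rate` are all assumptions.

```python
from typing import List


def masked_match_rate(source: List[str], query: List[str]) -> float:
    """Matching rate over the reliable source symbols only.

    Assumptions: symbols are 2-bit strings; source symbols "10" and "11" are
    placeholders and are skipped; the second bit is compared.
    """
    if len(source) != len(query):
        raise ValueError("symbol lists must have the same length")
    compared = 0
    matched = 0
    for src, qry in zip(source, query):
        if src not in ("00", "01"):
            continue  # unreliable placeholder: stored but not compared
        compared += 1
        if src[1] == qry[1]:
            matched += 1
    return matched / compared if compared else 0.0


rate = masked_match_rate(["00", "10", "01", "00"], ["10", "01", "01", "01"])
```

In this example the second source symbol is a placeholder, so three symbols are compared and two of them agree.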

S25, comparing the characteristic value string of the video to be identified with the pre-stored characteristic value string bit by bit, and calculating a matching rate.

The video to be identified may be a section arbitrarily intercepted from the source video, so the characteristic value string collected from it is only a part of the characteristic value string of the source video, whereas the pre-stored characteristic value string is the complete characteristic value string of the source video. The only option is therefore to enumerate all possibilities: the pre-stored characteristic value string is slid bit by bit, using the length of the characteristic value string of the video to be identified as the window length, to generate a plurality of characteristic value strings to be compared, which are then compared one by one.

S26, when the matching rate is greater than a set threshold, the comparison ends and the videos uniquely characterized by the two characteristic value strings are confirmed to be homologous videos. When the matching rate is less than the set threshold, the pre-stored characteristic value string is slid backwards by one bit and compared again with the characteristic value string of the video to be identified. This repeats until the remaining length of the pre-stored characteristic value string after sliding is less than the length of the characteristic value string of the video to be identified, at which point the comparison ends and the videos uniquely characterized by the two characteristic value strings are confirmed not to be homologous videos.
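Steps S25 and S26 amount to a sliding-window comparison, sketched below for fingerprints represented as strings of '0' and '1'. The names `match_rate` and `sliding_match` are hypothetical. The sketch accepts a window when the rate is greater than or equal to the threshold, as elsewhere in the description, although S26 itself says "greater than".

```python
from typing import Optional, Tuple


def match_rate(a: str, b: str) -> float:
    """Fraction of positions at which two equal-length bit strings agree."""
    same = sum(1 for x, y in zip(a, b) if x == y)
    return same / len(a)


def sliding_match(query: str, stored: str, threshold: float) -> Optional[Tuple[int, float]]:
    """Slide `query` along `stored` one bit at a time.

    Returns (offset, rate) for the first window whose rate reaches the
    threshold, or None once the remaining stored bits are shorter than the query.
    """
    if not query:
        raise ValueError("query fingerprint must not be empty")
    for offset in range(len(stored) - len(query) + 1):
        rate = match_rate(query, stored[offset:offset + len(query)])
        if rate >= threshold:
            return offset, rate
    return None


hit = sliding_match("0101", "0011010111", 0.9)
miss = sliding_match("0000", "1111", 0.9)
```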

Identifying the video to be identified based on the matching rate between its characteristic value string and any characteristic value string to be compared comprises: acquiring the characteristic value strings to be compared in order; matching each characteristic value string to be compared against the characteristic value string of the video to be identified to obtain a matching rate; when the matching rate is greater than or equal to a preset threshold, determining the video identifier stored paired with the pre-stored characteristic value string as the video identifier corresponding to the video to be identified; and when the matching rate is smaller than the preset threshold, continuing to calculate the matching rate between the next characteristic value string to be compared and the characteristic value string of the video to be identified, until a matching rate is greater than or equal to the preset threshold.

In this embodiment, it is assumed that a video fingerprint with a length of 64 bits is collected from the video to be identified during video fingerprint identification. This 64-bit characteristic value string is compared, sliding bit by bit, with the characteristic value strings in the video fingerprint database, and the matching rate obtained by each comparison is compared with a set threshold to determine whether the match succeeds.

The method for determining the set threshold is briefly described below. Given two arbitrary characteristic values of 64-bit length, what is the probability R that their matching rate exceeds the specified matching-rate threshold of 90%? The formula for calculating the probability of a given matching rate is as follows:

wherein: n is the number of bits with the same position and the same value, and R is the probability of n-bit matching.

When n is 64, i.e. the matching rate is 100%, R is 1/2⁶⁴.

When n is 63, the matching rate is 63/64 = 98.4%;

when n is 62, the matching rate is 62/64 = 96.9%;

when n is 61, the matching rate is 61/64 = 95.3%;

when n is 60, the matching rate is 60/64 = 93.8%;

when n is 59, the matching rate is 59/64 = 92.2%;

when n is 58, the matching rate is 58/64 = 90.6%;

when n is 57, the matching rate is 57/64 = 89.1%.

From the above calculation, the probability that the matching rate of two characteristic values of 64-bit length exceeds the 90% threshold is:

R≤1/2⁶⁴+1/2⁵⁵+1/2⁴⁷+1/2³⁹+1/2³²+1/2²⁸+1/2²⁴

namely: R≤7/2²⁴≤1/2²¹

From the above, the probability that the matching rate of any two characteristic values of 64-bit length is more than 90% is less than 1/2²¹, i.e. less than 1 in 2.09 million. That is, if the length of the video fingerprint characteristic value is set to 64 bits and the matching-rate threshold is set to 90%, the probability that any two different videos (corresponding to two characteristic value strings of 64-bit length) are mistakenly judged to be homologous videos is less than 1 in 2.09 million.
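The bound can be cross-checked numerically. The sketch below assumes each of the 64 bits of two unrelated fingerprints agrees independently with probability 1/2, which is a standard binomial model and not necessarily the formula used in the application; `prob_at_least` is a hypothetical helper. A matching rate above 90% requires at least 58 matching bits, since 57/64 is 89.1%.

```python
from fractions import Fraction
from math import comb


def prob_at_least(bits: int, min_matches: int) -> Fraction:
    """P(at least `min_matches` of `bits` positions agree) for independent fair bits."""
    favourable = sum(comb(bits, k) for k in range(min_matches, bits + 1))
    return Fraction(favourable, 2 ** bits)


p = prob_at_least(64, 58)
bound = Fraction(1, 2 ** 21)
below_bound = p < bound
```

Under this model the exact probability is 83278001/2⁶⁴, far below the 1/2²¹ bound stated above, so the stated bound is conservative.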

Further, when the shortest length of the video fingerprint characteristic value string set by the video fingerprint identification system is larger, or the preset threshold value is larger, the probability that any two different videos are misjudged as the homologous video is smaller. The video fingerprint identification system can set the shortest length of the characteristic value string and a preset threshold according to the requirements of specific application scenes.

To make the result of video fingerprint identification more reliable, the shortest video fingerprint characteristic value string length set by the video fingerprint identification system is not less than 400 bits. In testing the video fingerprint identification system, 30 videos differing in container format, length, size, style, definition, type and the like were selected. The experimental data show that the average fingerprint matching rate of two non-homologous videos is about 20% and the maximum does not exceed 35%. The matching-rate threshold in the fingerprint identification system can therefore be set lower without affecting identification accuracy, so that homologous videos are identified more quickly.

Fig. 4 is a flowchart of video fingerprint identification provided in the second embodiment of the present application. As shown in Fig. 4, the video fingerprint identification flow mainly includes: acquiring an input video to be identified, decompressing and decoding it to obtain a video frame sequence, and processing the frame sequence with the fingerprint algorithm to obtain a characteristic value string; reading a video identifier and its corresponding characteristic value string from the fingerprint database; sliding that characteristic value string bit by bit, using the length of the characteristic value string of the video to be identified as the window length, to generate a plurality of characteristic value strings to be compared; comparing the characteristic value string of the video to be identified with the characteristic value strings to be compared one by one; and comparing the matching rate obtained by each comparison with a set threshold. If any matching rate is greater than or equal to the set threshold, the video identifier corresponding to the pre-stored characteristic value string is output. If all matching rates are smaller than the set threshold, it is judged whether all fingerprints in the fingerprint database have been compared. If all fingerprints have been compared, it is determined that no homologous video corresponding to the video to be identified is found in the fingerprint database. If not, the next video identifier and its corresponding characteristic value string are read from the fingerprint database, and the operation of comparing the characteristic value string of the video to be identified with the pre-stored characteristic value string is executed again.
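The database loop of this flow can be sketched as follows. The fingerprint database is modelled as an in-memory mapping from video identifier to a string of '0' and '1', and the function name `identify` and the example identifiers are hypothetical. A real system would use persistent storage and might compare only reliable positions.

```python
from typing import Dict, Optional


def identify(query: str, database: Dict[str, str], threshold: float) -> Optional[str]:
    """Return the identifier of the first stored fingerprint that matches, else None."""
    if not query:
        raise ValueError("query fingerprint must not be empty")
    for video_id, stored in database.items():
        # Slide the query along the stored string one bit at a time.
        for offset in range(len(stored) - len(query) + 1):
            window = stored[offset:offset + len(query)]
            same = sum(1 for q, w in zip(query, window) if q == w)
            if same / len(query) >= threshold:
                return video_id
    return None  # every stored fingerprint compared, no homologous video found


database = {"video-a": "1111000011", "video-b": "0011010111"}
found = identify("0101", database, 0.9)
not_found = identify("0101", {"video-x": "1010"}, 0.9)
```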

The video identification method provided by the embodiment of the application comprises the following steps: decompressing and decoding a video to be identified to obtain a video frame sequence, and dividing the video frame sequence into groups; for each group of frame sequences, dividing each frame image of the video to be identified into a plurality of image areas; for each image area, counting the average value of the luminance signals over the group of consecutive frames; determining a characteristic value string of the group of frame sequences based on the average values of the luminance signals of the plurality of image areas in the group; splicing the characteristic value strings of all the groups into the characteristic value string of the video to be identified; and identifying the video to be identified based on the matching result of the characteristic value string of the video to be identified and the pre-stored characteristic value string. The technical scheme provided by the embodiment of the application computes statistics not on single key frames but on a group of consecutive frames, so that attacks such as frame adding and deleting, code rate reduction and clipping have little influence on the statistical result. Robustness was good in practical tests, and a homologous video can be correctly identified from a clip of a few minutes.

EXAMPLE III

Fig. 5 is a schematic structural diagram of a video identification apparatus according to a third embodiment of the present invention. The embodiment is applicable to identifying a video, and the apparatus may be implemented in software and/or hardware.

As shown in fig. 5, the video recognition apparatus provided in the embodiment of the present application mainly includes:

the decoding module 51 is configured to decompress and decode a video to be identified to obtain a video frame sequence, and to divide the video frames into groups of equal length;

a determining module 52, configured to determine a feature value string of the video to be identified based on the video frame sequence;

and the identifying module 53 is configured to identify the video to be identified based on a matching result between the feature value string of the video to be identified and a pre-stored feature value string.

The video identification device provided by the embodiment of the application is configured to: decompress and decode a video to be identified to obtain a video frame sequence; determine a characteristic value string of the video to be identified based on the video frame sequence; and identify the video to be identified based on the matching result of the characteristic value string of the video to be identified and the pre-stored characteristic value string. The technical scheme provided by the embodiment of the application computes statistics not on a single key frame but on groups of consecutive frames, so that attacks such as frame adding and deleting, code rate reduction and clipping have little influence on the statistical result. Robustness was good in an actual test, and a homologous video can be correctly identified from a clip of a few minutes.

Further, the determining module 52 includes: a region dividing unit, an average value statistic unit and a characteristic value string determining unit; wherein:

the region dividing unit is used for dividing the video frame sequence into a plurality of groups of frame sequences of equal length in time order, and, for each group of frame sequences, dividing each frame image in the group into a plurality of image regions;

the average value statistic unit is used for counting, for each image region, the average value of the luminance signals over the group of consecutive frames;

and the characteristic value string determining unit is used for determining the characteristic value string of the group of frame sequences in the video to be identified based on the average value of the brightness signals of the image areas in each group of frame sequences.
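The grouping and per-region averaging performed by the region dividing unit and the average value statistic unit can be sketched as follows. This is a sketch under stated assumptions, not the layout used by the embodiment: each frame is taken to be a 2-D list of luminance (Y) values, the image regions are assumed to form a regular `rows` by `cols` grid, and a trailing incomplete group is assumed to be dropped. The names `group_frames` and `region_means` are hypothetical.

```python
from typing import List, Sequence

Frame = Sequence[Sequence[float]]  # luminance plane: rows of pixel values


def group_frames(frames: Sequence[Frame], group_len: int) -> List[Sequence[Frame]]:
    """Split frames into consecutive groups of equal length, in time order.

    Assumption: frames left over after the last full group are discarded.
    """
    usable = len(frames) - len(frames) % group_len
    return [frames[i:i + group_len] for i in range(0, usable, group_len)]


def region_means(group: Sequence[Frame], rows: int, cols: int) -> List[float]:
    """Mean luminance of each grid region, averaged over every frame in the group.

    Regions are returned in row-major order. The frame must be at least
    rows x cols pixels so that no region is empty.
    """
    height, width = len(group[0]), len(group[0][0])
    means = []
    for r in range(rows):
        y0, y1 = r * height // rows, (r + 1) * height // rows
        for c in range(cols):
            x0, x1 = c * width // cols, (c + 1) * width // cols
            total = sum(frame[y][x]
                        for frame in group
                        for y in range(y0, y1)
                        for x in range(x0, x1))
            means.append(total / (len(group) * (y1 - y0) * (x1 - x0)))
    return means


# Two tiny 2x4 frames, split into a left and a right region.
f1 = [[10, 10, 20, 20], [10, 10, 20, 20]]
f2 = [[30, 30, 40, 40], [30, 30, 40, 40]]
print(region_means([f1, f2], rows=1, cols=2))  # [20.0, 30.0]
```

Because each mean is taken over all frames of a group, removing or inserting a single frame changes the value only slightly, which is the property the embodiment relies on for robustness.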

The characteristic value string determining unit is specifically configured to sequentially determine, for each group of frame sequences, a difference value of average values of luminance signals of the first image region and the second image region; wherein the first image area and the second image area are any two adjacent image areas; and determining a characteristic value string of the group of frame sequences of the video to be identified based on the comparison result of each difference value and the first threshold value.

Further, the determining a characteristic value string of the video to be identified based on the comparison result of each difference value with the first threshold value includes: determining a plurality of bit values based on the comparison of each difference value with the first threshold value; ordering the bit values according to the relative position of each image region to obtain the characteristic value string of the group of frame sequences; and splicing the characteristic value strings of all the groups into the characteristic value string of the video to be identified.
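The conversion of region averages into bits can be illustrated as below. The text does not state in this passage which comparison outcome maps to which bit value, nor how adjacency is enumerated, so both are assumptions of the sketch: a bit is taken to be '1' when the difference between two consecutive region averages exceeds the first threshold, and "adjacent" is taken to mean consecutive in a row-major list of regions. The names `group_bits` and `video_fingerprint` are hypothetical.

```python
from typing import List, Sequence


def group_bits(means: Sequence[float], first_threshold: float = 0.0) -> str:
    """One bit per pair of adjacent regions (assumed rule: '1' if the difference exceeds the threshold)."""
    return "".join(
        "1" if means[i] - means[i + 1] > first_threshold else "0"
        for i in range(len(means) - 1)
    )


def video_fingerprint(per_group_means: Sequence[Sequence[float]],
                      first_threshold: float = 0.0) -> str:
    """Splice the per-group bit strings together in time order."""
    return "".join(group_bits(m, first_threshold) for m in per_group_means)


# Two groups with four regions each give 3 bits per group.
means_by_group: List[List[float]] = [[20, 30, 25, 25], [5, 1, 9, 9]]
print(group_bits(means_by_group[0]))      # 010
print(video_fingerprint(means_by_group))  # 010100
```

With N regions per frame this rule yields N - 1 bits per group, so the group length and region count together determine how much video is needed to reach a given fingerprint length.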

Further, the identifying module 53 is specifically configured to compare the feature value string of the video to be identified with the pre-stored feature value string bit by bit, and calculate to obtain a matching rate;

when the matching rate is greater than a set threshold value, the comparison is finished, and the video uniquely characterized by the two characteristic value strings is determined to be the homologous video;

and when the matching rate is less than the set threshold value, sliding the pre-stored characteristic value string backwards by one bit and comparing it bit by bit with the characteristic value string of the video to be identified again, repeating until the length of the pre-stored characteristic value string remaining after sliding is less than the length of the characteristic value string of the video to be identified; the comparison then ends, and it is confirmed that the videos uniquely characterized by the two characteristic value strings are not homologous videos.
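The pairwise comparison performed by the identifying module 53, including its stop condition, might look like the following sketch. The matching rate is assumed to be the fraction of equal bits, and the comparison uses "greater than or equal to" as in the flow of Fig. 4, whereas the text above states "greater than"; the function name `first_matching_offset` and the string-of-bits representation are hypothetical.

```python
from typing import Optional


def match_rate(query: str, window: str) -> float:
    """Fraction of bit positions that agree (assumed definition of the matching rate)."""
    return sum(q == w for q, w in zip(query, window)) / len(query)


def first_matching_offset(query: str, stored: str, threshold: float) -> Optional[int]:
    """Return the bit offset of the first window that reaches the threshold, else None."""
    if not query:
        raise ValueError("query must be non-empty")
    offset = 0
    # Stop once the part of the stored string that remains after sliding
    # is shorter than the query.
    while len(stored) - offset >= len(query):
        window = stored[offset:offset + len(query)]
        if match_rate(query, window) >= threshold:
            return offset  # homologous: comparison ends early
        offset += 1        # slide the stored string backwards by one bit
    return None            # not homologous


print(first_matching_offset("1100", "00110011", threshold=1.0))  # 2
print(first_matching_offset("1111", "0000000", threshold=0.5))   # None
```

Returning the offset rather than a plain boolean is a design choice of the sketch; it indicates where in the source fingerprint the matching clip begins.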

Further, the apparatus further comprises:

the acquisition module is used for acquiring a characteristic value string of a source video according to a preset video fingerprint algorithm; acquiring a video identifier corresponding to a source video;

and the storage module is used for storing the corresponding relation between the characteristic value string of the source video and the video identifier in a fingerprint database.
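The storage module can be illustrated with a small table that maps each video identifier to its characteristic value string. The sketch below uses Python's standard `sqlite3` module purely as a stand-in; the embodiment does not specify a storage engine, and the table name, column names and function names are all hypothetical.

```python
import sqlite3
from typing import Iterator, Tuple


def open_fingerprint_db(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) a fingerprint database with one row per source video."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS fingerprints ("
        "video_id TEXT PRIMARY KEY, feature_string TEXT NOT NULL)"
    )
    return conn


def store_fingerprint(conn: sqlite3.Connection, video_id: str, feature_string: str) -> None:
    """Store the correspondence between a video identifier and its characteristic value string."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO fingerprints VALUES (?, ?)",
            (video_id, feature_string),
        )


def iter_fingerprints(conn: sqlite3.Connection) -> Iterator[Tuple[str, str]]:
    """Yield (video_id, feature_string) pairs one at a time, as the identification flow reads them."""
    yield from conn.execute(
        "SELECT video_id, feature_string FROM fingerprints ORDER BY video_id"
    )


conn = open_fingerprint_db()
store_fingerprint(conn, "v1", "0101")
store_fingerprint(conn, "v2", "1100")
print(list(iter_fingerprints(conn)))  # [('v1', '0101'), ('v2', '1100')]
```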

The video identification device provided by the embodiment of the invention can execute the video identification method provided by any embodiment of the invention, and has corresponding functional modules and beneficial effects of the execution method.

EXAMPLE IV

Fig. 6 is a schematic structural diagram of an apparatus according to a fourth embodiment of the present invention, as shown in fig. 6, the apparatus includes a processor 610, a memory 620, an input device 630, and an output device 640; the number of processors 610 in the device may be one or more, and one processor 610 is taken as an example in fig. 6; the processor 610, the memory 620, the input device 630 and the output device 640 in the apparatus may be connected by a bus or other means, and fig. 6 illustrates an example of a connection by a bus.

The memory 620, as a computer-readable storage medium, may be used for storing software programs, computer-executable programs, and modules, such as program instructions/modules corresponding to the video recognition method in the embodiment of the present invention (for example, the decoding module 51, the determining module 52, and the recognition module 53 in the video recognition apparatus). The processor 610 executes various functional applications of the device and data processing by executing software programs, instructions and modules stored in the memory 620, that is, implements the video recognition method described above.

The memory 620 may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function; the storage data area may store data created according to the use of the terminal, and the like. Further, the memory 620 may include high speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid state storage device. In some examples, the memory 620 can further include memory located remotely from the processor 610, which can be connected to the device over a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.

The input device 630 may be used to receive input numeric or character information and to generate key signal inputs related to user settings and function control of the device. The output device 640 may include a display device such as a display screen.

EXAMPLE V

An embodiment of the present invention further provides a storage medium containing computer-executable instructions, which when executed by a computer processor, perform a video recognition method, the method including:

Of course, the storage medium provided by the embodiment of the present invention contains computer-executable instructions, and the computer-executable instructions are not limited to the operations of the method described above, and may also perform related operations in the video identification method provided by any embodiment of the present invention.

From the above description of the embodiments, it is obvious for those skilled in the art that the present invention can be implemented by software and necessary general hardware, and certainly, can also be implemented by hardware, but the former is a better embodiment in many cases. Based on such understanding, the technical solutions of the present invention may be embodied in the form of a software product, which can be stored in a computer-readable storage medium, such as a floppy disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a FLASH Memory (FLASH), a hard disk or an optical disk of a computer, and includes several instructions for enabling a computer device (which may be a personal computer, a server, or a network device) to execute the methods according to the embodiments of the present invention.

It should be noted that, in the embodiment of the above video identification apparatus, the included units and modules are divided merely according to functional logic, and the division is not limited to the above as long as the corresponding functions can be implemented; in addition, the specific names of the functional units are only for convenience of distinguishing them from each other, and are not used for limiting the protection scope of the present invention.

It is to be noted that the foregoing is only illustrative of the preferred embodiments of the present invention and the technical principles employed. It will be understood by those skilled in the art that the present invention is not limited to the particular embodiments described herein, but is capable of various obvious changes, rearrangements and substitutions as will now become apparent to those skilled in the art without departing from the scope of the invention. Therefore, although the present invention has been described in greater detail by the above embodiments, the present invention is not limited to the above embodiments, and may include other equivalent embodiments without departing from the spirit of the present invention, and the scope of the present invention is determined by the scope of the appended claims.

Claims

1. A video recognition method, comprising:

2. The method of claim 1, wherein the determining the string of feature values for the video to be identified based on the sequence of video frames comprises:

dividing the video frame sequence into a plurality of groups of frame sequences of equal length in time order;

for each group of frame sequences, dividing each frame of image in the sequence into a plurality of image areas;

for each image area, counting the average value of the brightness signals of all frames in the group of frame sequences;

determining the string of feature values of each group of frame sequences based on the average values of the luminance signals of the plurality of image regions in the group of frame sequences.

3. The method of claim 2, wherein determining the string of feature values for a set of frame sequences based on an average of luminance signals of a plurality of image regions in the set of frame sequences comprises:

for each group of frame sequences, sequentially determining the difference value between the average values of the luminance signals of a first image area and a second image area; wherein the first image area and the second image area are any two adjacent image areas;

determining a string of feature values for the set of frame sequences based on a comparison of each of the differences with a first threshold.

4. The method of claim 3, wherein determining the string of feature values for the set of frame sequences based on the comparison of each of the difference values with the first threshold comprises:

determining a plurality of bit values based on a comparison of each of the difference values to a first threshold;

and ordering the bit values according to the relative position of each image region, and ordering the characteristic value strings of the groups of frame sequences in time order to obtain the characteristic value string of the video to be identified.

5. The method according to claim 1, wherein identifying the video to be identified based on the matching result of the feature value string of the video to be identified and the pre-stored feature value string comprises:

comparing the characteristic value string of the video to be identified with the pre-stored characteristic value string bit by bit, and calculating to obtain a matching rate;

6. The method of claim 1, further comprising:

acquiring a characteristic value string of a source video according to a preset video fingerprint algorithm;

acquiring a video identifier corresponding to a source video;

storing the correspondence between the string of feature values of the source video and the video identifier in a fingerprint database.

7. A video recognition apparatus, comprising:

8. An apparatus, characterized in that the apparatus comprises:

one or more processors;

a memory for storing one or more programs;

the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the video recognition method of any of claims 1-6.

9. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the video recognition method of any one of claims 1 to 6.