WO2020224325A1 - Video fingerprint extraction and video retrieval method, apparatus, terminal and storage medium - Google Patents
- Publication number
- WO2020224325A1 (PCT/CN2020/079014; CN 2020079014 W)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- video
- fingerprint
- area
- black border
- image
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/70—Information retrieval; Database structures therefor; File system structures therefor of video data
- G06F16/73—Querying
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/70—Information retrieval; Database structures therefor; File system structures therefor of video data
- G06F16/78—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
- G06F16/783—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
Definitions
- the present invention relates to the technical field of video processing, in particular to a video fingerprint extraction method, a video retrieval method, a device, a terminal and a storage medium.
- Video fingerprints are unique identifiers extracted from video sequences; they serve as an electronic identification of a video file and as a unique feature vector that can distinguish one video segment from other video segments.
- content-based video fingerprint extraction methods, such as algorithms based on wavelet transform, singular value decomposition, or sparse coding, take too much time to extract video fingerprints, so their real-time performance is poor when applied to video retrieval; moreover, for video files with black borders, traditional video fingerprint extraction algorithms have poor robustness and the video retrieval results are not ideal.
- the main purpose of the present invention is to provide a video fingerprint extraction and video retrieval method, device, terminal and storage medium, which aims to solve the problems of slow video fingerprint extraction, poor robustness to video files with black borders, and poor real-time performance in video retrieval.
- the first aspect of the present invention provides a video fingerprint extraction method, which is applied to a terminal, and the method includes:
- the detecting the non-black border area in the first image includes:
- the position corresponding to the pixel point when the detection is stopped is determined as the non-black border position in the first image, and the area formed by the non-black border position is determined as the non-black border area.
- the calculating the variance of the pixels in the preset target area in the first grayscale image includes:
- the variance of the pixels in the central area is determined as the variance of the pixels in the preset target area in the first grayscale image.
- the calculating the hash fingerprint in the non-black border area in the video segment includes:
- when the gray value of a pixel in the non-black border area is greater than the average value of the pixels in the non-black border area, the value of the pixel is determined to be 1;
- otherwise, the value of the pixel is determined to be 0;
- the hash fingerprint of the video segment is determined according to the hash fingerprints of the plurality of second grayscale images.
- the combination of the values of the pixels in the non-black border area to obtain the hash fingerprint of the second grayscale image includes:
- the determining the hash fingerprint of the video segment according to the hash fingerprints of the plurality of second grayscale images includes:
- each set of grayscale image sequences includes a preset number of second grayscale images with a time sequence
- a second aspect of the present invention provides a video retrieval method, which is applied in a terminal, and the method includes:
- the target video file corresponding to the target video fingerprint in the database to be detected is output.
- a third aspect of the present invention provides a video fingerprint extraction device, which runs in a terminal, and the device includes:
- the first extraction module is used to extract the first image with a preset number of frames from the video file
- the detection module is used to detect the non-black border area in the first image
- a determining module configured to determine the non-black border area as the non-black border area of the video file
- the second extraction module is used to extract a preset number of video clips from the video file
- the first calculation module is configured to calculate the hash fingerprint in the non-black border area in the video clip
- the second calculation module is configured to calculate the video fingerprint of the video file according to the hash fingerprint of the preset number of video clips.
- a fourth aspect of the present invention provides a video retrieval device, which runs in a terminal, and includes:
- the first fingerprint extraction module is configured to extract the first video fingerprint of the specified video file by using the video fingerprint extraction method
- the second fingerprint extraction module is used to extract the second video fingerprint of the video file in the database to be detected by using the video fingerprint extraction method
- a retrieval module configured to retrieve whether there is a target video fingerprint that is the same as the first video fingerprint in the second video fingerprint
- the output module is configured to output the target video file corresponding to the target video fingerprint in the database to be detected when the retrieval module determines that the target video fingerprint exists.
- a fifth aspect of the present invention provides a terminal, the terminal includes a memory and a processor, and the memory stores a video fingerprint extraction program or a video retrieval program that can run on the processor.
- the video fingerprint extraction method is implemented when the video fingerprint extraction program is executed by the processor, and the video retrieval method is implemented when the video retrieval program is executed by the processor.
- the sixth aspect of the present invention provides a computer-readable storage medium that stores a video fingerprint extraction program or a video retrieval program; the video fingerprint extraction program may be executed by one or more processors to realize the video fingerprint extraction method, and the video retrieval program may be executed by one or more processors to realize the video retrieval method.
- a first image with a preset number of frames is first extracted from a video file, and the detected non-black border area in the first image is determined as the non-black border area of the video file; then a preset number of video clips are extracted from the video file, the hash fingerprint in the non-black border area of each video clip is calculated, and the video fingerprint of the video file is calculated according to the hash fingerprints of the preset number of video clips.
- since the non-black border area of the video file is determined first, the influence of the black border on video fingerprint extraction can be eliminated, and because the video fingerprint is calculated within the non-black border area, the extracted fingerprint is robust to black borders. Secondly, a preset number of video clips are selected from the video file; compared with processing the entire file, the video clips greatly reduce the amount of calculation, save the calculation time of the video fingerprint, and improve calculation efficiency. When applied to video retrieval, this effectively shortens retrieval time and meets the real-time requirements of video retrieval.
- FIG. 1 is a schematic flowchart of a video fingerprint extraction method according to a first embodiment of the present invention
- FIG. 2 is a schematic diagram of detecting a non-black border area of a grayscale image according to a preferred embodiment of the present invention
- FIG. 3 is a schematic diagram of the position of subtitles or watermarks in a grayscale image according to a preferred embodiment of the present invention
- FIG. 4 is a schematic flowchart of a video retrieval method according to a second embodiment of the present invention.
- FIG. 5 is a schematic diagram of the structure of a video fingerprint extraction device according to a third embodiment of the present invention.
- FIG. 6 is a schematic diagram of the structure of a video retrieval device according to a fourth embodiment of the present invention.
- FIG. 7 is a schematic diagram of the internal structure of the terminal disclosed in the fifth embodiment of the present invention.
- FIG. 1 it is a flowchart of a video fingerprint extraction method provided by Embodiment 1 of the present invention.
- the video fingerprint extraction method is applied to the terminal and specifically includes the following steps. According to different requirements, the order of the steps in the flowchart can be changed, and some steps can be omitted.
- S11 Extract a first image with a preset number of frames from a video file.
- the video file may include, but is not limited to: music video, short video, TV series, movie, variety show video, animation video, and so on.
- the terminal can randomly extract the first image with a preset number of frames from the video file.
- the extracting the first image with a preset number of frames from the video file includes: acquiring the duration of the video file; and randomly extracting the first image with the preset number of frames within a preset range of the duration.
- the duration of the video file is 1 minute
- the preset range is 30% to 80% of the duration
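As a sketch, the range-restricted random extraction described above can be expressed as choosing frame indices that fall inside the 30%-80% window of the playback duration; the function name and the `fps` parameter are illustrative assumptions, not part of the patent.

```python
import random

def sample_frame_indices(duration_s, fps, num_frames, lo=0.3, hi=0.8, seed=None):
    # Frames earlier than lo*duration or later than hi*duration are excluded,
    # mirroring the preset 30%-80% range described above.
    rng = random.Random(seed)
    first = int(duration_s * lo * fps)
    last = int(duration_s * hi * fps)
    return sorted(rng.sample(range(first, last), num_frames))

# For a 1-minute video at 25 fps, all sampled indices fall in [450, 1200).
indices = sample_frame_indices(60, 25, 10, seed=0)
```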
- the non-black border area of the first image is determined first, and is then determined as the non-black border area of the video file.
- the detecting the non-black border area in the first image includes:
- the position corresponding to the pixel point when the detection is stopped is determined as the non-black border position in the first image, and the area formed by the non-black border position is determined as the non-black border area.
- the black area in the video can only appear in the areas of the top, bottom, left, and right parts of the video. Therefore, the areas of the top, bottom, left, and right parts can be pre-designated as the target area.
- the pre-designated areas of the upper, lower, left, and right parts have the same width, which are all r pixels, and r is a preset value. Afterwards, only the non-black border area in the target area needs to be detected to determine the non-black border area in the first grayscale image.
- the oblique shadow area is preset as the target area in the first grayscale image
- the black area is the central area in the target area.
- 10 frames of first images are randomly extracted from a video file, and these 10 frames of first images are converted into 10 frames of first grayscale images.
- the variances are sorted from largest to smallest, and the first C (for example, the first 4) largest variances are selected
- the first grayscale images corresponding to the first C variances are determined as the target grayscale images.
- in the coordinate system shown in FIG. 2, the upper left corner of the target grayscale image is assumed to be the origin, the horizontal rightward direction is the positive y axis, and the vertical downward direction is the positive x axis.
- starting from position (0, 0) in the coordinate system, the first pixel in each of the C target grayscale images is traversed (for example, the first pixel of the first target grayscale image is 1, the first pixel of the second target grayscale image is 0, the first pixel of the third target grayscale image is 1, and the first pixel of the fourth target grayscale image is 2), and the relative mean (1) and relative variance (0.5) of the pixel corresponding to position (0, 0) are calculated.
- the preset stop detection condition may include: the ratio of the relative variance of the pixel points in the path direction to the total variance is greater than a preset threshold (between 0 and 100%); or the relative mean of the pixel points in the path direction is greater than a preset first value; or the relative variance of the pixel points in the path direction is greater than a preset second value.
- the position corresponding to the pixel when the detection is stopped is determined as the non-black border position in the first image, and the area formed by the non-black border position is the non-black border area, such as the gray dot area shown in FIG. 2 .
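The edge scan can be sketched as follows for a single grayscale frame. This is a deliberate simplification: the patent's stop condition uses the relative mean/variance of pixels across several target frames, whereas here a plain brightness threshold (illustrative value) stops the scan on one frame.

```python
import numpy as np

def non_black_border_box(gray, thresh=16):
    # Scan inward from each edge and stop at the first row/column whose
    # mean brightness exceeds `thresh`; the box that remains is the
    # non-black border area.
    h, w = gray.shape
    top = 0
    while top < h and gray[top].mean() <= thresh:
        top += 1
    bottom = h
    while bottom > top and gray[bottom - 1].mean() <= thresh:
        bottom -= 1
    left = 0
    while left < w and gray[:, left].mean() <= thresh:
        left += 1
    right = w
    while right > left and gray[:, right - 1].mean() <= thresh:
        right -= 1
    return top, bottom, left, right

# An 8x10 frame with 2-pixel letterbox bars at the top and bottom.
frame = np.zeros((8, 10), dtype=np.uint8)
frame[2:6, :] = 128
```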
- a video file may contain night scene pictures, which makes the contrast between the black border area and the night scene in the non-black border area insignificant. The variance reflects the size of the high-frequency part of the image: if the image contrast is small, the variance is small, and if the image contrast is large, the variance is large.
- by calculating the variance of the pixels in the target area of the first grayscale image, it can be determined whether the target area contains a black border area: if the calculated variance is large, the target area in the first grayscale image must contain a black border area; if the calculated variance is small, the target area in the first grayscale image may not contain a black border area.
- the target grayscale image with the largest variance is selected from the first grayscale image of the preset number of frames.
- the black border area and the non-black border area in the target grayscale image have a very obvious contrast, so the detection of the black border area is more accurate.
- since the number of pixels in the target area is much smaller than the number of pixels in the first grayscale image, calculating the variance only within the target area saves time compared with calculating the variance of the entire first grayscale image, and helps improve the extraction efficiency of video fingerprints.
- calculating the relative mean and relative variance of each pixel in the preset target area across the C target grayscale images reflects the brightness changes of the pixels at different times.
- the calculating the variance of the pixels in the preset target area in the first grayscale image includes: obtaining the pixels in the central area of the preset target area; calculating the variance of the pixels in the central area; and determining the variance of the pixels in the central area as the variance of the pixels in the preset target area in the first grayscale image.
- the calculating the average value of the pixels in the preset target area in the target grayscale image includes: obtaining the pixels of the central area in the preset target area; calculating the average value of the pixels in the central area; and determining the average value of the pixels in the central area as the average value of the pixels in the preset target area in the target grayscale image.
- the central area refers to the exact central area of the preset target area, and the area of the central area is one half of the area of the preset target area. It can be seen that calculating the variance and average of the pixels in the target area becomes calculating the variance and average of the pixels in the central area. As the number of pixels in the central area is further reduced, the calculation efficiency can be further improved.
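The central-area shortcut can be sketched as below. The patent states the central area covers half the target area; taking the middle half of the rows of a horizontal border strip (full width) is one way to realize that, and is an assumption about the exact crop.

```python
import numpy as np

def central_variance(strip):
    # Keep the middle half of the rows (half the strip's area for a
    # horizontal border strip) and compute the pixel variance there,
    # halving the number of pixels involved in the calculation.
    h = strip.shape[0]
    core = strip[h // 4 : h - h // 4, :]
    return float(core.var())

# A strip whose middle rows are uniform has zero central variance even
# though the variance of the whole strip is not zero.
strip = np.array([[0, 0], [10, 10], [10, 10], [0, 0]], dtype=float)
```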
- the position and size of the black border area in each frame of the video file are basically fixed.
- the position and size of the non-black border area in each frame of a video file are basically fixed; it will not happen that the non-black border area is larger in one frame and smaller in another. Therefore, the non-black border area of the video file can be determined according to the non-black border areas appearing in the first images of the preset number of frames. That is, the position and size of the non-black border area in those first images may be determined as the position and size of the non-black border area of the video file.
- a preset number of video clips are extracted from the video file.
- a preset number of video clips can be randomly extracted from the video file. Time nodes can also be preset, for example, 4 time nodes at 20%, 40%, 60%, and 80% of the video playback duration, and a video clip of preset duration is extracted in the vicinity of each preset time node.
- the duration of the video clip is preset, for example, 10 seconds.
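The time-node clip extraction can be sketched as computing a fixed-length window around each node. Centring the window on the node, and the exact set of node fractions, are assumptions; the patent only lists example nodes and says clips are taken in their vicinity.

```python
def clip_windows(duration_s, nodes=(0.2, 0.4, 0.6, 0.8), clip_s=10.0):
    # For each node (a fraction of the playback duration), return the
    # (start, end) times in seconds of a clip_s-long window around it,
    # clamped to the file's duration.
    windows = []
    for f in nodes:
        centre = duration_s * f
        start = max(0.0, centre - clip_s / 2)
        end = min(duration_s, start + clip_s)
        windows.append((start, end))
    return windows

# A 100-second file yields 10-second clips around 20 s, 40 s, 60 s, 80 s.
windows = clip_windows(100)
```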
- the calculating the hash fingerprint in the non-black border area in the video segment includes:
- when the gray value of a pixel in the non-black border area is greater than the average value of the pixels in the non-black border area, the value of the pixel is determined to be 1;
- otherwise, the value of the pixel is determined to be 0;
- the hash fingerprint of the video segment is determined according to the hash fingerprints of the plurality of second grayscale images.
- the video clip is resampled at a preset fixed frame rate (frames per second, FPS), which copes with changes of frame rate, so that the subsequently extracted video fingerprint is robust to frame rate changes.
- a 10-second video clip can be resampled to obtain 260 frames of images.
- the grayscale image is 6*4
- the calculated hash fingerprint of the grayscale image is 24 bits
- the finally obtained hash fingerprint of the video segment is 260 × 24 bits.
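The 6×4, 24-bit per-frame fingerprint can be sketched as an average-hash: shrink the frame to 6×4 by block averaging, then set each bit from the 1/0 assignment described above. Interpreting that assignment as thresholding against the mean (i.e. an average hash) is an assumption about the exact scheme.

```python
import numpy as np

def ahash_6x4(gray):
    # Block-average the frame down to 4 rows x 6 columns (requires the
    # height divisible by 4 and width by 6), then emit 1 where a cell
    # exceeds the overall mean and 0 otherwise: a 24-bit fingerprint.
    h, w = gray.shape
    small = gray.reshape(4, h // 4, 6, w // 6).mean(axis=(1, 3))
    bits = (small > small.mean()).astype(np.uint8).ravel()
    return int("".join(map(str, bits)), 2)

# A frame whose left half is bright produces the pattern 111000 per row.
frame = np.zeros((8, 12))
frame[:, :6] = 255
fp = ahash_6x4(frame)
```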
- the combination of the values of the pixels in the non-black border area to obtain the hash fingerprint of the second grayscale image includes:
- pixels at positions where subtitles or watermarks may appear may be removed in advance.
- the diagonally shaded area represents the area where subtitles or watermarks appear. Since the value of the pixel at the position where the subtitle or the watermark may exist is removed, the interference of the subtitle or the watermark on the video fingerprint can be effectively avoided, thereby enhancing the characterization ability of the extracted video fingerprint.
- the determining the hash fingerprint of the video segment according to the hash fingerprints of the plurality of second gray-scale images includes:
- each set of grayscale image sequences includes a preset number of second grayscale images with a time sequence
- the similarity of the second gray-scale images of two adjacent frames can be compared by calculating the Hamming distance of the second gray-scale images of two adjacent frames.
- the larger the Hamming distance, the more dissimilar the second grayscale images of two adjacent frames; conversely, the smaller the Hamming distance, the more similar the second grayscale images of two adjacent frames.
- if the Hamming distance is 0, the second grayscale images of two adjacent frames are completely the same; it is generally believed that when the Hamming distance is greater than 10, the two grayscale images are completely different.
- the Hamming distances of the second grayscale images of every two adjacent frames in a certain group of grayscale image sequences are summed to obtain the sum of the Hamming distances of the group of grayscale image sequences.
- the hash fingerprints of the grayscale images in the grayscale image sequence whose content or contrast changes most drastically are selected as the hash fingerprint of the video segment; these most effectively represent the content of the video segment and have stronger characterization ability.
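The selection rule above can be sketched as summing adjacent-frame Hamming distances per sequence and keeping the sequence with the largest sum; representing each frame's hash fingerprint as an integer is an implementation convenience.

```python
def hamming(a, b):
    # Hamming distance between two equal-length bit fingerprints
    # held as integers: count the set bits of their XOR.
    return bin(a ^ b).count("1")

def most_dynamic_sequence(sequences):
    # Each sequence is a list of per-frame hash fingerprints; the one
    # with the largest summed adjacent-frame Hamming distance is the
    # sequence whose content changes most drastically.
    def change(seq):
        return sum(hamming(x, y) for x, y in zip(seq, seq[1:]))
    return max(sequences, key=change)

static = [0b1010, 0b1010, 0b1010]   # no change between frames
dynamic = [0b1010, 0b0101, 0b1111]  # large frame-to-frame changes
```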
- a sliding window of a preset length may be selected and slid over the video clip, thereby obtaining multiple groups of video clip sequences. Each group of video clip sequences is then resampled at the preset frame rate to obtain multiple groups of grayscale image sequences.
- the present invention does not impose any specific restrictions on this; any method that calculates hash fingerprints based on the pixels in the non-black border area of the grayscale images in a video clip, calculates Hamming distances based on the hash fingerprints of two adjacent grayscale frames, and determines the hash fingerprint of the video segment according to the sum of the Hamming distances should fall within the scope of the present invention.
- S16 Calculate the video fingerprint of the video file according to the hash fingerprint of the preset number of video clips.
- the hash fingerprints of a preset number of video segments can be combined to obtain a hash fingerprint matrix or a hash fingerprint vector, and the hash fingerprint matrix or vector is used as the video fingerprint of the final video file.
- the video fingerprint extraction method first extracts a first image with a preset number of frames from a video file, determines the detected non-black border area in the first image as the non-black border area of the video file, then extracts a preset number of video clips from the video file, calculates the hash fingerprints in the non-black border area of each video clip, and finally calculates the video fingerprint of the video file according to the hash fingerprints of the preset number of video clips.
- since the non-black border area of the video file is determined first, the influence of the black border on video fingerprint extraction can be eliminated, and because the video fingerprint is calculated within the non-black border area, the extracted fingerprint is robust to black borders. Secondly, a preset number of video clips are selected from the video file; compared with processing the entire file, the video clips greatly reduce the amount of calculation, save the calculation time of the video fingerprint, and improve calculation efficiency. When applied to video retrieval, this effectively shortens retrieval time and meets the real-time requirements of video retrieval.
- the influence of the subtitle or watermark on the video fingerprint is effectively reduced, and the robustness of the extracted video fingerprint to the subtitle or watermark is further improved.
- FIG. 4 it is a flowchart of a video retrieval method provided by Embodiment 2 of the present invention.
- the video retrieval method is applied to the terminal and specifically includes the following steps. According to different requirements, the order of the steps in the flowchart can be changed, and some steps can be omitted.
- the specified video file may be an uploaded video file or a video file to be queried.
- the extracted video fingerprint of the specified video file is called the first video fingerprint.
- S42 Use the video fingerprint extraction method to extract a second video fingerprint of a video file in a database to be detected.
- the database to be detected may be a video copyright database, or a video warehouse on the Internet.
- the extracted video fingerprint of the video file in the database to be detected is called the second video fingerprint.
- S43 Search whether there is a target video fingerprint that is the same as the first video fingerprint in the second video fingerprint.
- each of the second video fingerprints is compared with the first video fingerprint. If a certain second video fingerprint is the same as the first video fingerprint, a target video fingerprint identical to the first video fingerprint exists among the second video fingerprints. If every second video fingerprint is different from the first video fingerprint, no such target video fingerprint exists.
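The comparison step can be sketched as an exact-match scan over the database; holding the database as a name-to-fingerprint mapping and matching on strict equality are illustrative choices (real systems often also accept near matches within a Hamming-distance threshold).

```python
def retrieve(first_fp, second_fps):
    # Compare the query fingerprint (first_fp) against every second
    # video fingerprint in the database and return the keys of the
    # matching target video files (empty list if no target exists).
    return [name for name, fp in second_fps.items() if fp == first_fp]

# Hypothetical database of fingerprints for two video files.
db = {"video_a": 0xA1B2C3, "video_b": 0x123456}
```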
- the target video file corresponding to the target video fingerprint can be obtained, and the target video file can be output.
- the video fingerprint extraction method may be used to extract the first video fingerprint of each video in the video copyright database in advance.
- the second video fingerprint of the uploaded video is extracted using the video fingerprint extraction method.
- if the first video fingerprints in the video copyright database contain the second video fingerprint, that is, the target video corresponding to the uploaded video is retrieved from the video copyright database, it is determined that the uploaded video has a copyright conflict.
- when the file supervision department needs to monitor illegal videos on the Internet, it can extract the first video fingerprint of each video in the video warehouse in advance by using the video fingerprint extraction method, and then use the video fingerprint extraction method to extract the second video fingerprint of the designated illegal video.
- if the first video fingerprints in the video warehouse contain the second video fingerprint, that is, the target video corresponding to the designated illegal video is retrieved from the video warehouse, it is determined that an illegal video exists on the Internet.
- the video retrieval method uses the video fingerprint extraction method to extract the first video fingerprint of the specified video file and the second video fingerprint of the video file in the database to be detected,
- the second video fingerprint is compared with the first video fingerprint to retrieve whether there is a target video fingerprint that is the same as the first video fingerprint in the second video fingerprint, and when it is determined that the target video fingerprint exists When, output the target video file corresponding to the target video fingerprint.
- due to the described video fingerprint extraction method, the extracted video fingerprints are highly robust to black borders and watermarks and have strong characterization capabilities, so the target video file can be found quickly and effectively during retrieval; secondly, because the video fingerprint extraction method has a short extraction time and high extraction efficiency, the retrieval time of video files can be effectively shortened and the retrieval efficiency improved, meeting the real-time requirements of video file retrieval and offering high practical and economic value.
- FIG. 5 is a schematic diagram of functional modules of a video fingerprint extraction device disclosed in an embodiment of the present invention.
- the video fingerprint extraction device 50 runs in a terminal.
- the video fingerprint extraction device 50 may include multiple functional modules composed of program code segments.
- the program code of each program segment in the video fingerprint extraction device 50 can be stored in the memory of the terminal and executed by at least one processor to perform video fingerprint extraction that is robust to black borders and watermarks (see FIG. 1 for details).
- the video fingerprint extraction device 50 can be divided into multiple functional modules according to the functions it performs.
- the functional modules may include: a first extraction module 501, a detection module 502, a determination module 503, a second extraction module 504, a first calculation module 505, and a second calculation module 506.
- the module referred to in the present invention refers to a series of computer program segments that can be executed by at least one processor and can complete fixed functions, and are stored in a memory. In this embodiment, the function of each module will be described in detail in subsequent embodiments.
- the first extraction module 501 is configured to extract a first image with a preset number of frames from a video file.
- the video file may include, but is not limited to: music video, short video, TV series, movie, variety show video, animation video, and so on.
- the terminal can randomly extract the first image with a preset number of frames from the video file.
- the first extraction module 501 extracting the first image with a preset number of frames from the video file includes: acquiring the duration of the video file; and randomly extracting the first image with the preset number of frames within a preset range of the duration.
- the duration of the video file is 1 minute
- the preset range is 30% to 80% of the duration
- the detection module 502 is configured to detect the non-black border area in the first image.
- the non-black border area of the first image is determined first, and is then determined as the non-black border area of the video file.
- the detection module 502 detecting the non-black border area in the first image includes:
- the position corresponding to the pixel point when the detection is stopped is determined as the non-black border position in the first image, and the area formed by the non-black border position is determined as the non-black border area.
- the black area in the video can only appear in the areas of the top, bottom, left, and right parts of the video. Therefore, the areas of the top, bottom, left, and right parts can be pre-designated as the target area.
- the pre-designated areas of the upper, lower, left, and right parts have the same width, which are all r pixels, and r is a preset value. Afterwards, only the non-black border area in the target area needs to be detected to determine the non-black border area in the first grayscale image.
- the oblique shadow area is preset as the target area in the first grayscale image
- the black area is the central area in the target area.
- 10 frames of first images are randomly extracted from a video file, and these 10 frames of first images are converted into 10 frames of first grayscale images.
- the variances are sorted from largest to smallest, and the first C (for example, the first 4) largest variances are selected
- the first grayscale images corresponding to the first C variances are determined as the target grayscale images.
- in the coordinate system shown in FIG. 2, the upper left corner of the target grayscale image is assumed to be the origin, the horizontal rightward direction is the positive y axis, and the vertical downward direction is the positive x axis.
- starting from position (0, 0) in the coordinate system, the first pixel in each of the C target grayscale images is traversed (for example, the first pixel of the first target grayscale image is 1, the first pixel of the second target grayscale image is 0, the first pixel of the third target grayscale image is 1, and the first pixel of the fourth target grayscale image is 2), and the relative mean (1) and relative variance (0.5) of the pixel corresponding to position (0, 0) are calculated.
- the preset stop detection condition may include: the ratio of the relative variance of the pixel points in the path direction to the total variance is greater than a preset threshold (between 0 and 100%); or the relative mean of the pixel points in the path direction is greater than a preset first value; or the relative variance of the pixel points in the path direction is greater than a preset second value.
- the position corresponding to the pixel at which detection stopped is determined as a non-black border position in the first image, and the area formed by the non-black border positions is the non-black border area, such as the gray dot area shown in FIG. 2.
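As an illustration of the border-detection idea above, the following is a minimal single-frame sketch (not the patent's multi-frame relative mean/variance scheme): it scans rows and columns inward from each edge of a grayscale frame and stops once the mean brightness exceeds a threshold. The function name, the 2-D NumPy array representation, and the threshold of 16 are illustrative assumptions.

```python
import numpy as np

def detect_non_black_border(gray, threshold=16):
    """Scan rows from the top/bottom and columns from the left/right
    inward, stopping at the first row/column whose mean brightness
    exceeds the threshold; the remaining rectangle is the non-black
    border area (top, bottom, left, right bounds)."""
    h, w = gray.shape
    top = 0
    while top < h // 2 and gray[top, :].mean() <= threshold:
        top += 1
    left = 0
    while left < w // 2 and gray[:, left].mean() <= threshold:
        left += 1
    bottom = h
    while bottom > h // 2 and gray[bottom - 1, :].mean() <= threshold:
        bottom -= 1
    right = w
    while right > w // 2 and gray[:, right - 1].mean() <= threshold:
        right -= 1
    return top, bottom, left, right

# A 100x100 dark frame with a bright picture area inset by 20-pixel
# top/bottom and 10-pixel left/right black borders.
frame = np.zeros((100, 100), dtype=np.uint8)
frame[20:80, 10:90] = 128
print(detect_non_black_border(frame))  # (20, 80, 10, 90)
```

This one-frame heuristic fails on dark content such as night scenes, which is exactly why the embodiment above accumulates relative means and variances over C frames before deciding where the border ends.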
- a video file may contain night-scene frames, in which the contrast between the black border area and the night-scene content in the non-black border area is insignificant. The variance reflects the amount of high-frequency content in the image: if the image contrast is small, the variance is small; if the image contrast is large, the variance is large.
- by calculating the variance of the pixels in the target area of the first grayscale image, it can be determined whether the target area contains a black border area: if the calculated variance is large, the target area in the first grayscale image must contain a black border area; if the calculated variance is small, the target area in the first grayscale image may not contain a black border area.
- the target grayscale image with the largest variance is selected from the first grayscale images of the preset number of frames.
- the black border area and the non-black border area in the target grayscale image will have very obvious contrast, so the detected black border area is more accurate.
- since the number of pixels in the target area is much smaller than the number of pixels in the whole first grayscale image, calculating the variance only within the target area saves time compared to calculating the variance of the entire first grayscale image, which helps improve the efficiency of video fingerprint extraction.
- calculating the relative mean and relative variance of each pixel in the preset target area across the C target grayscale images reflects the brightness changes of the pixels at different times.
- the calculating of the variance of the pixels in the preset target area in the first grayscale image includes: obtaining the pixels of the central area in the preset target area; calculating the variance of the pixels in the central area; and determining the variance of the pixels in the central area as the variance of the pixels in the preset target area in the first grayscale image.
- the calculating of the average value of the pixels in the preset target area in the target grayscale image includes: obtaining the pixels of the central area in the preset target area; calculating the average value of the pixels in the central area; and determining the average value of the pixels in the central area as the average value of the pixels in the preset target area in the target grayscale image.
- the central area refers to the exact central area of the preset target area, and the area of the central area is one half of the area of the preset target area. Thus, calculating the variance and average of the pixels in the target area becomes calculating the variance and average of the pixels in the central area; since the number of pixels in the central area is further reduced, the calculation efficiency is further improved.
- the oblique shadow area is preset as the target area in the first grayscale image
- the black area is the central area in the target area.
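The variance-as-contrast test above can be sketched as follows. The strip width r, the function name, and the flattened-strip representation are illustrative assumptions, and the central-area refinement described above is omitted for brevity.

```python
import numpy as np

def edge_strip_variance(gray, r=8):
    """Variance of the pixels in the r-pixel-wide edge strips
    (the pre-designated target area around the frame border)."""
    strip = np.concatenate([
        gray[:r, :].ravel(),        # top strip
        gray[-r:, :].ravel(),       # bottom strip
        gray[r:-r, :r].ravel(),     # left strip (corners excluded)
        gray[r:-r, -r:].ravel()])   # right strip (corners excluded)
    return float(strip.var())

# A frame whose thin black border meets bright content has a
# high-contrast (high-variance) edge strip; an all-dark frame does not.
bordered = np.zeros((64, 64), dtype=np.uint8)
bordered[4:60, 4:60] = 200
dark = np.zeros((64, 64), dtype=np.uint8)
assert edge_strip_variance(bordered) > edge_strip_variance(dark)
```

Frames would be ranked by this value and the C largest kept, matching the selection step described above.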
- the target grayscale image with the largest central-area variance (for example, the first grayscale image of the fifth frame) is selected from the 10 first grayscale images.
- two opposite vertices of the target grayscale image are taken, and the pixels along the paths from the two vertex positions toward the center of the target grayscale image are detected one by one.
- the detection is stopped.
- the horizontal right direction is the positive y axis
- the vertical downward direction is the positive x axis
- the first grayscale image has a length of W and a width of H.
- detection proceeds along the paths from the points (H, 0) and (0, W) toward the center position (H/2, W/2); assuming the values of pixels A and B detected along these paths are greater than the mean value of the central area, detection stops.
- the area formed by the intersection of the horizontal and vertical lines corresponding to the positions of pixels A and B (for example, the gray area including the center position in FIG. 2) is taken as the non-black border area in the target grayscale image.
- the non-black border area in the target grayscale image is taken as the non-black border area of the first image.
- the determining module 503 is configured to determine the non-black border area as the non-black border area of the video file.
- the position and size of the black border area in each frame of the video file are basically fixed.
- the position and size of the non-black border area in each frame of a video file are basically fixed; it will not happen that one frame has a larger non-black border area while another frame's is smaller. Therefore, the non-black border area of the video file can be determined according to the non-black border areas appearing in the first images of the preset number of frames. That is, the position and size of the non-black border area in the first images of the preset number of frames may be determined as the position and size of the non-black border area of the video file.
- the second extraction module 504 is configured to extract a preset number of video clips from the video file.
- a preset number of video clips are extracted from the video file.
- a preset number of video clips can be randomly extracted from the video file. It is also possible to preset time nodes, for example, four time nodes at 20%, 40%, 60%, and 80% of the video playback duration, and to extract a video clip of a preset duration in the vicinity of each preset time node.
- the duration of the video clip is preset, for example, 10 seconds.
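A minimal sketch of the time-node clip selection above; the clamping to the video bounds and all names are illustrative assumptions, not taken from the patent.

```python
def clip_windows(video_duration, fractions=(0.2, 0.4, 0.6, 0.8), clip_len=10.0):
    """Start/end times (seconds) of fixed-length clips centred on the
    preset time nodes, clamped so each clip fits inside the video."""
    windows = []
    for f in fractions:
        centre = f * video_duration
        start = max(0.0, min(centre - clip_len / 2, video_duration - clip_len))
        windows.append((start, start + clip_len))
    return windows

print(clip_windows(100.0))
# [(15.0, 25.0), (35.0, 45.0), (55.0, 65.0), (75.0, 85.0)]
```

Working on four 10-second clips instead of the whole file is what yields the computation savings claimed later in this section.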
- the first calculation module 505 is configured to calculate the hash fingerprint in the non-black border area in the video clip.
- the calculation of the hash fingerprint in the non-black border area in the video clip by the first calculation module 505 includes:
- when the value of a pixel in the non-black border area is greater than or equal to the average value, the value of the pixel is determined to be 1;
- when the value of a pixel in the non-black border area is less than the average value, the value of the pixel is determined to be 0;
- the hash fingerprint of the video segment is determined according to the hash fingerprints of the plurality of second grayscale images.
- the video clip is resampled at a preset fixed frame rate (that is, a fixed number of frames per second (FPS)), which copes with frame-rate changes, so that the subsequently extracted video fingerprint is robust to them.
- for example, at 26 FPS, a 10-second video clip can be resampled to obtain 260 frames of images.
- the grayscale image is 6×4
- the calculated hash fingerprint of each grayscale image is therefore 24 bits
- the finally obtained hash fingerprint of the video segment is 260×24 bits
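The per-frame hash described above is essentially an average hash over the non-black border area. Below is a minimal sketch assuming a 4×6 grid of block means (giving the 24 bits per frame mentioned above); the crop-box format, function name, and downsampling scheme are illustrative assumptions.

```python
import numpy as np

def frame_hash(gray, box):
    """Average hash over the non-black border area: crop to the box,
    downsample to a 4x6 grid of block means, threshold each cell
    against the overall mean, and return the 24 resulting bits."""
    top, bottom, left, right = box
    area = gray[top:bottom, left:right].astype(np.float64)
    h, w = area.shape
    # Crude block-mean downsampling to 4 rows x 6 columns.
    cells = area[: h // 4 * 4, : w // 6 * 6]
    cells = cells.reshape(4, h // 4, 6, w // 6).mean(axis=(1, 3))
    return (cells >= cells.mean()).astype(np.uint8).ravel()
```

Applying this to each of the 260 resampled frames of a clip yields the 260×24-bit clip fingerprint described above.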
- the combination of the values of the pixels in the non-black border area to obtain the hash fingerprint of the second grayscale image includes:
- pixels at positions where subtitles or watermarks may appear may be removed in advance.
- the diagonally shaded area represents the area where subtitles or watermarks may appear. Since the values of the pixels at positions where subtitles or watermarks may exist are removed, the interference of subtitles or watermarks with the video fingerprint can be effectively avoided, thereby enhancing the characterization ability of the extracted video fingerprint.
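Removing the subtitle/watermark positions before concatenating the bits can be sketched with a boolean mask; the grid and mask shapes and the names are illustrative assumptions.

```python
import numpy as np

def masked_hash_bits(bits_grid, watermark_mask):
    """Drop the bits at positions where subtitles/watermarks may
    appear, then return the remaining bits as the fingerprint."""
    return bits_grid[~watermark_mask]

grid = np.array([[1, 0, 1],
                 [0, 1, 0]], dtype=np.uint8)
# Suppose the right column is a preset watermark strip to be removed.
mask = np.array([[False, False, True],
                 [False, False, True]])
print(masked_hash_bits(grid, mask))  # [1 0 0 1]
```

Because the same positions are removed for every frame and every video, the shortened fingerprints remain directly comparable.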
- the determining the hash fingerprint of the video segment according to the hash fingerprints of the plurality of second gray-scale images includes:
- each set of grayscale image sequences includes a preset number of second grayscale images with a time sequence
- the similarity of the second gray-scale images of two adjacent frames can be compared by calculating the Hamming distance of the second gray-scale images of two adjacent frames.
- the larger the Hamming distance, the more dissimilar the second grayscale images of the two adjacent frames; conversely, the smaller the Hamming distance, the more similar the second grayscale images of the two adjacent frames.
- if the Hamming distance is 0, the second grayscale images of the two adjacent frames are identical. It is generally believed that when the Hamming distance is greater than 10, the two grayscale images are completely different images.
- the Hamming distances of the second grayscale images of every two adjacent frames in a certain group of grayscale image sequences are summed to obtain the sum of the Hamming distances of the group of grayscale image sequences.
- the hash fingerprints of the grayscale images in the grayscale image sequence whose content changes most drastically (or whose contrast changes most sharply) are selected as the hash fingerprint of the video segment; they most effectively represent the content of the video segment and have stronger characterization ability.
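Selecting the most dynamic grayscale image sequence by summed adjacent-frame Hamming distance can be sketched as follows; representing frame hashes as lists of bits and the function names are illustrative assumptions.

```python
def hamming(a, b):
    """Hamming distance between two equal-length bit sequences."""
    return sum(x != y for x, y in zip(a, b))

def pick_most_dynamic(groups):
    """Return the group of frame hashes whose summed adjacent-frame
    Hamming distance is largest (the most drastic content change)."""
    return max(groups,
               key=lambda g: sum(hamming(g[i], g[i + 1])
                                 for i in range(len(g) - 1)))

static = [[0, 0, 0, 0]] * 3                            # sum of distances: 0
dynamic = [[0, 0, 0, 0], [1, 1, 0, 0], [1, 1, 1, 1]]   # sum: 2 + 2 = 4
print(pick_most_dynamic([static, dynamic]) is dynamic)  # True
```

The winning group's frame hashes would then serve as the clip's fingerprint, as the embodiment describes.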
- a sliding window of a preset length may slide over the video clip to obtain multiple groups of video clip sequences; each group of video clip sequences is then resampled at the preset frame rate to obtain multiple groups of grayscale image sequences.
- the present invention does not specifically restrict this: any scheme that calculates hash fingerprints based on the pixels in the non-black border area of the grayscale images in a video clip, calculates Hamming distances based on the hash fingerprints of two adjacent grayscale frames, and determines the hash fingerprint of the video segment according to the sum of the Hamming distances falls within the scope of the present invention.
- the second calculation module 506 is configured to calculate the video fingerprint of the video file according to the hash fingerprint of the preset number of video clips.
- the hash fingerprints of the preset number of video segments can be combined to obtain a hash fingerprint matrix or a hash fingerprint vector, and the hash fingerprint matrix or hash fingerprint vector is used as the video fingerprint of the video file.
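Combining the per-clip hash fingerprints into a matrix or a flattened vector might look like this minimal sketch; the NumPy representation and names are illustrative assumptions.

```python
import numpy as np

def video_fingerprint(clip_hashes):
    """Stack the per-clip hash fingerprints into one matrix and also
    flatten it into a single fingerprint vector."""
    matrix = np.stack(clip_hashes)   # shape: (num_clips, bits_per_clip)
    return matrix, matrix.ravel()    # matrix form and vector form

clips = [np.array([1, 0, 1, 1], dtype=np.uint8),
         np.array([0, 1, 1, 0], dtype=np.uint8)]
matrix, vector = video_fingerprint(clips)
print(matrix.shape, vector.tolist())  # (2, 4) [1, 0, 1, 1, 0, 1, 1, 0]
```

Either form works for the retrieval comparisons described later, as long as the same form is used on both sides.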
- the video fingerprint extraction device first extracts first images of a preset number of frames from a video file and determines the detected non-black border area in the first images as the non-black border area of the video file; it then extracts a preset number of video clips from the video file, calculates the hash fingerprints within the non-black border area of the video clips, and finally calculates the video fingerprint of the video file from the hash fingerprints of the preset number of video clips.
- since the non-black border area of the video file is determined first, the influence of black borders on fingerprint extraction can be eliminated, and because the video fingerprint is calculated within the non-black border area, the extracted video fingerprint is robust to black borders. Second, a preset number of video clips are selected from the video file; compared with processing the whole file, the clips greatly reduce the amount of computation, save fingerprint calculation time, and improve calculation efficiency. When applied to video retrieval, this effectively shortens retrieval time and can meet the real-time requirements of video retrieval.
- the influence of the subtitle or watermark on the video fingerprint is effectively reduced, and the robustness of the extracted video fingerprint to the subtitle or watermark is further improved.
- FIG. 6 is a schematic diagram of functional modules of a video retrieval device disclosed in an embodiment of the present invention.
- the video retrieval device 60 runs in a terminal.
- the video retrieval device 60 may include multiple functional modules composed of program code segments.
- the program code of each program segment in the video retrieval device 60 can be stored in the memory of the terminal and executed by at least one processor to perform fast retrieval of videos with black borders and watermarks (see FIG. 4 for details).
- the video retrieval device 60 can be divided into multiple functional modules according to the functions it performs.
- the functional modules may include: a first fingerprint extraction module 601, a second fingerprint extraction module 602, a retrieval module 603, and an output module 604.
- a module referred to in the present invention is a series of computer program segments that can be executed by at least one processor to complete fixed functions, and that are stored in a memory. In this embodiment, the function of each module will be described in detail in subsequent embodiments.
- the first fingerprint extraction module 601 is configured to extract the first video fingerprint of the specified video file by using the video fingerprint extraction method.
- the specified video file may be an uploaded video file or a video file to be queried.
- the extracted video fingerprint of the specified video file is called the first video fingerprint.
- the second fingerprint extraction module 602 is configured to use the video fingerprint extraction method to extract the second video fingerprint of the video file in the database to be detected.
- the database to be detected may be a video copyright database, or a video warehouse on the Internet.
- the extracted video fingerprint of the video file in the database to be detected is called the second video fingerprint.
- the retrieval module 603 is configured to retrieve whether there is a target video fingerprint that is the same as the first video fingerprint in the second video fingerprint.
- each second video fingerprint is compared with the first video fingerprint. If a certain second video fingerprint is determined to be the same as the first video fingerprint, a target video fingerprint identical to the first video fingerprint exists among the second video fingerprints. If every second video fingerprint is determined to be different from the first video fingerprint, no target video fingerprint identical to the first video fingerprint exists among the second video fingerprints.
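The exact-match comparison step can be sketched as follows, assuming fingerprints are hashable tuples and the database is a dict keyed by file name (both illustrative assumptions).

```python
def retrieve(first_fp, second_fps):
    """Return the names of database entries whose fingerprint equals
    the query fingerprint (exact match, as in the comparison step)."""
    return [name for name, fp in second_fps.items() if fp == first_fp]

db = {"a.mp4": (1, 0, 1), "b.mp4": (0, 1, 1), "c.mp4": (1, 0, 1)}
print(retrieve((1, 0, 1), db))  # ['a.mp4', 'c.mp4']
```

In practice the precomputed second fingerprints would be indexed once, so each incoming query costs only the comparisons, which is what makes the retrieval real-time.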
- the output module 604 is configured to output a target video file corresponding to the target video fingerprint in the database to be detected when the retrieval module 603 determines that the target video fingerprint exists.
- the target video file corresponding to the target video fingerprint can be obtained, and the target video file can be output.
- the video fingerprint extraction method may be used to extract the first video fingerprint of each video in the video copyright database in advance.
- the second video fingerprint of the uploaded video is extracted using the video fingerprint extraction method.
- if the first video fingerprints in the video copyright database contain the second video fingerprint, that is, if the target video corresponding to the uploaded video is retrieved from the video copyright database, it is determined that the uploaded video has a copyright conflict.
- when the file supervision department needs to monitor illegal videos on the Internet, it can use the video fingerprint extraction method to extract the first video fingerprint of each video in the video warehouse in advance, and then use the video fingerprint extraction method to extract the second video fingerprint of the designated illegal video.
- if the first video fingerprints in the video warehouse contain the second video fingerprint, that is, if the target video corresponding to the designated illegal video is retrieved from the video warehouse, it is determined that an illegal video exists on the Internet.
- the video retrieval device uses the video fingerprint extraction method to extract the first video fingerprint of the specified video file and the second video fingerprint of the video file in the database to be detected.
- the second video fingerprints are compared with the first video fingerprint to retrieve whether a target video fingerprint identical to the first video fingerprint exists among them, and when it is determined that the target video fingerprint exists, the target video file corresponding to the target video fingerprint is output. Because of the video fingerprint extraction method described, the extracted video fingerprints are highly robust to black borders and watermarks and have strong characterization ability, so the target video file can be found quickly and effectively during video file retrieval.
- second, with the described video fingerprint extraction method, the extraction time of the video fingerprint is short and the extraction efficiency is high; therefore, the retrieval time of video files can be effectively shortened and the retrieval efficiency improved, meeting the real-time requirements of video file retrieval, with high practical and economic value.
- FIG. 7 is a schematic diagram of the internal structure of a terminal disclosed in an embodiment of the present invention.
- the terminal 7 may be a fixed terminal or a mobile terminal.
- the terminal 7 may include a memory 71, a processor 72, and a bus 73.
- the memory 71 includes at least one type of readable storage medium, and the readable storage medium includes flash memory, hard disk, multimedia card, card-type memory (for example, SD or DX memory, etc.), magnetic memory, magnetic disk, optical disk, etc.
- the memory 71 may be an internal storage unit of the terminal 7 in some embodiments, for example, a hard disk of the terminal 7. In other embodiments, the memory 71 may also be an external storage device of the terminal 7, such as a plug-in hard disk, a smart media card (SMC), a secure digital (SD) card, a flash card, etc. Further, the memory 71 may also include both an internal storage unit and an external storage device of the terminal 7.
- the memory 71 can be used not only to store the application software and various data installed in the terminal 7, such as the code of the video fingerprint extraction device 50 and its modules, or the code of the video retrieval device 60 and its modules, but also to temporarily store data that has been output or will be output.
- the processor 72 may be a central processing unit (CPU), controller, microcontroller, microprocessor, or other data processing chip in some embodiments, and is used to run the program code stored in the memory 71 or to process data.
- the bus 73 may be a peripheral component interconnect standard (PCI) bus or an extended industry standard architecture (EISA) bus, etc.
- the bus can be divided into address bus, data bus, control bus, etc. For ease of presentation, only one thick line is used in FIG. 7, but it does not mean that there is only one bus or one type of bus.
- the terminal 7 may also include a network interface, which may optionally include a wired interface and/or a wireless interface (such as a WI-FI interface or a Bluetooth interface), and is usually used to establish a communication connection between the terminal 7 and other terminals.
- the terminal 7 may further include a user interface.
- the user interface may include a display (Display) and an input unit such as a keyboard (Keyboard).
- the optional user interface may also include a standard wired interface and a wireless interface.
- the display may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (Organic Light-Emitting Diode, organic light emitting diode) touch device, etc.
- the display can also be called a display screen or a display unit as appropriate, and is used to display the messages processed in the terminal 7 and to display a visualized user interface.
- FIG. 7 only shows the terminal 7 with components 71-73.
- the structure shown in FIG. 7 does not constitute a limitation on the terminal 7; it may be a bus-type structure or a star-shaped structure, and the terminal 7 may include fewer or more components than shown, or combine certain components, or have a different component arrangement. Other existing or future electronic products that can be adapted to the present invention should also be included in the protection scope of the present invention and are incorporated here by reference.
- the computer program product includes one or more computer instructions.
- the computer may be a general-purpose computer, a special-purpose computer, a computer network, or other programmable devices.
- the computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another computer-readable storage medium.
- the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center via wired means (such as coaxial cable, optical fiber, or digital subscriber line (DSL)) or wireless means (such as infrared, radio, or microwave).
- the computer-readable storage medium may be any available medium that a computer can access, or a data storage device such as a server or data center integrated with one or more available media.
- the usable medium may be a magnetic medium (for example, a floppy disk, a hard disk, a magnetic tape), an optical medium (for example, a DVD), or a semiconductor medium (for example, a solid state disk (SSD)).
- the disclosed system, device, and method may be implemented in other ways.
- the device embodiments described above are only illustrative.
- the division of the units is only a logical function division; in actual implementation there may be other divisions, for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
- the displayed or discussed mutual coupling or direct coupling or communication connection may be indirect coupling or communication connection through some interfaces, devices or units, and may be in electrical, mechanical or other forms.
- the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, they may be located in one place, or they may be distributed on multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
- each unit in each embodiment of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit.
- the above-mentioned integrated unit can be implemented in the form of hardware or software functional unit.
- the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it can be stored in a computer readable storage medium.
- the technical solution of this application, in essence, or the part contributing to the existing technology, or all or part of the technical solution, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions to make a computer device (which may be a personal computer, a server, a network device, etc.) execute all or part of the steps of the method described in each embodiment of this application.
- the foregoing storage media include: USB flash drives, hard disks, read-only memory (ROM), random access memory (RAM), magnetic disks, optical disks, and other media that can store program code.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Multimedia (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Library & Information Science (AREA)
- Computational Linguistics (AREA)
- Image Analysis (AREA)
- Collating Specific Patterns (AREA)
Abstract
Description
Claims (11)
- A video fingerprint extraction method, applied to a terminal, characterized in that the method comprises: extracting first images of a preset number of frames from a video file; detecting a non-black border area in the first images; determining the non-black border area as the non-black border area of the video file; extracting a preset number of video clips from the video file; calculating hash fingerprints within the non-black border area in the video clips; and calculating a video fingerprint of the video file according to the hash fingerprints of the preset number of video clips.
- The method according to claim 1, wherein the detecting of the non-black border area in the first image comprises: converting the first image into a first grayscale image; calculating the variance of the pixels in a preset target area in the first grayscale image; sorting the variances from largest to smallest and taking the target grayscale images corresponding to the first C variances; calculating the relative mean and relative variance of each pixel in the preset target area according to the pixels at the same positions in the preset target area of the C target grayscale images; traversing the preset target area, detecting the pixels one by one along the path direction from the outermost layer toward the innermost layer of the preset target area; stopping the detection when the relative mean and relative variance of a pixel in the path direction satisfy a preset stop-detection condition; and determining the position corresponding to the pixel at which detection stopped as a non-black border position in the first image, and determining the area formed by the non-black border positions as the non-black border area.
- The method according to claim 2, wherein the calculating of the variance of the pixels in the preset target area in the first grayscale image comprises: obtaining the pixels of a central area in the preset target area, wherein the central area refers to the exact central area of the preset target area, and the area of the central area is one half of the area of the preset target area; calculating the variance of the pixels in the central area; and determining the variance of the pixels in the central area as the variance of the pixels in the preset target area in the first grayscale image.
- The method according to claim 1, wherein the calculating of the hash fingerprint within the non-black border area in the video clip comprises: resampling the video clip at a preset frame rate to obtain multiple frames of second images; converting the second images into second grayscale images; calculating the average value of the pixels within the non-black border area in the second grayscale image; when the value of a pixel in the non-black border area is greater than or equal to the average value, determining the value of the pixel as 1; when the value of a pixel in the non-black border area is less than the average value, determining the value of the pixel as 0; combining the values of the pixels in the non-black border area to obtain the hash fingerprint of the second grayscale image; and determining the hash fingerprint of the video clip according to the hash fingerprints of the multiple second grayscale images.
- The method according to claim 4, wherein the combining of the values of the pixels in the non-black border area to obtain the hash fingerprint of the second grayscale image comprises: removing the values of the pixels at preset target positions in the non-black border area; and combining the values of the pixels in the non-black border area from which the values of the pixels at the preset target positions have been removed, to obtain the hash fingerprint of the second grayscale image.
- The method according to claim 4 or 5, wherein the determining of the hash fingerprint of the video clip according to the hash fingerprints of the multiple second grayscale images comprises: grouping the multiple second grayscale images to obtain multiple groups of grayscale image sequences, wherein each group of grayscale image sequences includes a preset number of time-ordered second grayscale images; calculating the Hamming distance between the hash fingerprints of every two adjacent second grayscale images in each group of grayscale image sequences; calculating the sum of the Hamming distances in each group of grayscale image sequences; determining the grayscale image sequence with the largest sum of Hamming distances as a target grayscale image sequence; and determining the hash fingerprints of the grayscale images in the target grayscale image sequence as the hash fingerprint of the video clip.
- A video retrieval method, applied to a terminal, characterized in that the method comprises: extracting a first video fingerprint of a specified video file using the video fingerprint extraction method according to any one of claims 1 to 6; extracting second video fingerprints of the video files in a database to be detected using the video fingerprint extraction method according to any one of claims 1 to 6; retrieving whether a target video fingerprint identical to the first video fingerprint exists among the second video fingerprints; and when it is determined that the target video fingerprint exists, outputting the target video file corresponding to the target video fingerprint in the database to be detected.
- A video fingerprint extraction apparatus, running in a terminal, characterized in that the apparatus comprises: a first extraction module for extracting first images of a preset number of frames from a video file; a detection module for detecting a non-black border area in the first images; a determination module for determining the non-black border area as the non-black border area of the video file; a second extraction module for extracting a preset number of video clips from the video file; a first calculation module for calculating hash fingerprints within the non-black border area in the video clips; and a second calculation module for calculating the video fingerprint of the video file according to the hash fingerprints of the preset number of video clips.
- A video retrieval apparatus, running in a terminal, characterized in that the apparatus comprises: a first fingerprint extraction module for extracting a first video fingerprint of a specified video file using the video fingerprint extraction method according to any one of claims 1 to 6; a second fingerprint extraction module for extracting second video fingerprints of the video files in a database to be detected using the video fingerprint extraction method according to any one of claims 1 to 6; a retrieval module for retrieving whether a target video fingerprint identical to the first video fingerprint exists among the second video fingerprints; and an output module for outputting, when the retrieval module determines that the target video fingerprint exists, the target video file corresponding to the target video fingerprint in the database to be detected.
- A terminal, characterized in that the terminal comprises a memory and a processor, wherein the memory stores a downloadable video fingerprint extraction program or a downloadable video retrieval program runnable on the processor; when executed by the processor, the downloadable video fingerprint extraction program implements the video fingerprint extraction method according to any one of claims 1 to 6, and when executed by the processor, the downloadable video retrieval program implements the video retrieval method according to claim 7.
- A computer-readable storage medium, characterized in that a downloadable video fingerprint extraction program or a downloadable video retrieval program is stored on the computer-readable storage medium; the downloadable video fingerprint extraction program can be executed by one or more processors to implement the video fingerprint extraction method according to any one of claims 1 to 6, and the downloadable video retrieval program can be executed by one or more processors to implement the video retrieval method according to claim 7.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910377071.6A CN110083740B (zh) | 2019-05-07 | 2019-05-07 | Video fingerprint extraction and video retrieval method and apparatus, terminal, and storage medium |
CN201910377071.6 | 2019-05-07 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2020224325A1 true WO2020224325A1 (zh) | 2020-11-12 |
Family
ID=67419038
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2020/079014 WO2020224325A1 (zh) | 2019-05-07 | 2020-03-12 | Video fingerprint extraction and video retrieval method and apparatus, terminal, and storage medium |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN110083740B (zh) |
WO (1) | WO2020224325A1 (zh) |
Families Citing this family (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
- CN110083740B (zh) * | 2019-05-07 | 2021-04-06 | 深圳市网心科技有限公司 | Video fingerprint extraction and video retrieval method and apparatus, terminal, and storage medium |
- CN110889011B (zh) * | 2019-11-29 | 2022-07-26 | 杭州当虹科技股份有限公司 | Video fingerprinting method |
- CN111126620B (zh) * | 2019-12-10 | 2020-11-03 | 河海大学 | Feature fingerprint generation method for time series and application thereof |
- CN111091118A (zh) * | 2019-12-31 | 2020-05-01 | 北京奇艺世纪科技有限公司 | Image recognition method and apparatus, electronic device, and storage medium |
- CN111507260B (zh) * | 2020-04-17 | 2022-08-05 | 重庆邮电大学 | Fast video similarity detection method and detection apparatus |
- CN112203141A (zh) * | 2020-10-12 | 2021-01-08 | 广州欢网科技有限责任公司 | Video-on-demand content identification method, apparatus, device, and system, and smart TV |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20100306193A1 (en) * | 2009-05-28 | 2010-12-02 | Zeitera, Llc | Multi-media content identification using multi-level content signature correlation and fast similarity search |
US20110085734A1 (en) * | 2009-08-10 | 2011-04-14 | Pixel Forensics, Inc. | Robust video retrieval utilizing video data |
CN105975939A (zh) * | 2016-05-06 | 2016-09-28 | 百度在线网络技术(北京)有限公司 | 视频检测方法和装置 |
CN106484837A (zh) * | 2016-09-30 | 2017-03-08 | 腾讯科技(北京)有限公司 | 相似视频文件的检测方法和装置 |
CN110083740A (zh) * | 2019-05-07 | 2019-08-02 | 深圳市网心科技有限公司 | 视频指纹提取及视频检索方法、装置、终端及存储介质 |
Family Cites Families (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
- CN104915944B (zh) * | 2014-03-14 | 2018-08-07 | 北京风行在线技术有限公司 | Method and device for determining black border position information of a video |
- CN104077590A (zh) * | 2014-06-30 | 2014-10-01 | 安科智慧城市技术(中国)有限公司 | Video fingerprint extraction method and system |
- CN104504162B (zh) * | 2015-01-21 | 2018-12-04 | 北京智富者机器人科技有限公司 | Video retrieval method based on a robot vision platform |
- CN105430382A (zh) * | 2015-12-02 | 2016-03-23 | 厦门雅迅网络股份有限公司 | Method and device for detecting black borders in video images |
- CN106683108A (zh) * | 2016-12-07 | 2017-05-17 | 乐视控股(北京)有限公司 | Method, device, and electronic equipment for determining flat areas in a video frame |
- CN109409208A (zh) * | 2018-09-10 | 2019-03-01 | 东南大学 | Video-based vehicle feature extraction and matching method |
-
2019
- 2019-05-07 CN CN201910377071.6A patent/CN110083740B/zh active Active
-
2020
- 2020-03-12 WO PCT/CN2020/079014 patent/WO2020224325A1/zh active Application Filing
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20100306193A1 (en) * | 2009-05-28 | 2010-12-02 | Zeitera, Llc | Multi-media content identification using multi-level content signature correlation and fast similarity search |
US20110085734A1 (en) * | 2009-08-10 | 2011-04-14 | Pixel Forensics, Inc. | Robust video retrieval utilizing video data |
CN105975939A (zh) * | 2016-05-06 | 2016-09-28 | 百度在线网络技术(北京)有限公司 | 视频检测方法和装置 |
CN106484837A (zh) * | 2016-09-30 | 2017-03-08 | 腾讯科技(北京)有限公司 | 相似视频文件的检测方法和装置 |
CN110083740A (zh) * | 2019-05-07 | 2019-08-02 | 深圳市网心科技有限公司 | 视频指纹提取及视频检索方法、装置、终端及存储介质 |
Also Published As
Publication number | Publication date |
---|---|
CN110083740B (zh) | 2021-04-06 |
CN110083740A (zh) | 2019-08-02 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2020224325A1 (zh) | Video fingerprint extraction and video retrieval method and apparatus, terminal, and storage medium | |
US11132555B2 (en) | Video detection method, server and storage medium | |
US10324977B2 (en) | Searching method and apparatus | |
US11275950B2 (en) | Method and apparatus for segmenting video | |
WO2018028583A1 (zh) | 字幕提取方法及装置、存储介质 | |
US9646358B2 (en) | Methods for scene based video watermarking and devices thereof | |
CN111651636B (zh) | 视频相似片段搜索方法及装置 | |
WO2020211624A1 (zh) | 对象追踪方法、追踪处理方法、相应的装置、电子设备 | |
WO2019061661A1 (zh) | 图像篡改检测方法、电子装置及可读存储介质 | |
US20210406549A1 (en) | Method and apparatus for detecting information insertion region, electronic device, and storage medium | |
WO2022105125A1 (zh) | 图像分割方法、装置、计算机设备及存储介质 | |
JP2015536094A (ja) | ビデオシーン検出 | |
US20160048849A1 (en) | Method and system for clustering and classifying online visual information | |
US20090290752A1 (en) | Method for producing video signatures and identifying video clips | |
WO2021237967A1 (zh) | 一种目标检索方法及装置 | |
CN104618803A (zh) | 信息推送方法、装置、终端及服务器 | |
CN111666907B (zh) | 一种视频中对象信息的识别方法、装置及服务器 | |
WO2014044158A1 (zh) | 一种图像中的目标物体识别方法及装置 | |
Pal et al. | Video segmentation using minimum ratio similarity measurement | |
CN106503112B (zh) | 视频检索方法和装置 | |
CN112291634A (zh) | 视频处理方法及装置 | |
CN110704104A (zh) | 一种应用仿冒检测方法、智能终端及存储介质 | |
CN114758268A (zh) | 手势识别方法、装置及智能设备 | |
CN111177450B (zh) | 一种图像检索云识别方法、系统及计算机可读存储介质 | |
WO2018120575A1 (zh) | 网页主图识别方法和装置 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 20802440 Country of ref document: EP Kind code of ref document: A1 |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 20802440 Country of ref document: EP Kind code of ref document: A1 |
|
32PN | Ep: public notification in the ep bulletin as address of the adressee cannot be established |
Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205 DATED 18/03/2022) |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 20802440 Country of ref document: EP Kind code of ref document: A1 |