CN111008301B - Method for searching video by using graph - Google Patents

Publication number
CN111008301B
Authority
CN
China
Prior art keywords
video
fingerprint
searching
data
fingerprints
Prior art date
Legal status
Active
Application number
CN201911316843.1A
Other languages
Chinese (zh)
Other versions
CN111008301A (en)
Inventor
柴中进
吴伟平
Current Assignee
Xinhua Zhiyun Technology Co ltd
Original Assignee
Xinhua Zhiyun Technology Co ltd
Priority date
2019-12-19
Filing date
2019-12-19
Publication date
2023-08-15
Application filed by Xinhua Zhiyun Technology Co ltd
Priority to CN201911316843.1A
Publication of CN111008301A
Application granted
Publication of CN111008301B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70 Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/71 Indexing; Data structures therefor; Storage structures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70 Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/78 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/783 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • G06F16/7847 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using low-level visual features of the video content
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention relates to the technical field of video searching, and in particular to a method for searching videos by picture. The method comprises a data index creating stage and a video searching stage. The data index creating stage comprises the following steps: reading a video frame picture; calculating the digital fingerprint of the picture frame; dividing the fingerprint into segments of 16 bits each; traversing all the segments in a loop and putting the fingerprint into the index directory corresponding to each segment; and adding the fingerprint data to the index file. The video searching stage comprises the following steps: reading the video screenshot data to be searched; calculating the screenshot fingerprint; obtaining, in a loop, the data indexes under the different fingerprint segments; searching for a fingerprint through the data index; and obtaining the video information and the corresponding frame from the fingerprint found. By segmenting the frame image fingerprint, the method effectively narrows the search range and improves the search speed, and the final result is located quickly through distributed processing on multiple nodes.

Description

Method for searching video by using graph
Technical Field
The invention relates to the technical field of video searching, in particular to a method for searching videos by using pictures.
Background
Techniques for searching videos by picture draw on classical pattern recognition and on deep learning; the principle is to fuse the two so as to reach the best trade-off between precision and speed when searching massive amounts of video. However, current search-by-picture approaches have several drawbacks. First, computation is slow: each calculation takes minutes or even hours, during which the user cannot operate the software and can only wait for it to finish. Second, the multi-core nature of modern CPUs is not fully exploited: however many processing cores the user has, only one is used, so resource utilization is low. Third, computing resources are consumed heavily; deep learning in particular needs dedicated GPU resources to accelerate the learning process. Fourth, scalability is insufficient: pattern recognition and deep learning require massive training data to be prepared in advance, the trained results adapt poorly, and they depend strongly on the choice of samples, so they suit only a limited range of scenes.
Disclosure of Invention
The invention aims to provide a method for searching videos by picture, so as to solve the problems described in the background above.
To achieve this object, the invention provides a method for searching videos by picture, comprising a data index creating stage and a video searching stage, wherein the data index creating stage comprises the following steps:
S1.1, reading video frame pictures;
S1.2, calculating digital fingerprints of the picture frames;
S1.3, dividing the fingerprint into segments of 16 bits each;
S1.4, traversing all the segments in a loop, and putting the fingerprint into the index directory corresponding to each segment;
S1.5, adding the fingerprint data to the index file;
the video searching stage comprises the following steps:
S2.1, reading the video screenshot data to be searched;
S2.2, calculating the screenshot fingerprint;
S2.3, obtaining, in a loop, the data indexes under the different fingerprint segments;
S2.4, searching for a fingerprint through the data index;
S2.5, obtaining the video information and the corresponding frame from the fingerprint found.
Preferably, in S1.1, the method for reading the video frame picture is: restoring the compression-coded video and audio data to uncompressed video, and decoding to obtain uncompressed video color data.
Preferably, in S1.2, the method for calculating the digital fingerprint of the picture frame is: using a perceptual Hash algorithm, the original image is first converted to grayscale and reduced to 8x8 pixels, and the resulting 64 bits of binary data are stored in an array as the 64-bit image fingerprint.
Preferably, in S1.4, the method for putting the fingerprint into the index directory corresponding to the segment is: four directories are created in the file system, one for each of the 4 segments of the fingerprint, with segment numbers 1, 2, 3 and 4; under each segment number, 2^16 = 65536 Hash directories are created, denoted 0-65535; 10 files are created under each Hash directory; and the complete 64-bit Hash value is stored in the files.
Preferably, in S1.4, the method for traversing all the segments in a loop is: the complete 64-bit Hash value is stored in a file through the uniquely determined write path "/segment number/current segment Hash directory/Hash file".
Preferably, in S2.2, the method for calculating the screenshot fingerprint is: a digital fingerprint is obtained using a perceptual Hash algorithm.
Preferably, in S2.3, the method for obtaining, in a loop, the data indexes under the different fingerprint segments is: the fingerprint is divided into 4 segments, and segment indexing is then performed starting from the first segment.
Preferably, in S2.4, the method for searching for a fingerprint through the data index is: all files under the directory "segment number/current segment Hash value/" are read, the entire file content is traversed, the Hamming distance is calculated, and the fingerprint with the minimum Hamming distance is returned.
Preferably, in S2.5, the method for obtaining the video information and the corresponding frame from the fingerprint found is: after the fingerprint is identified, the database is queried to obtain the video and the frame of the video.
Preferably, the segment indexing method is: for each of the 4 segments, fingerprints that share the same value of the current segment are stored under the directory corresponding to that segment number, in the files corresponding to that segment value.
Compared with the prior art, the invention has the following beneficial effects: by segmenting the frame image fingerprint, the method effectively narrows the search range and improves the search speed, and the final result is located quickly through distributed processing on multiple nodes, so that the target frame, the specific video, and the frame number at which the target frame occurs can be found quickly and accurately.
Drawings
FIG. 1 is a schematic diagram of the frame image fingerprint segmentation of the present invention;
FIG. 2 is a flow chart of the multi-node real-time search of the present invention;
FIG. 3 is a schematic diagram of the video frame fingerprint computation and segmentation of the present invention;
FIG. 4 is a diagram of the video frame fingerprint segment storage format of the present invention;
FIG. 5 is a diagram of the search process for searching videos by picture in the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The embodiments described are evidently only some, not all, of the embodiments of the present invention. All other embodiments obtained by those skilled in the art on the basis of these embodiments without inventive effort fall within the scope of protection of the invention.
Referring to FIGS. 1-5, the present invention provides the following technical solution:
The invention provides a method for searching videos by picture, comprising a data index creating stage and a video searching stage, wherein the data index creating stage comprises the following steps:
S1.1, reading video frame pictures;
S1.2, calculating digital fingerprints of the picture frames;
S1.3, dividing the fingerprint into segments of 16 bits each;
S1.4, traversing all the segments in a loop, and putting the fingerprint into the index directory corresponding to each segment;
S1.5, adding the fingerprint data to the index file;
the video searching stage comprises the following steps:
S2.1, reading the video screenshot data to be searched;
S2.2, calculating the screenshot fingerprint;
S2.3, obtaining, in a loop, the data indexes under the different fingerprint segments;
S2.4, searching for a fingerprint through the data index;
S2.5, obtaining the video information and the corresponding frame from the fingerprint found.
In this embodiment, in S1.1, the method for reading the video frame picture is: the compression-coded video and audio data are restored to uncompressed raw video and audio data. Audio compression coding standards include AAC, MP3, AC-3 and the like; video compression coding standards include H.264, MPEG2, VC-1 and the like. Decoding yields uncompressed video color data such as YUV420P or RGB, and uncompressed audio data such as PCM.
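As an illustration of what the decoded data looks like, the following is a minimal sketch that pulls the grayscale (luma) plane out of one raw YUV420P frame. It assumes the frame has already been decoded by some external decoder, that the width and height are even, and that the buffer is planar (Y plane first, then U, then V). The function names are hypothetical and not part of the patent.

```python
def yuv420p_frame_size(width, height):
    """Bytes in one YUV420P frame: a full-resolution Y plane plus
    quarter-resolution U and V planes (assumes even width and height)."""
    return width * height * 3 // 2


def luma_plane(frame, width, height):
    """Return the Y (luma) plane of one planar YUV420P frame as rows of 0-255 values."""
    if len(frame) != yuv420p_frame_size(width, height):
        raise ValueError("frame length does not match width x height for YUV420P")
    y = frame[: width * height]  # the Y plane is stored first
    return [list(y[r * width:(r + 1) * width]) for r in range(height)]


# A tiny 4x2 frame: 8 luma bytes followed by 2 U bytes and 2 V bytes.
demo = bytes(range(8)) + bytes(4)
print(luma_plane(demo, 4, 2))
```

Because the Y plane is already a grayscale image, it could feed the fingerprint calculation of S1.2 directly; for RGB data a separate grayscale conversion would be needed.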
Further, in S1.2, the method for calculating the digital fingerprint of the picture frame is: using a perceptual Hash algorithm, the original image is first converted to grayscale and reduced to 8x8 pixels, and the resulting 64 bits of binary data are stored in an array as the 64-bit image fingerprint.
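The text does not say how each of the 64 bits is derived from the 8x8 image. The sketch below therefore assumes the common "average hash" variant, in which a bit is 1 when the pixel is brighter than the mean of the 64 pixels. The block-averaging downscale, the BT.601 grayscale weights and all function names are likewise illustrative assumptions, not details taken from the patent.

```python
def to_gray(pixels):
    """Convert rows of (R, G, B) tuples to rows of luma values (BT.601 weights)."""
    return [[(299 * r + 587 * g + 114 * b) // 1000 for (r, g, b) in row]
            for row in pixels]


def shrink_to_8x8(gray):
    """Reduce a grayscale image (assumed at least 8x8) to 8x8 by block averaging."""
    h, w = len(gray), len(gray[0])
    small = []
    for i in range(8):
        r0 = i * h // 8
        r1 = max((i + 1) * h // 8, r0 + 1)
        row = []
        for j in range(8):
            c0 = j * w // 8
            c1 = max((j + 1) * w // 8, c0 + 1)
            cell = [gray[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            row.append(sum(cell) / len(cell))
        small.append(row)
    return small


def fingerprint64(pixels):
    """64-bit fingerprint: one bit per 8x8 pixel, set when the pixel exceeds the mean
    (assumed average-hash rule)."""
    flat = [v for row in shrink_to_8x8(to_gray(pixels)) for v in row]
    mean = sum(flat) / 64
    bits = 0
    for v in flat:
        bits = (bits << 1) | (1 if v > mean else 0)
    return bits


# 16x16 test image: left half white, right half black.
image = [[(255, 255, 255)] * 8 + [(0, 0, 0)] * 8 for _ in range(16)]
print(f"{fingerprint64(image):016x}")  # f0f0f0f0f0f0f0f0
```

Small changes such as a watermark alter only a few of the 64 bits under this rule, which is what makes the Hamming distance between fingerprints a usable similarity measure.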
Specifically, in S1.3, the method for dividing the fingerprint into segments of 16 bits each is: the video frame to be searched and the picture used for searching have the same or mostly the same content (ignoring watermarks, station logos and the like), so it is assumed that their fingerprints differ in at most 3 bits (the maximum fault tolerance). The 64-bit fingerprint is therefore divided into 4 segments of 16 bits each, so that at most 3 of the segments can contain a differing bit and at least one segment is identical in both fingerprints.
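The pigeonhole argument behind the 4-segment split can be checked with a short sketch. The segment order (segment 1 is the most significant 16 bits) and the function names are assumptions made for illustration.

```python
def split_segments(fp):
    """Split a 64-bit fingerprint into four 16-bit segments, most significant first."""
    return [(fp >> shift) & 0xFFFF for shift in (48, 32, 16, 0)]


def matching_segments(a, b):
    """Segment numbers (1-4) at which two fingerprints agree exactly."""
    pairs = zip(split_segments(a), split_segments(b))
    return [n for n, (x, y) in enumerate(pairs, start=1) if x == y]


stored = 0x0123456789ABCDEF
# Flip 3 bits, one each in segments 1, 2 and 4: the worst case for 3 differing bits.
query = stored ^ (1 << 60) ^ (1 << 40) ^ (1 << 3)

print([hex(s) for s in split_segments(stored)])
print(matching_segments(stored, query))  # [3]
```

Even when the three differing bits fall in three different segments, the fourth segment still matches exactly, so an exact lookup on each segment in turn is enough to reach every fingerprint within 3 bits of the query.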
In S1.4, the method for putting the fingerprint into the index directory corresponding to the segment is: four directories are created in the file system, one for each of the 4 segments of the fingerprint, with segment numbers 1, 2, 3 and 4; under each segment number, 2^16 = 65536 Hash directories are created, denoted 0-65535; 10 files are created under each Hash directory; and the complete 64-bit Hash value is stored in the files. As an estimate: assuming 800,000 one-hour videos, about 72 billion frames are produced in total, so each Hash directory needs to store 72 billion / 65,536 ≈ 1.1 million fingerprints (rounded up to about 1.2 million in this estimate), and each file stores about 1.2 million / 10 = 120,000 Hash values.
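The estimate can be reproduced with a few lines of arithmetic. The frame rate is not stated in the text; 25 frames per second is assumed here because it reproduces the total of about 72 billion frames. The calculation also assumes that fingerprints spread evenly over the Hash directories, which real video content will only approximate.

```python
VIDEOS = 800_000
SECONDS_PER_VIDEO = 3600          # one-hour videos
FPS = 25                          # assumption: chosen to match the stated 72 billion frames
DIRECTORIES_PER_SEGMENT = 2 ** 16
FILES_PER_DIRECTORY = 10

total_frames = VIDEOS * SECONDS_PER_VIDEO * FPS
per_directory = total_frames / DIRECTORIES_PER_SEGMENT   # fingerprints per Hash directory
per_file = per_directory / FILES_PER_DIRECTORY           # fingerprints per file

print(f"total frames:   {total_frames:,}")
print(f"per directory:  {per_directory:,.0f}")
print(f"per file:       {per_file:,.0f}")
```

The exact quotient is about 1.1 million fingerprints per directory and about 110,000 per file, slightly below the rounded figures of about 1.2 million and 120,000 given in the text.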
In addition, in S1.4, the method for traversing all the segments in a loop is: the complete 64-bit Hash value is stored in a file through the uniquely determined write path "/segment number/current segment Hash directory/Hash file". Specifically, the scheme assumes that the video frame to be searched and the picture used for searching have the same or mostly the same content (ignoring watermarks, station logos and the like), and that their fingerprints differ in at most 3 bits; that is, when the fingerprint is divided into 4 segments, at least one segment is identical. The scheme can therefore search all Hash values under "segment number/current segment Hash directory/Hash file" and output the stored Hash value with the smallest Hamming distance to the query as the nearest similar Hash value.
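A minimal sketch of the write path follows. Several details are assumptions rather than statements from the text: directories are created on demand instead of all 4 x 65536 in advance, the file among the 10 is chosen as the fingerprint modulo 10, each fingerprint is stored as 8 big-endian bytes, and the file names are hypothetical.

```python
import os
import struct
import tempfile

FILES_PER_DIRECTORY = 10


def split_segments(fp):
    """Split a 64-bit fingerprint into four 16-bit segments, most significant first."""
    return [(fp >> shift) & 0xFFFF for shift in (48, 32, 16, 0)]


def index_path(root, segment_number, segment_value, fp):
    """Build "<root>/<segment number>/<segment Hash directory>/<Hash file>"."""
    file_no = fp % FILES_PER_DIRECTORY  # assumed rule for picking one of the 10 files
    return os.path.join(root, str(segment_number), str(segment_value), f"{file_no}.bin")


def add_fingerprint(root, fp):
    """Append the complete 64-bit fingerprint under each of its four segment paths."""
    paths = []
    for segment_number, segment_value in enumerate(split_segments(fp), start=1):
        path = index_path(root, segment_number, segment_value, fp)
        os.makedirs(os.path.dirname(path), exist_ok=True)
        with open(path, "ab") as f:
            f.write(struct.pack(">Q", fp))  # 8 bytes, big-endian unsigned
        paths.append(path)
    return paths


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        for p in add_fingerprint(root, 0x0123456789ABCDEF):
            print(os.path.relpath(p, root))
```

Each fingerprint is written four times, once per segment, which trades disk space for the ability to find it through whichever segment happens to match the query.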
It should be noted that the video searching stage proceeds as follows. A digital fingerprint of the screenshot is obtained using the perceptual Hash algorithm (as in S1.2) and divided into 4 segments (as in S1.3). Segment indexing starts from the first segment: all files under the directory "segment number/current segment Hash value/" are read. Since these files hold all fingerprints whose current segment has the same Hash value, it is only necessary to traverse the entire file content, calculate the Hamming distance to each stored fingerprint, and return the fingerprint with the smallest Hamming distance, recorded as "A1". The remaining three segments are traversed in the same way to find the fingerprints "A2", "A3" and "A4" with the smallest Hamming distance. These four fingerprints are then compared to find the one with the smallest Hamming distance to the query, for example "A2", and the video frame corresponding to "A2" is taken as the frame being sought. Because the number of Hash values to traverse is large, they are deliberately stored across multiple files, so that multiple processing nodes can traverse the files at the same time and the search time is reduced.
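The search procedure can be sketched in memory as follows. To keep the example self-contained, a dictionary stands in for the directory tree, each "file" is a Python list, and a thread pool stands in for the multiple processing nodes; all of these, along with the function names and the modulo-10 file choice, are illustrative assumptions.

```python
from concurrent.futures import ThreadPoolExecutor


def split_segments(fp):
    """Split a 64-bit fingerprint into four 16-bit segments, most significant first."""
    return [(fp >> shift) & 0xFFFF for shift in (48, 32, 16, 0)]


def hamming(a, b):
    """Number of bit positions in which two fingerprints differ."""
    return bin(a ^ b).count("1")


def build_index(fingerprints, files_per_directory=10):
    """Map (segment number, segment value) to a list of 'files' holding full fingerprints."""
    index = {}
    for fp in fingerprints:
        for number, value in enumerate(split_segments(fp), start=1):
            files = index.setdefault((number, value),
                                     [[] for _ in range(files_per_directory)])
            files[fp % files_per_directory].append(fp)
    return index


def nearest_in_file(query, contents):
    """Closest fingerprint in one file, or None if the file is empty."""
    return min(contents, key=lambda fp: hamming(query, fp), default=None)


def search(index, query):
    """Find A1..A4 (best match per segment), then return the closest of them."""
    candidates = []
    with ThreadPoolExecutor() as pool:
        for number, value in enumerate(split_segments(query), start=1):
            files = index.get((number, value), [])
            per_file = pool.map(lambda contents: nearest_in_file(query, contents), files)
            hits = [fp for fp in per_file if fp is not None]
            if hits:
                candidates.append(min(hits, key=lambda fp: hamming(query, fp)))
    return min(candidates, key=lambda fp: hamming(query, fp), default=None)


stored = [0x0123456789ABCDEF, 0xFFFF0000FFFF0000, 0x0F0F0F0F0F0F0F0F]
index = build_index(stored)
print(hex(search(index, stored[0] ^ 0b101)))  # query differs from stored[0] in 2 bits
```

A query that shares no 16-bit segment with any stored fingerprint returns `None`, which corresponds to the scheme's assumption that a true match differs in at most 3 bits.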
Furthermore, the method for obtaining the video information and the corresponding frame from the fingerprint found is: when fingerprint data is entered, the association between the fingerprint Hash value, the video, and the frame number within that video is stored in the database, so that once the fingerprint is identified, a database query is all that is needed to obtain the video and the frame number.
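The text does not name a database or schema. The sketch below uses SQLite from the Python standard library, with a hypothetical `frames` table, to show the association being stored at indexing time and queried after a match. Fingerprints are stored as 16-character hexadecimal text because SQLite integers are signed 64-bit and cannot hold every unsigned 64-bit value.

```python
import sqlite3


def open_db(path=":memory:"):
    """Open the database and create the (hypothetical) frames table if needed."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS frames ("
        "fingerprint TEXT NOT NULL, video TEXT NOT NULL, frame_number INTEGER NOT NULL)"
    )
    conn.execute("CREATE INDEX IF NOT EXISTS idx_frames_fp ON frames (fingerprint)")
    return conn


def record_frame(conn, fp, video, frame_number):
    """Store the fingerprint-to-(video, frame number) association at indexing time."""
    conn.execute("INSERT INTO frames VALUES (?, ?, ?)",
                 (f"{fp:016x}", video, frame_number))


def lookup(conn, fp):
    """Return every (video, frame_number) pair recorded for a matched fingerprint."""
    cur = conn.execute(
        "SELECT video, frame_number FROM frames WHERE fingerprint = ?",
        (f"{fp:016x}",),
    )
    return cur.fetchall()


conn = open_db()
record_frame(conn, 0xF0F0F0F0F0F0F0F0, "news.mp4", 1500)
print(lookup(conn, 0xF0F0F0F0F0F0F0F0))
```

Several frames, possibly from different videos, can share one fingerprint, so the lookup returns a list rather than a single row.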
Specifically, the segment indexing method is: for each of the 4 segments, fingerprints that share the same value of the current segment are stored under the directory corresponding to that segment number, in the files corresponding to that segment value.
The foregoing has shown and described the basic principles, principal features and advantages of the invention. It will be understood by those skilled in the art that the present invention is not limited to the above-described embodiments, and that the above-described embodiments and descriptions are only preferred embodiments of the present invention, and are not intended to limit the invention, and that various changes and modifications may be made therein without departing from the spirit and scope of the invention as claimed. The scope of the invention is defined by the appended claims and equivalents thereof.

Claims (7)

1. A method for searching videos by picture, comprising a data index creating stage and a video searching stage, characterized in that the data index creating stage comprises the following steps:
S1.1, reading video frame pictures;
S1.2, calculating digital fingerprints of the picture frames;
S1.3, dividing the fingerprint into segments of 16 bits each;
S1.4, traversing all the segments in a loop, and putting the fingerprint into the index directory corresponding to each segment;
S1.5, adding the fingerprint data to the index file;
the video searching stage comprises the following steps:
S2.1, reading the video screenshot data to be searched;
S2.2, calculating the screenshot fingerprint;
S2.3, obtaining, in a loop, the data indexes under the different fingerprint segments;
S2.4, searching for a fingerprint through the data index;
S2.5, obtaining the video information and the corresponding frame from the fingerprint found;
in S1.1, the method for reading the video frame picture is: restoring the compression-coded video and audio data to uncompressed video, and decoding to obtain uncompressed video color data;
in S1.2, the method for calculating the digital fingerprint of the picture frame is: using a perceptual Hash algorithm, the original image is first converted to grayscale and reduced to 8x8 pixels, and the resulting 64 bits of binary data are stored in an array as the 64-bit image fingerprint;
in S1.4, the method for putting the fingerprint into the index directory corresponding to the segment is: four directories are created in the file system, one for each of the 4 segments of the fingerprint, with segment numbers 1, 2, 3 and 4; under each segment number, 2^16 = 65536 Hash directories are created, denoted 0-65535; 10 files are created under each Hash directory; and the complete 64-bit Hash value is stored in the files.
2. The method for searching videos by picture according to claim 1, wherein in S1.4, the method for traversing all the segments in a loop is: the complete 64-bit Hash value is stored in a file through the uniquely determined write path "/segment number/current segment Hash directory/Hash file".
3. The method for searching videos by picture according to claim 1, wherein in S2.2, the method for calculating the screenshot fingerprint is: a digital fingerprint is obtained using a perceptual Hash algorithm.
4. The method for searching videos by picture according to claim 3, wherein in S2.3, the method for obtaining, in a loop, the data indexes under the different fingerprint segments is: the fingerprint is divided into 4 segments, and segment indexing is then performed starting from the first segment.
5. The method for searching videos by picture according to claim 4, wherein in S2.4, the method for searching for a fingerprint through the data index is: all files under the directory "segment number/current segment Hash value/" are read, the entire file content is traversed, the Hamming distance is calculated, and the fingerprint with the minimum Hamming distance is returned.
6. The method for searching videos by picture according to claim 5, wherein in S2.5, the method for obtaining the video information and the corresponding frame from the fingerprint found is: after the fingerprint is identified, the database is queried to obtain the video and the frame of the video.
7. The method for searching videos by picture according to claim 4, wherein the segment indexing method is: for each of the 4 segments, fingerprints that share the same value of the current segment are stored under the directory corresponding to that segment number, in the files corresponding to that segment value.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911316843.1A CN111008301B (en) 2019-12-19 2019-12-19 Method for searching video by using graph

Publications (2)

Publication Number Publication Date
CN111008301A CN111008301A (en) 2020-04-14
CN111008301B CN111008301B (en) 2023-08-15

Family

ID=70117166

Country Status (1)

Country Link
CN (1) CN111008301B (en)

Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1592906A (en) * 2000-07-31 2005-03-09 沙扎姆娱乐有限公司 System and methods for recognizing sound and music signals in high noise and distortion
WO2009140822A1 (en) * 2008-05-22 2009-11-26 Yuvad Technologies Co., Ltd. A method for extracting a fingerprint data from video/audio signals
CN101673266A (en) * 2008-09-12 2010-03-17 未序网络科技(上海)有限公司 Method for searching audio and video contents
CN101673263A (en) * 2008-09-12 2010-03-17 未序网络科技(上海)有限公司 Method for searching video content
CN101681381A (en) * 2007-06-06 2010-03-24 杜比实验室特许公司 Improving audio/video fingerprint search accuracy using multiple search combining
CN102419816A (en) * 2011-11-18 2012-04-18 山东大学 Video fingerprint method for same content video retrieval
CN103514250A (en) * 2013-06-20 2014-01-15 易乐天 Method and system for deleting global repeating data and storage device
CN105095435A (en) * 2015-07-23 2015-11-25 北京京东尚科信息技术有限公司 Similarity comparison method and device for high-dimensional image features
CN106055632A (en) * 2016-05-27 2016-10-26 浙江工业大学 Video authentication method based on scene frame fingerprints
CN106802960A (en) * 2017-01-19 2017-06-06 湖南大学 A kind of burst audio search method based on audio-frequency fingerprint
CN109388729A (en) * 2017-08-14 2019-02-26 阿里巴巴集团控股有限公司 Search method, device and the audio query system of audio sub fingerprint

Also Published As

Publication number Publication date
CN111008301A (en) 2020-04-14

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant