CN104050247B - Method for fast retrieval of massive videos - Google Patents

Method for fast retrieval of massive videos

Info

Publication number
CN104050247B
CN104050247B (application CN201410245315.2A)
Authority
CN
China
Prior art keywords
video
key feature
feature vector
image
frame
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201410245315.2A
Other languages
Chinese (zh)
Other versions
CN104050247A (en)
Inventor
逯利军
钱培专
董建磊
张树民
曹晶
李克民
高瑞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Certusnet Information Technology Co Ltd
Original Assignee
Shanghai Certusnet Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Certusnet Information Technology Co Ltd filed Critical Shanghai Certusnet Information Technology Co Ltd
Priority to CN201410245315.2A priority Critical patent/CN104050247B/en
Publication of CN104050247A publication Critical patent/CN104050247A/en
Application granted granted Critical
Publication of CN104050247B publication Critical patent/CN104050247B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70 Information retrieval of video data
    • G06F16/71 Indexing; Data structures therefor; Storage structures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70 Information retrieval of video data
    • G06F16/78 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/783 Retrieval using metadata automatically derived from the content
    • G06F16/7834 Retrieval using metadata automatically derived from the content, using audio features

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Multimedia (AREA)
  • Library & Information Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention relates to a method for fast retrieval of massive videos, comprising: extracting a spatial feature vector from each video frame in the video streams of a video library to obtain video feature sequences; extracting key feature vectors from the spatial feature vectors; building a distributed storage index database from the key feature vectors of all video files in the library; extracting the key feature vector set of a video to be retrieved and deriving its video index file; and performing video similarity retrieval in the distributed storage index database with that index file, outputting the video retrieval results whose similarity exceeds a preset system value. With this method, representative visual words replace key frames, so the video information is represented completely, without bulk redundancy yet extremely compactly; retrieval is accelerated, massive data can be processed concurrently, and the method has a wide range of applications.

Description

Method for fast retrieval of massive videos
Technical field
The present invention relates to the field of multimedia information technology, in particular to multimedia information retrieval, data mining and video processing, and specifically to a method for fast retrieval of massive videos.
Background technology
With the rapid development of multimedia information technology and the emergence of video-sharing websites, the number of Internet videos has grown rapidly, rising geometrically. Publishing, sharing and retrieving videos over the Web has become part of everyday life. Faced with massive multimedia data, how to retrieve identical or similar videos quickly has become a research hotspot in both industry and academia.
Traditional key-frame-based video retrieval methods aim primarily at retrieval accuracy, but their computational complexity is high: a single retrieval task can take several minutes to complete. Faced with massive networked video, traditional video comparison techniques are not up to the job. Current Internet video retrieval techniques borrow the core idea of the traditional text search engine: video features are treated as visual words, and an inverted index of the video files is built, enabling fast indexing of massive video collections.
Successful matching depends on how rich the information in the query and reference videos is and on how fully that information is expressed and described. Internet video retrieval methods often do not follow the conventional approach of first performing shot segmentation and then extracting shot key frames, because the positions of the extracted key frames are affected by factors such as frame rate and resolution, so key frames cannot be extracted stably and reliably. A simpler, more practical method is to sample the video once per second and use the samples as key frames. This effectively raises the sampling frequency: the higher the frequency, the more fully the original information is expressed, but the larger the computation. Increasing the degree of expression by raising the sampling frequency causes some information to be over-expressed, producing redundancy, while other information is still not fully expressed and is lost. Moreover, uniform sampling makes the loss random, because video information is not linearly distributed, and randomly lost information reduces the stability and accuracy of retrieval. On the other hand, traditional key-frame extraction takes fewer key frames where the video changes little and more where it changes strongly, producing a comparatively compact and complete expression whose quality depends on the clustering or segmentation thresholds. Query and reference videos are often corrupted by noise, such as degraded resolution, network packet loss, frame dropping, low frame rates, video insertion and video editing, which mixes noise into the original video information or loses part of it so that it is no longer complete. Traditional key-frame extraction is over-idealised: (a) it does not consider the complexity of external interference, where a moderate amount of redundancy is necessary, and (b) the features it uses to extract key frames were not designed for massive retrieval tasks, so the related methods are not directly suitable here. How to choose appropriate retrieval features so that the constructed key-frame sequence has the fewest frames, expresses the shot information relatively completely and retains moderate redundancy has become a key unsolved problem for massive-data retrieval.
The content of the invention
The purpose of the present invention is to overcome the shortcomings of the prior art described above and to provide a method for fast retrieval of massive videos in which representative visual words replace key frames, giving a representation that is free of bulk redundancy yet extremely compact, accelerating retrieval, providing the ability to process massive data concurrently, and having a broad range of applications.
To achieve these goals, the method of the present invention for fast retrieval of massive videos is composed as follows:
The method for fast retrieval of massive videos is mainly characterised in that it comprises the following steps:
(1) extract a spatial feature vector from each video frame in the video streams of the video library to obtain a video feature sequence;
(2) extract key feature vectors from the spatial feature vectors of the video feature sequence;
(3) build a distributed storage index database of all video files from the key feature vectors of all video files in the library;
(4) extract the key feature vector set of the video to be retrieved and derive its video index file;
(5) perform video similarity retrieval in the distributed storage index database using the video index file of the video to be retrieved, and output the video retrieval results whose similarity exceeds a preset system value.
Preferably, the spatial feature vector comprises the gray-level spatial distribution feature and the texture spatial distribution feature of the corresponding frame, and extracting a spatial feature vector from each video frame in the video streams of the library comprises the following steps:
(11) compute the gray-level image and the edge-texture image of each video frame in the video streams of the library;
(12) compute the central spatial feature and the boundary spatial feature of each frame's gray-level image and obtain the frame's gray-level spatial distribution feature composed of the two;
(13) compute the texture spatial distribution feature of each frame's edge-texture image.
More preferably, computing the gray-level image and the edge-texture image of each video frame in the streams of the library comprises the following steps:
(111) divide each video frame into several equally sized sub-images and compute each sub-image's gray value and edge-point count;
(112) the gray values of a frame's sub-images give the frame's gray-level image;
(113) the edge-point counts of a frame's sub-images give the frame's edge-texture image.
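Steps (111)-(113) can be sketched as follows. This is a minimal illustration, not the patent's implementation: the grid size and the edge detector (a plain gradient-magnitude threshold) are assumptions, since the text fixes neither.

```python
import numpy as np

def block_images(frame, m=8, n=8, edge_thresh=30.0):
    """Divide a grayscale frame into an m x n grid of equal sub-images and
    reduce each block to (a) its mean gray value and (b) its edge-point
    count, yielding the small gray-level image and edge-texture image of
    steps (112)-(113)."""
    h, w = frame.shape
    bh, bw = h // m, w // n
    gy, gx = np.gradient(frame.astype(float))   # assumed edge operator
    mag = np.hypot(gx, gy)
    gray_img = np.zeros((m, n))
    edge_img = np.zeros((m, n), dtype=int)
    for i in range(m):
        for j in range(n):
            rows = slice(i * bh, (i + 1) * bh)
            cols = slice(j * bw, (j + 1) * bw)
            gray_img[i, j] = frame[rows, cols].mean()
            edge_img[i, j] = int((mag[rows, cols] > edge_thresh).sum())
    return gray_img, edge_img
```

The two small images are what the later LBP steps operate on, so their resolution (here 8x8) directly bounds the frame-feature length.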
More preferably, computing the central and boundary spatial features of each frame's gray-level image is specifically:
computing the central and boundary spatial features of the local binary pattern of each frame's gray-level image;
and computing the texture spatial distribution feature of each frame's edge-texture image is specifically:
computing the texture spatial distribution feature of the local binary pattern of each frame's edge-texture image.
More preferably, the spatial feature vector also includes a colour-histogram feature, and extracting a spatial feature vector from each video frame in the streams of the library further comprises:
(14) compute the colour-histogram feature of each video frame.
Preferably, extracting key feature vectors from the spatial feature vectors of the video feature sequence comprises the following steps:
(21) take the first spatial feature vector of the video feature sequence as a key feature vector by default;
(22) compute the Mahalanobis distance between each spatial feature vector and the previous key feature vector;
(23) extract as key feature vectors the spatial feature vectors whose Mahalanobis distance exceeds a preset system threshold.
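Steps (21)-(23) can be sketched as below; the inverse covariance matrix used by the Mahalanobis distance is assumed to be estimated offline from the library's feature distribution, which the text does not specify.

```python
import numpy as np

def extract_key_vectors(features, inv_cov, thresh=1.5):
    """The first spatial feature vector is a key vector by default
    (step 21); each later vector whose Mahalanobis distance to the most
    recent key vector (step 22) exceeds `thresh` becomes the next key
    vector (step 23). Returns (vector, frame index) pairs."""
    keys = [(features[0], 0)]
    for n in range(1, len(features)):
        d = features[n] - keys[-1][0]
        if np.sqrt(d @ inv_cov @ d) > thresh:
            keys.append((features[n], n))
    return keys
```

With `inv_cov` set to the identity matrix the distance degenerates to Euclidean, which makes the thresholding easy to sanity-check.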
Preferably, building the distributed storage index database of all video files from the key feature vectors of all video files in the library comprises the following steps:
(31) build the subspace-projection histograms of the key feature vectors in the video feature sequence and record the frequency with which each key feature vector occurs in the corresponding video;
(32) build the inverted index file of all video files of the library;
(33) build the distributed index database of all video files of the library.
More preferably, building the subspace-projection histograms of the key feature vectors in the video feature sequence is specifically:
projecting the key feature vectors of the video feature sequence onto the gray subspace, the texture subspace and the colour subspace and obtaining each key feature vector's subspace-projection histogram.
Further, recording the frequency with which each key feature vector occurs in the corresponding video is specifically:
recording, in the subspace-projection histogram of each key feature vector, a feature value that represents the number of times the vector occurs in the video.
Further, building the inverted index file of all video files of the library comprises the following steps:
(321) pool the key feature vector sets corresponding to the video files in the library into the library's key-feature-vector dictionary;
(322) for each key feature vector in the dictionary, build the set of documents that contain that key feature vector;
(323) sort the documents of each key feature vector's set by the number of key feature vectors they contain, in descending order;
(324) build the inverted index file of all video files of the library for each subspace.
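Steps (321)-(323) can be sketched as a plain posting-list build; identifiers such as `video_words` are illustrative, and the per-subspace split of step (324) is omitted for brevity.

```python
from collections import defaultdict

def build_inverted_index(video_words):
    """video_words: {video_id: set of visual-word ids}. Pools every
    video's word set (step 321), maps each word to the videos holding it
    (step 322), and sorts each posting list by the video's total word
    count, descending (step 323)."""
    postings = defaultdict(list)
    for vid, words in video_words.items():
        for w in words:
            postings[w].append(vid)
    for w in postings:
        postings[w].sort(key=lambda vid: len(video_words[vid]), reverse=True)
    return dict(postings)
```

Sorting postings by document richness means the most word-dense videos, which are the likeliest strong matches, are scanned first at query time.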
Still further, building the distributed index database of all video files of the library comprises the following steps:
(331) map the key feature vectors of each subspace to the one-dimensional space with a p-stable locality-sensitive hashing algorithm;
(332) based on the Hadoop distributed file system (HDFS) architecture, maintain the hash tables with name nodes (name_node) and store the index data on data nodes (data_node), yielding the distributed index database of all video files.
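Step (331) relies on p-stable locality-sensitive hashing; one hash function of the standard 2-stable (Gaussian) family is sketched below. In practice `a` is drawn from a standard normal distribution and `b` uniformly from [0, r); the fixed values in the test are only for reproducibility.

```python
import numpy as np

def pstable_hash(v, a, b, r):
    """Map a subspace key feature vector to a one-dimensional bucket id,
    h(v) = floor((a . v + b) / r); nearby vectors collide with high
    probability, which is what makes the bucket usable as an index key."""
    return int(np.floor((np.dot(a, v) + b) / r))
```

Several such functions are normally combined into a composite key, with multiple tables to boost recall; the single-function form above is the building block.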
More preferably, performing video similarity retrieval in the distributed storage index database with the index file of the video to be retrieved is specifically:
(51) compute the intersection of the query video's subspace-projection histogram with the subspace-projection histogram of each video in the library as the similarity between the query and that video;
(52) eliminate the video files that do not satisfy the spatio-temporal structural consistency requirement on the key feature vectors of the query and each library video.
Further, outputting the retrieval results whose similarity exceeds the preset system value comprises the following steps:
(52) extract the subspace-projection histograms of the query video's key feature vectors and map each key feature vector to a hash value in each subspace;
(53) select from the distributed index database, via the inverted index file, the video files whose similarity meets the preset system requirement;
(54) compute the spatio-temporal structural consistency of the key feature vectors of the query and each library video, and output the video files whose similarity to the query exceeds the preset value.
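The similarity of step (51) is the histogram intersection; a minimal sketch, assuming normalised histograms, with the spatio-temporal consistency check of steps (52)/(54) omitted:

```python
import numpy as np

def hist_intersection(h_query, h_ref):
    """Histogram intersection sum(min(a_i, b_i)) of two subspace-projection
    histograms; equals 1.0 for identical normalised histograms."""
    return float(np.minimum(h_query, h_ref).sum())

def retrieve(query_hist, library, thresh=0.6):
    """Return the ids of library videos whose intersection with the query
    histogram exceeds the preset system value."""
    return [vid for vid, h in library.items()
            if hist_intersection(query_hist, h) > thresh]
```

The intersection is cheap, bounded in [0, 1] for normalised inputs, and tolerant of missing bins, which suits noisy query videos.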
The method for fast retrieval of massive videos of the present invention has the following beneficial effects:
The invention mainly addresses the completeness of the video index information and the selection of index features, and proposes a subspace method based on video fingerprints that solves the current problem of fast, robust retrieval over massive data. First, the patent uses a novel scheme in which the extraction of key feature vectors replaces key-frame extraction: representative visual features directly replace key frames, which is equivalent to encoding the original video in feature space. The video information is expressed completely, without bulk redundancy yet compactly, and the current parameter-selection problem of key-frame extraction is overcome. Second, each visual feature is mapped to a one-dimensional hash value, and a suitable HDFS (Hadoop Distributed File System) name node (name_node) and data node (data_node) are selected according to the location of the hash value; this accelerates retrieval and provides the ability to process massive data concurrently, giving the method a wide range of applications.
Brief description of the drawings
Fig. 1 is a flow chart of the method for fast retrieval of massive videos of the present invention.
Fig. 2 is a flow chart of the method for fast retrieval of massive videos of the present invention applied to a specific embodiment.
Fig. 3 is a flow chart of mapping a video frame sequence to a video feature sequence according to the invention.
Fig. 4 is a flow chart of computing the gray-level spatial distribution feature according to the invention.
Fig. 5 is a flow chart of extracting key feature vectors according to the invention.
Embodiment
To describe the technical content of the present invention more clearly, it is further described below with reference to specific embodiments.
The invention discloses a method and system for fast retrieval of massive videos, in which the method comprises: mapping the video frame sequence to a video feature sequence composed of spatial feature vectors and extracting the representative features among them as the sequence's key feature vectors; mapping the key feature vectors with a hash function and building a distributed index according to the hash bucket in which each resulting hash value falls; and, for the key feature vector set of the video to be retrieved, computing the bucket number of each corresponding hash value, extracting the index files of the corresponding features, obtaining candidate video files by voting, computing the similarity between the query video and each candidate video file, and outputting as retrieval results those whose similarity exceeds a certain threshold.
As shown in Fig. 1, the method of the present invention for fast retrieval of massive videos comprises the following steps:
(1) Extract a spatial feature vector from each video frame in the video streams of the video library to obtain a video feature sequence;
In a preferred embodiment, the spatial feature vector comprises the gray-level spatial distribution feature and the texture spatial distribution feature of the corresponding frame; accordingly, extracting a spatial feature vector from each video frame in the streams of the library to obtain the video feature sequence comprises the following steps:
(11) compute the gray-level image and the edge-texture image of each video frame in the video streams of the library;
In a preferred embodiment, the gray-level image and the edge-texture image can be computed in the following way:
(111) divide each video frame into several equally sized sub-images and compute each sub-image's gray value and edge-point count;
(112) the gray values of a frame's sub-images give the frame's gray-level image;
(113) the edge-point counts of a frame's sub-images give the frame's edge-texture image.
(12) compute the central spatial feature and the boundary spatial feature of each frame's gray-level image and obtain the frame's gray-level spatial distribution feature composed of the two; the central and boundary spatial features can be central and boundary spatial features based on local binary patterns.
(13) compute the texture spatial distribution feature of each frame's edge-texture image.
The texture spatial distribution feature can likewise be based on local binary patterns.
In a preferred embodiment, the spatial feature vector can further include a colour-histogram feature, so that it represents the video features of each frame more fully; that is, extracting the spatial feature vectors from the video frames further comprises:
(14) compute the colour-histogram feature of each video frame.
(2) Extract key feature vectors from the spatial feature vectors of the video feature sequence;
In a preferred embodiment, extracting the key feature vectors comprises the following steps:
(21) take the first spatial feature vector of the video feature sequence as a key feature vector by default;
(22) compute the Mahalanobis distance between each spatial feature vector and the previous key feature vector;
(23) extract as key feature vectors the spatial feature vectors whose Mahalanobis distance exceeds a preset system threshold.
(3) Build the distributed storage index database of all video files from the key feature vectors of all video files in the library;
In a preferred embodiment, building the distributed storage index database comprises the following steps:
(31) build the subspace-projection histograms of the key feature vectors in the video feature sequence and record the frequency with which each key feature vector occurs in the corresponding video;
Further, the subspaces can be the gray subspace and the texture subspace, optionally together with a colour subspace; building the subspace-projection histograms of the key feature vectors is therefore specifically:
projecting the key feature vectors of the video feature sequence onto the gray, texture and colour subspaces and obtaining each key feature vector's subspace-projection histogram.
Further, recording the frequency with which each key feature vector occurs in the corresponding video is specifically:
recording, in the subspace-projection histogram of each key feature vector, a feature value that represents the number of times the vector occurs in the video.
(32) build the inverted index file of all video files of the library;
Further, building the inverted index file of all video files of the library comprises the following steps:
(321) pool the key feature vector sets corresponding to the video files in the library into the library's key-feature-vector dictionary;
(322) for each key feature vector in the dictionary, build the set of documents that contain that key feature vector;
(323) sort the documents of each key feature vector's set by the number of key feature vectors they contain, in descending order;
(324) build the inverted index file of all video files of the library for each subspace.
(33) build the distributed index database of all video files of the library.
Further, building the distributed index database of all video files of the library comprises the following steps:
(331) map the key feature vectors of each subspace to the one-dimensional space with a p-stable locality-sensitive hashing algorithm;
(332) based on the Hadoop distributed file system (HDFS) architecture, maintain the hash tables with name nodes (name_node) and store the index data on data nodes (data_node), yielding the distributed index database of all video files.
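Step (332) can be modelled in miniature as below; the modulo placement policy is an assumption, since the text only says that name nodes keep the hash tables and data nodes keep the index data.

```python
class DistributedIndex:
    """Toy name-node/data-node split: the name table maps each
    one-dimensional hash bucket to the data node holding its postings;
    each data node stores bucket -> [(video_id, frame_no)] lists."""
    def __init__(self, n_nodes=4):
        self.n_nodes = n_nodes
        self.name_table = {}                       # bucket -> node id
        self.data_nodes = [dict() for _ in range(n_nodes)]

    def insert(self, bucket, video_id, frame_no):
        # Assumed placement policy: modulo over the node count.
        node = self.name_table.setdefault(bucket, bucket % self.n_nodes)
        self.data_nodes[node].setdefault(bucket, []).append((video_id, frame_no))

    def lookup(self, bucket):
        node = self.name_table.get(bucket)
        return [] if node is None else self.data_nodes[node].get(bucket, [])
```

Because a bucket resolves to exactly one node, queries for different buckets fan out to different data nodes, which is where the concurrent-processing claim comes from.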
(4) Extract the key feature vector set of the video to be retrieved and derive its video index file; in a concrete application, the key feature vectors of the query video can be extracted with the extraction method of steps (1) and (2).
(5) Perform video similarity retrieval in the distributed storage index database using the video index file of the video to be retrieved, and output the video retrieval results whose similarity exceeds a preset system value.
In a preferred embodiment, performing the video similarity retrieval in the distributed storage index database with the query video's index file is specifically:
(51) compute the intersection of the query video's subspace-projection histogram with the subspace-projection histogram of each video in the library as the similarity between the query and that video;
(52) eliminate the video files that do not satisfy the spatio-temporal structural consistency requirement on the key feature vectors of the query and each library video.
In a preferred embodiment, outputting the retrieval results whose similarity exceeds the preset value comprises the following steps:
(52) extract the subspace-projection histograms of the query video's key feature vectors and map each key feature vector to a hash value in each subspace;
(53) select from the distributed index database, via the inverted index file, the video files whose similarity meets the preset system requirement;
(54) compute the spatio-temporal structural consistency of the key feature vectors of the query and each library video, and output the video files whose similarity to the query exceeds the preset value.
The method for fast retrieval of massive videos of the present invention is further elaborated below with a specific embodiment. As shown in Fig. 2, in a concrete application the method comprises the following steps:
(1) Video spatial feature coding, i.e. mapping the video frame sequence to a video feature sequence; as shown in Fig. 3, this comprises the following sub-steps:
(11) read a frame from the video stream, divide it into M×N equally sized sub-images, and compute each sub-image's gray value and edge-point count;
(12) compute the two LBP (local binary pattern) spatial features v_gray of the gray-level image, the central feature (f) and the boundary feature (g) shown in Fig. 4, which together form the frame's 8-bit spatial distribution feature ((h) in Fig. 4);
(13) likewise compute the LBP spatial distribution feature v_texture of the edge-texture image; for simplicity, the number of edges inside each image block can be counted as the measure of texture complexity, the result again being an 8-bit spatial texture distribution feature;
(14) concatenate the v_gray and v_texture features into the multi-part frame feature v = (v_gray, v_texture); we call a frame feature v a frame visual word;
(15) besides the gray and texture LBP spatial features, other frame features can be added, e.g. an 8- or 16-bin colour histogram v_color_his_16, in which case v = (v_gray, v_texture, v_color_his_16); composing the frame feature this way overcomes the weakness of any single feature subspace in expressing a video frame.
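Sub-steps (12)-(15) can be illustrated with the textbook 8-neighbour LBP. How Fig. 4 derives the central feature (f) and boundary feature (g) from the block image is not recoverable from the text, so taking the LBP of the centre cell below is a stand-in, not the patent's exact construction.

```python
import numpy as np

def lbp8(img, i, j):
    """Standard 8-neighbour local binary pattern at cell (i, j): each
    neighbour whose value is >= the centre contributes one bit."""
    c = img[i, j]
    nbrs = [img[i-1, j-1], img[i-1, j], img[i-1, j+1], img[i, j+1],
            img[i+1, j+1], img[i+1, j], img[i+1, j-1], img[i, j-1]]
    return sum(1 << k for k, p in enumerate(nbrs) if p >= c)

def frame_feature(gray_img, edge_img, color_hist):
    """Concatenate the gray-LBP, texture-LBP and colour-histogram parts
    into the multi-part frame feature v = (v_gray, v_texture, v_color)
    of sub-step (15)."""
    c = gray_img.shape[0] // 2
    return (lbp8(gray_img, c, c), lbp8(edge_img, c, c), tuple(color_hist))
```

Each LBP value is an 8-bit integer, which matches the 8-bit spatial distribution features named in sub-steps (12) and (13).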
This patent does not consider temporal features, because the temporal features of a query video are subject to interference factors such as low frame rates or missing frames and are therefore highly uncertain, and the spatio-temporal features of frames constructed from the time series may well be wrong. Instead, temporal-order consistency is verified during similarity retrieval.
(2) Video key-feature extraction, i.e. extracting the representative features as the key feature vectors of the video feature sequence; as shown in Fig. 5, this comprises the following sub-steps:
(21) take the first spatial feature vector of the video feature sequence as a key feature vector by default;
(22) extract the spatial feature vector v(n) of the current frame n; if the Mahalanobis distance between v(n) and the previous key feature vector (v(m), m) exceeds a threshold thrsh (to allow for noise, here 1 <= thrsh <= 2), the current frame yields a key feature vector, recorded as (v(n), n).
Two different feature vectors v1 and v2 express different video content. Using representative key feature vectors (key vectors) in place of traditional key-frame vectors not only removes the key-frame-extraction step, but also expresses the video content more directly and accurately with features generated from the source, solving both the completeness of the video index information and the index-feature selection problem.
We call a key feature vector a key vector, also a visual word; the set of visual words is collectively called the visual vocabulary. The histogram of a single video file's feature-vector set is called its feature histogram (vector histogram or visual-word histogram). To give the key vector rich expressive power together with abstract summarisation, it is composed of distinct, independent sub-vectors: the spatial gray-distribution feature Gray-LBP vector, the spatial texture-distribution feature Texture-LBP vector and the colour vector; it can be written simply as key vector = {Gray-LBP, Texture-LBP, Color}. The product space jointly formed by these different abstract feature subspaces realises the key vector's rich expressiveness and abstract summarisation.
Unlike other key-frame extraction approaches, the present invention extracts key features directly from the video stream rather than performing traditional key-frame extraction.
Traditional key-frame extraction first extracts key frames with a key-frame extraction algorithm and then extracts retrieval features from the extracted key frames. The method used to extract the key frames and the retrieval features computed afterwards are not fully consistent, and are sometimes widely different, which makes the description inaccurate; this is also one reason why conventional retrieval features are not accurate enough.
(3) Mapping the video frame sequence to the video visual-word histogram; this specifically includes the following sub-steps:
(31) A visual word may have very high dimensionality: for example, (f_gray, f_texture, f_color_his_16) has dimensions (8, 8, 16), 32 dimensions in total, whose histogram would require nearly 1 GB of memory. We therefore project the 32-dimensional space separately into the 8-dimensional f_gray subspace, the 8-dimensional f_texture subspace and the 16-dimensional f_color_his_16 subspace, and count the histograms within each subspace. The memory requirement drops markedly to under 70 MB, and the histogram of a single video file rarely exceeds 10 MB;
(32) The numerical value of each bin of the subspace projection histogram (for a given subspace feature, e.g. an 8-bit LBP feature) represents the frequency with which that feature occurs in the video. In order to preserve the temporal distribution of the feature within a bin, the bin contents are recorded as follows:
bin: (total frequency of this feature, n1 + n2 + … + nk; frame number T1, consecutive occurrence count n1; T2, n2; …; Tk, nk).
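The bin record of step (32) amounts to a per-feature-value run-length encoding alongside the count. A minimal sketch follows; the function name and the dict layout are ours, but the stored tuple mirrors bin: (n1 + n2 + … + nk; T1, n1; T2, n2; …; Tk, nk).

```python
from collections import defaultdict

def subspace_histogram(frame_features):
    """Build one subspace's histogram from a list of per-frame feature values
    (e.g. the 8-bit Gray-LBP value of each frame). Each bin keeps the total
    frequency plus (start frame, consecutive occurrence count) runs, so the
    temporal distribution survives inside the bin, as step (32) requires."""
    bins = defaultdict(lambda: {"count": 0, "runs": []})
    for frame_no, value in enumerate(frame_features):
        entry = bins[value]
        entry["count"] += 1
        runs = entry["runs"]
        # Extend the current run if this value also occurred on the previous frame.
        if runs and runs[-1][0] + runs[-1][1] == frame_no:
            runs[-1][1] += 1
        else:
            runs.append([frame_no, 1])
    return dict(bins)
```

The runs are exactly the (Tk, nk) pairs that step (63) later consults to compare temporal ordering.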
(4) Build the inverted index files of the video files, specifically including the following sub-steps:
(41) Count the visual-word set corresponding to each video in the video library, forming the statistical visual dictionary VwSet of the library. For each Vw_i (the i-th visual word in the visual dictionary), build the document set {vf1, vf2, vf3, …, vfni} of the videos that possess that visual word, where ni is the size of the document set;
(42) The documents in each visual word's document set are sorted in descending order by the number of visual words they contain;
(43) Since the high-dimensional visual words are projected into low-dimensional feature subspaces, an inverted index file is built for each subspace.
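Sub-steps (41)-(42) can be sketched as building posting lists and ordering them. This is an illustration under assumed names (`build_inverted_index`, the vf identifiers); the patent builds one such index per subspace, which would mean calling this once per subspace's word sets.

```python
from collections import defaultdict

def build_inverted_index(video_words):
    """Map each visual word to the list of video files ("documents") that
    contain it, each posting list sorted so that videos holding more visual
    words come first, per step (42). `video_words` maps a video id to its
    set of visual words."""
    index = defaultdict(set)
    for vid, words in video_words.items():
        for w in words:
            index[w].add(vid)
    # Step (42): descending order by the number of visual words each video contains.
    return {w: sorted(vids, key=lambda v: len(video_words[v]), reverse=True)
            for w, vids in index.items()}
```

Ordering rich documents first lets retrieval truncate each posting list early while still seeing the most feature-complete candidates.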
(5) Build the distributed storage index database, specifically including the following steps:
(51) Using a locality-sensitive hashing (LSH) algorithm based on p-stable distributions, map each subspace feature f_v (e.g. f_color_his_16) to the one-dimensional space [0, Range);
(52) Using the Hadoop HDFS file system architecture, maintain the LSH tables with the name_node and save the index data with the data_nodes.
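Step (51) can be illustrated with the standard p-stable LSH projection h(v) = floor((a·v + b) / w), where `a` is drawn from a Gaussian (2-stable) distribution. The modulo fold into [0, Range) is our assumption about how the patent confines the hash to that interval; all variable names are illustrative.

```python
import numpy as np

def p_stable_lsh(v, a, b, w, range_size):
    """p-stable LSH of vector v: quantized random projection, folded into
    [0, range_size). Nearby vectors tend to land in the same bucket."""
    h = int(np.floor((np.dot(a, v) + b) / w))
    return h % range_size

# Illustrative setup: one hash function over a 16-dim subspace (f_color_his_16).
rng = np.random.default_rng(0)
dim, w, range_size = 16, 4.0, 1024
a = rng.normal(size=dim)          # 2-stable (Gaussian) projection direction
b = rng.uniform(0, w)             # random offset in [0, w)
v1 = rng.normal(size=dim)
```

The resulting one-dimensional hash value is what step (52) uses to pick the name_node-maintained bucket and the data_node holding the index data.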
(6) Video similarity calculation, specifically including the following steps:
The subspace histogram of the query video Vq is {Bin_q_1, Bin_q_2, …, Bin_q_M}, where M is the size of the feature subspace; the histogram of library video Vi is {Bin_i_1, Bin_i_2, …, Bin_i_M}. In Bin_id_n, id is the unique video number, n is the histogram bin index, and Bin_id_n is the number of times the feature occurs;
(61) The video similarity is the histogram intersection;
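Step (61)'s histogram intersection can be sketched in a few lines. Normalizing the intersection by the query histogram's mass is our assumption, made so the similarity can be compared against thresholds such as thrsh_sim; the description itself only names the intersection.

```python
def histogram_intersection(h_q, h_i):
    """Similarity of two aligned histograms {Bin_q_1..Bin_q_M} and
    {Bin_i_1..Bin_i_M}: sum of per-bin minima, normalized by the query mass."""
    inter = sum(min(q, i) for q, i in zip(h_q, h_i))
    total = sum(h_q)
    return inter / total if total else 0.0
```

Identical histograms score 1.0; disjoint ones score 0.0, so a threshold like thrsh_sim partitions candidates cleanly.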
(62) If the similarity exceeds a threshold thrsh_sim, the temporal ordering of the visual words is compared. The histogram time-series information was recorded in step (32); the algorithm is as follows:
Represent the query video's visual words in the order of their appearance in time, e.g. {(Vq_vw1, Bin_k1), (Vq_vw2, Bin_k2), …, (Vq_vwl, Bin_kl)}, where vw1 is the first visual word to appear in time and vw2 the next; Bin_k1 is the index k1 of the histogram bin containing that visual word, and kl is the total number of histogram bins;
(63) If visual word Vq_vw(x) appears earlier in the query video than Vq_vw(y), i.e. x < y, then among the sequence numbers of the matched video's histogram bins containing the same visual word as Bin_kx, at least one must be smaller than one of the sequence numbers corresponding to Bin_ky. In other words, the order in which visual words appear in the query should be consistent with the order in which the same visual words appear in a similar video, i.e. the corresponding spatio-temporal structures are consistent; checking temporal order removes a large number of falsely similar candidate videos.
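The order test of step (63) can be sketched as follows, with the frame positions taken from the (Tk, nk) runs recorded in step (32). The acceptance rule — some occurrence of the earlier word must precede some occurrence of the later word — follows the "at least one is less than" wording; the function and parameter names are ours.

```python
def temporal_order_consistent(query_seq, candidate_positions):
    """Check that visual words appear in the candidate video in the same
    relative order as in the query. `query_seq` lists the query's visual
    words in order of first appearance; `candidate_positions` maps each
    visual word to the frame positions where it occurs in the candidate."""
    for i in range(len(query_seq) - 1):
        x, y = query_seq[i], query_seq[i + 1]
        xs = candidate_positions.get(x)
        ys = candidate_positions.get(y)
        if not xs or not ys:
            return False  # a shared word is missing entirely
        # At least one occurrence of x must precede some occurrence of y.
        if min(xs) >= max(ys):
            return False
    return True
```

A candidate that contains the right words in the wrong order fails here despite a high histogram intersection, which is exactly the class of false positives the step is meant to reject.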
(7) Video retrieval, specifically as follows:
Extract the visual-word histogram of the query video; map each subspace of the visual-word features to a hash value; determine and access the name_node and data_nodes where the corresponding hash buckets reside; through the inverted index of video files, select the most similar top 20%; then compute the spatio-temporal structure consistency and output, in order of similarity, all retrieved video files whose similarity exceeds 0.7.
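Step (7) ties the earlier pieces together: rank by histogram intersection, keep the top 20%, then apply the temporal-order check and the 0.7 similarity cutoff. The sketch below inlines simplified versions of the earlier steps over an in-memory `library` mapping; the distributed name_node/data_node lookup is omitted, and all names are illustrative.

```python
def retrieve(query_hist, query_seq, library, thrsh_sim=0.7):
    """Rank `library` (video id -> (histogram, word-position map)) against the
    query, keep the most similar 20%, and return (id, similarity) pairs that
    pass the 0.7 cutoff and the visual-word order check."""
    def intersection(hq, hi):
        s = sum(hq)
        return sum(min(a, b) for a, b in zip(hq, hi)) / s if s else 0.0

    def order_ok(seq, pos):
        return all(pos.get(x) and pos.get(y) and min(pos[x]) < max(pos[y])
                   for x, y in zip(seq, seq[1:]))

    scored = sorted(((intersection(query_hist, h), vid, pos)
                     for vid, (h, pos) in library.items()),
                    key=lambda t: t[0], reverse=True)
    top = scored[:max(1, len(scored) // 5)]          # most similar 20%
    return [(vid, sim) for sim, vid, pos in top
            if sim > thrsh_sim and order_ok(query_seq, pos)]
```

In the full system the candidate set would come from the inverted index files on the selected data_nodes rather than from a full scan of the library.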
Adopting the method for realizing massive video quick-searching of this invention brings the following beneficial effects:
The present invention mainly addresses the completeness of video index information and the problem of index feature selection, proposing a subspace method based on video fingerprints that solves the current problem of fast, robust search over massive data. First, this patent uses a novel extraction method, replacing key-frame extraction with key feature vector extraction: representative visual features directly replace key frames, which is equivalent to encoding the original video in feature space. The video information is expressed completely and compactly, without bulk redundancy, and the parameter-selection problem of current key-frame extraction is overcome. Second, each visual feature is mapped to a one-dimensional hash value; according to where the hash value falls, a suitable HDFS (Hadoop Distributed File System) name_node (name node) and data_node (data node) are selected, which both accelerates retrieval and provides the capacity for concurrent processing of massive data, giving the method a wide range of applications.
In this description, the invention has been described with reference to specific embodiments thereof. It is nevertheless evident that various modifications and changes may be made without departing from the spirit and scope of the invention. Accordingly, the specification and drawings are to be regarded as illustrative rather than restrictive.

Claims (12)

1. A method for realizing massive video quick-searching, characterized in that said method comprises the following steps:
(1) extracting spatial feature vectors for each video frame image in the video streams of a video library to obtain video feature sequences;
(2) extracting key feature vectors from the spatial feature vectors of said video feature sequences;
(3) building a distributed storage index database of all video files according to the key feature vectors of all the video files in the video library;
(4) extracting the key feature vector set of a video to be retrieved and extracting the video index file of said video to be retrieved;
(5) performing video similarity retrieval in said distributed storage index database according to the video index file of said video to be retrieved, and outputting the video retrieval results whose similarity exceeds a system preset value;
wherein said spatial feature vectors comprise the gray-scale spatial distribution feature and the texture spatial distribution feature of the corresponding frame image, and said extracting spatial feature vectors for each video frame image in the video streams of the video library comprises the following steps:
(11) computing the gray-scale image and the edge texture image of each video frame image in the video streams of the video library;
(12) computing the central spatial feature and the boundary spatial feature of the gray-scale image of each video frame image, and obtaining the gray-scale spatial distribution feature of the frame image composed of said central spatial feature and said boundary spatial feature;
(13) computing the texture spatial distribution feature of the edge texture image of each video frame image.
2. The method for realizing massive video quick-searching according to claim 1, characterized in that said computing the gray-scale image and the edge texture image of each video frame image in the video streams of the video library comprises the following steps:
(111) dividing each video frame image in the video streams of the video library into several equally sized sub-images, and computing the gray value and the texture edge point count of each sub-image;
(112) computing the gray values of the sub-images of each video frame image to obtain the gray-scale image of the frame image;
(113) computing the texture edge point counts of the sub-images of each video frame image to obtain the edge texture image of the frame image.
3. The method for realizing massive video quick-searching according to claim 1, characterized in that said computing the central spatial feature and the boundary spatial feature of the gray-scale image of each video frame image is specifically:
computing the central spatial feature and the boundary spatial feature of the local binary patterns of the gray-scale image of each video frame image;
and said computing the texture spatial distribution feature of the edge texture image of each video frame image is specifically:
computing the texture spatial distribution feature of the local binary patterns of the edge texture image of each video frame image.
4. The method for realizing massive video quick-searching according to claim 1, characterized in that said spatial feature vectors further comprise a color histogram feature, and said extracting spatial feature vectors for each video frame image in the video streams of the video library further comprises the following step:
(14) computing the color histogram feature of each video frame image.
5. The method for realizing massive video quick-searching according to claim 1, characterized in that said extracting key feature vectors from the spatial feature vectors of said video feature sequences comprises the following steps:
(21) taking the first spatial feature vector of said video feature sequence as a key feature vector by default;
(22) computing the Mahalanobis distance between each spatial feature vector and the previous key feature vector;
(23) extracting as key feature vectors the spatial feature vectors whose Mahalanobis distance exceeds a system preset threshold.
6. The method for realizing massive video quick-searching according to claim 1, characterized in that said building a distributed storage index database of all video files according to the key feature vectors of all the video files in the video library comprises the following steps:
(31) building the subspace projection histograms of the key feature vectors in said video feature sequences, and recording the frequency with which each key feature vector occurs in the corresponding video;
(32) building the inverted index files of all the video files of the video library;
(33) building the distributed index database of all the video files of the video library.
7. The method for realizing massive video quick-searching according to claim 6, characterized in that said building the subspace projection histograms of the key feature vectors in said video feature sequences is specifically:
projecting the key feature vectors in the video feature sequences onto the gray-scale subspace, the texture subspace and the color subspace, and obtaining the subspace projection histogram of each key feature vector.
8. The method for realizing massive video quick-searching according to claim 7, characterized in that said recording the frequency with which each key feature vector occurs in the corresponding video is specifically:
recording, in the subspace projection histogram corresponding to each key feature vector, the feature value representing the number of occurrences of the key feature vector in the video.
9. The method for realizing massive video quick-searching according to claim 7, characterized in that said building the inverted index files of all the video files in the video library comprises the following steps:
(321) counting the key feature vector sets corresponding to each video file in the video library and merging them to form the statistical key feature vector library of the video library;
(322) for each key feature vector in said statistical key feature vector library, building the document set of the videos that possess that key feature vector;
(323) sorting the documents of each key feature vector set in descending order of the number of key feature vectors they contain;
(324) building the inverted index files of all the video files of the video library according to each subspace.
10. The method for realizing massive video quick-searching according to claim 9, characterized in that said building the distributed index database of all the video files in the video library comprises the following steps:
(331) mapping the key feature vectors of each subspace to a one-dimensional space using a locality-sensitive hashing algorithm based on p-stable distributions;
(332) based on the Hadoop distributed file system architecture, maintaining the hash tables with the name_node and saving the index data with the data_nodes to form the distributed index database of all the video files.
11. The method for realizing massive video quick-searching according to claim 6, characterized in that said performing video similarity retrieval in said distributed storage index database according to the video index file of said video to be retrieved is specifically:
(51) computing the intersection of the subspace projection histogram of the video to be retrieved with the subspace projection histogram of each video in the video library as the similarity between the video to be retrieved and each video in the video library;
(52) rejecting, according to the spatio-temporal structure consistency of the key feature vectors of the video to be retrieved and of each video in the video library, the video files that do not meet the spatio-temporal structure consistency requirement.
12. The method for realizing massive video quick-searching according to claim 11, characterized in that said outputting the video retrieval results whose similarity exceeds a system preset value comprises the following steps:
(52) extracting each subspace projection histogram of the key feature vectors of the video to be retrieved, and mapping each key feature vector to a hash value in each subspace;
(53) selecting from the distributed index database, through said inverted index files, the video files whose similarity meets the system preset requirement as output;
(54) computing the spatio-temporal structure consistency of the key feature vectors of the video to be retrieved and of each video in the video library, and outputting the video files whose similarity with said video to be retrieved exceeds the system preset value.
CN201410245315.2A 2014-06-04 2014-06-04 The method for realizing massive video quick-searching Active CN104050247B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410245315.2A CN104050247B (en) 2014-06-04 2014-06-04 The method for realizing massive video quick-searching


Publications (2)

Publication Number Publication Date
CN104050247A CN104050247A (en) 2014-09-17
CN104050247B true CN104050247B (en) 2017-08-08

Family

ID=51503079

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410245315.2A Active CN104050247B (en) 2014-06-04 2014-06-04 The method for realizing massive video quick-searching

Country Status (1)

Country Link
CN (1) CN104050247B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107943903A (en) * 2017-11-17 2018-04-20 广州酷狗计算机科技有限公司 Video retrieval method and device, computer equipment, storage medium

Families Citing this family (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104504121A (en) * 2014-12-29 2015-04-08 北京奇艺世纪科技有限公司 Video retrieval method and device
CN104504162B (en) * 2015-01-21 2018-12-04 北京智富者机器人科技有限公司 A kind of video retrieval method based on robot vision platform
US9740775B2 (en) * 2015-03-13 2017-08-22 TCL Research America Inc. Video retrieval based on optimized selected fingerprints
CN105095435A (en) * 2015-07-23 2015-11-25 北京京东尚科信息技术有限公司 Similarity comparison method and device for high-dimensional image features
CN108780457A (en) * 2016-02-09 2018-11-09 开利公司 Multiple queries are executed in steady video search and search mechanism
CN106156284B (en) * 2016-06-24 2019-03-08 合肥工业大学 Extensive nearly repetition video retrieval method based on random multi-angle of view Hash
CN107748750A (en) * 2017-08-30 2018-03-02 百度在线网络技术(北京)有限公司 Similar video lookup method, device, equipment and storage medium
CN109857908B (en) * 2019-03-04 2021-04-09 北京字节跳动网络技术有限公司 Method and apparatus for matching videos
CN110032652B (en) * 2019-03-07 2022-03-25 腾讯科技(深圳)有限公司 Media file searching method and device, storage medium and electronic device
CN110188098B (en) * 2019-04-26 2021-02-19 浙江大学 High-dimensional vector data visualization method and system based on double-layer anchor point map projection optimization
CN110275983B (en) * 2019-06-05 2022-11-22 青岛海信网络科技股份有限公司 Retrieval method and device of traffic monitoring data
CN111294613A (en) * 2020-02-20 2020-06-16 北京奇艺世纪科技有限公司 Video processing method, client and server
CN111507260B (en) * 2020-04-17 2022-08-05 重庆邮电大学 Video similarity rapid detection method and detection device
CN113821704B (en) * 2020-06-18 2024-01-16 华为云计算技术有限公司 Method, device, electronic equipment and storage medium for constructing index
CN112699348A (en) * 2020-12-25 2021-04-23 中国平安人寿保险股份有限公司 Method and device for verifying nuclear body information, computer equipment and storage medium
CN112861609B (en) * 2020-12-30 2024-04-09 中国电子科技集团公司信息科学研究院 Multithreading content key frame identification efficiency improvement method
CN113779303B (en) * 2021-11-12 2022-02-25 腾讯科技(深圳)有限公司 Video set indexing method and device, storage medium and electronic equipment
CN115630191B (en) * 2022-12-22 2023-03-28 成都纵横自动化技术股份有限公司 Time-space data set retrieval method and device based on full-dynamic video and storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101311947A (en) * 2008-06-12 2008-11-26 浙江大学 Real time intelligent control method based on natural video frequency
CN102436487A (en) * 2011-11-03 2012-05-02 北京电子科技学院 Optical flow method based on video retrieval system
CN102999640A (en) * 2013-01-09 2013-03-27 公安部第三研究所 Video and image retrieval system and method based on semantic reasoning and structural description

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1996017313A1 (en) * 1994-11-18 1996-06-06 Oracle Corporation Method and apparatus for indexing multimedia information streams
JP2002007479A (en) * 2000-06-22 2002-01-11 Ntt Communications Kk Retrieving information displaying method, information retrieving system, retrieving server and recording medium of program for the server



Also Published As

Publication number Publication date
CN104050247A (en) 2014-09-17

Similar Documents

Publication Publication Date Title
CN104050247B (en) The method for realizing massive video quick-searching
CN108920720B (en) Large-scale image retrieval method based on depth hash and GPU acceleration
Jégou et al. On the burstiness of visual elements
Murray et al. A deep architecture for unified aesthetic prediction
CN104376003B (en) A kind of video retrieval method and device
CN105095435A (en) Similarity comparison method and device for high-dimensional image features
CN103336957B (en) A kind of network homology video detecting method based on space-time characteristic
Wang et al. Compact CNN based video representation for efficient video copy detection
CN103186538A (en) Image classification method, image classification device, image retrieval method and image retrieval device
Liu et al. Deepindex for accurate and efficient image retrieval
CN111460961B (en) Static video abstraction method for CDVS-based similarity graph clustering
CN110059206A (en) A kind of extensive hashing image search method based on depth representative learning
CN110222218A (en) Image search method based on multiple dimensioned NetVLAD and depth Hash
CN104036012A (en) Dictionary learning method, visual word bag characteristic extracting method and retrieval system
CN102890700A (en) Method for retrieving similar video clips based on sports competition videos
CN107291825A (en) With the search method and system of money commodity in a kind of video
CN110046251A (en) Community content methods of risk assessment and device
CN102385592A (en) Image concept detection method and device
CN110502664A (en) Video tab indexes base establishing method, video tab generation method and device
CN112329460A (en) Text topic clustering method, device, equipment and storage medium
CN112434553A (en) Video identification method and system based on deep dictionary learning
CN109086830B (en) Typical correlation analysis near-duplicate video detection method based on sample punishment
Huang et al. Cross-modal deep metric learning with multi-task regularization
CN105183845A (en) ERVQ image indexing and retrieval method in combination with semantic features
CN108389113A (en) A kind of collaborative filtering recommending method and system

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 200433, 101-10 floor, floor 1, building 127, Cathay Road, Cathay Road, Shanghai, Yangpu District

Applicant after: SHANGHAI CERTUSNET INFORMATION TECHNOLOGY CO., LTD.

Address before: 200433, room 1301, Fudan Science and technology building, 11 Guotai Road, Shanghai, Yangpu District

Applicant before: Shanghai Meiqi Puyue Communication Technology Co., Ltd.

COR Change of bibliographic data
GR01 Patent grant
GR01 Patent grant