CN105844214A - Multi-path depth encoded information fingerprint extraction method based on bit space - Google Patents

Publication number
CN105844214A
Authority
CN
China
Prior art keywords
bit
row
chain
information
sortord
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201610119377.8A
Other languages
Chinese (zh)
Other versions
CN105844214B (en)
Inventor
杨灿
任思璇
韩国强
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
South China University of Technology SCUT
Original Assignee
South China University of Technology SCUT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by South China University of Technology SCUT
Priority to CN201610119377.8A
Publication of CN105844214A
Application granted
Publication of CN105844214B
Active
Anticipated expiration

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/12Fingerprints or palmprints
    • G06V40/1347Preprocessing; Feature extraction
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/12Fingerprints or palmprints
    • G06V40/1365Matching; Classification

Landscapes

  • Engineering & Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

The invention discloses a multi-path depth encoded information fingerprint extraction method based on bit space. The method comprises a first step of constructing a bit window; a second step of constructing a bit plane; a third step of reducing dimensions and constructing a bit chain; a fourth step of encoding the bit chain; and a fifth step of comparing fingerprints. The fourth step includes traversing the dimension-reduced bit chain and counting, in sequence, the number of consecutive 0s and the number of consecutive 1s to construct a new decimal number sequence; binarizing that sequence to obtain a new binary bit chain whose first bit is not 0; repeating the traversal and counting on the resulting bit chain; iterating until the newly obtained decimal sequence has only one element; and recording the value of that final element and the number of loop iterations as a feature value of the information fingerprint feature space. The invention has advantages such as improving the efficiency of detecting identical information.

Description

An information fingerprint extraction method using multi-path depth encoding based on bit space
Technical field
The present invention relates to computer and communication technology, and in particular to an information fingerprint extraction method using multi-path depth encoding based on bit space.
Background art
An ideal information fingerprint follows two basic principles. First, the fingerprint extracted from the original data should be as small as possible, to save the space the fingerprint itself occupies. Second, comparing fingerprints should make it possible to judge accurately whether the underlying information is identical. Fingerprint extraction and comparison techniques mainly address two classes of problem: 1) if the fingerprints differ, conclude that the content under test differs from the original content; 2) if the fingerprints agree, conclude that the content under test is identical to the original content. Existing fingerprint extraction techniques mainly include MD5 fingerprints for data; histogram, eigenvalue, and feature-sample-point fingerprints for images; and Bloom filter techniques for information comparison. These methods perform well on the first class of problem but not on the second. Because current techniques cannot determine, when fingerprints agree, whether the content actually agrees, the original information must still be compared at great computational cost, and the fingerprint comparison itself becomes redundant. New fingerprint extraction techniques are therefore needed that are better able to solve the second class of problem; this need has become more pressing since Chinese scientists successfully broke the internationally popular MD5.
On this basis, the present invention provides an information fingerprint extraction method using multi-path depth encoding based on bit space (abbreviated MDB) that meets the consistency requirement of information judgment: differing fingerprints imply that the contents certainly differ, and matching fingerprints imply that the contents are identical with high probability. To improve the efficiency of fingerprint comparison, the present invention also provides fast fingerprint comparison techniques based on the method (MDB-M1 and MDB-M2), offering new methods for fast information comparison and information retrieval. The invention is significant for the transmission and storage of big data and for redundant-content detection in fields such as the cache mechanisms of information-centric networks, content-centric networking (CCN), and grain communication.
Summary of the invention
The object of the present invention is to overcome the shortcomings and deficiencies of the prior art by providing an information fingerprint extraction method using multi-path depth encoding based on bit space, that is, a bit-space-based method for extracting and comparing information fingerprints through multi-path depth encoding (MDB).
The object of the present invention can be achieved by the following technical solution: an information fingerprint extraction method using multi-path depth encoding based on bit space, comprising the following steps:
Step 1, construct a bit window;
Step 2, construct a bit plane: re-segment the information into bit windows of a given width and arrange the windows into a bit plane;
Step 3, reduce dimensions and construct bit chains: perform dimension-reducing arrangement along multiple traversal paths to build different bit chains BC, and record the code of each arrangement;
Step 4, encode the bit chain: traverse the dimension-reduced bit chain constructed above and count, in order, the numbers of consecutive 0s and consecutive 1s to form a new decimal number sequence; binarize this decimal sequence to obtain a new binary bit chain whose first bit is non-zero; repeat the traversal and counting on the resulting bit chain, iterating until the newly obtained decimal sequence contains exactly one element; and record the value of that final element and the number of loop iterations as a feature value of the information fingerprint feature space;
Step 5, compare fingerprints.
In Step 1, the bit window is constructed from the effective bits that uniquely characterize the original information byte stream. The window size is any positive integer m not greater than the total number of bits Tb of the original information. The value of m may vary for different systems or byte streams; values that are integer powers of 2, such as m = 8, 16, 32, or 64, are typical. In Step 2, the original byte stream is split into windows of m bits each and the windows are arranged side by side one after another, with the lowest bit position 0 and the highest m-1, yielding a bit plane of m columns by n rows, where n = Tb/m. If m does not divide Tb, the remainder is either omitted or padded with 0s.
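Steps 1 and 2 can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names, the most-significant-bit-first order within each byte, and the choice of zero padding (rather than omitting the remainder) are assumptions made for the example.

```python
def bytes_to_bits(data: bytes) -> list[int]:
    """Expand a byte stream into a flat list of bits (MSB first; assumed order)."""
    return [(byte >> shift) & 1 for byte in data for shift in range(7, -1, -1)]


def build_bit_plane(data: bytes, m: int = 8) -> list[list[int]]:
    """Split the bit stream into windows of m bits and stack them as rows.

    The result has n rows and m columns. When m does not divide the total
    bit count Tb, the last window is padded with 0s (one of the two options
    the text allows).
    """
    bits = bytes_to_bits(data)
    tb = len(bits)
    if not 1 <= m <= tb:
        raise ValueError("m must be a positive integer not greater than Tb")
    n = (tb + m - 1) // m              # number of windows, rounded up
    bits += [0] * (n * m - tb)         # zero padding for the last window
    return [bits[r * m:(r + 1) * m] for r in range(n)]


plane = build_bit_plane(b"\xa5\x0f", m=4)
for row in plane:
    print(row)
```

With m = 4 the 16 input bits form a 4 x 4 plane; with m = 3 the last row would carry two padding zeros.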
In Step 3, dimension reduction is performed along the following traversal paths to build different bit chains, and the code of each arrangement is recorded:
a) With m as the fixed step size, start from the high-order position of the first row and traverse each row from its high-order column to its low-order column; the tail bit of a lower-order row is joined to the first bit of the next higher-order row. This converts the two-dimensional bit plane into a one-dimensional bit chain of length m*n, and the code value of this ordering is recorded;
b) With m as the fixed step size, start from the tail position of the first row and traverse each row from its low-order column to its high-order column; the high-order bit of a lower-order row is joined to the tail bit of the next higher-order row. This converts the two-dimensional bit plane into a one-dimensional bit chain of length m*n, and the code value of this ordering is recorded;
c) With n as the fixed step size, start from the high-order column of the first row and traverse each column from its low-order row to its high-order row; the low-order-row bit of the lower-order column is joined to the high-order-row bit of the higher-order column. This converts the two-dimensional bit plane into a one-dimensional bit chain of length n*m, and the code value of this ordering is recorded;
d) With n as the fixed step size, start from the tail column of the first row and traverse each column from its low-order row to its high-order row; the high-order-row bit of the lower-order column is joined to the low-order-row bit of the higher-order column. This converts the two-dimensional bit plane into a one-dimensional bit chain of length n*m, and the code value of this ordering is recorded;
e) With n as the fixed step size, start from the tail column of the first row and traverse that column from its low-order row to its high-order row; the high-order-row bit of the lower-order column is joined to the high-order-row bit of the higher-order column, until the lowest-order-row bit of the lower-order column is joined to the lowest-order-row bit of the higher-order column, and so on. This converts the two-dimensional bit plane into a one-dimensional bit chain of length n*m, and the code value of this ordering is recorded;
f) With n as the fixed step size, start from the first column of the last row and traverse that column from its high-order row to its low-order row; the low-order-row bit of the higher-order column is joined to the low-order-row bit of the lower-order column, until the lowest-order-row bit of the higher-order column is joined to the lowest-order-row bit of the lower-order column, and so on. This converts the two-dimensional bit plane into a one-dimensional bit chain of length n*m, and the code value of this ordering is recorded.
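The traversal paths above can be illustrated with a short sketch. The translated wording of modes a) to f) is ambiguous about which side of the plane is "high-order", so the mapping of codes 0 to 5 onto the functions below is only one possible reading: two row-wise orders, two column-wise orders, and two column-wise serpentine orders. All function and parameter names are hypothetical.

```python
def row_major(plane, reverse_columns=False):
    """Read the plane row by row; optionally read each row right to left."""
    chain = []
    for row in plane:
        chain.extend(row[::-1] if reverse_columns else row)
    return chain


def column_major(plane, reverse_columns=False, serpentine=False):
    """Read the plane column by column.

    With serpentine=True every second column is read bottom to top, so the
    end of one column joins the adjacent end of the next.
    """
    n, m = len(plane), len(plane[0])
    columns = range(m - 1, -1, -1) if reverse_columns else range(m)
    chain = []
    for index, col in enumerate(columns):
        rows = range(n - 1, -1, -1) if serpentine and index % 2 else range(n)
        chain.extend(plane[row][col] for row in rows)
    return chain


def build_chains(plane):
    """Return {ordering code: bit chain}; the code assignment is an assumption."""
    return {
        0: row_major(plane),
        1: row_major(plane, reverse_columns=True),
        2: column_major(plane),
        3: column_major(plane, reverse_columns=True),
        4: column_major(plane, serpentine=True),
        5: column_major(plane, reverse_columns=True, serpentine=True),
    }


plane = [[1, 0], [0, 1], [1, 1]]       # n = 3 rows, m = 2 columns
for code, chain in build_chains(plane).items():
    print(code, chain)
```

Every chain has length m*n regardless of the path; only the order of the bits changes, which is what lets the six chains capture different run structure from the same data.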
Step 4 comprises the following steps:
Step 41, preliminary encoding of the bit chain: apply first-bit encoding to the dimension-reduced bit chain, i.e. record the first bit of the chain; if it is 0, the code is 0; if it is 1, the code is 1. The first-bit code indicates whether the corresponding bit chain starts with 0 or with 1;
Step 42, apply simplified run-length encoding to the dimension-reduced bit chain: scanning from head to tail, count in order the number of times 0 or 1 occurs consecutively, obtaining a sequence of non-zero integers corresponding to the bit chain; set the initial coding depth value; then carry out depth encoding;
Step 43, apply depth encoding to the bit chain.
The method of depth-encoding the bit chain comprises the following steps:
Step A, binarize the non-zero integer sequence obtained, omitting the 0s that precede the first 1 in each integer's binary representation, and concatenate the results to form a new 0/1 bit chain;
Step B, count the length of this bit chain and add 1 to the loop iteration depth value;
Step C, replace the original bit chain with the newly obtained bit chain, apply simplified run-length encoding again, and count the consecutive occurrences of 0/1 to form a new non-zero integer sequence.
Steps A to C are executed in a loop until the newly obtained non-zero integer sequence has length 1, i.e. only one count value remains. This value is recorded as the termination code value, together with the final coding depth value, i.e. the number of loop iterations.
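Steps 41 to 43 and the loop over Steps A to C can be sketched as below. This is a hedged illustration: the function names are hypothetical, the depth counter starts at 1 after the first run-length pass (following the depth_of_BRLC = 1 convention used later in the text), and the max_depth guard is an added safety cap rather than part of the described method.

```python
from itertools import groupby


def run_lengths(bits):
    """Simplified run-length encoding: lengths of consecutive equal bits."""
    return [len(list(group)) for _, group in groupby(bits)]


def binarize(values):
    """Concatenate each positive integer's binary digits, no leading zeros."""
    return [int(digit) for value in values for digit in format(value, "b")]


def depth_encode(bits, max_depth=10_000):
    """Return the triple (first bit, termination code value, coding depth)."""
    if not bits:
        raise ValueError("bit chain must not be empty")
    first_bit = bits[0]                # Step 41: first-bit code
    runs = run_lengths(bits)           # Step 42: simplified run-length code
    depth = 1
    while len(runs) > 1:               # loop over Steps A to C
        if depth >= max_depth:         # safety cap (assumption, not in the text)
            raise RuntimeError("depth cap reached before termination")
        bits = binarize(runs)          # Step A
        depth += 1                     # Step B
        runs = run_lengths(bits)       # Step C
    return first_bit, runs[0], depth


print(depth_encode([1, 1, 0, 0]))      # (1, 4, 3)
```

For the chain 1100 the intermediate states are runs [2, 2], chain 1010, runs [1, 1, 1, 1], chain 1111, runs [4], which gives termination value 4 at depth 3.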
Steps 41 to 43 are repeated for the bit chains constructed by the different orderings, until a triple <first-bit value, termination code value, coding depth value> has been obtained for each of the six orderings, i.e. the triples constructed from the first, second, third, fourth, fifth, and sixth orderings. The feature space formed by these six triples is the fingerprint of the information.
The bit-by-bit comparison method for the fingerprints so generated comprises the following steps:
(1) First-bit comparison: for the same bit chain construction mode, compare the corresponding first-bit values first. If any pair of corresponding first-bit values differs, the two fingerprints do not match and the comparison ends; if they are identical, go to the next step;
(2) Code value comparison: if the first-bit values are identical, go on to compare the corresponding final code values. If any pair of corresponding final code values differs, the two fingerprints do not match and the comparison ends; if they are identical, go to the next step;
(3) Coding depth comparison: if both the first-bit values and the final code values are identical, go on to compare the corresponding coding depth values, i.e. the iteration counts. If any pair of corresponding coding depth values differs, the two fingerprints do not match and the comparison ends; if they are identical, the two fingerprints are considered to match.
The joint unified comparison of the fingerprints so generated proceeds as follows: the triple constructed from the first ordering is compared by the method above; if it matches, the triple constructed from the second ordering is compared in the same way, and so on through the sixth ordering. If the triples of all six orderings, formed from paths constructed under the same ordering modes, agree, then the fingerprints agree and the information is considered identical; otherwise the information is considered different.
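The staged comparison described above can be sketched as follows. The Triple type and the function names are hypothetical; the sketch assumes that both fingerprints list their triples in the same ordering-mode sequence.

```python
from typing import NamedTuple, Sequence


class Triple(NamedTuple):
    first_bit: int      # first-bit value
    final_value: int    # termination code value
    depth: int          # coding depth value


def triples_match(a: Triple, b: Triple) -> bool:
    """Compare first bit, then code value, then depth, stopping at a mismatch."""
    if a.first_bit != b.first_bit:
        return False
    if a.final_value != b.final_value:
        return False
    return a.depth == b.depth


def fingerprints_match(fa: Sequence[Triple], fb: Sequence[Triple]) -> bool:
    """Walk the orderings in sequence; all() stops at the first mismatch."""
    if len(fa) != len(fb):
        return False
    return all(triples_match(x, y) for x, y in zip(fa, fb))


original = [Triple(1, 4, 3)] * 6
tampered = original[:5] + [Triple(1, 4, 2)]
print(fingerprints_match(original, original))   # True
print(fingerprints_match(original, tampered))   # False
```

Checking the cheapest field first means most non-matching pairs are rejected after a single one-bit comparison.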
The object of the present invention can also be achieved by the following technical solution: an information fingerprint extraction method using multi-path depth encoding based on bit space, comprising the following steps:
S1, construct the bit window: the bit window is constructed from the effective bits that uniquely characterize the original information byte stream. The bit window wb of the byte stream is m bits wide, where m is not greater than the total number of bits Tb of the original information, giving n = ceil(Tb/m) windows wb in total, where ceil(Tb/m) denotes Tb/m rounded up. If m does not divide Tb, the width of the last window is the remainder, i.e. Tb - (n-1)*m. For ease of operation, m is usually an integer multiple of 4 or 8.
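The window count in S1 can be checked with a small sketch. The function name is hypothetical; the last window's width is computed as the remainder Tb - (n-1)*m, which equals m when m divides Tb.

```python
def window_widths(tb: int, m: int) -> list[int]:
    """Widths of the n = ceil(Tb/m) bit windows; only the last may be shorter."""
    if not 1 <= m <= tb:
        raise ValueError("m must be a positive integer not greater than Tb")
    n = (tb + m - 1) // m          # ceil(Tb / m) in integer arithmetic
    last = tb - (n - 1) * m        # remainder width of the final window
    return [m] * (n - 1) + [last]


print(window_widths(20, 8))        # [8, 8, 4]
```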
S2, construct the bit plane: arrange the byte stream windows wb, each m bits wide, side by side one after another, with the lowest bit position 0 and the highest m-1, obtaining a bit plane of m columns by n rows.
S3, reduce dimensions and construct bit chains: perform dimension-reducing arrangement along multiple traversal paths to build different bit chains BC, and record the code of each arrangement:
a) With m as the fixed step size, start from the high-order position of the first row and traverse each row from its high-order column to its low-order column; the tail bit of a lower-order row is joined to the first bit of the next higher-order row. This converts the two-dimensional bit plane into a one-dimensional bit chain BC0 of length m*n. The code value of this ordering mode RM is recorded as 0;
b) With m as the fixed step size, start from the tail position of the first row and traverse each row from its low-order column to its high-order column; the high-order bit of a lower-order row is joined to the tail bit of the next higher-order row. This converts the two-dimensional bit plane into a one-dimensional bit chain BC1 of length m*n. The code value of this ordering mode RM is recorded as 1;
c) With n as the fixed step size, start from the high-order column of the first row and traverse each column from its low-order row to its high-order row; the low-order-row bit of the lower-order column is joined to the high-order-row bit of the higher-order column. This converts the two-dimensional bit plane into a one-dimensional bit chain BC2 of length n*m. The code value of this ordering mode RM is recorded as 2;
d) With n as the fixed step size, start from the tail column of the first row and traverse each column from its low-order row to its high-order row; the high-order-row bit of the lower-order column is joined to the low-order-row bit of the higher-order column. This converts the two-dimensional bit plane into a one-dimensional bit chain BC3 of length n*m. The code value of this ordering mode RM is recorded as 3;
e) With n as the fixed step size, start from the tail column of the first row and traverse that column from its low-order row to its high-order row; the high-order-row bit of the lower-order column is joined to the high-order-row bit of the higher-order column, until the lowest-order-row bit of the lower-order column is joined to the lowest-order-row bit of the higher-order column, and so on. This converts the two-dimensional bit plane into a one-dimensional bit chain BC4 of length n*m. The code value of this ordering mode RM is recorded as 4;
f) With n as the fixed step size, start from the first column of the last row and traverse that column from its high-order row to its low-order row; the low-order-row bit of the higher-order column is joined to the low-order-row bit of the lower-order column, until the lowest-order-row bit of the higher-order column is joined to the lowest-order-row bit of the lower-order column, and so on. This converts the two-dimensional bit plane into a one-dimensional bit chain BC5 of length n*m. The code value of this ordering mode RM is recorded as 5.
S4, first-bit encoding: apply first-bit encoding Fb to the bit chain, i.e. record the first bit of the chain; if it is 0, the code is 0; if it is 1, the code is 1. The first-bit code indicates whether the corresponding bit chain starts with 0 or with 1.
S5, simplified run-length encoding: apply simplified run-length encoding to the bit chain formed; scanning from head to tail, count in order the number of consecutive occurrences of 0 or 1, obtaining a non-zero integer sequence BRLC corresponding to the bit chain, and set the BRLC depth value depth_of_BRLC = 1;
S6, depth encoding: binarize the sequence BRLC obtained, omitting the 0s that precede the first 1 in each integer's binary representation, to form a new 0/1 bit chain NBC; count the length Length_of_BRLC of this bit chain; increment the loop iteration depth value, depth_of_BRLC += 1; then apply simplified run-length encoding to the new bit chain NBC, counting the consecutive occurrences of 0/1 to form the new BRLC.
S7, repeat step S6 until the newly obtained Length_of_BRLC is 1.
S8, record depth_of_BRLC at loop termination and the final BRLC value obtained.
S9, fingerprint extraction: repeat steps S3 to S8 for each of the six bit chains constructed from the original information in step S3, obtaining six triples <Fb, BRLC, depth_of_BRLC>; these six triples (6*3 feature elements) constitute the fingerprint of the information.
For fingerprint comparison, MDB provides two methods: method one (MDB-M1) is bit-by-bit comparison, and method two (MDB-M2) is joint unified comparison. In method one, MDB-M1, the fingerprints extracted by the process up to step S9 are compared bit by bit in the following steps:
S10, first-bit comparison: for the same bit chain construction mode RM, compare the corresponding Fb values first. If any pair of corresponding Fb values differs, the two fingerprints do not match and the comparison ends; if they are identical, go to the next step;
S11, code value comparison: if Fb is identical, go on to compare the corresponding BRLC values. If any pair of corresponding BRLC values differs, the two fingerprints do not match and the comparison ends; if they are identical, go to the next step;
S12, coding depth comparison: if BRLC is identical, go on to compare the corresponding depth_of_BRLC values. If any pair of corresponding depth_of_BRLC values differs, the two fingerprints do not match and the comparison ends; if they are identical, the two fingerprints are considered to match.
S13, if the triples constructed from the first path of step S3 all match through comparison steps S10 to S12, comparison of the triples constructed from the second path begins, and so on through the sixth path. If all agree, the information is considered identical; if there is any difference, the information is different.
Method two (MDB-M2), unified comparison, has the following technical features:
S14, the bit chain value CIFB of the compressed 6*3 information fingerprint, generated by the fingerprint extraction steps above, is compared directly bit for bit. If any bit differs, the information is considered different and the comparison ends; if all bits agree, the information is considered identical.
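The unified comparison in S14 can be sketched by serializing the six triples into one byte string and comparing the strings directly. The text does not specify how CIFB is laid out or compressed, so the fixed-width big-endian layout below (1 byte for Fb, 8 bytes for the BRLC value, 4 bytes for the depth) and all names are assumptions for illustration only.

```python
import struct

# Assumed layout per triple: unsigned char, unsigned 64-bit, unsigned 32-bit.
TRIPLE_FORMAT = ">BQI"


def pack_fingerprint(triples):
    """Serialize (Fb, BRLC value, depth) triples into one byte string."""
    return b"".join(
        struct.pack(TRIPLE_FORMAT, fb, value, depth) for fb, value, depth in triples
    )


def fingerprints_match_unified(fa, fb):
    """Compare the packed fingerprints as a whole; any differing bit is a mismatch."""
    return pack_fingerprint(fa) == pack_fingerprint(fb)


fingerprint = [(1, 4, 3), (0, 3, 2)] * 3         # six illustrative triples
print(len(pack_fingerprint(fingerprint)))        # 78 bytes under this layout
print(fingerprints_match_unified(fingerprint, fingerprint))
```

A single whole-string comparison trades the early exit of the staged method for a simpler, constant-shape operation that suits fixed-size fingerprints.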
Steps S1 to S5 above are the basic steps of the proposed information fingerprint extraction method using multi-path depth encoding based on bit space. To further improve the reliability and/or applicability of the system, S6 to S10 are advanced extension steps provided by the invention, which can be combined with one or more of the modes in basic step S3 to form bit fingerprint extraction schemes of different precision. The six modes described in S3 can also be extended to any deterministic pattern that traverses the bit space to form a bit chain, for example spiralling from the outside in, spreading from the centre outward, or following a zigzag path, without affecting the essence of the invention. Because a 0/1 bit chain contains only the two elements 0 and 1, the simplified run-length encoding proposed here needs only one unit in place of the three basic units of classical run-length encoding (symbol, repetition count, and position): the alternation of 0 and 1 and the sequential arrangement imply both the symbol and the position, so only the repetition counts need to be recorded in order when the bit chain is constructed, which saves RLC overhead. The termination condition of the cyclic BRLC encoding described in step S7 may also be a given threshold on depth_of_BRLC or on the BRLC value; this setting likewise does not affect the essence of the invention. S10 to S12 are the fingerprint comparison steps. The first-bit comparison serves as a fast fingerprint check; the size and number of the other comparison parameters have been reduced by the fingerprint extraction process above, so the amount of information compared is very small and comparison efficiency is greatly improved. At the same time, because the fingerprint extraction method involves every bit, implicitly captures bit position information and periodic variation, and uses the six-dimensional construction described in S3, the comparison can solve the second class of problem with very high probability.
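The claim that the run counts alone suffice for a 0/1 chain can be illustrated with a round trip: given the first bit and the ordered counts, the symbols follow by alternation and the positions by accumulation. The function names below are hypothetical.

```python
from itertools import groupby


def encode(bits):
    """Return (first bit, run counts); symbols and positions are implicit."""
    return bits[0], [len(list(group)) for _, group in groupby(bits)]


def decode(first_bit, runs):
    """Rebuild the chain: runs alternate between the two symbols, in order."""
    bits, symbol = [], first_bit
    for count in runs:
        bits.extend([symbol] * count)
        symbol ^= 1                    # 0 and 1 strictly alternate between runs
    return bits


chain = [0, 0, 1, 1, 1, 0]
first_bit, runs = encode(chain)
print(first_bit, runs)                 # 0 [2, 3, 1]
print(decode(first_bit, runs) == chain)
```

This single pass is lossless; the fingerprint becomes lossy only through the repeated depth iterations, which keep just the final value and the iteration count.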
In summary, the main steps of the information fingerprint extraction method using multi-path depth encoding based on bit space provided by the invention include: construction of the bit window, bit plane, and bit chains; first-bit encoding; simplified run-length encoding; depth encoding (iterated binary simplified run-length encoding); fingerprint extraction (six triples); fingerprint compression; and fingerprint comparison (first bit, code value, coding depth). The main effect of the invention is to reduce the amount of information compared and to improve comparison accuracy. It is mainly intended for extracting and comparing information fingerprints in the network and communication fields.
Adjusting the starting position and traversal mode of bit chain construction, or changing how feature values are selected, does not change the essence of the invention; such variants are substantively the same as the invention.
The invention segments the original information bit stream by a given window and reconstructs it as a bit plane; performs a 0/1 traversal count, obtaining in order the numbers of consecutive 0s and consecutive 1s to form a new decimal sequence; binarizes this sequence to obtain a new binary bit chain whose first bit is non-zero; and repeats the traversal count on the resulting chain. The iteration continues by this rule until the new decimal sequence contains exactly one element; the value of that element is recorded as the code value and the number of iterations as the coding depth value. Repeating this procedure with different bit plane construction methods and traversal modes yields a series of code values and coding depth values that together form the complete feature set of the information fingerprint. The fingerprints are compared by bit-by-bit comparison and by overall comparison.
Compared with the prior art, the invention has the following advantages and effects:
1. The invention focuses on extracting information fingerprints by multi-path depth encoding, including construction of the bit chains, first-bit encoding and simplified run-length encoding of the bit chains, and fingerprint comparison.
2. The idea behind the proposed multi-path depth encoding based on bit space is that, in a computer, any information can be stored in binary form, and in storage the 0s and 1s often occur consecutively or periodically, so the information itself carries a certain amount of redundancy during transmission. The idea of the invention is to re-encode bits with high redundancy to reduce the number of bits; when information is compared, only the extracted fingerprints need to be compared, without extensive examination of the original information, which greatly improves the efficiency of comparison. Although many methods exist for extracting and checking information fingerprints, so far there has been no research or patent on extracting and comparing information fingerprints based on bit space. The proposed method can greatly reduce redundant information and can be applied to detecting identical information during transmission. In addition, to reduce redundancy within the information, the invention proposes depth encoding, in which the information is encoded iteratively, improving the efficiency of identical-information detection.
3. The core of the proposed method is to rearrange the original information into a bit plane using bit windows, build bit chains by dimension reduction along different paths, and re-encode those chains by first-bit encoding and simplified run-length encoding. For fingerprint comparison, the two fast fingerprint detection methods proposed by the invention compare the compressed bit chains. Compared with traditional fingerprint comparison methods, this method compares less information and discriminates the original information more accurately. The method does not require an overly complex encoding scheme, can re-encode on top of an existing information fingerprint, and can be combined seamlessly with existing fingerprint extraction methods, so it has broad application prospects in the network and communication fields. It is significant for the transmission and migration of big data, and especially so for data with a large amount of redundancy.
Brief description of the drawings
Fig. 1 is a schematic diagram of the basic working principle of the MDB method of the invention.
Fig. 2 is a schematic diagram of the detailed working flow of the MDB method of the invention.
Fig. 3a shows the bit chain constructed from Magic(3) by mode a in Embodiment 1 of the invention.
Fig. 3b shows the bit chain constructed from Magic(3) by mode b in Embodiment 1 of the invention.
Fig. 3c shows the bit chain constructed from Magic(3) by mode c in Embodiment 1 of the invention.
Fig. 3d shows the bit chain constructed from Magic(3) by mode d in Embodiment 1 of the invention.
Fig. 3e shows the bit chain constructed from Magic(3) by mode e in Embodiment 1 of the invention.
Fig. 3f shows the bit chain constructed from Magic(3) by mode f in Embodiment 1 of the invention.
Fig. 4a is a workflow diagram of fingerprint comparison by the bit-by-bit comparison method of the invention.
Fig. 4b is a workflow diagram of fingerprint comparison by the overall comparison method of the invention.
Detailed description of the invention
The invention is described in further detail below with reference to an embodiment and the accompanying drawings, but the embodiments of the invention are not limited thereto.
Embodiment
Fig. 1 shows the overall flow of the proposed information fingerprint extraction method of bit-space-based multipath depth coding. First, the bit space and the bit plane are constructed. Bit chains are then constructed according to the different path selection modes. Each bit chain undergoes initial coding and depth extension coding: initial coding comprises first-bit coding and simplified run-length coding of the bit chain, and depth extension coding applies repeated iterative coding to the result of the initial coding, stopping when the length of the bit chain is 1. After fingerprint extraction, the fingerprints are compared.
Fig. 2 shows the specific fingerprint extraction method proposed by the invention. Bit chains are constructed from the bit plane under the different ordering modes, and the triples finally produced by each ordering mode are combined into a 6*3 fingerprint. This fingerprint is compressed to give the finally extracted fingerprint information, CIFB.
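The initial coding and depth extension coding described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names `run_lengths`, `rebinarize` and `encode_chain`, the convention that the depth counter starts at 0 for the initial run-length pass, and the `max_depth` safety guard are not taken from the patent. Under a different depth convention or binarization detail the resulting triples would differ, so the output need not coincide with the triples listed for the worked example.

```python
from itertools import groupby
from typing import List, Tuple


def run_lengths(chain: str) -> List[int]:
    """Simplified run-length coding: lengths of consecutive runs of equal bits."""
    return [len(list(group)) for _, group in groupby(chain)]


def rebinarize(counts: List[int]) -> str:
    """Write each count in binary without leading zeros and concatenate."""
    return "".join(format(count, "b") for count in counts)


def encode_chain(chain: str, max_depth: int = 1000) -> Tuple[int, int, int]:
    """Return (first bit, terminal value, depth) for a non-empty 0/1 string."""
    first_bit = int(chain[0])
    counts = run_lengths(chain)
    depth = 0  # assumption: the initial run-length pass counts as depth 0
    while len(counts) > 1:
        if depth >= max_depth:  # hypothetical guard, not part of the described method
            raise RuntimeError("depth limit reached")
        chain = rebinarize(counts)    # new chain always starts with 1
        counts = run_lengths(chain)   # count runs again on the new chain
        depth += 1
    return first_bit, counts[0], depth


print(encode_chain("11011"))  # (1, 4, 3)
```

For the input `11011` the intermediate count sequences are [2, 1, 2], [1, 1, 2, 1], [3, 1, 1] and [4], which is why the sketch stops after three iterations with the single value 4.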
Fig. 4a shows the specific workflow of bit-by-bit fingerprint comparison. The first bits are compared first: for the same bit chain construction mode RM, the corresponding Fb values are compared; if any pair of corresponding Fb values differs, the two fingerprints do not match and the comparison ends. If they are identical, the corresponding BRLC values are compared; if any pair of corresponding BRLC values differs, the two fingerprints do not match and the comparison ends. If they are identical, the corresponding depth_of_BRLC values are compared; if any pair of corresponding depth_of_BRLC values differs, the two fingerprints do not match and the comparison ends. If they are identical, the two fingerprints are considered to match.
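The staged comparison of Fig. 4a can be sketched as follows. The name `compare_by_field` and the representation of a fingerprint as a list of (Fb, BRLC, depth_of_BRLC) tuples, one per construction mode RM, are assumptions made for illustration; the sample values are the six triples of the worked example.

```python
from typing import List, Tuple

# One triple per construction mode RM: (Fb, BRLC, depth_of_BRLC)
Triple = Tuple[int, int, int]


def compare_by_field(fp_a: List[Triple], fp_b: List[Triple]) -> bool:
    """Compare all Fb values first, then all BRLC values, then all depths.

    Returns False at the first mismatch, so later fields are never examined
    once an earlier field has already ruled out a match.
    """
    if len(fp_a) != len(fp_b):
        return False
    for field in range(3):  # 0: Fb, 1: BRLC, 2: depth_of_BRLC
        for triple_a, triple_b in zip(fp_a, fp_b):
            if triple_a[field] != triple_b[field]:
                return False
    return True


example = [(1, 6, 8), (0, 4, 11), (1, 6, 6), (1, 3, 11), (1, 5, 7), (1, 4, 8)]
altered = example[:5] + [(1, 4, 9)]  # only the last depth value differs

print(compare_by_field(example, list(example)))  # True
print(compare_by_field(example, altered))        # False
```

Ordering the checks by field rather than by triple means that a differing first bit in any of the modes ends the comparison after at most six cheap checks.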
Fig. 4b shows the specific workflow of overall fingerprint comparison. The CIFB values, i.e. the compressed bit chains of the 6*3 information fingerprints, are compared directly bit by bit. If any bit differs, the information is considered inconsistent and the comparison ends; if all bits are identical, the information is considered consistent.
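A small sketch of the overall comparison of Fig. 4b follows. The text does not specify how the 6*3 fingerprint is compressed into CIFB, so the packing used here (one byte per value, which assumes every value is below 256) is purely a placeholder, and the names `pack_fingerprint` and `compare_whole` are hypothetical.

```python
from typing import List, Tuple


def pack_fingerprint(triples: List[Tuple[int, int, int]]) -> bytes:
    """Placeholder packing: one byte per value (assumes each value < 256)."""
    return bytes(value for triple in triples for value in triple)


def compare_whole(cifb_a: bytes, cifb_b: bytes) -> bool:
    """Compare two packed fingerprints; all() stops at the first difference."""
    return len(cifb_a) == len(cifb_b) and all(
        byte_a == byte_b for byte_a, byte_b in zip(cifb_a, cifb_b)
    )


fingerprint = [(1, 6, 8), (0, 4, 11), (1, 6, 6), (1, 3, 11), (1, 5, 7), (1, 4, 8)]
packed = pack_fingerprint(fingerprint)
other = pack_fingerprint(fingerprint[:5] + [(1, 4, 9)])

print(len(packed))                    # 18
print(compare_whole(packed, packed))  # True
print(compare_whole(packed, other))   # False
```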
In this example, the object to be processed is a standard Magic(3) matrix; for ease of description and because of page limits, the size of the matrix is taken as only 3*3 (the matrix is given as an image in the original). By dimensionality reduction, this matrix is converted into a one-dimensional sequence: (8 3 4 1 5 9 6 7 2). Binarizing the decimal elements of this sequence gives a binary 0/1 bit chain: (1000 0011 0100 0001 0101 1001 0110 0111 0010). This bit chain is segmented with a bit window of size m=4 and rearranged to obtain the bit plane (given as an image in the original). From this bit plane, six different bit chains are reconstructed using different starting positions and traversal paths:
The bit chain constructed by ordering mode RM=0, shown in Fig. 3a, is (100000110100000101011001011001110010).
The bit chain constructed by ordering mode RM=1, shown in Fig. 3b, is (000111000010100010101001011011100100).
The bit chain constructed by ordering mode RM=2, shown in Fig. 3c, is (100001000001010110010000111010111010).
The bit chain constructed by ordering mode RM=3, shown in Fig. 3d, is (010111010010000111001010110100001000).
The bit chain constructed by ordering mode RM=4, shown in Fig. 3e, is (100001000011010100010000111010111010).
The bit chain constructed by ordering mode RM=5, shown in Fig. 3f, is (000100001001010110111000010010111010).
As shown in Fig. 3a, the depth information fingerprint extraction flow of mode a of the proposed method is:
The fingerprint feature code output by the above flow is <1,6,8>;
As shown in Fig. 3b, the depth information fingerprint extraction flow of mode b of the proposed method is:
The fingerprint feature code output by the above flow is <0,4,11>;
As shown in Fig. 3c, the depth information fingerprint extraction flow of mode c of the proposed method is:
The fingerprint feature code output by the above flow is <1,6,6>;
As shown in Fig. 3d, the depth information fingerprint extraction flow of mode d of the proposed method is:
The fingerprint feature code output by the above flow is <1,3,11>;
As shown in Fig. 3e, the depth information fingerprint extraction flow of mode e of the proposed method is:
The fingerprint feature code output by the above flow is <1,5,7>;
As shown in Fig. 3f, the depth information fingerprint extraction flow of mode f of the proposed method is:
The fingerprint feature code output by the above flow is <1,4,8>;
Combining the feature codes generated by the above six modes, the information fingerprint of the original data is:
<1,6,8; 0,4,11; 1,6,6; 1,3,11; 1,5,7; 1,4,8>.
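The bit-plane construction and the six traversal orders of this example can be sketched in Python as follows. The function names `build_plane` and `traverse` are hypothetical, and the traversal rules are inferred from the six bit chains listed for RM=0 to RM=5 in this example rather than from the wording of the claims; each window of m bits is treated as one row of the plane, which is likewise an assumption about the layout.

```python
from typing import List


def build_plane(bits: str, m: int, pad: bool = True) -> List[str]:
    """Cut the bit string into windows of width m; each window is one row.

    A remainder is padded with 0s (pad=True) or dropped (pad=False).
    """
    remainder = len(bits) % m
    if remainder:
        bits = bits + "0" * (m - remainder) if pad else bits[:len(bits) - remainder]
    return [bits[i:i + m] for i in range(0, len(bits), m)]


def traverse(plane: List[str], rm: int) -> str:
    """Flatten the plane into a one-dimensional bit chain for mode rm (0..5)."""
    columns = ["".join(row[j] for row in plane) for j in range(len(plane[0]))]
    if rm == 0:  # row by row, each row left to right
        return "".join(plane)
    if rm == 1:  # row by row, each row right to left
        return "".join(row[::-1] for row in plane)
    if rm == 2:  # columns left to right, each top to bottom
        return "".join(columns)
    if rm == 3:  # columns right to left, each top to bottom
        return "".join(reversed(columns))
    if rm == 4:  # snake over columns, first column read downward
        return "".join(c if j % 2 == 0 else c[::-1] for j, c in enumerate(columns))
    if rm == 5:  # snake over columns, first column read upward
        return "".join(c[::-1] if j % 2 == 0 else c for j, c in enumerate(columns))
    raise ValueError("rm must be in 0..5")


values = [8, 3, 4, 1, 5, 9, 6, 7, 2]                 # the 1-D sequence of the example
bits = "".join(format(v, "04b") for v in values)     # 36-bit chain
plane = build_plane(bits, m=4)                       # 9 rows of 4 bits
for rm in range(6):
    print(rm, traverse(plane, rm))
```

With m=4 the six printed strings coincide with the bit chains listed above for RM=0 to RM=5.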
In this example, the bit-by-bit comparison method proceeds as follows. First, the first bits of the characteristic fingerprints to be compared are compared one by one, i.e. <1;0;1;1;1;1> is checked in order for consistency. If consistent, <6;4;6;3;5;4> is compared in order. If consistent, <8;11;6;11;7;8> is compared in order. If these are also consistent, the fingerprints are considered consistent; if any value does not match, the fingerprints are considered inconsistent.
In this example, the overall comparison method proceeds as follows. First, the eigenvalue of the information fingerprint generated by the first bit-plane construction and traversal mode (a) is compared with the corresponding eigenvalue of the fingerprint to be compared. With the results of this example, the first comparison uses <1,6,8>; if the result is consistent, the second value <0,4,11> is taken, and so on until all eigenvalues have been compared. If all are consistent, the two information fingerprints are considered consistent; if any element does not match, the two fingerprints are considered inconsistent.
The above embodiment is a preferred embodiment of the invention, but the embodiments of the invention are not limited by it. Any other change, modification, substitution, combination, or simplification made without departing from the spirit and principle of the invention shall be regarded as an equivalent replacement and is included within the scope of protection of the invention.

Claims (9)

1. An information fingerprint extraction method of multipath depth coding based on bit space, characterized by comprising the following steps:
Step 1: construct a bit window;
Step 2: construct a bit plane: re-segment the information into bit windows of a given bit width and arrange them into a bit plane;
Step 3: construct bit chains by dimensionality reduction: perform dimensionality-reducing arrangement along multiple traversal paths, build different bit chains BC, and record the code of the corresponding arrangement mode;
Step 4: encode the bit chains: traverse each dimension-reduced bit chain constructed above and count, in order, the numbers of consecutive 0s and consecutive 1s to form a new decimal sequence; binarize the resulting decimal sequence again to obtain a new binary bit chain whose first bit is non-zero; repeat the traversal and counting operation on the resulting bit chain, iterating until the new decimal sequence contains only one element; the value of the finally obtained element and the number of loop iterations are recorded as an eigenvalue of the information fingerprint feature space;
Step 5: fingerprint comparison.
2. The information fingerprint extraction method of multipath depth coding based on bit space according to claim 1, characterized in that: in step 1, the bit window is constructed from the effective bits that can uniquely characterize the original information byte stream; the size of the bit window is any positive integer m not greater than the total number of bits of the original information (Tb), and the value of m may vary for different systems or byte streams; in step 2, the original information byte stream is split into windows of width m bits, which are arranged side by side in order, with the lowest bit position being 0 and the highest being m-1, giving a bit plane of m rows * n columns; n is the quotient of Tb divided by m, and if Tb is not exactly divisible by m, the remainder is either omitted or padded with 0.
3. The information fingerprint extraction method of multipath depth coding based on bit space according to claim 1, characterized in that, in step 3, dimensionality-reducing arrangement is performed along the following traversal paths to build different bit chains, and the code of the corresponding arrangement mode is recorded:
a) taking m as the fixed step size for the above bit plane, starting from the high-order bit of the first row and proceeding from the high-order column to the low-order column of that row, the tail bit of the lower row is connected to the first bit of the higher row, the two-dimensional bit plane is converted into a one-dimensional bit chain of length m*n, and the code value of this ordering mode is recorded;
b) taking m as the fixed step size for the above bit plane, starting from the tail bit of the first row and proceeding from the low-order column to the high-order column of that row, the high-order bit of the lower row is connected to the tail bit of the higher row, the two-dimensional bit plane is converted into a one-dimensional bit chain of length m*n, and the code value of this ordering mode is recorded;
c) taking n as the fixed step size for the above bit plane, starting from the high-order column of the first row and proceeding from the low-order row to the high-order row of that column, the low-order-row bit of the lower column is connected to the high-order-row bit of the higher column, the two-dimensional bit plane is converted into a one-dimensional bit chain of length n*m, and the code value of this ordering mode is recorded;
d) taking n as the fixed step size for the above bit plane, starting from the tail column of the first row and proceeding from the low-order row to the high-order row of that column, the high-order-row bit of the lower column is connected to the low-order-row bit of the higher column, the two-dimensional bit plane is converted into a one-dimensional bit chain of length n*m, and the code value of this ordering mode is recorded;
e) taking n as the fixed step size for the above bit plane, starting from the tail column of the first row and proceeding from the low-order row to the high-order row of that column, the high-order-row bit of the lower column is connected to the high-order-row bit of the higher column, until the lowest-order-row bit of the lower column is connected to the lowest-order-row bit of the higher column, and so on; the two-dimensional bit plane is converted into a one-dimensional bit chain of length n*m, and the code value of this ordering mode is recorded;
f) taking n as the fixed step size for the above bit plane, starting from the first column of the last row and proceeding from the high-order row to the low-order row of that column, the low-order-row bit of the higher column is connected to the low-order-row bit of the lower column, until the lowest-order-row bit of the higher column is connected to the lowest-order-row bit of the lower column, and so on; the two-dimensional bit plane is converted into a one-dimensional bit chain of length n*m, and the code value of this ordering mode is recorded.
4. The information fingerprint extraction method of multipath depth coding based on bit space according to claim 1, characterized in that step 4 comprises the following steps:
Step 41: preliminary coding of the bit chain: perform first-bit coding on the dimension-reduced bit chain, i.e. record the first bit of the bit chain; if it is 0, the code is 0; if it is 1, the code is 1; the first-bit code indicates whether the starting bit of the corresponding bit chain is 0 or 1;
Step 42: perform simplified run-length coding on the dimension-reduced bit chain, i.e. count, in order from head to tail, the number of times consecutive 0s/1s occur, obtaining a non-zero integer sequence corresponding to the bit chain; set the initial coding depth value, and then perform depth coding;
Step 43: perform depth coding on the bit chain;
The method of depth coding of the bit chain comprises the following steps:
Step A: binarize the resulting non-zero integer sequence again, omitting the 0s before the first 1 of the binary number corresponding to each integer, and build a new 0/1 bit chain;
Step B: count the length of this bit chain and add 1 to the loop iteration depth value;
Step C: replace the original bit chain generated in claim 4 with the newly obtained bit chain and perform simplified run-length coding again, counting the numbers of consecutive 0s/1s to form a new non-zero integer sequence.
5. The information fingerprint extraction method of multipath depth coding based on bit space according to claim 4, characterized in that steps A to C are executed in a loop until the length of the newly obtained non-zero integer sequence is 1, i.e. only one count value remains; this value is recorded as the terminal code value, and the final coding depth value, i.e. the number of loop iterations, is recorded at the same time.
6. The information fingerprint extraction method of multipath depth coding based on bit space according to claim 3, characterized in that steps 41 to 43 are repeated for the multiple bit chains constructed by the different ordering modes until the triples <first-bit value, terminal code value, coding depth value> of the six ordering modes are obtained; the triples of the six ordering modes comprise the triple constructed by the first ordering mode, the triple constructed by the second ordering mode, the triple constructed by the third ordering mode, the triple constructed by the fourth ordering mode, the triple constructed by the fifth ordering mode, and the triple constructed by the sixth ordering mode; the feature space of the original information formed by the six triples is the fingerprint of the information.
7. The information fingerprint extraction method of multipath depth coding based on bit space according to claim 6, characterized in that the bit-by-bit comparison method for the generated fingerprint of the information comprises the following steps:
(1) first-bit comparison: for the same bit chain construction mode, the corresponding first-bit values are compared first; if any pair of corresponding first-bit values differs, the two fingerprints do not match and the comparison ends; if they are identical, proceed to the next step;
(2) code value comparison: if the first-bit values are identical, the corresponding final code values are compared; if any pair of corresponding final code values differs, the two fingerprints do not match and the comparison ends; if they are identical, proceed to the next step;
(3) coding depth value comparison: if the first-bit values and the final code values are all identical, the corresponding coding depth values, i.e. the iteration counts, are compared; if any pair of corresponding coding depth values differs, the two fingerprints do not match and the comparison ends; if they are identical, the two fingerprints are considered to match.
8. The information fingerprint extraction method of multipath depth coding based on bit space according to claim 6, characterized in that a joint unified comparison is performed on the generated fingerprints of the information; the joint unified comparison method is: the triple constructed by the first ordering mode in claim 6 is compared by the method of claim 7; if identical, the triple constructed by the second ordering mode in claim 6 is compared by the method of claim 7; if identical, the triple constructed by the third ordering mode in claim 6 is compared by the method of claim 7; if identical, the triple constructed by the fourth ordering mode in claim 6 is compared by the method of claim 7; if identical, the triple constructed by the fifth ordering mode in claim 6 is compared by the method of claim 7; if identical, the triple constructed by the sixth ordering mode in claim 6 is compared by the method of claim 7; if the triples of the six ordering modes formed by paths constructed under the same ordering modes are all consistent, the fingerprints are consistent and the information is considered consistent; otherwise the information is considered inconsistent.
9. The information fingerprint extraction method of multipath depth coding based on bit space according to claim 2, characterized in that the value of m is an integer power of 2.
CN201610119377.8A 2016-03-02 2016-03-02 A kind of information fingerprint extracting method of the multipath depth coding based on bit space Active CN105844214B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610119377.8A CN105844214B (en) 2016-03-02 2016-03-02 A kind of information fingerprint extracting method of the multipath depth coding based on bit space

Publications (2)

Publication Number Publication Date
CN105844214A true CN105844214A (en) 2016-08-10
CN105844214B CN105844214B (en) 2019-06-21

Family

ID=56586862

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610119377.8A Active CN105844214B (en) 2016-03-02 2016-03-02 A kind of information fingerprint extracting method of the multipath depth coding based on bit space

Country Status (1)

Country Link
CN (1) CN105844214B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111464187A (en) * 2020-04-17 2020-07-28 北京百瑞互联技术有限公司 Host control interface command event coding method
CN115470508A (en) * 2022-11-02 2022-12-13 北京点聚信息技术有限公司 Format file vectorization encryption method

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7098815B1 (en) * 2005-03-25 2006-08-29 Orbital Data Corporation Method and apparatus for efficient compression
CN102323934A (en) * 2011-08-31 2012-01-18 深圳市彩讯科技有限公司 Mail fingerprint extraction method based on sliding window and mail similarity judging method
CN102354354A (en) * 2011-09-28 2012-02-15 辽宁国兴科技有限公司 Information fingerprint technique based picture password generation and authentication method
CN103258156A (en) * 2013-04-11 2013-08-21 杭州电子科技大学 Method for generating secret key on basis of fingerprint characteristics
CN103425639A (en) * 2013-09-06 2013-12-04 广州一呼百应网络技术有限公司 Similar information identifying method based on information fingerprints

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
ALPER KANAK 等: "Biometric key generation with a parametric linear classifier", 《2009 IEEE 17TH SIGNAL PROCESSING AND COMMUNICATIONS APPLICATIONS CONFERENCE》 *
周星: "基于数字水印的可追踪电子文档保护系统研究与实现", 《中国优秀硕士学位论文全文数据库 信息科技辑》 *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111464187A (en) * 2020-04-17 2020-07-28 北京百瑞互联技术有限公司 Host control interface command event coding method
CN111464187B (en) * 2020-04-17 2023-04-28 北京百瑞互联技术有限公司 Host control interface command event coding method, storage medium and computer equipment
CN115470508A (en) * 2022-11-02 2022-12-13 北京点聚信息技术有限公司 Format file vectorization encryption method

Also Published As

Publication number Publication date
CN105844214B (en) 2019-06-21

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant