CN108304409A - Carry-based data frequency estimation method using a Sketch data structure - Google Patents

Carry-based data frequency estimation method using a Sketch data structure

Info

Publication number
CN108304409A
Authority
CN
China
Prior art keywords
value
count bits
query
marker bit
counter
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201710024141.0A
Other languages
Chinese (zh)
Other versions
CN108304409B (en)
Inventor
杨仝
姜雨萌
李晓明
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Peking University
Original Assignee
Peking University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Peking University
Priority to CN201710024141.0A
Publication of CN108304409A
Application granted
Publication of CN108304409B
Legal status: Active

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24: Querying
    • G06F16/245: Query processing
    • G06F16/2455: Query execution

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention relates to a carry-based data frequency estimation method using a Sketch data structure. The method comprises: 1) building a Sketch data structure, a two-dimensional array of counters in which each position is an n-bit counter whose n bits are divided into marker bits and count bits; 2) on an update operation, mapping the data item into the two-dimensional array with hash functions, counting with the count bits, and carrying into the marker bits when the count bits reach their upper limit; 3) on a query operation, returning the minimum of the per-row query values in the two-dimensional array as the query result. The method can use either fixed marker bits or multi-level dynamic marker bits. With the counter size unchanged, the invention significantly raises the counting upper limit and can improve counting accuracy.

Description

Carry-based data frequency estimation method using a Sketch data structure
Technical field
The present invention applies to several important fields, including network security, financial analysis, machine learning, and natural language processing. Specifically, it is a carry-based data frequency estimation method using a Sketch data structure.
Background
Currently, the Count-Min Sketch (Graham Cormode, S. Muthukrishnan. An Improved Data Stream Summary: The Count-Min Sketch and Its Applications [M]) is the most widely used, best performing, and most general Sketch across many kinds of data. It is lightweight, its real-time counting is simple and fast, it scales well, and its storage and computational complexity are both low.
However, as a lightweight data structure that is even used on GPUs (Y. Wang, Y. Zu, et al. Wire speed name lookup: A GPU-based approach. In Proc. USENIX NSDI, pages 199–212, 2013), the Count-Min Sketch still has significant performance limitations. For example, its query accuracy is sensitive to the amount of space used, so a space limit can severely constrain accuracy. Its data structure design is also quite simple, which leaves the maximum count it can store extremely limited.
Summary of the invention
To overcome the primitive counting scheme of the existing Count-Min Sketch, the present invention provides a counting method that raises the upper limit of the value range that a fixed number of bits can express.
The technical solution adopted by the present invention is as follows:
A carry-based data frequency estimation method using a Sketch data structure, comprising the following steps:
1) Build a Sketch data structure: a two-dimensional array of counters in which each position is an n-bit counter, with the n bits of each counter divided into marker bits and count bits.
2) On an update operation, map the data item into the two-dimensional array with hash functions; count with the count bits, and carry into the marker bits when the count bits reach their upper limit.
3) On a query operation, return the minimum of the per-row query values in the two-dimensional array as the query result.
Further, step 1) may use fixed marker bits: the high x bits of the n-bit counter serve as marker bits and the remaining n-x bits serve as count bits.
Alternatively, step 1) may use multi-level dynamic marker bits: the number of marker bits and the number of count bits are adjusted dynamically according to the stored value.
A query string frequency statistics method, comprising the following steps:
1) Use the Sketch data structure of claim 1 to record the number of occurrences of the search string used in each user search.
2) For each query string, obtain the query value of its occurrence count from the Sketch data structure, and from these obtain the k query strings with the most occurrences.
Further, if the query value obtained for a query string in step 2) is not large enough to rank among the k most frequent query strings, there is no need to fetch its true value from the off-chip hash table.
The beneficial effects of the present invention are as follows:
With the counter size unchanged, the present invention significantly raises the counting upper limit. Conversely, if the counting upper limit is kept unchanged, smaller and therefore more counters can be used, which improves counting accuracy. Because the invention is an improvement on the Count-Min Sketch, it suits every scenario where a Count-Min Sketch is used, including natural language processing, data stream statistics, computing pointwise mutual information, sparse approximation in compressed sensing, detection of anomalous network flows, and processing distributed data sets.
Brief description of the drawings
Fig. 1 is a schematic diagram of the update operation in the fixed marker bit version of the present invention.
Fig. 2 is a structural diagram of a counter in the multi-level dynamic marker bit version, showing how marker bits and count bits are distinguished.
Detailed description of the embodiments
To make the above objects, features, and advantages of the present invention clearer, the invention is further described below through specific embodiments and the drawings.
The technical solution of the present invention has 2 versions:
1. Carry-In Sketch: fixed marker bit version
1) Data structure
The Carry-In Sketch uses the same layout as the CM Sketch: a two-dimensional array of counters with width w and height d, C[1,1] … C[d,w]. Each position is an n-bit counter initialized to 0. Within these n bits, the high x (x < n) bits serve as marker bits and the remaining (n-x) bits serve as count bits. In addition, we need d pairwise independent hash functions h1 … hd : {1 … ∞} → {1 … w}, where each hash function maps any positive integer to a value from 1 to w. We also define an expansion coefficient m.
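The counter layout described above can be sketched in a few lines of Python. This is only an illustrative sketch: the helper names `split_counter`, `pack_counter`, and `make_sketch`, and the default parameters n = 8 and x = 2, are assumptions chosen for the example and do not come from the patent text.

```python
def split_counter(raw: int, n: int = 8, x: int = 2) -> tuple[int, int]:
    """Split an n-bit counter into (marker, count): high x bits and low n-x bits."""
    count_bits = n - x
    marker = raw >> count_bits
    count = raw & ((1 << count_bits) - 1)
    return marker, count


def pack_counter(marker: int, count: int, n: int = 8, x: int = 2) -> int:
    """Inverse of split_counter: place the marker in the high x bits."""
    count_bits = n - x
    if not (0 <= marker < (1 << x) and 0 <= count < (1 << count_bits)):
        raise ValueError("marker or count does not fit in its field")
    return (marker << count_bits) | count


def make_sketch(d: int, w: int) -> list[list[int]]:
    """A d-row, w-column array of counters, all initialized to 0."""
    return [[0] * w for _ in range(d)]
```

For instance, with n = 8 and x = 2 the raw byte `0b10000101` decodes to marker 2 and count 5.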
2) Operations
Update: When an update request (k, c) arrives, the element with key k must be inserted into the Sketch c times. Each insertion performs one insert operation on every row of the Carry-In Sketch. For row r, we first locate the element's position with hr(k), the value to which the r-th hash function maps k. We then perform the insert operation on position C[r, hr(k)], using the marker bits as the carry. While the marker bits of a counter are 0, an insert simply adds 1 to the count bits. When the count bits reach their upper limit 2^(n-x), we add 1 to the marker bits and reset the count bits to 0. From then on, each insertion adds 1 to the count bits with probability 1/m. Whenever the count bits reach their upper limit again, we repeat the carry: add 1 to the marker bits and reset the count bits to 0.
Fig. 1 is a schematic diagram of the update operation in the fixed marker bit version of the present invention. There are 4 rows here; each row uses hr(k) to find the counter corresponding to the element and operates on it. While the marker bits are 0, every insertion adds 1 to the count bits; otherwise each insertion adds 1 to the count bits with probability 1/m.
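A single-counter version of this update rule might look like the sketch below. It keeps the marker and count as separate integers for readability. The function name `insert_once` is hypothetical, and two points are assumptions rather than statements from the text: the increment probability after the first carry is taken to be 1/m (consistent with the query formula, which weights each count by m), and a counter whose marker bits are already at their maximum simply saturates.

```python
import random


def insert_once(marker: int, count: int, n: int = 8, x: int = 2,
                m: int = 16, rng=random) -> tuple[int, int]:
    """Apply one insertion to a fixed-marker counter and return the new state."""
    limit = 1 << (n - x)          # upper limit 2^(n-x) of the count bits
    max_marker = (1 << x) - 1
    # While the marker is 0 every insertion counts; afterwards only about
    # one insertion in m does (assumed probability 1/m).
    if marker > 0 and rng.random() >= 1.0 / m:
        return marker, count
    count += 1
    if count == limit:
        if marker < max_marker:
            return marker + 1, 0  # carry into the marker bits
        return marker, limit - 1  # assumption: saturate at the maximum
    return marker, count
```

With n = 8 and x = 2, the first 64 insertions are exact and leave the counter at marker 1, count 0; later insertions are sampled.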
Query: When a query request for key k arrives, we compute, for each of C[1,h1(k)], C[2,h2(k)], …, C[d,hd(k)]:
if value(x marker bits) = 0:
    value(C[i,hi(k)]) = value((n-x) count bits)
if value(x marker bits) > 0:
    value(C[i,hi(k)]) = 2^(n-x) + m*2^(n-x)*(value(x marker bits) - 1) + m*value((n-x) count bits)
In natural language, the content above reads as follows.
For each C[i,hi(k)], the query value is computed as follows:
1) If the value of the marker bits is 0, the value of the count bits is the query value.
2) If the marker bits are not 0, the query value has two parts:
One part comes from the count bits. In expectation, every m insertions raise the count bits by 1, so this part equals the expansion coefficient m multiplied by the value of the count bits.
The other part is the value represented by the marker bits. The count bits have (n-x) bits, and by the update step above, every time the count bits grow by 2^(n-x) the marker bits increase by 1 and the count bits return to 0. Raising the marker bits from 0 to 1 therefore takes 2^(n-x) insertions. Once the marker bits are nonzero, every m insertions raise the count bits by 1 in expectation, and raising the marker bits by 1 requires the count bits to grow by 2^(n-x), so it takes m*2^(n-x) insertions. Hence, if the marker bits hold the value v, this part is 2^(n-x) + m*2^(n-x)*(v-1).
Adding the two parts gives the query value of C[i,hi(k)].
In the code above, value(element) denotes the value of an element. Finally, we return the smallest of the query values of all C[i,hi(k)] as the final query result.
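The query rule can be written directly from the two cases above. The following is a minimal sketch in which the names `counter_value` and `query` are illustrative, and the per-row counters are passed in as already-decoded (marker, count) pairs rather than looked up through hash functions.

```python
def counter_value(marker: int, count: int, n: int = 8, x: int = 2,
                  m: int = 16) -> int:
    """Estimated number of insertions represented by one counter."""
    cap = 1 << (n - x)  # 2^(n-x)
    if marker == 0:
        return count    # exact counting phase
    # 2^(n-x) insertions for the first carry, m*2^(n-x) for each later
    # carry, plus m insertions per unit currently held in the count bits.
    return cap + m * cap * (marker - 1) + m * count


def query(per_row: list[tuple[int, int]], n: int = 8, x: int = 2,
          m: int = 16) -> int:
    """Return the minimum estimate over the d counters selected for a key."""
    return min(counter_value(mk, ct, n, x, m) for mk, ct in per_row)
```

For example, with n = 8, x = 2, and m = 16, a counter holding marker 2 and count 5 is read as 64 + 16*64*1 + 16*5 = 1168.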
2. Carry-In Sketch: multi-level dynamic marker bit version
1) Data structure
The data structure of this version is still a two-dimensional array C[d,w], but the marker bits are now multi-level: the number of marker bits and the number of count bits both adjust dynamically according to the stored value. Scan from the highest bit of the counter toward the low bits until the first bit whose value is 0 is found; all bits above this 0 are marker bits, and all bits below it are count bits. As before, we define an expansion coefficient m. See Fig. 2.
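Decoding such a counter amounts to counting the leading 1 bits. A minimal sketch follows; the name `split_dynamic` is hypothetical, and the handling of a counter with no room left for count bits (returning a count of 0) is an assumption, since the text does not describe that boundary case.

```python
def split_dynamic(raw: int, n: int = 8) -> tuple[int, int]:
    """Return (x, count) for an n-bit counter.

    x is the number of leading 1 bits (the marker bits). The next bit is the
    0 separator, and the remaining n-x-1 low bits are the count bits.
    """
    x = 0
    while x < n and (raw >> (n - 1 - x)) & 1:
        x += 1
    count_bits = max(n - x - 1, 0)   # assumption: no count bits left -> 0
    count = raw & ((1 << count_bits) - 1)
    return x, count
```

For example, the byte `0b11010110` has two leading 1 bits, a 0 separator, and the five count bits `10110`, so it decodes to x = 2 and count = 22.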
2) Operations
Update: The update operation of the multi-level marker bit Carry-In Sketch differs from that of the fixed marker bit Carry-In Sketch only in how each counter is incremented. Suppose the high x bits are marker bits, so the low (n-x-1) bits are count bits. On each insertion we add 1 to the count bits with a probability determined by the expansion coefficient m and the current level. When the count bits are full and a carry is needed, we set the first 0 found during the scan to 1 and then reset all count bits to 0. The marker bits thus grow by one bit and the count bits shrink by one bit; that is, the first 0 found when scanning from high to low has moved one position to the right.
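The carry step for the multi-level version can be sketched on the raw bit pattern. The function name `insert_dynamic` is hypothetical. The increment probability used here, 1/m^x with x the current number of marker bits, is an assumption made for illustration (it is one reading under which each level is m times coarser than the previous one), and so is the choice to saturate once no count bits remain.

```python
import random


def insert_dynamic(raw: int, n: int = 8, m: int = 16, rng=random) -> int:
    """Apply one insertion to a multi-level counter and return the new bits."""
    x = 0
    while x < n and (raw >> (n - 1 - x)) & 1:
        x += 1                        # count leading 1 bits (marker bits)
    count_bits = n - x - 1
    if count_bits <= 0:
        return raw                    # assumption: saturated counter
    # Assumed sampling rule: probability 1/m^x at level x (1 when x == 0).
    if rng.random() >= 1.0 / (m ** x):
        return raw
    mask = (1 << count_bits) - 1
    if raw & mask < mask:
        return raw + 1                # room left in the count bits
    separator = 1 << count_bits       # the first 0 found during the scan
    return (raw | separator) & ~mask  # set it to 1, clear the count bits
```

With n = 4 and m = 1 (no sampling), the counter moves from `0b0000` to `0b1000` after 8 insertions, to `0b1100` after 4 more, and to `0b1110` after 2 more.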
Query: When a query for key k arrives, we compute the values of C[1,h1(k)], C[2,h2(k)], …, C[d,hd(k)] as follows:
Value = m^0*2^(n-1) + m^1*2^(n-2) + … + m^(x-1)*2^(n+1-x) + m^(x-1)*(value of counter bits)
where "value of counter bits" denotes the value of the count bits. After computing these values, the smallest is selected as the final query value.
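A decoding function for the multi-level counter might look like the sketch below. It follows the geometric pattern of the leading terms of the formula, m^i * 2^(n-1-i) for each completed level i, and weights the remaining count bits by m^x. The indexing of the final terms is an assumption: it was chosen so that the value is continuous across a carry and matches a sampling probability of 1/m^x, and it may differ from the exact indexing intended by the formula as printed. The name `dynamic_value` is illustrative.

```python
def dynamic_value(raw: int, n: int = 8, m: int = 16) -> int:
    """Estimated number of insertions represented by a multi-level counter."""
    x = 0
    while x < n and (raw >> (n - 1 - x)) & 1:
        x += 1                                  # number of marker bits
    count_bits = max(n - x - 1, 0)
    count = raw & ((1 << count_bits) - 1)
    # Level i had n-1-i count bits and counted one in every m^i insertions,
    # so filling it represents m^i * 2^(n-1-i) insertions (assumed indexing).
    completed = sum((m ** i) * (1 << (n - 1 - i)) for i in range(x))
    return completed + (m ** x) * count
```

For n = 8 and m = 16, `0b10000011` is read as 128 + 16*3 = 176, and `0b11000001` as 128 + 16*64 + 256*1 = 1408.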
The advantage of the present invention is that, with the counter size unchanged, the counting upper limit rises significantly. Taking n = 8 bits per counter and an expansion coefficient m = 16, the effect is shown in Table 1 and Table 2:
Table 1. Fixed marker bit version:
Table 2. Dynamic marker bit version:
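As a rough check on the claimed improvement, the largest value a fixed-marker counter can report follows directly from the query formula of version 1 by setting both fields to their maximum. The sketch below does this for n = 8 and m = 16; the function name `max_fixed_value` is hypothetical, and the resulting figures are derived from the formula only, not taken from the tables.

```python
def max_fixed_value(n: int, x: int, m: int) -> int:
    """Largest query value of an n-bit counter with x fixed marker bits."""
    cap = 1 << (n - x)            # 2^(n-x)
    max_marker = (1 << x) - 1
    max_count = cap - 1
    if max_marker == 0:
        return max_count          # x = 0 degenerates to a plain counter
    return cap + m * cap * (max_marker - 1) + m * max_count


if __name__ == "__main__":
    for x in (0, 1, 2, 3):
        print(f"x={x}: upper limit {max_fixed_value(8, x, 16)}")
```

Under these assumptions a plain 8-bit counter tops out at 255, while the same 8 bits with 1, 2, or 3 marker bits can report values in the thousands, at the cost of sampling error once the marker bits are nonzero.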
Application scenario:
A search engine records in log files every search string that users submit. Suppose there are currently a number of records, each corresponding to one query for some query string, and the task is to find the k most popular query strings.
The traditional approach records the number of occurrences of each query string in a hash table, then maintains a min-heap of size k and traverses all query strings, which finally yields the k query strings with the most occurrences. On top of the hash table, we can now add a prior-art Count-Min Sketch (CM Sketch for short, likewise below) to speed up processing, as follows:
First, in a practical deployment the hash table can be assumed to be very large, so it must be stored off-chip, and off-chip access is very slow compared with on-chip access. We can add a CM Sketch on-chip to record the occurrence count of each query string. The defining property of the CM Sketch is that it is small enough to fit on-chip, so access is fast: accessing a sketch takes far less time than accessing a hash table. Moreover, although the value returned by a CM Sketch query is inexact, it is never less than the true value. Therefore, if the query value that the CM Sketch returns for a query string is not large enough to rank among the top k, there is no need to fetch its true value from the off-chip hash table. This avoids one off-chip access and improves processing efficiency.
One case remains, however: the CM Sketch query value of a query string is large enough to rank among the top k, but its true value is not. We want this case to occur as rarely as possible, which requires the sketch to be as accurate as possible while consuming the same storage space. Compared with the CM Sketch, the Carry-In Sketch of the present invention makes exactly this improvement.
In a concrete implementation, the Count-Min Sketch used in the scenario above is replaced with the Carry-In Sketch, whose data structure and operations are described in detail above.
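The filtering idea in this scenario can be sketched with a min-heap of size k. In the sketch below, `sketch_estimate` stands in for the on-chip sketch query and `exact_count` for the off-chip hash table lookup; both are hypothetical callables supplied by the caller. The skip rule relies on the assumption stated in the text that the sketch never underestimates the true count.

```python
import heapq
from typing import Callable, Iterable


def top_k(candidates: Iterable[str], k: int,
          sketch_estimate: Callable[[str], int],
          exact_count: Callable[[str], int]):
    """Return (top-k as (count, key) pairs, number of off-chip lookups)."""
    heap: list[tuple[int, str]] = []   # min-heap keyed on the true count
    off_chip = 0
    for key in candidates:
        estimate = sketch_estimate(key)
        # If even the (over)estimate cannot beat the current k-th largest
        # count, the expensive off-chip lookup is skipped.
        if len(heap) == k and estimate <= heap[0][0]:
            continue
        off_chip += 1
        true = exact_count(key)
        if len(heap) < k:
            heapq.heappush(heap, (true, key))
        elif true > heap[0][0]:
            heapq.heapreplace(heap, (true, key))
    return sorted(heap, reverse=True), off_chip
```

The tighter the estimates, the more often the `continue` branch is taken, which is the efficiency argument made for replacing the CM Sketch with the Carry-In Sketch.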
Worked example:
Suppose there are 5 distinct query strings a, b, c, d, and e, with frequencies 1000, 300, 200, 1200, and 400 respectively. In the original CM Sketch, a and c map to the same position, which therefore counts 1000+200=1200; b and d map to the same position, which counts 300+1200=1500.
Now suppose we traverse these strings in the order e, d, c, b, a to find the top 3, and the 3 largest values found so far are 350, 340, and 330. We reach e: its query value 400 is large enough, so we look up its true value 400 in the hash table, and the top 3 become 400, 350, 340. We reach d: its query value is 1500, so we look up its true value 1200, and the top 3 become 1200, 400, 350. We reach c: its query value is 1200, so we look up its true value 200, which is ignored. Likewise, b is looked up and ignored. Finally we reach a and look up its true value 1000, giving a final top 3 of 1200, 1000, 400. Processing these 5 elements costs 5 hash table lookups in total, that is, 5 off-chip accesses.
With the Carry-In Sketch, the same space holds more counters, so these elements are quite likely to map to different positions. The query for c then returns 200, which is not large enough to enter the top 3, so its true value need not be looked up. The same holds for b. This saves 2 hash table lookups, that is, 2 off-chip accesses, and so improves efficiency.
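The numbers in this example can be replayed in a few lines. The frequencies, the colliding estimates, the traversal order, and the starting top 3 come from the example above; the placeholder keys "p", "q", and "r" for the three earlier strings, and the idealized collision-free estimates equal to the true counts, are assumptions made for illustration.

```python
import heapq

TRUE = {"a": 1000, "b": 300, "c": 200, "d": 1200, "e": 400}
# CM Sketch estimates when a/c collide and b/d collide.
CM_ESTIMATE = {"a": 1200, "c": 1200, "b": 1500, "d": 1500, "e": 400}
# Idealized estimates when no two strings share a counter.
NO_COLLISION_ESTIMATE = dict(TRUE)


def run(estimates: dict[str, int]) -> tuple[list[int], int]:
    """Replay the traversal e, d, c, b, a; return (top 3, off-chip lookups)."""
    heap = [(330, "r"), (340, "q"), (350, "p")]  # top 3 found earlier
    heapq.heapify(heap)
    lookups = 0
    for key in "edcba":
        if estimates[key] <= heap[0][0]:
            continue                 # cannot enter the top 3: no lookup
        lookups += 1                 # off-chip hash table access
        if TRUE[key] > heap[0][0]:
            heapq.heapreplace(heap, (TRUE[key], key))
    return sorted((count for count, _ in heap), reverse=True), lookups


if __name__ == "__main__":
    print(run(CM_ESTIMATE))
    print(run(NO_COLLISION_ESTIMATE))
```

Both runs end with the same top 3; only the number of off-chip lookups differs.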
The update step above performs an insert on all d rows. A variant instead finds the minimum among the d counters selected for the element (there may be several minima) and performs the insert only on that counter or those counters, leaving the remaining rows untouched. All other operations are unchanged. This variant applies to both versions above.
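This minimum-only variant can be sketched for the fixed marker bit version as follows. Everything named here is illustrative: the SHA-256 based `position` function is just one convenient way to obtain d hash functions, counters are stored as (marker, count) tuples instead of packed bits, the 1/m sampling probability is assumed, and a counter at its maximum is assumed to saturate.

```python
import hashlib
import random

N, X, M = 8, 2, 16            # counter width, marker bits, expansion coefficient
CAP = 1 << (N - X)            # upper limit 2^(n-x) of the count bits


def value(counter: tuple[int, int]) -> int:
    marker, count = counter
    return count if marker == 0 else CAP + M * CAP * (marker - 1) + M * count


def bump(counter: tuple[int, int], rng=random) -> tuple[int, int]:
    marker, count = counter
    if marker > 0 and rng.random() >= 1.0 / M:   # assumed probability 1/m
        return counter
    count += 1
    if count == CAP:
        if marker < (1 << X) - 1:
            return marker + 1, 0
        return marker, CAP - 1                   # assumption: saturate
    return marker, count


def position(key: str, row: int, w: int) -> int:
    digest = hashlib.sha256(f"{row}:{key}".encode()).digest()
    return int.from_bytes(digest[:8], "big") % w


def insert_min_only(sketch, key: str, rng=random) -> None:
    """Insert once, touching only the counters that currently hold the minimum."""
    cols = [position(key, r, len(sketch[r])) for r in range(len(sketch))]
    values = [value(sketch[r][c]) for r, c in enumerate(cols)]
    lowest = min(values)
    for r, c in enumerate(cols):
        if values[r] == lowest:
            sketch[r][c] = bump(sketch[r][c], rng)


def query(sketch, key: str) -> int:
    return min(value(sketch[r][position(key, r, len(sketch[r]))])
               for r in range(len(sketch)))
```

Counters that are already larger than the minimum for a key got that way through collisions with other keys, so leaving them untouched avoids inflating them further.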
The above embodiments only illustrate the technical solution of the present invention and do not limit it. A person of ordinary skill in the art may modify the technical solution or replace it with equivalents without departing from the spirit and scope of the present invention. The scope of protection of the present invention is defined by the claims.

Claims (10)

1. A carry-based data frequency estimation method using a Sketch data structure, characterized by comprising the following steps:
1) building a Sketch data structure, which is a two-dimensional array of counters in which each position is an n-bit counter, with the n bits of each counter divided into marker bits and count bits;
2) on an update operation, mapping the data item into the two-dimensional array with hash functions, counting with the count bits, and carrying into the marker bits when the count bits reach their upper limit;
3) on a query operation, returning the minimum of the per-row query values in the two-dimensional array as the query result.
2. The method of claim 1, characterized in that step 1) uses fixed marker bits: the high x bits of the n-bit counter serve as marker bits and the remaining n-x bits serve as count bits.
3. The method of claim 2, characterized in that the update operation of step 2) is performed as follows: when an update request (k, c) arrives, the element with key k is inserted into the Sketch c times, each time performing one insert operation on every row; for row r, the position of the element is first located with hr(k), the value to which the r-th hash function maps k, and the insert operation is then performed on position C[r, hr(k)], using the marker bits as the carry; while the marker bits of a counter are 0, the insert operation simply adds 1 to the count bits, until the count bits reach their upper limit 2^(n-x), at which point 1 is added to the marker bits and the count bits are reset to 0; thereafter, each insertion adds 1 to the count bits with probability 1/m, where m is the expansion coefficient; if the count bits reach their upper limit again, the previous operation is repeated: 1 is added to the marker bits and the count bits are reset to 0.
4. The method of claim 3, characterized in that, in the query operation of step 3), the query value of each C[i,hi(k)] is computed as follows:
a) if the value of the marker bits is 0, the value of the count bits is the query value;
b) if the marker bits are not 0, the query value has two parts: one part comes from the count bits and equals the expansion coefficient m multiplied by the value of the count bits; the other part is the value represented by the marker bits, which, if the marker bits hold the value v, is 2^(n-x) + m*2^(n-x)*(v-1); the two parts are added to give the query value of C[i,hi(k)], and the smallest of the query values of all C[i,hi(k)] is returned as the final query result.
5. The method of claim 1, characterized in that step 1) uses multi-level dynamic marker bits: the number of marker bits and the number of count bits are adjusted dynamically according to the stored value.
6. The method of claim 5, characterized in that, with multi-level dynamic marker bits, the counter is scanned from its highest bit toward the low bits until the first bit whose value is 0 is found; all bits above this 0 are marker bits, and all bits below it are count bits.
7. The method of claim 6, characterized in that the update operation of step 2) is performed as follows: if the high x bits are marker bits, the low n-x-1 bits are count bits; each insertion adds 1 to the count bits with a probability determined by the expansion coefficient m and the current level; when the count bits are full and a carry is needed, the first 0 found when determining the marker bits is set to 1 and all count bits are then reset to 0, so that the marker bits grow by one bit and the count bits shrink by one bit.
8. The method of claim 7, characterized in that the query operation of step 3) is performed as follows: when a query request for key k arrives, compute:
Value = m^0*2^(n-1) + m^1*2^(n-2) + … + m^(x-1)*2^(n+1-x) + m^(x-1)*(value of counter bits),
where "value of counter bits" denotes the value of the count bits; after these values are computed, the smallest is selected as the final query value.
9. A query string frequency statistics method, characterized by comprising the following steps:
1) using the Sketch data structure of claim 1 to record the number of occurrences of the search string used in each user search;
2) for each query string, obtaining the query value of its occurrence count from the Sketch data structure, and from these obtaining the k query strings with the most occurrences.
10. The method of claim 9, characterized in that, if the query value obtained for a query string in step 2) is not large enough to rank among the k most frequent query strings, there is no need to fetch its true value from the off-chip hash table.
CN201710024141.0A 2017-01-13 2017-01-13 Carry-based data frequency estimation method of Sketch data structure Active CN108304409B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710024141.0A CN108304409B (en) 2017-01-13 2017-01-13 Carry-based data frequency estimation method of Sketch data structure

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710024141.0A CN108304409B (en) 2017-01-13 2017-01-13 Carry-based data frequency estimation method of Sketch data structure

Publications (2)

Publication Number Publication Date
CN108304409A true CN108304409A (en) 2018-07-20
CN108304409B CN108304409B (en) 2021-11-16

Family

ID=62872335

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710024141.0A Active CN108304409B (en) 2017-01-13 2017-01-13 Carry-based data frequency estimation method of Sketch data structure

Country Status (1)

Country Link
CN (1) CN108304409B (en)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102882798A (en) * 2012-09-04 2013-01-16 中国人民解放军理工大学 Statistical counting method facing to backbone network flow analysis
CN103647670A (en) * 2013-12-20 2014-03-19 北京理工大学 Sketch based data center network flow analysis method
CN103763154A (en) * 2014-01-11 2014-04-30 浪潮电子信息产业股份有限公司 Network flow detection method

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
GRAHAM CORMODE 等: "An improved data stream summary: the count-min sketch and its applications", 《JOURNAL OF ALGORITHMS-COGNITION INFORMATICS AND LOGIC》 *
GRAHAM CORMODE: "Count-Min Sketch", 《ENCYCLOPEDIA OF DATABASE SYSTEMS》 *

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110532307A (en) * 2019-07-11 2019-12-03 北京大学 A kind of date storage method and querying method flowing sliding window
CN110532307B (en) * 2019-07-11 2022-05-03 北京大学 Data storage method and query method of stream sliding window
CN110535825A (en) * 2019-07-16 2019-12-03 北京大学 A kind of data identification method of character network stream
CN110830322A (en) * 2019-09-16 2020-02-21 北京大学 Network flow measuring method and system based on probability measurement data structure Sketch with approximate zero error
CN111782700A (en) * 2020-08-05 2020-10-16 中国人民解放军国防科技大学 Data stream frequency estimation method, system and medium based on double-layer structure
CN111782700B (en) * 2020-08-05 2023-08-18 中国人民解放军国防科技大学 Data stream frequency estimation method, system and medium based on double-layer structure
CN112422579A (en) * 2020-11-30 2021-02-26 福州大学 Execution body set construction method based on mimicry defense Sketch
CN112422579B (en) * 2020-11-30 2021-11-30 福州大学 Execution body set construction method based on mimicry defense Sketch
US11934401B2 (en) 2022-08-04 2024-03-19 International Business Machines Corporation Scalable count based interpretability for database artificial intelligence (AI)
CN117811951A (en) * 2024-02-29 2024-04-02 苏州大学 Network flow size measuring method based on Sketch
CN117811951B (en) * 2024-02-29 2024-05-31 苏州大学 Network flow size measuring method based on Sketch

Also Published As

Publication number Publication date
CN108304409B (en) 2021-11-16

Similar Documents

Publication Publication Date Title
CN108304409A (en) A kind of data Frequency estimation method of the Sketch data structures based on carry
CN106326361B (en) Data query method and device based on HBase database
CN106407201B (en) Data processing method and device and computer readable storage medium
CN101404032B (en) Video retrieval method and system based on contents
CN106452868A (en) Network traffic statistics implement method supporting multi-dimensional aggregation classification
CN103914463B (en) A kind of similarity retrieval method and apparatus of pictorial information
CN107046586B (en) A kind of algorithm generation domain name detection method based on natural language feature
CN106850187A (en) A kind of privacy character information encrypted query method and system
WO2021072874A1 (en) Dual array-based location query method and apparatus, computer device, and storage medium
CN106326475A (en) High-efficiency static hash table implement method and system
CN104008134B (en) Efficient storage method and system based on Hbase
WO2013143278A1 (en) Method, device and system for querying data index
CN111447292B (en) IPv6 geographical position positioning method, device, equipment and storage medium
CN103002061A (en) Method and device for mutual conversion of long domain names and short domain names
CN108763536A (en) Data bank access method and device
CN107656989B (en) Nearest Neighbor based on data distribution perception in cloud storage system
CN107784073B (en) Data query method for local cache, storage medium and server
CN108460030A (en) A kind of set element judgment method based on improved Bloom filter
CN108304404B (en) Data frequency estimation method based on improved Sketch structure
Wang et al. Rencoder: A space-time efficient range filter with local encoder
CN113297266B (en) Data processing method, device, equipment and computer storage medium
CN105302833A (en) Content based video retrieval mathematic model establishment method
CN107609089B (en) A kind of data processing method, apparatus and system
CN107294855B (en) A kind of TCP under high-performance calculation network searches optimization method
CN115391568A (en) Entity classification method, system, terminal and storage medium based on knowledge graph

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant