CN117112549B - Big data merging method based on bloom filter - Google Patents

Info

Publication number
CN117112549B
Authority
CN
China
Prior art keywords
data
bloom filter
merging
field
log data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202311365012.XA
Other languages
Chinese (zh)
Other versions
CN117112549A (en
Inventor
代颖超
张仑
梁思杰
牛威
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhongke Xingtu Measurement And Control Technology Co ltd
Original Assignee
Zhongke Xingtu Measurement And Control Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhongke Xingtu Measurement And Control Technology Co ltd
Priority to CN202311365012.XA
Publication of CN117112549A
Application granted
Publication of CN117112549B
Legal status: Active
Anticipated expiration

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21Design, administration or maintenance of databases
    • G06F16/215Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21Design, administration or maintenance of databases
    • G06F16/217Database tuning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/23Updating
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/245Query processing
    • G06F16/2455Query execution
    • G06F16/24552Database cache management
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/245Query processing
    • G06F16/2458Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60Protecting data
    • G06F21/602Providing cryptographic facilities or services
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60Protecting data
    • G06F21/62Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention discloses a big data merging method based on a bloom filter, comprising the following steps: S1, using Redis to cache, in batches, syslog data sent by different devices/hosts; S2, consuming the syslog data batch-cached in Redis, and parsing the consumed data to obtain the merge-field encrypted value of each syslog record; S3, screening the merge-field encrypted values with a bloom filter, and transferring the screened syslog data to a database. By batch-caching syslog data in Redis, parsing it to obtain merge-field encrypted values, and merging on that field, the invention removes a large amount of redundant data from the syslog data, saves storage space, lowers the cost of using the database, and improves its efficiency.

Description

Big data merging method based on bloom filter
Technical Field
The invention relates to the technical field of big data merging and storing, in particular to a big data merging method based on a bloom filter.
Background
The rapid development of the internet in recent years has brought humanity into an era of explosive growth in information. Everyone's life is filled with structured and unstructured data. As human life shifts wholesale onto the internet, the big data era is inevitable. As a leading-edge concept of the global internet, big data has two main characteristics: on the one hand, the amount of information in society as a whole has grown drastically; on the other hand, the information available to each individual has grown exponentially. From the perspective of technological development, "big data" is an inevitable product of the trend toward "datafication". As this trend deepens, we will soon live in an era in which "everything is recorded and everything is digitized".
In the big data era, the amount of data generated in every field keeps growing explosively. Data from social media, sensors, online transactions and cloud storage accumulates at a staggering rate. This data contains valuable information and insight, so storing big data efficiently and analyzing it well is increasingly urgent. Data analysis capability determines the quality, and the success or failure, of value discovery in big data. What most distinguishes data collection, analysis and storage in the big data era from earlier data analysis is the dramatic increase in data volume, which is rapidly driving up the demands on storage, querying and analysis. The big data era therefore requires efficient data processing and analysis methods, whereas the traditional pipeline, from data reception and preprocessing through to merging and storage, carries risks of redundant data, data loss, cache penetration and service downtime.
Patent document CN103116599A discloses a fast redundancy-removal method for urban massive data streams based on an improved Bloom Filter structure, and relates to removing redundant data by means of that structure. However, it focuses on removing redundancy after the data set has been stored, and does not use the Bloom Filter as a screening step.
Disclosure of Invention
The invention aims to provide a bloom filter-based big data merging method that addresses the reduced processing efficiency, data loss, cache penetration and service downtime caused by large amounts of redundant data when big data is processed.
This aim is achieved by the following technical scheme: a big data merging method based on a bloom filter, comprising the following steps:
S1, using Redis to cache, in batches, syslog data sent by different devices/hosts;
S2, consuming the syslog data batch-cached in Redis, and parsing the consumed data to obtain the merge-field encrypted value of each syslog record;
S3, screening the merge-field encrypted values of the syslog data with a bloom filter, and transferring the screened syslog data to a database.
Further, the step in S3 of screening the merge-field encrypted values of the syslog data with the bloom filter is as follows:
S31, for each syslog record passed to it, the bloom filter checks whether the corresponding merge-field encrypted value is present in the bloom filter;
S32, when the merge-field encrypted value is not present in the bloom filter, the database is searched for data with the same merge field; if such data exists it is updated, and if not the record is inserted into the database; the merge-field encrypted value is then stored in both the bloom filter and Redis;
S33, when the merge-field encrypted value is present in the bloom filter, the data with the same merge field in the database is updated;
S34, S31-S33 are repeated until all consumed syslog data has been screened.
Further, the step in S31 of checking whether the corresponding merge-field encrypted value is present in the bloom filter is as follows:
S311, the bloom filter converts the encrypted value into a hash value;
S312, the bloom filter checks the byte array positions corresponding to the hash value;
S313, if those byte array positions do not hold the hash value being compared, a null value is returned and the corresponding merge-field encrypted value is judged to be absent from the bloom filter.
Further, the step in S33 of updating the data with the same merge field in the database, when the merge-field encrypted value is present in the bloom filter, is as follows:
S331, when the merge-field encrypted value is present in the bloom filter, the merge-field encrypted values stored in Redis are queried to confirm whether the value really exists;
S332, if Redis holds data with the same merge field, that data is updated, and the data with the same merge field in the database is updated as well;
S333, if Redis holds no data with the same merge field, the record is inserted into the database.
The invention has the following beneficial effects:
1. The invention batch-caches syslog data in Redis and parses it to obtain merge-field encrypted values; merging on the merge field removes a large amount of redundant data from the syslog data, which saves storage space, lowers the cost of using the database, and improves its efficiency.
2. The invention screens the merge-field encrypted values with a bloom filter. Bloom filter screening is markedly faster than searching Redis for data with the same merge field. This fast screening keeps requests from continually hitting Redis, which would otherwise slow Redis down and cause Redis cache penetration; the bloom filter thus addresses Redis cache penetration and speeds up the removal of redundant data.
3. In addition to bloom filter screening, the invention uses Redis as a second check: re-querying the merge-field encrypted value in Redis makes the removal of redundant data more accurate and raises the removal rate.
4. The invention encrypts the data merge fields in a uniform way, which effectively guards against malicious attacks and prevents service downtime.
5. The invention uses the bloom filter to keep Redis cache penetration within a tolerable range. The bloom filter can pre-cache the primary keys used for data queries: the merge-field encrypted values are cached in the bloom filter, and when data is queried by merge-field encrypted value, the bloom filter first judges whether the value exists. If it exists, processing continues; if it does not, the query returns immediately.
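The query guard described in effect 5 can be illustrated with a minimal sketch. The function name `guarded_query`, the `stats` counters and the sample keys are hypothetical and do not come from this document; the pre-check is passed in as a callable, and an exact Python set stands in for the bloom filter purely to keep the example short.

```python
def guarded_query(key, might_contain, cache, stats):
    """Look up `key`, consulting the cache only if the pre-check says it may exist."""
    if not might_contain(key):
        stats["rejected"] += 1      # returned directly; the cache is never touched
        return None
    stats["cache_reads"] += 1
    return cache.get(key)


# Stand-in for the bloom filter: an exact set (assumption made for brevity).
known_digests = {"digest-a"}
cache = {"digest-a": {"host": "gateway-01", "count": 3}}
stats = {"rejected": 0, "cache_reads": 0}

hit = guarded_query("digest-a", known_digests.__contains__, cache, stats)
miss = guarded_query("digest-zzz", known_digests.__contains__, cache, stats)
```

Only the first call reaches the cache; the second is turned away by the pre-check, which is the behaviour that keeps requests for non-existent values away from Redis.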
Drawings
FIG. 1 is a flow chart of a bloom filter-based big data merging method of the present invention.
Detailed Description
Embodiments of the present invention are described in detail below, examples of which are illustrated in the accompanying drawings, wherein like or similar symbols indicate like or similar elements or elements having like or similar functions throughout. The embodiments described below by referring to the drawings are exemplary only for explaining the present invention and are not to be construed as limiting the present invention.
As shown in fig. 1, the invention discloses a bloom filter-based big data merging method, comprising the following steps:
S1, using Redis to cache, in batches, syslog data sent by different devices/hosts;
S2, consuming the syslog data batch-cached in Redis, and parsing the consumed data to obtain the merge-field encrypted value of each syslog record;
S3, screening the merge-field encrypted values of the syslog data with a bloom filter, and transferring the screened syslog data to a database.
Redis buffers the data with its message queue, Redis Stream. A stream stores event data as an ordered, continuously growing log sequence; each event is a message containing several fields, and new messages are appended to the tail of the stream. The Redis Stream receives the syslog data that different devices/hosts send over UDP and caches it in batches. Redis Stream also provides persistence and master-slave replication, so any client can access and consume the syslog data at any time, and the position each client has reached in the data is saved. Each client can therefore adjust its consumption rate to its own processing capacity, which ensures reliable processing and effectively prevents data loss.
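The buffering behaviour described above (an append-only log plus a saved read position per client) can be modelled in a few lines. This is an in-memory stand-in, not Redis itself; the class name `InMemoryStream` and its methods are hypothetical. In a real deployment the corresponding Redis Stream commands would be roughly `XADD` for appending and `XREADGROUP` for batch consumption.

```python
from collections import defaultdict


class InMemoryStream:
    """Append-only log with per-consumer read positions (stand-in for a Redis Stream)."""

    def __init__(self):
        self._entries = []                 # ordered, ever-growing list of messages
        self._offsets = defaultdict(int)   # consumer name -> index of next unread message

    def append(self, fields):
        """Add a message (a dict of fields) to the tail and return its sequence number."""
        self._entries.append(dict(fields))
        return len(self._entries) - 1

    def read_batch(self, consumer, count):
        """Return up to `count` unread messages for `consumer` and advance its position."""
        start = self._offsets[consumer]
        batch = self._entries[start:start + count]
        self._offsets[consumer] = start + len(batch)
        return batch


stream = InMemoryStream()
for i in range(5):
    stream.append({"host": f"device-{i}", "message": "link down"})

first = stream.read_batch("parser-1", 2)    # a slow consumer takes small batches
second = stream.read_batch("parser-1", 2)   # and resumes where it left off
```

Because each consumer's position is stored independently, a slow client can take small batches without affecting others, which is the property the paragraph relies on for reliable processing.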
A client consumes the syslog data batch-cached in the Redis Stream message queue and parses it to obtain the merge-field encrypted value of each syslog record.
The merge field is formed by extracting characteristic values from the syslog data, for example device information, time information and content information, and combining them in a unified format. Merge-field processing exposes a large amount of redundant data in the syslog data: records with identical merge fields have no additional value, yet they occupy a large amount of storage space, raise the cost of using the database, and lower its efficiency.
The parsed syslog data is encrypted on the merge field to obtain an encrypted value, which helps prevent malicious attacks.
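As a rough illustration of forming a merge field and its encrypted value, the sketch below parses one syslog line and digests the combined fields. Several points are assumptions rather than anything this document specifies: the RFC 3164-style line layout, the `|` separator, the function names, and the use of SHA-256 as the "encryption" step (the document does not name an algorithm).

```python
import hashlib
import re

# Assumed RFC 3164-style layout: "<PRI>Mmm dd hh:mm:ss host message"
SYSLOG_RE = re.compile(
    r"^<(?P<pri>\d{1,3})>"
    r"(?P<time>[A-Z][a-z]{2}\s+\d{1,2} \d{2}:\d{2}:\d{2}) "
    r"(?P<host>\S+) "
    r"(?P<content>.*)$"
)


def merge_field(line):
    """Combine device, time and content information in one unified format."""
    match = SYSLOG_RE.match(line)
    if match is None:
        raise ValueError("unrecognized syslog line")
    return "|".join((match["host"], match["time"], match["content"]))


def merge_field_digest(line):
    """Return a fixed-length digest of the merge field (SHA-256 is an assumption)."""
    return hashlib.sha256(merge_field(line).encode("utf-8")).hexdigest()


line_a = "<34>Oct 11 22:14:15 gateway-01 link down on eth0"
line_b = "<35>Oct 11 22:14:15 gateway-01 link down on eth0"  # same merge field, other PRI
digest_a = merge_field_digest(line_a)
digest_b = merge_field_digest(line_b)
```

Two records that agree on the chosen fields produce the same digest, so they can be recognized as redundant without comparing the full records.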
To screen out redundant data, the bloom filter is applied to the merge-field encrypted values of the syslog data, and the screened syslog data is then transferred to the database.
As shown in fig. 1, specifically:
S31, for each syslog record passed to it, the bloom filter checks whether the corresponding merge-field encrypted value is present in the bloom filter; the specific steps are as follows:
S311, the bloom filter converts the encrypted value into a hash value;
S312, the bloom filter checks the byte array positions corresponding to the hash value;
S313, if those byte array positions do not hold the hash value being compared, a null value is returned and the corresponding merge-field encrypted value is judged to be absent from the bloom filter.
When a request carries a merge-field encrypted value that does not exist, the bloom filter converts the encrypted value into a hash value and checks the byte array positions corresponding to that hash value. If the positions do not hold the hash value, the filter knows immediately that the value does not exist and returns a null value directly. The cost of this check is almost negligible, and it is markedly faster than searching Redis for data with the same merge field.
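Steps S311 to S313 can be sketched with a small bloom filter backed by a byte array. The class name, the default sizes, and the choice of deriving several positions from one SHA-256 digest (double hashing) are assumptions made for illustration; the document does not specify how many hash functions are used or how they are computed.

```python
import hashlib


class BloomFilter:
    """Bloom filter backed by a byte array; sizes are illustrative defaults."""

    def __init__(self, size_bits=8192, num_hashes=4):
        if size_bits <= 0 or size_bits % 8 != 0:
            raise ValueError("size_bits must be a positive multiple of 8")
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.array = bytearray(size_bits // 8)

    def _positions(self, encrypted_value):
        # S311: convert the encrypted value into hash values. Two 64-bit halves
        # of one SHA-256 digest are combined to derive `num_hashes` positions.
        digest = hashlib.sha256(encrypted_value.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # odd, hence never zero
        return [(h1 + i * h2) % self.size_bits for i in range(self.num_hashes)]

    def add(self, encrypted_value):
        """Mark every position derived from the value."""
        for pos in self._positions(encrypted_value):
            self.array[pos // 8] |= 1 << (pos % 8)

    def lookup(self, encrypted_value):
        """Return None if the value is definitely absent, True if possibly present."""
        for pos in self._positions(encrypted_value):          # S312: check positions
            if not self.array[pos // 8] & (1 << (pos % 8)):
                return None                                    # S313: null result
        return True


bloom = BloomFilter()
before = bloom.lookup("digest-a")   # nothing stored yet
bloom.add("digest-a")
after = bloom.lookup("digest-a")
```

The lookup touches only a handful of array positions regardless of how many values have been stored, which is why it is cheaper than a round trip to Redis.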
When redundant data with the same merge field is checked for duplicates, the fast screening of the bloom filter keeps requests from continually hitting Redis, which would otherwise slow Redis down and cause Redis cache penetration; the bloom filter thereby addresses the Redis cache penetration problem.
S32, when the merge-field encrypted value is not present in the bloom filter, the database is searched for data with the same merge field; if such data exists it is updated, and if not the record is inserted into the database; the merge-field encrypted value is then stored in both the bloom filter and Redis;
The merge-field encrypted values of consumed syslog data are kept synchronized between the bloom filter and Redis. If the bloom filter does not hold a merge-field encrypted value, it can be concluded that Redis does not hold it either, so the syslog record is treated as new data and written to the database.
S33, when the merge-field encrypted value is present in the bloom filter, the data with the same merge field in the database is updated; this step may proceed as follows:
S331, when the merge-field encrypted value is present in the bloom filter, the merge-field encrypted values stored in Redis are queried to confirm whether the value really exists;
S332, if Redis holds data with the same merge field, that data is updated, and the data with the same merge field in the database is updated as well;
S333, if Redis holds no data with the same merge field, the record is inserted into the database.
The bloom filter screens a merge-field encrypted value by converting it into a hash value and checking the byte array positions corresponding to that hash value, which differs from how Redis looks up the value. As a result, when the bloom filter reports that a merge-field encrypted value is present, Redis may or may not actually hold data with the same merge field. If Redis holds such data, that data is updated and the data with the same merge field in the database is updated as well; if Redis holds no such data, the record is inserted into the database.
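The case in which the bloom filter reports a value as present while Redis does not hold it corresponds to a false positive of the filter. As a point of reference, the standard textbook approximation of the false positive probability is given below. The symbols are not used in this document and are introduced here as assumptions: m is the number of positions in the array, k the number of hash functions, and n the number of stored merge-field encrypted values.

```latex
p \approx \left(1 - e^{-kn/m}\right)^{k},
\qquad
k_{\text{opt}} = \frac{m}{n}\,\ln 2
```

A stored value is never reported as absent by a standard bloom filter, which is consistent with S32 treating a negative answer as final while S331 re-checks a positive one against Redis.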
Re-querying Redis for the consumed syslog data makes the removal of redundant data more accurate and raises the removal rate.
S34, S31-S33 are repeated until all consumed syslog data has been screened.
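Taken together, S31 to S34 can be sketched as a single screening function applied to each consumed record. Everything here is an in-memory stand-in under stated assumptions: `redis_digests` is a Python set in place of the values stored in Redis, `database` is a dict keyed by merge-field encrypted value, and the compact `Bloom` class and the function name `screen` are hypothetical.

```python
import hashlib


class Bloom:
    """Compact bloom filter using a Python int as the bit array (illustrative sizes)."""

    def __init__(self, size_bits=4096, num_hashes=3):
        self.size_bits, self.num_hashes, self.mask = size_bits, num_hashes, 0

    def _positions(self, key):
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.size_bits for i in range(self.num_hashes)]

    def add(self, key):
        for pos in self._positions(key):
            self.mask |= 1 << pos

    def might_contain(self, key):
        return all((self.mask >> pos) & 1 for pos in self._positions(key))


def screen(record, digest, bloom, redis_digests, database):
    """Apply S31-S33 to one record and return the database action taken."""
    if not bloom.might_contain(digest):                  # S31 answered "absent"
        action = "update" if digest in database else "insert"   # S32: check database
        database[digest] = record
        bloom.add(digest)                                # S32: remember the value
        redis_digests.add(digest)
        return action
    if digest in redis_digests:                          # S331: confirm in Redis
        database[digest] = record                        # S332: update existing data
        return "update"
    database[digest] = record                            # S333: filter was wrong, insert
    return "insert"


bloom, redis_digests, database = Bloom(), set(), {}
records = [("first sighting", "d1"), ("repeat sighting", "d1")]
actions = [screen(rec, dig, bloom, redis_digests, database) for rec, dig in records]  # S34
```

The document does not say whether S333 should also record the value in Redis, so the sketch leaves Redis unchanged in that branch.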
The foregoing is only a preferred embodiment of the present invention, and the scope of protection of the invention is not limited to it. Any equivalent substitution or modification made by a person skilled in the art, within the technical scope disclosed by the invention and according to its technical scheme and inventive concept, shall fall within the scope of protection of the invention.
It is to be understood that the terms "center," "longitudinal," "transverse," "length," "width," "thickness," "upper," "lower," "front," "rear," "left," "right," "vertical," "horizontal," "top," "bottom," "inner," "outer," "clockwise," "counter-clockwise," "axial," "radial," "circumferential," and the like are directional or positional relationships as indicated based on the drawings, merely to facilitate describing the invention and to simplify the description, and do not indicate or imply that the devices or elements referred to must have a particular orientation, be configured and operated in a particular orientation, and therefore should not be construed as limiting the invention.
Furthermore, the terms "first," "second," and the like, are used for descriptive purposes only and are not to be construed as indicating or implying a relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defining "a first" or "a second" may explicitly or implicitly include at least one such feature. In the description of the present invention, the meaning of "plurality" means at least two, for example, two, three, etc., unless specifically defined otherwise.
In the present invention, unless explicitly specified and limited otherwise, the terms "mounted," "connected," "secured," and the like are to be construed broadly, and may be, for example, fixedly connected, detachably connected, or integrally formed; may be mechanically connected, may be electrically connected or may be in communication with each other; either directly or indirectly, through intermediaries, or both, may be in communication with each other or in interaction with each other, unless expressly defined otherwise. The specific meaning of the above terms in the present invention can be understood by those of ordinary skill in the art according to the specific circumstances.
In the present invention, unless expressly stated or limited otherwise, a first feature "up" or "down" a second feature may be the first and second features in direct contact, or the first and second features in indirect contact via an intervening medium. Moreover, a first feature being "above," "over" and "on" a second feature may be a first feature being directly above or obliquely above the second feature, or simply indicating that the first feature is level higher than the second feature. The first feature being "under", "below" and "beneath" the second feature may be the first feature being directly under or obliquely below the second feature, or simply indicating that the first feature is less level than the second feature.

Claims (3)

1. A big data merging method based on a bloom filter, characterized by comprising the following steps:
S1, using Redis to cache, in batches, syslog data sent by different devices/hosts;
S2, consuming the syslog data batch-cached in Redis, and parsing the consumed data to obtain the merge-field encrypted value of each syslog record;
S3, screening the merge-field encrypted values of the syslog data with a bloom filter, and transferring the screened syslog data to a database;
wherein the step in S3 of screening the merge-field encrypted values of the syslog data with the bloom filter is as follows:
S31, for each syslog record passed to it, the bloom filter checks whether the corresponding merge-field encrypted value is present in the bloom filter;
S32, when the merge-field encrypted value is not present in the bloom filter, the database is searched for data with the same merge field; if such data exists it is updated, and if not the record is inserted into the database; the merge-field encrypted value is then stored in both the bloom filter and Redis;
S33, when the merge-field encrypted value is present in the bloom filter, the data with the same merge field in the database is updated;
S34, S31-S33 are repeated until all consumed syslog data has been screened.
2. The bloom filter-based big data merging method of claim 1, wherein the step in S31 of checking whether the corresponding merge-field encrypted value is present in the bloom filter is as follows:
S311, the bloom filter converts the encrypted value into a hash value;
S312, the bloom filter checks the byte array positions corresponding to the hash value;
S313, if those byte array positions do not hold the hash value being compared, a null value is returned and the corresponding merge-field encrypted value is judged to be absent from the bloom filter.
3. The bloom filter-based big data merging method of claim 1, wherein the step in S33 of updating the data with the same merge field in the database, when the merge-field encrypted value is present in the bloom filter, is as follows:
S331, when the merge-field encrypted value is present in the bloom filter, the merge-field encrypted values stored in Redis are queried to confirm whether the value really exists;
S332, if Redis holds data with the same merge field, that data is updated, and the data with the same merge field in the database is updated as well;
S333, if Redis holds no data with the same merge field, the record is inserted into the database.
CN202311365012.XA 2023-10-20 2023-10-20 Big data merging method based on bloom filter Active CN117112549B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311365012.XA CN117112549B (en) 2023-10-20 2023-10-20 Big data merging method based on bloom filter

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311365012.XA CN117112549B (en) 2023-10-20 2023-10-20 Big data merging method based on bloom filter

Publications (2)

Publication Number Publication Date
CN117112549A CN117112549A (en) 2023-11-24
CN117112549B true CN117112549B (en) 2024-03-26

Family

ID=88795040

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311365012.XA Active CN117112549B (en) 2023-10-20 2023-10-20 Big data merging method based on bloom filter

Country Status (1)

Country Link
CN (1) CN117112549B (en)

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104156380A (en) * 2014-03-04 2014-11-19 深圳信息职业技术学院 Distributed memory Hash indexing method and system
KR101648317B1 (en) * 2015-12-09 2016-08-16 성균관대학교산학협력단 Method for searching data using partitioned bloom filter for supporting item elimination, cache memory apparatus and storage apparatus using the same
US10776355B1 (en) * 2016-09-26 2020-09-15 Splunk Inc. Managing, storing, and caching query results and partial query results for combination with additional query results
CN111723063A (en) * 2019-03-18 2020-09-29 北京沃东天骏信息技术有限公司 Method and device for processing offline log data
CN113342748A (en) * 2021-07-05 2021-09-03 北京腾云天下科技有限公司 Log data processing method and device, distributed computing system and storage medium
CN113392082A (en) * 2021-04-06 2021-09-14 北京沃东天骏信息技术有限公司 Log duplicate removal method and device, electronic equipment and storage medium
CN113420032A (en) * 2021-07-20 2021-09-21 奇安信科技集团股份有限公司 Classification storage method and device for logs
CN113535777A (en) * 2021-06-24 2021-10-22 上海浦东发展银行股份有限公司 Database query method, device and system
CN114003559A (en) * 2020-07-28 2022-02-01 中移(苏州)软件技术有限公司 Log access method, device and equipment and computer readable storage medium
CN116431598A (en) * 2022-04-18 2023-07-14 四川师范大学 Redis-based relational database full memory method

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2021045727A1 (en) * 2019-09-03 2021-03-11 Google Llc Systems and methods for secure identification retrieval
US11374776B2 (en) * 2019-09-28 2022-06-28 Intel Corporation Adaptive dataflow transformation in edge computing environments

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
用联盟链的布隆过滤器优化;吴亦涵等;《应用科学学报》;全文 *

Also Published As

Publication number Publication date
CN117112549A (en) 2023-11-24

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant