CN116431694A - Method, device, equipment and storage medium for monitoring mass data full link - Google Patents


Info

Publication number
CN116431694A
Authority
CN
China
Prior art keywords
data
link
index
regular
log
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310425072.XA
Other languages
Chinese (zh)
Inventor
张硕
蒋尧鹏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Kangjian Information Technology Shenzhen Co Ltd
Original Assignee
Kangjian Information Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Kangjian Information Technology Shenzhen Co Ltd
Priority to CN202310425072.XA
Publication of CN116431694A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2457 Query processing with adaptation to user needs
    • G06F16/24578 Query processing with adaptation to user needs using ranking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/215 Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2468 Fuzzy queries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2474 Sequence data queries, e.g. querying versioned data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention relates to artificial intelligence technology and discloses a full-link monitoring method for financial mass data, comprising the following steps: collecting link data, log data and index data; selecting, according to the data magnitude, a message queue slicing strategy that balances the load, and dividing the link data, the log data and the index data into sliced link data, sliced log data and sliced index data; parsing the sliced link data and the sliced log data into regular link data and regular log data, and storing the regular link data and the regular log data in a first database of a hot-warm architecture; extracting, from the regular link data and the regular log data, associated index data associated with the sliced index data; and performing secondary aggregation on the associated index data and the sliced index data to obtain aggregated index data, and storing the regular link data, the regular log data and the aggregated index data in databases of the hot-warm architecture. The invention also provides a device for monitoring a full link of mass data, an electronic device and a storage medium. The invention can improve the monitoring accuracy and maintainability of mass data.

Description

Method, device, equipment and storage medium for monitoring mass data full link
Technical Field
The present invention relates to the field of artificial intelligence technologies, and in particular, to a method and apparatus for monitoring a full link of mass data, an electronic device, and a computer readable storage medium.
Background
With the explosive growth of enterprise data volumes and the shift of enterprise technical architectures to microservices, services are split along different dimensions, and a single request often involves multiple services. Financial and medical services in particular are usually developed by different teams and deployed on different servers, sometimes across different data centers, so the state of the data (indexes, log records and trace data) along the whole link needs to be monitored to ensure normal operation of the system.
Current full-link monitoring of mass data generally uses one of two modes, full sampling or non-full sampling. The full-sampling mode uses a single sampling scheme; because the data volume is too large, processing capacity is insufficient, the system may crash and sampled data may be lost, so sampling can only be done at a low sampling rate, and the resulting samples are too few for anomalies to be monitored accurately. In the non-full-sampling mode, the index data constructed from the samples is inaccurate and cannot provide correct indicators for decisions (such as alarms), and because trace data is missing when a problem is detected, the problem cannot be resolved from the monitoring results.
In summary, current full-link monitoring methods for mass data have low monitoring accuracy and low maintainability.
Disclosure of Invention
The invention provides a method and a device for monitoring a full link of mass data and a computer readable storage medium, and mainly aims to solve the problems of low monitoring accuracy and maintainability of the mass data.
In order to achieve the above object, the present invention provides a method for monitoring a full link of mass data, comprising:
collecting link data, log data and index data;
counting the data magnitude of the link data, the log data and the index data, selecting a message queue slicing strategy according to the data magnitude balance, and dividing the link data, the log data and the index data into sliced link data, sliced log data and sliced index data according to the message queue slicing strategy;
analyzing the sliced link data and the sliced log data into regular link data and regular log data in a preset format, storing the regular link data and the regular log data into a first database of a hot-warm architecture, and displaying the regular link data and the regular log data according to a first preset dimension;
extracting associated index data associated with the sliced index data from the regular link data and the regular log data;
performing secondary aggregation on the associated index data and the fragment index data to obtain aggregated index data, and storing the aggregated index data into a second database of the hot-warm architecture;
and pushing the aggregation index data meeting the preset index threshold to a user as alarm information, and displaying the aggregation index data according to a second preset dimension.
Optionally, the selecting a message queue slicing strategy according to the data magnitude balance includes:
when the data magnitude is smaller than a first preset magnitude threshold, randomly selecting message queue fragments according to the IDs of the link data, the log data and the index data;
when the data magnitude is greater than or equal to the first preset magnitude threshold and is smaller than a second preset magnitude threshold, selecting a single message queue cluster, carrying out hash operation on the link data, the log data and the IP address of the index data to obtain an IP address hash value, and distributing message queue fragments to the single message queue cluster according to the IP address hash value;
and when the data magnitude is greater than or equal to the second preset magnitude threshold, selecting a plurality of message queue clusters, and distributing message queue fragments to the plurality of message queue clusters according to the IP address hash value.
Optionally, the parsing the sliced link data and the sliced log data into regular link data and regular log data in a preset format includes:
performing data cleaning on the sliced link data and the sliced log data to obtain target sliced link data and target sliced log data;
and analyzing the target slicing link data and the target slicing log data into regular link data and regular log data with the same format.
Optionally, the performing secondary aggregation on the associated index data and the fragment index data to obtain aggregated index data includes:
analyzing the associated index data and the source application of the fragmentation index data;
according to the source application and the IP address, performing first aggregation on the associated index data and the fragmented index data from four dimensions of the source application, the IP address, the index name and the time to obtain first aggregation index data;
and according to the source application, performing second aggregation on the first aggregation index data from the source application, the index name and the time dimension to obtain aggregation index data.
Optionally, the extracting the associated index data associated with the segment index data from the regular link data and the regular log data includes:
word segmentation and quantization processing are carried out on the regular link data and the regular log data to obtain a link data vector sequence and a log data vector sequence;
extracting keywords of the fragmentation index data;
respectively calculating the similarity of the keyword and the word vector in the link data vector sequence and the log data vector sequence;
and taking the word vector with the similarity meeting a preset similarity threshold value as associated index data.
Optionally, the storing the aggregate index data in a second database of the hot-warm architecture includes:
storing the aggregation indicator data as hot data in a hot data node in a second database;
detecting reserved duration of the aggregation indicator data in the second database;
and automatically migrating the aggregation index data with the reserved time length longer than the preset time length to a cold data node in the second database as cold data.
Optionally, the displaying the regular link data and the regular log data according to a first preset dimension includes:
performing dimension classification on the regular link data and the regular log data according to the ID, the service class and the time range respectively to obtain an ID data set, a service class data set and a time range data set;
when query data is received, analyzing the query dimension of the query data, and selecting one of the ID data set, the business class data set and the time range data set according to the query dimension;
and inquiring and displaying the inquiry result from the selected data set according to the inquiry data.
In order to solve the above problems, the present invention further provides a device for monitoring a full link of mass data, the device comprising:
the data acquisition module is used for acquiring link data, log data and index data;
the equalization segmentation module is used for counting the data magnitude of the link data, the log data and the index data, selecting a message queue segmentation strategy according to the data magnitude equalization, and dividing the link data, the log data and the index data into segmented link data, segmented log data and segmented index data according to the message queue segmentation strategy;
The analysis module is used for analyzing the slicing link data and the slicing log data into regular link data and regular log data in a preset format, storing the regular link data and the regular log data into a first database of a hot-warm architecture, and displaying the regular link data and the regular log data according to a first preset dimension;
the data aggregation module is used for extracting associated index data associated with the fragmentation index data from the regular link data and the regular log data; performing secondary aggregation on the associated index data and the fragment index data to obtain aggregated index data, and storing the aggregated index data into a second database of the hot-warm architecture;
and the information alarm module is used for pushing the aggregation index data meeting the preset index threshold value to a user as alarm information and displaying the aggregation index data according to a second preset dimension.
In order to solve the above-mentioned problems, the present invention also provides an electronic apparatus including:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores a computer program executable by the at least one processor to enable the at least one processor to perform the mass data full link monitoring method described above.
In order to solve the above-mentioned problems, the present invention also provides a computer-readable storage medium having stored therein at least one computer program that is executed by a processor in an electronic device to implement the above-mentioned mass data full-link monitoring method.
In the embodiment of the invention, a message queue slicing strategy is selected according to the data magnitude so as to balance the load, and the link data, the log data and the index data are divided into sliced link data, sliced log data and sliced index data according to that strategy. The data is therefore distributed more evenly and processing capacity is higher, data loss caused by data backlog is avoided, and the monitoring accuracy for mass data is improved. Further, associated index data associated with the sliced index data is extracted from the regular link data and the regular log data, and secondary aggregation is performed on the associated index data and the sliced index data to obtain aggregated index data. This can be adapted to the pipeline processing mechanisms of various streaming architectures, and the secondary aggregation yields instance-level data and application-level data that are independent of each other, so that the source of a problem can be located quickly when one occurs later, which improves the maintainability of mass data. Finally, the aggregated index data meeting the preset index threshold is pushed to a user as alarm information, and the regular link data, the regular log data and the aggregated index data are stored in databases of a hot-warm architecture, so that written data is handled efficiently and can be queried quickly, while data can be retained for a long time at low cost, which further improves the maintainability of mass data. Therefore, the method, the device, the electronic equipment and the computer readable storage medium for monitoring the full link of mass data can solve the problems of low accuracy and low maintainability in full-link monitoring of mass data.
Drawings
FIG. 1 is a flow chart of a method for monitoring a full link of mass data according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a detailed implementation flow of one of the steps in the method for monitoring a full link of mass data shown in FIG. 1;
FIG. 3 is a detailed flow chart of another step in the method for monitoring a full link of mass data shown in FIG. 1;
FIG. 4 is a functional block diagram of a full link monitoring device for mass data according to an embodiment of the present invention;
FIG. 5 is a schematic structural diagram of an electronic device for implementing the method for monitoring a full link of mass data according to an embodiment of the present invention.
The achievement of the objects, functional features and advantages of the present invention will be further described with reference to the accompanying drawings, in conjunction with the embodiments.
Detailed Description
It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
The embodiment of the application provides a method for monitoring a full link of mass data. The execution body of the method includes, but is not limited to, at least one of a server, a terminal and other electronic devices that can be configured to execute the method provided by the embodiment of the application. In other words, the method may be performed by software or hardware installed in a terminal device or a server device, and the software may be a blockchain platform. The server includes but is not limited to: a single server, a server cluster, a cloud server or a cloud server cluster. The server may be an independent server, or a cloud server that provides basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communications, middleware services, domain name services, security services, content delivery networks (Content Delivery Network, CDN), big data and artificial intelligence platforms.
Referring to fig. 1, a flow chart of a method for monitoring a full link of mass data according to an embodiment of the present invention is shown. In this embodiment, the method for monitoring a full link of mass data includes:
S1, collecting link data, log data and index data.
In the embodiment of the invention, the Agent can be used for collecting the link data (trace), the log data (log) and the index data (metrics).
In the embodiment of the present invention, the link data, the log data and the index data are generated by different financial or medical services, for example by the services of a transaction system.
In the embodiment of the invention, each piece of data in the link data, the log data and the index data has a corresponding ID and IP address.
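As a rough illustration of the collected data, the following minimal sketch models one collected item as a record that carries an ID and an IP address. The class names, the `app` and `payload` fields and the sample values are hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass, field
from enum import Enum
import time


class Kind(Enum):
    TRACE = "trace"      # link data
    LOG = "log"          # log data
    METRIC = "metrics"   # index data


@dataclass(frozen=True)
class Record:
    """One collected item; every record carries an ID and an IP address."""
    kind: Kind
    record_id: str
    ip: str
    app: str             # hypothetical field: name of the source application
    payload: str         # hypothetical field: raw content of the item
    timestamp: float = field(default_factory=time.time)


# Hypothetical sample record, as an Agent might emit it.
sample = Record(Kind.LOG, "id-001", "10.0.0.7", "trade-service", "order created")
```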
S2, counting the data magnitude of the link data, the log data and the index data, selecting a message queue slicing strategy according to the data magnitude balance, and dividing the link data, the log data and the index data into sliced link data, sliced log data and sliced index data according to the message queue slicing strategy.
In detail, the selecting the message queue slicing strategy according to the data magnitude equalization in S2 includes:
when the data magnitude is smaller than a first preset magnitude threshold, randomly selecting message queue fragments according to the IDs of the link data, the log data and the index data;
when the data magnitude is greater than or equal to the first preset magnitude threshold and is smaller than a second preset magnitude threshold, selecting a single message queue cluster, carrying out hash operation on the link data, the log data and the IP address of the index data to obtain an IP address hash value, and distributing message queue fragments to the single message queue cluster according to the IP address hash value;
and when the data magnitude is greater than or equal to the second preset magnitude threshold, selecting a plurality of message queue clusters, and distributing message queue fragments to the plurality of message queue clusters according to the IP address hash value.
In the embodiment of the invention, the link data, the log data and the index data are first cached in a lock-free ring queue; an independent thread then consumes the lock-free ring queue and reports the link data, the log data and the index data to a message queue.
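The buffering step can be sketched as follows. This is only an approximation under stated assumptions: pure Python has no true lock-free ring queue, so a bounded `collections.deque` (whose `append` and `popleft` are thread-safe in CPython) stands in for it, a plain list stands in for the message queue, and the overwrite-oldest behaviour when the buffer is full is a choice of this sketch rather than something the patent specifies.

```python
import threading
import time
from collections import deque


class RingBuffer:
    """Bounded buffer; the oldest item is overwritten when the buffer is full."""

    def __init__(self, capacity: int) -> None:
        self._items = deque(maxlen=capacity)

    def put(self, item) -> None:
        self._items.append(item)              # thread-safe in CPython

    def drain(self) -> list:
        """Remove and return everything currently buffered, oldest first."""
        out = []
        while True:
            try:
                out.append(self._items.popleft())
            except IndexError:                # buffer is empty
                return out


def run_reporter(buffer, report, stop, interval=0.01):
    """Consume the buffer on a dedicated thread and report each item."""
    while not stop.is_set():
        for item in buffer.drain():
            report(item)
        time.sleep(interval)
    for item in buffer.drain():               # final flush after stop is set
        report(item)


buffer = RingBuffer(capacity=1024)
sent = []                                     # stands in for the message queue
stop = threading.Event()
worker = threading.Thread(target=run_reporter, args=(buffer, sent.append, stop))
worker.start()
for i in range(100):
    buffer.put({"id": i})
stop.set()
worker.join()
```

Decoupling the producers from the reporting thread in this way keeps collection from blocking on message queue latency.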
In the embodiment of the invention, the message queue slicing strategy is selected according to the data magnitude so as to balance the load. For example, when the data magnitude in SkyWalking is less than 500 million segments/day, message queue fragments are selected randomly according to the IDs of the link data, the log data and the index data; when the data magnitude in SkyWalking is greater than or equal to 500 million segments/day and less than 10 billion segments/day, a single message queue cluster is selected, a hash operation is performed on the IP addresses of the link data, the log data and the index data to obtain an IP address hash value, and message queue fragments are allocated within the single message queue cluster according to the IP address hash value; when the data magnitude in SkyWalking is greater than or equal to 10 billion segments/day, a plurality of message queue clusters are selected, and message queue fragments are allocated across the plurality of message queue clusters according to the IP address hash value.
In the embodiment of the invention, when the data magnitude is smaller than a first preset magnitude threshold, the link data, the log data and the index data are put into the message queue fragments corresponding to the ID to obtain fragmented link data, fragmented log data and fragmented index data; and when the data magnitude is greater than or equal to the first preset magnitude threshold, the link data, the log data and the index data are put into the message queue fragments corresponding to the IP address hash value, so as to obtain fragmented link data, fragmented log data and fragmented index data.
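The three-tier strategy above can be sketched as a single routing function. The thresholds follow the SkyWalking example (500 million and 10 billion segments/day); the function names, the use of MD5 as the hash, the ID-seeded random choice and the way a hash value is split between cluster and fragment are all assumptions made for illustration.

```python
import hashlib
import random

FIRST_THRESHOLD = 500_000_000        # segments/day, from the example above
SECOND_THRESHOLD = 10_000_000_000    # segments/day, from the example above


def ip_hash(ip: str) -> int:
    """Stable hash of an IP address (MD5 is an illustrative choice)."""
    return int.from_bytes(hashlib.md5(ip.encode()).digest()[:8], "big")


def select_shard(volume, record_id, ip, clusters, shards_per_cluster):
    """Return (cluster_name, fragment_index) for one record."""
    if volume < FIRST_THRESHOLD:
        # Small volume: random fragment, seeded by the record ID so that the
        # same ID always lands on the same fragment (one possible reading).
        rng = random.Random(record_id)
        return clusters[0], rng.randrange(shards_per_cluster)
    h = ip_hash(ip)
    if volume < SECOND_THRESHOLD:
        # Medium volume: single cluster, fragment chosen by IP hash.
        return clusters[0], h % shards_per_cluster
    # Large volume: spread over several clusters, still keyed by IP hash.
    cluster = clusters[h % len(clusters)]
    return cluster, (h // len(clusters)) % shards_per_cluster
```

Keying on the IP hash keeps all data from one instance on the same fragment, which is convenient for the instance-level aggregation performed later.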
In the embodiment of the invention, the message queue slicing strategy is selected according to the data magnitude so as to balance the load, and the link data, the log data and the index data are divided into sliced link data, sliced log data and sliced index data according to that strategy. The data is therefore distributed more evenly and processing capacity is higher, data loss caused by data backlog is avoided, and the monitoring accuracy for mass data is improved.
S3, analyzing the sliced link data and the sliced log data into regular link data and regular log data in a preset format, storing the regular link data and the regular log data into a first database of a hot-warm architecture, and displaying the regular link data and the regular log data according to a first preset dimension.
In the embodiment of the invention, the first database of the hot-warm architecture is a database containing hot data nodes and warm data nodes. The hot data nodes store the hot data that users care about most, while the warm data nodes store warm or cold data of lower priority to users; this architecture is suitable for databases that store log-type data.
In detail, in S3, the parsing the sliced link data and the sliced log data into regular link data and regular log data in a preset format includes:
performing data cleaning on the sliced link data and the sliced log data to obtain target sliced link data and target sliced log data;
and analyzing the target slicing link data and the target slicing log data into regular link data and regular log data with the same format.
In the embodiment of the invention, the sliced link data and the sliced log data may be cleaned through a poll mechanism, which discards data that falls outside the expected time window.
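The cleaning and parsing steps can be illustrated with a minimal sketch. It assumes, purely for illustration, that each incoming item is a JSON line with `id`, `ip` and `ts` fields, that the expected time window is five minutes, and that malformed items are dropped; none of these details are specified in the patent.

```python
import json
from datetime import datetime, timedelta, timezone


def in_window(ts, now, window):
    """True when ts lies within [now - window, now]."""
    return now - window <= ts <= now


def normalize(raw: str, kind: str) -> dict:
    """Parse one JSON line into a common schema (field names are hypothetical)."""
    doc = json.loads(raw)
    return {
        "kind": kind,
        "id": str(doc["id"]),
        "ip": doc["ip"],
        "app": doc.get("app", "unknown"),
        "time": datetime.fromtimestamp(doc["ts"], tz=timezone.utc),
        "message": doc.get("msg") or doc.get("operation", ""),
    }


def clean_and_parse(lines, kind, now, window=timedelta(minutes=5)):
    """Drop malformed or out-of-window items and return uniform records."""
    out = []
    for line in lines:
        try:
            rec = normalize(line, kind)
        except (ValueError, KeyError, TypeError):
            continue                      # malformed input is discarded
        if in_window(rec["time"], now, window):
            out.append(rec)
    return out
```

Because link data and log data pass through the same `normalize` function, both end up in the same format, which is what allows them to be stored and queried together.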
In the embodiment of the invention, the regular link data and the regular log data from the most recent 2 days may be stored as hot data in a hot data node of the first database, and the regular link data and the regular log data older than 2 days may be automatically migrated as cold data to a cold data node of the first database.
In the embodiment of the invention, the regular link data and the regular log data are stored in the hot-warm architecture according to their different read and write requirements, so that written data is handled efficiently and can be queried quickly, while data can be retained for a long time at low cost.
Further, the displaying the regular link data and the regular log data according to the first preset dimension in S3 includes:
performing dimension classification on the regular link data and the regular log data according to the ID, the service class and the time range respectively to obtain an ID data set, a service class data set and a time range data set;
when query data is received, analyzing the query dimension of the query data, and selecting one of the ID data set, the business class data set and the time range data set according to the query dimension;
and inquiring and displaying the inquiry result from the selected data set according to the inquiry data.
In the embodiment of the invention, exact queries can be made by the ID of the log data and the index data, exact queries can be made by service category, and fuzzy queries can be made by time range and keyword.
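The dimension classification and query routing described above might look like the following sketch. The record fields (`id`, `service`, `ts`, `message`), the hourly time buckets and the query format are hypothetical; a real deployment would more likely rely on the indexes of the underlying database.

```python
from collections import defaultdict


def build_indexes(records):
    """Group records by ID, by service class, and by hour bucket."""
    by_id = defaultdict(list)
    by_service = defaultdict(list)
    by_hour = defaultdict(list)
    for rec in records:
        by_id[rec["id"]].append(rec)
        by_service[rec["service"]].append(rec)
        by_hour[rec["ts"] // 3600].append(rec)
    return {"id": by_id, "service": by_service, "time": by_hour}


def query(indexes, q):
    """Pick the data set that matches the query dimension, then look up."""
    if "id" in q:                                   # exact query by ID
        return list(indexes["id"].get(q["id"], []))
    if "service" in q:                              # exact query by service class
        return list(indexes["service"].get(q["service"], []))
    start, end = q["time_range"]                    # fuzzy query by time and keyword
    keyword = q.get("keyword", "")
    hits = []
    for hour in range(start // 3600, end // 3600 + 1):
        for rec in indexes["time"].get(hour, []):
            if start <= rec["ts"] <= end and keyword in rec["message"]:
                hits.append(rec)
    return hits
```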
S4, extracting the associated index data associated with the slicing index data from the regular link data and the regular log data.
In detail, the S4 includes:
performing word segmentation and vectorization on the regular link data and the regular log data to obtain a link data vector sequence and a log data vector sequence;
extracting keywords of the sliced index data;
respectively calculating the similarity of the keyword and the word vector in the link data vector sequence and the log data vector sequence;
and taking the word vector with the similarity meeting a preset similarity threshold value as associated index data.
In the embodiment of the invention, the keywords may be extracted using algorithms such as TF-IDF (term frequency-inverse document frequency), the TextRank keyword extraction algorithm, or the K-means clustering algorithm.
In the embodiment of the invention, the similarity between the keyword and the word vectors in the link data vector sequence and the log data vector sequence may be calculated using algorithms such as cosine similarity, the Jaccard similarity coefficient, or the Pearson correlation coefficient.
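The similarity step can be sketched with cosine similarity, one of the measures named above. The three-dimensional vectors, the word list and the 0.8 threshold are toy values invented for illustration; a real system would obtain word vectors from a trained embedding model.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def associated_terms(keyword_vec, word_vectors, threshold=0.8):
    """Return the words whose vectors are similar enough to the keyword."""
    return [w for w, vec in word_vectors.items()
            if cosine(keyword_vec, vec) >= threshold]


# Toy vectors for words found in regular link data and regular log data.
vectors = {
    "latency": [0.9, 0.1, 0.0],
    "timeout": [0.8, 0.3, 0.1],
    "login":   [0.0, 0.2, 0.9],
}
keyword = [1.0, 0.2, 0.0]   # toy vector for a keyword of the index data

matches = associated_terms(keyword, vectors)
```

With these toy values, "latency" and "timeout" clear the threshold and "login" does not, so only the first two would be treated as associated index data.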
In the embodiment of the invention, the associated index data associated with the sliced index data is extracted from the regular link data and the regular log data, so that the link data, the log data and the index data are linked to one another; problems can therefore be investigated easily, and the maintainability of full-link monitoring of mass data is improved.
S5, carrying out secondary aggregation on the associated index data and the fragment index data to obtain aggregate index data, and storing the aggregate index data into a second database of the hot-warm architecture.
In detail, referring to fig. 2, in S5, performing the second-level aggregation on the associated index data and the slice index data to obtain aggregated index data includes:
S51, analyzing the source application of the associated index data and the sliced index data;
S52, according to the source application and the IP address, performing first aggregation on the associated index data and the sliced index data over the four dimensions of source application, IP address, index name and time to obtain first aggregation index data;
and S53, according to the source application, performing second aggregation on the first aggregation index data over the dimensions of source application, index name and time to obtain aggregated index data.
In the embodiment of the present invention, the associated index data and the fragmented index data include an application from which data is derived, an IP address of the data, and the index data.
According to the source application and the IP address, the associated index data and the fragmented index data of the same application and the same IP address are subjected to first aggregation from the source application, the IP address, the index name and the time dimension to obtain first aggregation index data of an instance level, wherein the time dimension comprises time magnitudes of minutes, hours and days; further, the first aggregation index data of the same source application is subjected to second aggregation from the source application, the index name and the time dimension to obtain aggregation index data of an application level.
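The two aggregation levels can be sketched as two grouping passes. The field names, the one-minute time bucket and the use of summation as the aggregation function are assumptions of this sketch; the patent does not fix the aggregation function.

```python
from collections import defaultdict


def first_aggregation(points):
    """Instance level: sum by (app, ip, metric name, minute)."""
    out = defaultdict(float)
    for p in points:
        out[(p["app"], p["ip"], p["name"], p["ts"] // 60)] += p["value"]
    return dict(out)


def second_aggregation(instance_level):
    """Application level: sum instance results by (app, metric name, minute)."""
    out = defaultdict(float)
    for (app, _ip, name, minute), value in instance_level.items():
        out[(app, name, minute)] += value
    return dict(out)
```

Keeping the instance-level result separately from the application-level result is what lets an operator move from an application-wide anomaly down to the specific instance that caused it.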
In the embodiment of the invention, secondary aggregation is performed on the associated index data and the sliced index data to obtain the aggregated index data. This can be adapted to the pipeline processing mechanisms of various streaming architectures, and the secondary aggregation yields instance-level data and application-level data that are independent of each other, so that the source of a problem can be located quickly when one occurs later.
Further, referring to fig. 3, storing the aggregate index data in the second database of the hot-warm architecture in S5 includes:
S54, storing the aggregated index data as hot data in a hot data node of the second database;
S55, detecting the retention duration of the aggregated index data in the second database;
and S56, automatically migrating the aggregated index data whose retention duration exceeds the preset duration to a cold data node of the second database as cold data.
In the embodiment of the invention, the aggregation index data from approximately the most recent 2 days can be stored as hot data in the hot data nodes of the second database. Because hot data is read and written frequently and has high performance requirements, the hot data nodes can be built on SSDs. The aggregation index data older than 2 days is treated as cold data and automatically migrated to the cold data nodes of the second database. Because cold data requires higher storage density and must be retained for a longer period, the cold data nodes can be built on mechanical hard disks.
In the embodiment of the invention, the aggregation index data is stored in a hot-warm architecture according to its differing read-write requirements, so that written data can be processed efficiently and queried quickly, while data can also be retained for a long time at reduced cost.
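Steps S54 to S56 can be illustrated with a small in-memory sketch. The `TieredStore` class and its methods are hypothetical stand-ins for a hot-warm database rather than a real database client, and the 2-day default simply reuses the example duration given above.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta


@dataclass
class TieredStore:
    """Toy model of a hot-warm store (hypothetical, for illustration only)."""

    hot_retention: timedelta = timedelta(days=2)  # example duration from the text
    hot: dict = field(default_factory=dict)       # would live on SSD-backed nodes
    cold: dict = field(default_factory=dict)      # would live on mechanical-disk nodes

    def write(self, key, value, written_at):
        # S54: new aggregation index data always lands on the hot tier.
        self.hot[key] = (value, written_at)

    def migrate(self, now):
        # S55: detect how long each entry has been retained.
        expired = [k for k, (_, t) in self.hot.items()
                   if now - t > self.hot_retention]
        # S56: move entries retained longer than the preset duration.
        for k in expired:
            self.cold[k] = self.hot.pop(k)
        return expired


if __name__ == "__main__":
    store = TieredStore()
    now = datetime(2023, 4, 14, 12, 0)
    store.write("order/qps/2023-04-11", 9.0, now - timedelta(days=3))
    store.write("order/qps/2023-04-14", 7.0, now - timedelta(hours=1))
    print(store.migrate(now), sorted(store.hot), sorted(store.cold))
```

In a production system the migration would typically be handled by the database's own lifecycle policy instead of application code; the sketch only makes the retention check explicit.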
S6, pushing the aggregation index data that meets a preset index threshold to a user as alarm information, and displaying the aggregation index data according to a second preset dimension.
In the embodiment of the invention, aggregation index data that reaches the preset index threshold is considered abnormal, and the abnormal aggregation index data is pushed to the user as alarm information.
In the embodiment of the invention, the second preset dimension may be a dimension defined according to query requirements or user requirements.
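The threshold check in S6 can be sketched as below. The function name, the per-index threshold table and the message format are assumptions made for illustration, and "reaches the threshold" is interpreted here as "greater than or equal to an upper limit", which the text does not state explicitly.

```python
def find_alerts(aggregated, thresholds):
    """Return alarm messages for index values that reach their threshold.

    aggregated: {(app, index_name, time_bucket): value}
    thresholds: {index_name: upper_limit}  (hypothetical configuration)
    """
    alerts = []
    for (app, name, bucket), value in sorted(aggregated.items()):
        limit = thresholds.get(name)
        # Indexes without a configured threshold never raise an alarm.
        if limit is not None and value >= limit:
            alerts.append(
                f"[ALERT] {app} {name}={value} at {bucket} (threshold {limit})"
            )
    return alerts


if __name__ == "__main__":
    # Made-up aggregation results, for illustration only.
    data = {
        ("order", "error_rate", "2023-04-14 10:00"): 0.2,
        ("order", "qps", "2023-04-14 10:00"): 9.0,
    }
    for message in find_alerts(data, {"error_rate": 0.1}):
        print(message)  # a real system would push this to the user
```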
According to the embodiment of the invention, a message queue slicing strategy is selected in a balanced manner according to the data magnitude, and the link data, the log data and the index data are divided into sliced link data, sliced log data and sliced index data according to that strategy. This distributes the data more evenly and increases processing capacity, avoids data loss caused by data backlog, and improves the accuracy of mass data monitoring. Next, associated index data associated with the sliced index data is extracted from the regular link data and the regular log data, and the associated index data and the sliced index data are aggregated in two stages to obtain the aggregation index data. This approach can be adapted to the pipeline processing mechanisms of various streaming architectures and yields instance-level and application-level data that are independent of each other, so that the source of a problem can be located quickly when one occurs later, which improves the maintainability of mass data. Finally, the aggregation index data that meets the preset index threshold is pushed to a user as alarm information, and the regular link data, the regular log data and the aggregation index data are stored in databases of a hot-warm architecture, so that written data can be processed efficiently and queried quickly while data is retained for a long time at reduced cost, which further improves the maintainability of mass data. Therefore, the mass data full-link monitoring method provided by the invention can solve the problems of low accuracy and low maintainability in mass data full-link monitoring.
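The magnitude-based selection of a slicing strategy summarized above follows the three tiers recited in claim 2, and can be sketched as follows. Everything concrete here is an assumption: the function name, the shard and cluster counts, the use of SHA-256 as the hash, and the reading of "randomly selecting according to the IDs" as a pseudo-random choice seeded by the record ID.

```python
import hashlib
import random


def choose_shard(record_id, ip, magnitude, low, high,
                 shards_per_cluster=4, clusters=2):
    """Return (cluster_index, shard_index) for one record.

    low / high stand for the first and second preset magnitude thresholds.
    """
    if magnitude < low:
        # Small volume: pick a shard pseudo-randomly, seeded by the record ID.
        return 0, random.Random(record_id).randrange(shards_per_cluster)

    # Larger volumes: route by a hash of the IP address.
    ip_hash = int(hashlib.sha256(ip.encode("utf-8")).hexdigest(), 16)
    shard = ip_hash % shards_per_cluster
    if magnitude < high:
        # Medium volume: a single message queue cluster.
        return 0, shard
    # Large volume: spread across several message queue clusters.
    return (ip_hash // shards_per_cluster) % clusters, shard


if __name__ == "__main__":
    print(choose_shard("trace-1", "10.0.0.1", magnitude=50, low=100, high=1000))
    print(choose_shard("trace-1", "10.0.0.1", magnitude=500, low=100, high=1000))
    print(choose_shard("trace-1", "10.0.0.1", magnitude=5000, low=100, high=1000))
```

Hashing on the IP address keeps all records from one instance on the same shard, which fits the later instance-level aggregation step.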
Fig. 4 is a functional block diagram of a full link monitoring device for mass data according to an embodiment of the present invention.
The mass data full-link monitoring device 100 of the present invention may be installed in an electronic apparatus. According to the implemented functions, the mass data full-link monitoring device 100 may include a data acquisition module 101, an equalization segmentation module 102, an analysis module 103, a data aggregation module 104, and an information alarm module 105. The module of the invention, which may also be referred to as a unit, refers to a series of computer program segments, which are stored in the memory of the electronic device, capable of being executed by the processor of the electronic device and of performing a fixed function.
In the present embodiment, the functions concerning the respective modules/units are as follows:
the data acquisition module 101 is configured to acquire link data, log data and index data;
the equalization segmentation module 102 is configured to count the data magnitudes of the link data, the log data and the index data, select a message queue slicing strategy in a balanced manner according to the data magnitudes, and divide the link data, the log data and the index data into sliced link data, sliced log data and sliced index data according to the message queue slicing strategy;
the analysis module 103 is configured to parse the sliced link data and the sliced log data into regular link data and regular log data in a preset format, store the regular link data and the regular log data in a first database of a hot-warm architecture, and display the regular link data and the regular log data according to a first preset dimension;
the data aggregation module 104 is configured to extract, from the regular link data and the regular log data, associated index data associated with the sliced index data, perform two-stage aggregation on the associated index data and the sliced index data to obtain aggregation index data, and store the aggregation index data in a second database of the hot-warm architecture;
the information alarm module 105 is configured to push the aggregation index data that meets a preset index threshold to a user as alarm information, and display the aggregation index data according to a second preset dimension.
In detail, each module in the mass data full-link monitoring device 100 in the embodiment of the present invention adopts the same technical means as the mass data full-link monitoring method described in fig. 1 to 3, and can produce the same technical effects, which are not described herein.
Fig. 5 is a schematic structural diagram of an electronic device for implementing a method for monitoring a full link of mass data according to an embodiment of the present invention.
The electronic device 1 may comprise a processor 10, a memory 11, a communication bus 12 and a communication interface 13, and may further comprise a computer program, such as a mass data full-link monitoring program, stored in the memory 11 and executable on the processor 10.
The processor 10 may be formed by an integrated circuit in some embodiments, for example, a single packaged integrated circuit, or may be formed by a plurality of integrated circuits packaged with the same function or different functions, including one or more central processing units (Central Processing Unit, CPU), a microprocessor, a digital processing chip, a graphics processor, a combination of various control chips, and so on. The processor 10 is a Control Unit (Control Unit) of the electronic device, connects various components of the entire electronic device using various interfaces and lines, and executes various functions of the electronic device and processes data by running or executing programs or modules stored in the memory 11 (for example, executing a mass data full link monitoring program, etc.), and calling data stored in the memory 11.
The memory 11 includes at least one type of readable storage medium including flash memory, a removable hard disk, a multimedia card, a card type memory (e.g., SD or DX memory, etc.), a magnetic memory, a magnetic disk, an optical disk, etc. The memory 11 may in some embodiments be an internal storage unit of the electronic device, such as a mobile hard disk of the electronic device. The memory 11 may in other embodiments also be an external storage device of the electronic device, such as a plug-in mobile hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card) or the like, which are provided on the electronic device. Further, the memory 11 may also include both an internal storage unit and an external storage device of the electronic device. The memory 11 may be used not only for storing application software installed in an electronic device and various types of data, such as codes of a mass data full-link monitoring program, but also for temporarily storing data that has been output or is to be output.
The communication bus 12 may be a peripheral component interconnect standard (Peripheral Component Interconnect, PCI) bus, or an extended industry standard architecture (Extended Industry Standard Architecture, EISA) bus, among others. The bus may be classified as an address bus, a data bus, a control bus, etc. The bus is arranged to enable a connection communication between the memory 11 and at least one processor 10 etc.
The communication interface 13 is used for communication between the electronic device and other devices, and includes a network interface and a user interface. Optionally, the network interface may include a wired interface and/or a wireless interface (e.g., a Wi-Fi interface, a Bluetooth interface, etc.), and is typically used to establish a communication connection between the electronic device and other electronic devices. The user interface may be a display or an input unit such as a keyboard; optionally, it may also be a standard wired interface or a wireless interface. Optionally, in some embodiments, the display may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (Organic Light-Emitting Diode) touch device, or the like. The display may also be referred to as a display screen or display unit, and is used for displaying information processed in the electronic device and for displaying a visual user interface.
Fig. 5 shows only an electronic device with certain components. A person skilled in the art will understand that the structure shown in Fig. 5 does not limit the electronic device 1, which may comprise fewer or more components than shown, may combine certain components, or may use a different arrangement of components.
For example, although not shown, the electronic device may further include a power source (such as a battery) for supplying power to the respective components. Preferably, the power source may be logically connected to the at least one processor 10 through a power management device, so that functions such as charge management, discharge management and power consumption management are implemented through the power management device. The power source may also include one or more direct current or alternating current power supplies, a recharging device, a power failure detection circuit, a power converter or inverter, a power status indicator, and other such components. The electronic device may further include various sensors, Bluetooth modules, Wi-Fi modules, etc., which are not described herein.
It should be understood that the embodiments described are for illustrative purposes only, and the scope of the patent application is not limited to this configuration.
The mass data full link monitoring program stored in the memory 11 of the electronic device 1 is a combination of instructions which, when executed in the processor 10, can implement:
collecting link data, log data and index data;
counting the data magnitude of the link data, the log data and the index data, selecting a message queue slicing strategy in a balanced manner according to the data magnitude, and dividing the link data, the log data and the index data into sliced link data, sliced log data and sliced index data according to the message queue slicing strategy;
parsing the sliced link data and the sliced log data into regular link data and regular log data in a preset format, storing the regular link data and the regular log data in a first database of a hot-warm architecture, and displaying the regular link data and the regular log data according to a first preset dimension;
extracting associated index data associated with the sliced index data from the regular link data and the regular log data;
performing two-stage aggregation on the associated index data and the sliced index data to obtain aggregation index data, and storing the aggregation index data in a second database of the hot-warm architecture;
and pushing the aggregation index data that meets a preset index threshold to a user as alarm information, and displaying the aggregation index data according to a second preset dimension.
In particular, the specific implementation method of the above instructions by the processor 10 may refer to the description of the relevant steps in the corresponding embodiment of the drawings, which is not repeated herein.
Further, the modules/units integrated in the electronic device 1 may be stored in a computer readable storage medium if implemented in the form of software functional units and sold or used as separate products. The computer readable storage medium may be volatile or nonvolatile. For example, the computer readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, or a Read-Only Memory (ROM).
The present invention also provides a computer readable storage medium storing a computer program which, when executed by a processor of an electronic device, can implement:
collecting link data, log data and index data;
counting the data magnitude of the link data, the log data and the index data, selecting a message queue slicing strategy in a balanced manner according to the data magnitude, and dividing the link data, the log data and the index data into sliced link data, sliced log data and sliced index data according to the message queue slicing strategy;
parsing the sliced link data and the sliced log data into regular link data and regular log data in a preset format, storing the regular link data and the regular log data in a first database of a hot-warm architecture, and displaying the regular link data and the regular log data according to a first preset dimension;
extracting associated index data associated with the sliced index data from the regular link data and the regular log data;
performing two-stage aggregation on the associated index data and the sliced index data to obtain aggregation index data, and storing the aggregation index data in a second database of the hot-warm architecture;
and pushing the aggregation index data that meets a preset index threshold to a user as alarm information, and displaying the aggregation index data according to a second preset dimension.
In the several embodiments provided in the present invention, it should be understood that the disclosed apparatus, device and method may be implemented in other manners. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the modules is merely a logical function division, and there may be other manners of division when actually implemented.
The modules described as separate components may or may not be physically separate, and components shown as modules may or may not be physical units, may be located in one place, or may be distributed over multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional modules in the embodiments of the present invention may be integrated in one processing unit, each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units can be implemented in the form of hardware, or in the form of hardware combined with software functional modules.
It will be evident to those skilled in the art that the invention is not limited to the details of the foregoing illustrative embodiments, and that the present invention may be embodied in other specific forms without departing from the spirit or essential characteristics thereof.
The present embodiments are, therefore, to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference signs in the claims shall not be construed as limiting the claim concerned.
The blockchain is a novel application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanisms and encryption algorithms. A blockchain is essentially a decentralized database: a chain of data blocks generated in association with one another by cryptographic methods, where each data block contains information about a batch of network transactions and is used to verify the validity of that information (anti-counterfeiting) and to generate the next block. The blockchain may include a blockchain underlying platform, a platform product services layer, an application services layer, and the like.
The embodiments of the application can acquire and process the related data based on artificial intelligence technology. Artificial intelligence (AI) is the theory, method, technique and application system that uses a digital computer, or a machine controlled by a digital computer, to simulate, extend and expand human intelligence, perceive the environment, acquire knowledge, and use that knowledge to obtain optimal results.
Furthermore, it is evident that the word "comprising" does not exclude other elements or steps, and that the singular does not exclude a plurality. A plurality of units or means recited in the system claims can also be implemented by means of software or hardware by means of one unit or means. The terms first, second, etc. are used to denote a name, but not any particular order.
Finally, it should be noted that the above-mentioned embodiments are merely for illustrating the technical solution of the present invention and not for limiting the same, and although the present invention has been described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that modifications and equivalents may be made to the technical solution of the present invention without departing from the spirit and scope of the technical solution of the present invention.

Claims (10)

1. A mass data full-link monitoring method, characterized by comprising the following steps:
collecting link data, log data and index data;
counting the data magnitude of the link data, the log data and the index data, selecting a message queue slicing strategy in a balanced manner according to the data magnitude, and dividing the link data, the log data and the index data into sliced link data, sliced log data and sliced index data according to the message queue slicing strategy;
parsing the sliced link data and the sliced log data into regular link data and regular log data in a preset format, storing the regular link data and the regular log data in a first database of a hot-warm architecture, and displaying the regular link data and the regular log data according to a first preset dimension;
extracting associated index data associated with the sliced index data from the regular link data and the regular log data;
performing two-stage aggregation on the associated index data and the sliced index data to obtain aggregation index data, and storing the aggregation index data in a second database of the hot-warm architecture;
and pushing the aggregation index data that meets a preset index threshold to a user as alarm information, and displaying the aggregation index data according to a second preset dimension.
2. The mass data full-link monitoring method according to claim 1, wherein the selecting a message queue slicing strategy in a balanced manner according to the data magnitude comprises:
when the data magnitude is smaller than a first preset magnitude threshold, randomly selecting message queue fragments according to the IDs of the link data, the log data and the index data;
when the data magnitude is greater than or equal to the first preset magnitude threshold and smaller than a second preset magnitude threshold, selecting a single message queue cluster, performing a hash operation on the IP addresses of the link data, the log data and the index data to obtain an IP address hash value, and allocating message queue fragments within the single message queue cluster according to the IP address hash value;
and when the data magnitude is greater than or equal to the second preset magnitude threshold, selecting a plurality of message queue clusters, and allocating message queue fragments across the plurality of message queue clusters according to the IP address hash value.
3. The mass data full-link monitoring method according to claim 1, wherein the parsing the sliced link data and the sliced log data into regular link data and regular log data in a preset format comprises:
performing data cleaning on the sliced link data and the sliced log data to obtain target sliced link data and target sliced log data;
and parsing the target sliced link data and the target sliced log data into regular link data and regular log data with the same format.
4. The mass data full-link monitoring method according to claim 1, wherein the performing two-stage aggregation on the associated index data and the sliced index data to obtain aggregation index data comprises:
analyzing the source application of the associated index data and of the sliced index data;
according to the source application and the IP address, performing a first aggregation on the associated index data and the sliced index data across the four dimensions of source application, IP address, index name and time to obtain first aggregation index data;
and according to the source application, performing a second aggregation on the first aggregation index data across the source application, index name and time dimensions to obtain the aggregation index data.
5. The mass data full-link monitoring method according to claim 1, wherein the extracting associated index data associated with the sliced index data from the regular link data and the regular log data comprises:
performing word segmentation and vectorization on the regular link data and the regular log data to obtain a link data vector sequence and a log data vector sequence;
extracting keywords of the sliced index data;
calculating the similarity between the keywords and the word vectors in the link data vector sequence and in the log data vector sequence, respectively;
and taking the word vectors whose similarity meets a preset similarity threshold as the associated index data.
6. The mass data full-link monitoring method according to claim 1, wherein the storing the aggregation index data in a second database of the hot-warm architecture comprises:
storing the aggregation index data as hot data in a hot data node in the second database;
detecting the retention duration of the aggregation index data in the second database;
and automatically migrating the aggregation index data whose retention duration exceeds a preset duration to a cold data node in the second database as cold data.
7. The mass data full-link monitoring method according to claim 1, wherein the displaying the regular link data and the regular log data according to a first preset dimension comprises:
classifying the regular link data and the regular log data by ID, service class and time range, respectively, to obtain an ID data set, a service class data set and a time range data set;
when query data is received, analyzing the query dimension of the query data, and selecting one of the ID data set, the service class data set and the time range data set according to the query dimension;
and querying the selected data set according to the query data and displaying the query result.
8. A mass data full-link monitoring device, the device comprising:
the data acquisition module is used for acquiring link data, log data and index data;
the equalization segmentation module is used for counting the data magnitude of the link data, the log data and the index data, selecting a message queue slicing strategy in a balanced manner according to the data magnitude, and dividing the link data, the log data and the index data into sliced link data, sliced log data and sliced index data according to the message queue slicing strategy;
the analysis module is used for parsing the sliced link data and the sliced log data into regular link data and regular log data in a preset format, storing the regular link data and the regular log data in a first database of a hot-warm architecture, and displaying the regular link data and the regular log data according to a first preset dimension;
the data aggregation module is used for extracting associated index data associated with the sliced index data from the regular link data and the regular log data, performing two-stage aggregation on the associated index data and the sliced index data to obtain aggregation index data, and storing the aggregation index data in a second database of the hot-warm architecture;
and the information alarm module is used for pushing the aggregation index data that meets a preset index threshold to a user as alarm information and displaying the aggregation index data according to a second preset dimension.
9. An electronic device, the electronic device comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores a computer program executable by the at least one processor, so that the at least one processor can perform the mass data full-link monitoring method according to any one of claims 1 to 7.
10. A computer readable storage medium storing a computer program, wherein the computer program when executed by a processor implements the mass data full link monitoring method according to any one of claims 1 to 7.
CN202310425072.XA 2023-04-14 2023-04-14 Method, device, equipment and storage medium for monitoring mass data full link Pending CN116431694A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310425072.XA CN116431694A (en) 2023-04-14 2023-04-14 Method, device, equipment and storage medium for monitoring mass data full link

Publications (1)

Publication Number Publication Date
CN116431694A true CN116431694A (en) 2023-07-14

Family

ID=87083041

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310425072.XA Pending CN116431694A (en) 2023-04-14 2023-04-14 Method, device, equipment and storage medium for monitoring mass data full link

Country Status (1)

Country Link
CN (1) CN116431694A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination