CN108399199A - Spark-based collection and service processing system and method for application software running logs - Google Patents
Spark-based collection and service processing system and method for application software running logs
- Publication number
- CN108399199A (application CN201810091898.6A)
- Authority
- CN
- China
- Prior art keywords
- data
- log
- service
- log record
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/18—File system types
- G06F16/1805—Append-only file systems, e.g. using logs or journals to store data
- G06F16/1815—Journaling file systems
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L41/00—Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
- H04L41/06—Management of faults, events, alarms or notifications
- H04L41/069—Management of faults, events, alarms or notifications using logs of notifications; Post-processing of notifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L41/00—Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
- H04L41/50—Network service management, e.g. ensuring proper service fulfilment according to agreements
- H04L41/5041—Network service management, e.g. ensuring proper service fulfilment according to agreements characterised by the time relationship between creation and deployment of a service
Abstract
The present invention relates to a Spark-based system and method for collecting and processing application software running logs. Log services are provided for the resources at every level: a log collection service at the log data resource layer, a log data storage service at the log data service layer, and a user log data retrieval service at the log data application layer. After the log collection service of the log data resource layer gathers the raw data, the data is pushed to the log data service layer for preprocessing, stored by the log data storage service, and finally offered to users as a log data service at the log data application layer. The invention collects log data with a distributed collection strategy, defines a multi-level data storage structure for log data, and provides users with a log data query service, so that users can obtain useful application software running log data and improve the efficiency of fault diagnosis through the collected logs.
Description
Technical field
The invention belongs to the field of log big data, and more particularly relates to the collection and service processing of application software running logs.
Background art

In an information system, every operation leaves a trace: this is the log. Each log file consists of log records, and each log record describes a single event. Within a complete information system, the logging subsystem is a very important functional component. It records all behavior exhibited by the system and expresses it according to a given specification. Logs record necessary, valuable information about the activity of IT resources such as servers, workstations, firewalls and application software; these records are essential for system monitoring, querying, security auditing and troubleshooting.
Most existing large-scale software is developed by many people, or assembled from software of diverse origins, integrating large amounts of open-source community code. The coding styles inside such software are inconsistent in many ways, so the internal logic of its logs is complex and hard to analyze after an error occurs; users find it difficult to extract useful information from the logs quickly, which defeats the goal of improving fault-diagnosis efficiency. Many development communities today have added log processing methods to their log management systems, but problems remain, for example: log data storage is chaotic, and user-facing services are rarely addressed, so users find it hard to tailor log data to their needs. How to process logs quickly and effectively, and return useful log information to the user, is therefore a question that current log service research must consider.
As application software running logs keep growing, more and more researchers have turned to the study of log collection and service processing, and a large amount of software and systems have appeared in this area. Scribe, Facebook's open-source log collection system, is used widely inside Facebook and can gather logs from a variety of log sources. Logstash is a platform for transmitting, processing, managing and searching application logs and events; it can collect and manage application logs in a unified way and provides a web interface for querying and statistics. Flume is a highly available, highly reliable, distributed system for collecting, aggregating and transmitting massive logs, provided by Cloudera; it supports customizing all kinds of data senders within the log system to collect data. Most of these tools, however, lean towards collection only and perform no subsequent processing after the logs are gathered, so they fall short of users' need for refined log data.
Summary of the invention
Against the background and problems above, the present invention provides a Spark-based framework for collecting and processing application software running logs. A layered log data service framework is proposed that supplies log services to the resources at each level: a log collection service at the log data resource layer, a log data storage service at the log data service layer, and a user log data retrieval service at the log data application layer. After the log collection service of the log data resource layer gathers the raw data, the data is pushed to the log data service layer for preprocessing, then stored by the log data storage service, and finally offered to users as a log data service at the log data application layer.
The technical scheme of the present invention is as follows:
A Spark-based collection and service processing system for application software running logs, characterized in that it comprises a log collection service unit at the log data resource layer, a log data storage service unit and a log data preprocessing service unit at the log data service layer, and a user log data retrieval service unit at the log data application layer, wherein:
Log collection service unit: collects the raw log data generated by running application software at the log data resource layer;
Log data preprocessing service unit: removes unnecessary information from the log data according to demand, keeping the messages the user needs, and performs three kinds of preprocessing on the raw data: data filtering, data de-duplication and log record segmentation;
Log data storage service unit: responsible for storing the raw data and the preprocessed data;
User log data retrieval service unit: provides a multi-condition query service interface, offering users a query service.
A Spark-based collection and service processing method for application software running logs, characterized by comprising the following steps:
Step 1: the log collection service collects log data using a distributed collection strategy;
Step 2: the log data preprocessing service preprocesses the collected log data;
Step 3: the log data storage service receives the log data and stores the raw and preprocessed log data in separate databases;
Step 4: the user log data retrieval service provides a multi-condition query service interface, offering users a query service.
In the above Spark-based collection and service processing method for application software running logs, said Step 1 comprises the following steps:
Step 1.1: the log collection service connects to log service nodes in failover mode, automatically selecting an available node to connect (a sketch of this behavior follows). When a node in the log service node cluster fails, the collected log data is passed to another service node; when the cluster's message service node is unavailable, failover automatically selects another available message service node to take over.
Step 1.2: when collecting log data, the log collection module first sets the log file paths. The collection work is completed on every child node, and once finished the results converge into one large log data set; to satisfy users' requirements on the log data, a filter is installed in each child node.
Step 1.3: before collection starts, the data sources of the logs to collect are determined. After the data sources are determined, it is checked whether the master node has started; if not, the master node is started by modifying the configuration file; if it has started, a master node is selected, and one or more master nodes are used according to the system requirements. After the master nodes start, the agent nodes are set up; users customize the agent nodes to their own needs in three respects: the source, i.e. the real-time data source; the channel, i.e. the cache for real-time log data; and the sink, i.e. the real-time data output. The configuration covers the name, type and attributes of each source, channel and sink (a configuration sketch follows). The agents are then connected, after which all agent nodes begin the log data collection work.
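The source/channel/sink customization described in step 1.3 corresponds to Flume's agent configuration model. The sketch below writes out a minimal Flume-style agent definition from Python for concreteness; the agent name, file paths and collector host are illustrative assumptions rather than values from the patent.

```python
# Hedged sketch: a minimal Flume-style agent with one named source, channel
# and sink, each given a type and attributes as step 1.3 describes.
# Agent name, file paths and the collector host are illustrative placeholders.
agent_conf = """
a1.sources  = r1
a1.channels = c1
a1.sinks    = k1

a1.sources.r1.type    = exec                 # real-time data source
a1.sources.r1.command = tail -F /var/log/app/app.log

a1.channels.c1.type     = memory             # cache for real-time log data
a1.channels.c1.capacity = 10000

a1.sinks.k1.type     = avro                  # real-time data output to collector
a1.sinks.k1.hostname = collector-1
a1.sinks.k1.port     = 4141

a1.sources.r1.channels = c1                  # connect the agent's three parts
a1.sinks.k1.channel    = c1
"""

with open("agent.conf", "w") as f:           # hand the definition to the agent
    f.write(agent_conf)
```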
In the above Spark-based collection and service processing method for application software running logs, said Step 2 comprises the following steps:
Step 2.1: the log data preprocessing service preprocesses the log data, performing three kinds of preprocessing on the raw data: data filtering, data de-duplication and log record segmentation.
Step 2.2: after this simple preprocessing, a text classification algorithm is chosen to classify the log data: a TF-IDF algorithm builds a vector space model (VSM) to vectorize the text, and a KNN algorithm then performs the data classification, as illustrated below.
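The TF-IDF/KNN pipeline can be made concrete with a short sketch. The version below runs on a single machine with scikit-learn rather than on Spark, and its sample records and class labels are invented for illustration; it is a sketch of the technique, not the patent's implementation.

```python
# A minimal single-machine sketch of step 2.2: TF-IDF builds the vector space
# model (VSM), and KNN classifies new records against labelled examples.
# The sample records and class labels are invented for illustration.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.neighbors import KNeighborsClassifier

train_logs = [
    "2018-01-30 12:00:01 INFO  order-service user login succeeded",
    "2018-01-30 12:00:02 ERROR order-service remote service request interrupted",
    "2018-01-30 12:00:03 DEBUG cache-service refresh started",
    "2018-01-30 12:00:04 ERROR pay-service remote service request interrupted",
]
train_labels = ["access", "fault", "internal", "fault"]   # hypothetical classes

vectorizer = TfidfVectorizer()                  # TF-IDF vectorization (VSM)
X = vectorizer.fit_transform(train_logs)

knn = KNeighborsClassifier(n_neighbors=1)       # KNN over the TF-IDF vectors
knn.fit(X, train_labels)

new_log = ["2018-01-30 12:10:09 ERROR order-service remote service request interrupted"]
print(knn.predict(vectorizer.transform(new_log)))   # -> ['fault']
```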
In the above Spark-based collection and service processing framework for application software running logs, the log data preprocessing of step 2.1 is divided into three parts (a PySpark sketch of the three parts follows this list), specifically:
Step A, data filtering: a log data set contains a large number of unnecessary records, such as system control records or user requests for static-resource URLs; such entries are very common in logs but inessential to the ordinary user, so the log data must be filtered, which effectively lightens subsequent log processing;
Step B, log de-duplication: a log data set contains a large number of repeated records; for example, when a remote service request is interrupted, the same log record may be returned repeatedly, and records that merely repeat the first one returned are of no help to the user's subsequent log data analysis, so the repeated log records must be removed;
Step C, log record classification: the common log format is "time - log level - service name - event"; in this form the log data cannot be read by the user, so the log records must be classified, different classes carrying different meanings.
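Below is a minimal PySpark sketch of steps A to C, assuming the "time - log level - service name - event" line layout named in step C; the input path, the static-resource pattern and the field regex are illustrative assumptions.

```python
# Hedged sketch of the three pretreatment parts: filter (A), de-duplicate (B),
# and segment each record into its fields (C). Path and patterns are assumed.
import re
from pyspark import SparkContext

sc = SparkContext(appName="log-pretreatment-sketch")

STATIC = re.compile(r"\.(css|js|png|jpg|gif|ico)\b")         # step A: static-resource requests
LINE = re.compile(r"^(\S+ \S+)\s+(\w+)\s+(\S+)\s+(.*)$")     # step C: time, level, service, event

records = (sc.textFile("hdfs:///logs/raw/*.log")             # input path is a placeholder
             .filter(lambda line: not STATIC.search(line))   # step A: data filtering
             .distinct()                                     # step B: de-duplication
             .map(lambda line: LINE.match(line))
             .filter(lambda m: m is not None)                # keep only well-formed records
             .map(lambda m: {"time": m.group(1), "level": m.group(2),
                             "service": m.group(3), "event": m.group(4)}))

print(records.take(5))
```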
In the above Spark-based collection and service processing method for application software running logs, said Step 3 comprises the following steps:
Step 3.1: log data falls into two kinds. The first is raw log data; although its value density is generally low, it is very valuable when mined further. This kind of raw log data is archived here: since it is neither accessed frequently nor subject to strict real-time requirements, it is stored in a MySQL database. The second kind is log data that has been extracted, cleaned, filtered, screened and preprocessed; such data is directly relevant to the user's subsequent analysis and will be accessed frequently, so it is stored in HBase, a high-performance distributed NoSQL database;
Step 3.2: raw log data is an irregular, semi-structured data form whose format varies by type. To let the two different kinds of database cooperate, the log data must be normalized. Normalization serves two purposes: scalability and simplification. Scalability means accommodating application logs of different types without being constrained by type; simplification means that normalized log data improves the user's efficiency in log analysis;
Step 3.3: the normalized raw log format is divided into three parts: the log index, the log record body, and the log level (a small sketch of this structure follows). The log index is the core of the whole log data storage format: a service ID is defined for each log service and indexed as the primary key, so that log records can be located quickly and the efficiency of preprocessing and of later query services is improved. The log record body stores the information of the log data itself; it is transparent to the log system, which performs fault diagnosis by analyzing this information. The log level is divided into three grades: INFO, ERROR and DEBUG.
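For concreteness, the three-part normalized format might be represented as follows; the field names and the key layout are illustrative assumptions, not definitions from the patent.

```python
# Hedged sketch of the three-part normalized log format in step 3.3: a log
# index keyed by service ID, a log record body opaque to the log system, and
# one of the three log levels. Field names and key layout are assumptions.
from dataclasses import dataclass

LEVELS = {"INFO", "ERROR", "DEBUG"}          # the three grades named above

@dataclass
class NormalizedLog:
    service_id: str                          # log index: primary key per log service
    time: str
    level: str                               # one of LEVELS
    body: str                                # log record body, analyzed for fault diagnosis

    def primary_key(self) -> str:
        """Key used to locate records quickly, led by the service ID."""
        assert self.level in LEVELS
        return f"{self.service_id}|{self.time}"

rec = NormalizedLog("svc-042", "2018-01-30 12:00:02", "ERROR",
                    "remote service request interrupted")
print(rec.primary_key())                     # -> svc-042|2018-01-30 12:00:02
```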
In the above Spark-based collection and service processing method for application software running logs, said Step 4 comprises the following steps:
Step 4.1: a multi-condition query service is provided to users, who can query log data with multi-condition queries according to their differing needs.
Step 4.2: the log data query module first builds an index over the log data gathered by the log collection module. Through the interface the module provides, a user can retrieve all the log data on a given server with a simple multi-condition query and fetch the required log data from the log store in real time. From the message of each log record one can generally check whether the application has gone wrong, and also learn which server the log data came from, among other information.
The present invention collects log data with a distributed collection strategy, defines a multi-level data storage structure for log data, and provides users with a log data query service, enabling users to obtain useful application software running log data and to improve fault-diagnosis efficiency through the collected logs.
Description of the drawings
Fig. 1 is the log data service architecture diagram of the present invention.
Fig. 2 shows the technical realization of the log data service architecture of the present invention.
Fig. 3 is the data collection flow chart of the present invention.
Fig. 4 is the log data query flow chart of the present invention.
Detailed description of the embodiments
To help those of ordinary skill in the art understand and implement the present invention, it is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the examples described here serve only to illustrate and explain the invention, not to limit it.
As shown in Figure 1, once log data is generated on a child node, it is passed by the Agent in that node to a Collector. Each Agent here consists of three parts: the source is the data source, the channel is the conduit through which data travels, and the sink transfers data to the appointed destination; the three are triggered and coordinated by events. After the Collector receives the data it aggregates it, producing one larger data stream that is finally delivered to the database.
Log data generally falls into two classes. The first class is the raw log data acquired from each piece of application software. This class carries a great deal of information but in a jumbled state. The method archives this class: since it is neither accessed frequently nor subject to strict real-time requirements, it can be stored in the MySQL cluster after normalization. The second class is the preprocessed log data, which is critically important for log analysis work such as fault diagnosis and will therefore be accessed frequently; this class is stored in the higher-performance distributed database HBase. How to combine the relational and non-relational databases, and how to design a suitable normalized data structure for log storage, is the core of the log data storage service.
As shown in Figure 2, the raw data is first stored in the relational database cluster at the service layer: the log data aggregation service on the resource layer imports the log data into the service layer, where the log data storage service is called to store the data in the MySQL cluster. This method uses MySQL Cluster to build the database cluster.
After the log data collection service gathers the raw log data, and to make the raw log data convenient to query and modify, the method sends it to the MySQL database cluster. Raw log data generally takes the form "time - log level - event", which is very unstandardized; preprocessing such data directly would waste a great deal of time and make the whole log data processing service inefficient. The raw log format must therefore be normalized, and the method divides it into three parts: the log index, the log record body, and the log level.
As Figure 2 shows, once the normalized log data enters the MySQL cluster it is distributed across the ndbd nodes and stored by the NDB engine. The management node (also called the management server) is mainly responsible for managing the data nodes and SQL nodes, as well as the cluster configuration file and the cluster log file; it monitors the working state of the other nodes and can start, stop or restart a node. The data nodes store the data. The SQL nodes behave like ordinary MySQL servers, through which SQL operations can be carried out.
After the raw data is stored in the MySQL cluster, it is immediately imported into Spark, where the RDD modules filter, de-duplicate and segment the raw data; the preprocessed results are then imported into HBase for storage, as sketched below.
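A hedged sketch of this MySQL-to-Spark leg: raw records are read over JDBC, filtered, de-duplicated and segmented with DataFrame operations, then staged for the HBase import. Host names, table names and column names are placeholders, not values given in the patent.

```python
# Sketch of the MySQL -> Spark -> HBase leg: read raw records from the MySQL
# cluster over JDBC, apply the three pretreatment operations, and stage the
# result for the HBase import. All names and credentials are placeholders.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("log-pretreatment").getOrCreate()

raw = (spark.read.format("jdbc")
       .option("url", "jdbc:mysql://mysql-host:3306/logdb")   # MySQL Cluster SQL node
       .option("dbtable", "raw_logs")
       .option("user", "loguser").option("password", "...")
       .load())

pretreated = (raw
              .filter(~F.col("body").rlike(r"\.(css|js|png|jpg)"))  # data filtering
              .dropDuplicates(["service_id", "level", "body"])      # de-duplication
              .select("service_id", "time", "level", "body"))       # segmented fields

pretreated.write.mode("overwrite").parquet("hdfs:///logs/pretreated")  # staging for HBase
```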
HBase manages its large-scale distributed database by dividing it into regions. The preprocessed data is stored record by record; here the preprocessed data is imported into HBase by a customized MapReduce job. Since the underlying file storage system of HBase is HDFS, it retains HDFS's high fault tolerance, and HBase also provides an indexing mechanism that makes the log data convenient for other applications to access.
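For illustration, record-by-record storage in HBase could look like the following happybase sketch (the patent itself performs the bulk import with a customized MapReduce job); the table name, column family and rowkey layout are assumptions.

```python
# Hedged sketch of record-by-record storage in HBase via the happybase client.
# Table name, column family and rowkey layout are illustrative assumptions.
import happybase

conn = happybase.Connection("hbase-thrift-host")     # HBase Thrift gateway
table = conn.table("pretreated_logs")

record = {"service_id": "svc-042", "time": "2018-01-30 12:00:02",
          "level": "ERROR", "body": "remote service request interrupted"}

rowkey = f"{record['service_id']}|{record['time']}".encode()  # service ID leads the key
table.put(rowkey, {
    b"log:level": record["level"].encode(),
    b"log:body":  record["body"].encode(),
})
conn.close()
```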
Once the preprocessed log data is stored in the HBase database, the user needs to query the log database to obtain the required log data. HBase itself supports only rowkey-based queries, but the user does not know the rowkey of the data they need and can only query the log data by keywords. To satisfy this multi-condition query demand, the method serves users with Solr-based multi-condition queries over HBase: Solr encapsulates the data in the HBase database under different conditions, and users can run conditional queries over the preprocessed log data according to their differing needs to obtain the log data they require.
As shown in Figure 4, the method indexes in Solr the rowkey and the fields involved in the condition filters of the HBase tables. The multi-condition query of Solr quickly obtains the rowkey values that satisfy the filter conditions; these rowkeys are then used for direct rowkey queries in HBase, and the resulting data set is finally returned to the user. Since log data comes in large volumes, the traditional single-key index is no longer suitable for querying it, so a combined key index is built on Solr: given the characteristics of the normalized log data, an index is built on the three key fields Time, Skype and ID, sorted in descending order by Time. (A sketch of this query path follows.)
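The Solr-then-HBase query path of Figure 4 can be sketched as follows; the core name, the schema fields and the use of the Solr document id to carry the HBase rowkey are illustrative assumptions.

```python
# Hedged sketch of the query path in Figure 4: Solr resolves the
# multi-condition filter to matching rowkeys, then HBase is read by rowkey.
# Hosts, core name, schema fields and the id-as-rowkey convention are assumed.
import happybase
import pysolr

solr = pysolr.Solr("http://solr-host:8983/solr/logs")
conn = happybase.Connection("hbase-thrift-host")
table = conn.table("pretreated_logs")

# Multi-condition query, sorted descending by Time as described above.
hits = solr.search("Level:ERROR AND Service:order-service", sort="Time desc", rows=100)

for hit in hits:
    rowkey = hit["id"].encode()              # assumed: Solr doc id holds the rowkey
    row = table.row(rowkey)                  # direct rowkey lookup in HBase
    print(rowkey, row.get(b"log:body"))
```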
To better illustrate the present invention, a case study was carried out by deploying the method on a comprehensive disaster-reduction spatial information service application system. To verify the feasibility of the log data collection and processing method proposed here, the test environment consists of three physical machines forming a Spark cluster, with two worker nodes and one master node. The operating system on all machines is Ubuntu; the master node exists only to schedule and distribute tasks, while the worker nodes do the actual computing. The master node is a DELL PowerEdge M630 with two 6-core E5-2609 v3 processors (1.9 GHz, 15 MB cache), 64 GB of DDR4 memory and two 300 GB 10K 2.5" SAS disks. Each worker node is a DELL PowerEdge M630 with two 8-core Xeon E5-2640 v3 processors (2.6 GHz, 20 MB cache), 128 GB of DDR4 memory and two 300 GB 10K 2.5" SAS disks. The machines are connected by 10-gigabit network cards. The same three physical machines also host a MySQL cluster, which provides the relational database service.
The data used here is the log data of the comprehensive disaster-reduction spatial information service application system. That system aims to visualize the risks and losses of natural disasters along the dimensions of space and time, providing intuitive information for every stage of disaster-management work and offering services such as products, technology and decision support, thereby ensuring that disaster prevention and reduction work proceeds effectively. The data provided comes in three sizes: sample data (100 MB), one month of data (700 MB), and one year of complete data (4.96 GB).
It should be understood that the parts of this specification not described in detail belong to the prior art.
It should be understood that the above description of the preferred embodiment is relatively detailed and must not therefore be taken as limiting the scope of patent protection of the invention. Under the inspiration of the present invention, and without departing from the scope protected by the claims, those skilled in the art may also make substitutions or variations, all of which fall within the protection scope of the present invention; the claimed scope of the invention is determined by the appended claims.
Claims (7)
1. A Spark-based collection and service processing system for application software running logs, characterized in that it comprises a log collection service unit at the log data resource layer, a log data storage service unit and a log data preprocessing service unit at the log data service layer, and a user log data retrieval service unit at the log data application layer, wherein:
Log collection service unit: collects the raw log data generated by running application software at the log data resource layer;
Log data preprocessing service unit: removes unnecessary information from the log data according to demand, keeping the messages the user needs, and performs three kinds of preprocessing on the raw data: data filtering, data de-duplication and log record segmentation;
Log data storage service unit: responsible for storing the raw data and the preprocessed data;
User log data retrieval service unit: provides a multi-condition query service interface, offering users a query service.
2. A Spark-based collection and service processing method for application software running logs, characterized by comprising the following steps:
Step 1: the log collection service collects log data using a distributed collection strategy;
Step 2: the log data preprocessing service preprocesses the collected log data;
Step 3: the log data storage service receives the log data and stores the raw and preprocessed log data in separate databases;
Step 4: the user log data retrieval service provides a multi-condition query service interface, offering users a query service.
3. The Spark-based collection and service processing method for application software running logs according to claim 2, characterized in that said Step 1 comprises the following steps:
Step 1.1: the log collection service connects to log service nodes in failover mode, automatically selecting an available node to connect; when a node in the log service node cluster fails, the collected log data is passed to another service node; when the cluster's message service node is unavailable, failover automatically selects another available message service node to take over;
Step 1.2: when collecting log data, the log collection module first sets the log file paths; the collection work is completed on every child node, and once finished the results converge into one large log data set; to satisfy users' requirements on the log data, a filter is installed in each child node;
Step 1.3: before collection starts, the data sources of the logs to collect are determined; after the data sources are determined, it is checked whether the master node has started, and if not, the master node is started by modifying the configuration file; if it has started, a master node is selected, and one or more master nodes are used according to the system requirements; after the master nodes start, the agent nodes are set up, and users customize the agent nodes to their own needs in three respects: the source, i.e. the real-time data source, the channel, i.e. the cache for real-time log data, and the sink, i.e. the real-time data output; the configuration covers the name, type and attributes of each source, channel and sink; the agents are then connected, after which all agent nodes begin the log data collection work.
4. The Spark-based collection and service processing method for application software running logs according to claim 2, characterized in that said Step 2 comprises the following steps:
Step 2.1: the log data preprocessing service preprocesses the log data, performing three kinds of preprocessing on the raw data: data filtering, data de-duplication and log record segmentation;
Step 2.2: after this simple preprocessing, a text classification algorithm is chosen to classify the log data: a TF-IDF algorithm builds a vector space model (VSM) to vectorize the text, and a KNN algorithm then performs the data classification.
5. The Spark-based collection and service processing method for application software running logs according to claim 2, characterized in that in step 2.1 the log data preprocessing is divided into three parts, specifically:
Step A, data filtering: a log data set contains a large number of unnecessary records, so the log data must be filtered;
Step B, log de-duplication: a log data set contains a large number of repeated records; for example, when a remote service request is interrupted, the same log record may be returned repeatedly, and records that merely repeat the first one returned are of no help to the user's subsequent log data analysis, so the repeated log records must be removed;
Step C, log record classification: the common log format is "time - log level - service name - event"; in this form the log data cannot be read by the user, so the log records must be classified, different classes carrying different meanings.
6. The Spark-based collection and service processing method for application software running logs according to claim 2, characterized in that said Step 3 comprises the following steps:
Step 3.1: log data falls into two kinds: the first is raw log data, whose value density is generally low but which is very valuable when mined further; this kind of raw log data is archived here, and since it is neither accessed frequently nor subject to strict real-time requirements it is stored in a MySQL database; the second kind is log data that has been extracted, cleaned, filtered, screened and preprocessed, which is directly relevant to the user's subsequent analysis and will be accessed frequently, so it is stored in HBase, a high-performance distributed NoSQL database;
Step 3.2: raw log data is an irregular, semi-structured data form whose format varies by type; to let the two different kinds of database cooperate, the log data must be normalized; normalization serves two purposes: scalability and simplification; scalability means accommodating application logs of different types without being constrained by type, and simplification means that normalized log data improves the user's efficiency in log analysis;
Step 3.3: the normalized raw log format is divided into three parts: the log index, the log record body, and the log level; the log index is the core of the whole log data storage format: a service ID is defined for each log service and indexed as the primary key, so that log records can be located quickly and the efficiency of preprocessing and of later query services is improved; the log record body stores the information of the log data itself and is transparent to the log system, which performs fault diagnosis by analyzing this information; the log level is divided into three grades: INFO, ERROR and DEBUG.
7. The Spark-based collection and service processing method for application software running logs according to claim 2, characterized in that said Step 4 comprises the following steps:
Step 4.1: a multi-condition query service is provided to users, who can query log data with multi-condition queries according to their differing needs;
Step 4.2: the log data query module first builds an index over the log data gathered by the log collection module; through the interface the module provides, a user can retrieve all the log data on a given server with a simple multi-condition query and fetch the required log data from the log store in real time; from the message of each log record one can generally check whether the application has gone wrong, and also learn which server the log data came from, among other information.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810091898.6A | 2018-01-30 | 2018-01-30 | Spark-based collection and service processing system and method for application software running logs |
Publications (1)
Publication Number | Publication Date |
---|---|
CN108399199A (en) | 2018-08-14 |
Family
ID=63095380
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810091898.6A Pending CN108399199A (en) | 2018-01-30 | Spark-based collection and service processing system and method for application software running logs |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108399199A (en) |
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104636494A (en) * | 2015-03-04 | 2015-05-20 | Spark-based log audit and reverse-query system for big data platforms |
EP3179387A1 (en) * | 2015-12-07 | 2017-06-14 | Ephesoft Inc. | Analytic systems, methods, and computer-readable media for structured, semi-structured, and unstructured documents |
CN106227832A (en) * | 2016-07-26 | 2016-12-14 | Application of an Internet big data technology framework to enterprise operational analysis |
Non-Patent Citations (1)
Title |
---|
ZHANG Xiao et al.: "Collection and Service Processing Framework for Application Software Running Logs" (应用软件运行日志的收集与服务处理框架), Computer Engineering and Applications (《计算机工程与应用》) *
Cited By (16)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109120445B (en) * | 2018-08-22 | 2021-11-26 | 公安部第三研究所 | Network log data synchronization system and method |
CN109120445A (en) * | 2018-08-22 | 2019-01-01 | 公安部第三研究所 | A kind of network log data synchronous system and method |
CN109542733A (en) * | 2018-12-05 | 2019-03-29 | A kind of highly reliable real-time log collection and visual modeling technique method |
CN109542733B (en) * | 2018-12-05 | 2020-05-01 | 焦点科技股份有限公司 | High-reliability real-time log collection and visual retrieval method |
CN109992569A (en) * | 2019-02-19 | 2019-07-09 | 平安科技(深圳)有限公司 | Cluster log feature extracting method, device and storage medium |
CN110209518A (en) * | 2019-04-26 | 2019-09-06 | 福州慧校通教育信息技术有限公司 | A kind of multi-data source daily record data, which is concentrated, collects storage method and device |
CN110503131A (en) * | 2019-07-22 | 2019-11-26 | 北京工业大学 | Wind-driven generator health monitoring systems based on big data analysis |
CN110503131B (en) * | 2019-07-22 | 2023-10-10 | 北京工业大学 | Wind driven generator health monitoring system based on big data analysis |
CN111177193A (en) * | 2019-12-13 | 2020-05-19 | 航天信息股份有限公司 | Flink-based log streaming processing method and system |
CN111917600A (en) * | 2020-06-12 | 2020-11-10 | 贵州大学 | Spark performance optimization-based network traffic classification device and classification method |
CN112035353A (en) * | 2020-08-28 | 2020-12-04 | 北京浪潮数据技术有限公司 | Log recording method, device, equipment and computer readable storage medium |
CN112035353B (en) * | 2020-08-28 | 2022-06-17 | 北京浪潮数据技术有限公司 | Log recording method, device and equipment and computer readable storage medium |
CN113010483A (en) * | 2020-11-20 | 2021-06-22 | 云智慧(北京)科技有限公司 | Mass log management method and system |
CN112632020B (en) * | 2020-12-25 | 2022-03-18 | 中国电子科技集团公司第三十研究所 | Log information type extraction method and mining method based on spark big data platform |
CN112632020A (en) * | 2020-12-25 | 2021-04-09 | 中国电子科技集团公司第三十研究所 | Log information type extraction method and mining method based on spark big data platform |
CN114301769A (en) * | 2021-12-29 | 2022-04-08 | 杭州迪普信息技术有限公司 | Method and system for processing original flow data |
Legal Events
Date | Code | Title | Description
---|---|---|---
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| RJ01 | Rejection of invention patent application after publication | Application publication date: 20180814 |