CN105262812A - Log data processing method based on cloud computing platform, log data processing device and log data processing system - Google Patents

Log data processing method based on cloud computing platform, log data processing device and log data processing system

Info

Publication number
CN105262812A
Authority
CN
China
Prior art keywords
log
log data
server
data
keyword
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201510672429.XA
Other languages
Chinese (zh)
Inventor
杨吉东
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Inspur Beijing Electronic Information Industry Co Ltd
Original Assignee
Inspur Beijing Electronic Information Industry Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2015-10-16
Filing date
2015-10-16
Publication date
2016-01-20
Application filed by Inspur Beijing Electronic Information Industry Co Ltd
Priority to CN201510672429.XA
Publication of CN105262812A
Legal status: Pending

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06 Management of faults, events, alarms or notifications
    • H04L41/069 Management of faults, events, alarms or notifications using logs of notifications; Post-processing of notifications

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a log data processing method, a log data processing device and a log data processing system based on a cloud computing platform. The method comprises the following steps: a log preprocessing server acquires the corresponding log data from each web page server, preprocesses it to obtain intermediate log data, and transmits the intermediate log data to a log extraction server; the log extraction server then obtains target log data from the intermediate log data according to a preset information extraction strategy. Because the log data is preprocessed first, target log data that satisfies the user's requirements is generated automatically, which greatly improves the user experience.

Description

A log data processing method, device and system based on a cloud computing platform
Technical field
The present invention relates to the field of log information processing, and in particular to a log data processing method, device and system based on a cloud computing platform.
Background
With the rapid development of computer technology and the Internet industry, the Web plays an increasingly prominent role in people's daily work and life, and the volume of Web log data produced is consequently huge. To serve users better, Web log mining has become particularly important. Web log mining is the mining of Web log records: it discovers the browsing patterns of the Web pages that users visit, so that the regularities in the Web log records can be further analyzed and studied, the performance and organizational structure of the Web site can be improved, and personalized services can be provided accordingly.
Faced with massive log data, traditional databases and processing methods cannot process the relevant log data in a timely and effective manner to obtain the rules that improve Web site performance. Distributed systems emerged against this background: a distributed file system can store the logs from every server, and distributed computing can be used to process the log data.
However, the log data is not preprocessed before it is analyzed, which reduces the efficiency of subsequently extracting key information from the log information. In addition, when keywords are used to extract key information from the log data, the degree of association between keywords is not considered, so the extracted key information is fragmented and must be combined manually, which greatly increases the workload and degrades the user experience.
Summary of the invention
The present invention provides a log data processing method, device and system based on a cloud computing platform to solve the above problems.
The present invention provides a log data processing method based on a cloud computing platform. The method comprises the following steps:
A log preprocessing server obtains the corresponding log data from each web page server, preprocesses it to obtain intermediate log data, and sends the intermediate log data to a log extraction server;
The log extraction server obtains target log data from the intermediate log data according to a preset information extraction strategy.
The present invention also provides a log data processing device based on a cloud computing platform, comprising an intermediate log data acquisition module and a log data processing module, wherein the intermediate log data acquisition module is connected to the log data processing module;
The intermediate log data acquisition module is configured to obtain the corresponding log data from each web page server, preprocess it to obtain intermediate log data, and send the intermediate log data to the log data processing module;
The log data processing module is configured to obtain target log information from the intermediate log data according to a preset information extraction strategy.
The present invention also provides a log data processing system based on a cloud computing platform, comprising one or more web page servers, a log preprocessing server and a log extraction server, wherein the one or more web page servers are connected to the log extraction server through the log preprocessing server;
The log preprocessing server is configured to obtain the corresponding log data from each web page server, preprocess it to obtain intermediate log data, and send the intermediate log data to the log extraction server;
The log extraction server is configured to obtain target log data from the intermediate log data according to a preset information extraction strategy.
With the following scheme: the log preprocessing server obtains the corresponding log data from each web page server, preprocesses it to obtain intermediate log data, and sends the intermediate log data to the log extraction server; the log extraction server obtains target log data from the intermediate log data according to a preset information extraction strategy. Target log data that meets the user's requirements is thus generated automatically on the basis of preprocessed log data, which greatly improves the user experience.
With the following scheme: the log extraction server extracts, according to each keyword, the log data corresponding to that keyword from the intermediate log data; the log extraction server then, according to the degree of association between keywords, combines the log data corresponding to associated keywords into target log data and outputs it. Target log data that meets the user's requirements is thus generated automatically, which reduces manual work and improves the user experience.
With the following scheme: preprocessing the log data comprises log data extraction, log data cleaning, log data conversion and log data integration. The log data is thus preprocessed before it is analyzed, which improves the efficiency of subsequently extracting target log data from the log information.
Brief description of the drawings
The drawings described here provide a further understanding of the present invention and form a part of this application. The exemplary embodiments of the present invention and their descriptions are used to explain the present invention and do not unduly limit it. In the drawings:
Figure 1 is a flow chart of the log data processing method based on a cloud computing platform according to Embodiment 1 of the present invention;
Figure 2 is a structure diagram of the log data processing device based on a cloud computing platform according to Embodiment 2 of the present invention;
Figure 3 is a structure diagram of the log data processing system based on a cloud computing platform according to Embodiment 3 of the present invention;
Figure 4 is a structure diagram of the log data processing system based on a cloud computing platform according to Embodiment 4 of the present invention.
Detailed description
The present invention is described in detail below with reference to the drawings and in conjunction with the embodiments. It should be noted that, where no conflict arises, the embodiments in this application and the features in the embodiments may be combined with one another.
Figure 1 is a flow chart of the log data processing method based on a cloud computing platform according to Embodiment 1 of the present invention. The method comprises the following steps:
Step 101: The log preprocessing server obtains the corresponding log data from each web page server;
Further, before the log preprocessing server obtains the corresponding log data from each web page server, the method also comprises:
The business system in each web page server generates log data and stores the log data in a log file.
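As an illustration of the step above, a service running on a web page server might append its log data to a log file as follows. This is only a minimal sketch using Python's standard logging module; the service name, the file location and the line layout are assumptions made for the example and are not specified by the patent.

```python
import logging
import os
import tempfile


def make_service_logger(name: str, log_path: str) -> logging.Logger:
    """Return a logger that appends one line per event to log_path."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep the records out of the root logger
    handler = logging.FileHandler(log_path, encoding="utf-8")
    handler.setFormatter(logging.Formatter("%(asctime)s %(name)s %(message)s"))
    logger.addHandler(handler)
    return logger


# Hypothetical service name and temporary file location.
log_path = os.path.join(tempfile.mkdtemp(), "image_acquisition.log")
service_logger = make_service_logger("image_acquisition", log_path)
service_logger.info("image acquisition time=2015-10-16T08:00:00 camera=3")

with open(log_path, encoding="utf-8") as log_file:
    stored_lines = log_file.read().splitlines()
```

The log preprocessing server would later read files such as this one from each web page server.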
Step 102: The log preprocessing server preprocesses the obtained log data to obtain intermediate log data and sends the intermediate log data to the log extraction server;
Further, preprocessing the log data comprises: log data extraction, log data cleaning, log data conversion and log data integration.
For example, in a specific implementation, log data extraction may extract the log data of a specific time or time period from the log data, or extract the log data from a specific IP address or IP address range; log data cleaning may remove noise data from the log data, or remove irrelevant data from the log data; log data conversion converts the log data into a format suitable for data mining; log data integration combines the log data from multiple data sources and stores it in a consistent data store.
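The four preprocessing operations can be sketched as below. This is a hedged illustration rather than the patented implementation: the line layout ("ISO timestamp, client IP, service name, message"), the LogRecord type and the function names are all assumptions. Cleaning is reduced to dropping malformed lines, and integration to merging the records from all servers into one time-ordered list.

```python
from dataclasses import dataclass
from datetime import datetime
from ipaddress import ip_address, ip_network
from typing import Dict, List, Optional


@dataclass(frozen=True)
class LogRecord:
    source: str          # web page server the line came from
    timestamp: datetime
    client_ip: str
    service: str
    message: str


def parse_line(source: str, line: str) -> Optional[LogRecord]:
    """Conversion: turn a raw line into a structured record, or None if malformed."""
    parts = line.strip().split(" ", 3)
    if len(parts) != 4:
        return None
    try:
        timestamp = datetime.fromisoformat(parts[0])
        ip_address(parts[1])  # validates the address
    except ValueError:
        return None
    return LogRecord(source, timestamp, parts[1], parts[2], parts[3])


def preprocess(sources: Dict[str, List[str]], start: datetime,
               end: datetime, network: str) -> List[LogRecord]:
    """Return intermediate log data for the window [start, end) and IP range."""
    allowed = ip_network(network)
    merged: List[LogRecord] = []
    for source, lines in sources.items():
        for line in lines:
            record = parse_line(source, line)
            if record is None:                                # cleaning
                continue
            if not (start <= record.timestamp < end):         # extraction by time
                continue
            if ip_address(record.client_ip) not in allowed:   # extraction by IP
                continue
            merged.append(record)
    merged.sort(key=lambda r: r.timestamp)                    # integration
    return merged


sources = {
    "web1": [
        "2015-10-16T08:00:00 10.0.0.5 image_acquisition image acquisition time=08:00:00",
        "garbage line",
        "2015-10-17T09:00:00 10.0.0.5 image_acquisition image acquisition time=09:00:00",
    ],
    "web2": [
        "2015-10-16T07:30:00 10.0.0.9 image_stitching number of stitched images=12",
        "2015-10-16T08:10:00 192.168.1.1 image_stitching number of stitched images=4",
    ],
}
intermediate = preprocess(sources, datetime(2015, 10, 16),
                          datetime(2015, 10, 17), "10.0.0.0/24")
```

Here two of the five lines survive: the malformed line is cleaned away, one line falls outside the time window, and one comes from outside the IP range.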
Step 103: The log extraction server obtains target log data from the intermediate log data according to a preset information extraction strategy.
Further, the process by which the log extraction server obtains target log data from the intermediate log data according to the preset information extraction strategy is as follows:
The log extraction server extracts, according to each keyword, the log data corresponding to that keyword from the intermediate log data;
The log extraction server, according to the degree of association between keywords, combines the log data corresponding to associated keywords into target log data and outputs it.
Further, the degree of association between keywords comprises: association between keywords and no association between keywords.
Further, association between keywords means that, for logs generated by services that are related to one another, the keywords used to extract the log data are associated; no association between keywords means that, for log data generated by services that are not related to one another, the keywords used to extract the log data are not associated.
In a specific implementation, for example, an image acquisition service and an image stitching service are related services, whereas a database query service and a defragmentation service are unrelated services.
For the log data generated by the image acquisition service and the image stitching service, the associated keywords ("image acquisition time", "number of stitched images") are used to extract the log data generated by these two services, yielding the log data corresponding to "image acquisition time" and the log data corresponding to "number of stitched images" respectively;
The log data corresponding to "image acquisition time" and the log data corresponding to "number of stitched images" are then combined into target log data and output.
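The keyword-based extraction and association-based combination described above can be sketched as follows. This is a minimal illustration under several assumptions: log data is a list of text lines, a keyword "corresponds" to a line when it occurs in it as a substring, the association is given as explicit groups of keywords, and the keyword strings are English renderings chosen for the example. The patent does not say how unassociated keywords are output; emitting them as separate results is an assumption of this sketch.

```python
from typing import Dict, List, Sequence, Tuple


def extract_by_keyword(lines: Sequence[str],
                       keywords: Sequence[str]) -> Dict[str, List[str]]:
    """For each keyword, collect the log lines that contain it."""
    return {kw: [line for line in lines if kw in line] for kw in keywords}


def combine_associated(extracted: Dict[str, List[str]],
                       association_groups: Sequence[Tuple[str, ...]]) -> List[dict]:
    """Merge the per-keyword results of each group of associated keywords."""
    targets = []
    grouped = set()
    for group in association_groups:
        combined: List[str] = []
        for kw in group:
            combined.extend(extracted.get(kw, []))
            grouped.add(kw)
        # Drop repeats (a line may match several keywords) but keep the order.
        combined = list(dict.fromkeys(combined))
        targets.append({"keywords": tuple(group), "records": combined})
    # Assumption: keywords without an association are output on their own.
    for kw, records in extracted.items():
        if kw not in grouped:
            targets.append({"keywords": (kw,), "records": records})
    return targets


intermediate_lines = [
    "image acquisition time=08:00:00 camera=3",
    "number of stitched images=12 job=7",
    "database query rows=40",
]
keywords = ["image acquisition time", "number of stitched images", "database query"]
groups = [("image acquisition time", "number of stitched images")]

extracted = extract_by_keyword(intermediate_lines, keywords)
targets = combine_associated(extracted, groups)
```

With this input, the first target combines the image acquisition and image stitching lines into one result, so they no longer need to be joined by hand, while the database query line is returned separately.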
Further, after the log extraction server obtains the target log data from the intermediate log data according to the preset information extraction strategy, the method also comprises:
The log extraction server sends the target log data to a client;
The client processes the target log data after receiving it.
Figure 2 is a structure diagram of the log data processing device based on a cloud computing platform according to Embodiment 2 of the present invention. The device comprises an intermediate log data acquisition module 201 and a log data processing module 202, wherein the intermediate log data acquisition module 201 is connected to the log data processing module 202;
The intermediate log data acquisition module 201 is configured to obtain the corresponding log data from each web page server, preprocess it to obtain intermediate log data, and send the intermediate log data to the log data processing module 202;
The log data processing module 202 is configured to obtain target log information from the intermediate log data according to a preset information extraction strategy.
Figure 3 is a structure diagram of the log data processing system based on a cloud computing platform according to Embodiment 3 of the present invention. The system comprises web page server 1, web page server 2, ..., web page server n, a log preprocessing server and a log extraction server, wherein web page server 1, web page server 2, ..., web page server n are each connected to the log extraction server through the log preprocessing server;
The log preprocessing server is configured to obtain the corresponding log data from each web page server, preprocess it to obtain intermediate log data, and send the intermediate log data to the log extraction server;
The log extraction server is configured to obtain target log data from the intermediate log data according to a preset information extraction strategy.
Figure 4 is a structure diagram of the log data processing system based on a cloud computing platform according to Embodiment 4 of the present invention, which adds client 1, client 2, ..., client n to the system of Figure 3, wherein client 1, client 2, ..., client n are each connected to the log extraction server;
The client is configured to process the target log data after obtaining it from the log extraction server.
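The topology of Figures 3 and 4 can be sketched in a single process as below. The class and method names are hypothetical, and the servers are plain objects rather than networked machines. Preprocessing is reduced to dropping blank lines and the extraction strategy to a keyword match, purely to keep the data flow visible.

```python
from typing import Callable, Dict, List

Client = Callable[[List[str]], None]


class LogExtractionServer:
    """Receives intermediate log data and pushes target log data to clients."""

    def __init__(self, keywords: List[str]) -> None:
        self.keywords = keywords
        self.clients: List[Client] = []

    def connect(self, client: Client) -> None:
        self.clients.append(client)

    def receive(self, intermediate: List[str]) -> None:
        # Simplified preset strategy: keep lines containing any keyword.
        target = [line for line in intermediate
                  if any(keyword in line for keyword in self.keywords)]
        for client in self.clients:
            client(target)


class LogPreprocessingServer:
    """Collects log data from every web page server and forwards it."""

    def __init__(self, extraction_server: LogExtractionServer) -> None:
        self.extraction_server = extraction_server

    def collect(self, web_servers: Dict[str, List[str]]) -> None:
        # Simplified preprocessing: strip whitespace and drop blank lines.
        intermediate = [line.strip()
                        for lines in web_servers.values()
                        for line in lines
                        if line.strip()]
        self.extraction_server.receive(intermediate)


received_by_client_1: List[List[str]] = []
received_by_client_2: List[List[str]] = []

extraction = LogExtractionServer(["image acquisition time"])
extraction.connect(received_by_client_1.append)
extraction.connect(received_by_client_2.append)

preprocessing = LogPreprocessingServer(extraction)
preprocessing.collect({
    "web page server 1": ["image acquisition time=08:00:00", "   "],
    "web page server 2": ["database query rows=40"],
})
```

Each connected client receives the same target log data and is free to process it as it sees fit.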
With the following scheme: the log preprocessing server obtains the corresponding log data from each web page server, preprocesses it to obtain intermediate log data, and sends the intermediate log data to the log extraction server; the log extraction server obtains target log data from the intermediate log data according to a preset information extraction strategy. Target log data that meets the user's requirements is thus generated automatically on the basis of preprocessed log data, which greatly improves the user experience.
With the following scheme: the log extraction server extracts, according to each keyword, the log data corresponding to that keyword from the intermediate log data; the log extraction server then, according to the degree of association between keywords, combines the log data corresponding to associated keywords into target log data and outputs it. Target log data that meets the user's requirements is thus generated automatically, which reduces manual work and improves the user experience.
With the following scheme: preprocessing the log data comprises log data extraction, log data cleaning, log data conversion and log data integration. The log data is thus preprocessed before it is analyzed, which improves the efficiency of subsequently extracting target log data from the log information.
The foregoing describes only preferred embodiments of the present invention and is not intended to limit the present invention. For a person skilled in the art, the present invention may have various modifications and variations. Any modification, equivalent replacement or improvement made within the spirit and principles of the present invention shall fall within the protection scope of the present invention.

Claims (10)

1. A log data processing method based on a cloud computing platform, characterized by comprising the following steps:
A log preprocessing server obtains the corresponding log data from each web page server, preprocesses it to obtain intermediate log data, and sends the intermediate log data to a log extraction server;
The log extraction server obtains target log data from the intermediate log data according to a preset information extraction strategy.
2. The method according to claim 1, characterized in that preprocessing the log data comprises: log data extraction, log data cleaning, log data conversion and log data integration.
3. The method according to claim 1, characterized in that the process by which the log extraction server obtains target log data from the intermediate log data according to the preset information extraction strategy is:
The log extraction server extracts, according to each keyword, the log data corresponding to that keyword from the intermediate log data;
The log extraction server, according to the degree of association between keywords, combines the log data corresponding to associated keywords into target log data and outputs it.
4. The method according to claim 3, characterized in that the degree of association between keywords comprises: association between keywords and no association between keywords.
5. The method according to claim 4, characterized in that association between keywords means that, for logs generated by services that are related to one another, the keywords used to extract the log data are associated; no association between keywords means that, for log data generated by services that are not related to one another, the keywords used to extract the log data are not associated.
6. The method according to claim 1, characterized in that, after the log extraction server obtains the target log data from the intermediate log data according to the preset information extraction strategy, the method also comprises:
The log extraction server sends the target log data to a client;
The client processes the target log data after receiving it.
7. The method according to claim 1, characterized in that, before the log preprocessing server obtains the corresponding log data from each web page server, the method also comprises:
The business system in each web page server generates log data and stores the log data in a log file.
8. A log data processing device based on a cloud computing platform, characterized by comprising an intermediate log data acquisition module and a log data processing module, wherein the intermediate log data acquisition module is connected to the log data processing module;
The intermediate log data acquisition module is configured to obtain the corresponding log data from each web page server, preprocess it to obtain intermediate log data, and send the intermediate log data to the log data processing module;
The log data processing module is configured to obtain target log information from the intermediate log data according to a preset information extraction strategy.
9. A log data processing system based on a cloud computing platform, characterized by comprising one or more web page servers, a log preprocessing server and a log extraction server, wherein the one or more web page servers are connected to the log extraction server through the log preprocessing server;
The log preprocessing server is configured to obtain the corresponding log data from each web page server, preprocess it to obtain intermediate log data, and send the intermediate log data to the log extraction server;
The log extraction server is configured to obtain target log data from the intermediate log data according to a preset information extraction strategy.
10. The system according to claim 9, characterized by also comprising one or more clients, wherein the one or more clients are connected to the log extraction server;
The client is configured to process the target log data after obtaining it from the log extraction server.
CN201510672429.XA 2015-10-16 2015-10-16 Log data processing method based on cloud computing platform, log data processing device and log data processing system Pending CN105262812A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510672429.XA CN105262812A (en) 2015-10-16 2015-10-16 Log data processing method based on cloud computing platform, log data processing device and log data processing system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510672429.XA CN105262812A (en) 2015-10-16 2015-10-16 Log data processing method based on cloud computing platform, log data processing device and log data processing system

Publications (1)

Publication Number Publication Date
CN105262812A true CN105262812A (en) 2016-01-20

Family

ID=55102316

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510672429.XA Pending CN105262812A (en) 2015-10-16 2015-10-16 Log data processing method based on cloud computing platform, log data processing device and log data processing system

Country Status (1)

Country Link
CN (1) CN105262812A (en)

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105843941A (en) * 2016-04-01 2016-08-10 北京小米移动软件有限公司 Log checking method and device
WO2017143936A1 (en) * 2016-02-24 2017-08-31 华为技术有限公司 Web log time alignment method and apparatus, and host
CN107480277A (en) * 2017-08-22 2017-12-15 北京京东尚科信息技术有限公司 Method and device for web log file collection
CN107729206A (en) * 2017-09-04 2018-02-23 上海斐讯数据通信技术有限公司 Real-time analysis method, system and the computer-processing equipment of alarm log
CN107870921A (en) * 2016-09-26 2018-04-03 杭州华为数字技术有限公司 A kind of daily record data processing method and processing device
CN111198859A (en) * 2018-11-16 2020-05-26 北京微播视界科技有限公司 Data processing method and device, electronic equipment and computer readable storage medium
CN111782473A (en) * 2020-06-30 2020-10-16 中国工商银行股份有限公司 Distributed log data processing method, device and system
CN112256549A (en) * 2020-11-13 2021-01-22 珠海大横琴科技发展有限公司 Log processing method and device

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102629257A (en) * 2012-02-29 2012-08-08 南京大学 Commodity recommending method of e-commerce website based on keywords
US8583685B2 (en) * 2010-11-02 2013-11-12 Alibaba Group Holding Limited Determination of category information using multiple stages
CN103914478A (en) * 2013-01-06 2014-07-09 阿里巴巴集团控股有限公司 Webpage training method and system and webpage prediction method and system

Cited By (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2017143936A1 (en) * 2016-02-24 2017-08-31 华为技术有限公司 Web log time alignment method and apparatus, and host
US11750438B2 (en) 2016-02-24 2023-09-05 Huawei Technologies Co., Ltd. Network log time alignment method, apparatus, and host
US11140022B2 (en) 2016-02-24 2021-10-05 Huawei Technologies Co., Ltd. Network log time alignment method, apparatus, and host
CN105843941A (en) * 2016-04-01 2016-08-10 北京小米移动软件有限公司 Log checking method and device
CN105843941B (en) * 2016-04-01 2019-07-09 北京小米移动软件有限公司 Log method of calibration and device
CN107870921A (en) * 2016-09-26 2018-04-03 杭州华为数字技术有限公司 A kind of daily record data processing method and processing device
CN107870921B (en) * 2016-09-26 2021-10-15 华为技术有限公司 Log data processing method and device
CN107480277B (en) * 2017-08-22 2021-01-26 北京京东尚科信息技术有限公司 Method and device for collecting website logs
CN107480277A (en) * 2017-08-22 2017-12-15 北京京东尚科信息技术有限公司 Method and device for web log file collection
CN107729206A (en) * 2017-09-04 2018-02-23 上海斐讯数据通信技术有限公司 Real-time analysis method, system and the computer-processing equipment of alarm log
CN111198859A (en) * 2018-11-16 2020-05-26 北京微播视界科技有限公司 Data processing method and device, electronic equipment and computer readable storage medium
CN111198859B (en) * 2018-11-16 2023-11-03 北京微播视界科技有限公司 Data processing method, device, electronic equipment and computer readable storage medium
CN111782473A (en) * 2020-06-30 2020-10-16 中国工商银行股份有限公司 Distributed log data processing method, device and system
CN112256549A (en) * 2020-11-13 2021-01-22 珠海大横琴科技发展有限公司 Log processing method and device
CN112256549B (en) * 2020-11-13 2022-01-04 珠海大横琴科技发展有限公司 Log processing method and device

Similar Documents

Publication Publication Date Title
CN105262812A (en) Log data processing method based on cloud computing platform, log data processing device and log data processing system
US20180124193A1 (en) System and method for displaying contextual activity streams
CN104951544A (en) User data processing method and system and method and system for providing user data
CN103793285A (en) Method and platform server for processing online anomalies
JP6428795B2 (en) Model generation method, word weighting method, model generation device, word weighting device, device, computer program, and computer storage medium
TW200639727A (en) Online printing service system on the internet
NZ583751A (en) A system and method using graphical user interfaces for jury veridct information
WO2016112665A1 (en) Voice data processing method and device
CN103049474A (en) Search query and document-related data translation
CN104102692A (en) Electronic document tracking method based on logs
CN104391706A (en) Reverse engineering based model base structuring method
CN110968571A (en) Big data analysis and processing platform for financial information service
CN103473645A (en) Enterprise internal project evaluation and review system
GB0509904D0 (en) Method, apparatus and computer program for facilitating communication between a client application and a server application
CN104765823A (en) Method and device for collecting website data
CN101673263B (en) Method for searching video content
CN104680398A (en) Acquisition and storage method for mass behavior data of E-commerce users
AU2017409831A1 (en) Advertisement generation method, computer readable storage medium and system
CN105808605A (en) Search log combination method and system
WO2007145775A3 (en) Keyword extraction and contextual advertisement generation
CN105491090B (en) network data processing method and device
US11062239B2 (en) Structuring computer-mediated communication and determining relevant case type
CN106528796A (en) Method for quickly identifying proper nouns in industrial product e-commerce search engine
CN111158677A (en) Page layout data processing method and system for multiple platforms
CN101557310A (en) System for tracing user access information and method

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20160120
