CN105262812A - Log data processing method based on cloud computing platform, log data processing device and log data processing system - Google Patents
- Publication number
- CN105262812A
- Authority
- CN
- China
- Prior art keywords
- log
- log data
- server
- data
- keyword
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/01—Protocols
- H04L67/10—Protocols in which an application is distributed across nodes in the network
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L41/00—Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
- H04L41/06—Management of faults, events, alarms or notifications
- H04L41/069—Management of faults, events, alarms or notifications using logs of notifications; Post-processing of notifications
Landscapes
- Engineering & Computer Science (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention provides a log data processing method, device and system based on a cloud computing platform. The method comprises the following steps: a log preprocessing server acquires the corresponding log data from each webpage server, preprocesses it to obtain intermediate log data, and sends the intermediate log data to a log extraction server; the log extraction server then obtains target log data from the intermediate log data according to a preset information extraction strategy. Because the log data is preprocessed first, target log data that meets the user's requirements is generated automatically, which greatly improves the user experience.
Description
Technical field
The present invention relates to the field of log information processing, and in particular to a log data processing method, device and system based on a cloud computing platform.
Background art
With the rapid development of computer technology and the Internet industry, the Web plays an increasingly prominent role in people's daily work and life, and the volume of Web log data generated is therefore enormous. To serve users better, Web log mining has become particularly important. Web log mining is the mining of Web log records: it discovers the browsing patterns of the Web pages that users access, so that the regularities in the Web log records can be further analyzed and studied, the performance and organizational structure of the Web site can be improved, and personalized services can be provided accordingly.
Faced with massive log data, traditional databases and processing methods cannot process the relevant log data in a timely and effective manner to obtain the regularities that improve Web site performance. Distributed systems emerged against this background: a distributed file system can store the logs on each server, and distributed computing can be used to process the log data.
However, the log data is not preprocessed before it is analyzed, which reduces the efficiency of subsequently extracting key information from the log information. In addition, when keywords are used to extract key information from the log data, the degree of association between keywords is not considered, so the extracted key information is fragmented and must be combined manually, which greatly increases the workload and degrades the user experience.
Summary of the invention
The invention provides a log data processing method, device and system based on a cloud computing platform to solve the above problems.
The invention provides a log data processing method based on a cloud computing platform. The method comprises the following steps:
A log preprocessing server acquires the corresponding log data from each webpage server, preprocesses it to obtain intermediate log data, and sends the intermediate log data to a log extraction server;
The log extraction server obtains target log data from the intermediate log data according to a preset information extraction strategy.
The invention also provides a log data processing device based on a cloud computing platform, comprising an intermediate log data acquisition module and a log data processing module, wherein the intermediate log data acquisition module is connected to the log data processing module;
The intermediate log data acquisition module is configured to acquire the corresponding log data from each webpage server, preprocess it to obtain intermediate log data, and send the intermediate log data to the log data processing module;
The log data processing module is configured to obtain target log information from the intermediate log data according to a preset information extraction strategy.
The invention also provides a log data processing system based on a cloud computing platform, comprising one or more webpage servers, a log preprocessing server and a log extraction server, wherein the one or more webpage servers are connected to the log extraction server through the log preprocessing server;
The log preprocessing server is configured to acquire the corresponding log data from each webpage server, preprocess it to obtain intermediate log data, and send the intermediate log data to the log extraction server;
The log extraction server is configured to obtain target log data from the intermediate log data according to a preset information extraction strategy.
With the following scheme: the log preprocessing server acquires the corresponding log data from each webpage server, preprocesses it to obtain intermediate log data, and sends the intermediate log data to the log extraction server; the log extraction server obtains target log data from the intermediate log data according to a preset information extraction strategy. Target log data that meets the user's requirements is thus generated automatically on the premise that the log data has been preprocessed, which greatly improves the user experience.
With the following scheme: the log extraction server extracts, for each keyword, the corresponding log data from the intermediate log data; the log extraction server then, according to the degree of association between keywords, combines the log data corresponding to associated keywords as the target log data and outputs it. Target log data that meets the user's requirements can thus be generated automatically, which reduces manual operation and improves the user experience.
With the following scheme: preprocessing the log data comprises log data extraction, log data cleaning, log data conversion and log data integration. The log data is thus preprocessed before it is analyzed, which improves the efficiency of subsequently extracting target log data from the log information.
Brief description of the drawings
The drawings described herein are provided for a further understanding of the invention and form a part of this application. The illustrative embodiments of the invention and their description are used to explain the invention and do not constitute an improper limitation of the invention. In the drawings:
Figure 1 is a flowchart of the log data processing method based on a cloud computing platform according to Embodiment 1 of the invention;
Figure 2 is a structural diagram of the log data processing device based on a cloud computing platform according to Embodiment 2 of the invention;
Figure 3 is a structural diagram of the log data processing system based on a cloud computing platform according to Embodiment 3 of the invention;
Figure 4 is a structural diagram of the log data processing system based on a cloud computing platform according to Embodiment 4 of the invention.
Detailed description of the embodiments
The invention is described in detail below with reference to the drawings and in conjunction with the embodiments. It should be noted that, provided there is no conflict, the embodiments in this application and the features in the embodiments may be combined with one another.
Figure 1 is a flowchart of the log data processing method based on a cloud computing platform according to Embodiment 1 of the invention. The method comprises the following steps:
Step 101: the log preprocessing server acquires the corresponding log data from each webpage server;
Further, before the log preprocessing server acquires the corresponding log data from each webpage server, the method also comprises:
The operation system on each webpage server generates log data and stores the log data in a log file.
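As a minimal sketch of Step 101, assume that each webpage server's log file is reachable by the log preprocessing server as a local path; the patent does not specify the transport, so this is only an illustration. The names `collect_logs`, `demo_collect`, `web-1` and `web-2`, and the sample log lines, are hypothetical.

```python
from pathlib import Path
import tempfile


def collect_logs(log_files: dict[str, Path]) -> dict[str, list[str]]:
    """Read the log file of each webpage server and return its non-empty lines.

    Assumption: log files are reachable as local paths; a real deployment
    would fetch them over the network or from a distributed file system.
    """
    collected: dict[str, list[str]] = {}
    for server_name, path in log_files.items():
        text = path.read_text(encoding="utf-8")
        collected[server_name] = [line for line in text.splitlines() if line.strip()]
    return collected


def demo_collect() -> dict[str, list[str]]:
    """Write two hypothetical log files, then collect them."""
    samples = {
        "web-1": ["2015-10-16T08:00:00 10.0.0.1 GET /index", ""],
        "web-2": ["2015-10-16T08:00:05 10.0.0.2 GET /about"],
    }
    with tempfile.TemporaryDirectory() as tmp:
        paths = {}
        for name, lines in samples.items():
            path = Path(tmp) / f"{name}.log"
            path.write_text("\n".join(lines) + "\n", encoding="utf-8")
            paths[name] = path
        return collect_logs(paths)
```

Keeping the result keyed by server name preserves the origin of each line, which the later integration step can use.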
Step 102: the log preprocessing server preprocesses the acquired log data to obtain intermediate log data, and sends the intermediate log data to the log extraction server;
Further, preprocessing the log data comprises: log data extraction, log data cleaning, log data conversion and log data integration.
For example, in a specific implementation, log data extraction may extract the log data of a specific time or time period, or the log data from a specific IP address or IP address range; log data cleaning may remove noise data or irrelevant data from the log data; log data conversion converts the log data into a format suitable for data mining; and log data integration combines log data from multiple data sources and stores it in a consistent data store.
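The four preprocessing operations can be sketched as follows. This is an illustration under stated assumptions, not the patent's implementation: the raw line layout (`<ISO time> <IP> <message>`), the `healthcheck` marker for irrelevant records, and all function names are hypothetical, and the order in which the operations are applied is a choice of this sketch.

```python
from datetime import datetime
from typing import Optional

# Assumed raw line layout (not specified by the patent): "<ISO time> <IP> <message>"


def convert(line: str) -> Optional[dict]:
    """Conversion: parse a raw line into a record; return None if it cannot be parsed."""
    parts = line.split(" ", 2)
    if len(parts) != 3:
        return None
    try:
        time = datetime.fromisoformat(parts[0])
    except ValueError:
        return None
    return {"time": time, "ip": parts[1], "message": parts[2]}


def clean(records: list, irrelevant_marker: str = "healthcheck") -> list:
    """Cleaning: drop noise (unparseable lines) and irrelevant records."""
    return [r for r in records if r is not None and irrelevant_marker not in r["message"]]


def extract(records: list, start: datetime, end: datetime, ip_prefix: str = "") -> list:
    """Extraction: keep records in [start, end] whose IP starts with ip_prefix."""
    return [r for r in records if start <= r["time"] <= end and r["ip"].startswith(ip_prefix)]


def integrate(sources: dict) -> list:
    """Integration: merge records from several sources into one time-ordered list."""
    merged = [dict(r, source=name) for name, records in sources.items() for r in records]
    return sorted(merged, key=lambda r: r["time"])


def preprocess(raw: dict, start: datetime, end: datetime, ip_prefix: str = "") -> list:
    """Run all four operations over {server name: raw lines} to get intermediate log data."""
    per_source = {
        name: extract(clean([convert(line) for line in lines]), start, end, ip_prefix)
        for name, lines in raw.items()
    }
    return integrate(per_source)
```

Parsing first lets the cleaning and extraction operations work on typed fields rather than on raw strings.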
Step 103: the log extraction server obtains target log data from the intermediate log data according to a preset information extraction strategy.
Further, the process by which the log extraction server obtains target log data from the intermediate log data according to the preset information extraction strategy is:
The log extraction server extracts, for each keyword, the corresponding log data from the intermediate log data;
The log extraction server, according to the degree of association between keywords, combines the log data corresponding to associated keywords as the target log data and outputs it.
Further, the degree of association between keywords covers two cases: keywords that are associated, and keywords that are not associated.
Further, keywords are associated when, for logs produced by services that are related to one another, the keywords used to extract the log data are associated with one another; keywords are not associated when, for log data produced by services that are not related to one another, the keywords used to extract the log data have no association.
In a specific implementation, for example, an image acquisition service and an image stitching service are related services, whereas a database query service and a defragmentation service are unrelated services.
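One way to represent the association between keywords is as a set of associated pairs, from which groups of mutually associated keywords are derived. The sketch below uses a small union-find structure for this. Treating association as transitive is an assumption of the sketch, and the function name `group_associated_keywords` is hypothetical; the patent only distinguishes associated from non-associated keywords.

```python
def group_associated_keywords(keywords, associated_pairs):
    """Partition keywords into groups linked, directly or transitively, by association."""
    parent = {k: k for k in keywords}

    def find(k):
        # Follow parent links to the group representative, shortening the path as we go.
        while parent[k] != k:
            parent[k] = parent[parent[k]]
            k = parent[k]
        return k

    for a, b in associated_pairs:
        if a in parent and b in parent:
            parent[find(a)] = find(b)  # merge the two groups

    groups = {}
    for k in keywords:
        groups.setdefault(find(k), []).append(k)
    return list(groups.values())
```

A keyword that appears in no pair ends up in a group of its own, which corresponds to the "not associated" case.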
For the log data produced by the image acquisition service and the image stitching service, the associated keywords ("image acquisition time", "number of stitched images") are used to extract that log data, yielding the log data corresponding to "image acquisition time" and the log data corresponding to "number of stitched images" respectively;
The log data corresponding to "image acquisition time" and the log data corresponding to "number of stitched images" are combined as the target log data and output.
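The keyword example above can be sketched as two small functions: one extracts the log data matching each keyword, the other combines the results of associated keywords into a single target entry. Substring matching, the record format, and the names `extract_by_keyword` and `combine_associated` are assumptions made for illustration.

```python
def extract_by_keyword(records, keywords):
    """Return {keyword: [records containing that keyword]} (substring match assumed)."""
    return {kw: [r for r in records if kw in r] for kw in keywords}


def combine_associated(extracted, keyword_groups):
    """Merge the per-keyword results of each associated group into one target entry."""
    target = []
    for group in keyword_groups:
        entries = [r for kw in group for r in extracted.get(kw, [])]
        if entries:
            target.append({"keywords": list(group), "entries": entries})
    return target


# Hypothetical intermediate log data and one group of associated keywords.
records = [
    "image acquisition time=2015-10-16T08:00:00 camera=1",
    "number of stitched images=12 job=7",
    "disk usage=40%",
]
groups = [("image acquisition time", "number of stitched images")]
extracted = extract_by_keyword(records, [kw for g in groups for kw in g])
target = combine_associated(extracted, groups)
```

Here `target` holds one entry that carries both the acquisition record and the stitching record, so the two no longer need to be combined by hand.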
Further, after the log extraction server obtains the target log data from the intermediate log data according to the preset information extraction strategy, the method also comprises:
The log extraction server sends the target log data to a client;
The client processes the target log data after receiving it.
Figure 2 is a structural diagram of the log data processing device based on a cloud computing platform according to Embodiment 2 of the invention. The device comprises an intermediate log data acquisition module 201 and a log data processing module 202, wherein the intermediate log data acquisition module 201 is connected to the log data processing module 202;
The intermediate log data acquisition module 201 is configured to acquire the corresponding log data from each webpage server, preprocess it to obtain intermediate log data, and send the intermediate log data to the log data processing module 202;
The log data processing module 202 is configured to obtain target log information from the intermediate log data according to a preset information extraction strategy.
Figure 3 is a structural diagram of the log data processing system based on a cloud computing platform according to Embodiment 3 of the invention. The system comprises webpage server 1, webpage server 2 ... webpage server n, a log preprocessing server and a log extraction server, wherein webpage server 1, webpage server 2 ... webpage server n are each connected to the log extraction server through the log preprocessing server;
The log preprocessing server is configured to acquire the corresponding log data from each webpage server, preprocess it to obtain intermediate log data, and send the intermediate log data to the log extraction server;
The log extraction server is configured to obtain target log data from the intermediate log data according to a preset information extraction strategy.
Figure 4 is a structural diagram of the log data processing system based on a cloud computing platform according to Embodiment 4 of the invention. On the basis of Fig. 3, client 1, client 2 ... client n are added, wherein client 1, client 2 ... client n are each connected to the log extraction server;
The client is configured to process the target log data after obtaining it from the log extraction server.
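The topology of Fig. 4 can be sketched as three in-process classes wired together. This is a simplified illustration only: real servers would communicate over a network, preprocessing is reduced here to trimming whitespace and dropping blank lines, and the class and method names are hypothetical.

```python
class Client:
    """Receives target log data from the log extraction server."""

    def __init__(self):
        self.received = []

    def receive(self, target):
        self.received.append(target)  # a real client would process the data here


class LogExtractionServer:
    """Applies a keyword-based extraction strategy and forwards results to clients."""

    def __init__(self, keyword_groups):
        self.keyword_groups = keyword_groups  # each group holds associated keywords
        self.clients = []

    def connect(self, client):
        self.clients.append(client)

    def receive(self, intermediate):
        target = []
        for group in self.keyword_groups:
            hits = [line for line in intermediate if any(kw in line for kw in group)]
            if hits:
                target.append(hits)
        for client in self.clients:
            client.receive(target)
        return target


class LogPreprocessingServer:
    """Gathers logs from the webpage servers, preprocesses them and passes them on."""

    def __init__(self, extraction_server):
        self.extraction_server = extraction_server

    def run(self, logs_by_webserver):
        intermediate = [
            line.strip()
            for lines in logs_by_webserver.values()
            for line in lines
            if line.strip()
        ]
        return self.extraction_server.receive(intermediate)
```

A usage example: create a `Client`, connect it to a `LogExtractionServer` configured with one group of associated keywords, and call `LogPreprocessingServer.run` with a dictionary that maps each webpage server name to its log lines.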
With the following scheme: the log preprocessing server acquires the corresponding log data from each webpage server, preprocesses it to obtain intermediate log data, and sends the intermediate log data to the log extraction server; the log extraction server obtains target log data from the intermediate log data according to a preset information extraction strategy. Target log data that meets the user's requirements is thus generated automatically on the premise that the log data has been preprocessed, which greatly improves the user experience.
With the following scheme: the log extraction server extracts, for each keyword, the corresponding log data from the intermediate log data; the log extraction server then, according to the degree of association between keywords, combines the log data corresponding to associated keywords as the target log data and outputs it. Target log data that meets the user's requirements can thus be generated automatically, which reduces manual operation and improves the user experience.
With the following scheme: preprocessing the log data comprises log data extraction, log data cleaning, log data conversion and log data integration. The log data is thus preprocessed before it is analyzed, which improves the efficiency of subsequently extracting target log data from the log information.
The foregoing describes only the preferred embodiments of the invention and is not intended to limit the invention; for those skilled in the art, the invention may have various modifications and variations. Any modification, equivalent replacement, improvement and the like made within the spirit and principles of the invention shall be included within the scope of protection of the invention.
Claims (10)
1. A log data processing method based on a cloud computing platform, characterized by comprising the following steps:
A log preprocessing server acquires the corresponding log data from each webpage server, preprocesses it to obtain intermediate log data, and sends the intermediate log data to a log extraction server;
The log extraction server obtains target log data from the intermediate log data according to a preset information extraction strategy.
2. The method according to claim 1, characterized in that preprocessing the log data comprises: log data extraction, log data cleaning, log data conversion and log data integration.
3. The method according to claim 1, characterized in that the process by which the log extraction server obtains target log data from the intermediate log data according to the preset information extraction strategy is:
The log extraction server extracts, for each keyword, the corresponding log data from the intermediate log data;
The log extraction server, according to the degree of association between keywords, combines the log data corresponding to associated keywords as the target log data and outputs it.
4. The method according to claim 3, characterized in that the degree of association between keywords covers: keywords that are associated, and keywords that are not associated.
5. The method according to claim 4, characterized in that keywords are associated when, for logs produced by services that are related to one another, the keywords used to extract the log data are associated with one another; and keywords are not associated when, for log data produced by services that are not related to one another, the keywords used to extract the log data have no association.
6. The method according to claim 1, characterized in that, after the log extraction server obtains the target log data from the intermediate log data according to the preset information extraction strategy, the method also comprises:
The log extraction server sends the target log data to a client;
The client processes the target log data after receiving it.
7. The method according to claim 1, characterized in that, before the log preprocessing server acquires the corresponding log data from each webpage server, the method also comprises:
The operation system on each webpage server generates log data and stores the log data in a log file.
8. A log data processing device based on a cloud computing platform, characterized by comprising an intermediate log data acquisition module and a log data processing module, wherein the intermediate log data acquisition module is connected to the log data processing module;
The intermediate log data acquisition module is configured to acquire the corresponding log data from each webpage server, preprocess it to obtain intermediate log data, and send the intermediate log data to the log data processing module;
The log data processing module is configured to obtain target log information from the intermediate log data according to a preset information extraction strategy.
9. A log data processing system based on a cloud computing platform, characterized by comprising one or more webpage servers, a log preprocessing server and a log extraction server, wherein the one or more webpage servers are connected to the log extraction server through the log preprocessing server;
The log preprocessing server is configured to acquire the corresponding log data from each webpage server, preprocess it to obtain intermediate log data, and send the intermediate log data to the log extraction server;
The log extraction server is configured to obtain target log data from the intermediate log data according to a preset information extraction strategy.
10. The system according to claim 9, characterized by also comprising one or more clients, wherein the one or more clients are connected to the log extraction server;
The client is configured to process the target log data after obtaining it from the log extraction server.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510672429.XA CN105262812A (en) | 2015-10-16 | 2015-10-16 | Log data processing method based on cloud computing platform, log data processing device and log data processing system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510672429.XA CN105262812A (en) | 2015-10-16 | 2015-10-16 | Log data processing method based on cloud computing platform, log data processing device and log data processing system |
Publications (1)
Publication Number | Publication Date |
---|---|
CN105262812A true CN105262812A (en) | 2016-01-20 |
Family
ID=55102316
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201510672429.XA Pending CN105262812A (en) | 2015-10-16 | 2015-10-16 | Log data processing method based on cloud computing platform, log data processing device and log data processing system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN105262812A (en) |
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105843941A (en) * | 2016-04-01 | 2016-08-10 | 北京小米移动软件有限公司 | Log checking method and device |
WO2017143936A1 (en) * | 2016-02-24 | 2017-08-31 | 华为技术有限公司 | Web log time alignment method and apparatus, and host |
CN107480277A (en) * | 2017-08-22 | 2017-12-15 | 北京京东尚科信息技术有限公司 | Method and device for web log file collection |
CN107729206A (en) * | 2017-09-04 | 2018-02-23 | 上海斐讯数据通信技术有限公司 | Real-time analysis method, system and the computer-processing equipment of alarm log |
CN107870921A (en) * | 2016-09-26 | 2018-04-03 | 杭州华为数字技术有限公司 | Log data processing method and device |
CN111198859A (en) * | 2018-11-16 | 2020-05-26 | 北京微播视界科技有限公司 | Data processing method and device, electronic equipment and computer readable storage medium |
CN111782473A (en) * | 2020-06-30 | 2020-10-16 | 中国工商银行股份有限公司 | Distributed log data processing method, device and system |
CN112256549A (en) * | 2020-11-13 | 2021-01-22 | 珠海大横琴科技发展有限公司 | Log processing method and device |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102629257A (en) * | 2012-02-29 | 2012-08-08 | 南京大学 | Commodity recommending method of e-commerce website based on keywords |
US8583685B2 (en) * | 2010-11-02 | 2013-11-12 | Alibaba Group Holding Limited | Determination of category information using multiple stages |
CN103914478A (en) * | 2013-01-06 | 2014-07-09 | 阿里巴巴集团控股有限公司 | Webpage training method and system and webpage prediction method and system |
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8583685B2 (en) * | 2010-11-02 | 2013-11-12 | Alibaba Group Holding Limited | Determination of category information using multiple stages |
CN102629257A (en) * | 2012-02-29 | 2012-08-08 | 南京大学 | Commodity recommending method of e-commerce website based on keywords |
CN103914478A (en) * | 2013-01-06 | 2014-07-09 | 阿里巴巴集团控股有限公司 | Webpage training method and system and webpage prediction method and system |
Cited By (15)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2017143936A1 (en) * | 2016-02-24 | 2017-08-31 | 华为技术有限公司 | Web log time alignment method and apparatus, and host |
US11750438B2 (en) | 2016-02-24 | 2023-09-05 | Huawei Technologies Co., Ltd. | Network log time alignment method, apparatus, and host |
US11140022B2 (en) | 2016-02-24 | 2021-10-05 | Huawei Technologies Co., Ltd. | Network log time alignment method, apparatus, and host |
CN105843941A (en) * | 2016-04-01 | 2016-08-10 | 北京小米移动软件有限公司 | Log checking method and device |
CN105843941B (en) * | 2016-04-01 | 2019-07-09 | 北京小米移动软件有限公司 | Log method of calibration and device |
CN107870921A (en) * | 2016-09-26 | 2018-04-03 | 杭州华为数字技术有限公司 | Log data processing method and device |
CN107870921B (en) * | 2016-09-26 | 2021-10-15 | 华为技术有限公司 | Log data processing method and device |
CN107480277B (en) * | 2017-08-22 | 2021-01-26 | 北京京东尚科信息技术有限公司 | Method and device for collecting website logs |
CN107480277A (en) * | 2017-08-22 | 2017-12-15 | 北京京东尚科信息技术有限公司 | Method and device for web log file collection |
CN107729206A (en) * | 2017-09-04 | 2018-02-23 | 上海斐讯数据通信技术有限公司 | Real-time analysis method, system and the computer-processing equipment of alarm log |
CN111198859A (en) * | 2018-11-16 | 2020-05-26 | 北京微播视界科技有限公司 | Data processing method and device, electronic equipment and computer readable storage medium |
CN111198859B (en) * | 2018-11-16 | 2023-11-03 | 北京微播视界科技有限公司 | Data processing method, device, electronic equipment and computer readable storage medium |
CN111782473A (en) * | 2020-06-30 | 2020-10-16 | 中国工商银行股份有限公司 | Distributed log data processing method, device and system |
CN112256549A (en) * | 2020-11-13 | 2021-01-22 | 珠海大横琴科技发展有限公司 | Log processing method and device |
CN112256549B (en) * | 2020-11-13 | 2022-01-04 | 珠海大横琴科技发展有限公司 | Log processing method and device |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN105262812A (en) | Log data processing method based on cloud computing platform, log data processing device and log data processing system | |
US20180124193A1 (en) | System and method for displaying contextual activity streams | |
CN104951544A (en) | User data processing method and system and method and system for providing user data | |
CN103793285A (en) | Method and platform server for processing online anomalies | |
JP6428795B2 (en) | Model generation method, word weighting method, model generation device, word weighting device, device, computer program, and computer storage medium | |
TW200639727A (en) | Online printing service system on the internet | |
NZ583751A (en) | A system and method using graphical user interfaces for jury veridct information | |
WO2016112665A1 (en) | Voice data processing method and device | |
CN103049474A (en) | Search query and document-related data translation | |
CN104102692A (en) | Electronic document tracking method based on logs | |
CN104391706A (en) | Reverse engineering based model base structuring method | |
CN110968571A (en) | Big data analysis and processing platform for financial information service | |
CN103473645A (en) | Enterprise internal project evaluation and review system | |
GB0509904D0 (en) | Method, apparatus and computer program for facilitating communication between a client application and a server application | |
CN104765823A (en) | Method and device for collecting website data | |
CN101673263B (en) | Method for searching video content | |
CN104680398A (en) | Acquisition and storage method for mass behavior data of E-commerce users | |
AU2017409831A1 (en) | Advertisement generation method, computer readable storage medium and system | |
CN105808605A (en) | Search log combination method and system | |
WO2007145775A3 (en) | Keyword extraction and contextual advertisement generation | |
CN105491090B (en) | network data processing method and device | |
US11062239B2 (en) | Structuring computer-mediated communication and determining relevant case type | |
CN106528796A (en) | Method for quickly identifying proper nouns in industrial product e-commerce search engine | |
CN111158677A (en) | Page layout data processing method and system for multiple platforms | |
CN101557310A (en) | System for tracing user access information and method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20160120 |