CN108234245A - Screening method, device, system, and readable medium for log content and log data - Google Patents

Info

Publication number
CN108234245A
Authority
CN
China
Prior art keywords
log
log data
filter condition
client
filtered
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201810020494.8A
Other languages
Chinese (zh)
Inventor
黄凯旋
康凯
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Di Lian Network Technology Co Ltd
Original Assignee
Shanghai Di Lian Network Technology Co Ltd

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00: Arrangements for monitoring or testing data switching networks
    • H04L43/02: Capturing of monitoring data
    • H04L43/026: Capturing of monitoring data using flow identification
    • H04L43/028: Capturing of monitoring data by filtering
    • H04L67/00: Network arrangements or protocols for supporting network services or applications
    • H04L67/01: Protocols
    • H04L67/02: Protocols based on web technology, e.g. hypertext transfer protocol [HTTP]
    • H04L67/06: Protocols specially adapted for file transfer, e.g. file transfer protocol [FTP]
    • H04L41/00: Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06: Management of faults, events, alarms or notifications
    • H04L41/069: Management of faults, events, alarms or notifications using logs of notifications; Post-processing of notifications

Landscapes

  • Engineering & Computer Science
  • Computer Networks & Wireless Communication
  • Signal Processing
  • Information Transfer Between Computers
  • Computer And Data Communications

Abstract

A screening method, device, system, and readable medium for log content and log data. The screening method for log content includes: receiving a log filter condition sent by a client; based on the received filter condition, filtering the log data in a distributed file system using cloud computing resources to obtain the log data that satisfies the filter condition, and generating execution result information; and feeding the execution result information or the log data that satisfies the filter condition back to the client. With this scheme, the computing power of cloud computing resources can be fully used to screen log data in the cloud, avoiding the cumbersome process of downloading the log data to the client and then screening it, and thereby saving network bandwidth and local computing resources.

Description

Screening method, device, system, and readable medium for log content and log data
Technical field
Embodiments of the present invention relate to the field of big data processing, and in particular to a screening method, device, system, and readable medium for log content and log data.
Background
With the spread of Internet applications, massive amounts of log data are produced; log data records every Internet data interaction. By analyzing log data, the scenario of each data access can be reproduced; and by analyzing massive log data, user behavior can be fully understood, providing data support for enterprise strategic decisions.
In existing implementations, when a client needs to obtain log data, it downloads the log data to the local machine via the File Transfer Protocol (FTP) or the Hypertext Transfer Protocol (HTTP) and then screens it to obtain the desired log content.
With the existing approach of downloading log data locally via FTP or HTTP and then screening it, network bandwidth and limited local computing resources are wasted when the log data is very large.
Summary of the invention
The technical problem solved by embodiments of the present invention is how to reduce log filtering time and save network bandwidth and local computing resources.
To solve the above technical problem, an embodiment of the present invention provides a screening method for log content, the method including: receiving a log filter condition sent by a client; based on the received filter condition, filtering the log data in a distributed file system using cloud computing resources to obtain the log data that satisfies the filter condition, and generating execution result information; and feeding the execution result information or the log data that satisfies the filter condition back to the client.
Optionally, the log filter condition is: the log type, the log start time, the log end time, the domain name, and at least one of the following: a uniform resource locator, an HTTP status code, and a server IP address.
Optionally, receiving the log filter condition sent by the client includes: receiving, through a Web end, the log filter condition sent by the client, and converting the received log filter condition into condition parameters in a preset format.
Optionally, the execution result information or the log data that satisfies the filter condition is fed back to the client through the Web end.
Optionally, filtering the log data in the distributed file system using cloud computing resources based on the received filter condition, to obtain the log data that satisfies the filter condition and generate execution result information, includes: allocating Spark distributed computing units; based on the condition parameters in the preset format, filtering the log data in the distributed file system using the allocated Spark distributed computing units to obtain the log data that satisfies the filter condition, and generating execution result information; and feeding the execution result information back to the Web end.
Optionally, filtering the log data in the distributed file system using the allocated Spark distributed computing units, to obtain the log data that satisfies the filter condition and generate execution result information, includes: filtering the file names corresponding to the log data in the distributed file system; filtering, line by line, the log records corresponding to the file names that satisfy the filter condition, to obtain the log data that satisfies the filter condition; and generating execution result information based on whether the log data that satisfies the filter condition was successfully obtained.
Optionally, generating execution result information based on whether the log data that satisfies the filter condition was successfully obtained includes: when the log data that satisfies the filter condition is successfully obtained, generating the following execution result information: the size of the log data that satisfies the filter condition and its storage location in the distributed file system; and when the log data that satisfies the filter condition is not successfully obtained, generating the following execution result information: the reason for the execution failure.
Optionally, feeding the log data that satisfies the filter condition back to the client through the Web end includes: when the log data that satisfies the filter condition is successfully obtained, displaying success indication information through the Web end; receiving, through the Web end, a download instruction sent by the client; and feeding the log data that satisfies the filter condition back to the client through the Web end based on the HTTP protocol.
An embodiment of the present invention provides a screening method for log data, including: receiving, through a Web end, a log filter condition sent by a client; based on the received filter condition, judging whether the log content in a distributed file system needs to be filtered; when the log content in the distributed file system does not need to be filtered, filtering the file names corresponding to the log data to obtain the log data that satisfies the filter condition and feeding it back to the client; and when the log content in the distributed file system needs to be filtered, screening the log data using any of the screening methods for log content described above.
Optionally, the log filter condition includes at least one of the following: log type, log start time, log end time, domain name, uniform resource locator, HTTP status code, and server IP address.
Optionally, judging, based on the received filter condition, whether the log records in the distributed file system need to be filtered includes: when the filter condition includes only one or more of the log type, log start time, log end time, and domain name, determining that the log records in the distributed file system do not need to be filtered; otherwise, determining that the log records in the distributed file system need to be filtered.
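The judgment described above can be sketched as follows. This is a minimal illustration only: the dictionary keys (`log_type`, `start_time`, `end_time`, `domain`, `url`, `status_code`, `server_ip`) are hypothetical names chosen for the example and are not defined by the patent.

```python
# Fields that can be resolved from directory and file names alone.
# The key names are hypothetical; the patent does not define a data format.
NAME_ONLY_FIELDS = {"log_type", "start_time", "end_time", "domain"}


def needs_content_filtering(condition: dict) -> bool:
    """Return True if any supplied field must be checked inside log records."""
    # Ignore fields the client left empty.
    supplied = {key for key, value in condition.items() if value not in (None, "")}
    # Content filtering is needed as soon as one field falls outside the
    # name-only set (for example a URL, status code, or server IP address).
    return not supplied <= NAME_ONLY_FIELDS
```

A condition carrying only a log type and a domain name would take the file-name path, while adding an HTTP status code would route the request to the content-filtering path.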
Optionally, filtering the file names corresponding to the log data includes: establishing a connection with the client; receiving a download target instruction sent by the client, the download target instruction including information corresponding to the target file name that the client wishes to download; based on that information, filtering the file names corresponding to the log data to obtain the target file names that satisfy the filter condition; and sending the log data corresponding to the target file names to the client.
Optionally, filtering the file names corresponding to the log data includes: filtering the file names corresponding to the log data in order through the root directory, second-level directory, and third-level directory in which the log data is stored.
Optionally, the root directory corresponds to the log type, the second-level directory corresponds to the log time, and the third-level directory corresponds to the domain name.
Optionally, filtering the file names corresponding to the log data based on the received download target instruction, to obtain the target file names that satisfy the filter condition, includes: splitting the information corresponding to the target file name that the client wishes to download to obtain the split condition items; and, based on the split condition items, filtering the file names corresponding to the log data to obtain the target file names that satisfy all the condition items.
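As a rough sketch of the split-and-match step, the example below assumes that the target information is split on `-` (the separator used in the example file names of this document) and that a file name matches when it contains every resulting item. Both the separator and the substring test are assumptions; the patent does not specify either.

```python
from typing import List


def split_target(target: str, sep: str = "-") -> List[str]:
    """Split the requested target information into non-empty condition items."""
    return [item for item in target.split(sep) if item]


def match_file_names(target: str, file_names: List[str]) -> List[str]:
    """Keep the file names that contain every condition item."""
    items = split_target(target)
    return [name for name in file_names if all(item in name for item in items)]
```

With this approach a request such as `20170101_00-www.a.com` can match files without naming the server IP address. A plain substring test is deliberately loose (`192.168.16.1` would also match `192.168.16.10`), so a stricter implementation might compare whole name components instead.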
Optionally, sending the log data corresponding to the target file name to the client includes: exchanging information with the client over the established connection to obtain the length of the log data that the client has already obtained; and, based on the difference between the length of the log data corresponding to the target file name and the length of the log data the client has already obtained, sending the remaining log data to the client.
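The length-difference transfer can be illustrated with a small in-memory sketch. The function name and the use of a `bytes` object in place of a file on the distributed file system are assumptions made for the example.

```python
def remaining_log_data(log_data: bytes, client_length: int) -> bytes:
    """Return only the part of the log data the client does not yet hold."""
    if not 0 <= client_length <= len(log_data):
        # A reported length outside this range cannot describe a valid prefix.
        raise ValueError("client length is outside the range of the log data")
    # The length difference is len(log_data) - client_length bytes,
    # starting at the offset the client has already reached.
    return log_data[client_length:]
```

If the client already holds the whole file, the result is empty and nothing further needs to be sent.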
An embodiment of the present invention provides a screening device for log content, including: a first receiving unit, adapted to receive a log filter condition sent by a client; an execution unit, adapted to filter, based on the received filter condition, the log data in a distributed file system using cloud computing resources to obtain the log data that satisfies the filter condition, and to generate execution result information; and a feedback unit, adapted to feed the execution result information back to the client.
Optionally, the log filter condition is: the log type, the log start time, the log end time, the domain name, and at least one of the following: a uniform resource locator, an HTTP status code, and a server IP address.
Optionally, the first receiving unit is adapted to receive, through a Web end, the log filter condition sent by the client, and to convert the received log filter condition into condition parameters in a preset format.
Optionally, the feedback unit is adapted to feed the execution result information or the log data that satisfies the filter condition back to the client through the Web end.
Optionally, the execution unit includes: an allocation subunit, adapted to allocate Spark distributed computing units; an execution subunit, adapted to filter, based on the condition parameters in the preset format, the log data in the distributed file system using the allocated Spark distributed computing units to obtain the log data that satisfies the filter condition, and to generate execution result information; and a first feedback subunit, adapted to feed the execution result information back to the Web end.
Optionally, the execution subunit includes: a first filtering module, adapted to filter the file names corresponding to all log data in the distributed file system; a second filtering module, adapted to filter, line by line, the log records corresponding to the file names that satisfy the filter condition, to obtain the log data that satisfies the filter condition; and a generation module, adapted to generate execution result information based on whether the log data that satisfies the filter condition was successfully obtained.
Optionally, the generation module is adapted to: when the log data that satisfies the filter condition is successfully obtained, generate the following execution result information: the size of the log data that satisfies the filter condition and its storage location in the distributed file system; and when the log data that satisfies the filter condition is not successfully obtained, generate the following execution result information: the reason for the execution failure.
Optionally, the feedback unit includes: a display subunit, adapted to display success indication information through the Web end when the log data that satisfies the filter condition is successfully obtained; a first receiving subunit, adapted to receive, through the Web end, a download instruction sent by the client; and a second feedback subunit, adapted to feed the log data that satisfies the filter condition back to the client through the Web end based on the HTTP protocol.
An embodiment of the present invention provides a screening device for log data, including: a second receiving unit, adapted to receive a log filter condition sent by a client; a judging unit, adapted to judge, based on the received filter condition, whether the log records in a distributed file system need to be filtered; a first screening unit, adapted to filter, when the log records in the distributed file system do not need to be filtered, the file names corresponding to the log data, to obtain the log data that satisfies the filter condition and feed it back to the client; and a second screening unit, adapted to screen the log data, when the log records in the distributed file system need to be filtered, using any of the screening methods for log content described above.
Optionally, the log filter condition includes at least one of the following: log type, log start time, log end time, domain name, uniform resource locator, HTTP status code, and server IP address.
Optionally, the judging unit is adapted to determine, when the filter condition includes only one or more of the log type, log start time, log end time, and domain name, that the log records in the distributed file system do not need to be filtered, and otherwise to determine that the log records in the distributed file system need to be filtered.
Optionally, the first screening unit includes: a connection subunit, adapted to establish a connection with the client; a second receiving subunit, adapted to receive a download target instruction sent by the client, the download target instruction including information corresponding to the target file name that the client wishes to download; a filtering subunit, adapted to filter, based on that information, the file names corresponding to the log data to obtain the target file names that satisfy the filter condition; and a transmission subunit, adapted to send the log data corresponding to the target file names to the client.
Optionally, the filtering subunit is adapted to filter, based on the information corresponding to the target file name that the client wishes to download, the file names corresponding to the log data in order through the root directory, second-level directory, and third-level directory in which the log data is stored.
Optionally, the root directory corresponds to the log type, the second-level directory corresponds to the log time, and the third-level directory corresponds to the domain name.
Optionally, the filtering subunit includes: a splitting module, adapted to split the information corresponding to the target file name that the client wishes to download to obtain the split condition items; and a filtering module, adapted to filter, based on the split condition items, the file names corresponding to the log data to obtain the target file names that satisfy all the condition items.
Optionally, the transmission subunit includes: an acquisition module, adapted to exchange information with the client over the established connection to obtain the length of the log data that the client has already obtained; and a transmission module, adapted to send the remaining log data to the client based on the difference between the length of the log data corresponding to the target file name and the length of the log data the client has already obtained.
An embodiment of the present invention provides a computer-readable storage medium on which computer instructions are stored; when the computer instructions run, the steps of any of the screening methods for log content described above are performed.
An embodiment of the present invention provides a computer-readable storage medium on which computer instructions are stored; when the computer instructions run, the steps of any of the screening methods for log data described above are performed.
An embodiment of the present invention provides a system including a memory and a processor, the memory storing computer instructions that can run on the processor; when the processor runs the computer instructions, it performs the steps of any of the screening methods for log content described above.
An embodiment of the present invention provides a system including a memory and a processor, the memory storing computer instructions that can run on the processor; when the processor runs the computer instructions, it performs the steps of any of the screening methods for log data described above.
Compared with the prior art, the technical solution of the embodiments of the present invention has the following advantages:
The embodiments of the present invention receive the log filter condition sent by the client, filter the log data in the distributed file system using cloud computing resources based on the received filter condition to obtain the log data that satisfies the filter condition, generate execution result information, and feed the execution result information or the log data that satisfies the filter condition back to the client. This makes full use of the computing power of cloud computing resources and screens the log data in the cloud, avoiding the cumbersome process of downloading the log data to the client and then screening it, and thereby saving network bandwidth and local computing resources.
Further, by splitting the information corresponding to the target file name that the client wishes to download, and then filtering the file names corresponding to the log data based on the split condition items to obtain the target file names that satisfy all the condition items, log data can be retrieved by condition even when the file name does not match exactly, which improves the success rate of log data screening.
Further, by sending the remaining log data to the client based on the difference between the length of the log data corresponding to the target file name and the length of the log data already obtained, only the necessary log data is transmitted to the client, which further saves network bandwidth.
Description of the drawings
Fig. 1 is a detailed flowchart of a screening method for log content provided by an embodiment of the present invention;
Fig. 2 is a detailed flowchart of a screening method for log data provided by an embodiment of the present invention;
Fig. 3 is a detailed flowchart of another screening method for log data provided by an embodiment of the present invention;
Fig. 4 is a detailed flowchart of another screening method for log data provided by an embodiment of the present invention;
Fig. 5 is an interaction flowchart of a client and a server provided by an embodiment of the present invention;
Fig. 6 is a detailed flowchart of another screening method for log content provided by an embodiment of the present invention;
Fig. 7 is a detailed flowchart of another screening method for log content provided by an embodiment of the present invention;
Fig. 8 is a structural diagram of another screening system for log content provided by an embodiment of the present invention;
Fig. 9 is a structural diagram of a screening device for log content provided by an embodiment of the present invention;
Fig. 10 is a structural diagram of a screening device for log data provided by an embodiment of the present invention.
Detailed description
In existing implementations, when a client needs to obtain log data, it downloads the log data to the local machine via the FTP or HTTP protocol and then screens it to obtain the desired log content. With this approach, network bandwidth and limited local computing resources are wasted when the log data is very large.
The embodiments of the present invention receive the log filter condition sent by the client, filter the log data in the distributed file system using cloud computing resources based on the received filter condition to obtain the log data that satisfies the filter condition, generate execution result information, and feed the execution result information or the log data that satisfies the log filter condition back to the client. This makes full use of the computing power of cloud computing resources and screens the log data in the cloud, avoiding the cumbersome process of downloading the log data to the client and then screening it, and thereby saving network bandwidth and local computing resources.
To make the above objects, features, and beneficial effects of the present invention clearer and easier to understand, specific embodiments of the present invention are described in detail below with reference to the accompanying drawings.
Referring to Fig. 1, an embodiment of the present invention provides a screening method for log content, including the following steps:
Step S101: receive the log filter condition sent by the client.
In specific implementations, the log data can be stored in the Hadoop Distributed File System (HDFS). HDFS is a distributed file system that can combine multiple physical hosts to provide a service externally under a single directory namespace; to the client, it behaves as if a single "physical host" were providing the service.
In specific implementations, when the client needs to download log data, it sends a log filter condition, which characterizes the features of the log data the client expects to receive. The log filter condition sent by the client is therefore received first, and the log data that satisfies it is then obtained by screening based on that condition.
In specific implementations, to better receive the client's log filter condition, the condition can be received through the Web end; that is, the end client can enter the desired log filter condition on a Web page.
To help those skilled in the art better understand and implement the present invention, an embodiment of the present invention gives the following storage directory for log data in the distributed file system:
/logdownload/
|---------20170101_00
    |---------www.a.com
        |---------192.168.16.1-20170101_00-www.a.com.gz
        |---------192.168.16.2-20170101_00-www.a.com.gz
        |---------192.168.16.3-20170101_00-www.a.com.gz
    |---------www.b.com
        |---------192.168.16.1-20170101_00-www.b.com.gz
        |---------192.168.16.2-20170101_00-www.b.com.gz
        |---------192.168.16.3-20170101_00-www.b.com.gz
Here, logdownload is the root directory and identifies the type of log file, i.e. the log type; 20170101_00 is the second-level directory and identifies the time information of the log data, which is an hourly time slot starting on the hour; www.a.com and www.b.com are third-level directories and identify domain names; 192.168.16.1-20170101_00-www.a.com.gz and the like are the log file names corresponding to the gzip-compressed log data, in which the log content, i.e. the log records, is stored. The log records include information such as the uniform resource locator (URL), the HTTP status code, and the server's Internet Protocol (IP) address. The log files and the log records they contain together constitute the log data.
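The layout described above can be decomposed mechanically. The following sketch assumes every path has exactly the four components shown in the example (log type, hourly slot, domain name, file name) and that the file name follows the pattern `<server IP>-<hourly slot>-<domain>.gz`; the type and function names are hypothetical.

```python
from typing import NamedTuple


class LogPath(NamedTuple):
    log_type: str   # root directory, e.g. "logdownload"
    hour_slot: str  # second-level directory, e.g. "20170101_00"
    domain: str     # third-level directory, e.g. "www.a.com"
    server_ip: str  # leading part of the file name


def parse_log_path(path: str) -> LogPath:
    """Split a storage path into the four pieces of information it encodes."""
    log_type, hour_slot, domain, file_name = path.strip("/").split("/")
    # Assumed file name pattern, taken from the example listing.
    suffix = f"-{hour_slot}-{domain}.gz"
    if not file_name.endswith(suffix):
        raise ValueError(f"unexpected file name: {file_name}")
    return LogPath(log_type, hour_slot, domain, file_name[: -len(suffix)])
```

Because the directory names repeat inside the file name, a mismatch between the two is reported as an error rather than silently accepted.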
In specific implementations, the log type may also be another type, such as plogdownload.
In specific implementations, since the log filter condition needs to characterize the features of the log data the client expects to receive, information such as the log type, the time information of the log data, the domain name information, and the URL, HTTP status code, and server IP address in the log records can be used as the filter condition.
In an embodiment of the present invention, the file names corresponding to the log data can be filtered by log type, log start time, log end time, and domain name to obtain the matching file names; the log records in the log files can then be content-filtered by any one or more of the uniform resource locator, HTTP status code, and server IP address to obtain the log records that satisfy the condition. The log filter condition can therefore be the log type, log start time, log end time, domain name, and at least one of the following: URL, HTTP status code, server IP address.
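A file-name-level filter over this layout might look like the sketch below. It assumes the hourly slot always has the fixed-width form `YYYYMMDD_HH`, so that plain string comparison orders slots chronologically, and it treats both the start and end times as inclusive; neither detail is stated in the patent.

```python
from typing import List


def select_paths(paths: List[str], log_type: str, start: str, end: str,
                 domain: str) -> List[str]:
    """Keep paths whose directories match the type, time range, and domain."""
    selected = []
    for path in paths:
        parts = path.strip("/").split("/")
        if len(parts) != 4:
            continue  # not a log file under the assumed three-level layout
        path_type, hour_slot, path_domain, _file_name = parts
        # Fixed-width "YYYYMMDD_HH" strings sort in chronological order.
        if path_type == log_type and start <= hour_slot <= end and path_domain == domain:
            selected.append(path)
    return selected
```

No file is opened at this stage, which is why conditions limited to these four fields can be answered without reading any log records.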
In specific implementations, since the cloud computing resources may be unable to recognize the log filter condition as entered by the client, the Web end needs to convert the received log filter condition into condition parameters in a preset format. For example, the Web end can convert the received log filter condition into the condition parameters of a custom Jar package and pass them to the custom Jar package program in the cloud computing resources to screen the log data.
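The patent does not describe the preset format itself. Purely as an illustration, the sketch below converts a filter condition into an ordered list of `--key=value` arguments of the kind a command-line program could accept; the argument syntax and all key names are assumptions.

```python
from typing import List

# Hypothetical, fixed parameter order for the preset format.
PARAM_ORDER = ("log_type", "start_time", "end_time", "domain",
               "url", "status_code", "server_ip")


def to_condition_params(condition: dict) -> List[str]:
    """Convert a filter condition into an ordered list of string arguments."""
    unknown = set(condition) - set(PARAM_ORDER)
    if unknown:
        raise ValueError(f"unsupported filter fields: {sorted(unknown)}")
    # Emit fields in a fixed order and skip the ones left empty.
    return [f"--{key}={condition[key]}" for key in PARAM_ORDER
            if condition.get(key) not in (None, "")]
```

Normalizing the order and rejecting unknown fields at the Web end keeps malformed input away from the compute job.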
Step S102, based on the filter condition received, using cloud computing resource to the day in distributed file system Will data are filtered to obtain the daily record data for meeting the filter condition, and generate execution result information.
In specific implementation, the cloud computing resource can be Spark cluster resources, and Spark is a kind of distributed Computational frame, by this frame, the program that script can be performed on a machine, in parallel distributed to more machines It handles simultaneously.It is understood that the cloud computing resource may be other system resources, as long as using cloud frame Structure can realize the system resource of parallel processing capability, belong to the protection domain of the embodiment of the present invention.
In specific implementation, self-defined Jar packets can be run in the cloud computing resource, to distributed file system In daily record data be filtered, generate execution result information.
In an embodiment of the present invention, Spark Distributed Calculation units can be distributed first, be then based on preset format Conditional parameter, using the Spark Distributed Calculations unit of distribution the daily record data in distributed file system is filtered with The daily record data for meeting the filter condition is obtained, and generates execution result information, it is finally again that the execution result information is anti- It is fed to the Web ends.
In a specific implementation, because log data is stored in log files, the file names corresponding to all log data in the distributed file system may be filtered first, and then the content corresponding to the file names that meet the filter condition, that is, the log records, may be filtered line by line to obtain the log data that meets the filter condition (the log records and their corresponding file names). The execution result information is generated according to whether that log data was obtained successfully. For example, based on the condition parameters of the custom Jar package, the custom Jar package may be run on the cloud computing resource to search the HDFS cluster for log files that meet the condition, then filter the records that meet the condition within those log files, and output the matching log records and their corresponding file names to the HDFS cluster for the Web end to read.
In a specific implementation, filtering the log records line by line may consist of reading the records in a log file and using the log filter condition to find, line by line, the log record lines that meet the condition.
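The two filtering stages described above (file names first, then records line by line) can be sketched as follows. This is an in-memory illustration only: the function name, the predicates and the sample records are assumptions, and a real deployment would read the files from HDFS rather than from a dictionary.

```python
from typing import Callable, Dict, List, Tuple


def screen_logs(
    files: Dict[str, List[str]],
    name_matches: Callable[[str], bool],
    line_matches: Callable[[str], bool],
) -> List[Tuple[str, str]]:
    """Return (file name, record) pairs that pass both filtering stages."""
    hits = []
    for name, lines in files.items():
        if not name_matches(name):  # stage 1: filter on the file name
            continue
        for line in lines:  # stage 2: filter the records line by line
            if line_matches(line):
                hits.append((name, line))
    return hits


files = {
    "192.168.16.1-20170101_01-www.a.com.gz": ["GET /index.html 200", "GET /missing 404"],
    "192.168.16.1-20170101_01-www.b.com.gz": ["GET /other 404"],
}
hits = screen_logs(
    files,
    name_matches=lambda name: "www.a.com" in name,
    line_matches=lambda line: line.endswith(" 404"),
)
print(hits)
```

Filtering on file names first means that files for other domains or time ranges are never opened, which keeps the line-by-line stage small.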
In a specific implementation, when log records are filtered by a Spark distributed computing cluster, multiple worker nodes filter multiple files at the same time, so each log file in gz format may output one result file; file writes are thereby kept separate and can proceed at the same time.
In an embodiment of the present invention, the result file is stored in GZIP file format.
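The idea of one result file per input gz file can be illustrated with a small sketch. It uses a local thread pool as a stand-in for the worker nodes; it is not Spark code, and the file names, the `.result.gz` naming and the example condition are assumptions.

```python
import gzip
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, Tuple


def filter_one(item: Tuple[str, bytes]) -> Tuple[str, bytes]:
    """Filter one gz log file and return its own gz result file."""
    name, blob = item
    lines = gzip.decompress(blob).decode("utf-8").splitlines()
    kept = [line for line in lines if line.endswith(" 404")]  # example condition
    payload = "".join(line + "\n" for line in kept).encode("utf-8")
    return name.replace(".gz", ".result.gz"), gzip.compress(payload)


def filter_all(inputs: Dict[str, bytes]) -> Dict[str, bytes]:
    """Run the per-file filter on a worker pool; no two workers share an output."""
    with ThreadPoolExecutor(max_workers=4) as pool:
        return dict(pool.map(filter_one, inputs.items()))


inputs = {
    "a.gz": gzip.compress(b"GET /x 200\nGET /y 404\n"),
    "b.gz": gzip.compress(b"GET /z 404\n"),
}
results = filter_all(inputs)
print(sorted(results))
```

Because each worker writes only its own result, no locking is needed between workers.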
In a specific implementation, once the Web end has successfully submitted the task (that is, the log data screening task) to the Spark cluster, it is no longer associated with the task. To notify the Web end asynchronously of the Spark cluster's screening result, the execution result information may be fed back to the Web end through an internal interface after the task has been executed, regardless of whether log data was obtained successfully. The execution result information may be used to report whether the task succeeded: on failure, it reports a description of the cause of the failure; on success, it reports the size of the log data obtained and the location where it is stored.
In an embodiment of the present invention, the execution result information may be generated according to whether the log data that meets the filter condition was obtained successfully. When it was obtained successfully, the execution result information contains the size of the log data that meets the filter condition and its storage location in the distributed file system. When it was not obtained, the execution result information contains the reason the execution failed.
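A minimal sketch of the two shapes the execution result information can take is given below. The dictionary keys, the function name and the sample HDFS path are hypothetical; the text only states which pieces of information are reported in each case.

```python
from typing import Optional


def build_execution_result(
    size_bytes: Optional[int] = None,
    location: Optional[str] = None,
    failure_reason: Optional[str] = None,
) -> dict:
    """Build the result reported to the Web end (key names are assumptions)."""
    if failure_reason is None and size_bytes is not None and location is not None:
        # Success: report how much log data was obtained and where it is stored.
        return {"success": True, "size": size_bytes, "location": location}
    # Failure: report only the reason.
    return {"success": False, "reason": failure_reason or "unknown failure"}


ok = build_execution_result(size_bytes=2048, location="/results/task-1")
failed = build_execution_result(failure_reason="no matching log files")
print(ok)
print(failed)
```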
Step S103: feed back the execution result information or the log data that meets the filter condition to the client.
In a specific implementation, the execution result information or the log data that meets the filter condition may be fed back to the client through the Web end. For example, the execution result information may be shown to the end user on a Web page.
In an embodiment of the present invention, when the log data that meets the filter condition is obtained successfully, the Web end may display a success indication, then receive the download instruction sent by the client, and then feed back the log data that meets the filter condition to the client over the HTTP protocol.
For example, when log data is not obtained, the Web end displays a failure label (Label) and a failure-cause label; when log data is obtained successfully, the Web end displays a success label, a log data size label and a download button (Button). When the end user clicks the download button, the Web end is triggered to obtain the execution result information, that is, the storage location of the log data on HDFS. It then obtains the GZIP files corresponding to the log data through the HDFS file-reading application programming interface (Application Programming Interface, API), filters out invalid files, and transfers all non-empty files to the client as a Java merged stream.
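The step of dropping invalid files and sending the remaining GZIP files as one merged stream can be approximated in Python. This is only an analogue of the Java merged stream mentioned above, under the assumption that "invalid" means a file that is not valid gzip or that decompresses to nothing; it relies on the fact that concatenated gzip members form a valid gzip stream.

```python
import gzip
import zlib
from typing import Iterable, Iterator


def non_empty_parts(parts: Iterable[bytes]) -> Iterator[bytes]:
    """Yield only the gzip result files that decompress to some content."""
    for blob in parts:
        try:
            if gzip.decompress(blob):
                yield blob
        except (OSError, EOFError, zlib.error):
            continue  # not a readable gzip file: treat it as invalid and skip it


parts = [
    gzip.compress(b"a\n"),
    gzip.compress(b""),  # valid gzip, but empty content
    b"not gzip",         # invalid file
    gzip.compress(b"b\n"),
]
merged = b"".join(non_empty_parts(parts))
print(gzip.decompress(merged))
```

The client receives a single download that it can decompress in one pass, even though the server stored one result file per input file.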
In another embodiment of the present invention, when the log data that meets the filter condition is obtained successfully, the Web end may display the execution result information, that is, the size of the log data that meets the filter condition and its storage location in the distributed file system, so that the client can download that log data from the distributed file system based on the File Transfer Protocol (FTP).
With the above method, the log filter condition sent by the client is received; based on the received filter condition, the log data in the distributed file system is filtered using the cloud computing resource to obtain the log data that meets the filter condition; execution result information is generated; and the execution result information or the log data that meets the filter condition is fed back to the client. This makes full use of the computing capability of the cloud computing resource and screens the log data in the cloud, which spares the client the cumbersome step of downloading log data and then screening it, and so saves network bandwidth and local computing resources.
To enable those skilled in the art to better understand and implement the present invention, an embodiment of the present invention provides a screening method for log data, as shown in Fig. 2.
Referring to Fig. 2, the screening method for log data may include the following steps:
Step S201: receive, through the Web end, the log filter condition sent by the client.
In a specific implementation, the log filter condition may be one or more of the following: log type, log start time, log end time, domain name, URL, HTTP status code, and server IP address.
Step S202: based on the received filter condition, determine whether the log content in the distributed file system needs to be filtered.
In a specific implementation, the file name corresponding to the log data contains the time of the log records and the domain name information. Therefore, when the filter condition contains only one or more of log type, log start time, log end time and domain name, the log content in the distributed file system, that is, the log records, does not need to be filtered; filtering the file directory in which the log data is stored, that is, the log file name, is enough to obtain the log data that meets the filter condition. Otherwise, when the filter condition contains other conditions, such as a URL, the log records in the distributed file system must be filtered to obtain the log data that meets the filter condition.
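The decision in step S202 reduces to checking whether any condition outside the file-name fields was supplied. A minimal sketch, with hypothetical field names:

```python
# Conditions that can be answered from the directory and file name alone.
NAME_ONLY_FIELDS = {"log_type", "start_time", "end_time", "domain"}


def needs_content_filtering(condition: dict) -> bool:
    """True if any supplied condition requires reading the log records."""
    used = {key for key, value in condition.items() if value is not None}
    return bool(used - NAME_ONLY_FIELDS)


by_name = needs_content_filtering({"domain": "www.a.com", "start_time": "20170101_00"})
by_content = needs_content_filtering({"domain": "www.a.com", "url": "/index.html"})
print(by_name, by_content)
```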
Step S203: when the log content in the distributed file system does not need to be filtered, filter the file names corresponding to the log data to obtain the log data that meets the filter condition and feed it back to the client.
In a specific implementation, a connection with the client may first be established and the download target instruction sent by the client received. The download target instruction contains the information corresponding to the target file name to be downloaded, such as the log type, log time information and domain name information. Then, based on that information, the file names corresponding to the log data are filtered to obtain the target file names that meet the filter condition. Finally, the log data corresponding to the target file names is sent to the client.
In a specific implementation, the information corresponding to the target file name to be downloaded may include both the directory in which the log file is stored and information about the target file name itself.
In a specific implementation, the file names corresponding to the log data, that is, the storage directories of the log files, may be filtered in turn in the order of the root directory, second-level directory and third-level directory in which the log data is stored.
In an embodiment of the present invention, the root directory corresponds to the log type, the second-level directory corresponds to the log time, and the third-level directory corresponds to the domain name.
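Filtering level by level over such a directory layout might look like the sketch below. The `type/time/domain/file` path shape follows the embodiment above, while the log type value `access`, the function name and the sample paths are assumptions.

```python
from typing import List, Optional


def filter_by_levels(
    paths: List[str],
    log_type: Optional[str] = None,
    log_time: Optional[str] = None,
    domain: Optional[str] = None,
) -> List[str]:
    """Keep paths whose first three directory levels match the given values."""
    wanted = (log_type, log_time, domain)
    kept = []
    for path in paths:
        levels = path.strip("/").split("/")
        if len(levels) < 4:  # expected shape: type/time/domain/file
            continue
        # A condition left as None matches any value at that level.
        if all(w is None or w == level for w, level in zip(wanted, levels)):
            kept.append(path)
    return kept


paths = [
    "access/20170101_01/www.a.com/192.168.16.1-20170101_01-www.a.com.gz",
    "access/20170101_02/www.a.com/192.168.16.1-20170101_02-www.a.com.gz",
    "access/20170101_01/www.b.com/192.168.16.1-20170101_01-www.b.com.gz",
]
selected = filter_by_levels(paths, log_type="access", log_time="20170101_01")
print(selected)
```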
In a specific implementation, a connection with the client may be established by receiving a connection request sent by the client; the embodiments of the present invention do not limit the specific form of implementation. To keep the log data secure, the client may also be required to send verification information, and the connection with the client is established only after the verification information passes.
In a specific implementation, the information corresponding to the target file name to be downloaded may include the log type, the time of the log records and the corresponding domain name information. To improve retrieval efficiency, one download target instruction may contain multiple domain names so that the log data for all of them is obtained at once. When a download target instruction contains multiple domain names as the filter condition, the information corresponding to the target file name to be downloaded cannot exactly match the file names corresponding to the log data. In this case, that information may be split into condition entries, for example a log start time entry, a log end time entry and a domain name entry, where the domain name entry contains A, B or C. Then, based on the condition entries, the file names corresponding to the log data are filtered to obtain the target file names that meet all condition entries.
In an embodiment of the present invention, the download target instruction sent by the client is: RERT/logdownload/20170101_01/www.a.com/192.168.16.1-20170101_01-www.a.com.gz. The content of the file 192.168.16.1-20170101_01-www.a.com.gz is then sent to the client, which receives it and saves it to a local file; the file may be named 192.168.16.1-20170101_01-www.a.com.gz.
In another embodiment of the present invention, the download target instruction sent by the client is: RERT/logdownload/1483200000_1483207200_www.a.com-wwww.b.com-wwww.c.com.download.gz. Because the file name requested by the client has the suffix .download.gz and does not exactly match any log file name, the request is a conditional download. The download target instruction is split: the file name to be downloaded consists of three parts, namely a start timestamp, an end timestamp and a domain name list, so splitting the file name yields the start time (1483200000, inclusive), the end time (1483207200, exclusive) and the domain name list (www.a.com-wwww.b.com-wwww.c.com). The start time is then converted into the date literal 20170101_00 and the end time into the date literal 20170101_03. The domain name list contains 3 domain names (www.a.com, www.b.com, www.c.com), and a file that matches any one of them is considered to meet the domain name condition. Three condition entries are generated, for start time, end time and domain name, and the server's /logdownload directory is then traversed to match the log files that meet all three condition entries.
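Splitting a conditional download name into its three condition entries can be sketched as below. The function names and the sample file name are hypothetical, and the sketch assumes that domain names in the list do not themselves contain hyphens, since the hyphen is used as the list separator.

```python
from typing import List, Tuple

SUFFIX = ".download.gz"  # suffix that marks a conditional download in the example above


def split_download_name(name: str) -> Tuple[int, int, List[str]]:
    """Split '<start>_<end>_<d1>-<d2>...' + SUFFIX into three condition entries."""
    if not name.endswith(SUFFIX):
        raise ValueError("not a conditional download name")
    start, end, domain_list = name[: -len(SUFFIX)].split("_", 2)
    return int(start), int(end), domain_list.split("-")


def meets_all(record_time: int, domain: str, start: int, end: int, domains: List[str]) -> bool:
    """Start time is inclusive, end time is exclusive, any listed domain matches."""
    return start <= record_time < end and domain in domains


start, end, domains = split_download_name(
    "1483200000_1483207200_www.a.com-www.b.com.download.gz"
)
print(start, end, domains)
```

A log file is selected only when it satisfies all three entries, so `meets_all(1483203600, "www.b.com", start, end, domains)` holds while a record stamped exactly at the end time does not.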
In a specific implementation, to save bandwidth and avoid transmitting unnecessary log data, information may be exchanged with the client over the established connection to obtain the length of the log data the client has already obtained. Then, based on the difference between the length of the log data corresponding to the target file name and the length of the log data the client has already obtained, the remaining log data is sent to the client.
In a specific implementation, the Web end may download the log data that meets the filter condition through the MINA FTP component and transfer it to the client, where the MINA FTP component is a standard implementation component based on the File Transfer Protocol (FTP). It can be understood that the distributed file system may also be mounted on the client as a component, so that the data is downloaded in the manner of a local file system.
Step S204: when the log content in the distributed file system needs to be filtered, screen the log data using any of the log content screening methods described above.
With the above scheme, the information corresponding to the target file name to be downloaded is split, and the file names corresponding to the log data are filtered based on the resulting condition entries to obtain the target file names that meet all condition entries. Log data can therefore be retrieved by condition even when the file name does not match exactly, which improves the success rate of log data screening.
To enable those skilled in the art to better understand and implement the present invention, an embodiment of the present invention provides another screening method for log data, as shown in Fig. 3.
Referring to Fig. 3, the screening method for log data may include the following steps:
Step S301: receive the log filter condition through the Web client.
Step S302: determine whether the log content needs to be filtered; if so, perform step S303; otherwise, perform step S304.
In a specific implementation, whether the log content needs to be filtered may be determined based on the received filter condition.
Step S303: filter the log content.
Step S304: filter the log file names.
To enable those skilled in the art to better understand and implement the present invention, an embodiment of the present invention provides another screening method for log data, as shown in Fig. 4.
Referring to Fig. 4, the screening method for log data may include the following steps:
Step S401: establish a connection with the client.
Step S402: receive the download target signaling sent by the client.
In a specific implementation, the download target signaling contains the information corresponding to the target file name to be downloaded.
Step S403: determine whether the request is a conditional download; if it is, perform step S405; otherwise, perform step S404.
In a specific implementation, when the information corresponding to the target file name to be downloaded exactly matches a file name corresponding to the log data, the target file is obtained directly from that information.
In a specific implementation, when the information corresponding to the target file name to be downloaded cannot be matched to a file name corresponding to the log data, the request is a conditional download: the information must be split into condition entries, and condition filtering is then performed to obtain the target files that meet all condition entries. Because this is condition filtering, two or more files may meet all condition entries.
In an embodiment of the present invention, suffix information, for example .download.gz, may be set in advance and used to determine whether a request is a conditional download. For example, when the suffix contained in the information corresponding to the target file name to be downloaded is consistent with the preset suffix information, the request may be determined to be a conditional download; when it is inconsistent with the preset suffix information, the request may be determined to be an unconditional download, and the target file is obtained directly from the information corresponding to the target file name to be downloaded.
Step S404: add the target file path to LIST.
Step S405: add all file paths that meet the condition entries to LIST.
Step S406: read a file path from LIST.
Step S407: interact with the client to obtain the data length S1 (SkipLength) that the client has already obtained.
Step S408: obtain the size S2 (FileSize) of the target file.
Step S409: determine whether S2 is greater than S1; if S2 is greater than S1, perform step S410; otherwise, perform step S412.
Step S410: send the log data to the client and reset S1 to 0.
In a specific implementation, after the target file has been sent, it is removed from LIST.
Step S411: S1 = S1 - S2.
Step S412: determine whether LIST is empty; if LIST is empty, perform step S413; otherwise, perform step S406.
Step S413: send a message notifying the client that the data has been sent.
In a specific implementation, steps S407 to S412 send the remaining log data to the client based on the difference between the length of the log data corresponding to the target file name and the length of the log data already obtained. Only the necessary log data is transferred to the client and the data already obtained is skipped, which supports resumable transfer and saves bandwidth.
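One possible reading of the skip-length loop in steps S406 to S412 is sketched below. It is an interpretation, not the flow chart itself: it assumes that S1 is cleared once a partially received file has been completed and that S1 is reduced by S2 whenever a whole file is skipped. File contents are held in memory and the names are hypothetical.

```python
from typing import List, Tuple


def remaining_data(files: List[Tuple[str, bytes]], skip_length: int) -> List[Tuple[str, bytes]]:
    """Return the (path, bytes) pieces still to be sent after skipping skip_length bytes."""
    s1 = skip_length  # S1: data length the client has already obtained
    to_send = []
    for path, data in files:  # read each file path in LIST in order
        s2 = len(data)  # S2: size of the target file
        if s2 > s1:
            to_send.append((path, data[s1:]))  # send only the part not yet received
            s1 = 0  # assumed: nothing is left to skip after this file
        else:
            s1 -= s2  # the client already holds this whole file
    return to_send


files = [("a.gz", b"12345"), ("b.gz", b"678")]
resumed = remaining_data(files, skip_length=6)
print(resumed)
```

With a skip length of 6, the first file (5 bytes) is skipped entirely and the second is sent from its second byte onward, which is the behaviour needed for resumable transfer.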
To enable those skilled in the art to better understand and implement the present invention, an embodiment of the present invention provides an interaction flow between a client and a server, as shown in Fig. 5.
Referring to Fig. 5, the interaction flow between an FTP server and an FTP client that use the log data screening method shown in Fig. 4 may include the following steps:
Step S51: the FTP client sends a connection request to the FTP server.
Step S52: the FTP server receives the request and sends verification information to the FTP client.
In a specific implementation, after verification passes, the FTP client sends the download target signaling and the FTP server performs the log filtering and screening steps; for details, refer to the description of the embodiment shown in Fig. 4, which is not repeated here.
Step S5i: the FTP server sends the log data to the FTP client.
In a specific implementation, after the FTP client receives the data from the FTP server, it may save the data to a local file.
To enable those skilled in the art to better understand and implement the present invention, an embodiment of the present invention provides another screening method for log content, as shown in Fig. 6.
Referring to Fig. 6, the screening method for log data may include the following steps:
Step S601: receive the log filtering task submitted by the Web end.
Step S602: allocate the computing resources required by the filtering task.
Step S603: determine whether the computing resources were allocated successfully; if so, perform step S604; otherwise, end the task and return the cause of the failure.
Step S604: execute the log filtering task.
Step S605: the Web end returns the log record files that meet the condition to the client in a single pass as a compressed stream.
In a specific implementation, the Web end may obtain the log record files that meet the condition through the HDFS file read/write API.
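The control flow of steps S602 to S604 can be summarized in a few lines. The `allocate` and `execute` callables, the returned dictionary and the sample resource name are placeholders for whatever the cluster actually provides.

```python
from typing import Callable, List, Optional


def run_filter_task(
    allocate: Callable[[], Optional[str]],
    execute: Callable[[str], List[str]],
) -> dict:
    """Allocate resources, then run the filtering task or report why it could not run."""
    resource = allocate()  # S602: allocate computing resources
    if resource is None:  # S603: allocation failed, end the task
        return {"success": False, "reason": "computing resource allocation failed"}
    return {"success": True, "files": execute(resource)}  # S604: run the task


done = run_filter_task(lambda: "executor-1", lambda res: ["part-0.gz"])
refused = run_filter_task(lambda: None, lambda res: [])
print(done)
print(refused)
```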
To enable those skilled in the art to better understand and implement the present invention, an embodiment of the present invention provides another screening method for log content, as shown in Fig. 7.
Referring to Fig. 7, the screening method for log data may include the following steps:
Step S701: analyze the filter condition and find the log files that meet the filter condition.
Step S702: read the content of the log files and use the content filter condition to find, line by line, the record lines that meet the filter condition.
Step S703: save the log lines that meet the filter condition to the HDFS system.
To enable those skilled in the art to better understand and implement the present invention, an embodiment of the present invention provides a screening system for log data, as shown in Fig. 8.
Referring to Fig. 8, the screening system for log data includes a client 81 and a server 80, where the server 80 includes a Web end 82, a Spark cluster 83 and a Hadoop cluster 84.
Using the log content screening method described above, the interaction flow between the client 81 and the server 80 may include the following steps:
Step S801: the client 81 enters the log filter condition at the Web end 82 and submits it.
In a specific implementation, the log filter condition may be: log start time, log end time, domain name/URL, HTTP status code, server IP address, and log type.
Step S802: the Web end 82 submits the Log Filter task to the Spark cluster 83.
In a specific implementation, the Web end 82 may convert the log filter condition into parameters of the custom Jar package and submit them to the Spark cluster 83, where the custom Log Filter Jar package is a Spark log filtering program written by the user.
Step S803: the Spark cluster 83 obtains the parameters of the Jar package and runs the custom Jar package, searches the Hadoop cluster 84 for log files that meet the filter condition, then filters the records that meet the condition within those log files, and finally outputs the records that meet the filter condition to the HDFS cluster for the Web end 82 to read.
In a specific implementation, the file names corresponding to the log data may be obtained by filtering on start time, end time and domain name; the log records are then filtered based on URL, HTTP status code, server IP address or log type, and the log data that meets the filter condition is stored in the HDFS file system. Because file content is filtered by the Spark distributed computing cluster and multiple worker nodes filter multiple files at the same time, each gz file may output one result file, so that file writes are kept separate and can proceed at the same time.
Step S804: after the custom Spark task has output the files to HDFS, the interface between the Web end 82 and the Spark cluster is called to feed back the task execution result.
In a specific implementation, once the Web end 82 has successfully submitted the task (that is, the log data screening task) to the Spark cluster 83, it is no longer associated with the task. To notify the Web end 82 asynchronously of the screening result of the Spark cluster 83, the execution result information may be fed back to the Web end 82 through an internal interface after the task has been executed, regardless of whether log data was obtained successfully.
Step S805: the Web end 82 obtains the task execution result fed back by the Spark cluster and presents it on the Web page.
In a specific implementation, when log data is not obtained, the Web end displays a failure Label and a failure-cause Label; when log data is obtained successfully, the Web end displays a success Label, a log data size Label and a download Button.
Step S806: the Web end 82 obtains the Spark task execution result and feeds it back to the client 81.
In a specific implementation, when the task succeeds, that is, after the log data that meets the filter condition has been obtained, and the end user clicks the Button, the Web end 82 is triggered to obtain the storage location of the log data on HDFS. It then obtains the GZIP files corresponding to the log data through the HDFS file-reading application programming interface (API), filters out invalid files, and transfers all non-empty files to the client 81 as a Java merged stream.
To enable those skilled in the art to better understand and implement the present invention, an embodiment of the present invention further provides a screening apparatus that implements the log content screening method described above, as shown in Fig. 9.
Referring to Fig. 9, an embodiment of the present invention provides a log content screening apparatus 90, which may include a first receiving unit 91, an execution unit 92 and a feedback unit 93, wherein:
The first receiving unit 91 is adapted to receive the log filter condition sent by the client.
The execution unit 92 is adapted to filter, based on the received filter condition, the log data in the distributed file system using the cloud computing resource to obtain the log data that meets the filter condition, and to generate execution result information.
The feedback unit 93 is adapted to feed back the execution result information or the log data that meets the filter condition to the client.
In a specific implementation, the log filter condition includes log type, log start time, log end time and domain name, and at least one of the following: uniform resource locator, HTTP status code, server IP address.
In an embodiment of the present invention, the first receiving unit 91 is adapted to receive, through the Web end, the log filter condition sent by the client and to convert the received log filter condition into condition parameters of a preset format.
In an embodiment of the present invention, the feedback unit 93 is adapted to feed back, through the Web end, the execution result information or the log data that meets the filter condition to the client.
In a specific implementation, the execution unit 92 includes an allocation subunit 921, an execution subunit 922 and a first feedback subunit 923, wherein:
The allocation subunit 921 is adapted to allocate Spark distributed computing units.
The execution subunit 922 is adapted to filter, based on the condition parameters in the preset format, the log data in the distributed file system using the allocated Spark distributed computing units to obtain the log data that meets the filter condition, and to generate execution result information.
The first feedback subunit 923 is adapted to feed back the execution result information to the Web end.
In a specific implementation, the execution subunit 922 may include a first filtering module (not shown), a second filtering module (not shown) and a generation module (not shown), wherein:
The first filtering module is adapted to filter the file names corresponding to all log data in the distributed file system.
The second filtering module is adapted to filter, line by line, the log records corresponding to the file names that meet the filter condition, to obtain the log data that meets the filter condition.
The generation module is adapted to generate execution result information according to whether the log data that meets the filter condition was obtained successfully.
In an embodiment of the present invention, the generation module is adapted to generate, when the log data that meets the filter condition is obtained successfully, execution result information containing the size of that log data and its storage location in the distributed file system; and to generate, when that log data is not obtained, execution result information containing the reason the execution failed.
In an embodiment of the present invention, the feedback unit 93 includes a display subunit 931, a first receiving subunit 932 and a second feedback subunit 933, wherein:
The display subunit 931 is adapted to display a success indication through the Web end when the log data that meets the filter condition is obtained successfully.
The first receiving subunit 932 is adapted to receive, through the Web end, the download instruction sent by the client.
The second feedback subunit 933 is adapted to feed back, through the Web end and over the HTTP protocol, the log data that meets the filter condition to the client.
In a specific implementation, for the workflow and principle of the log content screening apparatus 90, refer to the description of the method provided in the above embodiments, which is not repeated here.
To enable those skilled in the art to better understand and implement the present invention, an embodiment of the present invention further provides a screening apparatus that implements the log data screening method described above, as shown in Fig. 10.
Referring to Fig. 10, an embodiment of the present invention provides a log data screening apparatus 100, which may include a second receiving unit 11, a judging unit 12, a first screening unit 13 and a second screening unit 14, wherein:
The second receiving unit 11 is adapted to receive the log filter condition sent by the client.
The judging unit 12 is adapted to determine, based on the received filter condition, whether the log records in the distributed file system need to be filtered.
The first screening unit 13 is adapted to filter, when the log records in the distributed file system do not need to be filtered, the file names corresponding to the log data to obtain the log data that meets the filter condition and feed it back to the client.
The second screening unit 14 is adapted to screen, when the log records in the distributed file system need to be filtered, the log data using any of the log data screening methods described above.
In a specific implementation, the log filter condition includes at least one of the following: log type, log start time, log end time, domain name, uniform resource locator, HTTP status code, server IP address.
In an embodiment of the present invention, the judging unit 12 is adapted to determine that the log records in the distributed file system do not need to be filtered when the filter condition contains only one or more of log type, log start time, log end time and domain name, and otherwise to determine that the log records in the distributed file system need to be filtered.
In a specific implementation, the first screening unit 13 includes: a connection subunit 131, a second receiving subunit 132, a filtering subunit 133 and a transmission subunit 134, wherein:
The connection subunit 131 is adapted to establish a connection with the client.
The second receiving subunit 132 is adapted to receive a download target instruction sent by the client, the download target instruction including: information corresponding to the name of the target file to be downloaded.
The filtering subunit 133 is adapted to filter the file names corresponding to the log data based on the information corresponding to the name of the target file to be downloaded, and to obtain the target file name satisfying the filter condition.
The transmission subunit 134 is adapted to send the log data corresponding to the target file name to the client.
In a specific implementation, the filtering subunit 133 is adapted to filter, based on the information corresponding to the name of the target file to be downloaded, the file names corresponding to the log data successively in the order of the root directory, the second-level directory and the third-level directory in which the log data is stored.
In an embodiment of the present invention, the root directory corresponds to the log type, the second-level directory corresponds to the log time, and the third-level directory corresponds to the domain name.
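The directory-ordered filtering described above can be illustrated with a short Python sketch. It is only an assumption-laden example: the path layout `<log type>/<log time>/<domain name>/<file>`, the time format (a string that sorts chronologically) and the function name `filter_by_directory` are hypothetical and are not specified in this document.

```python
from typing import Iterable, List, Optional


def filter_by_directory(
    paths: Iterable[str],
    log_type: Optional[str] = None,
    start_time: Optional[str] = None,
    end_time: Optional[str] = None,
    domain: Optional[str] = None,
) -> List[str]:
    """Keep paths whose directory levels match the supplied conditions.

    Levels are checked in storage order: root directory (log type),
    second-level directory (log time), third-level directory (domain name).
    """
    selected = []
    for path in paths:
        parts = path.strip("/").split("/")
        if len(parts) < 4:
            continue  # not a <type>/<time>/<domain>/<file> path
        p_type, p_time, p_domain = parts[0], parts[1], parts[2]
        if log_type is not None and p_type != log_type:
            continue
        if start_time is not None and p_time < start_time:
            continue
        if end_time is not None and p_time > end_time:
            continue
        if domain is not None and p_domain != domain:
            continue
        selected.append(path)
    return selected
```

Checking the root directory first means that paths of the wrong log type are discarded before their time or domain levels are examined.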
In an embodiment of the present invention, the filtering subunit 133 includes: a splitting module (not shown) and a filtering module (not shown), wherein:
The splitting module is adapted to split the information corresponding to the name of the target file to be downloaded, to obtain the split condition items.
The filtering module is adapted to filter the file names corresponding to the log data based on the split condition items, to obtain the target file names satisfying all the condition items.
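As a rough sketch of how the splitting module and the filtering module might cooperate, the Python below splits a condition string into items and keeps only the file names that satisfy every item. The comma separator, the underscore-delimited file name format with a `.log` suffix, and both function names are assumptions made for illustration; the document does not define these formats.

```python
from typing import Iterable, List


def split_conditions(info: str) -> List[str]:
    """Split the target-file-name information into individual condition items."""
    return [item.strip() for item in info.split(",") if item.strip()]


def filter_file_names(names: Iterable[str], items: List[str]) -> List[str]:
    """Keep the file names whose fields contain all of the condition items."""
    matched = []
    for name in names:
        # Drop the assumed ".log" suffix, then split the name into fields.
        stem = name[: -len(".log")] if name.endswith(".log") else name
        fields = set(stem.split("_"))
        if all(item in fields for item in items):
            matched.append(name)
    return matched
```

For example, `split_conditions("access, example.com")` yields two items, and `filter_file_names` would then retain only names such as `access_20180109_example.com.log` that carry both of them.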
In a specific implementation, the transmission subunit 134 includes: an acquisition module (not shown) and a transmission module (not shown), wherein:
The acquisition module is adapted to exchange information with the client over the established connection with the client, and to obtain the length of the log data that the client has already obtained.
The transmission module is adapted to send the remaining log data to the client based on the difference between the length of the log data corresponding to the target file name and the length of the log data that the client has already obtained.
In a specific implementation, for the workflow and principle of the log data screening apparatus 100, reference may be made to the description of the method provided in the foregoing embodiments; details are not repeated here.
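The length-difference transfer performed by the transmission module can be sketched in a few lines of Python. The function name `remaining_log_data` and the use of an in-memory `bytes` object are assumptions for illustration only; the document does not say how lengths are exchanged or how the data is held.

```python
def remaining_log_data(log_data: bytes, client_received: int) -> bytes:
    """Return the part of log_data that the client has not yet obtained.

    client_received is the length, in bytes, that the client reports having
    already obtained over the established connection.
    """
    if client_received < 0 or client_received > len(log_data):
        raise ValueError("reported length is outside the range of the log data")
    # The difference between the two lengths is exactly the unsent tail.
    return log_data[client_received:]
```

If a transfer is interrupted after the client has obtained part of a file, only the tail returned by this function would need to be sent when the connection is re-established.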
An embodiment of the present invention provides a computer-readable storage medium, the computer-readable storage medium being a non-volatile storage medium or a non-transitory storage medium on which computer instructions are stored, wherein the computer instructions, when run, perform the steps corresponding to any one of the log content screening methods described above; details are not repeated here.
An embodiment of the present invention provides a computer-readable storage medium, the computer-readable storage medium being a non-volatile storage medium or a non-transitory storage medium on which computer instructions are stored, wherein the computer instructions, when run, perform the steps corresponding to any one of the log data screening methods described above; details are not repeated here.
An embodiment of the present invention provides a system including a memory and a processor, the memory storing computer instructions that can be run on the processor, wherein the processor, when running the computer instructions, performs the steps corresponding to any one of the log content screening methods described above; details are not repeated here.
An embodiment of the present invention provides a system including a memory and a processor, the memory storing computer instructions that can be run on the processor, wherein the processor, when running the computer instructions, performs the steps corresponding to any one of the log data screening methods described above; details are not repeated here.
Those of ordinary skill in the art will understand that all or some of the steps of the methods in the foregoing embodiments may be completed by a program instructing relevant hardware. The program may be stored in a computer-readable storage medium, and the storage medium may include: ROM, RAM, a magnetic disk, an optical disc, or the like.
Although the present invention is disclosed as above, it is not limited thereto. Any person skilled in the art may make various changes and modifications without departing from the spirit and scope of the present invention; therefore, the protection scope of the present invention shall be subject to the scope defined by the claims.

Claims (36)

1. A log content screening method, characterized by comprising:
receiving a log filter condition sent by a client;
based on the received filter condition, filtering log data in a distributed file system using cloud computing resources to obtain log data satisfying the filter condition, and generating execution result information;
feeding back the execution result information or the log data satisfying the filter condition to the client.
2. The log content screening method according to claim 1, characterized in that the log filter condition is: log type, log start time, log end time, domain name, and at least one of the following:
uniform resource locator, HTTP status code, server IP address.
3. The log content screening method according to claim 1, characterized in that said receiving a log filter condition sent by a client comprises:
receiving, through a Web end, the log filter condition sent by the client, and converting the received log filter condition into a condition parameter in a preset format.
4. The log content screening method according to claim 3, characterized in that the execution result information or the log data satisfying the filter condition is fed back to the client through the Web end.
5. The log content screening method according to claim 4, characterized in that said filtering, based on the received filter condition, log data in a distributed file system using cloud computing resources to obtain log data satisfying the filter condition, and generating execution result information comprises:
allocating a Spark distributed computing unit;
based on the condition parameter in the preset format, filtering the log data in the distributed file system using the allocated Spark distributed computing unit to obtain the log data satisfying the filter condition, and generating execution result information;
feeding back the execution result information to the Web end.
6. The log content screening method according to claim 5, characterized in that said filtering the log data in the distributed file system using the allocated Spark distributed computing unit to obtain the log data satisfying the filter condition, and generating execution result information comprises:
filtering the file names corresponding to the log data in the distributed file system;
filtering, line by line, the log records corresponding to the file names satisfying the filter condition, to obtain the log data satisfying the filter condition;
generating execution result information based on whether the log data satisfying the filter condition is successfully obtained.
7. The log content screening method according to claim 6, characterized in that said generating execution result information based on whether the log data satisfying the filter condition is successfully obtained comprises:
when the log data satisfying the filter condition is successfully obtained, generating the execution result information as follows: size information of the log data satisfying the filter condition and information on its storage location in the distributed file system;
when the log data satisfying the filter condition is not successfully obtained, generating the execution result information as follows: information on the cause of the execution failure.
8. The log content screening method according to claim 5, characterized in that said feeding back, through the Web end, the log data satisfying the filter condition to the client comprises:
when the log data satisfying the filter condition is successfully obtained, displaying success indication information through the Web end;
receiving, through the Web end, a download instruction sent by the client;
feeding back, through the Web end and based on the HTTP protocol, the log data satisfying the filter condition to the client.
9. A log data screening method, characterized by comprising:
receiving, through a Web end, a log filter condition sent by a client;
based on the received filter condition, judging whether the log content in a distributed file system needs to be filtered;
when the log content in the distributed file system does not need to be filtered, filtering the file names corresponding to the log data, so as to obtain the log data satisfying the filter condition and feed it back to the client;
when the log content in the distributed file system needs to be filtered, screening the log data using the log content screening method according to any one of claims 1 to 8.
10. The log data screening method according to claim 9, characterized in that the log filter condition includes at least one of the following: log type, log start time, log end time, domain name, uniform resource locator, HTTP status code, server IP address.
11. The log data screening method according to claim 10, characterized in that said judging, based on the received filter condition, whether the log records in the distributed file system need to be filtered comprises:
when the filter condition includes only one or more of log type, log start time, log end time and domain name, determining that the log records in the distributed file system do not need to be filtered, and otherwise determining that the log records in the distributed file system need to be filtered.
12. The log data screening method according to claim 9, characterized in that said filtering the file names corresponding to the log data comprises:
establishing a connection with the client;
receiving a download target instruction sent by the client, the download target instruction including: information corresponding to the name of the target file to be downloaded;
based on the information corresponding to the name of the target file to be downloaded, filtering the file names corresponding to the log data and obtaining the target file name satisfying the filter condition;
sending the log data corresponding to the target file name to the client.
13. The log data screening method according to claim 12, characterized in that said filtering the file names corresponding to the log data comprises: filtering the file names corresponding to the log data successively in the order of the root directory, the second-level directory and the third-level directory in which the log data is stored.
14. The log data screening method according to claim 13, characterized in that the root directory corresponds to the log type, the second-level directory corresponds to the log time, and the third-level directory corresponds to the domain name.
15. The log data screening method according to claim 12, characterized in that said filtering, based on the received download target instruction, the file names corresponding to the log data and obtaining the target file name satisfying the filter condition comprises:
splitting the information corresponding to the name of the target file to be downloaded, to obtain the split condition items;
based on the split condition items, filtering the file names corresponding to the log data, to obtain the target file names satisfying all the condition items.
16. The log data screening method according to claim 12, characterized in that said sending the log data corresponding to the target file name to the client comprises:
exchanging information with the client over the established connection with the client, to obtain the length of the log data that the client has already obtained;
sending the remaining log data to the client based on the difference between the length of the log data corresponding to the target file name and the length of the log data that the client has already obtained.
17. A log content screening apparatus, characterized by comprising:
a first receiving unit, adapted to receive a log filter condition sent by a client;
an execution unit, adapted to filter, based on the received filter condition, log data in a distributed file system using cloud computing resources to obtain log data satisfying the filter condition, and to generate execution result information;
a feedback unit, adapted to feed back the execution result information to the client.
18. The log content screening apparatus according to claim 17, characterized in that the log filter condition is: log type, log start time, log end time, domain name, and at least one of the following:
uniform resource locator, HTTP status code, server IP address.
19. The log content screening apparatus according to claim 17, characterized in that the first receiving unit is adapted to receive, through a Web end, the log filter condition sent by the client, and to convert the received log filter condition into a condition parameter in a preset format.
20. The log content screening apparatus according to claim 19, characterized in that the feedback unit is adapted to feed back, through the Web end, the execution result information or the log data satisfying the filter condition to the client.
21. The log content screening apparatus according to claim 20, characterized in that the execution unit comprises:
an allocation subunit, adapted to allocate a Spark distributed computing unit;
an execution subunit, adapted to filter, based on the condition parameter in the preset format, the log data in the distributed file system using the allocated Spark distributed computing unit to obtain the log data satisfying the filter condition, and to generate execution result information;
a first feedback subunit, adapted to feed back the execution result information to the Web end.
22. The log content screening apparatus according to claim 21, characterized in that the execution subunit comprises:
a first filtering module, adapted to filter the file names corresponding to all log data in the distributed file system;
a second filtering module, adapted to filter, line by line, the log records corresponding to the file names satisfying the filter condition, to obtain the log data satisfying the filter condition;
a generation module, adapted to generate execution result information based on whether the log data satisfying the filter condition is successfully obtained.
23. The log content screening apparatus according to claim 22, characterized in that the generation module is adapted to: when the log data satisfying the filter condition is successfully obtained, generate the execution result information as follows: size information of the log data satisfying the filter condition and information on its storage location in the distributed file system; and when the log data satisfying the filter condition is not successfully obtained, generate the execution result information as follows: information on the cause of the execution failure.
24. The log content screening apparatus according to claim 21, characterized in that the feedback unit comprises:
a display subunit, adapted to display success indication information through the Web end when the log data satisfying the filter condition is successfully obtained;
a first receiving subunit, adapted to receive, through the Web end, a download instruction sent by the client;
a second feedback subunit, adapted to feed back, through the Web end and based on the HTTP protocol, the log data satisfying the filter condition to the client.
25. A log data screening apparatus, characterized by comprising:
a second receiving unit, adapted to receive a log filter condition sent by a client;
a judging unit, adapted to judge, based on the received filter condition, whether the log records in a distributed file system need to be filtered;
a first screening unit, adapted to, when the log records in the distributed file system do not need to be filtered, filter the file names corresponding to the log data, so as to obtain the log data satisfying the filter condition and feed it back to the client;
a second screening unit, adapted to, when the log records in the distributed file system need to be filtered, screen the log data using the log content screening method according to any one of claims 1 to 8.
26. The log data screening apparatus according to claim 25, characterized in that the log filter condition includes at least one of the following: log type, log start time, log end time, domain name, uniform resource locator, HTTP status code, server IP address.
27. The log data screening apparatus according to claim 26, characterized in that the judging unit is adapted to determine that the log records in the distributed file system do not need to be filtered when the filter condition includes only one or more of log type, log start time, log end time and domain name, and otherwise to determine that the log records in the distributed file system need to be filtered.
28. The log data screening apparatus according to claim 25, characterized in that the first screening unit comprises:
a connection subunit, adapted to establish a connection with the client;
a second receiving subunit, adapted to receive a download target instruction sent by the client, the download target instruction including: information corresponding to the name of the target file to be downloaded;
a filtering subunit, adapted to filter the file names corresponding to the log data based on the information corresponding to the name of the target file to be downloaded, and to obtain the target file name satisfying the filter condition;
a transmission subunit, adapted to send the log data corresponding to the target file name to the client.
29. The log data screening apparatus according to claim 28, characterized in that the filtering subunit is adapted to filter, based on the information corresponding to the name of the target file to be downloaded, the file names corresponding to the log data successively in the order of the root directory, the second-level directory and the third-level directory in which the log data is stored.
30. The log data screening apparatus according to claim 29, characterized in that the root directory corresponds to the log type, the second-level directory corresponds to the log time, and the third-level directory corresponds to the domain name.
31. The log data screening apparatus according to claim 28, characterized in that the filtering subunit comprises:
a splitting module, adapted to split the information corresponding to the name of the target file to be downloaded, to obtain the split condition items;
a filtering module, adapted to filter the file names corresponding to the log data based on the split condition items, to obtain the target file names satisfying all the condition items.
32. The log data screening apparatus according to claim 28, characterized in that the transmission subunit comprises:
an acquisition module, adapted to exchange information with the client over the established connection with the client, and to obtain the length of the log data that the client has already obtained;
a transmission module, adapted to send the remaining log data to the client based on the difference between the length of the log data corresponding to the target file name and the length of the log data that the client has already obtained.
33. A computer-readable storage medium on which computer instructions are stored, characterized in that the computer instructions, when run, perform the steps of the method according to any one of claims 1 to 8.
34. A computer-readable storage medium on which computer instructions are stored, characterized in that the computer instructions, when run, perform the steps of the method according to any one of claims 9 to 16.
35. A system comprising a memory and a processor, the memory storing computer instructions that can be run on the processor, characterized in that the processor, when running the computer instructions, performs the steps of the method according to any one of claims 1 to 8.
36. A system comprising a memory and a processor, the memory storing computer instructions that can be run on the processor, characterized in that the processor, when running the computer instructions, performs the steps of the method according to any one of claims 9 to 16.
CN201810020494.8A 2018-01-09 2018-01-09 The screening technique of log content and daily record data, device, system, readable medium Pending CN108234245A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810020494.8A CN108234245A (en) 2018-01-09 2018-01-09 The screening technique of log content and daily record data, device, system, readable medium

Publications (1)

Publication Number Publication Date
CN108234245A true CN108234245A (en) 2018-06-29

Family

ID=62641562

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810020494.8A Pending CN108234245A (en) 2018-01-09 2018-01-09 The screening technique of log content and daily record data, device, system, readable medium

Country Status (1)

Country Link
CN (1) CN108234245A (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102238023A (en) * 2010-04-23 2011-11-09 中兴通讯股份有限公司 Method and device for generating warning data of network management system
CN103401937A (en) * 2013-08-07 2013-11-20 中国科学院信息工程研究所 Log data processing method and system
CN103580899A (en) * 2012-08-01 2014-02-12 中兴通讯股份有限公司 Method and system for managing event logs, cloud service client side and virtualization platform
US20150163199A1 (en) * 2012-04-30 2015-06-11 Zscaler, Inc. Systems and methods for integrating cloud services with information management systems
CN105468737A (en) * 2015-11-24 2016-04-06 湖北大学 Web service big data analysis method, cloud computing platform and mining system
CN106254096A (en) * 2016-07-21 2016-12-21 柳州龙辉科技有限公司 A kind of processing means of Linux daily record
CN106649044A (en) * 2016-12-28 2017-05-10 深圳市深信服电子科技有限公司 Log processing method, device and system based on container cloud system

Cited By (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109213660A (en) * 2018-08-28 2019-01-15 郑州云海信息技术有限公司 A kind of log-output method and device based on filter condition
CN113396395A (en) * 2018-12-20 2021-09-14 皇家飞利浦有限公司 Method for effectively evaluating log mode
CN109902073A (en) * 2019-04-03 2019-06-18 北京奇安信科技有限公司 Log processing method, device, computer equipment and computer readable storage medium
CN109902073B (en) * 2019-04-03 2020-12-29 奇安信科技集团股份有限公司 Log processing method and device, computer equipment and computer readable storage medium
CN111831542A (en) * 2019-04-23 2020-10-27 华为技术有限公司 API application debugging method and device and storage medium
CN111831542B (en) * 2019-04-23 2022-04-05 华为技术有限公司 API application debugging method and device and storage medium
CN110232048A (en) * 2019-06-12 2019-09-13 腾讯科技(成都)有限公司 Acquisition methods, device and the storage medium of journal file
CN111045848A (en) * 2019-12-19 2020-04-21 广州唯品会信息科技有限公司 Log analysis method, terminal device and computer-readable storage medium
CN111045848B (en) * 2019-12-19 2024-04-19 广州唯品会信息科技有限公司 Log analysis method, terminal device and computer readable storage medium
CN111061697A (en) * 2019-12-25 2020-04-24 中国联合网络通信集团有限公司 Log data processing method and device, electronic equipment and storage medium
CN111061697B (en) * 2019-12-25 2023-06-13 中国联合网络通信集团有限公司 Log data processing method and device, electronic equipment and storage medium
CN111078657A (en) * 2019-12-26 2020-04-28 北京思特奇信息技术股份有限公司 Service log query method, system, medium and equipment of distributed system
CN112351090A (en) * 2020-10-29 2021-02-09 深圳Tcl新技术有限公司 Log information transmission method and device based on intelligent large screen and storage medium
CN112351090B (en) * 2020-10-29 2024-04-05 深圳Tcl新技术有限公司 Log information transmission method and device based on intelligent large screen and storage medium
CN113342748A (en) * 2021-07-05 2021-09-03 北京腾云天下科技有限公司 Log data processing method and device, distributed computing system and storage medium
CN113687974A (en) * 2021-10-22 2021-11-23 飞狐信息技术(天津)有限公司 Client log processing method and device and computer equipment
CN116149958A (en) * 2023-04-20 2023-05-23 华谱科仪(北京)科技有限公司 Chromatograph operation and maintenance method and device based on log management function
CN116149958B (en) * 2023-04-20 2023-06-27 华谱科仪(北京)科技有限公司 Chromatograph operation and maintenance method and device based on log management function
CN116719874A (en) * 2023-08-08 2023-09-08 深圳复临科技有限公司 MVC architecture-based data unification system, method, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20180629