CN108881352B - Method, device and system for processing click log - Google Patents
Method, device and system for processing click log
- Publication number
- CN108881352B (application CN201710342330.2A)
- Authority
- CN
- China
- Prior art keywords
- click log
- real
- data
- click
- temporary storage
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/50—Network services
- H04L67/56—Provisioning of proxy services
- H04L67/568—Storing data temporarily at an intermediate stage, e.g. caching
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L43/00—Arrangements for monitoring or testing data switching networks
- H04L43/08—Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/01—Protocols
- H04L67/10—Protocols in which an application is distributed across nodes in the network
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/01—Protocols
- H04L67/10—Protocols in which an application is distributed across nodes in the network
- H04L67/1001—Protocols in which an application is distributed across nodes in the network for accessing one among a plurality of replicated servers
Landscapes
- Engineering & Computer Science (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Environmental & Geological Engineering (AREA)
- Debugging And Monitoring (AREA)
- Information Transfer Between Computers (AREA)
Abstract
The disclosure provides a method, a device and a system for processing a click log. The method comprises the following steps: sending a click log to the data temporary storage station corresponding to a preset condition that the click log satisfies; acquiring a first click log from a first data temporary storage station that matches the data requirement for calculating a first real-time index; and calculating the first real-time index using the first click log.
Description
Technical Field
The present disclosure relates to the field of internet technologies, and in particular, to a method, an apparatus, and a system for processing a click log.
Background
When a user accesses a website, a server receives the request and sends data to the user's client. Each subsequent click by the user at any position on any page of the website generates a click log.
With the widespread use of internet applications, the number of click logs generated per unit time is very large and highly random, especially during, for example, large promotion periods. The processing of click logs therefore requires high timeliness and stability, for example, to avoid a system outage caused by data accumulation.
When click logs are processed in real time, the generated click logs are collected together in a data temporary storage master station, such as a Kafka data system. When the various real-time indexes related to the click logs are calculated in a subsequent process, the required click logs are obtained from the data temporary storage master station for calculation. However, with this approach the data for calculating every real-time index must be obtained from the master station, which may place significant load on it. Moreover, for each real-time index, the master station holds a large amount of data irrelevant to that index, so extra computing resources are consumed in the calculation and the timeliness of the real-time index calculation suffers.
Disclosure of Invention
In view of this, the present disclosure provides a method, an apparatus, and a system for processing a click log, which can relieve the data processing pressure on a data temporary storage master station and improve the real-time processing efficiency of click logs.
One aspect of the present disclosure provides a method of processing a click log. The method comprises: sending a click log to the data temporary storage station corresponding to a preset condition that the click log satisfies; acquiring a first click log from a first data temporary storage station that matches the data requirement for calculating a first real-time index; and calculating the first real-time index using the first click log.
According to an embodiment of the disclosure, before the click log is sent to the data temporary storage station corresponding to the preset condition that it satisfies, the method further comprises sending the click log to a data temporary storage master station when the click log is generated.
According to an embodiment of the disclosure, the method further comprises: when the data requirement for calculating a second real-time index cannot be matched with any data temporary storage station, obtaining a second click log from the data temporary storage master station, and then calculating the second real-time index using the second click log.
According to an embodiment of the disclosure, sending the click log to the data temporary storage station corresponding to the preset condition comprises: judging a first parameter of the click log; and sending the click log to the data temporary storage station corresponding to the preset condition that the first parameter satisfies.
According to an embodiment of the disclosure, the preset condition includes a topic type of the click log and/or a generation source of the click log.
According to an embodiment of the present disclosure, the data staging station includes the distributed publish-subscribe messaging system Kafka.
Another aspect of the present disclosure provides an apparatus for processing a click log, including: a click log distribution module, configured to send a click log to the data temporary storage station corresponding to a preset condition that the click log satisfies; a first acquisition module, configured to acquire a first click log from a first data temporary storage station that matches the data requirement for calculating a first real-time index; and a first calculation module, configured to calculate the first real-time index using the first click log.
According to an embodiment of the disclosure, the apparatus further comprises a click log collection module, configured to send the click log to a data temporary storage master station when the click log is generated.
According to an embodiment of the present disclosure, the apparatus further comprises: a second acquisition module, configured to acquire a second click log from the data temporary storage master station when the data requirement for calculating a second real-time index cannot be matched with any data temporary storage station; and a second calculation module, configured to calculate the second real-time index using the second click log.
According to an embodiment of the present disclosure, the click log distribution module includes: a judging submodule, configured to judge a first parameter of the click log; and a distribution submodule, configured to send the click log to the data temporary storage station corresponding to the preset condition that the first parameter satisfies.
According to an embodiment of the disclosure, the preset condition includes a topic type of the click log and/or a generation source of the click log.
According to an embodiment of the present disclosure, the data staging station includes the distributed publish-subscribe messaging system Kafka.
Another aspect of the present disclosure provides a system for processing a click log, including one or more memories storing executable instructions; and one or more processors executing the executable instructions to implement the method as described above.
Another aspect of the disclosure provides a computer-readable storage medium having executable instructions stored thereon which, when executed by a processor, implement the method described above.
According to embodiments of the disclosure, the real-time click log processing pressure on the data temporary storage master station can be at least partially relieved, and both the backlog of large numbers of click logs in the master station and data redundancy during real-time index calculation are largely avoided, thereby improving the timeliness and stability of click log processing.
Drawings
The above and other objects, features and advantages of the present disclosure will become more apparent from the following description of embodiments of the present disclosure with reference to the accompanying drawings, in which:
FIG. 1 schematically illustrates an exemplary system architecture to which the method and apparatus for processing a click log of the present disclosure may be applied;
FIG. 2 schematically illustrates a flow chart of a method of processing a click log according to an embodiment of the present disclosure;
FIG. 3 schematically illustrates a flow diagram of a method of processing a click log according to another embodiment of the present disclosure;
FIG. 4 schematically illustrates a flow diagram of a method of processing a click log according to yet another embodiment of the present disclosure;
FIG. 5 is a flow chart that schematically illustrates a method for sending a click log to a data staging station according to a preset condition that the click log satisfies, in accordance with an embodiment of the present disclosure;
FIG. 6 schematically illustrates a block diagram of an apparatus for processing a click log according to an embodiment of the present disclosure;
FIG. 7 schematically illustrates a block diagram of an apparatus for processing a click log according to another embodiment of the present disclosure;
FIG. 8 schematically illustrates a block diagram of an apparatus for processing a click log according to still another embodiment of the present disclosure;
FIG. 9 schematically illustrates a block diagram of a click log distribution module according to an embodiment of the present disclosure; and
FIG. 10 schematically illustrates a block diagram of a computer system that processes a click log according to an embodiment of the disclosure.
Detailed Description
Hereinafter, embodiments of the present disclosure will be described with reference to the accompanying drawings. It should be understood that the description is illustrative only and is not intended to limit the scope of the present disclosure. Moreover, in the following description, descriptions of well-known structures and techniques are omitted so as to not unnecessarily obscure the concepts of the present disclosure.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the disclosure. The words "a", "an" and "the" and the like as used herein are also intended to include the plural forms unless the context clearly dictates otherwise. Furthermore, the terms "comprises," "comprising," and the like, as used herein, specify the presence of stated features, steps, operations, and/or components, but do not preclude the presence or addition of one or more other features, steps, operations, or components.
All terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs, unless otherwise defined. It is noted that the terms used herein should be interpreted as having a meaning that is consistent with the context of this specification and should not be interpreted in an idealized or overly formal sense.
Some block diagrams and/or flow diagrams are shown in the figures. It will be understood that some blocks of the block diagrams and/or flowchart illustrations, or combinations thereof, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the instructions, which execute via the processor, create means for implementing the functions/acts specified in the block diagrams and/or flowchart block or blocks.
Accordingly, the techniques of this disclosure may be implemented in hardware and/or software (including firmware, microcode, etc.). In addition, the techniques of this disclosure may take the form of a computer program product on a computer-readable medium having instructions stored thereon for use by or in connection with an instruction execution system. In the context of this disclosure, a computer-readable medium may be any medium that can contain, store, communicate, propagate, or transport the instructions. For example, the computer readable medium can include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, device, or propagation medium. Specific examples of the computer readable medium include: magnetic storage devices, such as magnetic tape or Hard Disk Drives (HDDs); optical storage devices, such as compact disks (CD-ROMs); a memory, such as a Random Access Memory (RAM) or a flash memory; and/or wired/wireless communication links.
An embodiment of the disclosure provides a method, a device and a system for processing a click log. In the method, a click log is sent to the data temporary storage station corresponding to the preset condition that the click log satisfies. When a first real-time index is to be calculated, the first click log required for the calculation is acquired from a first data temporary storage station that matches the data requirement of that index, and the first real-time index is calculated using the first click log. In this way, the first click log can be obtained in a targeted manner from the data temporary storage station matching the first real-time index, which reduces data redundancy in the calculation and improves the timeliness of click log processing.
FIG. 1 schematically illustrates an exemplary system architecture to which the method and apparatus for processing a click log of the present disclosure may be applied.
As shown in fig. 1, the system architecture 100 may include terminal devices 101, 102, 103, a network 104, and a server 105 (this architecture is merely an example, and the components included in a particular architecture may be adjusted according to application specific circumstances). The network 104 serves as a medium for providing communication links between the terminal devices 101, 102, 103 and the server 105. Network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, to name a few.
A user may use terminal devices 101, 102, 103 to interact with a server 105 over a network 104 to receive or send messages or the like. The terminal devices 101, 102, 103 may have installed thereon various communication client applications, such as shopping-like applications, web browser applications, search-like applications, instant messaging tools, mailbox clients, social platform software, etc. (by way of example only).
The terminal devices 101, 102, 103 may be various electronic devices having a display screen and supporting web browsing, including but not limited to smart phones, tablet computers, laptop portable computers, desktop computers, and the like.
The server 105 may be a server providing various services, such as a background management server (by way of example only) that supports shopping websites browsed by users on the terminal devices 101, 102, 103. The background management server may analyze and otherwise process received data such as a product information query request, and feed back a processing result (for example, target push information or product information, by way of example only) to the terminal device.
A click by a user at any position on any page of the website, made via terminal devices 101, 102, and/or 103, may be sent to the server 105 via the network 104, and a corresponding click log is then generated in the server 105. The click logs can then be processed according to the method for processing click logs provided by the embodiments of the disclosure.
It should be noted that the method for processing the click log provided by the embodiment of the present disclosure may be executed by the server 105, or may be executed by another server or a server cluster different from the server 105. Accordingly, the device for processing the click log may be disposed in the server 105, or may be disposed in another server or a server cluster other than the server 105.
It should be understood that the number of terminal devices, networks, and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
FIG. 2 schematically shows a flow chart of a method of processing a click log according to an embodiment of the present disclosure.
As shown in fig. 2, the method of processing a click log according to an embodiment of the present disclosure includes operations S210 to S230.
In operation S210, according to a preset condition that the click log satisfies, the click log is sent to a data staging station corresponding to the preset condition.
Specifically, when a user clicks on a website, a click log reflecting the click behavior is generated. The click log is then sent, in a targeted manner, to the data temporary storage station corresponding to the preset condition that it satisfies.
The preset condition may be a condition defined on the data information that a click log may carry.
According to an embodiment of the present disclosure, the preset condition includes a topic type of the click log and/or a generation source of the click log.
Specifically, the preset condition may include only the topic type of the click log, only the generation source of the click log, or both the topic type and the generation source of the click log.
According to an embodiment of the disclosure, the topic type of a click log can be determined from information in the click log that reflects the effect or purpose of the user's click behavior.
Specifically, when the user clicks on different webpages, since the functions and/or the displayed contents of the different webpages are different, the theme type of the click log generated by the click behavior can be determined according to the webpage where the click behavior of the user is located.
For example, if the page clicked by the user is a commodity display page, the topic type of the click log generated by a click on that page may be considered commodity browsing.
As another example, if the page clicked by the user is a search page, the topic type of the click log generated by the click on the page may be considered as a search.
In addition, the topic type of the click log can be determined according to the user's click on a specific position or a specific button on the web page and the function of that position or button.
For example, when a user clicks in a search box of a web page, the topic type of a click log generated by a click at that location on the web page may be considered a search.
As another example, when a user clicks an "add to shopping cart" button on a web page, the topic type of the resulting click log may be considered to be adding to the shopping cart.
For another example, when the user first clicks a page layout editing button on a web page and then performs a series of clicks on the page to modify its layout, the topic type of the resulting series of click logs may be considered to be page decoration. This typically happens when a user changes the layout of his or her own page on a social networking site, or when a seller designs the layout of his or her own page on an e-commerce site.
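The page-based and button-based examples above can be sketched as a simple lookup. This is a minimal illustration only: the page names, element names, and topic labels are hypothetical, and letting the clicked element take precedence over the page is an assumption rather than something the disclosure specifies. The multi-click page decoration case would need session state and is not covered here.

```python
from typing import Optional

# Hypothetical mappings; names are illustrative and not taken from the disclosure.
PAGE_TOPICS = {"product_detail": "product_browsing", "search": "search"}
ELEMENT_TOPICS = {"search_box": "search", "add_to_cart_button": "add_to_cart"}


def topic_type(page: str, element: Optional[str] = None) -> str:
    """Return a topic type for a click.

    Assumption: a recognized element (button or position) overrides the
    topic implied by the page on which the click happened.
    """
    if element is not None and element in ELEMENT_TOPICS:
        return ELEMENT_TOPICS[element]
    return PAGE_TOPICS.get(page, "unknown")
```

For instance, a click on the add-to-cart button of a product page would be labeled `add_to_cart`, while a click elsewhere on the same page would fall back to `product_browsing`.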
According to an embodiment of the present disclosure, the generation source of a click log may be determined according to the type of terminal device on which the click behavior was performed.
For example, if the user's click behavior is generated on a PC, the generation source of the click log may be considered to be the PC side.
For another example, if the click behavior of the user is generated by the mobile phone APP client, the generation source of the click log may be considered to be the mobile phone APP.
According to an embodiment of the disclosure, sending each click log to the data temporary storage station corresponding to the preset condition it satisfies means that click logs are screened by preset condition as they are sent, which makes subsequent processing of the click logs more targeted.
The retention time of click logs in a data temporary storage station can be determined according to the capacity of the station, the volume of click logs generated, and the volume of click logs processed.
According to an embodiment of the present disclosure, the data staging station includes the distributed publish-subscribe messaging system Kafka.
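Operation S210 can be sketched as follows. This is a minimal in-memory illustration, not the implementation described in the disclosure: plain Python lists stand in for the data staging stations (which the disclosure says may be Kafka), and the station names, the `topic` and `source` fields, and the choice to deliver a log to every station whose condition it satisfies are all assumptions.

```python
from collections import defaultdict
from typing import Callable, Dict, List

ClickLog = Dict[str, str]

# Each preset condition is a predicate over a click log.
# Station names and field names are hypothetical.
PRESET_CONDITIONS: Dict[str, Callable[[ClickLog], bool]] = {
    "search": lambda log: log.get("topic") == "search",
    "order": lambda log: log.get("topic") == "order",
    "app_search": lambda log: log.get("topic") == "search"
    and log.get("source") == "app",
}


def distribute(logs: List[ClickLog]) -> Dict[str, List[ClickLog]]:
    """Append each click log to every station whose preset condition it meets.

    Logs that satisfy no condition are not placed in any station.
    """
    stations: Dict[str, List[ClickLog]] = defaultdict(list)
    for log in logs:
        for station, condition in PRESET_CONDITIONS.items():
            if condition(log):
                stations[station].append(log)
    return dict(stations)
```

Here a condition may test the topic type alone (`search`, `order`) or the topic type together with the generation source (`app_search`), mirroring the "and/or" wording of the preset condition.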
In operation S220, a first click log is obtained from a first data staging station that matches a data requirement for calculating a first real-time indicator.
Specifically, the click log is allocated to the data staging station corresponding to the preset condition satisfied by the click log in operation S210. Next, in operation S220, when a first real-time index calculation is to be performed, according to a data requirement for calculating the first real-time index, a first click log required for calculation is obtained from a first data staging station matching the data requirement.
The first real-time index can be an index which can reflect the click behavior of the user and has statistical significance, and can be used for monitoring, displaying and/or providing guidance for system optimization and the like.
According to an embodiment of the present disclosure, a first data staging station matches the data requirement for calculating the first real-time index when, for example, all the click log data required for the calculation can be found in that one station.
For example, when the first real-time index is an order data statistic, the first click log required for the order data statistic is an order click log. In this case, the order click log is obtained from the first data temporary storage station corresponding to the order topic.
For another example, when the first real-time index is a search log statistic, the first click log required for calculating it is a search click log. In this case, the search click log is acquired from the first data staging station corresponding to the search topic.
It is to be understood that the match between the data requirement of a first real-time index and a first data staging station is not necessarily one-to-one; multiple first real-time indexes may obtain data from the same first data staging station.
For example, when one first real-time index is a keyword search index and another is a search statistic, the click logs required for calculating either of them can be obtained from the first data staging station of the search topic. The difference is that calculating the keyword search index uses only the click logs related to keyword searches in that station, whereas calculating the search statistic uses all of the click logs in that station.
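The case of two real-time indexes sharing one staging station can be sketched as follows. The station contents and the `kind` field that distinguishes keyword searches from other searches are hypothetical, chosen only to make the example concrete.

```python
from typing import Dict, List

ClickLog = Dict[str, str]

# Hypothetical contents of the staging station for the search topic.
search_station: List[ClickLog] = [
    {"topic": "search", "kind": "keyword", "keyword": "phone"},
    {"topic": "search", "kind": "keyword", "keyword": "laptop"},
    {"topic": "search", "kind": "category", "category": "books"},
]


def keyword_search_count(station: List[ClickLog]) -> int:
    """Keyword search index: uses only the keyword-search logs in the station."""
    return sum(1 for log in station if log.get("kind") == "keyword")


def search_total(station: List[ClickLog]) -> int:
    """Search statistic: uses every log in the station."""
    return len(station)
```

Both functions read the same station; neither has to scan logs of unrelated topics such as orders or page decoration.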
In operation S230, the first real-time index is calculated using the first click log.
After the first real-time index is obtained through calculation, the first real-time index can be stored, so that real-time processing of the first click log is completed.
According to the embodiment of the disclosure, the click log is sent to the data temporary storage station corresponding to the preset condition in a targeted manner according to the preset condition met by the click log, then the required first click log is obtained from the first data temporary storage station matched with the data required for calculating the first real-time index when the first real-time index is calculated, and finally the first real-time index is calculated, so that the processing of the click log is completed.
FIG. 3 schematically shows a flow chart of a method of processing a click log according to another embodiment of the present disclosure.
As shown in fig. 3, a method of processing a click log according to another embodiment of the present disclosure includes operation S310 and operations S210 to S230, wherein operation S310 is performed before operation S210.
In operation S310, when a click log is generated, the click log is transmitted to a data staging master station.
According to an embodiment of the disclosure, each click log is sent to the data temporary storage master station for collection as soon as it is generated.
Next, in operation S210, according to a preset condition that the click log satisfies, the click log is sent to a data staging station corresponding to the preset condition.
In this case, in operation S210, the click log collected by the data staging master station is specifically sent to the data staging station corresponding to the preset condition according to the preset condition that the click log meets.
According to embodiments of the present disclosure, the data staging station and the data staging master station may be separate devices or systems for temporarily storing click logs. Alternatively, the data staging station may be a data cluster that is located within the data staging master station and holds the click logs satisfying a preset condition.
Then, in operation S220, a first click log is obtained from a first data staging station that matches the data requirements for calculating the first real-time index.
In operation S230, the first real-time metric is calculated using the first click log.
Here, operations S220 and S230 are the same as those in fig. 2, and are not described again.
According to an embodiment of the disclosure, the generated click logs are collected and aggregated promptly by the data temporary storage master station, which effectively reduces problems such as missed click logs and improves the overall stability of the click log processing system.
In addition, because the master station collects and aggregates the generated click logs before they are sent to the data temporary storage stations according to the preset conditions, the impact of click log processing on the efficiency of upstream services is reduced, and delays in web page response caused by click log processing can be avoided.
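The separation between collection (S310) and distribution (S210) can be sketched as follows. This is an in-memory illustration under stated assumptions: a list stands in for the master station, the class and method names are hypothetical, routing is by a `topic` field only, and a read offset is used so that logs remain in the master station after distribution.

```python
from typing import Dict, List

ClickLog = Dict[str, str]


class ClickLogPipeline:
    """In-memory stand-in for a master station feeding per-topic staging stations."""

    def __init__(self, routed_topics: List[str]) -> None:
        self.master: List[ClickLog] = []
        self.staging: Dict[str, List[ClickLog]] = {t: [] for t in routed_topics}
        self._next_offset = 0  # index of the first log not yet distributed

    def collect(self, log: ClickLog) -> None:
        """S310: the upstream path only appends; no condition is evaluated here."""
        self.master.append(log)

    def distribute_pending(self) -> int:
        """S210: route logs collected since the last call; return how many were read."""
        pending = self.master[self._next_offset:]
        for log in pending:
            station = self.staging.get(log.get("topic", ""))
            if station is not None:
                station.append(log)
        self._next_offset += len(pending)
        return len(pending)
```

Because `collect` does no screening, the cost added to the request path is a single append; the condition checks run later in `distribute_pending`, which can be scheduled independently of the upstream service.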
FIG. 4 schematically illustrates a flow chart of a method of processing a click log according to yet another embodiment of the present disclosure.
As shown in fig. 4, a method of processing a click log according to still another embodiment of the present disclosure includes operation S310, operation S420, and operation S430.
Operation S310 is as described with reference to fig. 3.
In operation S420, when the data requirement for calculating the second real-time index cannot be matched with any data staging station, a second click log is obtained from the data staging master station of operation S310.
In operation S430, the second real-time indicator is calculated using the second click log.
According to an embodiment of the disclosure, the data requirement for calculating the second real-time index may fail to match any data temporary storage station because the calculation needs to associate click logs of several different kinds; in that case, the click logs obtained from any single data temporary storage station are not sufficient to calculate the second real-time index.
For example, the second real-time index may be an index of order data resulting from search clicks. Calculating it requires analyzing the search click logs and associating them with the order click logs that the search click behavior eventually leads to.
According to an embodiment of the disclosure, the data requirement for calculating the second real-time index may also fail to match any data temporary storage station because, given the limits set by the preset conditions, some click logs are not sent to any data temporary storage station and are therefore held only in the data temporary storage master station.
According to an embodiment of the disclosure, when the data requirement for calculating the second real-time index cannot be matched with any data temporary storage station, the second click log is acquired from the data temporary storage master station to calculate the second real-time index. This improves the stability of data processing and avoids the situation in which the second real-time index cannot be calculated because the second click log cannot be acquired from a data temporary storage station.
According to an embodiment of the disclosure, when the first real-time index is calculated, a first click log is obtained from a first data temporary storage station that matches the data requirement of that index; when the second real-time index is calculated, the required second click log cannot be obtained from any data temporary storage station, so it is obtained from the data temporary storage master station instead. In this way, the source of the click logs is chosen flexibly according to the real-time index to be calculated, which improves the overall timeliness and stability of real-time click log processing.
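The choice between a staging station and the master station can be sketched as follows. The sketch assumes one staging station per topic type and expresses a data requirement as a set of topic types; the `session` field used to associate search clicks with order clicks is hypothetical, and the association ignores the time order of the clicks for brevity.

```python
from typing import Dict, FrozenSet, List

ClickLog = Dict[str, str]


def select_source(
    required_topics: FrozenSet[str],
    stations: Dict[str, List[ClickLog]],
    master: List[ClickLog],
) -> List[ClickLog]:
    """Return the matching staging station, or the master station as a fallback.

    Assumption: a requirement matches only when it names exactly one topic
    for which a staging station exists.
    """
    if len(required_topics) == 1:
        (topic,) = required_topics
        if topic in stations:
            return stations[topic]
    return master


def orders_from_search(logs: List[ClickLog]) -> int:
    """Count order clicks made in a session that also contains a search click."""
    searched = {log["session"] for log in logs if log["topic"] == "search"}
    return sum(
        1 for log in logs if log["topic"] == "order" and log["session"] in searched
    )
```

A search-only index is served from the search station, while the search-to-order index needs both topics at once and therefore falls back to the master station.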
FIG. 5 is a flow chart that schematically illustrates a method for sending a click log to a data staging station according to a preset condition that the click log satisfies, in accordance with an embodiment of the present disclosure.
As shown in fig. 5, according to the embodiment of the present disclosure, the sending of the click log to the data staging station corresponding to the preset condition according to the preset condition met by the click log in operation S210 specifically includes operation S511 and operation S512.
In operation S511, a first parameter of the click log is determined.
In operation S512, the click log is sent to a data staging station corresponding to a preset condition according to the preset condition satisfied by the first parameter.
Specifically, the first parameter of the click log may be a part or all of the data information of the click log, for example, the first parameter may be a generation source of the click log, a topic type and/or association information between the click logs, and the like.
According to the embodiment of the disclosure, a first parameter of the click log is judged, and then the click log is sent to a data temporary storage station corresponding to a preset condition according to the preset condition met by the first parameter. In the process, the first parameters of the click log are compared and screened only through preset conditions, the original format of the log is not damaged, and the data stability in the processing process of the click log is ensured.
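Operations S511 and S512 can be sketched as below, under the assumption that the first parameter is the topic type carried in a `topic` field of the log. The station names, the `PRESET_CONDITIONS` table and the function names are hypothetical and serve only to show that the log is compared against the conditions and forwarded without being modified.

```python
from collections import defaultdict
from typing import Callable, Dict, List

ClickLog = Dict[str, str]

# Hypothetical preset conditions: station name -> predicate on the first parameter.
PRESET_CONDITIONS: Dict[str, Callable[[str], bool]] = {
    "station_search": lambda topic: topic == "search",
    "station_order": lambda topic: topic == "order",
}


def judge_first_parameter(log: ClickLog) -> str:
    """Operation S511: read the first parameter (assumed here to be the topic type)."""
    return log.get("topic", "")


def distribute(log: ClickLog, stations: Dict[str, List[ClickLog]]) -> List[str]:
    """Operation S512: send the log to every station whose condition it satisfies."""
    param = judge_first_parameter(log)
    matched = [name for name, cond in PRESET_CONDITIONS.items() if cond(param)]
    for name in matched:
        stations[name].append(log)  # the log object is forwarded unchanged
    return matched
```

A log whose first parameter satisfies no preset condition yields an empty list and is not placed in any station in this sketch.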
FIG. 6 schematically shows a block diagram of an apparatus for processing a click log according to an embodiment of the present disclosure.
As shown in FIG. 6, an apparatus 600 for processing a click log according to an embodiment of the present disclosure includes a click log distribution module 610, a first obtaining module 620, and a first calculating module 630.
The click log distribution module 610 is configured to send the click log to a data temporary storage station corresponding to a preset condition according to the preset condition that the click log meets.
Specifically, when a user clicks on a website, a click log is generated that reflects the click behavior. The click log distribution module 610 sends the click log to a data temporary storage station corresponding to a preset condition in a targeted manner according to the preset condition satisfied by the click log.
The preset condition that the click log meets may be a condition set by selecting according to data information that the click log may have.
According to an embodiment of the present disclosure, the preset condition includes a topic type of the click log, and/or a generation source of the click log.
Specifically, the preset condition may only include the topic type of the click log, may only include the generation source of the click log, and may also include both the topic type of the click log and the generation source of the click log.
According to the embodiment of the disclosure, the topic type of the click log can be determined according to information in the click log that reflects the effect produced, or the purpose achieved, by the user's click behavior.
Specifically, when the user clicks on different webpages, since the functions and/or the displayed contents of the different webpages are different, the theme type of the click log generated by the click behavior can be determined according to the webpage where the click behavior of the user is located.
For example, if the page clicked by the user is a commodity display page, the topic type of the click log generated by clicking on the page may be considered to be commodity browsing.
As another example, if the page clicked by the user is a search page, the topic type of the click log generated by clicking on the page may be considered as a search.
In addition, the topic type of the click log can be determined according to the clicking behavior of the user at a specific position or a specific button on the webpage and the action of the specific position or the specific button.
For example, when a user clicks in a search box of a web page, the topic type of a click log generated by a click at that location on the web page may be considered a search.
As another example, when a user clicks on an "add to shopping cart" button in a web page, the topic type of the resulting click log may be considered to be adding to the shopping cart.
For another example, when the user first clicks the page layout editing button on the web page and then performs a series of clicks on the page to modify the layout of the page, the topic type of the series of click logs generated thereby may be considered to be page decoration. This often occurs when a user changes the layout of his or her own page on a social network site, or when a seller designs his or her own page layout on an e-commerce site.
According to the embodiment of the present disclosure, the generation source of the click log may be determined according to the type of the terminal device on which the click behavior is performed.
For example, the user's click behavior is generated by the PC side, and the generation source of the click log may be considered as the PC side.
For another example, if the click behavior of the user is generated by the mobile phone APP client, the generation source of the click log may be considered to be the mobile phone APP.
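The determination of the topic type and the generation source described above can be sketched as simple lookups. The event fields (`element`, `page`, `device`), the table contents and the label strings are all hypothetical; the sketch assumes that the clicked element, when recognized, takes precedence over the page type.

```python
from typing import Dict

# Hypothetical lookup tables; the disclosure does not prescribe these labels.
BUTTON_TOPICS: Dict[str, str] = {
    "search_box": "search",
    "add_to_cart": "add_to_cart",
}
PAGE_TOPICS: Dict[str, str] = {
    "product_page": "product_browsing",
    "search_page": "search",
}
DEVICE_SOURCES: Dict[str, str] = {
    "pc": "PC",
    "mobile_app": "mobile_app",
}


def topic_type(event: Dict[str, str]) -> str:
    """Prefer the clicked element's function; otherwise use the page type."""
    element = event.get("element")
    if element in BUTTON_TOPICS:
        return BUTTON_TOPICS[element]
    return PAGE_TOPICS.get(event.get("page", ""), "unknown")


def generation_source(event: Dict[str, str]) -> str:
    """Map the terminal device type to a generation source label."""
    return DEVICE_SOURCES.get(event.get("device", ""), "unknown")
```

Under these assumptions, a click on the "add to shopping cart" button of a commodity display page is labelled by the button rather than by the page.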
According to the embodiment of the disclosure, the click log distribution module 610 sends the click log to the data temporary storage station corresponding to the preset condition that the click log meets, so that the click log is screened according to the preset condition that the click log meets while the click log is sent, and the subsequent processing of the click log is more targeted.
The saving time of the click log in the data temporary storage station can be determined according to the capacity of the data temporary storage station, the generation amount and the processing amount of the click log.
According to an embodiment of the present disclosure, the data staging station includes the distributed publish-subscribe messaging system Kafka.
The first obtaining module 620 is configured to obtain a first click log from a first data staging station matching a data requirement for calculating a first real-time indicator.
According to the embodiment of the disclosure, the first obtaining module 620 obtains, according to the data requirement for calculating the first real-time index, the first click log required for calculation from a first data staging station matched with that data requirement.
The first real-time index can be an index that reflects the click behavior of users and has statistical significance, and can serve as a data index for monitoring, for display, and/or for guiding system optimization and the like.
According to an embodiment of the present disclosure, the data requirement for calculating the first real-time index is matched when, for example, all the click log data required for calculating the first real-time index can be found in a single first data staging station.
For example, when the first real-time indicator is order data statistics, the first click log required for performing order data statistics is an order click log. At this time, the order click log is obtained from the first data temporary storage station corresponding to the order subject.
For another example, when the first real-time index is a search log statistic index, the first click log required for calculation of the search log statistic index is a search click log. At this time, the search click log is acquired from the first data staging station corresponding to the search topic.
It is to be understood that matching the data requirements for calculating the first real-time indicator is not to say that the data requirements for the first real-time indicator are in a one-to-one relationship with the first data staging station, and there may be situations where multiple first real-time indicators obtain data from the same first data staging station.
For example, one first real-time indicator may be a keyword search indicator and another first real-time indicator may be a search statistic. Whether the keyword search indicator or the search statistic is calculated, the click logs required by the respective calculation may be obtained from the first data staging station of the search topic. The difference is that, when the keyword search indicator is calculated, only the click logs related to keyword searches are obtained from the first data staging station of the search topic, whereas when the search statistic is calculated, all the click logs are obtained from the first data staging station of the search topic.
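The many-to-one relationship between first real-time indicators and a single first data staging station can be illustrated as follows. The sample logs, the `keyword` field and both function names are hypothetical; the sketch only shows two indicators reading the same station while selecting different subsets of its logs.

```python
from collections import Counter
from typing import Dict, List

# Hypothetical contents of the first data staging station of the search topic.
search_station: List[Dict[str, str]] = [
    {"topic": "search", "keyword": "phone"},
    {"topic": "search", "keyword": "laptop"},
    {"topic": "search", "keyword": "phone"},
    {"topic": "search"},  # a search-topic click that carries no keyword
]


def keyword_search_index(logs: List[Dict[str, str]]) -> Counter:
    """Use only the logs related to keyword searches."""
    return Counter(log["keyword"] for log in logs if "keyword" in log)


def search_statistics(logs: List[Dict[str, str]]) -> int:
    """Use all the logs in the station."""
    return len(logs)
```

Both functions take the same station as input; only the filtering applied inside each calculation differs.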
The first calculating module 630 is configured to calculate the first real-time indicator by using the first click log. After the first real-time index is obtained through calculation, the first real-time index can be stored, so that real-time processing of the first click log is completed.
According to the embodiment of the disclosure, the device 600 for processing the click log sends the click log to the data temporary storage station corresponding to the preset condition in a targeted manner according to the preset condition met by the click log, then obtains the required first click log from the first data temporary storage station matched with the data requirement for calculating the first real-time index when the first real-time index is calculated, and finally obtains the first real-time index through calculation, thereby completing the processing of the click log.
FIG. 7 schematically illustrates a block diagram of an apparatus for processing a click log according to another embodiment of the present disclosure.
The apparatus 700 for processing a click log according to another embodiment of the present disclosure includes a click log collecting module 740 in addition to the click log distributing module 610, the first obtaining module 620, and the first calculating module 630.
The click log collection module 740 is configured to send the click log to a data staging master station when the click log is generated.
According to the embodiment of the disclosure, the click log collection module 740 sends the generated click log to the data staging master station for collection.
According to the embodiment of the present disclosure, the click log distribution module 610 in the device 700 for processing click logs sends the click logs collected by the data staging master station to the data staging station corresponding to the preset condition in a targeted manner according to the preset condition that the click logs meet.
According to embodiments of the present disclosure, the data staging station and the data staging master station may be separate devices or systems for temporarily storing the click logs. Alternatively, the data staging station may be a data cluster that is located in the data staging master station and meets a preset condition.
According to the embodiment of the present disclosure, the device 700 for processing the click log collects and summarizes the generated click log in time through the data temporary storage central station, thereby effectively reducing the occurrence of the situations such as omission of the click log and improving the overall stability of the click log processing system.
In addition, before the device 700 for processing the click log sends the click log to the data temporary storage station corresponding to the preset condition in a targeted manner according to the preset condition, the data temporary storage master station uniformly collects and summarizes the generated click log, so that the impact of click log processing on the efficiency of upstream services can be reduced; for example, delays in webpage response speed caused by processing the click log are avoided.
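The separation between collection at the master station and later distribution can be sketched as below. This is an in-memory illustration under stated assumptions, not the disclosed system: the `MasterStation` class, its `collect` and `drain` methods, and the `distribute_pending` function are hypothetical, and the master station is assumed to keep every collected log so that it remains available for later acquisition.

```python
from collections import deque
from typing import Deque, Dict, List

ClickLog = Dict[str, str]


class MasterStation:
    """Hypothetical master station: collects every log as soon as it is generated."""

    def __init__(self) -> None:
        self._pending: Deque[ClickLog] = deque()  # logs not yet distributed
        self.archive: List[ClickLog] = []         # all collected logs are kept

    def collect(self, log: ClickLog) -> None:
        # Cheap append only, so the upstream service is not delayed.
        self._pending.append(log)
        self.archive.append(log)

    def drain(self) -> List[ClickLog]:
        drained = list(self._pending)
        self._pending.clear()
        return drained


def distribute_pending(master: MasterStation,
                       stations: Dict[str, List[ClickLog]]) -> int:
    """Distribute collected logs to the stations by topic; return how many were sent."""
    sent = 0
    for log in master.drain():
        topic = log.get("topic")
        if topic in stations:
            stations[topic].append(log)
            sent += 1
    return sent
```

In this sketch, a log whose topic has no corresponding station is not distributed but remains in `archive`, mirroring the case where a click log is stored only in the master station.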
FIG. 8 schematically illustrates a block diagram of an apparatus for processing a click log according to still another embodiment of the present disclosure.
As shown in fig. 8, an apparatus 800 for processing a click log according to still another embodiment of the present disclosure includes a second obtaining module 850 and a second calculating module 860 in addition to the click log distributing module 610, the first obtaining module 620 and the first calculating module 630, and the click log collecting module 740.
Specifically, the second obtaining module 850 is configured to obtain a second click log from the data staging master station when the data requirement for calculating the second real-time index cannot be matched with any one data staging station;
the second calculating module 860 is configured to calculate the second real-time indicator using the second click log.
According to the embodiment of the disclosure, the situation that the data requirement for calculating the second real-time index cannot be matched with any one data temporary storage station may be that the click log required for calculating the second real-time index needs to be associated with click logs with multiple functions, and at this time, the first click log obtained from any one first data temporary storage station cannot complete the calculation of the second real-time index.
For example, the second real-time index may be an index of order data resulting from search clicks. Calculating this index requires analyzing the search click logs and associating them with the order click logs that the search click behavior ultimately leads to.
According to the embodiment of the disclosure, the data requirement for calculating the second real-time index may also fail to match any data temporary storage station for another reason: because of the limits set by the preset conditions, some click logs may not be sent to any data temporary storage station, so that these click logs are temporarily stored only in the data temporary storage master station.
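A second real-time index of this kind can be sketched as an association between search click logs and order click logs read from the master station. The association rule used here (same `session` value, with the search occurring before the order according to an integer `ts` field) is an assumption made for illustration; the disclosure does not specify how the logs are associated, and the function name is hypothetical.

```python
from typing import Any, Dict, List


def search_driven_orders(master_logs: List[Dict[str, Any]]) -> int:
    """Count order logs preceded by a search log in the same session."""
    searched_sessions = set()
    orders = 0
    # Process logs in time order so that only earlier searches are credited.
    for log in sorted(master_logs, key=lambda entry: entry["ts"]):
        if log["topic"] == "search":
            searched_sessions.add(log["session"])
        elif log["topic"] == "order" and log["session"] in searched_sessions:
            orders += 1
    return orders
```

Because the calculation needs both search and order logs together, neither the search station nor the order station alone would suffice in this sketch, which is why the logs are taken from the master station.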
According to the embodiment of the disclosure, when the data requirement for calculating the second real-time index cannot be matched with any one data temporary storage station, the device 800 for processing the click log performs calculation of the second real-time index by acquiring the second click log from the data temporary storage master station, so that the stability of data processing is improved, and the problem that calculation cannot be performed because the second real-time index cannot acquire the second click log from the data temporary storage station is solved.
According to the embodiment of the disclosure, when calculating the first real-time index, the device 800 for processing a click log obtains the first click log from the first data temporary storage station matched with the data requirement for calculating the first real-time index for calculation; and when the second real-time index is calculated, the second click log required by calculation cannot be obtained from any data temporary storage station, so that the second click log is obtained from the data temporary storage main station for calculation. Therefore, when the device 800 for processing the click log processes the click log, the acquisition path of the click log can be flexibly selected according to the real-time index to be calculated, and the timeliness and the stability of real-time processing of the click log are integrally improved.
FIG. 9 schematically shows a block diagram of a click log distribution module according to an embodiment of the disclosure.
As shown in fig. 9, the click log distribution module 610 according to an embodiment of the present disclosure includes a judgment sub-module 911 and a distribution sub-module 912.
The judgment sub-module 911 is configured to judge a first parameter of the click log;
the distributing sub-module 912 is configured to send the click log to a data staging station corresponding to a preset condition according to the preset condition met by the first parameter.
Specifically, the first parameter of the click log may be a part or all of the data information of the click log, for example, the first parameter may be a generation source of the click log, a topic type and/or association information between the click logs, and the like.
According to the embodiment of the disclosure, the click log distribution module 610 first determines a first parameter of the click log, and then sends the click log to a data temporary storage station corresponding to a preset condition according to the preset condition met by the first parameter. The click log distribution module 610 compares and screens the first parameters of the click log only through preset conditions when sending the click log to the corresponding data temporary storage station, and does not destroy the original format of the log, thereby ensuring the data stability in the processing process of the click log.
It is to be appreciated that the click log distribution module 610, the first obtaining module 620, the first calculating module 630, the click log collection module 740, the second obtaining module 850, and the second calculating module 860 may be combined in one module to be implemented, or any one of them may be split into a plurality of modules. Alternatively, at least part of the functionality of one or more of these modules may be combined with at least part of the functionality of the other modules and implemented in one module. According to an embodiment of the present invention, at least one of the click log distribution module 610, the first obtaining module 620, the first calculating module 630, the click log collection module 740, the second obtaining module 850, and the second calculating module 860 may be implemented at least in part as a hardware circuit, such as a Field Programmable Gate Array (FPGA), a Programmable Logic Array (PLA), a system on a chip, a system on a substrate, a system on a package, an Application Specific Integrated Circuit (ASIC), or may be implemented in hardware or firmware in any other reasonable manner of integrating or packaging circuits, or in a suitable combination of three implementations of software, hardware, and firmware. Alternatively, at least one of the click log distribution module 610, the first obtaining module 620, the first calculating module 630, the click log collection module 740, the second obtaining module 850, and the second calculating module 860 may be at least partially implemented as a computer program module that, when executed by a computer, may perform the functions of the respective modules.
FIG. 10 schematically illustrates a block diagram of a computer system that processes a click log according to an embodiment of the disclosure.
As shown in fig. 10, the computer system 1000 includes a Central Processing Unit (CPU) 1001 that can perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) 1002 or a program loaded from a storage section 1008 into a Random Access Memory (RAM) 1003. In the RAM 1003, various programs and data necessary for the operation of the system 1000 are also stored. The CPU 1001, ROM 1002, and RAM 1003 are connected to each other via a bus 1004. An input/output (I/O) interface 1005 is also connected to bus 1004.
According to an embodiment of the present disclosure, the Central Processing Unit (CPU) 1001 may execute the above-described method by executing a program stored in the Read Only Memory (ROM) 1002 or a program loaded from the storage section 1008 into the Random Access Memory (RAM) 1003. It is noted that although fig. 10 shows only one Central Processing Unit (CPU) 1001, one Read Only Memory (ROM) 1002, and one Random Access Memory (RAM) 1003, embodiments of the present disclosure may include one or more of the above components.
The following components are connected to the I/O interface 1005: an input section 1006 including a keyboard, a mouse, and the like; an output section 1007 including a display such as a Cathode Ray Tube (CRT), a Liquid Crystal Display (LCD), and the like, and a speaker; a storage section 1008 including a hard disk and the like; and a communication section 1009 including a network interface card such as a LAN card, a modem, or the like. The communication section 1009 performs communication processing via a network such as the internet. A drive 1010 is also connected to the I/O interface 1005 as necessary. A removable medium 1011 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is mounted on the drive 1010 as necessary, so that a computer program read out therefrom is installed into the storage section 1008 as necessary.
In particular, according to an embodiment of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method illustrated in the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication section 1009 and/or installed from the removable medium 1011. The above-described functions defined in the system of the present disclosure are executed when the computer program is executed by the Central Processing Unit (CPU) 1001.
It should be noted that the computer readable media shown in the present disclosure may be computer readable signal media or computer readable storage media or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present disclosure, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In contrast, in the present disclosure, a computer-readable signal medium may include a propagated data signal with computer-readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units and/or modules described in the embodiments of the present disclosure may be implemented by software or hardware. The described units and/or modules may also be provided in a processor, and may be described as: a processor includes a sending module, an obtaining module, a determining module, and a first processing module. The names of these modules do not form a limitation on the modules themselves in some cases, and for example, the sending module may also be described as a "module sending a picture acquisition request to a connected server".
As another aspect, a computer-readable medium is also provided according to an embodiment of the present disclosure. The computer readable medium carries one or more programs which, when executed, implement a method of processing a click log according to an embodiment of the present disclosure, including: sending the click log to a data temporary storage station corresponding to a preset condition according to the preset condition met by the click log; acquiring a first click log from a first data temporary storage station matched with the data requirement for calculating the first real-time index; and calculating the first real-time index by using the first click log.
Before sending the click log to a data temporary storage station corresponding to a preset condition according to the preset condition met by the click log, the method for processing the click log further comprises sending the click log to a data temporary storage master station when the click log is generated. In addition, the method for processing the click log according to the embodiment of the present disclosure may further include obtaining a second click log from the data staging master station when a data requirement for calculating a second real-time indicator cannot be matched with any one of the data staging stations, and then calculating the second real-time indicator by using the second click log.
Sending the click log to a data temporary storage station corresponding to the preset condition according to the preset condition met by the click log includes: judging a first parameter of the click log; and sending the click log to the data temporary storage station corresponding to the preset condition according to the preset condition met by the first parameter.
The preset condition includes the topic type of the click log and/or the generation source of the click log. The data staging station includes the distributed publish-subscribe messaging system Kafka.
The embodiments of the present disclosure have been described above. However, these examples are for illustrative purposes only and are not intended to limit the scope of the present disclosure. Although the embodiments are described separately above, this does not mean that the measures in the embodiments cannot be used in advantageous combination. The scope of the disclosure is defined by the appended claims and equivalents thereof. Various alternatives and modifications can be devised by those skilled in the art without departing from the scope of the present disclosure, and such alternatives and modifications are intended to be within the scope of the present disclosure.
Claims (10)
1. A method of processing a click log, comprising:
when a click log is generated, sending the click log to a data temporary storage master station;
sending the click log to a data temporary storage station corresponding to a preset condition according to the preset condition met by the click log, wherein the click log is a log which reflects a click behavior and is generated when a user clicks any position of any page on a website;
acquiring a first click log from a first data temporary storage station matched with the data requirement for calculating the first real-time index; the click log data required by calculating the first real-time index can be found from a first data temporary storage station matched with the data requirement for calculating the first real-time index; the first real-time index and a first data temporary storage station matched with the data requirement for calculating the first real-time index are in a one-to-one relationship, or the first real-time index and other real-time indexes have a many-to-one relationship with the first data temporary storage station matched with the data requirement for calculating the first real-time index; calculating the first real-time index by using the first click log; and
when the data requirement for calculating a second real-time index cannot be matched with any data temporary storage station, acquiring a second click log from the data temporary storage master station; and calculating the second real-time index by using the second click log.
2. The method of claim 1, wherein sending the click log to a data staging station corresponding to a preset condition according to the preset condition satisfied by the click log comprises:
judging a first parameter of the click log;
and sending the click log to a data temporary storage station corresponding to the preset condition according to the preset condition met by the first parameter.
3. The method of claim 1, wherein the preset condition includes a subject type of the click log, and/or a generation source of the click log.
4. The method of claim 1, the data staging station comprising a distributed publish-subscribe message system kafka.
5. An apparatus to process a click log, comprising:
the click log collection module is used for sending a click log to a data temporary storage master station when the click log is generated;
the click log distribution module is used for sending the click log to a data temporary storage station corresponding to a preset condition according to the preset condition met by the click log; the click log is a log which is generated when a user clicks any position of any page on a website and reflects the click behavior;
the first acquisition module is used for acquiring a first click log from a first data temporary storage station matched with the data requirement for calculating the first real-time index; the click log data required by calculating the first real-time index can be found from a first data temporary storage station matched with the data requirement for calculating the first real-time index; the first real-time index and a first data temporary storage station matched with the data requirement for calculating the first real-time index are in a one-to-one relationship, or the first real-time index and other real-time indexes have a many-to-one relationship with the first data temporary storage station matched with the data requirement for calculating the first real-time index;
the first calculation module is used for calculating the first real-time index by utilizing the first click log;
the second acquisition module is used for acquiring a second click log from the data temporary storage master station when the data requirement for calculating a second real-time index cannot be matched with any data temporary storage station; and
the second calculation module is used for calculating the second real-time index by utilizing the second click log.
6. The apparatus of claim 5, wherein the click log distribution module comprises:
the judging submodule is used for judging a first parameter of the click log;
and the distribution submodule is used for sending the click log to a data temporary storage station corresponding to the preset condition according to the preset condition met by the first parameter.
7. The apparatus of claim 5, wherein the preset condition includes a subject type of the click log, and/or a generation source of the click log.
8. The apparatus of claim 5, wherein the data temporary storage station comprises Kafka, a distributed publish-subscribe messaging system.
9. A system for processing a click log, comprising one or more memories storing executable instructions; and
one or more processors executing the executable instructions to implement the method of any one of claims 1-4.
10. A computer readable storage medium having stored thereon executable instructions which, when executed by a processor, cause the processor to perform the method of any one of claims 1 to 4.
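The data flow recited in the claims above can be illustrated with a minimal in-memory sketch. It is not the patented implementation: plain Python lists stand in for the Kafka-backed master station and temporary storage stations, and the names `ClickLog`, `ClickLogPipeline`, `search_logs`, and `app_logs`, along with the example subject types and sources, are hypothetical. The sketch also assumes that a log meeting several preset conditions is copied to every matching station, which the claims do not specify.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class ClickLog:
    subject_type: str  # hypothetical values, e.g. "search", "cart"
    source: str        # hypothetical values, e.g. "pc", "app"
    payload: dict = field(default_factory=dict)


class ClickLogPipeline:
    """In-memory stand-in for the master station and the staging stations."""

    def __init__(self, conditions):
        # conditions: station name -> predicate over a ClickLog
        # (the "preset condition" of the claims)
        self.conditions = conditions
        self.master = []                   # data temporary storage master station
        self.stations = defaultdict(list)  # data temporary storage stations

    def collect(self, log):
        # Collection module: every click log goes to the master station first.
        self.master.append(log)
        self.distribute(log)

    def distribute(self, log):
        # Distribution module: copy the log to each station whose condition it meets.
        for name, predicate in self.conditions.items():
            if predicate(log):
                self.stations[name].append(log)

    def acquire(self, station_name):
        # Acquisition modules: read from the matching station if one exists,
        # otherwise fall back to the master station.
        if station_name in self.conditions:
            return list(self.stations[station_name])
        return list(self.master)


pipeline = ClickLogPipeline({
    "search_logs": lambda log: log.subject_type == "search",
    "app_logs": lambda log: log.source == "app",
})
for log in (ClickLog("search", "pc"), ClickLog("cart", "app"), ClickLog("search", "app")):
    pipeline.collect(log)

# First real-time index: its data requirement matches a station.
search_clicks = len(pipeline.acquire("search_logs"))
# Second real-time index: no station matches, so the master station is read.
total_clicks = len(pipeline.acquire(None))
```

In this sketch an index computed from a matching station reads only the logs routed there, while an index with no matching station reads the full stream from the master station.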
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710342330.2A CN108881352B (en) | 2017-05-15 | 2017-05-15 | Method, device and system for processing click log |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710342330.2A CN108881352B (en) | 2017-05-15 | 2017-05-15 | Method, device and system for processing click log |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108881352A (en) | 2018-11-23 |
CN108881352B (en) | 2022-06-07 |
Family
ID=64320398
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710342330.2A Active CN108881352B (en) | 2017-05-15 | 2017-05-15 | Method, device and system for processing click log |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108881352B (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103916293A (en) * | 2014-04-15 | 2014-07-09 | 浪潮软件股份有限公司 | Method for monitoring and analyzing website user behaviors |
CN105528454A (en) * | 2015-12-25 | 2016-04-27 | 北京奇虎科技有限公司 | Log treatment method and distributed cluster computing device |
CN106055630A (en) * | 2016-05-27 | 2016-10-26 | 北京小米移动软件有限公司 | Log storage method and device |
CN106202305A (en) * | 2016-06-30 | 2016-12-07 | 北京北信源软件股份有限公司 | A kind of log processing method, device and Database Systems |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103793479A (en) * | 2014-01-14 | 2014-05-14 | 上海上讯信息技术股份有限公司 | Log management method and log management system |
US10339165B2 (en) * | 2015-02-27 | 2019-07-02 | Walmart Apollo, Llc | System, method, and non-transitory computer-readable storage media for generating synonyms of a search query |
CN106649312B (en) * | 2015-10-29 | 2019-10-29 | 北京北方华创微电子装备有限公司 | The analysis method and system of journal file |
CN106227790A (en) * | 2016-07-19 | 2016-12-14 | 北京北信源软件股份有限公司 | A kind of method using Apache Spark classification and parsing massive logs |
2017-05-15: CN application CN201710342330.2A, patent CN108881352B (en), status Active
Also Published As
Publication number | Publication date |
---|---|
CN108881352A (en) | 2018-11-23 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
CN111125107A (en) | Data processing method, device, electronic equipment and medium | |
CN109901987B (en) | Method and device for generating test data | |
CN109218041B (en) | Request processing method and device for server system | |
CN110321252B (en) | Skill service resource scheduling method and device | |
CN109992719B (en) | Method and apparatus for determining push priority information | |
WO2022257604A1 (en) | Method and apparatus for determining user tag | |
CN111010453B (en) | Service request processing method, system, electronic device and computer readable medium | |
CN110866031B (en) | Database access path optimization method and device, computing equipment and medium | |
CN112947919A (en) | Method and device for constructing service model and processing service request | |
CN112015790A (en) | Data processing method and device | |
CN110928594A (en) | Service development method and platform | |
CN113190558A (en) | Data processing method and system | |
CN113378346A (en) | Method and device for model simulation | |
CN108881352B (en) | Method, device and system for processing click log | |
CN113032702A (en) | Page loading method and device | |
CN112799863B (en) | Method and device for outputting information | |
CN112688982B (en) | User request processing method and device | |
CN113282455A (en) | Monitoring processing method and device | |
CN107885774B (en) | Data processing method and system | |
CN111127077A (en) | Recommendation method and device based on stream computing | |
CN112581154A (en) | Settlement processing method and device | |
CN113449230A (en) | Method and system for determining exposure element, client and server | |
CN112822225A (en) | Method and device for tracking content delivery effect | |
CN109992428B (en) | Data processing method and system | |
CN112667627B (en) | Data processing method and device |
Legal Events
Code | Title | Description |
---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||