CN116701337A - Log data processing method and device, electronic equipment and storage medium - Google Patents

Info

Publication number
CN116701337A
Authority
CN
China
Prior art keywords
log data
log
search
search request
identifier
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202310976938.6A
Other languages
Chinese (zh)
Other versions
CN116701337B (en
Inventor
王威
饶春平
陈楷彬
卢雯雯
冷晶晶
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Priority to CN202310976938.6A priority Critical patent/CN116701337B/en
Publication of CN116701337A publication Critical patent/CN116701337A/en
Application granted granted Critical
Publication of CN116701337B publication Critical patent/CN116701337B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/18 File system types
    • G06F16/1805 Append-only file systems, e.g. using logs or journals to store data
    • G06F16/1815 Journaling file systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/14 Details of searching files based on file metadata
    • G06F16/148 File search processing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/16 File or folder operations, e.g. details of user interfaces specifically adapted to file systems
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Library & Information Science (AREA)
  • Human Computer Interaction (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The application relates to a log data processing method, a log data processing device, electronic equipment and a storage medium. The method comprises the following steps: acquiring a plurality of pieces of log data generated by a plurality of search service modules on search requests processed in a history period; the log data comprises search request identification information, wherein the search request identification information comprises a search request identification for uniquely identifying a search request; the search service module comprises a search service module serving as a main call in the search request and a search service module serving as a called call; based on the search request identifiers, aggregating the plurality of pieces of log data to obtain aggregated log data corresponding to each search request identifier; responding to the log query request, extracting a query identifier from the log query request, wherein the query identifier corresponds to the search request identifier; and querying the matched target aggregate log data from the aggregate log data according to the query identification. The log query efficiency and the integrity of the call chain can be improved.

Description

Log data processing method and device, electronic equipment and storage medium
Technical Field
The present application relates to the field of data processing technologies, and in particular, to a log data processing method, device, electronic device, and storage medium.
Background
Log inspection and call chain inspection are common tools for service development and problem localization. Logs are usually inspected by printing to a local log, but local logs frequently grow too large and cannot be retained for long. Inspecting logs and call chains also requires logging in to several modules at the same time, which makes both tasks inefficient. In addition, in the prior art, when one search service module calls another search service module, the call relationship is reported only by the called module, so the resulting call chain information is not sufficiently complete or accurate.
Disclosure of Invention
In view of the above-mentioned technical problems, the present application provides a log data processing method, a log data processing device, an electronic device and a storage medium.
According to an aspect of the present application, there is provided a log data processing method, the method including:
acquiring a plurality of pieces of log data generated by a plurality of search service modules on search requests processed in a history period; the log data comprises search request identification information, wherein the search request identification information comprises a search request identification for uniquely identifying a search request; the search service module comprises a search service module serving as a main call and a search service module serving as a called call in the search request;
Based on the search request identifiers, carrying out aggregation processing on the plurality of pieces of log data to obtain aggregated log data corresponding to each search request identifier;
responding to a log query request, and extracting a query identifier from the log query request, wherein the query identifier corresponds to the search request identifier;
inquiring matched target aggregate log data from the aggregate log data according to the inquiry identification; and the search request identifier corresponding to the target aggregate log data is matched with the query identifier.
According to another aspect of the present application, there is provided a log data processing apparatus, the apparatus comprising:
the acquisition module is used for acquiring a plurality of pieces of log data generated by the search service modules on the search requests processed in the history period; the log data comprises search request identification information, wherein the search request identification information comprises a search request identification for uniquely identifying a search request; the search service module comprises a search service module serving as a main call and a search service module serving as a called call in the search request;
the first aggregation module is used for carrying out aggregation processing on the plurality of pieces of log data based on the search request identifiers to obtain aggregated log data corresponding to each search request identifier;
The query identifier extraction module is used for responding to a log query request and extracting a query identifier from the log query request, wherein the query identifier corresponds to the search request identifier;
the query module is used for querying the matched target aggregate log data from the aggregate log data according to the query identifier; and the search request identifier corresponding to the target aggregate log data is matched with the query identifier.
According to another aspect of the present application, there is provided an electronic apparatus including: a processor; a memory for storing processor-executable instructions; wherein the processor is configured to perform the above method.
According to another aspect of the present application there is provided a non-transitory computer readable storage medium having stored thereon computer program instructions, wherein the computer program instructions when executed by a processor implement the above-described method.
According to another aspect of the application, there is provided a computer program product comprising computer instructions which, when executed by a processor, cause the computer to perform the above method.
The aggregation processing is carried out on the plurality of pieces of log data based on the search request identifiers to obtain aggregated log data corresponding to each search request identifier, so that the storage space of the log data can be greatly saved, and the log can be kept for a longer time; when responding to a log query request, a query identifier can be extracted from the log query request, and the query identifier corresponds to the search request identifier; and inquiring the matched target aggregate log data from the aggregate log data according to the inquiry identification. Therefore, multiple index queries and multiple log data extraction are not needed, the aggregated log data of the same search request can be obtained at one time, and the query efficiency of the log and the call chain is improved. In addition, the search service module comprises a search service module serving as a main call and a search service module serving as a called call in the search request, so that the information of a call chain can be more complete and accurate.
Other features and aspects of the present application will become apparent from the following detailed description of exemplary embodiments, which proceeds with reference to the accompanying drawings.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate exemplary embodiments, features and aspects of the application and together with the description, serve to explain the principles of the application.
FIG. 1 is a schematic diagram of a log data processing system according to an embodiment of the present application.
Fig. 2 shows a flowchart of a log data processing method according to an embodiment of the present application.
Fig. 3 is a schematic illustration showing identifying a query log based on a search request according to an embodiment of the present application.
Fig. 4 is a schematic illustration showing identification of a query log based on a search account according to an embodiment of the present application.
FIG. 5 illustrates a schematic diagram of call chain information provided in accordance with an embodiment of the present application.
Fig. 6 shows a block diagram of a log data processing apparatus according to an embodiment of the present application.
Fig. 7 shows a block diagram of an electronic device for log data processing according to an embodiment of the application.
Detailed Description
Various exemplary embodiments, features and aspects of the application will be described in detail below with reference to the drawings. In the drawings, like reference numbers indicate identical or functionally similar elements. Although various aspects of the embodiments are illustrated in the accompanying drawings, the drawings are not necessarily drawn to scale unless specifically indicated.
The word "exemplary" is used herein to mean "serving as an example, embodiment, or illustration." Any embodiment described herein as "exemplary" is not necessarily to be construed as preferred or advantageous over other embodiments.
In addition, numerous specific details are set forth in the following description in order to provide a better illustration of the application. It will be understood by those skilled in the art that the present application may be practiced without some of these specific details. In some instances, well known methods, procedures, components, and circuits have not been described in detail so as not to obscure the present application.
Referring to FIG. 1, FIG. 1 is a schematic diagram of a log data processing system according to an embodiment of the present application. The log data processing system can be used to carry out the log data processing method of the present application. As one example, as shown in FIG. 1, the log data processing system may include at least a log collection module and a query module. Optionally, the log data processing system may further include a local index module and a data statistics module; it may also include an object storage module, a distributed module (e.g., WFS or Kafka in fig. 1), and a service database coupled to the data statistics module. Optionally, the SDK (Software Development Kit) coupled to the log collection module may also be considered part of the log data processing system, which is not limited in the present application.
The SDK may be responsible for collecting and reporting the log data of each search service module, for example by reporting periodically, which is not limited in the present application. Correspondingly, the log collection module can obtain the log data of each search service module through the SDK and can aggregate the log data of the same search request; the aggregated log data and the corresponding index may then be stored in the object storage module and the distributed module, respectively. The log collection module may construct an LRU (Least Recently Used) cache in memory, i.e., a caching mechanism that keeps recently used data and evicts the least recently used data when the cache is full, to improve the aggregation efficiency of the log data.
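As an illustration of how an in-memory LRU cache could support aggregation in the log collection module, the following is a minimal sketch only. The class name LruAggregator, the capacity parameter, and the flushed list (standing in for downstream storage) are hypothetical names introduced here and are not taken from the application.

```python
from collections import OrderedDict


class LruAggregator:
    """Hypothetical in-memory LRU buffer keyed by searchid (an assumption)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.cache = OrderedDict()  # searchid -> list of log entries
        self.flushed = []           # stands in for downstream storage

    def add(self, searchid, entry):
        if searchid in self.cache:
            # Mark this search request as the most recently used.
            self.cache.move_to_end(searchid)
        else:
            self.cache[searchid] = []
            if len(self.cache) > self.capacity:
                # Evict the least recently used request and hand it downstream.
                old_id, old_entries = self.cache.popitem(last=False)
                self.flushed.append((old_id, old_entries))
        self.cache[searchid].append(entry)


agg = LruAggregator(capacity=2)
agg.add("aaa1", "A: received")
agg.add("aaa2", "A: received")
agg.add("aaa1", "B: handled")
agg.add("aaa3", "A: received")  # evicts aaa2, the least recently used
```

In this sketch, entries of a search request that is still being reported stay in memory and are appended cheaply, while requests that have gone quiet are the first to be evicted and handed on.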
The query module can be used to respond to log and call chain queries and to queries of statistical data, i.e., to respond to log statistics requests. The local index module can be used to read the index from the distributed module and store it locally, so as to respond efficiently to index queries from the query module. The object storage module can be used for log storage, and the distributed module can be used for index storage; in one example, the storage cost of the object storage module is lower than that of the distributed module, so that resources can be saved compared with existing storage schemes that use several distributed modules. The data statistics module may be used for statistics on service data, such as RPC (Remote Procedure Call) call time consumption, experiment time consumption, exception information, and the like. The service database may be used to store the logs of multiple service types separately, and the manner in which the service database stores them may vary. For the specific functions of the modules in the log data processing system, reference is made to the following description.
It should be noted that, in the specific embodiment of the present application, related data of a user is referred to, and when the following embodiments of the present application are applied to specific products or technologies, user permission or consent is required to be obtained, and the collection, use and processing of related data are required to comply with related laws and regulations and standards of related countries and regions.
Fig. 2 shows a flowchart of a log data processing method according to an embodiment of the present application. As shown in fig. 2, the method may include:
S201, acquiring a plurality of pieces of log data generated by a plurality of search service modules on search requests processed in a history period.
In the embodiment of the present specification, the search service module may refer to a device for responding to a search service, such as a server. For example, a device that performs a search service in response to a search request, such as a search request triggered by a search term, may be considered a search service module. The history period may refer to a period of time before a time corresponding to the acquisition operation of the plurality of pieces of log data, which is not limited by the present application. The search request may refer to a request triggered based on a search service. The log data may be data recording relevant processing information of the search request, such as a search word used, a request time of the search request, etc., which is not limited in the present application.
Illustratively, the log data may include search request identification information (represented by logid), which may include a search request identification (represented by searchid) for uniquely identifying a search request. The search service modules may include a search service module serving as the main call (the caller) in the search request and a search service module serving as the called call (the callee). The calling search service module can be regarded as a client, and the called search service module can be regarded as a server. Client and server are defined with respect to a call chain: for example, when search service module A calls search service module B while processing a search request, the call relationship can be denoted A-B, and this call relationship can be reported by A or by B. The data reported by A can be regarded as client-side data, and the data reported by B can be regarded as server-side data. In the prior art only B reports; the present application adds reporting by A, so that the A-B call relationship has more complete log data.
In the embodiment of the present disclosure, the plurality of pieces of log data generated by the plurality of search service modules for the search requests processed in the history period may be acquired. For example, as shown in fig. 1, the search service modules may report this log data to the log collection module through the SDK, so that the log data processing system obtains it.
In one example, the search service module may initialize the SDK so that log data is actively reported through the interface provided by the SDK. The log data of the same search request may be identified and associated by a logid. On this basis, for a given search request, the SDK is initialized at the entry module (i.e., the first search service module to respond to the search request), and the logid of the search request is set there. Further, the set logid may be transmitted to the downstream search service modules as a parameter of the search request, so that the log data of the same search request can be strung together based on the logid.
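The logid propagation and two-sided reporting described above can be sketched as follows. This is a minimal illustration under stated assumptions: the functions report, module_a and module_b, the reported list (standing in for the log collection module), and the use of uuid to generate a searchid are all hypothetical and are not the SDK interface of the application.

```python
import uuid

reported = []  # stands in for the log collection module (assumption)


def report(logid, module, role, peer):
    # Hypothetical SDK call: record one side of a call relationship.
    reported.append({"logid": logid, "module": module, "role": role, "peer": peer})


def module_b(request):
    # Called module: reuse the logid received as a request parameter.
    report(request["logid"], "B", "server", "A")
    return "results for " + request["query"]


def module_a(query):
    # Entry module: set the logid once for the whole search request.
    logid = {"searchid": uuid.uuid4().hex, "query": query}
    request = {"logid": logid, "query": query}
    report(logid, "A", "client", "B")  # caller-side report of A-B
    return module_b(request)


result = module_a("xxx")
```

Both reports carry the same searchid, so the client-side and server-side views of the A-B call can later be joined into one call chain.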
S203, based on the search request identifiers, aggregating the plurality of pieces of log data to obtain aggregated log data corresponding to each search request identifier.
In the embodiment of the present disclosure, in order to reduce the log storage space to increase the log storage duration and increase the log query efficiency, the aggregation processing is selected to be performed on a plurality of pieces of log data based on the search request identifier, that is, the log data under the same search request identifier may be aggregated, so as to obtain aggregated log data corresponding to each search request identifier. In this way, log data of the same search request is aggregated together, facilitating efficient subsequent queries for log data based on search request identification.
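The aggregation in step S203 amounts to grouping log entries by their search request identification. The following minimal sketch assumes each log entry is a dictionary with a searchid key and a detail key; the function name aggregate_by_searchid and the sample entries are illustrative only.

```python
from collections import defaultdict


def aggregate_by_searchid(log_entries):
    """Group log details under the searchid they belong to."""
    grouped = defaultdict(list)
    for entry in log_entries:
        grouped[entry["searchid"]].append(entry["detail"])
    return dict(grouped)


logs = [
    {"searchid": "aaa1", "detail": "A->B client"},
    {"searchid": "aaa2", "detail": "A->C client"},
    {"searchid": "aaa1", "detail": "A->B server"},
]
aggregated = aggregate_by_searchid(logs)
```

After aggregation, one lookup by searchid returns every entry of that search request, rather than one lookup per reporting module.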
In an alternative embodiment, the search request identification information logid of the log data may include a search request identification field (searchid), a search account identification field (which may be expressed by uin), a search word field for the search word used in the search request (which may be expressed by query), a request time field of the search request (which may be expressed by timestamp), a storage duration field of the log (also referred to as the log storage duration field), and a detail field of the log, which are not limited in the present application. Accordingly, the aggregated log data obtained by aggregating the log data of the same search request may also include these fields, where the search request identifications under the searchid field are all the same, namely the search request identification of that search request. The search request identification information of the aggregated log data may be used to identify one piece of aggregated log data.
Optionally, in a case where the search request identification information includes a log storage duration field, the method may further include: the log storage duration in the log storage duration field may be updated in response to a duration update request for the log storage duration in the log storage duration field. In this way, the field comprising the log storage duration is set in the search request identifier, so that the log storage duration can be flexibly indicated.
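One possible in-memory shape for a piece of aggregated log data with an updatable storage duration is sketched below. The field names follow the description (searchid, uin, query, timestamp); the class name AggregatedLog, the storage_days attribute, its default of 7 days, and the function update_storage_days are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class AggregatedLog:
    searchid: str
    uin: str
    query: str
    timestamp: int
    storage_days: int = 7  # assumed default retention, for illustration
    details: list = field(default_factory=list)


def update_storage_days(record, new_days):
    """Apply a duration update request to one aggregated record."""
    if new_days <= 0:
        raise ValueError("storage duration must be positive")
    record.storage_days = new_days
    return record


record = AggregatedLog("aaa1", "hh", "xxx", 1690000000)
update_storage_days(record, 30)
```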
S205, in response to the log query request, extracting a query identifier from the log query request, wherein the query identifier corresponds to the search request identifier.
In the embodiment of the present disclosure, "the query identifier corresponds to the search request identifier" may mean that the query identifier is itself a search request identifier, which fits the log aggregation based on search request identifiers and makes it convenient to extract the aggregated log data that matches the search request identifier. Alternatively, it may mean that the query identifier corresponds to another field in the search request identification information to which the search request identifier belongs. For example, the other field may be the search account identification field (which may be represented by uin), so that the log data of the same search account may be obtained through a log query request. The search account identification field may be used to store a search account identification; specifically, it may record the identification of the search account that triggered the search request.
In practical applications, when a log is generally required to be checked or a call chain is generally required to be checked to locate a problem, a log query request can be triggered. For example, as shown in FIG. 1, a log query request may be triggered by a front-end page, which may be communicated to a query module in a log data processing system. Accordingly, the query module may extract the query identification from the log query request in response to the log query request.
S207, inquiring the matched target aggregate log data from the aggregate log data according to the inquiry identification.
In one example, the search request identifier corresponding to the target aggregate log data is matched with the query identifier, which may mean that the search request identifier corresponding to the target aggregate log data is the same as the query identifier; or the search account identification in the search request identification information to which the search request identification corresponding to the target aggregate log data belongs is the same as the query identification. The target aggregate log data refers to aggregate log data for which the search request identification matches the query identification.
In the embodiment of the present disclosure, the query identifier may be matched against the search request identifier under the searchid field or the search account identifier under the uin field of each piece of aggregated log data, so as to obtain the matched target aggregated log data.
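The matching in step S207 can be sketched as a filter over the aggregated log data, keyed on either field. The function name query_aggregated, the mode parameter, and the sample records are hypothetical and serve only to illustrate the two matching modes.

```python
def query_aggregated(records, query_id, mode="searchid"):
    """Return aggregated records whose searchid or uin equals query_id."""
    if mode not in ("searchid", "uin"):
        raise ValueError("mode must be 'searchid' or 'uin'")
    return [r for r in records if r[mode] == query_id]


records = [
    {"searchid": "aaa1", "uin": "hh", "query": "xxx"},
    {"searchid": "aaa2", "uin": "hh", "query": "yyy"},
    {"searchid": "aaa3", "uin": "kk", "query": "zzz"},
]
by_request = query_aggregated(records, "aaa1", mode="searchid")
by_account = query_aggregated(records, "hh", mode="uin")
```

A query by searchid yields at most one piece of aggregated log data, while a query by uin may yield several, one per search request triggered by that account.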
By carrying out aggregation processing on a plurality of pieces of log data based on the search request identifications, aggregated log data corresponding to each search request identification is obtained, the storage space of the log data can be greatly saved, and the log can be kept for a longer time; when responding to the log query request, a query identifier can be extracted from the log query request, and the query identifier corresponds to the search request identifier; and querying the matched target aggregate log data from the aggregate log data according to the query identification. Therefore, multiple index queries and multiple log data extraction are not needed, the aggregated log data of the same search request can be obtained at one time, and the query efficiency of the log and the call chain is improved. In addition, the search service module comprises a search service module serving as a main call and a search service module serving as a called call in the search request, so that the information of the call chain can be more complete and accurate.
Because the reported data in the present application is more complete, the data statistics module included in the log data processing system can work with service databases of various service types, which reduces the coupling of data statistics and improves extensibility. On this basis, the data statistics module receives the aggregate log data, which may be passed on by the log collection module; the data statistics module can then classify the aggregate log data and store it in the corresponding service databases. That is, the aggregate log data is stored by service type, so that aggregate log data of different service types is stored in different service databases, which facilitates subsequent service statistics. Accordingly, the method may further comprise: in response to a log statistics request, acquiring a target service type to be counted; the log statistics request may be triggered, for example, from the front-end page in fig. 1. Further, service log data, i.e., the aggregate log data in the service database corresponding to the target service type, may be obtained from that service database; service statistics processing can then be performed on the service log data to obtain service statistics results, such as the exception location or the retry location of the search request, which is not limited in the present application.
By way of example, the service type may refer to a data form type of the search request, such as an image, text, etc., or the service type may refer to a data content type of the search request, such as a sports, cosmetic, etc. The application is not limited in this regard.
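Classified storage by service type followed by a per-type statistic could look like the sketch below. The service_type and exception keys, the function names, and the in-memory dictionary standing in for the service databases are all assumptions for illustration.

```python
from collections import defaultdict


def classify_by_service_type(records):
    """Store each aggregated record under its service type."""
    databases = defaultdict(list)
    for record in records:
        databases[record["service_type"]].append(record)
    return databases


def count_exceptions(databases, target_type):
    """Example statistic: number of requests of one type with an exception."""
    return sum(1 for r in databases.get(target_type, []) if r["exception"])


records = [
    {"searchid": "aaa1", "service_type": "image", "exception": False},
    {"searchid": "aaa2", "service_type": "text", "exception": True},
    {"searchid": "aaa3", "service_type": "image", "exception": True},
]
databases = classify_by_service_type(records)
image_exceptions = count_exceptions(databases, "image")
```

Because each service type has its own store, a statistics request only scans the records of the target service type.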
As an alternative embodiment, the aggregated log data may be aggregated further to reduce the storage space even more, so that the log data may be retained for longer periods of time, e.g., 15, 30, 60 or 90 days, which is not limited in the present application. On this basis, the method may further comprise: performing further aggregation processing on the aggregated log data to obtain aggregated log blocks that meet a preset aggregation threshold, and configuring an aggregation index for each aggregated log block.
As an example, the preset aggregation threshold may refer to an end condition for indicating one aggregation process, for example, the preset aggregation threshold may be an aggregation duration threshold or a preset data amount, which is not limited by the present application. This can result in a funnel-like operation, reducing the frequency of storage.
For example, if the preset aggregation threshold is a preset data size, the aggregate log data may be further aggregated, and each log block with a size of 5M obtained by aggregation may be used as an aggregate log block, that is, the data size of the aggregate log block is 5M.
Further, an aggregate index may be configured for the aggregate log blocks, which may be used to uniquely identify the aggregate log blocks. For example, agg1- -block1, agg2- -block2, etc. Where agg1 and agg2 may refer to an aggregate index and block1 and block2 may refer to an aggregate log block.
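The second-level aggregation into blocks with a data-size threshold can be sketched as follows. This is one possible reading, in which a block is closed just before it would exceed the threshold; the function name build_blocks, the max_bytes parameter, and the "agg" plus counter naming scheme are assumptions. A tiny threshold is used instead of 5M so the behavior is easy to follow.

```python
def build_blocks(aggregated_logs, max_bytes):
    """Pack serialized aggregated logs into blocks of at most max_bytes."""
    blocks = {}
    current, size = [], 0
    for record in aggregated_logs:
        record_size = len(record.encode("utf-8"))
        if current and size + record_size > max_bytes:
            # Close the current block and assign it the next aggregate index.
            blocks["agg%d" % (len(blocks) + 1)] = current
            current, size = [], 0
        current.append(record)
        size += record_size
    if current:
        blocks["agg%d" % (len(blocks) + 1)] = current
    return blocks


blocks = build_blocks(["a" * 4, "b" * 4, "c" * 4], max_bytes=8)
```

Writing one block per threshold instead of one object per record is what reduces the storage frequency.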
To support lookup, log index information may be constructed that relates the aggregate index of an aggregate log block to the search request identifications and search account identifications it contains. To this end, the search account identification, i.e., the one under the uin field, can be extracted from the aggregate log data; the correspondence among the search request identification, the search account identification and the aggregate index can then be constructed to obtain the log index information. Further, the aggregate index may be stored in association with the aggregate log block, and the log index information may be stored separately. Storing the log index information and the aggregate log blocks separately can improve index retrieval efficiency.
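Constructing the log index information from the aggregate log blocks can be sketched as below, assuming each block is a list of aggregated records that carry searchid and uin. The function name build_log_index and the row layout with an agg key are illustrative assumptions.

```python
def build_log_index(blocks):
    """Map every (searchid, uin) pair to the aggregate index of its block."""
    index_rows = []
    for agg, records in blocks.items():
        for record in records:
            index_rows.append(
                {"agg": agg, "searchid": record["searchid"], "uin": record["uin"]}
            )
    return index_rows


blocks = {
    "agg1": [
        {"searchid": "aaa1", "uin": "hh", "details": ["A-B"]},
        {"searchid": "aaa2", "uin": "kk", "details": ["A-C"]},
    ],
    "agg2": [{"searchid": "aaa3", "uin": "hh", "details": ["A-B"]}],
}
log_index = build_log_index(blocks)
```

The index rows hold only identifiers, so they stay small compared with the blocks that hold the log details.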
It should be noted that, one log data in the plurality of log data may be regarded as one line of data in the repository table; in the aggregated log data, one piece of aggregated log data may be regarded as one line of data in the repository table; in the case of aggregated log blocks that are aggregated again, one aggregated log block may be considered a row of data in the repository table. In the aggregation process, the number of indexes can be continuously reduced, and the effect of optimizing the log indexes is achieved, so that the query efficiency of the log indexes in the query request can be improved.
In an alternative embodiment, the log data processing system may include an object storage module and a distributed module. On this basis, storing the aggregate index in association with the aggregate log block and storing the log index information may include: storing the aggregate index in association with the aggregate log block in the object storage module, such as the object storage module used for log storage in fig. 1; and storing the log index information in the distributed module, such as the WFS used for index storage in fig. 1. Because the cost of the object storage module is lower than that of the distributed module, storage cost can be saved. For example, each row of data in the object storage module may be an aggregate index and the aggregate log block corresponding to that aggregate index. Each row of data in the distributed module may be an aggregate index, a search request identifier, and a search account identifier; that is, the fields in the distributed module are an aggregate index field (which may be denoted by agg), a searchid field, and a uin field.
With this storage scheme, the matched target aggregate index can first be queried from the distributed module, and the aggregate log block corresponding to the target aggregate index can then be queried from the object storage module, so that the target aggregate log data to be queried can be extracted from that aggregate log block. For example, the query identifier is a search request identifier to be queried or a search account identifier to be queried; accordingly, querying the matched target aggregate log data from the aggregate log data according to the query identifier, that is, step S207, may include:
Inquiring a matched target aggregation index from the log index information based on the inquiry identification;
querying a matched target aggregation log block by using a target aggregation index;
and extracting target aggregate log data matched with the query identifier from the target aggregate log block.
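The three steps above can be sketched as follows, under the assumption that the index information is a list of rows with `agg`, `searchid`, and `uin` fields and that blocks are lists of records stored under their aggregate index. The function name and the sample data are hypothetical.

```python
def query_target_logs(index_rows, object_store, field, value):
    """Return aggregated log records whose `field` ('searchid' or 'uin') equals `value`."""
    # Step 1: look up the matching target aggregate indexes in the index information.
    target_aggs = []
    for row in index_rows:
        if row[field] == value and row["agg"] not in target_aggs:
            target_aggs.append(row["agg"])

    matches = []
    for agg in target_aggs:
        # Step 2: fetch the aggregate log block stored under the target aggregate index.
        block = object_store[agg]
        # Step 3: extract only the records that match the query identifier.
        matches.extend(rec for rec in block if rec[field] == value)
    return matches


index_rows = [
    {"agg": "agg-0", "searchid": "aaa1", "uin": "hh"},
    {"agg": "agg-0", "searchid": "aaa2", "uin": "kk"},
    {"agg": "agg-1", "searchid": "aaa3", "uin": "hh"},
]
object_store = {
    "agg-0": [
        {"searchid": "aaa1", "uin": "hh", "query": "xxx"},
        {"searchid": "aaa2", "uin": "kk", "query": "yyy"},
    ],
    "agg-1": [{"searchid": "aaa3", "uin": "hh", "query": "zzz"}],
}

by_request = query_target_logs(index_rows, object_store, "searchid", "aaa1")
by_account = query_target_logs(index_rows, object_store, "uin", "hh")
```

A query by searchid touches one block and returns one record; a query by uin may touch several blocks and returns every record for that account.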
For example, in the case where the query identifier is the search request identifier to be queried, the queried target aggregate log data may be as shown in fig. 3. The front-end page that provides the query may allow the query mode to be selected, that is, whether to query by search request identifier or by search account identifier, and may also allow the date to be restricted, as shown in fig. 3. For example, after the query mode is selected as searchid, a searchid such as aaa1 is entered, and the "query" button is triggered, the single piece of aggregate log data shown in fig. 3 may be displayed, which may include fields such as searchid, uin, query, request time, storage duration, and details.
The hh in the uin field indicates that the search request corresponding to the target aggregate log data was triggered by the search account whose search account identifier is hh.
The xxx under query may refer to the search term of the search request aaa1.
The request time may be the time at which the search request aaa1 was triggered, which may fall between the log date and the current time.
The 7 days under storage duration may mean that the target aggregate log data will be retained for 7 days.
Under details, the details of the target aggregate log data can be viewed by triggering "view".
Alternatively, in the case where the query identifier is the search account identifier to be queried, as shown in fig. 4, the queried target aggregate log data may be all aggregate log data whose search account is hh and which satisfy the log date; the fields have the same meanings as the fields in fig. 3 above. By providing the search account identifier as the query identifier, the log data under the same search account identifier can be obtained efficiently.
It should be noted that the input field for the specific query identifier may be displayed in linkage with the selected query mode; for example, in fig. 3, if the query mode is selected as searchid, a searchid input field is displayed next to it; in fig. 4, if the query mode is selected as uin, a uin input field is displayed next to it.
Optionally, as shown in fig. 1, the log data processing system may further include a local index module configured to store full index information. The read speed of the local index module may be higher than that of the distributed module, so the response speed of queries can be improved; and because index queries are served by the local index module, the distributed module does not need to retain the full index information, which reduces its storage space requirement and therefore the storage cost. The method may further comprise: based on the local index module, reading the log index information of the distributed module and performing an incremental update of the full index information. Illustratively, the fields of the repository table in the local index module may be an aggregate index field (which may be denoted by agg), a searchid field, and a uin field.
Accordingly, querying the matched target aggregate index from the log index information based on the query identifier includes: in the local index module, querying the matched target aggregate index from the full index information based on the query identifier.
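One way to picture the incremental update is sketched below. The text does not say how new index rows are detected, so the sketch assumes each row in the distributed module carries a monotonically increasing sequence number `seq`; that field, the class `LocalIndex`, and its methods are all hypothetical.

```python
class LocalIndex:
    """Full index kept locally and refreshed incrementally from the distributed module."""

    def __init__(self):
        self.by_searchid = {}  # searchid -> agg
        self.by_uin = {}       # uin -> list of agg, in first-seen order
        self.last_seq = 0      # highest sequence number already applied

    def incremental_update(self, distributed_rows):
        """Apply only rows newer than last_seq; return how many were applied."""
        applied = 0
        for row in sorted(distributed_rows, key=lambda r: r["seq"]):
            if row["seq"] <= self.last_seq:
                continue  # already present in the full index
            self.by_searchid[row["searchid"]] = row["agg"]
            aggs = self.by_uin.setdefault(row["uin"], [])
            if row["agg"] not in aggs:
                aggs.append(row["agg"])
            self.last_seq = row["seq"]
            applied += 1
        return applied

    def lookup(self, field, value):
        """Return the list of target aggregate indexes for a searchid or uin."""
        if field == "searchid":
            agg = self.by_searchid.get(value)
            return [] if agg is None else [agg]
        return list(self.by_uin.get(value, []))


local = LocalIndex()
first = local.incremental_update([
    {"seq": 1, "agg": "agg-0", "searchid": "aaa1", "uin": "hh"},
    {"seq": 2, "agg": "agg-1", "searchid": "aaa3", "uin": "hh"},
])
# The distributed module has since dropped seq 1 and gained seq 3.
second = local.incremental_update([
    {"seq": 2, "agg": "agg-1", "searchid": "aaa3", "uin": "hh"},
    {"seq": 3, "agg": "agg-2", "searchid": "aaa4", "uin": "kk"},
])
```

After the second update the local index still resolves aaa1, even though the distributed module no longer holds that row, which is the property the paragraph above relies on.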
In an alternative embodiment, in the process of aggregating multiple pieces of log data, besides the log data compression processing, call chain information can be constructed based on the call relationship of the search service module under the same search request, so as to facilitate the query of a subsequent call chain.
Optionally, the log data may carry the call relationships between the search service modules in the search request and, for each call relationship, a remote procedure call identifier (which may be denoted by span), a call type identifier (which may be denoted by task) for indicating a serial call or a parallel call, and a call retry identifier (which may be denoted by retryid). Illustratively, the remote procedure call identifier, the call type identifier, and the call retry identifier may be carried by setting corresponding fields in the log data, which is not limited by the present application. The remote procedure call identifier span may be used to distinguish between different remote procedure call (RPC) calls; the call retry identifier retryid may be used to characterize the number of retries.
Accordingly, the aggregating processing is performed on the plurality of pieces of log data based on the search request identifier to obtain aggregated log data corresponding to each search request identifier, which may include the following steps:
screening out first log data under the same search request identifier from the plurality of pieces of log data; that is, the plurality of pieces of log data are clustered by search request identifier to obtain the log data under each search request identifier, which, for convenience of description, is referred to as first log data.
Further, based on the call retry identifiers respectively corresponding to the main call (caller) and the called call (callee) of the first log data, second log data whose main-call and called-call retry identifiers are consistent can be screened out. The purpose is to retain only the log data that is actually useful under each search request identifier: for the first log data under each search request identifier, the log data whose main-call retry identifier matches the called-call retry identifier is kept, yielding the second log data under that search request identifier.
Call chain information corresponding to the second log data can then be constructed according to the remote procedure call identifier and the call type identifier carried by the second log data. The second log data under the same search request identifier and the corresponding call chain information can then be aggregated to obtain the aggregated log data corresponding to each search request identifier; that is, the aggregated log data may include the constructed call chain information, so that the call chain information is complete and accurate.
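The aggregation steps above can be sketched as follows. This is one reading only: it assumes each log record describes one call and carries both a caller-side and a callee-side retry identifier in the hypothetical fields `caller_retryid` and `callee_retryid`, and it reduces the call chain information to an ordered list of calls.

```python
from collections import defaultdict


def aggregate_by_request(log_records):
    """Return {searchid: aggregated record} for a list of raw log records."""
    # First log data: cluster the raw records by search request identifier.
    first = defaultdict(list)
    for rec in log_records:
        first[rec["searchid"]].append(rec)

    aggregated = {}
    for searchid, records in first.items():
        # Second log data: keep records whose caller-side and callee-side
        # retry identifiers agree (an assumed reading of "consistent").
        second = [r for r in records if r["caller_retryid"] == r["callee_retryid"]]
        # Call chain information, here reduced to the ordered list of calls.
        chain = [(r["span"], r["caller"], r["callee"], r["task"]) for r in second]
        aggregated[searchid] = {"searchid": searchid, "logs": second, "call_chain": chain}
    return aggregated


logs = [
    {"searchid": "aaa1", "caller": "A", "callee": "B", "span": "s1",
     "task": "serial", "caller_retryid": 0, "callee_retryid": 0},
    # Mismatched retry identifiers: this record is filtered out.
    {"searchid": "aaa1", "caller": "B", "callee": "C", "span": "s2",
     "task": "parallel", "caller_retryid": 1, "callee_retryid": 0},
    {"searchid": "aaa2", "caller": "A", "callee": "B", "span": "s3",
     "task": "serial", "caller_retryid": 0, "callee_retryid": 0},
]
aggregated = aggregate_by_request(logs)
```

Each search request identifier ends up with exactly one aggregated record that bundles its retained logs with its call chain.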
As an example, the log data may carry the call relationships between the search service modules in the search request, so the screened-out second log data also includes these call relationships. However, when one search service module calls another multiple times, the call relationship alone does not show which call a given log record belongs to, so the resulting call chain is incomplete and inaccurate. For example, for the call relationship A→B, where A and B refer to two search service modules, different RPC calls cannot be distinguished; in the prior art, where no remote procedure call identifier span is set, an A-to-B call relationship may be constructed from records of different RPC calls. If A→B occurs twice, that is, A1→B1 and A2→B2, where A1→B1 refers to one RPC call and A2→B2 to another, then without the remote procedure call identifier span the incorrect pairings A1→B2 and A2→B1 may be produced.
Correspondingly, constructing the call chain information corresponding to the second log data according to the remote procedure call identifier and the call type identifier carried by the second log data may include: determining, according to the call type identifier, whether any given call relationship is a serial call or a parallel call, and distinguishing different RPC calls according to the remote procedure call identifier span, so that the complete call relationship under a search request can be constructed accurately.
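The role of span in avoiding mismatched pairings can be shown with a small sketch. It assumes the caller and the callee each write their own log record and that both records of one RPC call share the same span value; the `role` and `label` fields and the function `build_call_chain` are hypothetical.

```python
def build_call_chain(records):
    """Pair caller-side and callee-side records that share a span."""
    callers, callees = {}, {}
    for rec in records:
        side = callers if rec["role"] == "caller" else callees
        side[rec["span"]] = rec

    chain = []
    for span, caller in callers.items():
        callee = callees.get(span)
        if callee is None:
            continue  # no callee-side record for this RPC call
        chain.append({
            "span": span,
            "caller": caller["label"],
            "callee": callee["label"],
            "task": caller["task"],  # serial or parallel call
        })
    return chain


records = [
    {"role": "caller", "label": "A1", "span": "s1", "task": "serial"},
    {"role": "caller", "label": "A2", "span": "s2", "task": "parallel"},
    # Callee records arrive in the opposite order; span still pairs them correctly.
    {"role": "callee", "label": "B2", "span": "s2", "task": "parallel"},
    {"role": "callee", "label": "B1", "span": "s1", "task": "serial"},
]
chain = build_call_chain(records)
```

Pairing by arrival order would have produced A1→B2 and A2→B1; keying on span yields A1→B1 and A2→B2 regardless of order.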
Optionally, the method may further include: in response to a call chain query request for the target aggregate log data, extracting target call chain information from the target aggregate log data, as shown in fig. 5. For example, the call chain information may be as shown in the lower area of fig. 5: it may be identification information of the search service modules arranged in call order, and clicking "details" after the identification information of any search service module shows the details of the search request on that search service module, such as the call time and the call duration.
In an alternative embodiment, as shown in fig. 5, the call chain information may also include the content in the upper left corner of fig. 5, such as the search request identifier, the search account identifier, and the request time. It may also include the content in the upper right corner of fig. 5, for example the position in the call chain at which an abnormality occurred (the abnormal position), such as the identification information of the search service module at which the call failed, and the retry position, such as the identification information of the search service module at which a call retry occurred. The present application is not limited in this respect.
In one example application, referring to fig. 1, log queries, call chain queries, and log statistics requests may be triggered from the front-end page. For example, the pages of fig. 3 and fig. 4 may be used to trigger a log query: a searchid or a uin may be entered to query the aggregate log data of one search request or all aggregate log data under the same search account. A log data processing system with the simple architecture of fig. 1 shortens the query path. Because the log data of the same search request is aggregated before being stored, queries are simpler and more efficient. In existing methods without log data aggregation, the log data of one search request is stored in scattered form and occupies multiple index rows; by comparison, the present method does not require multiple index lookups per query, which greatly reduces query time. In addition, the search account identifier is provided as a query key, so that the query operations of a search account can be traced back.
Alternatively, the call chain query may be triggered, for example, the call chain query may be triggered directly, or the call chain query may be further performed according to the queried aggregate log data after the log query. The presentation of the call chain information of the query may be as shown in fig. 5, which is not limited by the present application.
For log statistics requests, each service database of the log data processing system corresponds to one abstracted service type. The service types do not affect one another and coupling is reduced, so when a new statistics requirement arises, it can be met promptly simply by adding a new service database.
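A minimal sketch of this arrangement follows. It assumes each aggregated record carries a `service_type` field and that a statistics request simply counts requests and distinct accounts; the field name, the functions, and the chosen statistics are all illustrative assumptions.

```python
def classify(aggregated_records, service_dbs):
    """Store each aggregated record in the service database of its service type."""
    for rec in aggregated_records:
        # A new service type gets its own database without touching the others.
        service_dbs.setdefault(rec["service_type"], []).append(rec)


def service_statistics(service_dbs, target_service_type):
    """Count requests and distinct search accounts for one service type."""
    logs = service_dbs.get(target_service_type, [])
    return {"requests": len(logs), "accounts": len({rec["uin"] for rec in logs})}


service_dbs = {}
classify([
    {"searchid": "aaa1", "uin": "hh", "service_type": "web"},
    {"searchid": "aaa2", "uin": "hh", "service_type": "web"},
    {"searchid": "aaa3", "uin": "kk", "service_type": "video"},
], service_dbs)
web_stats = service_statistics(service_dbs, "web")
```

Because each service type has its own database, a statistics request reads only the records of the target service type.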
Fig. 6 shows a block diagram of a log data processing apparatus according to an embodiment of the present application. The log data processing apparatus may be applied to a log data processing system, for example, the log data processing apparatus may be provided in the log data processing system; or the various modules included in the log data processing apparatus may be adaptively disposed in corresponding modules in the log data processing system, which is not limited in the present application. As shown in fig. 6, the apparatus may include:
an obtaining module 601, configured to obtain a plurality of pieces of log data generated by a plurality of search service modules for search requests processed in a history period; the log data comprises search request identification information, wherein the search request identification information comprises a search request identification for uniquely identifying a search request; the search service module comprises a search service module serving as a main call in the search request and a search service module serving as a called call;
A first aggregation module 603, configured to aggregate a plurality of pieces of log data based on the search request identifiers, to obtain aggregated log data corresponding to each search request identifier;
the query identifier extraction module 605 is configured to extract a query identifier from the log query request in response to the log query request, where the query identifier corresponds to the search request identifier;
a query module 607, configured to query, according to the query identifier, the aggregate log data for the matched target aggregate log data; the search request identification corresponding to the target aggregate log data is matched with the query identification.
According to the log data processing device provided by the embodiment of the application, a plurality of pieces of log data can be aggregated based on the search request identifiers to obtain the aggregated log data corresponding to each search request identifier, so that the storage space of the log data can be greatly saved, and the log can be kept for a longer time; when responding to the log query request, a query identifier can be extracted from the log query request, and the query identifier corresponds to the search request identifier; and querying the matched target aggregate log data from the aggregate log data according to the query identification. Therefore, multiple index queries and multiple log data extraction are not needed, the aggregated log data of the same search request can be obtained at one time, and the query efficiency of the log and the call chain is improved. In addition, the search service module comprises a search service module serving as a main call and a search service module serving as a called call in the search request, so that the information of the call chain can be more complete and accurate.
In one possible implementation manner, the apparatus may further include:
the second aggregation module is used for carrying out further aggregation processing on the aggregation log data to obtain aggregation log blocks meeting a preset aggregation threshold value, and configuring an aggregation index for the aggregation log blocks;
the search account identification extraction module is used for extracting a search account identification from the aggregate log data;
the log index information acquisition module is used for constructing the corresponding relation between the search request identification, the search account identification and the aggregation index to obtain log index information;
and the storage module is used for storing the aggregation index and the aggregation log block in an associated mode and storing log index information.
In one possible implementation, the log data processing system includes an object storage module and a distributed module; the storage module may include:
the first storage unit is used for storing the aggregation index and the aggregation log block in an object storage module in an associated mode;
and the second storage unit is used for storing the log index information into the distributed module.
In one possible implementation, the query identifier is a search request identifier to be queried or a search account identifier to be queried; the query module 607 may include:
The target aggregation index query unit is used for querying the matched target aggregation index from the log index information based on the query identification;
the target aggregate log block query unit is used for querying the matched target aggregate log blocks by using the target aggregate index;
and the target aggregate log data acquisition unit is used for extracting target aggregate log data matched with the query identifier from the target aggregate log block.
In one possible implementation, the log data processing system further includes a local index module for storing full index information; the apparatus may further include:
the local index updating module is used for reading the log index information of the distributed module based on the local index module and performing incremental updating of the full index information;
the target aggregate index query unit is further configured to query, in the local index module, a matching target aggregate index from the full-scale index information based on the query identifier.
In one possible implementation, the log data further includes a remote procedure call identifier between the plurality of search service modules in each search request, a call type identifier for indicating a serial call or a parallel call, and a call retry identifier; the first aggregation module 603 may include:
A first log data acquisition unit, configured to screen out first log data under the same search request identifier from the plurality of pieces of log data;
the second log data acquisition unit is used for screening out, based on the call retry identifiers respectively corresponding to the main call and the called call of the first log data, second log data whose main-call and called-call retry identifiers are consistent;
the call chain construction unit is used for constructing call chain information corresponding to the second log data according to the remote procedure call identifier and the call type identifier carried by the second log data;
the log data aggregation unit is used for carrying out aggregation processing on the second log data under the same search request identifier and the call chain information corresponding to the second log data to obtain aggregated log data corresponding to each search request identifier.
In one possible implementation manner, the apparatus may further include:
and the call chain query module is used for responding to a call chain query request of the target aggregate log data and extracting target call chain information from the target aggregate log data.
In one possible implementation, the search request identification information further includes a log storage duration field; the apparatus may further include:
The log storage duration updating module is used for responding to a duration updating request of the log storage duration in the log storage duration field and updating the log storage duration in the log storage duration field.
In one possible implementation, the log data processing system includes a data statistics module and a plurality of service databases of service types, wherein the data statistics module stores aggregate log data; the aggregate log data is classified and stored into a corresponding service database through a data statistics module, and the device can further comprise:
the statistics processing module is used for responding to the log statistics request and acquiring a target service type to be counted; acquiring service log data from a service database corresponding to the target service type; and carrying out service statistics processing based on the service log data to obtain service statistics results.
The specific manner in which the individual modules and units perform the operations in relation to the apparatus of the above embodiments has been described in detail in relation to the embodiments of the method and will not be described in detail here.
Fig. 7 shows a block diagram of an electronic device for log data processing according to an embodiment of the application. The electronic device may be a server, and its internal structure may be as shown in fig. 7. The electronic device includes a processor, a memory, and a network interface connected by a system bus. Wherein the processor of the electronic device is configured to provide computing and control capabilities. The memory of the electronic device includes a nonvolatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for the operation of the operating system and computer programs in the non-volatile storage media. The network interface of the electronic device is used for communicating with an external terminal through a network connection. The computer program, when executed by a processor, implements a method of log data processing.
It will be appreciated by those skilled in the art that the structure shown in fig. 7 is merely a block diagram of a portion of the structure associated with the present inventive arrangements and is not limiting of the electronic device to which the present inventive arrangements are applied, and that a particular electronic device may include more or fewer components than shown, or may combine certain components, or have a different arrangement of components.
In an exemplary embodiment, there is also provided an electronic device including: a processor; a memory for storing the processor-executable instructions; wherein the processor is configured to execute the instructions to implement a log data processing method as in the embodiments of the present application.
In an exemplary embodiment, a storage medium is also provided; when the instructions in the storage medium are executed by a processor of an electronic device, the electronic device is enabled to perform the log data processing method in the embodiments of the application.
In an exemplary embodiment, a computer program product containing instructions that, when run on a computer, cause the computer to perform the log data processing method in an embodiment of the application is also provided.
Those skilled in the art will appreciate that all or part of the above-described methods may be implemented by a computer program stored on a non-transitory computer readable storage medium, and that the program, when executed, may comprise the steps of the method embodiments described above. Any reference to memory, storage, database, or other medium used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM), among others.
Other embodiments of the application will be apparent to those skilled in the art from consideration of the specification and practice of the application disclosed herein. This application is intended to cover any variations, uses, or adaptations of the application following, in general, the principles of the application and including such departures from the present disclosure as come within known or customary practice within the art to which the application pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the application being indicated by the following claims.
It is to be understood that the application is not limited to the precise arrangements and instrumentalities shown in the drawings, which have been described above, and that various modifications and changes may be effected without departing from the scope thereof. The scope of the application is limited only by the appended claims.

Claims (12)

1. A method of log data processing, the method comprising:
acquiring a plurality of pieces of log data generated by a plurality of search service modules on search requests processed in a history period; the log data comprises search request identification information, wherein the search request identification information comprises a search request identification for uniquely identifying a search request; the search service module comprises a search service module serving as a main call and a search service module serving as a called call in the search request;
Based on the search request identifiers, carrying out aggregation processing on the plurality of pieces of log data to obtain aggregated log data corresponding to each search request identifier;
responding to a log query request, and extracting a query identifier from the log query request, wherein the query identifier corresponds to the search request identifier;
inquiring matched target aggregate log data from the aggregate log data according to the inquiry identification; and the search request identifier corresponding to the target aggregate log data is matched with the query identifier.
2. The log data processing method as defined in claim 1, wherein the method further comprises:
further aggregating the aggregate log data to obtain aggregate log blocks meeting a preset aggregate threshold value, and configuring an aggregate index for the aggregate log blocks;
extracting a search account identifier from the aggregate log data;
constructing the corresponding relation between the search request identification, the search account identification and the aggregation index to obtain log index information;
and storing the aggregation index and the aggregation log block in an associated mode, and storing the log index information.
3. The log data processing method as claimed in claim 2, wherein the log data processing method is applied to a log data processing system, the log data processing system comprising an object storage module and a distributed module; the storing the aggregate index in association with the aggregate log block and storing the log index information includes:
Storing the aggregate index and the aggregate log block in association with the object storage module;
and storing the log index information into the distributed module.
4. The log data processing method according to claim 3, wherein the query identifier is a search request identifier to be queried or a search account identifier to be queried; the querying, according to the query identifier, the matched target aggregate log data from the aggregate log data includes:
querying a matched target aggregation index from the log index information based on the query identification;
querying a matched target aggregation log block by using the target aggregation index;
and extracting the target aggregate log data matched with the query identifier from the target aggregate log block.
5. The log data processing method as defined in claim 4, wherein the log data processing system further comprises a local index module for storing full-scale index information; the method further comprises the steps of:
based on the local index module, reading the log index information of the distributed module, and performing incremental update of the full index information;
The querying the matching target aggregation index from the log index information based on the query identification includes:
in the local index module, the matching target aggregate index is queried from the full-scale index information based on the query identification.
6. The method according to any one of claims 1 to 5, wherein the log data further includes a remote procedure call identifier between the plurality of search service modules in each search request, a call type identifier for indicating a serial call or a parallel call, and a call retry identifier; the aggregating processing is performed on the plurality of pieces of log data based on the search request identifiers to obtain aggregate log data corresponding to each search request identifier, including:
screening first log data under the same search request identifier from the plurality of pieces of log data;
screening out, based on the call retry identifiers respectively corresponding to the main call and the called call of the first log data, second log data whose call retry identifiers of the main call and the called call are consistent;
constructing call chain information corresponding to the second log data according to the remote procedure call identifier and the call type identifier carried by the second log data;
And carrying out aggregation processing on the second log data under the same search request identifier and the call chain information corresponding to the second log data to obtain aggregated log data corresponding to each search request identifier.
7. The log data processing method as defined in claim 6, wherein the method further comprises:
and responding to a call chain query request of the target aggregate log data, and extracting target call chain information from the target aggregate log data.
8. The log data processing method as set forth in claim 1, wherein the search request identification information further includes a log storage duration field; the method further comprises the steps of:
and in response to a time length update request for the log storage time length in the log storage time length field, updating the log storage time length in the log storage time length field.
9. The log data processing method as claimed in claim 1, wherein the log data processing method is applied to a log data processing system, the log data processing system comprising a data statistics module and a plurality of service types of service databases, the data statistics module storing the aggregate log data therein; the aggregate log data is classified and stored into a corresponding service database through the data statistics module, and the method further comprises the steps of:
Responding to a log statistics request, and acquiring a target service type to be counted;
acquiring service log data from a service database corresponding to the target service type;
and carrying out service statistics processing based on the service log data to obtain service statistics results.
10. A log data processing apparatus, comprising:
the acquisition module is used for acquiring a plurality of pieces of log data generated by the search service modules on the search requests processed in the history period; the log data comprises search request identification information, wherein the search request identification information comprises a search request identification for uniquely identifying a search request; the search service module comprises a search service module serving as a main call and a search service module serving as a called call in the search request;
the first aggregation module is used for carrying out aggregation processing on the plurality of pieces of log data based on the search request identifiers to obtain aggregated log data corresponding to each search request identifier;
the query identifier extraction module is used for responding to a log query request and extracting a query identifier from the log query request, wherein the query identifier corresponds to the search request identifier;
The query module is used for querying the matched target aggregate log data from the aggregate log data according to the query identifier; and the search request identifier corresponding to the target aggregate log data is matched with the query identifier.
11. An electronic device, comprising:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to execute the executable instructions to implement the log data processing method of any one of claims 1 to 9.
12. A non-transitory computer readable storage medium having stored thereon computer program instructions, which when executed by a processor, implement the log data processing method of any of claims 1 to 9.
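As an illustration only, the flow recited in claims 9 and 10 (aggregate log data by search request identifier, answer a log query by identifier, then classify by service type and compute statistics) can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the patented implementation: the `LogRecord` fields, the `"query_id"` key of the query request, the use of in-memory dictionaries in place of service databases, and the per-module count used as the "statistics result" are all hypothetical choices made for the example.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class LogRecord:
    """One log entry from a search service module (hypothetical schema)."""
    search_request_id: str  # uniquely identifies one search request
    module: str             # name of the emitting search service module
    role: str               # "caller" or "callee" within the request
    service_type: str       # service type used for classified storage
    message: str


def aggregate_by_request(records):
    """Group log records by their search request identifier."""
    aggregated = defaultdict(list)
    for record in records:
        aggregated[record.search_request_id].append(record)
    return dict(aggregated)


def query_aggregated(aggregated, log_query_request):
    """Extract the query identifier and return the matching aggregated logs."""
    query_id = log_query_request["query_id"]  # assumed request field name
    return aggregated.get(query_id, [])


def classify_by_service_type(aggregated):
    """Store records per service type; plain dicts stand in for databases."""
    databases = defaultdict(list)
    for request_logs in aggregated.values():
        for record in request_logs:
            databases[record.service_type].append(record)
    return dict(databases)


def service_statistics(databases, target_service_type):
    """Count log records per module for one service type (assumed metric)."""
    service_logs = databases.get(target_service_type, [])
    return Counter(record.module for record in service_logs)


# Made-up sample data: two search requests handled by three modules.
records = [
    LogRecord("req-1", "gateway", "caller", "video", "received query"),
    LogRecord("req-2", "gateway", "caller", "news", "received query"),
    LogRecord("req-1", "recall", "callee", "video", "recalled candidates"),
    LogRecord("req-1", "ranking", "callee", "video", "ranked candidates"),
]

aggregated = aggregate_by_request(records)
target_logs = query_aggregated(aggregated, {"query_id": "req-1"})
databases = classify_by_service_type(aggregated)
video_stats = service_statistics(databases, "video")
```

In this sketch a query for `req-1` returns the caller and callee logs of that request together, which is the point of aggregating by search request identifier; a production system would presumably replace the dictionaries with an indexed store.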
CN202310976938.6A 2023-08-04 2023-08-04 Log data processing method and device, electronic equipment and storage medium Active CN116701337B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310976938.6A CN116701337B (en) 2023-08-04 2023-08-04 Log data processing method and device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310976938.6A CN116701337B (en) 2023-08-04 2023-08-04 Log data processing method and device, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN116701337A (en) 2023-09-05
CN116701337B (en) 2024-01-16

Family

ID=87843661

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310976938.6A Active CN116701337B (en) 2023-08-04 2023-08-04 Log data processing method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN116701337B (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105095052A (en) * 2014-05-22 2015-11-25 阿里巴巴集团控股有限公司 Fault detection method and fault detection device in SOA (Service-Oriented Architecture) environment
CN107229619A (en) * 2016-03-23 2017-10-03 阿里巴巴集团控股有限公司 Internet service link calls statistics, methods of exhibiting and the device of situation
US20180278499A1 (en) * 2017-03-27 2018-09-27 Ca, Inc. Rendering application log data in conjunction with system monitoring
CN110175161A (en) * 2019-04-25 2019-08-27 平安科技(深圳)有限公司 Method, apparatus, computer equipment and the storage medium of record log
CN110855477A (en) * 2019-10-29 2020-02-28 浙江大搜车软件技术有限公司 Link log monitoring method and device, computer equipment and storage medium
CN111352760A (en) * 2020-02-27 2020-06-30 深圳市腾讯网域计算机网络有限公司 Data processing method and related device
CN111522922A (en) * 2020-03-26 2020-08-11 浙江口碑网络技术有限公司 Log information query method and device, storage medium and computer equipment

Also Published As

Publication number Publication date
CN116701337B (en) 2024-01-16

Similar Documents

Publication Publication Date Title
US20200372007A1 (en) Trace and span sampling and analysis for instrumented software
CN111339171B (en) Data query method, device and equipment
CN108647357B (en) Data query method and device
CN111385365B (en) Processing method and device for reported data, computer equipment and storage medium
CN110659282B (en) Data route construction method, device, computer equipment and storage medium
CN111740868B (en) Alarm data processing method and device and storage medium
CN111488377A (en) Data query method and device, electronic equipment and storage medium
CN110362607B (en) Abnormal number identification method, device, computer equipment and storage medium
CN114116762A (en) Offline data fuzzy search method, device, equipment and medium
CN116055551A (en) Information pushing method, device and system, electronic equipment and storage medium
CN106815277B (en) Evaluation method and device for search engine optimization
CN107330031B (en) Data storage method and device and electronic equipment
CN116701337B (en) Log data processing method and device, electronic equipment and storage medium
CN110008243B (en) Data table processing method and device
CN110727895B (en) Sensitive word sending method and device, electronic equipment and storage medium
CN109408479B (en) Log data adding method, system, computer device and storage medium
CN111124891A (en) Access state detection method and device, storage medium and electronic device
CN112651840B (en) Business data log processing method and system based on blockchain and digital finance
KR102107919B1 (en) Matching method of rating information, device, storage medium and server
CN115617794A (en) Data analysis method, data analysis apparatus, and computer-readable storage medium
CN112765118B (en) Log query method, device, equipment and storage medium
CN111475505B (en) Data acquisition method and device
CN112131215B (en) Bottom-up database information acquisition method and device
CN114637780A (en) Data query processing method and device, computer equipment and medium
CN113868283A (en) Data testing method, device, equipment and computer storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant