CN117194159A - Log quality evaluation method and device, electronic equipment and storage medium - Google Patents

Info

Publication number
CN117194159A
Authority
CN
China
Prior art keywords
log file
field
special
data line
missing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311167477.4A
Other languages
Chinese (zh)
Inventor
邢盛骞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Youtejie Information Technology Co ltd
Original Assignee
Beijing Youtejie Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Youtejie Information Technology Co ltd
Priority to CN202311167477.4A
Publication of CN117194159A
Current legal status: Pending

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00: Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/30: Computing systems specially adapted for manufacturing

Landscapes

  • Debugging And Monitoring (AREA)

Abstract

The invention discloses a log quality evaluation method and device, electronic equipment and a storage medium, and relates to the field of log detection. The method comprises the following steps: performing general field detection on each data line in a log file through a general detection template; performing special field detection on at least one category of special data lines in the log file through at least one special detection template; and obtaining a quality evaluation result of the log file according to the general field detection result and the special field detection result of the log file. The technical scheme of the embodiment of the invention improves the efficiency of detecting missing fields in the log file and allows the detection result to reflect the overall quality of the log file completely and accurately. Because detection is based on the general detection template and the special detection templates, it is also applicable to the variable information under each field in the log file, thereby expanding the applicable range of log data detection.

Description

Log quality evaluation method and device, electronic equipment and storage medium
Technical Field
The present invention relates to the field of log detection, and in particular, to a method and apparatus for evaluating log quality, an electronic device, and a storage medium.
Background
With the continuous development of computer technology, more and more business systems are used in the daily work of various industries. The log data generated by these business systems is an important part of them, so evaluating the quality of that log data has become an important link in enterprise operation.
In the prior art, log data generated by a business system is usually evaluated in one of two ways. In the first, a small number of data samples are selected by spot check, manual inspection confirms whether the samples meet the normalization requirements, and the overall quality of the log file is predicted from the inspection result of the samples. In the second, preset keywords are searched for in the log data to judge whether the log file has problems such as missing data.
However, manual inspection is inefficient, and a small data sample cannot reflect the overall quality of the log file. Keyword queries can only detect a limited number of fixed words, and because fixed words cannot adapt to the variable information under each field in the log data, the detection effect is poor.
Disclosure of Invention
The invention provides a log quality evaluation method and device, electronic equipment and a storage medium, which are used to solve the problem of large errors in log quality evaluation results.
According to an aspect of the present invention, there is provided a log quality evaluation method including:
performing general field detection on each data line in the log file through a general detection template to determine whether a missing general field exists in the log file;
performing special field detection on at least one category of special data lines in the log file through at least one special detection template to determine whether a missing special field exists in the log file;
and acquiring a quality evaluation result of the log file according to the general field detection result and the special field detection result of the log file.
According to another aspect of the present invention, there is provided a log quality evaluation apparatus including:
a general detection execution module, used for performing general field detection on each data line in the log file through a general detection template to determine whether a missing general field exists in the log file;
a special detection execution module, used for performing special field detection on at least one category of special data lines in the log file through at least one special detection template to determine whether a missing special field exists in the log file;
and a quality evaluation result acquisition module, used for acquiring the quality evaluation result of the log file according to the general field detection result and the special field detection result of the log file.
According to another aspect of the present invention, there is provided an electronic apparatus including:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores a computer program executable by the at least one processor to enable the at least one processor to perform the log quality assessment method according to any one of the embodiments of the present invention.
According to another aspect of the present invention, there is provided a computer readable storage medium storing computer instructions for causing a processor to implement the log quality evaluation method according to any one of the embodiments of the present invention when executed.
According to the technical scheme of the embodiment of the invention, general field detection is performed on each data line in the log file through a general detection template, special field detection is performed on at least one category of special data lines in the log file through at least one special detection template, and the quality evaluation result of the log file is obtained according to the general field detection result and the special field detection result. This improves the efficiency of detecting missing fields in the log file and allows the detection result to reflect the overall quality of the log file completely and accurately. Because detection is based on the general detection template and the special detection templates, it is also applicable to the variable information under each field in the log file, which expands the applicable range of log data detection.
It should be understood that the description in this section is not intended to identify key or critical features of the embodiments of the invention or to delineate the scope of the invention. Other features of the present invention will become apparent from the description that follows.
Drawings
In order to illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings required for the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present invention, and a person skilled in the art may obtain other drawings from them without inventive effort.
FIG. 1 is a flowchart of a log quality evaluation method according to a first embodiment of the present invention;
FIG. 2 is a flowchart of a log quality evaluation method according to a second embodiment of the present invention;
FIG. 3 is a flowchart of a log quality evaluation method according to a third embodiment of the present invention;
FIG. 4 is a schematic structural diagram of a log quality evaluation device according to a fourth embodiment of the present invention;
FIG. 5 is a schematic structural diagram of an electronic device implementing the log quality evaluation method according to the embodiment of the present invention.
Detailed Description
In order that those skilled in the art may better understand the present invention, the technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some embodiments of the present invention, not all of them. All other embodiments obtained by those skilled in the art based on the embodiments of the present invention without inventive effort shall fall within the scope of the present invention.
It should be noted that the terms "first," "second," and the like in the description and the claims of the present invention and the above figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the invention described herein may be implemented in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
Example 1
FIG. 1 is a flowchart of a log quality evaluation method according to a first embodiment of the present invention. The present embodiment is applicable to detecting missing fields in different categories of data lines according to a general detection template and special detection templates. The method may be performed by a log quality evaluation device, which may be implemented in hardware and/or software and is configured in an electronic device such as a server. As shown in FIG. 1, the method includes:
S101, performing general field detection on each data line in the log file through a general detection template to determine whether a missing general field exists in the log file.
A regular expression is a text pattern matching tool that can be used to search a text for character sequences conforming to a specific pattern. A plurality of regular expressions (namely, general regular expressions) are recorded in the general detection template, and each regular expression is used for acquiring one general field in a data line. A general field is a variable field that appears in every data line of the log file and represents the basic information recorded in the data line. For example, for log information in the financial business field, the general detection template may be used for detecting and acquiring general fields such as a timestamp, a log level, a thread number, a unique transaction ID (identifier), a transaction type, and an interface name.
Log files generated by different business systems can correspond to different general detection templates, and different general detection templates comprise regular expressions for detecting different general fields. In order to distinguish the log files of different business systems, the name of the business system can be used as prefix information in the file name of the log file generated by that system. In addition, a data line is a general data line if it consists only of general fields and other fixed characters, that is, if the only variables it contains are general fields.
When each data line of a log file is detected through the general detection template, whether the data line has a missing general field is recorded. If no general field is missing, the current data line is set as a complete data line (namely, a complete general data line); if a general field is missing, the current data line is set as a missing data line (namely, a missing general data line), and the specific general fields missing from the data line are recorded. Therefore, after the data lines in the log file have been traversed, the general field detection result records the respective proportions of missing general data lines and complete general data lines among all detected data lines, and, for each general field, the proportion of data lines missing that field among all general data lines and among all detected data lines.
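The general field detection described above can be sketched as follows. This is a minimal illustration rather than the patented implementation: the field names, the regular expressions, the log line layout, and the function names `detect_missing_fields` and `general_field_result` are all assumptions made for the example.

```python
import re

# Hypothetical general detection template: field name -> regular expression.
# The patterns below assume one particular log layout and are illustrative only.
GENERAL_TEMPLATE = {
    "timestamp": re.compile(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}"),
    "log_level": re.compile(r"\b(?:DEBUG|INFO|WARN|ERROR)\b"),
    "thread_number": re.compile(r"\[thread-\d+\]"),
}


def detect_missing_fields(line, template):
    """Return the names of template fields whose pattern does not occur in the line."""
    return [name for name, pattern in template.items() if not pattern.search(line)]


def general_field_result(lines, template=GENERAL_TEMPLATE):
    """Summarize complete/missing proportions and the per-field missing proportions."""
    total = len(lines)
    missing_lines = 0
    missing_by_field = {name: 0 for name in template}
    for line in lines:
        missing = detect_missing_fields(line, template)
        if missing:
            missing_lines += 1
            for name in missing:
                missing_by_field[name] += 1
    ratio = (lambda count: count / total) if total else (lambda count: 0.0)
    return {
        "complete_ratio": ratio(total - missing_lines),
        "missing_ratio": ratio(missing_lines),
        "missing_ratio_by_field": {n: ratio(c) for n, c in missing_by_field.items()},
    }


sample = [
    "2023-09-11 10:00:00 INFO [thread-1] service started",
    "2023-09-11 10:00:01 [thread-2] level absent here",
    "ERROR [thread-3] timestamp absent here",
    "2023-09-11 10:00:03 WARN [thread-4] ok",
]
result = general_field_result(sample)
```

Two of the four sample lines are complete, so `result["complete_ratio"]` is 0.5 for this input.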
In particular, for a log file, a part of its data lines can be selected as detection samples through a specified screening rule or by random screening, so that only the data lines in the detection samples are detected and the detection result of the log file is predicted from the detection result of the samples, thereby improving the detection efficiency of the log file. Optionally, in the embodiment of the present invention, the number and the types of the general fields in the general detection template are not specifically limited.
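The random screening option can be sketched as below. The function name `select_detection_sample`, the `fraction` parameter, and the choice of at least one sampled line are assumptions for illustration; the source does not specify a sampling rate or rule.

```python
import random


def select_detection_sample(lines, fraction=0.1, seed=None):
    """Randomly pick a fraction of the data lines as the detection sample."""
    if not lines:
        return []
    rng = random.Random(seed)  # a seed makes the selection reproducible
    sample_size = max(1, round(len(lines) * fraction))
    return rng.sample(lines, sample_size)


all_lines = [f"line {i}" for i in range(100)]
detection_sample = select_detection_sample(all_lines, fraction=0.1, seed=0)
```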
S102, performing special field detection on at least one class of special data line in the log file through at least one special detection template to determine whether a special field is missing in the log file.
The special detection template also records a plurality of regular expressions (namely, special regular expressions), and each regular expression is used for acquiring one special field in a data line. A special field is a field that appears only in data lines of a specific category in the log file and represents the characteristic information recorded in such a data line. Special data lines of different categories record different data features, that is, they have different special fields, so they need to be detected by different special detection templates; one special detection template matches the special data lines of one category.
A data line is a special data line if it consists of general fields, special fields, and other fixed characters, that is, if the variables it contains include not only general fields but also special fields. Special data lines of different categories carry different special identifiers, so identifying the special identifier in a data line determines whether the data line is a special data line and, if so, which category it belongs to. If no special identifier exists in a data line, the data line is determined to be a general data line.
For example, the special data lines may include request message data lines and receipt message data lines. A request message data line records special fields such as a transactor name, a transaction account name, an interaction system name, a transaction serial number, and a transaction amount. A receipt message data line records special fields such as the transactor name, the transaction account name, the interaction system name, the transaction serial number, the transaction amount, a returned error code, and returned error information. These two categories of special data lines need to be detected by different special detection templates.
When the special data lines of a category are detected through the corresponding special detection template, whether each special data line in the category has a missing special field is recorded. If no special field is missing, the current data line is set as a complete data line (namely, a complete special data line); if a special field is missing, the current data line is set as a missing data line (namely, a missing special data line), and the specific special fields that are missing are recorded. After the special data lines of the current category in the log file have been traversed, the special field detection result of the current category is obtained. It records the respective proportions of missing special data lines and complete special data lines among the detected special data lines of the current category, the proportion of special data lines missing each field among the special data lines of the current category, and the proportion of the special data lines of the category among all detected data lines.
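The classification by special identifier and the per-category special field detection can be sketched as follows. The identifiers `REQ|` and `RSP|`, the `key=value` layout, and the field patterns are hypothetical; the source only states that each category has its own identifier and its own template.

```python
import re

# Hypothetical special detection templates, one per category of special data line.
SPECIAL_TEMPLATES = {
    "request": {
        "identifier": "REQ|",
        "fields": {
            "serial_number": re.compile(r"serial=\w+"),
            "amount": re.compile(r"amount=\d+(?:\.\d+)?"),
        },
    },
    "receipt": {
        "identifier": "RSP|",
        "fields": {
            "serial_number": re.compile(r"serial=\w+"),
            "amount": re.compile(r"amount=\d+(?:\.\d+)?"),
            "error_code": re.compile(r"code=\w+"),
        },
    },
}


def classify_line(line):
    """Return the special category whose identifier occurs in the line, else 'general'."""
    for category, template in SPECIAL_TEMPLATES.items():
        if template["identifier"] in line:
            return category
    return "general"


def special_complete_ratios(lines):
    """For each special category, return the proportion of complete special data lines."""
    stats = {}
    for line in lines:
        category = classify_line(line)
        if category == "general":
            continue  # general data lines carry no special fields
        fields = SPECIAL_TEMPLATES[category]["fields"]
        missing = [name for name, pattern in fields.items() if not pattern.search(line)]
        entry = stats.setdefault(category, {"total": 0, "complete": 0})
        entry["total"] += 1
        if not missing:
            entry["complete"] += 1
    return {c: s["complete"] / s["total"] for c, s in stats.items()}


special_sample = [
    "REQ| serial=A1 amount=100.00",
    "REQ| serial=A2",
    "RSP| serial=A1 amount=100.00 code=0000",
    "heartbeat ok",
]
special_ratios = special_complete_ratios(special_sample)
```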
S103, obtaining a quality evaluation result of the log file according to the general field detection result and the special field detection result of the log file.
After the general field detection result is obtained, the general field detection score can be taken as the product of the proportion of complete general data lines among all detected data lines and the weight corresponding to the general field detection result (namely, the general weight). Likewise, for the special field detection result of each category, the special field detection score of the category can be taken as the product of the proportion of complete special data lines of the category among all special data lines of the category and the weight corresponding to the special field detection result of the category (namely, the special weight). The sum of the general weight and all special weights is 100. The sum of the general field detection score and the special field detection scores of all categories is then taken as the quality score of the log file.
Taking the above technical scheme as an example, in the current log file, complete general data lines account for 86% of all detected data lines, and the general weight is 50; among the request message data lines, complete special data lines account for 98% of all special data lines in the category, and the special weight of the category is 25; among the receipt message data lines, complete special data lines account for 93% of all special data lines in the category, and the special weight of the category is 25. The quality score of the log file is therefore 86% × 50 + 98% × 25 + 93% × 25 = 43 + 24.5 + 23.25 = 90.75.
In addition, the quality evaluation result of the log file can be further determined according to the numerical interval in which the quality score falls. For example, a score of 90 or above is rated excellent; a score of 80 or above and below 90 is rated good; a score of 60 or above and below 80 falls in the intermediate grade; and a score below 60 is rated poor. Meanwhile, in order to show each data line in the log file to a user intuitively, missing data lines with missing general fields or missing special fields can be marked with different colors so as to distinguish complete data lines from missing data lines.
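The weighted sum in this example can be checked with a short sketch. The function name `quality_score` and the dictionary keys are illustrative; the ratios and weights are the ones given in the text.

```python
def quality_score(complete_ratios, weights):
    """Sum of (proportion of complete lines x category weight); weights must total 100."""
    if abs(sum(weights.values()) - 100) > 1e-9:
        raise ValueError("the general weight and the special weights must sum to 100")
    return sum(complete_ratios[category] * weights[category] for category in weights)


score = quality_score(
    {"general": 0.86, "request": 0.98, "receipt": 0.93},
    {"general": 50, "request": 25, "receipt": 25},
)
rounded_score = round(score, 2)  # 43 + 24.5 + 23.25
```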
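The mapping from score interval to evaluation result can be sketched as below. The thresholds follow the text; the label `"medium"` for the 60 to 80 interval is an assumption, since the translated source does not give a clear name for that grade.

```python
def quality_grade(score):
    """Map a quality score (0 to 100) to an evaluation grade."""
    if score >= 90:
        return "excellent"
    if score >= 80:
        return "good"
    if score >= 60:
        return "medium"  # assumed label for the 60-80 interval
    return "poor"


grades = [quality_grade(s) for s in (90.75, 85, 60, 59.9)]
```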
Optionally, in an embodiment of the present invention, the obtaining a quality evaluation result of the log file according to a general field detection result and a special field detection result of the log file includes: replacing the general fields and the special fields in each data line with matched first preset characters respectively; configuring different colors for the first type data lines, the second type data lines and the third type data lines respectively, and displaying the log file with the configured colors; wherein a first type data line has a missing general field and no missing special field; a second type data line has no missing general field and has a missing special field; and a third type data line has both a missing general field and a missing special field.
Specifically, the character lengths of general fields and special fields often differ because the variable contents they represent differ from data line to data line. Replacing each general field and each special field with a first preset character that has fewer characters shortens the fields, reduces the display space occupied by the data lines, and ensures that data lines of the same category have equal character lengths, so that characters are aligned among data lines of the same category. In addition, depending on whether a general field is missing and whether a special field is missing, different colors can be configured for the three types of data lines with missing fields, while complete data lines with no missing general field and no missing special field keep their original color. The four kinds of data lines in different colors clearly show the user the field-missing condition of each data line in the current log file, which improves the display effect of the field detection result.
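A minimal sketch of the field replacement and the line typing follows. The placeholder `#`, the color names, and the function names are assumptions; the line typing assumes that the third type is the case in which both a general field and a special field are missing, which is what the four-color description implies.

```python
import re

FIRST_PRESET = "#"  # hypothetical first preset character

# Hypothetical display colors for the three missing types; complete lines keep theirs.
COLORS = {"first": "yellow", "second": "orange", "third": "red", "complete": None}


def mask_fields(line, patterns, placeholder=FIRST_PRESET):
    """Replace every matched variable field with the short placeholder."""
    for pattern in patterns:
        line = pattern.sub(placeholder, line)
    return line


def line_type(missing_general, missing_special):
    """Classify a data line by which kinds of field it is missing."""
    if missing_general and missing_special:
        return "third"
    if missing_general:
        return "first"
    if missing_special:
        return "second"
    return "complete"


patterns = [
    re.compile(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}"),
    re.compile(r"\b(?:DEBUG|INFO|WARN|ERROR)\b"),
]
masked = mask_fields("2023-09-11 10:00:00 INFO started", patterns)
```

With these patterns, `masked` is `"# # started"`, so lines of the same category line up regardless of how long the original field values were.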
Optionally, in an embodiment of the present invention, after replacing the general field and the special field in each data line with the matched first preset characters, the method further includes: respectively adding second preset characters at the matched predicted occurrence positions according to the missing general field and the missing special field in each data line; and configuring different colors for the first preset character and the second preset character respectively, and displaying the log file with the color configured.
Specifically, for each category of general data lines and special data lines, the expected appearance position of each field is preset and used as the predicted occurrence position of that field in a missing data line. Alternatively, the appearance position of each field in the complete data lines of the same category can be used as the predicted occurrence position of the field in a missing data line. If the current data line is determined to have a missing field, a second preset character is added at the predicted occurrence position of the missing field; the second preset character can be equal in length to the first preset character. Different colors are configured for the first preset character and the second preset character, so that the displayed log file not only shows clearly whether each data line has a missing field, but also shows intuitively which fields are missing in each missing data line and where they would be located, which further improves the display effect of the log file.
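The idea can be illustrated with a simplified sketch that treats the predicted occurrence position of a field as its index in a preset field order for the category. The characters `#` and `?`, the space separator, and the function name `render_fields` are assumptions; a real display would also keep the fixed characters of the line.

```python
def render_fields(present_fields, expected_order, first="#", second="?"):
    """Show '#' for each present field and '?' at the slot of each missing field."""
    return " ".join(first if name in present_fields else second for name in expected_order)


expected_order = ["timestamp", "log_level", "thread_number"]
rendered_complete = render_fields({"timestamp", "log_level", "thread_number"}, expected_order)
rendered_missing = render_fields({"timestamp", "thread_number"}, expected_order)
```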
Optionally, in an embodiment of the present invention, the obtaining a quality evaluation result of the log file according to a general field detection result and a special field detection result of the log file includes: acquiring the quality evaluation score of the log file according to the number proportion of the complete data lines under each category and the category weight of each category; wherein the category weight is related to the number ratio of the data rows of the current category in all the data rows and the importance degree of the current category.
Specifically, as described in the above technical solution, the product of the proportion of complete data lines in each category and the category weight of that category may be used as the field detection score of the category, and the sum of the field detection scores of all categories may then be used as the quality score of the log file. The number of data lines in each acquired log file is often different, and the categories and numbers of special data lines also differ. Therefore, for the general data lines and the special data lines of each category, different weights can be allocated according to the number of occurrences of the data lines of that category; that is, the number of occurrences of a category of data lines is positively correlated with its weight.
Meanwhile, the information carried by data lines of different categories differs in importance. For example, request message data lines and receipt message data lines carry more important transaction information than other data lines (such as business system maintenance messages), so different weights are allocated to different categories of data lines according to their importance. That is, the weight of a category is determined by the proportion of the data lines of the category among all data lines and by the preset importance of the category. This prevents the field detection results of data lines that occur rarely or are of low importance from unduly influencing the result of the log quality evaluation, which reduces detection errors and improves the accuracy of the log quality evaluation result.
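The source states only that the category weight depends on the proportion of the category's data lines and on its preset importance; it gives no formula. One possible realization, offered purely as an assumption, multiplies the two factors and normalizes the results so that the weights sum to 100.

```python
def category_weights(line_counts, importance):
    """Assumed rule: weight proportional to (share of lines x importance), scaled to 100."""
    total_lines = sum(line_counts.values())
    if total_lines == 0:
        raise ValueError("no data lines to weight")
    raw = {c: (line_counts[c] / total_lines) * importance[c] for c in line_counts}
    norm = sum(raw.values())
    if norm == 0:
        raise ValueError("all categories have zero weight")
    return {c: 100 * value / norm for c, value in raw.items()}


weights = category_weights(
    {"general": 60, "request": 20, "receipt": 20},  # occurrences per category
    {"general": 1, "request": 2, "receipt": 2},     # hypothetical importance levels
)
```

Under this rule, a category that appears more often or is marked as more important receives a larger share of the 100 points, which matches the positive correlation described above.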
According to the technical scheme of the embodiment of the invention, general field detection is performed on each data line in the log file through a general detection template, special field detection is performed on at least one category of special data lines in the log file through at least one special detection template, and the quality evaluation result of the log file is obtained according to the general field detection result and the special field detection result. This improves the efficiency of detecting missing fields in the log file and allows the detection result to reflect the overall quality of the log file completely and accurately. Because detection is based on the general detection template and the special detection templates, it is also applicable to the variable information under each field in the log file, which expands the applicable range of log data detection.
Example 2
FIG. 2 is a flowchart of a log quality evaluation method according to a second embodiment of the present invention, which, on the basis of the above embodiment, additionally performs normative detection on the log file. As shown in FIG. 2, the method specifically includes:
S201, acquiring the number sequence corresponding to each data line; wherein the number sequence comprises the arrangement order of the general fields and/or the special fields.
For a business system, data lines of the same category in the generated log should all present normative data in the same field arrangement. Taking general data lines as an example, each general data line should contain a "timestamp" field, a "log level" field and a "thread number" field in that order. If the current general data line contains all three fields but they are not arranged in this order, the data line has no missing general field, but its field arrangement is not standard.
Specifically, taking general data lines as an example, the general fields are numbered respectively: for example, "timestamp" is number 1, "log level" is number 2, and "thread number" is number 3. If the fields in the current general data line appear in the order "log level", "timestamp", "thread number", the number sequence of the general data line is 2-1-3. In particular, for a special data line, it is possible to acquire only the arrangement order of the general fields, only the arrangement order of the special fields, or the arrangement order of the sequence formed by the general fields and the special fields together.
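Extracting the number sequence of a data line can be sketched as follows. The field numbers follow the example in the text; the regular expressions and the function name `number_sequence` are assumptions, and each field is assumed to occur at most once per line.

```python
import re

# Field numbers from the example above; the patterns are illustrative assumptions.
FIELD_NUMBERS = {"timestamp": 1, "log_level": 2, "thread_number": 3}
FIELD_PATTERNS = {
    "timestamp": re.compile(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}"),
    "log_level": re.compile(r"\b(?:DEBUG|INFO|WARN|ERROR)\b"),
    "thread_number": re.compile(r"\[thread-\d+\]"),
}


def number_sequence(line):
    """Return the field numbers in the order the fields appear, e.g. '2-1-3'."""
    found = []
    for name, pattern in FIELD_PATTERNS.items():
        match = pattern.search(line)
        if match:
            found.append((match.start(), FIELD_NUMBERS[name]))
    return "-".join(str(number) for _, number in sorted(found))


standard_sequence = number_sequence("2023-09-11 10:00:00 INFO [thread-1] started")
swapped_sequence = number_sequence("INFO 2023-09-11 10:00:00 [thread-1] started")
```

Here the first line yields `1-2-3` and the second, in which the log level precedes the timestamp, yields `2-1-3`.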
S202, judging whether the log file accords with a numbering specification rule according to the numbering sequence corresponding to each data line.
S203, if it is determined that the number sequence of each data line is the same as that of the other data lines in the same category, determining that the log file accords with the number specification rule.
If the number sequences of the data lines in a category are the same, it is obvious that the data lines in the category show the fields in the same arrangement mode, that is, the log file accords with the number specification rule.
S204, if it is determined that the number sequence of at least one first data line is different from that of other data lines in the same category, determining that the log file does not accord with the number specification rule, and marking the at least one first data line as a number non-specification data line.
If the number sequences of the data lines in one category are not all identical, the number sequence that occurs most often in the category is used as the standard number sequence, and each data line whose number sequence differs from the standard number sequence (namely, a first data line) is marked as a number non-specification data line. This realizes normative detection of the field order in the log file and expands the detection range of the log data.
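Choosing the most frequent number sequence as the standard and marking the deviating data lines can be sketched as follows. The function name `find_nonstandard` is illustrative, and the input is assumed to be the number sequences of the data lines of a single category.

```python
from collections import Counter


def find_nonstandard(sequences):
    """Return (standard sequence, indexes of lines whose sequence differs from it)."""
    if not sequences:
        return None, []
    standard, _ = Counter(sequences).most_common(1)[0]
    nonstandard = [i for i, seq in enumerate(sequences) if seq != standard]
    return standard, nonstandard


standard, nonstandard_lines = find_nonstandard(["1-2-3", "1-2-3", "2-1-3", "1-2-3"])
conforms = not nonstandard_lines  # True only if every line matches the standard
```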
Optionally, in an embodiment of the present invention, the obtaining a quality evaluation result of the log file according to a general field detection result and a special field detection result of the log file includes: acquiring character sequences corresponding to the data lines respectively; wherein the character sequence comprises character positions of general fields and/or special fields; judging whether the log file accords with a character specification rule according to character sequences corresponding to each data line respectively; if the character sequences of the data lines are the same as those of the other data lines in the same category, determining that the log file accords with a character specification rule; if the character sequence of the at least one second data line is different from that of the other data lines in the same category, determining that the log file does not accord with the character specification rule, and marking the at least one second data line as a character non-specification data line.
Specifically, in a data line of the log file, only the general fields and special fields are variable data; everything else consists of fixed characters. In a generated log file, any two adjacent variable fields (two adjacent general fields, two adjacent special fields, or an adjacent general field and special field) should be separated by the same characters. In other words, if the fixed characters of a data line are treated as a character sequence, the general fields or special fields of every data line in the same category should sit at the same character positions.
Taking a general data line as an example, the numbers of the fixed characters immediately before and after a general field are used as the character position of that field. For example, the "time stamp" is located between fixed characters 10 and 11, so its character position is "10-11"; the "log level" is located between fixed characters 25 and 26, so its character position is "25-26"; the "thread number" is located between fixed characters 36 and 37, so its character position is "36-37". For a special data line, the character positions may be acquired for the general fields only, for the special fields only, or for the sequence formed by the general fields and special fields together.
If the character positions of all data lines in a category are the same, every data line in that category presents each field at the same field position, and no data line contains redundant characters, garbled characters, or similar defects; that is, the log file complies with the character specification rule. If the character positions of the data lines in a category are not all identical, the character position that occurs most often in the category is taken as the standard character position, and each data line (i.e., a second data line) whose character position differs from the standard is marked as a character non-specification data line. This checks the normativity of the character positions in the log file and further widens the detection range for log data.
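A minimal Python sketch of the character-position check follows. It rests on assumptions that are not part of the source: variable fields are taken to be the text inside square brackets (the `FIELD` pattern), and the names `character_positions` and `find_character_irregular_lines` are hypothetical. A field's position is expressed by the count of fixed characters before it, in the "n-(n+1)" form used above.

```python
import re
from collections import Counter

# Hypothetical convention: variable fields are the text inside square brackets;
# the brackets themselves and everything else count as fixed characters.
FIELD = re.compile(r"\[([^\]]*)\]")


def character_positions(line, field_pattern=FIELD):
    """Return the fixed-character position of each variable field in a line."""
    positions = []
    variable_chars = 0  # characters consumed by variable fields so far
    for match in field_pattern.finditer(line):
        fixed_before = match.start(1) - variable_chars
        positions.append(f"{fixed_before}-{fixed_before + 1}")
        variable_chars += match.end(1) - match.start(1)
    return tuple(positions)


def find_character_irregular_lines(lines):
    """Return (standard positions, indices of lines that deviate from them)."""
    if not lines:
        return None, []
    sequences = [character_positions(line) for line in lines]
    # The most frequent position sequence in the category is the standard.
    standard, _ = Counter(sequences).most_common(1)[0]
    irregular = [i for i, seq in enumerate(sequences) if seq != standard]
    return standard, irregular


logs = [
    "[2023-09-11] level=[INFO] thread=[7]",
    "[2023-09-11] level=[WARNING] thread=[12]",
    "[2023-09-11]  level=[INFO] thread=[7]",  # stray extra space
]
standard, irregular = find_character_irregular_lines(logs)
print(standard, irregular)
```

Because positions are counted in fixed characters only, field values of different lengths (such as "INFO" and "WARNING") yield the same position, while a redundant character shifts every later field and is flagged.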
In this technical solution, after the number sequence corresponding to each data line is obtained, whether the log file complies with the number specification rule is judged from those number sequences. When the number sequence of at least one first data line differs from that of the other data lines in the same category, the log file is determined not to comply with the number specification rule, and the first data line is marked as a number non-specification data line. Beyond detecting missing fields in the log data, this also checks the normativity of the field order in the log file and widens the detection range for log data.
Example III
Fig. 3 is a flowchart of a log quality evaluation method according to a third embodiment of the present invention. On the basis of the foregoing embodiments, missing fields and dislocation fields can be displayed intuitively through a color vertical column chart. As shown in Fig. 3, the method specifically includes:
S301, detecting the universal field of each data line in the log file through a universal detection template to determine whether a missing universal field exists in the log file.
S302, performing special field detection on at least one type of special data line in the log file through at least one special detection template to determine whether a special field is missing in the log file.
S303, replacing the general field and the special field in each data line with the matched first preset characters respectively.
S304, according to different categories of the data lines, the data lines in different categories are distributed in different display areas.
Each data line in the log file is respectively configured in different display areas according to different categories, and the essence is that each data line is classified, summarized and displayed; taking the technical scheme as an example, the general data line, the request message data line and the receipt message data line are respectively summarized in different display areas.
S305, configuring a first preset color for the first preset character, and acquiring color vertical charts corresponding to the display areas respectively so as to display missing fields and dislocation fields through the color vertical charts; wherein the missing fields comprise a missing general field and a missing special field; the misalignment field includes a misalignment general field and a misalignment special field.
Taking the display area where the general data lines are located as an example: the only variable fields in a general data line are general fields, and these have been replaced with the first preset character. The variable positions of all general data lines therefore hold the same character, and the non-variable positions hold the same fixed characters, so all data lines in the current display area should have the same character length. Viewed in the vertical direction, each column of characters in the first preset color should form a complete vertical column from top to bottom. If a breakpoint appears somewhere in a vertical column and the data line at the breakpoint contains no dislocation field, the breakpoint indicates that the field is missing there.
If, apart from the vertical columns, there are scattered characters in the first preset color, these scattered characters indicate dislocation at their positions; that is, the general field at that position is not aligned with the same field in the other data lines. In that case the breakpoint in the color column on that data line is caused not by a missing general field but by a dislocated general field, and the breakpoint position is therefore determined to be a dislocation.
According to the technical scheme, after the universal field and the special field in each data line are replaced by the matched first preset characters respectively, the data lines of different types are distributed to different display areas according to different types of the data lines, the first preset color is further configured for the first preset characters, and color vertical charts corresponding to the display areas are obtained respectively, so that missing fields and misplacement fields are displayed through the color vertical charts, the missing fields and misplacement fields in the log file are intuitively displayed through the configured color vertical charts, and the display effect of the log data detection result is improved.
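The column-based reading of breakpoints and scattered characters can be approximated in code. The following Python sketch is an assumption-laden illustration, not the patented display logic: it works on lines whose variable fields have already been replaced by a hypothetical first preset character `#`, treats a column as a "vertical column" when more than half of the lines carry the mark there, and uses the hypothetical name `classify_breakpoints`.

```python
from collections import Counter

MARK = "#"  # hypothetical first preset character


def classify_breakpoints(masked_lines, mark=MARK):
    """Map row index -> (kind, breakpoint columns, scattered columns)."""
    column_counts = Counter(
        col for line in masked_lines for col, ch in enumerate(line) if ch == mark
    )
    # A column counts as a vertical column if most lines carry the mark there.
    majority = len(masked_lines) / 2
    columns = {col for col, n in column_counts.items() if n > majority}

    report = {}
    for row, line in enumerate(masked_lines):
        marked = {col for col, ch in enumerate(line) if ch == mark}
        breaks = sorted(columns - marked)     # gaps in a vertical column
        scattered = sorted(marked - columns)  # marks outside any column
        if not breaks and not scattered:
            continue
        # Scattered marks point to dislocation; a bare breakpoint to a missing field.
        kind = "dislocated" if scattered else "missing"
        report[row] = (kind, breaks, scattered)
    return report


masked = [
    "[#] [#] [#]",
    "[#] [#] [#]",
    "[#] [#] [#]",
    "[#] [#] []",    # last field missing
    "[#]  [#] [#]",  # extra space shifts the later fields
]
print(classify_breakpoints(masked))
```

In this sketch a missing field in the middle of a line would also shift the later fields; filling the predicted position with a placeholder before the comparison, as the second preset character does, avoids that ambiguity.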
Example IV
Fig. 4 is a block diagram of a log quality evaluation device according to a fourth embodiment of the present invention, where the device specifically includes:
a universal detection execution module 401, configured to perform universal field detection on each data line in the log file through a universal detection template, so as to determine whether a missing universal field exists in the log file;
a dedicated detection execution module 402, configured to perform dedicated field detection on at least one class of dedicated data line in the log file through at least one dedicated detection template, so as to determine whether a missing dedicated field exists in the log file;
and the quality evaluation result obtaining module 403 is configured to obtain a quality evaluation result of the log file according to the general field detection result and the special field detection result of the log file.
According to the technical scheme, the universal detection template is used for carrying out universal field detection on each data line in the log file, and at least one special detection template is used for carrying out special field detection on at least one type of special data line in the log file, so that the quality evaluation result of the log file is obtained according to the universal field detection result and the special field detection result of the log file, the detection efficiency of missing fields in the log file is improved, the detection result completely and accurately reflects the integral quality of the log file, and the method is suitable for detecting variable information under each field in the log file based on the detection mode of the universal detection template and the special detection template, so that the applicable range of log data detection is expanded.
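The three modules can be pictured as three small functions. The Python sketch below is hypothetical throughout: the regular expressions standing in for the general and special detection templates, the field names, the `REQUEST`/`RECEIPT` keywords used by `categorize`, and the function names are all assumptions chosen for illustration; only the split into general lines, request message lines, and receipt message lines follows the text.

```python
import re

# Hypothetical general template: fields every data line is expected to carry.
GENERAL_TEMPLATE = {
    "timestamp": re.compile(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}"),
    "level": re.compile(r"\b(?:DEBUG|INFO|WARN|ERROR)\b"),
    "thread": re.compile(r"thread-\d+"),
}

# Hypothetical special templates, keyed by data line category.
SPECIAL_TEMPLATES = {
    "request": {"request_id": re.compile(r"reqId=\w+")},
    "receipt": {"status": re.compile(r"status=\d{3}")},
}


def categorize(line):
    """Assign a category using hypothetical keywords."""
    if "REQUEST" in line:
        return "request"
    if "RECEIPT" in line:
        return "receipt"
    return "general"


def missing_fields(line, template):
    """Names of template fields that cannot be found in the line."""
    return [name for name, pattern in template.items() if not pattern.search(line)]


def evaluate(lines):
    """Run general detection on every line and special detection per category."""
    results = []
    for line in lines:
        category = categorize(line)
        results.append({
            "category": category,
            "missing_general": missing_fields(line, GENERAL_TEMPLATE),
            "missing_special": missing_fields(line, SPECIAL_TEMPLATES.get(category, {})),
        })
    return results


sample = [
    "2023-09-11 10:00:00 INFO thread-1 REQUEST reqId=abc",
    "2023-09-11 10:00:01 thread-2 RECEIPT",
    "2023-09-11 10:00:02 ERROR thread-3 disk full",
]
for result in evaluate(sample):
    print(result)
```

Keeping the general template separate from the per-category special templates means every line is checked once for the shared fields, and only lines of a given category pay for that category's extra checks.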
Optionally, the quality evaluation result obtaining module 403 is specifically configured to obtain a number sequence corresponding to each data line respectively; wherein the numbering sequence comprises the arrangement order of general fields and/or special fields; judging whether the log file accords with a numbering rule or not according to the numbering sequence corresponding to each data line; if the serial numbers of the data lines are the same as those of other data lines in the same category, determining that the log file accords with the serial number specification rule; if the serial numbers of the at least one first data line and other data lines in the same class are different, determining that the log file does not accord with the serial number specification rule, and marking the at least one first data line as the serial number non-specification data line.
Optionally, the quality evaluation result obtaining module 403 is specifically configured to obtain a character sequence corresponding to each data line respectively; wherein the character sequence comprises character positions of general fields and/or special fields; judging whether the log file accords with a character specification rule according to character sequences corresponding to each data line respectively; if the character sequences of the data lines are the same as those of the other data lines in the same category, determining that the log file accords with a character specification rule; if the character sequence of the at least one second data line is different from that of the other data lines in the same category, determining that the log file does not accord with the character specification rule, and marking the at least one second data line as a character non-specification data line.
Optionally, the quality evaluation result obtaining module 403 is specifically configured to replace a general field and a special field in each data line with a matched first preset character respectively; respectively configuring different colors for the first type data row, the second type data row and the third type data row, and displaying the log file with the configured colors; wherein the first type data line has a missing general field and no missing special field; the second type data line has no missing general field and a missing special field; the third type data line has no missing generic field and no missing dedicated field.
Optionally, the quality evaluation result obtaining module 403 is specifically further configured to add second preset characters to the matched predicted occurrence positions according to the missing general field and the missing special field in each data line; and configuring different colors for the first preset character and the second preset character respectively, and displaying the log file with the color configured.
Optionally, the quality evaluation result obtaining module 403 is specifically further configured to allocate different types of data rows to different display areas according to different types of data rows; a first preset color is configured for the first preset character, and color vertical charts corresponding to the display areas are obtained so as to display missing fields and dislocation fields through the color vertical charts; wherein the missing fields comprise a missing general field and a missing special field; the misalignment field includes a misalignment general field and a misalignment special field.
Optionally, the quality evaluation result obtaining module 403 is specifically further configured to obtain a quality evaluation score of the log file according to a number ratio of the complete data lines under each category and a category weight of each category; wherein the category weight is related to the number ratio of the data rows of the current category in all the data rows and the importance degree of the current category.
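The text states only that the category weight is related to the category's share of all data lines and to its importance; it does not give a formula in this section. The Python sketch below therefore assumes one possible choice, a weight proportional to share times importance and normalized to sum to 1, with the score expressed on a 0 to 100 scale. The function name `quality_score` and the input layout are hypothetical.

```python
def quality_score(categories):
    """Weighted share of complete data lines, on a 0-100 scale.

    categories maps a category name to a tuple
    (total_lines, complete_lines, importance). The weighting
    (share of all lines times importance, normalized) is an assumption.
    """
    total = sum(t for t, _, _ in categories.values())
    if total == 0:
        return 0.0
    raw = {
        name: (t / total) * importance
        for name, (t, _, importance) in categories.items()
        if t > 0
    }
    norm = sum(raw.values())
    if norm == 0:
        return 0.0
    return 100 * sum(
        (raw[name] / norm) * (categories[name][1] / categories[name][0])
        for name in raw
    )


# 80 general lines, all complete; 20 request lines, half complete.
score = quality_score({"general": (80, 80, 1.0), "request": (20, 10, 1.0)})
print(round(score, 2))  # 90.0
```

Raising the importance of a category makes its incomplete lines pull the score down more strongly, even when that category accounts for only a small share of the file.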
The device can execute the log quality evaluation method provided by any embodiment of the invention, and has the corresponding functional modules and beneficial effects of the execution method. Technical details which are not described in detail in this embodiment can be referred to the log quality evaluation method provided in any embodiment of the present invention.
Example five
Fig. 5 shows a schematic diagram of the structure of an electronic device 10 that may be used to implement an embodiment of the invention. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. Electronic devices may also represent various forms of mobile devices, such as personal digital assistants, cellular telephones, smartphones, wearable devices (e.g., helmets, glasses, watches, etc.), and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be exemplary only, and are not meant to limit implementations of the inventions described and/or claimed herein.
As shown in fig. 5, the electronic device 10 includes at least one processor 11, and a memory, such as a Read Only Memory (ROM) 12, a Random Access Memory (RAM) 13, etc., communicatively connected to the at least one processor 11, in which the memory stores a computer program executable by the at least one processor, and the processor 11 may perform various appropriate actions and processes according to the computer program stored in the Read Only Memory (ROM) 12 or the computer program loaded from the storage unit 18 into the Random Access Memory (RAM) 13. In the RAM 13, various programs and data required for the operation of the electronic device 10 may also be stored. The processor 11, the ROM 12 and the RAM 13 are connected to each other via a bus 14. An input/output (I/O) interface 15 is also connected to bus 14.
Various components in the electronic device 10 are connected to the I/O interface 15, including: an input unit 16 such as a keyboard, a mouse, etc.; an output unit 17 such as various types of displays, speakers, and the like; a storage unit 18 such as a magnetic disk, an optical disk, or the like; and a communication unit 19 such as a network card, modem, wireless communication transceiver, etc. The communication unit 19 allows the electronic device 10 to exchange information/data with other devices via a computer network, such as the internet, and/or various telecommunication networks.
The processor 11 may be a variety of general and/or special purpose processing components having processing and computing capabilities. Some examples of processor 11 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various specialized Artificial Intelligence (AI) computing chips, various processors running machine learning model algorithms, Digital Signal Processors (DSPs), and any suitable processor, controller, microcontroller, etc. The processor 11 performs the respective methods and processes described above, such as the log quality evaluation method.
In some embodiments, the log quality assessment method may be implemented as a computer program tangibly embodied on a computer-readable storage medium, such as a storage unit. In some embodiments, part or all of the computer program may be loaded and/or installed onto the heterogeneous hardware accelerator via the ROM and/or the communication unit. When the computer program is loaded into RAM and executed by a processor, one or more steps of the log quality assessment method described above may be performed. Alternatively, in other embodiments, the processor may be configured to perform the log quality assessment method in any other suitable manner (e.g., by means of firmware).
Various implementations of the systems and techniques described above may be implemented in digital electronic circuitry, integrated circuit systems, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), Systems On Chip (SOCs), Complex Programmable Logic Devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be a special-purpose or general-purpose programmable processor that can receive data and instructions from, and transmit data and instructions to, a storage system, at least one input device, and at least one output device.
A computer program for carrying out methods of the present invention may be written in any combination of one or more programming languages. These computer programs may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the computer programs, when executed by the processor, cause the functions/acts specified in the flowchart and/or block diagram block or blocks to be implemented. The computer program may execute entirely on the machine, partly on the machine, as a stand-alone software package, partly on the machine and partly on a remote machine or entirely on the remote machine or server.
In the context of the present invention, a computer-readable storage medium may be a tangible medium that can contain, or store a computer program for use by or in connection with an instruction execution system, apparatus, or device. The computer readable storage medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. Alternatively, the computer readable storage medium may be a machine readable signal medium. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on a heterogeneous hardware accelerator having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and pointing device (e.g., a mouse or a trackball) through which a user can provide input to the heterogeneous hardware accelerator. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic input, speech input, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: Local Area Networks (LANs), Wide Area Networks (WANs), blockchain networks, and the internet.
The computing system may include clients and servers. The client and server are typically remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server can be a cloud server, also called a cloud computing server or a cloud host, and is a host product in a cloud computing service system, so that the defects of high management difficulty and weak service expansibility in the traditional physical hosts and VPS service are overcome.
It should be appreciated that steps may be reordered, added, or deleted using the various forms of flow shown above. For example, the steps described in the present invention may be performed in parallel, sequentially, or in a different order, so long as the desired results of the technical solution of the present invention are achieved; no limitation is imposed herein.
The above embodiments do not limit the scope of the present invention. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives are possible, depending on design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present invention should be included in the scope of the present invention.

Claims (10)

1. A log quality evaluation method, comprising:
detecting all data lines in the log file by using a universal detection template to determine whether missing universal fields exist in the log file;
performing special field detection on at least one class of special data line in the log file through at least one special detection template to determine whether a missing special field exists in the log file;
and acquiring a quality evaluation result of the log file according to the general field detection result and the special field detection result of the log file.
2. The method according to claim 1, wherein the obtaining the quality evaluation result of the log file according to the general field detection result and the special field detection result of the log file includes:
acquiring a number sequence corresponding to each data line respectively; wherein the numbering sequence comprises the arrangement order of general fields and/or special fields;
judging whether the log file accords with a numbering rule or not according to the numbering sequence corresponding to each data line;
if the serial numbers of the data lines are the same as those of other data lines in the same category, determining that the log file accords with the serial number specification rule;
if the serial numbers of the at least one first data line and other data lines in the same class are different, determining that the log file does not accord with the serial number specification rule, and marking the at least one first data line as the serial number non-specification data line.
3. The method according to claim 1, wherein the obtaining the quality evaluation result of the log file according to the general field detection result and the special field detection result of the log file includes:
acquiring character sequences corresponding to the data lines respectively; wherein the character sequence comprises character positions of general fields and/or special fields;
judging whether the log file accords with a character specification rule according to character sequences corresponding to each data line respectively;
if the character sequences of the data lines are the same as those of the other data lines in the same category, determining that the log file accords with a character specification rule;
if the character sequence of the at least one second data line is different from that of the other data lines in the same category, determining that the log file does not accord with the character specification rule, and marking the at least one second data line as a character non-specification data line.
4. The method according to claim 1, wherein the obtaining the quality evaluation result of the log file according to the general field detection result and the special field detection result of the log file includes:
replacing the general field and the special field in each data line with matched first preset characters respectively;
respectively configuring different colors for the first type data row, the second type data row and the third type data row, and displaying the log file with the configured colors; wherein the first type data line has a missing general field and no missing special field; the second type data line has no missing general field and a missing special field; the third type data line has no missing generic field and no missing dedicated field.
5. The method of claim 4, further comprising, after replacing the common field and the dedicated field in each of the data lines with the matched first preset character, respectively:
respectively adding second preset characters at the matched predicted occurrence positions according to the missing general field and the missing special field in each data line;
and configuring different colors for the first preset character and the second preset character respectively, and displaying the log file with the color configured.
6. The method of claim 4, further comprising, after replacing the common field and the dedicated field in each of the data lines with the matched first preset character, respectively:
according to different categories of the data lines, distributing the data lines of different categories to different display areas;
a first preset color is configured for the first preset character, and color vertical charts corresponding to the display areas are obtained so as to display missing fields and dislocation fields through the color vertical charts; wherein the missing fields comprise a missing general field and a missing special field; the misalignment field includes a misalignment general field and a misalignment special field.
7. The method according to claim 1, wherein the obtaining the quality evaluation result of the log file according to the general field detection result and the special field detection result of the log file includes:
acquiring the quality evaluation score of the log file according to the number proportion of the complete data lines under each category and the category weight of each category; wherein the category weight is related to the number ratio of the data rows of the current category in all the data rows and the importance degree of the current category.
8. A log quality evaluation device, comprising:
the universal detection execution module is used for detecting the universal fields of each data line in the log file through the universal detection template so as to determine whether the missing universal fields exist in the log file;
the special detection execution module is used for carrying out special field detection on at least one class of special data row in the log file through at least one special detection template so as to determine whether the special field is missing in the log file;
and the quality evaluation result acquisition module is used for acquiring the quality evaluation result of the log file according to the general field detection result and the special field detection result of the log file.
9. An electronic device, the electronic device comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores a computer program executable by the at least one processor to enable the at least one processor to perform the log quality assessment method of any one of claims 1-7.
10. A computer readable storage medium storing computer instructions for causing a processor to implement the log quality assessment method of any one of claims 1-7 when executed.
CN202311167477.4A 2023-09-11 2023-09-11 Log quality evaluation method and device, electronic equipment and storage medium Pending CN117194159A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311167477.4A CN117194159A (en) 2023-09-11 2023-09-11 Log quality evaluation method and device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN117194159A true CN117194159A (en) 2023-12-08

Family

ID=88988321

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311167477.4A Pending CN117194159A (en) 2023-09-11 2023-09-11 Log quality evaluation method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN117194159A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination