CN112306787B

CN112306787B - Error log processing method and device, electronic equipment and intelligent sound box

Info

Publication number: CN112306787B
Application number: CN201910672491.7A
Authority: CN
Inventors: 张颖莹; 孙永华
Original assignee: Alibaba Group Holding Ltd
Current assignee: Alibaba Group Holding Ltd
Priority date: 2019-07-24
Filing date: 2019-07-24
Publication date: 2022-08-09
Anticipated expiration: 2039-07-24
Also published as: CN112306787A

Abstract

The embodiment of the invention provides an error log processing method, an error log processing device, electronic equipment and an intelligent sound box, wherein the method comprises the following steps: acquiring a plurality of first error logs; clustering the first error logs to obtain a plurality of clusters; determining error correction information corresponding to each of the plurality of clusters; and performing error correction processing based on the error correction information corresponding to each of the plurality of clusters. Clustering is equivalent to grouping the error logs, which makes it easier to locate the abnormal problems corresponding to them. Further, error correction information such as the error type, the error cause, and the solution corresponding to each cluster can be determined with the help of an expert or the like, so that error correction processing can be performed based on the error correction information corresponding to each cluster.

Description

Error log processing method and device, electronic equipment and intelligent sound box

Technical Field

The invention relates to the technical field of internet, in particular to an error log processing method and device, electronic equipment and an intelligent sound box.

Background

With the development and popularization of internet technology, service providers can offer a wide variety of network services to a vast number of users. While users consume these services, the service platform on the service provider side needs to perform millions of computing operations every day, so massive log records are generated, among which there may be a large number of error logs.

Some errors may originate from the user, for example, an error in an SQL statement written by the user, while others may be problems on the service platform side. Operation and maintenance personnel are often overwhelmed by the large number of error logs, and it is difficult for them to accurately locate the cause of an error and resolve the abnormal problem in time.

Disclosure of Invention

The embodiment of the invention provides an error log processing method and device, electronic equipment and an intelligent sound box, which are used for accurately locating and resolving abnormal problems.

In a first aspect, an embodiment of the present invention provides an error log processing method, where the method includes:

acquiring a plurality of first error logs;

clustering the first error logs to obtain a plurality of clusters;

determining error correction information corresponding to the plurality of clusters, respectively;

and performing error correction processing according to the error correction information respectively corresponding to the plurality of clusters.

In a second aspect, an embodiment of the present invention provides an error log processing apparatus, where the apparatus includes:

the acquisition module is used for acquiring a plurality of first error logs;

the clustering module is used for clustering the first error logs to obtain a plurality of clusters;

a determining module for determining error correction information corresponding to the plurality of clusters, respectively;

and the error correction module is used for carrying out error correction processing according to the error correction information respectively corresponding to the plurality of clusters.

In a third aspect, an embodiment of the present invention provides an electronic device, including a processor and a memory, where the memory stores executable codes, and when the executable codes are executed by the processor, the processor may implement at least the error log processing method in the first aspect.

In a fourth aspect, an embodiment of the present invention provides a non-transitory machine-readable storage medium, on which executable code is stored, and when the executable code is executed by a processor of an electronic device, the processor is enabled to implement at least the error log processing method in the first aspect.

In a fifth aspect, an embodiment of the present invention provides a smart sound box, including a processor, a memory, and a communication interface, where the memory stores executable codes thereon, and when the executable codes are executed by the processor, the processor is caused to perform the following steps:

acquiring a plurality of error logs generated by at least one controlled device controlled by the intelligent sound box through the communication interface;

clustering the error logs to obtain a plurality of clusters;

In the embodiment of the invention, a large number of error logs that have already been generated can first be collected and clustered to obtain a plurality of clusters; the error logs within each cluster have a certain similarity, while the error logs in different clusters differ markedly. Clustering is equivalent to grouping the error logs, which makes it easier to locate the abnormal problems corresponding to them. Further, error correction information such as the error type, the error cause, and the solution corresponding to each cluster can be determined with the help of an expert or the like, so that error correction processing can be performed based on the error correction information corresponding to each cluster.

Drawings

In order to illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below show some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.

Fig. 1 is a schematic diagram illustrating an application scenario of an error log processing method according to an embodiment of the present invention;

Fig. 2 is a flowchart of an error log processing method according to an embodiment of the present invention;

Fig. 3 is a flowchart of another error log processing method according to an embodiment of the present invention;

Fig. 4 is a flowchart of another error log processing method according to an embodiment of the present invention;

Fig. 5 is a schematic diagram of anomaly detection based on clustering results according to an embodiment of the present invention;

Fig. 6 is a schematic structural diagram of an error log processing apparatus according to an embodiment of the present invention;

Fig. 7 is a schematic structural diagram of an electronic device corresponding to the error log processing apparatus provided in the embodiment shown in Fig. 6;

Fig. 8 is a schematic diagram of a processing procedure of error log reporting in a home scenario according to an embodiment of the present invention.

Detailed Description

In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

The terminology used in the embodiments of the invention is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used in the examples of the present invention and the appended claims, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise, and "a plurality" typically includes at least two.

The word "if", as used herein, may be interpreted as "at the time of" or "when" or "in response to a determination" or "in response to a detection", depending on the context. Similarly, the phrases "if determined" or "if detected (a stated condition or event)" may be interpreted as "when determined" or "in response to a determination" or "when detected (a stated condition or event)" or "in response to a detection (a stated condition or event)", depending on the context.

It is also noted that the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a product or system that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such product or system. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in the product or system that includes the element.

In addition, the sequence of steps in the embodiments of the methods described below is merely an example, and is not strictly limited.

The main idea of the error log processing method provided by the embodiment of the invention is explained first.

Assume that a service provider provides a service to a large number of users, who may use the service through web access or by downloading and using a client (APP). A large number of logs are then generated while the users use the service. When an abnormal problem occurs during use, the correspondingly generated log may be called an error log.

For example, the users illustrated in Fig. 1 (user 1, user 2, user 3, and user 4) may generate various error logs while using the APP, and the APP may upload these error logs to the server. In addition, the server itself generates a large number of logs while running the service; when an abnormal phenomenon occurs there, the correspondingly generated logs may also be called error logs. Thus, as shown in Fig. 1, the server may store a large number of error logs from the user side as well as a number of error logs from the server itself.

In order to analyze the causes of the error logs and resolve the abnormal problems, a large number of already generated error logs can first be collected and subjected to cluster analysis so that error logs with the same error cause are grouped into one class. Experts can then annotate the error cause and the solution for each class, so that error correction responses to the error logs can be made according to the annotated error causes and solutions.

In Fig. 1, assume that the error logs already generated are error log 1 to error log 1000, and that cluster analysis of these error logs yields three clusters: C1 (error log 1 to error log 200), C2 (error log 201 to error log 700), and C3 (error log 701 to error log 1000). Assume further that the error causes and solutions annotated for these three clusters are as follows:

the error cause corresponding to C1 is a user-side cause, specifically cause R1, and the solution is S1;

the error cause corresponding to C2 is a platform-side cause, specifically cause R2, and the solution is S2;

the error cause corresponding to C3 is a platform-side cause, specifically cause R3, and the solution is S3.

Here, a platform-side cause means a server-side cause.

After the three clusters and their annotated error causes and solutions are obtained, the server continues to receive user access, so a large number of logs keep being generated in a streaming manner. When a newly generated error log is subsequently acquired in real time, it can be clustered against the clusters that already exist; of course, some newly generated error logs may give rise to a new cluster or form an outlier (i.e., an isolated point that does not belong to any cluster). For example, as illustrated in Fig. 1, assume that error logs 1001 to 1100 are generated after error log 1000, where error logs 1001 to 1040 come from user 1 and error logs 1041 to 1100 come from user 3. After clustering, error logs 1001 to 1070 fall into cluster C1, error logs 1071 to 1099 fall into cluster C3, and error log 1100 is an isolated outlier. Since the three clusters have already been annotated, the isolated outlier can then be annotated by an expert, for example with error cause R4 and solution S4.

Based on the annotation results, the error correction response to an error log may be, for example, informing the user corresponding to the error log of why the error log was generated and how to resolve the problem, or discovering in time an exception that exists on the platform side and executing the corresponding solution so that the platform heals itself.

The core idea of the error log processing method provided by the embodiment of the present invention is summarized above. The execution process of the method is described in detail below with reference to the following embodiments.

Fig. 2 is a flowchart of an error log processing method according to an embodiment of the present invention, where the error log processing method according to the embodiment of the present invention may be executed by an electronic device on a service provider side, such as a server. As shown in fig. 2, the method comprises the steps of:

201. A plurality of first error logs are acquired.

202. The first error logs are clustered to obtain a plurality of clusters.

203. Error correction information corresponding to each of the plurality of clusters is determined.

204. Error correction processing is performed based on the error correction information corresponding to each of the plurality of clusters.

As described above, the first error logs may be error logs generated while a large number of users use the service provided by the service provider.

In practical applications, assuming that the time at which the error log processing method provided by the embodiment of the present invention is initially executed is denoted as T1, a large number of error logs generated during a period of time before T1 can be collected as the first error logs. This period of time may be set to, for example, 3 days or one week.

Of course, since error logs are generated continuously, a certain time interval may also be set, for example one day, so that the error logs generated on each day are treated in turn as the first error logs. In that case, clustering the error logs generated on a later day needs to build on the clustering result that has already been generated.

A user generates a large number of logs while using the service provided by the service provider, including both error logs and non-error logs. Usually an error log contains a field indicating that it is an error log, such as the word "error", so the error logs can be extracted according to this special field.
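As an illustrative sketch only, the extraction of error logs by a marker field could look like the following. It assumes each log is a plain text line and that the marker is the word "error" matched case-insensitively; the function name `extract_error_logs` and the sample lines are hypothetical and not taken from the embodiment.

```python
import re

# Assumption: a log counts as an error log if it contains the word "error".
ERROR_MARKER = re.compile(r"\berror\b", re.IGNORECASE)


def extract_error_logs(logs):
    """Return only the log lines that carry the error marker field."""
    return [line for line in logs if ERROR_MARKER.search(line)]


# Hypothetical sample lines, for illustration only.
sample_logs = [
    "INFO job 17 started",
    'ERROR table "orders" does not exist',
    "INFO job 17 finished",
    "Error: permission denied",
]
error_logs = extract_error_logs(sample_logs)
```

In a real deployment the marker would more likely be a structured level field than a free-text match; the regular expression here only stands in for that field.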

In an optional embodiment, some preprocessing steps may be performed before the first error logs are clustered; these steps help to ensure the quality of the clustering result, improve clustering efficiency, and the like.

Optionally, the preprocessing steps may include, for example, performing stemming on the first error logs. Stemming is the process of removing affixes to obtain the root of a word; put simply, it reduces each word to its most common written form.

Optionally, the preprocessing steps may further include, for example, eliminating redundant entities contained in the first error logs, where the entities are application names, table names, machine names, cluster names and the like. Entity elimination rules can be formulated in advance according to the distinguishing characteristics of the different entities, so that redundant entities are eliminated based on these rules. For example, assuming that a table name is located within double quotation marks and an application name is located within book-title marks (《》), the corresponding entities can be identified by these specific symbols and then eliminated.

These entities can be eliminated because, in some practical applications, the cause of an error does not need to be located at a very fine granularity; it is enough to locate the rough root cause or the type of cause, for example, that a table does not exist.

It can be understood that, when two of the first error logs differ only in their redundant entities, they become identical after the redundant entities are eliminated, so the two error logs can be merged into one. Stemming and redundant entity elimination therefore reduce the total number of error logs and also reduce the storage space they occupy.
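The entity elimination and merging described above can be sketched as follows. This is a minimal illustration under stated assumptions: table names appear inside double quotation marks and application names inside book-title marks, as in the example in the text; each entity is replaced by a placeholder rather than deleted outright, which is a choice of this sketch; stemming is omitted. The names `ELIMINATION_RULES`, `eliminate_entities` and `preprocess` are hypothetical.

```python
import re

# Assumed rules: table names sit inside double quotes, application names
# inside book-title marks. The placeholders are a choice of this sketch.
ELIMINATION_RULES = [
    (re.compile(r'"[^"]*"'), "<TABLE>"),
    (re.compile(r"《[^》]*》"), "<APP>"),
]


def eliminate_entities(log):
    """Apply every elimination rule to one error log."""
    for pattern, placeholder in ELIMINATION_RULES:
        log = pattern.sub(placeholder, log)
    return log


def preprocess(logs):
    """Eliminate entities, then merge logs that became identical.

    dict.fromkeys keeps the first occurrence and preserves order.
    """
    return list(dict.fromkeys(eliminate_entities(log) for log in logs))


# Hypothetical sample logs, for illustration only.
raw_logs = [
    'error: table "orders" does not exist in 《shop》',
    'error: table "users" does not exist in 《blog》',
    "error: disk quota exceeded",
]
merged_logs = preprocess(raw_logs)
```

Here the first two logs differ only in their table and application names, so they collapse into a single log after elimination.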

In an optional embodiment, clustering the first error logs to obtain a plurality of clusters may be implemented as:

respectively carrying out word vector coding on the first error logs;

determining semantic feature matrixes respectively corresponding to the first error logs according to the word vectors respectively corresponding to the first error logs;

and clustering the plurality of first error logs by taking the semantic feature matrices respectively corresponding to the plurality of first error logs as samples.

Taking any error log i among the first error logs as an example, the encoding may proceed as follows: word segmentation is performed on error log i, each resulting word is encoded as a word vector, the word vectors are input in sequence into a recurrent neural network model, and the model outputs the semantic feature matrix corresponding to error log i. Specifically, the semantic feature matrix corresponding to error log i can be determined from the final hidden-layer state of the recurrent neural network model. Of course, a long short-term memory (LSTM) network model or a bidirectional LSTM network model may be used instead of the recurrent neural network model. The resulting semantic feature matrix contains the semantic information of error log i.

In another optional embodiment, clustering the first error logs to obtain a plurality of clusters may be implemented as:

obtaining keywords contained in each of the first error logs;

carrying out word vector coding on keywords contained in each of the plurality of first error logs;

Still taking any error log i among the first error logs as an example, in this alternative implementation not all words contained in error log i are encoded as word vectors; only the keywords it contains are encoded.

The keyword extraction process and the word vector encoding process are described below, taking any error log i as an example.

The keyword extraction may be implemented as follows: word segmentation is performed on error log i, and the words contained in error log i whose word frequency meets a set condition are determined as the keywords of error log i. The set condition is, for example, that the word ranks among the first N words of the vocabulary when words are sorted by frequency in descending order, that is, the word belongs to topN.

Assuming that the first error logs are previously generated error logs collected when the error log processing method provided by the embodiment of the present invention is initially executed, the vocabulary may be obtained as follows: the words contained in the first error logs are acquired, and a vocabulary is generated according to the word frequency of each word, where the vocabulary consists of a preset number N of words with the highest word frequencies. That is, assuming that the first error logs contain 3000 distinct words in total, the word frequency of each word may be expressed as the total number of times the word appears in the first error logs, and assuming N is 1000, the 1000 words with the highest word frequencies are selected from the 3000 words to constitute the vocabulary. On this basis, the keywords of error log i are the words of error log i that are in the vocabulary; for example, if error log i contains 25 words and 15 of them are in the vocabulary, those 15 words are the keywords of error log i.
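The vocabulary construction and keyword extraction above can be sketched as follows. This is an illustration only: whitespace splitting stands in for the word segmentation step, and the names `build_vocabulary` and `extract_keywords` as well as the sample logs are hypothetical.

```python
from collections import Counter


def build_vocabulary(logs, n):
    """Keep the n words with the highest total frequency across all logs."""
    # Assumption: whitespace splitting is an adequate word segmentation.
    counts = Counter(word for log in logs for word in log.split())
    return {word for word, _ in counts.most_common(n)}


def extract_keywords(log, vocabulary):
    """Keywords of a log are its words that also appear in the vocabulary."""
    return [word for word in log.split() if word in vocabulary]


# Hypothetical sample logs, for illustration only.
first_error_logs = [
    "table not found",
    "table not found",
    "permission denied on table",
    "disk full",
]
vocabulary = build_vocabulary(first_error_logs, n=3)
keywords = extract_keywords("permission denied on table", vocabulary)
```

With N = 3 the vocabulary holds "table", "not" and "found", so the only keyword of the third log is "table".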

The encoding process of the word vector of the error log i can be implemented as follows:

First, the number of co-occurrences of any two words of the vocabulary in the first error logs is obtained, where a co-occurrence means that the two words appear within a preset window length (e.g., 3 or 5) of each other. A co-occurrence matrix is then generated from these counts, and word vectors are trained from the co-occurrence matrix using the GloVe model, thereby obtaining a trained GloVe model for performing word vector encoding on the plurality of first error logs.

That is to say, in this embodiment, a GloVe model may be used to implement an encoding process for the word vector of the error log i. The training process and the using process of the GloVe model can be realized by referring to the prior related art, and are not described herein in detail.
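The co-occurrence counting that feeds the GloVe training can be sketched as follows. This is a simplified illustration: it counts a pair once whenever the second word lies within `window` tokens to the right of the first, stores counts symmetrically under a sorted key, and omits the distance weighting that GloVe implementations commonly apply. The GloVe training itself is not shown. The name `cooccurrence_counts` and the sample data are hypothetical.

```python
from collections import Counter


def cooccurrence_counts(tokenized_logs, vocabulary, window=3):
    """Count vocabulary word pairs that appear within `window` tokens."""
    counts = Counter()
    for tokens in tokenized_logs:
        for i, left in enumerate(tokens):
            # Look at the next `window` tokens to the right of `left`.
            for right in tokens[i + 1 : i + 1 + window]:
                if left in vocabulary and right in vocabulary and left != right:
                    # Sorted key makes the count symmetric in the two words.
                    counts[tuple(sorted((left, right)))] += 1
    return counts


# Hypothetical sample data, for illustration only.
tokenized = [["table", "not", "found"], ["table", "not", "exist"]]
vocab = {"table", "not", "found", "exist"}
pair_counts = cooccurrence_counts(tokenized, vocab, window=1)
```

The resulting counter can be read as a sparse co-occurrence matrix whose rows and columns are vocabulary words.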

Through the word vector encoding, assuming that error log i has 15 keywords, 15 word vectors are obtained. Optionally, the mean of the 15 word vectors may be calculated, and the resulting mean is used as the semantic feature matrix corresponding to error log i.

After this processing, the plurality of first error logs are represented as a plurality of semantic feature matrices, and the plurality of first error logs are then clustered by taking their corresponding semantic feature matrices as samples.
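The mean calculation can be sketched as follows. The two-dimensional vectors below are toy values chosen for illustration, not outputs of a trained GloVe model, and the name `mean_vector` is hypothetical.

```python
def mean_vector(word_vectors):
    """Element-wise mean of equal-length word vectors."""
    count = len(word_vectors)
    # zip(*...) walks the vectors dimension by dimension.
    return [sum(column) / count for column in zip(*word_vectors)]


# Toy keyword vectors, for illustration only.
keyword_vectors = {
    "table": [1.0, 0.0],
    "not": [0.0, 1.0],
    "found": [2.0, 2.0],
}
log_feature = mean_vector(list(keyword_vectors.values()))
```

Averaging yields one fixed-length sample per error log regardless of how many keywords the log contains, which is what the clustering step needs.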

Optionally, the clustering may be performed using a density-based clustering algorithm, for example Density-Based Spatial Clustering of Applications with Noise (DBSCAN for short). Of course, other clustering methods such as k-means may also be employed.

In many practical application scenarios, it is not possible to know in advance exactly which causes produce the abnormal problems, that is, how many clusters the error logs should be grouped into. DBSCAN does not require the number of clusters to be specified in advance, so it can be used for the clustering.

DBSCAN characterizes how tightly a sample set is distributed in terms of neighborhoods, using the parameters (ε, MinPts). Here ε is the neighborhood distance threshold of a sample, and MinPts is the threshold on the number of samples within the ε-neighborhood of a sample. In this embodiment, since the semantic feature matrices corresponding to the error logs are used as samples, the difference between two semantic feature matrices can be calculated as the distance between the two corresponding samples. For the detailed implementation of DBSCAN, reference may be made to the related art, which is not described herein.
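For orientation, a compact sketch of DBSCAN over fixed-length feature vectors is given below. It assumes Euclidean distance as the measure of difference between two samples, counts a sample as part of its own ε-neighborhood, and uses a brute-force neighbor search; the embodiment does not prescribe any of these choices. The names `dbscan` and `NOISE` and the sample points are hypothetical. `math.dist` requires Python 3.8 or later.

```python
import math

NOISE = -1


def dbscan(samples, eps, min_pts):
    """Label each sample with a cluster id (0, 1, ...) or NOISE."""

    def neighbors(i):
        # Brute-force search; the sample itself is included in the result.
        return [j for j, other in enumerate(samples)
                if math.dist(samples[i], other) <= eps]

    labels = [None] * len(samples)
    cluster_id = -1
    for i in range(len(samples)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        cluster_id += 1
        labels[i] = cluster_id
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster_id  # border point: join, do not expand
            if labels[j] is not None:
                continue
            labels[j] = cluster_id
            j_neighbors = neighbors(j)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is a core point: keep expanding
    return labels


# Hypothetical 2-D samples: two dense groups and one isolated point.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
cluster_labels = dbscan(points, eps=1.5, min_pts=3)
```

The isolated point receives the `NOISE` label, which corresponds to an outlier in the sense used in this description.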

By clustering the first error logs based on semantic features, some error logs with the same error reporting reason and different expression modes can be merged into one type, namely clustered into one cluster.

In addition, it should be noted that the parameters epsilon and MinPts may be adjusted step by step to obtain a satisfactory clustering result, and the adjusted parameters may be applied to the clustering process of the error logs generated subsequently. Moreover, when the word vector encoding method is adopted for the keywords included in the error log, the value of N in topN may also be adjusted step by step.

Specifically, assuming that ε and MinPts are initially set small and N is also set small, the final clustering result is likely to be as follows: the number of clusters obtained is large and the samples within a cluster differ considerably, so a good clustering effect is not achieved.

It should be noted that after the first error logs are clustered based on the DBSCAN, a plurality of clusters may be obtained, and one or more outliers may also be obtained. The outliers are simply isolated points that do not belong to any cluster, and correspond to some unusual error reporting problem that may occur sporadically in practical applications.

After the clustering result is obtained, error correction information can be annotated for each generated clustering result with the help of experts; if the clustering result is a plurality of clusters, error correction information is annotated for each of the clusters.

Specifically, the clusters can be sent to an expert system, and on the expert system side, corresponding experts can label error correction information for each cluster, and then the expert system feeds back a labeling result to the server. As shown in fig. 1, the error correction information may include error reporting reasons and solutions, among others.

The error causes can be divided into two categories: user-side causes and platform-side causes. Optionally, the annotation may indicate only which of these two categories the error cause belongs to. Of course, the specific cause may optionally be annotated as well. For example, for a user-side cause, specific causes such as an SQL statement error or a nonexistent table can be annotated; for a platform-side cause, specific causes such as too many stored expired files can be annotated.

The solution may be some prompt information or a calling interface of an executable script file. For example, when a certain error report problem can be automatically solved, the operation and maintenance personnel can write a corresponding script file, so that the corresponding problem can be automatically solved by running the script file. When the server cannot provide a scheme for automatically solving the error reporting problem, certain prompt information can be provided, for example, the user is informed that the currently generated error reporting belongs to the platform side problem, the error reporting problem is submitted to operation and maintenance personnel for processing, and the information of the operation and maintenance personnel can be provided for the user; for another example, the user is informed that the error report problem generated currently is caused by misoperation at the user side, and some solution suggestions are given.

After the error correction information corresponding to each of the plurality of clusters is obtained, error correction processing may be performed based on it. In practical applications, if the first error logs are a large number of error logs generated over a long period (for example, the past week) and collected when the error log processing method provided by the embodiment of the present invention is initially executed, those error logs may already have been responded to in a conventional manner for the sake of user experience (that is, a user should receive a timely response after an error log is generated on the user side). In that case, the initial clustering result and error correction information obtained at this point can be used for error correction processing of error logs subsequently generated in real time. Of course, if the first error logs were generated within a short time, such as on the current day, error correction processing may be performed directly on the first error logs based on the error correction information corresponding to each cluster, for example by sending a notification message to the corresponding user to inform the user of the error cause and the solution, or by directly executing the solution corresponding to each cluster.
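One possible way to represent the annotated error correction information and act on it is sketched below. This is an illustration under assumptions: the class `ErrorCorrectionInfo`, its field names, and the function `correct` are hypothetical, and a Python callable stands in for the calling interface of an executable script file. The labels C1, C2, R1, R2, S1 and S2 follow the example of Fig. 1.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ErrorCorrectionInfo:
    side: str    # "user" or "platform"
    cause: str   # specific cause annotated by the expert
    prompt: str  # prompt information shown when nothing can be run
    # Stand-in for the calling interface of an executable script file.
    script: Optional[Callable[[], str]] = None


def correct(cluster_id, annotations):
    """Run the cluster's script if one exists, otherwise return the prompt."""
    info = annotations[cluster_id]
    if info.script is not None:
        return info.script()
    return f"[{info.side}-side] {info.cause}: {info.prompt}"


# Hypothetical annotations mirroring clusters C1 and C2 of Fig. 1.
annotations = {
    "C1": ErrorCorrectionInfo("user", "R1", "S1"),
    "C2": ErrorCorrectionInfo("platform", "R2", "S2", script=lambda: "ran S2"),
}
user_message = correct("C1", annotations)
self_heal_result = correct("C2", annotations)
```

Keeping the prompt and the optional script in one record lets the same lookup serve both the notification case and the self-healing case.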

Fig. 3 is a flowchart of another error log processing method according to an embodiment of the present invention, as shown in fig. 3, the method includes the following steps:

301. A plurality of first error logs are acquired.

302. The first error logs are clustered to obtain a plurality of clusters.

In this embodiment, the process of clustering the plurality of first error logs may refer to the description in the foregoing embodiment, which is not described herein again.

303. For any one of the plurality of clusters, a similarity distance between the different error logs contained in that cluster is determined.

304. Multiple groups of error logs whose similarity distances meet a set condition are determined, and an entity elimination rule is determined according to the multiple groups of error logs.

305. Redundant entities contained in the first error logs are removed according to the entity elimination rule.

In the foregoing embodiment, it is mentioned that before the first error logs are clustered, they may be preprocessed by stemming, redundant entity elimination, and the like. The elimination rules described there are usually based on special identifiers that mark different entities, such as double quotation marks and book-title marks. However, in some practical application scenarios, the redundant entities that can be removed based on these special identifiers are relatively limited, so this embodiment provides a supplementary means of removing redundant entities.

Specifically, after a plurality of first error logs are clustered to obtain a plurality of clusters, taking any one of the clusters Ci as an example, the similarity distance between different error logs included in the cluster Ci can be determined, if the similarity distance between two or more error logs is smaller than a set threshold, the two or more error logs are taken as a group, and a corresponding entity elimination rule is determined according to the group of error logs.

The similarity distance may be calculated by using a Levenshtein algorithm, for example, and therefore may also be referred to as a Levenshtein distance or an edit distance.

The Levenshtein distance is the minimum number of single-character edit operations (insertions, deletions, and substitutions) needed to transform one string into another. For example, to convert eeba into abac, the first e may be deleted to give eba, the remaining e may be replaced with a to give aba, and c may be inserted at the end to give abac, so the Levenshtein distance between eeba and abac is 3.
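As an illustration only, the edit distance described above can be computed with the standard dynamic-programming recurrence. The following is a minimal sketch; the function name levenshtein is chosen here for illustration and is not part of the embodiment.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions and
    substitutions needed to turn string a into string b."""
    # prev[j] is the distance between the already processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into the empty string costs i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute, or keep on a match
        prev = curr
    return prev[-1]


print(levenshtein("eeba", "abac"))  # 3, the value given in the example above
```

Only two rows of the distance table are kept at a time, so memory use grows with the length of the second string rather than with the product of both lengths.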

Therefore, for any two error logs, the similarity distance describes the cost of converting one error log into the other. If the similarity distance between two error logs is small, the difference between them is mainly a difference in redundant entities, so elimination rules for redundant entities can be mined from error logs with small similarity distances. In practice, the server may output multiple groups of error logs whose similarity distances are smaller than a set threshold, so that operation and maintenance staff can determine the entity elimination rule based on the differences exhibited by these groups. The server may then use the supplementary entity elimination rule to perform entity elimination on the first error logs, and the rule may also be used to perform entity elimination on subsequently generated error logs.
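One possible way of outputting such groups is sketched below, under stated assumptions: the logs of one cluster are compared pairwise, pairs whose edit distance is smaller than a threshold are kept, and the differing tokens of each pair are listed for the operation and maintenance staff to inspect. The sample log texts, the helper names edit_distance and candidate_entities, and the whitespace tokenization are all hypothetical.

```python
from difflib import SequenceMatcher
from itertools import combinations


def edit_distance(a, b):
    """Levenshtein distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        curr = [i]
        for j, y in enumerate(b, 1):
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + (x != y)))
        prev = curr
    return prev[-1]


def candidate_entities(cluster_logs, max_distance):
    """Yield (log_a, log_b, diffs) for each pair of logs in one cluster whose
    edit distance is smaller than max_distance; diffs lists the token runs
    that differ, which are candidates for redundant entities."""
    for a, b in combinations(cluster_logs, 2):
        if edit_distance(a, b) < max_distance:
            ta, tb = a.split(), b.split()
            matcher = SequenceMatcher(None, ta, tb, autojunk=False)
            diffs = [(ta[i1:i2], tb[j1:j2])
                     for tag, i1, i2, j1, j2 in matcher.get_opcodes()
                     if tag != "equal"]
            yield a, b, diffs


# Hypothetical logs assumed to belong to the same cluster.
logs = [
    "table ods_user_1 not found in project p1",
    "table ods_user_2 not found in project p1",
    "quota exceeded for instance 20190724",
]
for a, b, diffs in candidate_entities(logs, max_distance=3):
    print(diffs)  # [(['ods_user_1'], ['ods_user_2'])]
```

In this example the only differing token is the table name, which suggests a rule such as "strip the token following the word table"; deriving the rule itself is left to the staff, as in the embodiment.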

It can be understood that, at this time, performing entity elimination processing on the plurality of first error-reporting logs does not affect the clustering result that has been obtained so far, but can further reduce the occupation of the storage space by the plurality of first error-reporting logs.

In practical application, error logs are generated continuously as the service provided by the server keeps running. As assumed in the foregoing embodiment, the plurality of first error logs are a large number of previously generated error logs that have been collected by the time the error log processing method provided by the embodiment of the present invention is first executed. After the initial clustering result, that is, the plurality of clusters and the error correction information corresponding to each cluster, is obtained based on the plurality of first error logs, error logs subsequently generated in real time can be processed with reference to the embodiment shown in fig. 4.

Fig. 4 is a flowchart of another error log processing method according to an embodiment of the present invention. As shown in fig. 4, the method includes the following steps:

401. A plurality of first error logs are obtained.

402. The first error logs are clustered to obtain a plurality of clusters.

403. Error correction information corresponding to each of the plurality of clusters is determined.

The execution of the foregoing steps can refer to the description in the foregoing embodiments, and is not described herein.

404. At least one group of second error logs is acquired, wherein the at least one group of second error logs corresponds, in sequence, to the error logs generated in different time periods after the plurality of first error logs.

405. The at least one group of second error logs is clustered group by group on the basis of the obtained plurality of clusters so as to update the plurality of clusters, and error correction processing is performed on the at least one group of second error logs according to the error correction information corresponding to each updated cluster.

Assume that the first error logs are the error logs collected within a period of time, such as 10 days, before time T1. If a large number of error logs are generated within a period after time T1, such as T2-T3, these error logs are treated as one group of second error logs; the error logs generated within T3-T4 are treated as the next group of second error logs, and so on. The length of the time period T2-T3 may be preset, and may be shorter than the collection period of the first error logs, such as one day or less.

Taking the group of second error logs generated in T2-T3 as an example, before this group is clustered, preprocessing steps such as stem extraction and redundant entity elimination may still be performed on it, and the entity elimination may incorporate the entity elimination rule obtained in the embodiment shown in fig. 3. The DBSCAN clustering method may still be used to cluster the group of second error logs. Assume that the clusters obtained so far are C1, C2, and C3. When the group of second error logs is clustered, the semantic feature matrix corresponding to each second error log is used as a newly added sample, so that part of the group may be clustered into one or more of C1, C2, and C3, and new outliers and new clusters, such as C4, may also be generated. When new clusters and outliers are generated, they can be labeled with the help of experts. The group of second error logs may then be subjected to error correction processing based on the current clustering result and the labeled error correction information.

The next group of second error logs, generated in T3-T4, is then clustered and corrected through the same process, and so on.
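The effect of adding the second error logs as new samples can be sketched as follows. This is a simplified illustration and not the implementation of the embodiment: a small pure-Python DBSCAN is simply re-run over the old and new samples together, two-dimensional points stand in for the semantic features of the logs, and the values of eps and min_pts are hypothetical.

```python
import math


def dbscan(points, eps, min_pts):
    """Tiny DBSCAN: returns one label per point, with -1 marking outliers."""
    labels = [None] * len(points)

    def neighbours(i):
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = -1  # provisional outlier; may later become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # former outlier joins as a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nj = neighbours(j)
            if len(nj) >= min_pts:  # j is a core point, so keep expanding
                queue.extend(nj)
    return labels


# Stand-ins for the features of the first error logs (two existing clusters).
first = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
# Stand-ins for one group of second error logs.
second = [(0.5, 0.5), (20, 20), (20, 21), (21, 20), (50, 50)]

labels = dbscan(first + second, eps=1.5, min_pts=3)
print(labels)  # [0, 0, 0, 1, 1, 1, 0, 2, 2, 2, -1]
```

In the output, the first new sample is absorbed into an existing cluster, the next three form a new cluster, and the last one is a new outlier; the new cluster and the outlier are the items that would be handed to experts for labeling.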

In terms of a time axis, clustering of the first error logs is called offline clustering, clustering of each group of second error logs generated in real time at intervals of a short period is called online clustering, the offline clustering is combined with the online clustering, and the online clustering is based on results of the offline clustering, so that the stability and the real-time performance of clustering results can be improved.

The following still takes the second error logs generated in T2-T3 as an example to describe the error correction process of the second error logs.

In an optional embodiment, for the group of second error logs, if any one of the currently obtained clusters includes part of the second error logs, the error correction information corresponding to that cluster may be sent to the corresponding users according to the user identifier corresponding to each of those second error logs. For example, assuming that the currently obtained clusters are C1, C2, and C3, and that the logs a to f in the group of second error logs are clustered into the cluster C1, the error correction information corresponding to the cluster C1 may be sent to the corresponding users according to the user identifier corresponding to each of the logs a to f. The users can thus learn the cause of the error and its solution and act accordingly.

In another optional embodiment, for the group of second error logs, if a cluster includes part of the second error logs, the error cause corresponding to that cluster is a platform-side cause, and the total number of error logs included in that cluster is greater than or equal to a set threshold, then the solution corresponding to that cluster is executed. For example, assume that the currently obtained clusters are C1, C2, and C3, that the logs a to f in the group of second error logs are clustered into the cluster C1, that the error cause labeled for the cluster C1 is a platform-side cause with a corresponding labeled solution (such as an interface for invoking a script file), and that the number of error logs currently contained in the cluster C1 is greater than the set threshold. The solution may then be executed directly, thereby implementing self-healing of the error. In this case, notification information such as the error cause corresponding to the cluster C1 may also be sent to the corresponding users according to the user identifier corresponding to each of the logs a to f, so that the users know what the error cause is and that the server has handled the error.
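The two optional strategies above (notifying users, and self-healing when a platform-side cluster reaches the threshold) can be combined as in the following sketch. The Cluster fields, the fix hook, the notify callback, and the sample values are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Cluster:
    cause: str                                 # labeled error cause
    solution_text: str                         # labeled solution, as text
    platform_side: bool                        # True if the cause lies on the platform side
    fix: Optional[Callable[[], None]] = None   # hypothetical hook, e.g. wrapping a script call
    log_count: int = 0                         # total number of logs currently in the cluster


def correct(cluster, new_log_user_ids, threshold, notify):
    """Handle the second error logs that were clustered into `cluster`.

    Runs the labeled solution if the cause is platform-side and the cluster
    holds at least `threshold` logs, then notifies each affected user once.
    """
    cluster.log_count += len(new_log_user_ids)
    healed = False
    if cluster.platform_side and cluster.fix and cluster.log_count >= threshold:
        cluster.fix()
        healed = True
    for uid in sorted(set(new_log_user_ids)):
        notify(uid, cluster.cause, cluster.solution_text, healed)
    return healed


sent, ran = [], []
c1 = Cluster("disk quota exhausted", "oldest data is purged by the platform",
             platform_side=True, fix=lambda: ran.append("purge"), log_count=8)
healed = correct(c1, ["u1", "u2", "u1"], threshold=10,
                 notify=lambda *args: sent.append(args))
print(healed, ran, [s[0] for s in sent])
```

Here the cluster already held 8 logs and receives 3 more, so the count of 11 reaches the threshold of 10, the fix hook runs once, and the two distinct users are each notified once.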

In another optional embodiment, the at least one group of second error logs obtained in at least one continuous time period is considered as a whole. For any cluster whose error cause is a platform-side cause, the number of error logs that each group of second error logs contributes to that cluster is determined, and if the change in this number exhibits a surge feature, the solution corresponding to that cluster is executed.

Detection of platform-side anomalies can be realized based on this embodiment. To facilitate understanding of the anomaly detection scheme, it is described with reference to fig. 5.

Assume that three sets of second error logs have been obtained, corresponding to the set of second error logs generated in T2-T3, the set of second error logs generated in T3-T4, and the set of second error logs generated in T4-T5, respectively. Assume that three clusters of C1, C2, and C3 have been currently generated. Taking cluster C1 as an example, in fig. 5, assume that n error logs in a group of second error logs generated in T2-T3 are clustered in cluster C1, assume that m error logs in a group of second error logs generated in T3-T4 are clustered in cluster C1, and assume that k error logs in a group of second error logs generated in T4-T5 are clustered in cluster C1. Based on this, a curve reflecting the change of the error logs in the cluster C1 can be drawn, as shown in fig. 5, the abscissa of the curve represents time, and the ordinate represents the number of error logs contained in the cluster C1, wherein it is assumed that the number of error logs contained in the cluster C1 before the time T2 is a. Thus, at time T3, the corresponding total number of error logs is a + n; at time T4, the total number of corresponding error logs is a + n + m; at time T5, the corresponding total number of error logs is a + n + m + k.

That is to say, the number of error logs included in each cluster may be counted at preset time intervals, so that a time series is obtained for each cluster, namely the number of error logs included in that cluster at different times.
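The construction of these per-cluster time series can be sketched as follows, assuming that each group of second error logs has already been reduced to the list of cluster names its logs were assigned to. The function name cluster_series and the sample counts are hypothetical.

```python
from collections import Counter
from itertools import accumulate


def cluster_series(initial_counts, periods):
    """Build the cumulative log count of every cluster over time.

    initial_counts: cluster name -> number of logs before the first period
                    (the value a in the description of fig. 5).
    periods: one list per time period, holding the cluster name assigned to
             each second error log generated in that period.
    """
    per_period = [Counter(labels) for labels in periods]
    clusters = set(initial_counts).union(*per_period)
    return {
        c: list(accumulate((p[c] for p in per_period),
                           initial=initial_counts.get(c, 0)))
        for c in sorted(clusters)
    }


series = cluster_series(
    {"C1": 100, "C2": 20},
    [["C1", "C1", "C2"],          # T2-T3
     ["C1"],                      # T3-T4
     ["C1", "C1", "C1", "C3"]],   # T4-T5
)
print(series["C1"])  # [100, 102, 103, 106], i.e. a, a+n, a+n+m, a+n+m+k
```

A cluster that first appears in a later period, such as C3 here, simply starts from zero.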

For any one cluster, such as the cluster C1, the time series data corresponding to C1 may be analyzed to alert when a change in the number of error logs in the cluster C1 is found to present a surge feature, so that the solution corresponding to the cluster C1 is automatically executed based on the triggering of the alert.

In fact, the surge feature can be divided into two cases. One is a sudden increase from nothing to something: for example, a cluster that did not previously exist is newly added while a group of second error logs is being clustered, or an outlier is newly added. The other is a sudden increase from few to many.

A sudden increase from nothing to something is easy to detect and needs no further explanation.

For a sudden increase from few to many, the number of error logs included in a cluster is in any case expected to grow from few to many as time goes on. Therefore, in order to find the surge feature, a time series decomposition algorithm such as STL (Seasonal-Trend decomposition procedure based on Loess) may be used to decompose the time series data corresponding to the cluster C1 into a trend term, a seasonal term (also referred to as a periodic term), and a random term (also referred to as a remainder). The seasonal term is removed, and the remaining trend term and random term are analyzed to determine whether the cluster C1 exhibits the surge feature. Briefly, the analysis is as follows: an upper limit value is set, and if the cumulative sum of the trend term and the random term corresponding to the cluster C1 exceeds the upper limit value within a certain period of time, the cluster C1 is considered to exhibit the surge feature, as shown in fig. 5.
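The check described above can be illustrated with the following sketch. It is not STL: to stay self-contained it substitutes a classical additive decomposition (centred moving average for the trend, per-phase averages for the seasonal term) for the Loess-based procedure. It also assumes that the input series holds the number of new logs per period rather than cumulative totals, and the period, window, and upper limit values are hypothetical.

```python
def deseasonalize(y, period):
    """Return y minus an estimated seasonal term, i.e. trend plus remainder.

    Simplified stand-in for STL; `period` must be odd in this sketch.
    """
    if period % 2 == 0:
        raise ValueError("this sketch only supports an odd period")
    half, n = period // 2, len(y)
    # Centred moving average as the trend estimate (undefined at the edges).
    trend = [None] * n
    for t in range(half, n - half):
        trend[t] = sum(y[t - half:t + half + 1]) / period
    # The average detrended value at each phase estimates the seasonal term.
    by_phase = [[] for _ in range(period)]
    for t in range(n):
        if trend[t] is not None:
            by_phase[t % period].append(y[t] - trend[t])
    seasonal = [sum(v) / len(v) if v else 0.0 for v in by_phase]
    mean = sum(seasonal) / period
    seasonal = [s - mean for s in seasonal]  # centre the seasonal term on zero
    return [y[t] - seasonal[t % period] for t in range(n)]


def shows_surge(y, period, window, upper):
    """True if trend plus remainder, summed over the last `window` points,
    exceeds the upper limit value."""
    adjusted = deseasonalize(y, period)
    return sum(adjusted[-window:]) > upper


quiet = [2, 5, 3] * 4                 # regular pattern with period 3
burst = [2, 5, 3] * 3 + [30, 40, 35]  # same pattern followed by a jump
print(shows_surge(quiet, period=3, window=2, upper=20))  # False
print(shows_surge(burst, period=3, window=2, upper=20))  # True
```

Removing the seasonal term first keeps a regular daily or weekly peak from being mistaken for a surge; a library implementation of STL could replace deseasonalize without changing the threshold test.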

Through the error correction processing of the second error logs provided by this embodiment, anomalies on the platform side can be detected in time so that self-healing can be performed in time. For example, when it is found based on the STL method that the amount of data stored on the platform side has exceeded the upper threshold, a corresponding solution can be executed: deleting the part of the data with the earliest generation time, in order of generation time.

In summary, the error log processing method provided by the embodiment of the invention can perform clustering processing on the error logs based on the semantic features of the error logs, so that the error logs with the same error cause are clustered into one class, and the clustering effect is better by combining the semantic information of the error logs to perform clustering. In addition, by labeling the error correction information of each cluster corresponding to the clustering result, the reason of the error reporting problem can be positioned and the error reporting problem can be solved based on the labeling result.

The error log processing apparatus according to one or more embodiments of the present invention will be described in detail below. Those skilled in the art will appreciate that these error log processing means can be constructed by configuring the steps taught in the present embodiment using commercially available hardware components.

Fig. 6 is a schematic structural diagram of an error log processing apparatus according to an embodiment of the present invention. As shown in fig. 6, the apparatus includes: an acquisition module 11, a clustering module 12, a determining module 13, and an error correction module 14.

The acquisition module 11 is configured to obtain a plurality of first error logs.

The clustering module 12 is configured to perform clustering processing on the first error logs to obtain a plurality of clusters.

The determining module 13 is configured to determine error correction information corresponding to each of the plurality of clusters.

The error correction module 14 is configured to perform error correction processing according to the error correction information corresponding to each of the plurality of clusters.

Optionally, the error correction information is expert annotated.

Optionally, the clustering module 12 may be specifically configured to: obtaining keywords contained in the first error logs respectively; performing word vector coding on keywords contained in the first error logs respectively; determining semantic feature matrixes respectively corresponding to the first error logs according to the word vectors respectively corresponding to the first error logs; and clustering the first error logs by taking semantic feature matrixes respectively corresponding to the first error logs as samples.

Optionally, a density clustering algorithm is adopted for the clustering process.

Optionally, in the process of obtaining the keywords included in each of the plurality of first error logs, the clustering module 12 may be specifically configured to:

perform word segmentation on any error log of the first error logs; and determine the words whose word frequency in that error log meets a set condition as the keywords contained in that error log.

Optionally, the apparatus further comprises a vocabulary generating module, configured to: acquire a plurality of words contained in the first error logs; and generate a vocabulary according to the word frequencies corresponding to the plurality of words, wherein the vocabulary is composed of a preset number of words with the highest word frequencies. The keywords contained in any error log are the words that are contained in that error log and located in the vocabulary.

Optionally, the apparatus further comprises a model generation module, configured to: acquire the number of co-occurrences of any two words of the vocabulary in the first error logs, wherein the number of co-occurrences is counted within a preset window length; generate a co-occurrence matrix according to the numbers of co-occurrences; and train word vectors according to the co-occurrence matrix and the GloVe model, to obtain the GloVe model used for word vector coding of the first error logs.
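The vocabulary and co-occurrence statistics handled by the vocabulary generating module and the model generation module can be sketched as follows. The sketch assumes that the logs are already segmented into words, counts co-occurrences symmetrically without distance weighting, and stops before GloVe training itself; the function names and sample logs are hypothetical.

```python
from collections import Counter


def build_vocabulary(tokenized_logs, size):
    """Keep the `size` words with the highest word frequency over all logs."""
    freq = Counter(w for log in tokenized_logs for w in log)
    return [w for w, _ in freq.most_common(size)]


def cooccurrence(tokenized_logs, vocab, window):
    """Count how often two vocabulary words appear within `window` positions
    of each other in the same log; the counts are stored symmetrically."""
    in_vocab = set(vocab)
    counts = Counter()
    for log in tokenized_logs:
        for i, w in enumerate(log):
            if w not in in_vocab:
                continue
            for v in log[i + 1:i + 1 + window]:
                if v in in_vocab:
                    counts[(w, v)] += 1
                    counts[(v, w)] += 1
    return counts


# Hypothetical error logs after word segmentation.
logs = [
    ["table", "not", "found"],
    ["table", "not", "found", "in", "project"],
    ["permission", "denied", "table"],
]
vocab = build_vocabulary(logs, size=3)
counts = cooccurrence(logs, vocab, window=1)
print(sorted(vocab))             # ['found', 'not', 'table']
print(counts[("table", "not")])  # 2
```

The resulting counts correspond to the entries of the co-occurrence matrix from which the word vectors would then be trained.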

Optionally, the apparatus further comprises a preprocessing module, configured to perform stem extraction on the first error logs and eliminate redundant entities contained in the first error logs.

Optionally, the apparatus further comprises a rule generating module, configured to: determine, for any one of the plurality of clusters, a similarity distance between different error logs included in that cluster; determine multiple groups of error logs whose similarity distances meet a set condition; and determine an entity elimination rule according to the multiple groups of error logs.

Based on this, the preprocessing module may be further configured to eliminate redundant entities contained in the first error logs according to the entity elimination rule.

Optionally, the error correction module 14 may specifically be configured to: acquiring at least one group of second error logs, wherein the at least one group of second error logs sequentially correspond to the error logs generated in different time periods after the plurality of first error logs; according to the obtained clusters, sequentially clustering the at least one group of second error logs to update the clusters; and carrying out error correction processing on the at least one group of second error logs according to the updated error correction information respectively corresponding to each cluster.

Optionally, in the process of performing error correction processing on the at least one group of second error logs according to the updated error correction information respectively corresponding to each cluster, the error correction module 14 may be specifically configured to: and for any group of second error logs, if any cluster contains part of second error logs, sending the error correction information corresponding to any cluster to the corresponding user according to the user identification corresponding to each part of second error logs.

Optionally, in the process of performing error correction processing on the at least one group of second error logs according to the updated error correction information respectively corresponding to each cluster, the error correction module 14 may be specifically configured to: and for any group of second error logs, if any cluster contains part of the second error logs, the error reason corresponding to any cluster belongs to the platform-side reason, and the total number of the error logs contained in any cluster is greater than or equal to a set threshold, executing a solution corresponding to any cluster.

Optionally, in the process of performing error correction processing on the at least one group of second error logs according to the updated error correction information respectively corresponding to each cluster, the error correction module 14 may be specifically configured to: for any cluster, if the error reason corresponding to any cluster belongs to the platform-side reason, determining the number of the at least one group of second error logs respectively corresponding to any cluster; and if the change of the number presents a sudden increase characteristic, executing a solution corresponding to any one cluster.

The error log processing apparatus shown in fig. 6 can execute the methods provided in the foregoing embodiments, and portions not described in detail in this embodiment may refer to the related descriptions of the foregoing embodiments, which are not described herein again.

In one possible design, the structure of the error log processing apparatus shown in fig. 6 may be implemented as an electronic device. As shown in fig. 7, the electronic device may include a processor 21 and a memory 22. The memory 22 stores executable code which, when executed by the processor 21, at least enables the processor 21 to implement the error log processing method provided in the foregoing embodiments.

The electronic device may further include a communication interface 23 for communicating with other devices or a communication network.

In addition, an embodiment of the present invention provides a non-transitory machine-readable storage medium on which executable code is stored. When the executable code is executed by a processor of an electronic device, the processor is caused to execute the error log processing method provided in the foregoing embodiments.

The above-described apparatus embodiments are merely illustrative, wherein the various modules illustrated as separate components may or may not be physically separate. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.

In the following, with reference to fig. 8, an intelligent home scenario is taken as an example to describe implementation of the error log processing method provided by the embodiment of the present invention in the intelligent home scenario.

In a smart home scenario, there is a certain smart device and at least one controlled device controlled by the smart device. For example, as shown in fig. 8, the smart device may be a smart speaker, and the at least one controlled device is at least one household device, such as a refrigerator, an air conditioner, and the like.

The intelligent sound box comprises a memory, a processor, and a communication interface. The communication interface gives the intelligent sound box network communication capability.

Wherein the memory has stored thereon executable code that, when executed by the processor, causes the processor to perform the steps of:

acquiring a plurality of error logs generated by at least one controlled device controlled by the intelligent sound box through a communication interface; clustering the error logs to obtain a plurality of clusters; determining error correction information corresponding to the plurality of clusters, respectively; error correction processing is performed based on error correction information corresponding to each of the plurality of clusters.

Assuming that the at least one controlled device comprises the refrigerator and the air conditioner shown in fig. 8, after the clusters into which the error logs generated by the refrigerator and by the air conditioner fall are determined based on the above steps and the error correction information corresponding to those clusters is obtained, the corresponding error correction processing can be performed for the refrigerator and the air conditioner according to that error correction information. The error correction processing is, for example, informing the user why the refrigerator or the air conditioner reported an error and what the solution is, as mentioned in the foregoing embodiments.

For the clustering process and the error correction process of the error logs in this embodiment, reference may be made to the descriptions in the foregoing other embodiments, which are not described herein again.

Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by means of software plus a necessary general-purpose hardware platform, and of course can also be implemented by a combination of hardware and software. Based on this understanding, the above technical solutions, in essence or in the part that contributes to the prior art, may be embodied in the form of a computer program product, which may be carried on one or more computer-usable storage media containing computer-usable program code, including but not limited to disk storage, CD-ROM, and optical storage.

Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims

1. An error log processing method is characterized by comprising the following steps:

acquiring a plurality of first error logs;

based on identifiers corresponding to different entities, removing redundant first entities contained in the first error logs;

obtaining keywords contained in each of the first error report logs, performing word vector coding on the keywords contained in each of the first error report logs, determining semantic feature matrices respectively corresponding to the first error report logs according to the word vectors respectively corresponding to the first error report logs, and clustering the first error report logs by using the semantic feature matrices respectively corresponding to the first error report logs as samples to obtain a plurality of clusters;

removing redundant second entities contained in the first error logs; wherein an entity culling rule for implementing culling of the second entity is generated as follows: for any one of the clusters, determining similarity distances among different error-reporting logs contained in the any one cluster, determining multiple groups of error-reporting logs with the similarity distances meeting set conditions, and determining the entity elimination rule according to differences embodied by the multiple groups of error-reporting logs;

2. The method of claim 1, wherein the clustering process is performed using a density clustering algorithm.

3. The method according to claim 2, wherein the obtaining keywords included in each of the first error logs comprises:

for any error log in the first error logs, performing word segmentation processing on the error log;

determining the words with the word frequency meeting the set conditions in any error log as the keywords contained in any error log.

4. The method of claim 3, further comprising:

acquiring a plurality of words contained in the first error logs;

generating a vocabulary according to the word frequencies corresponding to the plurality of words, wherein the vocabulary is composed of a preset number of words with the highest word frequencies;

and the keywords contained in any error log are words contained in the error log and positioned in the vocabulary.

5. The method of claim 4, further comprising:

acquiring the co-occurrence times of any two words in the vocabulary in the first error report logs, wherein the co-occurrence times are the co-occurrence times within the length of a preset window;

generating a co-occurrence matrix according to the co-occurrence times;

training word vectors according to the co-occurrence matrix and the GloVe model to obtain the GloVe model for carrying out word vector coding on the first error-reporting logs.

6. The method of claim 1, wherein prior to clustering the first plurality of error logs, the method further comprises:

and performing stem extraction on the first error logs.

7. The method of claim 1, wherein said culling redundant entities contained in the first plurality of error logs comprises:

and rejecting redundant entities contained in the first error logs according to the entity rejection rules.

8. The method according to any one of claims 1 to 7, wherein the performing error correction processing according to the error correction information corresponding to each of the plurality of clusters includes:

acquiring at least one group of second error logs, wherein the at least one group of second error logs sequentially correspond to the error logs generated in different time periods after the plurality of first error logs;

according to the obtained clusters, sequentially clustering the at least one group of second error logs to update the clusters;

and carrying out error correction processing on the at least one group of second error logs according to the updated error correction information respectively corresponding to each cluster.

9. The method according to claim 8, wherein performing error correction processing on the at least one second error log according to the updated error correction information respectively corresponding to each cluster includes:

and for any group of second error logs, if any cluster contains part of second error logs, sending the error correction information corresponding to any cluster to the corresponding user according to the user identification corresponding to each part of second error logs.

10. The method according to claim 8, wherein the error correction information includes a solution and an error reason, and performing error correction processing on the at least one second error log according to the updated error correction information corresponding to each cluster respectively includes:

and for any group of second error logs, if any cluster contains part of the second error logs, the error reason corresponding to any cluster belongs to the platform-side reason, and the total number of the error logs contained in any cluster is greater than or equal to a set threshold, executing a solution corresponding to any cluster.

11. The method according to claim 8, wherein the error correction information includes a solution and an error reason, and performing error correction processing on the at least one second error log according to the error reason corresponding to each updated cluster respectively includes:

for any cluster, if the error reason corresponding to any cluster belongs to the platform-side reason, determining the number of the at least one group of second error logs respectively corresponding to any cluster;

and if the change of the number presents a sudden increase characteristic, executing a solution corresponding to any one cluster.

12. The method of claim 1, wherein the error correction information is expert labeled.

13. An error log processing apparatus, comprising:

the acquisition module is used for acquiring a plurality of first error logs;

the preprocessing module is used for eliminating redundant first entities contained in the first error logs based on identifiers corresponding to different entities;

the clustering module is used for acquiring keywords contained in the first error report logs, performing word vector coding on the keywords contained in the first error report logs, determining semantic feature matrixes corresponding to the first error report logs according to the word vectors corresponding to the first error report logs, and clustering the first error report logs by taking the semantic feature matrixes corresponding to the first error report logs as samples to obtain a plurality of clusters;

the preprocessing module is further configured to remove redundant second entities included in the first error logs; wherein an entity culling rule for implementing culling of the second entity is generated as follows: for any one of the clusters, determining similarity distances among different error-reporting logs contained in the any one cluster, determining multiple groups of error-reporting logs with the similarity distances meeting set conditions, and determining the entity elimination rule according to differences embodied by the multiple groups of error-reporting logs;

14. An electronic device, comprising: a memory, a processor; wherein the memory has stored thereon executable code which, when executed by the processor, causes the processor to perform the error log processing method of any of claims 1 to 12.

15. The electronic device of claim 14, wherein the electronic device is a terminal device or a server.

16. An intelligent sound box, comprising: a memory, a processor, a communication interface; wherein the memory has stored thereon executable code that, when executed by the processor, causes the processor to perform the steps of:

acquiring a plurality of first error logs generated by at least one controlled device controlled by the intelligent sound box through the communication interface;

17. The smart sound box of claim 16, wherein the at least one controlled device is at least one household device.