CN113849474A - Data processing method and device, electronic equipment and readable storage medium - Google Patents

Publication number
CN113849474A
Authority
CN
China
Prior art keywords
classification
log
result
target
classification result
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111101705.9A
Other languages
Chinese (zh)
Inventor
李士新
帅朝春
陆天洋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Douku Software Technology Co Ltd
Original Assignee
Hangzhou Douku Software Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Douku Software Technology Co Ltd filed Critical Hangzhou Douku Software Technology Co Ltd
Priority to CN202111101705.9A priority Critical patent/CN113849474A/en
Publication of CN113849474A publication Critical patent/CN113849474A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10File systems; File servers
    • G06F16/18File system types
    • G06F16/1805Append-only file systems, e.g. using logs or journals to store data
    • G06F16/1815Journaling file systems
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35Clustering; Classification
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36Creation of semantic tools, e.g. ontology or thesauri

Abstract

The application discloses a data processing method, a data processing device, electronic equipment and a readable storage medium, and belongs to the technical field of data processing. The method comprises the following steps: acquiring a problem description text input by a user; determining a target log based on the problem description text, wherein the target log is an abnormal log file related to the problem description text; and performing problem analysis on the target log to obtain a log analysis result corresponding to the target log. By determining the target log from the problem description text, the method and the device can increase the speed of log analysis.

Description

Data processing method and device, electronic equipment and readable storage medium
Technical Field
The present application relates to the field of data processing technologies, and in particular, to a data processing method and apparatus, an electronic device, and a readable storage medium.
Background
With the rapid development of internet technology, computers have become an indispensable tool for work and communication in daily life. When a user uses a computing device, the network device generates various kinds of text data, such as logs, URLs, and traces, which describe operations in terms of date, time, user, and action. By analyzing this text data, operation and maintenance personnel can monitor the health of the system and the network, user usage, and the like in real time. However, log files come in many formats and data types, and because the formats are not uniform, processing different types of log files is complicated.
Disclosure of Invention
The application provides a data processing method, a data processing device, an electronic device and a readable storage medium, so as to overcome the above defects.
In a first aspect, an embodiment of the present application provides a data processing method, where the method includes: acquiring a problem description text input by a user; determining a target log based on the problem description text, wherein the target log is an abnormal log file related to the problem description text; and performing problem analysis on the target log to obtain a log analysis result corresponding to the target log.
In a second aspect, an embodiment of the present application further provides a data processing apparatus, where the apparatus includes an acquisition module, a determination module, and an analysis module. The acquisition module is configured to acquire the problem description text input by the user. The determination module is configured to determine a target log based on the problem description text, where the target log is an abnormal log file related to the problem description text. The analysis module is configured to perform problem analysis on the target log to obtain a log analysis result corresponding to the target log.
In a third aspect, an embodiment of the present application further provides an electronic device, including: one or more processors; a memory; and one or more application programs, wherein the one or more application programs are stored in the memory, are configured to be executed by the one or more processors, and are configured to perform the above method.
In a fourth aspect, an embodiment of the present application further provides a computer-readable storage medium, where a program code is stored in the computer-readable storage medium, and the program code can be called by a processor to execute the above method.
According to the data processing method and device, the electronic device and the readable storage medium, when the problem description text is obtained, the problem analysis rate can be increased by determining the target log by using the problem description text. Specifically, the problem description text input by the user can be obtained first, on the basis, the target log is determined based on the problem description text, wherein the target log is an abnormal log file related to the problem description text, and finally, the target log can be subjected to problem analysis to obtain a log analysis result corresponding to the target log. According to the method and the device, the target log is determined by utilizing the problem description text, and the target log is analyzed, so that the acquisition rate of the target log can be increased to a certain extent, and the efficiency of log analysis can be improved.
Additional features and advantages of embodiments of the present application will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by the practice of embodiments of the present application. The objectives and other advantages of the embodiments of the application may be realized and attained by the structure particularly pointed out in the written description and claims hereof as well as the appended drawings.
Drawings
In order to illustrate the technical solutions in the embodiments of the present application more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings in the following description show only some embodiments of the present application, and those skilled in the art can obtain other drawings from these drawings without creative effort.
FIG. 1 illustrates a method flow diagram of a data processing method provided by an embodiment of the present application;
FIG. 2 is a flow chart of a method of data processing according to another embodiment of the present application;
FIG. 3 is a diagram illustrating an exemplary first classification result in a data processing method according to another embodiment of the present application;
FIG. 4 is a flowchart illustrating step S230 in a data processing method according to another embodiment of the present application;
FIG. 5 is a diagram illustrating an example of a second classification result in a data processing method according to another embodiment of the present application;
FIG. 6 is a diagram illustrating an exemplary comparison algorithm for obtaining a classification network in a data processing method according to another embodiment of the present application;
FIG. 7 illustrates a method flow diagram of a data processing method provided by yet another embodiment of the present application;
FIG. 8 is a flowchart illustrating step S330 in a data processing method according to another embodiment of the present application;
FIG. 9 is a block diagram illustrating a structure of a data processing apparatus according to an embodiment of the present application;
FIG. 10 shows a block diagram of an electronic device provided in an embodiment of the present application;
FIG. 11 illustrates a storage unit for storing or carrying program codes for implementing the data processing method according to the embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. The components of the embodiments of the present application, generally described and illustrated in the figures herein, can be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the present application, presented in the accompanying drawings, is not intended to limit the scope of the claimed application, but is merely representative of selected embodiments of the application. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the present application without making any creative effort, shall fall within the protection scope of the present application.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures. Meanwhile, in the description of the present application, the terms "first", "second", and the like are used only for distinguishing the description, and are not to be construed as indicating or implying relative importance.
Natural Language Processing (NLP) is a technique that uses a computer to process natural language information in written or spoken form. Natural language processing is mainly applied in search engines, intelligent customer service, business intelligence, law, medical treatment, education and the like, and its core applications mainly include machine translation, information retrieval, information extraction, text classification, sentiment classification, question answering systems, recommendation systems, reading comprehension and the like. In addition, the basic technologies of natural language processing mainly include lexical analysis, syntactic analysis, semantic analysis, pragmatic analysis, discourse analysis and the like. The basic theories underlying natural language processing include formal languages and automata, probability theory and information theory, and machine learning, where machine learning may include deep learning. The data resource of natural language processing is the corpus.
A corpus is a set of written or spoken natural language material stored on a computer for studying how a language is used. In other words, a corpus is a systematic, computerized collection of real language material used for linguistic analysis, so the quality and type of the corpus can largely determine the effect of text recognition. In addition, DevOps is a collective term for a set of processes, methods and systems that emphasizes communication and cooperation between software developers and operation and maintenance personnel.
At present, in the field of DevOps, general-purpose corpora often cannot meet requirements, and high-quality corpora are scarce. The main reasons are as follows. First, log types are various; for example, in the Android field, logs may include a linux log, a frame log, a crash log, a modem log, and the like, and each log has a different format and a different processing method. Second, different services have different corpus processing needs: some services need only specific parts such as the stack, while others need the whole picture to analyze a problem. The shortage of high-quality corpora therefore greatly limits the use of natural language processing technology in the field of DevOps. In addition, the prior art cannot support rapid analysis of large-scale professional text data, mainly because existing log files are large and differ in format and content. For example, the linux error log and the Java-layer error log differ completely in both content and analysis method.
In view of the above problems, the inventor proposes a data processing method, an electronic device, and a storage medium provided in the embodiments of the present application, and when a problem description text is obtained, a rate of problem analysis can be increased by determining a target log using the problem description text. Specifically, the problem description text input by the user can be obtained first, on the basis, the target log is determined based on the problem description text, wherein the target log is an abnormal log file related to the problem description text, and finally, the target log can be subjected to problem analysis to obtain a log analysis result corresponding to the target log. According to the method and the device, the target log is determined by utilizing the problem description text, and the target log is analyzed, so that the acquisition rate of the target log can be increased to a certain extent, and the efficiency of log analysis can be improved. Specific data processing methods are described in detail in the following embodiments.
Referring to fig. 1, fig. 1 is a schematic flow chart illustrating a data processing method according to an embodiment of the present application. In a specific embodiment, the data processing method is applied to the data processing apparatus 400 shown in fig. 9 and to the electronic device 500 shown in fig. 10. The flow shown in fig. 1 is described in detail below; the data processing method may specifically include step S110 to step S130.
Step S110: acquiring the problem description text input by the user.
The data processing method in the embodiment of the application can be applied to electronic equipment, and the electronic equipment can be a smart phone, a smart sound box, a smart watch, a portable computer and the like. As one way, the electronic device may detect whether a problem description text input by the user is received, and if so, the electronic device may determine a target log based on the problem description text, i.e., proceed to step S120.
In some embodiments, the problem description text may be data actively reported by the user; that is, if the user finds the electronic device to be abnormal while using it, the user may input a problem description text describing the abnormality. For example, when the user uses the electronic device, the screen may flash back, and the user may then input the problem description text "the screen flashes back". In addition, the problem description text may include words, punctuation marks, emoticons, numbers, and other symbols, where words are the main content of the problem description text.
In some embodiments, the problem description text may be text input by a user according to the actual situation of using the electronic device, or description text uploaded by back-end operation and maintenance personnel according to a test situation. In addition, when the problem description text input by the user is obtained, the problem description text may be stored together with the abnormal event that triggered it. The electronic device may subsequently monitor for the abnormal event, and when the abnormal event is detected again, the electronic device may directly use the stored problem description text corresponding to the abnormal event as the problem description text input by the user.
In other words, the problem description text input by the user can be recorded and associated with the abnormal event, when the abnormal event occurs again, the user does not need to repeatedly input the problem description text, that is, the electronic device can automatically acquire the problem description text only according to the abnormal event, and therefore the use experience of the user can be improved to a great extent.
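The association between an abnormal event and a previously entered problem description text can be pictured as a small keyed store. The following is only a minimal sketch: the class name `DescriptionMemory`, the event identifier format, and the in-memory dictionary are assumptions made for illustration and are not specified by the application.

```python
from typing import Dict, Optional


class DescriptionMemory:
    """Remembers the problem description text a user entered for an abnormal event."""

    def __init__(self) -> None:
        # Hypothetical storage: abnormal event identifier -> problem description text.
        self._by_event: Dict[str, str] = {}

    def record(self, event_id: str, description: str) -> None:
        # Associate the user's text with the abnormal event that triggered it.
        self._by_event[event_id] = description

    def recall(self, event_id: str) -> Optional[str]:
        # Return the stored text, or None if this event has not been seen before.
        return self._by_event.get(event_id)


memory = DescriptionMemory()
memory.record("app_a_flash_back", "the screen flashes back")
```

When `recall` returns a text, the device could reuse it directly; when it returns `None`, the user would be asked for a description as usual.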
In some embodiments, the electronic device may also detect an abnormal event, and when the abnormal event is detected, the electronic device may output a problem report box, through which the user may input a problem description text. When inputting the problem description text, the user can type it directly using the text input module configured on the electronic device. Optionally, the user may also input the problem description by voice; in this case, the electronic device receives the problem description voice input by the user and converts it into the problem description text.
In other embodiments, when the electronic device detects that an abnormal event occurs, it may also detect how the user is currently using the electronic device, that is, determine the level of the foreground application in the electronic device. When the level of the current foreground application is the first level, the electronic device may refrain from popping up the problem report box and instead notify the user, by vibration or flashing, that a problem description text may be reported.
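The choice between a problem report box and a less intrusive notification can be sketched as a single decision on the foreground application level. The integer encoding of levels, the value `1` for the first level, and the returned labels are assumptions for illustration only.

```python
def choose_prompt(foreground_level: int, first_level: int = 1) -> str:
    """Pick how to tell the user that a problem description can be reported."""
    if foreground_level == first_level:
        # Do not interrupt a first-level foreground application with a pop-up.
        return "vibrate_or_flash"
    # Otherwise show the problem report box.
    return "report_box"
```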
In other embodiments, when the electronic device detects that an abnormal event occurs, it may also output a prompt message to inform the user that the electronic device is abnormal. Because the operation of triggering the problem description text input function may be complicated, the prompt message can also indicate how the user should input the problem description text. Optionally, the electronic device may also instruct the user, via the prompt message, to enter a standard problem description text. For example, the user is prompted: "please enter standard problem description text and avoid unnecessary symbols such as emoticons".
Step S120: determining a target log based on the problem description text, the target log being an abnormal log file associated with the problem description text.
In some embodiments, when the problem description text is obtained, the electronic device may determine a target log based on the problem description text, and different problem description texts may correspond to different target logs. Classified by type, logs may include a linux error log, a frame error log, an error log of a Java layer, and the like; classified by problem, logs may include a crash log, a modem log, and the like. Which specific logs are included is not described in detail herein.
As can be known from the above description, a large number of log files are usually generated when an electronic device is operated, and if different anomalies occur in the electronic device, the log files correspondingly generated are different. Therefore, in order to speed up the problem analysis rate, the embodiment of the present application may use the problem description text to locate a target log from a plurality of log files, where the target log may be an abnormal log file related to the problem description text.
In the embodiment of the application, the problem description text and the target log can be stored in the log obtaining list through a preset corresponding relation, and when the problem description text is obtained, the electronic device can search the target log corresponding to the problem description text in the log obtaining list. In addition, after the problem description text is acquired, the problem description text may also be preprocessed to remove redundant information in the problem description text, and then the remaining problem description text is subjected to semantic analysis to obtain a semantic analysis result, and a log corresponding to the semantic analysis result is used as a target log.
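A minimal sketch of the lookup described above is given here. The contents of `LOG_OBTAINING_LIST`, the log file names, and the ASCII-only cleanup rule are hypothetical, and an exact match on the cleaned text stands in for the semantic analysis, which the application does not specify.

```python
import re
from typing import Dict, List, Optional

# Hypothetical "log obtaining list": cleaned description -> log file names.
LOG_OBTAINING_LIST: Dict[str, List[str]] = {
    "the screen flashes back": ["crash.log"],
    "no network": ["modem.log"],
}


def preprocess(text: str) -> str:
    # Remove redundant information: keep ASCII letters, digits and spaces only,
    # then lowercase and collapse repeated whitespace (sketch-level rule).
    kept = re.sub(r"[^A-Za-z0-9 ]+", " ", text)
    return " ".join(kept.lower().split())


def find_target_logs(text: str) -> Optional[List[str]]:
    # Exact match on the cleaned text; returns None when nothing corresponds.
    return LOG_OBTAINING_LIST.get(preprocess(text))
```

A real system would likely replace the exact match with a semantic comparison, since users rarely phrase the same problem identically.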
In other embodiments, when the problem description text input by the user is acquired, the electronic device may perform text recognition on the problem description text to acquire keyword information included in the problem description text, then acquire a log corresponding to the keyword information, and use the log as a target log. In addition, when there are a plurality of logs corresponding to the problem description text, the embodiment of the present application may use all of the plurality of logs as target logs, that is, the target logs may include a plurality of target sub-logs.
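The keyword route can be sketched as follows. The keyword table and log names are invented for illustration, and whitespace splitting is a simplification of text recognition; the point is only that several keywords may yield several target sub-logs.

```python
from typing import Dict, List

# Hypothetical mapping from keyword information to a log file name.
KEYWORD_TO_LOG: Dict[str, str] = {
    "crash": "crash.log",
    "network": "modem.log",
    "signal": "modem.log",
}


def target_logs_for(text: str) -> List[str]:
    """Collect every log matched by a keyword, without duplicates, in first-seen order."""
    logs: List[str] = []
    for word in text.lower().split():
        log = KEYWORD_TO_LOG.get(word.strip(".,!?"))
        if log is not None and log not in logs:
            logs.append(log)
    return logs
```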
In other embodiments, if there are multiple logs corresponding to the problem description text, the embodiment of the present application may also rank the multiple target sub-logs by priority and use the logs with higher priority as the target log; that is, the electronic device may select a preset number of logs from the target sub-logs in order of priority as the target log. The priority can be determined by analyzing the probability that a target sub-log stores an abnormal file: the higher the probability, the higher the priority of the target sub-log.
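The priority selection amounts to sorting the target sub-logs by probability and keeping the first few. In this sketch the probabilities are assumed to be already available as a dictionary; how they are estimated is not described in the application, and the log names are placeholders.

```python
from typing import Dict, List


def select_by_priority(abnormal_prob: Dict[str, float], preset_number: int) -> List[str]:
    """Return the `preset_number` sub-logs most likely to store the abnormal file."""
    ranked = sorted(abnormal_prob, key=lambda name: abnormal_prob[name], reverse=True)
    return ranked[:preset_number]


candidates = {"crash.log": 0.7, "modem.log": 0.2, "kernel.log": 0.9}
top_two = select_by_priority(candidates, 2)
```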
As one way, after obtaining the target log, the electronic device may perform problem analysis on the target log to obtain a log analysis result corresponding to the target log, that is, enter step S130.
Step S130: performing problem analysis on the target log to obtain a log analysis result corresponding to the target log.
In some embodiments, the target log may include a time when the abnormality occurs, a location of the abnormality, a category of the abnormality, and the like, and when the target log is obtained, the target log may be analyzed by an automatic analyzer to obtain a log analysis result corresponding to the target log. The log analysis result may include a cause of the exception, a specific location of the exception, and a policy that may be taken to resolve the exception, among others.
In addition, when the target log is acquired, different automatic analyzers may be loaded according to the type of the target log to analyze it and obtain a log analysis result. Generally speaking, problems in a specific field have similar solutions. For example, ANR (Application Not Responding) problems in Android can often be traced to a specific position in a stack, so analyzing the target log with a dedicated analyzer can determine the cause of the abnormality accurately and quickly.
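Loading a different analyzer per log type can be sketched as a registry keyed by type. The analyzer functions, the type labels, and the toy rule that treats a line beginning with "at " as a stack frame are all assumptions for illustration; they are not the analyzers of the application.

```python
from typing import Callable, Dict

Analyzer = Callable[[str], str]


def analyze_anr(log_text: str) -> str:
    # Toy rule: report the first line that looks like a stack frame ("at ...").
    for line in log_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("at "):
            return "ANR: suspect frame " + stripped[3:]
    return "ANR: no stack frame found"


def analyze_generic(log_text: str) -> str:
    # Fallback used when no dedicated analyzer is registered for the log type.
    return f"generic: {len(log_text.splitlines())} lines scanned"


# Hypothetical registry: log type -> dedicated analyzer.
ANALYZERS: Dict[str, Analyzer] = {"anr": analyze_anr}


def analyze(log_type: str, log_text: str) -> str:
    analyzer = ANALYZERS.get(log_type, analyze_generic)
    return analyzer(log_text)
```

Keeping the analyzers in a registry means a new log type only needs a new entry, without changing the dispatch code.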
As one way, when the target log is obtained, the log analysis result may also be obtained by combining the target log and the problem description text; that is, both the target log and the problem description text are input to the automatic analyzer, which is instructed to obtain the log analysis result from both, thereby improving the accuracy of log analysis.
According to the data processing method provided by the embodiment of the application, when the problem description text is obtained, the problem analysis rate can be increased by determining the target log by using the problem description text. Specifically, the problem description text input by the user can be obtained first, on the basis, the target log is determined based on the problem description text, wherein the target log is an abnormal log file related to the problem description text, and finally, the target log can be subjected to problem analysis to obtain a log analysis result corresponding to the target log. According to the method and the device, the target log is determined by utilizing the problem description text, and the target log is analyzed, so that the acquisition rate of the target log can be increased to a certain extent, and the efficiency of log analysis can be improved.
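The three steps S110 to S130 can be summarized as a short pipeline. This is a sketch under stated assumptions: the locating and analyzing functions are passed in as parameters, and the two `demo_` functions are hypothetical stand-ins rather than the actual locating or analysis logic.

```python
from typing import Callable, Dict, List


def process_problem(description: str,
                    locate_logs: Callable[[str], List[str]],
                    analyze_log: Callable[[str], str]) -> Dict[str, str]:
    # S110 has already produced `description` (the user's problem description text).
    target_logs = locate_logs(description)                     # S120
    return {name: analyze_log(name) for name in target_logs}   # S130


def demo_locate(description: str) -> List[str]:
    # Hypothetical rule: a "flash" description points at the crash log.
    return ["crash.log"] if "flash" in description else []


def demo_analyze(log_name: str) -> str:
    return f"analysis of {log_name}"


result = process_problem("the screen flashes back", demo_locate, demo_analyze)
```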
Referring to fig. 2, the data processing method according to another embodiment of the present application may include steps S210 to S240.
Step S210: acquiring the problem description text input by the user.
Step S220: performing text recognition on the problem description text to obtain a text recognition result.
As one mode, when the electronic device obtains the problem description text input by the user, the electronic device may perform text recognition on the problem description text to obtain a text recognition result. The text recognition result may include a plurality of keywords, and may also include symbols such as emoticons and commas. After obtaining the text recognition result, the electronic device may perform a first classification operation based on the text recognition result, that is, proceed to step S230.
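A text recognition result that separates keywords from symbols can be sketched with two regular expressions. The dictionary layout and the use of word-character matching as a stand-in for keyword extraction are assumptions; the application does not describe the recognition algorithm.

```python
import re
from typing import Dict, List


def recognize(text: str) -> Dict[str, List[str]]:
    """Split a problem description into word-like keywords and leftover symbols."""
    keywords = re.findall(r"\w+", text.lower())
    # Anything that is neither a word character nor whitespace counts as a symbol.
    symbols = re.findall(r"[^\w\s]", text)
    return {"keywords": keywords, "symbols": symbols}


recognized = recognize("Camera stuck, again!")
```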
Step S230: executing a first classification operation to obtain a first classification result, and taking a log corresponding to the first classification result as the target log.
In this embodiment of the application, the first classification operation classifies the text recognition result by using a first classification network. The first classification operation may also be referred to as module classification, and the network that performs it may be referred to as a first classifier. The classification results obtained by the first classification operation fall into two classes: the first class can be further subjected to phenomenon classification, and the second class cannot. Here, phenomenon classification refers to the case where the cause of the abnormality needs to be analyzed further.
As an example, if the abnormality is attributed to an application A, it is necessary to further determine whether application A exhibits a flash back abnormality, a heat generation abnormality, or a stuck abnormality. As another example, for an audio problem, the specific problem cannot be determined further; once the abnormality is located as an audio abnormality, the electronic device has completed the automatic abnormality analysis, and subsequent handling can be carried out in a unified manner by professional users. Therefore, application A is a case where phenomenon classification can be performed, and the audio problem is a case where phenomenon classification cannot be performed.
In this embodiment, the first classification network may also be referred to as a module classifier or a first classifier, and when the text recognition result is obtained, the electronic device may input the text recognition result to the first classification network to obtain the first classification result. The result output by using the first classification network may include categories as shown in fig. 3, and it can be seen from fig. 3 that the output by using the first classifier includes a third party, a camera, a network, a system application, an internet application, audio, wifi and wireless connection, security and compliance, a screen, standby, dcs, notification and status bars, a face fingerprint, aging, bluetooth, charging, screen locking, photo album and video, split screen, CTS, logout maintenance, feedback, dump, sensor, nfc, multiple users, and screen recording.
Wherein, the third party marked with number 1 in fig. 3, the system application marked with number 2, the internet application marked with number 3, the notification and status bar marked with number 4 and the album and video marked with number 5 can execute the second classification operation, and the other classifications do not need to execute the second classification operation, i.e. the camera, network, audio, wifi and wireless connection, security and compliance, screen, standby, dcs (data return tool), face fingerprint, aging, bluetooth, charging, screen locking, screen splitting, CTS (android compatibility test), logout maintenance, feedback, dump, sensor, nfc (near field communication), multiple users and screen recording can be used as the target classification result.
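The split between categories that go on to the second classification operation and categories that are already target classification results can be expressed as a set membership test. The category strings below follow the names in the text; the function name is an assumption.

```python
# The five first-classification categories marked 1 to 5 in fig. 3.
NEEDS_SECOND_CLASSIFICATION = frozenset({
    "third party",
    "system application",
    "internet application",
    "notification and status bar",
    "photo album and video",
})


def is_target_classification_result(category: str) -> bool:
    """True when the category needs no second classification operation."""
    return category not in NEEDS_SECOND_CLASSIFICATION
```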
Referring to fig. 4, step S230 may include steps S231 to S234.
Step S231: executing a first classification operation to obtain a first classification result.
By one approach, when the electronic device performs the first classification operation and obtains the first classification result, it may determine whether the first classification result is the target classification result, i.e., proceed to step S232. In addition, the first classification result may include a plurality of first classification sub-results, each of which is different. When a plurality of first classification sub-results are obtained, the embodiment of the present application may also obtain the probability of each first classification sub-result, and then rank the plurality of first classification sub-results based on the probability.
Step S232: determining whether the first classification result is a target classification result.
In this embodiment of the application, the class corresponding to the target classification result is a class which cannot be subjected to the second classification operation, where the second classification operation may be used to perform a classification operation on the text recognition result by using a second classification network. As an example, a first classification operation is performed, the obtained first classification result is a third party, and since the third party does not belong to the target classification result, the classification operation may be continued by using the text recognition result of the second classification network to obtain a second classification result.
As one way, when it is determined that the first classification result is the target classification, the electronic device may take a log corresponding to the first classification result as the target log, that is, proceed to step S233. In addition, if it is determined that the first classification result is not the target classification result, a second classification operation may be performed to obtain a second classification result, and a log corresponding to the second classification result is taken as a target log, i.e., the process proceeds to step S234.
Step S233: taking the log corresponding to the first classification result as the target log.
In the embodiment of the application, each first classification result may correspond to one target log, and when the first classification result is obtained, the electronic device may obtain the target log corresponding to the first classification result in the log query list, so that the rate of obtaining the target log may be increased.
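As an illustrative sketch only, the log query list can be pictured as a lookup table from classification result to log. The table contents, the file names, and the function name `find_target_log` below are hypothetical; the embodiment does not specify how the list is stored.

```python
from typing import Dict, Optional

# Hypothetical log query list: classification result -> log file name.
LOG_QUERY_LIST: Dict[str, str] = {
    "category_a": "category_a.log",
    "category_b": "category_b.log",
}


def find_target_log(classification_result: str) -> Optional[str]:
    """Look up the target log for a classification result, or None if absent."""
    return LOG_QUERY_LIST.get(classification_result)


print(find_target_log("category_a"))
```

A constant-time table lookup of this kind is one way the rate of obtaining the target log could be increased compared with scanning all logs.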
Step S234: executing the second classification operation to obtain a second classification result, and taking the log corresponding to the second classification result as the target log.
In this embodiment, the second classification network may also be referred to as a phenomenon classifier or a second classifier. When it is determined that the first classification result is not the target classification result, the electronic device may input the text recognition result to the second classification network to obtain a second classification result. The categories output by the second classification network may be as shown in fig. 5: functional exception, UI exception, screen exception, stuck, no response, flash back crash, heat generation, red screen, power consumption, and crash restart.
In addition, the UI exception, screen exception, stuck, and no-response categories may each include a plurality of sub-classification results. Specifically, as shown in fig. 5, the UI exception may include 5 sub-classification results: inconsistent, font color background, display, edge shadow, and overlap. The screen exception may include 6 sub-classification results: splash screen, black screen, blur, white screen, splash screen, and blue screen. The stuck category may include 6 sub-classification results: interface stuck, function stuck, sliding stuck, video playing stuck, loading stuck, and touch-off stuck. The no-response category may include 5 sub-classification results, including click no-response, function no-response, popup no-response, and swipe no-response. The remaining categories, namely flash back crash, heat generation, red screen, power consumption, and crash restart, are not further divided.
By comparing the categories specifically included in the first classification result and the second classification result, it can be known that the second classification result and the first classification result belong to different categories. Moreover, the first classification network and the second classification network for obtaining the first classification result and the second classification result are different, and the training data for training the first classification network and the second classification network may also be different.
In some embodiments, performing the second classification operation to obtain the second classification result, and taking the log corresponding to the second classification result as the target log may include: and executing a second classification operation to obtain a second classification result, and determining whether the second classification result comprises a plurality of sub-classification results on the basis. And if the second classification result does not comprise a plurality of sub-classification results, taking the log corresponding to the second classification result as a target log.
In addition, if the second classification result includes a plurality of sub-classification results, a target sub-classification result corresponding to the text recognition result may be obtained from the plurality of sub-classification results based on the third classification operation, and then a log corresponding to the target sub-classification result may be used as the target log. Wherein the network performing the third classification operation may be a third classification network, which may be referred to as a third classifier, and the electronic device may include a plurality of the third classification networks or the third classifiers. When the second classification result is obtained and it is determined that the second classification result includes a plurality of sub-classification results, the electronic device may obtain a third classifier corresponding to the second classification result, and then input the text recognition result to the third classifier, so as to obtain the target sub-classification result through the third classifier.
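The cascade of the first, second, and third classification operations can be sketched as follows. This is a minimal illustration under stated assumptions: each classifier is modeled as a plain function from text to a label, the stub classifiers and the label "hypothetical target" are invented for the example, and the function name `select_log_category` is not taken from the application.

```python
from typing import Callable, Dict, Set

Classifier = Callable[[str], str]


def select_log_category(
    text: str,
    first: Classifier,
    second: Classifier,
    third_by_category: Dict[str, Classifier],
    target_results: Set[str],
) -> str:
    """Return the classification result whose log is used as the target log."""
    first_result = first(text)
    if first_result in target_results:
        # Step S233: a target classification result is not refined further.
        return first_result
    # Step S234: refine with the second classifier.
    second_result = second(text)
    third = third_by_category.get(second_result)
    if third is None:
        # The second result has no sub-classification results.
        return second_result
    # The third classifier selects the target sub-classification result.
    return third(text)


# Hypothetical stub classifiers standing in for trained networks.
def first_stub(text: str) -> str:
    return "third party"


def second_stub(text: str) -> str:
    return "no response" if "tap" in text else "heat generation"


def third_no_response(text: str) -> str:
    return "click no-response"


thirds = {"no response": third_no_response}
targets = {"hypothetical target"}
print(select_log_category("phone gets hot", first_stub, second_stub, thirds, targets))
print(select_log_category("tap does nothing", first_stub, second_stub, thirds, targets))
```

Keeping one third classifier per second-level category, as in the `third_by_category` mapping, mirrors the statement that the electronic device may include a plurality of third classifiers.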
To make the embodiment of the present application easier to understand, an example is given. When the electronic device obtains the problem description text input by the user, it may perform text recognition on the problem description text to obtain a text recognition result. The text recognition result is then input to the first classifier, and the obtained first classification results are "third party" and "internet application". As can be seen from fig. 3, "third party" and "internet application" do not belong to the target classification result, so the text recognition result can be input to the second classifier, and the obtained second classification results are "heat generation" and "power consumption". As can be seen from fig. 4, "heat generation" and "power consumption" do not include a plurality of sub-classification results, so the finally obtained classification results are "heat generation" and "power consumption".
In some embodiments, when it is determined that the first classification result is not the target classification result and the second classification result includes a plurality of sub-classification results, the embodiment of the present application may determine the target log by combining the first classification result, the second classification result, and the target sub-classification result. Similarly, when different classification results are obtained, the electronic device may obtain the target log according to those different classification results.
In the embodiment of the application, the first classification network, the second classification network, and the third classification network may be semi-supervised learning networks. During training, part of the training data may be labeled first, and each classification network is then trained iteratively so that its classification effect is continuously optimized. On this basis, the real labels of new data are verified against the prediction results, so that the labeling of the training data set and the training of the networks are completed together.
As one way, the candidate algorithms for obtaining the first, second, and third classification networks mainly include LinearSVC, LR (logistic regression), naive Bayes, RF (random forest), LightGBM, LSTM, RandomForestClassifier, MultinomialNB, LogisticRegression, and the like. Because ensemble models and deep models are time-consuming and large, which is not conducive to final deployment, the embodiment of the application may obtain each classification network by using LinearSVC.
To show the advantages of LinearSVC more clearly, the embodiment of the present application provides a schematic diagram comparing the accuracy of the LinearSVC, RandomForestClassifier, MultinomialNB, and LogisticRegression algorithms, as shown in fig. 6. It can be seen from fig. 6 that the accuracy of RandomForestClassifier and MultinomialNB is lower than that of LinearSVC and LogisticRegression. Moreover, the LogisticRegression model is large and not conducive to deployment, so LinearSVC is finally selected to obtain each classification network in the embodiment of the application.
Step S240: performing problem analysis on the target log to obtain a log analysis result corresponding to the target log.
As a manner, when a target log is obtained, the embodiment of the present application may determine a problem analyzer corresponding to the target log, and then analyze the target log by using the problem analyzer to obtain a log analysis result corresponding to the target log. Therefore, when the target logs are determined to be different, the corresponding analyzers may be different, so that the flexibility and the effectiveness of log analysis can be improved.
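The selection of a problem analyzer per target log can be sketched as a simple dispatch table. This is an assumption-laden illustration: the log types, the analyzer functions `analyze_thermal` and `analyze_generic`, and the "overheat" keyword are all hypothetical and only show how different logs could be routed to different analyzers.

```python
from typing import Callable, Dict

Analyzer = Callable[[str], str]


def analyze_thermal(log_text: str) -> str:
    """Hypothetical analyzer: count lines that mention an overheat event."""
    hot_lines = [line for line in log_text.splitlines() if "overheat" in line]
    return f"thermal: {len(hot_lines)} overheat event(s)"


def analyze_generic(log_text: str) -> str:
    """Hypothetical fallback analyzer used when no specific analyzer exists."""
    return f"generic: {len(log_text.splitlines())} line(s) scanned"


# Hypothetical registry: log type -> problem analyzer.
ANALYZERS: Dict[str, Analyzer] = {"thermal": analyze_thermal}


def analyze_target_log(log_type: str, log_text: str) -> str:
    """Pick the analyzer registered for the log type and run it."""
    analyzer = ANALYZERS.get(log_type, analyze_generic)
    return analyzer(log_text)


print(analyze_target_log("thermal", "ok\noverheat at 45C\noverheat at 47C"))
```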
According to the data processing method provided by the embodiment of the application, when the problem description text is obtained, the problem analysis rate can be increased by determining the target log by using the problem description text. Specifically, the problem description text input by the user can be obtained first, on the basis, the target log is determined based on the problem description text, wherein the target log is an abnormal log file related to the problem description text, and finally, the target log can be subjected to problem analysis to obtain a log analysis result corresponding to the target log. According to the method and the device, the target log is determined by utilizing the problem description text, and the target log is analyzed, so that the acquisition rate of the target log can be increased to a certain extent, and the efficiency of log analysis can be improved. In addition, the target log is comprehensively acquired by combining the first classification operation and the second classification operation, so that the acquisition rate of the target log can be improved, and the accuracy of data analysis can be improved.
Referring to fig. 7, the data processing method may include steps S310 to S340.
Step S310: acquiring the problem description text input by the user.
Step S320: performing text recognition on the problem description text to obtain a text recognition result.
The above embodiments of steps S310 to S320 have been described in detail, and are not described in detail here.
Step S330: executing a first classification operation to obtain a first classification result, and taking a log corresponding to the first classification result as the target log.
Referring to fig. 7, step S330 may include steps S331 to S333.
Step S331: performing word segmentation on the text recognition result to obtain a plurality of word segmentation results.
As one way, after performing text recognition on the problem description text and obtaining a text recognition result, the electronic device may preprocess the text recognition result. This is mainly because the problem description text input by the user describes various problems, usually contains a lot of useless content, and is often inconsistently formatted. Therefore, when the text recognition result is obtained, the electronic device can preprocess it to remove useless keywords and adjust its format, so that the text recognition result is more standardized. In addition, after preprocessing the text recognition result, the electronic device can perform word segmentation on it to obtain a plurality of word segmentation results. The embodiment of the application may adopt a Chinese word segmentation method mainly based on pkuseg.
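A minimal sketch of the preprocessing step is given below, using only the Python standard library. The specific cleanup rules (dropping links and punctuation, collapsing whitespace, lowercasing) are assumptions chosen for illustration, and the whitespace split in `segment` is merely a stand-in for a Chinese word segmenter such as pkuseg, whose API is deliberately not shown here.

```python
import re
from typing import List


def preprocess(text: str) -> str:
    """Normalize a problem description before word segmentation."""
    text = re.sub(r"https?://\S+", " ", text)  # drop links
    text = re.sub(r"[^\w\s]", " ", text)       # drop punctuation and symbols
    text = re.sub(r"\s+", " ", text)           # collapse runs of whitespace
    return text.strip().lower()


def segment(text: str) -> List[str]:
    """Placeholder tokenizer: split the preprocessed text on whitespace."""
    return preprocess(text).split()


raw = "  My PHONE!!!  screen\tis black... see http://example.com/x "
print(preprocess(raw))
print(segment(raw))
```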
Step S332: selecting a target word segmentation result from the plurality of word segmentation results.
In other embodiments, after obtaining the plurality of word segmentation results, the electronic device may select a target word segmentation result from them. In this process, the electronic device may remove redundant words by using stop words, where a stop word is a word that has no influence on the problem analysis. Therefore, to improve the efficiency of data processing, after performing word segmentation on the text recognition result, the embodiment of the present application may search a stop-word data table for stop words that match any of the word segmentation results. If a match exists, the matching word segmentation result is a redundant word and can be removed directly from the plurality of word segmentation results, and the remaining word segmentation results can be used as target word segmentation results.
As an example, the word segmentation results include "mobile phone", "yes", "screen", "abnormal", "black screen", and "233". By comparison with the stop-word data table, the word segmentation results "yes" and "233" are determined to be stop words, so these two word segmentation results can be removed, and the remaining word segmentation results, such as "mobile phone", "abnormal", and "black screen", can be used as the target word segmentation results.
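The stop-word filtering in this example can be sketched as follows. The function name `select_target_tokens` is hypothetical, and the stop-word data table is modeled as a Python set purely for illustration.

```python
from typing import Iterable, List, Set


def select_target_tokens(tokens: Iterable[str], stop_words: Set[str]) -> List[str]:
    """Keep only the word segmentation results that are not stop words."""
    return [token for token in tokens if token not in stop_words]


tokens = ["mobile phone", "yes", "screen", "abnormal", "black screen", "233"]
stop_words = {"yes", "233"}
target_tokens = select_target_tokens(tokens, stop_words)
print(target_tokens)
```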
Step S333: classifying the target word segmentation result by utilizing a first classification network to obtain a first classification result, and taking a log corresponding to the first classification result as the target log.
In this embodiment of the application, classifying the target word segmentation result by using the first classification network to obtain the first classification result may include: vectorizing the target word segmentation result to obtain a vector result, classifying the vector result by using the first classification network to obtain the first classification result, and taking the log corresponding to the first classification result as the target log. In other words, to better obtain the target log, the electronic device may vectorize the target word segmentation result once it is obtained, so that the target word segmentation result is converted into a numerical form that the classification network can process.
Optionally, when vectorizing the target word segmentation result, the embodiment of the present application may adopt a bag-of-words model, TF-IDF (term frequency-inverse document frequency), word2vec combined with TF-IDF, and other methods. Because problem description texts are mainly short texts and lack the complex logical relationships between words that are common in long texts, word2vec, which focuses on the context of words, performs only moderately, whereas TF-IDF, which focuses on the importance of individual words, performs well. Therefore, the TF-IDF method is preferred when vectorizing the target word segmentation result.
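To illustrate TF-IDF vectorization, the following sketch computes the plain textbook form, tf(t, d) × log(N / df(t)), over tokenized texts. It is an assumption that this unsmoothed variant is adequate for illustration; the application does not state which variant is used, and library implementations typically apply smoothing and normalization. The function name `tf_idf` and the sample tokens are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List


def tf_idf(documents: List[List[str]]) -> List[Dict[str, float]]:
    """Compute unsmoothed TF-IDF weights for each tokenized document."""
    n_docs = len(documents)
    doc_freq: Counter = Counter()
    for tokens in documents:
        doc_freq.update(set(tokens))  # count each term once per document
    vectors = []
    for tokens in documents:
        counts = Counter(tokens)
        total = len(tokens)
        vectors.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


docs = [["screen", "black"], ["screen", "hot"]]
vectors = tf_idf(docs)
print(vectors[0])
```

In this example "screen" occurs in every text and therefore receives zero weight, while "black" and "hot" are weighted as distinguishing words, which matches the observation that TF-IDF emphasizes the words that carry meaning in short texts.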
Step S340: performing problem analysis on the target log to obtain a log analysis result corresponding to the target log.
According to the data processing method provided by the embodiment of the application, when the problem description text is obtained, the problem analysis rate can be increased by determining the target log by using the problem description text. Specifically, the problem description text input by the user can be obtained first, on the basis, the target log is determined based on the problem description text, wherein the target log is an abnormal log file related to the problem description text, and finally, the target log can be subjected to problem analysis to obtain a log analysis result corresponding to the target log. According to the method and the device, the target log is determined by utilizing the problem description text, and the target log is analyzed, so that the acquisition rate of the target log can be increased to a certain extent, and the efficiency of log analysis can be improved. In addition, the embodiment of the application can improve the quality of data processing by carrying out operations such as preprocessing on the problem description text, and meanwhile, the universality of log analysis can be ensured.
Referring to fig. 9, an embodiment of the present application provides a data processing apparatus 400. In a specific embodiment, the data processing apparatus 400 includes: an acquisition module 410, a determining module 420, and an analysis module 430.
The acquisition module 410 is configured to acquire a problem description text input by a user.
The determining module 420 is configured to determine a target log based on the problem description text, where the target log is an abnormal log file related to the problem description text.
Further, the determining module 420 is further configured to perform text recognition on the problem description text to obtain a text recognition result; and executing a first classification operation to obtain a first classification result, and taking a log corresponding to the first classification result as the target log, wherein the first classification operation is used for performing classification operation on the text recognition result by using a first classification network.
Further, the determining module 420 is further configured to perform a first classification operation to obtain a first classification result; determining whether the first classification result is a target classification result, wherein the class corresponding to the target classification result is a class which cannot be subjected to a second classification operation, and the second classification operation is used for performing classification operation on the text recognition result by utilizing a second classification network; and if the first classification result is a target classification result, taking the log corresponding to the first classification result as a target log.
Further, the determining module 420 is further configured to, if the first classification result is not the target classification result, execute the second classification operation to obtain a second classification result, and use a log corresponding to the second classification result as the target log, where the second classification result and the first classification result belong to different categories.
Further, the determining module 420 is further configured to perform a second classification operation to obtain a second classification result; determining whether the second classification result includes a plurality of sub-classification results; and if the second classification result does not comprise a plurality of sub-classification results, taking the log corresponding to the second classification result as the target log.
Further, the determining module 420 is further configured to, if the second classification result includes multiple sub-classification results, obtain a target sub-classification result corresponding to the text recognition result from the multiple sub-classification results based on a third classification operation, and use a log corresponding to the target sub-classification result as a target log.
Further, the determining module 420 is further configured to perform word segmentation on the text recognition result to obtain a plurality of word segmentation results; selecting a target word segmentation result from the plurality of word segmentation results; and classifying the target word segmentation result by utilizing a first classification network to obtain a first classification result, and taking a log corresponding to the first classification result as the target log.
Further, the determining module 420 is further configured to perform vectorization on the target word segmentation result to obtain a vector result; and classifying the vector result by using the first classification network to obtain a first classification result, and taking a log corresponding to the first classification result as a target log.
The analysis module 430 is configured to perform problem analysis on the target log to obtain a log analysis result corresponding to the target log.
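As a non-limiting sketch, the cooperation of the three modules can be modeled as a small pipeline. The class name `DataProcessingApparatus`, the lambda stubs, and the log file names are hypothetical; each module is represented by a plain callable rather than by any particular hardware or software implementation.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class DataProcessingApparatus:
    acquire: Callable[[], str]       # stands in for acquisition module 410
    determine: Callable[[str], str]  # stands in for determining module 420
    analyze: Callable[[str], str]    # stands in for analysis module 430

    def run(self) -> str:
        """Acquire the description, determine the target log, then analyze it."""
        description = self.acquire()
        target_log = self.determine(description)
        return self.analyze(target_log)


# Hypothetical stubs wired into the pipeline.
apparatus = DataProcessingApparatus(
    acquire=lambda: "screen goes black",
    determine=lambda text: "display.log" if "screen" in text else "system.log",
    analyze=lambda log: f"analyzed {log}",
)
result = apparatus.run()
print(result)
```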
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In addition, functional modules in the embodiments of the present application may be integrated into one processing module, or each of the modules may exist alone physically, or two or more modules are integrated into one module. The integrated module can be realized in a hardware mode, and can also be realized in a software functional module mode.
According to the data processing device provided by the embodiment of the application, when the problem description text is obtained, the problem analysis rate can be increased by determining the target log by using the problem description text. Specifically, the problem description text input by the user can be obtained first, on the basis, the target log is determined based on the problem description text, wherein the target log is an abnormal log file related to the problem description text, and finally, the target log can be subjected to problem analysis to obtain a log analysis result corresponding to the target log. According to the method and the device, the target log is determined by utilizing the problem description text, and the target log is analyzed, so that the acquisition rate of the target log can be increased to a certain extent, and the efficiency of log analysis can be improved.
Referring to fig. 10, a block diagram of an electronic device 500 according to an embodiment of the present disclosure is shown. The electronic device 500 may be a smart phone, a tablet computer, an electronic book, or another electronic device capable of running an application. The electronic device 500 in the present application may include one or more of the following components: a processor 510, a memory 520, and one or more applications, wherein the one or more applications may be stored in the memory 520 and configured to be executed by the one or more processors 510, the one or more applications being configured to perform the method described in the foregoing method embodiments.
Processor 510 may include one or more processing cores. The processor 510 connects the various components of the electronic device 500 through various interfaces and circuitry, and performs the various functions of the electronic device 500 and processes data by running or executing instructions, programs, code sets, or instruction sets stored in the memory 520 and invoking data stored in the memory 520. Optionally, the processor 510 may be implemented in hardware using at least one of Digital Signal Processing (DSP), Field-Programmable Gate Array (FPGA), and Programmable Logic Array (PLA). The processor 510 may integrate one or a combination of a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), a modem, and the like. The CPU mainly handles the operating system, the user interface, application programs, and the like; the GPU is used for rendering and drawing display content; the modem is used to handle wireless communications. It is understood that the modem may not be integrated into the processor 510, but may instead be implemented by a separate communication chip.
The memory 520 may include a Random Access Memory (RAM) or a Read-Only Memory (ROM). The memory 520 may be used to store instructions, programs, code sets, or instruction sets. The memory 520 may include a program storage area and a data storage area, wherein the program storage area may store instructions for implementing an operating system, instructions for implementing at least one function (such as a touch function, a sound playing function, an image playing function, etc.), instructions for implementing the foregoing method embodiments, and the like. The data storage area may store data created by the electronic device 500 during use (e.g., phone books, audio-visual data, chat log data), and so forth.
Referring to fig. 11, a block diagram of a computer-readable storage medium 600 according to an embodiment of the present application is shown. The computer-readable storage medium 600 has stored therein program code that can be called by a processor to execute the method described in the above-described method embodiments.
The computer-readable storage medium 600 may be an electronic memory such as a flash memory, an EEPROM (electrically erasable programmable read-only memory), an EPROM, a hard disk, or a ROM. Optionally, the computer-readable storage medium 600 includes a non-volatile computer-readable storage medium. The computer-readable storage medium 600 has storage space for program code 610 for performing any of the method steps in the above-described method embodiments. The program code can be read from or written to one or more computer program products. The program code 610 may, for example, be compressed in a suitable form.
Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of the present application, and not to limit them. Although the present application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art will understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features may be equivalently replaced, and that such modifications and substitutions do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions in the embodiments of the present application.

Claims (11)

1. A method of data processing, the method comprising:
acquiring a problem description text input by a user;
determining a target log based on the problem description text, wherein the target log is an abnormal log file related to the problem description text;
and performing problem analysis on the target log to obtain a log analysis result corresponding to the target log.
2. The method of claim 1, wherein determining a target log based on the problem description text comprises:
performing text recognition on the problem description text to obtain a text recognition result;
and executing a first classification operation to obtain a first classification result, and taking a log corresponding to the first classification result as the target log, wherein the first classification operation is used for performing classification operation on the text recognition result by using a first classification network.
3. The method according to claim 2, wherein the performing the first classification operation to obtain a first classification result and using a log corresponding to the first classification result as the target log comprises:
executing a first classification operation to obtain a first classification result;
determining whether the first classification result is a target classification result, wherein the class corresponding to the target classification result is a class which cannot be subjected to a second classification operation, and the second classification operation is used for performing classification operation on the text recognition result by utilizing a second classification network;
and if the first classification result is a target classification result, taking the log corresponding to the first classification result as a target log.
4. The method of claim 3, further comprising:
and if the first classification result is not the target classification result, executing the second classification operation to obtain a second classification result, and taking a log corresponding to the second classification result as the target log, wherein the second classification result and the first classification result belong to different categories.
5. The method according to claim 4, wherein the performing the second classification operation to obtain a second classification result, and taking a log corresponding to the second classification result as the target log comprises:
executing a second classification operation to obtain a second classification result;
determining whether the second classification result includes a plurality of sub-classification results;
and if the second classification result does not comprise a plurality of sub-classification results, taking the log corresponding to the second classification result as the target log.
6. The method of claim 5, further comprising:
and if the second classification result comprises a plurality of sub-classification results, acquiring a target sub-classification result corresponding to the text recognition result from the plurality of sub-classification results based on a third classification operation, and taking a log corresponding to the target sub-classification result as a target log.
7. The method according to any one of claims 2 to 6, wherein the performing the first classification operation to obtain a first classification result and using a log corresponding to the first classification result as the target log comprises:
performing word segmentation on the text recognition result to obtain a plurality of word segmentation results;
selecting a target word segmentation result from the plurality of word segmentation results;
and classifying the target word segmentation result by utilizing a first classification network to obtain a first classification result, and taking a log corresponding to the first classification result as the target log.
8. The method of claim 7, wherein the classifying the target word segmentation result using a first classification network to obtain the first classification result comprises:
vectorizing the target word segmentation result to obtain a vector result;
and classifying the vector result by using the first classification network to obtain a first classification result, and taking a log corresponding to the first classification result as a target log.
9. A data processing apparatus, characterized in that the apparatus comprises:
the acquisition module is used for acquiring a problem description text input by a user;
a determining module, configured to determine a target log based on the problem description text, where the target log is an abnormal log file related to the problem description text;
and the analysis module is used for carrying out problem analysis on the target log to obtain a log analysis result corresponding to the target log.
10. An electronic device, comprising:
one or more processors;
a memory;
one or more applications, wherein the one or more applications are stored in the memory and configured to be executed by the one or more processors, the one or more applications configured to perform the method of any of claims 1-8.
11. A computer-readable storage medium, having stored thereon program code that can be invoked by a processor to perform the method according to any one of claims 1 to 8.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111101705.9A CN113849474A (en) 2021-09-18 2021-09-18 Data processing method and device, electronic equipment and readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111101705.9A CN113849474A (en) 2021-09-18 2021-09-18 Data processing method and device, electronic equipment and readable storage medium

Publications (1)

Publication Number Publication Date
CN113849474A true CN113849474A (en) 2021-12-28

Family

ID=78974692

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111101705.9A Pending CN113849474A (en) 2021-09-18 2021-09-18 Data processing method and device, electronic equipment and readable storage medium

Country Status (1)

Country Link
CN (1) CN113849474A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114531340A (en) * 2022-02-17 2022-05-24 Oppo广东移动通信有限公司 Log acquisition method and device, electronic equipment, chip and storage medium
CN114531340B (en) * 2022-02-17 2023-10-13 Oppo广东移动通信有限公司 Log acquisition method and device, electronic equipment, chip and storage medium
CN115118582A (en) * 2022-06-15 2022-09-27 合肥移瑞通信技术有限公司 Log analysis method and device
CN115118582B (en) * 2022-06-15 2024-04-16 合肥移瑞通信技术有限公司 Log analysis method and device
CN116204266A (en) * 2023-05-04 2023-06-02 深圳市联合信息技术有限公司 Remote assisted information creation operation and maintenance system and method thereof

Similar Documents

Publication Publication Date Title
US11934789B2 (en) Artificial intelligence augmented document capture and processing systems and methods
US20190287142A1 (en) Method, apparatus for evaluating review, device and storage medium
US9792534B2 (en) Semantic natural language vector space
US11514235B2 (en) Information extraction from open-ended schema-less tables
CN113849474A (en) Data processing method and device, electronic equipment and readable storage medium
CN111680159B (en) Data processing method and device and electronic equipment
CN110444198B (en) Retrieval method, retrieval device, computer equipment and storage medium
JP2017534941A (en) Orphan utterance detection system and method
CN110597952A (en) Information processing method, server, and computer storage medium
US10108698B2 (en) Common data repository for improving transactional efficiencies of user interactions with a computing device
CN112084334B (en) Label classification method and device for corpus, computer equipment and storage medium
CN109933782B (en) User emotion prediction method and device
US11416539B2 (en) Media selection based on content topic and sentiment
CN112347760A (en) Method and device for training intention recognition model and method and device for recognizing intention
CN109634436B (en) Method, device, equipment and readable storage medium for associating input method
CN114218958A (en) Work order processing method, device, equipment and storage medium
CN110909768B (en) Method and device for acquiring marked data
US11561964B2 (en) Intelligent reading support
US11243916B2 (en) Autonomous redundancy mitigation in knowledge-sharing features of a collaborative work tool
US11373041B2 (en) Text classification using models with complementary granularity and accuracy
US20230351121A1 (en) Method and system for generating conversation flows
US20220207066A1 (en) System and method for self-generated entity-specific bot
CN114780757A (en) Short media label extraction method and device, computer equipment and storage medium
CN114492306A (en) Corpus labeling method and device, electronic equipment and storage medium
CN113255368A (en) Method and device for emotion analysis of text data and related equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination