CN111367971A - Financial system abnormity auxiliary analysis method and device based on data mining - Google Patents

Info

Publication number
CN111367971A
Authority
CN
China
Prior art keywords
keyword
abnormal
vector
frequency
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010234684.7A
Other languages
Chinese (zh)
Inventor
郭敏
阳骁尧
张爱华
朱宇峰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
CCB Finetech Co Ltd
Original Assignee
China Construction Bank Corp
CCB Finetech Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Construction Bank Corp, CCB Finetech Co Ltd filed Critical China Construction Bank Corp
Priority to CN202010234684.7A priority Critical patent/CN111367971A/en
Publication of CN111367971A publication Critical patent/CN111367971A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/245Query processing
    • G06F16/2458Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2465Query processing support for facilitating data mining operations in structured databases
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22Indexing; Data structures therefor; Storage structures
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/23Updating

Abstract

The invention provides a financial system abnormity auxiliary analysis method and device based on data mining, comprising the following steps: obtaining abnormal keywords from the obtained system abnormal records, and generating abnormal keyword frequency vectors according to the abnormal keywords and a pre-established keyword vector model; calculating according to the abnormal keyword frequency vector and the keyword frequency vector of each historical abnormal information in a pre-established keyword frequency model to obtain a cosine similarity vector; and screening historical abnormal information similar to the system abnormal record according to the cosine similarity vector, and outputting classification, reason and positioning information of the historical abnormal information as an auxiliary analysis result of the abnormality. According to the method and the system, the abnormity of the system is analyzed in an auxiliary mode by establishing the abnormal keyword frequency model, the labor cost can be effectively saved, historical abnormal information can be fully utilized, and the probability of analysis errors can be reduced.

Description

Financial system abnormity auxiliary analysis method and device based on data mining
Technical Field
The application belongs to the technical field of financial system operation and maintenance, and particularly relates to a financial system abnormity auxiliary analysis method and device based on data mining.
Background
In computer network services in the financial field, business faults caused by abnormal operation of a financial system can cause economic losses of varying degrees to customers and severely degrade the user experience, so rapid emergency handling of abnormal conditions in the financial system is particularly important. At present, system exception handling relies mainly on operation and maintenance personnel, who collect a series of exception information, analyze it manually, and then handle the exception as quickly as they can. The experience of the operation and maintenance personnel and the speed of their analysis therefore determine the range and degree to which financial business is affected by a financial system abnormity. Existing financial system exception handling relies mainly on manual work and has the following defects:
1. Financial system abnormity analysis requires operation and maintenance personnel with rich experience, and the economic and time costs of training such all-round personnel are high. Meanwhile, a large amount of historical system abnormity analysis information is not fully and effectively used, which is a huge waste of accumulated knowledge.
2. The whole process of financial system abnormity analysis depends on manual work, so it is time-consuming and slow to respond.
Disclosure of Invention
The application provides a financial system abnormity auxiliary analysis method and device based on data mining, and aims to at least solve the problem that existing financial system abnormity analysis is highly dependent on manual work.
According to one aspect of the application, a financial system abnormity auxiliary analysis method based on data mining is provided, and comprises the following steps:
obtaining abnormal keywords from the obtained system abnormal records, and generating abnormal keyword frequency vectors according to the abnormal keywords and a pre-established keyword vector model;
obtaining a cosine similarity vector according to the abnormal keyword frequency vector and the keyword frequency vector of each historical abnormal information in a pre-established keyword frequency model;
and screening historical abnormal information similar to the system abnormal record according to the cosine similarity vector, and outputting classification, reason and positioning information of the historical abnormal information as an auxiliary analysis result of the abnormality.
In one embodiment, obtaining an abnormal keyword from the obtained system abnormal record, and generating an abnormal keyword frequency vector according to the abnormal keyword and a pre-established keyword vector model, includes:
segmenting the text sequence of the system abnormal record into abnormal keywords;
and counting the occurrence frequency of each keyword in the dimension of the keyword vector model according to the abnormal keywords and the keyword vector model to obtain a frequency vector of the abnormal keywords.
In an embodiment, obtaining a cosine similarity vector according to the abnormal keyword frequency vector and the keyword frequency vector of each historical abnormal information in the pre-established keyword frequency model includes:
generating cosine similarity according to the abnormal keyword frequency vector and the keyword frequency vector of each historical abnormal information in a pre-established keyword frequency model;
and constructing a cosine similarity vector according to the cosine similarity.
In one embodiment, the screening of historical anomaly information similar to the system anomaly record according to the cosine similarity vector includes:
and screening out, according to a preset similarity threshold, the historical abnormal information whose cosine similarity in the cosine similarity vector is higher than the similarity threshold.
In one embodiment, a method for constructing a keyword vector model and a keyword frequency model includes the following steps:
obtaining keywords from the obtained historical abnormal information, and counting the frequency of each keyword;
screening the keywords according to a preset frequency threshold value to generate a keyword vector model;
and generating a keyword frequency vector according to the keyword vector model and the frequency corresponding to the keyword, wherein the keyword frequency model consists of the keyword frequency vector.
In one embodiment, the method further comprises:
generating an abnormal record after the system exception processing is finished, and acquiring an abnormal record keyword from the abnormal record;
and generating a new keyword vector model by using the abnormal record keywords, and updating the keyword frequency model by using the new keyword vector model.
According to another aspect of the present application, there is also provided a financial system abnormality auxiliary analysis apparatus based on data mining, including:
an abnormal keyword frequency vector generating unit, configured to obtain an abnormal keyword from the obtained system abnormal record, and generate an abnormal keyword frequency vector according to the abnormal keyword and a pre-established keyword vector model;
a cosine similarity vector obtaining unit, configured to obtain a cosine similarity vector according to the abnormal keyword frequency vector and the keyword frequency vector of each historical abnormal information in the pre-established keyword frequency model;
and the auxiliary analysis result output unit is used for screening historical abnormal information similar to the system abnormal record according to the cosine similarity vector and outputting the classification, reason and positioning information of the historical abnormal information as the result of the auxiliary analysis of the abnormality.
In one embodiment, the abnormal keyword frequency vector generating unit includes:
the abnormal keyword segmentation module is used for segmenting the text sequence recorded by the system abnormity into abnormal keywords;
and the frequency vector generation module is used for counting the occurrence frequency of each keyword in the dimension of the model in the abnormal keywords and obtaining the frequency vector of the abnormal keywords.
In one embodiment, the cosine similarity vector obtaining unit includes:
the cosine similarity generating module is used for generating cosine similarity according to the abnormal keyword frequency vector and the keyword frequency vector of each historical abnormal information in the pre-established keyword frequency model;
and the cosine similarity vector constructing module is used for constructing a cosine similarity vector according to the cosine similarity.
In one embodiment, the auxiliary analysis result output unit includes:
and the historical anomaly screening module is used for screening out, according to the preset similarity threshold, the historical abnormal information whose cosine similarity in the cosine similarity vector is higher than the similarity threshold.
In one embodiment, a method for constructing a keyword vector model and a keyword frequency model includes the following steps:
obtaining keywords from the obtained historical abnormal information, and counting the frequency of each keyword;
screening the keywords according to a preset frequency threshold value to generate a keyword vector model;
and generating a keyword frequency vector according to the keyword vector model and the frequency corresponding to the keyword, wherein the keyword frequency model consists of the keyword frequency vector.
In one embodiment, the apparatus further comprises:
the abnormal keyword acquisition module is used for generating an abnormal record after the system exception processing is finished and acquiring an abnormal record keyword from the abnormal record;
and the updating module is used for generating a new keyword vector model by using the abnormal record keywords and updating the keyword frequency model by using the new keyword vector model.
According to the method, an abnormal keyword frequency model is constructed by mining historical abnormity analysis data of the financial system. After a system abnormity occurs, the model automatically matches similar historical abnormal information and its solutions, which helps operation and maintenance personnel quickly locate a solution and reduces the business impact of the abnormity.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art are briefly described below. The drawings in the following description show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flowchart of an auxiliary financial system anomaly analysis method based on data mining according to the present application.
Fig. 2 is a flowchart of generating an abnormal keyword frequency vector according to an abnormal keyword in the embodiment of the present application.
Fig. 3 is a diagram illustrating an abnormal keyword frequency vector in the embodiment of the present application.
Fig. 4 is a flowchart of a method for obtaining a cosine similarity vector according to an embodiment of the present application.
Fig. 5 is a flowchart of a method for constructing a keyword frequency model in an embodiment of the present application.
Fig. 6 is a schematic diagram of a keyword frequency model in an embodiment of the present application.
Fig. 7 is a flowchart of an update procedure in an embodiment of the present application.
Fig. 8 is a block diagram of a financial system abnormality auxiliary analysis device based on data mining according to the present application.
Fig. 9 is a block diagram showing a structure of an abnormal keyword frequency vector generating unit in the embodiment of the present application.
Fig. 10 is a block diagram of a cosine similarity vector obtaining unit in the embodiment of the present application.
Fig. 11 is a specific implementation of an electronic device in an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
During online operation of the financial system, operation and maintenance personnel discover financial system abnormities mainly in three ways: monitoring alarms, telephone fault reports, and active inspection. At present, financial system abnormities are handled mainly by hand, and the specific flow is as follows:
1. Obtain the information text attached to the system abnormity, such as CPU alarms, memory alarms, and transaction errors, from the various channels.
2. Analyze the abnormal information and locate the cause based on the operation and maintenance personnel's own experience.
3. The operation and maintenance personnel draw up and carry out an exception handling plan, such as adjusting parameters or restarting the application.
4. The operation and maintenance personnel create an abnormal record form and register the abnormal record, cause analysis, problem location, and handling information.
Because financial system abnormities are so varied, structured abnormal information data has great reference value both for preventing abnormities beforehand and for handling them afterwards. Analysis of the historical abnormal information of financial systems shows that most routine abnormal conditions, such as SQL full-table scans and unreasonable parameter settings, have precedents that can be looked up. However, current financial system abnormity analysis lacks mining of structured historical system abnormity data, so a large amount of similar historical analysis information goes unused.
Based on the above problem, the present application provides a financial system anomaly auxiliary analysis method based on data mining, as shown in fig. 1, including:
s101: and obtaining abnormal keywords from the obtained system abnormal records, and generating an abnormal keyword frequency vector according to the abnormal keywords and a pre-established keyword vector model.
In one embodiment, when the system is in an abnormal condition, a system abnormal record (text format) is generated, and the system abnormal record is segmented into abnormal keywords [ K ] by using a python own nubby library (jieba) accurate word segmentation technology11,…,K1a,…,Kn1,…,Knb]. And counting the frequency of the abnormal keywords and generating an abnormal keyword frequency vector F according to the vector dimension in a pre-established keyword vector modelTo. The method for establishing the keyword vector model comprises the following steps of: obtaining multiple historical system exception records T1,…,Ti]Extracting the keyword T of each abnormal record of the historical system by utilizing a jieba library word segmentation technology1->[K11,…,K1a],…,Tn->[Kn1,…,Knb]Counting all the keywords extracted from the abnormal records of the historical system and removing the duplication to form a total keyword set K11,…,Knb]And counting the frequency [ F ] of each keyword in the total keyword setK1,…,FKnb]And setting a frequency threshold value to screen high-frequency keywords to form a final keyword vector [ K1,K2,…,Km]The final keyword vector is a keyword vector model, the keyword vector model has m vector dimensions, and the vector dimension of the generated abnormal keyword frequency vector is m and is consistent with the keyword vector model by taking the m vector dimensions as a standard.
S102: and obtaining a cosine similarity vector according to the abnormal keyword frequency vector and the keyword frequency vector of each historical abnormal information in the pre-established keyword frequency model.
In one embodiment, the cosine similarity $S_{T_i T_o} = \cos(F_{T_i}, F_{T_o})$ is calculated between the abnormal keyword frequency vector $F_{T_o}$ and the keyword frequency vector $F_{T_i}$ of each piece of historical abnormal information in the keyword frequency model $F$.
S103: and screening historical abnormal information similar to the system abnormal record according to the cosine similarity vector, and outputting classification, reason and positioning information of the historical abnormal information as an auxiliary analysis result of the abnormality.
In a specific embodiment, a reasonable similarity threshold is set for the cosine similarity vector calculated in S102 from the abnormal keyword frequency vector and the keyword frequency vectors of the historical abnormal information. The N pieces of historical abnormal information most similar to the system abnormal record are screened out according to the similarity threshold, and their classification, reason and positioning information are output.
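For reference, the cosine similarity used in S102 can be written out for two $m$-dimensional frequency vectors as below. The superscript $(k)$, denoting the $k$-th component, is notation introduced here, and the numbers in the worked case are illustrative rather than taken from the application. Because keyword counts are non-negative, the result lies between 0 and 1.

```latex
\[
S_{T_i T_o} = \cos(F_{T_i}, F_{T_o})
  = \frac{F_{T_i} \cdot F_{T_o}}{\lVert F_{T_i} \rVert \, \lVert F_{T_o} \rVert}
  = \frac{\sum_{k=1}^{m} F_{T_i}^{(k)} F_{T_o}^{(k)}}
         {\sqrt{\sum_{k=1}^{m} \bigl(F_{T_i}^{(k)}\bigr)^2} \,
          \sqrt{\sum_{k=1}^{m} \bigl(F_{T_o}^{(k)}\bigr)^2}}
\]

% Illustrative values: F_{T_o} = [2, 1, 0], F_{T_i} = [1, 1, 1]
\[
S_{T_i T_o} = \frac{2 \cdot 1 + 1 \cdot 1 + 0 \cdot 1}{\sqrt{5} \, \sqrt{3}}
  = \frac{3}{\sqrt{15}} \approx 0.775
\]
```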
The method shown in fig. 1 can be executed by a PC, a terminal, or similar equipment. By establishing an abnormal keyword frequency model, it provides auxiliary analysis of system abnormities, passes the knowledge accumulated in historical exception handling information on to operation and maintenance personnel, saves labor cost, and makes effective use of the historical exception handling information.
In an embodiment, obtaining an abnormal keyword from the obtained system abnormal record, and generating an abnormal keyword frequency vector according to the abnormal keyword and a pre-established keyword vector model, as shown in fig. 2, includes:
s201: and cutting the text sequence recorded by the system exception into exception keywords.
In one embodiment, a system exception record T is formed when a system exception occursoExtracting system abnormal record T by utilizing python self-contained accurate word segmentation technology of a crust blockoAbnormal keyword K in (1)1,K2,…,K1,Kn
S202: and counting the occurrence frequency of each keyword in the dimension of the keyword vector model in the abnormal keywords according to the abnormal keywords and the keyword vector model to generate an abnormal keyword frequency vector.
In one embodiment, based on the abnormal keywords $K_1, K_2, \dots, K_n$ and the established keyword vector model $[K_1, K_2, \dots, K_m]$, count how often each keyword $K_1, K_2, \dots, K_m$ in the dimensions of the keyword vector model occurs among the abnormal keywords. As shown in fig. 3, if for example $K_1$ occurs 2 times, $K_2$ occurs 1 time and $K_m$ occurs 0 times, the resulting abnormal keyword frequency vector is $F_{T_o} = [2, 1, \dots, 0]$.
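Steps S201 and S202 can be sketched as follows. This is a minimal illustration, not the applicant's implementation: the function names, the sample record and the three-keyword model are hypothetical. Because jieba is a third-party package, the sketch falls back to a plain regex tokenizer when it is not installed. That fallback is only adequate for space-delimited text, not for Chinese.

```python
import re
from collections import Counter

try:
    import jieba  # third-party segmenter named in the text; optional in this sketch
except ImportError:
    jieba = None


def segment(record_text):
    """S201: split an abnormal record into keyword tokens."""
    if jieba is not None:
        # jieba.lcut performs precise-mode segmentation when cut_all is False
        tokens = jieba.lcut(record_text, cut_all=False)
    else:
        # Fallback for this sketch only: split on non-word characters
        tokens = re.findall(r"\w+", record_text)
    # Drop whitespace-only and punctuation-only tokens
    return [t for t in tokens if any(ch.isalnum() for ch in t)]


def frequency_vector(keywords, keyword_model):
    """S202: count how often each model keyword occurs among the record's keywords."""
    counts = Counter(keywords)
    # Keywords outside the model (e.g. "lock") are ignored; absent ones count as 0
    return [counts[k] for k in keyword_model]


keyword_model = ["timeout", "database", "cpu"]  # hypothetical [K_1, ..., K_m]
record = "database timeout, timeout while waiting for lock"  # hypothetical T_o
f_to = frequency_vector(segment(record), keyword_model)
print(f_to)  # [2, 1, 0]
```

The vector always has the model's dimension m, whatever the length of the record, which is what allows it to be compared with the stored historical vectors.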
In an embodiment, obtaining a cosine similarity vector according to the abnormal keyword frequency vector and the keyword frequency vector of each historical abnormal information in the pre-established keyword frequency model, as shown in fig. 4, includes:
s501: and generating cosine similarity according to the abnormal keyword frequency vector and the keyword frequency vector of each historical abnormal information in a pre-established keyword frequency model.
In one embodiment, an abnormal keyword frequency vector F is calculatedToKeyword frequency vector F with each historical abnormal information recorded in the keyword frequency modelTiCosine similarity S betweenTiTo=cos(FTi,FTo) The cosine similarity represents the frequency vector F of the abnormal keywordToKeyword frequency vector F associated with certain historical abnormal informationTiThe greater the similarity, the more similar (closer) the two are.
S502: and constructing a cosine similarity vector according to the cosine similarity.
In one embodiment, the cosine similarities obtained in S501 are used to form the cosine similarity vector $[S_{T_1 T_o}, \dots, S_{T_n T_o}]$.
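A minimal sketch of S501 and S502 in plain Python is shown below. The function names and the sample vectors are hypothetical. Returning 0.0 for an all-zero vector is an assumption made here, since the text does not say how a record containing no model keywords should be treated.

```python
import math


def cosine_similarity(u, v):
    """S501: cosine of the angle between two equal-length frequency vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # assumption: a zero vector is treated as dissimilar to everything
    return dot / (norm_u * norm_v)


def similarity_vector(f_to, frequency_model):
    """S502: one similarity per historical record, in the model's row order."""
    return [cosine_similarity(f_ti, f_to) for f_ti in frequency_model]


f_to = [2, 1, 0]                                     # hypothetical F_To
frequency_model = [[2, 1, 0], [0, 0, 3], [1, 1, 1]]  # hypothetical rows F_T1..F_T3
similarities = similarity_vector(f_to, frequency_model)
print([round(s, 3) for s in similarities])
```

Keeping the similarities in the same order as the rows of the frequency model means that position i in the result still identifies historical record T_i.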
In one embodiment, the screening of historical anomaly information similar to the system anomaly record according to the cosine similarity vector includes:
and screening out, according to a preset similarity threshold, the historical abnormal information whose cosine similarity in the cosine similarity vector is higher than the similarity threshold.
In one embodiment, a similarity threshold is set, and then the set similarity threshold and the formula $[S_{T_j T_o}, \dots, S_{T_k T_o}] = \mathrm{TopN}[S_{T_1 T_o}, \dots, S_{T_n T_o}]$ are used to screen out the N pieces of historical abnormal information $[T_j, \dots, T_k]$ with the highest similarity, and the related classification, reason, positioning information, etc. of that historical abnormal information is output as the pre-judgment result for the target abnormal information.
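The screening step can be sketched as below. The text mentions both a similarity threshold and a Top-N selection without spelling out how they combine. This sketch assumes the threshold is applied first and the N best survivors are then returned. The record fields (`id`, `classification`, `reason`, `location`) and all sample values are hypothetical.

```python
def screen_similar(similarities, history, threshold, top_n):
    """Keep records whose similarity exceeds the threshold, then return the top_n best."""
    candidates = [
        (score, record)
        for score, record in zip(similarities, history)
        if score > threshold
    ]
    # Highest similarity first; sorting on the score alone avoids comparing dicts
    candidates.sort(key=lambda pair: pair[0], reverse=True)
    return candidates[:top_n]


history = [  # hypothetical historical abnormal information T_1..T_4
    {"id": "T1", "classification": "database", "reason": "full table scan", "location": "order query"},
    {"id": "T2", "classification": "network", "reason": "link flapping", "location": "gateway"},
    {"id": "T3", "classification": "database", "reason": "lock wait", "location": "batch job"},
    {"id": "T4", "classification": "config", "reason": "pool size too small", "location": "app server"},
]
similarities = [0.92, 0.10, 0.75, 0.60]  # hypothetical cosine similarity vector

matches = screen_similar(similarities, history, threshold=0.5, top_n=2)
for score, record in matches:
    print(score, record["classification"], record["reason"], record["location"])
```

If fewer than N records exceed the threshold, the function simply returns fewer results, which avoids presenting weak matches to the operator.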
In an embodiment, a method for constructing a keyword vector model and a keyword frequency model, as shown in fig. 5, includes the following steps:
s601: and obtaining keywords from the obtained historical abnormal information, and obtaining the corresponding frequency of each keyword according to the obtained frequency.
In one embodiment, establishing the keyword frequency model first requires inputting a plurality of historical system exceptions (historical exception information) [ T ]1,…,Ti]Then, extracting the keyword T of each abnormal record by utilizing the python self-contained Chinese character of 'Jiba' exact word segmentation technology1->[K11,…,K1a],…,Tn->[Kn1,…,Knb]Counting all the keywords extracted from the abnormal records of the historical system and removing the duplication to form a total keyword set K11,…,Knb]And counting the frequency [ F ] corresponding to each keywordK11,…,FKnb]。
S602: and screening the keywords according to a preset frequency threshold value to generate a keyword vector model.
In one embodiment, a frequency threshold is set to screen high-frequency keywords and form the final historical abnormal record keyword vector $[K_1, K_2, \dots, K_m]$.
S603: and generating a keyword frequency vector according to the keyword vector model and the frequency corresponding to the keyword, wherein the keyword frequency model consists of the keyword frequency vector.
In a particular embodiment, for the word segmentation result of each historical abnormal record $[T_1, \dots, T_n]$, the occurrence frequency of each keyword in the dimensions of the keyword vector $[K_1, K_2, \dots, K_m]$ is counted to form the keyword frequency model $F$ (as shown in fig. 6).
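Steps S601 to S603 can be sketched as follows, starting from records that have already been segmented. The function name and sample data are hypothetical. Treating a keyword as high-frequency when its count is greater than or equal to the threshold is an assumption, because the text does not state whether the comparison is strict.

```python
from collections import Counter


def build_models(segmented_history, freq_threshold):
    """Build the keyword vector model and keyword frequency model from history."""
    # S601: frequency of every distinct keyword over all historical records
    total_counts = Counter()
    for keywords in segmented_history:
        total_counts.update(keywords)

    # S602: keep only high-frequency keywords (>= is an assumption of this sketch)
    keyword_model = [k for k, c in total_counts.items() if c >= freq_threshold]

    # S603: one frequency vector per historical record, in the model's dimensions
    frequency_model = []
    for keywords in segmented_history:
        counts = Counter(keywords)
        frequency_model.append([counts[k] for k in keyword_model])
    return keyword_model, frequency_model


segmented_history = [  # hypothetical segmentation results for T_1..T_3
    ["timeout", "database", "timeout"],
    ["cpu", "alarm", "timeout"],
    ["database", "lock"],
]
keyword_model, frequency_model = build_models(segmented_history, freq_threshold=2)
print(keyword_model)    # ['timeout', 'database']
print(frequency_model)  # [[2, 1], [1, 0], [0, 1]]
```

The frequency model is a matrix with one row per historical record and one column per retained keyword, matching the layout the text attributes to fig. 6.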
In one embodiment, the method further includes an update step, as shown in fig. 7:
s801: and generating an abnormal record after the system exception processing is finished, and acquiring an abnormal record keyword from the abnormal record.
In one embodimentIn the method, after the system exception processing is completed, the exception record T is perfectedo' the same word segmentation extraction technique is still used to obtain the keyword T recorded by the system exceptiono'->[Ko1',…,Kod']。
S802: and generating a new keyword vector model by using the abnormal record keywords, and updating the keyword frequency model by using the new keyword vector model.
In one embodiment, if the keyword $K_{o1}'$ is already present in the full keyword set $[K_{11}, \dots, K_{nb}]$, the frequency of the corresponding keyword is increased by 1; if $K_{o1}'$ is a new keyword, it is added to the full keyword set, giving $[K_{11}, \dots, K_{nb}, K_{o1}']$, with a frequency of 1. The keywords are processed in turn in this way to form a new full keyword set. A frequency threshold is then set to screen high-frequency keywords and form a new keyword vector model $[K_1, K_2, \dots, K_p]$, and the keyword frequency model is updated for every system abnormal record (including the newly added system abnormal record $T_o$).
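The update step can be sketched as below. The function name, argument layout and sample data are hypothetical, and the `>=` comparison against the frequency threshold is again an assumption. `Counter.update` adds 1 for each occurrence of an existing keyword and creates new keywords with a count of 1, which matches the rule described above.

```python
from collections import Counter


def update_models(total_counts, segmented_records, new_keywords, freq_threshold):
    """Fold a newly handled record into the models (mutates the first two arguments)."""
    # Existing keywords gain 1 per occurrence; unseen keywords enter with frequency 1
    total_counts.update(new_keywords)
    segmented_records.append(list(new_keywords))

    # Re-screen high-frequency keywords to obtain the new model [K_1, ..., K_p]
    keyword_model = [k for k, c in total_counts.items() if c >= freq_threshold]

    # Recompute every record's frequency vector in the new dimensions
    frequency_model = []
    for keywords in segmented_records:
        counts = Counter(keywords)
        frequency_model.append([counts[k] for k in keyword_model])
    return keyword_model, frequency_model


segmented_records = [  # hypothetical history before the update
    ["timeout", "database", "timeout"],
    ["cpu", "alarm", "timeout"],
]
total_counts = Counter(k for record in segmented_records for k in record)
new_keywords = ["cpu", "lock", "database"]  # hypothetical keywords of the new record

keyword_model, frequency_model = update_models(
    total_counts, segmented_records, new_keywords, freq_threshold=2
)
print(keyword_model)    # ['timeout', 'database', 'cpu']
print(frequency_model)  # [[2, 1, 0], [1, 0, 1], [0, 1, 1]]
```

Because the keyword vector model can gain or lose dimensions after an update, all historical vectors are recomputed rather than only the new row being appended.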
According to the method and the system, historical abnormity analysis data of the system is fully mined and an abnormal keyword frequency model is established to assist system analysis, which effectively saves labor cost, makes full use of historical abnormal information, and reduces the probability of analysis errors. Moreover, the keyword frequency model in the application supports self-learning and is continuously updated, giving a high degree of intelligence.
Based on the same inventive concept, the embodiment of the present application further provides a financial system abnormality auxiliary analysis apparatus based on data mining, which can be used to implement the method described in the above embodiments, as described in the following embodiments. Because the principle of solving the problems of the financial system abnormity auxiliary analysis device based on data mining is similar to that of the financial system abnormity auxiliary analysis method based on data mining, the implementation of the financial system abnormity auxiliary analysis device based on data mining can refer to the implementation of the financial system abnormity auxiliary analysis method based on data mining, and repeated parts are not repeated. As used hereinafter, the term "unit" or "module" may be a combination of software and/or hardware that implements a predetermined function. While the system described in the embodiments below is preferably implemented in software, implementations in hardware, or a combination of software and hardware are also possible and contemplated.
As shown in fig. 8, the data mining-based financial system abnormality auxiliary analysis apparatus provided by the present application includes:
an abnormal keyword frequency vector generating unit 901, configured to obtain an abnormal keyword from the obtained system abnormal record, and generate an abnormal keyword frequency vector according to the abnormal keyword and a pre-established keyword vector model;
a cosine similarity vector obtaining unit 902, configured to obtain a cosine similarity vector according to the abnormal keyword frequency vector and the keyword frequency vector of each historical abnormal information in the pre-established keyword frequency model;
and the auxiliary analysis result output unit 903 is used for screening historical abnormal information similar to the system abnormal record according to the cosine similarity vector, and outputting classification, reason and positioning information of the historical abnormal information as an abnormal auxiliary analysis result.
In one embodiment, as shown in fig. 9, the abnormal keyword frequency vector generating unit 901 includes:
an abnormal keyword segmentation module 1001, configured to segment a text sequence recorded by the system anomaly into abnormal keywords;
a frequency vector generating module 1002, configured to count occurrence frequencies of each keyword in a dimension of the keyword vector model according to the abnormal keyword and the keyword vector model, and obtain a frequency vector of the abnormal keyword.
In one embodiment, as shown in fig. 10, the cosine similarity vector obtaining unit 902 includes:
a cosine similarity generating module 1201, configured to generate a cosine similarity according to the abnormal keyword frequency vector and the keyword frequency vector of each historical abnormal information in the pre-established keyword frequency model;
a cosine similarity vector constructing module 1202, configured to construct a cosine similarity vector according to the cosine similarity.
In one embodiment, the auxiliary analysis result output unit 903 includes:
and the historical anomaly screening module is used for screening out, according to the preset similarity threshold, the historical abnormal information whose cosine similarity in the cosine similarity vector is higher than the similarity threshold.
In one embodiment, a method for constructing a keyword frequency model includes the following steps:
obtaining keywords from the obtained historical abnormal information, and counting the frequency of each keyword;
screening the keywords according to a preset frequency threshold value to generate a keyword vector model;
and generating a keyword frequency vector according to the keyword vector model and the frequency corresponding to the keyword, wherein the keyword frequency model consists of the keyword frequency vector.
In one embodiment, the apparatus further comprises:
an abnormal keyword acquisition module, configured to generate an abnormal record after handling of the system abnormality is finished, and to acquire abnormal record keywords from the abnormal record;
an updating module, configured to generate a new keyword vector model using the abnormal record keywords, and to update the keyword frequency model using the new keyword vector model.
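One possible reading of the updating module is sketched below. The application does not specify how the new keyword vector model is derived, so this sketch assumes an incremental update: previously unseen keywords become new dimensions, existing frequency vectors are padded with zeros, and the new record's vector is appended. The function name `update_models` and the sample data are hypothetical.

```python
from collections import Counter

def update_models(vector_model, frequency_model, record_keywords):
    """Return an extended keyword vector model and an updated keyword frequency model."""
    counts = Counter(record_keywords)
    # New keyword vector model: old dimensions first, unseen keywords appended.
    new_vector_model = list(vector_model) + sorted(
        k for k in counts if k not in vector_model
    )
    # Existing vectors gain zero entries for the newly added dimensions.
    padding = [0] * (len(new_vector_model) - len(vector_model))
    new_frequency_model = [list(v) + padding for v in frequency_model]
    # The resolved abnormality becomes a new piece of historical abnormal information.
    new_frequency_model.append([counts[k] for k in new_vector_model])
    return new_vector_model, new_frequency_model

vector_model = ["database", "timeout"]
frequency_model = [[1, 2], [1, 0]]
new_model, new_frequencies = update_models(
    vector_model, frequency_model, ["timeout", "disk", "disk"]
)
print(new_model)        # ['database', 'timeout', 'disk']
print(new_frequencies)  # [[1, 2, 0], [1, 0, 0], [0, 1, 2]]
```

A full rebuild that re-applies the frequency threshold to the enlarged history would be an equally plausible design; the incremental form is shown only because it leaves the existing dimensions untouched.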
By establishing a keyword frequency model to assist in analyzing system abnormalities, the present application can effectively save labor costs, make full use of historical abnormal information, and reduce the probability of analysis errors.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The principles and implementation of the present invention are explained herein using specific embodiments, and the description of the embodiments is intended only to help in understanding the method and core idea of the invention. A person skilled in the art may, following the idea of the invention, vary the specific embodiments and the scope of application. In summary, the content of this specification should not be construed as limiting the present invention.
An embodiment of the present application further provides a specific implementation of an electronic device capable of implementing all steps of the method in the foregoing embodiments. Referring to fig. 11, the electronic device specifically includes:
a processor 1301, a memory 1302, a communications interface 1303, a bus 1304, and a non-volatile memory 1305;
the processor 1301, the memory 1302 and the communications interface 1303 communicate with one another through the bus 1304;
the processor 1301 is configured to call the computer program in the memory 1302 and the non-volatile memory 1305, and implements all steps of the method in the above embodiments when executing the computer program; for example, the processor implements the following steps when executing the computer program:
S101: obtaining abnormal keywords from the obtained system abnormal record, and generating an abnormal keyword frequency vector according to the abnormal keywords and a pre-established keyword vector model.
S102: obtaining a cosine similarity vector according to the abnormal keyword frequency vector and the keyword frequency vector of each piece of historical abnormal information in the pre-established keyword frequency model.
S103: screening historical abnormal information similar to the system abnormal record according to the cosine similarity vector, and outputting the classification, reason and positioning information of the historical abnormal information as the auxiliary analysis result for the abnormality.
Embodiments of the present application also provide a computer-readable storage medium capable of implementing all the steps of the method in the above embodiments, where the computer-readable storage medium stores thereon a computer program, and the computer program when executed by a processor implements all the steps of the method in the above embodiments, for example, the processor implements the following steps when executing the computer program:
S101: obtaining abnormal keywords from the obtained system abnormal record, and generating an abnormal keyword frequency vector according to the abnormal keywords and a pre-established keyword vector model.
S102: obtaining a cosine similarity vector according to the abnormal keyword frequency vector and the keyword frequency vector of each piece of historical abnormal information in the pre-established keyword frequency model.
S103: screening historical abnormal information similar to the system abnormal record according to the cosine similarity vector, and outputting the classification, reason and positioning information of the historical abnormal information as the auxiliary analysis result for the abnormality.
The embodiments in this specification are described in a progressive manner; for parts that are the same or similar among the embodiments, reference may be made from one embodiment to another, and each embodiment focuses on its differences from the others. In particular, because the hardware-plus-program embodiment is substantially similar to the method embodiment, it is described only briefly; for the relevant points, refer to the corresponding description of the method embodiment. Although the embodiments of this specification provide the method steps as described in the embodiments or flowcharts, more or fewer steps may be included based on conventional or non-inventive means. The order of steps recited in the embodiments is merely one of many possible orders of execution and does not represent the only one. When an actual apparatus or end product executes, the steps may be performed sequentially or in parallel (e.g., on parallel processors, in a multi-threaded environment, or even in a distributed data processing environment) according to the method shown in the embodiments or the figures. The terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, the presence of additional identical or equivalent elements in a process, method, article, or apparatus that comprises the recited elements is not excluded. For convenience of description, the above device is described as being divided into various modules by function, each of which is described separately.
Of course, in implementing the embodiments of this specification, the functions of the modules may be implemented in one or more pieces of software and/or hardware, or a module implementing a given function may be implemented by a combination of multiple sub-modules or sub-units. The above-described device embodiments are merely illustrative. For example, the division into units is only a logical division, and other divisions are possible in practice: a plurality of units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices or units, and may be electrical, mechanical or of another form.
In this specification, the schematic representations of the terms used above are not necessarily intended to refer to the same embodiment or example. Furthermore, a person skilled in the art can combine the various embodiments or examples described in this specification, and the features of different embodiments or examples, provided that they do not contradict one another. The above description is only an example of the embodiments of this specification and is not intended to limit them. Various modifications and variations to the embodiments described herein will be apparent to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the embodiments of this specification shall fall within the scope of the claims of the embodiments of this specification.

Claims (14)

1. A financial system abnormity auxiliary analysis method based on data mining is characterized by comprising the following steps:
obtaining abnormal keywords from the obtained system abnormal records, and generating abnormal keyword frequency vectors according to the abnormal keywords and a pre-established keyword vector model;
obtaining a cosine similarity vector according to the abnormal keyword frequency vector and the keyword frequency vector of each historical abnormal information in a pre-established keyword frequency model;
and screening historical abnormal information similar to the system abnormal record according to the cosine similarity vector, and outputting classification, reason and positioning information of the historical abnormal information as an auxiliary analysis result of the abnormality.
2. The method of claim 1, wherein the obtaining abnormal keywords from the obtained system abnormal records and generating abnormal keyword frequency vectors according to the abnormal keywords and a pre-established keyword vector model comprises:
segmenting abnormal keywords from the text sequence of the system abnormal record;
and counting the occurrence frequency of each keyword in the dimension of the keyword vector model according to the abnormal keywords and the keyword vector model to obtain a frequency vector of the abnormal keywords.
3. The method of claim 1, wherein obtaining a cosine similarity vector according to the abnormal keyword frequency vector and a keyword frequency vector of each historical abnormal information in a pre-established keyword frequency model comprises:
generating cosine similarity according to the abnormal keyword frequency vector and the keyword frequency vector of each historical abnormal information in a pre-established keyword frequency model;
and constructing the cosine similarity vector according to the cosine similarity.
4. The abnormality auxiliary analysis method according to claim 1, wherein said screening historical abnormal information similar to the system abnormal record according to the cosine similarity vector comprises:
screening out, according to a preset similarity threshold, the historical abnormal information corresponding to those cosine similarities in the cosine similarity vector that are higher than the similarity threshold.
5. The abnormality auxiliary analysis method according to claim 1, wherein the method of constructing the keyword vector model and the keyword frequency model includes the steps of:
obtaining keywords from the obtained historical abnormal information, and obtaining the frequency corresponding to each keyword from the obtained keywords;
screening the keywords according to a preset frequency threshold value to generate a keyword vector model;
and generating the keyword frequency vector according to the keyword vector model and the frequency corresponding to the keyword, wherein the keyword frequency model consists of the keyword frequency vector.
6. The abnormality auxiliary analysis method according to claim 1, further comprising:
generating an abnormal record after the system exception processing is finished, and acquiring an abnormal record keyword from the abnormal record;
and generating a new keyword vector model by using the abnormal record keywords, and updating the keyword frequency model by using the new keyword vector model.
7. A financial system abnormity auxiliary analysis device based on data mining is characterized by comprising:
an abnormal keyword frequency vector generating unit, configured to obtain an abnormal keyword from the obtained system abnormal record, and generate an abnormal keyword frequency vector according to the abnormal keyword and a pre-established keyword vector model;
a cosine similarity vector obtaining unit, configured to obtain a cosine similarity vector according to the abnormal keyword frequency vector and a keyword frequency vector of each historical abnormal information in a pre-established keyword frequency model;
and the auxiliary analysis result output unit is used for screening historical abnormal information similar to the system abnormal record according to the cosine similarity vector and outputting the classification, reason and positioning information of the historical abnormal information as an abnormal auxiliary analysis result.
8. The abnormality auxiliary analysis device according to claim 7, wherein said abnormal keyword frequency vector generation means includes:
the abnormal keyword segmentation module is used for segmenting abnormal keywords from the text sequence of the system abnormal record;
and the frequency vector generation module is used for counting the occurrence frequency of each keyword in the dimensionality of the keyword vector model according to the abnormal keywords and the keyword vector model to obtain the frequency vector of the abnormal keywords.
9. The abnormality auxiliary analysis device according to claim 7, wherein the cosine similarity vector obtaining unit includes:
the cosine similarity generating module is used for generating cosine similarity according to the abnormal keyword frequency vector and the keyword frequency vector of each historical abnormal information in a pre-established keyword frequency model;
and the cosine similarity vector constructing module is used for constructing the cosine similarity vector according to the cosine similarity.
10. The abnormality auxiliary analysis device according to claim 7, wherein the auxiliary analysis result output unit includes:
and the historical abnormality screening module is used for screening out, according to a preset similarity threshold, the historical abnormal information corresponding to those cosine similarities in the cosine similarity vector that are higher than the similarity threshold.
11. The abnormality auxiliary analysis device according to claim 7, wherein the method of constructing the keyword vector model and the keyword frequency model includes the steps of:
obtaining keywords from the obtained historical abnormal information, and obtaining the frequency corresponding to each keyword from the obtained keywords;
screening the keywords according to a preset frequency threshold value to generate a keyword vector model;
and generating the keyword frequency vector according to the keyword vector model and the frequency corresponding to the keyword, wherein the keyword frequency model consists of the keyword frequency vector.
12. The abnormality auxiliary analysis device according to claim 7, further comprising:
the abnormal keyword acquisition module is used for generating an abnormal record after the system exception processing is finished and acquiring an abnormal record keyword from the abnormal record;
and the updating module is used for generating a new keyword vector model by using the abnormal record keywords and updating the keyword frequency model by using the new keyword vector model.
13. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the data mining-based financial system abnormity auxiliary analysis method of any one of claims 1 to 6 when executing the program.
14. A computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the data mining-based financial system abnormity auxiliary analysis method of any one of claims 1 to 6.
CN202010234684.7A 2020-03-30 2020-03-30 Financial system abnormity auxiliary analysis method and device based on data mining Pending CN111367971A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010234684.7A CN111367971A (en) 2020-03-30 2020-03-30 Financial system abnormity auxiliary analysis method and device based on data mining

Publications (1)

Publication Number Publication Date
CN111367971A true CN111367971A (en) 2020-07-03

Family

ID=71206957

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010234684.7A Pending CN111367971A (en) 2020-03-30 2020-03-30 Financial system abnormity auxiliary analysis method and device based on data mining

Country Status (1)

Country Link
CN (1) CN111367971A (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2010055494A (en) * 2008-08-29 2010-03-11 Oki Electric Ind Co Ltd Search and analysis server device and search and analysis method
CN104951553A (en) * 2015-06-30 2015-09-30 成都蓝码科技发展有限公司 Content collecting and data mining platform accurate in data processing and implementation method thereof
CN106570196A (en) * 2016-11-18 2017-04-19 广州视源电子科技股份有限公司 Video program searching method and apparatus
CN107426199A (en) * 2017-07-05 2017-12-01 浙江鹏信信息科技股份有限公司 A kind of method and system of Network anomalous behaviors detection and analysis
CN107506424A (en) * 2017-08-17 2017-12-22 国网北京市电力公司 Data analysing method and device
CN109147934A (en) * 2018-07-04 2019-01-04 平安科技(深圳)有限公司 Interrogation data recommendation method, device, computer equipment and storage medium
CN109241144A (en) * 2018-04-24 2019-01-18 中国银行股份有限公司 Rule inspection method and system are excavated and closed to a kind of operation/maintenance data
CN110399385A (en) * 2019-06-24 2019-11-01 厦门市美亚柏科信息股份有限公司 A kind of semantic analysis and system for small data set

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112037026A (en) * 2020-09-01 2020-12-04 中国银行股份有限公司 Automatic abnormal transaction work order processing method, device and system
CN115564450A (en) * 2022-12-06 2023-01-03 支付宝(杭州)信息技术有限公司 Wind control method, device, storage medium and equipment
CN115564450B (en) * 2022-12-06 2023-03-10 支付宝(杭州)信息技术有限公司 Wind control method, device, storage medium and equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20220927

Address after: 12 / F, 15 / F, 99 Yincheng Road, Pudong New Area pilot Free Trade Zone, Shanghai, 200120

Applicant after: Jianxin Financial Science and Technology Co.,Ltd.

Address before: 25 Financial Street, Xicheng District, Beijing 100033

Applicant before: CHINA CONSTRUCTION BANK Corp.

Applicant before: Jianxin Financial Science and Technology Co.,Ltd.