Disclosure of Invention
The main object of the invention is to provide a multi-system association early warning method, device, equipment and computer-readable storage medium, so as to solve the problem in the prior art that early warning performed by monitoring a user's operations on each system is inaccurate because of the small number of data samples.
In order to achieve the above object, the present invention provides a multi-system association early warning method, which includes the following steps:
receiving mapping relations between a plurality of preset parameter sets and risk weights, and transmitting the mapping relations as training samples to a preset learning model to form an event model, wherein the preset parameter sets comprise a plurality of data with operation risks;
receiving log files uploaded by each system, and extracting target data in the log files;
and transmitting each piece of target data to the event model, judging whether each piece of target data has risk target data with risk, and if so, carrying out early warning on each piece of risk target data.
Preferably, the step of transmitting each target data to the event model, and determining whether each target data has risk target data with risk includes:
reading a plurality of keywords in each target data, forming keyword groups corresponding to each target data, and transmitting each keyword group to the event model;
determining target risk weights corresponding to the keyword groups according to the mapping relation between the preset parameter groups and the risk weights in the event model;
and comparing each target risk weight with a preset risk early warning value, and judging whether risk target data with risk exists in the target data corresponding to each keyword group.
Preferably, the step of comparing each target risk weight with a preset risk early-warning value, and determining whether there is risk target data with risk in the target data corresponding to each keyword group includes:
reading account identifiers in the target data, and classifying the target data with the same account identifier into a target data group of the same account;
determining target keyword groups corresponding to target data in the target data groups of the same account, and weighting target risk weights of the target keyword groups to generate weighted average values corresponding to the target data groups of the same account;
comparing each weighted average value with a preset risk early warning value, and determining early warning risk weights which are larger than the preset risk early warning value in each weighted average value;
and determining a same-account target data group corresponding to each early warning risk weight according to the corresponding relation between each same-account target data group and the weighted average value, and judging target data in each corresponding same-account target data group as risk target data with risk.
Preferably, the step of pre-warning each risk target data includes:
performing difference operation on each early warning risk weight and the preset risk early warning value to generate an operation result;
and determining the early warning level of risk target data corresponding to each operation result according to the corresponding relation between the preset result range and the early warning level, and carrying out early warning corresponding to the early warning level on each risk target data.
Preferably, the step of determining the target risk weight corresponding to each target data according to the mapping relationship between the preset parameter set and the risk weight in the event model includes:
judging whether preset parameter groups corresponding to the keyword groups exist in the preset event model or not;
if the preset parameter groups corresponding to the keyword groups exist, determining target risk weights corresponding to the keyword groups according to the mapping relation between the preset parameter groups and the risk weights in the event model;
if the preset parameter groups corresponding to the keyword groups do not exist, determining difference data between the keyword groups and the preset parameter groups;
and determining the target risk weight of each keyword group according to each difference data.
Preferably, the step of determining the target risk weight of each keyword group according to each difference data includes:
comparing the parameter number in each piece of difference data corresponding to the keyword group, and determining the least difference data with the least parameter number in each piece of difference data;
determining a preset parameter set corresponding to the least difference data according to the corresponding relation between each difference data and the preset parameter set;
and determining the target risk weight of the keyword group according to the risk weight corresponding to the corresponding preset parameter group.
Preferably, the step of receiving the log file uploaded by each system and extracting the target data in the log file includes:
receiving log files uploaded by each system, and reading log data in the log files;
and classifying and screening the log data based on a preset regular expression to extract target data.
In addition, in order to achieve the above object, the present invention further provides a multi-system association early warning device, which includes:
a receiving module, used for receiving mapping relations between a plurality of preset parameter sets and risk weights, and transmitting the mapping relations as training samples to a preset learning model to form an event model, wherein the preset parameter sets comprise a plurality of data with operation risks;
an extraction module, used for receiving the log files uploaded by each system and extracting target data in the log files;
and an early warning module, used for transmitting each piece of target data to the event model, judging whether each piece of target data has risk target data with risk, and if so, carrying out early warning on each piece of risk target data.
In addition, in order to achieve the above object, the present invention further provides multi-system association early warning equipment, which includes: a memory, a processor, a communication bus and a multi-system association early warning program stored on the memory;
the communication bus is used for realizing connection communication between the processor and the memory;
the processor is used for executing the multi-system association early warning program to realize the following steps:
receiving mapping relations between a plurality of preset parameter sets and risk weights, and transmitting the mapping relations as training samples to a preset learning model to form an event model, wherein the preset parameter sets comprise a plurality of data with operation risks;
receiving log files uploaded by each system, and extracting target data in the log files;
and transmitting each piece of target data to the event model, judging whether each piece of target data has risk target data with risk, and if so, carrying out early warning on each piece of risk target data.
In addition, to achieve the above object, the present invention also provides a computer-readable storage medium storing one or more programs executable by one or more processors for:
receiving mapping relations between a plurality of preset parameter sets and risk weights, and transmitting the mapping relations as training samples to a preset learning model to form an event model, wherein the preset parameter sets comprise a plurality of data with operation risks;
receiving log files uploaded by each system, and extracting target data in the log files;
and transmitting each piece of target data to the event model, judging whether each piece of target data has risk target data with risk, and if so, carrying out early warning on each piece of risk target data.
According to the multi-system association early warning method of the invention, the received mapping relations between a plurality of preset parameter sets and risk weights are transmitted as training samples to a preset learning model for training, forming an event model; when log files uploaded by each system are received, target data is extracted from each log file; each piece of target data is then transmitted to the event model for risk judgment, and when risk target data with risk is found among the extracted target data, early warning is carried out on that risk target data. In this scheme, the event model is trained on a plurality of samples of the mapping relations between preset parameter sets and risk weights, so the risk judgment of the target data can be more accurate; meanwhile, because the target data comes from every system, the early warning is based on a plurality of systems and the data supporting it is sufficient; associating a plurality of systems thus further improves the accuracy and effectiveness of the early warning.
Detailed Description
It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
The invention provides a multisystem association early warning method.
Referring to fig. 1, fig. 1 is a flowchart of a first embodiment of a multi-system association early warning method according to the present invention. In this embodiment, the multi-system association early warning method includes:
Step S10, receiving mapping relations between a plurality of preset parameter sets and risk weights, and transmitting the mapping relations as training samples to a preset learning model to form an event model, wherein the preset parameter sets comprise a plurality of data with operation risks;
The multi-system association early warning method is applied to an early warning server, which gives early warning of user operations according to the log files generated when users operate in each system interfaced with an enterprise or institution. An institution or enterprise needs to interface with a plurality of internal and external systems to realize its functions; while running, each system records the operations of users in the system and generates a log file. By monitoring the log files, the operations of users in each system can be known and early warning can be given on them, so that risks caused by user operations are prevented. To facilitate early warning, an event model is trained first, and early warning is then performed through the trained event model. Specifically, a developer sets mapping relations between a plurality of preset parameter sets and risk weights according to the level of risk that operating on each kind of data may cause. For example, sending the customer identity card numbers stored in a database outward leaks the identity card numbers, which causes a higher risk; sending the customer telephone numbers stored in the database outward leaks the telephone numbers, which causes a relatively lower risk. A higher weight A1 is therefore set for the outgoing operation on identity card numbers, and a lower weight A2 for the outgoing operation on telephone numbers.
Furthermore, a user operation may involve a plurality of data, such as sending a customer's name and telephone number at the same time, or sending a customer's name and home address at the same time. Operating on a plurality of data causes a higher risk than operating on only one of them; that is, sending a customer's name and telephone number together causes a higher risk than sending only the telephone number or only the name, so an operation on a plurality of data is given a correspondingly higher weight. Meanwhile, different types of operation on the same data cause different risks; for example, viewing transaction data has a lower risk than sending it, so the viewing operation is given a lower weight and the sending operation a higher weight. Single data or a plurality of data is combined with an operation type, and a weight is set according to the risk caused; the combination of single data and an operation type, or of a plurality of data and an operation type, is taken as a preset parameter set, so that the preset parameter set comprises a plurality of data with operation risks; and the set weight is taken as the risk weight, forming the mapping relation between the preset parameter set and the risk weight. For example, the parameter set [ID card number, send] corresponds to risk weight B1, the parameter set [ID card number, view] corresponds to risk weight (B1-i), the parameter set [name, phone number, send] corresponds to risk weight B2, the parameter set [name, age, home address, read] corresponds to risk weight (B2-i), and so on.
After the mapping relations between the preset parameter sets and the risk weights are set, they are uploaded to the early warning server, in which a preset learning model for training is provided in advance. The preset learning model may be a supervised learning model, an unsupervised learning model, or a semi-supervised learning model. A supervised learning model first labels data to form a data set, then learns a model from the data set, and uses the learned model to predict new data. In an unsupervised learning model the data does not need to be labeled; an algorithm automatically discovers patterns in the data, such as a cluster structure, a hierarchical structure, a sparse tree or a graph, but the accuracy is lower than that of a supervised learning model. In a semi-supervised learning model, unlabeled data and a portion of labeled data are used together to train the model so as to improve its accuracy. After the mapping relations between the plurality of preset parameter sets and the risk weights are received, they are transmitted to the preset learning model to form an event model for predicting the risk weight of each event; an event is an operation performed by a user on data in a system, and the level of risk of the operation is represented by the weight value.
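As a minimal sketch of the training samples described above, the mapping relations can be written as pairs of a parameter group and a risk weight. The numeric values for B1, B2 and i, the function name build_event_model, and the use of a plain lookup table in place of a trained learning model are all assumptions made for illustration; the description does not fix any of them.

```python
from typing import Dict, FrozenSet, Iterable, Tuple

# Placeholder numbers: the description names the weights only symbolically (B1, B2, i).
B1, B2, I = 0.9, 0.7, 0.3

ParameterGroup = FrozenSet[str]


def build_event_model(
    samples: Iterable[Tuple[Iterable[str], float]],
) -> Dict[ParameterGroup, float]:
    """Record each (preset parameter group -> risk weight) training sample.

    A lookup table stands in for the preset learning model here (assumption).
    """
    model: Dict[ParameterGroup, float] = {}
    for group, weight in samples:
        # frozenset makes the group hashable and independent of word order.
        model[frozenset(group)] = weight
    return model


training_samples = [
    (["ID card number", "send"], B1),
    (["ID card number", "view"], B1 - I),
    (["name", "phone number", "send"], B2),
    (["name", "age", "home address", "read"], B2 - I),
]
event_model = build_event_model(training_samples)
```

Storing each group as a frozenset means that [send, ID card number] and [ID card number, send] resolve to the same entry, which matches the description's treatment of a parameter group as a combination of data and operation type rather than an ordered list.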
Step S20, receiving log files uploaded by each system, and extracting target data in each log file;
Further, since each system records user operations in log files, the log files of each system need to be acquired in order to judge whether the user operations carry risk. Specifically, a timed upload mechanism or a timed request mechanism may be set up for the acquisition. Under the timed upload mechanism, each system actively uploads its generated log files to the early warning server at regular intervals; under the timed request mechanism, the early warning server sends request information to each system at regular intervals, and each system uploads its generated log files after receiving the request information. Receiving the log files uploaded by each system completes the acquisition of the log files. Because each system may carry various data in its generated log files, some of which is unrelated to operation early warning but could interfere with the early warning analysis, valid data needs to be extracted from the log files; the extracted valid data is the target data in the log files that can be used for early warning. Specifically, the step of receiving the log files uploaded by each system and extracting the target data in the log files includes:
Step S21, receiving log files uploaded by each system, and reading log data in the log files;
It will be appreciated that, because the systems may be provided or developed by a number of different developers, the log files generated by the systems may contain different information; besides recording user operations, they may also contain information about the system itself or other information, such as the system version number, the system update state and the system running duration. Such information is log data in the log file, but it is unrelated to the operations performed by users in the system; it is invalid information and needs to be filtered out. The filtering operation must first acquire all the log data in the log file and then filter out the invalid information it contains. Therefore, after the log files uploaded by each system are received, the log data in each log file is read so as to acquire all the log data in each log file.
Step S22, classifying and screening the log data based on a preset regular expression so as to extract the target data.
Further, after the log data in each log file is read, the invalid information in the log data is filtered out, so that the target data usable for early warning is screened from the read log data. The filtering may be performed by setting regular expressions. A regular expression is a "rule string" composed of predefined specific characters and combinations of those characters, and expresses a filtering logic for character strings. For example, the expression foo matches the literal text "foo"; a regular expression matching Chinese characters is [\u4e00-\u9fa5]; and a regular expression matching a character string consisting of the 26 English letters is ^[A-Za-z]+$. The data to be screened is defined as a regular expression to form the preset regular expression, and the log data is classified and screened through the preset regular expression; the required target data is extracted and the unneeded invalid information is filtered out, the extracted target data being the operation data of the user's operations on the system.
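The screening in step S22 can be sketched with Python's standard re module. The log line layout, the pattern OPERATION_PATTERN and the function name extract_target_data are hypothetical; a real deployment would define one preset regular expression per log format of the interfaced systems.

```python
import re

# Hypothetical line layout: "<timestamp> user=<account> op=<operation> fields=<a,b,...>"
OPERATION_PATTERN = re.compile(
    r"user=(?P<account>\S+)\s+op=(?P<operation>\w+)\s+fields=(?P<fields>\S+)"
)


def extract_target_data(log_lines):
    """Keep only lines that record a user operation; drop everything else."""
    target_data = []
    for line in log_lines:
        match = OPERATION_PATTERN.search(line)
        if match is None:
            continue  # e.g. version number or running duration: invalid information
        target_data.append(
            {
                "account": match.group("account"),
                "operation": match.group("operation"),
                "fields": match.group("fields").split(","),
            }
        )
    return target_data


log_lines = [
    "2020-01-01T10:00:00 system_version=2.3.1",
    "2020-01-01T10:00:05 user=a op=send fields=name,phone_number",
    "2020-01-01T10:00:09 uptime=86400",
]
target_data = extract_target_data(log_lines)

# The two patterns quoted in the description, checked against sample strings.
# "\u65e5\u5fd7" is a two-character Chinese string written with escapes.
is_chinese = re.fullmatch(r"[\u4e00-\u9fa5]+", "\u65e5\u5fd7") is not None
is_letters = re.fullmatch(r"^[A-Za-z]+$", "foo") is not None
```

Only the second sample line survives the screening; the version and uptime lines are dropped as invalid information.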
Step S30, transmitting each piece of target data to the event model, judging whether each piece of target data has risk target data with risk, and if so, carrying out early warning on each piece of risk target data.
Further, after the target data is extracted from each system, each piece of target data is transmitted to the event model so that the risk of the target data is predicted by the event model. The event model represents the level of risk of each operation by a weight; it predicts a weight for each piece of target data, which is operation data, and whether each piece of target data carries risk is judged according to that weight. When some target data is determined to carry risk according to the weights, that target data is taken as the risk target data among all the target data; since the user's operation on the risk target data carries risk, early warning needs to be given on it. Specifically, the step of transmitting each piece of target data to the event model and judging whether each piece of target data has risk target data with risk includes:
Step S31, a plurality of keywords in each target data are read, keyword groups corresponding to each target data are formed, and each keyword group is transmitted to the event model;
The event model is trained on the mapping relations between preset parameter sets and risk weights, and the preset parameter sets consist of operation data whose operation in each system causes risk; predicting the weight of each piece of target data is therefore essentially predicting from the data in the target data that represents risk. Specifically, words representing risk, such as identity card number and telephone number, are read from the target data; considering that different operations cause different risks, words representing the operation type, such as send and read, also need to be read from the target data. The words representing risk and the words representing the operation type are taken as the keywords of the target data and form a keyword group, which is the data in the target data that represents risk. Because users operate in every system, a plurality of keywords is extracted from the target data of each system, forming the keyword group corresponding to each system. Each keyword group is transmitted to the event model to predict the weight of the target data corresponding to that keyword group.
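A keyword group of the kind described in step S31 might be formed as follows. The vocabularies RISK_WORDS and OPERATION_WORDS, the free-text form of the target data and the function name build_keyword_group are assumptions for illustration only.

```python
import re

# Hypothetical vocabularies; the description gives these words only as examples.
RISK_WORDS = ("ID card number", "phone number", "name", "home address", "age")
OPERATION_WORDS = ("send", "view", "read")


def build_keyword_group(target_text):
    """Collect the risk words and operation-type words that occur in one target datum."""
    found = set()
    for word in RISK_WORDS + OPERATION_WORDS:
        # Word boundaries avoid false hits such as "age" inside "message".
        if re.search(r"\b" + re.escape(word) + r"\b", target_text):
            found.add(word)
    return frozenset(found)


keyword_group = build_keyword_group("account a: send name, phone number")
```

The resulting group contains both kinds of keyword, the data involved and the operation type, which is what the event model needs in order to look up or predict a weight.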
Step S32, determining target risk weights corresponding to the keyword groups according to the mapping relation between the preset parameter groups and the risk weights in the event model;
Further, the weight of each keyword group is predicted through the mapping relations between preset parameter sets and risk weights in the event model. It is judged whether a preset parameter set completely consistent with the keyword group exists in the event model; if so, the risk weight corresponding to that completely consistent preset parameter set is determined as the weight of the keyword group; if not, the preset parameter set closest in type to the keyword group is predicted according to the prediction algorithm of the event model, and the risk weight corresponding to that closest preset parameter set is determined as the weight of the keyword group. The weight so determined for each keyword group is the target risk weight corresponding to that keyword group.
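One possible reading of step S32 is sketched below: an exact match is tried first, and otherwise the preset parameter group with the fewest differing parameters is used, in line with the difference-data variant set out in the disclosure. Measuring the difference as the size of the symmetric set difference, breaking ties by insertion order, and the sample weights are all assumptions.

```python
def target_risk_weight(keyword_group, event_model):
    """Return the risk weight of the matching, or else the closest, preset parameter group."""
    key = frozenset(keyword_group)
    if key in event_model:
        return event_model[key]  # a completely consistent preset parameter group exists
    # Difference data: parameters present in one group but not in the other.
    closest = min(event_model, key=lambda preset: len(preset ^ key))
    return event_model[closest]


# Hypothetical event model with placeholder weights.
event_model = {
    frozenset({"ID card number", "send"}): 0.9,
    frozenset({"name", "phone number", "send"}): 0.7,
}

exact = target_risk_weight({"send", "ID card number"}, event_model)
nearest = target_risk_weight({"name", "phone number", "home address", "send"}, event_model)
```

In the second call no preset group matches exactly; the group [name, phone number, send] differs by a single parameter (home address), so its weight is used.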
Step S33, comparing each target risk weight with a preset risk early warning value, and judging whether risk target data with risk exists in the target data corresponding to each keyword group.
Further, in order to judge whether each piece of target data carries risk, a preset risk early warning value for judging the weights is set in advance; a weight greater than the preset risk early warning value indicates risk, and otherwise indicates no risk. Specifically, the target risk weight of the keyword group formed from each piece of target data is compared with the preset risk early warning value to judge whether it is greater than the preset risk early warning value. If none of the target risk weights is greater than the preset risk early warning value, the target data forming the keyword groups carries no risk; if any target risk weight is greater than the preset risk early warning value, target data with risk exists among the target data forming the keyword groups, and that target data is taken as risk target data for early warning.
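The comparison in step S33 reduces to a strict threshold test, sketched here with hypothetical target data names and weights.

```python
def find_risk_target_data(weight_by_target, warning_value):
    """Return the target data whose target risk weight exceeds the warning value."""
    return [
        target
        for target, weight in weight_by_target.items()
        if weight > warning_value  # strictly greater, as in the description
    ]


# Hypothetical target risk weights and preset risk early warning value.
weights = {"p1": 0.9, "p2": 0.4, "p3": 0.5}
risky = find_risk_target_data(weights, warning_value=0.5)
```

A weight equal to the preset risk early warning value is not flagged, because the description requires the weight to be greater than that value.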
According to the multi-system association early warning method of the invention, the received mapping relations between a plurality of preset parameter sets and risk weights are transmitted as training samples to a preset learning model for training, forming an event model; when log files uploaded by each system are received, target data is extracted from each log file; each piece of target data is then transmitted to the event model for risk judgment, and when risk target data with risk is found among the extracted target data, early warning is carried out on that risk target data. In this scheme, the event model is trained on a plurality of samples of the mapping relations between preset parameter sets and risk weights, so the risk judgment of the target data can be more accurate; meanwhile, because the target data comes from every system, the early warning is based on a plurality of systems and the data supporting it is sufficient; associating a plurality of systems thus further improves the accuracy and effectiveness of the early warning.
Further, in another embodiment of the multi-system association early warning method of the present invention, the step of comparing each target risk weight with a preset risk early warning value to determine whether there is risk target data with risk in the target data corresponding to each keyword group includes:
Step S331, reading account identifiers in the target data, and classifying the target data with the same account identifier into a target data group with the same account;
Further, the target risk weight is determined from the keyword group formed from each piece of target data and reflects the risk of a user operation, while each piece of target data reflects the operation data of a user's operation in a system. In order for the target risk weight to reflect the operation risk more accurately, the operation data of the same user in different systems is treated as a whole, so that the target data of multiple systems is combined to reflect the risk of the user's operations across the systems. Specifically, a user operates a system through a user account, and during operation an account identifier representing the user account is assigned to the operation data, so that the target data carries the account identifier of the user account that produced it. If the same user has different user accounts in different systems, the account identifiers added to that user's operation data in the different systems would differ, making the combined multi-system target data inaccurate. To determine the target data produced by the same user, user information that uniquely characterizes the user, such as identity information, is read from the different user accounts; the same account identifier is assigned to the user accounts having the same user information, indicating that the different user accounts carrying this account identifier essentially belong to the same user. For example, for systems A and B, the accounts of the same user are a and b respectively, and registering each account requires valid information of the user, such as a mobile phone number or a mailbox.
Although the user may have a plurality of mobile phone numbers, all of them are registered under a single identity card number; therefore, from the mobile phone numbers with which user accounts a and b were registered, it is determined whether the identity card numbers corresponding to those mobile phone numbers are consistent. If they are consistent, user accounts a and b are judged to belong to the same user and are assigned the same account identifier F; the two pieces of target data formed by operating systems A and B through user accounts a and b both carry the account identifier F, indicating that they were produced by the same user operating through different accounts.
In order to combine the target data of a plurality of systems to reflect the risk of the same user's operations in each system, the account identifier carried in each piece of target data is read, and the account identifiers are compared to find those that are the same. Target data with the same account identifier is generated by the same user operating in different systems; it is assigned to the same same-account target data group, so that all the target data in a group is generated by the operations of the same user. In addition, because a system has many users and each user has an account in the system, a plurality of same-account target data groups is formed correspondingly.
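The account unification and grouping of step S331 can be sketched as two small functions. The identifier format (F1, F2, ...), the record layout and the sample identity values are hypothetical; the description only requires that accounts traced to the same identity card number share one account identifier.

```python
from collections import defaultdict


def assign_account_identifiers(identity_by_account):
    """Give user accounts that trace back to one identity card number the same identifier."""
    identifier_by_identity = {}
    identifier_by_account = {}
    for account, identity in identity_by_account.items():
        if identity not in identifier_by_identity:
            identifier_by_identity[identity] = "F%d" % (len(identifier_by_identity) + 1)
        identifier_by_account[account] = identifier_by_identity[identity]
    return identifier_by_account


def group_by_account_identifier(target_data, identifier_by_account):
    """Collect target data carrying the same account identifier into one group."""
    groups = defaultdict(list)
    for record in target_data:
        groups[identifier_by_account[record["account"]]].append(record)
    return dict(groups)


# Accounts a (system A) and b (system B) belong to one user; c belongs to another.
identifiers = assign_account_identifiers({"a": "ID-001", "b": "ID-001", "c": "ID-002"})
records = [
    {"system": "A", "account": "a", "operation": "send"},
    {"system": "B", "account": "b", "operation": "view"},
    {"system": "A", "account": "c", "operation": "read"},
]
same_account_groups = group_by_account_identifier(records, identifiers)
```

Here accounts a and b receive the same identifier, so their records from systems A and B land in a single same-account target data group.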
Step S332, determining a target keyword group corresponding to target data in each same-account target data group, and weighting a target risk weight of each target keyword group to generate a weighted average value corresponding to each same-account target data group;
Because a same-account target data group comprises a plurality of target data and each piece of target data has a keyword group, the keyword group corresponding to each piece of target data in a same-account target data group can be determined and taken as a target keyword group. Each keyword group has a target risk weight that reflects the target data from which the keyword group is derived; after the target keyword groups are determined, their target risk weights are weighted to generate a weighted average value. For example, if a same-account target data group includes target data p1, p2 and p3, whose target keyword groups are q1, q2 and q3 with target risk weights s1, s2 and s3 respectively, the weighted average value is (s1+s2+s3)/3. In addition, since each piece of target data in a same-account target data group causes a different risk, a group weight may be assigned to the target risk weight of each piece of target data according to the risk it causes, and the weighted average value is computed with these group weights. The greater the risk caused by the target data, the greater the group weight of the corresponding target risk weight; if the group weights of s1, s2 and s3 are k1, k2 and k3 (k1+k2+k3=1), the weighted average value is (s1×k1+s2×k2+s3×k3)/(k1+k2+k3). Because there are many same-account target data groups, a weighted average value is calculated for each of them to represent the risk of each same-account target data group.
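The two averaging formulas of step S332 can be checked with a short sketch. The function name weighted_average and the sample values for s1 to s3 and k1 to k3 are illustrative assumptions.

```python
def weighted_average(risk_weights, group_weights=None):
    """Compute (s1*k1 + ... + sn*kn) / (k1 + ... + kn); equal weights if none are given."""
    if group_weights is None:
        group_weights = [1.0] * len(risk_weights)
    if len(group_weights) != len(risk_weights):
        raise ValueError("one group weight is needed per target risk weight")
    total = sum(group_weights)
    return sum(s * k for s, k in zip(risk_weights, group_weights)) / total


# Hypothetical target risk weights s1, s2, s3 of one same-account target data group.
s = [0.9, 0.6, 0.3]
plain = weighted_average(s)                     # (s1 + s2 + s3) / 3
weighted = weighted_average(s, [0.5, 0.3, 0.2])  # k1 + k2 + k3 = 1
```

With equal group weights the result is the plain mean; giving the riskiest target data the largest group weight pulls the average toward its target risk weight.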
Step S333, comparing each weighted average value with a preset risk early warning value, and determining an early warning risk weight greater than the preset risk early warning value in each weighted average value;
Further, after the weighted average values representing the risk of the same-account target data groups are generated, each weighted average value is compared with the preset risk early warning value to judge whether the weighted average value of each same-account target data group is greater than the preset risk early warning value. If none of the weighted average values is greater than the preset risk early warning value, none of the same-account target data groups carries risk; if any weighted average value is greater than the preset risk early warning value, a same-account target data group with risk exists. Each weighted average value greater than the preset risk early warning value is determined as an early warning risk weight, and early warning is carried out on the same-account target data group corresponding to that early warning risk weight.
Step S334, determining a same-account target data set corresponding to each early warning risk weight according to the corresponding relation between each same-account target data set and the weighted average, and determining target data in each corresponding same-account target data set as risk target data with risk.
Because the early warning risk weights come from the weighted average values, and the weighted average values are generated from the same-account target data groups, a correspondence exists between each weighted average value and a same-account target data group; the same-account target data group corresponding to each early warning risk weight can therefore be determined from the correspondence between the same-account target data groups and the weighted average values. The target data in a same-account target data group so determined carries a high risk and requires early warning; the target data in that group is therefore judged to be risk target data with risk, and early warning is carried out on it.
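Steps S333 and S334 together can be sketched as a selection over the weighted average values keyed by account identifier. The dictionary layout, the identifiers F1 and F2 and the numeric values are assumptions for illustration.

```python
def flag_risky_groups(average_by_group, groups, warning_value):
    """Select early warning risk weights and the same-account groups they belong to."""
    # S333: weighted averages above the preset risk early warning value.
    early_warning_weights = {
        group_id: average
        for group_id, average in average_by_group.items()
        if average > warning_value
    }
    # S334: every target datum in a selected group is risk target data.
    risk_target_data = {group_id: groups[group_id] for group_id in early_warning_weights}
    return early_warning_weights, risk_target_data


# Hypothetical same-account target data groups and their weighted average values.
groups = {"F1": ["p1", "p2", "p3"], "F2": ["p4"]}
averages = {"F1": 0.69, "F2": 0.35}
early_warning_weights, risk_target_data = flag_risky_groups(averages, groups, 0.5)
```

Keying both dictionaries by the account identifier is one simple way to keep the correspondence between each weighted average value and its same-account target data group.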
Further, in another embodiment of the multi-system association early warning method of the present invention, the step of early warning each risk target data includes:
Step S34, performing difference operation on each early warning risk weight and the preset risk early warning value to generate an operation result;
Understandably, different operations on the target data in a system cause different levels of risk. After each risk target data with risk is determined, its risk level can be determined from the corresponding early warning risk weight: the greater the early warning risk weight, the higher the risk of the risk target data, and vice versa. Stricter early warning control measures are adopted for risk target data with higher risk, and looser early warning control measures for risk target data with lower risk, so that risks are prevented more accurately. Specifically, to represent the risk level of each risk target data, a difference operation is performed between the early warning risk weight of each risk target data and the preset risk early warning value, and the operation result of the difference operation represents the risk level of that risk target data.
Step S35, determining the early warning level of risk target data corresponding to each operation result according to the corresponding relation between the preset result range and the early warning level, and carrying out early warning corresponding to the early warning level on each risk target data.
Further, in order to represent the level of risk through the operation result of the difference operation, a correspondence between result ranges and early warning levels is preset; for example, the early warning level corresponding to the result range m1 to m2 is level two, the early warning level corresponding to the result range m2 to m3 is level three, and so on. Different early warning levels correspond to different early warning measures; for example, the early warning measure corresponding to a level-two early warning is to prohibit viewing any information related to the risk words in the risk target data, while the early warning measures corresponding to a level-three early warning are to prohibit sending and the like, and to prohibit viewing the risk words in the target data. After each operation result is generated, each operation result is compared with the result ranges to determine the result range in which it falls; the early warning level corresponding to that result range is the early warning level of the corresponding risk target data. Early warning is then carried out on each risk target data according to the early warning measures corresponding to its early warning level.
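Steps S34 and S35 amount to a subtraction followed by a range lookup, which can be sketched as below. The function name `early_warning_level`, the concrete boundaries in `LEVEL_RANGES` (standing in for m1, m2 and m3) and the half-open boundary convention are assumptions made for illustration; the text leaves them to be preset.

```python
def early_warning_level(warning_weight, threshold, level_ranges):
    """Map (warning_weight - threshold) to an early warning level.

    level_ranges is a list of (low, high, level) tuples. A result r falls
    in a range when low <= r < high; this boundary convention is an
    assumption, the text does not specify it.
    """
    result = warning_weight - threshold  # difference operation (step S34)
    for low, high, level in level_ranges:
        if low <= result < high:
            return result, level  # range lookup (step S35)
    return result, None  # no configured range covers this result


# Hypothetical ranges standing in for m1..m2 (level 2) and m2..m3 (level 3).
LEVEL_RANGES = [(0.0, 2.0, 2), (2.0, 5.0, 3)]
```

Returning the raw difference together with the level keeps the operation result available for logging or for choosing the concrete early warning measure.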
Further, in another embodiment of the multi-system association early warning method of the present invention, the step of determining the target risk weight corresponding to each keyword group according to the mapping relationship between the preset parameter group and the risk weight in the event model includes:
Step S321, judging whether preset parameter groups corresponding to the keyword groups exist in the preset event model;
Understandably, when the target risk weight of a keyword group is determined from the preset parameter group in the event model that is completely consistent with, or closest to, the keyword group, such a preset parameter group may not exist in the event model, which makes the determined target risk weight inaccurate. Therefore, in the process of determining the target risk weight corresponding to each keyword group through the mapping relation between the preset parameter groups and the risk weights in the event model, it is first judged whether a preset parameter group corresponding to each keyword group exists in the preset event model, where "corresponding" covers two cases: completely consistent, and closest in type. To judge complete consistency, the keywords in the keyword group are compared one by one with the parameters in the preset parameter group; when they are all identical, a completely consistent preset parameter group is judged to exist. To judge the closest type, after the two are found not to be completely consistent, it is judged whether the types represented by the keywords in the keyword group and by the parameters in the preset parameter group are consistent, with differences only in wording, such as two different wordings of an identification card number, or "mobile phone number" and "telephone number". When the types represented by the keywords in the keyword group and the parameters in the preset parameter group are judged to be consistent, a preset parameter group closest in type exists.
Otherwise, when neither a completely consistent preset parameter group nor a preset parameter group closest in type exists, it is judged that no preset parameter group corresponding to the keyword group exists in the preset event model.
Step S322, if there is a preset parameter set corresponding to each keyword set, determining a target risk weight corresponding to each keyword set according to a mapping relationship between the preset parameter set and the risk weight in the event model;
When a preset parameter group completely consistent with a keyword group exists in the preset event model, or a preset parameter group closest in type to the keyword group exists in the preset event model, it can be judged that a preset parameter group corresponding to the keyword group exists in the preset event model; the target risk weight of each such keyword group can therefore be determined according to the mapping relation between the preset parameter groups and the risk weights in the event model.
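The two-stage judgment of steps S321 and S322 (exact match first, then match by type) can be sketched as follows. This is only an illustration: a plain dictionary stands in for the trained event model, and the synonym table `TYPE_ALIASES` and the names `canonical` and `lookup_risk_weight` are hypothetical.

```python
# Hypothetical synonym table: different wordings that denote the same type.
TYPE_ALIASES = {
    "telephone number": "phone number",
    "mobile phone number": "phone number",
}


def canonical(term):
    """Return the type label of a keyword or parameter."""
    term = term.strip().lower()
    return TYPE_ALIASES.get(term, term)


def lookup_risk_weight(keyword_group, event_model):
    """Return the risk weight of the matching preset parameter group.

    event_model maps a tuple of parameters to a risk weight (a stand-in
    for the trained model). None means that no corresponding preset
    parameter group exists.
    """
    keywords = frozenset(keyword_group)
    # Case 1: completely consistent preset parameter group.
    for params, weight in event_model.items():
        if frozenset(params) == keywords:
            return weight
    # Case 2: same types, different wording.
    keyword_types = frozenset(canonical(k) for k in keyword_group)
    for params, weight in event_model.items():
        if frozenset(canonical(p) for p in params) == keyword_types:
            return weight
    return None
```

A `None` result corresponds to the branch handled by steps S323 and S324, where the target risk weight has to be inferred from difference data instead.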
Step S323, if there is no preset parameter set corresponding to each keyword set, determining difference data between each keyword set and the preset parameter set;
When neither a preset parameter group completely consistent with a keyword group nor a preset parameter group closest in type to the keyword group exists in the preset event model, it is judged that no preset parameter group corresponding to the keyword group exists in the preset event model. In this case, each keyword group is compared with each preset parameter group in the event model to determine the difference data between them. The difference data are the items that exist in the keyword group but not in the preset parameter group, or that exist in the preset parameter group but not in the keyword group; keywords and parameters whose represented types are consistent are not counted as difference data. For example, if the keyword group is [ID card number, telephone number, send] and the preset parameter group is [name, home address, mobile phone number, send], the difference data is ID card number and home address.
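Read literally, the definition above is a symmetric difference taken after wording variants of the same type have been unified. A minimal sketch, assuming a hypothetical synonym table `TYPE_ALIASES` and hypothetical function names, is:

```python
# Hypothetical synonym table: different wordings that denote the same type.
TYPE_ALIASES = {
    "telephone number": "phone number",
    "mobile phone number": "phone number",
}


def canonical(term):
    """Return the type label of a keyword or parameter."""
    term = term.strip().lower()
    return TYPE_ALIASES.get(term, term)


def difference_data(keyword_group, preset_group):
    """Items present in exactly one of the two groups, compared by type.

    Items whose types are consistent (for example two wordings of a phone
    number) are not counted as difference data.
    """
    kw = {canonical(k): k for k in keyword_group}
    ps = {canonical(p): p for p in preset_group}
    only_in_keywords = [kw[t] for t in kw if t not in ps]
    only_in_preset = [ps[t] for t in ps if t not in kw]
    return only_in_keywords + only_in_preset
```

The function returns the original wording of each differing item, so the number and type of the differences remain available when the target risk weight is later adjusted.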
Step S324, determining a target risk weight of each keyword group according to each difference data.
Because the event model contains many preset parameter groups, a plurality of groups of difference data exist between a keyword group and the preset parameter groups; since the parameters of the preset parameter groups differ, the number of parameters in each group of difference data also differs. The parameter number represents the degree of correlation between the keyword group and each preset parameter group: the smaller the parameter number, the smaller the difference between the keyword group and the preset parameter group, and the higher the degree of correlation. The target risk weight of the keyword group is then inferred from the risk weight corresponding to the preset parameter group. Specifically, the step of determining the target risk weight of each keyword group according to each difference data includes:
Step q1, comparing the parameter number in each piece of difference data corresponding to the keyword group, and determining the least difference data with the least parameter number in each piece of difference data;
After each group of difference data between the keyword group and each preset parameter group is determined, the parameter numbers of the difference data are compared, and the difference data with the fewest parameters is determined as the least difference data. For example, if the keyword group is [N1, N2, M1], preset parameter group 1 is [N1, M2], preset parameter group 2 is [N1, N2, M1, M2] and preset parameter group 3 is [N1, M2], the difference data between the keyword group and preset parameter groups 1, 2 and 3 are [N2, M1, M2], [M2] and [N2, M1, M2], respectively; the difference data between the keyword group and preset parameter group 2 has the fewest parameters and is therefore determined as the least difference data.
Step q2, determining a preset parameter set corresponding to the least difference data according to the corresponding relation between each difference data and the preset parameter set;
It is understood that, since each difference data is determined from the difference between the keyword group and one preset parameter group, a correspondence exists between each difference data and a preset parameter group, so that the preset parameter group corresponding to the least difference data can be determined according to this correspondence.
Step q3, determining the target risk weight of the keyword group according to the risk weight corresponding to the corresponding preset parameter group.
Because a correspondence exists between the preset parameter groups and the risk weights in the event model, the risk weight of the preset parameter group corresponding to the least difference data can be determined according to this correspondence, and the target risk weight of the keyword group that generated the difference data can then be determined from that risk weight. Since the keyword group that generated the difference data differs from the preset parameter group, the target risk weight can be obtained by adjusting the risk weight of the preset parameter group according to the number and type of the difference data. When the difference data are few and have little influence on the risk, the target risk weight is allowed to float slightly above or below the risk weight, or the risk weight is used directly as the target risk weight; when the difference data are many and have a large influence on the risk, the target risk weight can be adjusted to be larger than the risk weight. Further, there may be a plurality of least difference data, that is, the keyword group has the same, smallest number of difference data with respect to a plurality of preset parameter groups. In this case, the target risk weight is determined according to the relative magnitude of the risk weights of the preset parameter groups corresponding to the least difference data; specifically, the risk weight with the larger value is determined as the target risk weight, so as to err on the side of caution.
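Steps q1 to q3, including the tie-breaking rule, can be sketched as follows. The sketch makes several simplifying assumptions: the event model is represented as a list of (parameters, risk weight) pairs, difference data are computed as a plain symmetric difference without type unification, and the risk weight is used directly as the target risk weight with no further adjustment. The name `infer_risk_weight` is hypothetical.

```python
def infer_risk_weight(keyword_group, preset_groups):
    """Infer a target risk weight from the closest preset parameter group.

    preset_groups is a list of (parameters, risk_weight) pairs. The group
    with the fewest difference items wins (steps q1 and q2); when several
    groups tie, the larger risk weight is taken (step q3).
    Returns None when preset_groups is empty.
    """
    keywords = set(keyword_group)
    best_count = None
    best_weight = None
    for params, weight in preset_groups:
        # Number of items present in exactly one of the two groups.
        diff_count = len(keywords ^ set(params))
        if best_count is None or diff_count < best_count:
            best_count, best_weight = diff_count, weight
        elif diff_count == best_count:
            best_weight = max(best_weight, weight)
    return best_weight
```

Taking the maximum on ties follows the rule stated above; a fuller implementation would additionally shift the result according to the number and type of the difference data.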
It should be noted that, because there are a plurality of keyword groups and a plurality of preset parameter groups, the difference data are determined one keyword group at a time: a single keyword group is compared with each preset parameter group to determine all of its difference data, and only then is the determination performed for the next keyword group.
In addition, referring to fig. 2, the present invention provides a multi-system association early-warning device. In a first embodiment of the multi-system association early-warning device of the present invention, the multi-system association early-warning device includes:
the receiving module 10 is configured to receive mapping relationships between a plurality of preset parameter sets and risk weights, and transmit the mapping relationships as training samples to a preset learning model to form an event model, where the preset parameter sets include a plurality of data with operational risks;
the extraction module 20 is configured to receive log files uploaded by each system, and extract target data in the log files;
and the early warning module 30 is configured to transmit each piece of target data to the event model, determine whether risk target data with risk exists in the target data, and if so, carry out early warning on each piece of risk target data.
In the multi-system association early warning device of this embodiment, the mapping relations between a plurality of preset parameter sets and risk weights received by the receiving module 10 are transmitted as training samples to a preset learning model for training, so as to form an event model; when the log files uploaded by each system are received, the extraction module 20 extracts the target data in each log file; the early warning module 30 transmits each target data to the event model for risk judgment, and when risk target data with risk exists in the extracted target data, early warning is carried out on the risk target data. In this scheme, the event model is trained on a plurality of training samples carrying the mapping relation between preset parameter sets and risk weights, so that the risk judgment of the risk target data is more accurate; meanwhile, because the target data come from each of the systems, the basis of early warning is derived from a plurality of systems and the early warning data are sufficient; through the association of a plurality of systems, the accuracy and effectiveness of early warning are further improved.
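The cooperation of the three modules can be pictured with the following sketch. It is a deliberately simplified stand-in rather than the device itself: a dictionary replaces the trained event model, each log entry is assumed to be already reduced to a list of keywords, and the class name, method names and default weight of 0 for unknown keyword groups are all hypothetical.

```python
class MultiSystemWarningDevice:
    """Toy composition of receiving, extraction and early warning modules."""

    def __init__(self, threshold):
        self.threshold = threshold  # preset risk early warning value
        self.event_model = {}       # stand-in for the trained event model

    def receive(self, training_samples):
        """Receiving module 10: store parameter-group to risk-weight mappings."""
        for params, weight in training_samples:
            self.event_model[frozenset(params)] = weight

    def extract(self, log_files):
        """Extraction module 20: collect target data from every system's log.

        log_files maps a system name to a list of keyword lists.
        """
        return [
            (system, frozenset(entry))
            for system, entries in log_files.items()
            for entry in entries
        ]

    def warn(self, target_data):
        """Early warning module 30: return the target data judged risky."""
        return [
            (system, keywords)
            for system, keywords in target_data
            if self.event_model.get(keywords, 0) > self.threshold
        ]
```

Because `extract` accepts logs from any number of systems, the judgment in `warn` draws on all of them at once, which is the association across systems that the scheme relies on.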
Further, in another embodiment of the multi-system association early warning device of the present invention, the early warning module includes:
a reading unit, configured to read a plurality of keywords in each of the target data, form keyword groups corresponding to each of the target data, and transmit each of the keyword groups to the event model;
a determining unit, configured to determine target risk weights corresponding to the keyword groups according to the mapping relation between the preset parameter groups and the risk weights in the event model;
and a comparison unit, configured to compare the target risk weights with preset risk early warning values and judge whether risk target data with risks exist in the target data corresponding to the keyword groups.
Further, in another embodiment of the multi-system association early warning device of the present invention, the comparing unit is further configured to:
reading account identifiers in the target data, and classifying the target data with the same account identifier into a target data group of the same account;
determining target keyword groups corresponding to target data in the target data groups of the same account, and weighting target risk weights of the target keyword groups to generate weighted average values corresponding to the target data groups of the same account;
comparing each weighted average value with a preset risk early warning value, and determining early warning risk weights which are larger than the preset risk early warning value in each weighted average value;
and determining a same-account target data group corresponding to each early warning risk weight according to the corresponding relation between each same-account target data group and the weighted average value, and judging target data in each corresponding same-account target data group as risk target data with risk.
Further, in another embodiment of the multi-system association early warning device of the present invention, the early warning module further includes:
the operation unit is used for performing difference operation on each early warning risk weight and the preset risk early warning value to generate an operation result;
and the early warning unit is used for determining the early warning grade of the risk target data corresponding to the generated operation result according to the corresponding relation between the preset result range and the early warning grade, and carrying out early warning corresponding to the early warning grade on the risk target data.
Further, in another embodiment of the multi-system association early warning device of the present invention, the determining unit is further configured to:
judging whether preset parameter groups corresponding to the keyword groups exist in the preset event model or not;
if the preset parameter groups corresponding to the keyword groups exist, executing the step of determining target risk weights corresponding to the keyword groups according to the mapping relation between the preset parameter groups and the risk weights in the event model;
if the preset parameter groups corresponding to the keyword groups do not exist, determining difference data between the keyword groups and the preset parameter groups;
and determining the target risk weight of each keyword group according to each difference data.
Further, in another embodiment of the multi-system association early warning device of the present invention, the determining unit is further configured to:
comparing the parameter number in each piece of difference data corresponding to the keyword group, and determining the least difference data with the least parameter number in each piece of difference data;
determining a preset parameter set corresponding to the least difference data according to the corresponding relation between each difference data and the preset parameter set;
and determining the target risk weight of the keyword group according to the risk weight corresponding to the corresponding preset parameter group.
Further, in another embodiment of the multi-system association early warning device of the present invention, the extraction module includes:
the receiving unit is used for receiving the log files uploaded by each system and reading log data in the log files;
and the screening unit is used for classifying and screening the log data based on a preset regular expression so as to extract target data.
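The screening unit's use of a preset regular expression can be illustrated as follows. The log line layout, the pattern `LOG_PATTERN`, the set `RISKY_ACTIONS` and the function name `screen_log` are all assumptions made for the example; the text does not specify the log format or the expression.

```python
import re

# Hypothetical log line layout:
#   "<timestamp> account=<id> action=<verb> fields=<a; b; c>"
LOG_PATTERN = re.compile(
    r"account=(?P<account>\S+)\s+action=(?P<action>\S+)\s+fields=(?P<fields>.+)$"
)

# Hypothetical set of operations treated as carrying operational risk.
RISKY_ACTIONS = {"send", "export", "download"}


def screen_log(lines):
    """Classify log lines with the regular expression and keep target data."""
    targets = []
    for line in lines:
        match = LOG_PATTERN.search(line.strip())
        if match is None:
            continue  # line does not follow the expected layout
        if match["action"] not in RISKY_ACTIONS:
            continue  # operation is not one of the screened classes
        targets.append({
            "account_id": match["account"],
            "action": match["action"],
            "fields": [f.strip() for f in match["fields"].split(";")],
        })
    return targets
```

Keeping the account identifier in each extracted record is what later allows target data from different systems to be grouped by the same account.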
The virtual function modules of the multi-system association early-warning device are stored in the memory 1005 of the multi-system association early-warning device shown in fig. 3, and when the processor 1001 executes the multi-system association early-warning program, the functions of the modules in the embodiment shown in fig. 2 are implemented.
Referring to fig. 3, fig. 3 is a schematic structural diagram of the device in the hardware running environment involved in the method according to an embodiment of the present invention.
The multi-system association early warning device in the embodiment of the invention may be a PC (personal computer), or a terminal device such as a smart phone, a tablet computer, an electronic book reader or a portable computer.
As shown in fig. 3, the multi-system association early warning device may include: a processor 1001, such as a CPU (Central Processing Unit), a memory 1005, and a communication bus 1002. The communication bus 1002 is used to enable connection and communication between the processor 1001 and the memory 1005. The memory 1005 may be a high-speed RAM (random access memory) or a non-volatile memory, such as a disk memory. The memory 1005 may also optionally be a storage device separate from the processor 1001 described above.
Optionally, the multi-system association early warning device may further include a user interface, a network interface, a camera, an RF (Radio Frequency) circuit, a sensor, an audio circuit, a WiFi (Wireless Fidelity) module, and so on. The user interface may comprise a display and an input unit such as a keyboard, and may optionally further comprise a standard wired interface and a wireless interface. The network interface may optionally include a standard wired interface and a wireless interface (e.g., a WiFi interface).
It will be appreciated by those skilled in the art that the multi-system associated pre-warning device structure shown in fig. 3 is not limiting of the multi-system associated pre-warning device and may include more or fewer components than shown, or may combine certain components, or may be a different arrangement of components.
As shown in fig. 3, an operating system, a network communication module, and a multi-system association early warning program may be included in a memory 1005, which is a computer-readable storage medium. The operating system is a program for managing and controlling hardware and software resources of the multi-system associated early warning device, and supports the operation of the multi-system associated early warning program and other software and/or programs. The network communication module is used for realizing communication among components in the memory 1005 and communication among other hardware and software in the multi-system associated early warning device.
In the multi-system association early-warning device shown in fig. 3, the processor 1001 is configured to execute a multi-system association early-warning program stored in the memory 1005, so as to implement the steps in the embodiments of the multi-system association early-warning method.
The present invention provides a computer readable storage medium storing one or more programs executable by one or more processors for implementing the steps in the embodiments of the multi-system association early warning method described above.
It should also be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising one …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
The foregoing embodiment numbers of the present invention are merely for the purpose of description, and do not represent the advantages or disadvantages of the embodiments.
From the above description of the embodiments, it will be clear to those skilled in the art that the methods of the above embodiments may be implemented by means of software plus a necessary general hardware platform, or by means of hardware, although in many cases the former is the preferred implementation. Based on such understanding, the technical solution of the present invention, in essence or in the part contributing to the prior art, may be embodied in the form of a software product stored in a computer readable storage medium (e.g. ROM/RAM, magnetic disk, optical disk) as described above, comprising instructions for causing a terminal device (which may be a mobile phone, a computer, a server, or a network device, etc.) to perform the method according to the embodiments of the present invention.
The foregoing description is only of the preferred embodiments of the present invention and is not intended to limit the scope of the invention, and all equivalent structural changes made by the specification and drawings of the present invention or direct/indirect application in other related technical fields are included in the scope of the present invention.