CN113076451B - Abnormal behavior identification and risk model library establishment method and device and electronic equipment

Info

Publication number
CN113076451B
Authority
CN
China
Prior art keywords
risk
data
feature
abnormal behavior
categories
Prior art date
Legal status
Active
Application number
CN202010004255.0A
Other languages
Chinese (zh)
Other versions
CN113076451A (en
Inventor
赵俊
王伟杰
许鑫伶
刘钢庭
鲁银冰
李启文
何洋
王丹弘
任姣姣
刘浩明
Current Assignee
China Mobile Communications Group Co Ltd
China Mobile Group Guangdong Co Ltd
China Mobile Hangzhou Information Technology Co Ltd
Original Assignee
China Mobile Communications Group Co Ltd
China Mobile Group Guangdong Co Ltd
China Mobile Hangzhou Information Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by China Mobile Communications Group Co Ltd, China Mobile Group Guangdong Co Ltd, China Mobile Hangzhou Information Technology Co Ltd
Priority to CN202010004255.0A
Publication of CN113076451A
Application granted
Publication of CN113076451B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/906 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/243 Classification techniques relating to the number of classes
    • G06F18/2433 Single-class perspective, e.g. one-against-all classification; Novelty detection; Outlier detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0639 Performance analysis of employees; Performance analysis of enterprise or organisation operations
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/018 Certifying business or products
    • G06Q30/0185 Product, service or business identity fraud
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201 Market modelling; Market analysis; Collecting market data

Abstract

The embodiments of the invention disclose a method and a device for abnormal behavior identification and for establishing a risk model library, and electronic equipment, which address the problem that existing abnormal behavior identification methods are inaccurate. The method comprises: acquiring multiple groups of abnormal behavior feature data of a risk scenario and the corresponding abnormal classification labels; discretizing the multiple groups of abnormal behavior feature data to obtain the risk feature value corresponding to each item of abnormal behavior feature data in the groups; performing attribute reduction on the risk feature categories corresponding to the abnormal behavior feature data in the groups; and establishing a risk model library based on the relationship among the attribute-reduced risk feature category set, the risk feature values and the corresponding abnormal classification labels.

Description

Abnormal behavior identification and risk model library establishment method and device and electronic equipment
Technical Field
The embodiments of the invention relate to the technical field of wireless communication, and in particular to a method and a device for abnormal behavior identification and risk model library establishment, and electronic equipment.
Background
With the development of the Internet of Things (IoT), the behavior of IoT cards also carries security risks, such as telephone fraud, traffic theft, and promotion abuse (so-called wool pulling). To address these problems, the prior art proposes the following methods. The first method provides attribute reduction based on a big data platform and builds a mobile user behavior pattern analysis platform on the open-source Hadoop platform. The second method introduces rough set theory to preprocess basic customer information; taking a churn early-warning model for core customers as an example, it gives a variable precision rough set attribute reduction algorithm and screens out the key variables of the model.
However, when the first method reduces the user behavior feature data, some important data are lost in the reduction process, which lowers the accuracy of identifying abnormal behavior in a risk scenario. The second method ignores the massive volume of data generated in an operator's operations when selecting test data: on the one hand, the small sample of test data lowers the identification accuracy; on the other hand, the method copes poorly and inefficiently with massive test data.
Therefore, a further solution is still needed for accurately identifying abnormal behavior in a risk scenario.
Disclosure of Invention
The embodiments of the invention provide a method and a device for abnormal behavior identification and for establishing a risk model library, and electronic equipment, which address the problem that existing abnormal behavior identification methods are inaccurate.
The embodiment of the invention adopts the following technical scheme:
in a first aspect, a method for establishing a risk feature model library is provided, including:
acquiring a plurality of groups of abnormal behavior feature data of a risk scene and corresponding abnormal classification labels, wherein one group of abnormal behavior feature data comprises a plurality of abnormal behavior feature data;
discretizing the multiple groups of abnormal behavior feature data to obtain risk feature values corresponding to the multiple abnormal behavior feature data in the multiple groups of abnormal behavior feature data;
respectively carrying out attribute reduction on risk feature categories corresponding to a plurality of abnormal behavior feature data in the plurality of sets of abnormal behavior feature data, so that the accuracy of the determined abnormal classification labels meets preset conditions based on the risk feature category set and the corresponding risk feature values after attribute reduction;
and establishing a risk model library based on the relationship among the risk characteristic class set, the risk characteristic value and the corresponding abnormal classification label after attribute reduction.
In a second aspect, there is provided an abnormal behavior recognition method, including:
acquiring target behavior feature data and corresponding risk feature categories in a risk scene, wherein the target behavior feature data comprises a plurality of behavior feature data, and at least one behavior feature data corresponds to one risk feature category;
discretizing a plurality of behavior feature data in the target behavior feature data based on a corresponding relation between a preset numerical interval and risk feature values to obtain a plurality of risk feature values corresponding to the plurality of behavior feature data in the target behavior feature data;
acquiring abnormal behavior labels corresponding to the target behavior feature data from a risk model library based on risk feature categories corresponding to the plurality of risk feature values; the risk model library stores mapping relations among risk characteristic values, abnormal behavior labels and risk characteristic categories;
the number of risk feature categories in the risk model library is smaller than or equal to the number of risk feature categories contained in the target behavior feature data, and the risk feature categories in the risk model library are obtained by attribute reduction based on multiple groups of abnormal behavior feature data in a risk scene and corresponding abnormal behavior labels.
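The lookup in the second aspect can be sketched as follows. This is a minimal illustration, assuming the risk model library is held as a dictionary keyed by tuples of risk feature values; the names `LIBRARY` and `identify`, the category names, and the stored values are hypothetical and do not come from the source.

```python
# Hypothetical in-memory risk model library. "categories" is the reduced set
# of risk feature categories; "rules" maps a tuple of risk feature values
# (in category order) to the abnormal behavior labels seen for it.
LIBRARY = {
    "categories": ("roaming_places", "contacts"),
    "rules": {
        (2, 1): {"too many roaming places"},
        (0, 2): {"too many contacts"},
    },
}


def identify(target, library=LIBRARY):
    """Look up the abnormal behavior label(s) for discretized target data.

    `target` maps risk feature category -> risk feature value. It may hold
    more categories than the library, which keeps only the reduced set.
    """
    key = tuple(target[c] for c in library["categories"])
    # An empty set means no stored rule matches this combination.
    return library["rules"].get(key, set())
```

Because the library holds no more categories than the target data, the extra categories in `target` are simply ignored during the lookup.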
In a third aspect, there is provided an apparatus for establishing a risk feature model library, including:
the processor is used for acquiring a plurality of groups of abnormal behavior feature data of the risk scene and corresponding abnormal classification labels, wherein one group of abnormal behavior feature data comprises a plurality of abnormal behavior feature data; discretizing the multiple groups of abnormal behavior feature data to obtain risk feature values corresponding to the multiple abnormal behavior feature data in the multiple groups of abnormal behavior feature data; respectively carrying out attribute reduction on risk feature categories corresponding to a plurality of abnormal behavior feature data in the plurality of sets of abnormal behavior feature data, so that the accuracy of the determined abnormal classification labels meets preset conditions based on the risk feature category set and the corresponding risk feature values after attribute reduction; and establishing a risk model library based on the relationship among the risk characteristic class set, the risk characteristic value and the corresponding abnormal classification label after attribute reduction.
In a fourth aspect, there is provided an abnormal behavior recognition apparatus including:
a processor, configured to: acquire target behavior feature data and corresponding risk feature categories in a risk scenario, wherein the target behavior feature data comprises a plurality of behavior feature data, and at least one behavior feature data corresponds to one risk feature category; discretize the plurality of behavior feature data in the target behavior feature data based on a preset correspondence between numerical intervals and risk feature values, to obtain a plurality of risk feature values corresponding to the plurality of behavior feature data; and acquire, from a risk model library, the abnormal behavior label corresponding to the target behavior feature data based on the risk feature categories corresponding to the plurality of risk feature values; wherein the risk model library stores mapping relations among risk feature values, abnormal behavior labels and risk feature categories; the number of risk feature categories in the risk model library is smaller than or equal to the number of risk feature categories contained in the target behavior feature data; and the risk feature categories in the risk model library are obtained by attribute reduction based on multiple groups of abnormal behavior feature data in a risk scenario and the corresponding abnormal behavior labels.
In a fifth aspect, there is provided an electronic device comprising:
a memory storing computer program instructions;
a processor, wherein the computer program instructions, when executed by the processor, implement the method of building a risk feature model library as described in the first aspect.
In a sixth aspect, a computer readable storage medium is provided,
the computer readable storage medium comprises instructions which, when run on a computer, cause the computer to perform the method of building a risk feature model library as described in the first aspect.
In a seventh aspect, there is provided an electronic device comprising:
a memory storing computer program instructions;
a processor, wherein the computer program instructions, when executed by the processor, implement the abnormal behavior identification method as described in the second aspect.
In an eighth aspect, a computer-readable storage medium is provided,
the computer readable storage medium comprises instructions which, when run on a computer, cause the computer to perform the abnormal behavior identification method as described in the second aspect.
The above at least one technical scheme adopted by the embodiment of the invention can achieve the following beneficial effects:
when the risk feature mapping library is constructed, the embodiment of the invention can discretize a plurality of behavior feature data in the acquired plurality of types of abnormal behavior feature data to obtain risk feature values corresponding to the plurality of abnormal behavior feature data in the plurality of groups of abnormal behavior feature data; the attribute reduction can be carried out on risk feature categories corresponding to a plurality of abnormal behavior feature data in a plurality of groups of abnormal behavior feature data respectively, so that the accuracy of the determined abnormal behavior label meets the preset precision based on the plurality of risk feature categories and the corresponding risk feature values after the attribute reduction; and finally, constructing a risk feature mapping library based on the relationships among the attribute reduced feature categories, the risk feature values and the abnormal behavior labels.
Because attribute reduction removes, before the risk feature mapping library is constructed, only those risk feature categories that do not affect the accuracy of the abnormal classification labels (namely the abnormal behavior labels), the accuracy of determining the abnormal classification labels from the reduced risk feature categories is preserved. In addition, because some risk feature categories are removed, the amount of computation needed to determine the abnormal classification labels is reduced.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this application, illustrate embodiments of the application and together with the description serve to explain the application and do not constitute an undue limitation to the application. In the drawings:
FIG. 1 is a schematic implementation flowchart of a method for establishing a risk model library according to an embodiment of the present disclosure;
FIG. 2 is a user behavior portrait constructed from a group of abnormal behavior feature data according to an embodiment of the present disclosure;
FIG. 3 is a schematic diagram of a framework to which the method for establishing a risk model library according to an embodiment of the present disclosure is applied;
FIG. 4 is a schematic flowchart of an abnormal behavior identification method according to an embodiment of the present disclosure;
FIG. 5 is a schematic structural diagram of a device for establishing a risk model library according to an embodiment of the present disclosure;
FIG. 6 is a schematic structural diagram of an abnormal behavior identification device according to an embodiment of the present disclosure;
FIG. 7 is a schematic hardware structure diagram of an electronic device according to an embodiment of the present disclosure;
FIG. 8 is a schematic hardware structure diagram of another electronic device according to an embodiment of the present disclosure.
Detailed Description
For the purposes of making the objects, technical solutions and advantages of the present application more apparent, the technical solutions of the present application will be clearly and completely described below with reference to specific embodiments of the present application and corresponding drawings. It will be apparent that the described embodiments are only some, but not all, of the embodiments of the present application. All other embodiments, which can be made by one of ordinary skill in the art without undue burden from the present disclosure, are intended to be within the scope of the present application based on the embodiments herein.
To address the problem that existing abnormal behavior identification methods are not accurate enough, the embodiments of this specification provide a method for establishing a risk model library. The method provided by the embodiments of this specification may be executed by, but is not limited to, a distributed file system.
Specifically, an implementation flow diagram of a method for establishing a risk model library provided in one or more embodiments of the present disclosure is shown in fig. 1, and includes:
step 110, acquiring a plurality of groups of abnormal behavior feature data of a risk scene and corresponding abnormal classification labels, wherein one group of abnormal behavior feature data comprises a plurality of abnormal behavior feature data;
It should be understood that, with the rapid development of Internet of Things (IoT) technology, the management and control of IoT cards has exposed a series of security problems. To address this, the embodiments of this specification may acquire multiple groups of abnormal behavior feature data of IoT cards in a risk scenario, together with the corresponding abnormal classification labels. Each group comprises a plurality of abnormal behavior feature data, such as the user's call records, traffic usage records, and information about the device in which the IoT card is used. The abnormal classification label may specifically be the reason the IoT card was deactivated after abnormal behavior was determined, for example, the number of roaming places exceeding a set number, roaming in a high-risk place, or the number of contacts exceeding a set number.
FIG. 2 shows a behavior portrait of an IoT card constructed from the abnormal behavior feature data of the IoT card according to an embodiment of this specification. The behavior portrait comprises a plurality of risk feature categories, including device usage, number of calls made on the current day, number of roaming places, short message reception, location information, traffic distribution, call duration, roaming in high-risk areas, whether roaming exists, number of contacts, number of contact locations, night-time traffic usage, and the like.
Step 120, performing discretization processing on the multiple groups of abnormal behavior feature data respectively to obtain risk feature values corresponding to the multiple abnormal behavior feature data in the multiple groups of abnormal behavior feature data;
Optionally, the values of behavior feature data are generally continuous; for example, the maximum traffic usage in the last month and the number of short messages received are treated as continuous variables, and such variables would consume a huge amount of computation in subsequent data processing. To simplify the calculation, the embodiments of this specification may first discretize the multiple groups of abnormal behavior feature data. Specifically, discretizing the multiple groups of abnormal behavior feature data to obtain the risk feature values corresponding to the abnormal behavior feature data in the groups includes:
based on a plurality of numerical intervals, acquiring the target numerical interval into which each item of abnormal behavior feature data in the multiple groups falls;
and acquiring the risk feature value corresponding to each target numerical interval based on the correspondence between the numerical intervals and the risk feature values.
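The two discretization steps above can be sketched as an interval lookup. This is a minimal sketch, assuming each risk feature category has a sorted list of interval boundaries and one risk feature value per interval; the boundaries, values, and function names below are illustrative assumptions, not figures from the source.

```python
from bisect import bisect_right

# Hypothetical interval boundaries and risk feature values for one risk
# feature category (for example, number of distinct IMEIs seen in a day).
BOUNDARIES = [2, 5]        # intervals: [0, 2), [2, 5), [5, +inf)
RISK_VALUES = [0, 1, 2]    # one risk feature value per interval


def discretize(value, boundaries=BOUNDARIES, risk_values=RISK_VALUES):
    """Map a raw behavior feature value to its risk feature value."""
    if len(risk_values) != len(boundaries) + 1:
        raise ValueError("need exactly one risk value per interval")
    # bisect_right gives the index of the interval that contains `value`.
    return risk_values[bisect_right(boundaries, value)]


def discretize_group(group, tables):
    """Discretize one group of feature data given as {category: raw value}.

    `tables` maps each category to its (boundaries, risk_values) pair.
    """
    return {
        category: discretize(raw, *tables[category])
        for category, raw in group.items()
    }
```

For example, `discretize(3)` falls into the interval [2, 5) and returns the risk feature value 1.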
Table 1: Discretization results for abnormal behavior feature data of IoT card devices
As shown in Table 1, an example of the results of discretizing the abnormal behavior feature data of IoT card devices provided in the embodiments of this specification includes three risk feature categories: the number of IMEIs (International Mobile Equipment Identity, the device serial number) appearing on the current day, whether the device and the card have been separated in the past, and whether the IMEI appears on the current day, together with the corresponding risk levels and discretization results. The values in the discretization results are the risk feature values in the embodiments of this specification.
Table 2: Mapping between risk feature categories and abnormal behavior classification labels
As shown in Table 2, which is a mapping table between risk feature categories and abnormal behavior classification labels provided by the embodiments of this specification, $x_1$ to $x_n$ are n IoT cards that were shut down during a historical period because of abnormal behavior, and $c_1$ to $c_m$ are the m risk feature categories of these n IoT cards. Each corresponding entry is a risk feature value, and the decision column gives the abnormal classification label for each IoT card number.
In general, the number n of IoT cards is in the thousands, and not all of the m initially extracted risk feature categories are related to the abnormal classification labels. Redundant risk feature categories therefore need to be deleted from the mapping table, reducing the dimensionality of the behavior features. Drawing on big-data ideas, the embodiments of this specification provide a variable precision attribute reduction algorithm for IoT card data, which deletes redundant risk feature categories and improves the efficiency of abnormal behavior identification while maintaining the measurement precision.
Step 130, respectively performing attribute reduction on risk feature categories corresponding to a plurality of abnormal behavior feature data in a plurality of sets of abnormal behavior feature data, so that the accuracy of the determined abnormal classification label meets a preset condition based on the risk feature category set and the corresponding risk feature value after attribute reduction;
It should be appreciated that the distributed file system described above may include a plurality of data nodes.
Performing attribute reduction on the risk feature categories corresponding to the abnormal behavior feature data in the multiple groups, so that the accuracy of the abnormal classification labels determined from the attribute-reduced risk feature category set and the corresponding risk feature values meets the preset condition, comprises the following steps:
equally dividing the feature values corresponding to the abnormal behavior feature data in the multiple groups into N data blocks and numbering the N data blocks, where N is a positive integer greater than or equal to 3 and the N data blocks comprise data block $X_1$ to data block $X_N$;
halving each of the N data blocks in numbering order to obtain 2*N data blocks, where the 2*N data blocks comprise data blocks $X_{1,1}$, $X_{1,2}$, ..., $X_{N,1}$, $X_{N,2}$;
assigning data blocks $X_{1,1}$, $X_{2,1}$ and $X_{i,1}$ of the 2*N data blocks to the first data node, and assigning data blocks $X_{1,2}$, $X_{2,2}$ and $X_{i,2}$ of the 2*N data blocks to the second data node, where $i \in [3, N]$;
assigning whichever of data block $X_{a,1}$ and data block $X_{a,2}$ satisfies a specified condition, together with data block $X_{k,1}$ and data block $X_{k,2}$, to the k-th data node, where $a \in [1, N]$, $a \neq k$, and k is a positive integer greater than or equal to 3; the specified condition may specifically be that when a is odd, data block $X_{a,1}$ is assigned to the k-th data node, and when a is even, data block $X_{a,2}$ is assigned to the k-th data node.
The data distribution method not only ensures the knowledge integrity of the characteristics of the Internet of things card in the reduction process, but also reduces the data volume on each data node to be
Attribute reduction is then performed separately on the data allocated to each of the plurality of data nodes, so that the accuracy of the abnormal classification labels determined from the attribute-reduced risk feature category set and the corresponding risk feature values meets the preset condition.
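The block allocation described above can be sketched as follows. This is a minimal sketch that assumes one data node per data block (nodes 1 to N), which the text does not state explicitly; the helper names `split_evenly` and `distribute` are hypothetical, and blocks are represented as plain Python lists rather than files in a distributed file system.

```python
def split_evenly(rows, parts):
    """Split `rows` into `parts` contiguous chunks of near-equal size."""
    size, extra = divmod(len(rows), parts)
    chunks, start = [], 0
    for p in range(parts):
        end = start + size + (1 if p < extra else 0)
        chunks.append(rows[start:end])
        start = end
    return chunks


def distribute(rows, n_blocks):
    """Return {node number: [((block, half), rows), ...]}.

    Assumption: there are exactly N data nodes, numbered 1..N.
    """
    if n_blocks < 3:
        raise ValueError("N must be at least 3")
    # halves[a] = (X_{a,1}, X_{a,2}) for block number a in 1..N
    halves = {
        a: tuple(split_evenly(block, 2))
        for a, block in enumerate(split_evenly(rows, n_blocks), start=1)
    }
    nodes = {
        1: [((a, 1), halves[a][0]) for a in halves],  # all first halves
        2: [((a, 2), halves[a][1]) for a in halves],  # all second halves
    }
    for k in range(3, n_blocks + 1):
        assigned = []
        for a in halves:
            if a == k:
                # Node k keeps both halves of its own block.
                assigned.append(((k, 1), halves[k][0]))
                assigned.append(((k, 2), halves[k][1]))
            elif a % 2 == 1:
                assigned.append(((a, 1), halves[a][0]))  # odd a: first half
            else:
                assigned.append(((a, 2), halves[a][1]))  # even a: second half
        nodes[k] = assigned
    return nodes
```

Under these assumptions every node sees at least one half of every block, so no block is entirely absent from any node, while each node holds only part of the full data set.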
Optionally, to prevent attribute reduction from removing necessary risk feature categories and thereby harming the accuracy of abnormal behavior identification, the embodiments of this specification may set a preset condition in advance, specifically an expected range of measurement precision, and perform attribute reduction on the risk feature categories under that constraint. Specifically, performing attribute reduction on the data allocated to the plurality of data nodes, so that the accuracy of the abnormal classification labels determined from the attribute-reduced risk feature category set and the corresponding risk feature values meets the preset condition, comprises the following steps:
initializing a specified number of measurement precisions, where a measurement precision is the accuracy of the abnormal classification labels determined from the reduced risk feature categories and the corresponding risk feature values;
cyclically executing a first operation, a second operation and a third operation, in that order, for the plurality of risk feature categories;
determining, from the attribute-reduced risk feature category sets and the corresponding risk feature values, a risk feature category set whose measurement precision meets the preset condition;
wherein the first operation comprises:
assigning 1 to k1, and selecting a plurality of groups of risk feature categories from the plurality of risk feature categories, where one group of risk feature categories comprises k1 risk feature categories;
determining a plurality of first measurement precisions corresponding to the plurality of groups of risk feature categories, based on the k1 risk feature categories included in each group;
sorting the first measurement precisions in descending order of value, and selecting the specified number of top-ranked first measurement precisions from the plurality of first measurement precisions;
when k1 is a positive integer less than or equal to the number of the plurality of risk feature categories, adding one to k1;
the second operation comprises:
assigning k1+1 to k2, and selecting a plurality of groups of risk feature categories from the plurality of risk feature categories, where one group of risk feature categories comprises k2 risk feature categories;
determining a plurality of second measurement precisions corresponding to the plurality of groups of risk feature categories, based on the k2 risk feature categories included in each group;
sorting the second measurement precisions in descending order of value, and selecting the specified number of top-ranked second measurement precisions from the plurality of second measurement precisions;
when k2 is a positive integer less than or equal to the number of the plurality of risk feature categories, adding one to k2;
the third operation comprises:
selecting the specified number of largest values from among the specified number of first measurement precisions and the specified number of second measurement precisions, and assigning them to the specified number of measurement precisions.
Take a specified number of 6 as an example. It should be understood that the specified number may be determined according to the number of measurement precisions that actually fall within a specified range after attribute reduction, or may be set manually. The specific steps of the attribute reduction algorithm are as follows:
(1) Initialize the specified number of measurement precisions, i.e., let P_max1 = 0, ..., P_max6 = 0;
(2) Let k1 = 1;
(3) Let [formula omitted], that is, sequentially extract one risk feature category from the m risk feature categories shown in Table 2 to obtain m groups of risk feature categories, each group containing one risk feature category; then determine, through the formula [formula omitted], the measurement precision corresponding to each of the m groups, where |U| is the sum of the number of risk feature categories and the number of abnormal behavior labels, and [symbol omitted] is d_i, a non-empty finite set of abnormal behavior labels;
(4) Arrange the measurement precisions pval corresponding to the m groups of risk feature categories from large to small, and assign the 6 highest values to P_max1, ..., P_max6 respectively;
(5) Let k2 = k1 + 1;
(6) Let [formula omitted], that is, combine the m groups of risk feature categories respectively with (k1 + 1) risk feature categories extracted from the m risk feature categories to obtain [number omitted] groups of risk feature categories; then determine, through the formula [formula omitted], the measurement precision corresponding to each of these groups;
(7) Arrange the calculated measurement precisions pval corresponding to the [number omitted] groups of risk feature categories from large to small, and assign the 6 highest values to P_max1, ..., P_max6 respectively;
(8) Compare the measurement precisions calculated in steps (4) and (7), and select the group of risk feature categories with the larger measurement precision;
(9) k2++;
(10) End the loop when the value of k2 is greater than or equal to m+1;
(11) Output the 6 highest precision values {P_max1, ..., P_max6} and the corresponding risk feature category sets.
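The loop in steps (1) to (11) can be sketched as follows. This is only an illustrative sketch under stated assumptions: the formula for the measurement precision is not legible in this text, so the sketch substitutes the rough-set dependency degree (the share of records whose values on the selected categories determine the label unambiguously), computed over the number of records. The names `measurement_precision` and `reduce_attributes` and the sample data are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def measurement_precision(rows, labels, features):
    """ASSUMED precision measure (rough-set dependency degree), not the
    patent's own formula: fraction of records whose values on `features`
    map to exactly one abnormal behavior label."""
    seen_labels = defaultdict(set)
    counts = defaultdict(int)
    for row, label in zip(rows, labels):
        key = tuple(row[f] for f in features)
        seen_labels[key].add(label)
        counts[key] += 1
    consistent = sum(n for key, n in counts.items() if len(seen_labels[key]) == 1)
    return consistent / len(rows)


def reduce_attributes(rows, labels, top_n=6):
    """For k = 1..m, score every group of k categories and keep the
    top_n (precision, group) pairs seen so far."""
    m = len(rows[0])
    best = []  # initialisation: no precision recorded yet
    for k in range(1, m + 1):
        scored = [
            (measurement_precision(rows, labels, group), group)
            for group in combinations(range(m), k)
        ]
        # Stable sort: on ties, earlier (smaller) groups stay in front.
        best = sorted(best + scored, key=lambda item: item[0], reverse=True)[:top_n]
    return best


# Hypothetical discretized records: three risk feature categories per record.
rows = [(0, 0, 1), (0, 1, 1), (1, 0, 0), (1, 1, 0)]
labels = ["normal", "normal", "abnormal", "abnormal"]
top = reduce_attributes(rows, labels, top_n=2)
print(top)
```

Enumerating every group of every size grows exponentially with m, which is one reason a distributed execution over data nodes may be attractive for large numbers of categories.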
Table 3: Combined risk feature categories of the Internet of Things card after attribute reduction
As shown in table 3, the risk feature class table is obtained by summarizing and combining the risk feature class sets corresponding to the 6 measurement accuracies. Each risk feature category may correspond to a different degree or a different category of behavior. For example, the current day voice usage may include 6 types of actions, namely calling frequency, call duration, contacts, etc., each of which may be represented by a letter.
Step 140, establishing a risk model library based on the relationship among the risk feature category set after attribute reduction, the risk feature values and the corresponding abnormal classification labels.
Optionally, to facilitate monitoring and identifying abnormal behavior, embodiments of the present description may build a risk model library in advance. Specifically, establishing the risk model library based on the relationship among the risk feature category set after attribute reduction, the risk feature values and the corresponding abnormal classification labels comprises the following steps:
acquiring risk feature categories, corresponding risk feature values and corresponding abnormal behavior labels after attribute reduction from a plurality of groups of abnormal behavior feature data;
And correlating the multiple risk feature categories with the corresponding abnormal behavior labels and risk feature values after attribute reduction, and establishing a risk model library.
Table 4: Risk model library for the Internet of Things card
Table 4 shows a risk model library of the Internet of Things card established according to the embodiment of the present specification, where τ_1 to τ_10 are the 10 risk feature categories retained after attribute reduction, Decision is the corresponding abnormal classification label, and Level is the corresponding risk level. The risk level can be set through manual, experience-based evaluation or through evaluation by an artificial intelligence model.
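As a minimal sketch of the association step, the risk model library can be pictured as a mapping from the risk feature values of the retained categories to a Decision label and a Level. The function name `build_risk_model_library`, the category names `tau1` to `tau3`, and the sample records are hypothetical and serve only for illustration; they are not taken from Table 4.

```python
def build_risk_model_library(records, reduced_categories):
    """Associate the risk feature values of the retained categories with
    the abnormal classification label (Decision) and risk level (Level).

    records: iterable of (feature_values: dict, decision: str, level: str)
    reduced_categories: category names kept after attribute reduction
    """
    library = {}
    for feature_values, decision, level in records:
        # Only the categories that survived attribute reduction form the key.
        key = tuple(feature_values[c] for c in reduced_categories)
        library[key] = {"Decision": decision, "Level": level}
    return library


# Hypothetical discretized records (letters stand for risk feature values).
records = [
    ({"tau1": "a", "tau2": "b", "tau3": "c"}, "abnormal voice usage", "high"),
    ({"tau1": "b", "tau2": "b", "tau3": "a"}, "normal", "low"),
]
library = build_risk_model_library(records, ["tau1", "tau3"])
print(library[("a", "c")])
```

Keying on a tuple of values keeps lookups constant-time; if two records share a key but disagree on the label, this sketch simply keeps the last one, which a real system would need to handle explicitly.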
Fig. 3 is a schematic framework diagram of a method for establishing a risk model library according to an embodiment of the present disclosure, including:
S310, acquiring a massive historical Internet of Things card data set;
S320, performing feature extraction and discretization on the acquired Internet of Things card data set to obtain a plurality of corresponding risk feature categories and risk feature values;
S330, establishing an information system, that is, a table of correspondences among the risk feature categories, the risk feature values and the abnormal classification labels, as shown in Table 2;
S340, distributing the data to data nodes in a file system, wherein the data nodes hold Block 1 to Block N, which correspond to Mapper 1 to Mapper N;
S350, performing attribute reduction on the extracted plurality of risk feature categories;
S360, determining the 6 risk feature category sets with the highest measurement accuracy;
S370, establishing a risk model library, grading the risks and determining risk levels;
S380, monitoring and identifying abnormal behavior of online Internet of Things cards.
When the risk feature mapping library is constructed, the embodiment of the invention can discretize a plurality of behavior feature data in the acquired plurality of types of abnormal behavior feature data to obtain risk feature values corresponding to the plurality of abnormal behavior feature data in the plurality of groups of abnormal behavior feature data; the attribute reduction can be carried out on risk feature categories corresponding to a plurality of abnormal behavior feature data in a plurality of groups of abnormal behavior feature data respectively, so that the accuracy of the determined abnormal behavior label meets the preset precision based on the plurality of risk feature categories and the corresponding risk feature values after the attribute reduction; and finally, constructing a risk feature mapping library based on the relationships among the attribute reduced feature categories, the risk feature values and the abnormal behavior labels.
Because the risk feature categories that do not affect the accuracy of the abnormal classification labels (namely, the abnormal behavior labels) are removed by attribute reduction before the risk feature mapping library is constructed, on the one hand the attribute reduction does not impair the accuracy with which the abnormal classification labels are determined, and on the other hand the amount of calculation needed to determine the abnormal classification labels decreases because some of the risk feature categories have been removed.
Fig. 4 is a flowchart of an abnormal behavior recognition method according to an embodiment of the present disclosure, including:
step 410, obtaining target behavior feature data and a corresponding risk feature class in a risk scene, wherein the target behavior feature data comprises a plurality of behavior feature data, and at least one behavior feature data corresponds to one risk feature class;
step 420, performing discretization processing on a plurality of behavior feature data in the target behavior feature data based on a corresponding relation between a preset numerical interval and risk feature values, so as to obtain a plurality of risk feature values corresponding to the plurality of behavior feature data in the target behavior feature data;
step 430, acquiring abnormal behavior labels corresponding to the target behavior feature data from a risk model library based on risk feature categories corresponding to the plurality of risk feature values; the risk model library stores mapping relations among risk characteristic values, abnormal behavior labels and risk characteristic categories;
the number of risk feature categories in the risk model library is smaller than or equal to the number of risk feature categories contained in the target behavior feature data, and the risk feature categories in the risk model library are obtained by attribute reduction based on multiple groups of abnormal behavior feature data in a risk scene and corresponding abnormal behavior labels.
Optionally, based on risk feature categories corresponding to the multiple risk feature values, acquiring an abnormal behavior tag corresponding to the target behavior feature data from the risk model library includes:
determining a target risk characteristic value matched with the target behavior characteristic data from the corresponding relation between the risk characteristic class and the risk characteristic value in the risk model library;
and determining the abnormal classification label corresponding to the target risk characteristic value based on the corresponding relation among the risk characteristic category, the risk characteristic value and the abnormal classification label in the risk model library.
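The lookup described above can be sketched as follows, assuming (hypothetically) that the risk model library is held as a mapping from tuples of risk feature values to labels. The names `identify`, `library_categories` and the sample entries are illustrative assumptions, not part of the described method.

```python
def identify(target_values, library_categories, library):
    """Return the abnormal behavior label for the target data, or None
    when no entry of the risk model library matches.

    target_values: dict of category name -> discretized risk feature value
    library_categories: categories kept in the library (after reduction)
    library: dict mapping value tuples to abnormal behavior labels
    """
    # The target may carry more categories than the library; project it
    # onto the library's categories before matching.
    key = tuple(target_values[c] for c in library_categories)
    return library.get(key)


# Hypothetical library with two retained categories.
library_categories = ["tau1", "tau3"]
library = {("a", "c"): "abnormal voice usage", ("b", "a"): "normal"}

target = {"tau1": "a", "tau2": "b", "tau3": "c"}
print(identify(target, library_categories, library))
```

Because the library holds no more categories than the target data, the projection step discards the categories that attribute reduction removed.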
Because the risk feature categories that do not affect the accuracy of the abnormal classification labels (namely, the abnormal behavior labels) are removed by attribute reduction before the risk feature mapping library is constructed, on the one hand the attribute reduction does not impair the accuracy with which the abnormal classification labels are determined, and on the other hand the amount of calculation needed to determine the abnormal classification labels decreases because some of the risk feature categories have been removed.
Fig. 5 is a schematic structural diagram of an apparatus 500 for creating a risk model library according to an embodiment of the present disclosure. In a software implementation, the risk model library creating device 500 may include a processor 501, where:
The processor 501 is configured to obtain multiple sets of abnormal behavior feature data of a risk scene and corresponding abnormal classification tags, where a set of abnormal behavior feature data includes multiple abnormal behavior feature data; discretizing the multiple groups of abnormal behavior feature data to obtain risk feature values corresponding to the multiple abnormal behavior feature data in the multiple groups of abnormal behavior feature data; respectively carrying out attribute reduction on risk feature categories corresponding to a plurality of abnormal behavior feature data in the plurality of sets of abnormal behavior feature data, so that the accuracy of the determined abnormal classification labels meets preset conditions based on the risk feature category set and the corresponding risk feature values after attribute reduction; and establishing a risk model library based on the relationship among the risk characteristic class set, the risk characteristic value and the corresponding abnormal classification label after attribute reduction.
Optionally, in one embodiment, the processor 501 is configured to:
respectively performing attribute reduction on risk feature categories corresponding to a plurality of abnormal behavior feature data in the plurality of sets of abnormal behavior feature data, so that the determined accuracy of the abnormal classification label meets preset conditions based on the risk feature categories and the corresponding risk feature values after attribute reduction, and the method comprises the following steps:
Equally dividing the feature values corresponding to the plurality of abnormal behavior feature data in the plurality of groups of abnormal behavior feature data into N data blocks and numbering the N data blocks, wherein N is a positive integer greater than or equal to 3, and the N data blocks comprise data block X_1 to data block X_N;
Halving each of the N data blocks in the order of their numbers to obtain 2*N data blocks, wherein the 2*N data blocks comprise data block X_{1,1}, data block X_{1,2}, ..., data block X_{N,1} and data block X_{N,2};
Assigning data block X_{1,1}, data block X_{2,1} and data block X_{i,1} of the 2*N data blocks to the first data node, and assigning data block X_{1,2}, data block X_{2,2} and data block X_{i,2} of the 2*N data blocks to the second data node, where i ∈ [3, N];
Assigning whichever of data block X_{a,1} and data block X_{a,2} satisfies a specified condition, together with data block X_{k,1} and data block X_{k,2}, to the k-th data node, where a ∈ [1, N], a ≠ k, and k is a positive integer greater than or equal to 3;
and respectively carrying out attribute reduction on the data distributed in the plurality of data nodes so that the accuracy of the determined abnormal classification labels meets the preset condition based on the risk characteristic class set and the corresponding risk characteristic values after attribute reduction.
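One possible reading of this block distribution is sketched below. The "specified condition" is not defined in this passage, so it is passed in as a hypothetical predicate `meets_condition`; the sketch also assumes that k runs from 3 to N and that, for every a ≠ k, the first half satisfying the predicate is chosen. The function names are illustrative.

```python
def split_equally(items, parts):
    """Split `items` into `parts` contiguous blocks of near-equal size."""
    size, extra = divmod(len(items), parts)
    blocks, start = [], 0
    for i in range(parts):
        end = start + size + (1 if i < extra else 0)
        blocks.append(items[start:end])
        start = end
    return blocks


def distribute(values, n, meets_condition):
    """Return {node number: list of data blocks} under the assumptions above."""
    if n < 3:
        raise ValueError("N must be at least 3")
    # halves[a - 1] holds [X_{a,1}, X_{a,2}] for data block X_a.
    halves = [split_equally(block, 2) for block in split_equally(values, n)]
    nodes = {
        1: [pair[0] for pair in halves],  # X_{1,1}, X_{2,1}, ..., X_{N,1}
        2: [pair[1] for pair in halves],  # X_{1,2}, X_{2,2}, ..., X_{N,2}
    }
    for k in range(3, n + 1):
        node = list(halves[k - 1])  # X_{k,1} and X_{k,2}
        for a, pair in enumerate(halves, start=1):
            if a == k:
                continue
            # ASSUMPTION: take the first half of X_a that meets the condition.
            chosen = next((half for half in pair if meets_condition(half)), None)
            if chosen is not None:
                node.append(chosen)
        nodes[k] = node
    return nodes


# Hypothetical condition: the block contains a value of at least 3.
nodes = distribute(list(range(12)), 3, lambda block: max(block) >= 3)
print(nodes[3])
```

Under this reading, the first two nodes each see one half of every block, while each further node sees one whole block plus selected halves of the others.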
Optionally, in one embodiment, the processor 501 is configured to:
initializing a specified number of measurement accuracies, wherein the measurement accuracies are the accuracy of the abnormal classification labels determined based on the reduced risk characteristic categories and the corresponding risk characteristic values;
sequentially and circularly executing a first operation, a second operation and a third operation aiming at the multiple risk feature categories;
determining a risk feature class set with the measurement precision meeting a preset condition from the risk feature class set with the reduced attribute and the corresponding risk feature value;
wherein the first operation comprises:
assigning k1 to 1, and selecting a plurality of groups of feature categories from the plurality of risk feature categories, wherein one group of risk feature categories comprises k1 risk feature categories;
determining a plurality of first measurement accuracies corresponding to the plurality of sets of risk feature categories based on k1 risk feature categories included in each of the plurality of sets of risk feature categories;
arranging the first measurement precision according to the magnitude of the numerical value from large to small, and selecting the first measurement precision of the designated number which is ranked at the front from the plurality of first measurement precision;
when k1 is a positive integer less than or equal to the number of the plurality of risk feature categories, adding one to k1;
the second operation includes:
assigning k2 to be k1+1, and selecting a plurality of groups of risk feature categories from the plurality of risk feature categories, wherein one group of risk feature categories comprises k2 feature categories;
determining a plurality of second measurement accuracies corresponding to the plurality of sets of risk feature categories based on k2 risk feature categories included in each of the plurality of sets of risk feature categories;
arranging the second measurement precision according to the value from big to small, and selecting the second measurement precision of the designated number which is ranked at the front from the plurality of second measurement precision;
when k2 is a positive integer less than or equal to the number of the plurality of risk feature categories, adding one to k2;
the third operation includes:
and selecting the first measurement precision of the specified number and the second measurement precision of the specified number, selecting the measurement precision of the specified number which is ranked at the front, and assigning the measurement precision of the specified number.
Optionally, in one embodiment, the processor 501 is configured to:
Acquiring the risk characteristic category, the corresponding risk characteristic value and the corresponding abnormal behavior label after attribute reduction from the plurality of groups of abnormal behavior characteristic data;
and associating the attribute-reduced multiple risk feature categories with the corresponding abnormal behavior labels and risk feature values, and establishing the risk model library.
Optionally, in one embodiment, the processor 501 is configured to:
based on a plurality of numerical intervals, respectively acquiring target numerical intervals corresponding to a plurality of abnormal behavior characteristic data in the plurality of sets of abnormal behavior characteristic data;
and acquiring the risk characteristic value corresponding to the target numerical value interval based on the corresponding relation between the numerical value intervals and the risk characteristic value.
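The interval-based discretization can be illustrated with a short sketch. The cut points and the letter codes below are hypothetical; the sketch assumes half-open intervals whose lower bound is inclusive, which this passage does not specify.

```python
from bisect import bisect_right


def discretize(value, boundaries, risk_values):
    """Map a raw behavior feature value to its risk feature value.

    boundaries: sorted cut points separating the numerical intervals
    risk_values: one risk feature value per interval, so it must hold
                 len(boundaries) + 1 entries
    """
    if len(risk_values) != len(boundaries) + 1:
        raise ValueError("need exactly one risk value per interval")
    # bisect_right returns the index of the interval containing `value`.
    return risk_values[bisect_right(boundaries, value)]


# Hypothetical intervals for daily call count: <10, 10 to 49, >=50.
boundaries = [10, 50]
risk_values = ["a", "b", "c"]
print(discretize(5, boundaries, risk_values))
print(discretize(10, boundaries, risk_values))
print(discretize(120, boundaries, risk_values))
```

Binary search keeps the lookup logarithmic in the number of intervals, which matters little for a handful of intervals but keeps the sketch uniform across features.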
The risk model library establishing device 500 can implement the methods of the method embodiments of Figs. 1 to 3; for details, refer to the risk model library establishing method of the embodiments shown in Figs. 1 to 3, which is not repeated here.
Fig. 6 is a schematic structural diagram of an abnormal behavior recognition apparatus 600 according to an embodiment of the present disclosure. In a software implementation, the abnormal behavior recognition apparatus 600 may include a processor 601, wherein:
a processor 601, configured to obtain target behavior feature data and a corresponding risk feature class in a risk scenario, where the target behavior feature data includes a plurality of behavior feature data, and at least one behavior feature data corresponds to one risk feature class; discretizing a plurality of behavior feature data in the target behavior feature data based on a corresponding relation between a preset numerical interval and risk feature values to obtain a plurality of risk feature values corresponding to the plurality of behavior feature data in the target behavior feature data; acquiring abnormal behavior labels corresponding to the target behavior feature data from a risk model library based on risk feature categories corresponding to the plurality of risk feature values; the risk model library stores mapping relations among risk characteristic values, abnormal behavior labels and risk characteristic categories; the number of risk feature categories in the risk model library is smaller than or equal to the number of risk feature categories contained in the target behavior feature data, and the risk feature categories in the risk model library are obtained by attribute reduction based on multiple groups of abnormal behavior feature data in a risk scene and corresponding abnormal behavior labels.
Optionally, in an embodiment, the processor 601 is configured to:
determining a target risk characteristic value matched with the target behavior characteristic data from the corresponding relation between the risk characteristic category and the risk characteristic value in the risk model library;
and determining an abnormal classification label corresponding to the target risk characteristic value based on the corresponding relation among the risk characteristic category, the risk characteristic value and the abnormal classification label in the risk model library.
The abnormal behavior recognition apparatus 600 can implement the method of the method embodiment of Fig. 4; for details, refer to the abnormal behavior recognition method of the embodiment shown in Fig. 4, which is not repeated here.
Fig. 7 is a schematic structural diagram of an electronic device according to an embodiment of the present disclosure. Referring to Fig. 7, at the hardware level, the electronic device includes a processor, and optionally an internal bus, a network interface, and a memory. The memory may include an internal memory, such as a random-access memory (RAM), and may further include a non-volatile memory, such as at least one disk memory. Of course, the electronic device may also include hardware required for other services.
The processor, the network interface, and the memory may be interconnected by an internal bus, which may be an ISA (Industry Standard Architecture) bus, a PCI (Peripheral Component Interconnect) bus, an EISA (Extended Industry Standard Architecture) bus, or the like. The buses may be classified as address buses, data buses, control buses, and so on. For ease of illustration, only one bidirectional arrow is shown in Fig. 7, but this does not mean that there is only one bus or only one type of bus.
The memory is used for storing programs. In particular, a program may include program code, and the program code includes computer operating instructions. The memory may include an internal memory and a non-volatile memory, and provides instructions and data to the processor.
The processor reads the corresponding computer program from the non-volatile memory into the internal memory and then runs it, forming the risk model library establishing device at the logical level. The processor executes the program stored in the memory, and is specifically configured to perform the following operations:
acquiring a plurality of groups of abnormal behavior feature data of a risk scene and corresponding abnormal classification labels, wherein one group of abnormal behavior feature data comprises a plurality of abnormal behavior feature data;
Discretizing the multiple groups of abnormal behavior feature data to obtain risk feature values corresponding to the multiple abnormal behavior feature data in the multiple groups of abnormal behavior feature data;
respectively carrying out attribute reduction on risk feature categories corresponding to a plurality of abnormal behavior feature data in the plurality of sets of abnormal behavior feature data, so that the accuracy of the determined abnormal classification labels meets preset conditions based on the risk feature category set and the corresponding risk feature values after attribute reduction;
and establishing a risk model library based on the relationship among the risk characteristic class set, the risk characteristic value and the corresponding abnormal classification label after attribute reduction.
The method for establishing the risk model library disclosed in the embodiments shown in Figs. 1 to 3 of the present specification may be applied to, or implemented by, a processor. The processor may be an integrated circuit chip having signal processing capabilities. In implementation, the steps of the above method may be completed by integrated logic circuits of hardware in the processor or by instructions in the form of software. The processor may be a general-purpose processor, such as a central processing unit (CPU) or a network processor (NP); it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. Such a processor may implement or perform the methods, steps, and logic blocks disclosed in one or more embodiments of the present specification. A general-purpose processor may be a microprocessor or any conventional processor. The steps of a method disclosed in connection with one or more embodiments of the present specification may be performed directly by a hardware decoding processor, or by a combination of hardware and software modules in a decoding processor. The software modules may be located in a storage medium well known in the art, such as a random access memory, a flash memory, a read-only memory, a programmable read-only memory, an electrically erasable programmable memory, or a register. The storage medium is located in the memory; the processor reads the information in the memory and performs the steps of the above method in combination with its hardware.
The electronic device may further execute the method for establishing the risk model library of fig. 1 to 3, which is not described herein.
The present embodiments also provide a computer readable storage medium storing one or more programs, the one or more programs comprising instructions, which when executed by a portable electronic device comprising a plurality of application programs, enable the portable electronic device to perform the method of the embodiment of fig. 1, and in particular to:
acquiring a plurality of groups of abnormal behavior feature data of a risk scene and corresponding abnormal classification labels, wherein one group of abnormal behavior feature data comprises a plurality of abnormal behavior feature data;
discretizing the multiple groups of abnormal behavior feature data to obtain risk feature values corresponding to the multiple abnormal behavior feature data in the multiple groups of abnormal behavior feature data;
respectively carrying out attribute reduction on risk feature categories corresponding to a plurality of abnormal behavior feature data in the plurality of sets of abnormal behavior feature data, so that the accuracy of the determined abnormal classification labels meets preset conditions based on the risk feature category set and the corresponding risk feature values after attribute reduction;
And establishing a risk model library based on the relationship among the risk characteristic class set, the risk characteristic value and the corresponding abnormal classification label after attribute reduction.
Fig. 8 is a schematic structural diagram of an electronic device according to an embodiment of the present disclosure. Referring to Fig. 8, at the hardware level, the electronic device includes a processor, and optionally an internal bus, a network interface, and a memory. The memory may include an internal memory, such as a random-access memory (RAM), and may further include a non-volatile memory, such as at least one disk memory. Of course, the electronic device may also include hardware required for other services.
The processor, the network interface, and the memory may be interconnected by an internal bus, which may be an ISA (Industry Standard Architecture) bus, a PCI (Peripheral Component Interconnect) bus, an EISA (Extended Industry Standard Architecture) bus, or the like. The buses may be classified as address buses, data buses, control buses, and so on. For ease of illustration, only one bidirectional arrow is shown in Fig. 8, but this does not mean that there is only one bus or only one type of bus.
The memory is used for storing programs. In particular, a program may include program code, and the program code includes computer operating instructions. The memory may include an internal memory and a non-volatile memory, and provides instructions and data to the processor.
The processor reads the corresponding computer program from the non-volatile memory into the internal memory and then runs it, forming the abnormal behavior recognition device at the logical level. The processor executes the program stored in the memory, and is specifically configured to perform the following operations:
acquiring target behavior feature data and corresponding risk feature categories in a risk scene, wherein the target behavior feature data comprises a plurality of behavior feature data, and at least one behavior feature data corresponds to one risk feature category;
discretizing a plurality of behavior feature data in the target behavior feature data based on a corresponding relation between a preset numerical interval and risk feature values to obtain a plurality of risk feature values corresponding to the plurality of behavior feature data in the target behavior feature data;
acquiring abnormal behavior labels corresponding to the target behavior feature data from a risk model library based on risk feature categories corresponding to the plurality of risk feature values; the risk model library stores mapping relations among risk characteristic values, abnormal behavior labels and risk characteristic categories;
The number of risk feature categories in the risk model library is smaller than or equal to the number of risk feature categories contained in the target behavior feature data, and the risk feature categories in the risk model library are obtained by attribute reduction based on multiple groups of abnormal behavior feature data in a risk scene and corresponding abnormal behavior labels.
The abnormal behavior recognition method disclosed in the embodiment shown in Fig. 4 of the present specification may be applied to, or implemented by, a processor. The processor may be an integrated circuit chip having signal processing capabilities. In implementation, the steps of the above method may be completed by integrated logic circuits of hardware in the processor or by instructions in the form of software. The processor may be a general-purpose processor, such as a central processing unit (CPU) or a network processor (NP); it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. Such a processor may implement or perform the methods, steps, and logic blocks disclosed in one or more embodiments of the present specification. A general-purpose processor may be a microprocessor or any conventional processor. The steps of a method disclosed in connection with one or more embodiments of the present specification may be performed directly by a hardware decoding processor, or by a combination of hardware and software modules in a decoding processor. The software modules may be located in a storage medium well known in the art, such as a random access memory, a flash memory, a read-only memory, a programmable read-only memory, an electrically erasable programmable memory, or a register. The storage medium is located in the memory; the processor reads the information in the memory and performs the steps of the above method in combination with its hardware.
The electronic device may further execute the abnormal behavior recognition method of fig. 4, which is not described herein.
The present embodiments also provide a computer readable storage medium storing one or more programs, the one or more programs comprising instructions, which when executed by a portable electronic device comprising a plurality of application programs, enable the portable electronic device to perform the method of the embodiment of fig. 4, and in particular to:
acquiring target behavior feature data and corresponding risk feature categories in a risk scene, wherein the target behavior feature data comprises a plurality of behavior feature data, and at least one behavior feature data corresponds to one risk feature category;
discretizing a plurality of behavior feature data in the target behavior feature data based on a corresponding relation between a preset numerical interval and risk feature values to obtain a plurality of risk feature values corresponding to the plurality of behavior feature data in the target behavior feature data;
acquiring abnormal behavior labels corresponding to the target behavior feature data from a risk model library based on risk feature categories corresponding to the plurality of risk feature values; the risk model library stores mapping relations among risk characteristic values, abnormal behavior labels and risk characteristic categories;
The number of risk feature categories in the risk model library is smaller than or equal to the number of risk feature categories contained in the target behavior feature data, and the risk feature categories in the risk model library are obtained by attribute reduction based on multiple groups of abnormal behavior feature data in a risk scene and corresponding abnormal behavior labels.
Of course, in addition to the software implementation, the electronic device in this specification does not exclude other implementations, such as a logic device or a combination of software and hardware, that is, the execution subject of the following process is not limited to each logic unit, but may also be hardware or a logic device.
In summary, the foregoing description is only a preferred embodiment of the present invention and is not intended to limit the scope of the present invention. Any modification, equivalent replacement, improvement, or the like, which is within the spirit and principles of one or more embodiments of the present disclosure, is intended to be included within the scope of one or more embodiments of the present disclosure.
The system, apparatus, module or unit set forth in the above embodiments may be implemented in particular by a computer chip or entity, or by a product having a certain function. One typical implementation is a computer. In particular, the computer may be, for example, a personal computer, a laptop computer, a cellular telephone, a camera phone, a smart phone, a personal digital assistant, a media player, a navigation device, an email device, a game console, a tablet computer, a wearable device, or a combination of any of these devices.
Computer-readable media include permanent and non-permanent, removable and non-removable media, and may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. Computer-readable media, as defined herein, do not include transitory computer-readable media (transmission media), such as modulated data signals and carrier waves.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of additional identical elements in the process, method, article, or apparatus that comprises the element.
In this specification, the embodiments are described in a progressive manner; for identical or similar parts, the embodiments may be referred to one another, and each embodiment focuses on its differences from the others. In particular, because the system embodiments are substantially similar to the method embodiments, they are described relatively briefly; for relevant details, see the corresponding description of the method embodiments.

Claims (9)

1. A method for establishing a risk model library, wherein the method is applied to a distributed file system and comprises the following steps:
acquiring a plurality of groups of abnormal behavior feature data of a risk scene and corresponding abnormal classification labels, wherein one group of abnormal behavior feature data comprises a plurality of abnormal behavior feature data; the abnormal behavior characteristic data are abnormal behavior characteristic data of the Internet of things card;
discretizing the multiple groups of abnormal behavior feature data to obtain risk feature values corresponding to the multiple abnormal behavior feature data in the multiple groups of abnormal behavior feature data;
respectively carrying out attribute reduction on risk feature categories corresponding to a plurality of abnormal behavior feature data in the plurality of sets of abnormal behavior feature data, so that the accuracy of the determined abnormal classification labels meets preset conditions based on the risk feature category set and the corresponding risk feature values after attribute reduction;
based on the relationship among the risk feature class set, the risk feature values, and the corresponding abnormal classification labels after attribute reduction, establishing a risk model library, wherein the risk model library is a risk model library of an Internet of things card;
the distributed file system includes a plurality of data nodes;
respectively performing attribute reduction on risk feature categories corresponding to a plurality of abnormal behavior feature data in the plurality of sets of abnormal behavior feature data, so that the determined accuracy of the abnormal classification label meets preset conditions based on the risk feature categories and the corresponding risk feature values after attribute reduction, and the method comprises the following steps:
equally dividing the feature values corresponding to the plurality of abnormal behavior feature data in the plurality of sets of abnormal behavior feature data into N data blocks, and numbering the N data blocks, wherein N is a positive integer greater than or equal to 3, and the N data blocks comprise data block X_1 and data block X_N;
halving each of the N data blocks in the order of their numbering to obtain 2*N data blocks, wherein the 2*N data blocks comprise data block X_{1,1}, data block X_{1,2}, data block X_{N,1}, and data block X_{N,2};
assigning data block X_{1,1}, data block X_{2,1}, and data block X_{i,1} of the 2*N data blocks to a first data node, and assigning data block X_{1,2}, data block X_{2,2}, and data block X_{i,2} of the 2*N data blocks to a second data node, wherein i ∈ [3, N];
assigning data block X_{a,1} and data block X_{a,2}, and data block X_{k,1} and data block X_{k,2} satisfying a specified condition, to the k-th data node, wherein a ∈ [1, N], a ≠ k, and k is a positive integer greater than or equal to 3;
and respectively carrying out attribute reduction on the data distributed in the plurality of data nodes so that the accuracy of the determined abnormal classification labels meets the preset condition based on the risk characteristic class set and the corresponding risk characteristic values after attribute reduction.
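For illustration only, the block splitting recited in claim 1 can be sketched as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: the helper names `split_evenly`, `partition_blocks`, and `assign_first_two_nodes` are hypothetical, block sizes are allowed to differ by one row when the data does not divide evenly, and the assignment to the k-th data node is omitted because the "specified condition" is not defined in this section.

```python
from typing import Dict, List, Sequence, Tuple


def split_evenly(rows: Sequence, parts: int) -> List[list]:
    """Split rows into `parts` contiguous blocks whose sizes differ by at most one."""
    size, extra = divmod(len(rows), parts)
    blocks, start = [], 0
    for p in range(parts):
        end = start + size + (1 if p < extra else 0)
        blocks.append(list(rows[start:end]))
        start = end
    return blocks


def partition_blocks(rows: Sequence, n: int) -> Dict[Tuple[int, int], list]:
    """Return halves keyed by (i, j): (i, 1) is X_{i,1} and (i, 2) is X_{i,2}."""
    if n < 3:
        raise ValueError("the claim requires N >= 3")
    halves = {}
    for i, block in enumerate(split_evenly(rows, n), start=1):
        first, second = split_evenly(block, 2)
        halves[(i, 1)] = first
        halves[(i, 2)] = second
    return halves


def assign_first_two_nodes(halves: Dict[Tuple[int, int], list], n: int):
    """First halves go to data node 1, second halves to data node 2."""
    node1 = [halves[(i, 1)] for i in range(1, n + 1)]
    node2 = [halves[(i, 2)] for i in range(1, n + 1)]
    return node1, node2


if __name__ == "__main__":
    halves = partition_blocks(list(range(12)), n=3)
    node1, node2 = assign_first_two_nodes(halves, n=3)
    print(node1)
    print(node2)
```

Keying the halves by `(i, j)` mirrors the X_{i,j} numbering in the claim, so a later assignment rule for the remaining data nodes could be expressed against the same dictionary.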
2. The method of claim 1, wherein attribute reduction is performed on the data allocated in the plurality of data nodes, respectively, so that the accuracy of the determined abnormal classification label satisfies a preset condition based on the risk feature class set and the corresponding risk feature value after attribute reduction, including:
initializing a specified number of measurement precisions, wherein a measurement precision is the accuracy of the abnormal classification labels determined based on the reduced risk feature categories and the corresponding risk feature values;
sequentially and cyclically executing a first operation, a second operation, and a third operation for the plurality of risk feature categories;
determining a risk feature class set with the measurement precision meeting a preset condition from the risk feature class set with the reduced attribute and the corresponding risk feature value;
wherein the first operation comprises:
assigning 1 to k1, and selecting a plurality of groups of risk feature categories from the plurality of risk feature categories, wherein each group comprises k1 risk feature categories;
determining a plurality of first measurement precisions corresponding to the plurality of groups of risk feature categories based on the k1 risk feature categories included in each group;
sorting the plurality of first measurement precisions in descending order, and selecting the specified number of top-ranked first measurement precisions;
when k1 is a positive integer less than or equal to the number of the plurality of risk feature categories, incrementing k1 by one;
the second operation includes:
assigning k1+1 to k2, and selecting a plurality of groups of risk feature categories from the plurality of risk feature categories, wherein each group comprises k2 risk feature categories;
determining a plurality of second measurement precisions corresponding to the plurality of groups of risk feature categories based on the k2 risk feature categories included in each group;
sorting the plurality of second measurement precisions in descending order, and selecting the specified number of top-ranked second measurement precisions;
when k2 is a positive integer less than or equal to the number of the plurality of risk feature categories, incrementing k2 by one;
the third operation includes:
from the specified number of first measurement precisions and the specified number of second measurement precisions, selecting the specified number of top-ranked measurement precisions, and assigning them to the specified number of measurement precisions.
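For illustration only, the cyclic selection in claim 2 resembles a search that scores category subsets of increasing size and keeps a fixed number of the best ones. The sketch below is a simplification under stated assumptions: it folds the first, second, and third operations into one loop over subset sizes, and it assumes the measurement precision is the fraction of samples whose values on the chosen categories determine a single label (a rough-set style dependency degree), which the claim itself does not specify. The names `measurement_precision` and `top_subsets` are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def measurement_precision(rows, labels, subset):
    """Fraction of rows whose values on `subset` map to exactly one label (assumed metric)."""
    labels_seen = defaultdict(set)
    counts = defaultdict(int)
    for row, label in zip(rows, labels):
        key = tuple(row[c] for c in subset)
        labels_seen[key].add(label)
        counts[key] += 1
    consistent = sum(counts[k] for k, seen in labels_seen.items() if len(seen) == 1)
    return consistent / len(rows)


def top_subsets(rows, labels, categories, keep):
    """Keep the `keep` best (precision, subset) pairs over all subset sizes."""
    best = []
    for k in range(1, len(categories) + 1):
        scored = [
            (measurement_precision(rows, labels, s), s)
            for s in combinations(categories, k)
        ]
        # Top `keep` subsets of the current size.
        level_best = sorted(scored, key=lambda t: t[0], reverse=True)[:keep]
        # Merge with the running best; the stable sort keeps smaller subsets first on ties.
        best = sorted(best + level_best, key=lambda t: t[0], reverse=True)[:keep]
    return best


if __name__ == "__main__":
    rows = [
        {"traffic": 2, "sms": 0, "region": 1},
        {"traffic": 2, "sms": 1, "region": 1},
        {"traffic": 0, "sms": 0, "region": 1},
        {"traffic": 0, "sms": 1, "region": 0},
    ]
    labels = ["abuse", "abuse", "normal", "normal"]
    for precision, subset in top_subsets(rows, labels, ("traffic", "sms", "region"), keep=2):
        print(precision, subset)
```

Because ties are resolved in favor of subsets found earlier, a smaller category set with the same precision is preferred, which is consistent with the goal of attribute reduction.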
3. The method of claim 1, wherein establishing the risk model library based on the relationship among the risk feature class set, the risk feature values, and the corresponding abnormal classification labels after attribute reduction comprises:
acquiring the risk characteristic category, the corresponding risk characteristic value and the corresponding abnormal behavior label after attribute reduction from the plurality of groups of abnormal behavior characteristic data;
and associating the plurality of attribute-reduced risk feature categories with the corresponding abnormal behavior labels and risk feature values, and establishing the risk model library.
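For illustration only, the association step in claim 3 can be sketched as building a lookup table keyed by the risk feature values of the retained categories. This is a minimal sketch under stated assumptions: the function name `build_risk_model_library` and the dictionary layout are hypothetical, and when two samples share the same values but carry different labels, the first label seen is kept, which is a choice the claim does not prescribe.

```python
def build_risk_model_library(samples, labels, reduced_categories):
    """Map the risk feature values of the retained categories to a behavior label."""
    rules = {}
    for sample, label in zip(samples, labels):
        key = tuple(sample[c] for c in reduced_categories)
        # Assumption: on conflicting labels for the same key, keep the first one seen.
        rules.setdefault(key, label)
    return {"categories": tuple(reduced_categories), "rules": rules}


if __name__ == "__main__":
    samples = [
        {"traffic": 2, "sms": 0},
        {"traffic": 0, "sms": 1},
    ]
    labels = ["abuse", "normal"]
    print(build_risk_model_library(samples, labels, ["traffic"]))
```

Storing the retained category names alongside the rules lets a later lookup project new data onto the same categories in the same order.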
4. The method of claim 1, wherein discretizing the plurality of sets of abnormal behavior feature data to obtain risk feature values corresponding to a plurality of sets of abnormal behavior feature data, respectively, comprises:
based on a plurality of numerical intervals, respectively acquiring target numerical intervals corresponding to a plurality of abnormal behavior characteristic data in the plurality of sets of abnormal behavior characteristic data;
and acquiring the risk characteristic value corresponding to the target numerical value interval based on the corresponding relation between the numerical value intervals and the risk characteristic value.
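For illustration only, the interval-based discretization in claim 4 can be sketched with a sorted list of interval boundaries. This is a minimal sketch under stated assumptions: the function name `discretize`, the boundary values, and the convention that each interval is closed on the left are all hypothetical and are not taken from the patent.

```python
from bisect import bisect_right


def discretize(value, boundaries, risk_values):
    """Return the risk feature value of the numeric interval that contains `value`.

    `boundaries` must be sorted ascending; they define len(boundaries) + 1 intervals:
    (-inf, b0), [b0, b1), ..., [b_last, +inf).
    """
    if len(risk_values) != len(boundaries) + 1:
        raise ValueError("need exactly one risk feature value per interval")
    return risk_values[bisect_right(boundaries, value)]


if __name__ == "__main__":
    # Hypothetical example: monthly data usage in MB mapped to levels 0, 1, 2.
    boundaries = [100, 1000]
    levels = [0, 1, 2]
    for usage in (50, 100, 5000):
        print(usage, discretize(usage, boundaries, levels))
```

Binary search keeps each lookup logarithmic in the number of intervals, which matters little for a handful of intervals but keeps the mapping explicit and easy to audit.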
5. An abnormal behavior recognition method, comprising:
acquiring target behavior feature data and corresponding risk feature categories in a risk scene, wherein the target behavior feature data comprises a plurality of behavior feature data, and at least one behavior feature data corresponds to one risk feature category;
discretizing a plurality of behavior feature data in the target behavior feature data based on a corresponding relation between a preset numerical interval and risk feature values to obtain a plurality of risk feature values corresponding to the plurality of behavior feature data in the target behavior feature data;
acquiring abnormal behavior labels corresponding to the target behavior feature data from a risk model library based on risk feature categories corresponding to the plurality of risk feature values; wherein the risk model library stores mapping relations among risk feature values, abnormal behavior labels, and risk feature categories, and the risk model library is established by the method of any one of claims 1 to 4;
the number of risk feature categories in the risk model library is smaller than or equal to the number of risk feature categories contained in the target behavior feature data, and the risk feature categories in the risk model library are obtained by attribute reduction based on multiple groups of abnormal behavior feature data in a risk scene and corresponding abnormal behavior labels.
6. The method of claim 5, wherein obtaining, from a risk model library, an abnormal behavior tag corresponding to the target behavior feature data based on risk feature categories corresponding to the plurality of risk feature values, comprises:
determining a target risk characteristic value matched with the target behavior characteristic data from the corresponding relation between the risk characteristic category and the risk characteristic value in the risk model library;
and determining an abnormal classification label corresponding to the target risk characteristic value based on the corresponding relation among the risk characteristic category, the risk characteristic value and the abnormal classification label in the risk model library.
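For illustration only, the lookup described in claims 5 and 6 can be sketched as projecting the target's risk feature values onto the categories stored in the library and reading off the label. This is a minimal sketch under stated assumptions: the library layout, the category names, and the labels in `RISK_MODEL_LIBRARY` are hypothetical, and returning `None` for an unmatched combination is a choice the claims do not prescribe.

```python
# Hypothetical library: retained categories plus value-tuple -> label rules.
RISK_MODEL_LIBRARY = {
    "categories": ("traffic_level", "sms_level"),
    "rules": {
        (2, 0): "traffic_abuse",
        (0, 2): "sms_spam",
        (0, 0): "normal",
    },
}


def lookup_label(library, target_risk_values):
    """Match the target's values on the library's categories and return the label, if any."""
    key = tuple(target_risk_values[c] for c in library["categories"])
    return library["rules"].get(key)


if __name__ == "__main__":
    # The target may carry more categories than the library retains after reduction.
    target = {"traffic_level": 2, "sms_level": 0, "region_level": 1}
    print(lookup_label(RISK_MODEL_LIBRARY, target))
```

The extra `region_level` entry is simply ignored, which reflects the claim's statement that the library holds no more risk feature categories than the target behavior feature data.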
7. A risk model library building apparatus, comprising:
a processor configured to: acquire a plurality of groups of abnormal behavior feature data of a risk scene and corresponding abnormal classification labels, wherein one group of abnormal behavior feature data comprises a plurality of abnormal behavior feature data, and the abnormal behavior feature data are abnormal behavior feature data of an Internet of things card; discretize the plurality of groups of abnormal behavior feature data to obtain risk feature values corresponding to the plurality of abnormal behavior feature data in the plurality of groups of abnormal behavior feature data; perform attribute reduction on the risk feature categories corresponding to the plurality of abnormal behavior feature data in the plurality of groups of abnormal behavior feature data, so that the accuracy of the abnormal classification labels determined based on the risk feature class set and the corresponding risk feature values after attribute reduction meets a preset condition; and establish a risk model library based on the relationship among the risk feature class set, the risk feature values, and the corresponding abnormal classification labels after attribute reduction, wherein the risk model library is a risk model library of an Internet of things card; the apparatus being capable of performing the method of any one of claims 1-4.
8. An abnormal behavior recognition apparatus, comprising:
a processor configured to: acquire target behavior feature data and corresponding risk feature categories in a risk scene, wherein the target behavior feature data comprises a plurality of behavior feature data, and at least one behavior feature data corresponds to one risk feature category; discretize the plurality of behavior feature data in the target behavior feature data based on a corresponding relation between preset numerical intervals and risk feature values to obtain a plurality of risk feature values corresponding to the plurality of behavior feature data in the target behavior feature data; and acquire abnormal behavior labels corresponding to the target behavior feature data from a risk model library based on risk feature categories corresponding to the plurality of risk feature values; wherein the risk model library stores mapping relations among risk feature values, abnormal behavior labels, and risk feature categories; the number of risk feature categories in the risk model library is smaller than or equal to the number of risk feature categories contained in the target behavior feature data, and the risk feature categories in the risk model library are obtained by attribute reduction based on multiple groups of abnormal behavior feature data in a risk scene and corresponding abnormal behavior labels; and the risk model library is established by the method of any one of claims 1 to 4.
9. An electronic device, comprising:
a memory storing computer program instructions;
a processor, wherein the computer program instructions, when executed by the processor, implement the method of building a risk model library according to any one of claims 1-4.
CN202010004255.0A 2020-01-03 2020-01-03 Abnormal behavior identification and risk model library establishment method and device and electronic equipment Active CN113076451B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010004255.0A CN113076451B (en) 2020-01-03 2020-01-03 Abnormal behavior identification and risk model library establishment method and device and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010004255.0A CN113076451B (en) 2020-01-03 2020-01-03 Abnormal behavior identification and risk model library establishment method and device and electronic equipment

Publications (2)

Publication Number Publication Date
CN113076451A CN113076451A (en) 2021-07-06
CN113076451B true CN113076451B (en) 2023-07-25

Family

ID=76608607

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010004255.0A Active CN113076451B (en) 2020-01-03 2020-01-03 Abnormal behavior identification and risk model library establishment method and device and electronic equipment

Country Status (1)

Country Link
CN (1) CN113076451B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115730233B (en) * 2022-10-28 2023-07-11 支付宝(杭州)信息技术有限公司 Data processing method and device, readable storage medium and electronic equipment

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105139021A (en) * 2015-07-08 2015-12-09 Tcl集团股份有限公司 Method and system for realizing television user rapid classification based on rough set theory
CN105681339A (en) * 2016-03-07 2016-06-15 重庆邮电大学 Incremental intrusion detection method fusing rough set theory and DS evidence theory

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2011003076A (en) * 2009-06-19 2011-01-06 Fuji Heavy Ind Ltd Risk recognition system
CN103426123A (en) * 2013-07-24 2013-12-04 国家电网公司 Power grid fault risk evaluation method based on rough set theory
US20170315855A1 (en) * 2016-05-02 2017-11-02 Agt International Gmbh Method of detecting anomalies on appliances and system thereof
CN107256444A (en) * 2017-04-24 2017-10-17 中国电力科学研究院 A kind of distribution network failure Fuzzy comprehensive evaluation for risk method and device
CN108985811A (en) * 2017-06-02 2018-12-11 北京京东尚科信息技术有限公司 Method, apparatus and electronic equipment for precision marketing
US11093535B2 (en) * 2017-11-27 2021-08-17 International Business Machines Corporation Data preprocessing using risk identifier tags
US11341513B2 (en) * 2018-02-20 2022-05-24 James R Jackson Systems and methods for generating a relationship among a plurality of datasets to generate a desired attribute value
CN108683663B (en) * 2018-05-14 2021-04-20 中国科学院信息工程研究所 Network security situation assessment method and device

Also Published As

Publication number Publication date
CN113076451A (en) 2021-07-06

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant