CN116011810A - Regional risk identification method, device, equipment and storage medium

Publication number
CN116011810A
Authority
CN
China
Prior art keywords
feature
population
risk
characteristic
determining
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211658951.9A
Other languages
Chinese (zh)
Inventor
吴勇
柯晨怡
陈晞
陈亚君
李宁
卢世温
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
CCB Finetech Co Ltd
Original Assignee
CCB Finetech Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date

Abstract

The application discloses a regional risk identification method, a regional risk identification device, regional risk identification equipment and a storage medium. The application relates to the technical field of artificial intelligence. The method comprises the following steps: acquiring risk index data of each resource association transaction in the region to be detected under different dimensions; extracting regional risk characteristics in each risk index data; and determining the regional risk category of the region to be detected according to the regional risk characteristics. According to the scheme, the richness and the expandability of the risk index data are improved by acquiring the risk index data under different dimensions; meanwhile, according to the regional risk characteristics extracted from the risk index data, the regional risk category of the region to be detected is determined, and the accuracy of the regional risk category determination result is improved.

Description

Regional risk identification method, device, equipment and storage medium
Technical Field
The embodiment of the application relates to the technical field of artificial intelligence, in particular to a regional risk identification method, a regional risk identification device, regional risk identification equipment and a storage medium.
Background
With the rapid development of social economy, each large resource storage organization can transact various resource leasing services for clients so as to meet the daily life demands of the clients. However, there may be an abnormal situation in some resource leasing services.
Because resource-related institutions pay insufficient attention to regional risk identification, abnormalities in regional resources become more likely, the regional risk category is determined with low accuracy, and risk early warning or effective intervention cannot be carried out in time.
Disclosure of Invention
The embodiment of the application provides a regional risk identification method, a regional risk identification device, regional risk identification equipment and a storage medium, which are used for improving the accuracy of a regional risk category determination result.
In a first aspect, an embodiment of the present application provides a regional risk identification method, including:
acquiring risk index data of each resource association transaction in the region to be detected under different dimensions;
extracting regional risk characteristics in each risk index data;
and determining the regional risk category of the region to be detected according to the regional risk characteristics.
In a second aspect, an embodiment of the present application further provides a regional risk identification device, where the device includes:
the risk index data acquisition module is used for acquiring risk index data of each resource association transaction in the region to be detected under different dimensions;
the regional risk feature extraction module is used for extracting regional risk features in each risk index data;
the regional risk category determining module is used for determining the regional risk category of the region to be detected according to the regional risk characteristics.
In a third aspect, an embodiment of the present application further provides an electronic device, including a memory, a processor, and a computer program stored in the memory and capable of running on the processor, where the processor implements any one of the regional risk identification methods provided in the embodiments of the first aspect of the present application when the processor executes the program.
In a fourth aspect, embodiments of the present application further provide a computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements any of the regional risk identification methods as provided by the embodiments of the first aspect of the present application.
In a fifth aspect, embodiments of the present application also provide a computer program product comprising a computer program which, when executed by a processor, implements any of the regional risk identification methods as provided by the embodiments of the first aspect of the present application.
According to the regional risk identification scheme provided by the embodiment of the application, the risk index data of each resource association transaction in the region to be detected under different dimensions is obtained; extracting regional risk characteristics in each risk index data; and determining the regional risk category of the region to be detected according to the regional risk characteristics. According to the scheme, the richness and the expandability of the risk index data are improved by acquiring the risk index data under different dimensions; meanwhile, according to the regional risk characteristics extracted from the risk index data, the regional risk category of the region to be detected is determined, and the accuracy of the regional risk category determination result is improved.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are needed in the embodiments will be briefly described below, it being understood that the following drawings only illustrate some embodiments of the present application and therefore should not be considered limiting the scope, and that other related drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flowchart of a regional risk identification method provided in an embodiment of the present application;
FIG. 2 is a flowchart of another regional risk identification method provided in an embodiment of the present application;
FIG. 3 is a flowchart of yet another regional risk identification method provided in an embodiment of the present application;
FIG. 4 is a schematic structural diagram of a regional risk identification device provided in an embodiment of the present application;
FIG. 5 is a schematic structural diagram of an electronic device for implementing a regional risk identification method according to an embodiment of the present application.
Detailed Description
The present application is described in further detail below with reference to the drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the application and not limiting thereof. It should be further noted that, for convenience of description, only some, but not all of the structures related to the present application are shown in the drawings.
It should be noted that: like reference numerals and letters denote like items in the following figures, and thus once an item is defined in one figure, no further definition or explanation thereof is necessary in the following figures. Meanwhile, in the description of the present application, the terms "first", "second", and the like are used only to distinguish the description, and are not to be construed as indicating or implying relative importance. In the technical scheme, the acquisition, storage, use, processing and the like of the risk index data, the regional risk characteristics, the initial characteristic data, the weight coefficient and other data all accord with the relevant regulations of national laws and regulations.
The regional risk identification method and device provided by the embodiments of the application are suitable for application scenarios in which risk is identified for a region to be detected. The regional risk identification method provided by the embodiments of the application can be executed by a regional risk identification device, which can be implemented in software and/or hardware and is configured in electronic equipment with certain computing and storage capacity.
For ease of understanding, the regional risk identification method will be described in detail first.
Referring to the regional risk identification method shown in FIG. 1, the method includes:
S110, acquiring risk index data of each resource association transaction in the region to be detected under different dimensions.
The region to be detected refers to a region that requires abnormality detection. The region to be detected in the embodiment of the present application is not particularly limited, and may be set by a technician according to experience. For example, the region to be detected may be at least one of an administrative area, a manually defined area, a business district, and the like.
The resource association transaction refers to a transaction related to resource interaction that is performed in the region to be detected.
The number and/or content of the different dimensions are not particularly limited in the embodiments of the present application, and may be set by a technician according to experience. By way of example, the different dimensions may include at least one of a resource leasing party dimension, a resource guarantor dimension, an exchange rate dimension, a resource leasing institution dimension, an interest rate dimension, and the like.
The risk index data refers to risk assessment data generated when resource association transactions are performed. The content of the risk indicator data is not limited in this embodiment, and may be set by a technician according to experience.
For example, if the region to be detected is a manually defined area or a business district, the risk index data may include at least one of lease information of a resource leasing party, lease time of the resource leasing party, guarantee information of a resource guarantor, the exchange rate at the time of resource lease, information of a resource leasing institution, the interest rate at the time of resource lease, and the like; if the region to be detected is an administrative area, the risk index data may include at least one of risk data of the resource leasing party, risk data of the resource guarantor, exchange rate risk data during resource leasing, risk data of the resource leasing institution, interest rate risk data during resource leasing, and the like. The risk data of the resource leasing party refers to the risk value that the resource leasing party cannot return the specified resources by the specified return time. The risk data of the resource guarantor refers to the risk value that the party providing a guarantee for the resource leasing party cannot fulfill its guarantee obligation when a problem occurs in the resource lease. The exchange rate risk data during resource leasing refers to the risk value of a loss caused by exchange rate fluctuation when the resource leasing party leases resources. The risk data of the resource leasing institution refers to the risk value, borne by the resource leasing institution when it provides long-term or short-term resource leasing, that the resource leasing party cannot return the resources in time. The interest rate risk data during resource leasing refers to the risk value arising from a mismatch between the resource lease interest rate and the market benchmark interest rate.
Specifically, risk index data of each resource-associated transaction in the region to be detected under at least one dimension is obtained.
S120, extracting regional risk characteristics in each risk index data.
The regional risk features refer to data which can be used for risk assessment in the risk index data.
The method for extracting the regional risk features in the embodiment of the present application is not limited, and may be set by a technician according to experience.
Specifically, corresponding regional risk features are extracted from at least one risk index data.
It should be noted that, before the regional risk features are extracted, each risk index data may be preprocessed to improve the accuracy of regional risk feature extraction. The preprocessing method is not limited in any way, and may be set by a technician according to experience.
For example, the risk index data may be normalized so that risk index data measured in different units are brought to a common scale.
Optionally, for each type of risk indicator data, at least one statistic of the type of risk indicator data may be determined; and carrying out standardized processing on the risk index data according to the statistical data. Wherein the statistical data may include at least one of a maximum value, a minimum value, an average value, a standard deviation, and the like.
Specifically, the normalization processing can be performed on each risk indicator data by the following formula:
$$x_t' = \frac{x_t - \min(x_t)}{\max(x_t) - \min(x_t)}$$
where $x_t'$ represents the risk indicator data after normalization; $x_t$ represents the risk indicator data; $\min(x_t)$ represents the minimum value of the risk indicator data; and $\max(x_t)$ represents the maximum value of the risk indicator data.
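The normalization above can be illustrated with a minimal Python sketch. The function name `min_max_normalize`, the sample values, and the handling of a constant indicator (mapped to 0.0) are assumptions made for illustration and are not specified in the application.

```python
def min_max_normalize(values):
    """Scale one type of risk indicator data to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant indicator has zero range, so map it to 0.0
        # instead of dividing by zero.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical interest-rate indicator values collected for one region.
interest_rates = [3.0, 4.5, 6.0, 9.0]
normalized = min_max_normalize(interest_rates)
print(normalized)
```

Each type of risk indicator data would be normalized separately, so that indicators measured in different units become comparable.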
S130, determining the regional risk category of the region to be detected according to the regional risk characteristics.
The regional risk category describes the degree of abnormality of the region to be detected. Optionally, the regional risk category may directly indicate whether the region to be detected is abnormal, which describes the degree of abnormality qualitatively. Specifically, the regional risk categories may include abnormal and not abnormal. Alternatively, the regional risk category may be the abnormality probability of the region to be detected, which describes the degree of abnormality quantitatively. Correspondingly, the preset probability interval to which the regional risk category belongs can be determined, where different probability intervals correspond to different abnormality levels, and whether the region to be detected is abnormal is determined according to the abnormality level corresponding to that preset probability interval. The embodiment of the application does not limit the division and/or the number of the preset probability intervals, which can be set by a technician according to experience. For example, suppose the preset probability intervals include a first-level probability interval and a second-level probability interval, where the first-level probability interval corresponds to a first-level abnormality and the second-level probability interval corresponds to a second-level abnormality. If the regional risk category falls in the first-level probability interval, the abnormality level of the region to be detected is the first-level abnormality, that is, the region to be detected is free of abnormality; if the regional risk category falls in the second-level probability interval, the abnormality level of the region to be detected is the second-level abnormality, that is, the region to be detected is abnormal.
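As a hedged illustration of the quantitative variant, the following Python sketch maps an abnormality probability to an abnormality level through preset probability intervals. The interval boundaries (0.5 as the dividing point), the level names, and the function names are hypothetical; the application leaves the division and number of intervals to the technician.

```python
# Hypothetical preset probability intervals:
# (lower bound inclusive, upper bound exclusive, abnormality level).
PRESET_INTERVALS = [
    (0.0, 0.5, "first-level"),   # interpreted as: no abnormality
    (0.5, 1.0, "second-level"),  # interpreted as: abnormal
]


def abnormality_level(probability):
    """Return the abnormality level of the interval containing the probability."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    for lower, upper, level in PRESET_INTERVALS:
        if lower <= probability < upper:
            return level
    # A probability of exactly 1.0 belongs to the last interval.
    return PRESET_INTERVALS[-1][2]


def is_abnormal(probability):
    """The region is treated as abnormal only at the second level."""
    return abnormality_level(probability) == "second-level"


print(abnormality_level(0.2), is_abnormal(0.2))
print(abnormality_level(0.8), is_abnormal(0.8))
```

More intervals can be added to the table without changing the lookup logic.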
Specifically, according to each regional risk characteristic of the region to be detected, determining the regional risk category of the region to be detected. The manner of determining the regional risk category in the embodiment of the present application is not limited in any way, and may be set by a technician according to experience.
According to the regional risk identification scheme provided by the embodiment of the application, the risk index data of each resource association transaction in the region to be detected under different dimensions is obtained; extracting regional risk characteristics in each risk index data; and determining the regional risk category of the region to be detected according to the regional risk characteristics. According to the scheme, the richness and the expandability of the risk index data are improved by acquiring the risk index data under different dimensions; meanwhile, according to the regional risk characteristics extracted from the risk index data, the regional risk category of the region to be detected is determined, and the accuracy of the regional risk category determination result is improved.
On the basis of the above embodiment, the application also provides an alternative embodiment. In this alternative embodiment, the extraction mechanism of the regional risk features is optimized and improved.
Further, the operation of "extracting regional risk features in each risk index data" is refined into "performing feature extraction on each risk index data to obtain initial feature data; determining a weight coefficient of the initial feature data; determining a noise reduction threshold according to the initial feature data and the weight coefficient; and determining the regional risk features according to the noise reduction threshold and the initial feature data", thereby improving the regional risk feature extraction mechanism. For details not described in this embodiment, reference may be made to the descriptions of the other embodiments.
Referring to the regional risk identification method shown in FIG. 2, the method includes:
S210, acquiring risk index data of each resource association transaction in the region to be detected under different dimensions.
S220, extracting features of the risk index data to obtain initial feature data.
The initial feature data refers to data obtained by performing rough extraction on risk index data. Specifically, the initial feature data includes redundant noise.
Specifically, corresponding initial characteristic data are extracted from each risk index data.
S230, determining a weight coefficient of the initial characteristic data.
The size of the weight coefficient is not limited in this embodiment, and may be set by a technician according to experience or needs. Illustratively, the weight coefficients may be generated by a deep learning attention mechanism.
It should be noted that the weight coefficients of different initial feature data may be the same or different, which is not limited in any way in the embodiment of the present application.
S240, determining a noise reduction threshold according to the initial characteristic data and the weight coefficient.
The noise reduction threshold may be used to remove redundant noise from the initial feature data. The method for determining the noise reduction threshold is not limited; the noise reduction threshold can be set by a technician according to experience, or can be determined through a large number of repeated experiments.
In an alternative embodiment, the noise reduction threshold may be set manually according to experience. With this approach, determining the noise reduction threshold is inefficient and inaccurate, and the noise reduction threshold needs to be reset whenever the initial feature data changes.
To improve both the efficiency of determining the noise reduction threshold and the accuracy of the result, and to allow the noise reduction threshold to be updated dynamically in real time, in another alternative embodiment a deep learning attention mechanism may be added to the determination of the noise reduction threshold.
S250, determining regional risk characteristics according to the noise reduction threshold and the initial characteristic data.
In the embodiment of the application, soft threshold processing is performed on the initial feature data according to the noise reduction threshold. Specifically, the initial feature data is converted into a numerical domain in which noise is close to zero: signal features are converted into positive features or negative features, and noise features are converted into features close to zero; the feature values of the near-zero noise features are then set to zero by the noise reduction threshold.
Illustratively, the initial feature data may be processed to obtain the regional risk feature by the following formula:
$$y = \begin{cases} x - \tau, & x > \tau \\ 0, & -\tau \le x \le \tau \\ x + \tau, & x < -\tau \end{cases}$$
where $y$ represents the feature value in the processed initial feature data; $x$ represents any one feature value in the initial feature data; and $\tau$ represents the noise reduction threshold.
According to the above formula, when any feature value in the initial feature data lies in the interval $[-\tau, \tau]$, the feature value is set to zero; if the feature value is greater than the noise reduction threshold $\tau$ or less than its negative $-\tau$, the output changes linearly with the input. This has the advantage that the feature data in the initial feature data is largely preserved.
Specifically, in the embodiment of the present application, the feature value in the initial feature data may be processed through the above formula, so as to obtain a corresponding regional risk feature.
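The soft threshold processing described above can be sketched in a few lines of Python. The function name `soft_threshold`, the sample feature values, and the threshold of 0.5 are illustrative assumptions only.

```python
def soft_threshold(x, tau):
    """Apply soft thresholding with noise reduction threshold tau (tau >= 0)."""
    if tau < 0:
        raise ValueError("tau must be non-negative")
    if x > tau:
        return x - tau   # large positive feature values are shifted towards zero
    if x < -tau:
        return x + tau   # large negative feature values are shifted towards zero
    return 0.0           # values inside [-tau, tau] are treated as noise


# Hypothetical initial feature data and noise reduction threshold.
initial_features = [-2.0, -0.3, 0.0, 0.4, 1.5]
regional_features = [soft_threshold(x, 0.5) for x in initial_features]
print(regional_features)
```

Outside the interval $[-\tau, \tau]$ the output has slope 1 with respect to the input, and inside the interval it has slope 0.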
S260, determining the regional risk category of the region to be detected according to the regional risk characteristics.
According to the regional risk identification scheme provided by the embodiment of the application, the operation of extracting regional risk features in each risk index data is refined into performing feature extraction on each risk index data to obtain initial feature data; determining a weight coefficient of the initial feature data; determining a noise reduction threshold according to the initial feature data and the weight coefficient; and determining the regional risk features according to the noise reduction threshold and the initial feature data. Processing the extracted initial features with the noise reduction threshold to obtain the regional risk features prevents redundant noise in the initial feature data from affecting subsequent processing. Meanwhile, the derivative of the regional risk features with respect to any feature value is either 0 or 1, which avoids gradient vanishing and improves the accuracy of determining the regional risk category according to the regional risk features.
On the basis of the above embodiments, the regional risk features may be extracted by using a deep residual shrinkage network (DRSN). The deep residual shrinkage network may include a global pooling layer and a convolution layer, and may combine a deep learning attention mechanism with the determination of the noise reduction threshold so that the noise reduction threshold is determined dynamically. Specifically, the deep residual shrinkage network generates a weight coefficient through the attention mechanism and multiplies the weight coefficient by the initial feature data output by the global pooling layer to obtain the noise reduction threshold; the output of the convolution layer is then denoised with the noise reduction threshold to obtain the regional risk features. It should be noted that the noise reduction thresholds corresponding to different feature channels in the deep residual shrinkage network may be the same or different, which is not limited in the embodiment of the present application.
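The threshold computation in the DRSN can be sketched as follows, under several stated assumptions: the global pooling is taken as the average of absolute values per channel, the attention mechanism is replaced by a single sigmoid with hypothetical parameters `scale` and `bias` (a trained network would learn these), and the convolution output is represented as plain nested lists. This is a simplified illustration, not the network described in the application.

```python
import math


def sigmoid(z):
    """Squash a score into (0, 1) so the weight coefficient stays bounded."""
    return 1.0 / (1.0 + math.exp(-z))


def channel_thresholds(channels, scale=0.0, bias=0.0):
    """Return one noise reduction threshold per feature channel.

    scale and bias are hypothetical stand-ins for learned attention parameters.
    """
    thresholds = []
    for values in channels:
        # Assumption: global average pooling over absolute values.
        pooled = sum(abs(v) for v in values) / len(values)
        weight = sigmoid(scale * pooled + bias)  # weight coefficient in (0, 1)
        thresholds.append(weight * pooled)       # threshold = weight * pooled value
    return thresholds


def shrink(channels, thresholds):
    """Soft-threshold every value with the threshold of its channel."""
    result = []
    for values, tau in zip(channels, thresholds):
        row = []
        for v in values:
            if v > tau:
                row.append(v - tau)
            elif v < -tau:
                row.append(v + tau)
            else:
                row.append(0.0)
        result.append(row)
    return result


# Hypothetical convolution output with two feature channels.
conv_output = [[1.0, -3.0], [0.2, 0.2]]
taus = channel_thresholds(conv_output)
regional_risk_features = shrink(conv_output, taus)
print(taus, regional_risk_features)
```

Because the weight coefficient lies in (0, 1), each threshold stays positive and below the pooled magnitude of its channel, so the thresholding cannot zero out an entire channel whose values differ in magnitude.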
Based on the technical schemes, the application also provides an alternative embodiment. In this alternative embodiment, the determination mechanism of the regional risk category is optimized and improved.
Further, the operation of "determining the regional risk category of the region to be detected according to the regional risk features" is refined into "inputting the regional risk features into a trained depth forest, and sequentially performing classification prediction through the random forest layers cascaded in the depth forest to obtain at least one reference risk category of the region to be detected; and determining the regional risk category of the region to be detected according to each reference risk category", thereby improving the regional risk category determination mechanism. For details not described in this embodiment, reference may be made to the descriptions of the other embodiments.
Referring to the regional risk identification method shown in FIG. 3, the method includes:
S310, acquiring risk index data of each resource association transaction in the region to be detected under different dimensions.
S320, extracting regional risk characteristics in each risk index data.
S330, inputting the regional risk characteristics into the trained depth forest, and sequentially carrying out classification prediction according to the random forest layers cascaded in the depth forest to obtain at least one reference risk category of the region to be detected.
The depth forest can be used to determine the regional risk category of the region to be detected. Specifically, the depth forest is a multi-layer cascade forest architecture based on random forests, in which representation learning is performed through an ensemble of ensembles. In the depth forest, each layer is composed of random forests and completely random forests, and the smallest constituent elements of both are decision trees. The number of layers of the depth forest is not particularly limited, and can be set by a technician according to experience. By way of example, the number of layers of the depth forest can be determined adaptively: K-fold cross validation is performed, and each time the training of one layer of forest is completed, if the validation accuracy has not improved significantly, no further layers are added.
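The adaptive choice of the number of layers can be sketched as a simple stopping rule. In the sketch below, `accuracy_for_depth` is a hypothetical callable standing in for training a cascade of the given depth and returning its K-fold cross-validation accuracy, and `min_gain` is an assumed threshold for what counts as a significant improvement; neither is specified in the application.

```python
def choose_layer_count(accuracy_for_depth, max_layers=10, min_gain=0.005):
    """Add layers until validation accuracy stops improving by at least min_gain."""
    best_accuracy = accuracy_for_depth(1)
    layers = 1
    for depth in range(2, max_layers + 1):
        accuracy = accuracy_for_depth(depth)
        if accuracy - best_accuracy < min_gain:
            break  # no significant improvement: stop growing the cascade
        best_accuracy, layers = accuracy, depth
    return layers


# Simulated cross-validation accuracies per depth (illustrative numbers only).
simulated = {1: 0.80, 2: 0.86, 3: 0.88, 4: 0.881, 5: 0.95}
layers = choose_layer_count(lambda depth: simulated[depth], max_layers=5)
print(layers)
```

In the simulated run the fourth layer adds too little accuracy, so growth stops at three layers even though a later layer would have scored higher; this is the usual trade-off of a greedy stopping rule.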
The depth forest comprises random forests and completely random forests. A random forest is composed of ordinary random trees, and a completely random forest is composed of completely random trees. Specifically, the ordinary random tree differs from the completely random tree in how nodes are divided. When dividing a node, a decision tree in a random forest first selects $\sqrt{d}$ regional risk features from the whole feature space as candidate features for node division, where $d$ is the total number of regional risk features. The regional risk feature with the best Gini value is then selected from the candidate features as the attribute feature for node division. A decision tree in a completely random forest is completely random: when dividing a node, it randomly selects a regional risk feature from the feature space as the attribute feature for node division.
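The difference between the two node division strategies can be sketched as follows. The sketch assumes that the "best Gini value" means the lowest weighted Gini impurity over candidate thresholds, that thresholds are midpoints between adjacent distinct feature values, and that the number of candidates is the integer part of the square root of the feature count; the function names and sample data are hypothetical.

```python
import math
import random


def gini_impurity(labels):
    """Gini impurity of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = {}
    for label in labels:
        counts[label] = counts.get(label, 0) + 1
    return 1.0 - sum((c / total) ** 2 for c in counts.values())


def best_split_gini(rows, labels, feature):
    """Lowest weighted Gini impurity over midpoint thresholds of one feature."""
    values = sorted({row[feature] for row in rows})
    best = gini_impurity(labels)  # starting point: impurity of the unsplit node
    for low, high in zip(values, values[1:]):
        threshold = (low + high) / 2
        left = [y for row, y in zip(rows, labels) if row[feature] <= threshold]
        right = [y for row, y in zip(rows, labels) if row[feature] > threshold]
        weighted = (len(left) * gini_impurity(left)
                    + len(right) * gini_impurity(right)) / len(labels)
        best = min(best, weighted)
    return best


def choose_split_feature(rows, labels, rng, completely_random=False):
    """Pick the feature index used to divide a node."""
    d = len(rows[0])
    if completely_random:
        return rng.randrange(d)            # completely random tree
    k = max(1, math.isqrt(d))              # about sqrt(d) candidate features
    candidates = rng.sample(range(d), k)
    return min(candidates, key=lambda f: best_split_gini(rows, labels, f))


# Hypothetical regional risk features (two per sample) and abnormality labels.
rows = [[0.1, 5.0], [0.2, 1.0], [0.8, 5.0], [0.9, 1.0]]
labels = [0, 0, 1, 1]
rng = random.Random(0)
print(best_split_gini(rows, labels, 0), best_split_gini(rows, labels, 1))
print(choose_split_feature(rows, labels, rng))
```

In this toy data the first feature separates the two classes perfectly, while the second does not reduce impurity at all.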
The reference risk category refers to a risk category of the area to be detected, which needs to be determined. Optionally, the reference risk category may be a qualitative basis for providing whether an abnormality exists in the area to be detected; or alternatively, the reference risk category may also be a quantitative basis for providing an anomaly probability for the region to be detected.
In an alternative embodiment, the regional risk features may be input into a depth forest where a cascading random forest layer directly outputs the reference risk category.
In another alternative embodiment, the reference risk category may be determined by introducing category prediction probabilities. Specifically, for any random forest layer in the depth forest, the category prediction probabilities of the different decision trees in the random forests of the preceding cascade level are fused with the regional risk features to obtain the features to be processed, where the features to be processed of the first layer are the regional risk features themselves; the features to be processed are input into the random forests of the following cascade level to obtain the category prediction probabilities of the different decision trees in those random forests; and the category prediction probabilities of the different decision trees in the random forests of the last level in the depth forest are taken as the reference risk categories of the region to be detected.
The category prediction probability refers to risk category probability of a region to be detected generated by any decision tree in the depth forest. The feature to be processed refers to feature data required to be processed for each layer of forest in the depth forests.
It should be noted that, in each layer of forest, the category prediction probabilities of all decision trees can be combined by voting. The category prediction probabilities generated by the decision trees of any forest in the layer are added and then averaged, and the resulting probability is the reference risk category of that forest. The sum of the category prediction probabilities of the decision trees in each forest is 1, and the sum of the reference risk categories of the forests in the layer of forests is also 1. For example, the reference risk category for each forest may be the maximum of the category prediction probabilities of the decision trees in that forest.
It can be appreciated that by introducing the category prediction probability to determine the reference risk category, the accuracy of the reference risk category determination result can be further improved.
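The averaging and feature fusion steps can be sketched as follows. The sketch assumes that each decision tree outputs one probability per risk category, that a forest's output is the element-wise mean of its trees' outputs, and that fusion is plain concatenation; the function names and numbers are hypothetical.

```python
def forest_class_vector(tree_probabilities):
    """Average the per-tree category prediction probabilities of one forest."""
    tree_count = len(tree_probabilities)
    class_count = len(tree_probabilities[0])
    return [
        sum(tree[c] for tree in tree_probabilities) / tree_count
        for c in range(class_count)
    ]


def fuse_features(regional_risk_features, class_vectors):
    """Concatenate the regional risk features with the previous layer's outputs."""
    fused = list(regional_risk_features)
    for vector in class_vectors:
        fused.extend(vector)
    return fused


# Two hypothetical trees, each predicting [P(no abnormality), P(abnormal)].
trees = [[1.0, 0.0], [0.5, 0.5]]
vector = forest_class_vector(trees)

# Features to be processed by the next cascade level.
to_process = fuse_features([0.2, 0.7], [vector])
print(vector, to_process)
```

Because every tree's probabilities sum to 1, the averaged vector also sums to 1, so it can be read directly as a category probability distribution.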
It should be noted that, when each layer of forest is trained, the sample features to be processed of each layer are input in a cascade manner. After the training of a layer of forest is completed, the sample reference risk categories of that layer are output; the sample reference risk categories output by the layer are concatenated with one another and then with the input sample regional risk features; and the concatenated data are input into the next layer of forest. The sample regional risk features refer to sample risk assessment data of a sample region for which the presence or absence of an abnormality is known. The sample features to be processed refer to the sample feature data that each layer of forest in the depth forest needs to process. The sample reference risk category refers to a risk category of a sample region generated by any layer of forest in the depth forest.
The manner of determining each hyper-parameter in the depth forest is not limited, and the hyper-parameters may be set or adjusted by a technician according to experience. In an alternative embodiment, the hyper-parameters in the depth forest may be set manually.
In order to improve the accuracy of the hyper-parameters in the depth forest, in another alternative embodiment, the hyper-parameters may be set dynamically. Specifically, each hyper-parameter in the depth forest is determined in the following manner: constructing an initial population feature comprising the initial values of all hyper-parameters; performing feature search and/or solution space contraction iterations in the hyper-parameter solution space according to the initial population feature to obtain a target population feature; and taking each feature value in the target population feature as the parameter value of the corresponding hyper-parameter in the depth forest.
A hyper-parameter refers to a framework (structural) parameter of the depth forest. Specifically, the hyper-parameters in the depth forest include at least one of the following: the number of layers of random forests, the number of random forests in each layer, the number of decision trees in each random forest, and the maximum bifurcation number of the decision trees.
It can be appreciated that the number of layers of random forests, the number of random forests of each layer, the number of decision trees in the random forests and the maximum bifurcation number of the decision trees are introduced, so that the richness of the hyper-parameters in the depth forests is improved, and the integrity of the hyper-parameters is improved.
The initial value refers to the value of a hyper-parameter at the starting moment. The initial population feature refers to feature data comprising the initial value of each hyper-parameter. The target population feature refers to feature data comprising the parameter value of each hyper-parameter after processing.
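One way to picture a population feature is as a numeric vector with one entry per hyper-parameter. The sketch below is an assumption-laden illustration: the bounds, the names (`HyperParamBounds`, `initial_population`, `decode`) and the random initialisation are all chosen here and are not specified by the application.

```python
import random
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class HyperParamBounds:
    name: str
    low: int
    high: int


# Hypothetical search ranges for the four hyper-parameters named in the text.
BOUNDS = [
    HyperParamBounds("n_layers", 1, 10),
    HyperParamBounds("n_forests_per_layer", 2, 8),
    HyperParamBounds("n_trees_per_forest", 10, 500),
    HyperParamBounds("max_branches", 2, 64),
]


def initial_population(size: int, bounds: List[HyperParamBounds],
                       rng: random.Random) -> List[List[float]]:
    """Draw `size` initial population features uniformly inside the bounds."""
    return [[float(rng.randint(b.low, b.high)) for b in bounds]
            for _ in range(size)]


def decode(feature: List[float], bounds: List[HyperParamBounds]) -> Dict[str, int]:
    """Round and clip a population feature into named integer hyper-parameters."""
    return {b.name: min(b.high, max(b.low, round(v)))
            for b, v in zip(bounds, feature)}


population = initial_population(5, BOUNDS, random.Random(0))
params = decode(population[0], BOUNDS)
print(params)
```

Keeping the feature as floats during the search and rounding only when decoding lets the iterative updates move continuously while the depth forest still receives integer settings.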
Wherein, the characteristic search refers to searching in a solution space to search for an optimal solution. The solution space contraction refers to gradually narrowing the search range of the feature search by narrowing the solution space range. The method of selecting feature search or solution space contraction is not limited in this embodiment, and may be set by a technician according to experience.
It will be appreciated that determining the parameter values of the hyper-parameters by feature search and solution space contraction improves the accuracy of the determined parameter values.
In an alternative embodiment, the determination of the hyper-parameters may be selected from feature search and solution space contraction manually, empirically or as desired.
In another alternative embodiment, the determination of the hyper-parameters may be chosen from feature search and solution space contraction by introducing random probabilities. Specifically, for any feature iteration process, determining a target feature processing mode according to the random probability of the iteration; the target feature processing mode is feature searching or solution space contraction; based on the feature processing logic corresponding to the target feature processing mode, carrying out feature update on the current population feature obtained in the previous iteration to obtain the current population feature obtained in the current iteration; the current population characteristics obtained by the first iteration are initial population characteristics; and taking the current population characteristic obtained by the last iteration as a target population characteristic.
The random probability refers to the probability of selecting feature search or solution space contraction in any feature iteration process. The target feature processing mode refers to a method for determining parameter values of super parameters. Specifically, the target feature processing mode may be feature searching or solution space contraction. The feature search or the solution space contraction has corresponding feature processing logic. Wherein feature processing logic refers to the process of determining parameter values for super-parameters. The current population feature refers to feature data comprising parameter values of each super parameter after a feature iteration process is performed at the current moment.
The number of feature iterations in the embodiments of the present application is not limited, and may be set by a technician according to experience.
Illustratively, the target feature processing mode may be determined from the random probability by the following formula:
[Formula image BDA0004012902320000141, not reproduced in this text]
where the symbol shown in image BDA0004012902320000142 denotes the current population feature obtained by the current iteration, and p denotes the random probability.
It can be understood that by introducing random probability, the target feature processing mode is determined, so that the situation that the determination result is wrong when the target feature processing mode is manually determined is avoided, data support is provided for determining the target feature processing mode, and the accuracy of determining the target feature processing mode is improved.
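The per-iteration choice between the two processing modes can be sketched as a simple loop. The selection formula itself is an image that is not reproduced in this text, so the threshold of 0.5 used below is an assumption borrowed from the standard whale optimization algorithm (WOA); the function names and the toy update steps are likewise hypothetical.

```python
import random
from typing import Callable, List, Tuple

Feature = List[float]
Step = Callable[[Feature, int, int, random.Random], Feature]


def run_feature_iterations(initial: Feature, n_iterations: int,
                           feature_search: Step, contraction: Step,
                           rng: random.Random,
                           threshold: float = 0.5) -> Tuple[Feature, List[str]]:
    """Iterate from the initial population feature to the target population feature."""
    current = list(initial)  # the first "current" feature is the initial one
    modes: List[str] = []
    for iteration in range(n_iterations):
        p = rng.random()  # random probability of this iteration
        if p < threshold:  # assumed rule: small p selects feature search
            modes.append("feature_search")
            current = feature_search(current, iteration, n_iterations, rng)
        else:
            modes.append("solution_space_contraction")
            current = contraction(current, iteration, n_iterations, rng)
    return current, modes  # the last current feature is the target feature


# Toy stand-ins for the two feature processing logics (illustration only).
def toy_search(feature, iteration, n_iterations, rng):
    return [v + 1.0 for v in feature]


def toy_contraction(feature, iteration, n_iterations, rng):
    return [v * 0.5 for v in feature]


target, modes = run_feature_iterations([4.0, 8.0], 6, toy_search,
                                       toy_contraction, random.Random(0))
print(modes, target)
```

Passing the two update rules in as callables keeps the loop independent of how feature search and solution space contraction are actually implemented.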
In an alternative embodiment, if the target feature processing mode is feature search, based on feature processing logic corresponding to the target feature processing mode, feature updating is performed on the current population feature obtained in the previous iteration to obtain the current population feature obtained in the current iteration, including: determining a reference population characteristic according to the distance coefficient; determining a feature searching distance between the reference population feature and the current population feature obtained in the previous iteration; and according to the feature searching distance, performing feature searching towards the reference population feature in the hyper-parameter solution space to obtain the current population feature of the current iteration.
The size of the distance coefficient is not limited in this embodiment, and may be set by a technician according to experience. The reference population characteristics may be used to provide basis for determining the current population characteristics of the current iteration. Alternatively, the reference population feature may be any population feature determined by a random search, and the reference population feature may also be an optimal population feature determined by an optimal search. The feature searching distance refers to the distance between the current population feature and the reference population feature of the previous iteration in the feature searching mode.
The manner of determining the reference population feature according to the distance coefficient in the embodiment of the present application is not particularly limited. In an alternative embodiment, the reference population characteristics may be determined manually based on distance coefficients.
In order to improve the accuracy of determining the reference population characteristics, in another alternative embodiment, random search conditions and optimal search conditions may be introduced to determine the reference population characteristics. Specifically, if the distance coefficient meets the random search condition, selecting any candidate population feature in the super-parameter solution space as a reference population feature; if the distance coefficient meets the optimal searching condition, selecting candidate population characteristics with higher adaptation degree in the super-parameter solution space as reference population characteristics; wherein the random search condition is complementary to the optimal search condition.
The hyper-parameter solution space may be used to store feature data containing the parameter values of at least one set of hyper-parameters. A candidate population feature refers to feature data comprising the parameter values of one set of hyper-parameters in the hyper-parameter solution space. The embodiment of the application does not limit the content of the random search condition and/or the optimal search condition, which may be set by a technician according to experience, provided that the two conditions are complementary. Illustratively, the random search condition may be that the absolute value of the distance coefficient is greater than or equal to 1, and the optimal search condition may be that the absolute value of the distance coefficient is less than 1. The adaptation degree (fitness) refers to the probability that each candidate population feature in the hyper-parameter solution space can be used as the current population feature of the current iteration.
It should be noted that, when the distance coefficient satisfies the optimal search condition, the candidate population feature with the highest fitness in the hyper-parameter solution space may be selected as the reference population feature.
It can be understood that by introducing the random search condition and the optimal search condition, the reference population characteristic is determined, a judgment basis is provided for the selection of the reference population characteristic, and the accuracy of the determined reference population characteristic is improved.
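The two complementary conditions can be sketched as a single selection function. The thresholds follow the illustrative conditions given above (absolute value of the distance coefficient at least 1 for random search, below 1 for optimal search); the function name `select_reference`, the scalar coefficient, and the sample data are assumptions for illustration.

```python
import random
from typing import List


def select_reference(candidates: List[List[float]], fitness: List[float],
                     distance_coefficient: float,
                     rng: random.Random) -> List[float]:
    """Pick the reference population feature from the candidate features."""
    if not candidates or len(candidates) != len(fitness):
        raise ValueError("candidates and fitness must be non-empty and aligned")
    if abs(distance_coefficient) >= 1:
        # Random search condition: any candidate may serve as the reference.
        return rng.choice(candidates)
    # Optimal search condition: take the candidate with the highest fitness.
    best_index = max(range(len(candidates)), key=lambda i: fitness[i])
    return candidates[best_index]


candidates = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
fitness = [0.1, 0.9, 0.4]
rng = random.Random(7)
print(select_reference(candidates, fitness, 0.3, rng))  # [3.0, 4.0]
print(select_reference(candidates, fitness, 1.5, rng))  # some candidate
```

Because the two conditions partition all possible coefficient values, exactly one branch applies in every iteration.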
Further, the feature search distance is determined based on the determined reference population feature. For example, if the distance coefficient satisfies the random search condition, the feature search distance may be determined by the following formulas:
[Formula images BDA0004012902320000161 and BDA0004012902320000162, not reproduced in this text]
where the symbols, which appear as images in the original, denote in order: the feature search distance under the random search (image BDA0004012902320000163); the reference population feature, namely any candidate population feature in the hyper-parameter solution space (image BDA0004012902320000164); the current population feature obtained in the previous iteration (image BDA0004012902320000165); and a random vector in [0, 1] (image BDA0004012902320000166). C denotes a coefficient vector, namely the distance coefficient.
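The formula images are not reproduced in this text. The symbol definitions given here (a coefficient vector C, a random vector in [0, 1], a randomly chosen reference feature) match the random-search step of the standard whale optimization algorithm (WOA). Assuming the scheme follows that form, which the source does not confirm, the two formulas would read as follows; the notation $\vec{D}_{rand}$, $\vec{X}_{rand}$, $\vec{X}(t)$ and $\vec{r}$ is chosen here for illustration.

```latex
\vec{D}_{rand} = \left| \vec{C} \cdot \vec{X}_{rand} - \vec{X}(t) \right|,
\qquad
\vec{C} = 2\,\vec{r}
```

Here $\vec{X}_{rand}$ stands for the reference population feature picked at random, $\vec{X}(t)$ for the current population feature from the previous iteration, and $\vec{r}$ for the random vector in [0, 1].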
For example, if the distance coefficient satisfies the optimal search condition, the feature search distance may be determined by the following formula:
[Formula image BDA0004012902320000167, not reproduced in this text]
where the symbol in image BDA0004012902320000168 denotes the feature search distance under the optimal search, and the symbol in image BDA0004012902320000169 denotes the reference population feature, namely the candidate population feature with higher adaptation degree in the hyper-parameter solution space.
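As a sketch of how such a distance might be computed, the function below uses the element-wise form |C * reference - current| from the standard whale optimization algorithm. That form is an assumption, since the formula image is not reproduced here; `search_distance` and the sample values are hypothetical.

```python
from typing import List, Sequence


def search_distance(c: Sequence[float], reference: Sequence[float],
                    current: Sequence[float]) -> List[float]:
    """Element-wise |c * reference - current| (assumed WOA form)."""
    if not (len(c) == len(reference) == len(current)):
        raise ValueError("all vectors must have the same length")
    return [abs(ci * ri - xi) for ci, ri, xi in zip(c, reference, current)]


best = [4.0, 100.0]    # reference feature with the higher adaptation degree
current = [2.0, 60.0]  # current population feature from the previous iteration
c = [1.5, 0.5]         # coefficient vector
distance = search_distance(c, best, current)
print(distance)  # [4.0, 10.0]
```

The same function would serve the random-search case by passing a randomly chosen candidate as `reference`.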
Further, according to the feature searching distance, feature searching is conducted towards the reference population feature in the super-parameter solution space, and the current population feature of the iteration is obtained. In an alternative embodiment, the current population feature of the current iteration may be directly determined according to the determined feature search distance.
In order to improve the accuracy of the determined current population characteristics of the current iteration, in another alternative embodiment, the characteristic search distance may be updated. Specifically, according to the distance coefficient, updating the characteristic searching distance; and carrying out feature searching towards the reference population feature in the super-parameter solution space according to the updated feature searching distance to obtain the current population feature of the current iteration.
For example, if the distance coefficient satisfies the random search condition, the feature search distance (symbol image BDA0004012902320000172) may be updated according to the distance coefficient (symbol image BDA0004012902320000171), and the current population feature of the current iteration may be determined from the updated feature search distance (symbol image BDA0004012902320000173) by the following formulas:
[Formula images BDA0004012902320000174 and BDA0004012902320000175, not reproduced in this text]
where the symbols, which appear as images in the original, denote in order: the current population feature of the current iteration under the random search (image BDA0004012902320000176); the coefficient vector, namely the distance coefficient (image BDA0004012902320000177); the updated feature search distance under the random search (image BDA0004012902320000178); and the convergence factor, which decreases linearly from 2 to 0 during the iteration (image BDA0004012902320000179).
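The update step can be sketched as below. Since the formula images are missing, the expressions A = 2*a*r - a and X_new = X_ref - A*D are assumptions taken from the standard whale optimization algorithm; the function names and numbers are hypothetical and scalar coefficients are used for simplicity.

```python
from typing import List, Sequence


def convergence_factor(iteration: int, max_iterations: int) -> float:
    """Decrease linearly from 2 (iteration 0) to 0 (iteration == max_iterations)."""
    return 2.0 - 2.0 * iteration / max_iterations


def distance_coefficient(a: float, r: float) -> float:
    """A = 2*a*r - a with r in [0, 1], so A lies in [-a, a] (assumed WOA form)."""
    return 2.0 * a * r - a


def move_towards_reference(current: Sequence[float], reference: Sequence[float],
                           coeff_a: float, coeff_c: float) -> List[float]:
    """Scale the search distance by A and step from the reference feature."""
    new_feature = []
    for x, x_ref in zip(current, reference):
        distance = abs(coeff_c * x_ref - x)     # feature search distance
        updated_distance = coeff_a * distance   # distance updated by the coefficient
        new_feature.append(x_ref - updated_distance)
    return new_feature


a = convergence_factor(0, 10)            # 2.0 at the start of the iteration
coeff_a = distance_coefficient(a, 0.875)  # 1.5, so |A| >= 1: random search
new_feature = move_towards_reference([2.0, 60.0], [6.0, 100.0], coeff_a, 0.5)
print(coeff_a, new_feature)  # 1.5 [4.5, 85.0]
```

As the convergence factor shrinks, the range of A shrinks with it, so later iterations satisfy the optimal search condition more often and the search narrows around the best candidate.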
For example, if the distance coefficient satisfies the optimal search condition, the feature search distance (symbol image BDA0004012902320000171) may be updated according to the distance coefficient (symbol image BDA00040129023200001710), and the current population feature of the current iteration may be determined from the updated feature search distance (symbol image BDA00040129023200001712) by the following formula:
[Formula image BDA00040129023200001713, not reproduced in this text]
where the symbol in image BDA00040129023200001714 denotes the current population feature of the current iteration under the optimal search, and the symbol in image BDA00040129023200001715 denotes the updated feature search distance under the optimal search.
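For orientation, the optimal-search step of the standard whale optimization algorithm (WOA) is written out below. This is offered under the assumption that the scheme follows the WOA form, which the text suggests but the missing formula images cannot confirm; the notation is chosen here, with $\vec{X}^{*}(t)$ as the reference population feature with higher adaptation degree, $\vec{X}(t)$ as the current population feature from the previous iteration, $\vec{A}$ as the distance coefficient, $a$ as the convergence factor, and $\vec{r}$ as a random vector in [0, 1].

```latex
\vec{D} = \left| \vec{C} \cdot \vec{X}^{*}(t) - \vec{X}(t) \right|,
\qquad
\vec{X}(t+1) = \vec{X}^{*}(t) - \vec{A} \cdot \vec{D},
\qquad
\vec{A} = 2a\,\vec{r} - a
```

In this form the product $\vec{A} \cdot \vec{D}$ plays the role of the updated feature search distance, and the optimal search condition corresponds to $|\vec{A}| < 1$.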
It should be noted that, in the embodiment of the present application, whether the random search mode or the optimal search mode is used may be determined by the magnitude of the distance coefficient (symbol image BDA00040129023200001716).
It can be understood that the current population characteristic of the current iteration is determined by updating the characteristic search distance, so that the problem that the determined current population characteristic of the current iteration is inaccurate due to the fact that the characteristic search distance is kept fixed in the iteration process is avoided, and the accuracy of determining the current population characteristic of the current iteration is improved.
In the embodiment of the application, the current population characteristic of the current iteration is determined by introducing the reference population characteristic and the characteristic search distance, so that the accuracy of determining the current population characteristic of the current iteration is improved.
In another alternative embodiment, if the target feature processing mode is solution space contraction, based on feature processing logic corresponding to the target feature processing mode, feature updating is performed on the current population feature obtained in the previous iteration to obtain the current population feature obtained in the current iteration, including: taking candidate population characteristics with higher adaptation degree in the hyper-parametric solution space as reference population characteristics; determining a space searching distance between the reference population characteristic and the current population characteristic obtained in the previous iteration; updating the space searching distance according to a preset spiral coefficient; and according to the updated space searching distance, performing feature searching towards the reference population feature in the hyper-parametric solution space to obtain the current population feature obtained by the current iteration.
The space search distance refers to the distance between the reference population feature and the current population feature of the previous iteration in the solution space contraction mode. The value of the preset spiral coefficient is not limited; it may be set by a technician according to experience, or determined through repeated experiments.
By way of example, the space search distance may be determined by the following formula:
[Formula image BDA0004012902320000181, not reproduced in this text]
where the symbol in image BDA0004012902320000182 denotes the space search distance.
Further, the determined space search distance (symbol image BDA0004012902320000183) is updated according to the preset spiral coefficient; according to the updated space search distance (symbol image BDA0004012902320000184), the current population feature of the current iteration is determined by the following formula:
[Formula image BDA0004012902320000185, not reproduced in this text]
where the symbol in image BDA0004012902320000186 denotes the current population feature of the current iteration under solution space contraction; the symbol in image BDA0004012902320000187 denotes the updated space search distance; b denotes the preset spiral coefficient, which is a constant; and l denotes a random number in [-1, 1].
It can be understood that by introducing the preset spiral coefficient to update the space searching distance, the accuracy of the space searching distance is improved, the accuracy of the current population characteristic obtained in the iteration is further improved, and the inaccuracy of the current population characteristic determined in the iteration according to the fixed space searching distance is avoided.
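The spiral-style update can be sketched as follows. The form X_new = D' * exp(b*l) * cos(2*pi*l) + X_best with D' = |X_best - X| is the spiral step of the standard whale optimization algorithm and is assumed here, because the formula image is not available; `spiral_update` and the sample values are hypothetical.

```python
import math
from typing import List, Sequence


def spiral_update(current: Sequence[float], best: Sequence[float],
                  b: float, l: float) -> List[float]:
    """Move around the reference feature along a logarithmic spiral (assumed WOA form)."""
    # b is the preset spiral coefficient (a constant); l is a random number in [-1, 1].
    scale = math.exp(b * l) * math.cos(2.0 * math.pi * l)
    # abs(x_best - x) is the space search distance; multiplying by `scale` updates it.
    return [abs(x_best - x) * scale + x_best for x, x_best in zip(current, best)]


current = [2.0, 60.0]
best = [4.0, 100.0]
print(spiral_update(current, best, b=1.0, l=0.0))   # [6.0, 140.0]
print(spiral_update(current, best, b=1.0, l=0.25))  # approximately equal to `best`
```

With l = 0 the scale factor is exactly 1, and with l = 0.25 the cosine term vanishes, so the new feature lands on the reference feature; intermediate values of l trace the spiral between these cases.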
S340, determining the regional risk category of the region to be detected according to each reference risk category.
After the number of layers of the depth forest has been adaptively determined, the random forests and completely random forests in the final classification layer output the reference risk categories. The reference risk categories are averaged per category, and the risk category with the largest average value is taken as the regional risk category of the region to be detected.
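The final decision can be sketched as a per-category average followed by an arg-max. The function name `region_risk_category`, the category labels, and the sample probabilities are illustrative assumptions.

```python
from typing import List, Sequence, Tuple


def region_risk_category(final_layer_outputs: Sequence[Sequence[float]],
                         categories: Sequence[str]) -> Tuple[str, List[float]]:
    """Average the reference risk categories per category and pick the largest."""
    n_categories = len(categories)
    if not final_layer_outputs or any(len(p) != n_categories
                                      for p in final_layer_outputs):
        raise ValueError("every output must have one probability per category")
    n_outputs = len(final_layer_outputs)
    averages = [sum(p[c] for p in final_layer_outputs) / n_outputs
                for c in range(n_categories)]
    best = max(range(n_categories), key=lambda c: averages[c])
    return categories[best], averages


# Hypothetical outputs of the last forest layer for one region.
final_outputs = [[0.25, 0.75], [0.5, 0.5], [0.0, 1.0], [0.25, 0.75]]
label, averages = region_risk_category(final_outputs, ["normal", "abnormal"])
print(label, averages)  # abnormal [0.25, 0.75]
```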
According to the regional risk identification scheme provided by the embodiment of the application, the operation of determining the regional risk category of the region to be detected according to the regional risk features is refined into: inputting the regional risk features into the trained depth forest, and sequentially performing classification prediction through the random forest layers cascaded in the depth forest to obtain at least one reference risk category of the region to be detected; and determining the regional risk category of the region to be detected according to each reference risk category. Determining the regional risk category with a depth forest improves the accuracy of the determination result; meanwhile, the depth forest can determine the regional risk categories of different regions to be detected, which improves the applicability of the determination; moreover, the approach is robust and generalizes well.
As an implementation of the above-mentioned region risk identification method, the embodiment of the present application further provides an optional embodiment of an execution apparatus for implementing the region risk identification method.
Referring to Fig. 4, the regional risk identification device includes: a risk indicator data acquisition module 410, a regional risk feature extraction module 420, and a regional risk category determination module 430, wherein:
the risk index data acquisition module 410 is configured to acquire risk index data of each resource-related transaction in the region to be detected under different dimensions;
the regional risk feature extraction module 420 is configured to extract regional risk features in each risk indicator data;
the region risk category determination module 430 is configured to determine a region risk category of the region to be detected according to the region risk feature.
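The way the three modules chain together can be sketched as a small pipeline class. Everything below is a hypothetical illustration: the class name `RegionRiskIdentifier`, the callable signatures, and the toy stand-ins are assumptions and do not reflect an actual implementation of modules 410 to 430.

```python
from typing import Callable, List


class RegionRiskIdentifier:
    """Chains data acquisition (410), feature extraction (420) and category determination (430)."""

    def __init__(self,
                 acquire: Callable[[str], List[List[float]]],
                 extract: Callable[[List[List[float]]], List[float]],
                 classify: Callable[[List[float]], str]) -> None:
        self.acquire = acquire
        self.extract = extract
        self.classify = classify

    def identify(self, region_id: str) -> str:
        indicator_data = self.acquire(region_id)  # risk index data per dimension
        features = self.extract(indicator_data)   # regional risk features
        return self.classify(features)            # regional risk category


# Toy stand-ins so the sketch runs end to end.
def toy_acquire(region_id: str) -> List[List[float]]:
    return [[1.0, 3.0], [2.0, 4.0]]


def toy_extract(data: List[List[float]]) -> List[float]:
    return [sum(row) / len(row) for row in data]


def toy_classify(features: List[float]) -> str:
    return "abnormal" if sum(features) > 4.0 else "normal"


identifier = RegionRiskIdentifier(toy_acquire, toy_extract, toy_classify)
result = identifier.identify("region-001")
print(result)  # abnormal
```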
According to the regional risk identification scheme provided by the embodiment of the application, the risk index data of each resource-associated transaction in the region to be detected under different dimensionalities is acquired through the risk index data acquisition module; extracting regional risk features in each risk index data through a regional risk feature extraction module; and determining the regional risk category of the region to be detected according to the regional risk characteristics by a regional risk category determination module. According to the scheme, the richness and the expandability of the risk index data are improved by acquiring the risk index data under different dimensions; meanwhile, according to the regional risk characteristics extracted from the risk index data, the regional risk category of the region to be detected is determined, and the accuracy of the regional risk category determination result is improved.
Optionally, the regional risk feature extraction module 420 includes:
the initial characteristic data acquisition unit is used for carrying out characteristic extraction on each risk index data to obtain initial characteristic data;
a weight coefficient determining unit for determining a weight coefficient of the initial feature data;
the noise reduction threshold determining unit is used for determining a noise reduction threshold according to the initial characteristic data and the weight coefficient;
and the regional risk feature determining unit is used for determining regional risk features according to the noise reduction threshold value and the initial feature data.
Optionally, the regional risk category determination module 430 includes:
the reference risk category obtaining unit is used for inputting regional risk characteristics into the trained depth forest, and sequentially carrying out classification prediction according to the random forest layers cascaded in the depth forest to obtain at least one reference risk category of the region to be detected;
the regional risk category determining unit is used for determining the regional risk category of the region to be detected according to each reference risk category.
Optionally, the reference risk category acquiring unit includes:
the to-be-processed feature acquisition subunit is used for carrying out feature fusion on category prediction probabilities of different decision trees in random forests of a front cascade level and regional risk features aiming at any random forest layer in the depth forests to obtain to-be-processed features; the feature to be processed of the head layer is a regional risk feature;
The class prediction probability obtaining subunit is used for inputting the feature to be processed into the random forest of the post cascade level to obtain class prediction probabilities of different decision trees in the random forest of the post cascade level;
and the reference risk category determining subunit is used for taking category prediction probabilities of different decision trees in a random forest at the tail level in the depth forest as each reference risk category of the region to be detected.
Optionally, the apparatus further includes a superparameter determining unit, and the superparameter determining unit includes:
an initial population feature construction subunit, configured to construct an initial population feature including initial values of each super parameter;
the target population characteristic acquisition subunit is used for carrying out characteristic search and/or solution space contraction iteration in the super-parameter solution space according to the initial population characteristic to obtain a target population characteristic;
and the parameter value determining subunit is used for taking each characteristic value in the target population characteristic as the parameter value of the corresponding super parameter in the depth forest.
Optionally, the target population feature acquiring subunit includes:
the target feature processing mode determining slave unit is used for determining a target feature processing mode according to random probability of the iteration aiming at any feature iteration process; the target feature processing mode is feature searching or solution space contraction;
The current population characteristic obtaining slave unit is used for carrying out characteristic update on the current population characteristic obtained in the previous iteration based on the characteristic processing logic corresponding to the target characteristic processing mode to obtain the current population characteristic obtained in the current iteration; the current population characteristics obtained by the first iteration are initial population characteristics;
the target population characteristic obtaining slave unit is used for taking the current population characteristic obtained by the last iteration as the target population characteristic.
Optionally, if the target feature processing mode is feature searching, the current population feature acquiring slave unit is specifically configured to:
determining a reference population characteristic according to the distance coefficient;
determining a feature searching distance between the reference population feature and the current population feature obtained in the previous iteration;
and according to the feature searching distance, performing feature searching towards the reference population feature in the hyper-parameter solution space to obtain the current population feature of the current iteration.
Optionally, the current population feature obtaining slave unit is specifically configured to, when executing the method for determining the reference population feature according to the distance coefficient:
if the distance coefficient meets the random search condition, selecting any candidate population feature in the super-parameter solution space as a reference population feature;
If the distance coefficient meets the optimal searching condition, selecting candidate population characteristics with higher adaptation degree in the super-parameter solution space as reference population characteristics;
wherein the random search condition is complementary to the optimal search condition.
Optionally, when performing feature search towards the reference population feature in the hyper-parameter solution space according to the feature search distance to obtain the current population feature of the current iteration, the current population feature obtaining slave unit is specifically configured to:
updating the feature searching distance according to the distance coefficient;
and carrying out feature searching towards the reference population feature in the super-parameter solution space according to the updated feature searching distance to obtain the current population feature of the current iteration.
Optionally, if the target feature processing mode is solution space contraction, the current population feature acquiring slave unit is specifically configured to:
taking candidate population characteristics with higher adaptation degree in the hyper-parametric solution space as reference population characteristics;
determining a space searching distance between the reference population characteristic and the current population characteristic obtained in the previous iteration;
updating the space searching distance according to a preset spiral coefficient;
and according to the updated space searching distance, performing feature searching towards the reference population feature in the hyper-parametric solution space to obtain the current population feature obtained by the current iteration.
Optionally, each hyper-parameter in the depth forest comprises at least one of: the number of layers of random forests, the number of random forests of each layer, the number of decision trees in the random forests and the maximum bifurcation number of the decision trees.
The regional risk identification device provided by the embodiment of the application can execute the regional risk identification method provided by any embodiment of the application, and has the corresponding functional modules and beneficial effects of executing the regional risk identification methods.
Fig. 5 is a schematic structural diagram of an electronic device for implementing a regional risk identification method according to an embodiment of the present application. The electronic device 512 shown in fig. 5 is merely an example, and should not be construed as limiting the functionality and scope of use of embodiments of the present application.
As shown in FIG. 5, the electronic device 512 is in the form of a general purpose computing device. Components of electronic device 512 may include, but are not limited to: one or more processors or processing units 516, a system memory 528, a bus 518 that connects the various system components (including the system memory 528 and processing units 516).
Bus 518 represents one or more of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus architectures. By way of example, and not limitation, such architectures include the Industry Standard Architecture (ISA) bus, the Micro Channel Architecture (MCA) bus, the Enhanced ISA (EISA) bus, the Video Electronics Standards Association (VESA) local bus, and the Peripheral Component Interconnect (PCI) bus.
Electronic device 512 typically includes a variety of computer system readable media. Such media can be any available media that is accessible by electronic device 512 and includes both volatile and nonvolatile media, removable and non-removable media.
The system memory 528 may include computer system readable media in the form of volatile memory, such as Random Access Memory (RAM) 530 and/or cache memory 532. The electronic device 512 may further include other removable/non-removable, volatile/nonvolatile computer system storage media. By way of example only, storage system 534 may be used to read from or write to non-removable, nonvolatile magnetic media (not shown in FIG. 5, commonly referred to as a "hard disk drive"). Although not shown in fig. 5, a magnetic disk drive for reading from and writing to a removable non-volatile magnetic disk (e.g., a "floppy disk"), and an optical disk drive for reading from or writing to a removable non-volatile optical disk (e.g., a CD-ROM, DVD-ROM, or other optical media) may be provided. In such cases, each drive may be coupled to bus 518 through one or more data media interfaces. Memory 528 may include at least one program product having a set (e.g., at least one) of program modules configured to carry out the functions of the embodiments of the present application.
A program/utility 540 having a set (at least one) of program modules 542 may be stored in, for example, memory 528, such program modules 542 including, but not limited to, an operating system, one or more application programs, other program modules, and program data, each or some combination of which may include an implementation of a network environment. Program modules 542 generally perform the functions and/or methods in the embodiments described herein.
The electronic device 512 may also communicate with one or more external devices 514 (e.g., keyboard, pointing device, display 524, etc.), one or more devices that enable a user to interact with the electronic device 512, and/or any devices (e.g., network card, modem, etc.) that enable the electronic device 512 to communicate with one or more other computing devices. Such communication may occur through an input/output (I/O) interface 522. Also, the electronic device 512 may communicate with one or more networks such as a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network, such as the Internet, through the network adapter 520. As shown, network adapter 520 communicates with other modules of electronic device 512 over bus 518. It should be appreciated that although not shown, other hardware and/or software modules may be used in connection with electronic device 512, including, but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, data backup storage systems, and the like.
Processing unit 516 performs various functional applications and data processing, such as implementing the regional risk identification method provided by the embodiments of the present application, by running at least one of the plurality of programs stored in system memory 528.
The embodiments of the present application also provide a computer readable storage medium having stored thereon a computer program (also referred to as computer executable instructions) which, when executed by a processor, performs the regional risk identification method provided by the embodiments of the present application.
Any combination of one or more computer readable media may be employed as the computer storage media of the embodiments herein. The computer readable medium may be a computer readable signal medium or a computer readable storage medium. The computer readable storage medium can be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples (a non-exhaustive list) of the computer-readable storage medium would include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
The computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, either in baseband or as part of a carrier wave. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Computer program code for carrying out operations for embodiments of the present application may be written in one or more programming languages, including object-oriented programming languages such as Java, Smalltalk, and C++, and conventional procedural programming languages such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).
Embodiments of the present application also provide a computer program product comprising a computer program which, when executed by a processor, implements a regional risk identification method as provided by any of the embodiments of the present application.
In implementations of the computer program product, the computer program code for carrying out the operations of the present application may be written in one or more programming languages, including object-oriented programming languages such as Java, Smalltalk, and C++, and conventional procedural programming languages such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).
Note that the above is only a preferred embodiment of the present application and the technical principles applied. Those skilled in the art will appreciate that the present application is not limited to the particular embodiments described herein, and that various obvious changes, rearrangements and substitutions can be made by those skilled in the art without departing from the scope of the present application. Therefore, while the present application has been described in connection with the above embodiments, the present application is not limited to the above embodiments, but may include many other equivalent embodiments without departing from the spirit of the present application, the scope of which is defined by the scope of the appended claims.

Claims (25)

1. A method for identifying regional risk, comprising:
acquiring risk index data of each resource association transaction in the region to be detected under different dimensions;
extracting regional risk characteristics in each risk index data;
and determining the regional risk category of the region to be detected according to the regional risk characteristics.
2. The method of claim 1, wherein the extracting regional risk features in the risk indicator data comprises:
extracting features of each risk index data to obtain initial feature data;
determining a weight coefficient of the initial characteristic data;
determining a noise reduction threshold according to the initial characteristic data and the weight coefficient;
and determining the regional risk characteristics according to the noise reduction threshold value and the initial characteristic data.
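By way of illustration only, the four steps of claim 2 can be sketched as follows. The claim does not say how the weight coefficient or the noise reduction threshold is computed, so the magnitude-share weighting, the weighted-average threshold, and the soft-thresholding rule below are all assumptions, and the function name `soft_threshold_features` is hypothetical.

```python
def soft_threshold_features(initial_features):
    """Hypothetical sketch: weights -> noise reduction threshold -> denoised features."""
    magnitudes = [abs(x) for x in initial_features]
    total = sum(magnitudes)
    if total == 0:
        # Nothing to weight or shrink when every feature is zero.
        return list(initial_features), 0.0
    # Assumed weight coefficient: each feature's share of the total magnitude.
    weights = [m / total for m in magnitudes]
    # Assumed noise reduction threshold: weighted average of the magnitudes.
    threshold = sum(w * m for w, m in zip(weights, magnitudes))
    # Assumed denoising rule (soft thresholding): shrink every feature toward
    # zero by the threshold and zero out anything that falls below it.
    denoised = []
    for x, m in zip(initial_features, magnitudes):
        shrunk = max(m - threshold, 0.0)
        denoised.append(shrunk if x >= 0 else -shrunk)
    return denoised, threshold
```

With this assumed rule, small-magnitude entries are suppressed as noise while dominant entries survive in reduced form.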
3. The method according to claim 1, wherein determining the region risk category of the region to be detected based on the region risk feature comprises:
inputting the regional risk characteristics into a trained depth forest, and sequentially carrying out classification prediction according to a random forest layer cascaded in the depth forest to obtain at least one reference risk category of the region to be detected;
and determining the regional risk category of the region to be detected according to each reference risk category.
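As a minimal sketch of the last step of claim 3, the reference risk categories can be combined into a single regional risk category. The claim does not fix the combination rule; averaging the class probability vectors and taking the most probable class is an assumption, and the label names in the usage note are hypothetical.

```python
def aggregate_reference_categories(reference_probs, labels):
    """Average the reference probability vectors and pick the most probable label."""
    if not reference_probs:
        raise ValueError("at least one reference risk category is required")
    n = len(reference_probs)
    # Column-wise mean over all reference probability vectors.
    averaged = [sum(column) / n for column in zip(*reference_probs)]
    best_index = max(range(len(averaged)), key=averaged.__getitem__)
    return labels[best_index], averaged
```

For example, `aggregate_reference_categories([[0.7, 0.2, 0.1], [0.4, 0.5, 0.1]], ["low", "medium", "high"])` selects the label whose averaged probability is largest.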
4. The method according to claim 3, wherein the inputting the regional risk features into the trained depth forest and sequentially carrying out classification prediction according to the random forest layers cascaded in the depth forest to obtain at least one reference risk category of the region to be detected comprises:
for any random forest layer in the depth forest, carrying out feature fusion on the category prediction probabilities of different decision trees in the random forest of the preceding cascade level and the regional risk features to obtain the features to be processed; wherein the features to be processed of the head level are the regional risk features;
inputting the features to be processed into the random forest of the following cascade level to obtain the category prediction probabilities of different decision trees in the random forest of the following cascade level;
and taking the category prediction probability of different decision trees in a random forest of the tail level in the depth forest as each reference risk category of the region to be detected.
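The cascade described in claim 4 can be sketched without committing to any particular forest implementation. In the sketch below each "tree" is simply a callable that maps a feature vector to a class probability vector; `make_stub_tree` is a hypothetical stand-in for a trained decision tree, and fusing by concatenation is an assumption, since the claim only says "feature fusion".

```python
def cascade_predict(layers, risk_features):
    """Run the features through cascaded layers of tree-like predictors."""
    to_process = list(risk_features)  # head level: the regional risk features alone
    probs = []
    for layer in layers:
        # Class probability vector from every tree in this level.
        probs = [tree(to_process) for tree in layer]
        # Assumed fusion: concatenate this level's probabilities with the
        # original regional risk features to feed the next level.
        flat = [p for vector in probs for p in vector]
        to_process = flat + list(risk_features)
    return probs  # tail level probabilities serve as the reference risk categories


def make_stub_tree(bias):
    """Hypothetical stand-in for a trained tree: a two-class probability from a mean score."""
    def tree(features):
        score = sum(features) / len(features) + bias
        p_high = min(max(score, 0.0), 1.0)
        return [1.0 - p_high, p_high]
    return tree
```

Because each level sees both the previous level's probabilities and the original features, later levels can correct earlier ones without losing the raw input.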
5. A method according to claim 3, wherein each hyper-parameter in the depth forest is determined in the following way:
constructing initial population characteristics comprising initial values of all super parameters;
according to the initial population characteristics, performing characteristic search and/or solution space contraction iteration in a hyper-parameter solution space to obtain target population characteristics;
and taking each characteristic value in the target population characteristic as a parameter value of a corresponding super parameter in the depth forest.
6. The method of claim 5, wherein performing feature search and/or solution space contraction iteration in a hyper-parametric solution space based on the initial population feature to obtain a target population feature comprises:
for any feature iteration, determining a target feature processing mode according to the random probability of that iteration; wherein the target feature processing mode is feature search or solution space contraction;
based on the feature processing logic corresponding to the target feature processing mode, carrying out feature update on the current population feature obtained in the previous iteration to obtain the current population feature obtained in the current iteration; wherein the current population feature obtained by the first iteration is the initial population feature;
and taking the current population characteristic obtained by the last iteration as the target population characteristic.
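The iteration of claim 6 can be sketched as a loop that draws a random probability each round and dispatches to one of two update rules. The switching threshold of 0.5 is an assumption (the claim gives no value), and `search_step` and `contract_step` are hypothetical callables standing in for the two feature processing logics.

```python
import random


def iterate_population(initial, search_step, contract_step, n_iters,
                       rng=None, p_switch=0.5):
    """Repeatedly update a population feature, choosing the mode at random."""
    rng = rng or random.Random()
    current = list(initial)  # the first iteration starts from the initial population feature
    modes = []
    for _ in range(n_iters):
        p = rng.random()  # random probability of this iteration
        if p < p_switch:  # assumed rule: below the threshold means feature search
            current = search_step(current, rng)
            modes.append("search")
        else:             # otherwise solution space contraction
            current = contract_step(current, rng)
            modes.append("contract")
    return current, modes  # result of the last iteration is the target population feature
```

Passing a seeded `random.Random` makes a run reproducible, which is useful when comparing hyper-parameter searches.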
7. The method of claim 6, wherein if the target feature processing manner is feature search, the performing feature update on the current population feature obtained in the previous iteration based on feature processing logic corresponding to the target feature processing manner to obtain the current population feature obtained in the current iteration, includes:
determining a reference population characteristic according to the distance coefficient;
determining a feature searching distance between the reference population feature and the current population feature obtained in the previous iteration;
and carrying out feature searching towards the reference population feature in the hyper-parameter solution space according to the feature searching distance to obtain the current population feature of the current iteration.
8. The method of claim 7, wherein determining the reference population characteristic based on the distance coefficient comprises:
if the distance coefficient meets a random search condition, selecting any candidate population feature in the hyper-parametric solution space as the reference population feature;
if the distance coefficient meets the optimal searching condition, selecting candidate population characteristics with higher adaptation degree in the hyper-parametric solution space as the reference population characteristics;
wherein the random search condition is complementary to the optimal search condition.
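A minimal sketch of the reference selection in claim 8 follows. The claim only states that the two conditions are complementary; the concrete test used here (absolute value of the distance coefficient at least 1 means random search, otherwise optimal search) is an assumption borrowed from the whale optimization algorithm, which this procedure appears to resemble. The `fitness` callable stands in for the adaptation degree and is hypothetical.

```python
import random


def pick_reference(population, fitness, distance_coeff, rng):
    """Choose the reference population feature from the candidates."""
    if abs(distance_coeff) >= 1.0:
        # Assumed random search condition: explore by picking any candidate.
        return rng.choice(population)
    # Assumed optimal search condition: exploit the fittest candidate.
    return max(population, key=fitness)
```

The two branches cover every value of the coefficient exactly once, which matches the complementarity required by the claim.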
9. The method of claim 7, wherein the performing feature search in the hyper-parametric solution space towards the reference population feature according to the feature search distance to obtain the current population feature of the current iteration comprises:
updating the characteristic searching distance according to the distance coefficient;
and carrying out feature searching towards the reference population feature in the hyper-parameter solution space according to the updated feature searching distance to obtain the current population feature of the current iteration.
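The search move of claims 7 and 9 can be sketched per dimension. The formulas are assumptions modeled on the whale optimization algorithm: the feature search distance is taken as the absolute difference between the (optionally scaled) reference and the current value, and the updated distance is that difference multiplied by the distance coefficient. The parameter `scale_coeff` is hypothetical and not named in the claims.

```python
def search_move(current, reference, distance_coeff, scale_coeff=1.0):
    """Move a population feature relative to the reference feature."""
    updated = []
    for x, ref in zip(current, reference):
        distance = abs(scale_coeff * ref - x)  # feature search distance
        step = distance_coeff * distance       # distance updated by the coefficient
        updated.append(ref - step)             # new position relative to the reference
    return updated
```

With a coefficient of 0 the feature lands exactly on the reference, and coefficients of larger magnitude leave it further away, so shrinking the coefficient over the iterations narrows the search.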
10. The method of claim 6, wherein if the target feature processing mode is solution space contraction, the performing feature update on the current population feature obtained in the previous iteration based on feature processing logic corresponding to the target feature processing mode to obtain the current population feature obtained in the current iteration includes:
taking the candidate population characteristic with higher adaptation degree in the hyper-parametric solution space as a reference population characteristic;
determining a space searching distance between the reference population characteristic and the current population characteristic obtained in the previous iteration;
updating the space searching distance according to a preset spiral coefficient;
and according to the updated space searching distance, performing feature searching towards the reference population feature in the hyper-parametric solution space to obtain the current population feature obtained by the current iteration.
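The solution space contraction of claim 10 can be illustrated with a logarithmic spiral update. The formula below follows the spiral step of the whale optimization algorithm and is an assumption; the claim names only a "preset spiral coefficient". The parameter `l` (a random number usually drawn from [-1, 1]) is hypothetical.

```python
import math


def spiral_move(current, reference, spiral_coeff, l):
    """Spiral a population feature around the reference feature."""
    # Spiral factor: exponential envelope times a cosine oscillation.
    factor = math.exp(spiral_coeff * l) * math.cos(2 * math.pi * l)
    # Per dimension: space search distance, scaled by the factor, offset by the reference.
    return [abs(ref - x) * factor + ref for x, ref in zip(current, reference)]
```

When the cosine term is zero the feature coincides with the reference; other values of `l` place it on a spiral around the reference.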
11. The method of any one of claims 5-10, wherein each hyper-parameter in the depth forest comprises at least one of: the number of layers of random forests, the number of random forests of each layer, the number of decision trees in the random forests and the maximum bifurcation number of the decision trees.
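To make the link between a population feature and the hyper-parameters of claim 11 concrete, the sketch below decodes a continuous position vector into integer hyper-parameter values. The key names and the numeric bounds are hypothetical; the claims list the four hyper-parameters but give no ranges.

```python
# Hypothetical search bounds (inclusive) for the four hyper-parameters of claim 11.
BOUNDS = {
    "n_layers": (1, 8),            # number of random forest layers
    "forests_per_layer": (1, 4),   # number of random forests in each layer
    "trees_per_forest": (10, 500), # number of decision trees in a random forest
    "max_branches": (2, 64),       # maximum bifurcation number of a decision tree
}


def decode_hyperparameters(position):
    """Clamp each coordinate to its bounds and round it to an integer value."""
    if len(position) != len(BOUNDS):
        raise ValueError("position must have one coordinate per hyper-parameter")
    decoded = {}
    for (name, (low, high)), value in zip(BOUNDS.items(), position):
        decoded[name] = int(round(min(max(value, low), high)))
    return decoded
```

Decoding in this way lets the search operate in a continuous space while the depth forest still receives valid integer settings.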
12. A regional risk identification apparatus, comprising:
the risk index data acquisition module is used for acquiring risk index data of each resource association transaction in the region to be detected under different dimensionalities;
the regional risk feature extraction module is used for extracting regional risk features in each risk index data;
and the regional risk category determining module is used for determining the regional risk category of the region to be detected according to the regional risk characteristics.
13. The apparatus of claim 12, wherein the regional risk feature extraction module comprises:
the initial characteristic data acquisition unit is used for carrying out characteristic extraction on each risk index data to obtain initial characteristic data;
a weight coefficient determining unit for determining a weight coefficient of the initial feature data;
the noise reduction threshold determining unit is used for determining a noise reduction threshold according to the initial characteristic data and the weight coefficient;
and the regional risk feature determining unit is used for determining the regional risk feature according to the noise reduction threshold value and the initial feature data.
14. The apparatus of claim 12, wherein the regional risk category determination module comprises:
the reference risk category obtaining unit is used for inputting the regional risk characteristics into a trained depth forest, and sequentially carrying out classification prediction according to a random forest layer cascaded in the depth forest to obtain at least one reference risk category of the region to be detected;
and the regional risk category determining unit is used for determining the regional risk category of the region to be detected according to each reference risk category.
15. The apparatus of claim 14, wherein the reference risk category acquisition unit comprises:
the to-be-processed feature acquisition subunit is used for carrying out feature fusion on category prediction probabilities of different decision trees in random forests of a front cascade level and the regional risk features aiming at any random forest layer in the depth forests to obtain to-be-processed features; wherein the feature to be processed of the header level is the regional risk feature;
the class prediction probability obtaining subunit is used for inputting the feature to be processed into the random forest of the post cascade level to obtain class prediction probabilities of different decision trees in the random forest of the post cascade level;
and the reference risk category determining subunit is used for taking category prediction probabilities of different decision trees in a random forest of a tail level in the depth forest as each reference risk category of the region to be detected.
16. The apparatus according to claim 14, further comprising a super parameter determination unit, the super parameter determination unit comprising:
an initial population feature construction subunit, configured to construct an initial population feature including initial values of each super parameter;
the target population characteristic acquisition subunit is used for carrying out characteristic search and/or solution space contraction iteration in the super-parameter solution space according to the initial population characteristic to obtain a target population characteristic;
and the parameter value determining subunit is used for taking each characteristic value in the target population characteristic as the parameter value of the corresponding super parameter in the depth forest.
17. The apparatus of claim 16, wherein the target population feature acquisition subunit comprises:
the target feature processing mode determining slave unit is used for determining a target feature processing mode according to random probability of the iteration aiming at any feature iteration process; the target feature processing mode is feature searching or solution space contraction;
the current population feature acquisition slave unit is used for carrying out feature update on the current population feature obtained in the previous iteration based on the feature processing logic corresponding to the target feature processing mode to obtain the current population feature obtained in the current iteration; the current population characteristics obtained by the first iteration are the initial population characteristics;
and the target population characteristic obtaining slave unit is used for taking the current population characteristic obtained by the last iteration as the target population characteristic.
18. The apparatus of claim 17, wherein if the target feature processing method is feature searching, the current population feature acquisition slave unit is specifically configured to:
determining a reference population characteristic according to the distance coefficient;
determining a feature searching distance between the reference population feature and the current population feature obtained in the previous iteration;
and carrying out feature searching towards the reference population feature in the hyper-parameter solution space according to the feature searching distance to obtain the current population feature of the current iteration.
19. The apparatus according to claim 18, wherein the current population characteristic obtaining slave unit, when executing the method for determining the reference population characteristic according to the distance coefficient, is specifically configured to:
if the distance coefficient meets a random search condition, selecting any candidate population feature in the hyper-parametric solution space as the reference population feature;
if the distance coefficient meets the optimal searching condition, selecting candidate population characteristics with higher adaptation degree in the hyper-parametric solution space as the reference population characteristics;
wherein the random search condition is complementary to the optimal search condition.
20. The apparatus of claim 18, wherein the current population feature acquisition slave unit, when performing the feature search toward the reference population feature in the hyper-parametric solution space according to the feature search distance to obtain the current population feature of the current iteration, is specifically configured to:
updating the characteristic searching distance according to the distance coefficient;
and carrying out feature searching towards the reference population feature in the hyper-parameter solution space according to the updated feature searching distance to obtain the current population feature of the current iteration.
21. The apparatus of claim 17, wherein if the target feature processing manner is solution space contraction, the current population feature acquisition slave unit is specifically configured to:
taking the candidate population characteristic with higher adaptation degree in the hyper-parametric solution space as a reference population characteristic;
determining a space searching distance between the reference population characteristic and the current population characteristic obtained in the previous iteration;
updating the space searching distance according to a preset spiral coefficient;
and according to the updated space searching distance, performing feature searching towards the reference population feature in the hyper-parametric solution space to obtain the current population feature obtained by the current iteration.
22. The apparatus of any one of claims 16-21, wherein each hyper-parameter in the depth forest comprises at least one of: the number of layers of random forests, the number of random forests of each layer, the number of decision trees in the random forests and the maximum bifurcation number of the decision trees.
23. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable by the processor, wherein the processor implements the regional risk identification method of any of claims 1-11 when the computer program is executed by the processor.
24. A computer readable storage medium having stored thereon a computer program, which when executed by a processor implements the regional risk identification method as claimed in any one of claims 1-11.
25. A computer program product comprising a computer program which, when executed by a processor, implements the regional risk identification method as claimed in any one of claims 1 to 11.
CN202211658951.9A 2022-12-22 2022-12-22 Regional risk identification method, device, equipment and storage medium Pending CN116011810A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211658951.9A CN116011810A (en) 2022-12-22 2022-12-22 Regional risk identification method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211658951.9A CN116011810A (en) 2022-12-22 2022-12-22 Regional risk identification method, device, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN116011810A true CN116011810A (en) 2023-04-25

Family

ID=86027601

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211658951.9A Pending CN116011810A (en) 2022-12-22 2022-12-22 Regional risk identification method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN116011810A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117011063A (en) * 2023-09-25 2023-11-07 中国建设银行股份有限公司 Customer transaction risk prediction processing method and device
CN117011063B (en) * 2023-09-25 2023-12-29 中国建设银行股份有限公司 Customer transaction risk prediction processing method and device

Similar Documents

Publication Publication Date Title
TWI723528B (en) Computer-executed event risk assessment method and device, computer-readable storage medium and computing equipment
CN111460250B (en) Image data cleaning method, image data cleaning device, image data cleaning medium, and electronic apparatus
US20230281298A1 (en) Using multimodal model consistency to detect adversarial attacks
CN107229627B (en) Text processing method and device and computing equipment
CN111368878B (en) Optimization method based on SSD target detection, computer equipment and medium
CN116089873A (en) Model training method, data classification and classification method, device, equipment and medium
CN112883990A (en) Data classification method and device, computer storage medium and electronic equipment
CN116011810A (en) Regional risk identification method, device, equipment and storage medium
CN116304033B (en) Complaint identification method based on semi-supervision and double-layer multi-classification
CN113239883A (en) Method and device for training classification model, electronic equipment and storage medium
CN116342164A (en) Target user group positioning method and device, electronic equipment and storage medium
CN111738290A (en) Image detection method, model construction and training method, device, equipment and medium
CN114692778B (en) Multi-mode sample set generation method, training method and device for intelligent inspection
CN113836297B (en) Training method and device for text emotion analysis model
US11954685B2 (en) Method, apparatus and computer program for selecting a subset of training transactions from a plurality of training transactions
CN113822313A (en) Method and device for detecting abnormity of graph nodes
CN111427880A (en) Data processing method, device, computing equipment and medium
CN113806558B (en) Question selection method, knowledge graph construction device and electronic equipment
CN113033170B (en) Form standardization processing method, device, equipment and storage medium
US20230079815A1 (en) Calculate fairness of machine learning model by identifying and filtering outlier transactions
CN116028880B (en) Method for training behavior intention recognition model, behavior intention recognition method and device
CN116932487B (en) Quantized data analysis method and system based on data paragraph division
CN116703616A (en) Nuclear protection method, device, terminal equipment and storage medium
CN117112395A (en) API abnormal access detection method, device, equipment and medium
CN116090462A (en) Element extraction method, element extraction device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination