CN112445844A - Financial data management control system of big data platform

Info

Publication number
CN112445844A
Authority
CN
China
Prior art keywords
data
abnormal
abnormal data
time
judgment
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011360661.7A
Other languages
Chinese (zh)
Other versions
CN112445844B (en)
Inventor
卿赟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shaanxi Elide Software Technology Co.,Ltd.
Original Assignee
Chongqing Medical and Pharmaceutical College
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chongqing Medical and Pharmaceutical College
Priority to CN202011360661.7A (patent CN112445844B)
Publication of CN112445844A
Application granted
Publication of CN112445844B
Legal status: Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24: Querying
    • G06F16/245: Query processing
    • G06F16/2458: Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2477: Temporal data queries
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/25: Integrating or interfacing systems involving database management systems
    • G06F16/252: Integrating or interfacing systems involving database management systems between a Database Management System and a front-end application
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00: Administration; Management
    • G06Q10/06: Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063: Operations research, analysis or management
    • G06Q10/0635: Risk analysis of enterprise or organisation activities
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00: Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/04: Trading; Exchange, e.g. stocks, commodities, derivatives or currency exchange
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00: Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/12: Accounting
    • G06Q40/125: Finance or payroll

Abstract

The invention provides a financial data management control system for a big data platform, comprising: a query extraction module, which acquires financial data through a cloud database, logs in to the abnormal data within the financial data, starts a preliminary query, and queries, checks and extracts invalid data in real time during the query; an abnormality judgment module, which sets a judgment interval for the abnormal data after the real-time query, check and extraction, and forms standardized data within the judgment interval; a screening and scoring module, which analyzes the degree of deviation of the standardized data, screens the abnormal data through a screening model after the analysis, and scores the features of the screened abnormal data; and a comprehensive judgment module, which, after the feature scoring, judges and outputs the risk degree of the abnormal data in the financial data through a comprehensive risk judgment model.

Description

Financial data management control system of big data platform
Technical Field
The invention relates to the field of big data analysis, in particular to a financial data management control system of a big data platform.
Background
With the rapid development of informatization and intelligent technology, the growing number of transactions in financial data management produces transaction history data that is difficult to count, and not all of this history consists of compliant operations or normal transaction behavior. Financial data managers can no longer cope with today's increasingly complex abnormal transaction behavior through traditional checking methods.
This is especially true in schools, government offices and large chain enterprises, where the total transaction amount and the number of transactions are hard to count. Transaction behavior that carries transaction risk cannot be identified quickly and accurately through traditional computer accumulation or statistical principles, and even when some extraction algorithms are used, the control and checking of abnormal financial data remain inaccurate. Those skilled in the art therefore urgently need to solve these technical problems.
Disclosure of Invention
The invention aims to solve at least the technical problems existing in the prior art, and in particular innovatively provides a financial data management control system for a big data platform.
In order to achieve the above object, the present invention provides a big data platform financial data management control system, comprising:
the query extraction module, which is used for acquiring financial data through the cloud database, logging in to the abnormal data in the financial data, then starting a preliminary query, and querying, checking and extracting invalid data in real time during the query;
the abnormality judgment module, which is used for setting a judgment interval for the abnormal data after the real-time query, check and extraction, and forming standardized data within the judgment interval;
the screening and scoring module, which is used for analyzing the degree of deviation of the standardized data, screening the abnormal data through the screening model after the analysis, and scoring the features of the screened abnormal data;
and the comprehensive judgment module, which is used for judging and outputting the risk degree of the abnormal data in the financial data through the comprehensive risk judgment model after the feature scoring.
Preferably, the query extraction module includes:
financial data is retrieved from the cloud database and abnormal data is obtained from it; during the preliminary query, the abnormal data extraction process balances the data load by requesting financial data from the cloud database dynamically; an abnormal data acquisition threshold is set through dynamic configuration; and different abnormal data are extracted for the login operation according to the security control mechanisms and authority management requirements of the different financial data;
during the preliminary query, the cloud database stores the financial data authentication information and function access authority information in a local database, so that financial data authentication and function authority control are performed in a unified manner; the abnormal data of the financial data is logically isolated and stored in an independent database; during financial data login, the identity of the user is verified, the set of abnormal data that the user is entitled to access is constructed from the abnormal data access authority information in the financial data, and authenticated access is performed through the identity authentication process of the cloud database; if the access fails, abnormal data access failure information is returned; if the access succeeds, the login succeeds; and an independent channel is established between the user and the application server instance dynamically allocated by the system.
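By way of illustration only, the login and authority check described above might be sketched as follows. This is a minimal sketch under stated assumptions, not the patented implementation: the `User` type, the `accessible_abnormal_data` function and the dictionary-based store are hypothetical names introduced here, and real identity verification is reduced to a boolean flag.

```python
from dataclasses import dataclass, field


@dataclass
class User:
    # Hypothetical user record; "authenticated" stands in for the
    # identity check performed during financial data login.
    name: str
    authenticated: bool
    permissions: set = field(default_factory=set)


def accessible_abnormal_data(user, abnormal_store):
    """Build the set of abnormal data the user is entitled to access.

    abnormal_store is assumed to be a dict kept separately from the
    ordinary financial data (the "independent database" of the text).
    A PermissionError plays the role of the access failure information.
    """
    if not user.authenticated:
        raise PermissionError("abnormal data access failed: identity not verified")
    allowed = {key: rows for key, rows in abnormal_store.items()
               if key in user.permissions}
    if not allowed:
        raise PermissionError("abnormal data access failed: no access rights")
    return allowed


store = {"acct-001": [12000.0], "acct-002": [50.0, 50.0]}
auditor = User("auditor", True, {"acct-001"})
visible = accessible_abnormal_data(auditor, store)
```

Raising an exception on failure keeps the "access failed" and "login successful" branches of the text clearly separated for the caller.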
Preferably, the query extraction module includes:
the process of accessing and using the abnormal data comprises: forming abnormal data relation nodes from the plurality of abnormal data; searching the PaaS platform resources and converting the relation nodes into tree nodes to generate an abnormal data tree node list; taking an empty abnormal data node set as the current node set and traversing the current abnormal data tree node set to judge whether the abnormal data parent resource information list of the node set currently being traversed is equal to the preset abnormal data root node information list; if it is equal, the node set currently being traversed is the root node of the current abnormal data authority tree; if it is not equal, continuing to traverse to the resource whose abnormal data identifier is equal to the parent resource information list of the node currently being traversed, and marking that resource as the abnormal data parent node of the node currently being traversed.
Preferably, the query extraction module includes:
judging, according to whether an abnormal data node is equal to the parent resource information list of the node currently being traversed, whether traversal of the current tree node list is complete; if the traversal is complete, detecting the abnormal data parent node information list; if the traversal is not complete, taking the current abnormal data parent node information list as the root node of the current tree node and recursing to construct the abnormal data service query tree; and redistributing the plurality of abnormal data query requests concentrated on one abnormal data node to computing nodes and backup computing nodes, so that each computing node and each backup computing node is assigned only one sub-query.
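The redistribution of sub-queries at the end of this step can be illustrated with a short sketch. It assumes, as one possible reading of the text, that every sub-query receives exactly one computing node and one backup computing node and that no node serves more than one sub-query; the function name `assign_subqueries` and the node labels are hypothetical.

```python
def assign_subqueries(subqueries, compute_nodes, backup_nodes):
    """Give each sub-query its own computing node and its own backup node.

    Raises ValueError when there are too few nodes to keep the
    one-sub-query-per-node property described in the text.
    """
    needed = len(subqueries)
    if needed > len(compute_nodes) or needed > len(backup_nodes):
        raise ValueError("not enough nodes for one sub-query per node")
    plan = {}
    # zip stops at the shortest input, so surplus nodes simply stay idle.
    for query, primary, backup in zip(subqueries, compute_nodes, backup_nodes):
        plan[query] = {"primary": primary, "backup": backup}
    return plan


plan = assign_subqueries(["q1", "q2"], ["c1", "c2", "c3"], ["b1", "b2"])
```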
Preferably, the abnormality determination module includes:
after the query and check, dividing the abnormal data into judgment intervals, the judgment intervals being generated by calculating the similarity of the abnormal data, and standardizing the abnormal data through proportional scaling, wherein: the converted value of u_i, the transaction data in which an abnormally large amount of funds is transferred in and then rapidly transferred out in dispersed amounts, is u'_i; the converted value of v_i, the transaction data in which abnormally large amounts of dispersed funds are transferred in and then rapidly transferred out in a concentrated amount, is v'_i; the converted value of the abnormal time-point transaction data x_i is x'_i; the converted value of the abnormal same-amount transaction data y_i is y'_i; and the converted value of the abnormal excess transaction data z_i is z'_i;
substituting the converted abnormal transaction data and the time and date variables into the judgment model, and calculating the judgment value of the abnormal data for any statistical time and date:
Figure BDA0002803831260000031
wherein B(t, d) is the judgment value of the abnormal data at any time t and any date d; f(u'_i; t, d) is the time-and-date judgment value of the transaction data in which an abnormally large amount of funds is transferred in and then rapidly transferred out in dispersed amounts; f(v'_i; t, d) is the time-and-date judgment value of the transaction data in which abnormally large amounts of dispersed funds are transferred in and then rapidly transferred out in a concentrated amount; f(x'_i; t, d) is the time-and-date judgment value of the abnormal time-point transaction data; f(y'_i; t, d) is the time-and-date judgment value of the abnormal same-amount transaction data; f(z'_i; t, d) is the time-and-date judgment value of the abnormal excess transaction data; and the maximum value of i is 60, so that the abnormal data for each second within one minute is monitored and judged in real time.
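The proportional scaling used to standardize the abnormal data is not spelled out in the text. One common form is min-max scaling, sketched below purely as an assumption; the function name `scale_proportionally` and the target range [0, 1] are choices made for this illustration, not taken from the patent.

```python
def scale_proportionally(values, low=0.0, high=1.0):
    """Min-max scale a list of amounts into the interval [low, high].

    When every value is identical there is no spread to scale, so each
    value is mapped to the lower bound.
    """
    smallest, largest = min(values), max(values)
    if largest == smallest:
        return [low for _ in values]
    span = largest - smallest
    return [low + (v - smallest) * (high - low) / span for v in values]


# Raw amounts u_i and their converted values u'_i under this assumption.
u = [10, 20, 30]
u_converted = scale_proportionally(u)
```

Scaling all five categories into the same range lets values from different categories be combined in one judgment model without any single category dominating because of its units.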
Preferably, the abnormality determination module includes:
calculating, for each abnormal data item, the difference between its actual value and its judgment value at the given time and date, and fitting a linear curve to the discrete abnormal data by means of the residual sum of squares, thereby judging the risk trend of the abnormal data:
Figure BDA0002803831260000041
wherein W is the residual sum of squares of the abnormal data; B_0(t, d) is the actual value of each abnormal data item at the given time and date; B(t, d) is the judgment value of each abnormal data item at the given time and date; and M is the upper limit of the statistics, namely the maximum statistical time or the maximum number of days.
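The residual sum of squares and the linear fit named in this step can be illustrated as follows. The sketch assumes the standard definitions, that is, W as the sum of squared differences between actual and judgment values and an ordinary least-squares line; the function names and sample numbers are hypothetical, and the patent's own formula image may differ in its indexing.

```python
def residual_sum_of_squares(actual, judged):
    """Sum of squared differences between actual and judgment values."""
    if len(actual) != len(judged):
        raise ValueError("actual and judged must have the same length")
    return sum((a - b) ** 2 for a, b in zip(actual, judged))


def fit_line(xs, ys):
    """Ordinary least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("xs must contain at least two distinct values")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# B_0(t, d) and B(t, d) sampled at three time points (made-up values).
w = residual_sum_of_squares([3, 5, 7], [2, 5, 9])
slope, intercept = fit_line([0, 1, 2, 3], [1, 3, 5, 7])
```

A positive slope of the fitted line would indicate a rising trend in the discrete abnormal data, which is one way to read "judging the risk trend".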
Preferably, the abnormality determination module includes:
then calculating the degree of deviation of the abnormal data
Figure BDA0002803831260000042
wherein F is a calculation constant that is adjusted through the adjustment coefficient λ, F becoming larger as W increases; H_j is the accurately acquired value of the j-th abnormal data item; after the accurately acquired values of all N abnormal data items have been accumulated, deviation convergence is performed on the characteristic value e; and β is the characteristic threshold.
Preferably, the screening and scoring module comprises:
after the degree of deviation of the abnormal data has been analyzed, calculating statistical information within the abnormal data through the prior probability distribution; calculating the prior conditional probability distribution of the abnormal data and setting the internal attributes of a first abnormal data set C and a second abnormal data set E, wherein the first abnormal data set comprises u_i and v_i and the second abnormal data set comprises x_i, y_i and z_i; defining a time class attribute G and a date class attribute I of the abnormal data; and calculating the respective conditional probabilities under the probability distribution
Figure BDA0002803831260000043
And
Figure BDA0002803831260000044
and calculating to obtain:
Figure BDA0002803831260000045
the derivation is continued to obtain,
Figure BDA0002803831260000046
wherein
Figure BDA0002803831260000047
Representing a first set of anomalous data
Figure BDA0002803831260000048
Traversing the first abnormal data set by combining the probability distribution with the time class attribute G and the date class attribute I
Figure BDA0002803831260000049
And all values of the time class attribute G get its conditional probability distribution
Figure BDA0002803831260000051
And a first set of exception data
Figure BDA0002803831260000052
Obtaining the conditional probability distribution of all the values of the date type attribute I
Figure BDA0002803831260000053
where Q(G) is the conditional probability of the time class attribute and Q(I) is the conditional probability of the date class attribute;
then, calculating:
Figure BDA0002803831260000054
the derivation is continued to obtain,
Figure BDA0002803831260000055
wherein
Figure BDA0002803831260000056
Representing a second set of anomalous data
Figure BDA0002803831260000057
Traversing the second abnormal data set according to the joint probability distribution of the time class attribute G and the date class attribute I
Figure BDA0002803831260000058
And all values of the time class attribute G get its conditional probability distribution
Figure BDA0002803831260000059
And a second set of exception data
Figure BDA00028038312600000510
Obtaining the conditional probability distribution of all the values of the date type attribute I
Figure BDA00028038312600000511
Preferably, the screening and scoring module comprises:
the joint probability distribution value of each abnormal data node in the first abnormal data set C and the condition information of the time attribute and the date attribute of each abnormal data node in the second abnormal data set E is as follows;
Figure BDA00028038312600000512
selecting class attributes J of the abnormal data and putting the class attributes J into a big data platform; constructing a naive Bayesian network by taking class attributes J as parent nodes of internal attributes in the first abnormal data set C and the second abnormal data set E;
putting the nodes in the first abnormal data set C and the second abnormal data set E into the Bayesian network one by one; if in the first abnormal data set C
Figure BDA00028038312600000513
then
Figure BDA00028038312600000514
is put into the network as a parent node; if in the second abnormal data set E
Figure BDA00028038312600000515
then
Figure BDA00028038312600000516
is put into the network as a parent node, thereby obtaining the Bayesian network for grade screening and sorting of the abnormal data.
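As an illustration of the naive Bayesian structure described above, in which the class attribute J is the parent of the time class attribute G and the date class attribute I, the following sketch estimates the conditional probabilities by relative frequency and scores a record. It is a generic naive Bayes example under stated assumptions, not the patent's formulas (which are only available as images); the record layout, the labels "high" and "low" and the function names are hypothetical.

```python
from collections import Counter, defaultdict


def conditional_probabilities(records, attribute, parent="J"):
    """Estimate P(attribute = value | parent = label) by relative frequency."""
    counts = defaultdict(Counter)
    for record in records:
        counts[record[parent]][record[attribute]] += 1
    return {label: {value: n / sum(counter.values())
                    for value, n in counter.items()}
            for label, counter in counts.items()}


def naive_bayes_scores(query, records, attributes=("G", "I"), parent="J"):
    """Unnormalized P(label) * product of P(attribute | label) per label."""
    label_counts = Counter(record[parent] for record in records)
    total = sum(label_counts.values())
    tables = {a: conditional_probabilities(records, a, parent)
              for a in attributes}
    scores = {}
    for label, n in label_counts.items():
        score = n / total
        for a in attributes:
            # Unseen attribute values get probability 0 (no smoothing).
            score *= tables[a][label].get(query[a], 0.0)
        scores[label] = score
    return scores


records = [
    {"J": "high", "G": "night", "I": "weekend"},
    {"J": "high", "G": "night", "I": "weekday"},
    {"J": "low", "G": "day", "I": "weekday"},
    {"J": "low", "G": "night", "I": "weekday"},
]
scores = naive_bayes_scores({"G": "night", "I": "weekend"}, records)
```

Sorting records by such scores is one way to obtain a grade ordering of the abnormal data; a production system would normally add smoothing so that unseen values do not force a score of zero.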
Preferably, the comprehensive judgment module includes:
calculating, in combination with the risk degree weights, the risk degree weight of u_i, the transaction data in which an abnormally large amount of funds is transferred in and then rapidly transferred out in dispersed amounts:
Figure BDA0002803831260000061
wherein T_total is the total base time; p_ui is the dynamically varying component of the weight of the transaction data u_i; V_total is the total reference date; u is the time component of transaction data detection; and k is the date component of transaction data detection;
calculating the risk degree weight of v_i, the transaction data in which abnormally large amounts of dispersed funds are transferred in and then rapidly transferred out in a concentrated amount:
Figure BDA0002803831260000062
wherein
Figure BDA0002803831260000063
is the dynamically varying component of the weight of the transaction data v_i;
calculating the risk degree weight of the abnormal time-point transaction data x_i:
Figure BDA0002803831260000064
wherein h_xi is the dynamically varying component of the abnormal time-point transaction data x_i;
calculating the risk degree weight of the abnormal same-amount transaction data y_i:
Figure BDA0002803831260000065
wherein
Figure BDA0002803831260000066
is the dynamically varying component of the abnormal same-amount transaction data y_i;
calculating the risk degree weight of the abnormal excess transaction data z_i:
Figure BDA0002803831260000067
and defining the comprehensive risk judgment model:
Figure BDA0002803831260000071
wherein
Figure BDA0002803831260000072
is the predicted value of the transaction data in which an abnormally large amount of funds is transferred in and then rapidly transferred out in dispersed amounts;
Figure BDA0002803831260000073
is the judgment threshold of the transaction data in which an abnormally large amount of funds is transferred in and then rapidly transferred out in dispersed amounts;
Figure BDA0002803831260000074
is the predicted value of the transaction data in which abnormally large amounts of dispersed funds are transferred in and then rapidly transferred out in a concentrated amount;
Figure BDA0002803831260000075
is the judgment threshold of the transaction data in which abnormally large amounts of dispersed funds are transferred in and then rapidly transferred out in a concentrated amount;
Figure BDA0002803831260000076
is the predicted value of the abnormal time-point transaction data;
Figure BDA0002803831260000077
is the judgment threshold of the abnormal time-point transaction data;
Figure BDA0002803831260000078
is the predicted value of the abnormal same-amount transaction data;
Figure BDA0002803831260000079
is the judgment threshold of the abnormal same-amount transaction data;
Figure BDA00028038312600000710
is the predicted value of the abnormal excess transaction data;
Figure BDA00028038312600000711
is the judgment threshold of the abnormal excess transaction data; and ε is the judgment correction coefficient.
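The model formula itself is available only as an image, so the following is a sketch of one plausible shape rather than the patented model: for each of the five categories, the amount by which the predicted value exceeds its judgment threshold is multiplied by that category's risk degree weight, and the sum is scaled by the correction coefficient ε. The function name, the dictionary keys and the sample numbers are all assumptions made for this illustration.

```python
CATEGORIES = ("u", "v", "x", "y", "z")


def comprehensive_risk(predicted, thresholds, weights, epsilon=1.0):
    """Weighted sum of threshold exceedances, scaled by epsilon.

    Assumed form only: a category contributes nothing while its
    predicted value stays at or below its judgment threshold.
    """
    risk = 0.0
    for category in CATEGORIES:
        excess = max(0.0, predicted[category] - thresholds[category])
        risk += weights[category] * excess
    return epsilon * risk


predicted = {"u": 1.5, "v": 0.5, "x": 3.0, "y": 0.0, "z": 1.0}
thresholds = {"u": 1.0, "v": 1.0, "x": 2.0, "y": 1.0, "z": 1.0}
weights = {"u": 2.0, "v": 1.0, "x": 0.5, "y": 1.0, "z": 1.0}
risk = comprehensive_risk(predicted, thresholds, weights, epsilon=2.0)
```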
In summary, due to the adoption of the technical scheme, the invention has the beneficial effects that:
In schools, government offices and large chain enterprises, the total transaction amount and the number of transactions are hard to count; transaction behavior that carries transaction risk cannot be identified quickly and accurately through traditional computer accumulation or statistical principles; and even when some extraction algorithms are used, the control and checking of abnormal financial data remain inaccurate because of the sheer total amount and number of financial transactions. In particular, the calculations in existing data screening processes contain many unreasonable convergence thresholds and judgment conditions. The invention extracts relatively accurate abnormal transaction behavior through a neural network learning algorithm and performs screening evaluation and risk control on the abnormal behavior through the big data platform, which improves working efficiency and, in the extraction and management of massive financial data, improves data security, predictive capability and transaction stability.
Additional aspects and advantages of the invention will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the invention.
Drawings
The above and/or additional aspects and advantages of the present invention will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings of which:
Fig. 1 is a general schematic diagram of the present invention.
Detailed Description
Reference will now be made in detail to embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below with reference to the accompanying drawings are illustrative only for the purpose of explaining the present invention, and are not to be construed as limiting the present invention.
As shown in FIG. 1, the invention discloses a big data platform financial data management control system whose operation comprises the following steps:
S1, acquiring financial data through the cloud database, logging in to the abnormal data in the financial data, starting a preliminary query, and querying, checking and extracting invalid data in real time during the query;
S2, after the real-time query, check and extraction, setting a judgment interval for the abnormal data, and forming standardized data within the judgment interval;
S3, analyzing the degree of deviation of the standardized data, screening the abnormal data through the screening model after the analysis, and scoring the features of the screened abnormal data;
and S4, after the feature scoring, judging and outputting the risk degree of the abnormal data in the financial data through the comprehensive risk judgment model.
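The flow of steps S1 to S4 can be pictured as a chain of four functions. The skeleton below is a deliberately simplified sketch under stated assumptions: each stage is replaced by a trivial stand-in (filtering flagged records, min-max scaling, a cutoff screen and a two-level risk label), and the record fields `id`, `amount`, `abnormal` and `valid` are hypothetical.

```python
def s1_query_extract(records):
    # Stand-in for S1: keep valid records that are flagged abnormal.
    return [r for r in records if r["valid"] and r["abnormal"]]


def s2_standardize(records):
    # Stand-in for S2: scale amounts into [0, 1] (assumed scaling).
    amounts = [r["amount"] for r in records]
    if not amounts:
        return []
    smallest, largest = min(amounts), max(amounts)
    span = (largest - smallest) or 1.0
    return [dict(r, scaled=(r["amount"] - smallest) / span) for r in records]


def s3_screen_score(records, cutoff=0.5):
    # Stand-in for S3: screen by cutoff and use the scaled value as score.
    return [dict(r, score=r["scaled"]) for r in records if r["scaled"] >= cutoff]


def s4_judge(records, high=0.9):
    # Stand-in for S4: map each score to a coarse risk degree.
    return {r["id"]: ("high" if r["score"] >= high else "medium")
            for r in records}


def run_pipeline(records):
    return s4_judge(s3_screen_score(s2_standardize(s1_query_extract(records))))


sample = [
    {"id": "a", "amount": 100, "abnormal": True, "valid": True},
    {"id": "b", "amount": 600, "abnormal": True, "valid": True},
    {"id": "c", "amount": 1100, "abnormal": True, "valid": True},
    {"id": "d", "amount": 50, "abnormal": False, "valid": True},
    {"id": "e", "amount": 9999, "abnormal": True, "valid": False},
]
result = run_pipeline(sample)
```

Keeping each step as a separate function mirrors the module boundaries of the system, so any stand-in can be replaced without touching the others.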
The S1 includes:
S1-1, retrieving financial data from the cloud database and obtaining abnormal data from it; during the preliminary query, the abnormal data extraction process balances the data load by requesting financial data from the cloud database dynamically; an abnormal data acquisition threshold is set through dynamic configuration; and different abnormal data are extracted for the login operation according to the security control mechanisms and authority management requirements of the different financial data;
S1-2, during the preliminary query, the cloud database stores the financial data authentication information and function access authority information in a local database, so that financial data authentication and function authority control are performed in a unified manner; the abnormal data of the financial data is logically isolated and stored in an independent database; during financial data login, the identity of the user is verified, the set of abnormal data that the user is entitled to access is constructed from the abnormal data access authority information in the financial data, and authenticated access is performed through the identity authentication process of the cloud database; if the access fails, abnormal data access failure information is returned; if the access succeeds, the login succeeds; and an independent channel is established between the user and the application server instance dynamically allocated by the system;
S1-3, the process of accessing and using the abnormal data is as follows: abnormal data relation nodes are formed from the plurality of abnormal data; the PaaS platform resources are searched and the relation nodes are converted into tree nodes to generate an abnormal data tree node list; an empty abnormal data node set is taken as the current node set, and the current abnormal data tree node set is traversed to judge whether the abnormal data parent resource information list of the node set currently being traversed is equal to the preset abnormal data root node information list; if it is equal, the node set currently being traversed is the root node of the current abnormal data authority tree; if it is not equal, traversal continues to the resource whose abnormal data identifier is equal to the parent resource information list of the node currently being traversed, and that resource is marked as the abnormal data parent node of the node currently being traversed;
S1-4, judging, according to whether an abnormal data node is equal to the parent resource information list of the node currently being traversed, whether traversal of the current tree node list is complete; if the traversal is complete, detecting the abnormal data parent node information list; if the traversal is not complete, taking the current abnormal data parent node information list as the root node of the current tree node and recursing to construct the abnormal data service query tree; and redistributing the plurality of abnormal data query requests concentrated on one abnormal data node to computing nodes and backup computing nodes, so that each computing node and each backup computing node is assigned only one sub-query.
Querying the abnormal data through a node tree gives a preliminary judgment of the abnormal data; extracting the abnormal data further requires deeper mining of the data.
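The construction of the query tree from resources and their parent references can be illustrated with a short recursive sketch. It assumes that each resource is given as an (identifier, parent identifier) pair, that root resources have the parent `None`, and that the parent references contain no cycles; the function names and sample identifiers are hypothetical.

```python
def build_query_tree(resources, root_parent=None):
    """Build nested tree nodes from (resource_id, parent_id) pairs."""
    children = {}
    for resource_id, parent_id in resources:
        children.setdefault(parent_id, []).append(resource_id)

    def build(node_id):
        # Recurse into the resources whose parent is this node.
        return {"id": node_id,
                "children": [build(child) for child in children.get(node_id, [])]}

    return [build(root) for root in children.get(root_parent, [])]


def depth_first_ids(nodes):
    """List node identifiers in depth-first (pre-order) traversal order."""
    ordered = []
    for node in nodes:
        ordered.append(node["id"])
        ordered.extend(depth_first_ids(node["children"]))
    return ordered


resources = [("root", None), ("u", "root"), ("v", "root"), ("u1", "u")]
tree = build_query_tree(resources)
order = depth_first_ids(tree)
```

Grouping the resources by parent first means each resource is visited once during construction, instead of rescanning the whole list for every node.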
The S2 includes:
S2-1, after the query and check, dividing the abnormal data into judgment intervals, the judgment intervals being generated by calculating the similarity of the abnormal data, and standardizing the abnormal data through proportional scaling, wherein: the converted value of u_i, the transaction data in which an abnormally large amount of funds is transferred in and then rapidly transferred out in dispersed amounts, is u'_i; the converted value of v_i, the transaction data in which abnormally large amounts of dispersed funds are transferred in and then rapidly transferred out in a concentrated amount, is v'_i; the converted value of the abnormal time-point transaction data x_i is x'_i; the converted value of the abnormal same-amount transaction data y_i is y'_i; and the converted value of the abnormal excess transaction data z_i is z'_i;
substituting the converted abnormal transaction data and the time and date variables into the judgment model, and calculating the judgment value of the abnormal data for any statistical time and date:
Figure BDA0002803831260000101
wherein B(t, d) is the judgment value of the abnormal data at any time t and any date d; f(u'_i; t, d) is the time-and-date judgment value of the transaction data in which an abnormally large amount of funds is transferred in and then rapidly transferred out in dispersed amounts; f(v'_i; t, d) is the time-and-date judgment value of the transaction data in which abnormally large amounts of dispersed funds are transferred in and then rapidly transferred out in a concentrated amount; f(x'_i; t, d) is the time-and-date judgment value of the abnormal time-point transaction data; f(y'_i; t, d) is the time-and-date judgment value of the abnormal same-amount transaction data; f(z'_i; t, d) is the time-and-date judgment value of the abnormal excess transaction data; and the maximum value of i is 60, so that the abnormal data for each second within one minute is monitored and judged in real time;
S2-2, calculating, for each abnormal data item, the difference between its actual value and its judgment value at the given time and date, and fitting a linear curve to the discrete abnormal data by means of the residual sum of squares, thereby judging the risk trend of the abnormal data:
Figure BDA0002803831260000102
wherein W is the residual sum of squares of the abnormal data; B_0(t, d) is the actual value of each abnormal data item at the given time and date; B(t, d) is the judgment value of each abnormal data item at the given time and date; and M is the upper limit of the statistics, namely the maximum statistical time or the maximum number of days;
S2-3, calculating the degree of deviation of the abnormal data
Figure BDA0002803831260000103
wherein F is a calculation constant that is adjusted through the adjustment coefficient λ, F becoming larger as W increases; H_j is the accurately acquired value of the j-th abnormal data item; after the accurately acquired values of all N abnormal data items have been accumulated, deviation convergence is performed on the characteristic value e; and β is the characteristic threshold.
The S3 includes:
S3-1, after the degree of deviation of the abnormal data has been analyzed, calculating statistical information within the abnormal data through the prior probability distribution; calculating the prior conditional probability distribution of the abnormal data and setting the internal attributes of a first abnormal data set C and a second abnormal data set E, wherein the first abnormal data set comprises u_i and v_i and the second abnormal data set comprises x_i, y_i and z_i; defining a time class attribute G and a date class attribute I of the abnormal data; and calculating the respective conditional probabilities under the probability distribution
Figure BDA0002803831260000111
And
Figure BDA0002803831260000112
and calculating to obtain:
Figure BDA0002803831260000113
the derivation is continued to obtain,
Figure BDA0002803831260000114
wherein
Figure BDA0002803831260000115
Representing a first set of anomalous data
Figure BDA0002803831260000116
Traversing the first abnormal data set by combining the probability distribution with the time class attribute G and the date class attribute I
Figure BDA0002803831260000117
And all values of the time class attribute G get its conditional probability distribution
Figure BDA0002803831260000118
And a first set of exception data
Figure BDA0002803831260000119
Obtaining the conditional probability distribution of all the values of the date type attribute I
Figure BDA00028038312600001110
Time class attribute conditional probability Q (G), date class attribute conditional probability Q (I);
then, calculating:
Figure BDA00028038312600001111
the derivation is continued to obtain,
Figure BDA00028038312600001112
wherein
Figure BDA00028038312600001113
Representing a second set of anomalous data
Figure BDA00028038312600001114
Traversing the second abnormal data set according to the joint probability distribution of the time class attribute G and the date class attribute I
Figure BDA00028038312600001115
And all values of the time class attribute G get its conditional probability distribution
Figure BDA00028038312600001116
And a second set of exception data
Figure BDA00028038312600001117
Obtaining the conditional probability distribution of all the values of the date type attribute I
Figure BDA00028038312600001118
S3-2, the joint probability distribution value of each abnormal data node in the first abnormal data set C and the condition information of the time attribute and the date attribute of each abnormal data node in the second abnormal data set E is as follows;
Figure BDA00028038312600001119
selecting class attributes J of the abnormal data and putting the class attributes J into a big data platform; constructing a naive Bayesian network by taking class attributes J as parent nodes of internal attributes in the first abnormal data set C and the second abnormal data set E;
s3-3, putting the nodes in the first abnormal data set C and the second abnormal data set E into the Bayesian network one by one; if in the first abnormal data set C
Figure BDA0002803831260000121
Then will be
Figure BDA0002803831260000122
Putting the network as a parent node; if in the second abnormal data set E
Figure BDA0002803831260000123
Then will be
Figure BDA0002803831260000124
Putting the network as a parent node; thereby obtaining a Bayesian network for abnormal data grade screening and sorting;
S3-4, calculating the probability mass function of the class attribute J
Figure BDA0002803831260000125
Obtaining the most prominent probability distribution of attribute values in the abnormal data;
Figure BDA0002803831260000126
wherein
Figure BDA0002803831260000127
Represents the product of the conditional probabilities of all nodes of the first abnormal data set C and the second abnormal data set E that are associated with J; the big data platform assigns values to the attribute nodes of the first abnormal data set C and the second abnormal data set E in the Bayesian network according to the probability distribution of the financial abnormal data; the basic attributes of the abnormal data are substituted into the Bayesian network in turn and evaluated with the probability mass function; and the abnormal data are arranged in descending order of the calculated values.
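The scoring and sorting of step S3-4 can be sketched as follows. The sketch assumes the usual naive Bayes form, in which the score of a record for a class value of J is the prior P(J) multiplied by the product of the conditional probabilities of its attribute values given J. The probability tables, attribute values and function names are hypothetical; in the described system they would be supplied by the big data platform.

```python
from math import prod
from typing import Dict, List, Tuple

# Hypothetical prior P(J) for the class attribute J.
prior = {"high": 0.2, "low": 0.8}

# Hypothetical conditional tables P(attribute value | J) for the time and date attributes.
cond = {
    "time": {"night": {"high": 0.7, "low": 0.1}, "day": {"high": 0.3, "low": 0.9}},
    "date": {"holiday": {"high": 0.6, "low": 0.2}, "workday": {"high": 0.4, "low": 0.8}},
}


def score(record: Dict[str, str], j: str) -> float:
    """Unnormalised naive Bayes score: P(J=j) times the product of P(attr=value | J=j)."""
    return prior[j] * prod(cond[attr][value][j] for attr, value in record.items())


def rank(records: List[Dict[str, str]], j: str = "high") -> List[Tuple[float, Dict[str, str]]]:
    """Return (score, record) pairs sorted from the largest score to the smallest."""
    return sorted(((score(r, j), r) for r in records), key=lambda pair: pair[0], reverse=True)


records = [
    {"time": "day", "date": "workday"},
    {"time": "night", "date": "holiday"},
    {"time": "night", "date": "workday"},
]
ranked = rank(records)
for value, record in ranked:
    print(round(value, 3), record)
```

The scores are left unnormalised because only their order matters for the descending sort.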
The S4 includes:
S4-1, in combination with the risk degree weight calculation, the transaction data u_i, in which an abnormally large amount of funds is transferred in and then rapidly dispersed and transferred out, is calculated:
Figure BDA0002803831260000128
wherein T_total is the total reference time; p_{u_i} is the dynamically varying component of the weight of the transaction data in which an abnormally large amount of funds is transferred in and then rapidly dispersed and transferred out; V_total is the total reference date; U is the time component of transaction data detection; and k is the date component of transaction data detection;
S4-2, the risk degree weight is calculated for the transaction data v_i, in which abnormally large amounts of dispersed funds are transferred in and then rapidly transferred out in a concentrated manner,
Figure BDA0002803831260000131
wherein
Figure BDA0002803831260000132
is the dynamically varying component of the weight of the transaction data v_i, in which abnormally large amounts of dispersed funds are transferred in and then rapidly transferred out in a concentrated manner;
S4-3, the risk degree weight is calculated for the abnormal-time-point transaction data x_i,
Figure BDA0002803831260000133
wherein
Figure BDA0002803831260000134
is the dynamically varying component of the abnormal-time-point transaction data x_i;
S4-4, the risk degree weight is calculated for the abnormal same-amount transaction data y_i,
Figure BDA0002803831260000135
wherein
Figure BDA0002803831260000136
is the dynamically varying component of the abnormal same-amount transaction data y_i;
S4-5, the risk degree weight is calculated for the abnormal over-limit transaction data z_i,
Figure BDA0002803831260000137
Defining the comprehensive risk judgment model:
Figure BDA0002803831260000138
wherein
Figure BDA0002803831260000139
is the predicted value of the transaction data in which an abnormally large amount of funds is transferred in and then rapidly dispersed and transferred out;
Figure BDA00028038312600001310
is the judgment threshold of the transaction data in which an abnormally large amount of funds is transferred in and then rapidly dispersed and transferred out;
Figure BDA00028038312600001311
is the predicted value of the transaction data in which abnormally large amounts of dispersed funds are transferred in and then rapidly transferred out in a concentrated manner;
Figure BDA00028038312600001312
is the judgment threshold of the transaction data in which abnormally large amounts of dispersed funds are transferred in and then rapidly transferred out in a concentrated manner;
Figure BDA00028038312600001313
is the predicted value of the abnormal-time-point transaction data;
Figure BDA00028038312600001314
is the judgment threshold of the abnormal-time-point transaction data;
Figure BDA0002803831260000141
is the predicted value of the abnormal same-amount transaction data;
Figure BDA0002803831260000142
is the judgment threshold of the abnormal same-amount transaction data;
Figure BDA0002803831260000143
is the predicted value of the abnormal over-limit transaction data;
Figure BDA0002803831260000144
is the judgment threshold of the abnormal over-limit transaction data; and ε is the judgment correction coefficient.
While embodiments of the invention have been shown and described, it will be understood by those of ordinary skill in the art that: various changes, modifications, substitutions and alterations can be made to the embodiments without departing from the principles and spirit of the invention, the scope of which is defined by the claims and their equivalents.

Claims (10)

1. A big data platform financial data management control system, comprising:
the query extraction module is used for acquiring financial data through the cloud database, logging abnormal data in the financial data, then starting preliminary query, and performing real-time query, check and extraction on invalid data in the query process;
the abnormal judgment module is used for setting a judgment interval of abnormal data after real-time query, check and extraction, and forming standardized data in the judgment interval;
the screening and scoring module is used for analyzing the deviation degree of the standardized data, screening abnormal data through the screening model after analysis, and scoring the characteristics of the screened abnormal data,
and the comprehensive judgment module is used for judging and outputting the risk degree of the abnormal data in the financial data through the comprehensive risk judgment model after the characteristic scoring.
2. The big data platform financial data management control system of claim 1 wherein the query extraction module comprises:
the financial data is called from the cloud database, the abnormal data is obtained from the financial data, the abnormal data extraction process dynamically requests the financial data of the cloud database by data balance in a preliminary query process, a dynamic configuration mode is adopted, the abnormal data obtaining threshold value is set, different abnormal data are extracted according to the safety control mechanism and the authority management requirements of different financial data to carry out login operation,
in the preliminary query process, the cloud database stores financial data authentication and function access authority information in a local database, and unified financial data authentication and function authority control are performed; logically isolating the abnormal data of the financial data, and storing the abnormal data in an independent database; verifying the identity of a user in the financial data login process, constructing an abnormal data set which the user has the right to access according to the abnormal data access authority information in the financial data, and performing authentication access through the identity authentication process of a cloud database; if the access fails, returning abnormal data access failure information; if the access is successful, the login is successful; and establishing a channel which is independent of the application server instance dynamically distributed by the system.
3. The big data platform financial data management control system of claim 1 wherein the query extraction module comprises:
the access and use process of the abnormal data comprises the steps of forming abnormal data relation nodes according to a plurality of abnormal data, searching PaaS platform resources to convert the abnormal data relation nodes into tree nodes, generating an abnormal data tree node list, taking an empty abnormal data node set as a current node set, performing traversal operation on the current abnormal data tree node set so as to judge whether the abnormal data father resource information list of the node set of the current traversal operation is equal to the preset abnormal data root node information list or not, if the abnormal data father resource information list is equal to the preset abnormal data root node information list, and if the current traversal operation node set is not equal to the preset abnormal data root node information list, continuing to traverse the resources of which the abnormal data identification is equal to the parent resource information list of the current traversal operation node, and marking the resources as the abnormal data parent nodes of the current traversal operation node.
4. The big data platform financial data management control system of claim 1 wherein the query extraction module comprises:
judging whether the current tree node list is traversed or not according to whether the abnormal data node is equal to the father resource information list of the currently traversed node or not; if the traversal is finished, detecting an abnormal data father node information list, if the traversal is not finished, taking the current abnormal data father node information list as a root node of a current tree node, and marking a recursion to construct an abnormal data service query tree; and redistributing a plurality of abnormal data query requests distributed on a certain node of abnormal data to a certain computing node of the abnormal data and backing up, so that each of the certain computing node and the backup computing node is only distributed with one sub-query.
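The conversion of flat abnormal data relation nodes into a query tree, as described in claims 3 and 4, can be sketched as follows. The sketch assumes that each resource carries an identifier and a reference to its parent resource, and that a parent of `None` stands for the preset root; the record layout, identifiers and function names are hypothetical and are not taken from the claims.

```python
from typing import Dict, List, Optional

ROOT_ID: Optional[str] = None  # assumption: a parent of None marks the preset root


def build_query_tree(resources: List[Dict]) -> List[Dict]:
    """Link flat {"id", "parent"} records into a tree and return the root nodes."""
    # Convert every relation node into a tree node with an empty child list.
    nodes = {r["id"]: {"id": r["id"], "parent": r["parent"], "children": []}
             for r in resources}
    roots = []
    for node in nodes.values():
        if node["parent"] == ROOT_ID:
            # The parent reference equals the preset root: this is a root node.
            roots.append(node)
        elif node["parent"] in nodes:
            # Find the resource whose identifier equals the parent reference
            # and attach the current node to it.
            nodes[node["parent"]]["children"].append(node)
        else:
            raise KeyError(f"unknown parent {node['parent']!r} for node {node['id']!r}")
    return roots


def depth(node: Dict) -> int:
    """Recursively compute the height of the subtree rooted at node."""
    return 1 + max((depth(child) for child in node["children"]), default=0)


flat = [
    {"id": "all", "parent": None},
    {"id": "large_transfers", "parent": "all"},
    {"id": "odd_hours", "parent": "all"},
    {"id": "dispersed_out", "parent": "large_transfers"},
]
tree = build_query_tree(flat)
print([child["id"] for child in tree[0]["children"]], depth(tree[0]))
```

Indexing the nodes by identifier first avoids rescanning the whole list for every parent lookup; the redistribution of sub-queries to computing nodes and their backups is outside the scope of this sketch.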
5. The big data platform financial data management control system of claim 1 wherein the anomaly determination module comprises:
after query and check, dividing the abnormal data into judgment intervals, calculating the similarity of the abnormal data to generate the judgment intervals, and standardizing the abnormal data through proportional scaling, wherein the conversion value of the transaction data u_i, in which an abnormally large amount of funds is transferred in and then rapidly dispersed and transferred out, is u'_i; the conversion value of the transaction data v_i, in which abnormally large amounts of dispersed funds are transferred in and then rapidly transferred out in a concentrated manner, is v'_i; the conversion value of the abnormal-time-point transaction data x_i is x'_i; the conversion value of the abnormal same-amount transaction data y_i is y'_i; and the conversion value of the abnormal over-limit transaction data z_i is z'_i;
Substituting the converted transaction abnormal data and time and date variables into a judgment model, and calculating a judgment value of the abnormal data within any statistical time and date:
Figure FDA0002803831250000031
B(t, d) is the judgment value of the abnormal data at any time t and any date d; F(u'_i; t, d) is the time-and-date judgment value of the transaction data in which an abnormally large amount of funds is transferred in and then rapidly dispersed and transferred out; F(v'_i; t, d) is the time-and-date judgment value of the transaction data in which abnormally large amounts of dispersed funds are transferred in and then rapidly transferred out in a concentrated manner; F(x'_i; t, d) is the time-and-date judgment value of the abnormal-time-point transaction data; F(y'_i; t, d) is the time-and-date judgment value of the abnormal same-amount transaction data; F(z'_i; t, d) is the judgment value of the abnormal over-limit transaction data; the maximum value of i is 60, so that the abnormal data of each second within one minute can be monitored and judged in real time.
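The proportional scaling that produces the conversion values in claim 5 is not spelled out in the text. A minimal sketch, assuming it is a min-max rescaling of each transaction series to the interval [0, 1], is given below; the function name and the sample amounts are hypothetical.

```python
from typing import List, Sequence


def scale_proportionally(values: Sequence[float]) -> List[float]:
    """Min-max rescaling to [0, 1]; an assumed reading of 'proportional scaling'."""
    lowest, highest = min(values), max(values)
    if highest == lowest:
        # A constant series carries no spread, so map every value to 0.0.
        return [0.0 for _ in values]
    return [(v - lowest) / (highest - lowest) for v in values]


# Hypothetical amounts u_i and their conversion values u'_i.
u = [200.0, 500.0, 800.0]
u_scaled = scale_proportionally(u)
print(u_scaled)
```

Rescaling every series to the same interval lets the five categories of transaction data be combined in one judgment model without any single category dominating because of its units.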
6. The big data platform financial data management control system of claim 1 wherein the anomaly determination module comprises:
calculating the difference value between the actual value and the judgment value of each abnormal data on time and date, performing linear curve fitting process on the discrete abnormal data through the residual sum of squares, thereby judging the risk trend of the abnormal data,
Figure FDA0002803831250000032
wherein W is the residual sum of squares of the abnormal data; B_0(t, d) is the actual value of each abnormal data item at time t and date d; B(t, d) is the judgment value of each abnormal data item at time t and date d; and M is the maximum statistical time or the maximum number of days of the statistical date.
7. The big data platform financial data management control system of claim 1 wherein the anomaly determination module comprises:
then calculating the degree of deviation of the abnormal data
Figure FDA0002803831250000041
wherein F is a calculation constant that is adjusted by the adjustment coefficient λ, because F becomes larger as W increases; H_j is the accurately acquired value of the j-th abnormal data item; after the accurately acquired values of all N abnormal data items have been accumulated, deviation convergence is performed with the characteristic value e; and β is the characteristic threshold.
8. The big data platform financial data management control system of claim 1 wherein the screening and scoring module comprises:
after the deviation degree of the abnormal data is analyzed, statistical information is calculated within the abnormal data through the prior probability distribution; the prior conditional probability distribution of the abnormal data is calculated, and the internal attributes of a first abnormal data set C and a second abnormal data set E are set, wherein the first abnormal data set comprises u_i and v_i and the second abnormal data set comprises x_i, y_i and z_i; by defining the time class attribute G and the date class attribute I of the abnormal data, the conditional probabilities under the probability distribution are calculated respectively:
Figure FDA0002803831250000042
And
Figure FDA0002803831250000043
and calculating to obtain:
Figure FDA0002803831250000044
the derivation is continued to obtain,
Figure FDA0002803831250000045
wherein
Figure FDA0002803831250000046
Representing a first set of anomalous data
Figure FDA0002803831250000047
Traversing the first abnormal data set according to the joint probability distribution of the time class attribute G and the date class attribute I
Figure FDA0002803831250000048
And all values of the time class attribute G get its conditional probability distribution
Figure FDA0002803831250000049
And a first set of exception data
Figure FDA00028038312500000410
Obtaining the conditional probability distribution of all the values of the date type attribute I
Figure FDA00028038312500000411
Time class attribute conditional probability Q (G), date class attribute conditional probability Q (I);
then, calculating:
Figure FDA00028038312500000412
the derivation is continued to obtain,
Figure FDA00028038312500000413
wherein
Figure FDA0002803831250000051
Representing a second set of anomalous data
Figure FDA0002803831250000052
Traversing the second abnormal data set according to the joint probability distribution of the time class attribute G and the date class attribute I
Figure FDA0002803831250000053
And all values of the time class attribute G get its conditional probability distribution
Figure FDA0002803831250000054
And a second set of exception data
Figure FDA0002803831250000055
Obtaining the conditional probability distribution of all the values of the date type attribute I
Figure FDA0002803831250000056
9. The big data platform financial data management control system of claim 1 wherein the screening and scoring module comprises:
the joint probability distribution value of each abnormal data node in the first abnormal data set C and the condition information of the time attribute and the date attribute of each abnormal data node in the second abnormal data set E is as follows;
Figure FDA0002803831250000057
selecting class attributes J of the abnormal data and putting the class attributes J into a big data platform; constructing a naive Bayesian network by taking class attributes J as parent nodes of internal attributes in the first abnormal data set C and the second abnormal data set E;
putting the nodes in the first abnormal data set C and the second abnormal data set E into the Bayesian network one by one; if in the first abnormal data set C
Figure FDA0002803831250000058
Then will be
Figure FDA0002803831250000059
Putting the network as a parent node; if in the second abnormal data set E
Figure FDA00028038312500000510
Then will be
Figure FDA00028038312500000511
Putting the network as a parent node; thereby obtaining the Bayesian network for abnormal data grade screening and sorting.
10. The big data platform financial data management control system of claim 1 wherein the comprehensive judgment module comprises:
in combination with the risk degree weight calculation, the transaction data u_i, in which an abnormally large amount of funds is transferred in and then rapidly dispersed and transferred out, is calculated:
Figure FDA00028038312500000512
wherein T_total is the total reference time;
Figure FDA00028038312500000513
is the dynamically varying component of the weight of the transaction data in which an abnormally large amount of funds is transferred in and then rapidly dispersed and transferred out; V_total is the total reference date; U is the time component of transaction data detection; and k is the date component of transaction data detection;
the risk degree weight is calculated for the transaction data v_i, in which abnormally large amounts of dispersed funds are transferred in and then rapidly transferred out in a concentrated manner,
Figure FDA0002803831250000061
wherein g_{v_i} is the dynamically varying component of the weight of the transaction data v_i, in which abnormally large amounts of dispersed funds are transferred in and then rapidly transferred out in a concentrated manner;
the risk degree weight is calculated for the abnormal-time-point transaction data x_i,
Figure FDA0002803831250000062
wherein
Figure FDA0002803831250000063
is the dynamically varying component of the abnormal-time-point transaction data x_i;
the risk degree weight is calculated for the abnormal same-amount transaction data y_i,
Figure FDA0002803831250000064
wherein
Figure FDA0002803831250000065
is the dynamically varying component of the abnormal same-amount transaction data y_i;
the risk degree weight is calculated for the abnormal over-limit transaction data z_i,
Figure FDA0002803831250000066
defining the comprehensive risk judgment model:
Figure FDA0002803831250000067
wherein
Figure FDA0002803831250000068
is the predicted value of the transaction data in which an abnormally large amount of funds is transferred in and then rapidly dispersed and transferred out;
Figure FDA0002803831250000069
is the judgment threshold of the transaction data in which an abnormally large amount of funds is transferred in and then rapidly dispersed and transferred out;
Figure FDA00028038312500000610
is the predicted value of the transaction data in which abnormally large amounts of dispersed funds are transferred in and then rapidly transferred out in a concentrated manner;
Figure FDA0002803831250000071
is the judgment threshold of the transaction data in which abnormally large amounts of dispersed funds are transferred in and then rapidly transferred out in a concentrated manner;
Figure FDA0002803831250000072
is the predicted value of the abnormal-time-point transaction data;
Figure FDA0002803831250000073
is the judgment threshold of the abnormal-time-point transaction data;
Figure FDA0002803831250000074
is the predicted value of the abnormal same-amount transaction data;
Figure FDA0002803831250000075
is the judgment threshold of the abnormal same-amount transaction data;
Figure FDA0002803831250000076
is the predicted value of the abnormal over-limit transaction data;
Figure FDA0002803831250000077
is the judgment threshold of the abnormal over-limit transaction data; and ε is the judgment correction coefficient.
CN202011360661.7A 2020-11-27 2020-11-27 Financial data management control system of big data platform Active CN112445844B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011360661.7A CN112445844B (en) 2020-11-27 2020-11-27 Financial data management control system of big data platform

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011360661.7A CN112445844B (en) 2020-11-27 2020-11-27 Financial data management control system of big data platform

Publications (2)

Publication Number Publication Date
CN112445844A true CN112445844A (en) 2021-03-05
CN112445844B CN112445844B (en) 2022-04-01

Family

ID=74737870

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011360661.7A Active CN112445844B (en) 2020-11-27 2020-11-27 Financial data management control system of big data platform

Country Status (1)

Country Link
CN (1) CN112445844B (en)

Also Published As

Publication number Publication date
CN112445844B (en) 2022-04-01

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20230811

Address after: 712000 No. 2311, Block B, Huayu Shuangzixing, Chenyangzhai, Fengxi New City, Xixian New District, Xianyang City, Shaanxi Province

Patentee after: Shaanxi Elide Software Technology Co.,Ltd.

Address before: 401331 No. 82 Middle Road, University Town, Shapingba District, Chongqing

Patentee before: CHONGQING MEDICAL AND PHARMACEUTICAL College
