CN113269626A - Financial manipulation behavior identification method and device, electronic equipment and medium - Google Patents
- Publication number
- CN113269626A, CN202110621635.3A
- Authority
- CN
- China
- Prior art keywords
- financial
- manipulation behavior
- sample
- manipulation
- key
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q40/00—Finance; Insurance; Tax strategies; Processing of corporate or income taxes
- G06Q40/12—Accounting
- G06Q40/125—Finance or payroll
Landscapes
- Business, Economics & Management (AREA)
- Accounting & Taxation (AREA)
- Finance (AREA)
- Engineering & Computer Science (AREA)
- Development Economics (AREA)
- Economics (AREA)
- Marketing (AREA)
- Strategic Management (AREA)
- Technology Law (AREA)
- Physics & Mathematics (AREA)
- General Business, Economics & Management (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Financial Or Insurance-Related Operations Such As Payment And Settlement (AREA)
Abstract
The invention provides a financial manipulation behavior identification method, a financial manipulation behavior identification device, electronic equipment and a medium, wherein the financial manipulation behavior identification method comprises the steps of constructing a financial manipulation behavior identification model; acquiring key characteristic data of a company to be identified; and inputting the acquired key characteristic data into the constructed financial manipulation behavior recognition model to recognize whether the to-be-recognized company has financial manipulation. According to the method, the financial manipulation behavior recognition model is constructed, so that whether financial manipulation exists in listed companies or not is automatically recognized, and the recognition efficiency and recognition effect of the financial manipulation are remarkably improved.
Description
Technical Field
The invention relates to the field of big data, in particular to a financial manipulation behavior identification method and device, electronic equipment and a medium.
Background
To support stock prices, meet performance assessments, or raise funds, listed companies often apply various "financial techniques" when preparing their published financial reports; this is known as financial manipulation.
Financial manipulation can be broadly divided into two categories. The first category uses the discretion left within the accounting rules, within the limits permitted by the general accounting system, accounting standards and related laws, to deliberately process accounting data so as to achieve a desired outcome. It operates within the framework of the accounting rules and is therefore a legal accounting practice. The second category is not constrained by the accounting rules: the accounting treatment is carried out outside the rules and typically manifests as a serious violation of the current accounting system, accounting standards, and related laws and regulations. This type of accounting treatment is illegal, and the financial information it produces is untrue, i.e., distorted. The term "financial manipulation" as used herein refers to the latter, specifically, cases where significant company matters are not disclosed in time, other statutory duties are not performed, performance forecasts are inaccurate or untimely, or the disclosed information contains false or seriously misleading statements.
Effectively identifying financial manipulation by listed companies is crucial to both regulators and the broad investing public. Currently, identifying whether a listed company has engaged in financial manipulation requires engaging a professional who has specialized financial knowledge and understands the company's operating conditions, and a conclusion can be reached only after complicated analysis of the company's public financial data and operating data. Therefore, there is a need for a general financial manipulation behavior recognition model that automatically identifies financial manipulation by listed companies, so as to improve the efficiency and effect of identification and reduce its cost.
Disclosure of Invention
In order to achieve the above technical object, a first aspect of the present invention provides a financial manipulation behavior recognition method, which comprises the following detailed technical steps:
a financial manipulation behavior identification method, comprising:
constructing a financial manipulation behavior recognition model;
acquiring key characteristic data of a company to be identified;
and inputting the acquired key characteristic data into the constructed financial manipulation behavior recognition model to recognize whether the to-be-recognized company has financial manipulation behaviors.
In some embodiments, said constructing a financial manipulation behavior recognition model comprises:
determining a candidate feature set, the candidate feature set comprising a number of financial features and a number of non-financial features;
obtaining a sample set, wherein the sample set comprises a positive sample and a negative sample, the positive sample is a company sample with financial manipulation behavior, and the negative sample is a company sample without financial manipulation behavior;
performing a significant difference analysis on each feature in the candidate feature set using the sample set to obtain a number of key features, wherein the key features have significant differences in the positive and negative samples;
and performing logistic regression analysis on the key features to obtain the financial manipulation behavior recognition model.
In some embodiments, said performing a significant difference analysis on features in said candidate set of features using said sample set to obtain a number of key features comprises:
performing significance difference analysis on each feature in the candidate feature set by adopting a single-factor detection method to obtain a first key feature set comprising a first number of candidate features;
performing significance difference analysis on each feature in the candidate feature set by adopting a multivariate logistic regression analysis method to obtain a second key feature set comprising a second number of candidate features;
merging the first and second key feature sets to obtain a third key feature set comprising a third number of the candidate features;
and screening the third key feature set by adopting a factor analysis method to obtain the final key feature.
In some embodiments, the number of positive samples is equal to the number of negative samples, and the samples appear in pairs: each positive sample and its corresponding negative sample belong to the same exchange and the same industry, and the difference between their total market values is within a preset range.
In some embodiments, the single factor detection method is a non-parametric detection method.
In some embodiments, the financial manipulation behavior identification model is a multivariate logistic regression model, represented as follows:
wherein: Y is the probability of financial manipulation, X2 is the current ratio, X5 is the ratio of monetary funds to current assets, X14 is the ratio of cash flow to current liabilities, X18 is the proportion of prepaid accounts in current assets, X19 is the proportion of accounts receivable in current assets, X22 is the shareholding concentration, and X34 is the cash recovery rate of total assets.
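The fitted equation is not legible in this text. As a hedged sketch only, a multivariate logistic regression over the seven features listed above generally takes the following form; the intercept and the coefficients are placeholders, because the fitted values are not recoverable here.

```latex
Y = \frac{1}{1 + e^{-z}}, \qquad
z = \beta_0 + \beta_2 X_2 + \beta_5 X_5 + \beta_{14} X_{14}
  + \beta_{18} X_{18} + \beta_{19} X_{19} + \beta_{22} X_{22} + \beta_{34} X_{34}
```

Under this form, Y lies between 0 and 1, and a larger value of z corresponds to a higher estimated probability of financial manipulation.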
A second aspect of the present invention provides a financial manipulation behavior recognition apparatus, comprising:
the modeling module is used for constructing a financial manipulation behavior recognition model;
the acquisition module is used for acquiring key characteristic data of the company to be identified;
and the identification module is used for inputting the acquired key characteristic data into the constructed financial manipulation behavior identification model so as to identify whether the company to be identified has financial manipulation behaviors.
In some embodiments, the modeling module comprises:
a determining sub-module for determining a candidate feature set, the candidate feature set comprising a plurality of financial features and a plurality of non-financial features;
the acquisition sub-module is used for acquiring a sample set, wherein the sample set comprises a positive sample and a negative sample, the positive sample is a company sample with financial manipulation behavior, and the negative sample is a company sample without financial manipulation behavior;
a key feature selection module, configured to perform a significant difference analysis on each feature in the candidate feature set using the sample set to obtain a number of key features, where the key features have significant differences in the positive sample and the negative sample;
and the model training submodule is used for carrying out logistic regression analysis on the key features to obtain the financial manipulation behavior recognition model.
A third aspect of the present invention provides an electronic device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, wherein the processor implements the financial manipulation behavior recognition method according to the first aspect of the present invention when executing the program.
A fourth aspect of the present invention provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the financial manipulation behavior recognition method of the first aspect of the present invention.
The invention realizes the automatic identification of whether the financial manipulation exists in the company to be identified by constructing the financial manipulation behavior identification model, obviously improves the identification efficiency and the identification effect of the financial manipulation of the listed company, and obviously reduces the identification cost.
Drawings
FIG. 1 is a flow chart of a financial manipulation behavior identification method according to a first embodiment of the present invention;
FIG. 2 is a flow chart of the sub-steps of step S101 of the financial manipulation behavior identification method according to the first embodiment of the present invention;
FIG. 3 is a block diagram of a financial manipulation behavior recognition apparatus according to a second embodiment of the present invention;
FIG. 4 is a block diagram of the modeling module of the financial manipulation behavior recognition apparatus according to the second embodiment of the present invention;
FIG. 5 is a block diagram of an electronic device according to a third embodiment of the present invention.
Detailed Description
In order to make the aforementioned objects, features and advantages of the present invention comprehensible, embodiments accompanied with figures are described in further detail below.
The terms "first," "second," "third," "fourth," and the like in the description and in the claims, as well as in the drawings, if any, are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the invention described herein are, for example, capable of operation in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
Example one
As shown in fig. 1, the financial manipulation behavior recognition method 100 provided by the present embodiment includes the following steps:
S101, constructing a financial manipulation behavior recognition model.
Specifically, as shown in fig. 2, step S101 includes the following sub-steps:
S1011, determining a candidate feature set, wherein the candidate feature set comprises a plurality of financial features and a plurality of non-financial features.
As will be appreciated by those skilled in the art, the traces of financial manipulations must be reflected in the corporate business and financial data, and therefore we choose to determine a candidate feature set from the corporate business and financial data, namely: each candidate feature in the set of candidate features may have an association with a financial manipulation. Of course, the determination of these candidate features is based on research efforts already in the field.
Candidate features belong to two classes: financial indicators and non-financial indicators. The financial indicator system is subdivided into the dimensions of solvency, profitability, asset quality, profit quality, operating capability, cash flow, development potential, degree of related-party transactions and risk level; the non-financial indicator system covers company governance, company operation, operating risk and audit information.
Optionally, in this embodiment, the candidate features and definitions included in the finally determined candidate feature set are shown in table 1:
TABLE 1 candidate features and definitions
S1012, obtaining a sample set, wherein the sample set comprises a positive sample and a negative sample, the positive sample is a company sample with financial manipulation behavior, and the negative sample is a company sample without financial manipulation behavior.
Optionally, we take as positive samples the listed A-share companies that were officially announced by the China Securities Regulatory Commission (CSRC) as having engaged in financial manipulation and were subject to administrative penalties from January 2012 to September 30, 2020. General measures taken against such companies, such as circulated criticism, public censure and reprimand, are not included. This yields 263 positive samples. The financial data, operating data and corporate governance data of the sample companies come from the Wind database.
Next, we screened the samples further by the following procedure:
1. for a company with financial manipulation in two or more consecutive years, the data of the first year in which the main manipulation was found to occur, as determined from the time of the CSRC penalty, is used as the research data;
2. listed companies in finance, insurance and similar industries are excluded, because their operating models and performance evaluation standards differ from those of ordinary companies;
3. companies whose financial data are incomplete, so that the complete indicator system cannot be obtained, are excluded;
4. companies whose trading was suspended on the trading days before and after the CSRC penalty announcement date are excluded.
After the screening, 132 samples with financial manipulation are finally obtained. The reasons for the penalty differ among the samples: some were penalized for a single reason and others for several reasons, and the invention does not distinguish between these cases.
Meanwhile, at a ratio of 1:1, for each financial manipulation sample, a listed company without financial manipulation from the same exchange, the same year and the same industry whose total market value is closest to that of the manipulation sample is selected as the negative sample.
After the selection and screening of the samples are completed, the finally determined sample size is that 132 listed companies with financial manipulation behaviors are used as a positive sample group, 132 listed companies with one-to-one correspondence to the financial manipulation are used as a negative sample group, and the total number of the samples is 264 listed companies. The sample set constructed in the above way can control the influence of factors such as industry, year, scale and the like.
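The 1:1 pairing described above can be sketched as follows. This is a minimal illustration, not the procedure actually used: the `Company` record, the function name, the sample data, and the choice to use each control company at most once are all assumptions introduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Company:
    code: str            # hypothetical identifier
    exchange: str
    year: int
    industry: str
    market_value: float  # total market value, arbitrary units
    manipulated: bool


def match_negative_samples(positives, candidates):
    """Pair each positive sample with the clean company of closest market value
    from the same exchange, year and industry (each control used at most once)."""
    pairs = []
    used = set()
    for pos in positives:
        key = (pos.exchange, pos.year, pos.industry)
        pool = [c for c in candidates
                if not c.manipulated
                and c.code not in used
                and (c.exchange, c.year, c.industry) == key]
        if not pool:
            continue  # no eligible control: this positive sample stays unmatched
        best = min(pool, key=lambda c: abs(c.market_value - pos.market_value))
        used.add(best.code)
        pairs.append((pos, best))
    return pairs


# Made-up example data, for illustration only.
positives = [Company("P1", "SSE", 2015, "electronics", 50.0, True),
             Company("P2", "SZSE", 2017, "mining", 120.0, True)]
candidates = [Company("N1", "SSE", 2015, "electronics", 80.0, False),
              Company("N2", "SSE", 2015, "electronics", 55.0, False),
              Company("N3", "SZSE", 2017, "mining", 100.0, False),
              Company("N4", "SZSE", 2017, "real estate", 119.0, False)]
pairs = match_negative_samples(positives, candidates)
matched_codes = [(p.code, n.code) for p, n in pairs]
```

In this example `N4` is closer in market value to `P2` than `N3` is, but it is not eligible because it belongs to a different industry.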
S1013, performing significant difference analysis on each feature in the candidate feature set by using the sample set to obtain a plurality of key features, wherein the key features have significant differences in the positive samples and the negative samples.
Optionally, the specific implementation steps of step S1013 are as follows:
S10131, performing significance difference analysis on the features in the candidate feature set by adopting a single-factor detection method to obtain a first key feature set comprising a first number of the candidate features.
Before the significant difference analysis is performed on each feature in the candidate feature set, a univariate test on company size is first performed to rule out the impact of company size on the two groups of data. Specifically, a non-parametric test is performed on the positive sample group and the negative sample group, with the total market value and the industry as test variables and the presence or absence of financial manipulation as the grouping variable, where 0 and 1 denote that financial manipulation exists or does not exist, respectively. The purpose of this step is to verify whether company size (total market value) differs between the positive and negative sample groups.
Company size was tested with the Mann-Whitney test. The results show that the distribution of company size is approximately the same in the positive and negative sample groups, which excludes interference from the company size factor. The statistical results are shown in table 2:
TABLE 2 results of the Mann-Whitney test on company size
The results show an asymptotic significance of 0.950, which does not pass the test at the 0.05 significance level. The two groups of data therefore have no significant difference in total market value, that is, the influence of company size on the two groups of data can be ruled out.
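The test above can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions: it uses the normal approximation with a tie correction and no continuity correction, and the function name and the market values are made up here rather than taken from Table 2.

```python
import math


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test via the normal approximation.

    Returns (U statistic of x, asymptotic p-value). Ties receive average
    ranks and the variance is tie-corrected; no continuity correction.
    """
    n1, n2 = len(x), len(y)
    if n1 == 0 or n2 == 0:
        raise ValueError("both groups must be non-empty")
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    n = n1 + n2
    rank_sum_x = 0.0
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of the 1-based ranks i+1 .. j
        t = j - i                     # size of this group of tied values
        tie_term += t ** 3 - t
        rank_sum_x += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    u1 = rank_sum_x - n1 * (n1 + 1) / 2.0
    mean_u = n1 * n2 / 2.0
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u1, 1.0  # all values identical: no evidence of a difference
    z = (u1 - mean_u) / math.sqrt(var_u)
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided tail probability
    return u1, p


# Made-up total market values for two small groups.
positive_values = [10.0, 20.0, 30.0, 40.0, 50.0]
negative_values = [11.0, 19.0, 31.0, 39.0, 51.0]
u, p = mann_whitney_u(positive_values, negative_values)
size_is_comparable = p > 0.05  # fails to reject equal distributions at 0.05
```

The same function can be applied to each candidate feature in turn, keeping the features whose p-value falls below the chosen significance level.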
In this embodiment, a significance difference analysis of each feature in the candidate feature set is implemented, and the specific process is as follows:
A non-parametric test is performed on the positive sample group and the negative sample group, taking the candidate features X1 to X35 in turn as test variables and the presence or absence of manipulation behavior as the grouping variable, where the positive sample group is coded 0 and the negative sample group is coded 1.
The non-parametric tests show, for each of the 35 candidate features, whether it differs significantly between the positive sample group and the negative sample group; descriptive statistics of each candidate feature are shown in table 3:
TABLE 3 descriptive statistics for each candidate feature
As can be seen from the data in table 3, the candidate features X2, X5, X7, X10, X14, X18, X19, X22 and X34 passed the test at the 0.05 significance level, indicating that these candidate features differ significantly between the two groups of data; the significance of each remaining candidate feature is greater than 0.05, indicating that the distributions of these candidate features in the two groups of data are approximately the same. Therefore, after this step is executed, the 9 candidate features X2, X5, X7, X10, X14, X18, X19, X22 and X34 are selected as the first key feature set.
S10132, performing significance difference analysis on the features in the candidate feature set by adopting a multivariate logistic regression analysis method to obtain a second key feature set comprising a second number of the candidate features.
Candidate features with significant differences between the two groups of samples are also searched for by logistic regression. Logistic regression is chosen to study the influence of X on Y. It places no requirement on the data type of X, which may be categorical or quantitative, but Y must be categorical, and the corresponding analysis method is chosen according to the number of categories of Y. Because the independent variables we selected contain both quantitative and categorical data, and the dependent variable is categorical, logistic regression meets our data processing requirements.
Taking Y as the dependent variable and X1 to X35 as covariates, candidate features with significant differences between the two groups of samples are searched for by binary logistic regression. As shown in table 4 below, the variables that finally enter the equation through logistic regression (with forward variable selection) are X7, X14, X18, X19, X22 and X35.
TABLE 4 variables in the logistic regression entry equation
Therefore, we finally select 6 candidate features of X7, X14, X18, X19, X22, and X35 as the second key feature set.
S10133, merging the first key feature set and the second key feature set to obtain a third key feature set including a third number of the candidate features.
The first key feature set obtained by the single-factor detection method (X2, X5, X7, X10, X14, X18, X19, X22, X34) and the second key feature set obtained by the multivariate logistic regression analysis method (X7, X14, X18, X19, X22, X35) are merged to obtain the third key feature set, which comprises 10 features: X2, X5, X7, X10, X14, X18, X19, X22, X34 and X35.
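The merge is a plain set union, which can be checked with a few lines of Python. The feature names come from the text; the variable names and the sorting helper are illustrative.

```python
def feature_index(name):
    """Numeric part of a feature name such as 'X14', used only for sorting."""
    return int(name[1:])


first_key_features = {"X2", "X5", "X7", "X10", "X14", "X18", "X19", "X22", "X34"}
second_key_features = {"X7", "X14", "X18", "X19", "X22", "X35"}

# Union of the two sets, listed in ascending feature order.
third_key_features = sorted(first_key_features | second_key_features,
                            key=feature_index)
```

Only X35 is contributed by the second set alone, which is why the union holds 10 features rather than 15.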
S10134, screening the third key feature set by adopting a factor analysis method to obtain the final key features.
Although each feature in the third key feature set is strongly correlated with financial manipulation, the features may also be strongly correlated with one another, that is, there may be multicollinearity among them. Therefore, to ensure the effectiveness and interpretability of the subsequent recognition model, the third key feature set needs further dimension reduction to remove the multicollinearity among the features. Optionally, in this embodiment, factor analysis is used for the dimension reduction, so that a small number of main factors are finally formed to construct the financial manipulation behavior recognition model.
Factor analysis yields the common dimensions underlying several independent variables, and the rotated factor loading matrix identifies the decisive indicators of each common factor. This reduces the number of independent variables and makes them better suited to logistic regression.
As shown in table 5, we compared three different approaches to selecting the common factors. The first is the most common one: only dimensions with eigenvalues greater than 1 are retained, i.e., the first four common factors are extracted. The common factors extracted in this way cover only 68.263% of the information in the original variables, which is low. The second approach extracts 5 common factors: four have eigenvalues greater than 1, and the fifth has an eigenvalue of 0.835, which is close to 1. This covers 77.543% of the information in the original variables, which indicates that the data can be interpreted well. The third approach extracts 6 common factors: the eigenvalue of the 6th common factor is 0.810, which is also close to 1, and the coverage of the original variables reaches 86.541%. However, the dimensions selected in this way are too redundant, and when financial manipulation behavior recognition is performed on the control group under this approach, the accuracy is only 68%, lower than with the first two approaches, so it is not selected. In summary, we chose the second approach for the factor analysis.
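The eigenvalue-greater-than-1 rule and the cumulative coverage figures discussed above can be sketched as follows. The eigenvalues here are illustrative assumptions: only 0.835 and 0.810 are quoted in the text, the rest are made up, so the printed coverage will not reproduce the percentages of Table 5. The function names are also hypothetical.

```python
def kaiser_count(eigenvalues, cutoff=1.0):
    """Number of factors whose eigenvalue exceeds the cutoff."""
    return sum(1 for value in eigenvalues if value > cutoff)


def cumulative_variance_pct(eigenvalues):
    """Cumulative share of total variance covered by the first k factors, in percent."""
    total = sum(eigenvalues)
    running = 0.0
    coverage = []
    for value in eigenvalues:
        running += value
        coverage.append(100.0 * running / total)
    return coverage


# Illustrative eigenvalues in descending order (NOT the values of Table 5).
eigenvalues = [2.3, 1.8, 1.5, 1.2, 0.835, 0.810, 0.6, 0.45, 0.3, 0.205]

n_kaiser = kaiser_count(eigenvalues)          # factors kept by the first approach
coverage = cumulative_variance_pct(eigenvalues)
coverage_with_5 = coverage[4]                 # coverage when a fifth factor is added
coverage_with_6 = coverage[5]                 # coverage when a sixth factor is added
```

Keeping a factor whose eigenvalue is slightly below 1 trades a little parsimony for more coverage; the text settles this trade-off by also checking recognition accuracy on the control group.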
TABLE 5 Total variance interpretation of factor analysis
As can be seen from the rotated component matrix in table 6, the dimensions of the five common factors are determined by the following variables: X34, X14, X2, X18, X5, X19, X22.
TABLE 6 rotated component matrix
That is, after the dimensionality reduction by factor analysis, the finally determined key features are X2, X5, X14, X18, X19, X22 and X34.
S1014, carrying out logistic regression analysis on the key features to obtain the financial manipulation behavior recognition model.
The data from the sample set described above is again used to perform logistic regression analysis (with forward variable selection) on the key features obtained by the factor analysis.
From table 7 we derive the following regression equation for detecting financial manipulation behavior:
TABLE 7 variables finally entered into the equation
S102, acquiring key characteristic data of the company to be identified.
Namely, the data of the following 7 key features of the company to be identified is obtained from public data: the current ratio X2, the ratio of monetary funds to current assets X5, the ratio of cash flow to current liabilities X14, the proportion of prepaid accounts in current assets X18, the proportion of accounts receivable in current assets X19, the shareholding concentration X22, and the cash recovery rate of total assets X34.
S103, inputting the acquired key feature data into the constructed financial manipulation behavior recognition model to recognize whether the to-be-recognized company has financial manipulation.
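Step S103 can be sketched as scoring a company with a fitted logistic model and comparing the score with a threshold. This is a minimal illustration: the intercept and coefficients passed in are placeholders, since the fitted values are not available in this text, and treating a score exactly equal to the threshold as a positive is an assumption.

```python
import math

# The seven key features named in the text.
KEY_FEATURES = ["X2", "X5", "X14", "X18", "X19", "X22", "X34"]


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def manipulation_probability(features, intercept, coefficients):
    """Probability Y from a logistic model over the key features."""
    z = intercept + sum(coefficients[name] * features[name] for name in KEY_FEATURES)
    return sigmoid(z)


def has_manipulation(probability, threshold=0.5):
    """Flag a company when its probability reaches the threshold (assumed inclusive)."""
    return probability >= threshold


# Placeholder model and made-up feature values, for illustration only.
placeholder_coefficients = {name: 0.0 for name in KEY_FEATURES}
company = {name: 1.0 for name in KEY_FEATURES}
y_high = manipulation_probability(company, 2.0, placeholder_coefficients)
y_low = manipulation_probability(company, -2.0, placeholder_coefficients)
```

With all placeholder coefficients set to zero the score depends only on the intercept, which makes the example easy to check by hand; a real model would supply one fitted coefficient per feature.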
Applying the above model to 3063 listed companies in the A-share market (excluding finance, insurance and similar industries, as well as ST and *ST stocks), a total of 1869 companies were identified as showing signs of financial manipulation, accounting for 61.02%.
As shown in table 8, by industry classification, the mining industry has the highest proportion of companies with financial manipulation, at 77.78% of the industry. Next, in the building materials, transportation equipment, light manufacturing, and catering and tourism industries, listed companies with manipulation behavior account for more than 70% of each industry. The industries with the lowest proportion of manipulation behavior are real estate, ferrous metals, transportation and nonferrous metals, all below 50%. Table 8 below gives the proportion of companies with financial manipulation in each industry:
TABLE 8 proportion of companies with financial manipulations in each industry
It is worth noting, however, that we cannot conclude outright that a company is more likely to engage in financial manipulation simply because it belongs to a certain industry. There are two main reasons. First, the sample of companies being predicted is small and listed companies are unevenly distributed across industries; for example, far more companies engage in mechanical equipment and information services than in the comprehensive and transportation categories. From table 8, it can be seen that the proportion of financial manipulation in both the electronics industry and the comprehensive industry is about 70%. However, the electronics industry sample contains 241 companies while the comprehensive industry sample contains only 10, a difference of about 24 times, so the proportion for the comprehensive industry may be subject to considerable chance and inaccuracy. Second, there is the question of how the threshold for financial manipulation behavior identification is set in this embodiment. Setting the threshold at 50% yields the results in the table above, where a value closer to 1 indicates a greater likelihood of manipulation in the information disclosure. When the threshold is raised, the proportion of manipulation in each industry will necessarily change, which in turn affects the judgment of whether an industry as a whole is more inclined to financial manipulation.
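The threshold effect described above can be illustrated with a short sketch that recomputes the flagged share per industry at two thresholds. The scores are made up for illustration and the function name is hypothetical; nothing here reproduces Table 8.

```python
from collections import defaultdict


def flagged_share_by_industry(scores, threshold):
    """Share of companies per industry whose probability reaches the threshold.

    `scores` is an iterable of (industry, probability) pairs.
    """
    totals = defaultdict(int)
    flagged = defaultdict(int)
    for industry, probability in scores:
        totals[industry] += 1
        if probability >= threshold:
            flagged[industry] += 1
    return {industry: flagged[industry] / totals[industry] for industry in totals}


# Made-up model outputs for two industries of different sample size.
scores = [("electronics", 0.55), ("electronics", 0.72),
          ("electronics", 0.40), ("electronics", 0.91),
          ("comprehensive", 0.52), ("comprehensive", 0.48)]

share_at_50 = flagged_share_by_industry(scores, 0.5)
share_at_70 = flagged_share_by_industry(scores, 0.7)
```

In this toy data, raising the threshold from 0.5 to 0.7 lowers the electronics share from 0.75 to 0.5 and the comprehensive share from 0.5 to 0.0, and the smaller industry swings more because each company carries more weight.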
Example two
The present embodiment provides a financial manipulation behavior recognition apparatus 200, and as shown in fig. 3, the financial manipulation behavior recognition apparatus 200 of the present embodiment includes:
a modeling module 201, configured to construct a financial manipulation behavior recognition model.
Optionally, as shown in fig. 4, the modeling module 201 may further include:
a determining sub-module 2011 configured to determine a candidate feature set, where the candidate feature set includes a plurality of financial features and a plurality of non-financial features;
the obtaining sub-module 2012 is configured to obtain a sample set, where the sample set includes a positive sample and a negative sample, the positive sample is a company sample where the financial manipulation behavior exists, and the negative sample is a company sample where the financial manipulation behavior does not exist;
a key feature selection module 2013, configured to perform a significant difference analysis on each feature in the candidate feature set by using the sample set to obtain several key features, where the key features have significant differences in the positive sample and the negative sample;
and the model training submodule 2014 is used for performing logistic regression analysis on the key features to obtain the financial manipulation behavior recognition model.
an obtaining module 202, configured to obtain key feature data of a company to be identified.
an identification module 203, configured to input the acquired key feature data into the constructed financial manipulation behavior recognition model to identify whether the company to be identified has financial manipulation.
Since the processing procedure of each functional module of the financial manipulation behavior recognition device 200 provided in the present embodiment is consistent with the processing procedure of the financial manipulation behavior recognition method 100 in the first embodiment, it is not described again in the present embodiment; reference may be made to the related description of the first embodiment.
Example three
Fig. 5 is a schematic structural diagram of an electronic device 300 according to an embodiment of the present disclosure, and as shown in fig. 5, the electronic device 300 includes a processor 301 and a memory 303, and the processor 301 and the memory 303 are connected, for example, through a bus 302.
The processor 301 may be a CPU, a general-purpose processor, a DSP, an ASIC, an FPGA or other programmable logic device, a transistor logic device, a hardware component, or any combination thereof. It may implement or perform the various illustrative logical blocks, modules, and circuits described in connection with the disclosure. The processor 301 may also be a combination that implements computing functionality, for example a combination of one or more microprocessors, or a combination of a DSP and a microprocessor.
The memory 303 is used for storing the application program code of the present application, and execution of that code is controlled by the processor 301. The processor 301 is configured to execute the application program code stored in the memory 303 to implement the financial manipulation behavior recognition method according to the first embodiment.
Finally, an embodiment of the present application provides a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and when the computer program is executed by a processor, the method for identifying financial manipulation behavior according to the first embodiment is implemented.
The invention has been described above with a certain degree of particularity. It will be understood by those of ordinary skill in the art that the description of the embodiments is merely exemplary and that all changes that come within the true spirit and scope of the invention are desired to be protected. The scope of the invention is defined by the appended claims rather than by the foregoing description of the embodiments.
Claims (10)
1. A financial manipulation behavior recognition method, comprising:
constructing a financial manipulation behavior recognition model;
acquiring key characteristic data of a company to be identified;
and inputting the acquired key characteristic data into the constructed financial manipulation behavior recognition model to recognize whether the to-be-recognized company has financial manipulation behaviors.
2. The financial manipulation behavior recognition method of claim 1 wherein said constructing a financial manipulation behavior recognition model comprises:
determining a candidate feature set, the candidate feature set comprising a number of financial features and a number of non-financial features;
obtaining a sample set, wherein the sample set comprises a positive sample and a negative sample, the positive sample is a company sample with financial manipulation behavior, and the negative sample is a company sample without financial manipulation behavior;
performing a significant difference analysis on each feature in the candidate feature set using the sample set to obtain a number of key features, wherein the key features have significant differences in the positive and negative samples;
and performing logistic regression analysis on the key features to obtain the financial manipulation behavior recognition model.
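As an illustration of the last step of claim 2, the following Python sketch fits a logistic regression to a few labeled samples by plain gradient descent. It is a minimal sketch and not the applicant's procedure: the optimizer, learning rate, epoch count, and the two-feature toy data are all assumptions, and the source does not say which fitting software or algorithm is used.

```python
import math
from typing import List, Sequence


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(w: Sequence[float], row: Sequence[float]) -> float:
    """Probability of the positive class; w[0] is the intercept."""
    return sigmoid(w[0] + sum(wi * xi for wi, xi in zip(w[1:], row)))


def fit_logistic(X: Sequence[Sequence[float]], y: Sequence[int],
                 lr: float = 0.5, epochs: int = 2000) -> List[float]:
    """Fit weights by batch gradient descent on the mean log-loss."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for row, label in zip(X, y):
            err = predict_proba(w, row) - label
            grad[0] += err
            for j, xj in enumerate(row, start=1):
                grad[j] += err * xj
        w = [wi - lr * g / len(X) for wi, g in zip(w, grad)]
    return w


# Toy key-feature rows (hypothetical values): label 1 = positive sample.
X = [[0.5, 0.8], [0.7, 0.9], [0.6, 0.7], [2.0, 0.2], [1.8, 0.3], [2.2, 0.1]]
y = [1, 1, 1, 0, 0, 0]

weights = fit_logistic(X, y)
p_pos = predict_proba(weights, [0.6, 0.8])
p_neg = predict_proba(weights, [2.0, 0.2])
print(round(p_pos, 3), round(p_neg, 3))
```

In practice a statistics package would also report coefficient standard errors and significance levels, which this sketch omits.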
3. The financial manipulation behavior identification method of claim 2 wherein said using the sample set to perform a significant difference analysis on each feature in the candidate feature set to obtain a number of key features comprises:
performing significance difference analysis on each feature in the candidate feature set by adopting a single-factor detection method to obtain a first key feature set comprising a first number of candidate features;
performing significance difference analysis on each feature in the candidate feature set by adopting a multivariate logistic regression analysis method to obtain a second key feature set comprising a second number of candidate features;
merging the first and second key feature sets to obtain a third key feature set comprising a third number of the candidate features;
and screening the third key feature set by adopting a factor analysis method to obtain the final key feature.
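The merging step of claim 3 can be sketched in Python as follows. The sketch assumes each screening method yields a p-value per candidate feature and that a feature is kept when its p-value is below 0.05; the threshold, the p-values, and the feature labels are hypothetical, and the final factor-analysis screening is not shown.

```python
from typing import Dict, List

ALPHA = 0.05  # assumed significance level; not stated in the source


def significant(p_values: Dict[str, float], alpha: float = ALPHA) -> List[str]:
    """Keep the features whose p-value is below the significance level."""
    return [name for name, p in p_values.items() if p < alpha]


def merge_feature_sets(first: List[str], second: List[str]) -> List[str]:
    """Order-preserving union of the two key feature sets."""
    seen = set()
    merged = []
    for name in first + second:
        if name not in seen:
            seen.add(name)
            merged.append(name)
    return merged


# Hypothetical p-values from the two screening methods.
single_factor_p = {"X2": 0.01, "X5": 0.03, "X7": 0.40, "X14": 0.20}
multivariate_p = {"X2": 0.02, "X5": 0.30, "X7": 0.60, "X14": 0.04}

first_set = significant(single_factor_p)    # first key feature set
second_set = significant(multivariate_p)    # second key feature set
third_set = merge_feature_sets(first_set, second_set)
print(third_set)
```

Taking the union rather than the intersection keeps any feature that either method flags, leaving the later screening step to remove redundant ones.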
4. The financial manipulation behavior recognition method of claim 2, wherein the positive samples and the negative samples are equal in number and occur in pairs, and each paired positive sample and negative sample belong to the same exchange and the same industry and differ in total market value by an amount within a predetermined range.
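The pairing rule of claim 4 can be illustrated with a small Python sketch. It assumes a greedy nearest-market-value match and expresses the predetermined range as a 20% relative difference; both choices, and the `Company` record and sample data, are hypothetical and not taken from the application.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple


@dataclass(frozen=True)
class Company:
    name: str
    exchange: str
    industry: str
    market_value: float
    manipulated: bool


def match_pairs(positives: Sequence[Company], candidates: Sequence[Company],
                max_rel_diff: float = 0.2) -> List[Tuple[Company, Company]]:
    """Pair each positive sample with the closest unused negative sample
    from the same exchange and industry, within the allowed value gap."""
    pairs: List[Tuple[Company, Company]] = []
    used = set()
    for pos in positives:
        best: Optional[Company] = None
        best_diff = 0.0
        for neg in candidates:
            if neg.manipulated or neg.name in used:
                continue
            if (neg.exchange, neg.industry) != (pos.exchange, pos.industry):
                continue
            diff = abs(neg.market_value - pos.market_value) / pos.market_value
            if diff <= max_rel_diff and (best is None or diff < best_diff):
                best, best_diff = neg, diff
        if best is not None:
            used.add(best.name)
            pairs.append((pos, best))
    return pairs


positives = [Company("P1", "SSE", "steel", 100.0, True)]
candidates = [
    Company("N1", "SSE", "steel", 150.0, False),   # value gap too large
    Company("N2", "SSE", "steel", 110.0, False),   # acceptable match
    Company("N3", "SZSE", "steel", 100.0, False),  # different exchange
]
pairs = match_pairs(positives, candidates)
print([(p.name, n.name) for p, n in pairs])
```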
5. The financial manipulation behavior recognition method of claim 3, wherein the single-factor detection method is a non-parametric test.
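Claim 5 does not name a specific non-parametric test. As one possible example, the Python sketch below computes the Mann-Whitney U statistic for a single feature across positive and negative samples, with a two-sided p-value from the normal approximation. The choice of test is an assumption, the approximation omits tie and continuity corrections, and the feature values are hypothetical.

```python
import math
from typing import Sequence, Tuple


def mann_whitney_u(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Return (U statistic for x, two-sided p-value by normal approximation)."""
    n1, n2 = len(x), len(y)
    u = 0.0
    for a in x:
        for b in y:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5  # ties count as half
    mean = n1 * n2 / 2.0
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (u - mean) / sd
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return u, p


# Hypothetical values of one candidate feature in each group.
positive_values = [0.60, 0.70, 0.50, 0.80, 0.65]
negative_values = [1.90, 2.10, 1.70, 2.40, 2.00]

u_stat, p_value = mann_whitney_u(positive_values, negative_values)
is_key_feature = p_value < 0.05  # assumed significance level
print(u_stat, round(p_value, 4), is_key_feature)
```

A rank-based test of this kind does not require the feature to be normally distributed, which suits skewed financial ratios; for small samples an exact p-value would be preferable to the approximation used here.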
6. The financial manipulation behavior recognition method of claim 1 wherein the financial manipulation behavior recognition model is a multivariate logistic regression model represented as follows:
wherein: $y$ is the probability of financial manipulation, $X_{2}$ is the current ratio, $X_{5}$ is the ratio of monetary funds to current assets, $X_{14}$ is the ratio of cash flow to current liabilities, $X_{18}$ is prepayments as a proportion of current assets, $X_{19}$ is accounts receivable as a proportion of current assets, $X_{22}$ is the equity concentration, and $X_{34}$ is the cash recovery rate of total assets.
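The formula referred to in claim 6 is not reproduced in this text. For orientation only, the generic form of a multivariate logistic regression over the seven listed features is shown below; the coefficients $\beta_i$ are placeholders introduced here for illustration, and their fitted values are not given in this text.

```latex
y = \frac{1}{1 + e^{-z}}, \qquad
z = \beta_{0} + \beta_{2} X_{2} + \beta_{5} X_{5} + \beta_{14} X_{14}
  + \beta_{18} X_{18} + \beta_{19} X_{19} + \beta_{22} X_{22} + \beta_{34} X_{34}
```

Under this form, $y$ always lies between 0 and 1, and a positive coefficient means that a larger value of the corresponding feature raises the estimated probability of financial manipulation.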
7. A financial manipulation behavior recognition apparatus, comprising:
the modeling module is used for constructing a financial manipulation behavior recognition model;
the acquisition module is used for acquiring key characteristic data of the company to be identified;
and the identification module is used for inputting the acquired key characteristic data into the constructed financial manipulation behavior identification model so as to identify whether the company to be identified has financial manipulation behaviors.
8. The financial manipulation behavior recognition device of claim 7 wherein the modeling module comprises:
a determining sub-module for determining a candidate feature set, the candidate feature set comprising a plurality of financial features and a plurality of non-financial features;
the acquisition sub-module is used for acquiring a sample set, wherein the sample set comprises a positive sample and a negative sample, the positive sample is a company sample with financial manipulation behavior, and the negative sample is a company sample without financial manipulation behavior;
a key feature selection module, configured to perform a significant difference analysis on each feature in the candidate feature set using the sample set to obtain a number of key features, where the key features have significant differences in the positive sample and the negative sample;
and the model training submodule is used for carrying out logistic regression analysis on the key features to obtain the financial manipulation behavior recognition model.
9. An electronic device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, wherein the processor implements the financial manipulation behavior recognition method of any one of claims 1 to 6 when executing the program.
10. A computer-readable storage medium, characterized in that the computer-readable storage medium has stored thereon a computer program which, when executed by a processor, implements the financial manipulation behavior recognition method of any one of claims 1 to 6.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110621635.3A CN113269626A (en) | 2021-06-03 | 2021-06-03 | Financial manipulation behavior identification method and device, electronic equipment and medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110621635.3A CN113269626A (en) | 2021-06-03 | 2021-06-03 | Financial manipulation behavior identification method and device, electronic equipment and medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN113269626A true CN113269626A (en) | 2021-08-17 |
Family
ID=77234179
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110621635.3A Pending CN113269626A (en) | 2021-06-03 | 2021-06-03 | Financial manipulation behavior identification method and device, electronic equipment and medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113269626A (en) |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20120030081A1 (en) * | 2010-07-29 | 2012-02-02 | Bank Of America Corporation | Physiological response of a customer during financial activity |
CN105678451A (en) * | 2016-01-04 | 2016-06-15 | 宁宇新 | Method and device for automatically identifying financial fraud on the basis of financial data |
CN105809348A (en) * | 2016-03-10 | 2016-07-27 | 辽宁工程技术大学 | Enterprise performance analysis method |
CN109376995A (en) * | 2018-09-18 | 2019-02-22 | 平安科技(深圳)有限公司 | Financial data methods of marking, device, computer equipment and storage medium |
CN111612603A (en) * | 2020-04-17 | 2020-09-01 | 北京智信度科技有限公司 | Suspected financial counterfeiting behavior insights and discrimination system of listed company |
CN111783829A (en) * | 2020-05-29 | 2020-10-16 | 广发证券股份有限公司 | Financial anomaly detection method and device based on multi-label learning |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Situngkir et al. | Detecting fraudulent financial reporting using fraud score model and fraud pentagon theory: Empirical study of companies listed in the LQ 45 Index | |
Antunes et al. | Firm default probabilities revisited | |
Spathis | Audit qualification, firm litigation, and financial information: an empirical analysis in Greece | |
Alden et al. | Detection of financial statement fraud using evolutionary algorithms | |
WO2012018968A1 (en) | Method and system for quantifying and rating default risk of business enterprises | |
Svabova et al. | Prediction model of firms financial distress | |
Matos et al. | An empirical method for discovering tax fraudsters: A real case study of brazilian fiscal evasion | |
Zarei et al. | Predicting auditors' opinions using financial ratios and non-financial metrics: evidence from Iran | |
CN111144697A (en) | Data processing method, data processing device, storage medium and electronic equipment | |
CN114386856A (en) | Method, device and equipment for identifying empty-shell enterprise and computer storage medium | |
Feruleva et al. | Detecting financial statements fraud: the evidence from Russia | |
Gupta | An empirical analysis of default risk for listed companies in India: A comparison of two prediction models | |
Fieberg et al. | Machine learning in accounting research | |
CN110766547A (en) | Method, device, equipment and storage medium for determining credibility grade | |
US10783578B1 (en) | Computerized systems and methods for detecting, risk scoring and automatically assigning follow-on action codes to resolve violations of representation and warranties found in loan servicing contracts, loan purchase and sale contracts, and loan financing contracts | |
US11715120B1 (en) | Predictive machine learning models | |
Islam et al. | Application of artificial intelligence (artificial neural network) to assess credit risk: a predictive model for credit card scoring | |
CN113269626A (en) | Financial manipulation behavior identification method and device, electronic equipment and medium | |
CN115809930A (en) | Anti-fraud analysis method, device, equipment and medium based on data fusion matching | |
Sylwestrzak | Application of the Beneish Model on the Warsaw Stock Exchange | |
Yip | Business failure prediction: a case-based reasoning approach | |
CN112967127A (en) | Suspicious loan checking method, system, computer equipment and storage medium | |
Lee et al. | An Integral Predictive Model of Financial Distress | |
Hargreaves | Machine learning application to identify good credit customers | |
Shen et al. | Modelling the predictive performance of credit scoring |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||