WO2019109523A1 - Business data review method, apparatus, device, and computer-readable storage medium - Google Patents


Publication number
WO2019109523A1
WO2019109523A1 PCT/CN2018/075658 CN2018075658W WO2019109523A1 WO 2019109523 A1 WO2019109523 A1 WO 2019109523A1 CN 2018075658 W CN2018075658 W CN 2018075658W WO 2019109523 A1 WO2019109523 A1 WO 2019109523A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
enterprise
business
simulated
confidence interval
Prior art date
Application number
PCT/CN2018/075658
Other languages
English (en)
French (fr)
Inventor
李天平
Original Assignee
深圳壹账通智能科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳壹账通智能科技有限公司
Publication of WO2019109523A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00: Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/03: Credit; Loans; Processing thereof

Definitions

  • the present application relates to the field of financial credit, and in particular to a business data review method, apparatus, device, and computer-readable storage medium.
  • enterprise credit risk assessment analyzes the business status information of a borrowing enterprise to estimate the likelihood that the loan becomes overdue (a bad debt) and to determine whether the loan is fraudulent.
  • existing enterprise credit risk assessment methods take the company's financial statements as the basis for evaluation. Professionals analyze and audit the data to determine the business status of the company and assess the credit risk.
  • this traditional evaluation method relies mainly on manual analysis of the company's business data and other data, which takes considerable time when the amount of data is large. The analysis results are also affected by the professional's subjective limitations, which can make the evaluation inaccurate or even leave abnormal business data and fraudulent behavior unidentified, resulting in bad debts and financial losses.
  • the main purpose of the present application is to provide a business data review method, apparatus, device, and computer-readable storage medium that improve the ability to identify abnormal business data during enterprise credit evaluation and reduce the bad debt rate of enterprise loans.
  • the present application provides a business data review method, and the business data review includes the following steps:
  • the step of constructing a simulated standard moving average in a preset coordinate system according to the enterprise sample data includes:
  • simulated weighted points are drawn in a preset coordinate system according to the multi-dimensional sample group and the simulated weighting amount, and the simulated standard moving average is obtained by fitting the simulated weighted points.
  • the multi-dimensional sample set comprises a multi-dimensional sample gene of dimension m
  • the step of performing analysis learning and weighting calculation on the multi-dimensional sample group based on the genetic algorithm to obtain the corresponding simulated weighting amount includes:
  • $h_\theta(x)$ is the simulated weighting amount corresponding to the multi-dimensional sample group
  • $x_1, x_2, \ldots, x_m$ are the sample genes, and $\theta_0, \theta_1, \theta_2, \ldots, \theta_m$ are the weighting coefficients
  • $\theta^T$ is the coefficient matrix corresponding to the weighting coefficients
  • $y^{(i)}$ is the label value of the multi-dimensional sample group
  • the iterative calculation is performed based on the gradient descent formula and the squared loss function, the coefficient matrix $\theta^T$ is determined, and the simulated weighting amount corresponding to the multi-dimensional sample group is calculated according to the coefficient matrix $\theta^T$ and the simulated matrix equation.
  • the gradient descent formula includes
  • the step of fitting the corresponding enterprise business line in the preset coordinate system according to the enterprise business data includes:
  • the step of comparing the enterprise operation line and the simulated confidence interval, and determining whether the enterprise operation data is abnormal according to the relationship between the enterprise operation line and the simulated confidence interval includes:
  • if the accounting point is located outside the simulated confidence interval, it is determined that the enterprise business data corresponding to the accounting point is abnormal.
  • the step of acquiring the enterprise business data of the borrowing enterprise according to the loan request, and fitting the corresponding enterprise business line in the preset coordinate system according to the enterprise business data, includes:
  • the corresponding enterprise business line is fitted in the preset coordinate system according to the enterprise business data.
  • the step of comparing the enterprise operation line with the simulated confidence interval, and determining whether the enterprise operation data is abnormal according to the relationship between the enterprise operation line and the simulated confidence interval further includes:
  • a corresponding data audit report is generated according to the enterprise business data, the enterprise operation line, and the simulated confidence interval, and the data audit report is displayed.
  • the present application further provides a business data review apparatus, which includes:
  • an interval obtaining module configured to acquire enterprise sample data, construct a simulated standard moving average in a preset coordinate system according to the enterprise sample data, and obtain a simulated confidence interval based on the simulated standard moving average;
  • an operation line fitting module configured to acquire, when a loan request is received, the enterprise business data of the borrowing enterprise according to the loan request, and fit the corresponding enterprise business line in the preset coordinate system according to the enterprise business data;
  • a data judging module configured to compare the enterprise business line with the simulated confidence interval, and determine whether the enterprise business data is abnormal according to the relationship between the enterprise business line and the simulated confidence interval.
  • the present application further provides a business data review device, which includes a processor, a memory, and a business data review program stored on the memory and executable by the processor.
  • the present application further provides a computer-readable storage medium storing a business data review program, wherein the following steps are implemented when the business data review program is executed by a processor:
  • FIG. 1 is a schematic structural diagram of hardware of a business data auditing device involved in an embodiment of the present application
  • FIG. 2 is a schematic flow chart of a first embodiment of a business data review method according to the present application
  • FIG. 3 is a schematic diagram of a simulated standard moving average involved in the first embodiment of the business data review method of the present application
  • FIG. 4 is a detailed flowchart of the step of acquiring, when a loan request is received, the enterprise business data of the borrowing enterprise according to the loan request and fitting the corresponding enterprise business line in the preset coordinate system according to the enterprise business data
  • FIG. 5 is a schematic flowchart of a second embodiment of a method for reviewing operational data of the present application
  • FIG. 6 is a schematic diagram of functional modules of the first embodiment of the business data review apparatus of the present application.
  • FIG. 1 is a schematic structural diagram of hardware of a business data review device involved in an embodiment of the present application.
  • the business data review device in the embodiment of the present application may include a processor 1001 (for example, a CPU), a communication bus 1002, a user interface 1003, a network interface 1004, and a memory 1005.
  • the communication bus 1002 is configured to implement connection communication between the components;
  • the user interface 1003 may include a display, an input unit such as a keyboard; and the network interface 1004 may optionally include a standard wired interface and a wireless interface.
  • the memory 1005 may be a high-speed RAM memory, or may be a non-volatile memory, such as a disk storage, and the memory 1005 may alternatively be a storage device independent of the processor 1001 described above.
  • the memory 1005 as a computer readable storage medium in FIG. 1 may include an operating system, a network communication module, and a business data review program.
  • the network communication module is mainly used to connect to a database and exchange data with it; the processor 1001 can call the business data review program stored in the memory 1005 and execute the business data review method provided by each embodiment of the present application.
  • the application provides a method for auditing business data.
  • FIG. 2 is a schematic flowchart of a first embodiment of a business data review method according to the present application.
  • the business data auditing method includes the following steps:
  • Step S10 Obtain enterprise sample data, construct a simulated standard moving average in a preset coordinate system according to the enterprise sample data, and obtain an analog confidence interval based on the simulated standard moving average;
  • a method for auditing business data is proposed.
  • a moving average of business operations and a simulated confidence interval are fitted, and the business data of the borrowing enterprise is tested against the moving average and the simulated confidence interval to determine whether the data is authentic and to identify fraudulent behavior by the borrowing enterprise. This reduces the adverse impact of the limitations of human judgment on enterprise lending and improves the accuracy of risk assessment, thereby reducing the bad debt rate of credit.
  • a linear regression can be used to fit a business moving average, which can be called a simulated standard moving average. For most enterprises, when the business data generated during operation is fitted by the same method, the resulting fitting points should fluctuate around this moving average. Conversely, business data that deviates significantly from the simulated standard moving average is considered abnormal: either the business condition of the company does not meet the loan terms, or the data may have been forged.
  • AI risk control uses machine learning to construct the simulated standard moving average.
  • machine learning here means not relying on humans to summarize experience and input logic. Humans only need to supply a large amount of business data to the computer, and the computer itself summarizes the relationships in the data and derives the corresponding logic, yielding a data conversion rule that forms the corresponding fitting formula.
  • a genetic algorithm is an idea based on survival of the fittest rather than a specific mathematical model.
  • factors that cause the business expectation to deviate significantly from the mean are understood by the computer as genes that may lead to poor prediction expectations; the related companies are labeled as fraudulent and the gene is recorded. On the other hand, genes that produce positive results are retained (for example, annual revenue growth in the range of 5.0% to 15.0%).
  • genes that occur in a high proportion are recorded as the basis for further prediction. Here a "gene" is data generated in the course of business, including but not limited to revenue, accounts receivable, working capital, net profit after deductions, investment, and depreciation. These genes are used to fit a simulated standard moving average of business operations; the normal operations of the vast majority of companies will fluctuate around this moving average.
  • the Y-axis of the moving average is a projection of a high-dimensional vector onto two dimensions. Depending on the industry, the operating status of an enterprise is quantized into multi-dimensional business vectors, which are then accumulated into a moving average (a distribution model independent of the time axis). Under normal circumstances, the business information of most enterprises should lie close to the moving average (within the simulated confidence interval), while outliers (outside the simulated confidence interval) are marked as abnormal.
  • the enterprise sample data is obtained first.
  • these enterprise data include revenue, accounts receivable, working capital, net profit after deductions, investment, depreciation, and so on (and may include other contents). The data are often continuous, and their statistical periods differ: for example, revenue is calculated on a daily basis, while liquidity is calculated on a monthly basis.
  • the enterprise sample data therefore needs to be preprocessed onto a unified time basis (which may be called a quantization standard).
  • the enterprise sample data is quantized into a plurality of multi-dimensional sample groups. For example, taking one month as the unit, the multi-dimensional sample genes in the multi-dimensional sample groups include the January multi-dimensional sample genes, the February multi-dimensional sample genes, and so on.
  • the multi-dimensional sample groups are learned based on a genetic algorithm: the relationship between the genes in each multi-dimensional sample group is analyzed, each gene is then weighted according to a certain weighting relationship, and the simulated weighting amount corresponding to each multi-dimensional sample group is obtained.
  • once the simulated weighting amounts are obtained, the corresponding simulated standard moving average can be fitted in the coordinate system, where the x-axis is time and the y-axis is the simulated weighting amount. The corresponding simulated weighted points are drawn in the preset coordinate system according to the multi-dimensional sample groups and the simulated weighting amounts, and the simulated standard moving average is obtained by fitting the simulated weighted points.
  • it is worth noting that although the x-axis is time, this does not mean that the simulated weighting amount varies as a function of time. Time is only the quantization standard of the enterprise sample data; it affects how the simulated standard moving average is presented, not the fitting relationship itself.
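  • the quantization step described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `quantize_monthly`, the choice of only two genes (revenue and liquidity), and summing daily revenue into a monthly total are all assumptions made for the example.

```python
from collections import defaultdict
from datetime import date


def quantize_monthly(daily_revenue, monthly_liquidity):
    """Bring records with different statistical periods onto a monthly basis.

    Returns {(year, month): [revenue, liquidity]}, one sample group per month.
    Hypothetical helper: the source does not specify how periods are unified.
    """
    revenue_by_month = defaultdict(float)
    for day, amount in daily_revenue.items():
        # Daily records are summed into the month they belong to.
        revenue_by_month[(day.year, day.month)] += amount

    groups = {}
    for month in sorted(revenue_by_month):
        # Keep only months for which every gene is available.
        if month in monthly_liquidity:
            groups[month] = [revenue_by_month[month], monthly_liquidity[month]]
    return groups


daily = {date(2018, 1, 5): 100.0, date(2018, 1, 20): 50.0, date(2018, 2, 3): 80.0}
liquidity = {(2018, 1): 500.0, (2018, 2): 450.0}
sample_groups = quantize_monthly(daily, liquidity)
```

  • each value in `sample_groups` plays the role of one month's multi-dimensional sample genes; a real system would carry more genes per month.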
  • for example, the multi-dimensional sample groups cover January to August.
  • these multi-dimensional sample groups include genes of m dimensions (that is, the dimension of the multi-dimensional sample group is m).
  • the multi-dimensional sample genes of a given month can be expressed as $x_1, x_2, \ldots, x_m$.
  • $h_\theta(x)$ is the simulated weighting amount corresponding to the multi-dimensional sample group (it can be used as the simulated weighting amount for cold start)
  • $\theta_0, \theta_1, \theta_2, \ldots, \theta_m$ are the weighting coefficients.
  • the simulated weighting equation (Equation 1) can be transformed into a corresponding matrix form (Equation 2, which can be called the simulated matrix equation).
  • $\theta^T$ is the coefficient matrix corresponding to the weighting coefficients.
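  • the equations themselves did not survive in this text. Given the symbol definitions above and the statement that a linear regression is used, a plausible reconstruction of Equations 1 and 2 is the standard linear form below; the augmented vector $X$ with a leading 1 is an assumption introduced so that $\theta_0$ fits into the matrix product.

```latex
% Equation 1 (assumed form): simulated weighting equation
h_\theta(x) = \theta_0 + \theta_1 x_1 + \theta_2 x_2 + \cdots + \theta_m x_m

% Equation 2 (assumed form): simulated matrix equation
h_\theta(x) = \theta^T X, \qquad
\theta = (\theta_0, \theta_1, \ldots, \theta_m)^T, \quad
X = (1, x_1, \ldots, x_m)^T
```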
  • it can be seen from Equation 2 that calculating the simulated weighting amount requires determining the coefficient matrix $\theta^T$.
  • to do so, the squared loss function (Equation 3) can be constructed first.
  • $y^{(i)}$ is the label value of the multi-dimensional sample group.
  • the squared loss function (Equation 3) estimates the degree of inconsistency between the predicted value (the simulated weighting amount) and the true value, so Equation 3 can be used to judge whether the coefficient matrix $\theta^T$ is accurate.
  • the method of gradient descent can then be used to iterate (Equations 4 and 5).
  • the remaining parameter in the gradient descent formula is the iteration step size, which can also be understood as the learning rate.
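  • Equations 3 to 5 are also missing from this text. Under the same linear-regression assumption, they would take the usual least-squares form sketched below. The symbol $\alpha$ for the step size, the count $n$ of sample groups, the convention $x_0^{(i)} = 1$, and the split of the gradient descent formula into Equations 4 and 5 are assumptions, not taken from the source.

```latex
% Equation 3 (assumed form): squared loss over n sample groups
J(\theta) = \frac{1}{2} \sum_{i=1}^{n} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2

% Equation 4 (assumed form): gradient with respect to each coefficient, with x_0^{(i)} = 1
\frac{\partial J(\theta)}{\partial \theta_j}
  = \sum_{i=1}^{n} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)}

% Equation 5 (assumed form): gradient descent update with step size \alpha
\theta_j := \theta_j - \alpha \, \frac{\partial J(\theta)}{\partial \theta_j},
\qquad j = 0, 1, \ldots, m
```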
  • iterative calculation is performed based on Equations 3, 4, and 5. After several rounds of iteration, when the distance between the coefficient matrices $\theta^T$ of two consecutive iterations is less than a predetermined value (for example, 0.000001), the algorithm can be considered to have converged.
  • the coefficient matrix $\theta^T$ of the later round is then taken as the final iteration result and used to calculate the simulated weighting amount.
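  • the iteration and stopping rule described above can be illustrated with a short batch gradient descent sketch. It assumes the standard linear-regression loss and update; the function names, the default step size `alpha`, the iteration cap, and the toy data are illustrative choices, and only the 0.000001 convergence threshold comes from the text.

```python
def predict(theta, genes):
    """Simulated weighting amount: theta_0 + sum_j theta_j * x_j."""
    return theta[0] + sum(t * x for t, x in zip(theta[1:], genes))


def squared_loss(theta, samples, labels):
    """Half the sum of squared differences between predictions and labels."""
    return 0.5 * sum((predict(theta, x) - y) ** 2 for x, y in zip(samples, labels))


def fit_theta(samples, labels, alpha=0.01, tol=1e-6, max_iter=100_000):
    """Batch gradient descent; stops when consecutive thetas differ by < tol."""
    m = len(samples[0])
    theta = [0.0] * (m + 1)
    for _ in range(max_iter):
        errors = [predict(theta, x) - y for x, y in zip(samples, labels)]
        # Gradient of the squared loss; the first entry is for theta_0 (x_0 = 1).
        grad = [sum(errors)] + [
            sum(e * x[j] for e, x in zip(errors, samples)) for j in range(m)
        ]
        new_theta = [t - alpha * g for t, g in zip(theta, grad)]
        # Euclidean distance between the coefficients of two consecutive rounds.
        distance = sum((a - b) ** 2 for a, b in zip(new_theta, theta)) ** 0.5
        theta = new_theta
        if distance < tol:
            break
    return theta


# Toy data (hypothetical): one gene per sample group, labels follow y = 1 + 2x.
samples = [[1.0], [2.0], [3.0], [4.0]]
labels = [3.0, 5.0, 7.0, 9.0]
theta = fit_theta(samples, labels)
```

  • on real business data the genes would normally be rescaled first, since a fixed step size can diverge when features differ widely in magnitude.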
  • when the simulated weighting amount is obtained, the simulated weighted points may be drawn in the preset coordinate system according to the multi-dimensional sample groups and the simulated weighting amounts, and the simulated standard moving average is obtained by fitting the simulated weighted points, as shown in FIG. 3.
  • the x-axis of the coordinate system is time (months)
  • the y-axis is the simulated weighting amount. It is worth noting that the simulated standard moving average in FIG. 3 is drawn as a broken line. This is because the data is sampled periodically, and as the sampling window rolls forward, seasonal and economic-cycle factors inevitably take effect.
  • the simulated standard moving average can then be shifted up and down, respectively, to obtain the simulated confidence interval; the shift distance can be set and adjusted according to actual conditions.
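  • shifting the moving average up and down by a fixed distance can be sketched as below. The function name `confidence_band` and the sample values are hypothetical; the source only states that the shift distance is configurable.

```python
def confidence_band(moving_average, distance):
    """Return one (lower, upper) pair per period by shifting the moving
    average down and up by the same configurable distance."""
    return [(value - distance, value + distance) for value in moving_average]


# Hypothetical simulated weighting amounts for three consecutive months.
band = confidence_band([10.0, 12.0, 11.0], distance=2.0)
```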
  • Step S20 When receiving the loan request, acquire the enterprise operation data corresponding to the borrowing enterprise according to the loan request, and fit the corresponding business operation line in the preset coordinate system according to the enterprise business data;
  • when the simulated confidence interval has been obtained, business data can be reviewed against it.
  • upon receiving a loan request from an enterprise, the enterprise business data of the borrowing enterprise is obtained and then quantized in the same manner as in step S10 to obtain the corresponding multi-dimensional business data group;
  • using the coefficient matrix $\theta^T$ calculated in step S10, the data genes in the multi-dimensional business data group are weighted to obtain the corresponding business weighting amount, which represents the business situation of the enterprise in a given period.
  • when the business weighting amount is obtained, the corresponding enterprise business line can be fitted in the coordinate system.
  • the x-axis of the preset coordinate system is time
  • the y-axis is the weighting amount
  • the corresponding business weighted points are drawn in the preset coordinate system according to the multi-dimensional business data group and the business weighting amount, and the enterprise business line is obtained by fitting the business weighted points.
  • the enterprise business line can be fitted as a broken line or as a curve. It can be fitted in the same coordinate system as the simulated standard moving average, so that the enterprise business line and the simulated standard moving average (and simulated confidence interval) are displayed together, which facilitates the subsequent comparison.
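  • computing the business weighting amount for each period with the coefficients learned in step S10 might look like the following sketch. The coefficient values, month labels, and the function name `business_weighting` are made up for illustration, and the linear weighting form is the same assumption used for the sample data.

```python
def business_weighting(theta, data_genes):
    """Weight one period's data genes with the coefficients from step S10."""
    return theta[0] + sum(t * x for t, x in zip(theta[1:], data_genes))


theta = [1.0, 2.0, 0.5]  # hypothetical coefficients (theta_0, theta_1, theta_2)
monthly_groups = {"Jan": [3.0, 4.0], "Feb": [2.0, 6.0]}  # hypothetical data genes

# One (period, weighting amount) point per month; joining them gives the line.
business_line = [
    (month, business_weighting(theta, genes))
    for month, genes in monthly_groups.items()
]
```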
  • Step S30 comparing the enterprise operation line with the simulated confidence interval, and determining whether the business data of the enterprise is abnormal according to the relationship between the enterprise operation line and the simulated confidence interval.
  • the enterprise business line can be compared with the simulated confidence interval, and whether the enterprise business data is abnormal is determined from their positional relationship. If all the points on the business line are within the simulated confidence interval, or if the deviation between the business line and the simulated confidence interval is within the preset allowable range, the business data corresponding to the business line may be considered normal.
  • in that case the borrowing enterprise is in a normal operating state. If not all points on the business line are within the simulated confidence interval, and the deviation between the business line and the simulated confidence interval exceeds the preset allowable range,
  • the business data corresponding to the business line may be considered abnormal, and the borrowing enterprise is in an abnormal operating state.
  • the comparison between the enterprise business line and the simulated confidence interval may also be implemented by selecting several accounting points on the business line according to a preset accounting period and determining, for each, whether it lies outside the simulated confidence interval; if an accounting point lies outside the simulated confidence interval, the business data corresponding to that accounting point may be considered abnormal. For example, when fitting an enterprise business line, the multi-dimensional business data group includes 8 monthly groups from January to August, and the business line is fitted from these 8 monthly groups and the corresponding business weighting amounts.
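  • both comparison rules can be sketched as follows. The source does not define how the "deviation" between the line and the interval is measured, so taking the largest distance by which any point falls outside the band is an assumption, as are all function names and sample values.

```python
def abnormal_points(business_line, band):
    """Indices of accounting points that fall outside the confidence interval."""
    return [
        i
        for i, (value, (low, high)) in enumerate(zip(business_line, band))
        if not low <= value <= high
    ]


def max_deviation(business_line, band):
    """Largest distance by which any point lies outside its interval (0 if none)."""
    return max(
        max(low - value, value - high, 0.0)
        for value, (low, high) in zip(business_line, band)
    )


def is_abnormal(business_line, band, allowed_deviation=0.0):
    """Abnormal only if some point is outside and the deviation exceeds the limit."""
    return max_deviation(business_line, band) > allowed_deviation


band = [(8.0, 12.0), (10.0, 14.0), (9.0, 13.0)]  # hypothetical interval per month
line = [9.0, 15.5, 11.0]                          # hypothetical business weighting amounts
flagged = abnormal_points(line, band)
```

  • here the second month is flagged; whether the whole data set counts as abnormal then depends on the chosen `allowed_deviation`.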
  • when a loan request is received, the enterprise business data of the borrowing enterprise is acquired according to the loan request, and the corresponding enterprise business line is fitted in the preset coordinate system according to the enterprise business data; the enterprise business line is then compared with the simulated confidence interval, and whether the enterprise business data is abnormal is determined according to their relationship.
  • the present embodiment learns large-scale enterprise data by machine learning, fits a moving average and a simulated confidence interval, and then reviews the business data of the borrowing enterprise against the moving average and the simulated confidence interval.
  • this makes it possible to determine whether the business data of the borrowing enterprise is authentic and to identify fraudulent behavior by the borrowing enterprise, reducing the adverse impact of the limitations of human judgment on enterprise lending and improving the accuracy of risk assessment, thereby reducing the bad debt rate of enterprise credit.
  • FIG. 4 is a detailed flowchart of the step of acquiring, when a loan request is received, the enterprise business data of the borrowing enterprise according to the loan request and fitting the corresponding enterprise business line in the preset coordinate system according to the enterprise business data.
  • step S20 includes:
  • Step S21 When receiving the loan request, generate a corresponding data acquisition request according to the license included in the loan request;
  • when receiving a loan request from the borrowing enterprise, the business data review device needs to obtain the business data of the borrowing enterprise in order to analyze its operating state. These business data are generated in the daily business activities of the enterprise and are usually collected by the borrowing enterprise itself and recorded in its own data management system. Because they reflect the business status of the enterprise, they involve its business secrets. Therefore, obtaining this data requires a license from the borrowing enterprise.
  • when applying for a loan, the borrowing enterprise adds the relevant authorization information (the license) to the loan request to indicate that the business data review device is authorized to access the data management system and obtain the corresponding enterprise business data; upon receiving the loan request, the business data review device extracts the license and generates a corresponding data acquisition request based on it.
  • to prevent the license from being used by an unauthorized third party, the borrowing enterprise may agree on a transmission protocol with the business data review device in advance and encrypt the license according to the agreed protocol.
  • once the encryption is complete, the license is added to the loan request for transmission, which improves the security of information transmission;
  • the business data review device first decrypts the loan request according to the transmission protocol and obtains the license.
  • a corresponding data acquisition request is then generated according to the license.
  • Step S22 the data acquisition request is sent to the data management system of the borrowing enterprise to obtain enterprise business data of the borrowing enterprise;
  • when the data acquisition request is generated, it may be sent to the data management system of the borrowing enterprise to obtain the corresponding enterprise business data.
  • the data management system first verifies the license included in the data acquisition request, determining its authenticity and confirming the data acquisition authority of the business data review device (that is, the range of enterprise business data it may obtain); when the verification passes, the corresponding enterprise business data is retrieved according to the content of the data acquisition request and returned to the business data review device.
  • the returned data may be encrypted in the same way as in step S21; details are not repeated here.
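  • the license check on the data management system side can be illustrated with the sketch below. The source describes encrypting the license under a pre-agreed transmission protocol without giving details; this example substitutes an HMAC signature over the license contents, which covers only the authenticity and scope checks and does not encrypt anything. The shared key, field names, and function names are all hypothetical.

```python
import hashlib
import hmac
import json

SHARED_KEY = b"hypothetical-pre-agreed-key"  # assumption: agreed in advance


def issue_license(enterprise_id, scope, key=SHARED_KEY):
    """Borrowing enterprise side: sign the license it attaches to the loan request."""
    payload = json.dumps({"enterprise": enterprise_id, "scope": sorted(scope)},
                         sort_keys=True)
    tag = hmac.new(key, payload.encode(), hashlib.sha256).hexdigest()
    return {"payload": payload, "tag": tag}


def handle_request(license_, requested_fields, store, key=SHARED_KEY):
    """Data management system side: verify authenticity, then the licensed scope."""
    expected = hmac.new(key, license_["payload"].encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, license_["tag"]):
        raise PermissionError("license failed verification")
    scope = set(json.loads(license_["payload"])["scope"])
    if not set(requested_fields) <= scope:
        raise PermissionError("request exceeds licensed scope")
    return {field: store[field] for field in requested_fields}


store = {"revenue": [100, 120], "depreciation": [5, 5]}  # hypothetical records
lic = issue_license("acme", ["revenue"])
data = handle_request(lic, ["revenue"], store)
```

  • a request for `depreciation` with this license, or a license whose tag has been altered, raises `PermissionError` instead of returning data.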
  • Step S23 when receiving the enterprise business data returned by the data management system, fitting the corresponding business operation line in the preset coordinate system according to the enterprise business data.
  • when receiving the enterprise business data returned by the data management system, the business data review device quantizes the enterprise business data of the borrowing enterprise in the same manner as in step S10 and obtains the corresponding multi-dimensional business data group; the data genes in the multi-dimensional business data group are then weighted to obtain the corresponding business weighting amount, which represents the operation of the enterprise in a given period; when the business weighting amount is obtained, the corresponding enterprise business line can be fitted in the coordinate system.
  • FIG. 5 is a schematic flowchart of a second embodiment of a business data review method according to the present application.
  • the method further includes:
  • Step S40 Generate a corresponding data audit report according to the enterprise operation data, the enterprise operation line, and the simulated confidence interval, and display the data audit report.
  • a corresponding data audit report may be generated.
  • the data audit report includes the business operation data used in the audit process, and includes the simulated standard moving average obtained in step S10, the simulated confidence interval, and the business operation line obtained in step S20.
  • the simulated standard moving average, the simulated confidence interval, and the enterprise business line can be displayed in the same coordinate system, and abnormal points (abnormal business data) can be highlighted (for example, with a different display color or bold line segments).
  • for example, the business data includes the revenue of the borrowing enterprise over a certain period.
  • the revenue can be displayed as a line chart so that decision makers can understand how the borrowing enterprise's revenue changes; capital investment can be displayed as a pie chart, so that decision makers can understand the borrowing enterprise's monthly capital investment in different items.
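  • the contents of the data audit report can be gathered into a single structure before rendering. The following is only a sketch: the dictionary layout, key names, and the function `build_audit_report` are assumptions, and chart rendering is left out.

```python
def build_audit_report(business_data, moving_average, band, business_line):
    """Collect everything the report displays, plus the points to highlight."""
    highlight = [
        i
        for i, (value, (low, high)) in enumerate(zip(business_line, band))
        if not low <= value <= high  # abnormal point: outside the interval
    ]
    return {
        "business_data": business_data,
        "moving_average": moving_average,
        "confidence_interval": band,
        "business_line": business_line,
        "highlight": highlight,
    }


report = build_audit_report(
    business_data={"revenue": [100, 120, 90]},          # hypothetical raw data
    moving_average=[10.0, 12.0, 11.0],
    band=[(8.0, 12.0), (10.0, 14.0), (9.0, 13.0)],
    business_line=[9.0, 15.5, 11.0],
)
```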
  • the application also provides a business data review device.
  • FIG. 6 is a schematic diagram of functional modules of a first embodiment of a business data review apparatus according to the present application.
  • the operation data review device includes:
  • the interval obtaining module 10 is configured to acquire enterprise sample data, construct an analog standard moving average in a preset coordinate system according to the enterprise sample data, and obtain an analog confidence interval based on the simulated standard moving average;
  • the operation line fitting module 20 is configured to acquire, when a loan request is received, the enterprise business data of the borrowing enterprise according to the loan request, and fit the corresponding enterprise business line in the preset coordinate system according to the enterprise business data;
  • the data judging module 30 is configured to compare the enterprise operation line and the simulated confidence interval, and determine whether the business data of the enterprise is abnormal according to the relationship between the enterprise operation line and the simulated confidence interval.
  • interval obtaining module 10 further includes:
  • a first quantization unit configured to quantize the enterprise sample data into a multi-dimensional sample group
  • a first calculating unit configured to perform analysis learning and weighting calculation on the multi-dimensional sample group based on a genetic algorithm, to obtain a corresponding simulated weighting amount
  • a first fitting unit configured to draw an analog weighted point in a preset coordinate system according to the multi-dimensional sample set and the simulated weighting amount, and obtain a simulated standard moving average according to the simulated weighted point fitting.
  • the multi-dimensional sample group includes a multi-dimensional sample gene of dimension m
  • the first calculation unit is further configured to:
  • $h_\theta(x)$ is the simulated weighting amount corresponding to the multi-dimensional sample group
  • $x_1, x_2, \ldots, x_m$ are the sample genes, and $\theta_0, \theta_1, \theta_2, \ldots, \theta_m$ are the weighting coefficients
  • $\theta^T$ is the coefficient matrix corresponding to the weighting coefficients
  • $y^{(i)}$ is the label value of the multi-dimensional sample group
  • the iterative calculation is performed based on the gradient descent formula and the squared loss function, the coefficient matrix $\theta^T$ is determined, and the simulated weighting amount corresponding to the multi-dimensional sample group is calculated according to the coefficient matrix $\theta^T$ and the simulated matrix equation.
  • the gradient descent formula includes
  • Further, the operation line fitting module 20 includes:
  • a second calculation unit, configured to quantize the enterprise business data into multi-dimensional business data groups, and perform weighted calculation on the multi-dimensional business data groups to obtain corresponding business weighted quantities;
  • a second fitting unit, configured to fit the enterprise operation line in the preset coordinate system according to the multi-dimensional business data groups and the business weighted quantities;
  • the data judgment module 30 includes:
  • an accounting point selection unit, configured to select corresponding accounting points on the enterprise operation line according to a preset accounting period, and determine whether each accounting point lies outside the simulated confidence interval;
  • an abnormality determination unit, configured to determine that the enterprise business data corresponding to an accounting point is abnormal if the accounting point lies outside the simulated confidence interval.
  • Further, the operation line fitting module 20 includes:
  • a request generation unit, configured to, when a loan request is received, generate a corresponding data acquisition request according to the authorization included in the loan request;
  • a request sending unit, configured to send the data acquisition request to the data management system of the borrowing enterprise to obtain the enterprise business data of the borrowing enterprise;
  • a data receiving unit, configured to, when the enterprise business data returned by the data management system is received, fit the corresponding enterprise operation line in the preset coordinate system according to the enterprise business data.
  • Further, the business data audit apparatus also includes:
  • a report generation module, configured to generate a corresponding data audit report according to the enterprise business data, the enterprise operation line, and the simulated confidence interval, and to display the data audit report.
  • The present application also provides a computer-readable storage medium.
  • The computer-readable storage medium of the present application stores a business data audit program, and when the business data audit program is executed by a processor, the steps of the business data audit method described above are implemented.

Landscapes

  • Business, Economics & Management (AREA)
  • Accounting & Taxation (AREA)
  • Finance (AREA)
  • Engineering & Computer Science (AREA)
  • Development Economics (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • Strategic Management (AREA)
  • Technology Law (AREA)
  • Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

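The present application discloses a business data audit method. The method includes: acquiring enterprise sample data, constructing a simulated standard moving average in a preset coordinate system, and obtaining a simulated confidence interval based on the simulated standard moving average; when a loan request is received, acquiring the enterprise business data of the borrowing enterprise corresponding to the loan request, and fitting a corresponding enterprise operation line in the preset coordinate system; and comparing the enterprise operation line with the simulated confidence interval to determine whether the enterprise business data is abnormal. The present application also discloses a business data audit apparatus, device, and computer-readable storage medium. The present application learns from large-scale enterprise data by machine learning to fit a mean line of enterprise operations and a simulated confidence interval, and then uses that line and interval to audit the business data of a borrowing enterprise, thereby judging whether the data is authentic and credible and identifying fraud by the borrowing enterprise.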

Description

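Business data audit method, apparatus, device, and computer-readable storage medium
This application claims priority to Chinese patent application No. 201711292700.2, entitled "Business data audit method, apparatus, device, and computer-readable storage medium", filed with the Chinese Patent Office on December 8, 2017, the entire contents of which are incorporated herein by reference.
Technical Field
The present application relates to the field of financial credit, and in particular to a business data audit method, apparatus, device, and computer-readable storage medium.
Background
Enterprise credit risk assessment refers to analyzing the business status information of an enterprise applying for a loan in order to judge the likelihood that the loan will become overdue (a bad debt) and whether the loan is fraudulent. Existing enterprise credit risk assessment methods all take the enterprise's financial statements as the basic data for assessment; professionals analyze and audit these data to determine the enterprise's business status and assess the credit risk.
This traditional assessment method relies mainly on people analyzing the enterprise's business data statements and similar materials, which takes considerable time when the volume of data is large. In addition, the analysis results are easily affected by subjective factors such as the limits of the professionals' thinking, so the assessment may be inaccurate and may even fail to identify abnormal business data and fraud, leading to bad loans and financial losses.
Summary
The main purpose of the present application is to provide a business data audit method, apparatus, and computer-readable storage medium, aiming to improve the ability to identify abnormal business data during enterprise credit assessment and to reduce the bad-debt rate of enterprise loans.
To achieve the above purpose, the present application provides a business data audit method, which includes the following steps:
acquiring enterprise sample data, constructing a simulated standard moving average in a preset coordinate system according to the enterprise sample data, and obtaining a simulated confidence interval based on the simulated standard moving average;
when a loan request is received, acquiring the enterprise business data of the borrowing enterprise corresponding to the loan request, and fitting a corresponding enterprise operation line in the preset coordinate system according to the enterprise business data;
comparing the enterprise operation line with the simulated confidence interval, and determining whether the enterprise business data is abnormal according to the relationship between the enterprise operation line and the simulated confidence interval.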
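Optionally, the step of constructing a simulated standard moving average in a preset coordinate system according to the enterprise sample data includes:
quantizing the enterprise sample data into multi-dimensional sample groups;
performing analysis learning and weighted calculation on the multi-dimensional sample groups based on a genetic algorithm to obtain corresponding simulated weighted quantities;
plotting simulated weighted points in the preset coordinate system according to the multi-dimensional sample groups and the simulated weighted quantities, and fitting the simulated weighted points to obtain the simulated standard moving average.
Optionally, the multi-dimensional sample group includes multi-dimensional sample genes of dimension m, and
the step of performing analysis learning and weighted calculation on the multi-dimensional sample groups based on a genetic algorithm to obtain corresponding simulated weighted quantities includes:
constructing a simulated weighting equation according to the multi-dimensional sample genes
$h_\theta(x) = \theta_0 + \theta_1 x_1 + \theta_2 x_2 + \dots + \theta_m x_m$
where $h_\theta(x)$ is the simulated weighted quantity corresponding to the multi-dimensional sample group, $x_1, x_2, \dots, x_m$ are the sample genes, and $\theta_0, \theta_1, \theta_2, \dots, \theta_m$ are the weighting coefficients;
converting the simulated weighting equation into the corresponding simulated matrix equation
Figure PCTCN2018075658-appb-000001
where $\theta^T$ is the coefficient matrix corresponding to the weighting coefficients;
constructing the squared loss function corresponding to the simulated matrix equation
Figure PCTCN2018075658-appb-000002
where $y^{(i)}$ is the label value of the multi-dimensional sample group;
performing iterative calculation based on the gradient descent formula and the squared loss function to determine the coefficient matrix $\theta^T$, and calculating the simulated weighted quantity corresponding to the multi-dimensional sample group according to the coefficient matrix $\theta^T$ and the simulated matrix equation.
Optionally, the gradient descent formula includes
Figure PCTCN2018075658-appb-000003
Figure PCTCN2018075658-appb-000004
where α is the iteration step size.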
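Optionally, the step of fitting a corresponding enterprise operation line in the preset coordinate system according to the enterprise business data includes:
quantizing the enterprise business data into multi-dimensional business data groups, and performing weighted calculation on the multi-dimensional business data groups to obtain corresponding business weighted quantities;
fitting the enterprise operation line in the preset coordinate system according to the multi-dimensional business data groups and the business weighted quantities;
the step of comparing the enterprise operation line with the simulated confidence interval, and determining whether the enterprise business data is abnormal according to the relationship between the enterprise operation line and the simulated confidence interval, includes:
selecting corresponding accounting points on the enterprise operation line according to a preset accounting period, and determining whether each accounting point lies in the region outside the simulated confidence interval;
if an accounting point lies in the region outside the simulated confidence interval, determining that the enterprise business data corresponding to that accounting point is abnormal.
Optionally, the step of, when a loan request is received, acquiring the enterprise business data of the borrowing enterprise corresponding to the loan request, and fitting a corresponding enterprise operation line in the preset coordinate system according to the enterprise business data, includes:
when a loan request is received, generating a corresponding data acquisition request according to the authorization included in the loan request;
sending the data acquisition request to the data management system of the borrowing enterprise to obtain the enterprise business data of the borrowing enterprise;
when the enterprise business data returned by the data management system is received, fitting the corresponding enterprise operation line in the preset coordinate system according to the enterprise business data.
Optionally, after the step of comparing the enterprise operation line with the simulated confidence interval, and determining whether the enterprise business data is abnormal according to the relationship between the enterprise operation line and the simulated confidence interval, the method further includes:
generating a corresponding data audit report according to the enterprise business data, the enterprise operation line, and the simulated confidence interval, and displaying the data audit report.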
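In addition, to achieve the above purpose, the present application also provides a business data audit apparatus, which includes:
an interval acquisition module, configured to acquire enterprise sample data, construct a simulated standard moving average in a preset coordinate system according to the enterprise sample data, and obtain a simulated confidence interval based on the simulated standard moving average;
an operation line fitting module, configured to, when a loan request is received, acquire the enterprise business data of the borrowing enterprise corresponding to the loan request, and fit a corresponding enterprise operation line in the preset coordinate system according to the enterprise business data;
a data judgment module, configured to compare the enterprise operation line with the simulated confidence interval, and determine whether the enterprise business data is abnormal according to the relationship between the enterprise operation line and the simulated confidence interval.
In addition, to achieve the above purpose, the present application also provides a business data audit device, which includes a processor, a memory, and a business data audit program stored in the memory and executable by the processor, wherein when the business data audit program is executed by the processor, the following steps are implemented:
acquiring enterprise sample data, constructing a simulated standard moving average in a preset coordinate system according to the enterprise sample data, and obtaining a simulated confidence interval based on the simulated standard moving average;
when a loan request is received, acquiring the enterprise business data of the borrowing enterprise corresponding to the loan request, and fitting a corresponding enterprise operation line in the preset coordinate system according to the enterprise business data;
comparing the enterprise operation line with the simulated confidence interval, and determining whether the enterprise business data is abnormal according to the relationship between the enterprise operation line and the simulated confidence interval.
In addition, to achieve the above purpose, the present application also provides a computer-readable storage medium storing a business data audit program, wherein when the business data audit program is executed by a processor, the following steps are implemented:
acquiring enterprise sample data, constructing a simulated standard moving average in a preset coordinate system according to the enterprise sample data, and obtaining a simulated confidence interval based on the simulated standard moving average;
when a loan request is received, acquiring the enterprise business data of the borrowing enterprise corresponding to the loan request, and fitting a corresponding enterprise operation line in the preset coordinate system according to the enterprise business data;
comparing the enterprise operation line with the simulated confidence interval, and determining whether the enterprise business data is abnormal according to the relationship between the enterprise operation line and the simulated confidence interval.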
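Brief Description of the Drawings
FIG. 1 is a schematic diagram of the hardware structure of the business data audit device involved in the embodiments of the present application;
FIG. 2 is a schematic flowchart of a first embodiment of the business data audit method of the present application;
FIG. 3 is a schematic diagram of the simulated standard moving average involved in the first embodiment of the business data audit method of the present application;
FIG. 4 is a detailed flowchart of the step in FIG. 2 of, when a loan request is received, acquiring the enterprise business data of the borrowing enterprise corresponding to the loan request, and fitting a corresponding enterprise operation line in the preset coordinate system according to the enterprise business data;
FIG. 5 is a schematic flowchart of a second embodiment of the business data audit method of the present application;
FIG. 6 is a schematic diagram of the functional modules of a first embodiment of the business data audit apparatus of the present application.
The realization of the purpose, the functional features, and the advantages of the present application are further described below with reference to the embodiments and the accompanying drawings.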
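Detailed Description
It should be understood that the specific embodiments described here are only intended to explain the present application and are not intended to limit it.
The business data audit method of the embodiments of the present application is mainly applied to a business data audit device. Referring to FIG. 1, FIG. 1 is a schematic diagram of the hardware structure of the business data audit device involved in the embodiments of the present application. In the embodiments of the present application, the business data audit device may include a processor 1001 (for example, a CPU), a communication bus 1002, a user interface 1003, a network interface 1004, and a memory 1005. The communication bus 1002 is used for connection and communication among these components; the user interface 1003 may include a display and an input unit such as a keyboard; the network interface 1004 may optionally include a standard wired interface and a wireless interface (such as a WI-FI interface); the memory 1005 may be a high-speed RAM or a non-volatile memory such as a disk memory, and may optionally be a storage device independent of the processor 1001. Those skilled in the art will understand that the hardware structure shown in FIG. 1 does not limit the business data audit device, which may include more or fewer components than shown, combine certain components, or arrange the components differently. The memory 1005 in FIG. 1, as a computer-readable storage medium, may include an operating system, a network communication module, and a business data audit program. In FIG. 1, the network communication module is mainly used to connect to a database and exchange data with it, while the processor 1001 may call the business data audit program stored in the memory 1005 and execute the embodiments of the business data audit method provided by the present application.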
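The present application provides a business data audit method.
Referring to FIG. 2, FIG. 2 is a schematic flowchart of a first embodiment of the business data audit method of the present application.
In this embodiment, the business data audit method includes the following steps:
Step S10: acquire enterprise sample data, construct a simulated standard moving average in a preset coordinate system according to the enterprise sample data, and obtain a simulated confidence interval based on the simulated standard moving average;
This embodiment proposes a business data audit method that learns from large-scale enterprise data to fit a mean line of enterprise operations and a simulated confidence interval, and then uses that line and interval to check the business data of a borrowing enterprise. This makes it possible to judge whether the borrowing enterprise's business data is authentic and credible and to identify fraud by the borrowing enterprise, reduces the adverse effect of the limits of human thinking on enterprise lending, improves the accuracy of risk assessment, and thereby lowers the bad-debt rate of credit.
In this embodiment, large-scale enterprise data is learned, and a mean line of enterprise operations can be fitted by linear regression; this line is called the simulated standard moving average. For the business data produced by the vast majority of enterprises in the course of operation, the points obtained by fitting with the same method should fluctuate around this line. Conversely, business data that deviates significantly from the simulated standard moving average is considered abnormal, and the enterprise's business status does not meet the lending conditions (the data may, of course, also have been forged).
Because the volume of enterprise business data is large, having staff perform the analysis and fit the mean line would usually take considerable time; moreover, the resulting analysis and fit would easily be affected by subjective factors such as the limits of the professionals' thinking, which would reduce the applicability of the mean line. In this embodiment, therefore, the simulated standard moving average is constructed by machine learning as part of AI-based risk control. Machine learning here means that humans are not relied on to summarize experience or input logic: humans only need to feed a large amount of enterprise business data into the computer, which then works out the relationships in the data by itself and induces the corresponding logic, yielding a data transformation rule in the form of a fitting formula.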
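In this embodiment, a genetic algorithm is also introduced into the data analysis performed by machine learning. Note that the genetic algorithm here is an idea of survival of the fittest rather than a specific mathematical model. During machine learning, a factor (a certain type of business data) that significantly drives the business expectation away from the mean is understood by the computer as a gene that may lead to a poor predicted expectation; enterprises associated with it are given a fraud label, and the gene is recorded. On the other hand, genes that produce positive results are retained (for example, annual revenue growth within the interval [5.0%, 15.0%]). In this embodiment, outlier sample points are analyzed and identified, the distribution of each type of specific gene among the outlier sample points is counted, and the genes with a high share are recorded as a basis for further prediction. The "genes" here are data produced in the course of enterprise operation, including but not limited to revenue, accounts receivable, working capital, net profit after deducting non-recurring items, investment, and depreciation. In this way a simulated standard moving average of enterprise operations is fitted, around which the normal business activities of the vast majority of enterprises fluctuate. The Y axis of this line is the projection of a high-dimensional vector into two dimensions: depending on the industry, an enterprise's business status is quantized into a multi-dimensional business vector and then accumulated into a mean line (a distribution model that does not depend on the time axis). Under normal circumstances, the business information of the vast majority of enterprises should lie close to either side of this line (within the simulated confidence interval), while outliers (outside the simulated confidence interval) are marked as abnormal.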
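The tallying of genes among outlier samples can be illustrated with a small sketch. It is not a full genetic algorithm and is not taken from the application: the tag names, the `min_share` threshold, and the function name `frequent_outlier_genes` are assumptions made only for illustration.

```python
from collections import Counter


def frequent_outlier_genes(outliers, min_share=0.5):
    """Return the gene tags present in at least `min_share` of the outlier samples.

    `outliers` is a list of sets; each set holds the (hypothetical) gene tags
    that were found to be unusual for one outlier enterprise sample.
    """
    if not outliers:
        return []
    counts = Counter(tag for tags in outliers for tag in set(tags))
    return sorted(tag for tag, n in counts.items() if n / len(outliers) >= min_share)


# Hypothetical outlier samples and the tags observed on each of them.
outliers = [
    {"receivables_spike", "revenue_drop"},
    {"receivables_spike"},
    {"depreciation_jump", "receivables_spike"},
    {"revenue_drop"},
]
recorded = frequent_outlier_genes(outliers, min_share=0.5)
print(recorded)  # ['receivables_spike', 'revenue_drop']
```

Genes recorded this way could then serve as the "basis for further prediction" that the paragraph mentions; how they feed back into the fit is not specified in the text.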
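Specifically, enterprise sample data is acquired first. This data includes revenue, accounts receivable, working capital, net profit after deducting non-recurring items, investment, depreciation, and so on (other items may of course be included). These data are usually continuous and have different statistical periods; for example, revenue is counted daily while working capital is counted monthly. The enterprise sample data therefore needs to be preprocessed first: taking a uniform time unit as the quantization standard, the enterprise sample data is quantized into several multi-dimensional sample groups. For example, with the month as the unit, the enterprise sample data is quantized into multi-dimensional sample groups whose multi-dimensional sample genes include the January multi-dimensional sample genes, the February multi-dimensional sample genes, and so on. Once preprocessing is complete, the multi-dimensional sample groups are learned based on the genetic algorithm, the relationships among the genes in the multi-dimensional sample groups are analyzed, and the genes are then weighted according to a certain weighting relationship to obtain the simulated weighted quantity corresponding to each multi-dimensional sample group. Once the simulated weighted quantities are obtained, the corresponding simulated standard moving average can be fitted in the coordinate system, where the x axis is time and the y axis is the simulated weighted quantity: the corresponding simulated weighted points are plotted in the preset coordinate system according to the multi-dimensional sample groups and the simulated weighted quantities, and the simulated standard moving average is obtained by fitting these points. Note that although the x axis is time, this does not mean that the simulated weighted quantity changes as a function of time; time is only the quantization standard for the enterprise sample data and affects the form in which the simulated standard moving average is presented, not the fitted relationship itself.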
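The preprocessing described above, which brings data with different statistical periods onto one monthly grid, might look like the following minimal sketch. The two-gene layout (revenue and working capital), the input shapes, and the function name `quantize_monthly` are assumptions for illustration only.

```python
from collections import defaultdict
from datetime import date


def quantize_monthly(daily_revenue, monthly_working_capital):
    """Aggregate data with different statistical periods into one vector per month.

    daily_revenue: list of (date, amount) pairs recorded per day.
    monthly_working_capital: dict mapping (year, month) to an amount.
    Returns a dict mapping (year, month) to [revenue, working_capital].
    """
    revenue_by_month = defaultdict(float)
    for day, amount in daily_revenue:
        revenue_by_month[(day.year, day.month)] += amount

    groups = {}
    for key in sorted(revenue_by_month):
        # Keep only months for which every gene is available.
        if key in monthly_working_capital:
            groups[key] = [revenue_by_month[key], monthly_working_capital[key]]
    return groups


daily = [(date(2017, 1, 5), 100.0), (date(2017, 1, 20), 50.0), (date(2017, 2, 3), 80.0)]
capital = {(2017, 1): 1000.0, (2017, 2): 900.0}
sample_groups = quantize_monthly(daily, capital)
print(sample_groups)  # {(2017, 1): [150.0, 1000.0], (2017, 2): [80.0, 900.0]}
```

Each value in the result corresponds to the genes of one monthly multi-dimensional sample group; a real data set would carry more genes per month.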
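For example, after the enterprise sample data is quantized by month, 8 multi-dimensional sample groups covering January to August are obtained, each containing genes in m dimensions (the multi-dimensional sample group may also be said to have dimension m). The multi-dimensional sample genes of a given month can be written as $x_1, x_2, \dots, x_m$.
From these sample genes, the following simulated weighting equation can be constructed:
$h_\theta(x) = \theta_0 + \theta_1 x_1 + \theta_2 x_2 + \dots + \theta_m x_m$   ①
where $h_\theta(x)$ is the simulated weighted quantity corresponding to the multi-dimensional sample group (revenue may be used as the simulated weighted quantity for a cold start), and $\theta_0, \theta_1, \theta_2, \dots, \theta_m$ are the weighting coefficients.
For convenience of calculation, equation ① can be converted into the corresponding matrix form (called the simulated matrix equation):
Figure PCTCN2018075658-appb-000005   ②
where $\theta^T$ is the coefficient matrix corresponding to the weighting coefficients.
Equation ② shows that calculating the simulated weighted quantity requires determining the coefficient matrix $\theta^T$.
To do so, a squared loss function is constructed first:
Figure PCTCN2018075658-appb-000006   ③
where $y^{(i)}$ is the label value of the multi-dimensional sample group. The squared loss function ③ measures how far the predicted value (the simulated weighted quantity) is from the true value, so equation ③ can be used to judge whether the coefficient matrix $\theta^T$ is accurate. In this embodiment, the coefficient matrix $\theta^T$ can be iterated by gradient descent, that is,
Figure PCTCN2018075658-appb-000007   ④
Equation ④ can be rewritten as
Figure PCTCN2018075658-appb-000008   ⑤
where α is the iteration step size, which can also be understood as the learning rate.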
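Equations ② to ⑤ survive in this text only as image references. For orientation, the textbook linear-regression forms that match the surrounding definitions are sketched below. These are an assumption: the published images may differ, for example in the normalization constant of the loss, and the symbol $n$ (the number of sample groups) and the convention $x_0 = 1$ are introduced here for illustration.

```latex
% Assumed standard forms; the original equation images may differ in normalization.
\begin{align}
h_\theta(x) &= \theta^{T} x, \qquad x = (1, x_1, \dots, x_m)^{T} \\
J(\theta) &= \frac{1}{2} \sum_{i=1}^{n} \bigl( h_\theta(x^{(i)}) - y^{(i)} \bigr)^{2} \\
\theta_j &:= \theta_j - \alpha \, \frac{\partial J(\theta)}{\partial \theta_j} \\
\theta_j &:= \theta_j - \alpha \sum_{i=1}^{n} \bigl( h_\theta(x^{(i)}) - y^{(i)} \bigr) \, x_j^{(i)}
\end{align}
```

The last line follows from the third by differentiating the squared loss term by term, since $\partial h_\theta(x^{(i)}) / \partial \theta_j = x_j^{(i)}$.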
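Iterative calculation is performed based on equations ③, ④, and ⑤. After several rounds of iteration, when the distance between the coefficient matrices $\theta^T$ of two consecutive iterations is less than a predetermined value (for example, 0.000001), the algorithm is considered to have converged; the coefficient matrix $\theta^T$ of the later round is then taken as the final result of the iteration and used to calculate the simulated weighted quantity. Once the simulated weighted quantities are obtained, simulated weighted points can be plotted in the preset coordinate system according to the multi-dimensional sample groups and the simulated weighted quantities, and the simulated standard moving average is obtained by fitting these points, as shown in FIG. 3, where the x axis is time (in months) and the y axis is the simulated weighted quantity. Note that the simulated standard moving average in FIG. 3 is drawn as a polyline. This is because the data is sampled periodically, and as the sampling window rolls forward, seasonal and economic-cycle factors inevitably take effect.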
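The iteration with a convergence threshold can be sketched as follows. This is a minimal illustration rather than the application's implementation: it assumes the standard batch gradient descent update for a squared loss, uses the Euclidean distance between consecutive coefficient vectors as the "distance", and the function names, the `max_iter` guard, and the toy data are hypothetical.

```python
def fit_weights(samples, labels, alpha=0.01, tol=1e-6, max_iter=100_000):
    """Batch gradient descent for h(x) = theta_0 + sum(theta_j * x_j).

    Stops when the Euclidean distance between two consecutive coefficient
    vectors falls below `tol`. `alpha` must be small enough for the data
    scale, otherwise the iteration diverges.
    """
    m = len(samples[0])
    theta = [0.0] * (m + 1)
    for _ in range(max_iter):
        # Residual h(x_i) - y_i for every sample group.
        residuals = [
            theta[0] + sum(t * x for t, x in zip(theta[1:], row)) - y
            for row, y in zip(samples, labels)
        ]
        # Gradient of 1/2 * sum(residual^2) with respect to each coefficient.
        grad = [sum(residuals)] + [
            sum(r * row[j] for r, row in zip(residuals, samples)) for j in range(m)
        ]
        new_theta = [t - alpha * g for t, g in zip(theta, grad)]
        step = sum((a - b) ** 2 for a, b in zip(new_theta, theta)) ** 0.5
        theta = new_theta
        if step < tol:
            break
    return theta


def weighted_amount(theta, row):
    """Evaluate the weighting equation for one multi-dimensional group."""
    return theta[0] + sum(t * x for t, x in zip(theta[1:], row))


# Toy data generated from y = 1 + 2x, so the fit should approach [1, 2].
theta = fit_weights([[0.0], [1.0], [2.0], [3.0]], [1.0, 3.0, 5.0, 7.0], alpha=0.01)
```

With real business data the genes would normally be rescaled before fitting, since a fixed step size that suits one gene's magnitude can make the update diverge for another.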
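Once the simulated standard moving average is obtained, it can be shifted upward and downward to obtain the simulated confidence interval; the shift distance can be set and adjusted according to the actual situation.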
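As a small illustration of the shifting operation, the band can be represented as one (lower, upper) pair per point on the line. The fixed `offset` parameter and the function name are assumptions; the text only says that the distance is configurable.

```python
def confidence_band(mean_line, offset):
    """Shift the mean line down and up by `offset`, giving (lower, upper) per point."""
    if offset < 0:
        raise ValueError("offset must be non-negative")
    return [(y - offset, y + offset) for y in mean_line]


mean_line = [10.0, 12.0, 11.5]  # hypothetical simulated weighted quantities
band = confidence_band(mean_line, offset=2.0)
print(band)  # [(8.0, 12.0), (10.0, 14.0), (9.5, 13.5)]
```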
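Step S20: when a loan request is received, acquire the enterprise business data of the borrowing enterprise corresponding to the loan request, and fit a corresponding enterprise operation line in the preset coordinate system according to the enterprise business data;
In this embodiment, once the simulated confidence interval is obtained, it can be used for data auditing. When a loan request from an enterprise is received, the enterprise business data of the borrowing enterprise is acquired and quantized in the same way as in step S10 to obtain the corresponding multi-dimensional business data groups. The coefficient matrix $\theta^T$ calculated in step S10 is then used to weight the data genes in the multi-dimensional business data groups, yielding the corresponding business weighted quantities, each of which represents the enterprise's business situation over a certain period.
Once the business weighted quantities are obtained, the corresponding enterprise operation line can be fitted in the coordinate system, where the x axis of the preset coordinate system is time and the y axis is the simulated weighted quantity: the corresponding business weighted points are plotted in the preset coordinate system according to the multi-dimensional business data groups and the business weighted quantities, and the enterprise operation line is obtained by fitting these points. Note that the enterprise operation line may be fitted as a polyline or as a curve. The operation line may be fitted in the same coordinate system as the simulated standard moving average, so that the operation line and the simulated standard moving average (simulated confidence interval) are displayed together, which facilitates the subsequent comparison.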
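Applying the trained coefficients to a borrower's monthly data groups can be sketched as below. The coefficient values, the two-gene layout, and the function name `operation_line` are hypothetical; the points are returned as a polyline, one of the two fitting styles the text allows.

```python
def operation_line(theta, monthly_groups):
    """Return [(month_index, business_weighted_quantity)] for a borrower.

    `theta` holds the coefficients learned in the training step (intercept
    first); `monthly_groups` holds one gene vector per month.
    """
    points = []
    for month, genes in enumerate(monthly_groups, start=1):
        value = theta[0] + sum(t * g for t, g in zip(theta[1:], genes))
        points.append((month, value))
    return points


theta = [1.0, 0.5, 2.0]                 # assumed output of the training step
borrower = [[10.0, 1.0], [12.0, 1.5]]   # hypothetical monthly gene vectors
line = operation_line(theta, borrower)
print(line)  # [(1, 8.0), (2, 10.0)]
```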
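Step S30: compare the enterprise operation line with the simulated confidence interval, and determine whether the enterprise business data is abnormal according to the relationship between the enterprise operation line and the simulated confidence interval.
Once the enterprise operation line and the simulated confidence interval are obtained, the enterprise business data can be audited. The enterprise operation line is compared with the simulated confidence interval, and whether the enterprise business data is abnormal is judged from their positional relationship. If all points on the enterprise operation line lie within the simulated confidence interval, or the deviation between the enterprise operation line and the simulated confidence interval is within a preset permitted range, the enterprise business data corresponding to the line is considered normal and the borrowing enterprise is in a state of normal operation. If not all points on the line lie within the simulated confidence interval and the deviation between the line and the interval exceeds the preset permitted range, the enterprise business data corresponding to the line is considered abnormal and the borrowing enterprise is in a state of abnormal operation.
Further, the comparison between the enterprise operation line and the simulated confidence interval may also be implemented as follows: several accounting points are selected on the enterprise operation line according to a preset accounting period, and each is checked to see whether it lies outside the simulated confidence interval; if an accounting point lies outside the interval, the enterprise business data corresponding to that accounting point is considered abnormal. For example, suppose that when the enterprise operation line was fitted, the multi-dimensional business data groups comprised 8 groups covering January to August, and the line was fitted from these 8 monthly groups and the corresponding business weighted quantities. If 10 days is taken as one accounting period during data analysis, 25 accounting points can be taken on the enterprise operation line (each month is treated as 30 days, plus the starting point in January). Each of these accounting points is then checked to see whether it lies outside the simulated confidence interval. If 4 accounting points between February and March are all found to lie outside the simulated confidence interval, the enterprise business data for February to March is considered abnormal, and the borrowing enterprise was in a state of abnormal operation from February to March. In addition, an allowed number of anomalies can be preset: if the number of abnormal accounting points lying outside the simulated confidence interval exceeds this allowed number, the deviation between the whole enterprise operation line and the simulated confidence interval is directly considered to exceed the preset permitted range; the whole line is then considered untrustworthy, and all of the borrowing enterprise's business data is treated as abnormal.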
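The accounting-point check can be sketched as follows, under several assumptions that are not in the text: the operation line and the band edges are polylines with knots every 30 days, values between knots are obtained by linear interpolation, and the x axis is expressed in days so that a 10-day accounting period yields 25 sample points over 8 months. All names and numbers are illustrative.

```python
def interpolate(knots, x):
    """Linearly interpolate a polyline given as sorted (x, y) knots."""
    for (x0, y0), (x1, y1) in zip(knots, knots[1:]):
        if x0 <= x <= x1:
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
    raise ValueError("x lies outside the polyline")


def abnormal_accounting_points(line, lower, upper, period=10):
    """Sample the operation line every `period` days; return days outside the band."""
    first_day, last_day = line[0][0], line[-1][0]
    return [
        day
        for day in range(first_day, last_day + 1, period)
        if not interpolate(lower, day) <= interpolate(line, day) <= interpolate(upper, day)
    ]


def line_untrusted(abnormal_days, allowed):
    """The whole line is rejected once abnormal points exceed the allowed count."""
    return len(abnormal_days) > allowed


# Eight 30-day months give knots at day 0, 30, ..., 240 (hypothetical values).
knot_days = list(range(0, 241, 30))
values = [10.0, 10.0, 10.0, 16.0, 10.0, 10.0, 10.0, 10.0, 10.0]
line = list(zip(knot_days, values))
lower = [(day, 8.0) for day in knot_days]
upper = [(day, 12.0) for day in knot_days]

abnormal = abnormal_accounting_points(line, lower, upper)  # 25 points are sampled
print(abnormal)  # [80, 90, 100]
```

Points that sit exactly on the band edge (days 70 and 110 here) are treated as inside; whether the boundary counts as abnormal is a design choice the text leaves open.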
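In this embodiment, enterprise sample data is acquired, a simulated standard moving average is constructed in a preset coordinate system according to the enterprise sample data, and a simulated confidence interval is obtained based on the simulated standard moving average; when a loan request is received, the enterprise business data of the borrowing enterprise corresponding to the loan request is acquired, and a corresponding enterprise operation line is fitted in the preset coordinate system according to the enterprise business data; the enterprise operation line is compared with the simulated confidence interval, and whether the enterprise business data is abnormal is determined according to the relationship between them. In this way, this embodiment learns from large-scale enterprise data by machine learning to fit a mean line of enterprise operations and a simulated confidence interval, and then uses that line and interval to audit the business data of a borrowing enterprise. This makes it possible to judge whether the borrowing enterprise's business data is authentic and credible and to identify fraud by the borrowing enterprise, reduces the adverse effect of the limits of human thinking on enterprise lending, improves the accuracy of risk assessment, and thereby lowers the bad-debt rate of enterprise credit.
Referring to FIG. 4, FIG. 4 is a detailed flowchart of the step in FIG. 2 of, when a loan request is received, acquiring the enterprise business data of the borrowing enterprise corresponding to the loan request, and fitting a corresponding enterprise operation line in the preset coordinate system according to the enterprise business data.
Based on the embodiment shown in FIG. 2, step S20 includes: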
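Step S21: when a loan request is received, generate a corresponding data acquisition request according to the authorization included in the loan request;
In this embodiment, when the business data audit device receives a loan request from a borrowing enterprise, it needs to obtain the enterprise business data related to that enterprise in order to analyze its business status. Such data is produced in the enterprise's day-to-day operations and is usually collected directly by the borrowing enterprise and recorded in its own data management system. Because the data reflects the enterprise's business status, it may involve trade secrets, so the borrowing enterprise's authorization must be obtained before the data can be acquired. Specifically, when applying for a loan, the borrowing enterprise adds the relevant authorization information to the loan request it sends, indicating that the business data audit device is authorized to access the data management system and obtain the corresponding enterprise business data. On receiving the loan request, the business data audit device extracts the authorization and generates the corresponding data acquisition request from it.
Further, to prevent the loan request from being intercepted by an unauthorized third party during transmission and the authorization from being misused, a transmission protocol can be agreed with the business data audit device in advance. The authorization is first encrypted according to the protocol and only then added to the loan request for sending, which improves the security of the transmission. On receiving the loan request, the business data audit device first decrypts it according to the transmission protocol to obtain the authorization, and then generates the corresponding data acquisition request from the authorization.
Step S22: send the data acquisition request to the data management system of the borrowing enterprise to obtain the enterprise business data of the borrowing enterprise;
In this embodiment, once the business data audit device has generated the data acquisition request, it sends the request to the borrowing enterprise's data management system to obtain the corresponding enterprise business data. On receiving the data acquisition request, the data management system first verifies the authorization included in the request to determine whether it is genuine, and confirms the data acquisition permissions of the business data audit device (that is, the range of enterprise business data it may obtain). Once the checks pass, it retrieves the enterprise business data specified by the request and returns it to the business data audit device. Similarly, to secure the transmission of the data acquisition request, it can be encrypted with the means described in step S21, which is not repeated here.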
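The verification and permission check performed by the data management system can be illustrated with a sketch. Note the substitution: the text describes encrypting the authorization under an agreed protocol, whereas this sketch uses an HMAC-SHA256 tag over a shared key, which only proves authenticity and integrity and does not hide the content. The field names, key, and function names are all hypothetical.

```python
import hashlib
import hmac
import json


def sign_request(authorization, shared_key):
    """Serialize the authorization and attach an HMAC-SHA256 tag."""
    payload = json.dumps(authorization, sort_keys=True).encode("utf-8")
    tag = hmac.new(shared_key, payload, hashlib.sha256).hexdigest()
    return {"payload": payload.decode("utf-8"), "tag": tag}


def verify_and_scope(request, shared_key, requested_fields):
    """Check the tag, then return only the requested fields the grant covers."""
    payload = request["payload"].encode("utf-8")
    expected = hmac.new(shared_key, payload, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, request["tag"]):
        raise PermissionError("authorization could not be verified")
    allowed = set(json.loads(payload)["fields"])
    return sorted(allowed & set(requested_fields))


key = b"demo-shared-key"  # hypothetical key agreed in advance
grant = {"borrower": "ACME", "fields": ["revenue", "receivables"]}
request = sign_request(grant, key)
served = verify_and_scope(request, key, ["revenue", "depreciation"])
print(served)  # ['revenue']
```

A production system that needs confidentiality as well would add real encryption (for example via TLS or an authenticated cipher from a vetted library) rather than rely on a tag alone.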
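Step S23: when the enterprise business data returned by the data management system is received, fit the corresponding enterprise operation line in the preset coordinate system according to the enterprise business data.
On receiving the enterprise business data returned by the data management system, the business data audit device quantizes the borrowing enterprise's business data in the same way as in step S10 to obtain the corresponding multi-dimensional business data groups. It then weights the data genes in the multi-dimensional business data groups to obtain the corresponding business weighted quantities, each of which represents the enterprise's business situation over a certain period. Once the business weighted quantities are obtained, the corresponding enterprise operation line can be fitted in the coordinate system.
Referring to FIG. 5, FIG. 5 is a schematic flowchart of a second embodiment of the business data audit method of the present application.
Based on the embodiment shown in FIG. 2 or FIG. 4, in this embodiment the following step is further included after step S30:
Step S40: generate a corresponding data audit report according to the enterprise business data, the enterprise operation line, and the simulated confidence interval, and display the data audit report.
In this embodiment, after the normality of the enterprise business data has been judged from the relationship between the enterprise operation line and the simulated confidence interval, a corresponding data audit report can also be generated. The report includes the enterprise business data used in the audit, the simulated standard moving average and simulated confidence interval obtained in step S10, and the enterprise operation line obtained in step S20. To make the report easier for decision makers to read, the simulated standard moving average, the simulated confidence interval, and the enterprise operation line can be displayed in the same coordinate system, and abnormal points (abnormal business data) can be highlighted (for example, with a different color or a thicker line segment). Because the report involves a large amount of data, the way the data is displayed can be customized. For example, if the enterprise business data includes the borrowing enterprise's revenue over a certain period, the revenue can be shown as a line chart so that decision makers can see how it changes; the enterprise's expenditure can be shown as a pie chart so that decision makers can see how much the borrowing enterprise spends on different items each month.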
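A report structure that carries the operation line, the band, and the flagged points together might be assembled as in the sketch below. The dictionary layout and the function name are assumptions; rendering (charts, highlighting) is left out.

```python
def build_audit_report(borrower, line, band):
    """Combine operation-line points and band edges into one report dict.

    `line` is a list of (period, value); `band` is a list of (lower, upper)
    with one entry per period. Points outside the band are flagged.
    """
    rows = []
    for (period, value), (low, high) in zip(line, band):
        rows.append({
            "period": period,
            "value": value,
            "lower": low,
            "upper": high,
            "abnormal": not low <= value <= high,
        })
    flagged = [row["period"] for row in rows if row["abnormal"]]
    return {
        "borrower": borrower,
        "rows": rows,
        "abnormal_periods": flagged,
        "conclusion": "abnormal" if flagged else "normal",
    }


report = build_audit_report("ACME", [(1, 9.0), (2, 15.0)], [(8.0, 12.0), (10.0, 14.0)])
print(report["abnormal_periods"], report["conclusion"])  # [2] abnormal
```

The `abnormal` flag per row is what a display layer would use to highlight points in a different color or with a thicker segment.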
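In addition, the present application also provides a business data audit apparatus.
Referring to FIG. 6, FIG. 6 is a schematic diagram of the functional modules of a first embodiment of the business data audit apparatus of the present application.
In this embodiment, the business data audit apparatus includes:
an interval acquisition module 10, configured to acquire enterprise sample data, construct a simulated standard moving average in a preset coordinate system according to the enterprise sample data, and obtain a simulated confidence interval based on the simulated standard moving average;
an operation line fitting module 20, configured to, when a loan request is received, acquire the enterprise business data of the borrowing enterprise corresponding to the loan request, and fit a corresponding enterprise operation line in the preset coordinate system according to the enterprise business data;
a data judgment module 30, configured to compare the enterprise operation line with the simulated confidence interval, and determine whether the enterprise business data is abnormal according to the relationship between the enterprise operation line and the simulated confidence interval.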
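Further, the interval acquisition module 10 also includes:
a first quantization unit, configured to quantize the enterprise sample data into multi-dimensional sample groups;
a first calculation unit, configured to perform analysis learning and weighted calculation on the multi-dimensional sample groups based on a genetic algorithm to obtain corresponding simulated weighted quantities;
a first fitting unit, configured to plot simulated weighted points in the preset coordinate system according to the multi-dimensional sample groups and the simulated weighted quantities, and fit the simulated weighted points to obtain the simulated standard moving average.
Further, the multi-dimensional sample group includes multi-dimensional sample genes of dimension m, and the first calculation unit is further configured to:
construct a simulated weighting equation according to the multi-dimensional sample genes
$h_\theta(x) = \theta_0 + \theta_1 x_1 + \theta_2 x_2 + \dots + \theta_m x_m$
where $h_\theta(x)$ is the simulated weighted quantity corresponding to the multi-dimensional sample group, $x_1, x_2, \dots, x_m$ are the sample genes, and $\theta_0, \theta_1, \theta_2, \dots, \theta_m$ are the weighting coefficients;
convert the simulated weighting equation into the corresponding simulated matrix equation
Figure PCTCN2018075658-appb-000009
where $\theta^T$ is the coefficient matrix corresponding to the weighting coefficients;
construct the squared loss function corresponding to the simulated matrix equation
Figure PCTCN2018075658-appb-000010
where $y^{(i)}$ is the label value of the multi-dimensional sample group;
perform iterative calculation based on the gradient descent formula and the squared loss function to determine the coefficient matrix $\theta^T$, and calculate the simulated weighted quantity corresponding to the multi-dimensional sample group according to the coefficient matrix $\theta^T$ and the simulated matrix equation.
Further, the gradient descent formula includes
Figure PCTCN2018075658-appb-000011
Figure PCTCN2018075658-appb-000012
where α is the iteration step size.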
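Further, the operation line fitting module 20 also includes:
a second calculation unit, configured to quantize the enterprise business data into multi-dimensional business data groups, and perform weighted calculation on the multi-dimensional business data groups to obtain corresponding business weighted quantities;
a second fitting unit, configured to fit the enterprise operation line in the preset coordinate system according to the multi-dimensional business data groups and the business weighted quantities;
the data judgment module 30 also includes:
an accounting point selection unit, configured to select corresponding accounting points on the enterprise operation line according to a preset accounting period, and determine whether each accounting point lies in the region outside the simulated confidence interval;
an abnormality determination unit, configured to determine that the enterprise business data corresponding to an accounting point is abnormal if the accounting point lies in the region outside the simulated confidence interval.
Further, the operation line fitting module 20 also includes:
a request generation unit, configured to, when a loan request is received, generate a corresponding data acquisition request according to the authorization included in the loan request;
a request sending unit, configured to send the data acquisition request to the data management system of the borrowing enterprise to obtain the enterprise business data of the borrowing enterprise;
a data receiving unit, configured to, when the enterprise business data returned by the data management system is received, fit the corresponding enterprise operation line in the preset coordinate system according to the enterprise business data.
Further, the business data audit apparatus also includes:
a report generation module, configured to generate a corresponding data audit report according to the enterprise business data, the enterprise operation line, and the simulated confidence interval, and to display the data audit report.
The modules of the business data audit apparatus correspond to the steps of the business data audit method embodiments above; their functions and implementation are not repeated here.
In addition, the present application also provides a computer-readable storage medium.
The computer-readable storage medium of the present application stores a business data audit program, and when the business data audit program is executed by a processor, the steps of the business data audit method described above are implemented.
For the method implemented when the business data audit program is executed, reference may be made to the embodiments of the business data audit method of the present application, which are not repeated here.
The above are only preferred embodiments of the present application and do not thereby limit its patent scope. Any equivalent structure or equivalent process transformation made using the contents of the specification and drawings of the present application, or any direct or indirect application in other related technical fields, is likewise included within the scope of patent protection of the present application.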

Claims (20)

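  1. A business data audit method, characterized in that the business data audit method includes the following steps:
    acquiring enterprise sample data, constructing a simulated standard moving average in a preset coordinate system according to the enterprise sample data, and obtaining a simulated confidence interval based on the simulated standard moving average;
    when a loan request is received, acquiring the enterprise business data of the borrowing enterprise corresponding to the loan request, and fitting a corresponding enterprise operation line in the preset coordinate system according to the enterprise business data;
    comparing the enterprise operation line with the simulated confidence interval, and determining whether the enterprise business data is abnormal according to the relationship between the enterprise operation line and the simulated confidence interval.
  2. The business data audit method according to claim 1, characterized in that the step of constructing a simulated standard moving average in a preset coordinate system according to the enterprise sample data includes:
    quantizing the enterprise sample data into multi-dimensional sample groups;
    performing analysis learning and weighted calculation on the multi-dimensional sample groups based on a genetic algorithm to obtain corresponding simulated weighted quantities;
    plotting simulated weighted points in the preset coordinate system according to the multi-dimensional sample groups and the simulated weighted quantities, and fitting the simulated weighted points to obtain the simulated standard moving average.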
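  3. The business data audit method according to claim 2, characterized in that the multi-dimensional sample group includes multi-dimensional sample genes of dimension m, and
    the step of performing analysis learning and weighted calculation on the multi-dimensional sample groups based on a genetic algorithm to obtain corresponding simulated weighted quantities includes:
    constructing a simulated weighting equation according to the multi-dimensional sample genes
    $h_\theta(x) = \theta_0 + \theta_1 x_1 + \theta_2 x_2 + \dots + \theta_m x_m$
    where $h_\theta(x)$ is the simulated weighted quantity corresponding to the multi-dimensional sample group, $x_1, x_2, \dots, x_m$ are the sample genes, and $\theta_0, \theta_1, \theta_2, \dots, \theta_m$ are the weighting coefficients;
    converting the simulated weighting equation into the corresponding simulated matrix equation
    Figure PCTCN2018075658-appb-100001
    where $\theta^T$ is the coefficient matrix corresponding to the weighting coefficients;
    constructing the squared loss function corresponding to the simulated matrix equation
    Figure PCTCN2018075658-appb-100002
    where $y^{(i)}$ is the label value of the multi-dimensional sample group;
    performing iterative calculation based on the gradient descent formula and the squared loss function to determine the coefficient matrix $\theta^T$, and calculating the simulated weighted quantity corresponding to the multi-dimensional sample group according to the coefficient matrix $\theta^T$ and the simulated matrix equation.
  4. The business data audit method according to claim 3, characterized in that the gradient descent formula includes
    Figure PCTCN2018075658-appb-100003
    Figure PCTCN2018075658-appb-100004
    where α is the iteration step size.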
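  5. The business data audit method according to claim 2, characterized in that the step of fitting a corresponding enterprise operation line in the preset coordinate system according to the enterprise business data includes:
    quantizing the enterprise business data into multi-dimensional business data groups, and performing weighted calculation on the multi-dimensional business data groups to obtain corresponding business weighted quantities;
    fitting the enterprise operation line in the preset coordinate system according to the multi-dimensional business data groups and the business weighted quantities;
    the step of comparing the enterprise operation line with the simulated confidence interval, and determining whether the enterprise business data is abnormal according to the relationship between the enterprise operation line and the simulated confidence interval, includes:
    selecting corresponding accounting points on the enterprise operation line according to a preset accounting period, and determining whether each accounting point lies in the region outside the simulated confidence interval;
    if an accounting point lies in the region outside the simulated confidence interval, determining that the enterprise business data corresponding to that accounting point is abnormal.
  6. The business data audit method according to claim 1, characterized in that the step of, when a loan request is received, acquiring the enterprise business data of the borrowing enterprise corresponding to the loan request, and fitting a corresponding enterprise operation line in the preset coordinate system according to the enterprise business data, includes:
    when a loan request is received, generating a corresponding data acquisition request according to the authorization included in the loan request;
    sending the data acquisition request to the data management system of the borrowing enterprise to obtain the enterprise business data of the borrowing enterprise;
    when the enterprise business data returned by the data management system is received, fitting the corresponding enterprise operation line in the preset coordinate system according to the enterprise business data.
  7. The business data audit method according to claim 1, characterized in that, after the step of comparing the enterprise operation line with the simulated confidence interval, and determining whether the enterprise business data is abnormal according to the relationship between the enterprise operation line and the simulated confidence interval, the method further includes:
    generating a corresponding data audit report according to the enterprise business data, the enterprise operation line, and the simulated confidence interval, and displaying the data audit report.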
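  8. A business data audit apparatus, characterized in that the business data audit apparatus includes:
    an interval acquisition module, configured to acquire enterprise sample data, construct a simulated standard moving average in a preset coordinate system according to the enterprise sample data, and obtain a simulated confidence interval based on the simulated standard moving average;
    an operation line fitting module, configured to, when a loan request is received, acquire the enterprise business data of the borrowing enterprise corresponding to the loan request, and fit a corresponding enterprise operation line in the preset coordinate system according to the enterprise business data;
    a data judgment module, configured to compare the enterprise operation line with the simulated confidence interval, and determine whether the enterprise business data is abnormal according to the relationship between the enterprise operation line and the simulated confidence interval.
  9. The business data audit apparatus according to claim 8, characterized in that the interval acquisition module also includes:
    a first quantization unit, configured to quantize the enterprise sample data into multi-dimensional sample groups;
    a first calculation unit, configured to perform analysis learning and weighted calculation on the multi-dimensional sample groups based on a genetic algorithm to obtain corresponding simulated weighted quantities;
    a first fitting unit, configured to plot simulated weighted points in the preset coordinate system according to the multi-dimensional sample groups and the simulated weighted quantities, and fit the simulated weighted points to obtain the simulated standard moving average.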
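  10. The business data audit apparatus according to claim 9, characterized in that the multi-dimensional sample group includes multi-dimensional sample genes of dimension m, and the first calculation unit is further configured to:
    construct a simulated weighting equation according to the multi-dimensional sample genes
    $h_\theta(x) = \theta_0 + \theta_1 x_1 + \theta_2 x_2 + \dots + \theta_m x_m$
    where $h_\theta(x)$ is the simulated weighted quantity corresponding to the multi-dimensional sample group, $x_1, x_2, \dots, x_m$ are the sample genes, and $\theta_0, \theta_1, \theta_2, \dots, \theta_m$ are the weighting coefficients;
    convert the simulated weighting equation into the corresponding simulated matrix equation
    Figure PCTCN2018075658-appb-100005
    where $\theta^T$ is the coefficient matrix corresponding to the weighting coefficients;
    construct the squared loss function corresponding to the simulated matrix equation
    Figure PCTCN2018075658-appb-100006
    where $y^{(i)}$ is the label value of the multi-dimensional sample group;
    perform iterative calculation based on the gradient descent formula and the squared loss function to determine the coefficient matrix $\theta^T$, and calculate the simulated weighted quantity corresponding to the multi-dimensional sample group according to the coefficient matrix $\theta^T$ and the simulated matrix equation.
  11. The business data audit apparatus according to claim 10, characterized in that the gradient descent formula includes
    Figure PCTCN2018075658-appb-100007
    Figure PCTCN2018075658-appb-100008
    where α is the iteration step size.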
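  12. The business data audit apparatus according to claim 9, characterized in that the operation line fitting module includes:
    a second calculation unit, configured to quantize the enterprise business data into multi-dimensional business data groups, and perform weighted calculation on the multi-dimensional business data groups to obtain corresponding business weighted quantities;
    a second fitting unit, configured to fit the enterprise operation line in the preset coordinate system according to the multi-dimensional business data groups and the business weighted quantities;
    the data judgment module includes:
    an accounting point selection unit, configured to select corresponding accounting points on the enterprise operation line according to a preset accounting period, and determine whether each accounting point lies in the region outside the simulated confidence interval;
    an abnormality determination unit, configured to determine that the enterprise business data corresponding to an accounting point is abnormal if the accounting point lies in the region outside the simulated confidence interval.
  13. The business data audit apparatus according to claim 8, characterized in that the operation line fitting module includes:
    a request generation unit, configured to, when a loan request is received, generate a corresponding data acquisition request according to the authorization included in the loan request;
    a request sending unit, configured to send the data acquisition request to the data management system of the borrowing enterprise to obtain the enterprise business data of the borrowing enterprise;
    a data receiving unit, configured to, when the enterprise business data returned by the data management system is received, fit the corresponding enterprise operation line in the preset coordinate system according to the enterprise business data.
  14. The business data audit apparatus according to claim 8, characterized in that the business data audit apparatus also includes:
    a report generation module, configured to generate a corresponding data audit report according to the enterprise business data, the enterprise operation line, and the simulated confidence interval, and to display the data audit report.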
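  15. A business data audit device, characterized in that the business data audit device includes a processor, a memory, and a business data audit program stored in the memory and executable by the processor, wherein when the business data audit program is executed by the processor, the following steps are implemented:
    acquiring enterprise sample data, constructing a simulated standard moving average in a preset coordinate system according to the enterprise sample data, and obtaining a simulated confidence interval based on the simulated standard moving average;
    when a loan request is received, acquiring the enterprise business data of the borrowing enterprise corresponding to the loan request, and fitting a corresponding enterprise operation line in the preset coordinate system according to the enterprise business data;
    comparing the enterprise operation line with the simulated confidence interval, and determining whether the enterprise business data is abnormal according to the relationship between the enterprise operation line and the simulated confidence interval.
  16. The business data audit device according to claim 15, characterized in that the step of constructing a simulated standard moving average in a preset coordinate system according to the enterprise sample data includes:
    quantizing the enterprise sample data into multi-dimensional sample groups;
    performing analysis learning and weighted calculation on the multi-dimensional sample groups based on a genetic algorithm to obtain corresponding simulated weighted quantities;
    plotting simulated weighted points in the preset coordinate system according to the multi-dimensional sample groups and the simulated weighted quantities, and fitting the simulated weighted points to obtain the simulated standard moving average.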
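  17. The business data audit device according to claim 16, characterized in that the multi-dimensional sample group includes multi-dimensional sample genes of dimension m, and
    the step of performing analysis learning and weighted calculation on the multi-dimensional sample groups based on a genetic algorithm to obtain corresponding simulated weighted quantities includes:
    constructing a simulated weighting equation according to the multi-dimensional sample genes
    $h_\theta(x) = \theta_0 + \theta_1 x_1 + \theta_2 x_2 + \dots + \theta_m x_m$
    where $h_\theta(x)$ is the simulated weighted quantity corresponding to the multi-dimensional sample group, $x_1, x_2, \dots, x_m$ are the sample genes, and $\theta_0, \theta_1, \theta_2, \dots, \theta_m$ are the weighting coefficients;
    converting the simulated weighting equation into the corresponding simulated matrix equation
    Figure PCTCN2018075658-appb-100009
    where $\theta^T$ is the coefficient matrix corresponding to the weighting coefficients;
    constructing the squared loss function corresponding to the simulated matrix equation
    Figure PCTCN2018075658-appb-100010
    where $y^{(i)}$ is the label value of the multi-dimensional sample group;
    performing iterative calculation based on the gradient descent formula and the squared loss function to determine the coefficient matrix $\theta^T$, and calculating the simulated weighted quantity corresponding to the multi-dimensional sample group according to the coefficient matrix $\theta^T$ and the simulated matrix equation.
  18. The business data audit device according to claim 17, characterized in that the gradient descent formula includes
    Figure PCTCN2018075658-appb-100011
    Figure PCTCN2018075658-appb-100012
    where α is the iteration step size.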
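  19. The business data audit device according to claim 16, characterized in that the step of fitting a corresponding enterprise operation line in the preset coordinate system according to the enterprise business data includes:
    quantizing the enterprise business data into multi-dimensional business data groups, and performing weighted calculation on the multi-dimensional business data groups to obtain corresponding business weighted quantities;
    fitting the enterprise operation line in the preset coordinate system according to the multi-dimensional business data groups and the business weighted quantities;
    the step of comparing the enterprise operation line with the simulated confidence interval, and determining whether the enterprise business data is abnormal according to the relationship between the enterprise operation line and the simulated confidence interval, includes:
    selecting corresponding accounting points on the enterprise operation line according to a preset accounting period, and determining whether each accounting point lies in the region outside the simulated confidence interval;
    if an accounting point lies in the region outside the simulated confidence interval, determining that the enterprise business data corresponding to that accounting point is abnormal.
  20. A computer-readable storage medium, characterized in that the readable storage medium stores a business data audit program, wherein when the business data audit program is executed by a processor, the following steps are implemented:
    acquiring enterprise sample data, constructing a simulated standard moving average in a preset coordinate system according to the enterprise sample data, and obtaining a simulated confidence interval based on the simulated standard moving average;
    when a loan request is received, acquiring the enterprise business data of the borrowing enterprise corresponding to the loan request, and fitting a corresponding enterprise operation line in the preset coordinate system according to the enterprise business data;
    comparing the enterprise operation line with the simulated confidence interval, and determining whether the enterprise business data is abnormal according to the relationship between the enterprise operation line and the simulated confidence interval.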
PCT/CN2018/075658 2017-12-08 2018-02-07 Business data audit method, apparatus, device, and computer-readable storage medium WO2019109523A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201711292700.2 2017-12-08
CN201711292700.2A CN107909472B (zh) 2017-12-08 2017-12-08 Business data audit method, apparatus, device, and computer-readable storage medium

Publications (1)

Publication Number Publication Date
WO2019109523A1 true WO2019109523A1 (zh) 2019-06-13

Family

ID=61853911

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/075658 WO2019109523A1 (zh) 2017-12-08 2018-02-07 Business data audit method, apparatus, device, and computer-readable storage medium

Country Status (2)

Country Link
CN (1) CN107909472B (zh)
WO (1) WO2019109523A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117911179A (zh) * 2024-01-24 2024-04-19 中智薪税技术服务有限公司 Finance and tax data audit method and system

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
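CN108734371A (zh) * 2018-02-12 2018-11-02 阿里巴巴集团控股有限公司 Processing method, apparatus, and device for risk control instructions
CN111489218B (zh) * 2019-01-28 2023-04-18 阿里巴巴集团控股有限公司 Data audit method, apparatus, and device
CN110222957A (zh) * 2019-05-20 2019-09-10 深圳壹账通智能科技有限公司 Data audit method and related device
CN110223082A (zh) * 2019-05-20 2019-09-10 深圳壹账通智能科技有限公司 Data audit method and related device
CN111242773A (zh) * 2020-01-16 2020-06-05 深圳壹账通智能科技有限公司 Docking method and apparatus for virtual resource applications, computer device, and storage medium
CN111882289B (zh) * 2020-07-01 2023-11-14 国网河北省电力有限公司经济技术研究院 Apparatus and method for estimating index intervals for project data audit
CN112241917B (zh) * 2020-10-29 2024-07-16 深圳供电局有限公司 Intelligent pre-loan management method and system for financial institutions
CN112328424B (zh) * 2020-12-03 2022-05-06 之江实验室 Intelligent anomaly detection method and apparatus for numerical data
CN118296062A (zh) * 2024-06-05 2024-07-05 上海银行股份有限公司 Financial health analysis method and apparatus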

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080103855A1 (en) * 2006-10-25 2008-05-01 Robert Hernandez System And Method For Detecting Anomalies In Market Data
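CN105389732A (zh) * 2015-11-30 2016-03-09 安徽融信金模信息技术有限公司 Method for enterprise risk assessment
CN105809195A (zh) * 2016-03-08 2016-07-27 中国银联股份有限公司 Method and apparatus for determining whether a merchant belongs to a specific merchant category
CN106779457A (zh) * 2016-12-29 2017-05-31 深圳微众税银信息服务有限公司 Enterprise credit assessment method and system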

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103761673A (zh) * 2014-01-03 2014-04-30 浙江大唐乌沙山发电有限责任公司 Regression method and system for determining index abnormality
CN106384197A (zh) * 2016-09-13 2017-02-08 北京协力筑成金融信息服务股份有限公司 Big-data-based service quality assessment method and apparatus

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080103855A1 (en) * 2006-10-25 2008-05-01 Robert Hernandez System And Method For Detecting Anomalies In Market Data
CN105389732A (zh) * 2015-11-30 2016-03-09 安徽融信金模信息技术有限公司 一种用于企业风险评估的方法
CN105809195A (zh) * 2016-03-08 2016-07-27 中国银联股份有限公司 用于判别一个商户是否属于特定商户类别的方法和装置
CN106779457A (zh) * 2016-12-29 2017-05-31 深圳微众税银信息服务有限公司 一种企业信用评估方法及系统

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117911179A (zh) * 2024-01-24 2024-04-19 中智薪税技术服务有限公司 一种财税数据审核方法及系统

Also Published As

Publication number Publication date
CN107909472A (zh) 2018-04-13
CN107909472B (zh) 2020-11-03

Similar Documents

Publication Publication Date Title
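WO2019109523A1 (zh) Business data audit method, apparatus, device, and computer-readable storage medium
WO2019080407A1 (zh) Credit assessment method, apparatus, device, and computer-readable storage medium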
Verbraken et al. Development and application of consumer credit scoring models using profit-based classification measures
US10152752B2 (en) Methods and systems for computing trading strategies for use in portfolio management and computing associated probability distributions for use in option pricing
JP2015222596A (ja) System and method for predicting frequencies associated with future losses and for related automated processing of loss determination units
US20130226830A1 (en) System and method for transactional risk and return analysis
US20190244299A1 (en) System and method for evaluating decision opportunities
US20130144656A1 (en) Systems and methods to intelligently determine insurance information based on identified businesses
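WO2016084642A1 (ja) Credit screening server, credit screening system, and credit screening program
CN117391292A (zh) Carbon emission and energy-saving management analysis system and method
CN113095800A (zh) Base station construction investment approval method, apparatus, device, and storage medium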
Shi et al. Long-tail longitudinal modeling of insurance company expenses
Mottaeva et al. Optimizing the resultativeness of adapting an economic entity to the conditions of digitalization
McIsaac Testing Goodwin with a stochastic differential approach—The United States (1948–2019)
US20140344020A1 (en) Competitor pricing strategy determination
CN113506023A (zh) Work behavior data analysis method, apparatus, device, and storage medium
US20140344021A1 (en) Reactive competitor price determination using a competitor response model
JP2003036346A (ja) Operational risk assessment method and system
Chou et al. Estimating software project effort for manufacturing firms
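KR101334891B1 (ko) System for providing a financial risk management service in a SaaS environment
JP2003036343A (ja) Operational risk management method and system
WO2021017284A1 (zh) Anomaly detection method based on cortical learning, apparatus, terminal device, and storage medium
CN111768282A (zh) Data analysis method, apparatus, device, and storage medium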
Torresetti et al. Scaling operational loss data and its systemic risk implications
US12045261B2 (en) Method and apparatus for measuring material risk in a data set

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18885059

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 30.11.2020)

122 Ep: pct application non-entry in european phase

Ref document number: 18885059

Country of ref document: EP

Kind code of ref document: A1