CN112488572B

CN112488572B - Audit object recommendation method, device, equipment and medium

Info

Publication number: CN112488572B
Application number: CN202011473645.9A
Authority: CN
Inventors: 黄妙红; 何胜; 王珏; 肖嘉丽
Original assignee: Guangdong Power Grid Co Ltd
Current assignee: Guangdong Power Grid Co Ltd
Priority date: 2020-12-15
Filing date: 2020-12-15
Publication date: 2023-04-07
Anticipated expiration: 2040-12-15
Also published as: CN112488572A

Abstract

The application discloses a recommendation method, a recommendation device, equipment and a medium for an audit object, wherein the method comprises the following steps: acquiring first audit data of a plurality of first audit objects; inputting the first audit data into a risk prediction model for risk level prediction to obtain the risk level of each first audit object; and selecting the first audit object with the highest risk level for recommendation to obtain a recommended audit object. The method and the device solve the technical problem that in the prior art, the audit data are retrieved, counted and analyzed manually to determine the recommended audit object, so that the efficiency is low.

Description

Audit object recommendation method, device, equipment and medium

Technical Field

The present application relates to the field of auditing technologies, and in particular, to a method, an apparatus, a device, and a medium for recommending an audit object.

Background

At present, three audit modes (pre-event audit, in-process audit and post-event audit) are commonly adopted when carrying out an audit project. Business data is then retrieved, counted and analyzed according to the requirements of the audit project, so as to form clue evidence for audit doubts and provide a factual basis for the audit work.

In the prior art, the recommended audit object is determined by manually retrieving, counting and analyzing the audit data, so that the technical problem of low efficiency exists.

Disclosure of Invention

The application provides an audit object recommendation method, device, equipment and medium, which are used for solving the technical problem of low efficiency in the prior art that audit data are manually retrieved, counted and analyzed to determine recommended audit objects.

In view of this, a first aspect of the present application provides an audit object recommendation method, including:

acquiring first audit data of a plurality of first audit objects;

inputting the first audit data into a risk prediction model for risk level prediction to obtain the risk level of each first audit object;

and selecting the first audit object with the highest risk level for recommendation to obtain a recommended audit object.

Optionally, the configuration method of the risk prediction model includes:

acquiring second audit data of a second audit object;

performing risk scoring on the second auditing object based on the second auditing data to obtain a risk value corresponding to the second auditing object;

performing risk level marking on the second auditing object based on the risk value corresponding to the second auditing object to obtain an auditing label;

and training a machine learning model through the second audit data and the audit tag corresponding to the second audit object to obtain the risk prediction model.

Optionally, the acquiring second audit data of the second audit object further includes:

performing grade division on the services in the second audit object to obtain a first-grade service, a second-grade service and a third-grade service;

correspondingly, the acquiring second audit data of the second audit object includes:

and acquiring second audit data corresponding to the third-level service.

Optionally, the performing risk scoring on the second audit object based on the second audit data to obtain a risk value corresponding to the second audit object includes:

performing risk scoring on the third-level service based on second audit data corresponding to the third-level service to obtain a risk value corresponding to the third-level service;

sequentially calculating the risk value of the secondary service and the risk value of the primary service based on the risk values corresponding to the tertiary services;

and calculating the risk value of the second audit object based on the risk value of the primary service.

Optionally, the training a machine learning model through the second audit data and the audit tag corresponding to the second audit object to obtain the risk prediction model includes:

inputting the second audit data and the audit tag corresponding to the second audit object into a machine learning model, and outputting a risk level predicted value corresponding to the second audit object;

calculating a preset index value through the audit tag of the second audit object and the risk level predicted value, wherein the preset index value comprises model accuracy rate, model hit rate or model recall rate;

and verifying the machine learning model based on the preset index value, and taking the machine learning model passing the verification as the risk prediction model.

Optionally, the inputting the first audit data into a risk prediction model for risk level prediction to obtain the risk level of each first audit object further includes:

and preprocessing the first audit data.

The second aspect of the present application provides an audit object recommendation apparatus, including:

the acquisition unit is used for acquiring first audit data of a plurality of first audit objects;

the prediction unit is used for inputting the first auditing data into a risk prediction model to carry out risk level prediction to obtain the risk level of each first auditing object;

and the selecting unit is used for selecting the first audit object with the highest risk level to recommend to obtain a recommended audit object.

Optionally, the apparatus further includes: a configuration unit for configuring the risk prediction model;

the configuration unit specifically includes:

the obtaining subunit is used for obtaining second audit data of a second audit object;

the scoring subunit is configured to perform risk scoring on the second audit object based on the second audit data to obtain a risk value corresponding to the second audit object;

the labeling subunit is used for performing risk level labeling on the second audit object based on the risk value to obtain an audit tag;

and the training subunit is used for training a machine learning model through the second audit data corresponding to the second audit object and the audit tag to obtain the risk prediction model.

A third aspect of the application provides an audit object recommendation device, which includes a processor and a memory;

the memory is used for storing program codes and transmitting the program codes to the processor;

the processor is configured to execute the audit object recommendation method of any one of the first aspect according to instructions in the program code.

A fourth aspect of the present application provides a computer-readable storage medium for storing program code for executing the audit object recommendation method of any one of the first aspects.

According to the technical scheme, the method has the following advantages:

the application provides an audit object recommendation method, which comprises the following steps: acquiring first audit data of a plurality of first audit objects; inputting the first audit data into a risk prediction model for risk level prediction to obtain the risk level of each first audit object; and selecting the first audit object with the highest risk level for output to obtain a recommended audit object.

In the embodiment of the application, the acquired first audit data of the first audit object is input into the risk prediction model to predict the risk level, then the first audit object with the highest risk level is selected as the recommended audit object, and the audit data is automatically processed and analyzed through the risk prediction model without excessive manual interference, so that the efficiency is improved, and the technical problem of low efficiency existing in the prior art that the recommended audit object is determined by manually retrieving, counting and analyzing the audit data is solved.

Drawings

In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the description below are only some embodiments of the present application, and for those skilled in the art, other drawings can be obtained according to the drawings without inventive labor.

Fig. 1 is a schematic flowchart of an audit object recommendation method according to an embodiment of the present application;

Fig. 2 is a topological diagram of an audit unit according to an embodiment of the present application;

Fig. 3 is a schematic structural diagram of an audit object recommendation apparatus according to an embodiment of the present application.

Detailed Description

In order to make the technical solutions of the present application better understood, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.

For easy understanding, please refer to fig. 1, an embodiment of an audit object recommendation method provided in the present application includes:

step 101, obtaining first audit data of a plurality of first audit objects.

In the embodiment of the present application, a plurality of units or departments, such as a municipal bureau, a district or county bureau, a financial department, a communication center and a substation, may be used as the first audit objects, and the first audit data may be the business data of each department or unit in the first audit objects. The business data may be part or all of the data extracted from a source data system. Data extraction is divided into full extraction and incremental extraction; different implementations have different extraction efficiency. The incremental extraction modes are as follows:

1. Timestamp mode. This is a change data capture mode based on snapshot comparison: a timestamp column is added to the source table, and whenever data in the table is updated, the value of the timestamp column is modified at the same time. During extraction, the changed data to be extracted is determined by comparing the system time with the timestamp column value, which realizes incremental extraction. The timestamp mode performs well and extraction is relatively simple; its drawback is that delete and update operations on data older than the timestamp cannot be captured, so data accuracy is limited to some extent.

2. Log table mode. Changed data is identified by analyzing the online log of the database. Data can be extracted while insert, update or delete operations are performed on the source data table: the changed data is stored in a log table, captured in this way, and then provided to the target system as a view. This mode is adopted by the materialized views provided by Oracle and by third-party data replication tools such as DSG and GoldenGate TDM. Its advantage is high extraction performance; its drawback is that both the data table and the log table must be modified during data operations, which affects the performance of the business system to some extent.

3. Full-table comparison mode. A temporary table with a similar structure is created in advance for the table to be extracted; it records the primary key of the source table and a check code calculated from the column data. At each extraction, the source table is compared with the temporary table to determine whether each source record has undergone an insert, update or delete operation. This mode has little impact on the source system, but its performance is poor, and its accuracy is low when the table has no primary key or unique column and contains duplicate records.

4. Trigger mode. Insert, update and delete triggers are created on the source data table. When the source data changes, the corresponding trigger writes the changed data into a temporary table; an extraction thread then reads from the temporary table and marks or deletes the rows it has extracted. For example, inforEAI uses this mode to implement incremental extraction, and it is currently used for data concentration in the export tax refund audit system of the national tax system of our province. This mode has high extraction efficiency, but because triggers must be created on the business tables, it affects the performance and security of the business system to some extent.
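As an illustration of the full-table comparison mode described above, the following is a minimal sketch rather than the implementation used in the application. It assumes that rows are held as Python dictionaries, that the check code is an MD5 digest of the column values, and that the column name `id` and the helper names `check_code`, `snapshot` and `diff` are hypothetical.

```python
import hashlib


def check_code(row: dict) -> str:
    """Check code calculated from the column data of one row (MD5 is an assumption)."""
    joined = "|".join(f"{name}={row[name]}" for name in sorted(row))
    return hashlib.md5(joined.encode("utf-8")).hexdigest()


def snapshot(rows: list, key: str) -> dict:
    """Temporary table: primary key of the source table -> check code."""
    return {row[key]: check_code(row) for row in rows}


def diff(previous: dict, source_rows: list, key: str) -> dict:
    """Compare the source table with the temporary table and classify the changes."""
    current = snapshot(source_rows, key)
    return {
        "insert": sorted(k for k in current if k not in previous),
        "update": sorted(k for k in current if k in previous and current[k] != previous[k]),
        "delete": sorted(k for k in previous if k not in current),
    }


# Hypothetical source table at the previous and the current extraction.
old_rows = [{"id": 1, "amount": 100}, {"id": 2, "amount": 250}, {"id": 3, "amount": 75}]
new_rows = [{"id": 1, "amount": 100}, {"id": 2, "amount": 300}, {"id": 4, "amount": 40}]

temp_table = snapshot(old_rows, "id")
changes = diff(temp_table, new_rows, "id")
print(changes)
```

In this sketch only the changed keys are reported; an extraction job would then fetch the corresponding rows and refresh the temporary table with the new snapshot.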

Furthermore, after the first audit data is acquired through data extraction, the first audit data can be preprocessed, including data cleaning, data correlation analysis or data conversion and the like, so that the data quality is improved, and the accuracy of a prediction result is improved.

Data cleansing is the discovery and correction of recognizable errors in data files, including checking data consistency, processing invalid values and missing values, and the like, as follows:

(1) Consistency check

The consistency check verifies whether the data meets requirements according to the reasonable value range of each variable and the relationships among variables, and identifies data that is out of the normal range, logically unreasonable or mutually contradictory. For example, a value of 0 for a variable measured on a 1-7 scale, or a negative weight, should be regarded as outside the normal range. Software such as SPSS, SAS and Excel can automatically identify each out-of-range value according to the defined value ranges. Logically inconsistent answers may appear in a variety of forms.
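The range part of the consistency check can be sketched as follows. This is only an illustrative example: the variable names, the value ranges in `VALUE_RANGES` and the function `out_of_range` are assumptions and do not come from the application.

```python
# Hypothetical value ranges: a 1-7 scale and a non-negative weight.
VALUE_RANGES = {"satisfaction": (1, 7), "weight_kg": (0.0, 500.0)}


def out_of_range(records, ranges):
    """Return (record index, variable, value) for every value outside its defined range."""
    problems = []
    for index, record in enumerate(records):
        for name, (low, high) in ranges.items():
            value = record.get(name)
            if value is not None and not (low <= value <= high):
                problems.append((index, name, value))
    return problems


records = [
    {"satisfaction": 5, "weight_kg": 62.0},
    {"satisfaction": 0, "weight_kg": 70.5},   # 0 on a 1-7 scale
    {"satisfaction": 7, "weight_kg": -3.0},   # negative weight
]

flagged = out_of_range(records, VALUE_RANGES)
print(flagged)
```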

(2) Processing invalid and missing values

Because of errors in surveying, coding and data entry, the data may contain invalid values and missing values that need appropriate treatment. Common treatments are estimation, whole-case deletion, variable deletion and pairwise deletion.

a. Estimation. The simplest way is to replace invalid and missing values of a variable with its sample mean, median or mode. This is simple, but it does not fully use the information already in the data, and the error may be large. Another way is to estimate the value from the respondent's answers to other questions, through correlation analysis or logical inference between variables. For example, possession of a product may be related to household income, so the likelihood of possession can be inferred from the respondent's household income.

b. Whole-case deletion. Samples containing missing values are removed. Since many samples may contain missing values, this can greatly reduce the effective sample size, and the collected data cannot be fully utilized. It is therefore suitable only when a key variable is missing or when samples containing invalid or missing values make up a small proportion.

c. Variable deletion. If a variable has many invalid and missing values and is not particularly important to the problem under study, deleting the variable may be considered. This reduces the number of variables available for analysis but does not change the sample size.

d. Pairwise deletion. Invalid and missing values are represented by a special code (usually 9, 99, 999, etc.) while all variables and samples in the data set are retained. In each specific calculation only samples with complete answers are used, so the effective sample size differs between analyses that involve different variables. This is a conservative approach that preserves the available information in the data set to the greatest extent.

The use of different processing methods may have an impact on the analysis results, especially when the occurrence of missing values is not random and there is a significant correlation between the variables. Different processing modes can be selected in practical situations.
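The four treatments can be compared on a small example. The sketch below is illustrative only: it marks missing values with `None` instead of a special code, uses the sample mean for estimation, and all data and function names are hypothetical.

```python
from statistics import mean

# Hypothetical samples; None marks an invalid or missing value.
samples = [
    {"income": 50.0, "spend": 20.0, "age": 30},
    {"income": None, "spend": 25.0, "age": 41},
    {"income": 70.0, "spend": None, "age": 35},
    {"income": 60.0, "spend": 30.0, "age": None},
]


def impute_mean(rows, variable):
    """Estimation: replace missing values of one variable with its sample mean."""
    fill = mean(r[variable] for r in rows if r[variable] is not None)
    return [{**r, variable: fill if r[variable] is None else r[variable]} for r in rows]


def drop_incomplete_cases(rows):
    """Whole-case deletion: keep only samples without any missing value."""
    return [r for r in rows if all(v is not None for v in r.values())]


def drop_variable(rows, variable):
    """Variable deletion: remove one variable, keeping every sample."""
    return [{k: v for k, v in r.items() if k != variable} for r in rows]


def pairwise_cases(rows, variables):
    """Pairwise deletion: for one analysis, use the samples complete on its variables."""
    return [r for r in rows if all(r[v] is not None for v in variables)]


imputed = impute_mean(samples, "income")
complete = drop_incomplete_cases(samples)
without_age = drop_variable(samples, "age")
income_spend = pairwise_cases(samples, ["income", "spend"])
income_age = pairwise_cases(samples, ["income", "age"])
print(len(complete), len(income_spend), len(income_age))
```

Whole-case deletion leaves a single sample here, while each pairwise analysis still uses two, which shows why the effective sample size depends on the variables involved.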

Because the data in the database is a subject-oriented collection that is extracted from multiple business systems and contains historical data, erroneous data and conflicts among data are unavoidable. Such erroneous or conflicting data is "dirty data", and washing out the dirty data according to certain rules is data cleaning. Data cleaning mainly removes incomplete data, duplicate data and erroneous data.

The general process of data cleansing is:

(1) Data analysis: to obtain clean data, the data must be analyzed in detail, including its format and categories, for example the field type, width and meaning of the collected financial data.

(2) Schema conversion: schema conversion mainly refers to mapping the source data into the target data model, such as the conversion of attributes, the constraint conditions of fields, and the mapping and conversion between data sets in a database. Sometimes several data tables need to be merged into one two-dimensional table, and sometimes one data table needs to be split into several two-dimensional tables to solve the problem at hand.

(3) Data checking: whether the schema conversion of the previous step is feasible must be evaluated and tested; repeated analysis, design, calculation and analysis allow the data to be cleaned better. Without data checking, some erroneous data may not be obvious and cannot be screened out well. For example, when a data set is decomposed into several data tables during schema conversion, the primary key values of the parent table and the foreign key values of the child table may become inconsistent, producing orphan records, which affects the correctness of the auditor's audit evidence and, in turn, the correctness of the audit conclusion.

(4) Data backflow: dirty data in the original data source is replaced with clean data, so that the cleaning does not have to be redone at the next data acquisition.
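The orphan-record problem mentioned under data checking can be detected with a simple key comparison. The following is a minimal sketch under stated assumptions: the tables are lists of dictionaries, and the table and column names (`contracts`, `payments`, `contract_id`) are hypothetical.

```python
def find_orphans(parent_rows, child_rows, parent_key, foreign_key):
    """Return child rows whose foreign key value has no matching parent primary key."""
    parent_keys = {row[parent_key] for row in parent_rows}
    return [row for row in child_rows if row[foreign_key] not in parent_keys]


# Hypothetical parent and child tables produced by a schema conversion.
contracts = [{"contract_id": "C1"}, {"contract_id": "C2"}]
payments = [
    {"payment_id": 1, "contract_id": "C1"},
    {"payment_id": 2, "contract_id": "C3"},  # no such contract: an orphan record
]

orphans = find_orphans(contracts, payments, "contract_id", "contract_id")
print(orphans)
```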

Step 102, inputting the first audit data into a risk prediction model for risk level prediction to obtain the risk level of each first audit object.

The risk prediction model is a trained model used for risk level prediction; the first audit data is input into it to obtain the risk level of each first audit object.

Further, the configuration method of the risk prediction model comprises the following steps:

1. Acquiring second audit data of the second audit object.

The second audit object may be the same as the first audit object, and the second audit data is historical business data. The second audit objects include multi-service audit objects and single-service audit objects. Multi-service audit objects include local bureau-type units such as municipal bureaus, district and county bureaus, and power transmission stations. Single-service audit objects include more than 50 other departments and units, such as a news center, an assessment center, a comprehensive service center, an information center, a financial sharing center, a project center, a planning center, a logistics center, a financial department, a marketing department and a personnel department; the seven major services carried out by these departments and units (marketing management, material management, financial management, investment planning, project management, infrastructure projects and safety production) can also serve as single-service audit objects.

And grade division can be carried out on the services in the second audit object to obtain a first-grade service, a second-grade service and a third-grade service, and further obtain second audit data corresponding to the third-grade service. The first-level service, the second-level service and the third-level service can be divided according to actual conditions. For example, a primary business may include marketing management, materials management, financial management, asset management, contract management, engineering and project management; the secondary business may include electricity price execution, business expansion management, electricity fee accounting, depreciation management, cost fee, capital management, budget management, design reconnaissance, design change, acceptance management, investment planning, settlement management, implementation management, cost management, contract review, contract signing, and the like.

2. Performing risk scoring on the second audit object based on the second audit data to obtain a risk value corresponding to the second audit object;

performing risk scoring on the third-level services based on second audit data corresponding to the third-level services to obtain risk values corresponding to the third-level services, and performing risk scoring according to common standards in the field to obtain risk values corresponding to each third-level service; sequentially calculating the risk value of the second-level service and the risk value of the first-level service based on the risk values corresponding to the third-level services; and calculating the risk value of the second auditing object based on the risk value of the primary service.

The first-level service, the second-level service and the third-level service are cascaded from top to bottom according to the corresponding grading rule of the audit object, namely, each score of the first-level service classification is obtained from the grading of the second-level service classification after weight calculation, and each score of the second-level service classification is obtained from each model in the third-level service classification after weight calculation.

For an example of a certain audit unit, please refer to fig. 2, where fig. 2 only gives a first-level service and a second-level service as an example. And calculating the risk value of the secondary service through the risk value of the tertiary service to obtain the risk value of the secondary service, performing weighted summation on the risk values of meter reading management, electric charge accounting, electric price execution and business expansion management in the secondary service to obtain the risk value of marketing management in the primary service, and correspondingly, performing weighted summation on the risk value of each primary service to obtain the risk value of an audit unit. Fig. 2 shows an example of the weight parameters of the primary service and the secondary service, and other weight parameter settings may also be set according to actual situations, which are not specifically limited herein.
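The cascaded weighted summation can be sketched as a recursive roll-up over the service hierarchy. The example below is illustrative only: the weights and tertiary risk values are invented, the tertiary service names are placeholders, and the nested-dictionary layout and the function `risk_value` are assumptions rather than the structure shown in Fig. 2.

```python
def risk_value(node):
    """Risk value of a node: its own score for a tertiary service (leaf),
    otherwise the weighted sum of the risk values of its children."""
    if "score" in node:
        return node["score"]
    return sum(child["weight"] * risk_value(child) for child in node["children"])


# Hypothetical audit unit with primary, secondary and tertiary services.
audit_unit = {
    "name": "audit unit",
    "children": [
        {"name": "marketing management", "weight": 0.6, "children": [
            {"name": "meter reading management", "weight": 0.5, "children": [
                {"name": "tertiary service A", "weight": 0.5, "score": 80.0},
                {"name": "tertiary service B", "weight": 0.5, "score": 60.0},
            ]},
            {"name": "electricity fee accounting", "weight": 0.5, "children": [
                {"name": "tertiary service C", "weight": 1.0, "score": 40.0},
            ]},
        ]},
        {"name": "financial management", "weight": 0.4, "children": [
            {"name": "budget management", "weight": 1.0, "children": [
                {"name": "tertiary service D", "weight": 1.0, "score": 50.0},
            ]},
        ]},
    ],
}

marketing = risk_value(audit_unit["children"][0])  # 0.5 * 70 + 0.5 * 40
unit_risk = risk_value(audit_unit)                 # 0.6 * 55 + 0.4 * 50
print(marketing, unit_risk)
```

The weights of the children of each node sum to 1 in this sketch, so every level stays on the same scale as the tertiary scores.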

3. And carrying out risk level marking on the second audit object based on the risk value corresponding to the second audit object to obtain an audit tag.

The risk level of the second audit object is marked according to the magnitude of its risk value. The risk levels may be divided into 4 levels (extremely high risk, high risk, medium risk, low risk) or into 5 levels, and so on; the division can be set flexibly according to actual needs and is not specifically limited herein. The risk value interval corresponding to each risk level may also be set according to actual conditions, and is not described herein again.
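Mapping a risk value to one of four risk levels can be sketched with interval boundaries. The boundaries in `LEVEL_BOUNDS` are purely hypothetical, since the application leaves the intervals to be set according to actual conditions; the function name `risk_level` is likewise an assumption.

```python
from bisect import bisect_right

# Hypothetical lower bounds of the medium, high and extremely high intervals.
LEVEL_BOUNDS = [40.0, 60.0, 80.0]
LEVEL_NAMES = ["low (1)", "medium (2)", "high (3)", "extremely high (4)"]


def risk_level(value):
    """Return the audit label whose risk value interval contains the given value."""
    return LEVEL_NAMES[bisect_right(LEVEL_BOUNDS, value)]


labels = {name: risk_level(value) for name, value in
          {"unit A": 53.0, "unit B": 80.0, "unit C": 39.9}.items()}
print(labels)
```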

4. And training the machine learning model through second audit data and audit tags corresponding to the second audit object to obtain a risk prediction model.

Inputting second audit data and audit tags corresponding to the second audit object into the machine learning model, and outputting a risk level predicted value corresponding to the second audit object; calculating a preset index value through an audit tag and a risk level predicted value of a second audit object, wherein the preset index value comprises model accuracy, model hit rate or model recall rate; and verifying the machine learning model based on a preset index value, and taking the machine learning model passing the verification as a risk prediction model.

The machine learning model can be a clustering model obtained by the K-means clustering algorithm. K-means is a commonly used clustering algorithm and belongs to unsupervised learning: given only the target number of clusters, it automatically aggregates the data into that number of clusters, such that data within the same cluster is highly similar and data in different clusters is less similar. The K-means algorithm is efficient and can complete cluster analysis of 10000 samples with 50 dimensions within 10 seconds. In the embodiment of the application, the K-means algorithm is illustrated in Python.

For a given sample set D, the algorithm performs clustering-based mining analysis. Taking characteristic factors of 2 dimensions as an example:

1. Four high-risk sample points are selected from the sample set D by manual judgment to serve as the audit object risk reference set, and the data set D is imported.

import pandas as pd

from sklearn.cluster import KMeans  # import the K-means clustering algorithm

import matplotlib.pyplot as plt

reference = [[a1, b1], [a2, b2], [a3, b3], [a4, b4]]  # the 4 manually selected high-risk reference points

inputfile = '/data.xlsx'  # data file to be clustered

iteration = 250  # maximum number of iterations of the cluster analysis

data = pd.read_excel(inputfile)  # read data set D

2. The initial target cluster number is set to k = 8.

k = 8  # target number of clusters

3. Cluster analysis is performed to obtain stable result clusters.

kmodel = KMeans(n_clusters=k, n_jobs=4)  # call the K-means algorithm for cluster analysis (n_jobs is accepted only by scikit-learn releases before 1.0)

kmodel.fit(data)  # fit the model so that labels_ and cluster_centers_ are available

r1 = pd.Series(kmodel.labels_).value_counts()  # count the number of samples in each cluster

r2 = pd.DataFrame(kmodel.cluster_centers_)  # find the cluster centers

# concatenate horizontally (axis=0 would be vertical) to get the sample count for each cluster center

r = pd.concat([r2, r1], axis=1)

4. Among the result clusters, the cluster whose center is closest to the reference set is found and labeled as "extremely high (4)" risk.

# calculate the Euclidean distance between two pattern samples

# find the cluster with the shortest distance

# output the class corresponding to each sample in detail

r = data[index]

r.to_excel(outputfile)  # save the classification results and mark the corresponding risk level

5. Steps 2-3 are repeated with k = 4 and k = 2, and the risks are labeled "high (3)" and "medium (2)", respectively.

6. The remaining samples that have no risk label are labeled "low (1)".
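The distance computation of step 4 is only outlined in comments in the listing above. One possible reading is sketched below; it is an assumption, not the application's code. It measures the distance from a cluster center to the reference set as the mean Euclidean distance to the reference points, and the names `closest_cluster` and `label_cluster` as well as all coordinates are hypothetical.

```python
from math import dist
from statistics import mean


def closest_cluster(centers, reference):
    """Index of the cluster center with the smallest mean Euclidean distance
    to the points of the reference set (one possible distance definition)."""
    def distance_to_reference(center):
        return mean(dist(center, point) for point in reference)
    return min(range(len(centers)), key=lambda i: distance_to_reference(centers[i]))


def label_cluster(labels, cluster, risk_label, existing):
    """Give risk_label to the still unlabeled samples that belong to the given cluster."""
    return [risk_label if lab == cluster and old is None else old
            for lab, old in zip(labels, existing)]


# Hypothetical 2-dimensional cluster centers, reference set and cluster labels.
centers = [(0.0, 0.0), (5.0, 5.0), (9.0, 9.0)]
reference = [(8.0, 9.0), (9.0, 8.0), (10.0, 10.0), (9.0, 10.0)]
cluster_labels = [0, 2, 1, 2, 0]

nearest = closest_cluster(centers, reference)
risk = label_cluster(cluster_labels, nearest, "extremely high (4)", [None] * 5)
final = [lab if lab is not None else "low (1)" for lab in risk]
print(nearest, final)
```

For brevity the sketch skips the intermediate rounds with k = 4 and k = 2; in the procedure above, `label_cluster` would be called again after each round before the remaining samples are labeled "low (1)".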

Usually, a trained model cannot be applied directly; it must be verified before application. Indexes commonly used to evaluate model quality are the model accuracy rate, the model hit rate and the model recall rate. The present application explains these indexes using binary classification as an example. Assuming that the trained machine learning model has only two possible predicted results, 1 and 0, the relationship between the actual value (label) and the predicted value is shown in Table 1.

TABLE 1 relationship between predicted values and actual values

(1) Model accuracy

The model accuracy is used for describing the overall prediction accuracy of a model, and the calculation formula is as follows:

(2) Model hit rate

For many prediction problems with a flag-type target variable, the overall accuracy of the model is not the main concern. The model hit rate is therefore introduced as a further index; it reflects the accuracy of the list produced by the prediction result, and the calculation formula is as follows:

In IBM SPSS Modeler, if the coincidence matrix (for a character-type target variable) is selected in the Analysis node, a cross table of predicted and actual values is obtained, but the model hit rate cannot be read from it, so a Matrix node is usually used to view the result.

(3) Model recall rate

Looking at the hit rate alone cannot guarantee the performance of the model, so the model recall rate is introduced. This index is also called the model coverage rate and mainly reflects the coverage of the model; the calculation formula is as follows:

In IBM SPSS Modeler, if the coincidence matrix (for a character-type target variable) is selected in the Analysis node, a cross table of predicted and actual values is obtained, but the model recall rate cannot be read from it, so a Matrix node is usually used to view the result.

By calculating the index value, when the model accuracy rate, the model recall rate or the model hit rate of the machine learning model reaches a preset threshold value, the machine learning model is judged to pass the verification to obtain a risk prediction model, otherwise, the machine learning model is judged to fail the verification, and the training is continued.
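Table 1 and the three formulas are not reproduced in this text, so the sketch below relies on the standard binary-classification definitions, which is an assumption: accuracy = (TP + TN) / all samples, hit rate = TP / (TP + FP) (commonly called precision), and recall = TP / (TP + FN). The threshold values and function names are hypothetical.

```python
def confusion_counts(actual, predicted):
    """Count true/false positives and negatives for labels 1 and 0."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)
    return tp, fp, fn, tn


def ratio(numerator, denominator):
    """Safe division that returns 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def index_values_for(actual, predicted):
    tp, fp, fn, tn = confusion_counts(actual, predicted)
    return {
        "accuracy": ratio(tp + tn, tp + fp + fn + tn),  # overall prediction accuracy
        "hit_rate": ratio(tp, tp + fp),                 # share of predicted 1s that are actual 1s
        "recall": ratio(tp, tp + fn),                   # share of actual 1s that are predicted 1
    }


def passes_validation(values, thresholds):
    """The model passes when every selected index reaches its preset threshold."""
    return all(values[name] >= threshold for name, threshold in thresholds.items())


actual = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]      # audit tags (hypothetical)
predicted = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]   # model output (hypothetical)

index_values = index_values_for(actual, predicted)
passed = passes_validation(index_values, {"accuracy": 0.75, "recall": 0.7})
print(index_values, passed)
```

Which indexes are checked is controlled by the keys of the thresholds dictionary, so a single index or any combination can be used as the preset index value.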

Step 103, selecting the first audit object with the highest risk level for recommendation to obtain a recommended audit object.

And processing the first audit data through a risk prediction model to obtain the risk level of each first audit object, selecting the first audit object with the highest risk level to recommend to obtain a recommended audit object, and further auditing the recommended audit object.
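The selection in step 103 can be sketched as follows. It assumes that the predicted risk levels are encoded as integers from 1 (low) to 4 (extremely high) and that all audit objects tied at the highest level are recommended; both points, together with the names used, are assumptions for illustration.

```python
def recommend(risk_levels):
    """Return the audit objects whose predicted risk level is the highest."""
    highest = max(risk_levels.values())
    return sorted(name for name, level in risk_levels.items() if level == highest)


# Hypothetical output of the risk prediction model for four first audit objects.
predicted_levels = {
    "municipal bureau": 3,
    "county bureau": 4,
    "financial department": 2,
    "substation": 4,
}

recommended = recommend(predicted_levels)
print(recommended)
```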

In the embodiment of the application, the acquired first audit data of the first audit objects is input into the risk prediction model to predict the risk levels, and the first audit object with the highest risk level is then selected as the recommended audit object. The audit data is processed and analyzed automatically by the risk prediction model without excessive manual intervention, which improves efficiency and solves the technical problem of low efficiency in the prior art, where the recommended audit object is determined by manually retrieving, counting and analyzing the audit data.

The above is an embodiment of an audit object recommendation method provided by the present application, and the following is an embodiment of an audit object recommendation apparatus provided by the present application.

Referring to fig. 3, an audit object recommendation apparatus provided in an embodiment of the present application includes:

an obtaining unit 301, configured to obtain first audit data of a plurality of first audit objects;

the prediction unit 302 is configured to input the first audit data into a risk prediction model to perform risk level prediction, so as to obtain a risk level of each first audit object;

and the selecting unit 303 is configured to select the first audit object with the highest risk level for recommendation, so as to obtain a recommended audit object.

As a further improvement, the apparatus further includes: a configuration unit 304, configured to configure the risk prediction model;

the configuration unit 304 specifically includes:

a scoring subunit, configured to perform risk scoring on the second audit object based on the second audit data to obtain a risk value corresponding to the second audit object;

and a training subunit, configured to train a machine learning model with the second audit data and the audit tags corresponding to the second audit object to obtain the risk prediction model.

As a further improvement, the configuration unit 304 further includes:

a dividing subunit, configured to grade the services in the second audit object to obtain first-level services, second-level services and third-level services.

Correspondingly, the obtaining subunit is specifically configured to obtain the second audit data corresponding to the third-level services.

As a further improvement, the scoring subunit is specifically configured to:

calculate the risk value of the second audit object based on the risk value of the first-level service.
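A roll-up of risk values through the service hierarchy, from third-level services up to the first-level (primary) service and then to the audit object, might look like the following sketch. The aggregation rule is an assumption: a weighted mean is used purely for illustration, the weights are hypothetical, and `roll_up` is not a name taken from the application.

```python
from typing import List, Tuple, Union

# A node is either a leaf risk value (a third-level service) or a list of
# (weight, child) pairs (a second-level service, a first-level service, or
# the audit object itself).
Node = Union[float, List[Tuple[float, "Node"]]]


def roll_up(node: Node) -> float:
    """Aggregate risk values bottom-up.

    ASSUMPTION: each level is the weighted mean of its children.
    """
    if isinstance(node, (int, float)):
        return float(node)
    total_weight = sum(weight for weight, _ in node)
    if total_weight <= 0:
        raise ValueError("weights at each level must sum to a positive value")
    return sum(weight * roll_up(child) for weight, child in node) / total_weight
```

With this structure, the risk value of the audit object is `roll_up` applied to the list of its first-level services, each of which is in turn a list of second-level services.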

As a further improvement, the apparatus further includes:

a preprocessing unit 305, configured to preprocess the first audit data.
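Claim 1 states that the preprocessing comprises data cleaning, data correlation analysis or data conversion. A minimal sketch of the cleaning and conversion parts is given below. It assumes that raw audit records arrive as dictionaries of strings, that incomplete or non-numeric records are dropped, and that conversion means parsing numeric strings into floats; the function name `preprocess` and the field names in the test are hypothetical.

```python
from typing import Dict, List, Optional


def preprocess(records: List[Dict[str, Optional[str]]], fields: List[str]) -> List[Dict[str, float]]:
    """Clean raw audit records and convert the listed fields to floats.

    ASSUMPTION: records with a missing, blank or non-numeric field are dropped.
    """
    cleaned: List[Dict[str, float]] = []
    for record in records:
        values = [record.get(field) for field in fields]
        # Data cleaning: skip records with missing or blank fields.
        if any(value is None or str(value).strip() == "" for value in values):
            continue
        try:
            # Data conversion: strip thousands separators, then parse as float.
            converted = {field: float(str(record[field]).replace(",", "")) for field in fields}
        except ValueError:
            continue  # Data cleaning: skip records that are not numeric.
        cleaned.append(converted)
    return cleaned
```

Correlation analysis is omitted from the sketch because the text does not say which fields are compared or how the result is used.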

In the embodiment of the present application, the audit object recommendation apparatus inputs the acquired first audit data of the first audit objects into the risk prediction model for risk level prediction, and then selects the first audit object with the highest risk level as the recommended audit object. Because the risk prediction model processes and analyzes the audit data automatically, extensive manual intervention is not required and efficiency is improved. This solves the technical problem in the prior art that determining the recommended audit object by manually retrieving, counting and analyzing audit data is inefficient.

The embodiment of the application also provides audit object recommendation equipment, which comprises a processor and a memory;

the memory is configured to store program code and transmit the program code to the processor;

the processor is configured to execute the audit object recommendation method in the foregoing method embodiments according to instructions in the program code.

The embodiment of the application further provides a computer-readable storage medium configured to store program code, the program code being used to execute the audit object recommendation method in the foregoing method embodiments.

It can be clearly understood by those skilled in the art that, for convenience and simplicity of description, the specific working processes of the above-described apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.

In the several embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative: the division into units is only a logical functional division, and other divisions are possible in actual implementation. For instance, a plurality of units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, apparatuses or units, and may be electrical, mechanical or in another form.

The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.

In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.

The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present application in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The software product is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method described in the embodiments of the present application. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.

The above embodiments are only used for illustrating the technical solutions of the present application, and not for limiting the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions in the embodiments of the present application.

Claims

1. An audit object recommendation method, comprising:

acquiring first audit data of a plurality of first audit objects;

preprocessing the first audit data, wherein the preprocessing comprises data cleaning, data correlation analysis or data conversion;

inputting the first audit data into a risk prediction model to predict risk levels to obtain the risk levels of the first audit objects;

the configuration method of the risk prediction model comprises the following steps:

acquiring second audit data of a second audit object;

training a machine learning model through the second audit data and the audit tag corresponding to the second audit object to obtain the risk prediction model;

before the step of obtaining the second audit data of the second audit object, the method comprises the following steps:

correspondingly, the obtaining of the second audit data of the second audit object includes:

acquiring second audit data corresponding to the third-level service;

the step of training a machine learning model through the second audit data corresponding to the second audit object and the audit tag to obtain the risk prediction model comprises:

calculating a preset index value through the audit tag of the second audit object and the risk level predicted value, wherein the preset index value comprises model accuracy, model hit rate or model recall rate;

verifying the machine learning model based on the preset index value, and taking the machine learning model passing the verification as the risk prediction model;

2. The audit object recommendation method according to claim 1, wherein performing risk scoring on the second audit object based on the second audit data to obtain a risk value corresponding to the second audit object comprises:

sequentially calculating the risk value of the second-level service and the risk value of the first-level service based on the risk value corresponding to the third-level service;

3. An audit object recommendation apparatus, applied to the audit object recommendation method of any one of claims 1-2, the apparatus comprising:

an acquisition unit, configured to acquire first audit data of a plurality of first audit objects and to preprocess the first audit data, wherein the preprocessing comprises data cleaning, data correlation analysis or data conversion;

a configuration unit for configuring the risk prediction model;

the configuration unit specifically includes:

the acquisition subunit is used for acquiring second audit data of the second audit object;

the training subunit is configured to train a machine learning model through the second audit data and the audit tag corresponding to the second audit object to obtain the risk prediction model;

4. An audit object recommendation device, comprising a processor and a memory;

the processor is configured to execute the audit object recommendation method of any of claims 1-2 in accordance with instructions in the program code.

5. A computer-readable storage medium for storing program code for performing the audit object recommendation method of any of claims 1-2.