CN110378559B - Tax enterprise credit evaluation method based on generalized maximum flow - Google Patents

Info

Publication number
CN110378559B
Authority
CN
China
Prior art keywords
tax
enterprise
relation
enterprises
node
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910507040.8A
Other languages
Chinese (zh)
Other versions
CN110378559A (en)
Inventor
郑庆华
张发
阮建飞
董博
王伊杨
高宇达
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xian Jiaotong University
Original Assignee
Xian Jiaotong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xian Jiaotong University
Priority to CN201910507040.8A
Publication of CN110378559A
Application granted
Publication of CN110378559B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2465 Query processing support for facilitating data mining operations in structured databases
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0639 Performance analysis of employees; Performance analysis of enterprise or organisation operations
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes

Abstract

The invention discloses a tax payment credit evaluation method based on generalized maximum flow. First, a Taxpayer Global Network (TGN) is constructed through explicit social relation discovery and implicit social relation mining among tax-paying enterprises, and the weights of the relation edges among the enterprises are characterized. Second, the transfer of tax payment credit influence is modeled by analogy with the maximum flow problem in network flows: generalized maximum flow models the selection of influence paths, and an attenuation function models the decay of influence as the path grows, yielding the association evaluation of an enterprise. Finally, taking into account both the tax-paying enterprise itself and its interactions with associated enterprises, a normalized fusion method fuses the taxpayer's individual evaluation with its association evaluation to obtain a comprehensive tax payment credit evaluation result.

Description

Tax enterprise credit evaluation method based on generalized maximum flow
Technical Field
The invention relates to computer-based identification of groups suspected of tax evasion, and in particular to a tax-paying enterprise credit evaluation method based on generalized maximum flow.
Background
As the construction of China's credit system advances, the social value and influence of tax payment credit keep growing. Tax payment credit is widely used in bidding, financing and similar fields and has become an important asset for enterprises competing in the market. How to build a model from enterprise tax payment data to evaluate tax payment credit has therefore become an urgent problem.
Existing tax payment credit evaluation methods mainly grade a taxpayer's credit over a given period according to how the taxpayer has fulfilled its tax obligations. The assessment combines an annual evaluation index score with direct grade determination: points are generally deducted on a 100-point scale against five evaluation indexes, namely tax registration, tax declaration, management of account books and vouchers, tax payment, and the handling of violations of tax laws and administrative regulations, and in certain special cases of violation the taxpayer's grade is assigned directly. The tax payment credit grade reflects the degree to which a taxpayer is suspected of tax evasion: the lower the grade, the higher the suspicion, and vice versa. However, this approach depends on how the evaluation indexes and assessment rules are formulated, and it is limited by the knowledge of the experts involved and the large amount of manual effort the assessment requires. In response, the following documents provide related methods for automated, computer-based tax payment credit assessment using data mining techniques:
Document 1. liberic. Tax credit rating based on data mining technology [D]. Harbin Engineering University, 2004;
Document 2. A tax auditing method combining tax credit rating and transaction relationship networks (201610828358.2);
Document 3. xushao soldiers. Application study of the decision tree algorithm in tax payment credit rating [J]. Computer Knowledge and Technology, 2009, 5(2): 286-.
Document 1 proposes a neural network model based on a single-parameter dynamic search algorithm to assign tax payment credit grades: preprocessed data are used as training samples to produce a neural network classifier, and the data to be evaluated are then fed into the model to obtain the tax payment credit evaluation results.
Document 2 first constructs a directed, weighted transaction relationship network; it then calculates suspicion scores, based on tax payment credit grades and the transaction relationship network, through a three-stage iteration that divides, transmits and combines tax suspicion with weighted scoring; finally it ranks the scores to obtain the suspected taxpayers to be audited.
Document 3 selects attributes from manually evaluated tax-related data, generates a category attribute set, feeds the data into the C4.5 decision tree algorithm to train a decision tree model, and finally evaluates the data to be classified.
The methods in these documents have the following main problems. Documents 1 and 3 classify tax-related data with a neural network model and a decision tree model respectively to obtain the tax payment credit grades of enterprises, but both use only the enterprise's own historical tax payment data and ignore its interactions with associated enterprises. Document 2 does consider the mutual influence between associated enterprises, but it restricts the interaction to transaction relationships and does not address the path selection and influence attenuation that arise when influence is transmitted between associated enterprises.
Disclosure of Invention
The invention aims to remedy the defects of the prior art by providing a tax payment credit evaluation method based on generalized maximum flow.
The invention is realized by the following technical scheme:
A tax enterprise credit evaluation method based on generalized maximum flow. First, the explicit historical tax information of taxpayers and their social relations are represented as a network, implicit social relations among taxpayers are mined from the constructed network to complete the modeling of interactions among associated enterprises, a taxpayer global network is constructed through explicit social relation discovery and implicit social relation mining among tax-paying enterprises, and the weights of the relation edges among the enterprises are characterized. Second, the transfer of tax payment credit influence is modeled by analogy with the maximum flow problem in network flows: generalized maximum flow models the selection of influence paths, and an attenuation function models the decay of influence as the path grows, yielding the association evaluation of an enterprise. Finally, taking into account both the tax-paying enterprise itself and its interactions with associated enterprises, a normalized fusion method fuses the taxpayer's individual evaluation with its association evaluation to obtain a comprehensive tax payment credit evaluation result.
In a further improvement of the invention, the method comprises the following implementation steps:
1) Discovery of social relationships
(1) Discovery of explicit social relationships
Based on directed graph theory, the natural persons and tax-paying enterprises involved in the data are represented as nodes of the explicit social network, and the four relations involved in the data (investment, shareholder, legal representative and transaction) are represented as its edges. The resulting taxpayer explicit social network is expressed as:
TESN=(V,E,VAttr,EAttr)
where $V = \{v_i \mid i = 1, 2, \ldots, N\}$ represents the node set, with $V = P \cup C$, P being the set of natural persons and C the set of tax-paying enterprises; $E = \{e_{ij} \mid i, j = 1, \ldots, N\}$ represents the edges present in TESN, where $e_{ij} = (v_i, v_j)$ indicates that an edge exists from the i-th node to the j-th node; VAttr includes the node category $V_{Catg}$ and $V_{Ind}$: $V_{Catg}$ indicates whether the node is a natural person or a tax-paying enterprise, and $V_{Ind}$ holds Name, the name of the tax-paying enterprise or natural person, and the company score, the initial tax payment credit score; EAttr includes the edge category $E_{Catg}$ and the edge weight W, with $E_{Catg} \in \{IR, SR, LR, TR\}$, where IR represents the investment relation, SR the shareholder relation, LR the legal representative relation, and TR the transaction relation; finally, a heterogeneous taxpayer explicit social network comprising tax-paying enterprises and natural persons is obtained through bidirectional interest-association mining;
(2) Mining of implicit social relationships
On the basis of the taxpayer explicit social network, the implicit association relationships among tax-paying enterprises are mined, chiefly by introducing the board interlocking relationship. A control relation chain T is defined, satisfying:
$$T = \{(p, c_1, c_2, \ldots, c_n, c) \mid p \in P;\ c_1, c_2, \ldots, c_n, c \in C;\ (p, c_1), (c_1, c_2), \ldots, (c_n, c) \in E\}$$
The control relation chain T is a chain of control relationships, excluding transaction relationships, that starts at a natural person and ends at a tax-paying enterprise, where p is the starting point and c the end point; P is the set of all natural persons, C the set of tax-paying enterprises, E the set of edges from natural persons to tax-paying enterprises or between tax-paying enterprises, and (p, c) an edge from a natural person to an enterprise. Finally, a natural person-tax enterprise bipartite graph PCBN = (P, C, E) is formed, in which P and C are the two node sets (natural persons and tax-paying enterprises) and E is the set of edges from natural persons to tax-paying enterprises. After simplification, $N(c) = \{p \mid p \in P, (p, c) \in E\}$ is taken as the controller set of tax-paying enterprise c;
finally, the PCBN is converted into a non-bipartite graph PCBNC of tax-paying enterprises, where:
Figure BDA0002092172680000041
that is, when the controller sets of any two tax-paying enterprises intersect, a bidirectional board interlocking relationship is added between the two enterprises, and the TESN is converted into the taxpayer global network, expressed as:
TGN=(C,Ec,VAttr,EAttrc)
where $E_c$ represents the new edge set; $EAttr_c$ includes the edge category $E_{Catg}$ and the edge weight W, with $E_{Catg} \in \{IR, SR, TR, CR\}$, where IR represents the investment relation, SR the shareholder relation, TR the transaction relation, and CR the board interlocking relation; the result is a homogeneous taxpayer global network containing only tax-paying enterprises;
2) Relation edge weight characterization
(1) Calculating the forward and reverse influence values of a relation edge
Because interactions occur between tax-paying enterprises, the degree of influence should take into account both the influence between the enterprises and the ratio of the final tax payment to the total tax payment. The influence of investor i on investee j is defined as the forward influence, and the influence of investee j on investor i as the reverse influence. Specifically:
$$TR_{ij} = \frac{ir_{ij}}{\sum_j ir_{ij}}$$
where $TR_{ij}$ represents the transaction proportion, $ir_{ij}$ the amount invested by investor i in investee j, and $\sum_j ir_{ij}$ the total amount invested by investor i in all investees; the remaining relations IR, SR and CR (investment, shareholder and board interlocking) are handled in the same way;
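The forward proportion described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `forward_proportions` and the sample amounts are hypothetical, and only the forward ratio (an amount divided by the source enterprise's total) is computed.

```python
from collections import defaultdict


def forward_proportions(amounts):
    """Map each edge (i, j) to amounts[i, j] / sum over j of amounts[i, j].

    `amounts` maps (source, target) pairs to a non-negative amount.
    """
    totals = defaultdict(float)
    for (i, _j), amount in amounts.items():
        totals[i] += amount
    return {
        (i, j): amount / totals[i]
        for (i, j), amount in amounts.items()
        if totals[i] > 0  # skip sources with no outgoing amount
    }


# Hypothetical amounts, for illustration only.
amounts = {("A", "B"): 30.0, ("A", "C"): 70.0, ("B", "C"): 50.0}
weights = forward_proportions(amounts)
```

The same helper could be applied per relation type (IR, SR, TR, CR) to obtain one proportion per relation for every edge.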
(2) Fusion of multi-source edge relationships
To make comprehensive use of the four relations, Dempster-Shafer evidence theory is adopted to handle the multi-valued mapping problem. Four mass functions (m functions) are defined over whether tax payment credit influence can be transmitted, taken respectively from the weight ratios of the investment relation, the shareholder relation, the transaction relation and the board interlocking relation. The normalization constant is:
Figure BDA0002092172680000051
where $m_i(A_i)$ represents the degree of belief of the i-th edge-relation evidence, and the combination rule for the four relations is expressed as:
Figure BDA0002092172680000052
finally, the four-dimensional edge weight attribute is converted into a one-dimensional probability attribute, which expresses the probability with which the source tax-paying enterprise of a relation edge influences the tax-paying enterprise at its end point;
3) Calculating the association evaluation of tax-paying enterprises from the influence transmitted by generalized maximum flow
Step1 Characterization of network attenuation
To represent the attenuation of tax payment credit influence as the path grows, an attenuation function leak(x) is introduced: each time the flow passes an intermediate node $v \in V \setminus \{src, dst\}$, it is attenuated in the proportion given by leak(x), where x is the distance from the current intermediate node to the source node. Considering the computation cost, and that the influence of associated enterprises on an enterprise's tax payment credit weakens as the transmission path grows, eventually to 0, only enterprises strongly associated with the tax-paying enterprise are selected to transmit tax payment credit influence. That is, taking each tax-paying enterprise as the center, all associated enterprise nodes and relation edges within a path length of 3 steps are extracted to form a Taxpayer Maximum Flow Subgraph (TMFS), of the form:
TMFS=(Vid,MFGraph)
MaximumFlowGraph=(MFVertex,MFEdge)
where Vid is the unique identifier of the tax-paying enterprise and MFGraph (MaximumFlowGraph) is its taxpayer maximum flow subgraph; MFVertex represents the nodes contained in the maximum flow subgraph, in the format (Vid, company score, Distance, Capacity), where Distance is the distance between the node and the source node and Capacity is the amount of flow the node can produce; MFEdge represents the edges contained in the maximum flow subgraph, in the format (SrcMFVertex, DstMFVertex, weight), where SrcMFVertex is the source tax-paying enterprise, DstMFVertex is the destination tax-paying enterprise, and weight is the fused weight obtained through D-S evidence fusion, that is, the maximum capacity limit of the edge;
to balance information loss against information reuse, the idea of the Edmonds-Karp maximum flow algorithm is borrowed: the shortest path is repeatedly found by breadth-first search and used as the augmenting path; to simulate information attenuation during propagation in a reasonable way, a uniform attenuation function and non-uniform attenuation functions are introduced according to the attenuation ratio;
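A minimal sketch of the 3-step extraction follows, assuming the fused edge weights are held in a dictionary keyed by (source, target). The name `extract_tmfs` and the sample edges are hypothetical, and hop distance is measured while ignoring edge direction, which is an assumption the text does not settle.

```python
from collections import deque


def extract_tmfs(edges, center, max_hops=3):
    """Return (distance, sub_edges) for the subgraph within max_hops of center.

    `edges` maps (src, dst) to a fused weight. Hop distance ignores edge
    direction (an assumption made for this sketch).
    """
    neighbors = {}
    for u, v in edges:
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)

    distance = {center: 0}
    queue = deque([center])
    while queue:
        node = queue.popleft()
        if distance[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for nxt in neighbors.get(node, ()):
            if nxt not in distance:
                distance[nxt] = distance[node] + 1
                queue.append(nxt)

    # Keep only edges whose two endpoints both lie inside the subgraph.
    sub_edges = {
        (u, v): w for (u, v), w in edges.items()
        if u in distance and v in distance
    }
    return distance, sub_edges


# Hypothetical chain of enterprises, for illustration only.
edges = {("A", "B"): 0.5, ("B", "C"): 0.4, ("C", "D"): 0.9, ("D", "E"): 0.3}
distance, sub_edges = extract_tmfs(edges, "A")
```

In this toy chain, enterprise E lies four steps from A and is therefore left out of the subgraph.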
Uniform attenuation: the attenuation ratio does not vary with x
$leak_1(x) = 0.2$
Non-uniform attenuation: the attenuation ratio varies with x
$leak_2(x) = 1 - \cos(0.09x)$
$leak_3(x) = e^{-x}$
$leak_4(x) = (x+1)^{-4}$
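The four attenuation functions can be written down directly, which makes it easy to compare how each behaves over the 3-step range used by the subgraph. The sketch assumes that leak3 and leak4 are read as $e^{-x}$ and $(x+1)^{-4}$; the Python function names simply mirror the names in the text.

```python
import math


def leak1(x):
    """Uniform attenuation: the ratio is the same at every distance."""
    return 0.2


def leak2(x):
    """Non-uniform attenuation that grows with distance."""
    return 1 - math.cos(0.09 * x)


def leak3(x):
    """Non-uniform attenuation, read here as e^(-x) (assumed exponent)."""
    return math.exp(-x)


def leak4(x):
    """Non-uniform attenuation, read here as (x + 1)^(-4) (assumed exponent)."""
    return (x + 1) ** -4


# Tabulate the ratios at distances 1 to 3 from the source node.
for x in (1, 2, 3):
    print(x, leak1(x), leak2(x), leak3(x), leak4(x))
```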
Step2 Selection of path
The flow of an augmenting path is obtained as follows. The initial tax payment credit score of the source tax-paying enterprise (src.company score) is set as the total flow fs that the source can send out, that is, the capacity of the source. On each edge of the selected shortest path the flow does not exceed fs; each time an intermediate node is passed, flow is lost in the proportion given by the attenuation function leak. The path used and the flow used on each of its edges are recorded until the flow reaches the destination tax-paying enterprise;
because noise is present during information transmission, a noise fitting method is introduced at intermediate nodes. For the next node that the information flow enters, the flow arriving at that node over its other paths is recorded as $flow_{others}$, and the residual capacity of the transmitting edge is recorded as c(i, j) - usedflow. The uniform attenuation function leak1(x) = 0.2 is then updated from $\sum flow_{others}$ and c(i, j) - usedflow to give a new uniform attenuation value for the next node; when the newly calculated value is less than 0.2, the current leak1(x) = 0.2 is kept and the information is transmitted normally;
Step3 Correction of the residual network
For each edge on the augmenting path obtained from the source src to the destination dst, the capacity of the forward edge is corrected to c(i, j) - usedflow and a reverse edge e(j, i, usedflow) is added; at the same time fs is corrected as fs = fs - (src, j).usedflow, where (src, j).usedflow is the flow used on the path from the source to node j. Augmenting paths are acquired and the residual network is corrected iteratively, and the flow delivered to dst is summed to give the maximum degree of influence of the associated tax-paying enterprise on the tax-paying enterprise of interest;
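The three steps can be put together in a simplified sketch. It is not the patented procedure itself: it assumes leak(x) is the fraction of flow lost at an intermediate node x hops from the source with 0 <= leak(x) < 1, it omits the noise fitting update, and the names `shortest_path`, `lossy_max_flow` and the toy capacities are hypothetical.

```python
from collections import deque

EPS = 1e-12  # residual capacities at or below this count as exhausted


def shortest_path(cap, src, dst):
    """Breadth-first search for a shortest src -> dst path with spare capacity."""
    parent = {src: None}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        if u == dst:
            break
        for (a, b), c in cap.items():
            if a == u and c > EPS and b not in parent:
                parent[b] = u
                queue.append(b)
    if dst not in parent:
        return None
    path, node = [], dst
    while node is not None:
        path.append(node)
        node = parent[node]
    return path[::-1]


def lossy_max_flow(capacity, src, dst, supply, leak=lambda x: 0.2):
    """Sum the flow delivered to dst when each intermediate node loses leak(x)."""
    cap = dict(capacity)          # residual capacities
    fs, delivered = supply, 0.0   # fs: flow the source can still send
    while fs > EPS:
        path = shortest_path(cap, src, dst)
        if path is None:
            break
        edges = list(zip(path, path[1:]))
        # gains[t]: share of the injected flow still present on the t-th edge
        gains = [1.0]
        for t in range(1, len(edges)):
            gains.append(gains[-1] * (1.0 - leak(t)))
        # Largest injection that respects fs and every edge capacity.
        inject = min([fs] + [cap[e] / gains[t] for t, e in enumerate(edges)])
        for t, (u, v) in enumerate(edges):
            used = inject * gains[t]
            cap[(u, v)] -= used                         # correct the forward edge
            cap[(v, u)] = cap.get((v, u), 0.0) + used   # add the reverse edge
        fs -= inject
        delivered += inject * gains[-1]
    return delivered


# Toy network: one direct edge and one two-hop path (hypothetical capacities).
capacity = {("s", "a"): 0.6, ("a", "t"): 0.4, ("s", "t"): 0.3}
delivered = lossy_max_flow(capacity, "s", "t", supply=1.0)
```

With these toy numbers the direct edge delivers 0.3, and the two-hop path delivers 0.4 after the 20% loss at node a, for a total of about 0.7.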
4) Tax credit assessment
First, the normalized-fusion formula for the proportion $In_{ij}$ of tax payment credit influence of an associated enterprise on the tax-paying enterprise is defined:
$$In_{ij} = flow_{ij} / C(v_i)$$
$In_{ij}$ represents the proportion of the maximum influence value, where i ranges over all associated enterprise nodes other than j itself in the taxpayer maximum flow subgraph of tax-paying enterprise j; $flow_{ij}$ is the maximum influence that enterprise i can transmit to enterprise j, obtained in step 3); and $C(v_i)$ is the initial tax payment credit score of tax-paying enterprise i. Next, the positive score $E_s$ is defined:
$$E_s = \beta \cdot C(v_j)$$
where β is the share assigned to the tax-paying enterprise of interest relative to its associated enterprises, and $C(v_j)$ is the initial tax payment credit score of tax-paying enterprise j. Then the negative influence score $E_o$ of the other enterprises on the tax-paying enterprise is calculated:
Figure BDA0002092172680000071
where $1 - C(v_i)$ represents the credit score of the associated enterprise with respect to the tax-paying enterprise, and
Figure BDA0002092172680000072
expresses the comprehensive weight ratio score of the tax-paying enterprise with respect to the associated enterprise. Finally, the fused score $E(v_j)$, the final evaluation of node j, is calculated:
$$E(v_j) = E_s + E_o$$
β is the allocation percentage between the tax-paying enterprise of interest and its associated enterprises, with β ∈ {0.1, 0.3, 0.5, 0.7, 0.9}.
In a further improvement of the invention, the threshold is 0.6: $E(v_j) \le 0.6$ indicates that the enterprise's tax payment credit is normal, and $E(v_j) > 0.6$ indicates that it is abnormal.
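The final fusion and the threshold test reduce to a few arithmetic steps, sketched below. Because the formula for the negative influence score appears only as an image in this text, the sketch takes that score as an input rather than computing it; the function names and the sample values are hypothetical.

```python
def influence_proportion(flow_ij, c_i):
    """In_ij = flow_ij / C(v_i): share of enterprise i's score that reaches j."""
    return flow_ij / c_i


def fused_score(beta, c_j, e_o):
    """E(v_j) = E_s + E_o with E_s = beta * C(v_j); e_o is supplied by the caller."""
    return beta * c_j + e_o


def is_abnormal(score, threshold=0.6):
    """Scores above the threshold indicate abnormal tax payment credit."""
    return score > threshold


# Hypothetical values, for illustration only.
proportion = influence_proportion(flow_ij=0.35, c_i=0.7)
score = fused_score(beta=0.5, c_j=0.4, e_o=0.25)
abnormal = is_abnormal(score)
```

Sweeping beta over the five values listed above shows how strongly the verdict depends on the weight given to the enterprise's own score.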
The invention has the following beneficial technical effects:
The invention calculates the tax payment credit condition of an enterprise by comprehensively considering the tax-paying enterprise itself and the influence transmitted to it by associated enterprises, and has the following advantages: 1. by mining and constructing the taxpayer global network structure, the relationships between taxpayers and enterprises can be represented and social relations mined, giving the individual evaluation of tax payment credit; 2. based on generalized maximum flow theory, the relations along which influence is transmitted are mined to obtain the association evaluation of tax payment credit; 3. the tax payment relations are fused through information fusion theory to obtain the fused evaluation of tax payment credit.
Drawings
FIG. 1 is the overall framework flow diagram.
FIG. 2 shows the taxpayer bidirectional interest-association mining graph.
FIG. 3 shows the implicit social relationship mining flow diagram.
FIG. 4 shows the graph after interlocking edges are added.
FIG. 5 shows the maximum flow propagation diagram.
Detailed Description
To illustrate the technical scheme of the invention more clearly, the method of reflecting tax payment credit by modeling the social relations among taxpayers is described in detail below with reference to the accompanying drawings and specific embodiments.
As shown in FIG. 1, in the tax enterprise credit evaluation method based on generalized maximum flow provided by the invention, the explicit historical tax information of taxpayers and their social relations are first represented as a network, implicit social relations among taxpayers are mined from the constructed network to complete the modeling of interactions among associated enterprises, a Taxpayer Global Network (TGN) is constructed through explicit social relation discovery and implicit social relation mining among tax-paying enterprises, and the weights of the relation edges among the enterprises are characterized. Second, the transfer of tax payment credit influence is modeled by analogy with the maximum flow problem in network flows: generalized maximum flow models the selection of influence paths, and an attenuation function models the decay of influence as the path grows, yielding the association evaluation of an enterprise. Finally, taking into account both the tax-paying enterprise itself and its interactions with associated enterprises, a normalized fusion method fuses the taxpayer's individual evaluation with its association evaluation to obtain the comprehensive tax payment credit evaluation. Specifically, the invention comprises the following steps:
Step 1. Discovery of social relationships
Constructing the taxpayer global network consists mainly of discovering relationships. Because the relationships comprise explicit and implicit social relationships, the explicit social relationships are characterized first and the implicit relationships are then mined; the heterogeneous explicit network of tax-paying enterprises and natural persons is obtained through bidirectional interest-association mining. The specific steps are as follows:
(1) Explicit social relationship characterization
The explicit social relationships comprise four relations: the investment relation, from natural persons to tax-paying enterprises and between tax-paying enterprises, found in taxpayer-investor data; the transaction relation between tax-paying enterprises, found in invoice stub data; the legal representative relation from natural persons to tax-paying enterprises, found in registered taxpayer information data; and the shareholder relation, from natural persons to tax-paying enterprises and between tax-paying enterprises, found in shareholder data. Based on directed graph theory, the natural persons and tax-paying enterprises involved in the data are represented as nodes of the explicit social network, and the four relations (investment, shareholder, legal representative and transaction) are represented as its edges. The resulting Taxpayer Explicit Social Network (TESN) is expressed as:
TESN=(V,E,VAttr,EAttr)
where $V = \{v_i \mid i = 1, 2, \ldots, N\}$ represents the node set, with $V = P \cup C$, P being the set of natural persons and C the set of tax-paying enterprises; $E = \{e_{ij}\} = \{(v_i, v_j) \mid v_i, v_j \in V\}$ represents the edges present in the TESN, where $e_{ij} = (v_i, v_j)$ indicates that an edge exists from the i-th node to the j-th node; VAttr includes the node category $V_{Catg}$ and $V_{Ind}$: $V_{Catg}$ indicates whether the node is a natural person or a tax-paying enterprise, and $V_{Ind}$ holds Name, the name of the tax-paying enterprise or natural person, and the company score, the initial tax payment credit score. EAttr includes the edge category $E_{Catg}$ and the edge weight W, with $E_{Catg} \in \{IR, SR, LR, TR\}$, where IR represents the investment relation, SR the shareholder relation, LR the legal representative relation, and TR the transaction relation. Finally, the heterogeneous taxpayer explicit social network comprising tax-paying enterprises and natural persons is obtained through bidirectional interest-association mining, as shown in FIG. 2.
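One possible in-memory layout for the TESN tuple is sketched below. The class and field names (`Node`, `TESN`, `category`, `company_score`) are hypothetical choices made for illustration; the text fixes only the four components and the four relation codes.

```python
from dataclasses import dataclass, field
from typing import Optional

RELATIONS = {"IR", "SR", "LR", "TR"}  # investment, shareholder, legal rep., transaction


@dataclass
class Node:
    name: str                               # enterprise name or natural person name
    category: str                           # "person" or "enterprise"
    company_score: Optional[float] = None   # initial credit score, enterprises only


@dataclass
class TESN:
    nodes: dict = field(default_factory=dict)  # name -> Node
    edges: list = field(default_factory=list)  # (src, dst, relation, weight)

    def add_node(self, node):
        self.nodes[node.name] = node

    def add_edge(self, src, dst, relation, weight):
        if relation not in RELATIONS:
            raise ValueError(f"unknown relation: {relation}")
        if src not in self.nodes or dst not in self.nodes:
            raise KeyError("both endpoints must be added as nodes first")
        self.edges.append((src, dst, relation, weight))


# Hypothetical toy network: one natural person and two enterprises.
net = TESN()
net.add_node(Node("p1", "person"))
net.add_node(Node("c1", "enterprise", company_score=0.8))
net.add_node(Node("c2", "enterprise", company_score=0.5))
net.add_edge("p1", "c1", "LR", 1.0)
net.add_edge("c1", "c2", "TR", 0.4)
```

Keeping persons and enterprises in one node table matches the heterogeneous character of the TESN.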
(2) Mining of implicit relationships
On the basis of the taxpayer explicit social network, the implicit association relationships among tax-paying enterprises are mined, chiefly by introducing the board interlocking relationship. A control relation chain T is defined, satisfying:
$$T = \{(p, c_1, c_2, \ldots, c_n, c) \mid p \in P;\ c_1, c_2, \ldots, c_n, c \in C;\ (p, c_1), (c_1, c_2), \ldots, (c_n, c) \in E\}$$
The control relation chain T is a chain of control relationships, excluding transaction relationships, that starts at a natural person and ends at a tax-paying enterprise, where p is the starting point and c the end point. P is the set of all natural persons, C the set of tax-paying enterprises, E the set of edges from natural persons to tax-paying enterprises, and (p, c) an edge from a natural person to an enterprise, as shown in FIG. 3.
S301: simplified control edge
Finally, a natural person-tax enterprise bipartite graph PCBN = (P, C, E) is formed. P and C are the two node sets of the PCBN, natural persons and tax-paying enterprises respectively. Starting from each natural person, the control chains made up of legal representative, shareholding and investment relations are identified and then simplified to a single control edge.
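The simplification of control chains into single control edges can be sketched as a reachability search. This is an illustration under stated assumptions: every enterprise reachable from a natural person over non-transaction edges is treated as controlled by that person, edge targets are assumed to be enterprises, and the name `control_edges` and the sample edges are hypothetical.

```python
def control_edges(persons, edges):
    """Collapse control chains into direct (person, enterprise) control edges.

    `edges` is an iterable of (src, dst, relation); transaction edges ("TR")
    are ignored because control chains exclude them.
    """
    successors = {}
    for src, dst, relation in edges:
        if relation != "TR":
            successors.setdefault(src, set()).add(dst)

    result = set()
    for person in persons:
        stack, seen = [person], {person}
        while stack:
            node = stack.pop()
            for nxt in successors.get(node, ()):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
                    result.add((person, nxt))  # one control edge per enterprise
    return result


# Hypothetical relations, for illustration only.
edges = [
    ("p1", "c1", "LR"),
    ("c1", "c2", "IR"),
    ("p2", "c2", "SR"),
    ("c2", "c3", "TR"),  # transaction edge: not part of any control chain
]
pc_edges = control_edges({"p1", "p2"}, edges)
```

Here p1 controls c2 only indirectly through c1, yet ends up with a direct control edge to it, while c3 is reached only by a transaction and gets none.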
S302: Mining of the board of directors
E is the set of edges from natural persons to tax-paying enterprises. After simplification, $N(c) = \{p \mid p \in P, (p, c) \in E\}$ is taken as the controller set of tax-paying enterprise c and serves as the generalized board of directors of the enterprise.
S303: adding interlocking edges
And finally, converting the PCBN into a non-bipartite PCBNC of the tax enterprise, wherein:
Figure BDA0002092172680000101
This means that when the controller sets of any two tax-paying enterprises intersect, a bidirectional board interlocking edge is added between the two enterprises.
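The interlocking rule can be sketched as follows, assuming the simplified control edges are available as (person, enterprise) pairs. The function name `interlock_edges` and the sample pairs are hypothetical.

```python
from itertools import combinations


def interlock_edges(control_pairs):
    """Build controller sets N(c) and the bidirectional CR edges they imply."""
    controllers = {}
    for person, enterprise in control_pairs:
        controllers.setdefault(enterprise, set()).add(person)

    cr_edges = set()
    for c1, c2 in combinations(sorted(controllers), 2):
        if controllers[c1] & controllers[c2]:   # shared controller found
            cr_edges.add((c1, c2, "CR"))
            cr_edges.add((c2, c1, "CR"))        # the relation is bidirectional
    return controllers, cr_edges


# Hypothetical control edges, for illustration only.
pairs = {("p1", "c1"), ("p1", "c2"), ("p2", "c2"), ("p3", "c3")}
controllers, cr_edges = interlock_edges(pairs)
```

Enterprises c1 and c2 share controller p1 and are linked in both directions; c3 has no controller in common with the others and stays unlinked.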
S304: get Taxpayer Global Network (TGN)
TESN is converted to TGN, expressed as:
TGN=(C,Ec,VAttr,EAttrc)
where $E_c$ represents the new edge set; $EAttr_c$ includes the edge category $E_{Catg}$ and the edge weight W, with $E_{Catg} \in \{IR, SR, TR, CR\}$, where IR represents the investment relation, SR the shareholder relation, TR the transaction relation, and CR the board interlocking relation. The result is a homogeneous taxpayer global network containing only tax-paying enterprises, as shown in FIG. 4.
Step 2. Characterization of weights
Because interactions occur between tax-paying enterprises, the degree of influence should take into account both the influence between the enterprises and the ratio of the final tax payment to the total tax payment. The influence of investor i on investee j is defined as the forward influence, and the influence of investee j on investor i as the reverse influence. Specifically:
$$TR_{ij} = \frac{ir_{ij}}{\sum_j ir_{ij}}$$
where $TR_{ij}$ represents the transaction proportion, $ir_{ij}$ the amount invested by investor i in investee j, and $\sum_j ir_{ij}$ the total amount invested by investor i in all investees. The remaining relations IR, SR and CR (investment, shareholder and board interlocking) are handled in the same way. Tables 1 and 2 follow.
Figure BDA0002092172680000111
TABLE 1
Figure BDA0002092172680000112
TABLE 2
To make comprehensive use of the four relations, Dempster-Shafer evidence theory is adopted to handle the multi-valued mapping problem. The two propositions, that tax payment credit influence can be transmitted and that it cannot, form a valid proposition set, so four mass functions (m functions) of this type are defined for the transmission of tax payment credit, taken respectively from the weight ratios of the investment relation, the shareholder relation, the transaction relation and the board interlocking relation.
The normalization constant is:
Figure BDA0002092172680000113
where $m_i(A_i)$ denotes the degree of belief assigned by the edge-relation evidence. The synthesis rule for the four relations can be expressed as:
Figure BDA0002092172680000121
Finally, the four-dimensional edge weight attribute is converted into a one-dimensional probability attribute, which expresses the probability with which the source tax-paying enterprise of a relation edge can influence the destination tax-paying enterprise.
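The fusion step can be illustrated with the standard Dempster rule of combination over a two-element frame ("transfer" / "no_transfer"). This is a generic sketch of the textbook rule, not a reproduction of the patent's formulas, which are available only as images. In particular, the way each relation weight is turned into a mass function (`simple_mass`) and the sample weights are assumptions.

```python
from itertools import product


def combine(m1, m2):
    """Dempster's rule for two mass functions keyed by frozenset focal elements."""
    raw = {}
    conflict = 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            raw[inter] = raw.get(inter, 0.0) + wa * wb
        else:
            conflict += wa * wb          # mass falling on the empty set
    k = 1.0 - conflict                   # normalization constant
    if k <= 0.0:
        raise ValueError("total conflict: evidence cannot be combined")
    return {s: w / k for s, w in raw.items()}


def fuse(masses):
    """Combine several mass functions; Dempster's rule is associative."""
    result = masses[0]
    for m in masses[1:]:
        result = combine(result, m)
    return result


T = frozenset({"transfer"})
N = frozenset({"no_transfer"})
THETA = T | N


def simple_mass(weight):
    """Hypothetical mapping: put `weight` on 'transfer', the rest on the whole frame."""
    return {T: weight, THETA: 1.0 - weight}


# Hypothetical weights for the IR, SR, TR and CR relations on one edge.
fused = fuse([simple_mass(w) for w in (0.5, 0.2, 0.4, 0.1)])
```

With simple support functions that all point to the same proposition there is no conflict, and the fused belief in "transfer" equals one minus the product of the uncommitted masses.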
Step 3. Transfer of maximum flow
The propagation of a tax credit evaluation resembles a flow transfer process: the flow corresponds to the enterprise's initial tax compliance risk value, converges from a source point to a destination point, and carries risk information along the propagation path. Based on this similarity, the influence process of tax compliance risk is treated by analogy with flow transfer in a network flow. Because several influence paths may lead to the destination tax-paying enterprise, two problems arise. The first is information loss, which occurs if only one of the paths is selected (for example, only the influence value on the shortest or the longest path). The second is path selection. In addition, in reality the influence of tax information decays as the influence path grows. To address these problems, the method borrows the idea of the generalized maximum flow: in a capacity network, as much material as possible is sent between two nodes without exceeding the capacity limits. Attenuation is modeled explicitly, and path selection is resolved by computing the maximum influence that associated enterprises can deliver to the tax-paying enterprise of interest through the relationship network.
In a generalized network, each edge has an associated positive multiplier that represents the proportion of the flow sent along the edge that arrives at its end, which addresses the problem of influence attenuation. In the maximum flow problem, as much material as possible is sent between two distinct nodes of a capacity network without exceeding the capacity limit of any edge; this yields the maximum influence that an associated enterprise can deliver to a tax-paying enterprise through the network and resolves the path selection problem of information loss versus information reuse. For the characterization, an attenuation function leak(x) = 0.2 is defined: each time the flow passes through an intermediate node $v \in V \setminus \{src, dst\}$, it decays according to the leak function, where x is the distance from the current intermediate node to the source node.
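The attenuation function can be written down directly. The sketch below shows the uniform function leak(x) = 0.2 together with the three non-uniform alternatives listed in the claims. The exponents in `leak3` and `leak4` are read as $e^{-x}$ and $(x+1)^{-4}$, which is an assumption about superscripts lost in extraction; the function names simply mirror the labels in the claims.

```python
import math


def leak1(x):
    """Uniform attenuation: the ratio does not depend on the distance x."""
    return 0.2


def leak2(x):
    """Non-uniform attenuation 1 - cos(0.09 x)."""
    return 1.0 - math.cos(0.09 * x)


def leak3(x):
    """Non-uniform attenuation, read here as e^(-x) (assumed exponent)."""
    return math.exp(-x)


def leak4(x):
    """Non-uniform attenuation, read here as (x + 1)^(-4) (assumed exponent)."""
    return (x + 1.0) ** -4


# Attenuation ratios at distances 0..3 from the source, for comparison.
table = {
    f.__name__: [round(f(x), 4) for x in range(4)]
    for f in (leak1, leak2, leak3, leak4)
}
```

Under this reading, `leak2` grows slowly with distance while `leak3` and `leak4` shrink, so the choice of function determines whether nearby or distant intermediate nodes lose more flow.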
Given a network G = (V, E) and a flow f on it, the corresponding residual network is defined as $G_r = (V, E_r)$, where c(i, j) and gain(i, j) are the capacity limit and the gain factor of edge (i, j), and the residual capacity in the residual network is $c_r(i, j) = c(i, j) - f(i, j)$. When a path can still be found in the residual network whose flow does not exceed the edge capacity limits, that path is called an augmenting path. The specific propagation process is shown in Fig. 4.
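The residual-network bookkeeping can be sketched with a plain breadth-first search for a shortest augmenting path, in the style of the Edmonds-Karp algorithm. This is the standard ungeneralized version: it deliberately omits the gain factor and the attenuation described in this document, and the graph, node names, and capacities are hypothetical.

```python
from collections import deque


def find_augmenting_path(residual, src, dst):
    """Breadth-first search for a shortest src -> dst path over edges
    with positive residual capacity. `residual` is {u: {v: capacity}}."""
    parent = {src: None}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        if u == dst:
            break
        for v, cap in residual.get(u, {}).items():
            if cap > 0 and v not in parent:
                parent[v] = u
                queue.append(v)
    if dst not in parent:
        return None
    path = []
    node = dst
    while node is not None:
        path.append(node)
        node = parent[node]
    return path[::-1]


def push(residual, path, amount):
    """Apply c_r(i, j) = c(i, j) - f(i, j) and add the reverse capacity."""
    for u, v in zip(path, path[1:]):
        residual[u][v] -= amount
        residual.setdefault(v, {})
        residual[v][u] = residual[v].get(u, 0.0) + amount


# Hypothetical capacities.
residual = {"s": {"a": 0.5, "b": 0.6}, "a": {"t": 0.3}, "b": {"t": 0.5}}
first = find_augmenting_path(residual, "s", "t")
bottleneck = min(residual[u][v] for u, v in zip(first, first[1:]))
push(residual, first, bottleneck)
second = find_augmenting_path(residual, "s", "t")
```

After the first push the edge a -> t is saturated, so the second search is forced onto the path through b.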
In the initial state there are 6 nodes; Src is the source node and Dst is the target node. The attenuation factor for the propagation process is leak(x) = 0.2, and the initial transfer capacity is f(1,3) = 0.6. Following the breadth-first search of the graph, the edge from node 1 to node 2 is visited first; because its capacity limit is 0.5, the transfer condition is not satisfied, so the edge from node 1 to node 3 is selected instead. Its capacity limit is 0.6, so the flow of 0.6 satisfies the limit and is propagated. At the same time, the attenuation function of node 3 is calculated. The only incoming edge of node 3 is the edge from node 1 to node 3, and f(1,3) × leak(x) = 0.6 × 0.2 = 0.12 < 0.2, so the attenuation function of node 3 is leak(x) = 0.2. The information is then passed on: following the breadth-first search, the edge from node 3 to node 4 is selected, and the attenuated amount is dst = leak(x) × f(1,3) = 0.2 × 0.6 = 0.12, so f(3,4) = f(1,3) - dst = 0.6 - 0.12 = 0.48. Next, the edge from node 4 to node 6 is selected. Here (f(3,4) + f(2,4) + f(5,4)) × leak(x) = (0.48 + 0.3 + 0.9) × 0.2 = 0.38 > 0.2, so leak(x) = 0.38. The attenuated amount is dst = leak(x) × f(3,4) = 0.38 × 0.54 = 0.2052, so f(4,6) = f(3,4) - dst = 0.48 - 0.2052 = 0.2748, and Dst finally receives propagated information of 0.2748. The remaining transferable capacity is then 0.4. Following the breadth-first search, the edge from node 1 to node 2 is selected for propagation with initial transfer capacity f(1,2) = 0.4; the edge from node 1 to node 3 is not selected because its residual capacity is 0.
The attenuation factor is then selected: since f(1,2) × leak(x) = 0.4 × 0.2 = 0.08 < 0.2, leak(x) = 0.2. The attenuated amount is dst = f(1,2) × leak(x) = 0.4 × 0.2 = 0.08, and thus f(2,4) = f(1,2) - dst = 0.3 - 0.08 = 0.22. According to maximum flow theory, as much as possible must be transferred without exceeding the edge capacity, so the edge from node 2 to node 4 is selected for the transfer. After calculation, leak(x) = 0.212, and dst = leak(x) × f(2,4) = 0.212 × 0.22 = 0.0466, so f(4,6) = f(2,4) - dst = 0.22 - 0.0466 = 0.1734, and Dst finally receives information of 0.1734.
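The per-node step used in this walk-through can be sketched as two small functions: one that picks the attenuation ratio of a node from its incoming flows (never below the base value 0.2), and one that forwards the flow after subtracting the attenuated amount. This follows one reading of the worked example; the function names and the exact update rule are assumptions rather than the authoritative procedure.

```python
BASE_LEAK = 0.2  # base attenuation ratio leak(x) = 0.2 from the text


def node_leak(incoming_flows):
    """Attenuation ratio of a node: 0.2 times the sum of its incoming flows,
    but never less than the base value 0.2 (assumed reading of the example)."""
    return max(BASE_LEAK, BASE_LEAK * sum(incoming_flows))


def forward(flow, leak):
    """Flow passed on after the attenuated amount leak * flow is removed."""
    return flow - leak * flow


# First hop of the example: node 3 has a single incoming flow f(1,3) = 0.6.
leak_3 = node_leak([0.6])        # 0.6 * 0.2 = 0.12 < 0.2, so the base value is kept
f_34 = forward(0.6, leak_3)      # 0.6 - 0.12
```

Chaining `node_leak` and `forward` along an augmenting path yields the flow that finally reaches the destination node.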
Step 4. Tax credit assessment
The normalized, fused tax credit evaluation of a tax-paying enterprise and its associated enterprises considers two things: the influence of the associated enterprises on the tax-paying enterprise, and the tax credit condition of the central tax-paying enterprise itself. An associated enterprise with a low initial tax credit score carries a large amount of information and therefore has a large influence on the central enterprise. The positive and negative influences of associated enterprises are also distinguished: the influence is positive when the initial tax credit evaluation of the associated enterprise is higher than that of the tax-paying enterprise, and negative otherwise. A fusion evaluation formula is then used to carry out the tax compliance risk association evaluation of the taxpayer.
A linear fusion formula is used to derive the tax credit association evaluation:
$In_{ij} = flow_{ij} / C(v_i)$
$E_s = \beta \cdot C(v_j)$
Figure BDA0002092172680000141
$E(v_j) = E_s + E_o$
where $In_{ij}$ is the proportion of the maximum influence value, and i ranges over all associated enterprise nodes other than j itself in the taxpayer maximum flow subgraph of tax-paying enterprise j. $flow_{ij}$ is the maximum influence that enterprise i can deliver to enterprise j, obtained in Step 3, and $C(v_i)$ is the initial tax credit score of tax-paying enterprise i. $E_s$ denotes the positive score given by other enterprises to the tax-paying enterprise, $E_o$ denotes the negative influence score of other enterprises on the tax-paying enterprise, and $E(v_j)$ denotes the final evaluation of node j.
The value range of β in the tax credit association evaluation is:
$\beta \in \{0.1, 0.3, 0.5, 0.7, 0.9\}$
In the experiments, the threshold is set to 0.6: $E(v_j) \le 0.6$ indicates that the enterprise's tax credit is normal, and $E(v_j) > 0.6$ indicates that it is abnormal.
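The final scoring step can be sketched as below. The formula for $E_o$ is available only as an image in this text, so the sketch takes $E_o$ as an input rather than guessing its form; the function names and the sample numbers are assumptions made for illustration.

```python
def influence_ratio(flow_ij, c_vi):
    """In_ij = flow_ij / C(v_i); C(v_i) must be positive."""
    if c_vi <= 0:
        raise ValueError("initial tax credit score must be positive")
    return flow_ij / c_vi


def self_score(beta, c_vj):
    """E_s = beta * C(v_j)."""
    return beta * c_vj


def final_evaluation(e_s, e_o, threshold=0.6):
    """E(v_j) = E_s + E_o, labelled 'normal' when E(v_j) <= threshold
    and 'abnormal' otherwise."""
    e = e_s + e_o
    return e, ("normal" if e <= threshold else "abnormal")


# Hypothetical inputs: beta = 0.5, C(v_j) = 0.7, and an externally computed E_o.
e_s = self_score(0.5, 0.7)
score, label = final_evaluation(e_s, e_o=0.2)
```

Keeping $E_o$ as a parameter makes it possible to plug in the actual negative-influence formula once it is available in readable form.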

Claims (2)

1. A tax-paying enterprise credit evaluation method based on generalized maximum flow, characterized in that: first, the taxpayers' explicit historical tax information and social relations are represented as a network, implicit social relations among taxpayers are mined from the constructed network to complete the modeling of interactions among associated enterprises, a taxpayer global network is constructed through the discovery of explicit social relations and the mining of implicit social relations among taxpayers, and the weights of the relation edges among taxpayers are characterized; second, the influence transfer process of tax credit is modeled by analogy with the maximum flow problem in network flows, the generalized maximum flow is used to model influence path selection, and an attenuation function is used to model the decay of influence as the path grows, yielding the association evaluation of the enterprise; finally, taking into account both the tax-paying enterprise itself and its interactions with associated enterprises, a normalized fusion method is used to fuse the individual evaluation and the association evaluation of the taxpayer into a comprehensive tax credit evaluation result; the method specifically comprises the following implementation steps:
1) discovery of social relationships
(1) Discovery of explicit social relationships
Based on the directed graph theory, natural persons and tax paying enterprises involved in the data are represented as nodes in the explicit social network, four relations of investment relation, stockholder relation, legal representative relation and transaction relation involved in the data are represented as edges in the explicit social network, and the obtained explicit social network of the taxpayer is represented as follows:
TESN=(V,E,VAttr,EAttr)
where $V = \{v_i \mid i = 1, 2, \ldots, N\}$ denotes the node set, $V = P \cup C$, P denotes the set of natural persons, and C the set of tax-paying enterprises; $E = \{e_{ij} \mid i, j = 1, \ldots, N\}$ denotes the edges present in the TESN, and $e_{ij} = (v_i, v_j)$ indicates that an edge exists from the i-th node to the j-th node; VAttr comprises the node category $V_{Catg}$ and $V_{Ind}$: $V_{Catg}$ indicates whether the node is a natural person or a tax-paying enterprise; $V_{Ind}$ comprises Name, the name of the tax-paying enterprise or of the natural person, and company score, the initial tax credit score; EAttr comprises the edge category $E_{Catg}$ and the edge weight W; $E_{Catg} \in \{IR, SR, LR, TR\}$, where IR denotes the investment relation, SR the shareholder relation, LR the legal representative relation, and TR the transaction relation; finally, a heterogeneous taxpayer explicit social network comprising tax-paying enterprises and natural persons is obtained through bidirectional association benefit mining;
(2) mining of implicit social relationships
On the basis of the taxpayer explicit social network, the implicit associations among tax-paying enterprises are mined in order to introduce the board-of-directors interlocking relation; a control relationship chain T is defined, satisfying:
$T = \{(p, c_1, c_2, \ldots, c_n, c) \mid p \in P;\ c_1, c_2, \ldots, c_n, c \in C;\ (p, c_1), (c_1, c_2), \ldots, (c_n, c) \in E\}$
the control relationship chain T denotes a chain of control relations that starts at a natural person and ends at a tax-paying enterprise, excluding transaction relations; p denotes the starting point and c the end point; P denotes all natural persons, C the tax-paying enterprises, and (p, c) an edge from a natural person to an enterprise; finally, a natural person-tax enterprise bipartite graph PCBN = (P, C, E) is formed; P and C are the two vertex sets of the PCBN, natural persons and tax-paying enterprises respectively; E is the set of edges from natural persons to tax-paying enterprises; after reduction, $N(c) = \{p \mid p \in P, (p, c) \in E\}$ is taken as the controller set of tax-paying enterprise c;
and finally, converting the PCBN into a non-bipartite PCBNC of the tax enterprise, wherein:
Figure FDA0003104332200000023
when the intersection exists between the sets of controllers between any two tax-paying enterprises, a bidirectional board-to-board interlocking relationship is added to the two enterprises, and the TESN is converted into a taxpayer global network, which is expressed as:
TGN=(C,Ec,VAttr,EAttrc)
where $E_c$ denotes the new edge set; $EAttr_c$ comprises the edge category $E_{CatgN}$ and the edge weight W; $E_{CatgN} \in \{IR, SR, TR, CR\}$, where IR denotes the investment relation, SR the shareholder relation, TR the transaction relation, and CR the board-of-directors interlocking relation; finally, a homogeneous taxpayer global network containing only tax-paying enterprises is obtained;
2) relational edge weight characterization
(1) Calculating the forward and backward influence values of the relational edge
Interactions occur between tax-paying enterprises, and the degree of influence should reflect both the influence between the enterprises and the ratio of the amount involved to the total amount; the influence of investor i on investee j is defined as the forward influence, and the influence of investee j on investor i as the reverse influence, specifically:
Figure FDA0003104332200000021
where $TR_{ij}$ denotes the transaction proportion, $ir_{ij}$ denotes the amount invested by investor i in investee j,
Figure FDA0003104332200000022
denotes the total amount invested by investor i in all of its investees; the investment, shareholder, and board-of-directors interlocking relations (IR, SR, and CR) are handled in the same way;
(2) fusion of multi-source edge relationships
To make combined use of the four relations, Dempster-Shafer evidence theory is adopted to solve the multi-valued mapping problem, so four m functions are defined for the tax credit transfer condition, corresponding respectively to the weight ratios of the investment relation, the shareholder relation, the transaction relation, and the board-of-directors interlocking relation; the normalization constant is:
Figure FDA0003104332200000031
where $m_i(A_i)$ denotes the degree of belief assigned by the edge-relation evidence, and the synthesis rule for the four relations is expressed as:
Figure FDA0003104332200000032
finally, converting the four-dimensional edge weight attribute into a one-dimensional probability attribute, wherein the source point tax paying enterprise expressing the relationship edge can influence the end point tax paying enterprise according to the probability;
3) calculating associated evaluation of tax paying enterprise based on generalized maximum flow delivery influence
Step1 characterization of network attenuation
To characterize the decay of tax credit influence as the path grows, an attenuation function leak1(x) = 0.2 is introduced to model the proportional decay of the flow each time it passes through an intermediate node $v \in V \setminus \{src, dst\}$, where src and dst denote the source node and the target node, V denotes the set of nodes traversed from the source node to the target node, and x is the distance from the current intermediate node to the source node; considering the computational cost, and that the influence of associated enterprises on the tax credit of a tax-paying enterprise gradually weakens, even to 0, as the transmission path grows, only enterprises strongly associated with the tax-paying enterprise are selected to transmit tax credit influence, that is, taking each tax-paying enterprise as a center, all associated enterprise nodes and edge relations within a path length of 3 steps are extracted to form a taxpayer maximum flow subgraph TMFS, formalized as:
TMFS=(Vid,MFGraph)
MaximumFlowGraph=(MFVertex,MFEdge)
where Vid is the unique identifier of the tax-paying enterprise and MFGraph is the taxpayer maximum flow subgraph of that enterprise; MFVertex denotes the nodes contained in the maximum flow subgraph, in the format (Vid, company score, Distance, Capacity), where Distance is the distance between the node and the source node and Capacity is the amount of flow the node can produce; MFEdge denotes the edges contained in the maximum flow subgraph, in the format (SrcMFVertex, DstMFVertex, weight), where SrcMFVertex is the source tax-paying enterprise, DstMFVertex is the destination tax-paying enterprise, and weight is the fused weight obtained by D-S evidence fusion, that is, the maximum capacity limit of the edge;
to balance information loss against information reuse, the idea of the Edmonds-Karp maximum flow algorithm is borrowed, and the shortest path is repeatedly searched for as an augmenting path using breadth-first search; to simulate information attenuation during propagation reasonably, a uniform attenuation function and non-uniform attenuation functions are introduced according to the attenuation ratio;
Uniform attenuation: the attenuation ratio does not vary with x
$leak1(x) = 0.2$
Non-uniform attenuation: the attenuation ratio varies with x
$leak2(x) = 1 - \cos(0.09x)$
$leak3(x) = e^{-x}$
$leak4(x) = (x+1)^{-4}$
Step2 selection of Path
The flow of an augmenting path is acquired as follows: the initial tax credit score of the source tax-paying enterprise is src.company score, and fs is the total flow that the source point can send out, that is, the capacity of the source point; for each edge on the selected shortest path, the flow must be less than fs, and when the flow passes through an intermediate node it suffers a loss, the loss proportion being given by the loss function; the path used and the flow used on each edge of the path are recorded until the flow reaches the destination tax-paying enterprise;
because noise exists in the information transmission process, a noise fitting method is introduced when the information passes through an intermediate node: for the next node into which the information flow is passed, the flows on the remaining incoming paths of that node are recorded as $flow_{others}$; at the same time, the residual capacity of the transfer edge is updated to c(i, j) - usedflow, where c(i, j) denotes the current capacity of the forward edge, i its start node, and j its end node; finally, the uniform attenuation function is updated as $leak1(x) = 0.2 \times (\sum flow_{others} + c(i, j) - usedflow)$, giving a new uniform attenuation function for the next node to be passed; when the computed value is less than 0.2, the current attenuation function leak1(x) = 0.2 is kept and the information is transmitted normally;
Step3 correction of the residual network
For each edge from the source point src to the destination point dst on the obtained augmenting path, the capacity of the forward edge is corrected to c(i, j) - usedflow and a reverse edge e(j, i, usedflow) is added, and fs is corrected to fs = fs - (src, j).usedflow, where (src, j) denotes the path from the source node to node j and (src, j).usedflow denotes the flow used on that path; augmenting paths are acquired and the residual network corrected iteratively, and the flows delivered to dst are summed to obtain the maximum degree of influence of the associated tax-paying enterprise on the tax-paying enterprise of interest;
4) Tax credit assessment
First, the normalized fusion proportion $In_{ij}$ of the tax credit influence of an associated enterprise on the tax-paying enterprise is defined:
$In_{ij} = flow_{ij} / C(v_i)$
$In_{ij}$ denotes the proportion of the maximum influence value, i ranges over all associated enterprise nodes other than j itself in the taxpayer maximum flow subgraph of tax-paying enterprise j, and $flow_{ij}$ is the maximum influence that enterprise i can deliver to enterprise j, obtained in Step 3; $C(v_i)$ is the initial tax credit score of tax-paying enterprise i; next, the positive score $E_s$ given by other enterprises to the tax-paying enterprise is defined:
$E_s = \beta \cdot C(v_j)$
β is the distribution percentage between the tax-paying enterprise of interest and its associated enterprises, with value range $\beta \in \{0.1, 0.3, 0.5, 0.7, 0.9\}$; then the negative influence score $E_o$ of other enterprises on the tax-paying enterprise is calculated:
Figure FDA0003104332200000051
where $1 - C(v_i)$ denotes the credit score of the associated enterprise with respect to the tax-paying enterprise,
Figure FDA0003104332200000052
denotes the comprehensive weight ratio score of the tax-paying enterprise with respect to the associated enterprise; finally, the fused score $E(v_j)$, representing the final evaluation of node j, is calculated:
$E(v_j) = E_s + E_o$
2. The tax-paying enterprise credit evaluation method based on generalized maximum flow as claimed in claim 1, wherein the threshold is 0.6: $E(v_j) \le 0.6$ indicates that the enterprise's tax credit is normal, and $E(v_j) > 0.6$ indicates that the enterprise's tax credit is abnormal.
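The extraction of the taxpayer maximum flow subgraph described in claim 1 (all associated nodes and edges within 3 steps of a central enterprise) can be sketched with a depth-limited breadth-first search. This is a hedged illustration only: the function name `max_flow_subgraph`, the edge-tuple layout, and the choice to treat associations as undirected when measuring distance are assumptions, not details confirmed by the claims.

```python
from collections import deque


def max_flow_subgraph(edges, center, max_steps=3):
    """Return (distance, kept_edges) for nodes within `max_steps` of `center`.

    `edges` is a list of (src, dst, weight) tuples. Distance is measured
    ignoring edge direction (an assumption made for this sketch).
    """
    neighbors = {}
    for s, d, _ in edges:
        neighbors.setdefault(s, set()).add(d)
        neighbors.setdefault(d, set()).add(s)
    distance = {center: 0}
    queue = deque([center])
    while queue:
        u = queue.popleft()
        if distance[u] == max_steps:
            continue                      # do not expand beyond the step limit
        for v in neighbors.get(u, ()):
            if v not in distance:
                distance[v] = distance[u] + 1
                queue.append(v)
    kept = [(s, d, w) for s, d, w in edges if s in distance and d in distance]
    return distance, kept


# Hypothetical chain a -> b -> c -> d -> e; "e" lies 4 steps from "a".
chain = [("a", "b", 0.5), ("b", "c", 0.4), ("c", "d", 0.3), ("d", "e", 0.2)]
distance, kept = max_flow_subgraph(chain, "a")
```

The returned distances correspond to the Distance field of MFVertex, and the kept edges to MFEdge.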
CN201910507040.8A 2019-06-12 2019-06-12 Tax enterprise credit evaluation method based on generalized maximum flow Active CN110378559B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910507040.8A CN110378559B (en) 2019-06-12 2019-06-12 Tax enterprise credit evaluation method based on generalized maximum flow

Publications (2)

Publication Number Publication Date
CN110378559A CN110378559A (en) 2019-10-25
CN110378559B true CN110378559B (en) 2021-08-13

Family

ID=68250126

Country Status (1)

Country Link
CN (1) CN110378559B (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant