CN110647590A - Target community data identification method and related device - Google Patents

Target community data identification method and related device

Info

Publication number
CN110647590A
Authority
CN
China
Prior art keywords
community
transaction
relationship
network
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910899829.2A
Other languages
Chinese (zh)
Inventor
黄志苹
陈鹏飞
段琴
王培勇
陈宏仁
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
SERVYOU SOFTWARE GROUP Co Ltd
Original Assignee
SERVYOU SOFTWARE GROUP Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by SERVYOU SOFTWARE GROUP Co Ltd filed Critical SERVYOU SOFTWARE GROUP Co Ltd
Priority to CN201910899829.2A priority Critical patent/CN110647590A/en
Publication of CN110647590A publication Critical patent/CN110647590A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28Databases characterised by their database models, e.g. relational or object models
    • G06F16/284Relational databases
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28Databases characterised by their database models, e.g. relational or object models
    • G06F16/284Relational databases
    • G06F16/285Clustering or classification
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/06Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063Operations research, analysis or management
    • G06Q10/0635Risk analysis of enterprise or organisation activities
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/04Trading; Exchange, e.g. stocks, commodities, derivatives or currency exchange
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/06Asset management; Financial planning or analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/12Accounting
    • G06Q40/123Tax preparation or submission

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Theoretical Computer Science (AREA)
  • Finance (AREA)
  • Accounting & Taxation (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • Development Economics (AREA)
  • General Physics & Mathematics (AREA)
  • Human Resources & Organizations (AREA)
  • Economics (AREA)
  • Strategic Management (AREA)
  • General Business, Economics & Management (AREA)
  • Marketing (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Technology Law (AREA)
  • General Engineering & Computer Science (AREA)
  • Game Theory and Decision Science (AREA)
  • Data Mining & Analysis (AREA)
  • Operations Research (AREA)
  • Educational Administration (AREA)
  • Quality & Reliability (AREA)
  • Tourism & Hospitality (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application discloses a method for identifying target community data, comprising: constructing a transaction relationship network from acquired transaction relationship data of a plurality of objects; performing community classification on the transaction relationship network with a modularity-based graph clustering algorithm to obtain a community classification result; and determining analysis dimensions according to the attributes of the target community and analyzing the community classification result along those dimensions to obtain target community data. Because the communities are first classified by a modularity-based graph clustering algorithm and the target community data are then screened from the classification result, the precision and accuracy of the target community data search are improved, and so is the target community identification effect. The application also discloses an apparatus for identifying target community data, a server and a computer-readable storage medium, which have the same beneficial effects.

Description

Target community data identification method and related device
Technical Field
The present application relates to the field of computer technologies, and in particular, to a method for identifying target community data, an apparatus for identifying target community data, a server, and a computer-readable storage medium.
Background
With the continuous development of information technology, computer data processing is often used to check data. For example, in the field of tax risk auditing, risks present in data are identified through data identification techniques. Existing approaches, from single-point false-invoicing ("virtual open") identification and prediction to purchase-and-sale ledger diagnosis based on goods information, analyze a single enterprise. For false-invoicing risk communities, that is, false-invoicing gangs, few analysis methods exist, and a large amount of manpower and material resources is usually needed to investigate and sort through business data in order to pin down the enterprises in a gang.
In the prior art, a machine learning model is usually adopted to analyze relationship data so as to determine a target community to be searched. However, the currently used community classification algorithm has low classification precision and accuracy, so that the effect of searching the target community data is not good.
Therefore, how to improve the effect of target community data identification is a key issue of attention by those skilled in the art.
Disclosure of Invention
The application aims to provide a target community data identification method, a target community data identification device, a server and a computer readable storage medium.
In order to solve the above technical problem, the present application provides a method for identifying target community data, including:
carrying out transaction relation network construction processing on the acquired transaction relation data of the plurality of objects to obtain a transaction relation network;
carrying out community classification on the transaction relationship network based on a graph clustering algorithm of modularity to obtain a community classification result;
and determining analysis dimensions according to the attributes of the target community, and analyzing the community classification result through the analysis dimensions to obtain target community data.
Optionally, performing transaction relationship network construction processing on the acquired transaction relationship data of the plurality of objects to obtain a transaction relationship network includes:
constructing a transaction topological network according to the acquired transaction relation data of the plurality of objects;
performing transaction edge relationship attribute calculation and enterprise point entity attribute calculation on the transaction topological network to obtain a graph calculation result;
and carrying out noise reduction processing on the transaction topological network according to the graph calculation result to obtain the transaction relation network.
Optionally, the step of carrying out community classification on the transaction relationship network based on a graph clustering algorithm of modularity to obtain a community classification result includes:
screening the transaction relationship network according to a connected community algorithm to obtain a giant community in the transaction relationship network;
and carrying out community classification on the giant community according to the graph clustering algorithm based on the modularity to obtain a community classification result.
Optionally, determining an analysis dimension according to an attribute of a target community, and analyzing the community classification result through the analysis dimension to obtain target community data, including:
when the type of the target community is a virtual partnership community, analyzing the community classification result through multiple dimensions to obtain target community data; the multiple dimensions are an arbitrary relationship dimension, a transaction relationship dimension, an address relationship dimension, an investment relationship dimension and a property relationship dimension.
The present application also provides an apparatus for identifying target community data, including:
the transaction topological network module is used for performing transaction relationship network construction processing on the acquired transaction relationship data of the plurality of objects to obtain a transaction relationship network;
the community classification module is used for carrying out community classification on the transaction relationship network based on a graph clustering algorithm of modularity to obtain a community classification result;
and the target community analysis module is used for determining analysis dimensionality according to the attribute of the target community and analyzing the community classification result through the analysis dimensionality to obtain target community data.
Optionally, the transaction topology network module includes:
the topological network construction unit is used for constructing a transaction topological network according to the acquired transaction relation data of the plurality of objects;
the graph calculation unit is used for performing transaction edge relationship attribute calculation and enterprise point entity attribute calculation on the transaction topological network to obtain a graph calculation result;
and the denoising processing unit is used for denoising the transaction topological network according to the graph calculation result to obtain the transaction relation network.
Optionally, the community classification module includes:
the giant community acquisition unit is used for screening the transaction relationship network according to a connected community algorithm to obtain giant communities in the transaction relationship network;
and the community subdivision unit is used for carrying out community classification on the giant community according to the graph clustering algorithm based on the modularity to obtain the community classification result.
Optionally, the target community analysis module is specifically configured to, when the type of the target community is a virtual partnership, analyze the community classification result through multiple dimensions to obtain the target community data; the multiple dimensions are an arbitrary relationship dimension, a transaction relationship dimension, an address relationship dimension, an investment relationship dimension and a property relationship dimension.
The present application further provides a server, comprising:
a memory for storing a computer program;
a processor for implementing the steps of the identification method as described above when executing the computer program.
The present application also provides a computer-readable storage medium having stored thereon a computer program which, when being executed by a processor, carries out the steps of the identification method as described above.
The application provides a method for identifying target community data, which comprises the following steps: carrying out transaction relation network construction processing on the acquired transaction relation data of the plurality of objects to obtain a transaction relation network; carrying out community classification on the transaction relationship network based on a graph clustering algorithm of modularity to obtain a community classification result; and determining analysis dimensions according to the attributes of the target community, and analyzing the community classification result through the analysis dimensions to obtain target community data.
The method comprises the steps of firstly obtaining transaction relation data for a plurality of objects, then conducting transaction relation network construction processing according to the transaction relation data to obtain a transaction relation network, then conducting community classification through a graph clustering algorithm of modularity to obtain a community classification result, and finally classifying the community classification result according to different analysis dimensions to obtain target community data.
The present application further provides an apparatus for identifying target community data, a server and a computer-readable storage medium, which have the above beneficial effects and are not described herein again.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings needed to be used in the description of the embodiments or the prior art will be briefly introduced below, it is obvious that the drawings in the following description are only embodiments of the present application, and for those skilled in the art, other drawings can be obtained according to the provided drawings without creative efforts.
FIG. 1 is a flowchart illustrating a method for identifying target community data according to an embodiment of the present disclosure;
FIG. 2 is a schematic structural diagram of an apparatus for identifying target community data according to an embodiment of the present disclosure.
Detailed Description
The core of the application is to provide a target community data identification method, a target community data identification apparatus, a server and a computer-readable storage medium. Community classification is performed on a transaction relationship network with a modularity-based graph clustering algorithm to obtain a community classification result, and the target community data are then screened out of that result, which improves the precision and accuracy of the target community data search and therefore the target community identification effect.
In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some embodiments of the present application, but not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
In the prior art, a machine learning model is usually adopted to analyze relationship data so as to determine a target community to be searched. However, the currently used community classification algorithm has low classification precision and accuracy, so that the effect of searching the target community data is not good.
Therefore, the application provides a method for identifying target community data, which includes the steps of firstly obtaining transaction relation data for a plurality of objects, then conducting transaction relation network construction processing according to the transaction relation data to obtain a transaction relation network, then conducting community classification through a modularity graph clustering algorithm to obtain a community classification result, and finally classifying the community classification result according to different analysis dimensions to obtain the target community data.
Referring to fig. 1, fig. 1 is a flowchart illustrating a method for identifying target community data according to an embodiment of the present disclosure.
In this embodiment, the method may include:
S101, performing transaction relationship network construction processing on the acquired transaction relationship data of the plurality of objects to obtain a transaction relationship network;
This step establishes a transaction relationship network from the acquired transaction relationship data of the objects. The plurality of objects are the basic objects analyzed in the transaction relationship network, and the network is built from them. This embodiment mainly identifies a target community formed by several of these objects, that is, it identifies some of the objects as the target community. To do so, this step first constructs an overall transaction relationship network from all the objects.
Specifically, this step may use any transaction relationship network construction method provided in the prior art, or it may use the following construction method to improve the accuracy and precision of the constructed transaction relationship network.
Optionally, the construction method in this step may include:
constructing a transaction topological network according to the acquired transaction relation data of the plurality of objects;
performing transaction edge relationship attribute calculation and enterprise point entity attribute calculation on the transaction topological network to obtain a graph calculation result;
and carrying out noise reduction processing on the transaction topological network according to the graph calculation result to obtain a transaction relation network.
In this alternative, a topological network is constructed first, attribute calculation is then performed on the topological network to obtain a graph calculation result, and noise reduction is finally applied to the transaction topological network according to the graph calculation result to obtain the transaction relationship network. The transaction edge relationship attributes include the sum of the transaction amounts, the ratio of the transaction amount to the seller's total sales amount, the ratio of the transaction amount to the buyer's total purchase amount, and the arithmetic mean of these two ratios. The enterprise point entity attributes include the total sales amount, the total purchase amount and the node degree centrality (out-degree, in-degree and degree).
To ensure the effect of the algorithm, noise reduction is applied to the relationship data in the transaction topological network.
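The attribute calculations described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the record layout `(seller, buyer, amount)`, the function names and the sample data are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical invoice records: (seller, buyer, amount). Illustrative data only.
transactions = [
    ("A", "B", 100.0),
    ("A", "C", 300.0),
    ("B", "C", 50.0),
    ("C", "D", 200.0),
    ("A", "B", 100.0),
]

def build_edges(records):
    """Aggregate records into directed edges seller -> buyer with summed amounts."""
    edges = defaultdict(float)
    for seller, buyer, amount in records:
        edges[(seller, buyer)] += amount
    return dict(edges)

def node_attributes(edges):
    """Total sales, total purchases, out-degree, in-degree and degree per enterprise."""
    attrs = defaultdict(
        lambda: {"sales": 0.0, "purchases": 0.0, "out_degree": 0, "in_degree": 0}
    )
    for (seller, buyer), amount in edges.items():
        attrs[seller]["sales"] += amount
        attrs[seller]["out_degree"] += 1
        attrs[buyer]["purchases"] += amount
        attrs[buyer]["in_degree"] += 1
    for a in attrs.values():
        a["degree"] = a["out_degree"] + a["in_degree"]
    return dict(attrs)

def edge_attributes(edges, nodes):
    """Share of seller sales, share of buyer purchases, and their arithmetic mean."""
    result = {}
    for (seller, buyer), amount in edges.items():
        sales_share = amount / nodes[seller]["sales"]
        purchase_share = amount / nodes[buyer]["purchases"]
        result[(seller, buyer)] = {
            "amount": amount,
            "sales_share": sales_share,
            "purchase_share": purchase_share,
            "mean_share": (sales_share + purchase_share) / 2,
        }
    return result

edges = build_edges(transactions)
nodes = node_attributes(edges)
edge_attrs = edge_attributes(edges, nodes)
```

In this toy data, the edge from A to B carries 40% of A's sales and all of B's purchases, so its mean share is 0.7.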
S102, carrying out community classification on the transaction relationship network based on a graph clustering algorithm of modularity to obtain a community classification result;
On the basis of S101, this step performs community classification on the transaction relationship network, that is, it classifies the transaction communities implied by the relationships among the objects in the network, so as to obtain a community classification result. In addition, the transaction relationship network may contain communities that have no transactions with other communities. These are called island communities and need to be removed so that they do not affect the community classification result.
Therefore, the following steps can be selected as the way of classifying the communities in the present embodiment. Specifically, the method may include:
screening the transaction relationship network according to a connected community algorithm to obtain a giant community in the transaction relationship network;
and carrying out community classification on the giant community according to a graph clustering algorithm based on modularity to obtain a community classification result.
The connected community algorithm is divided into a weakly connected community algorithm and a strongly connected community algorithm, and is generally used as a preprocessing algorithm for graph clustering. Weak connectivity means that every node can reach every other node through a path when the graph is treated as undirected; strong connectivity means that every node of a directed graph can reach every other node through a directed path (so that the nodes lie on directed cycles). Once the identified island communities are removed, the giant community remains.
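A minimal sketch of this preprocessing step is shown below, assuming the network is given as a list of directed `(seller, buyer)` pairs. The function name and sample edges are hypothetical; edge direction is ignored, which is what weak connectivity requires.

```python
from collections import defaultdict

def weakly_connected_components(edges):
    """Return the weakly connected components as a list of node sets."""
    adjacency = defaultdict(set)
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)  # ignore direction
    seen, components = set(), []
    for start in adjacency:
        if start in seen:
            continue
        stack, component = [start], set()
        seen.add(start)
        while stack:
            node = stack.pop()
            component.add(node)
            for nxt in adjacency[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        components.append(component)
    return components

# Illustrative network: one larger component and one island (X, Y).
edges = [("A", "B"), ("B", "C"), ("C", "A"), ("C", "D"), ("X", "Y")]
components = weakly_connected_components(edges)
giant = max(components, key=len)
giant_edges = [(u, v) for u, v in edges if u in giant and v in giant]
```

Only `giant_edges` would be passed on to the modularity-based clustering; the island component is dropped.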
S103, determining analysis dimensions according to the attributes of the target community, and analyzing the community classification result through the analysis dimensions to obtain target community data.
On the basis of S102, this step determines the analysis dimensions in advance and then analyzes the obtained community classification result along those dimensions to obtain the target community data.
According to different analysis dimensions, different analysis operations can be performed on the community classification result to obtain target community data in different dimensions. The analysis dimensions include, but are not limited to, an arbitrary relationship dimension, a transaction relationship dimension, an address relationship dimension, an investment relationship dimension, a property relationship dimension, a false-invoicing ("virtual open") diagnosis goods ledger dimension, a false-invoicing pattern analysis dimension, a cross-region data analysis dimension, and other business label dimensions.
Therefore, when the target community being searched for is a false-invoicing gang (a virtual partnership community), this step may include:
when the type of the target community is a virtual partnership community, analyzing the community classification result through multiple dimensions to obtain target community data; the multiple dimensions are an arbitrary relationship dimension, a transaction relationship dimension, an address relationship dimension, an investment relationship dimension and a property relationship dimension.
In summary, in the embodiment, the transaction relationship data for a plurality of objects is first obtained, then the transaction relationship network is constructed according to the transaction relationship data to obtain the transaction relationship network, then the modularity graph clustering algorithm is adopted to classify the communities to obtain the community classification result, and finally the community classification result is classified according to different analysis dimensions to obtain the target community data.
The following further describes a method for identifying target community data provided by the present application by using another specific embodiment.
In this embodiment, the method may include:
Step one: construct the transaction topological network.
A false-invoicing ("virtual open") transaction relationship network is constructed. A batch of seed enterprises known to have issued false invoices in a certain area is obtained, the enterprises that trade with them are traced through their transaction relationships, and a transaction relationship network of three layers of upstream and downstream enterprise nodes, with 7 enterprise nodes, is constructed.
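The relationship tracking from seed enterprises can be sketched as a breadth-first expansion. This sketch assumes that "three layers" means three hops from a seed in either direction (upstream or downstream); the function name and sample edges are hypothetical.

```python
from collections import defaultdict, deque

def expand_from_seeds(edges, seeds, max_hops=3):
    """Return {enterprise: hop distance} for nodes within max_hops of any seed.

    edges are directed (seller, buyer) pairs; both directions are followed so
    that upstream and downstream partners are collected.
    """
    neighbours = defaultdict(set)
    for seller, buyer in edges:
        neighbours[seller].add(buyer)   # downstream partner
        neighbours[buyer].add(seller)   # upstream partner
    hops = {seed: 0 for seed in seeds}
    queue = deque(seeds)
    while queue:
        node = queue.popleft()
        if hops[node] == max_hops:
            continue  # do not expand beyond the last layer
        for nxt in neighbours[node]:
            if nxt not in hops:
                hops[nxt] = hops[node] + 1
                queue.append(nxt)
    return hops

# Illustrative chain: U -> S -> A -> B -> C -> D, with S as the seed.
edges = [("S", "A"), ("A", "B"), ("B", "C"), ("C", "D"), ("U", "S")]
hops = expand_from_seeds(edges, seeds=["S"], max_hops=3)
```

Enterprise D lies four hops from the seed and is therefore left out of the constructed network.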
Step two: graph calculation, including transaction edge relationship attribute calculation and enterprise point entity attribute calculation.
(1) Transaction relationship attribute calculation. This includes the sum of the transaction amounts, the ratio of the transaction amount to the seller's total sales amount, the ratio of the transaction amount to the buyer's total purchase amount, and the arithmetic mean of these two ratios.
(2) Enterprise point entity attribute calculation. This includes the total sales amount, the total purchase amount and the node degree centrality (out-degree, in-degree and degree).
Step three: data denoising. To ensure the effect of the algorithm, initial noise reduction needs to be performed on the relationship data in advance. For example, transaction edges with a small transaction amount, relationship edges whose purchase-sale ratio is below the average, and relationship edges of isolated nodes (degree 1) are preprocessed.
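A minimal sketch of such a denoising pass is given below. It illustrates only two of the rules (a minimum transaction amount and the removal of edges attached to degree-1 nodes); the threshold value, the function name and the sample data are assumptions, since the text does not give concrete values.

```python
from collections import Counter

def denoise(edges, min_amount):
    """Filter a {(seller, buyer): amount} edge map.

    1. Drop edges whose amount is below min_amount (hypothetical threshold).
    2. Drop edges that touch a node whose remaining degree is 1.
    """
    kept = {edge: amount for edge, amount in edges.items() if amount >= min_amount}
    degree = Counter()
    for seller, buyer in kept:
        degree[seller] += 1
        degree[buyer] += 1
    return {
        (seller, buyer): amount
        for (seller, buyer), amount in kept.items()
        if degree[seller] > 1 and degree[buyer] > 1
    }

# Illustrative data: A -> E is a tiny transaction, D hangs off C with degree 1.
edges = {
    ("A", "B"): 500.0,
    ("B", "C"): 400.0,
    ("C", "A"): 300.0,
    ("C", "D"): 250.0,
    ("A", "E"): 5.0,
}
cleaned = denoise(edges, min_amount=10.0)
```

The pass is applied once here; a real pipeline might repeat it until no degree-1 nodes remain, which is a design choice the text leaves open.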
Step four: community classification. The community classification algorithm is an algorithm module integrating four algorithms: a connected community algorithm, the Louvain modularity clustering algorithm, the PageRank algorithm and a betweenness ("intermediate") centrality algorithm.
(1) Connected community algorithm: it is divided into a weakly connected community algorithm and a strongly connected community algorithm, and is generally used as a preprocessing algorithm for graph clustering. In a weakly connected community, every node can reach every other node through a path when the graph is treated as undirected; in a strongly connected community, every node of the directed graph can reach every other node through a directed path (so that the nodes lie on directed cycles). Here only the weakly connected community algorithm is used for preprocessing. It generally yields one giant community and a number of small island communities, and the subsequent analysis focuses on the giant community.
(2) The Fast Unfolding algorithm, also called the Louvain algorithm, is a typical modularity-based graph clustering algorithm. Modularity can be understood as the weight of the edges inside a community minus the sum of the weights of all edges attached to the nodes of the community. Community division aims to make connections inside each community dense and connections between communities sparse. Modularity measures the quality of a division: the larger the modularity, the better the community division.
The Fast Unfolding algorithm is an iterative, modularity-based community division algorithm. It keeps re-dividing the communities so that the modularity of the whole divided network keeps increasing.
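The text describes modularity only informally. For reference, the definition commonly used with the Louvain method for an undirected weighted graph is given below; it is not taken from the patent, and the symbols are introduced here for illustration.

```latex
Q \;=\; \frac{1}{2m}\sum_{i,j}\left[A_{ij} - \frac{k_i k_j}{2m}\right]\delta(c_i, c_j)
  \;=\; \sum_{c}\left[\frac{L_c}{m} - \left(\frac{d_c}{2m}\right)^{2}\right]
```

Here $A_{ij}$ is the weight of the edge between nodes $i$ and $j$, $k_i$ is the total weight of the edges attached to node $i$, $m$ is the total edge weight of the graph, $c_i$ is the community of node $i$, $L_c$ is the total weight of the edges inside community $c$, and $d_c$ is the sum of $k_i$ over the nodes of $c$. The first term rewards internal edges and the second subtracts the share expected from the total weight attached to the community's nodes.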
The method mainly comprises two stages:
The first stage is called Modularity Optimization. It moves each node into the community of one of its neighbouring nodes so that the modularity value keeps increasing. The second stage is called Community Aggregation. It collapses each community produced in the first stage into a single node, that is, it rebuilds the network according to the community structure generated in the previous stage. The two stages are repeated until the structure of the network no longer changes.
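The first stage can be sketched as follows. This is a simplified illustration under stated assumptions, not the patent's modified algorithm: the graph is treated as undirected, modularity is recomputed from scratch for every trial move (clear but slow), and the Community Aggregation stage is omitted. All function names and the sample graph are hypothetical.

```python
from collections import defaultdict

def modularity(weights, community):
    """Modularity of an undirected weighted graph.

    weights: {(u, v): w}, each undirected edge listed once, no self-loops.
    community: {node: label}.
    """
    total = sum(weights.values())            # total edge weight m
    strength = defaultdict(float)            # weight attached to each node
    internal = defaultdict(float)            # weight inside each community
    for (u, v), w in weights.items():
        strength[u] += w
        strength[v] += w
        if community[u] == community[v]:
            internal[community[u]] += w
    community_strength = defaultdict(float)  # sum of node strengths per community
    for node, k in strength.items():
        community_strength[community[node]] += k
    return sum(
        internal[c] / total - (community_strength[c] / (2 * total)) ** 2
        for c in community_strength
    )

def local_moving(weights):
    """Move nodes to a neighbour's community while modularity strictly increases."""
    nodes = sorted({n for edge in weights for n in edge})
    neighbours = defaultdict(set)
    for u, v in weights:
        neighbours[u].add(v)
        neighbours[v].add(u)
    community = {n: n for n in nodes}        # start with singleton communities
    improved = True
    while improved:
        improved = False
        for node in nodes:
            best_label = community[node]
            best_q = modularity(weights, community)
            for label in sorted({community[n] for n in neighbours[node]}):
                trial = dict(community)
                trial[node] = label
                q = modularity(weights, trial)
                if q > best_q + 1e-12:
                    best_label, best_q = label, q
            if best_label != community[node]:
                community[node] = best_label
                improved = True
    return community

# Two triangles joined by the bridge c-d, all weights 1 (illustrative data).
weights = {
    ("a", "b"): 1.0, ("b", "c"): 1.0, ("a", "c"): 1.0,
    ("c", "d"): 1.0,
    ("d", "e"): 1.0, ("e", "f"): 1.0, ("d", "f"): 1.0,
}
community = local_moving(weights)
final_q = modularity(weights, community)
```

On this small graph the local moves recover the two triangles as separate communities. A full Louvain implementation would then collapse each community into one node and repeat.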
The original Louvain algorithm applies to a directed unweighted graph; this embodiment adapts it to a directed weighted graph. The weight is set to a comprehensive calculation value of all factors, namely the arithmetic mean of the ratio of the transaction amount to the seller's total sales amount and the ratio of the transaction amount to the buyer's total purchase amount. Factors such as false invoicing by first-level and second-level upstream and downstream enterprises and purchase-sale anomalies are also taken into account, and the weight is adjusted and optimized accordingly.
The formula of the comprehensive calculation value is as follows:
weights = ∑(x1 + x2 + x3 + x4 + x5)
where x1 is the ratio of the transaction amount to the seller's total sales amount, x2 is the ratio of the transaction amount to the buyer's total purchase amount, x3 is a parameter for false invoicing by first-level upstream and downstream enterprises, x4 is a parameter for false invoicing by second-level upstream and downstream enterprises, and x5 is a parameter for purchase-sale anomalies (for example, sales without corresponding purchases).
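As a worked example of the formula, the sketch below reads it as a plain sum of the five factors for one edge. How x3, x4 and x5 are encoded is not stated in the text; the 0/1 flags used here are purely an assumption for illustration, as are the function and parameter names.

```python
def edge_weight(sales_share, purchase_share, first_level, second_level, anomaly):
    """Sum the five factors x1..x5 for one transaction edge.

    sales_share, purchase_share: ratios in [0, 1] (x1, x2).
    first_level, second_level, anomaly: hypothetical 0/1 flags (x3, x4, x5).
    """
    return sales_share + purchase_share + first_level + second_level + anomaly

# Edge carrying 40% of the seller's sales and all of the buyer's purchases,
# with a flagged first-level partner and a flagged purchase-sale anomaly.
weight = edge_weight(0.4, 1.0, first_level=1, second_level=0, anomaly=1)
```

With these inputs the weight is 0.4 + 1.0 + 1 + 0 + 1 = 3.4. In practice the factors would likely be scaled relative to one another, which the text refers to as adjusting and optimizing the weight.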
The output of the Louvain algorithm is the community classification result. To analyze this result, the PageRank algorithm and the betweenness ("intermediate") centrality algorithm also need to be run on the connected community to obtain the corresponding values.
(3) The PageRank algorithm is a technique used by search engines to measure the importance of web pages: it computes a PR value for each page from the hyperlinks between pages. In a transaction community, the PR value gives the importance of each enterprise in the community. The larger the PR value, the greater the risk of that community node, and this can generally be used to detect invoice accumulation ("vote accumulation") behavior. Before the calculation, however, noise reduction must be applied to nodes with a very large number of associations, such as bookkeeping and tax agency nodes.
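A minimal power-iteration sketch of PageRank on a directed transaction graph follows. The damping factor 0.85 and the fixed iteration count are conventional choices rather than values from the text, and the sample edges are hypothetical.

```python
def pagerank(edges, damping=0.85, iterations=100):
    """Power-iteration PageRank on directed (source, target) pairs."""
    nodes = sorted({n for edge in edges for n in edge})
    out_links = {n: [] for n in nodes}
    for src, dst in edges:
        out_links[src].append(dst)
    count = len(nodes)
    rank = {n: 1.0 / count for n in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing edges is spread over all nodes.
        dangling = sum(rank[n] for n in nodes if not out_links[n])
        new_rank = {
            n: (1 - damping) / count + damping * dangling / count for n in nodes
        }
        for src in nodes:
            if out_links[src]:
                share = damping * rank[src] / len(out_links[src])
                for dst in out_links[src]:
                    new_rank[dst] += share
        rank = new_rank
    return rank

# Illustrative graph: three sellers invoice H, and H invoices A.
edges = [("A", "H"), ("B", "H"), ("C", "H"), ("H", "A")]
rank = pagerank(edges)
top = max(rank, key=rank.get)
```

Enterprise H, which receives edges from three partners, ends up with the highest PR value in this toy graph.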
(4) The betweenness ("intermediate") centrality of a node is the number of times the node lies on the shortest path between two other nodes, acting as a bridge. The more often a node acts as such an intermediary, the higher its betweenness centrality. In the community analysis, the betweenness centrality of a node indicates how likely the node is to serve as a false-invoicing "channel" enterprise, and it can generally be used to detect invoice laundering ("ticket washing") behavior.
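Betweenness centrality can be computed with Brandes' algorithm. The sketch below handles an unweighted directed graph with unique edges and returns unnormalised scores; the function name and the sample graph are hypothetical.

```python
from collections import defaultdict, deque

def betweenness(edges):
    """Brandes' algorithm on unweighted directed (source, target) pairs."""
    nodes = sorted({n for edge in edges for n in edge})
    successors = defaultdict(list)
    for u, v in edges:
        successors[u].append(v)
    score = {n: 0.0 for n in nodes}
    for source in nodes:
        order = []                                # nodes in BFS visiting order
        predecessors = {n: [] for n in nodes}
        paths = {n: 0 for n in nodes}             # number of shortest paths
        paths[source] = 1
        distance = {source: 0}
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in successors[v]:
                if w not in distance:
                    distance[w] = distance[v] + 1
                    queue.append(w)
                if distance[w] == distance[v] + 1:
                    paths[w] += paths[v]
                    predecessors[w].append(v)
        dependency = {n: 0.0 for n in nodes}
        for w in reversed(order):                 # accumulate from the farthest node back
            for v in predecessors[w]:
                dependency[v] += paths[v] / paths[w] * (1 + dependency[w])
            if w != source:
                score[w] += dependency[w]
    return score

# Illustrative "channel" pattern: A and B sell to M, which sells to C and D.
edges = [("A", "M"), ("B", "M"), ("M", "C"), ("M", "D")]
score = betweenness(edges)
```

All four shortest paths between the outer enterprises pass through M, so M receives a score of 4 and every other node a score of 0.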
And fifthly, acquiring a community classification result, namely a community classification label obtained by the algorithm module.
And step six, performing multi-dimensional group analysis, namely performing business and rule analysis on the obtained community classification result from other multiple data dimensions.
(1) The employment relationship dimension, that is, the personnel associations among the enterprises in a classified community; enterprises inside a false-invoicing gang usually have strong personnel associations.
(2) The transaction relationship dimension, that is, the transaction community classification label together with the PR value, the centrality label, and so on of the transaction community.
(3) The address relationship dimension, that is, the address associations among the enterprises in a classified community; enterprises inside a false-invoicing gang usually have strong address associations.
(4) The investment relationship dimension, that is, the investment associations among the enterprises in a classified community, natural-person investors, legal-person investors, and so on; to secure the return of transaction proceeds, a false-invoicing enterprise usually has strong associations among its legal persons and investors.
(5) The property relationship dimension, that is, associations in the registration and use of vehicles, ships, real estate, and so on; enterprises and personnel in a false-invoicing gang usually hold property in common and share property-use records.
(6) The goods-ledger diagnosis dimension for false invoicing: starting from the purchase and sale information of goods, characteristic behavior data such as mismatched purchases and sales or sales without purchases are computed directly, the flow path, amount, and so on of the falsely invoiced goods are analyzed, and the role each enterprise plays in the gang is determined.
(7) The false-invoicing pattern analysis dimension: some false-invoicing gangs fit specific association patterns, such as a spindle-shaped pattern of upstream and downstream invoice accumulation.
(8) The cross-region data analysis dimension: false-invoicing gangs usually show clear cross-region characteristics, that is, the upstream or downstream parties of the transactions are enterprises outside the province.
(9) Other business label dimensions, such as the collection mode, enterprise scale, enterprise number, social security payment status, and ID-card address association of key personnel of the enterprises in a classified community.
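To make dimensions such as (1), (3) and (8) concrete, the sketch below computes two simple indicators for one community: the share of enterprise pairs that have a person or an address in common, and the share of trades whose two parties sit in different provinces. The indicator definitions, function names and sample data are all hypothetical; the embodiment does not specify how each dimension is scored.

```python
from itertools import combinations

def shared_ratio(members, values):
    """Fraction of enterprise pairs that share at least one value.

    values maps an enterprise to a set, e.g. address ids or person ids.
    """
    pairs = list(combinations(sorted(members), 2))
    if not pairs:
        return 0.0
    shared = sum(
        1 for a, b in pairs if values.get(a, set()) & values.get(b, set())
    )
    return shared / len(pairs)

def cross_region_ratio(members, trades, province):
    """Fraction of trades touching the community that cross a province border."""
    members = set(members)
    touching = [(s, b) for s, b in trades if s in members or b in members]
    if not touching:
        return 0.0
    cross = sum(1 for s, b in touching if province[s] != province[b])
    return cross / len(touching)

# Hypothetical community of three enterprises with two outside counterparties.
community = ["E1", "E2", "E3"]
address = {"E1": {"addr-1"}, "E2": {"addr-1"}, "E3": {"addr-2"}}
personnel = {"E1": {"p1", "p2"}, "E2": {"p2"}, "E3": {"p2", "p3"}}
trades = [("X", "E1"), ("E1", "E2"), ("E2", "E3"), ("E3", "Y")]
province = {"X": "33", "E1": "33", "E2": "33", "E3": "33", "Y": "44"}

indicators = {
    "personnel": shared_ratio(community, personnel),
    "address": shared_ratio(community, address),
    "cross_region": cross_region_ratio(community, trades, province),
}
```

Indicators of this kind could then be combined with business rules; how they are weighted or thresholded is left open here.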
Step seven, the false-invoicing gang result: the result output after the multi-dimensional gang analysis is taken as the false-invoicing gang result.
It can be seen that, in this embodiment, transaction relationship data for a plurality of objects is obtained and a transaction relationship network is constructed from it; a graph clustering algorithm based on modularity then classifies the network into communities to obtain a community classification result; finally, the community classification result is analyzed along different analysis dimensions to obtain the target community data.
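For reference, the modularity that a modularity-based clustering algorithm such as Louvain seeks to maximize can be evaluated for any given partition. The sketch below uses the standard definition for an undirected weighted graph without self-loops, Q = Σ_c [L_c / m − (d_c / 2m)²], where m is the total edge weight, L_c the weight of edges inside community c and d_c the total weighted degree of c. The function name and the sample graph are assumptions for illustration.

```python
def modularity(edges, community_of):
    """Modularity of a partition.

    edges: list of (u, v, weight) for an undirected graph without self-loops.
    community_of: dict mapping each node to its community label.
    """
    total = sum(w for _, _, w in edges)   # m, the total edge weight
    internal = {}                         # L_c per community
    degree = {}                           # d_c per community
    for u, v, w in edges:
        cu, cv = community_of[u], community_of[v]
        degree[cu] = degree.get(cu, 0.0) + w
        degree[cv] = degree.get(cv, 0.0) + w
        if cu == cv:
            internal[cu] = internal.get(cu, 0.0) + w
    return sum(
        internal.get(c, 0.0) / total - (degree[c] / (2 * total)) ** 2
        for c in degree
    )

# Two triangles joined by a single edge C-D, all weights equal to 1.
edges = [
    ("A", "B", 1.0), ("B", "C", 1.0), ("A", "C", 1.0),
    ("D", "E", 1.0), ("E", "F", 1.0), ("D", "F", 1.0),
    ("C", "D", 1.0),
]
split = {"A": 0, "B": 0, "C": 0, "D": 1, "E": 1, "F": 1}
merged = {n: 0 for n in split}

q_split = modularity(edges, split)    # one community per triangle
q_merged = modularity(edges, merged)  # everything in one community
```

Splitting the graph along the bridge edge gives the higher modularity, which is the kind of partition a modularity-maximizing algorithm prefers.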
The following describes a target community data identification device provided by an embodiment of the present application. The identification device described below and the identification method described above may be referred to in correspondence with each other.
Referring to fig. 2, fig. 2 is a schematic structural diagram of an apparatus for identifying target community data according to an embodiment of the present disclosure.
In this embodiment, the apparatus may include:
the transaction topology network module 100 is configured to perform transaction relationship network construction processing on the acquired transaction relationship data of the multiple objects to obtain a transaction relationship network;
the community classification module 200 is used for carrying out community classification on the transaction relationship network according to a graph clustering algorithm based on modularity to obtain a community classification result;
and the target community analysis module 300 is configured to determine an analysis dimension according to the attribute of the target community, and analyze the community classification result through the analysis dimension to obtain target community data.
Optionally, the transaction topology network module 100 may include:
the topological network construction unit is used for constructing a transaction topological network according to the acquired transaction relation data of the plurality of objects;
the graph calculation unit is used for performing transaction edge relationship attribute calculation and enterprise point entity attribute calculation on the transaction topological network to obtain a graph calculation result;
and the denoising processing unit is used for denoising the transaction topology network according to the graph calculation result to obtain a transaction relation network.
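One possible form of the noise reduction performed by such a unit is sketched below: weak edges are dropped first, and then nodes linked to an unusually large number of counterparties are removed together with their edges. The two thresholds, the use of a single edge weight, and the function name are assumptions for this example; the embodiment only states that noise reduction is based on the graph calculation result.

```python
def denoise(edges, max_degree, min_weight):
    """Drop weak edges and over-connected nodes from a transaction network.

    edges: list of (seller, buyer, weight).
    max_degree: nodes with more distinct neighbours than this are removed.
    min_weight: edges lighter than this are removed.
    """
    # Step 1: keep only edges whose weight reaches the threshold.
    kept = [(s, b, w) for s, b, w in edges if w >= min_weight]
    # Step 2: count distinct neighbours on the remaining edges.
    neighbours = {}
    for s, b, _ in kept:
        neighbours.setdefault(s, set()).add(b)
        neighbours.setdefault(b, set()).add(s)
    noisy = {n for n, nb in neighbours.items() if len(nb) > max_degree}
    # Step 3: remove every edge that touches a noisy node.
    return [(s, b, w) for s, b, w in kept if s not in noisy and b not in noisy]

# Hypothetical network: AGENT is linked to every enterprise.
raw = [
    ("AGENT", "E1", 0.9), ("AGENT", "E2", 0.9),
    ("AGENT", "E3", 0.9), ("AGENT", "E4", 0.9),
    ("E1", "E2", 0.8), ("E2", "E3", 0.05),
]
clean = denoise(raw, max_degree=3, min_weight=0.1)
```

Here the over-connected node and the weak edge are removed, leaving only the trade between E1 and E2.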
Optionally, the community classification module 200 may include:
the giant community acquisition unit is used for screening the transaction relationship network according to a connected community algorithm to obtain giant communities in the transaction relationship network;
and the community subdivision unit is used for carrying out community classification on the giant community according to a graph clustering algorithm based on modularity to obtain a community classification result.
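A minimal sketch of the screening step follows, on the assumption that the connected community algorithm amounts to finding connected components and that the giant community is the largest one. The function names and the sample edges are hypothetical.

```python
from collections import deque

def connected_components(edges):
    """Return the connected components of an undirected graph as sets of nodes."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    seen, components = set(), []
    for start in adj:
        if start in seen:
            continue
        # Breadth-first search collects everything reachable from start.
        seen.add(start)
        queue, component = deque([start]), set()
        while queue:
            node = queue.popleft()
            component.add(node)
            for nb in adj[node]:
                if nb not in seen:
                    seen.add(nb)
                    queue.append(nb)
        components.append(component)
    return components

def giant_component(edges):
    """Return the largest connected component."""
    return max(connected_components(edges), key=len)

# Hypothetical transaction pairs: one chain of four enterprises and one isolated pair.
pairs = [("A", "B"), ("B", "C"), ("C", "D"), ("X", "Y")]
giant = giant_component(pairs)
```

Only the giant component would then be passed on to the modularity-based subdivision; smaller components are left out of that step in this sketch.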
Optionally, the target community analysis module 300 is specifically configured to, when the type of the target community is a false-invoicing gang, analyze the community classification result through multiple dimensions to obtain target community data; the multiple dimensions are an employment relationship dimension, a transaction relationship dimension, an address relationship dimension, an investment relationship dimension and a property relationship dimension.
An embodiment of the present application further provides a server, including:
a memory for storing a computer program;
a processor for implementing the steps of the identification method as in the above embodiments when executing the computer program.
Embodiments of the present application further provide a computer-readable storage medium, on which a computer program is stored, and when being executed by a processor, the computer program implements the steps of the identification method according to the above embodiments.
The computer-readable storage medium may include various media capable of storing program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
The embodiments in this specification are described progressively; each embodiment focuses on its differences from the other embodiments, and for the parts that are the same or similar the embodiments may be referred to one another. Because the device disclosed in the embodiments corresponds to the method disclosed in the embodiments, its description is brief, and the relevant details can be found in the description of the method.
Those of skill would further appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both, and that the various illustrative components and steps have been described above generally in terms of their functionality in order to clearly illustrate this interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
The steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. A software module may reside in Random Access Memory (RAM), memory, Read Only Memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
The identification method of target community data, the identification device of target community data, the server and the computer-readable storage medium provided by the present application have been described in detail above. The principles and embodiments of the present application are explained herein using specific examples, which are provided only to help understand the method and the core idea of the present application. It should be noted that those skilled in the art may make several improvements and modifications to the present application without departing from its principle, and such improvements and modifications also fall within the scope of the claims of the present application.

Claims (10)

1. A method for identifying target community data, comprising:
carrying out transaction relation network construction processing on the acquired transaction relation data of the plurality of objects to obtain a transaction relation network;
carrying out community classification on the transaction relationship network according to a graph clustering algorithm based on modularity to obtain a community classification result;
and determining analysis dimensions according to the attributes of the target community, and analyzing the community classification result through the analysis dimensions to obtain target community data.
2. The identification method according to claim 1, wherein the step of performing transaction relationship network construction processing on the acquired transaction relationship data of the plurality of objects to obtain a transaction relationship network comprises:
constructing a transaction topological network according to the acquired transaction relation data of the plurality of objects;
performing transaction edge relationship attribute calculation and enterprise point entity attribute calculation on the transaction topological network to obtain a graph calculation result;
and carrying out noise reduction processing on the transaction topological network according to the graph calculation result to obtain the transaction relation network.
3. The identification method of claim 1, wherein the step of carrying out community classification on the transaction relationship network according to a graph clustering algorithm based on modularity to obtain a community classification result comprises:
screening the transaction relationship network according to a connected community algorithm to obtain a giant community in the transaction relationship network;
and carrying out community classification on the giant community according to the graph clustering algorithm based on the modularity to obtain a community classification result.
4. The identification method of claim 1, wherein determining an analysis dimension according to the attributes of the target community, and analyzing the community classification result through the analysis dimension to obtain target community data comprises:
when the type of the target community is a false-invoicing gang, analyzing the community classification result through multiple dimensions to obtain target community data; the multiple dimensions are an employment relationship dimension, a transaction relationship dimension, an address relationship dimension, an investment relationship dimension and a property relationship dimension.
5. An apparatus for identifying target community data, comprising:
the transaction topological network module is used for carrying out transaction relationship network construction processing on the acquired transaction relationship data of the plurality of objects to obtain a transaction relationship network;
the community classification module is used for carrying out community classification on the transaction relationship network according to a graph clustering algorithm based on modularity to obtain a community classification result;
and the target community analysis module is used for determining an analysis dimension according to the attribute of the target community and analyzing the community classification result through the analysis dimension to obtain target community data.
6. The identification device of claim 1, wherein the transaction topology network module comprises:
the topological network construction unit is used for constructing a transaction topological network according to the acquired transaction relation data of the plurality of objects;
the graph calculation unit is used for performing transaction edge relationship attribute calculation and enterprise point entity attribute calculation on the transaction topological network to obtain a graph calculation result;
and the denoising processing unit is used for denoising the transaction topological network according to the graph calculation result to obtain the transaction relation network.
7. The identification device of claim 1, wherein the community classification module comprises:
the giant community acquisition unit is used for screening the transaction relationship network according to a connected community algorithm to obtain giant communities in the transaction relationship network;
and the community subdivision unit is used for carrying out community classification on the giant community according to the graph clustering algorithm based on the modularity to obtain the community classification result.
8. The identification device of claim 1, wherein the target community analysis module is specifically configured to, when the type of the target community is a false-invoicing gang, analyze the community classification result through multiple dimensions to obtain the target community data; the multiple dimensions are an employment relationship dimension, a transaction relationship dimension, an address relationship dimension, an investment relationship dimension and a property relationship dimension.
9. A server, comprising:
a memory for storing a computer program;
a processor for implementing the steps of the identification method according to any one of claims 1 to 4 when executing the computer program.
10. A computer-readable storage medium, characterized in that a computer program is stored on the computer-readable storage medium, which computer program, when being executed by a processor, carries out the steps of the identification method according to one of claims 1 to 4.
CN201910899829.2A 2019-09-23 2019-09-23 Target community data identification method and related device Pending CN110647590A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910899829.2A CN110647590A (en) 2019-09-23 2019-09-23 Target community data identification method and related device

Publications (1)

Publication Number Publication Date
CN110647590A true CN110647590A (en) 2020-01-03

Family

ID=69011063

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910899829.2A Pending CN110647590A (en) 2019-09-23 2019-09-23 Target community data identification method and related device

Country Status (1)

Country Link
CN (1) CN110647590A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication
Application publication date: 20200103