CN109523153A - Method, device, computer equipment and storage medium for identifying illegal fund-raising enterprises - Google Patents
- Publication number
- CN109523153A (application number CN201811339775.6A)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/06—Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
- G06Q10/063—Operations research, analysis or management
- G06Q10/0635—Risk analysis of enterprise or organisation activities
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q40/00—Finance; Insurance; Tax strategies; Processing of corporate or income taxes
- G06Q40/06—Asset management; Financial planning or analysis
Abstract
This application relates to a method, a device, computer equipment and a storage medium for identifying illegal fund-raising enterprises. The method includes: obtaining financial information data, news corpus data and enterprise information data of an enterprise to be identified, and generating an enterprise portrait of the enterprise to be identified from these data; calculating the similarity between the enterprise portrait and a risk enterprise portrait constructed in advance to obtain an illegal fund-raising similarity; constructing an enterprise association network from the enterprise portrait and the risk enterprise portrait, and using the network to calculate a risk enterprise association value between the two portraits; and calculating an illegal fund-raising risk value of the enterprise to be identified from a public sentiment risk indicator value, a financial risk indicator value, the illegal fund-raising similarity and the risk enterprise association value. When the illegal fund-raising risk value is greater than a preset warning threshold, the enterprise to be identified is determined to be an illegal fund-raising risk enterprise. Based on big data analysis technology, the method improves the reliability of identifying illegal fund-raising enterprises.
Description
Technical field
This application relates to the technical field of big data analysis, and in particular to a method, a device, computer equipment and a storage medium for identifying illegal fund-raising enterprises.
Background art
Illegal fund-raising refers to an enterprise raising funds from the public, without approval, by means of various debt certificates. Enterprises engaged in illegal fund-raising are currently identified mainly by practitioners who, relying on working experience, look for financial anomalies in the financial statements of an enterprise and judge from them whether the enterprise is suspected of illegal fund-raising. Deciding whether an enterprise is engaged in illegal fund-raising therefore often depends on historical experience and on numerical logic analysis and statistical analysis of a large number of financial statements, so the reliability of identifying illegal fund-raising enterprises is poor.
Summary of the invention
In view of the above technical problem of poor reliability in identifying illegal fund-raising enterprises, it is necessary to provide a method, a device, computer equipment and a storage medium for identifying illegal fund-raising enterprises.
A method for identifying illegal fund-raising enterprises, the method comprising:
obtaining financial information data, news corpus data and enterprise information data of an enterprise to be identified, and calculating a financial risk indicator value and a public sentiment risk indicator value of the enterprise to be identified respectively;
generating an enterprise portrait of the enterprise to be identified from the financial information data, the news corpus data and the basic enterprise information, and calculating the similarity between the enterprise portrait and a risk enterprise portrait constructed in advance to obtain an illegal fund-raising similarity;
constructing an enterprise association network from the enterprise portrait and the risk enterprise portrait, and using the enterprise association network to calculate a risk enterprise association value between the enterprise portrait and the risk enterprise portrait;
calculating an illegal fund-raising risk value of the enterprise to be identified from the public sentiment risk indicator value, the financial risk indicator value, the illegal fund-raising similarity and the risk enterprise association value, and, when the illegal fund-raising risk value is greater than a preset warning threshold, determining the enterprise to be identified to be an illegal fund-raising risk enterprise.
In one embodiment, the step of generating the enterprise portrait of the enterprise to be identified from the financial information data, the news corpus data and the basic enterprise information comprises:
constructing an enterprise label corresponding to the enterprise to be identified;
converting the financial information data, the news corpus data and the enterprise information data, each according to a preset format, into structured target financial data, target public sentiment data and target enterprise information data;
generating, from the target financial data, the target public sentiment data and the target enterprise information data respectively, financial information class labels, public opinion information class labels and enterprise information class labels corresponding to the enterprise label, thereby obtaining the enterprise portrait of the enterprise to be identified.
In one embodiment, the enterprise information class labels include enterprise personnel labels and business partner labels;
the step of constructing the enterprise association network from the enterprise portrait and the risk enterprise portrait comprises the following steps:
constructing entity nodes from the target enterprise portraits, and constructing attribute nodes from the enterprise personnel labels and the business partner labels, wherein the target enterprise portraits include the enterprise portrait of the enterprise to be identified and the risk enterprise portrait;
obtaining, from the target enterprise portraits and their corresponding enterprise personnel labels and business partner labels, the associations between the entity nodes and the associations between each entity node and each attribute node;
establishing the association network between the target enterprise portraits from the associations between the entity nodes and the associations between each entity node and the attribute nodes.
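The network construction steps above can be sketched as follows. This is a minimal illustration rather than the applicant's implementation: the function name `build_association_network`, the dictionary layout of the portraits, and the rule that two entity nodes are associated whenever they share an attribute node are all assumptions made for the example.

```python
from collections import defaultdict


def build_association_network(portraits):
    """Build a small enterprise association network.

    portraits: dict mapping an enterprise name (entity node) to a dict with
    two sets, "personnel" and "partners" (hypothetical layout).
    Returns (entity_to_attrs, attr_to_entities, entity_links).
    """
    entity_to_attrs = {}
    attr_to_entities = defaultdict(set)

    # Entity-attribute associations: one attribute node per personnel label
    # or business partner label.
    for name, labels in portraits.items():
        attrs = {("person", p) for p in labels["personnel"]}
        attrs |= {("partner", q) for q in labels["partners"]}
        entity_to_attrs[name] = attrs
        for attr in attrs:
            attr_to_entities[attr].add(name)

    # Entity-entity associations (assumed rule): two enterprises are
    # associated when they share at least one attribute node.
    entity_links = {name: set() for name in portraits}
    for entities in attr_to_entities.values():
        for name in entities:
            entity_links[name] |= entities - {name}

    return entity_to_attrs, dict(attr_to_entities), entity_links
```

With such a structure, an enterprise to be identified that shares a shareholder or a supplier with a risk enterprise ends up directly linked to it, which is the kind of hidden relationship the network is meant to expose.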
In one embodiment, the step of calculating the similarity between the enterprise portrait and the risk enterprise portrait constructed in advance to obtain the illegal fund-raising similarity comprises:
calculating, between the enterprise portrait and the risk enterprise portrait, the Jaccard coefficient of the financial information class labels, the Jaccard coefficient of the public opinion information class labels and the Jaccard coefficient of the enterprise information class labels respectively;
determining the illegal fund-raising similarity between the enterprise portrait and the risk enterprise portrait from the Jaccard coefficients.
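As an illustration of this similarity step, the sketch below computes a Jaccard coefficient per label class and combines the three values. The text does not say how the three coefficients are combined, so the weighted average (equal weights by default), the function names and the dictionary keys are assumptions for the example only.

```python
LABEL_CLASSES = ("financial", "public_opinion", "enterprise_info")


def jaccard(a, b):
    """Jaccard coefficient |A ∩ B| / |A ∪ B| of two label collections."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0  # convention chosen here for two empty label sets
    return len(a & b) / len(union)


def fundraising_similarity(portrait, risk_portrait, weights=None):
    """Combine the per-class Jaccard coefficients into one similarity.

    Both portraits are dicts mapping each label class to a set of labels.
    The weighted average is an assumed combination rule.
    """
    if weights is None:
        weights = {c: 1.0 / len(LABEL_CLASSES) for c in LABEL_CLASSES}
    return sum(
        weights[c] * jaccard(portrait[c], risk_portrait[c])
        for c in LABEL_CLASSES
    )
```

Two portraits with identical labels in every class give a similarity of 1, and portraits with no labels in common give 0.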
In one embodiment, the step of calculating the financial risk indicator value of the enterprise to be identified comprises:
obtaining first financial information data of the risk enterprise portrait;
using a clustering algorithm to divide the first financial information data into a preset number of financial clusters;
obtaining the cluster centre of each financial cluster, and determining the financial risk indicator interval corresponding to each financial cluster centre;
calculating the distance from the financial information data of the enterprise to be identified to each cluster centre, and taking the financial cluster with the smallest distance as the target financial cluster to which the financial information data belong;
determining the financial risk indicator value within the financial risk indicator interval of the target financial cluster, according to that interval and the distance between the financial information data and the cluster centre of the target financial cluster.
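The last two steps can be sketched as below, assuming the cluster centres and their indicator intervals have already been produced by some clustering algorithm (for example k-means). The linear mapping from distance to a position inside the interval, the `max_distance` normalisation parameter and the function name are assumptions; the text only states that the value is chosen within the interval according to the distance.

```python
import math


def financial_risk_index(features, clusters, max_distance):
    """Assign financial features to the nearest cluster and score them.

    features: sequence of numeric financial features of the enterprise.
    clusters: list of (centre, (low, high)) pairs, where centre is a
        sequence of the same length and (low, high) is the cluster's
        financial risk indicator interval.
    max_distance: assumed normalisation constant; distances at or beyond
        it map to the lower end of the interval.
    Returns (index of the target cluster, indicator value).
    """
    distances = [math.dist(features, centre) for centre, _ in clusters]
    target = min(range(len(clusters)), key=distances.__getitem__)
    low, high = clusters[target][1]
    # Assumed rule: the closer to the centre, the higher within the interval.
    closeness = 1.0 - min(distances[target] / max_distance, 1.0)
    return target, low + (high - low) * closeness
```

For instance, with centres (0, 0) and (10, 10) carrying intervals (0.0, 0.5) and (0.5, 1.0), the point (10, 7) lies 3.0 from the second centre, so with `max_distance=6.0` it scores halfway up the second interval.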
In one embodiment, the step of calculating the public sentiment risk indicator value of the enterprise to be identified comprises:
performing Chinese word segmentation on the news corpus data of the enterprise to be identified, and extracting the keywords in the news corpus data;
inputting the keywords into a naive Bayes model constructed in advance, and using the naive Bayes model to calculate the probability that the enterprise to be identified is an illegal fund-raising enterprise given that the keywords occur;
determining the public sentiment risk indicator value of the enterprise to be identified from the probability value.
In one embodiment, the construction of the naive Bayes model comprises:
obtaining a news corpus training sample set, the training sample set including news corpus samples of illegal fund-raising enterprises and news corpus samples of enterprises not engaged in illegal fund-raising;
calculating the prior probability that a news corpus training sample is a news corpus sample of each enterprise type;
preprocessing each news corpus training sample to obtain the feature words of the news corpus sample, and generating a feature word matrix;
calculating, from the feature word matrix, the conditional probability of each feature word given that the news corpus training sample is a news corpus sample of each enterprise type;
constructing the naive Bayes model from the prior probabilities and the conditional probabilities.
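A compact sketch of this construction, and of the probability calculation used for the public sentiment risk indicator, is given below. It assumes the news samples have already been segmented into lists of feature words, uses a multinomial model with add-one (Laplace) smoothing, and ignores keywords outside the training vocabulary; none of these choices, nor the function names and the label strings "illegal" and "normal", are specified in the text.

```python
import math
from collections import Counter


def train_naive_bayes(samples):
    """samples: list of (feature_words, label), feature_words being a list."""
    labels = [label for _, label in samples]
    vocab = sorted({w for words, _ in samples for w in words})
    # Prior probability of each enterprise type.
    priors = {c: n / len(samples) for c, n in Counter(labels).items()}
    # Feature word matrix: one row per sample, one column per vocabulary word.
    matrix = [[words.count(w) for w in vocab] for words, _ in samples]
    # Conditional probability of each feature word per enterprise type,
    # with add-one smoothing (an assumption of this sketch).
    cond = {}
    for c in priors:
        rows = [row for row, lab in zip(matrix, labels) if lab == c]
        totals = [sum(row[j] for row in rows) for j in range(len(vocab))]
        denom = sum(totals) + len(vocab)
        cond[c] = {w: (totals[j] + 1) / denom for j, w in enumerate(vocab)}
    return priors, cond


def posterior(keywords, priors, cond, target="illegal"):
    """Probability of the target type given the keywords."""
    log_scores = {}
    for c, prior in priors.items():
        score = math.log(prior)
        for w in keywords:
            if w in cond[c]:  # words unseen in training are skipped
                score += math.log(cond[c][w])
        log_scores[c] = score
    top = max(log_scores.values())
    exp_scores = {c: math.exp(s - top) for c, s in log_scores.items()}
    return exp_scores[target] / sum(exp_scores.values())
```

The returned probability could be used directly as the public sentiment risk indicator value, or mapped onto another scale; the text leaves that mapping open.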
A device for identifying illegal fund-raising enterprises, the device comprising:
a data information acquisition module, configured to obtain financial information data, news corpus data and enterprise information data of an enterprise to be identified, and to calculate a financial risk indicator value and a public sentiment risk indicator value of the enterprise to be identified respectively;
a similarity acquisition module, configured to generate an enterprise portrait of the enterprise to be identified from the financial information data, the news corpus data and the basic enterprise information, and to calculate the similarity between the enterprise portrait and a risk enterprise portrait constructed in advance to obtain an illegal fund-raising similarity;
a risk enterprise association value acquisition module, configured to construct an enterprise association network from the enterprise portrait and the risk enterprise portrait, and to use the enterprise association network to calculate a risk enterprise association value between the enterprise portrait and the risk enterprise portrait;
an illegal fund-raising risk enterprise acquisition module, configured to calculate an illegal fund-raising risk value of the enterprise to be identified from the public sentiment risk indicator value, the financial risk indicator value, the illegal fund-raising similarity and the risk enterprise association value, and, when the illegal fund-raising risk value is greater than a preset warning threshold, to determine the enterprise to be identified to be an illegal fund-raising risk enterprise.
Computer equipment, comprising a memory and a processor, the memory storing a computer program, wherein the processor, when executing the computer program, performs the following steps:
obtaining financial information data, news corpus data and enterprise information data of an enterprise to be identified, and calculating a financial risk indicator value and a public sentiment risk indicator value of the enterprise to be identified respectively;
generating an enterprise portrait of the enterprise to be identified from the financial information data, the news corpus data and the basic enterprise information, and calculating the similarity between the enterprise portrait and a risk enterprise portrait constructed in advance to obtain an illegal fund-raising similarity;
constructing an enterprise association network from the enterprise portrait and the risk enterprise portrait, and using the enterprise association network to calculate a risk enterprise association value between the enterprise portrait and the risk enterprise portrait;
calculating an illegal fund-raising risk value of the enterprise to be identified from the public sentiment risk indicator value, the financial risk indicator value, the illegal fund-raising similarity and the risk enterprise association value, and, when the illegal fund-raising risk value is greater than a preset warning threshold, determining the enterprise to be identified to be an illegal fund-raising risk enterprise.
A computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, performs the following steps:
obtaining financial information data, news corpus data and enterprise information data of an enterprise to be identified, and calculating a financial risk indicator value and a public sentiment risk indicator value of the enterprise to be identified respectively;
generating an enterprise portrait of the enterprise to be identified from the financial information data, the news corpus data and the basic enterprise information, and calculating the similarity between the enterprise portrait and a risk enterprise portrait constructed in advance to obtain an illegal fund-raising similarity;
constructing an enterprise association network from the enterprise portrait and the risk enterprise portrait, and using the enterprise association network to calculate a risk enterprise association value between the enterprise portrait and the risk enterprise portrait;
calculating an illegal fund-raising risk value of the enterprise to be identified from the public sentiment risk indicator value, the financial risk indicator value, the illegal fund-raising similarity and the risk enterprise association value, and, when the illegal fund-raising risk value is greater than a preset warning threshold, determining the enterprise to be identified to be an illegal fund-raising risk enterprise.
The above method, device, computer equipment and storage medium for identifying illegal fund-raising enterprises construct an enterprise portrait from the financial information data, news corpus data and enterprise information data of the enterprise to be identified, and analyse the similarity and association value between that enterprise portrait and the risk enterprise portraits from illegal fund-raising cases, thereby uncovering hidden relationships between the enterprise portrait and the risk enterprise portraits. The illegal fund-raising risk of the enterprise to be identified is then assessed along three dimensions: financial risk, news public sentiment risk, and association with enterprises in illegal fund-raising cases. Because historical information from illegal fund-raising cases is brought into the risk identification, the assessment no longer relies solely on the numerical logic of financial data. This improves the basis on which illegal fund-raising enterprises are identified and raises the reliability of their identification.
Brief description of the drawings
Fig. 1 is a diagram of the application scenario of the method for identifying illegal fund-raising enterprises in one embodiment;
Fig. 2 is a flow diagram of the method for identifying illegal fund-raising enterprises in one embodiment;
Fig. 3 is a flow diagram of the steps for constructing the enterprise association network in one embodiment;
Fig. 4 is a structural block diagram of the device for identifying illegal fund-raising enterprises in one embodiment;
Fig. 5 is a diagram of the internal structure of the computer equipment in one embodiment.
Detailed description of the embodiments
To make the objectives, technical solutions and advantages of this application clearer, the application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the application and are not intended to limit it.
The method for identifying illegal fund-raising enterprises provided by this application can be applied in the application environment shown in Fig. 1, in which a terminal 102 communicates with a server 104 over a network. The terminal 102 crawls the financial information data, news corpus data and enterprise information data of the enterprise to be identified from the network and sends them to the server 104. From these data the server 104 builds the enterprise portrait of the enterprise to be identified, together with a knowledge graph linking that enterprise portrait to the risk enterprise portraits from illegal fund-raising cases. The server 104 calculates the similarity between the enterprise portrait of the enterprise to be identified and the risk enterprise portraits, and uses the knowledge graph to calculate the degree of association between the enterprise portrait and the risk enterprise portraits. Adding the similarity and the degree of association to the illegal fund-raising risk assessment of the enterprise to be identified improves the reliability of identifying illegal fund-raising enterprises. The terminal 102 may be, but is not limited to, a personal computer, laptop, smartphone, tablet computer or portable wearable device, and the server 104 may be implemented as an independent server or as a server cluster composed of multiple servers.
In one embodiment, as shown in Fig. 2, a method for identifying illegal fund-raising enterprises is provided. Taking its application to the server in Fig. 1 as an example, the method comprises the following steps:
Step S210: obtain the financial information data, news corpus data and enterprise information data of the enterprise to be identified, and calculate the financial risk indicator value and public sentiment risk indicator value of the enterprise to be identified respectively.
In this step, the financial information data, news corpus data and enterprise information data may be crawled from the network by the server using crawler software. The financial information data include financial data related to the enterprise and financial report data disclosed by the enterprise. The news corpus data may include financial news about the enterprise, news about its business activities, and news about developments in the industry in which it operates. The enterprise information data include personnel information, credit information and business partner information of the enterprise: the personnel information mainly covers enterprise personnel such as the registrant and the shareholders; the credit information mainly covers data such as the credit rating, operating status and tax payment status of the enterprise; and the business partner information mainly covers partner enterprises and the corresponding upstream and downstream suppliers and purchasers. The financial risk indicator value and public sentiment risk indicator value of the enterprise to be identified may be calculated by the server using existing analysis models, or, once the financial information data and news corpus data have been obtained, calculated by the server from the financial information data and the news corpus data respectively.
Step S220: generate the enterprise portrait of the enterprise to be identified from the financial information data, the news corpus data and the basic enterprise information, and calculate the similarity between the enterprise portrait and the risk enterprise portrait constructed in advance to obtain the illegal fund-raising similarity.
In this step, the risk enterprise portrait is a portrait that the server has built in advance from the various information data of the illegal fund-raising enterprises in illegal fund-raising cases, such as financial information data, news corpus data and basic enterprise information.
Constructing an enterprise portrait is a process of attaching labels to an enterprise. Different enterprises have different information data, so their portraits carry different labels. Specifically, after obtaining the financial information data, news corpus data and basic enterprise information of the enterprise to be identified, the server applies natural language processing to these data to generate the corresponding label data, and builds the labelled portrait of the enterprise to be identified from the label data. Once the enterprise portrait of the enterprise to be identified and the risk enterprise portrait are available, the similarity between the two portraits can be calculated from the similarity of their labels, giving the similarity between the enterprise to be identified and the illegal fund-raising enterprise.
Step S230: construct the enterprise association network from the enterprise portrait and the risk enterprise portrait, and use the enterprise association network to calculate the risk enterprise association value between the enterprise portrait and the risk enterprise portrait.
In this step, after obtaining the enterprise portrait of the enterprise to be identified and the risk enterprise portrait, the server can sort through basic information such as supply relationships, investment relationships and personnel relationships to find hidden associations between the enterprise to be identified and the illegal fund-raising enterprise. Based on these associations, it uses the enterprise portrait of the enterprise to be identified and the risk enterprise portrait to construct an association network between enterprises. The server can subsequently apply a risk propagation algorithm to the association network to calculate the association risk value between the enterprise to be identified and the illegal fund-raising enterprise.
Step S240: calculate the illegal fund-raising risk value of the enterprise to be identified from the public sentiment risk indicator value, the financial risk indicator value, the illegal fund-raising similarity and the risk enterprise association value; when the illegal fund-raising risk value is greater than the preset warning threshold, determine the enterprise to be identified to be an illegal fund-raising risk enterprise.
In this step, the warning threshold may be set according to the risk values of all illegal fund-raising cases currently on the market. The illegal fund-raising risk value may be the sum of the public sentiment risk indicator value, the financial risk indicator value, the illegal fund-raising similarity and the risk enterprise association value, or it may be their average. Specifically, after computing the illegal fund-raising risk value from these four values, the server compares it with the warning threshold; when the illegal fund-raising risk value is greater than the warning threshold, the enterprise to be identified is an illegal fund-raising risk enterprise.
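The two combination rules and the threshold comparison described in this step can be written out as a short sketch. The function and parameter names are hypothetical, and the example values are purely illustrative.

```python
def illegal_fundraising_risk(sentiment, financial, similarity, association,
                             mode="mean"):
    """Combine the four indicator values into one risk value.

    mode="sum" adds the four values; mode="mean" averages them.
    """
    values = (sentiment, financial, similarity, association)
    total = sum(values)
    return total if mode == "sum" else total / len(values)


def is_risk_enterprise(risk_value, warning_threshold):
    """True only when the risk value is strictly greater than the threshold."""
    return risk_value > warning_threshold
```

Note that the two modes produce values on different scales, so a warning threshold chosen for the average would need to be rescaled before being used with the sum.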
In the above method for identifying illegal fund-raising enterprises, an enterprise portrait is constructed from the financial information data, news corpus data and enterprise information data of the enterprise to be identified, and the similarity and association value between that enterprise portrait and the risk enterprise portraits from illegal fund-raising cases are analysed to uncover hidden relationships between the enterprise portrait and the risk enterprise portraits. The illegal fund-raising risk of the enterprise to be identified is then assessed along three dimensions: financial risk, news public sentiment risk, and association with enterprises in illegal fund-raising cases. Because historical information from illegal fund-raising cases is brought into the risk identification, the assessment no longer relies solely on the numerical logic of financial data. This improves the basis on which illegal fund-raising enterprises are identified and raises the reliability of their identification.
In one embodiment, the step of generating the enterprise portrait of the enterprise to be identified from the financial information data, the news corpus data and the basic enterprise information comprises: constructing an enterprise label corresponding to the enterprise to be identified; converting the financial information data, the news corpus data and the enterprise information data, each according to a preset format, into structured target financial data, target public sentiment data and target enterprise information data; and generating, from the target financial data, the target public sentiment data and the target enterprise information data respectively, the financial information class labels, public opinion information class labels and enterprise information class labels corresponding to the enterprise label, thereby obtaining the enterprise portrait of the enterprise to be identified.
This embodiment describes the process of constructing the enterprise portrait. When constructing the portrait, the server may generate the enterprise label from a unique identifier of the enterprise to be identified, such as its enterprise name or tax number. The server then converts the financial information data into structured target financial data and generates from them the financial information class labels corresponding to the enterprise label. Likewise, the server converts the news corpus data into structured target public sentiment data and generates from them the public opinion information class labels corresponding to the enterprise label, and converts the enterprise information data into structured target enterprise information data and generates from them the enterprise information class labels corresponding to the enterprise label. Generating financial information class labels, public opinion information class labels and enterprise information class labels for the enterprise label in this way yields enterprise portraits in one-to-one correspondence with the different enterprises. This facilitates the subsequent comparative analysis of the enterprise to be identified against illegal fund-raising enterprises, helps uncover enterprises with potential illegal fund-raising risk, and improves the efficiency of identifying illegal fund-raising enterprises.
Specifically, converting the financial information data, the news corpus data, and the enterprise information data, according to the preset format, into structured target financial data, target public opinion data, and target enterprise information data may include: establishing an enterprise portrait data table; extracting entity objects and their corresponding feature values or feature attributes from the various information data using natural language processing techniques; and saving the entity objects or their feature values into the enterprise portrait data table to generate structured data. Taking the enterprise information data as an example, named entity recognition is performed on the enterprise personnel information in the basic enterprise information, the resulting named entities are saved into the enterprise portrait data table, and corresponding labels are subsequently generated from those named entities.
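The portrait construction described above can be sketched as follows. This is a minimal illustration, not the implementation of the application: the class name `EnterprisePortrait`, the function `build_portrait`, the `attribute=value` label format, and the sample values are all assumptions, and the input rows are assumed to have already been converted to structured form.

```python
from dataclasses import dataclass, field


@dataclass
class EnterprisePortrait:
    """Hypothetical container: one enterprise label plus three classes of labels."""
    enterprise_label: str
    financial_labels: set = field(default_factory=set)
    opinion_labels: set = field(default_factory=set)
    info_labels: set = field(default_factory=set)


def build_portrait(enterprise_id, financial_rows, opinion_rows, info_rows):
    """Build a portrait from structured (attribute, value) rows.

    The enterprise label is taken from a unique identifier such as the
    enterprise name or tax number; each row becomes one label.
    """
    def to_labels(rows):
        return {f"{attr}={value}" for attr, value in rows}

    return EnterprisePortrait(
        enterprise_label=enterprise_id,
        financial_labels=to_labels(financial_rows),
        opinion_labels=to_labels(opinion_rows),
        info_labels=to_labels(info_rows),
    )


# Illustrative data only.
portrait = build_portrait(
    "TAX-0001",
    financial_rows=[("debt_ratio", "high")],
    opinion_rows=[("keyword", "high return")],
    info_rows=[("person", "Zhang San"), ("partner", "Acme Ltd")],
)
```

Storing each label class as a set keeps the portrait directly usable for set-based comparisons between enterprises.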
In one embodiment, as shown in Fig. 3, a method of constructing the enterprise association network is provided, wherein the enterprise information class label includes an enterprise personnel label and a business partner label, and the step of constructing the enterprise association network according to the enterprise portrait and the risk enterprise portrait comprises the following steps:
Step S310: construct entity nodes according to the enterprise labels of the target enterprise portraits, and construct attribute nodes according to the enterprise personnel labels and the business partner labels, wherein the target enterprise portraits include the enterprise portrait of the enterprise to be identified and the risk enterprise portraits.
In this step, the server uses the enterprise labels of the enterprise portrait and the risk enterprise portraits as entity nodes, and the enterprise personnel labels and business partner labels of those portraits as attribute nodes.
Step S320: according to the target enterprise portraits and their corresponding enterprise personnel labels and business partner labels, obtain the association relationships between the entity nodes and the association relationships between each entity node and each attribute node.
In this step, the server obtains, from each enterprise portrait and its corresponding enterprise personnel labels and business partner labels, a first association relationship between the enterprise label of that portrait and its enterprise personnel labels and business partner labels, and thereby determines the association relationships between the entity nodes and the association relationships between each entity node and each attribute node.
Step S330: using the association relationships between the entity nodes and the association relationships between the entity nodes and the attribute nodes, establish the association network between the target enterprise portraits.
In this step, after determining the association relationships between the entity nodes and between each entity node and each attribute node, the server uses these association relationships as connecting edges to link the entity nodes and the attribute nodes, constructing the association network between the enterprise portrait of the enterprise to be identified and the risk enterprise portraits and forming a knowledge graph.
This embodiment describes the construction of the association network between the enterprise portrait of the enterprise to be identified and the risk enterprise portraits. Constructing the network with knowledge graph technology organizes basic information such as the supply relationships, investment relationships, and senior executive relationships between the enterprise to be identified and the enterprises in illegal fund collection cases, so that existing associations between the enterprise to be identified and illegal fund collection enterprises can be derived from the association network, improving the reliability of identifying illegal fund collection enterprises.
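Steps S310 to S330 can be sketched with a plain adjacency dictionary. This is only an illustration under assumptions: the function names `build_network` and `association_path`, the `(kind, name)` node tuples, and the sample enterprises are hypothetical, and breadth-first search is used here simply as one way to read an association path out of the network.

```python
from collections import deque


def build_network(portraits):
    """portraits: dict mapping an enterprise label to its set of
    enterprise personnel / business partner labels.
    Returns an undirected adjacency dict linking entity and attribute nodes."""
    graph = {}
    for enterprise, attributes in portraits.items():
        entity = ("entity", enterprise)           # S310: entity node
        graph.setdefault(entity, set())
        for attr in attributes:
            node = ("attribute", attr)            # S310: attribute node
            graph.setdefault(node, set())
            graph[entity].add(node)               # S320/S330: association as an edge
            graph[node].add(entity)
    return graph


def association_path(graph, start, goal):
    """Shortest path between two nodes by breadth-first search, or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in graph.get(path[-1], ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


# Illustrative data only: two enterprises share one person.
graph = build_network({
    "A Co": {"person:Li Lei", "partner:X Trading"},
    "Risk Co": {"person:Li Lei"},
    "Isolated Co": {"person:Han Mei"},
})
path = association_path(graph, ("entity", "A Co"), ("entity", "Risk Co"))
```

In this toy network the enterprise to be identified reaches the risk enterprise through a shared enterprise personnel label, which is the kind of hidden relationship the association network is meant to expose.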
In one embodiment, the server may obtain, from the association network between the enterprise portrait of the enterprise to be identified and the risk enterprise portraits, the association paths from the enterprise portrait to the different risk enterprise portraits, and use a risk conduction algorithm to calculate, according to those association paths, the risk enterprise association value between the enterprise to be identified and the enterprises in illegal fund collection cases.
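The text does not specify the risk conduction algorithm. Purely as a hypothetical illustration, the sketch below assumes that risk decays by a constant factor per hop along an association path and that the contributions of several paths are combined so the result stays within [0, 1]. The function name `association_value`, the `decay` parameter, and the combination rule are all assumptions, not part of the application.

```python
def association_value(paths, decay=0.5):
    """Hypothetical risk conduction over association paths.

    paths: list of paths (each a list of nodes) from the enterprise to be
           identified to different risk enterprises.
    decay: assumed per-hop attenuation factor in (0, 1).
    Each path conducts decay ** hops of risk; the paths are combined as
    1 - product(1 - r_i), so more and shorter paths give a higher value.
    """
    remaining = 1.0
    for path in paths:
        hops = len(path) - 1
        if hops <= 0:
            continue  # a path with no edge conducts no risk
        remaining *= 1.0 - decay ** hops
    return 1.0 - remaining


# Illustrative paths: enterprise -> shared attribute -> risk enterprise.
one_path = association_value([["A Co", "person:Li Lei", "Risk Co"]])
two_paths = association_value([
    ["A Co", "person:Li Lei", "Risk Co"],
    ["A Co", "partner:X Trading", "Risk Co 2"],
])
```

Under these assumptions an enterprise with no association path receives a value of 0, and each additional path to a risk enterprise can only raise the value.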
In one embodiment, the step of calculating the similarity between the enterprise portrait and the pre-constructed risk enterprise portrait to obtain the illegal fund collection similarity comprises: separately calculating the Jaccard coefficient of the financial information class labels, the Jaccard coefficient of the public opinion information class labels, and the Jaccard coefficient of the enterprise information class labels between the enterprise portrait and the risk enterprise portrait; and determining the illegal fund collection similarity between the enterprise portrait and the risk enterprise portrait according to the Jaccard coefficients.
This embodiment concerns calculating the similarity between the enterprise portraits of the enterprise to be identified and an illegal fund collection enterprise. The Jaccard coefficient (Jaccard similarity coefficient) measures the similarity between two sample sets: the larger the Jaccard coefficient, the more similar the two samples. The server calculates the Jaccard coefficients of the financial information class labels, the public opinion information class labels, and the enterprise information class labels between the enterprise portrait of the enterprise to be identified and the risk enterprise portrait, and takes the average of the three coefficients as the enterprise portrait similarity between the enterprise to be identified and the illegal fund collection enterprise. Adding this similarity assessment focuses attention on enterprises with known illegal fund collection cases and uncovers hidden associations with them, which greatly improves the reliability of identifying illegal fund collection enterprises.
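For two label sets A and B, the Jaccard coefficient is |A ∩ B| / |A ∪ B|. The averaging described above can be sketched as follows; the dictionary keys, function names, and sample labels are assumptions for illustration, and treating two empty label sets as having similarity 0 is a convention chosen here, not stated in the text.

```python
def jaccard(a, b):
    """Jaccard coefficient |A ∩ B| / |A ∪ B| of two label collections."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0  # assumed convention for two empty label sets
    return len(a & b) / len(union)


def portrait_similarity(portrait, risk_portrait):
    """Average of the three per-class Jaccard coefficients."""
    categories = ("financial", "opinion", "info")
    scores = [jaccard(portrait[c], risk_portrait[c]) for c in categories]
    return sum(scores) / len(scores)


# Illustrative label sets only.
candidate = {"financial": {"a", "b"}, "opinion": {"x"}, "info": {"p", "q"}}
risk = {"financial": {"b", "c"}, "opinion": {"x"}, "info": {"r"}}
similarity = portrait_similarity(candidate, risk)  # (1/3 + 1 + 0) / 3
```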
In one embodiment, the step of calculating the financial risk index value of the enterprise to be identified comprises: obtaining first financial information data of the risk enterprise portraits; dividing the first financial information data into a preset number of financial clusters using a clustering algorithm; obtaining the cluster center of each financial cluster, and determining the financial risk index interval corresponding to each financial cluster center; calculating the distance from the financial information data of the enterprise to be identified to each cluster center, and determining the financial cluster with the smallest distance as the target financial cluster to which the financial information data belongs; and determining the financial risk index value within the financial risk index interval of the target financial cluster according to that interval and the distance between the financial information data and the cluster center of the target financial cluster.
This embodiment describes obtaining the financial risk index value of the enterprise to be identified. The server obtains the financial information data of market sample enterprises and uses a clustering algorithm to divide these sample financial data into N financial clusters, each corresponding to one financial risk index interval, where N may take a value from 3 to 10. After obtaining the financial information data of the enterprise to be identified, the server determines which financial cluster that data belongs to according to its distance to the cluster center of each financial cluster, thereby determines the financial risk index interval of the enterprise to be identified, and determines the financial risk index value within that interval.
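The assignment to the nearest cluster center can be sketched as below, assuming the cluster centers have already been produced by a clustering algorithm such as k-means. The text does not say how the distance is mapped to a value inside the interval, so the linear mapping used here, the per-cluster `radii` parameter, and the choice that a larger distance yields a higher value are all assumptions.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def financial_risk_index(sample, centers, intervals, radii):
    """Assign `sample` to its nearest cluster center and place it in that
    cluster's risk index interval.

    centers:   precomputed cluster centers (one vector per cluster)
    intervals: (low, high) financial risk index interval per cluster
    radii:     assumed normalising distance per cluster, e.g. the largest
               distance of a training sample to its center (hypothetical)
    """
    distances = [euclidean(sample, c) for c in centers]
    target = min(range(len(centers)), key=distances.__getitem__)
    low, high = intervals[target]
    radius = radii[target]
    # Assumed linear mapping of distance into the interval, capped at its top.
    ratio = min(distances[target] / radius, 1.0) if radius > 0 else 0.0
    return target, low + (high - low) * ratio


# Illustrative values only: two clusters with intervals [0, 50] and [50, 100].
result = financial_risk_index(
    sample=(0.0, 2.5),
    centers=[(0.0, 0.0), (10.0, 10.0)],
    intervals=[(0.0, 50.0), (50.0, 100.0)],
    radii=[5.0, 5.0],
)
```

In practice the financial features would normally be scaled before clustering so that no single indicator dominates the distance.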
In one embodiment, the step of calculating the public opinion risk index value of the enterprise to be identified comprises: performing Chinese word segmentation on the news corpus data of the enterprise to be identified and extracting keywords from the news corpus data; inputting the keywords into a pre-constructed naive Bayes model, and using the naive Bayes model to calculate the probability that the enterprise to be identified is an illegal fund collection enterprise given that the keywords occur; and determining the public opinion risk index value of the enterprise to be identified according to the probability.
In this embodiment, the server constructs the naive Bayes model in advance, segments the news corpus data of the enterprise to be identified to extract keywords, and inputs the keywords into the naive Bayes model to calculate the probability that the enterprise to be identified is an illegal fund collection enterprise. Using the naive Bayes algorithm improves the accuracy of the public opinion risk index value and provides an accurate basis for subsequently identifying whether the enterprise to be identified is an illegal fund collection enterprise.
In one embodiment, the step of constructing the naive Bayes model comprises: obtaining a news corpus training sample set, which includes news corpus samples of illegal fund collection enterprises and news corpus samples of non-illegal fund collection enterprises; calculating the prior probability that a news corpus training sample is a news corpus sample of each enterprise type; preprocessing each news corpus training sample to obtain the feature words of the news corpus samples and generating a feature word matrix; calculating, according to the feature word matrix, the conditional probability of each feature word given that the news corpus training sample is a news corpus sample of each enterprise type; and constructing the naive Bayes model according to the prior probabilities and the conditional probabilities.
In this embodiment, the news corpus training sample set includes two types of news corpus samples: news corpus samples of illegal fund collection enterprises and news corpus samples of non-illegal fund collection enterprises. The server first calculates, over the training sample set, the prior probability that a news corpus sample belongs to an illegal fund collection enterprise and the prior probability that it belongs to a non-illegal fund collection enterprise, then calculates the conditional probability of each feature word occurring in each type of news corpus sample, and constructs the naive Bayes model from the prior probabilities and conditional probabilities. The preprocessing of each news corpus training sample includes: performing jieba word segmentation on each news corpus sample to obtain all the words of the sample; removing common news words from those words to extract the feature words; and counting the number of occurrences of each feature word to generate a bag-of-words feature word matrix. With the naive Bayes model, the probability that the enterprise to be identified is an illegal fund collection enterprise can be identified quickly and accurately from news corpus data, improving the efficiency and reliability of identifying illegal fund collection enterprises.
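The construction and use of the model can be sketched as follows. This is a minimal multinomial naive Bayes over bag-of-words counts, assuming the samples have already been segmented and stripped of common news words. The function names, the class labels `"illegal"` and `"normal"`, the add-one (Laplace) smoothing, and the toy samples are assumptions; the text does not state which smoothing, if any, is used.

```python
import math
from collections import Counter


def train_naive_bayes(samples):
    """samples: list of (feature_words, enterprise_type) pairs.
    Returns (priors, cond) where cond[c][w] approximates P(w | c)."""
    class_counts = Counter(label for _, label in samples)
    word_counts = {label: Counter() for label in class_counts}
    vocab = set()
    for words, label in samples:
        word_counts[label].update(words)   # bag-of-words counts per class
        vocab.update(words)
    priors = {c: n / len(samples) for c, n in class_counts.items()}
    cond = {}
    for c in class_counts:
        total = sum(word_counts[c].values())
        # Add-one smoothing (an assumption) avoids zero probabilities.
        cond[c] = {w: (word_counts[c][w] + 1) / (total + len(vocab)) for w in vocab}
    return priors, cond


def posterior(keywords, priors, cond, target="illegal"):
    """P(target | keywords) under the naive independence assumption.
    Keywords outside the training vocabulary are ignored."""
    log_scores = {}
    for c in priors:
        score = math.log(priors[c])
        for w in keywords:
            if w in cond[c]:
                score += math.log(cond[c][w])
        log_scores[c] = score
    top = max(log_scores.values())
    weights = {c: math.exp(s - top) for c, s in log_scores.items()}
    return weights[target] / sum(weights.values())


# Illustrative training samples only (already segmented).
samples = [
    (["guaranteed", "returns", "fundraising"], "illegal"),
    (["runaway", "fundraising"], "illegal"),
    (["quarterly", "earnings"], "normal"),
    (["product", "launch"], "normal"),
]
priors, cond = train_naive_bayes(samples)
risky = posterior(["guaranteed", "fundraising"], priors, cond)
benign = posterior(["earnings"], priors, cond)
```

The resulting probability can then be mapped to the public opinion risk index value; working in log space keeps the product of many small conditional probabilities numerically stable.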
In one embodiment, feature words whose conditional probability of occurring in the news corpus samples of illegal fund collection enterprises is greater than a preset threshold are classified as strong feature words. The strong feature words are expanded using a Word2Vec model to obtain an expanded illegal fund collection feature word library, and the feature word matrix of the bag-of-words model is adjusted according to the feature words in that library, which improves the accuracy of the naive Bayes model and the accuracy of identifying illegal fund collection enterprises.
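The selection and expansion of strong feature words can be sketched as below. To keep the example self-contained, the Word2Vec lookup is replaced by a hypothetical callable `most_similar(word)` returning `(neighbour, similarity)` pairs; with a trained gensim model, a wrapper around `model.wv.most_similar(word, topn=...)` could play that role. The `min_similarity` cut-off and all sample values are assumptions.

```python
def expand_strong_words(cond_illegal, threshold, most_similar, min_similarity=0.7):
    """Select strong feature words and expand them with similar words.

    cond_illegal:   dict word -> conditional probability in illegal-enterprise samples
    threshold:      preset threshold above which a word counts as strong
    most_similar:   hypothetical callable standing in for a Word2Vec query
    min_similarity: assumed cut-off for accepting a neighbour word
    """
    strong = {w for w, p in cond_illegal.items() if p > threshold}
    library = set(strong)
    for word in strong:
        for neighbour, similarity in most_similar(word):
            if similarity >= min_similarity:
                library.add(neighbour)
    return strong, library


# Illustrative stand-in for an embedding model.
toy_neighbours = {
    "fundraising": [("crowdfunding", 0.82), ("charity", 0.40)],
    "rebate": [("cashback", 0.75)],
}
strong, library = expand_strong_words(
    {"fundraising": 0.30, "rebate": 0.25, "meeting": 0.05},
    threshold=0.2,
    most_similar=lambda w: toy_neighbours.get(w, []),
)
```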
It should be understood that although the steps in the flowcharts of Fig. 2 to Fig. 3 are shown in sequence as indicated by the arrows, these steps are not necessarily executed in that sequence. Unless expressly stated otherwise herein, there is no strict order restriction on the execution of these steps, and they may be executed in other orders. Moreover, at least some of the steps in Fig. 2 to Fig. 3 may include multiple sub-steps or stages, which are not necessarily completed at the same moment but may be executed at different times, and which are not necessarily executed sequentially but may be executed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.
In one embodiment, as shown in Fig. 4, an acquisition device for illegal fund collection enterprises is provided, comprising a data information acquisition module 410, a similarity acquisition module 420, a risk enterprise association value acquisition module 430, and an illegal fund collection risk enterprise acquisition module 440, wherein:
the data information acquisition module 410 is configured to obtain the financial information data, news corpus data, and enterprise information data of the enterprise to be identified, and to separately calculate the financial risk index value and the public opinion risk index value of the enterprise to be identified;
the similarity acquisition module 420 is configured to generate the enterprise portrait of the enterprise to be identified according to the financial information data, the news corpus data, and the basic enterprise information, and to calculate the similarity between the enterprise portrait and the pre-constructed risk enterprise portrait to obtain the illegal fund collection similarity;
the risk enterprise association value acquisition module 430 is configured to construct the enterprise association network according to the enterprise portrait and the risk enterprise portrait, and to calculate the risk enterprise association value between the enterprise portrait and the risk enterprise portrait using the enterprise association network;
the illegal fund collection risk enterprise acquisition module 440 is configured to calculate the illegal fund collection risk value of the enterprise to be identified according to the public opinion risk index value, the financial risk index value, the illegal fund collection similarity, and the risk enterprise association value, and to determine the enterprise to be identified as an illegal fund collection risk enterprise when the illegal fund collection risk value is greater than a preset alarm threshold.
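The final decision made by module 440 can be sketched as follows. This passage does not state how the four indicators are combined, so the weighted sum, the equal weights, the assumption that every indicator is normalised to [0, 1], and the threshold value are all hypothetical.

```python
def illegal_fund_collection_risk(opinion, financial, similarity, association,
                                 weights=(0.25, 0.25, 0.25, 0.25),
                                 alarm_threshold=0.6):
    """Combine four indicators (each assumed to lie in [0, 1]) into one risk
    value and compare it with a preset alarm threshold.

    Returns (risk_value, is_risk_enterprise). The weighted sum and the
    default weights and threshold are illustrative assumptions.
    """
    indicators = (opinion, financial, similarity, association)
    risk_value = sum(w * v for w, v in zip(weights, indicators))
    return risk_value, risk_value > alarm_threshold


# Illustrative indicator values only.
high_value, high_flag = illegal_fund_collection_risk(0.8, 0.8, 0.8, 0.8)
low_value, low_flag = illegal_fund_collection_risk(0.2, 0.1, 0.3, 0.0)
```

Different weights could be chosen if one dimension, for example the association with known cases, is considered more indicative than the others.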
In one embodiment, the similarity acquisition module 420 is configured to: construct an enterprise label corresponding to the enterprise to be identified; convert the financial information data, the news corpus data, and the enterprise information data, according to a preset format, into structured target financial data, target public opinion data, and target enterprise information data, respectively; and generate, according to the target financial data, the target public opinion data, and the target enterprise information data, respectively, a financial information class label, a public opinion information class label, and an enterprise information class label corresponding to the enterprise label, to obtain the enterprise portrait of the enterprise to be identified.
In one embodiment, the enterprise information class label includes an enterprise personnel label and a business partner label; the risk enterprise association value acquisition module 430 is configured to: construct entity nodes according to the enterprise labels of the target enterprise portraits, and construct attribute nodes according to the enterprise personnel labels and the business partner labels, wherein the target enterprise portraits include the enterprise portrait of the enterprise to be identified and the risk enterprise portraits; obtain, according to the target enterprise portraits and their corresponding enterprise personnel labels and business partner labels, the association relationships between the entity nodes and the association relationships between each entity node and each attribute node; and establish the association network between the target enterprise portraits using the association relationships between the entity nodes and the association relationships between the entity nodes and the attribute nodes.
In one embodiment, the similarity acquisition module 420 is configured to: separately calculate the Jaccard coefficient of the financial information class labels, the Jaccard coefficient of the public opinion information class labels, and the Jaccard coefficient of the enterprise information class labels between the enterprise portrait and the risk enterprise portrait; and determine the illegal fund collection similarity between the enterprise portrait and the risk enterprise portrait according to the Jaccard coefficients.
In one embodiment, the data information acquisition module 410 is configured to: obtain first financial information data of the risk enterprise portraits; divide the first financial information data into a preset number of financial clusters using a clustering algorithm; obtain the cluster center of each financial cluster, and determine the financial risk index interval corresponding to each financial cluster center; calculate the distance from the financial information data of the enterprise to be identified to each cluster center, and determine the financial cluster with the smallest distance as the target financial cluster to which the financial information data belongs; and determine the financial risk index value within the financial risk index interval of the target financial cluster according to that interval and the distance between the financial information data and the cluster center of the target financial cluster.
In one embodiment, the data information acquisition module 410 is configured to: perform Chinese word segmentation on the news corpus data of the enterprise to be identified and extract keywords from the news corpus data; input the keywords into a pre-constructed naive Bayes model, and use the naive Bayes model to calculate the probability that the enterprise to be identified is an illegal fund collection enterprise given that the keywords occur; and determine the public opinion risk index value of the enterprise to be identified according to the probability.
In one embodiment, the data information acquisition module 410 is configured to: obtain a news corpus training sample set, which includes news corpus samples of illegal fund collection enterprises and news corpus samples of non-illegal fund collection enterprises; calculate the prior probability that a news corpus training sample is a news corpus sample of each enterprise type; preprocess each news corpus training sample to obtain the feature words of the news corpus samples and generate a feature word matrix; calculate, according to the feature word matrix, the conditional probability of each feature word given that the news corpus training sample is a news corpus sample of each enterprise type; and construct the naive Bayes model according to the prior probabilities and the conditional probabilities.
For specific limitations on the acquisition device for illegal fund collection enterprises, reference may be made to the limitations on the acquisition method for illegal fund collection enterprises above, which are not repeated here. Each module in the above acquisition device may be implemented in whole or in part by software, hardware, or a combination thereof. Each module may be embedded in, or independent of, a processor of the computer device in hardware form, or stored in a memory of the computer device in software form, so that the processor can invoke it and perform the operations corresponding to that module.
In one embodiment, a computer device is provided. The computer device may be a server, and its internal structure may be as shown in Fig. 5. The computer device includes a processor, a memory, a network interface, and a database connected by a system bus. The processor of the computer device provides computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for running the operating system and the computer program stored in the non-volatile storage medium. The database of the computer device stores data such as financial information, news corpora, and enterprise information. The network interface of the computer device communicates with external terminals through a network connection. When executed by the processor, the computer program implements an acquisition method for illegal fund collection enterprises.
Those skilled in the art will understand that the structure shown in Fig. 5 is only a block diagram of the part of the structure relevant to the solution of the present application and does not limit the computer device to which the solution is applied; a specific computer device may include more or fewer components than shown in the figure, combine certain components, or have a different arrangement of components.
In one embodiment, a computer device is provided, including a memory and a processor, the memory storing a computer program, and the processor implementing the following steps when executing the computer program:
obtaining the financial information data, news corpus data, and enterprise information data of the enterprise to be identified, and separately calculating the financial risk index value and the public opinion risk index value of the enterprise to be identified;
generating the enterprise portrait of the enterprise to be identified according to the financial information data, the news corpus data, and the basic enterprise information, and calculating the similarity between the enterprise portrait and the pre-constructed risk enterprise portrait to obtain the illegal fund collection similarity;
constructing the enterprise association network according to the enterprise portrait and the risk enterprise portrait, and calculating the risk enterprise association value between the enterprise portrait and the risk enterprise portrait using the enterprise association network;
calculating the illegal fund collection risk value of the enterprise to be identified according to the public opinion risk index value, the financial risk index value, the illegal fund collection similarity, and the risk enterprise association value, and determining the enterprise to be identified as an illegal fund collection risk enterprise when the illegal fund collection risk value is greater than a preset alarm threshold.
In one embodiment, when the processor executes the computer program to implement the step of generating the enterprise portrait of the enterprise to be identified according to the financial information data, the news corpus data, and the basic enterprise information, the following steps are specifically implemented: constructing an enterprise label corresponding to the enterprise to be identified; converting the financial information data, the news corpus data, and the enterprise information data, according to a preset format, into structured target financial data, target public opinion data, and target enterprise information data, respectively; and generating, according to the target financial data, the target public opinion data, and the target enterprise information data, respectively, a financial information class label, a public opinion information class label, and an enterprise information class label corresponding to the enterprise label, to obtain the enterprise portrait of the enterprise to be identified.
In one embodiment, the enterprise information class label includes an enterprise personnel label and a business partner label; when the processor executes the computer program to implement the step of constructing the enterprise association network according to the enterprise portrait and the risk enterprise portrait, the following steps are specifically implemented: constructing entity nodes according to the enterprise labels of the target enterprise portraits, and constructing attribute nodes according to the enterprise personnel labels and the business partner labels, wherein the target enterprise portraits include the enterprise portrait of the enterprise to be identified and the risk enterprise portraits; obtaining, according to the target enterprise portraits and their corresponding enterprise personnel labels and business partner labels, the association relationships between the entity nodes and the association relationships between each entity node and each attribute node; and establishing the association network between the target enterprise portraits using the association relationships between the entity nodes and the association relationships between the entity nodes and the attribute nodes.
In one embodiment, when the processor executes the computer program to implement the step of calculating the similarity between the enterprise portrait and the pre-constructed risk enterprise portrait to obtain the illegal fund collection similarity, the following steps are specifically implemented: separately calculating the Jaccard coefficient of the financial information class labels, the Jaccard coefficient of the public opinion information class labels, and the Jaccard coefficient of the enterprise information class labels between the enterprise portrait and the risk enterprise portrait; and determining the illegal fund collection similarity between the enterprise portrait and the risk enterprise portrait according to the Jaccard coefficients.
In one embodiment, when the processor executes the computer program to implement the step of calculating the financial risk index value of the enterprise to be identified, the following steps are specifically implemented: obtaining first financial information data of the risk enterprise portraits; dividing the first financial information data into a preset number of financial clusters using a clustering algorithm; obtaining the cluster center of each financial cluster, and determining the financial risk index interval corresponding to each financial cluster center; calculating the distance from the financial information data of the enterprise to be identified to each cluster center, and determining the financial cluster with the smallest distance as the target financial cluster to which the financial information data belongs; and determining the financial risk index value within the financial risk index interval of the target financial cluster according to that interval and the distance between the financial information data and the cluster center of the target financial cluster.
In one embodiment, when the processor executes the computer program to implement the step of calculating the public opinion risk index value of the enterprise to be identified, the following steps are specifically implemented: performing Chinese word segmentation on the news corpus data of the enterprise to be identified and extracting keywords from the news corpus data; inputting the keywords into a pre-constructed naive Bayes model, and using the naive Bayes model to calculate the probability that the enterprise to be identified is an illegal fund collection enterprise given that the keywords occur; and determining the public opinion risk index value of the enterprise to be identified according to the probability.
In one embodiment, when the processor executes the computer program to implement the step of constructing the naive Bayes model, the following steps are specifically implemented: obtaining a news corpus training sample set, which includes news corpus samples of illegal fund collection enterprises and news corpus samples of non-illegal fund collection enterprises; calculating the prior probability that a news corpus training sample is a news corpus sample of each enterprise type; preprocessing each news corpus training sample to obtain the feature words of the news corpus samples and generating a feature word matrix; calculating, according to the feature word matrix, the conditional probability of each feature word given that the news corpus training sample is a news corpus sample of each enterprise type; and constructing the naive Bayes model according to the prior probabilities and the conditional probabilities.
In one embodiment, a computer-readable storage medium is provided, on which a computer program is stored. When executed by a processor, the computer program implements the following steps:
Obtaining the financial information data, news corpus data and enterprise information data of an enterprise to be identified, and separately calculating the financial risk index value and the public opinion risk index value of the enterprise to be identified;
Generating an enterprise portrait of the enterprise to be identified according to the financial information data, the news corpus data and the enterprise basic information, and calculating the similarity between the enterprise portrait and a pre-constructed risk enterprise portrait to obtain an illegal fund collection similarity;
Constructing an enterprise association network according to the enterprise portrait and the risk enterprise portrait, and using the enterprise association network to calculate a risk enterprise association value between the enterprise portrait and the risk enterprise portrait;
Calculating the illegal fund collection risk value of the enterprise to be identified according to the public opinion risk index value, the financial risk index value, the illegal fund collection similarity and the risk enterprise association value; when the illegal fund collection risk value is greater than a preset alert threshold, the enterprise to be identified is determined to be an illegal fund collection risk enterprise.
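The final step above combines four values and compares the result with a threshold. The text does not state how the four values are combined, so the following is only a sketch that assumes a weighted sum of values scaled to [0, 1]; the weights, the threshold and all names (`RiskInputs`, `WEIGHTS`, `ALERT_THRESHOLD`) are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class RiskInputs:
    # All four values are assumed to be scaled to [0, 1].
    public_opinion_risk: float
    financial_risk: float
    similarity: float
    association: float

# Hypothetical weights and threshold; the source does not specify them.
WEIGHTS = {
    "public_opinion_risk": 0.3,
    "financial_risk": 0.3,
    "similarity": 0.2,
    "association": 0.2,
}
ALERT_THRESHOLD = 0.6

def illegal_fund_collection_risk(inputs, weights=WEIGHTS):
    """Weighted sum of the four indicators (an assumed combination rule)."""
    return sum(weight * getattr(inputs, name) for name, weight in weights.items())

def is_risk_enterprise(inputs, threshold=ALERT_THRESHOLD):
    """Flag the enterprise when the risk value exceeds the preset alert threshold."""
    return illegal_fund_collection_risk(inputs) > threshold
```

Any other monotone combination (for example a trained classifier over the four values) would fit the described step equally well; the weighted sum is chosen here only because it is the simplest to read.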
In one embodiment, when the computer program is executed by the processor to implement the step of generating the enterprise portrait of the enterprise to be identified according to the financial information data, the news corpus data and the enterprise basic information, the following steps are specifically implemented: constructing an enterprise label corresponding to the enterprise to be identified; converting the financial information data, the news corpus data and the enterprise information data, according to a preset format, into structured target financial data, target public opinion data and target enterprise information data, respectively; and generating, from the target financial data, the target public opinion data and the target enterprise information data respectively, the financial information class labels, public opinion information class labels and enterprise information class labels corresponding to the enterprise label, thereby obtaining the enterprise portrait of the enterprise to be identified.
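The portrait generation described above can be illustrated roughly as follows. This is a sketch under stated assumptions: the raw field names, the "preset format" (percent and comma-separated strings converted to numbers), the label names and the cut-off values are all hypothetical and are not taken from the source.

```python
def structure_financial(raw):
    """Convert raw strings into numbers (the 'preset format' here is hypothetical)."""
    return {
        "annualized_return": float(raw["annualized_return"].rstrip("%")) / 100,
        "registered_capital": float(raw["registered_capital"].replace(",", "")),
    }

def financial_labels(data, return_limit=0.12, capital_limit=10_000_000):
    """Turn structured financial data into class labels using assumed cut-offs."""
    labels = set()
    labels.add("high_promised_return" if data["annualized_return"] > return_limit
               else "ordinary_return")
    labels.add("small_capital" if data["registered_capital"] < capital_limit
               else "large_capital")
    return labels

def build_portrait(name, raw_financial, news_keywords, personnel, partners):
    """An enterprise portrait: one enterprise label plus three groups of class labels."""
    return {
        "enterprise_label": name,
        "financial_labels": financial_labels(structure_financial(raw_financial)),
        "public_opinion_labels": set(news_keywords),
        "enterprise_info_labels": {
            "personnel": set(personnel),
            "partners": set(partners),
        },
    }
```

Representing each label group as a set keeps the portrait directly usable for set-based similarity measures and for building graph nodes from the personnel and partner labels.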
In one embodiment, the enterprise information class labels include enterprise personnel labels and business partner labels. When the computer program is executed by the processor to implement the step of constructing the enterprise association network according to the enterprise portrait and the risk enterprise portrait, the following steps are specifically implemented: constructing entity nodes according to the enterprise labels of the target enterprise portraits, and constructing attribute nodes according to the enterprise personnel labels and business partner labels, where the target enterprise portraits include the enterprise portrait of the enterprise to be identified and the risk enterprise portrait; obtaining, according to the target enterprise portraits and their corresponding enterprise personnel labels and business partner labels, the association relations between the entity nodes and the association relations between each entity node and each attribute node; and establishing the association network between the target enterprise portraits from the association relations between the entity nodes and the association relations between the entity nodes and the attribute nodes.
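The network construction above can be sketched with plain dictionaries. The source does not define how the risk enterprise association value is computed from the network, so the measure below (the share of an enterprise's attribute nodes that are also linked to a risk enterprise) is purely an assumption, as are the function names and the portrait field names.

```python
from collections import defaultdict

def build_association_network(portraits):
    """Entity nodes are enterprise labels; attribute nodes are people and partners."""
    entity_to_attrs = defaultdict(set)   # entity node -> attribute nodes
    attr_to_entities = defaultdict(set)  # attribute node -> entity nodes
    for portrait in portraits:
        entity = portrait["enterprise_label"]
        attribute_nodes = [("person", p) for p in portrait["personnel"]]
        attribute_nodes += [("partner", p) for p in portrait["partners"]]
        for node in attribute_nodes:
            entity_to_attrs[entity].add(node)
            attr_to_entities[node].add(entity)
    return entity_to_attrs, attr_to_entities

def entity_links(entity_to_attrs, attr_to_entities):
    """Entity-to-entity relations: two enterprises are linked if they share an attribute node."""
    links = defaultdict(set)
    for entity, attrs in entity_to_attrs.items():
        for node in attrs:
            links[entity] |= attr_to_entities[node] - {entity}
    return links

def risk_association_value(target, risk_labels, entity_to_attrs, attr_to_entities):
    """Assumed measure: fraction of the target's attribute nodes shared with a risk enterprise."""
    attrs = entity_to_attrs.get(target, set())
    if not attrs:
        return 0.0
    risk = set(risk_labels) - {target}
    shared = [node for node in attrs if attr_to_entities[node] & risk]
    return len(shared) / len(attrs)
```

Tagging each attribute node with its kind (`"person"` or `"partner"`) avoids accidentally merging a person and a partner company that happen to share a name.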
In one embodiment, when the computer program is executed by the processor to implement the step of calculating the similarity between the enterprise portrait and the pre-constructed risk enterprise portrait to obtain the illegal fund collection similarity, the following steps are specifically implemented: separately calculating, between the enterprise portrait and the risk enterprise portrait, the Jaccard coefficient of the financial information class labels, the Jaccard coefficient of the public opinion information class labels and the Jaccard coefficient of the enterprise information class labels; and determining the illegal fund collection similarity between the enterprise portrait and the risk enterprise portrait according to the Jaccard coefficients.
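The Jaccard coefficient of two label sets A and B is |A ∩ B| / |A ∪ B|. A minimal sketch of the per-category computation follows; combining the three coefficients by a simple average is an assumption (the text only says the similarity is determined "according to" them), and the portrait keys are hypothetical.

```python
def jaccard(a, b):
    """Jaccard coefficient |A ∩ B| / |A ∪ B|; defined as 0.0 when both sets are empty."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def illegal_fund_collection_similarity(
    portrait,
    risk_portrait,
    categories=("financial", "public_opinion", "enterprise_info"),
):
    """Average the per-category Jaccard coefficients (the averaging is an assumption)."""
    scores = [jaccard(portrait[c], risk_portrait[c]) for c in categories]
    return sum(scores) / len(scores)
```

With several risk enterprise portraits, one plausible usage is to compute this similarity against each of them and keep the maximum.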
In one embodiment, when the computer program is executed by the processor to implement the step of calculating the financial risk index value of the enterprise to be identified, the following steps are specifically implemented: obtaining first financial information data of the risk enterprise portrait; using a clustering algorithm to divide the first financial information data into a preset number of financial clusters; obtaining the cluster center of each financial cluster, and determining the financial risk index interval corresponding to each financial cluster center; calculating the distance from the financial information data of the enterprise to be identified to each cluster center, and determining the financial cluster with the smallest distance as the target financial cluster to which the financial information data belongs; and determining the financial risk index value within the financial risk index interval of the target financial cluster, according to that interval and the distance between the financial information data and the cluster center of the target financial cluster.
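The clustering step can be sketched as below. The text names only "a clustering algorithm", so plain k-means is used here as an assumed choice. The initialization (first k points), the interval assigned to each cluster and the linear mapping from distance to a value inside the interval (controlled by a hypothetical `scale` parameter) are likewise assumptions made for illustration.

```python
import math

def kmeans(points, k, iterations=20):
    """Plain k-means; the first k points serve as initial centers (a simplification)."""
    centers = [tuple(p) for p in points[:k]]
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centers[i]))
            clusters[nearest].append(p)
        # Recompute each center as the mean of its members; keep it if the cluster is empty.
        centers = [
            tuple(sum(axis) / len(members) for axis in zip(*members)) if members else centers[i]
            for i, members in enumerate(clusters)
        ]
    return centers, clusters

def financial_risk_index(point, centers, intervals, scale):
    """Pick the nearest cluster, then place the value inside that cluster's interval.

    Assumption: a point at the center gets the top of the interval, and the value
    falls linearly to the bottom of the interval as the distance approaches `scale`.
    """
    target = min(range(len(centers)), key=lambda i: math.dist(point, centers[i]))
    low, high = intervals[target]
    closeness = max(0.0, 1.0 - math.dist(point, centers[target]) / scale)
    return target, low + (high - low) * closeness

# Hypothetical two-dimensional financial features of known risk enterprises.
risk_financials = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centers, _ = kmeans(risk_financials, k=2)
intervals = [(0.0, 0.3), (0.7, 1.0)]  # assumed risk index interval per cluster
```

In practice the features would need to be normalized before clustering so that no single financial field dominates the Euclidean distance.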
In one embodiment, when the computer program is executed by the processor to implement the step of calculating the public opinion risk index value of the enterprise to be identified, the following steps are specifically implemented: performing Chinese word segmentation on the news corpus data of the enterprise to be identified, and extracting the keywords in the news corpus data; inputting the keywords into a pre-constructed naive Bayes model, and using the naive Bayes model to calculate the probability that the enterprise to be identified is an illegal fund collection enterprise given that the keywords occur; and determining the public opinion risk index value of the enterprise to be identified according to the probability value.
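The scoring step can be sketched as a posterior computation in log space. The sketch assumes the news text has already been segmented into keywords by some Chinese word segmenter, that the priors and per-class word probabilities are available as dictionaries, and that the probability itself is used as the public opinion risk index value; the class names, the `unseen` fallback probability and the numbers in the usage note are hypothetical.

```python
import math

def posterior_illegal(keywords, priors, cond_probs, unseen=1e-6):
    """P(illegal | keywords) under the naive Bayes independence assumption."""
    log_scores = {}
    for cls, prior in priors.items():
        score = math.log(prior)
        for word in keywords:
            # Words missing from the model get a small fallback probability (assumption).
            score += math.log(cond_probs[cls].get(word, unseen))
        log_scores[cls] = score
    # Normalize in a numerically stable way.
    top = max(log_scores.values())
    weights = {cls: math.exp(score - top) for cls, score in log_scores.items()}
    return weights["illegal"] / sum(weights.values())
```

For example, with equal priors and a word that is ten times more likely in the "illegal" class than in the "normal" class, the posterior comes out at 10/11, which would then be taken as the risk index value.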
In one embodiment, when the computer program is executed by the processor, the construction step of the naive Bayes model is also implemented, specifically through the following steps: obtaining a news corpus training sample set, the news corpus training sample set including news corpus samples of illegal fund collection enterprises and news corpus samples of non-illegal fund collection enterprises; calculating the prior probability that each news corpus training sample is a news corpus sample of each enterprise type; preprocessing each news corpus training sample to obtain the feature words of the news corpus samples, and generating a feature word matrix; calculating, according to the feature word matrix, the conditional probability of each feature word given that a news corpus training sample is a news corpus sample of each enterprise type; and constructing the naive Bayes model according to the prior probabilities and the conditional probabilities.
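The construction steps above can be sketched as follows, assuming each training sample has already been preprocessed into a list of feature words with a class label. Laplace (add-one) smoothing is an assumption added here so that no conditional probability is zero; the source does not mention it, and the function and label names are hypothetical.

```python
from collections import Counter

def train_naive_bayes(samples):
    """samples: list of (feature_words, class_label) pairs."""
    vocab = sorted({word for words, _ in samples for word in words})
    # Feature word matrix: one row per sample, one column per vocabulary word.
    matrix = []
    for words, _ in samples:
        counts = Counter(words)
        matrix.append([counts[word] for word in vocab])
    labels = [label for _, label in samples]
    classes = sorted(set(labels))
    # Prior probability of each enterprise type.
    priors = {cls: labels.count(cls) / len(labels) for cls in classes}
    # Conditional probability of each feature word per type, with add-one smoothing.
    cond_probs = {}
    for cls in classes:
        column_totals = [
            sum(row[j] for row, label in zip(matrix, labels) if label == cls)
            for j in range(len(vocab))
        ]
        total = sum(column_totals)
        cond_probs[cls] = {
            word: (column_totals[j] + 1) / (total + len(vocab))
            for j, word in enumerate(vocab)
        }
    return vocab, matrix, priors, cond_probs
```

The returned priors and conditional probabilities are all that is needed to score new keyword lists with the usual naive Bayes posterior formula.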
Those of ordinary skill in the art will understand that all or part of the processes in the methods of the above embodiments can be completed by a computer program instructing the relevant hardware. The computer program can be stored in a non-volatile computer-readable storage medium and, when executed, may include the processes of the embodiments of the above methods. Any reference to memory, storage, database or other media used in the embodiments provided in this application may include non-volatile and/or volatile memory. Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM) or flash memory. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM) and Rambus dynamic RAM (RDRAM).
The technical features of the above embodiments can be combined arbitrarily. For brevity, not all possible combinations of the technical features in the above embodiments are described; however, as long as a combination of these technical features contains no contradiction, it should be considered within the scope of this specification.
The above embodiments express only several implementations of this application. Their description is relatively specific and detailed, but they should not therefore be construed as limiting the scope of the patent. It should be noted that those of ordinary skill in the art can make several modifications and improvements without departing from the concept of this application, all of which fall within the protection scope of this application. Therefore, the protection scope of this patent shall be subject to the appended claims.
Claims (10)
1. A method for acquiring an illegal fund collection enterprise, the method comprising:
Obtaining the financial information data, news corpus data and enterprise information data of an enterprise to be identified, and separately calculating the financial risk index value and the public opinion risk index value of the enterprise to be identified;
Generating an enterprise portrait of the enterprise to be identified according to the financial information data, the news corpus data and the enterprise basic information, and calculating the similarity between the enterprise portrait and a pre-constructed risk enterprise portrait to obtain an illegal fund collection similarity;
Constructing an enterprise association network according to the enterprise portrait and the risk enterprise portrait, and using the enterprise association network to calculate a risk enterprise association value between the enterprise portrait and the risk enterprise portrait;
Calculating the illegal fund collection risk value of the enterprise to be identified according to the public opinion risk index value, the financial risk index value, the illegal fund collection similarity and the risk enterprise association value, and, when the illegal fund collection risk value is greater than a preset alert threshold, determining the enterprise to be identified to be an illegal fund collection risk enterprise.
2. The method according to claim 1, wherein the step of generating the enterprise portrait of the enterprise to be identified according to the financial information data, the news corpus data and the enterprise basic information comprises:
Constructing an enterprise label corresponding to the enterprise to be identified;
Converting the financial information data, the news corpus data and the enterprise information data, according to a preset format, into structured target financial data, target public opinion data and target enterprise information data, respectively;
Generating, from the target financial data, the target public opinion data and the target enterprise information data respectively, the financial information class labels, public opinion information class labels and enterprise information class labels corresponding to the enterprise label, to obtain the enterprise portrait of the enterprise to be identified.
3. The method according to claim 2, wherein the enterprise information class labels include enterprise personnel labels and business partner labels;
The step of constructing the enterprise association network according to the enterprise portrait and the risk enterprise portrait comprises the following steps:
Constructing entity nodes according to the target enterprise portraits, and constructing attribute nodes according to the enterprise personnel labels and business partner labels, wherein the target enterprise portraits include the enterprise portrait of the enterprise to be identified and the risk enterprise portrait;
Obtaining, according to the target enterprise portraits and the enterprise personnel labels and business partner labels corresponding to the target enterprise portraits, the association relations between the entity nodes and the association relations between each entity node and each attribute node;
Establishing the association network between the target enterprise portraits from the association relations between the entity nodes and the association relations between the entity nodes and the attribute nodes.
4. The method according to claim 2, wherein the step of calculating the similarity between the enterprise portrait and the pre-constructed risk enterprise portrait to obtain the illegal fund collection similarity comprises:
Separately calculating, between the enterprise portrait and the risk enterprise portrait, the Jaccard coefficient of the financial information class labels, the Jaccard coefficient of the public opinion information class labels and the Jaccard coefficient of the enterprise information class labels;
Determining the illegal fund collection similarity between the enterprise portrait and the risk enterprise portrait according to the Jaccard coefficients.
5. The method according to claim 1, wherein the step of calculating the financial risk index value of the enterprise to be identified comprises:
Obtaining first financial information data of the risk enterprise portrait;
Using a clustering algorithm to divide the first financial information data into a preset number of financial clusters;
Obtaining the cluster center of each financial cluster, and determining the financial risk index interval corresponding to each financial cluster center;
Calculating the distance from the financial information data of the enterprise to be identified to each cluster center, and determining the financial cluster with the smallest distance as the target financial cluster to which the financial information data belongs;
Determining the financial risk index value within the financial risk index interval of the target financial cluster, according to that interval and the distance between the financial information data and the cluster center of the target financial cluster.
6. The method according to claim 1, wherein the step of calculating the public opinion risk index value of the enterprise to be identified comprises:
Performing Chinese word segmentation on the news corpus data of the enterprise to be identified, and extracting the keywords in the news corpus data;
Inputting the keywords into a pre-constructed naive Bayes model, and using the naive Bayes model to calculate the probability that the enterprise to be identified is an illegal fund collection enterprise given that the keywords occur;
Determining the public opinion risk index value of the enterprise to be identified according to the probability value.
7. The method according to claim 6, wherein the construction step of the naive Bayes model comprises:
Obtaining a news corpus training sample set, the news corpus training sample set including news corpus samples of illegal fund collection enterprises and news corpus samples of non-illegal fund collection enterprises;
Calculating the prior probability that each news corpus training sample is a news corpus sample of each enterprise type;
Preprocessing each news corpus training sample to obtain the feature words of the news corpus samples, and generating a feature word matrix;
Calculating, according to the feature word matrix, the conditional probability of each feature word given that a news corpus training sample is a news corpus sample of each enterprise type;
Constructing the naive Bayes model according to the prior probabilities and the conditional probabilities.
8. A device for acquiring an illegal fund collection enterprise, wherein the device comprises:
A data information obtaining module, configured to obtain the financial information data, news corpus data and enterprise information data of an enterprise to be identified, and to separately calculate the financial risk index value and the public opinion risk index value of the enterprise to be identified;
A similarity obtaining module, configured to generate an enterprise portrait of the enterprise to be identified according to the financial information data, the news corpus data and the enterprise basic information, and to calculate the similarity between the enterprise portrait and a pre-constructed risk enterprise portrait to obtain an illegal fund collection similarity;
A risk enterprise association value obtaining module, configured to construct an enterprise association network according to the enterprise portrait and the risk enterprise portrait, and to use the enterprise association network to calculate a risk enterprise association value between the enterprise portrait and the risk enterprise portrait;
An illegal fund collection risk enterprise obtaining module, configured to calculate the illegal fund collection risk value of the enterprise to be identified according to the public opinion risk index value, the financial risk index value, the illegal fund collection similarity and the risk enterprise association value, and, when the illegal fund collection risk value is greater than a preset alert threshold, to determine the enterprise to be identified to be an illegal fund collection risk enterprise.
9. A computer device, comprising a memory and a processor, the memory storing a computer program, wherein the processor, when executing the computer program, implements the steps of the method according to any one of claims 1 to 7.
10. A computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the steps of the method according to any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811339775.6A CN109523153A (en) | 2018-11-12 | 2018-11-12 | Acquisition methods, device, computer equipment and the storage medium of illegal fund collection enterprise |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109523153A (en) | 2019-03-26 |
Family
ID=65773506
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811339775.6A Pending CN109523153A (en) | 2018-11-12 | 2018-11-12 | Acquisition methods, device, computer equipment and the storage medium of illegal fund collection enterprise |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109523153A (en) |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104463215A (en) * | 2014-12-10 | 2015-03-25 | 东北大学 | Tiny aneurysm occurrence risk prediction system based on retina image processing |
CN105913195A (en) * | 2016-04-29 | 2016-08-31 | 浙江汇信科技有限公司 | All-industry data based enterprise's financial risk scoring method |
US20170262764A1 (en) * | 2016-03-11 | 2017-09-14 | Wipro Limited | System and method for predicting and managing the risks in a supply chain network |
CN107798597A (en) * | 2017-10-09 | 2018-03-13 | 上海二三四五金融科技有限公司 | A kind of dynamic excessive risk visitor group detection method and system |
CN107909274A (en) * | 2017-11-17 | 2018-04-13 | 平安科技(深圳)有限公司 | Enterprise investment methods of risk assessment, device and storage medium |
CN107909466A (en) * | 2017-11-10 | 2018-04-13 | 平安普惠企业管理有限公司 | Customer relationship network display method, apparatus, equipment and readable storage medium storing program for executing |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||