WO2019095572A1 - Enterprise investment risk assessment method, device, and storage medium - Google Patents

Enterprise investment risk assessment method, device, and storage medium

Info

Publication number
WO2019095572A1
Authority
WO
WIPO (PCT)
Prior art keywords
enterprise
entity
entities
feature vector
enterprise entity
Prior art date
Application number
PCT/CN2018/076169
Other languages
French (fr)
Chinese (zh)
Inventor
汪伟
罗傲雪
肖京
Original Assignee
平安科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司
Publication of WO2019095572A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00: Administration; Management
    • G06Q10/06: Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063: Operations research, analysis or management
    • G06Q10/0635: Risk analysis of enterprise or organisation activities

Definitions

  • The present application relates to the field of computer technology, and in particular to an enterprise investment risk assessment method, an electronic device, and a computer-readable storage medium.
  • Every news site publishes thousands of news items every day, and the news is updated in real time.
  • From a large news corpus, big data associated with the investment target enterprise can be obtained, such as the internal state of the enterprise (operations, finance, executives, recruitment, website update frequency, and so on) and its external state (the status of affiliated companies and of upstream and downstream customers, rating agencies' ratings of the enterprise, related news media reports, and so on). This information can be formed into a relationship network in order to analyze and evaluate the risk factor of the investment target enterprise, so that the investor can consider whether the risk is acceptable and decide, according to the risk factor, whether to invest in the enterprise. Therefore, how to extract the information related to the investment target enterprise from the news corpus and use it for risk assessment is an urgent problem to be solved.
  • The present application provides an enterprise investment risk assessment method, an electronic device, and a computer-readable storage medium, the main purpose of which is to evaluate the risk of investing in the target enterprise by analyzing the information disclosed in the news corpus.
  • The present application provides an electronic device including a memory, a processor, and an enterprise investment risk assessment program that is stored in the memory and runnable on the processor. When executed by the processor, the program implements the following steps:
  • A1. Crawl the news corpus related to the enterprise entity whose risk is to be assessed, preprocess the news corpus, and extract from the preprocessed news corpus the other entities associated with the enterprise entity;
  • A2. Construct a relationship network between the enterprise entity and the other entities, using names as nodes and the relationships between the enterprise entity and the other entities as edges;
  • A3. Calculate a vector representation of the enterprise entity according to the relationship network, and generate a first feature vector of the enterprise entity;
  • A4. Quantize the internal information of the enterprise entity according to a first preset rule to generate a second feature vector of the enterprise entity;
  • A5. Extract external information of the enterprise entity from the news corpus, and quantize it according to a second preset rule to generate a third feature vector of the enterprise entity;
  • A6. Input the first feature vector, the second feature vector, and the third feature vector into a predetermined enterprise risk assessment model, and output a risk label corresponding to the enterprise entity.
  • The present application further provides an enterprise investment risk assessment method, which includes:
  • S1. Crawl the news corpus related to the enterprise entity whose risk is to be assessed, preprocess the news corpus, and extract from the preprocessed news corpus the other entities associated with the enterprise entity;
  • The present application further provides a computer-readable storage medium storing an enterprise investment risk assessment program which, when executed by a processor, implements any step of the enterprise investment risk assessment method described above.
  • The present application further provides an enterprise investment risk assessment program comprising an extraction module, a construction module, a first calculation module, a second calculation module, a third calculation module, and an evaluation module. When executed by a processor, the program implements any step of the enterprise investment risk assessment method described above.
  • The enterprise investment risk assessment method, electronic device, and computer-readable storage medium proposed by the present application obtain, from the news corpus, the relationships between the enterprise entity and its associated entities, the internal information of the enterprise entity, and its external information, and from these respectively derive the first feature vector, the second feature vector, and the third feature vector of the enterprise entity.
  • The risk assessment model and the three feature vectors are then used to perform risk assessment on the enterprise entity to be invested in, so that the investor can capture market investment opportunities and predict investment risks in advance.
  • FIG. 1 is a schematic diagram of an application environment of a preferred embodiment of the enterprise investment risk assessment method of the present application;
  • FIG. 2 is a network diagram of the relationships between enterprise entity A and its associated entities;
  • FIG. 3 is a vector representation of enterprise entity A;
  • FIG. 4 is a block diagram of the enterprise investment risk assessment program of FIG. 1;
  • FIG. 5 is a flowchart of a preferred embodiment of the enterprise investment risk assessment method of the present application.
  • The present application provides an enterprise investment risk assessment method, which is applied to an electronic device 1.
  • The electronic device 1 may be a PC (Personal Computer), or a terminal device such as a smartphone, a tablet computer, an e-book reader, or a portable computer.
  • The electronic device 1 includes a memory 11, a processor 12, a communication bus 13, and a network interface 14.
  • The memory 11 includes at least one type of readable storage medium, including a flash memory, a hard disk, a multimedia card, a card-type memory (for example, an SD or DX memory), a magnetic memory, a magnetic disk, an optical disk, and the like.
  • In some embodiments, the memory 11 may be an internal storage unit of the electronic device 1, such as a hard disk of the electronic device 1.
  • In other embodiments, the memory 11 may be an external storage device of the electronic device 1, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a flash card fitted to the electronic device 1.
  • The memory 11 may also include both an internal storage unit of the electronic device 1 and an external storage device.
  • The memory 11 can be used not only for storing application software installed in the electronic device 1 and various types of data, such as the enterprise investment risk assessment program 10, but also for temporarily storing data that has been output or will be output.
  • The processor 12 may be a Central Processing Unit (CPU), controller, microcontroller, microprocessor, or other data processing chip, used to run program code or process data stored in the memory 11, for example to execute the enterprise investment risk assessment program 10.
  • The communication bus 13 is used to implement connection and communication between these components.
  • The network interface 14 may optionally include a standard wired interface and a wireless interface (such as a Wi-Fi interface), and is typically used to establish a communication connection between the electronic device 1 and other electronic devices.
  • FIG. 1 shows only the electronic device 1 with components 11-14, but it should be understood that not all of the illustrated components are required, and more or fewer components may be implemented instead.
  • The electronic device 1 may further include a user interface. The user interface may include a display and an input unit such as a keyboard, and may optionally further include a standard wired interface and a wireless interface.
  • The display may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (Organic Light-Emitting Diode) touch device, or the like.
  • The display may also be referred to as a display screen or display unit, and is used for displaying information processed in the electronic device 1 and for displaying a visualized user interface.
  • The enterprise investment risk assessment program 10 is stored in the memory 11. When the processor 12 executes the enterprise investment risk assessment program 10 stored in the memory 11, the following steps are implemented:
  • A1. Crawl the news corpus related to the enterprise entity whose risk is to be assessed, preprocess the news corpus, and extract from the preprocessed news corpus the other entities associated with the enterprise entity;
  • A2. Construct a relationship network between the enterprise entity and the other entities, using names as nodes and the relationships between the enterprise entity and the other entities as edges;
  • A3. Calculate a vector representation of the enterprise entity according to the relationship network, and generate a first feature vector of the enterprise entity;
  • A4. Quantize the internal information of the enterprise entity according to a first preset rule to generate a second feature vector of the enterprise entity;
  • A5. Extract external information of the enterprise entity from the news corpus, and quantize it according to a second preset rule to generate a third feature vector of the enterprise entity;
  • A6. Input the first feature vector, the second feature vector, and the third feature vector into a predetermined enterprise risk assessment model, and output a risk label corresponding to the enterprise entity.
  • A corpus may cover a plurality of different fields.
  • This embodiment uses a news corpus as an example to describe a specific solution of the present application, but the application is not limited to the field of news.
  • When the investor needs to learn from current news the internal data and external data associated with the investment target enterprise, a web crawler is used to crawl online news from the Internet, for example online news from Sina, Baidu, Tencent, and so on. Understandably, each enterprise operates differently in different time periods. Therefore, to give investors a more accurate understanding of the investment target enterprise, the crawled online news is filtered in the time dimension: a preset time interval is set and only online news from that period is crawled, for example only online news from the most recent half year.
  • Because the sources of the news corpus are diverse, the corpus contains many types of content.
  • Therefore, the news corpus needs to be preprocessed to obtain news corpus text data, which forms a news corpus text set.
  • The preprocessing may include one or more of: unifying the format of the news corpus into a text format, removing advertising noise from the news corpus, and filtering out dirty words and sensitive words.
  • When the format of the news corpus is unified into a text format, content that cannot be converted into text by current technology can be filtered out.
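  • The preprocessing described above can be sketched roughly as follows. This is a minimal illustration, not the applicant's implementation: the item fields `published` and `body`, the marker and word lists, the 180-day window, and the crude sentence split on periods are all assumptions made for the example.

```python
import re
from datetime import datetime, timedelta

# Hypothetical lists; a real system would load maintained resources instead.
AD_MARKERS = ("Sponsored", "Advertisement")
BLOCKED_WORDS = ("damn",)

def strip_html(raw: str) -> str:
    """Unify the format to plain text: drop tags and collapse whitespace."""
    text = re.sub(r"<[^>]+>", " ", raw)
    return re.sub(r"\s+", " ", text).strip()

def preprocess(items, now, window_days=180):
    """Keep recent items, convert them to text, drop ad sentences and blocked words."""
    cutoff = now - timedelta(days=window_days)
    corpus = []
    for item in items:
        if item["published"] < cutoff:
            continue  # outside the preset time interval
        # Crude sentence split on "."; good enough for a sketch only.
        sentences = [s.strip() for s in strip_html(item["body"]).split(".") if s.strip()]
        kept = [s for s in sentences if not any(m in s for m in AD_MARKERS)]
        text = ". ".join(kept)
        for word in BLOCKED_WORDS:
            text = re.sub(re.escape(word), "", text, flags=re.IGNORECASE)
        text = re.sub(r"\s+", " ", text).strip()
        if text:
            corpus.append(text)
    return corpus

items = [
    {"published": datetime(2018, 1, 10),
     "body": "<p>Company A signed a supply deal with B2.</p><div>Sponsored: buy now.</div>"},
    {"published": datetime(2017, 1, 1), "body": "<p>Old news.</p>"},
]
corpus = preprocess(items, now=datetime(2018, 2, 1))
```

  • Here the second item falls outside the time window and the advertising sentence is dropped, so only the supply-deal sentence remains in `corpus`.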
  • All enterprise names are extracted from the preprocessed news corpus; then, according to the associated enterprise data of the enterprise entity whose risk is to be assessed (i.e., the investment target enterprise), the other entities associated with that enterprise entity are filtered out, and the enterprise entity and the other entities are built into a relationship network.
  • The associated enterprise data can be obtained from third-party data. It can be understood that many other entities associated with the enterprise entity may be extracted from the news corpus, and building all of them into the relationship network would be unreasonable. Therefore, before the relationship network is constructed, the extracted entities associated with the enterprise entity are filtered and screened.
  • The other entities associated with the enterprise entity that are retained by the filtering and screening step include: shareholder companies of the enterprise entity, other entities that have monetary dealings with the enterprise entity, suppliers, customers, credit rating agencies, and so on.
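  • A minimal sketch of the filtering and screening step, assuming the third-party associated enterprise data is available as a mapping from entity name to relation type. The relation labels and the function name `filter_associated` are hypothetical and chosen only for this illustration.

```python
# Relation types retained after screening (labels are illustrative assumptions).
RETAINED_RELATIONS = {
    "shareholder", "monetary_dealings", "supplier", "customer", "rating_agency",
}

def filter_associated(extracted_names, associated_data):
    """Keep extracted names that third-party data links to the target with a retained relation."""
    kept = {}
    for name in extracted_names:
        relation = associated_data.get(name)
        if relation in RETAINED_RELATIONS:
            kept[name] = relation
    return kept

retained = filter_associated(
    ["B1", "B2", "X", "Y"],
    {"B1": "rating_agency", "B2": "supplier", "Y": "competitor"},
)
```

  • Names with no entry in the third-party data (`X`) or with a relation type outside the retained set (`Y`) are discarded before the network is built.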
  • For example, B1 is the rating agency that gives enterprise entity A a credit rating; from the historical rating data it can be learned that B1 gave enterprise entity A a credit rating of BBB.
  • B2 is a supplier that supplies enterprise entity A with raw materials or goods, and the amount owed is 300,000.
  • B3 is a customer of enterprise entity A, and enterprise entity A has defaulted to B3 twice.
  • Enterprise entity A, B1, B2, and B3 are used as nodes, and the relationships between A and each of B1, B2, and B3 are used as edges, to construct the network diagram of the relationships between the enterprise entity and other entities shown in FIG. 2.
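  • The network for enterprise entity A can be represented, for instance, as an adjacency map. The sketch below assumes an undirected graph and uses plain-text relation labels; both choices, and the function name, are assumptions for illustration rather than details given in the application.

```python
from collections import defaultdict

def build_relationship_network(target, relations):
    """Return an undirected adjacency map: node name -> {neighbor name: relation label}."""
    graph = defaultdict(dict)
    for other, label in relations:
        graph[target][other] = label  # edge seen from the target enterprise
        graph[other][target] = label  # same edge seen from the associated entity
    return dict(graph)

relations_of_a = [
    ("B1", "rating agency"),
    ("B2", "supplier"),
    ("B3", "customer"),
]
network = build_relationship_network("A", relations_of_a)
```

  • The result has four nodes (A, B1, B2, B3) and three edges, matching the example described for FIG. 2.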
  • Based on the relationship network, the vector representation of enterprise entity A is calculated.
  • This embodiment adopts the Skip-Gram method, because the vector representation of enterprise entity A in the relationship network is related to the vector representations of the entities B1, B2, and B3 associated with it.
  • The Skip-Gram method uses the current enterprise entity to predict the surrounding entities, as shown in FIG. 3. An1, An2, An3, and An4 in FIG. 3 are unordered and represent the neighboring entities of enterprise entity A.
  • A fixed prediction length L is set in order to predict L neighboring entities around enterprise entity A. If there are fewer than L real neighboring entities, the output is NULL.
  • In this way the vector representation embedding(E1), embedding(E2), ... of enterprise entity A can be obtained, and this vector representation is used as the first feature vector of enterprise entity A.
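  • As a rough illustration of the fixed prediction length L, the sketch below generates the (center, context) pairs that a Skip-Gram model could be trained on, padding with NULL when an entity has fewer than L neighbors. It covers only sample generation, not the embedding training itself; sorting neighbors (for reproducibility) and truncating when there are more than L neighbors are assumptions of this example.

```python
NULL = "NULL"

def skipgram_samples(graph, length):
    """For every node, emit (center, context) pairs for exactly `length` neighbor slots."""
    samples = []
    for center in sorted(graph):
        # Neighbors are unordered in the source; sorting only makes the output deterministic.
        neighbors = sorted(graph[center])[:length]
        # Pad with NULL when there are fewer than `length` real neighbors.
        neighbors += [NULL] * (length - len(neighbors))
        samples.extend((center, ctx) for ctx in neighbors)
    return samples

graph = {"A": {"B1", "B2", "B3"}, "B1": {"A"}, "B2": {"A"}, "B3": {"A"}}
pairs = skipgram_samples(graph, length=4)
```

  • With L = 4, entity A yields its three real neighbors plus one NULL slot, and each of B1, B2, and B3 yields A plus three NULL slots.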
  • Each reference factor in the internal information of the enterprise is converted into a number so that it can be quantified.
  • For example, a value in the financial information can be converted into a feature value: if the net profit is 300,000 yuan, 30 is the corresponding feature value. The frequency of website updates and the number of people hired in the most recent year are also numerical values and can likewise be converted according to preset conversion rules. In other embodiments, 300,000 yuan can also be converted into other values according to a preset conversion ratio. After each reference factor in the internal information of enterprise entity A is quantized, the second feature vector of enterprise entity A is generated.
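  • A minimal sketch of this quantization, assuming a conversion ratio of 10,000 yuan per unit (inferred from the 300,000 yuan to 30 example). The field names and the sample values for website updates and hires are hypothetical.

```python
def quantize_internal(info, yuan_per_unit=10_000):
    """Turn internal reference factors into a numeric second feature vector."""
    return [
        info["net_profit_yuan"] / yuan_per_unit,     # 300,000 yuan -> 30.0
        float(info["website_updates_per_month"]),    # already numeric
        float(info["hires_last_year"]),              # already numeric
    ]

second_vector = quantize_internal(
    {"net_profit_yuan": 300_000, "website_updates_per_month": 4, "hires_last_year": 25}
)
```

  • Changing `yuan_per_unit` corresponds to the other embodiments in which a different preset conversion ratio is used.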
  • The external information includes the upstream and downstream relationships of enterprise entity A, for example its suppliers and customers, and whether the enterprise has defaulted on or owes money to other entities in those relationships and, if so, the number of defaults and the period of the arrears.
  • The external information of enterprise entity A also includes rating agencies' ratings of enterprise entity A (for example rating levels 3A, 2A, A, and BBB, where 2A and A are good and BBB is average), positive or negative news media reports about enterprise entity A, and so on.
  • Each reference factor in the external information of the enterprise is converted into a number for quantization.
  • For example, the number of defaults can be quantized into three values (no default: 0, mild default: 1, severe default: 2); arrears can be quantized into two values (no arrears: 0, arrears: 1); and ratings can be quantized into multiple values (3A: 6, 2A: 5, A: 4, BBB: 3, BB: 2, B: 1).
  • For enterprise entity A, the quantized external information is: number of defaults 1, arrears 1, rating 3. The third feature vector of enterprise entity A is generated from this quantized information.
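  • The mapping above can be written down directly. In the sketch below the rating table comes from the text, while the boundary between mild and severe default (three or more defaults counted as severe) is an assumption, since the application does not state it.

```python
# Rating levels and their quantized values, as listed in the description.
RATING_VALUES = {"3A": 6, "2A": 5, "A": 4, "BBB": 3, "BB": 2, "B": 1}

def quantize_defaults(count, severe_threshold=3):
    """0 = no default, 1 = mild, 2 = severe; the threshold is an assumed value."""
    if count == 0:
        return 0
    return 2 if count >= severe_threshold else 1

def quantize_external(default_count, amount_owed, rating):
    """Build the third feature vector: [defaults, arrears, rating]."""
    return [
        quantize_defaults(default_count),
        1 if amount_owed > 0 else 0,
        RATING_VALUES[rating],
    ]

third_vector = quantize_external(default_count=2, amount_owed=300_000, rating="BBB")
```

  • For enterprise entity A (two defaults, 300,000 owed, rating BBB) this reproduces the values 1, 1, and 3 given in the example.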
  • Next, the risk of investing in enterprise entity A can be assessed.
  • The name of enterprise entity A and its first feature vector, second feature vector, and third feature vector are input into the predetermined risk assessment model for risk assessment, and the risk assessment result is output.
  • The training of the predetermined risk assessment model includes: acquiring the first feature vector, the second feature vector, and the third feature vector of a plurality of enterprise entities by using the foregoing steps A1-A5. The specific implementation is consistent with the foregoing steps and is not repeated here.
  • The first feature vector, the second feature vector, the third feature vector, and the corresponding risk label of each enterprise entity are used as sample data. From the sample data, the feature vectors and risk labels of a first proportion (for example, 60%) of the enterprise entities are extracted as a training set, and the feature vectors and risk labels of a second proportion (for example, 50%) of the enterprise entities are randomly extracted from the remaining sample data as a verification set; that is, 20% of the sample data is extracted as the verification set. A support vector machine is trained using the training set, and the model parameters of the risk assessment model are determined.
  • If the model output is 0, investing in enterprise entity A is substantially risk-free; if the model output is 1, investing in enterprise entity A carries a greater risk.
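  • The sample split can be sketched as follows, using only the standard library. Concatenating the three feature vectors into one model input, shuffling before the split, and the fixed random seed are assumptions of this example; the support vector machine itself is not implemented here (a library classifier such as scikit-learn's `SVC` would be one possible choice).

```python
import random

def to_model_input(first, second, third):
    """Concatenate the three feature vectors into a single input (an assumed encoding)."""
    return list(first) + list(second) + list(third)

def split_samples(samples, first_ratio=0.6, second_ratio=0.5, seed=0):
    """Take `first_ratio` of the samples for training, then `second_ratio` of the rest for verification."""
    shuffled = samples[:]
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * first_ratio)
    training = shuffled[:n_train]
    remaining = shuffled[n_train:]
    n_valid = int(len(remaining) * second_ratio)
    return training, remaining[:n_valid]

# Ten hypothetical labeled samples: (feature vector, risk label 0 or 1).
samples = [(to_model_input([0.1 * i], [30.0], [1, 1, 3]), i % 2) for i in range(10)]
training_set, verification_set = split_samples(samples)
```

  • With ten samples, 60% gives six training samples, and 50% of the remaining four gives two verification samples, that is, 20% of the sample data.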
  • The electronic device 1 proposed by the foregoing embodiment obtains the first feature vector, the second feature vector, and the third feature vector of the enterprise entity by analyzing the relationships between the enterprise entity and its associated entities, the internal information of the enterprise entity, and its external information.
  • The risk assessment model and the three feature vectors are then used to perform risk assessment on the enterprise entity to be invested in, so that the investor can capture market investment opportunities.
  • The enterprise investment risk assessment program 10 may also be divided into one or more modules, which are stored in the memory 11 and executed by one or more processors (the processor 12 in this embodiment) to implement the present application. A module here refers to a series of computer program instructions capable of performing a particular function.
  • Referring to FIG. 4, which is a schematic diagram of the modules of the enterprise investment risk assessment program 10 in FIG. 1, the program may be divided into an extraction module 110, a construction module 120, a first calculation module 130, a second calculation module 140, a third calculation module 150, and an evaluation module 160. The functions or operation steps implemented by the modules 110-160 are similar to those described above and are not described in detail here. For example:
  • The extraction module 110 is configured to crawl the news corpus related to the enterprise entity whose risk is to be assessed, preprocess the news corpus, and extract from the preprocessed news corpus the other entities associated with the enterprise entity;
  • The construction module 120 is configured to construct a relationship network between the enterprise entity and the other entities, using names as nodes and the relationships between the enterprise entity and the other entities as edges;
  • The first calculation module 130 is configured to calculate a vector representation of the enterprise entity according to the relationship network, and generate a first feature vector of the enterprise entity;
  • The second calculation module 140 is configured to quantize the internal information of the enterprise entity according to the first preset rule to generate a second feature vector of the enterprise entity;
  • The third calculation module 150 is configured to extract external information of the enterprise entity from the news corpus, and quantize it according to the second preset rule to generate a third feature vector of the enterprise entity;
  • The evaluation module 160 is configured to input the first feature vector, the second feature vector, and the third feature vector into a predetermined enterprise risk assessment model, and output a risk label corresponding to the enterprise entity.
  • The present application also provides an enterprise investment risk assessment method.
  • FIG. 5 is a flowchart of a preferred embodiment of the enterprise investment risk assessment method of the present application. The method can be performed by a device, and the device can be implemented by software and/or hardware.
  • The enterprise investment risk assessment method includes:
  • S1. Crawl the news corpus related to the enterprise entity whose risk is to be assessed, preprocess the news corpus, and extract from the preprocessed news corpus the other entities associated with the enterprise entity;
  • A corpus may cover a plurality of different fields.
  • This embodiment uses a news corpus as an example to describe a specific solution of the present application, but the application is not limited to the field of news.
  • When the investor needs to learn from current news the internal data and external data associated with the investment target enterprise, a web crawler is used to crawl online news from the Internet, for example online news from Sina, Baidu, Tencent, and so on. Understandably, each enterprise operates differently in different time periods. Therefore, to give investors a more accurate understanding of the investment target enterprise, the crawled online news is filtered in the time dimension: a preset time interval is set and only online news from that period is crawled, for example only online news from the most recent half year.
  • Because the sources of the news corpus are diverse, the corpus contains many types of content.
  • Therefore, the news corpus needs to be preprocessed to obtain news corpus text data, which forms a news corpus text set.
  • The preprocessing may include one or more of: unifying the format of the news corpus into a text format, removing advertising noise from the news corpus, and filtering out dirty words and sensitive words.
  • When the format of the news corpus is unified into a text format, content that cannot be converted into text by current technology can be filtered out.
  • All enterprise names are extracted from the preprocessed news corpus; then, according to the associated enterprise data of the enterprise entity whose risk is to be assessed (i.e., the investment target enterprise), the other entities associated with that enterprise entity are filtered out, and the enterprise entity and the other entities are built into a relationship network.
  • The associated enterprise data can be obtained from third-party data. It can be understood that many other entities associated with the enterprise entity may be extracted from the news corpus, and building all of them into the relationship network would be unreasonable. Therefore, before the relationship network is constructed, the extracted entities associated with the enterprise entity are filtered and screened.
  • The other entities associated with the enterprise entity that are retained by the filtering and screening step include: shareholder companies of the enterprise entity, other entities that have monetary dealings with the enterprise entity, suppliers, customers, credit rating agencies, and so on.
  • For example, B1 is the rating agency that gives enterprise entity A a credit rating; from the historical rating data it can be learned that B1 gave enterprise entity A a credit rating of BBB.
  • B2 is a supplier that supplies enterprise entity A with raw materials or goods, and the amount owed is 300,000.
  • B3 is a customer of enterprise entity A, and enterprise entity A has defaulted to B3 twice.
  • Enterprise entity A, B1, B2, and B3 are used as nodes, and the relationships between A and each of B1, B2, and B3 are used as edges, to construct the network diagram of the relationships between the enterprise entity and other entities shown in FIG. 2.
  • Based on the relationship network, the vector representation of enterprise entity A is calculated.
  • This embodiment adopts the Skip-Gram method, because the vector representation of enterprise entity A in the relationship network is related to the vector representations of the entities B1, B2, and B3 associated with it.
  • The Skip-Gram method uses the current enterprise entity to predict the surrounding entities, as shown in FIG. 3. An1, An2, An3, and An4 in FIG. 3 are unordered and represent the neighboring entities of enterprise entity A.
  • A fixed prediction length L is set in order to predict L neighboring entities around enterprise entity A. If there are fewer than L real neighboring entities, the output is NULL.
  • In this way the vector representation embedding(E1), embedding(E2), ... of enterprise entity A can be obtained, and this vector representation is used as the first feature vector of enterprise entity A.
  • Each reference factor in the internal information of the enterprise is converted into a number so that it can be quantified.
  • For example, a value in the financial information can be converted into a feature value: if the net profit is 300,000 yuan, 30 is the corresponding feature value. The frequency of website updates and the number of people hired in the most recent year are also numerical values and can likewise be converted according to preset conversion rules. In other embodiments, 300,000 yuan can also be converted into other values according to a preset conversion ratio. After each reference factor in the internal information of enterprise entity A is quantized, the second feature vector of enterprise entity A is generated.
  • The external information includes the upstream and downstream relationships of enterprise entity A, for example its suppliers and customers, and whether the enterprise has defaulted on or owes money to other entities in those relationships and, if so, the number of defaults and the period of the arrears.
  • The external information of enterprise entity A also includes rating agencies' ratings of enterprise entity A (for example rating levels 3A, 2A, A, and BBB, where 2A and A are good and BBB is average), positive or negative news media reports about enterprise entity A, and so on.
  • Each reference factor in the external information of the enterprise is converted into a number for quantization.
  • For example, the number of defaults can be quantized into three values (no default: 0, mild default: 1, severe default: 2); arrears can be quantized into two values (no arrears: 0, arrears: 1); and ratings can be quantized into multiple values (3A: 6, 2A: 5, A: 4, BBB: 3, BB: 2, B: 1).
  • For enterprise entity A, the quantized external information is: number of defaults 1, arrears 1, rating 3. The third feature vector of enterprise entity A is generated from this quantized information.
  • Next, the risk of investing in enterprise entity A can be assessed.
  • The name of enterprise entity A and its first feature vector, second feature vector, and third feature vector are input into the predetermined risk assessment model for risk assessment, and the risk assessment result is output.
  • The training of the predetermined risk assessment model includes: acquiring the first feature vector, the second feature vector, and the third feature vector of a plurality of enterprise entities by using the foregoing steps S1-S5. The specific implementation is consistent with the foregoing steps and is not repeated here.
  • The first feature vector, the second feature vector, the third feature vector, and the corresponding risk label of each enterprise entity are used as sample data. From the sample data, the feature vectors and risk labels of a first proportion (for example, 60%) of the enterprise entities are extracted as a training set, and the feature vectors and risk labels of a second proportion (for example, 50%) of the enterprise entities are randomly extracted from the remaining sample data as a verification set; that is, 20% of the sample data is extracted as the verification set. A support vector machine is trained using the training set, and the model parameters of the risk assessment model are determined.
  • If the model output is 0, investing in enterprise entity A is substantially risk-free; if the model output is 1, investing in enterprise entity A carries a greater risk.
  • The enterprise investment risk assessment method proposed by the above embodiment obtains the first feature vector, the second feature vector, and the third feature vector of the enterprise entity by analyzing the relationships between the enterprise entity and its associated entities, the internal information of the enterprise entity, and its external information.
  • The risk assessment model and the three feature vectors are then used to perform risk assessment on the enterprise entity to be invested in, so that the investor can capture market investment opportunities.
  • An embodiment of the present application further provides a computer-readable storage medium on which the enterprise investment risk assessment program is stored. When executed by a processor, the enterprise investment risk assessment program implements the following steps:
  • A1. Crawl the news corpus related to the enterprise entity whose risk is to be assessed, preprocess the news corpus, and extract from the preprocessed news corpus the other entities associated with the enterprise entity;
  • A2. Construct a relationship network between the enterprise entity and the other entities, using names as nodes and the relationships between the enterprise entity and the other entities as edges;
  • A3. Calculate a vector representation of the enterprise entity according to the relationship network, and generate a first feature vector of the enterprise entity;
  • A4. Quantize the internal information of the enterprise entity according to a first preset rule to generate a second feature vector of the enterprise entity;
  • A5. Extract external information of the enterprise entity from the news corpus, and quantize it according to a second preset rule to generate a third feature vector of the enterprise entity;
  • A6. Input the first feature vector, the second feature vector, and the third feature vector into a predetermined enterprise risk assessment model, and output a risk label corresponding to the enterprise entity.
  • the part of the technical solution of the present application that is essential or that contributes over the prior art may be embodied in the form of a software product stored in a storage medium (such as the ROM/RAM, magnetic disk, or optical disk described above) and including a number of instructions for causing a terminal device (which may be a mobile phone, a computer, a server, a network device, etc.) to perform the methods described in the various embodiments of the present application.

Landscapes

  • Business, Economics & Management (AREA)
  • Human Resources & Organizations (AREA)
  • Engineering & Computer Science (AREA)
  • Strategic Management (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Economics (AREA)
  • Operations Research (AREA)
  • Game Theory and Decision Science (AREA)
  • Development Economics (AREA)
  • Marketing (AREA)
  • Educational Administration (AREA)
  • Quality & Reliability (AREA)
  • Tourism & Hospitality (AREA)
  • Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Financial Or Insurance-Related Operations Such As Payment And Settlement (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The present application provides an enterprise investment risk assessment method. The method comprises: crawling a news corpus associated with an investment target enterprise entity, and extracting other entities associated with the enterprise entity; building a relationship network with names as nodes and the association relationships between the enterprise entity and the other entities as edges; computing the vector representation of the enterprise entity, and generating a first feature vector of the enterprise entity; quantifying internal information of the enterprise entity according to a first preset rule, and generating a second feature vector; quantifying external information of the enterprise entity according to a second preset rule, and generating a third feature vector; and inputting the first feature vector, the second feature vector, and the third feature vector into an enterprise risk assessment model, and obtaining and outputting a risk label corresponding to the enterprise entity. The present application also provides an electronic device and a computer-readable storage medium. By analyzing the information revealed in news corpora, the present application allows the risk of investing in a target enterprise to be assessed.

Description

Enterprise investment risk assessment method, device, and storage medium
Priority claim
This application claims priority under the Paris Convention to Chinese patent application No. CN201711141730.3, entitled "Enterprise Investment Risk Assessment Method, Device, and Storage Medium" and filed on November 17, 2017, the entire content of which is incorporated herein by reference.
Technical field
The present application relates to the field of computer technology, and in particular to an enterprise investment risk assessment method, an electronic device, and a computer-readable storage medium.
Background
At present, the tools available on the market for examining investment targets are relatively simple. Most remain at the level of traditional financial analysis and statement analysis, and lack quantified consideration of the links to upstream and downstream parties, related parties, market hotspots, and policy clues.
With the spread of the Internet, every news site publishes thousands of news items each day, and the news is updated in real time. Big data associated with an investment target enterprise could be extracted from this massive news corpus: the internal situation of the enterprise (operations, finance, executives, recruitment, website update frequency, and so on) and its external situation (the status of related companies such as upstream and downstream parties and customers, rating agencies' ratings of the enterprise, related news media reports, and so on). This information could then be formed into a relationship network and used to analyze and assess the risk coefficient of the investment target enterprise, so that the investor can consider whether the risk is acceptable and decide whether to invest in the enterprise. How to extract information associated with an investment target enterprise from a news corpus and use it for risk assessment is therefore a problem in urgent need of a solution.
Summary of the invention
The present application provides an enterprise investment risk assessment method, an electronic device, and a computer-readable storage medium, whose main purpose is to assess the risk of an investment target enterprise by analyzing the information revealed in a news corpus.
To achieve the above object, the present application provides an electronic device comprising a memory and a processor, the memory storing an enterprise investment risk assessment program executable on the processor, the program implementing the following steps when executed by the processor:
A1. Crawl a news corpus related to the enterprise entity whose risk is to be assessed, preprocess the news corpus, and extract from the preprocessed news corpus the other entities associated with the enterprise entity;
A2. Construct a relationship network between the enterprise entity and the other entities, with names as nodes and the association relationships between the enterprise entity and the other entities as edges;
A3. Calculate a vector representation of the enterprise entity according to the relationship network, and generate a first feature vector of the enterprise entity;
A4. Quantify the internal information of the enterprise entity according to a first preset rule, and generate a second feature vector;
A5. Extract external information of the enterprise entity from the news corpus, quantify the external information according to a second preset rule, and generate a third feature vector of the enterprise entity; and
A6. Input the first feature vector, the second feature vector, and the third feature vector into a predetermined enterprise risk assessment model, and output the risk label corresponding to the enterprise entity.
In addition, to achieve the above object, the present application further provides an enterprise investment risk assessment method, the method comprising:
S1. Crawl a news corpus related to the enterprise entity whose risk is to be assessed, preprocess the news corpus, and extract from the preprocessed news corpus the other entities associated with the enterprise entity;
S2. Construct a relationship network between the enterprise entity and the other entities, with names as nodes and the association relationships between the enterprise entity and the other entities as edges;
S3. Calculate a vector representation of the enterprise entity according to the relationship network, and generate a first feature vector of the enterprise entity;
S4. Quantify the internal information of the enterprise entity according to a first preset rule, and generate a second feature vector;
S5. Extract external information of the enterprise entity from the news corpus, quantify the external information according to a second preset rule, and generate a third feature vector of the enterprise entity; and
S6. Input the first feature vector, the second feature vector, and the third feature vector into a predetermined enterprise risk assessment model, and output the risk label corresponding to the enterprise entity.
In addition, to achieve the above object, the present application further provides a computer-readable storage medium storing an enterprise investment risk assessment program which, when executed by a processor, implements any step of the enterprise investment risk assessment method described above.
In addition, to achieve the above object, the present application further provides an enterprise investment risk assessment program, the program comprising an extraction module, a construction module, a first calculation module, a second calculation module, a third calculation module, and an evaluation module; when executed by a processor, the program implements any step of the enterprise investment risk assessment method described above.
The enterprise investment risk assessment method, electronic device, and computer-readable storage medium proposed by the present application learn, from a news corpus, the relationship between an enterprise entity and its associated entities, the internal information of the enterprise entity, and its external information, and from these obtain the first feature vector, the second feature vector, and the third feature vector of the enterprise entity, respectively. The risk assessment model and the three feature vectors are then used to assess the risk of investing in the enterprise entity, helping the investor capture market investment opportunities and predict investment risk in advance.
Brief description of the drawings
FIG. 1 is a schematic diagram of the application environment of a preferred embodiment of the enterprise investment risk assessment method of the present application;
FIG. 2 is a diagram of the relationship network between enterprise entity A and its associated other entities;
FIG. 3 shows the vector representation of enterprise entity A;
FIG. 4 is a schematic diagram of the modules of the enterprise investment risk assessment program in FIG. 1;
FIG. 5 is a flowchart of a preferred embodiment of the enterprise investment risk assessment method of the present application.
The realization of the object, the functional features, and the advantages of the present application will be further described with reference to the embodiments and the accompanying drawings.
Detailed description
It should be understood that the specific embodiments described herein are intended only to explain the present application and not to limit it.
The present application provides an enterprise investment risk assessment method applied to an electronic device 1. FIG. 1 is a schematic diagram of the application environment of a preferred embodiment of the method. In this embodiment, the electronic device 1 may be a PC (Personal Computer) or a terminal device such as a smartphone, a tablet computer, an e-book reader, or a portable computer. The electronic device 1 includes a memory 11, a processor 12, a communication bus 13, and a network interface 14.
The memory 11 includes at least one type of readable storage medium, such as flash memory, a hard disk, a multimedia card, card-type memory (for example, SD or DX memory), magnetic memory, a magnetic disk, or an optical disk. In some embodiments the memory 11 may be an internal storage unit of the electronic device 1, such as its hard disk. In other embodiments the memory 11 may be an external storage device of the electronic device 1, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a flash card fitted to the electronic device 1. Further, the memory 11 may include both an internal storage unit and an external storage device of the electronic device 1. The memory 11 can be used not only to store the application software installed on the electronic device 1 and various kinds of data, such as the enterprise investment risk assessment program 10, but also to temporarily store data that has been output or is about to be output.
In some embodiments the processor 12 may be a Central Processing Unit (CPU), a controller, a microcontroller, a microprocessor, or another data processing chip, and is used to run the program code or process the data stored in the memory 11, for example the enterprise investment risk assessment program 10.
The communication bus 13 is used to implement connection and communication between these components.
The network interface 14 may optionally include a standard wired interface and a wireless interface (such as a WI-FI interface), and is typically used to establish a communication connection between the device and other electronic devices.
FIG. 1 shows only the electronic device 1 with components 11-14, but it should be understood that not all of the illustrated components are required, and more or fewer components may be implemented instead.
Optionally, the electronic device 1 may further include a user interface. The user interface may include a display and an input unit such as a keyboard, and may optionally also include a standard wired interface and a wireless interface.
Optionally, in some embodiments, the display may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (Organic Light-Emitting Diode) touch device, or the like. The display, which may also be called a display screen or display unit, is used to display the information processed in the electronic device 1 and to display a visual user interface.
In the device embodiment shown in FIG. 1, the enterprise investment risk assessment program 10 is stored in the memory 11; when the processor 12 executes the enterprise investment risk assessment program 10 stored in the memory 11, the following steps are implemented:
A1. Crawl a news corpus related to the enterprise entity whose risk is to be assessed, preprocess the news corpus, and extract from the preprocessed news corpus the other entities associated with the enterprise entity;
A2. Construct a relationship network between the enterprise entity and the other entities, with names as nodes and the association relationships between the enterprise entity and the other entities as edges;
A3. Calculate a vector representation of the enterprise entity according to the relationship network, and generate a first feature vector of the enterprise entity;
A4. Quantify the internal information of the enterprise entity according to a first preset rule, and generate a second feature vector;
A5. Extract external information of the enterprise entity from the news corpus, quantify the external information according to a second preset rule, and generate a third feature vector of the enterprise entity; and
A6. Input the first feature vector, the second feature vector, and the third feature vector into a predetermined enterprise risk assessment model, and output the risk label corresponding to the enterprise entity.
Corpora span many different fields. This embodiment uses a news corpus as an example to describe the specific solution of the present application, but the solution is not limited to the news field. When an investor needs to follow current news in order to obtain the internal and external data associated with an investment target enterprise, a web crawler is used to crawl online news from the Internet, for example news from Sina, Baidu, and Tencent. Since every enterprise operates differently in different periods, the crawled news is filtered along the time dimension so that the investor gets a more accurate picture of the target enterprise: a preset time interval is set and only news from that interval is crawled, for example only news from the last half year. Because the news corpus comes from diverse sources, it contains many format types. To facilitate subsequent processing, the news corpus is preprocessed to obtain news text data, which forms a news corpus text set.
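The time-dimension filter described above can be sketched as follows. This is a minimal illustration, assuming each crawled article is a dictionary with a `published` timestamp; the field names, the sample articles, and the 182-day window standing in for "the last half year" are all hypothetical.

```python
from datetime import datetime, timedelta

def filter_recent(articles, now, window_days=182):
    """Keep only articles published within the preset time interval."""
    cutoff = now - timedelta(days=window_days)
    return [a for a in articles if cutoff <= a["published"] <= now]

# Hypothetical crawled articles about enterprise entity A.
articles = [
    {"title": "A signs supply deal with B2", "published": datetime(2018, 1, 10)},
    {"title": "A annual report", "published": datetime(2017, 3, 1)},
]

recent = filter_recent(articles, now=datetime(2018, 2, 1))
print([a["title"] for a in recent])
```

Only the first article falls inside the window, so it is the only one passed on to preprocessing.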
In other embodiments, the preprocessing may include one or more of the following: unifying the format of the news corpus into text format, removing advertising noise from the news corpus, and filtering out dirty words and sensitive words. When the format of the news corpus is unified into text format, content that current technology cannot yet convert to text may be filtered out.
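A rough sketch of these preprocessing operations is given below, assuming the raw news item is an HTML fragment. The advertisement markers and the blocked-word list are hypothetical placeholders; the application does not specify how advertising noise or sensitive words are detected.

```python
import html
import re

AD_MARKERS = ("[Advertisement]", "Sponsored:")  # hypothetical ad-line prefixes
BLOCKED_WORDS = {"damn"}                         # hypothetical dirty/sensitive words

def preprocess(raw):
    """Unify to plain text, drop ad lines, and filter blocked words."""
    text = re.sub(r"<[^>]+>", " ", raw)          # strip markup tags
    text = html.unescape(text)                   # decode entities such as &amp;
    lines = [ln.strip() for ln in text.splitlines()]
    lines = [ln for ln in lines if ln and not ln.startswith(AD_MARKERS)]
    text = " ".join(lines)
    for word in BLOCKED_WORDS:
        text = re.sub(rf"\b{re.escape(word)}\b", "", text, flags=re.IGNORECASE)
    return re.sub(r"\s+", " ", text).strip()     # collapse leftover whitespace

raw_item = (
    "<p>Company A posts damn strong results</p>\n"
    "Sponsored: buy now\n"
    "<p>B2 is a supplier &amp; partner</p>"
)
clean_text = preprocess(raw_item)
print(clean_text)
```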
Next, using the word segmentation method described above and a predetermined library of enterprise names, all enterprise names are extracted from the preprocessed news corpus. The other entities associated with the enterprise entity whose risk is to be assessed (that is, the investment target enterprise) are then selected according to that enterprise's associated-enterprise data, and the enterprise entity and the other entities are built into a relationship network. The associated-enterprise data can be obtained from third-party data. The number of other entities extracted from the news corpus may be large, and it would be unreasonable to place all of them in the relationship network. Therefore, before the relationship network is built, the extracted entities are filtered. Specifically, the other entities retained after filtering include: shareholder companies of the enterprise entity, other entities that have financial dealings with the enterprise entity, suppliers, customers, credit rating agencies, and so on.
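The extraction and filtering step can be illustrated with the sketch below. Plain substring matching against the name library is used here as a stand-in for the word segmentation method, which this passage does not detail; the name library, the associated-enterprise data, and the sample text are hypothetical.

```python
# Hypothetical enterprise name library and third-party associated-enterprise data.
NAME_LIBRARY = {"Enterprise A", "Agency B1", "Supplier B2", "Customer B3", "Retailer C9"}
ASSOCIATED = {
    "Agency B1": "credit rating agency",
    "Supplier B2": "supplier",
    "Customer B3": "customer",
}

def extract_names(text, name_library):
    """Return every library name that occurs in the text (substring match)."""
    return {name for name in name_library if name in text}

def filter_associated(names, target, associated):
    """Keep only entities that the associated-enterprise data links to the target."""
    return sorted(n for n in names if n != target and n in associated)

text = ("Enterprise A renewed its contract with Supplier B2. "
        "Retailer C9 opened a new store. Agency B1 reviewed Enterprise A.")
found = extract_names(text, NAME_LIBRARY)
kept = filter_associated(found, "Enterprise A", ASSOCIATED)
print(kept)
```

Retailer C9 appears in the news but has no recorded association with the target, so it is dropped before the network is built.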
In this embodiment, take enterprise entity A as an example. After the other entities associated with enterprise entity A that were extracted from the news corpus have been filtered, suppose the retained entities are B1, B2, and B3. B1 is a rating agency that gives enterprise entity A a credit rating; historical rating data shows that B1 rated enterprise entity A as BBB. B2 is a supplier that provides enterprise entity A with raw materials or goods, and enterprise entity A owes B2 300,000. B3 is a customer of enterprise entity A, and enterprise entity A has defaulted on B3 twice. With enterprise entities A, B1, B2, and B3 as nodes and the association relationships between B1, B2, B3 and A as edges, the relationship network between the enterprise entity and the other entities is constructed as shown in FIG. 2.
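The example network of FIG. 2 can be represented with a node set and an edge dictionary, as in the following sketch. The attribute names (`relation`, `rating`, `arrears`, `defaults`) are hypothetical; only the entities and the values come from the example above.

```python
def build_network(target, relations):
    """Nodes are entity names; each edge links the target to one associated entity."""
    nodes = {target, *relations}
    edges = {(target, other): attrs for other, attrs in relations.items()}
    return nodes, edges

relations = {
    "B1": {"relation": "rating agency", "rating": "BBB"},
    "B2": {"relation": "supplier", "arrears": 300_000},
    "B3": {"relation": "customer", "defaults": 2},
}

nodes, edges = build_network("A", relations)
for (src, dst), attrs in edges.items():
    print(src, "-", dst, attrs)
```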
Then, the vector representation of enterprise entity A is calculated from the relationship network. This embodiment uses the Skip-Gram method, because in the relationship network the vector representation of enterprise entity A is associated with the vector representations of its related entities B1, B2, and B3. To train vectors for enterprise entity names, the Skip-Gram method uses the current enterprise entity to predict its surrounding entities, as shown in FIG. 3. An1, An2, An3, and An4 in FIG. 3 are unordered and all denote neighboring entities of enterprise entity A. As when training word vectors with Skip-Gram, a fixed prediction length L is set in order to predict the L neighboring entities around enterprise entity A; if the actual number of neighboring entities is less than L, the output is NULL. This method yields the vector representation embedding(E1), embedding(E2), … of enterprise entity A, which is used as the first feature vector of enterprise entity A.
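The sketch below shows only how (entity, neighbor) training pairs might be generated with a fixed prediction length L; it does not train embeddings. Padding the missing positions with a `"NULL"` token is one reading of the passage above, and sorting and truncating the unordered neighbors is an assumption made here so that the output is deterministic.

```python
def skipgram_pairs(adjacency, L):
    """Build (center, context) pairs with exactly L context slots per entity."""
    pairs = []
    for entity, neighbors in adjacency.items():
        context = sorted(neighbors)[:L]                 # neighbors are unordered
        context += ["NULL"] * (L - len(context))        # pad when fewer than L
        pairs.extend((entity, c) for c in context)
    return pairs

# Relationship network of the running example: A linked to B1, B2, B3.
adjacency = {
    "A": {"B1", "B2", "B3"},
    "B1": {"A"},
    "B2": {"A"},
    "B3": {"A"},
}

pairs = skipgram_pairs(adjacency, L=4)
print(pairs[:4])
```

With L = 4, enterprise entity A has only three real neighbors, so its fourth slot is filled with NULL. The resulting pairs are what a Skip-Gram model would consume as training examples.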
To understand the risk of investing in enterprise entity A, it is necessary to understand its finances, operations, and so on, so the internal information of enterprise entity A must be considered. The internal information includes the operations, finances, recruitment, and website update frequency of enterprise entity A. Part of this information is numerical; for example, the financial information includes the net profit and stock returns of the enterprise for the previous year. Each reference factor in the internal information is converted into a number according to a rule. For example, values in the financial information can be converted into feature values: in this embodiment the net profit is 300,000 yuan, and 30 is taken as the corresponding feature value. The website update frequency and the number of people recruited in the last year are also numerical, and may likewise be converted into corresponding values according to preset conversion rules. In other embodiments, the 300,000 yuan may be converted into another value according to a preset conversion ratio. After each reference factor in the internal information of enterprise entity A has been quantified, the second feature vector of enterprise entity A is generated.
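A minimal sketch of this quantification follows. The net profit of 300,000 yuan mapping to 30 comes from the example above (a unit of 10,000 yuan); the other two factors, their values, and the field names are hypothetical and are simply passed through as numbers.

```python
def quantify_internal(info, profit_unit=10_000):
    """Convert internal reference factors into a numeric feature vector."""
    return [
        info["net_profit_yuan"] / profit_unit,   # 300,000 yuan -> 30
        float(info["site_updates_per_month"]),   # already numeric
        float(info["hires_last_year"]),          # already numeric
    ]

# Hypothetical internal information for enterprise entity A.
internal = {
    "net_profit_yuan": 300_000,
    "site_updates_per_month": 4,
    "hires_last_year": 25,
}

second_vector = quantify_internal(internal)
print(second_vector)
```

Changing `profit_unit` corresponds to using a different preset conversion ratio.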
How well enterprise entity A operates depends not only on its own factors but also on external ones, so the external information of enterprise entity A must also be considered. The external information includes the upstream and downstream relationships of enterprise entity A, such as suppliers and customers, and whether the enterprise has defaulted on or owes money to those entities and, if so, the number of defaults and the arrears period. The external information also includes the rating agency's rating of enterprise entity A (rating levels 3A and 2A indicate excellent, A indicates good, BBB indicates average, and so on) and positive or negative news media reports about enterprise entity A. Each reference factor in the external information is then converted into a number according to a rule. For example, in this embodiment, defaults can be quantified into three values: no default is 0, minor default is 1, and severe default is 2. Arrears can be quantified into two values: no arrears is 0 and arrears is 1. The rating can be quantified into several values: rating level 3A is 6, 2A is 5, A is 4, BBB is 3, BB is 2, and B is 1.
Based on the specific situation of enterprise entity A, its external information is quantified as default = 1, arrears = 1, and rating = 3, and the third feature vector of enterprise entity A is generated from the quantified information.
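The quantification rules above can be sketched as a small lookup. The rating scores and the 0/1/2 default levels come from the text; the threshold separating minor from severe default (`severe_from`) is an assumption, chosen so that the two defaults in the running example map to level 1.

```python
RATING_SCORES = {"3A": 6, "2A": 5, "A": 4, "BBB": 3, "BB": 2, "B": 1}

def quantify_external(default_count, arrears_amount, rating, severe_from=3):
    """Map external factors to [default level, arrears flag, rating score]."""
    if default_count == 0:
        default_level = 0          # no default
    elif default_count < severe_from:
        default_level = 1          # minor default (threshold is hypothetical)
    else:
        default_level = 2          # severe default
    arrears_flag = 1 if arrears_amount > 0 else 0
    return [default_level, arrears_flag, RATING_SCORES[rating]]

# Enterprise entity A: two defaults on B3, owes B2 300,000, rated BBB by B1.
third_vector = quantify_external(default_count=2, arrears_amount=300_000, rating="BBB")
print(third_vector)
```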
At this point, with the other entities associated with enterprise entity A and the internal and external information of enterprise entity A known, the risk of investing in enterprise entity A can be assessed. The name of enterprise entity A and its first, second, and third feature vectors are input into the predetermined risk assessment model, and the risk assessment result is output. The training steps of the predetermined risk assessment model are as follows. Using steps A1-A5 above, the first, second, and third feature vectors of a large number of enterprise entities are obtained; the specific implementation is the same as in those steps and is not repeated here. Each enterprise entity is then given a risk label: 0 for a "risk-free" enterprise entity and 1 for a "high-risk" enterprise entity. The first feature vector, second feature vector, third feature vector, and corresponding risk label of each enterprise entity are used as sample data.
A first proportion (for example, 60%) of the enterprise entities is randomly drawn from the sample data, and their feature vectors and corresponding risk labels form the training set. A second proportion (for example, 50%) of the enterprise entities is randomly drawn from the remaining samples, and their feature vectors and corresponding risk labels form the validation set; in other words, 20% of the sample data is used as the validation set. A support vector machine is trained on the training set to determine the model parameters of the risk assessment model, that is, the relationship between the associated entities, internal information, and external information of an enterprise entity and the risk of investing in it. The accuracy of the risk assessment model is then verified with the 20% validation data. If the accuracy is greater than or equal to a preset accuracy (for example, 90%), training ends; if the accuracy is less than the preset accuracy (for example, 90%), the number of samples is increased and the training steps are executed again.
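The sampling arithmetic and the accuracy check can be sketched with the standard library alone. Fitting the support vector machine itself is omitted here; the integer sample IDs, the fixed random seed, and the toy predictions are hypothetical.

```python
import random

def split_samples(samples, train_pct=60, val_pct_of_rest=50, seed=0):
    """Draw the training set, then the validation set from what remains."""
    rng = random.Random(seed)
    shuffled = samples[:]
    rng.shuffle(shuffled)
    n_train = len(shuffled) * train_pct // 100
    rest = shuffled[n_train:]
    n_val = len(rest) * val_pct_of_rest // 100
    return shuffled[:n_train], rest[:n_val]

def accuracy(predicted, actual):
    """Fraction of validation labels predicted correctly."""
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)

samples = list(range(100))                 # 100 hypothetical labeled enterprises
train, val = split_samples(samples)
print(len(train), len(val))                # 60% and 50% of the remaining 40%

# Decide whether training can stop, given a preset accuracy of 90%.
acc = accuracy([0, 1, 1, 0], [0, 1, 0, 0])
needs_more_samples = acc < 0.9
```

With 100 samples this yields 60 training samples and 20 validation samples, matching the 20% figure in the text.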
After the first, second, and third feature vectors of enterprise entity A are input into the risk assessment model, a model output of 0 indicates that investing in enterprise entity A is essentially risk-free, and a model output of 1 indicates that investing in enterprise entity A carries relatively high risk.
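One way to feed the three vectors to a single classifier is to concatenate them, as sketched below. Concatenation is an assumption, since the text only says that the three vectors are input into the model, and the embedding values shown are hypothetical.

```python
RISK_MESSAGES = {
    0: "essentially risk-free",
    1: "relatively high risk",
}

def assemble_features(first, second, third):
    """Concatenate the three feature vectors into one model input."""
    return [*first, *second, *third]

def interpret(label):
    """Translate the model's 0/1 risk label into its meaning."""
    return RISK_MESSAGES[label]

model_input = assemble_features([0.12, -0.40], [30.0, 4.0, 25.0], [1, 1, 3])
print(model_input)
print(interpret(1))
```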
The electronic device 1 proposed in the above embodiment obtains the first, second, and third feature vectors of the enterprise entity from, respectively, the relationship between the enterprise entity and its associated entities, the internal information of the enterprise entity, and its external information. It then uses the risk assessment model together with the three feature vectors to assess the risk of investing in the enterprise entity, helping the investor capture market investment opportunities.
可选地,在其他的实施例中,企业投资风险评估程序10还可以被分割为一个或者多个模块,一个或者多个模块被存储于存储器11中,并由一个或多个处理器(本实施例为处理器12)所执行,以完成本申请,本申请所称的模块是指能够完成特定功能的一系列计算机程序指令段。例如,参照图4所示,为图1中企业投资风险评估程序10的模块示意图,在本实施例中,该程序可以被分割为提取模块110、构建模块120、第一计算模块130、第二计算模块140、第三计算模块150以及评估模块160,所述模块110-160所实现的功能或操作步骤均与上文类似,此处不再详述,示例性地,例如其中:Alternatively, in other embodiments, the enterprise investment risk assessment program 10 may also be divided into one or more modules, one or more modules being stored in the memory 11 and being processed by one or more processors (this Embodiments are executed by processor 12) to accomplish the present application, and a module referred to herein refers to a series of computer program instructions that are capable of performing a particular function. For example, as shown in FIG. 4, it is a schematic diagram of a module of the enterprise investment risk assessment program 10 in FIG. 1. In this embodiment, the program may be divided into an extraction module 110, a construction module 120, a first calculation module 130, and a second. The calculation module 140, the third calculation module 150, and the evaluation module 160, the functions or operation steps implemented by the modules 110-160 are similar to the above, and are not described in detail herein, for example, where:
提取模块110,用于爬取待评估风险的企业实体相关的新闻语料,对新闻语料进行预处理,从经过预处理后的新闻语料中提取与该企业实体相关联的其他实体;The extracting module 110 is configured to crawl the news corpus related to the enterprise entity to be evaluated for risk, pre-process the news corpus, and extract other entities associated with the business entity from the pre-processed news corpus;
构建模块120,用于以名称为节点、该企业实体与其他实体之间的关联关系为边,构建该企业实体与其他实体之间的关系网络;The building module 120 is configured to construct a relationship network between the enterprise entity and other entities by using the name as a node, the relationship between the enterprise entity and other entities as an edge;
第一计算模块130,用于根据关系网络计算该企业实体的向量表示,生成该企业实体的第一特征向量;The first calculating module 130 is configured to calculate a vector representation of the enterprise entity according to the relationship network, and generate a first feature vector of the enterprise entity;
第二计算模块140,用于根据第一预设规则,对该企业实体的内部信息进行量化,生成第二特征向量;The second calculating module 140 is configured to quantize the internal information of the enterprise entity according to the first preset rule to generate a second feature vector;
第三计算模块150,用于从新闻语料中提取该企业实体的外部信息,根据 第二预设规则,对该企业实体的外部信息进行量化,生成该企业实体的第三特征向量;及The third calculating module 150 is configured to extract external information of the enterprise entity from the news corpus, and quantize the external information of the enterprise entity according to the second preset rule to generate a third feature vector of the enterprise entity;
评估模块160,用于将所述第一特征向量、第二特征向量及第三特征向量输入预先确定的企业风险评估模型,输出得到该企业实体对应的风险标签。The evaluation module 160 is configured to input the first feature vector, the second feature vector, and the third feature vector into a predetermined enterprise risk assessment model, and output a risk tag corresponding to the enterprise entity.
In addition, the present application further provides an enterprise investment risk assessment method. FIG. 5 is a flowchart of a preferred embodiment of the enterprise investment risk assessment method of the present application. The method may be performed by a device, and the device may be implemented in software and/or hardware.
In this embodiment, the enterprise investment risk assessment method includes:
S1. Crawl a news corpus related to the enterprise entity whose risk is to be assessed, preprocess the news corpus, and extract from the preprocessed news corpus other entities associated with the enterprise entity;
S2. Construct a relationship network between the enterprise entity and the other entities, with names as nodes and the association relationships between the enterprise entity and the other entities as edges;
S3. Compute a vector representation of the enterprise entity from the relationship network to generate a first feature vector of the enterprise entity;
S4. Quantify internal information of the enterprise entity according to a first preset rule to generate a second feature vector;
S5. Extract external information of the enterprise entity from the news corpus, and quantify the external information according to a second preset rule to generate a third feature vector of the enterprise entity; and
S6. Input the first feature vector, the second feature vector, and the third feature vector into a predetermined enterprise risk assessment model, and output a risk label corresponding to the enterprise entity.
Corpora span many different fields. This embodiment uses a news corpus as an example to describe the specific solution of the present application, but the solution is not limited to the news field. When an investor needs to follow current news to obtain internal and external data related to an investment target enterprise, a web crawler is used to crawl online news from the Internet, for example, online news from Sina, Baidu, Tencent, and the like. It can be understood that an enterprise's operating conditions differ from one period to another. Therefore, so that the investor can understand the investment target enterprise more accurately, the crawled online news is filtered along the time dimension: a preset time interval is set, and only online news from that interval is crawled, for example, only online news from the last six months. Because the news corpus comes from diverse sources, it contains many format types. To facilitate subsequent processing, the news corpus is preprocessed to obtain news corpus text data, which forms a news corpus text set.
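The time-dimension filtering described above could be sketched as follows. This is only an illustration: the function name `filter_recent`, the `published` field, and the 182-day window standing in for "the last six months" are assumptions and do not come from the application.

```python
from datetime import datetime, timedelta

def filter_recent(articles, now, days=182):
    """Keep only articles published within the last `days` days before `now`."""
    cutoff = now - timedelta(days=days)
    return [a for a in articles if cutoff <= a["published"] <= now]

# Hypothetical crawled items; titles and dates are made up for illustration.
articles = [
    {"title": "A signs supply contract", "published": datetime(2018, 1, 10)},
    {"title": "A annual report", "published": datetime(2017, 3, 1)},
]

recent = filter_recent(articles, now=datetime(2018, 2, 1))
print([a["title"] for a in recent])  # ['A signs supply contract']
```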
In other embodiments, the preprocessing may include one or more of the following: unifying the format of the news corpus into a text format, removing advertising noise from the news corpus, and filtering out profane words and sensitive words. When the format of the news corpus is unified into a text format, content that current technology cannot yet convert into text may be filtered out.
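A minimal sketch of such preprocessing is given below, assuming the crawled items are HTML fragments. The advertising markers and the blocked-word list are placeholders invented for this example; the application does not specify how advertising noise or sensitive words are recognized.

```python
import html
import re

AD_MARKERS = ("advertisement", "sponsored")  # assumed markers of advertising lines
BLOCKED_WORDS = {"badword"}                  # assumed profane/sensitive word list

def preprocess(raw):
    """Convert an HTML fragment to plain text, drop ad lines, filter blocked words."""
    text = re.sub(r"<[^>]+>", " ", raw)      # strip tags to unify into text format
    text = html.unescape(text)               # decode entities such as &amp;
    kept = []
    for line in text.splitlines():
        line = " ".join(line.split())        # normalize whitespace
        if not line or any(m in line.lower() for m in AD_MARKERS):
            continue                         # skip empty lines and advertising noise
        words = [w for w in line.split(" ") if w.lower() not in BLOCKED_WORDS]
        kept.append(" ".join(words))
    return "\n".join(kept)

raw = (
    "<p>Company A &amp; B2 signed a deal</p>\n"
    "<div>Sponsored: buy now</div>\n"
    "<p>A badword profit rose</p>"
)
clean = preprocess(raw)
print(clean)
```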
Next, using the word segmentation method described above and a predetermined enterprise name library, all enterprise names are extracted from the preprocessed news corpus. Then, based on the associated-enterprise data of the enterprise entity whose risk is to be assessed (that is, the investment target enterprise), the other entities associated with that enterprise entity are selected, and the enterprise entity and the other entities are built into a relationship network. The associated-enterprise data may be obtained from third-party data. It can be understood that many other entities associated with the enterprise entity may be extracted from the news corpus, and it would be unreasonable to build all of them into the relationship network. Therefore, before the relationship network is constructed, the extracted entities are filtered. Specifically, the other entities retained after filtering include: shareholder companies of the enterprise entity, other entities that have monetary dealings with the enterprise entity, suppliers, customers, credit rating agencies, and the like.
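The extraction and filtering just described might look roughly like the sketch below. Plain substring matching against the name library stands in for the word segmentation step, and the entity names, the `associated` mapping, and the function name are all hypothetical.

```python
def extract_associated(texts, name_library, target, associated):
    """Return library names that appear in `texts` and are listed as associated
    with `target` in the (assumed third-party) associated-enterprise data."""
    mentioned = {name for text in texts for name in name_library if name in text}
    mentioned.discard(target)  # the target itself is not one of its "other entities"
    return sorted(name for name in mentioned if name in associated)

# Hypothetical inputs for illustration only.
texts = [
    "Entity B1 rated Entity A as BBB.",
    "Entity B2 supplies Entity A; Entity C9 opened a store.",
]
name_library = ["Entity A", "Entity B1", "Entity B2", "Entity B3", "Entity C9"]
associated = {
    "Entity B1": "rating agency",
    "Entity B2": "supplier",
    "Entity B3": "customer",
}

others = extract_associated(texts, name_library, "Entity A", associated)
print(others)  # ['Entity B1', 'Entity B2']
```

Entity C9 is mentioned in the news but dropped because it is not in the associated-enterprise data, and Entity B3 is absent because no text mentions it.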
In this embodiment, enterprise entity A is taken as an example. After the other entities associated with enterprise entity A that were extracted from the news corpus are filtered, assume that the retained entities are B1, B2, and B3. B1 is a rating agency that issues credit ratings for enterprise entity A, and historical rating data show that B1 rated enterprise entity A as BBB. B2 is a supplier that provides raw materials or goods to enterprise entity A, and enterprise entity A owes B2 an amount of 300,000. B3 is a customer of enterprise entity A, and enterprise entity A has defaulted on B3 twice. With enterprise entities A, B1, B2, and B3 as nodes and the association relationships between A and each of B1, B2, and B3 as edges, the relationship network between the enterprise entity and the other entities shown in FIG. 2 is constructed.
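The example network of A, B1, B2, and B3 could be held in a simple adjacency structure, as in the following sketch. The attribute names (`relation`, `rating`, `amount_owed`, `defaults`) are assumptions chosen for readability; the application only states that names are nodes and association relationships are edges.

```python
# Edges of the example network: (node, node, attributes of the relationship).
edges = [
    ("A", "B1", {"relation": "rated_by", "rating": "BBB"}),
    ("A", "B2", {"relation": "supplier", "amount_owed": 300_000}),
    ("A", "B3", {"relation": "customer", "defaults": 2}),
]

def build_network(edges):
    """Build an undirected adjacency map: node -> {neighbor: edge attributes}."""
    graph = {}
    for u, v, attrs in edges:
        graph.setdefault(u, {})[v] = attrs
        graph.setdefault(v, {})[u] = attrs
    return graph

network = build_network(edges)
print(sorted(network["A"]))  # ['B1', 'B2', 'B3']
```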
Then, the vector representation of enterprise entity A is computed from the relationship network. This embodiment uses the Skip-Gram method, because in the relationship network the vector representation of enterprise entity A is related to the vector representations of its associated entities B1, B2, and B3. To train the enterprise entity name vectors, the Skip-Gram method uses the current enterprise entity to predict its surrounding entities, as shown in FIG. 3. An1, An2, An3, and An4 in FIG. 3 are unordered, and each denotes an adjacent entity of enterprise entity A. Similar to training word vectors with Skip-Gram, a fixed prediction length L is set in order to predict the L adjacent entities around enterprise entity A; if there are actually fewer than L adjacent entities, the remaining outputs are NULL. With this method, the vector representation of enterprise entity A, embedding(E1), embedding(E2), …, is obtained and used as the first feature vector of enterprise entity A.
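To make the fixed prediction length L concrete, the sketch below generates the (current entity, adjacent entity) training pairs that a Skip-Gram model would be trained on, padding with NULL when an entity has fewer than L neighbors. It does not train the embedding itself. Sorting the neighbors and truncating to the first L when there are more than L are assumptions made only to keep the example deterministic.

```python
NULL = "NULL"

def skipgram_pairs(graph, L):
    """Yield (center, neighbor) pairs with exactly L targets per center entity."""
    pairs = []
    for entity in sorted(graph):
        neighbors = sorted(graph[entity])[:L]          # assumed truncation rule
        neighbors += [NULL] * (L - len(neighbors))     # pad missing neighbors
        pairs.extend((entity, n) for n in neighbors)
    return pairs

# Relationship network of the running example as node -> set of neighbors.
graph = {"A": {"B1", "B2", "B3"}, "B1": {"A"}, "B2": {"A"}, "B3": {"A"}}

pairs = skipgram_pairs(graph, L=4)
print(pairs[:4])  # [('A', 'B1'), ('A', 'B2'), ('A', 'B3'), ('A', 'NULL')]
```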
It can be understood that assessing the risk of investing in enterprise entity A requires information about its finances, operations, and so on. The internal information of enterprise entity A therefore needs to be considered. The internal information includes information on enterprise entity A's operations, finances, recruitment, website update frequency, and the like. Some of this information is numerical; for example, the financial information includes the enterprise's net profit and stock returns for the previous year. Each reference factor in the internal information is converted into a number according to a rule. For example, values in the financial information can be converted into feature values: in this embodiment, the net profit is 300,000 yuan, and 30 is taken as the corresponding feature value. The website update frequency and the number of people recruited in the last year are also numerical and can likewise be converted into corresponding values according to a preset conversion rule. In other embodiments, 300,000 yuan may also be converted into another value according to a preset conversion ratio. After each reference factor in the internal information of enterprise entity A is quantified, the second feature vector of enterprise entity A is generated.
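As a sketch of one possible first preset rule, the function below expresses net profit in units of 10,000 yuan, so that 300,000 yuan becomes 30 as in the embodiment, and passes the other numerical factors through unchanged. The field names and the values for website updates and hires are hypothetical.

```python
def quantize_internal(info):
    """Turn internal information into a numeric second feature vector."""
    return [
        info["net_profit_yuan"] / 10_000,      # 300,000 yuan -> 30.0
        float(info["site_updates_per_month"]),  # already numerical
        float(info["hires_last_year"]),         # already numerical
    ]

# Only the net profit figure comes from the embodiment; the rest is made up.
internal = {
    "net_profit_yuan": 300_000,
    "site_updates_per_month": 4,
    "hires_last_year": 25,
}

second_vector = quantize_internal(internal)
print(second_vector)  # [30.0, 4.0, 25.0]
```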
It should be noted that, besides its own factors, external factors are also critical to how well enterprise entity A operates, so the external information of enterprise entity A also needs to be considered. The external information includes enterprise entity A's upstream and downstream relationships, such as suppliers and customers, and whether the enterprise has defaulted on or owes money to other entities in those relationships and, if so, the number of defaults and the length of the arrears period. The external information further includes rating agencies' ratings of enterprise entity A (among the rating levels, 3A and 2A indicate excellent, A indicates good, BBB indicates average, and so on), positive and negative news media coverage of enterprise entity A, and the like. Each reference factor in the external information is then converted into a number according to a rule. For example, in this embodiment, defaults can be quantified into three values: no default is 0, mild default is 1, and severe default is 2. Arrears can be quantified into two values: no arrears is 0, and arrears is 1. Ratings can be quantified into multiple values: 3A is 6, 2A is 5, A is 4, BBB is 3, BB is 2, and B is 1. Applying these rules to enterprise entity A gives a default value of 1, an arrears value of 1, and a rating value of 3, and the third feature vector of enterprise entity A is generated from the quantified information.
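The quantization of external information can be illustrated with the values of the running example. The rating scores and the 0/1 arrears flag follow the embodiment; the boundary between a mild and a severe default is not given in the application, so the threshold of three defaults used here is purely an assumption.

```python
RATING_SCORE = {"3A": 6, "2A": 5, "A": 4, "BBB": 3, "BB": 2, "B": 1}
SEVERE_DEFAULT_THRESHOLD = 3  # assumption: 3 or more defaults counts as severe

def quantize_external(default_count, amount_owed, rating):
    """Turn external information into a numeric third feature vector."""
    if default_count == 0:
        default_level = 0      # no default
    elif default_count < SEVERE_DEFAULT_THRESHOLD:
        default_level = 1      # mild default
    else:
        default_level = 2      # severe default
    arrears_flag = 1 if amount_owed > 0 else 0
    return [default_level, arrears_flag, RATING_SCORE[rating]]

# Enterprise entity A: defaulted on B3 twice, owes B2 300,000, rated BBB by B1.
third_vector = quantize_external(default_count=2, amount_owed=300_000, rating="BBB")
print(third_vector)  # [1, 1, 3]
```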
At this point, with the other entities associated with enterprise entity A and the internal and external information of enterprise entity A known, the risk of investing in enterprise entity A can be assessed. The name of enterprise entity A and its first, second, and third feature vectors are input into the predetermined risk assessment model, which outputs the risk assessment result. The training steps of the predetermined risk assessment model are as follows. Using steps S1-S5 above, the first, second, and third feature vectors of a large number of enterprise entities are obtained; the specific implementation is the same as in those steps and is not repeated here. Each enterprise entity is then annotated with a risk label: 0 for a "risk-free" enterprise entity and 1 for a "high-risk" enterprise entity. The first, second, and third feature vectors of each enterprise entity and the corresponding risk label are used as sample data. A first proportion (for example, 60%) of the enterprise entities is randomly drawn from the sample data, and their feature vectors and risk labels form the training set. A second proportion (for example, 50%) of the enterprise entities is then randomly drawn from the remaining samples, and their feature vectors and risk labels form the validation set; that is, 20% of the sample data is used as the validation set. A support vector machine is trained on the training set to determine the model parameters of the risk assessment model, which establishes the relationship between an enterprise entity's associated entities, internal information, and external information and the risk of investing in that enterprise entity. The accuracy of the risk assessment model is then verified with the 20% validation set: if the accuracy is greater than or equal to a preset accuracy (for example, 90%), training ends; if the accuracy is less than the preset accuracy, the number of samples is increased and the training steps are re-executed.
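The split arithmetic and the stopping criterion can be checked with a short sketch. Drawing 60% for training and then 50% of the remaining 40% yields a validation set of 20% of all samples. The support vector machine itself is omitted here; `split_samples` and `training_done` are hypothetical helper names, and integer percentages are used only to avoid floating-point rounding.

```python
import random

def split_samples(samples, first_percent=60, second_percent=50, seed=0):
    """Randomly draw the training set, then the validation set from the remainder."""
    shuffled = samples[:]
    random.Random(seed).shuffle(shuffled)
    n_train = len(shuffled) * first_percent // 100
    train, rest = shuffled[:n_train], shuffled[n_train:]
    n_val = len(rest) * second_percent // 100
    return train, rest[:n_val]

def training_done(predictions, labels, preset_accuracy=0.9):
    """True if validation accuracy reaches the preset accuracy."""
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels) >= preset_accuracy

samples = list(range(100))  # stand-ins for labelled enterprise entities
train, val = split_samples(samples)
print(len(train), len(val))  # 60 20
```

If `training_done` returns False, the procedure in the text calls for enlarging the sample set and repeating the training steps.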
After the first, second, and third feature vectors of enterprise entity A are input into the risk assessment model, a model output of 0 indicates that investing in enterprise entity A is substantially risk-free, and a model output of 1 indicates that investing in enterprise entity A carries relatively high risk.
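One way to present the three feature vectors to a classifier and to read its output is sketched below. Concatenating the vectors into a single input is an assumption, since the application does not say how they are combined, and the numbers in the first vector are placeholders rather than real embeddings.

```python
RISK_MEANING = {0: "substantially risk-free", 1: "relatively high risk"}

def assemble_input(first, second, third):
    """Concatenate the three feature vectors into one model input (assumed)."""
    return list(first) + list(second) + list(third)

def interpret(model_output):
    """Map the model's 0/1 output to the meaning given in the embodiment."""
    return RISK_MEANING[model_output]

x = assemble_input([0.12, -0.40], [30.0, 4.0, 25.0], [1, 1, 3])
print(len(x), interpret(1))  # 8 relatively high risk
```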
The enterprise investment risk assessment method proposed in the above embodiment obtains the first, second, and third feature vectors of an enterprise entity from, respectively, the relationships between the enterprise entity and its associated entities, the internal information of the enterprise entity, and its external information. It then uses the risk assessment model and these three feature vectors to assess the risk of investing in the enterprise entity, which helps investors capture market investment opportunities.
In addition, an embodiment of the present application further provides a computer-readable storage medium storing an enterprise investment risk assessment program which, when executed by a processor, implements the following operations:
A1. Crawl a news corpus related to the enterprise entity whose risk is to be assessed, preprocess the news corpus, and extract from the preprocessed news corpus other entities associated with the enterprise entity;
A2. Construct a relationship network between the enterprise entity and the other entities, with names as nodes and the association relationships between the enterprise entity and the other entities as edges;
A3. Compute a vector representation of the enterprise entity from the relationship network to generate a first feature vector of the enterprise entity;
A4. Quantify internal information of the enterprise entity according to a first preset rule to generate a second feature vector;
A5. Extract external information of the enterprise entity from the news corpus, and quantify the external information according to a second preset rule to generate a third feature vector of the enterprise entity; and
A6. Input the first feature vector, the second feature vector, and the third feature vector into a predetermined enterprise risk assessment model, and output a risk label corresponding to the enterprise entity.
The specific implementation of the computer-readable storage medium of the present application is substantially the same as the embodiments of the enterprise investment risk assessment method and the electronic device described above, and is not repeated here.
It should be noted that the serial numbers of the above embodiments of the present application are for description only and do not indicate the relative merits of the embodiments. In addition, the terms "include", "comprise", and any variants thereof are intended to cover a non-exclusive inclusion, so that a process, device, article, or method that includes a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, device, article, or method. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, device, article, or method that includes the element.
From the description of the above embodiments, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by software together with a necessary general-purpose hardware platform, or by hardware, although in many cases the former is the better implementation. Based on this understanding, the technical solution of the present application, in essence or in the part that contributes over the prior art, may be embodied in the form of a software product. The computer software product is stored in a storage medium as described above (such as a ROM/RAM, a magnetic disk, or an optical disc) and includes a number of instructions that cause a terminal device (which may be a mobile phone, a computer, a server, a network device, or the like) to perform the methods described in the embodiments of the present application.
The above are only preferred embodiments of the present application and do not thereby limit the patent scope of the present application. Any equivalent structure or equivalent process transformation made using the contents of the specification and drawings of the present application, or any direct or indirect application in other related technical fields, is likewise included within the scope of patent protection of the present application.

Claims (20)

  1. An enterprise investment risk assessment method, applied to an electronic device, characterized in that the method comprises:
    S1. crawling a news corpus related to an enterprise entity whose risk is to be assessed, preprocessing the news corpus, and extracting from the preprocessed news corpus other entities associated with the enterprise entity;
    S2. constructing a relationship network between the enterprise entity and the other entities, with names as nodes and the association relationships between the enterprise entity and the other entities as edges;
    S3. computing a vector representation of the enterprise entity from the relationship network to generate a first feature vector of the enterprise entity;
    S4. quantifying internal information of the enterprise entity according to a first preset rule to generate a second feature vector;
    S5. extracting external information of the enterprise entity from the news corpus, and quantifying the external information of the enterprise entity according to a second preset rule to generate a third feature vector of the enterprise entity; and
    S6. inputting the first feature vector, the second feature vector, and the third feature vector into a predetermined enterprise risk assessment model, and outputting a risk label corresponding to the enterprise entity.
  2. The enterprise investment risk assessment method according to claim 1, characterized in that the step of "computing a vector representation of the enterprise entity from the relationship network" uses the Skip-Gram method.
  3. The enterprise investment risk assessment method according to claim 1, characterized in that the first preset rule is a rule of converting each reference factor in the internal information of the enterprise entity into a number for quantification.
  4. The enterprise investment risk assessment method according to claim 1, characterized in that the second preset rule is a rule of converting each reference factor in the external information of the enterprise entity into a number for quantification.
  5. The enterprise investment risk assessment method according to claim 1, characterized in that the preprocessing in step S1 comprises: unifying the format of the news corpus into a text format, and removing advertising noise from the news corpus.
  6. The enterprise investment risk assessment method according to claim 1, characterized in that the training steps of the predetermined enterprise risk assessment model comprise:
    crawling news corpora related to a plurality of enterprise entities, extracting from the news corpora other entities associated with the plurality of enterprise entities, and constructing, with names as nodes and the association relationships between entities as edges, a relationship network between each of the plurality of enterprise entities and the other entities;
    computing vector representations of the plurality of enterprise entities from the respective relationship networks to generate first feature vectors of the plurality of enterprise entities;
    quantifying internal information of the plurality of enterprise entities according to the first preset rule to generate second feature vectors;
    extracting external information of the enterprise entity from the news corpus, and quantifying the external information of the enterprise entity according to the second preset rule to generate a third feature vector of the enterprise entity;
    annotating each of the plurality of enterprise entities with a risk label according to historical risk assessment records, and using the first feature vectors, second feature vectors, third feature vectors, and risk labels of the plurality of enterprise entities as sample data;
    extracting a first proportion of the sample data as a training set, and extracting a second proportion of the sample data as a validation set;
    training a support vector machine with the training set to obtain the risk assessment model; and
    verifying the accuracy of the risk assessment model with the validation set, wherein if the accuracy is greater than or equal to a preset accuracy, training ends, or, if the accuracy is less than the preset accuracy, the number of samples is increased and the training steps are re-executed.
  7. An electronic device, characterized in that the device comprises a memory and a processor, the memory storing an enterprise investment risk assessment program executable on the processor, wherein the program, when executed by the processor, implements the following steps:
    A1. crawling a news corpus related to an enterprise entity whose risk is to be assessed, preprocessing the news corpus, and extracting from the preprocessed news corpus other entities associated with the enterprise entity;
    A2. constructing a relationship network between the enterprise entity and the other entities, with names as nodes and the association relationships between the enterprise entity and the other entities as edges;
    A3. computing a vector representation of the enterprise entity from the relationship network to generate a first feature vector of the enterprise entity;
    A4. quantifying internal information of the enterprise entity according to a first preset rule to generate a second feature vector;
    A5. extracting external information of the enterprise entity from the news corpus, and quantifying the external information of the enterprise entity according to a second preset rule to generate a third feature vector of the enterprise entity; and
    A6. inputting the first feature vector, the second feature vector, and the third feature vector into a predetermined enterprise risk assessment model, and outputting a risk label corresponding to the enterprise entity.
  8. The electronic device according to claim 7, characterized in that the step of "computing a vector representation of the enterprise entity from the relationship network" uses the Skip-Gram method.
  9. The electronic device according to claim 7, characterized in that the first preset rule is a rule of converting each reference factor in the internal information of the enterprise entity into a number for quantification.
  10. The electronic device according to claim 7, characterized in that the second preset rule is a rule of converting each reference factor in the external information of the enterprise entity into a number for quantification.
  11. The electronic device according to claim 7, characterized in that the preprocessing in step A1 comprises: unifying the format of the news corpus into a text format, and removing advertising noise from the news corpus.
  12. The electronic device according to claim 7, wherein the training steps of the predetermined enterprise risk assessment model comprise:
    crawling news corpus related to a plurality of enterprise entities, extracting from the news corpus other entities associated with the plurality of enterprise entities, and, with names as nodes and association relationships between entities as edges, separately constructing relationship networks between the plurality of enterprise entities and the other entities;
    separately calculating vector representations of the plurality of enterprise entities according to the relationship networks, and generating first feature vectors of the plurality of enterprise entities;
    quantifying internal information of the plurality of enterprise entities according to the first preset rule, and generating second feature vectors;
    extracting external information of the enterprise entity from the news corpus, quantifying the external information of the enterprise entity according to the second preset rule, and generating a third feature vector of the enterprise entity;
    labeling each of the plurality of enterprise entities with a risk tag according to historical risk assessment records, and using the first feature vectors, second feature vectors, third feature vectors, and risk tags of the plurality of enterprise entities as sample data;
    extracting a first proportion of the sample data as a training set, and extracting a second proportion of the sample data as a validation set;
    training a support vector machine with the training set to obtain the risk assessment model; and
    verifying the accuracy of the risk assessment model with the validation set; if the accuracy is greater than or equal to a preset accuracy, ending the training, or, if the accuracy is less than the preset accuracy, increasing the number of samples and re-executing the training steps.
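The split, train, and validate loop in the training steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the applicant's implementation: the support vector machine is a bias-free linear one trained with Pegasos-style subgradient steps, risk tags are encoded as +1/-1, each sample is the concatenation of the three feature vectors, and `load_samples`, the 70/30 split, the 0.9 threshold, and the doubling schedule are all hypothetical choices.

```python
import random


def predict(w, x):
    """Return +1 or -1 according to the side of the hyperplane x falls on."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) >= 0 else -1


def train_linear_svm(samples, labels, epochs=50, lam=0.01, seed=0):
    """Bias-free linear SVM trained on the hinge loss with Pegasos-style steps."""
    rng = random.Random(seed)
    w = [0.0] * len(samples[0])
    order = list(range(len(samples)))
    t = 0
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)          # decaying step size
            x, y = samples[i], labels[i]
            margin = y * sum(wj * xj for wj, xj in zip(w, x))
            decay = 1.0 - 1.0 / t          # regularisation shrink, equals 1 - eta * lam
            w = [decay * wj for wj in w]
            if margin < 1:                 # sample violates the margin
                w = [wj + eta * y * xj for wj, xj in zip(w, x)]
    return w


def split(samples, labels, train_ratio=0.7, val_ratio=0.3, seed=0):
    """Draw a first proportion as the training set and a second as the validation set."""
    idx = list(range(len(samples)))
    random.Random(seed).shuffle(idx)
    n_train = round(len(idx) * train_ratio)
    n_val = round(len(idx) * val_ratio)
    train, val = idx[:n_train], idx[n_train:n_train + n_val]

    def pick(ids):
        return [samples[i] for i in ids], [labels[i] for i in ids]

    return pick(train), pick(val)


def accuracy(w, samples, labels):
    hits = sum(1 for x, y in zip(samples, labels) if predict(w, x) == y)
    return hits / len(samples)


def fit_until_accurate(load_samples, n_start=20, min_accuracy=0.9, max_rounds=5):
    """Retrain with more samples until validation accuracy reaches the preset value."""
    n = n_start
    for _ in range(max_rounds):
        samples, labels = load_samples(n)   # hypothetical source of labelled samples
        (x_train, y_train), (x_val, y_val) = split(samples, labels)
        w = train_linear_svm(x_train, y_train)
        acc = accuracy(w, x_val, y_val)
        if acc >= min_accuracy:
            return w, acc
        n *= 2                              # increase the sample count and retrain
    raise RuntimeError("preset accuracy not reached within max_rounds")
```

A bias term can be modelled by appending a constant feature to every sample. In practice a library SVM with a kernel would likely replace `train_linear_svm`; the outer loop would stay the same.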
  13. A computer readable storage medium, wherein an enterprise investment risk assessment program is stored on the computer readable storage medium, and the program, when executed by a processor, implements the following steps:
    A1. crawling news corpus related to an enterprise entity whose risk is to be assessed, preprocessing the news corpus, and extracting, from the preprocessed news corpus, other entities associated with the enterprise entity;
    A2. with names as nodes and association relationships between the enterprise entity and the other entities as edges, constructing a relationship network between the enterprise entity and the other entities;
    A3. calculating a vector representation of the enterprise entity according to the relationship network, and generating a first feature vector of the enterprise entity;
    A4. quantifying internal information of the enterprise entity according to a first preset rule, and generating a second feature vector;
    A5. extracting external information of the enterprise entity from the news corpus, quantifying the external information of the enterprise entity according to a second preset rule, and generating a third feature vector of the enterprise entity; and
    A6. inputting the first feature vector, the second feature vector, and the third feature vector into a predetermined enterprise risk assessment model, and outputting a risk tag corresponding to the enterprise entity.
  14. The computer readable storage medium according to claim 13, wherein the step of "calculating a vector representation of the enterprise entity according to the relationship network" uses the Skip-Gram method.
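One way to connect the relationship network of step A2 to a Skip-Gram model is sketched below. The claim names only Skip-Gram; generating node sequences by random walks (in the style of DeepWalk) is an assumption made here for illustration, and the entity names are hypothetical. The sketch stops at producing (center, context) training pairs and does not train the embedding itself.

```python
import random
from collections import defaultdict


def build_network(relations):
    """Names are nodes; each (a, b) association becomes an undirected edge."""
    graph = defaultdict(set)
    for a, b in relations:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def random_walk(graph, start, length, rng):
    """Walk up to `length` nodes, choosing a random neighbour at each step."""
    walk = [start]
    while len(walk) < length:
        neighbours = sorted(graph[walk[-1]])   # sorted for reproducibility
        if not neighbours:
            break
        walk.append(rng.choice(neighbours))
    return walk


def skipgram_pairs(walk, window):
    """Emit (center, context) pairs for every node within `window` positions."""
    pairs = []
    for i, center in enumerate(walk):
        lo, hi = max(0, i - window), min(len(walk), i + window + 1)
        pairs.extend((center, walk[j]) for j in range(lo, hi) if j != i)
    return pairs


# Hypothetical entities extracted from news text.
relations = [("Company A", "Company B"), ("Company A", "Person X"), ("Company B", "Bank Y")]
graph = build_network(relations)
walk = random_walk(graph, "Company A", 5, random.Random(0))
pairs = skipgram_pairs(walk, window=2)
```

The resulting pairs would then be fed to a Skip-Gram trainer, and the learned vector for the enterprise entity's node would serve as the first feature vector.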
  15. The computer readable storage medium according to claim 13, wherein the first preset rule is a rule for converting each reference factor in the internal information of the enterprise entity into a numerical quantity.
  16. The computer readable storage medium according to claim 13, wherein the second preset rule is a rule for converting each reference factor in the external information of the enterprise entity into a numerical quantity.
  17. The computer readable storage medium according to claim 13, wherein the preprocessing in step A1 comprises: unifying the format of the news corpus into a text format, and removing advertising noise from the news corpus.
  18. The computer readable storage medium according to claim 13, wherein the training steps of the predetermined enterprise risk assessment model comprise:
    crawling news corpus related to a plurality of enterprise entities, extracting from the news corpus other entities associated with the plurality of enterprise entities, and, with names as nodes and association relationships between entities as edges, separately constructing relationship networks between the plurality of enterprise entities and the other entities;
    separately calculating vector representations of the plurality of enterprise entities according to the relationship networks, and generating first feature vectors of the plurality of enterprise entities;
    quantifying internal information of the plurality of enterprise entities according to the first preset rule, and generating second feature vectors;
    extracting external information of the enterprise entity from the news corpus, quantifying the external information of the enterprise entity according to the second preset rule, and generating a third feature vector of the enterprise entity;
    labeling each of the plurality of enterprise entities with a risk tag according to historical risk assessment records, and using the first feature vectors, second feature vectors, third feature vectors, and risk tags of the plurality of enterprise entities as sample data;
    extracting a first proportion of the sample data as a training set, and extracting a second proportion of the sample data as a validation set;
    training a support vector machine with the training set to obtain the risk assessment model; and
    verifying the accuracy of the risk assessment model with the validation set; if the accuracy is greater than or equal to a preset accuracy, ending the training, or, if the accuracy is less than the preset accuracy, increasing the number of samples and re-executing the training steps.
  19. An enterprise investment risk assessment program, wherein the program comprises: an extraction module, a construction module, a first calculation module, a second calculation module, a third calculation module, and an evaluation module.
  20. The enterprise investment risk assessment program according to claim 19, wherein the program, when executed by a processor, implements the steps of the enterprise investment risk assessment method according to any one of claims 1 to 6.
PCT/CN2018/076169 2017-11-17 2018-02-10 Enterprise investment risk assessment method, device, and storage medium WO2019095572A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201711141730.3A CN107909274B (en) 2017-11-17 2017-11-17 Enterprise investment risk assessment method and device and storage medium
CN201711141730.3 2017-11-17

Publications (1)

Publication Number Publication Date
WO2019095572A1 (en) 2019-05-23

Family

ID=61845968

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/076169 WO2019095572A1 (en) 2017-11-17 2018-02-10 Enterprise investment risk assessment method, device, and storage medium

Country Status (2)

Country Link
CN (1) CN107909274B (en)
WO (1) WO2019095572A1 (en)

Cited By (22)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110414806A (en) * 2019-07-10 2019-11-05 平安科技(深圳)有限公司 Employee's method for prewarning risk and relevant apparatus
CN110443459A (en) * 2019-07-05 2019-11-12 深圳壹账通智能科技有限公司 Warning information method for pushing, device, computer equipment and storage medium
CN110532357A (en) * 2019-09-04 2019-12-03 深圳前海微众银行股份有限公司 Generation method, device, equipment and the readable storage medium storing program for executing of ESG score-system
CN111951079A (en) * 2020-08-14 2020-11-17 国网电子商务有限公司 Credit rating method and device based on knowledge graph and electronic equipment
CN112365194A (en) * 2020-12-01 2021-02-12 未鲲(上海)科技服务有限公司 Enterprise data processing method, device, equipment and computer storage medium
CN112418320A (en) * 2020-11-24 2021-02-26 杭州未名信科科技有限公司 Enterprise association relation identification method and device and storage medium
CN112579773A (en) * 2020-12-16 2021-03-30 中国建设银行股份有限公司 Risk event grading method and device
CN112598496A (en) * 2020-12-15 2021-04-02 深圳前海微众银行股份有限公司 Wind control blacklist setting method and device, terminal equipment and readable storage medium
CN112598302A (en) * 2020-12-25 2021-04-02 北京知因智慧科技有限公司 Enterprise data evaluation method and device and server
CN112613762A (en) * 2020-12-25 2021-04-06 北京知因智慧科技有限公司 Knowledge graph-based group rating method and device and electronic equipment
CN113283806A (en) * 2021-06-22 2021-08-20 中国平安财产保险股份有限公司 Enterprise information evaluation method and device, computer equipment and storage medium
CN113506173A (en) * 2021-08-06 2021-10-15 国网电子商务有限公司 Credit risk assessment method and related equipment thereof
CN113643035A (en) * 2020-05-11 2021-11-12 阿里巴巴集团控股有限公司 Information processing method, information display method, device, equipment and storage medium
CN113673870A (en) * 2021-08-23 2021-11-19 杭州安恒信息技术股份有限公司 Enterprise data analysis method and related components
CN113689288A (en) * 2021-08-25 2021-11-23 深圳前海微众银行股份有限公司 Risk identification method, device and equipment based on entity list and storage medium
CN113743111A (en) * 2020-08-25 2021-12-03 国家计算机网络与信息安全管理中心 Financial risk prediction method and device based on text pre-training and multi-task learning
CN113837527A (en) * 2021-08-02 2021-12-24 深圳前海微众银行股份有限公司 Enterprise rating method, device, equipment and storage medium
CN113837517A (en) * 2020-12-01 2021-12-24 北京沃东天骏信息技术有限公司 Event triggering method and device, medium and electronic equipment
CN113919957A (en) * 2021-10-29 2022-01-11 深圳壹账通智能科技有限公司 Method, device, electronic equipment and medium for calculating cyclic investment proportion and representing path
CN113962568A (en) * 2021-10-26 2022-01-21 天元大数据信用管理有限公司 Model label labeling method, device and medium based on support vector machine
CN114118816A (en) * 2021-11-30 2022-03-01 建信金融科技有限责任公司 Risk assessment method, device and equipment and computer storage medium
CN117829586A (en) * 2023-12-08 2024-04-05 深圳市南弯数字科技有限公司 Control method, equipment and storage medium of enterprise risk assessment system

Families Citing this family (35)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109087163B (en) * 2018-07-06 2021-07-09 创新先进技术有限公司 Credit assessment method and device
CN108985638B (en) * 2018-07-25 2020-07-24 腾讯科技(深圳)有限公司 User investment risk assessment method and device and storage medium
CN109345089A (en) * 2018-09-13 2019-02-15 杭州索骥数据科技有限公司 Enterprise development state evaluating method and system based on big data
CN109299362B (en) * 2018-09-21 2023-04-14 平安科技(深圳)有限公司 Similar enterprise recommendation method and device, computer equipment and storage medium
CN109597894B (en) * 2018-09-30 2023-10-03 创新先进技术有限公司 Correlation model generation method and device, and data correlation method and device
CN109214904B (en) * 2018-10-11 2024-07-02 平安科技(深圳)有限公司 Method, device, computer equipment and storage medium for acquiring financial false-making clues
CN109523117A (en) * 2018-10-11 2019-03-26 平安科技(深圳)有限公司 Risk Forecast Method, device, computer equipment and storage medium
CN109472485A (en) * 2018-11-01 2019-03-15 成都数联铭品科技有限公司 Enterprise breaks one's promise Risk of Communication inquiry system and method
CN109523153A (en) * 2018-11-12 2019-03-26 平安科技(深圳)有限公司 Acquisition methods, device, computer equipment and the storage medium of illegal fund collection enterprise
CN109543985A (en) * 2018-11-15 2019-03-29 李志东 Business risk appraisal procedure, system and medium
CN109657917B (en) * 2018-11-19 2022-04-29 平安科技(深圳)有限公司 Risk early warning method and device for evaluation object, computer equipment and storage medium
CN109558592A (en) * 2018-11-29 2019-04-02 上海点融信息科技有限责任公司 The method and apparatus of customer Credit Risk assessment information is obtained based on artificial intelligence
CN109670837A (en) * 2018-11-30 2019-04-23 平安科技(深圳)有限公司 Recognition methods, device, computer equipment and the storage medium of bond default risk
CN109740865A (en) * 2018-12-13 2019-05-10 平安科技(深圳)有限公司 Methods of risk assessment, system, equipment and storage medium
CN109359901A (en) * 2018-12-13 2019-02-19 泰康保险集团股份有限公司 Method and device, medium and electronic equipment are determined based on the business risk of block chain
KR102202139B1 (en) * 2018-12-17 2021-01-12 지속가능발전소 주식회사 Method for analyzing risk of cooperrator supply chain, computer readable medium for performing the method
CN109800976A (en) * 2019-01-07 2019-05-24 平安科技(深圳)有限公司 Investment decision methods, device, computer equipment and storage medium
CN109829640A (en) * 2019-01-23 2019-05-31 平安科技(深圳)有限公司 Recognition methods, device, computer equipment and the storage medium of enterprise's default risk
CN111626887A (en) * 2019-02-27 2020-09-04 北京奇虎科技有限公司 Social relationship evaluation method and device
CN110033120A (en) * 2019-03-06 2019-07-19 阿里巴巴集团控股有限公司 For providing the method and device that risk profile energizes service for trade company
CN110009229A (en) * 2019-04-04 2019-07-12 泰康保险集团股份有限公司 Supply chain management method, device, storage medium and equipment based on block chain
CN110188980A (en) * 2019-04-15 2019-08-30 深圳壹账通智能科技有限公司 Business risk methods of marking, device, computer equipment and storage medium
CN112053021A (en) * 2019-06-05 2020-12-08 国网信息通信产业集团有限公司 Feature coding method and device for enterprise operation management risk identification
CN110533528A (en) * 2019-08-30 2019-12-03 北京市天元网络技术股份有限公司 Assess the method and apparatus of business standing
CN111104442A (en) * 2019-11-06 2020-05-05 杭州绿程网络科技有限公司 Preprocessing method for enterprise comprehensive data
CN111291932A (en) * 2020-02-12 2020-06-16 徐佳慧 Investment and financing relation network link prediction method, device and equipment
CN111340246A (en) * 2020-02-26 2020-06-26 未来地图(深圳)智能科技有限公司 Processing method and device for enterprise intelligent decision analysis and computer equipment
CN111311105A (en) * 2020-02-28 2020-06-19 深圳前海微众银行股份有限公司 Combined product scoring method, device, equipment and readable storage medium
CN111459961A (en) * 2020-03-31 2020-07-28 深圳前海微众银行股份有限公司 Method, device and equipment for updating service data and storage medium
CN113592519A (en) * 2020-04-30 2021-11-02 景德镇陶瓷大学 Marketing data analysis and evaluation system beneficial to enterprise development
CN111353728A (en) * 2020-05-06 2020-06-30 支付宝(杭州)信息技术有限公司 Risk analysis method and system
CN112016850A (en) * 2020-09-14 2020-12-01 支付宝(杭州)信息技术有限公司 Service evaluation method and device
CN112732804B (en) * 2020-12-23 2024-04-26 北京金堤征信服务有限公司 Cooperative data evaluation method and device, electronic equipment and readable storage medium
CN112884496B (en) * 2021-05-06 2021-08-20 达而观数据(成都)有限公司 Method, device and computer storage medium for calculating enterprise credit factor score
CN114168757B (en) * 2022-02-11 2022-04-29 子长科技(北京)有限公司 Company event risk prediction method, device, storage medium and electronic equipment

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150149247A1 (en) * 2013-05-02 2015-05-28 The Dun & Bradstreet Corporation System and method using multi-dimensional rating to determine an entity's future commercical viability
CN105740335A (en) * 2016-01-22 2016-07-06 山东合天智汇信息技术有限公司 Titan-based enterprise information analysis platform and construction method thereof
CN106203808A (en) * 2016-07-01 2016-12-07 中国民生银行股份有限公司 Enterprise Credit Risk Evaluation method and apparatus
CN107229756A (en) * 2017-06-30 2017-10-03 山东合天智汇信息技术有限公司 A kind of design method and system directly perceived for showing business connection collection of illustrative plates

Family Cites Families (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20040215551A1 (en) * 2001-11-28 2004-10-28 Eder Jeff S. Value and risk management system for multi-enterprise organization
US20050027645A1 (en) * 2002-01-31 2005-02-03 Wai Shing Lui William Business enterprise risk model and method
US20070208600A1 (en) * 2006-03-01 2007-09-06 Babus Steven A Method and apparatus for pre-emptive operational risk management and risk discovery
JP6009864B2 (en) * 2011-09-21 2016-10-19 典秀 野田 Company evaluation system, company evaluation method and company evaluation program
US9087088B1 (en) * 2012-11-13 2015-07-21 American Express Travel Related Services Company, Inc. Systems and methods for dynamic construction of entity graphs
CA2905996C (en) * 2013-03-13 2022-07-19 Guardian Analytics, Inc. Fraud detection and analysis
CN103942718A (en) * 2014-04-14 2014-07-23 中国人民银行征信中心 Enterprise credit information collection and integration method
CN105528465A (en) * 2016-02-03 2016-04-27 天弘基金管理有限公司 Credit status assessment method and device
CN105975491A (en) * 2016-04-26 2016-09-28 重庆誉存企业信用管理有限公司 Enterprise news analysis method and system
CN105913195A (en) * 2016-04-29 2016-08-31 浙江汇信科技有限公司 All-industry data based enterprise's financial risk scoring method
CN106447066A (en) * 2016-06-01 2017-02-22 上海坤士合生信息科技有限公司 Big data feature extraction method and device
CN106445988A (en) * 2016-06-01 2017-02-22 上海坤士合生信息科技有限公司 Intelligent big data processing method and system
CN106126614A (en) * 2016-06-21 2016-11-16 山东合天智汇信息技术有限公司 A kind of method and system reviewing Liang Ge enterprise multi-layer associated path
CN106934712A (en) * 2017-03-16 2017-07-07 深圳微众税银信息服务有限公司 A kind of enterprise's representation data processing method and system
CN107133732A (en) * 2017-04-27 2017-09-05 青岛格兰德信用管理咨询有限公司 The relation loop method for digging analyzed based on big data and its application
CN107239882A (en) * 2017-05-10 2017-10-10 平安科技(深圳)有限公司 Methods of risk assessment, device, computer equipment and storage medium
CN107301493A (en) * 2017-05-19 2017-10-27 四川新网银行股份有限公司 A kind of mutual golden business ratings model based on deep neural network
CN107220237A (en) * 2017-05-24 2017-09-29 南京大学 A kind of method of business entity's Relation extraction based on convolutional neural networks

Cited By (32)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110443459A (en) * 2019-07-05 2019-11-12 深圳壹账通智能科技有限公司 Warning information method for pushing, device, computer equipment and storage medium
CN110414806A (en) * 2019-07-10 2019-11-05 平安科技(深圳)有限公司 Employee's method for prewarning risk and relevant apparatus
CN110414806B (en) * 2019-07-10 2024-05-14 平安科技(深圳)有限公司 Employee risk early warning method and related device
CN110532357A (en) * 2019-09-04 2019-12-03 深圳前海微众银行股份有限公司 Generation method, device, equipment and the readable storage medium storing program for executing of ESG score-system
CN110532357B (en) * 2019-09-04 2024-03-12 深圳前海微众银行股份有限公司 ESG scoring system generation method, device, equipment and readable storage medium
CN113643035A (en) * 2020-05-11 2021-11-12 阿里巴巴集团控股有限公司 Information processing method, information display method, device, equipment and storage medium
CN111951079B (en) * 2020-08-14 2024-04-02 国网数字科技控股有限公司 Credit rating method and device based on knowledge graph and electronic equipment
CN111951079A (en) * 2020-08-14 2020-11-17 国网电子商务有限公司 Credit rating method and device based on knowledge graph and electronic equipment
CN113743111B (en) * 2020-08-25 2024-06-04 国家计算机网络与信息安全管理中心 Financial risk prediction method and device based on text pre-training and multi-task learning
CN113743111A (en) * 2020-08-25 2021-12-03 国家计算机网络与信息安全管理中心 Financial risk prediction method and device based on text pre-training and multi-task learning
CN112418320A (en) * 2020-11-24 2021-02-26 杭州未名信科科技有限公司 Enterprise association relation identification method and device and storage medium
CN112418320B (en) * 2020-11-24 2024-01-19 杭州未名信科科技有限公司 Enterprise association relation identification method, device and storage medium
CN112365194A (en) * 2020-12-01 2021-02-12 未鲲(上海)科技服务有限公司 Enterprise data processing method, device, equipment and computer storage medium
CN113837517A (en) * 2020-12-01 2021-12-24 北京沃东天骏信息技术有限公司 Event triggering method and device, medium and electronic equipment
CN112598496A (en) * 2020-12-15 2021-04-02 深圳前海微众银行股份有限公司 Wind control blacklist setting method and device, terminal equipment and readable storage medium
CN112598496B (en) * 2020-12-15 2024-04-30 深圳前海微众银行股份有限公司 Wind control blacklist setting method and device, terminal equipment and readable storage medium
CN112579773A (en) * 2020-12-16 2021-03-30 中国建设银行股份有限公司 Risk event grading method and device
CN112598302A (en) * 2020-12-25 2021-04-02 北京知因智慧科技有限公司 Enterprise data evaluation method and device and server
CN112598302B (en) * 2020-12-25 2024-03-26 北京知因智慧科技有限公司 Enterprise data evaluation method, device and server
CN112613762A (en) * 2020-12-25 2021-04-06 北京知因智慧科技有限公司 Knowledge graph-based group rating method and device and electronic equipment
CN112613762B (en) * 2020-12-25 2024-04-16 北京知因智慧科技有限公司 Group rating method and device based on knowledge graph and electronic equipment
CN113283806A (en) * 2021-06-22 2021-08-20 中国平安财产保险股份有限公司 Enterprise information evaluation method and device, computer equipment and storage medium
CN113837527A (en) * 2021-08-02 2021-12-24 深圳前海微众银行股份有限公司 Enterprise rating method, device, equipment and storage medium
CN113506173A (en) * 2021-08-06 2021-10-15 国网电子商务有限公司 Credit risk assessment method and related equipment thereof
CN113673870A (en) * 2021-08-23 2021-11-19 杭州安恒信息技术股份有限公司 Enterprise data analysis method and related components
CN113673870B (en) * 2021-08-23 2024-04-30 杭州安恒信息技术股份有限公司 Enterprise data analysis method and related components
CN113689288A (en) * 2021-08-25 2021-11-23 深圳前海微众银行股份有限公司 Risk identification method, device and equipment based on entity list and storage medium
CN113689288B (en) * 2021-08-25 2024-05-14 深圳前海微众银行股份有限公司 Risk identification method, device, equipment and storage medium based on entity list
CN113962568A (en) * 2021-10-26 2022-01-21 天元大数据信用管理有限公司 Model label labeling method, device and medium based on support vector machine
CN113919957A (en) * 2021-10-29 2022-01-11 深圳壹账通智能科技有限公司 Method, device, electronic equipment and medium for calculating cyclic investment proportion and representing path
CN114118816A (en) * 2021-11-30 2022-03-01 建信金融科技有限责任公司 Risk assessment method, device and equipment and computer storage medium
CN117829586A (en) * 2023-12-08 2024-04-05 深圳市南弯数字科技有限公司 Control method, equipment and storage medium of enterprise risk assessment system

Also Published As

Publication number Publication date
CN107909274B (en) 2023-02-28
CN107909274A (en) 2018-04-13

Similar Documents

Publication Publication Date Title
WO2019095572A1 (en) Enterprise investment risk assessment method, device, and storage medium
Chavez-Demoulin* et al. Estimating value-at-risk: a point process approach
WO2019061994A1 (en) Electronic device, insurance product recommendation method and system, and computer readable storage medium
US20220343433A1 (en) System and method that rank businesses in environmental, social and governance (esg)
US9411917B2 (en) Methods and systems for modeling crowdsourcing platform
CN107862425B (en) Wind control data acquisition method, device and system and readable storage medium
CN110457333B (en) Data real-time updating method and device and computer readable storage medium
US11804302B2 (en) Supervised machine learning-based modeling of sensitivities to potential disruptions
US10592472B1 (en) Database system for dynamic and automated access and storage of data items from multiple data sources
WO2018171295A1 (en) Method and apparatus for tagging article, terminal, and computer readable storage medium
CN113177700B (en) Risk assessment method, system, electronic equipment and storage medium
CN113449046A (en) Model training method, system and related device based on enterprise knowledge graph
CN115936895A (en) Risk assessment method, device and equipment based on artificial intelligence and storage medium
CN116860856A (en) Financial data processing method and device, computer equipment and storage medium
CN108809943A (en) Web publishing method and its device
US9141686B2 (en) Risk analysis using unstructured data
WO2019095569A1 (en) Financial analysis method based on financial and economic event on microblog, application server, and computer readable storage medium
CN117273968A (en) Accounting document generation method of cross-business line product and related equipment thereof
CN117033431A (en) Work order processing method, device, electronic equipment and medium
CN117114901A (en) Method, device, equipment and medium for processing insurance data based on artificial intelligence
US11875374B2 (en) Automated auditing and recommendation systems and methods
CN116450723A (en) Data extraction method, device, computer equipment and storage medium
US20190139168A1 (en) System and methods thereof for automated generation of an agreement related to a physical asset
US11379445B2 (en) System and method for analyzing and structuring data records
CN111179076A (en) IT system intelligent management method, IT system intelligent management device and computer readable storage medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18879239

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 01/10/2020)

122 Ep: pct application non-entry in european phase

Ref document number: 18879239

Country of ref document: EP

Kind code of ref document: A1