CN110020048B - Enterprise risk evaluation system and method based on open source data

Info

Publication number
CN110020048B
Authority
CN
China
Prior art keywords
index
score
word frequency
total word
enterprise
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201711022805.6A
Other languages
Chinese (zh)
Other versions
CN110020048A (en)
Inventor
张守义
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chongqing Chenyu Information Technology Co.,Ltd.
Original Assignee
Beijing Chenxin Credit Information Co ltd
Application filed by Beijing Chenxin Credit Information Co ltd
Priority to CN201711022805.6A
Publication of CN110020048A
Application granted
Publication of CN110020048B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/951 Indexing; Web crawling techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G06F40/216 Parsing using statistical methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0635 Risk analysis of enterprise or organisation activities
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00 Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/80 Management or planning
    • Y02P90/82 Energy audits or management systems therefor

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Physics & Mathematics (AREA)
  • Human Resources & Organizations (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Strategic Management (AREA)
  • Entrepreneurship & Innovation (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Artificial Intelligence (AREA)
  • Economics (AREA)
  • Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Game Theory and Decision Science (AREA)
  • Probability & Statistics with Applications (AREA)
  • General Business, Economics & Management (AREA)
  • Tourism & Hospitality (AREA)
  • Quality & Reliability (AREA)
  • Operations Research (AREA)
  • Marketing (AREA)
  • Educational Administration (AREA)
  • Development Economics (AREA)
  • Data Mining & Analysis (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses an enterprise risk evaluation system and method based on open source data. The system comprises a data crawling module, a data word segmentation module and a word frequency statistics module, wherein the data crawling module crawls data related to the enterprise to be evaluated from web pages, and the data word segmentation module performs word segmentation on the crawled data and counts word frequencies. The system further comprises a plurality of sub-modules, each of which evaluates factors influencing the development prospects of the enterprise according to the participles and their word frequencies, so that a reasonable enterprise score is finally obtained. The risk of the enterprise can be judged from this score, and different enterprises can be compared by the differences between their scores.

Description

Enterprise risk evaluation system and method based on open source data
Technical Field
The invention relates to an enterprise data analysis and processing system, in particular to an enterprise risk evaluation system and method based on open source data.
Background
With the advent of the big data era, increasing attention is paid to making judgments and decisions through data analysis. Simple statistics and summation are easy to understand and carry out, but when the relationship between the known data and the required result is not obvious, the statistics usually have to be processed by a dedicated program or device. How exactly that processing should be performed, and by what kind of computing device, is rarely addressed in the prior art.
Specifically, modern society has a huge number of enterprises and companies, and their quality varies widely. Before making important decisions such as selecting a partner or an investment target, it is necessary to fully understand the potential and capability of the enterprise concerned. More importantly, a horizontal comparison is needed so that the enterprise most suited to one's own needs can be selected from among many similar enterprises.
Disclosure of Invention
In order to overcome the above problems, the inventor has carried out intensive research and designed an enterprise risk evaluation system and method based on open source data. The system can crawl articles and other information related to a designated enterprise from the vast and complicated information on the network, divide the content relevant to enterprise risk into several large classes, set several subclasses under each large class, and evaluate each of them separately, so that a more scientific and reasonable result is obtained from an all-round evaluation. The system comprises a plurality of sub-modules, each of which evaluates factors influencing the development prospects of the enterprise according to the participles and their word frequencies, so that a reasonable enterprise score is finally obtained; the risk of the enterprise can be judged from the score, and different enterprises can be compared by the differences between their scores, thereby completing the invention.
Specifically, the invention aims to provide an enterprise risk evaluation system based on open source data, which comprises:
a data crawling module 1, used for crawling data from web pages;
a data word segmentation module 2, used for performing word segmentation on the data text crawled by the data crawling module 1 and counting word frequencies; and
a scoring module 3, used for giving an enterprise score according to the word frequencies of the participles.
The invention has the following advantages:
the enterprise risk evaluation system based on open source data can comprehensively obtain information on all aspects of an enterprise and score that information reasonably through the scoring module and the preset keywords, so that an enterprise risk evaluation can be obtained in a short time; the system is provided with four scoring sub-modules that score the enterprise from multiple aspects, so the scoring factors are sufficient and the scoring result is more scientific and reasonable.
Drawings
Fig. 1 is a schematic diagram illustrating an overall structure of an enterprise risk assessment system based on open source data according to a preferred embodiment of the present invention.
The reference numbers illustrate:
1-data crawling module
2-data word segmentation module
3-scoring module
11-input device
31-enterprise operation management scoring submodule
32-enterprise competitiveness scoring submodule
33-enterprise development prospect scoring submodule
34-industry development environment scoring submodule
311-enterprise key person index dimension determination part
312-enterprise social reputation index dimension determination part
313-public record index dimension determination part
321-enterprise innovation level index dimension determination part
322-brand influence index dimension determination part
331-enterprise investment and financing index dimension determination part
332-product update iteration index dimension determination part
333-product life cycle index dimension determination part
334-capital market dynamics index dimension determination part
341-industry prospect index dimension determination part
342-national policy index dimension determination part
Detailed Description
The invention is explained in more detail below with reference to the figures and examples. The features and advantages of the present invention will become more apparent from the description.
The word "exemplary" is used exclusively herein to mean "serving as an example, embodiment, or illustration". Any embodiment described herein as "exemplary" is not necessarily to be construed as preferred or advantageous over other embodiments. While the various aspects of the embodiments are presented in drawings, the drawings are not necessarily drawn to scale unless specifically indicated.
According to the enterprise risk evaluation system based on open source data provided by the invention, as shown in Fig. 1, the system comprises a data crawling module 1 used for crawling data from web pages. Preferably, the data crawling module 1 acquires data from open source web pages, the data including published articles, news reports, information disclosed by government authorities and the like.
Further preferably, an input device 11, such as a keyboard or a mouse, may be connected to the data crawling module 1; search information, such as the name of an enterprise, is input through the input device 11, and the data crawling module 1 crawls articles that contain or refer to that name.
The data crawling module 1 comprises a crawling engine, a crawler module, a downloading middleware, a downloading module, a crawler middleware and an element pipeline;
Specifically, the process by which the data crawling module 1 crawls web page data includes the following steps:
step 1, the crawling engine obtains an initial request from the crawler module,
step 2, the crawling engine places the request obtained from the crawler module into the task plan,
step 3, the task plan returns the next request to the crawling engine,
step 4, the crawling engine sends the request returned by the task plan to the downloading module through the downloading middleware,
step 5, the downloading module downloads the page, generates a response once the download is complete, and sends the response to the crawling engine through the downloading middleware,
step 6, after receiving the response sent by the downloading module, the crawling engine sends the response to the crawler module through the crawler middleware,
step 7, after processing the response sent by the crawling engine, the crawler module returns the crawled elements and new requests to the crawling engine through the crawler middleware,
step 8, the crawling engine sends the processed crawled elements to the element pipeline, then sends the processed requests to the task plan and waits for the next possible request,
step 9, steps 1-8 are repeated until the task plan has no new requests.
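The request/response cycle of steps 1-9 can be sketched as a single loop. The following is a minimal, self-contained Python sketch under stated assumptions, not the patented implementation: the names `crawl`, `fake_download` and `parse`, the in-memory `PAGES` table and the example URLs are all hypothetical, the task plan is modelled as a FIFO queue, both middlewares are omitted, and the `seen` set used to avoid re-queuing a page is an addition not described in the text.

```python
from collections import deque

# Hypothetical in-memory "web": URL -> (article text, outgoing links).
PAGES = {
    "http://example.test/a": ("Acme wins innovation prize", ["http://example.test/b"]),
    "http://example.test/b": ("Acme faces lawsuit", ["http://example.test/a"]),
}

def fake_download(url):
    """Stand-in for the downloading module: returns a response dict."""
    text, links = PAGES.get(url, ("", []))
    return {"url": url, "text": text, "links": links}

def parse(response):
    """Stand-in for the crawler module: returns one element and new requests."""
    element = {"url": response["url"], "text": response["text"]}
    return element, response["links"]

def crawl(start_urls):
    task_plan = deque(start_urls)   # steps 1-2: initial requests are queued
    seen = set(start_urls)          # assumption: each URL is requested once
    pipeline = []                   # stand-in for the element pipeline
    while task_plan:                # step 9: stop when no requests remain
        url = task_plan.popleft()                 # step 3: next request
        response = fake_download(url)             # steps 4-5: download
        element, new_requests = parse(response)   # steps 6-7: parse
        pipeline.append(element)    # step 8: element goes to the pipeline
        for nxt in new_requests:    # step 8: new requests go to the task plan
            if nxt not in seen:
                seen.add(nxt)
                task_plan.append(nxt)
    return pipeline
```

Calling `crawl(["http://example.test/a"])` visits both hypothetical pages once and returns their elements in crawl order.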
The system also comprises a data word segmentation module 2, used for performing word segmentation on the data text crawled by the data crawling module 1. Word segmentation means that an article is segmented into a plurality of phrases/participles, or that a plurality of phrases/participles are extracted from the article; all the phrases/participles are then aggregated to obtain the number of occurrences of each participle, namely the word frequency of the participle.
The word segmentation processing in the invention comprises the following steps:
step 1, a unique DOCID identifier is added to each public article crawled from the network,
step 2, the crawled articles are processed: the article information is divided into two categories, article basic information and article participles, which are sorted and summarized separately,
step 3, the processed data are stored in a database,
step 4, the word frequency of each participle is counted.
The article basic information comprises: DOCID, title, link address, author, release time, crawl time, and article keywords.
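The steps above can be illustrated with a short Python sketch. It is only an illustration under assumptions: `segment` is a placeholder whitespace tokenizer standing in for a real Chinese word segmentation tool, the function and field names are hypothetical, only three of the basic information fields are shown, and step 3 (database storage) is omitted.

```python
import uuid
from collections import Counter

def segment(text):
    """Placeholder tokenizer: lowercases and splits on whitespace.
    A real system would call a Chinese word segmentation tool here."""
    return text.lower().split()

def process_article(title, url, body):
    """Steps 1-2: attach a DOCID and separate basic info from participles."""
    doc_id = uuid.uuid4().hex                      # step 1: unique DOCID
    basic_info = {"DOCID": doc_id, "title": title, "link": url}
    return basic_info, segment(body)               # step 2: two categories

def word_frequencies(articles):
    """Step 4: aggregate participles over all articles and count them."""
    counts = Counter()
    for _info, participles in articles:
        counts.update(participles)
    return counts
```

For example, processing two articles whose bodies are "patent patent trademark" and "Patent lawsuit" gives a word frequency of 3 for "patent" under this placeholder tokenizer.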
In the present invention a conventional word segmentation method may be employed, and no particular method is required. Preferably, however, the word segmentation can be performed through the Chinese semantic open platform of Shanghai Bosen data technology, Inc. to obtain the article basic information.
In a preferred embodiment, emotion analysis is also performed on each crawled article during word segmentation. Emotion analysis means that the article is analyzed to obtain a non-negative probability and a negative probability, the two values summing to 1. For example, both probabilities can be obtained during word segmentation through the Chinese semantic open platform of Shanghai Bosen data technology, Inc.
Whether an article is positive publicity, negative publicity or neutral publicity is determined from the non-negative probability and the negative probability. Specifically, the article is judged to be neutral publicity when the difference between the non-negative probability and the negative probability is between -0.1 and 0.1, positive publicity when the difference is above 0.1, and negative publicity when the difference is below -0.1.
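The three-way rule above can be written as a small function. This is a minimal sketch; the function name and the string labels are hypothetical, and treating a difference of exactly 0.1 or -0.1 as neutral is an assumption, since the text does not say which side the boundary values fall on.

```python
def classify_article(non_negative_prob, negative_prob):
    """Classify one article from its two sentiment probabilities."""
    diff = non_negative_prob - negative_prob
    if diff > 0.1:
        return "positive"
    if diff < -0.1:
        return "negative"
    return "neutral"   # assumption: the boundary values count as neutral
```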
For a plurality of articles, each article is classified as positive, negative or neutral publicity by the method described above. The number of positive publicity articles and the number of negative publicity articles are then counted, and the ratio of the former to the latter is calculated. When the ratio is above 2, 5 or 10 points are added to the final score of the company; when the ratio is below 0.5, 5 or 10 points are deducted from the final score; and when the ratio is between 0.5 and 2, the final score is not adjusted. Preferably, 5 points are added when the ratio is above 2 and below 5, and 10 points are added when the ratio is above 5; 5 points are deducted when the ratio is below 0.5 and above 0.2, and 10 points are deducted when the ratio is below 0.2.
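The adjustment of the final score can be sketched as follows. The function name is hypothetical, and two points are assumptions because the text leaves them open: a ratio of exactly 5 is given the 5-point tier and a ratio of exactly 0.2 the minus-5-point tier, and when there are no negative articles the ratio is treated as infinite if any positive article exists and as 1 otherwise.

```python
def sentiment_adjustment(n_positive, n_negative):
    """Return the points to add to the final score (may be negative)."""
    if n_negative == 0:
        # Assumption: the source does not define the ratio in this case.
        ratio = float("inf") if n_positive > 0 else 1.0
    else:
        ratio = n_positive / n_negative
    if ratio > 5:
        return 10
    if ratio > 2:
        return 5
    if ratio < 0.2:
        return -10
    if ratio < 0.5:
        return -5
    return 0   # ratio between 0.5 and 2: no adjustment
```

With 30 positive and 10 negative articles the ratio is 3, so 5 points are added; with 1 positive and 10 negative articles the ratio is 0.1, so 10 points are deducted.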
The system also comprises a scoring module 3, used for giving an enterprise score according to the word frequencies of the participles.
Preferably, the scoring module 3 analyzes several aspects of the enterprise separately and sums the analysis results according to predetermined weight coefficients to obtain a final total score.
Further, the scoring module 3 analyzes four aspects of an enterprise: enterprise operation management, enterprise competitiveness, enterprise development prospect and industry development environment. Specifically, the scoring module 3 includes an enterprise operation management scoring submodule 31, an enterprise competitiveness scoring submodule 32, an enterprise development prospect scoring submodule 33 and an industry development environment scoring submodule 34; each sub-module calculates the score for its own aspect, and the sub-module scores are then weighted by their large-class weights and added to obtain the final score of the enterprise. Enterprise risk can be judged by horizontally comparing the scores of different enterprises; preferably, the higher the score, the better the condition of the enterprise and the lower the risk.
Preferably, the large-class weight coefficient of the enterprise operation management scoring submodule 31 is 0.4, the large-class weight coefficient of the enterprise competitiveness scoring submodule 32 is 0.2, the large-class weight coefficient of the enterprise development prospect scoring submodule 33 is 0.1, and the large-class weight coefficient of the industry development environment scoring submodule 34 is 0.3. The total score is the sum of the score of each sub-module multiplied by its large-class weight coefficient.
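The weighted sum can be illustrated with the preferred coefficients stated above. The dictionary keys and the function name are hypothetical labels chosen for this sketch, and the sub-module scores in the usage note are invented purely to show the arithmetic.

```python
# Preferred weight coefficients of sub-modules 31, 32, 33 and 34.
LARGE_CLASS_WEIGHTS = {
    "operation_management": 0.4,   # sub-module 31
    "competitiveness": 0.2,        # sub-module 32
    "development_prospect": 0.1,   # sub-module 33
    "industry_environment": 0.3,   # sub-module 34
}

def total_score(submodule_scores):
    """Sum of each sub-module score multiplied by its weight coefficient."""
    return sum(LARGE_CLASS_WEIGHTS[name] * score
               for name, score in submodule_scores.items())
```

For illustrative sub-module scores of 80, 60, 40 and 20 (in the order 31 to 34), the total is 0.4 x 80 + 0.2 x 60 + 0.1 x 40 + 0.3 x 20 = 54.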
In actual risk evaluation, an enterprise generally has advantages in some aspects and disadvantages in others, and balancing these has always been a problem. To solve it, each aspect is usually studied and analyzed separately, which costs great effort and time; moreover, different institutions and analysts work in different ways, so their results are difficult to combine properly, time is wasted and the expected effect is hard to achieve. The four scoring sub-modules provided in the invention therefore ensure reasonable and accurate scoring, unify the standard, improve efficiency, allow the relevant information to be acquired quickly and accurately, and make it convenient to obtain a risk rating by horizontally comparing the scores of similar enterprises.
Each sub-module considers a plurality of index dimensions; that is, each sub-module comprises two or more index dimension determination parts, and each index dimension determination part comprises two or more index items. Each index item is scored separately; the scores of the index items in one index dimension determination part are added and the sum is multiplied by the subclass weight of that index dimension to obtain the score of the index dimension, and the sum of the scores of the index dimensions is the score of the sub-module. Each index dimension corresponds to one subclass weight.
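The aggregation from index items to a sub-module score can be sketched in a few lines. The function names and the sample scores are hypothetical; the arithmetic follows the rule stated above (add the item scores of a dimension, multiply by its subclass weight, then add the dimension scores).

```python
def dimension_score(item_scores, subclass_weight):
    """Add the index item scores, then apply the subclass weight."""
    return sum(item_scores) * subclass_weight

def submodule_score(dimensions):
    """dimensions: iterable of (item_scores, subclass_weight) pairs."""
    return sum(dimension_score(items, weight) for items, weight in dimensions)
```

For two dimensions with subclass weight 0.5 each and illustrative item scores [10, 25, 0] and [50, 75], the dimension scores are 17.5 and 62.5, giving a sub-module score of 80.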
Specifically, each index dimension includes two or more index items, and one or more index keywords are stored in each index item. Based on these index keywords, the extracted participles are screened one by one to determine which index keywords occur among them; that is, the participles identical to index keywords are found and their word frequencies obtained, from which the score of the index item is derived.
The index keywords are the most suitable words obtained by the applicant through comprehensive analysis and repeated trials; they are representative words that are easy to segment and have little ambiguity, are frequently used in web reports, are highly relevant to the indexes and have high word frequencies.
Preferably, the sum of the word frequencies of all index keywords (hit keywords) in an index item is the total word frequency, and the word frequency of an index keyword not found among the participles is zero. Each index item also stores a judgment module, which determines the score of the index item according to the total word frequency or the content of the hit keywords.
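Keyword screening and the total word frequency can be sketched as below. The function names and the sample data are hypothetical; `word_freq` is assumed to be a mapping from participle to word frequency produced by the word segmentation module.

```python
def hit_keywords(word_freq, index_keywords):
    """Index keywords that appear among the participles."""
    return [k for k in index_keywords if word_freq.get(k, 0) > 0]

def total_word_frequency(word_freq, index_keywords):
    """Sum the frequencies of participles identical to an index keyword.
    Keywords that never occur contribute zero."""
    return sum(word_freq.get(k, 0) for k in index_keywords)
```

With word frequencies {"patent": 4, "trademark": 1, "lawsuit": 2} and the index keywords "patent" and "patent certificate", only "patent" is hit and the total word frequency is 4.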
Further preferably, the score given by the judgment rule is between -100 and 100, and the score corresponding to a specific index keyword may be positive or negative.
In a preferred embodiment, the enterprise operation management scoring submodule 31 includes an enterprise key person index dimension determination part 311, an enterprise social reputation index dimension determination part 312 and a public record index dimension determination part 313; the subclass weight of the enterprise key person index dimension is 0.3, the subclass weight of the enterprise social reputation index dimension is 0.3, and the subclass weight of the public record index dimension is 0.4.
The enterprise key person index dimension determination part 311 includes:
the director high Internet exposure index item, whose index keywords comprise summit, forum, annual meeting, innovation conference, exclusive interview, product release meeting, workshop and development conference;
the social responsibility index item, whose index keywords comprise public welfare, charitable, industry leaders and labourethes;
the negative news information index item, whose index keywords comprise leaving, negating, wind waves, falling horses, breaking, being checked and being investigated; and
the positive news information index item, whose index keywords comprise a prominent contribution prize, a character of the wind and cloud, a prominent character, a leader character, an optimal high-management character and an annual character.
The enterprise social reputation index dimension determination part 312 includes:
the prize-winning information index item, whose index keywords comprise prize winning enterprises, bank prizes, gold prizes, outstanding contribution prizes, special prize innovation prizes, excellent business prizes, medals, honor certificates and prize awarding grand ceremonies;
the commendation information index item, whose index keywords comprise medals, honor certificates and praise;
the industry public praise index item, whose index keywords comprise good reviews like tide, good reviews, phenomenon level, public praise prize, best, enterprise public praise and network public praise;
the public welfare information index item, whose index keywords comprise public service contribution prizes, charities, donations, contributions, public service activities and public service utilities;
the good commercial ethics index item, whose index keywords comprise commercial ethics enterprises; and
the commercial ethics corruption index item, whose index keywords comprise enterprise moral corruption and corruption.
The public record index dimension determination part 313 includes:
the administrative penalty information index item, whose index keywords comprise administrative punishment, responsibility dispute, illegal operation, suspected violation and punishment bulletin;
the administrative license information index item, whose index keywords comprise an operation license, an administrative license and a license;
the abnormal operation information index item, whose index keywords comprise an abnormal operation name list, an abnormal operation enterprise and an abnormal operation enterprise;
the tax negative information index item, whose index keywords comprise tax payment abnormity, tax evasion and tax payment declaration abnormity;
the media negative information index item, whose index keywords comprise suspicion, unqualified products, recall, rectification, labor dispute, negative news, referees, running and counterfeiting; and
the judicial litigation information index item, whose index keywords comprise infringement, abortion, prosecution and litigation.
Preferably, the judgment rules of the judgment module in the director high internet exposure index item and the social responsibility index item are as follows: when the total word frequency is 0, the score is 0, when the total word frequency is between (0-3), the score is 30, when the total word frequency is between [3-5), the score is 50, and when the total word frequency is between [5- ∞), the score is 100;
the judgment rules of the judgment modules in the prize-winning information index item, the commendation information index item and the public welfare information index item are as follows: when the total word frequency is 0, the score is 0, when the total word frequency is between (0-3), the score is 10, when the total word frequency is between [3-5), the score is 25, when the total word frequency is between [5-7), the score is 50, when the total word frequency is between [7-10), the score is 75, and when the total word frequency is between [10-∞), the score is 100;
the judgment rules of the judgment modules in the positive news information index item, the good commercial ethics index item, the industry public praise index item and the administrative license information index item are as follows: when the total word frequency is 0, the score is 0, when the total word frequency is between (0-5), the score is 10, when the total word frequency is between [5-10), the score is 25, when the total word frequency is between [10-15), the score is 50, when the total word frequency is between [15-20), the score is 75, and when the total word frequency is between [20-∞), the score is 100;
the judgment rules of the judgment modules in the negative news information index item, the commercial ethics corruption index item, the administrative penalty information index item, the abnormal operation information index item, the tax negative information index item, the media negative information index item and the judicial litigation information index item are as follows: when the total word frequency is 0, the score is 0, when the total word frequency is between (0-5), the score is -10, when the total word frequency is between [5-10), the score is -25, when the total word frequency is between [10-15), the score is -50, when the total word frequency is between [15-20), the score is -75, and when the total word frequency is between [20-∞), the score is -100.
In the present application, a square bracket "[" indicates that the boundary value is included, and a round bracket "(" or ")" indicates that the boundary value is excluded; for example, [5-10) denotes 5 or more and less than 10.
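The interval rules above all share one shape, so a judgment module can be sketched as a lookup over inclusive lower bounds. This is an illustrative sketch only: the names are hypothetical, total word frequencies are assumed to be non-negative integers (so the open interval (0-3) is modelled by a lower bound of 1), and only two of the rule sets are encoded.

```python
from bisect import bisect_right

def tier_score(total_freq, lower_bounds, scores):
    """Score of the interval [b_i, b_{i+1}) that contains total_freq.
    lower_bounds must be sorted; a total word frequency of 0 scores 0."""
    if total_freq <= 0:
        return 0
    return scores[bisect_right(lower_bounds, total_freq) - 1]

# Director high Internet exposure and social responsibility index items.
EXPOSURE_BOUNDS, EXPOSURE_SCORES = [1, 3, 5], [30, 50, 100]

# Negative index items (negative news, administrative penalty, and so on).
NEGATIVE_BOUNDS = [1, 5, 10, 15, 20]
NEGATIVE_SCORES = [-10, -25, -50, -75, -100]
```

Under these tables a total word frequency of 4 scores 50 for an exposure item and -10 for a negative item, while a total word frequency of 20 scores -100 for a negative item.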
In a preferred embodiment, the enterprise competitiveness scoring submodule 32 includes an enterprise innovation level index dimension determination part 321 and a brand influence index dimension determination part 322; the subclass weight of the enterprise innovation level index dimension is 0.5, and the subclass weight of the brand influence index dimension is 0.5.
The enterprise innovation level index dimension determination part 321 includes:
the patent application index item, whose index keywords comprise patents, patent inventions and patent certificates;
the trademark registration index item, whose index keywords comprise trademarks and trademark applications; and
the copyright publication index item, whose index keywords comprise copyright and copyright.
The brand influence index dimension determination part 322 includes:
the brand awareness index item, whose index keywords comprise awareness, an awareness enterprise, a passing brand, a non-inspection product and a reputation degree; and
the brand share index item, whose index keywords comprise market share and monopoly.
Preferably, the judgment rules of the judgment modules in the trademark registration index item, the copyright publication index item, the brand awareness index item and the brand share index item are as follows: when the total word frequency is 0, the score is 0, when the total word frequency is between (0-3), the score is 10, when the total word frequency is between [3-5), the score is 25, when the total word frequency is between [5-7), the score is 50, when the total word frequency is between [7-10), the score is 75, and when the total word frequency is between [10-∞), the score is 100;
the judgment rules of the judgment module in the patent application index item are as follows: when the total word frequency is 0, the score is 0, when the total word frequency is between (0-5), the score is 10, when the total word frequency is between [5-10), the score is 25, when the total word frequency is between [10-15), the score is 50, when the total word frequency is between [15-20), the score is 75, and when the total word frequency is between [20- ∞), the score is 100.
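As a worked example, the score of the enterprise competitiveness scoring submodule 32 can be computed end to end from total word frequencies. The total word frequencies used here (12, 4, 0, 8 and 2) are invented for illustration, the names are hypothetical, and total word frequencies are assumed to be non-negative integers; the interval tables and the 0.5 subclass weights follow the preferred embodiment above.

```python
def tier(freq, bounds, scores):
    """Score of the last interval whose inclusive lower bound is <= freq."""
    result = 0
    for bound, score in zip(bounds, scores):
        if freq >= bound:
            result = score
    return result

FIVE_TIER_SCORES = [10, 25, 50, 75, 100]
NARROW = [1, 3, 5, 7, 10]    # trademark, copyright, brand awareness, brand share
WIDE = [1, 5, 10, 15, 20]    # patent application

# Hypothetical total word frequencies for one enterprise.
innovation = [
    tier(12, WIDE, FIVE_TIER_SCORES),    # patent application: [10-15) gives 50
    tier(4, NARROW, FIVE_TIER_SCORES),   # trademark registration: [3-5) gives 25
    tier(0, NARROW, FIVE_TIER_SCORES),   # copyright publication: 0 gives 0
]
brand = [
    tier(8, NARROW, FIVE_TIER_SCORES),   # brand awareness: [7-10) gives 75
    tier(2, NARROW, FIVE_TIER_SCORES),   # brand share: (0-3) gives 10
]

# Each dimension: sum of item scores times its subclass weight (0.5 each).
competitiveness = sum(innovation) * 0.5 + sum(brand) * 0.5
```

Here the innovation items sum to 75 and the brand items to 85, so the sub-module score is 37.5 + 42.5 = 80.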
In a preferred embodiment, the enterprise development prospect scoring submodule 33 includes an enterprise investment and financing index dimension determination part 331, a product update iteration index dimension determination part 332, a product life cycle index dimension determination part 333 and a capital market dynamics index dimension determination part 334. The subclass weight of the enterprise investment and financing index dimension is 0.25, the subclass weight of the product update iteration index dimension is 0.25, the subclass weight of the product life cycle index dimension is 0.25, and the subclass weight of the capital market dynamics index dimension is 0.25.
The enterprise investment and financing index dimension determination unit 331 includes:
an external investment index item, wherein index keywords comprise investment and investment;
enterprise financing index items, wherein index keywords comprise listed listing, IPO, stock issuance, bond issuance, Angel turn, A turn, B turn, C turn and D turn;
the product update iteration index dimension determination unit 332 includes:
a new technology index item, wherein the index keywords comprise new technology investment, new technology, technical change and technical revolution;
an industry barrier breakthrough index item, wherein the index keywords comprise breaking the industry barrier and breaking the barrier;
a new product index item, wherein index keywords comprise a product release party;
the product life cycle index dimension determination unit 333 includes:
the investment period index item, wherein the index key words comprise money smashing, money burning and investment;
the index items of the maturity period, wherein the index keywords comprise stock price greatly rising, price competition and repeated purchase rate;
a decline period index item, wherein the index keywords comprise sales volume decreasing significantly;
the capital market dynamics index dimension determination section 334 includes:
positive dynamic index items, wherein index keywords comprise fluctuation, market value fluctuation, stock price surge and financing;
the negative dynamic index item, wherein the index keywords comprise stop, merger, recombination, rise and fall, performance slide, profit slide, sales fall, market value shrink, base price sale, big diving, continuous fall, deficit and debt default;
a good market condition index item, wherein the index keywords comprise 'rising' and 'rising',
a poor market condition index item, wherein the index keywords comprise 'fall'.
Preferably, the judgment rules of the judgment modules in the external investment index item and the enterprise financing index item are as follows: when the total word frequency is 0, the score is 0, when the total word frequency is between (0-3), the score is 10, when the total word frequency is between [3-5), the score is 25, when the total word frequency is between [5-7), the score is 50, when the total word frequency is between [7-10), the score is 75, and when the total word frequency is between [10- ∞), the score is 100;
the judgment rules of the judgment module in the new technical index item, the industry barrier breakthrough index item and the new product index item are as follows: when the total word frequency is 0, the score is 0, when the total word frequency is between (0-3), the score is 30, when the total word frequency is between [3-5), the score is 50, and when the total word frequency is between [5- ∞), the score is 100;
the judgment rules of the judgment module in the investment period index item are as follows: when the total word frequency is 0, the score is 0, when the total word frequency is between (0-3), the score is 30, and when the total word frequency is between [3- ∞), the score is 50;
the judgment rules of the judgment module in the maturity index items are as follows: when the total word frequency is 0, the score is 0, when the total word frequency is between (0-3), the score is 50, and when the total word frequency is between [3- ∞), the score is 100;
the judgment rules of the judgment module in the decline period index item are as follows: when the total word frequency is 0, the score is 0, when the total word frequency is between (0-3), the score is 10, and when the total word frequency is between [3- ∞), the score is 25;
the judgment rules of the judgment module in the positive dynamic index item and the market good market condition index item are as follows: when the total word frequency is 0, the score is 0, when the total word frequency is between (0-5), the score is 10, when the total word frequency is between [5-10), the score is 25, when the total word frequency is between [10-15), the score is 50, when the total word frequency is between [15-20), the score is 75, and when the total word frequency is between [20- ∞), the score is 100;
the judgment rules of the judgment module in the negative dynamic index item and the poor market condition index item are as follows: when the total word frequency is 0, the score is 0, when the total word frequency is between (0-5), the score is -10, when the total word frequency is between [5-10), the score is -25, when the total word frequency is between [10-15), the score is -50, when the total word frequency is between [15-20), the score is -75, and when the total word frequency is between [20- ∞), the score is -100.
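The tiered judgment rules above all map a total word frequency onto a score through half-open intervals. A minimal Python sketch of such a lookup follows; the function name `tier_score`, the table names and the `sign` parameter are illustrative assumptions and do not appear in the patent, and only three of the tier tables are transcribed.

```python
from bisect import bisect_right

# Each table is (interval lower bounds, scores). The score at index i applies
# to the i-th half-open interval; a total word frequency of 0 always scores 0.
# Table names are hypothetical; the numbers are transcribed from the rules above.
TIERS_3_5_7_10 = ([3, 5, 7, 10], [10, 25, 50, 75, 100])   # external investment, enterprise financing
TIERS_3_5 = ([3, 5], [30, 50, 100])                        # new technology, barrier breakthrough, new product
TIERS_5_10_15_20 = ([5, 10, 15, 20], [10, 25, 50, 75, 100])  # positive dynamic, good market condition


def tier_score(total_word_frequency, tiers, sign=1):
    """Return the score for a total word frequency; use sign=-1 for negative index items."""
    if total_word_frequency <= 0:
        return 0
    bounds, scores = tiers
    # bisect_right counts how many lower bounds are <= the frequency,
    # which is exactly the index of the interval the frequency falls in.
    return sign * scores[bisect_right(bounds, total_word_frequency)]


# Values taken from the worked example in this document:
print(tier_score(5, TIERS_3_5_7_10))             # enterprise financing, IPO hit 5 times
print(tier_score(81, TIERS_5_10_15_20, sign=-1))  # negative dynamic, 3 + 15 + 63 hits
```

Keeping each rule as a small data table means a new index item only needs a new pair of lists, not a new chain of conditionals.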
In a preferred embodiment, the industry development environment scoring submodule 34 includes an industry prospect index dimension determination part 341 and a national policy index dimension determination part 342; the subclass weight of the industry prospect index dimension is 0.5, and the subclass weight of the national policy index dimension is 0.5.
The industry prospect index dimension determination unit 341 includes:
an industry status index item, wherein the index keywords comprise broad prospects and unclear prospects, and may be extended to prospect optimism, prospect promptness and good prospects, all of which are treated as equivalent to broad prospects;
an industry analysis index item, wherein the index keywords comprise rapid rise, stable development, slow development and industry obstruction, and may be extended to rise, transformation upgrade, growth, rapid growth, outbreak and hot rod, all of which are treated as equivalent to rapid rise;
the national policy index dimension determination unit 342 includes:
supporting policy index items, wherein index keywords comprise financial support, fund support, enterprise income tax reduction and exemption and enterprise income tax exemption;
limiting policy index items, wherein index keywords comprise policy regulation limits, policy limits and limiting policies;
protection policy index items, wherein index keywords comprise protection policies and policy protection;
adjusting policy index items, wherein index keywords comprise adjusting policies and policy adjustments;
promoting policy index items, wherein index keywords comprise policy promoting and promoting policies;
a guide policy index item, wherein the index keywords comprise guide policy and policy guide.
Preferably, the judgment rule of the judgment module in the industry status index item is as follows: when the word frequency of the index keyword broad prospects is in [1- ∞), the score is 100; when the word frequency of broad prospects is 0 and the word frequency of the index keyword unclear prospects is in [1- ∞), the score is 50; when the word frequencies of both broad prospects and unclear prospects are 0, the score is 0;
the judgment rule of the judgment module in the industry analysis index item is as follows: when the word frequency of the index keyword rapid rise is in [1- ∞), the score is 100; when the word frequency of rapid rise is 0 and that of stable development is in [1- ∞), the score is 75; when the word frequencies of rapid rise and stable development are both 0 and that of slow development is in [1- ∞), the score is 50; when the word frequencies of rapid rise, stable development and slow development are all 0 and that of industry obstruction is in [1- ∞), the score is 25; when the total word frequency is 0, the score is 0;
the judgment rules of the judgment module in the support policy index item, the adjustment policy index item, the promotion policy index item and the guide policy index item are as follows: when the total word frequency is in [1- ∞), the score is 100;
the judgment rule of the judgment module in the restriction policy index item is as follows: when the total word frequency is in [1- ∞), the score is 0.
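Unlike the tiered rules, the industry status and industry analysis rules depend on which keyword is hit, checked in a fixed priority order. The following Python sketch shows one way to express that; the function name `first_hit_score` and the English keyword strings are illustrative assumptions, and the extended keywords that are treated as equivalents are assumed to have been folded into their base keyword beforehand.

```python
# Keywords in priority order with the score awarded when that keyword is the
# first one hit. The English strings stand in for the original Chinese keywords.
INDUSTRY_STATUS = [("broad prospects", 100), ("unclear prospects", 50)]
INDUSTRY_ANALYSIS = [
    ("rapid rise", 100),
    ("stable development", 75),
    ("slow development", 50),
    ("industry obstruction", 25),
]


def first_hit_score(frequencies, ranked_rules):
    """Return the score of the highest-priority keyword whose word frequency is at least 1."""
    for keyword, score in ranked_rules:
        if frequencies.get(keyword, 0) >= 1:
            return score
    return 0  # no keyword hit, so the total word frequency is 0


# Industry status values from the worked example in this document.
print(first_hit_score({"broad prospects": 0, "unclear prospects": 1}, INDUSTRY_STATUS))
```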
Under the specific index keywords and judgment rules above, a final score above 50 points may be regarded as qualified and a final score above 60 points as excellent; in practice, the scores of some enterprises can be negative.
The system also comprises a calculation module and a display device, wherein the calculation module is connected with each scoring submodule and used for obtaining the score given by each scoring submodule and calculating the final total score according to the large-class weight coefficient of each scoring submodule, and the display device is used for displaying input information such as enterprise names and also used for displaying the information such as the number of participles, the number of hit index keywords and the final total score.
The invention also provides an enterprise risk evaluation method based on the open source data, which is realized by the enterprise risk evaluation system based on the open source data.
Example:
Enterprise risk assessment is carried out taking China Petroleum and Natural Gas Co., Ltd. as an example. The names China Petroleum and Natural Gas Co., Ltd./China Petroleum are input to the data crawling module; the number of relevant web pages is 198, the crawled article total is 3574908, the number of keywords is 104477, and the final score is 77.75;
wherein, the 100 keywords with the highest word frequency are: (china, 1704), (enterprise, 967), (oil, 654), (company, 621), (central, 510), (natural gas, 504), (development, 504), (country, 498), (tour, 423), (price, 419), (reform, 394), (market, 380), (work, 373), (construction, 352), (group company, 333), (group, 333), (economy, 312), (booknote, 303), (problem, 298), (reporter, 287), (collaboration, 277), (limited company, 266), (project, 265), (energy, 255), (crude oil, 243), (industry, 239), (international, 237), (progress, 230), (representation, 227), (general manager, 224), (present, 224), (realization, 221), (state, 221), (tonics, 221), (dominant enterprise, 216), (technology, 212), (important, 211), (president, 211), (china, 209), (management, 204), (situation, 201), (industry, 199), (capacity, 191), (party, 186), (via, 183), (one, 177), (first, 177), (2015 year, 176), (simultaneously, 175), (resource, 175), (beijing, 174), (national, 174), (field, 168), (medium oil, 167), (propulsion, 166), (personnel, 166), (deputy, 166), (road, 165), (university, 165), (national enterprise, 165), (oil, 162), (pipeline, 161), (wherein, 159), (investment, 158), (development, 156), (lease, 155), (oil field, 152), (business, 151), (iran, 150), (became, 150), (commission, 149), (region, 149), (growth, 147) (consider, 147), (speciality, 144), (product, 144), (oil price, 144), (production, 141), (leader, 140), (facet, 140), (capital, 139), (party, 139), (mechanism, 139), (already, 136), (dollar, 136), (hub, 135), (petro, 131), (social, 130), (service, 129), (units, 128), (offer, 128), (department, 127), (as, 125), (since, 125), (main, 124), (responsibility, 123), (research, 122), (structure, 120), (level, 120), (adjustment, 120).
The score of each index dimension is as follows:
Enterprise key persons: 30.0
Enterprise social reputation: 13.5
Public record: -76.0
Enterprise innovation level: 55.0
Brand influence: 55.0
Enterprise investment and financing: 12.5
Product update iteration: 0.0
Product life cycle: 0.0
Capital market dynamics: 0.0
Industry prospect: 25.0
National policy: 200.0;
wherein the score of each index item is as follows:
Director high Internet exposure: 100 = peak meeting: 5 + forum: 26 + annual meeting: 3 + innovation conference: 0 + special visit: 5 + product release meeting: 0 + seminar: 6 + development conference: 0
Social responsibility: 100 = public welfare: 10 + charitable: 1 + industry leader: 0 + labor model: 1
Negative news information: -100 = leave job: 166 + leave job: 0 + negative: 2 + wind wave: 3 + horse falling: 1 + discipline: 0 + surveyed: 1
Forward news information: 0 = prominent contribution prize: 0 + Fengyun character: 0 + outstanding character: 0 + leader figure: 0 + best senior executive: 0 + annual figure: 0
Winning information: 0 = winning enterprise: 0 + silver prize: 0 + gold prize: 0 + outstanding contribution prize: 0 + special prize innovation prize: 0 + excellent business prize: 0 + medal: 0 + honor certificate: 0 + awards grand ceremony: 0
Commendation information: 10 = medal: 0 + honor certificate: 0 + goodness: 1
Industry public praise: 25 = good reviews like tide: 0 + good reviews: 1 + phenomenon level: 0 + public praise prize: 0 + best: 5 + enterprise public praise: 0 + network public praise: 0
Public welfare information: 10 = public welfare contribution prize: 0 + charitable: 1 + donation: 0 + public welfare activity: 0 + public welfare career: 0
Good commercial morality: 0 = commercial moral enterprise: 0 + moral enterprise: 0
Commercial moral corruption: 0 = enterprise moral corruption: 0 + corruption: 0
Administrative punishment information: 0 = administrative punishment: 0 + responsibility dispute: 0 + illegal operation: 0 + suspected violation: 0 + punishment bulletin: 0
Administrative licensing information: 10 = operation permission: 0 + administrative permission: 0 + license: 1
Abnormal operation information: 0 = abnormal operation name list: 0 + abnormal operation enterprise: 0
Tax negative information: 0 = tax payment abnormity: 0 + tax evasion: 0 + tax payment declaration abnormity: 0
Media negative information: -100 = suspected: 94 + unqualified product: 0 + recall: 0 + rectification: 45 + labor dispute: 0 + negative news: 0 + referee: 3 + running: 0 + fraud: 0
Judicial litigation information: -100 = infringement: 1 + seption: 0 + prosecution: 22 + litigation: 9
Patent application: 10 = patent: 2 + invention patent: 0 + patent certificate: 0
Trademark registration: 0 = brand: 0 + trademark application: 0
Copyright publication: 100 = copyright: 11 + copyright: 0
Brand awareness: 10 = degree of awareness: 2 + well-known enterprise: 0 + brand name coming down: 0 + non-inspection product: 0 + reputation degree: 0
Brand share: 100 = market share: 4 + monopoly: 33
External investment: 0 = external investment: 0 + investment ridge: 0
Enterprise financing: 50 = listing: 0 + IPO: 5 + stock issuance: 0 + bond issuance: 0 + Angel round: 0 + round A: 0 + round B: 0 + round C: 0 + round D: 0
New technology: 0 = new technology investment: 0 + new technology: 0 + technical change: 0 + technical revolution: 0
Industry barrier breakthrough: 0 = breaking the industry barrier: 0 + breaking the barrier: 0
New product: 0 = product release: 0
Input period: 0 = money smashing: 0 + money burning: 0 + marketing: 0
Maturity period: 0 = stock price greatly increased: 0 + price competition: 0 + repeat purchase rate: 0
Decline period: 0 = sales volume decreased significantly: 0
Positive dynamic: 100 = swelling: 3 + market value swelling: 0 + stock price soaring: 0 + financing: 63
Negative dynamic: -100 = stop: 3 + merger: 15 + recombination: 63 + rise and fall: 0 + performance slide: 0 + profit slide: 0 + sales fall: 0 + market value shrink: 0 + base price sale: 0 + big diving: 0 + continuous fall: 0 + deficit: 0 + debt default: 0
Good market condition: 0 = rise: 0 + swelling: 0
Poor market condition: 0 = fall: 0
Industry status: 50 = broad prospects: 0 + unclear prospects: 1
Industry analysis: 0 = rapid rise: 0 + stable development: 0 + slow development: 0 + industry obstruction: 0
Support policy: 100 = financial support: 17 + fund support: 3 + enterprise income tax reduction and exemption: 0 + enterprise income tax exemption: 0
Restriction policy: -100 = policy regulation limit: 10 + policy limit: 17
Protection policy: 100 = protection policy: 8 + policy protection: 10
Adjustment policy: 100 = adjustment policy: 2 + policy adjustment: 1
Promotion policy: 100 = policy promotion: 1 + promotion policy: 7
Guide policy: 100 = guide policy: 2 + policy guide: 1
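As a check on the arithmetic, the eleven dimension scores listed in this example can be combined with the large-class weight coefficients stated in claim 1 (0.4, 0.2, 0.1 and 0.3). The Python sketch below is illustrative only: the dictionary layout and the function name `final_score` are assumptions, as is the placement of the enterprise innovation level and brand influence dimensions under the enterprise competitiveness sub-module.

```python
# Dimension scores from the worked example, grouped by scoring sub-module.
# Each dimension score already includes its subclass weight.
DIMENSION_SCORES = {
    "enterprise operation management": [30.0, 13.5, -76.0],
    "enterprise competitiveness": [55.0, 55.0],  # grouping assumed
    "enterprise development prospect": [12.5, 0.0, 0.0, 0.0],
    "industry development environment": [25.0, 200.0],
}

# Large-class weight coefficients as stated in claim 1.
LARGE_CLASS_WEIGHTS = {
    "enterprise operation management": 0.4,
    "enterprise competitiveness": 0.2,
    "enterprise development prospect": 0.1,
    "industry development environment": 0.3,
}


def final_score(dimension_scores, weights):
    """Sum the dimension scores of each sub-module, then add them by large-class weight."""
    return sum(weights[name] * sum(scores) for name, scores in dimension_scores.items())


print(round(final_score(DIMENSION_SCORES, LARGE_CLASS_WEIGHTS), 2))  # 77.75
```

The result agrees with the final score of 77.75 reported for this example.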
Then, a questionnaire survey is conducted among employees in specific posts at China Petroleum and Natural Gas Co., Ltd.; 100 people in total are surveyed, including employees of the legal department, the financial department, the administrative department, the personnel department and the business department, middle-level managers and representatives of some branch companies. The questionnaire covers all the index items of the invention, namely the director high Internet exposure index item, the social responsibility index item, the negative news information index item, the forward news information index item, the winning information index item, the commendation information index item, the industry public praise index item, the public welfare information index item, the commercial moral index item, the administrative punishment information index item, the administrative licensing information index item, the abnormal operation information index item, the tax negative information index item, the media negative information index item, the judicial litigation information index item, the patent application index item, the trademark registration index item, the copyright publication index item, the brand awareness index item, the brand share index item, the external investment index item, the enterprise financing index item, the new technology index item, the industry barrier breakthrough index item, the new product index item, the input period index item, the maturity index item, the decline period index item, the positive dynamic index item, the negative dynamic index item, the good market condition index item, the poor market condition index item, the industry status index item, the industry analysis index item, the support policy index item, the restriction policy index item, the protection policy index item, the adjustment policy index item, the promotion policy index item and the guidance policy index item;
each index item offers 5 options; for example, the options for the social responsibility index item include high social responsibility, average social responsibility, low social responsibility and unknown; as another example, the options for the negative dynamic index item include many negative dynamics, generally many negative dynamics, few negative dynamics and unknown;
after the completed questionnaires are collected, the number of people selecting each option of each index item is counted; the option selected by the most people, excluding the unknown option, is taken as the questionnaire result for that index item, and if two options are selected by the same number of people, the option listed earlier is taken as the result; the final statistics are given in Table 1 below:
Table 1
[Table 1 is reproduced as images BDA0001447789650000201 and BDA0001447789650000211 in the original publication]
For index items whose scores are positive, the four options from first to last correspond to 100 points, 65 points, 30 points and 0 points respectively; for index items whose scores are negative, the four options from first to last correspond to -100 points, -65 points, -30 points and 0 points respectively. The final score is then calculated according to the subclass weights and large-class weights of the invention; the final score of this questionnaire survey is 82.3 points;
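The questionnaire tallying rule (most-selected option wins, the unknown option is excluded, ties go to the option listed earlier) and the option-to-points mapping can be sketched as follows. This is a minimal Python illustration: the function name, the option labels "A" to "D" and the choice to return 0 when every respondent answers unknown are assumptions not stated in the patent.

```python
from collections import Counter

# Points for the four scored options, from the first option to the last.
OPTION_POINTS = [100, 65, 30, 0]


def questionnaire_item_score(answers, options, negative=False):
    """Score one index item from the list of options chosen by respondents.

    `options` holds the four scored option labels in questionnaire order;
    any other answer (for instance the unknown option) is ignored.
    """
    counts = Counter(answer for answer in answers if answer in options)
    if not counts:
        return 0  # assumption: nobody chose a scored option
    # max() returns the first maximal element, so a tie goes to the earlier option.
    winner = max(options, key=lambda option: counts[option])
    points = OPTION_POINTS[options.index(winner)]
    return -points if negative else points


options = ["A", "B", "C", "D"]  # hypothetical labels for the four scored options
answers = ["A"] * 40 + ["B"] * 40 + ["unknown"] * 20
print(questionnaire_item_score(answers, options))                 # tie resolved in favour of "A"
print(questionnaire_item_score(answers, options, negative=True))  # same result for a negative index item
```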
to further verify the reliability and rationality of the system and method provided by the invention, risk assessment is also carried out on other companies using the system, and questionnaire surveys are conducted among the employees of each of those companies, giving the comparison results shown in Table 2 below;
Table 2
Company name | System evaluation score | Questionnaire score
China Petrochemical Corporation | 77.75 | 82.3
State Grid Corporation | 80.16 | 88.1
China Mobile communication group Co | 75.83 | 84.25
CHINA RAILWAY CONSTRUCTION Corp. | 79.51 | 85.87
China Life insurance (group) corporation | 76.1 | 80.32
Beijing automobile group | 73.5 | 79.2
CHINA DATANG Corp.,Ltd. | 74.19 | 78.85
According to these results, the scores given by the evaluation system of the invention are about 4-8 points lower than the questionnaire scores from employees, but the gap is stable across companies and the scores of companies in good operating condition do not differ greatly, which indicates that the system provided by the invention is highly reasonable and stable.
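The gap of about 4-8 points can be checked directly from the seven rows of the comparison table. The Python sketch below is only a verification aid; the variable names are illustrative and the company names are copied as they appear in the table.

```python
# (system evaluation score, questionnaire score) per company, from the comparison table.
COMPARISON = {
    "China Petrochemical Corporation": (77.75, 82.3),
    "State Grid Corporation": (80.16, 88.1),
    "China Mobile communication group Co": (75.83, 84.25),
    "CHINA RAILWAY CONSTRUCTION Corp.": (79.51, 85.87),
    "China Life insurance (group) corporation": (76.1, 80.32),
    "Beijing automobile group": (73.5, 79.2),
    "CHINA DATANG Corp.,Ltd.": (74.19, 78.85),
}

# Questionnaire score minus system score, rounded to two decimals.
gaps = {name: round(survey - system, 2) for name, (system, survey) in COMPARISON.items()}

for name, gap in gaps.items():
    print(f"{name}: {gap}")
print("smallest gap:", min(gaps.values()), "largest gap:", max(gaps.values()))
```

The smallest gap is 4.22 points and the largest is 8.42 points, consistent with the range stated above.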
Further analyzing the scoring method and the questionnaire survey results provided by the invention, it can be known that the major weight coefficients and the minor weight coefficients have significant influence on the final scoring result, for example, the importance of enterprise operation management and industry development environment is higher than that of enterprise competitiveness and development prospect, and the major weight coefficients of the invention are designed under the condition of sorting and analyzing a large amount of data and combining the Chinese situation, so as to balance the relative importance relationship of each aspect, namely the influence degree on enterprise risk, through the weight distribution proportion;
in each scoring module, corresponding subclass weight coefficients are set for the different sub-modules to balance the contribution of each scoring sub-module to enterprise risk. For example, in the enterprise operation management scoring sub-module, the public records of an enterprise represent its operating condition more scientifically and objectively than its key persons and social reputation do, and have a larger influence on enterprise risk. The source of the data is also considered, and judgment and scoring rules are selected according to the characteristics of the data from that source. For example, when the judgment rule of the judgment module in the national policy index items is set, policy information is harder to obtain through web crawling, so less information is obtained and there is correspondingly less interfering information; a higher score can therefore be given when only a small number of keywords are hit. By contrast, for the judgment rules of the judgment module in items such as the forward news information index item, a higher score is given only when enough keywords are hit, and multiple score tiers are set, so that the final score is more scientific and reasonable and the influence of interfering information is reduced;
therefore, the enterprise risk evaluation system and the enterprise risk evaluation method provided by the invention are used for analyzing factors influencing enterprise risks more comprehensively, obtaining enterprise risk evaluation results more quickly, and distributing proportion relations among the enterprise risk influencing factors more reasonably, so as to obtain scientific and reasonable enterprise risk evaluation results which are close to real conditions;
on the basis of mass data analysis, in combination with actual conditions, the system and the method provided by the invention are provided with a plurality of scoring modules, a plurality of scoring sub-modules, corresponding judging modules and corresponding judging rules;
the resulting system and method can conveniently and rapidly obtain comprehensive data related to an enterprise and weight that data scientifically and reasonably, so that enterprise risk evaluation results that are more valuable and closer to the real situation are obtained; related enterprises can also be compared side by side quickly, and the relative risk among enterprises can be clearly understood by comparing their final scores.
The present invention has been described above in connection with preferred embodiments, but these embodiments are merely exemplary and merely illustrative. On the basis of the above, the invention can be subjected to various substitutions and modifications, and the substitutions and the modifications are all within the protection scope of the invention.

Claims (7)

1. An enterprise risk evaluation system based on open source data is characterized by comprising
A data crawling module (1) for crawling data from a web page,
the data word segmentation module (2) is used for performing word segmentation processing on the data text crawled by the data crawling module (1) and counting word frequency;
the scoring module (3) is used for giving enterprise scores according to the word frequency of the participles;
an input device (11) is externally connected to the data crawling module (1), and retrieval information is input through the input device (11);
the process of crawling the webpage data by the data crawling module (1) comprises the following steps:
step 1, the crawling engine obtains an initial request from a crawler module,
step 2, the crawling engine lists the request obtained from the crawler module into a task plan,
step 3, the task plan returns the next request to the crawling engine,
step 4, the crawling engine sends the request returned by the task plan to the downloading module through the downloading middleware,
step 5, the downloading module downloads the page, generates a response when the downloading module downloads the page, and sends the response to the crawling engine through the downloading middleware,
step 6, after receiving the response sent by the downloading module, the crawling engine sends the response to the crawler module through the crawler middleware,
step 7, after the crawler module processes the response sent by the crawling engine, the crawler module returns the crawling element and a new request to the crawling engine through the crawler middleware,
step 8, the crawling engine sends the processed crawling element to the element pipeline, then sends a processing request to the task plan and waits for the next possible request,
step 9, repeating the steps 1-8 until there are no new requests in the task plan;
sentiment analysis is also performed on the crawled articles during word segmentation processing, yielding a non-negative probability and a negative probability,
judging the article as neutral publicity when the difference between the non-negative probability and the negative probability is between -0.1 and 0.1; judging the article as positive publicity when the difference between the non-negative probability and the negative probability is above 0.1; judging the article as negative publicity when the difference between the non-negative probability and the negative probability is below -0.1;
the scoring module (3) comprises one or more of an enterprise operation management scoring submodule (31), an enterprise competitiveness scoring submodule (32), an enterprise development prospect scoring submodule (33) and an industry development environment scoring submodule (34), scores of all submodules included in the scoring module (3) are obtained respectively, and then the scores are added according to the large-class weight to obtain the final score of the enterprise;
the large-class weight coefficient of the enterprise operation management scoring submodule (31) is 0.4, the large-class weight coefficient of the enterprise competitiveness scoring submodule (32) is 0.2, the large-class weight coefficient of the enterprise development prospect scoring submodule (33) is 0.1, and the large-class weight coefficient of the industry development environment scoring submodule (34) is 0.3;
each sub-module comprises more than two index dimension judgment parts,
each index dimension judgment part comprises more than two index items, and each index item is respectively scored;
adding the scores of all index items in one index dimension judgment part, and multiplying the sum by the subclass weight of the index dimension to obtain the score of the index dimension;
the sum of the scores of all the index dimensions is the score of the submodule;
more than one index key word is stored in each index item, a participle which is the same as the index key word is found out from the participles extracted by the data participle module (2), and the word frequency of the participle is obtained;
a judging module is also stored in each index item, and the judging module judges the score of each index item according to the total word frequency or the content of the hit keywords;
and the total word frequency is the sum of the word frequencies of the participles that hit any of the index keywords in the index item.
2. The open-source-data-based enterprise risk assessment system of claim 1,
the enterprise operation management scoring submodule (31) comprises an enterprise key index dimension judgment part (311), an enterprise social reputation index dimension judgment part (312) and a public record index dimension judgment part (313);
wherein the enterprise key index dimension determination unit (311) includes:
the director altitude Internet exposure index item, wherein the index key words include Peak meeting, Forum, annual meeting, Innovation conference, Special visit, product release conference, workshop and development conference,
social responsibility index items, wherein index keywords comprise public welfare, charitable, industry leaders and labor models,
negative news information index items, wherein index keywords comprise leaving, negating, wind wave, falling horse, breaking, being checked and being investigated,
the method comprises the steps of (1) a forward news information index item, wherein index keywords comprise a prominent contribution prize, a Fengyun character, a prominent character, a leader character, an optimal high-ranking character and an annual character;
the enterprise social reputation index dimension determination section (312) includes:
the prize winning information index item, wherein the index keywords comprise prize winning enterprises, silver prizes, gold prizes, outstanding contribution prizes, special prize innovation prizes, excellent business prizes, medals, honor certificates and prize awarding grand ceremonies,
show out information index items, wherein the index keywords comprise medals, honor certificates and goodness,
the industry public praise index item, wherein the index key words comprise good reviews like tide, good reviews, phenomenon level, public praise prize, best, enterprise public praise and network public praise,
public welfare information index items, wherein index keywords comprise public welfare contribution prizes, charities, donations, contributions, public welfare activities and public welfare careers,
the good index item of commercial moral, wherein the index key words comprise commercial moral enterprises and moral enterprises,
the business moral corruption index item, wherein the index key words comprise enterprise moral corruption and corruption;
the public record index dimension determination unit (313) includes:
an administrative punishment information index item, wherein the index key words comprise administrative punishment, responsibility dispute, illegal operation, suspected violation and punishment bulletin,
an administrative permission information index item, wherein index keywords comprise an operation permission, an administrative permission and a license,
an abnormal operation information index item, wherein the index keywords comprise abnormal operation name list, abnormal operation enterprise and abnormal operation enterprise,
the tax negative information index item, wherein the index key words comprise tax payment abnormity, tax evasion and tax payment declaration abnormity,
media negative information index items, wherein the index keywords comprise suspicion, unqualified products, recall, rectification, labor dispute, negative news, referee, running, counterfeiting, and
the judicial litigation information index items, wherein the index keywords comprise infringement, abortion, prosecution and litigation;
wherein,
the judgment rules of the judgment module in the board of director high internet exposure index item and the social responsibility index item are as follows: when the total word frequency is 0, the score is 0, when the total word frequency is between (0-3), the score is 30, when the total word frequency is between [3-5), the score is 50, and when the total word frequency is between [5- ∞), the score is 100;
the judgment rules of the judgment module in the winning information index item, the table and the public welfare information index item are as follows: when the total word frequency is 0, the score is 0, when the total word frequency is between (0-3), the score is 10, when the total word frequency is between [3-5), the score is 25, when the total word frequency is between [5-7), the score is 50, when the total word frequency is between [7-10), the score is 75, and when the total word frequency is between [10- ∞), the score is 100;
the judgment rules of the judgment modules in the forward news information index item, the commercial moral well index item, the industry public praise index item and the administrative license information index item are as follows: when the total word frequency is 0, the score is 0, when the total word frequency is between (0-5), the score is 10, when the total word frequency is between [5-10), the score is 25, when the total word frequency is between [10-15), the score is 50, when the total word frequency is between [15-20), the score is 75, and when the total word frequency is between [20- ∞), the score is 100;
the judgment rules of the judgment modules in the negative news information index item, the commercial moral corruption index item, the administrative penalty information index item, the abnormal operation information index item, the tax negative information index item, the media negative information index item and the judicial litigation information index item are as follows: when the total word frequency is 0, the score is 0, when the total word frequency is between (0-5), the score is-10, when the total word frequency is between [5-10), the score is-25, when the total word frequency is between [10-15), the score is-50, when the total word frequency is between [15-20), the score is-75, and when the total word frequency is between [ 20-infinity), the score is-100.
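The four judgment rules of claim 2 are step functions of the total word frequency, so they can be sketched as a breakpoint lookup. The following is only an illustrative sketch, not the patented implementation: the rule names ("exposure", "award", "positive", "negative") and the function name `score` are hypothetical labels introduced here, and the total word frequency is assumed to be a non-negative integer.

```python
from bisect import bisect_right

# Each rule: (lower bounds of the half-open intervals after the first one,
#             score for each interval). Rule names are hypothetical labels.
RULES = {
    # board of director high internet exposure, social responsibility
    "exposure": ([3, 5], [30, 50, 100]),
    # winning information, show out information, public welfare information
    "award": ([3, 5, 7, 10], [10, 25, 50, 75, 100]),
    # forward news, commercial moral well, industry public praise, administrative license
    "positive": ([5, 10, 15, 20], [10, 25, 50, 75, 100]),
    # negative news, moral corruption, administrative penalty, abnormal operation,
    # tax negative, media negative, judicial litigation
    "negative": ([5, 10, 15, 20], [-10, -25, -50, -75, -100]),
}


def score(rule: str, total_word_frequency: int) -> int:
    """Map a total word frequency to the score of the named rule."""
    if total_word_frequency < 0:
        raise ValueError("total word frequency cannot be negative")
    if total_word_frequency == 0:
        return 0  # every rule in claim 2 scores 0 at frequency 0
    breakpoints, scores = RULES[rule]
    # bisect_right places a frequency equal to a breakpoint in the
    # interval that starts there, matching the closed lower bounds.
    return scores[bisect_right(breakpoints, total_word_frequency)]
```

For example, under these assumptions `score("award", 7)` falls in [7, 10) and yields 75, while `score("negative", 12)` falls in [10, 15) and yields -50.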
3. The open-source-data-based enterprise risk assessment system of claim 1,
the enterprise competitiveness scoring submodule (32) comprises an enterprise innovation level index dimension judgment part (321) and a brand influence index dimension judgment part (322);
wherein the enterprise innovation level index dimension determination unit (321) includes:
the patent application index item, wherein the index keywords comprise patents, invention patents and patent certificates,
the index item of trademark registration, wherein the index keyword comprises a trademark and a trademark application,
the copyright publication index item, wherein the index keywords comprise copyright and copyright;
the brand influence index dimension determination unit (322) includes:
the index key words of the brand awareness index item comprise awareness, an awareness enterprise, a passing brand, a non-inspection product and a reputation,
the brand share index item, wherein the index key words comprise market share and monopoly.
4. The open-source-data-based enterprise risk assessment system of claim 3,
the judgment rules of the judgment module in the trademark registration index item, the copyright publication index item, the brand awareness index item and the brand share index item are as follows: when the total word frequency is 0, the score is 0, when the total word frequency is in (0, 3), the score is 10, when the total word frequency is in [3, 5), the score is 25, when the total word frequency is in [5, 7), the score is 50, when the total word frequency is in [7, 10), the score is 75, and when the total word frequency is in [10, ∞), the score is 100;
the judgment rules of the judgment module in the patent application index item are as follows: when the total word frequency is 0, the score is 0, when the total word frequency is in (0, 5), the score is 10, when the total word frequency is in [5, 10), the score is 25, when the total word frequency is in [10, 15), the score is 50, when the total word frequency is in [15, 20), the score is 75, and when the total word frequency is in [20, ∞), the score is 100.
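As a hedged illustration of how a total word frequency might be obtained and then scored for the trademark registration index item, the sketch below sums plain substring counts over the item's keywords. The counting method is an assumption: the claims do not state how occurrences are counted or how overlapping keywords are treated, so here a keyword contained in a longer keyword is counted under both. The English keywords stand in for the original Chinese terms, and the function names are hypothetical.

```python
# English stand-ins for the trademark registration index keywords (claim 3).
TRADEMARK_KEYWORDS = ("trademark", "trademark application")


def total_word_frequency(text: str, keywords) -> int:
    """Sum the non-overlapping substring counts of every keyword (assumed method)."""
    lowered = text.lower()
    return sum(lowered.count(keyword.lower()) for keyword in keywords)


def trademark_score(n: int) -> int:
    """Claim 4 rule for the trademark registration index item."""
    if n == 0:
        return 0
    if n < 3:
        return 10
    if n < 5:
        return 25
    if n < 7:
        return 50
    if n < 10:
        return 75
    return 100


text = ("The company filed a trademark application in May. "
        "The trademark was registered in June.")
n = total_word_frequency(text, TRADEMARK_KEYWORDS)
print(n, trademark_score(n))  # prints "3 25"
```

Here "trademark" matches twice and "trademark application" once, so the total word frequency is 3, which falls in [3, 5) and scores 25.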
5. The open-source-data-based enterprise risk assessment system of claim 1,
the enterprise development prospect scoring submodule (33) comprises an enterprise investment and financing index dimension judgment part (331), a product updating iteration index dimension judgment part (332), a product life cycle index dimension judgment part (333) and a capital market dynamic index dimension judgment part (334);
wherein, the enterprise investment and financing index dimension judgment part (331) comprises:
an external investment index item, wherein index keywords comprise investment and investment;
enterprise financing index items, wherein index keywords comprise listed listing, IPO, stock issuance, bond issuance, angel round, A round, B round, C round and D round;
the product update iteration index dimension determination unit (332) includes:
the new technical index item, wherein the index keywords comprise new technical investment, new technology, technical change and technical revolution;
the industry barrier breakthrough index item, wherein the index keywords comprise the industry barrier being broken and being broken through;
a new product index item, wherein index keywords comprise a product release party;
the product life cycle index dimension determination unit (333) includes:
the investment period index item, wherein the index key words comprise money smashing, money burning and investment;
the maturity period index item, wherein the index keywords comprise stock price greatly rising, price competition and repeated purchase rate;
the decline period index item, wherein the index keywords comprise sales volume which is obviously reduced;
the capital market dynamics index dimension determination section (334) includes:
positive dynamic index items, wherein index keywords comprise fluctuation, market value fluctuation, stock price surge and financing,
negative dynamic index items, wherein index keywords comprise stop, merger, recombination, rise and fall, performance slide, profit slide, sales fall, market value shrink, base price sale, big diving, continuous fall, deficit and debt default,
good market quotation index items, wherein index keywords comprise 'rising' and 'rising',
bad market quotation index items, wherein the index keywords comprise 'fall';
wherein,
the judgment rules of the judgment module in the external investment index item and the enterprise financing index item are as follows: when the total word frequency is 0, the score is 0, when the total word frequency is in (0, 3), the score is 10, when the total word frequency is in [3, 5), the score is 25, when the total word frequency is in [5, 7), the score is 50, when the total word frequency is in [7, 10), the score is 75, and when the total word frequency is in [10, ∞), the score is 100;
the judgment rules of the judgment module in the new technical index item, the industry barrier breakthrough index item and the new product index item are as follows: when the total word frequency is 0, the score is 0, when the total word frequency is in (0, 3), the score is 30, when the total word frequency is in [3, 5), the score is 50, and when the total word frequency is in [5, ∞), the score is 100;
the judgment rules of the judgment module in the investment period index item are as follows: when the total word frequency is 0, the score is 0, when the total word frequency is in (0, 3), the score is 30, and when the total word frequency is in [3, ∞), the score is 50;
the judgment rules of the judgment module in the maturity period index item are as follows: when the total word frequency is 0, the score is 0, when the total word frequency is in (0, 3), the score is 50, and when the total word frequency is in [3, ∞), the score is 100;
the judgment rules of the judgment module in the decline period index item are as follows: when the total word frequency is 0, the score is 0, when the total word frequency is in (0, 3), the score is 10, and when the total word frequency is in [3, ∞), the score is 25;
the judgment rules of the judgment module in the positive dynamic index item and the good market quotation index item are as follows: when the total word frequency is 0, the score is 0, when the total word frequency is in (0, 5), the score is 10, when the total word frequency is in [5, 10), the score is 25, when the total word frequency is in [10, 15), the score is 50, when the total word frequency is in [15, 20), the score is 75, and when the total word frequency is in [20, ∞), the score is 100;
the judgment rules of the judgment module in the negative dynamic index item and the bad market quotation index item are as follows: when the total word frequency is 0, the score is 0, when the total word frequency is in (0, 5), the score is -10, when the total word frequency is in [5, 10), the score is -25, when the total word frequency is in [10, 15), the score is -50, when the total word frequency is in [15, 20), the score is -75, and when the total word frequency is in [20, ∞), the score is -100.
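Two features of the claim 5 rules are easy to miss in prose: the three product life cycle items share the same breakpoints but are capped at different scores, and the negative capital market rule is the positive rule with every score negated. The sketch below makes both explicit. It is an illustration only; the names `step_score`, `LIFE_CYCLE`, `POSITIVE_DYNAMIC` and `NEGATIVE_DYNAMIC` are hypothetical, and integer word frequencies are assumed, so the open interval (0, 3) is represented by a lower bound of 1.

```python
def step_score(n: int, steps) -> int:
    """Return the score of the last step whose lower bound is <= n.

    `steps` is a list of (lower_bound, score) pairs sorted by lower bound.
    A total word frequency of 0 always scores 0.
    """
    if n <= 0:
        return 0
    result = 0
    for lower_bound, score in steps:
        if n >= lower_bound:
            result = score
    return result


# Product life cycle index items: same breakpoints, different score caps.
LIFE_CYCLE = {
    "investment period": [(1, 30), (3, 50)],
    "maturity period": [(1, 50), (3, 100)],
    "decline period": [(1, 10), (3, 25)],
}

# Capital market dynamics: the negative rule mirrors the positive one.
POSITIVE_DYNAMIC = [(1, 10), (5, 25), (10, 50), (15, 75), (20, 100)]
NEGATIVE_DYNAMIC = [(lower, -score) for lower, score in POSITIVE_DYNAMIC]
```

With these tables, four mentions of decline period keywords give `step_score(4, LIFE_CYCLE["decline period"])`, which is 25, whereas four mentions of maturity period keywords give 100.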
6. The open-source-data-based enterprise risk assessment system of claim 1,
the industry development environment scoring submodule (34) comprises an industry prospect index dimension judgment part (341) and a national policy index dimension judgment part (342);
wherein the industry prospect index dimension determination unit (341) includes:
the industrial status index item, wherein the index key words comprise broad prospect and unclear prospect;
the industry analysis index item, wherein the index keywords comprise rapid rise, stable development, slow development and industry obstruction;
the national policy index dimension determination unit (342) includes:
the support policy index item, wherein the index keywords comprise financial support, fund support, enterprise income tax deduction and enterprise income tax deduction,
the restriction policy index item, wherein the index keywords comprise policy regulation limit, policy limit and limitation policy,
the protection policy index item, wherein the index keywords comprise protection policies and policy protections,
the adjustment policy index item, wherein the index keywords comprise adjustment policies and policy adjustments,
the promotion policy index item, wherein the index keywords comprise policy promotion and promotion policy, and
the guide policy index item, wherein the index keywords comprise guide policy and policy guide.
7. The open-source-data-based enterprise risk assessment system of claim 6,
the judgment rule of the judgment module in the industry status index item is as follows: when the word frequency of the index keyword "broad prospect" is in [1, ∞), the score is 100; when the word frequency of the index keyword "broad prospect" is 0 and the word frequency of the index keyword "unclear prospect" is in [1, ∞), the score is 50; when the word frequencies of the index keywords "broad prospect" and "unclear prospect" are both 0, the score is 0;
the judgment rule of the judgment module in the industry analysis index item is as follows: when the word frequency of the index keyword "rapid rise" is in [1, ∞), the score is 100; when the word frequency of "rapid rise" is 0 and the word frequency of the index keyword "stable development" is in [1, ∞), the score is 75; when the word frequencies of "rapid rise" and "stable development" are both 0 and the word frequency of the index keyword "slow development" is in [1, ∞), the score is 50; when the word frequencies of "rapid rise", "stable development" and "slow development" are all 0 and the word frequency of the index keyword "industry obstruction" is in [1, ∞), the score is 25; when the total word frequency is 0, the score is 0;
the judgment rules of the judgment module in the support policy index item, the adjustment policy index item, the promotion policy index item and the guide policy index item are as follows: when the total word frequency is in [1, ∞), the score is 100;
the judgment rule of the judgment module in the restriction policy index item is as follows: when the total word frequency is in [1, ∞), the score is 0.
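Unlike the threshold rules of the earlier claims, the industry status and industry analysis rules of claim 7 depend on which keyword is present rather than on how often keywords occur: the first keyword found in a fixed priority order determines the score. A minimal sketch of that cascade follows, assuming per-keyword frequencies are supplied as a dictionary; the function and constant names are hypothetical and the English keywords stand in for the original terms.

```python
# Keywords in priority order with the score each one yields when present.
INDUSTRY_STATUS_PRIORITY = [
    ("broad prospect", 100),
    ("unclear prospect", 50),
]
INDUSTRY_ANALYSIS_PRIORITY = [
    ("rapid rise", 100),
    ("stable development", 75),
    ("slow development", 50),
    ("industry obstruction", 25),
]


def cascade_score(frequencies: dict, priority) -> int:
    """Return the score of the highest-priority keyword that occurs at least once.

    `frequencies` maps a keyword to its word frequency; missing keywords
    are treated as frequency 0. If no keyword occurs, the score is 0.
    """
    for keyword, score in priority:
        if frequencies.get(keyword, 0) >= 1:
            return score
    return 0
```

For instance, a text in which "stable development" appears three times and "slow development" once scores 75 on the industry analysis item, because "rapid rise" is absent and "stable development" outranks "slow development" regardless of the counts.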
CN201711022805.6A 2017-10-27 2017-10-27 Enterprise risk evaluation system and method based on open source data Active CN110020048B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711022805.6A CN110020048B (en) 2017-10-27 2017-10-27 Enterprise risk evaluation system and method based on open source data

Publications (2)

Publication Number Publication Date
CN110020048A CN110020048A (en) 2019-07-16
CN110020048B true CN110020048B (en) 2021-09-14

Family

ID=67186658

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711022805.6A Active CN110020048B (en) 2017-10-27 2017-10-27 Enterprise risk evaluation system and method based on open source data

Country Status (1)

Country Link
CN (1) CN110020048B (en)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112446776A (en) * 2019-08-27 2021-03-05 北京宸信征信有限公司 Small and medium-sized enterprise credit evaluation system and method based on multi-source docking fusion data
CN111222774B (en) * 2019-12-30 2020-08-18 广州博士信息技术研究院有限公司 Enterprise data analysis method and device and server
CN112418600A (en) * 2020-10-15 2021-02-26 重庆市科学技术研究院 Enterprise policy scoring method and system based on index set
CN112418601A (en) * 2020-10-15 2021-02-26 重庆市科学技术研究院 Policy matching method and system based on index set
CN114971432A (en) * 2022-08-01 2022-08-30 威海海洋职业学院 Enterprise financial risk early warning method and system
CN115239215B (en) * 2022-09-23 2022-12-20 中国电子科技集团公司第十五研究所 Enterprise risk identification method and system based on deep anomaly detection
CN115908082A (en) * 2023-01-06 2023-04-04 佰聆数据股份有限公司 Enterprise pollution discharge monitoring method and device based on electricity utilization characteristic indexes
CN117422312B (en) * 2023-12-18 2024-03-12 福建实达集团股份有限公司 Assessment method, medium and device for enterprise management risk

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103700029A (en) * 2013-12-16 2014-04-02 国家电网公司 Establishing method for post-evaluation index system for power grid construction project
CN105719073A (en) * 2016-01-18 2016-06-29 苏州汇誉通数据科技有限公司 Enterprise credit evaluation system and method
CN105975491A (en) * 2016-04-26 2016-09-28 重庆誉存企业信用管理有限公司 Enterprise news analysis method and system
CN106709818A (en) * 2016-12-30 2017-05-24 国家电网公司 Power consumption enterprise credit risk evaluation method

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090037235A1 (en) * 2007-07-30 2009-02-05 Anthony Au System that automatically identifies a Candidate for hiring by using a composite score comprised of a Spec Score generated by a Candidates answers to questions and an Industry Score based on a database of key words & key texts compiled from source documents, such as job descriptions

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Liu Yali. "Research on Satisfaction with B2C E-commerce Logistics Distribution Services" ("B2C电子商务物流配送服务满意度研究"). Journal of Huainan Vocational Technical College (《淮南职业技术学院学报》), 2016-10-15, Vol. 16, No. 5, pp. 41-45. *

Also Published As

Publication number Publication date
CN110020048A (en) 2019-07-16

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20230810

Address after: No. 117-389 Yunhan Avenue, Beibei District, Chongqing, 400700

Patentee after: Chongqing Chenyu Information Technology Co.,Ltd.

Address before: Room 1201, building 65-a5, Fuxing Road, Haidian District, Beijing 100036

Patentee before: BEIJING CHENXIN CREDIT INFORMATION CO.,LTD.
