CN114328461A - Big data analysis-based enterprise innovation and growth capacity evaluation method and system - Google Patents

Big data analysis-based enterprise innovation and growth capacity evaluation method and system

Info

Publication number
CN114328461A
Authority
CN
China
Prior art keywords
enterprise
data
index
sample
value
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111669894.XA
Other languages
Chinese (zh)
Inventor
李奇陵
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Kechuangtong Chengdu Co ltd
Original Assignee
Kechuangtong Chengdu Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Kechuangtong Chengdu Co ltd
Priority to CN202111669894.XA
Publication of CN114328461A
Legal status: Pending

Landscapes

  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention provides a method and system for evaluating enterprise innovation and growth capacity based on big data analysis, and relates to the field of quantitative evaluation. The method comprises the following steps: collecting sample data for each enterprise dimension of sample enterprises, and performing data cleaning, calculation and label classification on the sample data; training an index weight model on the cleaned and classified sample data to obtain a relative importance score and a ranking chart of the index items, and calculating a final weight value for each index item; and calculating the final evaluation score of each enterprise to be evaluated from its enterprise data and the final weight value of each index item. The enterprise dimensions include industrial and commercial registration, employees, qualifications, intellectual property, projects, financing, assets, and research and development expenses. The invention quantitatively evaluates the intangible assets of technology enterprises, including innovation capacity and science and technology projects, and calculates their share of the weighting, so that the growth capacity of an enterprise can be measured accurately and reliably and data support can be provided for its future development planning and decision making.

Description

Big data analysis-based enterprise innovation and growth capacity evaluation method and system
Technical Field
The invention relates to the field of quantitative evaluation, in particular to a method and a system for evaluating enterprise innovation and growth capacity based on big data analysis.
Background
The traditional method for evaluating enterprise growth capacity is mainly financial analysis, which examines an enterprise's financial data, business model and business plan, such as its assets and liabilities and its sales and profits. Such an evaluation can be made or updated only when the enterprise raises financing, is audited or issues financial reports, and it is generally aimed at larger or longer-established enterprises. Start-up technology enterprises have short operating histories, little financial data and a large proportion of intangible assets, so financial analysis is not suitable for evaluating their growth capacity.
Another enterprise evaluation technique is the enterprise portrait. However, an enterprise portrait only presents an enterprise's existing data from different dimensions; it neither analyzes that data in depth nor quantitatively evaluates the enterprise's business behavior.
Because the growth capacity of small and medium-sized technology enterprises with a large proportion of intangible assets cannot be evaluated accurately and reliably, these enterprises find it very difficult to obtain financing, loans, policy support and the like, which hinders both their development and scientific and technological innovation.
Disclosure of Invention
The invention aims to provide a method and system for evaluating enterprise innovation and growth capacity based on big data analysis. By quantitatively evaluating capacities that are otherwise hard to quantify in enterprise operation, such as innovation capacity and achievement transformation capacity, it enables an enterprise to make targeted improvements according to the evaluation of each dimension in the evaluation results, and it provides more suitable data support and investment targets, based on the evaluation results, when policies are implemented or when enterprises invest and raise financing.
In order to achieve the above purpose, the invention provides the following technical scheme: a big data analysis-based enterprise innovation and growth capacity evaluation method comprises the following steps:
collecting sample data for each enterprise dimension of a sample enterprise, and performing data cleaning, calculation and label classification on the sample data in chronological order; the enterprise dimensions comprise industrial and commercial registration, employees, qualifications, intellectual property, projects, financing and loans, assets, and research and development expenses;
training an index weight model on the cleaned sample data and label classifications of the sample enterprises, obtaining from the index weight model a relative importance score and a ranking chart of the index items, and then calculating the final weight value of each index item;
calculating, for each enterprise to be evaluated, the final evaluation score and ranking of each index item from the enterprise data of that enterprise and the final weight value of each index item; the enterprise data is the data of each enterprise dimension of the enterprise to be evaluated after data cleaning.
Further, the process of performing data cleaning, calculation and label classification on the sample data according to the time sequence includes:
acquiring, for each of the last three years, the sample enterprise's industrial and commercial registration information, the number of researchers and the proportions of employees by educational attainment, qualification certification information, the number of class I intellectual property rights, the number of class II intellectual property rights, project application information, loan amounts, financing data, research and development expenses and technical contract information, wherein the loan amounts comprise the maximum, minimum, average and total loan amount;
cleaning and normalizing the acquired data, replacing Chinese text content with numeric representations, and obtaining numerical enterprise data;
performing label classification on the numerical enterprise data to determine the index items, placing the data of each index item into an object of an enterprise data matrix, and calculating the data value of each sample enterprise under each index item;
and calculating the sum, the average value, the maximum value, the minimum value, the 25% value and the 75% value (the 25th and 75th percentiles) of the data values of all sample enterprises under each index item, and processing abnormal data values.
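The per-index statistics named in the last step can be sketched as follows. This is a minimal illustration using only the Python standard library; the function name `summarize_index`, the sample values and the choice of the "inclusive" quantile method are assumptions, since the text does not say how the 25% and 75% values are interpolated.

```python
import statistics

def summarize_index(values):
    """Return the summary statistics named in the text for one index item."""
    # quantiles(n=4) returns the three quartile cut points; the interpolation
    # method is an assumption, the source does not specify one.
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "sum": sum(values),
        "mean": statistics.fmean(values),
        "max": max(values),
        "min": min(values),
        "p25": q1,
        "p75": q3,
    }

# Hypothetical values of one index item (total number of employees)
# across five sample enterprises.
total_employees = [10, 20, 30, 40, 50]
summary = summarize_index(total_employees)
```

The 25% and 75% values computed here are the ones that the abnormal-value handling later substitutes for out-of-range data.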
Further, the calculation process of the final weight value of the index item is as follows:
predefining a multi-level data classification label as a calculation index item;
receiving all sample data of the sample enterprises, extracting features from the sample data with the gradient boosting decision tree algorithm of the xgboost machine learning library, classifying, sorting and grouping them, computing a softmax loss function and a regularization term in combination with the input calculation index items, and obtaining the relative importance score of each index item and the optimal ranking chart;
and taking the average of the relative importance score of each index item and the preset ratio score of the corresponding index item as the final weight value of that index item.
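The last step, averaging the model's relative importance score with a preset ratio score, can be illustrated with a short sketch. The index names and all numbers below are hypothetical, and the importance scores are assumed to have already been extracted from the trained model and normalized to sum to 1; the text does not state the normalization.

```python
def final_weights(importance, preset):
    """Average the relative importance score and the preset ratio score per index item."""
    return {name: (importance[name] + preset[name]) / 2 for name in importance}

# Hypothetical relative importance scores from the trained index weight model.
importance = {"rd_expense_ratio": 0.5, "class_i_ip_count": 0.3, "net_profit_growth": 0.2}
# Hypothetical preset ratio scores for the same index items.
preset = {"rd_expense_ratio": 0.4, "class_i_ip_count": 0.4, "net_profit_growth": 0.2}

weights = final_weights(importance, preset)
```

If both inputs sum to 1, the averaged weights also sum to 1, so no further rescaling is needed in this sketch.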
Further, the calculation process of the final evaluation score of each index item of each enterprise to be evaluated is as follows:
acquiring sample data of each enterprise dimension of each enterprise to be evaluated and performing data processing to obtain an enterprise data matrix of each enterprise to be evaluated;
dividing the index items calculated by the index weight model into five groups of evaluation dimensions, namely endogenous innovation, collaborative quotation, management and operation capability, achievement transformation capability and continuous growth capability, and, for each group, performing matrix dot multiplication between the value of each index item of that group in the enterprise data matrix and its corresponding final weight value to obtain the evaluation score of the corresponding index item;
and converting the scores of all groups of evaluation dimensions to a 100-point scale according to the characteristics of the normal distribution, as the final evaluation scores of the enterprise index items.
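One possible reading of the two scoring steps is sketched below. The text does not give the exact conversion to the 100-point scale, so this sketch assumes that raw scores are standardized across enterprises and mapped through the normal cumulative distribution function; the function names, index names and values are all hypothetical.

```python
from statistics import NormalDist, fmean, pstdev

def index_scores(values, weights):
    """Multiply each index value of one enterprise by its final weight."""
    return {name: values[name] * weights[name] for name in weights}

def to_hundred_point(raw_scores):
    """Map raw scores to 0-100 via the normal CDF (an assumed conversion)."""
    mu, sigma = fmean(raw_scores), pstdev(raw_scores)
    if sigma == 0:
        # All enterprises scored the same; place them at the midpoint.
        return [50.0 for _ in raw_scores]
    dist = NormalDist(mu, sigma)
    return [100 * dist.cdf(score) for score in raw_scores]

# Hypothetical final weights for two index items of one evaluation dimension.
weights = {"researcher_ratio": 0.5, "rd_expense_ratio": 0.5}
# Hypothetical, already normalized index values for three enterprises.
enterprises = {
    "A": {"researcher_ratio": 1.0, "rd_expense_ratio": 1.0},
    "B": {"researcher_ratio": 2.0, "rd_expense_ratio": 2.0},
    "C": {"researcher_ratio": 3.0, "rd_expense_ratio": 3.0},
}

# Raw dimension score per enterprise: sum of its weighted index scores.
raw = {name: sum(index_scores(vals, weights).values()) for name, vals in enterprises.items()}
hundred = to_hundred_point(list(raw.values()))
```

Under this assumed mapping an enterprise at the sample mean receives 50 points, and scores above or below the mean are spread according to their distance in standard deviations.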
Further, the index items of the sample enterprise include: total number of employees, proportion of researchers, proportion of employees with a junior college education or above, proportion of postgraduates, number of enterprise qualifications, number of industry-university-research qualifications, number of subordinate institutions, number of research and development activities, and proportion of research and development expenses; research and development expenses and project establishment amount for each of the previous two years; maximum, minimum and mean project amount, total number of projects obtained, and total number of project applications; number and total amount of equity investments; for each of the previous two years, total assets, total liabilities, net assets, business income, main business income, costs and expenses, net profit, profit tax paid, proportion of main business, and net profit margin; management certification, class I intellectual property rights, and class II intellectual property rights; number of technical contracts in each of the previous two years, and the maximum, minimum, average and total technical contract transaction amount; number of national science and technology awards, number of standards formulated, number of scientific and technological achievements converted, number of high-tech products, and high-tech product income; and the growth or change rates of total assets, net assets, business income, main business income, costs and expenses, net profit, profit tax paid, research and development expenses, proportion of main business, net profit margin, proportion of research and development expenses, number of intellectual property rights, and proportion of research and development personnel.
Further, abnormal data values are processed as follows: for proportion data outside the range 0 to 1, a value greater than 1 is set to the 75% value of the corresponding item, and a value less than 0 is set to the 25% value of the corresponding item; for growth rate data, a value greater than 20 is set to the 75% value of the corresponding item, and a value less than -20 is set to the 25% value of the corresponding item.
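The replacement rule for abnormal values can be sketched as two small functions. The function names and sample numbers are hypothetical, and the lower bound for proportion data is read as 0, since a proportion is expected to lie in the range 0 to 1.

```python
def fix_ratio(value, p25, p75):
    """Replace a proportion outside 0..1 with the item's 75% or 25% value."""
    if value > 1:
        return p75
    if value < 0:  # lower bound read as 0 (assumption based on the 0-1 range)
        return p25
    return value

def fix_growth(value, p25, p75, limit=20):
    """Replace a growth rate beyond +/-limit with the item's 75% or 25% value."""
    if value > limit:
        return p75
    if value < -limit:
        return p25
    return value

# Hypothetical 25% and 75% values of one proportion item and one growth-rate item.
cleaned_ratios = [fix_ratio(v, p25=0.2, p75=0.6) for v in [1.3, -0.1, 0.5]]
cleaned_growth = [fix_growth(v, p25=-0.1, p75=0.4) for v in [25, -30, 3]]
```

Values inside the accepted range pass through unchanged, so the rule only affects the tails of each index item.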
The invention also discloses a big data analysis-based enterprise innovation and growth capacity evaluation system, which comprises:
an enterprise data collection processing module, configured to collect sample data for each enterprise dimension of a sample enterprise and to perform data cleaning, calculation and label classification on the sample data in chronological order; the enterprise dimensions comprise industrial and commercial registration, employees, qualifications, intellectual property, projects, financing and loans, assets, and research and development expenses;
an index weight module, configured to train an index weight model on the cleaned sample data and label classifications of the sample enterprises, so as to obtain from the index weight model a relative importance score and a ranking chart of the index items and to calculate the final weight value of each index item;
and a score assignment module, configured to calculate, for each enterprise to be evaluated, the final evaluation score and ranking of each index item from the enterprise data of that enterprise and the final weight value of each index item.
Further, the enterprise data collection processing module comprises:
a first acquisition unit, configured to acquire, for each of the last three years, the sample enterprise's industrial and commercial registration information, the number of researchers and the proportions of employees by educational attainment, qualification certification information, the number of class I intellectual property rights, the number of class II intellectual property rights, project application information, loan amounts, financing data, research and development expenses and technical contract information, wherein the loan amounts comprise the maximum, minimum, average and total loan amount;
a cleaning and normalization unit, configured to clean and normalize the data acquired by the first acquisition unit, obtaining numerical enterprise data by deleting duplicate data and data irrelevant to the evaluation and by replacing Chinese text content with numeric representations;
a classification unit, configured to perform label classification on the numerical enterprise data to determine the index items, place the data of each index item into an object of an enterprise data matrix, and calculate the data value of each sample enterprise under each index item;
and a calculating unit, configured to calculate the sum, the average value, the maximum value, the minimum value, the 25% value and the 75% value of the data values of all sample enterprises under each index item and to process abnormal data values.
Further, the index weight module comprises:
a pre-defining unit, configured to predefine multi-level data classification labels as calculation index items;
a machine learning unit, configured to receive all sample data of the sample enterprises, extract features from the sample data with the gradient boosting decision tree algorithm of the xgboost machine learning library, classify, sort and group them, compute a softmax loss function and a regularization term in combination with the input calculation index items, and obtain the relative importance score of each index item and the optimal ranking chart;
and a weight calculation unit, configured to calculate the average of the relative importance score of each index item and the preset ratio score of the corresponding index item, and to take that average as the final weight value of the index item.
Further, the score assignment module includes:
a second acquisition unit, configured to acquire the data of each enterprise dimension of each enterprise to be evaluated and to process it into an enterprise data matrix for that enterprise;
a division evaluation unit, configured to divide the index items calculated by the index weight model into five groups of evaluation dimensions, namely endogenous innovation, collaborative quotation, management and operation capability, achievement transformation capability and continuous growth capability, and, for each group, to perform matrix dot multiplication between the value of each index item of that group in the enterprise data matrix and its corresponding final weight value to obtain the evaluation score of the corresponding index item;
and a score conversion unit, configured to convert the scores of all groups of evaluation dimensions to a 100-point scale according to the characteristics of the normal distribution, the converted scores being used as the final evaluation scores of the enterprise index items.
The technical scheme of the invention has the following beneficial effects:
The method and system for evaluating enterprise innovation and growth capacity based on big data analysis disclosed by the invention take the financial indexes of an enterprise into account and also quantitatively evaluate its intangible assets, such as innovation capacity and science and technology projects, so that the growth capacity of the enterprise is measured accurately and reliably. Specifically, the method comprises the following steps: collecting sample data for each enterprise dimension of a sample enterprise, and performing data cleaning, calculation and label classification on the sample data in chronological order, wherein the enterprise dimensions comprise industrial and commercial registration, employees, qualifications, intellectual property, projects, financing and loans, assets, and research and development expenses; training an index weight model on the cleaned sample data and label classifications of the sample enterprises, obtaining from the index weight model a relative importance score and a ranking chart of the index items, and then calculating the final weight value of each index item; and calculating the final evaluation score of each index item of each enterprise to be evaluated from the enterprise data of that enterprise and the final weight value of each index item. The final evaluation score centers on five aspects, namely endogenous innovation, collaborative quotation, management and operation capability, achievement transformation capability and continuous growth capability.
Compared with a financial analysis method that focuses on financial index items, the final evaluation score obtained by this method is better suited to evaluating the growth capacity of small and medium-sized enterprises, and it provides data support for an enterprise's future development planning, policy matching and decision making. Compared with the enterprise portrait, the method not only presents the development data of an enterprise from different dimensions but also analyzes more deeply how the enterprise's business behavior and innovation capacity affect its growth capacity.
It should be understood that all combinations of the foregoing concepts and additional concepts described in greater detail below can be considered as part of the inventive subject matter of this disclosure unless such concepts are mutually inconsistent.
The foregoing and other aspects, embodiments and features of the present teachings can be more fully understood from the following description taken in conjunction with the accompanying drawings. Additional aspects of the present invention, such as features and/or advantages of exemplary embodiments, will be apparent from the description which follows, or may be learned by practice of specific embodiments in accordance with the teachings of the present invention.
Drawings
The drawings are not intended to be drawn to scale. In the drawings, each identical or nearly identical component that is illustrated in various figures may be represented by a like numeral. For purposes of clarity, not every component may be labeled in every drawing. Embodiments of various aspects of the present invention will now be described, by way of example, with reference to the accompanying drawings, in which:
FIG. 1 is a flow chart of a big data analysis-based enterprise innovation and growth capability evaluation method according to the present invention;
FIG. 2 is a flow chart of sample data cleaning preparation in the embodiment;
FIG. 3 is a first flowchart illustrating sample data cleaning in an embodiment;
FIG. 4 is a second flowchart of sample data cleaning in the embodiment;
FIG. 5 shows the weight of each evaluation dimension used to generate the evaluation index in the embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the drawings of the embodiments of the present invention. It is to be understood that the embodiments described are only a few embodiments of the present invention, and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the described embodiments of the invention without any inventive step, are within the scope of protection of the invention. Unless defined otherwise, technical or scientific terms used herein shall have the ordinary meaning as understood by one of ordinary skill in the art to which this invention belongs.
The use of "first," "second," and similar terms in the description and claims of the present application do not denote any order, quantity, or importance, but rather the terms are used to distinguish one element from another. Similarly, the singular forms "a," "an," or "the" do not denote a limitation of quantity, but rather denote the presence of at least one, unless the context clearly dictates otherwise. The terms "comprises," "comprising," or the like, mean that the elements or items listed before "comprises" or "comprising" encompass the features, integers, steps, operations, elements, and/or components listed after "comprising" or "comprising," and do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
In the prior art, enterprise growth capacity is evaluated mainly by financial analysis, which focuses on an enterprise's financial indexes. Such an evaluation can be made only when the enterprise raises financing, is audited or releases financial reports, and it suits only larger or longer-established enterprises. It does not suit small, medium and micro enterprises or recently established start-up technology enterprises, because it does not quantitatively evaluate the innovation capacity and achievement transformation capacity shown in an enterprise's operating activities. The invention therefore aims to provide a method and system for evaluating enterprise innovation and growth capacity based on big data analysis which, in addition to common indexes such as financial indexes, quantitatively evaluates an enterprise's intangible assets and its otherwise unquantified innovation capacity, achievement transformation capacity and the like, so as to evaluate the enterprise's growth capacity accurately and reliably and to provide data support for its future development and decision making.
The method and system for evaluating enterprise innovation and growth ability based on big data analysis disclosed by the invention are further specifically described below with reference to the embodiments shown in the drawings.
With reference to the flowchart shown in fig. 1, the method for evaluating enterprise innovation and growth capacity based on big data analysis disclosed in the embodiment of the present application includes the following steps:
Step S102: collecting sample data for each enterprise dimension of a sample enterprise, and performing data cleaning, calculation and label classification on the sample data in chronological order; the enterprise dimensions comprise industrial and commercial registration, employees, qualifications, intellectual property, projects, financing and loans, assets, and research and development expenses.
Step S102 is executed as follows. First, for each of the last three years, the sample enterprise's industrial and commercial registration information, the number of researchers and the proportions of employees by educational attainment, qualification certification information, the number of class I intellectual property rights, the number of class II intellectual property rights, project application information, loan amounts, financing data, research and development expenses and technical contract information are acquired, wherein the loan amounts comprise the maximum, minimum, average and total loan amount. Next, the acquired data is cleaned and normalized, Chinese text content is replaced with numeric representations, and numerical enterprise data is obtained. Then, label classification is performed on the numerical enterprise data to determine the index items, the data of each index item is placed into an object of an enterprise data matrix, and the data value of each sample enterprise under each index item is calculated. Finally, the sum, the average value, the maximum value, the minimum value, the 25% value and the 75% value of the data values of all sample enterprises under each index item are calculated, and abnormal data values are processed.
The index items of the sample enterprise comprise: total number of employees, proportion of researchers, proportion of employees with a junior college education or above, proportion of postgraduates, number of enterprise qualifications, number of industry-university-research qualifications, number of subordinate institutions, number of research and development activities, and proportion of research and development expenses; research and development expenses and project establishment amount for each of the previous two years; maximum, minimum and mean project amount, total number of projects obtained, and total number of project applications; number and total amount of equity investments; for each of the previous two years, total assets, total liabilities, net assets, business income, main business income, costs and expenses, net profit, profit tax paid, proportion of main business, and net profit margin; management certification, class I intellectual property rights, and class II intellectual property rights; number of technical contracts in each of the previous two years, and the maximum, minimum, average and total technical contract transaction amount; number of national science and technology awards, number of standards formulated, number of scientific and technological achievements converted, number of high-tech products, and high-tech product income; and the growth or change rates of total assets, net assets, business income, main business income, costs and expenses, net profit, profit tax paid, research and development expenses, proportion of main business, net profit margin, proportion of research and development expenses, number of intellectual property rights, and proportion of research and development personnel.
The process of acquiring, cleaning, calculating and classifying the sample data of each enterprise dimension of the sample enterprises comprises at least the three stages shown in FIGS. 2 to 4.
As shown in FIG. 2, the early stage is a data preparation phase in which business data is processed. The relevant data that all sample enterprises require for the evaluation index model is exported from the database; it covers at least the 8 enterprise dimensions above and is saved as an "all enterprises.xls" document for later use. The set of all distinct sample enterprise names is obtained from the "all enterprises.xls" document, and for each sample enterprise in the set: first, the enterprise's latest sci-tech innovation loan record is taken as the data basis of the enterprise in the evaluation method; then the following are calculated to form a data object CA: 1) the maximum, minimum, average and total sci-tech innovation loan amounts obtained by the enterprise in the last three years, 2) the average annual proportion of researchers, 3) the average annual number of class I intellectual property rights, and 4) the average annual number of class II intellectual property rights; finally, the latest sci-tech innovation loan record of the enterprise and the corresponding data object CA are merged into new enterprise data, which is written to the first-step initial table, the "data1.xls" file.
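The loan part of the data object CA can be sketched as a three-year aggregation. The function name, the record layout of `(year, amount)` pairs and the definition of the three-year window (the latest year and the two years before it) are assumptions made for illustration.

```python
from statistics import fmean

def loan_summary(loans, latest_year, window=3):
    """Aggregate loan amounts from the last `window` years, latest year included."""
    recent = [amount for year, amount in loans
              if latest_year - window < year <= latest_year]
    if not recent:
        # No loans in the window: report zeros rather than failing.
        return {"max": 0.0, "min": 0.0, "mean": 0.0, "total": 0.0}
    return {
        "max": max(recent),
        "min": min(recent),
        "mean": fmean(recent),
        "total": sum(recent),
    }

# Hypothetical loan records of one sample enterprise as (year, amount) pairs.
loans = [(2017, 999.0), (2019, 100.0), (2020, 200.0), (2021, 300.0)]
ca_loans = loan_summary(loans, latest_year=2021)
```

The 2017 record falls outside the window and is ignored, which mirrors the restriction to the last three years in the text.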
The data content of the first-step initial table is then normalized. For example, Chinese text is removed from the columns for the proportion of researchers, the proportion of employees with a junior college education or above, and the total number of employees, and the values are converted to float data; the yes/no columns indicating whether the enterprise is a high-tech enterprise, a technologically advanced enterprise, or a technology-based small or medium-sized enterprise are converted to 0/1; the enterprise data in the specified columns of the first-step initial table is converted to float data; and the enterprise information of each sample enterprise is merged with the enterprise data corresponding to that enterprise into the first-step result table, the "sheet1.xlsx" file.
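The normalization described here can be sketched with two helpers. The function names, the example cell contents and the set of labels treated as "yes" are assumptions; the text only states that Chinese text is removed, values become floats and yes/no columns become 0/1.

```python
import re

def to_float(cell):
    """Keep the first number found in a cell and parse it as a float."""
    match = re.search(r"-?\d+(?:\.\d+)?", str(cell))
    return float(match.group()) if match else None

def to_flag(cell):
    """Convert a yes/no cell to 1/0; the accepted 'yes' labels are assumed."""
    return 1 if str(cell).strip() in {"是", "yes", "Yes", "1"} else 0

# Hypothetical raw cells from the first-step initial table.
employees = to_float("35人")       # number followed by a Chinese unit
ratio = to_float("12.5")           # already numeric text
missing = to_float("无")           # no number present
is_high_tech = to_flag("是")
is_advanced = to_flag("否")
```

Returning `None` for cells without a number keeps missing values distinguishable from a genuine zero; how the embodiment treats such cells is not stated.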
As shown in fig. 3, the middle stage performs label classification and quantitative evaluation of the achievement transformation capability of the sample enterprises, which involves acquiring five parts of data. First, the first-step result table "sheet1.xlsx" file. Second, enterprise data on qualification certifications, intellectual property rights and researchers, obtained from the first-step initial table "data1.xls" file; for qualification certification, all non-duplicate certification types held by the enterprises are split out according to the separators in the table, each certification type is converted into its own column, and the column value is recorded as 1 for an enterprise that holds that certification. Third, the equity financing data of all enterprises are acquired from the database and intersected with the related data of the "all enterprises.xls" document to obtain first intersection data, from which the number of investments received and the total investment amount are calculated for each sample enterprise. Fourth, the technical contract data of all enterprises are acquired from the database and intersected with the related data of the "all enterprises.xls" document to obtain second intersection data, from which the technical transaction amount of each of the last three years, together with the maximum, minimum, average, total and count of the technical transaction amounts, is calculated in sequence for each sample enterprise. Fifth, the project declaration data of all enterprises are acquired from the database and intersected with the related data of the "all enterprises.xls" document to obtain third intersection data, from which the project establishment amount of each year, together with the maximum, minimum, average, total and count of the project establishment amounts, is calculated in sequence for each sample enterprise.
Finally, the five parts of data are combined by enterprise dimension and output as the second-step result table "data99.xlsx" file.
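The conversion of separator-delimited certification types into 0/1 type columns, described for the second part of the data, could look roughly like this. It is a sketch only: the function name, the `;` separator and the sample certification names are assumptions chosen for illustration.

```python
def one_hot_certifications(cert_cells, sep=";"):
    """Turn separator-delimited certification strings into 0/1 columns."""
    parsed = [
        {c.strip() for c in (cell or "").split(sep) if c.strip()}
        for cell in cert_cells
    ]
    # All non-duplicate certification types across enterprises become columns.
    columns = sorted(set().union(*parsed)) if parsed else []
    matrix = [[1 if col in certs else 0 for col in columns] for certs in parsed]
    return columns, matrix


# One cell per enterprise; None stands for an enterprise with no certification.
columns, matrix = one_hot_certifications(["ISO9001;CMMI3", "ISO9001", None])
```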
As shown in fig. 4, the later stage calculates the data values of the sample enterprises under each index item and processes abnormal data values; this stage combines the data of the early stage and the middle stage. The business income in the first-step table "data1.xls" file is merged with the second-step result table "data99.xlsx" file. The following are then calculated in sequence: the total subsidy amount obtained by each sample enterprise (industry-university-research cooperation), the number of qualification certifications of each sample enterprise (enterprise qualification), the number of management certifications obtained by each enterprise (management certification), the growth rates of the assets, profit, income, tax payment and research and development expenses of the sample enterprise, and the last-year to previous-year ratios of items such as talent, expenses, major business income and zero profit. The calculation results are merged with the data in the second-step result table "data99.xlsx" file and recorded, for all sample enterprises, as the index item data table object "bb_date". For each column of the "bb_date" table object, count, mean, std, min, 25%, 50%, 75%, max and null_% (the ratio of the number of rows having data in the column to the total number of rows) are calculated, and the calculated values are recorded as the "n_df" table object.
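The per-column summary recorded in the "n_df" object resembles a standard descriptive-statistics table. A minimal standard-library sketch is shown below; the function names are hypothetical, linear interpolation between order statistics is an assumed percentile convention, and null_% follows the text's wording as the share of rows that hold data.

```python
from statistics import mean, stdev


def percentile(sorted_vals, q):
    """Linearly interpolated percentile (q in [0, 1]) of an ascending, non-empty list."""
    pos = q * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * (pos - lo)


def summarize(column):
    """Summary statistics for one column; None marks a missing value."""
    vals = sorted(v for v in column if v is not None)
    if not vals:
        return {"count": 0, "null_%": 0.0}
    return {
        "count": len(vals),
        "mean": mean(vals),
        "std": stdev(vals) if len(vals) > 1 else None,
        "min": vals[0],
        "25%": percentile(vals, 0.25),
        "50%": percentile(vals, 0.50),
        "75%": percentile(vals, 0.75),
        "max": vals[-1],
        "null_%": len(vals) / len(column),  # rows holding data / total rows
    }


stats = summarize([1.0, 2.0, 3.0, 4.0, None])
```

The 25% and 75% values computed here are the quantities that the abnormal-value handling later substitutes for out-of-range entries.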
Finally, the abnormal data values in the index item data table object "bb_date" are processed. The processing method provided by this embodiment is as follows: for ratio data outside the range 0-1, a value greater than 1 is set to the 75% value (upper quartile) of the corresponding item, and a value less than 0 is set to the 25% value (lower quartile) of the corresponding item; growth-rate data greater than 20 are set to the 75% value of the corresponding item, and growth-rate data less than -20 are set to the 25% value of the corresponding item; for some label columns, values exceeding the maximum value are set to the corresponding maximum value, these columns being the total number of approved projects, the number of subordinate institutions, the total number of project applications, the maximum credit amount, the mean credit amount and the total credit amount. After the abnormal data values are processed, the processed data are stored in the third-step result data "data_72.xlsx" file.
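These three rules can be expressed as small helper functions. The sketch below is illustrative only: the function names are hypothetical, the lower bound of the ratio range is taken to be 0 (an assumption based on the stated 0-1 range), and the per-column maximum passed to `cap` is assumed to be a preset limit, since the text does not say how it is chosen.

```python
def fix_ratio(value, q25, q75):
    """Ratio data must lie in [0, 1]; out-of-range values fall back to a quartile."""
    if value is None:
        return None
    if value > 1:
        return q75
    if value < 0:  # assumption: the lower bound of the valid range is 0
        return q25
    return value


def fix_growth(value, q25, q75, limit=20):
    """Growth rates beyond +/- limit are replaced by the column's quartile values."""
    if value is None:
        return None
    if value > limit:
        return q75
    if value < -limit:
        return q25
    return value


def cap(value, maximum):
    """Clamp count-like columns to a preset maximum (how it is set is assumed)."""
    return None if value is None else min(value, maximum)
```

For instance, with quartiles 0.2 and 0.6 for a ratio column, `fix_ratio(1.4, 0.2, 0.6)` yields 0.6 and `fix_ratio(-0.1, 0.2, 0.6)` yields 0.2, while in-range values pass through unchanged.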
Step S104: using the cleaned sample data and label classification of the sample enterprises, an index weight model is trained so that the relative importance scores and a ranking chart of the index items are obtained from the index weight model, and the final weight value of each index item is then calculated.
The specific execution process of step S104 is as follows: multi-level data classification labels are predefined as calculation index items; all sample data of the sample enterprises are received, the features of each sample datum are extracted with the gradient boosting decision tree algorithm of the machine learning library xgboost, and are classified, sorted and grouped; a softmax loss function and a regularization term are calculated in combination with the input calculation index items, and the relative importance score and ranking chart of each index item are obtained one by one; and the average of the relative importance score of an index item and the preset proportion score of the corresponding index item is taken as the final weight value of that index item.
FIG. 5 shows the calculation of the final weight value of an index item, taking the total project amount as the calculation index item. Specifically, first, the total project amount column of the third-step result data "data_72.xlsx" is copied into a column Y, and the data are preprocessed: all rows whose value in column Y is null are excluded, and only columns with at least 50 non-null rows are retained. Second, a weight model is trained with the gradient boosting decision tree algorithm of the xgboost library, where the feature matrix is the data other than column Y and the label is column Y. Then the importance scores of the index items are obtained and bound to the column names of the feature matrix source; in the binding result, the row names are the column names of the third-step result data "data_72.xlsx" and the values are the importance scores. On the basis of the binding result, the column index numbers of the feature matrix source are added to form number column 1, which is sorted in ascending order of importance score; the row index numbers of the feature matrix source are then added to number column 1 to form number column 2, and the column names and values of number column 2 are obtained following the order of number column 1.
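The text relies on the xgboost library for training. To show in a self-contained way how a gradient-boosted tree model yields one importance score per column, the sketch below swaps in a tiny hand-written booster that uses depth-1 trees (stumps) and squared-error loss. It is not xgboost and does not use the softmax objective named in the text; the function names, the toy data and the column names are all hypothetical. Importance here is each column's share of the total squared-error reduction ("gain") achieved by splits on that column.

```python
def best_stump(X, residuals):
    """Find the single split (feature, threshold) that most reduces squared error."""
    n = len(X)
    total = sum(residuals)
    base = total * total / n
    best = None  # (gain, feature index, threshold, left mean, right mean)
    for j in range(len(X[0])):
        order = sorted(range(n), key=lambda i: X[i][j])
        left_sum = 0.0
        for k in range(1, n):
            left_sum += residuals[order[k - 1]]
            lo, hi = X[order[k - 1]][j], X[order[k]][j]
            if lo == hi:
                continue  # cannot split between equal values
            right_sum = total - left_sum
            gain = left_sum ** 2 / k + right_sum ** 2 / (n - k) - base
            if best is None or gain > best[0]:
                best = (gain, j, (lo + hi) / 2, left_sum / k, right_sum / (n - k))
    return best


def boosted_importance(X, y, feature_names, rounds=20, learning_rate=0.3):
    """Fit boosted stumps to y; return {feature: normalized gain} in ascending order."""
    pred = [sum(y) / len(y)] * len(y)
    gains = {name: 0.0 for name in feature_names}
    for _ in range(rounds):
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        stump = best_stump(X, residuals)
        if stump is None or stump[0] <= 1e-12:
            break  # nothing left to explain
        gain, j, thr, left_mean, right_mean = stump
        gains[feature_names[j]] += gain
        pred = [
            p + learning_rate * (left_mean if row[j] <= thr else right_mean)
            for p, row in zip(pred, X)
        ]
    total = sum(gains.values()) or 1.0
    scores = {name: g / total for name, g in gains.items()}
    return dict(sorted(scores.items(), key=lambda kv: kv[1]))


# Toy data: column Y depends only on the first feature.
X = [[1, 5], [2, 3], [3, 5], [4, 3], [5, 5], [6, 3]]
y = [10, 10, 10, 20, 20, 20]
scores = boosted_importance(X, y, ["patents", "noise"])
```

Sorting the bound name and score pairs in ascending order mirrors the ordering step described for number column 1; with a real xgboost model the scores would come from the trained booster instead.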
Finally, number column 2 is merged with the comprehensive weight column and the index column read from the preset index weight distribution table "zonghe.xlsx", and the final weight value of each index item is calculated as (value in number column 2 + comprehensive weight)/2. The index items of each enterprise dimension are calculated in sequence, and all results are stored as the weight data 'qz.
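The averaging rule itself is simple enough to state in a few lines. The sketch below assumes that both the model importance scores and the preset comprehensive weights are available as dictionaries keyed by index item name; the function name, the item names and the numbers are hypothetical.

```python
def final_weights(importance, preset):
    """Average each index item's model importance with its preset comprehensive weight."""
    return {
        item: (importance[item] + preset[item]) / 2
        for item in importance
        if item in preset  # only items present in both tables are merged
    }


importance = {"researcher_ratio": 0.30, "class_i_ip": 0.50}
preset = {"researcher_ratio": 0.10, "class_i_ip": 0.20}
weights = final_weights(importance, preset)
```

Averaging in this way lets expert-assigned weights temper the data-driven scores, so an index item with little training signal still carries its preset share.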
Step S106, respectively calculating the final evaluation score and ranking of each index item of each enterprise to be evaluated according to the enterprise data of each enterprise to be evaluated and the final weight value of each index item; the enterprise data is sample data of each enterprise dimension after data cleaning is carried out on the enterprise to be evaluated.
The specific execution process of step S106 is as follows: the sample data of each enterprise dimension of each enterprise to be evaluated are acquired and processed to obtain an enterprise data matrix for each enterprise to be evaluated; the index items calculated by the index weight model are divided into five groups of evaluation dimensions, namely endogenous innovation, collaborative quotation, management operation capability, achievement transformation capability and continuous growth capability, and, for any index item in each group, the corresponding value in the enterprise data matrix is multiplied (matrix dot multiplication) by the corresponding final weight value to obtain the evaluation score of that index item; and the scores of all groups of evaluation dimensions are converted, according to normal distribution characteristics, into scores on a 100-point scale, which serve as the final evaluation scores of the enterprise index items.
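A rough sketch of this scoring step follows. The weighted sum per evaluation dimension follows the text; the conversion to a 100-point scale is not specified in detail, so mapping each raw score through the normal cumulative distribution function of the cohort's mean and standard deviation is purely an assumption. All names and numbers are hypothetical.

```python
from statistics import NormalDist, mean, pstdev


def dimension_scores(enterprises, weights, groups):
    """Weighted sum of index values per evaluation dimension for each enterprise."""
    return {
        name: {
            dim: sum(values.get(item, 0.0) * weights[item] for item in items)
            for dim, items in groups.items()
        }
        for name, values in enterprises.items()
    }


def to_hundred_point(raw_by_name):
    """Map raw scores to 0-100 via the normal CDF of the cohort (an assumed rule)."""
    raw = list(raw_by_name.values())
    mu, sigma = mean(raw), pstdev(raw)
    if sigma == 0:
        return {name: 50.0 for name in raw_by_name}  # all equal: mid-scale
    dist = NormalDist(mu, sigma)
    return {name: 100.0 * dist.cdf(x) for name, x in raw_by_name.items()}


weights = {"researcher_ratio": 0.2, "class_i_ip": 0.35}
groups = {"endogenous innovation": ["researcher_ratio", "class_i_ip"]}
enterprises = {
    "A": {"researcher_ratio": 0.5, "class_i_ip": 2.0},
    "B": {"researcher_ratio": 0.1, "class_i_ip": 0.0},
}
raw = dimension_scores(enterprises, weights, groups)
scaled = to_hundred_point({n: s["endogenous innovation"] for n, s in raw.items()})
```

Because the mapping is monotonic, the ranking of enterprises within a dimension is the same before and after conversion; only the spacing of the scores changes.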
Embodiments of the present application also provide an electronic device, which includes a processor and a memory, where the memory is used to store program instructions and transmit the program instructions to the processor; when the program instructions are executed by the processor, the processor executes the method for evaluating enterprise innovation and growth capacity based on big data analysis in the above embodiments.
The programs described above may be run on a processor or may be stored in a computer readable storage medium, which includes both permanent and non-permanent, removable and non-removable media that may implement the storage of information by any method or technology. The information may be computer readable instructions, data structures, modules of a program, or other data. Examples of computer-readable storage media include, but are not limited to, phase change memory (PRAM), Static Random Access Memory (SRAM), Dynamic Random Access Memory (DRAM), other types of Random Access Memory (RAM), Read Only Memory (ROM), Electrically Erasable Programmable Read Only Memory (EEPROM), flash memory or other memory technology, compact disc read only memory (CD-ROM), Digital Versatile Discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information that can be accessed by a computing device. As defined herein, a computer readable storage medium does not include a transitory computer readable medium such as a modulated data signal and a carrier wave.
These computer programs may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks, and corresponding steps may be implemented by different modules.
By way of example, the present embodiment provides an apparatus or system for evaluating enterprise innovation and growth capacity based on big data analysis, the system including the following program modules: the enterprise data collection processing module, used for collecting sample data of each enterprise dimension of a sample enterprise and performing data cleaning, calculation and label classification on the sample data according to a time series, where the enterprise dimensions comprise industrial and commercial registration, employees, qualifications, intellectual property rights, projects, financing loans, assets, and research and development expenses; the index weight module, used for training an index weight model with the cleaned sample data and label classification of the sample enterprises, so that the relative importance scores and a ranking chart of the index items are obtained from the index weight model and the final weight value of each index item is calculated; and the score assignment module, used for calculating the final evaluation score and ranking of each index item of each enterprise to be evaluated according to the enterprise data of that enterprise and the final weight value of each index item.
The system is used for implementing the functions of the evaluation method in the above embodiments, and each module in the system corresponds to each step in the method, which has already been described in the method, and is not described again here.
For example, the enterprise data collection processing module includes: a first acquisition unit, used for acquiring, for each of the last three years, the industrial and commercial information of a sample enterprise, the number of researchers and the education-level ratio, the qualification certification information, the number of class I intellectual property rights, the number of class II intellectual property rights, the project declaration information, the loan amount, the financing data, the research and development expenses and the technical contract information, where the loan amount comprises the maximum loan amount, the minimum loan amount, the average loan amount and the total loan amount; a cleaning and normalization unit, used for cleaning and normalizing the data acquired by the acquisition unit, and obtaining numerical enterprise data by deleting duplicate data and data irrelevant to the evaluation and by replacing Chinese content with numerical representations; a classification unit, used for performing label classification on the numerical enterprise data to determine each index item, putting the data of each index item into an object of an enterprise data matrix, and calculating the data value of the sample enterprise under each index item; and a calculating unit, used for calculating the sum, the average value, the maximum value, the minimum value, the 25% value and the 75% value of the data values of all the sample enterprises under each index item, and for processing abnormal data values.
For another example, the index weight module includes: a pre-defining unit, used for predefining multi-level data classification labels as calculation index items; a machine learning unit, used for receiving all sample data of a sample enterprise, extracting the features of each sample datum with the gradient boosting decision tree algorithm of the machine learning library xgboost, classifying, sorting and grouping them, calculating a softmax loss function and a regularization term in combination with the input calculation index items, and acquiring the relative importance score and ranking chart of each index item one by one; and a weight calculation unit, used for calculating the average of the relative importance score of an index item and the preset proportion score of the corresponding index item, and taking the average as the final weight value of that index item.
As a further example, the score assignment module includes: a second acquisition unit, used for acquiring the sample data of each enterprise dimension of each enterprise to be evaluated and performing data processing to obtain an enterprise data matrix for each enterprise to be evaluated; a division evaluation unit, used for dividing the index items calculated by the index weight model into five groups of evaluation dimensions, namely endogenous innovation, collaborative quotation, management operation capability, achievement transformation capability and continuous growth capability, and performing matrix dot multiplication on the value corresponding to any index item in each group of evaluation dimensions in the enterprise data matrix and the corresponding final weight value to obtain the evaluation score of that index item; and a score conversion unit, used for converting the scores of all groups of evaluation dimensions, according to normal distribution characteristics, into scores on a 100-point scale, which serve as the final evaluation scores of the enterprise index items.
The embodiment of the application addresses the problem that financial analysis methods are unsuitable for evaluating the growth capacity of start-up science and technology enterprises that have been established for only a short time. By quantitatively evaluating otherwise hard-to-quantify factors in an enterprise's operating activities, such as innovation capability and achievement transformation capability, it avoids over-emphasizing the enterprise's financial index items.
Although the present invention has been described with reference to the preferred embodiments, it is not intended to be limited thereto. Those skilled in the art can make various changes and modifications without departing from the spirit and scope of the invention. Therefore, the protection scope of the present invention should be determined by the appended claims.

Claims (10)

1. A big data analysis-based enterprise innovation and growth capacity evaluation method is characterized by comprising the following steps:
collecting sample data of each enterprise dimension of a sample enterprise, and performing data cleaning, calculation and label classification on the sample data according to a time series; the enterprise dimensions comprise industrial and commercial registration, employees, qualifications, intellectual property rights, projects, financing loans, assets, and research and development expenses;
training an index weight model with the cleaned sample data and label classification of the sample enterprise, so as to obtain relative importance scores and a ranking chart of the index items from the index weight model, and further calculating the final weight value of each index item;
respectively calculating the final evaluation score and ranking of each index item of each enterprise to be evaluated according to enterprise data of each enterprise to be evaluated and the final weight value of each index item; the enterprise data is sample data of each enterprise dimension after data cleaning is carried out on the enterprise to be evaluated.
2. The big data analysis-based enterprise innovation and growth capability evaluation method according to claim 1, wherein the process of performing data cleaning, calculation and label classification on the sample data according to the time series comprises:
acquiring, for each of the last three years, the industrial and commercial information of the sample enterprise, the number of researchers and the education-level ratio, the qualification certification information, the number of class I intellectual property rights, the number of class II intellectual property rights, the project declaration information, the loan amount, the financing data, the research and development expenses and the technical contract information, wherein the loan amount comprises the maximum loan amount, the minimum loan amount, the average loan amount and the total loan amount;
cleaning and normalizing the content of the acquired data, replacing the Chinese content with digital representation, and acquiring numerical enterprise data;
performing label classification on the numerical enterprise data to determine each index item, putting the data of each index item into an object of an enterprise data matrix, and respectively calculating the data value of the sample enterprise under each index item;
and calculating the sum, the average value, the maximum value, the minimum value, the 25% value and the 75% value of the data values of all the sample enterprises under each index item, and processing abnormal data values.
3. The big data analysis-based enterprise innovation and growth capability evaluation method according to claim 2, wherein the final weight value of the index item is calculated by:
predefining a multi-level data classification label as a calculation index item;
receiving all sample data of a sample enterprise, extracting the features of each sample datum by using the gradient boosting decision tree algorithm of the machine learning library xgboost, classifying, sorting and grouping them, calculating a softmax loss function and a regularization term in combination with the input calculation index items, and acquiring the relative importance score and ranking chart of each index item one by one;
and taking the average value of the relative importance scores of the index items and the ratio scores of the preset corresponding index items as the final weight value of the index items.
4. The method for evaluating enterprise innovation and growth capacity based on big data analysis as claimed in claim 3, wherein the calculation process of the final evaluation score of each index item of each enterprise to be evaluated is as follows:
acquiring sample data of each enterprise dimension of each enterprise to be evaluated and performing data processing to obtain an enterprise data matrix of each enterprise to be evaluated;
dividing each index item calculated by the index weight model into five groups of evaluation dimensions, including endogenous innovation, collaborative quotation, management operation capability, achievement transformation capability and continuous growth capability, and performing matrix dot multiplication on a numerical value corresponding to any index item in each group of evaluation dimensions in an enterprise data matrix and a final weight value corresponding to the numerical value to obtain an evaluation score of the corresponding index item;
and converting the scores of all groups of evaluation dimensions, according to normal distribution characteristics, into scores on a 100-point scale, which serve as the final evaluation scores of the enterprise index items.
5. The method for evaluating enterprise innovation and growth capacity based on big data analysis as claimed in claim 2, wherein the index items of the sample enterprise include total number of employees, proportion of researchers, proportion of academic history over major, proportion of research students, amount of enterprise qualification, quality of production research, amount of research and development, amount of lower jurisdiction, amount of research and development activities, proportion of research and development expenses, cost of research and development in previous year, cost of research and development in last year, cost of research in previous year, maximum value of project amount, minimum value of project amount, mean value of project amount, total amount of project amount, total number of items granted, total number of project application, total amount of equity investment, total number of equity investment, total amount of debt in previous year, total amount of debt in last year, maximum value of equity amount, minimum value of credit amount, mean value of credit amount, total amount of credit amount, total number of acquired equity amount, total previous-year assets, total last-year assets, net previous-year assets, net last-year assets, business income of leading business of previous year, cost of previous year, net profit of previous year, interest and tax paid in last year, business occupation ratio of leading business of previous year, business occupation ratio of leading business of last year, net profit rate of previous year, management and authentication, intellectual property right of class I, intellectual property right of class II, quantity of technical contract of previous year, maximum technical contract technical transaction quantity, minimum technical contract technical transaction quantity, average technical contract technical transaction quantity, total technical contract technical transaction quantity, national technical reward quantity, formation standard quantity, technical conversion quantity, high and new technology product quantity, high and new technology income, total annual income growth rate, growth rate of total assets of previous year, technology contract technical product quantity, and quality, net asset growth rate, major business income growth rate, cost expense change rate, net profit growth rate, upper paying profit tax growth rate, research and development expense growth rate, major business proportion change rate, net profit growth rate, research and development expense proportion growth rate, intellectual property quantity growth rate and research and development personnel proportion growth rate.
6. The big data analysis-based enterprise innovation and growth capability evaluation method of claim 2, wherein the abnormal data value is processed by: for ratio data outside the range 0-1, setting a value greater than 1 to the 75% value of the corresponding item, and setting a value less than 0 to the 25% value of the corresponding item; and setting growth-rate data greater than 20 to the 75% value of the corresponding item, and growth-rate data less than -20 to the 25% value of the corresponding item.
7. A big data analysis-based enterprise innovation and growth capacity evaluation system is characterized by comprising:
the enterprise data collection processing module is used for collecting sample data of each enterprise dimension of a sample enterprise, and performing data cleaning, calculation and label classification on the sample data according to a time series; the enterprise dimensions comprise industrial and commercial registration, employees, qualifications, intellectual property rights, projects, financing loans, assets, and research and development expenses;
the index weight module is used for training an index weight model with the cleaned sample data and label classification of the sample enterprise, so that relative importance scores and a ranking chart of the index items can be obtained from the index weight model, and the final weight value of each index item can be calculated;
and the score assignment module is used for respectively calculating the final evaluation score and the ranking of each index item of each enterprise to be evaluated according to the enterprise data of each enterprise to be evaluated and the final weight value of each index item.
8. The big-data-analysis-based enterprise innovation and growth capability evaluation system of claim 7, wherein the enterprise data collection processing module comprises:
the first acquisition unit is used for acquiring, for each of the last three years, the industrial and commercial information of a sample enterprise, the number of researchers and the education-level ratio, the qualification certification information, the number of class I intellectual property rights, the number of class II intellectual property rights, the project declaration information, the loan amount, the financing data, the research and development expenses and the technical contract information, wherein the loan amount comprises the maximum loan amount, the minimum loan amount, the average loan amount and the total loan amount;
the cleaning and normalization unit is used for cleaning and normalizing the data acquired by the acquisition unit, and acquiring numerical enterprise data by deleting duplicate data and data irrelevant to the evaluation and by replacing Chinese content with numerical representations;
the classification unit is used for performing label classification on the numerical enterprise data to determine each index item, putting the data of each index item into an object of an enterprise data matrix, and respectively calculating the data value of the sample enterprise under each index item;
and the calculating unit is used for calculating the sum, the average value, the maximum value, the minimum value, the 25% value and the 75% value of the data values of all the sample enterprises under each index item and processing abnormal data values.
9. The big-data-analysis-based enterprise innovation and growth capability evaluation system of claim 8, wherein the index weight module comprises:
the pre-defining unit is used for pre-defining a multi-level data classification label as a calculation index item;
the machine learning unit is used for receiving all sample data of a sample enterprise, extracting the features of each sample datum by using the gradient boosting decision tree algorithm of the machine learning library xgboost, classifying, sorting and grouping them, calculating a softmax loss function and a regularization term in combination with the input calculation index items, and acquiring the relative importance score and ranking chart of each index item one by one;
and the weight calculation unit is used for calculating the average value of the relative importance scores of the index items and the ratio scores of the preset corresponding index items, and taking the average value as the final weight value of the index item.
10. The big-data-analysis-based enterprise innovation and growth capability evaluation system of claim 9, wherein the score assignment module comprises:
the second acquisition unit is used for acquiring the sample data of each enterprise dimension of each enterprise to be evaluated and performing data processing to obtain an enterprise data matrix of each enterprise to be evaluated;
the division evaluation unit is used for dividing each index item calculated by the index weight model into five groups of evaluation dimensions, including endogenous innovation, collaborative quotation, management operation capability, achievement transformation capability and continuous growth capability, and performing matrix multiplication on a numerical value corresponding to any index item in each group of evaluation dimensions in an enterprise data matrix and a final weight value corresponding to the numerical value to obtain an evaluation score of the corresponding index item;
and the score conversion unit is used for converting the scores of all groups of evaluation dimensions, according to normal distribution characteristics, into scores on a 100-point scale, which serve as the final evaluation scores of the enterprise index items.
CN202111669894.XA 2021-12-31 2021-12-31 Big data analysis-based enterprise innovation and growth capacity evaluation method and system Pending CN114328461A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111669894.XA CN114328461A (en) 2021-12-31 2021-12-31 Big data analysis-based enterprise innovation and growth capacity evaluation method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111669894.XA CN114328461A (en) 2021-12-31 2021-12-31 Big data analysis-based enterprise innovation and growth capacity evaluation method and system

Publications (1)

Publication Number Publication Date
CN114328461A true CN114328461A (en) 2022-04-12

Family

ID=81020284

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111669894.XA Pending CN114328461A (en) 2021-12-31 2021-12-31 Big data analysis-based enterprise innovation and growth capacity evaluation method and system

Country Status (1)

Country Link
CN (1) CN114328461A (en)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115760432A (en) * 2022-11-22 2023-03-07 东方微银科技股份有限公司 Accurate positioning method and system for life cycle of scientific and technological enterprise
CN116186177A (en) * 2023-04-27 2023-05-30 华智众创(北京)投资管理有限责任公司 Data processing method and device, computing equipment and storage medium
CN116823022A (en) * 2023-05-26 2023-09-29 科学技术部火炬高技术产业开发中心 Innovative management analysis method for scientific and technological enterprises
CN116823022B (en) * 2023-05-26 2024-06-28 科学技术部火炬高技术产业开发中心 Innovative management analysis method for scientific and technological enterprises
CN116775900A (en) * 2023-06-13 2023-09-19 南京智绘星图信息科技有限公司 Government affair auxiliary management method and system based on rule knowledge graph driving
CN116775900B (en) * 2023-06-13 2024-02-02 南京智绘星图信息科技有限公司 Government affair auxiliary management method and system based on rule knowledge graph driving
CN117436726A (en) * 2023-12-14 2024-01-23 惠民县黄河先进技术研究院 Regional high and new technology industry evaluation method and system

Similar Documents

Publication Publication Date Title
Wei et al. Discovering bank risk factors from financial statements based on a new semi‐supervised text mining algorithm
CN114328461A (en) Big data analysis-based enterprise innovation and growth capacity evaluation method and system
Katayose et al. Sentiment extraction in music
CN112598500A (en) Credit processing method and system for non-limit client
Budianto et al. Dividend Payout Ratio (DPR) in Islamic and Conventional Banking: Mapping Research Topics using VOSviewer Bibliometric and Library Research
CN112541817A (en) Marketing response processing method and system for potential customers of personal consumption loan
Utami et al. Financial literacy of micro, small, and medium enterprises of consumption sector in probolinggo city
CN112613977A (en) Personal credit loan admission credit granting method and system based on government affair data
CN110796539A (en) Credit investigation evaluation method and device
CN112668945A (en) Enterprise credit risk assessment method and device
Carboni et al. Innovative activities and investment decision: evidence from European firms
Ufo et al. Determinants of financial distress in manufacturing firms of Ethiopia
Wang et al. Applying TOPSIS method to evaluate the business operation performance of Vietnam listing securities companies
CN113554504A (en) Vehicle loan wind control model generation method and device and scoring card generation method
Biswas et al. Automated credit assessment framework using ETL process and machine learning
CN112766814A (en) Training method, device and equipment for credit risk pressure test model
CN115860924A (en) Supply chain financial credit risk early warning method and related equipment
CN113177733B (en) Middle and small micro enterprise data modeling method and system based on convolutional neural network
CN117252677A (en) Credit line determination method and device, electronic equipment and storage medium
CN114757754A (en) Image system, method, storage medium and electronic device for listed company
CN112767121A (en) Method and device for processing risk level data
CN112732866A (en) Investor emotion index construction method, heterogeneous subject market simulation method, equipment and medium
Salih et al. Factors affecting the practice of income smoothing policy (an applied study on Sudanese banks)
Ufo Impact of Financial Distress on the Liquidity of Selected Manufacturing Firms of Ethiopia
Taha The possibility of using artificial neural networks in auditing-theoretical analytical paper

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination