CN111681022A - Network platform data resource value evaluation method - Google Patents
Network platform data resource value evaluation method
- Publication number
- CN111681022A (application number CN202010298734.8)
- Authority
- CN
- China
- Prior art keywords
- data
- data resource
- evaluation
- value
- model
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/02—Marketing; Price estimation or determination; Fundraising
- G06Q30/0201—Market modelling; Market analysis; Collecting market data
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/06—Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
- G06Q10/067—Enterprise or organisation modelling
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/02—Marketing; Price estimation or determination; Fundraising
- G06Q30/0282—Rating or review of business operators or products
Landscapes
- Business, Economics & Management (AREA)
- Engineering & Computer Science (AREA)
- Strategic Management (AREA)
- Development Economics (AREA)
- Accounting & Taxation (AREA)
- Entrepreneurship & Innovation (AREA)
- Finance (AREA)
- Economics (AREA)
- Physics & Mathematics (AREA)
- Marketing (AREA)
- Game Theory and Decision Science (AREA)
- General Business, Economics & Management (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Human Resources & Organizations (AREA)
- Educational Administration (AREA)
- Operations Research (AREA)
- Quality & Reliability (AREA)
- Tourism & Hospitality (AREA)
- Data Mining & Analysis (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
- Complex Calculations (AREA)
Abstract
The invention relates to a method for evaluating the value of data resources on a network platform, comprising the following steps. Step 1: construct a data resource value evaluation index system for the network platform transaction environment. Step 2: determine the evaluation index weights using the entropy-corrected G1 method. Step 3: pre-screen the platform's committed (already traded) data resources using grey correlation analysis, and select the committed data resources whose degree of correlation with the pre-evaluation data resource is greater than or equal to a threshold to form the model sample set T. Step 4: select a random forest regression (RFR) model as the base model for evaluating data resource value on the network platform, and build the data resource value evaluation model from the sample set T. The evaluation indexes of the pre-evaluation data resource are input into the model, and the average of the output values of the regression trees is taken as the evaluation result. The method can significantly improve the accuracy of predicting data resource value, reduce the computation required by the RFR model, and improve training efficiency.
Description
Technical Field
The invention belongs to the field of resource value evaluation, and relates to a network platform data resource value evaluation method based on grey correlation analysis-random forest regression.
Background
In the era of the data explosion, data does more than record and archive. Association analysis of multi-source, cross-domain data yields more complete knowledge and deeper intelligence and greatly strengthens the ability to predict the future, and it is increasingly recognized, and objectively demanded, that data resources should circulate openly as tradable commodities. According to the "2018 China Big Data Development Report" issued by the National Information Center, the Chinese big data trading market was predicted to reach 731 billion yuan in 2020. Under the "Internet Plus" trend, network platforms have become an important transaction channel and medium, and data trading platforms such as Factual, BDEX, Data Plaza, and the Guiyang Big Data Exchange have appeared in succession. Because data resources are a non-standardized, emerging commodity, suppliers have little market accumulation or transaction history to refer to, and demanders cannot gain the direct experience they would have with tangible goods. The value of a data resource is therefore uncertain on both the supply side and the demand side, which causes supply-demand mismatch on data trading platforms and lowers both the transaction completion rate and the rate at which data value is put to use. Accurate evaluation of data resource value is thus the key to moving data resource trading from "disorderly" to "regulated".
As data trading has developed, some data trading platforms have recognized the importance of data resource value evaluation and begun useful exploration. In China's existing data trading market, however, network trading platforms still rely mainly on subjective evaluation based on expert experience and take a pragmatic approach to evaluation. As a result, data resource value evaluation suffers from low credibility and low transparency, it is difficult to provide an effective value reference to the parties involved in a data resource transaction, and transaction completion rates are unsatisfactory. In existing theoretical research, data resource value evaluation methods include asset evaluation, multi-attribute comprehensive evaluation, and economic methods, but these are discussed from the perspective of the data owner and are far removed from the transaction setting of a network platform. Some researchers have also proposed evaluation by artificial intelligence methods, and building data resource value evaluation models with neural networks has become a research trend in the field, but related research is not yet abundant and lacks empirical testing.
Disclosure of Invention
To address the shortcomings of the prior art, the invention provides a network platform data resource value evaluation method that constructs a data resource value evaluation model based on Grey Correlation Analysis (GCA) and Random Forest Regression (RFR).
In order to achieve the above purposes, the technical scheme adopted by the invention is as follows:
A method for evaluating the value of data resources of a network platform comprises the following steps:
Step 1: constructing a data resource value evaluation index system under a network platform transaction environment;
Step 2: determining the evaluation index weights based on the entropy-corrected G1 method;
Step 3: pre-screening the platform's committed data resources based on the Grey Correlation Analysis (GCA) method, and selecting the committed data resources whose degree of correlation with the pre-evaluation data resource is greater than or equal to a threshold to form a model sample set T;
Step 4: selecting a Random Forest Regression (RFR) model as the base model for evaluating the data resource value of the network platform, and constructing a data resource value evaluation model from the sample set T; each evaluation index of the pre-evaluation data resource is input into the data resource value evaluation model, and the average of the output values of the regression trees is taken as the data resource value evaluation result.
Step 1 specifically comprises: on the basis of a review of the factors that influence data resource value, and taking into account how frequently each factor is used and whether the corresponding variable is obtainable, 7 factors are selected as evaluation indexes from the dual perspectives of resources and assets, and a data resource value evaluation index system under a network platform transaction environment is constructed.
The evaluation indexes comprise benefit-type indexes, cost-type indexes and standardized indexes. The benefit-type indexes comprise the data scale degree, the market attention degree and the data application degree; the cost-type index is the data freshness degree; and the standardized indexes comprise the data activity degree, the data exclusivity degree and the data certainty degree, which are already standardized data.
Step 2 specifically comprises: first, experts rank all evaluation indexes in the data resource value evaluation index system by importance; second, the information entropy of each evaluation index is calculated by the entropy method; then the ratio of importance between adjacent evaluation indexes is calculated; and finally the weight of each evaluation index is calculated.
In the information entropy formula, $1 \le i \le m$ and $1 \le j \le 7$; when $f_{ij} = 0$, $f_{ij} \ln f_{ij}$ is taken to be 0; $m$ is the number of committed data resources; $x_{ij}$ is the $j$-th evaluation index of the $i$-th committed data resource; $h_j$ denotes the information entropy of the $j$-th evaluation index; and $y_{ij}$ denotes the data after normalization.
The importance ratio between adjacent indexes is calculated as follows:
where $r_j$ denotes the ratio of importance between the adjacent evaluation indexes $x_{j-1}$ and $x_j$; $h_{j-1}$ denotes the information entropy of the $(j-1)$-th evaluation index; and $h_j$ denotes the information entropy of the $j$-th evaluation index.
The calculation formula of the evaluation index weight is as follows:
Step 3 specifically comprises: standardizing the benefit-type and cost-type indexes of the pre-evaluation data resource and the committed data resources; calculating the absolute difference between the pre-evaluation data resource $Z_0$ and each committed data resource $Z_i$ on each corresponding evaluation index; calculating the two-level minimum difference and the two-level maximum difference; calculating the correlation coefficient between $Z_0$ and $Z_i$ on each evaluation index; and calculating the degree of correlation between the pre-evaluation data resource and each committed data resource, the committed data resources with degree of correlation $\gamma_i \ge 0.8$ forming the model sample set T.
The data standardization formulas for the benefit-type index $y1_{ij}$ and the cost-type index $y2_{ij}$ are:
The standardized indexes, together with the benefit-type and cost-type indexes after data standardization, are denoted D. The absolute difference Z between the pre-evaluation data resource $Z_0$ and the committed data resource $Z_i$ on the corresponding evaluation index is calculated as:
$Z = \lvert D_0(j) - D_i(j) \rvert, \quad j = 1, 2, \dots, 7 \qquad (6)$
The two-level minimum difference and the two-level maximum difference are calculated as:
$Z_1 = \min_{1 \le i \le m} \min_{1 \le j \le 7} \lvert D_0(j) - D_i(j) \rvert \qquad (7)$
$Z_2 = \max_{1 \le i \le m} \max_{1 \le j \le 7} \lvert D_0(j) - D_i(j) \rvert \qquad (8)$
The correlation coefficient between the pre-evaluation data resource $Z_0$ and the committed data resource $Z_i$ on the corresponding evaluation index is calculated as:
In the above formula, $\rho$ is the resolution coefficient, taken as 0.5.
The degree of correlation between the pre-evaluation data resource and the committed data resource is calculated as:
In the above formula, $w_j$ is the evaluation index weight.
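The two formulas referred to above (correlation coefficient and degree of correlation) do not appear in this text. A plausible reconstruction, assuming the standard grey correlation form (an assumption, not confirmed by the source), uses the symbols already defined and introduces $\xi_i(j)$ as a hypothetical name for the correlation coefficient:

```latex
\xi_i(j) = \frac{Z_1 + \rho Z_2}{\lvert D_0(j) - D_i(j) \rvert + \rho Z_2},
\qquad
\gamma_i = \sum_{j=1}^{7} w_j \, \xi_i(j)
```

Under this form, with $\rho = 0.5$, $Z_2 > 0$ and weights that sum to 1, every $\xi_i(j)$ lies between $1/3$ and $1$, so $\gamma_i$ does as well, and the 0.8 threshold keeps only committed data resources whose index sequence is close to that of the pre-evaluation data resource.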
Step 4 specifically comprises: setting the number K of regression trees; randomly drawing K training sample sets from the sample set T by the Bootstrap resampling method, the samples not drawn being called out-of-bag (OOB) data; randomly selecting A evaluation indexes, where $1 \le A \le 7$, and training to generate an RFR model; estimating the error of the RFR model using the OOB data as test samples; adjusting the value of the parameter K to build several RFR models, calculating the generalization error of each, and selecting the RFR model with the smallest generalization error as the final data resource value evaluation model; and inputting each evaluation index of the pre-evaluation data resource into the final value evaluation model and taking the average of the output values of the regression trees as the data resource value evaluation result.
The average of the output values of the regression trees is calculated as:
In the formula, $f_K$ denotes the output value of each regression tree and K denotes the number of regression trees.
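The averaging step can be written out directly from the definitions above. In the sketch below, $\hat{v}(x)$ (the evaluation result for the index vector $x$ of the pre-evaluation data resource) and the running subscript $k$ are names introduced here for illustration; the source writes the tree output as $f_K$:

```latex
\hat{v}(x) = \frac{1}{K} \sum_{k=1}^{K} f_k(x)
```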
The invention has the beneficial effects that:
1. The method uses the Octopus data acquisition software to crawl transaction data for 10 classes of data resources from the website of a big data trading platform, and validates the model empirically on real, quantifiable data, which effectively ensures the validity and practicability of the evaluation model;
2. The correlation between the data resource value influence factors and data resource value is demonstrated, and the selected influence factors are all quantifiable indexes, which overcomes the difficulty that data resource value evaluation indexes are subjective and hard to measure;
3. An intelligent method for judging data resource value from historical market transactions is provided, which is highly attractive and better suited to network platforms, where the volume of data resources is huge and the demanders are unknown;
4. Compared with intelligent value evaluation methods based on parametric models such as neural networks and support vector machines, the RFR model only requires the number of regression trees to be set and therefore has few parameters to tune; when the number of regression trees is large, the generalization error of the RFR converges, so overfitting does not occur;
5. Compared with using the RFR model alone, the GCA-RFR model of the invention first preprocesses the platform's committed data resource commodities by grey correlation analysis, selecting those whose index sequences are highly similar to that of the pre-evaluation data resource commodity to form the samples on which the RFR model is trained. This makes full use of the RFR model's low demand for sample data, and can significantly improve the accuracy of predicting data resource value while reducing the computation required by the RFR model and improving its training efficiency.
Drawings
The invention has the following drawings:
FIG. 1 is a flow chart of the method of the present invention.
FIG. 2 is a flow chart of RFR-based data resource value prediction.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings.
As shown in FIGS. 1 and 2, the method for evaluating the value of the data resources of a network platform according to the present invention includes the following steps:
Step 1: constructing a data resource value evaluation index system under a network platform transaction environment;
Step 2: determining the evaluation index weights based on the entropy-corrected G1 method;
Step 3: pre-screening the platform's committed data resources based on the Grey Correlation Analysis (GCA) method, and selecting the committed data resources whose degree of correlation with the pre-evaluation data resource is greater than or equal to a threshold to form a model sample set T;
Step 4: selecting a Random Forest Regression (RFR) model as the base model for evaluating the data resource value of the network platform, and constructing a data resource value evaluation model from the sample set T; each evaluation index of the pre-evaluation data resource is input into the data resource value evaluation model, and the average of the output values of the regression trees is taken as the data resource value evaluation result.
Step 1 specifically comprises: on the basis of a review of the factors that influence data resource value, and taking into account how frequently each factor is used and whether the corresponding variable is obtainable, 7 factors are selected as evaluation indexes from the dual perspectives of resources and assets, and a data resource value evaluation index system under a network platform transaction environment is constructed.
The evaluation indexes comprise the data activity degree, data scale degree, data freshness degree, data exclusivity degree, data certainty degree, market attention degree and data application degree.
Indexes positively correlated with the evaluation result are benefit-type indexes (data scale degree, market attention degree, data application degree), and indexes negatively correlated with the evaluation result are cost-type indexes (data freshness degree). The data activity degree, data exclusivity degree and data certainty degree are standardized data (0/1 variables).
Step 2 specifically comprises: first, experts rank each evaluation index $x_j$ in the data resource value evaluation index system by importance; second, the information entropy of each evaluation index is calculated by the entropy method; then the ratio of importance between adjacent evaluation indexes is calculated; and finally the weight of each evaluation index is calculated.
In the information entropy formula, $1 \le i \le m$ and $1 \le j \le 7$; when $f_{ij} = 0$, $f_{ij} \ln f_{ij}$ is taken to be 0; $m$ is the number of committed data resources; $x_{ij}$ is the $j$-th evaluation index of the $i$-th committed data resource; $h_j$ denotes the information entropy of the $j$-th evaluation index; and $y_{ij}$ denotes the data after normalization.
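The entropy formula itself does not appear in this text. The sketch below assumes the usual entropy-method form, $f_{ij} = y_{ij} / \sum_i y_{ij}$ and $h_j = -\frac{1}{\ln m} \sum_i f_{ij} \ln f_{ij}$, which matches the symbols defined above but is not confirmed by the source. The function name `information_entropy` and the sample matrix are hypothetical.

```python
import math

def information_entropy(y):
    """Return h_j for each column j of the m x n matrix y of normalized, non-negative values.

    Assumed form (not confirmed by the source):
    f_ij = y_ij / sum_i y_ij,  h_j = -(1 / ln m) * sum_i f_ij * ln f_ij.
    Requires m > 1 so that ln m is non-zero.
    """
    m = len(y)
    n = len(y[0])
    entropies = []
    for j in range(n):
        column = [row[j] for row in y]
        total = sum(column)
        h = 0.0
        for value in column:
            f = value / total if total > 0 else 0.0
            if f > 0:  # convention from the text: f ln f is taken as 0 when f = 0
                h -= f * math.log(f)
        entropies.append(h / math.log(m))  # scale so that 0 <= h_j <= 1
    return entropies

# Hypothetical data: 4 committed data resources, 2 evaluation indexes.
y = [[0.25, 1.0], [0.25, 0.0], [0.25, 0.0], [0.25, 0.0]]
h = information_entropy(y)
```

In this example the first index is identical across all resources, so its entropy is at the maximum of 1, while the second index is concentrated in a single resource and has entropy 0.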
The importance ratio between adjacent indexes is calculated as follows:
where $r_j$ denotes the ratio of importance between the adjacent evaluation indexes $x_{j-1}$ and $x_j$; $h_{j-1}$ denotes the information entropy of the $(j-1)$-th evaluation index; and $h_j$ denotes the information entropy of the $j$-th evaluation index.
The calculation formula of the evaluation index weight is as follows:
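Neither the ratio formula nor the weight formula appears in this text, so the sketch below rests on two labeled assumptions. First, the ratio is taken as $r_j = h_{j-1}/h_j$ when $h_{j-1} \ge h_j$ and 1 otherwise, one common form of the entropy-corrected G1 method. Second, the weights follow the standard G1 (order relation) recursion, $w_n = \bigl(1 + \sum_{k=2}^{n} \prod_{i=k}^{n} r_i\bigr)^{-1}$ and $w_{j-1} = r_j w_j$. The function name `g1_weights` and the entropy values are hypothetical.

```python
def g1_weights(h):
    """Weights for indexes listed in the experts' order (most important first).

    h: positive entropy values h_j in that order.
    Assumed rule (not confirmed by the source):
    r_j = h_{j-1} / h_j if h_{j-1} >= h_j, else 1.
    """
    n = len(h)
    # r[j] is the importance ratio between index j-1 and index j (0-based, j >= 1).
    r = [None] + [h[j - 1] / h[j] if h[j - 1] >= h[j] else 1.0 for j in range(1, n)]
    # Accumulate sum over k of prod_{i=k}^{n-1} r[i], working from the last index up.
    total = 0.0
    product = 1.0
    for k in range(n - 1, 0, -1):
        product *= r[k]
        total += product
    w = [0.0] * n
    w[n - 1] = 1.0 / (1.0 + total)  # weight of the lowest-ranked index
    for j in range(n - 1, 0, -1):   # recurse upward: w[j-1] = r[j] * w[j]
        w[j - 1] = r[j] * w[j]
    return w

# Hypothetical entropies for 4 ranked indexes.
weights = g1_weights([0.9, 0.6, 0.6, 0.3])
```

The recursion guarantees that the weights sum to 1 and never increase down the experts' ranking, so the entropy values adjust the size of each step but cannot reverse the expert order.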
Step 3 specifically comprises: standardizing the benefit-type and cost-type indexes of the pre-evaluation data resource and the committed data resources; calculating the absolute difference between the pre-evaluation data resource $Z_0$ and each committed data resource $Z_i$ on each corresponding evaluation index; calculating the two-level minimum difference and the two-level maximum difference; calculating the correlation coefficient between $Z_0$ and $Z_i$ on each evaluation index; and calculating the degree of correlation between the pre-evaluation data resource and each committed data resource, the committed data resources with degree of correlation $\gamma_i$ greater than or equal to the threshold r (r is taken as 0.8 here) forming the model sample set T.
The data standardization formulas for the benefit-type index $y1_{ij}$ and the cost-type index $y2_{ij}$ are:
The data activity degree, data exclusivity degree and data certainty degree, together with the benefit-type and cost-type indexes after data standardization, are denoted D. The absolute difference Z between the pre-evaluation data resource $Z_0$ and the committed data resource $Z_i$ on the corresponding evaluation index is calculated as:
$Z = \lvert D_0(j) - D_i(j) \rvert, \quad j = 1, 2, \dots, 7 \qquad (6)$
The two-level minimum difference and the two-level maximum difference are calculated as:
$Z_1 = \min_{1 \le i \le m} \min_{1 \le j \le 7} \lvert D_0(j) - D_i(j) \rvert \qquad (7)$
$Z_2 = \max_{1 \le i \le m} \max_{1 \le j \le 7} \lvert D_0(j) - D_i(j) \rvert \qquad (8)$
The correlation coefficient between the pre-evaluation data resource $Z_0$ and the committed data resource $Z_i$ on the corresponding evaluation index is calculated as:
In the above formula, $\rho$ is the resolution coefficient, which lies between 0 and 1 and is taken as 0.5 here.
The degree of correlation between the pre-evaluation data resource and the committed data resource is calculated as:
In the above formula, $w_j$ is the evaluation index weight.
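The screening in step 3 can be sketched as follows. Since the standardization and correlation formulas do not appear in this text, the code assumes min-max scaling (reversed for cost-type indexes) and the standard grey correlation coefficient $(Z_1 + \rho Z_2) / (\lvert D_0(j) - D_i(j) \rvert + \rho Z_2)$. Both are assumptions, and all function names and data are hypothetical.

```python
def min_max(values, cost=False):
    """Scale raw index values to [0, 1]; cost-type indexes are reversed (assumed form)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    if cost:
        return [(hi - v) / (hi - lo) for v in values]
    return [(v - lo) / (hi - lo) for v in values]

def grey_correlation_degrees(d0, d, w, rho=0.5):
    """Degree of correlation gamma_i between d0 and each row d[i], weighted by w."""
    diffs = [[abs(a - b) for a, b in zip(d0, row)] for row in d]
    z1 = min(min(row) for row in diffs)  # two-level minimum difference, as in (7)
    z2 = max(max(row) for row in diffs)  # two-level maximum difference, as in (8)
    if z2 == 0:
        return [1.0 for _ in d]  # every committed resource is identical to d0
    return [
        sum(wj * (z1 + rho * z2) / (delta + rho * z2) for wj, delta in zip(w, row))
        for row in diffs
    ]

def screen(d0, d, w, threshold=0.8, rho=0.5):
    """Indices of committed resources whose gamma_i meets the threshold."""
    gammas = grey_correlation_degrees(d0, d, w, rho)
    return [i for i, g in enumerate(gammas) if g >= threshold]

# Hypothetical standardized data with 2 indexes instead of 7, for brevity.
d0 = [1.0, 0.5]
d = [[1.0, 0.5], [0.0, 0.5], [0.9, 0.4]]
w = [0.5, 0.5]
gammas = grey_correlation_degrees(d0, d, w)
sample_set = screen(d0, d, w)
scaled_cost = min_max([3.0, 1.0, 2.0], cost=True)
```

Here the second committed resource differs strongly from the pre-evaluation resource on the first index and falls below the 0.8 threshold, so only the first and third resources enter the sample set T.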
Step 4 specifically comprises: setting the number K of regression trees; randomly drawing K training sample sets $t_1, t_2, \dots, t_K$ from the sample set T by the Bootstrap resampling method, the samples not drawn being called out-of-bag (OOB) data; randomly selecting A ($1 \le A \le 7$) evaluation indexes and training to generate an RFR model; estimating the error of the RFR model using the OOB data as test samples; adjusting the value of the parameter K to build several RFR models, calculating the generalization error of each, and selecting the RFR model with the smallest generalization error as the final data resource value evaluation model; and inputting each evaluation index of the pre-evaluation data resource into the final value evaluation model and taking the average of the output values of the regression trees as the data resource value evaluation result.
Because the OOB data serves as the test sample when the error of the RFR model is estimated, neither cross-validation nor a separate test sample set is needed.
Each evaluation index of the pre-evaluation data resource is input, and the average of the output values of the regression trees is taken as the data resource value evaluation result of the final value evaluation model, calculated as follows:
In the above formula, $f_K$ denotes the output value of each regression tree, and K denotes the number of regression trees.
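The bootstrap, out-of-bag and averaging steps can be illustrated with a deliberately simplified forest. In this sketch each tree is a depth-1 regression stump trained on A randomly chosen indexes. That is a simplification chosen to keep the example short, not the patent's model: a full RFR grows deeper trees and would normally come from a library. All names and data are hypothetical.

```python
import random

def fit_stump(rows, targets, features):
    """Fit a depth-1 regression tree: the best single split over the given features."""
    mean_all = sum(targets) / len(targets)
    best = (float("inf"), None, None, mean_all, mean_all)
    for f in features:
        for threshold in sorted({row[f] for row in rows}):
            left = [t for row, t in zip(rows, targets) if row[f] <= threshold]
            right = [t for row, t in zip(rows, targets) if row[f] > threshold]
            if not left or not right:
                continue
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            sse = sum((t - lm) ** 2 for t in left) + sum((t - rm) ** 2 for t in right)
            if sse < best[0]:
                best = (sse, f, threshold, lm, rm)
    _, f, threshold, lm, rm = best
    if f is None:  # no usable split: predict the sample mean
        return lambda row: mean_all
    return lambda row: lm if row[f] <= threshold else rm

def fit_forest(rows, targets, k, a, seed=0):
    """Train k stumps on bootstrap samples; return the trees and the OOB mean squared error."""
    rng = random.Random(seed)
    n, n_features = len(rows), len(rows[0])
    trees, oob_preds = [], [[] for _ in range(n)]
    for _ in range(k):
        idx = [rng.randrange(n) for _ in range(n)]   # bootstrap sample with replacement
        features = rng.sample(range(n_features), a)  # A randomly selected indexes
        tree = fit_stump([rows[i] for i in idx], [targets[i] for i in idx], features)
        trees.append(tree)
        for i in set(range(n)) - set(idx):           # rows not drawn are out-of-bag
            oob_preds[i].append(tree(rows[i]))
    errors = [(sum(p) / len(p) - targets[i]) ** 2 for i, p in enumerate(oob_preds) if p]
    oob_mse = sum(errors) / len(errors) if errors else float("inf")
    return trees, oob_mse

def predict(trees, row):
    """Average of the outputs of all regression trees."""
    return sum(tree(row) for tree in trees) / len(trees)

# Hypothetical sample set T: 2 indexes, value equal to 10 times the first index.
rows = [[i / 10, (i * 7) % 10 / 10] for i in range(20)]
targets = [10 * row[0] for row in rows]
# Try several values of K and keep the forest with the smallest OOB error.
candidates = [fit_forest(rows, targets, k, a=1) for k in (10, 50, 100)]
best_trees, best_error = min(candidates, key=lambda c: c[1])
estimate = predict(best_trees, [0.5, 0.5])
```

Because each row is scored only by trees that did not see it during training, the OOB error acts as the held-out estimate used to choose K. In practice, scikit-learn's `RandomForestRegressor` with `n_estimators`, `max_features` and `oob_score=True` offers the same workflow with full regression trees.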
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like that are within the form and principle of the present invention are intended to be included within the scope of the present invention.
Those not described in detail in this specification are within the skill of the art.
Claims (10)
1. A method for evaluating the value of data resources of a network platform is characterized by comprising the following steps:
Step 1: constructing a data resource value evaluation index system under a network platform transaction environment;
Step 2: determining the evaluation index weights based on the entropy-corrected G1 method;
Step 3: pre-screening the platform's committed data resources based on a grey correlation analysis method, and selecting the committed data resources whose degree of correlation with the pre-evaluation data resource is greater than or equal to a threshold to form a model sample set T;
Step 4: selecting a random forest model as the base model for evaluating the data resource value of the network platform, and constructing a data resource value evaluation model from the sample set T; and inputting each evaluation index of the pre-evaluation data resource into the data resource value evaluation model, and taking the average of the output values of the regression trees as the data resource value evaluation result of the data resource value evaluation model.
2. The method for evaluating the value of a data resource on a network platform according to claim 1, wherein step 1 specifically comprises: on the basis of a review of the factors that influence data resource value, and taking into account how frequently each factor is used and whether the corresponding variable is obtainable, selecting 7 factors as evaluation indexes from the dual perspectives of resources and assets, and constructing a data resource value evaluation index system under a network platform transaction environment;
the evaluation indexes comprise benefit-type indexes, cost-type indexes and standardized indexes; the benefit-type indexes comprise the data scale degree, the market attention degree and the data application degree; the cost-type index is the data freshness degree; and the standardized indexes comprise the data activity degree, the data exclusivity degree and the data certainty degree, which are already standardized data.
3. The method for evaluating the value of the data resource of the network platform according to claim 2, wherein step 2 specifically comprises: first, experts rank all evaluation indexes in the data resource value evaluation index system by importance; second, the information entropy of each evaluation index is calculated by the entropy method; then the ratio of importance between adjacent evaluation indexes is calculated; and finally the weight of each evaluation index is calculated.
4. The method for evaluating the value of the data resource of the network platform according to claim 3, wherein the information entropy is calculated as follows:
where $1 \le i \le m$ and $1 \le j \le 7$; when $f_{ij} = 0$, $f_{ij} \ln f_{ij}$ is taken to be 0; $m$ is the number of committed data resources; $x_{ij}$ is the $j$-th evaluation index of the $i$-th committed data resource; $h_j$ denotes the information entropy of the $j$-th evaluation index; and $y_{ij}$ denotes the data after normalization;
the importance ratio between adjacent indexes is calculated as follows:
where $r_j$ denotes the ratio of importance between the adjacent evaluation indexes $x_{j-1}$ and $x_j$; $h_{j-1}$ denotes the information entropy of the $(j-1)$-th evaluation index; and $h_j$ denotes the information entropy of the $j$-th evaluation index.
6. the method for evaluating the value of the data resource of the network platform according to claim 5, wherein the step 3 specifically comprises: carrying out data standardization processing on benefit type indexes and cost type indexes of the pre-evaluation data resources and the committed data resources; computing pre-evaluation data resource Z0With each committed data resource ZiAbsolute difference in corresponding evaluation index; calculating two-stage minimum and maximum differences; computing pre-evaluation data resource Z0With committed data resource ZiCorrelation coefficients on the respective evaluation indices; calculating the correlation degree between the pre-evaluation data resource and the committed data resource, and selecting the correlation degree gammaiThe committed data resources of more than or equal to 0.8 form a model sample set T.
7. The method of claim 6, wherein the benefit index y1 isijAnd cost index y2ijThe formula for normalizing the data of (a) is:
the standardized index, the benefit index after data standardization or the cost index after data standardization are marked as D, and the data resource Z is pre-evaluated0With committed data resource ZiThe calculation formula of the absolute difference value Z on the corresponding evaluation index is:
Z=∣D0(j)-Di(j)∣,j=1,2,…,7 (6)
the calculation formula of the two-stage minimum difference and the two-stage maximum difference is as follows:
Z1=min1≤i≤mmin1≤j≤7∣D0(j)-Di(j)∣ (7)
Z2=max1≤i≤mmax1≤j≤7∣D0(j)-Di(j)∣ (8)
the correlation coefficient between the pre-evaluation data resource $Z_0$ and the committed data resource $Z_i$ on the corresponding evaluation index is calculated as:
in the above formula, $\rho$ is the resolution coefficient, with a value of 0.5.
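Claim 7 names a standardization formula and a correlation-coefficient formula. A conventional form of each, as used in grey relational analysis, is sketched below. Both expressions are assumptions based on standard practice rather than text taken from the patent; the symbol $\xi_i(j)$ and the per-index extrema taken over all data resources $i$ are illustrative notation.

```latex
% Assumed min-max standardization: benefit-type index (larger is better)
D_i(j) = \frac{y1_{ij} - \min_i y1_{ij}}{\max_i y1_{ij} - \min_i y1_{ij}}

% Assumed min-max standardization: cost-type index (smaller is better)
D_i(j) = \frac{\max_i y2_{ij} - y2_{ij}}{\max_i y2_{ij} - \min_i y2_{ij}}

% Assumed grey relational coefficient, built from Z_1 in (7) and Z_2 in (8)
\xi_i(j) = \frac{Z_1 + \rho Z_2}{\lvert D_0(j) - D_i(j) \rvert + \rho Z_2},
\qquad \rho = 0.5
```

Under this form the coefficient equals 1 when the absolute difference attains the two-stage minimum and decreases as the difference grows.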
8. The method for evaluating the value of a data resource on a network platform according to claim 7, wherein the calculation formula of the degree of association between the pre-evaluated data resource and the committed data resource is as follows:
in the above formula, $w_j$ is the weight of the j-th evaluation index.
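The screening described in claims 6 to 8 can be sketched in Python as follows. This is a minimal illustration, not the patent's implementation: it assumes min-max standardization, the conventional grey relational coefficient (Z1 + ρ·Z2) / (|D0(j) − Di(j)| + ρ·Z2), and a weighted sum of coefficients as the degree of correlation. The function names `normalize`, `grey_relational_degrees`, and `select_samples` are hypothetical.

```python
def normalize(column, benefit):
    """Min-max standardize one evaluation index (assumed form)."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # Assumption: a constant index carries no contrast; map it to 1.0.
        return [1.0 for _ in column]
    if benefit:
        return [(v - lo) / (hi - lo) for v in column]
    return [(hi - v) / (hi - lo) for v in column]


def grey_relational_degrees(target, candidates, weights, benefit_flags, rho=0.5):
    """Degree of correlation between the target (Z0) and each candidate (Zi)."""
    rows = [list(target)] + [list(c) for c in candidates]
    n = len(target)
    # Standardize each index over the target and all candidates together.
    cols = [normalize([r[j] for r in rows], benefit_flags[j]) for j in range(n)]
    d0 = [cols[j][0] for j in range(n)]
    # Absolute differences, as in formula (6).
    diffs = [[abs(d0[j] - cols[j][i]) for j in range(n)]
             for i in range(1, len(rows))]
    z1 = min(min(row) for row in diffs)  # two-stage minimum, formula (7)
    z2 = max(max(row) for row in diffs)  # two-stage maximum, formula (8)
    if z2 == 0:
        # Every candidate coincides with the target on every index.
        return [sum(weights) for _ in diffs]
    degrees = []
    for row in diffs:
        coeffs = [(z1 + rho * z2) / (dz + rho * z2) for dz in row]
        degrees.append(sum(w * c for w, c in zip(weights, coeffs)))
    return degrees


def select_samples(target, candidates, weights, benefit_flags, threshold=0.8):
    """Indices of candidates whose degree of correlation meets the threshold."""
    degrees = grey_relational_degrees(target, candidates, weights, benefit_flags)
    return [i for i, g in enumerate(degrees) if g >= threshold]
```

With weights that sum to 1, a candidate identical to the pre-evaluation data resource gets a degree of 1 and is always retained, while the 0.8 threshold from claim 6 filters out dissimilar committed data resources.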
9. The method for evaluating the value of the data resource of the network platform according to claim 1, wherein step 4 specifically comprises: setting the number K of regression trees; randomly drawing K training sample sets from the sample set T by Bootstrap resampling, the samples that are not drawn being called out-of-bag (OOB) data; randomly selecting A evaluation indexes, where 1 ≤ A ≤ 7, and training to generate a random forest regression (RFR) model; performing error estimation on the RFR model using the OOB data as test samples; adjusting the value of the parameter K to establish a plurality of RFR models, calculating the generalization error of each model, and selecting the RFR model with the minimum generalization error as the final value evaluation model of the data resource; and inputting each evaluation index of the pre-evaluation data resource into the value evaluation model, and taking the average of the output values of the regression trees as the data resource value evaluation result of the final value evaluation model.
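The procedure in claim 9 can be sketched with the Python standard library alone. This is an illustrative reading, not the patent's implementation. It assumes that the A indexes are re-drawn at every split (the claim does not say whether the draw is per tree or per split), that splits minimize the sum of squared errors, that trees are capped at a fixed depth, and that the OOB mean squared error stands in for the generalization error. All function names and the `max_depth` parameter are hypothetical.

```python
import random
from statistics import mean


def _sse(values):
    """Sum of squared errors around the mean (split criterion)."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def build_tree(X, y, n_features, rng, depth=0, max_depth=4):
    """Grow one regression tree; a leaf is a number, a split is a tuple."""
    if depth >= max_depth or len(set(y)) == 1:
        return mean(y)
    best = None
    # Assumption: the A evaluation indexes are re-drawn at every split.
    for f in rng.sample(range(len(X[0])), n_features):
        for t in sorted({row[f] for row in X})[:-1]:
            left = [i for i, row in enumerate(X) if row[f] <= t]
            right = [i for i, row in enumerate(X) if row[f] > t]
            sse = _sse([y[i] for i in left]) + _sse([y[i] for i in right])
            if best is None or sse < best[0]:
                best = (sse, f, t, left, right)
    if best is None:  # every sampled index is constant on this node
        return mean(y)
    _, f, t, left, right = best

    def grow(idx):
        return build_tree([X[i] for i in idx], [y[i] for i in idx],
                          n_features, rng, depth + 1, max_depth)

    return (f, t, grow(left), grow(right))


def predict_tree(node, x):
    """Walk from the root to a leaf and return the leaf value."""
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if x[f] <= t else right
    return node


def train_forest(X, y, n_trees, n_features, seed=0):
    """Fit K trees on bootstrap samples; record each tree's OOB indices."""
    rng = random.Random(seed)
    n = len(X)
    trees, oob_sets = [], []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # draw with replacement
        trees.append(build_tree([X[i] for i in idx], [y[i] for i in idx],
                                n_features, rng))
        oob_sets.append(set(range(n)) - set(idx))
    return trees, oob_sets


def predict(trees, x):
    """Forest output: the average of the regression tree outputs."""
    return mean(predict_tree(t, x) for t in trees)


def oob_error(trees, oob_sets, X, y):
    """Mean squared error using only trees that did not see each sample."""
    errors = []
    for i, (x, target) in enumerate(zip(X, y)):
        preds = [predict_tree(t, x)
                 for t, oob in zip(trees, oob_sets) if i in oob]
        if preds:
            errors.append((mean(preds) - target) ** 2)
    return mean(errors) if errors else float("inf")


def select_forest(X, y, k_values, n_features, seed=0):
    """Try several K and keep the forest with the smallest OOB error."""
    best = None
    for k in k_values:
        trees, oob_sets = train_forest(X, y, k, n_features, seed)
        err = oob_error(trees, oob_sets, X, y)
        if best is None or err < best[0]:
            best = (err, k, trees)
    return best[1], best[2]
```

Here `X` would hold the evaluation-index vectors of the sample set T and `y` their known values; calling `predict` on the indexes of the pre-evaluation data resource then gives the averaged estimate.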
10. The network platform data resource value evaluation method of claim 9, wherein: the calculation formula of the average value of the output values of the regression tree is as follows:
in the formula, $f_k$ denotes the output value of the k-th regression tree, and K denotes the number of regression trees.
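Given the symbols defined in claim 10, the average of the regression tree outputs would take the usual form below. This is a sketch inferred from those definitions, not text taken from the patent; the argument $x$ (the vector of evaluation indexes) and the notation $\hat{f}(x)$ are assumptions.

```latex
\hat{f}(x) = \frac{1}{K} \sum_{k=1}^{K} f_k(x)
```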
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010298734.8A CN111681022A (en) | 2020-04-16 | 2020-04-16 | Network platform data resource value evaluation method |
NL2027964A NL2027964B1 (en) | 2020-04-16 | 2021-04-14 | Data resource valuation method for network platform |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010298734.8A CN111681022A (en) | 2020-04-16 | 2020-04-16 | Network platform data resource value evaluation method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN111681022A (en) | 2020-09-18 |
Family
ID=72433321
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010298734.8A Pending CN111681022A (en) | 2020-04-16 | 2020-04-16 | Network platform data resource value evaluation method |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN111681022A (en) |
NL (1) | NL2027964B1 (en) |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112668851A (en) * | 2020-12-21 | 2021-04-16 | 浙江弄潮儿智慧科技有限公司 | Method and system for determining biodiversity protection key area |
CN112686530A (en) * | 2020-12-28 | 2021-04-20 | 贵州电网有限责任公司 | Relay protection operation reliability evaluation method |
CN113128621A (en) * | 2021-05-12 | 2021-07-16 | 北京大学 | Data resource value evaluation report generation method and device |
CN113128911A (en) * | 2021-05-12 | 2021-07-16 | 北京大学 | Online evaluation method and device for data resource value |
CN113128907A (en) * | 2021-05-12 | 2021-07-16 | 北京大学 | Patent value online evaluation method and system |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR20010035256A (en) * | 2001-01-29 | 2001-05-07 | 이귀영 | Method for appraisal of technology value by using internet web appraisal model |
CN108074115A (en) * | 2016-11-11 | 2018-05-25 | 上海文化广播影视集团有限公司 | A kind of TV programme copyright valve estimating system and its appraisal procedure |
CN108805422A (en) * | 2018-05-24 | 2018-11-13 | 国信优易数据有限公司 | A kind of data assessment model training systems, data assessment platform and method |
- 2020-04-16 CN CN202010298734.8A patent/CN111681022A/en active Pending
- 2021-04-14 NL NL2027964A patent/NL2027964B1/en active
Non-Patent Citations (2)
Title |
---|
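Ni Yuan et al.: "Research on data resource value evaluation in the network platform transaction environment based on an AGA-BP neural network" * |
Wang Ziyan et al.: "Research on a network platform patent value evaluation method based on grey relational analysis and random forest regression" * |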
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112668851A (en) * | 2020-12-21 | 2021-04-16 | 浙江弄潮儿智慧科技有限公司 | Method and system for determining biodiversity protection key area |
CN112668851B (en) * | 2020-12-21 | 2021-11-02 | 浙江弄潮儿智慧科技有限公司 | Method and system for determining biodiversity protection key area |
CN112686530A (en) * | 2020-12-28 | 2021-04-20 | 贵州电网有限责任公司 | Relay protection operation reliability evaluation method |
CN112686530B (en) * | 2020-12-28 | 2022-07-26 | 贵州电网有限责任公司 | Relay protection operation reliability evaluation method |
CN113128621A (en) * | 2021-05-12 | 2021-07-16 | 北京大学 | Data resource value evaluation report generation method and device |
CN113128911A (en) * | 2021-05-12 | 2021-07-16 | 北京大学 | Online evaluation method and device for data resource value |
CN113128907A (en) * | 2021-05-12 | 2021-07-16 | 北京大学 | Patent value online evaluation method and system |
Also Published As
Publication number | Publication date |
---|---|
NL2027964A (en) | 2021-10-25 |
NL2027964B1 (en) | 2022-06-21 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111681022A (en) | Network platform data resource value evaluation method | |
Piao et al. | Housing price prediction based on CNN | |
CN109960737B (en) | Remote sensing image content retrieval method for semi-supervised depth confrontation self-coding Hash learning | |
CN109685277A (en) | Electricity demand forecasting method and device | |
CN112735097A (en) | Regional landslide early warning method and system | |
CN115470962A (en) | LightGBM-based enterprise confidence loss risk prediction model construction method | |
CN113111924A (en) | Electric power customer classification method and device | |
CN116187835A (en) | Data-driven-based method and system for estimating theoretical line loss interval of transformer area | |
CN115983622A (en) | Risk early warning method of internal control cooperative management system | |
CN116883157A (en) | Small sample credit assessment method and system based on metric learning | |
CN116169670A (en) | Short-term non-resident load prediction method and system based on improved neural network | |
CN117370766A (en) | Satellite mission planning scheme evaluation method based on deep learning | |
CN115239502A (en) | Analyst simulation method, analyst simulation system, electronic device and storage medium | |
CN113762591B (en) | Short-term electric quantity prediction method and system based on GRU and multi-core SVM countermeasure learning | |
Sun | Real estate evaluation model based on genetic algorithm optimized neural network | |
CN113591947A (en) | Power data clustering method and device based on power consumption behaviors and storage medium | |
CN116956160A (en) | Data classification prediction method based on self-adaptive tree species algorithm | |
Li et al. | An improved genetic-XGBoost classifier for customer consumption behavior prediction | |
CN114529063A (en) | Financial field data prediction method, device and medium based on machine learning | |
Dai et al. | A Sales Forecast Method for Products with No Historical Data | |
CN115423091A (en) | Conditional antagonistic neural network training method, scene generation method and system | |
CN115098674A (en) | Method for generating confrontation network generation data based on cloud ERP supply chain ecosphere | |
CN114936701A (en) | Real-time monitoring method and device for comprehensive energy consumption and terminal equipment | |
CN114862007A (en) | Short-period gas production rate prediction method and system for carbonate gas well | |
CN114626594A (en) | Medium-and-long-term electric quantity prediction method based on cluster analysis and deep learning |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 20200918 |