CN111681022A - Network platform data resource value evaluation method - Google Patents

Info

Publication number
CN111681022A
Authority
CN
China
Prior art keywords
data
data resource
evaluation
value
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010298734.8A
Other languages
Chinese (zh)
Inventor
倪渊
杨露
李子峰
张健
高宇东
蔡功山
高霞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Information Science and Technology University
Original Assignee
Beijing Information Science and Technology University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Information Science and Technology University
Priority to CN202010298734.8A
Publication of CN111681022A
Priority to NL2027964A (NL2027964B1)
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201 Market modelling; Market analysis; Collecting market data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/067 Enterprise or organisation modelling
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0282 Rating or review of business operators or products

Landscapes

  • Business, Economics & Management (AREA)
  • Engineering & Computer Science (AREA)
  • Strategic Management (AREA)
  • Development Economics (AREA)
  • Accounting & Taxation (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Finance (AREA)
  • Economics (AREA)
  • Physics & Mathematics (AREA)
  • Marketing (AREA)
  • Game Theory and Decision Science (AREA)
  • General Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Human Resources & Organizations (AREA)
  • Educational Administration (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Tourism & Hospitality (AREA)
  • Data Mining & Analysis (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Complex Calculations (AREA)

Abstract

The invention relates to a method for evaluating the value of data resources on a network platform, comprising the following steps. Step 1: construct a data resource value evaluation index system for the network platform transaction environment. Step 2: determine the evaluation index weights using the entropy-corrected G1 method. Step 3: pre-screen the committed (already traded) data resources of the platform using grey correlation analysis, and select the committed data resources whose degree of correlation with the pre-evaluation data resource is greater than or equal to a threshold to form a model sample set T. Step 4: select a random forest regression (RFR) model as the base model for evaluating the data resource value of the network platform, and construct the data resource value evaluation model from the sample set T. Each evaluation index of the pre-evaluation data resource is input into the data resource value evaluation model, and the average of the output values of the regression trees is taken as the data resource value evaluation result. The method can significantly improve the accuracy of predicting data resource value, reduce the computation required by the RFR model, and improve training efficiency.

Description

Network platform data resource value evaluation method
Technical Field
The invention belongs to the field of resource value evaluation, and relates to a network platform data resource value evaluation method based on grey correlation analysis-random forest regression.
Background
In the era of data explosion, data no longer serves only to record and archive. Association analysis of multi-source, cross-domain data yields more complete knowledge and deeper insight and greatly strengthens the ability to predict the future, and it is increasingly recognized, and objectively demanded, that data resources circulate openly as tradable commodities. According to the "2018 China Big Data Development Report" issued by the State Information Center, the Chinese big data trading market was predicted to reach 731 billion yuan in 2020. Under the "Internet Plus" trend, network platforms have become an important transaction channel and medium, and data trading platforms such as Factual, BDEX, Data Plaza and the Guiyang Big Data Exchange have appeared in succession. As a non-standardized, emerging commodity, data resources offer data suppliers only limited market history and past transactions to refer to, and data demanders cannot gain the direct experience they would have with tangible goods. The value of a data resource is therefore uncertain on both the supply side and the demand side, which causes supply-demand mismatches on data network trading platforms and lowers both the data transaction completion rate and the data value activation rate. Accurately assessing the value of data resources is therefore the key to moving data resource trading from "unordered" to "normalized".
As data trading has developed, some data trading platforms have recognized the importance of data resource value evaluation and begun useful exploration. However, in China's existing data trading market, network trading platforms still rely mainly on subjective evaluation based on expert experience and adopt a pragmatic form of evaluation. Data resource value evaluation therefore suffers from low credibility and low transparency, can hardly provide an effective value reference for the parties to a data resource transaction, and leaves the transaction completion rate unsatisfactory. In existing theoretical research, data resource value evaluation methods include asset evaluation, multi-attribute comprehensive evaluation and economic approaches, but these are framed from the data owner's perspective and are far removed from the network platform transaction setting. Some researchers have also proposed evaluation with artificial intelligence methods, and building data resource value evaluation models with neural networks has become a research trend in the field, but such research is still scarce and lacks empirical testing.
Disclosure of Invention
Aiming at the defects in the prior art, the invention aims to provide a network platform data resource value evaluation method based on grey correlation analysis-random forest regression, in which a data resource value evaluation model based on Grey Correlation Analysis (GCA)-Random Forest Regression (RFR) is constructed.
In order to achieve the above purposes, the technical scheme adopted by the invention is as follows:
a method for evaluating the value of data resources of a network platform comprises the following steps:
Step 1: constructing a data resource value evaluation index system under a network platform transaction environment;
Step 2: determining the evaluation index weights based on the entropy-corrected G1 method;
Step 3: pre-screening the committed data resources of the platform based on the Grey Correlation Analysis (GCA) method; screening out the committed data resources whose degree of correlation with the pre-evaluation data resource is greater than or equal to a threshold to form a model sample set T;
Step 4: selecting a Random Forest Regression (RFR) model as the base model for evaluating the data resource value of the network platform, and constructing a data resource value evaluation model by using the sample set T; inputting each evaluation index of the pre-evaluation data resource into the data resource value evaluation model, and taking the average of the output values of the regression trees as the data resource value evaluation result of the data resource value evaluation model.
The step 1 specifically comprises the following steps: on the basis of a review of the factors that influence data resource value, comprehensively considering how frequently each influencing factor is used and the availability of the selected variables, selecting 7 factors as evaluation indexes from the dual perspectives of resources and assets, and constructing a data resource value evaluation index system under a network platform transaction environment;
the evaluation indexes comprise benefit-type indexes, cost-type indexes and standardized indexes; the benefit-type indexes comprise the data scale degree, the market attention degree and the data application degree; the cost-type indexes comprise the data freshness degree; and the standardized indexes comprise the data activity degree, the data exclusivity degree and the data certainty degree, which are already standardized data.
The step 2 specifically comprises the following steps: firstly, experts sort all evaluation indexes in the data resource value evaluation index system by degree of importance; secondly, the information entropy of each evaluation index is calculated by the entropy method; then the ratio of the degrees of importance of adjacent evaluation indexes is calculated; and finally, the weight of each evaluation index is calculated.
The information entropy is calculated as:
Figure BDA0002453192780000031
Wherein
Figure BDA0002453192780000032
wherein 1 ≤ i ≤ m and 1 ≤ j ≤ 7; when f_ij = 0, f_ij ln f_ij = 0; m is the number of committed data resources; x_ij is the j-th evaluation index of the i-th committed data resource; h_j denotes the information entropy of the j-th evaluation index; and y_ij denotes the normalized data,
Figure BDA0002453192780000033
the calculation formula of the importance degree ratio between the adjacent indexes is as follows:
Figure BDA0002453192780000041
wherein r_j denotes the ratio of the degrees of importance of the adjacent evaluation indexes x_{j-1} and x_j; h_{j-1} denotes the information entropy of the (j-1)-th evaluation index; and h_j denotes the information entropy of the j-th evaluation index.
The calculation formula of the evaluation index weight is as follows:
Figure BDA0002453192780000042
The step 3 specifically comprises the following steps: carrying out data standardization processing on the benefit-type indexes and cost-type indexes of the pre-evaluation data resource and the committed data resources; computing the absolute difference between the pre-evaluation data resource Z_0 and each committed data resource Z_i on the corresponding evaluation index; calculating the two-stage minimum difference and the two-stage maximum difference; computing the correlation coefficient between the pre-evaluation data resource Z_0 and the committed data resource Z_i on each evaluation index; calculating the degree of correlation between the pre-evaluation data resource and each committed data resource, and selecting the committed data resources with degree of correlation γ_i ≥ 0.8 to form a model sample set T.
The data standardization formulas for the benefit-type index y1_ij and the cost-type index y2_ij are:
Figure BDA0002453192780000043
Figure BDA0002453192780000044
the standardized indexes, the benefit-type indexes after data standardization and the cost-type indexes after data standardization are denoted D, and the absolute difference Z between the pre-evaluation data resource Z_0 and the committed data resource Z_i on the corresponding evaluation index is calculated as:
Z = |D_0(j) - D_i(j)|, j = 1, 2, …, 7 (6)
the two-stage minimum difference and the two-stage maximum difference are calculated as:
Z_1 = min_{1≤i≤m} min_{1≤j≤7} |D_0(j) - D_i(j)| (7)
Z_2 = max_{1≤i≤m} max_{1≤j≤7} |D_0(j) - D_i(j)| (8)
the correlation coefficient between the pre-evaluation data resource Z_0 and the committed data resource Z_i on the corresponding evaluation index is calculated as:
Figure BDA0002453192780000051
in the above formula, ρ is a resolution coefficient, and the value is 0.5.
The calculation formula of the association degree of the pre-evaluation data resource and the committed data resource is as follows:
Figure BDA0002453192780000052
in the above formula, w_j is the evaluation index weight.
The step 4 specifically comprises the following steps: setting the number K of regression trees; randomly drawing K training sample sets from the sample set T by the Bootstrap resampling method, the samples that are not drawn being called out-of-bag (OOB) data; randomly selecting A evaluation indexes, where 1 ≤ A ≤ 7, and training to generate an RFR model; performing error estimation on the RFR model with the OOB data as test samples; adjusting the value of the parameter K to establish a plurality of RFR models, calculating the generalization error of each model, and selecting the RFR model with the minimum generalization error as the final value evaluation model of the data resource; and inputting each evaluation index of the pre-evaluation data resource into the final value evaluation model, and taking the average of the output values of the regression trees as the data resource value evaluation result of the final value evaluation model.
The formula for calculating the average of the output values of the regression tree is:
Figure BDA0002453192780000053
in the formula, f_K represents the output value of each regression tree, and K represents the number of regression trees.
The invention has the beneficial effects that:
1. the method uses the Octopus data acquisition software to crawl transaction data for 10 classes of data resources from the website of a big data trading platform, and performs an empirical demonstration with real, quantifiable data, which effectively guarantees the validity and practicability of the evaluation model;
2. the correlation between the data resource value influencing factors and the data resource value is demonstrated, and all selected influencing factors are quantifiable indexes, which breaks the dilemma that data resource value evaluation indexes are subjective and hard to measure;
3. an intelligent method for judging the value of data resources from historical market transactions is provided, which is highly appealing and better suited to network platforms, where the volume of data resources is huge and the demanders are unknown;
4. compared with intelligent value evaluation methods based on parametric models such as neural networks and support vector machines, the RFR model only requires the number of regression trees to be set and thus has few parameters to tune; when the number of regression trees is large, the generalization error of the RFR tends to converge, so that overfitting does not occur;
5. compared with using the RFR model alone, the GCA-RFR model of the invention first preprocesses the committed data resource commodities of the platform with the grey correlation analysis method and screens out those whose index sequences are highly similar to that of the pre-evaluated data resource commodity to form the samples for training the RFR model; this fully exploits the RFR model's low demand for sample data, and not only significantly improves the accuracy of predicting data resource value but also reduces the computation of the RFR model and improves its training efficiency.
Drawings
The invention has the following drawings:
FIG. 1 is a flow chart of the method of the present invention.
FIG. 2 is a flow chart of RFR-based data resource value prediction.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings.
As shown in FIGS. 1 and 2, the method for evaluating the value of the data resource of the network platform according to the present invention includes the following steps:
Step 1: constructing a data resource value evaluation index system under a network platform transaction environment;
Step 2: determining the evaluation index weights based on the entropy-corrected G1 method;
Step 3: pre-screening the committed data resources of the platform based on the Grey Correlation Analysis (GCA) method; screening out the committed data resources whose degree of correlation with the pre-evaluation data resource is greater than or equal to a threshold to form a model sample set T;
Step 4: selecting a Random Forest Regression (RFR) model as the base model for evaluating the data resource value of the network platform, and constructing a data resource value evaluation model by using the sample set T; inputting each evaluation index of the pre-evaluation data resource into the data resource value evaluation model, and taking the average of the output values of the regression trees as the data resource value evaluation result of the data resource value evaluation model.
The step 1 specifically comprises the following steps: on the basis of a review of the factors that influence data resource value, how frequently each influencing factor is used and the availability of the selected variables are comprehensively considered, 7 factors are selected as evaluation indexes from the dual perspectives of resources and assets, and a data resource value evaluation index system under a network platform transaction environment is constructed.
The evaluation indexes comprise the data activity degree, data scale degree, data freshness degree, data exclusivity degree, data accuracy degree, market attention degree and data application degree.
Indexes positively correlated with the evaluation result are benefit-type indexes (data scale degree, market attention degree, data application degree), and indexes negatively correlated with the evaluation result are cost-type indexes (data freshness degree). The data activity degree, the data exclusivity degree and the data authority degree are already standardized data (0/1 variables).
The step 2 specifically comprises the following steps: firstly, experts sort the evaluation indexes x_j in the data resource value evaluation index system by degree of importance; secondly, the information entropy of each evaluation index is calculated by the entropy method; then the ratio of the degrees of importance of adjacent evaluation indexes is calculated; and finally, the weight of each evaluation index is calculated.
The information entropy is calculated as:
Figure BDA0002453192780000081
Wherein
Figure BDA0002453192780000082
wherein 1 ≤ i ≤ m and 1 ≤ j ≤ 7; when f_ij = 0, f_ij ln f_ij = 0; m is the number of committed data resources; x_ij is the j-th evaluation index of the i-th committed data resource; h_j denotes the information entropy of the j-th evaluation index; and y_ij denotes the normalized data,
Figure BDA0002453192780000083
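The formula images for the entropy calculation are not reproduced in this text. As an illustration only, the following minimal sketch assumes the standard entropy-method form, f_ij = y_ij / Σ_i y_ij and h_j = -(1 / ln m) Σ_i f_ij ln f_ij, which agrees with the symbol definitions above but is not confirmed by the source. The function name index_entropy and the sample values are hypothetical.

```python
import math


def index_entropy(column):
    """Information entropy h_j of one evaluation index over m committed resources.

    `column` holds the normalized, non-negative values y_ij for a fixed j.
    Assumed form (not confirmed by the source):
        f_ij = y_ij / sum_i y_ij
        h_j  = -(1 / ln m) * sum_i f_ij * ln f_ij
    """
    m = len(column)
    total = sum(column)
    if m < 2 or total == 0:
        raise ValueError("need at least two resources and a non-zero column sum")
    h = 0.0
    for y in column:
        f = y / total
        if f > 0:  # convention stated in the text: f ln f = 0 when f = 0
            h -= f * math.log(f)
    return h / math.log(m)


# Hypothetical values: an index spread evenly over four resources,
# and an index concentrated on a single resource.
uniform = index_entropy([0.25, 0.25, 0.25, 0.25])
concentrated = index_entropy([1.0, 0.0, 0.0, 0.0])
```

Under this assumed form the entropy lies between 0 and 1: an evenly spread index carries the highest entropy, and an index concentrated on one resource carries the lowest.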
the calculation formula of the importance degree ratio between the adjacent indexes is as follows:
Figure BDA0002453192780000084
wherein r_j denotes the ratio of the degrees of importance of the adjacent evaluation indexes x_{j-1} and x_j; h_{j-1} denotes the information entropy of the (j-1)-th evaluation index; and h_j denotes the information entropy of the j-th evaluation index.
The calculation formula of the evaluation index weight is as follows:
Figure BDA0002453192780000085
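The ratio and weight formulas are likewise not reproduced in this text. The sketch below shows one common form of an entropy-corrected G1 (order relation) calculation and should be read as an assumption rather than as the formulas of the invention: r_j = h_{j-1} / h_j when h_{j-1} ≥ h_j and r_j = 1 otherwise, w_n = 1 / (1 + Σ_{k=2..n} Π_{i=k..n} r_i), and w_{j-1} = r_j · w_j. The function name g1_weights and the entropy values are hypothetical.

```python
def g1_weights(entropies):
    """Weights for evaluation indexes already sorted from most to least important.

    Assumed formulas (not confirmed by the source):
        r_j     = h_{j-1} / h_j  if h_{j-1} >= h_j, else 1      (j = 2..n)
        w_n     = 1 / (1 + sum_{k=2..n} prod_{i=k..n} r_i)
        w_{j-1} = r_j * w_j
    """
    n = len(entropies)
    # ratios[j] (0-based) is the importance ratio between index j-1 and index j.
    ratios = [1.0] * n
    for j in range(1, n):
        prev, cur = entropies[j - 1], entropies[j]
        ratios[j] = prev / cur if (cur > 0 and prev >= cur) else 1.0

    # Accumulate the products r_k * ... * r_n, starting from the last index.
    running, total = 1.0, 0.0
    for k in range(n - 1, 0, -1):
        running *= ratios[k]
        total += running

    weights = [0.0] * n
    weights[-1] = 1.0 / (1.0 + total)
    # Propagate backwards: each more important index gets r_j times the next weight.
    for j in range(n - 1, 0, -1):
        weights[j - 1] = ratios[j] * weights[j]
    return weights


# Hypothetical entropies for four indexes in expert-ranked order.
weights = g1_weights([0.9, 0.6, 0.6, 0.3])
```

With this construction the weights sum to 1 and never increase along the expert ranking, so the experts' ordering is preserved while the entropy values set the size of each step.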
The step 3 specifically comprises the following steps: carrying out data standardization processing on the benefit-type indexes and cost-type indexes of the pre-evaluation data resource and the committed data resources; computing the absolute difference between the pre-evaluation data resource Z_0 and each committed data resource Z_i on the corresponding evaluation index; calculating the two-stage minimum difference and the two-stage maximum difference; computing the correlation coefficient between the pre-evaluation data resource Z_0 and the committed data resource Z_i on each evaluation index; calculating the degree of correlation between the pre-evaluation data resource and each committed data resource, and selecting the committed data resources with degree of correlation γ_i ≥ the threshold r (r is taken as 0.8 here) to form a model sample set T.
The data standardization formulas for the benefit-type index y1_ij and the cost-type index y2_ij are:
Figure BDA0002453192780000091
Figure BDA0002453192780000092
the data activity degree, the data exclusivity degree, the data accuracy degree, the benefit-type indexes after data standardization and the cost-type indexes after data standardization are denoted D, and the absolute difference Z between the pre-evaluation data resource Z_0 and the committed data resource Z_i on the corresponding evaluation index is calculated as:
Z = |D_0(j) - D_i(j)|, j = 1, 2, …, 7 (6)
the two-stage minimum difference and the two-stage maximum difference are calculated as:
Z_1 = min_{1≤i≤m} min_{1≤j≤7} |D_0(j) - D_i(j)| (7)
Z_2 = max_{1≤i≤m} max_{1≤j≤7} |D_0(j) - D_i(j)| (8)
the correlation coefficient between the pre-evaluation data resource Z_0 and the committed data resource Z_i on the corresponding evaluation index is calculated as:
Figure BDA0002453192780000093
in the above formula, ρ is the resolution coefficient, which lies between 0 and 1 and is taken as 0.5 here.
The calculation formula of the association degree of the pre-evaluation data resource and the committed data resource is as follows:
Figure BDA0002453192780000094
in the above formula, w_j is the evaluation index weight.
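The pre-screening of step 3 can be sketched as follows. Formulas (6) to (8) are taken from the text; the correlation coefficient and the degree of correlation are assumed to follow the usual grey relational form, (Z_1 + ρ·Z_2) / (|D_0(j) - D_i(j)| + ρ·Z_2) and γ_i = Σ_j w_j · coefficient_i(j), because the corresponding formula images are not reproduced here. The function names and the three-index sample data are hypothetical; the method itself uses seven indexes.

```python
def grey_relational_degrees(target, candidates, weights, rho=0.5):
    """Weighted degree of correlation of each committed resource to the target.

    target     : standardized index values D_0(j) of the pre-evaluation resource
    candidates : rows of standardized index values D_i(j) of committed resources
    weights    : evaluation index weights w_j
    The coefficient and degree formulas are assumptions (standard grey
    relational analysis), not confirmed by the source.
    """
    # Formula (6): absolute difference on every index.
    diffs = [[abs(t - c) for t, c in zip(target, row)] for row in candidates]
    z1 = min(min(row) for row in diffs)  # formula (7), two-stage minimum
    z2 = max(max(row) for row in diffs)  # formula (8), two-stage maximum
    if z2 == 0:
        # Every committed resource is identical to the target.
        return [sum(weights)] * len(candidates)
    return [
        sum(w * (z1 + rho * z2) / (d + rho * z2) for w, d in zip(weights, row))
        for row in diffs
    ]


def screen(target, candidates, weights, threshold=0.8):
    """Positions of committed resources whose degree of correlation >= threshold."""
    degrees = grey_relational_degrees(target, candidates, weights)
    return [i for i, g in enumerate(degrees) if g >= threshold]


# Hypothetical standardized data with three indexes.
target = [1.0, 0.5, 0.0]
candidates = [[1.0, 0.5, 0.0], [0.9, 0.5, 0.1], [0.0, 1.0, 1.0]]
weights = [0.5, 0.3, 0.2]
degrees = grey_relational_degrees(target, candidates, weights)
selected = screen(target, candidates, weights)
```

The rows returned by screen would form the model sample set T; rows that differ strongly from the pre-evaluation data resource fall below the 0.8 threshold and are excluded from training.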
The step 4 specifically comprises the following steps: setting the number K of regression trees; randomly drawing K training sample sets t_1, t_2, …, t_K from the sample set T by the Bootstrap resampling method, the samples that are not drawn being called out-of-bag (OOB) data. Randomly selecting A (1 ≤ A ≤ 7) evaluation indexes, and training to generate an RFR model; performing error estimation on the RFR model with the OOB data as test samples; adjusting the value of the parameter K to establish a plurality of RFR models, calculating the generalization error of each model, and selecting the RFR model with the minimum generalization error as the final value evaluation model of the data resource; and inputting each evaluation index of the pre-evaluation data resource into the final value evaluation model, and taking the average of the output values of the regression trees as the data resource value evaluation result of the final value evaluation model.
When the error estimation is carried out on the RFR model, the OOB data outside the bag is used as a test sample, and cross validation or other separate test sample sets are not needed.
Each evaluation index of the pre-evaluation data resource is input, and the average of the output values of the regression trees is taken as the data resource value evaluation result of the final value evaluation model, wherein the calculation formula is as follows:
Figure BDA0002453192780000101
in the above formula, fKThe output value of each regression tree is represented, and K represents the number of regression trees.
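The procedure of step 4 can be illustrated with a small, self-contained sketch. It is not the RFR implementation of the invention: for brevity each regression tree is replaced by a depth-1 tree (a single split), the A evaluation indexes are drawn once per tree, and the data are hypothetical. It does show the parts described above: Bootstrap sampling, out-of-bag error estimation, selecting K by the smallest estimated error, and averaging the tree outputs.

```python
import random


def fit_stump(rows, targets, features):
    """Depth-1 regression tree: best single split over the given feature subset."""
    mean_all = sum(targets) / len(targets)
    best = (float("inf"), None, None, mean_all, mean_all)
    for f in features:
        for threshold in sorted({r[f] for r in rows}):
            left = [t for r, t in zip(rows, targets) if r[f] <= threshold]
            right = [t for r, t in zip(rows, targets) if r[f] > threshold]
            if not left or not right:
                continue
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            sse = sum((t - lm) ** 2 for t in left) + sum((t - rm) ** 2 for t in right)
            if sse < best[0]:
                best = (sse, f, threshold, lm, rm)
    _, f, threshold, lm, rm = best
    if f is None:  # no usable split: predict the sample mean
        return lambda row: mean_all
    return lambda row: lm if row[f] <= threshold else rm


def fit_forest(rows, targets, n_trees, n_features, seed=0):
    """Train n_trees stumps on bootstrap samples; return them with the OOB error."""
    rng = random.Random(seed)
    n = len(rows)
    trees, oob_sum, oob_count = [], [0.0] * n, [0] * n
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]            # bootstrap sample t_k
        feats = rng.sample(range(len(rows[0])), n_features)   # A random indexes
        tree = fit_stump([rows[i] for i in idx], [targets[i] for i in idx], feats)
        trees.append(tree)
        for i in set(range(n)) - set(idx):                    # out-of-bag rows
            oob_sum[i] += tree(rows[i])
            oob_count[i] += 1
    errs = [(oob_sum[i] / oob_count[i] - targets[i]) ** 2
            for i in range(n) if oob_count[i]]
    oob_mse = sum(errs) / len(errs) if errs else float("nan")
    return trees, oob_mse


def predict(trees, row):
    """Average of the output values of all trees."""
    return sum(tree(row) for tree in trees) / len(trees)


# Hypothetical sample set T: two indexes, value jumps from 2 to 10.
rows = [[i / 10, (i / 10) ** 2] for i in range(10)]
targets = [2.0 if r[0] <= 0.4 else 10.0 for r in rows]

# Adjust K, estimate each model's error on OOB data, keep the best model.
candidate_k = (5, 15, 25)
fitted = {k: fit_forest(rows, targets, k, n_features=1, seed=k) for k in candidate_k}
best_k = min(fitted, key=lambda k: fitted[k][1])
forest, oob_mse = fitted[best_k]
value = predict(forest, [0.9, 0.81])
```

Because the out-of-bag rows are never seen by the tree that scores them, the OOB error serves as the test estimate without a separate validation set, which is the property the description relies on.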
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like that are within the form and principle of the present invention are intended to be included within the scope of the present invention.
Those not described in detail in this specification are within the skill of the art.

Claims (10)

1. A method for evaluating the value of data resources of a network platform is characterized by comprising the following steps:
Step 1: constructing a data resource value evaluation index system under a network platform transaction environment;
Step 2: determining the evaluation index weights based on the entropy-corrected G1 method;
Step 3: pre-screening the committed data resources of the platform based on a grey correlation analysis method; screening out the committed data resources whose degree of correlation with the pre-evaluation data resource is greater than or equal to a threshold to form a model sample set T;
Step 4: selecting a random forest regression model as the base model for evaluating the data resource value of the network platform, and constructing a data resource value evaluation model by using the sample set T; and inputting each evaluation index of the pre-evaluation data resource into the data resource value evaluation model, and taking the average of the output values of the regression trees as the data resource value evaluation result of the data resource value evaluation model.
2. The method for evaluating the value of a data resource on a network platform according to claim 1, wherein: the step 1 specifically comprises the following steps: on the basis of a review of the factors that influence data resource value, comprehensively considering how frequently each influencing factor is used and the availability of the selected variables, selecting 7 factors as evaluation indexes from the dual perspectives of resources and assets, and constructing a data resource value evaluation index system under a network platform transaction environment;
the evaluation indexes comprise benefit-type indexes, cost-type indexes and standardized indexes; the benefit-type indexes comprise the data scale degree, the market attention degree and the data application degree; the cost-type indexes comprise the data freshness degree; and the standardized indexes comprise the data activity degree, the data exclusivity degree and the data certainty degree, which are already standardized data.
3. The method for evaluating the value of the data resource of the network platform according to claim 2, wherein the step 2 is specifically as follows: firstly, experts sort all evaluation indexes in the data resource value evaluation index system by degree of importance; secondly, the information entropy of each evaluation index is calculated by the entropy method; then the ratio of the degrees of importance of adjacent evaluation indexes is calculated; and finally, the weight of each evaluation index is calculated.
4. The method for evaluating the value of the data resource of the network platform according to claim 3, wherein the information entropy is calculated as:
Figure FDA0002453192770000021
wherein
Figure FDA0002453192770000022
wherein 1 ≤ i ≤ m and 1 ≤ j ≤ 7; when f_ij = 0, f_ij ln f_ij = 0; m is the number of committed data resources; x_ij is the j-th evaluation index of the i-th committed data resource; h_j denotes the information entropy of the j-th evaluation index; and y_ij denotes the normalized data,
Figure FDA0002453192770000023
the calculation formula of the importance degree ratio between the adjacent indexes is as follows:
Figure FDA0002453192770000024
wherein r_j denotes the ratio of the degrees of importance of the adjacent evaluation indexes x_{j-1} and x_j; h_{j-1} denotes the information entropy of the (j-1)-th evaluation index; and h_j denotes the information entropy of the j-th evaluation index.
5. The method for evaluating the value of the data resource of the network platform according to claim 4, wherein the calculation formula for evaluating the index weight is as follows:
Figure FDA0002453192770000025
6. The method for evaluating the value of the data resource of the network platform according to claim 5, wherein the step 3 specifically comprises: carrying out data standardization processing on the benefit-type indexes and cost-type indexes of the pre-evaluation data resource and the committed data resources; computing the absolute difference between the pre-evaluation data resource Z_0 and each committed data resource Z_i on the corresponding evaluation index; calculating the two-stage minimum difference and the two-stage maximum difference; computing the correlation coefficient between the pre-evaluation data resource Z_0 and the committed data resource Z_i on each evaluation index; calculating the degree of correlation between the pre-evaluation data resource and each committed data resource, and selecting the committed data resources with degree of correlation γ_i ≥ 0.8 to form a model sample set T.
7. The method of claim 6, wherein the data standardization formulas for the benefit-type index y1_ij and the cost-type index y2_ij are:
Figure FDA0002453192770000031
Figure FDA0002453192770000032
the standardized indexes, the benefit-type indexes after data standardization and the cost-type indexes after data standardization are denoted D, and the absolute difference Z between the pre-evaluation data resource Z_0 and the committed data resource Z_i on the corresponding evaluation index is calculated as:
Z = |D_0(j) - D_i(j)|, j = 1, 2, …, 7 (6)
the two-stage minimum difference and the two-stage maximum difference are calculated as:
Z_1 = min_{1≤i≤m} min_{1≤j≤7} |D_0(j) - D_i(j)| (7)
Z_2 = max_{1≤i≤m} max_{1≤j≤7} |D_0(j) - D_i(j)| (8)
the correlation coefficient between the pre-evaluation data resource Z_0 and the committed data resource Z_i on the corresponding evaluation index is calculated as:
Figure FDA0002453192770000033
in the above formula, ρ is a resolution coefficient, and the value is 0.5.
8. The method for evaluating the value of a data resource on a network platform according to claim 7, wherein the calculation formula of the degree of association between the pre-evaluated data resource and the committed data resource is as follows:
Figure FDA0002453192770000041
in the above formula, $w_j$ is the evaluation index weight.
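A minimal Python sketch of claim 8 together with the selection rule of claim 6 follows. The formula itself is a figure in the source, so the sketch assumes the correlation degree is the weighted sum of the per-index correlation coefficients, with weights that sum to 1. The weights and coefficients below are made-up values, and the function names are hypothetical.

```python
def correlation_degrees(coeffs, weights):
    """Assumed form: gamma_i = sum over j of w_j * xi_i(j)."""
    return [sum(w * x for w, x in zip(weights, row)) for row in coeffs]


def select_samples(degrees, threshold=0.8):
    """Indices of committed resources whose correlation degree is >= threshold."""
    return [i for i, g in enumerate(degrees) if g >= threshold]


# Hypothetical correlation coefficients for two committed resources on three indexes.
coeffs = [[1.0, 0.9, 0.8], [0.5, 0.4, 1.0]]
weights = [0.5, 0.3, 0.2]  # hypothetical evaluation index weights

degrees = correlation_degrees(coeffs, weights)
sample_set = select_samples(degrees)
```

Here only the first committed resource reaches the 0.8 threshold, so it alone would enter the model sample set T.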
9. The method for evaluating the value of the data resource of the network platform according to claim 1, wherein the step 4 specifically comprises: setting the number K of regression trees; randomly drawing K training sample sets from the sample set T by the Bootstrap resampling method, wherein the samples that are not drawn are called out-of-bag (OOB) data; randomly selecting A evaluation indexes, wherein 1 ≤ A ≤ 7, and training to generate an RFR model; performing error estimation on the RFR model by taking the OOB data as test samples; adjusting the value of the parameter K, establishing a plurality of RFR models, calculating the generalization error of each model, and selecting the RFR model with the minimum generalization error as the final value evaluation model of the data resource; and inputting each evaluation index of the pre-evaluation data resource into the final value evaluation model, and taking the average value of the output values of the regression trees as the data resource value evaluation result.
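The procedure in claim 9 can be illustrated with a deliberately small pure-Python sketch. Several simplifications are assumptions made here and are not taken from the patent: each "regression tree" is a depth-1 stump, the A indexes are drawn once per tree rather than per split, the generalization error is taken to be the OOB mean squared error, and all function names are hypothetical.

```python
import random


def fit_stump(rows, targets, features):
    """Fit a depth-1 regression tree restricted to the given feature indices."""
    best = None
    for f in features:
        # Candidate thresholds: every distinct value except the largest,
        # so both sides of the split are non-empty.
        for t in sorted({r[f] for r in rows})[:-1]:
            left = [y for r, y in zip(rows, targets) if r[f] <= t]
            right = [y for r, y in zip(rows, targets) if r[f] > t]
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            sse = sum((y - lm) ** 2 for y in left) + sum((y - rm) ** 2 for y in right)
            if best is None or sse < best[0]:
                best = (sse, f, t, lm, rm)
    if best is None:  # no usable split: predict the mean target
        mean = sum(targets) / len(targets)
        return lambda row: mean
    _, f, t, lm, rm = best
    return lambda row: lm if row[f] <= t else rm


def fit_forest(rows, targets, k, a, rng):
    """Train k stumps, each on a bootstrap sample and a random indexes."""
    n = len(rows)
    trees = []
    for _ in range(k):
        idx = [rng.randrange(n) for _ in range(n)]  # bootstrap resample
        oob = set(range(n)) - set(idx)              # samples never drawn
        feats = rng.sample(range(len(rows[0])), a)
        tree = fit_stump([rows[i] for i in idx], [targets[i] for i in idx], feats)
        trees.append((tree, oob))
    return trees


def oob_error(trees, rows, targets):
    """Mean squared error, predicting each sample only with trees that never saw it."""
    errors = []
    for i, (row, y) in enumerate(zip(rows, targets)):
        preds = [tree(row) for tree, oob in trees if i in oob]
        if preds:
            errors.append((sum(preds) / len(preds) - y) ** 2)
    return sum(errors) / len(errors) if errors else float("inf")


def predict(trees, row):
    """Average the outputs of all trees in the forest."""
    return sum(tree(row) for tree, _ in trees) / len(trees)


def select_forest(rows, targets, k_values, a, seed=0):
    """Try each K and keep the forest with the smallest OOB error."""
    best = None
    for k in k_values:
        trees = fit_forest(rows, targets, k, a, random.Random(seed))
        err = oob_error(trees, rows, targets)
        if best is None or err < best[0]:
            best = (err, k, trees)
    return best
```

A call such as `select_forest(rows, targets, [50, 100, 200], a=3)` returns the OOB error, the chosen K, and the trained trees; `predict` then gives the value estimate for the pre-evaluation data resource. A production system would more likely rely on an established random forest library than on this stump-based stand-in.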
10. The network platform data resource value evaluation method of claim 9, wherein: the calculation formula of the average value of the output values of the regression tree is as follows:
Figure FDA0002453192770000042
in the formula, $f_K$ denotes the output value of each regression tree, and K denotes the number of regression trees.
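The formula of claim 10 appears only as a figure in the source. Based on the claim wording, it is presumably the arithmetic mean of the tree outputs. The notation below is an assumption: the index $k$, the input $x$ (the evaluation indexes of the pre-evaluation data resource), and the symbol $\bar{f}$ are introduced here for illustration.

```latex
\bar{f}(x) = \frac{1}{K} \sum_{k=1}^{K} f_k(x)
```

For example, with $K = 3$ trees whose outputs are 10, 12, and 14, the evaluation result would be 12.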
CN202010298734.8A 2020-04-16 2020-04-16 Network platform data resource value evaluation method Pending CN111681022A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202010298734.8A CN111681022A (en) 2020-04-16 2020-04-16 Network platform data resource value evaluation method
NL2027964A NL2027964B1 (en) 2020-04-16 2021-04-14 Data resource valuation method for network platform

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010298734.8A CN111681022A (en) 2020-04-16 2020-04-16 Network platform data resource value evaluation method

Publications (1)

Publication Number Publication Date
CN111681022A true CN111681022A (en) 2020-09-18

Family

ID=72433321

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010298734.8A Pending CN111681022A (en) 2020-04-16 2020-04-16 Network platform data resource value evaluation method

Country Status (2)

Country Link
CN (1) CN111681022A (en)
NL (1) NL2027964B1 (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112668851A (en) * 2020-12-21 2021-04-16 浙江弄潮儿智慧科技有限公司 Method and system for determining biodiversity protection key area
CN112686530A (en) * 2020-12-28 2021-04-20 贵州电网有限责任公司 Relay protection operation reliability evaluation method
CN113128621A (en) * 2021-05-12 2021-07-16 北京大学 Data resource value evaluation report generation method and device
CN113128911A (en) * 2021-05-12 2021-07-16 北京大学 Online evaluation method and device for data resource value
CN113128907A (en) * 2021-05-12 2021-07-16 北京大学 Patent value online evaluation method and system

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20010035256A (en) * 2001-01-29 2001-05-07 이귀영 Method for appraisal of technology value by using internet web appraisal model
CN108074115A (en) * 2016-11-11 2018-05-25 上海文化广播影视集团有限公司 A kind of TV programme copyright valve estimating system and its appraisal procedure
CN108805422A (en) * 2018-05-24 2018-11-13 国信优易数据有限公司 A kind of data assessment model training systems, data assessment platform and method

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
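Ni Yuan et al.: "Research on data resource value evaluation in a network platform transaction environment based on an AGA-BP neural network" *
Wang Ziyan et al.: "Research on a network platform patent value evaluation method based on grey relational analysis and random forest regression" *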

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112668851A (en) * 2020-12-21 2021-04-16 浙江弄潮儿智慧科技有限公司 Method and system for determining biodiversity protection key area
CN112668851B (en) * 2020-12-21 2021-11-02 浙江弄潮儿智慧科技有限公司 Method and system for determining biodiversity protection key area
CN112686530A (en) * 2020-12-28 2021-04-20 贵州电网有限责任公司 Relay protection operation reliability evaluation method
CN112686530B (en) * 2020-12-28 2022-07-26 贵州电网有限责任公司 Relay protection operation reliability evaluation method
CN113128621A (en) * 2021-05-12 2021-07-16 北京大学 Data resource value evaluation report generation method and device
CN113128911A (en) * 2021-05-12 2021-07-16 北京大学 Online evaluation method and device for data resource value
CN113128907A (en) * 2021-05-12 2021-07-16 北京大学 Patent value online evaluation method and system

Also Published As

Publication number Publication date
NL2027964A (en) 2021-10-25
NL2027964B1 (en) 2022-06-21

Similar Documents

Publication Publication Date Title
CN111681022A (en) Network platform data resource value evaluation method
Piao et al. Housing price prediction based on CNN
CN109960737B (en) Remote sensing image content retrieval method for semi-supervised depth confrontation self-coding Hash learning
CN109685277A (en) Electricity demand forecasting method and device
CN112735097A (en) Regional landslide early warning method and system
CN115470962A (en) LightGBM-based enterprise confidence loss risk prediction model construction method
CN113111924A (en) Electric power customer classification method and device
CN116187835A (en) Data-driven-based method and system for estimating theoretical line loss interval of transformer area
CN115983622A (en) Risk early warning method of internal control cooperative management system
CN116883157A (en) Small sample credit assessment method and system based on metric learning
CN116169670A (en) Short-term non-resident load prediction method and system based on improved neural network
CN117370766A (en) Satellite mission planning scheme evaluation method based on deep learning
CN115239502A (en) Analyst simulation method, analyst simulation system, electronic device and storage medium
CN113762591B (en) Short-term electric quantity prediction method and system based on GRU and multi-core SVM countermeasure learning
Sun Real estate evaluation model based on genetic algorithm optimized neural network
CN113591947A (en) Power data clustering method and device based on power consumption behaviors and storage medium
CN116956160A (en) Data classification prediction method based on self-adaptive tree species algorithm
Li et al. An improved genetic-XGBoost classifier for customer consumption behavior prediction
CN114529063A (en) Financial field data prediction method, device and medium based on machine learning
Dai et al. A Sales Forecast Method for Products with No Historical Data
CN115423091A (en) Conditional antagonistic neural network training method, scene generation method and system
CN115098674A (en) Method for generating confrontation network generation data based on cloud ERP supply chain ecosphere
CN114936701A (en) Real-time monitoring method and device for comprehensive energy consumption and terminal equipment
CN114862007A (en) Short-period gas production rate prediction method and system for carbonate gas well
CN114626594A (en) Medium-and-long-term electric quantity prediction method based on cluster analysis and deep learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20200918