CN106355447A - Price evaluation method and system for data commodities - Google Patents

Price evaluation method and system for data commodities

Info

Publication number
CN106355447A
Authority
CN
China
Prior art keywords
data
commodity
price
metadata
data commodity
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201610791054.3A
Other languages
Chinese (zh)
Inventor
孙玉权
王肃
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guoxin Youe Data Co Ltd
Original Assignee
Guoxin Youe Data Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guoxin Youe Data Co Ltd
Priority to CN201610791054.3A
Publication of CN106355447A


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0283 Price estimation or determination

Landscapes

  • Business, Economics & Management (AREA)
  • Development Economics (AREA)
  • Engineering & Computer Science (AREA)
  • Accounting & Taxation (AREA)
  • Strategic Management (AREA)
  • Finance (AREA)
  • Game Theory and Decision Science (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a price evaluation method and system for data commodities. The price evaluation system comprises an identification module, a quality evaluation module and a price evaluation module. The method and system automate and standardize the data commodity price evaluation process and overcome the drawbacks of manual evaluation: heavy workload, inconsistent subjective evaluation standards and large differences in technical skill. Data commodities are priced under a single evaluation system, which ensures that their prices are objective and reasonable and facilitates smooth trading of data commodities.

Description

Price evaluation method and system for data commodities
Technical field
The present invention relates to the technical field of electronic commerce, and more particularly to a price evaluation method and system for data commodities.
Background
With the development of science and technology, society has fully entered the information age. In the information age, data can be traded as a commodity. However, because neither buyers nor sellers can effectively gauge the price of data, the interests of the seller or the buyer may be harmed.
Moreover, the prior art does not yet provide a method or system for evaluating the price of data commodities. For data commodities to have a price that can serve as the basis for a transaction in the market, their price needs to be evaluated.
Summary of the invention
An object of the present invention is to provide a price evaluation method and system for data commodities, so as to solve the problem that the price of data commodities currently cannot be evaluated accurately.
To solve this technical problem, the present invention adopts the following technical solution: a price evaluation system for data commodities, comprising:
an identification module, used to identify the type of the data commodity to be evaluated; when the data commodity to be evaluated is structured data, unstructured data or semi-structured data, its price is evaluated; when it is none of these, the price evaluation process for the data commodity is terminated;
a quality evaluation module, used to evaluate the quality of the data commodity;
a price evaluation module, which evaluates the price of the data commodity based on the indicators obtained by the quality evaluation module.
Optionally, the quality evaluation module includes:
a consistency evaluation unit, used to evaluate the consistency of the data commodity;
a validity evaluation unit, which evaluates the validity of the data commodity;
a repeatability evaluation unit, which evaluates the repeatability of the data in the data commodity;
a scarcity evaluation unit, which evaluates the scarcity of the data commodity;
a data volume evaluation unit, which evaluates the data volume of the data commodity;
a structuring degree evaluation unit, which calculates the overall degree of structuring of the data based on the proportions of unstructured, semi-structured and structured data in the data content.
Optionally, the price evaluation module includes:
a similar data commodity search unit, used to obtain similar data commodities in the data trading market;
a similar data commodity average price calculation unit, used to calculate the average price of these similar data commodities;
a standard deviation calculation unit, used to calculate the standard deviation of the similar data commodities;
a data commodity value evaluation unit, used to evaluate the value of the data commodity;
a data commodity price evaluation unit, used to evaluate the price of the data commodity.
Optionally, when similar data commodities exist, the price evaluation of the data commodity is completed using the following formula:
$$\gamma\left(\bar{p} + (f - 0.6)\,\sigma\right) < p < \gamma\left(\bar{p} + (f - 0.4)\,\sigma\right)$$
where $f$ is the value evaluation score, $\bar{p}$ is the average price of the similar data commodities, $\sigma$ is the standard deviation of the similar data commodities, and $\gamma$ is the user feedback correction factor, whose range is [0.9, 1] and whose value is 1 on first use;
when no similar data commodities exist, the price of the data product is evaluated by the following formula:
$$p = \hat{p}\,(1 + f)$$
where $\hat{p}$ is the cost price and $f$ is the value evaluation score.
To solve the technical problem, the present invention also adopts the following technical solution: a price evaluation method for data commodities, comprising the following steps:
S10, determining the type of the data commodity to be evaluated; when the data commodity to be evaluated is structured data, unstructured data or semi-structured data, its price is evaluated; when it is none of these, the price evaluation process for the data commodity is terminated;
S20, evaluating the quality of the data commodity; the quality evaluation includes consistency evaluation, validity evaluation, repeatability evaluation, scarcity evaluation, data volume evaluation and structuring degree evaluation;
S30, evaluating the price of the data commodity;
wherein step S30 specifically includes:
S3011, searching for similar data commodities; if similar data commodities exist, performing steps S3012, S3013, S3014 and S3015; if no similar data commodities exist, performing steps S3014 and S3016;
S3012, calculating the average price of the similar data commodities;
S3013, calculating the standard deviation of the similar data commodities;
S3014, evaluating the value of the data commodity;
S3015, evaluating the price of the data commodity according to the following formula:
$$\gamma\left(\bar{p} + (f - 0.6)\,\sigma\right) < p < \gamma\left(\bar{p} + (f - 0.4)\,\sigma\right)$$
where $f$ is the value evaluation score, $\bar{p}$ is the average price of the similar data commodities, $\sigma$ is the standard deviation of the similar data commodities, and $\gamma$ is the user feedback correction factor, whose range is [0.9, 1] and whose value is 1 on first use;
S3016, evaluating the price of the data commodity according to the following formula:
$$p = \hat{p}\,(1 + f)$$
where $\hat{p}$ is the cost price and $f$ is the value evaluation score; that is, the price evaluated by the cost method adds a certain amount of profit on top of the cost.
Optionally, in step S20, the consistency of the data commodity is evaluated by the following formula:
$$f_y = 1 - \frac{1}{3}\left(\frac{|l_a - l_m|}{\max(l_a, l_m)} + \frac{|s_a - s_m|}{\max(s_a, s_m)} + p\right), \quad p \in \{0, 1\}$$
where $f_y$ is the consistency score; $l_a$ is the actual data volume; $l_m$ is the data volume recorded in the metadata; $s_a$ is the actual data file size; $s_m$ is the file size recorded in the metadata; and $p$ represents data format consistency, judged by the file suffix: if the file suffix is identical to the data format name recorded in the metadata, $p$ is assigned 0, otherwise 1.
Optionally, in step S20, the validity of the data commodity is evaluated by the following formula:
$$h = \frac{\sum_{i=1}^{m}\sum_{j=1}^{n} a_{ij}}{N}$$
where $h$ is the validity score; $a_{ij}$ indicates whether the value in row $i$, column $j$ is valid, taking 1 if it is a valid value and 0 otherwise, so that $h$ is the proportion of valid data; $N$ is the total number of data values: if the data commodity has $m$ rows and $n$ columns, then $N = m \times n$, where $m$ and $n$ are natural numbers.
Optionally, in step S20, the repeatability of the data commodity is evaluated by the following formula:
$$f_c = 1 - \frac{\sum_{i=1}^{n} a_i}{N}$$
where $f_c$ is the repeatability score, $a_i$ is the number of times a given repeated record occurs, and $N$ is the total number of records. The range of $f_c$ is [0, 1]; the larger $f_c$ is, the lower the information repeatability and the higher the data value.
Optionally, in step S20, the scarcity of the data commodity is evaluated according to the following formula:
$$f_x = \frac{2e^{-y/x}}{1 + e^{-y/x}}$$
where $f_x$ is the scarcity score, $y$ is the data quantity of the similar data commodities present in the market, $x$ is the data quantity of the current data commodity, and $e$ is the base of the natural logarithm.
Optionally, in step S20, the data volume of the data commodity is evaluated according to the following formula:
$$f_s = 1 - \frac{|l_a - l_m|}{\max(l_a, l_m)}$$
where $f_s$ is the data volume score, $l_a$ is the actual data volume and $l_m$ is the data volume recorded in the metadata. The range of $f_s$ is [0, 1]; $f_s$ close to 0 indicates that the data volume is far smaller than the data volume in the metadata; $f_s$ equal to 1 indicates that the data volume matches the quantity stated in the metadata.
The present invention has the following beneficial effects: it automates and standardizes the data commodity price evaluation process and reduces the drawbacks of manual evaluation, namely heavy workload, inconsistent subjective evaluation standards and large differences in technical skill. Data commodities are priced under the same evaluation system, which ensures the objectivity and reasonableness of data commodity prices and facilitates smooth trading of data commodities.
Brief description of the drawings
Fig. 1 is a schematic structural diagram of the price evaluation system for data commodities of the present invention;
Fig. 2 is a flow chart of the price evaluation method for data commodities of the present invention.
Detailed description of the embodiments
The technical solution of the present invention is further described below with reference to the embodiments and the accompanying drawings.
Embodiment 1
This embodiment provides a price evaluation system for data commodities, comprising:
An identification module, used to identify the type of the data commodity to be evaluated. When the data commodity to be evaluated is structured data, unstructured data or semi-structured data, its price is evaluated; when it is none of these, the price evaluation process for this data commodity is terminated.
A quality evaluation module, used to evaluate the quality of the data commodity. In this embodiment, the quality evaluation module includes:
A consistency evaluation unit, used to evaluate the consistency of the data commodity, i.e. whether the actual data is consistent with the promised data. The promised data is generally described by metadata. Metadata is data about data; it records the attributes of the data commodity, such as size, record count, file format, time and author. In this embodiment, the consistency of the data commodity is evaluated by the following formula:
$$f_y = 1 - \frac{1}{3}\left(\frac{|l_a - l_m|}{\max(l_a, l_m)} + \frac{|s_a - s_m|}{\max(s_a, s_m)} + p\right), \quad p \in \{0, 1\}$$
where $f_y$ is the consistency score; $l_a$ is the actual data volume; $l_m$ is the data volume recorded in the metadata; $s_a$ is the actual data file size; $s_m$ is the file size recorded in the metadata; and $p$ represents data format consistency, judged by the file suffix: if the file suffix is identical to the data format name recorded in the metadata, $p$ is assigned 0, otherwise 1.
In the result, the range of $f_y$ is [0, 1]; the larger the value, the better the consistency and the higher the data value.
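The consistency formula can be sketched in Python as follows. This is a minimal illustration, not the patent's implementation: the function and parameter names are hypothetical, and treating the relative gap as 0 when both quantities are 0 is an added assumption to avoid dividing by zero.

```python
def consistency_score(actual_count, meta_count, actual_size, meta_size,
                      actual_suffix, meta_format):
    """Consistency score f_y = 1 - (1/3) * (count gap + size gap + p)."""

    def relative_gap(a, b):
        # |a - b| / max(a, b); taken as 0 when both are 0 (assumption).
        largest = max(a, b)
        return abs(a - b) / largest if largest > 0 else 0.0

    # p = 0 when the file suffix matches the format recorded in the
    # metadata, otherwise p = 1 (comparison ignores case and a leading dot).
    same_format = (actual_suffix.lower().lstrip(".")
                   == meta_format.lower().lstrip("."))
    p = 0 if same_format else 1

    return 1 - (relative_gap(actual_count, meta_count)
                + relative_gap(actual_size, meta_size)
                + p) / 3
```

A commodity that delivers exactly the promised record count, file size and format scores 1; each deviation lowers the score by at most one third.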
A validity evaluation unit, which evaluates the validity of the data commodity. In this embodiment, validity is obtained by calculating the proportion of valid data in the data volume, with the formula:
$$h = \frac{\sum_{i=1}^{m}\sum_{j=1}^{n} a_{ij}}{N}$$
where $h$ is the validity score; $a_{ij}$ indicates whether the value in row $i$, column $j$ is valid, taking 1 if it is a valid value and 0 otherwise, so that $h$ is the proportion of valid data; $N$ is the total number of data values: if the data commodity has $m$ rows and $n$ columns, then $N = m \times n$, where $m$ and $n$ are natural numbers.
The range of $h$ is [0, 1]; the larger $h$ is, the better the data validity.
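A minimal Python sketch of the validity score follows. It counts a cell as 1 when it is valid, so that h is the share of valid values and a larger h is better, as the text states. The validity test itself is not defined in the source; the default `is_valid` below (non-empty, non-`None`) is a hypothetical placeholder, and returning 0 for an empty table is an added assumption.

```python
def validity_score(rows, is_valid=None):
    """Validity score h: share of valid cells among all N cells."""
    if is_valid is None:
        # Hypothetical default rule: a cell is valid unless it is None
        # or an empty/whitespace-only string.
        def is_valid(cell):
            if cell is None:
                return False
            if isinstance(cell, str) and cell.strip() == "":
                return False
            return True

    total = sum(len(row) for row in rows)  # N = m * n for a rectangular table
    if total == 0:
        return 0.0  # assumption: an empty table has no valid data
    valid = sum(1 for row in rows for cell in row if is_valid(cell))
    return valid / total
```

For example, a 2 x 2 table with one `None` and one empty string has two valid cells out of four, giving h = 0.5.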
A repeatability evaluation unit, which evaluates the repeatability of the data in the data commodity. The higher the information repeatability, the lower the data value; it is calculated by the following formula:
$$f_c = 1 - \frac{\sum_{i=1}^{n} a_i}{N}$$
where $f_c$ is the repeatability score, $a_i$ is the number of times a given repeated record occurs, and $N$ is the total number of records. The range of $f_c$ is [0, 1]; the larger $f_c$ is, the lower the information repeatability and the higher the data value.
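The repeatability score can be sketched as below. The source does not say whether the count for a repeated record includes its first occurrence; this sketch assumes it counts every occurrence of any record that appears more than once, which keeps the score within [0, 1]. The function name and the choice of 1.0 for an empty record list are also assumptions.

```python
from collections import Counter


def repeatability_score(records):
    """Repeatability score f_c = 1 - (sum of occurrence counts of repeated records) / N."""
    total = len(records)
    if total == 0:
        return 1.0  # assumption: no records means nothing is repeated

    counts = Counter(records)  # records must be hashable, e.g. tuples
    # a_i: occurrences of each record that appears more than once (assumed reading).
    repeated = sum(c for c in counts.values() if c > 1)
    return 1 - repeated / total
```

With four records of which two are identical, the repeated occurrences sum to 2 and the score is 0.5.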
A scarcity evaluation unit, which evaluates the scarcity of the data commodity. The more data of the same kind there is, the lower the scarcity; the less there is, the higher the scarcity. Scarcity is calculated by the following formula:
$$f_x = \frac{2e^{-y/x}}{1 + e^{-y/x}}$$
where $f_x$ is the scarcity score, $y$ is the data quantity of the similar data commodities present in the market, $x$ is the data quantity of the current data commodity, and $e$ is the base of the natural logarithm.
Calculating data scarcity requires obtaining the data commodity catalog of the current data trading market, including the number of data commodities and their field names. For the comparison, the field names of the current data are extracted and compared in turn against the field names of each data commodity, and the text similarity is calculated. If the similarity is higher than a certain threshold, they are regarded as similar data. The quantity of similar data is then used to calculate the scarcity of the current data. Text similarity can be calculated using approaches from the prior art, which are not described again here.
The range of $f_x$ is [0, 1]; $f_x$ close to 0 indicates that the data is not very scarce; $f_x$ equal to 1 indicates that the current data trading market has no similar data.
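A short Python sketch of the scarcity formula, with hypothetical function and parameter names. Rejecting a non-positive quantity for the current commodity is an added guard, not something the source specifies.

```python
import math


def scarcity_score(similar_quantity, current_quantity):
    """Scarcity score f_x = 2 * e^(-y/x) / (1 + e^(-y/x))."""
    if current_quantity <= 0:
        raise ValueError("current_quantity (x) must be positive")
    if similar_quantity < 0:
        raise ValueError("similar_quantity (y) must not be negative")
    t = math.exp(-similar_quantity / current_quantity)
    return 2 * t / (1 + t)
```

When the market holds no similar data (y = 0) the score is exactly 1, and it falls toward 0 as y grows relative to x.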
A data volume evaluation unit, which evaluates the data volume of the data commodity. The data volume indicator is calculated from the difference between the actual data volume and the data volume recorded in the metadata, with the formula:
$$f_s = 1 - \frac{|l_a - l_m|}{\max(l_a, l_m)}$$
where $f_s$ is the data volume score, $l_a$ is the actual data volume and $l_m$ is the data volume recorded in the metadata. The range of $f_s$ is [0, 1]; $f_s$ close to 0 indicates that the data volume is far smaller than the data volume in the metadata; $f_s$ equal to 1 indicates that the data volume matches the quantity stated in the metadata.
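The data volume score is simple enough to state in a few lines of Python. The names are hypothetical, and returning 1.0 when both counts are zero is an assumption made to avoid a division by zero.

```python
def volume_score(actual_count, meta_count):
    """Data volume score f_s = 1 - |l_a - l_m| / max(l_a, l_m)."""
    largest = max(actual_count, meta_count)
    if largest == 0:
        return 1.0  # assumption: 0 promised and 0 delivered counts as a match
    return 1 - abs(actual_count - meta_count) / largest
```

Delivering 250 records against a promised 1000 gives a score of 0.25; delivering exactly the promised number gives 1.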
A structuring degree evaluation unit, which calculates the overall degree of structuring of the data based on the proportions of unstructured, semi-structured and structured data in the data content. In this embodiment, the structuring degree is calculated by the following formula: $f_j = 0 \times q + 0.5 \times p + 1 \times h$;
where $f_j$ is the structuring degree score, $q$ is the proportion of unstructured data, $p$ is the proportion of semi-structured data and $h$ is the proportion of structured data, with $p + q + h = 1$.
The range of $f_j$ is [0, 1]; the smaller $f_j$ is, the less structured data there is; conversely, structured data predominates.
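As a sketch, the structuring degree can be computed from raw amounts of each kind of data, normalizing them into the three proportions first. The source does not state the unit in which the proportions are measured (records, files or bytes), so the inputs below are simply assumed to share one unit; the function name is hypothetical.

```python
def structure_score(unstructured, semi_structured, structured):
    """Structuring degree f_j = 0*q + 0.5*p + 1*h, with q + p + h = 1."""
    total = unstructured + semi_structured + structured
    if total <= 0:
        raise ValueError("at least one amount must be positive")
    q = unstructured / total      # proportion of unstructured data
    p = semi_structured / total   # proportion of semi-structured data
    h = structured / total        # proportion of structured data
    return 0 * q + 0.5 * p + 1 * h
```

Purely structured content scores 1, purely unstructured content scores 0, and purely semi-structured content scores 0.5.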
A price evaluation module, used to evaluate the price of the data commodity. In this embodiment, to evaluate the price of the data commodity, the price evaluation module includes:
A similar data commodity search unit, used to obtain similar data commodities in the data trading market. In this embodiment, the data catalog of the data trading market is obtained, text similarity is calculated to find data commodities similar to the current data commodity, and the information of these data commodities is extracted, including price, record count, size, field names and data commodity name.
Text similarity is computed with the cosine formula: all texts are first segmented into Chinese words to obtain a term-document matrix; the cosine formula is then used to compute the pairwise similarity between texts; texts whose similarity to the current word set is high are regarded as similar data commodities of the current data commodity.
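The cosine-similarity search can be sketched as follows. This is an illustration under stated assumptions: the texts are taken to be already tokenized (the Chinese word segmentation step is not shown), term counts stand in for the rows of the term-document matrix, and the `threshold` value of 0.5 and all names are hypothetical, since the source only says "a certain threshold".

```python
import math
from collections import Counter


def cosine_similarity(tokens_a, tokens_b):
    """Cosine of the angle between the term-count vectors of two token lists."""
    va, vb = Counter(tokens_a), Counter(tokens_b)
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text is treated as similar to nothing
    return dot / (norm_a * norm_b)


def find_similar(current_tokens, catalog, threshold=0.5):
    """Return names of catalog entries whose similarity reaches the threshold.

    catalog maps a commodity name to its token list (e.g. field names).
    """
    return [name for name, tokens in catalog.items()
            if cosine_similarity(current_tokens, tokens) >= threshold]
```

For instance, a commodity with fields `user_id`, `age`, `city` matches a catalog entry that has those three fields plus `income` (similarity about 0.87) but not one with unrelated fields.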
A similar data commodity average price calculation unit, used to calculate the average price of these similar data commodities. In this embodiment, the average price and data volume of each of these data commodities are calculated first. Using data volume as the weight, the weighted mean of these prices is then calculated. Assuming there are z similar data commodities, their average price $\bar{p}$ is this data-volume-weighted mean,
where $p_i$ is the price of each data commodity and $n_i$ is the number of records in each data commodity. This average price is the anchor of the current data price and essentially determines the order of magnitude of the price of the current data commodity.
A standard deviation calculation unit, used to calculate the standard deviation of the similar data commodities. The standard deviation of the data price is calculated from the average price of each data commodity, i.e. its price per record, and is used to measure the fluctuation range of the data price. The standard deviation $\sigma$ of these z data commodities is computed from the following quantities:
$p_i$, the price of each data commodity; $\bar{p}$, the average price of the data commodities; and $n_i$, the number of records in each data commodity.
A data commodity value evaluation unit, used to evaluate the value of the data commodity. In this embodiment, weights are first set for the consistency, validity, repeatability, scarcity, data volume and structuring degree indicators of the data commodity; the weight of each indicator is given in Table 1:
Table 1: weight of each indicator
It is generally believed that consistency, validity and scarcity have the greatest influence on the price of data. A high consistency score indicates that the data matches what was promised and is highly credible; validity indicates how usable the data actually is; scarcity indicates whether the source of the data is rare and valuable.
The size of the data volume is the factor that determines whether price accumulates: if each unit of data has similar value, then the larger the data volume, the higher the value of the data product. Repeatability has less influence on data value; it only needs to be considered when pricing by data volume, to decide whether repeated records should be counted toward the price. The degree of structuring indirectly reflects how easy the data is to use; in general, structured data is easier to operate on.
In this embodiment, the composite score of the data commodity is calculated according to the following formula:
$$f = \sum_{i=1}^{6} \alpha_i w_i$$
where $\alpha_i$ is the score of each indicator, i.e. the scores of the consistency, validity, repeatability, scarcity, data volume and structuring degree indicators, and $w_i$ is the weight of each indicator.
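The weighted sum can be sketched in Python as below. The weights of Table 1 are not reproduced in this text, so `EXAMPLE_WEIGHTS` holds placeholder values chosen only to reflect the statement that consistency, validity and scarcity matter most; they are not the patent's weights. Requiring the weights to sum to 1 is also an assumption, made so that the composite score stays between 0 and 1.

```python
# Placeholder weights for illustration only; NOT the values of Table 1.
EXAMPLE_WEIGHTS = {
    "consistency": 0.25,
    "validity": 0.25,
    "scarcity": 0.20,
    "data_volume": 0.10,
    "repeatability": 0.10,
    "structure": 0.10,
}


def composite_score(scores, weights):
    """Composite value score f = sum over indicators of score_i * weight_i."""
    if scores.keys() != weights.keys():
        raise ValueError("scores and weights must cover the same indicators")
    # Assumption: weights sum to 1 so that f stays within [0, 1].
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(scores[name] * weights[name] for name in scores)
```

A call such as `composite_score(indicator_scores, EXAMPLE_WEIGHTS)` then yields the value score f used in the pricing formulas, where `indicator_scores` is a dictionary with the same six keys.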
A data commodity price evaluation unit, used to evaluate the price of the data commodity. In this embodiment, the price evaluation of the data commodity can be completed using the following formula:
$$\gamma\left(\bar{p} + (f - 0.6)\,\sigma\right) < p < \gamma\left(\bar{p} + (f - 0.4)\,\sigma\right)$$
where $f$ is the value evaluation score, $\bar{p}$ is the average price of the similar data commodities, $\sigma$ is the standard deviation of the similar data commodities, and $\gamma$ is the user feedback correction factor, whose range is [0.9, 1] and whose value is 1 on first use.
In addition, in this embodiment, when no similar data commodities can be found online, the price of the data product can be evaluated by the following formula:
$$p = \hat{p}\,(1 + f)$$
where $\hat{p}$ is the cost price and $f$ is the value evaluation score of the data commodity; that is, the price evaluated by the cost method adds a certain amount of profit on top of the cost.
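The two pricing branches can be sketched together in Python. Function names and the `similar_stats` tuple are hypothetical; the average price and standard deviation of the similar commodities are taken as given inputs rather than computed here.

```python
def market_price_range(f, mean_price, std_dev, gamma=1.0):
    """Market method: open interval (lower, upper) for the price p."""
    if not 0.9 <= gamma <= 1.0:
        raise ValueError("gamma must lie in [0.9, 1]")
    lower = gamma * (mean_price + (f - 0.6) * std_dev)
    upper = gamma * (mean_price + (f - 0.4) * std_dev)
    return lower, upper


def cost_price(f, cost):
    """Cost method: cost plus a profit share equal to the value score f."""
    return cost * (1 + f)


def evaluate_price(f, cost, similar_stats=None, gamma=1.0):
    """Use the market method when similar commodities exist, else the cost method.

    similar_stats is a (mean_price, std_dev) tuple, or None when no similar
    data commodities were found.
    """
    if similar_stats is not None:
        mean_price, std_dev = similar_stats
        return market_price_range(f, mean_price, std_dev, gamma)
    return cost_price(f, cost)
```

The market branch returns a range rather than a single figure, mirroring the inequality in the formula; gamma defaults to 1, its value on first use.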
Embodiment 2
This embodiment provides a price evaluation method for data commodities, comprising the following steps:
S10, determining the type of the data commodity to be evaluated
In this embodiment, the price evaluation method does not apply to all data commodities; it applies only to structured data, unstructured data and semi-structured data. When the data commodity to be evaluated belongs to one of these data types, its price is evaluated; otherwise, the price evaluation process for this data commodity is terminated.
Here, structured data refers to data stored in a database that can be logically expressed with a two-dimensional table structure; unstructured data refers to data without a fixed structure, including but not limited to office documents, text, pictures, reports of various kinds, images, and audio and video data; semi-structured data refers to data that has an implicit structure but does not exist in a form such as a two-dimensional table, lying between structured and unstructured data, including but not limited to data in formats such as XML, HTML and JSON.
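The gate in step S10 could be sketched as below. The source does not say how the type of a data commodity is detected, so classifying by file extension, the extension table itself and all names are hypothetical choices made purely for illustration.

```python
from enum import Enum
from pathlib import PurePath
from typing import Optional


class DataKind(Enum):
    STRUCTURED = "structured"
    SEMI_STRUCTURED = "semi-structured"
    UNSTRUCTURED = "unstructured"


# Hypothetical mapping; the patent does not specify a detection rule.
EXTENSION_KINDS = {
    ".csv": DataKind.STRUCTURED,
    ".xml": DataKind.SEMI_STRUCTURED,
    ".html": DataKind.SEMI_STRUCTURED,
    ".json": DataKind.SEMI_STRUCTURED,
    ".docx": DataKind.UNSTRUCTURED,
    ".txt": DataKind.UNSTRUCTURED,
    ".jpg": DataKind.UNSTRUCTURED,
    ".mp3": DataKind.UNSTRUCTURED,
    ".mp4": DataKind.UNSTRUCTURED,
}


def classify(filename: str) -> Optional[DataKind]:
    """Return the data kind for a file name, or None if it is not recognized."""
    return EXTENSION_KINDS.get(PurePath(filename).suffix.lower())


def should_evaluate(filename: str) -> bool:
    """S10 gate: continue with price evaluation only for a recognized kind."""
    return classify(filename) is not None
```

A commodity whose type is not recognized returns `None` from `classify`, which corresponds to terminating the evaluation process.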
S20, evaluating the quality of the data commodity.
In this embodiment, the quality evaluation of the data commodity includes consistency evaluation, validity evaluation, repeatability evaluation, scarcity evaluation, data volume evaluation and structuring degree evaluation.
Specifically, the consistency evaluation determines whether the actual data is consistent with the promised data. The promised data is generally described by metadata. Metadata is data about data; it records the attributes of the data commodity, such as size, record count, file format, time and author.
In this embodiment, the consistency of the data commodity is evaluated by the following formula:
$$f_y = 1 - \frac{1}{3}\left(\frac{|l_a - l_m|}{\max(l_a, l_m)} + \frac{|s_a - s_m|}{\max(s_a, s_m)} + p\right), \quad p \in \{0, 1\}$$
where $f_y$ is the consistency score; $l_a$ is the actual data volume; $l_m$ is the data volume recorded in the metadata; $s_a$ is the actual data file size; $s_m$ is the file size recorded in the metadata; and $p$ represents data format consistency, judged by the file suffix: if the file suffix is identical to the data format name recorded in the metadata, $p$ is assigned 0, otherwise 1.
In the result, the range of $f_y$ is [0, 1]; the larger the value, the better the consistency and the higher the data value.
The validity evaluation, i.e. data validity, means that the data stored in the database should have practical meaning, and also that the data values in the data commodity are in a correct state. Its value is obtained by calculating the proportion of valid data in the data volume, with the formula:
$$h = \frac{\sum_{i=1}^{m}\sum_{j=1}^{n} a_{ij}}{N}$$
where $h$ is the validity score; $a_{ij}$ indicates whether the value in row $i$, column $j$ is valid, taking 1 if it is a valid value and 0 otherwise, so that $h$ is the proportion of valid data; $N$ is the total number of data values: if the data commodity has $m$ rows and $n$ columns, then $N = m \times n$, where $m$ and $n$ are natural numbers.
The range of $h$ is [0, 1]; the larger $h$ is, the better the data validity.
The repeatability evaluation, i.e. the information repeatability indicator, calculates the proportion of repeated data. The higher the information repeatability, the lower the data value; it is calculated by the following formula:
$$f_c = 1 - \frac{\sum_{i=1}^{n} a_i}{N}$$
where $f_c$ is the repeatability score, $a_i$ is the number of times a given repeated record occurs, and $N$ is the total number of records. The range of $f_c$ is [0, 1]; the larger $f_c$ is, the lower the information repeatability and the higher the data value.
In this embodiment, the scarcity evaluation calculates the degree of scarcity of the data according to the availability of data of the same kind. The more data of the same kind there is, the lower the scarcity; the less there is, the higher the scarcity. Scarcity is calculated by the following formula:
$$f_x = \frac{2e^{-y/x}}{1 + e^{-y/x}}$$
where $f_x$ is the scarcity score, $y$ is the data quantity of the similar data commodities present in the market, $x$ is the data quantity of the current data commodity, and $e$ is the base of the natural logarithm.
Calculating data scarcity requires obtaining the data commodity catalog of the current data trading market, including the number of data commodities and their field names. For the comparison, the field names of the current data are extracted and compared in turn against the field names of each data commodity, and the text similarity is calculated. If the similarity is higher than a certain threshold, they are regarded as similar data. The quantity of similar data is then used to calculate the scarcity of the current data. Text similarity can be calculated using approaches from the prior art, which are not described again here.
The range of $f_x$ is [0, 1]; $f_x$ close to 0 indicates that the data is not very scarce; $f_x$ equal to 1 indicates that the current data trading market has no similar data.
The data volume evaluation concerns the number of records in the data commodity, i.e. the total of its detail records. The data volume indicator is calculated from the difference between the actual data volume and the data volume recorded in the metadata, with the formula:
$$f_s = 1 - \frac{|l_a - l_m|}{\max(l_a, l_m)}$$
where $f_s$ is the data volume score, $l_a$ is the actual data volume and $l_m$ is the data volume recorded in the metadata. The range of $f_s$ is [0, 1]; $f_s$ close to 0 indicates that the data volume is far smaller than the data volume in the metadata; $f_s$ equal to 1 indicates that the data volume matches the quantity stated in the metadata.
The structuring degree evaluation calculates the overall degree of structuring of the data based on the proportions of unstructured, semi-structured and structured data in the data content. In this embodiment, the structuring degree is calculated by the following formula: $f_j = 0 \times q + 0.5 \times p + 1 \times h$;
where $f_j$ is the structuring degree score, $q$ is the proportion of unstructured data, $p$ is the proportion of semi-structured data and $h$ is the proportion of structured data, with $p + q + h = 1$.
The range of $f_j$ is [0, 1]; the smaller $f_j$ is, the less structured data there is; conversely, structured data predominates.
S30, evaluating the price of the data commodity.
Price evaluation of data is built on value evaluation: only after the value of the data has been evaluated can its price be estimated. The problem that price estimation must solve is that the value evaluation result is a decimal between 0 and 1, whereas the estimated price is a figure between 0 and positive infinity. Two data commodities with the same value score may differ in price by a factor of hundreds or even thousands. Therefore, to evaluate the price of data, an anchor is needed to determine the order of magnitude, or the center of fluctuation, of the data price.
In the present invention, the anchor is determined using two evaluation methods: the market method and the cost method. The market method requires obtaining the similar data commodities in the current data trading market; for data commodities for which no similar data commodities can be obtained, the cost method is used.
In this embodiment, the market method comprises the following steps:
S3011, searching for similar data commodities
The data catalog of the data trading market is obtained first. By calculating text similarity, data commodities similar to the current data commodity are found, and the information of these data commodities is extracted, including price, record count, size, field names and data commodity name.
Text similarity is computed with the cosine formula: all texts are first segmented into Chinese words to obtain a term-document matrix; the cosine formula is then used to compute the pairwise similarity between texts; texts whose similarity to the current word set is high are regarded as similar data commodities of the current data commodity.
S3012, calculating set of metadata of similar data commodity average price
After obtaining similar data commodity, calculate the average price data amount of these data commodity respectively.Using number Work as weight according to amount, calculate the weighted mean of these prices.Assume there is z set of metadata of similar data commodity, their average priceFor:
Wherein, piFor the price of each data commodity, niBar number for each data commodity.Described average Price is the anchor of current data price, substantially determines the price place magnitude of current data commodity.
S3013, calculate the standard deviation of the similar data commodities
From the average price of each data commodity, i.e. its price per record, the standard deviation of the data prices is calculated; it measures the fluctuation range of the data price. The standard deviation of these z data commodities is:
where $p_i$ is the price of each data commodity, $\bar{p}$ is the average price of the data commodities, and $n_i$ is the number of records of each data commodity.
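The formula for σ does not survive in this text. A minimal sketch, assuming a record-count-weighted population standard deviation around the weighted mean, $\sigma = \sqrt{\sum_i n_i (p_i - \bar{p})^2 / \sum_i n_i}$, with $p_i$ read as a per-record price; the original may differ in its weighting or normalization, and the function name is hypothetical.

```python
import math


def weighted_price_std(prices, record_counts):
    """Weighted population standard deviation of per-record prices.

    Assumption: weights are the record counts n_i, and the spread is
    measured around the record-count-weighted mean price.
    """
    if not prices or len(prices) != len(record_counts):
        raise ValueError("need one record count per price, and at least one commodity")
    total = sum(record_counts)
    if total <= 0:
        raise ValueError("total record count must be positive")
    mean = sum(n * p for p, n in zip(prices, record_counts)) / total
    variance = sum(n * (p - mean) ** 2 for p, n in zip(prices, record_counts)) / total
    return math.sqrt(variance)
```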
S3014, assess the value of the data commodity
Weights are assigned to the consistency index, validity index, repeatability index, scarcity index, data volume index and structuring degree index of the data commodity; the weight of each index is given in Table 2:
Table 2: weight of each index
It is generally held that consistency, validity and scarcity have the greatest influence on the price of data. A high consistency score indicates that the data matches what was promised and is highly credible; validity indicates the degree to which the data is effectively usable; scarcity indicates whether the source of the data is rare.
Data volume determines whether the price accumulates: if data of unit magnitude has similar value, then the larger the data volume, the higher the value of the data product. Repeatability has little influence on data value; it matters only when charging by data volume, to decide whether repeated records should count toward the price. The structuring degree indirectly reflects how easy the data is to use; in general, structured data is more convenient to operate on.
In the present embodiment, the comprehensive score of the data commodity is calculated by the following formula:
$$f = \sum_{i=1}^{6} \alpha_i w_i$$
where $\alpha_i$ is the score of each index, namely the scores of the consistency index, validity index, repeatability index, scarcity index, data volume index and structuring degree index, and $w_i$ is the weight of each index.
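The comprehensive score is a plain weighted sum of the six index scores. The sketch below illustrates it with placeholder weights: the actual values of Table 2 are not reproduced in this text, so `HYPOTHETICAL_WEIGHTS` is an assumption chosen only so that consistency, validity and scarcity carry the most weight and the weights sum to 1.

```python
# Placeholder weights, NOT the values of Table 2 (which are not shown here).
HYPOTHETICAL_WEIGHTS = {
    "consistency": 0.25,
    "validity": 0.25,
    "scarcity": 0.20,
    "data_volume": 0.10,
    "repeatability": 0.10,
    "structuring": 0.10,
}


def comprehensive_score(scores, weights=HYPOTHETICAL_WEIGHTS):
    """f = sum over the six indexes of (index score * index weight)."""
    if scores.keys() != weights.keys():
        raise ValueError("scores and weights must cover the same indexes")
    return sum(scores[name] * weights[name] for name in weights)
```

With every index score in [0, 1] and weights summing to 1, the result also stays in [0, 1], which matches the statement that the value assessment yields a decimal between 0 and 1.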
S3015, assess the price of the data commodity
In the present embodiment, the price of the data commodity is determined by the following factors. First, the average price sets the price benchmark. Second, the assessed value sets the interval within which the price floats. If the score $f$ is greater than 0.6, the price of the data commodity floats above the average price, with a floating range of 0.2 standard deviations. If $f$ is less than 0.4, the price floats below the average price, with a floating range of 0.2 standard deviations. If $f$ is between 0.4 and 0.6, the price floats around the average price, with a floating range of 0.2 standard deviations. Finally, the most recent evaluation feedback from users on this data commodity or on similar data commodities applies a final correction to the price.
$$\gamma\left(\bar{p} + (f - 0.6)\,\sigma\right) < p < \gamma\left(\bar{p} + (f - 0.4)\,\sigma\right)$$
where $f$ is the value assessment score, $\bar{p}$ is the average price of the similar data commodities, $\sigma$ is the standard deviation of the similar data commodities, and $\gamma$ is the user feedback correction factor, whose value lies in [0.9, 1] and is 1 on first use.
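The market-method price interval can be computed directly from the inequality. The sketch below is illustrative only; the function name `market_price_range` and the decision to reject a correction factor outside [0.9, 1] with an exception are assumptions.

```python
def market_price_range(avg_price, std, f, gamma=1.0):
    """Return (lower, upper) bounds of the assessed price.

    lower = gamma * (avg_price + (f - 0.6) * std)
    upper = gamma * (avg_price + (f - 0.4) * std)
    gamma is the user feedback correction factor, 1.0 on first use.
    """
    if not 0.9 <= gamma <= 1.0:
        raise ValueError("gamma must lie in [0.9, 1]")
    lower = gamma * (avg_price + (f - 0.6) * std)
    upper = gamma * (avg_price + (f - 0.4) * std)
    return lower, upper
```

With an average price of 10, a standard deviation of 2 and f = 0.8, the bounds are about 10.4 and 10.8: the interval is 0.2 standard deviations wide and lies entirely above the average price because f exceeds 0.6.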
In the present embodiment, when no similar data commodities can be found online, the cost method is used to assess the price of the data product. Specifically, the cost basis may be the price expected by the user of the data commodity, the purchase price of the data, the cost price of the data, or the like. The steps are as follows:
First, carry out the value assessment, which is the same as step S3014 of the market method.
Second, assess the price of the data product by the following formula:
$$p = \hat{p}\,(1 + f)$$
where $\hat{p}$ is the cost price and $f$ is the value assessment score; that is, the price assessed by the cost method is the cost plus a certain amount of profit.
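As a worked illustration of the cost method, the sketch below applies the markup $1 + f$ to a cost price. The function name `cost_method_price` and the input checks are assumptions; the range check on `f` follows from the statement that the value assessment is a decimal between 0 and 1.

```python
def cost_method_price(cost_price, f):
    """Cost-method price: cost_price * (1 + f), with f in [0, 1]."""
    if cost_price < 0:
        raise ValueError("cost price must be non-negative")
    if not 0.0 <= f <= 1.0:
        raise ValueError("value assessment score must lie in [0, 1]")
    return cost_price * (1.0 + f)
```

A data set with a cost price of 100 and a value score of 0.5 is priced at 150; since f never exceeds 1, the assessed price ranges from the cost itself up to twice the cost.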
In the present embodiment, the price evaluation method for data commodities may further include step S40: displaying the assessed price on a data commodity information display terminal.
The order of the above embodiments is only for convenience of description and does not indicate the relative merits of the embodiments.
Finally, it should be noted that the above embodiments are intended only to illustrate the technical solution of the present invention, not to limit it. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art will understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features replaced by equivalents, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims (10)

1. A price evaluation system for data commodities, characterized by comprising:
an identification module for identifying the type of the data commodity to be assessed; when the data commodity to be assessed is structured data, unstructured data or semi-structured data, the price of the data commodity is assessed; when the data commodity to be assessed is none of structured data, unstructured data or semi-structured data, the price evaluation process for the data commodity is terminated;
a quality assessment module for performing quality assessment on the data commodity;
a price evaluation module for assessing the price of the data commodity based on the indexes obtained by the quality assessment module.
2. The price evaluation system for data commodities according to claim 1, characterized in that the quality assessment module comprises:
a consistency assessment unit for assessing the consistency of the data commodity;
a validity assessment unit for assessing the validity of the data commodity;
a repeatability assessment unit for assessing the repeatability of the data in the data commodity;
a scarcity assessment unit for assessing the scarcity of the data commodity;
a data volume assessment unit for assessing the data volume of the data commodity;
a structuring degree assessment unit, which calculates the overall structuring degree of the data from the proportions of unstructured, semi-structured and structured data in the data content.
3. The price evaluation system for data commodities according to claim 2, characterized in that the price evaluation module comprises:
a similar data commodity search unit for obtaining similar data commodities in the data trading market;
a similar data commodity average price calculation unit for calculating the average price of the similar data commodities;
a standard deviation calculation unit for calculating the standard deviation of the similar data commodities;
a data commodity value assessment unit for assessing the value of the data commodity;
a data commodity price assessment unit for assessing the price of the data commodity.
4. The price evaluation system for data commodities according to claim 3, characterized in that:
when similar data commodities exist, the price of the data commodity is assessed using the following formula,
$$\gamma\left(\bar{p} + (f - 0.6)\,\sigma\right) < p < \gamma\left(\bar{p} + (f - 0.4)\,\sigma\right)$$
where $f$ is the value assessment score, $\bar{p}$ is the average price of the similar data commodities, $\sigma$ is the standard deviation of the similar data commodities, and $\gamma$ is the user feedback correction factor, whose value lies in [0.9, 1] and is 1 on first use;
when no similar data commodities exist, the price of the data product is assessed by the following formula:
$$p = \hat{p}\,(1 + f)$$
where $\hat{p}$ is the cost price and $f$ is the value assessment score.
5. A price evaluation method for data commodities, characterized by comprising the following steps:
S10, determine the type of the data commodity to be assessed; when the data commodity to be assessed is structured data, unstructured data or semi-structured data, the price of the data commodity is assessed; when the data commodity to be assessed is none of structured data, unstructured data or semi-structured data, the price evaluation process for the data commodity is terminated;
S20, perform quality assessment on the data commodity; the quality assessment of the data commodity includes consistency assessment, validity assessment, repeatability assessment, scarcity assessment, data volume assessment and structuring degree assessment;
S30, perform price assessment on the data commodity;
wherein step S30 specifically includes:
S3011, find similar data commodities; if similar data commodities exist, perform steps S3012, S3013, S3014 and S3015; if no similar data commodities exist, perform steps S3014 and S3016;
S3012, calculate the average price of the similar data commodities;
S3013, calculate the standard deviation of the similar data commodities;
S3014, assess the value of the data commodity;
S3015, assess the price of the data commodity according to the following formula:
$$\gamma\left(\bar{p} + (f - 0.6)\,\sigma\right) < p < \gamma\left(\bar{p} + (f - 0.4)\,\sigma\right);$$
where $f$ is the value assessment score, $\bar{p}$ is the average price of the similar data commodities, $\sigma$ is the standard deviation of the similar data commodities, and $\gamma$ is the user feedback correction factor, whose value lies in [0.9, 1] and is 1 on first use;
S3016, assess the price of the data commodity according to the following formula:
$$p = \hat{p}\,(1 + f);$$
where $\hat{p}$ is the cost price and $f$ is the value assessment score; that is, the price assessed by the cost method is the cost plus a certain amount of profit.
6. The price evaluation method for data commodities according to claim 5, characterized in that in step S20 the consistency of the data commodity is assessed by the following formula:
$$f_y = 1 - \frac{1}{3}\left(\frac{|l_a - l_m|}{\max(l_a, l_m)} + \frac{|s_a - s_m|}{\max(s_a, s_m)} + p\right), \quad p \in \{0, 1\}$$
where $f_y$ is the score of the consistency index; $l_a$ is the actual data volume; $l_m$ is the data volume recorded in the metadata; $s_a$ is the actual data file size; $s_m$ is the file size recorded in the metadata; $p$ represents data format consistency and is determined from the file extension: if the file extension is the same as the data format name recorded in the metadata, $p$ is assigned 0, otherwise it is assigned 1.
7. The price evaluation method for data commodities according to claim 5, characterized in that in step S20 the validity of the data commodity is assessed by the following formula:
$$h = \frac{\sum_{i=1}^{m}\sum_{j=1}^{n} a_{ij}}{N};$$
where $h$ is the score of the validity index; $a_{ij}$ indicates whether the datum in row $i$, column $j$ is a valid value, taking 0 if it is a valid value and 1 if it is not; $N$ is the total number of data items: assuming the data commodity has $m$ rows and $n$ columns, $N = m \times n$, where $m$ and $n$ are natural numbers.
8. The price evaluation method for data commodities according to claim 5, characterized in that in step S20 the repeatability of the data commodity is assessed by the following formula:
$$f_c = 1 - \frac{\sum_{i=1}^{n} a_i}{N};$$
where $f_c$ is the score of the repeatability index; $a_i$ is the number of times a given repeated record occurs; $N$ is the total number of records; the value of $f_c$ lies in [0, 1], and the larger $f_c$ is, the less repetition the information contains and the higher the data value.
9. The price evaluation method for data commodities according to claim 5, characterized in that in step S20 the scarcity of the data commodity is assessed according to the following formula:
$$f_x = \frac{2e^{-y/x}}{1 + e^{-y/x}};$$
where $f_x$ is the score of the scarcity index; $y$ is the data quantity of the similar data commodities appearing on the market; $x$ is the data quantity of the current data commodity; and $e$ is the base of the natural logarithm.
10. The price evaluation method for data commodities according to claim 5, characterized in that in step S20 the data volume of the data commodity is assessed according to the following formula:
$$f_s = 1 - \frac{|l_a - l_m|}{\max(l_a, l_m)}$$
where $f_s$ is the score of the data volume index; $l_a$ is the actual data volume; $l_m$ is the data volume recorded in the metadata; the value of $f_s$ lies in [0, 1]; $f_s$ close to 0 indicates that the data volume is far smaller than that stated in the metadata; $f_s$ equal to 1 indicates that the data volume matches the quantity stated in the metadata.
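The index formulas of claims 6 to 10 can be sketched together as follows. This is an illustration under stated assumptions, not the patented implementation: all function names are hypothetical; a relative gap is taken as 0 when both quantities are 0; records are assumed hashable (for example tuples); and for repeatability, $a_i$ is read as the total number of occurrences of each record that appears more than once. `validity_score` follows the wording of claim 7 literally (0 for a valid cell, 1 otherwise), so it returns the share of invalid cells.

```python
import math
from collections import Counter


def relative_gap(actual, recorded):
    """|actual - recorded| / max(actual, recorded); 0 when both are 0."""
    largest = max(actual, recorded)
    return abs(actual - recorded) / largest if largest > 0 else 0.0


def consistency_score(la, lm, sa, sm, format_matches):
    """Claim 6: record-count gap, file-size gap and format flag, averaged."""
    p = 0 if format_matches else 1  # 0 when the extension matches the metadata
    return 1 - (relative_gap(la, lm) + relative_gap(sa, sm) + p) / 3


def validity_score(table, is_valid):
    """Claim 7 as worded: a_ij is 0 for a valid cell and 1 otherwise."""
    cells = [cell for row in table for cell in row]
    if not cells:
        raise ValueError("table must contain at least one cell")
    return sum(0 if is_valid(cell) else 1 for cell in cells) / len(cells)


def repeatability_score(records):
    """Claim 8, assuming a_i counts every occurrence of a repeated record."""
    if not records:
        raise ValueError("need at least one record")
    repeated = sum(c for c in Counter(records).values() if c > 1)
    return 1 - repeated / len(records)


def scarcity_score(y, x):
    """Claim 9: 2*exp(-y/x) / (1 + exp(-y/x)), with x > 0."""
    e = math.exp(-y / x)
    return 2 * e / (1 + e)


def data_volume_score(la, lm):
    """Claim 10: 1 minus the relative gap between actual and recorded volume."""
    return 1 - relative_gap(la, lm)
```

For instance, `scarcity_score(0, 100)` is 1.0 (no similar data on the market), and the score falls toward 0 as the market quantity `y` grows relative to `x`.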
CN201610791054.3A 2016-08-31 2016-08-31 Price evaluation method and system for data commodities Pending CN106355447A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610791054.3A CN106355447A (en) 2016-08-31 2016-08-31 Price evaluation method and system for data commodities

Publications (1)

Publication Number Publication Date
CN106355447A true CN106355447A (en) 2017-01-25

Family

ID=57856502

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610791054.3A Pending CN106355447A (en) 2016-08-31 2016-08-31 Price evaluation method and system for data commodities

Country Status (1)

Country Link
CN (1) CN106355447A (en)

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20170125