CN106469395A - A dynamic comprehensive assessment method and system for data commodities - Google Patents
- Publication number
- CN106469395A (application CN201610792027.8A)
- Authority
- CN
- China
- Prior art keywords
- data
- assessment
- commodity
- information
- amount
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/02—Marketing; Price estimation or determination; Fundraising
- G06Q30/0201—Market modelling; Market analysis; Collecting market data
Landscapes
- Business, Economics & Management (AREA)
- Strategic Management (AREA)
- Engineering & Computer Science (AREA)
- Accounting & Taxation (AREA)
- Development Economics (AREA)
- Finance (AREA)
- Entrepreneurship & Innovation (AREA)
- Game Theory and Decision Science (AREA)
- Data Mining & Analysis (AREA)
- Economics (AREA)
- Marketing (AREA)
- Physics & Mathematics (AREA)
- General Business, Economics & Management (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
The invention provides a dynamic comprehensive assessment method for data commodities, comprising the following steps: S1, determining the data commodity to undergo dynamic comprehensive assessment; S2, acquiring in real time three kinds of data for the data commodity, namely data quality assessment information, user evaluations, and the data-related article count, and performing data assessment on the three kinds of data; S3, performing dynamic comprehensive assessment of the data commodity according to the data assessment results. By assessing a data commodity on several kinds of information, the method and its system make the assessment more comprehensive and accurate; by collecting that information dynamically and in real time, they also keep the assessment result timely.
Description
Technical field
The present invention relates to the technical field of data processing, and in particular to a dynamic comprehensive assessment method and system for data commodities.
Background technology
Society has now entered the digital era. Physical goods are no longer the only objects of trade; data has also become a tradable commodity.
Specifically, a data commodity is a data set that is traded on a transaction platform. Because data commodities are traded in large volumes, their prices are likely to be affected. If a data commodity is not monitored in real time, its valuation may be out of date when it is traded, and the interests of the seller or the buyer may be harmed.
Therefore, how to provide an assessment method and system that can assess data commodities dynamically and in real time has become an urgent problem.
Content of the invention
The invention provides a kind of data commodity dynamic comprehensive appraisal procedure and system, by real-time data collection commodity
Data is simultaneously estimated so that the assessment of data commodity, according to preferably ageing, further ensure that data commodity simultaneously
Assessment more comprehensive and accurate.
The first aspect of the invention provides a dynamic comprehensive assessment method for data commodities, comprising the following steps:
S1, determining the data commodity to undergo dynamic comprehensive assessment;
S2, acquiring in real time three kinds of data for the data commodity, namely data quality assessment information, user evaluations, and the data-related article count, and performing data assessment on the three kinds of data;
S3, performing dynamic comprehensive assessment of the data commodity according to the data assessment results by the following formula:
$P = \sum_{i=1}^{3} P_i F_i$
where $P$ is the dynamic comprehensive assessment score, $P_i$ is the data assessment score of the i-th kind of data, and $F_i$ is the item weight of the i-th kind of data.
Preferably, the evaluation indicators for the data assessment of the data quality assessment information include data consistency, data integrity, data redundancy, data timeliness, and data volume;
the data assessment of data quality is calculated by the following formula:
$P_1 = \sum_{j=1}^{5} H_j E_j$
where $P_1$ is the data quality assessment score, $H_j$ is the score of the j-th evaluation indicator, and $E_j$ is the indicator weight of the j-th evaluation indicator.
Data consistency is assessed as an indicator from the actual data amount, actual data size, and data format of the data quality assessment information, together with the data amount, file size, and data format recorded in the metadata;
and data consistency is assessed by the following formula:
where $H_1$ is the data consistency indicator score; $L_a$ is the actual data amount; $L_m$ is the data amount recorded in the metadata; $S_a$ is the actual data file size; $S_m$ is the file size recorded in the metadata; and $P$ is the data format consistency, determined from the file extension: $P$ is assigned 1 if the file extension matches the format name recorded in the metadata, and 0 otherwise.
More preferably, data integrity is assessed as an indicator from the number of non-null values and the total number of data items in the data quality assessment information; and data integrity is assessed by the following formula:
$H_2 = \frac{1}{N} \sum_{i=1}^{m} \sum_{j=1}^{n} a_{ij}$
where $H_2$ is the data integrity indicator score; $a_{ij}$ indicates whether the value in row i, column j is null, taking 0 if it is null and 1 if it is not; and $N$ is the total number of data items: when the data commodity has m rows and n columns, $N = m \times n$, where m and n are natural numbers.
More preferably, data redundancy is assessed as an indicator from the number of times duplicate records occur in the data quality assessment information and the total number of records; and data redundancy is assessed by the following formula:
where $H_3$ is the data redundancy indicator score; $c_i$ is the number of times the i-th duplicate record occurs; and $R$ is the total number of records.
More preferably, data timeliness is assessed as an indicator from the start time and end time of the records in the data quality assessment information and the current time; and data timeliness is assessed by the following formula:
where $H_4$ is the data timeliness indicator score; $T_f$ is the end time of the records, or the metadata time if no record time is available; $T_s$ is the start time of the records, or the metadata time if no record time is available; and $T_n$ is the current time.
More preferably, data volume is assessed as an indicator from the assessed full data volume, the metadata volume, and the committed data volume of the data quality assessment information; and data volume is assessed by the following formula:
where $H_5$ is the data volume indicator score; $X$ is the current data volume; $O_1$ is the assessed full data volume; $O_2$ is the metadata volume; and $O_3$ is the committed data volume.
Preferably, user evaluations are assessed according to the proportions of positive, neutral, and negative reviews among them; and the data assessment of user evaluations is calculated by the following formula:
$P_2 = a_1 \times 1 + a_2 \times 0.5 + a_3 \times 0,\quad a_1 + a_2 + a_3 = 1$
where $P_2$ is the user evaluation assessment score; $a_1$ is the proportion of positive reviews; $a_2$ is the proportion of neutral reviews; and $a_3$ is the proportion of negative reviews.
Preferably, the data-related article count is assessed according to the number of retrieved articles related to the data commodity; and the data assessment of the data-related article count is calculated by the following formula:
where $P_3$ is the data-related article count assessment score; $Y$ is the number of articles related to the data commodity; and $e$ is the base of the natural logarithm.
The second aspect of the invention provides a dynamic comprehensive assessment system for data commodities, comprising:
a model construction server, which determines the data commodity to undergo dynamic comprehensive assessment and builds a dynamic assessment model for the data commodity;
a data acquisition terminal, which collects in real time three kinds of data for the data commodity, namely data quality assessment information, user evaluations, and the data-related article count;
a comprehensive assessment server, which performs dynamic comprehensive assessment of the data commodity using the dynamic assessment model built by the model construction server and the three kinds of data collected by the data acquisition terminal.
Preferably, the comprehensive assessment server includes:
a data quality assessment unit, for performing data quality assessment according to the data quality assessment information;
a user evaluation assessment unit, for performing user evaluation assessment according to the user evaluations;
a data-related article count assessment unit, for assessing the data-related article count;
a dynamic comprehensive assessment unit, for performing dynamic comprehensive assessment of the data commodity according to the assessment results of the data quality assessment unit, the user evaluation assessment unit, and the data-related article count assessment unit.
More preferably, the data quality assessment unit includes:
a consistency assessment subunit, for assessing the consistency of the data quality assessment information;
an integrity assessment subunit, for assessing the integrity of the data quality assessment information;
a redundancy assessment subunit, for assessing the redundancy of the data quality assessment information;
a timeliness assessment subunit, for assessing the timeliness of the data quality assessment information;
a data volume assessment subunit, for assessing the data volume of the data quality assessment information;
a data quality assessment subunit, for performing data quality assessment according to the assessment results of the consistency assessment subunit, the integrity assessment subunit, the redundancy assessment subunit, the timeliness assessment subunit, and the data volume assessment subunit.
The dynamic comprehensive assessment unit assesses the data commodity dynamically and comprehensively according to its data quality, user evaluations, and data-related article count;
and the dynamic comprehensive assessment unit performs the dynamic comprehensive assessment by the following formula:
$P = \sum_{i=1}^{3} P_i F_i$
where $P$ is the dynamic comprehensive assessment score, $P_i$ is the data assessment score of the i-th kind of data, and $F_i$ is the item weight of the i-th kind of data.
More preferably, the data quality assessment subunit assesses the data quality of the data quality assessment information according to its data consistency, data integrity, data redundancy, data timeliness, and data volume; and the data quality assessment subunit performs the data quality assessment by the following formula:
$P_1 = \sum_{j=1}^{5} H_j E_j$
where $P_1$ is the data quality assessment score, $H_j$ is the score of the j-th evaluation indicator, and $E_j$ is the indicator weight of the j-th evaluation indicator.
More preferably, the consistency assessment subunit assesses the data consistency of the data quality assessment information from its actual data amount, actual data size, and data format, together with the data amount, file size, and data format recorded in the metadata; and the consistency assessment subunit performs the consistency assessment by the following formula:
where $H_1$ is the data consistency indicator score; $L_a$ is the actual data amount; $L_m$ is the data amount recorded in the metadata; $S_a$ is the actual data file size; $S_m$ is the file size recorded in the metadata; and $P$ is the data format consistency, determined from the file extension: $P$ is assigned 1 if the file extension matches the format name recorded in the metadata, and 0 otherwise.
More preferably, the integrity assessment subunit assesses the data integrity of the data quality assessment information from its number of non-null values and its total number of data items; and the integrity assessment subunit performs the integrity assessment by the following formula:
$H_2 = \frac{1}{N} \sum_{i=1}^{m} \sum_{j=1}^{n} a_{ij}$
where $H_2$ is the data integrity indicator score; $a_{ij}$ indicates whether the value in row i, column j is null, taking 0 if it is null and 1 if it is not; and $N$ is the total number of data items: when the data commodity has m rows and n columns, $N = m \times n$, where m and n are natural numbers.
More preferably, the redundancy assessment subunit assesses the data redundancy of the data quality assessment information from the number of times duplicate records occur in it and the total number of records; and the redundancy assessment subunit performs the redundancy assessment by the following formula:
where $H_3$ is the data redundancy indicator score; $c_i$ is the number of times the i-th duplicate record occurs; and $R$ is the total number of records.
More preferably, the timeliness assessment subunit assesses the data timeliness of the data quality assessment information from the start time and end time of its records and the current time; and the timeliness assessment subunit performs the timeliness assessment by the following formula:
where $H_4$ is the data timeliness indicator score; $T_f$ is the end time of the records, or the metadata time if no record time is available; $T_s$ is the start time of the records, or the metadata time if no record time is available; and $T_n$ is the current time.
More preferably, the data volume assessment subunit assesses the data volume of the data quality assessment information from its assessed full data volume, metadata volume, and committed data volume; and the data volume assessment subunit performs the data volume assessment by the following formula:
where $H_5$ is the data volume indicator score; $X$ is the current data volume; $O_1$ is the assessed full data volume; $O_2$ is the metadata volume; and $O_3$ is the committed data volume.
More preferably, the user evaluation assessment unit assesses the user evaluations according to the proportions of positive, neutral, and negative reviews among them; and the user evaluation assessment unit performs the assessment by the following formula:
$P_2 = a_1 \times 1 + a_2 \times 0.5 + a_3 \times 0,\quad a_1 + a_2 + a_3 = 1$
where $P_2$ is the user evaluation assessment score; $a_1$ is the proportion of positive reviews; $a_2$ is the proportion of neutral reviews; and $a_3$ is the proportion of negative reviews.
More preferably, the data-related article count assessment unit assesses the data-related article count according to the number of retrieved articles related to the data commodity; and the data-related article count assessment unit performs the assessment by the following formula:
where $P_3$ is the data-related article count assessment score; $Y$ is the number of articles related to the data commodity; and $e$ is the base of the natural logarithm.
The dynamic comprehensive assessment method and system of the present invention assess a data commodity comprehensively on several kinds of information, which makes the assessment more comprehensive and accurate. By collecting that information dynamically and in real time, they also keep the assessment result timely.
Brief description of the drawings
To illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings needed in the description of the embodiments are briefly introduced below. The drawings described below show some embodiments of the present invention; those of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flow chart of one embodiment of the dynamic comprehensive assessment method for data commodities of the present invention.
Fig. 2 is a plot of the logistic regression function of one embodiment of the dynamic comprehensive assessment system for data commodities of the present invention.
Fig. 3 is a system diagram of one embodiment of the dynamic comprehensive assessment system for data commodities of the present invention.
Fig. 4 is a hierarchy diagram of one embodiment of the dynamic comprehensive assessment system for data commodities of the present invention.
Detailed description
To make the objectives, technical solutions, and advantages of the present invention clearer, the technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings. The described embodiments are some, not all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art on the basis of these embodiments without creative effort fall within the scope of protection of the present invention.
Embodiment one
Fig. 1 is a flow chart of one embodiment of the dynamic comprehensive assessment method for data commodities of the present invention. As shown in Fig. 1, the method in this embodiment comprises the following steps:
S1, determining the data commodity to undergo dynamic comprehensive assessment;
S2, acquiring in real time three kinds of data for the data commodity, namely data quality assessment information, user evaluations, and the data-related article count, and performing data assessment on the three kinds of data;
S3, performing dynamic comprehensive assessment of the data commodity according to the data assessment results by the following formula:
$P = \sum_{i=1}^{3} P_i F_i$
where $P$ is the dynamic comprehensive assessment score, $P_i$ is the data assessment score of the i-th kind of data, and $F_i$ is the item weight of the i-th kind of data.
Specifically, S1 determines the data commodity that requires dynamic comprehensive assessment, so that related data can then be collected for that data commodity and the dynamic comprehensive assessment carried out on the collected data.
In S2, the three kinds of data (data quality assessment information, user evaluations, and the data-related article count) are collected in real time for the data commodity under assessment, and data assessment is performed on each of them to obtain the data quality assessment score, the user evaluation assessment score, and the data-related article count assessment score.
When data assessment is performed on the data quality assessment information, each evaluation indicator must be assessed separately. In this embodiment there are five evaluation indicators: data consistency, data integrity, data redundancy, data timeliness, and data volume.
Data consistency is the degree to which the various kinds of data in the data content agree with the data structure description information, which distinguishes unstructured, semi-structured, and structured data.
In this embodiment, data consistency can be assessed as an indicator from the actual data amount, actual data size, and data format of the data quality assessment information, together with the data amount, file size, and data format recorded in the metadata. Specifically, data consistency can be assessed by Formula 1 below:
Formula 1
where $H_1$ is the data consistency indicator score; $L_a$ is the actual data amount; $L_m$ is the data amount recorded in the metadata; $S_a$ is the actual data file size; $S_m$ is the file size recorded in the metadata; and $P$ is the data format consistency, determined from the file extension: $P$ is assigned 1 if the file extension matches the format name recorded in the metadata, and 0 otherwise.
According to Formula 1, the resulting $H_1$ lies in the range [0, 1]; the larger the value of $H_1$, the better the data consistency and the higher the value of the data.
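The format-consistency term of Formula 1 can be illustrated in isolation. The sketch below covers only that 0/1 flag, not the full consistency score, since the body of Formula 1 is not reproduced in this text. The function name `format_consistency`, its parameters, and the case-insensitive comparison are assumptions made for illustration.

```python
from pathlib import Path


def format_consistency(file_name: str, metadata_format: str) -> int:
    """Return 1 if the file extension matches the format recorded in the
    metadata, otherwise 0.

    Assumption: extensions are compared case-insensitively and without the
    leading dot; the source only says the file extension is used.
    """
    actual = Path(file_name).suffix.lower().lstrip(".")
    recorded = metadata_format.strip().lower().lstrip(".")
    # A file with no extension cannot confirm the recorded format.
    return 1 if actual and actual == recorded else 0


if __name__ == "__main__":
    print(format_consistency("sales_2016.csv", "CSV"))
    print(format_consistency("sales_2016.json", "csv"))
```

A stricter variant could inspect the file content instead of the name, but that goes beyond what the description states.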
The metadata referred to here, also called intermediary data or relay data, is data that describes data. It mainly describes data attributes and supports functions such as indicating storage location, historical data, resource lookup, and file records.
Data integrity is the proportion of valid data in the total data.
In this embodiment, data integrity refers to the proportion of non-null values, which carry practical meaning, among all the data in the data commodity.
Data integrity can then be assessed by Formula 2 below:
Formula 2
$H_2 = \frac{1}{N} \sum_{i=1}^{m} \sum_{j=1}^{n} a_{ij}$
where $H_2$ is the data integrity indicator score; $a_{ij}$ indicates whether the value in row i, column j is null, taking 0 if it is null and 1 if it is not; and $N$ is the total number of data items: when the data commodity has m rows and n columns, $N = m \times n$, where m and n are natural numbers.
According to Formula 2, the resulting $H_2$ lies in the range [0, 1]; the larger the value of $H_2$, the better the data integrity.
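As a minimal sketch of the integrity indicator, the code below computes the share of non-null cells in a table, which is what the definitions of $a_{ij}$ and $N$ describe. It assumes that a null value is represented by Python's `None` and that an empty table scores 0; both choices, and the name `integrity_score`, are illustrative and not taken from the patent.

```python
from typing import Optional, Sequence


def integrity_score(table: Sequence[Sequence[Optional[object]]]) -> float:
    """Share of non-null cells: the sum of a_ij divided by N.

    a_ij is 1 for a non-null cell and 0 for a null cell; N is the total
    number of cells (m rows times n columns for a rectangular table).
    """
    total_cells = sum(len(row) for row in table)
    if total_cells == 0:
        # Assumption: an empty data set is treated as having no valid data.
        return 0.0
    non_null = sum(1 for row in table for cell in row if cell is not None)
    return non_null / total_cells


if __name__ == "__main__":
    rows = [
        [1, None, "a"],
        [None, 2, 3],
    ]
    print(integrity_score(rows))
```

Real data sets often encode missing values in other ways (empty strings, NaN, sentinel codes); which of those count as null would need to be decided per data commodity.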
Data redundancy is the proportion of repeated data in the data. Repeated data within a data set is generally called redundant data; the higher the data redundancy, the lower the value of the data.
In this embodiment, data redundancy is assessed as an indicator from the number of times duplicate records occur in the data quality assessment information and the total number of records. Specifically, data redundancy can be assessed by Formula 3 below:
Formula 3
where $H_3$ is the data redundancy indicator score; $c_i$ is the number of times the i-th duplicate record occurs; and $R$ is the total number of records.
According to Formula 3, the resulting $H_3$ lies in the range [0, 1]; the larger the value of $H_3$, the less redundant the data and the higher its value.
Data timeliness relates the time interval covered by the data to the time at which the data is provided. In general, the wider the time range of the records and the closer it is to the current time, the higher the value of the data.
In this embodiment, data timeliness is assessed as an indicator from the start time and end time of the records in the data quality assessment information and the current time. Specifically, data timeliness can be assessed by Formula 4 below:
Formula 4
where $H_4$ is the data timeliness indicator score; $T_f$ is the end time of the records, or the metadata time if no record time is available; $T_s$ is the start time of the records, or the metadata time if no record time is available; and $T_n$ is the current time.
According to Formula 4, the resulting $H_4$ lies in the range [0, 1]; the larger the value of $H_4$, the more timely the data and the higher its value.
In this embodiment, time is measured to the day.
Data volume judges the proportion of valid data actually received, based on the data volume collected for the same kind of data from each data trading platform and on the data volume stated in the data description information.
In this embodiment, data volume is assessed as an indicator from the assessed full data volume, the metadata volume, and the committed data volume of the data quality assessment information. Specifically, data volume can be assessed by Formula 5 below:
Formula 5
where $H_5$ is the data volume indicator score; $X$ is the current data volume; $O_1$ is the assessed full data volume; $O_2$ is the metadata volume; and $O_3$ is the committed data volume.
According to Formula 5, the resulting $H_5$ lies in the range [0, 1]; $H_5 = 0$ indicates that the data volume is small, and otherwise the data volume is large.
The assessed full data volume referred to here is the full data volume for the corresponding type of data. For example, the full data volume for personal records is the 1.5 billion people of China, and the full data volume for provincial data is the 33 provinces of China. The committed data volume is the data volume initially promised.
The scores calculated by Formulas 1 to 5 (the data consistency, data integrity, data redundancy, data timeliness, and data volume indicator scores) can then be used to assess the data quality of the data quality assessment information.
Specifically, the data assessment of data quality is calculated by Formula 6 below:
Formula 6
$P_1 = \sum_{j=1}^{5} H_j E_j$
where $P_1$ is the data quality assessment score, $H_j$ is the score of the j-th evaluation indicator, and $E_j$ is the indicator weight of the j-th evaluation indicator.
The five indicators (data consistency, data integrity, data redundancy, data timeliness, and data volume) are numbered in the order in which they were assessed above.
Thus, when j = 1, $H_1$ is the data consistency indicator score obtained from Formula 1, and $E_1$ is the indicator weight of the data consistency score in the data quality assessment.
Similarly, when j = 2, $H_2$ is the data integrity indicator score obtained from Formula 2, and $E_2$ is the indicator weight of the data integrity score in the data quality assessment; when j = 3, $H_3$ is the data redundancy indicator score obtained from Formula 3, and $E_3$ is the indicator weight of the data redundancy score in the data quality assessment; when j = 4, $H_4$ is the data timeliness indicator score obtained from Formula 4, and $E_4$ is the indicator weight of the data timeliness score in the data quality assessment; when j = 5, $H_5$ is the data volume indicator score obtained from Formula 5, and $E_5$ is the indicator weight of the data volume score in the data quality assessment.
During data quality assessment, the indicator weights of the five indicator scores (data consistency, data integrity, data redundancy, data timeliness, and data volume) can be selected from Table 1 below.
Table 1
That is, according to Table 1, the indicator weights of the data consistency, data integrity, data redundancy, data timeliness, and data volume indicator scores in the data quality assessment are 0.25, 0.25, 0.15, 0.2, and 0.15 respectively.
On this basis, the data quality assessment score can be obtained from the data consistency, data integrity, data redundancy, data timeliness, and data volume indicator scores.
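The combination of the five indicator scores can be sketched as a weighted sum, using the Table 1 weights quoted in the text (0.25, 0.25, 0.15, 0.2, 0.15). This assumes that Formula 6 is the weighted sum $\sum_j H_j E_j$ suggested by its symbol definitions. The dictionary keys and the function name `quality_score` are hypothetical labels chosen for readability.

```python
# Indicator weights E_1..E_5 as quoted from Table 1 in the description.
QUALITY_WEIGHTS = {
    "consistency": 0.25,  # E_1
    "integrity": 0.25,    # E_2
    "redundancy": 0.15,   # E_3
    "timeliness": 0.20,   # E_4
    "volume": 0.15,       # E_5
}


def quality_score(indicator_scores: dict) -> float:
    """Data quality assessment score P_1 as a weighted sum of H_j * E_j.

    Assumption: each indicator score H_j is already in the range [0, 1].
    """
    missing = QUALITY_WEIGHTS.keys() - indicator_scores.keys()
    if missing:
        raise ValueError(f"missing indicator scores: {sorted(missing)}")
    return sum(
        weight * indicator_scores[name]
        for name, weight in QUALITY_WEIGHTS.items()
    )


if __name__ == "__main__":
    scores = {
        "consistency": 1.0,
        "integrity": 0.8,
        "redundancy": 0.9,
        "timeliness": 0.5,
        "volume": 0.6,
    }
    print(round(quality_score(scores), 4))
```

Because the weights sum to 1 and each indicator score lies in [0, 1], the resulting score also lies in [0, 1].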
A data commodity often has many user evaluations, which can be divided into three categories: positive, neutral, and negative reviews.
Evaluating the share of reviews in each category gives an effective picture of users' feedback on the data commodity.
Meanwhile, acquiring user evaluations dynamically and in real time makes the evaluation and feedback information obtained from users more accurate and effective.
The data assessment of user evaluations is calculated by Formula 7 below:
Formula 7
$P_2 = a_1 \times 1 + a_2 \times 0.5 + a_3 \times 0,\quad a_1 + a_2 + a_3 = 1$
where $P_2$ is the user evaluation assessment score; $a_1$ is the proportion of positive reviews; $a_2$ is the proportion of neutral reviews; and $a_3$ is the proportion of negative reviews.
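Formula 7 can be sketched directly from review counts. The function name `user_evaluation_score`, the use of raw counts as input, and the choice to raise an error when there are no reviews are assumptions; the description does not say how a data commodity without reviews is scored.

```python
def user_evaluation_score(positive: int, neutral: int, negative: int) -> float:
    """User evaluation assessment score P_2 = a1*1 + a2*0.5 + a3*0.

    a1, a2, a3 are the shares of positive, neutral, and negative reviews,
    so they sum to 1 by construction.
    """
    total = positive + neutral + negative
    if total == 0:
        # Assumption: without any reviews the score is undefined.
        raise ValueError("at least one review is required")
    a1 = positive / total
    a2 = neutral / total
    a3 = negative / total
    return a1 * 1.0 + a2 * 0.5 + a3 * 0.0


if __name__ == "__main__":
    print(round(user_evaluation_score(positive=70, neutral=20, negative=10), 4))
```

With 70 positive, 20 neutral, and 10 negative reviews the shares are 0.7, 0.2, and 0.1, giving a score of 0.7 + 0.1 = 0.8.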
The data-related article count is the number of precisely associated articles returned when the name of the data commodity is searched in a relevant academic search engine. The number of articles obtained indicates how popular the data commodity is and gives buyers a clearer picture of it.
For the number of reference documents of a data commodity, a ranking score function is built on the characteristics of the logistic regression function, so that the data-related article count can be brought into the assessment of the data commodity clearly and effectively.
Meanwhile, acquiring the data-related article count dynamically and in real time yields a more accurate count of the articles related to the data commodity.
The data assessment of the data-related article count is calculated by Formula 8 below:
Formula 8
where $P_3$ is the data-related article count assessment score; $Y$ is the number of articles related to the data commodity; and $e$ is the base of the natural logarithm.
According to Formula 8 and the characteristics of the ranking score function, when the data-related article count is 0, the assessment score is $P_3 = 0$ (Y = 0); when the data-related article count is 1, the assessment score is $P_3 = 0.075$ (Y = 1).
Because the data-related article count takes integer values, the plot of the logistic regression function in Fig. 2 shows that the value of the ranking score function approaches 1 when the data-related article count reaches $10^4$. That is, once the data-related article count exceeds 10,000, the assessment score has effectively reached 1; by the nature of the ranking score function, the score is therefore defined as 1 in that case.
Finally, the data quality assessment score, the user evaluation assessment score, and the data-related article count assessment score obtained above are used to perform the dynamic comprehensive assessment of the data commodity and to obtain its final dynamic comprehensive assessment score.
The dynamic comprehensive assessment of the data commodity is calculated by Formula 9 below:
Formula 9
Wherein, P is the dynamic comprehensive assessment score; Pi is the data assessment score of the i-th kind of data; and Fi is the item weight of the i-th kind of data.
When i = 1, P1 is the data quality assessment score, obtained by Formula 6 above, and F1 is the item weight of the data quality assessment score in the dynamic comprehensive assessment.
Likewise, when i = 2, P2 is the user review assessment score, obtained by Formula 7 above, and F2 is the item weight of the user review assessment score in the dynamic comprehensive assessment; when i = 3, P3 is the related-article count assessment score, obtained by Formula 8 above, and F3 is the item weight of the related-article count assessment score in the dynamic comprehensive assessment.
In the dynamic comprehensive assessment of the data commodity, the item weights of the three scores (the data quality assessment score, the user review assessment score and the related-article count assessment score) can be chosen according to Table 2 below.
Table 2
That is, according to Table 2, in the dynamic comprehensive assessment of the data commodity the item weights of the data quality assessment score, the user review assessment score and the related-article count assessment score are 0.45, 0.3 and 0.25 respectively, and the dynamic comprehensive assessment score of the data commodity is obtained from these item weights and the corresponding assessment scores.
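The combination of the three scores can be illustrated with a minimal Python sketch. It assumes that Formula 9 is the weighted sum P = P1 × F1 + P2 × F2 + P3 × F3, which is what the item weights quoted for Table 2 suggest but which is not spelled out in the text here; the function name and the sample scores are hypothetical.

```python
# Item weights quoted in the text for Table 2:
# data quality, user reviews, related-article count.
ITEM_WEIGHTS = (0.45, 0.30, 0.25)


def dynamic_comprehensive_score(p1, p2, p3, weights=ITEM_WEIGHTS):
    """Combine the three assessment scores into one score.

    ASSUMPTION: Formula 9 is taken to be the weighted sum
    P = sum(P_i * F_i) for i = 1..3.
    """
    scores = (p1, p2, p3)
    return sum(p * f for p, f in zip(scores, weights))


# Hypothetical scores: data quality 0.8, user reviews 0.6, related articles 1.0.
example = dynamic_comprehensive_score(0.8, 0.6, 1.0)
print(round(example, 2))
```

Because the three weights sum to 1, a data commodity that scores 1 on every item also receives a combined score of 1 under this assumption.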
Embodiment two
Fig. 3 is a system diagram of an embodiment of the data commodity dynamic comprehensive assessment system of the present invention; Fig. 4 is a hierarchy diagram of an embodiment of the data commodity dynamic comprehensive assessment system of the present invention.
As shown in Figs. 3 and 4, the data commodity dynamic comprehensive assessment system of this embodiment includes:
a model construction server, which determines the data commodity to undergo dynamic comprehensive assessment and builds a dynamic assessment model for the data commodity;
a data acquisition terminal, which collects in real time three kinds of data about the data commodity: data quality assessment information, user reviews and the related-article count;
a comprehensive assessment server, which carries out the dynamic comprehensive assessment of the data commodity using the dynamic assessment model built by the model construction server and the three kinds of data collected by the data acquisition terminal.
Specifically, the model construction server first determines the data commodity that needs dynamic comprehensive assessment and then builds a dynamic assessment model for that data commodity.
As shown in Fig. 4, the dynamic assessment model can be divided into three layers, namely a target layer, a solution layer and an indicator layer, which correspond respectively to the final result, the evaluation process and the evaluation indicators of the dynamic comprehensive evaluation of the data commodity.
The indicator layer covers three aspects of the data commodity: data quality, user reviews and the related-article count.
The solution layer builds the dynamic comprehensive evaluation scheme from the data obtained in the indicator layer; specifically, it builds separate assessments for data quality, user reviews and the related-article count.
Finally, the target layer carries out the final dynamic comprehensive assessment of the data commodity using the results obtained by the solution layer.
The data acquisition terminal acquires, in real time, the data quality assessment information, the user reviews and the related-article count of the data commodity.
Further, the data acquisition terminal may include: a quality assessment information acquisition unit for collecting the data quality assessment information; a user review acquisition unit for collecting user reviews in real time; and a related-article acquisition unit for collecting the related-article count in real time.
Having the quality assessment information acquisition unit, the user review acquisition unit and the related-article acquisition unit each collect their own data (data quality assessment information, user reviews and the related-article count respectively) makes real-time acquisition faster and ensures the timeliness of the data related to the data commodity.
The comprehensive assessment server may further include: a data quality assessment unit for assessing data quality according to the data quality assessment information; a user review assessment unit for assessing the user reviews; a related-article count assessment unit for assessing the related-article count; and a dynamic comprehensive assessment unit for carrying out the dynamic comprehensive evaluation of the data commodity according to the assessment results of the data quality assessment unit, the user review assessment unit and the related-article count assessment unit.
The data quality assessment unit can calculate the data quality assessment score of the data commodity by Formulas 1 to 6 above, using the data quality assessment information collected by the quality assessment information acquisition unit.
Further, the data quality assessment unit may include: a consistency assessment subunit for assessing the consistency of the data quality assessment information; an integrity assessment subunit for assessing the integrity of the data quality assessment information; a redundancy assessment subunit for assessing the redundancy of the data quality assessment information; a timeliness assessment subunit for assessing the timeliness of the data quality assessment information; a data volume assessment subunit for assessing the data volume of the data quality assessment information; and a data quality assessment subunit for carrying out the data quality assessment according to the assessment results of the consistency assessment subunit, integrity assessment subunit, redundancy assessment subunit, timeliness assessment subunit and data volume assessment subunit.
The consistency assessment subunit, integrity assessment subunit, redundancy assessment subunit, timeliness assessment subunit, data volume assessment subunit and data quality assessment subunit can calculate, by Formulas 1, 2, 3, 4, 5 and 6 above respectively, the data consistency index score, data integrity index score, data redundancy index score, data timeliness index score, data volume index score and data quality assessment score, which support the dynamic comprehensive assessment of the data commodity.
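As an illustration of one of these sub-scores, the integrity assessment can be sketched in Python. The claims define a_ij as 0 for a null cell and 1 otherwise, and N as the total number of cells (m rows × n columns); the sketch assumes that the integrity index score H2 is the share of non-null cells, i.e. the sum of a_ij divided by N, which is not spelled out in the text here. The function name, the use of `None` to mark a null value and the sample table are hypothetical.

```python
def integrity_score(rows):
    """Return the share of non-null cells in a table.

    ASSUMPTION: the integrity index score is taken to be
    H2 = sum(a_ij) / N, with a_ij = 0 for null cells and 1 otherwise.
    A cell counts as null only when it is None (also an assumption).
    """
    total = sum(len(row) for row in rows)  # N = m * n for a rectangular table
    if total == 0:
        raise ValueError("the table has no cells to assess")
    non_null = sum(1 for row in rows for cell in row if cell is not None)
    return non_null / total


# Hypothetical table with 2 rows and 3 columns, two of the six cells null.
table = [
    [1, "a", None],
    [2, None, 3.5],
]
h2 = integrity_score(table)  # 4 non-null cells out of 6
```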
The user review assessment unit can calculate the user review assessment score of the data commodity by Formula 7 above, using the user reviews acquired in real time by the user review acquisition unit.
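The user review assessment can be sketched in Python using the expression stated in claims 8 and 21, P2 = a1 × 1 + a2 × 0.5 + a3 × 0 with a1 + a2 + a3 = 1, where a1, a2 and a3 are the proportions of positive, neutral and negative reviews. Deriving the proportions from raw review counts, as well as the function and parameter names, are assumptions made for illustration.

```python
def user_review_score(positive, neutral, negative):
    """Compute P2 = a1 * 1 + a2 * 0.5 + a3 * 0 from raw review counts.

    a1, a2 and a3 are the proportions of positive, neutral and negative
    reviews, so they sum to 1. Passing counts rather than proportions
    is an assumption of this sketch.
    """
    total = positive + neutral + negative
    if total == 0:
        raise ValueError("at least one review is required")
    a1 = positive / total
    a2 = neutral / total
    a3 = negative / total
    return a1 * 1 + a2 * 0.5 + a3 * 0


# Hypothetical counts: 70 positive, 20 neutral and 10 negative reviews.
p2 = user_review_score(70, 20, 10)
print(round(p2, 2))
```

With this weighting the score ranges from 0 (only negative reviews) to 1 (only positive reviews), and a neutral review counts half as much as a positive one.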
The related-article count assessment unit can calculate the related-article count assessment score of the data commodity by Formula 8 above, using the related-article count acquired in real time by the related-article acquisition unit.
The dynamic comprehensive assessment unit can calculate the dynamic comprehensive assessment score of the data commodity by Formula 9 above, using the data quality assessment score, user review assessment score and related-article count assessment score calculated by the data quality assessment unit, the user review assessment unit and the related-article count assessment unit.
Finally, the calculated dynamic comprehensive assessment score lets buyers and sellers learn about the data commodity comprehensively and conveniently, keeps the assessment information about the data commodity timely, and provides effective data support for the final decisions of buyers and sellers.
The data commodity dynamic comprehensive assessment method and system of the present invention carry out the comprehensive assessment of a data commodity using multiple kinds of information about the data commodity, which makes the assessment more comprehensive and accurate; meanwhile, collecting the information about the data commodity dynamically and in real time further ensures that the assessment result is timely.
Finally, it should be noted that the above embodiments are intended only to illustrate the technical solution of the present invention, not to limit it. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features may be replaced by equivalents; such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention.
Claims (22)
1. A data commodity dynamic comprehensive assessment method, characterized by comprising the following steps:
S1, determining the data commodity to undergo dynamic comprehensive assessment;
S2, acquiring in real time three kinds of data about the data commodity, namely data quality assessment information, user reviews and the related-article count, and carrying out a data assessment of each of the three kinds of data;
S3, carrying out the dynamic comprehensive assessment of the data commodity according to the results of the data assessment by the following method:
Wherein, P is the dynamic comprehensive assessment score; Pi is the data assessment score of the i-th kind of data; and Fi is the item weight of the i-th kind of data.
2. The data commodity dynamic comprehensive assessment method according to claim 1, characterized in that
the evaluation indicators for the data assessment of the data quality assessment information include data consistency, data integrity, data redundancy, data timeliness and data volume;
the data assessment of the data quality is calculated by the following method:
Wherein, P1 is the data quality assessment score; Hj is the score of the j-th evaluation indicator; and Ej is the indicator weight of the j-th evaluation indicator.
3. The data commodity dynamic comprehensive assessment method according to claim 2, characterized in that
the data consistency of the data quality assessment information is subjected to index evaluation according to the actual data volume, actual data file size and data format of the data quality assessment information, together with the data volume, file size and data format recorded in the metadata;
and the index evaluation of the data consistency is carried out by the following method:
Wherein, H1 is the data consistency index score; La is the actual data volume; Lm is the data volume recorded in the metadata; Sa is the actual data file size; Sm is the file size recorded in the metadata; and P is the data format consistency, which is judged from the file extension: it is assigned 1 if the file extension is the same as the data format name recorded in the metadata, and 0 otherwise.
4. The data commodity dynamic comprehensive assessment method according to claim 2, characterized in that
the data integrity of the data quality assessment information is subjected to index evaluation according to the number of non-null values in the data quality assessment information and the total number of data items;
and the index evaluation of the data integrity is carried out by the following method:
Wherein, H2 is the data integrity index score; aij indicates whether the data in row i, column j is null, taking 0 if it is null and 1 if it is not; N is the total number of data items, and when the data commodity has m rows and n columns, N = m × n, where m and n are natural numbers.
5. The data commodity dynamic comprehensive assessment method according to claim 2, characterized in that
the data redundancy of the data quality assessment information is subjected to index evaluation according to the number of times repeated records occur in the data quality assessment information and the total number of records;
and the index evaluation of the data redundancy is carried out by the following method:
Wherein, H3 is the data redundancy index score; ci is the number of times the i-th repeated record occurs; and R is the total number of records.
6. The data commodity dynamic comprehensive assessment method according to claim 2, characterized in that
the data timeliness of the data quality assessment information is subjected to index evaluation according to the initial time and final time of the records in the data quality assessment information and the current time;
and the index evaluation of the data timeliness is carried out by the following method:
Wherein, H4 is the data timeliness index score; Tf is the final time of the records, or the metadata time if no time is recorded; Ts is the initial time of the records, or the metadata time if no time is recorded; and Tn is the current time.
7. The data commodity dynamic comprehensive assessment method according to claim 2, characterized in that
the data volume of the data quality assessment information is subjected to index evaluation according to the assessed full data volume, the metadata volume and the committed data volume of the data quality assessment information;
and the index evaluation of the data volume is carried out by the following method:
Wherein, H5 is the data volume index score; X is the current data volume; O1 is the assessed full data volume; O2 is the metadata volume; and O3 is the committed data volume.
8. The data commodity dynamic comprehensive assessment method according to claim 1, characterized in that
the data assessment of the user reviews is carried out according to the proportions of positive, neutral and negative reviews among the user reviews;
and the data assessment of the user reviews is calculated by the following method:
P2 = a1 × 1 + a2 × 0.5 + a3 × 0, where a1 + a2 + a3 = 1;
Wherein, P2 is the user review assessment score; a1 is the proportion of positive reviews among the user reviews; a2 is the proportion of neutral reviews among the user reviews; and a3 is the proportion of negative reviews among the user reviews.
9. The data commodity dynamic comprehensive assessment method according to claim 1, characterized in that
the data assessment of the related-article count is carried out according to the number of retrieved articles related to the data commodity;
and the data assessment of the related-article count is calculated by the following method:
Wherein, P3 is the related-article count assessment score; Y is the number of articles related to the data commodity; and e is the base of the natural logarithm.
10. A data commodity dynamic comprehensive assessment system, characterized by comprising:
a model construction server, which determines the data commodity to undergo dynamic comprehensive assessment and builds a dynamic assessment model for the data commodity;
a data acquisition terminal, which collects in real time three kinds of data about the data commodity, namely data quality assessment information, user reviews and the related-article count;
a comprehensive assessment server, which carries out the dynamic comprehensive assessment of the data commodity using the dynamic assessment model built by the model construction server and the three kinds of data collected by the data acquisition terminal.
11. The data commodity dynamic comprehensive assessment system according to claim 10, characterized in that
the data acquisition terminal includes:
a quality assessment information acquisition unit for collecting the data quality assessment information;
a user review acquisition unit for collecting user reviews in real time;
a related-article acquisition unit for collecting the related-article count in real time.
12. The data commodity dynamic comprehensive assessment system according to claim 11, characterized in that
the comprehensive assessment server includes:
a data quality assessment unit for assessing data quality according to the data quality assessment information;
a user review assessment unit for assessing the user reviews;
a related-article count assessment unit for assessing the related-article count;
a dynamic comprehensive assessment unit for carrying out the dynamic comprehensive evaluation of the data commodity according to the assessment results of the data quality assessment unit, the user review assessment unit and the related-article count assessment unit.
13. The data commodity dynamic comprehensive assessment system according to claim 12, characterized in that
the data quality assessment unit includes:
a consistency assessment subunit for assessing the consistency of the data quality assessment information;
an integrity assessment subunit for assessing the integrity of the data quality assessment information;
a redundancy assessment subunit for assessing the redundancy of the data quality assessment information;
a timeliness assessment subunit for assessing the timeliness of the data quality assessment information;
a data volume assessment subunit for assessing the data volume of the data quality assessment information;
a data quality assessment subunit for carrying out the data quality assessment according to the assessment results of the consistency assessment subunit, integrity assessment subunit, redundancy assessment subunit, timeliness assessment subunit and data volume assessment subunit.
14. The data commodity dynamic comprehensive assessment system according to claim 12, characterized in that
the dynamic comprehensive assessment unit carries out the dynamic comprehensive assessment of the data commodity according to the data quality, user reviews and related-article count of the data commodity;
and the dynamic comprehensive assessment unit carries out the dynamic comprehensive assessment by the following method:
Wherein, P is the dynamic comprehensive assessment score; Pi is the data assessment score of the i-th kind of data; and Fi is the item weight of the i-th kind of data.
15. The data commodity dynamic comprehensive assessment system according to claim 14, characterized in that
the data quality assessment subunit assesses the data quality of the data quality assessment information according to the data consistency, data integrity, data redundancy, data timeliness and data volume of the data quality assessment information;
and the data quality assessment subunit carries out the data quality assessment by the following method:
Wherein, P1 is the data quality assessment score; Hj is the score of the j-th evaluation indicator; and Ej is the indicator weight of the j-th evaluation indicator.
16. The data commodity dynamic comprehensive assessment system according to claim 15, characterized in that
the consistency assessment subunit assesses the data consistency of the data quality assessment information according to the actual data volume, actual data file size and data format of the data quality assessment information, together with the data volume, file size and data format recorded in the metadata;
and the consistency assessment subunit carries out the consistency assessment by the following method:
Wherein, H1 is the data consistency index score; La is the actual data volume; Lm is the data volume recorded in the metadata; Sa is the actual data file size; Sm is the file size recorded in the metadata; and P is the data format consistency, which is judged from the file extension: it is assigned 1 if the file extension is the same as the data format name recorded in the metadata, and 0 otherwise.
17. The data commodity dynamic comprehensive assessment system according to claim 15, characterized in that
the integrity assessment subunit assesses the data integrity of the data quality assessment information according to the number of non-null values in the data quality assessment information and the total number of data items;
and the integrity assessment subunit carries out the integrity assessment by the following method:
Wherein, H2 is the data integrity index score; aij indicates whether the data in row i, column j is null, taking 0 if it is null and 1 if it is not; N is the total number of data items, and when the data commodity has m rows and n columns, N = m × n, where m and n are natural numbers.
18. The data commodity dynamic comprehensive assessment system according to claim 15, characterized in that
the redundancy assessment subunit assesses the data redundancy of the data quality assessment information according to the number of times repeated records occur in the data quality assessment information and the total number of records;
and the redundancy assessment subunit carries out the redundancy assessment by the following method:
Wherein, H3 is the data redundancy index score; ci is the number of times the i-th repeated record occurs; and R is the total number of records.
19. The data commodity dynamic comprehensive assessment system according to claim 15, characterized in that
the timeliness assessment subunit assesses the data timeliness of the data quality assessment information according to the initial time and final time of the records in the data quality assessment information and the current time;
and the timeliness assessment subunit carries out the timeliness assessment by the following method:
Wherein, H4 is the data timeliness index score; Tf is the final time of the records, or the metadata time if no time is recorded; Ts is the initial time of the records, or the metadata time if no time is recorded; and Tn is the current time.
20. The data commodity dynamic comprehensive assessment system according to claim 15, characterized in that
the data volume assessment subunit assesses the data volume of the data quality assessment information according to the assessed full data volume, the metadata volume and the committed data volume of the data quality assessment information;
and the data volume assessment subunit carries out the data volume assessment by the following method:
Wherein, H5 is the data volume index score; X is the current data volume; O1 is the assessed full data volume; O2 is the metadata volume; and O3 is the committed data volume.
21. The data commodity dynamic comprehensive assessment system according to claim 14, characterized in that
the user review assessment unit assesses the user reviews according to the proportions of positive, neutral and negative reviews among the user reviews;
and the user review assessment unit carries out the user review assessment by the following method:
P2 = a1 × 1 + a2 × 0.5 + a3 × 0, where a1 + a2 + a3 = 1;
Wherein, P2 is the user review assessment score; a1 is the proportion of positive reviews among the user reviews; a2 is the proportion of neutral reviews among the user reviews; and a3 is the proportion of negative reviews among the user reviews.
22. The data commodity dynamic comprehensive assessment system according to claim 14, characterized in that
the related-article count assessment unit assesses the related-article count according to the number of retrieved articles related to the data commodity;
and the related-article count assessment unit carries out the related-article count assessment by the following method:
Wherein, P3 is the related-article count assessment score; Y is the number of articles related to the data commodity; and e is the base of the natural logarithm.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610792027.8A CN106469395A (en) | 2016-08-31 | 2016-08-31 | A kind of data commodity dynamic comprehensive appraisal procedure and system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610792027.8A CN106469395A (en) | 2016-08-31 | 2016-08-31 | A kind of data commodity dynamic comprehensive appraisal procedure and system |
Publications (1)
Publication Number | Publication Date |
---|---|
CN106469395A true CN106469395A (en) | 2017-03-01 |
Family
ID=58230389
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610792027.8A Pending CN106469395A (en) | 2016-08-31 | 2016-08-31 | A kind of data commodity dynamic comprehensive appraisal procedure and system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106469395A (en) |
Cited By (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107807972A (en) * | 2017-10-19 | 2018-03-16 | 北京科技大学 | A kind of test data consistency detecting method |
CN108734405A (en) * | 2018-05-24 | 2018-11-02 | 国信优易数据有限公司 | A kind of data value Evaluation Platform and method |
CN108764705A (en) * | 2018-05-24 | 2018-11-06 | 国信优易数据有限公司 | A kind of data quality accessment platform and method |
CN108764995A (en) * | 2018-05-24 | 2018-11-06 | 国信优易数据有限公司 | A kind of data value determines system and method |
CN108829750A (en) * | 2018-05-24 | 2018-11-16 | 国信优易数据有限公司 | A kind of quality of data determines system and method |
CN110059083A (en) * | 2019-04-24 | 2019-07-26 | 北京金堤科技有限公司 | A kind of data evaluation method, apparatus and electronic equipment |
CN110766429A (en) * | 2018-07-26 | 2020-02-07 | 国信优易数据有限公司 | Data value evaluation system and method |
CN110766428A (en) * | 2018-07-25 | 2020-02-07 | 国信优易数据有限公司 | Data value evaluation system and method |
CN110858369A (en) * | 2018-08-24 | 2020-03-03 | 国信优易数据有限公司 | Data value evaluation system and method and electronic equipment |
CN113691523A (en) * | 2021-08-20 | 2021-11-23 | 中国科学技术大学先进技术研究院 | Real-time network traffic password application-oriented evaluation method and terminal equipment |
- 2016-08-31 CN CN201610792027.8A patent/CN106469395A/en active Pending
Cited By (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107807972A (en) * | 2017-10-19 | 2018-03-16 | 北京科技大学 | A kind of test data consistency detecting method |
CN107807972B (en) * | 2017-10-19 | 2020-12-22 | 北京科技大学 | Test data consistency detection method |
CN108734405A (en) * | 2018-05-24 | 2018-11-02 | 国信优易数据有限公司 | A kind of data value Evaluation Platform and method |
CN108764705A (en) * | 2018-05-24 | 2018-11-06 | 国信优易数据有限公司 | A kind of data quality accessment platform and method |
CN108764995A (en) * | 2018-05-24 | 2018-11-06 | 国信优易数据有限公司 | A kind of data value determines system and method |
CN108829750A (en) * | 2018-05-24 | 2018-11-16 | 国信优易数据有限公司 | A kind of quality of data determines system and method |
CN110766428A (en) * | 2018-07-25 | 2020-02-07 | 国信优易数据有限公司 | Data value evaluation system and method |
CN110766429A (en) * | 2018-07-26 | 2020-02-07 | 国信优易数据有限公司 | Data value evaluation system and method |
CN110858369A (en) * | 2018-08-24 | 2020-03-03 | 国信优易数据有限公司 | Data value evaluation system and method and electronic equipment |
CN110059083A (en) * | 2019-04-24 | 2019-07-26 | 北京金堤科技有限公司 | A kind of data evaluation method, apparatus and electronic equipment |
CN113691523A (en) * | 2021-08-20 | 2021-11-23 | 中国科学技术大学先进技术研究院 | Real-time network traffic password application-oriented evaluation method and terminal equipment |
CN113691523B (en) * | 2021-08-20 | 2023-10-10 | 中科国昱(合肥)科技有限公司 | Real-time network traffic password application evaluation method and terminal equipment |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN106469395A (en) | A kind of data commodity dynamic comprehensive appraisal procedure and system | |
Brezigar-Masten et al. | CART-based selection of bankruptcy predictors for the logit model | |
Olczyk | A systematic retrieval of international competitiveness literature: a bibliometric study | |
Tsalis et al. | A social LCA framework to assess the corporate social profile of companies: Insights from a case study | |
CN108133013A (en) | Information processing method, device, computer equipment and storage medium | |
Morhardt | Scoring corporate environmental reports for comprehensiveness: a comparison of three systems | |
CN107729519B (en) | Multi-source multi-dimensional data-based evaluation method and device, and terminal | |
La Nauze | Sexual orientation–based wage gaps in Australia: The potential role of discrimination and personality | |
CN110766428A (en) | Data value evaluation system and method | |
Eckhaus | Corporate transformational leadership's effect on financial performance | |
CN109146611A (en) | A kind of electric business product quality credit index analysis method and system | |
CN109299085A (en) | A kind of data processing method, electronic equipment and storage medium | |
CN108573339A (en) | A kind of consumer's net purchase methods of risk assessment of multi objective Project Decision Method | |
CN110659926A (en) | Data value evaluation system and method | |
Hajkowicz | Rethinking the economist’s evaluation toolkit in light of sustainability policy | |
CN112132441A (en) | Risk propagation information evaluation method, risk propagation information evaluation system, storage medium and computer equipment | |
CN106202299A (en) | A kind of people with disability authority user based on people with disability's feature recommends method | |
CN113283806A (en) | Enterprise information evaluation method and device, computer equipment and storage medium | |
CN105447117A (en) | User clustering method and apparatus | |
Zhao | Economic policy uncertainty and manufacturing value-added exports | |
MacEachern | Measuring the added value of library and information services: the New Zealand approach | |
Wu et al. | Environment damage assessment: a literature review using social network analysis | |
Murugan | Creation of a recommendation system to recommend cryptocurrency portfolio using Association rule mining | |
Guerrini et al. | Measuring the efficiency of the Italian construction industry | |
CN112232945A (en) | Method and device for determining personal customer credit |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
Application publication date: 20170301 |