CN109190089A - Probabilistic Synthesis sort method - Google Patents
Probabilistic Synthesis sort method
- Publication number
- CN109190089A (application number CN201811035247.1A)
- Authority
- CN
- China
- Prior art keywords
- line
- data set
- sequence
- sequence line
- comparison element
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/18—Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F7/00—Methods or arrangements for processing data by operating upon the order or content of the data handled
- G06F7/06—Arrangements for sorting, selecting, merging, or comparing data on individual record carriers
- G06F7/08—Sorting, i.e. grouping record carriers in numerical or other ordered sequence according to the classification of at least some of the information they carry
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Mathematical Optimization (AREA)
- Pure & Applied Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Mathematical Physics (AREA)
- Mathematical Analysis (AREA)
- Computational Mathematics (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Probability & Statistics with Applications (AREA)
- Operations Research (AREA)
- Algebra (AREA)
- Bioinformatics & Computational Biology (AREA)
- Databases & Information Systems (AREA)
- Software Systems (AREA)
- Evolutionary Biology (AREA)
- Life Sciences & Earth Sciences (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The present invention relates to the field of computer technology and provides a probabilistic synthesis sorting method. The method mainly comprises the following steps: decomposing past experimental results into sequence lines that contain only two comparison elements; counting the repetition frequency of each sequence line; and selecting a starting sequence line and cyclically calculating the position accuracy of each comparison element until the optimal sequence line, formed by all comparison elements at their highest position accuracy, is obtained. The invention is the first to merge multiple sequence lines on the basis of correct probability. A sequence line need not contain all elements, and the order of some elements may be inconsistent between different sequence lines. Instead of exhaustively enumerating all possible sequence lines and then comparing their correct probabilities, the method screens by highest correct probability, builds up the sequence line of highest correct probability step by step, and finally reviews the relevant elements of that high-probability sequence line. This greatly reduces the amount of computation while giving a more accurate result.
Description
Technical field
The present invention relates to the field of computer technology, and in particular to a probabilistic synthesis sorting method.
Background art
In scientific research, some experiments produce ranking results. If the ranking results of multiple experiments can be brought together and used comprehensively to form one combined ranking, this has real scientific value.
The conventional approach pools the measured values of the elements from all experiments and sorts them. However, experimental conditions, measuring instruments and methods differ between studies, so different experiments on the same subject yield different results, and the measured values obtained for the same subject can differ greatly, which makes it hard to decide which to accept. Because some differences reach an order of magnitude, a combined ranking based on the measured values alone is highly inaccurate.
For example, one class of experiment measures composition: the experiments compare the content of active ingredient A in Chinese medicine X from several provinces, in order to find which province's Chinese medicine X has the highest content of A and hence the best quality. Some experiments measure provinces Jia, Yi and Bing, some measure Yi, Bing and Ding, and some measure Jia, Yi and Ding; each yields concrete content values of active ingredient A for Chinese medicine X from the provinces concerned. When a researcher wants a comparison across all provinces, for example which province is highest or second highest, sorting directly by the measured values reported by the experiments (whose conditions differ) gives a highly inaccurate ranking. Suppose one experiment measures 0.6 mg/ml, 0.5 mg/ml and 0.4 mg/ml for provinces Jia, Yi and Bing respectively, while another measures 10 mg/ml, 9 mg/ml and 8 mg/ml for provinces Ding, Yi and Jia respectively. Sorting directly by measured value cannot produce an accurate combined ranking.
Because conditions are consistent within a single experiment, the ranking of the provinces covered by that experiment is accurate. For example, one experiment gives Jia > Yi > Bing and another gives Ding > Yi > Jia. Obtaining a ranking of all provinces by active ingredient A of Chinese medicine X requires a sorting method that integrates the ranking results of the individual experiments. Such a method must not be limited by the differing experimental conditions of each study, and must also resolve the conflicting rankings of provinces between different experiments.
Summary of the invention
The technical problem to be solved by the present invention is to provide a probabilistic synthesis sorting method that can make comprehensive use of the ranking results of different research experiments and give a more accurate ranking result.
To solve the above technical problem, the present invention provides a probabilistic synthesis sorting method comprising the following steps:
S1: Define the data set composed of past experimental ranking results as data set P. Decompose each sequence line in data set P into sequence lines that contain only two comparison elements, and define the data set composed of all these two-element sequence lines as data set Q;
S2: Count the repetition frequency of each sequence line in data set Q;
S3: Take the sequence line with the highest frequency in data set Q as the starting sequence line. Based on comparison elements q_1 and q_2 of the starting sequence line, add one by one each comparison element q_n that newly appears in the subsequent sequence lines. Combine q_n with comparison elements q_1 to q_(n-1) and list the n sequence lines that contain q_1 to q_n; these n sequence lines form data set M, and q_n occupies a different position in each of them. Here n is a positive integer whose maximum value is the number of comparison elements in data set Q;
S4: Decompose each sequence line in data set M, by the method of step S1, into n-1 sequence lines that each contain only q_n and one of q_1 to q_(n-1). The n groups of n-1 sequence lines decomposed from data set M form data set R;
S5: Search data set Q for each sequence line of data set R, compare the ordering relation of the sequence lines in Q with that of the sequence lines in R, and mark the correct frequency or error frequency of each sequence line in R according to the comparison;
S6: Calculate the position accuracy of comparison element q_n in each sequence line of data set M. For each group of sequence lines in data set R: position accuracy = (sum of the correct frequencies of the group) / (sum of the correct and error frequencies of the group) × 100%;
S7: Select the sequence line of data set M in which q_n has the highest position accuracy as the starting sequence line for calculating the position accuracy of comparison element q_(n+1); return to step S3 and repeat steps S3 to S6 to obtain the sequence line in which q_(n+1) has the highest position accuracy. Repeat until the optimal sequence line, in which every comparison element has the highest position accuracy, is obtained.
Further, step S1 also includes: combining any two comparison elements of a sequence line p_n of data set P and ordering them according to their ordering relation in p_n, to obtain a sequence line that contains only those two comparison elements.
Further, step S5 also includes searching data set Q for each sequence line of data set R. If a first sequence line in data set R is identical to a first sequence line in data set Q, the frequency of that first sequence line in Q is marked as the correct frequency of the first sequence line in R. If a second sequence line in data set R is the opposite of a second sequence line in data set Q, the frequency of that second sequence line in Q is marked as the error frequency of the second sequence line in R. If a third sequence line in data set R has neither an identical nor an opposite sequence line in data set Q, the correct frequency of the third sequence line in R is marked as 0.
Further, in step S6, if comparison element q_n has the same position accuracy in two or more sequence lines of data set M and that accuracy is the highest, all of these sequence lines are taken as starting sequence lines and the calculation continues with the next newly appearing comparison element q_(n+1).
Further, step S7 is followed by step S8: calculate the mean position accuracy of all comparison elements in the optimal sequence line by summing the position accuracies of the comparison elements in the optimal sequence line and dividing by the number of comparison elements.
Further, step S8 is followed by step S9: review the position accuracy of each comparison element in the optimal sequence line. Search data set Q for all sequence lines that contain comparison element q_n, compare each of them with the optimal sequence line, and mark the correct frequency or error frequency of q_n according to the comparison of ordering relations.
Further, step S9 also includes: if the comparison shows the same ordering relation, the frequency in data set Q of the sequence line containing q_n is marked as a position-correct frequency of q_n in the optimal sequence line; if the comparison shows the opposite ordering relation, that frequency is marked as a position-error frequency of q_n in the optimal sequence line; if the ordering relation of a sequence line in data set Q that contains q_n does not appear in the optimal sequence line, the position accuracy of q_n in the optimal sequence line is marked as 0.
Further, step S9 is followed by step S10: calculate the mean position accuracy of all comparison elements in the optimal sequence line after review.
The above technical solution of the present invention has the following advantages. The invention addresses the problem of combining multiple sequence lines and is the first to merge them on the basis of correct probability. A sequence line need not contain all elements, and the order of some elements may be inconsistent between different sequence lines. The implementation abandons exhaustively enumerating all possible sequence lines and then comparing their correct probabilities. Instead it screens by highest correct probability, builds up the sequence line of highest correct probability step by step, and finally reviews the relevant elements of that sequence line. This greatly reduces the amount of computation, and the accuracy of the ranking result is consistent with that obtained by exhaustively enumerating all sequence lines.
Brief description of the drawings
Fig. 1 is a flow diagram of the probabilistic synthesis sorting method of the present invention.
Detailed description of the embodiments
To make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawing. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art on the basis of these embodiments without creative effort fall within the protection scope of the present invention.
Fig. 1 is a flow diagram of the probabilistic synthesis sorting method of the present invention. As shown in Fig. 1, the method comprises the following steps:
S1: Define the data set composed of past experimental ranking results as data set P. Decompose each sequence line in data set P into sequence lines that contain only two comparison elements, and define the data set composed of all these two-element sequence lines as data set Q.
In step S1, the sequence lines in data set Q are obtained by decomposing the sequence lines in data set P: any two comparison elements of the same sequence line p_n in data set P are combined and ordered according to their ordering relation in p_n, which yields one sequence line containing only those two comparison elements.
For example, data set P contains six sequence lines: p_1: A>C>B>D, p_2: A>B>D>C>Y, p_3: A>B>Y, p_4: Y>D, p_5: A>D>C>Y and p_6: A>B>D>C. The goal is a ranking of the five elements A, C, B, D and Y. Choose any two comparison elements of sequence line p_1: A>C>B>D, say A and C; their ordering relation in p_1 gives the sequence line A>C. Likewise, choosing A and B gives A>B, and so on. Decomposing p_1: A>C>B>D in this way produces the six sequence lines shown in Table 1 (the line C>D, which Table 2 counts once and which can only come from p_1, is included).
Table 1
Serial number | 1 | 2 | 3 | 4 | 5 | 6 |
Sequence line | A>C | A>B | A>D | C>B | C>D | B>D |
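The decomposition in step S1 can be illustrated with a minimal Python sketch. It assumes a sequence line is written as a string with ">" between elements; the function name `decompose` is hypothetical and not taken from the patent. A line with four elements yields 4 × 3 / 2 = 6 two-element lines.

```python
from itertools import combinations

def decompose(line):
    """Split a sequence line such as "A>C>B>D" into all two-element lines."""
    elements = line.split(">")
    # combinations() keeps the left-to-right order of its input, so every
    # pair preserves the ordering relation of the original sequence line.
    return [f"{hi}>{lo}" for hi, lo in combinations(elements, 2)]

pairs_p1 = decompose("A>C>B>D")
print(pairs_p1)  # ['A>C', 'A>B', 'A>D', 'C>B', 'C>D', 'B>D']
```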
S2: Count the repetition frequency of each sequence line in data set Q.
In step S2, the repeated sequence lines in data set Q are merged and the frequency of each is counted. For example, decomposing all six sequence lines of data set P above yields 13 distinct sequence lines, whose frequencies are listed in Table 2.
Table 2
Serial number | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 |
Sequence line | A>B | A>D | A>C | D>C | B>D | A>Y | C>Y | B>C | B>Y | D>Y | C>B | C>D | Y>D |
Frequency | 4 | 4 | 4 | 3 | 3 | 3 | 2 | 2 | 2 | 2 | 1 | 1 | 1 |
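The counts in Table 2 can be reproduced with a short Python sketch, given here only as an illustration. It assumes the six sequence lines of data set P are stored as strings; the names `P`, `Q` and `decompose` are choices made for this sketch.

```python
from collections import Counter
from itertools import combinations

# Data set P from the worked example.
P = ["A>C>B>D", "A>B>D>C>Y", "A>B>Y", "Y>D", "A>D>C>Y", "A>B>D>C"]

def decompose(line):
    """Step S1: all ordered two-element lines of one sequence line."""
    return [f"{hi}>{lo}" for hi, lo in combinations(line.split(">"), 2)]

# Step S2: data set Q with the repetition frequency of each two-element line.
Q = Counter(pair for line in P for pair in decompose(line))

for pair, freq in Q.most_common():
    print(pair, freq)
```

The sketch finds 13 distinct two-element lines with a total frequency of 32. Lines with equal frequency may print in a different order from Table 2.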
S3: Take the sequence line with the highest frequency in data set Q as the starting sequence line. Based on comparison elements q_1 and q_2 of the starting sequence line, add one by one each comparison element q_n that newly appears in the subsequent sequence lines. Combine q_n with comparison elements q_1 to q_(n-1) and list the n sequence lines that contain q_1 to q_n; these n sequence lines form data set M, and q_n occupies a different position in each of them. Here n is a positive integer whose maximum value is the number of comparison elements in data set Q.
In Table 2, for example, A>B, which occurs 4 times, is taken as the starting sequence line. Based on comparison elements A and B, the comparison elements that newly appear in the subsequent sequence lines are added one by one. The second sequence line introduces the new comparison element D. Combining D with the earlier comparison elements A and B gives three sequence lines containing A, B and D: (1) A>B>D, (2) A>D>B and (3) D>A>B. Comparison element D occupies a different position in each, so every possible position of D is listed. These three sequence lines form data set M.
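Listing the candidates of data set M amounts to inserting the new element at every possible position of the current sequence line. A minimal sketch, assuming the line is held as a Python list (the function name `insert_candidates` is hypothetical):

```python
def insert_candidates(base, new):
    """Step S3: place `new` at every position of `base`, last position first."""
    return [base[:i] + [new] + base[i:] for i in range(len(base), -1, -1)]

# Starting sequence line A>B, new comparison element D.
M = insert_candidates(["A", "B"], "D")
print([">".join(line) for line in M])  # ['A>B>D', 'A>D>B', 'D>A>B']
```

A line with n-1 elements always produces n candidates, which matches the n sequence lines of data set M.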
S4: Decompose each sequence line in data set M, by the method of step S1, into n-1 sequence lines that each contain only q_n and one of q_1 to q_(n-1). The n groups of n-1 sequence lines decomposed from data set M form data set R.
For example, the three sequence lines (1) A>B>D, (2) A>D>B and (3) D>A>B formed by comparison elements A, B and D are decomposed by the method of step S1 into three groups of sequence lines. Each group contains 2 sequence lines, and each sequence line contains only comparison element D and one of A and B, as shown in Table 3. The 3×2 sequence lines in Table 3 form data set R.
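The grouping described for step S4 can be sketched in Python as follows. This is an illustration under the assumption that only pairs involving the new element are kept; the function name `pairs_with` is hypothetical.

```python
def pairs_with(line, new):
    """Step S4: two-element lines of `line` that involve the element `new`."""
    pos = line.index(new)
    pairs = []
    for i, other in enumerate(line):
        if other == new:
            continue
        # Elements to the left of `new` rank above it, the rest rank below it.
        pairs.append(f"{other}>{new}" if i < pos else f"{new}>{other}")
    return pairs

M = [["A", "B", "D"], ["A", "D", "B"], ["D", "A", "B"]]
R = {">".join(line): pairs_with(line, "D") for line in M}
for line, group in R.items():
    print(line, group)
# A>B>D ['A>D', 'B>D']
# A>D>B ['A>D', 'D>B']
# D>A>B ['D>A', 'D>B']
```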
Table 3
S5: Search data set Q for each sequence line of data set R, compare the ordering relation of the sequence lines in Q with that of the sequence lines in R, and mark the correct frequency or error frequency of each sequence line in R according to the comparison.
If a first sequence line in data set R is identical to a first sequence line in data set Q, the frequency of that sequence line in Q is marked as the correct frequency of the first sequence line in R. For example, searching data set Q (Table 2) for the sequence line A>D of data set R (Table 3) finds the identical sequence line with a frequency of 4, so the correct frequency of A>D in data set R is marked as 4.
If a second sequence line in data set R is the opposite of a second sequence line in data set Q, the frequency of that sequence line in Q is marked as the error frequency of the second sequence line in R. For example, searching data set Q (Table 2) for the sequence line D>B of data set R (Table 3) finds the opposite sequence line B>D with a frequency of 3, so the error frequency of D>B in data set R is marked as 3.
If a third sequence line in data set R has neither an identical nor an opposite sequence line in data set Q, the correct frequency of the third sequence line in R is marked as 0. In other words, the sequence line does not occur in data set Q, which shows that no experiment produced this result, so its correct frequency is set to 0.
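The three marking rules of step S5 can be sketched as a lookup in the frequency table. The dictionary below copies Table 2. The function name `mark` and the line "Z>A" are made up for this illustration. Counting both the identical and the opposite line when both occur in Q is an assumption of this sketch; it appears consistent with the 81.82% reported for element C in Table 6.

```python
# Frequencies of the two-element sequence lines, copied from Table 2 (data set Q).
Q = {"A>B": 4, "A>D": 4, "A>C": 4, "D>C": 3, "B>D": 3, "A>Y": 3, "C>Y": 2,
     "B>C": 2, "B>Y": 2, "D>Y": 2, "C>B": 1, "C>D": 1, "Y>D": 1}

def mark(pair):
    """Step S5: return (correct frequency, error frequency) of one line of R."""
    hi, lo = pair.split(">")
    correct = Q.get(pair, 0)          # identical sequence line in Q
    error = Q.get(f"{lo}>{hi}", 0)    # opposite sequence line in Q
    return correct, error

# "Z>A" is a made-up line that has neither an identical nor an opposite line in Q.
for pair in ["A>D", "D>B", "Z>A"]:
    print(pair, mark(pair))
# A>D (4, 0)
# D>B (0, 3)
# Z>A (0, 0)
```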
After all the sequence lines in data set R have been searched, the result shown in Table 4 is obtained; Table 4 lists the correct and error frequencies of the 3×2 sequence lines.
Table 4
S6: Calculate the position accuracy of comparison element q_n in each sequence line of data set M. For each group of sequence lines in data set R:
position accuracy = (sum of the correct frequencies of the group) / (sum of the correct and error frequencies of the group) × 100%.
In Table 4, for example, comparison element D occupies a different position in each of the three sequence lines (1) A>B>D, (2) A>D>B and (3) D>A>B of data set M. By the formula above, the position accuracy of D in (1) A>B>D is (4 + 3) / (4 + 3) × 100% = 100%; in (2) A>D>B it is 4 / (4 + 3) × 100% = 57.14%; and in (3) D>A>B it is 0 / (4 + 3) × 100% = 0%. The results are shown in Table 5.
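The three percentages can be checked with a small Python sketch. It assumes the frequency table of Table 2 and the three groups of data set R described for step S4; the names `R`, `reverse` and `position_accuracy` are choices made for this illustration.

```python
# Data set Q (Table 2) and the three groups of data set R for new element D.
Q = {"A>B": 4, "A>D": 4, "A>C": 4, "D>C": 3, "B>D": 3, "A>Y": 3, "C>Y": 2,
     "B>C": 2, "B>Y": 2, "D>Y": 2, "C>B": 1, "C>D": 1, "Y>D": 1}
R = {
    "A>B>D": ["A>D", "B>D"],
    "A>D>B": ["A>D", "D>B"],
    "D>A>B": ["D>A", "D>B"],
}

def reverse(pair):
    """Return the opposite two-element line, e.g. "D>B" -> "B>D"."""
    hi, lo = pair.split(">")
    return f"{lo}>{hi}"

def position_accuracy(group):
    """Step S6: correct / (correct + error) * 100 for one group of R."""
    correct = sum(Q.get(pair, 0) for pair in group)
    error = sum(Q.get(reverse(pair), 0) for pair in group)
    total = correct + error
    return 100.0 * correct / total if total else 0.0

accuracies = {line: round(position_accuracy(group), 2) for line, group in R.items()}
print(accuracies)  # {'A>B>D': 100.0, 'A>D>B': 57.14, 'D>A>B': 0.0}
```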
Table 5
S7: Select the sequence line of data set M in which q_n has the highest position accuracy as the starting sequence line for calculating the position accuracy of comparison element q_(n+1); return to step S3 and repeat steps S3 to S6 to obtain the sequence line in which q_(n+1) has the highest position accuracy. Repeat until the optimal sequence line, in which every comparison element has the highest position accuracy, is obtained.
For example, among the three sequence lines (1) A>B>D, (2) A>D>B and (3) D>A>B of data set M, comparison element D has its highest position accuracy, 100%, in (1) A>B>D, so A>B>D is chosen as the starting sequence line for the next new comparison element C. The method returns to step S3 for the newly appearing comparison element C and repeats steps S3 to S6 to obtain the sequence line in which C has the highest position accuracy, and so on until the optimal sequence line is obtained.
Continuing the example, the position accuracies of comparison elements C and Y are calculated next. Starting from A>B>D, the sequence line in which D has the highest position accuracy, comparison element C can be combined with it into four sequence lines: (1) A>B>D>C, (2) A>B>C>D, (3) A>C>B>D and (4) C>A>B>D. The method above gives the sequence line in which C has the highest position accuracy, and the position accuracy of Y is calculated in the same way. The optimal sequence line of the five comparison elements is finally A>B>D>C>Y; the position accuracy of each comparison element in it is shown in Table 6.
Table 6
Comparison element | A | B | D | C | Y |
Position accuracy | 100.00% | 100.00% | 100.00% | 81.82% | 90.00% |
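The loop over steps S3 to S7 can be sketched end to end in Python. This is an illustrative sketch under stated assumptions, not the patented implementation: the starting sequence line A>B and the insertion order D, C, Y are passed in explicitly to follow the worked example, and a tie is resolved by keeping the first best candidate. The function names are hypothetical.

```python
from collections import Counter
from itertools import combinations

P = ["A>C>B>D", "A>B>D>C>Y", "A>B>Y", "Y>D", "A>D>C>Y", "A>B>D>C"]

# Steps S1 and S2: frequency of every ordered pair (data set Q).
Q = Counter(pair for line in P for pair in combinations(line.split(">"), 2))

def position_accuracy(line, new):
    """Steps S4 to S6: accuracy in percent of `new` at its position in `line`."""
    pos = line.index(new)
    correct = error = 0
    for i, other in enumerate(line):
        if other == new:
            continue
        pair = (other, new) if i < pos else (new, other)
        correct += Q[pair]        # identical two-element line in Q
        error += Q[pair[::-1]]    # opposite two-element line in Q
    total = correct + error
    return 100.0 * correct / total if total else 0.0

def extend(line, new):
    """Steps S3 and S7: try every position of `new` and keep the best candidate."""
    candidates = [line[:i] + [new] + line[i:] for i in range(len(line), -1, -1)]
    return max(candidates, key=lambda c: position_accuracy(c, new))

line = ["A", "B"]                 # starting sequence line of the example
accuracy = {}
for new in ["D", "C", "Y"]:       # order in which new elements appear in Table 2
    line = extend(line, new)
    accuracy[new] = round(position_accuracy(line, new), 2)

print(">".join(line), accuracy)
# A>B>D>C>Y {'D': 100.0, 'C': 81.82, 'Y': 90.0}
```

The values for D, C and Y agree with Table 6. The sketch does not compute the accuracies of the two starting elements A and B.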
This embodiment starts from the two-element sequence line with the highest frequency and extends it by calculating, each time a new element is added, the position accuracy of that element in the candidate sequence lines, until all the elements concerned have been ranked and the sequence line with the highest accuracy is formed. The implementation abandons exhaustively enumerating all possible sequence lines and then comparing their correct probabilities. Instead it screens by highest correct probability and builds up the sequence line of highest correct probability step by step.
In step S6, if comparison element q_n has the same position accuracy in two or more sequence lines of data set M and that accuracy is the highest, all of these sequence lines are taken as starting sequence lines for calculating the next newly appearing comparison element q_(n+1). Suppose, for example, that among the three sequence lines (1) A>B>D, (2) A>D>B and (3) D>A>B of data set M, comparison element D had the same, highest position accuracy in (1) and (2). Both would then be retained, and each would be used as a starting sequence line for the next new comparison element C. Because there is then more than one starting sequence line, there will also be more than one final optimal sequence line. If starting sequence lines are added repeatedly while the comparison elements are calculated, the number of final optimal sequence lines can become large.
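Keeping every tied candidate can be sketched as a filter on the maximum accuracy. The accuracies below are hypothetical values chosen only to produce a tie between lines (1) and (2); exact fractions are used so that the equality test is not affected by floating-point rounding. The function name `best_candidates` is not from the patent.

```python
from fractions import Fraction

def best_candidates(scored):
    """Return every sequence line whose position accuracy equals the maximum."""
    top = max(acc for _, acc in scored)
    return [line for line, acc in scored if acc == top]

# Hypothetical accuracies, used only to illustrate a tie.
scored = [
    ("A>B>D", Fraction(4, 5)),
    ("A>D>B", Fraction(4, 5)),
    ("D>A>B", Fraction(1, 5)),
]
starts = best_candidates(scored)
print(starts)  # ['A>B>D', 'A>D>B']
```

Each returned line would then serve as a separate starting sequence line for the next comparison element.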
Step S7 is followed by step S8: calculate the mean position accuracy of all comparison elements in the optimal sequence line.
In step S8, the position accuracies of the comparison elements in the optimal sequence line are summed and divided by the number of comparison elements, giving the mean position accuracy of the optimal sequence line. When there are several optimal sequence lines, the mean position accuracy of each is calculated and they are ranked by this value. For example, the mean position accuracy for Table 6 above is (100.00% + 100.00% + 100.00% + 81.82% + 90.00%) / 5 = 94.36%.
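The arithmetic of step S8 can be checked in a few lines of Python; the dictionary simply copies the values of Table 6, and the variable names are chosen for this sketch.

```python
# Position accuracies of the optimal sequence line A>B>D>C>Y, copied from Table 6.
table6 = {"A": 100.00, "B": 100.00, "D": 100.00, "C": 81.82, "Y": 90.00}

# Step S8: sum of the position accuracies divided by the number of elements.
mean_accuracy = sum(table6.values()) / len(table6)
print(f"{mean_accuracy:.2f}%")  # 94.36%
```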
Several starting sequence lines may be chosen while the comparison elements are calculated. At the very first choice of starting sequence line, for example, three sequence lines in data set Q share the highest frequency, namely A>B, A>D and A>C, and all three can serve as starting sequence lines for the subsequent calculation. In addition, equal position accuracies arising during the calculation add starting sequence lines for the next comparison element. The calculation over several starting sequence lines can therefore yield several optimal sequence lines. Calculating the mean position accuracy of all comparison elements in each optimal sequence line allows these optimal sequence lines to be ranked further.
The calculation above starts from the sequence line with the highest frequency and extends the sequence line step by step from two elements to many. It considers only the comparison elements that precede each newly added comparison element, so the relevant sequence lines can be listed without a particularly large amount of computation. From a more comprehensive point of view, however, all elements should take part in calculating the position accuracy of any given element: not only the elements ranked earlier but also those ranked later.
For this reason, step S8 is followed by step S9: review the position accuracy of each comparison element in the optimal sequence line. Search data set Q for all sequence lines that contain comparison element q_n and compare each of them with the optimal sequence line. If the comparison shows the same ordering relation, the frequency in data set Q of that sequence line is marked as a position-correct frequency of q_n in this optimal sequence line. If the comparison shows the opposite ordering relation, that frequency is marked as a position-error frequency of q_n in this optimal sequence line. If the ordering relation of a sequence line in data set Q that contains q_n does not appear in this optimal sequence line, the position accuracy of q_n in this optimal sequence line is marked as 0.
This step mainly checks the position accuracy of each element and the mean place accuracy of all comparison elements,
Such as optimal sequencing line is A > B > D > C > Y, the sequence line frequency of occurrence of data set Q, it is first to search for all about comparison in reference table 2
The sequence line of plain A can obtain as shown in table 7 as a result, the position accuracy review result of so comparison element A is that A=is correct
The total frequency 15*100%=100% of the frequency 15/.
Table 7
Two-element sequence line containing A | Frequency | Compared with optimal sequence line |
A>B | 4 | Correct |
A>C | 4 | Correct |
A>D | 4 | Correct |
A>Y | 3 | Correct |
Total | 15 | Correct total: 15 |
Similarly, the position accuracy of comparison element B is reviewed, with the results shown in Table 8. The reviewed position accuracy of comparison element B is: correct frequency 11 / total frequency 12 × 100% = 91.67%.
Table 8
Two-element sequence line containing B | Frequency | Compared with optimal sequence line |
A>B | 4 | Correct |
B>D | 3 | Correct |
B>C | 2 | Correct |
B>Y | 2 | Correct |
C>B | 1 | Wrong |
Total | 12 | Correct total: 11 |
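The review in Tables 7 and 8 can be reproduced with a short sketch. It is only an illustration under stated assumptions: the function name `review_accuracy` and the dictionary layout are hypothetical, the frequencies are copied from Tables 7 and 8 only (pairs that involve neither A nor B are not shown there and are left out), and every element is assumed to appear in the optimal sequence line.

```python
from typing import Dict, List, Tuple

# Pair frequencies copied from Tables 7 and 8. A key (x, y) means "x > y".
# Pairs that involve neither A nor B are not listed in those tables.
PAIR_FREQ: Dict[Tuple[str, str], int] = {
    ("A", "B"): 4, ("A", "C"): 4, ("A", "D"): 4, ("A", "Y"): 3,
    ("B", "D"): 3, ("B", "C"): 2, ("B", "Y"): 2, ("C", "B"): 1,
}


def review_accuracy(element: str, optimal: List[str],
                    pair_freq: Dict[Tuple[str, str], int]) -> float:
    """Reviewed position accuracy (in percent) of one element (hypothetical helper)."""
    rank = {e: i for i, e in enumerate(optimal)}  # smaller index = ranked higher
    correct = wrong = 0
    for (higher, lower), freq in pair_freq.items():
        if element not in (higher, lower):
            continue  # this two-element line does not contain the element
        if higher not in rank or lower not in rank:
            continue  # assumption: pairs outside the optimal line are skipped
        if rank[higher] < rank[lower]:
            correct += freq  # same ordering relation as the optimal line
        else:
            wrong += freq    # opposite ordering relation
    total = correct + wrong
    return 100.0 * correct / total if total else 0.0


OPTIMAL = ["A", "B", "D", "C", "Y"]
acc_a = review_accuracy("A", OPTIMAL, PAIR_FREQ)
acc_b = review_accuracy("B", OPTIMAL, PAIR_FREQ)
print(f"A: {acc_a:.2f}%  B: {acc_b:.2f}%")
```

With the frequencies above, A is correct in 15 of 15 occurrences and B in 11 of 12, matching the figures stated for Tables 7 and 8.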
Table 9 lists the review results for the position accuracy of all comparison elements. From the reviewed position accuracies, the mean position accuracy of all comparison elements after review can also be calculated. Finally, the sequence lines are ranked by mean accuracy from high to low, and the sequence line with the highest mean accuracy is offered to the user.
Table 9
The sorting procedure first selects a two-element sequence line as the starting sequence line and then lengthens it as further elements are added, so the number of elements taking part in the ranking grows with each addition. As a result, elements ranked earlier cannot be compared with elements that have not yet joined the ranking. The review step in the present invention therefore compares the ranking of each element against all other elements for accuracy, which makes the calculation more comprehensive without making the amount of computation too large.
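The incremental construction described above (start from the most frequent two-element sequence line, then insert each new element at the position with the highest position accuracy) can be sketched as follows. This is a simplified illustration, not the patented procedure itself: the names `position_accuracy`, `build_optimal_line` and `HYPOTHETICAL_Q` are hypothetical, the frequencies are invented for the example, new elements are assumed to be taken in order of first appearance when the pairs are sorted by descending frequency, and when several positions tie for the highest accuracy only the first is kept rather than all tied lines.

```python
from typing import Dict, List, Tuple

PairFreq = Dict[Tuple[str, str], int]  # key (x, y) means "x > y"

# Invented frequencies for illustration only; not taken from any table.
HYPOTHETICAL_Q: PairFreq = {
    ("x", "y"): 5, ("x", "z"): 4, ("y", "z"): 3,
    ("x", "w"): 2, ("y", "w"): 2, ("z", "w"): 2,
    ("z", "y"): 1, ("w", "z"): 1,
}


def position_accuracy(line: List[str], new_element: str,
                      pair_freq: PairFreq) -> float:
    """Accuracy (percent) of new_element's position in a candidate line."""
    idx = line.index(new_element)
    correct = wrong = 0
    for j, other in enumerate(line):
        if other == new_element:
            continue
        # Pair implied by the candidate line, higher-ranked element first.
        implied = (new_element, other) if idx < j else (other, new_element)
        correct += pair_freq.get(implied, 0)
        wrong += pair_freq.get((implied[1], implied[0]), 0)
    total = correct + wrong
    return 100.0 * correct / total if total else 0.0


def build_optimal_line(pair_freq: PairFreq) -> List[str]:
    """Greedy insertion of elements, one at a time (simplified sketch)."""
    ordered = sorted(pair_freq, key=pair_freq.get, reverse=True)
    line = list(ordered[0])  # most frequent pair is the starting line
    pending: List[str] = []
    for pair in ordered[1:]:
        for element in pair:
            if element not in line and element not in pending:
                pending.append(element)
    for element in pending:
        # One candidate line per possible position of the new element.
        candidates = [line[:i] + [element] + line[i:]
                      for i in range(len(line) + 1)]
        line = max(candidates,
                   key=lambda c: position_accuracy(c, element, pair_freq))
    return line


print(" > ".join(build_optimal_line(HYPOTHETICAL_Q)))
```

Because each new element is scored only against elements already in the line, earlier elements are never re-scored against later ones; that gap is what the review step is meant to close.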
The method of the invention is illustrated below with experiments that measure the active ingredients of genuine regional medicinal materials.
Genuine regional medicinal materials, also known as authentic medicinal herbs, are medicinal materials that have been selected through long-term clinical use in traditional Chinese medicine and are produced in a specific region by a specific production process. Compared with the same materials produced in other regions, they are of better quality and efficacy and enjoy a higher reputation.
Radix Salviae Miltiorrhizae is one of the genuine medicinal materials studied in active-ingredient determination experiments. Many experiments assess the quality of the Radix Salviae Miltiorrhizae from each province by studying its active ingredients, among which Tanshinone I is a common and important one. To determine which province has the highest Tanshinone I content, the provinces are ranked using the method of the invention.
The provinces covered by each experimental study of Tanshinone I are ranked, and each ranking is decomposed into sequence lines of two comparison elements; some examples are shown in Table 10. The data in Table 10 were selected from published technical journals; the table lists part of the measured Tanshinone I results for different provinces, which serve as the starting point for the ranking.
Table 10
Table 11 lists the sequence lines containing only two comparison elements that are obtained by decomposing the sequence line of each experiment.
Table 11
Experiment No. | Two-element sequence line |
3 | Henan Province > Anhui Province |
3 | Henan Province > Shaanxi Province |
3 | Henan Province > Gansu Province |
3 | Henan Province > Sichuan Province |
3 | Henan Province > Shandong Province |
3 | Henan Province > Hebei Province |
3 | Anhui Province > Shaanxi Province |
3 | Anhui Province > Gansu Province |
3 | Anhui Province > Sichuan Province |
3 | Anhui Province > Shandong Province |
3 | Anhui Province > Hebei Province |
3 | Shaanxi Province > Gansu Province |
…… | …… |
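The decomposition shown in Table 11 amounts to listing every ordered pair of a ranking. A minimal sketch, assuming each experiment's ranking is held as a list from highest to lowest; the helper name `decompose` is hypothetical, and the four-province input is a shortened illustration consistent with the first rows of Table 11, not the full ranking of experiment 3.

```python
from itertools import combinations
from typing import List, Tuple


def decompose(line: List[str]) -> List[Tuple[str, str]]:
    """Split a ranking (highest first) into two-element sequence lines.

    combinations() keeps the input order, so every tuple (x, y) it yields
    has x ranked above y in the original line.
    """
    return list(combinations(line, 2))


pairs = decompose(["Henan", "Anhui", "Shaanxi", "Gansu"])
for higher, lower in pairs:
    print(f"{higher} Province > {lower} Province")
```

A ranking of k elements yields k(k-1)/2 two-element sequence lines, so the four provinces above produce six lines.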
Table 12 lists the frequency of each two-element sequence line, counted over all experiments.
Table 12
Two-element sequence line | Frequency |
Shandong Province > Henan Province | 10 |
Henan Province > Anhui Province | 8 |
Shandong Province > Hebei Province | 8 |
Shandong Province > Sichuan Province | 8 |
Shandong Province > Anhui Province | 7 |
Henan Province > Hebei Province | 6 |
Hebei Province > Anhui Province | 6 |
Henan Province > Sichuan Province | 6 |
Shandong Province > Jiangsu Province | 5 |
Sichuan Province > Henan Province | 4 |
Jiangsu Province > Anhui Province | 4 |
…… | …… |
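A frequency table of this kind can be produced by counting the decomposed pairs across all experiments. The sketch below uses invented rankings purely for illustration (they are not the journal data behind Table 12), and the variable names are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical experiment rankings, highest Tanshinone I content first.
experiments = [
    ["Shandong", "Henan", "Anhui"],
    ["Shandong", "Henan", "Hebei"],
    ["Henan", "Shandong", "Anhui"],
    ["Shandong", "Henan"],
]

# Count every two-element sequence line over all experiments.
pair_counts = Counter(
    pair for line in experiments for pair in combinations(line, 2)
)

# The most frequent pair would serve as the starting sequence line.
most_common_pair, top_freq = pair_counts.most_common(1)[0]

for (higher, lower), freq in pair_counts.most_common():
    print(f"{higher} Province > {lower} Province | {freq} |")
```

Opposite orderings such as Shandong > Henan and Henan > Shandong are counted as separate sequence lines, which is what later allows a correct frequency and a wrong frequency to be distinguished.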
Table 13 lists the accuracy of the optimal sequence lines obtained by the method of this embodiment. Repeated cycles of the correct-probability calculation produce several optimal sequence lines, which are listed in order of accuracy from high to low.
Table 13
Table 14 shows the result of reviewing the accuracy of each optimal sequence line: the average correct probability is recalculated, and the sequence lines are listed in order of average correct probability for use by researchers.
Table 14
Finally, it should be noted that the above embodiments merely illustrate the technical solutions of the present invention and do not limit them. Although the invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in those embodiments may still be modified, or some of their technical features replaced by equivalents, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention.
Claims (8)
1. A probabilistic synthesis sort method, characterized by comprising the following steps:
S1: a data set composed of past experimental ranking results is defined as data set P; each sequence line in the data set P is decomposed into sequence lines containing only two comparison elements, and the data set composed of all sequence lines containing only two comparison elements is defined as data set Q;
S2: the frequency of occurrence of each sequence line in the data set Q is counted;
S3: the sequence line with the highest frequency of occurrence in the data set Q is taken as the starting sequence line; based on comparison element q_1 and comparison element q_2 in the starting sequence line, the newly appearing comparison elements q_n in subsequent sequence lines are added one by one; the comparison element q_n is combined with comparison elements q_1 to q_(n-1), and n sequence lines containing comparison elements q_1 to q_(n-1) are listed, the n sequence lines forming data set M, wherein the position of the comparison element q_n is different in each sequence line of the data set M, and n is a positive integer whose maximum value is the number of comparison elements in the data set Q;
S4: each sequence line in the data set M is decomposed, by the method of step S1, into n-1 sequence lines each containing only the comparison element q_n and one of the comparison elements q_1 to q_(n-1); the n groups of n-1 sequence lines decomposed from the data set M form data set R;
S5: each sequence line in the data set R is searched for in the data set Q, the ordering relations of the sequence lines in the data set Q and in the data set R are compared, and the correct frequency or wrong frequency of each sequence line in the data set R is marked according to the comparison result;
S6: the position accuracy of the comparison element q_n in each sequence line of the data set M is calculated by the formula: (sum of the correct frequencies of each group of sequence lines in the data set R) / (sum of the correct frequencies and wrong frequencies of that group of sequence lines in the data set R) × 100%;
S7: the sequence line of the data set M in which the comparison element q_n has the highest position accuracy is selected as the starting sequence line for calculating the position accuracy of comparison element q_(n+1); the method returns to step S3 and repeats steps S3 to S6 to obtain the sequence line in which the comparison element q_(n+1) has the highest position accuracy, and the loop is executed until the optimal sequence line, in which every comparison element has the highest position accuracy, is obtained.
2. The probabilistic synthesis sort method according to claim 1, characterized in that step S1 further comprises: combining any two comparison elements in a sequence line p_n of the data set P and ordering them according to their ordering relation in the sequence line p_n, to obtain a sequence line containing only those two comparison elements.
3. The probabilistic synthesis sort method according to claim 1, characterized in that step S5 further comprises: searching the data set Q for each sequence line in the data set R; if a first sequence line in the data set R is identical to a first sequence line in the data set Q, the frequency of occurrence of the first sequence line in the data set Q is marked as the correct frequency of the first sequence line in the data set R; if a second sequence line in the data set R is opposite to a second sequence line in the data set Q, the frequency of occurrence of the second sequence line in the data set Q is marked as the wrong frequency of the second sequence line in the data set R; if a third sequence line in the data set R has neither an identical nor an opposite sequence line in the data set Q, the correct frequency of the third sequence line in the data set R is marked as 0.
4. The probabilistic synthesis sort method according to claim 1, characterized in that, in step S6, if the position accuracy of the comparison element q_n is identical in two or more sequence lines of the data set M and is the highest position accuracy, all of those two or more sequence lines are taken as starting sequence lines, and the calculation continues with the next newly appearing comparison element q_(n+1).
5. The probabilistic synthesis sort method according to claim 1, characterized by further comprising, after step S7, step S8: calculating the mean position accuracy of all comparison elements in the optimal sequence line, by summing the position accuracy of each comparison element in the optimal sequence line and dividing the sum by the number of comparison elements, to obtain the mean position accuracy of the optimal sequence line.
6. The probabilistic synthesis sort method according to any one of claims 1 to 5, characterized by further comprising, after step S8, step S9: reviewing the position accuracy of the comparison elements in the optimal sequence line, by searching the data set Q for all sequence lines containing the comparison element q_n, comparing each sequence line containing the comparison element q_n with the optimal sequence line, and marking the correct frequency or wrong frequency of the comparison element q_n according to the result of comparing the ordering relations.
7. The probabilistic synthesis sort method according to claim 6, characterized in that step S9 further comprises: if the comparison result is that the ordering relation is the same, the frequency of occurrence in the data set Q of the sequence line containing the comparison element q_n is marked as the position-correct frequency of the comparison element q_n in the optimal sequence line; if the comparison result is that the ordering relation is opposite, the frequency of occurrence in the data set Q of the sequence line containing the comparison element q_n is marked as the position-error frequency of the comparison element q_n in the optimal sequence line; if the ordering relation of a sequence line in the data set Q containing the comparison element q_n does not appear in the optimal sequence line, the position accuracy of the comparison element q_n in the optimal sequence line is marked as 0.
8. The probabilistic synthesis sort method according to claim 7, characterized by further comprising, after step S9, step S10: calculating the mean position accuracy of all comparison elements in the optimal sequence line after review.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811035247.1A CN109190089B (en) | 2018-09-06 | 2018-09-06 | Probability comprehensive ordering method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109190089A (en) | 2019-01-11 |
CN109190089B (en) | 2023-01-03 |
Family
ID=64914751
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811035247.1A Active CN109190089B (en) | 2018-09-06 | 2018-09-06 | Probability comprehensive ordering method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109190089B (en) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1530852A (en) * | 2003-03-10 | 2004-09-22 | 磊 杨 | Computer sequencing technology based on probability distribution |
CN101807925A (en) * | 2010-02-08 | 2010-08-18 | 南京朗坤软件有限公司 | Historical data compression method based on numerical ordering and linear fitting |
CN104751254A (en) * | 2015-04-23 | 2015-07-01 | 国家电网公司 | Line loss rate prediction method based on non-isometric weighted grey model and fuzzy clustering sorting |
Also Published As
Publication number | Publication date |
---|---|
CN109190089B (en) | 2023-01-03 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||