CN110399540B - Instance retrieval method integrating correlation function and D-HS index - Google Patents


Info

Publication number
CN110399540B
Authority
CN
China
Prior art keywords
fuzzy
formula
interval
correlation function
index
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910662850.0A
Other languages
Chinese (zh)
Other versions
CN110399540A (en)
Inventor
赵燕伟
徐晨
朱芬
桂方志
任设东
黄程侃
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang University of Technology ZJUT
Original Assignee
Zhejiang University of Technology ZJUT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang University of Technology ZJUT
Priority to CN201910662850.0A
Publication of CN110399540A
Application granted
Publication of CN110399540B
Active
Anticipated expiration


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/901Indexing; Data structures therefor; Storage structures
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/903Querying
    • G06F16/9032Query formulation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/903Querying
    • G06F16/90335Query processing

Abstract

An instance retrieval method fusing a correlation function with a D-HS index comprises the following. By combining fuzzy numbers with the correlation function, the concept of a fuzzy correlation function and its calculation formulas are proposed; by further combining the analytic hierarchy process, a similarity measure based on combining single-dimensional correlation functions is proposed. Compared with the full-dimensional correlation function calculation, this measure preserves index precision while improving index speed in high-dimensional retrieval. The data are transformed nonlinearly with a Sigmoid function, which effectively addresses the low index accuracy that uneven data distribution causes in the traditional D-HS index method. Fusing the two improved methods makes the traditional D-HS index more rigorous and accurate, avoids computing the correlation-function similarity for every instance in the full instance library, and solves the retrieval utility problem to a certain extent.

Description

Instance retrieval method integrating correlation function and D-HS index
Technical Field
The invention relates to an instance retrieval method.
Background
How to measure the distance between instances is the core problem of instance retrieval. Correlation functions, mixed-attribute distances, the nearest neighbor method, and improvements on them have been proposed in succession and are very effective in improving index precision. However, instance retrieval must reconcile a basic contradiction between index precision and index speed: increasingly complex calculation methods improve precision but bring a huge amount of computation, which reduces index speed. Index speed is also affected by the number of instances in the instance library. While building the library, designers often keep adding instances to enrich the instance knowledge as much as possible, so part of the retrieval work becomes redundant calculation that slows indexing without benefiting index precision. This is the retrieval utility problem.
To address the contradiction between index speed and index precision, the invention combines fuzzy numbers with the correlation function and proposes the concept of a fuzzy correlation function together with its calculation formulas. For multi-attribute products it abandons the computationally expensive full-dimensional correlation function calculation and instead uses a similarity measure that combines single-dimensional correlation functions with the analytic hierarchy process. The D-HS index is also improved based on the Sigmoid function, and the measure is fused with the improved D-HS index, so that part of the redundant calculation in instance retrieval is avoided and the retrieval utility problem is effectively addressed.
Disclosure of Invention
The invention provides an instance retrieval method integrating a correlation function and a D-HS index. It aims to resolve the basic contradiction between index precision and index speed and the retrieval utility problem, both common in instance retrieval, and to achieve faster and more reasonable retrieval.
The technical solution adopted by the invention to solve these technical problems is as follows:
An instance retrieval method fusing a correlation function and a D-HS index comprises the following steps:
S1: Improve the D-HS indexing method based on a Sigmoid function. To address the low index accuracy caused by the non-uniform distribution of attribute values in the instance library, the invention applies a nonlinear Sigmoid transformation to the attribute values. Without changing the overall distribution or ordering of the data, the transformation expands the dense regions and compresses the sparse regions, which improves how well the data can be distinguished and divided into levels. The Sigmoid function is expressed as follows:
Figure BDA0002139106260000011
In the formula, the coefficient β is the median of the sample data and the coefficient α = ln 9 / t, where the parameter t is the smaller of the two distances from the 90% quantile and the 10% quantile of the sample data to the median.
S11: Take the natural logarithm (base e) of the original data in the instance library to obtain the ln data, and sort the ln data in descending order.
S12: Calculate the coefficients α and β of formula (1) from the ln data, then substitute the ln data into formula (1) to obtain the Sigmoid data.
S13: Divide the index intervals according to the Sigmoid data to form an index table.
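Steps S11 to S13 can be sketched in Python as follows. This is only an illustration under stated assumptions: formula (1) is not legible in this text, so the usual logistic form 1 / (1 + exp(-α(x - β))) is assumed; with α = ln 9 / t that form maps the point β + t to exactly 0.9, which is consistent with the quantile-based definition of t. The quantile interpolation method, the equal-width intervals and the names `sigmoid_params`, `build_index` and `n_intervals` are hypothetical choices, not taken from the patent.

```python
import math
import statistics


def sigmoid_params(ln_values):
    """Return (alpha, beta): beta is the median, alpha = ln 9 / t."""
    data = sorted(ln_values)
    beta = statistics.median(data)
    # n=10 yields the nine decile cut points; the first and last are the
    # 10% and 90% quantiles. The "inclusive" method is an assumption.
    deciles = statistics.quantiles(data, n=10, method="inclusive")
    q10, q90 = deciles[0], deciles[-1]
    t = min(q90 - beta, beta - q10)
    if t <= 0:
        raise ValueError("degenerate sample: a quantile coincides with the median")
    return math.log(9) / t, beta


def sigmoid(x, alpha, beta):
    # Assumed logistic form of formula (1).
    return 1.0 / (1.0 + math.exp(-alpha * (x - beta)))


def build_index(raw_values, n_intervals=5):
    """Run S11-S13 for one attribute; raw values must be positive."""
    # S11: natural logarithm, then descending order.
    ln_data = sorted((math.log(v) for v in raw_values), reverse=True)
    # S12: coefficients from the ln data, then the Sigmoid data.
    alpha, beta = sigmoid_params(ln_data)
    sig = [sigmoid(v, alpha, beta) for v in ln_data]
    # S13: assumed equal-width intervals on (0, 1); bin k covers [k/n, (k+1)/n).
    table = [min(int(s * n_intervals), n_intervals - 1) for s in sig]
    return alpha, beta, sig, table
```

Calling `build_index` once per attribute would give one index table per dimension. The interval boundaries actually used by the invention may differ from the equal-width choice made here.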
S2: Obtain the preliminary index set $S_P$. Convert the relevant attribute values of the target instance into Sigmoid data and look them up in the index table. Using formula (2), calculate the attribute matching degree of each instance in the instance library relative to the target instance, and reject the instances whose attribute matching degree is too low (the proportion depends on the number of instances, generally 60% to 80%) to form the preliminary index set $S_P$:
Figure BDA0002139106260000021
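The filtering in step S2 can be illustrated with a short Python sketch. Formula (2) is not legible in this text, so the matching degree below is an assumed reading: it counts the attributes whose index interval coincides with that of the target instance. The function names, the `library` layout and the `min_degree` threshold are hypothetical.

```python
def attribute_matching_degree(instance_bins, target_bins):
    """Count attributes that fall in the same index interval as the target.

    Assumed reading of formula (2); the patent's definition may differ.
    """
    if len(instance_bins) != len(target_bins):
        raise ValueError("instance and target must have the same number of attributes")
    return sum(1 for a, b in zip(instance_bins, target_bins) if a == b)


def preliminary_index_set(library, target_bins, min_degree):
    """Keep only instances whose matching degree reaches min_degree."""
    selected = {}
    for name, bins in library.items():
        degree = attribute_matching_degree(bins, target_bins)
        if degree >= min_degree:
            selected[name] = degree
    return selected
```

Only the instances kept here go on to the more expensive correlation function calculation of step S3, which is where the speed gain comes from.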
S3: preliminary index set SPAnd calculating an internal instance correlation function. At present, the measurement of the fuzzy parameter by the correlation function is in a blank stage, and the fuzzy parameter includes a fuzzy numerical type, a fuzzy interval value type and a fuzzy conceptual type. The invention combines the expression modes of the correlation function and the fuzzy number and provides the concept of the fuzzy correlation function and a related calculation formula. In addition, the invention provides a unit correlation function combination similarity measurement method combined with an analytic hierarchy process.
S31: the correlation function is similar to a membership function in a fuzzy set theory, and can quantitatively describe the degree of things belonging to a certain interval or having a certain property, wherein the expression (3) is a correlation function expression, and the expressions (4) and (5) are side distance and bit value expressions respectively.
Figure BDA0002139106260000022
Figure BDA0002139106260000023
$D(x, X_0, X) = \rho(x, x_0, X) - \rho(x, x_0, X_0)$ (5)
In formula (4), $x$ and $x_0$ denote a point in space and the optimal point, respectively, and $X$ denotes an interval constraint in space; $x_1$ and $x_2$ denote the intersections of the line through $x$ and $x_0$ with the boundary of the interval $X$, where $x_1$ is the intersection closer to $x_0$; $\mathrm{Ext}(x_0x_1)$ denotes the ray along the line $x_0x_1$ with origin $x_1$, and likewise for $\mathrm{Ext}(x_0x_2)$; $x = x_0$ means that the point $x$ coincides with the optimal point $x_0$; in $-\max\{|x_0M|\}$, $M$ ranges over the vertices of the interval $X$, so $-\max\{|x_0M|\}$ is the negative of the maximum distance from the optimal point $x_0$ to the vertices of the interval.
From formula (5), the bit value $D(x, X_0, X)$ equals the difference between the side distance $\rho(x, x_0, X)$ taken over the interval $X$ and the side distance $\rho(x, x_0, X_0)$ taken over the interval $X_0$. The intervals $X$ and $X_0$ represent the feasible interval and the ideal interval, respectively, so
Figure BDA00021391062600000310
When the intervals $X$ and $X_0$ have no common boundary, $D(x, X_0, X)$ is always less than 0; when a common boundary exists, $D(x, X_0, X)$ is always less than or equal to 0.
S32: A fuzzy number is usually represented as a triangular fuzzy number or a trapezoidal fuzzy number; formulas (6) and (7) are their respective membership functions.
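The sign behaviour of formula (5) can be checked numerically. The sketch below deliberately swaps in the classical one-dimensional extension distance from extenics, ρ(x, [a, b]) = |x - (a + b)/2| - (b - a)/2, in place of the side distance of formula (4), which depends on the optimal point and is not legible in this text. The function names are hypothetical, and the ideal interval is assumed to lie inside the feasible interval.

```python
def extension_distance(x, interval):
    """Classical extension distance from a point to an interval [a, b].

    Stand-in for the side distance of formula (4); it ignores the optimal point.
    """
    a, b = interval
    return abs(x - (a + b) / 2) - (b - a) / 2


def place_value(x, ideal, feasible):
    """Formula (5): distance to the feasible interval minus distance to the ideal one."""
    return extension_distance(x, feasible) - extension_distance(x, ideal)
```

For an ideal interval (2, 4) inside a feasible interval (0, 10), `place_value` is negative at every point, for example -2 at x = 3. If the two intervals share the left endpoint 0, the value reaches 0 for points to the left of that endpoint, matching the "less than or equal to 0" case.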
Figure BDA0002139106260000031
In the formula, $s_1$ and $s_2$ are the two support points and $s_M$ is the peak point; the membership degree is 0 at the support points and outside them, and 1 at the peak point. In practice, a triangular fuzzy number can be written in the following notation:
Figure BDA0002139106260000032
Figure BDA0002139106260000033
In the formula, $t_1$ and $t_2$ are the two support points; the membership degree is 0 at the support points and outside them, and is 1 when
Figure BDA0002139106260000034
holds. In practice, a trapezoidal fuzzy number can be written in the following notation:
Figure BDA0002139106260000035
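The two membership functions can be sketched in Python as follows, assuming the standard piecewise-linear definitions, which agree with the properties stated above (0 at and outside the support points, 1 at the peak or on the plateau). Formulas (6) and (7) are not legible in this text, so the plateau endpoint names `p1` and `p2` are hypothetical.

```python
def triangular_membership(x, s1, s_m, s2):
    """Membership of x in a triangular fuzzy number with supports s1, s2 and peak s_m."""
    if x <= s1 or x >= s2:
        return 0.0
    if x <= s_m:
        return (x - s1) / (s_m - s1)  # rising edge
    return (s2 - x) / (s2 - s_m)      # falling edge


def trapezoidal_membership(x, t1, p1, p2, t2):
    """Membership of x in a trapezoidal fuzzy number with supports t1, t2 and plateau [p1, p2]."""
    if x <= t1 or x >= t2:
        return 0.0
    if x < p1:
        return (x - t1) / (p1 - t1)   # rising edge
    if x <= p2:
        return 1.0                    # plateau
    return (t2 - x) / (t2 - p2)       # falling edge
```

A value such as "about 5" could then be modelled as `triangular_membership(x, 4, 5, 6)`, and an interval such as "about 2 to 3" as `trapezoidal_membership(x, 1, 2, 3, 4)`; the spreads in these two calls are illustrative only.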
Suppose that the m-th and n-th attributes of an instance $c_i$ are of the fuzzy numerical type (about
Figure BDA0002139106260000036
) and the fuzzy interval value type (about
Figure BDA0002139106260000037
), respectively. Then the m-th attribute can be represented by the triangular fuzzy number
Figure BDA0002139106260000038
where:
Figure BDA0002139106260000039
Figure BDA0002139106260000041
Similarly, the n-th attribute can be represented by the trapezoidal fuzzy number
Figure BDA0002139106260000042
where:
Figure BDA0002139106260000043
Based on this representation of fuzzy numbers, the invention maps them by analogy onto the three elements needed to compute the correlation function (the optimal point, the ideal interval and the feasible interval). For the triangular fuzzy number
Figure BDA0002139106260000044
$s_M$ corresponds to the optimal point $x_0$ and the interval $[s_1, s_2]$ corresponds to the feasible interval $[X_1, X_2]$; the ideal interval $[x_1, x_2]$ is then derived from the optimal point and the feasible interval according to the user's requirements and the design standards. Similarly, for the trapezoidal fuzzy number
Figure BDA0002139106260000045
the interval
Figure BDA0002139106260000046
corresponds to the ideal interval $[x_1, x_2]$ and the interval $[t_1, t_2]$ corresponds to the feasible interval $[X_1, X_2]$; the optimal point $x_0$ is obtained by combining these intervals.
S33: Correlation function measure for fuzzy numerical parameters. Based on the representations of the fuzzy number and the correlation function and the relation between them, the invention proposes the expression and the mathematical calculation formula of the fuzzy numerical side distance $\rho(x, x_0, X)$, shown in formulas (10) and (11), respectively.
Figure BDA0002139106260000047
In the formula,
Figure BDA0002139106260000048
denotes the area of triangle $Qxx_0$, and
Figure BDA0002139106260000049
denotes the area of trapezoid $Qx_0xM$ (the remaining terms are analogous);
Figure BDA00021391062600000410
denotes the larger of the areas of triangle $QX_1x_0$ and triangle $QX_2x_0$. Because the invention specifies that $X_1$ is the endpoint closer to $x_0$,
Figure BDA00021391062600000411
Figure BDA0002139106260000051
The correlation function requires the side distance with respect to both the feasible interval $X$ and the ideal interval $X_0$. For fuzzy numerical parameters the two calculations are analogous: the feasible interval $X = [X_1, X_2]$ is simply replaced by the ideal interval $X_0 = [x_1, x_2]$. The calculation diagrams and formulas for the side distance $\rho(x, x_0, X_0)$ are therefore not repeated here.
Substituting the calculated side distances $\rho(x, x_0, X)$ and $\rho(x, x_0, X_0)$ into formulas (3) and (5) gives the correlation function value of the fuzzy numerical parameter.
S34: Correlation function measure for fuzzy interval value parameters. The invention proposes the expression and the mathematical calculation formula of the fuzzy interval value side distance $\rho(x, x_0, X)$, shown in formulas (12) and (13), respectively.
Figure BDA0002139106260000052
Figure BDA0002139106260000053
Unlike fuzzy numerical parameters, the two side distances of a fuzzy interval value parameter are calculated differently. The expression of the side distance $\rho(x, x_0, X_0)$ for the ideal interval $X_0$ is given by formula (14), and formula (15) is the corresponding mathematical calculation formula.
Figure BDA0002139106260000061
Figure BDA0002139106260000062
Substituting the calculated side distances $\rho(x, x_0, X)$ and $\rho(x, x_0, X_0)$ into formulas (3) and (5) gives the correlation function value of the fuzzy interval value parameter.
S35: Correlation function measure for fuzzy conceptual parameters. For fuzzy conceptual parameters, the invention introduces a fuzzy semantic conversion table (Table 1) to quantify fuzzy words. The quantized value is then treated as a fuzzy numerical parameter, that is, represented by a triangular fuzzy number (a performance rated very good is converted into the fuzzy numerical parameter "about 1.0", and a performance rated very poor into "about 0.2"), and the correlation function value is obtained from the corresponding formulas.
Table 1 Fuzzy semantic conversion table
Figure BDA0002139106260000063
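The conversion of a fuzzy word into a triangular fuzzy number can be sketched as follows. The five scale values are the ones listed in claim 1 for step S35 (the English labels are translations); the `spread` that turns "about v" into support points is a hypothetical parameter, since the corresponding formula is not legible in this text.

```python
# Scale values as listed in claim 1, step S35.
FUZZY_SCALE = {
    "very poor": 0.2,
    "poor": 0.4,
    "average": 0.6,
    "good": 0.8,
    "very good": 1.0,
}


def concept_to_triangular(term, spread=0.1):
    """Return (s1, s_m, s2) for a fuzzy word; spread is an illustrative assumption."""
    peak = FUZZY_SCALE[term]
    return (peak - spread, peak, peak + spread)
```

The resulting triple can be handled exactly like a fuzzy numerical parameter in step S33. Whether the upper support point should be capped at 1.0 is not stated in the source, so no cap is applied here.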
S36: an overall similarity measure. The correlation function calculation formula of the deterministic parameter and the fuzzy parameter is given above, the correlation function value of each dimension attribute between the examples can be obtained according to the formula, and the target example and the preliminary index set S can be obtained by combining each attribute weight obtained by the analytic hierarchy processPThe combined correlation function of the inner example is used for drawing the similarity of the two, and the calculation formula is as follows:
Figure BDA0002139106260000064
In the formula, $k(A^*)$ denotes the combined correlation function, $k(A_i)$ denotes the correlation function of attribute $i$, and $w_i$ is the corresponding weight.
S4: After a similar instance has been retrieved, reused and modified, it can be stored back into the library as a new instance. The distribution of the attribute values in each dimension of the instance library therefore keeps changing, and so does the distribution of the attribute values after the nonlinear transformation, which means that the division of the corresponding intervals is not fixed. The interval division can be carried out when the instance library is updated and maintained; it need not be repeated before each retrieval.
Compared with the prior art, the invention has the following advantages:
1. By combining fuzzy numbers with the correlation function, the concept of a fuzzy correlation function and its calculation formulas are proposed; by combining the analytic hierarchy process, a similarity measure based on combining single-dimensional correlation functions is proposed.
2. The data are transformed nonlinearly with the Sigmoid function, which effectively addresses the low index accuracy that uneven data distribution causes in the traditional D-HS index method.
3. Fusing the improved methods makes the traditional D-HS index more rigorous and accurate, avoids computing the correlation-function similarity for every instance in the full instance library, and solves the retrieval utility problem to a certain extent.
4. Instances are stored and accessed by interval, which facilitates dynamic update and maintenance of the data in the instance library.
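A minimal sketch of the combination step, assuming that formula (16) is the weighted sum of the single-dimensional correlation function values (the formula image is not legible here, but the symbol definitions above suggest this form). The function names and the candidate data layout are hypothetical.

```python
def combined_correlation(values, weights):
    """Weighted sum of per-attribute correlation values (assumed form of formula (16))."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    return sum(w * k for w, k in zip(weights, values))


def rank_instances(candidates, weights):
    """Sort instances of the preliminary index set by combined correlation, best first."""
    scored = [(name, combined_correlation(vals, weights)) for name, vals in candidates.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)
```

With weights from the analytic hierarchy process (normally summing to 1), the first entries returned by `rank_instances` are the instances most similar to the target.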
Drawings
FIG. 1 is a graphical representation of the improvement and fusion of the correlation function with the D-HS index.
Fig. 2 is a flow chart of the operation of the present invention.
FIGS. 3a to 3b are diagrams of the triangular and trapezoidal fuzzy numbers, where FIG. 3a shows a triangular fuzzy number and FIG. 3b shows a trapezoidal fuzzy number.
FIGS. 4a to 4b are diagrams of the correlation function calculation based on fuzzy numbers, where FIG. 4a shows the triangular fuzzy number correlation function and FIG. 4b shows the trapezoidal fuzzy number correlation function.
FIGS. 5a to 5e are calculation diagrams of the fuzzy numerical side distance $\rho(x, x_0, X)$, where FIG. 5a shows $x \in \mathrm{Ext}(x_0X_1)$; FIG. 5b shows $x \in x_0X_1$, $|x_0X_1| \neq 0$, $x \neq x_0$; FIG. 5c shows $x \in x_0X_2$; FIG. 5d shows $x \in \mathrm{Ext}(x_0X_2)$; and FIG. 5e shows $x = x_0$.
FIGS. 6a to 6g are calculation diagrams of the fuzzy interval value side distance $\rho(x, x_0, X)$, where FIG. 6a shows $x \in \mathrm{Ext}(x_0X_1)$; FIG. 6b shows $x \in x_1X_1$, $|x_1X_1| \neq 0$; FIG. 6c shows $x \in x_0x_1$, $|x_0x_1| \neq 0$, $x \neq x_0$; FIG. 6d shows $x \in x_0x_2$; FIG. 6e shows $x \in x_2X_2$; FIG. 6f shows $x \in \mathrm{Ext}(x_0X_2)$; and FIG. 6g shows $x = x_0$.
FIGS. 7a to 7e are calculation diagrams of the fuzzy interval value side distance $\rho(x, x_0, X_0)$, where FIG. 7a shows $x \in \mathrm{Ext}(x_0x_1)$; FIG. 7b shows $x \in x_0x_1$, $|x_0x_1| \neq 0$, $x \neq x_0$; FIG. 7c shows $x \in x_0x_2$; FIG. 7d shows $x \in \mathrm{Ext}(x_0x_2)$; and FIG. 7e shows $x = x_0$.
Fig. 8a to 8c are data comparison diagrams of original data of an example library before and after nonlinear transformation, wherein fig. 8a shows original normalized data, fig. 8b shows ln normalized data, and fig. 8c shows Sigmoid data.
Fig. 9a to 9b are graphs of fitting the fuzzy correlation function to the correlation function, wherein fig. 9a shows the fuzzy correlation function, and fig. 9b shows the correlation function.
Detailed Description
The invention is further illustrated with reference to the following figures and examples.
The invention takes 150 instances from a vacuum pump instance library as sample data. The method comprises the following steps:
S1: Obtain the relevant attribute values of the target instance from the user and obtain the corresponding weights using the analytic hierarchy process. Table 2 lists the attribute values of the target instance and the corresponding weights; for interval-type parameters, the Sigmoid data is the value corresponding to the midpoint of the interval.
Table 2 Target instance attribute values and corresponding weights
Figure BDA0002139106260000081
S2: Retrieve from the instance library the original data of the vacuum pumps of each model and the interval divisions obtained after the nonlinear transformation, forming the index tables. Tables 3, 4 and 5 are the index tables for three key attributes of the vacuum pump: pumping rate, operating power and failure rate (given the high retrieval dimension and the large number of instances, the index tables for all 9 attribute dimensions are not listed in full for reasons of space).
Table 3 Index table: pumping rate
Figure BDA0002139106260000082
Table 4 Index table: operating power
Figure BDA0002139106260000091
Table 5 Index table: failure rate
Figure BDA0002139106260000092
S3: Convert the relevant attribute values of the target instance into Sigmoid data, look them up in the index tables, and calculate the attribute matching degree of each instance in the instance library relative to the target instance using formula (2). The results are shown in Table 6.
Table 6 Attribute matching degree calculation results
Figure BDA0002139106260000093
S4: Based on the distribution of the attribute matching degrees in Table 6, take the similar instances with SA ≥ 4 to form the preliminary index set $S_P$. Using the calculation method described above, obtain the correlation function of each attribute dimension for the instances in $S_P$ and combine them with the corresponding weights to obtain the combined correlation function. Table 7 gives the single-dimensional and combined correlation function results for each instance in $S_P$.
Table 7 Single-dimensional and combined correlation function results (preliminary index set $S_P$)
Figure BDA0002139106260000101
Comparing the combined correlation function values k(A) in Table 7 shows that instances $c_{39}$ and $c_{44}$ are the most similar to the target instance and can be given priority as objects for subsequent reuse and modification. Table 8 lists the models and relevant attributes of these two instances.
Table 8 Models and relevant attributes of instances $c_{39}$ and $c_{44}$
Figure BDA0002139106260000102
S5: After the above instances have been retrieved, reused and modified, store them back into the library as new instances and update the interval division of the instance library.
The embodiments described in this specification merely illustrate forms in which the inventive concept may be implemented. The scope of protection of the invention should not be regarded as limited to the specific forms set forth in the embodiments; it also extends to equivalent technical means that those skilled in the art can conceive on the basis of the inventive concept.

Claims (1)

1. An instance retrieval method fusing a correlation function and a D-HS index, comprising the following steps:
S1: improving the D-HS indexing method based on a Sigmoid function; to address the low index accuracy caused by the non-uniform distribution of attribute values in the instance library, a nonlinear Sigmoid transformation is applied to the attribute values; without changing the overall distribution or ordering of the data, the transformation expands the dense regions and compresses the sparse regions, thereby improving how well the data can be distinguished and divided into levels; the Sigmoid function is expressed as follows:
Figure FDA0002139106250000011
wherein the coefficient β is the median of the sample data, the coefficient α = ln 9 / t, and the parameter t is the smaller of the two distances from the 90% quantile and the 10% quantile of the sample data to the median;
S11: taking the natural logarithm (base e) of the original data in the instance library to obtain the ln data, and sorting the ln data in descending order;
S12: calculating the coefficients α and β of formula (1) from the ln data, then substituting the ln data into formula (1) to obtain the Sigmoid data;
S13: dividing the index intervals according to the Sigmoid data to form an index table;
S2: obtaining the preliminary index set $S_P$; converting the relevant attribute values of the target instance into Sigmoid data, looking them up in the index table, calculating the attribute matching degree of each instance in the instance library relative to the target instance using formula (2), and rejecting the instances whose attribute matching degree is too low to form the preliminary index set $S_P$:
Figure FDA0002139106250000012
S3: calculating the correlation function of the instances in the preliminary index set $S_P$; at present, correlation functions have not been applied to the measurement of fuzzy parameters, which include the fuzzy numerical type, the fuzzy interval value type and the fuzzy conceptual type; therefore the correlation function is combined with the representation of fuzzy numbers, and the concept of a fuzzy correlation function and the related calculation formulas are proposed; in addition, a similarity measure that combines single-dimensional correlation functions is proposed using the analytic hierarchy process;
S31: the correlation function is similar to a membership function in fuzzy set theory and quantitatively describes the degree to which a thing belongs to a certain interval or has a certain property; formula (3) is the correlation function, and formulas (4) and (5) are the side distance and the bit value, respectively;
Figure FDA0002139106250000021
Figure FDA0002139106250000022
$D(x, X_0, X) = \rho(x, x_0, X) - \rho(x, x_0, X_0)$ (5)
in formula (4), $x$ and $x_0$ denote a point in space and the optimal point, respectively, and $X$ denotes an interval constraint in space; $x_1$ and $x_2$ denote the intersections of the line through $x$ and $x_0$ with the boundary of the interval $X$, where $x_1$ is the intersection closer to $x_0$; $\mathrm{Ext}(x_0x_1)$ denotes the ray along the line $x_0x_1$ with origin $x_1$, and likewise for $\mathrm{Ext}(x_0x_2)$; $x = x_0$ means that the point $x$ coincides with the optimal point $x_0$; in $-\max\{|x_0M|\}$, $M$ ranges over the vertices of the interval $X$, so $-\max\{|x_0M|\}$ is the negative of the maximum distance from the optimal point $x_0$ to the vertices of the interval;
from formula (5), the bit value $D(x, X_0, X)$ equals the difference between the side distance $\rho(x, x_0, X)$ taken over the interval $X$ and the side distance $\rho(x, x_0, X_0)$ taken over the interval $X_0$; the intervals $X$ and $X_0$ represent the feasible interval and the ideal interval, respectively, therefore
Figure FDA0002139106250000025
when the intervals $X$ and $X_0$ have no common boundary, $D(x, X_0, X)$ is always less than 0; when a common boundary exists, $D(x, X_0, X)$ is always less than or equal to 0;
S32: a fuzzy number is usually represented as a triangular fuzzy number or a trapezoidal fuzzy number; formulas (6) and (7) are the membership functions of the triangular fuzzy number and the trapezoidal fuzzy number, respectively;
Figure FDA0002139106250000023
in the formula, $s_1$ and $s_2$ are the two support points and $s_M$ is the peak point; the membership degree is 0 at the support points and outside them, and 1 at the peak point; in practice, a triangular fuzzy number is written in the following notation:
Figure FDA0002139106250000024
Figure FDA0002139106250000031
in the formula, $t_1$ and $t_2$ are the two support points; the membership degree is 0 at the support points and outside them, and is 1 when
Figure FDA0002139106250000032
holds; in practice, a trapezoidal fuzzy number is written in the following notation:
Figure FDA0002139106250000033
suppose that the m-th and n-th attributes of an instance $c_i$ are of the fuzzy numerical type and the fuzzy interval value type, respectively, expressed as about
Figure FDA0002139106250000034
and about
Figure FDA0002139106250000035
; then the m-th attribute is represented by the triangular fuzzy number
Figure FDA0002139106250000036
where:
Figure FDA0002139106250000037
similarly, the n-th attribute is represented by the trapezoidal fuzzy number
Figure FDA0002139106250000038
where:
Figure FDA0002139106250000039
based on this representation of fuzzy numbers, they are mapped by analogy onto the three elements needed to compute the correlation function, namely the optimal point, the ideal interval and the feasible interval; for the triangular fuzzy number
Figure FDA00021391062500000310
$s_M$ corresponds to the optimal point $x_0$ and the interval $[s_1, s_2]$ corresponds to the feasible interval $[X_1, X_2]$; the ideal interval $[x_1, x_2]$ is then derived from the optimal point and the feasible interval according to the user's requirements and the design standards; similarly, for the trapezoidal fuzzy number
Figure FDA00021391062500000311
the interval
Figure FDA00021391062500000312
corresponds to the ideal interval $[x_1, x_2]$ and the interval $[t_1, t_2]$ corresponds to the feasible interval $[X_1, X_2]$; the optimal point $x_0$ is obtained by combining these intervals;
S33: correlation function measure for fuzzy numerical parameters; based on the representations of the fuzzy number and the correlation function and the relation between them, the expression and the mathematical calculation formula of the fuzzy numerical side distance $\rho(x, x_0, X)$ are proposed, shown in formulas (10) and (11), respectively;
Figure FDA0002139106250000041
in the formula,
Figure FDA0002139106250000042
denotes the area of triangle $Qxx_0$, and
Figure FDA0002139106250000043
denotes the area of trapezoid $Qx_0xM$, the remaining terms being analogous;
Figure FDA0002139106250000044
denotes the larger of the areas of triangle $QX_1x_0$ and triangle $QX_2x_0$; because $X_1$ is specified to be the endpoint closer to $x_0$,
Figure FDA0002139106250000045
Figure FDA0002139106250000046
the correlation function requires the side distance with respect to both the feasible interval $X$ and the ideal interval $X_0$; for fuzzy numerical parameters the two calculations are analogous, that is, the feasible interval $X = [X_1, X_2]$ is replaced by the ideal interval $X_0 = [x_1, x_2]$, so the calculation diagrams and formulas for the side distance $\rho(x, x_0, X_0)$ are not repeated;
substituting the calculated side distances $\rho(x, x_0, X)$ and $\rho(x, x_0, X_0)$ into formulas (3) and (5) gives the correlation function value of the fuzzy numerical parameter;
S34: correlation function measure for fuzzy interval value parameters; the expression and the mathematical calculation formula of the fuzzy interval value side distance $\rho(x, x_0, X)$ are proposed, shown in formulas (12) and (13), respectively;
Figure FDA0002139106250000047
Figure FDA0002139106250000051
unlike fuzzy numerical parameters, the two side distances of a fuzzy interval-valued parameter are calculated differently; the expression of the side distance ρ(x, x0, X0) with respect to the ideal interval X0 is given in formula (14), and formula (15) is the corresponding mathematical calculation;
Figure FDA0002139106250000052
Figure FDA0002139106250000053
substituting the side distance ρ(x, x0, X) and the side distance ρ(x, x0, X0) into formula (3) and formula (5) yields the correlation function value of the fuzzy interval-valued parameter;
S35: correlation function measurement for fuzzy conceptual parameters; for fuzzy conceptual parameters, fuzzy semantics are introduced to convert them into corresponding magnitudes, namely: a certain performance is very poor: 0.2; poor: 0.4; average: 0.6; good: 0.8; very good: 1.0; the intermediate values between these grades are 0.1, 0.3, 0.5, 0.7 and 0.9; the quantized value is then treated as a fuzzy numerical parameter, that is, represented by a triangular fuzzy number: "a certain performance is very good" is converted into the fuzzy numerical parameter "about 1.0", and "a certain performance is very poor" into "about 0.2"; the correlation function value is then obtained from the corresponding formulas;
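The conversion described in S35 can be sketched as follows. This is a minimal illustration rather than the claimed procedure: the five grade values are those of S35, while the names `GRADE_VALUES`, `to_triangular` and `membership`, and the half-width `spread = 0.1` of the triangular fuzzy number, are assumptions introduced here (the claim does not state how wide "about 1.0" is).

```python
from typing import NamedTuple


class TriangularFuzzyNumber(NamedTuple):
    low: float   # left foot, membership 0
    peak: float  # membership 1
    high: float  # right foot, membership 0


# Grade values from S35; the English labels are illustrative.
GRADE_VALUES = {
    "very poor": 0.2,
    "poor": 0.4,
    "average": 0.6,
    "good": 0.8,
    "very good": 1.0,
}


def to_triangular(grade: str, spread: float = 0.1) -> TriangularFuzzyNumber:
    """Turn a linguistic grade into 'about <value>' as a triangular fuzzy number.

    `spread` is a hypothetical half-width; the result is clipped to [0, 1].
    """
    peak = GRADE_VALUES[grade]
    return TriangularFuzzyNumber(max(0.0, peak - spread), peak, min(1.0, peak + spread))


def membership(t: TriangularFuzzyNumber, x: float) -> float:
    """Standard triangular membership degree of x."""
    if x == t.peak:
        return 1.0
    if t.low < x < t.peak:
        return (x - t.low) / (t.peak - t.low)
    if t.peak < x < t.high:
        return (t.high - x) / (t.high - t.peak)
    return 0.0
```

For example, `to_triangular("very poor")` yields a fuzzy number peaked at 0.2, which would then be handled by the fuzzy numerical formulas of S33.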
S36: overall similarity measurement; the correlation function formulas for deterministic parameters and fuzzy parameters are given above; the correlation function value of each attribute dimension between instances is obtained from these formulas and, combined with the attribute weights obtained by the analytic hierarchy process, yields the combined correlation function between the target instance and each instance in the preliminary index set S_P, which characterizes the similarity of the two; the calculation formula is as follows:
Figure FDA0002139106250000061
where k(A*) denotes the combined correlation function, k(Ai) denotes the correlation function of attribute i, and wi is the corresponding weight;
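The formula image is not reproduced in this text. Judging only from the symbol definitions (a combined value, per-attribute values and weights), a weighted sum is a plausible reading, and the sketch below assumes it. The function names, the requirement that the weights sum to 1, and the ranking helper are all assumptions made for illustration.

```python
from typing import List, Mapping, Sequence, Tuple


def combined_correlation(values: Sequence[float], weights: Sequence[float]) -> float:
    """Assumed form: k(A*) = sum_i w_i * k(A_i)."""
    if len(values) != len(weights):
        raise ValueError("one weight is needed per attribute")
    # Assumption: analytic-hierarchy-process weights are normalised to sum to 1.
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights are expected to sum to 1")
    return sum(w * k for w, k in zip(weights, values))


def rank_instances(
    candidates: Mapping[str, Sequence[float]], weights: Sequence[float]
) -> List[Tuple[str, float]]:
    """Order candidate instances by combined correlation value, largest first."""
    scored = [(name, combined_correlation(v, weights)) for name, v in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

Under this reading, an instance whose per-attribute correlation values are 0.5, -0.2 and 1.0 with weights 0.5, 0.3 and 0.2 scores 0.25 - 0.06 + 0.20 = 0.39.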
S4: similar instances are retrieved, reused and modified, and are then stored back into the instance library as new instances; in this process the distribution of the attribute values in each dimension of the instance library changes continually, as does their distribution after the nonlinear transformation, so the division of the corresponding intervals is not fixed; the interval division is therefore performed when the instance library is updated and maintained, and does not need to be repeated before every retrieval.
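The design choice in S4, paying the interval-division cost at update time so that retrieval only performs a lookup, can be illustrated with a small sketch. The division rule of the D-HS index and the nonlinear transformation are not given in this section, so an equal-frequency split is used purely as a stand-in; the class `AttributeIndex` and its members are hypothetical.

```python
import bisect
from typing import List


class AttributeIndex:
    """One attribute dimension of a hypothetical instance library."""

    def __init__(self, n_intervals: int) -> None:
        self.n_intervals = n_intervals
        self.values: List[float] = []      # attribute values, kept sorted
        self.boundaries: List[float] = []  # current interval boundaries
        self.divisions = 0                 # how many times intervals were re-divided

    def add(self, value: float) -> None:
        """Store a new instance value and re-divide the intervals immediately."""
        bisect.insort(self.values, value)
        self._redivide()

    def _redivide(self) -> None:
        # Stand-in rule (assumption): equal-frequency boundaries.
        n = len(self.values)
        self.boundaries = [
            self.values[(i * n) // self.n_intervals] for i in range(1, self.n_intervals)
        ]
        self.divisions += 1

    def interval_of(self, value: float) -> int:
        """Retrieval-time lookup; uses the stored boundaries without re-dividing."""
        return bisect.bisect_right(self.boundaries, value)
```

Calling `interval_of` any number of times leaves `divisions` unchanged, which is the point of doing the division during maintenance.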
CN201910662850.0A 2019-07-22 2019-07-22 Instance retrieval method integrating correlation function and D-HS index Active CN110399540B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910662850.0A CN110399540B (en) 2019-07-22 2019-07-22 Instance retrieval method integrating correlation function and D-HS index

Publications (2)

Publication Number Publication Date
CN110399540A (en) 2019-11-01
CN110399540B (en) 2021-08-24

Family

ID=68324860

Country Status (1)

Country Link
CN (1) CN110399540B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104317838A (en) * 2014-10-10 2015-01-28 Zhejiang University (浙江大学) Cross-media Hash index method based on coupling differential dictionary
CN104462018A (en) * 2014-11-21 2015-03-25 Zhejiang University of Technology (浙江工业大学) Similar case retrieval method based on multidimensional correlation function
CN105512122A (en) * 2014-09-22 2016-04-20 Huawei Technologies Co., Ltd. (华为技术有限公司) Ordering method and ordering device for information retrieval system
CN108897791A (en) * 2018-06-11 2018-11-27 Yunnan Normal University (云南师范大学) Image search method based on deep convolution features and semantic similarity measure
CN109460423A (en) * 2018-10-18 2019-03-12 Zhejiang University of Technology (浙江工业大学) Low-carbon similar case retrieval method based on D-HS

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9529843B2 (en) * 2011-09-02 2016-12-27 Oracle International Corporation Highly portable and dynamic user interface component to specify and perform simple to complex filtering on data using natural language-like user interface

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
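Research and implementation of a similar instance retrieval method based on multidimensional correlation functions; Zhao Yanwei (赵燕伟) et al.; Mathematics in Practice and Theory (《数学的实践与认识》); 2015-10-31; full text *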


Publication Publication Date Title
Huang et al. A riemannian block coordinate descent method for computing the projection robust wasserstein distance
CN105117488B (en) A kind of distributed storage RDF data balanced division method based on hybrid hierarchy cluster
CN109783628B (en) Method for searching KSAARM by combining time window and association rule mining
Shukla et al. Analysis and evaluation of outlier detection algorithms in data streams
CN104933156A (en) Collaborative filtering method based on shared neighbor clustering
Zhang et al. Optimization and improvement of data mining algorithm based on efficient incremental kernel fuzzy clustering for large data
Xin et al. An overlapping semantic community detection algorithm base on the ARTs multiple sampling models
Liu et al. Method of Time Series Similarity Measurement Based on Dynamic Time Warping.
Liu et al. Multi-fidelity global optimization using a data-mining strategy for computationally intensive black-box problems
He et al. A Method of Identifying Thunderstorm Clouds in Satellite Cloud Image Based on Clustering.
CN114428803A (en) Operation optimization method and system for air compression station, storage medium and terminal
Li et al. Optimization of CVC shifting mode for hot strip mill based on the proposed LightGBM prediction model of roll shifting
CN106651461A (en) Film personalized recommendation method based on gray theory
Chen et al. Differential privacy histogram publishing method based on dynamic sliding window
CN110399540B (en) Instance retrieval method integrating correlation function and D-HS index
Fu et al. ICA: an incremental clustering algorithm based on OPTICS
Guo et al. Mobile user credit prediction based on lightgbm
CN113032443B (en) Method, apparatus, device and computer readable storage medium for processing data
Sokolov et al. Resource efficient data warehouse optimization
Marjit Aggregated similarity optimization in ontology alignment through multiobjective particle swarm optimization
CN114297582A (en) Modeling method of discrete counting data based on multi-probe locality sensitive Hash negative binomial regression model
Kuchuganov et al. Clustering algorithm for a set of machine parts on the basis of engineering drawings
Liu et al. A novel effective distance measure and a relevant algorithm for optimizing the initial cluster centroids of K-means
Lu et al. Main control factors affecting mechanical oil recovery efficiency in complex blocks identified using the improved k-means algorithm
Parthasarathy et al. Ensemble Learning Based Collaborative Filtering with Instance Selection and Enhanced Clustering.

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant