Detailed Description of the Embodiments
The application is described in further detail below in conjunction with the accompanying drawings.
In a typical configuration of the present application, the terminal, the devices of the service network, and the trusted party each include one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
Memory may include volatile memory in computer-readable media, random access memory (RAM), and/or non-volatile memory such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media include permanent and non-permanent, removable and non-removable media, which can store information by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic cassettes, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media (transitory media), such as modulated data signals and carrier waves.
Fig. 1 shows a flowchart of a hybrid query method, which comprises the following steps:
Step S101: determine a main dimension among multiple query dimensions, and divide multiple data sources based on the main dimension. Here, a query dimension is a filter condition of the hybrid query. For example, a hybrid query that searches a map for the k nearest restaurants containing the two keywords "chafing dish" and "dry pot" has two query dimensions: a spatial dimension (distance) and a text dimension (keywords). Other commonly used query dimensions include a time dimension, for example to further restrict the filter condition to restaurants newly opened within the last n months.
In step S101, one particular query dimension, for example the text dimension, is taken as the main dimension, and multiple data sources are then divided based on it. When the text dimension is the main dimension, the data sources can generally be divided by keyword. Continuing with the hybrid query above, the data objects are divided by the keywords "chafing dish" and "dry pot" of the text dimension into two data sources, corresponding to "chafing dish" and "dry pot" respectively. In the "chafing dish" data source every data object contains the keyword "chafing dish"; correspondingly, in the "dry pot" data source every data object contains the keyword "dry pot". A data object that contains both keywords appears in both data sources. In another feasible embodiment, the spatial dimension or the time dimension may instead be determined as the main dimension, and the data sources are then divided by distance or by time. Taking the spatial dimension as an example, the data sources can be divided by distance range, for example into 0~5 km, 5~10 km, and 10~15 km.
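The keyword-based division described above can be sketched as follows. This is only a minimal illustration, not the implementation of the application: the function name `divide_by_keyword`, the dictionary representation of data objects, and the keyword sets assigned to Q, W and X are all assumptions made for the example.

```python
from collections import defaultdict


def divide_by_keyword(objects, query_keywords):
    """Split data objects into one data source per query keyword.

    `objects` maps an object id to the set of keywords it contains
    (a hypothetical representation chosen for this sketch).
    """
    sources = defaultdict(list)
    for obj_id, keywords in objects.items():
        for keyword in query_keywords:
            if keyword in keywords:
                # An object that contains several query keywords
                # is placed in several data sources.
                sources[keyword].append(obj_id)
    return dict(sources)


# Hypothetical keyword sets, for illustration only.
restaurants = {
    "Q": {"dry pot"},
    "W": {"dry pot", "chafing dish"},
    "X": {"chafing dish"},
}
sources = divide_by_keyword(restaurants, ["chafing dish", "dry pot"])
```

With these assumed inputs, W contains both keywords and therefore lands in both data sources, while Q and X each land in one.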
In practice, the text dimension generally divides data sources by keyword. The data sources are therefore well separated from one another, and within each data source the relevance of the data objects in the main dimension (the text dimension) is the same, which facilitates the subsequent concurrent processing of the data sources. The spatial and time dimensions, by contrast, usually divide data sources by distance or time. Compared with division by keyword, the separation between the data sources is relatively poor, and the relevance of the data objects in the main dimension (the spatial or time dimension) may differ within each data source, so the subsequent concurrent processing of the data sources is relatively complex.
Therefore, as a preferred embodiment, when the query dimensions include a text dimension, the text dimension is preferentially determined as the main dimension. Specifically, step S101 is preferably: among multiple query dimensions that include a text dimension, determine the text dimension as the main dimension, and divide the main dimension into multiple data sources according to the keywords of the text dimension. This reduces the complexity of taking out, from each data source, the data object with the highest weighted relevance over all query dimensions under that data source, and thereby improves the efficiency of the hybrid query.
Step S102: take out, from each data source, the data object with the highest weighted relevance over all query dimensions under that data source, add it to a candidate set, and update the local relevance of each data object in the candidate set.
Continuing with the hybrid query scenario above, a unified data model is established for the spatial dimension and the text dimension, as shown in Table 1:
Table 1
Based on the above data model, in this hybrid query scenario the weighted relevance ζ_t(p, q) of a data object p over all query dimensions under its data source t can be expressed as:
$$\zeta_t(p, q) = \alpha \cdot \frac{\delta(p.i, q.i)}{|q.d|} + (1 - \alpha) \cdot \theta_t(p.d, q.d)$$
where α is the weight of the spatial dimension, δ(p.i, q.i) is the spatial relevance of data object p, and |q.d| is the number of data sources (that is, the number of keywords). For a given data object p, the spatial-relevance component is the same under every data source, so the component under data source t is 1/|q.d| of the actual spatial relevance. (1 − α) is the weight of the text dimension, and θ_t(p.d, q.d) is the text relevance of data object p in data source t.
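Reading the definitions above as ζ_t(p, q) = α·δ(p.i, q.i)/|q.d| + (1 − α)·θ_t(p.d, q.d), the per-source score can be computed as in the following sketch. The function name, the parameter names, and the input values are assumptions made for illustration; how δ and θ_t themselves are computed is not covered here.

```python
def weighted_relevance(alpha, spatial_rel, text_rel, num_sources):
    """Weighted relevance of one data object under one data source.

    alpha        -- weight of the spatial dimension (between 0 and 1)
    spatial_rel  -- spatial relevance delta(p.i, q.i) of the object
    text_rel     -- text relevance theta_t(p.d, q.d) in this data source
    num_sources  -- |q.d|, the number of data sources (keywords)
    """
    # Each data source carries an equal share of the spatial relevance.
    spatial_part = alpha * spatial_rel / num_sources
    text_part = (1 - alpha) * text_rel
    return spatial_part + text_part


# Hypothetical inputs, chosen only to exercise the formula.
score = weighted_relevance(alpha=0.5, spatial_rel=0.8, text_rel=0.6, num_sources=2)
```

Splitting the spatial relevance evenly across the |q.d| data sources means that summing an object's per-source scores over all sources recovers its full spatial contribution exactly once.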
For a hybrid query over arbitrary query dimensions, the formula for the weighted relevance ζ_t(p, q) of data object p over all query dimensions under its data source t only needs to be extended over the corresponding query dimensions. The extended formula is:
$$\zeta_t(p, q) = \sum_{j=1}^{m} \alpha_j \cdot R_j$$
where t is a data source of the main dimension, p is a data object, q is the hybrid query, and ζ_t(p, q) is the actual relevance, in one hybrid query q, of data object p of data source t of the main dimension under that data source. j denotes a query dimension, m is the number of query dimensions, α_j is the weight of the corresponding query dimension, and R_j is the relevance of data object p of data source t in the corresponding query dimension.
In the hybrid query scenario illustrated in this embodiment, m = 2. If j = 1 denotes the spatial dimension, then α_1 = α, R_1 = δ(p.i, q.i)/|q.d|, and α_1·R_1 = α·δ(p.i, q.i)/|q.d| is the weighted relevance of the spatial dimension under the "dry pot" data source. Correspondingly, j = 2 denotes the text dimension, so α_2 = 1 − α, R_2 = θ_t(p.d, q.d), and α_2·R_2 = (1 − α)·θ_t(p.d, q.d) is the weighted relevance of the text dimension under the "dry pot" data source.
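The extension to m query dimensions is a weighted sum, which the following sketch makes concrete. The function name and all numeric values are hypothetical; the sketch only shows that the two-dimension case is the sum with weights (α, 1 − α) and relevances (δ/|q.d|, θ_t).

```python
def weighted_relevance_general(weights, relevances):
    """Sum of alpha_j * R_j over all m query dimensions."""
    if len(weights) != len(relevances):
        raise ValueError("one weight is required per query dimension")
    return sum(a * r for a, r in zip(weights, relevances))


alpha = 0.5                        # hypothetical spatial weight
spatial_rel, text_rel = 0.8, 0.6   # hypothetical relevance values
num_sources = 2                    # |q.d| in the two-keyword example

# j = 1 is the spatial dimension, j = 2 is the text dimension.
score = weighted_relevance_general(
    [alpha, 1 - alpha],
    [spatial_rel / num_sources, text_rel],
)
```

Adding a further dimension, such as time, would under this reading only require appending one more weight and one more relevance value to the two lists.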
The value ζ_t(p, q) represents the weighted relevance of each data object in data source t. The higher the value, the more relevant the data object, so the most relevant data object can be selected in each data source. Specifically, the data objects in each data source t can be sorted by their ζ_t(p, q) values, and the data object with the largest ζ_t(p, q) value is taken out of each data source according to the sorted order. For example, suppose the first three data objects in the sorted "dry pot" data source are Q, W, Y, and the first three in the sorted "chafing dish" data source are Y, W, X. The data objects first taken out of the two data sources are then Q and Y, which are added to the candidate set. While the hybrid query is processed, taking the data object with the largest ζ_t(p, q) value out of each data source is carried out concurrently, so query performance does not suffer excessively. Moreover, because the main dimension is divided into multiple data sources, sorting the data objects within each data source is less complex than sorting directly by the relevance of all dimensions, which improves the efficiency of the hybrid query.
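The per-source sorting and the concurrent selection of the top object can be sketched as below. The rankings Q, W, Y and Y, W, X follow the example in the text, but the numeric scores are hypothetical, and the use of a thread pool is just one possible way to process the data sources concurrently, not necessarily the one intended by the application.

```python
from concurrent.futures import ThreadPoolExecutor


def rank_source(scores):
    """Sort one data source by weighted relevance, highest first."""
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical scores that reproduce the rankings of the example.
sources = {
    "dry pot": {"W": 0.7, "Y": 0.4, "Q": 0.9},
    "chafing dish": {"X": 0.3, "Y": 0.8, "W": 0.6},
}

# Each data source is ranked independently, so the work can run concurrently.
with ThreadPoolExecutor() as pool:
    ranked = dict(zip(sources, pool.map(rank_source, sources.values())))

# The head of each ranked list is the first object taken from that source.
first_round = {name: items[0][0] for name, items in ranked.items()}
```

Because each data source is sorted on its own, no cross-source coordination is needed until the selected objects are merged into the candidate set.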
After the qualifying data objects have been taken out of the data sources according to the sorted order, the local relevance of each data object in the candidate set is updated accordingly. The local relevance represents the extreme cases of a data object's weighted relevance over all query dimensions, that is, a value range with an upper bound and a lower bound.
The lower bound is the minimum possible value of the data object's weighted relevance over all query dimensions. In the hybrid query scenario of this embodiment, it corresponds to the case where a data object p contains only one of the keywords, for example "dry pot" but not "chafing dish". The data object p taken out of the "dry pot" data source then has its minimum weighted relevance over all query dimensions, and this minimum is the lower bound of the local relevance. The upper bound is the maximum possible value of the data object's weighted relevance over all query dimensions. In the hybrid query scenario of this embodiment, it corresponds to the case where a data object p contains all keywords and has the highest ζ_t(p, q) value in the data source of every keyword.
Specifically, the local relevance of each data object in the candidate set is updated as follows. For the lower bound, the sum of the data object's weighted relevance over all query dimensions under its own data source and its weighted relevance in the non-main dimensions under the remaining data sources is taken as the lower bound of the data object's local relevance. Correspondingly, for the upper bound, the sum of the data object's weighted relevance over all query dimensions under its own data source and the highest weighted relevance over all query dimensions under the remaining data sources is taken as the upper bound of the data object's local relevance.
Continuing with the hybrid query scenario above, the data objects first taken out of the two data sources are Q and Y, and the local relevance is denoted the partial score. The weighted relevance of Q and Y over all query dimensions under their respective data sources (dry pot and chafing dish) is ζ_t(Q, q) = 0.614 and ζ_t(Y, q) = 0.511. For data object Q, which was taken out of the "dry pot" data source, ζ_t(Q, q) consists of two parts: the weighted relevance of the text dimension (main dimension) under the "dry pot" data source, D11 = 0.501, and the weighted relevance of the spatial dimension (non-main dimension) under the "dry pot" data source, I12 = 0.113. Their sum is ζ_t(Q, q) = 0.614. Suppose data object Q does not contain the keyword "chafing dish". Then the weighted relevance of Q's text dimension (main dimension) under the "chafing dish" data source is D21 = 0, and the weighted relevance of its spatial dimension (non-main dimension) under the "chafing dish" data source is I22 = I12 = 0.113. The lower bound of Q's local relevance is therefore 0.727 (that is, D11 + I12 + I22). For the upper bound, the weighted relevance of Q over all query dimensions under the "chafing dish" data source cannot exceed that of data object Y, so ζ_t(Y, q) is the theoretical highest weighted relevance, and the upper bound of Q's local relevance is 1.125 (that is, 0.614 + 0.511). In this way, after data objects Q and Y are put into the candidate set, the local relevance of Q is updated to Q = [0.727, 1.125]. Similarly, if the weighted relevance of Y's spatial dimension under each data source is 0.111, the lower bound of Y's local relevance is updated to 0.511 + 0.111 and the upper bound to 0.511 + 0.614, that is, Y = [0.622, 1.125].
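The bound computation can be checked with a few lines of code. The function below is a sketch under the reading given in the text: the lower bound adds the non-main (spatial) relevance under each remaining data source, and the upper bound adds the highest weighted relevance still possible under each remaining data source. The function and parameter names are assumptions; the numbers are those quoted for Q and Y.

```python
def local_relevance_bounds(own_score, non_main_elsewhere, best_elsewhere):
    """Return the (lower, upper) bounds of an object's local relevance.

    own_score           -- weighted relevance under the source it was taken from
    non_main_elsewhere  -- its non-main relevance under each remaining source
    best_elsewhere      -- highest weighted relevance possible under each
                           remaining source
    """
    lower = own_score + sum(non_main_elsewhere)
    upper = own_score + sum(best_elsewhere)
    # Rounded to three decimals to match the precision used in the example.
    return round(lower, 3), round(upper, 3)


# Q: taken from "dry pot" (0.614); spatial part 0.113; Y leads "chafing dish" (0.511).
q_bounds = local_relevance_bounds(0.614, [0.113], [0.511])
# Y: taken from "chafing dish" (0.511); spatial part 0.111; Q leads "dry pot" (0.614).
y_bounds = local_relevance_bounds(0.511, [0.111], [0.614])
```

Here `q_bounds` evaluates to (0.727, 1.125) and `y_bounds` to (0.622, 1.125), which agrees with the intervals Q = [0.727, 1.125] and Y = [0.622, 1.125].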
Step S103: determine a query result object among the data objects of the candidate set according to the upper and lower bounds of their local relevance. In practice, the query result object usually cannot be determined directly from the data objects first taken out of the data sources. Only when the data objects first taken out of all data sources are the same data object can it be used directly as the query result object, because it has the largest weighted relevance in both data sources and is therefore necessarily the data object that satisfies the condition. For example, if the first data object in the sorted order of both the "dry pot" and the "chafing dish" data source is Q, data object Q can be taken as the query result object.
In most cases, the first data objects in the sorted order of the data sources differ, like Q and Y in the example above. Therefore, when step S103 is carried out, the lower bound of the local relevance of each data object in the candidate set is compared with the upper bounds of the local relevance of all remaining data objects. If there is a data object whose lower bound is greater than or equal to the upper bound of every remaining data object, that data object is determined as the query result object. Otherwise, processing returns to step S102: the data object with the highest weighted relevance over all query dimensions under each data source is again taken out and added to the candidate set, and the query result object is determined again.
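The stopping test of step S103 compares one object's lower bound with every other object's upper bound. A minimal sketch follows; the function name `find_result` and the dictionary layout are assumptions, while the intervals are those quoted in the worked example for the first and the third iteration.

```python
def find_result(candidates):
    """Return an object whose lower bound is >= every other object's upper
    bound, or None if no such object exists yet.

    `candidates` maps an object id to its (lower, upper) local relevance.
    """
    for obj, (lower, _) in candidates.items():
        others = (upper for other, (_, upper) in candidates.items() if other != obj)
        if all(lower >= upper for upper in others):
            return obj
    return None


round_1 = {"Q": (0.727, 1.125), "Y": (0.622, 1.125)}
round_3 = {
    "Q": (0.727, 0.864),
    "Y": (0.622, 0.922),
    "W": (0.958, 0.958),
    "Z": (0.509, 0.661),
    "X": (0.351, 0.661),
}

first = find_result(round_1)   # no object dominates yet
third = find_result(round_3)   # W dominates all remaining objects
```

For the first-round intervals the function returns `None`, so another round of step S102 is needed; for the third-round intervals it returns `"W"`.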
Continuing with the hybrid query scenario above, if the condition for determining the query result object is not met, processing returns to step S102 a second time. Because data objects Q and Y were already taken out in the previous round, the data object with the highest ζ_t(p, q) value in each of the two data sources is now W. The local relevance of all data objects in the candidate set is then updated based on the newly added data object W. For data object W, the weighted relevance in both data sources is known, so both bounds of its local relevance are determined as well. Suppose the weighted relevance of W over all query dimensions under the "dry pot" and "chafing dish" data sources is 0.454 and 0.504 respectively. The upper and lower bounds of W's local relevance are then both 0.958, that is, W = [0.958, 0.958], which means that the weighted relevance of data object W is uniquely determined.
Meanwhile, the newly taken-out data object W shows that the weighted relevance of data object Q over all query dimensions under the "chafing dish" data source cannot exceed that of W, and that the weighted relevance of data object Y over all query dimensions under the "dry pot" data source cannot exceed that of W either. The upper bounds of the local relevance of Q and Y are updated accordingly. The upper bound of Q's local relevance is updated to 0.614 + 0.504, that is, Q = [0.727, 1.118], and the upper bound of Y's local relevance is updated to 0.511 + 0.454, that is, Y = [0.622, 0.965].
At this point, the bounds of the local relevance of data objects Q, Y and W still do not satisfy the condition for determining the query result object, so processing returns to step S102 again and continues to take data objects out of the data sources. Suppose the data objects taken out of the "dry pot" and "chafing dish" data sources this time are Z and X respectively, that the ζ_t(Z, q) value of Z in the "dry pot" data source is 0.411, and that the ζ_t(X, q) value of X in the "chafing dish" data source is 0.25. In the same way as before, the upper bound of Z's local relevance is 0.411 + 0.25 and the lower bound is 0.411 + 0.098 (the weighted relevance of the spatial dimension under the "chafing dish" data source), that is, Z = [0.509, 0.661]. The upper bound of X's local relevance is 0.25 + 0.411 and the lower bound is 0.25 + 0.101 (the weighted relevance of the spatial dimension under the "dry pot" data source), that is, X = [0.351, 0.661].
Meanwhile, the upper bounds of the local relevance of data objects Q and Y are updated based on the newly taken-out data objects Z and X. The upper bound of Q's local relevance is updated to 0.614 + 0.25, that is, Q = [0.727, 0.864], and the upper bound of Y's local relevance is updated to 0.511 + 0.411, that is, Y = [0.622, 0.922]. The upper and lower bounds of W's local relevance are already uniquely determined, so they need no update and remain W = [0.958, 0.958]. Now the lower bound of W's local relevance is greater than or equal to the upper bound of the local relevance of every remaining data object (Q, X, Y and Z). The condition for determining the query result object is satisfied, so data object W is determined as the query result object. The query result object obtained by this loop iteration is necessarily the data object with the largest weighted relevance over all dimensions. The updates of the local relevance of the data objects in the candidate set over the three iterations are shown in Table 2.
Table 2
When the query result object is determined by this loop iteration, only N rounds of iterative processing are needed to obtain a query result object that satisfies the condition, and there is no need to take out all data objects in every data source for sorting and filtering. This effectively improves the efficiency of the hybrid query and shortens the query time.
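The three iterations of the worked example can be replayed with a short simulation. The sketch below is one reading of the procedure, not the implementation of the application: the names `ranked`, `spatial`, `bounds` and `run` are assumptions, the scores and spatial components are those quoted in the example, and the value 0.0 for W's spatial component is a placeholder that is never used because W is taken from both data sources in the same round.

```python
# Ranked lists per data source: (object, weighted relevance under that source).
ranked = {
    "dry pot": [("Q", 0.614), ("W", 0.454), ("Z", 0.411)],
    "chafing dish": [("Y", 0.511), ("W", 0.504), ("X", 0.25)],
}
# Weighted spatial (non-main) relevance of each object under one data source.
# The entry for W is a placeholder and not a value from the text.
spatial = {"Q": 0.113, "Y": 0.111, "Z": 0.098, "X": 0.101, "W": 0.0}


def bounds(obj, known, frontier):
    """Local relevance (lower, upper) of obj from what is known so far."""
    lower = upper = 0.0
    for source in ranked:
        if source in known[obj]:
            # Exact score: contributes equally to both bounds.
            lower += known[obj][source]
            upper += known[obj][source]
        else:
            lower += spatial[obj]      # text relevance may be zero
            upper += frontier[source]  # cannot beat the last object taken
    return round(lower, 3), round(upper, 3)


def run(rounds=3):
    known, frontier, history = {}, {}, []
    for rnd in range(rounds):
        for source, items in ranked.items():
            obj, score = items[rnd]
            known.setdefault(obj, {})[source] = score
            frontier[source] = score
        table = {obj: bounds(obj, known, frontier) for obj in known}
        history.append(table)
        for obj, (lower, _) in table.items():
            if all(lower >= up for o, (_, up) in table.items() if o != obj):
                return obj, history
    return None, history


result, history = run()
for rnd, table in enumerate(history, start=1):
    print(rnd, table)
```

Run as written, the simulation stops in the third round with W as the result, and the intervals recorded for each round match those stated in the text for Q, Y, W, Z and X.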
For a TOP-K hybrid query, K query result objects need to be obtained. Therefore, after the data object serving as the query result is determined in step S103, the method further comprises:
Step S104: take the query result object out of the candidate set and add it to a result set, and judge whether the number of query result objects in the result set has reached a preset value. If the preset value has been reached, all query result objects in the result set are taken as the result of the hybrid query. If the preset value has not been reached, the data object with the highest weighted relevance over all query dimensions under each data source is again taken out of each data source and added to the candidate set, and a query result object is determined again.
Continuing with the query scenario above, after the third iteration data object W has been determined as a query result object. W is taken out of the candidate set and added to the result set. The candidate set now contains the data objects Q, Y, Z and X, and the result set contains the data object W.
The preset value is the number of results required by the TOP-K hybrid query and is set to 5 in this embodiment. Because the result set currently holds only one query result object, the preset value has not been reached. Processing therefore returns to step S102, and one or more further iterations continue to obtain query result objects until the number of query result objects added to the result set reaches 5. All query result objects in the result set are then returned as the result of the TOP-K hybrid query.
For the TOP-K hybrid query that searches a map for the k nearest restaurants containing the two keywords "chafing dish" and "dry pot", the specific steps for processing it with the method provided by this embodiment are shown in Fig. 2 and include:
Step S201: take the text dimension as the main dimension, and divide two data sources according to the keywords.
Step S202: in each of the two data sources "dry pot" and "chafing dish", obtain the data object with the highest weighted relevance under that data source.
Step S203: take out the data objects obtained in step S202 and calculate their local relevance (partial score).
Step S204: add the data objects obtained in step S202 to the candidate set, and update the upper and lower bounds of the local relevance (partial score) of the data objects in the candidate set based on these data objects.
Step S205: judge whether the candidate set contains a data object whose local relevance lower bound is greater than or equal to the local relevance upper bound of every remaining data object. If so, continue with step S206; if not, return to step S202.
Step S206: take out of the candidate set the data object whose local relevance lower bound is greater than or equal to the local relevance upper bound of every remaining data object, return it as one of the TOP-K hybrid query results, and add it to the result set.
Step S207: judge whether the number of data objects in the result set equals the preset value K. If so, return the final result of the TOP-K hybrid query and end this query; if not, return to step S202.
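Steps S202 to S207 can be put together as a loop. The following is a hedged sketch, not the method as claimed: the function name `top_k_hybrid` and its data layout are assumptions, the toy data reuses the scores quoted in the worked example, and the entry 0.0 for W is a placeholder that is never read. One further assumption goes beyond the text: once a data source has been read to its end, an object never seen in it is taken not to contain its keyword, so only the non-main relevance is added to its upper bound.

```python
def top_k_hybrid(ranked, non_main, k):
    """Return up to k result objects, following steps S202 to S207.

    ranked   -- {source: [(obj, score), ...]}, each list sorted descending
    non_main -- {obj: weighted non-main relevance under one data source}
    """
    cursors = {s: 0 for s in ranked}   # next unread position per source
    frontier = {}                      # last score taken from each source
    known = {}                         # candidate -> {source: exact score}
    results, returned = [], set()

    def bounds(obj):
        lower = upper = 0.0
        for s in ranked:
            if s in known[obj]:
                lower += known[obj][s]
                upper += known[obj][s]
            else:
                lower += non_main[obj]
                # Assumption: an exhausted source cannot contain obj.
                exhausted = cursors[s] >= len(ranked[s])
                upper += non_main[obj] if exhausted else frontier[s]
        return lower, upper

    while len(results) < k:
        # S202 to S204: take the next-best object from every source.
        for s, items in ranked.items():
            if cursors[s] < len(items):
                obj, score = items[cursors[s]]
                cursors[s] += 1
                frontier[s] = score
                if obj not in returned:
                    known.setdefault(obj, {})[s] = score
        if not known:
            break  # no candidates left to return
        # S205 and S206: look for a candidate that dominates all others.
        table = {obj: bounds(obj) for obj in known}
        for obj, (lower, _) in table.items():
            if all(lower >= up for o, (_, up) in table.items() if o != obj):
                results.append(obj)
                returned.add(obj)
                del known[obj]
                break
    return results  # S207: the loop ends once k results are collected


ranked = {
    "dry pot": [("Q", 0.614), ("W", 0.454), ("Z", 0.411)],
    "chafing dish": [("Y", 0.511), ("W", 0.504), ("X", 0.25)],
}
non_main = {"Q": 0.113, "Y": 0.111, "Z": 0.098, "X": 0.101, "W": 0.0}
top_two = top_k_hybrid(ranked, non_main, 2)
```

Like step S205, the sketch compares a candidate only with the other objects currently in the candidate set. On this three-item toy data the first result is W; the order of any later results depends on the exhausted-source assumption and is not stated in the text.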
In summary, the solution of the embodiments of the present application provides a new hybrid query approach. After a main dimension is determined among multiple query dimensions, data sources are divided based on the main dimension, and the data sources are processed concurrently to obtain the corresponding candidate objects. The concept of local relevance is introduced: by comparing the local relevance of the data objects in the candidate set, objects that satisfy the hybrid query request can be returned early, without sorting and filtering all data objects that satisfy the query conditions by the relevance of all dimensions. This improves the efficiency of the hybrid query and shortens the query time.
According to another aspect of the present application, a hybrid query device is also provided. The structure of the device is shown in Fig. 3 and includes a data dividing device 310, a candidate updating device 320, and an object determining device 330. Specifically, the data dividing device 310 is configured to determine a main dimension among multiple query dimensions and to divide multiple data sources based on the main dimension. The candidate updating device 320 is configured to take out, from each data source, the data object with the highest weighted relevance over all query dimensions under that data source, add it to a candidate set, and update the local relevance of each data object in the candidate set. The object determining device 330 is configured to determine a query result object among the data objects of the candidate set according to the upper and lower bounds of their local relevance, where the upper and lower bounds of the local relevance are the maximum and minimum values of the data object's weighted relevance over all query dimensions.
Here, those skilled in the art will understand that the hybrid query device may include, but is not limited to, a user device, a network device, or a device formed by integrating a user device and a network device over a network. The user device includes, but is not limited to, implementations such as a personal computer or a touch terminal. The network device includes, but is not limited to, implementations such as a network host, a single network server, a set of multiple network servers, or a cloud-computing-based set of computers. Here, the cloud consists of a large number of hosts or network servers based on cloud computing, where cloud computing is a kind of distributed computing: a virtual computer composed of a group of loosely coupled computers.
Here, a query dimension is a filter condition of the hybrid query. For example, a hybrid query that searches a map for the k nearest restaurants containing the two keywords "chafing dish" and "dry pot" has two query dimensions: a spatial dimension (distance) and a text dimension (keywords). Other commonly used query dimensions include a time dimension, for example to further restrict the filter condition to restaurants newly opened within the last n months.
In the data dividing device 310, one particular query dimension, for example the text dimension, is taken as the main dimension, and multiple data sources are then divided based on it. When the text dimension is the main dimension, the data sources can generally be divided by keyword. Continuing with the hybrid query above, the data objects are divided by the keywords "chafing dish" and "dry pot" of the text dimension into two data sources, corresponding to "chafing dish" and "dry pot" respectively. In the "chafing dish" data source every data object contains the keyword "chafing dish"; correspondingly, in the "dry pot" data source every data object contains the keyword "dry pot". A data object that contains both keywords appears in both data sources. In another feasible embodiment, the spatial dimension or the time dimension may instead be determined as the main dimension, and the data sources are then divided by distance or by time. Taking the spatial dimension as an example, the data sources can be divided by distance range, for example into 0~5 km, 5~10 km, and 10~15 km.
In practice, the text dimension generally divides data sources by keyword. The data sources are therefore well separated from one another, and within each data source the relevance of the data objects in the main dimension (the text dimension) is the same, which facilitates the subsequent concurrent processing of the data sources. The spatial and time dimensions, by contrast, usually divide data sources by distance or time. Compared with division by keyword, the separation between the data sources is relatively poor, and the relevance of the data objects in the main dimension (the spatial or time dimension) may differ within each data source, so the subsequent concurrent processing of the data sources is relatively complex.
Therefore, as a preferred embodiment, when the query dimensions include a text dimension, the data dividing device preferentially determines the text dimension as the main dimension. Specifically, the data dividing device 310 is configured to determine the text dimension as the main dimension among multiple query dimensions that include a text dimension, and to divide the main dimension into multiple data sources according to the keywords of the text dimension. This reduces the complexity of taking out, from each data source, the data object with the highest weighted relevance over all query dimensions under that data source, and thereby improves the efficiency of the hybrid query.
Continuing with the hybrid query scenario above, when the candidate updating device processes the data sources, a unified data model can be established for the spatial dimension and the text dimension, as shown in Table 1.
Based on the data model shown in Table 1, in this hybrid query scenario the weighted relevance ζ_t(p, q) of a data object p over all query dimensions under its data source t can be expressed as:
$$\zeta_t(p, q) = \alpha \cdot \frac{\delta(p.i, q.i)}{|q.d|} + (1 - \alpha) \cdot \theta_t(p.d, q.d)$$
where α is the weight of the spatial dimension, δ(p.i, q.i) is the spatial relevance of data object p, and |q.d| is the number of data sources (that is, the number of keywords). For a given data object p, the spatial-relevance component is the same under every data source, so the component under data source t is 1/|q.d| of the actual spatial relevance. (1 − α) is the weight of the text dimension, and θ_t(p.d, q.d) is the text relevance of data object p in data source t.
For a hybrid query over arbitrary query dimensions, the formula for the weighted relevance ζ_t(p, q) of data object p over all query dimensions under its data source t only needs to be extended over the corresponding query dimensions. The extended formula is:
$$\zeta_t(p, q) = \sum_{j=1}^{m} \alpha_j \cdot R_j$$
where t is a data source of the main dimension, p is a data object, q is the hybrid query, and ζ_t(p, q) is the actual relevance, in one hybrid query q, of data object p of data source t of the main dimension under that data source. j denotes a query dimension, m is the number of query dimensions, α_j is the weight of the corresponding query dimension, and R_j is the relevance of data object p of data source t in the corresponding query dimension.
In the hybrid query scenario illustrated in this embodiment, m = 2. If j = 1 denotes the spatial dimension, then α_1 = α, R_1 = δ(p.i, q.i)/|q.d|, and α_1·R_1 = α·δ(p.i, q.i)/|q.d| is the weighted relevance of the spatial dimension under the "dry pot" data source. Correspondingly, j = 2 denotes the text dimension, so α_2 = 1 − α, R_2 = θ_t(p.d, q.d), and α_2·R_2 = (1 − α)·θ_t(p.d, q.d) is the weighted relevance of the text dimension under the "dry pot" data source.
The value of ζ_t(p, q) represents the weighted relevance of each data object in data source t: the higher the value, the more relevant the data object, so the most relevant data object can be selected in each data source. Specifically, the data objects in each data source t can be sorted by their ζ_t(p, q) values, and the data object with the largest ζ_t(p, q) value is taken out of each data source according to the sorting result. For example, the top three data objects in the ranking of the "dry pot" data source are Q, W and Y, and the top three in the ranking of the "chafing dish" data source are Y, W and X; the data objects taken out of the two data sources in the first round are therefore Q and Y, which are then added to the candidate set. While a mixed query is processed, taking the data object with the largest ζ_t(p, q) value out of each data source is carried out concurrently, so query performance does not suffer excessively. Moreover, because the main dimension is divided into multiple data sources, sorting the data objects within each data source is less complex than sorting all data objects directly by the relevance of all dimensions, which improves the efficiency of the mixed query.
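The concurrent first extraction can be sketched as follows. This is a minimal illustration rather than the implementation of this embodiment: the function names are hypothetical, a thread pool stands in for whatever concurrency mechanism is actually used, and the scores are the ζ_t values quoted for Q, Y, W and X in the worked example of this embodiment.

```python
from concurrent.futures import ThreadPoolExecutor


def best_in_source(scores):
    """Return (object, score) with the highest weighted relevance in one source."""
    return max(scores.items(), key=lambda item: item[1])


def take_first_candidates(sources):
    """Query every data source concurrently and collect its top-ranked object."""
    with ThreadPoolExecutor() as pool:
        picks = pool.map(best_in_source, sources.values())
        return dict(zip(sources.keys(), picks))


sources = {
    "dry pot": {"Q": 0.614, "W": 0.454},
    "chafing dish": {"Y": 0.511, "W": 0.504, "X": 0.25},
}
first_picks = take_first_candidates(sources)
candidate_set = {obj for obj, _ in first_picks.values()}
print(first_picks, candidate_set)
```

Each source only ranks its own objects, which is why the per-source work stays smaller than one global sort over all dimensions.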
After the qualifying data objects have been taken out of the data sources according to the ranking results, the local relevance of each data object in the candidate set is updated accordingly. The local relevance represents the extreme cases of the weighted relevance of a data object under all query dimensions, that is, a value range comprising an upper bound and a lower bound.
The lower bound is the minimum value that the weighted relevance of the data object under all query dimensions can take. In the mixed-query scenario of this embodiment, it corresponds to the case in which a data object p contains only one of the keywords, for example "dry pot" but not "chafing dish". The data object p taken out of the "dry pot" data source then has the minimum weighted relevance under all query dimensions, and this minimum is the lower bound of its local relevance. The upper bound is the maximum value that the weighted relevance of the data object under all query dimensions can take. In the mixed-query scenario of this embodiment, it corresponds to the case in which a data object p contains all keywords and has the highest ζ_t(p, q) value in the data source of every keyword.
Specifically, the candidate updating device updates the local relevance of each data object in the candidate set as follows. For the lower bound, the sum of the weighted relevance of all query dimensions of the data object under its own data source and the weighted relevance of the non-main dimensions under the remaining data sources is taken as the lower bound of the local relevance of the data object. Correspondingly, for the upper bound, the sum of the weighted relevance of all query dimensions of the data object under its own data source and the highest weighted relevance of all query dimensions under the remaining data sources is taken as the upper bound of the local relevance of the data object.
Continuing the foregoing mixed-query scenario, the data objects taken out of the two data sources in the first round are Q and Y, and the local relevance is denoted partial score. The weighted relevance of all query dimensions of Q and Y under their respective data sources (dry pot and chafing dish) is ζ_t(Q, q) = 0.614 and ζ_t(Y, q) = 0.511. For data object Q, taken out of the "dry pot" data source, the value ζ_t(Q, q) consists of two parts: the weighted relevance of the text dimension (main dimension) under the "dry pot" data source, D11 = 0.501, and the weighted relevance of the spatial dimension (non-main dimension) under the "dry pot" data source, I12 = 0.113; their sum is ζ_t(Q, q) = 0.614. Assume that data object Q does not contain the keyword "chafing dish". The weighted relevance of the text dimension (main dimension) of Q under the "chafing dish" data source is then D21 = 0, and the weighted relevance of the spatial dimension (non-main dimension) under the "chafing dish" data source is I22 = I12 = 0.113. The lower bound of the local relevance of Q is therefore 0.727 (that is, D11 + I12 + I22). For the upper bound, the weighted relevance of all query dimensions of Q under the "chafing dish" data source cannot exceed that of data object Y, so ζ_t(Y, q) is the theoretically highest weighted relevance, and the upper bound of the local relevance of Q is 1.125 (that is, 0.614 + 0.511). In this way, for the data objects Q and Y just placed in the candidate set, the local relevance of Q is updated to Q = [0.727, 1.125]. Similarly, if the weighted relevance of the spatial dimension of data object Y under each data source is 0.111, the lower bound of the local relevance of Y is updated to 0.511 + 0.111 and the upper bound to 0.511 + 0.614, that is, Y = [0.622, 1.125].
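The first-round bounds for Q and Y can be reproduced with two small helper functions. The function names are hypothetical; the numbers are the ones given in the example above.

```python
def lower_bound(own_score, other_non_main_components):
    """Own-source score plus the non-main (spatial) component under every other source."""
    return own_score + sum(other_non_main_components)


def upper_bound(own_score, other_best_scores):
    """Own-source score plus the highest score still possible under every other source."""
    return own_score + sum(other_best_scores)


# Q comes from "dry pot"; the best score seen so far in "chafing dish" is Y's 0.511.
q_bounds = (round(lower_bound(0.614, [0.113]), 3), round(upper_bound(0.614, [0.511]), 3))
# Y comes from "chafing dish"; the best score seen so far in "dry pot" is Q's 0.614.
y_bounds = (round(lower_bound(0.511, [0.111]), 3), round(upper_bound(0.511, [0.614]), 3))
print(q_bounds, y_bounds)
```

Both helpers accept a list so that the same sketch covers queries with more than two data sources.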
When the object determining device 330 determines a query result object among the data objects of the candidate set according to the upper and lower bounds of their local relevance, the query result object usually cannot be determined directly from the data objects taken out of each data source in the first round. Only when the data objects taken out of every data source in the first round are the same data object can that object be used directly as the query result object: because it has the maximum weighted relevance in both data sources, it necessarily satisfies the condition. For example, if the first data object in the ranking results of both the "dry pot" and "chafing dish" data sources is Q, data object Q can be taken as the query result object.
In most cases the top-ranked data objects differ between data sources, such as Q and Y in the preceding example. The object determining device 330 is therefore specifically configured to: when the lower bound of the local relevance of one data object is greater than or equal to the upper bound of the local relevance of every remaining data object, determine that data object as the query result object; otherwise, again take out of each data source the data object with the highest weighted relevance of all query dimensions under that data source, add it to the candidate set, and determine the query result object.
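The decision rule of the object determining device 330 can be sketched as a single function over the candidate intervals. The function name is hypothetical, and the score 1.0 in the second call is an assumed value used only to illustrate the case where both data sources rank the same object first.

```python
def find_query_result(bounds):
    """Return the object whose lower bound is >= every other upper bound, else None.

    bounds maps each candidate object to its (lower, upper) local relevance.
    """
    for obj, (lower, _) in bounds.items():
        others = (upper for other, (_, upper) in bounds.items() if other != obj)
        if all(lower >= upper for upper in others):
            return obj
    return None


# First round of the example: neither Q nor Y dominates, so extraction continues.
first_round = find_query_result({"Q": (0.727, 1.125), "Y": (0.622, 1.125)})
# Both sources rank Q first: Q is the only candidate and its bounds coincide.
same_first = find_query_result({"Q": (1.0, 1.0)})
print(first_round, same_first)
```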
Continuing the foregoing mixed-query scenario, if the condition for determining the query result object is not met, the candidate updating device performs the processing again. Because data objects Q and Y were already taken out in the previous round, the data object with the highest ζ_t(p, q) value in each of the two data sources is now W. Based on the newly added data object W, the local relevance of all data objects in the candidate set is updated. For data object W, its weighted relevance in both data sources is known, so both bounds of its local relevance are determined as well. Assume that the weighted relevance of all query dimensions of W under the "dry pot" and "chafing dish" data sources is 0.454 and 0.504 respectively. The upper and lower bounds of the local relevance of W are then both 0.958, that is, W = [0.958, 0.958], which means that the weighted relevance of data object W has been uniquely determined.
At the same time, the newly taken data object W shows that the weighted relevance of all query dimensions of data object Q under the "chafing dish" data source cannot exceed that of W, and that the weighted relevance of all query dimensions of data object Y under the "dry pot" data source cannot exceed that of W either. The upper bounds of the local relevance of Q and Y are updated accordingly. The upper bound of the local relevance of Q is updated to 0.614 + 0.504, that is, Q = [0.727, 1.118], and the upper bound of the local relevance of Y is updated to 0.511 + 0.454, that is, Y = [0.622, 0.965].
At this point the bounds of the local relevance of data objects Q, Y and W still do not meet the condition for determining the query result object, so the candidate updating device must continue to take data objects out of the data sources. Suppose the data objects taken out of the "dry pot" and "chafing dish" data sources this time are Z and X respectively, that the ζ_t(Z, q) value of Z in the "dry pot" data source is 0.411, and that the ζ_t(X, q) value of X in the "chafing dish" data source is 0.25. In the manner described above, the upper bound of the local relevance of Z is 0.411 + 0.25 and the lower bound is 0.411 + 0.098 (the weighted relevance of the spatial dimension under the "chafing dish" data source), that is, Z = [0.509, 0.661]. The upper bound of the local relevance of X is 0.25 + 0.411 and the lower bound is 0.25 + 0.101 (the weighted relevance of the spatial dimension under the "dry pot" data source), that is, X = [0.351, 0.661].
At the same time, the upper bounds of the local relevance of data objects Q and Y are updated based on the newly taken data objects Z and X. The upper bound of the local relevance of Q is updated to 0.614 + 0.25, that is, Q = [0.727, 0.864], and the upper bound of the local relevance of Y is updated to 0.511 + 0.411, that is, Y = [0.622, 0.922]. The upper and lower bounds of the local relevance of W have already been uniquely determined, so W = [0.958, 0.958] remains unchanged. The lower bound of the local relevance of W is now greater than or equal to the upper bound of the local relevance of every remaining data object (Q, X, Y and Z), which meets the condition for determining the query result object, so data object W is determined as the query result object. A query result object obtained through this loop iteration is necessarily the data object with the largest weighted relevance over all dimensions. The updates to the local relevance of the data objects in the candidate set during the three iterations above are shown in Table 2.
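The three iterations above can be replayed with a short Python sketch. It is an illustration under stated assumptions, not the implementation of this embodiment: the function names are hypothetical, the ranked lists contain only the objects taken out during the three rounds, and an exhausted data source is assumed to contribute only the spatial component to any object not found in it.

```python
def local_bounds(known, spatial, thresholds):
    """Return (lower, upper) of one candidate.

    known      -- scores of the object in the sources it was taken from
    spatial    -- its weighted spatial (non-main) component, equal in every source
    thresholds -- score most recently taken from each source, None if exhausted
    """
    lower = upper = sum(known.values())
    for source, threshold in thresholds.items():
        if source not in known:
            lower += spatial
            upper += spatial if threshold is None else threshold
    return lower, upper


def find_top_object(sources, spatial):
    """Take one object per source per round until one candidate dominates."""
    known, thresholds, trace = {}, {}, []
    depth = max((len(ranked) for ranked in sources.values()), default=0)
    for position in range(depth + 1):
        for source, ranked in sources.items():
            if position < len(ranked):
                obj, score = ranked[position]
                known.setdefault(obj, {})[source] = score
                thresholds[source] = score
            else:
                thresholds[source] = None
        bounds = {obj: local_bounds(scores, spatial.get(obj, 0.0), thresholds)
                  for obj, scores in known.items()}
        trace.append({obj: (round(lo, 3), round(hi, 3))
                      for obj, (lo, hi) in bounds.items()})
        for obj, (lower, _) in bounds.items():
            if all(lower >= upper
                   for other, (_, upper) in bounds.items() if other != obj):
                return obj, trace
    return None, trace


sources = {
    "dry pot": [("Q", 0.614), ("W", 0.454), ("Z", 0.411)],
    "chafing dish": [("Y", 0.511), ("W", 0.504), ("X", 0.25)],
}
spatial = {"Q": 0.113, "Y": 0.111, "Z": 0.098, "X": 0.101}
result, trace = find_top_object(sources, spatial)
for round_number, bounds in enumerate(trace, start=1):
    print(round_number, bounds)
print("query result object:", result)
```

The printed trace lists, round by round, the same intervals that the walkthrough derives for Q, Y, W, Z and X.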
When the query result object is determined through the loop iteration described above, only N rounds of iterative processing are needed to obtain a query result object that satisfies the condition, without taking out and sorting all data objects in every data source. The efficiency of the mixed query is thereby effectively improved and the query time is shortened.
A TOP-K mixed query needs to obtain K query result objects, so the embodiment of the present application further provides a preferred mixed-query apparatus whose structure is shown in Fig. 4. In addition to the data dividing device 310, the candidate updating device 320 and the object determining device 330 of Fig. 3, it includes a result returning device 340. After a data object has been determined as a query result, the result returning device 340 takes the query result object out of the candidate set, adds it to the result set, and judges whether the number of query result objects in the result set has reached a preset value. When the preset value is reached, all query result objects in the result set are taken as the result of the mixed query. When the preset value has not been reached, the data object with the highest weighted relevance of all query dimensions under each data source is again taken out of that data source, added to the candidate set, and the query result object is determined.
Continuing the foregoing query scenario, after the third iteration data object W has been determined as a query result object. W is taken out of the candidate set and added to the result set. The candidate set now contains data objects Q, Y, Z and X, and the result set contains data object W.
The preset value is the number of results required by the TOP-K mixed query and is set to 5 in this embodiment. The result set currently contains only one query result object, so the preset value has not been reached. The candidate updating device 320 therefore continues processing, and the object determining device 330 continues to obtain query result objects through one or more further iterations until the number of query result objects added to the result set reaches 5. The result returning device 340 then returns all query result objects in the result set as the result of the TOP-K mixed query.
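The bookkeeping performed by the result returning device 340 can be sketched as follows. The function name and the list-based representation of the candidate set and result set are assumptions made for illustration; the objects and the preset value of 5 come from the example above.

```python
def return_result(candidate_set, result_set, obj, preset_value):
    """Move a determined query result object into the result set.

    Returns True once the result set holds preset_value objects, meaning the
    TOP-K mixed query can return; False means another iteration is needed.
    """
    candidate_set.remove(obj)
    result_set.append(obj)
    return len(result_set) >= preset_value


candidates = ["Q", "Y", "W", "Z", "X"]
results = []
done = return_result(candidates, results, "W", preset_value=5)
print(candidates, results, done)
```

A False return value corresponds to handing control back to the candidate updating device 320 for the next round.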
In summary, the scheme of the embodiment of the present application provides a new way of performing mixed queries. After the main dimension is determined among multiple query dimensions, data sources are divided based on the main dimension, and the data sources are processed concurrently to obtain the corresponding candidate objects. The concept of local relevance is introduced: by comparing the local relevance of the data objects in the candidate set, objects that satisfy the mixed-query requirement can be returned early, without screening and sorting all data objects that meet the query conditions by the relevance of all dimensions. The efficiency of the mixed query is thereby improved and the query time is shortened.
It should be noted that the present application can be implemented in software and/or in a combination of software and hardware, for example using an application-specific integrated circuit (ASIC), a general-purpose computer, or any other similar hardware device. In one embodiment, the software program of the present application can be executed by a processor to implement the steps or functions described above. Likewise, the software program of the present application (including related data structures) can be stored in a computer-readable recording medium, for example RAM, a magnetic or optical drive, a floppy disk, or a similar device. In addition, some steps or functions of the present application can be implemented in hardware, for example as circuitry that cooperates with a processor to perform each step or function.
In addition, part of the present application can be applied as a computer program product, for example computer program instructions which, when executed by a computer, invoke or provide the method and/or technical solution according to the present application through the operation of that computer. The program instructions that invoke the method of the present application may be stored in a fixed or removable recording medium, and/or transmitted via a data stream in a broadcast or other signal-bearing medium, and/or stored in the working memory of a computer device that runs according to the program instructions. Here, one embodiment of the present application includes a device comprising a memory for storing computer program instructions and a processor for executing the program instructions, wherein, when the computer program instructions are executed by the processor, the device is triggered to run the methods and/or technical solutions based on the foregoing embodiments of the present application.
It is obvious to a person skilled in the art that the present application is not limited to the details of the exemplary embodiments above, and that it can be implemented in other specific forms without departing from its spirit or essential characteristics. The embodiments should therefore be regarded in every respect as exemplary and non-restrictive. The scope of the present application is defined by the appended claims rather than by the description above, and all changes that fall within the meaning and range of equivalency of the claims are intended to be embraced in the present application. Any reference sign in a claim should not be regarded as limiting the claim concerned. Furthermore, the word "comprising" does not exclude other units or steps, and the singular does not exclude the plural. Multiple units or devices recited in a device claim may also be implemented by a single unit or device through software or hardware. Words such as "first" and "second" are used to denote names and do not denote any particular order.