CN108304585A - Result data selection method and related apparatus based on spatial keyword search - Google Patents
- Publication number
- CN108304585A (application number CN201810184309.9A)
- Authority
- CN
- China
- Prior art keywords
- text object
- candidate
- candidate spatial
- diversity
- topics
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/953—Querying, e.g. by the use of web search engines
- G06F16/9537—Spatial or temporal dependent retrieval, e.g. spatiotemporal queries
Abstract
This application discloses a result data selection method based on spatial keyword search. The diversity of spatial text objects is first measured by a diversity topic count; the boundary cost of each candidate spatial text object is then determined from its distance coefficient and diversity topic count, and the candidate spatial text object with the smallest boundary cost is added to the result set. The objects in the result set therefore stay close to the query object while keeping a high diversity topic count. In other words, selection based on the distance coefficient also takes the diversity of each search result into account, which improves the diversity of the result set and meets users' need for diversified search. This application also discloses a result data selection apparatus, a server, and a computer-readable storage medium based on spatial keyword search, which have the same beneficial effects.
Description
Technical field
This application relates to the field of computer technology, and in particular to a result data selection method based on spatial keyword search, a result data selection apparatus, a server, and a computer-readable storage medium.
Background
With the emergence of location-based service technology, more and more applications associate real-world phenomena with spatial locations. This has given rise to the widely used spatial keyword query, a hybrid query that combines a spatial query with a text query to find the best result.
A spatial keyword query usually consists of three steps: formally measuring the spatial text objects and related data, building an index structure over all spatial text objects, and executing the received query keywords against that index. A unified metric model exists for the formal measurement of spatial text objects, and it allows spatial text object queries to be carried out efficiently. On this basis, further spatial keyword query operations can be performed to obtain query results relevant to the query keywords.
However, with current spatial keyword query methods, the returned spatial text objects are generally highly similar to the query keywords, while no requirement is placed on the relationship among the points of interest in the result set. The returned points are therefore often very similar to one another and cannot meet users' need for diversity. For example, a user may want to find as many categories of nearby restaurants as possible in order to choose among different types, but the search engine may return only nearby restaurants of a single type, which does not help the user choose.
Therefore, how to improve the diversity of the results of spatial keyword search and meet users' need for diversity is a key concern for those skilled in the art.
Summary of the invention
The purpose of this application is to provide a result data selection method based on spatial keyword search, a result data selection apparatus, a server, and a computer-readable storage medium. The boundary cost of each candidate spatial text object is determined from its distance coefficient and diversity topic count, and the candidate spatial text object with the smallest boundary cost is added to the result set, so that the objects in the result set stay close to the query object while keeping a high diversity topic count. In other words, selection based on the distance coefficient also takes the diversity of each search result into account, which improves the diversity of the result set and meets users' need for diversified search.
To solve the above technical problem, this application provides a result data selection method based on spatial keyword search, comprising:
building an index structure over a plurality of spatial text objects to obtain an index structure;
selecting, according to an obtained query object, a plurality of candidate spatial text objects using the index structure to obtain a candidate set;
determining the distance between each candidate spatial text object and the query object to obtain the distance coefficient of each candidate spatial text object relative to the query object;
determining the number of topics that each candidate spatial text object contains beyond all initialized topics to obtain the first diversity topic count of each candidate spatial text object;
determining the first boundary cost of each candidate spatial text object from all the distance coefficients and all the first diversity topic counts, wherein the first boundary cost is directly proportional to the distance coefficient and inversely proportional to the first diversity topic count;
adding the candidate spatial text object with the smallest first boundary cost to a result set.
Optionally, the method further comprises:
when the candidate spatial text object with the smallest first boundary cost has been added to the result set, determining the number of topics that each candidate spatial text object contains beyond all topics covered after that addition, to obtain the second diversity topic count of each candidate spatial text object;
determining the second boundary cost of the corresponding candidate spatial text object from all the distance coefficients and all the second diversity topic counts, wherein the second boundary cost is directly proportional to the distance coefficient and inversely proportional to the second diversity topic count;
adding the candidate spatial text object with the smallest second boundary cost to the result set.
Optionally, building an index structure over a plurality of spatial text objects to obtain an index structure comprises:
determining the keyword occurrence count of each spatial text object;
organizing the spatial text objects whose keyword occurrence count is less than a preset count into block structures, to obtain a plurality of block structures;
organizing the spatial text objects whose keyword occurrence count is greater than or equal to the preset count into tree structures, to obtain a plurality of tree structures;
using all the block structures and all the tree structures together as the index structure.
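The block/tree split described above can be illustrated with a small sketch. It rests on assumptions that are not in the source: the occurrence count is read as a per-keyword count over all objects, a plain list stands in for a block structure, and a sorted list stands in for a tree structure (a real system would likely use a spatial tree). The names `build_index` and `min_count` are hypothetical.

```python
from collections import defaultdict

def build_index(objects, min_count=3):
    """objects: dict mapping object id -> set of keywords.

    Returns (blocks, trees). Keywords occurring fewer than min_count times go
    to `blocks`; the others go to `trees`. Both containers are stand-ins
    chosen for illustration only.
    """
    postings = defaultdict(list)
    for obj_id, terms in objects.items():
        for term in terms:
            postings[term].append(obj_id)

    blocks, trees = {}, {}
    for term, ids in postings.items():
        if len(ids) < min_count:
            blocks[term] = ids            # rare keyword: small flat block
        else:
            trees[term] = sorted(ids)     # frequent keyword: tree placeholder
    return blocks, trees
```

Keeping rare keywords in flat blocks avoids the overhead of maintaining a tree for a handful of objects, which matches the stated aim of reducing index maintenance cost.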
Optionally, selecting, according to an obtained query object, a plurality of candidate spatial text objects using the index structure to obtain a candidate set comprises:
selecting, according to the obtained query object and using the index structure, a plurality of candidate spatial text objects from all the spatial text objects by a greedy algorithm, to obtain the candidate set.
This application also provides a result data selection apparatus based on spatial keyword search, comprising:
an index building module, configured to build an index structure over a plurality of spatial text objects to obtain an index structure;
a candidate set acquisition module, configured to select, according to an obtained query object, a plurality of candidate spatial text objects using the index structure to obtain a candidate set;
a distance coefficient acquisition module, configured to determine the distance between each candidate spatial text object and the query object to obtain the distance coefficient of each candidate spatial text object relative to the query object;
a first diversity topic count acquisition module, configured to determine the number of topics that each candidate spatial text object contains beyond all initialized topics to obtain the first diversity topic count of each candidate spatial text object;
a first boundary cost acquisition module, configured to determine the first boundary cost of each candidate spatial text object from all the distance coefficients and all the first diversity topic counts, wherein the first boundary cost is directly proportional to the distance coefficient and inversely proportional to the first diversity topic count;
a first result data selection module, configured to add the candidate spatial text object with the smallest first boundary cost to a result set.
Optionally, the apparatus further comprises:
a second diversity topic count acquisition module, configured to determine, when the candidate spatial text object with the smallest first boundary cost has been added to the result set, the number of topics that each candidate spatial text object contains beyond all topics covered after that addition, to obtain the second diversity topic count of each candidate spatial text object;
a second boundary cost acquisition module, configured to determine the second boundary cost of the corresponding candidate spatial text object from all the distance coefficients and all the second diversity topic counts, wherein the second boundary cost is directly proportional to the distance coefficient and inversely proportional to the second diversity topic count;
a second result data selection module, configured to add the candidate spatial text object with the smallest second boundary cost to the result set.
Optionally, the index building module comprises:
a keyword occurrence count acquisition unit, configured to determine the keyword occurrence count of each spatial text object;
a block structure acquisition unit, configured to organize the spatial text objects whose keyword occurrence count is less than a preset count into block structures, to obtain a plurality of block structures;
a tree structure acquisition unit, configured to organize the spatial text objects whose keyword occurrence count is greater than or equal to the preset count into tree structures, to obtain a plurality of tree structures;
an index structure acquisition unit, configured to use all the block structures and all the tree structures together as the index structure.
Optionally, the candidate set acquisition module comprises:
a candidate set acquisition unit, configured to select, according to the obtained query object and using the index structure, a plurality of candidate spatial text objects from all the spatial text objects by a greedy algorithm, to obtain the candidate set.
This application also provides a server, comprising:
a memory, configured to store a computer program;
a processor, configured to implement the result data selection method described above when executing the computer program.
This application also provides a computer-readable storage medium on which a computer program is stored, the computer program implementing the result data selection method described above when executed by a processor.
The result data selection method based on spatial keyword search provided by this application comprises: building an index structure over a plurality of spatial text objects to obtain an index structure; selecting, according to an obtained query object, a plurality of candidate spatial text objects using the index structure to obtain a candidate set; determining the distance between each candidate spatial text object and the query object to obtain the distance coefficient of each candidate spatial text object relative to the query object; determining the number of topics that each candidate spatial text object contains beyond all initialized topics to obtain the first diversity topic count of each candidate spatial text object; determining the first boundary cost of each candidate spatial text object from all the distance coefficients and all the first diversity topic counts, wherein the first boundary cost is directly proportional to the distance coefficient and inversely proportional to the first diversity topic count; and adding the candidate spatial text object with the smallest first boundary cost to a result set.
As can be seen, the diversity of spatial text objects is first measured by the diversity topic count; the boundary cost of each candidate spatial text object is then determined from its distance coefficient and diversity topic count, and the candidate spatial text object with the smallest boundary cost is added to the result set. The objects in the result set therefore stay close to the query object while keeping a high diversity topic count. In other words, selection based on the distance coefficient also takes the diversity of each search result into account, which improves the diversity of the result set and meets users' need for diversified search.
This application also provides a result data selection apparatus, a server, and a computer-readable storage medium based on spatial keyword search, which have the same beneficial effects; they are not described again here.
Brief description of the drawings
To describe the technical solutions in the embodiments of this application or in the prior art more clearly, the drawings needed for the description of the embodiments or the prior art are briefly introduced below. The drawings described below are merely embodiments of this application; a person of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flowchart of a result data selection method based on spatial keyword search provided by an embodiment of this application;
Fig. 2 is a flowchart of the subsequent selection process of the result data selection method based on spatial keyword search provided by an embodiment of this application;
Fig. 3 is a flowchart of index building in the result data selection method based on spatial keyword search provided by an embodiment of this application;
Fig. 4 is a schematic structural diagram of a result data selection apparatus based on spatial keyword search provided by an embodiment of this application.
Detailed description
The core of this application is to provide a result data selection method based on spatial keyword search, a result data selection apparatus, a server, and a computer-readable storage medium. The boundary cost of each candidate spatial text object is determined from its distance coefficient and diversity topic count, and the candidate spatial text object with the smallest boundary cost is added to the result set, so that the objects in the result set stay close to the query object while keeping a high diversity topic count. In other words, selection based on the distance coefficient also takes the diversity of each search result into account, which improves the diversity of the result set and meets users' need for diversified search.
To make the purpose, technical solutions, and advantages of the embodiments of this application clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the drawings. The described embodiments are some, not all, of the embodiments of this application. All other embodiments obtained by a person of ordinary skill in the art from the embodiments in this application without creative effort fall within the protection scope of this application.
Referring to Fig. 1, Fig. 1 is a flowchart of a result data selection method based on spatial keyword search provided by an embodiment of this application.
This embodiment provides a result data selection method based on spatial keyword search that can improve the diversity of search results. The method may include:
S101: build an index structure over a plurality of spatial text objects to obtain an index structure.
This step builds an index structure over all the spatial text objects. The index structure must be built before a query can be executed against the query conditions, because it is what makes searching the spatial text objects possible. The method of building the index structure is not limited here; any method that enables searching the spatial text objects can be used in this embodiment.
A suitable index structure can also be built for the characteristics of a particular search environment and query workload. This helps speed up the search, and a simplified index structure reduces the cost of maintaining and updating it.
A spatial text object is obtained by formally expressing the spatial text data to be searched. The formalization is described in later paragraphs.
S102: select, according to the obtained query object, a plurality of candidate spatial text objects using the index structure to obtain a candidate set.
Building on step S101, this step uses the index structure and the query object to select a plurality of candidate spatial text objects from all the spatial text objects, obtaining a candidate set. The purpose is to ensure that the candidate spatial text objects satisfy the constraints of the query object.
Like the spatial text objects of the previous step, the query object is obtained by formally expressing data. The formalization is described in later paragraphs.
S103: determine the distance between each candidate spatial text object and the query object to obtain the distance coefficient of each candidate spatial text object relative to the query object.
Building on step S102, this step determines the distance between each candidate spatial text object and the query object and obtains the corresponding distance coefficient. The distance can be computed with an ordinary spatial distance calculation. Its purpose is likewise to obtain the distance between a searched object and the query object, so that the subsequent query process can screen by distance to obtain the best query result.
The method of determining the distance may also vary with the formal expression chosen for the candidate spatial text objects and the query object. The specific form should be selected according to the actual application environment and is not limited here.
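As one possible reading of step S103, the sketch below computes a great-circle (haversine) distance and normalizes it into a coefficient in [0, 1]. The source does not prescribe a distance formula, so the haversine choice, the `(latitude, longitude)` tuple order, the Earth radius constant, and the names `haversine_km` and `distance_coefficients` are all assumptions made for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an approximation

def haversine_km(a, b):
    """Great-circle distance between two (latitude, longitude) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))

def distance_coefficients(query_loc, candidate_locs, d_max=None):
    """Map candidate id -> distance to the query divided by d_max.

    If d_max is not given, the largest candidate distance is used
    (an assumption; the source normalizes by a maximum over the data set).
    """
    dists = {cid: haversine_km(query_loc, loc) for cid, loc in candidate_locs.items()}
    if d_max is None:
        d_max = max(dists.values(), default=0.0)
    if d_max == 0.0:
        return {cid: 0.0 for cid in dists}
    return {cid: d / d_max for cid, d in dists.items()}
```

Normalizing the distance makes it comparable in scale with the topic count when the two are later combined into a single cost.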
S104: determine the number of topics that each candidate spatial text object contains beyond all initialized topics to obtain the first diversity topic count of each candidate spatial text object.
Building on step S102, this step determines, for each candidate spatial text object, the number of topics it contains beyond all initialized topics; this number is the first diversity topic count of that candidate.
In ordinary spatial keyword search, the selection process considers only the distance between each candidate spatial text object and the query object, so the final result data consists of the candidate spatial text objects closest to the query object and is therefore very similar to the query object. Such results cannot meet users' need for diversity in the query process: at a given degree of similarity to the query object, the query results should be as diverse as possible, that is, the result data should include multiple types of points of interest related to a given keyword topic.
In this step, the diversity topic count of a candidate spatial text object is therefore determined as the number of topics it contains beyond all initialized topics. A topic is an attribute common to candidate spatial text objects and query objects, and a general attribute used in spatial text search.
Here, all initialized topics are the topics covered by all spatial text objects currently in the initialized result set. The number of topics a candidate contains outside that range therefore measures the topic diversity of that candidate spatial text object. In practice, selecting search result data by this diversity topic count improves the diversity of the search results, so that the result data meets users' need for diversity.
The result set in this embodiment is obtained by initialization, and the data added to it is chosen from the candidate set in successive rounds. In the first round, the initialized result set may contain no spatial text object and therefore cover no topics. Initialization may also add a certain number of result data to the result set by another search method, after which the method of this embodiment adds further spatial text objects. In that case the initialized result set already covers some topics, and the first diversity topic count of each candidate spatial text object can be determined from that topic range.
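The diversity topic count described above amounts to a set difference. A minimal sketch, assuming topics are represented as Python sets of strings (the function name `diversity_topic_count` is hypothetical):

```python
def diversity_topic_count(candidate_topics, covered_topics):
    """Number of the candidate's topics not yet covered by the result set."""
    return len(set(candidate_topics) - set(covered_topics))
```

With an empty initialized result set, `covered_topics` is empty and the count equals the candidate's total number of topics; with a result set seeded by another search method, only topics outside the seeded range are counted.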
The initialization in this embodiment may itself be another spatial keyword search process; that is, this embodiment can be applied after another spatial keyword search process to improve the diversity of that search. Other search methods can likewise be applied after the spatial keyword search process of this embodiment to select more suitable search results.
Note that there is no required execution order between this step and step S103.
S105: determine the first boundary cost of each candidate spatial text object from all the distance coefficients and all the first diversity topic counts, wherein the first boundary cost is directly proportional to the distance coefficient and inversely proportional to the first diversity topic count.
Building on steps S103 and S104, this step determines the first boundary cost of each candidate spatial text object from the distance coefficient and first diversity topic count obtained in those steps. The boundary cost is introduced as a single measure of both object similarity and diversity. Because it is directly proportional to the distance coefficient and inversely proportional to the first diversity topic count, a smaller boundary cost indicates a smaller distance coefficient together with a larger first diversity topic count.
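The source states only the two proportionality relations, not an explicit formula. A minimal sketch, assuming the simplest form that satisfies both (distance coefficient divided by diversity topic count); the name `boundary_cost` and the treatment of a zero count as infinite cost are assumptions:

```python
import math

def boundary_cost(distance_coeff, topic_count):
    """Cost grows with distance and shrinks as the candidate adds more new topics."""
    if topic_count <= 0:
        return math.inf  # assumption: a candidate adding no new topic is never preferred
    return distance_coeff / topic_count
```

Under this form, a candidate twice as far away needs to contribute twice as many uncovered topics to reach the same cost.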
S106: add the candidate spatial text object with the smallest first boundary cost to the result set.
Building on step S105, this step adds the candidate spatial text object with the smallest first boundary cost to the result set. That is, the candidate in the candidate set with the best combination of distance coefficient and first diversity topic count is chosen as one of the result data, which improves the diversity of the result data and meets users' need for diversity.
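Steps S104 to S106 can be combined into one selection step, sketched below. The cost formula (distance coefficient divided by the number of uncovered topics) is an assumption, since the source gives only proportionality relations, and the name `select_next` and the input layout are hypothetical.

```python
import math

def select_next(candidates, covered):
    """Pick the candidate with the smallest boundary cost.

    candidates: dict mapping id -> (distance_coeff, set_of_topics)
    covered:    set of topics already covered by the result set
    Returns (chosen id, updated covered-topic set).
    """
    def cost(item):
        dist, topics = item[1]
        new_topics = len(topics - covered)
        return dist / new_topics if new_topics else math.inf

    best_id, (_, topics) = min(candidates.items(), key=cost)
    return best_id, covered | topics
```

For example, a restaurant at coefficient 0.3 that covers three uncovered topics (cost 0.1) is preferred over a closer one at 0.2 that covers a single topic (cost 0.2).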
In summary, this embodiment determines the boundary cost of each candidate spatial text object from its distance coefficient and diversity topic count and adds the candidate with the smallest boundary cost to the result set, so that the objects in the result set stay close to the query object while keeping a high diversity topic count. In other words, selection based on the distance coefficient also takes the diversity of each search result into account, which improves the diversity of the result set and meets users' need for diversified search.
Optionally, in this embodiment the candidate set can be obtained by selecting, according to the obtained query object and using the index structure, a plurality of candidate spatial text objects from all the spatial text objects by a greedy algorithm.
The purpose of this alternative is to select candidate spatial text objects from all the spatial text objects to obtain the candidate set; its distinguishing feature is that suitable candidates are selected by a greedy algorithm. The reason is that the boundary-cost-based selection of the above embodiment cannot by itself guarantee that the result data meets given requirements on both diversity and distance. To improve the quality of the query results, this alternative therefore uses a greedy algorithm to select the candidate spatial text objects that form the candidate set.
The core idea of this alternative is to divide the spatial text objects into layers according to their spatial distance from the query object, and then to select a certain number of spatial text objects in each layer such that the selected objects cover more not-yet-covered topics than the other spatial text objects in the same layer.
Specifically, given a set D of spatial objects and a spatial keyword query q, assume that k spatial objects can be found in D that meet the diversity requirement by covering enough topics while minimizing the distance function, and that the sum of their spatial distances from q is M. Since M may be very small, the search can start within a circle of small radius centered on the location of q. If the results found in that range cannot cover enough topics, M is enlarged and the search continues in a larger range. For each circular search range, the objects in range are divided into layers by their distance from the query point, and suitable objects are then selected in each layer.
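The expanding, layered search described above might be sketched as follows. Many details are assumptions not given in the source: the radius doubles on each expansion, layers are fixed-width distance rings, a fixed number of objects is taken per layer, and the search stops at a maximum radius. The name `layered_candidates` and all parameter names are hypothetical.

```python
def layered_candidates(objects, threshold, radius, layer_width, per_layer, max_radius):
    """objects: list of (id, distance_to_query, set_of_topics).

    Searches a circle of the given radius, splits the objects in range into
    distance layers, and in each layer greedily takes up to per_layer objects
    that add the most uncovered topics. The radius doubles until at least
    `threshold` topics are covered or max_radius is reached.
    """
    if radius <= 0 or layer_width <= 0:
        raise ValueError("radius and layer_width must be positive")
    while True:
        layers = {}
        for obj in objects:
            if obj[1] <= radius:
                layers.setdefault(int(obj[1] // layer_width), []).append(obj)

        chosen, covered = [], set()
        for idx in sorted(layers):                # nearest layer first
            pool = list(layers[idx])
            for _ in range(per_layer):
                best = max(pool, key=lambda o: len(o[2] - covered), default=None)
                if best is None or not (best[2] - covered):
                    break                         # nothing new left in this layer
                pool.remove(best)
                chosen.append(best[0])
                covered |= best[2]

        if len(covered) >= threshold or radius >= max_radius:
            return chosen, covered
        radius *= 2                               # widen the search circle
```

Processing the nearest layer first keeps the selected objects close to the query point, while the per-layer greedy choice favors objects that extend topic coverage.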
Further, the greedy algorithm of this alternative can also be applied after the result data has been obtained; that is, the result set obtained in this embodiment can be screened again by the greedy algorithm. This can improve the accuracy of the original result set, i.e., bring the result set closer to the query object.
Based on the above embodiments, the spatial text object, the query object, and related concepts can be formally expressed as follows.
1) Formal expression of a spatial text object
A spatial text object is represented as a point o = {loc, term, topic} in two-dimensional space, described by position coordinates and text. Here, loc consists of a longitude and a latitude and gives the position of object o; term is the set of keywords describing object o; and topic is the set of topics covered by object o.
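The triple o = {loc, term, topic} maps naturally onto a small record type. A minimal sketch, assuming a `(longitude, latitude)` tuple for loc and frozen sets for term and topic; the class name `SpatialTextObject` and the sample values are hypothetical:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SpatialTextObject:
    loc: tuple          # (longitude, latitude) of the object
    term: frozenset     # keywords describing the object
    topic: frozenset    # topics covered by the object

# Illustrative point of interest (values invented for the example).
poi = SpatialTextObject(
    loc=(116.40, 39.90),
    term=frozenset({"restaurant", "noodles"}),
    topic=frozenset({"Sichuan cuisine"}),
)
```

Using immutable fields keeps objects hashable, so they can be stored directly in sets such as a candidate set or result set.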
For example, in a map application, a spatial text object corresponds to a point of interest, i.e., a business or an institution, whose position and text description are recorded by the system. The topics covered by the point of interest can be obtained by manual labeling or by analyzing its review information with natural language processing techniques. For convenience, a spatial text object may also be called a spatial object.
Based on the above definition, D denotes the set of all spatial text objects in the database, i.e., $D = \{o_1, o_2, \ldots, o_n\}$.
2) Formal expression of a spatial keyword query
A spatial keyword query is formalized as q = {loc, term}, where q is the query object. Here, loc is the query point, i.e., the user's position, expressed as longitude and latitude coordinates in two-dimensional space; term is the set of keywords entered by the user, such as "Chinese restaurant", which describes the user's query intent.
For a given query object q, the search engine selects from the data set D the k spatial text objects that best match q as the returned result data. The result data is a set of spatial text objects that are close in distance, highly relevant in text, and highly diverse among themselves.
3) Formal expression of the candidate set
Given a spatial text object database D, a spatial keyword query object q = {loc, term}, and a threshold Thre, a subset S of D (i.e., $S \subseteq D$) is called a candidate set if and only if it satisfies two conditions:
Keyword constraint: every spatial text object o in S contains all the query keywords, i.e., $q.term \subseteq o.term$ for all $o \in S$;
Diversity requirement: the total number of distinct topics covered by all spatial text objects in S is not less than Thre, i.e., $\left|\bigcup_{o \in S} o.topic\right| \geq Thre$.
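The two conditions of the candidate set definition can be checked directly. A minimal sketch, assuming each object is given as a pair of Python sets (terms, topics); the name `is_candidate_set` is hypothetical:

```python
def is_candidate_set(subset, query_terms, thre):
    """subset: iterable of (terms, topics) pairs, each a set of strings.

    Returns True when every object contains all query keywords and the
    objects together cover at least `thre` distinct topics.
    """
    subset = list(subset)
    query_terms = set(query_terms)
    keyword_ok = all(query_terms <= terms for terms, _ in subset)
    covered = set().union(*(topics for _, topics in subset))
    return keyword_ok and len(covered) >= thre
```

Topics shared by several objects count once, because the diversity requirement is stated over distinct topics.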
4) Formal expression of the distance function of spatial text objects
Given a spatial text object database D and a spatial keyword query object q = {loc, term}, for a subset R of D with k elements, the distance function f(q, R) between the set R and q is defined as:
Here Dist(q, o) is the distance between spatial text object o and query object q, and $Dist_{max}$ is the maximum distance between a spatial text object in data set D and the query object. The distance between the query object and a set of spatial text objects is thus normalized, i.e., its value lies in the interval [0, 1].
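The formula itself does not appear in this text. One form that is consistent with the surrounding description (a sum over the k objects of R, each distance divided by $Dist_{max}$, with a value in [0, 1]) would be the following; the averaging factor 1/k is an assumption introduced here to keep the value in that interval:

```latex
f(q, R) = \frac{1}{k} \sum_{o \in R} \frac{Dist(q, o)}{Dist_{max}},
\qquad R \subseteq D,\ |R| = k
```

Each term $Dist(q, o) / Dist_{max}$ lies in [0, 1] by the definition of $Dist_{max}$, so their average does as well.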
5) Formal definition of the search problem
Given a spatial text object data set D, a spatial keyword query q = {loc, term}, a distance function f, and a threshold Thre, and taking into account the spatial distance, the text threshold, and the topic coverage between the spatial text objects and the query, the goal is to return k spatial text objects that satisfy the following two conditions:
1. the k spatial text objects form a candidate set R, i.e., $R \subseteq D$ and $|R| = k$; and
2. f(q, R) attains its minimum value.
Referring to Fig. 2, Fig. 2 is a flowchart of the subsequent selection process of the result data selection method based on spatial keyword search provided by an embodiment of this application.
Building on the previous embodiment, this embodiment expands on what happens after the candidate spatial text object with the smallest first boundary cost has been added to the result set. The remaining parts are essentially the same as in the previous embodiment, to which reference can be made, and are not repeated here.
This embodiment may include:
S201: when the candidate spatial text object with the minimum first boundary cost is selected and added to the result set, determine, for each candidate spatial text object, the number of topics it contains beyond all topics covered after that object was added, to obtain the second diversity topic count of each candidate spatial text object;
This step addresses the situation in which the candidate spatial text object with the minimum first boundary cost from the previous embodiment has been added to the result set. The spatial text objects in the result set have then changed, and the range of topics covered by the result set has changed accordingly. To keep improving the diversity of the candidate spatial text objects chosen from the candidate set, it is necessary to determine, for each candidate spatial text object, the number of topics it contains beyond all topics covered after the candidate spatial text object with the minimum first boundary cost was added, thereby obtaining the second diversity topic count.
Because the candidate spatial text object selected in the first embodiment has been added to the result set, the range of topics covered by the result set has changed, which means that the baseline against which diversity is measured has also changed. The main purpose of this step is therefore to recalculate the diversity topic counts of all candidate spatial text objects on the basis of the updated (second) result set, i.e., the second diversity topic counts.
S202: determine the second boundary cost of each corresponding candidate spatial text object according to all distance coefficients and all second diversity topic counts, where the distance coefficient is directly proportional to the second boundary cost and the second diversity topic count is inversely proportional to the second boundary cost;
On the basis of step S201, this step determines the second boundary cost from the distance coefficient and the second diversity topic count, in the same manner as in the previous embodiment. The specific content is substantially the same as in the previous embodiment, to which reference may be made; it is not repeated here.
S203: select the candidate spatial text object with the minimum second boundary cost and add it to the result set.
On the basis of step S202, this step adds the candidate spatial text object with the minimum second boundary cost to the result set.
Each time result data is added during a spatial keyword search, the range of topics covered changes accordingly. This embodiment therefore explains how subsequent result data is added so that the diversity of the result data is maintained. The steps described in this embodiment can be repeated multiple times, requiring only adaptive modifications on the basis of this embodiment; the details are not repeated.
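The repeated selection described in steps S201 to S203 can be sketched as a greedy loop. This is a minimal illustration under stated assumptions, not the claimed implementation: the boundary cost is assumed to be the distance coefficient divided by the diversity topic count (the text only states direct and inverse proportionality), Euclidean distance is assumed, and the function name `greedy_select` and the input layout are hypothetical.

```python
import math


def greedy_select(query_loc, candidates, k):
    """Pick up to k candidates, recomputing diversity after each pick.

    candidates: dict mapping a name to (location, set_of_topics).
    Returns the selected names in selection order.
    """
    if not candidates:
        return []
    # Normalizer for the distance coefficient (falls back to 1.0 if all distances are 0).
    dist_max = max(math.dist(query_loc, loc) for loc, _ in candidates.values()) or 1.0
    covered = set()              # topics already covered by the result set
    remaining = dict(candidates)
    result = []
    while remaining and len(result) < k:
        best_name, best_cost = None, math.inf
        for name, (loc, topics) in remaining.items():
            # Diversity topic count: topics not yet covered by the result set.
            new_topics = len(topics - covered)
            coeff = math.dist(query_loc, loc) / dist_max
            # Assumed cost form: grows with distance, shrinks with new topics.
            cost = coeff / new_topics if new_topics else math.inf
            if cost < best_cost:
                best_name, best_cost = name, cost
        if best_name is None:
            # No candidate adds a new topic: fall back to the nearest one (an assumption).
            best_name = min(remaining, key=lambda n: math.dist(query_loc, remaining[n][0]))
        covered |= remaining.pop(best_name)[1]
        result.append(best_name)
    return result
```

With three candidates at distances 1, 2, and 3 from the query, where the two nearest share a single topic and the farthest covers two other topics, this sketch selects the nearest object first and the farthest second, whereas a ranking by distance alone would return the two nearest objects.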
Referring to FIG. 3, FIG. 3 is a flowchart of index establishment in the result data selection method based on spatial keyword search provided by an embodiment of the present application.
Building on the previous embodiment, this embodiment mainly explains how the index structure of the previous embodiment is established. For the other parts, reference may be made to the previous embodiment; they are not repeated here.
The present embodiment may include:
S301: determine the keyword occurrence count of each spatial text object;
S302: organize the spatial text objects whose keyword occurrence count is less than a preset number into block structures, obtaining multiple block structures;
S303: organize the spatial text objects whose keyword occurrence count is greater than or equal to the preset number into tree structures, obtaining multiple tree structures;
S304: use all block structures and all tree structures as the index structure.
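Steps S301 to S304 amount to a threshold-based partition, which can be sketched as follows. Two simplifications are assumed here: the keyword occurrence count of an object is taken to be the number of keyword occurrences in its text, and the block and tree structures are represented by plain lists rather than real storage blocks or spatial trees. The function name `build_index` is hypothetical.

```python
def build_index(objects, preset_times):
    """Partition objects into block-bound and tree-bound groups.

    objects: dict mapping an object name to its list of keyword occurrences.
    preset_times: the preset threshold on the keyword occurrence count.
    """
    blocks, trees = [], []
    for name, keywords in objects.items():
        # S301: keyword occurrence count (assumed counting rule).
        count = len(keywords)
        if count < preset_times:
            # S302: low-count objects go to block structures.
            blocks.append(name)
        else:
            # S303: objects at or above the threshold go to tree structures.
            trees.append(name)
    # S304: both groups together form the index structure.
    return {"blocks": blocks, "trees": trees}
```

In a fuller implementation the tree group would typically be loaded into a spatial tree such as an R-tree; that step is left out because the text does not specify which tree is used.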
In existing spatial keyword queries, index structures fall into three categories: space-first index structures, text-first index structures, and index structures that tightly combine the two. Space-first index structures can be further divided into those based on R-trees, grids, and space-filling curves. Text-first index structures are mainly based on inverted files and bitmaps. Combined spatial-text index structures tightly integrate these structures so as to filter out, more efficiently, spatial text objects that do not meet the requirements of the query object. However, as the data volume grows, all of these index structures become extremely large, so the space occupied by the index rises sharply and updates slow down, which degrades the experience in practical applications.
Therefore, this embodiment assigns different index structures to objects according to the keyword occurrence count of each spatial text object, i.e., the index structure of the spatial text objects is classified. Objects with a lower keyword occurrence count are placed in block structures, which are suited to storing a large number of object data. Objects with a higher keyword occurrence count are placed in tree structures, which make it easy to find relevant objects during a search.
During an object search, the corresponding search operation can be completed by searching the different tree structures and block structures. For a tree structure that satisfies the conditions of the query object, its object nodes can be accessed in increasing order of minimum boundary cost, where the minimum boundary cost of a node is defined in terms of the following quantities:
N denotes a node of the tree structure, Dist(q, N.mbr) is the spatial distance from the query to the minimum bounding rectangle of N, and |Occur_i = 1| is the number of occurrences in the index structure of N (i.e., the number of topics covered by N).
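Visiting nodes in increasing order of minimum boundary cost can be sketched with a priority queue. The cost formula itself is not reproduced in the text, so this sketch assumes Dist(q, N.mbr) divided by the number of topics covered by N, which is consistent with the proportionality stated for boundary costs. The names `min_dist_to_mbr` and `visit_order` and the flat node list are hypothetical.

```python
import heapq
import math


def min_dist_to_mbr(q, mbr):
    """Euclidean distance from point q to rectangle (xmin, ymin, xmax, ymax)."""
    x, y = q
    xmin, ymin, xmax, ymax = mbr
    # The per-axis gap is zero when the point lies inside the rectangle's extent.
    dx = max(xmin - x, 0.0, x - xmax)
    dy = max(ymin - y, 0.0, y - ymax)
    return math.hypot(dx, dy)


def visit_order(q, nodes):
    """Yield node names in increasing order of the assumed minimum boundary cost.

    nodes: list of (name, mbr, topic_count) tuples.
    """
    heap = []
    for i, (name, mbr, topic_count) in enumerate(nodes):
        # Assumed cost: distance to the MBR divided by the topics the node covers.
        cost = min_dist_to_mbr(q, mbr) / topic_count if topic_count else math.inf
        # The index i breaks ties so that names are never compared.
        heapq.heappush(heap, (cost, i, name))
    while heap:
        _, _, name = heapq.heappop(heap)
        yield name
```

A real traversal would push child nodes onto the heap as each node is expanded; the flat list is used here only to show the ordering rule.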
The embodiments of the present application provide a result data selection method based on spatial keyword search. The boundary cost of each candidate spatial text object is determined from the distance coefficient and the diversity topic count, and the candidate spatial text object with the minimum boundary cost is added to the result set, so that the objects in the result set stay close to the query object while the diversity topic count stays high. In other words, selection is based on the distance coefficient while the diversity of each search result is also taken into account, which improves the diversity of the result set and meets users' diversified search needs.
A result data selection device based on spatial keyword search provided by an embodiment of the present application is introduced below. The device described below and the result data selection method based on spatial keyword search described above may be referred to in correspondence with each other.
Referring to FIG. 4, FIG. 4 is a schematic structural diagram of a result data selection device based on spatial keyword search provided by an embodiment of the present application.
This embodiment provides a result data selection device based on spatial keyword search, which may include:
an index establishment module 100, configured to perform an index structure establishment operation on multiple spatial text objects to obtain an index structure;
a candidate set acquisition module 200, configured to select multiple candidate spatial text objects using the index structure according to an obtained query object, to obtain a candidate set;
a distance coefficient acquisition module 300, configured to determine the distance between each candidate spatial text object and the query object, to obtain the distance coefficient between each candidate spatial text object and the query object;
a first diversity topic count acquisition module 400, configured to determine the number of topics that each candidate spatial text object contains outside the initialized set of topics, to obtain the first diversity topic count of each candidate spatial text object;
a first boundary cost acquisition module 500, configured to determine the first boundary cost of each candidate spatial text object according to all distance coefficients and all first diversity topic counts, where the distance coefficient is directly proportional to the first boundary cost and the first diversity topic count is inversely proportional to the first boundary cost;
a first result data selection module 600, configured to select the candidate spatial text object with the minimum first boundary cost and add it to the result set.
On the basis of the above embodiment, the device may further include:
a second diversity topic count acquisition module, configured to, when the candidate spatial text object with the minimum first boundary cost is selected and added to the result set, determine, for each candidate spatial text object, the number of topics it contains beyond all topics covered after that object was added, to obtain the second diversity topic count of each candidate spatial text object;
a second boundary cost acquisition module, configured to determine the second boundary cost of each corresponding candidate spatial text object according to all distance coefficients and all second diversity topic counts, where the distance coefficient is directly proportional to the second boundary cost and the second diversity topic count is inversely proportional to the second boundary cost;
a second result data selection module, configured to select the candidate spatial text object with the minimum second boundary cost and add it to the result set.
The index establishment module 100 may include:
a keyword occurrence count acquisition unit, configured to determine the keyword occurrence count of each spatial text object;
a block structure acquisition unit, configured to organize the spatial text objects whose keyword occurrence count is less than a preset number into block structures, obtaining multiple block structures;
a tree structure acquisition unit, configured to organize the spatial text objects whose keyword occurrence count is greater than or equal to the preset number into tree structures, obtaining multiple tree structures;
an index structure acquisition unit, configured to use all block structures and all tree structures as the index structure.
The candidate set acquisition module 200 may include:
a candidate set acquisition unit, configured to select, according to the obtained query object and using the index structure, multiple candidate spatial text objects from all spatial text objects by means of a greedy algorithm, to obtain the candidate set.
An embodiment of the present application further provides a server, including:
a memory, configured to store a computer program;
a processor, configured to implement the result data selection method of the above embodiments when executing the computer program.
An embodiment of the present application further provides a computer-readable storage medium on which a computer program is stored; when the computer program is executed by a processor, the result data selection method of the above embodiments is implemented.
The embodiments in this specification are described in a progressive manner. Each embodiment focuses on its differences from the other embodiments, and for identical or similar parts the embodiments may be referred to one another. Because the device disclosed in an embodiment corresponds to the method disclosed in an embodiment, its description is relatively brief; for relevant details, refer to the description of the method.
Those skilled in the art will further appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of the two. To illustrate clearly the interchangeability of hardware and software, the composition and steps of each example have been described above in general terms according to function. Whether these functions are implemented in hardware or in software depends on the specific application and the design constraints of the technical solution. Those skilled in the art may use different methods to implement the described functions for each specific application, but such implementations should not be considered to go beyond the scope of the present application.
The steps of the method or algorithm described in connection with the embodiments disclosed herein can be implemented directly in hardware, in a software module executed by a processor, or in a combination of the two. The software module may reside in random access memory (RAM), internal memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the technical field.
The result data selection method, result data selection device, server, and computer-readable storage medium based on spatial keyword search provided by the present application have been described in detail above. Specific examples are used herein to explain the principles and implementations of the present application, and the description of the above embodiments is intended only to help in understanding the method of the present application and its core idea. It should be noted that those of ordinary skill in the art may make a number of improvements and modifications to the present application without departing from its principles, and these improvements and modifications also fall within the protection scope of the claims of the present application.
Claims (10)
1. A result data selection method based on spatial keyword search, characterized by comprising:
performing an index structure establishment operation on multiple spatial text objects to obtain an index structure;
selecting multiple candidate spatial text objects using the index structure according to an obtained query object, to obtain a candidate set;
determining the distance between each candidate spatial text object and the query object, to obtain the distance coefficient between each candidate spatial text object and the query object;
determining the number of topics that each candidate spatial text object contains outside the initialized set of topics, to obtain the first diversity topic count of each candidate spatial text object;
determining the first boundary cost of each candidate spatial text object according to all the distance coefficients and all the first diversity topic counts, wherein the distance coefficient is directly proportional to the first boundary cost, and the first diversity topic count is inversely proportional to the first boundary cost;
selecting the candidate spatial text object with the minimum first boundary cost and adding it to a result set.
2. The result data selection method according to claim 1, characterized by further comprising:
when the candidate spatial text object with the minimum first boundary cost is selected and added to the result set, determining, for each candidate spatial text object, the number of topics it contains beyond all topics covered after the candidate spatial text object with the minimum first boundary cost was added, to obtain the second diversity topic count of each candidate spatial text object;
determining the second boundary cost of each corresponding candidate spatial text object according to all the distance coefficients and all the second diversity topic counts, wherein the distance coefficient is directly proportional to the second boundary cost, and the second diversity topic count is inversely proportional to the second boundary cost;
selecting the candidate spatial text object with the minimum second boundary cost and adding it to the result set.
3. The result data selection method according to claim 2, characterized in that performing the index structure establishment operation on multiple spatial text objects to obtain the index structure comprises:
determining the keyword occurrence count of each spatial text object;
organizing the spatial text objects whose keyword occurrence count is less than a preset number into block structures, obtaining multiple block structures;
organizing the spatial text objects whose keyword occurrence count is greater than or equal to the preset number into tree structures, obtaining multiple tree structures;
using all the block structures and all the tree structures as the index structure.
4. The result data selection method according to claim 3, characterized in that selecting multiple candidate spatial text objects using the index structure according to the obtained query object to obtain the candidate set comprises:
selecting, according to the obtained query object and using the index structure, multiple candidate spatial text objects from all the spatial text objects by means of a greedy algorithm, to obtain the candidate set.
5. A result data selection device based on spatial keyword search, characterized by comprising:
an index establishment module, configured to perform an index structure establishment operation on multiple spatial text objects to obtain an index structure;
a candidate set acquisition module, configured to select multiple candidate spatial text objects using the index structure according to an obtained query object, to obtain a candidate set;
a distance coefficient acquisition module, configured to determine the distance between each candidate spatial text object and the query object, to obtain the distance coefficient between each candidate spatial text object and the query object;
a first diversity topic count acquisition module, configured to determine the number of topics that each candidate spatial text object contains outside the initialized set of topics, to obtain the first diversity topic count of each candidate spatial text object;
a first boundary cost acquisition module, configured to determine the first boundary cost of each candidate spatial text object according to all the distance coefficients and all the first diversity topic counts, wherein the distance coefficient is directly proportional to the first boundary cost, and the first diversity topic count is inversely proportional to the first boundary cost;
a first result data selection module, configured to select the candidate spatial text object with the minimum first boundary cost and add it to a result set.
6. The result data selection device according to claim 5, characterized by further comprising:
a second diversity topic count acquisition module, configured to, when the candidate spatial text object with the minimum first boundary cost is selected and added to the result set, determine, for each candidate spatial text object, the number of topics it contains beyond all topics covered after the candidate spatial text object with the minimum first boundary cost was added, to obtain the second diversity topic count of each candidate spatial text object;
a second boundary cost acquisition module, configured to determine the second boundary cost of each corresponding candidate spatial text object according to all the distance coefficients and all the second diversity topic counts, wherein the distance coefficient is directly proportional to the second boundary cost, and the second diversity topic count is inversely proportional to the second boundary cost;
a second result data selection module, configured to select the candidate spatial text object with the minimum second boundary cost and add it to the result set.
7. The result data selection device according to claim 6, characterized in that the index establishment module comprises:
a keyword occurrence count acquisition unit, configured to determine the keyword occurrence count of each spatial text object;
a block structure acquisition unit, configured to organize the spatial text objects whose keyword occurrence count is less than a preset number into block structures, obtaining multiple block structures;
a tree structure acquisition unit, configured to organize the spatial text objects whose keyword occurrence count is greater than or equal to the preset number into tree structures, obtaining multiple tree structures;
an index structure acquisition unit, configured to use all the block structures and all the tree structures as the index structure.
8. The result data selection device according to claim 7, characterized in that the candidate set acquisition module comprises:
a candidate set acquisition unit, configured to select, according to the obtained query object and using the index structure, multiple candidate spatial text objects from all the spatial text objects by means of a greedy algorithm, to obtain the candidate set.
9. A server, characterized by comprising:
a memory, configured to store a computer program;
a processor, configured to implement the result data selection method according to any one of claims 1 to 4 when executing the computer program.
10. A computer-readable storage medium, characterized in that a computer program is stored on the computer-readable storage medium, and when the computer program is executed by a processor, the result data selection method according to any one of claims 1 to 4 is implemented.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810184309.9A CN108304585B (en) | 2018-03-06 | 2018-03-06 | Result data selection method based on space keyword search and related device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108304585A true CN108304585A (en) | 2018-07-20 |
CN108304585B CN108304585B (en) | 2022-05-17 |
Family
ID=62849191
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810184309.9A Active CN108304585B (en) | 2018-03-06 | 2018-03-06 | Result data selection method based on space keyword search and related device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108304585B (en) |
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104834679A (en) * | 2015-04-14 | 2015-08-12 | 苏州大学 | Representation and inquiry method of behavior track and device therefor |
CN105069094A (en) * | 2015-08-06 | 2015-11-18 | 苏州大学 | Semantic understanding based space keyword indexing method |
CN106503223A (en) * | 2016-11-04 | 2017-03-15 | 华东师范大学 | A kind of binding site and the online source of houses searching method and device of key word information |
CN107145545A (en) * | 2017-04-18 | 2017-09-08 | 东北大学 | Top k zone users text data recommends method in a kind of location-based social networks |
Non-Patent Citations (4)
Title |
---|
JIABAO SUN等: "Interactive Spatial Keyword Querying with Semantics", 《 PROCEEDINGS OF THE 2017 ACM ON CONFERENCE ON INFORMATION AND KNOWLEDGE MANAGEMENT》 * |
ZHIHU QIAN等: "On Efficient Spatial Keyword Querying with Semantics", 《INTERNATIONAL CONFERENCE ON DATABASE SYSTEMS FOR ADVANCED APPLICATIONS》 * |
ZHIHU QIAN等: "Semantic-aware top-k spatial keyword queries", 《WORLD WIDE WEB》 * |
梁银等: "基于对象集合的空间关键词查询", 《计算机应用》 * |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112149005A (en) * | 2019-06-27 | 2020-12-29 | 腾讯科技(深圳)有限公司 | Method, apparatus, device and readable storage medium for determining search results |
CN112149005B (en) * | 2019-06-27 | 2023-09-01 | 腾讯科技(深圳)有限公司 | Method, apparatus, device and readable storage medium for determining search results |
CN112632267A (en) * | 2020-12-04 | 2021-04-09 | 中国人民大学 | Search result diversification system combining global interaction and greedy selection |
CN112632267B (en) * | 2020-12-04 | 2023-05-02 | 中国人民大学 | Global interaction and greedy selection combined search result diversification system |
CN113065036A (en) * | 2021-04-14 | 2021-07-02 | 深圳大学 | Method and device for measuring performance of space supporting point and related components |
Also Published As
Publication number | Publication date |
---|---|
CN108304585B (en) | 2022-05-17 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||