CN108304585A - Result data selection method and related apparatus based on spatial keyword search - Google Patents

Info

Publication number
CN108304585A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201810184309.9A
Other languages
Chinese (zh)
Other versions
CN108304585B (en)
Inventor
钱志虎
许佳捷
郑凯
柳诚飞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suzhou University
Original Assignee
Suzhou University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Suzhou University
Priority to CN201810184309.9A
Publication of CN108304585A
Application granted
Publication of CN108304585B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9537 Spatial or temporal dependent retrieval, e.g. spatiotemporal queries

Abstract

This application discloses a result data selection method based on spatial keyword search. The method first measures the diversity of each spatial text object by its diversity topic count, then determines the boundary cost of each candidate spatial text object from its distance coefficient and diversity topic count, and adds the candidate spatial text object with the smallest boundary cost to the result set. The objects in the result set are thus kept close to the query object while their diversity topic counts stay high. In other words, selection based on the distance coefficient also takes the diversity of each search result into account, which improves the diversity of the result set and meets users' diversified search needs. This application also discloses a result data selection apparatus based on spatial keyword search, a server, and a computer-readable storage medium, which have the same beneficial effects.

Description

Result data selection method and related apparatus based on spatial keyword search
Technical field
This application relates to the field of computer technology, and in particular to a result data selection method based on spatial keyword search, a result data selection apparatus, a server, and a computer-readable storage medium.
Background
With the emergence of location-based service technology, more and more real-world applications are associated with spatial locations. This has given rise to the widely used spatial keyword query, a hybrid query that combines a spatial query with a text query to find the best results.
A spatial keyword query usually involves three basic steps: formally representing the spatial text objects and related data, building an index structure over all spatial text objects, and executing the query with the received query keywords. The formal representation of spatial text objects follows a unified measurement model, which allows spatial text object queries to be carried out efficiently. On this basis, the spatial keyword query can be executed to obtain the results relevant to the query keywords.
However, current spatial keyword query methods generally return spatial text objects that are highly similar to the query keywords and place no requirement on the relationships among the points of interest in the result set. The returned points are usually very similar to one another, which cannot meet users' diversified needs. For example, a user may want to find as many categories of nearby restaurants as possible so as to choose among different types, but the search engine may return only nearby restaurants of a single type, which does not help the user choose.
Therefore, how to improve the diversity of the results of a spatial keyword search and meet users' diversified needs is a key problem for those skilled in the art.
Summary of the invention
The purpose of this application is to provide a result data selection method based on spatial keyword search, a result data selection apparatus, a server, and a computer-readable storage medium. The boundary cost of each candidate spatial text object is determined from its distance coefficient and diversity topic count, and the candidate spatial text object with the smallest boundary cost is added to the result set, so that the objects in the result set stay close to the query object while their diversity topic counts stay high. In other words, selection based on the distance coefficient also takes the diversity of each search result into account, which improves the diversity of the result set and meets users' diversified search needs.
To solve the above technical problem, this application provides a result data selection method based on spatial keyword search, including:
Performing an index structure building operation on multiple spatial text objects to obtain an index structure;
Selecting multiple candidate spatial text objects with the index structure according to the obtained query object, to obtain a candidate set;
Determining the distance between each candidate spatial text object and the query object, to obtain the distance coefficient of each candidate spatial text object with respect to the query object;
Determining the number of topics that each candidate spatial text object contains outside all topics of the initialized result set, to obtain the first diversity topic count of each candidate spatial text object;
Determining the first boundary cost of each candidate spatial text object according to all the distance coefficients and all the first diversity topic counts; wherein the first boundary cost is directly proportional to the distance coefficient and inversely proportional to the first diversity topic count;
Adding the candidate spatial text object with the smallest first boundary cost to the result set.
Optionally, the method further includes:
When the candidate spatial text object with the smallest first boundary cost is added to the result set, determining the number of topics that each candidate spatial text object contains outside all topics covered after that object is added, to obtain the second diversity topic count of each candidate spatial text object;
Determining the second boundary cost of the corresponding candidate spatial text object according to all the distance coefficients and all the second diversity topic counts; wherein the second boundary cost is directly proportional to the distance coefficient and inversely proportional to the second diversity topic count;
Adding the candidate spatial text object with the smallest second boundary cost to the result set.
Optionally, performing the index structure building operation on multiple spatial text objects to obtain the index structure includes:
Determining the keyword occurrence count of each spatial text object;
Organizing the spatial text objects whose keyword occurrence count is less than a preset count into block structures, to obtain multiple block structures;
Organizing the spatial text objects whose keyword occurrence count is greater than or equal to the preset count into tree structures, to obtain multiple tree structures;
Using all the block structures and all the tree structures as the index structure.
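The split between block structures and tree structures can be illustrated with a minimal sketch. It assumes that the keyword occurrence count is the number of objects containing a keyword, that a plain list stands in for a block structure, and that a sorted list stands in for a tree structure. The names `build_index` and `threshold` are hypothetical and do not come from the application.

```python
from collections import defaultdict


def build_index(objects, threshold):
    """Split keyword postings into flat blocks and tree stand-ins.

    objects: iterable of (object_id, keywords) pairs.
    Assumption: a keyword's occurrence count is the number of objects
    that contain it; the application does not spell this out here.
    """
    postings = defaultdict(list)
    for object_id, keywords in objects:
        for keyword in set(keywords):
            postings[keyword].append(object_id)

    blocks, trees = {}, {}
    for keyword, ids in postings.items():
        if len(ids) < threshold:
            blocks[keyword] = ids          # rare keyword: flat block, scanned linearly
        else:
            trees[keyword] = sorted(ids)   # frequent keyword: stand-in for a spatial tree
    return blocks, trees
```

In a real system the tree side would typically be a spatial index rather than a sorted list; the sketch only shows how the preset count decides which structure a keyword's objects go into.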
Optionally, selecting multiple candidate spatial text objects with the index structure according to the obtained query object, to obtain the candidate set, includes:
Selecting multiple candidate spatial text objects from all the spatial text objects with the index structure and a greedy algorithm according to the obtained query object, to obtain the candidate set.
This application also provides a result data selection apparatus based on spatial keyword search, including:
An index building module, configured to perform an index structure building operation on multiple spatial text objects to obtain an index structure;
A candidate set acquisition module, configured to select multiple candidate spatial text objects with the index structure according to the obtained query object, to obtain a candidate set;
A distance coefficient acquisition module, configured to determine the distance between each candidate spatial text object and the query object, to obtain the distance coefficient of each candidate spatial text object with respect to the query object;
A first diversity topic count acquisition module, configured to determine the number of topics that each candidate spatial text object contains outside all topics of the initialized result set, to obtain the first diversity topic count of each candidate spatial text object;
A first boundary cost acquisition module, configured to determine the first boundary cost of each candidate spatial text object according to all the distance coefficients and all the first diversity topic counts; wherein the first boundary cost is directly proportional to the distance coefficient and inversely proportional to the first diversity topic count;
A first result data selection module, configured to add the candidate spatial text object with the smallest first boundary cost to the result set.
Optionally, the apparatus further includes:
A second diversity topic count acquisition module, configured to determine, when the candidate spatial text object with the smallest first boundary cost is added to the result set, the number of topics that each candidate spatial text object contains outside all topics covered after that object is added, to obtain the second diversity topic count of each candidate spatial text object;
A second boundary cost acquisition module, configured to determine the second boundary cost of the corresponding candidate spatial text object according to all the distance coefficients and all the second diversity topic counts; wherein the second boundary cost is directly proportional to the distance coefficient and inversely proportional to the second diversity topic count;
A second result data selection module, configured to add the candidate spatial text object with the smallest second boundary cost to the result set.
Optionally, the index building module includes:
A keyword occurrence count acquisition unit, configured to determine the keyword occurrence count of each spatial text object;
A block structure acquisition unit, configured to organize the spatial text objects whose keyword occurrence count is less than a preset count into block structures, to obtain multiple block structures;
A tree structure acquisition unit, configured to organize the spatial text objects whose keyword occurrence count is greater than or equal to the preset count into tree structures, to obtain multiple tree structures;
An index structure acquisition unit, configured to use all the block structures and all the tree structures as the index structure.
Optionally, the candidate set acquisition module includes:
A candidate set acquisition unit, configured to select multiple candidate spatial text objects from all the spatial text objects with the index structure and a greedy algorithm according to the obtained query object, to obtain the candidate set.
This application also provides a server, including:
A memory, configured to store a computer program;
A processor, configured to implement the result data selection method described above when executing the computer program.
This application also provides a computer-readable storage medium on which a computer program is stored; when executed by a processor, the computer program implements the result data selection method described above.
The result data selection method based on spatial keyword search provided by this application includes: performing an index structure building operation on multiple spatial text objects to obtain an index structure; selecting multiple candidate spatial text objects with the index structure according to the obtained query object, to obtain a candidate set; determining the distance between each candidate spatial text object and the query object, to obtain the distance coefficient of each candidate spatial text object with respect to the query object; determining the number of topics that each candidate spatial text object contains outside all topics of the initialized result set, to obtain the first diversity topic count of each candidate spatial text object; determining the first boundary cost of each candidate spatial text object according to all the distance coefficients and all the first diversity topic counts, wherein the first boundary cost is directly proportional to the distance coefficient and inversely proportional to the first diversity topic count; and adding the candidate spatial text object with the smallest first boundary cost to the result set.
It can be seen that the diversity of each spatial text object is first measured by its diversity topic count, the boundary cost of each candidate spatial text object is then determined from its distance coefficient and diversity topic count, and the candidate spatial text object with the smallest boundary cost is added to the result set. The objects in the result set thus stay close to the query object while their diversity topic counts stay high. In other words, selection based on the distance coefficient also takes the diversity of each search result into account, which improves the diversity of the result set and meets users' diversified search needs.
This application also provides a result data selection apparatus based on spatial keyword search, a server, and a computer-readable storage medium, which have the above beneficial effects and are not described again here.
Brief description of the drawings
To explain the technical solutions of the embodiments of this application or of the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. The drawings described below are merely embodiments of this application; those of ordinary skill in the art can obtain other drawings from the provided drawings without creative effort.
Fig. 1 is a flowchart of a result data selection method based on spatial keyword search provided by an embodiment of this application;
Fig. 2 is a flowchart of the subsequent selection process of the result data selection method based on spatial keyword search provided by an embodiment of this application;
Fig. 3 is a flowchart of index building in the result data selection method based on spatial keyword search provided by an embodiment of this application;
Fig. 4 is a schematic structural diagram of a result data selection apparatus based on spatial keyword search provided by an embodiment of this application.
Detailed description
The core of this application is to provide a result data selection method based on spatial keyword search, a result data selection apparatus, a server, and a computer-readable storage medium. The boundary cost of each candidate spatial text object is determined from its distance coefficient and diversity topic count, and the candidate spatial text object with the smallest boundary cost is added to the result set, so that the objects in the result set stay close to the query object while their diversity topic counts stay high. In other words, selection based on the distance coefficient also takes the diversity of each search result into account, which improves the diversity of the result set and meets users' diversified search needs.
To make the purpose, technical solutions, and advantages of the embodiments of this application clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the drawings. The described embodiments are some, not all, of the embodiments of this application. All other embodiments obtained by those of ordinary skill in the art from the embodiments in this application without creative effort fall within the protection scope of this application.
Please refer to Fig. 1, which is a flowchart of a result data selection method based on spatial keyword search provided by an embodiment of this application.
This embodiment provides a result data selection method based on spatial keyword search that can improve the diversity of search results. The method may include:
S101: perform an index structure building operation on multiple spatial text objects to obtain an index structure;
This step builds an index structure over all the spatial text objects. Before a query can be answered according to the query conditions, an index structure must first be built so that the spatial text objects can be searched. The method of building the index structure is not limited here; any method that makes the spatial text objects searchable may be used in this embodiment.
A suitable index structure may also be built for the characteristics of different search environments and queries, which helps speed up the search and streamline the index structure, reducing the cost of maintaining and updating it.
The spatial text objects are obtained by formally representing the spatial text data to be searched. The formalization is described in later paragraphs.
S102: select multiple candidate spatial text objects with the index structure according to the obtained query object, to obtain a candidate set;
On the basis of step S101, this step selects multiple candidate spatial text objects according to the index structure and the query object, to obtain a candidate set. Its main purpose is to use the query object to pick candidates out of all the spatial text objects through the index, so that every candidate spatial text object satisfies the constraints of the query object.
Like the spatial text objects in the previous step, the query object is obtained by formally representing data. The formalization is described in later paragraphs.
S103: determine the distance between each candidate spatial text object and the query object, to obtain the distance coefficient of each candidate spatial text object with respect to the query object;
On the basis of step S102, this step determines the distance between each candidate spatial text object and the query object, to obtain the corresponding distance coefficient. The distance may be computed with a common spatial distance calculation. The purpose is to obtain the distance between each searched object and the query object so that the subsequent query process can filter by distance to obtain the best query results.
Of course, the way the distance is determined may vary with the formal representation of the candidate spatial text objects and the query object. The specific form should be chosen according to the actual application environment and is not limited here.
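As one possible illustration of step S103, the sketch below computes a great-circle distance from latitude and longitude and normalizes it into a coefficient. The application does not fix a distance formula, so the use of the haversine formula, the normalization by a maximum distance, and the names `haversine_km` and `distance_coefficient` are all assumptions made for this example.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = math.sin(d_phi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def distance_coefficient(query_loc, object_loc, max_distance_km):
    """Assumed normalization: distance divided by a maximum, capped at 1.0."""
    distance = haversine_km(*query_loc, *object_loc)
    return min(distance / max_distance_km, 1.0)
```

Any other metric, such as Euclidean distance on projected coordinates, could be substituted without changing the later steps, since they only consume the resulting coefficient.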
S104: determine the number of topics that each candidate spatial text object contains outside all topics of the initialized result set, to obtain the first diversity topic count of each candidate spatial text object;
On the basis of step S102, this step determines the number of topics that each candidate spatial text object contains outside all topics of the initialized result set, which gives the first diversity topic count of that candidate spatial text object.
In a general spatial keyword search, the selection process considers only the distance between each candidate spatial text object and the query object. The final result data consists of the candidate spatial text objects closest to the query object, so the result data is very similar to the query object. Such results cannot meet users' need for diversity in the query process: given a certain degree of similarity to the query object, the query results should be as diverse as possible, that is, the result data should include multiple types of points of interest related to a given keyword topic.
Further, in this step the diversity topic count of a candidate spatial text object is determined by the number of topics it contains outside all topics of the initialized result set. A topic is an attribute common to candidate spatial text objects and the query object, and is a general attribute used in spatial text search.
All topics of the initialized result set are the topics covered by all spatial text objects currently in the result set obtained by initialization. The topic diversity of a candidate spatial text object can therefore be measured by the number of its topics that fall outside this set. In practice, result data can be chosen by this diversity topic count during the search over spatial text objects, which improves the diversity of the search results so that the result data meets users' diversified needs.
In this embodiment, the result set is obtained by initialization, and the data added to it is chosen from the candidate set round by round. In the first round, the initialized result set may contain no spatial text objects and therefore cover no topics. Initialization may also add a certain number of result data items to the result set by another search method, after which the method of this embodiment adds more spatial text objects. In that case the initialized result set already covers a certain range of topics, from which the first diversity topic count of each candidate spatial text object can be determined.
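The count described in step S104 amounts to a set difference. A minimal sketch, assuming topics are represented as strings in Python sets (the function name `diversity_topic_count` is hypothetical):

```python
def diversity_topic_count(candidate_topics, covered_topics):
    """Number of the candidate's topics not yet covered by the result set."""
    return len(set(candidate_topics) - set(covered_topics))
```

With an empty initialized result set, every topic of a candidate counts; once the result set covers some topics, only the remaining ones contribute.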
It is conceivable that the initialization in this embodiment may itself be another spatial keyword search process. That is, this embodiment may be applied after another spatial keyword search process to improve the diversity of that search. Likewise, another search method may be applied after the spatial keyword search process of this embodiment to select more suitable search results.
It should be noted that there is no required execution order between this step and step S103.
S105: determine the first boundary cost of each candidate spatial text object according to all the distance coefficients and all the first diversity topic counts; wherein the first boundary cost is directly proportional to the distance coefficient and inversely proportional to the first diversity topic count;
On the basis of steps S103 and S104, this step determines the first boundary cost of each candidate spatial text object from the distance coefficient and first diversity topic count obtained in those steps. The boundary cost is introduced as a unified measure of both the similarity and the diversity of an object. Because the first boundary cost is directly proportional to the distance coefficient and inversely proportional to the first diversity topic count, a smaller boundary cost corresponds to a smaller distance coefficient and a larger first diversity topic count.
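The application states only the direction of the two relationships, not a formula. One simple form that satisfies both is the ratio of the distance coefficient to the diversity topic count; the sketch below uses that ratio as an assumption, and also assumes that a candidate contributing no new topic gets an infinite cost. The name `boundary_cost` is hypothetical.

```python
import math


def boundary_cost(distance_coefficient, diversity_topic_count):
    """Assumed form: cost grows with distance and shrinks with new topics."""
    if diversity_topic_count == 0:
        return math.inf  # assumption: no new topic means the candidate is never preferred
    return distance_coefficient / diversity_topic_count
```

Other monotone combinations, for example with weights on either factor, would also match the stated proportionality.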
S106: add the candidate spatial text object with the smallest first boundary cost to the result set.
On the basis of step S105, this step adds the candidate spatial text object with the smallest first boundary cost to the result set. That is, the candidate in the candidate set with the best combination of distance coefficient and first diversity topic count is chosen as one item of the result data, which improves the diversity of the result data and meets users' diversity requirements.
In summary, this embodiment determines the boundary cost of each candidate spatial text object from its distance coefficient and diversity topic count and adds the candidate spatial text object with the smallest boundary cost to the result set, so that the objects in the result set stay close to the query object while their diversity topic counts stay high. In other words, selection based on the distance coefficient also takes the diversity of each search result into account, which improves the diversity of the result set and meets users' diversified search needs.
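A single selection round covering steps S104 to S106 can be sketched as follows. The cost is taken as distance coefficient divided by the number of new topics, which is an assumed form consistent with the stated proportionality, and the function name `select_min_cost` and the tuple layout are hypothetical.

```python
import math


def select_min_cost(candidates, covered_topics):
    """Return the name of the candidate with the smallest boundary cost.

    candidates: list of (name, distance_coefficient, topics) tuples.
    covered_topics: set of topics already covered by the result set.
    """
    best_name, best_cost = None, math.inf
    for name, distance, topics in candidates:
        new_topics = len(set(topics) - covered_topics)
        # Assumed cost form: distance / new topics; infinite if nothing new.
        cost = distance / new_topics if new_topics else math.inf
        if cost < best_cost:
            best_name, best_cost = name, cost
    return best_name
```

In this sketch a slightly farther object that brings two new topics can beat a nearer object that brings only one, which is the trade-off the embodiment describes.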
Optionally, in this embodiment, multiple candidate spatial text objects may also be selected from all the spatial text objects with the index structure and a greedy algorithm according to the obtained query object, to obtain the candidate set.
The main purpose of this alternative is to select candidate spatial text objects from all spatial text objects to obtain the candidate set; its distinguishing feature is that suitable candidates are selected by a greedy algorithm. The boundary-cost-based selection in the above embodiment cannot guarantee that the result data meets given requirements on diversity and distance at the same time, so to improve the quality of the query results, this alternative uses a greedy algorithm to select the candidate spatial text objects.
The core idea of this alternative is to divide the spatial text objects into layers according to their spatial distance from the query object, and then select a certain number of spatial text objects from each layer such that the selected objects cover more not-yet-covered topics than the other spatial text objects in the same layer.
Specifically, given a set D of spatial objects and a spatial keyword query q, suppose that k spatial objects can be found in D that satisfy the diversification requirement (they cover enough topics) and minimize the distance function, and let M be the sum of their spatial distances from q. Assuming at first that M may be small, the search starts in a circular range of small radius centered at the position of q. If the results found within this range cannot cover enough topics, M is enlarged and the search continues in a larger range. For each circular search range, the objects in the range are divided into layers by their distance from the query point, and suitable objects are then selected from each layer.
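The expanding, layered search can be sketched roughly as below. The application does not give layer widths, the growth rule for the search range, or the number of objects taken per layer, so the fixed `layer_width`, the doubling of `radius`, the `per_layer` limit, and the function name `layered_candidates` are all assumptions for illustration.

```python
def layered_candidates(objects, radius, layer_width, per_layer, threshold, max_radius):
    """objects: list of (name, distance_to_query, topics) tuples."""
    while True:
        # Bucket the objects inside the current circle into distance layers.
        layers = {}
        for obj in objects:
            if obj[1] <= radius:
                layers.setdefault(int(obj[1] // layer_width), []).append(obj)

        chosen, covered = [], set()
        for layer_id in sorted(layers):
            pool = list(layers[layer_id])
            for _ in range(per_layer):
                if not pool:
                    break
                # Greedy step: take the object adding the most uncovered topics.
                best = max(pool, key=lambda o: len(set(o[2]) - covered))
                if not set(best[2]) - covered:
                    break
                pool.remove(best)
                chosen.append(best[0])
                covered |= set(best[2])

        if len(covered) >= threshold or radius >= max_radius:
            return chosen, covered
        radius *= 2  # not enough topics yet: widen the search circle
```

The `max_radius` guard is added here only so the sketch terminates when the data cannot cover `threshold` topics at all.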
Further, the greedy algorithm of this alternative may also be applied after the result data is obtained. That is, the result set obtained in this embodiment is filtered again by the greedy algorithm, which can improve the accuracy of the original result set by bringing it closer to the query object.
Based on the above embodiment, the spatial text object, the query object, and related concepts may be formally expressed as follows.
1) Formal representation of a spatial text object
A spatial text object is represented by a point o = {loc, term, topic} in two-dimensional space, described by position coordinates and text. Here loc consists of a longitude and a latitude and gives the position of object o; term is the set of keywords describing object o; topic is the set of topics covered by object o.
For example, in a map application, a spatial text object corresponds to a point of interest, i.e. a business or institution, for which the system records a position and a text description. The topics covered by the point of interest can be obtained by manual labeling or by analyzing its review information with natural language processing techniques. For convenience, a spatial text object may also be called a spatial object.
Based on the above definition, the set of all spatial text objects in the database is denoted D, i.e. $D = \{o_1, o_2, \ldots, o_n\}$.
2) Formal representation of a spatial keyword query
A spatial keyword query is formalized as q = {loc, term}, where q is the query object. Here loc is the query point, i.e. the user's position, expressed in two-dimensional space by longitude and latitude coordinates; term is the set of keywords entered by the user, such as "Chinese restaurant", which describes the user's query intent.
Given a query object q, the search engine selects from the data set D the k spatial text objects that best match q as the returned result data. The result data is a set of spatial text objects that are close in distance, highly relevant in text, and highly diversified among themselves.
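The two formal definitions above, o = {loc, term, topic} and q = {loc, term}, map naturally onto small record types. The following is a minimal sketch; the class names and the helper `matches_keywords` are hypothetical, and the coordinates in the usage note are made-up sample values.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple


@dataclass(frozen=True)
class SpatialTextObject:
    loc: Tuple[float, float]   # (longitude, latitude) of the object
    term: FrozenSet[str]       # keywords describing the object
    topic: FrozenSet[str]      # topics covered by the object


@dataclass(frozen=True)
class SpatialKeywordQuery:
    loc: Tuple[float, float]   # (longitude, latitude) of the user
    term: FrozenSet[str]       # keywords entered by the user


def matches_keywords(o: SpatialTextObject, q: SpatialKeywordQuery) -> bool:
    """True when the object contains every query keyword."""
    return q.term <= o.term
```

For example, an object with keywords {"chinese", "restaurant"} and topic {"sichuan"} matches a query whose only keyword is "restaurant", but not one asking for "sushi".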
3) Formal representation of a candidate set
Given a spatial text object database D, a spatial keyword query object q = {loc, term}, and a threshold Thre, a subset S of D (i.e. $S \subseteq D$) is called a candidate set if and only if it satisfies two conditions:
Keyword constraint: every spatial text object o in S contains all the query keywords, i.e. $\forall o \in S,\ q.term \subseteq o.term$;
Diversification requirement: the total number of distinct topics covered by all spatial text objects in S is not less than Thre, i.e. $\left|\bigcup_{o \in S} o.topic\right| \ge Thre$.
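The two candidate set conditions can be checked directly with set operations. A minimal sketch, assuming each object is given as a pair of keyword set and topic set (the function name `is_candidate_set` is hypothetical):

```python
def is_candidate_set(subset, query_terms, threshold):
    """subset: list of (terms, topics) pairs, one per spatial text object."""
    # Keyword constraint: every object contains all query keywords.
    keyword_ok = all(set(query_terms) <= set(terms) for terms, _ in subset)
    # Diversification requirement: enough distinct topics overall.
    covered = set().union(*(topics for _, topics in subset))
    return keyword_ok and len(covered) >= threshold
```

Note that the topic count is taken over the union, so a topic shared by several objects is counted once.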
4) Formal representation of the distance function of spatial text objects
Given a spatial text object database D and a spatial keyword query object q = {loc, term}, for a subset R of D with k elements (i.e. $R \subseteq D$, $|R| = k$), we define a distance function f(q, R) between the set R and q.
Here Dist(q, o) denotes the distance between spatial text object o and query object q, and Dist_max denotes the maximum distance between a spatial text object in data set D and the query object. The distance between the query object and a set of spatial text objects is normalized, so its value lies in the interval [0, 1].
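The formula of the distance function is not reproduced in this text. Purely as an assumption, one form that is consistent with the description (built from Dist(q, o) and Dist_max, with values in [0, 1]) is the mean of the normalized distances over the k objects in R. The sketch below implements that assumed form; it should not be read as the formula of the application, and the name `set_distance` is hypothetical.

```python
def set_distance(distances, dist_max):
    """Assumed form of f(q, R): mean of Dist(q, o) / Dist_max over o in R.

    distances: list of Dist(q, o) values, one per object in R (non-empty).
    dist_max: maximum distance between any object in D and the query.
    """
    k = len(distances)
    return sum(d / dist_max for d in distances) / k
```

Because every Dist(q, o) is at most Dist_max, each term lies in [0, 1] and so does their mean.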
5) Formal definition of the search problem
Given a spatial text object data set D, a spatial keyword query q = {loc, term}, a distance function f and a threshold Thre, and taking into account the spatial distance, text relevance and topic coverage between the spatial text objects and the query, the goal is to return k spatial text objects that satisfy the following two similarity conditions:
1. These k spatial text objects form a candidate set R; and
2. f(q, R) attains its minimum value.
Please refer to FIG. 2, which is a flowchart of the subsequent selection process of the result data selection method based on spatial keyword search provided by an embodiment of the present application.
Based on the previous embodiment, this embodiment mainly expands on what is done after the candidate spatial text object with the minimum first boundary cost has been added to the result set. The remaining parts are substantially the same as in the previous embodiment; for the identical parts, reference may be made to the previous embodiment, and they are not repeated here.
This embodiment may include:
S201: when the candidate spatial text object with the minimum first boundary cost has been selected and added to the result set, determine, for each candidate spatial text object, the number of topics it contains beyond all topics covered after that addition, obtaining the second diversity number of topics of each candidate spatial text object;
This step addresses the situation in which the candidate spatial text object with the minimum first boundary cost has been added to the result set in the previous embodiment. The spatial text objects in the result set have then changed, so the range of topics covered by the result set has changed accordingly. To keep improving the diversity of the candidate spatial text objects chosen from the candidate set, it is necessary to determine, for each candidate spatial text object, the number of topics it contains beyond all topics covered after that addition, obtaining the second diversity number of topics.
Because the result set now contains the candidate spatial text object selected in the first embodiment, the range of topics covered by the result set has also changed, that is, the topic range against which diversity is measured has changed. The main purpose of this step is therefore to recalculate the diversity number of topics of all candidate spatial text objects on the basis of the updated result set, i.e. the second diversity number of topics.
S202: determine the second boundary cost of each corresponding candidate spatial text object according to all distance coefficients and all second diversity numbers of topics; wherein the distance coefficient is directly proportional to the second boundary cost, and the second diversity number of topics is inversely proportional to the second boundary cost;
On the basis of step S201, this step determines the second boundary cost from the distance coefficients of the previous embodiment and the second diversity numbers of topics. The details are substantially the same as in the previous embodiment, to which reference may be made, and are not repeated here.
S203: select the candidate spatial text object with the minimum second boundary cost and add it to the result set.
On the basis of step S202, this step adds the candidate spatial text object with the minimum second boundary cost to the result set.
Each time result data is added during a spatial keyword search, the range of topics covered changes accordingly. This embodiment therefore explains how subsequent result data is added so as to maintain the diversity of the result data. The steps illustrated in this embodiment can be repeated multiple times, requiring only adaptive modifications on the basis of this embodiment, the details of which are not repeated here.
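Repeating steps S201 to S203 amounts to a greedy loop, which can be sketched in Python as follows. The text only states that the boundary cost is directly proportional to the distance coefficient and inversely proportional to the diversity number of topics; taking the cost as their quotient is an assumption made for this sketch, and the function names and the tuple layout of a candidate are hypothetical.

```python
def boundary_cost(distance_coefficient, topics, covered):
    """Assumed cost: distance coefficient divided by the number of new topics."""
    new_topics = len(set(topics) - covered)
    if new_topics == 0:
        # An object that adds no new topic is treated as maximally costly.
        return float("inf")
    return distance_coefficient / new_topics


def greedy_select(candidates, k):
    """candidates: list of (oid, distance_coefficient, topics) tuples."""
    covered = set()          # topics already covered by the result set
    result = []
    remaining = list(candidates)
    while remaining and len(result) < k:
        # Recompute every cost against the current result set (S201, S202).
        best = min(remaining, key=lambda c: boundary_cost(c[1], c[2], covered))
        # Add the cheapest candidate to the result set (S203).
        result.append(best[0])
        covered |= set(best[2])
        remaining.remove(best)
    return result
```

With candidates a and b lying close to the query but sharing the same topics, and a more distant c covering a new topic, the loop picks a first and then c, whereas ranking by distance alone would pick a and b.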
Please refer to FIG. 3, which is a flowchart of index establishment in the result data selection method based on spatial keyword search provided by an embodiment of the present application.
Based on the previous embodiment, this embodiment mainly explains how the index structure of the previous embodiment is established; for the other parts, reference may be made to the previous embodiment, and they are not repeated here.
This embodiment may include:
S301: determine the keyword occurrence count of each spatial text object;
S302: set the spatial text objects whose keyword occurrence count is less than a preset number as block structures, obtaining multiple block structures;
S303: set the spatial text objects whose keyword occurrence count is greater than or equal to the preset number as tree structures, obtaining multiple tree structures;
S304: use all block structures and all tree structures as the index structure.
In existing spatial keyword queries, index structures fall into three categories: space-first index structures, text-first index structures, and index structures that tightly combine the two. Space-first index structures can be further divided into those based on R-trees, grids and space-filling curves; text-first index structures are mainly based on inverted files and bitmaps; combined spatial-textual index structures tightly integrate these structures to filter out more efficiently the spatial text objects that do not meet the requirements of the query object. However, as the data volume grows, all of these index structures become extremely large, so the space occupied by the index rises sharply and updates become slower, which degrades the experience in practical applications.
Therefore, in this embodiment a different index structure is assigned to each spatial text object according to its keyword occurrence count, that is, the index structures of spatial text objects are handled by category. Objects with a low keyword occurrence frequency are set as block structures, which are suited to storing a large number of objects. Objects with a high keyword occurrence frequency are set as tree structures, which make it easier to find relevant objects during a search.
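The split into block structures and tree structures can be sketched in Python under one possible reading of steps S301 to S304, namely that the occurrence count is taken per keyword across the data set and that objects are grouped by keyword. This reading, the function name build_index, and the use of a sorted list as a stand-in for a real tree are all assumptions made for illustration.

```python
from collections import defaultdict


def build_index(objects, preset_times):
    """objects: dict mapping object id -> iterable of keywords."""
    # S301: count, for each keyword, the objects in which it occurs.
    postings = defaultdict(list)
    for oid, terms in objects.items():
        for term in set(terms):
            postings[term].append(oid)

    blocks, trees = {}, {}
    for term, oids in postings.items():
        if len(oids) < preset_times:
            # S302: infrequent keywords are kept in a flat block.
            blocks[term] = list(oids)
        else:
            # S303: frequent keywords would get a spatial tree;
            # a sorted list stands in for that tree in this sketch.
            trees[term] = sorted(oids)
    # S304: blocks and trees together form the index structure.
    return blocks, trees
```

A real implementation would replace the sorted list with a spatial tree whose nodes carry minimum bounding rectangles.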
During an object search, the corresponding search operation can be completed by searching the different tree structures and block structures. For a tree structure that satisfies the conditions of the query object, its object nodes can be accessed in increasing order of minimum boundary cost, where the minimum boundary cost can be defined as:
where N denotes a node of the tree structure, Dist(q, N.mbr) is the spatial distance from the minimum bounding rectangle of N to the query, and |Occur_i=1| is the number of occurrences in the index structure of N (i.e. the number of topics covered by N).
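Accessing tree nodes in increasing order of minimum boundary cost can be sketched in Python with a priority queue. The formula itself is not reproduced in this text, so taking the node cost as Dist(q, N.mbr) divided by the number of topics covered by N is an assumption, as are the dictionary-based node layout and the function names.

```python
import heapq
import math


def min_dist(q, mbr):
    """Distance from point q to rectangle mbr = (xmin, ymin, xmax, ymax)."""
    xmin, ymin, xmax, ymax = mbr
    dx = max(xmin - q[0], 0.0, q[0] - xmax)
    dy = max(ymin - q[1], 0.0, q[1] - ymax)
    return math.hypot(dx, dy)


def node_cost(q, node):
    # Assumed form: MBR distance divided by the number of covered topics.
    if node["topics"] == 0:
        return math.inf
    return min_dist(q, node["mbr"]) / node["topics"]


def best_first(q, root):
    """Yield object ids of leaf nodes in increasing order of node cost."""
    counter = 0  # tie-breaker so that node dicts are never compared
    heap = [(node_cost(q, root), counter, root)]
    while heap:
        _, _, node = heapq.heappop(heap)
        if "children" in node:
            for child in node["children"]:
                counter += 1
                heapq.heappush(heap, (node_cost(q, child), counter, child))
        else:
            yield node["oid"]
```

The ordering is correct as long as a parent never costs more than its children, which holds here when a parent's rectangle encloses its children and its topic count is at least theirs.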
An embodiment of the present application provides a result data selection method based on spatial keyword search. The boundary cost of each candidate spatial text object is determined from the distance coefficient and the diversity number of topics, and the candidate spatial text object with the minimum boundary cost is added to the result set, so that the objects in the result set are close to the query object while the diversity number of topics remains high. In other words, selection is based on the distance coefficient while the diversity of each search result is also taken into account, which improves the diversity of the result set and meets the user's need for diversified search results.
A result data selection device based on spatial keyword search provided by an embodiment of the present application is introduced below. The device described below and the result data selection method based on spatial keyword search described above may be referred to in correspondence with each other.
Please refer to FIG. 4, which is a schematic structural diagram of a result data selection device based on spatial keyword search provided by an embodiment of the present application.
This embodiment provides a result data selection device based on spatial keyword search, which may include:
An index establishing module 100, configured to perform an index structure establishing operation on multiple spatial text objects to obtain an index structure;
A candidate set acquisition module 200, configured to select multiple candidate spatial text objects using the index structure according to the obtained query object, to obtain a candidate set;
A distance coefficient acquisition module 300, configured to determine the distance between each candidate spatial text object and the query object, to obtain the distance coefficient between each candidate spatial text object and the query object;
A first diversity number of topics acquisition module 400, configured to determine the number of topics that each candidate spatial text object contains beyond all initialized topics, to obtain the first diversity number of topics of each candidate spatial text object;
A first boundary cost acquisition module 500, configured to determine the first boundary cost of each candidate spatial text object according to all distance coefficients and all first diversity numbers of topics; wherein the distance coefficient is directly proportional to the first boundary cost, and the first diversity number of topics is inversely proportional to the first boundary cost;
A first result data selection module 600, configured to select the candidate spatial text object with the minimum first boundary cost and add it to the result set.
Based on the above embodiment, the device may further include:
A second diversity number of topics acquisition module, configured to, when the candidate spatial text object with the minimum first boundary cost has been selected and added to the result set, determine, for each candidate spatial text object, the number of topics it contains beyond all topics covered after that addition, to obtain the second diversity number of topics of each candidate spatial text object;
A second boundary cost acquisition module, configured to determine the second boundary cost of each corresponding candidate spatial text object according to all distance coefficients and all second diversity numbers of topics; wherein the distance coefficient is directly proportional to the second boundary cost, and the second diversity number of topics is inversely proportional to the second boundary cost;
A second result data selection module, configured to select the candidate spatial text object with the minimum second boundary cost and add it to the result set.
The index establishing module 100 may include:
A keyword occurrence count acquisition unit, configured to determine the keyword occurrence count of each spatial text object;
A block structure acquisition unit, configured to set the spatial text objects whose keyword occurrence count is less than a preset number as block structures, to obtain multiple block structures;
A tree structure acquisition unit, configured to set the spatial text objects whose keyword occurrence count is greater than or equal to the preset number as tree structures, to obtain multiple tree structures;
An index structure acquisition unit, configured to use all block structures and all tree structures as the index structure.
The candidate set acquisition module 200 may include:
A candidate set acquisition unit, configured to select, according to the obtained query object and using the index structure, multiple candidate spatial text objects from all spatial text objects by a greedy algorithm, to obtain the candidate set.
An embodiment of the present application further provides a server, including:
A memory, configured to store a computer program;
A processor, configured to implement the result data selection method of the above embodiments when executing the computer program.
An embodiment of the present application further provides a computer-readable storage medium on which a computer program is stored; when the computer program is executed by a processor, the result data selection method of the above embodiments is implemented.
The embodiments in this specification are described in a progressive manner. Each embodiment focuses on its differences from the other embodiments, and the identical or similar parts of the embodiments may be referred to one another. Since the device disclosed in the embodiments corresponds to the method disclosed in the embodiments, its description is relatively brief; for the relevant details, refer to the description of the method.
Those skilled in the art will further appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of the two. To clearly illustrate the interchangeability of hardware and software, the composition and steps of each example have been described above in general terms of their functions. Whether these functions are implemented in hardware or software depends on the specific application and the design constraints of the technical solution. Those skilled in the art may implement the described functions in different ways for each specific application, but such implementations should not be considered to go beyond the scope of the present application.
The steps of the method or algorithm described in connection with the embodiments disclosed herein may be implemented directly in hardware, in a software module executed by a processor, or in a combination of the two. The software module may reside in random access memory (RAM), internal memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, a register, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the technical field.
The result data selection method based on spatial keyword search, the result data selection device, the server and the computer-readable storage medium provided by the present application have been described in detail above. Specific examples are used herein to explain the principles and implementations of the present application, and the description of the above embodiments is intended only to help in understanding the method of the present application and its core idea. It should be noted that those of ordinary skill in the art may make improvements and modifications to the present application without departing from its principles, and these improvements and modifications also fall within the scope of protection of the claims of the present application.

Claims (10)

1. A result data selection method based on spatial keyword search, characterized by comprising:
performing an index structure establishing operation on multiple spatial text objects to obtain an index structure;
selecting multiple candidate spatial text objects using the index structure according to an obtained query object, to obtain a candidate set;
determining the distance between each candidate spatial text object and the query object, to obtain the distance coefficient between each candidate spatial text object and the query object;
determining the number of topics that each candidate spatial text object contains beyond all initialized topics, to obtain the first diversity number of topics of each candidate spatial text object;
determining the first boundary cost of each candidate spatial text object according to all the distance coefficients and all the first diversity numbers of topics; wherein the distance coefficient is directly proportional to the first boundary cost, and the first diversity number of topics is inversely proportional to the first boundary cost;
selecting the candidate spatial text object with the minimum first boundary cost and adding it to a result set.
2. The result data selection method according to claim 1, characterized by further comprising:
when the candidate spatial text object with the minimum first boundary cost has been selected and added to the result set, determining, for each candidate spatial text object, the number of topics it contains beyond all topics covered after that addition, to obtain the second diversity number of topics of each candidate spatial text object;
determining the second boundary cost of each corresponding candidate spatial text object according to all the distance coefficients and all the second diversity numbers of topics; wherein the distance coefficient is directly proportional to the second boundary cost, and the second diversity number of topics is inversely proportional to the second boundary cost;
selecting the candidate spatial text object with the minimum second boundary cost and adding it to the result set.
3. The result data selection method according to claim 2, characterized in that performing an index structure establishing operation on multiple spatial text objects to obtain an index structure comprises:
determining the keyword occurrence count of each spatial text object;
setting the spatial text objects whose keyword occurrence count is less than a preset number as block structures, to obtain multiple block structures;
setting the spatial text objects whose keyword occurrence count is greater than or equal to the preset number as tree structures, to obtain multiple tree structures;
using all the block structures and all the tree structures as the index structure.
4. The result data selection method according to claim 3, characterized in that selecting multiple candidate spatial text objects using the index structure according to the obtained query object, to obtain a candidate set, comprises:
selecting, according to the obtained query object and using the index structure, multiple candidate spatial text objects from all the spatial text objects by a greedy algorithm, to obtain the candidate set.
5. A result data selection device based on spatial keyword search, characterized by comprising:
an index establishing module, configured to perform an index structure establishing operation on multiple spatial text objects to obtain an index structure;
a candidate set acquisition module, configured to select multiple candidate spatial text objects using the index structure according to an obtained query object, to obtain a candidate set;
a distance coefficient acquisition module, configured to determine the distance between each candidate spatial text object and the query object, to obtain the distance coefficient between each candidate spatial text object and the query object;
a first diversity number of topics acquisition module, configured to determine the number of topics that each candidate spatial text object contains beyond all initialized topics, to obtain the first diversity number of topics of each candidate spatial text object;
a first boundary cost acquisition module, configured to determine the first boundary cost of each candidate spatial text object according to all the distance coefficients and all the first diversity numbers of topics; wherein the distance coefficient is directly proportional to the first boundary cost, and the first diversity number of topics is inversely proportional to the first boundary cost;
a first result data selection module, configured to select the candidate spatial text object with the minimum first boundary cost and add it to a result set.
6. The result data selection device according to claim 5, characterized by further comprising:
a second diversity number of topics acquisition module, configured to, when the candidate spatial text object with the minimum first boundary cost has been selected and added to the result set, determine, for each candidate spatial text object, the number of topics it contains beyond all topics covered after that addition, to obtain the second diversity number of topics of each candidate spatial text object;
a second boundary cost acquisition module, configured to determine the second boundary cost of each corresponding candidate spatial text object according to all the distance coefficients and all the second diversity numbers of topics; wherein the distance coefficient is directly proportional to the second boundary cost, and the second diversity number of topics is inversely proportional to the second boundary cost;
a second result data selection module, configured to select the candidate spatial text object with the minimum second boundary cost and add it to the result set.
7. The result data selection device according to claim 6, characterized in that the index establishing module comprises:
a keyword occurrence count acquisition unit, configured to determine the keyword occurrence count of each spatial text object;
a block structure acquisition unit, configured to set the spatial text objects whose keyword occurrence count is less than a preset number as block structures, to obtain multiple block structures;
a tree structure acquisition unit, configured to set the spatial text objects whose keyword occurrence count is greater than or equal to the preset number as tree structures, to obtain multiple tree structures;
an index structure acquisition unit, configured to use all the block structures and all the tree structures as the index structure.
8. The result data selection device according to claim 7, characterized in that the candidate set acquisition module comprises:
a candidate set acquisition unit, configured to select, according to the obtained query object and using the index structure, multiple candidate spatial text objects from all the spatial text objects by a greedy algorithm, to obtain the candidate set.
9. A server, characterized by comprising:
a memory, configured to store a computer program;
a processor, configured to implement the result data selection method according to any one of claims 1 to 4 when executing the computer program.
10. A computer-readable storage medium, characterized in that a computer program is stored on the computer-readable storage medium, and when the computer program is executed by a processor, the result data selection method according to any one of claims 1 to 4 is implemented.
CN201810184309.9A 2018-03-06 2018-03-06 Result data selection method based on space keyword search and related device Active CN108304585B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810184309.9A CN108304585B (en) 2018-03-06 2018-03-06 Result data selection method based on space keyword search and related device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810184309.9A CN108304585B (en) 2018-03-06 2018-03-06 Result data selection method based on space keyword search and related device

Publications (2)

Publication Number Publication Date
CN108304585A true CN108304585A (en) 2018-07-20
CN108304585B CN108304585B (en) 2022-05-17

Family

ID=62849191

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810184309.9A Active CN108304585B (en) 2018-03-06 2018-03-06 Result data selection method based on space keyword search and related device

Country Status (1)

Country Link
CN (1) CN108304585B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104834679A (en) * 2015-04-14 2015-08-12 苏州大学 Representation and inquiry method of behavior track and device therefor
CN105069094A (en) * 2015-08-06 2015-11-18 苏州大学 Semantic understanding based space keyword indexing method
CN106503223A (en) * 2016-11-04 2017-03-15 华东师范大学 A kind of binding site and the online source of houses searching method and device of key word information
CN107145545A (en) * 2017-04-18 2017-09-08 东北大学 Top k zone users text data recommends method in a kind of location-based social networks

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
JIABAO SUN等: "Interactive Spatial Keyword Querying with Semantics", 《 PROCEEDINGS OF THE 2017 ACM ON CONFERENCE ON INFORMATION AND KNOWLEDGE MANAGEMENT》 *
ZHIHU QIAN等: "On Efficient Spatial Keyword Querying with Semantics", 《INTERNATIONAL CONFERENCE ON DATABASE SYSTEMS FOR ADVANCED APPLICATIONS》 *
ZHIHU QIAN等: "Semantic-aware top-k spatial keyword queries", 《WORLD WIDE WEB》 *
梁银等: "基于对象集合的空间关键词查询", 《计算机应用》 *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112149005A (en) * 2019-06-27 2020-12-29 腾讯科技(深圳)有限公司 Method, apparatus, device and readable storage medium for determining search results
CN112149005B (en) * 2019-06-27 2023-09-01 腾讯科技(深圳)有限公司 Method, apparatus, device and readable storage medium for determining search results
CN112632267A (en) * 2020-12-04 2021-04-09 中国人民大学 Search result diversification system combining global interaction and greedy selection
CN112632267B (en) * 2020-12-04 2023-05-02 中国人民大学 Global interaction and greedy selection combined search result diversification system
CN113065036A (en) * 2021-04-14 2021-07-02 深圳大学 Method and device for measuring performance of space supporting point and related components

Also Published As

Publication number Publication date
CN108304585B (en) 2022-05-17

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant