CN116341567B - Interest point semantic labeling method and system based on space and semantic neighbor information - Google Patents

Info

Publication number
CN116341567B
Authority
CN
China
Prior art keywords
interest
semantic
vector
points
point
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202310614884.9A
Other languages
Chinese (zh)
Other versions
CN116341567A (en
Inventor
陈勐
张大滨
王宇
郭衍民
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shandong Institute Of Industrial Technology
Shandong University
Original Assignee
Shandong Institute Of Industrial Technology
Shandong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shandong Institute Of Industrial Technology, Shandong University
Priority to CN202310614884.9A
Publication of CN116341567A
Application granted
Publication of CN116341567B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G06F18/253 Fusion techniques of extracted features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/0464 Convolutional networks [CNN, ConvNet]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Abstract

The invention belongs to the field of data mining and relates in particular to a method and a system for semantic labeling of points of interest based on spatial and semantic neighbor information. For spatial information, category embedding is first performed on the points of interest, and the points of interest are constructed into a spatial graph in which the points of interest are the nodes and the category embeddings are the node features; neighbor information of the points of interest is then captured to learn their spatial features and obtain spatial feature vectors. For semantic information, pre-trained text vectors of the points of interest are obtained, the k nearest semantic neighbors are found according to the pre-trained text vectors, and semantic neighbor information is captured with an attention mechanism to obtain text feature vectors. Finally, the spatial feature vector, the text feature vector, and the pre-trained text vector are concatenated and fed into a multi-layer neural network for semantic labeling. The data used by the method is easy to obtain, and the method is widely applicable, more robust, and better at addressing the sparsity of semantic information.

Description

Interest point semantic labeling method and system based on space and semantic neighbor information
Technical Field
The invention belongs to the field of data mining, and particularly relates to a method and a system for labeling interest point semantics based on space and semantic neighbor information.
Background
Spatio-temporal data mining based on geospatial data has led to many useful applications, and point of interest (POI) data has proven effective in many urban computing and service tasks. However, manual annotation of point-of-interest category labels is an uncertain process, so labels are often missing or incorrect; semantic labeling of points of interest that lack category labels is therefore of significant value.
Many methods have been proposed, in China and abroad, for the point-of-interest semantic annotation task, but they have the following limitations. First, additional multi-modal data and user check-in data are often costly, difficult to acquire, and prone to invading privacy; the need for these data sources greatly limits the generalization capability of many existing approaches. Second, existing modeling of spatial information relies on rigid grid partitioning; finding appropriate parameters for partitioning the grid takes considerable time and some experience, and points of interest on grid boundaries yield biased information.
Disclosure of Invention
In order to solve the above problems, the invention provides a method and a system for semantic labeling of points of interest based on spatial and semantic neighbor information. The invention mines feature representations of points of interest from their spatial neighbor information and semantic neighbor information, which models information transfer between points of interest more naturally and better addresses the sparsity of semantic information.
In order to achieve the above object, the present invention mainly includes the following two aspects:
In a first aspect, the present invention provides a method for semantic labeling of points of interest based on spatial and semantic neighbor information, including:
Step 1: use a one-hot vector as the category embedding of each point of interest; construct a graph structure from the longitude and latitude of the points of interest using Delaunay triangulation, with the points of interest as graph nodes; initialize the node features as the category embeddings of the points of interest; and initialize each node without a category label as the average of the embedding vectors of its k nearest nodes that have category labels.
Step 2: obtain the pre-trained text vector of each point of interest, calculate the distance between each point of interest without a category label and the points of interest with category labels using the cosine distance formula, and find the k nearest semantic neighbors by heap sort.
Step 3: obtain the spatial feature vector and the text feature vector of each point of interest, concatenate them into a fused feature vector, concatenate the fused feature vector with the pre-trained text vector, and predict the probability of each point-of-interest category with a classifier. The parameters of the model are learned by training, i.e. by minimizing the loss function.
Step 4: when predicting the categories of urban points of interest, input the graph structure of the urban points of interest, the category embeddings of the points of interest that have category labels, the point-of-interest names, the category embeddings of the semantic neighbors, and the pre-trained text vectors, so as to predict the point-of-interest categories.
In some embodiments, in step 1, the category embedding is expressed as $e_c = [0, \dots, 1, \dots, 0]$, where the length of $e_c$ equals the number of categories, $e_c$ represents the individual category $c$, and only the $c$-th element of $e_c$ is 1.
In some embodiments, in step 1, the distance $d_{ij}$ between nodes $v_i$ and $v_j$ in the graph structure is expressed as: [formula not recoverable from the source text]
where $L$ is the diagonal length of the smallest rectangle containing all points of interest, $\mathrm{dist}(v_i, v_j)$ is the distance between points of interest $v_i$ and $v_j$, and $\log(\cdot)$ is the logarithmic function; the distance $d_{ij}$ is finally normalized to the range 0 to 1.
In some embodiments, in step 2, the pre-trained text vector is obtained as follows: the point-of-interest name is divided into a character sequence and a word sequence; the pre-trained vectors of the characters and of the words in the two sequences are looked up; the vectors in each sequence are summed and averaged to obtain a character-level vector and a word-level vector; and the two vectors are concatenated to obtain the pre-trained text vector of the point of interest.
In some embodiments, in step 3, the graph structure is fed into a graph convolutional neural network (GCN) based spatial encoder to obtain spatial feature vectors for points of interest.
Further, specifically, the spatial feature vectors are extracted with a one-layer graph convolutional neural network:
$H^{s} = \mathrm{softmax}(\hat{A} X \theta)$, where $\hat{A}$ is the adjacency matrix $A$ of the graph structure normalized by $D$, the degree matrix of the adjacency matrix; $X$ is the initial feature matrix of the nodes; $\theta$ is a learnable linear transformation applied to each node; $\mathrm{softmax}(\cdot)$ is the normalized exponential function; and $H^{s}$ is the resulting matrix of spatial feature vectors.
In some embodiments, in step 3, the point-of-interest names, the neighbor point-of-interest names, and the category embeddings are input into an attention-based text encoder to obtain text feature vectors.
Further, specifically: the name of the point of interest and the names of its semantic neighbors are input into a BERT (Bidirectional Encoder Representations from Transformers, a pre-trained language representation model) encoder to obtain the text feature $t_i$ of the $i$-th point of interest $p_i$ and the text features $t_{i1}, \dots, t_{ik}$ of its $k$ semantic neighbors. The distance between the text feature of the point of interest and each neighbor text feature is calculated with the cosine distance formula: $s_{ij} = \frac{t_i \cdot t_{ij}}{\lVert t_i \rVert \, \lVert t_{ij} \rVert}$, where $t_{ij}$ is the text feature of the $j$-th semantic neighbor of $p_i$, and $\lVert t_i \rVert$ and $\lVert t_{ij} \rVert$ are the moduli of the vectors $t_i$ and $t_{ij}$;
the cosine distance vector $s_i = [s_{i1}, \dots, s_{ik}]$ between point of interest $p_i$ and all of its semantic neighbors is then input into a fully connected layer to obtain the latent vector representation $z_i$: $z_i = W s_i + b$, where $W$ and $b$ are the trained weight and bias;
the attention weight between point of interest $p_i$ and its $j$-th neighbor point of interest $p_{ij}$ is then normalized with a softmax function:
$\alpha_{ij} = \frac{\exp(z_{ij})}{\sum_{m=1}^{k} \exp(z_{im})}$, where $j$ indexes the $j$-th neighbor of $p_i$; the summation index in the denominator is named $m$ to distinguish it from $j$ and runs from $1$ to $k$; $z_{ij}$ and $z_{im}$ are the distances between $p_i$ and its neighbors $p_{ij}$ and $p_{im}$ after transformation by one MLP (multi-layer perceptron) layer; $\exp(z_{ij})$ and $\exp(z_{im})$ are the values of the exponential function with base $e$ at $z_{ij}$ and $z_{im}$; the sum $\sum_{m=1}^{k}$ adds the exponential function values for all $k$ semantic neighbors of $p_i$; and $\alpha_{ij}$ is the attention weight between $p_i$ and its $j$-th neighbor point of interest $p_{ij}$;
finally, the text feature vector of the point of interest is obtained with the attention mechanism:
$h_i^{t} = \sum_{j=1}^{k} \alpha_{ij} e_{ij}$, where $e_{ij}$ is the category embedding vector of the neighbor point of interest $p_{ij}$.
In some embodiments, in step 3, the pre-trained text vector and the fused feature vector are concatenated as follows: the spatial feature vector $h_i^{s}$ and the text feature vector $h_i^{t}$ are concatenated, and after the two views are fused the pre-trained text vector is concatenated with the result to obtain the vector $x_i$, which is the input to the final multi-layer MLP:
$\hat{y}_i = \mathrm{softmax}(W_3 (W_2 (W_1 x_i + b_1) + b_2) + b_3)$, where $\hat{y}_i$ is the probability that point of interest $p_i$ belongs to each point-of-interest category; $W_1$, $W_2$, $W_3$ are the weights of layers 1, 2, and 3 of the MLP classifier; $b_1$, $b_2$, $b_3$ are the corresponding biases; $x_i$ is the vector obtained by concatenating the fused views with the pre-trained text vector; and $\mathrm{softmax}(\cdot)$ is the normalized exponential function, used as the activation function of the last layer of the MLP classifier.
Finally, the loss function (cross-entropy loss) is minimized using the point-of-interest categories, so that the known category information is propagated back through the deep network of the model: $\mathcal{L} = -\sum_{i=1}^{N} \sum_{c=1}^{C} y_{ic} \log \hat{y}_{ic}$
where $N$ is the number of points of interest in the dataset, $C$ is the number of categories, $y_{ic}$ indicates whether point of interest $p_i$ is labeled as category $c$, and $\hat{y}_{ic}$ is the predicted probability that $p_i$ is labeled as category $c$.
In a second aspect, the present invention provides a system for labeling semantic points of interest based on spatial and semantic neighbor information, comprising:
the spatial encoder module is used for mining the spatial information of the interest points based on the graph convolution neural network;
the text encoder module based on the attention mechanism is used for adjusting the attention weight of the neighbors through the attention mechanism, acquiring semantic neighbor information of the interest points and enhancing the semantic information of the interest points;
and the multi-view fusion module is used for fusing a plurality of feature vectors to carry out semantic annotation on the interest points.
If data mining is performed using only the text information (name) and the spatial information (longitude and latitude) of points of interest in order to label them semantically, two difficulties arise. First, it is hard to infer the category of a point of interest directly from its coordinates, and how to use spatial information effectively to understand the environment and context of a point of interest is a challenging problem. Second, although the name of a point of interest carries some category information, names are short texts, so this information is sparse and insufficient to reflect the actual category. In the present invention, for spatial information, category embedding is first performed on the points of interest; all points of interest are then constructed into a spatial graph by Delaunay triangulation (DT), in which the points of interest are the nodes and the category embeddings are the node features; finally, neighbor information of the points of interest is captured by a graph convolutional neural network to learn their spatial features and obtain spatial feature vectors. For semantic information, the name of each point of interest is first divided into a character sequence and a word sequence; character-level and word-level pre-trained text vectors are obtained from the Tencent pre-trained word vectors, and the two are concatenated to form the pre-trained text vector of the point of interest. The k nearest semantic neighbors are then found according to the pre-trained text vectors of the points of interest, and semantic neighbor information is captured with an attention mechanism to obtain text feature vectors.
Finally, the spatial feature vector, the text feature vector, and the pre-trained text vector are concatenated and fed into a multi-layer neural network for semantic labeling of the points of interest. The method addresses both of the difficulties above.
One or more embodiments of the present invention achieve at least the following technical effects:
1. The invention provides a point-of-interest semantic annotation model with information enhancement based on spatial neighbors and semantic neighbors, which mines feature representations of points of interest from their spatial neighbor information and semantic neighbor information;
2. The data used by the method is easy to acquire: only the text (name) and the position (longitude and latitude) of each point of interest are considered, and no additional hard-to-acquire information is used, so the method is more widely applicable;
3. The method extracts the spatial information of the points of interest by constructing a graph structure, which models information transfer between points of interest more naturally and gives better robustness;
4. The method extracts the semantic neighbor information of the points of interest through an attention mechanism, which addresses the sparsity of semantic information.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the invention.
Fig. 1 is a model diagram of the present invention.
Fig. 2 is a general frame diagram in an embodiment of the present invention.
Fig. 3 is a diagram of a model of spatial feature vector acquisition in accordance with an embodiment of the present invention.
Fig. 4 is a diagram of a model of text feature vector acquisition in an embodiment of the present invention.
Detailed Description
It should be noted that the following detailed description is illustrative and is intended to provide further explanation of the invention. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.
It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of exemplary embodiments according to the present invention. As used herein, the singular is also intended to include the plural unless the context clearly indicates otherwise, and furthermore, it is to be understood that the terms "comprises" and/or "comprising" when used in this specification are taken to specify the presence of stated features, steps, operations, devices, components, and/or combinations thereof.
As described above, the interest point semantic annotation method in the prior art has the problems of high data cost, difficult acquisition, easy invasion of privacy, long time consumption, inaccuracy and the like. Based on the above, the invention provides a method and a system for labeling interest point semantics based on space and semantic neighbor information, as shown in fig. 1 and fig. 2.
In a first aspect, the invention provides a method for labeling interest points semantically based on space and semantic neighbor information, which specifically comprises the following steps:
Step 1: use a one-hot encoded vector as the category embedding of each point of interest, and construct all points of interest into a graph structure by Delaunay triangulation according to their longitude and latitude (taking Boer in FIG. 2, at longitude 116.341878 degrees and latitude 40.030869 degrees, as an example). In the graph structure, the points of interest are the graph nodes, the node features are initialized as the category embeddings of the points of interest, and each node without a category label is initialized as the average of the embedding vectors of its k nearest nodes that have category labels.
The category of a point of interest largely reflects its semantics. The invention selects a simple one-hot encoding of the category as the initial node feature because it provides intuitive semantic label information. That is, the category embedding is expressed as $e_c = [0, \dots, 1, \dots, 0]$, where the length of $e_c$ equals the number of categories, $e_c$ represents the individual category $c$, and only the $c$-th element of $e_c$ is 1.
After the category embeddings are obtained, the spatial context of the points of interest is modeled. According to the first law of geography, points of interest that are spatially close are strongly related. The points of interest surrounding a given point of interest often share its category, e.g. restaurants tend to cluster together, each surrounded by similar points of interest. Also, while some points of interest may not be clustered together, the distribution of categories around them may be similar. As shown in FIG. 3, based on this idea, the invention first uses Delaunay triangulation to construct all points of interest into a graph structure according to their longitude and latitude, where the points of interest are the nodes and the category embeddings are the node features; a node without a category label takes as its feature the mean of the category embeddings of the k nearest surrounding nodes that have label information. The distance $d_{ij}$ between nodes $v_i$ and $v_j$ in the graph structure is given by: [formula not recoverable from the source text]
where $L$ is the diagonal length of the smallest rectangle containing all points of interest, $\mathrm{dist}(v_i, v_j)$ is the distance between points of interest $v_i$ and $v_j$, and $\log(\cdot)$ is the logarithmic function; the distance $d_{ij}$ is finally normalized to the range 0 to 1.
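The node-feature initialization of step 1 can be illustrated with a minimal sketch. It is not the patented implementation: the function names `one_hot` and `init_node_features` are hypothetical, the "nearest" labeled nodes are found by plain planar distance on the coordinates (an assumption; the text does not say how nearness is measured), and the Delaunay triangulation itself is omitted (a library routine such as `scipy.spatial.Delaunay` could supply the edges).

```python
import math

def one_hot(category_index, num_categories):
    """Return a one-hot list whose category_index-th element is 1."""
    vec = [0.0] * num_categories
    vec[category_index] = 1.0
    return vec

def init_node_features(coords, labels, num_categories, k):
    """Initial node features for the POI graph.

    coords: list of (longitude, latitude) pairs.
    labels: list of category indices, or None for an unlabeled POI.
    Labeled nodes get their one-hot category embedding; unlabeled nodes
    get the mean embedding of their k nearest labeled nodes.
    """
    labeled = [i for i, c in enumerate(labels) if c is not None]
    features = []
    for i, c in enumerate(labels):
        if c is not None:
            features.append(one_hot(c, num_categories))
            continue
        # Assumption: nearness is planar distance on raw coordinates.
        nearest = sorted(labeled, key=lambda j: math.dist(coords[i], coords[j]))[:k]
        mean = [0.0] * num_categories
        for j in nearest:
            mean[labels[j]] += 1.0 / len(nearest)
        features.append(mean)
    return features
```

With two labeled neighbors of different categories, an unlabeled node starts with an even split over those two categories, which the graph convolution in step 3 then refines.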
Step 2: divide the point-of-interest name into a character sequence and a word sequence; look up the pre-trained vectors of the characters and of the words in the two sequences; sum and average the vectors in each sequence to obtain a character-level vector and a word-level vector; and concatenate the two to obtain the pre-trained text vector of the point of interest. Then, using the pre-trained text vectors, calculate the distance between each point of interest without a category label and the points of interest with category labels using the cosine distance formula, and find the k nearest semantic neighbors by heap sort.
Since points of interest with similar semantic information tend to have similar categories, the invention enhances the information of each point of interest by extracting its semantic neighbor information. First, the semantic neighbors of a point of interest are found from its name: the name is divided into a character sequence and a word sequence; the pre-trained vectors come from the Tencent AI Lab embedding corpus; the corresponding pre-trained vectors in each sequence are summed and averaged to obtain the character-level and word-level pre-trained text vectors of the point of interest; and these are concatenated to obtain the pre-trained text vector of the point of interest. The pre-trained text vector is then used to calculate the distance between the point of interest and the points of interest with category labels according to the cosine distance formula, and the k nearest semantic neighbors are found by heap sort.
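As a rough sketch of step 2, the snippet below builds a pre-trained text vector and selects semantic neighbors with a heap. Everything named here is hypothetical: `table` stands in for a lookup of pre-trained character and word vectors, the name is assumed to be already split into characters and words, and the cosine formula is treated as a similarity so that the nearest neighbors are those with the largest value (an assumption about how "nearest" is ranked).

```python
import heapq
import math

def mean_vector(tokens, table, dim):
    """Average the pre-trained vectors of the tokens found in the table."""
    found = [table[t] for t in tokens if t in table]
    if not found:
        return [0.0] * dim
    return [sum(v[d] for v in found) / len(found) for d in range(dim)]

def pretrained_text_vector(chars, words, table, dim):
    """Concatenate the character-level and word-level mean vectors."""
    return mean_vector(chars, table, dim) + mean_vector(words, table, dim)

def cosine(a, b):
    """Dot product divided by the product of the vector moduli."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def k_semantic_neighbors(query, labeled_vectors, k):
    """Indices of the k labeled vectors most similar to query (heap-based)."""
    return heapq.nlargest(
        k,
        range(len(labeled_vectors)),
        key=lambda j: cosine(query, labeled_vectors[j]),
    )
```

`heapq.nlargest` avoids fully sorting all labeled points of interest when only k of them are needed.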
Step 3: the graph structure is fed into a spatial encoder based on a graph convolutional neural network to obtain the spatial feature vectors of the points of interest, and the point-of-interest names, the neighbor point-of-interest names, and the category embeddings are input into an attention-based text encoder to obtain the text feature vectors. The vectors of the two views are concatenated by a multi-view fusion module, the fused feature vector is concatenated with the pre-trained text vector, and the probability of each point-of-interest category is predicted by a classifier. The parameters of the model are learned by training, i.e. by minimizing the loss function.
Spatial features are extracted with a one-layer graph convolutional neural network:
$H^{s} = \mathrm{softmax}(\hat{A} X \theta)$, where $\hat{A}$ is the adjacency matrix $A$ of the graph structure normalized by $D$, the degree matrix of the adjacency matrix. The invention does not add self-loops to the adjacency matrix, so that a node's own category information is not added when the feature information of its neighbors is aggregated. $X$ is the initial feature matrix of the nodes; $\theta$ is a learnable linear transformation applied to each node; $\mathrm{softmax}(\cdot)$ is the normalized exponential function; and $H^{s}$ is the resulting matrix of spatial feature vectors.
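The one-layer graph convolution without self-loops can be sketched in plain Python as follows. The symmetric normalization $D^{-1/2} A D^{-1/2}$ used here is an assumption (the exact normalization is not recoverable from the text), and the helper names `matmul`, `softmax`, and `gcn_layer` are hypothetical.

```python
import math

def softmax(row):
    """Numerically stable normalized exponential of a list of numbers."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(a[i][t] * b[t][j] for t in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]

def gcn_layer(adj, feats, theta):
    """One graph-convolution layer, softmax(norm(A) X theta), without self-loops.

    adj: symmetric adjacency matrix with a zero diagonal.
    Assumption: norm(A) = D^-1/2 A D^-1/2 with D the degree matrix of A.
    """
    n = len(adj)
    deg = [sum(row) for row in adj]
    norm = [
        [adj[i][j] / math.sqrt(deg[i] * deg[j]) if adj[i][j] else 0.0 for j in range(n)]
        for i in range(n)
    ]
    hidden = matmul(matmul(norm, feats), theta)
    return [softmax(row) for row in hidden]
```

Because the diagonal of the adjacency matrix is zero, each output row depends only on the features of the node's neighbors, which matches the stated reason for omitting self-loops.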
After the semantic neighbors of a point of interest are obtained, the point of interest is semantically enhanced through an attention mechanism. As shown in FIG. 4, the name of the point of interest and the names of its semantic neighbors are input into a BERT encoder to obtain the text feature $t_i$ of the $i$-th point of interest $p_i$ and the text features $t_{i1}, \dots, t_{ik}$ of its $k$ semantic neighbors. These text features are finer-grained, and the distance between the text feature of the point of interest and each neighbor text feature is calculated with the cosine distance formula: $s_{ij} = \frac{t_i \cdot t_{ij}}{\lVert t_i \rVert \, \lVert t_{ij} \rVert}$
The cosine distance vector $s_i = [s_{i1}, \dots, s_{ik}]$ between point of interest $p_i$ and all of its semantic neighbors is then input into a fully connected layer to obtain the latent vector representation $z_i$: $z_i = W s_i + b$, where $W$ and $b$ are the trained weight and bias;
the attention weight between point of interest $p_i$ and its $j$-th semantic neighbor point of interest $p_{ij}$ is then normalized with a softmax function: $\alpha_{ij} = \frac{\exp(z_{ij})}{\sum_{m=1}^{k} \exp(z_{im})}$
where $z_{ij}$ is the distance between point of interest $p_i$ and neighbor point of interest $p_{ij}$ after transformation by one MLP layer, and $\alpha_{ij}$ is the attention weight between $p_i$ and its $j$-th neighbor point of interest $p_{ij}$;
finally, the text feature vector of the point of interest is obtained with the attention mechanism:
$h_i^{t} = \sum_{j=1}^{k} \alpha_{ij} e_{ij}$, where $e_{ij}$ is the category embedding vector of the neighbor point of interest $p_{ij}$.
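The attention step over semantic neighbors can be sketched as below, assuming the BERT text features have already been computed elsewhere and are passed in as plain lists. The function name `attention_text_feature` and its parameters are hypothetical, and the fully connected layer is assumed to map the k cosine values to k scores.

```python
import math

def cosine(a, b):
    """Dot product divided by the product of the vector moduli (nonzero inputs)."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def attention_text_feature(t_i, neighbor_feats, neighbor_cat_embs, W, b):
    """Attention-weighted sum of the neighbors' category embeddings.

    t_i: text feature of the POI; neighbor_feats: text features of its k
    semantic neighbors; neighbor_cat_embs: their category embeddings;
    W (k x k) and b (k): fully connected layer (assumed shape).
    Returns (attention weights, text feature vector).
    """
    k = len(neighbor_feats)
    # Cosine values between the POI and each semantic neighbor.
    s = [cosine(t_i, t_ij) for t_ij in neighbor_feats]
    # Fully connected layer: z = W s + b.
    z = [sum(W[j][m] * s[m] for m in range(k)) + b[j] for j in range(k)]
    # Softmax over the k neighbors.
    mx = max(z)
    exps = [math.exp(v - mx) for v in z]
    total = sum(exps)
    alpha = [e / total for e in exps]
    # Weighted sum of the neighbors' category embeddings.
    dim = len(neighbor_cat_embs[0])
    h = [sum(alpha[j] * neighbor_cat_embs[j][d] for j in range(k)) for d in range(dim)]
    return alpha, h
```

When the neighbors' category embeddings are one-hot, the resulting vector is a soft vote over the neighbors' categories, weighted by attention.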
The inventors explored several fusion methods and selected concatenation as the final fusion mode. The invention concatenates the spatial feature vector $h_i^{s}$ and the text feature vector $h_i^{t}$, and after the two views are fused the pre-trained text vector is concatenated with the result to obtain the vector $x_i$, which is the input to the final multi-layer MLP:
$\hat{y}_i = \mathrm{softmax}(W_3 (W_2 (W_1 x_i + b_1) + b_2) + b_3)$, where $\hat{y}_i$ is the probability that point of interest $p_i$ belongs to each point-of-interest category; $W_1$, $W_2$, $W_3$ are the weights of layers 1, 2, and 3 of the MLP classifier; $b_1$, $b_2$, $b_3$ are the corresponding biases; $x_i$ is the vector obtained by concatenating the fused views with the pre-trained text vector; and $\mathrm{softmax}(\cdot)$ is the normalized exponential function, used as the activation function of the last layer of the MLP classifier.
Finally, the loss function (cross-entropy loss) is minimized using the point-of-interest categories, so that the known category information is propagated back through the deep network: $\mathcal{L} = -\sum_{i=1}^{N} \sum_{c=1}^{C} y_{ic} \log \hat{y}_{ic}$
where $N$ is the number of points of interest in the dataset, $C$ is the number of categories, $y_{ic}$ indicates whether point of interest $p_i$ is labeled as category $c$, and $\hat{y}_{ic}$ is the predicted probability that $p_i$ is labeled as category $c$.
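The fusion by concatenation and the cross-entropy loss can be sketched as follows. The helper names `concat` and `cross_entropy` are hypothetical, the loss is summed rather than averaged over points of interest (an assumption), and a small epsilon is added inside the logarithm purely for numerical safety. The MLP classifier itself is left out.

```python
import math

def concat(*vectors):
    """Concatenate feature vectors, e.g. spatial + text + pre-trained text."""
    out = []
    for v in vectors:
        out.extend(v)
    return out

def cross_entropy(probs, labels):
    """Cross-entropy loss summed over POIs.

    probs: one list of predicted category probabilities per POI.
    labels: the true category index of each POI.
    With one-hot targets, the double sum over POIs and categories reduces to
    the negative log-probability of each POI's true category.
    """
    eps = 1e-12  # guards against log(0); not part of the loss definition
    return -sum(math.log(p[c] + eps) for p, c in zip(probs, labels))
```

For example, `concat(h_s, h_t, u)` would produce the classifier input from a spatial feature vector `h_s`, a text feature vector `h_t`, and a pre-trained text vector `u` (all three names are placeholders).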
Step 4: when predicting the categories of urban points of interest, input the graph structure of the urban points of interest, the category embeddings of the points of interest that have category labels, the point-of-interest names, the category embeddings of the semantic neighbors, and the pre-trained text vectors; the point-of-interest categories can then be predicted.
The model of the invention was run on two real data sets, and its performance and that of several comparison methods were evaluated with three indexes: accuracy (Accuracy), macro F1 (Macro-F1), and mean reciprocal rank (MRR). The comparison methods are word-based text features (WTF), attention-based text features (ATF), grid-based spatial features (GSF), pre-trained semantic embeddings of global GPS coordinates (GPS2Vec), an integrated POI hierarchical classification framework (EHC), and word-based text features combined with pre-trained semantic embeddings of global GPS coordinates (WTF+GPS2Vec). MRR is an index that looks up the rank of the true label in the generated ranked list of predicted category labels, and is calculated as: $\mathrm{MRR} = \frac{1}{|T|} \sum_{i=1}^{|T|} \frac{1}{\mathrm{rank}_i}$
where $|T|$ is the size of the test set $T$ and $\mathrm{rank}_i$ is the rank of the true category among the predicted categories for the $i$-th test sample. The evaluation results of all the indexes are shown in Table 1. It can be seen that the invention has a significant improvement in performance over other methods.
Table 1 evaluation results of various methods
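As an illustration of the MRR index used in the evaluation, the following sketch ranks the categories of each test sample by predicted score and averages the reciprocal rank of the true category. The function name `mean_reciprocal_rank` and its input layout are hypothetical.

```python
def mean_reciprocal_rank(score_lists, true_labels):
    """Mean reciprocal rank of the true category over the test set.

    score_lists: one list of predicted scores (one per category) per sample.
    true_labels: the true category index of each sample.
    """
    total = 0.0
    for scores, truth in zip(score_lists, true_labels):
        # Categories ordered from highest to lowest predicted score.
        ranking = sorted(range(len(scores)), key=lambda c: scores[c], reverse=True)
        total += 1.0 / (ranking.index(truth) + 1)
    return total / len(true_labels)
```

A sample whose true category is ranked first contributes 1, one ranked second contributes 0.5, and so on, so higher MRR means the true label tends to appear nearer the top of the ranked list.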
In a second aspect, the present invention provides a system for labeling semantic points of interest based on spatial and semantic neighbor information, comprising:
the spatial encoder module is used for mining the spatial information of the interest points based on the graph convolution neural network;
the text encoder module based on the attention mechanism is used for adjusting the attention weight of the neighbors through the attention mechanism, acquiring semantic neighbor information of the interest points and enhancing the semantic information of the interest points;
and the multi-view fusion module is used for fusing a plurality of feature vectors to carry out semantic annotation on the interest points.
The above description is only of the preferred embodiments of the present invention and is not intended to limit the present invention, but various modifications and variations can be made to the present invention by those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (8)

1. The interest point semantic labeling method based on the space and semantic neighbor information is characterized by comprising the following steps of:
step 1: using one-hot vector as category embedding of interest points, constructing a graph structure according to longitude and latitude information of the interest points by using Deltay triangulation, using the interest points as graph nodes, initializing node characteristics as category embedding of the interest points, and initializing nodes without category labels as nearest nodeskAverage value of embedded vectors of each classified label node;
step 2: obtaining a pre-training text vector of the interest point, calculating the distance between the interest point without the category label and the interest point with the category label through a cosine distance formula, and finding the nearest interest point through heap orderingkA semantic neighbor;
step 3: obtaining a space feature vector and a text feature vector of an interest point, splicing the fusion feature vector spliced together with a pre-training text vector, predicting the probability of the class of the interest point through a classifier, and learning parameters of a model through training, namely minimizing a loss function;
step 4: when urban interest point category prediction is carried out, a map structure of urban interest points, interest point category embedding with category labels and interest point names are input, and the interest point semantic neighbor category embedding and pre-training text vectors can predict the interest point categories;
in step 3, the fusion feature vector and the pre-training text vector are spliced in the following manner: by means of space feature vectorsIs +.>Spliced together, the pre-trained text vector is spliced together after the two views are fused to obtain the vector +.>As input to the final multi-layer MLP:
wherein ,representing interest points->Probability of belonging to each interest point type, +.>、/>、/>Weights of layers 1, 2, 3 of the MLP classifier are respectively represented, +.>、/>、/>Bias, vector +.>The vector is spliced with the pre-training text vector after view fusion, and softmax (·) is a normalized exponential function, which is an activation function of the last layer of the MLP classifier;
finally, the known interest point category information is back-propagated through the deep network by minimizing the following loss function over the interest point categories:

$$\mathcal{L} = -\sum_{i=1}^{N} \sum_{c=1}^{C} y_{ic} \log \hat{y}_{ic}$$

where $N$ represents the number of interest points in the dataset, $C$ represents the number of categories, $y_{ic}$ indicates whether interest point $p_i$ is labeled as category $c$, and $\hat{y}_{ic}$ is the predicted probability that $p_i$ is labeled as category $c$.
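The classification and loss described at the end of claim 1 can be illustrated with a minimal pure-Python sketch. It is not the patented implementation: the function names, the list-of-rows weight layout, and the ReLU between hidden layers are all assumptions made for illustration (the claim text names only the final softmax).

```python
import math

def concat(*vectors):
    """Concatenate feature vectors end to end (view fusion by splicing)."""
    out = []
    for v in vectors:
        out.extend(v)
    return out

def affine(W, b, x):
    """Compute W x + b, with W given as a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]

def softmax(logits):
    """Normalized exponential, shifted by the maximum for numerical stability."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def mlp_classify(z, layers):
    """Apply each (W, b) layer in turn and finish with softmax.

    The ReLU between hidden layers is an assumption; the claim text
    only specifies softmax as the activation of the last layer.
    """
    h = z
    for W, b in layers[:-1]:
        h = [max(0.0, v) for v in affine(W, b, h)]
    W, b = layers[-1]
    return softmax(affine(W, b, h))

def cross_entropy(y_true, y_prob, eps=1e-12):
    """Sum of -y_ic * log(p_ic) over all interest points i and categories c."""
    return -sum(
        y * math.log(p + eps)
        for labels, probs in zip(y_true, y_prob)
        for y, p in zip(labels, probs)
    )
```

In use, `concat` would join the spatial feature vector, the text feature vector, and the pre-trained text vector before `mlp_classify` is called, and `cross_entropy` would be evaluated only on interest points whose category is known.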
2. The interest point semantic labeling method based on space and semantic neighbor information according to claim 1, wherein in step 1, the category embedding is represented as: $x_c = [0, \ldots, 0, 1, 0, \ldots, 0]$, where the length of $x_c$ is equal to the number of categories, $x_c$ denotes the embedding of the $c$-th category, and the $c$-th element of $x_c$ is 1.
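As a rough illustration of the one-hot category embedding and of the node initialization in step 1, the following sketch builds one-hot vectors and initializes an unlabeled node as the mean embedding of its k nearest labeled nodes. The function names are hypothetical, and treating longitude and latitude as planar coordinates with Euclidean distance is an assumption; the claim does not state which distance is used to pick the nearest nodes.

```python
import math

def one_hot(category_index, num_categories):
    """Vector of length num_categories with a single 1 at category_index."""
    vec = [0.0] * num_categories
    vec[category_index] = 1.0
    return vec

def init_unlabeled(coord, labeled, num_categories, k):
    """Average the one-hot embeddings of the k nearest labeled nodes.

    coord   -- (x, y) position of the unlabeled node
    labeled -- list of ((x, y), category_index) pairs for labeled nodes
    Euclidean distance on raw coordinates is an illustrative assumption.
    """
    nearest = sorted(labeled, key=lambda item: math.dist(coord, item[0]))[:k]
    total = [0.0] * num_categories
    for _, category_index in nearest:
        for m, value in enumerate(one_hot(category_index, num_categories)):
            total[m] += value
    return [value / len(nearest) for value in total]
```

The result is a soft category distribution: an unlabeled node whose two nearest labeled neighbors belong to different categories starts with weight 0.5 on each.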
3. The interest point semantic labeling method based on space and semantic neighbor information according to claim 1, wherein in step 1, the distance between nodes $p_i$ and $p_j$ in the graph structure is expressed as:
where $L$ is the diagonal length of the smallest rectangle containing all interest points, $d_{ij}$ is the distance between interest points $p_i$ and $p_j$, log(·) denotes the logarithm function, and the distance is finally normalized to the range 0 to 1.
4. The interest point semantic labeling method based on space and semantic neighbor information according to claim 1, wherein in step 2, the pre-trained text vector of an interest point is obtained as follows: the interest point name is split into a character sequence and a word sequence; the pre-trained vectors of the characters and of the words in the two sequences are looked up; the vectors in each sequence are summed and averaged to obtain a character-level vector and a word-level vector; and the two vectors are concatenated to obtain the pre-trained text vector of the interest point.
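Step 2, together with the construction in claim 4, might be sketched as follows. The lookup tables `char_table` and `word_table` are hypothetical stand-ins for pre-trained embeddings, the tokenization is assumed to have been done already, and `heapq.nsmallest` is used as a heap-based selection in place of a full heap sort.

```python
import heapq
import math

def mean_vector(tokens, table, dim):
    """Average the pre-trained vectors of the tokens found in the table."""
    found = [table[t] for t in tokens if t in table]
    if not found:
        return [0.0] * dim
    return [sum(column) / len(found) for column in zip(*found)]

def text_vector(chars, words, char_table, word_table, dim):
    """Concatenate the character-level and word-level average vectors."""
    return mean_vector(chars, char_table, dim) + mean_vector(words, word_table, dim)

def cosine_distance(u, v):
    """1 minus cosine similarity; returns 1.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 1.0
    return 1.0 - dot / (norm_u * norm_v)

def k_semantic_neighbors(query, labeled_vectors, k):
    """Return the ids of the k labeled interest points closest to query.

    labeled_vectors maps an interest point id to its text vector.
    """
    return heapq.nsmallest(
        k,
        labeled_vectors,
        key=lambda pid: cosine_distance(query, labeled_vectors[pid]),
    )
```

Using a heap keeps the selection at roughly O(n log k) per query instead of sorting all labeled interest points.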
5. The interest point semantic labeling method based on space and semantic neighbor information according to claim 1, wherein in step 3, a graph structure is sent to a spatial encoder based on a graph convolution neural network to obtain a spatial feature vector of the interest point.
6. The interest point semantic labeling method based on space and semantic neighbor information according to claim 5, wherein the spatial feature vector of the interest point is obtained as follows: a one-layer graph convolutional neural network is used to extract the spatial feature vectors:

$$H^{s} = \mathrm{softmax}\left(D^{-\frac{1}{2}} A D^{-\frac{1}{2}} X \theta\right)$$

where $A$ is the adjacency matrix of the graph structure, $D$ is the degree matrix of the adjacency matrix, $X$ is the initial feature matrix of the nodes, $\theta$ is a learnable linear transformation applied to each node, softmax(·) is the normalized exponential function, and $H^{s}$ is the obtained spatial feature vector.
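A one-layer graph convolution of the kind described in claim 6 can be sketched in pure Python as below. The symmetric normalization $D^{-1/2} A D^{-1/2}$, the row-wise softmax, and the absence of self-loops are assumptions based on the standard graph convolutional network formulation; the claim text lists only the ingredients ($A$, $D$, $X$, $\theta$, softmax).

```python
import math

def matmul(P, Q):
    """Multiply two matrices given as lists of rows."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*Q)] for row in P]

def softmax(row):
    """Normalized exponential over one row, shifted for numerical stability."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def gcn_layer(A, X, theta):
    """Compute softmax(D^-1/2 A D^-1/2 X theta), applied row by row.

    A     -- adjacency matrix (n x n), possibly weighted
    X     -- initial node feature matrix (n x f)
    theta -- linear transformation (f x h)
    Isolated nodes (degree 0) get a zero normalization factor.
    """
    n = len(A)
    degree = [sum(row) for row in A]
    inv_sqrt = [1.0 / math.sqrt(d) if d > 0 else 0.0 for d in degree]
    A_norm = [[inv_sqrt[i] * A[i][j] * inv_sqrt[j] for j in range(n)] for i in range(n)]
    hidden = matmul(matmul(A_norm, X), theta)
    return [softmax(row) for row in hidden]
```

Because no self-loops are added in this sketch, each output row depends only on the features of a node's neighbors in the triangulation graph.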
7. The interest point semantic labeling method based on space and semantic neighbor information according to claim 1, wherein in step 3, the names of the interest points, the names of their neighbor interest points, and the category embeddings are input into an attention-based text encoder to obtain the text feature vectors.
8. The interest point semantic labeling method based on space and semantic neighbor information according to claim 7, wherein the text feature vector is obtained as follows: the name of the interest point and the names of its semantic neighbors are input into a BERT encoder to obtain the text feature $t_i$ of the $i$-th interest point $p_i$ and the text features $t_{i1}, \ldots, t_{ik}$ of its $k$ semantic neighbors; the distance between the text feature of the interest point and the text feature of each neighbor is calculated by the cosine distance formula:

$$d_{ij} = 1 - \frac{t_i \cdot t_{ij}}{\lVert t_i \rVert \, \lVert t_{ij} \rVert}$$

the cosine distance vector $d_i = [d_{i1}, \ldots, d_{ik}]$ between interest point $p_i$ and all of its semantic neighbors is then input into a fully connected layer to obtain the latent vector representation $u_i$:

$$u_i = W d_i + b$$

where $W$ and $b$ represent the trainable weight and bias, respectively;

the attention weight between interest point $p_i$ and its $j$-th neighbor interest point $p_{ij}$ is then normalized with a softmax function:

$$\alpha_{ij} = \frac{\exp(u_{ij})}{\sum_{m=1}^{k} \exp(u_{im})}$$

where $u_{ij}$ is the distance between interest point $p_i$ and neighbor interest point $p_{ij}$ after transformation by one MLP layer, and $\alpha_{ij}$ is the attention weight between interest point $p_i$ and its $j$-th neighbor interest point $p_{ij}$;

finally, the text feature vector of the interest point is obtained based on the attention mechanism:

$$h_i^{t} = \sum_{j=1}^{k} \alpha_{ij} \, c_{ij}$$

where $c_{ij}$ represents the category embedding vector of neighbor interest point $p_{ij}$.
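The attention step of claim 8 can be sketched as follows, assuming the BERT text features have already been computed and are passed in as plain lists. The function names, the square weight matrix `W`, and the use of 1 minus cosine similarity as the cosine distance are illustrative assumptions rather than details confirmed by the claim text.

```python
import math

def cosine_distance(u, v):
    """1 minus cosine similarity; returns 1.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 1.0
    return 1.0 - dot / (norm_u * norm_v)

def softmax(values):
    """Normalized exponential, shifted by the maximum for numerical stability."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def attention_text_feature(t_i, neighbor_feats, neighbor_cats, W, b):
    """Attention-weighted sum of the neighbors' category embeddings.

    t_i            -- text feature of the interest point
    neighbor_feats -- text features of its k semantic neighbors
    neighbor_cats  -- category embeddings of the same k neighbors
    W, b           -- weights (k x k, list of rows) and bias (length k)
                      of the fully connected layer
    """
    distances = [cosine_distance(t_i, t) for t in neighbor_feats]
    latent = [sum(w * d for w, d in zip(row, distances)) + bias for row, bias in zip(W, b)]
    alpha = softmax(latent)
    dim = len(neighbor_cats[0])
    return [sum(a * cat[m] for a, cat in zip(alpha, neighbor_cats)) for m in range(dim)]
```

With trained weights, the fully connected layer decides how distance maps to attention; in a toy setting, a negated identity matrix for `W` gives closer neighbors larger weights.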
CN202310614884.9A 2023-05-29 2023-05-29 Interest point semantic labeling method and system based on space and semantic neighbor information Active CN116341567B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310614884.9A CN116341567B (en) 2023-05-29 2023-05-29 Interest point semantic labeling method and system based on space and semantic neighbor information

Publications (2)

Publication Number Publication Date
CN116341567A (en) 2023-06-27
CN116341567B (en) 2023-08-29

Family

ID=86880732

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310614884.9A Active CN116341567B (en) 2023-05-29 2023-05-29 Interest point semantic labeling method and system based on space and semantic neighbor information

Country Status (1)

Country Link
CN (1) CN116341567B (en)

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110362652A (en) * 2019-07-19 2019-10-22 Liaoning Technical University Spatial keyword Top-K query method based on spatial-semantic-numerical relevance
CN112528639A (en) * 2020-11-30 2021-03-19 Tencent Technology (Shenzhen) Co., Ltd. Object recognition method and device, storage medium and electronic equipment
CN112633380A (en) * 2020-12-24 2021-04-09 Beijing Baidu Netcom Science and Technology Co., Ltd. Interest point feature extraction method and device, electronic equipment and storage medium
WO2021160100A1 (en) * 2020-02-12 2021-08-19 Guangdong Oppo Mobile Telecommunications Corp., Ltd. Methods for searching images and for indexing images, and electronic device
CN113918837A (en) * 2021-10-15 2022-01-11 Shandong University Method and system for generating urban interest point category representation
CN115221325A (en) * 2022-07-25 2022-10-21 Military Science Information Research Center, Academy of Military Sciences of the Chinese People's Liberation Army Text classification method based on label semantic learning and attention adjustment mechanism
CN115374792A (en) * 2022-09-14 2022-11-22 Shandong Computer Science Center (National Supercomputing Center in Jinan) Policy text labeling method and system combining pre-training and graph neural network
CN115422441A (en) * 2022-08-11 2022-12-02 Huazhong University of Science and Technology Continuous interest point recommendation method based on social space-time information and user preference
CN115577294A (en) * 2022-11-22 2023-01-06 Shenzhen Planning and Natural Resources Data Management Center (Shenzhen Geospatial Information Center) Urban area classification method based on interest point spatial distribution and semantic information

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11238362B2 (en) * 2016-01-15 2022-02-01 Adobe Inc. Modeling semantic concepts in an embedding space as distributions
CN109145219B (en) * 2018-09-10 2020-12-25 Baidu Online Network Technology (Beijing) Co., Ltd. Method and device for judging validity of interest points based on Internet text mining

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
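Point-of-interest category representation model fusing spatial and textual information; Xu Zelin (徐则林); Journal of Computer Applications; https://kns.cnki.net/kcms/detail//51.1307.TP.20221130.1052.003.html; full text *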

Also Published As

Publication number Publication date
CN116341567A (en) 2023-06-27

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant