CN116341567B

CN116341567B - Interest point semantic labeling method and system based on space and semantic neighbor information

Info

Publication number: CN116341567B
Application number: CN202310614884.9A
Authority: CN
Inventors: 陈勐; 张大滨; 王宇; 郭衍民
Original assignee: Shandong Institute Of Industrial Technology; Shandong University
Current assignee: Shandong Institute Of Industrial Technology; Shandong University
Priority date: 2023-05-29
Filing date: 2023-05-29
Publication date: 2023-08-29
Anticipated expiration: 2043-05-29
Also published as: CN116341567A

Abstract

The invention belongs to the field of data mining, and particularly relates to a method and a system for semantic labeling of points of interest based on spatial and semantic neighbor information. In terms of spatial information, category embedding is first performed on the points of interest; the points of interest are then constructed into a spatial graph in which the points of interest are nodes and the category embeddings are node features; finally, the neighbor information of the points of interest is captured and their spatial features are learned to obtain spatial feature vectors. In terms of semantic information, pre-trained text vectors of the points of interest are obtained, the k nearest semantic neighbors are then found according to the pre-trained text vectors, and semantic neighbor information is captured with an attention mechanism to obtain text feature vectors. Finally, the spatial feature vector, the text feature vector and the pre-trained text vector are concatenated and fed into a multi-layer neural network for semantic labeling. The data used by the method is easy to obtain, the method is widely applicable and more robust, and it better alleviates the sparsity of semantic information.

Description

Interest point semantic labeling method and system based on space and semantic neighbor information

Technical Field

The invention belongs to the field of data mining, and particularly relates to a method and a system for labeling interest point semantics based on space and semantic neighbor information.

Background

Spatio-temporal data mining based on geospatial data has led to a number of useful applications, in which point of interest (POI) data has proven effective in many urban computing and service tasks. However, because the category labels of points of interest are annotated manually and this process is uncertain, labels are often missing or incorrect; semantic labeling of points of interest that lack category labels is therefore of great significance.

Many methods for the semantic labeling of points of interest have been proposed at home and abroad, but they have the following limitations. First, additional multi-modal data and user check-in data are often costly, difficult to acquire, and prone to privacy violations, and the need for these data sources greatly limits the generalization capability of many existing approaches. Second, existing modeling of spatial information relies on partitioning space into rigid grids; finding appropriate parameters for the partition takes considerable time and requires some experience. Furthermore, biased information is mined for points of interest located at grid boundaries.

Disclosure of Invention

To solve the above problems, the invention provides a method and a system for semantic labeling of points of interest based on spatial and semantic neighbor information. By using the spatial neighbor information and the semantic neighbor information of the points of interest to mine their feature representations, the invention models information transfer between points of interest more naturally and better alleviates the sparsity of semantic information.

In order to achieve the above object, the present invention mainly includes the following two aspects:

in a first aspect, the present invention provides a method for labeling interest points semantically based on spatial and semantic neighbor information, including:

Step 1: using a one-hot vector as the category embedding of each point of interest, constructing a graph structure from the longitude and latitude information of the points of interest by Delaunay triangulation, taking the points of interest as graph nodes, initializing the node features as the category embeddings of the points of interest, and initializing each node without a category label as the average of the embedding vectors of its k nearest nodes that have category labels.

Step 2: obtaining a pre-trained text vector of each point of interest, calculating the distance between each point of interest without a category label and the points of interest with category labels by the cosine distance formula, and finding its k nearest semantic neighbors by heap sort.

Step 3: obtaining a spatial feature vector and a text feature vector of each point of interest, concatenating them into a fused feature vector, concatenating the fused feature vector with the pre-trained text vector, and predicting the probability of each point-of-interest category with a classifier. The parameters of the model are learned by training, i.e. by minimizing the loss function.

Step 4: when predicting the categories of urban points of interest, inputting the graph structure of the urban points of interest, the category embeddings of the points of interest with category labels, the point-of-interest names, the category embeddings of the semantic neighbors of the points of interest, and the pre-trained text vectors, so as to predict the point-of-interest categories.

In some embodiments, in step 1, the category embedding is expressed as a one-hot vector $\mathbf{e}_c \in \{0,1\}^{C}$, where the length of $\mathbf{e}_c$ equals the number of categories $C$ and, in the representation $\mathbf{e}_c$ of the $c$-th category, only the $c$-th element is 1.

In some embodiments, in step 1, the distance between nodes $v_i$ and $v_j$ in the graph structure is expressed as:

，

where $L$ is the diagonal length of the smallest rectangle containing all points of interest, $\mathrm{dist}(v_i, v_j)$ is the distance between points of interest $v_i$ and $v_j$, and $\log(\cdot)$ denotes the logarithmic function; the resulting distance is finally normalized to the range 0 to 1.

In some embodiments, in step 2, the pre-trained text vector is obtained as follows: the point-of-interest name is split into a character sequence and a word sequence; the pre-trained vectors of the characters and of the words in the two sequences are looked up; the vectors in each sequence are summed and averaged to obtain a character-level vector and a word-level vector; and the two vectors are concatenated to obtain the pre-trained text vector of the point of interest.
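As a minimal sketch of this averaging and concatenation step, the Python snippet below builds a text vector from toy lookup tables. The tables `CHAR_VECS` and `WORD_VECS`, the 2-dimensional vectors, the zero vector for unknown tokens and the pre-segmented word list are illustrative assumptions, not values from the invention; in practice the vectors would come from a pre-trained embedding corpus and the words from a word segmenter.

```python
from typing import Dict, List, Sequence

# Hypothetical toy embedding tables (2-dimensional for readability).
CHAR_VECS: Dict[str, List[float]] = {"K": [1.0, 0.0], "F": [0.0, 1.0], "C": [1.0, 1.0]}
WORD_VECS: Dict[str, List[float]] = {"KFC": [2.0, 0.0]}


def mean_vector(tokens: Sequence[str], table: Dict[str, List[float]], dim: int) -> List[float]:
    """Sum and average the vectors of the tokens.

    Unknown tokens are treated as zero vectors (an assumption of this sketch).
    """
    total = [0.0] * dim
    for tok in tokens:
        vec = table.get(tok, [0.0] * dim)
        total = [a + b for a, b in zip(total, vec)]
    n = max(len(tokens), 1)  # avoid division by zero for empty names
    return [v / n for v in total]


def pretrained_text_vector(name: str, words: Sequence[str], dim: int = 2) -> List[float]:
    """Concatenate the character-level and word-level mean vectors of a POI name."""
    char_level = mean_vector(list(name), CHAR_VECS, dim)
    word_level = mean_vector(words, WORD_VECS, dim)
    return char_level + word_level  # list concatenation = vector splicing


vec = pretrained_text_vector("KFC", ["KFC"])
```

The resulting vector has twice the dimension of a single embedding, because the character-level and word-level halves are kept side by side rather than averaged together.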

In some embodiments, in step 3, the graph structure is fed into a graph convolutional neural network (GCN) based spatial encoder to obtain spatial feature vectors for points of interest.

Further, the specific mode is as follows: extracting spatial feature vectors by using a layer of graph convolution neural network:

, where $A$ is the adjacency matrix of the graph structure, $D$ is the degree matrix of the adjacency matrix, $X$ is the initial feature matrix of the nodes, $\theta$ is a learnable linear transformation applied to each node, $\mathrm{softmax}(\cdot)$ is the normalized exponential function, and $H$ is the acquired spatial feature vector.

In some embodiments, in step 3, the name of the point of interest, the names of its neighbor points of interest and their category embeddings are input into an attention-based text encoder to obtain the text feature vector.

Further, the specific mode is as follows: the name of the point of interest and the names of its semantic neighbors are input into a BERT (Bidirectional Encoder Representations from Transformers, a pre-trained language representation model) encoder to obtain the text feature $\mathbf{t}_i$ of the $i$-th point of interest $p_i$ and the text features $\mathbf{t}_{i1}, \dots, \mathbf{t}_{ik}$ of its $k$ semantic neighbors. The distance between the text feature of the point of interest and each neighbor text feature is calculated by the cosine distance formula:

$$d_{ij} = \frac{\mathbf{t}_i \cdot \mathbf{t}_{ij}}{\lVert \mathbf{t}_i \rVert \, \lVert \mathbf{t}_{ij} \rVert},$$

where $\mathbf{t}_{ij}$ denotes the text feature of the $j$-th semantic neighbor of $p_i$, and $\lVert \mathbf{t}_i \rVert$ and $\lVert \mathbf{t}_{ij} \rVert$ denote the norms of the vectors $\mathbf{t}_i$ and $\mathbf{t}_{ij}$, respectively;

the cosine distance vector $\mathbf{d}_i = (d_{i1}, \dots, d_{ik})$ between the point of interest $p_i$ and all of its semantic neighbors is then input into a fully connected layer to obtain the latent vector representation $\mathbf{z}_i$:

$$\mathbf{z}_i = W \mathbf{d}_i + \mathbf{b},$$

where $W$ and $\mathbf{b}$ denote the trained weight and bias, respectively;

the attention weight between the point of interest $p_i$ and its $j$-th neighbor point of interest $p_{ij}$ is then normalized with a softmax function:

$$\alpha_{ij} = \frac{\exp(z_{ij})}{\sum_{j'=1}^{k} \exp(z_{ij'})},$$

where $j$ indexes the $j$-th neighbor of $p_i$; the summation index in the denominator is named $j'$ to distinguish it from $j$ and runs from $1$ to $k$; $z_{ij}$ and $z_{ij'}$ denote the distances between the point of interest $p_i$ and the neighbor points of interest $p_{ij}$ and $p_{ij'}$ after transformation by one MLP (multi-layer perceptron) layer; $\exp(z_{ij})$ and $\exp(z_{ij'})$ denote the values of the exponential function with base $e$ at $z_{ij}$ and $z_{ij'}$; $\sum_{j'=1}^{k} \exp(z_{ij'})$ sums the exponential function values of the distances between $p_i$ and its semantic neighbors for $j'$ from $1$ to $k$; and $\alpha_{ij}$ is the attention weight between the point of interest $p_i$ and its $j$-th neighbor point of interest $p_{ij}$;

finally, the text feature vector of the point of interest is obtained based on the attention mechanism:

$$\mathbf{h}_i^{t} = \sum_{j=1}^{k} \alpha_{ij} \, \mathbf{e}_{ij},$$

where $\mathbf{e}_{ij}$ denotes the category embedding vector of the neighbor point of interest $p_{ij}$.

In some embodiments, in step 3, the fused feature vector and the pre-trained text vector are concatenated as follows: the spatial feature vector $\mathbf{h}_i^{s}$ and the text feature vector $\mathbf{h}_i^{t}$ are concatenated to fuse the two views, and the pre-trained text vector $\mathbf{u}_i$ is then concatenated to the fused result to obtain the vector $\mathbf{f}_i$, which serves as the input to the final multi-layer MLP:

$$\hat{\mathbf{y}}_i = \mathrm{softmax}\big(W_3 (W_2 (W_1 \mathbf{f}_i + \mathbf{b}_1) + \mathbf{b}_2) + \mathbf{b}_3\big),$$

where $\hat{\mathbf{y}}_i$ denotes the probability that the point of interest $p_i$ belongs to each point-of-interest category; $W_1$, $W_2$, $W_3$ denote the weights of layers 1, 2 and 3 of the MLP classifier and $\mathbf{b}_1$, $\mathbf{b}_2$, $\mathbf{b}_3$ the corresponding biases; $\mathbf{f}_i$ is the view-fused vector concatenated with the pre-trained text vector; and $\mathrm{softmax}(\cdot)$ is the normalized exponential function, used as the activation function of the last layer of the MLP classifier.

Finally, the loss function (cross-entropy loss) is minimized using the point-of-interest categories, so that the known point-of-interest category information is propagated back through the deep network of the model:

$$\mathcal{L} = -\sum_{i=1}^{N} \sum_{c=1}^{C} y_{ic} \log \hat{y}_{ic},$$

where $N$ denotes the number of points of interest in the dataset, $C$ denotes the number of categories, $y_{ic}$ indicates whether the point of interest $p_i$ is labeled as category $c$, and $\hat{y}_{ic}$ is the predicted probability that $p_i$ is labeled as category $c$.

In a second aspect, the present invention provides a system for labeling semantic points of interest based on spatial and semantic neighbor information, comprising:

the spatial encoder module is used for mining the spatial information of the interest points based on the graph convolution neural network;

the text encoder module based on the attention mechanism is used for adjusting the attention weight of the neighbors through the attention mechanism, acquiring semantic neighbor information of the interest points and enhancing the semantic information of the interest points;

and the multi-view fusion module is used for fusing a plurality of feature vectors to carry out semantic annotation on the interest points.

If data mining is performed using only the text information (point-of-interest name) and the spatial information (point-of-interest longitude and latitude) of the points of interest in order to label them semantically, the difficulties include the following: it is difficult to infer the category of a point of interest directly from its coordinates, and how to use the spatial information effectively to understand the environment and context of a point of interest is a challenging problem; and although the names of points of interest contain category information to some extent, they are short texts, so their semantic information is sparse and insufficient to reflect the actual categories. In the present invention, in terms of spatial information, category embedding is first performed on the points of interest; all points of interest are then constructed into a spatial graph by Delaunay triangulation (DT), in which the points of interest are the nodes and the category embeddings are the node features; finally, the neighbor information of the points of interest is captured by a graph convolutional neural network, and the spatial features of the points of interest are learned to obtain spatial feature vectors. In terms of semantic information, the name of each point of interest is first split into a character sequence and a word sequence; character-level and word-level pre-trained text vectors are obtained from the Tencent pre-trained word vectors and concatenated to form the pre-trained text vector of the point of interest; the k nearest semantic neighbors are then found according to the pre-trained text vectors of the points of interest, and semantic neighbor information is captured with an attention mechanism to obtain text feature vectors. Finally, the spatial feature vector, the text feature vector and the pre-trained text vector are concatenated and fed into a multi-layer neural network for semantic labeling of the points of interest. The method thereby addresses the above difficulties.

One or more embodiments of the present invention achieve at least the following technical effects:

1. the invention provides an interest point semantic annotation model for information enhancement based on spatial neighbors and semantic neighbors, which is used for mining feature representation of interest points by utilizing the spatial neighbor information and the semantic neighbor information of the interest points;

2. the data used by the method is easy to acquire: only the text (name) and the position (longitude and latitude) of each point of interest are considered, no additional information that is difficult to acquire is used, and the method is therefore more widely applicable;

3. according to the method, the spatial information of the interest points is extracted by constructing the graph structure, so that the information transfer between the interest points can be simulated more naturally, and the robustness is better;

4. according to the method, the interest point semantic neighbor information is extracted through the attention mechanism, so that the problem of sparseness of semantic information can be solved well.

Drawings

The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the invention.

Fig. 1 is a model diagram of the present invention.

Fig. 2 is a general frame diagram in an embodiment of the present invention.

Fig. 3 is a diagram of a model of spatial feature vector acquisition in accordance with an embodiment of the present invention.

Fig. 4 is a diagram of a model of text feature vector acquisition in an embodiment of the present invention.

Detailed Description

It should be noted that the following detailed description is illustrative and is intended to provide further explanation of the invention. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.

It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of exemplary embodiments according to the present invention. As used herein, the singular is also intended to include the plural unless the context clearly indicates otherwise, and furthermore, it is to be understood that the terms "comprises" and/or "comprising" when used in this specification are taken to specify the presence of stated features, steps, operations, devices, components, and/or combinations thereof.

As described above, the interest point semantic annotation method in the prior art has the problems of high data cost, difficult acquisition, easy invasion of privacy, long time consumption, inaccuracy and the like. Based on the above, the invention provides a method and a system for labeling interest point semantics based on space and semantic neighbor information, as shown in fig. 1 and fig. 2.

In a first aspect, the invention provides a method for labeling interest points semantically based on space and semantic neighbor information, which specifically comprises the following steps:

Step 1: using a one-hot encoded vector as the category embedding of each point of interest, constructing all points of interest into a graph structure by Delaunay triangulation according to their longitude and latitude information (taking the longitude 116.341878 degrees and the latitude 40.030869 degrees of the Boer in FIG. 2 as an example), taking the points of interest as graph nodes, initializing the node features as the category embeddings of the points of interest, and initializing each node without a category label as the average of the embedding vectors of its k nearest nodes that have category labels.

The category of a point of interest reflects its semantics to a large extent. The present invention selects a simple one-hot encoded representation of the category as the initial node feature because it provides intuitive semantic label information. That is, the category embedding is expressed as a one-hot vector $\mathbf{e}_c \in \{0,1\}^{C}$, where the length of $\mathbf{e}_c$ equals the number of categories $C$ and, in the representation $\mathbf{e}_c$ of the $c$-th category, only the $c$-th element is 1.
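The one-hot category embedding can be illustrated with a short Python sketch. The category list `CATEGORIES` and the function name `one_hot` are hypothetical names chosen for this example only.

```python
from typing import List


def one_hot(category_index: int, num_categories: int) -> List[float]:
    """Return a vector of length num_categories with a single 1 at category_index."""
    if not 0 <= category_index < num_categories:
        raise ValueError("category_index out of range")
    vec = [0.0] * num_categories
    vec[category_index] = 1.0
    return vec


# Hypothetical category vocabulary; a real dataset defines its own.
CATEGORIES = ["restaurant", "hotel", "school"]
embedding = one_hot(CATEGORIES.index("hotel"), len(CATEGORIES))
```

Each labeled node of the graph would carry one such vector as its initial feature.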

After the category embeddings of the points of interest are obtained, the spatial context of the points of interest is modeled. According to the first law of geography, points of interest in close spatial proximity are strongly related. The points of interest around a given point of interest often share its category; for example, restaurants tend to be grouped together, and certain points of interest are often surrounded by other points of interest. Likewise, even when some points of interest are not clustered together, the distributions of the categories around them may be similar. Based on this idea, as shown in Fig. 3, the invention first uses Delaunay triangulation to construct all points of interest into a graph structure according to their longitude and latitude, where the points of interest are nodes, the category embeddings are node features, and each node without a category label is characterized by the mean of the category embeddings of its k nearest surrounding nodes that have label information. The following formula is used as the distance between nodes $v_i$ and $v_j$ in the graph structure:

，
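The initialization of a node without a category label, described above as the mean of the category embeddings of its k nearest labeled nodes, might be sketched as follows. This sketch assumes plain Euclidean distance on (longitude, latitude) pairs to decide which nodes are "nearest"; the text does not state which distance is used for this step, and the function name `init_unlabeled_feature` is hypothetical.

```python
import heapq
import math
from typing import List, Tuple

Point = Tuple[float, float]  # (longitude, latitude)


def init_unlabeled_feature(
    target: Point,
    labeled: List[Tuple[Point, List[float]]],
    k: int,
) -> List[float]:
    """Average the category embeddings of the k labeled nodes nearest to target.

    Nearness is measured with Euclidean distance on the coordinates,
    which is an assumption of this sketch.
    """
    if not labeled:
        raise ValueError("at least one labeled node is required")
    nearest = heapq.nsmallest(k, labeled, key=lambda item: math.dist(target, item[0]))
    dim = len(nearest[0][1])
    return [sum(emb[d] for _, emb in nearest) / len(nearest) for d in range(dim)]


labeled_nodes = [
    ((0.0, 0.0), [1.0, 0.0]),
    ((0.0, 1.0), [0.0, 1.0]),
    ((5.0, 5.0), [0.0, 1.0]),
]
feature = init_unlabeled_feature((0.0, 0.4), labeled_nodes, k=2)
```

The averaged vector is no longer one-hot; it acts as a soft estimate of the category distribution around the unlabeled node.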

Step 2: the point-of-interest name is split into a character sequence and a word sequence; the pre-trained vectors of the characters and of the words in the two sequences are looked up; the vectors in each sequence are summed and averaged to obtain a character-level vector and a word-level vector; and the two vectors are concatenated to obtain the pre-trained text vector of the point of interest. The distance between each point of interest without a category label and the points of interest with category labels is then calculated by the cosine distance formula from the pre-trained text vectors, and the k nearest semantic neighbors are found by heap sort.

Since points of interest with similar semantic information also tend to have similar categories, the invention enhances the information of a point of interest by extracting its semantic neighbor information. First, the semantic neighbors of a point of interest are found according to its name: the name is split into a character sequence and a word sequence; using the pre-trained embedding corpus of the Tencent AI Lab, the corresponding pre-trained vectors in each sequence are summed and averaged to obtain the character-level and word-level pre-trained text vectors of the point of interest; and the two vectors are concatenated to obtain the pre-trained text vector of the point of interest. The pre-trained text vector is then used to calculate, by the cosine distance formula, the distance between the point of interest and the points of interest with category labels, and the k nearest semantic neighbors are found by heap sort.
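The neighbor search can be sketched with Python's standard `heapq` module, which keeps a bounded heap instead of sorting all candidates. The sketch assumes that "nearest" means highest cosine similarity (equivalently, smallest value of 1 minus the similarity); the identifiers `poi_a`, `poi_b`, `poi_c` and the 2-dimensional vectors are made up for illustration.

```python
import heapq
import math
from typing import Dict, List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot product divided by the product of the vector norms (0.0 for zero vectors)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def k_semantic_neighbors(
    query: Sequence[float],
    labeled: Dict[str, List[float]],
    k: int,
) -> List[str]:
    """Return the ids of the k labeled POIs most similar to the query vector."""
    return heapq.nlargest(k, labeled, key=lambda pid: cosine_similarity(query, labeled[pid]))


# Hypothetical pre-trained text vectors of labeled POIs.
labeled_vectors = {
    "poi_a": [1.0, 0.0],
    "poi_b": [0.9, 0.1],
    "poi_c": [0.0, 1.0],
}
neighbors = k_semantic_neighbors([1.0, 0.0], labeled_vectors, k=2)
```

Only points of interest with category labels are searched, since their category embeddings are what the attention step later aggregates.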

Step 3: the graph structure is fed into a spatial encoder based on a graph convolutional neural network to obtain the spatial feature vector of each point of interest, and the name of the point of interest, the names of its neighbor points of interest and their category embeddings are input into an attention-based text encoder to obtain the text feature vector. The vectors of the two views are concatenated by the multi-view fusion module, the pre-trained text vector is concatenated with the fused feature vector, and the probability of each point-of-interest category is predicted by a classifier. The parameters of the model are learned by training, i.e. by minimizing the loss function.

Spatial features are extracted using a one-layer graph convolution neural network:

, where $A$ is the adjacency matrix of the graph structure (the invention adds no self-loops to the adjacency matrix, so that a node's own category information is not mixed in when the feature information of its neighbors is acquired), $D$ is the degree matrix of the adjacency matrix, $X$ is the initial feature matrix of the nodes, $\theta$ is a learnable linear transformation applied to each node, $\mathrm{softmax}(\cdot)$ is the normalized exponential function, and $H$ is the acquired spatial feature vector.
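To make the role of each symbol concrete, the following pure-Python sketch runs one graph-convolution layer on a three-node path graph. It assumes row normalization of the adjacency matrix by the node degree (that is, $D^{-1}A$) and a row-wise softmax; the exact normalization is not recoverable from this text, so this choice is an assumption, as are the helper names `matmul`, `softmax` and `gcn_layer`.

```python
import math
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix product of a (n x m) and b (m x p)."""
    return [
        [sum(a[i][t] * b[t][j] for t in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def softmax(row: List[float]) -> List[float]:
    """Numerically stable normalized exponential of one row."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def gcn_layer(adj: Matrix, feats: Matrix, theta: Matrix) -> Matrix:
    """One layer computing softmax(D^-1 A X theta) with no self-loops.

    The D^-1 row normalization is an assumption of this sketch.
    """
    normalized = []
    for row in adj:
        degree = sum(row)
        normalized.append([v / degree if degree else 0.0 for v in row])
    hidden = matmul(matmul(normalized, feats), theta)
    return [softmax(r) for r in hidden]


# Path graph 0 - 1 - 2, one-hot features over two categories, identity transform.
A = [[0.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 0.0]]
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
THETA = [[1.0, 0.0], [0.0, 1.0]]
H = gcn_layer(A, X, THETA)
```

Because the adjacency matrix has no self-loops, node 0 (category 0) receives only the category of its single neighbor, node 1, so its output leans toward category 1 rather than its own label.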

After the semantic neighbors of the points of interest are obtained, the points of interest are semantically enhanced through an attention mechanism. As shown in Fig. 4, the name of the point of interest and the names of its semantic neighbors are input into a BERT encoder to obtain the text feature $\mathbf{t}_i$ of the $i$-th point of interest $p_i$ and the text features $\mathbf{t}_{i1}, \dots, \mathbf{t}_{ik}$ of its $k$ semantic neighbors; these text features are finer-grained. The distance between the text feature of the point of interest and each neighbor text feature is calculated by the cosine distance formula:

$$d_{ij} = \frac{\mathbf{t}_i \cdot \mathbf{t}_{ij}}{\lVert \mathbf{t}_i \rVert \, \lVert \mathbf{t}_{ij} \rVert},$$

where $\mathbf{t}_{ij}$ denotes the text feature of the $j$-th semantic neighbor of $p_i$.

The cosine distance vector $\mathbf{d}_i = (d_{i1}, \dots, d_{ik})$ between the point of interest $p_i$ and all of its semantic neighbors is then input into a fully connected layer to obtain the latent vector representation $\mathbf{z}_i$:

$$\mathbf{z}_i = W \mathbf{d}_i + \mathbf{b},$$

where $W$ and $\mathbf{b}$ denote the trained weight and bias, respectively;

the attention weight between the point of interest $p_i$ and its $j$-th semantic neighbor point of interest $p_{ij}$ is then normalized with a softmax function:

$$\alpha_{ij} = \frac{\exp(z_{ij})}{\sum_{j'=1}^{k} \exp(z_{ij'})},$$

where $z_{ij}$ denotes the distance between the point of interest $p_i$ and the neighbor point of interest $p_{ij}$ after transformation by one MLP layer, and $\alpha_{ij}$ is the attention weight between the point of interest $p_i$ and its $j$-th neighbor point of interest $p_{ij}$;

finally, the text feature vector of the point of interest is obtained based on the attention mechanism:

$$\mathbf{h}_i^{t} = \sum_{j=1}^{k} \alpha_{ij} \, \mathbf{e}_{ij},$$

where $\mathbf{e}_{ij}$ denotes the category embedding vector of the neighbor point of interest $p_{ij}$.
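The attention step (fully connected layer, softmax normalization, weighted sum of neighbor category embeddings) can be sketched in plain Python as below. The weight matrix, bias and input values are toy numbers chosen for illustration, and the function name `attention_text_feature` is hypothetical; in the method described here the weight and bias are learned during training and the distances come from BERT text features.

```python
import math
from typing import List, Tuple


def attention_text_feature(
    distances: List[float],
    neighbor_embeddings: List[List[float]],
    weight: List[List[float]],
    bias: List[float],
) -> Tuple[List[float], List[float]]:
    """Return (attention weights, text feature vector) for one POI.

    distances: cosine values between the POI and its k semantic neighbors.
    neighbor_embeddings: category embeddings of those k neighbors.
    weight (k x k) and bias (k): parameters of the fully connected layer.
    """
    k = len(distances)
    # Fully connected layer: z = W d + b
    z = [sum(weight[j][t] * distances[t] for t in range(k)) + bias[j] for j in range(k)]
    # Softmax over the k neighbors (shifted by the maximum for stability)
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    alpha = [e / total for e in exps]
    # Attention-weighted sum of the neighbor category embeddings
    dim = len(neighbor_embeddings[0])
    feature = [
        sum(alpha[j] * neighbor_embeddings[j][d] for j in range(k)) for d in range(dim)
    ]
    return alpha, feature


# Toy example with k = 2 neighbors and an identity layer (illustrative values only).
alpha, text_feature = attention_text_feature(
    distances=[0.9, 0.1],
    neighbor_embeddings=[[1.0, 0.0], [0.0, 1.0]],
    weight=[[1.0, 0.0], [0.0, 1.0]],
    bias=[0.0, 0.0],
)
```

With one-hot neighbor embeddings, the resulting text feature vector is a probability-like mixture over categories in which the neighbor with the larger transformed score contributes more.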

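The inventors explored several fusion methods and finally selected concatenation as the fusion mode. The invention concatenates the spatial feature vector $\mathbf{h}_i^{s}$ and the text feature vector $\mathbf{h}_i^{t}$ to fuse the two views, and then concatenates the pre-trained text vector $\mathbf{u}_i$ to the fused result to obtain the vector $\mathbf{f}_i$, which serves as the input to the final multi-layer MLP:

$$\hat{\mathbf{y}}_i = \mathrm{softmax}\big(W_3 (W_2 (W_1 \mathbf{f}_i + \mathbf{b}_1) + \mathbf{b}_2) + \mathbf{b}_3\big),$$

where $\hat{\mathbf{y}}_i$ denotes the probability that the point of interest $p_i$ belongs to each point-of-interest category, and $W_1$, $W_2$, $W_3$ and $\mathbf{b}_1$, $\mathbf{b}_2$, $\mathbf{b}_3$ denote the weights and biases of layers 1, 2 and 3 of the MLP classifier.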
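A minimal sketch of the fusion and classification step follows. The three views are concatenated and passed through three affine layers with a softmax on the last one. No hidden-layer activation is applied here because the text names only softmax as the last-layer activation; a practical implementation may well insert a nonlinearity between layers, so treat that as an assumption. All weights and input vectors are toy values, and `affine`, `softmax` and `classify` are hypothetical helper names.

```python
import math
from typing import List, Sequence, Tuple

Layer = Tuple[List[List[float]], List[float]]  # (weight matrix, bias vector)


def affine(weight: List[List[float]], bias: List[float], x: Sequence[float]) -> List[float]:
    """Compute W x + b for one layer."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weight, bias)]


def softmax(values: List[float]) -> List[float]:
    """Numerically stable normalized exponential."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def classify(
    spatial: List[float],
    text: List[float],
    pretrained: List[float],
    layers: List[Layer],
) -> List[float]:
    """Concatenate the three vectors and apply the MLP classifier."""
    hidden: List[float] = spatial + text + pretrained  # concatenation (splicing)
    for weight, bias in layers:
        hidden = affine(weight, bias, hidden)
    return softmax(hidden)


# Toy parameters: 5 inputs -> 3 -> 3 -> 2 categories (illustrative values only).
layers: List[Layer] = [
    ([[1.0, 0.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0, 1.0]],
     [0.0, 0.0, 0.0]),
    ([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]], [0.0, 0.0, 0.0]),
    ([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]], [0.0, 0.0]),
]
probs = classify([0.2, 0.8], [0.5, 0.5], [1.0], layers)
```

Concatenation keeps each view's dimensions separate, leaving it to the first MLP layer to learn how the views should be weighted against each other.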

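Finally, the loss function (cross-entropy loss) is minimized using the point-of-interest categories, so that the known point-of-interest category information is propagated back through the deep network:

$$\mathcal{L} = -\sum_{i=1}^{N} \sum_{c=1}^{C} y_{ic} \log \hat{y}_{ic},$$

where $N$ denotes the number of points of interest in the dataset, $C$ denotes the number of categories, $y_{ic}$ indicates whether the point of interest $p_i$ is labeled as category $c$, and $\hat{y}_{ic}$ is the predicted probability that $p_i$ is labeled as category $c$.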
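The cross-entropy loss can be computed directly from one-hot labels and predicted probabilities, as in the sketch below. It sums over all points of interest without dividing by their number, which is an assumption; averaging would only rescale the loss. The clipping constant `eps` and the function name `cross_entropy` are likewise choices made for this example.

```python
import math
from typing import List


def cross_entropy(
    labels: List[List[float]],
    probs: List[List[float]],
    eps: float = 1e-12,
) -> float:
    """Sum of -y_ic * log(p_ic) over all POIs i and categories c.

    Probabilities are clipped at eps to avoid log(0).
    """
    return -sum(
        y * math.log(max(p, eps))
        for label_row, prob_row in zip(labels, probs)
        for y, p in zip(label_row, prob_row)
    )


# Two POIs, two categories (illustrative values only).
y_true = [[1.0, 0.0], [0.0, 1.0]]
y_pred = [[0.5, 0.5], [0.25, 0.75]]
loss = cross_entropy(y_true, y_pred)
```

Only the predicted probability of the true category contributes for each point of interest, since the other label entries are zero.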

Step 4: when predicting the categories of urban points of interest, the graph structure of the urban points of interest, the category embeddings of the points of interest with category labels, the point-of-interest names, the category embeddings of the semantic neighbors of the points of interest, and the pre-trained text vectors are input, and the point-of-interest categories are predicted.

The model of the invention was run on two real datasets, and the performance of the method and of the comparison methods was evaluated with three metrics: accuracy (Accuracy), macro F1 (Macro-F1) and mean reciprocal rank (MRR). The comparison methods include word-based text features (WTF), attention-based text features (ATF), grid-based spatial features (GSF), pre-trained semantic embedding of global GPS coordinates (GPS2Vec), an integrated POI hierarchical classification framework (EHC), and word-based text features combined with pre-trained semantic embedding of global GPS coordinates (WTF+GPS2Vec). MRR is computed from the generated ranked list of predicted category labels by looking up the rank of the true label in that list:

$$\mathrm{MRR} = \frac{1}{|T|} \sum_{i=1}^{|T|} \frac{1}{\mathrm{rank}_i},$$

where $|T|$ denotes the size of the test set $T$ and $\mathrm{rank}_i$ denotes the rank of the true category of the $i$-th test sample among the predicted categories. The evaluation results for all metrics are shown in Table 1. It can be seen that the invention achieves a significant performance improvement over the other methods.

Table 1 evaluation results of various methods

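As an illustration of the MRR metric used in the evaluation, the sketch below computes it from ranked prediction lists. The category names and rankings are invented for the example and are not results from the invention; `mean_reciprocal_rank` is a hypothetical function name.

```python
from typing import List


def mean_reciprocal_rank(true_labels: List[str], ranked_predictions: List[List[str]]) -> float:
    """Average of 1 / rank of the true label in each ranked prediction list.

    Ranks start at 1; every true label is assumed to appear in its list.
    """
    total = 0.0
    for truth, ranking in zip(true_labels, ranked_predictions):
        rank = ranking.index(truth) + 1
        total += 1.0 / rank
    return total / len(true_labels)


# Two test POIs: the first is ranked correctly at position 1, the second at position 3.
truths = ["hotel", "school"]
rankings = [
    ["hotel", "school", "restaurant"],
    ["restaurant", "hotel", "school"],
]
mrr = mean_reciprocal_rank(truths, rankings)
```

Unlike accuracy, MRR gives partial credit when the true category is ranked near the top without being the first prediction.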

The above description is only of the preferred embodiments of the present invention and is not intended to limit the present invention, but various modifications and variations can be made to the present invention by those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims

1. The interest point semantic labeling method based on the space and semantic neighbor information is characterized by comprising the following steps of:

step 1: using a one-hot vector as the category embedding of each point of interest, constructing a graph structure from the longitude and latitude information of the points of interest by Delaunay triangulation, taking the points of interest as graph nodes, initializing the node features as the category embeddings of the points of interest, and initializing each node without a category label as the average of the embedding vectors of its k nearest nodes that have category labels;

step 2: obtaining a pre-trained text vector of each point of interest, calculating the distance between each point of interest without a category label and the points of interest with category labels by the cosine distance formula, and finding its k nearest semantic neighbors by heap sort;

step 3: obtaining a spatial feature vector and a text feature vector of each point of interest, concatenating them into a fused feature vector, concatenating the fused feature vector with the pre-trained text vector, predicting the probability of each point-of-interest category with a classifier, and learning the parameters of the model by training, i.e. by minimizing the loss function;

step 4: when predicting the categories of urban points of interest, inputting the graph structure of the urban points of interest, the category embeddings of the points of interest with category labels, the point-of-interest names, the category embeddings of the semantic neighbors of the points of interest, and the pre-trained text vectors, so as to predict the point-of-interest categories;

in step 3, the fused feature vector and the pre-trained text vector are concatenated as follows: the spatial feature vector $\mathbf{h}_i^{s}$ and the text feature vector $\mathbf{h}_i^{t}$ are concatenated to fuse the two views, and the pre-trained text vector $\mathbf{u}_i$ is then concatenated to the fused result to obtain the vector $\mathbf{f}_i$, which serves as the input to the final multi-layer MLP:

$$\hat{\mathbf{y}}_i = \mathrm{softmax}\big(W_3 (W_2 (W_1 \mathbf{f}_i + \mathbf{b}_1) + \mathbf{b}_2) + \mathbf{b}_3\big),$$

where $\hat{\mathbf{y}}_i$ denotes the probability that the point of interest $p_i$ belongs to each point-of-interest category; $W_1$, $W_2$, $W_3$ denote the weights of layers 1, 2 and 3 of the MLP classifier and $\mathbf{b}_1$, $\mathbf{b}_2$, $\mathbf{b}_3$ the corresponding biases; $\mathbf{f}_i$ is the view-fused vector concatenated with the pre-trained text vector; and $\mathrm{softmax}(\cdot)$ is the normalized exponential function, used as the activation function of the last layer of the MLP classifier;

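finally, the loss function is minimized using the point-of-interest categories, so that the known point-of-interest category information is propagated back through the deep network:

$$\mathcal{L} = -\sum_{i=1}^{N} \sum_{c=1}^{C} y_{ic} \log \hat{y}_{ic},$$

where $N$ denotes the number of points of interest in the dataset, $C$ denotes the number of categories, $y_{ic}$ indicates whether the point of interest $p_i$ is labeled as category $c$, and $\hat{y}_{ic}$ is the predicted probability that $p_i$ is labeled as category $c$.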

2. The interest point semantic labeling method based on space and semantic neighbor information according to claim 1, wherein in step 1, the category embedding is expressed as a one-hot vector $\mathbf{e}_c \in \{0,1\}^{C}$, where the length of $\mathbf{e}_c$ equals the number of categories $C$ and, in the representation $\mathbf{e}_c$ of the $c$-th category, only the $c$-th element is 1.

3. The interest point semantic labeling method based on space and semantic neighbor information according to claim 1, wherein in step 1, the distance between nodes $v_i$ and $v_j$ in the graph structure is expressed as:

，

4. The method for labeling the interest points semantically based on the space and semantic neighbor information according to claim 1, wherein in step 2, the pre-trained text vector of the point of interest is obtained as follows: the point-of-interest name is split into a character sequence and a word sequence; the pre-trained vectors of the characters and of the words in the two sequences are looked up; the vectors in each sequence are summed and averaged to obtain a character-level vector and a word-level vector; and the two vectors are concatenated to obtain the pre-trained text vector of the point of interest.

5. The interest point semantic labeling method based on space and semantic neighbor information according to claim 1, wherein in step 3, a graph structure is sent to a spatial encoder based on a graph convolution neural network to obtain a spatial feature vector of the interest point.

6. The interest point semantic labeling method based on space and semantic neighbor information according to claim 5, wherein the specific way of obtaining the spatial feature vector of the interest point is as follows: extracting spatial feature vectors by using a layer of graph convolution neural network:

，

where $A$ is the adjacency matrix of the graph structure, $D$ is the degree matrix of the adjacency matrix, $X$ is the initial feature matrix of the nodes, $\theta$ is a learnable linear transformation applied to each node, $\mathrm{softmax}(\cdot)$ is the normalized exponential function, and $H$ is the acquired spatial feature vector.

7. The method for labeling the interest points semantically based on the space and semantic neighbor information according to claim 1, wherein in step 3, the name of the point of interest, the names of its neighbor points of interest and their category embeddings are input into an attention-based text encoder to obtain the text feature vector.

8. The interest point semantic labeling method based on space and semantic neighbor information according to claim 7, wherein the text feature vector is obtained as follows: the name of the point of interest and the names of its semantic neighbors are input into a BERT encoder to obtain the text feature $\mathbf{t}_i$ of the $i$-th point of interest $p_i$ and the text features $\mathbf{t}_{i1}, \dots, \mathbf{t}_{ik}$ of its $k$ semantic neighbors, and the distance between the text feature of the point of interest and each neighbor text feature is calculated by the cosine distance formula:

$$d_{ij} = \frac{\mathbf{t}_i \cdot \mathbf{t}_{ij}}{\lVert \mathbf{t}_i \rVert \, \lVert \mathbf{t}_{ij} \rVert};$$

the cosine distance vector $\mathbf{d}_i = (d_{i1}, \dots, d_{ik})$ between the point of interest $p_i$ and all of its semantic neighbors is then input into a fully connected layer to obtain the latent vector representation $\mathbf{z}_i$:

$$\mathbf{z}_i = W \mathbf{d}_i + \mathbf{b},$$

where $W$ and $\mathbf{b}$ denote the trained weight and bias, respectively;

，

where $\mathbf{e}_{ij}$ denotes the category embedding vector of the neighbor point of interest $p_{ij}$.