US12099806B2 - Place recognition method based on knowledge graph inference - Google Patents
- Publication number
- US12099806B2 (U.S. application Ser. No. 17/701,137)
- Authority
- US
- United States
- Prior art keywords
- place
- knowledge graph
- description
- inference
- entity
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active, expires
Classifications
- G06F16/3344—Query execution using natural language analysis
- G06F16/3346—Query execution using probabilistic model
- G06F16/35—Clustering; Classification
- G06F16/367—Ontology
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2415—Classification based on parametric or probabilistic models, e.g. likelihood ratio
- G06F40/216—Parsing using statistical methods
- G06F40/268—Morphological analysis
- G06F40/295—Named entity recognition
- G06F40/30—Semantic analysis
- G06F40/44—Statistical methods, e.g. probability models
- G06N3/042—Knowledge-based neural networks; logical representations of neural networks
- G06N3/044—Recurrent networks, e.g. Hopfield networks
- G06N3/0442—Recurrent networks characterised by memory or gating, e.g. LSTM or GRU
- G06N3/045—Combinations of networks
- G06N3/047—Probabilistic or stochastic networks
- G06N3/08—Learning methods
- G06N3/09—Supervised learning
- G06N5/02—Knowledge representation; symbolic representation
- G06N5/022—Knowledge engineering; knowledge acquisition
- G06N5/04—Inference or reasoning models
Definitions
- the present disclosure relates to a place recognition method based on knowledge graph inference, which belongs to the technical field of artificial intelligence and knowledge graphs.
- Place perception refers to automatically processing and analyzing environmental information such as vision, sound, distance, and natural language by means of artificial intelligence, and determining and recognizing the specific place semantics (e.g., kitchen, street) that the environment carries. Place perception not only helps in understanding the overall semantic content of environmental information, but also provides a basis for place-related human-computer interaction tasks. Place recognition is therefore important both for automatic understanding of the environment by an intelligent device and for raising the intelligence level of human-computer interaction.
- Current place recognition technologies mostly use images or distances (measured by infrared, ultrasound, etc.) as recognition clues, and train a Deep Neural Network (DNN) model on a huge quantity of samples so that the network can output the place category corresponding to the environmental information.
- Such methods mainly have the following shortcomings: 1. Different model designs are required for different information source types, so heterogeneous information cannot be integrated; unified inference is thus lacking and recognition accuracy cannot be ensured. 2. The DNN is an end-to-end model and therefore produces no intermediate results of the inference process, so a large number of semantic cues related to the place understanding task are lost.
- A knowledge graph is a semantic network that can explicitly reveal relationships between pieces of knowledge and formally describe all kinds of things and their interrelations.
- This technology helps knowledge in relevant fields to be created, shared, updated, and inferred over, and to be effectively understood directly by people.
- However, current knowledge graphs are all constructed independently by different users for their own application fields, and construction and inference methods for knowledge graphs targeted at the place field are still absent. There is therefore an urgent need for a novel technical solution to the foregoing technical problems.
- The present disclosure provides a place recognition method based on knowledge graph inference, which integrates environmental information of various places by means of knowledge graph technology. It can effectively solve the problem of the low recognition rate of recognition methods based on homogeneous information, and can further enrich the semantics of inference results, thus improving human-computer interaction and other place-related intelligent tasks.
- a place recognition method based on knowledge graph inference is provided, which includes the following steps:
- the acquisition of the basic semantic data in step 1) includes the following sub-steps:
- the generation of the place description entities in step 2) includes the following sub-steps:
- the construction of the place knowledge graph in step 3) includes the following sub-steps:
- performing normalization by p_{i,j} = f_{i,j} / Σ_i f_{i,j} to calculate the probability value, and thus constructing the place knowledge graph, where a basic triple structure thereof is "description entities-place categories-probability values", specifically expressed as: the i-th description entity-place category j-occurrence probability p_{i,j}; in addition, triples corresponding to probability values p_{i,j} < 10^−2 are not recorded in the knowledge graph, and corresponding modification or deletion is also synchronously made in the description entity dictionary in step 2); moreover, two new entities, "placeholder" and "unknown character", are added to the description entity dictionary in step 2), where the former has no semantic concept and is used only for data padding in the inference model, and the latter stands for semantic data acquired in step 1) that is not stored in the description entity dictionary in step 2), indicating that the entity concept is unknown.
- the inference from the place knowledge graph in step 4) includes the following sub-steps:
- The description entity dictionary includes two sets: an object set and an action-state set. Elements of the object set are words corresponding to real objects; elements of the action-state set are words corresponding to interactions between humans and objects or between humans, and to certain human states or occurring events. Other semantic words are not included in the description entity dictionary.
- the DNN inference model has the following structure or steps:
- the neural network structure at least includes: an embedded vector fully connected layer, used for realizing mapping from a one-hot code to a dense vector; a recurrent neural network or its variant structure, used for realizing integration and fusion of the set of “description entities-probability values”; and a softmax layer, used for calculating a classification probability of place categories.
- The training process for optimizing the inference model at least includes: a cross-entropy loss function, used to improve model classification performance; and a triplet loss function, used to improve the vector representation capability of the description entities, so that the Euclidean distance between word embedding vectors of description entities corresponding to places of the same category is as small as possible, and the Euclidean distance between word embedding vectors of description entities corresponding to places of different categories is as large as possible.
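As a concrete illustration of the triplet loss just described, the following is a minimal pure-Python sketch for a single (anchor, positive, negative) triple of embedding vectors. The function name, the margin value of 0.2, and the use of squared Euclidean distance are illustrative assumptions, not the patent's exact formulation.

```python
def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge-style triplet loss on one triple of embedding vectors.

    Embeddings of description entities from same-category places
    (anchor, positive) are pulled together, while embeddings from
    different-category places (negative) are pushed at least `margin`
    farther away. Squared Euclidean distance is used here (assumption).
    """
    d_pos = sum((a - p) ** 2 for a, p in zip(anchor, positive))
    d_neg = sum((a - n) ** 2 for a, n in zip(anchor, negative))
    return max(d_pos - d_neg + margin, 0.0)
```

When the negative is already far from the anchor the loss is zero, so only violating triples contribute to training.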
- The present disclosure provides a place recognition method based on knowledge graph inference. It first gives a construction method for a place knowledge graph, solving the current absence of knowledge graphs in the place recognition and understanding field. Secondly, it addresses problems of current place recognition methods such as low recognition accuracy, poor semantic interpretability, inability to visualize the inference process, and lack of comprehensive inference over multi-source, heterogeneous information.
- the knowledge graph in the place field can provide engineering foundation for intelligent tasks of intelligent robots, such as task planning and decomposition, human-robot interaction, and speech understanding.
- the method provided by the present disclosure has simple steps, is easy to implement, and can achieve a good place recognition effect.
- FIG. 1 is a schematic framework diagram of a place recognition method based on knowledge graph inference of the present disclosure
- FIG. 2 is a diagram of a DNN model for knowledge graph inference
- FIG. 3 is a schematic diagram of a visualized place knowledge graph (a part) of the present disclosure.
- The framework of the place recognition method based on knowledge graph inference provided by the present disclosure is shown in FIG. 1 and includes a training process and an inference process. As shown in FIG. 1, the training process mainly includes the following four steps:
- the inference process mainly includes the following four steps:
- The place information data used in the experiment of the present disclosure comes from a large-scale scene image database established by J. Xiao et al. (SUN dataset: https://vision.cs.princeton.edu/projects/2010/SUN/, accessed 2020 Nov. 25; the corresponding literature is "SUN database: Large-scale scene recognition from abbey to zoo", Xiao J, Hays J, Ehinger K A, et al., Computer Vision & Pattern Recognition, IEEE, 2010). This database contains about 100,000 RGB images in 397 categories, and each scene contains at least 100 image samples; about 16,000 images have been manually annotated with English words marking the main items they contain.
- This experiment selects images of 14 categories of indoor places for experimental verification, and reference can be made to Table 1 for the specific categories of the places and the numbers of corresponding samples. Because the numbers of samples of different place categories are different, test samples are randomly selected from samples corresponding to each place category, where the selected samples account for 10% of a total of the samples corresponding to this place category, and the remaining samples are used as training samples.
- This experiment uses the recognition rate as the evaluation metric.
- A set of annotated pairs {(d_i, l) | d_i ∈ D, l ∈ L, i = 1, 2, …, n} is formed, where D denotes the natural language knowledge used by humans to describe places, and L denotes all place categories that can be recognized by the knowledge graph.
- This set participates in the following inference process as the basic semantic data.
- the basic semantic data is preprocessed by using natural language processing methods.
- the specific steps are described below with reference to specific instances:
- With reference to the place description entity set obtained in step 1.2, a place knowledge graph is constructed according to the following steps:
- The normalization p_{i,j} = f_{i,j} / Σ_i f_{i,j} is applied to obtain the entity occurrence probability value p_{i,j}.
- the inference process has two parts: inference model training and inference model test, where a basic structure of the inference model is shown by FIG. 2 .
- this neural network model is merely an experimental preferred result of the present disclosure and should not be construed as limiting the present disclosure.
- Other inference models or methods shall also be regarded as falling within the scope of the present disclosure.
- the neural network model is formed by an input layer, a word embedding unit, a bi-gated network layer, a fully connected layer, a fusion layer, and a classification layer.
- the description entities and the probability values p i,j in the knowledge graph constitute the input layer.
- The description entities and the place categories are denoted by a one-hot code vector w_i, in which the position corresponding to the entry's index in the entity dictionary is 1 and all other positions are 0.
- The word embedding unit is a lookup table implemented as a fully connected layer that maps the one-hot code vector to a dense real-number vector, referred to as an embedding vector. The input dimension of this fully connected layer is the dictionary capacity, and its output dimension is set manually to a value smaller than the dictionary capacity. In this experiment, the dictionary capacity is 412 and the dimension of the embedding vector is 256.
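The lookup behavior can be sketched as follows: multiplying a one-hot vector by the embedding weight matrix selects exactly one row, which is why the fully connected layer acts as a lookup table. The tiny dictionary size and embedding dimension below are illustrative stand-ins for the 412 and 256 used in the experiment.

```python
def one_hot(index, size):
    # one-hot code vector: 1 at the entry's dictionary position, 0 elsewhere
    v = [0.0] * size
    v[index] = 1.0
    return v

def embed(one_hot_vec, table):
    # matrix product of a one-hot row vector with the weight table;
    # equivalent to selecting row `index` of the table
    dim = len(table[0])
    return [sum(x * table[i][d] for i, x in enumerate(one_hot_vec))
            for d in range(dim)]

# toy embedding table: 4-word dictionary, 3-dimensional dense vectors
table = [[0.1, 0.2, 0.3],
         [0.4, 0.5, 0.6],
         [0.7, 0.8, 0.9],
         [1.0, 1.1, 1.2]]
```

For example, `embed(one_hot(2, 4), table)` simply returns `table[2]`.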
- There are two Bi-directional Gated Recurrent Units (Bi-GRUs): one receives the probability values and the other receives the dense vectors of the description entities.
- the hidden-layer dimensions of the gated units are manually set, which are 32 and 256 respectively in this experiment.
- the Bi-GRU uses a dynamic recurrent neural network structure; and its maximum acceptable length is manually determined, which is 20 in this experiment.
- the last hidden layer state of the Bi-GRU is passed to a fully connected layer.
- The output dimensions of the fully connected layers are all 14, corresponding to the number of place categories selected in this experiment.
- The fusion layer fuses the foregoing outputs by element-wise multiplication of the two vectors, and performs data fine-tuning with a fully connected layer. Finally, the result is input to the softmax classification layer to obtain the confidence for each place category.
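The element-wise fusion and softmax classification can be sketched as follows; the vector values in the usage example are made up for illustration.

```python
import math

def fuse(u, v):
    # Hadamard fusion: multiply elements at corresponding positions
    return [a * b for a, b in zip(u, v)]

def softmax(z):
    # numerically stable softmax over the fused logits
    m = max(z)
    exps = [math.exp(x - m) for x in z]
    s = sum(exps)
    return [e / s for e in exps]
```

Fusing the two branch outputs and applying softmax yields a confidence distribution over the place categories, whose entries sum to 1.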
- a set containing at least one piece of triple knowledge is obtained after each training sample is subjected to the operations in steps 1.1 and 1.2. Further, the description entities are subjected to pruning and padding operations according to the maximum acceptable length, and the place category labels are denoted as a one-hot code vector, to finally form a training data set.
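The pruning and padding step might look like the following sketch. The pad probability of 0.0 and the function name are assumptions; the "placeholder" entity is the one introduced in step 2), and the maximum length of 20 is taken from the experiment.

```python
MAX_LEN = 20  # maximum acceptable sequence length used in the experiment

def prune_and_pad(entities, probs, max_len=MAX_LEN,
                  pad_entity="placeholder", pad_prob=0.0):
    # truncate sequences longer than max_len, and pad shorter ones with
    # the "placeholder" entity so every sample has a fixed length
    entities, probs = entities[:max_len], probs[:max_len]
    pad = max_len - len(entities)
    return entities + [pad_entity] * pad, probs + [pad_prob] * pad
```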
- The training process minimizes a cross-entropy loss function and a triplet loss function, using the Adam optimizer. The initial learning rate is 0.002 and is decayed with the cosine decay method. The whole training process runs for 200 epochs and then stops.
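The learning-rate schedule can be sketched as a standard cosine decay from the stated initial value of 0.002; the minimum rate and exact schedule shape beyond the initial rate are assumptions.

```python
import math

def cosine_decay(step, total_steps, init_lr=0.002, min_lr=0.0):
    # standard cosine decay: init_lr at step 0, min_lr at total_steps
    frac = min(step / total_steps, 1.0)
    return min_lr + 0.5 * (init_lr - min_lr) * (1.0 + math.cos(math.pi * frac))
```

At the halfway point the rate has fallen to half the initial value, and it reaches the minimum at the final step.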
- samples for subsequent inference are also subjected to the foregoing same operations, only excluding the place category labels.
- a confidence vector of this sample for all place categories can be obtained.
- a place category corresponding to a maximum confidence is selected, which is the inference result.
- Results of this experiment are obtained by execution according to the experiment process described in section 1.
- the experimental environment is a Windows system with an Intel i5-4590 CPU and 12 GB RAM, the neural network structure is written using the TensorFlow 1.15 function library, and the code is written in Python language.
- This experiment selects 14 categories of places for test, and experimental results are shown in Table 1. It can be seen through analysis and comparison of the recognition rates that the method of the present disclosure can effectively realize place recognition. Further, because the place knowledge graph is constructed, semantic elements of different places can be directly acquired, so that people can conveniently and intuitively understand the composition of the place.
- FIG. 3 shows a partial visualized result of the place knowledge graph; the probability values attached to the connecting edges are omitted from the figure for simplicity.
Description
- Step 1) Acquisition of Basic Semantic Data
- the basic semantic data mainly describing items contained in a specific place, events, and special semantic concepts associated with the place, there being two acquisition ways: during construction of a place knowledge graph and training of an inference model, manually annotating, with natural language descriptions, the various information collected in a place environment (images, sound, distances, voice, etc.), so as to obtain the basic semantic data and the corresponding place category; and, in the place recognition and inference process, automatically generating the semantic data with an existing semantic generation model according to the types of the heterogeneous information;
Step 2) Generation of Place Description Entities - by using natural language processing methods such as text segmentation, removal of stop words, entity extraction, lemmatization, and manual screening, preprocessing the basic semantic data, where natural language text obtained after screening contains description entities in the place knowledge graph;
Step 3) Construction of the Place Knowledge Graph - counting the occurrence frequencies of the place description entities in an actual application environment, to obtain a frequency of each description entity in a specific place, and then performing normalization to obtain a probability value, to finally form the place knowledge graph having a basic triple structure of “description entities-place categories-probability values”; and
Step 4) Inference from the Place Knowledge Graph - learning the knowledge graph by using a DNN, where its objective task is to perform training according to triple sets of the knowledge graph, so that the DNN has a function of inferring the “place category” according to a knowledge set of “description entities-probability values”; during implementation of inference, automatically extracting description entities from a place information source according to steps 1) and 2), and further performing entity matching with the knowledge graph in step 3); and then making inference by using a well-trained DNN from a knowledge set obtained after the matching, thus realizing place recognition.
- 1-1) in an actual application environment, the place information being collected by an intelligent device via a sensor, and original information being expressed with images, videos, sound, distances, etc.; first, describing the foregoing information in natural language by means of manual annotation, where description content includes semantic concepts such as names of things, events, and human behavior or states that are contained in the information, so as to form the basic semantic data; and
- 1-2) in the inference process, automatically generating the basic semantic data by the existing semantic generation network according to specific information types, where training data for the network is provided in sub-step 1-1).
- after acquisition of the basic semantic data in step 1), requiring preprocessing by using natural language processing methods: first, segmenting the semantic data to obtain word units each having a minimal semantic concept; then, performing word deletion according to a stop word list; further performing entity extraction for the screened word units, where an extraction principle is: reserving word units each having a minimal semantic concept according to the thing names, events, actions, or states, such units generally having the attributes of nouns or verbs and being able to affect the judgment on the place category; and finally, performing lemmatization for the extracted entities, to lemmatize words in terms of verb tense, person, and noun plural, where through the foregoing steps, a description entity dictionary is formed, which can be stored, added, deleted, and modified.
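A toy sketch of this preprocessing chain is given below. The stop-word list and lemma table are tiny illustrative stand-ins for what a real NLP library would provide, and the part-of-speech-based entity extraction step is omitted for brevity.

```python
STOP_WORDS = {"a", "an", "the", "is", "are", "on", "in", "of"}  # illustrative
LEMMAS = {"knives": "knife", "plates": "plate", "cooking": "cook"}  # toy lemma table

def extract_entities(text):
    # text segmentation into word units
    tokens = [t.strip(".,") for t in text.lower().split()]
    # word deletion according to the stop-word list
    tokens = [t for t in tokens if t and t not in STOP_WORDS]
    # lemmatization of the remaining entities (noun plural, verb tense, etc.)
    return [LEMMAS.get(t, t) for t in tokens]
```

For instance, "The knives are on the plates." reduces to the object entities "knife" and "plate", which would then be stored in the description entity dictionary.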
- First, sample statistics are collected on the number of occurrences of each description entity in an actual application environment, according to the description entity dictionary obtained in step 2). Let n_{i,j} denote the number of occurrences of the i-th description entity of the dictionary over all samples of category-j places, and let m_j denote the total number of samples of category-j places; the description entity frequency value f_{i,j} is then f_{i,j} = n_{i,j} / m_j. If the i-th description entity does not occur in the samples of category-j places, a minimal value is assigned to the frequency, i.e., f_{i,j} = σ (σ < 10^−3). Normalization is performed over the frequency values of all description entities within places of the same category, p_{i,j} = F(f_{i,j}), where F(⋅) denotes a normalization method; preferably, F is established as p_{i,j} = f_{i,j} / Σ_i f_{i,j}, yielding the entity occurrence probability value p_{i,j}. The place knowledge graph is thus constructed, with a basic triple structure of "description entities-place categories-probability values", specifically expressed as: the i-th description entity-place category j-occurrence probability p_{i,j}. In addition, triples with probability values p_{i,j} < 10^−2 are not recorded in the knowledge graph, and corresponding modification or deletion is synchronously made in the description entity dictionary in step 2). Moreover, two new entities, "placeholder" and "unknown character", are added to the description entity dictionary in step 2): the former has no semantic concept and is used only for data padding in the inference model; the latter stands for semantic data acquired in step 1) that is not stored in the description entity dictionary, indicating that the entity concept is unknown.
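The statistics above can be sketched in Python as follows, under two labeled assumptions: an entity is counted at most once per sample, and σ is taken as 10⁻⁴ (any value below 10⁻³ would do).

```python
from collections import defaultdict

SIGMA = 1e-4   # minimal frequency for unseen entities (sigma < 1e-3)
PRUNE = 1e-2   # triples with p_ij below this are not recorded

def build_place_kg(samples):
    """samples: list of (entity_list, place_category) pairs.
    Returns triples (description entity, place category, probability)."""
    counts = defaultdict(lambda: defaultdict(int))   # counts[j][e] = n_ij
    totals = defaultdict(int)                        # totals[j]    = m_j
    entities = set()
    for ents, cat in samples:
        totals[cat] += 1
        entities.update(ents)
        for e in set(ents):                          # count once per sample
            counts[cat][e] += 1
    kg = []
    for j, m_j in totals.items():
        # f_ij = n_ij / m_j, or sigma when the entity never occurs
        freqs = {e: (counts[j][e] / m_j if counts[j][e] else SIGMA)
                 for e in entities}
        z = sum(freqs.values())                      # normalizer sum_i f_ij
        kg.extend((e, j, f / z) for e, f in freqs.items() if f / z >= PRUNE)
    return kg
```

Low-probability triples (e.g., "bed" in a kitchen) are dropped by the pruning threshold, matching the p_{i,j} < 10⁻² rule.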
- 4-1) in the training process, combining the entity dictionary and the place categories into a new dictionary, which is denoted by a one-hot code vector; and designing a DNN inference model by using a set of “description entities-probability values” of the samples as the input and the “place categories” as the output; and
- 4-2) in the inference process, performing entity matching between the set of “description entities” of the samples acquired in step 2) and the knowledge graph constructed in step 3), to obtain a set of “description entities-probability values”; and then inputting the set into the well-trained inference model in sub-step 4-1), to finally obtain place category knowledge.
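One plausible reading of this matching step is sketched below: each extracted entity is looked up among the knowledge graph triples to retrieve a probability value, and entities absent from the graph are replaced by the "unknown character" entity. Keeping the maximum probability across categories is an assumption; the patent does not specify how multi-category hits are collapsed.

```python
def match_entities(entities, kg_triples, unknown="unknown character"):
    # build a lookup from entity to its highest probability over all categories
    table = {}
    for ent, _cat, p in kg_triples:
        table[ent] = max(table.get(ent, 0.0), p)
    # replace out-of-graph entities by the "unknown character" entity
    return [(e, table[e]) if e in table else (unknown, 0.0) for e in entities]
```

The resulting set of "description entities-probability values" pairs is what the trained inference model receives as input.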
- The basic semantic data mainly describes items contained in a specific place, events, and special semantic concepts associated with the place, and there are the following two acquisition ways: During construction of a place knowledge graph and training of an inference model, various information, including images, sound, distances, voice, etc., collected in a place environment is annotated manually by using natural language description, so as to obtain the basic semantic data and a corresponding place category. On the other hand, in a place recognition and inference process, the foregoing semantic information is automatically generated by an existing semantic generation model according to types of heterogeneous information.
Step 2) Generation of Place Description Entities - By using natural language processing methods such as text segmentation, removal of stop words, entity extraction, lemmatization, and manual screening, the basic semantic data is preprocessed, and natural language text obtained after screening contains description entities in the place knowledge graph.
Step 3) Construction of the Place Knowledge Graph - The occurrence frequencies of the place description entities in an actual application environment are counted, to obtain a frequency of each description entity in a specific place, and then normalization is performed to obtain a probability value, to finally form the place knowledge graph having a basic triple structure of “description entities-place categories-probability values”.
Step 4) Inference from the Place Knowledge Graph - The knowledge graph is learned by a DNN whose training objective is defined on the triple sets of the knowledge graph, so that the DNN can infer the "place category" from a knowledge set of "description entities-probability values". During inference, description entities are automatically extracted from a place information source according to steps 1) and 2), entity matching with the knowledge graph of step 3) is performed, and the well-trained DNN infers from the matched knowledge set, thus realizing place recognition.
-
- 1-1) In an actual application environment, the place information is collected by an intelligent device via a sensor, and original information is expressed with images, videos, sound, distances, etc. First, the foregoing information is described in natural language by means of manual annotation, where description content includes semantic concepts such as names of things, events, and human behavior or states that are contained in the information, so as to form the basic semantic data.
- 1-2) In the inference process, the basic semantic data is automatically generated by the existing semantic generation network according to specific information types, where training data for the network is provided in sub-step 1-1).
-
- After the basic semantic data is acquired in step 1), it is preprocessed with natural language processing methods: First, the semantic data is segmented into word units, each carrying a minimal semantic concept. Then, words are deleted according to a stop word list. Next, entity extraction is performed on the screened word units, reserving the units that name things, events, actions, or states; such units generally have noun or verb attributes and can affect the judgment of the place category. Finally, the extracted entities are lemmatized, normalizing verb tense, person, and noun plurality. Through these steps, a description entity dictionary is formed, to which entries can be stored, added, deleted, and modified.
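As a minimal sketch of this preprocessing chain, the Python snippet below uses a toy stop-word list, a toy noun/verb lexicon, and naive lemma rules; all three are illustrative stand-ins for a real NLP toolkit, not lists specified by the patent:

```python
# Illustrative word lists and lemma rules (stand-ins for a real NLP toolkit).
STOP_WORDS = {"a", "an", "the", "is", "are", "that", "this"}
NOUN_VERB_LEXICON = {"man", "men", "eat", "eating", "apple", "apples"}
LEMMA_RULES = {"eating": "eat", "apples": "apple", "men": "man"}

def extract_entities(text):
    # 1) segmentation into minimal semantic word units
    tokens = text.lower().split()
    # 2) word deletion according to the stop word list
    tokens = [t for t in tokens if t not in STOP_WORDS]
    # 3) entity extraction: keep units naming things, events, actions, states
    tokens = [t for t in tokens if t in NOUN_VERB_LEXICON]
    # 4) lemmatization of verb tense, person, and noun plurals
    return [LEMMA_RULES.get(t, t) for t in tokens]

entities = extract_entities("The men are eating red apples")
# entities == ['man', 'eat', 'apple']
```

Here "red" is dropped at the entity-extraction step because it is neither a noun nor a verb, matching the screening principle described above.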
-
- First, sample statistics on the number of occurrences of each description entity in an actual application environment are collected according to the description entity dictionary obtained in step 2). Let ni,j denote the number of occurrences of the i-th description entity in all samples of category-j places, and let mj denote the total number of samples of category-j places; the description entity frequency value is then fi,j = ni,j / mj. If the i-th description entity does not occur in the samples of category-j places, a minimal value is assigned to the frequency, that is, fi,j = σ (σ < 10^-3). Normalization is then performed over the frequency values of all description entities of the same place category, that is,
pi,j = F(fi,j), where the function F(⋅) denotes a normalization method, to finally obtain an entity occurrence probability value pi,j. Preferably, the normalization can be established by using the softmax function, pi,j = exp(fi,j) / Σk exp(fk,j) (the sum running over all description entities k of category-j places), to calculate the probability value. Thus, the place knowledge graph can be constructed, and its basic triple structure is "description entity-place category-probability value", specifically expressed as: the i-th description entity-place category j-occurrence probability pi,j. In addition, triples with probability values pi,j < 10^-2 are not recorded in the knowledge graph, and corresponding modifications or deletions are made synchronously in the description entity dictionary of step 2). Moreover, two new entities, "placeholder" and "unknown character", are added to the description entity dictionary of step 2): the former has no semantic concept and is used only for data padding in the inference model; the latter denotes semantic data acquired in step 1) that is not stored in the description entity dictionary, indicating that the entity concept is unknown.
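The construction above can be sketched as follows. The softmax normalization is one plausible choice of F(⋅) (an assumption; the text only requires F to be a normalization method), and the thresholds follow the text (σ < 10^-3, pruning below 10^-2):

```python
import math

SIGMA = 1e-4   # minimal frequency assigned to unseen entities (sigma < 1e-3)
P_MIN = 1e-2   # triples below this probability are not recorded

def build_place_graph(samples, dictionary):
    """samples: {place_category: [entity_set_per_sample, ...]};
    returns {(entity, category): probability}, i.e. the triples
    'description entity - place category - probability value'."""
    graph = {}
    for category, sample_sets in samples.items():
        m_j = len(sample_sets)
        # f_ij = n_ij / m_j, with the minimal value sigma for unseen entities
        freq = {e: max(sum(e in s for s in sample_sets) / m_j, SIGMA)
                for e in dictionary}
        # Normalization F(.): softmax over all entities of the same category
        # (an assumed choice of F, not mandated by the text).
        z = sum(math.exp(f) for f in freq.values())
        for entity, f in freq.items():
            p = math.exp(f) / z
            if p >= P_MIN:  # prune low-probability triples
                graph[(entity, category)] = p
    return graph

graph = build_place_graph({"kitchen": [{"apple", "stove"}, {"stove"}]},
                          {"apple", "stove", "bed"})
```

With this toy input, "stove" occurs in both kitchen samples and therefore receives the highest probability for the "kitchen" category.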
-
- 4-1) In the training process, the entity dictionary and place categories are combined into a new dictionary, which is denoted by a one-hot code vector; and a DNN inference model is designed by using a set of “description entities-probability values” of the samples as the input and the “place categories” as the output.
- 4-2) In the inference process, entity matching is performed between the set of “description entities” of the samples acquired in step 2) and the knowledge graph constructed in step 3), to obtain a set of “description entities-probability values”; and then the set is input into the well-trained inference model in sub-step 4-1), to finally obtain place category knowledge.
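A toy illustration of the encoding and forward pass in sub-steps 4-1) and 4-2): the entity list, the one-hidden-layer network, and its dimensions are all hypothetical, and real training (learning the weights from the triple sets) is omitted:

```python
import math
import random

random.seed(0)

# Toy dictionaries; the real ones come from steps 2) and 3).
ENTITIES = ["stove", "apple", "bed", "placeholder", "unknown character"]
CATEGORIES = ["kitchen", "bedroom"]

def encode(entity_probs, max_len=3):
    """Encode (entity, probability) pairs as a fixed-length vector:
    each slot is a one-hot entity code scaled by its probability;
    short inputs are padded with the zero-probability 'placeholder'."""
    pairs = list(entity_probs)[:max_len]
    pairs += [("placeholder", 0.0)] * (max_len - len(pairs))
    vec = []
    for entity, prob in pairs:
        one_hot = [0.0] * len(ENTITIES)
        one_hot[ENTITIES.index(entity)] = prob
        vec += one_hot
    return vec

def mlp_forward(x, w1, w2):
    """One hidden ReLU layer, softmax output over place categories."""
    h = [max(0.0, sum(xi * wi for xi, wi in zip(x, row))) for row in w1]
    z = [sum(hi * wi for hi, wi in zip(h, row)) for row in w2]
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    s = sum(exps)
    return [e / s for e in exps]

# Untrained random weights, 15 -> 8 -> 2 (for illustration only).
dim = len(ENTITIES) * 3
w1 = [[random.uniform(-1, 1) for _ in range(dim)] for _ in range(8)]
w2 = [[random.uniform(-1, 1) for _ in range(8)] for _ in range(len(CATEGORIES))]
probs = mlp_forward(encode([("stove", 0.51), ("apple", 0.31)]), w1, w2)
```

After training, the argmax over `probs` would give the inferred "place category"; here the weights are random, so only the shape of the computation is meaningful.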
-
- 1) acquiring basic semantic data from various heterogeneous place information by means of manual annotation, the data mainly being natural language descriptions of the semantics of the things covered by the place information; and, using the acquired data as data samples, designing a semantic generation model;
- 2) preprocessing and screening the basic semantic data by using natural language processing methods, to acquire description entity knowledge of a place;
- 3) by means of sample statistics in an actual application environment, acquiring an occurrence probability of each description entity, thus forming a place knowledge graph having a basic triple structure of “description entities-place categories-probability values”; and
- 4) with reference to the place knowledge graph, designing a DNN inference model by using a set of “description entities-probability values” as the input and the “place categories” as the output, for sample learning and network parameter training.
-
- 1) generating basic semantic data from various heterogeneous place information by using the semantic generation model;
- 2) preprocessing and screening the basic semantic data by using natural language processing methods, to acquire description entity knowledge of a place;
- 3) matching the description entities with the place knowledge graph, to obtain a set of “description entities-probability values” of sample information; and
- 4) inputting the set of “description entities-probability values” into the inference model, to obtain information about “place categories”.
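Step 3) of this recognition phase, matching the extracted description entities against the knowledge graph, might look like the following sketch; the graph contents and entity names are toy values, and unmatched entities map to the "unknown character" entity described earlier:

```python
def match_entities(entities, graph, categories):
    """Look up each extracted description entity in the knowledge graph,
    collecting its probability value per place category; entities absent
    from the graph are replaced by the 'unknown character' entity."""
    matched = []
    for entity in entities:
        hits = {c: graph[(entity, c)] for c in categories if (entity, c) in graph}
        matched.append((entity, hits) if hits else ("unknown character", {}))
    return matched

# Toy knowledge graph: (entity, category) -> probability value.
graph = {("stove", "kitchen"): 0.51, ("apple", "kitchen"): 0.31,
         ("bed", "bedroom"): 0.62}
result = match_entities(["stove", "teapot"], graph, ["kitchen", "bedroom"])
```

The resulting set of "description entities-probability values" is what step 4) feeds into the trained inference model.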
Specific Experimental Procedure and Results of Embodiment 1
-
- 1) First, the basic semantic description is segmented according to human semantic knowledge, that is, d_i = {s_j^i | j = 1, 2, ..., m}, where s_j^i denotes the smallest semantic unit expressing a particular concept. For example, a natural language description of a certain picture sample is "A man is eating that red apple"; after the segmentation step, the set {A, man, is, eating, that, red, apple} is obtained.
- 2) Afterwards, word deletion is performed according to a stop word list, removing words that are meaningless for describing the sample; in the instance from the previous step, {that} can be removed.
- 3) Finally, part-of-speech tagging is performed using entity extraction technology, and the word units carrying minimal semantic concepts that describe objects, events, and actions are reserved; such units generally have noun or verb attributes and can affect the judgment of the place category. Thus, the place description entities I = {w_j^i ∈ d_i | j = 1, 2, ..., k, k < m} are generated. For the previous instance, the finally reserved place description entities are {man, eating, apple}.
1.3 Construction of a Place Knowledge Graph
-
- 1) Duplicates in the place description entity sets of all samples are eliminated to form a description entity dictionary, where this dictionary can be stored, modified, deleted, and added, and is a basic element of the knowledge in the knowledge graph. In addition, it is required to add two new entities: “placeholder” and “unknown character” to the description entity dictionary, where the former one does not have any semantic concept and is only used for data padding in an inference model; and the latter one is a unit not stored in the description entity dictionary and indicates that the entity concept is unknown.
- 2) The number of occurrences of each dictionary unit in an actual application environment is counted. Let ni,j denote the number of occurrences of the i-th description entity in all samples of category-j places, and let mj denote the total number of samples of category-j places; the description entity frequency value is then fi,j = ni,j / mj. If the i-th description entity does not occur in the samples of category-j places, a minimal value is assigned to the frequency, that is, fi,j = σ (σ < 10^-3).
- 3) Normalization is performed over the frequency values of all description entities of the same place category, that is, pi,j = F(fi,j). Preferably, F(⋅) can be established by using the softmax function, pi,j = exp(fi,j) / Σk exp(fk,j), to obtain the entity occurrence probability value pi,j.
-
- 4) The place knowledge graph is constructed; its basic triple structure is "description entity-place category-probability value", specifically expressed as: the i-th description entity-place category j-occurrence probability pi,j. In addition, triples with probability values pi,j < 10^-2 are not recorded in the knowledge graph; such description entities can be deleted because their occurrence likelihood is very low in actual application. Corresponding modifications or deletions are then made synchronously in the description entity dictionary.
1.4 Inference from the Place Knowledge Graph
Table 1. Results of sample distribution and recognition rates of 14 categories of places

| Place categories | Training samples | Test samples | Correctly recognized samples | Recognition rates |
|---|---|---|---|---|
| Airport terminal | 114 | 13 | 12 | 92.31% |
| Art studio | 95 | 11 | 9 | 81.82% |
| Bathroom | 652 | 73 | 72 | 98.63% |
| Bedroom | 1402 | 156 | 137 | 87.82% |
| Meeting room | 193 | 22 | 15 | 68.18% |
| Corridor | 123 | 14 | 14 | 100.00% |
| Dining room | 470 | 53 | 43 | 81.13% |
| Playroom | 95 | 11 | 7 | 63.64% |
| Hotel room | 206 | 23 | 18 | 78.26% |
| Kitchen | 735 | 82 | 75 | 91.46% |
| Living room | 900 | 101 | 90 | 89.11% |
| Poolroom | 121 | 14 | 13 | 92.86% |
| Street | 266 | 30 | 30 | 100.00% |
| Waiting room | 96 | 11 | 10 | 90.91% |
| Total | 5468 | 614 | 545 | — |
| Average value | — | — | — | 88.76% |
Claims (7)
Applications Claiming Priority (3)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202011556111.2A CN112966823B (en) | 2020-12-24 | 2020-12-24 | Site identification method based on knowledge graph reasoning |
| CN202011556111.2 | 2020-12-24 | ||
| PCT/CN2020/141444 WO2022134167A1 (en) | 2020-12-24 | 2020-12-30 | Knowledge graph inference-based method for place identification |
Related Parent Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/CN2020/141444 Continuation WO2022134167A1 (en) | 2020-12-24 | 2020-12-30 | Knowledge graph inference-based method for place identification |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| US20220215175A1 US20220215175A1 (en) | 2022-07-07 |
| US12099806B2 true US12099806B2 (en) | 2024-09-24 |
Family
ID=76271403
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| US17/701,137 Active 2041-11-18 US12099806B2 (en) | 2020-12-24 | 2022-03-22 | Place recognition method based on knowledge graph inference |
Country Status (3)
| Country | Link |
|---|---|
| US (1) | US12099806B2 (en) |
| CN (1) | CN112966823B (en) |
| WO (1) | WO2022134167A1 (en) |
Cited By (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20250140126A1 (en) * | 2024-01-08 | 2025-05-01 | Xuzhou Medical University | Method of dynamically generating personalized knowledge graph based on prompt learning |
Families Citing this family (19)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US11151320B1 (en) * | 2020-04-21 | 2021-10-19 | Microsoft Technology Licensing, Llc | Labeled knowledge graph based priming of a natural language model providing user access to programmatic functionality through natural language input |
| CN113342982B (en) * | 2021-06-24 | 2023-07-25 | 长三角信息智能创新研究院 | Enterprise industry classification method integrating Roberta and external knowledge base |
| CN113468330B (en) * | 2021-07-06 | 2023-04-28 | 北京有竹居网络技术有限公司 | Information acquisition method, device, equipment and medium |
| CN114255427B (en) * | 2021-12-21 | 2023-04-18 | 北京百度网讯科技有限公司 | Video understanding method, device, equipment and storage medium |
| CN114925190B (en) * | 2022-05-30 | 2023-08-04 | 南瑞集团有限公司 | Mixed reasoning method based on rule reasoning and GRU neural network reasoning |
| US12481497B2 (en) * | 2022-07-20 | 2025-11-25 | Lti Mindtree Ltd | Method and system for building and leveraging a knowledge fabric to improve software delivery lifecycle (SDLC) productivity |
| CN115496996A (en) * | 2022-10-24 | 2022-12-20 | 南京邮电大学 | A robot indoor scene recognition method based on knowledge graph embedding |
| CN115935996A (en) * | 2022-12-21 | 2023-04-07 | 华南农业大学 | A Fine-Grained Butterfly Recognition Method Combining Knowledge Graph and Spoken Text |
| JP2024091177A (en) * | 2022-12-23 | 2024-07-04 | 富士通株式会社 | Specific program, specific method, and information processing device |
| CN115858816A (en) * | 2022-12-27 | 2023-03-28 | 北京融信数联科技有限公司 | Construction method and system of intelligent agent cognitive map for public security field |
| CN116978113A (en) * | 2023-05-26 | 2023-10-31 | 西安电子科技大学广州研究院 | Action category identification method and device integrating visual knowledge graph |
| CN116579747B (en) * | 2023-07-11 | 2023-09-08 | 国网信通亿力科技有限责任公司 | Image progress management method based on big data |
| CN117648929B (en) * | 2023-10-25 | 2024-09-10 | 西安理工大学 | Target false recognition correction method based on similar humanized generalized perception mechanism |
| CN118537835A (en) * | 2024-04-17 | 2024-08-23 | 广东工业大学 | A traffic dynamic occlusion tracking method and system based on multimodal fusion knowledge graph |
| CN119293021B (en) * | 2024-09-03 | 2025-12-12 | 中国长江电力股份有限公司 | Multi-business state data management method and platform for intelligent comprehensive energy |
| GB202417329D0 (en) * | 2024-11-26 | 2025-01-08 | Semantics 21 | Method, program, and apparatus for automated analysis of criminal evidence |
| CN120012947B (en) * | 2025-04-18 | 2025-07-01 | 中国科学院空天信息创新研究院 | Knowledge graph multi-hop reasoning method and device oriented to attribute fusion |
| CN120258122A (en) * | 2025-04-25 | 2025-07-04 | 甘肃省交通规划勘察设计院股份有限公司 | A method for constructing a traffic safety knowledge graph for highway sections with continuous longitudinal slopes |
| CN120144790B (en) * | 2025-05-15 | 2025-07-25 | 江西师范大学 | A method and system for constructing intelligent knowledge architecture diagram based on large model |
Citations (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US10679133B1 (en) * | 2019-06-07 | 2020-06-09 | Peritus.AI, Inc. | Constructing and utilizing a knowledge graph for information technology infrastructure |
| US20220180065A1 (en) * | 2020-12-09 | 2022-06-09 | Beijing Wodong Tianjun Information Technology Co., Ltd. | System and method for knowledge graph construction using capsule neural network |
Family Cites Families (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN109885824B (en) * | 2019-01-04 | 2024-02-20 | 北京捷通华声科技股份有限公司 | Hierarchical Chinese named entity recognition method, hierarchical Chinese named entity recognition device and readable storage medium |
-
2020
- 2020-12-24 CN CN202011556111.2A patent/CN112966823B/en active Active
- 2020-12-30 WO PCT/CN2020/141444 patent/WO2022134167A1/en not_active Ceased
-
2022
- 2022-03-22 US US17/701,137 patent/US12099806B2/en active Active
Also Published As
| Publication number | Publication date |
|---|---|
| CN112966823A (en) | 2021-06-15 |
| US20220215175A1 (en) | 2022-07-07 |
| WO2022134167A1 (en) | 2022-06-30 |
| CN112966823B (en) | 2022-05-13 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| US12099806B2 (en) | Place recognition method based on knowledge graph inference | |
| JP7468929B2 (en) | How to acquire geographical knowledge | |
| CN109783666B (en) | Image scene graph generation method based on iterative refinement | |
| CN112036276B (en) | Artificial intelligent video question-answering method | |
| CN110222560B (en) | Text person searching method embedded with similarity loss function | |
| US5671333A (en) | Training apparatus and method | |
| CN110489395A (en) | Automatically the method for multi-source heterogeneous data knowledge is obtained | |
| CN112784902A (en) | Two-mode clustering method with missing data | |
| CN115204171B (en) | Document-level event extraction method and system based on hypergraph neural network | |
| CN118551077A (en) | Natural language interaction security video retrieval system and device based on large generated model | |
| CN117094291B (en) | Automatic news generation system based on intelligent writing | |
| CN115168678B (en) | A temporal-aware heterogeneous graph neural rumor detection model | |
| Belissen et al. | Dicta-Sign-LSF-v2: remake of a continuous French sign language dialogue corpus and a first baseline for automatic sign language processing | |
| CN115860152B (en) | Cross-modal joint learning method for character military knowledge discovery | |
| CN112148832A (en) | An event detection method based on label-aware dual self-attention network | |
| CN118377900A (en) | A social opinion event detection method based on hyperbolic graph clustering | |
| CN116450827A (en) | Event template induction method and system based on large-scale language model | |
| CN109471959B (en) | Figure reasoning model-based method and system for identifying social relationship of people in image | |
| CN120705362A (en) | Multimodal heterogeneous knowledge fusion construction and semantic enhancement retrieval system based on large model | |
| CN117726004A (en) | A social individual behavior recognition and prediction method based on large language model | |
| CN115859992A (en) | False information detection method and system based on dynamic information dissemination evolution model | |
| CN120687621A (en) | A method and system for constructing an industry map based on scenario-based marketing | |
| CN120296595A (en) | A classification method of graph neural network for multimodal data | |
| CN119066272A (en) | A network public opinion prediction method and system based on deep learning | |
| CN111950646A (en) | A method for constructing a hierarchical knowledge model of electromagnetic images and a method for target recognition |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | FEPP | Fee payment procedure | Free format text: ENTITY STATUS SET TO UNDISCOUNTED (ORIGINAL EVENT CODE: BIG.); ENTITY STATUS OF PATENT OWNER: SMALL ENTITY |
| | AS | Assignment | Owner name: SOUTHEAST UNIVERSITY, CHINA. Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:LI, XINDE;LI, PEI;SUN, CHANGYIN;SIGNING DATES FROM 20220124 TO 20220125;REEL/FRAME:059375/0153 |
| | FEPP | Fee payment procedure | Free format text: ENTITY STATUS SET TO SMALL (ORIGINAL EVENT CODE: SMAL); ENTITY STATUS OF PATENT OWNER: SMALL ENTITY |
| | STPP | Information on status: patent application and granting procedure in general | Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION |
| | STPP | Information on status: patent application and granting procedure in general | Free format text: NON FINAL ACTION MAILED |
| | STPP | Information on status: patent application and granting procedure in general | Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER |
| | STPP | Information on status: patent application and granting procedure in general | Free format text: NOTICE OF ALLOWANCE MAILED -- APPLICATION RECEIVED IN OFFICE OF PUBLICATIONS |
| | STPP | Information on status: patent application and granting procedure in general | Free format text: PUBLICATIONS -- ISSUE FEE PAYMENT VERIFIED |
| | STCF | Information on status: patent grant | Free format text: PATENTED CASE |