WO2021232724A1 - Method for extracting spatial relationships of geographic location points, and method and apparatus for training an extraction model - Google Patents

Method for extracting spatial relationships of geographic location points, and method and apparatus for training an extraction model

Info

Publication number
WO2021232724A1
WO2021232724A1 (PCT/CN2020/131305)
Authority
WO
WIPO (PCT)
Prior art keywords
layer
geographic location
spatial relationship
training
text
Prior art date
Application number
PCT/CN2020/131305
Other languages
English (en)
French (fr)
Inventor
黄际洲
王海峰
张伟
范淼
Original Assignee
百度在线网络技术(北京)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 百度在线网络技术(北京)有限公司 filed Critical 百度在线网络技术(北京)有限公司
Priority to EP20913065.7A priority Critical patent/EP3940552A4/en
Priority to US17/427,122 priority patent/US20220327421A1/en
Priority to JP2022543197A priority patent/JP2023510906A/ja
Priority to KR1020227019541A priority patent/KR20220092624A/ko
Publication of WO2021232724A1 publication Critical patent/WO2021232724A1/zh

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/29Geographical information databases
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00Machine learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/907Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/909Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using geographical or spatial information, e.g. location
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/95Retrieval from the web
    • G06F16/953Querying, e.g. by the use of web search engines
    • G06F16/9537Spatial or temporal dependent retrieval, e.g. spatiotemporal queries
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2413Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
    • G06F18/24133Distances to prototypes
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/205Parsing
    • G06F40/216Parsing using statistical methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/279Recognition of textual entities
    • G06F40/289Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/295Named entity recognition

Definitions

  • This application relates to the field of computer application technology, in particular to the field of big data technology.
  • the main goal of the map is to portray the real world and make it easier for users to travel.
  • the high-precision knowledge map of geographical locations is the basis for satisfying the core demands of users on maps and travel.
  • the spatial relationship between geographical locations is one of the essential elements of the knowledge graph, which can realize more accurate logical reasoning queries.
  • one method of mining the spatial relationships of geographic location points is to generate them automatically from the coordinates of the points, but this method relies on coordinate accuracy, and the coordinate errors of geographic location points are generally tens of meters, or even more than 100 meters.
  • as a result, the spatial relationships between geographic locations generated by this method are inaccurate. In particular, relationships between floors cannot be generated automatically from coordinates.
  • this application provides a method for training a spatial relationship extraction model for geographic location points, the method includes:
  • the second training data includes: text and labeling of geographic location points and spatial relationship information of geographic location points in the text;
  • the geographic location point spatial relationship extraction model includes an embedding layer, a Transformer layer, and a mapping layer;
  • the geographic location point spatial relationship extraction model is used to extract the geographic location point spatial relationship information from the input Internet text.
  • the present application also provides a method for extracting the spatial relationship of geographic location points.
  • the method includes:
  • the present application provides a device for training a spatial relationship extraction model of geographic location points, and the device includes:
  • the second acquiring unit is configured to acquire second training data, where the second training data includes: text and labeling of geographic location points and spatial relationship information of geographic location points in the text;
  • the second training unit is configured to use the second training data to train a geographic location point spatial relationship extraction model, where the geographic location point spatial relationship extraction model includes an embedding layer, a Transformer layer, and a mapping layer;
  • the geographic location point spatial relationship extraction model is used to extract geographic location point spatial relationship information from the input text.
  • the present application also provides a device for extracting the spatial relationship of geographic location points, the device including:
  • the obtaining unit is used to obtain text containing geographic point information from the Internet;
  • the extraction unit is used to input the text into a pre-trained geographic location point spatial relationship extraction model, and to acquire the spatial relationship information output by the model; the geographic location point spatial relationship extraction model includes an embedding layer, a Transformer layer, and a mapping layer.
  • this application provides an electronic device, including:
  • at least one processor;
  • a memory communicatively connected with the at least one processor; wherein,
  • the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor, so that the at least one processor can execute the method described in any one of the foregoing.
  • the present application also provides a non-transitory computer-readable storage medium storing computer instructions, the computer instructions being used to make the computer execute any of the methods described above.
  • this application can extract the spatial relationship information of geographic location points from Internet text, which solves the problem of inaccurate spatial relationships caused by coordinate errors of geographic locations, as well as the problem that relationships between floors cannot be generated automatically.
  • Fig. 1 shows an exemplary system architecture in which the method or device of the embodiments of the present application can be applied
  • FIG. 2 is a flowchart of the method for extracting the spatial relationship of geographic location points provided in Embodiment 1 of this application;
  • FIG. 3 is a schematic structural diagram of a spatial relationship extraction model for geographic location points provided in Embodiment 1 of this application;
  • FIG. 4 is a flowchart of a method for training a spatial relationship extraction model for geographic location points provided in Embodiment 2 of this application;
  • FIG. 5 is a flowchart of a method for training a spatial relationship extraction model of geographic location points provided in Embodiment 3 of the application;
  • FIG. 6a is a schematic structural diagram of a pre-training model provided in Embodiment 3 of this application.
  • FIG. 6b is a schematic structural diagram of the spatial relationship extraction model of geographic location points provided in Embodiment 3 of this application;
  • FIG. 7 is a structural diagram of an apparatus for training the spatial relationship extraction model of geographic location points provided in the fourth embodiment of the application.
  • FIG. 8 is a structural diagram of an apparatus for extracting the spatial relationship of geographic location points provided in Embodiment 5 of this application;
  • Fig. 9 is a block diagram of an electronic device used to implement the method of the embodiment of the present application.
  • Fig. 1 shows an exemplary system architecture to which the method or device of the embodiment of the present application can be applied.
  • the system architecture may include terminal devices 101 and 102, a network 103 and a server 104.
  • the network 103 is used to provide a medium for communication links between the terminal devices 101 and 102 and the server 104.
  • the network 103 may include various connection types, such as wired, wireless communication links, or fiber optic cables, and so on.
  • the user can use the terminal devices 101 and 102 to interact with the server 104 through the network 103.
  • Various applications may be installed on the terminal devices 101 and 102, such as map applications, web browser applications, and communication applications.
  • the terminal devices 101 and 102 may be various types of user equipment capable of running map applications, including but not limited to smartphones, tablets, PCs, smart TVs, etc.
  • the apparatus for extracting the spatial relationship of geographic location points provided in the present application can be set up and run in the above-mentioned server 104, or run in a device independent of the server 104. It can be implemented as multiple software or software modules (for example, to provide distributed services), or as a single software or software module, which is not specifically limited here.
  • the server 104 can interact with the map database 105. Specifically, the server 104 can obtain data from the map database 105 or store the data in the map database 105.
  • the map database 105 stores map data including POI information.
  • the device for extracting the spatial relationship of geographic location points is set up and running in the server 104.
  • the server 104 uses the method provided in the embodiments of the present application to extract the spatial relationships of geographic location points, and then uses the acquired spatial relationships to update the map database 105.
  • the server 104 can query the map database 105 in response to query requests from the terminal devices 101 and 102, and return to them related information about the queried geographic location point, including information generated based on the spatial relationships of geographic location points.
  • the server 104 may be a single server or a server group composed of multiple servers. In addition to being in the form of a server, 104 may also be other computer systems or processors with higher computing performance. It should be understood that the numbers of terminal devices, networks, servers, and databases in FIG. 1 are merely illustrative. According to implementation needs, there can be any number of terminal devices, networks, servers, and databases.
  • the geographic location point involved in this application refers to a geographic location point in a map application, and the geographic location point can be inquired, browsed, displayed to the user, and so on. These geographic locations have basic attributes such as latitude and longitude, name, administrative address, and type.
  • the geographic location points may include, but are not limited to, POI (Point of Interest), AOI (Area of Interest), ROI (Region of Interest), etc.
  • POI is taken as an example for description.
  • POI is a term in geographic information system, which generally refers to all geographic objects that can be abstracted as points.
  • a POI can be a house, a shop, a mailbox, a bus station, a school, a hospital, and so on.
  • the main purpose of POI is to describe the location of things or events, thereby enhancing the ability to describe and query the location of things or events.
  • Fig. 2 is a flow chart of the method for extracting the spatial relationship of geographic location points provided in the first embodiment of the application. As shown in the figure, the method may include the following steps:
  • a text containing geographic point information is obtained from the Internet.
  • the text containing geographic location point information can be obtained from official websites associated with geographic location points. For example, "Haidilao, Sixth Floor, City Shopping Center, Qinghe Middle Street, Haidian District, Beijing" can be obtained from the official website of Haidilao, and "China Merchants Bank Beijing Tsinghua Park Branch, Floor G, Science and Technology Building B, Tsinghua Science and Technology Park, 200 meters south of the East Gate of Tsinghua University, Haidian District" can be obtained from the official website of China Merchants Bank.
  • text containing geographic point information can also be obtained from other data sources.
  • 202: input the text into the pre-trained geographic location point spatial relationship extraction model, and obtain the spatial relationship information output by the model; the geographic location point spatial relationship extraction model includes an embedding layer, a Transformer layer, and a mapping layer.
  • the information of the spatial relationship involved in the embodiments of the present application may include: the type and value of the spatial relationship.
  • the types of spatial relationships include, for example, east, south, west, north, southeast, northeast, southwest, northwest, left, right, upstairs, downstairs, floor, building, etc.
  • the value can include the value of the distance, the value of the floor, the value of the building, and so on.
  • the structure of the spatial relationship extraction model for geographic location points involved in the embodiments of the present application may be as shown in FIG. 3, and the embedding layer may include multiple embedding layers.
  • the separator [CLS] can be added before the text, and the separator [SEP] can be added between the sentences, and each character and separator can be used as a Token.
  • the embedding layer in this application uses character-granularity Tokens, which can effectively alleviate the problem of long-tail words.
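The tokenization just described, with [CLS] before the text, [SEP] between sentences, and one Token per character, can be sketched as follows. The helper name and example strings are illustrative, not taken from the application:

```python
def tokenize(sentences):
    """Character-granularity tokenization: prepend [CLS],
    append [SEP] after each sentence, one Token per character."""
    tokens = ["[CLS]"]
    for sent in sentences:
        tokens.extend(list(sent))   # each character becomes its own Token
        tokens.append("[SEP]")
    return tokens

print(tokenize(["海底捞", "六层"]))
# ['[CLS]', '海', '底', '捞', '[SEP]', '六', '层', '[SEP]']
```

Because the granularity is the character rather than the word, no word segmenter is needed and rare (long-tail) words never fall outside the vocabulary.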
  • the first embedding layer, denoted as Token Embedding in the figure, is used to encode each Token (element) in the text.
  • the Token in the text can include characters and separators in the text.
  • the second embedding layer is used to encode the position of each Token.
  • the position information of each Token in the text can be determined; for example, the Tokens are numbered sequentially by position, and each position number is encoded.
  • the third embedding layer is used to encode the sentence identifier to which each Token belongs.
  • the sentences in the text are numbered sequentially as sentence identifiers, and the sentence identifiers to which each Token belongs are coded.
  • each Token, its position information, and the sentence identifier to which it belongs are transformed into dense vector representations.
  • e_i denotes the dense vector representation of the i-th Token.
  • the token vector of the i-th Token is obtained by looking up the character in the word vector matrix and converting it into a dense vector.
  • the position vector of the i-th Token is obtained by looking up its position in the corresponding vector matrix and converting the position into a dense vector.
  • the sentence vector of the i-th Token is obtained by looking up its sentence identifier in the corresponding vector matrix and converting the identifier into a dense vector.
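The three lookups can be sketched as below. The tables, dimensions, and the element-wise sum used to combine them are illustrative assumptions (a BERT-style combination; the application does not spell out the combination rule):

```python
import random

random.seed(0)
DIM = 4  # toy embedding width; real models use hundreds of dimensions

def make_table(rows):
    # a stand-in for a learned "word vector matrix" lookup table
    return [[random.uniform(-1, 1) for _ in range(DIM)] for _ in range(rows)]

token_table = make_table(100)  # first embedding layer: Token Embedding
pos_table = make_table(32)     # second embedding layer: position
sent_table = make_table(8)     # third embedding layer: sentence identifier

def embed(token_ids, sentence_ids):
    """Combine the three lookups into one dense vector per Token
    (element-wise sum, assumed here)."""
    out = []
    for pos, (tid, sid) in enumerate(zip(token_ids, sentence_ids)):
        out.append([t + p + s for t, p, s in
                    zip(token_table[tid], pos_table[pos], sent_table[sid])])
    return out

vecs = embed([1, 5, 9], [0, 0, 1])
print(len(vecs), len(vecs[0]))  # 3 4
```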
  • the encoding results of each embedding layer are output to the Transformer layer (represented as a multi-layer transformer in the figure).
  • the Transformer layer performs multi-layer Attention (attention) mechanism processing and outputs hidden vectors.
  • its input is the dense vector sequence E = {e_1, e_2, ..., e_n}, where n is the length of the input sequence, that is, the number of Tokens.
  • the mapping layer may include a CRF (Conditional Random Field), which uses the hidden vectors output by the Transformer layer to predict the spatial relationship information contained in the text input to the model.
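A CRF mapping layer scores candidate label sequences as a sum of per-token emission scores (derived from the Transformer's hidden vectors) and tag-to-tag transition scores, then decodes the best sequence, typically with Viterbi. A minimal decoding sketch, with plain-float scores standing in for trained weights:

```python
def viterbi(emissions, transitions, tags):
    """Decode the tag sequence maximizing emission + transition scores
    (the core of CRF inference). emissions: list of {tag: score} dicts,
    transitions: {(prev_tag, tag): score}."""
    score = [dict(emissions[0])]  # best score per tag at position 0
    back = [{}]                   # backpointers
    for i in range(1, len(emissions)):
        score.append({})
        back.append({})
        for t in tags:
            prev = max(tags, key=lambda p: score[i - 1][p] + transitions[(p, t)])
            score[i][t] = score[i - 1][prev] + transitions[(prev, t)] + emissions[i][t]
            back[i][t] = prev
    best = max(tags, key=lambda t: score[-1][t])
    path = [best]
    for i in range(len(emissions) - 1, 0, -1):
        best = back[i][best]
        path.append(best)
    return path[::-1]

tags = ["POI_B", "POI_I", "O"]
emissions = [{"POI_B": 2.0, "POI_I": 0.0, "O": 0.5},
             {"POI_B": 0.0, "POI_I": 1.5, "O": 0.5}]
transitions = {(a, b): (0.5 if (a, b) == ("POI_B", "POI_I") else 0.0)
               for a in tags for b in tags}
print(viterbi(emissions, transitions, tags))  # ['POI_B', 'POI_I']
```

Unlike independent per-token classification, the transition scores let the model forbid or favor tag sequences (e.g. an I-tag must follow its B-tag), which is why a CRF suits this labeling scheme.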
  • the present application can extract the spatial relationship information of geographic location points from the text in the Internet that contains geographic location point information.
  • the embodiments of the present application define a description system for spatial relationships. Similar to the triple <entity 1, entity 2, relationship> in a common-sense knowledge graph, it uses the quadruple <geographic location point 1, geographic location point 2, spatial relationship type, spatial relationship value>, which makes the expression of spatial relationships more standardized and unified, and makes systematic calculation, reasoning, and storage of spatial relationship knowledge possible.
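The quadruple above maps directly onto a small record type; a minimal sketch (field names and the example values are my own, not from the application):

```python
from dataclasses import dataclass

@dataclass
class SpatialRelation:
    """<geographic location point 1, geographic location point 2,
    spatial relationship type, spatial relationship value>"""
    point_1: str
    point_2: str
    rel_type: str
    rel_value: str

# e.g. for a text like "Haidilao is on the sixth floor of City Shopping Center"
r = SpatialRelation("Haidilao", "City Shopping Center", "floor", "6")
print(r.rel_type, r.rel_value)  # floor 6
```

A fixed schema like this is what makes the extracted knowledge storable and queryable alongside ordinary knowledge-graph triples.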
  • the spatial relationship extraction model of geographic location points is one of the key points.
  • the training process of the above model will be described in detail below in conjunction with embodiments.
  • Fig. 4 is a flow chart of the method for training the spatial relationship extraction model of geographic location points provided in the second embodiment of the application. As shown in Fig. 4, the method includes the following steps:
  • the training data is obtained, and the training data includes: text and labeling of the geographic location point and the spatial relationship information of the geographic location point in the text.
  • the training samples can be constructed by manual labeling. For example, address data can be crawled from official website data associated with geographic location points and then labeled.
  • X represents the text
  • Y represents the label.
  • "O" indicates that a character belongs to no label, i.e. it is part of neither a POI name nor the type or value of a spatial relationship.
  • "B" indicates a start; for example, "POI_B" indicates the starting character of a POI label, "VAL_B" the starting character of a spatial relationship value, and "LOF_B" the starting character of a spatial relationship type.
  • "I" indicates a middle character; for example, "POI_I" indicates a middle character of a POI label.
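The B/I/O scheme above can be illustrated on a toy labeled pair (X, Y); the example text and its tags are invented for illustration, not taken from the application's training data:

```python
# X: character-granularity tokens; Y: one B/I/O tag per token
X = list("海底捞六层")
Y = ["POI_B", "POI_I", "POI_I", "VAL_B", "LOF_B"]

def spans(tokens, tags):
    """Recover labeled spans from B/I/O tags."""
    out, cur, cur_tag = [], [], None
    for tok, tag in zip(tokens, tags):
        if tag.endswith("_B"):          # start a new span
            if cur:
                out.append((cur_tag, "".join(cur)))
            cur, cur_tag = [tok], tag[:-2]
        elif tag.endswith("_I") and cur:  # continue the current span
            cur.append(tok)
        else:                           # "O": outside any label
            if cur:
                out.append((cur_tag, "".join(cur)))
            cur, cur_tag = [], None
    if cur:
        out.append((cur_tag, "".join(cur)))
    return out

print(spans(X, Y))  # [('POI', '海底捞'), ('VAL', '六'), ('LOF', '层')]
```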
  • training samples can also be obtained by manually constructing text and labeling it, or by obtaining high-quality text from other data sources and labeling it manually.
  • the training data is used to train the geographic location point spatial relationship extraction model, where the geographic location point spatial relationship extraction model includes an embedding layer, a Transformer layer, and a mapping layer.
  • the training objective is that the labels predicted by the mapping layer for the text conform to the labels in the training data.
  • the embedding layer may include multiple embedding layers.
  • the first embedding layer, denoted as Token Embedding in the figure, is used to encode each Token (element) in the text.
  • the Token in the text can include characters and separators in the text.
  • the second embedding layer is used to encode the position of each Token.
  • the position information of each Token in the text can be determined; for example, the Tokens are numbered sequentially by position, and each position number is encoded.
  • the third embedding layer is used to encode the sentence identifier to which each Token belongs.
  • the sentences in the text are numbered sequentially as sentence identifiers, and the sentence identifiers to which each Token belongs are coded.
  • e_i denotes the dense vector representation of the i-th Token.
  • the token vector of the i-th Token is obtained by looking up the character in the word vector matrix and converting it into a dense vector.
  • the position vector of the i-th Token is obtained by looking up its position in the corresponding vector matrix and converting the position into a dense vector.
  • the sentence vector of the i-th Token is obtained by looking up its sentence identifier in the corresponding vector matrix and converting the identifier into a dense vector.
  • the dense vector representation of the encoding result of each embedding layer is output to the Transformer layer, and the Transformer layer performs multi-layer Attention (attention) mechanism processing, and then outputs the hidden vector.
  • the Transformer layer's input is the dense vector sequence E = {e_1, e_2, ..., e_n}, where n is the length of the input sequence, that is, the number of Tokens.
  • the mapping layer may include CRF, which is used to use the hidden vector output by the Transformer layer to predict the spatial relationship information contained in the text of the input model.
  • the maximum likelihood loss function can take the form L(θ) = -Σ log P(y|x; θ) + λ‖θ‖², where θ represents all the parameters of the model, and λ is a regularization hyperparameter that needs to be adjusted and determined manually.
  • the actual training goal is: try to make the CRF's label prediction of the text conform to the label in the training data.
  • use the aforementioned loss function to adjust the model parameters of the embedding layer, the Transformer layer, and the CRF layer, and minimize the value of the loss function as much as possible.
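The objective above, making the CRF's label predictions match the training labels while regularizing all parameters, reduces to a scalar loss that training then minimizes. A sketch, where the label probabilities and λ are illustrative numbers:

```python
import math

def regularized_nll(probs_of_true_labels, params, lam=0.01):
    """Negative log-likelihood of the gold label sequences plus an
    L2 penalty lam * ||theta||^2; lam is the manually tuned
    regularization hyperparameter."""
    nll = -sum(math.log(p) for p in probs_of_true_labels)
    l2 = lam * sum(w * w for w in params)
    return nll + l2

# two training sequences whose gold labelings the model assigns 0.9 and 0.8
loss = regularized_nll([0.9, 0.8], params=[0.5, -0.3], lam=0.1)
```

Minimizing the first term pushes probability mass onto the correct label sequences; the second term keeps the parameters of the embedding layer, Transformer layer, and CRF layer from overfitting the (limited) labeled data.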
  • Training a good model requires large-scale and high-quality training data.
  • however, training data that contains geographic location points and their spatial relationship information and meets certain quality requirements is scarce, and manual labeling is required, which makes high-quality training data difficult to obtain.
  • therefore, a preferred embodiment is provided in the embodiments of this application, which uses pre-training + fine-tuning to train the spatial relationship extraction model for geographic location points.
  • text mined from the Internet can be used to form the first training data.
  • although the quality of the first training data is not high, a relatively large amount of it can be obtained.
  • the official website text is manually annotated to form the second training data.
  • these training data are of high quality but small in quantity, and they are used to further tune the model parameters obtained in the pre-training process. This method is described below in conjunction with the third embodiment.
  • Fig. 5 is a flow chart of the method for training the spatial relationship extraction model of geographic location points provided in the third embodiment of the application. As shown in Fig. 5, the method includes the following steps:
  • first training data is obtained, and the first training data includes: text and a label of the geographic location point and the spatial relationship of the geographic location point in the text.
  • the first training data can be texts mined from the Internet that contain geographic location point keywords and spatial relationship type keywords, used as pre-training data. Compared with manual labeling, this part of the training data has a lower accuracy rate.
  • the labeled data is therefore weakly labeled data.
  • this application does not limit the specific method of mining the above text from the Internet. One of the simplest methods is to pre-build a dictionary of keywords describing spatial relationships between geographic location points, and to match the dictionary together with the geographic location point names in the map database against a large amount of Internet text, thereby obtaining the above text.
  • the labeling of the geographical location point and the spatial relationship type of the geographical location point in the text can also be realized based on the automatic matching of the geographical location point names in the above-mentioned dictionary and the map database. Since a large amount of manual participation is not required, the first training data required for the pre-training may be large-scale and massive data, so as to ensure the training requirements of the pre-training model.
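The dictionary-matching idea can be sketched as below; the dictionary entries and POI names are invented examples, not the application's actual data:

```python
REL_KEYWORDS = ["sixth floor", "east of", "opposite", "upstairs"]  # relation dictionary
POI_NAMES = ["Haidilao", "City Shopping Center"]                   # from the map database

def weak_label(text):
    """Keep a text as pre-training data only if it matches both a
    geographic location point name and a spatial-relationship keyword;
    the matches themselves serve as the (weak) labels."""
    pois = [p for p in POI_NAMES if p in text]
    rels = [k for k in REL_KEYWORDS if k in text]
    return (pois, rels) if pois and rels else None

print(weak_label("Haidilao is on the sixth floor of City Shopping Center"))
# (['Haidilao', 'City Shopping Center'], ['sixth floor'])
```

Because matching is fully automatic, this pipeline scales to the massive corpora that pre-training requires, at the cost of the label noise noted above.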
  • the first training data is used to train a pre-training model.
  • the pre-training model includes: an embedding layer, a Transformer layer, and at least one task layer.
  • the structure of the pre-training model can be as shown in Figure 6a.
  • the structures and uses of the embedding layer and the Transformer layer are similar to those in the first and second embodiments. The only difference is that the embedding layer can also include a fourth embedding layer, represented as Task Embedding in the figure, which encodes the identifier of the task layer applicable to the input text.
  • different forms of the first training data are used for different task layers; each task layer is described in the subsequent steps. As shown in FIG. 6a, if the current form of the first training data is used for the task layer with ID 2, then the task-layer ID of each Token is 2.
  • the hidden vector output by the Transformer layer is input to each task layer.
  • the task layer includes at least one of a masking prediction task layer, a spatial relationship prediction task layer, and a geographic location point prediction task layer.
  • the mask prediction task layer is used to predict the content of the masked part of the text in the first training data based on the hidden vectors output by the Transformer layer; the training objective is to minimize the difference between the prediction result and the actual content of the masked part.
  • masking can be applied to individual characters or to whole geographic location points.
  • the masked positions can be chosen randomly, or according to rules specified by the user.
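The two masking strategies, random characters versus a whole geographic location point span, might be sketched like this; the mask rate, span argument, and seed are illustrative:

```python
import random

def mask_tokens(tokens, poi_span=None, p=0.15, seed=0):
    """If poi_span is given, mask that whole geographic-location-point
    span; otherwise mask each character independently with probability p."""
    rng = random.Random(seed)
    out = list(tokens)
    if poi_span is not None:
        lo, hi = poi_span               # mask the whole POI
        for i in range(lo, hi):
            out[i] = "[MASK]"
    else:
        for i in range(len(out)):       # mask random characters
            if rng.random() < p:
                out[i] = "[MASK]"
    return out

print(mask_tokens(list("海底捞六层"), poi_span=(0, 3)))
# ['[MASK]', '[MASK]', '[MASK]', '六', '层']
```

Masking the whole POI span forces the model to reconstruct the entity from its spatial context, which is the signal this pre-training task is after.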
  • the spatial relationship prediction task layer is used to predict the spatial relationship described by the text of the first training data based on the hidden vector output by the Transformer layer, and the training target is that the prediction result conforms to the corresponding spatial relationship label.
  • This task layer can be regarded as a multi-classification task, using cross entropy to determine the loss function.
  • the geographic location point prediction task layer is used to predict the geographic location points contained in the text of the first training data based on the hidden vector output by the Transformer layer, and the training target is that the prediction result conforms to the corresponding geographic location point label.
  • this can be formulated, for example, as predicting Pr(P | X, S) or Pr(S | X, P), where X denotes the text, S the spatial relationship, and P the geographic location points.
  • This task layer can be regarded as a multi-classification task, using cross entropy to determine the loss function.
  • the above-mentioned task layers can be implemented in a fully connected mode, a classifier mode, and so on.
  • each task layer is trained alternately or at the same time.
  • Use the loss function corresponding to the training target of the trained task layer to optimize the model parameters of the embedding layer, the Transformer layer and the trained task layer.
  • this application adopts a multi-task learning method, which can share knowledge among multiple tasks, thereby obtaining a better pre-training model.
  • one task layer can be selected, sequentially or randomly, for each round of training, and the loss function of the selected task layer is used to optimize the model parameters of the embedding layer, the Transformer layer, and the trained task layer.
  • all tasks can be trained at the same time each time, and a joint loss function can be constructed according to the loss function of each task.
  • the weighting coefficients can be determined manually through parameter tuning, for example as experimental or empirical values. The joint loss function is then used to optimize the model parameters of the embedding layer, the Transformer layer, and all task layers.
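Simultaneous training with a manually weighted joint loss can be sketched as:

```python
def joint_loss(task_losses, weights):
    """Joint loss for simultaneous multi-task training: a weighted
    sum of the per-task losses, with weights set by hand
    (experimental or empirical values, as stated above)."""
    assert len(task_losses) == len(weights)
    return sum(w * l for w, l in zip(weights, task_losses))

# losses from the masking, spatial-relationship, and geographic-point
# prediction task layers (illustrative numbers)
total = joint_loss([0.7, 1.2, 0.4], [0.5, 0.3, 0.2])
```

One backward pass on this single scalar then updates the shared embedding and Transformer layers together with all task layers, which is how knowledge is shared among the tasks.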
  • the second training data is obtained, and the second training data includes: text and labels of the geographic location points in the text and of the types and values of the spatial relationships between them.
  • the second training data is the same as the training data obtained in step 401 in the second embodiment, and will not be repeated here.
  • the second training data is used to train the geographic location point spatial relationship extraction model.
  • the geographic location point spatial relationship extraction model includes an embedding layer, a Transformer layer, and a mapping layer.
  • this step is in fact the fine-tuning stage.
  • after pre-training, the processing of the embedding layer and the Transformer layer is relatively complete, and their model parameters have stabilized.
  • in the fine-tuning stage, the embedding layer and the Transformer layer trained in the pre-training model can therefore be used directly to continue training the geographic location point spatial relationship extraction model; the CRF layer replaces the task layers, so that the hidden vectors output by the Transformer layer are input directly to the CRF layer.
  • if the pre-training model with the structure shown in Figure 6a is used, that is, the embedding layer also includes the Task Embedding layer, then the input in the fine-tuning phase also includes Task Embedding; in this case, the task code of each Token can be an arbitrary, randomly selected value, and the corresponding structure is shown in Figure 6b. Correspondingly, when the trained spatial relationship extraction model is used for extraction, the embedding layer also includes the Task Embedding layer, and the task code of each Token can likewise be an arbitrary, randomly selected value.
  • To reduce the risk of overfitting, the model parameters of the embedding layer and the Transformer layer can be kept fixed; in the training process of this step, only the model parameters of the mapping layer, such as the CRF layer, are optimized (fine-tuned).
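The selective fine-tuning described above can be illustrated with a minimal, framework-agnostic pure-Python sketch (all parameter names and values are made up for illustration): parameters whose names fall under a frozen prefix keep their pretrained values, while only the mapping-layer (CRF) parameters receive gradient updates.

```python
# Sketch of selective fine-tuning: only the mapping-layer (CRF) parameters
# are updated; embedding and Transformer parameters stay frozen.
# All names and numbers are illustrative, not from the patent.

def sgd_step(params, grads, frozen_prefixes, lr=0.1):
    """Apply one SGD update, skipping any parameter whose name starts
    with a frozen prefix."""
    updated = {}
    for name, value in params.items():
        if any(name.startswith(p) for p in frozen_prefixes):
            updated[name] = value                      # frozen: keep pretrained value
        else:
            updated[name] = value - lr * grads[name]   # fine-tuned
    return updated

params = {"embedding.w": 1.0, "transformer.w": 2.0, "crf.w": 3.0}
grads  = {"embedding.w": 0.5, "transformer.w": 0.5, "crf.w": 0.5}
new_params = sgd_step(params, grads, frozen_prefixes=("embedding.", "transformer."))
```

In a real framework the same effect is achieved by excluding the frozen layers from the optimizer or disabling their gradients.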
  • FIG. 7 is a structural diagram of an apparatus for training the spatial relationship extraction model of geographic location points provided in the fourth embodiment of the application.
  • As shown in FIG. 7, the apparatus may include a second acquiring unit 01 and a second training unit 02, and may further include a first acquiring unit 03 and a first training unit 04.
  • the main functions of each component are as follows:
  • the second acquiring unit 01 is configured to acquire second training data, and the second training data includes: text and labeling of geographic location points and spatial relationship information of geographic location points in the text.
  • The training samples can be constructed by manual annotation, for example by crawling address data from official website data associated with geographic locations and annotating it.
  • the second training unit 02 is used to train the spatial relationship extraction model of geographic location points using the second training data.
  • the spatial relationship extraction model of geographic location points includes an embedding layer, a Transformer layer, and a mapping layer. Among them, the trained geographic location point spatial relationship extraction model is used to extract geographic location point spatial relationship information from the input text.
  • The above-mentioned embedding layer includes: a first embedding layer for character encoding of each Token in the text, a second embedding layer for position encoding of each Token, and a third embedding layer for encoding the identifier of the sentence to which each Token belongs.
  • The mapping layer may include a CRF layer, which uses the hidden vectors output by the Transformer layer to predict the spatial relationship information contained in the text.
  • the training objectives of the spatial relationship extraction model of geographic location points include: the label prediction of the text by the mapping layer conforms to the label in the second training data.
  • Training a good model requires large-scale, high-quality training data.
  • Training data that simultaneously contains geographic location points and geographic location point spatial relationship information and meets certain quality requirements is scarce and requires manual annotation, which makes high-quality training data difficult to obtain.
  • To solve this problem, a preferred embodiment is provided in the embodiments of this application, which trains the geographic location point spatial relationship extraction model using pre-training + fine-tuning (optimization adjustment).
  • the text excavated from the Internet can be used to form the first training data.
  • Although the quality of these texts is not high, a relatively large amount of first training data can be obtained.
  • In the fine-tuning phase, high-precision text is manually annotated to form the second training data. These texts are of high quality and small in quantity, and further tuning can be performed on the basis of the model parameters obtained in the pre-training process.
  • the device also includes:
  • the first acquiring unit 03 is configured to acquire first training data, and the first training data includes: text and annotations of the geographic location points and the spatial relationship between the geographic location points in the text.
  • The first training data can be text mined from the Internet that contains geographic location points and spatial relationship type keywords, used as pre-training data. This part of the training data has a relatively low accuracy rate; compared with manually precision-annotated data, it is weakly labeled data.
  • This application is not limited to the specific method of mining the above text from the Internet.
  • One of the simplest methods is to pre-build a dictionary of keywords for spatial relationships between geographic location points, and match this dictionary, together with the geographic location point names in a map database, against a large amount of Internet text, so as to obtain the above text.
  • The annotation of the geographic location points and the spatial relationship types in the text can also be realized based on automatic matching against the above dictionary and the geographic location point names in the map database. Since large-scale manual participation is not required, the first training data required for pre-training can be large-scale, massive data, ensuring the training requirements of the pre-training model.
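The dictionary-matching idea for producing weakly labeled pre-training data can be sketched in a few lines of pure Python. The dictionary contents and sample sentence below are illustrative, not from the patent's data:

```python
# Sketch of weak labeling by dictionary matching: a hand-built dictionary of
# spatial-relation keywords plus POI names from a map database is matched
# against raw web text. Dictionary contents are illustrative.

RELATION_KEYWORDS = ["south", "north", "floor"]
POI_NAMES = ["Haidilao", "Wucaicheng Shopping Center"]

def weak_label(text):
    """Return the POIs and relation keywords found in a text, or None
    if the text does not mention at least two POIs and one relation."""
    pois = [p for p in POI_NAMES if p in text]
    rels = [r for r in RELATION_KEYWORDS if r in text]
    if len(pois) >= 2 and rels:
        return {"pois": pois, "relations": rels}
    return None

sample = "Haidilao is on floor 6 of Wucaicheng Shopping Center"
label = weak_label(sample)
```

Such substring matching is noisy, which is exactly why the resulting data is treated as weakly labeled and only used for pre-training.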
  • the first training unit 04 is configured to use the first training data to train a pre-training model.
  • the pre-training model includes: an embedding layer, a Transformer layer, and at least one task layer.
  • When the second training unit 02 uses the second training data to train the geographic location point spatial relationship extraction model, it does so based on the embedding layer and the Transformer layer obtained from pre-training.
  • the aforementioned at least one task layer includes at least one of a mask prediction task layer, a spatial relationship prediction task layer, and a geographic location point prediction task layer.
  • The mask prediction task layer is used to predict the content of the masked part in the text of the first training data based on the hidden vectors output by the Transformer layer; the training target is that the prediction result matches the actual content of the masked part.
  • the spatial relationship prediction task layer is used to predict the spatial relationship described by the text of the first training data based on the hidden vector output by the Transformer layer, and the training target is that the prediction result conforms to the corresponding spatial relationship label.
  • the geographic location point prediction task layer is used to predict the geographic location points contained in the text of the first training data based on the hidden vector output by the Transformer layer, and the training target is that the prediction result conforms to the corresponding geographic location point label.
  • the above at least one task layer is trained alternately or simultaneously, and the model parameters of the embedding layer, the Transformer layer, and the trained task layer are optimized by using the loss function corresponding to the training target of the trained task layer.
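For the simultaneous-training case, the weighted joint loss can be sketched as follows (task names and weights are illustrative; in practice the weights are hand-tuned hyperparameters, as noted earlier):

```python
# Sketch of a joint multi-task loss: a weighted sum of the per-task losses
# (mask prediction, spatial-relation prediction, POI prediction).
# Values are illustrative only.

def joint_loss(task_losses, weights):
    """Weighted sum of per-task losses; weight keys must cover all tasks."""
    return sum(weights[t] * l for t, l in task_losses.items())

losses = {"mask": 0.8, "relation": 0.5, "poi": 0.2}
weights = {"mask": 1.0, "relation": 2.0, "poi": 0.5}
total = joint_loss(losses, weights)
```

The gradient of this scalar then updates the shared embedding and Transformer parameters together with all task layers.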
  • As a preferred implementation, to reduce the risk of overfitting, when the second training unit 02 uses the second training data to train the geographic location point spatial relationship extraction model, it keeps the model parameters of the embedding layer and the Transformer layer obtained from pre-training unchanged and optimizes only the model parameters of the mapping layer, until the training target of the model is reached.
  • FIG. 8 is a structural diagram of an apparatus for extracting the spatial relationship of geographic location points provided in Embodiment 5 of the application. As shown in FIG. 8, the apparatus may include: an acquiring unit 11 and an extracting unit 12. The main functions of each component are as follows:
  • the obtaining unit 11 is configured to obtain text containing geographic point information from the Internet.
  • The extraction unit 12 is used to input the text into a pre-trained geographic location point spatial relationship extraction model and obtain the spatial relationship information output by the model; the geographic location point spatial relationship extraction model includes an embedding layer, a Transformer layer, and a mapping layer.
  • The above-mentioned embedding layer includes: a first embedding layer for character encoding of each Token in the text, a second embedding layer for position encoding of each Token, and a third embedding layer for encoding the identifier of the sentence to which each Token belongs.
  • The mapping layer includes a CRF, which uses the hidden vectors output by the Transformer layer to predict the spatial relationship information contained in the text.
  • The quadruple format <geographic location point 1, geographic location point 2, spatial relationship type, spatial relationship value> can be used, making the expression of spatial relationships more standardized and unified and making systematic computation, reasoning, and storage of spatial relationship knowledge possible.
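The quadruple can be represented directly as a small data structure. The sketch below uses a `namedtuple` with illustrative field names; the example values follow the Haidilao / Wucaicheng illustration used elsewhere in this description:

```python
# Sketch of the <point1, point2, relation type, relation value> quadruple
# as a lightweight data structure. Field names are illustrative.

from collections import namedtuple

SpatialRelation = namedtuple(
    "SpatialRelation", ["point1", "point2", "rel_type", "rel_value"]
)

r = SpatialRelation("Haidilao", "Wucaicheng Shopping Center", "floor", "6")
```

A uniform record like this is what makes downstream storage and reasoning over spatial-relationship knowledge straightforward.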
  • the present application also provides an electronic device and a readable storage medium.
  • FIG. 9 is a block diagram of an electronic device according to the method of the embodiments of the present application.
  • Electronic devices are intended to represent various forms of digital computers, such as laptop computers, desktop computers, workstations, personal digital assistants, servers, blade servers, mainframe computers, and other suitable computers.
  • Electronic devices can also represent various forms of mobile devices, such as personal digital assistants, cellular phones, smart phones, wearable devices, and other similar computing devices.
  • the components shown herein, their connections and relationships, and their functions are merely examples, and are not intended to limit the implementation of the application described and/or required herein.
  • the electronic device includes one or more processors 901, a memory 902, and interfaces for connecting various components, including a high-speed interface and a low-speed interface.
  • the various components are connected to each other using different buses, and can be installed on a common motherboard or installed in other ways as needed.
  • The processor may process instructions executed in the electronic device, including instructions stored in or on the memory to display graphical information of a GUI on an external input/output device (such as a display device coupled to an interface).
  • In other implementations, if necessary, multiple processors and/or multiple buses can be used together with multiple memories.
  • multiple electronic devices can be connected, and each device provides part of the necessary operations (for example, as a server array, a group of blade servers, or a multi-processor system).
  • In FIG. 9, one processor 901 is taken as an example.
  • the memory 902 is a non-transitory computer-readable storage medium provided by this application. Wherein, the memory stores instructions executable by at least one processor, so that the at least one processor executes the method provided in this application.
  • the non-transitory computer-readable storage medium of this application stores computer instructions, which are used to make a computer execute the method provided in this application.
  • the memory 902 can be used to store non-transitory software programs, non-transitory computer-executable programs, and modules, such as program instructions/modules corresponding to the methods in the embodiments of the present application.
  • the processor 901 executes various functional applications and data processing of the server by running non-transitory software programs, instructions, and modules stored in the memory 902, that is, implements the methods in the foregoing method embodiments.
  • the memory 902 may include a program storage area and a data storage area.
  • the program storage area may store an operating system and an application program required by at least one function; the data storage area may store data created according to the use of the electronic device.
  • the memory 902 may include a high-speed random access memory, and may also include a non-transitory memory, such as at least one magnetic disk storage device, a flash memory device, or other non-transitory solid-state storage devices.
  • the memory 902 may optionally include memories remotely provided with respect to the processor 901, and these remote memories may be connected to the electronic device through a network. Examples of the aforementioned networks include, but are not limited to, the Internet, corporate intranets, local area networks, mobile communication networks, and combinations thereof.
  • the electronic device may further include: an input device 903 and an output device 904.
  • the processor 901, the memory 902, the input device 903, and the output device 904 may be connected by a bus or in other ways. The connection by a bus is taken as an example in FIG. 9.
  • The input device 903 can receive input digital or character information and generate key signal input related to the user settings and function control of the electronic device, and may be, for example, a touch screen, keypad, mouse, track pad, touch pad, pointing stick, one or more mouse buttons, trackball, joystick, or other input device.
  • the output device 904 may include a display device, an auxiliary lighting device (for example, LED), a tactile feedback device (for example, a vibration motor), and the like.
  • the display device may include, but is not limited to, a liquid crystal display (LCD), a light emitting diode (LED) display, and a plasma display. In some embodiments, the display device may be a touch screen.
  • Various implementations of the systems and techniques described herein can be realized in digital electronic circuit systems, integrated circuit systems, ASICs (application-specific integrated circuits), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special-purpose or general-purpose, and which can receive data and instructions from, and transmit data and instructions to, a storage system, at least one input device, and at least one output device.
  • The terms "machine-readable medium" and "computer-readable medium" refer to any computer program product, device, and/or apparatus (for example, magnetic disks, optical disks, memory, programmable logic devices (PLDs)) used to provide machine instructions and/or data to a programmable processor, including machine-readable media that receive machine instructions as machine-readable signals.
  • The term "machine-readable signal" refers to any signal used to provide machine instructions and/or data to a programmable processor.
  • The systems and techniques described here can be implemented on a computer that has: a display device for displaying information to the user (for example, a CRT (cathode ray tube) or LCD (liquid crystal display) monitor); and a keyboard and a pointing device (for example, a mouse or a trackball) through which the user can provide input to the computer.
  • Other types of devices can also be used to provide interaction with the user; for example, the feedback provided to the user can be any form of sensory feedback (for example, visual feedback, auditory feedback, or tactile feedback), and input from the user can be received in any form (including acoustic input, voice input, or tactile input).
  • The systems and technologies described herein can be implemented in a computing system that includes back-end components (for example, as a data server), middleware components (for example, an application server), or front-end components (for example, a user computer with a graphical user interface or a web browser through which the user can interact with implementations of the systems and technologies described herein), or any combination of such back-end, middleware, or front-end components.
  • the components of the system can be connected to each other through any form or medium of digital data communication (for example, a communication network). Examples of communication networks include: local area network (LAN), wide area network (WAN), and the Internet.
  • the computer system can include clients and servers.
  • The client and server are generally remote from each other and usually interact through a communication network.
  • the relationship between the client and the server is generated by computer programs that run on the corresponding computers and have a client-server relationship with each other.


Abstract

A method for extracting spatial relationships between geographic location points, a method for training the extraction model, and corresponding apparatuses, relating to the field of big data technology. The method includes: acquiring second training data, the second training data including text and annotations of the geographic location points and geographic location point spatial relationship information in the text; and using the second training data to train a geographic location point spatial relationship extraction model, the model including an embedding layer, a Transformer layer, and a mapping layer, the model being used to extract geographic location point spatial relationship information from input text. The method can extract geographic location point spatial relationship information from Internet text, solving the problem that spatial relationships are inaccurate due to coordinate errors of geographic location points, or cannot be automatically generated in the case of floor relationships.

Description

Method for extracting spatial relationships between geographic location points, method for training the extraction model, and apparatuses
This application claims priority to Chinese patent application No. 2020104382142, filed on May 21, 2020, entitled "Method for extracting spatial relationships between geographic location points, method for training the extraction model, and apparatuses".
Technical Field
This application relates to the field of computer application technology, and in particular to the field of big data technology.
Background
The main goal of a map is to depict the real world and make users' travel easier. A high-precision knowledge graph of geographic location points is the basis for meeting users' core needs such as finding places on the map and traveling. Spatial relationships between geographic location points are one of the essential elements of the knowledge graph, enabling more accurate logical reasoning and querying.
At present, one method of mining spatial relationships between geographic location points is to generate them automatically from the coordinates of the points. However, this method depends on the accuracy of the coordinates, and the coordinate error of geographic location points is generally tens of meters or even more than a hundred meters, so the spatial relationships generated in this way are inaccurate. Floor relationships in particular cannot be generated automatically from coordinates.
Summary
In view of this, this application solves the above technical problems in the prior art through the following technical solutions.
In a first aspect, this application provides a method for training a geographic location point spatial relationship extraction model, the method including:
acquiring second training data, the second training data including: text and annotations of the geographic location points and geographic location point spatial relationship information in the text;
using the second training data to train a geographic location point spatial relationship extraction model, the model including an embedding layer, a Transformer layer, and a mapping layer;
where the geographic location point spatial relationship extraction model is used to extract geographic location point spatial relationship information from input Internet text.
In a second aspect, this application further provides a method for extracting spatial relationships between geographic location points, the method including:
acquiring text containing geographic location point information from the Internet;
inputting the text into a pre-trained geographic location point spatial relationship extraction model, and acquiring the spatial relationship information output by the model; where the model includes an embedding layer, a Transformer layer, and a mapping layer.
In a third aspect, this application provides an apparatus for training a geographic location point spatial relationship extraction model, the apparatus including:
a second acquiring unit configured to acquire second training data, the second training data including: text and annotations of the geographic location points and geographic location point spatial relationship information in the text;
a second training unit configured to use the second training data to train a geographic location point spatial relationship extraction model, the model including an embedding layer, a Transformer layer, and a mapping layer;
where the geographic location point spatial relationship extraction model is used to extract geographic location point spatial relationship information from input text.
In a fourth aspect, this application further provides an apparatus for extracting spatial relationships between geographic location points, the apparatus including:
an acquiring unit configured to acquire text containing geographic location point information from the Internet;
an extracting unit configured to input the text into a pre-trained geographic location point spatial relationship extraction model and acquire the spatial relationship information output by the model; where the model includes an embedding layer, a Transformer layer, and a mapping layer.
In a fifth aspect, this application provides an electronic device, including:
at least one processor; and
a memory communicatively connected with the at least one processor; where
the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor so that the at least one processor can perform the method described in any one of the above.
In a sixth aspect, this application further provides a non-transitory computer-readable storage medium storing computer instructions, the computer instructions being used to cause the computer to perform the method described in any one of the above.
It can be seen from the above technical solutions that this application can extract geographic location point spatial relationship information from Internet text, solving the problem that spatial relationships are inaccurate due to coordinate errors of geographic location points, or that floor relationships cannot be generated automatically.
Other effects of the above optional manners will be described below in combination with specific embodiments.
Brief Description of the Drawings
The drawings are used for better understanding of this solution and do not constitute a limitation of this application. Among them:
FIG. 1 shows an exemplary system architecture to which the method or apparatus of the embodiments of this application can be applied;
FIG. 2 is a flowchart of a method for extracting spatial relationships between geographic location points provided in Embodiment 1 of this application;
FIG. 3 is a schematic structural diagram of the geographic location point spatial relationship extraction model provided in Embodiment 1 of this application;
FIG. 4 is a flowchart of a method for training the geographic location point spatial relationship extraction model provided in Embodiment 2 of this application;
FIG. 5 is a flowchart of a method for training the geographic location point spatial relationship extraction model provided in Embodiment 3 of this application;
FIG. 6a is a schematic structural diagram of the pre-training model provided in Embodiment 3 of this application;
FIG. 6b is a schematic structural diagram of the geographic location point spatial relationship extraction model provided in Embodiment 3 of this application;
FIG. 7 is a structural diagram of an apparatus for training the geographic location point spatial relationship extraction model provided in Embodiment 4 of this application;
FIG. 8 is a structural diagram of an apparatus for extracting spatial relationships between geographic location points provided in Embodiment 5 of this application;
FIG. 9 is a block diagram of an electronic device used to implement the method of the embodiments of this application.
Detailed Description
Exemplary embodiments of this application are described below with reference to the drawings, including various details of the embodiments to aid understanding; they should be regarded as merely exemplary. Those of ordinary skill in the art should therefore recognize that various changes and modifications can be made to the embodiments described here without departing from the scope and spirit of this application. Likewise, for clarity and conciseness, descriptions of well-known functions and structures are omitted from the following description.
FIG. 1 shows an exemplary system architecture to which the method or apparatus of the embodiments of this application can be applied. As shown in FIG. 1, the system architecture may include terminal devices 101 and 102, a network 103, and a server 104. The network 103 is the medium providing communication links between the terminal devices 101, 102 and the server 104, and may include various connection types, such as wired or wireless communication links, or fiber-optic cables.
Users may use the terminal devices 101 and 102 to interact with the server 104 through the network 103. Various applications may be installed on the terminal devices 101 and 102, such as map applications, web browser applications, and communication applications.
The terminal devices 101 and 102 may be various user devices capable of running map applications, including but not limited to smartphones, tablets, PCs, smart TVs, and so on. The apparatus for extracting spatial relationships between geographic location points provided by this application may be set up and run in the above server 104, or run in a device independent of the server 104. It may be implemented as multiple pieces of software or software modules (for example, to provide distributed services), or as a single piece of software or software module, which is not specifically limited here. The server 104 may interact with a map database 105; specifically, the server 104 may acquire data from the map database 105 or store data in it. The map database 105 stores map data including POI information.
For example, the apparatus for extracting spatial relationships between geographic location points is set up and runs in the above server 104. The server 104 extracts spatial relationships between geographic location points using the method provided in the embodiments of this application, and then updates the map database 105 with the acquired spatial relationships. In response to query requests from the terminal devices 101 and 102, the server 104 can query the map database 105 and return to the terminal devices information related to the queried geographic location points, including information generated based on the spatial relationships between geographic location points.
The server 104 may be a single server or a server group composed of multiple servers. In addition to existing in the form of a server, 104 may also be another computer system or processor with relatively high computing performance. It should be understood that the numbers of terminal devices, networks, servers, and databases in FIG. 1 are merely illustrative; there may be any number of them according to implementation needs.
There is a large amount of intelligence related to geographic locations on the Internet, and this intelligence mentions the spatial relationships between the corresponding geographic locations and other geographic locations. Using text parsing technology, geospatial relationships between geographic locations can be constructed automatically from this intelligence. These parts are described in detail below in combination with the embodiments.
The geographic location points involved in this application refer to geographic location points in map applications, which can be queried, browsed, and displayed to users. These points have basic attributes such as latitude and longitude, name, administrative address, and type. Geographic location points may include, but are not limited to, POIs (Points Of Interest), AOIs (Areas of Interest), ROIs (Regions of Interest), and so on. POIs are used as examples in the subsequent embodiments. POI is a term in geographic information systems, generally referring to any geographic object that can be abstracted as a point: a POI can be a house, a shop, a mailbox, a bus station, a school, a hospital, and so on. The main purpose of a POI is to describe the location of a thing or event, thereby enhancing the ability to describe and query such locations.
Embodiment 1
FIG. 2 is a flowchart of a method for extracting spatial relationships between geographic location points provided in Embodiment 1 of this application. As shown in the figure, the method may include the following steps:
In 201, text containing geographic location point information is acquired from the Internet.
In this application, text containing geographic location point information can be acquired from the official websites associated with geographic location points. For example, the text "Haidilao, 6th floor, Wucaicheng Shopping Center, Qinghe Middle Street, Haidian District, Beijing" is acquired from the Haidilao official website, and "China Merchants Bank, Beijing Tsinghua Yuan Sub-branch, Tsinghua Science Park, Haidian District, Floor G, Building B, Science and Technology Building, 200 meters south of the East Gate of Tsinghua University" is acquired from China Merchants Bank.
In addition to the above data sources, text containing geographic location point information can also be acquired from other data sources.
In 202, the text is input into a pre-trained geographic location point spatial relationship extraction model, and the spatial relationship information output by the model is acquired; the model includes an embedding layer, a Transformer layer, and a mapping layer.
The spatial relationship information involved in the embodiments of this application may include the type and value of the spatial relationship. Spatial relationship types mainly include directional types, such as east, south, west, north, southeast, northeast, southwest, northwest, left, right, upstairs, downstairs, floor, building, and so on. Values may include distance values, floor values, building values, and so on.
The structure of the geographic location point spatial relationship extraction model involved in the embodiments of this application may be as shown in FIG. 3; the embedding layer may comprise multiple embedding layers.
First, the text can be regarded as a sequence composed of at least one sentence. A separator [CLS] can be added before the text and a separator [SEP] between sentences, and each character and separator is treated as a Token. The input sequence X can be expressed as X = {x_1, x_2, …, x_n}, where n is the number of Tokens and x_i denotes one of the Tokens. It should be noted that the embedding layer in this application uses character granularity for Tokens, which effectively solves the problem of long-tail words.
The first embedding layer, denoted Token Embedding in the figure, performs character encoding of each Token (element) in the text; the Tokens include the characters in the text as well as the separators.
The second embedding layer, denoted Position Embedding in the figure, performs position encoding of each Token, i.e., encodes the position information of each Token in the text; for example, the Tokens are numbered by position in order and each position number is encoded.
The third embedding layer, denoted Sentence Embedding in the figure, encodes the identifier of the sentence to which each Token belongs. For example, the sentences in the text are numbered in order as sentence identifiers, and the identifier of the sentence to which each Token belongs is encoded.
After the above embedding layers, each Token, its position information, and its sentence identifier are converted into dense vector representations, which are summed:

e_i = e_i^token + e_i^pos + e_i^sent

where e_i is the vector representation of the i-th Token; e_i^token is the vector representation of the i-th Token as a character, obtained by looking up a word-vector matrix to convert the character into a dense vector; e_i^pos is the vector representation of the position of the i-th Token, obtained by looking up a word-vector matrix to convert the position into a dense vector; and e_i^sent is the vector representation of the identifier of the sentence to which the i-th Token belongs, obtained by looking up a word-vector matrix to convert the sentence identifier into a dense vector.
The encoding results of the embedding layers are output to the Transformer layer (denoted multi-layer transformer in the figure). After multi-layer Attention processing, the Transformer layer outputs hidden vectors. For example, given a dense vector sequence E = {e_1, e_2, …, e_n}, the output is a hidden vector sequence containing context information, h = φ_θ(E) = {h_1, h_2, …, h_n}, where n is the length of the input sequence, i.e., the number of Tokens it contains.
The mapping layer may include a CRF (Conditional Random Field), which uses the hidden vectors output by the Transformer layer to predict the spatial relationship information contained in the text input to the model.
After obtaining the hidden vector sequence h = {h_1, h_2, …, h_n}, we use the CRF to predict labels, obtaining the model output Y = {y_1, y_2, …, y_n}, where y_i is the predicted label of the corresponding input x_i.
For each token x_i, we can obtain a distribution over labels:

p_i = W_p · h_i

where W_p is a d×c-dimensional weight parameter matrix and c is the number of output labels.
Then, for each predicted sequence Y = {y_1, y_2, …, y_n}, we can obtain the score of this sequence:

s(X, Y) = Σ_{i=1..n} (p_{i, y_i} + T_{y_{i-1}, y_i})

where T is the label transition matrix of the CRF.
Finally, we can use a softmax (fully connected) layer to obtain the probability P_r of each predicted sequence Y:

P_r(Y | X) = exp(s(X, Y)) / Σ_{Y'} exp(s(X, Y'))

where Y' ranges over all candidate predicted sequences.
Finally, the predicted sequence Y with the largest probability is taken. The predicted sequence includes the prediction of geographic location point spatial relationship information, including the type and value of the spatial relationship; further, it also includes the prediction of the geographic location points. The result can ultimately be expressed as a quadruple R = <S, O, P, A>, where S and O are geographic location points, P is the spatial relationship type, and A is the spatial relationship value.
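The linear-chain CRF scoring and normalization used by the mapping layer can be illustrated with a brute-force pure-Python sketch that enumerates all tag sequences (only viable for tiny sequences; real implementations use the forward algorithm and Viterbi decoding). All emission and transition values below are made up, and emissions would in practice come from the Transformer hidden vectors:

```python
# Brute-force sketch of linear-chain CRF sequence probability:
# score(Y) = sum_i emission[i][y_i] + sum_i transition[y_{i-1}][y_i],
# P(Y|X) = exp(score(Y)) / sum over all candidate sequences.
# Numbers are illustrative only.

import math
from itertools import product

TAGS = ["O", "POI_B", "POI_I"]

def seq_score(emissions, transition, tags):
    """Emission score of each position plus transition score between neighbors."""
    s = sum(emissions[i][t] for i, t in enumerate(tags))
    s += sum(transition[a][b] for a, b in zip(tags, tags[1:]))
    return s

def sequence_prob(emissions, transition, tags):
    """Normalized probability of one tag sequence over all candidates."""
    num = math.exp(seq_score(emissions, transition, tags))
    den = sum(math.exp(seq_score(emissions, transition, list(y)))
              for y in product(TAGS, repeat=len(emissions)))
    return num / den

emissions = [{"O": 1.0, "POI_B": 0.2, "POI_I": 0.1},
             {"O": 0.3, "POI_B": 1.5, "POI_I": 0.4},
             {"O": 0.2, "POI_B": 0.1, "POI_I": 1.2}]
transition = {a: {b: 0.0 for b in TAGS} for a in TAGS}
transition["POI_B"]["POI_I"] = 1.0   # encourage valid BIO continuation
transition["O"]["POI_I"] = -2.0      # discourage I-tag right after O

# The decoded output is the highest-scoring sequence.
best = max(product(TAGS, repeat=3),
           key=lambda y: seq_score(emissions, transition, list(y)))
```

The transition matrix is what lets the CRF prefer label sequences that are globally consistent (e.g., POI_I following POI_B), rather than picking each token's label independently.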
After the above geographic location point spatial relationship extraction model processes the input text "Haidilao, 6th floor, Wucaicheng Shopping Center, Qinghe Middle Street, Haidian District, Beijing", the extracted spatial relationship between the geographic location points "Haidilao" and "Wucaicheng" has type "floor" and value "floor 6", which can be expressed as the quadruple R = <Haidilao, Wucaicheng, floor, floor 6>.
From the input text "China Merchants Bank, Beijing Tsinghua Yuan Sub-branch, Tsinghua Science Park, Haidian District, Floor G, Building B, Science and Technology Building, 200 meters south of the East Gate of Tsinghua University", the extracted spatial relationship between the geographic location points "China Merchants Bank" and "East Gate of Tsinghua University" has type "south" and value "200 meters", which can be expressed as the quadruple R = <China Merchants Bank, East Gate of Tsinghua University, south, 200 meters>.
It can be seen from this embodiment that this application can extract geographic location point spatial relationship information from Internet text containing geographic location point information.
Moreover, the embodiments of this application define a description system for expressing spatial relationships. Similar to the triple <entity 1, entity 2, relationship> in common-sense knowledge graphs, the quadruple <geographic location point 1, geographic location point 2, spatial relationship type, spatial relationship value> makes the expression of spatial relationships more standardized and unified, making systematic computation, reasoning, and storage of spatial relationship knowledge possible.
In the above process of extracting geographic location point spatial relationship information, the geographic location point spatial relationship extraction model is one of the key points. Having described the model structure above, the training process of the model is described in detail below in combination with the embodiments.
Embodiment 2
FIG. 4 is a flowchart of a method for training the geographic location point spatial relationship extraction model provided in Embodiment 2 of this application. As shown in FIG. 4, the method includes the following steps:
In 401, training data is acquired, the training data including: text and annotations of the geographic location points and geographic location point spatial relationship information in the text.
In this embodiment, training samples can be constructed by manual annotation. For example, address data is crawled from official website data associated with geographic locations and then annotated.
For example, the address "6th floor, Wucaicheng Shopping Center, Qinghe Middle Street, Haidian District, Beijing" (北京市海淀区清河中街五彩城购物中心6层) is crawled from the Haidilao official website and manually annotated, marking the POI and the type and value of the spatial relationship. Table 1 is an example of annotating this text:
Table 1
X: 北 京 市 海 淀 区 清 河 中 街 五 彩 城 购 物 中 心 6 层
Y: O O O O O O O O O O POI_B POI_I POI_I POI_I POI_I POI_I POI_I VAL_B LOF_B
Here, X denotes the text and Y denotes the annotated labels. "O" denotes that a character does not belong to any of POI, spatial relationship type, or spatial relationship value. "B" denotes a beginning: for example, "POI_B" denotes the beginning character of a POI label, "VAL_B" the beginning character of a spatial relationship value, and "LOF_B" the beginning character of a spatial relationship type. "I" denotes a middle character: for example, "POI_I" denotes a middle character of a POI label. After annotation, it can be seen that "五彩城购物中心" (Wucaicheng Shopping Center) is annotated with POI labels, "层" (floor) is annotated with the spatial relationship type label, and "6" is annotated with the spatial relationship value label.
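The annotation scheme of Table 1 can be decoded back into labeled spans with a small pure-Python sketch (the decoding function name is illustrative; the characters and tags follow the patent's example address):

```python
# Sketch of decoding BIO-style tags (POI_B / POI_I / VAL_B / LOF_B / O)
# back into (label, text) spans, mirroring Table 1.

def decode_spans(chars, tags):
    """Group a BIO tag sequence into (label, text) spans."""
    spans, cur_label, cur_chars = [], None, []
    for ch, tag in zip(chars, tags):
        if tag == "O":
            if cur_label:
                spans.append((cur_label, "".join(cur_chars)))
                cur_label, cur_chars = None, []
        elif tag.endswith("_B"):        # a new span begins
            if cur_label:
                spans.append((cur_label, "".join(cur_chars)))
            cur_label, cur_chars = tag[:-2], [ch]
        else:                           # "_I": continue the current span
            cur_chars.append(ch)
    if cur_label:
        spans.append((cur_label, "".join(cur_chars)))
    return spans

chars = list("五彩城购物中心") + ["6", "层"]
tags = ["POI_B"] + ["POI_I"] * 6 + ["VAL_B", "LOF_B"]
spans = decode_spans(chars, tags)
```

This is the inverse of the labeling step: the model predicts one tag per Token, and span decoding recovers the POI, relationship type, and relationship value strings.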
In addition, training samples can also be constructed by manually composing and annotating text, or high-quality text can be acquired from other data sources and manually annotated.
In 402, the geographic location point spatial relationship extraction model is trained with the training data, where the model includes an embedding layer, a Transformer layer, and a mapping layer, and the training objective includes: the mapping layer's label prediction for the text in the training data conforming to the annotations in the training data.
The structure of the geographic location point spatial relationship extraction model is still as shown in FIG. 3. Similar to Embodiment 1, the embedding layer may comprise multiple embedding layers.
First, the text can be regarded as a sequence composed of at least one sentence. A separator [CLS] can be added before the text and a separator [SEP] between sentences, and each character and separator is treated as a Token.
The first embedding layer, denoted Token Embedding in the figure, performs character encoding of each Token (element) in the text; the Tokens include the characters in the text as well as the separators.
The second embedding layer, denoted Position Embedding in the figure, performs position encoding of each Token, i.e., encodes the position information of each Token in the text; for example, the Tokens are numbered by position in order and each position number is encoded.
The third embedding layer, denoted Sentence Embedding in the figure, encodes the identifier of the sentence to which each Token belongs. For example, the sentences in the text are numbered in order as sentence identifiers, and the identifier of the sentence to which each Token belongs is encoded.
The above three kinds of inputs can be expressed as:

e_i = e_i^token + e_i^pos + e_i^sent

where e_i is the vector representation of the i-th Token; e_i^token is the vector representation of the i-th Token as a character, obtained by looking up a word-vector matrix to convert the character into a dense vector; e_i^pos is the vector representation of the position of the i-th Token, obtained by looking up a word-vector matrix to convert the position into a dense vector; and e_i^sent is the vector representation of the identifier of the sentence to which the i-th Token belongs, obtained by looking up a word-vector matrix to convert the sentence identifier into a dense vector.
The dense vector representations produced by the embedding layers are output to the Transformer layer. After multi-layer Attention processing, the Transformer layer outputs hidden vectors. For example, given a dense vector sequence E = {e_1, e_2, …, e_n}, the output is a hidden vector sequence containing context information, h = φ_θ(E) = {h_1, h_2, …, h_n}, where n is the length of the input sequence, i.e., the number of Tokens it contains.
The mapping layer may include a CRF, which uses the hidden vectors output by the Transformer layer to predict the spatial relationship information contained in the text input to the model.
After obtaining the hidden vector sequence h = {h_1, h_2, …, h_n}, we use the CRF to predict labels, obtaining the model output Y = {y_1, y_2, …, y_n}, where y_i is the predicted label of the corresponding input x_i.
For each token x_i, we can obtain a distribution over labels:

p_i = W_p · h_i

where W_p is a d×c-dimensional weight parameter matrix and c is the number of output labels.
Then, for each predicted sequence Y = {y_1, y_2, …, y_n}, we can obtain the score of this sequence:

s(X, Y) = Σ_{i=1..n} (p_{i, y_i} + T_{y_{i-1}, y_i})

where T is the label transition matrix of the CRF.
Finally, we can use a softmax (fully connected) layer to obtain the probability P_r of each predicted sequence Y:

P_r(Y | X) = exp(s(X, Y)) / Σ_{Y'} exp(s(X, Y'))

where Y' ranges over all candidate predicted sequences. In the training phase, the maximum-likelihood loss function can be:

L(Θ) = −Σ_{(X,Y)} log P_r(Y | X) + λ‖Θ‖²

where Θ denotes all parameters of the model and λ is a regularization hyperparameter that needs to be determined by manual tuning.
During training, the actual training objective is to make the CRF's label prediction for the text conform to the annotations in the training data as closely as possible. That is, the above loss function is used to adjust the model parameters of the embedding layer, the Transformer layer, and the CRF layer, minimizing the value of the loss function as much as possible.
Training a good model requires large-scale, high-quality training data. Training data that simultaneously contains geographic location points and geographic location point spatial relationship information and meets certain quality requirements is scarce, and manual annotation is required, which makes high-quality training data difficult to obtain. To solve this problem, the embodiments of this application provide a preferred embodiment that trains the geographic location point spatial relationship extraction model by pre-training + fine-tuning (optimization adjustment). In the pre-training process, text mined from the Internet can be used to form the first training data; the quality requirements for this first training data are not high, so a relatively large amount of it can be obtained. In the fine-tuning phase, official-website text is manually annotated to form the second training data; this training data is of high quality and small in quantity, and further tuning can be performed on the basis of the model parameters obtained in the pre-training process. This approach is described below in combination with Embodiment 3.
Embodiment 3
FIG. 5 is a flowchart of a method for training the geographic location point spatial relationship extraction model provided in Embodiment 3 of this application. As shown in FIG. 5, the method includes the following steps:
In 501, first training data is acquired, the first training data including: text and annotations of the geographic location points and geographic location point spatial relationships in the text.
As described above, the first training data may be text mined from the Internet that contains geographic location points and spatial relationship type keywords, used as pre-training data. The accuracy of this training data is relatively low; compared with manually precision-annotated data, it is weakly labeled data. This application does not limit the specific manner of mining the above text from the Internet. One of the simplest manners is to pre-build a dictionary of spatial relationship type keywords and match this dictionary, together with the geographic location point names in a map database, against massive Internet text, thereby obtaining the above text. The annotations of geographic location points and spatial relationship types in the text can also be realized based on automatic matching against the above dictionary and the geographic location point names in the map database. Since large-scale manual participation is not required, the first training data needed for pre-training can be large-scale, massive data, which can guarantee the training requirements of the pre-training model.
In 502, the first training data is used to train a pre-training model, the pre-training model including: an embedding layer, a Transformer layer, and at least one task layer.
The structure of the pre-training model may be as shown in FIG. 6a. The structure and purpose of the embedding layer and the Transformer layer are similar to those in Embodiments 1 and 2. The only difference is that the embedding layer may further include a fourth embedding layer, denoted Task Embedding in the figure, which encodes the identifier of the task layer for which the input text is used. The first training data will take different forms for different subsequent task layers; see the description of the subsequent steps for each task layer. As shown in FIG. 6a, if the current form of the first training data is used for the task layer identified as 2, the task layer identifier of each Token is 2.
The structure and purpose of the other embedding layers and the Transformer layer are not repeated here.
The task layers are the focus here. In this embodiment, the hidden vectors output by the Transformer layer are input to each task layer. The task layers include at least one of a masking prediction task layer, a spatial relationship prediction task layer, and a geographic location point prediction task layer.
The masking prediction task layer is used to predict, based on the hidden vectors output by the Transformer layer, the content of the masked part in the text of the first training data, with the training objective of minimizing the difference between the prediction result and the actual content corresponding to the masked part.
For the text of the first training data, characters or geographic location points can be masked. The masking can be random, or rules can be specified by the user. For example:
For the text "6th floor, Wucaicheng Shopping Center, Qinghe Middle Street, Haidian District, Beijing" (北京市海淀区清河中街五彩城购物中心6层), if characters are masked at random, one can obtain:
"北京市海【mask】区清河中街五彩城购【mask】中心6层", where 【mask】 denotes a masked part and the corresponding actual contents are "淀" and "物".
If a POI is masked at random, one can obtain:
"北京市海淀区清河中街【mask】【mask】【mask】【mask】【mask】【mask】【mask】6层", where 【mask】 denotes a masked part and the corresponding actual contents are "五", "彩", "城", "购", "物", "中", and "心".
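Building a mask-prediction training pair as in the examples above can be sketched as follows (positions are chosen deterministically here for illustration; a real pipeline samples them at random or masks a whole POI span):

```python
# Sketch of building a mask-prediction training pair: chosen positions are
# replaced with a [mask] token and the original characters become targets.

MASK = "[mask]"

def mask_positions(chars, positions):
    """Return (masked sequence, {position: original char})."""
    targets = {i: chars[i] for i in positions}
    masked = [MASK if i in targets else c for i, c in enumerate(chars)]
    return masked, targets

text = list("北京市海淀区")
masked, targets = mask_positions(text, positions={4})
```

The masking task layer is then trained so that its prediction at each masked position matches the stored target character.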
The spatial relationship prediction task layer is used to predict, based on the hidden vectors output by the Transformer layer, the spatial relationship described by the text of the first training data, with the training objective that the prediction result conforms to the corresponding spatial relationship annotation.
This task layer can predict the spatial relationship P based on the text X and the geographic location points S and O given in the text, expressed as the prediction probability P_r = F(P | X, S, O). This task layer can be regarded as a multi-class classification task, with cross-entropy used to determine the loss function.
The geographic location point prediction task layer is used to predict, based on the hidden vectors output by the Transformer layer, the geographic location points contained in the text of the first training data, with the training objective that the prediction result conforms to the corresponding geographic location point annotations.
Based on a text X describing the spatial relationship between two geographic location points, one given geographic location point S or O in the text, and the spatial relationship type P, this task layer predicts the other geographic location point O or S, expressed as the prediction probability P_r = F(O | X, S, P) or P_r = F(S | X, P, O). This task layer can be regarded as a multi-class classification task, with cross-entropy used to determine the loss function.
The above task layers can be implemented in a fully connected manner, as classifiers, and so on. When training the pre-training model, the task layers are trained alternately or simultaneously. The loss function corresponding to the training objective of the trained task layer is used to optimize the model parameters of the embedding layer, the Transformer layer, and the trained task layer.
It can be seen that this application adopts multi-task learning, which enables knowledge to be shared among multiple tasks and thus yields a better pre-training model.
If the task layers are trained alternately, one task layer can be selected for training each time, in order or at random, and each time the loss function of the selected task layer is used to optimize the model parameters of the embedding layer, the Transformer layer, and the trained task layer.
The training principle of the geographic location point spatial relationship extraction model here is similar to the description in Embodiment 2 and is not repeated.
可以实现诸如如下应用场景:
A user inputs the query "清华大学附近有星巴克吗?" ("Is there a Starbucks near Tsinghua University?"). If the database contains the following geographic location point spatial relationships: <清华科技园, 清华大学东南门, south, 100 meters>, <威新国际大厦, 清华科技园, building, 9>, <星巴克, 威新国际大厦, floor, 1>, then by reasoning over these three relationships we can accurately give the answer "There is a Starbucks on floor 1 of 威新国际大厦 in 清华科技园, 100 meters south of 清华大学东南门", together with the corresponding geographic location point "星巴克" (Starbucks).
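The three stored quadruples can be chained mechanically to support such answers. A sketch of the quadruple structure and the hop-following reasoning, using the relations from the example (the field names and `locate` helper are assumptions for illustration):

```python
from dataclasses import dataclass

# The quadruple format: <point 1, point 2, spatial relationship type, value>.
@dataclass(frozen=True)
class SpatialRelation:
    subject: str
    reference: str
    rel_type: str
    value: str

KB = [
    SpatialRelation("清华科技园", "清华大学东南门", "南", "100米"),
    SpatialRelation("威新国际大厦", "清华科技园", "楼栋", "9"),
    SpatialRelation("星巴克", "威新国际大厦", "楼层", "1"),
]

def locate(name, kb):
    """Follow reference links until no further relation exists, collecting the chain."""
    chain, current = [], name
    while True:
        nxt = next((r for r in kb if r.subject == current), None)
        if nxt is None:
            return chain
        chain.append(nxt)
        current = nxt.reference

# Chaining the three stored relations grounds "星巴克" relative to "清华大学东南门":
for hop in locate("星巴克", KB):
    print(f"{hop.subject} -[{hop.rel_type}={hop.value}]-> {hop.reference}")
```

The uniform quadruple shape is what makes this traversal possible: each hop is looked up the same way regardless of whether the relation is a direction, a building, or a floor.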
According to embodiments of the present application, the present application further provides an electronic device and a readable storage medium.
As shown in FIG. 9, it is a block diagram of an electronic device for the methods according to embodiments of the present application. Electronic devices are intended to represent various forms of digital computers, such as laptop computers, desktop computers, workstations, personal digital assistants, servers, blade servers, mainframe computers, and other suitable computers. Electronic devices may also represent various forms of mobile apparatuses, such as personal digital assistants, cellular phones, smartphones, wearable devices, and other similar computing apparatuses. The components shown herein, their connections and relationships, and their functions are merely examples, and are not intended to limit the implementations of the present application described and/or claimed herein.
As shown in FIG. 9, the electronic device includes: one or more processors 901, a memory 902, and interfaces for connecting the components, including high-speed interfaces and low-speed interfaces. The components are connected to each other via different buses and may be mounted on a common motherboard or otherwise as required. The processor may process instructions executed within the electronic device, including instructions stored in or on the memory to display graphical information of a GUI on an external input/output apparatus (such as a display device coupled to an interface). In other implementations, multiple processors and/or multiple buses may be used together with multiple memories, if desired. Likewise, multiple electronic devices may be connected, each providing some of the necessary operations (for example, as a server array, a group of blade servers, or a multi-processor system). In FIG. 9, one processor 901 is taken as an example.
The memory 902 is the non-transitory computer-readable storage medium provided by the present application. The memory stores instructions executable by at least one processor, so that the at least one processor performs the method provided by the present application. The non-transitory computer-readable storage medium of the present application stores computer instructions for causing a computer to perform the method provided by the present application.
As a non-transitory computer-readable storage medium, the memory 902 may be used to store non-transitory software programs, non-transitory computer-executable programs and modules, such as the program instructions/modules corresponding to the methods in the embodiments of the present application. By running the non-transitory software programs, instructions and modules stored in the memory 902, the processor 901 executes the various functional applications and data processing of the server, that is, implements the methods in the above method embodiments.
The memory 902 may include a program storage area and a data storage area, wherein the program storage area may store an operating system and an application required by at least one function, and the data storage area may store data created according to the use of the electronic device, and the like. In addition, the memory 902 may include a high-speed random access memory, and may also include a non-transitory memory, such as at least one magnetic disk storage device, a flash memory device, or another non-transitory solid-state storage device. In some embodiments, the memory 902 may optionally include memories remotely located relative to the processor 901, and these remote memories may be connected to the electronic device via a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The electronic device may further include: an input apparatus 903 and an output apparatus 904. The processor 901, the memory 902, the input apparatus 903 and the output apparatus 904 may be connected via a bus or in other ways; connection via a bus is taken as an example in FIG. 9.
The input apparatus 903 may receive input digital or character information and generate key signal inputs related to user settings and function control of the electronic device, and may be, for example, a touch screen, a keypad, a mouse, a trackpad, a touchpad, a pointing stick, one or more mouse buttons, a trackball, a joystick, or another input apparatus. The output apparatus 904 may include a display device, an auxiliary lighting apparatus (for example, an LED), a haptic feedback apparatus (for example, a vibration motor), and the like. The display device may include, but is not limited to, a liquid crystal display (LCD), a light-emitting diode (LED) display, and a plasma display. In some implementations, the display device may be a touch screen.
Various implementations of the systems and techniques described herein may be realized in digital electronic circuit systems, integrated circuit systems, application-specific integrated circuits (ASICs), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include: implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be a special-purpose or general-purpose programmable processor that receives data and instructions from a storage system, at least one input apparatus, and at least one output apparatus, and transmits data and instructions to the storage system, the at least one input apparatus, and the at least one output apparatus.
These computing programs (also referred to as programs, software, software applications, or code) include machine instructions of a programmable processor, and may be implemented using high-level procedural and/or object-oriented programming languages, and/or assembly/machine languages. As used herein, the terms "machine-readable medium" and "computer-readable medium" refer to any computer program product, device, and/or apparatus (for example, a magnetic disk, an optical disk, a memory, a programmable logic device (PLD)) used to provide machine instructions and/or data to a programmable processor, including machine-readable media that receive machine instructions as machine-readable signals. The term "machine-readable signal" refers to any signal used to provide machine instructions and/or data to a programmable processor.
To provide interaction with a user, the systems and techniques described herein may be implemented on a computer having: a display apparatus for displaying information to the user (for example, a CRT (cathode ray tube) or LCD (liquid crystal display) monitor); and a keyboard and a pointing apparatus (for example, a mouse or a trackball) through which the user can provide input to the computer. Other kinds of apparatuses may also be used to provide interaction with the user; for example, the feedback provided to the user may be any form of sensory feedback (for example, visual feedback, auditory feedback, or haptic feedback), and input from the user may be received in any form (including acoustic input, voice input, or tactile input).
The systems and techniques described herein may be implemented in a computing system including a back-end component (for example, as a data server), or a computing system including a middleware component (for example, an application server), or a computing system including a front-end component (for example, a user computer with a graphical user interface or a web browser through which the user can interact with implementations of the systems and techniques described herein), or a computing system including any combination of such back-end, middleware, or front-end components. The components of the system may be interconnected by digital data communication (for example, a communication network) in any form or medium. Examples of communication networks include: a local area network (LAN), a wide area network (WAN), and the Internet.
A computer system may include a client and a server. The client and the server are generally remote from each other and usually interact via a communication network. The client-server relationship is generated by computer programs running on the respective computers and having a client-server relationship with each other.
It should be understood that steps may be reordered, added, or deleted using the various forms of processes shown above. For example, the steps described in the present application may be executed in parallel, sequentially, or in different orders, as long as the desired results of the technical solutions disclosed in the present application can be achieved; no limitation is imposed herein.
The above specific implementations do not constitute a limitation on the protection scope of the present application. Those skilled in the art should understand that various modifications, combinations, sub-combinations, and substitutions may be made according to design requirements and other factors. Any modification, equivalent replacement, improvement, and the like made within the spirit and principles of the present application shall be included within the protection scope of the present application.

Claims (21)

  1. A method for training a geographic location point spatial relationship extraction model, comprising:
    acquiring second training data, the second training data comprising: text and annotations of geographic location points and geographic location point spatial relationship information in the text;
    training a geographic location point spatial relationship extraction model using the second training data, the geographic location point spatial relationship extraction model comprising an embedding layer, a Transformer layer and a mapping layer;
    wherein the trained geographic location point spatial relationship extraction model is used to extract geographic location point spatial relationship information from input Internet text.
  2. The method according to claim 1, wherein the embedding layer comprises: a first embedding layer for character-encoding each Token in the text, a second embedding layer for position-encoding each Token, and a third embedding layer for encoding the identifier of the sentence to which each Token belongs;
    the mapping layer comprises a conditional random field (CRF), which uses the hidden vectors output by the Transformer layer to predict the spatial relationship information contained in the text.
  3. The method according to claim 1, wherein the training objective of the geographic location point spatial relationship extraction model comprises: the mapping layer's label predictions for the text conforming to the annotations in the second training data.
  4. The method according to any one of claims 1 to 3, further comprising, before training the geographic location point spatial relationship extraction model using the second training data:
    acquiring first training data, the first training data comprising: text and annotations of geographic location points and geographic location point spatial relationships in the text;
    training a pre-trained model using the first training data, the pre-trained model comprising: the embedding layer, the Transformer layer and at least one task layer; the embedding layer further comprising: a fourth embedding layer for encoding the identifier of the task layer for which the input text is used;
    wherein training the geographic location point spatial relationship extraction model using the second training data builds on the embedding layer and the Transformer layer obtained by training the pre-trained model.
  5. The method according to claim 4, wherein the at least one task layer comprises at least one of: a mask prediction task layer, a spatial relationship prediction task layer and a geographic location point prediction task layer;
    the mask prediction task layer is used to predict, based on the hidden vectors output by the Transformer layer, the content of the masked portions in the text of the first training data, the training objective being that the prediction results conform to the actual content corresponding to the masked portions;
    the spatial relationship prediction task layer is used to predict, based on the hidden vectors output by the Transformer layer, the spatial relationship described by the text of the first training data, the training objective being that the prediction results conform to the corresponding spatial relationship annotations;
    the geographic location point prediction task layer is used to predict, based on the hidden vectors output by the Transformer layer, the geographic location points contained in the text of the first training data, the training objective being that the prediction results conform to the corresponding geographic location point annotations.
  6. The method according to claim 4, wherein the at least one task layer is trained alternately or simultaneously, and the loss function corresponding to the training objective of the task layer being trained is used to optimize the model parameters of the embedding layer, the Transformer layer and the task layer being trained.
  7. The method according to claim 4, wherein, when training the geographic location point spatial relationship extraction model using the second training data, building on the embedding layer and the Transformer layer obtained by training the pre-trained model comprises:
    when training the geographic location point spatial relationship extraction model using the second training data, adopting the model parameters of the embedding layer and the Transformer layer obtained by training the pre-trained model and keeping them unchanged, and optimizing the model parameters of the mapping layer until the training objective of the geographic location point spatial relationship extraction model is reached.
  8. A method for extracting spatial relationships between geographic location points, comprising:
    acquiring text containing geographic location point information from the Internet;
    inputting the text into a pre-trained geographic location point spatial relationship extraction model, and acquiring the spatial relationship information output by the geographic location point spatial relationship extraction model; wherein the geographic location point spatial relationship extraction model comprises an embedding layer, a Transformer layer and a mapping layer.
  9. The method according to claim 8, wherein the embedding layer comprises: a first embedding layer for character-encoding each Token in the text, a second embedding layer for position-encoding each Token, and a third embedding layer for encoding the identifier of the sentence to which each Token belongs;
    the mapping layer comprises a conditional random field (CRF), which uses the hidden vectors output by the Transformer layer to predict the spatial relationship information contained in the text.
  10. The method according to claim 8 or 9, wherein the spatial relationship information comprises: a type and a value of the spatial relationship.
  11. An apparatus for training a geographic location point spatial relationship extraction model, comprising:
    a second acquisition unit configured to acquire second training data, the second training data comprising: text and annotations of geographic location points and geographic location point spatial relationship information in the text;
    a second training unit configured to train a geographic location point spatial relationship extraction model using the second training data, the geographic location point spatial relationship extraction model comprising an embedding layer, a Transformer layer and a mapping layer;
    the geographic location point spatial relationship extraction model being used to extract geographic location point spatial relationship information from input text.
  12. The apparatus according to claim 11, wherein the embedding layer comprises: a first embedding layer for character-encoding each Token in the text, a second embedding layer for position-encoding each Token, and a third embedding layer for encoding the identifier of the sentence to which each Token belongs;
    the mapping layer comprises a conditional random field (CRF), which uses the hidden vectors output by the Transformer layer to predict the spatial relationship information contained in the text.
  13. The apparatus according to claim 11, wherein the training objective of the geographic location point spatial relationship extraction model comprises: the mapping layer's label predictions for the text conforming to the annotations in the second training data.
  14. The apparatus according to any one of claims 11 to 13, further comprising:
    a first acquisition unit configured to acquire first training data, the first training data comprising: text and annotations of geographic location points and geographic location point spatial relationships in the text;
    a first training unit configured to train a pre-trained model using the first training data, the pre-trained model comprising: the embedding layer, the Transformer layer and at least one task layer; the embedding layer further comprising: a fourth embedding layer for encoding the identifier of the task layer for which the input text is used;
    wherein, when training the geographic location point spatial relationship extraction model using the second training data, the second training unit builds on the embedding layer and the Transformer layer obtained by training the pre-trained model.
  15. The apparatus according to claim 14, wherein the at least one task layer comprises at least one of: a mask prediction task layer, a spatial relationship prediction task layer and a geographic location point prediction task layer;
    the mask prediction task layer is used to predict, based on the hidden vectors output by the Transformer layer, the content of the masked portions in the text of the first training data, the training objective being that the prediction results conform to the actual content corresponding to the masked portions;
    the spatial relationship prediction task layer is used to predict, based on the hidden vectors output by the Transformer layer, the spatial relationship described by the text of the first training data, the training objective being that the prediction results conform to the corresponding spatial relationship annotations;
    the geographic location point prediction task layer is used to predict, based on the hidden vectors output by the Transformer layer, the geographic location points contained in the text of the first training data, the training objective being that the prediction results conform to the corresponding geographic location point annotations.
  16. The apparatus according to claim 14, wherein the at least one task layer is trained alternately or simultaneously, and the loss function corresponding to the training objective of the task layer being trained is used to optimize the model parameters of the embedding layer, the Transformer layer and the task layer being trained.
  17. The apparatus according to claim 14, wherein the second training unit is specifically configured to: when training the geographic location point spatial relationship extraction model using the second training data, adopt the model parameters of the embedding layer and the Transformer layer obtained by training the pre-trained model and keep them unchanged, and optimize the model parameters of the mapping layer until the training objective of the geographic location point spatial relationship extraction model is reached.
  18. An apparatus for extracting spatial relationships between geographic location points, comprising:
    an acquisition unit configured to acquire text containing geographic location point information from the Internet;
    an extraction unit configured to input the text into a pre-trained geographic location point spatial relationship extraction model and acquire the spatial relationship information output by the geographic location point spatial relationship extraction model; wherein the geographic location point spatial relationship extraction model comprises an embedding layer, a Transformer layer and a mapping layer.
  19. The apparatus according to claim 18, wherein the embedding layer comprises: a first embedding layer for character-encoding each Token in the text, a second embedding layer for position-encoding each Token, and a third embedding layer for encoding the identifier of the sentence to which each Token belongs;
    the mapping layer comprises a conditional random field (CRF), which uses the hidden vectors output by the Transformer layer to predict the spatial relationship information contained in the text.
  20. An electronic device, comprising:
    at least one processor; and
    a memory communicatively connected with the at least one processor; wherein
    the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor to enable the at least one processor to perform the method according to any one of claims 1-10.
  21. A non-transitory computer-readable storage medium storing computer instructions, wherein the computer instructions are used to cause a computer to perform the method according to any one of claims 1-10.
PCT/CN2020/131305 2020-05-21 2020-11-25 Method for extracting spatial relationships between geographic location points, method for training extraction model, and apparatus WO2021232724A1 (zh)

Priority Applications (4)

Application Number Priority Date Filing Date Title
EP20913065.7A EP3940552A4 (en) 2020-05-21 2020-11-25 METHOD FOR EXTRACTING SPATIAL RELATIONSHIPS FROM GEOLOCATION POINTS, METHOD FOR TRAINING MODEL EXTRACTION AND DEVICES
US17/427,122 US20220327421A1 (en) 2020-05-21 2020-11-25 Method for extracting geographic location point spatial relationship, and method and apparatus for training an extraction model
JP2022543197A JP2023510906A (ja) 2020-05-21 2020-11-25 Method for extracting geographic location point spatial relationships, method for training extraction model, and apparatus
KR1020227019541A KR20220092624A (ko) 2020-05-21 2020-11-25 Method for extracting geographic location point spatial relationships, method for training extraction model, and apparatus

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010438214.2 2020-05-21
CN202010438214.2A CN111737383B (zh) 2020-05-21 2020-05-21 Method for extracting spatial relationships between geographic location points, method for training extraction model, and apparatus

Publications (1)

Publication Number Publication Date
WO2021232724A1 true WO2021232724A1 (zh) 2021-11-25

Family

ID=72647638

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/131305 WO2021232724A1 (zh) 2020-05-21 2020-11-25 Method for extracting spatial relationships between geographic location points, method for training extraction model, and apparatus

Country Status (6)

Country Link
US (1) US20220327421A1 (zh)
EP (1) EP3940552A4 (zh)
JP (1) JP2023510906A (zh)
KR (1) KR20220092624A (zh)
CN (1) CN111737383B (zh)
WO (1) WO2021232724A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115048478A (zh) * 2022-08-12 2022-09-13 深圳市其域创新科技有限公司 Construction method, device and system for a geographic information graph of intelligent devices

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111737383B (zh) * 2020-05-21 2021-11-23 百度在线网络技术(北京)有限公司 Method for extracting spatial relationships between geographic location points, method for training extraction model, and apparatus
CN112184178A (zh) * 2020-10-14 2021-01-05 深圳壹账通智能科技有限公司 Mail content extraction method and apparatus, electronic device, and storage medium
CN112269925B (zh) * 2020-10-19 2024-03-22 北京百度网讯科技有限公司 Method and apparatus for acquiring geographic location point information
CN112347738B (zh) * 2020-11-04 2023-09-15 平安直通咨询有限公司上海分公司 Method and apparatus for optimizing a bidirectional encoder representations model based on judicial documents
CN112380849B (zh) * 2020-11-20 2024-05-28 北京百度网讯科技有限公司 Method and apparatus for generating a point-of-interest extraction model and extracting points of interest
CN113947147B (zh) * 2021-10-18 2023-04-18 北京百度网讯科技有限公司 Training method for a target map model, positioning method, and related apparatus
CN114418093B (zh) * 2022-01-19 2023-08-25 北京百度网讯科技有限公司 Method and apparatus for training a path representation model and outputting information

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090177460A1 (en) * 2008-01-04 2009-07-09 Fluential, Inc. Methods for Using Manual Phrase Alignment Data to Generate Translation Models for Statistical Machine Translation
CN108920457A (zh) * 2018-06-15 2018-11-30 腾讯大地通途(北京)科技有限公司 Address recognition method and apparatus, and storage medium
CN109753566A (zh) * 2019-01-09 2019-05-14 大连民族大学 Model training method for cross-domain sentiment analysis based on convolutional neural networks
CN110457420A (zh) * 2019-08-13 2019-11-15 腾讯云计算(北京)有限责任公司 Point-of-interest location recognition method, apparatus, device, and storage medium
CN110716999A (zh) * 2019-09-05 2020-01-21 武汉大学 POI positioning method based on location descriptions containing qualitative directions and quantitative distances
CN110851738A (zh) * 2019-10-28 2020-02-28 百度在线网络技术(北京)有限公司 Method, apparatus, device, and computer storage medium for acquiring POI state information
CN111737383A (zh) * 2020-05-21 2020-10-02 百度在线网络技术(北京)有限公司 Method for extracting spatial relationships between geographic location points, method for training extraction model, and apparatus

Family Cites Families (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7058617B1 (en) * 1996-05-06 2006-06-06 Pavilion Technologies, Inc. Method and apparatus for training a system model with gain constraints
US11226945B2 (en) * 2008-11-14 2022-01-18 Georgetown University Process and framework for facilitating information sharing using a distributed hypergraph
US9378065B2 (en) * 2013-03-15 2016-06-28 Advanced Elemental Technologies, Inc. Purposeful computing
CN107180045B (zh) * 2016-03-10 2020-10-16 中国科学院地理科学与资源研究所 Method for extracting geographic entity relationships implied in Internet text
CN107783960B (zh) * 2017-10-23 2021-07-23 百度在线网络技术(北京)有限公司 Method, apparatus, and device for extracting information
CN109460434B (zh) * 2018-10-25 2020-11-03 北京知道创宇信息技术股份有限公司 Data extraction model building method and apparatus
CN110083831B (zh) * 2019-04-16 2023-04-18 武汉大学 Chinese named entity recognition method based on BERT-BiGRU-CRF
CN110570920B (zh) * 2019-08-20 2023-07-14 华东理工大学 Joint entity and relation learning method based on a concentrated attention model
CN110489555B (zh) * 2019-08-21 2022-03-08 创新工场(广州)人工智能研究有限公司 Language model pre-training method incorporating word-class information
CN110781683B (zh) * 2019-11-04 2024-04-05 河海大学 Joint entity-relation extraction method


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP3940552A4

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115048478A (zh) * 2022-08-12 2022-09-13 深圳市其域创新科技有限公司 Construction method, device and system for a geographic information graph of intelligent devices
CN115048478B (zh) * 2022-08-12 2022-10-21 深圳市其域创新科技有限公司 Construction method, device and system for a geographic information graph of intelligent devices

Also Published As

Publication number Publication date
EP3940552A4 (en) 2022-05-11
CN111737383B (zh) 2021-11-23
EP3940552A1 (en) 2022-01-19
KR20220092624A (ko) 2022-07-01
US20220327421A1 (en) 2022-10-13
CN111737383A (zh) 2020-10-02
JP2023510906A (ja) 2023-03-15

Similar Documents

Publication Publication Date Title
WO2021232724A1 (zh) Method for extracting spatial relationships between geographic location points, method for training extraction model, and apparatus
JP7283009B2 (ja) Dialogue understanding model training method, apparatus, device, and storage medium
JP7228662B2 (ja) Event extraction method, apparatus, electronic device, and storage medium
JP7214949B2 (ja) Method, apparatus, device, program, and computer storage medium for acquiring POI state information
US20220019341A1 (en) Map information display method and apparatus, electronic device, and computer storage medium
WO2021093308A1 (zh) Method, apparatus, device, and computer storage medium for extracting POI names
WO2023124005A1 (zh) Map point-of-interest query method, apparatus, device, storage medium, and program product
JP7362998B2 (ja) Method and apparatus for acquiring POI state information
CN111767359B (zh) Point-of-interest classification method, apparatus, device, and storage medium
JP2023529939A (ja) Method and apparatus for extracting multimodal POI features
CN111666292B (zh) Method and apparatus for building a similarity model for retrieving geographic locations
US20210239486A1 (en) Method and apparatus for predicting destination, electronic device and storage medium
WO2023065731A1 (zh) Training method for a target map model, positioning method, and related apparatus
WO2023168909A1 (zh) Pre-training method for a geographic pre-trained model and model fine-tuning method
CN112328890B (zh) Method, apparatus, device, and storage medium for searching geographic location points
CN114281968B (zh) Model training and corpus generation method, apparatus, device, and storage medium
WO2021212827A1 (zh) Method, apparatus, device, and computer storage medium for retrieving geographic locations
CN112507103A (zh) Task-oriented dialogue and model training method, apparatus, device, and storage medium
CN112269925B (zh) Method and apparatus for acquiring geographic location point information
CN114925680A (zh) Logistics point-of-interest information generation method, apparatus, device, and computer-readable medium
CN111782748B (zh) Map retrieval method, and method and apparatus for computing POI semantic vectors
CN113807102A (zh) Method, apparatus, device, and computer storage medium for building a semantic representation model
CN111782979A (zh) Brand classification method, apparatus, device, and storage medium for points of interest
Wang et al. Construction of bilingual knowledge graph based on meteorological simulation
CN114722841B (zh) Translation method, apparatus, and computer program product

Legal Events

Date Code Title Description
ENP Entry into the national phase

Ref document number: 2020913065

Country of ref document: EP

Effective date: 20210721

ENP Entry into the national phase

Ref document number: 20227019541

Country of ref document: KR

Kind code of ref document: A

ENP Entry into the national phase

Ref document number: 2022543197

Country of ref document: JP

Kind code of ref document: A

NENP Non-entry into the national phase

Ref country code: DE