CN113539270A - Position identification method and device, electronic equipment and storage medium - Google Patents


Info

Publication number
CN113539270A
CN113539270A · Application CN202110830026.9A
Authority
CN
China
Prior art keywords: bert, characters, text, target, place name
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110830026.9A
Other languages
Chinese (zh)
Other versions
CN113539270B (en)
Inventor
姚雷 (Yao Lei)
杜新凯 (Du Xinkai)
纪诚 (Ji Cheng)
黄莹 (Huang Ying)
Current Assignee
Sunshine Insurance Group Co Ltd
Original Assignee
Sunshine Insurance Group Co Ltd
Priority date
Filing date
Publication date
Application filed by Sunshine Insurance Group Co Ltd filed Critical Sunshine Insurance Group Co Ltd
Priority to CN202110830026.9A priority Critical patent/CN113539270B/en
Publication of CN113539270A publication Critical patent/CN113539270A/en
Application granted granted Critical
Publication of CN113539270B publication Critical patent/CN113539270B/en
Legal status: Active

Classifications

    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L 15/00 - Speech recognition
    • G10L 15/26 - Speech to text systems
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 40/00 - Handling natural language data
    • G06F 40/20 - Natural language analysis
    • G06F 40/279 - Recognition of textual entities
    • G06F 40/289 - Phrasal analysis, e.g. finite state techniques or chunking
    • G06F 40/295 - Named entity recognition
    • G06F 40/30 - Semantic analysis
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 20/00 - Machine learning
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/08 - Learning methods

Abstract

The application provides a position identification method and device, an electronic device, and a storage medium. The method includes: after voice data is acquired, converting the voice data into text information through a voice recognition technology; inputting the text information into a pre-trained positioning model for fuzzy recognition of characters, and recognizing, through the positioning model, characters in the text information whose semantic similarity to a place name exceeds a preset threshold, so as to take those characters as target characters; and, for each target character, searching the correspondence between characters and geographic positions for the geographic position information containing the target character, so as to determine the geographic position in that information as the geographic position corresponding to the target character. The method improves the accuracy of recognizing place names in speech.

Description

Position identification method and device, electronic equipment and storage medium
Technical Field
The present application relates to the field of speech recognition technologies, and in particular, to a position recognition method, an apparatus, an electronic device, and a storage medium.
Background
With the development of speech recognition technology, voice-based human-computer interaction is gradually being applied in more scenarios, and position recognition is increasingly performed on speech input as well.
The inventors found in research that, in the prior art, speech is converted into text through a speech recognition technology, and place names in the text are determined by character-by-character or word-by-word comparison. In practice, because speech recognition technology is not yet mature, interference such as noise, tone changes, and varying speaking styles causes errors when speech is converted into text, so that some characters in the converted text are altered; recognizing place names in such text by character-by-character or word-by-word comparison therefore has low accuracy.
Disclosure of Invention
In view of this, embodiments of the present application provide a position recognition method, an apparatus, an electronic device, and a storage medium, so as to solve the problem of low accuracy in recognizing place names in speech.
In a first aspect, an embodiment of the present application provides a location identification method, including:
after voice data are acquired, converting the voice data into text information through a voice recognition technology;
inputting the text information into a pre-trained positioning model for fuzzy recognition of characters, and recognizing, through the positioning model, characters in the text information whose semantic similarity to a place name exceeds a preset threshold, so as to take those characters as target characters;
and, for each target character, searching the correspondence between characters and geographic positions for the geographic position information containing the target character, so as to determine the geographic position in that information as the geographic position corresponding to the target character.
In one possible embodiment, the localization model is trained by:
after a plurality of sample voice data are obtained, converting the sample voice data into a training text through a voice recognition technology aiming at each sample voice data;
marking the first character used to represent a place name in the training text with a first identifier and marking the non-first characters used to represent the place name with a second identifier, so as to take the training text carrying the first identifier and the second identifier as a target training text;
inputting a data set containing a plurality of target training texts into a BERT + CRF model so as to train the BERT + CRF model into the positioning model in a supervised learning mode.
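The two identifiers described above correspond to BIO-style sequence labels commonly used for named entity recognition. A minimal sketch of the marking step follows; the label names "B-LOC"/"I-LOC" and the helper function are illustrative assumptions, not the patent's actual identifiers:

```python
def mark_place_names(text, place_names):
    """Mark the first character of each place name with a first identifier
    ("B-LOC") and its non-first characters with a second identifier
    ("I-LOC"); all remaining characters are tagged "O" (outside)."""
    tags = ["O"] * len(text)
    for name in place_names:
        start = text.find(name)
        while start != -1:
            tags[start] = "B-LOC"                        # first character
            for i in range(start + 1, start + len(name)):
                tags[i] = "I-LOC"                        # non-first characters
            start = text.find(name, start + len(name))
    return list(zip(text, tags))

# "我今晚去天津" = "I am going to Tianjin tonight"
print(mark_place_names("我今晚去天津", ["天津"]))
```

Here "天" receives the first identifier and "津" the second, matching the marking scheme above.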
In one possible embodiment, inputting a data set containing a plurality of target training texts into a BERT + CRF model to train the BERT + CRF model into the positioning model by means of supervised learning, includes:
respectively putting the target training texts in the data set into a training set, a verification set and a test set according to a preset proportion;
after the training set, the verification set and the test set are respectively input into the BERT + CRF model, for each hyper-parameter preset for the BERT + CRF model, training the BERT + CRF model under that hyper-parameter through at least one target training text in the training set and the first and second identifiers carried by the at least one target training text, so as to obtain the parameters of the BERT + CRF model under that hyper-parameter; the BERT + CRF model recognizes, according to the parameters, the characters in the text information whose semantic similarity to a place name exceeds a preset threshold;
after the parameters are obtained, for each hyper-parameter, verifying the parameters of the BERT + CRF model under that hyper-parameter through the verification set, so as to obtain a first recognition rate of the model under those parameters; the first recognition rate is the success rate with which the BERT + CRF model under those parameters recognizes the characters in the target training texts of the verification set whose semantic similarity to a place name exceeds the preset threshold;
after the hyper-parameter of the BERT + CRF model is determined according to the first recognition rate, testing the parameters of the BERT + CRF model under that hyper-parameter through the test set, so as to obtain a second recognition rate of the model under those parameters; the second recognition rate is the success rate with which the BERT + CRF model under those parameters recognizes the characters in the target training texts of the test set whose semantic similarity to a place name exceeds the preset threshold;
and judging whether the second recognition rate is greater than or equal to a preset recognition rate; when the second recognition rate of the BERT + CRF model is greater than or equal to the preset recognition rate, using the model as the positioning model, so as to recognize through the positioning model the characters in the text information whose semantic similarity to a place name exceeds the preset threshold.
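The split / train / verify / test procedure above can be sketched as a generic selection loop. This is not the patent's implementation: training the BERT + CRF model is abstracted behind a caller-supplied `train_and_score` function, and the 8:1:1 split ratio and 0.9 preset recognition rate are illustrative assumptions.

```python
import random

def select_positioning_model(dataset, hyperparams, train_and_score,
                             preset_rate=0.9, ratios=(0.8, 0.1, 0.1), seed=0):
    """Split the target training texts by a preset proportion, train one
    model per hyper-parameter, choose the hyper-parameter with the best
    first (verification) recognition rate, then accept the chosen model
    only if its second (test) recognition rate reaches the preset rate."""
    data = list(dataset)
    random.Random(seed).shuffle(data)
    n_train = int(len(data) * ratios[0])
    n_val = int(len(data) * ratios[1])
    train = data[:n_train]
    val = data[n_train:n_train + n_val]
    test = data[n_train + n_val:]

    # First recognition rate: performance on the verification set per hyper-parameter.
    best_hp = max(hyperparams, key=lambda hp: train_and_score(hp, train, val))
    # Second recognition rate: performance of the chosen model on the test set.
    second_rate = train_and_score(best_hp, train, test)
    return (best_hp, second_rate) if second_rate >= preset_rate else (None, second_rate)

# Toy stand-in scorer: pretend hyper-parameter value 3 gives the best model.
toy_score = lambda hp, train, eval_set: 1.0 - abs(hp - 3) / 10
print(select_positioning_model(range(100), [1, 2, 3, 4], toy_score))
```

The same skeleton works with a real trainer: `train_and_score` would fine-tune a BERT + CRF tagger on `train` and report its recognition rate on the evaluation split.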
In one possible embodiment, the correspondence between the words and the geographic locations is constructed by:
after acquiring at least one piece of geographic position information that includes a geographic position, the characters used to represent the name of the geographic position, and the correspondence between the characters and the geographic position, for each geographic position in the geographic position information, taking from the at least one piece of geographic position information corresponding to that geographic position the geographic position itself, at least one character or word used to represent its name, and the correspondence between that character or word and the geographic position, and putting them into a geographic position set corresponding to the geographic position;
and storing the geographical position set into the corresponding relation between the characters and the geographical positions.
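One way to realize this correspondence is an inverted index from each naming character or word to its geographic position set; the sample entries and coordinate strings below are illustrative assumptions:

```python
from collections import defaultdict

def build_correspondence(geo_infos):
    """Place each geographic position, its naming words, and their
    correspondence into a set keyed by the naming word, so that a
    target word can later be looked up directly."""
    word_to_positions = defaultdict(set)
    for info in geo_infos:
        for word in info["names"]:
            word_to_positions[word].add(info["position"])
    return word_to_positions

geo_infos = [
    {"names": ["Beijing", "北京"], "position": "39°26'-41°03'N, 115°25'-117°30'E"},
    {"names": ["Shanghai", "上海"], "position": "30°40'-31°53'N, 120°52'-122°12'E"},
]
index = build_correspondence(geo_infos)
print(index["Shanghai"])
```

Because several words can name the same location, "上海" and "Shanghai" resolve to the same geographic position set.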
In a possible embodiment, before obtaining at least one geographic location information including a geographic location, a text indicating a name of the geographic location, and a correspondence between the text and the geographic location, the method further includes:
for each marked word, taking the marked word as a preset place name, and setting, for each preset place name, the geographic position corresponding to it; a marked word is a word formed of characters carrying target identifiers, and the target identifiers include the first identifier and the second identifier;
and aiming at each preset place name, storing the preset place name, the geographic position of the preset place name and the corresponding relation between the preset place name and the geographic position into the geographic position information corresponding to the preset place name.
In a second aspect, an embodiment of the present application further provides a location identification apparatus, where the apparatus includes:
the conversion unit is used for converting the voice data into text information through a voice recognition technology after the voice data is obtained;
the positioning unit is used for, after the text information is input into the pre-trained positioning model for fuzzy recognition of characters, recognizing through the positioning model the characters in the text information whose semantic similarity to a place name exceeds a preset threshold, so as to take those characters as target characters;
and the determining unit is used for searching, for each target character, the correspondence between characters and geographic positions for the geographic position information containing the target character, so as to determine the geographic position in that information as the geographic position corresponding to the target character.
In one possible embodiment, the apparatus further comprises:
the voice recognition system comprises a sample unit, a voice recognition unit and a voice recognition unit, wherein the sample unit is used for converting sample voice data into a training text through a voice recognition technology aiming at each sample voice data after the plurality of sample voice data are obtained;
a marking unit, configured to mark the first character used to represent a place name in the training text with a first identifier, mark the non-first characters used to represent the place name with a second identifier, and use the training text carrying the first identifier and the second identifier as a target training text;
the input unit is used for inputting a data set containing a plurality of target training texts into a BERT + CRF model, and the BERT + CRF model is trained into the positioning model in a supervised learning mode.
In a possible embodiment, the input unit is specifically configured to:
respectively putting the target training texts in the data set into a training set, a verification set and a test set according to a preset proportion;
after the training set, the verification set and the test set are respectively input into the BERT + CRF model, for each hyper-parameter preset for the BERT + CRF model, training the BERT + CRF model under that hyper-parameter through at least one target training text in the training set and the first and second identifiers carried by the at least one target training text, so as to obtain the parameters of the BERT + CRF model under that hyper-parameter; the BERT + CRF model recognizes, according to the parameters, the characters in the text information whose semantic similarity to a place name exceeds a preset threshold;
after the parameters are obtained, for each hyper-parameter, verifying the parameters of the BERT + CRF model under that hyper-parameter through the verification set, so as to obtain a first recognition rate of the model under those parameters; the first recognition rate is the success rate with which the BERT + CRF model under those parameters recognizes the characters in the target training texts of the verification set whose semantic similarity to a place name exceeds the preset threshold;
after the hyper-parameter of the BERT + CRF model is determined according to the first recognition rate, testing the parameters of the BERT + CRF model under that hyper-parameter through the test set, so as to obtain a second recognition rate of the model under those parameters; the second recognition rate is the success rate with which the BERT + CRF model under those parameters recognizes the characters in the target training texts of the test set whose semantic similarity to a place name exceeds the preset threshold;
and judging whether the second recognition rate is greater than or equal to a preset recognition rate; when the second recognition rate of the BERT + CRF model is greater than or equal to the preset recognition rate, using the model as the positioning model, so as to recognize through the positioning model the characters in the text information whose semantic similarity to a place name exceeds the preset threshold.
In one possible embodiment, the apparatus further comprises the following units to construct the correspondence between the characters and the geographic positions:
the alignment unit is used for, after at least one piece of geographic position information including a geographic position, the characters used to represent the name of the geographic position, and the correspondence between the characters and the geographic position is acquired, for each geographic position in the geographic position information, taking from the at least one piece of geographic position information corresponding to that geographic position the geographic position itself, at least one character or word used to represent its name, and the correspondence between that character or word and the geographic position, and putting them into a geographic position set corresponding to the geographic position;
and the first storage unit is used for storing the geographical position set into the corresponding relation between the characters and the geographical positions.
In one possible embodiment, the apparatus further comprises:
the system comprises a presetting unit, a processing unit and a processing unit, wherein the presetting unit is used for taking a marked word as a preset place name and setting a geographical position corresponding to the preset place name for each preset place name before acquiring at least one geographical position information containing the geographical position, a character used for representing the geographical position name and a corresponding relation between the character and the geographical position; the marked vocabulary is a vocabulary formed by characters carrying target identifications, and the target identifications comprise first identifications and second identifications;
and the second storage unit is used for storing the preset place name, the geographic position of the preset place name and the corresponding relation between the preset place name and the geographic position into the geographic position information corresponding to the preset place name aiming at each preset place name.
In a third aspect, an embodiment of the present application further provides an electronic device, including: a processor, a storage medium and a bus, the storage medium storing machine-readable instructions executable by the processor, the processor and the storage medium communicating over the bus when the electronic device is operated, the processor executing the machine-readable instructions to perform the steps of the method according to any one of the first aspect.
In a fourth aspect, this application further provides a computer-readable storage medium, on which a computer program is stored, which, when executed by a processor, performs the steps of the method according to any one of the first aspect.
According to the position recognition method and device, the electronic device, and the storage medium provided by the application, after voice data is acquired, the voice data is converted into text information through a voice recognition technology. Because voice data is easily disturbed by the environment, noise, signals and the like, the text information produced by the voice recognition technology contains certain errors, and these errors change some characters in the text information into other characters. The text information is therefore input into a pre-trained positioning model for fuzzy recognition of characters, and the positioning model recognizes the characters in the text information whose semantic similarity to a place name exceeds a preset threshold, taking them as target characters; then, for each target character, the geographic position information containing the target character is searched from the correspondence between characters and geographic positions, and the geographic position in that information is determined as the geographic position corresponding to the target character. In situations where errors easily arise in speech-to-text conversion, compared with the prior-art method of determining place names by character-by-character or word-by-word comparison, the positioning model can still recognize the characters whose semantic similarity to a place name exceeds the preset threshold even when the converted place name contains wrongly written characters, thereby improving the accuracy of recognizing place names in speech and determining the corresponding geographic position for each place name.
In order to make the aforementioned objects, features and advantages of the present application more comprehensible, preferred embodiments accompanied with figures are described in detail below.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are required to be used in the embodiments will be briefly described below, it should be understood that the following drawings only illustrate some embodiments of the present application and therefore should not be considered as limiting the scope, and for those skilled in the art, other related drawings can be obtained from the drawings without inventive effort.
Fig. 1 shows a flowchart of a location identification method provided in an embodiment of the present application.
Fig. 2 shows a flowchart of another location identification method provided in an embodiment of the present application.
Fig. 3 shows a schematic structural diagram of a position identification device provided in an embodiment of the present application.
Fig. 4 shows a schematic structural diagram of an electronic device provided in an embodiment of the present application.
Detailed Description
In order to make the purpose, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it should be understood that the drawings in the present application are for illustrative and descriptive purposes only and are not used to limit the scope of protection of the present application. Additionally, it should be understood that the schematic drawings are not necessarily drawn to scale. The flowcharts used in this application illustrate operations implemented according to some embodiments of the present application. It should be understood that the operations of the flow diagrams may be performed out of order, and steps without logical context may be performed in reverse order or simultaneously. One skilled in the art, under the guidance of this application, may add one or more other operations to, or remove one or more operations from, the flowchart.
In addition, the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. The components of the embodiments of the present application, generally described and illustrated in the figures herein, can be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the present application, presented in the accompanying drawings, is not intended to limit the scope of the claimed application, but is merely representative of selected embodiments of the application. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the present application without making any creative effort, shall fall within the protection scope of the present application.
It should be noted that in the embodiments of the present application, the term "comprising" is used to indicate the presence of the features stated hereinafter, but does not exclude the addition of further features.
It should be noted that the apparatuses, electronic devices, and the like according to the embodiments of the present application may be executed on a single server or may be executed in a server group. The server group may be centralized or distributed. In some embodiments, the server may be local or remote to the terminal. For example, the server may access information and/or data stored in the service requester terminal, the service provider terminal, or the database, or any combination thereof, via the network. As another example, the server may be directly connected to at least one of the service requester terminal, the service provider terminal and the database to access the stored information and/or data. In some embodiments, the server may be implemented on a cloud platform; by way of example only, the cloud platform may include a private cloud, a public cloud, a hybrid cloud, a community cloud (community cloud), a distributed cloud, an inter-cloud, a multi-cloud, and the like, or any combination thereof.
In addition, the apparatus or the electronic device related to the embodiment of the present application may be implemented on an access device or a third-party device, and specifically may include: a mobile device, a tablet computer, a laptop computer, or a built-in device in a motor vehicle, etc., or any combination thereof. In some embodiments, the mobile device may include a smart home device, a wearable device, a smart mobile device, a virtual reality device, an augmented reality device, or the like, or any combination thereof. In some embodiments, the smart home devices may include a control device of a smart electrical device, a smart monitoring device, a smart television, a smart camera, or an intercom, or the like, or any combination thereof. In some embodiments, the wearable device may include a smart bracelet, a smart helmet, a smart watch, a smart accessory, and the like, or any combination thereof. In some embodiments, the smart mobile device may include a smartphone, a Personal Digital Assistant (PDA), a gaming device, a navigation device, or the like, or any combination thereof. In some embodiments, the virtual reality device and/or the augmented reality device may include a virtual reality helmet, an augmented reality helmet, or the like, or any combination thereof. For example, the virtual reality device and/or augmented reality device may include various virtual reality products and the like.
Example one
Fig. 1 is a flowchart of a location identification method according to an embodiment of the present application, and as shown in fig. 1, the method is implemented by the following steps:
step 101, after acquiring voice data, converting the voice data into text information through a voice recognition technology.
Specifically, the voice data is data containing a user's voice, for example a voice message sent by the user or the user's call data. The application scenarios of the embodiments of the application are accident reporting and claim settlement in the insurance industry; accordingly, in these embodiments the voice data is the call data between the user and an intelligent robot when the user gives service feedback or reports an incident. After the voice data is acquired, it is converted into text information through a voice recognition technology, and the text information is cleaned. Cleaning removes interference such as garbled codes and special characters produced during speech-to-text conversion, and deletes empty texts.
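The cleaning step can be sketched as follows; the exact character classes retained (CJK characters, Latin letters, digits, basic punctuation) are an assumption, since the patent does not specify them:

```python
import re

# Characters retained after cleaning: CJK, letters, digits, basic punctuation.
_KEEP = re.compile(r"[^\u4e00-\u9fffA-Za-z0-9，。？！,.?! ]")

def clean_texts(texts):
    """Strip garbled codes and special characters produced during
    speech-to-text conversion, then delete empty texts."""
    cleaned = []
    for t in texts:
        t = _KEEP.sub("", t).strip()
        if t:                      # drop empty text
            cleaned.append(t)
    return cleaned

print(clean_texts(["我今晚去天津�##", "", "   "]))
```

Only the first text survives: its replacement character and "#" symbols are stripped, while the empty and whitespace-only texts are deleted.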
Step 102, after the text information is input into the pre-trained positioning model for fuzzy recognition of characters, recognizing through the positioning model the characters in the text information whose semantic similarity to a place name exceeds a preset threshold, and taking those characters as target characters.
Specifically, the positioning model recognizes the characters in the text information and, according to the parameters obtained by training in advance, identifies the characters whose semantic similarity to a place name exceeds a preset threshold. A target character may be a single character or a word composed of at least two characters. After step 101 is executed, the cleaned text information is input into the positioning model; the model recognizes and marks characters in the text information, and the marked characters are extracted to obtain the target characters whose semantic similarity to a place name exceeds the preset threshold. The semantic similarity between the extracted target characters and place names is determined by the parameters of the positioning model: the more accurate the trained parameters, the higher the semantic similarity of the extracted target characters to place names.
In practical applications, when the quality of a voice call is affected by the user's habitual wording, speaking style, or accent, or by the call environment, noise, or signal problems, errors may occur when the speech is converted into text through the voice recognition technology; for example, "I am going to Tianjin this evening" may be converted into "I am going to Tianjing this evening" or another near-homophonic rendering of "Tianjin". Through the trained positioning model, words in the voice call data that are semantically similar to "Tianjin", such as "Tianjing", can be recognized as target characters; the training process of the positioning model is described in steps 211-215.
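The patent's fuzzy recognition is performed by the trained BERT + CRF positioning model; purely to illustrate the thresholding idea, the sketch below scores candidate spans by string similarity instead (a much weaker notion than the model's semantic similarity, and the 0.6 threshold is an illustrative assumption):

```python
from difflib import SequenceMatcher

def best_fuzzy_match(text, place_name, threshold=0.6):
    """Slide a window the size of the place name over the text and return
    the most similar span, provided its similarity exceeds the threshold."""
    w = len(place_name)
    spans = [text[i:i + w] for i in range(len(text) - w + 1)]
    best = max(spans, key=lambda s: SequenceMatcher(None, s, place_name).ratio())
    if SequenceMatcher(None, best, place_name).ratio() > threshold:
        return best
    return None

# A mis-transcription "tienjin" is still matched to "tianjin":
print(best_fuzzy_match("i go to tienjin tonight", "tianjin"))
```

A text with no span resembling the place name falls below the threshold and yields no target, mirroring how the positioning model only marks characters whose similarity exceeds the preset value.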
Step 103, for each target character, searching the geographical position information containing the target character from the corresponding relation between the character and the geographical position, so as to determine the geographical position in the geographical position information as the geographical position corresponding to the target character.
Specifically, the correspondence between characters and geographic positions includes a knowledge graph constructed for at least one geographic position; for each geographic position, the knowledge graph includes entities, geographic positions, relationships between entities and geographic positions, and relationships between different entities. A knowledge graph is essentially a semantic-network knowledge base: a relational graph representing the relations among different entities. A relationship expresses the connection between different entities; an entity is a noun representing something that objectively exists in the real world, such as a person's name, a place name, a concept, a medicine, or a company.
The correspondence between characters and geographic positions includes a plurality of pieces of geographic position information or a plurality of geographic position sets; each piece of geographic position information or geographic position set includes the characters representing the name of the location, its geographic position, and the correspondence between the characters and the geographic position. After the positioning model extracts the target characters in step 102, for each target character, the geographic position information containing the target character is searched from the correspondence between characters and geographic positions, and the geographic position in that information is determined as the geographic position corresponding to the target character.
When only one target character is identified in one text message, a geographical position can be uniquely determined according to the target character; when a plurality of target characters are identified in one text message, determining the geographic position corresponding to the text message according to a plurality of geographic positions determined for the plurality of target characters from the corresponding relation between the characters and the geographic positions.
The rule for determining the geographic location corresponding to the text information can be preset; the embodiment of the application does not limit this rule, and it can be adjusted to the actual situation. For example, the rule may select, among the geographic locations corresponding to the target characters, the one with the smallest range: if the text information is "go to Plaza A in Beijing to watch the flag-raising", the target characters identified by the positioning model are "Beijing" and "Plaza A", and the geographic location corresponding to the text information is determined to be that of "Plaza A".
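The "smallest range wins" rule can be sketched as follows; the bounding-box representation, the area proxy for "range", and all names and coordinates are illustrative assumptions, not taken from the embodiment.

```python
# Sketch of the preset rule: among all candidate locations, pick the one
# covering the smallest range. A (lat_min, lat_max, lon_min, lon_max)
# bounding box is an assumed representation of a "range".

def pick_smallest_range(candidates):
    """candidates: list of (place_name, (lat_min, lat_max, lon_min, lon_max))."""
    def area(box):
        lat_min, lat_max, lon_min, lon_max = box
        return (lat_max - lat_min) * (lon_max - lon_min)
    name, box = min(candidates, key=lambda c: area(c[1]))
    return name, box

candidates = [
    ("Beijing", (39.43, 41.05, 115.42, 117.50)),  # the whole city
    ("Plaza A", (39.90, 39.91, 116.39, 116.40)),  # a single square
]
name, box = pick_smallest_range(candidates)
print(name)  # the narrower "Plaza A" is chosen
```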
After the geographic location of a target character is determined, the target character and its corresponding geographic location can be displayed through a web page or a client, annotated onto the voice message, or directly stored, labeled, and output to a third party.
For example, the following geographical location information is stored in the correspondence relationship between the text and the geographical location:
Information one: Beijing, north latitude 39°26′–41°03′, east longitude 115°25′–117°30′.
Information two: Shanghai, north latitude 30°40′–31°53′, east longitude 120°52′–122°12′.
If the target character identified by the positioning model is "Shanghai", the target character "Shanghai" is searched among the geographic location information in the correspondence between characters and geographic locations, information two is determined to be the geographic location information containing "Shanghai", and its geographic location, north latitude 30°40′–31°53′, east longitude 120°52′–122°12′, is determined as the geographic location corresponding to the target character "Shanghai".
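The lookup step above can be sketched with the stored correspondence stood in by a plain dictionary; the entries and coordinate strings are illustrative.

```python
# Minimal sketch of "search the geographic location information containing
# the target character". A dict is an assumed stand-in for the stored
# character-to-location correspondence.

GAZETTEER = {
    "Beijing":  "north latitude 39°26'-41°03', east longitude 115°25'-117°30'",
    "Shanghai": "north latitude 30°40'-31°53', east longitude 120°52'-122°12'",
}

def locate(target_word):
    # returns None when no stored entry contains the target word
    return GAZETTEER.get(target_word)

print(locate("Shanghai"))
```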
With the position recognition method, apparatus, electronic device, and storage medium described above, voice data is first converted into text information by a speech recognition technology. Because voice data is easily disturbed by the environment, noise, signal quality, and the like, the converted text contains a certain number of errors, and these errors change some characters in the text information into other characters. For each target character, the geographic location information containing it is then searched from the correspondence between characters and geographic locations, and the geographic location in that information is determined as the geographic location corresponding to the target character. Given that speech-to-text conversion is error-prone, and compared with prior-art methods that determine a place name by character-by-character or word-by-word comparison, the positioning model of this application can recognize the characters in the voice data whose semantic similarity to a place name exceeds the preset threshold; that is, even when a converted place name contains wrongly written characters, the positioning model can still recognize it, which improves the accuracy of recognizing place names in speech and of determining the geographic location corresponding to each place name.
In a possible implementation, fig. 2 is a flowchart of another location identification method provided in an embodiment of the present application, and as shown in fig. 2, the method is implemented by the following steps:
step 201, after a plurality of sample voice data are obtained, for each sample voice data, converting the sample voice data into a training text by a voice recognition technology.
Specifically, in the embodiment of the application, the sample voice data is historical user voice data, acquired from a server or the cloud, containing calls between users and an intelligent robot. After a plurality of sample voice data are obtained, each is converted into a training text by a speech recognition technology, the training texts are cleaned to eliminate interference such as garbled characters and special symbols produced during the conversion, and empty texts are deleted.
Step 202, marking the first character used for representing a place name in the training text with a first identifier, and marking the non-first characters of the place name with a second identifier, so as to use the training text carrying the first identifier and the second identifier as a target training text.
Specifically, after the plurality of training texts are obtained in step 201, a first character in all place names in the training texts is marked with a first identifier, and the other characters except the first character in all place names in the training texts are marked with a second identifier. In addition, characters irrelevant to the place name are marked by the third identification. At this time, all characters in the training text are marked with the first identifier, the second identifier or the third identifier, and the training text carrying the first identifier, the second identifier and the third identifier is used as a target training text.
For example, if the text in the training text is "I want to go to Beijing this evening", a first identifier and a second identifier are marked on "Beijing", which represents a place name: the first character of "Beijing" is marked with the first identifier, the non-first character of "Beijing" is marked with the second identifier, and the remaining characters of "I want to go to Beijing this evening", which are irrelevant to the place name, are each marked with the third identifier.
Alternatively, if the text in the training text contains a miswritten place name, for example "go to Meidu to listen", where "Meidu" is a wrongly transcribed place name, the first character "Mei" is marked with the first identifier, the non-first character "du" is marked with the second identifier, and each of the remaining characters is marked with the third identifier.
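The labeling scheme of step 202 can be sketched with a BIO-style tagger; the tag names "B-LOC", "I-LOC", and "O" are assumed stand-ins for the first, second, and third identifiers, and the span positions are hard-coded for illustration.

```python
# Sketch of step 202: first place-name character -> first identifier (B-LOC),
# remaining place-name characters -> second identifier (I-LOC),
# all other characters -> third identifier (O).

def label(chars, place_spans):
    """place_spans: list of half-open (start, end) spans covering place names."""
    tags = ["O"] * len(chars)
    for start, end in place_spans:
        tags[start] = "B-LOC"
        for i in range(start + 1, end):
            tags[i] = "I-LOC"
    return tags

chars = list("我今晚想去北京")   # "I want to go to Beijing this evening"
tags = label(chars, [(5, 7)])    # "北京" occupies positions 5..6
print(list(zip(chars, tags)))
```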
Step 203, inputting a data set containing a plurality of target training texts into a BERT + CRF model, and training the BERT + CRF model into the positioning model in a supervised learning mode.
Specifically, the BERT model (a pre-training model) is used to increase the generalization capability of the word vector model and to fully describe character-level, word-level, sentence-level, and even inter-sentence relational features. The CRF (Conditional Random Field) models the target sequence on the basis of the observation sequence and mainly solves the problem of sequence labeling. Supervised learning is a machine learning task that infers a function from a labeled training dataset. The embodiment of the application adopts a BERT+CRF model as the named entity recognition model; the named entity recognition model can be changed according to the actual situation, and entity recognition models such as ELECTRA and ERNIE can be adopted instead. The named entity recognition process of the BERT+CRF model is in fact the process of predicting a label sequence for an input sentence: it can summarize rules from the training texts and predict unknown text according to those rules.
After the target training texts are labeled with the first identifier, the second identifier and the third identifier in step 202, a data set composed of a plurality of target training texts is input into a BERT + CRF model.
In a possible embodiment, when step 203 is executed, the following steps are specifically implemented:
Step 211, putting the target training texts in the data set into a training set, a verification set, and a test set according to a preset proportion.
Specifically, in the embodiment of the present application, the preset ratio is 7:2:1, that is, 70% of the target training texts in the data set are put into the training set, 20% of the target training texts in the data set are put into the verification set, and 10% of the target training texts are put into the test set. The preset proportion is not limited, and can be set according to other proportions.
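A sketch of the 7:2:1 split described above; the shuffling and the fixed seed are assumptions, since the embodiment does not specify how the texts are assigned.

```python
# Sketch of splitting the data set of target training texts 7:2:1
# into training, verification, and test sets.
import random

def split_dataset(texts, ratios=(0.7, 0.2, 0.1), seed=42):
    texts = texts[:]                       # leave the caller's list untouched
    random.Random(seed).shuffle(texts)
    n = len(texts)
    n_train = round(n * ratios[0])
    n_val = round(n * ratios[1])
    return texts[:n_train], texts[n_train:n_train + n_val], texts[n_train + n_val:]

data = [f"target_text_{i}" for i in range(10)]
train, val, test = split_dataset(data)
print(len(train), len(val), len(test))  # 7 2 1
```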
Step 212, after the training set, the verification set and the test set are respectively input into the BERT + CRF model, for each hyper-parameter preset for the BERT + CRF model, training the BERT + CRF model under the hyper-parameter through at least one target training text in the training set and the first identifier and the second identifier carried by the at least one target training text to obtain parameters of the BERT + CRF model under the hyper-parameter; and the BERT + CRF model identifies characters, with the semantic similarity to the place name exceeding a preset threshold, in the text information according to the parameters.
Specifically, parameters are variables that the model learns automatically from data, such as the weights and biases in deep learning. Hyper-parameters are the variables used to configure a model; for the same model, different hyper-parameters yield different trained models, and they are generally set empirically. For example, in deep learning the hyper-parameters include the learning rate, the number of iterations, the number of layers, the number of neurons per layer, and so on. The preset threshold is set in advance and can be, for example, 100% or 90% according to the actual situation.
The training set, verification set, and test set are input into the BERT+CRF model respectively: the model is trained on the training set, the trained model is verified on the verification set, and the verified model is tested on the test set. Before training the BERT+CRF model, a plurality of hyper-parameter settings are prepared, and under each setting the parameters of the BERT+CRF model are trained on the training set. After the training set is input into the BERT+CRF model, the characters carrying the first identifier and the second identifier in its target training texts serve as expected output values; the model analyzes the target training texts to produce actual output values, and the parameters of the model under the given hyper-parameters are determined by continuously reducing the error between the actual and expected outputs, so that the model acquires the ability to predict unknown samples from those parameters. For example, when the same model is trained with different numbers of iterations (a hyper-parameter), the parameters obtained from the training set differ, and under different hyper-parameters these different parameters give the model different abilities to recognize characters whose semantic similarity to a place name exceeds the preset threshold. The more suitable the hyper-parameter setting, the higher the semantic similarity between place names and the characters recognized by the parameters trained under it.
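The per-hyper-parameter training loop of step 212 can be sketched as below; `train_model` is a placeholder for real BERT+CRF training, and the hyper-parameter names and values are illustrative assumptions.

```python
# Skeleton of step 212: train one set of model parameters per preset
# hyper-parameter setting. Real BERT+CRF training is replaced by a
# stand-in that merely records what it was given.

def train_model(train_set, hyper_params):
    # placeholder "parameters": in practice these would be learned weights
    return {"trained_on": len(train_set), "hp": hyper_params}

hyper_param_grid = [
    {"learning_rate": 5e-5, "epochs": 3},   # illustrative settings
    {"learning_rate": 3e-5, "epochs": 5},
]

train_set = ["target_text_1", "target_text_2", "target_text_3"]
params_per_hp = {i: train_model(train_set, hp)
                 for i, hp in enumerate(hyper_param_grid)}
print(len(params_per_hp))  # one parameter set per hyper-parameter setting
```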
Step 213, after obtaining the parameters, verifying the parameters of the BERT + CRF model under the hyper-parameters through the verification set for each hyper-parameter to obtain a first identification rate of the model under the parameters; and the first recognition rate is the success rate of recognizing the characters with the semantic similarity with the place name exceeding a preset threshold in the target training texts in the verification set by the BERT + CRF model under the parameter.
Specifically, after the step 212 is executed, parameters respectively corresponding to the model under different hyper-parameters are obtained. And inputting the verification set into a BERT + CRF model under each hyper-parameter aiming at the corresponding BERT + CRF model under each hyper-parameter, identifying characters in a target training text of the verification set according to the parameters of the BERT + CRF model under the hyper-parameter, and calculating a first identification rate by comparing the identification result with the identification result of the identification carried by the target training text. The recognition result refers to the recognition condition of the BERT + CRF model to characters in the target training text of the verification set under the parameter. The first recognition rate is the ratio of the number of successfully recognized place names to the total number of place names in the target training text.
For example, suppose the text information in one of the target training texts in the verification set is "take the high-speed rail to Beijing to play today". In advance, the seven characters of the text outside the place name are each marked with the third identifier, "Bei" is marked with the first identifier, and "jing" is marked with the second identifier.
For the model under one of the hyper-parameter settings, after this text information is input into the trained model, the model uses its trained parameters to mark the characters with a first, second, and third verification identifier corresponding to the first, second, and third identifiers. If the model marks the seven characters outside the place name with the third verification identifier, "Bei" with the first verification identifier, and "jing" with the second verification identifier, then, by comparison with the vocabulary formed by the first and second identifiers, the model has recognized the place name "Beijing", and the first recognition rate on this target training text is 100%. When the verification set contains a plurality of target training texts, the first recognition rate of the model over the whole verification set is calculated.
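The first recognition rate on a single text can be computed by comparing predicted place-name spans against the gold spans; the tag names and the hard-coded model output below are assumptions for illustration.

```python
# Sketch of the first-recognition-rate computation: the ratio of gold
# place-name spans whose predicted B/I span matches exactly.

def extract_spans(tags):
    """Collect half-open (start, end) spans of B-LOC/I-LOC runs."""
    spans, start = [], None
    for i, t in enumerate(tags + ["O"]):       # sentinel closes a final run
        if t == "B-LOC":
            if start is not None:
                spans.append((start, i))
            start = i
        elif t != "I-LOC" and start is not None:
            spans.append((start, i))
            start = None
    return spans

# "take the high-speed rail to Beijing to play today": 7 O-chars + "北京"
gold = ["O"] * 6 + ["B-LOC", "I-LOC", "O"]
pred = ["O"] * 6 + ["B-LOC", "I-LOC", "O"]     # model tagged the same span
hits = len(set(extract_spans(gold)) & set(extract_spans(pred)))
rate = hits / len(extract_spans(gold))
print(rate)  # 1.0 -> a 100% first recognition rate on this text
```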
Step 214, after determining the hyper-parameter of the BERT + CRF model according to the first recognition rate, testing the parameter of the BERT + CRF model under the hyper-parameter through a test set to obtain a second recognition rate of the model under the parameter; and the second recognition rate is the success rate of recognizing the characters with the semantic similarity with the place name exceeding a preset threshold in the target training texts in the test set by the BERT + CRF model under the parameter.
Specifically, after step 213 is executed, a plurality of first recognition rates of the model are obtained. The hyper-parameter setting corresponding to the highest first recognition rate may be determined as the hyper-parameter of the BERT+CRF model; alternatively, a hyper-parameter setting whose first recognition rate exceeds a threshold may be chosen. The embodiment of the application does not limit the method for determining the hyper-parameter. After the hyper-parameter is determined for the BERT+CRF model, the target training texts in the test set are used as test samples, and the parameters of the BERT+CRF model under that hyper-parameter are tested through the test set. The second recognition rate is calculated in the same manner as the first recognition rate.
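Selecting the hyper-parameter with the highest first recognition rate (one of the two strategies the step allows) reduces to an arg-max; the rates below are made-up numbers.

```python
# Sketch of picking the final hyper-parameter setting by highest
# first recognition rate. Keys and rates are illustrative.

first_rates = {
    "lr=5e-5, epochs=3": 0.91,
    "lr=3e-5, epochs=5": 0.96,
    "lr=1e-4, epochs=2": 0.88,
}
best_hp = max(first_rates, key=first_rates.get)
print(best_hp)  # lr=3e-5, epochs=5
```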
Step 215, judging whether the second recognition rate is greater than or equal to a preset recognition rate, and when the second recognition rate of the BERT + CRF model is greater than or equal to the preset recognition rate, using the model as the positioning model to recognize characters, of which the semantic similarity with the place name exceeds a preset threshold, in the text information through the positioning model.
Specifically, after step 214, the second recognition rate obtained by testing the model is compared with the preset recognition rate. When the second recognition rate is smaller than the preset recognition rate, the model trained under that hyper-parameter is considered to have an unsatisfactory place-name recognition effect; the hyper-parameters of the model need to be readjusted and the model retrained, or the BERT+CRF model is replaced with another named entity recognition model for training. When the second recognition rate is greater than or equal to the preset recognition rate, the model is considered to reach the required accuracy for recognizing place names in text and can be used as the positioning model, through which characters whose semantic similarity to a place name exceeds the preset threshold are recognized in the text information.
In one possible embodiment, the correspondence between the words and the geographic locations is constructed by the following steps:
after acquiring at least one piece of geographical location information comprising a geographical location, words used for representing the geographical location name and the corresponding relationship between the words and the geographical location, putting the geographical location, at least one word or vocabulary used for representing the geographical location name and the corresponding relationship between the at least one word or vocabulary and the geographical location into a geographical location set corresponding to the geographical location from the at least one piece of geographical location information corresponding to the geographical location for each geographical location in the geographical location information; and storing the geographical position set into the corresponding relation between the characters and the geographical positions.
Specifically, the geographic location information may be obtained by methods such as network crawling and historical service data sorting. The historical business data comprises geographic position information comprising geographic positions, characters used for representing the geographic position names and corresponding relations between the characters and the geographic positions. And performing operations such as entity disambiguation, entity alignment, attribute alignment and the like on the geographic position information, so as to arrange the geographic position information with the same geographic position and different place names into the same geographic position set. In the geographical location set, each place name uniquely corresponds to one geographical location, and each geographical location corresponds to at least one place name.
For example, assume that there are a plurality of geographical location information:
information one, Shandong, 34 ° north latitude 22.9 '-38 ° 24.01', and 114 ° east longitude 47.5 '-122 ° 42.3'.
Information two, zilu, 34 ° north latitude 22.9 '-38 ° 24.01', and 114 ° east longitude 47.5 '-122 ° 42.3'.
Information III, Beijing, 39 degrees 26-41 degrees 03 degrees, east longitude, 115 degrees 25-117 degrees 30 degrees.
Information four, emperor, north latitude 39 ° 26 '-41 ° 03', east longitude 115 ° 25 '-117 ° 30'.
Information five, Beijing, 39 degrees 26 '-41 degrees 03' of latitude, and 115 degrees 25 '-117 degrees 30' of east longitude.
The geographical information set after sorting is:
integrates one, (Shandong, Qilu; 34 degrees 22.9 '-38 degrees 24.01 degrees in northern latitude, and 114 degrees 47.5' -122 degrees 42.3 degrees in east longitude).
Set two, (Beijing, Didu, Beijing; 39 deg. 26 '-41 deg. 03', Dongding 115 deg. 25 '-117 deg. 30').
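The entity-alignment step that merges entries sharing one geographic location into a single alias set can be sketched as below; the abbreviated coordinate strings are illustrative.

```python
# Sketch of entity alignment: geographic location entries with the same
# coordinates are gathered into one set of place-name aliases.
from collections import defaultdict

entries = [
    ("Shandong", "N34°22.9'-38°24.01', E114°47.5'-122°42.3'"),
    ("Qilu",     "N34°22.9'-38°24.01', E114°47.5'-122°42.3'"),
    ("Beijing",  "N39°26'-41°03', E115°25'-117°30'"),
    ("Didu",     "N39°26'-41°03', E115°25'-117°30'"),
]

by_location = defaultdict(list)
for name, location in entries:
    by_location[location].append(name)

location_sets = [(names, loc) for loc, names in by_location.items()]
print(location_sets)
```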
The corresponding relation between the characters and the geographic positions also comprises the characters and the geographic positions acquired from channels such as historical business data, networks and the like, and the relation between the characters and the geographic positions.
For example, "zhang san lives in beijing," where "zhang san" and "beijing" are both entities in the relationship graph, and in the embodiment of the present application, "beijing" is an entity used to represent a geographic location corresponding to a geographic location in the knowledge graph representing longitude and latitude of beijing, "zhang san" is an entity having a relationship with the entity "beijing" used to represent the geographic location, and a relationship between "zhang san" and "beijing" is "living in. When the Beijing is identified, the relation of Zhang III can be realized according to the Beijing; when "zhang san" is recognized, it can be associated to "beijing" according to "zhang san". So as to improve the accuracy of the determined geographical position according to the association between the characters.
For example, after the knowledge graph includes "zhangsan", "beijing", and the relationship between "zhangsan" and "beijing", and after the text message includes the information with wrongly written characters, such as "zhangsan live in beijing", the geographical position of the beijing can be inferred from the correspondence between the written characters and the geographical position.
In the embodiments of the present application, the knowledge-graph is mainly used to represent the relationship between entities of geographic locations.
For example: according to the historical service information or the information obtained by the network, the following texts are arranged:
text one, square a to beijing.
Text two, square B is next to square a.
All the 'Beijing', 'Square A' and 'Square B' in the text belong to entities for representing geographic positions, and the relationship among the entities is that the 'Square A' belongs to the 'Beijing', 'Square B' and 'Square A' are adjacent, so that a relationship is generated between the 'Beijing' and the 'Square B' through the 'Square A', and the relationship is stored in the corresponding relationship between the characters and the geographic positions.
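Deriving the indirect relation between "Beijing" and "Plaza B" through the shared entity "Plaza A" can be sketched with a small triple store; the triple format and relation names are illustrative assumptions.

```python
# Sketch of finding the middle entity that links two geographic entities,
# as in the "Beijing - Plaza A - Plaza B" example above.

triples = [
    ("Plaza A", "belongs_to", "Beijing"),
    ("Plaza B", "adjacent_to", "Plaza A"),
]

def related_via(entity_a, entity_b, triples):
    """Return the shared neighbor entities linking entity_a and entity_b."""
    def neighbors(e):
        return {h if t == e else t for h, _, t in triples if e in (h, t)}
    return neighbors(entity_a) & neighbors(entity_b)

print(related_via("Beijing", "Plaza B", triples))  # {'Plaza A'}
```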
In a possible embodiment, before obtaining at least one geographic location information including a geographic location, a text for indicating a name of the geographic location, and a corresponding relationship between the text and the geographic location, the following steps are further performed:
for each marked word, taking the marked word as a preset place name, and setting a geographical position corresponding to the preset place name for each preset place name; the marked vocabulary is a vocabulary formed by characters carrying target identification, and the target identification comprises a first identification and a second identification. And aiming at each preset place name, storing the preset place name, the geographic position of the preset place name and the corresponding relation between the preset place name and the geographic position into the geographic position information corresponding to the preset place name.
Specifically, in step 202, for each word in the training text used to represent the place name, a first identifier or a second identifier is marked. And for each character carrying the first identifier and each vocabulary formed by the characters carrying the first identifier and at least one second identifier, taking the characters or the vocabulary as a preset place name, setting a geographic position corresponding to the preset place name for each preset place name, and setting geographic position information corresponding to the preset place name according to the preset place name and the geographic position.
For example, suppose the words "Beijing", a miswritten variant of "Beijing", "Tianjin", "Tianjing", "Lu", and "Shandong" are marked in the target training texts. The first characters "Bei" (of both forms of "Beijing"), "Tian", "Lu", and "Shan" are marked with the first identifier; the remaining characters "jing", "jin", and "dong" are marked with the second identifier. The single character "Lu" carrying the first identifier is taken as the first preset place name; the words "Beijing", the miswritten "Beijing", "Tianjin", "Tianjing", and "Shandong", each composed of characters carrying the first identifier and the second identifier, are taken as the second to sixth preset place names. A corresponding geographic location is set for each of the six preset place names, and for each preset place name, the place name, its geographic location, and the correspondence between them are stored in the geographic location information corresponding to that place name, yielding six pieces of geographic location information corresponding to the six preset place names:
Information one: Lu, north latitude 34°22.9′–38°24.01′, east longitude 114°47.5′–122°42.3′.
Information two: Beijing, north latitude 39°26′–41°03′, east longitude 115°25′–117°30′.
Information three: Beijing (miswritten variant), north latitude 39°26′–41°03′, east longitude 115°25′–117°30′.
Information four: Tianjin, north latitude 38°34′–40°15′, east longitude 116°43′–118°04′.
Information five: Tianjing, north latitude 38°34′–40°15′, east longitude 116°43′–118°04′.
Information six: Shandong, north latitude 34°22.9′–38°24.01′, east longitude 114°47.5′–122°42.3′.
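The resulting correspondence can be sketched as a gazetteer in which miswritten variants map to the same coordinates as the correct place name, so a wrongly transcribed name still resolves; the abbreviated coordinate strings are illustrative.

```python
# Sketch of the preset-place-name gazetteer: both correct names and
# miswritten variants produced by speech recognition map to the same
# geographic location.

gazetteer = {
    "Tianjin":  "N38°34'-40°15', E116°43'-118°04'",
    "Tianjing": "N38°34'-40°15', E116°43'-118°04'",  # miswritten variant
    "Beijing":  "N39°26'-41°03', E115°25'-117°30'",
    "Lu":       "N34°22.9'-38°24.01', E114°47.5'-122°42.3'",
    "Shandong": "N34°22.9'-38°24.01', E114°47.5'-122°42.3'",
}

# a transcription error still lands on Tianjin's coordinates
print(gazetteer["Tianjing"] == gazetteer["Tianjin"])  # True
```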
Example two
Fig. 3 is a schematic structural diagram of a position identification device according to an embodiment of the present application, and as shown in fig. 3, the device includes: a conversion unit 301, a positioning unit 302 and a determination unit 303.
The conversion unit 301 is configured to, after acquiring the voice data, convert the voice data into text information by using a voice recognition technology.
A positioning unit 302, configured to, after the text information is input into a pre-trained positioning model for performing fuzzy recognition on characters, recognize, through the positioning model, a character in the text information whose semantic similarity to a place name exceeds a preset threshold, so as to use the character as a target character.
The determining unit 303 is configured to, for each target text, search geographic position information including the target text from a correspondence between the text and a geographic position, so as to determine the geographic position in the geographic position information as the geographic position corresponding to the target text.
In one possible embodiment, the apparatus further comprises:
the voice recognition system comprises a sample unit and a training text conversion unit, wherein the sample unit is used for converting sample voice data into a training text through a voice recognition technology aiming at each sample voice data after a plurality of sample voice data are obtained.
And the marking unit is used for marking a first character used for representing the name of the place name in the training text with a first identifier and marking a non-first character used for representing the name of the place name with a second identifier so as to take the training text carrying the first identifier and the second identifier as a target training text.
The input unit is used for inputting a data set containing a plurality of target training texts into a BERT + CRF model, and the BERT + CRF model is trained into the positioning model in a supervised learning mode.
In a possible embodiment, the input unit is specifically configured to:
and respectively putting the target training files in the data set into a training set, a verification set and a test set according to a preset proportion.
After the training set, the verification set and the test set are respectively input into the BERT + CRF model, aiming at each hyper-parameter preset for the BERT + CRF model, training the BERT + CRF model under the hyper-parameter through at least one target training text in the training set and the first identification and the second identification carried by the at least one target training text to obtain parameters of the BERT + CRF model under the hyper-parameter; and the BERT + CRF model identifies characters, with the semantic similarity to the place name exceeding a preset threshold, in the text information according to the parameters.
After the parameters are obtained, verifying the parameters of the BERT + CRF model under the hyper-parameters through the verification set aiming at each hyper-parameter so as to obtain a first identification rate of the model under the parameters; and the first recognition rate is the success rate of recognizing the characters with the semantic similarity with the place name exceeding a preset threshold in the target training texts in the verification set by the BERT + CRF model under the parameter.
After determining the hyper-parameter of the BERT + CRF model according to the first identification rate, testing the parameter of the BERT + CRF model under the hyper-parameter through a test set to obtain a second identification rate of the model under the parameter; and the second recognition rate is the success rate of recognizing the characters with the semantic similarity with the place name exceeding a preset threshold in the target training texts in the test set by the BERT + CRF model under the parameter.
And judging whether the second recognition rate is greater than or equal to a preset recognition rate, and when the second recognition rate of the BERT + CRF model is greater than or equal to the preset recognition rate, using the model as the positioning model to recognize characters, with the semantic similarity with the place name exceeding a preset threshold, in the text information through the positioning model.
In one possible embodiment, the correspondence between the words and the geographic locations is constructed by:
the alignment unit is configured to, after obtaining at least one piece of geographic location information including a geographic location, a word used for representing the name of the geographic location, and a correspondence between the word and the geographic location, put the geographic location, at least one word or vocabulary used for representing the name of the geographic location, and a correspondence between the at least one word or vocabulary and the geographic location into a geographic location set corresponding to the geographic location from at least one piece of geographic location information corresponding to the geographic location for each geographic location in the geographic location information.
And the first storage unit is used for storing the geographical position set into the corresponding relation between the characters and the geographical positions.
In one possible embodiment, the apparatus further comprises:
the system comprises a presetting unit, a processing unit and a processing unit, wherein the presetting unit is used for taking a marked word as a preset place name and setting a geographical position corresponding to the preset place name for each preset place name before acquiring at least one geographical position information containing the geographical position, a character used for representing the geographical position name and a corresponding relation between the character and the geographical position; the marked vocabulary is a vocabulary formed by characters carrying target identification, and the target identification comprises a first identification and a second identification.
And the second storage unit is used for storing the preset place name, the geographic position of the preset place name and the corresponding relation between the preset place name and the geographic position into the geographic position information corresponding to the preset place name aiming at each preset place name.
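The data structure built by the units above — geographical position sets keyed by the characters or vocabularies that name each position — can be sketched as a simple mapping. The function names and the tuple layout below are illustrative assumptions; the patent does not prescribe a concrete implementation.

```python
# Hypothetical sketch of the character-to-geographical-position
# correspondence built by the alignment unit and the first storage unit.

def build_correspondence(geo_infos):
    """geo_infos: iterable of (geo_position, words) pairs, where `words`
    lists the characters/vocabularies used for representing that
    position's name. Returns a mapping from each word to the geographical
    position sets whose information contains it."""
    correspondence = {}
    for geo_position, words in geo_infos:
        # One "geographical position set" per position: the position
        # itself plus the name words associated with it.
        geo_set = {"position": geo_position, "words": list(words)}
        for word in words:
            correspondence.setdefault(word, []).append(geo_set)
    return correspondence


def lookup(correspondence, target_word):
    """For a target character recognized by the positioning model, search
    the geographical position information containing it and return the
    corresponding geographical position(s)."""
    return [s["position"] for s in correspondence.get(target_word, [])]
```

This mirrors the determining step: once the positioning model yields a target character, `lookup` finds every geographical position whose information contains that character.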
According to the position recognition method and apparatus, the electronic device and the storage medium provided by the present application, after voice data is obtained, the voice data is converted into text information through a voice recognition technology. Because voice data is easily disturbed by the environment, noise, signals and the like, the text information converted through the voice recognition technology contains certain errors, and these errors change some characters in the text information into other characters. The text information is therefore input into the pre-trained positioning model, and characters whose semantic similarity with a place name exceeds a preset threshold are recognized as target characters. For each target character, the geographical position information containing the target character is searched from the correspondence between the characters and the geographical positions, so that the geographical position in the geographical position information is determined as the geographical position corresponding to the target character. Since voice-to-text conversion easily produces errors, compared with the prior-art approach of determining a place name by word-by-word or character-by-character comparison, the present application recognizes, through the positioning model, characters in the converted text whose semantic similarity with a place name exceeds the preset threshold; that is, even when a converted place name contains wrongly-written characters, the positioning model can still recognize it. This improves the accuracy of recognizing place names in speech and allows the geographical position corresponding to each place name to be determined.
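The overall flow — speech to text, sequence tagging by the positioning model, then grouping tagged characters into candidate place names — can be illustrated with the tag-grouping step, which is the part the patent specifies concretely (a first identifier on the first character of a place name, a second identifier on each non-first character). In the sketch below, "B" and "I" stand in for the first and second identifiers, "O" for untagged characters; the ASR step and the BERT+CRF tagger themselves are assumed to run upstream and are not shown.

```python
# Illustrative sketch: turn per-character tags emitted by the positioning
# model into candidate place-name strings. Tag names are assumptions.

def extract_place_names(chars, tags):
    """chars: characters of the converted text; tags: one tag per
    character ("B" = first identifier, "I" = second identifier,
    "O" = not part of a place name). Returns the place-name strings."""
    names, current = [], ""
    for ch, tag in zip(chars, tags):
        if tag == "B":                 # first identifier: a new place name starts
            if current:
                names.append(current)
            current = ch
        elif tag == "I" and current:   # second identifier: continue the name
            current += ch
        else:                          # not a place-name character
            if current:
                names.append(current)
            current = ""
    if current:
        names.append(current)
    return names
```

Each extracted string would then be looked up in the character-to-geographical-position correspondence to determine its geographical position.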
Example three
Fig. 4 is a schematic structural diagram of an electronic device according to a third embodiment of the present application, comprising: a processor 401, a storage medium 402 and a bus 403. The storage medium 402 stores machine-readable instructions executable by the processor 401. When the electronic device is running, the processor 401 communicates with the storage medium 402 via the bus 403, and the processor 401 executes the machine-readable instructions to perform the steps of the method according to the first embodiment.
In this embodiment of the application, the storage medium 402 may further store other machine-readable instructions that the processor 401 executes to perform the other methods described in the first embodiment; for the specific steps and principles of those methods, reference is made to the description of the first embodiment, which is not repeated here.
Example four
A fourth embodiment of the present application further provides a computer-readable storage medium on which a computer program is stored; when executed by a processor, the computer program performs the steps of the method in the first embodiment.
In this embodiment of the application, the computer program, when executed by a processor, may further perform the other methods described in the first embodiment; for the specific method steps and principles, reference is made to the description of the first embodiment, which is not repeated here.
In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus and method may be implemented in other ways. The above-described apparatus embodiments are merely illustrative, and for example, the division of the modules is merely a logical division, and there may be other divisions in actual implementation, and for example, a plurality of modules or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection of devices or modules through some communication interfaces, and may be in an electrical, mechanical or other form.
The modules described as separate parts may or may not be physically separate, and parts displayed as modules may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a non-volatile computer-readable storage medium executable by a processor. Based on such understanding, the technical solution of the present application or portions thereof that substantially contribute to the prior art may be embodied in the form of a software product stored in a storage medium and including instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present application. And the aforementioned storage medium includes: various media capable of storing program codes, such as a U disk, a removable hard disk, a ROM, a RAM, a magnetic disk, or an optical disk.
The above description is only for the specific embodiments of the present application, but the scope of the present application is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present application, and shall be covered by the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (10)

1. A method of location identification, comprising:
after voice data are acquired, converting the voice data into text information through a voice recognition technology;
after the text information is input into a pre-trained positioning model for fuzzy recognition of characters, recognizing, through the positioning model, characters in the text information whose semantic similarity with a place name exceeds a preset threshold, so as to take the characters as target characters;
and for each target character, searching the geographical position information containing the target character from the correspondence between the characters and the geographical positions, so as to determine the geographical position in the geographical position information as the geographical position corresponding to the target character.
2. The method of claim 1, wherein the positioning model is trained by:
after a plurality of sample voice data are obtained, converting the sample voice data into a training text through a voice recognition technology aiming at each sample voice data;
marking the first character used for representing the place name in the training text with a first identifier, and marking each non-first character used for representing the place name with a second identifier, so as to take the training text carrying the first identifier and the second identifier as a target training text;
inputting a data set containing a plurality of target training texts into a BERT + CRF model so as to train the BERT + CRF model into the positioning model in a supervised learning mode.
3. The method of claim 2, wherein inputting a data set comprising a plurality of target training texts into a BERT + CRF model to train the BERT + CRF model into the positioning model by supervised learning comprises:
respectively putting the target training texts in the data set into a training set, a verification set and a test set according to a preset proportion;
after the training set, the verification set and the test set are respectively input into the BERT + CRF model, for each hyper-parameter preset for the BERT + CRF model, training the BERT + CRF model under the hyper-parameter through at least one target training text in the training set and the first identifier and the second identifier carried by the at least one target training text, so as to obtain parameters of the BERT + CRF model under the hyper-parameter; the BERT + CRF model recognizes, according to the parameters, characters in the text information whose semantic similarity with a place name exceeds a preset threshold;
after the parameters are obtained, verifying, for each hyper-parameter, the parameters of the BERT + CRF model under the hyper-parameter through the verification set, so as to obtain a first recognition rate of the model under the parameters; the first recognition rate is the success rate at which the BERT + CRF model under the parameters recognizes characters, in the target training texts in the verification set, whose semantic similarity with a place name exceeds a preset threshold;
after determining the hyper-parameter of the BERT + CRF model according to the first recognition rate, testing the parameters of the BERT + CRF model under the hyper-parameter through the test set, so as to obtain a second recognition rate of the model under the parameters; the second recognition rate is the success rate at which the BERT + CRF model under the parameters recognizes characters, in the target training texts in the test set, whose semantic similarity with a place name exceeds a preset threshold;
and judging whether the second recognition rate is greater than or equal to a preset recognition rate, and when the second recognition rate of the BERT + CRF model is greater than or equal to the preset recognition rate, using the model as the positioning model, so as to recognize, through the positioning model, characters in the text information whose semantic similarity with a place name exceeds a preset threshold.
4. The method of claim 2, wherein the correspondence between the text and the geographic location is constructed by:
after acquiring at least one piece of geographical position information, each piece comprising a geographical position, characters used for representing the name of the geographical position, and a correspondence between the characters and the geographical position, putting, for each geographical position in the geographical position information, the geographical position, the at least one character or vocabulary used for representing the name of the geographical position, and the correspondence between the at least one character or vocabulary and the geographical position, from the at least one piece of geographical position information corresponding to the geographical position, into a geographical position set corresponding to the geographical position;
and storing the geographical position set into the corresponding relation between the characters and the geographical positions.
5. The method according to claim 4, wherein before acquiring at least one piece of geographical location information including a geographical location, a text indicating a name of the geographical location, and a correspondence between the text and the geographical location, the method further comprises:
for each marked word, taking the marked word as a preset place name, and setting a geographical position corresponding to the preset place name for each preset place name; the marked vocabulary is a vocabulary formed by characters carrying target identifications, and the target identifications comprise first identifications and second identifications;
and aiming at each preset place name, storing the preset place name, the geographic position of the preset place name and the corresponding relation between the preset place name and the geographic position into the geographic position information corresponding to the preset place name.
6. A position recognition apparatus, characterized in that the apparatus comprises:
the conversion unit is used for converting the voice data into text information through a voice recognition technology after the voice data is obtained;
the positioning unit is used for, after the text information is input into the pre-trained positioning model for fuzzy recognition of characters, recognizing, through the positioning model, characters in the text information whose semantic similarity with a place name exceeds a preset threshold, so as to take the characters as target characters;
and the determining unit is used for searching, for each target character, the geographical position information containing the target character from the correspondence between the characters and the geographical positions, so as to determine the geographical position in the geographical position information as the geographical position corresponding to the target character.
7. The apparatus of claim 6, further comprising:
the voice recognition system comprises a sample unit, a voice recognition unit and a voice recognition unit, wherein the sample unit is used for converting sample voice data into a training text through a voice recognition technology aiming at each sample voice data after the plurality of sample voice data are obtained;
a marking unit, configured to mark a first character of the training text for representing a name of a place name with a first identifier, mark a second character of the training text for representing a name of a place name with a non-first identifier, and use the training text carrying the first identifier and the second identifier as a target training text;
the input unit is used for inputting a data set containing a plurality of target training texts into a BERT + CRF model, and the BERT + CRF model is trained into the positioning model in a supervised learning mode.
8. The apparatus of claim 7, wherein the input unit is specifically configured to:
respectively putting the target training texts in the data set into a training set, a verification set and a test set according to a preset proportion;
after the training set, the verification set and the test set are respectively input into the BERT + CRF model, for each hyper-parameter preset for the BERT + CRF model, training the BERT + CRF model under the hyper-parameter through at least one target training text in the training set and the first identifier and the second identifier carried by the at least one target training text, so as to obtain parameters of the BERT + CRF model under the hyper-parameter; the BERT + CRF model recognizes, according to the parameters, characters in the text information whose semantic similarity with a place name exceeds a preset threshold;
after the parameters are obtained, verifying, for each hyper-parameter, the parameters of the BERT + CRF model under the hyper-parameter through the verification set, so as to obtain a first recognition rate of the model under the parameters; the first recognition rate is the success rate at which the BERT + CRF model under the parameters recognizes characters, in the target training texts in the verification set, whose semantic similarity with a place name exceeds a preset threshold;
after determining the hyper-parameter of the BERT + CRF model according to the first recognition rate, testing the parameters of the BERT + CRF model under the hyper-parameter through the test set, so as to obtain a second recognition rate of the model under the parameters; the second recognition rate is the success rate at which the BERT + CRF model under the parameters recognizes characters, in the target training texts in the test set, whose semantic similarity with a place name exceeds a preset threshold;
and judging whether the second recognition rate is greater than or equal to a preset recognition rate, and when the second recognition rate of the BERT + CRF model is greater than or equal to the preset recognition rate, using the model as the positioning model, so as to recognize, through the positioning model, characters in the text information whose semantic similarity with a place name exceeds a preset threshold.
9. An electronic device, comprising: a processor, a storage medium and a bus, the storage medium storing machine-readable instructions executable by the processor, the processor and the storage medium communicating via the bus when the electronic device is operating, the processor executing the machine-readable instructions to perform the steps of the position recognition method according to any one of claims 1 to 5.
10. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the steps of the position recognition method according to any one of claims 1 to 5.
CN202110830026.9A 2021-07-22 2021-07-22 Position identification method and device, electronic equipment and storage medium Active CN113539270B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110830026.9A CN113539270B (en) 2021-07-22 2021-07-22 Position identification method and device, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN113539270A true CN113539270A (en) 2021-10-22
CN113539270B CN113539270B (en) 2024-04-02

Family

ID=78120425

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110830026.9A Active CN113539270B (en) 2021-07-22 2021-07-22 Position identification method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN113539270B (en)

Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101158584A (en) * 2007-11-15 2008-04-09 熊猫电子集团有限公司 Voice destination navigation realizing method of vehicle mounted GPS
CN102426015A (en) * 2011-09-06 2012-04-25 深圳市凯立德科技股份有限公司 Search method of navigation system interest points, and position service terminal
CN102867511A (en) * 2011-07-04 2013-01-09 余喆 Method and device for recognizing natural speech
CN103020230A (en) * 2012-12-14 2013-04-03 中国科学院声学研究所 Semantic fuzzy matching method
CN106782560A (en) * 2017-03-06 2017-05-31 海信集团有限公司 Determine the method and device of target identification text
CN109785840A (en) * 2019-03-05 2019-05-21 湖北亿咖通科技有限公司 The method, apparatus and vehicle mounted multimedia host, computer readable storage medium of natural language recognition
CN110956955A (en) * 2019-12-10 2020-04-03 苏州思必驰信息科技有限公司 Voice interaction method and device
CN110968800A (en) * 2019-11-26 2020-04-07 北京明略软件系统有限公司 Information recommendation method and device, electronic equipment and readable storage medium
CN111797182A (en) * 2020-05-29 2020-10-20 深圳市跨越新科技有限公司 Address code analysis method and system
CN112539762A (en) * 2020-11-26 2021-03-23 中国联合网络通信集团有限公司 Navigation method and vehicle-mounted navigation equipment
CN112566017A (en) * 2020-11-23 2021-03-26 广东美她实业投资有限公司 Subway taking reminding method and device based on intelligent earphones and readable storage medium
CN112633003A (en) * 2020-12-30 2021-04-09 平安科技(深圳)有限公司 Address recognition method and device, computer equipment and storage medium

Also Published As

Publication number Publication date
CN113539270B (en) 2024-04-02

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant