CN110298039B - Event place identification method, system, equipment and computer readable storage medium

Info

Publication number: CN110298039B
Application number: CN201910539293.3A
Authority: CN
Inventors: 韩翠云; 陈玉光; 刘远圳; 潘禄; 施茜
Original assignee: Beijing Baidu Netcom Science and Technology Co Ltd
Current assignee: Beijing Baidu Netcom Science and Technology Co Ltd
Priority date: 2019-06-20
Filing date: 2019-06-20
Publication date: 2023-05-30
Anticipated expiration: 2039-06-20
Also published as: CN110298039A

Abstract

The embodiment of the invention provides a method, a system, equipment and a computer readable storage medium for identifying an event place. The method comprises the following steps: extracting candidate place words in event information, wherein the event information comprises a title and a text; and inputting the candidate place words and the corresponding titles and place sentences into a pre-trained recognition model so that the recognition model recognizes whether the candidate place words are event places in the place sentences, wherein the place sentences are sentences in which the place words are located. The embodiment of the invention can accurately identify the occurrence place of the event.

Description

Event place identification method, system, equipment and computer readable storage medium

Technical Field

The embodiment of the invention relates to the technical field of communication, in particular to a method, a system, equipment and a computer readable storage medium for identifying an event place.

Background

The event map is a network map that takes events as nodes and the relationships among events as edges. Each event node is composed of the attributes of the event, and the place is one of the important attributes, so identifying the place where an event occurs is important for constructing the event map.

At present, some existing technologies can identify the occurrence places of some events. However, for scenarios in which an event occurrence place and event-related places appear at the same time, they cannot distinguish the occurrence place from the related places, so the accuracy of identifying the event occurrence place is low.

Disclosure of Invention

The embodiment of the invention provides a method, a system, equipment and a computer readable storage medium for identifying an event place, so as to improve the identification precision of the event place.

In a first aspect, an embodiment of the present invention provides a method for identifying an event venue, including: extracting candidate place words in event information, wherein the event information comprises a title and a text; and inputting the candidate place words and the corresponding titles and place sentences into a pre-trained recognition model so that the pre-trained recognition model recognizes whether the candidate place words are event places in the place sentences, wherein the place sentences are sentences in which the place words are located.

In a second aspect, an embodiment of the present invention provides an identification system for an event venue, including: the extraction module is used for extracting candidate place words in the event information, wherein the event information comprises a title and a text; and the input and recognition module is used for inputting the candidate place words and the corresponding titles and place sentences into a pre-trained recognition model so that the pre-trained recognition model can recognize whether the candidate place words are event places in the place sentences or not, and the place sentences are sentences in which the place words are located.

In a third aspect, an embodiment of the present invention provides an apparatus for identifying an event venue, including:

a memory;

a processor; and

a computer program;

wherein the computer program is stored in the memory and configured to be executed by the processor to implement the method of the first aspect.

In a fourth aspect, embodiments of the present invention provide a computer readable storage medium having stored thereon a computer program for execution by a processor to implement the method of the first aspect.

According to the method, the system, the equipment and the computer readable storage medium for identifying event places provided by the embodiment of the invention, candidate place words in event information are extracted, wherein the event information comprises a title and a text; and the candidate place words and the corresponding titles and place sentences are input into a pre-trained recognition model so that the pre-trained recognition model recognizes whether the candidate place words are event places in the place sentences, wherein the place sentences are sentences in which the place words are located. Since the recognition model considers the title when identifying the event occurrence place, it can distinguish the event occurrence place from event-related places, thereby accurately identifying the event occurrence place.

Drawings

FIG. 1 is a flowchart of a method for identifying an event venue according to an embodiment of the present invention;

FIG. 2 is a flowchart of a method for identifying an event venue according to another embodiment of the present invention;

FIG. 3 is a schematic diagram of a system for identifying an event venue according to an embodiment of the present invention;

FIG. 4 is a schematic structural diagram of an event location identification device according to an embodiment of the present invention.

Specific embodiments of the present disclosure have been shown by way of the above drawings and will be described in more detail below. These drawings and the written description are not intended to limit the scope of the disclosed concepts in any way, but rather to illustrate the disclosed concepts to those skilled in the art by reference to specific embodiments.

Detailed Description

Reference will now be made in detail to exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, the same numbers in different drawings refer to the same or similar elements, unless otherwise indicated. The implementations described in the following exemplary examples are not representative of all implementations consistent with the present disclosure. Rather, they are merely examples of apparatus and methods consistent with some aspects of the present disclosure as detailed in the accompanying claims.

The event place identification method provided by the embodiment of the invention can be applied to devices such as terminal devices, smart watches, tablet computers and the like.

The method for identifying the event places aims to solve the technical problems in the prior art.

The following describes the technical scheme of the present invention and how the technical scheme of the present application solves the above technical problems in detail with specific embodiments. The following embodiments may be combined with each other, and the same or similar concepts or processes may not be described in detail in some embodiments. Embodiments of the present invention will be described below with reference to the accompanying drawings.

Fig. 1 is a flowchart of a method for identifying an event venue according to an embodiment of the present invention. Aiming at the technical problems in the prior art, the embodiment of the invention provides an event place identification method, which comprises the following specific steps:

Step 101, extracting candidate place words in event information, wherein the event information comprises a title and a text.

In this embodiment, the event information may be, for example, a news item that includes a title and a body. Extracting candidate place words in the event information means extracting all place words from the title and the text of the event information. For example, a news item has the title: "A magnitude 4.1 earthquake occurs in Zhaotong, Yunnan; tremors are clearly felt in Leshan, Yaan and other places in Sichuan". The text is: "A magnitude 4.1 earthquake occurred in Yongshan county, Zhaotong city, Yunnan (28.11 degrees north latitude, 103.63 degrees east longitude) at 15:00 on June 5, with a focal depth of 8 kilometers. After the earthquake, netizens in Sichuan reported that tremors were clearly felt in Leshan, Yibin, Yaan and other places." The candidate place words extracted from the title and the body include: "Yunnan", "Zhaotong city", "Yongshan county", "Sichuan", "Leshan" and "Yaan".

Step 102, inputting the candidate place words and the corresponding titles and place sentences into a pre-trained recognition model, so that the pre-trained recognition model recognizes whether the candidate place words are event places in the place sentences, wherein the place sentences are sentences in which the place words are located.
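The place sentence of a candidate place word can be located with a simple sentence split followed by a containment test. The following is only a minimal sketch: the function names `split_sentences` and `place_sentences`, the punctuation-based splitting rule and the shortened sample text are assumptions made for illustration and are not specified by the embodiment.

```python
import re


def split_sentences(text):
    """Split text into sentences at ., ! or ? followed by whitespace.

    Requiring whitespace after the punctuation keeps numbers such as
    "4.1" in one piece. This splitting rule is an assumption.
    """
    parts = re.split(r"(?<=[.!?])\s+", text)
    return [p.strip() for p in parts if p.strip()]


def place_sentences(body, place_word):
    """Return every sentence of the body that contains the place word."""
    return [s for s in split_sentences(body) if place_word in s]


# Shortened, illustrative version of the earthquake news body.
body = (
    "A magnitude 4.1 earthquake occurred in Yongshan county, Zhaotong city, Yunnan. "
    "Netizens in Sichuan reported clear tremors in Leshan and Yaan."
)
print(place_sentences(body, "Leshan"))
```

A place word that appears in several sentences yields several place sentences, and each (place word, place sentence) pair can then be judged separately by the recognition model.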

Optionally, before inputting the candidate place words and the corresponding titles and place sentences into a pre-trained recognition model, so that the pre-trained recognition model recognizes whether the candidate place words are event places in the place sentences, the method of the embodiment of the present invention further includes: constructing an identification model to be trained; acquiring a training sample, wherein the training sample comprises a title and a text; extracting candidate place words in the title and the text of the training sample, and labeling the candidate place words to obtain labeling results, wherein the labeling results comprise whether the candidate place words are place words, whether the candidate place words are event occurrence places and whether the candidate place words are event-related places; inputting the labeling result of the candidate place words, the place sentences corresponding to the candidate place words and the titles corresponding to the candidate place words into the recognition model to be trained; and training the recognition model to be trained until a preset training index is reached.

Specifically, the pre-trained recognition model may be obtained by training a deep-learning-based classification model, for example a model fine-tuned from BERT. Training the classification model mainly includes acquiring training samples and labeling them. The training samples may be obtained by recalling news information from each field over a period of time, for example news from the last year, from an event map resource library. A preset number of events, for example 2000, are then randomly drawn from the recalled news and organized into the format of event ID number, news link, title and text. All place words contained in the title and the text are extracted for subsequent manual labeling.

Optionally, manually labeling the training sample includes: determining whether the title and the text describe a target event; if so, determining whether the title and the text contain place words; and if place words are contained, classifying and labeling them. Optionally, the annotator may also check whether any place words were missed during extraction, and if so, extract the missed place words manually.

Further, the labeled data is input into a deep-learning-based binary classification model in the format "title + place sentence + place word", so that the model learns whether the labeled place word is the event occurrence place in the place sentence. During learning, the model outputs a score representing the probability that the place word is the event occurrence place in the place sentence; the larger the score, the higher that probability. Training ends when the training result reaches the training index. Optionally, the training index may be that the probability value output by the model reaches a probability threshold.
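One way to picture the "title + place sentence + place word" input is as a single text with a binary label. The sketch below is illustrative only: the `Example` class, the `build_example` helper and the " [SEP] " separator are assumptions, since the embodiment names the field order but not a concrete encoding.

```python
from dataclasses import dataclass

# Assumed separator between the three fields; not given by the embodiment.
SEP = " [SEP] "


@dataclass
class Example:
    text: str   # title + place sentence + place word
    label: int  # 1 = event occurrence place, 0 = not the occurrence place


def build_example(title, place_sentence, place_word, is_event_place):
    """Assemble one labeled input for a binary classifier."""
    text = SEP.join([title, place_sentence, place_word])
    return Example(text=text, label=int(is_event_place))


title = "A magnitude 4.1 earthquake occurs in Zhaotong, Yunnan"
sentence = "Netizens in Sichuan reported clear tremors in Leshan and Yaan."
positive = build_example(title, title, "Zhaotong", True)
negative = build_example(title, sentence, "Leshan", False)
print(positive.label, negative.label)
```

Because the title is part of every input, the classifier sees the same headline next to each place sentence, which is what lets it separate the occurrence place from places that are merely related to the event.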

The title may be used to distinguish between an event occurrence location and an event correlation location, where the location sentence refers to a sentence in which the extracted location word is located.

After the recognition model is obtained by training through the above steps, it can be used to identify the occurrence place of an event. In use, place words, place sentences and titles are input in the same format, and the model identifies whether each place word is the event occurrence place in its place sentence.

The embodiment of the invention extracts candidate place words in the event information, wherein the event information comprises a title and a text; and inputting the candidate place words and the corresponding titles and place sentences into a pre-trained recognition model so that the recognition model recognizes whether the candidate place words are event places in the place sentences, wherein the place sentences are sentences in which the place words are located. Since the identification model considers the title when identifying the event occurrence place, it is possible to distinguish the event occurrence place from the event-related place, thereby accurately identifying the event occurrence place.

Optionally, extracting candidate place words in the event information or extracting candidate place words in the title and the text of the training sample includes at least one processing method of:

First kind: extracting the geographical nouns in the title and the body as candidate place words. Specifically, it is determined whether the title and the text contain geographical nouns, and if so, the geographical nouns are extracted as place words.

Second kind: performing word segmentation on the title and the text, and performing part-of-speech analysis on the segmentation result to obtain candidate place words. Optionally, an existing word segmentation tool may be used to segment the title and the text into a plurality of words, and the parts of speech of these words are then labeled. Specifically, the part-of-speech labeling includes labeling the words as general nouns, modifiers, noun phrases, verb phrases and so on. Finally, function words, quantifiers, personification words, stop words and the like are removed from the words.

Third kind: extracting the administrative-division-class place words in the title and the body as candidate place words according to an administrative division dictionary file. Specifically, the administrative division dictionary file is parsed to obtain place words at the country, province, state, city, county, district and similar levels. It is then determined whether the title and the text contain any of the parsed place words, and if so, those place words are extracted as candidate place words.

Fourth kind: performing regular-expression matching on the title and the text with regular matching templates to obtain candidate place words. Specifically, sentences in the title and the text are matched against the templates to mine potential candidate place words. Examples of regular matching templates are "held at (.*)" and "attended at (.*) [meeting|activity|forum]".

The title and the text in the above four processing methods refer to the title and the text in the event information or the training sample.

Optionally, the embodiment of the invention may use one of the four modes to extract place words, or a combination of two or three of them. Of course, the title and the text may also be processed by all four modes in sequence to extract candidate place words, which ensures that potential candidate place words in the title and the text are mined to the greatest extent.
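Combining several extraction modes amounts to taking the union of their results. The sketch below illustrates this for the dictionary-based and template-based modes only. The dictionary contents in `ADMIN_PLACES`, the English template in `TEMPLATES` and all function names are hypothetical and chosen for the example; a real system would load an administrative division dictionary file and use its own templates.

```python
import re

# Hypothetical mini administrative-division dictionary (third mode).
ADMIN_PLACES = ["Yunnan", "Zhaotong city", "Yongshan county", "Sichuan", "Leshan", "Yaan"]

# Hypothetical regular matching template (fourth mode); group 1 is the place.
TEMPLATES = [re.compile(r"held (?:at|in) ([A-Z][\w ]*?)(?=[,.]|$)")]


def extract_by_dictionary(text):
    """Return dictionary place words that occur in the text."""
    return [place for place in ADMIN_PLACES if place in text]


def extract_by_templates(text):
    """Return the places captured by any template."""
    found = []
    for pattern in TEMPLATES:
        found.extend(m.group(1).strip() for m in pattern.finditer(text))
    return found


def extract_candidates(title, body):
    """Union of both modes over title and body, without duplicates."""
    seen, result = set(), []
    for text in (title, body):
        for word in extract_by_dictionary(text) + extract_by_templates(text):
            if word not in seen:
                seen.add(word)
                result.append(word)
    return result


print(extract_candidates(
    "A 4.1 earthquake occurs in Zhaotong city, Yunnan",
    "The forum was held in Boao, Hainan.",
))
```

Keeping first-seen order and dropping duplicates means a place word found by more than one mode is passed to the recognition model only once.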

Optionally, after inputting the candidate place words and the corresponding titles and place sentences into a pre-trained recognition model, so that the recognition model recognizes whether the candidate place words are event places in the place sentences, the method of the embodiment of the present invention further includes: processing the candidate place words corresponding to the identified event places into addresses in a preset format. For example, the candidate place words corresponding to the event occurrence place identified in the above embodiment are mapped to administrative units such as province, city and county/district, and are processed into places in the five-level format "country-province/state-city-county/district-address". If the upper-level administrative division of a city is a province or state, the final result is a country-province/state-city address. If the upper level of the city is a country, the final result is a country-city address.
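The five-level format can be read as "join whichever levels are known, highest level first". The following sketch shows that reading; the key names in `LEVELS` and the `format_address` helper are assumptions made for illustration.

```python
# Assumed names for the five levels, ordered from highest to lowest.
LEVELS = ("country", "province_or_state", "city", "county_or_district", "address")


def format_address(parts):
    """Join the levels that are present with '-', skipping missing ones."""
    return "-".join(parts[level] for level in LEVELS if parts.get(level))


# A city whose upper level is a province: country-province/state-city.
print(format_address({
    "country": "China",
    "province_or_state": "Hunan province",
    "city": "Changsha city",
}))

# A city whose upper level is the country: country-city.
print(format_address({"country": "China", "city": "Beijing"}))
```

Skipping absent levels rather than emitting empty fields keeps the country-city case from producing a doubled separator.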

Fig. 2 is a flowchart of a method for identifying an event venue according to another embodiment of the present invention. On the basis of the above embodiment, processing the candidate place words corresponding to the identified event places into addresses in a preset format includes:

Step 201, performing word segmentation on the candidate place words;

Step 202, performing part-of-speech analysis on the word segmentation result to obtain fine-grained place words;

The candidate place words extracted in the foregoing embodiments may be coarse, for example in the form "Changsha city of Hunan province". Therefore, an existing word segmentation tool may be used to further segment the extracted candidate place words into finer-grained place words; in this example, further segmentation yields the finer-grained place words "Hunan province" and "Changsha city".

Optionally, performing part-of-speech analysis on the segmentation result includes: and labeling a plurality of words as general nouns, modifiers, noun phrases, verb phrases and the like, and finally removing words such as virtual words, quantitative words, personification words, stop words and the like from the plurality of words.

Step 203, when the fine-grained place word belongs to the administrative-division class of place words, processing the fine-grained place word into an address in a preset format by using an administrative division dictionary. Optionally, administrative-division-class place words include xx province, xx city, xx county, xx district, xx town and the like. For example, if a fine-grained place word is "Changsha city", the administrative division dictionary shows that the upper-level administrative division of Changsha city is Hunan province, and the address in the preset format is obtained, namely Changsha city, Hunan province.

Optionally, in the case that the fine-grained place word belongs to the administrative division type place word, processing the fine-grained place word into the address in the preset format by adopting the administrative division dictionary includes: under the condition that the fine-granularity place words belong to administrative division place words, acquiring the upper-level administrative division place words corresponding to the administrative division place words according to an administrative division dictionary until the highest-level administrative division place words are acquired; and processing the administrative division type place words into addresses which comprise the place words of the administrative division level step by step up to the highest level according to the administrative division level.
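Acquiring upper-level divisions until the highest level is reached is a walk up a parent table. The sketch below assumes a hypothetical `PARENT` mapping that stands in for the administrative division dictionary, and the function name `expand_admin_place` is likewise illustrative.

```python
# Hypothetical administrative division dictionary:
# place word -> its upper-level administrative division.
PARENT = {
    "Changsha city": "Hunan province",
    "Hunan province": "China",
}


def expand_admin_place(place, parent=PARENT):
    """Walk upward to the highest level and return a top-down address."""
    chain = [place]
    while chain[-1] in parent:
        upper = parent[chain[-1]]
        if upper in chain:  # guard against cycles in a malformed dictionary
            break
        chain.append(upper)
    return "-".join(reversed(chain))


print(expand_admin_place("Changsha city"))
```

A place word with no entry in the table is treated as already being the highest level and is returned unchanged.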

Optionally, as shown in fig. 2, after performing part-of-speech analysis on the word segmentation result to obtain the place word with fine granularity, the method in the embodiment of the present invention further includes:

Step 204, when the fine-grained place word belongs to the organization class of place words, processing the fine-grained place word into an address in a preset format by using a preset mapping relationship between entities and places.

Optionally, in the case that the fine-grained place word belongs to the organization structure type place word, processing the fine-grained place word into the address in the preset format by adopting the mapping relationship between the preset entity and the place includes: under the condition that the fine-granularity place words belong to organization place words, sequentially acquiring the upper-level place words corresponding to the organization place words according to a preset mapping relation between an entity and places until the highest-level place words; and processing the organization place words into addresses from the organization place words step by step upwards to the highest place words.

Optionally, organization-class place words include entities such as xx street, xx community, xx residential quarter, xx school and xx building. For example, if a fine-grained place word is xx university, it may be processed, according to the preset mapping relationship between entities and places, into a detailed address in the preset format "country-province/state-city-county/district-xx university". The entity may also be a house, a shop, a mailbox or a bus station. Since entities with the same name may exist, one entity may end up with multiple addresses, yielding a list of addresses.
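The same-name case can be sketched by letting one entity map to several places and expanding each of them upward. Everything in the tables below is hypothetical: `ENTITY_PLACES` stands in for the preset entity-to-place mapping, `PLACE_PARENT` for the upper-level lookup, and the two cities are arbitrary illustrative choices for "xx university".

```python
# Hypothetical preset mapping: entity -> places where an entity of that
# name exists. Two entries model two same-name entities.
ENTITY_PLACES = {"xx university": ["Changsha city", "Wuhan city"]}

# Hypothetical upper-level lookup (assumed to be free of cycles).
PLACE_PARENT = {
    "Changsha city": "Hunan province",
    "Wuhan city": "Hubei province",
    "Hunan province": "China",
    "Hubei province": "China",
}


def resolve_entity(entity):
    """Return one top-down address per place in which the entity exists."""
    addresses = []
    for place in ENTITY_PLACES.get(entity, []):
        chain = [entity, place]
        while chain[-1] in PLACE_PARENT:
            chain.append(PLACE_PARENT[chain[-1]])
        addresses.append("-".join(reversed(chain)))
    return addresses


print(resolve_entity("xx university"))
```

An entity that is absent from the mapping yields an empty list, which a caller can treat as "no address could be resolved".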

In addition, in order to increase the robustness of the recognition model, candidate place words with scores exceeding a score threshold are determined according to the scoring of the recognition model on the extracted candidate place words, and the candidate place words with scores exceeding the score threshold are processed into addresses in a preset format.

For example, the recognition model scores the recognized place word "Xinhua district" as 0.9, "Shijiazhuang Xinhua district" as 0.8 and "Beijing" as 0.4. With the score threshold set to 0.5, the place word "Beijing" is filtered out, and "Xinhua district" and "Shijiazhuang Xinhua district" are processed into addresses in a preset format.

Further, taking "Xinhua district" as an example, word segmentation and part-of-speech analysis first yield "Xinhua district", which is then looked up in the administrative division dictionary. If two results are found, namely Xinhua district -> [Shijiazhuang, Cangzhou], the score is split between them and the processed addresses are: "China-Hebei province-Shijiazhuang-Xinhua district-0.45, China-Hebei province-Cangzhou city-Xinhua district-0.45".

Similarly, looking up "Shijiazhuang Xinhua district" in the administrative division dictionary gives the processed address: "China-Hebei province-Shijiazhuang-Xinhua district-0.8".

Combining the two results gives "China-Hebei province-Shijiazhuang-Xinhua district-1.25, China-Hebei province-Cangzhou city-Xinhua district-0.45". During combination, the levels are merged from the highest to the lowest in the order of administrative division, for example country, province, city and county.
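The thresholding and merging arithmetic of this example can be checked with a few lines of code. The sketch below is a reading of the example rather than a prescribed algorithm: the `LOOKUP` table is a hypothetical stand-in for the administrative division dictionary (Shijiazhuang and Cangzhou are the two cities in Hebei province with a Xinhua district), and splitting a score equally among ambiguous matches is inferred from 0.9 becoming 0.45 and 0.45.

```python
from collections import defaultdict

# Hypothetical dictionary lookup: place word -> matching full addresses.
LOOKUP = {
    "Xinhua district": [
        "China-Hebei province-Shijiazhuang-Xinhua district",
        "China-Hebei province-Cangzhou city-Xinhua district",
    ],
    "Shijiazhuang Xinhua district": [
        "China-Hebei province-Shijiazhuang-Xinhua district",
    ],
}


def merge_scores(scored_words, threshold=0.5):
    """Drop low-scoring words, split each score equally across its
    matching addresses, and sum the shares per address."""
    totals = defaultdict(float)
    for word, score in scored_words.items():
        if score <= threshold:
            continue  # candidate does not exceed the score threshold
        addresses = LOOKUP.get(word, [])
        for address in addresses:
            totals[address] += score / len(addresses)
    return dict(totals)


merged = merge_scores({
    "Xinhua district": 0.9,
    "Shijiazhuang Xinhua district": 0.8,
    "Beijing": 0.4,
})
for address, total in merged.items():
    print(address, round(total, 2))
```

Up to floating-point rounding, the Shijiazhuang address accumulates 0.45 + 0.8 = 1.25 and the Cangzhou address keeps 0.45, so the address supported by more candidate place words ends up with the higher combined score.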

Fig. 3 is a schematic structural diagram of an event place identification system according to an embodiment of the present invention. The event place identification system provided in the embodiment of the present invention may execute the processing flow provided in the embodiment of the event place identification method. As shown in fig. 3, the event place identification system 30 includes: an extraction module 31 and an input and recognition module 32; the extraction module 31 is configured to extract candidate place words in event information, where the event information includes a title and a text; the input and recognition module 32 is configured to input the candidate place words and the corresponding titles and place sentences into a pre-trained recognition model, so that the recognition model recognizes whether the candidate place words are event occurrence places in the place sentences, where the place sentences are the sentences in which the place words are located.

Optionally, the system 30 of the embodiment of the present invention further includes: a construction module 33, an acquisition module 34, an input module 35 and a training module 36; wherein, the construction module 33 is configured to construct an identification model to be trained; an acquisition module 34, configured to acquire a training sample, where the training sample includes a title and a text; the extracting module 31 is further configured to extract candidate place words in the title and the text of the training sample, and label the candidate place words to obtain a labeling result, where the labeling result includes whether the candidate place words are place words, whether the candidate place words are event places and whether the candidate place words are event-related places; the input module 35 is configured to input the labeling result of the candidate place word, the place sentence corresponding to the candidate place word, and the title corresponding to the candidate place word into the recognition model to be trained; and the training module 36 is configured to train the recognition model to be trained until a preset training index is reached.

Optionally, when the extracting module 31 extracts candidate place words in the event information or extracts candidate place words in the title and the text of the training sample, at least one of the following processes are included: extracting the title and the geographic nouns in the body as the candidate place words; performing word segmentation on the title and the text, and performing part-of-speech analysis on a word segmentation result to obtain the candidate place words; extracting the title and the administrative division category place words in the body according to the administrative division dictionary file to serve as candidate place words; and carrying out regular matching on the title and the text through a regular matching template to obtain candidate place words.

Optionally, the system 30 of the embodiment of the present invention further comprises a processing module 37; the processing module 37 is configured to process the candidate place words corresponding to the identified event occurrence place into an address in a preset format.

Optionally, when the processing module 37 processes the candidate place words corresponding to the identified event place into an address in a preset format, the processing module is specifically configured to: word segmentation is carried out on the candidate place words; part-of-speech analysis is carried out on the word segmentation result to obtain place words with fine granularity; and under the condition that the fine-granularity place words belong to administrative division type place words, adopting an administrative division dictionary to process the fine-granularity place words into addresses in a preset format.

Optionally, the processing module 37 is further configured to process the fine-grained place word into the address in the preset format by adopting a preset mapping relationship between the entity and the place in the case that the fine-grained place word belongs to the organization place word.

Optionally, when the fine-grained place word belongs to an administrative division type place word, the processing module 37 is specifically configured to, when using an administrative division dictionary to process the fine-grained place word into an address in a preset format: under the condition that the fine-granularity place words belong to administrative division place words, acquiring the upper-level administrative division place words corresponding to the administrative division place words according to an administrative division dictionary until the highest-level administrative division place words are acquired; and processing the administrative division type place words into addresses which comprise the place words of the administrative division level step by step up to the highest level according to the administrative division level.

Optionally, when the fine-grained place word belongs to the organization structure type place word, the processing module 37 is specifically configured to, when adopting a mapping relationship between a preset entity and a place to process the fine-grained place word into an address in a preset format: under the condition that the fine-granularity place words belong to organization place words, sequentially acquiring the upper-level place words corresponding to the organization place words according to a preset mapping relation between an entity and places until the highest-level place words; and processing the organization place words into addresses from the organization place words step by step upwards to the highest place words. The system for identifying an event location in the embodiment shown in fig. 3 may be used to implement the technical solution of the above method embodiment, and its implementation principle and technical effects are similar, and are not described herein again.

According to the method, the device and the computer readable storage medium for identifying the event places, provided by the embodiment of the invention, candidate place words in event information are extracted, the event information comprises a title and a text, the candidate place words and the corresponding title and place sentences are input into a pre-trained identification model, so that whether the candidate place words are event places in the place sentences or not is identified by the identification model, and the place sentences are sentences in which the place words are located. Since the identification model considers the title when identifying the event occurrence place, it is possible to distinguish the event occurrence place from the event-related place, thereby accurately identifying the event occurrence place.

Fig. 4 is a schematic structural diagram of an event location identification device according to an embodiment of the present invention. The event location identification device provided by the embodiment of the present invention may execute the processing flow provided by the embodiment of the event location identification method. As shown in fig. 4, the event location identification device 40 includes: a memory 41, a processor 42, a computer program and a communication interface 43; wherein the computer program is stored in the memory 41 and configured to be executed by the processor 42 to perform the steps of the above method embodiments.

The event location identification device in the embodiment shown in fig. 4 may be used to implement the technical solution of the above method embodiment; its implementation principle and technical effects are similar and are not described herein again.

In addition, an embodiment of the present invention also provides a computer-readable storage medium having stored thereon a computer program that is executed by a processor to implement the method for identifying an event venue described in the above embodiment.

According to the method, device and computer-readable storage medium for identifying an event place provided by the embodiments of the invention, at least one place word is extracted from event information, where the event information comprises a title and a text; the at least one place word, together with the corresponding title and place sentence, is input into a pre-trained recognition model, so that the recognition model recognizes the event occurrence place among the at least one place word, where the place sentence is the sentence in which the place word is located. Because the recognition model takes the title into account when identifying the event occurrence place, it can distinguish the event occurrence place from event-related places, and thereby identify the event occurrence place accurately.
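As an illustration of the input described above, the following minimal sketch pairs each candidate place word with the title and with every sentence of the text that contains it. The punctuation-based sentence splitter, the names `ModelInput`, `split_sentences` and `build_model_inputs`, and the tuple layout are assumptions made for this example; the embodiments do not prescribe them, and the recognition model itself is not shown.

```python
import re
from typing import List, NamedTuple


class ModelInput(NamedTuple):
    candidate: str       # candidate place word
    title: str           # title of the event information
    place_sentence: str  # sentence in which the candidate occurs


def split_sentences(text: str) -> List[str]:
    # Hypothetical splitter: break after Chinese or Western
    # sentence-ending punctuation (abbreviations are not handled).
    parts = re.split(r"(?<=[。！？.!?])\s*", text)
    return [p.strip() for p in parts if p.strip()]


def build_model_inputs(title: str, text: str,
                       candidates: List[str]) -> List[ModelInput]:
    # One input per (candidate, sentence) pair in which the candidate appears.
    inputs = []
    for sentence in split_sentences(text):
        for cand in candidates:
            if cand in sentence:
                inputs.append(ModelInput(cand, title, sentence))
    return inputs
```

Each resulting triple would then be scored by the recognition model; because the title is carried in every triple, the model can weigh a candidate against the event that the title describes.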

In the several embodiments provided by the present invention, it should be understood that the disclosed apparatus and method may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative: the division into units is merely a division by logical function, and other divisions are possible in an actual implementation; for instance, multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the mutual coupling, direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection via some interfaces, devices or units, and may be electrical, mechanical or in another form.

The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; they may be located in one place or distributed over a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.

In addition, each functional unit in the embodiments of the present invention may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in hardware plus software functional units.

The integrated units implemented in the form of software functional units described above may be stored in a computer-readable storage medium. The software functional units are stored in a storage medium and include several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) or a processor to perform some of the steps of the methods according to the embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.

It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-described division of the functional modules is illustrated, and in practical application, the above-described functional allocation may be performed by different functional modules according to needs, i.e. the internal structure of the apparatus is divided into different functional modules to perform all or part of the functions described above. The specific working process of the above-described device may refer to the corresponding process in the foregoing method embodiment, which is not described herein again.

Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present invention, and not for limiting the same; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some or all of the technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit of the invention.

Claims

1. A method for identifying an event venue, comprising:

extracting candidate place words in event information, wherein the event information comprises a title and a text;

inputting the candidate place words and the corresponding titles and place sentences into a pre-trained recognition model so that the pre-trained recognition model recognizes whether the candidate place words are event places in the place sentences or not, wherein the place sentences are sentences in which the place words are located;

before the step of inputting the candidate place words and the corresponding titles and place sentences into a pre-trained recognition model so that the pre-trained recognition model recognizes whether the candidate place words are event places in the place sentences, the method further comprises:

constructing an identification model to be trained;

acquiring a training sample, wherein the training sample comprises a title and a text;

extracting candidate place words in the title and the text of the training sample, and labeling the candidate place words to obtain labeling results, wherein the labeling results comprise whether the candidate place words are place words, whether the candidate place words are event occurrence places and whether the candidate place words are event-related places;

inputting the labeling result of the candidate place words, the place sentences corresponding to the candidate place words and the titles corresponding to the candidate place words into the recognition model to be trained;

and training the recognition model to be trained until a preset training index is reached.
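For illustration only, the labeling result and the stopping condition recited in claim 1 can be sketched as follows. The record layout `LabeledCandidate`, the helper `train_until`, and the reading of the "preset training index" as a threshold on an evaluation metric are assumptions of this sketch; the model and its training step are supplied by the caller and are not specified here.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class LabeledCandidate:
    candidate: str          # candidate place word
    title: str              # title of the training sample
    place_sentence: str     # sentence containing the candidate
    is_place_word: bool     # label: is the candidate a place word at all?
    is_event_place: bool    # label: is it the event occurrence place?
    is_related_place: bool  # label: is it only an event-related place?


def train_until(train_one_epoch: Callable[[], None],
                evaluate: Callable[[], float],
                target: float,
                max_epochs: int = 100) -> int:
    """Train until evaluate() reaches target; return the number of epochs run."""
    for epoch in range(1, max_epochs + 1):
        train_one_epoch()
        # Assumed stopping rule: the metric meets the preset threshold.
        if evaluate() >= target:
            return epoch
    return max_epochs
```

The `max_epochs` cap is an added safeguard so that the loop terminates even if the threshold is never reached.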

2. The method of claim 1, wherein the extracting candidate place words in event information or extracting candidate place words in the title and the text of the training sample comprises at least one of:

extracting geographic nouns in the title and the text as the candidate place words;

performing word segmentation on the title and the text, and performing part-of-speech analysis on a word segmentation result to obtain the candidate place words;

extracting, according to an administrative division dictionary file, administrative division type place words in the title and the text as the candidate place words;

and performing regular-expression matching on the title and the text through a regular-expression matching template to obtain the candidate place words.
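For illustration only, two of the extraction options recited above (dictionary lookup and regular-expression matching) can be sketched as follows. The dictionary entries, the pattern and the name `extract_candidates` are hypothetical; an actual administrative division dictionary would be loaded from a file, and the word segmentation option is not shown.

```python
import re
from typing import List, Tuple

# Hypothetical administrative division dictionary entries.
ADMIN_DIVISIONS: Tuple[str, ...] = ("Beijing", "Haidian District", "Hebei Province")

# Hypothetical matching template: capitalized words followed by a place-type suffix.
PLACE_PATTERN = re.compile(r"(?:[A-Z][a-z]+ )+(?:Road|Street|Airport|Station)")


def extract_candidates(title: str, text: str) -> List[str]:
    content = f"{title}\n{text}"
    # Dictionary lookup: keep every dictionary entry that occurs in the content.
    found = [name for name in ADMIN_DIVISIONS if name in content]
    # Regular-expression matching against the template.
    found += PLACE_PATTERN.findall(content)
    # Remove duplicates while preserving first-seen order.
    return list(dict.fromkeys(found))
```

Combining several extractors in this way favors recall; deciding which candidates are event occurrence places is left to the recognition model.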

3. The method of claim 1 or 2, wherein after said inputting said candidate place words and the corresponding titles and place sentences into a pre-trained recognition model, such that said pre-trained recognition model recognizes whether said candidate place words are event occurrence places in said place sentences, said method further comprises:

and processing the candidate place words corresponding to the identified event occurrence places into addresses in a preset format.

4. The method of claim 3, wherein said processing the candidate place words corresponding to the identified event occurrence places into addresses in a preset format comprises:

performing word segmentation on the candidate place words;

performing part-of-speech analysis on the word segmentation result to obtain fine-grained place words;

and under the condition that the fine-grained place words belong to administrative division type place words, adopting an administrative division dictionary to process the fine-grained place words into addresses in a preset format.

5. The method of claim 4, wherein after the word segmentation result is subjected to part-of-speech analysis to obtain fine-grained place words, the method further comprises:

and under the condition that the fine-grained place words belong to organization place words, processing the fine-grained place words into addresses in a preset format by adopting a preset mapping relation between entities and places.

6. The method according to claim 4 or 5, wherein, in the case that the fine-grained place word belongs to an administrative division type place word, processing the fine-grained place word into a preset-format address using an administrative division dictionary includes:

under the condition that the fine-grained place word belongs to an administrative division type place word, acquiring, according to the administrative division dictionary, the upper-level administrative division place word corresponding to the administrative division type place word, level by level, until the highest-level administrative division place word is acquired;

and processing the administrative division type place word into an address that comprises, level by level according to administrative division level, the place words from the administrative division type place word up to the highest-level administrative division place word.
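For illustration only, the level-by-level lookup recited in claim 6 can be sketched as a walk up a parent table. The table `PARENT_OF`, its entries and the name `to_hierarchical_address` are hypothetical, and returning the levels as a list (lowest level first) is an assumption, since the preset address format is not specified here.

```python
from typing import Dict, List, Optional

# Hypothetical administrative division dictionary: each division maps to
# its upper-level division; the highest level maps to None.
PARENT_OF: Dict[str, Optional[str]] = {
    "Haidian District": "Beijing",
    "Beijing": "China",
    "China": None,
}


def to_hierarchical_address(place: str,
                            parent_of: Dict[str, Optional[str]] = PARENT_OF) -> List[str]:
    chain = [place]
    seen = {place}  # guards against cycles in a malformed dictionary
    current = parent_of.get(place)
    while current is not None and current not in seen:
        chain.append(current)
        seen.add(current)
        current = parent_of.get(current)
    return chain
```

A place word that is absent from the dictionary is returned unchanged as a single-element list, which leaves the decision on how to handle unknown divisions to the caller.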

7. The method according to claim 5, wherein, in the case that the fine-grained place word belongs to an organization type place word, processing the fine-grained place word into a preset-format address by adopting a preset mapping relationship between entities and places includes:

under the condition that the fine-grained place word belongs to an organization type place word, sequentially acquiring, according to the preset mapping relationship between entities and places, the upper-level place words corresponding to the organization type place word until the highest-level place word is acquired;

and processing the organization type place word into an address that runs from the organization type place word, level by level upward, to the highest-level place word.
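For illustration only, the organization case of claim 7 can be sketched with a single lookup table that maps an entity to its place and each place to its upper-level place. The table `ENTITY_PLACE_MAP`, its entries, the name `organization_to_address` and the comma-separated output are hypothetical choices for this sketch.

```python
from typing import Dict

# Hypothetical preset mapping: entity -> place, and place -> upper-level place.
ENTITY_PLACE_MAP: Dict[str, str] = {
    "Peking University": "Haidian District",
    "Haidian District": "Beijing",
    "Beijing": "China",
}


def organization_to_address(org: str,
                            mapping: Dict[str, str] = ENTITY_PLACE_MAP,
                            sep: str = ", ") -> str:
    parts = [org]
    # Follow the mapping upward until no upper-level place word is known;
    # the membership check stops the walk if the mapping contains a cycle.
    while parts[-1] in mapping and mapping[parts[-1]] not in parts:
        parts.append(mapping[parts[-1]])
    return sep.join(parts)
```

With the sample table, `organization_to_address("Peking University")` yields the organization followed by each higher-level place word in turn.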

8. An identification system for an event venue, comprising:

the extraction module is used for extracting candidate place words in the event information, wherein the event information comprises a title and a text;

the input and recognition module is used for inputting the candidate place words and the corresponding titles and place sentences into a pre-trained recognition model so that the pre-trained recognition model can recognize whether the candidate place words are event places in the place sentences or not, and the place sentences are sentences in which the place words are located;

the system further comprises:

the construction module is used for constructing an identification model to be trained;

the acquisition module is used for acquiring a training sample, wherein the training sample comprises a title and a text;

the extraction module is further used for extracting candidate place words in the title and the text of the training sample, and labeling the candidate place words to obtain labeling results, wherein the labeling results comprise whether the candidate place words are place words, whether the candidate place words are event places and whether the candidate place words are event-related places;

the input module is used for inputting the labeling result of the candidate place words, the place sentences corresponding to the candidate place words and the titles corresponding to the candidate place words into the recognition model to be trained;

and the training module is used for training the recognition model to be trained until a preset training index is reached.

9. The system of claim 8, wherein the extraction module, when extracting candidate place words in event information or candidate place words in the title and the text of the training sample, is configured to perform at least one of the following:

extracting, according to an administrative division dictionary file, administrative division type place words in the title and the text as the candidate place words;

and performing regular-expression matching on the title and the text through a regular-expression matching template to obtain the candidate place words.

10. The system according to claim 8 or 9, characterized in that the system further comprises:

and the processing module is used for processing the candidate place words corresponding to the identified event occurrence places into addresses in a preset format.

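11. The system of claim 10, wherein the processing module, when processing the candidate place words corresponding to the identified event occurrence places into addresses in a preset format, is configured to:

perform word segmentation on the candidate place words;

perform part-of-speech analysis on the word segmentation result to obtain fine-grained place words;

and under the condition that the fine-grained place words belong to administrative division type place words, adopt an administrative division dictionary to process the fine-grained place words into addresses in a preset format.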

12. The system of claim 11, wherein the processing module is further configured to process the fine-grained place word into a preset-format address using a preset entity-place mapping relationship in a case where the fine-grained place word belongs to an organization-like place word.

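13. The system according to claim 11 or 12, wherein the processing module, when processing the fine-grained place word into an address in a preset format by using an administrative division dictionary in the case that the fine-grained place word belongs to an administrative division type place word, is specifically configured to:

under the condition that the fine-grained place word belongs to an administrative division type place word, acquire, according to the administrative division dictionary, the upper-level administrative division place word corresponding to the administrative division type place word, level by level, until the highest-level administrative division place word is acquired;

and process the administrative division type place word into an address that comprises, level by level according to administrative division level, the place words from the administrative division type place word up to the highest-level administrative division place word.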

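14. The system of claim 12, wherein the processing module, when processing the fine-grained place word into an address in a preset format by using a preset mapping relationship between entities and places in the case that the fine-grained place word belongs to an organization type place word, is specifically configured to:

under the condition that the fine-grained place word belongs to an organization type place word, sequentially acquire, according to the preset mapping relationship between entities and places, the upper-level place words corresponding to the organization type place word until the highest-level place word is acquired;

and process the organization type place word into an address that runs from the organization type place word, level by level upward, to the highest-level place word.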

15. An apparatus for identifying an event venue, comprising:

a memory;

a processor; and

a computer program;

wherein the computer program is stored in the memory and configured to be executed by the processor to implement the method of any one of claims 1-7.

16. A computer readable storage medium, on which a computer program is stored, which computer program, when being executed by a processor, implements the method according to any of claims 1-7.