CN111914539A

CN111914539A - Channel announcement information extraction method and system based on BiLSTM-CRF model

Info

Publication number: CN111914539A
Application number: CN202010756216.6A
Authority: CN
Inventors: 杨保岑; 朱剑华; 何明宪; 张秋实; 李�赫; 李莉; 徐硕; 朱楠; 周冠男; 吕霖; 徐乐; 李伟凡; 李艳芳; 彭洋; 刘思鹏; 杨传波
Original assignee: CHANGJIANG WATERWAY SURVEY CENTER
Current assignee: CHANGJIANG WATERWAY SURVEY CENTER
Priority date: 2020-07-31
Filing date: 2020-07-31
Publication date: 2020-11-10
Anticipated expiration: 2040-07-31
Also published as: CN111914539B

Abstract

The invention provides a method and a system for extracting channel announcement information based on a BiLSTM-CRF model. Chinese word segmentation is performed on channel-related information; for the segmentation, a word segmentation dictionary of electronic channel map object names is constructed from the channel element layers and used as the registered-word (login) dictionary. Elements of practical significance to users in the channel announcement information are divided into organization O, location L, subject S, event E and time T to construct a text semantic extraction model of the channel announcement; a BiLSTM-CRF model is trained under the constraint of this text semantic extraction model, and key information is extracted. The extracted channel announcement information can be used for visualizing text information such as channel announcements and maintenance scales, visualizing key channel areas, navigation assistance reminders, and mobile-terminal-based channel information interaction and pushing.

Description

Channel announcement information extraction method and system based on BiLSTM-CRF model

Technical Field

The invention relates to the field of intelligent channel information processing, and in particular to a channel announcement information extraction method based on a BiLSTM-CRF model.

Background

Channel announcements are notices issued to the public by channel authorities to keep channels clear and safe. With their content, a ship can plan its navigation route in advance and avoid, as far as possible, the safety hazards and property loss caused by obstructions.

In current digital channel informatization, no fixed structured template has yet been established, and channel announcement information is mainly presented as unstructured text. As a result, related business units find it difficult to cooperate through information sharing and process integration, and the overall efficiency of resource utilization is low. These texts need to be converted into structured data through natural language processing to promote the integration and sharing of channel resources.

Therefore, there is a need in the art for a new practical technique that converts unstructured channel announcement data into structured data with spatial identifiers, so as to provide a data basis for practical applications such as the intelligent spatial matching of channel announcement information with the electronic channel map in the Changjiang (Yangtze River) channel map APP or other real-time application tools.

Disclosure of Invention

The invention aims to provide a technical scheme for extracting channel announcement information based on the BiLSTM-CRF model, so as to improve the utilization of channel announcement information and promote the integration and sharing of channel resources.

The technical scheme of the invention provides a channel announcement information extraction method based on a BiLSTM-CRF model, which comprises the following steps:

step 1, performing Chinese word segmentation on the channel-related information, wherein for the segmentation a word segmentation dictionary of electronic channel map object names is constructed from the channel element layers and used as the registered-word (login) dictionary;

step 2, extracting key information through geographic entity recognition, wherein elements of practical significance to users in the channel announcement information are divided into organization O, location L, subject S, event E and time T to construct a text semantic extraction model of the channel announcement, a BiLSTM-CRF model is trained under the constraint of the text semantic extraction model, and the key information is extracted.

Furthermore, channel information acquisition is performed in advance, and comprises the steps of acquiring and storing channel related information, wherein the channel related information comprises channel announcements, planned water depth and maintenance scale; and acquiring relevant information of the navigation channel by adopting a focused web crawler mode.

And when crawling the page, putting the filtered links into the URL queue in turn according to the priorities of 'important', 'upstream', 'midstream' and 'downstream'.

Moreover, the electronic navigation channel map object name word segmentation dictionary is constructed according to the navigation channel element map layer in the following way,

step 1.1, loading channel element layers in batches;

step 1.2, reading the element, extracting the element name according to the attribute field, and storing the result to a read attribute name list;

step 1.3, judging whether unread elements exist at present, if so, continuing to read the elements, returning to the step 1.2, and if not, ending the reading process and entering the step 1.4;

and step 1.4, according to the final name list obtained in the step 1.2, writing the final name list into the text file in sequence according to the format of 'name + line feed' of the Chinese word segmentation dictionary, and outputting the final file as the word segmentation dictionary.
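Steps 1.1 to 1.4 can be sketched in Python as follows. This is only a minimal illustration under stated assumptions: each layer is modeled as an in-memory list of attribute mappings rather than being read from a real chart file, the function names `collect_names` and `write_dictionary` are hypothetical, the name field defaults to the NOBJNM attribute mentioned in the embodiment, and the skipping of blank or duplicate names is an added convenience that the source does not specify.

```python
from pathlib import Path
from typing import Iterable, List, Mapping, Set


def collect_names(layers: Iterable[Iterable[Mapping[str, str]]],
                  name_field: str = "NOBJNM") -> List[str]:
    """Steps 1.1-1.3: read every element of every layer and keep its name."""
    names: List[str] = []
    seen: Set[str] = set()
    for layer in layers:                  # step 1.1: layers loaded in batch
        for element in layer:             # steps 1.2/1.3: loop until nothing is unread
            name = (element.get(name_field) or "").strip()
            if name and name not in seen:  # assumption: drop blanks and duplicates
                seen.add(name)
                names.append(name)
    return names


def write_dictionary(names: Iterable[str], path: str) -> None:
    """Step 1.4: write one name per line ("name + line feed")."""
    text = "".join(f"{name}\n" for name in names)
    Path(path).write_text(text, encoding="utf-8")
```

The resulting file holds one object name per line, which is the plain format the description calls "name + line feed".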

Moreover, in the text semantic extraction model of the channel announcement,

the organization O is used for identifying the organization that issues the channel announcement;

the location L is used for identifying position-related information contained in the channel announcement, including typical channel features with unambiguous spatial location characteristics;

the subject S is used for identifying the main content contained in the channel announcement, including channel-specific element objects and the running state of the channel;

the event E is used for identifying procedural content in the channel announcement, including natural events and man-made events;

and the time T is used for identifying the release time of the channel announcement.

Moreover, training with the BiLSTM-CRF model under the constraint of the text semantic extraction model comprises annotating the corpus for the text semantic extraction model with the BIO tag set used in the Bakeoff-3 evaluation, the constraints on the finally predicted labels being added by the CRF layer of the BiLSTM-CRF model.

And the method is used for spatial information visualization, and comprises the steps of carrying out spatial matching with an electronic channel map based on geographic entities with labels as places identified by a BilSTM-CRF model, generating a geographic fence by taking a spatial position as a center, and marking and displaying real-time channel notification information.

And the method is used for channel key area visualization and navigation auxiliary reminding.

And the method is used for channel information interaction and pushing based on the mobile terminal.

The invention also provides a channel announcement information extraction system based on the BiLSTM-CRF model, which is used for executing the above channel announcement information extraction method based on the BiLSTM-CRF model.

The invention provides a channel announcement information extraction technique based on the BiLSTM-CRF model, with which channel announcement information is extracted quickly. First, web crawler technology is used to crawl and store channel-related information from the channel bureau website. The crawled data is then processed intelligently: the unstructured channel announcement text is split into independent word units with specific meanings. Next, an "organization-location-subject-event-time" (OLSET) text semantic extraction model of the channel announcement is constructed, and machine learning training is carried out with a bidirectional long short-term memory and conditional random field (BiLSTM-CRF) model. Finally, the key information of the channel announcement elements is extracted automatically according to the training result. The extracted channel announcement information can be used for visualizing text information such as channel announcements and maintenance scales, visualizing key channel areas, navigation assistance reminders, mobile-terminal-based channel information interaction and pushing, and the like. Because the word segmentation dictionary is built from electronic channel map object names, channel information can be extracted more accurately than with a conventional dictionary. The invention is suitable not only for extracting the information elements of channel announcements but also for the geospatialization and visualization of other shipping information, and indicators such as recognition accuracy and recall continue to improve as the machine learning model is run and refined.

Drawings

FIG. 1 is a system block diagram of an embodiment of the present invention;

FIG. 2 is a schematic diagram of channel information acquisition according to an embodiment of the present invention;

FIG. 3 is a schematic diagram of key information extraction according to an embodiment of the present invention;

FIG. 4 is a schematic diagram of a Chinese segmentation dictionary construction process according to an embodiment of the present invention;

FIG. 5 is a schematic diagram of a Chinese word segmentation process according to an embodiment of the present invention;

FIG. 6 is a schematic structural diagram of a BiLSTM-CRF model according to an embodiment of the present invention;

FIG. 7 is a schematic diagram of spatial information visualization according to an embodiment of the present invention.

Detailed Description

The technical scheme of the invention is explained in detail in the following by combining the drawings and the embodiment.

The embodiment provides a processing flow of a channel announcement information extraction method based on a BiLSTM-CRF model, which is specifically implemented as follows:

First, channel-related information is acquired and stored in advance.

In the embodiment, focused web crawler (Focused Crawler) technology is preferably used to crawl channel-related information such as channel announcements, planned water depths and maintenance scales from the website of the Yangtze River Channel Bureau, and the results can be stored in a database. The crawling process of the embodiment is shown in FIG. 2, and the detailed implementation steps are as follows:

Step 1, defining and describing the crawl target: a focused web crawler first defines its crawl target and a description of it according to the crawling requirements. Here the target is the channel service web pages of the Yangtze River Channel Bureau, whose contents include channel scale forecasts, channel announcements, water levels, tide levels, safety early warnings, comprehensive service information, monthly water depth plans, annual water depth plans and the like;

Step 2, obtaining an initial URL (http://www.cjhdj.com.cn/hdfw/);

Step 3, crawling the page at the initial URL and obtaining new URLs;

Step 4, filtering out of the new URLs the links irrelevant to the crawl target; for example, when crawling channel announcements the URL filter keyword is "channel_node", i.e., every web page address must start with "http://www.cjhdj.com.cn/hdfw/channel_node/";

Step 5, placing the filtered links into a URL queue in order:

In specific implementation, following the business divisions of the Yangtze River Channel Bureau, the channel announcement web page has sub-columns such as important, upstream, midstream, downstream and summary. The "important" column contains channel information of significant reference value for ship navigation, such as channel opening and closing, channel adjustment and channel emergencies, while the upstream, midstream and downstream columns carry the announcements for the corresponding geographic sections of the channel. It is therefore preferred to place the filtered links into the URL queue in the priority order "important", "upstream", "midstream", "downstream", for example:

(i) "important" (http://www.cjhdj.com.cn/hdfw/channel_notice/hdtgzy/),

(ii) "upstream" (http://www.cjhdj.com.cn/hdfw/channel_node/hdtgsy/),

(iii) "midstream" (http://www.cjhdj.com.cn/hdfw/channel_note/hdtgzy1/),

(iv) "downstream" (http://www.cjhdj.com.cn/hdfw/channel_notice/hdtgxy/);

Step 6, acquiring the web page contents of the filtered links with a breadth-first crawling strategy;

Step 7, taking the next URL address to be crawled as the initial URL address and repeating steps 3 to 7;

Step 8, stopping crawling when no further URL address to be crawled can be obtained.
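The crawl loop of steps 2 to 8 can be sketched as below. The sketch deliberately performs no network access: page fetching, the relevance filter and the mapping from a URL to its column are passed in as callables (`fetch_links`, `is_relevant` and `column_of` are hypothetical names introduced for illustration). It assumes that "priority order" means a priority queue keyed on the column, with first-in-first-out order inside one column standing in for the breadth-first strategy; the source does not spell out how the two are combined.

```python
import heapq
from typing import Callable, Iterable, List, Tuple

# Smaller number = crawled earlier; order taken from the description.
PRIORITY = {"important": 0, "upstream": 1, "midstream": 2, "downstream": 3}


def crawl(seed: str,
          fetch_links: Callable[[str], Iterable[str]],
          is_relevant: Callable[[str], bool],
          column_of: Callable[[str], str]) -> List[str]:
    """Return the URLs in the order in which they were crawled."""
    queue: List[Tuple[int, int, str]] = [(0, 0, seed)]  # (priority, arrival, url)
    seen = {seed}
    order: List[str] = []
    arrival = 1
    while queue:                                   # step 8: stop when the queue is empty
        _, _, url = heapq.heappop(queue)           # step 7: next URL to crawl
        order.append(url)
        for link in fetch_links(url):              # step 3: crawl page, collect new URLs
            if link in seen or not is_relevant(link):  # step 4: filter irrelevant links
                continue
            seen.add(link)
            rank = PRIORITY.get(column_of(link), len(PRIORITY))
            heapq.heappush(queue, (rank, arrival, link))  # step 5: prioritized enqueue
            arrival += 1
    return order
```

In a real deployment `fetch_links` would download the page and parse its anchors, and `is_relevant` would apply the URL prefix filter of step 4.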

Secondly, Chinese word segmentation and geographic entity recognition are carried out according to the input relevant information of the navigation channel, the extraction process is as shown in figure 3, and the detailed implementation steps are described as follows:

(1) Chinese word segmentation

Because the electronic channel map contains channel-related place names, names of channel facilities such as navigation marks and regulation structures, and other proper nouns not covered by a conventional dictionary, the embodiment constructs the word segmentation dictionary from the names of electronic channel map objects (processing flow shown in FIG. 4) and segments the channel announcement titles with the jieba word segmentation tool in a Python environment (processing flow shown in FIG. 5). The detailed implementation steps are as follows:

step 1, constructing an electronic channel map object name word segmentation dictionary, referring to fig. 4, and describing a specific process as follows:

step 1.1, loading the navigation channel element layers in batches.

Step 1.2, reading the element, extracting the element name according to the attribute field (such as NOBJNM), and saving the result to the read attribute name list.

Step 1.3, judging whether unread elements remain; if so, continuing to read elements and repeating step 1.2; if not, ending the reading process and proceeding to step 1.4.

Step 1.4, writing the final name list obtained in step 1.2 into a text file in sequence, in the "name + line feed" format commonly used by Chinese word segmentation dictionaries, and outputting the resulting file as the word segmentation dictionary.

Step 2, cleaning the sentence to be processed: special characters irrelevant to word segmentation, such as UTF-8 encoded Latin symbols, are separated out and tagged with the unknown part of speech.

Step 3, loading the constructed electronic channel map object name word segmentation dictionary as the registered-word (login) dictionary and building a trie-based word segmentation model (prefix dictionary).

Step 4, scanning the word graph based on the prefix dictionary to generate a directed acyclic graph (DAG) of all possible word formations of the Chinese characters in the text;

Step 5, searching for the maximum-probability path (route) by dynamic programming to find the most probable segmentation combination based on word frequency;

Step 6, tagging the registered words, i.e., those recorded in the word segmentation dictionary, according to the dictionary;

Step 7, identifying words not included in the word segmentation dictionary separately for Chinese and non-Chinese text: combinations of English letters, digits and time expressions are given corresponding labels, while for Chinese text the word-forming probability is calculated with a Hidden Markov Model (HMM) based on the word-forming capability of Chinese characters;

Step 8, performing part-of-speech tagging based on the Viterbi algorithm;

Step 9, extracting keywords based on the TF-IDF and TextRank models.
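To make steps 4 and 5 concrete, the following pure-Python sketch builds the DAG of candidate words and picks the maximum-probability route by dynamic programming. It is a simplified illustration, not the jieba implementation: it scans all substrings instead of using a prefix dictionary, unknown characters get an assumed frequency of 1, and the dictionary entries and frequencies in the usage note are invented toy values.

```python
import math
from typing import Dict, List, Tuple


def build_dag(sentence: str, freq: Dict[str, int]) -> Dict[int, List[int]]:
    """Step 4: for each start index, list the end indices of dictionary words."""
    dag: Dict[int, List[int]] = {}
    n = len(sentence)
    for start in range(n):
        ends = [end for end in range(start, n)
                if freq.get(sentence[start:end + 1], 0) > 0]
        dag[start] = ends or [start]   # unknown character: fall back to itself
    return dag


def best_route(sentence: str, dag: Dict[int, List[int]],
               freq: Dict[str, int]) -> List[str]:
    """Step 5: right-to-left dynamic programming over log word frequencies."""
    n = len(sentence)
    log_total = math.log(sum(freq.values()))
    route: Dict[int, Tuple[float, int]] = {n: (0.0, 0)}
    for start in range(n - 1, -1, -1):
        route[start] = max(
            (math.log(freq.get(sentence[start:end + 1], 0) or 1) - log_total
             + route[end + 1][0], end)
            for end in dag[start])
    words: List[str] = []
    i = 0
    while i < n:                        # follow the stored best end indices
        end = route[i][1]
        words.append(sentence[i:end + 1])
        i = end + 1
    return words
```

With a toy dictionary `{"白沙洲大桥": 5, "白沙洲": 5, "大桥": 20, "水域": 20, "白": 10, "沙": 10, "洲": 10}`, the sentence "白沙洲大桥水域" is kept as "白沙洲大桥" + "水域"; removing the bridge name from the dictionary splits it into "白沙洲" + "大桥" + "水域". This is the effect the object-name dictionary is meant to achieve.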

(2) Named entity recognition

Step 1, although current channel announcement information is unstructured, it still contains specific element units such as organizations, locations, subjects, events and times. The geographic entity recognition of channel announcement information can therefore be converted into a sequence labeling problem and simplified into structured classification, which lays the groundwork for the subsequent deep learning. Elements of practical significance to users in the channel announcement information are divided into Organization, Location, Subject, Event and Time, thereby constructing the "Organization-Location-Subject-Event-Time" (OLSET) text semantic extraction model of the channel announcement, wherein:

(1) O (Organization) is the organization: it identifies the organization issuing the channel announcement, such as the Changjiang XX channel bureau/office.

(2) L (Location) is the location: it identifies position-related information contained in the channel announcement, such as XX channel/water area/river reach/shoal (only XX is labeled; suffixes such as channel/water area/river reach/shoal are not labeled), and typical channel features with definite spatial location characteristics, such as bridges and wharfs.

(3) S (Subject) is the subject: it identifies the main content contained in the channel announcement, including channel-specific element objects, such as control river reaches, shoals, bridge areas, signal stations and special channels/navigation marks, and the operating state of the channel, such as navigation prohibition/non-navigation prohibition, shift collection/shift start, and navigation mark adjustment/removal/recovery/arrangement/malfunction/abnormal operation.

(4) E (Event) is the event: it identifies procedural content in the channel announcement, such as natural events (flood peaks, floods, dead water, flood seasons, non-flood seasons) or man-made events (channel maintenance, dredging, sand mining, construction, operation, investigation).

(5) T (Time) is the time: it identifies the release time of the channel announcement, such as XX year, X month, X day.

Step 2, performing machine learning training with a bidirectional long short-term memory and conditional random field (BiLSTM-CRF) model and extracting the key information. The model structure is shown in FIG. 6, and the processing flow is as follows:

1) Based on the text semantic extraction model constructed in step 1, the corpus is annotated with the BIO (Begin, Inside, Outside) tag set used in the Bakeoff-3 evaluation: B-ORG denotes the first character of an organization and I-ORG a non-initial character of an organization; B-LOC denotes the first character of a location and I-LOC a non-initial character of a location; B-SUB denotes the first character of a subject and I-SUB a non-initial character of a subject; B-EVE denotes the first character of an event and I-EVE a non-initial character of an event; B-TM denotes the first character of a time and I-TM a non-initial character of a time; and O denotes a character that does not belong to any named entity.

The invention treats geographic entity recognition as, in essence, a classification problem: the targets are divided according to business requirements, and the subsequent recognition is performed through machine learning. In the embodiment, the crawled "important" channel announcement information is used as the training data set and is annotated according to the text semantic extraction model.
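The annotation step can be illustrated with a small helper that turns OLSET entity spans into per-character BIO labels. This is a sketch under assumptions: the function name `to_bio`, the span format (start, exclusive end, OLSET letter) and the sample annotation in the usage note are hypothetical and do not come from the source corpus.

```python
from typing import List, Tuple

# OLSET letter -> tag suffix. Note that the key "O" here means Organization,
# while the label "O" emitted below means "outside any entity".
SUFFIX = {"O": "ORG", "L": "LOC", "S": "SUB", "E": "EVE", "T": "TM"}


def to_bio(text: str, spans: List[Tuple[int, int, str]]) -> List[str]:
    """Return one BIO label per character of `text`.

    Each span is (start, end_exclusive, olset_letter); spans must not overlap.
    """
    labels = ["O"] * len(text)
    for start, end, kind in spans:
        suffix = SUFFIX[kind]
        labels[start] = f"B-{suffix}"        # first character of the entity
        for i in range(start + 1, end):
            labels[i] = f"I-{suffix}"        # remaining characters of the entity
    return labels
```

For the sample title "武桥水道航标调整", labeling "武桥" as a location (the suffix "水道" stays unlabeled, as the location rule prescribes) and "航标调整" as a subject gives `B-LOC I-LOC O O B-SUB I-SUB I-SUB I-SUB`.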

2) Taking the sentence as the unit, a sentence (a sequence of words) containing n words is written as:

$$x = (x_1, x_2, \ldots, x_n)$$

where $x_i$ denotes the id in the dictionary of the i-th word of the sentence, from which a one-hot vector of each word is obtained whose dimension is the size of the dictionary.

3) Using an embedding matrix that is either pre-trained or randomly initialized, each word $x_i$ in the sentence is mapped from its one-hot vector to a low-dimensional dense word vector $x_i \in \mathbb{R}^d$, where d is the dimension of the embedding. Dropout is applied to mitigate over-fitting; dropout means that, during the training of a deep learning network, neural network units are temporarily discarded from the network with a certain probability.

4) Sentence features are extracted automatically. The embedding sequence $(x_1, x_2, \ldots, x_n)$ of the sentence is fed to the bidirectional LSTM, one vector per time step. The hidden state sequence $(\overrightarrow{h}_1, \overrightarrow{h}_2, \ldots, \overrightarrow{h}_n)$ output by the forward LSTM and the hidden state sequence $(\overleftarrow{h}_1, \overleftarrow{h}_2, \ldots, \overleftarrow{h}_n)$ output by the backward LSTM are concatenated position by position, $h_t = [\overrightarrow{h}_t ; \overleftarrow{h}_t] \in \mathbb{R}^m$ (m being the dimension of the concatenated hidden state), to obtain the complete hidden state sequence $(h_1, h_2, \ldots, h_n) \in \mathbb{R}^{n \times m}$.

5) After dropout, a linear layer maps each hidden state vector from m dimensions to k dimensions, where k is the number of labels in the tag set. This yields the automatically extracted sentence features, written as the LSTM output matrix $P = (p_1, p_2, \ldots, p_n) \in \mathbb{R}^{n \times k}$.

Here $\mathbb{R}^{n \times k}$ is the space of the dimension-reduced feature vectors, and $p_i$ is the i-th row of the LSTM output matrix.

Each component $p_{ij}$ of $p_i \in \mathbb{R}^k$ can be regarded as the score for assigning word $x_i$ to the j-th label. Applying Softmax to P would amount to an independent k-class classification at each position. Because such position-wise labeling cannot use the labels already assigned to other positions, a conditional random field (CRF) layer is attached next to perform the labeling.

6) Sentence-level sequence labeling is performed. The parameter of the CRF layer is a $(k+2) \times (k+2)$ matrix A, where $A_{ij}$ is the transition score from the i-th label to the j-th label, so that the labels assigned earlier can be used when labeling a position. The 2 is added because a start state is added at the head of the sentence and an end state at its tail. For a tag sequence $y = (y_1, y_2, \ldots, y_n)$ whose length equals the sentence length, the model's score for labeling sentence x with y is:

$$\mathrm{score}(x, y) = \sum_{i=1}^{n} P_{i, y_i} + \sum_{i=1}^{n+1} A_{y_{i-1}, y_i}$$

where $P_{i, y_i}$ is the score for assigning the i-th word to label $y_i$, $A_{y_{i-1}, y_i}$ is the transition score from label $y_{i-1}$ to label $y_i$, and $y_0$ and $y_{n+1}$ denote the start and end states.

It can be seen that the score of the whole sequence is the sum of the scores of its positions, and that the score of each position has two parts: one determined by the $p_i$ output by the LSTM and the other by the transition matrix A of the CRF. The normalized probability is then obtained with Softmax:

$$P(y \mid x) = \frac{\exp(\mathrm{score}(x, y))}{\sum_{y'} \exp(\mathrm{score}(x, y'))}$$

where $y'$ ranges over all candidate tag sequences for the sentence, score(x, y) is the score for labeling sentence x with y, and score(x, y') is the score for labeling sentence x with y'.

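7) The log-likelihood is maximized during training. For one training sample (x, y), the log-likelihood is:

$$\log P(y \mid x) = \mathrm{score}(x, y) - \log \sum_{y'} \exp(\mathrm{score}(x, y'))$$

where the sum runs over all candidate tag sequences $y'$ for sentence x.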
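The sequence score and the log-likelihood can be checked numerically with a short pure-Python sketch. It assumes the usual linear-chain CRF reading of the text: P is an n × k list of emission scores, A is the (k+2) × (k+2) transition matrix, and the start and end states are addressed by the indices `start` and `end` (here taken to be k and k+1, which is an indexing choice made for this illustration). The normalizer is computed with the forward algorithm rather than by enumerating all tag sequences.

```python
import math
from typing import List, Sequence


def _logsumexp(values: Sequence[float]) -> float:
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def sequence_score(P: List[List[float]], A: List[List[float]],
                   y: Sequence[int], start: int, end: int) -> float:
    """score(x, y): emission scores plus transitions, including start/end states."""
    path = [start, *y, end]
    emit = sum(P[i][tag] for i, tag in enumerate(y))
    trans = sum(A[a][b] for a, b in zip(path, path[1:]))
    return emit + trans


def log_partition(P: List[List[float]], A: List[List[float]],
                  start: int, end: int) -> float:
    """log of the sum of exp(score) over all tag sequences (forward algorithm)."""
    k = len(P[0])
    alpha = [A[start][j] + P[0][j] for j in range(k)]
    for i in range(1, len(P)):
        alpha = [_logsumexp([alpha[a] + A[a][j] for a in range(k)]) + P[i][j]
                 for j in range(k)]
    return _logsumexp([alpha[j] + A[j][end] for j in range(k)])


def log_likelihood(P: List[List[float]], A: List[List[float]],
                   y: Sequence[int], start: int, end: int) -> float:
    """log P(y | x) = score(x, y) - log sum over y' of exp(score(x, y'))."""
    return sequence_score(P, A, y, start, end) - log_partition(P, A, start, end)
```

Training would maximize `log_likelihood` summed over the annotated sentences; in practice this is done with automatic differentiation in a deep learning framework, which the sketch leaves out.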

8) The predicted label of each word is obtained. At prediction time the optimal path is solved with the Viterbi dynamic programming algorithm:

$$y^{*} = \arg\max_{y'} \mathrm{score}(x, y')$$

The Viterbi algorithm is a classical dynamic programming algorithm for finding the optimal path, and its details are not repeated here.
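Although the description leaves the details out, a compact Viterbi decoder for this kind of score is easy to sketch. The version below assumes the same conventions as the scoring text: P is the n × k emission matrix, A the (k+2) × (k+2) transition matrix, and `start` and `end` are the indices of the added start and end states (an indexing choice made for this illustration, not fixed by the source).

```python
from typing import List, Tuple


def viterbi(P: List[List[float]], A: List[List[float]],
            start: int, end: int) -> Tuple[float, List[int]]:
    """Return the highest sequence score and the tag path that achieves it."""
    k = len(P[0])
    score = [A[start][j] + P[0][j] for j in range(k)]
    back: List[List[int]] = []
    for i in range(1, len(P)):
        # For each current tag j, remember the best previous tag.
        prev = [max(range(k), key=lambda a: score[a] + A[a][j]) for j in range(k)]
        score = [score[prev[j]] + A[prev[j]][j] + P[i][j] for j in range(k)]
        back.append(prev)
    last = max(range(k), key=lambda j: score[j] + A[j][end])
    best = score[last] + A[last][end]
    path = [last]
    for prev in reversed(back):          # walk the back-pointers to the first word
        path.append(prev[path[-1]])
    path.reverse()
    return best, path
```

The cost is O(n·k²), compared with k^n for scoring every candidate tag sequence separately.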

9) The CRF layer imposes rule constraints. The BiLSTM alone can produce a label for every word in the sentence, but there is no guarantee that each predicted label is correct. The CRF layer adds constraints to the final predicted labels so that they are consistent with the labeling rules, and these constraints are learned automatically by the CRF layer from the training data. With the CRF layer attached, labels are predicted at the sentence level: the labeling no longer classifies each word independently but takes the transition probabilities of the sequence into account, and the loss function is finally computed and back-propagated through the network. Under the action of the CRF, the label sequence is thus regularized according to the transition probabilities.
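The kind of rule the CRF layer is expected to learn can be written down explicitly for the BIO scheme: an I-X label may only follow B-X or I-X of the same type. The checker below is an illustration of that rule only. The source states that the constraints are learned from data, so using an explicit check like this (for example to validate decoded output) is an assumption added here, and the function names are hypothetical.

```python
from typing import List, Optional


def allowed(prev: Optional[str], cur: str) -> bool:
    """True if label `cur` may follow label `prev` (None = sentence start)."""
    if not cur.startswith("I-"):
        return True                       # "O" and any "B-*" may appear anywhere
    kind = cur[2:]
    return prev in (f"B-{kind}", f"I-{kind}")


def is_valid(labels: List[str]) -> bool:
    """Check a whole label sequence against the BIO transition rule."""
    prev: Optional[str] = None
    for label in labels:
        if not allowed(prev, label):
            return False
        prev = label
    return True
```

A sequence such as `O I-LOC` or `B-LOC I-ORG` is rejected, while `B-LOC I-LOC O B-SUB I-SUB` is accepted.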

In the embodiment, after the training and learning of the model are finished, the crawled 'upstream', 'midstream' and 'downstream' channel announcement information is used as a test data set to verify and evaluate the model processing result.

In specific implementation, a person skilled in the art can implement the method provided by the technical scheme of the invention as an automatic process using computer software technology, and a system device that runs the method also falls within the scope of protection of the invention. Referring to FIG. 1, the embodiment further provides a channel announcement information extraction system based on the BiLSTM-CRF model, which includes a Chinese word segmentation module (10) and a named entity recognition module (20).

The Chinese word segmentation module (10) is used for performing Chinese word segmentation on the channel-related information, wherein for the segmentation a word segmentation dictionary of electronic channel map object names is constructed from the channel element layers and used as the registered-word (login) dictionary;

the named entity recognition module (20) is used for extracting key information through geographic entity recognition, wherein elements of practical significance to users in the channel announcement information are divided into organization O, location L, subject S, event E and time T to construct a text semantic extraction model of the channel announcement, a BiLSTM-CRF model is trained under the constraint of the text semantic extraction model, and the key information is extracted.

The implementation of each module can refer to the implementation description of the corresponding method step, and the invention is not repeated.

The technical scheme of the invention can be applied to various subsequent applications, such as:

(1) Visualization of text information such as channel announcements and maintenance scales. The Yangtze River channel map APP obtains the channel announcement text published by the Yangtze River Channel Bureau; information such as geographic position, range, key content and start and end times is extracted from the text through natural language processing techniques such as template matching, word segmentation and named entity recognition, and is then structured. Spatial information such as geographic position and range is matched with elements such as waterways and mileage on the electronic channel map, a geofence is constructed from a coordinate point and a buffer area or from a polygon, the key content of the channel announcement and the maintenance scale is displayed on the geofence, and the start and end times control when the information is displayed and withdrawn.

(2) Visualization of key channel areas and navigation assistance reminders. Geofences are constructed for areas such as anchorages, warning areas and restricted navigation areas and are overlaid on the electronic channel map; when a ship's positioning signal is close to or inside a geofence, the relevant early warning and alert information can be pushed to the ship. Geofences are also constructed for the waterway ranges of the Yangtze River electronic channel map; when a ship's positioning signal is close to or inside the geofence of a waterway, the channel maintenance scale, navigation mark and water level information, channel announcements, surrounding meteorological information and related geographic points of interest (POI) for that waterway can be pushed to the ship.

(3) Channel information interaction and pushing based on the mobile terminal. The Yangtze River channel map APP provides a user reporting function: the user clicks on the electronic channel map to obtain the geographic coordinates of the position, a geofence is created from those coordinates, and a report is generated. The user can attach a text description or on-site photos to the report and submit the on-site navigation conditions to the relevant management department, which helps the department confirm the situation promptly and update and release information quickly.

For the sake of reference, the embodiment of the present invention provides a detailed description of the channel information visualization as follows:

the identified geographic entities, namely, the entities marked (labeled) as "locations" (locations) in the named entity identification step, are spatially matched with the electronic channel map, a geographic fence is generated by taking a spatial position as a center, real-time channel notification information is marked, the visualization process is as shown in fig. 7, and the detailed implementation steps are described as follows:

step 1, analyzing and acquiring longitude and latitude of the current position based on AIS data or mobile terminal GPS data, judging whether the current position is located in a relevant APP map range, and if not, roaming to the map where the current position is located.

Step 2, extracting the centers of the channel element features so that the announcement information can be drawn at the center of each feature: overlay analysis is performed within the current map extent to obtain typical channel features with definite spatial location characteristics, such as channels, navigation marks, bridges and wharfs, and their center positions are calculated in turn so that the channel announcement information can be drawn centered on them. For point features such as navigation marks and obstructions, the center position is the actual position; for linear or areal features such as bridges, wharfs and waterways, the center position can be expressed as:

$$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i, \qquad \bar{y} = \frac{1}{n} \sum_{i=1}^{n} y_i$$

where $x_i$ and $y_i$ are the coordinates of point element i constituting the linear or areal feature, and n is the total number of point elements constituting the feature.
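If the center is taken as the arithmetic mean of the constituent points, which is the reading suggested by the definitions of x_i, y_i and n, the computation is a few lines of Python. The function name `feature_center` is hypothetical; a point feature is handled by the same code, since the mean of one point is the point itself.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float]


def feature_center(points: Sequence[Point]) -> Point:
    """Mean of the vertex coordinates of a point, line or area feature."""
    if not points:
        raise ValueError("a feature needs at least one point")
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    return (mean_x, mean_y)
```

Note that the vertex mean is not the same as the area-weighted centroid of a polygon; for strongly irregular features the two can differ noticeably, and the vertex mean may even fall outside a concave feature.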

Step 3, calculating a suitable buffer radius (for example, one third of the screen width) or polygonal range from the current mobile device resolution and the center positions obtained in Step 2, and constructing the geofences in turn.
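One possible way to build such a geofence is sketched below. It assumes planar (projected) map coordinates, a known map scale expressed as map units per screen pixel, and a circular buffer approximated by a regular polygon; the names `buffer_radius`, `circular_geofence`, `units_per_px` and `segments` are illustrative assumptions, not part of the original.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def buffer_radius(screen_width_px: int, units_per_px: float,
                  fraction: float = 1 / 3) -> float:
    """Buffer radius in map units for a given fraction of the screen width."""
    return screen_width_px * fraction * units_per_px


def circular_geofence(center: Point, radius: float,
                      segments: int = 32) -> List[Point]:
    """Approximate a circular geofence around `center` by a regular polygon."""
    cx, cy = center
    return [
        (cx + radius * math.cos(2 * math.pi * k / segments),
         cy + radius * math.sin(2 * math.pi * k / segments))
        for k in range(segments)
    ]


# Hypothetical device: 1080 px wide, 2 map units per pixel at the current zoom.
radius = buffer_radius(1080, 2.0)
fence = circular_geofence((100.0, 200.0), radius, segments=16)
```

Tying the radius to the screen width keeps the fence a similar apparent size on devices with different resolutions.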

Step 4, checking whether the geofences constructed in Step 3 overlap one another. For simple polygonal geofences, the ray method offers high query efficiency: starting from each vertex of geofence A, a ray is drawn along the X axis, its intersections with each edge of geofence B are found, and the intersections are counted. If the count is even for every vertex, geofences A and B do not overlap; otherwise they overlap, and the geofence range must then be adjusted or offset.
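The ray method described in Step 4 can be sketched as below. This is a minimal illustration, assuming simple polygons given as vertex lists and a ray cast in the positive X direction; the function names are hypothetical. As in the description, only the vertices of geofence A are tested against geofence B, so an overlap in which edges cross without any vertex of A lying inside B would not be detected by this sketch.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float]


def point_in_polygon(point: Point, polygon: Sequence[Point]) -> bool:
    """Ray method: cast a ray from `point` along +X and count edge crossings.

    An odd number of crossings means the point lies inside the polygon.
    """
    px, py = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line y = py can be crossed.
        if (y1 > py) != (y2 > py):
            x_cross = x1 + (py - y1) * (x2 - x1) / (y2 - y1)
            if x_cross > px:
                inside = not inside
    return inside


def geofences_overlap(fence_a: Sequence[Point],
                      fence_b: Sequence[Point]) -> bool:
    """True if any vertex of fence A lies inside fence B."""
    return any(point_in_polygon(p, fence_b) for p in fence_a)


# Hypothetical square geofences.
fence_b = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)]
fence_a = [(3.0, 3.0), (6.0, 3.0), (6.0, 6.0), (3.0, 6.0)]
fence_c = [(10.0, 10.0), (12.0, 10.0), (12.0, 12.0), (10.0, 12.0)]
```

Here `geofences_overlap(fence_a, fence_b)` reports an overlap because the vertex (3, 3) lies inside `fence_b`, while `fence_c` lies entirely outside it.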

Step 5, requesting the corresponding key information in turn through the WebService, based on the feature names obtained in Step 2.

Step 6, based on the feature center positions obtained in Step 2 and the key information obtained in Step 5, condensing the channel notification information into a preset format (for example, feature name + event + time), and drawing and labeling it within the geofence ranges determined in Step 4.
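The preset format mentioned above (feature name + event + time) could be assembled as in the following sketch. The function name, the separator and the sample values are hypothetical; the original does not specify how the fields are joined.

```python
def format_notice_label(feature_name: str, event: str, time: str,
                        sep: str = " ") -> str:
    """Join feature name, event and time into one short label.

    Empty or whitespace-only fields are skipped so the label stays compact.
    """
    parts = (feature_name, event, time)
    return sep.join(p.strip() for p in parts if p and p.strip())


# Hypothetical key information for one feature.
label = format_notice_label("Wuhan Yangtze River Bridge",
                            "dredging operation", "2020-07-31")
```

Skipping missing fields avoids dangling separators when the extraction step returns no event or no time for a feature.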

In a specific implementation, the above applications can also be run automatically as software.

The specific embodiments described herein are merely illustrative of the spirit of the invention. Various modifications or additions may be made to the described embodiments or alternatives may be employed by those skilled in the art without departing from the spirit or ambit of the invention as defined in the appended claims.

Claims

1. A channel announcement information extraction method based on a BiLSTM-CRF model, characterized by comprising the following steps:

2. The method for extracting channel announcement information based on a BiLSTM-CRF model as claimed in claim 1, wherein: channel information is acquired in advance, comprising acquiring and storing channel-related information, the channel-related information comprising channel announcements, planned water depth and maintenance scale; and the channel-related information is acquired by means of a focused web crawler.

3. The method for extracting channel announcement information based on a BiLSTM-CRF model as claimed in claim 2, wherein: when crawling pages, the filtered links are placed into a URL queue in order of the priorities 'important', 'upstream', 'midstream' and 'downstream'.

4. The method for extracting BiLSTM-CRF model-based channel announcement information as claimed in claim 1, 2 or 3, wherein: the implementation mode of constructing the electronic channel map object name word segmentation dictionary according to the channel element map layer is as follows,

step 1.1, loading channel element layers in batches;

5. The method for extracting BiLSTM-CRF model-based channel announcement information as claimed in claim 1, 2 or 3, wherein: in the text semantic extraction model of the channel announcement,

a mechanism O for identifying a channel announcement issuing mechanism;

6. The method for extracting BiLSTM-CRF model-based channel announcement information as claimed in claim 1, 2 or 3, wherein: training with the BiLSTM-CRF model under the constraint of the text semantic extraction model comprises annotating according to the text semantic extraction model with the BIO tag set adopted in the Bakeoff-3 evaluation, and adding constraints on the finally predicted labels at the CRF layer of the BiLSTM-CRF model.

7. The method for extracting BiLSTM-CRF model-based channel announcement information as claimed in claim 1, 2 or 3, wherein: the method is used for spatial information visualization, comprising spatially matching the geographic entities labeled as locations by the BiLSTM-CRF model with the electronic channel map, generating a geofence centered on the spatial position, and annotating and displaying real-time channel notification information.

8. The method for extracting BiLSTM-CRF model-based channel announcement information as claimed in claim 1, 2 or 3, wherein: the method is used for channel key area visualization and navigation auxiliary reminding.

9. The method for extracting BiLSTM-CRF model-based channel announcement information as claimed in claim 1, 2 or 3, wherein: the method is used for channel information interaction and pushing based on the mobile terminal.

10. A channel announcement information extraction system based on a BiLSTM-CRF model, characterized in that: the system is used for carrying out the method for extracting BiLSTM-CRF model-based channel announcement information as claimed in claims 1 to 9.