CN116579343A - A Named Entity Recognition Method for Chinese Culture and Tourism - Google Patents
- Publication number
- CN116579343A (application CN202310560194.XA)
- Authority
- CN
- China
- Prior art keywords
- representation
- chinese
- tourism
- input
- named entity
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
- G06F40/295—Named entity recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/044—Recurrent networks, e.g. Hopfield networks
- G06N3/0442—Recurrent networks, e.g. Hopfield networks characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0464—Convolutional networks [CNN, ConvNet]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/048—Activation functions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Abstract
The invention discloses a named entity recognition method for Chinese culture and tourism, comprising the following steps: S1, acquiring Chinese culture and tourism text data and inputting it into a character embedding layer to obtain a character vector representation; S2, inputting the character vector representation into a bidirectional long short-term memory network layer to obtain a context representation; S3, inputting the context representation into a CNN layer to obtain a multi-scale local context feature fusion representation; S4, inputting the multi-scale local context feature fusion representation into a CRF layer, which performs sequence labeling to complete named entity recognition for Chinese culture and tourism. Considering that named entity recognition for Chinese tourism has received little research attention, the invention builds a network specifically for Chinese culture and tourism text data: the second CNN module in the CNN layer learns a multi-scale local context feature fusion representation, strengthening semantic correlations and improving the feature representation beneficial to Chinese recognition.
Description
Technical Field
The invention belongs to the technical field of information extraction, and in particular relates to a named entity recognition method for Chinese culture and tourism.
Background Art
Named entity recognition (NER) is a fundamental information extraction task in natural language processing (NLP) that supports many downstream tasks, such as information extraction, social media analysis, search engines, machine translation, and knowledge graphs. The goal of NER is to extract predefined entities from a sentence and identify their correct types, such as person, place, and organization. Early approaches fell into two categories: rule-based methods and statistics-based methods. With the rise of deep learning, NER research has made great progress across diverse domains, such as medicine, finance, and news. However, research on named entity recognition for culture and tourism remains very scarce and has received little attention.
Owing to the differences between languages, there is also considerable research on NER for specific languages, such as English, Arabic, and Hindi, with most researchers focusing on English NER. Although Chinese is an important international language with characteristics of its own distinct from English, research on Chinese NER is far less extensive than on English NER, and much of it is not tailored to the characteristics of the Chinese language.
Contents of the Invention
In view of the above deficiencies in the prior art, the invention provides a named entity recognition method for Chinese culture and tourism, which addresses the fact that current named entity recognition research pays little attention to Chinese culture and tourism.
To achieve the above purpose, the invention adopts the following technical solution: a named entity recognition method for Chinese culture and tourism, comprising the following steps:
S1. Obtain Chinese culture and tourism text data and input it into a character embedding layer to obtain a character vector representation.
S2. Input the character vector representation into a bidirectional long short-term memory network layer to obtain a context representation.
S3. Input the context representation into a CNN layer to obtain a multi-scale local context feature fusion representation.
S4. Input the multi-scale local context feature fusion representation into a CRF layer, which performs sequence labeling to complete named entity recognition for Chinese culture and tourism.
Further, in S1, the character embedding layer comprises a ChineseBert module and a first CNN module in parallel;
S1 comprises the following sub-steps:
S11. Obtain Chinese culture and tourism text data.
S12. Input the Chinese culture and tourism text data into the ChineseBert module to obtain a word embedding vector representation of each character.
S13. Input the Chinese culture and tourism text data into the first CNN module to obtain a radical-level embedding representation.
S14. Concatenate the word embedding vector representation with the radical-level embedding representation to obtain the character vector representation.
Further, S12 specifically comprises:
inputting the Chinese culture and tourism text data into the ChineseBert module, which encodes the input text to obtain feature vectors, and generating the word embedding vector representation of each character from the feature vectors;
wherein the feature vectors comprise token embeddings, position embeddings, and segment embeddings.
Further, in S13, the radical-level embedding representation M2 is given by:
M2 = A1(b1 + C1(x))
where x is the radical-level feature of the Chinese characters, C1(·) is the first CNN module, A1 is the first activation function, and b1 is the bias of the first CNN module.
Further, in S14, the character vector representation Zconcat is given by:
Zconcat = M1 + M2
where M1 is the word embedding vector representation and, consistent with S14, the "+" here denotes concatenation of the two representations.
The beneficial effect of this further solution is that the character vector representation obtained by concatenating the word embedding vector representation and the radical-level embedding representation captures more semantic features, enabling the model to better recognize the Chinese meaning of the text.
Further, in S2, the bidirectional long short-term memory network layer comprises first to twelfth LSTM units, wherein the first to sixth LSTM units process the input character vector representation in the forward direction, and the seventh to twelfth LSTM units process it in the backward direction;
the context representation is obtained as follows:
the outputs of the first to twelfth LSTM units are concatenated to obtain the context representation.
Further, in S2, the context representation H is given by:
H = {h1, ..., hti, ..., hD}
where hti is the concatenation of the outputs of the first to twelfth LSTM units, ti is the concatenation index with ti = 1, ..., D, and D is the dimension of the character vector representation;
each of the first to twelfth LSTM units comprises an input gate it, an output gate ot, and a forget gate ft, given by:
it = σ(Wxi xt + Whi ht-1 + Wci ct-1 + bi)
ft = σ(Wxf xt + Whf ht-1 + Wcf ct-1 + bf)
ct = ft ⊙ ct-1 + it ⊙ tanh(Wxc xt + Whc ht-1 + bc)
ot = σ(Wxo xt + Who ht-1 + Wco ct + bo)
ht = ot ⊙ tanh(ct)
where σ(·) is the element-wise sigmoid function, tanh(·) is the hyperbolic tangent function, ⊙ denotes element-wise multiplication, Wxi, Whi, Wci, Wxf, Whf, Wcf, Wxc, Whc, Wxo, Who, and Wco are weight parameters, bi, bf, bc, and bo are bias parameters, ct is the memory cell, and ht is the output.
Further, in S3, the CNN layer is provided with a second CNN module, and the multi-scale local context feature fusion representation M3 is given by:
M3 = A2(b2 + C2(H))
where H is the context representation, C2(·) is the second CNN module, A2 is the second activation function, and b2 is the bias of the second CNN module.
The beneficial effect of this further solution is that inputting the context representation into the second CNN module strengthens semantic correlations and generates the multi-scale local context feature fusion representation.
The beneficial effects of the invention are as follows: the named entity recognition method for Chinese culture and tourism provided by the invention addresses the lack of research attention to named entity recognition for Chinese tourism by building a network specifically for Chinese culture and tourism text data. In the character embedding layer, the first CNN module learns a radical-level embedding representation tailored to Chinese, yielding a character vector representation beneficial to Chinese recognition; in the CNN layer, the second CNN module learns a multi-scale local context feature fusion representation, strengthening semantic correlations and further improving the feature representation beneficial to Chinese recognition.
Description of Drawings
Fig. 1 is a flow chart of the named entity recognition method for Chinese culture and tourism of the present invention.
Fig. 2 is a schematic diagram of the overall network structure of the present invention.
Fig. 3 is a schematic structural diagram of the ChineseBert module of the present invention.
Fig. 4 is a schematic structural diagram of the first CNN module of the present invention.
Fig. 5 is a schematic structural diagram of the second CNN module of the present invention.
Detailed Description
Specific embodiments of the present invention are described below so that those skilled in the art can understand the invention, but it should be clear that the invention is not limited to the scope of the specific embodiments. For those of ordinary skill in the art, all changes within the spirit and scope of the invention as defined and determined by the appended claims are obvious, and all inventions and creations using the concept of the present invention fall within its protection.
As shown in Fig. 1, in one embodiment of the present invention, a named entity recognition method for Chinese culture and tourism comprises the following steps:
S1. Obtain Chinese culture and tourism text data and input it into a character embedding layer to obtain a character vector representation.
S2. Input the character vector representation into a bidirectional long short-term memory network layer to obtain a context representation.
S3. Input the context representation into a CNN layer to obtain a multi-scale local context feature fusion representation.
S4. Input the multi-scale local context feature fusion representation into a CRF layer, which performs sequence labeling to complete named entity recognition for Chinese culture and tourism.
In this embodiment, the invention provides a named entity recognition method for Chinese culture and tourism based on the fused representation of radical-level features and multi-scale local context features, targeting the characteristics of Chinese characters and applicable to culture and tourism data; the specific network structure is shown in Fig. 2.
In S1, the character embedding layer comprises a ChineseBert module and a first CNN module in parallel;
S1 comprises the following sub-steps:
S11. Obtain Chinese culture and tourism text data.
S12. Input the Chinese culture and tourism text data into the ChineseBert module to obtain a word embedding vector representation of each character.
S13. Input the Chinese culture and tourism text data into the first CNN module to obtain a radical-level embedding representation.
S14. Concatenate the word embedding vector representation with the radical-level embedding representation to obtain the character vector representation.
In this embodiment, the structure of the ChineseBert module is shown in Fig. 3. The ChineseBert module is a model pre-trained on Chinese corpora, designed specifically for processing Chinese text data.
S12 specifically comprises:
inputting the Chinese culture and tourism text data into the ChineseBert module, which encodes the input text to obtain feature vectors, and generating the word embedding vector representation of each character from the feature vectors;
wherein the feature vectors comprise token embeddings, position embeddings, and segment embeddings.
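In BERT-style encoders, the token, position, and segment embeddings named above are summed per character to form the input feature vector. A minimal numpy sketch of that combination follows; the table sizes, random initialisation, and plain summation are illustrative assumptions, not the pre-trained ChineseBert weights:

```python
import numpy as np

rng = np.random.default_rng(0)

vocab_size, max_len, n_segments, d = 100, 16, 2, 8

# Embedding tables (randomly initialised here; in practice these come
# from the pre-trained model).
token_table = rng.normal(size=(vocab_size, d))
position_table = rng.normal(size=(max_len, d))
segment_table = rng.normal(size=(n_segments, d))

def embed(token_ids, segment_ids):
    """Sum token, position, and segment embeddings for each character."""
    t = len(token_ids)
    tok = token_table[token_ids]          # (t, d)
    pos = position_table[np.arange(t)]    # (t, d)
    seg = segment_table[segment_ids]      # (t, d)
    return tok + pos + seg                # (t, d) per-character feature vectors

M1 = embed([5, 17, 42], [0, 0, 0])        # three characters, one segment
print(M1.shape)  # (3, 8)
```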
In S13, the radical-level embedding representation M2 is given by:
M2 = A1(b1 + C1(x))
where x is the radical-level feature of the Chinese characters, C1(·) is the first CNN module, A1 is the first activation function, and b1 is the bias of the first CNN module.
In this embodiment, a CNN is used to compute a radical-level representation of the input Chinese culture and tourism text data, yielding the radical-level embedding representation; the structure of the first CNN module computing the radical representation is shown in Fig. 4.
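The formula M2 = A1(b1 + C1(x)) can be sketched as a 1-D convolution over a character's radical-level features followed by an activation. In the sketch below, the ReLU activation, kernel width, and max-pooling to a fixed-size vector are illustrative assumptions not fixed by the patent:

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def conv1d(x, W, b):
    """Valid 1-D convolution over a radical sequence.
    x: (seq_len, in_dim), W: (k, in_dim, out_dim), b: (out_dim,)."""
    k, _, out_dim = W.shape
    n = x.shape[0] - k + 1
    out = np.empty((n, out_dim))
    for i in range(n):
        out[i] = np.tensordot(x[i:i + k], W, axes=([0, 1], [0, 1])) + b
    return out

rng = np.random.default_rng(1)
radical_dim, out_dim, k = 6, 8, 3
x = rng.normal(size=(5, radical_dim))          # radical-level features of 5 components
W = rng.normal(size=(k, radical_dim, out_dim))
b1 = np.zeros(out_dim)

# M2 = A1(b1 + C1(x)), with ReLU standing in for A1
M2_seq = relu(conv1d(x, W, b1))
M2 = M2_seq.max(axis=0)                         # max-pool over positions -> fixed-size vector
print(M2.shape)  # (8,)
```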
In S14, the character vector representation Zconcat is given by:
Zconcat = M1 + M2
where M1 is the word embedding vector representation and, consistent with S14, the "+" here denotes concatenation of the two representations.
The character vector representation obtained by concatenating the word embedding vector representation and the radical-level embedding representation captures more semantic features, enabling the model to better recognize the Chinese meaning of the text.
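Since S14 defines the combining operation as concatenation, the step can be sketched as follows (the vector values are purely illustrative):

```python
import numpy as np

M1 = np.array([0.1, 0.2, 0.3, 0.4])   # word embedding vector of one character (illustrative)
M2 = np.array([0.5, 0.6])             # radical-level embedding of the same character

# Z_concat = M1 "+" M2, where "+" denotes concatenation per step S14
Z_concat = np.concatenate([M1, M2])
print(Z_concat.shape)  # (6,)
```

The concatenated vector keeps both sources of information side by side, rather than mixing them additively.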
In S2, the bidirectional long short-term memory network layer comprises first to twelfth LSTM units, wherein the first to sixth LSTM units process the input character vector representation in the forward direction, and the seventh to twelfth LSTM units process it in the backward direction;
the context representation is obtained as follows:
the outputs of the first to twelfth LSTM units are concatenated to obtain the context representation.
In this embodiment, the context representation obtained by the bidirectional long short-term memory network layer improves the semantic representation from both the forward and backward directions, enabling better recognition of the semantics of a passage.
In S2, the context representation H is given by:
H = {h1, ..., hti, ..., hD}
where hti is the concatenation of the outputs of the first to twelfth LSTM units, ti is the concatenation index with ti = 1, ..., D, and D is the dimension of the character vector representation;
each of the first to twelfth LSTM units comprises an input gate it, an output gate ot, and a forget gate ft, given by:
it = σ(Wxi xt + Whi ht-1 + Wci ct-1 + bi)
ft = σ(Wxf xt + Whf ht-1 + Wcf ct-1 + bf)
ct = ft ⊙ ct-1 + it ⊙ tanh(Wxc xt + Whc ht-1 + bc)
ot = σ(Wxo xt + Who ht-1 + Wco ct + bo)
ht = ot ⊙ tanh(ct)
where σ(·) is the element-wise sigmoid function, tanh(·) is the hyperbolic tangent function, ⊙ denotes element-wise multiplication, Wxi, Whi, Wci, Wxf, Whf, Wcf, Wxc, Whc, Wxo, Who, and Wco are weight parameters, bi, bf, bc, and bo are bias parameters, ct is the memory cell, and ht is the output.
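The five gate equations above can be implemented directly as one recurrence step. The plain numpy sketch below follows them term by term; the full (rather than diagonal) peephole matrices Wci, Wcf, Wco and the small random initialisation are illustrative assumptions:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, p):
    """One step of the gated recurrence defined by the equations above."""
    i_t = sigmoid(p["Wxi"] @ x_t + p["Whi"] @ h_prev + p["Wci"] @ c_prev + p["bi"])
    f_t = sigmoid(p["Wxf"] @ x_t + p["Whf"] @ h_prev + p["Wcf"] @ c_prev + p["bf"])
    c_t = f_t * c_prev + i_t * np.tanh(p["Wxc"] @ x_t + p["Whc"] @ h_prev + p["bc"])
    o_t = sigmoid(p["Wxo"] @ x_t + p["Who"] @ h_prev + p["Wco"] @ c_t + p["bo"])
    h_t = o_t * np.tanh(c_t)
    return h_t, c_t

rng = np.random.default_rng(2)
d_in, d_h = 4, 3
# Weight matrices acting on x have shape (d_h, d_in); the rest are (d_h, d_h).
p = {k: rng.normal(scale=0.1, size=(d_h, d_in if k[1] == "x" else d_h))
     for k in ["Wxi", "Whi", "Wci", "Wxf", "Whf", "Wcf",
               "Wxc", "Whc", "Wxo", "Who", "Wco"]}
p.update({k: np.zeros(d_h) for k in ["bi", "bf", "bc", "bo"]})

h, c = np.zeros(d_h), np.zeros(d_h)
for x_t in rng.normal(size=(5, d_in)):   # a sequence of 5 character vectors
    h, c = lstm_step(x_t, h, c, p)
print(h.shape)  # (3,)
```

A bidirectional layer runs this recurrence once left to right and once right to left and concatenates the two hidden states per position.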
In S3, the CNN layer is provided with a second CNN module, and the multi-scale local context feature fusion representation M3 is given by:
M3 = A2(b2 + C2(H))
where H is the context representation, C2(·) is the second CNN module, A2 is the second activation function, and b2 is the bias of the second CNN module.
In this embodiment, the structure of the second CNN module is shown in Fig. 5. Inputting the context representation into the second CNN module strengthens semantic correlations and generates the multi-scale local context feature fusion representation.
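One common way to realise a multi-scale module such as C2(·) is to run several convolution branches with different kernel widths over the context representation H and concatenate their outputs. The kernel widths (2, 3, 5), ReLU activation, and output size below are illustrative assumptions, as the patent does not specify them:

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def conv1d_same(H, W, b):
    """'Same'-padded 1-D convolution over the context sequence H: (T, d_in)."""
    k = W.shape[0]
    pad = k // 2
    Hp = np.pad(H, ((pad, k - 1 - pad), (0, 0)))
    return np.stack([np.tensordot(Hp[t:t + k], W, axes=([0, 1], [0, 1])) + b
                     for t in range(H.shape[0])])

rng = np.random.default_rng(3)
T, d_in, d_out = 6, 4, 5
H = rng.normal(size=(T, d_in))             # context representation from the BiLSTM

# M3 = A2(b2 + C2(H)): one branch per kernel width, outputs concatenated
branches = []
for k in (2, 3, 5):                        # multi-scale kernel widths (illustrative)
    W = rng.normal(scale=0.1, size=(k, d_in, d_out))
    b2 = np.zeros(d_out)
    branches.append(relu(conv1d_same(H, W, b2)))
M3 = np.concatenate(branches, axis=1)      # (T, 3 * d_out) fused representation
print(M3.shape)  # (6, 15)
```

Each branch captures local context at a different span, and concatenation fuses them into one per-position feature vector.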
The multi-scale local context feature fusion representation is input into the CRF layer, which completes the sequence labeling task and thereby the named entity recognition for Chinese culture and tourism.
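Sequence labeling with a CRF layer is typically decoded with the Viterbi algorithm over per-position emission scores and tag-transition scores. A minimal sketch follows; the toy BIO tags and hand-set scores are illustrative, and in practice the transition matrix is learned during training:

```python
import numpy as np

def viterbi_decode(emissions, transitions):
    """Most likely tag sequence given per-position and transition scores.
    emissions: (T, n_tags), transitions: (n_tags, n_tags)."""
    T, n = emissions.shape
    score = emissions[0].copy()
    back = np.zeros((T, n), dtype=int)
    for t in range(1, T):
        # cand[i, j]: best score ending in tag i at t-1 then tag j at t
        cand = score[:, None] + transitions + emissions[t][None, :]
        back[t] = cand.argmax(axis=0)
        score = cand.max(axis=0)
    best = [int(score.argmax())]
    for t in range(T - 1, 0, -1):          # follow back-pointers
        best.append(int(back[t][best[-1]]))
    return best[::-1]

# Toy BIO tagging: tags 0=O, 1=B, 2=I; the transition O -> I is penalised
trans = np.zeros((3, 3))
trans[0, 2] = -10.0
emis = np.array([[0.1, 2.0, 0.0],          # position favours B
                 [0.0, 0.1, 1.5],          # position favours I
                 [1.0, 0.2, 0.3]])         # position favours O
print(viterbi_decode(emis, trans))         # [1, 2, 0]
```

The transition scores let the decoder enforce label-sequence constraints (such as I never directly following O) that per-position classification alone cannot.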
The beneficial effects of the invention are as follows: the named entity recognition method for Chinese culture and tourism provided by the invention addresses the lack of research attention to named entity recognition for Chinese tourism by building a network specifically for Chinese culture and tourism text data. In the character embedding layer, the first CNN module learns a radical-level embedding representation tailored to Chinese, yielding a character vector representation beneficial to Chinese recognition; in the CNN layer, the second CNN module learns a multi-scale local context feature fusion representation, strengthening semantic correlations and further improving the feature representation beneficial to Chinese recognition.
In describing the present invention, it should be understood that terms indicating orientation or positional relationships, such as "center", "thickness", "upper", "lower", "horizontal", "top", "bottom", "inner", "outer", and "radial", are based on the orientations or positional relationships shown in the drawings, are used only for convenience and simplicity of description, and do not indicate or imply that the referred devices or elements must have a specific orientation or be constructed and operated in a specific orientation; they therefore cannot be construed as limiting the invention. In addition, the terms "first", "second", and "third" are used for descriptive purposes only and cannot be understood as indicating or implying relative importance or implicitly specifying the number of technical features; thus, a feature defined by "first", "second", or "third" may explicitly or implicitly include one or more such features.
Claims (8)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310560194.XA CN116579343B (en) | 2023-05-17 | 2023-05-17 | A named entity recognition method for Chinese culture and tourism |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310560194.XA CN116579343B (en) | 2023-05-17 | 2023-05-17 | A named entity recognition method for Chinese culture and tourism |
Publications (2)
Publication Number | Publication Date |
---|---|
CN116579343A true CN116579343A (en) | 2023-08-11 |
CN116579343B CN116579343B (en) | 2024-06-04 |
Family
ID=87543867
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310560194.XA Active CN116579343B (en) | 2023-05-17 | 2023-05-17 | A named entity recognition method for Chinese culture and tourism |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116579343B (en) |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2021114745A1 (en) * | 2019-12-13 | 2021-06-17 | 华南理工大学 | Named entity recognition method employing affix perception for use in social media |
US20210216862A1 (en) * | 2020-01-15 | 2021-07-15 | Beijing Jingdong Shangke Information Technology Co., Ltd. | System and method for semantic analysis of multimedia data using attention-based fusion network |
CN113408289A (en) * | 2021-06-29 | 2021-09-17 | 广东工业大学 | Multi-feature fusion supply chain management entity knowledge extraction method and system |
CN114118099A (en) * | 2021-11-10 | 2022-03-01 | 浙江工业大学 | A Chinese automatic question answering method based on radical features and multi-layer attention mechanism |
CN114781380A (en) * | 2022-03-21 | 2022-07-22 | 哈尔滨工程大学 | Chinese named entity recognition method, equipment and medium fusing multi-granularity information |
CN115455955A (en) * | 2022-10-18 | 2022-12-09 | 昆明理工大学 | Chinese named entity recognition method based on local and global character representation enhancement |
CN115600597A (en) * | 2022-10-18 | 2023-01-13 | 淮阴工学院(Cn) | Named entity identification method, device and system based on attention mechanism and intra-word semantic fusion and storage medium |
CN115688782A (en) * | 2022-10-26 | 2023-02-03 | 成都理工大学 | Named entity recognition method based on global pointer and countermeasure training |
- 2023-05-17: CN application CN202310560194.XA filed; granted as patent CN116579343B (en), status: Active
Patent Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2021114745A1 (en) * | 2019-12-13 | 2021-06-17 | 华南理工大学 | Named entity recognition method employing affix perception for use in social media |
US20210216862A1 (en) * | 2020-01-15 | 2021-07-15 | Beijing Jingdong Shangke Information Technology Co., Ltd. | System and method for semantic analysis of multimedia data using attention-based fusion network |
CN113408289A (en) * | 2021-06-29 | 2021-09-17 | 广东工业大学 | Multi-feature fusion supply chain management entity knowledge extraction method and system |
CN114118099A (en) * | 2021-11-10 | 2022-03-01 | 浙江工业大学 | A Chinese automatic question answering method based on radical features and multi-layer attention mechanism |
CN114781380A (en) * | 2022-03-21 | 2022-07-22 | 哈尔滨工程大学 | Chinese named entity recognition method, equipment and medium fusing multi-granularity information |
CN115455955A (en) * | 2022-10-18 | 2022-12-09 | 昆明理工大学 | Chinese named entity recognition method based on local and global character representation enhancement |
CN115600597A (en) * | 2022-10-18 | 2023-01-13 | 淮阴工学院(Cn) | Named entity identification method, device and system based on attention mechanism and intra-word semantic fusion and storage medium |
CN115688782A (en) * | 2022-10-26 | 2023-02-03 | 成都理工大学 | Named entity recognition method based on global pointer and countermeasure training |
Non-Patent Citations (1)
Title |
---|
CHANG Yan et al.: "A negative-sample recommendation method combining path semantics and feature extraction", Journal of Chinese Computer Systems (小型微型计算机系统) *
Also Published As
Publication number | Publication date |
---|---|
CN116579343B (en) | 2024-06-04 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110765775B (en) | A Domain Adaptation Method for Named Entity Recognition Fusing Semantics and Label Differences | |
Alwehaibi et al. | Comparison of pre-trained word vectors for arabic text classification using deep learning approach | |
Yu et al. | An attention mechanism and multi-granularity-based Bi-LSTM model for Chinese Q&A system | |
Xu et al. | An overview of deep generative models | |
Mehmood et al. | A precisely xtreme-multi channel hybrid approach for roman urdu sentiment analysis | |
CN110647612A (en) | Visual conversation generation method based on double-visual attention network | |
CN107423284A (en) | Merge the construction method and system of the sentence expression of Chinese language words internal structural information | |
Mahadevkar et al. | Exploring AI-driven approaches for unstructured document analysis and future horizons | |
Patel et al. | Deep learning for natural language processing | |
Alsaaran et al. | Arabic named entity recognition: A BERT-BGRU approach | |
CN112541356A (en) | Method and system for recognizing biomedical named entities | |
CN110427608A (en) | A kind of Chinese word vector table dendrography learning method introducing layering ideophone feature | |
CN114065761A (en) | Recognition method of Chinese nomenclature based on lexical enhancement | |
Salima et al. | An ontology-based approach to enhance explicit aspect extraction in standard Arabic reviews | |
Shu et al. | Investigating lstm with k-max pooling for text classification | |
CN112784602A (en) | News emotion entity extraction method based on remote supervision | |
CN109190112B (en) | Patent classification method, system and storage medium based on dual-channel feature fusion | |
Chandrasekaran et al. | Sarcasm Identification in text with deep learning models and Glove word embedding | |
CN116579343B (en) | A named entity recognition method for Chinese culture and tourism | |
CN117688152A (en) | A data legal question and answer system, electronic device and platform | |
CN114818711B (en) | Multi-information fusion named entity recognition method based on neural network | |
Netisopakul et al. | A survey of Thai knowledge extraction for the semantic web research and tools | |
CN116976355A (en) | Image-text mode-oriented self-adaptive Mongolian emotion analysis method | |
Tachicart et al. | Effective techniques in lexicon creation: Moroccan arabic focus | |
Sun et al. | Chinese microblog sentiment classification based on deep belief nets with extended multi-modality features |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |