CN112883737A - Robot language instruction analysis method and system based on Chinese named entity recognition - Google Patents


Info

Publication number
CN112883737A
CN112883737A
Authority
CN
China
Prior art keywords
named entity
Chinese
instruction
matrix
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110236088.7A
Other languages
Chinese (zh)
Other versions
CN112883737B (en)
Inventor
许庆阳
姜聪
周瑞
李贻斌
张承进
宋勇
袁宪锋
庞豹
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shandong University
Original Assignee
Shandong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shandong University
Priority to CN202110236088.7A
Publication of CN112883737A
Application granted
Publication of CN112883737B
Status: Active

Links

Images

Classifications

    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/295 Named entity recognition
    • G06F40/253 Grammatical analysis; Style critique
    • G06N3/045 Combinations of networks
    • G06N3/048 Activation functions
    • G06N3/049 Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
    • G06N3/084 Backpropagation, e.g. using gradient descent


Abstract

The invention discloses a robot language instruction analysis method and system based on Chinese named entity recognition, comprising the following steps: acquiring Chinese text information based on input instruction content; extracting text features and enhancing them; inputting the enhanced features into a named entity recognition model, generating a score for each Chinese character belonging to each named entity category, constructing a repositioning matrix, using the repositioning matrix for entity category reasoning, and outputting the named entity category attribute of each Chinese character in a self-supervised manner; and driving the robot to execute the corresponding instruction based on the extracted named entities. The invention identifies Chinese named entities with a self-supervised learning mechanism, freeing the network entirely from dependence on manually labeled datasets.

Description

Robot language instruction analysis method and system based on Chinese named entity recognition
Technical Field
The invention relates to the technical field of speech recognition, in particular to a robot language instruction analysis method and system based on Chinese named entity recognition.
Background
The statements in this section merely provide background information related to the present disclosure and may not necessarily constitute prior art.
One of the core tasks in robot voice control is to parse the language command, extract useful information, and control the robot body accordingly. Named entity recognition is an important tool for extracting language information. Named entities refer to the names of real-world objects, such as people, organizations, and places. Recognizing named entities in text is the basis for understanding its deeper meaning and, as a basic task, supports many downstream natural language processing applications, such as relation extraction, text understanding, information extraction, machine translation, and entity corpus construction. Traditional named entity recognition models fall into three main types: rule-based learning methods, feature-based supervised learning methods, and unsupervised learning methods. Traditional named entity recognition was mainly rule-based. With the development of the field, the NER task became dominated by supervised named entity recognition methods, but most supervised networks must be trained on large-scale manually labeled datasets, which are costly to obtain. Identifying named entities through unsupervised or self-supervised learning allows the model to be trained without a labeled dataset; the key to training such a model is how to provide an accurate learning direction or classification basis in an unsupervised or self-supervised manner.
There were two solutions for early unsupervised named entity recognition: one constructs a common dictionary from a small amount of known data and uses it as a clustering center to provide a classification basis for the model; the other presets seed rules, using basic rules containing prior information such as grammatical cues or special clue words as the word classification standard and clustering basis. After providing clustering centers or classification bases through prior information, both approaches mostly obtain data structure and distribution characteristics by computing the contextual similarity of words, and extract named entities from unlabeled data. Notably, in either approach, the core is mostly coarse-grained extraction of named entities by list lookup or pattern matching. Currently popular unsupervised named entity recognition methods can be divided into discriminative and generative: discriminative models build on traditional methods and perform fine-grained extraction of named entities by designing more reasonable metrics; generative models realize the optimal subdivision of entity classes with the highest generation probability through model design. Through researchers' efforts, unsupervised named entity recognition has made some breakthroughs, but progress in Chinese named entity extraction has been slow because sentences have no obvious word boundaries. In addition, since unsupervised models need to incorporate enough context information, some application fields, such as robot language command parsing, often cannot provide enough context because the instructions are simple and the vocabulary is small.
Disclosure of Invention
In order to solve these problems, the invention provides a robot language instruction analysis method and system based on Chinese named entity recognition, which avoids complex parameter training, feature presetting and rule construction for the model, and thereby eliminates dependence on large-scale manually labeled datasets.
In some embodiments, the following technical scheme is adopted:
the robot language instruction analysis method based on Chinese named entity recognition comprises the following steps:
acquiring Chinese text information based on input instruction content;
extracting text features and enhancing the features;
inputting the enhanced features into a self-supervised Chinese named entity recognition model, generating a score for each Chinese character belonging to each named entity category, constructing a repositioning matrix, generating a "repeat" instruction according to the repositioning matrix, using the instruction for entity category inference, and outputting the named entity category attribute of each Chinese character in a self-supervised manner;
and driving the robot to execute corresponding instructions based on the extracted named entities.
Wherein the input instruction comprises a voice input instruction or a Chinese text input instruction.
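The four steps above can be sketched as a minimal pipeline. All function names and bodies below are toy stand-ins for illustration, not the patent's implementation:

```python
# Minimal, self-contained sketch of the four-step method above.
# Every function body is a toy stand-in, not the patent's implementation.

def acquire_text(raw, is_speech=False):
    # Step 1: a speech instruction would first be transcribed to text;
    # a Chinese text instruction passes through unchanged.
    return "<transcribed text>" if is_speech else raw

def enhance_features(text):
    # Step 2: stand-in for the pinyin/radical/part-of-speech/word-boundary
    # fusion -- here each character just becomes a one-entry feature dict.
    return [{"char": ch} for ch in text]

def recognize_entities(features):
    # Step 3: stand-in for the self-supervised NER model -- tags every
    # character with the dummy class "O" instead of a learned category.
    return [(f["char"], "O") for f in features]

def parse_instruction(raw, is_speech=False):
    # Step 4 (driving the robot) would consume this (char, tag) list.
    text = acquire_text(raw, is_speech)
    return recognize_entities(enhance_features(text))
```

In the real system the (character, entity-tag) pairs would be handed to the robot control module.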
In other embodiments, the following technical solutions are adopted:
the robot language instruction analysis system based on Chinese named entity recognition comprises:
the text information acquisition module is used for acquiring Chinese text information based on the input instruction content;
the characteristic enhancement module is used for extracting text characteristics and enhancing the characteristics;
the word segmentation module is used for inputting the enhanced features into a self-supervised Chinese named entity recognition model, generating a score for each Chinese character belonging to each named entity category, constructing a repositioning matrix, generating a "repeat" instruction according to the repositioning matrix, using the instruction for entity category inference, and outputting the named entity category attribute of each Chinese character in a self-supervised manner;
and the robot control module is used for driving the robot to execute a corresponding instruction based on the extracted named entity.
In other embodiments, the following technical solutions are adopted:
a terminal device comprising a processor and a memory, the processor being configured to execute instructions, and the memory being used for storing a plurality of instructions adapted to be loaded by the processor so as to execute the above robot language instruction analysis method based on Chinese named entity recognition.
Compared with the prior art, the invention has the beneficial effects that:
the invention uses the self-supervision learning mechanism to identify the Chinese named entity, which makes our network completely get rid of the dependence on the manually labeled data set.
The invention creates a novel learning rule that allows the model to learn from a binary comparison result alone, without requiring an accurate learning direction as in the traditional back-propagation algorithm.
In the construction subsystem for "repeat" instructions, the invention adopts a position-information-matrix construction rule that is independent of static graphs and needs no learning-based approximation toward a target as in a Gumbel-Sinkhorn network, which in theory makes the model simpler and faster.
Experiments are carried out on a tracked robot in combination with a YOLO-V4 target detection network; the robot can find and grasp a target object according to the named entities extracted by SCNER.
Additional features and advantages of the invention will be set forth in part in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention.
Drawings
FIG. 1 is a schematic diagram of a robot language instruction analysis method based on Chinese named entity recognition according to an embodiment of the present invention;
FIG. 2 is a diagram illustrating an embodiment of an auto-supervised Chinese named entity recognition model;
FIGS. 3(a) - (c) are schematic diagrams of a fractional matrix and a position information matrix of different iteration cycles in an embodiment of the present invention, respectively;
FIGS. 4(a) - (b) are schematic diagrams of loss and logits curves and logits matrices without rule constraints in an embodiment of the present invention;
FIGS. 5(a) - (b) are schematic diagrams of loss and logits curves and logits matrices after network degradation in an embodiment of the invention;
FIG. 6 is a schematic diagram of a training data set of a target detection network in an embodiment of the present invention;
FIG. 7 is a schematic view of a tracked robot in an embodiment of the present invention;
FIGS. 8(a) - (b) are diagrammatic views of a training process;
FIGS. 9(a) - (b) are schematic diagrams of training results after self-supervised training;
fig. 10(a) - (b) are schematic diagrams of the test environment and the robot motion lens, respectively.
Detailed Description
It should be noted that the following detailed description is exemplary and is intended to provide further explanation of the disclosure. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs.
It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of example embodiments according to the present application. As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, and it should be understood that when the terms "comprises" and/or "comprising" are used in this specification, they specify the presence of stated features, steps, operations, devices, components, and/or combinations thereof, unless the context clearly indicates otherwise.
Example one
How can efficient and accurate learning and understanding of language, as in humans, be achieved? Linguistic research has found that in early infancy, children learn to speak and understand language through imitation. Hearing an adult's voice, an infant imitates the same output, while comparing its own utterance with the adult's and correcting it in order to learn to speak. The process of learning to recognize entity objects is similar. This process is consistent with the speech chain proposed by Denes (see fig. 1), except that here both speaker and listener are the infant itself. It is also worth noting that imitative learning, whether of speaking or of recognizing objects, is usually an iterative process requiring multiple interactions between the infant and a "caregiver". This imitation-learning mechanism motivated our interest and the idea of applying it to named entity recognition.
When the robot obtains an instruction (the real instruction), it extracts word segments with the named entity recognition model and constructs a "repeat instruction" from them. Since the network is untrained, the named entities extracted at the beginning are almost completely wrong, and the "repeat instructions" constructed from them differ greatly from the real instruction. But after simple training with this difference as the loss function, the robot "understands" the real instruction, i.e., it correctly extracts the named entities and constructs a "repeat instruction" consistent with the real instruction. Based on these correctly extracted named entities, we can drive the robot to move as intended.
According to the embodiment of the invention, an embodiment of a robot language instruction analysis method based on Chinese named entity recognition is disclosed, referring to fig. 1, the method comprises the following processes:
(1) acquiring Chinese text information based on input instruction content;
specifically, the input instruction may be a voice instruction, and at this time, the voice input needs to be converted into text information; when the input command is text information, conversion is not needed.
(2) Extracting text features and enhancing the features;
specifically, the sequence is subjected to feature enhancement and fusion through a feature extraction and enhancement module, and the fused combined features are mapped into a feature sequence through simple statistics.
In a named entity recognition model, a very important hidden feature learned by the network is context. In the self-supervised Chinese named entity recognition model (SCNER), the kind of context obtained by large-scale pre-training is not available. Compared with other languages, Chinese sentences have unique grammar rules, and Chinese characters carry multiple descriptive attributes such as pinyin and radicals; feature enhancement can be realized by creating feature sequences from this unique language structure, which establishes contextual relations to a certain extent.
In this embodiment, χ_c = [x_c0, …, x_cn] represents a sentence, where x_ci is the i-th Chinese character, and F = [f_0, …, f_n] represents the corresponding feature sequence of the sentence. Given a sentence χ_c, the purpose of feature enhancement is to establish a feature sequence set Ψ by fusing other attributes of the Chinese characters, and to map it to a set of feature vectors ψ_F. Here, five-dimensional attributes are used for feature enhancement, so the feature sequence set of an output sentence contains five elements:

Ψ = {χ_c, χ_py, χ_rad, χ_pos, χ_wb}

where χ_c is the Chinese character sequence, χ_py is the pinyin sequence corresponding to the sentence, χ_rad is the radical sequence, χ_pos is the part-of-speech sequence, and χ_wb is the word boundary sequence. Word boundary sequence generation is based on the 4-tag method of Deng et al., which divides the positions of Chinese words over [B (begin), M (middle), E (end), S (single)]. In contrast to prior work, no pre-trained word vector embedding model is used in the feature embedding process. The feature embedding can be expressed as:

ψ_F = W^(l) φ(X) + b^(l)

where X ∈ R^(n×d) represents the original input sequence of Chinese characters containing all feature information, and d represents the feature dimension. The function φ(·) represents a statistical-dictionary-based encoding mapping, W^(l) is the weight matrix of a linear transformation, and b^(l) is a bias vector; these are part of the self-supervised closed loop and are generated in end-to-end training without pre-training.
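The 4-tag word-boundary labeling used above can be illustrated in a few lines of Python (the function name and input format are illustrative; the patent only specifies the B/M/E/S scheme):

```python
def bmes_tags(words):
    """Assign a B/M/E/S boundary tag to each character of a segmented
    sentence, following the 4-tag scheme: B = begin, M = middle,
    E = end, S = single-character word."""
    tags = []
    for w in words:
        if len(w) == 1:
            tags.append("S")           # single-character word
        else:
            # first char B, last char E, any chars in between M
            tags.extend(["B"] + ["M"] * (len(w) - 2) + ["E"])
    return tags
```

For example, the segmented command ["去", "冰箱", "拿", "苹果"] ("go to the refrigerator and take an apple") yields the boundary sequence S B E S B E.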
(3) Inputting the enhanced features into the self-supervised Chinese named entity recognition model (SCNER), generating a score for each Chinese character belonging to each named entity category, constructing a repositioning matrix, generating a "repeat" instruction according to the repositioning matrix, using the instruction for entity category inference, and outputting the named entity category attribute of each Chinese character in a self-supervised manner;
specifically, the structure of the self-supervised chinese named entity recognition model is shown in fig. 2, and includes: the system comprises a text sequence module, a feature enhancement module and a word segmentation module; the text sequence module and the feature enhancement module are explained above; the segmentation module includes a Named Entity Recognition Subsystem (NERS) and an instruction construction subsystem (IGS).
The text sequence undergoes feature enhancement and fusion through the feature enhancement module, and the fused combined features are mapped into a feature sequence through simple statistics. The feature sequence is then fed into the named entity recognition subsystem for processing. The subsystem generates a score for each Chinese character belonging to each named entity category, and this result is used in two directions: generating a relocation matrix, and named entity class inference. The repositioning matrix and the score matrix contain the same position information, but simply using the raw score matrix as the repositioning matrix cannot eliminate the mutual influence among the Chinese characters. We eliminate this mutual influence by constructing a repositioning matrix similar in form to a permutation matrix. In this way the position information in the score matrix can be used accurately to generate the "repeat" instruction, while back-propagation still follows the influence of the original score matrix. The model can therefore apply the position information independently to each Chinese character when creating the "repeat" instruction, and can also use error back-propagation for self-supervised learning, which it completes through multiple interactions. Finally, the score matrix is passed directly through an inference layer for entity class reasoning, and the named entity class attribute of each Chinese character is output.
As shown in FIG. 2, the self-supervised word segmentation module trains the model in a manner similar to human imitative learning. On top of a traditional named entity recognition model, an instruction construction subsystem is added at the back end to generate the "repeat" instruction. The IGS and the inference layer work in parallel, but during self-supervised training only the effect of the IGS and the "repeat" instruction on the network matters, while in the named entity extraction stage all attention is focused on the result of the inference layer.
Named entity recognition subsystem
The key to robot language instruction parsing is the extraction of named entities. Traditional methods usually use a Bi-LSTM network as the feature extraction layer, establish strong context relations in the form of a state transition matrix through a conditional random field (CRF) during supervised learning, and perform inference with a Viterbi decoding module. A similar model structure is adopted here, but since a state transition matrix containing strong context relations cannot be obtained without supervised training, the CRF module is dropped from the named entity model. The named entity recognition subsystem is designed as two layers of bidirectional LSTM networks; each layer uses 100 LSTM cores, and each core has the general structure in which three gates control the input proportion of new information and the forgetting proportion of the previous period. For the one-dimensional feature sequence ψ'_t obtained by flattening the feature vector set ψ_F at time t, the implementation is as follows:

i_t = Sigmoid(W'_i h_{t-1} + U_i ψ'_t + b'_i)
f_t = Sigmoid(W'_f h_{t-1} + U_f ψ'_t + b'_f)
c̃_t = tanh(W'_c h_{t-1} + U_c ψ'_t + b'_c)
c_t = f_t ⊙ c_{t-1} + i_t ⊙ c̃_t
o_t = Sigmoid(W'_o h_{t-1} + U_o ψ'_t + b'_o)
h_t = o_t ⊙ tanh(c_t)

where ⊙ denotes the element-wise product; i_t, f_t, o_t and c_t denote the input gate, forget gate, output gate and memory cell, respectively; h_t denotes the hidden state vector, which stores useful information at and before time t; U denotes the weight matrix applied to the input in each gate unit, W' the weight matrix applied to the hidden state h_{t-1}, and b' the bias vector.
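A minimal NumPy sketch of one LSTM step matching the gate equations above (the parameter dictionary layout and names are illustrative, not the patent's code):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, P):
    """One LSTM step matching the gate equations above. P is a dict of
    parameters: the W* matrices act on the hidden state h_{t-1}, the U*
    matrices on the input feature vector, and b* are bias vectors."""
    i = sigmoid(P["Wi"] @ h_prev + P["Ui"] @ x_t + P["bi"])      # input gate i_t
    f = sigmoid(P["Wf"] @ h_prev + P["Uf"] @ x_t + P["bf"])      # forget gate f_t
    c_hat = np.tanh(P["Wc"] @ h_prev + P["Uc"] @ x_t + P["bc"])  # candidate cell
    c = f * c_prev + i * c_hat                                   # memory cell c_t
    o = sigmoid(P["Wo"] @ h_prev + P["Uo"] @ x_t + P["bo"])      # output gate o_t
    h = o * np.tanh(c)                                           # hidden state h_t
    return h, c
```

A bidirectional layer would run this recurrence forward and backward over the feature sequence and concatenate the hidden states.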
Language instruction generating system
In the named entity recognition subsystem, the front-end Bi-LSTM layer transmits global information to the back end, not just the state at the last moment. The global information is the final output O_end of the model, comprising the results [O_0, O_1, …, O_t, …] of the LSTM cores at each time step, all of which are passed into the back-end Bi-LSTM layer. This ensures that as little information as possible is lost, while the hidden-layer feature sequence reflects the correlation between preceding and following characters to some extent. Based on this structure, the model processes the input and generates, for each Chinese character, a score sequence s_i ∈ R^d over the named entity classes, where 0 ≤ i ≤ n and d denotes the number of named entity categories. Integrating all score sequences yields the score matrix S of named entity classes corresponding to the input sentence. The input sentence is designed according to Chinese grammar rules, and a position information matrix is used to reconstruct the input instruction sentence to obtain the "repeat" instruction. The position information matrix M is obtained as:

M = φ_p(S̄)

where the function φ_p(·) represents discrete position information extraction, and S̄ represents the score matrix after suppressing the influence of non-target named entities.
FIG. 3(a) shows the score matrix and the position matrix in the initial state; FIG. 3(b) after 50 iterations; FIG. 3(c) after 100 iterations. As shown in FIGS. 3(a) - (c), compared with the position information matrix, the energy of the original score matrix is distributed across all cells. Transforming the input instruction is a weighted summation of the input sequence over all named entity class components, so operating directly with the score matrix would inevitably confuse the "repeat" instruction, because the information of each character in the input instruction could not be transferred to the "repeat" instruction independently and completely. This embodiment therefore designs a dedicated position-information-matrix construction rule to transfer input-instruction information to the "repeat" instruction independently: a discrete operation φ_p(·) outside the network extracts the ideal position information matrix for the forward pass, while back-propagation uses the original channel of the score matrix. This can be expressed as:

∂L/∂S̄ := ∂L/∂M

where ∂L/∂M represents the influence, during parameter learning, of the position information extracted from the score matrix S̄ by the discrete operation.
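The discrete-forward / score-backward scheme can be sketched as follows. Only the forward hard-selection step is executable here; the straight-through backward trick is noted in a comment. Function names are assumptions, not the patent's code:

```python
import numpy as np

def position_matrix(scores):
    """Collapse a soft score matrix (characters x entity classes) into a
    hard 0/1 position matrix: each row keeps only its arg-max class,
    analogous to one row of a permutation matrix, so each character's
    information passes into the "repeat" instruction independently."""
    hard = np.zeros_like(scores)
    hard[np.arange(scores.shape[0]), scores.argmax(axis=1)] = 1.0
    return hard

# Straight-through idea for training (in an autodiff framework):
#   out = scores + stop_gradient(position_matrix(scores) - scores)
# The forward pass then uses the hard matrix, while gradients flow
# through the original score-matrix channel.
```

Each row of the result sums to 1, so the subsequent weighted summation moves exactly one character's information per row.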
Learning rules
By adding corresponding rules to the basic back-propagation method, the SCNER model can learn from the difference between the input instruction and the "repeat" instruction within the self-supervised closed loop, without an explicit learning direction, and thus achieve named entity recognition. Each self-supervised extraction by SCNER can be viewed as the network receiving only one sample at a time for learning. Because the input instruction is unique, the network degrades rapidly during self-supervised training. As shown in FIGS. 4(a) - (b), without rule constraints the loss value, which represents the degree of difference between the input instruction and the "repeat" instruction, jumps to its maximum, and the values of the logits matrix rapidly degrade toward negative infinity.
To solve this problem, constraints are added to the SCNER. First, the influence of non-target named entities in the original score matrix S is suppressed, yielding the matrix S̄, so that the SCNER model can learn from the position information matrix in discrete form. However, simply suppressing the influence of other information is not sufficient. As shown in FIGS. 5(a) - (b), the network cannot reach a convergence state during training; suppression of non-target named entities only delays degradation, and the network still degrades as training progresses. In the general batch learning paradigm the model does not have this trouble, one reason being that the network receives positive and negative samples and learns in a different direction for each. When the proportion of positive and negative samples is close, the numerical state of the model maintains a dynamic balance. Therefore, a balance rule is also added to the model in the self-supervised learning paradigm so that the score matrix S̄ remains stable, implemented with the balance term:

ℛ(α · J_{n×d} − S̄)

where α is a balance factor, ℛ(·) denotes the linear rectification unit (ReLU), and J_{n×d} denotes an n × d constant matrix whose elements are all 1.
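One plausible reading of the balance rule, assuming the term is applied element-wise as ReLU(α · J − S̄) and added to the training loss (the exact placement is not spelled out in the text):

```python
import numpy as np

def balance_penalty(scores, alpha=0.12):
    """Element-wise ReLU(alpha * J - S): penalizes score entries that fall
    below the balance factor alpha (0.12 in the patent's experiments), so
    the score matrix cannot drift toward negative infinity during
    single-sample self-supervised training. Whether this is the exact
    form used in the patent is an assumption."""
    ones = np.ones_like(scores)              # the constant matrix J (all 1s)
    return np.maximum(alpha * ones - scores, 0.0)
```

Entries already above α contribute nothing, so the rule acts only as a floor against degradation.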
(4) And driving the robot to execute corresponding instructions based on the extracted named entities.
After self-supervised named entity recognition, the model outputs named entity objects, such as "B-Location" and "O-Object", which are used to drive the robot's movement. As shown in fig. 10(a), to eliminate the demands that the shape, size, opening mode and other properties of different "Location" targets place on robot motion control, the test process is simplified: signs printed with place names replace the real "Location" targets in the test environment. In the experiments, cross-combination tests are performed on different named entity targets; a motion sequence of the robot, for example for the command "go to the refrigerator and take an apple", is shown in fig. 10(b). After the robot enters the area of the "refrigerator" sign, it searches for the target using the YOLO-V4 network. When no "Object" is in the field of view, the robot switches position or pose to detect objects elsewhere in the current area. After the target object is found, the robot grasps it according to a preset motion scheme.
Experimental setup
In the experiments, the word segmentation network uses a two-layer bidirectional LSTM for feature extraction; each layer contains 100 LSTM units, and the bottom feature extraction layer outputs global feature information. The feature enhancement module performs enhancement and fusion on the input information and feeds the resulting feature sequence into the word segmentation network, from which named entities are extracted in a self-supervised manner. For the back-end target detection part, YOLO-V4 pre-trained on the VOC2007 dataset is used as the target detection model and fine-tuned on a fruit dataset (300 images in four categories: orange, apple, banana, and mixed; available on Baidu AI Studio), as shown in fig. 6. On the hardware side, behavior control verification experiments are performed on apples and oranges with the tracked robot shown in fig. 7. The server communicates with the robot over WIFI: the robot's main controller carries a WIFI router and connects via serial communication to a servo controller, which drives the joint servos of the manipulator; the robot also carries a camera. In the experiments, language analysis, target recognition, and related processing run on the server, while the robot is responsible for behavior execution, environment perception, and the like. Note that the robot's behavior patterns and the default path information in the environment are assumed known; this part can be further improved with semantic map building techniques.
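The feature extractor's wiring can be illustrated with a dependency-free sketch: two stacked bidirectional layers with 100 units each, matching the description, but with random weights rather than the trained network.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_params(input_dim, hidden_dim):
    # One stacked matrix covering the input, forget, output gates + candidate.
    return (rng.standard_normal((4 * hidden_dim, input_dim)) * 0.1,
            rng.standard_normal((4 * hidden_dim, hidden_dim)) * 0.1,
            np.zeros(4 * hidden_dim))

def lstm_run(xs, params, hidden_dim, reverse=False):
    W, U, b = params
    h = np.zeros(hidden_dim)
    c = np.zeros(hidden_dim)
    order = reversed(range(len(xs))) if reverse else range(len(xs))
    out = [None] * len(xs)
    for t in order:
        z = W @ xs[t] + U @ h + b
        i, f, o, g = np.split(z, 4)
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)  # forget / input gates
        h = sigmoid(o) * np.tanh(c)                   # output gate
        out[t] = h
    return out

def bilstm_layer(xs, hidden_dim):
    """One bidirectional layer: forward and backward passes, concatenated."""
    input_dim = len(xs[0])
    fwd = lstm_run(xs, lstm_params(input_dim, hidden_dim), hidden_dim)
    bwd = lstm_run(xs, lstm_params(input_dim, hidden_dim), hidden_dim,
                   reverse=True)
    return [np.concatenate([f, b]) for f, b in zip(fwd, bwd)]

# A 6-step sequence of 100-dim enhanced feature vectors through both layers.
seq = [rng.standard_normal(100) for _ in range(6)]
layer1 = bilstm_layer(seq, 100)   # each step: 200-dim (forward + backward)
layer2 = bilstm_layer(layer1, 100)
```

Each bidirectional layer doubles the per-step output dimension (forward plus backward hidden states), so a downstream scoring layer would map the 200-dim outputs to the d named entity classes.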
The experimental pipeline has the SCNER model perform named entity recognition on the input instruction in a self-supervised manner and then drive the robot's movement. In the named entity recognition stage, the model input is text, enhanced with four feature dimensions: pinyin, radical, word boundary, and part of speech. The enhanced feature sequence is mapped to a 100-dimensional word vector (each feature maps to 20 dimensions). The word-vector embedding model adopted here is not pre-trained; it is a mapping based on simple statistics of the input instruction. A mapping dictionary is initialized each time an instruction is input, and the mapping rules are likewise learned in self-supervised end-to-end learning without extra work. The loss function of the model is mean squared error, the learning rate is 1e-4, the balance factor is 0.12, and the number of self-supervised training iterations is set to 200; the training results are shown in Figs. 8(a)-(b).
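The per-instruction, non-pretrained embedding can be sketched as follows. The token streams and random-vector initialization are illustrative assumptions; the dimensions match the description (five feature streams, the character plus four enhancements, at 20 dims each = 100 dims).

```python
import numpy as np

def build_table(tokens, dim=20, seed=0):
    """A fresh mapping dictionary built from the current instruction's
    tokens (no pre-training): one dim-sized vector per distinct token."""
    rng = np.random.default_rng(seed)
    return {tok: rng.standard_normal(dim) for tok in sorted(set(tokens))}

def embed_instruction(features, dim=20):
    """features: feature name -> token sequence, all of equal length.
    Each stream is embedded to `dim` dims and concatenated per character."""
    tables = {name: build_table(seq, dim) for name, seq in features.items()}
    length = len(next(iter(features.values())))
    return np.stack([
        np.concatenate([tables[name][seq[i]] for name, seq in features.items()])
        for i in range(length)
    ])

features = {                     # illustrative tokens, not from the patent
    "char":     ["去", "冰", "箱"],
    "pinyin":   ["qu", "bing", "xiang"],
    "radical":  ["厶", "冫", "竹"],
    "boundary": ["S", "B", "E"],
    "pos":      ["v", "n", "n"],
}
vectors = embed_instruction(features)   # one 100-dim vector per character
```

Because the dictionary is rebuilt per instruction, the mapping carries no external lexical knowledge; any useful structure in the vectors has to emerge from the self-supervised loop itself.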
FIG. 8(a) plots accuracy during the self-supervised learning process, i.e., the percentage of characters in the input instruction whose named entity labels are recognized correctly. In the initial state, the model's recognition accuracy on the named entities of the input instruction is only 0%-20%. As self-supervised learning proceeds, the model's accuracy over all characters gradually improves, and a stable, accurate recognition capability is reached after 110 epochs. FIG. 8(b) shows samples of the model's named entity recognition results during training, recording the named entity sequences generated by SCNER at different training epochs. Labels with a dark background are recognized correctly, while unshaded labels are recognition errors; the displayed content is consistent with the accuracy curve.
The model's state when it receives an instruction is randomly initialized, so the initial named entity recognition results are poor. After approximately 100 epochs of self-supervised training, the model converges quickly and maintains stable results over the subsequent training epochs. Correspondingly, the model's Logits matrix gradually transitions from a completely disordered state to an ordered one, as shown in figs. 9(a)-(b). Because the position information matrix is obtained from the Logits matrix through a discrete operation, whether or not the negative scores of the non-target named entity categories grow, their effect on the "repeat" instruction does not change.
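The discrete operation that maps the Logits matrix to the position information matrix can be sketched as a row-wise argmax converted to one-hot; this specific form is an assumption, chosen to be consistent with the invariance just described.

```python
import numpy as np

def position_matrix(logits):
    """Discretize an (n, d) Logits matrix into a 0/1 position matrix with a
    single 1 per row at the highest-scoring named entity class. Changes to
    non-maximal entries cannot alter the result."""
    n, d = logits.shape
    pos = np.zeros((n, d), dtype=int)
    pos[np.arange(n), logits.argmax(axis=1)] = 1
    return pos

logits = np.array([[0.1, 2.3, -0.4],
                   [1.7, 0.2,  0.0]])
pos = position_matrix(logits)

# Pushing non-target scores further down leaves the discrete result intact.
perturbed = position_matrix(logits - np.array([[0.0, 0.0, 5.0],
                                               [0.0, 3.0, 0.0]]))
```

This is why the "repeat" instruction reconstructed from the position matrix is insensitive to how negative the suppressed non-target scores become.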
Example two
According to an embodiment of the present invention, a robot language instruction analysis system based on Chinese named entity recognition is disclosed, comprising:
the text information acquisition module is used for acquiring Chinese text information based on the input instruction content;
the characteristic enhancement module is used for extracting text characteristics and enhancing the characteristics;
the word segmentation module is used for inputting the enhanced features into a self-supervised Chinese named entity recognition model, generating a score of each Chinese character belonging to each named entity category, constructing a repositioning matrix, generating a 'repeat' instruction according to the repositioning matrix, using the instruction for entity category inference, and outputting the named entity category attribute of each Chinese character in a self-supervised manner;
and the robot control module is used for driving the robot to execute a corresponding instruction based on the extracted named entity.
It should be noted that specific implementation manners of the modules are already described in detail in the first embodiment, and are not described again.
Example three
According to an embodiment of the present invention, an embodiment of a terminal device is disclosed, which includes a processor and a memory, the processor being configured to implement instructions; the memory is used for storing a plurality of instructions which are suitable for being loaded by the processor and executing the robot language instruction analysis method for Chinese named entity recognition in the first embodiment.
Although the embodiments of the present invention have been described with reference to the accompanying drawings, they do not limit the scope of the present invention. Those skilled in the art should understand that various modifications and variations that can be made without inventive effort on the basis of the technical solution of the present invention still fall within the protection scope of the present invention.

Claims (10)

1. The robot language instruction analysis method based on Chinese named entity recognition is characterized by comprising the following steps:
acquiring Chinese text information based on input instruction content;
extracting text features and enhancing the features;
inputting the enhanced features into a self-supervised Chinese named entity recognition model, generating a score of each Chinese character belonging to each named entity category, constructing a repositioning matrix, generating a 'repeat' instruction according to the repositioning matrix, using the instruction for entity category inference, and outputting the named entity category attribute of each Chinese character in a self-supervised manner;
and driving the robot to execute corresponding instructions based on the extracted named entities.
2. The method of claim 1, wherein the input command comprises a voice input command or a Chinese text input command.
3. The method for analyzing robot language instruction based on Chinese named entity recognition according to claim 1, wherein the extracting text features and performing feature enhancement specifically comprises:
given a sentence consisting of Chinese characters x₁, x₂, …, xₙ, the output feature sequence set comprises: the Chinese character sequence, the pinyin sequence corresponding to the sentence, the radical sequence, the part-of-speech sequence, and the word boundary sequence; during feature embedding, word vectors which are not pre-trained are embedded into the model.
4. The method for analyzing robot language instruction based on Chinese named entity recognition as claimed in claim 1, wherein the named entity recognition model comprises a two-layer bidirectional LSTM network, each layer comprising a number of LSTM cores, each LSTM core having the general structure in which three gates control the input proportion of the current information and the forgetting proportion of the previous period.
5. The method as claimed in claim 4, wherein the input is processed based on the structural model to generate a score sequence sᵢ ∈ ℝᵈ of named entities corresponding to each Chinese character, wherein 0 ≤ i ≤ n and d represents the number of named entity classes.
6. The method of claim 5, wherein the score sequences are integrated to obtain a score matrix S ∈ ℝ^(n×d) of the named entity classes corresponding to the input sentence; a position matrix is obtained from the score matrix as L = f(Ŝ), wherein the function f(·) denotes the extraction of position information, and Ŝ represents the score matrix after the influence of the non-target named entities has been suppressed; and the input instruction sentence is reconstructed by using the position information matrix to obtain the 'repeat' instruction.
7. The method for analyzing robot language instruction based on Chinese named entity recognition as claimed in claim 1, wherein named entity recognition is implemented by adding set rules to the basic back-propagation method, so that the named entity recognition model can learn from the difference between the input instruction and the 'repeat' instruction within the self-supervised closed loop, without an explicit learning direction.
8. The method for analyzing robot language instruction based on Chinese named entity recognition as claimed in claim 7, wherein the set rules comprise:
the influence of the non-target named entities in the original score matrix is restrained, so that the named entity recognition model can learn according to the position information matrix in a discrete form;
balancing rules are added to the model in the self-supervised learning paradigm to keep it stable.
9. A robot language instruction analysis system based on Chinese named entity recognition, characterized by comprising:
the text information acquisition module is used for acquiring Chinese text information based on the input instruction content;
the characteristic enhancement module is used for extracting text characteristics and enhancing the characteristics;
the word segmentation module is used for inputting the enhanced features into the named entity recognition model, generating a score of each Chinese character belonging to each named entity category, constructing a repositioning matrix, generating a 'repeat' instruction according to the repositioning matrix, using the instruction for entity category inference, and outputting the named entity category attribute of each Chinese character in a self-supervised manner;
and the robot control module is used for driving the robot to execute a corresponding instruction based on the extracted named entity.
10. A terminal device comprising a processor and a memory, the processor being arranged to implement instructions; the memory is used for storing a plurality of instructions, wherein the instructions are suitable for being loaded by the processor and executing the robot language instruction analysis method for Chinese named entity recognition according to any one of claims 1 to 8.
CN202110236088.7A 2021-03-03 2021-03-03 Robot language instruction analysis method and system based on Chinese named entity recognition Active CN112883737B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110236088.7A CN112883737B (en) 2021-03-03 2021-03-03 Robot language instruction analysis method and system based on Chinese named entity recognition

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110236088.7A CN112883737B (en) 2021-03-03 2021-03-03 Robot language instruction analysis method and system based on Chinese named entity recognition

Publications (2)

Publication Number Publication Date
CN112883737A true CN112883737A (en) 2021-06-01
CN112883737B CN112883737B (en) 2022-06-14

Family

ID=76055361

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110236088.7A Active CN112883737B (en) 2021-03-03 2021-03-03 Robot language instruction analysis method and system based on Chinese named entity recognition

Country Status (1)

Country Link
CN (1) CN112883737B (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113485382A (en) * 2021-08-26 2021-10-08 苏州大学 Mobile robot autonomous navigation method and system for man-machine natural interaction
CN116842021A (en) * 2023-07-14 2023-10-03 恩核(北京)信息技术有限公司 Data dictionary standardization method, equipment and medium based on AI generation technology
CN117079646A (en) * 2023-10-13 2023-11-17 之江实验室 Training method, device, equipment and storage medium of voice recognition model

Citations (4)

Publication number Priority date Publication date Assignee Title
CN105677779A (en) * 2015-12-30 2016-06-15 山东大学 Feedback-type question type classifier system based on scoring mechanism and working method thereof
CN109657239A (en) * 2018-12-12 2019-04-19 电子科技大学 The Chinese name entity recognition method learnt based on attention mechanism and language model
CN111444721A (en) * 2020-05-27 2020-07-24 南京大学 Chinese text key information extraction method based on pre-training language model
CN111985239A (en) * 2020-07-31 2020-11-24 杭州远传新业科技有限公司 Entity identification method and device, electronic equipment and storage medium


Non-Patent Citations (2)

Title
SHUO YANG等: "Learning Multi-Object Dense Descriptor for Autonomous Goal-Conditioned Grasping", 《IEEE》 *
姜辉: "用于智能测控设备的人机交互方法的研究与设计", 《中国优秀硕士学位论文全文数据库 信息科技辑》 *

Cited By (5)

Publication number Priority date Publication date Assignee Title
CN113485382A (en) * 2021-08-26 2021-10-08 苏州大学 Mobile robot autonomous navigation method and system for man-machine natural interaction
CN116842021A (en) * 2023-07-14 2023-10-03 恩核(北京)信息技术有限公司 Data dictionary standardization method, equipment and medium based on AI generation technology
CN116842021B (en) * 2023-07-14 2024-04-26 恩核(北京)信息技术有限公司 Data dictionary standardization method, equipment and medium based on AI generation technology
CN117079646A (en) * 2023-10-13 2023-11-17 之江实验室 Training method, device, equipment and storage medium of voice recognition model
CN117079646B (en) * 2023-10-13 2024-01-09 之江实验室 Training method, device, equipment and storage medium of voice recognition model

Also Published As

Publication number Publication date
CN112883737B (en) 2022-06-14

Similar Documents

Publication Publication Date Title
Sharma et al. Efficient Classification for Neural Machines Interpretations based on Mathematical models
CN112883737B (en) Robot language instruction analysis method and system based on Chinese named entity recognition
CN107239446B (en) A kind of intelligence relationship extracting method based on neural network Yu attention mechanism
Liu et al. Multi-timescale long short-term memory neural network for modelling sentences and documents
CN110532558B (en) Multi-intention recognition method and system based on sentence structure deep parsing
CN112541356B (en) Method and system for recognizing biomedical named entities
WO2023137911A1 (en) Intention classification method and apparatus based on small-sample corpus, and computer device
CN114676234A (en) Model training method and related equipment
CN113535953B (en) Meta learning-based few-sample classification method
Juven et al. Cross-situational learning with reservoir computing for language acquisition modelling
Zhang Application of intelligent grammar error correction system following deep learning algorithm in English teaching
Li Analysis of semantic comprehension algorithms of natural language based on robot’s questions and answers
CN116680407A (en) Knowledge graph construction method and device
Zhang et al. Japanese sentiment classification with stacked denoising auto-encoder using distributed word representation
CN116483314A (en) Automatic intelligent activity diagram generation method
CN112131879A (en) Relationship extraction system, method and device
Han et al. Lexicalized neural unsupervised dependency parsing
CN113590745B (en) Interpretable text inference method
CN113032565B (en) Cross-language supervision-based superior-inferior relation detection method
CN114692615A (en) Small sample semantic graph recognition method for small languages
Zarir et al. Automated image captioning with deep neural networks
Wu et al. Analyzing the Application of Multimedia Technology Assisted English Grammar Teaching in Colleges
CN110543569A (en) Network layer structure for short text intention recognition and short text intention recognition method
Juliet A Comparative Study on Optimizers for Automatic Image Captioning
Fu et al. A hybrid algorithm for text classification based on CNN-BLSTM with attention

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant