CN112883737A - Robot language instruction analysis method and system based on Chinese named entity recognition - Google Patents


Info

Publication number
CN112883737A
CN112883737A
Authority
CN
China
Prior art keywords
named entity
Chinese
instruction
matrix
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110236088.7A
Other languages
Chinese (zh)
Other versions
CN112883737B (en)
Inventor
许庆阳
姜聪
周瑞
李贻斌
张承进
宋勇
袁宪锋
庞豹
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shandong University
Original Assignee
Shandong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shandong University
Priority to CN202110236088.7A
Publication of CN112883737A
Application granted
Publication of CN112883737B
Status: Active

Links

Images

Classifications

    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/295 Named entity recognition
    • G06F40/253 Grammatical analysis; Style critique
    • G06N3/045 Combinations of networks
    • G06N3/048 Activation functions
    • G06N3/049 Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
    • G06N3/084 Backpropagation, e.g. using gradient descent


Abstract

The invention discloses a robot language instruction analysis method and system based on Chinese named entity recognition, comprising the following steps: acquiring Chinese text information based on input instruction content; extracting text features and enhancing them; inputting the enhanced features into a named entity recognition model, generating a score for each Chinese character belonging to each named entity category, constructing a repositioning matrix, using the repositioning matrix for entity category reasoning, and outputting the named entity category attribute of each Chinese character in a self-supervised manner; and driving the robot to execute the corresponding instruction based on the extracted named entities. The invention identifies Chinese named entities with a self-supervised learning mechanism, freeing the network entirely from dependence on manually labeled datasets.

Description

Robot language instruction analysis method and system based on Chinese named entity recognition
Technical Field
The invention relates to the technical field of speech recognition, in particular to a robot language instruction analysis method and system based on Chinese named entity recognition.
Background
The statements in this section merely provide background information related to the present disclosure and may not necessarily constitute prior art.
One of the core tasks in robot voice control is to parse the language command, extract useful information, and control the robot body accordingly. Named entity recognition is an important tool for extracting language information. Named entities refer to the names of real-world objects, such as people, organizations, and places. Recognizing named entities in text is the basis for understanding its deeper meaning and, as a basic task, supports many downstream natural language processing applications, such as relation extraction, text understanding, information extraction, machine translation, and entity corpus construction. Traditional named entity recognition models fall into three main types: rule-based learning methods, feature-based supervised learning methods, and unsupervised learning methods. Traditional named entity recognition was mainly rule-based. With the development of the field, the NER task became dominated by supervised named entity recognition methods, but most supervised networks must be trained on large-scale manually labeled datasets, which are costly to obtain. Identifying named entities through unsupervised or self-supervised learning allows the model to be trained without a labeled dataset; the key to training such a model is how to provide an accurate learning direction or classification basis in an unsupervised or self-supervised manner.
There were two solutions for early unsupervised named entity recognition: one constructs a common dictionary from a small amount of known data and uses it as a clustering center to provide a classification basis for the model; the other presets seed rules, using basic rules containing prior information such as grammatical cues or special clue words as the word classification standard and clustering basis. After providing clustering centers or classification bases through prior information, both approaches mostly obtain data structure and distribution characteristics by computing the contextual similarity of words, and extract named entities from unlabeled data. Notably, in either approach, the core is mostly coarse-grained extraction of named entities by list lookup or pattern matching. Currently popular unsupervised named entity recognition methods can be divided into discriminative and generative: discriminative models build on traditional methods and perform fine-grained extraction of named entities by designing more reasonable metrics; generative models realize the optimal subdivision of entity classes with the highest generation probability through model design. Through researchers' efforts, unsupervised named entity recognition has made some breakthroughs, but progress in Chinese named entity extraction has been slow because sentences have no obvious word boundaries. In addition, since unsupervised models need to incorporate enough context information, some application fields, such as robot language command parsing, often cannot provide enough context because the instructions are simple and the vocabulary is small.
Disclosure of Invention
In order to solve these problems, the invention provides a robot language instruction analysis method and system based on Chinese named entity recognition, which avoids complex parameter training, feature presetting and rule construction for the model, and thereby eliminates dependence on large-scale manually labeled datasets.
In some embodiments, the following technical scheme is adopted:
the robot language instruction analysis method based on Chinese named entity recognition comprises the following steps:
acquiring Chinese text information based on input instruction content;
extracting text features and enhancing the features;
inputting the enhanced features into a self-supervised Chinese named entity recognition model, generating a score for each Chinese character belonging to each named entity category, constructing a repositioning matrix, generating a "repeat" instruction according to the repositioning matrix, using the instruction for entity category inference, and outputting the named entity category attribute of each Chinese character in a self-supervised manner;
and driving the robot to execute corresponding instructions based on the extracted named entities.
Wherein the input instruction comprises a voice input instruction or a Chinese text input instruction.
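The four steps above can be sketched as a minimal pipeline. All function names and bodies below are toy stand-ins for illustration, not the patent's implementation:

```python
# Minimal, self-contained sketch of the four-step method above.
# Every function body is a toy stand-in, not the patent's implementation.

def acquire_text(raw, is_speech=False):
    # Step 1: a speech instruction would first be transcribed to text;
    # a Chinese text instruction passes through unchanged.
    return "<transcribed text>" if is_speech else raw

def enhance_features(text):
    # Step 2: stand-in for the pinyin/radical/part-of-speech/word-boundary
    # fusion -- here each character just becomes a one-entry feature dict.
    return [{"char": ch} for ch in text]

def recognize_entities(features):
    # Step 3: stand-in for the self-supervised NER model -- tags every
    # character with the dummy class "O" instead of a learned category.
    return [(f["char"], "O") for f in features]

def parse_instruction(raw, is_speech=False):
    # Step 4 (driving the robot) would consume this (char, tag) list.
    text = acquire_text(raw, is_speech)
    return recognize_entities(enhance_features(text))
```

In the real system the (character, entity-tag) pairs would be handed to the robot control module.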
In other embodiments, the following technical solutions are adopted:
the robot language instruction analysis system based on Chinese named entity recognition comprises:
the text information acquisition module is used for acquiring Chinese text information based on the input instruction content;
the characteristic enhancement module is used for extracting text characteristics and enhancing the characteristics;
the word segmentation module is used for inputting the enhanced features into a self-supervised Chinese named entity recognition model, generating a score for each Chinese character belonging to each named entity category, constructing a repositioning matrix, generating a "repeat" instruction according to the repositioning matrix, using the instruction for entity category inference, and outputting the named entity category attribute of each Chinese character in a self-supervised manner;
and the robot control module is used for driving the robot to execute a corresponding instruction based on the extracted named entity.
In other embodiments, the following technical solutions are adopted:
a terminal device comprising a processor and a memory, the processor being configured to execute instructions, and the memory being used for storing a plurality of instructions adapted to be loaded by the processor so as to execute the above robot language instruction analysis method based on Chinese named entity recognition.
Compared with the prior art, the invention has the beneficial effects that:
the invention uses the self-supervision learning mechanism to identify the Chinese named entity, which makes our network completely get rid of the dependence on the manually labeled data set.
The invention creates a novel learning rule that allows the model to learn from a binary comparison result alone, without requiring an accurate learning direction as in the traditional back-propagation algorithm.
In the construction subsystem for "repeat" instructions, the invention adopts a position-information-matrix construction rule that is independent of static graphs and needs no learning-based approximation toward a target as in a Gumbel-Sinkhorn network, which in theory makes the model simpler and faster.
Experiments are carried out on a tracked robot in combination with a YOLO-V4 target detection network; the robot can find and grasp a target object according to the named entities extracted by SCNER.
Additional features and advantages of the invention will be set forth in part in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention.
Drawings
FIG. 1 is a schematic diagram of a robot language instruction analysis method based on Chinese named entity recognition according to an embodiment of the present invention;
FIG. 2 is a diagram illustrating an embodiment of an auto-supervised Chinese named entity recognition model;
FIGS. 3(a) - (c) are schematic diagrams of a fractional matrix and a position information matrix of different iteration cycles in an embodiment of the present invention, respectively;
FIGS. 4(a) - (b) are schematic diagrams of loss and logits curves and logits matrices without rule constraints in an embodiment of the present invention;
FIGS. 5(a) - (b) are schematic diagrams of loss and logits curves and logits matrices after network degradation in an embodiment of the invention;
FIG. 6 is a schematic diagram of a training data set of a target detection network in an embodiment of the present invention;
FIG. 7 is a schematic view of a tracked robot in an embodiment of the present invention;
FIGS. 8(a) - (b) are diagrammatic views of a training process;
FIGS. 9(a) - (b) are schematic diagrams of training results after self-supervised training;
fig. 10(a) - (b) are schematic diagrams of the test environment and the robot motion lens, respectively.
Detailed Description
It should be noted that the following detailed description is exemplary and is intended to provide further explanation of the disclosure. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs.
It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of example embodiments according to the present application. As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, and it should be understood that when the terms "comprises" and/or "comprising" are used in this specification, they specify the presence of stated features, steps, operations, devices, components, and/or combinations thereof, unless the context clearly indicates otherwise.
Example one
How can efficient and accurate learning and understanding of language, as in humans, be achieved? Linguistic research has found that in early infancy, children learn to speak and understand language through imitation. Hearing an adult's voice, an infant imitates the same output, while comparing its own utterance with the adult's and correcting it in order to learn to speak. The process of learning to recognize entity objects is similar. This process is consistent with the speech chain proposed by Denes (see fig. 1), except that here both speaker and listener are the infant itself. It is also worth noting that imitative learning, whether of speaking or of recognizing objects, is usually an iterative process requiring multiple interactions between the infant and a "caregiver". This imitation-learning mechanism motivated our interest and the idea of applying it to named entity recognition.
When the robot obtains an instruction (the real instruction), it extracts word segments with the named entity recognition model and constructs a "repeat instruction" from them. Since the network is untrained, the named entities extracted at the beginning are almost completely wrong, and the "repeat instructions" constructed from them differ greatly from the real instruction. But after simple training with this difference as the loss function, the robot "understands" the real instruction, i.e., it correctly extracts the named entities and constructs a "repeat instruction" consistent with the real instruction. Based on these correctly extracted named entities, we can drive the robot to move as intended.
According to the embodiment of the invention, an embodiment of a robot language instruction analysis method based on Chinese named entity recognition is disclosed, referring to fig. 1, the method comprises the following processes:
(1) acquiring Chinese text information based on input instruction content;
specifically, the input instruction may be a voice instruction, and at this time, the voice input needs to be converted into text information; when the input command is text information, conversion is not needed.
(2) Extracting text features and enhancing the features;
specifically, the sequence is subjected to feature enhancement and fusion through a feature extraction and enhancement module, and the fused combined features are mapped into a feature sequence through simple statistics.
In a named entity recognition model, a very important hidden feature learned by the network is context. In the self-supervised Chinese named entity recognition model (SCNER), the kind of context obtained by large-scale pre-training is not available. Compared with other languages, Chinese sentences have unique grammar rules, and Chinese characters carry multiple descriptive attributes such as pinyin and radicals; feature enhancement can be realized by creating feature sequences from this unique language structure, which establishes contextual relations to a certain extent.
In this embodiment, χ_c = [x_c0, …, x_cn] represents a sentence, where x_ci is the i-th Chinese character, and F = [f_0, …, f_n] represents the corresponding feature sequence of the sentence. Given a sentence χ_c, the purpose of feature enhancement is to establish a feature sequence set Ψ by fusing other attributes of the Chinese characters, and to map it to a set of feature vectors ψ_F. Here, five-dimensional attributes are used for feature enhancement, so the feature sequence set of an output sentence contains five elements:

Ψ = {χ_c, χ_py, χ_rad, χ_pos, χ_wb}

where χ_c is the Chinese character sequence, χ_py is the pinyin sequence corresponding to the sentence, χ_rad is the radical sequence, χ_pos is the part-of-speech sequence, and χ_wb is the word boundary sequence. Word boundary sequence generation is based on the 4-tag method of Deng et al., which divides the positions of Chinese words over [B (begin), M (middle), E (end), S (single)]. In contrast to prior work, no pre-trained word vector embedding model is used in the feature embedding process. The feature embedding can be expressed as:

ψ_F = W^(l) φ(X) + b^(l)

where X ∈ R^(n×d) represents the original input sequence of Chinese characters containing all feature information, and d represents the feature dimension. The function φ(·) represents a statistical-dictionary-based encoding mapping, W^(l) is the weight matrix of a linear transformation, and b^(l) is a bias vector; these are part of the self-supervised closed loop and are generated in end-to-end training without pre-training.
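The 4-tag word-boundary labeling used above can be illustrated in a few lines of Python (the function name and input format are illustrative; the patent only specifies the B/M/E/S scheme):

```python
def bmes_tags(words):
    """Assign a B/M/E/S boundary tag to each character of a segmented
    sentence, following the 4-tag scheme: B = begin, M = middle,
    E = end, S = single-character word."""
    tags = []
    for w in words:
        if len(w) == 1:
            tags.append("S")           # single-character word
        else:
            # first char B, last char E, any chars in between M
            tags.extend(["B"] + ["M"] * (len(w) - 2) + ["E"])
    return tags
```

For example, the segmented command ["去", "冰箱", "拿", "苹果"] ("go to the refrigerator and take an apple") yields the boundary sequence S B E S B E.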
(3) Inputting the enhanced features into the self-supervised Chinese named entity recognition model (SCNER), generating a score for each Chinese character belonging to each named entity category, constructing a repositioning matrix, generating a "repeat" instruction according to the repositioning matrix, using the instruction for entity category inference, and outputting the named entity category attribute of each Chinese character in a self-supervised manner;
specifically, the structure of the self-supervised chinese named entity recognition model is shown in fig. 2, and includes: the system comprises a text sequence module, a feature enhancement module and a word segmentation module; the text sequence module and the feature enhancement module are explained above; the segmentation module includes a Named Entity Recognition Subsystem (NERS) and an instruction construction subsystem (IGS).
The text sequence undergoes feature enhancement and fusion through the feature enhancement module, and the fused combined features are mapped into a feature sequence through simple statistics. The feature sequence is then fed into the named entity recognition subsystem for processing. The subsystem generates a score for each Chinese character belonging to each named entity category, and this result is used in two directions: generating a relocation matrix, and named entity class inference. The repositioning matrix and the score matrix contain the same position information, but simply using the raw score matrix as the repositioning matrix cannot eliminate the mutual influence among the Chinese characters. We eliminate this mutual influence by constructing a repositioning matrix similar in form to a permutation matrix. In this way the position information in the score matrix can be used accurately to generate the "repeat" instruction, while back-propagation still follows the influence of the original score matrix. The model can therefore apply the position information independently to each Chinese character when creating the "repeat" instruction, and can also use error back-propagation for self-supervised learning, which it completes through multiple interactions. Finally, the score matrix is passed directly through an inference layer for entity class reasoning, and the named entity class attribute of each Chinese character is output.
As shown in FIG. 2, the self-supervised word segmentation module trains the model in a manner similar to human imitative learning. On top of a traditional named entity recognition model, an instruction construction subsystem is added at the back end to generate the "repeat" instruction. The IGS and the inference layer work in parallel, but during self-supervised training only the effect of the IGS and the "repeat" instruction on the network matters, while in the named entity extraction stage all attention is focused on the result of the inference layer.
Named entity recognition subsystem
The key to robot language instruction parsing is the extraction of named entities. Traditional methods usually use a Bi-LSTM network as the feature extraction layer, establish strong context relations in the form of a state transition matrix through a conditional random field (CRF) during supervised learning, and perform inference with a Viterbi decoding module. A similar model structure is adopted here, but since a state transition matrix containing strong context relations cannot be obtained without supervised training, the CRF module is dropped from the named entity model. The named entity recognition subsystem is designed as two layers of bidirectional LSTM networks; each layer uses 100 LSTM cores, and each core has the general structure in which three gates control the input proportion of new information and the forgetting proportion of the previous period. For the one-dimensional feature sequence ψ'_t obtained by flattening the feature vector set ψ_F at time t, the implementation is as follows:

i_t = Sigmoid(W'_i h_{t-1} + U_i ψ'_t + b'_i)
f_t = Sigmoid(W'_f h_{t-1} + U_f ψ'_t + b'_f)
c̃_t = tanh(W'_c h_{t-1} + U_c ψ'_t + b'_c)
c_t = f_t ⊙ c_{t-1} + i_t ⊙ c̃_t
o_t = Sigmoid(W'_o h_{t-1} + U_o ψ'_t + b'_o)
h_t = o_t ⊙ tanh(c_t)

where ⊙ denotes the element-wise product; i_t, f_t, o_t and c_t denote the input gate, forget gate, output gate and memory cell, respectively; h_t denotes the hidden state vector, which stores useful information at and before time t; U denotes the weight matrix applied to the input in each gate unit, W' the weight matrix applied to the hidden state h_{t-1}, and b' the bias vector.
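A minimal NumPy sketch of one LSTM step matching the gate equations above (the parameter dictionary layout and names are illustrative, not the patent's code):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, P):
    """One LSTM step matching the gate equations above. P is a dict of
    parameters: the W* matrices act on the hidden state h_{t-1}, the U*
    matrices on the input feature vector, and b* are bias vectors."""
    i = sigmoid(P["Wi"] @ h_prev + P["Ui"] @ x_t + P["bi"])      # input gate i_t
    f = sigmoid(P["Wf"] @ h_prev + P["Uf"] @ x_t + P["bf"])      # forget gate f_t
    c_hat = np.tanh(P["Wc"] @ h_prev + P["Uc"] @ x_t + P["bc"])  # candidate cell
    c = f * c_prev + i * c_hat                                   # memory cell c_t
    o = sigmoid(P["Wo"] @ h_prev + P["Uo"] @ x_t + P["bo"])      # output gate o_t
    h = o * np.tanh(c)                                           # hidden state h_t
    return h, c
```

A bidirectional layer would run this recurrence forward and backward over the feature sequence and concatenate the hidden states.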
Language instruction generating system
In the named entity recognition subsystem, the front-end Bi-LSTM layer transmits global information to the back end, not just the state at the last moment. The global information is the final output O_end of the model, comprising the results [O_0, O_1, …, O_t, …] of the LSTM cores at each time step, all of which are passed into the back-end Bi-LSTM layer. This ensures that as little information as possible is lost, while the hidden-layer feature sequence reflects the correlation between preceding and following characters to some extent. Based on this structure, the model processes the input and generates, for each Chinese character, a score sequence s_i ∈ R^d over the named entity classes, where 0 ≤ i ≤ n and d denotes the number of named entity categories. Integrating all score sequences yields the score matrix S of named entity classes corresponding to the input sentence. The input sentence is designed according to Chinese grammar rules, and a position information matrix is used to reconstruct the input instruction sentence to obtain the "repeat" instruction. The position information matrix M is obtained as:

M = φ_p(S̄)

where the function φ_p(·) represents discrete position information extraction, and S̄ represents the score matrix after suppressing the influence of non-target named entities.
FIG. 3(a) shows the score matrix and the position matrix in the initial state; FIG. 3(b) after 50 iterations; FIG. 3(c) after 100 iterations. As shown in FIGS. 3(a) - (c), compared with the position information matrix, the energy of the original score matrix is distributed across all cells. Transforming the input instruction is a weighted summation of the input sequence over all named entity class components, so operating directly with the score matrix would inevitably confuse the "repeat" instruction, because the information of each character in the input instruction could not be transferred to the "repeat" instruction independently and completely. This embodiment therefore designs a dedicated position-information-matrix construction rule to transfer input-instruction information to the "repeat" instruction independently: a discrete operation φ_p(·) outside the network extracts the ideal position information matrix for the forward pass, while back-propagation uses the original channel of the score matrix. This can be expressed as:

∂L/∂S̄ := ∂L/∂M

where ∂L/∂M represents the influence, during parameter learning, of the position information extracted from the score matrix S̄ by the discrete operation.
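The discrete-forward / score-backward scheme can be sketched as follows. Only the forward hard-selection step is executable here; the straight-through backward trick is noted in a comment. Function names are assumptions, not the patent's code:

```python
import numpy as np

def position_matrix(scores):
    """Collapse a soft score matrix (characters x entity classes) into a
    hard 0/1 position matrix: each row keeps only its arg-max class,
    analogous to one row of a permutation matrix, so each character's
    information passes into the "repeat" instruction independently."""
    hard = np.zeros_like(scores)
    hard[np.arange(scores.shape[0]), scores.argmax(axis=1)] = 1.0
    return hard

# Straight-through idea for training (in an autodiff framework):
#   out = scores + stop_gradient(position_matrix(scores) - scores)
# The forward pass then uses the hard matrix, while gradients flow
# through the original score-matrix channel.
```

Each row of the result sums to 1, so the subsequent weighted summation moves exactly one character's information per row.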
Learning rules
By adding corresponding rules to the basic back-propagation method, the SCNER model can learn from the difference between the input instruction and the "repeat" instruction within the self-supervised closed loop, without an explicit learning direction, and thus achieve named entity recognition. Each self-supervised extraction by SCNER can be viewed as the network receiving only one sample at a time for learning. Because the input instruction is unique, the network degrades rapidly during self-supervised training. As shown in FIGS. 4(a) - (b), without rule constraints the loss value, which represents the degree of difference between the input instruction and the "repeat" instruction, jumps to its maximum, and the values of the logits matrix rapidly degrade toward negative infinity.
To solve this problem, constraints are added to the SCNER. First, the influence of non-target named entities in the original score matrix S is suppressed, yielding the matrix S̄, so that the SCNER model can learn from the position information matrix in discrete form. However, simply suppressing the influence of other information is not sufficient. As shown in FIGS. 5(a) - (b), the network cannot reach a convergence state during training; suppression of non-target named entities only delays degradation, and the network still degrades as training progresses. In the general batch learning paradigm the model does not have this trouble, one reason being that the network receives positive and negative samples and learns in a different direction for each. When the proportion of positive and negative samples is close, the numerical state of the model maintains a dynamic balance. Therefore, a balance rule is also added to the model in the self-supervised learning paradigm so that the score matrix S̄ remains stable, implemented with the balance term:

ℛ(α · J_{n×d} − S̄)

where α is a balance factor, ℛ(·) denotes the linear rectification unit (ReLU), and J_{n×d} denotes an n × d constant matrix whose elements are all 1.
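One plausible reading of the balance rule, assuming the term is applied element-wise as ReLU(α · J − S̄) and added to the training loss (the exact placement is not spelled out in the text):

```python
import numpy as np

def balance_penalty(scores, alpha=0.12):
    """Element-wise ReLU(alpha * J - S): penalizes score entries that fall
    below the balance factor alpha (0.12 in the patent's experiments), so
    the score matrix cannot drift toward negative infinity during
    single-sample self-supervised training. Whether this is the exact
    form used in the patent is an assumption."""
    ones = np.ones_like(scores)              # the constant matrix J (all 1s)
    return np.maximum(alpha * ones - scores, 0.0)
```

Entries already above α contribute nothing, so the rule acts only as a floor against degradation.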
(4) And driving the robot to execute corresponding instructions based on the extracted named entities.
After self-supervised named entity recognition, the model outputs named entity objects, such as "B-Location" and "O-Object", which are used to drive the robot's movement. As shown in fig. 10(a), to eliminate the demands that the shape, size, opening mode and other properties of different "Location" targets place on robot motion control, the test process is simplified: signs printed with place names replace the real "Location" targets in the test environment. In the experiments, cross-combination tests are performed on different named entity targets; a motion sequence of the robot, for example for the command "go to the refrigerator and take an apple", is shown in fig. 10(b). After the robot enters the area of the "refrigerator" sign, it searches for the target using the YOLO-V4 network. When no "Object" is in the field of view, the robot switches position or pose to detect objects elsewhere in the current area. After the target object is found, the robot grasps it according to a preset motion scheme.
Experimental setup
In the experiments, the word segmentation network uses a two-layer bidirectional LSTM for feature extraction; each layer contains 100 LSTM units, and the bottom feature extraction layer outputs global feature information. The feature enhancement module performs enhancement and fusion on the input information and feeds the resulting feature sequence into the word segmentation network, from which named entities are extracted in a self-supervised manner. For the back-end target detection part, YOLO-V4 pre-trained on the VOC2007 dataset is used as the target detection model and fine-tuned on a fruit dataset (300 images in four categories: orange, apple, banana, and mixed; available on Baidu AI Studio), as shown in fig. 6. On the hardware side, behavior control verification experiments are performed on apples and oranges with the tracked robot shown in fig. 7. The server communicates with the robot over WIFI: the robot's main controller carries a WIFI router and connects via serial communication to a servo controller, which drives the joint servos of the manipulator; the robot also carries a camera. In the experiments, language analysis, target recognition, and related processing run on the server, while the robot is responsible for behavior execution, environment perception, and the like. Note that the robot's behavior patterns and the default path information in the environment are assumed known; this part can be further improved with semantic map building techniques.
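The feature extractor's wiring can be illustrated with a dependency-free sketch: two stacked bidirectional layers with 100 units each, matching the description, but with random weights rather than the trained network.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_params(input_dim, hidden_dim):
    # One stacked matrix covering the input, forget, output gates + candidate.
    return (rng.standard_normal((4 * hidden_dim, input_dim)) * 0.1,
            rng.standard_normal((4 * hidden_dim, hidden_dim)) * 0.1,
            np.zeros(4 * hidden_dim))

def lstm_run(xs, params, hidden_dim, reverse=False):
    W, U, b = params
    h = np.zeros(hidden_dim)
    c = np.zeros(hidden_dim)
    order = reversed(range(len(xs))) if reverse else range(len(xs))
    out = [None] * len(xs)
    for t in order:
        z = W @ xs[t] + U @ h + b
        i, f, o, g = np.split(z, 4)
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)  # forget / input gates
        h = sigmoid(o) * np.tanh(c)                   # output gate
        out[t] = h
    return out

def bilstm_layer(xs, hidden_dim):
    """One bidirectional layer: forward and backward passes, concatenated."""
    input_dim = len(xs[0])
    fwd = lstm_run(xs, lstm_params(input_dim, hidden_dim), hidden_dim)
    bwd = lstm_run(xs, lstm_params(input_dim, hidden_dim), hidden_dim,
                   reverse=True)
    return [np.concatenate([f, b]) for f, b in zip(fwd, bwd)]

# A 6-step sequence of 100-dim enhanced feature vectors through both layers.
seq = [rng.standard_normal(100) for _ in range(6)]
layer1 = bilstm_layer(seq, 100)   # each step: 200-dim (forward + backward)
layer2 = bilstm_layer(layer1, 100)
```

Each bidirectional layer doubles the per-step output dimension (forward plus backward hidden states), so a downstream scoring layer would map the 200-dim outputs to the d named entity classes.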
The experimental pipeline has the SCNER model perform named entity recognition on the input instruction in a self-supervised manner and then drive the robot's movement. In the named entity recognition stage, the model input is text, enhanced with four feature dimensions: pinyin, radical, word boundary, and part of speech. The enhanced feature sequence is mapped to a 100-dimensional word vector (each feature maps to 20 dimensions). The word-vector embedding model adopted here is not pre-trained; it is a mapping based on simple statistics of the input instruction. A mapping dictionary is initialized each time an instruction is input, and the mapping rules are likewise learned in self-supervised end-to-end learning without extra work. The loss function of the model is mean squared error, the learning rate is 1e-4, the balance factor is 0.12, and the number of self-supervised training iterations is set to 200; the training results are shown in Figs. 8(a)-(b).
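The per-instruction, non-pretrained embedding can be sketched as follows. The token streams and random-vector initialization are illustrative assumptions; the dimensions match the description (five feature streams, the character plus four enhancements, at 20 dims each = 100 dims).

```python
import numpy as np

def build_table(tokens, dim=20, seed=0):
    """A fresh mapping dictionary built from the current instruction's
    tokens (no pre-training): one dim-sized vector per distinct token."""
    rng = np.random.default_rng(seed)
    return {tok: rng.standard_normal(dim) for tok in sorted(set(tokens))}

def embed_instruction(features, dim=20):
    """features: feature name -> token sequence, all of equal length.
    Each stream is embedded to `dim` dims and concatenated per character."""
    tables = {name: build_table(seq, dim) for name, seq in features.items()}
    length = len(next(iter(features.values())))
    return np.stack([
        np.concatenate([tables[name][seq[i]] for name, seq in features.items()])
        for i in range(length)
    ])

features = {                     # illustrative tokens, not from the patent
    "char":     ["去", "冰", "箱"],
    "pinyin":   ["qu", "bing", "xiang"],
    "radical":  ["厶", "冫", "竹"],
    "boundary": ["S", "B", "E"],
    "pos":      ["v", "n", "n"],
}
vectors = embed_instruction(features)   # one 100-dim vector per character
```

Because the dictionary is rebuilt per instruction, the mapping carries no external lexical knowledge; any useful structure in the vectors has to emerge from the self-supervised loop itself.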
FIG. 8(a) plots accuracy during the self-supervised learning process, i.e., the percentage of characters in the input instruction whose named entity labels are recognized correctly. In the initial state, the model's recognition accuracy on the named entities of the input instruction is only 0%-20%. As self-supervised learning proceeds, the model's accuracy over all characters gradually improves, and a stable, accurate recognition capability is reached after 110 epochs. FIG. 8(b) shows samples of the model's named entity recognition results during training, recording the named entity sequences generated by SCNER at different training epochs. Labels with a dark background are recognized correctly, while unshaded labels are recognition errors; the displayed content is consistent with the accuracy curve.
The model's state when it receives an instruction is randomly initialized, so the initial named entity recognition results are poor. After approximately 100 epochs of self-supervised training, the model converges quickly and maintains stable results over the subsequent training epochs. Correspondingly, the model's Logits matrix gradually transitions from a completely disordered state to an ordered one, as shown in figs. 9(a)-(b). Because the position information matrix is obtained from the Logits matrix through a discrete operation, whether or not the negative scores of the non-target named entity categories grow, their effect on the "repeat" instruction does not change.
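The discrete operation that maps the Logits matrix to the position information matrix can be sketched as a row-wise argmax converted to one-hot; this specific form is an assumption, chosen to be consistent with the invariance just described.

```python
import numpy as np

def position_matrix(logits):
    """Discretize an (n, d) Logits matrix into a 0/1 position matrix with a
    single 1 per row at the highest-scoring named entity class. Changes to
    non-maximal entries cannot alter the result."""
    n, d = logits.shape
    pos = np.zeros((n, d), dtype=int)
    pos[np.arange(n), logits.argmax(axis=1)] = 1
    return pos

logits = np.array([[0.1, 2.3, -0.4],
                   [1.7, 0.2,  0.0]])
pos = position_matrix(logits)

# Pushing non-target scores further down leaves the discrete result intact.
perturbed = position_matrix(logits - np.array([[0.0, 0.0, 5.0],
                                               [0.0, 3.0, 0.0]]))
```

This is why the "repeat" instruction reconstructed from the position matrix is insensitive to how negative the suppressed non-target scores become.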
Example two
According to an embodiment of the present invention, a robot language instruction analysis system based on Chinese named entity recognition is disclosed, comprising:
the text information acquisition module is used for acquiring Chinese text information based on the input instruction content;
the characteristic enhancement module is used for extracting text characteristics and enhancing the characteristics;
the word segmentation module is used for inputting the enhanced features into a self-supervised Chinese named entity recognition model, generating a score of each Chinese character belonging to each named entity category, constructing a repositioning matrix, generating a 'repeat' instruction according to the repositioning matrix, using the instruction for entity category inference, and outputting the named entity category attribute of each Chinese character in a self-supervised manner;
and the robot control module is used for driving the robot to execute a corresponding instruction based on the extracted named entity.
It should be noted that specific implementation manners of the modules are already described in detail in the first embodiment, and are not described again.
Example three
According to an embodiment of the present invention, an embodiment of a terminal device is disclosed, which includes a processor and a memory, the processor being configured to implement instructions; the memory is used for storing a plurality of instructions which are suitable for being loaded by the processor and executing the robot language instruction analysis method for Chinese named entity recognition in the first embodiment.
Although the embodiments of the present invention have been described with reference to the accompanying drawings, they do not limit the scope of the present invention. Those skilled in the art should understand that various modifications and variations that can be made without inventive effort on the basis of the technical solution of the present invention still fall within the protection scope of the present invention.

Claims (10)

1. The robot language instruction analysis method based on Chinese named entity recognition is characterized by comprising the following steps:
acquiring Chinese text information based on input instruction content;
extracting text features and enhancing the features;
inputting the enhanced features into a self-supervised Chinese named entity recognition model, generating a score of each Chinese character belonging to each named entity category, constructing a repositioning matrix, generating a 'repeat' instruction according to the repositioning matrix, using the instruction for entity category inference, and outputting the named entity category attribute of each Chinese character in a self-supervised manner;
and driving the robot to execute corresponding instructions based on the extracted named entities.
2. The method of claim 1, wherein the input command comprises a voice input command or a Chinese text input command.
3. The method for analyzing robot language instruction based on Chinese named entity recognition according to claim 1, wherein the extracting text features and performing feature enhancement specifically comprises:
given a sentence consisting of Chinese characters x₁, x₂, …, xₙ, the output feature sequence set comprises: the Chinese character sequence, the pinyin sequence corresponding to the sentence, the radical sequence, the part-of-speech sequence, and the word boundary sequence; during feature embedding, word vectors which are not pre-trained are embedded into the model.
4. The method for analyzing robot language instruction based on Chinese named entity recognition as claimed in claim 1, wherein the named entity recognition model comprises a two-layer bidirectional LSTM network, each layer comprising a number of LSTM cores, each LSTM core having the general structure in which three gates control the input proportion of the current information and the forgetting proportion of the previous period.
5. The method as claimed in claim 4, wherein the input is processed based on the structural model to generate a score sequence sᵢ ∈ ℝᵈ of named entities corresponding to each Chinese character, wherein 0 ≤ i ≤ n and d represents the number of named entity classes.
6. The method of claim 5, wherein the score sequences are integrated to obtain a score matrix S ∈ ℝ^(n×d) of the named entity classes corresponding to the input sentence; a position matrix is obtained from the score matrix as L = f(Ŝ), wherein the function f(·) denotes the extraction of position information, and Ŝ represents the score matrix after the influence of the non-target named entities has been suppressed; and the input instruction sentence is reconstructed by using the position information matrix to obtain the 'repeat' instruction.
7. The method for analyzing robot language instruction based on Chinese named entity recognition as claimed in claim 1, wherein named entity recognition is implemented by adding set rules to the basic back-propagation method, so that the named entity recognition model can learn from the difference between the input instruction and the 'repeat' instruction within the self-supervised closed loop, without an explicit learning direction.
8. The method for analyzing robot language instruction based on Chinese named entity recognition as claimed in claim 7, wherein the set rules comprise:
the influence of the non-target named entities in the original score matrix is restrained, so that the named entity recognition model can learn according to the position information matrix in a discrete form;
balancing rules are added to the model in the self-supervised learning paradigm to keep it stable.
9. A robot language instruction analysis system based on Chinese named entity recognition, characterized by comprising:
the text information acquisition module is used for acquiring Chinese text information based on the input instruction content;
the characteristic enhancement module is used for extracting text characteristics and enhancing the characteristics;
the word segmentation module is used for inputting the enhanced features into the named entity recognition model, generating a score of each Chinese character belonging to each named entity category, constructing a repositioning matrix, generating a 'repeat' instruction according to the repositioning matrix, using the instruction for entity category inference, and outputting the named entity category attribute of each Chinese character in a self-supervised manner;
and the robot control module is used for driving the robot to execute a corresponding instruction based on the extracted named entity.
10. A terminal device comprising a processor and a memory, the processor being arranged to implement instructions; the memory is used for storing a plurality of instructions, wherein the instructions are suitable for being loaded by the processor and executing the robot language instruction analysis method for Chinese named entity recognition according to any one of claims 1 to 8.
CN202110236088.7A 2021-03-03 2021-03-03 Robot language instruction analysis method and system based on Chinese named entity recognition Active CN112883737B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110236088.7A CN112883737B (en) 2021-03-03 2021-03-03 Robot language instruction analysis method and system based on Chinese named entity recognition

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110236088.7A CN112883737B (en) 2021-03-03 2021-03-03 Robot language instruction analysis method and system based on Chinese named entity recognition

Publications (2)

Publication Number Publication Date
CN112883737A true CN112883737A (en) 2021-06-01
CN112883737B CN112883737B (en) 2022-06-14

Family

ID=76055361

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110236088.7A Active CN112883737B (en) 2021-03-03 2021-03-03 Robot language instruction analysis method and system based on Chinese named entity recognition

Country Status (1)

Country Link
CN (1) CN112883737B (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113485382A (en) * 2021-08-26 2021-10-08 苏州大学 Mobile robot autonomous navigation method and system for man-machine natural interaction
CN116842021A (en) * 2023-07-14 2023-10-03 恩核(北京)信息技术有限公司 Data dictionary standardization method, equipment and medium based on AI generation technology
CN117079646A (en) * 2023-10-13 2023-11-17 之江实验室 Training method, device, equipment and storage medium of voice recognition model

Citations (4)

Publication number Priority date Publication date Assignee Title
CN105677779A (en) * 2015-12-30 2016-06-15 山东大学 Feedback-type question type classifier system based on scoring mechanism and working method thereof
CN109657239A (en) * 2018-12-12 2019-04-19 电子科技大学 The Chinese name entity recognition method learnt based on attention mechanism and language model
CN111444721A (en) * 2020-05-27 2020-07-24 南京大学 Chinese text key information extraction method based on pre-training language model
CN111985239A (en) * 2020-07-31 2020-11-24 杭州远传新业科技有限公司 Entity identification method and device, electronic equipment and storage medium


Non-Patent Citations (2)

Title
SHUO YANG等: "Learning Multi-Object Dense Descriptor for Autonomous Goal-Conditioned Grasping", 《IEEE》 *
姜辉: "用于智能测控设备的人机交互方法的研究与设计", 《中国优秀硕士学位论文全文数据库 信息科技辑》 *

Cited By (5)

Publication number Priority date Publication date Assignee Title
CN113485382A (en) * 2021-08-26 2021-10-08 苏州大学 Mobile robot autonomous navigation method and system for man-machine natural interaction
CN116842021A (en) * 2023-07-14 2023-10-03 恩核(北京)信息技术有限公司 Data dictionary standardization method, equipment and medium based on AI generation technology
CN116842021B (en) * 2023-07-14 2024-04-26 恩核(北京)信息技术有限公司 Data dictionary standardization method, equipment and medium based on AI generation technology
CN117079646A (en) * 2023-10-13 2023-11-17 之江实验室 Training method, device, equipment and storage medium of voice recognition model
CN117079646B (en) * 2023-10-13 2024-01-09 之江实验室 Training method, device, equipment and storage medium of voice recognition model

Also Published As

Publication number Publication date
CN112883737B (en) 2022-06-14

Similar Documents

Publication Publication Date Title
Sharma et al. Efficient Classification for Neural Machines Interpretations based on Mathematical models
CN112883737B (en) Robot language instruction analysis method and system based on Chinese named entity recognition
CN107239446B (en) A kind of intelligence relationship extracting method based on neural network Yu attention mechanism
Liu et al. Multi-timescale long short-term memory neural network for modelling sentences and documents
CN110532558B (en) Multi-intention recognition method and system based on sentence structure deep parsing
CN112541356B (en) Method and system for recognizing biomedical named entities
WO2023137911A1 (en) Intention classification method and apparatus based on small-sample corpus, and computer device
CN114676234A (en) Model training method and related equipment
CN113535953B (en) Meta learning-based few-sample classification method
Juven et al. Cross-situational learning with reservoir computing for language acquisition modelling
Zhang Application of intelligent grammar error correction system following deep learning algorithm in English teaching
Li Analysis of semantic comprehension algorithms of natural language based on robot’s questions and answers
CN116680407A (en) Knowledge graph construction method and device
Zhang et al. Japanese sentiment classification with stacked denoising auto-encoder using distributed word representation
CN116483314A (en) Automatic intelligent activity diagram generation method
CN112131879A (en) Relationship extraction system, method and device
Han et al. Lexicalized neural unsupervised dependency parsing
CN113590745B (en) Interpretable text inference method
CN113032565B (en) Cross-language supervision-based superior-inferior relation detection method
CN114692615A (en) Small sample semantic graph recognition method for small languages
Zarir et al. Automated image captioning with deep neural networks
Wu et al. Analyzing the Application of Multimedia Technology Assisted English Grammar Teaching in Colleges
CN110543569A (en) Network layer structure for short text intention recognition and short text intention recognition method
Juliet A Comparative Study on Optimizers for Automatic Image Captioning
Fu et al. A hybrid algorithm for text classification based on CNN-BLSTM with attention

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant