CN117350286A - Natural language intention translation method oriented to intention driving data link network - Google Patents

Natural language intention translation method oriented to intention driving data link network

Info

Publication number
CN117350286A
CN117350286A · Application CN202311301236.4A
Authority
CN
China
Prior art keywords
model
intention
entity
network
text
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311301236.4A
Other languages
Chinese (zh)
Inventor
Jiang Dingde (蒋定德)
Wang Zhihao (王志浩)
Liu Xinhui (刘心蕙)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Electronic Science and Technology of China
Original Assignee
University of Electronic Science and Technology of China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Electronic Science and Technology of China filed Critical University of Electronic Science and Technology of China
Publication of CN117350286A

Links

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/279Recognition of textual entities
    • G06F40/289Phrasal analysis, e.g. finite state techniques or chunking
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35Clustering; Classification
    • G06F16/353Clustering; Classification into predefined classes
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/279Recognition of textual entities
    • G06F40/284Lexical analysis, e.g. tokenisation or collocates
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • G06N3/0455Auto-encoder networks; Encoder-decoder networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/047Probabilistic or stochastic networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent


Abstract

The invention discloses a natural language intention translation method for an intention-driven data link network, belonging to the technical field of data link networks. The invention achieves accurate translation of data link network intentions that users input in natural language. A Bert+CRF model performs intention entity recognition for the intention-driven data link network; by pre-training the model and then fine-tuning it, the training cost is reduced, and parameter adjustment and training of the model can be completed quickly. The recognized network intention is then converted, by template matching, into a network configuration file with a fixed format, thereby completing the translation of the data link network intention.

Description

Natural language intention translation method oriented to intention driving data link network
Technical Field
The invention belongs to the technical field of data link networks, and particularly relates to a natural language intention translation method for an intention-driven data link network.
Background
The data link is a networked information system that transmits and processes formatted messages about battlefield situations, command and control, and tactical coordination in real time among sensors, command and control systems, and weapon platforms. An intention-driven data link network parses the user's intention, translates it into a corresponding data link network management policy, and finally deploys network sensing and control policies automatically. Intention is the core of an intention network, and the operation of an intention-driven data link network is closely tied to it. The user only describes the desired result, not how to achieve it; the system realizes the user's intention automatically, continuously monitors the network state, and judges whether the intention has been fulfilled. An intention is a statement that the user expects the network to reach a certain state, without indicating how to achieve it; during actual operation the network still makes forwarding decisions and allocates resources according to network policies. It is therefore necessary to translate the user's intention into a corresponding network configuration policy according to the content of the intention and the current network state; this is the intention translation process. At present, intention translation mainly applies natural language processing: keyword extraction, lexical analysis, semantic mining and other operations are performed on the user's intention to obtain the network operating state the user expects, and an intelligent method then generates the network policy.
With the development of natural language processing (NLP) technology, artificial-intelligence-based intention translation algorithms represented by sequence labeling have received wide attention and application. Sequence labeling covers subtasks such as word segmentation, part-of-speech tagging, and named entity recognition. Given a one-dimensional linear input sequence, sequence labeling assigns each element a label from a label set; in essence it classifies each element of the linear sequence according to its context. Chinese sequence labeling problems can usually be viewed this way: the label sets of different tasks may carry different meanings, but the common question is how to label each Chinese character according to its context. Research methods for named entity extraction mainly include rule- and dictionary-based methods, machine learning methods, and deep learning methods. Traditional machine learning methods require a large-scale corpus to learn the labeling model for named entity recognition, feature extraction still needs manual effort, and the quality of corpus annotation seriously affects the entity recognition result. With the development of deep learning, RNNs have achieved great success in sequence labeling, and the Bi-LSTM+CRF model performs excellently on this task. However, Bi-LSTM-based models still suffer from insufficient accuracy and long training time, and cannot perform word segmentation and part-of-speech tagging at the same time. The BERT model is built on a multi-layer bidirectional Transformer encoder. The Transformer uses a bidirectional self-attention mechanism that breaks the limitation of fusing context information in only one direction; BERT pre-trains with a masked language model on deep bidirectional Transformer components, thereby generating deep bidirectional language representations that fuse context information.
Therefore, how to construct an intention entity recognition data set suitable for an intention-driven data link network, how to design a suitable tokenizer to improve recognition performance, how to adjust model parameters during training to achieve accurate recognition, and finally how to convert the recognized intention entities into a fixed-format configuration file or instruction for the network management module to call, are the technical problems that those skilled in the art currently need to solve.
Disclosure of Invention
Aiming at the defect that intent-based data link network systems have difficulty accurately recognizing network management and configuration intentions, the invention provides efficient recognition of network management intentions input in natural language form, and proposes a natural language intention translation method for an intention-driven data link network that supports multiple management operations, multiple data link network node types, and multiple performance constraints.
In order to achieve the above objective, the present invention adopts the following technical scheme: a natural language intention translation method for an intention-driven data link network, whose overall flow and input/output are shown in fig. 1, comprising the following steps:
step 1: data chain network management intention sample data acquisition and labeling;
natural language corpora related to the data link network are collected manually and programmatically; the corpus contains samples described in Chinese covering the application scenarios, performance indexes, network architecture, network management, network configuration, network operation and maintenance of the data link network, and other aspects. The important entity types involved are defined: network management operations, operation objects, expected states, performance indexes, spatio-temporal constraints, and the like;
step 2: chinese and English mixed intention sentence word segmentation;
word segmentation is an important step of natural language understanding: it decomposes sentences, paragraphs and other text into a data structure whose units are words, retaining the most meaningful tokens. The invention uses Bert's WordPiece for word segmentation. For the mixed Chinese-English entities in the corpus, the vocabulary and the tokenizer are optimized on the basis of the basic Chinese tokenizer so as to segment more accurately. After segmentation, each character/word in the collected corpus is labeled with the BIO scheme according to the entity labels annotated in step 1, where B marks the beginning of an entity, I marks the middle or end of an entity, and O marks non-entity tokens.
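By way of illustration, the BIO scheme described above can be sketched as follows; the example sentence, the entity label names (OPERATION, SENSOR_NODE, ...) and the tab-separated storage format are illustrative assumptions, not the label set fixed in step 1:

```python
# Illustrative BIO labeling of one segmented intent sentence; label names
# and the storage format are assumptions for this sketch.
tokens = ["建", "立", "SENSOR-16-56", "到", "C2-16-91", "的", "链", "路"]
labels = ["B-OPERATION", "I-OPERATION",  # "建立" (create) = operation entity
          "B-SENSOR_NODE",               # English node ID kept as one token
          "O",
          "B-C2_NODE",
          "O", "B-OBJECT", "I-OBJECT"]

# Each (token, label) pair becomes one line of the training data set.
for tok, lab in zip(tokens, labels):
    print(f"{tok}\t{lab}")
```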
Step 3: text characterization based on the Bert model;
the basic Bert Chinese pre-training model is adopted. The original intention input is text, which a mathematical model cannot process directly; the text must first be encoded as numeric vectors. The input of the Bert model is the sum of three kinds of embeddings, used as the representation of each token. The pre-trained Transformer serves as the encoder and learns the context of the text through its attention mechanism. A randomly initialized classifier is then added on top of the model: the invention adds a fully connected layer of size (h, k) on the Bert model as a linear mapping from the h-dimensional hidden vector to the k-dimensional output vector, where k is the number of entity types;
step 4: entity identification based on CRF;
after the Bert model computes the representation vectors of the text, the final fully connected layer maps the model output into k-dimensional space, yielding a score for each label for each token. The output of this last layer could be used directly as the entity recognition result, but the invention adds a CRF layer on top of the model to constrain the output and avoid problems such as non-contiguous entity recognition;
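A minimal sketch of a Bert-CRF tagger of this shape is given below, assuming the pytorch-crf package for the CRF layer; the class name and the default label count are illustrative:

```python
import torch.nn as nn
from transformers import BertModel
from torchcrf import CRF  # pip install pytorch-crf

class BertCrfTagger(nn.Module):
    """Sketch of the Bert-CRF tagger of steps 3-4: BERT encoder, an (h, k)
    linear layer producing per-token label scores (emissions), and a CRF
    layer that decodes a consistent label path."""

    def __init__(self, pretrained="bert-base-chinese", num_labels=31):
        super().__init__()
        self.bert = BertModel.from_pretrained(pretrained)
        h = self.bert.config.hidden_size            # hidden dimension h
        self.classifier = nn.Linear(h, num_labels)  # the (h, k) mapping
        self.crf = CRF(num_labels, batch_first=True)

    def forward(self, input_ids, attention_mask, labels=None):
        hidden = self.bert(input_ids, attention_mask=attention_mask).last_hidden_state
        emissions = self.classifier(hidden)         # (batch, n, k) label scores
        if labels is not None:
            # negative CRF log-likelihood as the training loss
            return -self.crf(emissions, labels, mask=attention_mask.bool())
        # Viterbi decoding yields the most likely label path per sentence
        return self.crf.decode(emissions, mask=attention_mask.bool())
```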
step 5: training a Bert-CRF model;
the data link network intention corpus collected and annotated in step 1 is divided into training, validation and test sets in the ratio 6:2:2. Starting from the Bert pre-trained model, the model is fine-tuned: the encoder and classifier described in step 3 are trained with a small learning rate. In each epoch, training set data are randomly sampled and steps 2, 3 and 4 are executed in turn; the prediction is compared with the labels annotated in step 1, the CRF loss is computed and backpropagated, and the network weights and the transition matrix in the CRF layer are adjusted;
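A possible fine-tuning loop for this step is sketched below; train_loader is assumed to yield batches of tokenized sentences with BIO label ids, and the two learning rates follow the embodiment (0.00003 for Bert, 0.001 for the CRF layer):

```python
from torch.optim import AdamW

def fine_tune(model, train_loader, num_epochs=4):
    # Separate parameter groups: a small learning rate for the pre-trained
    # encoder and the classifier (fine-tuning), a larger one for the
    # randomly initialized CRF transition matrix.
    optimizer = AdamW([
        {"params": model.bert.parameters(), "lr": 3e-5},
        {"params": model.classifier.parameters(), "lr": 3e-5},
        {"params": model.crf.parameters(), "lr": 1e-3},
    ])
    model.train()
    for epoch in range(num_epochs):
        for batch in train_loader:          # randomly sampled annotated data
            loss = model(batch["input_ids"],
                         batch["attention_mask"],
                         labels=batch["labels"])  # CRF negative log-likelihood
            loss.backward()                 # backpropagate the CRF loss
            optimizer.step()                # update weights and transitions
            optimizer.zero_grad()
```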
step 6: deploying an application by a model;
the model trained in step 5 is saved as an offline file and deployed as a continuously predicting model based on the Flask Web framework, with an open interface. The input is a single data link intention sentence; the offline model file is called, steps 2 to 4 are executed in turn, and the intention entity recognition result of the sentence is output;
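A minimal Flask deployment matching this step might look as follows; file paths, the endpoint name and the label map are illustrative assumptions:

```python
import torch
from flask import Flask, request, jsonify
from transformers import BertTokenizer

app = Flask(__name__)

# Paths and the label map below are illustrative assumptions.
tokenizer = BertTokenizer.from_pretrained("tokenizer_with_special_tokens")
model = torch.load("bert_crf_intent.pt", map_location="cpu")
model.eval()
ID2LABEL = {0: "O", 1: "B-SENSOR", 2: "I-SENSOR"}   # illustrative label ids

@app.route("/intent", methods=["POST"])
def recognize_intent():
    sentence = request.json["text"]                  # one data link intent sentence
    enc = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        paths = model(enc["input_ids"], enc["attention_mask"])
    tags = [ID2LABEL.get(i, "O") for i in paths[0][1:-1]]  # drop [CLS]/[SEP]
    return jsonify({"tokens": tokenizer.tokenize(sentence), "labels": tags})

if __name__ == "__main__":
    app.run(port=5000)                               # keeps the model online
```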
step 7: template matching;
the entity recognition result output in step 6 does not distinguish the operation, constraint and object categories defined in step 1 and cannot be used directly as a data link network management instruction. The invention maps entity categories, through template matching, into json statements that the intention network can process, and inputs them into the underlying intention automation system to issue the intention. Whether to continue with the next intention is then judged: if yes, go to step 6; otherwise go to step 8;
step 8: and (5) ending.
The entity annotation in step 1 combines two modes, doccano-based automatic annotation and manual screening; finally each word in each sentence is given an entity label and stored as a data set in a specific format for subsequent model training. The specific steps are as follows:
Step A: the automatic word-labeling interface is realized based on the Flask Web framework. Entity classes of known words are predefined; for example, the known word "early warning aircraft" belongs to the "sensor node" class. Words are matched: if the HTTP interface receives a known word it returns the corresponding entity class, otherwise it returns no value (a sketch of such an interface follows this list);
Step B: load the collected unlabeled corpus into doccano, start automatic annotation, and annotate the input sample set sentence by sentence to obtain entity annotation results;
Step C: verify whether the annotation is correct; if so, continue loading the next sample for annotation, otherwise correct the entity annotation result and go to step B; after all samples are annotated, go to step D;
step D: and (5) ending.
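A sketch of the Flask labeling interface of step A, with an illustrative dictionary of known words and an assumed endpoint name:

```python
from flask import Flask, request, jsonify

app = Flask(__name__)

# Known word -> entity class dictionary, pre-configured as in step A.
# Entries are illustrative; e.g. "预警机" (early-warning aircraft) maps to
# the sensor node class.
KNOWN_ENTITIES = {
    "预警机": "SENSOR_NODE",
    "无人机": "C2_NODE",
}

@app.route("/label")                 # endpoint name is an assumption
def label():
    word = request.args.get("word", "")
    cls = KNOWN_ENTITIES.get(word)
    # Return the entity class for a known word, otherwise no value,
    # matching the behaviour described in step A.
    return jsonify({"entity": cls} if cls else {})
```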
The intention entity types oriented to the data link network in step 1 mainly include: create link, modify link, disconnect link; sensor node, command node, weapon node; best effort, constraint; transmission rate, end-to-end delay, bandwidth; speed unit, time unit, bandwidth unit; number, node number. Type attribution and specific examples are shown in Table 1.
Table 1: Data link network intention entity types
The tokenizer design of step 2: English words that may occur and that the default tokenizer would split apart, such as "SENSOR" and "WEAPON", are individually added to the tokenizer's token list, and the token embedding layer of the model is resized to the adjusted vocabulary length. A comparison before and after the adjustment is shown in fig. 2.
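With the transformers library, this adjustment can be sketched as follows; the word list is illustrative:

```python
from transformers import BertTokenizer, BertModel

# Sketch of the tokenizer adjustment of step 2: English words the default
# Chinese tokenizer would split (word list is an assumption) are added as
# whole tokens, and the token embedding layer is resized to match.
tokenizer = BertTokenizer.from_pretrained("bert-base-chinese")
tokenizer.add_tokens(["SENSOR", "WEAPON", "C2", "CONNECT", "MODIFY", "DISCONNECT"])

bert = BertModel.from_pretrained("bert-base-chinese")
bert.resize_token_embeddings(len(tokenizer))   # adjusted vocabulary length

print(tokenizer.tokenize("建立SENSOR-16-56的链路"))  # "SENSOR" now stays whole
```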
The three kinds of embeddings described in step 3 are Token Embeddings, Segment Embeddings and Position Embeddings. Token Embeddings are fine-grained tokens composing the text, one per character; two special characters are added to each input sentence: [CLS] at the beginning and [SEP] at the end. A [CLS] token is inserted at the start of every input sequence to mark its beginning, and the output of the last Transformer layer at this token aggregates the representation of the whole sequence. [SEP] marks the end of a sentence and the start of a second one, so that the model can distinguish two different sentences. Segment Embeddings distinguish two sentences and serve as the basis for multi-sentence input tasks. Since the same word carries different meanings at different positions in a sentence, Position Embeddings encode position information (relative and absolute) added to each word vector, ensuring that the same word at different positions is represented differently. The final embedding result is shown in fig. 3.
The Bert model in step 3 mainly consists of the encoder part of the natural language processing model Transformer, formed by stacking several Transformer Encoder layers; below it is the embedding structure of step 3, and above it the CRF structure of step 4. During training it uses a masked language model (Masked Language Model, MLM): some positions of the input sequence are masked randomly and then predicted by the model. The model structure is shown in fig. 4.
The text characterization based on the Bert model in step 3 comprises the following specific steps:
Step A: define the k label categories identified in the given intention data set, with BIO as the labeling mode. For each piece of text text = (w_1, w_2, ..., w_n), the token sequence is text' = (w'_1, w'_2, ..., w'_n), where n is the sequence length of the text after segmentation by the tokenizer defined in step 2;
Step B: the Bert model embeds text' by the embedding method defined above, giving the embedded sequence X ∈ R^{n×d}, where d is the dimension of the embedding vector;
Step C: X passes through the text encoding layer of the Bert model, which models the token sequence text' = (w'_1, w'_2, ..., w'_n) to obtain the hidden representation H ∈ R^{n×h}, where h is the dimension of the hidden vector;
Step D: H passes through the fully connected layer of the Bert model, which predicts a label for every token and yields the classification result L ∈ R^{n×k}; each row l_i ∈ R^k holds the prediction scores of token w'_i over all labels, and k is the number of entity labels;
the CRF model in step 4 mainly comprises the following. The output of the Bert model serves as the input of the CRF layer, and the label path is the prediction target. The vector L ∈ R^{n×k} obtained in step 3, i.e. the score of each token for each entity class, becomes the emission probability. The input of the CRF also contains a transition matrix T ∈ R^{k×k}, whose entries are the weights of transferring from one entity label to the next. For a token sequence of length n there are N = k^n possible label paths; the score of path i is denoted P_i, and the total score of all paths is

P_total = P_1 + P_2 + ... + P_N = e^{S_1} + e^{S_2} + ... + e^{S_N},

where S_i is the path score computed from the emission and transition weights.
Each element of the transition matrix is assigned randomly at model initialization, and the accurate transition weights must be learned by training, so the loss function of the CRF layer is defined as

Loss = -log(P_realpath / P_total).

If path i is the true path, P_i should be the highest score among P_1 ~ P_N. Following this loss function, the model parameters are iteratively optimized so that the share of the true path grows as large as possible, finally yielding the transition matrix. Feeding the emission probabilities and the transition matrix into the CRF layer finally produces the most probable transition path, i.e. the entity label sequence of the token sequence.
The constraints that the CRF model of step 4 places on the output add restrictions to the final label prediction to guarantee the validity of the entity classes the model outputs, help the model choose a correct and reasonable entity label sequence, and further reduce the prediction error rate. They mainly include:
(1) The label of the first token of every recognized entity must begin with "B-" or be "O", not "I-", so that the entity conforms to the BIO tagging specification;
(2) Within one entity "B-L1, I-L2, I-L3, ...", the labels L1, L2, L3, ... must denote the same entity type;
(3) The label of the first token of every sentence must begin with "B-" or "O", not "I-".
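One way to realize these constraints, sketched under the assumption of a pytorch-crf CRF layer, is to fix the transition (and start) scores of illegal label pairs to a large negative value so that Viterbi decoding never selects them; the label map is illustrative:

```python
import torch

IMPOSSIBLE = -1e4  # effectively forbids a transition during decoding

def mask_illegal_transitions(crf, id2label):
    """Apply the three output constraints above to a torchcrf.CRF layer.
    id2label maps label ids to BIO strings, e.g. {0: "O", 1: "B-BW", ...}."""
    with torch.no_grad():
        for i, prev in id2label.items():
            for j, curr in id2label.items():
                if curr.startswith("I-"):
                    # I-X may only follow B-X or I-X of the same type X
                    legal = prev[2:] == curr[2:] and prev[:1] in ("B", "I")
                    if not legal:
                        crf.transitions[i, j] = IMPOSSIBLE
        for j, curr in id2label.items():
            if curr.startswith("I-"):
                # a sentence (or entity) may not start with an I- tag
                crf.start_transitions[j] = IMPOSSIBLE
```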
The model deployment architecture described in step 6 is shown in fig. 5; its main parts are:
(1) The intention data service, comprising a Flask-based automatic annotation module and a doccano-based manual annotation and correction service, which performs mixed entity annotation of the collected corpus according to the intention types specified in step 1;
(2) The intention collection service, which collects intention sentences input by the user from a Django-based Web interface and passes them to (3);
(3) The intention translation service, which deploys the Bert-CRF entity recognition model of steps 2, 3 and 4 online based on Flask, so that it can continuously recognize entities in the input, and which opens a RESTful API interface for (2) to call;
(4) The intention presentation service, which receives the intention result recognized in (3) and pushes the data to the front-end Web interface for the user to confirm the intention before the subsequent intention issuing process.
The template matching flow of step 7: the finally matched network intention is shown in fig. 6. Here intent_id is a unique number identifying the intention in the system; operation identifies the operation type, one of establish connection, modify connection, and disconnect; objects represent the objects of the operation, each comprising an object type (sensor node, command node, weapon node) and a formatted node ID number; constraints comprise several constraint conditions (the states the intention wants the network to reach), each with a condition type (bandwidth, delay, data rate), a specific value, a target type (greater than, less than, maximize, minimize) and a unit. The flow mainly comprises the following steps:
Step A: extract the operation entity in the intention translation result, denoted operation. Match whether a CONNECT or MODIFY entity exists in the entity recognition result output in step 6; if so, go to step B; otherwise go to step E and ask the user to re-enter the intention;
Step B: detect whether SENSOR, WEAPON or C2 entity classes exist, denoted objects. If two such entities (identical or different) are present together with a CONNECT or MODIFY entity, go to step C; if two such entities (identical or different) are present together with a DISCONNECT entity, go to step D; otherwise go to step E;
Step C: detect whether BW, DELAY or DR entity types exist, denoted constraints. If one or more non-overlapping entity types exist, record them one by one and detect the adjacent numeric, target and unit entities to form the constraints format shown in fig. 6, then go to step D; otherwise go to step E;
Step D: complete the matching, create the intention, add the intent_id number attribute, and format it into the form shown in fig. 6;
Step E: the match fails; the user is asked to re-enter.
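A compact sketch of steps A to E; the label names, the json field layout and the use of a random intent_id are assumptions modelled on the description of fig. 6:

```python
import uuid

OPERATIONS = {"CONNECT", "MODIFY", "DISCONNECT"}
NODE_TYPES = {"SENSOR", "WEAPON", "C2"}
CONSTRAINT_TYPES = {"BW", "DELAY", "DR"}

def match_intent(entities):
    """entities: list of (entity_class, text) pairs from the recognizer.
    Returns the formatted intent dict, or None when matching fails (step E)."""
    operation = next((c for c, _ in entities if c in OPERATIONS), None)
    objects = [{"type": c, "id": t} for c, t in entities if c in NODE_TYPES]
    constraints = [{"type": c, "value": t} for c, t in entities
                   if c in CONSTRAINT_TYPES]

    # Steps A/B: an operation and exactly two node objects are required;
    # CONNECT/MODIFY intents additionally need constraints (step C).
    if operation is None or len(objects) != 2:
        return None
    if operation != "DISCONNECT" and not constraints:
        return None

    return {                        # step D: formatted network intent
        "intent_id": str(uuid.uuid4()),
        "operation": operation,
        "objects": objects,
        "constraints": constraints,
    }
```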
The invention has the beneficial effects that:
in order to meet the increasingly complex and diversified performance and service requirements of battlefield environments, the data link network needs to reconfigure itself adaptively according to intentions such as different performance targets, environments and service requirements; the intention-driven data link network has become an important component since the introduction of artificial intelligence into data link networks. One of the core parts of the intention-driven data link network is the translation of natural language intention: the user only has to state, in everyday language, what is expected of the network, and the intention system converts this into a management instruction that can be issued into the network. Most existing network management schemes require the administrator or user to define intentions in a programming language, and thus demand sufficient network knowledge and programming skill.

The invention provides a natural language intention translation method for an intention-driven data link network that lets the user input intentions in natural language; an artificial intelligence model automatically extracts the keywords from the user's input and converts them into a fixed-format configuration command for the intention automation system to call. Conventional natural language entity recognition resources are usually designed for general scenarios; their recognition ability in a data link network management scenario is very limited and cannot meet the needs of intention network management. The invention first proposes an intention entity label set oriented to the data link network, covering common elements such as operations on the network, specific operation objects, states the network is expected to reach, and various network parameters, to satisfy the intention automation system. Entity labels are annotated by combining automatic annotation with manual screening and converted into a data set format the Bert pre-training model can use.

The Bert model works by pre-training followed by fine-tuning: with the basic Chinese Bert model and the data link network intention data set, only a very small amount of additional training achieves highly accurate intention entity recognition, improving both the efficiency and the accuracy of data link network intention translation. When the Bert model recognizes intention entities, its basic tokenizer can only split sentences into single characters according to the vocabulary, and because the Chinese vocabulary supports English words poorly, node numbers, physical units and other English words cannot be recognized accurately. By modifying the Bert tokenizer, the invention segments Chinese character by character while keeping important English words whole, improving the accuracy of data link intention translation. The Bert model outputs labels for all entities, but model errors may mislabel some tokens so that a span of entities cannot be recognized further.
The invention adds a CRF layer after the Bert output to further constrain the entity recognition labels and output a coherent recognition result. Finally, the invention matches the recognized data link network intention entities one by one against templates, obtaining a configuration file with a uniform format, so that a program can directly read the intention entity recognition result and the intention automation system obtains its parameter input.
Drawings
FIG. 1 is a diagram showing the overall flow and data input/output of the method according to the present invention;
FIG. 2 is a comparison of word segmentation before and after the tokenizer optimization according to the present invention;
FIG. 3 is a diagram showing the calculation of three types of Embedding of the model according to the present invention;
FIG. 4 shows the basic structure of the Bert model used in the present invention in relation to input/output;
FIG. 5 is a model deployment framework for intent translation in accordance with the present invention;
FIG. 6 is an example of the result of matching the intent template according to the present invention;
FIG. 7 shows an embodiment of the present invention;
FIG. 8 (a) shows the variation of the model metrics during training in the embodiment of the present invention; (b) shows the variation of the loss during training in the embodiment.
Detailed Description
A pre-trained offline model for data link network intention entity recognition is loaded with PyTorch, so that a text sequence can be input to obtain the intention entity annotation result of the sequence. However, a PyTorch model is not continuously predictive by itself, i.e. it is not kept online as a continuous service. Therefore a Web server is built with the Flask framework and the model is loaded into it, so that it can continuously accept text sequences, return entity annotation results, and expose an interface for the front-end input interface to call. A front-end framework is realized with Django: the user's data link network intention, expressed in natural language, is obtained through text input, the interface is called for natural language understanding, and the recognized intention type and parameters are output. After the data link network intention recognition result is obtained, it is converted into fixed-format json parameter configuration through the intention template matching flow. The deployment framework of the simulation platform is shown in fig. 5, and the flow of the specific embodiment in fig. 7. The specific steps are as follows:
step one: data chain intention corpus sample acquisition and annotation
Relevant literature on data link network management, typical data link network application scenarios, data link network protocol performance simulation analysis, data link network protocol optimization and the like is collected and arranged into an unlabeled sample data set in which no sentence exceeds 128 words. doccano is started and the unlabeled data link network intention corpus is loaded. At the same time the automatic annotation module of the intention data service in fig. 5 (1) is started, with the known entity types pre-configured into the data service. Each sentence is automatically annotated and manually screened one by one, giving the annotated data link network intention data set.
Step two: fine tuning of Bert-CRF models
A pre-trained basic Bert Chinese model is downloaded as the base model and the relevant parameters are set; the main parameters are: maximum training sequence length 128, maximum validation sequence length 512, learning rate 0.00003, CRF-layer learning rate 0.001, and 4 training epochs. English words that may occur in data link network intentions and would otherwise be split are added to the Bert tokenizer as special tokens, the embedding length of the model is adjusted, and the new vocabulary and tokenizer are saved with the model. The model is then trained and fine-tuned in the PyTorch framework, learning the Bert parameters suited to data link intention entity recognition and the CRF transition matrix parameters. The changes of the metrics and the loss during training are shown in fig. 8(a) and fig. 8(b): after the 4th epoch each metric of the model reaches a good level and subsequent epochs converge gradually, so this embodiment fine-tunes the model for 4 epochs. After training, the model is saved as an offline file.
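Collected as one configuration, together with the save calls for the offline file; the dict name, the model/tokenizer objects from the sketches above, and the file names are illustrative:

```python
import torch

# Hyper-parameters of the embodiment, gathered in one place.
config = {
    "max_train_seq_len": 128,
    "max_eval_seq_len": 512,
    "bert_lr": 3e-5,    # learning rate 0.00003 for the Bert parameters
    "crf_lr": 1e-3,     # learning rate 0.001 for the CRF layer
    "epochs": 4,
}

# After fine-tuning, persist the model and the adjusted tokenizer so the
# deployment step can reload them offline.
torch.save(model, "bert_crf_intent.pt")
tokenizer.save_pretrained("tokenizer_with_special_tokens")
```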
Step three: deployment of Bert-CRF models
The offline model saved after training in step two is loaded with PyTorch and deployed with Flask as a model capable of continuous online prediction. An intention recognition interface is opened for the intention input service to call.
Step four: intent acquisition and translation
A front-end interface for intention input is deployed with Django; the user describes the expected network intention in natural language in a front-end text box, in this embodiment "modify the link from radar SENSOR-16-56 to UAV C2-16-91 so that the bandwidth reaches 74 GHz", and the intention recognition interface is called to translate the intention. First the sentence is segmented with the tokenizer of step two; the segmentation result is shown in the latter case of fig. 2. The segmented sequence is input into the Bert model fine-tuned in step two; a fully connected layer after the Bert output yields the probability of each entity class for each token. This output is fed into the CRF layer, finally giving the entity prediction label of each token, i.e. the forms B-XX, I-XX and O. Template matching is performed on the result, finally producing the formatted json file.
Step five: judging whether to continue input, if so, turning to a step four; otherwise, the flow ends.

Claims (10)

1. A natural language intention translation method for an intention-driven data link network, comprising the following specific steps:
step 1: data chain network management intention sample data acquisition and labeling;
collecting a natural language corpus related to the data link network, covering the application scenarios, performance indexes, network architecture, network management, network configuration and network operation and maintenance of the data link network; defining the entity types: network management operations, operation objects, expected states, performance indexes and spatio-temporal constraints;
step 2: chinese and English mixed intention sentence word segmentation;
word segmentation processing is carried out using Bert's WordPiece; after segmentation, each character/word in the collected corpus is labeled with the BIO scheme according to the entity labels annotated in step 1, where B marks the beginning of an entity, I marks the middle or end of an entity, and O marks non-entity tokens;
step 3: text characterization based on the Bert model;
the basic Bert Chinese pre-training model is adopted; the original intention input is text, which a mathematical model cannot process directly, so the text is encoded into numeric vector form; the input of the Bert model is the sum of three kinds of embeddings, used as the representation of each token; a pre-trained Transformer serves as the encoder and learns the context of the text through its attention mechanism; a randomly initialized classifier is then added on top of the model: a fully connected layer of size (h, k) is added on the Bert model as a linear mapping from the h-dimensional hidden vector to the k-dimensional output vector, where k is the number of entity types;
step 4: entity identification based on a CRF model;
after the Bert model computes the representation vectors of the text, the final fully connected layer maps the model output to k-dimensional space, yielding the score of each label for each token; a CRF layer is added after this last layer to constrain the output;
step 5: training a Bert-CRF model;
dividing the data link network intention corpus collected and annotated in step 1 into training, validation and test sets in the ratio 6:2:2; fine-tuning the model from the Bert pre-trained model, training the encoder and classifier of step 3 with a small learning rate; in each epoch, randomly sampling training set data and executing steps 2, 3 and 4 in turn, comparing the prediction with the labels annotated in step 1, computing and backpropagating the CRF loss, and adjusting the network weights and the transition matrix in the CRF layer;
step 6: deploying an application by a model;
saving the model trained in step 5 as an offline file and deploying it as a continuously predicting model based on the Flask Web framework, with an open interface; the input is a single data link intention sentence, the offline model file is called, steps 2 to 4 are executed in turn, and the intention entity recognition result of the sentence is output;
step 7: template matching;
the entity recognition result output in step 6 does not distinguish the operation, constraint and object categories defined in step 1 and cannot be used directly as a data link network management instruction; entity categories are mapped, through template matching, into json statements that the intention network can process and input into the underlying intention automation system to issue the intention; whether to continue with the next intention is then judged: if yes, go to step 6; otherwise go to step 8;
step 8: and (5) ending.
2. The method for natural language intent translation for an intent-driven data link network as recited in claim 1, wherein the data link network management intention entity types of step 1 include: create link, modify link, disconnect link; sensor node, command node, weapon node; best effort, constraint; transmission rate, end-to-end delay, bandwidth; speed unit, time unit, bandwidth unit; number, node number.
3. The method of claim 1, wherein the tokenizer of step 2 is configured such that English words that may occur and that the default tokenizer would split apart are individually added to the tokenizer's token list, and the token embedding layer of the model is resized to the adjusted vocabulary length.
4. The method of claim 1, wherein the three embedding ways of step 3 are Token Embeddings, Segment Embeddings and Position Embeddings; Token Embeddings are fine-grained tokens composing the text, one per character; two special characters are added to each input sentence: [CLS] at the beginning and [SEP] at the end; a [CLS] token is inserted at the start of every input sequence to mark its beginning, and the output of the last Transformer layer at this token aggregates the representation of the whole sequence; [SEP] marks the end of a sentence and the start of a second one, so that the model can distinguish two different sentences; Segment Embeddings distinguish two sentences and serve as the basis for multi-sentence input tasks; since the same word carries different meanings at different positions in a sentence, Position Embeddings encode position information added to each word vector, ensuring that the same word at different positions is represented differently.
5. The method according to claim 1, wherein the Bert model of step 3 comprises the encoder part of the natural language processing model Transformer, formed by stacking several Transformer Encoder layers; below it is the embedding structure of step 3 and above it the CRF structure of step 4; during training it uses a masked language model, i.e. some positions of the input sequence are masked randomly and then predicted by the model.
6. The method for natural language intent translation for an intent-driven data link network as recited in claim 1, wherein said text representation based on the Bert model in step 3 comprises the steps of:
step A: define the k label categories identified in the given intention data set, with BIO as the labeling mode; for each piece of text text = (w_1, w_2, ..., w_n), the token sequence is text' = (w'_1, w'_2, ..., w'_n), where n is the sequence length of the text after segmentation by the tokenizer defined in step 2;
step B: the Bert model embeds text' by the embedding method of step 3, giving the embedded sequence X ∈ R^{n×d}, where d is the dimension of the embedding vector;
step C: X passes through the text encoding layer of the Bert model, which models the token sequence text' = (w'_1, w'_2, ..., w'_n) to obtain the hidden representation H ∈ R^{n×h}, where h is the dimension of the hidden vector;
step D: H passes through the fully connected layer of the Bert model, which predicts a label for every token and yields the classification result L ∈ R^{n×k}; each row l_i ∈ R^k holds the prediction scores of token w'_i over all labels, and k is the number of entity labels.
7. The method for natural language intent translation for an intent-driven data link network as recited in claim 1, wherein the CRF model of step 4 mainly comprises: the output of the Bert model serves as the input of the CRF layer, and the label path is the prediction target; the vector L ∈ R^{n×k} obtained in step 3, i.e. the score of each token for each entity class, becomes the emission probability; the input of the CRF also contains a transition matrix T ∈ R^{k×k}, whose entries are the weights of transferring from one entity label to the next; for a token sequence of length n there are N = k^n possible label paths, the score of path i being denoted P_i, and the total score of all paths is P_total = P_1 + P_2 + ... + P_N = e^{S_1} + e^{S_2} + ... + e^{S_N}, where S_i is the path score computed from the emission and transition weights; each element of the transition matrix is assigned randomly at model initialization and the accurate transition weights must be learned by training, so the loss function of the CRF layer is defined as Loss = -log(P_realpath / P_total); if path i is the true path, P_i should be the highest score among P_1 ~ P_N; following this loss function, the model parameters are iteratively optimized so that the share of the true path grows as large as possible, finally yielding the transition matrix; feeding the emission probabilities and the transition matrix into the CRF layer finally produces the most probable transition path, i.e. the entity label sequence of the token sequence.
8. The method for natural language intent translation for an intent-driven data link network of claim 1, wherein the constraints the CRF model of step 4 places on the output add restrictions to the final label prediction to guarantee the validity of the entity classes the model outputs, help the model choose a correct and reasonable entity label sequence, and further reduce the prediction error rate, mainly including:
(1) the label of the first token of every recognized entity must begin with "B-" or be "O", not "I-", so that the entity conforms to the BIO tagging specification;
(2) within one entity "B-L1, I-L2, I-L3, ...", the labels L1, L2, L3, ... must denote the same entity type;
(3) the label of the first token of every sentence must begin with "B-" or "O", not "I-".
9. The method for natural language intent translation for an intent-driven data link network as recited in claim 1, wherein the model deployment architecture of step 6 includes:
(1) the intention data service, comprising a Flask-based automatic annotation module and a doccano-based manual annotation and correction service, which performs mixed entity annotation of the collected corpus according to the intention types specified in step 1;
(2) the intention collection service, which collects intention sentences input by the user from a Django-based Web interface and passes them to (3);
(3) the intention translation service, which deploys the Bert-CRF entity recognition model of steps 2, 3 and 4 online based on Flask so that it can continuously recognize entities in the input, and opens a RESTful API interface for (2) to call;
(4) the intention presentation service, which receives the intention result recognized in (3) and pushes the data to the front-end Web interface for the user to confirm the intention before the subsequent intention issuing process.
10. The method for natural language intent translation for an intent-driven data link network of claim 1, wherein in the template matching flow of step 7 the finally matched network intention is shown in fig. 6, where intent_id is a unique number identifying the intention in the system; operation identifies the operation type, one of establish connection, modify connection and disconnect; objects represent the objects of the operation, each comprising an object type and a formatted node ID number; constraints comprise several constraint conditions, each with a condition type, a specific value, a target type and a unit; the flow comprises the following steps:
step A: extract the operation entity in the intention translation result, denoted operation; match whether a CONNECT or MODIFY entity exists in the entity recognition result output in step 6; if so, go to step B; otherwise go to step E and ask the user to re-enter the intention;
step B: detect whether SENSOR, WEAPON or C2 entity classes exist, denoted objects; if two such entities are present together with a CONNECT or MODIFY entity, go to step C; if two such entities are present together with a DISCONNECT entity, go to step D; otherwise go to step E;
step C: detect whether BW, DELAY or DR entity types exist, denoted constraints; if one or more non-overlapping entity types exist, record them one by one and detect the adjacent numeric, target and unit entities, then go to step D; otherwise go to step E;
step D: complete the matching, create the intention, and add the intent_id number attribute;
step E: the match fails; the user is asked to re-enter.
CN202311301236.4A 2023-07-14 2023-10-09 Natural language intention translation method oriented to intention driving data link network Pending CN117350286A (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN2023108718283 2023-07-14
CN202310871828 2023-07-14

Publications (1)

Publication Number Publication Date
CN117350286A true CN117350286A (en) 2024-01-05

Family

ID=89356886

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311301236.4A Pending CN117350286A (en) 2023-07-14 2023-10-09 Natural language intention translation method oriented to intention driving data link network

Country Status (1)

Country Link
CN (1) CN117350286A (en)


Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117574875A (en) * 2024-01-08 2024-02-20 成都愿景仿视科技有限公司 Natural language understanding modeling method
CN117574875B (en) * 2024-01-08 2024-04-26 成都愿景仿视科技有限公司 Natural language understanding modeling method


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination