CN117151069B - Security scheme generation system - Google Patents

Info

Publication number
CN117151069B
Authority
CN
China
Prior art keywords
scheme
security
information
document
module
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202311421473.4A
Other languages
Chinese (zh)
Other versions
CN117151069A (en
Inventor
张春荣
张芸捷
娄庄西
李晓永
敬军
张心臻
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
CETC 15 Research Institute
Original Assignee
CETC 15 Research Institute
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by CETC 15 Research Institute
Priority to CN202311421473.4A
Publication of CN117151069A
Application granted
Publication of CN117151069B
Active legal status (current)
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/166 Editing, e.g. inserting or deleting
    • G06F40/186 Templates
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/332 Query formulation
    • G06F16/3329 Natural language query formulation or dialogue systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G06N3/0455 Auto-encoder networks; Encoder-decoder networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00 Computing arrangements using knowledge-based models
    • G06N5/02 Knowledge representation; Symbolic representation

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Mathematical Physics (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Computing Systems (AREA)
  • Evolutionary Computation (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Software Systems (AREA)
  • Databases & Information Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Human Computer Interaction (AREA)
  • Machine Translation (AREA)

Abstract

The invention belongs to the technical field of electric digital data processing and relates to a security scheme generation system comprising a dialogue module, a plan recommendation module, a key information extraction module and a text generation module. The dialogue module prompts a user to input content related to the task, environment and specific requirement information of the scheme to be generated, and extracts the task, environment and specific requirement information. The plan recommendation module retrieves similar historical schemes according to the task, environment and specific requirement information. The key information extraction module extracts the subject words and the document catalog from the user input content and the similar historical schemes. The text generation module generates the content of each section in the document catalog according to the keywords, the subject words and the similar historical schemes until a complete security scheme is generated. By querying the user for related information, the system can acquire more detailed specific requirement information, and it can quickly generate a new security scheme from similar historical schemes, reducing the user's workload in preparing a security scheme.

Description

Security scheme generation system
Technical Field
The invention relates to the technical field of electric digital data processing, and in particular to a security scheme generation system.
Background
At present, AI-based intelligent document generation is widely applied to meeting minutes, overviews, official documents, schemes and other fields. Writing documents manually often takes a great deal of time: the author has to gather information and plan the writing to meet the document requirements of different scenarios, which is time-consuming and labor-intensive. With the continued development of artificial intelligence, intelligent writing applications have emerged in various fields and have improved document writing efficiency.
In AI-based intelligent document generation, it is extremely important for the artificial intelligence to accurately acquire the key information used for writing. In the prior art, task-oriented human-machine interaction places more emphasis on context: each round of dialogue influences the next, so that the key information for writing can be acquired accurately. By technology type, task-oriented human-machine interaction can be divided into rule-based, semantic-analysis-based and data-driven approaches. Rule-based dialogue systems include direct implementations, in which dialogue logic and dialogue management are tightly coupled, and state-transition-based methods. These are easy to understand and quick to implement, but defining dedicated rules for every scenario makes the system very large; as the number of dialogues and tasks grows, data and code maintenance become very difficult and reusability is poor. Because of the diversity and complexity of language, a dialogue system built solely on logical structures and logical conditions cannot meet actual dialogue needs. Dialogue systems also require large amounts of effectively labeled data, which depends on manual labeling; labeling incurs high labor costs, yet often still fails to give good results because of misunderstandings and a lack of industry knowledge. Therefore, how to understand the user's intention, extract key information such as the specific tasks and specific requirements of the security scheme from the user's input, and use that key information to control the generated text of the security scheme is a problem to be solved.
Disclosure of Invention
The invention provides a security scheme generation system which is characterized by comprising a dialogue module, a plan recommendation module, a key information extraction module and a text generation module;
the dialogue module is used for carrying out a natural language dialogue with a user, prompting the user to input content related to the task, environment and specific requirement information of the scheme to be generated, extracting keywords from the user input content, and calculating keyword scores based on the occurrence frequencies of the keywords in the user input content; searching a knowledge base with the keywords to obtain a question set related to the keywords; calculating the similarity between the user input content and each question in the question set, and ranking the question set in combination with the keyword scores to obtain a preset number of high-similarity questions as the questions with which to query the user; the task, environment and specific requirement information related to the scheme to be generated are obtained from the initial user input content and from the content the user inputs when replying to the queries;
the plan recommendation module, based on the acquired task, environment and specific requirement information, obtains a recommended historical security scheme or plan as plan information according to the similarity between the scheme to be generated and existing schemes;
the key information extraction module extracts the subject words of the task, environment and specific requirement information of the scheme to be generated and of the plan information, and generates a document catalog of the security scheme according to the plan information;
the text generation module generates, for each section in the security scheme document catalog, the security scheme text content of that section by using a Transformer pre-training model, repeating the generation until the content of the whole security scheme document is produced and the final security scheme text is obtained;
the Transformer pre-training model comprises an encoding controller and a decoding controller;
the Transformer pre-training model is trained through specific requirement information and preset information;
the encoding controller is used for controlling the keywords and the positions of the keywords based on the membership relation between the keywords and the document catalog;
the decoding controller is used for adjusting the subject words of each section so that the content currently generated for the document catalog contains the subject words and the corresponding target content.
Further, the dialogue module ends the continued querying when the user inputs content expressing an intention to terminate or when the number of dialogue rounds exceeds a preset value.
Further, the plan recommendation module trains a CBOW model on a sample document set and represents each word in the task, environment and specific requirement information as a K-dimensional word vector;
searching with the feature vectors, calculating the similarity to historical security schemes or plans, and thereby retrieving similar schemes; the similarity is calculated with the following cosine similarity formula:
$$\mathrm{Sim}(e_i, f_j) = \frac{e_i \cdot f_j}{\lVert e_i \rVert \, \lVert f_j \rVert} = \frac{\sum_{k=1}^{K} e_{ik} f_{jk}}{\sqrt{\sum_{k=1}^{K} e_{ik}^{2}} \, \sqrt{\sum_{k=1}^{K} f_{jk}^{2}}}$$
wherein $\mathrm{Sim}(e_i, f_j)$ is the similarity between the $i$-th word in the task, environment and specific requirement information and the $j$-th scheme in the scheme library, and $e_i$ and $f_j$ are their word vector representations;
and filtering the retrieved similar security schemes through task priority rules to obtain the finally recommended similar security scheme as plan information.
Further, the key information extraction module extracts the subject words by using an LDA keyword extraction algorithm and generates the security scheme document catalog;
the LDA keyword extraction algorithm comprises the following steps:
acquiring text data from the task, environment and specific requirement information of the scheme to be generated and from the plan information;
performing Chinese word segmentation on the user dialogue information and the plan information with the Jieba word segmentation tool;
vectorizing the word segmentation result;
and performing LDA topic modeling, and selecting the optimal number of topics and the subject words by means of a perplexity versus topic-number curve.
Further, the security scheme document catalog comprises situation judgment, task description, personnel security, equipment security, material security and security actions;
the situation judgment comprises the environmental situation and the network situation;
the personnel security comprises personnel requirements and work responsibilities;
the equipment security comprises equipment requirements and equipment transportation;
the material security comprises material requirements and material transportation.
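Further, the perplexity is calculated as:
$$\mathrm{Perplexity}(D) = \exp\left(-\frac{\sum_{d=1}^{M} \log p(\mathbf{w}_d)}{\sum_{d=1}^{M} N_d}\right)$$
wherein $D$ is the document collection, $M$ is the total number of documents, $\mathbf{w}_d$ is the bag-of-words vector consisting of the words in document $d$, $p(\mathbf{w}_d)$ is the generation probability of document $d$ predicted by the model, and $N_d$ is the total number of words in document $d$.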
Further, the task is the goal of the security scheme to be generated and is derived from the initial user input content;
the environment comprises geographic information, climate information and emergency information of the target location for which the security scheme is to be generated;
the specific requirement information is used for determining the personnel, equipment and material requirements corresponding to the task and the environment.
Further, the encoding controller controls the positions at which the keywords appear in the generated security scheme text according to the membership relation between the keywords and the security scheme document catalog.
Further, for the theme of each section in the security scheme document catalog, the decoding controller boosts the subject words belonging to the section and suppresses the subject words not belonging to the section, so that the content generated for each section of the security scheme conforms to the corresponding theme.
Further, the dialogue module extracts each initial word in the user input content, performs semantic classification on each initial word, extracts a vector for each initial word to obtain the meaning it expresses, and merges initial words that are expressed differently but have the same meaning, or whose similarity meets a preset requirement, into a single word;
the merged word then replaces the original expressions in the user input content, the keywords are extracted, and the keyword scores are calculated from the occurrence frequencies of the keywords.
The beneficial effects achieved by the invention are as follows:
The security scheme generation system provides its modules to an information system, so that the information system can manage and run the modules in a unified way, a security scheme text generation model can be built quickly, and the workload of developers is reduced. Moreover, the scheme generation steps provided by the system assist the user in preparing security schemes and address the problems that scheme preparation is inefficient and that complex, rapidly changing emergency security situations are hard to handle.
The security scheme generation system provided by the invention performs semantic understanding of the content input by the user and conducts heuristic questioning based on the user's current input to obtain more specific requirement information for generating the scheme. The extracted key information is then used to generate a security scheme text whose content is accurate and controllable, which strengthens the user's control over the generated security scheme text.
Drawings
FIG. 1 is a schematic diagram of a security scheme generation system provided by an embodiment of the present invention;
FIG. 2 is a schematic flow chart of a security scheme generating system according to an embodiment of the present invention;
FIG. 3 is a schematic flow chart of a dialogue module in a security scheme generation system according to an embodiment of the present invention;
FIG. 4 is a schematic flow chart of key information extraction in a security scheme generation system according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of a security scheme document catalog in a security scheme generation system provided by an embodiment of the present invention;
FIG. 6 is a schematic diagram of a text generation module in a security scheme generation system according to an embodiment of the present invention.
Detailed Description
The technical solution of the present invention is described in more detail below with reference to the accompanying drawings; the invention includes, but is not limited to, the following embodiments.
As shown in FIG. 1, the present invention proposes a security scheme generation system including a dialogue module, a plan recommendation module, a key information extraction module, and a text generation module.
As shown in FIG. 2, the security scheme generation system works in the following steps: first, the dialogue module carries out a natural language dialogue with the user to obtain the task, environment and specific requirement information; second, the plan recommendation module retrieves similar recommended security schemes; third, the key information extraction module extracts information such as the subject words of the scheme, and the document catalog of the security scheme is generated according to the element template; finally, the text generation module generates the security scheme text content of each section, repeating the generation until the whole document content is produced and the final security scheme text is obtained.
The dialogue module uses a heuristic dialogue technique to acquire the requirements for writing the security scheme: through question-and-answer dialogue with the user, the heuristic dialogue system obtains the task, environment and specific requirement information. By establishing topic associations among knowledge points, the dialogue system can actively discover related knowledge, make full use of knowledge coordination, guide the dialogue process and actively push knowledge to the user at a suitable time. The system may open a session by proactively asking the user a question, or may complete the dialogue task by asking the user to respond to questions.
Heuristic dialogue and knowledge management technology give the user a smoother way of exchanging knowledge and information. When the user does not know what to input, the system actively guides the interaction based on the user's previous question. User questions can be linked to knowledge points (entity-related attributes in the knowledge graph, KG) in various forms (question-answer pairs, knowledge graphs and so on), and the knowledge points are merged into topics. Topics are switched according to semantic or logical relations, and the dialogue as a whole is planned and steered by topic. Topic planning is based on a topic tree: after the user's question starts the initial human-machine exchange, the machine selects related questions from the related topic structure so as to hold multiple rounds of dialogue around the topics the user cares about, exploring the user's real needs while answering the user's questions and gradually leading toward the final target topic. For example, for the question "What are the current environmental conditions?", if the user answers "heavy rain", the dialogue system automatically asks "Is there traffic congestion?" or "Is there a landslide or similar disaster?"; if the user answers "high temperature and scorching sun", the dialogue system automatically asks "Has anyone suffered heatstroke?" and so on. Through these follow-up questions the user can provide more accurate information, so that more accurate specific requirements are obtained and the content generated for the situation judgment section of the security scheme better matches actual conditions.
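The follow-up behaviour in the example above can be illustrated with a minimal sketch. It assumes the topic links are held in a plain dictionary from trigger phrases to follow-up questions; the `FOLLOW_UPS` table and the `follow_up_questions` helper are hypothetical names introduced only for illustration, and the patent's topic tree and knowledge graph are not reproduced here.

```python
# Hypothetical topic links: each trigger phrase maps to its follow-up questions.
FOLLOW_UPS = {
    "heavy rain": [
        "Is there traffic congestion?",
        "Is there a landslide or similar disaster?",
    ],
    "high temperature": ["Has anyone suffered heatstroke?"],
}


def follow_up_questions(user_reply, already_asked=()):
    """Return follow-up questions whose trigger phrase occurs in the reply."""
    reply = user_reply.lower()
    questions = []
    for trigger, follow_ups in FOLLOW_UPS.items():
        if trigger in reply:
            # Skip questions that were already put to the user.
            questions.extend(q for q in follow_ups if q not in already_asked)
    return questions


rain_questions = follow_up_questions("There is heavy rain right now")
heat_questions = follow_up_questions("High temperature and scorching sun")
```

Substring matching is a stand-in chosen for brevity; a real system would match on semantics, as the workflow below describes.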
As shown in fig. 3, the specific workflow of the dialogue module is:
1. Extract keywords from the user input content and calculate the keyword scores.
In calculating keyword scores, the applicant found the following problem: because user input is not standardized, words or expressions that have the same or similar meaning, or that differ only in word order, may be recognized as different keywords during extraction. An expression with a single meaning is then often split into two or more expressions when the scores are counted, which lowers the score of each variant and reduces the weight and score of important keywords.
To solve this problem, in a preferred implementation of the invention, each initial word in the user input content is extracted and semantically classified, a vector is extracted for each initial word to obtain the meaning it expresses, and initial words that are expressed differently but have the same meaning, or whose similarity meets a preset requirement, are merged into a single word. The merged word replaces the original expressions for subsequent keyword extraction. In the replacement, each group of similar expressions is replaced by the member of the group that occurs most frequently.
Then an unsupervised Chinese short-text keyword extraction algorithm is adopted. Keyword extraction is treated as a sequence labeling problem: the unsupervised SIFRank algorithm is first used to label the corpus, and a keyword extraction model is then trained to extract keywords from the short text input by the user. For example, for the user input "write an emergency security scheme for road rush repair", "the main tasks include road rush repair, material security and so on" and "under heavy rain", the extracted keywords are "road rush repair", "heavy rain", "road rush repair" and "material security";
The keyword score is calculated from the occurrence frequency of each keyword or group of same-meaning keywords. Counting same-meaning keywords as one keyword avoids the score errors that inconsistent user wording would otherwise cause, errors that would prevent the important keywords from being identified accurately.
2. Compute word vector representations with word2vec and search the knowledge base with the keywords.
After the keywords in the user input content are extracted, questions are retrieved from the question-answer (QA) knowledge base using the keywords. With word2vec text vector representations, semantically similar words lie close together in vector space, so retrieving by vector realizes semantic retrieval. User input content can be linked to knowledge points (entity-related attributes in the KG) in various forms (question-answer pairs, knowledge graphs and so on), and the knowledge points are merged into topics. In this example, "heavy rain" and "road rush repair" are associated with knowledge graph nodes concerning "rain", "traffic" and so on.
3. Return the question set according to the knowledge retrieval result.
Based on the question-answer pairs, matches are retrieved from the database from a semantic perspective. For example, the question set recalled for "heavy rain" includes "Is there traffic congestion?" and "Is there a landslide or similar disaster?";
4. Calculate the similarity and rank the question set in combination with the keyword scores to obtain the top N questions, which gives the questions to ask. The specific steps are as follows:
The number of questions recalled by keyword is still very large, so the vector similarity between the user input content and the recalled questions must be calculated. A non-interactive two-tower (DSSM) model is adopted: sentence vectors of the questions in the database are computed in advance and stored in the database; the sentence vector of the user input content is then computed with the Sentence-BERT model, and the cosine value between the sentence vector of each recalled question and that of the user input content is used as the similarity.
The question set is ranked by combining the similarity and the keyword scores to obtain the top N questions, which gives the questions to ask. For example, "Is there traffic congestion?" and "Is there a landslide or similar disaster?" are ranked according to the keyword scores: because "road rush repair" has a high score, the questions related to it are ranked first: 1. "Is there traffic congestion?" 2. "Is there a landslide or similar disaster?". The question finally returned for the query is "Is there traffic congestion?".
5. End the continued querying when the user inputs content expressing an intention to terminate or when the number of dialogue rounds exceeds a preset value.
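Steps 1, 4 and 5 of the workflow above can be sketched together as follows. This is an illustration under stated assumptions, not the patented implementation: the score is taken as relative frequency after synonym merging, similarity and keyword score are combined by a weighted sum with a hypothetical weight `alpha` (the source does not say how they are combined), termination intent is approximated by phrase matching, and the sentence vectors are toy values rather than Sentence-BERT output. All function names are hypothetical.

```python
import math
from collections import Counter


def keyword_scores(keywords, synonym_groups=()):
    """Step 1: merge same-meaning variants, then score by relative frequency."""
    counts = Counter(keywords)
    canonical = {}
    for group in synonym_groups:
        # Replace every variant with the most frequent member of its group.
        best = max(group, key=lambda w: counts[w])
        for word in group:
            canonical[word] = best
    merged = Counter(canonical.get(w, w) for w in keywords)
    total = sum(merged.values())
    return {w: c / total for w, c in merged.items()} if total else {}


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 for a zero vector."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank_questions(query_vec, candidates, scores, top_n=1, alpha=0.5):
    """Step 4: candidates are (question, recalling_keyword, sentence_vector)."""
    ranked = sorted(
        candidates,
        key=lambda c: alpha * cosine(query_vec, c[2])
        + (1 - alpha) * scores.get(c[1], 0.0),
        reverse=True,
    )
    return [question for question, _, _ in ranked[:top_n]]


def should_stop(user_text, round_index, max_rounds=5,
                stop_phrases=("that is all", "no more")):
    """Step 5: stop on a termination phrase or when rounds exceed the limit."""
    text = user_text.strip().lower()
    return any(p in text for p in stop_phrases) or round_index > max_rounds


keywords = ["road rush repair", "heavy rain", "rush repair of roads",
            "road rush repair", "material security"]
scores = keyword_scores(
    keywords, synonym_groups=[("road rush repair", "rush repair of roads")]
)
candidates = [
    ("Is there a landslide or similar disaster?", "heavy rain", [0.6, 0.8]),
    ("Is there traffic congestion?", "road rush repair", [0.8, 0.6]),
]
top = rank_questions([1.0, 0.0], candidates, scores, top_n=1)
print(top)
```

With these toy inputs the traffic question ranks first, matching the example in step 4, because "road rush repair" carries the highest keyword score.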
The plan recommendation module uses the task, environment and specific requirement information to recommend similar texts according to the similarity between security tasks, building a text-based semantic matching retrieval system. To this end, the key text is vectorized in some way (word2vec, doc2vec, ELMo, BERT and the like), similarity is then calculated with the feature vectors to retrieve similar schemes, and the results are filtered by certain rules to obtain the finally recommended similar scheme.
The specific implementation process of the plan recommendation module is as follows:
1. Train the CBOW model of word2vec on a sample document set (vectorized representations such as doc2vec, ELMo or BERT can also be adopted) and represent each word in the task, environment and specific requirement information as a K-dimensional word vector;
2. Search with the feature vectors, calculate the similarity to historical security schemes or plans, and thereby retrieve similar schemes; the following cosine similarity formula is used:
$$\mathrm{Sim}(e_i, f_j) = \frac{e_i \cdot f_j}{\lVert e_i \rVert \, \lVert f_j \rVert} = \frac{\sum_{k=1}^{K} e_{ik} f_{jk}}{\sqrt{\sum_{k=1}^{K} e_{ik}^{2}} \, \sqrt{\sum_{k=1}^{K} f_{jk}^{2}}}$$
wherein $\mathrm{Sim}(e_i, f_j)$ is the similarity between the $i$-th word in the task, environment and specific requirement information and the $j$-th scheme in the scheme library, and $e_i$ and $f_j$ are their word vector representations;
3. Filter by priority rules (such as task priority and environment priority) to obtain the finally recommended similar scheme.
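The retrieval and filtering steps above can be sketched as follows. The sketch assumes that a scheme's overall score is the mean of the per-word cosine similarities and that the task priority rule simply prefers schemes of the same task type; both are illustrative assumptions, since the source does not specify how word-level similarities are aggregated or how the priority rules are defined. The scheme records and the `recommend` helper are hypothetical, and the vectors are toy values rather than CBOW output.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 for a zero vector."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def recommend(word_vectors, schemes, task_type, top_k=1):
    """Rank schemes by mean word-to-scheme similarity, then apply task priority."""
    scored = []
    for scheme in schemes:
        sims = [cosine(e, scheme["vector"]) for e in word_vectors]
        score = sum(sims) / len(sims) if sims else 0.0
        scored.append((score, scheme))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    ranked = [scheme for _, scheme in scored]
    # Assumed priority rule: prefer schemes with the same task type,
    # falling back to the plain similarity ranking if none match.
    same_task = [s for s in ranked if s["task_type"] == task_type]
    pool = same_task or ranked
    return [s["name"] for s in pool[:top_k]]


word_vectors = [[1.0, 0.0]]
schemes = [
    {"name": "flood road repair plan", "vector": [1.0, 1.0],
     "task_type": "road rush repair"},
    {"name": "heatwave medical plan", "vector": [1.0, 0.0],
     "task_type": "medical"},
]
result = recommend(word_vectors, schemes, task_type="road rush repair")
print(result)
```

In this toy data the heatwave plan has the higher raw similarity, so the example shows the priority filter changing the final recommendation.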
The key information extraction module uses a key information extraction model to extract information such as the scheme topics and elements from the user dialogue information and the similar scheme information, and generates the security scheme document catalog.
The key information extraction module performs topic mining with the LDA model and extracts key information such as topics from the text data of the user dialogue.
As shown in FIG. 4, the key information extraction module performs the following steps:
1. Acquire the text data of the user dialogue information and the plan information;
2. Perform data cleaning and text preprocessing;
for Chinese text, the Jieba word segmentation tool is used for word segmentation, and preprocessing such as stop-word filtering is applied to the user dialogue information;
3. Compute word vectors for the word segmentation result with word2vec to vectorize the text;
4. Perform LDA topic modeling and select the optimal number of topics and the subject words by means of a perplexity versus topic-number curve;
The LDA model uses the Gibbs sampling algorithm to obtain the distributions of the topic $z$ and the subject word $w$. In LDA, the number of topics is a hyperparameter specified in advance. The number of topics $K$ is determined by calculating the topic perplexity, using the following formula:
$$\mathrm{Perplexity}(D) = \exp\left(-\frac{\sum_{d=1}^{M} \log p(\mathbf{w}_d)}{\sum_{d=1}^{M} N_d}\right)$$
wherein $D$ is the document collection, $M$ is the total number of documents, $\mathbf{w}_d$ is the bag-of-words vector consisting of the words in document $d$, $p(\mathbf{w}_d)$ is the generation probability of document $d$ predicted by the model, and $N_d$ is the total number of words in document $d$.
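As a worked illustration of the perplexity calculation, the sketch below computes the exponential of the negative total document log-probability divided by the total word count, then picks a topic number from a perplexity curve. It assumes the per-document log-probabilities come from an already fitted LDA model, and it selects the topic number with the lowest perplexity, which is only one way to read the curve (an elbow criterion is another). The function names and the curve values are hypothetical.

```python
import math


def perplexity(doc_log_probs, doc_lengths):
    """exp(-sum_d log p(w_d) / sum_d N_d) over a document collection."""
    total_words = sum(doc_lengths)
    if total_words == 0:
        raise ValueError("the collection contains no words")
    return math.exp(-sum(doc_log_probs) / total_words)


def pick_topic_number(perplexity_by_k):
    """Assumed selection rule: the topic number with the lowest perplexity."""
    return min(perplexity_by_k, key=perplexity_by_k.get)


# Two documents of two words each, every word predicted with probability 0.25,
# so log p(w_d) = 2 * log(0.25) for each document.
log_probs = [2 * math.log(0.25), 2 * math.log(0.25)]
value = perplexity(log_probs, doc_lengths=[2, 2])

# Hypothetical perplexity values measured for three candidate topic numbers.
best_k = pick_topic_number({2: 120.0, 4: 95.5, 6: 101.2})
```

A model that assigns each word probability 0.25 is as uncertain as a uniform choice among four words, which is why the perplexity here works out to 4.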
The key information extraction module also extracts element template information from the plan information, and the document catalog of the security scheme is generated according to the element template. The catalog comprises: situation judgment (environmental situation, network situation), task description, personnel security (personnel requirements, work responsibilities), equipment security (equipment requirements, equipment transportation), material security (material requirements, material transportation), security actions, and others. The security scheme document catalog is shown in FIG. 5.
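One possible in-memory form of this catalog is a mapping from chapters to sub-sections that is flattened into the ordered list of sections the text generation module would iterate over. The sketch below is only an illustration: the `CATALOG` dictionary, the `flatten_catalog` helper and the "1", "1.1" numbering style are assumptions, not part of the source.

```python
# Hypothetical element template: chapter title -> sub-section titles.
CATALOG = {
    "Situation judgment": ["Environmental situation", "Network situation"],
    "Task description": [],
    "Personnel security": ["Personnel requirements", "Work responsibilities"],
    "Equipment security": ["Equipment requirements", "Equipment transportation"],
    "Material security": ["Material requirements", "Material transportation"],
    "Security actions": [],
    "Others": [],
}


def flatten_catalog(catalog):
    """Return numbered section headings in document order."""
    sections = []
    for i, (chapter, subsections) in enumerate(catalog.items(), start=1):
        sections.append(f"{i} {chapter}")
        for j, sub in enumerate(subsections, start=1):
            sections.append(f"{i}.{j} {sub}")
    return sections


sections = flatten_catalog(CATALOG)
for heading in sections:
    print(heading)
```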
The text generation module generates the content of the security scheme text based on a Transformer pre-training model. For each section, a controllable text generation model pre-trained with a Transformer uses key information such as the theme and the similar historical schemes to generate the security scheme text content of that section, and thereby the content of the whole security scheme document. The core idea of the content-controllable text generation provided by the invention is as follows: key information of the natural text (such as subject words) is extracted through the heuristic dialogue method as basic constraint conditions, and the constraint conditions are then converted into natural language instructions to form weakly supervised training data. By adding natural language constraint descriptions and some demonstrations to the pre-trained language model and fine-tuning it on different types of constraints, the output of the text generation model becomes controllable.
The goal of the text generation module is to control a given model to generate text of a particular attribute based on the source text. Specific attributes include the subject matter of the text, keywords, length, etc. A text generation task that generates a Target sequence from a Source text may be modeled as P (target|source); while the controllable text generation task of the control signal is considered, it can be modeled as P (target|source, control signal). The invention is based on a transducer pre-training model, and the encoding controller and the decoding controller are respectively utilized to control the transducer encoder and the transducer decoder, and the pre-training model is finely adjusted by utilizing control signals, so that the controllable generation of a guarantee scheme text is realized, and the method is specifically shown in fig. 6:
The specific implementation flow is as follows:
1. Training based on the Transformer pre-training model.
Starting from the Transformer pre-training model, the retrieved plan information is input and the model is further trained on it, so as to strengthen its ability to write the security scheme to be generated.
The Transformer model comprises an encoder and a decoder. The encoder is one of the core components of the Transformer; its main task is to understand and process the input data. The encoder builds a powerful sequence-to-sequence mapping by combining a self-attention mechanism, a feed-forward neural network, normalization layers and residual connections. The self-attention mechanism enables the model to capture complex relationships inside the sequence, while the feed-forward network provides nonlinear computational power. The normalization layers and residual connections help stabilize the training process.
The decoder generates the target sequence from the output of the encoder and the previously generated part of the output sequence. The decoder has a structure similar to that of the encoder, but adds a masked self-attention layer and an encoder-decoder attention layer. The mask ensures that, when generating the output for each position, the decoder uses only the preceding positions. The encoder-decoder attention layer enables the decoder to use the output of the encoder. With this structure, the decoder is able to generate a target sequence that conforms to the context and to the source sequence information, providing a powerful solution for many complex sequence generation tasks.
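The effect of the mask in the decoder's self-attention layer can be shown with a small, dependency-free sketch. It is a simplified illustration of causal masking in general, not the model used by the invention; the score matrix is made-up example data and the function names are hypothetical.

```python
import math


def causal_mask(n):
    """mask[i][j] is True when position i may attend to position j (j <= i)."""
    return [[j <= i for j in range(n)] for i in range(n)]


def masked_softmax(scores, mask):
    """Row-wise softmax over raw attention scores, ignoring masked positions."""
    weights = []
    for row, allowed in zip(scores, mask):
        kept = [s for s, ok in zip(row, allowed) if ok]
        top = max(kept)  # subtract the row maximum for numerical stability
        exps = [math.exp(s - top) if ok else 0.0 for s, ok in zip(row, allowed)]
        total = sum(exps)
        weights.append([e / total for e in exps])
    return weights


# Made-up raw attention scores for a 3-token sequence.
scores = [
    [0.5, 1.0, 2.0],
    [1.0, 1.0, 3.0],
    [0.2, 0.4, 0.6],
]
weights = masked_softmax(scores, causal_mask(3))
for row in weights:
    print([round(w, 3) for w in row])
```

In the first two rows the weight on later positions is exactly zero, which is what prevents the decoder from looking at tokens it has not generated yet.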
2. Based on the Transformer pre-training model, an encoding controller (EN-control) in the encoder adjusts the model input to control the keywords and their positions, so that the generated result is influenced through input control signals.
The keywords and their positions are controlled by the encoding controller (EN-control). For example, using "[LENGTH50] [SEP] heavy rain [POSITION20]" as the control signal controls the keyword "heavy rain" and its corresponding position in the security scheme document catalog, so that the keyword "heavy rain" appears in the "environmental situation" section.
Meanwhile, the specific requirement information and the plan information acquired by the preceding modules are input to the Transformer encoder as the original text (source text).
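A minimal sketch of how such an encoder-side control signal might be assembled is given below. The token pattern follows the example in the text; how the signal is joined to the source text (here, prepended with a `[SEP]` separator) is an assumption, and the function names and the sample source sentence are hypothetical.

```python
def build_control_signal(keyword, length, position):
    """Format a control signal in the style of the example in the text."""
    return f"[LENGTH{length}] [SEP] {keyword} [POSITION{position}]"


def build_encoder_input(control_signal, source_text, separator=" [SEP] "):
    """Prepend the control signal to the source text (assumed layout)."""
    return control_signal + separator + source_text


signal = build_control_signal("heavy rain", length=50, position=20)

# Hypothetical source text standing in for requirement and plan information.
source = "Flood-season support task; heavy rain is forecast for the target area."
encoder_input = build_encoder_input(signal, source)
print(encoder_input)
```

Keeping the control signal as plain text means it passes through the same tokenizer as the source text, which is one reason input-side control needs no change to the encoder architecture.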
3. A decoding controller (DE-control) in the decoder adjusts the decoding strategy so that the generated result contains the target content as far as possible, that is, so that it stays on the specified topic.
The topic acquired by the preceding modules is input to the decoder as the decoder control signal, so that the generated document stays within the topic area and does not deviate from the topic.
In the implementation, the decoding controller calculates a probability distribution over the topic vocabulary and normalizes the probability distributions of the positive and negative categories, so as to determine which content the model should encourage or suppress. The decoding controller is trained using "[True] + topic word" and "[False] + topic word" as the control code and the anti-control code, respectively. Through decoding control, the pre-trained language model can be guided to perform conditional generation, that is, the model is controlled to generate text content within the topic field.
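One way to read the normalization of positive and negative categories is as a per-token reweighting of the language model's next-token distribution. The sketch below is an assumption about how this could work, not the disclosed implementation: the probabilities are made-up numbers, equal class priors are assumed, and `topic_posterior`, `reweight` and `strength` are hypothetical names.

```python
def topic_posterior(p_pos, p_neg):
    """Normalize positive vs. negative class probabilities for each token.

    p_pos: token probabilities under the control code ("[True] + topic word").
    p_neg: token probabilities under the anti-control code ("[False] + topic word").
    Equal class priors are assumed.
    """
    return {tok: p_pos[tok] / (p_pos[tok] + p_neg[tok]) for tok in p_pos}


def reweight(p_lm, posterior, strength=1.0):
    """Scale base probabilities by the topic posterior and renormalize."""
    weighted = {tok: p * posterior[tok] ** strength for tok, p in p_lm.items()}
    total = sum(weighted.values())
    return {tok: w / total for tok, w in weighted.items()}


# Made-up next-token distributions over a three-word vocabulary.
p_lm = {"rain": 0.2, "sunny": 0.5, "flood": 0.3}
p_pos = {"rain": 0.6, "sunny": 0.1, "flood": 0.3}
p_neg = {"rain": 0.2, "sunny": 0.7, "flood": 0.1}

posterior = topic_posterior(p_pos, p_neg)
controlled = reweight(p_lm, posterior)
print(controlled)
```

Tokens that are more likely under the control code than under the anti-control code gain probability mass, and the others lose it, which matches the encourage-or-suppress behavior described above.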
4. Finally, the decoder generates the target text, yielding the security scheme text.
The Transformer-based controllable text generation model adjusts both the model input and the decoding, and influences the generated result through the input control elements. Compared with other controllable text generation methods, this method is better suited to handling different types of constraints and has less impact on generation quality and speed; it also allows the model to adapt to new constraints through a small amount of task generalization and in-context learning, without retraining the model.
The present invention is not limited to the above embodiments. Those skilled in the art can implement the invention in various other embodiments according to the examples and the disclosure of the drawings; any simple change or modification that adopts the design structure and concept of the present invention therefore falls within the scope of protection of the present invention.

Claims (10)

1. A security scheme generation system, characterized by comprising a dialogue module, a plan recommendation module, a key information extraction module and a text generation module;
the dialogue module is used for carrying out a natural language dialogue with a user, prompting the user to input content related to the task, environment and specific requirement information of the scheme to be generated, extracting keywords from the user input content, and calculating keyword scores based on the occurrence frequencies of the keywords in the user input content; searching a knowledge base with the keywords to obtain a question set related to the keywords; calculating the similarity between the user input content and each question in the question set, and ranking the question set in combination with the keyword scores to obtain a preset number of high-similarity questions as the questions to be asked of the user; the task, environment and specific requirement information related to the scheme to be generated are obtained from the initial user input content and the content the user inputs when replying during the questioning;
the plan recommendation module acquires a recommended historical guarantee scheme or a plan as plan information according to the similarity between a scheme to be generated and an existing scheme based on the acquired task, environment and specific requirement information;
the key information extraction module extracts tasks, environments, specific requirement information of a scheme to be generated and subject words of plan information; generating a document catalog of the guarantee scheme according to the plan information;
the text generation module, for each section in the security scheme document catalog, generates the security scheme text content of that section by using a Transformer pre-training model, thereby generating the content of the whole security scheme document, and repeats the generation of the whole document content to obtain the final security scheme text;
the Transformer pre-training model comprises an encoding controller and a decoding controller;
the Transformer pre-training model is trained with the specific requirement information and the plan information;
the encoding controller is used for controlling the keywords and the positions of the keywords based on the membership relation between the keywords and the document catalog;
the decoding controller is used for adjusting the subject word of each section so that the content generated for the current document catalog section contains the subject word and the corresponding target content.
2. The system of claim 1, wherein the dialogue module ends the questioning when the user inputs content expressing an intention to terminate or when the number of dialogue rounds exceeds a preset maximum.
3. The security scheme generation system according to claim 1, wherein the plan recommendation module trains a CBOW model on a sample document set and represents each word in the task, environment and specific requirement information as a K-dimensional word vector;
retrieval is performed using the feature vectors, the similarity to each historical guarantee scheme or plan is calculated, and similar schemes are thereby retrieved; the similarity is calculated by the following cosine similarity formula:
$$\mathrm{Sim}(e_i, f_j) = \frac{e_i \cdot f_j}{\lVert e_i \rVert \, \lVert f_j \rVert}$$
wherein Sim(e_i, f_j) is the similarity between the i-th word in the task, environment and specific requirement information and the j-th scheme in the scheme library, and e_i and f_j are their word vector representations;
and filtering the obtained similar guarantee schemes through task priority rules to obtain the finally recommended similar guarantee schemes as plan information.
4. The security scheme generation system of claim 3, wherein the key information extraction module extracts the subject words using an LDA keyword extraction algorithm and generates the security scheme document catalog;
the LDA keyword extraction algorithm comprises the following steps of:
acquiring text data in the task, environment and specific requirement information of the scheme to be generated and in the plan information;
performing Chinese word segmentation on the user dialogue information and the plan information by using the Jieba word segmentation tool;
vectorizing the word segmentation result;
and performing LDA topic modeling, and selecting the optimal number of topics and the topic words by means of a perplexity (confusion degree) versus topic number curve.
5. The security scheme generation system according to claim 4, wherein the security scheme document directory includes situation judgment, task description, personnel security, equipment security, material security, security actions;
the situation judgment also comprises an environment situation and a network situation;
the personnel security also comprises personnel requirements and work responsibilities;
the equipment guarantee also comprises equipment requirements and equipment transportation;
the material guarantee also comprises material demand and material transportation.
6. The security scheme generation system of claim 4, wherein the perplexity (confusion degree) is calculated as:
$$\mathrm{perplexity}(D) = \exp\left(-\frac{\sum_{d=1}^{M} \log p(w_d)}{\sum_{d=1}^{M} N_d}\right)$$
wherein D is the document collection, M is the total number of documents, w_d is the bag-of-words vector of the words in document d, p(w_d) is the generation probability of document d predicted by the model, and N_d is the total number of words in document d.
7. The security scheme generation system according to claim 1, wherein the task is a target of a security scheme to be generated, the task being generated according to initial user input content;
the environment comprises geographic information, climate information and emergency information of a target location where a guarantee scheme is to be generated;
the specific requirement information is used for determining personnel, equipment and material requirements corresponding to the task and the environment.
8. The security scheme generation system according to claim 1, wherein the encoding controller controls a position where a keyword appears in the generated security scheme text according to a relationship between the keyword and the security scheme document directory.
9. The system according to claim 1, wherein, for the topic of each section in the security scheme document catalog, the decoding controller encourages subject words belonging to that section and suppresses subject words not belonging to that section, so that the content generated for each section of the security scheme conforms to the corresponding topic.
10. The system for generating a security scheme according to claim 1, wherein the dialogue module extracts each initial word in the user input content, performs vector extraction and semantic classification on each initial word to obtain the semantics it expresses, and merges initial words that are expressed differently but have the same semantics, or whose similarity reaches a predetermined requirement, into a single word;
and, in the user input content, replaces the original expressions with the merged word, extracts the keywords, and calculates the keyword scores according to the occurrence frequencies of the keywords.
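The cosine similarity retrieval of claim 3 and the perplexity measure of claim 6 can be illustrated with a short, dependency-free sketch. It is not the claimed implementation: the three-dimensional vectors, the scheme names, the per-document log-probabilities and the function names are all made up for the example, and how word vectors are aggregated into a query or scheme vector is left open.

```python
import math


def cosine_similarity(e, f):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(e, f))
    norm = math.sqrt(sum(a * a for a in e)) * math.sqrt(sum(b * b for b in f))
    return dot / norm if norm else 0.0


def rank_schemes(query_vec, scheme_vecs, top_k=2):
    """Return the top_k (name, similarity) pairs, most similar first."""
    scored = [(name, cosine_similarity(query_vec, vec))
              for name, vec in scheme_vecs.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)[:top_k]


def perplexity(doc_log_probs, doc_lengths):
    """exp(-(sum of log p(w_d)) / (sum of N_d)) over all documents d."""
    return math.exp(-sum(doc_log_probs) / sum(doc_lengths))


# Hypothetical vectors for a query and three historical schemes.
query = [1.0, 0.0, 1.0]
schemes = {
    "flood_2021": [1.0, 0.0, 1.0],
    "network_drill": [0.0, 1.0, 0.0],
    "storm_2022": [1.0, 1.0, 1.0],
}
ranked = rank_schemes(query, schemes)

# A model that assigns every word probability 1/4 has perplexity 4.
lengths = [3, 5]
log_probs = [n * math.log(0.25) for n in lengths]
ppl = perplexity(log_probs, lengths)
print(ranked, ppl)
```

When choosing the number of topics, the perplexity would be evaluated for each candidate topic count and plotted as a curve; lower values indicate that the topic model predicts the documents better.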
CN202311421473.4A 2023-10-31 2023-10-31 Security scheme generation system Active CN117151069B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311421473.4A CN117151069B (en) 2023-10-31 2023-10-31 Security scheme generation system

Publications (2)

Publication Number Publication Date
CN117151069A CN117151069A (en) 2023-12-01
CN117151069B true CN117151069B (en) 2024-01-02

Family

ID=88908550

Country Status (1)

Country Link
CN (1) CN117151069B (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant