CN113206854B - Method and device for rapidly developing national standard terminal protocol - Google Patents

Info

Publication number
CN113206854B
Authority
CN
China
Prior art keywords
protocol
corpus
training model
national standard
preprocessing
Prior art date
Legal status
Active
Application number
CN202110498225.4A
Other languages
Chinese (zh)
Other versions
CN113206854A (en)
Inventor
杜静波
杨威
程彦龙
Current Assignee
Shouyue Technology Beijing Co Ltd
Original Assignee
Shouyue Technology Beijing Co Ltd
Priority date
2021-05-08
Filing date
2021-05-08
Publication date
2022-12-13
Application filed by Shouyue Technology Beijing Co Ltd
Priority to CN202110498225.4A
Publication of CN113206854A
Application granted
Publication of CN113206854B
Legal status: Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L69/00 Network arrangements, protocols or services independent of the application payload and not provided for in the other groups of this subclass
    • H04L69/03 Protocol definition or specification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/284 Lexical analysis, e.g. tokenisation or collocates
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L69/00 Network arrangements, protocols or services independent of the application payload and not provided for in the other groups of this subclass
    • H04L69/18 Multiprotocol handlers, e.g. single devices capable of handling multiple protocols
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L69/00 Network arrangements, protocols or services independent of the application payload and not provided for in the other groups of this subclass
    • H04L69/22 Parsing or analysis of headers

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Signal Processing (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Computer Security & Cryptography (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computer And Data Communications (AREA)
  • Communication Control (AREA)

Abstract

The invention discloses a method and a device for rapidly developing a national standard terminal protocol. The method comprises the following steps: constructing a protocol corpus; preprocessing the protocol corpus; representing the segmented words and terms in a representation model usable by a computer; selecting a feature subset; constructing a training model according to the feature subset; submitting a protocol to be analyzed to the training model; generating skeleton code, refining the skeleton code, and testing the skeleton code; and modifying the training model. With the method and the device, various protocols can be analyzed quickly and development cost is reduced; when a relatively simple protocol is developed and analyzed, the burden on developers is reduced.

Description

Method and device for rapidly developing national standard terminal protocol
Technical Field
The invention belongs to the field of software development, and particularly relates to a method and a device for rapidly developing a national standard terminal protocol.
Background
As the state increasingly standardizes the management and control of various industries, the corresponding industry standard protocols are updated and iterated ever more frequently, and developing against standard communication protocols such as JT/T808-2011, JT/T808-2019 and JT/T1078-2016, and building platforms for them, has become a hard requirement for those working in the related industries. Protocol analysis work is tedious, highly repetitive and wide-ranging, but it follows certain rules.
In the prior art, developers must repeatedly analyze each protocol function by hand; development cost is high, errors are frequent, and debugging is difficult.
As compliance requirements of various kinds are rolled out, protocol analysis and the building of compliance platforms become more and more frequent, and the pace of update iterations keeps increasing; developers must continually spend a great deal of time on development, which often greatly increases development cost.
Disclosure of Invention
In view of the technical problems in the related art, the invention provides a method and a device for rapidly developing a national standard terminal protocol, which can overcome the above defects of the prior art.
To achieve this technical purpose, the technical scheme of the invention is realized as follows:
A method for rapidly developing a national standard terminal protocol includes:
constructing a protocol corpus;
preprocessing the protocol corpus, wherein the preprocessing comprises: word segmentation;
representing the segmented words and terms in a representation model usable by a computer, wherein the representation model comprises: a feature vector;
performing feature selection on the feature vector to select a feature subset;
constructing a training model according to the feature subset;
submitting a protocol to be analyzed to the training model;
analyzing the protocol with the training model to generate skeleton code, refining the skeleton code, and testing the skeleton code;
and modifying the training model according to the skeleton code test result.
Further, the manner of constructing the protocol corpus includes:
downloading international protocols and/or crawling corpora from the network.
Further, preprocessing the protocol corpus further includes:
data cleaning;
part-of-speech tagging;
stop-word removal.
Further, the representation model further includes: a bag-of-words model.
Further, constructing the training model includes: the training model corresponds to a connection code in the protocol corpus.
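The preprocessing steps listed above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function names, the stop-word list, and the regular-expression tokenizer are all hypothetical, and the tokenizer only stands in for a real word segmenter (Chinese protocol text would need a dedicated one). Part-of-speech tagging is omitted because it requires a trained tagger.

```python
import re

# Hypothetical stop-word list; a real system would load a curated one.
STOP_WORDS = {"the", "a", "an", "of", "is", "in", "and", "to"}


def clean(text: str) -> str:
    """Data cleaning: drop markup-like residue and collapse whitespace."""
    text = re.sub(r"<[^>]+>", " ", text)
    return re.sub(r"\s+", " ", text).strip()


def segment(text: str) -> list[str]:
    """Word segmentation stand-in: lowercase alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def remove_stop_words(tokens: list[str]) -> list[str]:
    """Stop-word removal: keep only tokens outside the stop-word list."""
    return [tok for tok in tokens if tok not in STOP_WORDS]


def preprocess(text: str) -> list[str]:
    """Run cleaning, segmentation, and stop-word removal in order."""
    return remove_stop_words(segment(clean(text)))


tokens = preprocess("<p>The message ID is a WORD in  the header</p>")
print(tokens)
```

Keeping each step as a separate function makes it straightforward to swap the tokenizer for a language-specific segmenter without touching the rest of the chain.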
In another aspect, the invention provides a device for rapidly developing a national standard terminal protocol, which comprises:
a first construction unit, used for constructing a protocol corpus;
a preprocessing unit, used for preprocessing the protocol corpus, wherein the preprocessing comprises: word segmentation;
a representation unit, used for representing the segmented words and terms in a representation model usable by a computer, wherein the representation model comprises: a feature vector;
a feature selection unit, used for performing feature selection on the feature vector to select a feature subset;
a second construction unit, used for constructing a training model according to the feature subset;
a submitting unit, used for submitting a protocol to be analyzed to the training model;
a testing unit, used for analyzing the protocol with the training model to generate skeleton code, refining the skeleton code, and testing the skeleton code;
and a modifying unit, used for modifying the training model according to the skeleton code test result.
Further, the manner of constructing the protocol corpus includes:
downloading international protocols and/or crawling corpora from the network.
Further, preprocessing the protocol corpus further includes:
data cleaning;
part-of-speech tagging;
stop-word removal.
Further, the representation model further includes: a bag-of-words model.
Further, constructing the training model includes: the training model corresponds to a connection code in the protocol corpus.
The beneficial effects of the invention are as follows: the method and the device construct a corpus and a training model for protocol analysis, so that various protocols can be analyzed quickly; development is rapid and configuration is flexible, which reduces development cost; when a relatively simple protocol is developed and analyzed, developers need not participate and the code is generated with one click, which reduces the burden on developers.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the embodiments or the description of the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and other drawings can be obtained by those skilled in the art without creative efforts.
Fig. 1 is a flowchart illustrating a method for rapidly developing a national standard terminal protocol according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
As shown in fig. 1, a method for rapidly developing a national standard terminal protocol includes:
constructing a protocol corpus;
preprocessing the protocol corpus, wherein the preprocessing comprises: word segmentation;
representing the segmented words and terms in a representation model usable by a computer, wherein the representation model comprises: a feature vector;
performing feature selection on the feature vector to select a feature subset;
constructing a training model according to the feature subset;
submitting a protocol to be analyzed to the training model;
analyzing the protocol with the training model to generate skeleton code, refining the skeleton code, and testing the skeleton code;
and modifying the training model according to the skeleton code test result.
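The last steps of the method form a feedback loop: generate skeleton code, test it, and modify the model when the tests fail. The sketch below illustrates only that loop shape. Everything in it is an assumption made for illustration: the "model" is a plain dictionary of field sizes, the "skeleton" is a `struct` format string, the sample frame is invented, and `revise` is a naive rule that stands in for retraining.

```python
import struct

SIZE_TO_FMT = {1: "B", 2: "H", 4: "I"}

# Hypothetical sample frame: two big-endian 16-bit fields.
SAMPLE = bytes.fromhex("0200" "0005")
EXPECTED = (0x0200, 5)


def generate(model: dict) -> str:
    """Generate a skeleton (here just a struct format) from the model."""
    return ">" + "".join(SIZE_TO_FMT[size] for size in model["field_sizes"])


def run_tests(skeleton: str) -> list[str]:
    """Decode the sample frame and report mismatches as failure messages."""
    try:
        decoded = struct.unpack(skeleton, SAMPLE)
    except struct.error as exc:
        return [str(exc)]
    if decoded != EXPECTED:
        return [f"decoded {decoded}, expected {EXPECTED}"]
    return []


def revise(model: dict, failures: list[str]) -> dict:
    """Naive stand-in for retraining: widen the first 1-byte field."""
    sizes = list(model["field_sizes"])
    for i, size in enumerate(sizes):
        if size < 2:
            sizes[i] = 2
            break
    return {"field_sizes": sizes}


def refine(model: dict, max_rounds: int = 5):
    """Generate, test, and revise until the skeleton passes its tests."""
    for round_no in range(1, max_rounds + 1):
        skeleton = generate(model)
        failures = run_tests(skeleton)
        if not failures:
            return model, skeleton, round_no
        model = revise(model, failures)
    raise RuntimeError("model did not converge within max_rounds")


final_model, final_skeleton, rounds = refine({"field_sizes": [1, 2]})
print(final_model, final_skeleton, rounds)
```

Starting from a model that wrongly treats the first field as one byte, the first round fails the test, the model is revised, and the second round passes.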
In some embodiments of the present invention, the manner of constructing the protocol corpus includes:
downloading international protocols and/or crawling corpora from the network.
In some embodiments of the present invention, preprocessing the protocol corpus further includes:
data cleaning;
part-of-speech tagging;
stop-word removal.
In some embodiments of the invention, the representation model further comprises: a bag-of-words model.
In some embodiments of the invention, constructing the training model comprises: the training model corresponds to a connection code in the protocol corpus.
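To make the representation and feature-selection steps concrete, here is a small sketch of a bag-of-words feature vector, a document-frequency filter as one possible feature-selection rule, and a nearest-neighbour lookup standing in for the training model. The patent does not specify any of these choices; the toy corpus, the labels, the `max_df_ratio` threshold, and the use of cosine similarity are all assumptions.

```python
import math
from collections import Counter


def bag_of_words(tokens: list[str], vocab: list[str]) -> list[int]:
    """Feature vector: count of each vocabulary term in the token list."""
    counts = Counter(tokens)
    return [counts[term] for term in vocab]


def select_features(docs: list[list[str]], max_df_ratio: float = 0.9) -> list[str]:
    """Feature selection: drop terms that occur in almost every document."""
    doc_freq = Counter(tok for doc in docs for tok in set(doc))
    return sorted(t for t, c in doc_freq.items() if c / len(docs) <= max_df_ratio)


def cosine(u: list[int], v: list[int]) -> float:
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def nearest_label(query: list[str], labelled: list[tuple[list[str], str]],
                  features: list[str]) -> str:
    """Return the label of the labelled document most similar to the query."""
    q = bag_of_words(query, features)
    best = max(labelled, key=lambda item: cosine(q, bag_of_words(item[0], features)))
    return best[1]


# Hypothetical corpus: segmented field descriptions with a field-type label.
CORPUS = [
    (["message", "id", "word"], "uint16"),
    (["message", "serial", "number", "word"], "uint16"),
    (["message", "terminal", "phone", "bcd"], "bcd"),
]
FEATURES = select_features([tokens for tokens, _ in CORPUS])
print(FEATURES)
print(nearest_label(["phone", "number", "bcd"], CORPUS, FEATURES))
```

The term "message" appears in every document, so the filter removes it; it carries no information for telling the field types apart.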
In another aspect, the invention provides a device for rapidly developing the national standard terminal protocol, which comprises:
a first construction unit, used for constructing a protocol corpus;
a preprocessing unit, used for preprocessing the protocol corpus, wherein the preprocessing comprises: word segmentation;
a representation unit, used for representing the segmented words and terms in a representation model usable by a computer, wherein the representation model comprises: a feature vector;
a feature selection unit, used for performing feature selection on the feature vector to select a feature subset;
a second construction unit, used for constructing a training model according to the feature subset;
a submitting unit, used for submitting a protocol to be analyzed to the training model;
a testing unit, used for analyzing the protocol with the training model to generate skeleton code, refining the skeleton code, and testing the skeleton code;
and a modifying unit, used for modifying the training model according to the skeleton code test result.
In some embodiments of the present invention, the manner of constructing the protocol corpus includes:
downloading international protocols and/or crawling corpora from the network.
In some embodiments of the present invention, preprocessing the protocol corpus further includes:
data cleaning;
part-of-speech tagging;
stop-word removal.
In some embodiments of the invention, the representation model further comprises: a bag-of-words model.
In some embodiments of the invention, constructing the training model includes: the training model corresponds to a connection code in the protocol corpus.
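One way to picture the device is as a chain of units, each consuming the previous unit's output. The sketch below models only the first five units, up to the construction of the training model; the class name, the field names, and the toy lambdas are hypothetical and chosen purely to show the composition.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Any, Callable

Unit = Callable[[Any], Any]


@dataclass
class ProtocolDevelopmentDevice:
    """Hypothetical composition of the first five units of the device."""
    construct_corpus: Unit
    preprocess: Unit
    represent: Unit
    select_features: Unit
    build_model: Unit

    def build(self, sources: Any) -> Any:
        """Pass the data through each unit in the order the text lists them."""
        data = sources
        for unit in (self.construct_corpus, self.preprocess, self.represent,
                     self.select_features, self.build_model):
            data = unit(data)
        return data


# Toy units, for illustration only.
device = ProtocolDevelopmentDevice(
    construct_corpus=lambda sources: [s.strip() for s in sources],
    preprocess=lambda corpus: [doc.lower().split() for doc in corpus],
    represent=lambda docs: [Counter(doc) for doc in docs],
    select_features=lambda vecs: [{t: c for t, c in v.items() if len(t) > 2}
                                  for v in vecs],
    build_model=lambda vecs: {"documents": len(vecs),
                              "terms": sorted(set().union(*vecs))},
)
model = device.build([" Message ID is WORD ", "Phone number is BCD"])
print(model)
```

Because each unit is an ordinary callable, any one of them can be replaced (for example, a different feature-selection rule) without changing the others.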
Corpus: a large-scale electronic text library that has been scientifically sampled and processed. With the aid of computer analysis tools, researchers can use it to carry out studies in linguistic theory and its applications.
To address the development cost of protocol analysis, a system is built that automatically generates the corresponding code from configuration. A scheme for standard protocol development is realized by quickly writing a small amount of page rule code and configuring the background functions for protocol rule analysis, which reduces development cost, avoids needlessly reinventing the wheel, and lowers the probability of errors.
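As a rough illustration of generating code from configuration, the sketch below turns a small rule table into the source of a parser function and then runs the generated function on sample bytes. The rule format, the field names, and the three-field header layout are invented for this example and are not taken from any standard or from the patent.

```python
# Hypothetical rule configuration: field names with struct type codes.
RULES = {
    "name": "parse_header",
    "byte_order": ">",
    "fields": [
        {"name": "message_id", "type": "H"},
        {"name": "body_length", "type": "H"},
        {"name": "serial_no", "type": "H"},
    ],
}


def generate_skeleton(rules: dict) -> str:
    """Emit Python source for a parser function described by the rules."""
    fmt = rules["byte_order"] + "".join(f["type"] for f in rules["fields"])
    names = [f["name"] for f in rules["fields"]]
    lines = [
        "import struct",
        "",
        f"def {rules['name']}(data: bytes) -> dict:",
        f"    values = struct.unpack_from({fmt!r}, data)",
        f"    return dict(zip({names!r}, values))",
    ]
    return "\n".join(lines) + "\n"


source = generate_skeleton(RULES)
print(source)

# Load the generated skeleton and try it on a made-up six-byte header.
namespace: dict = {}
exec(source, namespace)
header = namespace["parse_header"](bytes.fromhex("0200001c0001"))
print(header)
```

The generated text is ordinary source code, so a developer can refine it by hand afterwards, which matches the "generate, refine, test" order described in the method.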
The method and the device construct a corpus and a training model for protocol analysis, so that various protocols can be analyzed quickly; development is rapid and configuration is flexible, which reduces development cost; when a relatively simple protocol is developed and analyzed, developers need not participate and the code is generated with one click, which reduces the burden on developers.
Although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (8)

1. A method for rapidly developing a national standard terminal protocol, characterized by comprising the following steps:
constructing a protocol corpus;
preprocessing the protocol corpus, wherein the preprocessing comprises: word segmentation;
representing the segmented words and terms in a representation model usable by a computer, wherein the representation model comprises: a feature vector;
performing feature selection on the feature vector to select a feature subset;
constructing a training model according to the feature subset, wherein constructing the training model comprises: the training model corresponds to a connection code in the protocol corpus;
submitting a protocol to be analyzed to the training model;
analyzing the protocol with the training model to generate skeleton code, refining the skeleton code, and testing the skeleton code;
and modifying the training model according to the skeleton code test result.
2. The method for rapidly developing a national standard terminal protocol according to claim 1, wherein the manner of constructing the protocol corpus comprises:
downloading international protocols and/or crawling corpora from the network.
3. The method for rapidly developing a national standard terminal protocol according to claim 1, wherein preprocessing the protocol corpus further comprises:
data cleaning;
part-of-speech tagging;
stop-word removal.
4. The method for rapidly developing a national standard terminal protocol according to claim 1, wherein the representation model further comprises: a bag-of-words model.
5. A device for rapidly developing a national standard terminal protocol, comprising:
a first construction unit, used for constructing a protocol corpus;
a preprocessing unit, used for preprocessing the protocol corpus, wherein the preprocessing comprises: word segmentation;
a representation unit, used for representing the segmented words and terms in a representation model usable by a computer, wherein the representation model comprises: a feature vector;
a feature selection unit, used for performing feature selection on the feature vector to select a feature subset;
a second construction unit, used for constructing a training model according to the feature subset, wherein constructing the training model comprises: the training model corresponds to a connection code in the protocol corpus;
a submitting unit, used for submitting a protocol to be analyzed to the training model;
a testing unit, used for analyzing the protocol with the training model to generate skeleton code, refining the skeleton code, and testing the skeleton code;
and a modifying unit, used for modifying the training model according to the skeleton code test result.
6. The device for rapidly developing a national standard terminal protocol according to claim 5, wherein the manner of constructing the protocol corpus comprises:
downloading international protocols and/or crawling corpora from the network.
7. The device for rapidly developing a national standard terminal protocol according to claim 5, wherein preprocessing the protocol corpus further comprises:
data cleaning;
part-of-speech tagging;
stop-word removal.
8. The device for rapidly developing a national standard terminal protocol according to claim 5, wherein the representation model further comprises: a bag-of-words model.
CN202110498225.4A 2021-05-08 2021-05-08 Method and device for rapidly developing national standard terminal protocol Active CN113206854B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110498225.4A CN113206854B (en) 2021-05-08 2021-05-08 Method and device for rapidly developing national standard terminal protocol

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110498225.4A CN113206854B (en) 2021-05-08 2021-05-08 Method and device for rapidly developing national standard terminal protocol

Publications (2)

Publication Number Publication Date
CN113206854A CN113206854A (en) 2021-08-03
CN113206854B true CN113206854B (en) 2022-12-13

Family

ID=77030308

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110498225.4A Active CN113206854B (en) 2021-05-08 2021-05-08 Method and device for rapidly developing national standard terminal protocol

Country Status (1)

Country Link
CN (1) CN113206854B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108415838A (en) * 2018-03-01 2018-08-17 吉旗(成都)科技有限公司 A kind of automated testing method based on natural language processing technique
CN109525556A (en) * 2018-10-18 2019-03-26 中国电力科学研究院有限公司 It is a kind of for determining the light weight method and system of protocol bug in embedded system firmware
CN110287482A (en) * 2019-05-29 2019-09-27 西南电子技术研究所(中国电子科技集团公司第十研究所) Semi-automation participle corpus labeling training device
CN110597997A (en) * 2019-07-19 2019-12-20 中国人民解放军国防科技大学 Military scenario text event extraction corpus iterative construction method and device

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018232581A1 (en) * 2017-06-20 2018-12-27 Accenture Global Solutions Limited Automatic extraction of a training corpus for a data classifier based on machine learning algorithms
US11144725B2 (en) * 2019-03-14 2021-10-12 International Business Machines Corporation Predictive natural language rule generation
CN110928989A (en) * 2019-11-01 2020-03-27 暨南大学 Language model-based annual newspaper corpus construction method

Also Published As

Publication number Publication date
CN113206854A (en) 2021-08-03

Similar Documents

Publication Publication Date Title
CN112416806B (en) JS engine fuzzy test method based on standard document analysis
CN111310440B (en) Text error correction method, device and system
CN107437417B (en) Voice data enhancement method and device based on recurrent neural network voice recognition
US20220358292A1 (en) Method and apparatus for recognizing entity, electronic device and storage medium
CN115328756A (en) Test case generation method, device and equipment
CN112016275A (en) Intelligent error correction method and system for voice recognition text and electronic equipment
CN113420822B (en) Model training method and device and text prediction method and device
CN113591093A (en) Industrial software vulnerability detection method based on self-attention mechanism
CN110781673B (en) Document acceptance method and device, computer equipment and storage medium
CN114647590A (en) Test case generation method and related device
CN112989797B (en) Model training and text expansion methods, devices, equipment and storage medium
CN113206854B (en) Method and device for rapidly developing national standard terminal protocol
CN111723583B (en) Statement processing method, device, equipment and storage medium based on intention role
CN112989829A (en) Named entity identification method, device, equipment and storage medium
CN112491649A (en) Interface joint debugging test method and device, electronic equipment and storage medium
US20230267286A1 (en) Translation model training method, translation method, apparatus, device, and storage medium
CN116483314A (en) Automatic intelligent activity diagram generation method
CN115510180A (en) Multi-field-oriented complex event element extraction method
CN115809688A (en) Model debugging method and device, electronic equipment and storage medium
CN114969347A (en) Defect duplication checking implementation method and device, terminal equipment and storage medium
CN111292741B (en) Intelligent voice interaction robot
CN113486647A (en) Semantic parsing method and device, electronic equipment and storage medium
CN114519357B (en) Natural language processing method and system based on machine learning
CN115879446B (en) Text processing method, deep learning model training method, device and equipment
CN118332300A (en) Compliance detection method and system for privacy policy labels of mobile terminal application programs

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant