CN111666575B

CN111666575B - Text carrier-free information hiding method based on word element coding

Info

Publication number: CN111666575B
Application number: CN202010295993.5A
Authority: CN
Inventors: 王晓梅; 张维; 张晨旭; 吴亚男; 安鑫; 陈兴强
Original assignee: Information Engineering University of PLA Strategic Support Force
Current assignee: Information Engineering University of PLA Strategic Support Force
Priority date: 2020-04-15
Filing date: 2020-04-15
Publication date: 2022-11-18
Anticipated expiration: 2040-04-15
Also published as: CN111666575A

Abstract

The invention discloses a text carrier-free information hiding method based on lemma coding, which takes text as the carrier and achieves covert transmission of secret information through lemma coding. The method comprises the following steps: establishing a dynamically updated text library and normalizing the format of each text through preprocessing; obtaining the lemma sequence of each text with a word segmentation module, forming a lemma index file, and constructing a lemma node tree from the lemma index file; arranging the adjacent child nodes of each non-leaf lemma node in descending order of transition probability and coding the adjacent paths of that node; and constructing an isomorphic text set for the source path of each lemma node. The sender retrieves the text corresponding to the secret information and sends it to the receiver, and the receiver extracts the secret information through the corresponding inverse transformation. Compared with existing text carrier-free information hiding techniques, the method resists existing steganography detection techniques while markedly improving the embedding capacity, which greatly broadens the application scenarios of carrier-free information hiding.

Description

Text carrier-free information hiding method based on word element coding

Technical Field

The invention belongs to the technical field of information security, and particularly relates to a text carrier-free information hiding method based on word element coding.

Background

The development of network and communication technology has greatly promoted the productivity revolution and has become an indispensable pillar of social development. Because of the open nature of the Internet, data security risks are increasingly complex, and the concealment and security of communication activities urgently need to be strengthened.

Information hiding technology embeds preprocessed secret information into a selected carrier without affecting the normal function of the digital carrier, and the information is delivered by transmitting the carrier. Compared with encryption, information hiding is better at removing the perceptibility of the secret information. In practice, however, conventional information hiding inevitably modifies the carrier at some granularity, which changes the statistical characteristics of the carrier and makes it hard to withstand targeted steganography detection attacks. Against this background, the concept of carrier-free information hiding has attracted the attention of researchers. A carrier-free information hiding method is driven by the secret information: it directly retrieves and transmits natural text that meets the requirements, and the receiver extracts the secret information according to agreed rules. Because the carrier does not need to be modified, existing steganography detection means can be resisted. Carrier-free information hiding can therefore achieve genuinely covert transmission of key data, has clear advantages in concealment and detection resistance, and has further promoted the rapid development of information hiding technology.

Research on carrier-free information hiding that takes text as its object mainly includes the following. Document 1 (jihong yong, chapter j, sun star. Text carrierless information hiding scheme [C]// national information hiding and multimedia information security academic convention. 2016.), which is based on a single keyword, cuts the secret information into keywords, generates a positioning tag from the user identity information, and retrieves and sends a natural text containing the combination of tag and keyword; the receiver extracts the secret information according to the tag. Document 2 (Zhou Z, Mu Y, Zhao N, et al. Coverless Information Hiding Method Based on Multi-keywords [J]. 2016.) hides the number of keywords in a text through the part of speech of a word and removes tag ambiguity during extraction by screening and reassigning tags, which slightly increases the hiding capacity. Document 3 (Zhang J, Wang L, Lin H. Coverless Text Information Hiding Method Based on the Rank Map [J]. 38555a.s.) converts the secret information into common words of the text set using a word conversion protocol and locates the common words with word-rank tags, thereby embedding and extracting the secret information. Document 4 (Zhang J, Huang H, Wang L, et al. Coverless text information hiding method using the frequent words hash [J]. International Journal of Network Security, 2017, 19(6): 1016-1023.) defines a common-word distance for a text, selects a location tag for the converted secret information through the common-word distance and the word-rank tag location protocol, and directly retrieves a text containing the converted secret information and the corresponding word-rank tag as the steganographic carrier. Document 5 (Xianyi C, shell C. Text coverless information hiding based on compound and selection of words [J]. Soft Computing, 2018.) uses the parity of the Unicode encoding of Chinese characters as the tag and commonly used compound words as keywords, which further improves the hiding success rate and hiding capacity. Document 6 (Xianyi C, name C. Text coverless information hiding based on compound and selection of words [J]. Soft Computing, 2018.) uses word2vec to obtain words similar to the keyword as replacements when retrieval fails to match, which significantly improves the hiding success rate.

The above research results can all be classified as carrier-free information hiding methods based on a tag model. In these methods, the secret information or its converted form exists only in a specific part of the text (such as a specific keyword); that is, only specific positions of the carrier text convey secret information, while the main body serves to keep the semantics normal and the structure complete and does not represent any specific information. Each text can hide only 1 to 2.87 Chinese characters on average, so the hiding capacity is very limited. In addition, combining tags with keywords ties the success of hiding closely to the size of the text library and the range of words it covers; rare keywords often cannot be matched, which lowers the hiding success rate.

On this basis, the invention realizes text carrier-free information hiding based on lemma coding. Compared with the existing research results, the method has a stable hiding success rate and markedly improves the hiding capacity of the carrier text. Because the method does not alter the natural text, it resists existing steganography detection means and offers good concealment and security.

Disclosure of Invention

Aiming at the unstable hiding success rate and low hiding capacity of existing text carrier-free hiding methods, the invention provides a text carrier-free information hiding method based on lemma coding, which markedly improves the hiding success rate and the hiding capacity.

In order to achieve the purpose, the invention adopts the following technical scheme:

A text carrier-free information hiding method based on lemma coding comprises the following steps:

Step 1: establishing a dynamically updated text library C, and preprocessing each text in the text library C;

Step 2: reading the content of each preprocessed text in turn, extracting the lemma information, and constructing a lemma node tree G from the extracted lemma information;

Step 3: traversing the lemma node tree G, arranging the adjacent child nodes of every non-leaf lemma node in descending order of transition probability, and coding the adjacent paths of that lemma node;

Step 4: traversing the lemma node tree G, and constructing an isomorphic text set for the source path of each lemma node;

Step 5: encrypting the secret information, determining the source path of a lemma node according to the lemma node tree G and the encrypted bit stream, and selecting a secret-carrying text from the corresponding isomorphic text set and sending it;

Step 6: receiving the secret-carrying text, extracting its lemma information, recovering the encrypted bit stream from the lemma information according to the lemma node tree G, and extracting the secret information through the corresponding inverse transformation.

Further, the step 1 comprises:

Step 1.1: removing stop words and non-Chinese characters from each text in the text library C;

Step 1.2: screening each text in the text library C by text length, and removing texts whose length deviates from the preset value.

Further, the step 2 comprises:

Step 2.1: reading the content of each preprocessed text in turn, extracting the lemma content, position index and available text link for each text, and storing them as a lemma index file;

Step 2.2: querying the lemma index file obtained in step 2.1, aggregating the lemmas that have position index 1 and the same content into one node, using these nodes as the first-layer lemma nodes of the lemma node tree G, and storing them in the structure: lemma node identifier, parent node identifier, position index, lemma content, available text link set;

Step 2.3: let $V_i$ be the set of layer-$i$ lemma nodes of the lemma node tree G and $v_{i,j}$ the $j$-th lemma node of layer $i$; let $i = 2$; for each $v_{i-1,j} \in V_{i-1}$, read the text content of the available text link set of $v_{i-1,j}$, and aggregate the lemmas in these texts that have position index $i$ and the same content into one node, which becomes a child node of $v_{i-1,j}$; when all lemma nodes in $V_{i-1}$ have been processed, the layer-$i$ lemma nodes of the lemma node tree are obtained;

Step 2.4: let $i = i + 1$ and repeat step 2.3 and step 2.4 until all lemma index files have been processed, which yields the lemma node tree G of the text library C.

Further, the step 3 comprises:

Step 3.1: importing each non-leaf lemma node in turn, and arranging its adjacent child nodes in descending order of transition probability; the transition probability from lemma node $S_i$ to $S_j$ is

$$P(S_j \mid S_i) = \frac{T_j}{\sum T}$$

where $S_j$ is an adjacent child node of $S_i$, $T_j$ is the number of available text links of lemma node $S_j$, and $\sum T$ is the sum of the numbers of available text links of all adjacent child nodes of lemma node $S_i$;

Step 3.2: obtaining the number $n$ of adjacent paths of each non-leaf lemma node; if $n \ge 2$, coding the adjacent paths of the lemma node with $m = \lfloor \log_2 n \rfloor$ coding bits; if $n < 2$, skipping the node.

Further, the step 4 comprises:

Step 4.1: importing each lemma node in turn; if the lemma node is a leaf node, taking the available text link set of the lemma node as the isomorphic text set of its source path; if the lemma node is a non-leaf node, judging whether it has an uncoded adjacent path;

Step 4.2: if an uncoded adjacent path exists, taking the available text link set of the child node on that uncoded adjacent path as the isomorphic text set of the lemma node, and deleting the uncoded adjacent path and its subsequent lemma nodes from the lemma node tree G;

Step 4.3: if no uncoded adjacent path exists, selecting some texts from the available text link set of the lemma node as the isomorphic text set of its source path, and deleting those texts from the available text link sets of its child nodes.

Further, the step 5 comprises:

Step 5.1: encrypting the secret information M and encapsulating it in an information frame structure, the total length of the information frame not exceeding the hiding capacity upper limit N of the lemma node tree G,

$$N = \sum_{i=1}^{n} B_i$$

where $n$ is the number of layers of the lemma node tree G and $B_i$ is the highest number of coding bits in layer $i$;

Step 5.2: querying the lemma node tree G, determining the source path of a lemma node according to the information frame, selecting a secret-carrying text from the isomorphic text set of that path, and sending the secret-carrying text.

Further, the step 6 comprises:

Step 6.1: receiving the secret-carrying text, extracting its lemma information, converting the lemma information into an encrypted bit stream according to the lemma node tree G, extracting the encrypted information segment and the check code from the bit stream according to the information frame structure, and using the check code to judge whether the received secret-carrying text has been tampered with; if so, retransmitting over a different channel; if not, executing the next step;

Step 6.2: combining the encrypted information segments in the order in which the secret-carrying texts were received, and decrypting them to recover the original secret information.

Compared with the prior art, the invention has the following beneficial effects:

1) The invention markedly improves the hiding capacity of the secret-carrying text, which avoids anomalies such as the dense transmission of large numbers of texts caused by an overly small hiding capacity and makes the information transmission more covert and secure;

2) The method does not modify the carrier text; the transmitted secret-carrying text has a complete semantic structure, normal statistical characteristics and good readability, and can resist the existing steganography detection means;

3) The invention integrates encryption and check coding, so it can resist potential tampering attacks during transmission and has a degree of robustness;

4) The choice of text set, word segmentation mode, coding mode and isomorphic-text-set construction rule all directly affect the mapping between texts and lemma node source paths, so it is difficult for an unauthorized party to recover the secret information, and data security is effectively guaranteed.

Drawings

Fig. 1 is a basic flowchart of a text carrier-free information hiding method based on lemma coding according to an embodiment of the present invention;

Fig. 2 is an example diagram of the lemma node tree of a text carrier-free information hiding method based on lemma coding according to an embodiment of the present invention;

Fig. 3 is an example diagram of the coded lemma node tree of a text carrier-free information hiding method based on lemma coding according to an embodiment of the present invention.

Detailed Description

In a natural language processing model based on statistical features, the probability of occurrence of a given sentence $S = (W_1 W_2 W_3 \dots W_n)$ can be expressed as $P(S) = P(W_1)\,P(W_2 \mid W_1)\,P(W_3 \mid W_1 W_2) \cdots P(W_n \mid W_1 W_2 W_3 \dots W_{n-1})$. As the formula shows, the probabilities of the individual lemmas are not fully independent: the probability of generating the $n$-th word is determined by the preceding $n-1$ words $(W_1, W_2, \dots, W_{n-1})$. The model reveals the dependency between lemmas and provides a new mapping space for carrier-free information hiding.

After preprocessing each text of a given text library (for example, stop-word removal and word segmentation) and obtaining the lemma set of each text, the following is observed: (1) under the influence of factors such as subject, sentiment and writing style, the lemma sets of texts differ markedly even when the texts cover similar content; (2) the connections between different lemmas differ considerably, and any lemma $W_n$ is usually closely related only to lemmas within a specific range (for example, within compound words). On this basis, the invention uses the transitions between lemma nodes to represent information. To explain the basic idea and implementation details of the invention, the relevant terms are defined as follows:

Lemma (word element): a basic element of a sentence, in forms such as single characters, words, and compound phrases;

Lemma node tree: a tree G = (V, E) built from the lemma nodes and the connections between them, where the set V stores the lemma node information and the set E stores the connections between lemma nodes;

Leaf node: for a given lemma node tree G, if lemma node $S_i$ has no child nodes, $S_i$ is called a leaf node of G;

Source path: let $S_i$ and $S_j$ be the start node and end node of a path p; if $S_i$ is the root node of the lemma node tree, p is called the source path of lemma node $S_j$;

Adjacent path: let $S_i$ and $S_j$ be the start node and end node of a path p; if $S_j$ is an adjacent child node of $S_i$, p is called an adjacent path of lemma node $S_i$;

Isomorphic text set: if a text unambiguously identifies the source path of lemma node S, it is called an isomorphic text of that source path, and the set of such texts is the isomorphic text set;

the invention is further illustrated by the following examples in conjunction with the accompanying drawings:

example 1

As shown in Fig. 1, a text carrier-free information hiding method based on lemma coding includes:

Step 101: establishing a dynamically updated text library C, and preprocessing each text in the text library C;

Step 102: reading the content of each preprocessed text in turn, extracting the lemma information, and constructing a lemma node tree G from the extracted lemma information;

Step 103: traversing the lemma node tree G, arranging the adjacent child nodes of every non-leaf lemma node in descending order of transition probability, and coding the adjacent paths of that lemma node;

Step 104: traversing the lemma node tree G, and constructing an isomorphic text set for the source path of each lemma node;

Step 105: encrypting the secret information, determining the source path of a lemma node according to the lemma node tree G and the encrypted bit stream, and selecting a secret-carrying text from the corresponding isomorphic text set and sending it;

Step 106: receiving the secret-carrying text, extracting its lemma information, recovering the encrypted bit stream from the lemma information according to the lemma node tree G, and extracting the secret information through the corresponding inverse transformation.

Further, the step 101 includes:

Step 101.1: removing stop words and non-Chinese characters from each text in the text library C;

Step 101.2: screening each text in the text library C by text length and removing texts whose length deviates significantly from the preset value; the preset value can be set by the user according to the communication scenario so that texts of suitable length are selected for transmission (for example, short texts are used to carry secret information in instant messaging).
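The preprocessing of Step 101 can be sketched as follows. This is a minimal illustration, not the patented implementation: the stop-word list `STOP_WORDS`, the sample texts, and the `target`/`tolerance` parameters are hypothetical, and stop words are removed by plain substring replacement, whereas a real system would more likely filter them after word segmentation.

```python
import re

# Hypothetical stop-word list; a real deployment would load a full list.
STOP_WORDS = ("的", "了")
# Anything outside the basic CJK Unified Ideographs block counts as non-Chinese.
NON_CHINESE = re.compile(r"[^\u4e00-\u9fff]")


def preprocess(text):
    """Step 101.1: drop non-Chinese characters, then the listed stop words."""
    text = NON_CHINESE.sub("", text)
    for word in STOP_WORDS:
        text = text.replace(word, "")
    return text


def filter_by_length(texts, target, tolerance):
    """Step 101.2: keep texts whose length is within `tolerance` of `target`."""
    return [t for t in texts if abs(len(t) - target) <= tolerance]


library = ["今天的天气很好!", "Hello 世界", "我们在学习信息隐藏技术的基本原理。"]
cleaned = [preprocess(t) for t in library]
kept = filter_by_length(cleaned, target=6, tolerance=2)
print(cleaned)
print(kept)
```

With a short target length, only the short sentence survives the length filter, which mirrors the instant-messaging scenario mentioned above.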

Further, the step 102 includes:

Step 102.1: reading the content of each preprocessed text in turn, using a word segmentation module to extract the lemma content, position index and available text link for each text, and storing them as the lemma index file shown in Table 1; this step gives every lemma of every text a unique identifier; as one implementable option, this embodiment uses the jieba word segmenter, a Python module that provides several segmentation modes;

Table 1. Example of the lemma index file structure

Position index | Lemma content | Available text link

Step 102.2: querying the lemma index file obtained in step 102.1, aggregating the lemmas that have position index 1 and the same content into one node, using these nodes as the first-layer lemma nodes of the lemma node tree G, and storing them in the structure shown in Table 2;

Table 2. Example of the lemma node storage structure

Lemma node identifier | Parent node identifier | Position index | Lemma content | Available text link set

Step 102.3: let $V_i$ be the set of layer-$i$ lemma nodes of the lemma node tree G and $v_{i,j}$ the $j$-th lemma node of layer $i$; let $i = 2$; for each $v_{i-1,j} \in V_{i-1}$, read the text content of the available text link set of $v_{i-1,j}$, and aggregate the lemmas in these texts that have position index $i$ and the same content into one node, which becomes a child node of $v_{i-1,j}$; when all lemma nodes in $V_{i-1}$ have been processed, the layer-$i$ lemma nodes of the lemma node tree are obtained;

Step 102.4: let $i = i + 1$ and repeat step 102.3 and step 102.4 until all lemma index files have been processed, which yields the lemma node tree G of the text library C.
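The layer-by-layer construction of Steps 102.2 to 102.4 can be sketched as below. The sketch assumes the texts are already segmented (in practice a segmenter such as jieba would supply the lemma lists), and it introduces a virtual root node at position 0 so that the first-layer nodes have a common parent. The `LemmaNode` fields follow Table 2, but the class name, the `corpus` mapping and the sample data are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class LemmaNode:
    node_id: int
    parent_id: Optional[int]
    position: int                     # position index (layer); 0 for the virtual root
    content: str                      # lemma content
    texts: set                        # available text link set
    children: dict = field(default_factory=dict)   # lemma content -> LemmaNode


def build_tree(corpus):
    """corpus maps a text link to the lemma sequence of that text."""
    root = LemmaNode(0, None, 0, "<root>", set(corpus))
    layer, position, next_id = [root], 1, 1
    while layer:
        next_layer = []
        for parent in layer:
            # Group the parent's texts by the lemma found at this position.
            groups = {}
            for link in sorted(parent.texts):
                lemmas = corpus[link]
                if len(lemmas) >= position:
                    groups.setdefault(lemmas[position - 1], set()).add(link)
            for content, links in groups.items():
                child = LemmaNode(next_id, parent.node_id, position, content, links)
                parent.children[content] = child
                next_layer.append(child)
                next_id += 1
        layer, position = next_layer, position + 1
    return root


corpus = {
    "t1": ["今天", "天气", "很好"],
    "t2": ["今天", "天气", "不错"],
    "t3": ["今天", "心情", "很好"],
    "t4": ["明天", "出发"],
}
root = build_tree(corpus)
today = root.children["今天"]
print(sorted(today.texts), sorted(today.children))
```

Texts that share a prefix of lemmas end up sharing a path from the root, so each node's available text link set is exactly the set of texts whose lemma sequence starts with that node's source path.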

Further, the step 103 includes:

Step 103.1: importing each non-leaf lemma node in turn, and arranging its adjacent child nodes in descending order of transition probability; the transition probability from lemma node $S_i$ to $S_j$ is

$$P(S_j \mid S_i) = \frac{T_j}{\sum T}$$

where $S_j$ is an adjacent child node of $S_i$, $T_j$ is the number of available text links of lemma node $S_j$, and $\sum T$ is the sum of the numbers of available text links of all adjacent child nodes of lemma node $S_i$;

Step 103.2: obtaining the number $n$ of adjacent paths of each non-leaf lemma node; if $n \ge 2$, coding the adjacent paths of the lemma node with $m = \lfloor \log_2 n \rfloor$ coding bits; if $n < 2$, skipping the node.
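A minimal sketch of Step 103 for a single non-leaf node follows. Two points are assumptions rather than statements from the text: that the $2^m$ most probable children are the ones that receive codes, and that codes are assigned as consecutive binary numbers in ranked order. The node names and text counts are hypothetical.

```python
def encode_adjacent_paths(child_text_counts):
    """Return (codes, uncoded) for one non-leaf lemma node.

    child_text_counts maps each adjacent child to T_j, its number of
    available text links.
    """
    total = sum(child_text_counts.values())
    # Descending transition probability T_j / sum(T); ties broken by name.
    ranked = sorted(
        child_text_counts,
        key=lambda child: (-child_text_counts[child] / total, child),
    )
    n = len(ranked)
    if n < 2:
        return {}, ranked              # Step 103.2: skip the node
    m = n.bit_length() - 1             # floor(log2(n)) for n >= 1
    coded = ranked[: 2 ** m]           # assumption: top 2^m children are coded
    codes = {child: format(i, f"0{m}b") for i, child in enumerate(coded)}
    return codes, ranked[2 ** m:]


counts = {"S5": 4, "S6": 9, "S7": 7, "S8": 2, "S9": 1}
codes, uncoded = encode_adjacent_paths(counts)
print(codes, uncoded)
```

With five children, two bits are available, so four paths are coded and the least probable one is left uncoded; Step 104 then decides what to do with the uncoded path.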

Further, the step 104 includes:

Step 104.1: importing each lemma node in turn; if the lemma node is a leaf node, taking the available text link set of the lemma node as the isomorphic text set of its source path; if the lemma node is a non-leaf node, judging whether it has an uncoded adjacent path;

Step 104.2: if an uncoded adjacent path exists, taking the available text link set of the child node on that uncoded adjacent path as the isomorphic text set of the lemma node, and deleting the uncoded adjacent path and its subsequent lemma nodes from the lemma node tree G;

Step 104.3: if no uncoded adjacent path exists, selecting some texts from the available text link set of the lemma node as the isomorphic text set of its source path, and deleting those texts from the available text link sets of its child nodes.
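The three cases of Step 104 can be sketched on a small dictionary-based tree. This is an illustration under stated assumptions: the node layout (`texts`, `coded`, `uncoded`, `iso`), the helper names, and the rule "pick the `k` shortest texts" are hypothetical, and the sketch removes picked texts from the whole subtree rather than only from the direct children. It also does not handle the case where a child is left with no texts.

```python
def make_node(texts, coded=None, uncoded=None):
    """A node with its text links, coded/uncoded children and isomorphic set."""
    return {"texts": set(texts), "coded": coded or {},
            "uncoded": uncoded or {}, "iso": set()}


def remove_texts(node, texts):
    """Delete the given texts from a node and all of its descendants."""
    node["texts"] -= texts
    for child in list(node["coded"].values()) + list(node["uncoded"].values()):
        remove_texts(child, texts)


def assign_isomorphic_sets(node, text_length, k=1):
    coded, uncoded = node["coded"], node["uncoded"]
    if not coded and not uncoded:
        # Step 104.1: a leaf keeps its own available text link set.
        node["iso"] = set(node["texts"])
    elif uncoded:
        # Step 104.2: reuse the texts under uncoded paths, then prune them.
        node["iso"] = set().union(*(c["texts"] for c in uncoded.values()))
        uncoded.clear()
    else:
        # Step 104.3: reserve the k shortest texts and hide them from children.
        ranked = sorted(node["texts"], key=lambda t: (text_length[t], t))
        node["iso"] = set(ranked[:k])
        for child in coded.values():
            remove_texts(child, node["iso"])
    for child in coded.values():
        assign_isomorphic_sets(child, text_length, k)


text_length = {"t1": 10, "t2": 8, "t3": 12, "t4": 9, "t5": 11}
c = make_node({"t1", "t2", "t3"})
a = make_node({"t1", "t2", "t3"}, uncoded={"C": c})
b = make_node({"t4", "t5"})
root = make_node(text_length, coded={"A": a, "B": b})
assign_isomorphic_sets(root, text_length)
print(root["iso"], a["iso"], b["iso"])
```

After the pass, every text belongs to the isomorphic set of exactly one source path in this toy tree, which is the property the receiver relies on to extract without ambiguity.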

Further, the step 105 comprises:

Step 105.1: the sender encrypts the secret information M with an encryption algorithm and a key K and encapsulates it in the information frame structure shown in Table 3; the total length of the information frame does not exceed the hiding capacity upper limit N of the lemma node tree G,

$$N = \sum_{i=1}^{n} B_i$$

where $n$ is the number of layers of the lemma node tree G and $B_i$ is the highest number of coding bits in layer $i$;

Table 3. Example of the information frame structure

Length information | Encrypted information segment | Check code

It should be noted that, before the next step is executed, the sender needs to screen the information frames so that identical information frames do not cause the same text to be sent repeatedly, which further improves the concealment of the communication;

Step 105.2: querying the lemma node tree G, determining the source path of a lemma node according to the information frame, selecting a secret-carrying text from the isomorphic text set of that path, and sending the secret-carrying text and the key K.
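The lookup in Step 105.2 and the capacity bound of Step 105.1 can be sketched as follows. The tree below is hypothetical: its node names echo the example path in the text, but the codes and the text link `text_42` are invented for illustration and are not taken from Fig. 3. Only a few children per node are listed, and the sketch assumes all codes under one node share the same length.

```python
def node(name, iso=(), children=None):
    """children maps the code of an adjacent path to the child node."""
    return {"name": name, "iso": list(iso), "children": children or {}}


def find_source_path(root, frame):
    """Consume the frame bits along coded adjacent paths (Step 105.2)."""
    current, path, pos = root, [root["name"]], 0
    while pos < len(frame):
        children = current["children"]
        if not children:
            raise ValueError("frame exceeds the capacity of this branch")
        m = len(next(iter(children)))      # code length at this node
        code = frame[pos:pos + m]
        if code not in children:
            raise ValueError(f"no adjacent path coded {code!r}")
        current = children[code]
        path.append(current["name"])
        pos += m
    return path, current["iso"]


def hiding_capacity(root):
    """N = sum over layers of the largest code length B_i in that layer."""
    total, layer = 0, [root]
    while layer:
        widths = [len(code) for n in layer for code in n["children"]]
        if not widths:
            break
        total += max(widths)
        layer = [c for n in layer for c in n["children"].values()]
    return total


s11 = node("S11", iso=["text_42"])
s7 = node("S7", children={"000": node("S10"), "001": s11})
s3 = node("S3", children={"00": node("S5"), "10": s7})
s0 = node("S0", children={"00": node("S1"), "10": s3})

path, candidates = find_source_path(s0, "1010001")
print(path, candidates, hiding_capacity(s0))
```

Any text in the returned candidate list could be sent as the secret-carrying text; choosing among several candidates is what lets the sender avoid sending the same text twice.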

Further, the step 106 includes:

Step 106.1: the receiver receives the secret-carrying text, extracts its lemma information with the word segmentation module, converts the lemma information into an encrypted bit stream according to the lemma node tree G, extracts the encrypted information segment and the check code according to the length information in the header of the frame structure, and uses the check code to judge whether the received secret-carrying text has been tampered with; if so, the text is retransmitted over a different channel; otherwise the next step is executed;

Step 106.2: after the correct encrypted information is obtained, the encrypted information segments are combined in the order in which the secret-carrying texts were received, and the original secret information is recovered through inverse transformations such as decryption.
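The receiver side of Step 106.1 can be sketched as below. The text does not fix the field widths or the check algorithm, so the 4-bit length field (`LEN_BITS`) and the single even-parity bit are assumptions chosen to keep the example short; a real system would presumably use a stronger check code. The coded tree and lemma sequence are hypothetical, and decryption is omitted.

```python
LEN_BITS = 4  # assumed width of the length field


def parity(bits):
    """Even-parity bit over a bit string (stand-in for the check code)."""
    return str(bits.count("1") % 2)


def pack_frame(segment):
    """Length information + encrypted information segment + check code."""
    if len(segment) >= 2 ** LEN_BITS:
        raise ValueError("segment too long for the length field")
    return format(len(segment), f"0{LEN_BITS}b") + segment + parity(segment)


def parse_frame(bits):
    """Split a frame and verify the check code; raise if it does not match."""
    length = int(bits[:LEN_BITS], 2)
    segment = bits[LEN_BITS:LEN_BITS + length]
    check = bits[LEN_BITS + length:LEN_BITS + length + 1]
    if len(segment) != length or check != parity(segment):
        raise ValueError("check failed: retransmit over another channel")
    return segment


def path_to_bits(tree, lemmas):
    """Concatenate the codes of the adjacent paths the lemma sequence follows.

    tree maps lemma content to (code, subtree); walking stops at the first
    lemma that is not a child of the current node.
    """
    bits, current = "", tree
    for lemma in lemmas:
        entry = current.get(lemma)
        if entry is None:
            break
        code, current = entry
        bits += code
    return bits


tree = {"今天": ("00", {"天气": ("11", {"很好": ("10", {"出门": ("10", {})})})})}
received = path_to_bits(tree, ["今天", "天气", "很好", "出门", "散步"])
segment = parse_frame(received)
print(received, segment)
```

Because the length field bounds the payload, any bits read past the end of the frame would simply be ignored by `parse_frame` in this sketch.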

As an implementable example, for convenience of description and without loss of generality, assume that the information frame to be transmitted is "1010001" and that part of the structure of the lemma node tree G is as shown in Fig. 2; the information can then be hidden and extracted through the following steps:

(1) Encoding

As shown in Fig. 2, the lemma node tree G is a typical multi-way tree structure. If lemma node $S_i$ has $n$ adjacent paths, then $m = \lfloor \log_2 n \rfloor$ bits of binary information can be stably embedded in its adjacent paths; the coded lemma node tree G is shown in Fig. 3.
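As a worked instance of the coding rule, consider a node with $n = 5$ adjacent paths. The assignment of the codes in ranked order is an assumption made for illustration:

$$m = \lfloor \log_2 5 \rfloor = 2, \qquad 2^m = 4 \text{ coded paths } (00, 01, 10, 11), \qquad n - 2^m = 1 \text{ uncoded path}.$$

For $n = 8$ the rule gives $m = 3$ and $2^3 = 8$, so every adjacent path is coded and no uncoded path remains.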

(2) Constructing isomorphic text collections

The isomorphic text can establish the mapping relation between the text and the lemma node source path, and avoid ambiguity during extraction. If the lemma node is a leaf node, the available text link set is used as an isomorphic text set of the lemma node source path; if the lemma node is a non-leaf node, the construction method of the isomorphic text set depends on whether the lemma node has an adjacent path which is not coded.

Taking Fig. 3 as an example, lemma node $S_3$ has 5 child nodes in total, of which the adjacent path $S_3 \to S_9$ is not coded and cannot be used to represent secret information. To make full use of the text library resources, the available text link set of lemma node $S_9$ may be used as the isomorphic text set of lemma node $S_3$. The adjacent paths of lemma node $S_7$ are all coded, so a certain number of texts can be selected from the available text link set of the lemma node according to an agreed rule to serve as its isomorphic text set; for example, the $n$ shortest texts can be selected from the child node whose available text link set contains the most texts, or the shortest text can be taken from the available text link set of each child node. To avoid ambiguity during extraction, the selected texts must be deleted from the available text link sets of the child nodes.

(3) Information hiding

The information frame to be transmitted is "1010001". As can be seen from Fig. 3, the corresponding source path is $S_0 \to S_3 \to S_7 \to S_{11}$, and a text is selected from the corresponding isomorphic text set for transmission.

(4) Information extraction

The receiver obtains the lemma sequence of the secret-carrying text through the word segmentation module as $(S_0, S_3, S_7, S_{11}, \dots)$. By querying the isomorphic text sets, the receiver learns that the secret-carrying text is an isomorphic text of the source path of lemma node $S_{11}$, obtains the information frame "1010001" according to the coding rule, and extracts the secret information through the corresponding inverse transformation.

The above describes only the preferred embodiments of the present invention. Those skilled in the art can make various modifications and improvements without departing from the principle of the present invention, and such modifications and improvements shall also fall within the protection scope of the present invention.

Claims

1. A text carrier-free information hiding method based on lemma coding, characterized by comprising the following steps:

Step 2: reading the content of each preprocessed text in turn, extracting the lemma information, and constructing a lemma node tree G from the extracted lemma information; the lemmas are the basic elements of a sentence and include single characters, words and compound phrases; step 2 comprises:

Step 2.2: querying the lemma index file obtained in step 2.1, aggregating the lemmas that have position index 1 and the same content into one node as a first-layer lemma node of the lemma node tree G, and storing it in the structure: lemma node identifier, parent node identifier, position index, lemma content, available text link set;

Step 2.3: let $V_i$ be the set of layer-$i$ lemma nodes of the lemma node tree G and $v_{i,j}$ the $j$-th lemma node of layer $i$; let $i = 2$; for each $v_{i-1,j} \in V_{i-1}$, read the text content of the available text link set of $v_{i-1,j}$, and aggregate the lemmas in these texts that have position index $i$ and the same content into one node, which becomes a child node of $v_{i-1,j}$; when all lemma nodes in $V_{i-1}$ have been processed, the layer-$i$ lemma nodes of the lemma node tree are obtained;

Step 2.4: let $i = i + 1$ and repeat step 2.3 and step 2.4 until all lemma index files have been processed, which yields the lemma node tree G of the text library C;

Step 3: traversing the lemma node tree G, arranging the adjacent child nodes of every non-leaf lemma node in descending order of transition probability, and coding the adjacent paths of that lemma node; let $S_i$ and $S_j$ be the start node and end node of a path p; if $S_j$ is an adjacent child node of $S_i$, p is called an adjacent path of lemma node $S_i$; step 3 comprises:

Step 3.1: importing each non-leaf lemma node in turn, and arranging its adjacent child nodes in descending order of transition probability; the transition probability from lemma node $S_i$ to $S_j$ is

$$P(S_j \mid S_i) = \frac{T_j}{\sum T}$$

where $S_j$ is an adjacent child node of $S_i$, $T_j$ is the number of available text links of lemma node $S_j$, and $\sum T$ is the sum of the numbers of available text links of all adjacent child nodes of lemma node $S_i$;

Step 3.2: obtaining the number $n$ of adjacent paths of each non-leaf lemma node; if $n \ge 2$, coding the adjacent paths of the lemma node with $m = \lfloor \log_2 n \rfloor$ coding bits; if $n < 2$, skipping the node;

Step 4: traversing the lemma node tree G, and constructing an isomorphic text set for the source path of each lemma node; let $S_i$ and $S_j$ be the start node and end node of a path p; if $S_i$ is the root node of the lemma node tree, p is called the source path of lemma node $S_j$; if a text unambiguously identifies the source path of lemma node S, the text is called an isomorphic text of the source path of lemma node S, and the corresponding set is the isomorphic text set; step 4 comprises:

Step 4.2: if yes, taking the available text link set of the child node on the uncoded adjacent path as the isomorphic text set of the lemma node, and deleting the uncoded adjacent path and its subsequent lemma nodes from the lemma node tree G;

Step 4.3: if not, selecting some texts from the available text link set of the lemma node as the isomorphic text set of its source path, and deleting those texts from the available text link sets of its child nodes;

Step 6: receiving the secret-carrying text, extracting its lemma information, recovering the encrypted bit stream from the lemma information according to the lemma node tree G, and extracting the secret information through the corresponding inverse transformation.

2. The text carrier-free information hiding method based on lemma coding according to claim 1, wherein step 1 comprises:

3. The text carrier-free information hiding method based on lemma coding according to claim 1, wherein step 5 comprises:

Step 5.1: encrypting the secret information M and encapsulating it in an information frame structure, the total length of the information frame not exceeding the hiding capacity upper limit N of the lemma node tree G,

$$N = \sum_{i=1}^{n} B_i$$

where $n$ is the number of layers of the lemma node tree G and $B_i$ is the highest number of coding bits in layer $i$;

Step 5.2: querying the lemma node tree G, determining the source path of a lemma node according to the information frame, selecting a secret-carrying text from the isomorphic text set of that path, and sending the secret-carrying text.

4. The text carrier-free information hiding method based on lemma coding according to claim 3, wherein step 6 comprises: