CN116645971A - Semantic communication text transmission optimization method based on deep learning - Google Patents
- Publication number
- CN116645971A CN116645971A CN202310512333.1A CN202310512333A CN116645971A CN 116645971 A CN116645971 A CN 116645971A CN 202310512333 A CN202310512333 A CN 202310512333A CN 116645971 A CN116645971 A CN 116645971A
- Authority
- CN
- China
- Prior art keywords
- semantic
- channel
- sequence
- optimization
- information
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G06F16/3344—Query execution using natural language analysis
- G06F16/35—Clustering; Classification
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
- G06F40/30—Semantic analysis
- G06N3/0455—Auto-encoder networks; Encoder-decoder networks
- G06N3/0464—Convolutional networks [CNN, ConvNet]
- G06N3/08—Learning methods
- G10L19/18—Vocoders using multiple modes
- H04B17/30—Monitoring; Testing of propagation channels
- Y02D30/70—Reducing energy consumption in communication networks in wireless communication networks
Abstract
The invention provides a semantic communication text transmission optimization method based on deep learning, addressing the limited data compression of traditional communication systems. To further optimize performance when computing resources at the data processing end are limited, channel coding and decoding are treated as a black box and the optimization focuses on maximizing semantic information. This further improves the accuracy of text semantic transmission in the semantic communication system and strikes a good balance between complexity and performance.
Description
Technical Field
The invention relates to semantic communication technology, in particular to a deep-learning-based semantic communication method, and more particularly to a deep-learning-based semantic communication optimization method oriented toward text transmission.
Background
From 1G to 5G, communication has mainly been concerned with accurately and efficiently transmitting bits from a transmitter to a receiver. Under the traditional communication architecture, transmission rates on the order of gigabits per second (Gbps) are achievable and system capacity gradually approaches the Shannon limit. Meanwhile, intelligent application scenarios such as human-computer interaction, autonomous driving, geological monitoring, and remote health today generate incredible volumes of data; for example, current cellular networks must handle exponentially growing traffic, with uplink and downlink data rates of 1 Tb/s, traffic densities of 1-10, latencies of 0.1 ms, and the like, which inevitably pose huge challenges to traditional communication systems. Semantic Communication (SC), an important product of the convergence of Deep Learning (DL) and communication technology, has immeasurable potential in data compression and signaling enhancement.
Compared with mature syntactic (bit-level) communication technology, research on semantic communication is still at a preliminary stage. Scientists have carried out some exploratory work in the field and made progress in the architectural design of semantic communication systems, the training of background knowledge bases at the algorithmic level, receiver-side applications, and so on. However, only a few studies examine semantic communication optimization algorithms in detail; most current optimization algorithms do not fully account for scarce computing resources, offer only theoretical demonstrations, and lack application-level simulation verification. For example, the Universal Transformer is an optimization algorithm based on the Transformer model that dynamically changes the semantic codec network by adding an adaptive network, so as to suit the transmission of texts of differing complexity; however, the reduction in transmission computation does not adequately offset the computation added by the adaptive network.
Therefore, the invention provides a deep-learning-based semantic communication optimization algorithm which, for data- and resource-intensive 6G mobile communication scenarios, takes maximizing semantic information as the optimization target of the semantic communication system and achieves a good balance between complexity and performance.
Disclosure of Invention
Purpose of the invention: to address the problems in the prior art, a deep-learning-based semantic communication optimization algorithm oriented toward text data transmission is provided. The method uses a Transformer to extract and recover semantic information, so that the receiving end can recover semantic information to a greater extent under low signal-to-noise-ratio conditions.
Technical scheme: to address the limited data compression of traditional communication systems, a semantic communication system built on a Transformer extracts and compresses semantic information in a specific scenario; the extracted semantic information then undergoes joint source-channel coding/decoding so that it is retained to a greater extent under limited channel capacity. For further optimization of the text-oriented semantic communication system when computing resources at the data processing end are limited, channel coding and decoding are treated as a black box and optimization focuses on maximizing semantic information, further improving the text semantic transmission accuracy of the semantic communication system. Finally, based on the constructed mathematical model, the scheme is compared and tested against other schemes in different channel environments, and the robustness of the system under limited computing resources is analyzed and verified. The invention is realized by the following technical scheme: a semantic communication text transmission optimization method based on deep learning, comprising the following steps:
(1) At a transmitting end, firstly, preprocessing an input corpus to generate a training set, a testing set and a corresponding dictionary, so that the predicted text can be recovered conveniently;
(2) According to the initial model parameters, perform semantic coding S_α(·) on the input sentence s; the semantic representation sequence generated by the Transformer-based semantic encoder is m = S_α(s);
(3) Apply channel coding C_β(·) to ensure stable transmission of the sequence over the channel; the coded symbol stream is x = C_β[S_α(s)];
(4) Establishing a channel model according to the required signal-to-noise ratio condition and the environment for transmitting information;
(5) At the receiving end, the channel output signal y is first sent to the channel decoding module to recover the semantic representation sequence n;
(6) Taking channel coding and decoding as a black box, extracting a recovered semantic representation sequence n and a semantic representation sequence m generated by a semantic encoder, and sending the semantic representation sequence n and the semantic representation sequence m into a semantic optimization network to obtain a loss value required by optimizing semantic information;
(7) According to the local background knowledge base at the receiving end, perform semantic decoding on the recovered semantic representation sequence to obtain the predicted text sequence s';
(8) Compute the cross-entropy loss between the predicted sequence s' and the target sequence s, and back-propagate the result together with the semantic optimization function to train the system model;
(9) In the performance analysis stage, test the trained system under different channel environments, using BLEU or semantic similarity as the evaluation index, with emphasis on system performance under limited computing resources.
Further, the step (1) includes the following specific steps:
(1a) Data cleaning: removing accent marks in a language, filtering out unnecessary characters such as XML labels, special symbols and the like, and adding a blank in front of punctuation marks at the end of a sentence so as to separate the punctuation marks from text contents;
(1b) Word segmentation: the text is split into corresponding words, phrases or symbols, etc. for easier subsequent processing. The method employed varies for corpora in different languages.
If the input text is English, french, german and the like, the processing mode is simpler, regular expression word segmentation can be used, non-English characters, namely non-a-Z 'and non-A-Z' characters, can be directly deleted, and capital letters are converted into lowercase forms, so that repeated vocabulary is reduced, and a model is simplified.
And for the processing of the Chinese database, the processing is relatively complex. Firstly, a Chinese word segmentation component 'Jieba' library in a Python third party library is required to be called, and a cut function in the library is used for splitting a Chinese text to be processed. In addition, the deletion operation is required for the non-Chinese characters, and the characters to be processed are known to be the characters of non-one- ' according to the initial Chinese character ' one ' and the ending Chinese character ' ' of the ASCII code table. Finally, the operation of removing stop words is also carried out;
(1c) Clause: the long texts are separated according to sentence standards, so that single sentences can be conveniently processed and sentence lengths can be counted. And the sequence start-end marking is carried out, so that the model can be helped to better identify the sentence structure and grammar rule of the processed language, and the performance of the model are improved;
(1d) Vocabulary construction: creating a list containing all words and uniquely encoding the words so as to perform word frequency statistics and vector conversion on the sentences;
(1e) Sequence Padding: so that all sequences have the same length and can be conveniently fed to the training model, filler tokens must be added to sentences of differing lengths; 0 is generally chosen as the filler, and post-padding is applied so that every sentence is padded to the longest sentence length;
(1f) Dataset partitioning: the dataset is split 9:1 into a training set and a test set, so that the experimental model can be evaluated for performance at different stages using the corresponding data.
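The preprocessing steps (1a)-(1f) above can be outlined as a minimal Python sketch for an English corpus (the function names and toy sentences are illustrative, not part of the invention):

```python
import re

PAD, START, END = 0, 1, 2  # reserved token ids for padding and sequence marks

def tokenize(text):
    """(1a)-(1b): keep only letters, lowercase, split on whitespace."""
    return re.sub(r"[^a-zA-Z ]", " ", text).lower().split()

def build_vocab(token_lists):
    """(1d): assign a unique integer id to every word after the reserved ids."""
    vocab = {"<pad>": PAD, "<s>": START, "</s>": END}
    for tokens in token_lists:
        for w in tokens:
            vocab.setdefault(w, len(vocab))
    return vocab

def encode_and_pad(token_lists, vocab):
    """(1c)+(1e): add start/end marks, then post-pad with 0 to the longest length."""
    seqs = [[START] + [vocab[w] for w in t] + [END] for t in token_lists]
    max_len = max(len(s) for s in seqs)
    return [s + [PAD] * (max_len - len(s)) for s in seqs]

def split_dataset(items, ratio=0.9):
    """(1f): 9:1 split into training and test sets."""
    cut = int(len(items) * ratio)
    return items[:cut], items[cut:]
```

With a real corpus the same pipeline runs per sentence after sentence segmentation; Chinese text would first pass through a segmenter such as Jieba.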
Further, the step (2) includes the following specific steps:
(2a) A semantic codec model is built, comprising 4 encoder layers and 4 decoder layers. The input sentences are semantically encoded with the help of a local knowledge base, extracting the semantic information to be sent over the physical channel and achieving effective data compression;
(2b) For an input sentence s, s = [w_1, w_2, ..., w_l], where w_i denotes the i-th word in the sentence and l is the sentence length. First, each input word is expanded to the model dimension 128 by word embedding, and the generated word vectors are given both feature semantics and specific position information through the following positional encodings:
PE(position, 2i) = sin(position / 10000^(2i/d))  expression 1
PE(position, 2i+1) = cos(position / 10000^(2i/d))  expression 2
wherein: d is the dimension of the word vector, i is the index of its dimension, and position is the position index of the word vector.
The sentence fed into the semantic encoder then takes the form of the sum of the word embeddings and the positional encodings, E = Embedding(s) + PE;
(2c) The key to the self-attention calculation over the input word vectors is training three weight matrices in the self-attention network, namely W_Q, W_K, W_V. Multiplying the word embedding vectors by these matrices yields the Query vector (Q), the Key vector (K), and the Value vector (V), and attention is then computed by:
Attention(Q, K, V) = softmax(Q·K^T / sqrt(d_k))·V  expression 3
wherein: d_k is the dimension of the key vectors;
(2d) Since the Transformer uses a Multi-head mechanism, i.e. multiple sets of W_Q, W_K, W_V matrices yielding multiple sets of Q, K, V vectors, the Z matrices computed by each head are integrated through a splicing (Concat) operation so that the model can simultaneously attend to information at different semantic levels:
MultiHead = Concat(Z_1, Z_2, ..., Z_k)  expression 4
(2e) The spliced result is passed through a feedforward network and, after residual connection (Residual Connection) and layer normalization (Layer Normalization, LN), is passed to the channel coding module; the generated semantic representation sequence is S_α(s).
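Expressions 1-3 (the sinusoidal positional encoding and scaled dot-product attention) can be sketched in NumPy as follows; this is an illustrative single-head version, not the invention's trained network:

```python
import numpy as np

def positional_encoding(length, d):
    """Sinusoidal position codes (expressions 1-2): even dims sine, odd dims cosine."""
    pos = np.arange(length)[:, None]               # (length, 1)
    i = np.arange(d // 2)[None, :]                 # (1, d/2)
    angle = pos / np.power(10000.0, 2 * i / d)
    pe = np.zeros((length, d))
    pe[:, 0::2] = np.sin(angle)
    pe[:, 1::2] = np.cos(angle)
    return pe

def attention(Q, K, V):
    """Scaled dot-product attention (expression 3)."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ V
```

In the multi-head case (expression 4), this computation is repeated with a separate W_Q, W_K, W_V per head and the resulting Z matrices are concatenated.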
Further, the step (4) comprises the following specific steps:
(4a) Most physical channels can be modeled by neural networks. Additive White Gaussian Noise (AWGN) channels, multiplicative Gaussian noise channels, and erasure channels can be modeled by a simple neural network; fading channels, such as the Rayleigh fading channel, require more complex neural networks;
(4b) Let the channel coefficient be g and the channel noise be n. After the transmitted information passes through the channel, the signal received at the receiving end can be represented as:
y = g·x + n  expression 5
(4c) If the channel is AWGN, g = 1 and n follows the Gaussian distribution N(0, σ²). If the channel is a Rayleigh fading channel, g follows the Rayleigh distribution f(g) = (g/σ²)·exp(−g²/(2σ²)), g ≥ 0; under high signal-to-noise-ratio conditions, g and n can be simulated with the circularly symmetric complex Gaussian distribution CN(0, σ²).
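The channel of expression 5 can be simulated with NumPy as below (an illustrative sketch; deriving the noise power from a target SNR in dB is an assumption not spelled out in the text):

```python
import numpy as np

rng = np.random.default_rng(0)

def awgn_channel(x, snr_db):
    """y = g*x + n with g = 1 and Gaussian noise scaled to the target SNR."""
    snr = 10 ** (snr_db / 10)
    sigma = np.sqrt(np.mean(np.abs(x) ** 2) / snr)
    return x + sigma * rng.standard_normal(x.shape)

def rayleigh_channel(x, snr_db):
    """y = g*x + n with a Rayleigh-distributed fading coefficient g >= 0,
    generated as the magnitude of a circularly symmetric complex Gaussian."""
    g = np.abs(rng.standard_normal(x.shape) +
               1j * rng.standard_normal(x.shape)) / np.sqrt(2)
    snr = 10 ** (snr_db / 10)
    sigma = np.sqrt(np.mean(np.abs(x) ** 2) / snr)
    return g * x + sigma * rng.standard_normal(x.shape)
```

At 20 dB SNR and unit signal power, the injected noise variance is 0.01, matching σ² in the text's N(0, σ²) notation.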
Further, the step (6) comprises the following specific steps:
(6a) In the optimization process, the channel coding and decoding parts are first treated as a black box, and the semantic representation sequence m generated by the sender's semantic encoder and the semantic representation sequence n recovered after the receiver's channel decoding serve as the inputs of the semantic optimization network. To correlate the semantic information at the transmitting and receiving ends to the greatest extent, maximizing mutual information (Mutual Information) is adopted as the optimization target of the network. Since mutual information determines, to a certain extent, the information content of the coded data, maximizing it can raise the signaling rate of the system and maximize channel tolerance;
(6b) The mutual information is calculated as:
I(m; n) = Σ_{m∈M} Σ_{n∈N} p(m, n)·log[ p(m, n) / (p(m)·p(n)) ]  expression 6
wherein: M and N are the distribution spaces of the semantic characterization sequences m and n, p(m, n) is the joint probability distribution of m and n, and p(m) and p(n) are the marginal probability distributions of m and n respectively;
(6c) From the definition of KL divergence, the mutual information of m and n is the KL divergence between their joint probability distribution and the product of their marginal probability distributions; therefore, to optimize the mutual information, some properties of the KL divergence are needed below;
(6d) Since KL divergence is a special form of f-divergence, we start from the properties of f-divergence. The general expression of an f-divergence is:
D_f(P‖Q) = ∫ q(x)·f( p(x)/q(x) ) dx  expression 7
wherein: f(·) is a convex function satisfying f(1) = 0.
KL divergence is the f-divergence obtained when f(t) = t·log t:
D_KL(P‖Q) = ∫ p(x)·log( p(x)/q(x) ) dx  expression 8
(6e) From the properties of convex functions (via the convex conjugate f* of f, with f*(t) = exp(t − 1) in the KL case), a lower bound satisfied by the mutual information can be obtained:
I(m; n) ≥ E_{p(m,n)}[X(m, n)] − E_{p(m)p(n)}[exp(X(m, n) − 1)]  expression 9
wherein: X is any function over which the bound can be searched;
(6f) In summary, the loss function of the semantic optimization network can be established as the negative of this bound, to be minimized by gradient descent:
Loss_MI(m, n) = −( E_{p(m,n)}[X(m, n)] − E_{p(m)p(n)}[exp(X(m, n) − 1)] )  expression 10
Training the network finds the function X that maximizes the mutual information lower bound.
Further, the step (8) comprises the following specific steps:
(8a) The distribution difference between the prediction result s' and the target result s is calculated from the KL divergence of their distributions:
D_KL(p‖p') = H(p, p') − H(p)  expression 11
wherein: p represents the probability distribution of the target result, p' represents the probability distribution of the predicted result, H(p) represents the information entropy of the target result, and H(p, p') represents the cross entropy of the target and predicted results;
(8b) Since the target distribution p is fixed, H(p) is a constant, so optimizing the KL divergence is equivalent to optimizing the cross-entropy term. Considering the large amount of computation generated by the deep learning process, the cross-entropy term is selected for optimization and a cross-entropy loss function is established. For a classification problem, the corresponding cross-entropy loss function can be expressed as:
Loss = −(1/N) Σ_{i=1}^{N} log p'(w_i)  expression 12
wherein: N is the number of samples, w_i is the label corresponding to the i-th sample, and p'(w_i) is the predicted probability of the correct class for sample i.
The cross-entropy loss function established from H(p, p') is then:
Loss_CE(s, s') = −Σ_i p(w_i)·log p'(w_i)  expression 13
(8c) Combined with the semantic optimization function, the loss function is updated by:
Loss = Loss_CE(s, s') + σ·Loss_MI(m, n)  expression 14
wherein: σ is the update weight, a decimal in the range 0-1, generally kept between 0 and 0.2;
(8d) If Loss is less than the threshold, save the current model parameters and proceed to (8e); otherwise proceed directly to the next step without saving the model parameters;
(8e) The decoding parameters are updated by back-propagation, using the gradient descent method to approach the optimal solution. The gradient is computed as:
Δ = ∂Loss/∂α  expression 15
wherein: α is the parameter to be updated, and the partial derivative is taken by the chain rule through the modules or networks lying between Loss and α. Since not all parameters need to be computed strictly through every module traversed by back-propagation, the gradient calculation can be simplified; for example, if only the end-to-end input/output of the channel codec is considered, it can be treated as a single mapping:
Δ = (∂Loss/∂y)·(∂y/∂α)  expression 16
In particular, when the channel is an AWGN channel, g = 1, so the channel itself contributes only a unit factor to the chain.
after the gradient is obtained, model parameters such as weight parameters of an attention mechanism, hidden layer parameters and the like can be dynamically adjusted by the following steps:
α n =α n-1 - λΔ expression 17
Wherein: λ is a gradient decreasing weight parameter;
(8f) If the current training count is less than the maximum number of iterations, go to step (2) and continue training the network; otherwise, save the model parameters from the last training round and exit training.
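Steps (8c)-(8e), the combined loss of expression 14 and the update rule of expression 17, can be illustrated on a toy scalar parameter (the quadratic loss and finite-difference gradient below are stand-ins for the real back-propagated gradient):

```python
def combined_loss(loss_ce, loss_mi, sigma=0.1):
    """Expression 14: Loss = Loss_CE + sigma * Loss_MI, sigma in [0, 0.2]."""
    return loss_ce + sigma * loss_mi

def sgd_step(alpha, grad, lam=0.01):
    """Expression 17: alpha_n = alpha_{n-1} - lambda * Delta."""
    return alpha - lam * grad

def numerical_grad(loss_fn, alpha, eps=1e-6):
    """Finite-difference stand-in for back-propagating through the black box."""
    return (loss_fn(alpha + eps) - loss_fn(alpha - eps)) / (2 * eps)

# Toy example: drive alpha toward the minimizer of (alpha - 3)^2.
loss_fn = lambda a: (a - 3.0) ** 2
alpha = 0.0
for _ in range(500):
    alpha = sgd_step(alpha, numerical_grad(loss_fn, alpha), lam=0.05)
```

After 500 updates, alpha converges to the minimizer 3.0, showing the iteration of expression 17 in isolation.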
further, the step (9) comprises the following specific steps:
(9a) Loading trained model parameters and testing a data set;
(9b) Initializing a channel environment of an analysis process;
(9c) Selecting an evaluation index;
(9d) If the evaluation index is BLEU, the step (9 e) is entered; if the evaluation index is semantic similarity, entering a step (9 f); if the evaluation index is other, returning an error;
(9e) BLEU is based on the N-gram model; from the precision value p_n of each n-gram and its weight w_n, the BLEU score of the output result against the target sentence can be calculated:
BLEU = BP·exp( Σ_{n=1}^{N} w_n·log p_n )  expression 18
wherein: BP is the length penalty factor (Brevity Penalty), whose value is a conditional function:
BP = 1 if c > r;  BP = exp(1 − r/c) if c ≤ r  expression 19
where c is the length of the predicted sentence s' and r is the length of the original sentence s.
The BLEU value ranges from 0 to 1; the larger the value, the higher the fidelity of the prediction, and the smaller the value, the more inaccurate the prediction. When the predicted sentence s' is shorter than the original sentence s, the BP factor reduces the BLEU value; if the predicted sentence is not shorter than the original, the penalty mechanism is not triggered;
(9f) The semantic similarity calculation is based on the model parameters BERT-Large Uncased (Whole Word Masking) published by Google. Using the BERT model mapping B(·), the semantic distance between the two sentences is computed by analogy with the cosine of the angle between their embeddings:
sim(s, s') = ( B(s)·B(s')^T ) / ( ‖B(s)‖·‖B(s')‖ )  expression 20
(9g) Calculate the average value of the evaluation index over the sample capacity;
(9h) Return the evaluation result.
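The similarity of expression 20 is the cosine of the angle between two sentence embeddings; a sketch with placeholder vectors standing in for the BERT outputs B(·):

```python
import numpy as np

def semantic_similarity(emb_a, emb_b):
    """Expression 20: cosine similarity between two sentence embeddings."""
    num = float(np.dot(emb_a, emb_b))
    den = float(np.linalg.norm(emb_a) * np.linalg.norm(emb_b))
    return num / den
```

In the invention the embeddings come from BERT-Large Uncased (Whole Word Masking); identical embeddings score 1.0 and orthogonal embeddings score 0.0.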
Beneficial effects: the invention provides a semantic communication text transmission optimization method based on deep learning. For text transmission scenarios with limited computing resources in 6G communication, a Transformer-built semantic communication system extracts and compresses semantic information in a specific scenario; channel coding and decoding are treated as a black box and the optimization focuses on maximizing semantic information, further improving the text transmission semantic accuracy of the semantic communication system and achieving a good balance between complexity and performance.
In summary, when computing resources at the mobile communication data processing end are limited, the deep-learning-based semantic communication optimization algorithm provided by the invention is superior in maximizing semantic information for text data transmission.
Drawings
Fig. 1 is a schematic diagram of a semantic communication text transmission optimization method based on deep learning according to an embodiment of the present invention;
FIG. 2 is a diagram of simulation results of the effect of convergence of a loss value and training times under a semantic optimization algorithm provided by an embodiment of the present invention;
Fig. 3 is a diagram of simulation results of the variation of the BLEU value with the signal-to-noise ratio after 10 training rounds under the optimization algorithm provided by the embodiment of the present invention.
Detailed Description
The core idea of the invention is as follows: to address the limited data compression of traditional communication systems, a deep-learning-based end-to-end semantic communication system model extracts and compresses semantic information in a specific scenario, and joint source-channel coding retains the semantic information to a greater extent. For further optimization when computing resources at the data processing end are limited, channel coding and decoding are treated as a black box and optimization focuses on maximizing semantic information; by updating the decoding parameters through gradient descent, the text semantic accuracy of the semantic communication system is further improved.
The present invention is described in further detail below.
First, at the transmitting end, the input corpus is preprocessed to generate a training set, a test set, and a corresponding dictionary that makes it convenient to recover the predicted text. The specific steps are as follows:
(1a) Data cleaning: removing accent marks in a language, filtering out unnecessary characters such as XML labels, special symbols and the like, and adding a blank in front of punctuation marks at the end of a sentence so as to separate the punctuation marks from text contents;
(1b) Word segmentation: the text is split into corresponding words, phrases or symbols, etc. for easier subsequent processing. The method employed varies for corpora in different languages.
If the input text is English, french, german and the like, the processing mode is simpler, regular expression word segmentation can be used, non-English characters, namely non-a-Z 'and non-A-Z' characters, can be directly deleted, and capital letters are converted into lowercase forms, so that repeated vocabulary is reduced, and a model is simplified.
And for the processing of the Chinese database, the processing is relatively complex. Firstly, a Chinese word segmentation component 'Jieba' library in a Python third party library needs to be called, a cut function in the Chinese word segmentation component 'Jieba' is used for splitting a Chinese text to be processed, and sentences are split into independent words and stored in a list. Words are then combined together in spaces by a join function carried by Python to form a sentence. In addition, the deletion operation is required for the non-Chinese characters, and the characters to be processed are known to be the characters of non-one- ' according to the initial Chinese character ' one ' and the ending Chinese character ' ' of the ASCII code table. Finally, the stop word removing operation is performed, namely, nonsensical words in the text are deleted, because the words do not provide valuable information when semantic analysis is performed, including some connective words 'yes', 'in', and the like, and some Chinese assistances such as 'ya', 'in', and the like;
(1c) Sentence splitting: long texts are divided at sentence boundaries, so that single sentences can be processed and sentence lengths counted. In addition, since this design uses a Transformer for semantic processing, start/end markers must be added to help the machine-learning model better understand and process text sequences: a special START token (usually written "<s>" or "<START>") is added at the beginning of each text sequence, i.e., each single sentence, and a special END token (usually written "</s>" or "<END>") is added at the end. Dividing sentences with these markers helps the model better recognize the sentence structure and grammar rules of the processed language, improving its performance;
(1d) Vocabulary construction: create a list containing all words and encode each word uniquely, so that word-frequency statistics and vector conversion can be performed on sentences;
(1e) Sequence padding: so that all sequences have the same length and can be fed into the training model, padding tokens must be added to sentences of different lengths. Generally 0 is chosen as the padding token, post-padding is applied, and every sentence is padded to the length of the longest sentence;
(1f) Dataset partitioning: the dataset is split 9:1 into a training set and a test set, so that the model's performance can be evaluated with the appropriate data at each stage.
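As a rough illustration, steps (1a)–(1f) can be sketched in plain Python (the function names and toy sentences are ours, not from the patent; a real pipeline would also handle accents and Chinese segmentation):

```python
import re
import random

def clean(text):
    # (1a) strip XML-like tags, space off sentence-final punctuation, lowercase
    text = re.sub(r"<[^>]+>", " ", text)
    text = re.sub(r"([.!?])", r" \1", text)
    return text.lower()

def tokenize(text):
    # (1b) regex word segmentation for English-like corpora:
    # keep alphabetic runs and end punctuation, drop everything else
    return re.findall(r"[a-z]+|[.!?]", clean(text))

def build_dataset(sentences, max_len=None):
    # (1c) add <START>/<END> markers to each single sentence
    seqs = [["<START>"] + tokenize(s) + ["<END>"] for s in sentences]
    # (1d) vocabulary: unique integer id per word; 0 reserved for padding
    vocab = {"<PAD>": 0}
    for seq in seqs:
        for w in seq:
            vocab.setdefault(w, len(vocab))
    # (1e) post-pad every sequence with 0 up to the longest sentence
    max_len = max_len or max(len(s) for s in seqs)
    encoded = [[vocab[w] for w in s] + [0] * (max_len - len(s)) for s in seqs]
    # (1f) 9:1 train/test split
    random.shuffle(encoded)
    cut = int(0.9 * len(encoded))
    return encoded[:cut], encoded[cut:], vocab

train, test, vocab = build_dataset(["Hello, world!", "Semantic communication works."])
```

With two toy sentences the 9:1 split leaves one sentence per set, and both are padded to the longer length.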
Step (2): semantically encode the input sentence s according to the initial model parameters using the semantic encoder S_α(·), a Transformer, producing the semantic representation sequence m = S_α(s). Specifically:
(2a) A semantic codec model is built, comprising a 4-layer encoder and a 4-layer decoder. The input sentence is semantically encoded against a local knowledge base, so that semantic information is extracted for transmission over the physical channel and effective data compression is achieved;
(2b) For an input sentence s = [w_1, w_2, ..., w_l], where w_i is the i-th word in the sentence and l is the sentence length, each word is first expanded to the model dimension 128 through a word embedding, and the following position codes give the generated word vectors both feature semantics and specific position information:

PE(position, 2i) = sin(position / 10000^(2i/d)), PE(position, 2i+1) = cos(position / 10000^(2i/d))    Expression 1

wherein: d is the dimension of the word vector, i is the index of its dimension, and position is the position index of the word vector.
Then the sentence fed into the semantic encoder takes the form:

x = [x_1, x_2, ..., x_l], with x_i = Embed(w_i) + PE(i)    Expression 2
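The position codes can be sketched as follows (a minimal pure-Python rendering of the standard sinusoidal formula; the function name and sizes are illustrative):

```python
import math

def positional_encoding(seq_len, d):
    """Sinusoidal position codes: even dimensions use sin, odd dimensions cos."""
    pe = [[0.0] * d for _ in range(seq_len)]
    for position in range(seq_len):
        for i in range(0, d, 2):
            angle = position / (10000 ** (i / d))
            pe[position][i] = math.sin(angle)       # PE(position, 2i)
            if i + 1 < d:
                pe[position][i + 1] = math.cos(angle)  # PE(position, 2i+1)
    return pe

# model dimension 128, as in the described encoder
pe = positional_encoding(seq_len=10, d=128)
```

Each word vector is then summed with its row of `pe` before entering the encoder.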
(2c) The key to the self-attention calculation over the input word vectors is training the three weight matrices of the self-attention network, W_Q, W_K, and W_V. Multiplying the word-embedding vectors by these matrices yields the Query vector (Q), Key vector (K), and Value vector (V), and attention is then computed by:

Attention(Q, K, V) = softmax(QK^T / √d_k) · V    Expression 3
(2d) Since the Transformer uses a multi-head mechanism, i.e., multiple sets of W_Q, W_K, W_V matrices yield multiple sets of Q, K, V vectors, the Z matrices computed by each head are merged by a splicing operation, so that the model can attend to information on different semantic levels simultaneously:

MultiHead = Concat(Z_1, Z_2, ..., Z_k)    Expression 4
(2e) The spliced result is passed into a feed-forward network and, after residual connection (Residual Connection) and layer normalization (Layer Normalization, LN), is passed into the channel coding module; the generated semantic representation sequence is S_α(s);
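Steps (2c)–(2d) can be sketched with NumPy (toy dimensions and random matrices stand in for the trained W_Q, W_K, W_V; this illustrates scaled dot-product attention with splicing, not the patent's exact network):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    # Expression 3: Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V
    d_k = Q.shape[-1]
    scores = softmax(Q @ K.T / np.sqrt(d_k))
    return scores @ V

def multi_head(x, heads):
    # each head h carries its own (W_Q, W_K, W_V); per-head results Z_h
    Z = [attention(x @ W_Q, x @ W_K, x @ W_V) for W_Q, W_K, W_V in heads]
    # Expression 4: MultiHead = Concat(Z_1, ..., Z_k)
    return np.concatenate(Z, axis=-1)

rng = np.random.default_rng(0)
d, l, k = 8, 5, 4  # toy model width, sentence length, head count
heads = [tuple(rng.normal(size=(d, d // k)) for _ in range(3)) for _ in range(k)]
out = multi_head(rng.normal(size=(l, d)), heads)  # shape (l, d)
```

Splicing the k heads of width d/k restores the model width d, so the result can enter the feed-forward network directly.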
Step (3): channel coding C_β(·) is applied to ensure stable transmission of the sequence over the channel; the coded symbol stream is x = C_β[S_α(s)];
Step (4): establish a channel model according to the required signal-to-noise-ratio condition and the information-transmission environment. The specific steps are as follows:
(4a) Most physical channels can be modeled by neural networks. Additive white Gaussian noise (Additive White Gaussian Noise, AWGN) channels, multiplicative Gaussian noise channels, and erasure channels can be modeled by a simple neural network; fading channels, such as the Rayleigh fading channel, require a more complex neural network;
(4b) Let the channel coefficient be g and the channel noise be n. After the transmitting end's information passes through the channel, the signal received at the receiving end can be represented by:

y = g·x + n    Expression 5
(4c) If the channel is AWGN, then g = 1 and n follows the Gaussian distribution N(0, σ²). If the channel is a Rayleigh fading channel, g follows the Rayleigh distribution f(g) = (g/σ²)·exp(−g²/(2σ²)), g ≥ 0; under a high signal-to-noise ratio, g and n can be simulated with the circularly symmetric complex Gaussian distribution CN(0, σ²);
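The channel models of step (4) can be sketched as a small NumPy simulation (the SNR-based noise scaling and real-valued signals are our assumptions for illustration):

```python
import numpy as np

def channel(x, snr_db, kind="awgn", rng=np.random.default_rng(0)):
    # Expression 5: y = g*x + n, with noise power set by the target SNR
    p_signal = np.mean(np.abs(x) ** 2)
    sigma2 = p_signal / (10 ** (snr_db / 10))
    n = rng.normal(scale=np.sqrt(sigma2), size=x.shape)
    if kind == "awgn":
        g = 1.0  # AWGN: g = 1, n ~ N(0, sigma^2)
    elif kind == "rayleigh":
        # |g| Rayleigh-distributed: magnitude of a complex Gaussian draw
        g = np.hypot(rng.normal(size=x.shape), rng.normal(size=x.shape)) / np.sqrt(2)
    else:
        raise ValueError(kind)
    return g * x + n

y = channel(np.ones(1000), snr_db=10, kind="awgn")
```

At 10 dB SNR the received all-ones signal stays close to 1 on average, with Gaussian scatter of variance 0.1 around it.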
Step (5): at the receiving end, the channel output signal y is first fed into the channel decoding module to recover the semantic representation sequence n;
Step (6): treating channel coding and decoding as a black box, the recovered semantic representation sequence n and the semantic representation sequence m generated by the semantic encoder are extracted and fed into the semantic optimization network, obtaining the loss value required to optimize the semantic information. The specific steps are as follows:
(6a) During optimization, the channel coding/decoding part is first treated as a black box, and the semantic representation sequence m generated by the sender's semantic encoder and the sequence n recovered after the receiver's channel decoding serve as inputs to the semantic optimization network. To correlate the semantic information at the two ends as strongly as possible, maximizing mutual information is adopted as the network's optimization objective; since the mutual information determines, to some extent, how much information the encoded data carries, this mechanism has the further advantages of improving the system's signaling rate and maximizing channel tolerance;
(6b) The mutual information is calculated as:

I(m; n) = Σ_{m∈M} Σ_{n∈N} p(m, n) · log( p(m, n) / (p(m)·p(n)) )    Expression 6

wherein: M and N are the distribution spaces of the semantic characterization sequences m and n, p(m, n) is the joint probability distribution of m and n, and p(m) and p(n) are the marginal probability distributions of m and n, respectively;
(6c) From the definition of KL divergence, the mutual information of m and n is the KL divergence between their joint probability distribution and the product of their marginal probability distributions; therefore, to optimize the mutual information, several properties of the KL divergence are needed below;
(6d) Considering that the KL divergence is a special form of f-divergence, we start from the properties of the f-divergence, whose general expression is:

D_f(P‖Q) = ∫ q(x) · f( p(x)/q(x) ) dx    Expression 7

wherein: f(·) is a convex function satisfying f(1) = 0.
The KL divergence is the f-divergence obtained when f(t) = t·log t:

D_KL(P‖Q) = ∫ p(x) · log( p(x)/q(x) ) dx    Expression 8
(6e) From the properties of convex functions (via the convex conjugate f*(t) = e^{t−1} of f(t) = t·log t), a lower bound satisfied by the mutual information can be obtained:

I(m; n) ≥ E_{p(m,n)}[X(m, n)] − E_{p(m)p(n)}[e^{X(m,n)−1}]    Expression 9

wherein: X is any function over which the bound can be optimized;
(6f) In summary, the loss function of the semantic optimization network can be established to enable gradient descent:

Loss(m, n) = −( E_{p(m,n)}[X(m, n)] − E_{p(m)p(n)}[e^{X(m,n)−1}] )    Expression 10

The function X that maximizes the mutual information can then be found by training the network;
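The lower-bound loss of step (6) can be sketched as follows (a toy rendering in which a fixed linear scorer stands in for the trained statistics network X; in practice X would be a small neural network updated by gradient descent, and marginal samples are obtained by shuffling the pairing):

```python
import numpy as np

def mi_lower_bound(X_joint, X_marginal):
    # Expression 9: I(m;n) >= E_p(m,n)[X] - E_p(m)p(n)[exp(X - 1)]
    return X_joint.mean() - np.exp(X_marginal - 1).mean()

def semantic_opt_loss(m, n, stat_net):
    # joint samples: aligned pairs (m_i, n_i); marginal: shuffling n breaks the pairing
    joint = stat_net(np.concatenate([m, n], axis=-1))
    n_shuf = n[np.random.default_rng(0).permutation(len(n))]
    marginal = stat_net(np.concatenate([m, n_shuf], axis=-1))
    # Expression 10: loss = negative bound, so gradient descent maximizes I(m;n)
    return -mi_lower_bound(joint, marginal)

# toy statistics network X(.): a fixed linear scorer standing in for a trained MLP
w = np.ones(8) / 8
loss = semantic_opt_loss(np.ones((16, 4)), np.ones((16, 4)), lambda z: z @ w)
```

With identical constant inputs the scorer gives the same value on joint and shuffled pairs, so the bound, and hence the loss, is exactly zero.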
Step (7): according to the local background knowledge base of the receiving end, semantically decode the semantic representation sequence to obtain the predicted text sequence s';
Step (8): compute the cross-entropy loss between the prediction s' and the target sequence s, back-propagate the result together with the semantic optimization function, and train the system model. The specific steps are as follows:
(8a) Calculate the distribution difference between the prediction result s' and the target result s from their two distributions:

D_KL(p‖p') = H(p, p') − H(p)    Expression 11

wherein: p represents the probability distribution of the target result, p' the probability distribution of the prediction result, H(p) the information entropy of the target result, and H(p, p') the cross entropy of the target result and the prediction result;
(8b) Since H (p') is a certain value, optimizing KL divergence is as effective as optimizing cross entropy terms. In order to reduce the calculation amount of the deep learning, the cross entropy item is selected and optimized, and a cross entropy loss function is established in consideration of the fact that the calculation amount generated by the deep learning process is large. For a classification problem, its corresponding cross entropy loss function can be expressed as:
wherein: n is a sampleNumber, w i For the label corresponding to the ith sample, p (w i ) To predict the probability of correct for sample i.
The cross-entropy loss function established from H(p, p') is then:

Loss_CE(s, s') = −Σ_i p(w_i) · log p'(w_i)    Expression 13
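The cross-entropy term can be illustrated with a one-hot target (a toy example; real training averages this over all words and samples):

```python
import math

def cross_entropy(p_true, p_pred):
    # Expression 13: H(p, p') = -sum_i p(w_i) * log p'(w_i)
    return -sum(p * math.log(q) for p, q in zip(p_true, p_pred) if p > 0)

# one-hot target: reduces to -log of the probability given to the correct word
ce = cross_entropy([0, 1, 0], [0.2, 0.7, 0.1])
```

Here the correct word received probability 0.7, so the loss is −log(0.7) ≈ 0.357; a confident correct prediction drives the loss toward zero.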
(8c) In combination with the semantic optimization function, the loss function is updated by the following formula:
Loss = Loss_CE(s, s') + σ·Loss(m, n)    Expression 14
wherein: σ is the update weight, a decimal in the range [0, 1], generally kept between 0 and 0.2;
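The combined update of Expression 14 is then a simple weighted sum (the numeric values here are illustrative):

```python
def combined_loss(loss_ce, loss_mi, sigma=0.1):
    # Expression 14: Loss = Loss_CE(s, s') + sigma * Loss(m, n)
    assert 0.0 <= sigma <= 0.2, "update weight typically kept in [0, 0.2]"
    return loss_ce + sigma * loss_mi

# a cross-entropy of 2.3 and a (negative) mutual-information loss of -0.5
total = combined_loss(loss_ce=2.3, loss_mi=-0.5, sigma=0.1)
```

Keeping σ small means the cross-entropy term dominates and the semantic-optimization term acts as a regularizer.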
(8d) If the Loss is less than the threshold, save the current model parameters and then proceed to (8e); otherwise proceed directly to the next step without saving the model parameters;
(8e) The decoding parameters are updated by back propagation, during which a gradient descent method is used to approach the optimal solution. The gradient is calculated as:

Δ = ∂Loss/∂α    Expression 15

wherein: α is the parameter to be updated, and ∂Loss/∂α is computed by the chain rule through the modules or networks between Loss and α. Not every parameter needs to be differentiated strictly through each module traversed by back propagation: for example, when training the semantic communication system, the semantic codec and the channel codec are often combined, only the end-to-end input and output of the semantic codec are considered, and the rest is treated as a black box. The gradient calculation can then be simplified to:

Δ = (∂Loss/∂u) · (∂u/∂α)    Expression 16

where u denotes the output of the black-boxed part feeding the module that contains α. In particular, when the channel is an AWGN channel, y = x + n, the channel's Jacobian is the identity and the gradient passes through the channel unchanged.
After the gradient is obtained, model parameters such as the attention-mechanism weight parameters and hidden-layer parameters can be dynamically adjusted by:

α_n = α_{n−1} − λΔ    Expression 17

wherein: λ is the gradient-descent weight parameter (learning rate);
(8f) If the current number of training iterations is smaller than the allowed number of iterations, return to step (2) and continue training the network; otherwise, save the model parameters from the last training round and exit training;
Step (9): in the system-performance analysis stage, test the trained system in different channel environments, taking BLEU or semantic similarity as the evaluation index and focusing on performance under limited computing resources. The specific steps are as follows:
(9a) Load the trained model parameters and the test data set;
(9b) Initialize the channel environment for the analysis process;
(9c) Select an evaluation index;
(9d) If the evaluation index is BLEU, go to step (9e); if the evaluation index is semantic similarity, go to step (9f); if the evaluation index is anything else, return an error;
(9e) BLEU is based on an N-gram model. From the precision value p_n of each n-gram and its weight w_n, the BLEU score of the output result against the target sentence can be calculated:

BLEU = BP · exp( Σ_{n=1}^{N} w_n · log p_n )    Expression 18

wherein: BP is the length penalty factor (Brevity Penalty), whose value is a conditional function:

BP = 1 if c > r; BP = e^{1−r/c} if c ≤ r    Expression 19

where c is the length of the predicted sentence s' and r is the length of the original sentence s.
The larger the BLEU value, the higher the fidelity of the prediction; the smaller the value, the less accurate the prediction. When the predicted sentence s' is shorter than the original sentence s, BP reduces the BLEU value; if the predicted sentence is not shorter than the original, the penalty mechanism does not need to be triggered;
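The BLEU computation of step (9e) can be sketched as follows (a single-reference implementation with uniform weights w_n = 1/N; the smoothing that production BLEU tools apply for zero n-gram matches is omitted):

```python
import math
from collections import Counter

def bleu(candidate, reference, max_n=4):
    # modified n-gram precisions p_n, combined with uniform weights w_n = 1/max_n
    log_p = 0.0
    for n in range(1, max_n + 1):
        cand = Counter(tuple(candidate[i:i + n]) for i in range(len(candidate) - n + 1))
        ref = Counter(tuple(reference[i:i + n]) for i in range(len(reference) - n + 1))
        match = sum(min(c, ref[g]) for g, c in cand.items())
        total = max(sum(cand.values()), 1)
        log_p += math.log(match / total) / max_n if match else float("-inf")
    # Expression 19: BP = 1 if c > r, else exp(1 - r/c)
    c, r = len(candidate), len(reference)
    bp = 1.0 if c > r else math.exp(1 - r / max(c, 1))
    return bp * math.exp(log_p)  # Expression 18

score = bleu("the cat sat on the mat".split(), "the cat sat on the mat".split())
```

An exact match gives every precision p_n = 1 and BP = 1, so the score is 1.0; shorter or divergent predictions fall below that.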
(9f) The semantic-similarity calculation is based on the model parameters of Google's published BERT-Large Uncased (Whole Word Masking). Using the BERT model, denoted B(·), the semantic distance between the two sentences is calculated by analogy with the angle between vectors:

similarity(s, s') = B(s)·B(s')ᵀ / (‖B(s)‖·‖B(s')‖)    Expression 20
(9g) Calculate the average value of the evaluation index over the sample set;
(9h) Return the evaluation result.
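The similarity of step (9f) reduces to a cosine between sentence embeddings (shown here with stand-in vectors; in the described scheme they would come from BERT-Large):

```python
import math

def cosine_similarity(u, v):
    # semantic distance via the angle between embeddings B(s) and B(s')
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# parallel vectors score 1.0, orthogonal vectors 0.0
sim = cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
```

A score near 1 indicates the recovered sentence preserves the meaning of the original even when the surface wording differs.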
FIG. 1 illustrates the deep-learning-based semantic communication text transmission optimization method: after the channel is treated as a black box, the semantic optimization network further reduces the variation of the semantic representation sequence by calculating the difference between the semantic sequences at the transmitting and receiving ends.
FIG. 2 shows simulation results for the convergence of the loss value against the number of training iterations under the semantic optimization algorithm. The optimization method converges quickly in the initial stage and does not exhibit the wasted resources and time delay caused by being trapped in a local minimum.
FIG. 3 shows simulation results for the variation of the BLEU value with signal-to-noise ratio after 10 training rounds under the optimization algorithm. With limited computing resources and a low signal-to-noise ratio, the semantic optimization algorithm improves accuracy by about 20% over the DeepSC algorithm.
From the above description, it should be apparent to those skilled in the art that the invention provides a deep-learning-based semantic communication text transmission optimization method that effectively avoids wasting computing resources and achieves a good balance between complexity and performance.
What is not described in detail in the present application belongs to the prior art known to those skilled in the art.
Claims (1)
1. A semantic communication text transmission optimization method based on deep learning, characterized by comprising the following steps:
(1) Semantically encode the input sentence s according to the initial model parameters using S_α(·); the generated semantic representation sequence is m = S_α(s);
(2) Apply channel coding C_β(·) to ensure stable transmission of the sequence over the channel; the coded symbol stream is x = C_β[S_α(s)];
(3) Establishing a channel model according to the required signal-to-noise ratio condition and the environment for transmitting information;
(4) At the receiving end, the channel output signal y is first fed into the channel decoding module to recover the semantic representation sequence n;
(5) Taking channel coding and decoding as a black box, extracting a recovered semantic representation sequence n and a semantic representation sequence m generated by a semantic encoder, and sending the semantic representation sequence n and the semantic representation sequence m into a semantic optimization network to obtain a loss value required by optimizing semantic information;
(6) According to a local background knowledge base of a receiving end, carrying out semantic decoding on the semantic representation sequence to obtain a predicted sequence s';
(7) Performing cross entropy loss function calculation on the predicted s' and the target sequence s, and performing back propagation on the obtained result and a semantic optimization function together to train a system model;
Further, step (5) comprises the following specific steps:
(5a) During optimization, the channel coding/decoding part is first treated as a black box, and the semantic characterization sequence m generated by the sender's semantic encoder and the sequence n recovered after the receiver's channel decoding serve as inputs to the semantic optimization network; to correlate the semantic information at the two ends as strongly as possible, maximizing mutual information is adopted as the network's optimization objective, and since the mutual information determines to some extent the amount of information contained in the coded data, a further advantage of this mechanism is that the system's signaling rate can be improved and the channel tolerance maximized;
(5b) The mutual information is calculated as:

I(m; n) = Σ_{m∈M} Σ_{n∈N} p(m, n) · log( p(m, n) / (p(m)·p(n)) )

wherein: M and N are the distribution spaces of the semantic characterization sequences m and n, p(m, n) is the joint probability distribution of m and n, and p(m) and p(n) are the marginal probability distributions of m and n, respectively;
(5c) From the definition of KL divergence, the mutual information of m and n is the KL divergence between their joint probability distribution and the product of their marginal probability distributions. Since the KL divergence is a special form of f-divergence, whose convex function satisfies f(1) = 0, the lower bound satisfied by the mutual information can be obtained:

I(m; n) ≥ E_{p(m,n)}[X(m, n)] − E_{p(m)p(n)}[e^{X(m,n)−1}]

wherein: X is any function over which the bound can be optimized;
(5d) To maximize the semantic information correlating the transmitter and receiver, the loss function of the semantic optimization network is established as:

Loss(m, n) = −( E_{p(m,n)}[X(m, n)] − E_{p(m)p(n)}[e^{X(m,n)−1}] )

The function X that maximizes the mutual information can then be found by training the network;
Further, step (7) comprises the following specific steps:
(7a) Calculate the distribution difference between the prediction result s' and the target result s from their two distributions:

D_KL(p‖p') = Σ_{i=1}^{l} p(w_i) · log( p(w_i)/p'(w_i) ) = H(p, p') − H(p)

wherein: l is the sentence length, p represents the probability distribution of the target result, p' the probability distribution of the prediction result, H(p) the information entropy of the target result, and H(p, p') the cross entropy of the target result and the prediction result;
(7b) To reduce the amount of deep-learning computation, the cross-entropy term is selected for optimization, and the cross-entropy loss function is established:

Loss_CE(s, s') = −(1/N) Σ_{j=1}^{N} Σ_{i=1}^{l} p(w_i) · log p'(w_i)

wherein: N is the number of samples;
(7c) In combination with the semantic optimization function, the loss function is updated by the following formula:
Loss = Loss_CE(s, s') + σ·Loss(m, n)
wherein: σ is the update weight, a decimal in the range [0, 1], generally kept between 0 and 0.2;
(7d) If the Loss is less than the threshold, save the current model parameters and then proceed to (7e); otherwise proceed directly to the next step;
(7e) The decoding parameters are updated by back propagation; the gradient is calculated as:

Δ = ∂Loss/∂α

wherein: α is the parameter to be updated, and ∂Loss/∂α is computed by the chain rule through the modules or networks between Loss and α. Not every parameter needs to be differentiated strictly through each module traversed by back propagation; for example, if only the end-to-end input and output of the codec are considered, the gradient calculation can be simplified to:

Δ = (∂Loss/∂u) · (∂u/∂α)

where u denotes the output of the black-boxed part feeding the module that contains α;
After the gradient is obtained, model parameters such as the attention-mechanism weight parameters and hidden-layer parameters can be dynamically adjusted by:
α_n = α_{n−1} − λΔ
wherein: λ is the gradient-descent weight parameter (learning rate);
(7f) If the current training times are smaller than the iteratable times, returning to the step (1) and continuing to train the network; otherwise, the model parameters after the last training are saved, and the training is exited.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310512333.1A CN116645971A (en) | 2023-05-08 | 2023-05-08 | Semantic communication text transmission optimization method based on deep learning |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116645971A true CN116645971A (en) | 2023-08-25 |
Family
ID=87614459
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310512333.1A Pending CN116645971A (en) | 2023-05-08 | 2023-05-08 | Semantic communication text transmission optimization method based on deep learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116645971A (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117014126A (en) * | 2023-09-26 | 2023-11-07 | 深圳市德航智能技术有限公司 | Data transmission method based on channel expansion |
CN117725965A (en) * | 2024-02-06 | 2024-03-19 | 湘江实验室 | Federal edge data communication method based on tensor mask semantic communication |
CN117725965B (en) * | 2024-02-06 | 2024-05-14 | 湘江实验室 | Federal edge data communication method based on tensor mask semantic communication |
Similar Documents
Publication | Title
---|---
CN108829722B (en) | Remote supervision Dual-Attention relation classification method and system
CN110119765B (en) | Keyword extraction method based on Seq2Seq framework
CN111639175B (en) | Self-supervision dialogue text abstract method and system
CN110232439B (en) | Intention identification method based on deep learning network
CN110688862A (en) | Mongolian-Chinese inter-translation method based on transfer learning
CN110569505A (en) | Text input method and device
CN111858932A (en) | Multiple-feature Chinese and English emotion classification method and system based on Transformer
CN110196903B (en) | Method and system for generating abstract for article
CN113300813B (en) | Attention-based combined source and channel method for text
CN111209749A (en) | Method for applying deep learning to Chinese word segmentation
CN115617955B (en) | Hierarchical prediction model training method, punctuation symbol recovery method and device
CN116645971A (en) | Semantic communication text transmission optimization method based on deep learning
CN113065349A (en) | Named entity recognition method based on conditional random field
CN116502628A (en) | Multi-stage fusion text error correction method for government affair field based on knowledge graph
CN115545033A (en) | Chinese field text named entity recognition method fusing vocabulary category representation
CN116436567A (en) | Semantic communication method based on deep neural network
CN115309869A (en) | One-to-many multi-user semantic communication model and communication method
CN113535896A (en) | Searching method, searching device, electronic equipment and storage medium
CN111008277B (en) | Automatic text summarization method
CN110569499B (en) | Generating type dialog system coding method and coder based on multi-mode word vectors
CN115470799B (en) | Text transmission and semantic understanding integrated method for network edge equipment
CN115017924B (en) | Construction of neural machine translation model for cross-language translation and translation method thereof
CN116261176A (en) | Semantic communication method based on information bottleneck
CN115293167A (en) | Dependency syntax analysis-based hierarchical semantic communication method and system
CN115840815A (en) | Automatic abstract generation method based on pointer key information
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||