CN116645971A - Semantic communication text transmission optimization method based on deep learning - Google Patents

Semantic communication text transmission optimization method based on deep learning

Info

Publication number
CN116645971A
Authority
CN
China
Prior art keywords
semantic
channel
sequence
optimization
information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310512333.1A
Other languages
Chinese (zh)
Inventor
袁源
宋晓勤
赵晨辰
刘宇
陈思祺
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University of Aeronautics and Astronautics
Original Assignee
Nanjing University of Aeronautics and Astronautics
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University of Aeronautics and Astronautics filed Critical Nanjing University of Aeronautics and Astronautics
Priority to CN202310512333.1A priority Critical patent/CN116645971A/en
Publication of CN116645971A publication Critical patent/CN116645971A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33Querying
    • G06F16/3331Query processing
    • G06F16/334Query execution
    • G06F16/3344Query execution using natural language analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35Clustering; Classification
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/279Recognition of textual entities
    • G06F40/289Phrasal analysis, e.g. finite state techniques or chunking
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/30Semantic analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • G06N3/0455Auto-encoder networks; Encoder-decoder networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/0464Convolutional networks [CNN, ConvNet]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/04Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
    • G10L19/16Vocoder architecture
    • G10L19/18Vocoders using multiple modes
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04BTRANSMISSION
    • H04B17/00Monitoring; Testing
    • H04B17/30Monitoring; Testing of propagation channels
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D30/00Reducing energy consumption in communication networks
    • Y02D30/70Reducing energy consumption in communication networks in wireless communication networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Software Systems (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Databases & Information Systems (AREA)
  • Signal Processing (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Electromagnetism (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Machine Translation (AREA)

Abstract

The invention provides a semantic communication text transmission optimization method based on deep learning, addressing the limited data compression of traditional communication systems. For further optimization in scenarios where computing resources at the data-processing end are limited, channel coding and decoding are treated as a black box so that the optimization concentrates on maximizing semantic information. This further improves the accuracy of text semantic transmission in the semantic communication system and achieves a good balance between complexity and performance.

Description

Semantic communication text transmission optimization method based on deep learning
Technical Field
The invention relates to semantic communication technology, in particular to a deep-learning-based semantic communication method, and more particularly to a deep-learning-based semantic communication optimization method oriented to text transmission.
Background
From 1G to 5G, communication has mainly been concerned with how to transmit bits from a transmitter to a receiver accurately and efficiently. Under the traditional communication architecture, transmission rates on the order of gigabits per second (Gbps) have been reached and system capacity is gradually approaching the Shannon limit. Meanwhile, today's intelligent application scenarios such as human-machine interaction, autonomous driving, geological monitoring and remote health generate an enormous volume of data; for example, current cellular networks must handle data traffic growing at an exponential rate, with requirements including uplink and downlink data rates of 1 Tb/s, traffic densities of 1-10, and time delays of 0.1 ms, which inevitably pose a huge challenge to traditional communication systems. Semantic communication (Semantic Communication, SC), as an important product of the convergence of deep learning (DL) and communication technology, has immense potential in data compression and in improving the signaling rate.
Compared with mature syntactic (bit-level) communication technology, research on semantic communication is still at a preliminary stage. Scientists have carried out some exploratory work in this field and made certain progress in the design of semantic communication system architectures at the structural level, the training of background knowledge bases at the algorithmic level, and applications at the receiving end. However, only a few studies examine semantic communication optimization algorithms in detail; most current optimization algorithms do not fully consider the shortage of computing resources and offer only theoretical demonstrations, lacking simulation verification at the application level. For example, the Universal Transformer is an optimization algorithm based on the Transformer model which makes the semantic encoding and decoding network dynamically adaptive by adding an adaptive network, so as to match the transmission requirements of texts of different complexity; however, the reduction in transmission-data computation cannot well offset the computation added by the adaptive network.
Therefore, the invention provides a semantic communication optimization algorithm based on deep learning which, for data-intensive and resource-constrained 6G mobile communication scenarios, takes maximizing semantic information as the optimization target of the semantic communication system and achieves a good balance between complexity and performance.
Disclosure of Invention
The invention aims to: address the problems existing in the prior art by providing a deep-learning-based semantic communication optimization algorithm oriented to the transmission of text data. The method extracts and recovers semantic information with a Transformer, so that the receiving end can recover semantic information to a greater extent under low signal-to-noise-ratio conditions.
The technical scheme is as follows: aiming at the limited data compression of the traditional communication system, a Transformer-based semantic communication system is adopted to extract and compress semantic information in a specific scenario, and the acquired semantic information then undergoes joint source-channel coding/decoding so that it is retained to a greater extent under limited channel capacity. For the further optimization of the text-oriented semantic communication system in scenarios where computing resources at the data-processing end are limited, channel coding and decoding are treated as a black box so that the optimization concentrates on the semantic information, further improving the accuracy of text semantic transmission in the semantic communication system. Finally, based on the constructed mathematical model, comparative tests against other schemes are carried out over different channel environments, and the robustness of the system under limited computing resources is analyzed and verified. The invention is realized by the following technical solution: a semantic communication text transmission optimization method based on deep learning, comprising the following steps:
(1) At the transmitting end, first preprocess the input corpus to generate a training set, a test set and the corresponding dictionary, which facilitates recovery of the predicted text;
(2) According to the initial model parameters, perform semantic coding S_α(·) on the input sentence s; the semantic representation sequence generated by the Transformer-based semantic encoder is m = S_α(s);
(3) Perform channel coding C_β(·) to ensure stable transmission of the sequence over the channel; the coded symbol stream is x = C_β[S_α(s)];
(4) Establishing a channel model according to the required signal-to-noise ratio condition and the environment for transmitting information;
(5) At the receiving end, the channel output signal y is first sent to the channel decoding module to recover the semantic representation sequence n;
(6) Treating channel coding and decoding as a black box, extract the recovered semantic representation sequence n and the semantic representation sequence m generated by the semantic encoder, and feed them into a semantic optimization network to obtain the loss value required for optimizing the semantic information;
(7) According to the local background knowledge base of the receiving end, semantically decode the semantic representation sequence to obtain the predicted text sequence s′;
(8) Perform a cross-entropy loss calculation on the predicted s′ and the target sequence s, and back-propagate the obtained result together with the semantic optimization function to train the system model;
(9) At the system-performance analysis stage, test the trained system in different channel environments, taking BLEU or semantic similarity as the evaluation index and focusing on the performance of the system when computing resources are limited.
Further, the step (1) includes the following specific steps:
(1a) Data cleaning: remove accent marks in the language, filter out unnecessary characters such as XML tags and special symbols, and add a space before the punctuation mark at the end of a sentence to separate it from the text content;
(1b) Word segmentation: the text is split into corresponding words, phrases or symbols, etc. for easier subsequent processing. The method employed varies for corpora in different languages.
If the input text is English, French, German, etc., processing is relatively simple: regular-expression word segmentation can be used, non-English characters (i.e., characters other than a-z and A-Z) can be deleted directly, and capital letters are converted to lowercase, which reduces duplicate vocabulary and simplifies the model.
Processing of a Chinese corpus is relatively complex. First, the Chinese word segmentation component 'Jieba' in the Python third-party libraries is called, and its cut function is used to split the Chinese text to be processed. In addition, non-Chinese characters must be deleted: according to the Unicode table, the Chinese characters to be retained lie in the range from the initial character '一' to the ending character '龥', and characters outside this range are removed. Finally, stop words are removed;
(1c) Sentence segmentation: long texts are separated according to sentence boundaries, which makes it convenient to process single sentences and count sentence lengths. Sequence start and end markers are also added, which helps the model better recognize the sentence structure and grammar rules of the processed language and improves the model's performance;
(1d) Vocabulary construction: creating a list containing all words and uniquely encoding the words so as to perform word frequency statistics and vector conversion on the sentences;
(1e) Sequence padding (Sequence Padding): so that all sequences have the same length and can be conveniently fed into the training model, filling words must be added to sentences of different lengths. Generally 0 is chosen as the filling word, post-alignment (padding at the end) is then performed, and all sentences are padded to the length of the longest sentence;
(1f) Data set partitioning: the data set is divided into a training set and a test set at a ratio of 9:1, so that the corresponding data set can be used to evaluate the performance of the experimental model at different stages. (A minimal preprocessing sketch is given below.)
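By way of non-limiting illustration, the following minimal Python sketch covers steps (1a)-(1f) for an English corpus. The file name corpus.txt, the maximum sentence length of 30 and all helper names are illustrative assumptions, not part of the claimed method:

import random
import re

PAD, START, END = 0, 1, 2                      # special ids; 0 doubles as the filling word

def clean(line):
    line = re.sub(r"<[^>]+>", " ", line)       # strip XML-like tags
    line = re.sub(r"[^a-zA-Z.,!? ]+", " ", line).lower()
    return re.sub(r"([.,!?])", r" \1", line).split()   # space before punctuation, then split

def build_vocab(sentences):
    vocab = {"<PAD>": PAD, "<START>": START, "<END>": END}
    for s in sentences:
        for w in s:
            vocab.setdefault(w, len(vocab))
    return vocab

def encode_and_pad(sentences, vocab, max_len):
    padded = []
    for s in sentences:
        ids = [START] + [vocab[w] for w in s] + [END]
        padded.append(ids + [PAD] * (max_len + 2 - len(ids)))   # post-alignment with 0
    return padded

sentences = [clean(l) for l in open("corpus.txt", encoding="utf-8") if l.strip()]
sentences = [s for s in sentences if 0 < len(s) <= 30]          # keep single sentences
vocab = build_vocab(sentences)
data = encode_and_pad(sentences, vocab, max_len=30)
random.shuffle(data)
split = int(0.9 * len(data))                                    # 9:1 train/test split
train_set, test_set = data[:split], data[split:]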
Further, the step (2) includes the following specific steps:
(2a) A semantic codec model is built, comprising 4 encoder layers and 4 decoder layers. The input sentence is semantically encoded with the help of the local knowledge base, and semantic information is extracted before entering the physical channel, achieving effective data compression;
(2b) For an input sentence s, s = [w_1, w_2, ..., w_l], where w_i denotes the i-th word in the sentence and l is the sentence length. First, each input word is expanded to the model dimension 128 by word embedding, and the generated word vectors carry both feature semantics and specific position information through the following position codes:
PE(pos, 2i) = sin(pos / 10000^(2i/d)),  PE(pos, 2i+1) = cos(pos / 10000^(2i/d))  Expression 1
wherein: d is the dimension of the word vector, i is the index of its dimension, and pos is the position index of the word vector.
Then the sentence fed into the semantic encoder takes the form:
e = [e_1, e_2, ..., e_l],  e_i = Emb(w_i) + PE(i)  Expression 2
(2c) The key to the self-attention calculation on the input word vectors is the training of the three weight matrices in the self-attention network, W_Q, W_K and W_V. After the word-embedding vectors are multiplied by these matrices, the Query vector (Q), Key vector (K) and Value vector (V) are obtained, and attention is then calculated by the following formula:
Attention(Q, K, V) = softmax(Q·K^T / √d_k)·V  Expression 3
(2d) Since the Transformer uses a multi-head mechanism (Multi-Head Attention), i.e., multiple sets of W_Q, W_K, W_V matrices yield multiple sets of Q, K, V vectors, the Z matrices computed by each head are integrated through a splicing (Concat) operation so that the model can attend to information at different semantic levels simultaneously;
MultiHead = Concat(Z_1, Z_2, ..., Z_k)  Expression 4
(2e) The spliced result is fed into a feed-forward network and, after residual connection (Residual Connection) and layer normalization (Layer Normalization, LN), is passed to the channel coding module; the generated semantic representation sequence is S_α(s).
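As a non-limiting illustration, the position codes of Expression 1 and the scaled dot-product attention of Expression 3 can be sketched in plain NumPy as follows. The sentence length 10, initialization scale 0.02 and single-head setting are illustrative assumptions:

import numpy as np

def positional_encoding(max_len, d=128):
    # PE(pos, 2i) = sin(pos / 10000^(2i/d)),  PE(pos, 2i+1) = cos(pos / 10000^(2i/d))
    pos = np.arange(max_len)[:, None]
    i = np.arange(d)[None, :]
    angle = pos / np.power(10000.0, (2 * (i // 2)) / d)
    return np.where(i % 2 == 0, np.sin(angle), np.cos(angle))

def attention(Q, K, V):
    # Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V   (Expression 3)
    scores = Q @ K.T / np.sqrt(Q.shape[-1])
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)
    return w @ V

l, d = 10, 128
x = np.random.randn(l, d) + positional_encoding(l, d)        # word embeddings + position codes
W_Q, W_K, W_V = (0.02 * np.random.randn(d, d) for _ in range(3))
Z = attention(x @ W_Q, x @ W_K, x @ W_V)                      # one attention head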
Further, the step (4) comprises the following specific steps:
(4a) Most physical channels can be modeled by neural networks. For additive white gaussian noise (Additive White Gaussian Noise, AWGN) channels, multiplicative gaussian noise channels, and erasure channels, a simple neural network can model them. For fading channels, such as Rayleigh fading channels, more complex neural networks are needed;
(4b) Let the channel coefficient be g and the channel noise be n. After the information from the transmitting end has been transmitted, the signal received at the receiving end can be expressed as:
y = g·x + n  Expression 5
(4c) If the channel is AWGN, g = 1 and n follows the Gaussian distribution N(0, σ^2); if the channel is a Rayleigh fading channel, g follows the Rayleigh distribution f(g) = (g/σ^2)·exp(-g^2/(2σ^2)), g ≥ 0, and under high signal-to-noise-ratio conditions g and n can be simulated with the circularly symmetric complex Gaussian distribution CN(0, σ^2). (A channel-simulation sketch is given below.)
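A minimal simulation sketch of the two channel conditions of step (4c) follows. The 6 dB SNR, the block length 1000 and the helper name channel are illustrative assumptions; a trainable neural-network channel model, as mentioned in (4a), could be substituted:

import numpy as np

def channel(x, snr_db, fading="awgn"):
    # y = g*x + n   (Expression 5); noise power derived from the requested SNR
    sigma2 = np.mean(np.abs(x) ** 2) / (10 ** (snr_db / 10))
    n = np.sqrt(sigma2 / 2) * (np.random.randn(*x.shape) + 1j * np.random.randn(*x.shape))
    if fading == "awgn":
        g = 1.0                                          # AWGN: g = 1
    else:
        # Rayleigh fading: |g| Rayleigh-distributed, drawn here as CN(0, 1)
        g = (np.random.randn(*x.shape) + 1j * np.random.randn(*x.shape)) / np.sqrt(2)
    return g * x + n

x = np.exp(1j * 2 * np.pi * np.random.rand(1000))        # unit-power coded symbol stream
y_awgn = channel(x, snr_db=6, fading="awgn")
y_rayleigh = channel(x, snr_db=6, fading="rayleigh")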
Further, the step (6) comprises the following specific steps:
(6a) In the optimization process, the channel coding and decoding part is first treated as a black box, and the semantic representation sequence m generated by the transmitter's semantic encoder and the semantic representation sequence n recovered after channel decoding at the receiver are used as the inputs of the semantic optimization network. To correlate the semantic information at the transmitting and receiving ends to the greatest extent, a maximized mutual information (Mutual Information) mechanism is adopted as the optimization target of the network; since the mutual information determines, to a certain extent, the amount of information contained in the encoded data, maximizing it also improves the signaling rate of the system and maximizes the channel tolerance;
(6b) The mutual information is calculated as follows:
I(m; n) = Σ_{m∈M} Σ_{n∈N} p(m, n)·log[ p(m, n) / (p(m)·p(n)) ]  Expression 6
wherein: M and N are the distribution spaces of the semantic representation sequences m and n, p(m, n) is the joint probability distribution of m and n, and p(m) and p(n) are the marginal probability distributions of m and n, respectively;
(6c) From the definition of the KL divergence, the mutual information of m and n is the KL divergence between the joint probability distribution of m, n and the product of their marginal probability distributions; therefore, in order to optimize the mutual information, some properties of the KL divergence are needed below;
(6d) Given that the KL divergence is a special form of the f-divergence, starting from the properties of the f-divergence, its general expression is:
D_f(P‖Q) = ∫ q(x)·f( p(x)/q(x) ) dx  Expression 7
wherein: f(·) is a convex function satisfying f(1) = 0.
The KL divergence is the f-divergence with f(t) = t·log t:
D_KL(P‖Q) = ∫ p(x)·log( p(x)/q(x) ) dx  Expression 8
(6e) From the properties of convex functions (via the convex conjugate of f), a lower-bound representation of the constraint satisfied by the mutual information can be obtained:
I(m; n) ≥ E_{p(m,n)}[ X(m, n) ] − E_{p(m)p(n)}[ e^(X(m,n)−1) ]  Expression 9
wherein: X is an arbitrary function over which the bound can be optimized;
(6f) In summary, the loss function of the semantic optimization network can be established to perform gradient descent:
Loss(m, n) = −( E_{p(m,n)}[ X(m, n) ] − E_{p(m)p(n)}[ e^(X(m,n)−1) ] )  Expression 10
The function X that maximizes the mutual information can then be found by training the network.
Further, the step (8) comprises the following specific steps:
(8a) The distribution difference between the prediction result s′ and the target result s is calculated from the difference of the two distributions:
D_KL = H(p, p′) − H(p′)  Expression 11
wherein: p represents the probability distribution of the target result, p′ represents the probability distribution of the predicted result, H(p′) represents the information entropy of the predicted result, and H(p, p′) represents the cross entropy of the target result and the predicted result;
(8b) Since H(p′) is a fixed value, optimizing the KL divergence is equivalent to optimizing the cross-entropy term. Considering that the deep-learning process generates a large amount of computation, the cross-entropy term is selected for optimization and a cross-entropy loss function is established. For a classification problem, the corresponding cross-entropy loss function can be expressed as:
Loss = −(1/N)·Σ_{i=1}^{N} log p(w_i)  Expression 12
wherein: N is the number of samples, w_i is the label corresponding to the i-th sample, and p(w_i) is the probability of predicting sample i correctly.
The cross-entropy loss function established from H(p, p′) is then:
Loss_CE(s, s′) = −Σ_{i=1}^{l} p(w_i)·log p′(w_i)  Expression 13
(8c) Combined with the semantic optimization function, the loss function is updated by the following formula:
Loss = Loss_CE(s, s′) + σ·Loss(m, n)  Expression 14
wherein: σ is the update weight, a decimal in the range 0-1, generally kept between 0 and 0.2;
(8d) If the Loss is less than the threshold, save the current model parameters and then go to (8e); otherwise, go directly to the next step without saving the model parameters;
(8e) The decoding parameters are updated by back propagation. In the back-propagation process, a gradient descent method is used to approach the optimal solution. The gradient is calculated as:
Δ = ∂Loss/∂α  Expression 15
wherein: α is the parameter to be updated, and the partial derivative is obtained by the chain rule through the modules or networks that must be traversed when computing the gradient from the Loss back to α. Since not all parameters need to be strictly calculated according to the modules traversed by back propagation (for example, if only the end-to-end input/output of the codec is considered), the gradient calculation can be simplified as:
Δ ≈ (∂Loss/∂s′)·(∂s′/∂α)  Expression 16
In particular, when the channel is an AWGN channel, ∂y/∂x = 1.
after the gradient is obtained, model parameters such as weight parameters of an attention mechanism, hidden layer parameters and the like can be dynamically adjusted by the following steps:
α n =α n-1 - λΔ expression 17
Wherein: λ is a gradient decreasing weight parameter;
(8f) If the current number of training iterations is smaller than the allowed number of iterations, return to step (2) and continue training the network; otherwise, save the model parameters from the last training round and exit training. (A sketch of the combined loss and parameter update is given below.)
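The combined objective of Expression 14 and the parameter adjustment of Expression 17 can be sketched as follows. This is a minimal PyTorch sketch; the weight sigma = 0.1, the learning rate lam = 1e-3 and the function name train_step are illustrative assumptions, and loss_mi is assumed to come from the mutual-information sketch above:

import torch
import torch.nn as nn

def train_step(pred_logits, target_ids, loss_mi, params, sigma=0.1, lam=1e-3):
    """One training update combining Expression 14 with the update rule of Expression 17."""
    ce = nn.CrossEntropyLoss(ignore_index=0)(               # the filling word 0 is masked out
        pred_logits.reshape(-1, pred_logits.size(-1)), target_ids.reshape(-1))
    loss = ce + sigma * loss_mi                             # Loss = Loss_CE(s, s') + sigma*Loss(m, n)
    loss.backward()                                         # back-propagate to the decoding parameters
    with torch.no_grad():
        for p in params:
            if p.grad is not None:
                p -= lam * p.grad                           # alpha_n = alpha_{n-1} - lambda*Delta
                p.grad.zero_()
    return loss.item()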
further, the step (9) comprises the following specific steps:
(9a) Load the trained model parameters and the test data set;
(9b) Initializing a channel environment of an analysis process;
(9c) Selecting an evaluation index;
(9d) If the evaluation index is BLEU, the step (9 e) is entered; if the evaluation index is semantic similarity, entering a step (9 f); if the evaluation index is other, returning an error;
(9e) BLEU is based on the N-gram model. From the precision value p_n of each n-gram and its weight w_n, the BLEU of the output result with respect to the target sentence can be calculated:
BLEU = BP·exp( Σ_{n=1}^{N} w_n·log p_n )  Expression 18
wherein: BP is the length penalty factor (Brevity Penalty), whose value is a conditional function:
BP = 1 if l_{s′} ≤ l_s;  BP = exp(1 − l_{s′}/l_s) if l_{s′} > l_s  Expression 19
The BLEU value ranges from 0 to 1: the larger the value, the higher the degree to which the prediction restores the target; the smaller the value, the less accurate the prediction. When the length of the predicted sentence s′ is greater than the length of the original sentence s, the BLEU value is reduced by the penalty factor; if the predicted sentence is not longer than the original sentence, the penalty mechanism does not need to be activated;
(9f) The calculation of semantic similarity is based on the model parameters BERT-Large, Uncased (Whole Word Masking) released by Google. Using the BERT model, denoted B(·), the semantic distance between the two sentences is calculated by analogy with the angle (cosine) between their embedding vectors:
match(s, s′) = B(s)·B(s′)^T / ( ‖B(s)‖·‖B(s′)‖ )  Expression 20
(9g) Calculate the average value of the evaluation index over the test samples;
(9h) Return the evaluation result.
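By way of illustration, the two evaluation indexes of step (9) can be sketched as follows. The BLEU routine follows the structure of Expressions 18-19 with uniform weights, and the semantic-similarity routine assumes that sentence embeddings B(s) and B(s′), e.g. produced by the BERT model above, are already available; all names and example inputs are illustrative:

import math
from collections import Counter

def bleu(candidate, reference, max_n=4):
    """Sentence-level BLEU in the spirit of Expressions 18-19 (uniform n-gram weights)."""
    log_p = 0.0
    for n in range(1, max_n + 1):
        cand = Counter(tuple(candidate[i:i + n]) for i in range(len(candidate) - n + 1))
        ref = Counter(tuple(reference[i:i + n]) for i in range(len(reference) - n + 1))
        overlap = sum(min(c, ref[g]) for g, c in cand.items())
        p_n = overlap / max(sum(cand.values()), 1)
        log_p += (1.0 / max_n) * math.log(max(p_n, 1e-9))
    l_c, l_r = len(candidate), len(reference)
    bp = 1.0 if l_c <= l_r else math.exp(1 - l_c / l_r)   # penalize over-long predictions, as above
    return bp * math.exp(log_p)

def semantic_similarity(b_s, b_s_prime):
    """Cosine of the angle between the sentence embeddings B(s) and B(s')."""
    dot = sum(a * b for a, b in zip(b_s, b_s_prime))
    norm = math.sqrt(sum(a * a for a in b_s)) * math.sqrt(sum(b * b for b in b_s_prime))
    return dot / norm if norm else 0.0

print(bleu("the cat sat on the mat".split(), "the cat is on the mat".split()))
print(semantic_similarity([0.2, 0.7, 0.1], [0.25, 0.65, 0.05]))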
The beneficial effects are as follows: the invention provides a semantic communication text transmission optimization method based on deep learning. For text transmission scenarios with limited computing resources in 6G communication, a Transformer-based semantic communication system is used to extract and compress semantic information in a specific scenario, channel coding and decoding are treated as a black box so that the optimization concentrates on the semantic information, the semantic accuracy of text transmission in the semantic communication system is further improved, and a good balance between complexity and performance is obtained.
In summary, in scenarios where the computing resources of the mobile-communication data-processing end are limited, the deep-learning-based semantic communication optimization algorithm provided by the invention is superior in maximizing semantic information when oriented to text data transmission.
Drawings
Fig. 1 is a schematic diagram of a semantic communication text transmission optimization method based on deep learning according to an embodiment of the present invention;
FIG. 2 is a diagram of simulation results of the effect of convergence of a loss value and training times under a semantic optimization algorithm provided by an embodiment of the present invention;
fig. 3 is a diagram of simulation results of the variation of the BLEU value with the signal-to-noise ratio after training for 10 times under the optimization algorithm provided by the embodiment of the present invention;
Detailed Description
The core idea of the invention is as follows: aiming at the limited data compression of the traditional communication system, a deep-learning-based end-to-end semantic communication system model is adopted to extract and compress semantic information in a specific scenario, and the semantic information is retained to a greater extent through joint source-channel coding. For the further optimization problem in scenarios where computing resources at the data-processing end are limited, channel coding and decoding are treated as a black box so that the optimization concentrates on the semantic information, and the decoding parameters are updated by gradient descent to further improve the text semantic accuracy of the semantic communication system.
The present invention is described in further detail below.
In step (1), the input corpus is first preprocessed at the transmitting end to generate a training set, a test set and the corresponding dictionary, which facilitates recovery of the predicted text. The specific steps are as follows:
(1a) Data cleaning: remove accent marks in the language, filter out unnecessary characters such as XML tags and special symbols, and add a space before the punctuation mark at the end of a sentence to separate it from the text content;
(1b) Word segmentation: the text is split into corresponding words, phrases or symbols, etc. for easier subsequent processing. The method employed varies for corpora in different languages.
If the input text is English, French, German, etc., processing is relatively simple: regular-expression word segmentation can be used, non-English characters (i.e., characters other than a-z and A-Z) can be deleted directly, and capital letters are converted to lowercase, which reduces duplicate vocabulary and simplifies the model.
Processing of a Chinese corpus is relatively complex. First, the Chinese word segmentation component 'Jieba' in the Python third-party libraries is called, and its cut function splits the Chinese text to be processed into independent words that are stored in a list; the words are then joined with spaces by Python's built-in join function to re-form a sentence. In addition, non-Chinese characters must be deleted: according to the Unicode table, the Chinese characters to be retained lie in the range from the initial character '一' to the ending character '龥', and characters outside this range are removed. Finally, stop words are removed, i.e., meaningless words in the text are deleted, because they provide no valuable information for semantic analysis; these include some connective words and some Chinese auxiliary particles;
(1c) Sentence segmentation: long texts are separated according to sentence boundaries, which makes it convenient to process single sentences and count sentence lengths. In addition, since the design uses a Transformer for semantic processing, sequence START-END marking is needed to help the machine-learning model better understand and process text sequences: a special START token (usually denoted "<s>" or "<START>") is added at the beginning of each text sequence, i.e., each single sentence, and a special END token (usually denoted "</s>" or "<END>") is added at the end, and sentences are divided using these tokens. This helps the model better recognize the sentence structure and grammar rules of the processed language and improves the model's performance;
(1d) Vocabulary construction: creating a list containing all words and uniquely encoding the words so as to perform word frequency statistics and vector conversion on the sentences;
(1e) Sequence padding: so that all sequences have the same length and can be conveniently fed into the training model, filling words must be added to sentences of different lengths. Generally 0 is chosen as the filling word, post-alignment (padding at the end) is then performed, and all sentences are padded to the length of the longest sentence;
(1f) Data set partitioning: the data set is divided into a training set and a test set at a ratio of 9:1, so that the corresponding data set can be used to evaluate the performance of the experimental model at different stages. (A sketch of the Chinese-corpus preprocessing branch is given below.)
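For the Chinese-corpus branch of step (1b), a minimal sketch might look as follows. It assumes the third-party jieba package is installed; the stop-word list and the example sentence are purely illustrative:

import re
import jieba   # Chinese word segmentation component "Jieba"

STOP_WORDS = {"的", "了", "是", "在", "呀", "吗"}          # illustrative stop-word list

def preprocess_zh(text):
    text = re.sub(r"[^\u4e00-\u9fa5]", "", text)           # keep only characters in '一'..'龥'
    words = [w for w in jieba.cut(text) if w not in STOP_WORDS]
    return " ".join(words)                                  # re-join the split words with spaces

print(preprocess_zh("今天的天气真好呀！"))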
In step (2), semantic coding S_α(·) is performed on the input sentence s according to the initial model parameters; the semantic representation sequence generated by the Transformer-based semantic encoder is m = S_α(s). Specifically:
(2a) A semantic codec model is built, comprising 4 encoder layers and 4 decoder layers. The input sentence is semantically encoded with the help of the local knowledge base, and semantic information is extracted before entering the physical channel, achieving effective data compression;
(2b) For an input sentence s, s = [w_1, w_2, ..., w_l], where w_i denotes the i-th word in the sentence and l is the sentence length. First, each input word is expanded to the model dimension 128 by word embedding, and the generated word vectors carry both feature semantics and specific position information through the following position codes:
PE(pos, 2i) = sin(pos / 10000^(2i/d)),  PE(pos, 2i+1) = cos(pos / 10000^(2i/d))  Expression 1
wherein: d is the dimension of the word vector, i is the index of its dimension, and pos is the position index of the word vector.
Then the sentence fed into the semantic encoder takes the form:
e = [e_1, e_2, ..., e_l],  e_i = Emb(w_i) + PE(i)  Expression 2
(2c) The key to the self-attention calculation on the input word vectors is the training of the three weight matrices in the self-attention network, W_Q, W_K and W_V. After the word-embedding vectors are multiplied by these matrices, the Query vector (Q), Key vector (K) and Value vector (V) are obtained, and attention is then calculated by the following formula:
Attention(Q, K, V) = softmax(Q·K^T / √d_k)·V  Expression 3
(2d) Since the Transformer uses a multi-head mechanism, i.e., multiple sets of W_Q, W_K, W_V matrices yield multiple sets of Q, K, V vectors, the Z matrices computed by each head are integrated through a splicing operation so that the model can attend to information at different semantic levels simultaneously;
MultiHead = Concat(Z_1, Z_2, ..., Z_k)  Expression 4
(2e) The spliced result is fed into a feed-forward network and, after residual connection (Residual Connection) and layer normalization (Layer Normalization, LN), is passed to the channel coding module; the generated semantic representation sequence is S_α(s);
In step (3), channel coding C_β(·) is performed to ensure stable transmission of the sequence over the channel; the coded symbol stream is x = C_β[S_α(s)];
In step (4), a channel model is established according to the required signal-to-noise-ratio conditions and the environment in which the information is transmitted, comprising the following specific steps:
(4a) Most physical channels can be modeled by neural networks. For additive white gaussian noise (Additive White Gaussian Noise, AWGN) channels, multiplicative gaussian noise channels, and erasure channels, a simple neural network can model them. For fading channels, such as Rayleigh fading channels, more complex neural networks are needed;
(4b) Let the channel coefficient be g and the channel noise be n. After the information from the transmitting end has been transmitted, the signal received at the receiving end can be expressed as:
y = g·x + n  Expression 5
(4c) If the channel is AWGN, g = 1 and n follows the Gaussian distribution N(0, σ^2); if the channel is a Rayleigh fading channel, g follows the Rayleigh distribution f(g) = (g/σ^2)·exp(-g^2/(2σ^2)), g ≥ 0, and under high signal-to-noise-ratio conditions g and n can be simulated with the circularly symmetric complex Gaussian distribution CN(0, σ^2);
In step (5), at the receiving end, the channel output signal y is first sent to the channel decoding module to recover the semantic representation sequence n;
In step (6), channel coding and decoding are treated as a black box; the recovered semantic representation sequence n and the semantic representation sequence m generated by the semantic encoder are extracted and fed into the semantic optimization network to obtain the loss value required for optimizing the semantic information. The specific steps are as follows:
(6a) In the optimization process, the channel coding and decoding part is first treated as a black box, and the semantic representation sequence m generated by the transmitter's semantic encoder and the semantic representation sequence n recovered after channel decoding at the receiver are used as the inputs of the semantic optimization network. To correlate the semantic information at the transmitting and receiving ends to the greatest extent, a maximized mutual information mechanism is adopted as the optimization target of the network; since the mutual information determines, to a certain extent, the amount of information contained in the encoded data, maximizing it also improves the signaling rate of the system and maximizes the channel tolerance;
(6b) The mutual information is calculated as follows:
I(m; n) = Σ_{m∈M} Σ_{n∈N} p(m, n)·log[ p(m, n) / (p(m)·p(n)) ]  Expression 6
wherein: M and N are the distribution spaces of the semantic representation sequences m and n, p(m, n) is the joint probability distribution of m and n, and p(m) and p(n) are the marginal probability distributions of m and n, respectively;
(6c) From the definition of the KL divergence, the mutual information of m and n is the KL divergence between the joint probability distribution of m, n and the product of their marginal probability distributions; therefore, in order to optimize the mutual information, some properties of the KL divergence are needed below;
(6d) Considering that the KL divergence is a special form of the f-divergence, starting from the properties of the f-divergence, its general expression is:
D_f(P‖Q) = ∫ q(x)·f( p(x)/q(x) ) dx  Expression 7
wherein: f(·) is a convex function satisfying f(1) = 0.
The KL divergence is the f-divergence with f(t) = t·log t:
D_KL(P‖Q) = ∫ p(x)·log( p(x)/q(x) ) dx  Expression 8
(6e) From the properties of convex functions (via the convex conjugate of f), a lower-bound representation of the constraint satisfied by the mutual information can be obtained:
I(m; n) ≥ E_{p(m,n)}[ X(m, n) ] − E_{p(m)p(n)}[ e^(X(m,n)−1) ]  Expression 9
wherein: X is an arbitrary function over which the bound can be optimized;
(6f) In summary, the loss function of the semantic optimization network can be established to perform gradient descent:
Loss(m, n) = −( E_{p(m,n)}[ X(m, n) ] − E_{p(m)p(n)}[ e^(X(m,n)−1) ] )  Expression 10
The function X that maximizes the mutual information can then be found by training the network;
step (7), according to the local background knowledge base of the receiving end, carrying out semantic decoding on the semantic representation sequence to obtain a predicted text sequence
In step (8), a cross-entropy loss calculation is performed on the predicted s′ and the target sequence s, and the obtained result is back-propagated together with the semantic optimization function to train the system model, comprising the following specific steps:
(8a) The distribution difference between the prediction result s′ and the target result s is calculated from the difference of the two distributions:
D_KL = H(p, p′) − H(p′)  Expression 11
wherein: p represents the probability distribution of the target result, p′ represents the probability distribution of the predicted result, H(p′) represents the information entropy of the predicted result, and H(p, p′) represents the cross entropy of the target result and the predicted result;
(8b) Since H(p′) is a fixed value, optimizing the KL divergence is equivalent to optimizing the cross-entropy term. Considering that the deep-learning process generates a large amount of computation, the cross-entropy term is selected for optimization and a cross-entropy loss function is established. For a classification problem, the corresponding cross-entropy loss function can be expressed as:
Loss = −(1/N)·Σ_{i=1}^{N} log p(w_i)  Expression 12
wherein: N is the number of samples, w_i is the label corresponding to the i-th sample, and p(w_i) is the probability of predicting sample i correctly.
The cross-entropy loss function established from H(p, p′) is then:
Loss_CE(s, s′) = −Σ_{i=1}^{l} p(w_i)·log p′(w_i)  Expression 13
(8c) Combined with the semantic optimization function, the loss function is updated by the following formula:
Loss = Loss_CE(s, s′) + σ·Loss(m, n)  Expression 14
wherein: σ is the update weight, a decimal in the range 0-1, generally kept between 0 and 0.2;
(8d) If the Loss is less than the threshold, save the current model parameters and then go to (8e); otherwise, go directly to the next step without saving the model parameters;
(8e) The decoding parameters are updated by back propagation. In the back-propagation process, a gradient descent method is used to approach the optimal solution. The gradient is calculated as:
Δ = ∂Loss/∂α  Expression 15
wherein: α is the parameter to be updated, and the partial derivative is obtained by the chain rule through the modules or networks that must be traversed when computing the gradient from the Loss back to α. Not all parameters need to be strictly calculated according to the modules traversed by back propagation; for example, when training the semantic communication system, the semantic codec and the channel codec are often combined, only the end-to-end input/output of the semantic codec is considered, and the channel codec is treated as a black box. In this case the gradient calculation can be simplified as:
Δ ≈ (∂Loss/∂s′)·(∂s′/∂α)  Expression 16
In particular, when the channel is an AWGN channel, ∂y/∂x = 1.
After the gradient is obtained, model parameters such as the weight parameters of the attention mechanism and the hidden-layer parameters can be dynamically adjusted by:
α_n = α_(n−1) − λ·Δ  Expression 17
wherein: λ is the gradient-descent weight (learning rate);
(8f) If the current number of training iterations is smaller than the allowed number of iterations, go back to step (2) and continue training the network; otherwise, save the model parameters from the last training round and exit training. (A sketch of the black-box treatment of the channel codec during back propagation is given below.)
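One possible way (an assumption of this description, not the only realization) to obtain the simplified gradient of Expression 16 in code is a straight-through pass around the channel codec; the module names channel_encoder, channel_decoder and channel_fn below are illustrative:

import torch

def black_box_channel_pass(m, channel_encoder, channel_decoder, channel_fn):
    """Run channel coding, the physical channel and channel decoding without tracking
    gradients, then pass gradients straight through so that, as in Expression 16,
    back propagation sees only the end-to-end input/output of the semantic codec."""
    with torch.no_grad():
        x = channel_encoder(m)       # C_beta(.)
        y = channel_fn(x)            # e.g. AWGN or Rayleigh channel simulation
        n = channel_decoder(y)       # recovered semantic representation sequence
    return m + (n - m).detach()      # value of n, gradient of m (dn/dm treated as identity)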
In step (9), at the system-performance analysis stage, the trained system is tested in different channel environments, with BLEU or semantic similarity as the evaluation index, focusing on the performance of the system when computing resources are limited. The specific steps are as follows:
(9a) Load the trained model parameters and the test data set;
(9b) Initializing a channel environment of an analysis process;
(9c) Selecting an evaluation index;
(9d) If the evaluation index is BLEU, the step (9 e) is entered; if the evaluation index is semantic similarity, entering a step (9 f); if the evaluation index is other, returning an error;
(9e) BLEU is based on the N-gram model. From the precision value p_n of each n-gram and its weight w_n, the BLEU of the output result with respect to the target sentence can be calculated:
BLEU = BP·exp( Σ_{n=1}^{N} w_n·log p_n )  Expression 18
wherein: BP is the length penalty factor (Brevity Penalty), whose value is a conditional function:
BP = 1 if l_{s′} ≤ l_s;  BP = exp(1 − l_{s′}/l_s) if l_{s′} > l_s  Expression 19
The BLEU value ranges from 0 to 1: the larger the value, the higher the degree to which the prediction restores the target; the smaller the value, the less accurate the prediction. When the length of the predicted sentence s′ is greater than the length of the original sentence s, the BLEU value is reduced by the penalty factor; if the predicted sentence is not longer than the original sentence, the penalty mechanism does not need to be activated;
(9f) The calculation of semantic similarity is based on the model parameters BERT-Large, Uncased (Whole Word Masking) released by Google. Using the BERT model, denoted B(·), the semantic distance between the two sentences is calculated by analogy with the angle (cosine) between their embedding vectors:
match(s, s′) = B(s)·B(s′)^T / ( ‖B(s)‖·‖B(s′)‖ )  Expression 20
(9g) Calculate the average value of the evaluation index over the test samples;
(9h) Return the evaluation result.
Fig. 1 illustrates the deep-learning-based semantic communication text transmission optimization method: after the channel is treated as a black box, the semantic optimization network further reduces the variation range of the semantic representation sequence by calculating the difference between the semantic sequences at the transmitting and receiving ends.
Fig. 2 shows the simulation result of the convergence of the loss value versus the number of training iterations under the semantic optimization algorithm; it can be seen that the optimization method converges faster in the initial stage and does not become trapped in a local minimum, which would cause wasted resources and time delay.
Fig. 3 shows the simulation result of the BLEU value versus the signal-to-noise ratio after 10 training rounds under the optimization algorithm; with limited computing resources and a low signal-to-noise ratio, the semantic optimization algorithm improves accuracy by about 20% compared with the DeepSC algorithm.
From the above description, it should be apparent to those skilled in the art that the invention provides a semantic communication text transmission optimization method based on deep learning that can effectively avoid wasting computing resources and achieve a good balance between complexity and performance.
What is not described in detail in the present application belongs to the prior art known to those skilled in the art.

Claims (1)

1. The semantic communication text transmission optimization method based on deep learning is characterized by comprising the following steps of:
(1) Perform semantic coding S_α(·) on the input sentence s according to the initial model parameters; the generated semantic representation sequence is m = S_α(s);
(2) Perform channel coding C_β(·) to ensure stable transmission of the sequence over the channel; the coded symbol stream is x = C_β[S_α(s)];
(3) Establishing a channel model according to the required signal-to-noise ratio condition and the environment for transmitting information;
(4) At the receiving end, the channel output signal y is first sent to the channel decoding module to recover the semantic representation sequence n;
(5) Taking channel coding and decoding as a black box, extracting a recovered semantic representation sequence n and a semantic representation sequence m generated by a semantic encoder, and sending the semantic representation sequence n and the semantic representation sequence m into a semantic optimization network to obtain a loss value required by optimizing semantic information;
(6) According to a local background knowledge base of a receiving end, carrying out semantic decoding on the semantic representation sequence to obtain a predicted sequence s';
(7) Performing cross entropy loss function calculation on the predicted s' and the target sequence s, and performing back propagation on the obtained result and a semantic optimization function together to train a system model;
further, the step (5) comprises the following specific steps:
(5a) In the optimization process, the channel coding and decoding part is first treated as a black box, and the semantic representation sequence m generated by the transmitter's semantic encoder and the semantic representation sequence n recovered after channel decoding at the receiver are used as the inputs of the semantic optimization network; in order to correlate the semantic information at the transmitting and receiving ends to the greatest extent, a maximized mutual information mechanism is adopted as the optimization target of the network, and since the mutual information determines, to a certain extent, the amount of information contained in the encoded data, a further advantage of this mechanism is that the signaling rate of the system can be improved and the channel tolerance maximized;
(5b) The mutual information is calculated as follows:
I(m; n) = Σ_{m∈M} Σ_{n∈N} p(m, n)·log[ p(m, n) / (p(m)·p(n)) ]
wherein: M and N are the distribution spaces of the semantic representation sequences m and n, p(m, n) is the joint probability distribution of m and n, and p(m) and p(n) are the marginal probability distributions of m and n, respectively;
(5c) From the definition of the KL divergence, the mutual information of m and n is the KL divergence between the joint probability distribution of m, n and the product of their marginal probability distributions; since the KL divergence is a special form of the f-divergence (f-divergence), and the convex function of the f-divergence satisfies f(1) = 0, the lower-bound representation of the constraint satisfied by the mutual information can be obtained:
I(m; n) ≥ E_{p(m,n)}[ X(m, n) ] − E_{p(m)p(n)}[ e^(X(m,n)−1) ]
wherein: X is an arbitrary function over which the bound can be optimized;
(5d) In order to maximize the semantic information associated at the transmitting and receiving ends, the loss function of the semantic optimization network is established as:
Loss(m, n) = −( E_{p(m,n)}[ X(m, n) ] − E_{p(m)p(n)}[ e^(X(m,n)−1) ] )
The function X that maximizes the mutual information can be found by training the network;
further, the step (7) comprises the following specific steps:
(7a) The distribution difference between the prediction result s′ and the target result s is calculated from the difference of the two distributions:
D_KL = H(p, p′) − H(p′),  with H(p, p′) = −Σ_{i=1}^{l} p(w_i)·log p′(w_i)
wherein: l is the sentence length, p represents the probability distribution of the target result, p′ represents the probability distribution of the predicted result, H(p′) represents the information entropy of the predicted result, and H(p, p′) represents the cross entropy of the target result and the predicted result;
(7b) In order to reduce the amount of computation in deep learning, the cross-entropy term is selected and optimized, and the cross-entropy loss function is established:
Loss_CE(s, s′) = −(1/N)·Σ_{i=1}^{N} p(w_i)·log p′(w_i)
wherein: N is the number of samples;
(7c) In combination with the semantic optimization function, the loss function is updated by the following formula:
Loss=Loss_CE(s,s′)+σ·Loss(m,n)
wherein: σ is the update weight, a decimal in the range 0-1, generally kept between 0 and 0.2;
(7d) If the Loss is less than the threshold, save the current model parameters and then go to step (7e); otherwise, go directly to the next step;
(7e) The decoding parameters are updated by back propagation, and the gradient is calculated as:
Δ = ∂Loss/∂α
wherein: α is the parameter to be updated, and the partial derivative is obtained by the chain rule through the modules or networks that must be traversed when computing the gradient from the Loss back to α; since not all parameters need to be strictly calculated according to the modules traversed by back propagation, for example if only the end-to-end input/output of the codec is considered, the gradient calculation can be simplified as:
Δ ≈ (∂Loss/∂s′)·(∂s′/∂α)
After the gradient is obtained, model parameters such as the weight parameters of the attention mechanism and the hidden-layer parameters can be dynamically adjusted by:
α_n = α_(n−1) − λ·Δ
wherein: λ is the gradient-descent weight (learning rate);
(7f) If the current number of training iterations is smaller than the allowed number of iterations, return to step (1) and continue training the network; otherwise, save the model parameters from the last training round and exit training.
CN202310512333.1A 2023-05-08 2023-05-08 Semantic communication text transmission optimization method based on deep learning Pending CN116645971A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310512333.1A CN116645971A (en) 2023-05-08 2023-05-08 Semantic communication text transmission optimization method based on deep learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310512333.1A CN116645971A (en) 2023-05-08 2023-05-08 Semantic communication text transmission optimization method based on deep learning

Publications (1)

Publication Number Publication Date
CN116645971A true CN116645971A (en) 2023-08-25

Family

ID=87614459

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310512333.1A Pending CN116645971A (en) 2023-05-08 2023-05-08 Semantic communication text transmission optimization method based on deep learning

Country Status (1)

Country Link
CN (1) CN116645971A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117014126A (en) * 2023-09-26 2023-11-07 深圳市德航智能技术有限公司 Data transmission method based on channel expansion
CN117014126B (en) * 2023-09-26 2023-12-08 深圳市德航智能技术有限公司 Data transmission method based on channel expansion
CN117725965A (en) * 2024-02-06 2024-03-19 湘江实验室 Federal edge data communication method based on tensor mask semantic communication
CN117725965B (en) * 2024-02-06 2024-05-14 湘江实验室 Federal edge data communication method based on tensor mask semantic communication

Similar Documents

Publication Publication Date Title
CN108829722B (en) Remote supervision Dual-Attention relation classification method and system
CN110119765B (en) Keyword extraction method based on Seq2Seq framework
CN111639175B (en) Self-supervision dialogue text abstract method and system
CN110232439B (en) Intention identification method based on deep learning network
CN110688862A (en) Mongolian-Chinese inter-translation method based on transfer learning
CN110569505A (en) text input method and device
CN111858932A (en) Multiple-feature Chinese and English emotion classification method and system based on Transformer
CN110196903B (en) Method and system for generating abstract for article
CN113300813B (en) Attention-based combined source and channel method for text
CN111209749A (en) Method for applying deep learning to Chinese word segmentation
CN115617955B (en) Hierarchical prediction model training method, punctuation symbol recovery method and device
CN116645971A (en) Semantic communication text transmission optimization method based on deep learning
CN113065349A (en) Named entity recognition method based on conditional random field
CN116502628A (en) Multi-stage fusion text error correction method for government affair field based on knowledge graph
CN115545033A (en) Chinese field text named entity recognition method fusing vocabulary category representation
CN116436567A (en) Semantic communication method based on deep neural network
CN115309869A (en) One-to-many multi-user semantic communication model and communication method
CN113535896A (en) Searching method, searching device, electronic equipment and storage medium
CN111008277B (en) Automatic text summarization method
CN110569499B (en) Generating type dialog system coding method and coder based on multi-mode word vectors
CN115470799B (en) Text transmission and semantic understanding integrated method for network edge equipment
CN115017924B (en) Construction of neural machine translation model for cross-language translation and translation method thereof
CN116261176A (en) Semantic communication method based on information bottleneck
CN115293167A (en) Dependency syntax analysis-based hierarchical semantic communication method and system
CN115840815A (en) Automatic abstract generation method based on pointer key information

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination