CN116865887A - Emotion classification broadcasting system and method based on knowledge distillation - Google Patents

Emotion classification broadcasting system and method based on knowledge distillation

Info

Publication number
CN116865887A
CN116865887A (application CN202310828957.4A)
Authority
CN
China
Prior art keywords
broadcasting
program
emotion
text
emotion classification
Prior art date
Legal status
Granted
Application number
CN202310828957.4A
Other languages
Chinese (zh)
Other versions
CN116865887B (en)
Inventor
刘海章
王祥
张长娟
田才林
黄大池
朱静宁
赵开宇
杜限
黄河
靳晶晶
王佩
邹雪
Current Assignee
Sichuan Institute Of Radio And Television Science And Technology
Original Assignee
Sichuan Institute Of Radio And Television Science And Technology
Priority date
Filing date
Publication date
Application filed by Sichuan Institute Of Radio And Television Science And Technology
Priority to CN202310828957.4A
Publication of CN116865887A
Application granted
Publication of CN116865887B
Legal status: Active (current)
Anticipated expiration: legal status

Classifications

    • H - ELECTRICITY
      • H04 - ELECTRIC COMMUNICATION TECHNIQUE
        • H04H - BROADCAST COMMUNICATION
          • H04H20/00 - Arrangements for broadcast or for distribution combined with broadcast
            • H04H20/53 - Arrangements specially adapted for specific applications, e.g. for traffic information or for mobile receivers
              • H04H20/59 - Arrangements specially adapted for emergency or urgency
    • G - PHYSICS
      • G06 - COMPUTING; CALCULATING OR COUNTING
        • G06F - ELECTRIC DIGITAL DATA PROCESSING
          • G06F16/00 - Information retrieval; Database structures therefor; File system structures therefor
            • G06F16/30 - Information retrieval of unstructured textual data
              • G06F16/35 - Clustering; Classification
                • G06F16/353 - Classification into predefined classes
          • G06F18/00 - Pattern recognition
            • G06F18/20 - Analysing
              • G06F18/21 - Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
                • G06F18/214 - Generating training patterns; Bootstrap methods, e.g. bagging or boosting
                • G06F18/217 - Validation; Performance evaluation; Active pattern learning techniques
              • G06F18/24 - Classification techniques
        • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
          • G06N3/00 - Computing arrangements based on biological models
            • G06N3/02 - Neural networks
              • G06N3/04 - Architecture, e.g. interconnection topology
                • G06N3/045 - Combinations of networks
                  • G06N3/0455 - Auto-encoder networks; Encoder-decoder networks
              • G06N3/08 - Learning methods
                • G06N3/09 - Supervised learning
                • G06N3/096 - Transfer learning
      • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
        • G10L - SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
          • G10L13/00 - Speech synthesis; Text to speech systems
            • G10L13/02 - Methods for producing synthetic speech; Speech synthesisers
            • G10L13/08 - Text analysis or generation of parameters for speech synthesis out of text, e.g. grapheme to phoneme translation, prosody generation or stress or intonation determination
    • Y - GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
      • Y02 - TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
        • Y02D - CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
          • Y02D30/00 - Reducing energy consumption in communication networks
            • Y02D30/70 - Reducing energy consumption in wireless communication networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Biophysics (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Multimedia (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Databases & Information Systems (AREA)
  • Emergency Management (AREA)
  • Signal Processing (AREA)
  • Business, Economics & Management (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)

Abstract

The invention discloses an emotion classification broadcasting system and method based on knowledge distillation. The system comprises a program broadcasting control service subsystem, which aggregates program text content; an AI service subsystem, which performs knowledge distillation on the emotion classification BERT pre-training large model and trains an edge-side emotion classification small model; and an intelligent broadcasting terminal subsystem, which passes the emotion factor obtained by classification, together with the program text, to the text-to-speech conversion engine to generate and broadcast an audio program carrying the emotion factor. The invention not only gives the broadcasting system emotionally colored program playback, but also greatly reduces the amount of transmitted data, because the transmitted program data changes from audio to text; program transmission therefore takes less time and the emergency broadcasting capability of the system is stronger.

Description

Emotion classification broadcasting system and method based on knowledge distillation
Technical Field
The invention belongs to the technical field of intelligent broadcasting, and particularly relates to an emotion classification broadcasting system and method based on knowledge distillation.
Background
In current broadcasting systems, broadcast content is collected from safe, compliant data sources by a news collector and a web crawler. The system first converts the collected broadcast text to speech in the cloud to generate audio programs, and then transmits the audio programs to the broadcast terminals over streaming-media protocols such as RTMP for playback.
The main problems in the above-mentioned manner are:
(1) Broadcast program content lacks emotional color
Because broadcast programs are produced by a text-to-speech (TTS) engine without any emotion classification of the program text, every program is rendered at the same speaking rate and tone. The resulting programs sound flat and unexpressive and cannot have an effective impact on the audience.
(2) The program transmission time is too long
Because the broadcasting terminal plays a converted audio file, transmission takes far longer than it would for the text itself; under poor network conditions, transmission-quality problems can even cause playback to fail.
Disclosure of Invention
The invention aims to overcome the above defects in the prior art by providing an emotion classification broadcasting system and method based on knowledge distillation, so as to solve the problems that existing broadcast programs lack emotional color and take too long to transmit.
In order to achieve the above purpose, the invention adopts the following technical scheme:
In a first aspect, an emotion classification broadcasting system based on knowledge distillation comprises:
the program broadcasting control service subsystem, which is used for aggregating program text content, storing it in a broadcasting program library, and sending the program text content to the intelligent broadcasting terminal subsystem according to broadcasting requirements;
the AI service subsystem, which is used for performing knowledge distillation on the emotion classification BERT pre-training large model, training an edge-side emotion classification small model, and storing and delivering the small model to the intelligent broadcasting terminal subsystem;
and the intelligent broadcasting terminal subsystem, which is used for receiving the edge-side emotion classification small model and the program text, performing emotion classification on the program text, passing the emotion factor obtained by classification together with the program text to the text-to-speech conversion engine, generating an audio program carrying the emotion factor, and broadcasting it.
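By way of illustration only, and not as part of the claimed subject matter, the following minimal Python sketch models the data flow between the three subsystems: the cloud pushes program text and the distilled model to the terminal, and the terminal produces the emotion-colored audio. All names in the sketch (ProgramMessage, DistilledModelPackage, the classify and synthesize callables) are illustrative assumptions rather than elements of the patent.

```python
# Minimal data-flow sketch for the three subsystems (names are assumptions).
from dataclasses import dataclass


@dataclass
class ProgramMessage:
    program_id: str
    text: str  # program text pushed by the program broadcasting control subsystem


@dataclass
class DistilledModelPackage:
    weights_path: str  # edge-side emotion classification small model from the AI subsystem
    labels: tuple = ("urgent", "pleasant", "peaceful", "sad")


def edge_broadcast(msg: ProgramMessage, pkg: DistilledModelPackage,
                   classify, synthesize) -> bytes:
    """classify(text, weights, labels) -> emotion label;
    synthesize(text, emotion) -> audio bytes of the emotion-colored program."""
    emotion = classify(msg.text, pkg.weights_path, pkg.labels)
    return synthesize(msg.text, emotion)  # audio program carrying the emotion factor
```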
In a second aspect, a broadcasting method of the emotion classification broadcasting system based on knowledge distillation comprises the following steps:
s1, content aggregation is carried out on the text content of the program, the content is stored in a broadcasting program library, and the text content of the program is sent to an intelligent broadcasting terminal subsystem according to broadcasting requirements;
s2, carrying out knowledge distillation on the emotion classification BERT pre-training large model, training to obtain an edge side emotion classification small model, and storing and transmitting the edge side emotion classification small model to the intelligent broadcast terminal subsystem;
s3, carrying out emotion classification on the program text by adopting an edge side emotion classification small model, transmitting emotion factors and the program text obtained by classification to a text-to-speech conversion engine, generating an audio program with emotion factors, and broadcasting.
Further, step S1 includes:
the method comprises the steps of collecting content through a news collector and a web crawler for emergency broadcasting, geological disaster information, emergency release of other three parties, appointed news information or government notices, cleaning data of collected content, storing aggregated program data in a broadcasting program library, storing the broadcasting program library in a data and file separation mode, and carrying out CDN release on programs of the broadcasting program library based on broadcasting service.
Further, in step S2, knowledge distillation is performed on the emotion classification BERT pre-training large model, and the training is performed to obtain an edge side emotion classification small model, which includes:
S2.1, initializing a temperature parameter T;
S2.2, training the student model with the current temperature parameter T and a loss function;
S2.3, evaluating the performance of the student model trained under the temperature parameter T using the validation-set accuracy;
S2.4, if the performance on the validation set does not meet the preset condition, increasing the value of the temperature parameter T and returning to S2.2 to continue training; if the performance on the validation set meets the preset condition, saving the current temperature parameter T, reducing the value of T, and then returning to S2.2 to continue training;
and stopping training when the value of the temperature parameter T falls below the threshold or the number of training rounds exceeds the maximum number of training rounds.
Further, step S2.2 includes:
introducing a temperature parameter T and performing temperature scaling with the softmax function; the scaled student model outputs the probability distribution

$$q_{i,k}=\frac{\exp\big(f_S(x_i;\theta)_k/T\big)}{\sum_{j=1}^{K}\exp\big(f_S(x_i;\theta)_j/T\big)},\quad k=1,\dots,K$$

where $f_S(x_i;\theta)$ is the student model's output for the input text content sample $x_i$ and $\theta$ denotes the parameters of the student model;

training the temperature-scaled student model with the loss function $L_{S,T}$:

$$L_{S,T}=-\frac{1}{N}\sum_{i=1}^{N}\sum_{k=1}^{K}y_{i,k}\log q_{i,k}$$

where $y_i$ is the true label of the i-th text content training sample (one-hot over the K classes); N is the total number of text content training samples; K is the number of label categories, i.e. the number of emotion classes.
Further, step S2.3 includes computing the validation-set accuracy

$$A_T=\frac{1}{N_v}\sum_{i=1}^{N_v}\mathbb{1}\big(\hat{y}_i=y_i\big)$$

where $N_v$ is the total number of validation samples, $\hat{y}_i$ is the class predicted by the student model for the i-th validation sample, and $A_T$ is the model accuracy.
Further, step S3 includes:
performing emotion classification on the program text with the edge-side emotion classification small model, where the emotion classes of the program text are urgent, pleasant, peaceful and sad, and taking the class with the highest probability among the four as the emotion factor of the broadcast program text;
and transmitting the obtained emotion factors and the program text to a text-to-speech conversion engine, performing speech conversion on the program text, generating an audio program with the emotion factors, and broadcasting the audio program.
The emotion classification broadcasting system and method based on knowledge distillation provided by the invention have the following beneficial effects:
according to the invention, the emotion classification method based on knowledge distillation is adopted to carry out emotion analysis on the program text at the edge side, and emotion factors obtained by analysis are input into the text-to-speech conversion engine to carry out audio program conversion, so that the broadcast audio program has emotion factors and has more infectivity. Meanwhile, as the transmitted program data is changed from audio frequency to text, the transmission data volume is greatly reduced, the program transmission time is shorter, and the emergency broadcasting capability of the system is stronger.
The invention not only gives the broadcasting system emotionally colored program playback, it also changes the program type delivered by the cloud to text: the compressed data is only a few KB, whereas the files transmitted by existing broadcasting systems are audio files of hundreds of KB or more after encoding, so the amount of data transmitted per program is reduced by a factor of a hundred or more and the transmission time is greatly shortened. Compared with existing broadcasting systems, the invention greatly reduces the transmitted data volume and transmission time and strengthens the emergency broadcasting capability of the system; moreover, because the emotion inference over the whole text is performed on the edge-side intelligent terminal, the computational load on the broadcasting system's cloud is greatly reduced.
The core idea of the invention is to determine the value of the temperature parameter T adaptively while distilling the small model. Specifically, when the performance on the validation set does not improve, the value of the temperature parameter is increased to widen the model's search space, giving a greater chance of finding a better model. When the performance on the validation set improves, the current temperature parameter T is saved and the value of T is reduced, so that the model focuses on the knowledge of the original pre-trained model. By iterating the training and continuously adjusting the value of T, the algorithm adaptively determines the optimal temperature value and thereby improves the performance of the student model.
Drawings
Fig. 1 is a system block diagram of an emotion classification broadcast system based on knowledge distillation.
FIG. 2 is a flow chart of adaptive temperature scaling knowledge distillation.
Detailed Description
The following description of the embodiments is provided to help those skilled in the art understand the invention, but it should be understood that the invention is not limited to the scope of these embodiments; for those skilled in the art, all inventions that make use of the inventive concept fall within the protection scope of the present invention as defined by the appended claims.
Example 1
This embodiment provides an emotion classification broadcasting system based on knowledge distillation, aimed at the problems of existing methods such as broadcast programs lacking emotional color and program transmission taking too long. Emotion classification based on knowledge distillation is performed on the program text at the edge side, and the emotion factor obtained from the analysis is fed into the text-to-speech conversion engine for audio program conversion, so that the broadcast audio program carries an emotion factor and is more expressive. At the same time, because the transmitted program data changes from audio to text, the amount of transmitted data is greatly reduced, program transmission takes less time, and the emergency broadcasting capability of the system is stronger. Referring to fig. 1, the system specifically comprises:
program broadcasting control service subsystem
This subsystem aggregates program content such as emergency message texts, news and information websites and emergency audio, stores the content in a broadcasting program library, and delivers broadcast content to the intelligent broadcasting terminal according to broadcasting requirements;
specifically, the subsystem collects contents according to set rules through a news collector and a web crawler program on websites such as emergency broadcasting, geological disaster information, emergency release of other three parties, appointed news information, government notices and the like, cleans the collected contents only by data and stores the collected program data in a broadcasting program library. The broadcasting program library is stored in a mode of separating data from files, and the functions of automatic classification, automatic labeling and automatic cleaning of expired data are provided. The broadcasting service distributes the program of the broadcasting program library to CDN. Compared with the prior art, the broadcasting system of the invention changes the form of issuing programs from audio to text;
AI service subsystem
This subsystem performs knowledge distillation on the emotion classification BERT (Bidirectional Encoder Representations from Transformers) pre-training large model, trains an edge-side emotion classification small model that can run on the intelligent broadcasting terminal, and stores and delivers the small model as required;
For the broadcasting system to have emotional broadcasting capability, AI must be used to classify the emotion of the broadcast text content. In this system, emotion classification assigns each program to one of the categories "urgent", "pleasant", "peaceful" and "sad". Given the large number of broadcasting terminals, performing emotion classification in the cloud would require enormous computing power and lead to excessive inference latency, so the invention runs the emotion classification inference task on the intelligent broadcasting terminals. Because the intelligent broadcasting terminal has limited computing power, the emotion classification BERT pre-training large model (comprising multiple encoder layers) cannot be deployed on it directly; only after compression and pruning can the model run efficiently at the edge;
intelligent broadcasting terminal subsystem
This subsystem uses the received emotion classification small model to classify the emotion of the broadcast text, passes the program text and the emotion factor to the text-to-speech conversion engine, generates an audio program carrying the emotion factor, and broadcasts it;
Specifically, the intelligent broadcasting terminal receives data over multiple channels, including 4G/5G, WiFi and Bluetooth. A data parsing module parses the received data to obtain the broadcast program text and the emotion classification BERT small model. An AI emotion analysis module loads the emotion classification BERT small model with an AI framework such as PyTorch, performs emotion classification inference on the broadcast program text, and then takes the category with the highest probability among "urgent", "pleasant", "peaceful" and "sad" as the emotion factor of the broadcast program text. Finally, the text-to-speech engine (TTS) converts the program text to speech according to the input emotion factor and generates an emotion-colored audio program for broadcasting.
Example 2
This embodiment provides a broadcasting method for the emotion classification broadcasting system based on knowledge distillation. It uses an adaptive temperature-scaling knowledge distillation method to distill the large BERT model into a corresponding BERT small model (with only 3 encoder layers). The computation and storage cost of the model is greatly reduced while the performance and generalization ability of the large model are largely preserved, so that the small model can perform efficient emotion classification inference on the intelligent broadcasting terminal. The adaptive temperature-scaling knowledge distillation is as follows:
the goal of knowledge distillation is to migrate knowledge in a large BERT emotion classification large model teacher network T into a small student model S that will be trained to mimic the behavior of the teacher network; f (f) T And f S Representing the behavioural functions of the teacher network and the student network, respectively, the goal of the behavioural functions is to convert the input of the network into a corresponding information encoded representation, knowledge distillation can be modeled as a minimization process of the following objective functions, namely:
L S,T =∑ x∈E Loss( T (x), S (x))
the loss (·) is a loss function for measuring the difference between a teacher network and a student network, x is an input sample, and E is a sample set; the focus of knowledge distillation is to select and construct a loss function with which to correlate an effective behavioral function.
In this method, the student model is a BERT model with the same structure as the teacher model but far fewer layers. The output of the prediction layer and the attention weights are chosen as the corresponding behavior functions, and cross entropy is chosen as the basic loss function. For the dataset, this patent builds on the broadcasting program library and divides emotion into four classes: "urgent", "pleasant", "peaceful" and "sad". The whole dataset comprises a text content training set of 10000 samples, a validation set of 5000 samples and a test set of 5000 samples.
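As an illustration of the teacher/student setup just described, the sketch below builds a 3-encoder-layer student with the same architecture as the teacher and four output classes using Hugging Face transformers. The checkpoint name "bert-base-chinese" and the warm-starting of the student from the teacher's lower layers are assumptions; the patent only specifies that the student shares the teacher's structure with far fewer layers.

```python
# Sketch of constructing the teacher and a 3-layer student (checkpoint name is an assumption).
from transformers import BertConfig, BertForSequenceClassification

NUM_LABELS = 4  # "urgent", "pleasant", "peaceful", "sad"

teacher = BertForSequenceClassification.from_pretrained(
    "bert-base-chinese", num_labels=NUM_LABELS)

student_cfg = BertConfig.from_pretrained(
    "bert-base-chinese", num_hidden_layers=3, num_labels=NUM_LABELS)  # 3 encoder layers
student = BertForSequenceClassification(student_cfg)

# Optionally warm-start the student from the teacher's embeddings and lower encoder layers
# (an assumption; the patent does not prescribe an initialization scheme).
student.bert.embeddings.load_state_dict(teacher.bert.embeddings.state_dict())
for i in range(3):
    student.bert.encoder.layer[i].load_state_dict(
        teacher.bert.encoder.layer[i].state_dict())
```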
The adaptive temperature scaling method is designed to optimize the whole distillation process and improve the generalization capability of the student model. Specifically, when performance on the validation set does not improve, the value of the temperature parameter is increased to widen the model's search space, giving a greater chance of finding a better model. When performance on the validation set improves, the current temperature parameter T is saved and the value of T is reduced, so that the model focuses on the knowledge of the original pre-trained model. By iterating the training and continuously adjusting the value of T, the algorithm adaptively determines the optimal temperature value and thereby improves the performance of the student model. Referring to fig. 2 for the distillation procedure, the broadcasting method specifically comprises the following steps:
step S1, content aggregation is carried out on the program text content, the content is stored in a broadcasting program library, and the program text content is sent to an intelligent broadcasting terminal subsystem according to broadcasting requirements;
specifically, the news collector and the web crawler collect content of emergency broadcast, geological disaster information, other three-party emergency delivery, appointed news information or government notices, clean the collected content data, store the collected program data in a broadcasting program library, store the broadcasting program library in a mode of separating data from files, and carry out CDN delivery on the programs of the broadcasting program library based on broadcasting service.
S2, carrying out knowledge distillation on the emotion classification BERT pre-training large model, training to obtain an edge side emotion classification small model, and storing and transmitting the edge side emotion classification small model to an intelligent broadcast terminal subsystem, wherein the method specifically comprises the following steps of:
step S2.1, initializing a temperature parameter T, wherein the temperature parameter T is a smaller value;
Step S2.2, training the student model with the current temperature parameter T and the loss function;
introducing a temperature parameter T and performing temperature scaling with the softmax function; the scaled student model outputs the probability distribution

$$q_{i,k}=\frac{\exp\big(f_S(x_i;\theta)_k/T\big)}{\sum_{j=1}^{K}\exp\big(f_S(x_i;\theta)_j/T\big)},\quad k=1,\dots,K$$

where $f_S(x_i;\theta)$ is the student model's output for the input text content sample $x_i$ and $\theta$ denotes the parameters of the student model;

training the temperature-scaled student model with the loss function $L_{S,T}$:

$$L_{S,T}=-\frac{1}{N}\sum_{i=1}^{N}\sum_{k=1}^{K}y_{i,k}\log q_{i,k}$$

where $y_i$ is the true label of the i-th text content training sample (one-hot over the K classes); N is the total number of text content training samples (10000); K is the number of label categories (4), i.e. the number of emotion classes "urgent", "pleasant", "peaceful", "sad";
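For illustration, the PyTorch sketch below implements the temperature-scaled softmax and a distillation loss. The hard-label cross-entropy on the temperature-scaled student output follows the formulas above as reconstructed; the additional soft-target term that matches the teacher's temperature-scaled distribution, and its weight soft_weight, follow standard knowledge-distillation practice and are assumptions, since this step only defines y_i, N and K explicitly.

```python
# Temperature-scaled distillation loss sketch (soft-target term and its weight are assumptions).
import torch
import torch.nn.functional as F


def distillation_loss(student_logits: torch.Tensor,
                      teacher_logits: torch.Tensor,
                      labels: torch.Tensor,
                      T: float,
                      soft_weight: float = 0.5) -> torch.Tensor:
    # q_i: temperature-scaled student distribution (kept in log form for stability).
    log_q = F.log_softmax(student_logits / T, dim=-1)

    # Hard-label term: -1/N * sum_i sum_k y_ik * log q_ik
    hard = F.nll_loss(log_q, labels)

    # Soft-target term (assumed): match the teacher's temperature-scaled distribution,
    # with the usual T^2 factor to keep gradient magnitudes comparable.
    p = F.softmax(teacher_logits / T, dim=-1)
    soft = F.kl_div(log_q, p, reduction="batchmean") * (T ** 2)

    return (1.0 - soft_weight) * hard + soft_weight * soft
```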
Step S2.3, evaluating the performance of the student model trained under the temperature parameter T using the validation-set accuracy

$$A_T=\frac{1}{N_v}\sum_{i=1}^{N_v}\mathbb{1}\big(\hat{y}_i=y_i\big)$$

where $N_v$ is the total number of validation samples (5000), $\hat{y}_i$ is the class predicted by the student model for the i-th validation sample, and $A_T$ is the model accuracy;
step S2.4, if the performance on the verification set does not meet the preset condition, increasing the value of the temperature parameter T, and returning to the step S2.2 to continue training; if the performance on the verification set meets the preset condition, the current temperature parameter T is saved, the value of the temperature parameter T is reduced, and then the step S2.2 is returned to continue training;
and stopping training until the value of the temperature parameter T is smaller than the threshold value or the training round number is larger than the maximum training round number.
As shown in FIG. 2, $A_T(l)$ is the model accuracy in training round $l$, $l_{max}$ is the maximum number of training rounds, $T_{min}$ is the set minimum temperature value, and α and β are constants.
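A hedged Python sketch of the adaptive temperature-scaling loop of FIG. 2 is given below. The initial value of T, the constants alpha and beta, T_min, the maximum number of rounds, the interpretation of the "preset condition" (accuracy must exceed the best value seen so far), and the train_one_round and validate helpers are assumptions; the patent fixes the control flow but not these concrete values.

```python
# Adaptive temperature-scaling loop sketch (default values and helper callables are assumptions).
def adaptive_temperature_distillation(train_one_round, validate,
                                      T: float = 2.0, alpha: float = 1.5, beta: float = 0.7,
                                      T_min: float = 0.5, max_rounds: int = 50) -> float:
    """train_one_round(T) trains the student for one round at temperature T;
    validate() returns the validation-set accuracy A_T.  Returns the best temperature found."""
    best_acc, best_T = -1.0, T
    for round_idx in range(1, max_rounds + 1):
        train_one_round(T)            # S2.2: train with the current T and the loss above
        acc = validate()              # S2.3: accuracy on the validation set
        if acc <= best_acc:           # S2.4: no improvement -> widen the search space
            T = T * alpha
        else:                         # improvement -> save T, then sharpen
            best_acc, best_T = acc, T
            T = T * beta
        if T < T_min:                 # stop once T drops below the threshold (or rounds run out)
            break
    return best_T
```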
S3, performing emotion classification on the program text with the edge-side emotion classification small model, where the emotion classes of the program text are urgent, pleasant, peaceful and sad, and taking the class with the highest probability among the four as the emotion factor of the broadcast program text;
and transmitting the obtained emotion factors and the program text to a text-to-speech conversion engine, performing speech conversion on the program text, generating an audio program with the emotion factors, and broadcasting the audio program.
Although specific embodiments of the invention have been described in detail with reference to the accompanying drawings, they should not be construed as limiting the protection scope of this patent. Various modifications and variations that can be made by those skilled in the art without creative effort remain within the scope of this patent as described in the claims.

Claims (7)

1. An emotion classification broadcast system based on knowledge distillation, comprising:
the program broadcasting control service subsystem, which is used for aggregating program text content, storing it in a broadcasting program library, and sending the program text content to the intelligent broadcasting terminal subsystem according to broadcasting requirements;
the AI service subsystem, which is used for performing knowledge distillation on the emotion classification BERT pre-training large model, training an edge-side emotion classification small model, and storing and delivering the small model to the intelligent broadcasting terminal subsystem;
and the intelligent broadcasting terminal subsystem, which is used for receiving the edge-side emotion classification small model and the program text, performing emotion classification on the program text, passing the emotion factor obtained by classification together with the program text to the text-to-speech conversion engine, generating an audio program carrying the emotion factor, and broadcasting it.
2. A broadcasting method using the knowledge distillation-based emotion classification broadcasting system of claim 1, comprising the steps of:
s1, content aggregation is carried out on the text content of the program, the content is stored in a broadcasting program library, and the text content of the program is sent to an intelligent broadcasting terminal subsystem according to broadcasting requirements;
s2, carrying out knowledge distillation on the emotion classification BERT pre-training large model, training to obtain an edge side emotion classification small model, and storing and transmitting the edge side emotion classification small model to the intelligent broadcast terminal subsystem;
s3, carrying out emotion classification on the program text by adopting an edge side emotion classification small model, transmitting emotion factors and the program text obtained by classification to a text-to-speech conversion engine, generating an audio program with emotion factors, and broadcasting.
3. The broadcasting method of knowledge distillation based emotion classification broadcasting system according to claim 2, wherein said step S1 comprises:
content is collected by a news collector and a web crawler from emergency broadcast sources, geological disaster information, third-party emergency releases, designated news information or government notices; the collected content is cleaned, and the aggregated program data is stored in a broadcasting program library; the library stores data and files separately, and the broadcasting service publishes its programs to a CDN.
4. The broadcasting method of the emotion classification broadcasting system based on knowledge distillation as set forth in claim 3, wherein in the step S2, knowledge distillation is performed on the emotion classification BERT pre-training large model, and training is performed to obtain an edge side emotion classification small model, which includes:
S2.1, initializing a temperature parameter T;
S2.2, training the student model with the current temperature parameter T and a loss function;
S2.3, evaluating the performance of the student model trained under the temperature parameter T using the validation-set accuracy;
S2.4, if the performance on the validation set does not meet the preset condition, increasing the value of the temperature parameter T and returning to S2.2 to continue training; if the performance on the validation set meets the preset condition, saving the current temperature parameter T, reducing the value of T, and then returning to S2.2 to continue training;
and stopping training when the value of the temperature parameter T falls below the threshold or the number of training rounds exceeds the maximum number of training rounds.
5. The broadcasting method of knowledge distillation based emotion classification broadcasting system according to claim 4, wherein said step S2.2 comprises:
introducing a temperature parameter T and performing temperature scaling with the softmax function; the scaled student model outputs the probability distribution

$$q_{i,k}=\frac{\exp\big(f_S(x_i;\theta)_k/T\big)}{\sum_{j=1}^{K}\exp\big(f_S(x_i;\theta)_j/T\big)},\quad k=1,\dots,K$$

where $f_S(x_i;\theta)$ is the student model's output for the input text content sample $x_i$ and $\theta$ denotes the parameters of the student model;

training the temperature-scaled student model with the loss function $L_{S,T}$:

$$L_{S,T}=-\frac{1}{N}\sum_{i=1}^{N}\sum_{k=1}^{K}y_{i,k}\log q_{i,k}$$

where $y_i$ is the true label of the i-th text content training sample; N is the total number of text content training samples; K is the number of label categories, i.e. the number of emotion classes.
6. The broadcasting method of the knowledge distillation based emotion classification broadcasting system according to claim 5, wherein said step S2.3 comprises computing the validation-set accuracy

$$A_T=\frac{1}{N_v}\sum_{i=1}^{N_v}\mathbb{1}\big(\hat{y}_i=y_i\big)$$

where $N_v$ is the total number of validation samples, $\hat{y}_i$ is the class predicted by the student model for the i-th validation sample, and $A_T$ is the model accuracy.
7. The broadcasting method of knowledge distillation based emotion classification broadcasting system of claim 6, wherein said step S3 comprises:
performing emotion classification on the program text with the edge-side emotion classification small model, where the emotion classes of the program text are urgent, pleasant, peaceful and sad, and taking the class with the highest probability among the four as the emotion factor of the broadcast program text;
and transmitting the obtained emotion factors and the program text to a text-to-speech conversion engine, performing speech conversion on the program text, generating an audio program with the emotion factors, and broadcasting the audio program.
CN202310828957.4A 2023-07-06 2023-07-06 Emotion classification broadcasting system and method based on knowledge distillation Active CN116865887B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310828957.4A CN116865887B (en) 2023-07-06 2023-07-06 Emotion classification broadcasting system and method based on knowledge distillation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310828957.4A CN116865887B (en) 2023-07-06 2023-07-06 Emotion classification broadcasting system and method based on knowledge distillation

Publications (2)

Publication Number Publication Date
CN116865887A true CN116865887A (en) 2023-10-10
CN116865887B CN116865887B (en) 2024-03-01

Family

ID=88235340

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310828957.4A Active CN116865887B (en) 2023-07-06 2023-07-06 Emotion classification broadcasting system and method based on knowledge distillation

Country Status (1)

Country Link
CN (1) CN116865887B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109347586A (en) * 2018-11-24 2019-02-15 合肥龙泊信息科技有限公司 There is the emergency broadcase system of terminal broadcast speech monitoring function when a kind of teletext
CN111767740A (en) * 2020-06-23 2020-10-13 北京字节跳动网络技术有限公司 Sound effect adding method and device, storage medium and electronic equipment
CN114241282A (en) * 2021-11-04 2022-03-25 河南工业大学 Knowledge distillation-based edge equipment scene identification method and device
CN114863226A (en) * 2022-04-26 2022-08-05 江西理工大学 Network physical system intrusion detection method
CN116260642A (en) * 2023-02-27 2023-06-13 南京邮电大学 Knowledge distillation space-time neural network-based lightweight Internet of things malicious traffic identification method


Also Published As

Publication number Publication date
CN116865887B (en) 2024-03-01

Similar Documents

Publication Publication Date Title
Gündüz et al. Beyond transmitting bits: Context, semantics, and task-oriented communications
EP3255633B1 (en) Audio content recognition method and device
US20220207352A1 (en) Methods and systems for generating recommendations for counterfactual explanations of computer alerts that are automatically detected by a machine learning algorithm
CN111489765A (en) Telephone traffic service quality inspection method based on intelligent voice technology
Lu et al. Semantics-empowered communications: A tutorial-cum-survey
CN113988086A (en) Conversation processing method and device
Qin et al. Bert-erc: Fine-tuning bert is enough for emotion recognition in conversation
CN116865887B (en) Emotion classification broadcasting system and method based on knowledge distillation
US11687576B1 (en) Summarizing content of live media programs
CN114528434A (en) IPTV live channel fusion recommendation method based on self-attention mechanism
CN116737936B (en) AI virtual personage language library classification management system based on artificial intelligence
CN112036122B (en) Text recognition method, electronic device and computer readable medium
CN117713377A (en) Intelligent voice joint debugging system of dispatching automation master station
CN117150338A (en) Task processing, automatic question and answer and multimedia data identification model training method
CN113849641B (en) Knowledge distillation method and system for cross-domain hierarchical relationship
CN114743540A (en) Speech recognition method, system, electronic device and storage medium
CN114373443A (en) Speech synthesis method and apparatus, computing device, storage medium, and program product
CN114842857A (en) Voice processing method, device, system, equipment and storage medium
CN114328867A (en) Intelligent interruption method and device in man-machine conversation
CN114596854A (en) Voice processing method and system based on full-duplex communication protocol and computer equipment
CN114065742B (en) Text detection method and device
CN116127074B (en) Anchor image classification method based on LDA theme model and kmeans clustering algorithm
CN114567811B (en) Multi-modal model training method, system and related equipment for voice sequencing
Di Principles of AIGC technology and its application in new media micro-video creation
CN114283794A (en) Noise filtering method, noise filtering device, electronic equipment and computer readable storage medium

Legal Events

Code Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant