CN112562669A - Intelligent digital newspaper automatic summarization and voice interaction news chat method and system - Google Patents
- Publication number
- CN112562669A (application CN202011389092.9A)
- Authority
- CN
- China
- Prior art keywords
- voice
- text
- news
- word
- intelligent
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/332—Query formulation
- G06F16/3329—Natural language query formulation or dialogue systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/34—Browsing; Visualisation therefor
- G06F16/345—Summarisation for human users
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/205—Parsing
- G06F40/211—Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/205—Parsing
- G06F40/216—Parsing using statistical methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/049—Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computing arrangements using knowledge-based models
- G06N5/04—Inference or reasoning models
- G06N5/041—Abduction
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L13/00—Speech synthesis; Text to speech systems
- G10L13/02—Methods for producing synthetic speech; Speech synthesisers
- G10L13/04—Details of speech synthesis systems, e.g. synthesiser structure or memory management
- G10L13/047—Architecture of speech synthesisers
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L13/00—Speech synthesis; Text to speech systems
- G10L13/08—Text analysis or generation of parameters for speech synthesis out of text, e.g. grapheme to phoneme translation, prosody generation or stress or intonation determination
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/063—Training
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/16—Speech classification or search using artificial neural networks
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling
- G10L15/1822—Parsing for meaning understanding
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/27—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the analysis technique
- G10L25/30—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the analysis technique using neural networks
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/48—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
- G10L25/51—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
- G10L25/54—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination for retrieval
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/063—Training
- G10L2015/0631—Creating reference templates; Clustering
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/063—Training
- G10L2015/0638—Interactive procedures
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
- G10L2015/223—Execution procedure of a spoken command
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- Audiology, Speech & Language Pathology (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Human Computer Interaction (AREA)
- Multimedia (AREA)
- Acoustics & Sound (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- General Health & Medical Sciences (AREA)
- Mathematical Physics (AREA)
- Computing Systems (AREA)
- Life Sciences & Earth Sciences (AREA)
- Software Systems (AREA)
- Biomedical Technology (AREA)
- Molecular Biology (AREA)
- Biophysics (AREA)
- Databases & Information Systems (AREA)
- Signal Processing (AREA)
- Evolutionary Biology (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Bioinformatics & Computational Biology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Probability & Statistics with Applications (AREA)
- Machine Translation (AREA)
Abstract
The invention discloses a method and a system for automatic summarization of intelligent digital newspapers and voice-interactive news chat. The method builds a neural network with self-attention layers to generate an intelligent abstract of a text, and the system integrates algorithm, scheduling, wake-up, search, speech recognition, speech synthesis and news-chat technologies, relating in particular to intelligent summarization and news chat. It refines summary texts from massive news intelligently and efficiently; the summaries are concise and easy to read, and news recommendation, topic recognition and topic tracking are personalized. With real-time sound pickup and ultrasonic filtering, and with an attention neural network that overcomes the high dimensionality, data sparseness and lack of semantics of traditional text representation, the system delivers in its voice output an intelligent human-computer interaction experience of the 'carefully selected, can listen, can speak, understands you and answers' type.
Description
Technical Field
The invention belongs to the fields of automatic generation of intelligent news abstracts, natural language processing and voice-interactive news chat, and particularly relates to a method for automatically generating intelligent news abstracts based on semantic correlation, a voice interaction processing method, and a corresponding system.
Background
Newspapers are among the oldest media of mass communication, and over the centuries they have made an outstanding contribution to the progress of human civilization. News is an important channel through which readers obtain social information, and in the big-data era, extracting the information one actually needs from a vast ocean of data has become essential. An intelligent news abstract is a passage or sentence distilled and summarized from a source digital news text; it fully reflects the theme of the text while remaining concise and easy to read. Intelligent news summarization is a technology that uses computer science and natural language processing to automatically generate a text abstract by extracting from the original digital newspaper text and then re-presenting the abstract to the reader in the form the user requires.
Intelligent news summarization takes news-text word vectors as input and learns and extracts news text features automatically through an attention neural network. This overcomes the drawbacks of traditional news text classification, in which manual feature engineering is time-consuming, labor-intensive and prone to error accumulation; it effectively improves the efficiency of news text classification and promotes more effective organization and management of information in the news field.
Existing voice interaction relies on questions and answers preset in a knowledge base: after the system captures a reader's voice, it transcribes the audio into text with a corresponding algorithm, sends the text to a back-end system, looks up the matching answer in the knowledge base, and returns it to the reader. In practice, because a single technique is used (keywords, regular expressions, or one deep learning model), the computed result is often inaccurate, producing wrong answers and a poor effect. The preset answers are usually single results, so responses to readers' multi-question, multi-intent utterances are poor; the preset question-answer pairs are rigid; and the designed flow has no mechanism for polling context or associating historical data, so conversations with readers feel mechanical and disfluent, and the experience suffers. Moreover, the same speech synthesis is applied to different content, or to different stages of the same content, so responses cannot be personalized and the actual effect is diminished.
In the new media environment, reading digital newspapers and periodicals raises new requirements: giving them an intelligent human-computer interaction experience of the 'can listen, can speak, understands you and answers' type, in which human speech is understood through speech recognition technology and artificial intelligence. Speech recognition is a technology that lets a machine convert speech signals into corresponding text or commands through recognition and understanding. It is a broad interdisciplinary field, closely related to acoustics, phonetics, linguistics, information theory, pattern recognition theory and neurobiology.
First-generation digital newspaper systems present the full appearance of the entire newspaper; apart from the tactile feel of paper, the reading experience closely matches that of a traditional newspaper and gives the digital edition an original-flavor reading effect. A newspaper medium with strong content should recognize in time the revolutionary influence of intelligent voice interaction on information dissemination. From content distribution for voice-interaction hardware to technology-enabled content production, intelligent voice interaction and the media naturally admit multi-level cooperation, and the technology has broad application prospects.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and provides a newspaper that can listen, speak, understand and answer. Text scanning is used to acquire text features, and a multilingual original text is converted into a sequence of word vectors. An attention-oriented neural network architecture is adopted, rule-based weights are set for titles, opening sentences and closing sentences, and a training set is generated by a text summarization method combining pre-training with fine-tuning, parallel computing and semantics;
the deep learning method includes voice interaction and learns a dense doc2vec word-vector representation of the news text, thereby solving the high dimensionality, data sparseness, lack of semantics and sentence-breaking problems of traditional text representation. It has the advantage of real-time sound collection, so that external valid speech can be accurately captured even while voice is being output, making the voice interaction process more intelligent and delivering an intelligent human-computer interaction experience.
The invention achieves the aim through the following technical scheme:
The news chat method and system for intelligent digital newspaper automatic summarization and voice interaction comprises the following steps, where step S1 includes:
S11, according to set rules: a news title strongly summarizes the news text, and the titles of major news often directly reflect its central idea, so sentence weights are computed with reference to title similarity; combining similarity to the title yields better results, and each sentence's weight is adjusted by its similarity to the title;
in sentence weighting, the title comes first, followed by emphasis on the first sentence and first paragraph. In news reports, however, the first sentence may be a genre formula that carries no news content, such as 'our reporter reports', 'XX net, Month X Day X', or 'XX News Agency, Beijing, Month X Day X dispatch', so such genre-formula first sentences are filtered out first. Question and exclamation sentence patterns are not considered during intelligent news summarization. Sentence similarity is then computed and redundant sentences are filtered;
in Chinese text, a sentence has three kinds of characteristics: word features, semantic features and syntactic features. During sentence similarity calculation, the three kinds of features are considered together and combined with organic weighting so that they complement one another;
a Chinese sentence can be divided into a core part and a modifying part. The core part is the subject-predicate-object structure, which is crucial to the sentence's semantics; the modifying part consists of secondary structures, usually attributives, adverbials and complements. Since the subject and object in a subject-predicate-object structure are typically nouns or pronouns, when computing sentence similarity the parts of speech of the words appearing in the sentence should be tagged, keywords retained, and non-keywords filtered out;
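As an illustration of the S11 weighting rules, the sketch below scores sentences by similarity to the title, boosts the lead sentence, and filters genre boilerplate and question/exclamation sentences. The whitespace tokenizer, weight constants and boilerplate prefixes are illustrative placeholders, not parameters taken from the patent.

```python
def jaccard(a, b):
    """Word-overlap (Jaccard) similarity between two token lists."""
    sa, sb = set(a), set(b)
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0

# Hypothetical news-genre lead-ins ("our reporter reports", wire datelines).
BOILERPLATE_PREFIXES = ("our reporter reports", "xx news agency")

def score_sentences(title, sentences, title_weight=2.0, lead_bonus=1.5):
    """Weight each sentence by title similarity, boosting the first sentence
    and zeroing out boilerplate openers and question/exclamation sentences."""
    title_toks = title.lower().split()
    scores = []
    for i, s in enumerate(sentences):
        low = s.lower()
        if low.startswith(BOILERPLATE_PREFIXES) or s.rstrip().endswith(("?", "!")):
            scores.append(0.0)          # filtered per the rules above
            continue
        w = title_weight * jaccard(low.split(), title_toks)
        if i == 0:
            w *= lead_bonus             # the opening sentence is emphasized
        scores.append(w)
    return scores
```

In use, the top-scoring sentences would be retained for the abstract and near-duplicates among them dropped by a second similarity pass.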
S12: a word segmentation strategy engine with a built-in sequence labeling model and a deep learning algorithm performs word segmentation. The knowledge model analysis scores candidates based on string matching; the deep learning model analysis scores them based on the K-means and LDA algorithms, iterative decision trees, and TextCNN and TextRNN attention models; the similarity auxiliary model analysis scores them based on word distance, covariance, word vector and stability calculations. The final score comprises the scores of any one or more of the knowledge model, the deep learning model and the similarity auxiliary model;
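Blending the scores of the several analyzers can be sketched as a weighted average; the scorer callables in the usage example are toy stand-ins for the knowledge-model, deep-learning-model and similarity-auxiliary analyses named above.

```python
def combined_score(candidate, scorers, weights=None):
    """Blend scores from several analyzers into one ranking score.
    `scorers` maps an analyzer name to a callable returning a score in [0, 1];
    `weights` optionally assigns each analyzer a relative importance."""
    weights = weights or {name: 1.0 for name in scorers}
    total = sum(weights.values())
    return sum(weights[name] * fn(candidate) for name, fn in scorers.items()) / total

# Toy analyzers standing in for the real models:
scorers = {
    "knowledge":  lambda s: 1.0 if "news" in s else 0.0,   # string matching
    "similarity": lambda s: min(len(s) / 10.0, 1.0),       # similarity auxiliary
}
```

Any subset of analyzers can be plugged in, matching the claim that the score may combine one or more of the three model families.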
S13, scheduling for complex answers such as multi-question, multi-intent utterances: the text is preliminarily preprocessed before being sent to the question calculation model; a multi-intent question is split into several parts, realizing integrated scheduling of multiple algorithms; the parts are then sent to the question calculation model to obtain multiple answers; the generated abstract expresses the core meaning of the original text; and the answer results are integrated and fed back to the reader;
the doc2vec calculation model involves two steps: a training step, which learns the word vectors, the softmax parameters and the paragraph/sentence vectors from known training data;
and an inference step, which obtains the vector representation of a new paragraph. Specifically, new columns are added to the paragraph matrix; with the word vectors and softmax parameters held fixed, training is run again and gradient descent yields the new column, that is, the vector representation of the new paragraph;
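The inference step can be sketched in pure Python in a PV-DBOW-like flavor, predicting the paragraph's words from the paragraph vector alone; real implementations (e.g. gensim's Doc2Vec) also use the context window and negative sampling or hierarchical softmax, so this is a didactic approximation, not the patent's exact procedure.

```python
import math
import random

def infer_paragraph_vector(word_ids, U, dim, lr=0.1, epochs=50, seed=0):
    """doc2vec inference stage: the trained softmax matrix U (one row per
    vocabulary word) stays fixed; only the new paragraph's vector d is
    fitted by gradient descent so that it predicts the paragraph's words."""
    rng = random.Random(seed)
    d = [rng.uniform(-0.5, 0.5) for _ in range(dim)]
    for _ in range(epochs):
        for t in word_ids:
            scores = [sum(u * x for u, x in zip(row, d)) for row in U]
            m = max(scores)
            exps = [math.exp(s - m) for s in scores]
            Z = sum(exps)
            probs = [e / Z for e in exps]
            # cross-entropy gradient wrt d: sum_j (p_j - 1[j == t]) * U_j
            for k in range(dim):
                grad = sum((probs[j] - (1.0 if j == t else 0.0)) * U[j][k]
                           for j in range(len(U)))
                d[k] -= lr * grad
    return d
```

After fitting, the paragraph vector drifts toward the softmax rows of the words the paragraph actually contains, weighted by how often they occur.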
the calculation model realizes integrated scheduling of multiple algorithms and addresses the shortcomings of traditional algorithms on short-text classification from three directions. First, the structure of the single-layer neural network is improved: among structural units of recurrent networks such as the LSTM (Long Short-Term Memory) and the GRU (Gated Recurrent Unit), structural units suited to short-text classification are identified, and the output of the recurrent structure is improved. Traditional methods take only the last output as the semantic representation of the short text; the invention adopts the idea of the attention neural network to fuse the forward and backward outputs of the recurrent network, obtaining a better short-text representation. Second, the inputs and intermediate parameters of the neural network are optimized: the input variables and the network structure are pre-trained with word vectors and an autoencoder respectively, and comparative tests show that pre-training helps the network's parameters converge, yielding a better classification effect. Finally, an improved multi-layer neural network fusion method is introduced for short-text classification. A traditional deep network simply feeds the output of one layer into the next, stacking layer by layer; borrowing the gating idea of the LSTM, the connections between the layers of the multi-layer recurrent network are improved, further optimizing the semantic representation of the short text. Experimental results show that the classification effect of the improved multi-layer network is superior to that of the single-layer network, which greatly facilitates the management and storage of information. Speech synthesis, natural language understanding and the integrated scheduling of the algorithms are realized for the text, and an abstract training set is generated;
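The attention-based fusion of the recurrent network's forward and backward outputs can be sketched as attention pooling over the per-timestep states; using the mean state as the attention query here is a simplification of the learned query vector such a network would normally have.

```python
import math

def attention_pool(states):
    """Fuse a sequence of hidden states into one representation by
    self-attention pooling, instead of keeping only the last state.
    Each state's score is its dot product with the mean state."""
    dim = len(states[0])
    mean = [sum(s[k] for s in states) / len(states) for k in range(dim)]
    scores = [sum(s[k] * mean[k] for k in range(dim)) for s in states]
    m = max(scores)
    exps = [math.exp(x - m) for x in scores]
    Z = sum(exps)
    weights = [e / Z for e in exps]
    return [sum(w * s[k] for w, s in zip(weights, states)) for k in range(dim)]

def fuse_bidirectional(forward, backward):
    """Concatenate forward and backward states per timestep, then pool."""
    concat = [f + b for f, b in zip(forward, backward)]
    return attention_pool(concat)
```

States that align with the overall direction of the sequence receive larger softmax weights, so salient timesteps dominate the pooled representation rather than whichever state happens to come last.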
S3, intelligent voice interaction wake mode (wake word: 'xiaoxin'). The intelligent voice interaction comprises voice acquisition, voice preprocessing and voice processing: voice acquisition captures sound information in real time, and voice preprocessing, connected to voice acquisition, receives the sound information and applies ultrasonic filtering to it, yielding intelligent voice of the 'can listen, can speak, understands you and answers' type;
the intelligent voice-interactive news chat has two states, a wake mode and a non-wake mode, and comprises:
voice acquisition, which captures sound information in real time;
voice preprocessing, connected to the voice acquisition, which receives the sound information and applies ultrasonic filtering to obtain the target voice; in non-wake mode it judges whether the target voice is the set wake word, entering wake mode if so and remaining in non-wake mode otherwise;
voice recognition, which in wake mode recognizes all target voices to obtain the target content;
search, connected to the voice recognition and to a store of pre-stored answer sentences, which in wake mode retrieves the response content from the store according to the target content;
output, connected to the search, which in wake mode obtains the response content and outputs it;
the intelligent voice-interactive news chat returns to non-wake mode when no content has been output and no sound information has been acquired within a set time.
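The wake/non-wake behavior above amounts to a small state machine, sketched below; the wake word, timeout value and dictionary-based answer lookup are illustrative placeholders for the recognition and search components.

```python
import time

class VoiceChatSession:
    """Minimal sketch of the wake/non-wake state machine described above."""

    def __init__(self, wake_word="xiaoxin", timeout=30.0, answers=None):
        self.wake_word = wake_word
        self.timeout = timeout          # seconds of silence before sleeping
        self.answers = answers or {}    # stand-in for the answer store
        self.awake = False
        self.last_activity = 0.0

    def hear(self, utterance, now=None):
        """Process one recognized utterance; returns response text or None."""
        now = time.monotonic() if now is None else now
        if self.awake and now - self.last_activity > self.timeout:
            self.awake = False          # silence elapsed: back to non-wake mode
        if not self.awake:
            if utterance.strip().lower() == self.wake_word:
                self.awake = True       # wake word recognized
                self.last_activity = now
            return None                 # everything else is ignored while asleep
        self.last_activity = now
        return self.answers.get(utterance, "Sorry, no matching news found.")
```

In non-wake mode only the wake word has any effect; in wake mode every utterance is answered, and prolonged silence drops the session back to non-wake mode, matching the mode-control description.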
The deep-learning-based intelligent voice-interactive news chat described above optionally further comprises a mode control, electrically connected to the voice preprocessing, the voice recognition, the search and the output respectively;
the mode control acquires mode information and sends the current mode information to the voice preprocessing, the voice recognition, the search and the output respectively;
in non-wake mode, the mode control generates a wake-state identifier according to the voice preprocessing's judgment of whether the target voice is the set wake word, and when the target voice is the set wake word it outputs the wake-state identifier to the voice preprocessing, the voice recognition, the search and the output respectively;
in wake mode, the mode control records the time node at which response content is output and monitors in real time whether the voice preprocessing acquires new target content; if no target content is acquired within the set time, it generates a non-wake-state identifier and outputs it to the voice preprocessing, the voice recognition, the search and the output respectively;
the method S3 is specifically as follows:
S31, sound is picked up in real time in the awakening mode, so that external effective voice can be accurately obtained even during voice output;
S32, multi-intent judgment processing, multi-model algorithm analysis, a data strategy engine and integration processing. Recognizing multiple intents in a short text is a problem in spoken language understanding (SLU): short texts have sparse features and few words yet carry a large amount of information, so extracting effective features for classification is difficult. To solve this problem, syntactic features are combined with a convolutional neural network (CNN) to propose a multi-intent recognition model. First, a sentence is subjected to dependency syntax analysis to determine whether multiple intents are contained; then, a distance matrix is calculated using term frequency-inverse document frequency (TF-IDF) weights and trained word vectors to determine the number of intents; finally, the distance matrix is used as the input of the CNN model for intent classification;
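The first step above uses dependency syntax analysis to decide whether a query carries more than one intent. As a rough, self-contained stand-in for that parser (the function name, cue list and threshold are illustrative assumptions, not the patent's method), one can flag a query as potentially multi-intent when it contains several clause separators or coordinating cues:

```python
import re

# Hypothetical heuristic stand-in for the dependency-syntax step: count
# coordinating cues and clause separators as rough intent boundaries.
# A real system would run a dependency parser instead.
CLAUSE_CUES = re.compile(r"(?:\band\b|\balso\b|;|,\s*then\b|\?)", re.IGNORECASE)

def looks_multi_intent(text: str) -> bool:
    """Return True when the text likely carries more than one intent."""
    return len(CLAUSE_CUES.findall(text)) >= 2

print(looks_multi_intent("What is the weather?"))                              # → False
print(looks_multi_intent("Read today's headlines and also play music, then stop."))  # → True
```

Queries flagged this way would then be passed on to the TF-IDF distance-matrix step to estimate how many intents are present.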
further, the step S32 is to combine semantic expansion and CNN methods to classify the news data sets, first extract information in the headlines, and then perform semantic expansion using CNN;
further, the S32 multi-purpose judgment processing module, the multi-model algorithm parsing module, the data policy engine module, and the integration processing module are respectively connected to the language processing module for data transmission;
the multi-intent judgment processing is used for analyzing whether the reader's dialog text carries multiple intents, and is connected with the word segmentation strategy engine. It receives the text, performs primary filtering through the word segmentation strategy engine, judges the reader's intents, and feeds the judgment back to the scheduling. After acquiring the data fed back by the multi-intent judgment processing, the scheduling selects according to the question data in the text and calls the multi-model algorithm analysis to obtain the scores it generates. After acquiring the scores, the scheduling calls the integration processing, performs weight screening according to the scores to obtain calculation result data, and transmits the calculation result data to the data strategy engine, which processes the calculation result to generate the analysis. The multi-model algorithm analysis is internally provided with knowledge model algorithm interpretation, deep learning model algorithm interpretation and similarity auxiliary model algorithm analysis.
The intelligent voice interactive chat news enters the non-awakening mode when no content is output within the set time;
a news chat method and system of intelligent digital newspaper automatic summarization and voice interaction are disclosed, the system comprises:
doc2vec, also called paragraph2vec, is an unsupervised algorithm that obtains vector representations of sentences, paragraphs or documents, and is an extension of word2vec. The vectors find the similarity between sentences, paragraphs or documents by computing distances, and are used for text clustering; for labeled data, texts can also be classified by supervised learning;
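The idea described above — represent each document as a vector and compare documents by distance — can be illustrated with a toy sketch. Real doc2vec (paragraph vectors, available e.g. in gensim) learns dense trained vectors; the term-count vectors below are a simplified stand-in so that the distance computation itself is visible:

```python
import math
from collections import Counter

# Toy document vectors: plain term counts (a stand-in for learned
# doc2vec embeddings), compared by cosine similarity.
def doc_vector(text: str) -> Counter:
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

d1 = doc_vector("the market rallied as stocks rose")
d2 = doc_vector("stocks rose and the market rallied")
d3 = doc_vector("heavy rain is expected tomorrow")
print(cosine(d1, d2) > cosine(d1, d3))  # → True: similar news texts score higher
```

The same pairwise-distance machinery underlies the text clustering and supervised classification uses mentioned above.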
an automatic speech recognition module (ASR): Automatic Speech Recognition, which converts speech into text;
a spoken language understanding module (SLU): Spoken Language Understanding, which extracts the reader's intent from the recognized text;
a dialog management module (DM): Dialog Management, which manages the dialog between the machine and the reader;
a language processing module (NLP): Natural Language Processing;
a speech synthesis module (TTS): Text To Speech, from text to speech;
the multi-intent judgment processing module (MIM): Multi-Intent judgment Module;
the multi-model algorithm analysis module (MAM): Multi-model Algorithm analysis Module;
the data strategy engine module (DSM): Data Strategy engine Module;
the word segmentation strategy engine (SSE): Segmentation Strategy Engine;
an integration processing module (DI): Data Integration;
furthermore, the intelligent digital newspaper automatic summarization and voice interactive news chat method and system adopt an Attention neural network architecture, set rule weights for the title, the beginning sentence and the ending sentence, and generate a training set by a text summarization method of pre-training plus fine-tuning, parallel computing and semantics. The ASR converts the words spoken by the readers into text, the SLU understands the readers' intents and extracts the key information in the text, the DM manages the dialog between the machine and the readers, and the TTS returns the text generated by the machine to the readers as voice. The accuracy of the machine's semantic understanding depends on the accuracy of the ASR, but above all on the accuracy of the SLU;
in summary, the method and system for intelligent digital newspaper automatic summarization and voice interactive news chat of the present invention initially generates an initial representation or embedding of each word, represented by an open circle, using Attention. Information is aggregated from other words using the self-attention mechanism, and each word of the context generates a new token, represented by a filled circle, according to the sentence rule weights that set the title, beginning, and end. This step is then repeated multiple times in parallel, successively generating new tokens for all words. The system of preprocessing, strategy flow, voice synthesis, voice recognition, language processing and scheduling combination realizes the integrated scheduling of multiple algorithms, and schedules the calculation of multiple algorithm models and synthesizes the calculation result to obtain the optimal solution according to the set rule, so as to solve the limitation of blind spot calculation of a single algorithm model and achieve the complementary effect; and voice information is acquired in real time through voice acquisition, so that corresponding effective voice information can be identified no matter the interaction is in an awakening mode or a non-awakening mode. In the awakening mode, in the process of interactive chat news, a reader does not need to add a specific awakening word before each sentence, so that the interactive chat news process can be more free and random, and the intelligence of the interactive chat news is improved. In addition, because voice is obtained and voice information is obtained in real time, even if the interactive chat news is in the voice output process, effective voice information can be accurately identified, so that the interactive chat news can be interrupted in the voice output process, and communication is more efficient and smooth.
On the basis of the technical level of the existing intelligent voice interaction field, further architectural design optimization and content refinement are performed. The voice acquisition module acquires voice information in real time, so that effective voice information can be identified whether the interactive voice is in the awakening mode or the non-awakening mode. In the awakening mode, the reader is not required to add a specific awakening word before each sentence, which makes the interaction freer and more natural and improves the intelligence of the interactive voice. In addition, because the voice acquisition module acquires voice information in real time, effective voice can be accurately identified even during voice output, so the interactive voice can be interrupted while output is in progress, communication is more efficient and smooth, specific scenes can be customized, complex multi-intent scenes are better handled, and historical and current data are combined in a diversified manner. The attention module mainly comprises two parts: the first part is a channel attention vector matrix M_C, which selects channels; the other part is a spatial attention vector matrix M_S, which selects the regions needing attention in the vector matrix space. A given feature vector matrix F ∈ R^(C×H×W) yields the following output after the self-attention module:
F' = M_C(F) × F,
F'' = M_S(F') × F',
In the self-attention neural network, the channels generally represent different feature information of the doc2vec word vectors obtained by document quantization; by selecting channels, the network can better attend to the information in the doc2vec word vectors that is useful for the task. To realize channel selection, the global average pooling and global max pooling of the feature doc2vec word vectors are computed and, after passing through fully connected layers, added to obtain the channel attention parameters; the two poolings share the same fully connected network.
Global max and average pooling of the feature vector matrix are then performed across channels at each coordinate of the character sequence, and self-attention over the feature vector matrix yields the spatial attention vector matrix.
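The channel and spatial attention described above, matching F' = M_C(F) × F and F'' = M_S(F') × F', can be sketched in numpy. This is a minimal CBAM-style illustration: the weights are random placeholders, and the spatial 7×7 convolution of the usual CBAM formulation is replaced by a simple 1×1 mixing of the two pooled maps to keep the sketch short — a simplifying assumption, not the patent's exact layer:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(F, W1, W2):
    # Global average and max pooling over the spatial dims, then a shared MLP.
    avg = F.mean(axis=(1, 2))                            # (C,)
    mx = F.max(axis=(1, 2))                              # (C,)
    mlp = lambda v: W2 @ np.maximum(W1 @ v, 0.0)         # shared two-layer MLP
    return sigmoid(mlp(avg) + mlp(mx))[:, None, None]    # M_C: (C,1,1)

def spatial_attention(F, w):
    # Average and max pooling across channels, then 1x1 mixing + sigmoid.
    avg = F.mean(axis=0)                                 # (H,W)
    mx = F.max(axis=0)                                   # (H,W)
    return sigmoid(w[0] * avg + w[1] * mx)[None, :, :]   # M_S: (1,H,W)

C, H, W = 8, 4, 4
F = rng.standard_normal((C, H, W))
W1 = rng.standard_normal((C // 2, C))   # shared MLP, reduction ratio 2
W2 = rng.standard_normal((C, C // 2))
w_s = rng.standard_normal(2)

F1 = channel_attention(F, W1, W2) * F   # F'  = M_C(F) × F
F2 = spatial_attention(F1, w_s) * F1    # F'' = M_S(F') × F'
print(F2.shape)                         # → (8, 4, 4)
```

Note that both attention maps pass through a sigmoid, so every scaling factor lies in (0, 1), matching the "select useful channels and regions" reading of the text.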
Automatic extraction from the digital electronic newspaper, automatic summarization of multiple meanings, voice recognition and interaction, and voice news chat are thereby achieved.
The method and system for intelligent digital newspaper automatic summarization and voice interactive news chat adopt a method of generating intelligent text summaries by building a neural network from self-attention layers, and comprise the technologies of algorithms, scheduling, awakening, searching, voice recognition, synthesis and news chat. They realize intelligent and efficient summarization of massive news into concise, easy-to-read texts, and support personalized news recommendation, topic recognition and tracking. The attention neural network addresses the high dimensionality, data sparseness and lack of semantics of traditional text representations, and real-time sound pickup with ultrasonic filtering gives an intelligent human-machine interaction experience of the "careful selection, listening, speaking, understanding you, answering" type in voice output.
Drawings
FIG. 1 is a flow chart of a method and system for intelligent digital newspaper automatic summarization and voice interactive news chat according to the present invention;
FIG. 2 is a block diagram of S1 according to the present invention;
FIG. 3 is a flow chart of S2 according to the present invention;
FIG. 4 is a flow chart of an embodiment of S3;
FIG. 5 shows the encoder-decoder architecture: (a) the conventional architecture; (b) the architecture of the model with an added attention mechanism;
FIG. 6 is a flow chart illustrating the basic principle of the speech recognition system of the present invention.
Detailed Description
Various exemplary embodiments of the present invention will now be described in detail with reference to the accompanying drawings. It should be noted that: the relative arrangement of the components and steps, the numerical expressions and numerical values set forth in these embodiments do not limit the scope of the present invention unless specifically stated otherwise.
In the practical application of the present embodiment, the following description of at least one example is merely illustrative in nature and is in no way intended to limit the invention, its application, or uses.
Techniques, methods and apparatus known to those of ordinary skill in the relevant art may not be discussed in detail, but are intended to be part of the specification where applicable.
Any particular value, in all methods shown and discussed herein, should be construed as exemplary only and not as limiting. Thus, other examples of method embodiments may have different values.
It should be noted that: like symbols and letters represent like items in the following figures, and thus, once an item is defined in one figure, it need not be further discussed in subsequent figures. Other features of the present invention and its advantages will become apparent from the following detailed description of the invention with reference to the accompanying drawings, as illustrated in fig. 1 to 6. The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description, serve to explain the principles of the invention.
S1, acquiring the digital newspaper text; text features are collected through text scanning, the multi-semantic original text is converted into a word vector sequence consisting of a plurality of word vectors, the rule weights of the title, beginning sentences and ending sentences are set through the Attention neural network architecture, and a training set is generated through a text summarization method of pre-training plus fine-tuning, parallel computing and semantics;
S2, with the title, beginning and ending sentence rule weights of the digital newspaper obtained above, the character sequence and the text words to be identified are concentrated to reduce the influence of background information, so that the recognition ability is not easily disturbed. Based on the Attention self-attention neural network model, an Attention module is designed over the channels of the vector matrix features, with cross entropy as the model loss function and max pooling used to select the informative channels and spatial regions. The dense doc2vec word vector representation of the news text solves the high dimensionality, data sparseness, lack of semantics and sentence-breaking problems of traditional text representations. With the news text word vectors as input, the attention neural network automatically learns and extracts the features of the news text, overcoming the time- and labor-consuming, error-accumulating manual feature extraction of traditional news text classification. The model can put its "attention" on the more useful information; the module applies self-attention neural network architectures with the attention capability of the CBAM module to obtain the final summary result.
The voice collecting device has the advantages that real-time sound collection can be achieved, external effective voice can be accurately obtained in the voice output process, the voice interaction process is more intelligent, and intelligent human-computer interaction experience is achieved.
S3, according to the intelligent voice interaction awakening word "Xiaoxin" described above, an intelligent voice input port of the "listening, speaking, understanding you, answering" type is obtained.
Further, S1 includes: S11, according to the set rules, the news title has a strong generalizing effect on the news text, and the titles of some major news directly reflect the central idea of the text; sentence weights are therefore calculated with reference to title-sentence similarity, which produces better results;
in the sentence weight calculation, the title comes first, and then the first sentence and first paragraph are emphasized. In news reports, however, the first sentence may be genre boilerplate that does not affect the news content, such as a correspondent's byline or a dateline ("XX Net, Month X Day X dispatch", "XX News Agency, Beijing, Month X Day X"), so such sentences are filtered out first. Question and exclamation sentence patterns are not considered in intelligent news summarization. Sentence similarity is then calculated and redundant sentences are filtered;
in Chinese text, sentence features comprise word features, semantic features and syntactic features. In sentence similarity calculation, the three types of features are considered comprehensively, weighted organically and made to complement one another;
sentences in Chinese texts can be divided into a core part and a modification part: the core part is the subject-predicate-object structure, which is crucial to the semantics of the sentence, while the modification part is a secondary structure, usually attributive, adverbial or complement constructions. Because the subject and object in the subject-predicate-object structure are often nouns or pronouns, and the predicate is mostly an adverb or adjective, when calculating sentence similarity the parts of speech of the words appearing in the sentence should be tagged, keywords retained, and non-keywords filtered out;
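The S11 steps above — filter dateline boilerplate, skip question/exclamation sentence patterns, and weight sentences by similarity to the title — can be sketched in a few lines. The dateline patterns, the English demo text and the overlap-based similarity are illustrative assumptions standing in for the Chinese-specific processing the patent describes:

```python
import re

# Hypothetical dateline/byline patterns; the patent filters the Chinese
# equivalents ("XX Net, Month X Day X dispatch", agency datelines, etc.).
DATELINE = re.compile(r"^(Reuters|Xinhua|AP)\b.*?--\s*", re.IGNORECASE)

def score_sentences(title: str, sentences: list[str]) -> list[tuple[float, str]]:
    title_words = set(title.lower().split())
    scored = []
    for s in sentences:
        s = DATELINE.sub("", s).strip()          # drop dateline boilerplate
        if not s or s.endswith(("?", "!")):      # skip questions/exclamations
            continue
        words = set(s.lower().split())
        overlap = len(words & title_words) / (len(title_words) or 1)
        scored.append((overlap, s))              # title similarity as weight
    return sorted(scored, reverse=True)

ranked = score_sentences(
    "Storm floods city center",
    ["Xinhua -- The storm flooded the city center on Monday.",
     "Could it get worse?",
     "Officials opened shelters."],
)
print(ranked[0][1])   # → The storm flooded the city center on Monday.
```

A fuller implementation would replace the raw word overlap with the weighted combination of word, semantic and syntactic features described above, and would tag parts of speech to keep only keywords.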
the S12 word segmentation strategy engine is internally provided with a sequence labeling model and a deep learning algorithm for word segmentation; the knowledge model algorithm analysis scores on the basis of string matching; the deep learning model algorithm analysis scores on the basis of the deep learning algorithms K-means, LDA, iterative decision trees, TextCNN and TextRNN attention models; the similarity auxiliary model algorithm analysis scores on the basis of word distance calculation, covariance calculation, word vector calculation and stability calculation; the score comprises the scores of any one or more of the knowledge model, the deep learning model and the similarity auxiliary model;
S13, for the processing of complex answers such as multiple questions and multiple intents, the scheduling performs preliminary preprocessing before the text is sent to the question calculation model: multi-intent splitting decomposes a multi-intent question into multiple parts, realizing integrated scheduling of multiple algorithms, and the parts are then sent to the question calculation model. The generated summary expresses the core meaning of the original text, and the answer results are integrated and fed back to the reader. The method and system for intelligent digital newspaper automatic summarization and voice interactive news chat provided by the invention comprise the following:
the method adopts attention classification, uses an attention key neural network architecture and attention neural network training, maps text elements into vectors with fixed length, and the distance between the vectors can depict the semantic correlation between the text elements, thereby overcoming the defects that one-hot vector dimension is too high and the connection between the text elements cannot be depicted. A new text classification algorithm and a multi-document automatic summarization algorithm are designed on the basis of text distributed representation, and a text concept word vector model is designed by combining the distributed representation of words and a text vector model representation method aiming at the problems of huge structural dimension, extreme sparseness and the like of a text vector model. The method comprises the steps of mapping words in a text into word vectors, clustering the words with high semantic relevance into concepts through word vector clustering, then constructing a concept directed vector model according to the sequence relation of the words, storing adjacent matrixes corresponding to the vector models of the concepts of the text into vector sequences, converting natural voice processing tasks into vector sequence processing tasks, achieving mapping from the text to the vector sequences, and designing a multilayer self-attention neural network. The text vector sequences are classified, and the classification result is compared with other text classification algorithms, so that the result shows that the algorithm provided by the invention is better than other three text classification algorithms, and the problem that automatic summarization of multiple documents in China is lack of summary sentence redundancy is solved. 
The distributed expression of sentences is combined with a spectral clustering algorithm, a multi-document automatic summarization algorithm based on an attention algorithm and spectral clustering is designed, the sentences in the text are mapped into sentence vectors, the sentence vectors are clustered by using the spectral clustering algorithm, and the documents are divided into sub-topic documents. And establishing a sentence relation vector sequence model in each sub-topic document, and iterating the sentence weight by using an Attention algorithm.
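The sub-topic step above — map sentences to vectors, then partition them with spectral clustering — can be sketched with numpy alone. The sentence "vectors" here are plain term counts and the final assignment is a simple k-means loop with deterministic seeding; both are simplifications of the trained sentence embeddings and the full spectral clustering pipeline the text describes:

```python
import numpy as np

def embed(sentences, vocab):
    # Term-count sentence vectors (stand-in for learned sentence embeddings).
    X = np.zeros((len(sentences), len(vocab)))
    for i, s in enumerate(sentences):
        for w in s.lower().split():
            if w in vocab:
                X[i, vocab[w]] += 1
    return X

def spectral_cluster(X, k=2, iters=10):
    # Cosine affinity graph, symmetric normalized Laplacian,
    # then k-means on the k smallest eigenvectors.
    norms = np.linalg.norm(X, axis=1, keepdims=True) + 1e-12
    A = (X / norms) @ (X / norms).T
    np.fill_diagonal(A, 0.0)
    d = A.sum(axis=1) + 1e-12
    L = np.eye(len(X)) - A / np.sqrt(np.outer(d, d))
    _, vecs = np.linalg.eigh(L)          # eigenvalues in ascending order
    U = vecs[:, :k]                      # spectral embedding
    # Deterministic seeding for the k=2 demo: first row + farthest row.
    centers = np.stack([U[0], U[np.argmax(((U - U[0]) ** 2).sum(-1))]])
    for _ in range(iters):
        labels = np.argmin(((U[:, None] - centers) ** 2).sum(-1), axis=1)
        for j in range(k):
            if (labels == j).any():
                centers[j] = U[labels == j].mean(axis=0)
    return labels

sents = ["stocks rose sharply", "the market rallied on stocks",
         "rain fell all day", "heavy rain flooded streets"]
vocab = {w: i for i, w in enumerate(sorted({w for s in sents for w in s.split()}))}
labels = spectral_cluster(embed(sents, vocab), k=2)
print(labels[0] == labels[1], labels[2] == labels[3])  # → True True
```

Each resulting cluster corresponds to one sub-topic document, within which sentence weights would then be iterated as described.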
Further, S3 includes:
S31, sound is picked up in real time in the awakening mode, so that external effective voice can be accurately obtained even during voice output;
S32, multi-intent judgment processing, multi-model algorithm analysis, a data strategy engine and integration processing. Recognizing multiple intents in a short text is a problem in spoken language understanding (SLU): short texts have sparse features and few words yet carry a large amount of information, so extracting effective features for classification is difficult. To solve this problem, syntactic features are combined with a convolutional neural network (CNN) to propose a multi-intent recognition model. First, a sentence is subjected to dependency syntax analysis to determine whether multiple intents are contained; then, a distance matrix is calculated using term frequency-inverse document frequency (TF-IDF) weights and trained word vectors to determine the number of intents; finally, the distance matrix is used as the input of the CNN model for intent classification;
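The TF-IDF step above can be sketched concretely: weight the clauses of a query with TF-IDF and derive a pairwise distance matrix, where distant clauses suggest separate intents. The clause split, the demo text and the 0.5 threshold are illustrative assumptions; the patent additionally combines TF-IDF with trained word vectors:

```python
import math

def tfidf(clauses):
    # TF-IDF vectors over the clause vocabulary (smoothed IDF: log(n/df)+1).
    docs = [c.lower().split() for c in clauses]
    vocab = sorted({w for d in docs for w in d})
    n = len(docs)
    idf = {w: math.log(n / sum(w in d for d in docs)) + 1.0 for w in vocab}
    return [[d.count(w) / len(d) * idf[w] for w in vocab] for d in docs]

def distance_matrix(vectors):
    # Pairwise cosine distance: 1 - cos(a, b).
    def cos(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        na = math.sqrt(sum(x * x for x in a))
        nb = math.sqrt(sum(y * y for y in b))
        return dot / (na * nb) if na and nb else 0.0
    return [[1.0 - cos(a, b) for b in vectors] for a in vectors]

clauses = ["play the morning news", "set an alarm for seven"]
D = distance_matrix(tfidf(clauses))
print(D[0][1] > 0.5)   # → True: distant clauses suggest two intents
```

In the full model this distance matrix would then be fed to the CNN classifier to assign an intent label to each part.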
s32 is a method combining semantic expansion and CNN to classify the news data set, firstly extracting the information in the title, and then performing semantic expansion by using CNN;
the S32 multi-purpose judgment processing module, the multi-model algorithm parsing module, the data policy engine module, and the integration processing module are respectively connected to the language processing module for data transmission;
S33, the multi-intent judgment processing is used for analyzing whether the reader's dialog text carries multiple intents, and is connected with the word segmentation strategy engine. It receives the text, performs primary filtering through the word segmentation strategy engine, judges the reader's intents, and feeds the judgment back to the scheduling. After acquiring the data fed back by the multi-intent judgment processing, the scheduling selects according to the question data in the text and calls the multi-model algorithm analysis to obtain the scores it generates. After acquiring the scores, the scheduling calls the integration processing, performs weight screening according to the scores to obtain calculation result data, and transmits the calculation result data to the data strategy engine, which processes the calculation result to generate the analysis. The multi-model algorithm analysis is internally provided with knowledge model algorithm interpretation, deep learning model algorithm interpretation and similarity auxiliary model algorithm analysis.
(1) The present invention relates to a digital newspaper technology for automatic newspaper summarization, characterized by attention classification using an attention neural network architecture. In text summarization tasks, the attention algorithm relates only some words of the input sequence to the next predicted output value; in tagging tasks, some local information is more closely related to the next tagged word. The attention mechanism integrates this relationship, allowing the model to dynamically focus on specific portions of the input to accomplish the task at hand more efficiently.
An attention model is added into the neural network structure of the present invention. First, the attention model achieves very good performance on many tasks, such as question answering, sentiment analysis and part-of-speech tagging. Second, the attention mechanism increases the interpretability of the neural network structure; since traditional neural networks are black-box models, improving interpretability is crucial to the fairness, reliability and transparency of machine learning models. Third, it helps alleviate some drawbacks of recurrent neural networks, such as performance degradation as the input sequence length grows and the computational inefficiency of processing inputs sequentially.
The conventional encoding-decoding structure has two major drawbacks. First, the encoder must compress all the input information into a fixed-length vector. Using such simple fixed-length encoding to represent longer and more complex inputs tends to result in loss of input information. Secondly, such a structure does not model the correspondence of input sequences and output sequences, which is important in text summarization tasks. Intuitively, in a sequence task, each position of the output sequence may be affected by a particular position of the input sequence. However, classical decoding architectures do not take this correspondence into account when generating the output.
The attention model overcomes the two major drawbacks of the conventional architecture by allowing the decoder to access all of the encoder-generated outputs. The core idea is that all the outputs of the encoder are weighted, combined and then input into the decoder at the current position to influence the decoder's output. By weighting the encoder outputs, more context information from the original data can be utilized at the same time as the alignment of input and output is achieved. For text classification and recommendation tasks, the input is a sequence but the output is not. In this scenario, an attention mechanism can be used to capture the association between each unit (e.g., each word) in the input sequence and the other units of the same sequence. In this case, the candidate states and the query states of the attention model come from the same sequence, and we refer to an attention model based on this mechanism as a self-attention model.
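The core idea above — the decoder sees a weighted combination of all encoder outputs rather than a single fixed-length vector — can be shown in a few lines of numpy. The shapes and the dot-product scoring are standard illustrative choices, not specifics from the patent:

```python
import numpy as np

def attention_context(encoder_outputs, decoder_state):
    # Alignment scores between the decoder query and every encoder output,
    # softmax-normalized into attention weights, then a weighted combination.
    scores = encoder_outputs @ decoder_state        # (T,)
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                        # softmax: sums to 1
    return weights @ encoder_outputs, weights       # context vector, weights

T, d = 5, 8
enc = np.random.default_rng(1).standard_normal((T, d))
ctx, w = attention_context(enc, np.ones(d) * 0.1)
print(ctx.shape, w.shape)                           # → (8,) (5,)
```

For the self-attention case described above, the same function would be applied with each element of the sequence in turn acting as the query against its own sequence.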
When the attention weights are calculated only from the original input sequence, the attention model may be called a single-layer attention model. On the other hand, the input sequence can be abstracted several times, with the context vector of a lower abstraction becoming the query state of the next abstraction. This method of stacking several attention modules over the input data to achieve multi-layer abstraction of sequence data may be called a multi-layer attention model. More specifically, multi-layer attention models may be further divided according to whether the weights are learned top-down or bottom-up. One typical application of the multi-layer attention mechanism is text classification through two layers of abstraction over the text; this model is called the Hierarchical Attention Model (HAM). Since an article is composed of sentences and each sentence of words, HAM can exploit the natural hierarchical structure of articles: an attention model is built for each sentence, whose input is the words of the sentence, yielding a feature representation of the sentence; the sentence feature representations are then input into a subsequent attention model to construct a feature representation of the whole text. This resulting feature representation of the whole text can be fed to a subsequent classifier. One key issue in machine translation is how to model attention with neural network techniques to better align sentences of different languages; the advantages of the attention model become even more apparent when translating longer sentences.
By introducing an attention mechanism into question answering, the model can be helped to understand the question by attending to its more important parts, while the large amount of stored information also helps the question-answering chat find a suitable answer. In addition, modeling multi-modal data in question answering with an attention model can improve system performance. The task of multimedia description is to generate a natural-language description for a multimedia input sequence, which may be speech, images or video. Similar to news chat, attention focuses on the relevant parts of the speech or image input to find the relevant acoustic or visual signals when predicting the next word of the caption. Video captioning can also exploit the spatiotemporal structure of multimedia data with a multi-layer attention mechanism, where the lower level captures specific regions in a video frame and the higher level extracts a small subset of the video frames.
(2) Enabling a machine to understand human speech and execute tasks according to human intent is a cross-disciplinary field with a very broad scope, involving signal processing, neuropsychology, artificial intelligence, computer science, linguistics and communications. Intelligent voice interaction technology is a systematic project that generally involves speech recognition, natural language understanding, dialog management, natural language generation, speech synthesis and their comprehensive application. The pipeline of natural language understanding, dialog management and natural language generation, also called the intelligent dialog system, is the core technical difficulty of the whole intelligent voice interaction process. After applying artificial intelligence technologies such as big-data deep learning, the method is as follows: first, analyze and process the voice signal to remove redundant information; second, extract the key information that influences speech recognition and the feature information that expresses linguistic meaning; third, tightly couple the feature information and recognize words in minimal units; fourth, recognize the words in order according to the grammar of the respective language; fifth, use the preceding and following meanings as auxiliary recognition conditions, which benefits analysis and recognition; sixth, divide the key information into paragraphs according to semantic analysis, connect the recognized words, and adjust the sentence composition according to the meaning of the sentence; seventh, combine semantics, carefully analyze contextual connections, and appropriately correct the sentence currently being processed.
Voice interaction technology has progressed from rule-based commands to natural-language commands, and machine-learned "chat robots" have entered a trial stage.
The method and system for automatic summarization of an intelligent digital newspaper and voice-interactive news chat generate intelligent text summaries with a neural network built from self-attention layers. The system covers algorithm, scheduling, wake-up, search, speech recognition, speech synthesis, and news chat technologies, with particular focus on intelligent summarization and news chat. It distills concise, easy-to-read summaries from massive news text intelligently and efficiently, and supports personalized news recommendation and topic detection and tracking. Building on real-time voice pickup and ultrasonic filtering, the attention neural network overcomes the high dimensionality, data sparsity, and lack of semantics of traditional text methods, and delivers a "carefully selects, listens, speaks, understands you, and answers" style of intelligent human-computer interaction through voice output.
Claims (8)
1. A method for intelligent digital newspaper automatic summarization and voice-interactive news chat, characterized by comprising the following steps:
s1, acquiring a digital newspaper text;
s2, using the acquired digital newspaper text and the title and body feature information included in deep-learning model training, the attention neural network analyzes and processes the characteristic text information of the digital newspaper to obtain a cross-entropy loss function; the neural network model based on the self-attention mechanism uses the channel information and spatial information of the vector-matrix features to build a CBAM attention module, so that the model places its "attention" on more useful information; the module applies the attention neural network architecture, selects information channels and spatial regions, takes the computed cross entropy as the model loss function, applies max pooling, and uses the attention capacity of the CBAM module to obtain the final summary result;
s3, in the intelligent voice interaction wake-up mode, the wake word "Xiaoxin" opens the intelligent voice input port of the "listens, speaks, understands you, and answers" type.
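A simplified sketch of the CBAM-style attention named in step s2, in NumPy. This is a hedged approximation: the original CBAM uses a shared MLP for channel attention and a 7x7 convolution for spatial attention, both of which are replaced here by elementwise sums for brevity; shapes and values are illustrative.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(x):
    """x: (C, H, W). Score each channel from its average- and max-pooled value.
    (CBAM proper passes these through a shared MLP; omitted here.)"""
    w = sigmoid(x.mean(axis=(1, 2)) + x.max(axis=(1, 2)))
    return x * w[:, None, None]

def spatial_attention(x):
    """x: (C, H, W). Score each location from cross-channel average and max maps.
    (CBAM proper applies a 7x7 convolution to these maps; omitted here.)"""
    w = sigmoid(x.mean(axis=0) + x.max(axis=0))
    return x * w[None, :, :]

def cbam(x):
    """Channel attention first, then spatial attention, as in CBAM."""
    return spatial_attention(channel_attention(x))

feat = np.random.default_rng(1).standard_normal((8, 4, 4))
out = cbam(feat)
```

Because both attention maps lie in (0, 1), the module can only reweight, never amplify, the input features; the network learns which channels and locations to keep.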
2. The method for intelligent digital newspaper automatic summarization and voice-interactive news chat as claimed in claim 1, comprising the steps of:
s1, the intelligent digital newspaper system acquires the feature information of title and body words, and the attention neural network learns the vector-matrix features, obtains the text, and generates a training set;
s2, using the acquired digital newspaper text and the body feature information included in deep-learning model training, the attention neural network analyzes the characteristic text information of the digital newspaper, compares it with the reference summaries of the training set, and performs back-propagation according to the loss function with a parameter optimization method, thereby completing the training of the deep learning model; according to the rule weights of the title and of sentences at the beginning and end of the text, the influence of background information on the text words to be recognized is reduced, so that recognition is less easily disturbed by other interference; the attention capacity of the CBAM module is added with only a small increase in module parameters, improving the attention capacity of the neural network and avoiding the influence of background on recognition.
S3, the wake word of the intelligent voice interaction wake-up mode is "Xiaoxin". The intelligent voice interaction comprises: voice acquisition, which captures sound information in real time; voice preprocessing, connected to the voice acquisition, which takes the captured sound information and applies ultrasonic filtering to obtain intelligent voice of the "listens, speaks, understands you, and answers" type; in the non-wake mode, the system judges whether the voice is the set wake word, entering the wake mode if it is and remaining in the non-wake mode otherwise; and voice recognition, which recognizes the target voice under the wake mode to obtain the target content.
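The wake-word mode logic of step S3 can be sketched as a small state machine: stay asleep until the wake word is heard, then treat subsequent input as queries. The transcript strings and the exact-match rule are illustrative assumptions; a real system would match against recognizer output with some tolerance.

```python
from enum import Enum

WAKE_WORD = "xiaoxin"   # the wake word from the claim ("Xiaoxin")

class Mode(Enum):
    SLEEPING = 0   # non-wake mode
    AWAKE = 1      # wake mode

def step(mode, transcript):
    """One turn of the claim's mode logic: in the non-wake mode, only the wake
    word changes state; in the wake mode, recognized content is passed on."""
    text = transcript.strip().lower()
    if mode is Mode.SLEEPING:
        return (Mode.AWAKE, None) if text == WAKE_WORD else (Mode.SLEEPING, None)
    return Mode.AWAKE, text   # awake: hand the target content downstream

mode = Mode.SLEEPING
mode, query = step(mode, "hello")      # ignored: not the wake word
mode, query = step(mode, "Xiaoxin")    # wakes the system
mode, query = step(mode, "read today's headlines")
```

The claim's timeout back to the non-wake mode (see S32 below in claim 7) would add one more transition on an idle timer.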
3. The method for intelligent digital newspaper automatic summarization and voice-interactive news chat according to claim 1 or 2, wherein the feature information of the text comprises: vectorization of titles.
4. The method for intelligent digital newspaper automatic summarization and voice-interactive news chat as claimed in claim 1, further comprising preset rules, wherein the preset rules comprise: (1) the title has the highest weight; (2) the closer a sentence is to the beginning or the end, the higher its rule weight; (3) if a sentence starts with an adverb or conjunction, its rule weight is reduced.
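The three preset rules could be combined into a single scoring function, for example as below. The numeric weights and the connective word list are illustrative assumptions, not values given in the claim.

```python
def rule_weight(sentence, index, total, is_title=False):
    """Score one sentence with the three preset rules:
    (1) the title gets the highest weight;
    (2) weight rises toward the beginning or end of the text;
    (3) starting with an adverb/conjunction lowers the weight."""
    CONNECTIVES = ("however", "moreover", "therefore", "but", "and", "also")
    if is_title:
        return 2.0                      # rule 1: title outranks any body sentence
    # rule 2: distance from the nearer edge, normalized to [0, 1]
    edge = min(index, total - 1 - index) / max(total - 1, 1)
    weight = 1.0 - 0.5 * edge           # closer to an edge => higher weight
    words = sentence.strip().lower().split()
    if words and words[0] in CONNECTIVES:
        weight *= 0.5                   # rule 3: penalize connective openers
    return weight
```

A summarizer would then rank sentences by this score (possibly multiplied by a content score) and keep the top few.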
5. The method for intelligent digital newspaper automatic summarization and voice-interactive news chat as claimed in claim 1, wherein S1 specifically comprises:
s11, the text is scanned and collected, the polysemous original text is converted into a word vector sequence composed of multiple word vectors, and a training set is generated by a text summarization method using pre-training, fine-tuning, parallel computation, and semantics;
s12, the texts are classified in a certain manner, the best matching result is then found according to a decision criterion, the abstract semantic representation of the text is mined, and a natural language generation method is applied;
s13, dense quantized representations of text documents, including news text, are built with deep learning in machine learning; the goal of the doc2vec document quantization process is to create a vectorized word-vector representation of each document regardless of its length.
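doc2vec itself learns a paragraph vector jointly with word vectors during training. As a hedged stand-in, the sketch below shows the simpler averaging baseline, which likewise yields a fixed-length document vector regardless of document length; the vocabulary, dimension, and texts are all made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)
vocab = {}   # word -> vector, filled lazily

def word_vec(word, dim=8):
    """Look up (or randomly initialize) a word vector."""
    if word not in vocab:
        vocab[word] = rng.standard_normal(dim)
    return vocab[word]

def doc_vec(text, dim=8):
    """Fixed-length document vector independent of length: the mean of the
    document's word vectors. (doc2vec proper would instead train a dedicated
    paragraph vector alongside the word vectors.)"""
    words = text.lower().split()
    if not words:
        return np.zeros(dim)
    return np.mean([word_vec(w, dim) for w in words], axis=0)

short_doc = doc_vec("stocks rose today")
long_doc = doc_vec("stocks rose today after the central bank cut interest rates")
```

Both documents map to vectors of the same dimension, which is what makes downstream similarity comparison between documents of different lengths possible.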
6. The method for intelligent digital newspaper automatic summarization and voice-interactive news chat as claimed in claim 1, further comprising: segmenting the original text into words to form a word sequence, and performing word embedding on the word sequence to generate the corresponding word vector sequence, since the deep learning neural network model cannot process words directly and must vectorize the words in the text; a word vector represents a word in the deep neural network and is the feature vector or representation of that word; in the word embedding method, the word vectors of all words in the vocabulary are generated by random initialization when model training starts, and the model updates the vocabulary's word vectors during training; in the validation and test phases, the model directly uses the word embedding vectors obtained from training.
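A minimal sketch of the embedding scheme in claim 6: random initialization at the start of training, updates during training, and direct lookup afterwards. The toy one-step gradient update and the target vector are illustrative, standing in for whatever loss the full model back-propagates.

```python
import numpy as np

rng = np.random.default_rng(0)
word_list = ["news", "voice", "summary", "<unk>"]
word2id = {w: i for i, w in enumerate(word_list)}

# Random initialization at training start, as the claim describes
dim = 4
embedding = rng.standard_normal((len(word_list), dim)) * 0.1

def lookup(word):
    """Validation/test phase: use the trained vectors directly."""
    return embedding[word2id.get(word, word2id["<unk>"])]

# One illustrative SGD step: nudge "news" toward a made-up target vector
target = np.ones(dim)
lr = 0.5
i = word2id["news"]
before = embedding[i].copy()
grad = embedding[i] - target          # gradient of 0.5 * ||e - target||^2
embedding[i] -= lr * grad             # the model updates the table in training

vec = lookup("news")
```

In a real model the gradient would flow from the summarization loss back through the network into exactly this table, row by row.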
7. The method for intelligent digital newspaper automatic summarization and voice-interactive news chat as claimed in claim 1, wherein S3 comprises:
s31, intelligent voice interaction in the speech recognition and search technology system, wherein the intelligent voice interaction input comprises a microphone and the output comprises a loudspeaker;
s32, a voice activity detection (VAD) model, including mode control, wherein the mode control is electrically connected to the voice preprocessing, the voice recognition, the search, and the output respectively; the mode control acquires mode information and sends the current mode information to the voice preprocessing, the voice recognition, the search, and the output respectively; in the non-wake mode, the mode control generates a wake-state identifier according to the voice preprocessing's judgment of whether the target voice is the set wake word, and if it is the set wake word, outputs the wake-state identifier to the voice preprocessing, the voice recognition, the search, and the output respectively; in the wake mode, the mode control acquires the time node at which the response content is output and monitors in real time whether the voice preprocessing has acquired target content; if no target content is acquired within the set time, a non-wake-state identifier is generated and output to the voice preprocessing, the voice recognition, the search, and the output respectively;
s33, further comprising history association for the deep-learning-based voice-interactive news chat; the history association is electrically connected to the search and the voice recognition respectively and is used when entering the wake mode; in the wake mode, the response content found by the search is recorded in the history association; the search acquires history information related to the target content from the history association and derives the response content from the history information and the target content; the history association also deletes the corresponding response content from itself after output of that response content is interrupted;
s34, the system comprises multiple scene models, and the voice processing module comprises a knowledge model, a similarity auxiliary model, and a deep learning model; the scheduling system has built-in multi-purpose decision processing, multi-model algorithm analysis, a data strategy engine, and integration processing; the acquired texts are fed into a time-series neural network for training to obtain high-dimensional feature vectors; the self-attention neural network performs a regression operation with the high-dimensional feature vectors as input; the feature vectors from the attention neural network are compared with the corresponding feature vectors in the database, the result with the highest similarity is selected and output as audio, and the retrieved voice-interactive news chat content is fed back to the reader in voice form.
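The retrieval step of s34 (compare feature vectors against the database and return the most similar entry) can be sketched with cosine similarity; the database entries, topic names, and query vector below are made up for illustration.

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity of two vectors (epsilon guards a zero norm)."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def best_match(query_vec, database):
    """Return the database entry whose feature vector is most similar to the
    query: the 'select the result with the highest similarity' step of s34."""
    return max(database, key=lambda item: cosine(query_vec, item["vec"]))

db = [
    {"title": "sports",  "vec": np.array([1.0, 0.0, 0.0])},
    {"title": "finance", "vec": np.array([0.0, 1.0, 0.0])},
    {"title": "weather", "vec": np.array([0.0, 0.0, 1.0])},
]
query = np.array([0.1, 0.9, 0.2])   # a query vector close to "finance"
hit = best_match(query, db)
```

In the described system the winning entry's text would then go to speech synthesis and be read back to the reader.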
8. The method for intelligent digital newspaper automatic summarization and voice-interactive news chat as claimed in any one of claims 1 to 7, wherein the voice recognition module recognizes voice based on a deep neural network.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011389092.9A CN112562669B (en) | 2020-12-01 | 2020-12-01 | Method and system for automatically abstracting intelligent digital newspaper and performing voice interaction chat |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011389092.9A CN112562669B (en) | 2020-12-01 | 2020-12-01 | Method and system for automatically abstracting intelligent digital newspaper and performing voice interaction chat |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112562669A true CN112562669A (en) | 2021-03-26 |
CN112562669B CN112562669B (en) | 2024-01-12 |
Family
ID=75047464
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011389092.9A Active CN112562669B (en) | 2020-12-01 | 2020-12-01 | Method and system for automatically abstracting intelligent digital newspaper and performing voice interaction chat |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112562669B (en) |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113282711A (en) * | 2021-06-03 | 2021-08-20 | 中国软件评测中心(工业和信息化部软件与集成电路促进中心) | Internet of vehicles text matching method and device, electronic equipment and storage medium |
CN113420783A (en) * | 2021-05-27 | 2021-09-21 | 中国人民解放军军事科学院国防科技创新研究院 | Intelligent man-machine interaction method and device based on image-text matching |
CN113688230A (en) * | 2021-07-21 | 2021-11-23 | 武汉众智数字技术有限公司 | Text abstract generation method and system |
CN114580429A (en) * | 2022-01-26 | 2022-06-03 | 云捷计算机软件(江苏)有限责任公司 | Artificial intelligence-based language and image understanding integrated service system |
CN116414972A (en) * | 2023-03-08 | 2023-07-11 | 浙江方正印务有限公司 | Method for automatically broadcasting information content and generating short message |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108519890A (en) * | 2018-04-08 | 2018-09-11 | 武汉大学 | A kind of robustness code abstraction generating method based on from attention mechanism |
CN109857860A (en) * | 2019-01-04 | 2019-06-07 | 平安科技(深圳)有限公司 | File classification method, device, computer equipment and storage medium |
CN110263332A (en) * | 2019-05-28 | 2019-09-20 | 华东师范大学 | A kind of natural language Relation extraction method neural network based |
CN110597979A (en) * | 2019-06-13 | 2019-12-20 | 中山大学 | Self-attention-based generating text summarization method |
CN111508491A (en) * | 2020-04-17 | 2020-08-07 | 山东传媒职业学院 | Intelligent voice interaction equipment based on deep learning |
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108519890A (en) * | 2018-04-08 | 2018-09-11 | 武汉大学 | A kind of robustness code abstraction generating method based on from attention mechanism |
CN109857860A (en) * | 2019-01-04 | 2019-06-07 | 平安科技(深圳)有限公司 | File classification method, device, computer equipment and storage medium |
CN110263332A (en) * | 2019-05-28 | 2019-09-20 | 华东师范大学 | A kind of natural language Relation extraction method neural network based |
CN110597979A (en) * | 2019-06-13 | 2019-12-20 | 中山大学 | Self-attention-based generating text summarization method |
CN111508491A (en) * | 2020-04-17 | 2020-08-07 | 山东传媒职业学院 | Intelligent voice interaction equipment based on deep learning |
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113420783A (en) * | 2021-05-27 | 2021-09-21 | 中国人民解放军军事科学院国防科技创新研究院 | Intelligent man-machine interaction method and device based on image-text matching |
CN113420783B (en) * | 2021-05-27 | 2022-04-08 | 中国人民解放军军事科学院国防科技创新研究院 | Intelligent man-machine interaction method and device based on image-text matching |
CN113282711A (en) * | 2021-06-03 | 2021-08-20 | 中国软件评测中心(工业和信息化部软件与集成电路促进中心) | Internet of vehicles text matching method and device, electronic equipment and storage medium |
CN113282711B (en) * | 2021-06-03 | 2023-09-22 | 中国软件评测中心(工业和信息化部软件与集成电路促进中心) | Internet of vehicles text matching method and device, electronic equipment and storage medium |
CN113688230A (en) * | 2021-07-21 | 2021-11-23 | 武汉众智数字技术有限公司 | Text abstract generation method and system |
CN114580429A (en) * | 2022-01-26 | 2022-06-03 | 云捷计算机软件(江苏)有限责任公司 | Artificial intelligence-based language and image understanding integrated service system |
CN116414972A (en) * | 2023-03-08 | 2023-07-11 | 浙江方正印务有限公司 | Method for automatically broadcasting information content and generating short message |
CN116414972B (en) * | 2023-03-08 | 2024-02-20 | 浙江方正印务有限公司 | Method for automatically broadcasting information content and generating short message |
Also Published As
Publication number | Publication date |
---|---|
CN112562669B (en) | 2024-01-12 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN112562669B (en) | Method and system for automatically abstracting intelligent digital newspaper and performing voice interaction chat | |
Lopez et al. | Deep Learning applied to NLP | |
Wu et al. | Emotion recognition from text using semantic labels and separable mixture models | |
CN110297907B (en) | Method for generating interview report, computer-readable storage medium and terminal device | |
Gao et al. | Convolutional neural network based sentiment analysis using Adaboost combination | |
Ren et al. | Intention detection based on siamese neural network with triplet loss | |
CN112131350A (en) | Text label determination method, text label determination device, terminal and readable storage medium | |
Han et al. | A survey of transformer-based multimodal pre-trained modals | |
Arumugam et al. | Hands-On Natural Language Processing with Python: A practical guide to applying deep learning architectures to your NLP applications | |
CN110297906B (en) | Method for generating interview report, computer-readable storage medium and terminal device | |
Kshirsagar et al. | A review on application of deep learning in natural language processing | |
CN112131876A (en) | Method and system for determining standard problem based on similarity | |
Amanova et al. | Creating annotated dialogue resources: Cross-domain dialogue act classification | |
CN111753058A (en) | Text viewpoint mining method and system | |
Tao et al. | News text classification based on an improved convolutional neural network | |
CN114691864A (en) | Text classification model training method and device and text classification method and device | |
CN112183106A (en) | Semantic understanding method and device based on phoneme association and deep learning | |
Varaprasad et al. | Applications and Techniques of Natural Language Processing: An Overview. | |
CN116910251A (en) | Text classification method, device, equipment and medium based on BERT model | |
CN117493548A (en) | Text classification method, training method and training device for model | |
Bellagha et al. | Using the MGB-2 challenge data for creating a new multimodal Dataset for speaker role recognition in Arabic TV Broadcasts | |
Huang et al. | Spoken document retrieval using multilevel knowledge and semantic verification | |
Harsha et al. | Lexical Ambiguity in Natural Language Processing Applications | |
CN110543559A (en) | Method for generating interview report, computer-readable storage medium and terminal device | |
Hao | Naive Bayesian Prediction of Japanese Annotated Corpus for Textual Semantic Word Formation Classification |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |