CN112562669A - Intelligent digital newspaper automatic summarization and voice interaction news chat method and system - Google Patents

Intelligent digital newspaper automatic summarization and voice interaction news chat method and system

Info

Publication number
CN112562669A
Authority
CN
China
Prior art keywords
voice
text
news
word
intelligent
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011389092.9A
Other languages
Chinese (zh)
Other versions
CN112562669B (en)
Inventor
庄跃辉 (Zhuang Yuehui)
程雨夏 (Cheng Yuxia)
Current Assignee
Zhejiang Fangzheng Printing Co ltd
Original Assignee
Zhejiang Fangzheng Printing Co ltd
Priority date
Filing date
Publication date
Application filed by Zhejiang Fangzheng Printing Co ltd filed Critical Zhejiang Fangzheng Printing Co ltd
Priority to CN202011389092.9A
Publication of CN112562669A
Application granted
Publication of CN112562669B
Active legal status
Anticipated expiration

Classifications

    • G10L 15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G06F 16/3329 Natural language query formulation or dialogue systems
    • G06F 16/345 Summarisation for human users
    • G06F 18/24 Classification techniques
    • G06F 40/211 Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
    • G06F 40/216 Parsing using statistical methods
    • G06F 40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G06F 40/30 Semantic analysis
    • G06N 3/045 Combinations of networks
    • G06N 3/049 Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
    • G06N 3/08 Learning methods
    • G06N 5/041 Abduction
    • G10L 13/047 Architecture of speech synthesisers
    • G10L 13/08 Text analysis or generation of parameters for speech synthesis out of text, e.g. grapheme to phoneme translation, prosody generation or stress or intonation determination
    • G10L 15/063 Training of speech recognition systems
    • G10L 15/16 Speech classification or search using artificial neural networks
    • G10L 15/1822 Parsing for meaning understanding
    • G10L 25/30 Speech or voice analysis techniques characterised by the use of neural networks
    • G10L 25/54 Speech or voice analysis specially adapted for comparison or discrimination for retrieval
    • G10L 2015/0631 Creating reference templates; Clustering
    • G10L 2015/0638 Interactive procedures
    • G10L 2015/223 Execution procedure of a spoken command
    • Y02D 10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Multimedia (AREA)
  • Acoustics & Sound (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Mathematical Physics (AREA)
  • Computing Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Software Systems (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Databases & Information Systems (AREA)
  • Signal Processing (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Machine Translation (AREA)

Abstract

The invention discloses a method and system for automatic summarization of intelligent digital newspapers and voice-interactive news chat. A neural network built from self-attention layers generates an intelligent abstract of a text. The system covers algorithms, scheduling, wake-up, search, speech recognition, speech synthesis, and news chat, with particular emphasis on intelligent summarization and news chat. It intelligently and efficiently distills summary texts from massive news so that they are concise and easy to read, and supports personalized news recommendation and topic detection and tracking. The attention neural network overcomes the high dimensionality, data sparseness, and lack of semantics of traditional text representations, and real-time sound pickup with ultrasonic filtering gives the voice output an intelligent human-machine interaction experience of the "carefully selected, can listen, can speak, understands you, can answer" type.

Description

Intelligent digital newspaper automatic summarization and voice interaction news chat method and system
Technical Field
The invention belongs to the fields of automatic abstract generation for intelligent news processing, natural language processing, and voice-interactive news chat, and particularly relates to a semantic-relevance-based method for automatically generating intelligent news abstracts, a voice interaction processing method, and a corresponding system.
Background
Newspapers are among the oldest media of dissemination and have, over their long history, made an outstanding contribution to the progress of human civilization. News is an important channel through which readers obtain social information, and in the big-data era, extracting the information one needs from a vast ocean of information has become essential. An intelligent news abstract is a paragraph or sentence distilled and summarized from the source digital newspaper text; it fully reflects the theme of the text and is concise and easy to read. Intelligent news summarization is a technology that automatically generates a text abstract by computer: computer science and natural language processing techniques extract the abstract from the original digital newspaper text and re-present it to the reader in the form the user requires.
The intelligent news abstract takes news-text word vectors as input, and news-text features are learned and extracted automatically by the attention neural network. This overcomes the wasted time and labor and the error accumulation caused by manual feature extraction in traditional news-text classification methods, effectively improves the efficiency of news-text classification, and promotes more effective organization and management of information in the news field.
Existing voice interaction relies on questions and corresponding answers preset in a knowledge base: after the system captures a reader's speech, it transcribes the speech to text with a suitable algorithm, sends the text to a back-end system, looks up the corresponding answer in the knowledge base, and returns it to the reader. In practice, because a single technique such as keywords, regular expressions, or one deep learning model is used, the computed result is often not accurate enough, so answers are wrong and the effect is poor. The preset answers are generally single results, so the response to multi-question, multi-intent input from readers is poor; the preset questions and answers are fixed; the designed flow has no mechanism for polling context or linking historical data, so the exchange with the reader shows obvious mechanical traces, the interaction is not smooth, and the experience is poor; and the same speech-synthesis interaction is applied to different content, or to different stages of the same content, so personalized responses are impossible and the practical effect is discounted.
In the new-media environment, reading digital newspapers and periodicals creates a new requirement: giving them an intelligent human-machine interaction experience of the "can listen, can speak, understands you, can answer" type, in which speech recognition technology and artificial intelligence understand human spoken language. Speech recognition technology lets a machine convert speech signals into the corresponding text or commands through a process of recognition and understanding. Speech recognition is a broad interdisciplinary field, closely related to acoustics, phonetics, linguistics, information theory, pattern recognition theory, neurobiology, and other disciplines.
The first-generation digital newspaper system presents the full appearance of the whole paper; apart from the missing tactile feel of paper, the reading experience closely matches that of a traditional newspaper, giving the digital edition an original-flavor reading effect. A newspaper medium with content advantages should recognize in good time the revolutionary influence that intelligent voice interaction has on information dissemination. From content distribution for voice-interaction hardware to technology-enabled content production, intelligent voice interaction and the media naturally allow multi-level cooperation, and the technology has broad application prospects.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and provides a newspaper that can listen, speak, understand, and answer. Text scanning acquires text features; a multilingual original text is converted into a sequence of word vectors; an Attention-based neural network architecture is adopted; rule weights are set for titles, opening sentences, and closing sentences; and a training set is generated by a text summarization method combining pre-training plus fine-tuning, parallel computing, and semantics;
the deep learning method includes voice interaction and learns dense doc2vec document-vector representations of news texts, which overcomes the high dimensionality, data sparseness, lack of semantics, sentence-breaking problems, and similar shortcomings of traditional text representation. It can pick up sound in real time and accurately capture valid external speech even while voice is being output, which makes the voice-interaction process more intelligent and delivers an intelligent human-machine interaction experience.
The invention achieves the aim through the following technical scheme:
a news chat method and system of intelligent digital newspaper automatic summarization and voice interaction comprises the following steps: the step S1 includes:
s11, weight sentences according to set rules: a news title strongly summarizes the news text, and the titles of major news stories often directly reflect its central idea, so sentence-weight calculation incorporates similarity to the title for better results, and weight values are adjusted by reference to the title-sentence similarity;
in the sentence-weight calculation the title comes first, and the first sentence and first paragraph are then emphasized; however, in news reports the first sentence may be a news-genre formula that does not affect the content, such as "a reporter reports", "XX net, X month X day", or "XX agency, Beijing, X month X day dispatch", so such genre sentences are filtered out first. Question and exclamation sentence patterns are not considered during intelligent news summarization. Sentence similarity is then calculated and redundant sentences are filtered;
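The byline filtering and title-referenced sentence weighting described above can be sketched as follows. This is an illustrative Python sketch, not the patent's implementation: the byline patterns, position weights, and the character-overlap similarity are all assumptions.

```python
import re

# Hypothetical byline patterns of the kind the description mentions
# ("reporter reports", "XX net/agency ... X month X day dispatch").
BYLINE_RE = re.compile(r"^(本报记者.*?报道|.{1,8}(网|社).{0,6}\d{1,2}月\d{1,2}日(讯|电))")

def strip_byline(first_sentence: str) -> str:
    """Remove a leading news-genre byline that carries no content."""
    return BYLINE_RE.sub("", first_sentence).strip()

def sentence_weight(sentence: str, title: str, position: int, total: int) -> float:
    """Position weight (first sentences count more) plus title-overlap similarity."""
    pos_w = 1.0 if position == 0 else max(0.2, 1.0 - position / total)
    s, t = set(sentence), set(title)  # character overlap as a crude similarity
    sim = len(s & t) / len(s | t) if s | t else 0.0
    return pos_w + sim
```

The regex and the character-overlap measure stand in for whatever rules and similarity the patent actually uses; real systems would use word-level features after segmentation.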
in Chinese text, a sentence has three kinds of features: word features, semantic features, and syntactic features. During sentence-similarity calculation these three kinds of features are considered together and combined with weights so that they complement one another;
sentences of Chinese text can be divided into a core part and a modifying part. The core part is the subject-predicate-object structure, which is crucial to the semantics of the sentence; the modifying part is a secondary structure, usually attributive, adverbial, or complement constructions. Because the subject and object in the subject-predicate-object structure are usually nouns or pronouns while the predicate is mostly a verb or adjective, when calculating sentence similarity the parts of speech of the words in the sentence should be tagged, keywords retained, and non-keywords filtered out;
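The weighted, complementary combination of word, semantic, and syntactic features, followed by redundant-sentence filtering, might look like the following sketch. The weights and threshold are illustrative assumptions, and the three per-pair similarity scores are taken as given rather than computed from real features.

```python
def combined_similarity(word_sim: float, semantic_sim: float,
                        syntactic_sim: float,
                        weights=(0.5, 0.3, 0.2)) -> float:
    """Weighted combination of the three feature similarities (weights illustrative)."""
    w1, w2, w3 = weights
    return w1 * word_sim + w2 * semantic_sim + w3 * syntactic_sim

def filter_redundant(sentences, sim_fn, threshold=0.8):
    """Keep a sentence only if it is not too similar to any already-kept sentence."""
    kept = []
    for s in sentences:
        if all(sim_fn(s, k) < threshold for k in kept):
            kept.append(s)
    return kept
```

In a full system `sim_fn` would itself be `combined_similarity` applied to extracted features; here any pairwise similarity function can be plugged in.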
the S12 word-segmentation strategy engine has a built-in sequence labeling model and deep learning algorithms for word segmentation. The knowledge-model algorithm analysis scores on the basis of string matching; the deep-learning-model algorithm analysis scores on the basis of K-means, the LDA algorithm, iterative decision trees, and TextCNN and TextRNN attention models; the similarity auxiliary-model algorithm analysis scores on the basis of word-distance calculation, covariance calculation, word-vector calculation, and stability calculation. The score comprises the scores of any one or more of the knowledge model, the deep learning model, and the similarity auxiliary model;
s13, scheduling handles complex multi-question, multi-intent answers. The text is preprocessed before being sent to the question-calculation model; multi-intent questions are split into multiple parts, realizing integrated scheduling of multiple algorithms; the parts are then sent to the question-calculation model to obtain multiple answers; the generated abstract expresses the core meaning of the original text; and the answer results are integrated and fed back to the reader;
the calculation model doc2vec involves two stages: training, in which the word vectors, the softmax parameters, and the paragraph/sentence vectors are obtained from known training data;
and the inference stage, in which the vector expression of a new paragraph is obtained. Specifically, a new column is added to the matrix D; with the word vectors and softmax parameters held fixed, gradient descent is run to obtain the new column of D, which is the vector expression of the new paragraph;
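The inference stage can be illustrated with a toy PV-DM-style sketch: the word vectors and softmax parameters stay frozen, and gradient descent updates only the new paragraph vector, i.e. the new column of D. The dimension, learning rate, and step count are arbitrary illustrative choices, not values from the patent.

```python
import math
import random

def infer_paragraph_vector(context_ids, word_in, word_out, dim=4,
                           steps=50, lr=0.1, seed=0):
    """Train only a new paragraph vector d; word_in (input word vectors)
    and word_out (softmax weights) are kept fixed, as in doc2vec inference."""
    rng = random.Random(seed)
    d = [rng.uniform(-0.5, 0.5) for _ in range(dim)]
    vocab = len(word_out)
    for _ in range(steps):
        for i, target in enumerate(context_ids):
            ctx = [w for j, w in enumerate(context_ids) if j != i]
            # hidden state: paragraph vector averaged with fixed context vectors
            h = [(d[k] + sum(word_in[w][k] for w in ctx)) / (1 + len(ctx))
                 for k in range(dim)]
            scores = [sum(h[k] * word_out[v][k] for k in range(dim))
                      for v in range(vocab)]
            m = max(scores)
            exps = [math.exp(s - m) for s in scores]
            z = sum(exps)
            probs = [e / z for e in exps]
            # softmax gradient w.r.t. h, pushed only into d (everything else frozen)
            for k in range(dim):
                g = sum((probs[v] - (1.0 if v == target else 0.0)) * word_out[v][k]
                        for v in range(vocab))
                d[k] -= lr * g / (1 + len(ctx))
    return d
```

Libraries such as gensim expose this operation directly (e.g. an `infer_vector`-style call); the sketch only shows the "freeze everything but the new column of D" idea.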
the calculation model realizes integrated scheduling of multiple algorithms and remedies the defects of traditional algorithms on the short-text classification problem in three respects. First, the structure of the single-layer neural network is improved: compared with existing recurrent units such as LSTM (Long Short-Term Memory) and GRU (Gated Recurrent Unit), structural units suited to short-text classification tasks are found, and the output of the recurrent network structure is improved. The traditional method takes only the output of the last step as the semantic representation of the short text, whereas the invention adopts the idea of the attention neural network to fuse the forward and backward outputs of the recurrent network, thereby obtaining a better short-text representation. Second, the inputs and intermediate parameters of the neural network are optimized: the input variables and the network structure are pre-trained with word vectors and an autoencoder respectively, and comparative tests show that the pre-training process helps parameter convergence in the network and so yields a better classification effect. Finally, the invention introduces an improved multi-layer neural network fusion method for short-text classification: whereas a traditional deep network simply takes each single-layer output as the next layer's input and stacks layer upon layer, here the relation between layers of the multi-layer recurrent network is improved with the gating idea of the LSTM, further optimizing the semantic representation of the short text. Experimental results show that the classification effect of the improved multi-layer network is superior to that of a single-layer network, which greatly facilitates the management and storage of information. Speech synthesis and natural language understanding are applied to the text, integrated scheduling of the various algorithms is realized, and an abstract training set is generated;
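The attention-style fusion of the forward and backward outputs of the recurrent network can be illustrated as follows. This toy sketch uses the last concatenated state as the query and has no learned parameters, which is a simplification of the mechanism described, not the patent's architecture.

```python
import math

def attention_fuse(forward_states, backward_states):
    """Fuse per-step forward and backward hidden states with dot-product
    attention over the concatenated sequence (illustrative, parameter-free)."""
    # concatenate forward and backward vectors at each time step
    states = [f + b for f, b in zip(forward_states, backward_states)]
    query = states[-1]  # last state acts as the query
    scores = [sum(q * s for q, s in zip(query, st)) for st in states]
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    alphas = [e / z for e in exps]  # softmax attention weights
    dim = len(states[0])
    # attention-weighted sum over all steps = fused short-text representation
    return [sum(a * st[k] for a, st in zip(alphas, states)) for k in range(dim)]
```

A trained model would use learned projections for the query and keys; the softmax-weighted sum over all steps is the part that replaces "take only the last output".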
s3, the intelligent voice-interaction wake mode (wake word "xiaoxin"): intelligent voice interaction comprises voice acquisition, voice preprocessing, and voice processing. Voice acquisition captures sound information in real time; voice preprocessing, connected to the voice acquisition, receives the sound information and applies ultrasonic filtering to obtain intelligent voice of the "can listen, can speak, understands you, can answer" type;
the intelligent voice-interactive news chat has two states, a wake mode and a non-wake mode, and comprises:
voice acquisition, which captures sound information in real time;
voice preprocessing, connected to the voice acquisition, which receives the sound information and applies ultrasonic filtering to it to obtain a target voice; in non-wake mode it judges whether the target voice is the set wake word, entering wake mode if so and remaining in non-wake mode if not;
voice recognition, which in wake mode recognizes all target voices to obtain target content;
search, connected to the voice recognition and to the store of pre-stored answer sentences, which in wake mode retrieves the response content from the store according to the target content;
an output connected to the search, which in wake mode obtains the response content and outputs it;
the intelligent voice-interactive news chat enters non-wake mode when no content has been output within the set time and no sound information is acquired.
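The wake/non-wake behaviour listed above can be sketched as a small state machine. This is an illustrative sketch: the timeout value, the `answer` placeholder, and the injected clock are assumptions, not the patent's implementation.

```python
import time

WAKE_WORD = "xiaoxin"  # the wake word named in the description

class VoiceInteraction:
    """Minimal sketch of the wake-mode / non-wake-mode state machine."""

    def __init__(self, idle_timeout=30.0, clock=time.monotonic):
        self.awake = False
        self.idle_timeout = idle_timeout
        self.clock = clock  # injectable for testing
        self.last_activity = self.clock()

    def on_speech(self, text):
        """Handle one recognized utterance; returns a reply or None."""
        now = self.clock()
        if not self.awake:
            # non-wake mode: only the wake word is acted on
            if WAKE_WORD in text.lower():
                self.awake = True
                self.last_activity = now
            return None
        self.last_activity = now
        return self.answer(text)

    def tick(self):
        """Fall back to non-wake mode after idle_timeout with no activity."""
        if self.awake and self.clock() - self.last_activity > self.idle_timeout:
            self.awake = False

    def answer(self, text):
        # placeholder for the search over the pre-stored answer base
        return f"answer for: {text}"
```

The real system would feed this from the voice preprocessing and recognition stages; here `on_speech` simply receives already-transcribed text.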
The deep-learning-based intelligent voice-interactive news chat as described above optionally further comprises a mode control, which is electrically connected to the voice preprocessing, the voice recognition, the search, and the output respectively;
the mode control obtains mode information and sends the current mode information to the voice preprocessing, the voice recognition, the search, and the output respectively;
in non-wake mode, the mode control generates a wake-state identifier according to the voice preprocessing's judgment of whether the target voice is the set wake word, and when the target voice is the set wake word it outputs the wake-state identifier to the voice preprocessing, the voice recognition, the search, and the output respectively;
in wake mode, the mode control records the time node at which the response content is output and monitors in real time whether the voice preprocessing acquires target content; if no target content is acquired within the set time, it generates a non-wake-state identifier and outputs it to the voice preprocessing, the voice recognition, the search, and the output respectively;
the method S3 is specifically as follows:
s31, sound is picked up in real time in wake mode, so that valid external speech can be captured accurately even while voice is being output;
s32, multi-intent judgment processing, multi-model algorithm analysis, a data strategy engine, and integration processing, organized as a multi-model algorithm analysis module, a data strategy engine module, and an integration processing module. Recognition of multiple intents in short text is a problem in Spoken Language Understanding (SLU): short texts have sparse features and few words yet carry a great deal of information, so extracting effective features for classification is difficult. To solve this, syntactic features and a convolutional neural network (CNN) are combined into a multi-intent recognition model. First, dependency syntax analysis determines whether a sentence contains multiple intents; then, term frequency-inverse document frequency (TF-IDF) and the trained word vectors are used to compute a distance matrix and determine the number of intents; finally, the distance matrix is used as input to the CNN model for intent classification;
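The TF-IDF distance-matrix step can be sketched as follows. Here clause token lists stand in for the segmented question parts, and the trained word vectors are omitted for brevity, which is a simplifying assumption relative to the described method.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """docs: list of token lists. Returns one sparse TF-IDF dict per doc."""
    n = len(docs)
    df = Counter(t for d in docs for t in set(d))  # document frequency
    out = []
    for d in docs:
        tf = Counter(d)
        out.append({t: (tf[t] / len(d)) * math.log((1 + n) / (1 + df[t]))
                    for t in tf})
    return out

def distance_matrix(vecs):
    """Pairwise cosine distances between clause vectors; well-separated
    clusters in this matrix approximate the number of distinct intents."""
    def cos(a, b):
        dot = sum(a[t] * b.get(t, 0.0) for t in a)
        na = math.sqrt(sum(v * v for v in a.values()))
        nb = math.sqrt(sum(v * v for v in b.values()))
        return dot / (na * nb) if na and nb else 0.0
    return [[1.0 - cos(a, b) for b in vecs] for a in vecs]
```

In the described model this matrix would then be fed to the CNN classifier; word-vector distances would be blended with the TF-IDF term to capture semantics.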
further, step S32 combines semantic expansion and CNN methods to classify the news data sets: information is first extracted from the headlines, and semantic expansion is then performed using the CNN;
further, the S32 multi-intent judgment processing module, the multi-model algorithm analysis module, the data strategy engine module, and the integration processing module are each connected to the language processing module for data transmission;
the multi-intent judgment processing analyzes whether the reader's dialogue text contains multiple intents. It is connected to the word-segmentation strategy engine: it receives the text, performs primary filtering through the word-segmentation strategy engine, judges the reader's multiple intents, and feeds the judgment back to the scheduling. After the scheduling obtains the data fed back by the multi-intent judgment processing, it selects according to the question data in the text and invokes the multi-model algorithm analysis to obtain the scores it generates; after obtaining the scores, the scheduling invokes the integration processing, which screens by weight according to the scores to produce calculation-result data and passes it to the data strategy engine, and the data strategy engine processes the calculation result to generate the analysis. The multi-model algorithm analysis has built-in knowledge-model algorithm analysis, deep-learning-model algorithm analysis, and similarity auxiliary-model algorithm analysis.
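The weight screening of multi-model scores by the integration processing might look like this sketch; the model names, weights, and threshold are illustrative assumptions, not values from the patent.

```python
def integrate(scores, weights=None, threshold=0.5):
    """scores: {model_name: {answer: score}}. Combine per-answer scores as a
    weighted sum across models, then screen against a threshold."""
    weights = weights or {m: 1.0 / len(scores) for m in scores}
    combined = {}
    for model, per_answer in scores.items():
        w = weights.get(model, 0.0)
        for answer, s in per_answer.items():
            combined[answer] = combined.get(answer, 0.0) + w * s
    ranked = sorted(combined.items(), key=lambda kv: kv[1], reverse=True)
    return [(a, s) for a, s in ranked if s >= threshold]
```

With the three analysis families named above (knowledge model, deep learning model, similarity auxiliary model) as the score sources, only answers whose combined weighted score clears the threshold are fed back to the reader.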
In intelligent voice-interactive news chat, if no content is output within a set time, the system enters the non-wake mode;
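The wake/non-wake switch described above can be sketched as a small state machine. This is a minimal illustration, not the patent's implementation: the class name, the injectable clock, and the 30-second default timeout are all assumptions.

```python
import time

class WakeStateMachine:
    """Toy sketch: after `timeout` seconds with no activity, the session
    drops back to the non-wake mode and must hear the wake word again.
    Names and defaults here are illustrative, not from the patent."""

    def __init__(self, timeout=30.0, wake_word="Xiaoxin", now=time.monotonic):
        self.timeout = timeout
        self.wake_word = wake_word
        self.awake = False
        self._now = now                 # injectable clock for testing
        self._last_activity = self._now()

    def on_input(self, text):
        """Process one utterance; return True if the system is awake after it."""
        if not self.awake:
            if self.wake_word in text:
                self.awake = True
                self._last_activity = self._now()
            return self.awake
        if self._now() - self._last_activity > self.timeout:
            self.awake = False          # silence expired: back to non-wake mode
            return self.on_input(text)  # re-check the utterance for the wake word
        self._last_activity = self._now()
        return True
```

Injecting the clock keeps the timeout behavior testable without real waiting.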
a news chat method and system with intelligent digital newspaper automatic summarization and voice interaction are disclosed; the system comprises:
Doc2Vec, also called Paragraph Vector (Para2Vec), is an unsupervised algorithm that obtains vector representations of sentences, paragraphs, and documents; it is an extension of Word2Vec. Similarity between sentences/paragraphs/documents is found by computing distances between their vectors, which is used for text clustering; for labeled data, texts can also be classified by supervised learning;
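The distance-based similarity described above can be illustrated without a trained model. In the sketch below, a toy lookup table of word vectors stands in for a trained Doc2Vec model, and the document vector is a simple mean of its word vectors; both simplifications are assumptions made for illustration.

```python
import math

def doc_vector(tokens, word_vecs):
    """Crude document embedding: mean of the known word vectors.
    A trained Doc2Vec model would be used in practice; `word_vecs`
    is a toy lookup table standing in for it."""
    hits = [word_vecs[t] for t in tokens if t in word_vecs]
    if not hits:
        return [0.0] * len(next(iter(word_vecs.values())))
    dim = len(hits[0])
    return [sum(v[i] for v in hits) / len(hits) for i in range(dim)]

def cosine(u, v):
    """Cosine similarity between two dense vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0
```

Documents about related topics then score higher than unrelated ones, which is the basis of the clustering step mentioned above.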
an automatic speech recognition module: ASR, Automatic Speech Recognition, which converts speech to text;
a spoken language understanding module: SLU, Spoken Language Understanding, which understands the reader's intent from the recognized text;
a dialog management module: DM, Dialog Management, which manages the dialog between machine and reader;
a language processing module: NLP, Natural Language Processing;
a speech synthesis module: TTS, Text To Speech, which converts text to speech;
the multi-intent judgment processing module: MIM, Multi-Intent judgment Module;
the multi-model algorithm analysis module: MAM, Multi-model Algorithm analysis Module;
the data strategy engine module: DSM, Data Strategy engine Module;
the word segmentation strategy engine: SSE, Segmentation Strategy Engine;
an integration processing module: DI, Data Integration;
furthermore, the intelligent digital newspaper automatic summarization and voice-interactive news chat method and system adopt an Attention neural network architecture, set rule weights for titles, beginning sentences, and ending sentences, and generate a training set with a text summarization method based on pre-training plus fine-tuning, parallel computing, and semantics. In the pipeline, ASR converts the reader's speech into text, SLU understands the reader's intent and extracts key information from the text, DM manages the dialog between machine and reader, and TTS returns the machine-generated text to the reader as speech. The machine's accuracy of semantic understanding depends on the accuracy of ASR, but above all on the accuracy of SLU;
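The ASR → SLU → DM → TTS loop described above can be sketched as a toy pipeline. Every component here is a stand-in: the "audio" is already a transcript, the SLU is keyword spotting rather than a trained model, and the intent names and replies are invented for illustration.

```python
def asr(audio):
    """Stand-in ASR: our toy 'audio' is already a transcript string."""
    return audio

def slu(text):
    """Toy SLU: keyword spotting in place of a trained intent model.
    The intents and keywords are illustrative assumptions."""
    intents = {"summary": ["summary", "abstract"], "read": ["read", "news"]}
    for intent, kws in intents.items():
        if any(k in text for k in kws):
            return {"intent": intent, "text": text}
    return {"intent": "chitchat", "text": text}

def dm(frame, state):
    """Toy dialog manager: map the SLU frame to a reply, record the turn."""
    state.append(frame["intent"])
    replies = {"summary": "Here is today's summary.",
               "read": "Reading the headline story.",
               "chitchat": "Tell me more."}
    return replies[frame["intent"]]

def tts(text):
    """Stand-in TTS: tag the text instead of synthesizing audio."""
    return f"<speech>{text}</speech>"

def chat_turn(audio, state):
    """One full turn of the ASR -> SLU -> DM -> TTS loop."""
    return tts(dm(slu(asr(audio)), state))
```

The point of the sketch is the data flow: the reply quality is bounded by each stage, with SLU (intent understanding) as the critical link, as the text notes.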
In summary, the intelligent digital newspaper automatic summarization and voice-interactive news chat method and system of the present invention first generate an initial representation, or embedding, of each word (represented by an open circle) using Attention. Information is then aggregated from the other words with the self-attention mechanism, and each word in context generates a new representation (represented by a filled circle) according to the sentence rule weights set for the title, beginning, and end. This step is repeated multiple times in parallel, successively generating new representations for all words. The combined system of preprocessing, strategy flow, speech synthesis, speech recognition, language processing, and scheduling realizes integrated scheduling of multiple algorithms: it schedules the calculation of multiple algorithm models and synthesizes their results according to the set rules to obtain an optimal solution, overcoming the blind spots of any single algorithm model and achieving a complementary effect. Voice information is acquired in real time, so the corresponding effective speech can be recognized whether the interaction is in the wake mode or the non-wake mode. In the wake mode, the reader need not prefix every sentence with a specific wake word, making interactive news chat freer and more natural and improving its intelligence. Moreover, because speech is captured in real time, effective speech can be accurately recognized even while the system is outputting speech, so the interactive news chat can be interrupted mid-output, making communication more efficient and smooth.
On the basis of the existing technical level in the intelligent voice interaction field, further architectural optimization and content refinement are performed. The voice acquisition module captures voice information in real time, so the corresponding effective speech can be recognized whether the interactive voice is in the wake mode or the non-wake mode; in the wake mode the reader need not prefix each sentence with a specific wake word, making the interaction freer and more natural and improving its intelligence; and because capture is continuous, effective speech is accurately recognized even during voice output, so the interaction can be interrupted mid-output and communication is more efficient and smooth. Specific scenarios can be customized, complex multi-intent scenarios are better handled, and historical and current data are combined in a diversified manner. The attention module consists of two parts: the first is a channel attention vector matrix M_C, which selects channels; the other is a spatial attention vector matrix M_S, which selects the regions that need attention in the vector matrix space. A given feature vector matrix F ∈ R^{C×H×W} passes through the self-attention module to give the outputs:
F' = M_C(F) × F,
F'' = M_S(F') × F',
In the self-attention neural network, channel information generally represents different feature information of the doc2vec word vectors obtained by document quantization; by selecting channels, the network can better attend to the information in the doc2vec word vectors that is useful for the task. To realize channel selection, the global average pooling and global max pooling of the feature doc2vec word vectors are computed, passed through fully connected layers, and added to obtain the channel attention parameters; the two branches share the same fully connected network.
M_C(F) = σ(MLP(AvgPool(F)) + MLP(MaxPool(F)))
Then, at each coordinate of the character-sequence features, global max and average pooling of the feature vector matrix are performed across channels, and the result passes through the self-attention operation to obtain the spatial attention vector matrix:
M_S(F) = σ(f([AvgPool(F); MaxPool(F)]))
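The channel and spatial attention described above can be sketched on a tiny C×H×W tensor. This is a deliberately simplified, dependency-free illustration: the shared MLP is collapsed to identity (so each channel gate is just σ(avg + max)), and the spatial branch uses a per-position pooling gate instead of a learned convolution; both simplifications are assumptions, not the CBAM formulation itself.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def channel_attention(F):
    """Per-channel gate from global average- and max-pooling.
    The shared MLP is collapsed to identity to keep the sketch small."""
    gates = []
    for ch in F:                           # ch: H x W grid for one channel
        flat = [v for row in ch for v in row]
        avg = sum(flat) / len(flat)
        mx = max(flat)
        gates.append(sigmoid(avg + mx))
    return gates

def apply_channel_attention(F):
    """Scale every channel of F by its attention gate (F' = M_C(F) x F)."""
    g = channel_attention(F)
    return [[[v * g[c] for v in row] for row in F[c]] for c in range(len(F))]

def spatial_attention(F):
    """Per-position gate from average- and max-pooling across channels
    (a pooling stand-in for the learned spatial-attention function)."""
    H, W = len(F[0]), len(F[0][0])
    gate = [[0.0] * W for _ in range(H)]
    for i in range(H):
        for j in range(W):
            col = [F[c][i][j] for c in range(len(F))]
            gate[i][j] = sigmoid(sum(col) / len(col) + max(col))
    return gate
```

Channels and positions holding strong activations receive gates above 0.5, while all-zero ones sit exactly at σ(0) = 0.5, which is the "attend to useful information" behavior the text describes.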
In this way, automatic extraction from the digital electronic newspaper, automatic multi-intent summarization, speech recognition and interaction, and voice news chat are achieved.
The intelligent digital newspaper automatic summarization and voice-interactive news chat method and system adopt a self-attention-layer neural network to generate intelligent text summaries. The system covers algorithm, scheduling, wake-up, search, speech recognition, synthesis, and news chat technologies; in particular it provides intelligent summarization and news chat, efficiently distilling concise, readable summary texts from massive news and supporting personalized news recommendation, topic recognition, and tracking. With real-time voice pickup and ultrasonic filtering, the attention neural network overcomes the high dimensionality, data sparsity, and lack of semantics of traditional text methods, and the voice output delivers an intelligent human-computer interaction experience of the "carefully selecting, listening, speaking, understanding you, answering" type.
Drawings
FIG. 1 is a flow chart of a method and system for intelligent digital newspaper automatic summarization and voice interactive news chat according to the present invention;
FIG. 2 is a block diagram of S1 according to the present invention;
FIG. 3 is a flow chart of S2 according to the present invention;
FIG. 4 is a flow chart of an embodiment of S3;
FIG. 5 shows the encoder-decoder architecture: (a) the conventional architecture; (b) the model architecture with an attention mechanism added;
FIG. 6 is a flow chart illustrating the basic principle of the speech recognition system of the present invention.
Detailed Description
Various exemplary embodiments of the present invention will now be described in detail with reference to the accompanying drawings. It should be noted that: the relative arrangement of the components and steps, the numerical expressions and numerical values set forth in these embodiments do not limit the scope of the present invention unless specifically stated otherwise.
In the practical application of the present embodiment, the following description of at least one exemplary example is merely illustrative in nature and is in no way intended to limit the invention, its application, or uses.
Techniques, methods and apparatus known to those of ordinary skill in the relevant art may not be discussed in detail, but are intended to be part of the specification where applicable.
In all methods shown and discussed herein, any particular value should be construed as merely exemplary and not as limiting; thus, other examples of the method embodiments may have different values.
It should be noted that: like symbols and letters represent like items in the following figures, and thus, once an item is defined in one figure, it need not be further discussed in subsequent figures. Other features of the present invention and its advantages will become apparent from the following detailed description of the invention with reference to the accompanying drawings, as illustrated in fig. 1 to 6. The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description, serve to explain the principles of the invention.
S1, acquiring a digital newspaper text: text features are collected by text scanning, the multi-semantic original text is converted into a word-vector sequence composed of multiple word vectors, rule weights for titles, beginning sentences, and ending sentences are set through the Attention self-attention neural network architecture, and a training set is generated with a text summarization method based on pre-training plus fine-tuning, parallel computing, and semantics;
S2, using the rule weights obtained above for the title, beginning, and ending sentences of the digital newspaper, the character sequences and text words to be recognized are concentrated to reduce the influence of background information, so that recognition is less susceptible to interference. Based on the Attention self-attention neural network model, an Attention module is designed over the channels of the vector-matrix features, with cross entropy as the model loss function and the channel and spatial information obtained by max pooling selected. The dense doc2vec word-vector representation of the news text, obtained by document quantization, overcomes the high dimensionality, data sparsity, lack of semantics, and sentence-breaking problems of traditional text representations. With the news-text word vectors as input, the attention neural network automatically learns and extracts news-text features, avoiding the labor cost and error accumulation of traditional news-text classification methods that rely on manual feature extraction. The model can put its "attention" on more useful information; the module applies self-attention neural network architectures with the attention capability of the CBAM module to obtain the final summary result.
Real-time sound collection is thereby achieved, and effective external speech can be accurately obtained even during voice output, making the voice interaction process more intelligent and delivering an intelligent human-computer interaction experience.
S3, with the wake word "Xiaoxin" of the intelligent voice interaction wake mode described in the text, an intelligent voice input port of the "listening, speaking, understanding you, answering" type is obtained.
Further, S1 includes: S11, according to the set rules, the news title strongly summarizes the news body, and the titles of major news directly reflect the central idea of the text; sentence weights are therefore calculated with reference to title-sentence similarity, weighting each sentence's value by its similarity to the title, which yields better results;
in sentence weighting, the title comes first, followed by the first sentence and first paragraph. In news reports, however, the first sentence may be a genre lead-in that does not affect the news content, such as "a reporter reports", "XX Net, X month X day", or "XX News Agency, Beijing, X month X day"; such genre lead-ins are therefore filtered out first. Question and exclamation sentence patterns are not considered for intelligent news summarization. Sentence similarity is then calculated to filter redundant sentences;
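The weighting rule above (title first, positional bonus for opening/closing sentences, genre lead-ins stripped) can be sketched as follows. The lead-in patterns, the word-overlap similarity, and the 0.5 bonus values are all illustrative assumptions; the patent gives Chinese lead-in examples, for which English stand-ins are used here.

```python
import re

# Hypothetical news-genre lead-ins to strip before weighting
# (English stand-ins for patterns like "XX Net, X month X day").
GENRE_LEADINS = [r"^Reporter[^,]*,\s*", r"^\w+ News, \w+ \d+ --\s*"]

def strip_leadin(sentence):
    """Remove a genre lead-in from the start of a sentence, if present."""
    for pat in GENRE_LEADINS:
        sentence = re.sub(pat, "", sentence)
    return sentence

def overlap_sim(a, b):
    """Jaccard word-overlap as a cheap proxy for title-sentence similarity."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / len(wa | wb) if wa | wb else 0.0

def sentence_weight(sentence, title, position, n_sentences, alpha=0.5):
    """Base positional weight (first/last sentence favored) plus a
    title-similarity bonus, per the weighting rule described above."""
    base = 1.0
    if position == 0 or position == n_sentences - 1:
        base += 0.5
    return base + alpha * overlap_sim(strip_leadin(sentence), title)
</```

A real system would swap the overlap proxy for the word/semantic/syntactic similarity combination the text describes.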
in Chinese text, a sentence has three kinds of characteristics: word features, semantic features, and syntactic features. When computing sentence similarity, all three types are considered together, organically weighted, combined, and made mutually complementary;
sentences in Chinese text can be divided into a core part and a modifying part: the core part is the subject-predicate-object structure, which is crucial to the sentence's semantics, while the modifying part is secondary, usually attributive, adverbial, or complement structures. Because the subject and object in a subject-predicate-object structure are often nouns or pronouns and the predicate is mostly a verb or adjective, when computing sentence similarity the parts of speech of the words in the sentence should be tagged, keywords retained, and non-keywords filtered out;
S12, the word segmentation strategy engine has a built-in sequence labeling model and deep learning algorithm for word segmentation. The knowledge-model algorithm analysis scores based on string matching; the deep-learning-model algorithm analysis scores based on K-means, the LDA algorithm, iterative decision trees, and TextCNN and TextRNN attention models; the similarity-auxiliary-model algorithm analysis scores based on word-distance, covariance, word-vector, and stability calculations. The score comprises the scores of any one or more of the knowledge model, the deep learning model, and the similarity auxiliary model;
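The "weighted screening" that combines scores from the knowledge, deep-learning, and similarity models can be sketched in a few lines. The model names, the weight split, and the candidate format are illustrative assumptions, not the patent's exact rule.

```python
def integrate_scores(candidate_scores, weights=None):
    """Weighted-screening sketch: each candidate answer carries per-model
    scores; the integration step combines them under fixed weights and
    returns the best candidate. Weights here are illustrative."""
    weights = weights or {"knowledge": 0.4, "deep": 0.4, "similarity": 0.2}

    def combined(scores):
        return sum(weights.get(m, 0.0) * s for m, s in scores.items())

    best = max(candidate_scores, key=lambda c: combined(c["scores"]))
    return best["answer"], combined(best["scores"])
```

This mirrors the idea that no single model decides alone: a candidate weak under one model can still win on the weighted total, which is the "complementary effect" the text claims for multi-model scheduling.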
S13, for complex answers such as multiple questions and multiple intents, the scheduler performs preliminary preprocessing before the text is sent to the question calculation model: the multi-intent split decomposes a multi-intent question into multiple parts, realizing integrated scheduling of multiple algorithms, and the parts are then sent to the question calculation model; the generated summary expresses the core meaning of the original text, and the answer results are integrated and fed back to the reader. The intelligent digital newspaper automatic summarization and voice-interactive news chat method and system of the present invention comprise the following:
the method adopts attention-based classification: with an attention neural network architecture and attention network training, text elements are mapped into fixed-length vectors whose mutual distances capture the semantic correlation between text elements, overcoming the drawbacks of one-hot vectors, whose dimensionality is too high and which cannot capture relations between text elements. On the basis of distributed text representation, a new text classification algorithm and a multi-document automatic summarization algorithm are designed. To address the huge dimensionality and extreme sparsity of the text vector model, a text concept word-vector model is designed that combines the distributed representation of words with a text vector model representation. Words in the text are mapped into word vectors; words with high semantic relevance are clustered into concepts by word-vector clustering; a concept directed vector model is then constructed according to the order of the words; the adjacency matrices corresponding to the concept vector models of the text are stored as vector sequences, converting natural language processing tasks into vector-sequence processing tasks and realizing the mapping from text to vector sequences; and a multi-layer self-attention neural network is designed. The text vector sequences are classified, and comparison with other text classification algorithms shows that the proposed algorithm outperforms the other three, alleviating the summary-sentence redundancy problem of Chinese multi-document automatic summarization.
The distributed representation of sentences is combined with a spectral clustering algorithm to design a multi-document automatic summarization algorithm based on an attention algorithm and spectral clustering: sentences in the text are mapped into sentence vectors, the sentence vectors are clustered by spectral clustering, and the documents are divided into sub-topic documents. A sentence-relation vector-sequence model is built within each sub-topic document, and the sentence weights are iterated with the Attention algorithm.
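The "iterating the sentence weight" step above can be sketched as power iteration over a sentence-similarity matrix, in the spirit of TextRank. This is a stand-in for the patent's attention-based iteration, under the assumption of a precomputed symmetric similarity matrix and a 0.85 damping factor.

```python
def iterate_sentence_weights(sim, damping=0.85, iters=50):
    """TextRank-style power iteration: each sentence distributes its
    weight to its neighbors in proportion to similarity. `sim` is a
    square similarity matrix with zero diagonal."""
    n = len(sim)
    w = [1.0 / n] * n
    for _ in range(iters):
        new = []
        for i in range(n):
            rank = 0.0
            for j in range(n):
                if j == i:
                    continue
                out_sum = sum(sim[j][k] for k in range(n) if k != j)
                if out_sum > 0:
                    rank += sim[j][i] / out_sum * w[j]
            new.append((1 - damping) / n + damping * rank)
        w = new
    return w
```

Sentences similar to many others accumulate weight and become summary candidates; the highest-weight sentences per sub-topic cluster would then be extracted.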
Further, S3 includes:
S31, sound is picked up in real time in the wake mode, so effective external speech can be accurately obtained even during voice output;
S32, multi-intent judgment processing, multi-model algorithm analysis, a data strategy engine, and integration processing; the system correspondingly comprises a multi-intent judgment processing module, a multi-model algorithm analysis module, a data strategy engine module, and an integration processing module. Recognizing multiple intents in a short text is a problem in Spoken Language Understanding (SLU): short texts have sparse features and few words yet carry a large amount of information, so effective features are hard to extract for classification. To address this, syntactic features are combined with a convolutional neural network (CNN) to build a multi-intent recognition model. First, dependency syntax analysis is applied to a sentence to determine whether it contains multiple intents; next, a distance matrix is computed from term frequency-inverse document frequency (TF-IDF) weights and trained word vectors to determine the number of intents; finally, the distance matrix is fed into a CNN model for intent classification;
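The TF-IDF/distance-matrix step of the multi-intent model can be sketched without any trained components. In this minimal, dependency-free illustration, the smoothed IDF, the cosine distance, and the greedy threshold clustering used to count intents are all assumptions standing in for the patent's word vectors and CNN.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """TF-IDF vector (word -> weight) per tokenized clause; smoothed IDF."""
    n = len(docs)
    df = Counter(w for doc in docs for w in set(doc))
    return [{w: (c / len(doc)) * (math.log((1 + n) / (1 + df[w])) + 1.0)
             for w, c in Counter(doc).items()} for doc in docs]

def cosine_distance(u, v):
    """1 - cosine similarity between two sparse (dict) vectors."""
    dot = sum(u[w] * v.get(w, 0.0) for w in u)
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    if nu == 0 or nv == 0:
        return 1.0
    return 1.0 - dot / (nu * nv)

def distance_matrix(clauses):
    """Pairwise distance matrix over the clauses of one utterance."""
    vecs = tfidf_vectors(clauses)
    return [[cosine_distance(a, b) for b in vecs] for a in vecs]

def estimate_intent_count(clauses, threshold=0.8):
    """Greedy clustering: clauses farther than `threshold` from every
    existing center start a new intent (toy stand-in for the CNN step)."""
    vecs = tfidf_vectors(clauses)
    centers = []
    for v in vecs:
        if all(cosine_distance(v, c) > threshold for c in centers):
            centers.append(v)
    return len(centers)
```

Two clauses about the weather collapse into one intent while an unrelated stock-price clause opens a second, matching the "determine the number of intents" role of the distance matrix.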
S32 also classifies the news data sets by combining semantic expansion with the CNN method: information is first extracted from the headlines, and semantic expansion is then performed using the CNN;
the S32 multi-intent judgment processing module, multi-model algorithm analysis module, data strategy engine module, and integration processing module are each connected to the language processing module for data transmission;
S33, multi-intent judgment processing analyzes whether the reader's dialog text contains multiple intents. It is connected to a word segmentation strategy engine: it receives the text, performs primary filtering through the word segmentation strategy engine, judges the reader's intents, and feeds the judgment back to the scheduler. After the scheduler obtains this feedback, it selects according to the question data in the text and calls the multi-model algorithm analysis to obtain the scores it generates; the scheduler then calls the integration processing, which performs weighted screening according to the scores, and passes the resulting calculation data to the data strategy engine, which processes the result to generate the analysis. The multi-model algorithm analysis internally provides knowledge-model algorithm interpretation, deep-learning-model algorithm interpretation, and similarity-auxiliary-model algorithm analysis.
(1) The invention relates to a digital newspaper technology for automatically summarized newspapers, characterized by attention-based classification using an attention neural network architecture. In the text summarization task, the attention algorithm relates only some words of the input sequence to the next predicted output value; in the tagging task, some local information is more closely related to the next tagged word. The attention mechanism integrates these relationships, allowing the model to dynamically focus on specific portions of the input and accomplish the task at hand more efficiently.
An attention model is added to the neural network structure of the invention for three reasons. First, the attention model achieves very good performance on many tasks, such as question answering, sentiment analysis, and part-of-speech tagging. Second, the attention mechanism increases the interpretability of the neural network structure; since traditional neural networks are black-box models, improving interpretability is crucial to the fairness, reliability, and transparency of machine learning models. Third, it helps alleviate some drawbacks of recurrent neural networks, such as performance degradation as the input sequence grows longer and the computational inefficiency of processing inputs sequentially.
The conventional encoder-decoder structure has two major drawbacks. First, the encoder must compress all input information into a fixed-length vector; using such a simple fixed-length encoding to represent longer, more complex inputs tends to lose input information. Second, the structure does not model the correspondence between input and output sequences, which is important in text summarization tasks: intuitively, each position of the output sequence may be affected by particular positions of the input sequence, yet the classical decoding architecture does not take this correspondence into account when generating the output.
The attention model overcomes both drawbacks of the conventional architecture by allowing the decoder to access all the encoder-generated outputs. The core idea is that all encoder outputs are weighted, combined, and then fed into the decoder at the current position to influence its output. By weighting the encoder outputs, more context information from the original data can be used while also achieving alignment between input and output. For text classification and recommendation tasks, the input is a sequence but the output is not; in this scenario, an attention mechanism can capture the association between each unit (e.g., each word) of the input sequence and the other units of the same sequence. Here the candidate states and the query states of the attention model come from the same sequence, and an attention model based on this mechanism is called a self-attention model.
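The core idea above, a weighted combination of all encoder outputs steered by the decoder's current state, can be written down directly. The dot-product scoring used here is one common choice among several; the function names are illustrative.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def attention_context(query, encoder_outputs):
    """Weighted combination of all encoder outputs: dot-product scores
    against the decoder query, normalized by softmax, then summed."""
    scores = [sum(q * h for q, h in zip(query, out)) for out in encoder_outputs]
    weights = softmax(scores)
    dim = len(encoder_outputs[0])
    context = [sum(w * out[i] for w, out in zip(weights, encoder_outputs))
               for i in range(dim)]
    return context, weights
```

The returned weights are exactly the input-output alignment the text describes: the encoder position most similar to the query dominates the context vector fed to the decoder.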
When the attention weight is calculated only from the original input sequence, the attention model may be called a single-layer attention model. On the other hand, the input sequence can be abstracted several times, letting the context vector of the lower abstraction become the query state of the next: this method of stacking several attention modules over the input data to achieve multi-layer abstraction of sequence data may be called a multi-layer attention model. More specifically, multi-layer attention models can be further divided according to whether the model weights are learned top-down or bottom-up. One typical application of the multi-layer attention mechanism is text classification via a two-layer abstraction over the text, a model called the "Hierarchical Attention Model (HAM)". Because an article is composed of different sentences and each sentence of different words, HAM can exploit the natural hierarchical structure of articles: an attention model is first built over the basic words of each sentence to obtain a feature representation of the sentence, and the sentence representations are then fed into a subsequent attention model to construct a feature representation of the entire text. The resulting representation of the whole text can then be fed to a downstream classifier. A key issue in machine translation is how to model attention with neural network techniques to better align sentences of different languages; the advantages of the attention model become even more apparent when translating longer sentences.
By introducing the attention mechanism into question answering, the model can be helped to understand the question by attending to its more important parts, while the large amount of stored information also helps the question-answering chat find a suitable answer. In addition, modeling multi-modal data in the question-answering task with an attention model can improve system performance. The multimedia description task generates a natural-language description for a multimedia input sequence, which may be speech, images, or video. Similar to news chat, attention focuses on the relevant parts of the speech or image input, finding the relevant acoustic signals and predicting the next word of the headline. The task of titling videos can also be accomplished by combining the spatiotemporal structure of multimedia data, such as video, with a multi-layer attention mechanism, where the lower layer captures specific regions within a video frame and the higher layer extracts a small subset of the video frames.
(2) Having a machine understand human speech and execute tasks according to human intent is a broad interdisciplinary field, touching on signal processing, neuropsychology, artificial intelligence, computer science, linguistics, and communications. Intelligent voice interaction is a systems engineering effort, generally involving speech recognition, natural language understanding, dialog management, natural language generation, speech synthesis, and their integrated application. The flow of natural language understanding, dialog management, and natural language generation, also called the intelligent dialog system, is the core technical difficulty of the whole intelligent voice interaction process. After applying artificial intelligence technologies such as big-data deep learning, the method proceeds as follows: first, the speech signal is analyzed and processed to remove redundant information; second, the key information affecting speech recognition and the feature information expressing the meaning of the language are extracted; third, words are recognized in minimal units, tightly bound to the feature information; fourth, words are recognized in order according to the grammar of each language, with the preceding and following meanings as auxiliary recognition conditions, which aids analysis and recognition; fifth, paragraphs are divided around the key information according to semantic analysis; sixth, the recognized words are taken out and connected, and the sentence composition is adjusted according to the meaning of the sentence; seventh, combining semantics, the interconnections of the context are carefully analyzed and the sentence currently being processed is corrected appropriately.
Voice interaction technology has progressed from rule-based instructions to natural-language instructions, and the machine-learned "chat robot" has entered the trial stage.

Claims (8)

1. A news chat method of intelligent digital newspaper automatic summarization and voice interaction is characterized by comprising the following steps:
s1, acquiring a digital newspaper text;
S2, with the acquired digital newspaper text and the title and body feature information included in deep-learning model training, the attention neural network analyzes and processes the feature character information of the digital newspaper text to obtain a cross-entropy loss function; the neural network model based on the self-attention mechanism designs an Attention module using the channel and spatial information of the vector-matrix features, enabling the model to put its "attention" on more useful information; the module applies the attention neural network architecture, with cross entropy selected as the model loss function and the information channels and spaces obtained by max pooling, and provides the attention capability of the CBAM module to obtain the final summary result;
S3, the intelligent voice interaction wake-up mode is triggered by the wake-up word "Xiaoxin", providing an intelligent voice input port of the "can listen, can speak, understands you and answers" type.
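Claim 1 names cross entropy as the model loss function. As a minimal illustrative sketch, and not the patent's implementation, the cross entropy between a softmax-normalized prediction and a one-hot reference can be computed as follows (the three candidate scores are hypothetical):

```python
import math

def softmax(logits):
    """Convert raw scores into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def cross_entropy(predicted, target_index):
    """Cross-entropy loss against a one-hot target given by its index."""
    p = max(predicted[target_index], 1e-12)  # clamp to avoid log(0)
    return -math.log(p)

# Hypothetical example: three candidate summary sentences,
# the second one is the reference.
logits = [1.0, 3.0, 0.5]
probs = softmax(logits)
loss = cross_entropy(probs, 1)
```

A lower loss means the model puts more probability mass on the reference choice; training by back-propagation drives this value down.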
2. The method for intelligent digital newspaper automatic summarization and voice-interactive news chat as claimed in claim 1, comprising the following steps:
S1, the intelligent digital newspaper system obtains the feature information of title and body words, and the attention neural network learns the vector-matrix features from the acquired text to generate a training set;
S2, using the feature information of the text contained in the training of the deep learning model, the attention neural network analyzes the characteristic character information of the acquired digital newspaper text and compares it with the reference summaries of the training set; the model performs back-propagation according to the loss function using a parameter optimization method, thereby completing the training of the deep learning model; according to the rule weights of the title and of sentences at the beginning and end of the text, the influence of background information on the words to be recognized is reduced, so that the recognition capability is less easily disturbed by other interference; the CBAM module provides this attention capability with only a small increase in parameters, improving the attention capacity of the neural network and avoiding the influence of background on recognition.
S3, the intelligent voice interaction wake-up mode is triggered by the wake-up word "Xiaoxin"; the intelligent voice interaction comprises: voice acquisition, which obtains sound information in real time; voice preprocessing, connected to the voice acquisition, which takes the sound information and applies ultrasonic filtering to obtain intelligent voice of the "can listen, can speak, understands you and answers" type; in the non-awake mode, whether the voice matches the set wake-up word is judged: if so, the system enters the awake mode, otherwise it remains in the non-awake mode; and voice recognition, which, in the awake mode, recognizes the target voice so that it can be heard, spoken, understood and answered, obtaining the target content.
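The wake-up flow in S3 can be sketched as a small state machine. The class below is an illustrative simplification under the assumption that utterances have already been speech-recognized into strings; in the patent, matching would operate on filtered audio:

```python
class WakeWordController:
    """Minimal sketch of the non-awake / awake mode switch
    driven by a set wake-up word."""

    def __init__(self, wake_word="Xiaoxin"):
        self.wake_word = wake_word
        self.awake = False

    def hear(self, utterance):
        """Feed one recognized utterance.
        Returns the target content if in awake mode, else None."""
        if not self.awake:
            # Non-awake mode: only check whether this is the wake word.
            if utterance.strip() == self.wake_word:
                self.awake = True
            return None
        # Awake mode: the utterance is treated as the target content.
        return utterance

ctrl = WakeWordController()
ctrl.hear("hello")        # ignored: not the wake word
ctrl.hear("Xiaoxin")      # switches to awake mode
content = ctrl.hear("read today's headlines")
```

A production version would also drop back to the non-awake mode after a timeout, as described in claim 7's mode control.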
3. The method for intelligent digital newspaper automatic summarization and voice-interactive news chat according to claim 1 or 2, wherein the feature information of the text comprises: vectorization of the title.
4. The method for intelligent digital newspaper automatic summarization and voice-interactive news chat as claimed in claim 1, further comprising preset rules, wherein the preset rules comprise: (1) the title has the highest weight; (2) the closer a sentence is to the beginning or end of the text, the higher its rule weight; (3) if a sentence starts with an adverb or a conjunction, its rule weight is reduced.
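The three preset rules of claim 4 can be illustrated with a small scoring function. The concrete weight values and the adverb/conjunction list below are hypothetical, chosen only to show the ordering the rules impose:

```python
def rule_weight(index, total, first_word, is_title=False,
                adverbs_conjunctions=("however", "but", "and", "moreover")):
    """Illustrative sentence weight following the three preset rules:
    (1) the title has the highest weight;
    (2) sentences nearer the beginning or end weigh more;
    (3) starting with an adverb/conjunction lowers the weight."""
    if is_title:
        return 2.0  # rule (1): highest weight (hypothetical value)
    # rule (2): normalized distance to the nearer of beginning/end
    dist = min(index, total - 1 - index) / max(total - 1, 1)
    weight = 1.0 - dist  # 1.0 at either end, smaller in the middle
    # rule (3): penalize adverb/conjunction openings (hypothetical factor)
    if first_word.lower() in adverbs_conjunctions:
        weight *= 0.5
    return weight

# Five sentences; the second and fourth open with conjunctions/adverbs.
scores = [rule_weight(i, 5, w) for i, w in
          enumerate(["The", "However", "It", "Moreover", "Finally"])]
```

Summary extraction would then prefer high-scoring sentences (and always the title).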
5. The method for intelligent digital newspaper automatic summarization and voice-interactive news chat as claimed in claim 1, wherein S1 specifically comprises:
S11, scanning and collecting the text, converting the multi-semantic original text into a word-vector sequence composed of a plurality of word vectors, and generating a training set by a text summarization method based on pre-training, fine-tuning, parallel computing and semantics;
S12, classifying according to a certain scheme, finding the best matching result according to the judgment criterion, mining the abstract semantic representation of the text, and applying a natural language generation method;
S13, producing a quantized representation of dense text documents, including news text, based on deep learning in machine learning; the goal of the document-quantization process, the doc2vec word-vector representation, is to create a vectorized representation of a document regardless of its length.
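The key property of S13 is a fixed-length document vector regardless of document length. The sketch below illustrates that property only: it averages deterministic pseudo word vectors, whereas real doc2vec trains a dedicated paragraph vector jointly with the word vectors. The hash-based `word_vector` is a stand-in for trained embeddings:

```python
import hashlib

def word_vector(word, dim=8):
    """Deterministic pseudo word vector (stand-in for trained
    embeddings), derived from a hash of the word."""
    h = hashlib.md5(word.encode("utf-8")).digest()
    return [b / 255.0 for b in h[:dim]]

def doc_vector(text, dim=8):
    """Fixed-length document vector regardless of document length,
    here the mean of its word vectors (a simplification of doc2vec)."""
    words = text.split()
    if not words:
        return [0.0] * dim
    vecs = [word_vector(w, dim) for w in words]
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]

short_doc = doc_vector("breaking news today")
long_doc = doc_vector("breaking news today " * 50)  # 50x longer input
```

Both documents map to vectors of the same dimensionality, which is what lets downstream models consume articles of any length uniformly.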
6. The method for intelligent digital newspaper automatic summarization and voice-interactive news chat as claimed in claim 1, further comprising: dividing the original text into words to form a word sequence, and performing word embedding on the word sequence to generate a corresponding word-vector sequence, since the deep learning neural network model cannot process words directly and must vectorize the words in the text; a word vector represents a word in the deep neural network and is the feature vector or representation of that word; in the word embedding method, the word vectors of all words in the vocabulary are generated by random initialization when model training starts, and the model updates the vocabulary's word vectors during training; in the verification and testing phases, the model directly uses the word embedding vectors obtained from training.
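The word-embedding procedure of claim 6 — random initialization, updates during training, direct reuse at test time — can be sketched as a simple embedding table. The update shown is a generic gradient-descent step under a hypothetical gradient, not the patent's specific optimizer:

```python
import random

class EmbeddingTable:
    """Vocabulary -> word-vector table with random initialization."""

    def __init__(self, vocabulary, dim=4, seed=0):
        rng = random.Random(seed)
        # Random initialization of every word vector at training start.
        self.vectors = {w: [rng.uniform(-0.1, 0.1) for _ in range(dim)]
                        for w in vocabulary}

    def lookup(self, word):
        """Return the current vector (feature representation) of a word."""
        return self.vectors[word]

    def update(self, word, gradient, lr=0.01):
        """One gradient-descent update of a word vector, as would
        happen during model training; frozen at test time."""
        v = self.vectors[word]
        self.vectors[word] = [x - lr * g for x, g in zip(v, gradient)]

table = EmbeddingTable(["news", "chat", "voice"])
before = list(table.lookup("news"))
table.update("news", [1.0, 1.0, 1.0, 1.0])  # hypothetical gradient
after = table.lookup("news")
```

At verification and test time no `update` calls are made, so the trained vectors are used as-is, matching the claim.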
7. The method for intelligent digital newspaper automatic summarization and voice-interactive news chat as claimed in claim 1, wherein S3 comprises:
S31, intelligent voice interaction within the speech recognition and search technology system, wherein the intelligent voice interaction input comprises a microphone and the output comprises a loudspeaker;
S32, detection based on a voice activity detection (VAD) model, including mode control, wherein the mode control is electrically connected with the voice preprocessing, voice recognition, search and output respectively; the mode control acquires mode information and sends the current mode information to the voice preprocessing, voice recognition, search and output respectively; in the non-awake mode, the mode control generates an awake-state identifier according to the voice preprocessing's judgment of whether the target voice is the set wake-up word, and when it is, outputs the awake-state identifier to the voice preprocessing, voice recognition, search and output respectively; in the awake mode, the mode control acquires the time node at which the response content is output and monitors in real time whether the voice preprocessing has acquired the target content; if the target content is not acquired within the set time, it generates a non-awake-state identifier and outputs it to the voice preprocessing, voice recognition, search and output respectively;
S33, further comprising history association for the deep-learning-based voice-interactive news chat; the history association is electrically connected with the search and the voice recognition respectively and is used when entering the awake mode; in the awake mode, the response content obtained by the search is recorded in the history association; the search acquires history information related to the target content from the history association and derives the response content from the history information and the target content; the history association also deletes the corresponding response content after the output of that response content is interrupted;
S34, the system comprises a plurality of scene models, and the voice processing module comprises a knowledge model, a similarity auxiliary model and a deep learning model; the scheduling system incorporates multi-purpose judgment processing, multi-model algorithm analysis, a data strategy engine and integration processing; the acquired texts are fed into a time-series neural network for training to obtain high-dimensional feature vectors; the self-attention neural network performs a regression operation with the high-dimensional feature vectors as input; the feature vectors obtained from the attention neural network are compared with the corresponding feature vectors in the database, the result with the highest similarity is selected and output as audio, and the retrieved voice-interactive news chat content is fed back to the reader in voice form.
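The retrieval step in S34 — comparing a query feature vector against database vectors and selecting the most similar result — can be sketched with cosine similarity. The vectors and database keys below are hypothetical; the patent does not specify the similarity measure, so cosine similarity is shown as one common choice:

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def best_match(query, database):
    """Return the database key whose feature vector is most
    similar to the query vector."""
    return max(database, key=lambda k: cosine_similarity(query, database[k]))

# Hypothetical 3-dimensional feature vectors for three stored stories.
database = {
    "sports_story":  [0.9, 0.1, 0.0],
    "finance_story": [0.1, 0.9, 0.2],
    "weather_story": [0.0, 0.2, 0.9],
}
query = [0.8, 0.2, 0.1]
answer = best_match(query, database)
```

The selected entry would then be synthesized to audio and played back to the reader.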
8. The method for intelligent digital newspaper automatic summarization and voice-interactive news chat as claimed in any one of claims 1-7, wherein the voice recognition module recognizes voice based on a deep neural network.
CN202011389092.9A 2020-12-01 2020-12-01 Method and system for automatically abstracting intelligent digital newspaper and performing voice interaction chat Active CN112562669B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011389092.9A CN112562669B (en) 2020-12-01 2020-12-01 Method and system for automatically abstracting intelligent digital newspaper and performing voice interaction chat

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011389092.9A CN112562669B (en) 2020-12-01 2020-12-01 Method and system for automatically abstracting intelligent digital newspaper and performing voice interaction chat

Publications (2)

Publication Number Publication Date
CN112562669A true CN112562669A (en) 2021-03-26
CN112562669B CN112562669B (en) 2024-01-12

Family

ID=75047464

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011389092.9A Active CN112562669B (en) 2020-12-01 2020-12-01 Method and system for automatically abstracting intelligent digital newspaper and performing voice interaction chat

Country Status (1)

Country Link
CN (1) CN112562669B (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113282711A (en) * 2021-06-03 2021-08-20 中国软件评测中心(工业和信息化部软件与集成电路促进中心) Internet of vehicles text matching method and device, electronic equipment and storage medium
CN113420783A (en) * 2021-05-27 2021-09-21 中国人民解放军军事科学院国防科技创新研究院 Intelligent man-machine interaction method and device based on image-text matching
CN113688230A (en) * 2021-07-21 2021-11-23 武汉众智数字技术有限公司 Text abstract generation method and system
CN114580429A (en) * 2022-01-26 2022-06-03 云捷计算机软件(江苏)有限责任公司 Artificial intelligence-based language and image understanding integrated service system
CN116414972A (en) * 2023-03-08 2023-07-11 浙江方正印务有限公司 Method for automatically broadcasting information content and generating short message

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108519890A (en) * 2018-04-08 2018-09-11 武汉大学 A kind of robustness code abstraction generating method based on from attention mechanism
CN109857860A (en) * 2019-01-04 2019-06-07 平安科技(深圳)有限公司 File classification method, device, computer equipment and storage medium
CN110263332A (en) * 2019-05-28 2019-09-20 华东师范大学 A kind of natural language Relation extraction method neural network based
CN110597979A (en) * 2019-06-13 2019-12-20 中山大学 Self-attention-based generating text summarization method
CN111508491A (en) * 2020-04-17 2020-08-07 山东传媒职业学院 Intelligent voice interaction equipment based on deep learning

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113420783A (en) * 2021-05-27 2021-09-21 中国人民解放军军事科学院国防科技创新研究院 Intelligent man-machine interaction method and device based on image-text matching
CN113420783B (en) * 2021-05-27 2022-04-08 中国人民解放军军事科学院国防科技创新研究院 Intelligent man-machine interaction method and device based on image-text matching
CN113282711A (en) * 2021-06-03 2021-08-20 中国软件评测中心(工业和信息化部软件与集成电路促进中心) Internet of vehicles text matching method and device, electronic equipment and storage medium
CN113282711B (en) * 2021-06-03 2023-09-22 中国软件评测中心(工业和信息化部软件与集成电路促进中心) Internet of vehicles text matching method and device, electronic equipment and storage medium
CN113688230A (en) * 2021-07-21 2021-11-23 武汉众智数字技术有限公司 Text abstract generation method and system
CN114580429A (en) * 2022-01-26 2022-06-03 云捷计算机软件(江苏)有限责任公司 Artificial intelligence-based language and image understanding integrated service system
CN116414972A (en) * 2023-03-08 2023-07-11 浙江方正印务有限公司 Method for automatically broadcasting information content and generating short message
CN116414972B (en) * 2023-03-08 2024-02-20 浙江方正印务有限公司 Method for automatically broadcasting information content and generating short message

Also Published As

Publication number Publication date
CN112562669B (en) 2024-01-12

Similar Documents

Publication Publication Date Title
CN112562669B (en) Method and system for automatically abstracting intelligent digital newspaper and performing voice interaction chat
Lopez et al. Deep Learning applied to NLP
Wu et al. Emotion recognition from text using semantic labels and separable mixture models
CN110297907B (en) Method for generating interview report, computer-readable storage medium and terminal device
Gao et al. Convolutional neural network based sentiment analysis using Adaboost combination
Ren et al. Intention detection based on siamese neural network with triplet loss
CN112131350A (en) Text label determination method, text label determination device, terminal and readable storage medium
Han et al. A survey of transformer-based multimodal pre-trained modals
Arumugam et al. Hands-On Natural Language Processing with Python: A practical guide to applying deep learning architectures to your NLP applications
CN110297906B (en) Method for generating interview report, computer-readable storage medium and terminal device
Kshirsagar et al. A review on application of deep learning in natural language processing
CN112131876A (en) Method and system for determining standard problem based on similarity
Amanova et al. Creating annotated dialogue resources: Cross-domain dialogue act classification
CN111753058A (en) Text viewpoint mining method and system
Tao et al. News text classification based on an improved convolutional neural network
CN114691864A (en) Text classification model training method and device and text classification method and device
CN112183106A (en) Semantic understanding method and device based on phoneme association and deep learning
Varaprasad et al. Applications and Techniques of Natural Language Processing: An Overview.
CN116910251A (en) Text classification method, device, equipment and medium based on BERT model
CN117493548A (en) Text classification method, training method and training device for model
Bellagha et al. Using the MGB-2 challenge data for creating a new multimodal Dataset for speaker role recognition in Arabic TV Broadcasts
Huang et al. Spoken document retrieval using multilevel knowledge and semantic verification
Harsha et al. Lexical Ambiguity in Natural Language Processing Applications
CN110543559A (en) Method for generating interview report, computer-readable storage medium and terminal device
Hao Naive Bayesian Prediction of Japanese Annotated Corpus for Textual Semantic Word Formation Classification

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant