CN111489743A - Operation management analysis system based on intelligent voice technology - Google Patents

Operation management analysis system based on intelligent voice technology

Info

Publication number
CN111489743A
Authority
CN
China
Prior art keywords
voice
file
analysis
recording
text
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910082514.9A
Other languages
Chinese (zh)
Inventor
张劭韡
吴佐平
王颖
邓艳丽
陈敏耀
邓志东
张晓慧
杜小瑾
姜冬
徐景龙
乔晅
徐强
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
State Grid Co ltd Customer Service Center
Beijing China Power Information Technology Co Ltd
Original Assignee
State Grid Co ltd Customer Service Center
Beijing China Power Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by State Grid Co ltd Customer Service Center and Beijing China Power Information Technology Co Ltd
Priority to CN201910082514.9A
Publication of CN111489743A
Legal status: Pending

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G10L15/14 Speech classification or search using statistical models, e.g. Hidden Markov Models [HMMs]
    • G10L15/142 Hidden Markov Models [HMMs]
    • G10L15/16 Speech classification or search using artificial neural networks
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M3/00 Automatic or semi-automatic exchanges
    • H04M3/42 Systems providing special services or facilities to subscribers
    • H04M3/50 Centralised arrangements for answering calls; Centralised arrangements for recording messages for absent or busy subscribers; Centralised arrangements for recording messages
    • H04M3/51 Centralised call answering arrangements requiring operator intervention, e.g. call or contact centers for telemarketing
    • H04M3/5166 Centralised call answering arrangements requiring operator intervention, e.g. call or contact centers for telemarketing in combination with interactive voice response systems or voice portals, e.g. as front-ends
    • H04M3/53 Centralised arrangements for recording incoming messages, i.e. mailbox systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Multimedia (AREA)
  • Probability & Statistics with Applications (AREA)
  • Signal Processing (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Business, Economics & Management (AREA)
  • Marketing (AREA)
  • Telephonic Communication Services (AREA)

Abstract

The invention provides an operation management analysis system based on intelligent voice technology, comprising: a recording acquisition unit for downloading recording data files from a telephone recording platform and splicing and converting them into a complete voice file; a scene segmentation unit for performing scene segmentation or speaker segmentation on the voice file; a voice transcription unit for recognizing the voice file with an intelligent voice recognition engine and transcribing it into text content; a data analysis unit for analyzing the text content and the voice file based on a neural network model and outputting an analysis report; a database unit for storing the voice file, the text content and the analysis report; and a content indexing unit for retrieving the data stored in the database according to an indexing command. By intelligently analyzing the large volume of recording data files produced by a call center, the invention uncovers problems and deficiencies in the service process and captures user demands in a timely manner, thereby improving user satisfaction.

Description

Operation management analysis system based on intelligent voice technology
Technical Field
The invention relates to the technical field of information analysis, in particular to an operation management analysis system based on an intelligent voice technology.
Background
With the development of mobile communication technology, the customer service call center plays a crucial role as a bridge between an operating platform and its users. In recent years, driven by global user demand, national strategic guidance and competition among enterprises, the intelligent voice technology industry has grown rapidly and steadily, and its applications have spread through fields such as the mobile internet, smart homes, automotive electronics, financial payment, online education and medical care. Powered by massive data and deep learning, intelligent voice technologies such as speech recognition, speech synthesis and voiceprint recognition are maturing and entering practical use.
The State Grid 95598 call center serves as an important bridge between the State Grid Corporation and its users. Chinese speech recognition technology trained with the DNN (deep neural network) + HMM (hidden Markov model) method, currently the international mainstream, can adapt to application environments that differ in speaker age, region, population, channel, terminal and noise conditions. Customized training of the models on the massive speech and text corpora accumulated by the State Grid 95598 call center yields a speech transcription and analysis platform with high availability and a high recognition rate, which greatly improves on the unclear and inaccurate recognition and transcription of the prior art and reduces the speech recognition error rate.
Disclosure of Invention
In view of the above, the main objective of the present invention is to provide an operation management analysis system based on intelligent voice technology. The system builds an intelligent voice recognition model with high availability and a high recognition rate from the large speech and text corpora accumulated by the State Grid 95598 call center, and continuously trains and optimizes the model with deep self-learning technology, so that its recognition accuracy and its applicability to the power service industry keep improving. Based on this model, the large volume of recording data generated by the call center is transcribed and analyzed in a timely manner, so that defects in the service process are discovered promptly, user requirements are understood, and service quality is improved.
The technical solution adopted by the invention is an operation management analysis system based on intelligent voice technology, comprising:
a recording acquisition unit for downloading recording data files from the telephone recording platform and splicing and converting them into a complete voice file;
a scene segmentation unit for performing scene segmentation or speaker segmentation on the voice file;
a voice transcription unit for recognizing the voice file with an intelligent voice recognition engine and transcribing it into text content;
a data analysis unit for analyzing the text content and the voice file based on a neural network model and outputting an analysis report;
a database unit for storing the voice file, the text content and the analysis report;
and a content indexing unit for retrieving the data stored in the database according to an indexing command.
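The data flow between the units listed above can be pictured with a minimal Python sketch. It is an illustration only: the names `CallRecord`, `run_pipeline` and `InMemoryDatabase` are hypothetical, the segmentation, transcription and analysis steps are passed in as plain functions, and keyword lookup stands in for the indexing command. None of this is specified by the patent.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class CallRecord:
    """One call as it moves through the units (all field names are hypothetical)."""
    call_id: str
    audio: bytes  # complete voice file produced by the recording acquisition unit
    speaker_turns: List[str] = field(default_factory=list)
    text: str = ""
    report: Dict[str, float] = field(default_factory=dict)


class InMemoryDatabase:
    """Stand-in for the database unit plus a keyword lookup for the indexing unit."""

    def __init__(self) -> None:
        self._rows: Dict[str, CallRecord] = {}

    def store(self, record: CallRecord) -> None:
        self._rows[record.call_id] = record

    def retrieve(self, keyword: str) -> List[str]:
        # Return the ids of all stored calls whose transcript contains the keyword.
        return [r.call_id for r in self._rows.values() if keyword in r.text]


def run_pipeline(
    record: CallRecord,
    segment: Callable[[bytes], List[str]],
    transcribe: Callable[[bytes], str],
    analyze: Callable[[str, bytes], Dict[str, float]],
    store: Callable[[CallRecord], None],
) -> CallRecord:
    """Apply segmentation, transcription and analysis in order, then store the result."""
    record.speaker_turns = segment(record.audio)
    record.text = transcribe(record.audio)
    record.report = analyze(record.text, record.audio)
    store(record)
    return record
```

Passing the steps in as functions keeps the sketch independent of any particular recognition engine or analysis model.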
Wherein the recording acquisition unit comprises:
a recording download module, connected to the telephone recording platform, for receiving at regular intervals the recording data file segments transmitted by the telephone recording platform;
a splicing and transcoding module for splicing the recording data file segments, decompressing the spliced recording data file, and converting it into a complete voice file that can be recognized;
and a transcription scheduling module for fetching the corresponding voice file according to a transcription command and sending it to the voice transcription unit.
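The splicing step can be sketched as follows. This is a minimal illustration that assumes the segments have already been decompressed into uncompressed PCM WAV data with a common format; the patent does not name the recording format, and the function name `splice_wav_segments` is hypothetical. It uses only Python's standard `wave` module.

```python
import io
import wave
from typing import List


def splice_wav_segments(segments: List[bytes]) -> bytes:
    """Concatenate WAV segments that share one audio format into a single WAV file."""
    if not segments:
        raise ValueError("no segments to splice")
    out_buf = io.BytesIO()
    params = None
    with wave.open(out_buf, "wb") as out:
        for raw in segments:
            with wave.open(io.BytesIO(raw), "rb") as seg:
                fmt = (seg.getnchannels(), seg.getsampwidth(), seg.getframerate())
                if params is None:
                    # The first segment defines the format of the output file.
                    params = fmt
                    out.setnchannels(fmt[0])
                    out.setsampwidth(fmt[1])
                    out.setframerate(fmt[2])
                elif fmt != params:
                    raise ValueError("segments must share channels, sample width and rate")
                out.writeframes(seg.readframes(seg.getnframes()))
    return out_buf.getvalue()
```

A real deployment would also need the decompression and transcoding steps the text describes, which depend on the codec used by the recording platform.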
Wherein the data analysis unit comprises:
an audio analysis module, comprising a silence interval detection module, a speech rate detection module and an emotion detection module, for performing silence interval analysis, speech rate analysis and emotion analysis on the voice file, respectively;
and a text analysis module for analyzing the text content.
In a further improvement, the system further comprises a recording distribution module for transmitting the voice file to the voice transcription unit, the data analysis unit and the database unit, respectively.
Wherein the intelligent voice recognition engine comprises an acoustic model and a language model, which correspond, respectively, to computing the probability from the audio features extracted from the voice file to syllables and the probability from syllables to text;
the language model is built with an N-Gram model;
the acoustic model is built with a deep neural network and a hidden Markov model.
Wherein the content indexing unit comprises:
a data storage module for storing the text content and the associated data generated during transcription;
a data query module for querying and aggregating the voice files and the associated data according to preset query conditions and displaying the results;
a lexicon module with a built-in custom word segmentation lexicon, capable of accurately segmenting power-related sentences;
a text word segmentation processing module for tagging the text content according to the word segmentation result and extracting each segmented word to generate structured text data;
and a text clustering module for performing text clustering and in-depth cross analysis on the structured text data generated within a time period, to obtain the text clustering information for that period.
In a further improvement, the system further comprises a standardized sentence vector model library containing a plurality of standardized sentence vector models;
each standardized sentence vector model is obtained by computing, with a neural network model, the sentence vector similarity of sentence samples in a corpus and performing standardized training on the sentence vectors that meet a similarity threshold.
In a further improvement, the voice transcription unit further comprises a standardization processing module for splitting the text content into sentences, computing a sentence vector for each sentence, and selecting from the standardized sentence vector model library the standardized sentence vector model corresponding to each sentence to perform standardized training on it, so as to output the corresponding standardized sentence.
Drawings
Fig. 1 is a block diagram of an operation management analysis system based on intelligent voice technology.
Detailed Description
The main objective of the invention is to provide an operation management analysis system based on intelligent voice technology. The system builds an intelligent voice recognition model with high availability and a high recognition rate from the massive speech and text corpora accumulated by the State Grid 95598 call center, and continuously trains and optimizes the model with deep self-learning technology, so that its recognition accuracy and its applicability to the power service industry keep improving. Based on this model, the large volume of recording data generated by the call center is transcribed and analyzed in a timely manner, so that defects in the service process are discovered promptly, user requirements are understood, and service quality is improved.
The State Grid 95598 call center adopts Chinese speech recognition technology trained with the DNN (deep neural network) + HMM (hidden Markov model) method, currently the international mainstream, which can adapt to application environments that differ in speaker age, region, population, channel, terminal and noise conditions. Customized training of the models on the massive speech and text corpora accumulated by the call center yields a speech transcription platform with high availability and a high recognition rate;
the core technology of the speech transcription platform is intelligent speech recognition, which uses a latest-generation recognition algorithm and decoder core together with advanced training methods for the acoustic model and the language model. It has three main components: speech recognition model training, front-end speech processing and back-end recognition processing;
1. Speech recognition model training
A speech recognition model usually consists of two parts, an acoustic model and a language model, which correspond, respectively, to computing the probability from the features extracted from the speech signal to syllables and the probability from syllables to words.
At present, acoustic models are generally built with the DNN (deep neural network) + HMM (hidden Markov model) method. Compared with the GMM (Gaussian mixture model) + HMM method of the previous generation, it reduces the speech recognition error rate by 30%, which is the fastest progress in speech recognition technology in the last 20 years. Language models are at present usually statistical language models built on the N-Gram model. An N-Gram model corresponds to a Markov chain of order N-1 (a bigram model, N = 2, is a first-order Markov chain). The basic idea is to slide a window of size N over the text, byte by byte, to form a sequence of byte fragments of length N; each fragment is called a gram. The occurrence frequencies of all grams are counted and filtered against a preset threshold to form a list of key grams, which is the vector feature space of the text, with each gram in the list being one feature dimension;
the algorithm is highly fault-tolerant and language-independent: it works for both Chinese and English and requires no linguistic preprocessing. It is a language model commonly used in large-vocabulary continuous speech recognition, and it is simple, effective and widely used.
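The sliding-window counting described above can be sketched in a few lines of Python. This is a minimal illustration, not the patent's implementation: it slides over characters of a Python string rather than raw bytes, and the names `key_gram_list` and `bigram_probability` are hypothetical. The second function shows the maximum-likelihood bigram estimate, the count of the pair divided by the count of the preceding unit.

```python
from collections import Counter
from typing import Dict


def key_gram_list(text: str, n: int, min_count: int) -> Dict[str, int]:
    """Count every length-n window in text and keep the grams seen at least min_count times."""
    if n <= 0:
        raise ValueError("n must be positive")
    grams = Counter(text[i:i + n] for i in range(len(text) - n + 1))
    return {gram: count for gram, count in grams.items() if count >= min_count}


def bigram_probability(text: str, prev: str, nxt: str) -> float:
    """Maximum-likelihood estimate of P(nxt | prev) from adjacent character pairs."""
    pair_count = sum(1 for a, b in zip(text, text[1:]) if a == prev and b == nxt)
    history_count = sum(1 for a in text[:-1] if a == prev)
    return pair_count / history_count if history_count else 0.0
```

For example, `key_gram_list("abababc", 2, 2)` keeps only the grams `ab` and `ba`, because `bc` occurs once and falls below the threshold.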
To adapt to application environments that differ in speaker age, region, population, channel, terminal and noise conditions, the models must be trained on large speech and text corpora, which effectively improves the recognition rate. With the rapid development of the internet and the spread of mobile terminals such as mobile phones, large amounts of text and speech data can now be obtained from many channels. This provides rich resources for training the language model and the acoustic model of a speech recognition system and makes it possible to build general-purpose, large-scale language and acoustic models.
2. Front-end speech processing
Front-end speech processing refers to preprocessing the speaker's speech with signal processing methods, such as detection and denoising, to obtain the speech best suited to the recognition engine. The main functions include:
(1) Endpoint detection
Endpoint detection analyzes the input audio stream, distinguishes the speech periods from the non-speech periods in the signal, and accurately determines the starting point of the speech. After endpoint detection, subsequent processing can be applied to the speech signal alone, which plays an important role in improving both model accuracy and recognition accuracy.
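One simple way to illustrate endpoint detection is a short-time energy threshold, sketched below. The patent does not say which detection method is used, so the energy criterion, the function name `detect_endpoints` and the parameters `frame_len` and `threshold` are all assumptions made for illustration.

```python
from typing import List, Optional, Tuple


def detect_endpoints(
    samples: List[float], frame_len: int, threshold: float
) -> Optional[Tuple[int, int]]:
    """Return (start_sample, end_sample) of the speech region, or None if no frame is loud enough."""
    speech_frames = []
    # Split the signal into non-overlapping frames and compare mean energy to the threshold.
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energy = sum(x * x for x in frame) / frame_len
        if energy > threshold:
            speech_frames.append(start)
    if not speech_frames:
        return None
    # Speech runs from the first loud frame to the end of the last loud frame.
    return speech_frames[0], speech_frames[-1] + frame_len
```

Production systems usually add smoothing and hangover logic so that short pauses inside an utterance are not cut, which this sketch omits.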
(2) Noise cancellation
In practical applications, background noise is a real challenge for speech recognition; even when a speaker is in a quiet office environment, some noise during a telephone call is hard to avoid. A good speech recognition engine needs efficient noise cancellation to meet users' needs across a wide variety of environments.
(3) Feature extraction
The features commonly used at present include MFCC (Mel-Frequency Cepstral Coefficients) and PLP (Perceptual Linear Prediction).
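As background for MFCC features, the sketch below shows the mel frequency scale on which the MFCC filter bank is spaced, using the widely used conversion mel = 2595 * log10(1 + f / 700). It covers only this one step of MFCC extraction; the function names and the choice of filter count are assumptions, and the patent gives no details of its own feature pipeline.

```python
import math
from typing import List


def hz_to_mel(freq_hz: float) -> float:
    """Convert a frequency in Hz to the mel scale."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def mel_to_hz(mel: float) -> float:
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_filter_centers(low_hz: float, high_hz: float, n_filters: int) -> List[float]:
    """Center frequencies (Hz) of n_filters bands spaced evenly on the mel scale."""
    low_mel, high_mel = hz_to_mel(low_hz), hz_to_mel(high_hz)
    step = (high_mel - low_mel) / (n_filters + 1)
    return [mel_to_hz(low_mel + step * (i + 1)) for i in range(n_filters)]
```

Because the spacing is even in mel rather than in Hz, the bands are narrow at low frequencies and wide at high frequencies, roughly matching human pitch perception.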
3. Backend recognition processing
Back-end recognition processing is the process of recognizing (also called "decoding") the extracted feature vectors with the trained acoustic model and language model to obtain text. The acoustic model corresponds to computing the probability from speech features to syllables (or phonemes), and the language model corresponds to computing the probability from syllables to words. The decoder is the most important part: it scores the raw speech features with the acoustic model and the language model and, on that basis, finds the optimal word sequence path; the text corresponding to that path is the final recognition result.
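The idea of combining acoustic scores and language model scores to find the best path can be illustrated with a toy Viterbi search. This is a simplified sketch under stated assumptions: each step offers a small set of candidate units with log acoustic scores, the language model is a table of log bigram scores, and unseen pairs receive a fixed `floor` penalty. The function name and data layout are hypothetical and far simpler than a production decoder.

```python
from typing import Dict, List, Tuple


def viterbi_decode(
    acoustic: List[Dict[str, float]],
    bigram: Dict[Tuple[str, str], float],
    floor: float = -20.0,
) -> List[str]:
    """Find the unit sequence maximizing the sum of acoustic and bigram log scores.

    acoustic[t][w] is the log acoustic score of unit w at step t;
    bigram[(prev, w)] is the log language-model score of w following prev.
    """
    if not acoustic:
        return []
    # best[w] holds the score and path of the best hypothesis ending in unit w.
    best: Dict[str, Tuple[float, List[str]]] = {
        w: (score, [w]) for w, score in acoustic[0].items()
    }
    for frame in acoustic[1:]:
        new_best: Dict[str, Tuple[float, List[str]]] = {}
        for w, a_score in frame.items():
            score, path = max(
                (prev_score + bigram.get((prev, w), floor) + a_score, prev_path)
                for prev, (prev_score, prev_path) in best.items()
            )
            new_best[w] = (score, path + [w])
        best = new_best
    return max(best.values())[1]
```

In the usage below the acoustic scores alone slightly prefer "bay", but the language model score for "pay" followed by "bill" makes that path the overall best:

```python
acoustic = [{"bay": -1.0, "pay": -1.2}, {"bill": -1.0, "pill": -1.1}]
bigram = {("bay", "bill"): -3.0, ("bay", "pill"): -2.5,
          ("pay", "bill"): -0.5, ("pay", "pill"): -2.0}
print(viterbi_decode(acoustic, bigram))  # ['pay', 'bill']
```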
Early decoders based on a syntax-tree structure were complicated to design, and under current technical conditions their speed improvement has hit a bottleneck. Most current mainstream speech recognition decoders instead use a decoding network based on weighted finite-state transducers (WFST), which integrates the language model, the dictionary and the shared acoustic phone set into one large decoding network. This greatly increases decoding speed and separates the decoding process from the knowledge sources.
Building on the intelligent voice recognition model described above, obtained by customized DNN (deep neural network) + HMM (hidden Markov model) training on the massive speech and text corpora accumulated by the State Grid 95598 call center, a preferred embodiment of the invention provides an operation management analysis system. For integration, the system connects directly to the 95598 telephone recording platform, the 95598 business management system and the quality inspection management module, and connects indirectly to the 95598 business support system through the quality inspection management module.
The overall architecture of the system aims for clear layering, and the technologies adopted for the core components balance advancement with maturity and stability. All interfaces, services and engines of the system are designed for high availability: interfaces in principle use an active-standby mode, while services and engines use a load balancing mechanism, so that there is no single point of failure and the failure of a few nodes does not cause service interruption or process blockage.
The system deploys corresponding components in both the southern and northern branch centers. The southern branch center interfaces with its local telephone recording platform to obtain recording data nearby; when a recording is retrieved, binary voice streams are transmitted between the southern and northern networks.
The northern branch center likewise interfaces with its local recording platform to acquire and transcribe recordings. In addition, because the users of voice analysis are mainly located near the northern branch center, and because data must be aggregated, quality inspection text processing and voice content analysis for the whole customer service center are centralized in the northern branch center, where the content retrieval service and the database are also deployed.
Text data is stored centrally as actual conditions require, and each type of storage is planned with reasonable margin over current traffic, so as to meet the processing demand of the maximum voice traffic during the summer peak period.
The system is described in detail below with reference to Fig. 1. The operation management analysis system comprises:
a recording acquisition unit 100, which comprises a recording download module, connected to the telephone recording platform, for receiving at regular intervals the recording data file segments transmitted by the telephone recording platform; a splicing and transcoding module for splicing the recording data file segments, decompressing the spliced recording data file, and converting it into a complete voice file that can be recognized; and a transcription scheduling module for fetching and sending the corresponding voice file according to a transcription command;
a recording distribution module 200 for transmitting the voice file to each of the downstream units;
a voice transcription unit 300 for recognizing the voice file with an intelligent voice recognition engine and transcribing it into text content;
the intelligent voice recognition engine comprises an acoustic model and a language model, which correspond, respectively, to computing the probability from the audio features extracted from the voice file to syllables and the probability from syllables to characters;
the language model is built with an N-Gram model;
the acoustic model is built with a deep neural network and a hidden Markov model;
a scene segmentation unit 400 for performing scene segmentation or speaker segmentation on the voice file;
to save cost, call centers today usually use single-channel recording, that is, the user and the customer service agent are recorded simultaneously and stored in the same channel. However, the agent's speech and the user's speech generally need to be analyzed separately: the agent's speech is mainly used to evaluate the agent's service capability, while the user's speech contains information such as the user's potential demands or information about competitors, and so has clear business value. This function is commonly referred to as "speaker separation", also known as "scene segmentation";
a data analysis unit 500 for analyzing the text content and the voice file based on a neural network model and outputting an analysis report, which comprises an audio analysis module and a text analysis module;
the audio analysis module comprises a silence interval detection module, a speech rate detection module and an emotion detection module, for performing silence interval analysis, speech rate analysis and emotion analysis on the voice file, respectively;
the text analysis module analyzes the text content;
a database unit 600 for storing the voice file, the text content and the analysis report;
a content indexing unit 700 for retrieving the data stored in the database according to an indexing command, which comprises:
a data storage module for storing the text content and the associated data generated during transcription;
a data query module for querying and aggregating the voice files and the associated data according to preset query conditions and displaying the results;
a lexicon module with a built-in custom word segmentation lexicon, capable of accurately segmenting power-related sentences;
a text word segmentation processing module for tagging the text content according to the word segmentation result and extracting each segmented word to generate structured text data;
and a text clustering module for performing text clustering and in-depth cross analysis on the structured text data generated within a time period, to obtain the text clustering information for that period.
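Word segmentation against a custom lexicon, as performed by the lexicon and word segmentation modules above, can be illustrated with forward maximum matching. The patent does not state which segmentation algorithm is used, so this technique is swapped in purely as an example; the function name and the sample lexicon entries are hypothetical.

```python
from typing import List, Set


def segment_with_lexicon(text: str, lexicon: Set[str]) -> List[str]:
    """Forward maximum matching: at each position take the longest lexicon word that fits."""
    max_len = max((len(word) for word in lexicon), default=1)
    tokens: List[str] = []
    i = 0
    while i < len(text):
        # Try the longest candidate first; fall back to a single character.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if size == 1 or candidate in lexicon:
                tokens.append(candidate)
                i += size
                break
    return tokens
```

With a lexicon containing power-industry terms such as 变压器 (transformer) and 停电 (power outage), the sentence 变压器停电 is split into exactly those two words, whereas an empty lexicon would split it into single characters.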
In addition, the system further comprises a standardized sentence vector model library containing a plurality of standardized sentence vector models;
each standardized sentence vector model is obtained by computing, with a neural network model, the sentence vector similarity of sentence samples in a corpus and performing standardized training on the sentence vectors that meet a similarity threshold.
Based on the standardized sentence vector model library, the voice transcription unit further comprises a standardization processing module for splitting the text content into sentences, computing a sentence vector for each sentence, and selecting from the library the standardized sentence vector model corresponding to each sentence to perform standardized training on it, so as to output the corresponding standardized sentence;
the standardized sentences are then recombined into standardized text content, so that subsequent text analysis can be more accurate.
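Selecting a standardized sentence by sentence vector similarity can be sketched as below. The sketch assumes sentence vectors are already available (in the system they would come from a neural network model) and uses cosine similarity as the similarity measure, which the patent does not specify. The names `cosine_similarity` and `pick_standard_sentence` and the two-dimensional example vectors are hypothetical.

```python
import math
from typing import Dict, List, Optional


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def pick_standard_sentence(
    vector: List[float], library: Dict[str, List[float]], threshold: float
) -> Optional[str]:
    """Return the library sentence most similar to vector, or None if none reaches threshold."""
    best_sentence: Optional[str] = None
    best_score = threshold
    for sentence, reference in library.items():
        score = cosine_similarity(vector, reference)
        if score >= best_score:
            best_sentence, best_score = sentence, score
    return best_sentence
```

Returning `None` below the threshold lets the caller keep the original transcribed sentence when no standardized form is close enough.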
Based on the operation management analysis system provided by the invention, refined operational applications can be built on the service, for example:
Customer service voice quality inspection: by applying voice analysis technology, flexibly combining the various retrieval functions, and setting different application parameter thresholds, customer service call quality problems can be analyzed and evaluated effectively and comprehensively. Voice analysis can also locate the specific position where a problem occurs, which makes it easier for managers to trace and confirm the problem.
Operation management analysis: by combining the recordings with the recognition results, service efficiency analysis, incoming call reason mining, user demand analysis, call duration analysis, and monitoring of hotspots and trends are carried out to identify service weaknesses, providing auxiliary support for standardizing the service process and optimizing the service flow.
The voice analysis technology supports the following: the voice analysis system detects the variation range of the fundamental frequency, pitch and similar properties of the audio in a telephone recording, predicts emotional fluctuations that may occur in the recording, and locates the position of the emotionally fluctuating audio within the whole recording; it detects and analyzes the average speech rate over the whole recording and the change in speech rate within a given section; it detects the silence durations in which neither the user nor the hotline agent speaks; and it generates an index file in standard XML format to support keyword retrieval, abnormal condition detection, abnormal voice detection and abnormal dialogue detection.
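Two of the measurements described above, average speech rate and silence intervals, can be illustrated from a time-stamped transcript. The sketch assumes each transcribed segment carries start and end times in seconds, measures rate in characters per second of speaking time, and treats any gap of at least `min_gap` seconds between consecutive segments as a silence. The segment format and function names are assumptions, not part of the patent.

```python
from typing import List, Tuple

Segment = Tuple[float, float, str]  # (start_seconds, end_seconds, transcribed text)


def speech_rate(segments: List[Segment]) -> float:
    """Average characters per second of speaking time, ignoring spaces."""
    spoken = sum(end - start for start, end, _ in segments)
    chars = sum(len(text.replace(" ", "")) for _, _, text in segments)
    return chars / spoken if spoken > 0 else 0.0


def long_silences(segments: List[Segment], min_gap: float) -> List[Tuple[float, float]]:
    """Return (gap_start, gap_end) for every pause between segments of at least min_gap seconds."""
    ordered = sorted(segments)
    gaps: List[Tuple[float, float]] = []
    for (_, prev_end, _), (next_start, _, _) in zip(ordered, ordered[1:]):
        if next_start - prev_end >= min_gap:
            gaps.append((prev_end, next_start))
    return gaps
```

The returned gap positions correspond to the kind of location information that the text says is written to the index file for later retrieval.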
The above description merely illustrates the preferred embodiments of the present invention and is not to be construed as limiting it; any modification, equivalent substitution or improvement made within the spirit and principle of the present invention shall fall within its scope of protection.

Claims (8)

1. An operation management analysis system based on intelligent voice technology, comprising:
the recording acquisition unit is used for downloading a recording data file from the telephone recording platform, splicing and converting the file and generating a complete voice file;
the scene segmentation unit is used for carrying out scene segmentation or speaker segmentation on the voice file;
the voice transcription unit is used for recognizing the voice file based on an intelligent voice recognition engine and transcribing the voice file into text content;
the data analysis unit is used for analyzing the text content and the voice file based on a neural network model and outputting an analysis report;
the database unit is used for storing the voice file, the text content and the analysis report;
and the content indexing unit is used for retrieving the data stored in the database according to the indexing command.
2. The system of claim 1, wherein the recording acquisition unit comprises:
the recording downloading module is connected with the telephone recording platform and used for receiving the recording data file segments transmitted by the telephone recording platform at regular time;
the splicing transcoding module is used for splicing the recording data file segments, decompressing the spliced recording data file and converting the recording data file into a complete voice file which can be identified;
and the transcription scheduling module is used for calling the corresponding voice file according to the transcription command and sending the voice file to the voice transcription unit.
3. The system according to claim 1 or 2, wherein the data analysis unit comprises:
an audio analysis module comprising a silence interval detection module, a speech rate detection module, and an emotion detection module, configured to perform silence interval analysis, speech rate analysis, and emotion analysis, respectively, on the voice file;
and a text analysis module configured to analyze the text content.
4. The system of claim 1, further comprising a recording distribution module for transmitting the voice file to the voice transcription unit, the data analysis unit, and the database unit, respectively.
5. The system of claim 1, wherein the intelligent voice recognition engine includes an acoustic model and a language model, which correspond, respectively, to calculating speech-to-syllable probabilities and syllable-to-text probabilities for the audio features extracted from the voice file;
the language model is an N-Gram model;
the acoustic model is built from a deep neural network combined with a hidden Markov model.
6. The system of claim 1, wherein the content indexing unit comprises:
a data storage module configured to store the text content and the associated data generated during transcription;
a data query module configured to query and aggregate the voice files and the associated data according to preset query conditions and to display the results;
a word bank containing a custom word segmentation lexicon that enables accurate segmentation of electric-power-related sentences;
a text word segmentation module configured to mark word segments in the text content based on the segmentation result and to extract each segment to generate structured text data;
and a text clustering module configured to perform text clustering and in-depth cross analysis on the structured text data generated within a time period, to obtain text clustering information for that period.
7. The system of claim 1, further comprising a standardized sentence vector model library comprising a plurality of standardized sentence vector models;
each standardized sentence vector model is obtained by calculating sentence vector similarity for sentence samples in a corpus based on a neural network model and performing standardized training on the sentence vectors that meet a similarity threshold.
8. The system of claim 7, wherein the voice transcription unit further comprises a standardization processing module configured to split the text content into sentences, calculate a sentence vector for each sentence, select from the standardized sentence vector model library the standardized sentence vector model corresponding to each sentence, and perform standardized training on each sentence to output a corresponding standardized sentence.
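The sentence standardization recited in claims 7 and 8 can be pictured with the following minimal Python sketch, offered only as an illustration under stated assumptions. The claims call for sentence vectors produced by a neural network model; this sketch swaps in plain bag-of-words count vectors with cosine similarity so that it runs without any trained model. The function names, the example standard sentences, and the 0.5 threshold are hypothetical, and whitespace tokenization is assumed, so Chinese text would first need word segmentation.

```python
import math
import re
from collections import Counter
from typing import List


def split_sentences(text: str) -> List[str]:
    """Split transcribed text into sentences on common end punctuation."""
    parts = re.split(r"[.!?。！？]+", text)
    return [p.strip() for p in parts if p.strip()]


def sentence_vector(sentence: str) -> Counter:
    """Bag-of-words count vector (a stand-in for a neural sentence vector)."""
    return Counter(sentence.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def standardize_sentence(sentence: str, standards: List[str],
                         threshold: float = 0.5) -> str:
    """Return the most similar standard sentence, or the input unchanged
    when no candidate reaches the (assumed) similarity threshold."""
    vec = sentence_vector(sentence)
    best, best_score = None, 0.0
    for candidate in standards:
        score = cosine(vec, sentence_vector(candidate))
        if score > best_score:
            best, best_score = candidate, score
    return best if best is not None and best_score >= threshold else sentence


def normalize_text(text: str, standards: List[str],
                   threshold: float = 0.5) -> List[str]:
    """Split text into sentences and standardize each one independently."""
    return [standardize_sentence(s, standards, threshold)
            for s in split_sentences(text)]
```

Calling normalize_text on a transcript with a list of standard sentences returns one output sentence per input sentence, replacing those that are close enough to a standard form and passing the rest through unchanged. The standardized training step in the claims is not modeled here; nearest-neighbor selection against fixed standard sentences is used in its place.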
CN201910082514.9A 2019-01-28 2019-01-28 Operation management analysis system based on intelligent voice technology Pending CN111489743A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910082514.9A CN111489743A (en) 2019-01-28 2019-01-28 Operation management analysis system based on intelligent voice technology

Publications (1)

Publication Number Publication Date
CN111489743A (en) 2020-08-04

Family

ID=71810764

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910082514.9A Pending CN111489743A (en) 2019-01-28 2019-01-28 Operation management analysis system based on intelligent voice technology

Country Status (1)

Country Link
CN (1) CN111489743A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2009175943A (en) * 2008-01-23 2009-08-06 Seiko Epson Corp Database system for call center, information management method for database and information management program for database
CN103118361A (en) * 2013-01-21 2013-05-22 吴建进 Recording method and device based on signaling detection system
CN103793515A (en) * 2014-02-11 2014-05-14 安徽科大讯飞信息科技股份有限公司 Service voice intelligent search and analysis system and method
US20180308487A1 (en) * 2017-04-21 2018-10-25 Go-Vivace Inc. Dialogue System Incorporating Unique Speech to Text Conversion Method for Meaningful Dialogue Response

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
常培;刘海舟;: "电信运营商智能语音客服平台研究与分析", 邮电设计技术, no. 09, pages 63 - 67 *
黄翊: "基于智能语音分析的客服智慧运营管理系统解决方案", 《科技传播》, pages 121 - 123 *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113744712A (en) * 2021-07-29 2021-12-03 中国工商银行股份有限公司 Intelligent outbound voice splicing method, device, equipment, medium and program product
CN113743983A (en) * 2021-08-09 2021-12-03 太逗科技集团有限公司 Android application-based electric pin management method, device, equipment and medium
CN114666449A (en) * 2022-03-29 2022-06-24 深圳市银服通企业管理咨询有限公司 Voice data processing method of calling system and calling system
CN116978384A (en) * 2023-09-25 2023-10-31 成都市青羊大数据有限责任公司 Public security integrated big data management system
CN116978384B (en) * 2023-09-25 2024-01-02 成都市青羊大数据有限责任公司 Public security integrated big data management system
CN117672266A (en) * 2023-12-05 2024-03-08 绍兴大明电力建设有限公司 Voiceprint recognition method based on DCN

Similar Documents

Publication Publication Date Title
CN111933129B (en) Audio processing method, language model training method and device and computer equipment
CN110853649A (en) Label extraction method, system, device and medium based on intelligent voice technology
CN108305634B (en) Decoding method, decoder and storage medium
US11189272B2 (en) Dialect phoneme adaptive training system and method
JP6772198B2 (en) Language model speech end pointing
EP1564722B1 (en) Automatic identification of telephone callers based on voice characteristics
Juang et al. Automatic recognition and understanding of spoken language-a first step toward natural human-machine communication
US8831947B2 (en) Method and apparatus for large vocabulary continuous speech recognition using a hybrid phoneme-word lattice
CN111489743A (en) Operation management analysis system based on intelligent voice technology
Mao et al. Speech recognition and multi-speaker diarization of long conversations
CN111489765A (en) Telephone traffic service quality inspection method based on intelligent voice technology
Rabiner et al. An overview of automatic speech recognition
US11056100B2 (en) Acoustic information based language modeling system and method
CN111489754A (en) Telephone traffic data analysis method based on intelligent voice technology
CN111105785B (en) Text prosody boundary recognition method and device
CN100354929C (en) Voice processing device and method, recording medium, and program
CN114818649A (en) Service consultation processing method and device based on intelligent voice interaction technology
CN111081219A (en) End-to-end voice intention recognition method
CN112397054A (en) Power dispatching voice recognition method
CN114120985A (en) Pacifying interaction method, system and equipment of intelligent voice terminal and storage medium
CN111414748A (en) Traffic data processing method and device
CN111402887A (en) Method and device for escaping characters by voice
EP0177854B1 (en) Keyword recognition system using template-concatenation model
Thakur et al. NLP & AI speech recognition: an analytical review
Žgank et al. Slovenian spontaneous speech recognition and acoustic modeling of filled pauses and onomatopoeas

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination