CN110705218A - Outbound state identification mode based on deep learning - Google Patents

Outbound state identification mode based on deep learning

Info

Publication number
CN110705218A
Authority
CN
China
Prior art keywords
audio
outbound
deep learning
text
converting
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910962912.XA
Other languages
Chinese (zh)
Other versions
CN110705218B (en)
Inventor
Wang Lei (王磊)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang Baiying Technology Co Ltd
Original Assignee
Zhejiang Baiying Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang Baiying Technology Co Ltd
Priority to CN201910962912.XA
Publication of CN110705218A
Application granted
Publication of CN110705218B
Legal status: Active (current)
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/60 Information retrieval; Database structures therefor; File system structures therefor of audio data
    • G06F16/65 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/903 Querying
    • G06F16/90335 Query processing
    • G06F16/90344 Query processing by using string matching techniques
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M3/00 Automatic or semi-automatic exchanges
    • H04M3/42 Systems providing special services or facilities to subscribers
    • H04M3/50 Centralised arrangements for answering calls; Centralised arrangements for recording messages for absent or busy subscribers; Centralised arrangements for recording messages
    • H04M3/527 Centralised call answering arrangements not requiring operator intervention

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Electrically Operated Instructional Devices (AREA)
  • Telephonic Communication Services (AREA)

Abstract

The invention provides an outbound state identification method based on deep learning and belongs to the technical field of outbound calling. It solves the problem that existing outbound call identification is inefficient. The method comprises: downloading multiple sentences of audio files; clipping the audio with an audio editing tool; converting the clipped audio into the two-dimensional spectrogram image signals required by a neural network through framing, windowing and similar operations; using a VGG-style deep convolutional neural network as the network model and training it, which outputs a large number of consecutively repeated symbols; decoding with CTC to merge consecutive identical symbols into one symbol; performing n-gram segmentation on a Chinese corpus to build a statistical language model; modeling the pinyin-to-text conversion as a hidden Markov chain; and converting the pinyin into the final recognized text for output. This scheme significantly improves outbound call identification efficiency.

Description

Outbound state identification mode based on deep learning
Technical Field
The invention belongs to the technical field of man-machine conversation, relates to outbound systems, and in particular relates to an outbound state identification method based on deep learning.
Background
An outbound system is a conventional call-center service in which an agent actively dials a user's number and talks with the user to complete specific tasks such as marketing or surveys. An automatic outbound system is an application in which a computer takes the place of the agent: the computer dials the user's number and converses with the user to complete a specific task, fully replacing manual work for notification, reminding and confirmation, and thereby saving part of the labor cost.
At present, in the field of machine outbound calling, invalid calls account for more than two thirds of the total outbound volume. This large amount of invalid calling not only lowers the conversion rate of outbound calls but also seriously wastes resources. As outbound call services keep growing, outbound calls need to be identified and screened to improve dialing efficiency and save resources.
Existing outbound call identification methods are as follows: 1. developing a general speech recognition engine that converts the audio file directly into text; such an engine needs a large amount of corpus data, building that corpus consumes considerable manpower and material resources, and because English audio also occurs the engine must support both Chinese and English recognition, so its development cost is very high and the model training period is long; 2. collecting a large number of voice recordings of call states, such as powered-off, temporarily unreachable, call in progress, vacant number, no answer, incoming-call reminder, incoming-call restriction, network busy, outgoing-call restriction, line busy, user rejection, call forwarding, standard ring-back beep, personalized ring-back (color ring) and other prompts; building these recordings into a voice library; converting the analog voice signal into a digital signal; and comparing it with the samples in the voice library for classification.
Disclosure of Invention
In view of the above problems in the prior art, the invention aims to provide an outbound state identification method based on deep learning. The technical problem to be solved by the invention is how to improve the efficiency of outbound call identification.
The purpose of the invention can be realized by the following technical scheme:
a outbound state identification mode based on deep learning is characterized by comprising the following steps:
S1, downloading audio files of powered-off, temporarily unreachable, call in progress, vacant number, no answer, incoming-call reminder, incoming-call restriction, network busy, outgoing-call restriction, line busy, user rejection, call forwarding, standard ring-back beep, personalized ring-back (color ring) and other call states;
S2, using an audio clipping tool to cut away the silence at the head and tail of the audio and deleting the English part of the audio file;
S3, converting the clipped audio into a time-domain spectrum matrix;
S4, converting the clipped audio into the two-dimensional spectrogram image signal required by the neural network through framing, windowing and similar operations, using a VGG-style deep convolutional neural network as the network model and training it, the network outputting a large number of consecutively repeated symbols, then decoding with CTC to merge consecutive identical symbols into one symbol;
S5, performing n-gram segmentation on the Chinese corpus to build a statistical language model, modeling the pinyin-to-text conversion as a hidden Markov chain, and converting the pinyin into the final recognized text for output;
S6, performing regular-expression matching on the text and outputting the matched category (a sketch of such matching follows this step list);
and S7, identifying and labeling the audio according to the output result.
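By way of illustration of step S6, the following sketch classifies a recognized text into an outbound-call state using regular-expression matching. The state names, the classify_state function and the example patterns are assumptions introduced for this sketch, not the patterns prescribed by the invention:

```python
import re

# Illustrative pattern table: each outbound state is matched by one or more
# regular expressions over the recognized (Chinese) text. The patterns below
# are assumed examples; a real deployment would use the carrier prompts
# actually observed in the outbound recordings.
STATE_PATTERNS = {
    "powered_off":     [r"已关机", r"关机"],
    "unreachable":     [r"暂时无法接通", r"不在服务区"],
    "vacant_number":   [r"空号", r"号码不存在"],
    "busy":            [r"正在通话中", r"忙"],
    "no_answer":       [r"无人接听"],
    "call_forwarding": [r"呼叫转移"],
}

def classify_state(text: str) -> str:
    """Return the first outbound state whose pattern matches the text,
    or 'unknown' if nothing matches."""
    for state, patterns in STATE_PATTERNS.items():
        for pattern in patterns:
            if re.search(pattern, text):
                return state
    return "unknown"

if __name__ == "__main__":
    print(classify_state("您拨打的电话已关机"))   # -> powered_off
    print(classify_state("您拨打的号码是空号"))   # -> vacant_number
```

The matched category is what step S7 would then use to label the corresponding audio and phone number.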
In the above outbound state recognition method based on deep learning, in step S2 the audio clipping tool intercepts 8 s of audio from the beginning and 8 s from the end of the recording and splices them together.
In the above outbound state recognition method based on deep learning, in step S2, when the total audio duration is less than 16 s, the audio is padded with silence so that its duration becomes 16 s.
In the above outbound state recognition method based on deep learning, in step S2, the audio clipping tool uses VAD (voice activity detection) technology.
In the above outbound state recognition method based on deep learning, in step S1, 50 or more sentences of audio files of call states such as powered-off, temporarily unreachable, call in progress, vacant number, no answer, incoming-call reminder, incoming-call restriction, network busy, outgoing-call restriction, line busy, user rejection, call forwarding, standard ring-back beep, personalized ring-back (color ring) and the like are downloaded.
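As a non-limiting sketch of the preprocessing described above for step S2 (trimming leading and trailing silence, splicing 8 s from the head and 8 s from the tail, and padding to a fixed 16 s), the code below assumes a 16 kHz mono waveform; the simple energy threshold stands in for the VAD technology, and all names and parameter values are illustrative:

```python
import numpy as np

SR = 16000              # assumed sample rate (Hz)
SEGMENT_SECONDS = 8     # 8 s kept from the head and 8 s from the tail
TARGET_SECONDS = 16     # total duration after splicing / padding

def trim_silence(wave: np.ndarray, frame_len: int = 400, threshold: float = 1e-4) -> np.ndarray:
    """Crude energy-based stand-in for VAD: drop low-energy frames at both ends."""
    n_frames = len(wave) // frame_len
    energies = [np.mean(wave[i * frame_len:(i + 1) * frame_len] ** 2) for i in range(n_frames)]
    voiced = [i for i, e in enumerate(energies) if e > threshold]
    if not voiced:
        return wave
    return wave[voiced[0] * frame_len:(voiced[-1] + 1) * frame_len]

def clip_and_pad(wave: np.ndarray) -> np.ndarray:
    """Splice the first and last 8 s of the trimmed audio and pad with silence to 16 s."""
    wave = trim_silence(wave)
    seg = SEGMENT_SECONDS * SR
    if len(wave) > 2 * seg:
        wave = np.concatenate([wave[:seg], wave[-seg:]])
    target = TARGET_SECONDS * SR
    if len(wave) < target:                      # pad with silence when shorter than 16 s
        wave = np.pad(wave, (0, target - len(wave)))
    return wave[:target]
```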
Compared with the prior art, the outbound state identification method based on deep learning has the following advantages:
1. The invention combines deep learning with regular-expression matching, so only a small amount of corpus data is needed to train a speech recognition model for the outbound domain; through regular-expression matching the recognition result reaches an accuracy of more than 95%, numbers are classified correctly, and the production cost is low.
2. The invention performs recognition in real time; it only needs to recognize the Chinese part accurately and does not need to handle the English part in online use, which increases recognition speed and enables millisecond-level response.
3. The method has low maintenance cost; only bad cases need to be collected and the model retrained.
Drawings
FIG. 1 is the first flow chart of deep-learning-based outbound state recognition according to the present invention.
FIG. 2 is the second flow chart of deep-learning-based outbound state recognition according to the present invention.
Detailed Description
The following are specific embodiments of the present invention and are further described with reference to the drawings, but the present invention is not limited to these embodiments.
As shown in FIG. 1 and FIG. 2, the outbound state recognition method based on deep learning comprises the following steps:
S1, downloading more than 50 sentences of audio files of powered-off, temporarily unreachable, call in progress, vacant number, no answer, incoming-call reminder, incoming-call restriction, network busy, outgoing-call restriction, line busy, user rejection, call forwarding, standard ring-back beep, personalized ring-back (color ring) and other call states;
S2, using VAD technology to cut away the silence at the head and tail of the audio, deleting the English parts of the audio file, using the audio editing tool to intercept and splice 8 s of audio from the head and 8 s from the tail, and padding with silence so that the audio duration reaches 16 s when the total duration is less than 16 s;
S3, converting the clipped audio into a time-domain spectrum matrix;
S4, converting the clipped audio into the two-dimensional spectrogram image signal required by the neural network through framing, windowing and similar operations, using a VGG-style deep convolutional neural network as the network model and training it, the network outputting a large number of consecutively repeated symbols, then decoding with CTC to merge consecutive identical symbols into one symbol (a sketch of this conversion and decoding follows this step list);
S5, performing n-gram segmentation on the Chinese corpus to build a statistical language model, modeling the pinyin-to-text conversion as a hidden Markov chain, and converting the pinyin into the final recognized text for output (a sketch of this conversion is given further below);
S6, performing regular-expression matching on the text and outputting the matched category;
and S7, identifying and labeling the audio according to the output result.
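As referenced in step S4 above, the following is a minimal sketch of the framing-and-windowing conversion into a two-dimensional spectrogram and of the greedy CTC collapse of the network output; the frame sizes, the blank index and the function names are assumptions for illustration, and the VGG-style acoustic network itself is omitted:

```python
import numpy as np

def spectrogram(wave: np.ndarray, frame_len: int = 400, hop: int = 160) -> np.ndarray:
    """Frame the waveform, apply a Hamming window, and take the magnitude FFT
    of each frame, giving the 2-D time-frequency image fed to the CNN."""
    window = np.hamming(frame_len)
    frames = []
    for start in range(0, len(wave) - frame_len + 1, hop):
        frame = wave[start:start + frame_len] * window
        frames.append(np.abs(np.fft.rfft(frame)))
    return np.stack(frames)          # shape: (num_frames, frame_len // 2 + 1)

def ctc_greedy_decode(symbol_ids, blank_id: int = 0):
    """Collapse the per-frame network outputs: merge runs of identical symbols
    into one symbol, then remove the CTC blank."""
    collapsed, previous = [], None
    for s in symbol_ids:
        if s != previous:
            collapsed.append(s)
        previous = s
    return [s for s in collapsed if s != blank_id]

# Example: per-frame argmax indices produced by the acoustic model
print(ctc_greedy_decode([0, 3, 3, 0, 0, 5, 5, 5, 0, 3]))   # -> [3, 5, 3]
```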
The outbound recognition method of the invention recognizes only the Chinese part and ignores the English part, which reduces the recognition workload and improves recognition efficiency. The length of each audio segment is limited to 16 s, which avoids slow prediction on overly long audio. Because deep learning is combined with regular-expression matching, a speech recognition model for the outbound domain can be trained with only a small number of audio samples. The recognition result is matched by regular expressions in real time, so millisecond-level response can be achieved in online use, number classification reaches an accuracy of more than 95%, invalid numbers are effectively identified, identification efficiency is improved, and resources are saved. In addition, because few audio samples are required, the maintenance cost of the sound library is reduced and the production cost of the model is further lowered.
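As referenced in step S5 above, the pinyin-to-text conversion can be modeled as a hidden Markov chain whose transition probabilities come from an n-gram (here bigram) statistical language model and whose emissions map Chinese characters to their pinyin. The sketch below runs Viterbi decoding over a toy vocabulary; the tiny lexicon and all probabilities are invented for illustration and are not the corpus statistics of the invention:

```python
import math

# Toy emission model: which characters can produce a given pinyin syllable
# (uniform over homophones). A real system derives this from a pinyin lexicon.
PINYIN_TO_CHARS = {
    "guan": ["关", "官"],
    "ji":   ["机", "己"],
}

# Toy bigram language model P(next_char | prev_char); "<s>" is sentence start.
BIGRAM = {
    ("<s>", "关"): 0.6, ("<s>", "官"): 0.4,
    ("关", "机"): 0.9,  ("关", "己"): 0.1,
    ("官", "机"): 0.2,  ("官", "己"): 0.8,
}

def viterbi(pinyin_seq, smooth=1e-6):
    """Return the most probable character sequence for a pinyin sequence."""
    prev_scores = {"<s>": 0.0}            # log-probabilities of partial paths
    backpointers = []
    for syllable in pinyin_seq:
        scores, back = {}, {}
        for char in PINYIN_TO_CHARS[syllable]:
            emit = math.log(1.0 / len(PINYIN_TO_CHARS[syllable]))
            best_prev, best_score = None, float("-inf")
            for prev, prev_score in prev_scores.items():
                trans = math.log(BIGRAM.get((prev, char), smooth))
                if prev_score + trans + emit > best_score:
                    best_prev, best_score = prev, prev_score + trans + emit
            scores[char], back[char] = best_score, best_prev
        backpointers.append(back)
        prev_scores = scores
    # Trace back the best path.
    char = max(prev_scores, key=prev_scores.get)
    path = [char]
    for back in reversed(backpointers[1:]):
        char = back[char]
        path.append(char)
    return "".join(reversed(path))

print(viterbi(["guan", "ji"]))   # -> "关机" (powered off)
```

In practice the bigram table would be estimated from the n-gram segmented Chinese corpus built in step S5 rather than hand-specified as above.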
The specific embodiments described herein are merely illustrative of the spirit of the invention. Various modifications or additions may be made to the described embodiments or alternatives may be employed by those skilled in the art without departing from the spirit or ambit of the invention as defined in the appended claims.

Claims (5)

1. An outbound state identification method based on deep learning, characterized by comprising the following steps:
S1, downloading audio files of powered-off, temporarily unreachable, call in progress, vacant number, no answer, incoming-call reminder, incoming-call restriction, network busy, outgoing-call restriction, line busy, user rejection, call forwarding, standard ring-back beep, personalized ring-back (color ring) and other call states;
S2, using an audio clipping tool to cut away the silence at the head and tail of the audio and deleting the English part of the audio file;
S3, converting the clipped audio into a time-domain spectrum matrix;
S4, converting the clipped audio into the two-dimensional spectrogram image signal required by the neural network through framing, windowing and similar operations, using a VGG-style deep convolutional neural network as the network model and training it, the network outputting a large number of consecutively repeated symbols, then decoding with CTC to merge consecutive identical symbols into one symbol;
S5, performing n-gram segmentation on the Chinese corpus to build a statistical language model, modeling the pinyin-to-text conversion as a hidden Markov chain, and converting the pinyin into the final recognized text for output;
S6, performing regular-expression matching on the text and outputting the matched category;
and S7, identifying and labeling the audio according to the output result.
2. The outbound state recognition method based on deep learning of claim 1, wherein in step S2 the audio editing tool intercepts 8 s of audio from the head and 8 s from the tail of the recording and splices them together.
3. The outbound state recognition method based on deep learning of claim 2, wherein in step S2, when the total duration of the audio is less than 16 s, the audio is padded with silence so that its duration becomes 16 s.
4. The outbound state recognition method based on deep learning of any one of claims 1 to 3, wherein in step S2 the audio editing tool uses VAD technology.
5. The outbound state recognition method based on deep learning of any one of claims 1 to 3, wherein in step S1, 50 or more sentences of audio files of call states such as powered-off, temporarily unreachable, call in progress, vacant number, no answer, incoming-call reminder, incoming-call restriction, network busy, outgoing-call restriction, line busy, user rejection, call forwarding, standard ring-back beep, personalized ring-back (color ring) and the like are downloaded.
CN201910962912.XA 2019-10-11 2019-10-11 Outbound state identification mode based on deep learning Active CN110705218B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910962912.XA CN110705218B (en) 2019-10-11 2019-10-11 Outbound state identification mode based on deep learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910962912.XA CN110705218B (en) 2019-10-11 2019-10-11 Outbound state identification mode based on deep learning

Publications (2)

Publication Number Publication Date
CN110705218A (en) 2020-01-17
CN110705218B (en) 2023-04-07

Family

ID=69198453

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910962912.XA Active CN110705218B (en) 2019-10-11 2019-10-11 Outbound state identification mode based on deep learning

Country Status (1)

Country Link
CN (1) CN110705218B (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105979106A (en) * 2016-06-13 2016-09-28 北京容联易通信息技术有限公司 Ring tone recognition method and system for call center system
CN109670041A (en) * 2018-11-29 2019-04-23 天格科技(杭州)有限公司 A kind of band based on binary channels text convolutional neural networks is made an uproar illegal short text recognition methods
CN109859760A (en) * 2019-02-19 2019-06-07 成都富王科技有限公司 Phone robot voice recognition result bearing calibration based on deep learning
CN110059161A (en) * 2019-04-23 2019-07-26 深圳市大众通信技术有限公司 A kind of call voice robot system based on Text Classification
CN110211569A (en) * 2019-07-09 2019-09-06 浙江百应科技有限公司 Real-time gender identification method based on voice map and deep learning

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112003991A (en) * 2020-09-02 2020-11-27 深圳壹账通智能科技有限公司 Outbound method and related equipment
CN112735583A (en) * 2020-12-25 2021-04-30 山东众阳健康科技集团有限公司 Traditional Chinese medicine health preserving robot and method
CN113438368A (en) * 2021-06-22 2021-09-24 上海翰声信息技术有限公司 Method, device and computer readable storage medium for realizing ring back tone detection

Also Published As

Publication number Publication date
CN110705218B (en) 2023-04-07

Similar Documents

Publication Publication Date Title
CN112804400B (en) Customer service call voice quality inspection method and device, electronic equipment and storage medium
CN110705218B (en) Outbound state identification mode based on deep learning
CN111246027B (en) Voice communication system and method for realizing man-machine cooperation
CN111128126B (en) Multi-language intelligent voice conversation method and system
US8457964B2 (en) Detecting and communicating biometrics of recorded voice during transcription process
EP1992154B1 (en) A mass-scale, user-independent, device-independent, voice message to text conversion system
US6327343B1 (en) System and methods for automatic call and data transfer processing
US8976944B2 (en) Mass-scale, user-independent, device-independent voice messaging system
WO2021218086A1 (en) Call control method and apparatus, computer device, and storage medium
CN111489765A (en) Telephone traffic service quality inspection method based on intelligent voice technology
CN112188017A (en) Information interaction method, information interaction system, processing equipment and storage medium
CN111128241A (en) Intelligent quality inspection method and system for voice call
CN111294471A (en) Intelligent telephone answering method and system
CN116665676B (en) Semantic recognition method for intelligent voice outbound system
CN201355842Y (en) Large-scale user-independent and device-independent voice message system
CN111901488B (en) Method for improving outbound efficiency of voice robot based on number state
CN103067579A (en) Method and device assisting in on-line voice chat
CN113779217A (en) Intelligent voice outbound service method and system based on human-computer interaction
CN102196100A (en) Instant call translation system and method
CN203278958U (en) Conversation transcription system
CN115022471A (en) Intelligent robot voice interaction system and method
CN114328867A (en) Intelligent interruption method and device in man-machine conversation
CN112506405A (en) Artificial intelligent voice large screen command method based on Internet supervision field
CN110728145A (en) Method for establishing natural language understanding model based on recording conversation
RU2783966C1 (en) Method for processing incoming calls

Legal Events

Code Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant