CN105895116A - Dual track voice break-in and interruption analysis method - Google Patents
- Publication number: CN105895116A (application CN201610209686.4A; granted publication CN105895116B)
- Authority: CN (China)
- Legal status: Granted (the legal status is an assumption by Google and is not a legal conclusion; no legal analysis has been performed)
Classifications
- G — PHYSICS
- G10 — MUSICAL INSTRUMENTS; ACOUSTICS
- G10L — SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00 — Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/78 — Detection of presence or absence of voice signals
- G10L25/87 — Detection of discrete points within a voice signal
- G10L25/48 — Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
Abstract
The present invention discloses a dual-track voice break-in and interruption analysis method. The method comprises: performing effective voice endpoint detection on the recording streams of the two channels using voice activity detection technology, to find every talk spurt across the whole recording; unifying the endpoint times of each segment according to the effective voice endpoints of the two channel recordings, describing every endpoint uniformly by three attributes (time point, channel, endpoint type), and laying all endpoints out on a single time axis; and traversing all time points from front to back, analyzing whether each endpoint is a start endpoint or an end endpoint. The method can promptly capture break-ins and interruptions whenever they occur between two or more speakers and trigger follow-up handling, thereby helping to avoid the impolite conversational patterns of breaking in and interrupting and providing a strong guarantee of customer-service quality.
Description
Technical field
The invention belongs to the field of customer-service communication technology, and in particular relates to a dual-track voice break-in and interruption analysis method.
Background technology
Voice customer service refers to customer service conducted mainly by telephone. During such service, break-ins and interruptions often occur between two or more speakers. A break-in occurs between two speakers when one speaker has just finished and the other immediately starts speaking, with no pause in between; in conversation this is an impolite pattern and can be perceived by the other party as rude or perfunctory. An interruption occurs between two speakers when one is still talking and the other cuts in directly to state their own opinion, which is an even more impolite conversational pattern. These break-in and interruption phenomena seriously degrade the quality of customer service.
Summary of the invention
The object of the invention is to provide a dual-track voice break-in and interruption analysis method, intended to solve the problem of break-ins and interruptions occurring during customer service.
The invention is achieved as follows: the dual-track voice break-in and interruption analysis method comprises the following steps.
Step 1: perform effective voice endpoint detection on the recording streams of the two channels using voice activity detection technology, finding every talk spurt (from which second to which second speech occurs) across the whole recording.
Step 2: according to the effective voice endpoints of the two channel recordings, unify the endpoint times of each segment, describe every endpoint uniformly by the three attributes of time point, channel, and endpoint type, and lay all endpoints out on a single time axis.
Step 3: for two adjacent endpoints, if the earlier one is a start endpoint of speaker A's speech and the later one is an end endpoint of speaker B's speech, this is an interruption.
Step 4: for two adjacent endpoints, if the earlier one is an end endpoint of speaker A's speech, the later one is a start endpoint of speaker B's speech, and the time difference between the two endpoints is less than 200 ms, this is a break-in.
The invention further adopts the following technical measures.
The effective voice endpoints in step 1 carry three attributes: start time, end time, and speaker.
In step 2, the endpoint types are "start" and "end".
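Steps 1 and 2 above (endpoint extraction and tiling onto a common time axis) can be sketched as follows. This is an illustrative sketch rather than the patented implementation: the `Endpoint` structure and the `(start_ms, end_ms)` segment format are assumptions, and the voice activity detection output is taken as given.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Endpoint:
    time_ms: int   # position on the shared time axis
    channel: str   # "A" or "B" -- one speaker per channel
    kind: str      # endpoint type: "start" or "end"

def tile_endpoints(segments_a, segments_b):
    """Flatten both channels' VAD segments onto one time axis (step 2).

    segments_a / segments_b are lists of (start_ms, end_ms) talk spurts
    produced by voice activity detection (step 1).
    """
    points = []
    for channel, segments in (("A", segments_a), ("B", segments_b)):
        for start_ms, end_ms in segments:
            points.append(Endpoint(start_ms, channel, "start"))
            points.append(Endpoint(end_ms, channel, "end"))
    # Sort by time so adjacent endpoints can be compared front to back.
    return sorted(points, key=lambda p: p.time_ms)

# If A speaks 0-1000 ms and B speaks 800-1500 ms, the tiled axis reads
# A-start, B-start, A-end, B-end: the start-before-end pair from different
# channels is exactly the overlap that step 3 classifies as an interruption.
timeline = tile_endpoints([(0, 1000)], [(800, 1500)])
```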
The analysis of endpoint types comprises the following steps.
Step 1: check the endpoint type.
Step 2: if it is a start endpoint, judge whether the top of the stack holds a start endpoint.
Step 3: if the stack top holds a start endpoint, judge whether that start endpoint's speaker is the same as the current start endpoint's.
Step 4: if they are the same, the data is corrupt: one person cannot begin speaking again before finishing.
Step 5: if they differ, an interruption has occurred; record the interruption information and pop the stack top.
Step 6: if the stack top holds no start endpoint, push the start endpoint, advance the endpoint position by 1, and continue the loop.
Step 7: if it is an end endpoint, judge whether the stack top holds a start endpoint.
Step 8: if the stack top holds a start endpoint, judge whether that start endpoint's speaker is the same as the current end endpoint's.
Step 9: if they are the same, this is a normal endpoint with no interruption; record the time point of this end endpoint.
Step 10: if they differ, the data is erroneous: an earlier interruption occurred but was not recorded.
Step 11: if the stack top holds no start endpoint, check whether the current start position and the end position of the previous endpoint lie within 200 ms of each other; if so, it is a break-in: record when the break-in occurred and pop the stack top.
Step 12: record all break-in and interruption information; each record contains a start time, an end time, a type, and a direction (who broke in on whom).
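A minimal sketch of the traversal described in steps 1-12 above, using plain `(time_ms, channel, kind)` tuples. This is a simplified reading of the claimed stack procedure (the corrupt-data branches of steps 4 and 10 are omitted), with the 200 ms threshold of step 11.

```python
def analyze(timeline, breakin_ms=200):
    """timeline: chronologically sorted (time_ms, channel, kind) tuples,
    kind in {"start", "end"}. Returns recorded break-in / interruption events
    as (label, time_ms, interrupting_channel, interrupted_channel)."""
    events = []
    stack = []       # open start endpoints, per the claimed stack procedure
    last_end = None  # most recent end endpoint: (time_ms, channel)
    for time_ms, channel, kind in timeline:
        if kind == "start":
            if stack and stack[-1][1] != channel:
                # The other speaker is still talking: interruption.
                events.append(("interruption", time_ms, channel, stack[-1][1]))
            elif last_end and last_end[1] != channel \
                    and time_ms - last_end[0] < breakin_ms:
                # The other speaker just finished, with no audible pause: break-in.
                events.append(("break-in", time_ms, channel, last_end[1]))
            stack.append((time_ms, channel, kind))
        else:
            # Pop this channel's open start; this end becomes the reference
            # point for the break-in check on the next start endpoint.
            stack[:] = [p for p in stack if p[1] != channel]
            last_end = (time_ms, channel)
    return events
```

For example, `analyze` flags a break-in when B starts 50 ms after A ends, an interruption when B starts while A is still speaking, and nothing when the gap exceeds the threshold.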
The invention has the following advantages and positive effects: the dual-track voice break-in and interruption analysis method can promptly capture break-ins and interruptions whenever they occur between two or more speakers and carry out follow-up handling, avoiding the impolite conversational patterns of breaking in and interrupting and providing a strong guarantee of customer-service quality.
Description of the drawings
Fig. 1 is a flow diagram of the dual-track voice break-in and interruption analysis method provided by the embodiment of the invention;
Fig. 2 is a flow diagram of the endpoint-type analysis method provided by the embodiment of the invention.
Detailed description of the invention
To make the object, technical scheme, and advantages of the invention clearer, the invention is further elaborated below in conjunction with embodiments. It should be understood that the specific embodiments described here serve only to explain the invention and are not intended to limit it.
The application principle of the invention is further described below with reference to Figs. 1 and 2 and a specific embodiment.
The dual-track voice break-in and interruption analysis method comprises the following steps.
S101: perform effective voice endpoint detection on the recording streams of the two channels using voice activity detection technology, finding every talk spurt across the whole recording.
S102: according to the effective voice endpoints of the two channel recordings, unify the endpoint times of each segment, describe every endpoint by the three attributes of time point, channel, and endpoint type, and lay all endpoints out on a single time axis.
S103: for two adjacent endpoints, if the earlier one is a start endpoint of speaker A's speech and the later one is an end endpoint of speaker B's speech, this is an interruption.
S104: for two adjacent endpoints, if the earlier one is an end endpoint of speaker A's speech, the later one is a start endpoint of speaker B's speech, and the time difference between the two endpoints is less than 200 ms, this is a break-in.
The effective voice endpoints in S101 carry three attributes: start time, end time, and speaker.
In S102, the endpoint types are "start" and "end".
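The patent names voice activity detection in S101 but does not specify a particular algorithm. A toy frame-energy detector illustrates the kind of `(start_ms, end_ms)` output the later steps consume; the frame length, energy threshold, and sample rate below are illustrative assumptions, not values from the invention.

```python
def detect_segments(samples, frame_len=160, threshold=500.0, sample_rate_hz=8000):
    """Toy energy-based VAD: return (start_ms, end_ms) speech segments.

    An illustrative stand-in for S101. Splits the signal into fixed-length
    frames, marks a frame as speech when its mean squared amplitude exceeds
    the threshold, and merges consecutive speech frames into segments.
    """
    ms_per_frame = 1000 * frame_len // sample_rate_hz
    segments, start = [], None
    n_frames = len(samples) // frame_len
    for i in range(n_frames):
        frame = samples[i * frame_len:(i + 1) * frame_len]
        energy = sum(s * s for s in frame) / frame_len
        if energy > threshold and start is None:
            start = i * ms_per_frame            # speech begins
        elif energy <= threshold and start is not None:
            segments.append((start, i * ms_per_frame))  # speech ends
            start = None
    if start is not None:                        # speech runs to the end
        segments.append((start, n_frames * ms_per_frame))
    return segments
```

Running one such detector per channel yields the two segment lists that S102 tiles onto the common time axis.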
The endpoint-type analysis method comprises the following steps.
S201: check the endpoint type.
S202: if it is a start endpoint, judge whether the stack top holds a start endpoint.
S203: if the stack top holds a start endpoint, judge whether its speaker is the same as the current start endpoint's.
S204: if they are the same, the data is corrupt: one person cannot begin speaking again before finishing.
S205: if they differ, an interruption has occurred; record the interruption information and pop the stack top.
S206: if the stack top holds no start endpoint, push the start endpoint, advance the endpoint position by 1, and continue the loop.
S207: if it is an end endpoint, judge whether the stack top holds a start endpoint.
S208: if the stack top holds a start endpoint, judge whether its speaker is the same as the current end endpoint's.
S209: if they are the same, this is a normal endpoint with no interruption; record the time point of this end endpoint.
S210: if they differ, the data is erroneous: an earlier interruption occurred but was not recorded.
S211: if the stack top holds no start endpoint, check whether the current start position and the end position of the previous endpoint lie within 200 ms of each other; if so, it is a break-in: record when the break-in occurred and pop the stack top.
S212: record all break-in and interruption information; each record contains a start time, an end time, a type (break-in or interruption), and a direction (who broke in on whom).
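The report entries described in S212 could be represented as a simple record with the four listed fields; the field names and the `"B over A"` direction format below are assumptions for illustration, not part of the claimed method.

```python
from dataclasses import dataclass

@dataclass
class OvertalkRecord:
    start_ms: int   # when the break-in / interruption begins
    end_ms: int     # when the overlap or quick take-over ends
    kind: str       # type: "break-in" or "interruption"
    direction: str  # who broke in on whom, e.g. "B over A"

# Hypothetical entry: B interrupts A at 800 ms, and the overlap lasts
# until A stops speaking at 1000 ms.
record = OvertalkRecord(start_ms=800, end_ms=1000,
                        kind="interruption", direction="B over A")
```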
The dual-track voice break-in and interruption analysis method can promptly capture break-ins and interruptions whenever they occur between two or more speakers and carry out follow-up handling, avoiding the impolite conversational patterns of breaking in and interrupting and providing a strong guarantee of customer-service quality.
The foregoing is only a preferred embodiment of the invention and is not intended to limit it; any modification, equivalent replacement, or improvement made within the spirit and principle of the invention shall fall within the protection scope of the invention.
Claims (4)
1. A dual-track voice break-in and interruption analysis method, characterized in that the method comprises the following steps:
step 1: performing effective voice endpoint detection on the recording streams of the two channels using voice activity detection technology, finding every talk spurt across the whole recording;
step 2: according to the effective voice endpoints of the two channel recordings, unifying the endpoint times of each segment, describing every endpoint by the three attributes of time point, channel, and endpoint type, and laying all endpoints out on a single time axis;
step 3: for two adjacent endpoints, if the earlier one is a start endpoint of speaker A's speech and the later one is an end endpoint of speaker B's speech, identifying an interruption;
step 4: for two adjacent endpoints, if the earlier one is an end endpoint of speaker A's speech, the later one is a start endpoint of speaker B's speech, and the time difference between the two endpoints is less than 200 ms, identifying a break-in.
2. The dual-track voice break-in and interruption analysis method of claim 1, characterized in that the effective voice endpoints in step 1 carry three attributes: start time, end time, and speaker.
3. The dual-track voice break-in and interruption analysis method of claim 1, characterized in that in step 2 the endpoint types are "start" and "end".
4. The dual-track voice break-in and interruption analysis method of claim 1, characterized in that the endpoint-type analysis method comprises the following steps:
step 1: checking the endpoint type;
step 2: if it is a start endpoint, judging whether the stack top holds a start endpoint;
step 3: if the stack top holds a start endpoint, judging whether its speaker is the same as the current start endpoint's;
step 4: if they are the same, the data is corrupt: one person cannot begin speaking again before finishing;
step 5: if they differ, an interruption has occurred: recording the interruption information and popping the stack top;
step 6: if the stack top holds no start endpoint, pushing the start endpoint, advancing the endpoint position by 1, and continuing the loop;
step 7: if it is an end endpoint, judging whether the stack top holds a start endpoint;
step 8: if the stack top holds a start endpoint, judging whether its speaker is the same as the current end endpoint's;
step 9: if they are the same, this is a normal endpoint with no interruption: recording the time point of this end endpoint;
step 10: if they differ, the data is erroneous: an earlier interruption occurred but was not recorded;
step 11: if the stack top holds no start endpoint, checking whether the current start position and the end position of the previous endpoint lie within 200 ms of each other; if so, it is a break-in: recording when the break-in occurred and popping the stack top;
step 12: recording all break-in and interruption information, wherein each record comprises a start time, an end time, a type, and a direction.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610209686.4A CN105895116B (en) | 2016-04-06 | 2016-04-06 | Double-track voice break-in analysis method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN105895116A true CN105895116A (en) | 2016-08-24 |
CN105895116B CN105895116B (en) | 2020-01-03 |
Family
ID=57012984
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610209686.4A Active CN105895116B (en) | 2016-04-06 | 2016-04-06 | Double-track voice break-in analysis method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN105895116B (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109600526A (en) * | 2019-01-08 | 2019-04-09 | 上海上湖信息技术有限公司 | Customer service quality determining method and device, readable storage medium storing program for executing |
CN111147669A (en) * | 2019-12-30 | 2020-05-12 | 科讯嘉联信息技术有限公司 | Full real-time automatic service quality inspection system and method |
CN112511698A (en) * | 2020-12-03 | 2021-03-16 | 普强时代(珠海横琴)信息技术有限公司 | Real-time call analysis method based on universal boundary detection |
CN113066496A (en) * | 2021-03-17 | 2021-07-02 | 浙江百应科技有限公司 | Method for analyzing call robbing of two conversation parties in audio |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2001265368A (en) * | 2000-03-17 | 2001-09-28 | Omron Corp | Voice recognition device and recognized object detecting method |
CN102522081A (en) * | 2011-12-29 | 2012-06-27 | 北京百度网讯科技有限公司 | Method for detecting speech endpoints and system |
CN103811009A (en) * | 2014-03-13 | 2014-05-21 | 华东理工大学 | Smart phone customer service system based on speech analysis |
CN104052610A (en) * | 2014-05-19 | 2014-09-17 | 国家电网公司 | Informatization intelligent conference dispatching management device and using method |
WO2015001492A1 (en) * | 2013-07-02 | 2015-01-08 | Family Systems, Limited | Systems and methods for improving audio conferencing services |
US20150100316A1 (en) * | 2011-09-01 | 2015-04-09 | At&T Intellectual Property I, L.P. | System and method for advanced turn-taking for interactive spoken dialog systems |
US20150255087A1 (en) * | 2014-03-07 | 2015-09-10 | Fujitsu Limited | Voice processing device, voice processing method, and computer-readable recording medium storing voice processing program |
Also Published As
Publication number | Publication date |
---|---|
CN105895116B (en) | 2020-01-03 |
Legal Events
Date | Code | Title | Description
---|---|---|---
| C06 | Publication | |
| PB01 | Publication | |
| C10 | Entry into substantive examination | |
| SE01 | Entry into force of request for substantive examination | |
| GR01 | Patent grant | |
2020-03-09 | TR01 | Transfer of patent right | Patentee after: Puqiang Times (Zhuhai Hengqin) Information Technology Co., Ltd., Room 105-58115, No. 6 Baohua Road, Hengqin New District, Zhuhai, Guangdong 519000 (centralized office area). Patentee before: Puqiang Information Technology (Beijing) Co., Ltd., 4/F Cloud Base, Tower C, Software Park Plaza, Building 4, No. 8 Dongbeiwang West Road, Haidian District, Beijing 100085.