CN105895116A - Dual track voice break-in and interruption analysis method - Google Patents


Info

Publication number
CN105895116A
Authority
CN
China
Prior art keywords
endpoint
voice
time
interruption
analysis method
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201610209686.4A
Other languages
Chinese (zh)
Other versions
CN105895116B (en)
Inventor
刘郁松
何国涛
李全忠
蒲瑶
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Puqiang times (Zhuhai Hengqin) Information Technology Co., Ltd
Original Assignee
Universal Information Technology (beijing) Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Universal Information Technology (beijing) Co Ltd filed Critical Universal Information Technology (beijing) Co Ltd
Priority to CN201610209686.4A priority Critical patent/CN105895116B/en
Publication of CN105895116A publication Critical patent/CN105895116A/en
Application granted granted Critical
Publication of CN105895116B publication Critical patent/CN105895116B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/78 Detection of presence or absence of voice signals
    • G10L25/87 Detection of discrete points within a voice signal
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/48 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Telephonic Communication Services (AREA)

Abstract

The present invention discloses a dual-track voice break-in and interruption analysis method. The method comprises the steps of: performing effective-voice endpoint detection on the recording streams of the two channels using voice activity detection, finding every span of speech within the whole recording; according to the effective-voice endpoints of the two channel recordings, normalizing the endpoint times of each segment and laying all endpoints out on a time axis, with every endpoint described uniformly by three attributes (time point, channel, and endpoint type); and traversing all time points from front to back, analyzing whether each endpoint is a begin-position endpoint or an end-position endpoint. The method can promptly capture break-ins and interruptions whenever they occur between two or more roles and perform subsequent processing, avoiding these impolite conversational patterns and providing a high-quality guarantee for customer service.

Description

A dual-track voice break-in and interruption analysis method
Technical field
The invention belongs to the technical field of customer-service communication, and in particular relates to a dual-track voice break-in and interruption analysis method.
Background technology
Voice customer service refers to customer service conducted mainly by telephone. During a service call, break-ins and interruptions often occur between two or more roles. A break-in occurs between two roles when one role has just finished speaking and the other starts immediately, with no interval in between; in conversation this is an impolite manner that the other party may perceive as rude or perfunctory. An interruption occurs when one role is still speaking and the other cuts in directly to state their own opinion, which is an even less polite conversational manner. Break-ins and interruptions seriously affect the quality of customer service.
Summary of the invention
The object of the present invention is to provide a dual-track voice break-in and interruption analysis method, intended to solve the problem of break-ins and interruptions occurring during customer service.
The present invention is achieved as follows: the break-in and interruption analysis method for dual-track voice comprises the following steps:
Step 1: Using voice activity detection, perform effective-voice endpoint detection on the recording streams of the two channels, finding from which second to which second speech occurs in the whole recording;
Step 2: According to the effective-voice endpoints of the two channel recordings, normalize the endpoint times of each segment, describe every endpoint uniformly by three attributes (time point, channel, and endpoint type), and lay all endpoints out on a time axis;
Step 3: For two adjacent endpoints, if the earlier endpoint is a begin endpoint of role A's speech and the later endpoint is an end endpoint of role B's speech, this is an interruption.
Step 4: For two adjacent endpoints, if the earlier endpoint is an end endpoint of role A's speech, the later endpoint is a begin endpoint of role B's speech, and the time difference between the two endpoints is less than 200 ms, this is a break-in.
The present invention also adopts the following technical measures:
The effective-voice endpoints in Step 1 carry three attributes: start time, end time, and speaker.
In Step 2, the endpoint types are begin and end.
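The three endpoint attributes (time point, channel, endpoint type) map naturally onto a small data structure. Below is a minimal Python sketch of the flattening in Step 2, assuming the per-channel speech segments have already been produced by some VAD front end; the function name and tuple layout are illustrative assumptions, not the patented implementation.

```python
# Sketch of Step 2: turn per-channel VAD segments into one time-ordered
# endpoint list. Segments are (start_ms, end_ms) pairs; every endpoint
# carries the three attributes: time point, channel, endpoint type.
def flatten_endpoints(segments_a, segments_b):
    points = []
    for channel, segments in (("A", segments_a), ("B", segments_b)):
        for start_ms, end_ms in segments:
            points.append((start_ms, channel, "begin"))
            points.append((end_ms, channel, "end"))
    # Sort by time; at equal times put "end" before "begin" so a
    # zero-gap hand-over still reads as end-then-begin.
    points.sort(key=lambda p: (p[0], p[2] == "begin"))
    return points
```

With role A speaking over 0-1000 ms and role B over 1100-2000 ms, this yields the four endpoints in strict time order, ready for the traversal described next.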
The analysis of endpoint types comprises the following steps:
Step 1: Check the endpoint type;
Step 2: If it is a begin endpoint, check whether the top of the stack holds a begin endpoint;
Step 3: If the top of the stack holds a begin endpoint, check whether its role is the same as the role of the current begin endpoint;
Step 4: If the roles are the same, the data is corrupt: one person cannot start speaking again without having finished;
Step 5: If the roles are different, an interruption has occurred; record the interruption and pop the top of the stack;
Step 6: If the top of the stack holds no begin endpoint, push the begin endpoint onto the stack, advance the endpoint index by 1, and continue the loop;
Step 7: If it is an end endpoint, check whether the top of the stack holds a begin endpoint;
Step 8: If the top of the stack holds a begin endpoint, check whether its role is the same as the role of the current end endpoint;
Step 9: If the roles are the same, this is a normal endpoint with no interruption; record the time point of this end position;
Step 10: If the roles are different, the data is erroneous: an interruption occurred earlier but was not recorded;
Step 11: If the top of the stack holds no begin endpoint, check whether the end position of the previous endpoint is within 200 ms of the start position; if so, it is a break-in; record the time at which the break-in occurred and pop the top of the stack;
Step 12: Record all break-in and interruption information, where each break-in or interruption event comprises a start time, an end time, a type, and a direction.
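The twelve steps above can be sketched as a single sweep over the sorted endpoint list. The patent's step description is ambiguous in places (for instance, what the stack holds when the 200 ms test of Step 11 fires), so the following is one interpretation under stated assumptions, not the patented implementation; the function name, tuple layout, and event shape are invented for illustration, and for brevity each event records only the moment it begins.

```python
# One interpretation of the Steps 1-12 sweep. Endpoints are
# (time_ms, channel, kind) tuples, already sorted in time,
# with kind either "begin" or "end".
def analyze(points, threshold_ms=200):
    events = []
    open_begins = []   # stack of channels whose "begin" is still open
    last_end = None    # (time_ms, channel) of the latest "end" endpoint
    for time_ms, channel, kind in points:
        if kind == "begin":
            if open_begins and open_begins[-1] == channel:
                # Same role begins twice without ending: corrupt data (Step 4).
                events.append(("data_error", time_ms, channel))
            elif open_begins:
                # The other role is still mid-speech: interruption (Step 5).
                events.append(("interruption", time_ms, channel))
            elif (last_end is not None and last_end[1] != channel
                  and time_ms - last_end[0] < threshold_ms):
                # Hand-over gap under the threshold: break-in (Step 11).
                events.append(("break_in", time_ms, channel))
            open_begins.append(channel)
        else:  # "end"
            if channel in open_begins:
                open_begins.remove(channel)   # normal close (Step 9)
            last_end = (time_ms, channel)
    return events
```

On two adjacent segments with a 100 ms hand-over gap this reports a break-in; on overlapped segments it reports an interruption; gaps of 200 ms or more produce no event.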
Advantages and positive effects of the present invention: this dual-track voice break-in and interruption analysis method can promptly capture break-ins and interruptions whenever they occur between two or more roles and perform subsequent processing, avoiding the impolite conversational patterns of breaking in and interrupting and providing a high-quality guarantee for customer service.
Brief description of the drawings
Fig. 1 is a flow diagram of the dual-track voice break-in and interruption analysis method provided by an embodiment of the present invention;
Fig. 2 is a flow diagram of the endpoint-type analysis method provided by an embodiment of the present invention.
Detailed description of the invention
To make the purpose, technical scheme, and advantages of the present invention clearer, the present invention is further elaborated below in conjunction with embodiments. It should be understood that the specific embodiments described herein serve only to explain the present invention and are not intended to limit it.
The application principle of the present invention is further described below in conjunction with Figs. 1 and 2 and a specific embodiment.
The break-in and interruption analysis method for dual-track voice comprises the following steps:
S101: Using voice activity detection, perform effective-voice endpoint detection on the recording streams of the two channels, finding from which second to which second speech occurs in the whole recording;
S102: According to the effective-voice endpoints of the two channel recordings, normalize the endpoint times of each segment, describe every endpoint uniformly by three attributes (time point, channel, and endpoint type), and lay all endpoints out on the time axis;
S103: For two adjacent endpoints, if the earlier endpoint is a begin endpoint of role A's speech and the later endpoint is an end endpoint of role B's speech, this is an interruption.
S104: For two adjacent endpoints, if the earlier endpoint is an end endpoint of role A's speech, the later endpoint is a begin endpoint of role B's speech, and the time difference between the two endpoints is less than 200 ms, this is a break-in.
The effective-voice endpoints in S101 carry three attributes: start time, end time, and speaker.
In S102, the endpoint types are begin and end.
The analysis of endpoint types comprises the following steps:
S201: Check the endpoint type;
S202: If it is a begin endpoint, check whether the top of the stack holds a begin endpoint;
S203: If the top of the stack holds a begin endpoint, check whether its role is the same as the role of the current begin endpoint;
S204: If the roles are the same, the data is corrupt: one person cannot start speaking again without having finished;
S205: If the roles are different, an interruption has occurred; record the interruption and pop the top of the stack;
S206: If the top of the stack holds no begin endpoint, push the begin endpoint onto the stack, advance the endpoint index by 1, and continue the loop;
S207: If it is an end endpoint, check whether the top of the stack holds a begin endpoint;
S208: If the top of the stack holds a begin endpoint, check whether its role is the same as the role of the current end endpoint;
S209: If the roles are the same, this is a normal endpoint with no interruption; record the time point of this end position;
S210: If the roles are different, the data is erroneous: an interruption occurred earlier but was not recorded;
S211: If the top of the stack holds no begin endpoint, check whether the end position of the previous endpoint is within 200 ms of the start position; if so, it is a break-in; record the time at which the break-in occurred and pop the top of the stack;
S212: Record all break-in and interruption information, where each break-in or interruption event comprises a start time, an end time, a type (break-in or interruption), and a direction (who broke in on, or interrupted, whom).
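One possible shape for a single record produced by S212; the patent names only the four attributes, so the field names and values below are illustrative assumptions for how such a record might be stored.

```python
# Illustrative shape of one recorded event from S212. Field names are
# assumptions; the patent lists only the four attributes themselves.
event = {
    "start_ms": 1100,        # when the break-in/interruption begins
    "end_ms": 1700,          # when it ends
    "type": "break_in",      # "break_in" or "interruption"
    "direction": "B->A",     # who broke in on, or interrupted, whom
}
```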
This dual-track voice break-in and interruption analysis method can promptly capture break-ins and interruptions whenever they occur between two or more roles and perform subsequent processing, avoiding these impolite conversational patterns and providing a high-quality guarantee for customer service.
The foregoing is only a preferred embodiment of the present invention and is not intended to limit the present invention; any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention shall fall within the scope of protection of the present invention.

Claims (4)

1. A dual-track voice break-in and interruption analysis method, characterized in that the break-in and interruption analysis method for dual-track voice comprises the following steps:
Step 1: Using voice activity detection, perform effective-voice endpoint detection on the recording streams of the two channels, finding from which second to which second speech occurs in the whole recording;
Step 2: According to the effective-voice endpoints of the two channel recordings, normalize the endpoint times of each segment, describe every endpoint uniformly by three attributes (time point, channel, and endpoint type), and lay all endpoints out on a time axis;
Step 3: For two adjacent endpoints, if the earlier endpoint is a begin endpoint of role A's speech and the later endpoint is an end endpoint of role B's speech, this is an interruption.
Step 4: For two adjacent endpoints, if the earlier endpoint is an end endpoint of role A's speech, the later endpoint is a begin endpoint of role B's speech, and the time difference between the two endpoints is less than 200 ms, this is a break-in.
2. The dual-track voice break-in and interruption analysis method of claim 1, characterized in that the effective-voice endpoints in Step 1 carry three attributes: start time, end time, and speaker.
3. The dual-track voice break-in and interruption analysis method of claim 1, characterized in that in Step 2, the endpoint types are begin and end.
4. The dual-track voice break-in and interruption analysis method of claim 1, characterized in that the analysis of endpoint types comprises the following steps:
Step 1: Check the endpoint type;
Step 2: If it is a begin endpoint, check whether the top of the stack holds a begin endpoint;
Step 3: If the top of the stack holds a begin endpoint, check whether its role is the same as the role of the current begin endpoint;
Step 4: If the roles are the same, the data is corrupt: one person cannot start speaking again without having finished;
Step 5: If the roles are different, an interruption has occurred; record the interruption and pop the top of the stack;
Step 6: If the top of the stack holds no begin endpoint, push the begin endpoint onto the stack, advance the endpoint index by 1, and continue the loop;
Step 7: If it is an end endpoint, check whether the top of the stack holds a begin endpoint;
Step 8: If the top of the stack holds a begin endpoint, check whether its role is the same as the role of the current end endpoint;
Step 9: If the roles are the same, this is a normal endpoint with no interruption; record the time point of this end position;
Step 10: If the roles are different, the data is erroneous: an interruption occurred earlier but was not recorded;
Step 11: If the top of the stack holds no begin endpoint, check whether the end position of the previous endpoint is within 200 ms of the start position; if so, it is a break-in; record the time at which the break-in occurred and pop the top of the stack;
Step 12: Record all break-in and interruption information, where each break-in or interruption event comprises a start time, an end time, a type, and a direction.
CN201610209686.4A 2016-04-06 2016-04-06 Double-track voice break-in analysis method Active CN105895116B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610209686.4A CN105895116B (en) 2016-04-06 2016-04-06 Double-track voice break-in analysis method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610209686.4A CN105895116B (en) 2016-04-06 2016-04-06 Double-track voice break-in analysis method

Publications (2)

Publication Number Publication Date
CN105895116A true CN105895116A (en) 2016-08-24
CN105895116B CN105895116B (en) 2020-01-03

Family

ID=57012984

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610209686.4A Active CN105895116B (en) 2016-04-06 2016-04-06 Double-track voice break-in analysis method

Country Status (1)

Country Link
CN (1) CN105895116B (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109600526A (en) * 2019-01-08 2019-04-09 上海上湖信息技术有限公司 Customer service quality determining method and device, readable storage medium storing program for executing
CN111147669A (en) * 2019-12-30 2020-05-12 科讯嘉联信息技术有限公司 Full real-time automatic service quality inspection system and method
CN112511698A (en) * 2020-12-03 2021-03-16 普强时代(珠海横琴)信息技术有限公司 Real-time call analysis method based on universal boundary detection
CN113066496A (en) * 2021-03-17 2021-07-02 浙江百应科技有限公司 Method for analyzing call robbing of two conversation parties in audio

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2001265368A (en) * 2000-03-17 2001-09-28 Omron Corp Voice recognition device and recognized object detecting method
CN102522081A (en) * 2011-12-29 2012-06-27 北京百度网讯科技有限公司 Method for detecting speech endpoints and system
CN103811009A (en) * 2014-03-13 2014-05-21 华东理工大学 Smart phone customer service system based on speech analysis
CN104052610A (en) * 2014-05-19 2014-09-17 国家电网公司 Informatization intelligent conference dispatching management device and using method
WO2015001492A1 (en) * 2013-07-02 2015-01-08 Family Systems, Limited Systems and methods for improving audio conferencing services
US20150100316A1 (en) * 2011-09-01 2015-04-09 At&T Intellectual Property I, L.P. System and method for advanced turn-taking for interactive spoken dialog systems
US20150255087A1 (en) * 2014-03-07 2015-09-10 Fujitsu Limited Voice processing device, voice processing method, and computer-readable recording medium storing voice processing program

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2001265368A (en) * 2000-03-17 2001-09-28 Omron Corp Voice recognition device and recognized object detecting method
US20150100316A1 (en) * 2011-09-01 2015-04-09 At&T Intellectual Property I, L.P. System and method for advanced turn-taking for interactive spoken dialog systems
CN102522081A (en) * 2011-12-29 2012-06-27 北京百度网讯科技有限公司 Method for detecting speech endpoints and system
WO2015001492A1 (en) * 2013-07-02 2015-01-08 Family Systems, Limited Systems and methods for improving audio conferencing services
US20150255087A1 (en) * 2014-03-07 2015-09-10 Fujitsu Limited Voice processing device, voice processing method, and computer-readable recording medium storing voice processing program
CN103811009A (en) * 2014-03-13 2014-05-21 华东理工大学 Smart phone customer service system based on speech analysis
CN104052610A (en) * 2014-05-19 2014-09-17 国家电网公司 Informatization intelligent conference dispatching management device and using method

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109600526A (en) * 2019-01-08 2019-04-09 上海上湖信息技术有限公司 Customer service quality determining method and device, readable storage medium storing program for executing
CN111147669A (en) * 2019-12-30 2020-05-12 科讯嘉联信息技术有限公司 Full real-time automatic service quality inspection system and method
CN112511698A (en) * 2020-12-03 2021-03-16 普强时代(珠海横琴)信息技术有限公司 Real-time call analysis method based on universal boundary detection
CN113066496A (en) * 2021-03-17 2021-07-02 浙江百应科技有限公司 Method for analyzing call robbing of two conversation parties in audio

Also Published As

Publication number Publication date
CN105895116B (en) 2020-01-03

Similar Documents

Publication Publication Date Title
US10701482B2 (en) Recording meeting audio via multiple individual smartphones
CN105895116A (en) Dual track voice break-in and interruption analysis method
US9258425B2 (en) Method and system for speaker verification
US10498886B2 (en) Dynamically switching communications to text interactions
US20220303502A1 (en) Leveraging a network of microphones for inferring room location and speaker identity for more accurate transcriptions and semantic context across meetings
KR102349985B1 (en) Detect and suppress voice queries
US11570217B2 (en) Switch controller for separating multiple portions of call
US8588111B1 (en) System and method for passive communication recording
US8086461B2 (en) System and method for tracking persons of interest via voiceprint
US20150154961A1 (en) Methods and apparatus for identifying fraudulent callers
US20100208605A1 (en) Method and device for processing network time delay characteristics
CA3001839C (en) Call detail record analysis to identify fraudulent activity and fraud detection in interactive voice response systems
US20090028310A1 (en) Automatic contextual media recording and processing utilizing speech analytics
US20150310863A1 (en) Method and apparatus for speaker diarization
EP3158719A1 (en) Method and system for filtering undesirable incoming telephone calls
EP3504861B1 (en) Audio transmission with compensation for speech detection period duration
US10652396B2 (en) Stream server that modifies a stream according to detected characteristics
DE602007008602D1 (en) SELECTION OF ACCESS PROCEDURES DURING THE PERFORMANCE OF HANDOVERS IN A MOBILE COMMUNICATION SYSTEM
US20210092223A1 (en) Robocall detection using acoustic profiling
US20150032515A1 (en) Quality Inspection Processing Method and Device
CN101202040A (en) An efficient voice activity detactor to detect fixed power signals
US8437266B2 (en) Flow through call control
CN103002108B (en) Call recording processing method and system and mobile terminal
US9257117B2 (en) Speech analytics with adaptive filtering
WO2018038989A1 (en) Audio compensation techniques for network outages

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20200309

Address after: 519000 room 105-58115, No. 6, Baohua Road, Hengqin New District, Zhuhai City, Guangdong Province (centralized office area)

Patentee after: Puqiang times (Zhuhai Hengqin) Information Technology Co., Ltd

Address before: 100085 cloud base 4 / F, tower C, Software Park Plaza, building 4, No. 8, Dongbei Wangxi Road, Haidian District, Beijing

Patentee before: Puqiang Information Technology (Beijing) Co., Ltd.

TR01 Transfer of patent right