CN115442273B - Voice recognition-based audio transmission integrity monitoring method and device - Google Patents

Info

Publication number
CN115442273B
CN115442273B CN202211117749.5A CN202211117749A CN115442273B CN 115442273 B CN115442273 B CN 115442273B CN 202211117749 A CN202211117749 A CN 202211117749A CN 115442273 B CN115442273 B CN 115442273B
Authority
CN
China
Prior art keywords
audio
audio information
information
data
transmission
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202211117749.5A
Other languages
Chinese (zh)
Other versions
CN115442273A (en)
Inventor
章笑春
彭猛
余怀军
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Rivotek Technology Jiangsu Co Ltd
Original Assignee
Rivotek Technology Jiangsu Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Rivotek Technology Jiangsu Co Ltd
Priority to CN202211117749.5A
Publication of CN115442273A
Application granted
Publication of CN115442273B
Legal status: Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/08 Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters
    • H04L43/0823 Errors, e.g. transmission errors
    • H04L43/0829 Packet loss
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/26 Speech to text systems
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/10 Active monitoring, e.g. heartbeat, ping or trace-route
    • H04L43/106 Active monitoring, e.g. heartbeat, ping or trace-route using time related information in packets, e.g. by adding timestamps
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D30/00 Reducing energy consumption in communication networks
    • Y02D30/70 Reducing energy consumption in communication networks in wireless communication networks

Abstract

The invention provides a voice recognition-based audio transmission integrity monitoring method and device. The method comprises the following steps: an audio acquisition module acquires first audio information and transmits it to a transmission target terminal; the first audio information is converted into timestamped text information, which is uploaded to a monitoring analysis background; the transmission target terminal plays the received audio as second audio information, converts it into timestamped text information, and uploads that text to the monitoring analysis background; and the monitoring analysis background automatically compares the text information bearing the same timestamp and analyzes the integrity of the audio transmission. Because the audio is converted into text and compared automatically, no manual review is needed, which protects user privacy. Because data is reported as timestamps and text, traffic consumption and loss rate are low, and the method is easy to deploy widely. The invention gives audio communication equipment manufacturers and operators an objective way to evaluate audio transmission integrity, and lets operators learn about problems in real user scenarios through data analysis, providing data support for product iteration.

Description

Voice recognition-based audio transmission integrity monitoring method and device
Technical Field
The invention relates to the technical field of audio transmission, in particular to an audio transmission integrity monitoring method and device based on voice recognition.
Background
With the widespread adoption of cloud-based office work and cloud-based teaching, the audio communication experience is increasingly important in work and study. Because audio services carry large data volumes, have strict real-time requirements, and are highly noticeable to users, monitoring audio communication quality is very important for audio communication equipment manufacturers and operators.
The core measure of audio communication quality is whether the audio content reaches the target terminal completely and accurately, that is, the completeness and accuracy of the audio transmission. During network communication, the packet loss rate cannot be detected when the user experiences inaudible, intermittent, or stuttering audio; alternatively, packets are lost during a conversation and the receiving party fills in the missing content on their own, which causes ambiguity. Audio loss and ambiguity caused by packet loss are common in real communication and seriously affect user satisfaction. However, the market currently lacks a method for accurately monitoring the integrity of audio transmission.
Disclosure of Invention
In network communication, the packet loss rate cannot be detected when the user experiences inaudible, intermittent, or stuttering audio, or packets are lost during a conversation and the receiving party fills in the missing content on their own, causing ambiguity. Such audio loss and ambiguity are common in real communication and seriously affect user satisfaction. To address these problems, a voice recognition-based audio transmission integrity monitoring method and device are provided. Input characters are marked with timestamps, and on output the missing content is compensated automatically according to the absent timestamps, which addresses packet loss during a conversation. The audio is converted into text data and compared automatically, so no manual review is needed and user privacy is protected. Data is reported as timestamps and text, so traffic consumption and loss rate are low and the method is easy to deploy widely. Because detection is based on a third link, the link is simpler and detection is more convenient. The invention gives audio communication equipment manufacturers and operators an objective way to evaluate audio transmission integrity, and lets operators learn about problems in real user scenarios through data analysis, providing data support for product iteration.
To achieve this purpose, the invention adopts the following technical scheme:
A voice recognition-based audio transmission integrity monitoring method comprises the following steps:
the audio acquisition module acquires first audio information and transmits the first audio information to a transmission target terminal;
the first audio information is converted into text information as first data, and the first data is uploaded to a monitoring analysis background;
the transmission target terminal takes the received audio information as second audio information and plays it, converts the second audio information into text information as second data, and uploads the second data to the monitoring analysis background;
and the monitoring analysis background automatically compares the first data and the second data with the same timestamp, and analyzes the integrity of the audio transmission.
As a preferred scheme of the present invention, converting the first audio information into text information as first data specifically comprises: cutting the first audio information to obtain first segmented audio information, and transmitting the first segmented audio information to a first voice recognition module of the input source, where the first voice recognition module recognizes the first segmented audio information and converts it into text information in which each character carries a timestamp, as the first data.
As a preferred scheme of the present invention, the conversion of the second audio information into text information as second data by the transmission target terminal specifically comprises: cutting the second audio information to obtain second segmented audio information, and transmitting the second segmented audio information to a second voice recognition module of the transmission target terminal, where the second voice recognition module recognizes the second segmented audio information and converts it into text information in which each character carries a timestamp, as the second data.
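The cutting and timestamping described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the fixed segment length, the `recognize` callback (a stand-in for a real speech recognition engine), and the `TimedChar` record layout are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass(frozen=True)
class TimedChar:
    """One recognized character and the time (ms) assigned to it."""
    timestamp_ms: int
    char: str


def cut_audio(samples: Sequence[int], sample_rate: int,
              segment_ms: int) -> List[Tuple[int, Sequence[int]]]:
    """Cut audio into fixed-length segments; return (start_ms, samples) pairs."""
    per_segment = sample_rate * segment_ms // 1000
    if per_segment <= 0:
        raise ValueError("segment too short for this sample rate")
    return [
        (start * 1000 // sample_rate, samples[start:start + per_segment])
        for start in range(0, len(samples), per_segment)
    ]


# Hypothetical recognizer signature: takes one segment and returns
# (offset_ms_within_segment, character) pairs.
Recognizer = Callable[[Sequence[int]], List[Tuple[int, str]]]


def to_timed_text(samples: Sequence[int], sample_rate: int, segment_ms: int,
                  start_ms: int, recognize: Recognizer) -> List[TimedChar]:
    """Recognize each segment and stamp every character with an absolute time."""
    out: List[TimedChar] = []
    for seg_start_ms, segment in cut_audio(samples, sample_rate, segment_ms):
        for offset_ms, ch in recognize(segment):
            out.append(TimedChar(start_ms + seg_start_ms + offset_ms, ch))
    return out
```

The same function could serve both ends: the input source would apply it to the first audio information and the transmission target terminal to the second, each reading `start_ms` from its own real-time clock.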
As a preferred scheme of the present invention, the automatic comparison of the first data and the second data with the same timestamp by the monitoring analysis background specifically comprises: the monitoring analysis background automatically compares the text content bearing the same timestamp to obtain the amount of lost or erroneous text, and calculates the audio distortion rate using the formula: distortion rate = amount of lost or erroneous text / total amount of text transmitted.
As a preferred aspect of the present invention, the lower the audio distortion rate, the higher the audio transmission integrity.
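For reference, the formula can be written symbolically. The symbols are introduced here only for illustration and do not appear in the patent: $N_{\text{err}}$ is the amount of lost or erroneous text and $N_{\text{total}}$ is the total amount of text transmitted. The numbers in the worked line are a made-up example, not measured data.

```latex
D = \frac{N_{\text{err}}}{N_{\text{total}}}
\qquad\text{e.g.}\qquad
N_{\text{total}} = 100,\; N_{\text{err}} = 3 + 2 = 5
\;\Rightarrow\; D = \frac{5}{100} = 5\%
```

In this hypothetical case, 3 characters were lost and 2 were recognized differently at the receiving end, giving a distortion rate of 5%.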
As a preferred scheme of the present invention, a voice recognition-based audio transmission integrity monitoring device comprises an input source, a transmission target terminal, and a monitoring analysis background; the input source is in communication connection with the transmission target terminal;
the input source comprises a first real-time clock and a first controller, wherein the first real-time clock generates a time synchronization signal, and the first real-time clock is electrically connected with the first controller;
the transmission target terminal comprises a second real-time clock and a second controller, wherein the second real-time clock is used for generating a time synchronization signal and is electrically connected with the second controller;
the first real-time clock is synchronized with the second real-time clock.
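Comparing records "with the same timestamp" presupposes that both ends stamp text on a common time base, which is what the synchronized real-time clocks provide. The sketch below shows one way such alignment might be handled in software. Both the residual clock offset and the bucket width are assumptions added for illustration; the patent does not specify either.

```python
def to_shared_time(local_ms: int, offset_ms: int) -> int:
    """Map a local clock reading onto the shared time base.

    offset_ms is the (assumed known) amount by which the local clock
    runs ahead of the shared reference.
    """
    return local_ms - offset_ms


def bucket(timestamp_ms: int, width_ms: int) -> int:
    """Quantize a timestamp so readings inside one bucket compare equal."""
    if width_ms <= 0:
        raise ValueError("bucket width must be positive")
    return timestamp_ms // width_ms
```

With a 100 ms bucket, for example, readings of 1,000,000 ms and 1,000,099 ms fall in the same bucket and would be treated as the same timestamp, while 1,000,100 ms would not.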
As a preferred scheme of the present invention, the first controller includes:
an audio acquisition module, used by the input source to acquire the first audio information;
a first audio cutting module, used for cutting the first audio information to obtain first segmented audio information;
a first audio transmission module, used for transmitting the first segmented audio information to the first voice recognition module;
a first voice recognition module, used for recognizing the first segmented audio information and converting it into text information in which each character carries a timestamp, as first data;
and a first communication module, used for uploading the first data to the monitoring analysis background.
As a preferred scheme of the present invention, the second controller includes:
an audio receiving module, used for receiving the audio information as second audio information and playing the second audio information;
a second audio cutting module, used for cutting the second audio information to obtain second segmented audio information;
a second audio transmission module, used for transmitting the second segmented audio information to the second voice recognition module;
a second voice recognition module, used for recognizing the second segmented audio information and converting it into text information in which each character carries a timestamp, as second data;
and a second communication module, used for uploading the second data to the monitoring analysis background.
As a preferred scheme of the present invention, the monitoring analysis background is configured to automatically compare the first data and the second data with the same timestamp, which specifically comprises: the monitoring analysis background automatically compares the text content bearing the same timestamp to obtain the amount of lost or erroneous text, and calculates the audio distortion rate using the formula: distortion rate = amount of lost or erroneous text / total amount of text transmitted; the lower the audio distortion rate, the higher the audio transmission integrity.
As a preferred scheme of the present invention, the input source and the transmission target terminal are each one of a PC, a mobile phone, a tablet (PAD), a vehicle-mounted center console, or a smart speaker.
The beneficial effects of the invention are as follows: the audio is converted into text data and compared automatically, so no manual review is needed and user privacy is protected; data is reported as timestamps and text, so traffic consumption and loss rate are low and the method is easy to deploy widely; detection is based on a third link, so the link is simpler and detection is more convenient; the invention gives audio communication equipment manufacturers and operators an objective way to evaluate audio transmission integrity, and lets operators learn about problems in real user scenarios through data analysis, providing data support for product iteration.
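To give a rough sense of the traffic claim, the sketch below compares one second of uncompressed audio with a timestamp-and-text report covering a similar span. The audio format (16 kHz, 16-bit, mono), the four-character sample, and the compact JSON layout are illustrative assumptions, not values from the patent; real audio links usually use compressed codecs, so the gap in practice would be smaller than shown here.

```python
import json
from typing import List


def pcm_bytes_per_second(sample_rate_hz: int, bits_per_sample: int,
                         channels: int) -> int:
    """Bytes needed for one second of uncompressed PCM audio."""
    return sample_rate_hz * bits_per_sample // 8 * channels


def report_bytes(records: List[list]) -> int:
    """Size in bytes of a compact JSON report of [timestamp_ms, character] pairs."""
    body = json.dumps(records, separators=(",", ":"), ensure_ascii=False)
    return len(body.encode("utf-8"))


# One second of assumed 16 kHz, 16-bit, mono audio.
audio = pcm_bytes_per_second(16_000, 16, 1)

# Four characters with hypothetical millisecond timestamps, 250 ms apart.
records = [[1_663_113_600_000 + i * 250, ch] for i, ch in enumerate("test")]
text = report_bytes(records)

print(f"audio: {audio} bytes, report: {text} bytes")
```

Under these assumptions the report is hundreds of times smaller than the raw audio, which is the intuition behind reporting text rather than audio to the monitoring analysis background.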
Drawings
The invention is described in detail below with reference to the drawings and the detailed description.
Fig. 1 is a flowchart of an audio transmission integrity monitoring method based on speech recognition according to an embodiment of the present invention;
Fig. 2 is a block diagram of an audio transmission integrity monitoring apparatus based on speech recognition according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the drawings of the embodiments of the present invention. It should be apparent that the described embodiments are only some of the embodiments of the present invention, and not all of them. All other embodiments, which can be derived by a person skilled in the art from the described embodiments of the invention, are within the scope of the invention.
As shown in Fig. 1, an embodiment of the present invention provides an audio transmission integrity monitoring method based on speech recognition, including the following steps:
Step 1: the audio acquisition module acquires first audio information and transmits the first audio information to a transmission target terminal;
Step 2: the first audio information is converted into text information as first data, and the first data is uploaded to a monitoring analysis background;
Specifically, the first audio information is cut to obtain first segmented audio information, which is transmitted to a first voice recognition module of the input source; the first voice recognition module recognizes the first segmented audio information and converts it into text information in which each character carries a timestamp, as the first data.
Step 3: the transmission target terminal takes the received audio information as second audio information and plays it, converts the second audio information into text information as second data, and uploads the second data to the monitoring analysis background;
Specifically, the second audio information is cut to obtain second segmented audio information, which is transmitted to a second voice recognition module of the transmission target terminal; the second voice recognition module recognizes the second segmented audio information and converts it into text information in which each character carries a timestamp, as the second data.
Step 4: the monitoring analysis background automatically compares the first data and the second data with the same timestamp, and analyzes the integrity of the audio transmission. The monitoring analysis background automatically compares the text content bearing the same timestamp to obtain the amount of lost or erroneous text, and calculates the audio distortion rate using the formula: distortion rate = amount of lost or erroneous text / total amount of text transmitted. The lower the audio distortion rate, the higher the audio transmission integrity.
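The comparison in step 4 can be sketched as follows. This is one possible reading, with assumptions stated plainly: each data set is modeled as a mapping from timestamp to character, a character is "lost" when its timestamp is absent from the second data, "erroneous" when the character at that timestamp differs, and the total amount of text transmitted is taken to be the number of characters in the first data. The function name and return shape are hypothetical.

```python
from typing import Dict, Tuple


def distortion_rate(first: Dict[int, str],
                    second: Dict[int, str]) -> Tuple[int, int, float]:
    """Compare sender text (first data) with receiver text (second data).

    Returns (lost, wrong, rate), where rate = (lost + wrong) / len(first).
    """
    if not first:
        raise ValueError("first data is empty; distortion rate is undefined")
    # Timestamps present at the sender but absent at the receiver.
    lost = sum(1 for ts in first if ts not in second)
    # Timestamps present at both ends whose characters differ.
    wrong = sum(1 for ts, ch in first.items()
                if ts in second and second[ts] != ch)
    return lost, wrong, (lost + wrong) / len(first)


first_data = {0: "h", 100: "e", 200: "l", 300: "l", 400: "o"}
second_data = {0: "h", 100: "a", 300: "l", 400: "o"}
lost, wrong, rate = distortion_rate(first_data, second_data)
```

In this made-up example the character at 200 ms never arrived and the character at 100 ms was recognized differently, so two of five characters count against the transmission.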
As shown in Fig. 2, another embodiment of the present invention provides an audio transmission integrity monitoring apparatus based on voice recognition, which includes an input source, a transmission target terminal, and a monitoring analysis background. The input source is in communication connection with the transmission target terminal. The input source comprises a first real-time clock and a first controller, wherein the first real-time clock generates a time synchronization signal and is electrically connected with the first controller. The transmission target terminal comprises a second real-time clock and a second controller, wherein the second real-time clock generates a time synchronization signal and is electrically connected with the second controller. The first real-time clock is synchronized with the second real-time clock. The input source and the transmission target terminal are each one of a PC, a mobile phone, a tablet (PAD), a vehicle-mounted center console, or a smart speaker.
The first controller includes:
an audio acquisition module, used by the input source to acquire the first audio information;
a first audio cutting module, used for cutting the first audio information to obtain first segmented audio information;
a first audio transmission module, used for transmitting the first segmented audio information to the first voice recognition module;
a first voice recognition module, used for recognizing the first segmented audio information and converting it into text information in which each character carries a timestamp, as first data;
and a first communication module, used for uploading the first data to the monitoring analysis background.
The second controller includes:
an audio receiving module, used for receiving the audio information as second audio information and playing the second audio information;
a second audio cutting module, used for cutting the second audio information to obtain second segmented audio information;
a second audio transmission module, used for transmitting the second segmented audio information to the second voice recognition module;
a second voice recognition module, used for recognizing the second segmented audio information and converting it into text information in which each character carries a timestamp, as second data;
and a second communication module, used for uploading the second data to the monitoring analysis background.
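Because the two controllers mirror each other module for module, their wiring can be sketched with one small class. This is only an architectural illustration: the `Controller` class, its field names, and the callable signatures are hypothetical, and the audio acquisition or receiving module is assumed to have already produced the raw `audio` bytes.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Segment = Tuple[int, bytes]   # (start_ms, raw audio for that segment)
Record = Tuple[int, str]      # (timestamp_ms, recognized character)


@dataclass
class Controller:
    """Chains the cutting, recognition, and communication modules."""
    cut: Callable[[bytes], List[Segment]]         # audio cutting module
    recognize: Callable[[Segment], List[Record]]  # voice recognition module
    upload: Callable[[List[Record]], None]        # communication module

    def process(self, audio: bytes) -> List[Record]:
        records: List[Record] = []
        for segment in self.cut(audio):
            # Handing each segment to the recognizer plays the role of
            # the audio transmission module.
            records.extend(self.recognize(segment))
        self.upload(records)
        return records
```

An input source and a transmission target terminal would each hold one such controller, differing only in where the audio comes from and in which data set (first or second) they upload.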
The monitoring analysis background is used for automatically comparing the first data and the second data with the same timestamp: it automatically compares the text content bearing the same timestamp to obtain the amount of lost or erroneous text, and calculates the audio distortion rate using the formula: distortion rate = amount of lost or erroneous text / total amount of text transmitted. The lower the audio distortion rate, the higher the audio transmission integrity.
Furthermore, the transmission target terminal can automatically compensate its output according to the missing timestamps, which addresses packet loss during a conversation.
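The patent does not describe how this compensation is carried out. Purely as an illustration, the sketch below assumes the terminal has access to a reference copy of the timestamped text (for example, the first data) and uses it to fill positions whose timestamps are missing from what was received. The function names and the availability of such a reference are assumptions.

```python
from typing import Dict, Iterable, List


def missing_timestamps(expected: Iterable[int],
                       received: Dict[int, str]) -> List[int]:
    """Timestamps that should carry a character but have none at the receiver."""
    return [ts for ts in expected if ts not in received]


def compensate(reference: Dict[int, str], received: Dict[int, str]) -> str:
    """Rebuild the text in timestamp order, filling gaps from the reference.

    Characters that did arrive are kept as received; only absent
    timestamps are filled in.
    """
    merged = {**reference, **received}
    return "".join(merged[ts] for ts in sorted(merged))
```

Note that this fills only gaps: a character that arrived but was recognized incorrectly is left unchanged, since its timestamp is not missing.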
In conclusion, the audio is converted into text data and compared automatically, so no manual review is needed and user privacy is protected; data is reported as timestamps and text, so traffic consumption and loss rate are low and the method is easy to deploy widely; detection is based on a third link, so the link is simpler and detection is more convenient; the invention gives audio communication equipment manufacturers and operators an objective way to evaluate audio transmission integrity, and lets operators learn about problems in real user scenarios through data analysis, providing data support for product iteration.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, and not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (5)

1. A voice recognition-based audio transmission integrity monitoring method is characterized by comprising the following steps:
the audio acquisition module acquires first audio information and transmits the first audio information to a transmission target terminal;
converting the first audio information into character information with time stamps of each character as first data, and uploading the first data to a monitoring analysis background;
the transmission target terminal takes the received audio information as second audio information and plays the second audio information, the second audio information is converted into character information with time stamps and serves as second data, and the second data are uploaded to a monitoring analysis background;
the monitoring analysis background automatically compares the first data and the second data with the same timestamp, and analyzes the integrity of audio transmission;
the monitoring analysis background automatically compares the first data and the second data with the same timestamp, and specifically comprises the following steps: the monitoring analysis background automatically compares the text contents with the same timestamp to obtain the lost or wrong text amount, and the formula is utilized: the distortion rate = the lost or wrong text amount/total text transmission amount, and the audio distortion rate is calculated; the lower the audio distortion rate, the higher the audio transmission integrity.
2. The audio transmission integrity monitoring method based on speech recognition according to claim 1, wherein the converting of the first audio information into text information with a timestamp for each character as the first data specifically comprises: cutting the first audio information to obtain first segmented audio information, and transmitting the first segmented audio information to a first voice recognition module of an input source, wherein the first voice recognition module recognizes the first segmented audio information and converts the first segmented audio information into text information with a timestamp on each character as the first data.
3. The audio transmission integrity monitoring method based on speech recognition according to claim 1, wherein the transmission target terminal converts the second audio information into text information with a timestamp for each character as the second data, specifically comprising: cutting the second audio information to obtain second segmented audio information, and transmitting the second segmented audio information to a second voice recognition module of the transmission target terminal, wherein the second voice recognition module recognizes the second segmented audio information and converts the second segmented audio information into text information with a timestamp on each character as the second data.
4. The device for monitoring the integrity of audio transmission based on voice recognition is characterized by comprising an input source, a transmission target terminal and a monitoring analysis background; the input source is in communication connection with the transmission target terminal;
the input source comprises a first real-time clock and a first controller, wherein the first real-time clock generates a time synchronization signal, and the first real-time clock is electrically connected with the first controller;
the transmission target terminal comprises a second real-time clock and a second controller, wherein the second real-time clock generates a time synchronization signal, and the second real-time clock is electrically connected with the second controller;
the first real-time clock is synchronous with the second real-time clock;
the first controller includes:
the audio acquisition module is used for acquiring first audio information by an input source;
the first audio cutting module is used for cutting the audio information to obtain first segmented audio information;
the first audio transmission module is used for transmitting the first segmented audio information to the voice recognition module;
the first voice recognition module is used for recognizing the first segmented audio information and converting the first segmented audio information into character information with time stamps of each character as first data;
the first communication module is used for uploading the first data to a monitoring analysis background;
the second controller includes:
the audio receiving module is used for receiving the audio information as second audio information and playing the second audio information;
the second audio cutting module is used for cutting the second audio information to obtain second segmented audio information;
the second audio transmission module is used for transmitting the second segmented audio information to the voice recognition module;
the second voice recognition module is used for recognizing the second segmented audio information and converting the second segmented audio information into character information with time stamps of each character as second data;
the second communication module is used for uploading the second data to a monitoring analysis background;
the monitoring analysis background is used for automatically comparing first data and second data with the same timestamp, and specifically comprises the following steps: the monitoring analysis background automatically compares the text contents with the same timestamp to obtain the lost or wrong text amount, and the formula is utilized: the distortion rate = the lost or wrong text amount/total text transmission amount, and the audio distortion rate is calculated; the lower the audio distortion rate, the higher the audio transmission integrity.
5. The audio transmission integrity monitoring device based on speech recognition as claimed in claim 4, wherein the input source and the transmission target terminal are each one of a PC, a mobile phone, a tablet (PAD), a vehicle-mounted center console, or a smart speaker.
CN202211117749.5A 2022-09-14 2022-09-14 Voice recognition-based audio transmission integrity monitoring method and device Active CN115442273B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211117749.5A CN115442273B (en) 2022-09-14 2022-09-14 Voice recognition-based audio transmission integrity monitoring method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211117749.5A CN115442273B (en) 2022-09-14 2022-09-14 Voice recognition-based audio transmission integrity monitoring method and device

Publications (2)

Publication Number Publication Date
CN115442273A CN115442273A (en) 2022-12-06
CN115442273B CN115442273B (en) 2023-04-07

Family

ID=84247399

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211117749.5A Active CN115442273B (en) 2022-09-14 2022-09-14 Voice recognition-based audio transmission integrity monitoring method and device

Country Status (1)

Country Link
CN (1) CN115442273B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2017128991A1 (en) * 2016-01-26 2017-08-03 阿里巴巴集团控股有限公司 Instant communication method and instant communication system based on voice recognition
CN212211298U (en) * 2020-06-09 2020-12-22 苏州车萝卜汽车电子科技有限公司 Audio synchronous transmission device with time stamp

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108074570A (en) * 2017-12-26 2018-05-25 安徽声讯信息技术有限公司 Surface trimming, transmission, the audio recognition method preserved
CN110880316A (en) * 2019-10-16 2020-03-13 苏宁云计算有限公司 Audio output method and system
CN112270919B (en) * 2020-09-14 2022-11-22 深圳随锐视听科技有限公司 Method, system, storage medium and electronic device for automatically complementing sound of video conference

Also Published As

Publication number Publication date
CN115442273A (en) 2022-12-06

Similar Documents

Publication Publication Date Title
CN103632680B (en) A kind of speech quality assessment method, network element and system
US6700953B1 (en) System, apparatus, method and article of manufacture for evaluating the quality of a transmission channel using voice recognition technology
JPH0654364A (en) Apparatus for comparison of subjective dialogue in mobile telephone system
CN102158881B (en) Method and device for completely evaluating 3G visual telephone quality
CN101742548B (en) H.324M protocol-based 3G video telephone audio and video synchronization device and method thereof
CN104125022B (en) The method of measurement of audio transmission time delay and system
CN102075988A (en) System and method for locating end-to-end voice quality fault in mobile communication network
CN102325059A (en) Audio frequency end-to-end time delay measurement method of non-intrusive single end acquisition and apparatus thereof
US20130243204A1 (en) Sound quality testing method and system
CN108183760A (en) Broadcast broadcasting signal intelligent monitor system based on removable monitoring
CN107580155A (en) Networking telephone quality determination method, device, computer equipment and storage medium
CN100499694C (en) Method and device for testing speech quality
EP1284059B1 (en) Test signalling
CN115442273B (en) Voice recognition-based audio transmission integrity monitoring method and device
CN106998567A (en) A kind of voice quality method of testing, test device and user equipment
CN111081278A (en) Method and system for testing conversation quality of talkback terminal
CN110010124A (en) Equipment and the call method of inspection are examined in call
CN113660009B (en) Testing system and testing method for power distribution and utilization communication
CN102332261B (en) Audio end-to-end delay measuring method and device based on nonintrusive double-end collection
CN116318457B (en) Radio signal monitoring method and system
CN105306685B (en) The test method and mobile terminal of signal quality
WO2000072453A1 (en) Universal quality measurement system for multimedia and other signals
EP1441329B1 (en) Audio signal quality assessment method and apparatus
CN116346659A (en) Cloud education English teaching platform and teaching method
CN102316363A (en) Method and devices for measuring video end-to-end delay by non-intrusive double-ended collection

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant