CN113140211A - Intelligent voice recognition technology of real-time audio and video stream based on trusted call

Publication number
CN113140211A
CN113140211A CN202110422256.1A CN202110422256A CN113140211A CN 113140211 A CN113140211 A CN 113140211A CN 202110422256 A CN202110422256 A CN 202110422256A CN 113140211 A CN113140211 A CN 113140211A
Authority
CN
China
Prior art keywords
module
voice
voiceprint
recognition technology
real
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110422256.1A
Other languages
Chinese (zh)
Inventor
刘波涛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuhan Weiwu Yunlian Technology Co ltd
Original Assignee
Wuhan Weiwu Yunlian Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuhan Weiwu Yunlian Technology Co ltd
Priority to CN202110422256.1A
Publication of CN113140211A
Legal status: Pending

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/005 Language recognition
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/26 Speech to text systems
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W12/00 Security arrangements; Authentication; Protecting privacy or anonymity
    • H04W12/02 Protecting privacy or anonymity, e.g. protecting personally identifiable information [PII]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W4/00 Services specially adapted for wireless communication networks; Facilities therefor
    • H04W4/16 Communication-related supplementary services, e.g. call-transfer or call-hold

Landscapes

  • Engineering & Computer Science (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Computer Security & Cryptography (AREA)
  • Telephone Function (AREA)
  • Telephonic Communication Services (AREA)

Abstract

The invention discloses an intelligent voice recognition technology for real-time audio and video streams based on trusted calls. It comprises a pre-preparation module, a mode matching module, a calling end and an answering end. The pre-preparation module comprises a trusted call source database, a voiceprint database and a voiceprint binding module. The calling end and the answering end are connected, in sequence, through an information transmission module, a voice input module, a voice extraction module, the mode matching module, a voice detection module, a voice comparison module and a voice recognition module, and the entire flow at the calling end and at the answering end is enclosed by an encryption module. The voiceprint binding module of the pre-preparation module effectively protects the privacy of the calling and answering parties: before the technology can be used, the user's voiceprint must be bound. After registering and logging in, the user records the voiceprint twice; binding succeeds only if the two recordings are detected to be the same, and the technology can be used only after the user's voiceprint has been added successfully.

Description

Intelligent voice recognition technology of real-time audio and video stream based on trusted call
Technical Field
The invention relates to the technical field of voice recognition, and in particular to an intelligent voice recognition technology for real-time audio and video streams based on trusted calls.
Background
With the development of science and technology, people can communicate in real time through electronic equipment such as mobile phones and computers, but this is not true for people with serious hearing impairment. Voice conversion services are available in many countries, and people with hearing impairment can communicate through such intermediary services, but these services still fall short in protecting user privacy, and they are expensive in terms of equipment, training and labor. In addition, certain dedicated service numbers serve only a small group of important persons. The content of such calls is very important, yet a telephone number is not difficult to forge, so the source number alone cannot establish whether a call is really made by one of these persons. As a result, the privacy and call security of the calling and answering parties are not adequately guaranteed.
Disclosure of Invention
In view of the above problems, the present invention provides an intelligent voice recognition technology for real-time audio and video streams based on trusted calls, to solve the problems set forth in the background. To this end, the invention adopts the following technical solution.
An intelligent voice recognition technology for real-time audio and video streams based on trusted calls comprises a pre-preparation module, an information transmission module, a voice input module, a voice extraction module, a mode matching module, a calling end and an answering end. The pre-preparation module comprises a trusted call source database, a voiceprint database and a voiceprint binding module. The calling end and the answering end are connected, in sequence, through the information transmission module, the voice input module, the voice extraction module, the mode matching module, a voice detection module, a voice comparison module and a voice recognition module. The entire flow at the calling end and at the answering end is enclosed by an encryption module. The information transmission module, the voice input module, the voice extraction module, the mode matching module, the voice detection module, the voice comparison module, a voice storage module and the voice recognition module are electrically connected with one another.
Preferably, the voiceprint binding module comprises user registration, user login and user detection, and binds the voiceprint through two recordings.
Preferably, the pre-preparation module is connected to the encryption module and to the voice storage module, and is provided with a key.
Preferably, the voice extraction module is directly electrically connected to the voice storage module, and the voice storage module is directly electrically connected to the voice comparison module.
Preferably, the voice detection module is electrically connected to a voice reminding module and to a voice feedback module; the voice reminding module has two modes, audio reminder and pop-up reminder, and the voice feedback module is connected to the trusted call source database.
Preferably, the voice recognition module is connected to a voice conversion module, and the voice conversion module includes text conversion, signal conversion and language conversion.
Preferably, the technology comprises the following steps:
S1. Before the system is used, the user's voiceprint must be bound. After registering and logging in, the user records the voiceprint twice; binding succeeds if the two recordings are detected to be the same, and the user can use the system only after the voiceprint has been added successfully.
S2. With the signal encrypted, the calling end transmits the signal through the information transmission module to the voice input module, which passes it to the voice extraction module. The extracted voiceprint information is passed in turn to the mode matching module, the voice detection module and the voice comparison module for matching, and the information is then transmitted to the answering end through the voice recognition module and the information transmission module.
S3. When the voice detection module detects that the voiceprint differs from the voiceprint in the trusted call source database, the voice reminding module issues an audio reminder and a pop-up reminder.
S4. When the voice feedback module feeds the information back to the trusted call source database and the voiceprint information is found to be inconsistent, the answering end is reminded and, at the same time, the information is fed back to the genuine calling end.
The invention has the following beneficial effects. The voiceprint binding module of the pre-preparation module effectively protects the privacy of the calling and answering parties: before the technology can be used, the user's voiceprint must be bound; after registering and logging in, the user records the voiceprint twice, binding succeeds if the two recordings are detected to be the same, and the technology can be used only after the user's voiceprint has been added successfully. The pre-preparation module is electrically connected to the encryption module and is provided with a key, and the entire flow at the calling end and at the answering end is enclosed by the encryption module, which further improves the privacy and call security of the calling and answering parties. In addition, the voice recognition module is connected to the voice conversion module, which includes text conversion, signal conversion and language conversion, so that for a user with hearing impairment the voice can be converted into text for recognition and can also be converted according to the language.
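The binding rule in step S1 (two recordings that must be detected as the same before the voiceprint is added) can be sketched as follows. This is only an illustrative sketch under stated assumptions: the patent does not say how voiceprints are represented or compared, so the feature vectors, the cosine-similarity test, the `SAME_SPEAKER_THRESHOLD` value, the averaging of the two recordings, and the names `VoiceprintBinder`, `bind` and `can_use` are all hypothetical. User registration and login are omitted.

```python
import math
from typing import Dict, List, Sequence

# Hypothetical threshold: the patent only requires that the two recordings
# be detected as "the same"; it does not specify a similarity measure.
SAME_SPEAKER_THRESHOLD = 0.85


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length feature vectors."""
    if len(a) != len(b) or not a:
        raise ValueError("vectors must be non-empty and of equal length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


class VoiceprintBinder:
    """Toy stand-in for the voiceprint binding module and voiceprint database."""

    def __init__(self, threshold: float = SAME_SPEAKER_THRESHOLD) -> None:
        self.threshold = threshold
        self.voiceprint_db: Dict[str, List[float]] = {}

    def bind(self, user_id: str, first: Sequence[float],
             second: Sequence[float]) -> bool:
        """Bind only if the two recordings are judged to be the same voice."""
        if cosine_similarity(first, second) < self.threshold:
            return False  # recordings differ: binding fails, nothing is stored
        # Assumption: store the element-wise mean of the two recordings.
        self.voiceprint_db[user_id] = [(x + y) / 2 for x, y in zip(first, second)]
        return True

    def can_use(self, user_id: str) -> bool:
        """The system is usable only after a voiceprint was added successfully."""
        return user_id in self.voiceprint_db
```

For example, `VoiceprintBinder().bind("alice", [1.0, 0.0, 0.5], [0.9, 0.1, 0.5])` succeeds because the two vectors are nearly parallel, while two orthogonal vectors are rejected and leave the user unable to use the system.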
Drawings
FIG. 1 is a schematic diagram of the pre-preparation flow of the system of the present invention; FIG. 2 is a diagram of the calling and answering layout of the system of the present invention.
Detailed Description
In order to make the technical means, creative features, objectives and effects of the invention easy to understand, the invention is further explained below. In the description of the present invention, it should be noted that terms such as "upper", "lower", "inner", "outer", "front", "rear", "both ends", "one end" and "the other end" indicate orientations or positional relationships based on those shown in the drawings. They are used only for convenience and simplicity of description and do not indicate or imply that the referred device or element must have a specific orientation or be constructed and operated in a specific orientation; they should therefore not be construed as limiting the present invention. Furthermore, the terms "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance. Unless otherwise explicitly specified or limited, the terms "mounted", "disposed", "connected" and the like are to be construed broadly: a connection may be fixed, detachable or integral; it may be mechanical or electrical; and it may be direct, indirect through an intervening medium, or an internal connection between two elements. The specific meanings of the above terms in the present invention can be understood by those skilled in the art according to the specific case. The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings. The described embodiments are obviously only a part of the embodiments of the present invention, and not all of them.
All other embodiments that a person skilled in the art can derive from the embodiments given herein without creative effort fall within the protection scope of the present invention.
Referring to FIGS. 1-2, the intelligent voice recognition technology for real-time audio and video streams based on trusted calls provided by the invention comprises a pre-preparation module, a calling end and an answering end. The pre-preparation module comprises a trusted call source database, a voiceprint database and a voiceprint binding module. The calling end and the answering end are connected, in sequence, through an information transmission module, a voice input module, a voice extraction module, a mode matching module, a voice detection module, a voice comparison module and a voice recognition module. The entire flow at the calling end and at the answering end is enclosed by an encryption module, and the information transmission module, the voice input module, the voice extraction module, the mode matching module, the voice detection module, the voice comparison module, a voice storage module and the voice recognition module are electrically connected with one another. The voiceprint binding module comprises user registration, user login and user detection, and binds the voiceprint through two recordings. The pre-preparation module is electrically connected to the encryption module and to the voice storage module, and is provided with a key. The voice extraction module is directly electrically connected to the voice storage module, and the voice storage module is directly electrically connected to the voice comparison module. The voice detection module is electrically connected to a voice reminding module and to a voice feedback module; the voice reminding module has two modes, audio reminder and pop-up reminder, and the voice feedback module is connected to the trusted call source database. The voice recognition module is connected to a voice conversion module, and the voice conversion module includes text conversion, signal conversion and language conversion.
The specific flow of the intelligent voice recognition technology for real-time audio and video streams based on trusted calls is as follows:
S1. Before the system is used, the user's voiceprint must be bound. After registering and logging in, the user records the voiceprint twice; binding succeeds if the two recordings are detected to be the same, and the user can use the system only after the voiceprint has been added successfully.
S2. With the signal encrypted, the calling end transmits the signal through the information transmission module to the voice input module, which passes it to the voice extraction module. The extracted voiceprint information is passed in turn to the mode matching module, the voice detection module and the voice comparison module for matching, and the information is then transmitted to the answering end through the voice recognition module and the information transmission module.
S3. When the voice detection module detects that the voiceprint differs from the voiceprint in the trusted call source database, the voice reminding module issues an audio reminder and a pop-up reminder.
S4. When the voice feedback module feeds the information back to the trusted call source database and the voiceprint information is found to be inconsistent, the answering end is reminded and, at the same time, the information is fed back to the genuine calling end.
The above are merely examples of the present invention; common general knowledge, such as known specific structures and characteristics of the schemes, is not described here.
It will be evident to those skilled in the art that the invention is not limited to the details of the foregoing illustrative embodiments, and that the present invention may be embodied in other specific forms without departing from the spirit or essential attributes thereof. The present embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference sign in a claim should not be construed as limiting the claim concerned.
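The detection and feedback behaviour of steps S3 and S4 can be illustrated with a short sketch. It is written under explicit assumptions rather than taken from the patent: the comparison itself is delegated to a caller-supplied `matcher` function because the patent does not specify one, the reminder labels `"audio"` and `"popup"` stand in for the two reminder modes, and the names `TrustedCallChecker`, `CheckResult` and `check` are hypothetical. "Feeding back to the genuine calling end" is modelled here as a flag that is set only when the claimed identity exists in the trusted call source database, which is one possible reading of S4.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Sequence

Voiceprint = Sequence[float]
# Hypothetical comparison hook: returns True when two voiceprints match.
Matcher = Callable[[Voiceprint, Voiceprint], bool]


@dataclass
class CheckResult:
    trusted: bool
    reminders: List[str] = field(default_factory=list)
    real_caller_notified: bool = False


class TrustedCallChecker:
    """Toy model of the voice detection, reminding and feedback modules."""

    def __init__(self, trusted_call_source_db: Dict[str, Voiceprint],
                 matcher: Matcher) -> None:
        self.trusted_call_source_db = trusted_call_source_db
        self.matcher = matcher

    def check(self, claimed_caller: str, extracted: Voiceprint) -> CheckResult:
        enrolled = self.trusted_call_source_db.get(claimed_caller)
        if enrolled is not None and self.matcher(enrolled, extracted):
            # Voiceprint agrees with the trusted source: no reminder needed.
            return CheckResult(trusted=True)
        return CheckResult(
            trusted=False,
            # S3: both reminder modes are raised at the answering end.
            reminders=["audio", "popup"],
            # S4 (assumed reading): the genuine owner of the claimed identity
            # can only be informed if that identity is enrolled at all.
            real_caller_notified=enrolled is not None,
        )
```

Keeping the matcher as a parameter means the sketch makes no claim about which voiceprint comparison method the invention uses; any function that takes the enrolled and extracted voiceprints and returns a boolean can be plugged in.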

Claims (7)

1. An intelligent voice recognition technology for real-time audio and video streams based on trusted calls, comprising a pre-preparation module, an information transmission module, a voice input module, a voice extraction module, a mode matching module, a calling end and an answering end, characterized in that: the pre-preparation module comprises a trusted call source database, a voiceprint database and a voiceprint binding module; the calling end and the answering end are connected, in sequence, through the information transmission module, the voice input module, the voice extraction module, the mode matching module, a voice detection module, a voice comparison module and a voice recognition module; the entire flow at the calling end and at the answering end is enclosed by an encryption module; and the information transmission module, the voice input module, the voice extraction module, the mode matching module, the voice detection module, the voice comparison module, a voice storage module and the voice recognition module are electrically connected with one another.
2. The intelligent voice recognition technology for real-time audio and video streams based on trusted calls as claimed in claim 1, wherein: the voiceprint binding module comprises user registration, user login and user detection, and binds the voiceprint through two recordings.
3. The intelligent voice recognition technology for real-time audio and video streams based on trusted calls, comprising a sending end and a receiving end, characterized in that: the pre-preparation module is electrically connected to the encryption module and to the voice storage module, and is provided with a key.
4. The intelligent voice recognition technology for real-time audio and video streams based on trusted calls as claimed in claim 1, wherein: the voice extraction module is directly electrically connected to the voice storage module, and the voice storage module is directly electrically connected to the voice comparison module.
5. The intelligent voice recognition technology for real-time audio and video streams based on trusted calls as claimed in claim 1, wherein: the voice detection module is electrically connected to a voice reminding module and to a voice feedback module, the voice reminding module has two modes, audio reminder and pop-up reminder, and the voice feedback module is connected to the trusted call source database.
6. The intelligent voice recognition technology for real-time audio and video streams based on trusted calls as claimed in claim 1, wherein: the voice recognition module is connected to a voice conversion module, and the voice conversion module includes text conversion, signal conversion and language conversion.
7. An intelligent voice recognition technology for real-time audio and video streams based on trusted calls, characterized by comprising the following steps:
S1. Before the system is used, the user's voiceprint must be bound; after registering and logging in, the user records the voiceprint twice, binding succeeds if the two recordings are detected to be the same, and the user can use the system only after the voiceprint has been added successfully;
S2. With the signal encrypted, the calling end transmits the signal through the information transmission module to the voice input module, which passes it to the voice extraction module; the extracted voiceprint information is passed in turn to the mode matching module, the voice detection module and the voice comparison module for matching, and the information is then transmitted to the answering end through the voice recognition module and the information transmission module;
S3. When the voice detection module detects that the voiceprint differs from the voiceprint in the trusted call source database, the voice reminding module issues an audio reminder and a pop-up reminder; and
S4. When the voice feedback module feeds the information back to the trusted call source database and the voiceprint information is found to be inconsistent, the answering end is reminded and, at the same time, the information is fed back to the genuine calling end.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110422256.1A CN113140211A (en) 2021-04-20 2021-04-20 Intelligent voice recognition technology of real-time audio and video stream based on trusted call

Publications (1)

Publication Number Publication Date
CN113140211A (en) 2021-07-20

Family

ID=76812755

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110422256.1A Pending CN113140211A (en) 2021-04-20 2021-04-20 Intelligent voice recognition technology of real-time audio and video stream based on trusted call

Country Status (1)

Country Link
CN (1) CN113140211A (en)

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2007241130A (en) * 2006-03-10 2007-09-20 Matsushita Electric Ind Co Ltd System and device using voiceprint recognition
CN101909290A (en) * 2010-08-25 2010-12-08 中兴通讯股份有限公司 Method, system and mobile terminal for encrypting voice call
WO2015192450A1 (en) * 2014-06-20 2015-12-23 中兴通讯股份有限公司 Identity identification method and apparatus and communication terminal
CN107959655A (en) * 2016-10-14 2018-04-24 北京信威通信技术股份有限公司 A kind of calling and called correlating method of end-to-end enciphoring voice telecommunication
CN108347512A (en) * 2018-01-22 2018-07-31 维沃移动通信有限公司 A kind of personal identification method and mobile terminal
CN110660398A (en) * 2019-09-19 2020-01-07 北京三快在线科技有限公司 Voiceprint feature updating method and device, computer equipment and storage medium
CN110730952A (en) * 2017-11-03 2020-01-24 腾讯科技(深圳)有限公司 Method and system for processing audio communication on network
CN112235608A (en) * 2020-12-11 2021-01-15 视联动力信息技术股份有限公司 Data encryption transmission method, device and medium based on video network
CN112637543A (en) * 2020-12-09 2021-04-09 随锐科技集团股份有限公司 Audio and video conference method and device based on voice control

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 20210720)