CN113160827A - Voice transcription system and method based on multi-language model - Google Patents

Voice transcription system and method based on multi-language model

Info

Publication number
CN113160827A
Authority
CN
China
Prior art keywords
module
client
voice
voice data
platform
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110371093.9A
Other languages
Chinese (zh)
Inventor
鱼海航
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Yuliang Technology Co ltd
Original Assignee
Shenzhen Yuliang Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Yuliang Technology Co ltd filed Critical Shenzhen Yuliang Technology Co ltd
Priority to CN202110371093.9A priority Critical patent/CN113160827A/en
Publication of CN113160827A publication Critical patent/CN113160827A/en
Pending legal-status Critical Current

Classifications

    • G10L 15/26: Speech recognition - Speech to text systems
    • G06F 40/40: Handling natural language data - Processing or translation of natural language
    • G10L 15/02: Speech recognition - Feature extraction for speech recognition; Selection of recognition unit
    • G10L 15/04: Speech recognition - Segmentation; Word boundary detection
    • G10L 15/30: Speech recognition - Distributed recognition, e.g. in client-server systems, for mobile phones or network applications
    • G10L 25/12: Speech or voice analysis techniques characterised by the type of extracted parameters - the extracted parameters being prediction coefficients
    • G10L 25/24: Speech or voice analysis techniques characterised by the type of extracted parameters - the extracted parameters being the cepstrum
    • G10L 25/45: Speech or voice analysis techniques - characterised by the type of analysis window
    • H04L 67/1095: Network protocols in which an application is distributed across nodes - Replication or mirroring of data, e.g. scheduling or transport for data synchronisation between network nodes

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Multimedia (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Signal Processing (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Machine Translation (AREA)

Abstract

The invention provides a voice transcription system and method based on a multi-language model, comprising a platform, a client connected to the platform, a storage module, a voice service module, and a display module connected to the client. The platform receives information sent by the client and the voice service module and forwards information to them; the client records a user's personal information, sends it to the platform, delivers information received from the platform to the user, and displays it through the display module; the storage module stores the voice data; and the voice service module transcribes and translates the user's voice data to generate a transcribed text and a translated text. The invention avoids the situation in which an interpreter must accompany the user at all times at high cost with expensive translation fees, improves working efficiency, and also avoids the situations in which having an interpreter on site is inconvenient.

Description

Voice transcription system and method based on multi-language model
Technical Field
The invention relates to the technical field of voice communication, in particular to a voice transcription system and a voice transcription method based on a multi-language model.
Background
According to statistics, there are some 5,000 to 6,000 languages in the world, the more widely used of which include English, Chinese, Japanese, French, German and Russian. With the development of communications and transport, business and tourism between countries have grown steadily, international long-distance call charges have fallen sharply, and call volumes have risen greatly. In 2000, the number of inbound foreign tourists in China exceeded ten million, ranking fifth in the world and first in Asia. Language barriers cause great inconvenience to trade and tourism and hold back their further development. To overcome these barriers, spoken-language translation has become an important tool, and for countries with large tourism and investment flows such as China, tens of thousands of interpreters are needed worldwide.
However, an on-site interpreter must accompany the user at all times, which makes the cost high, and translation fees are generally expensive; working efficiency is low, mobility is poor, and in some situations having an interpreter on site is inconvenient.
Disclosure of Invention
The technical problem to be solved by the invention is to overcome the above defects and to provide a voice transcription system and method based on a multi-language model, so as to solve the problems identified in the background art: an on-site interpreter must accompany the user at all times, the cost is high and translation fees are generally expensive, working efficiency is low, mobility is poor, and in some situations having an interpreter on site is inconvenient.
In order to achieve this purpose, the invention provides the following technical scheme: a voice transcription system and method based on a multi-language model, comprising a platform, a client connected with the platform, a storage module, a voice service module and a display module connected with the client;
the platform is used for receiving the information sent by the client and the voice service module and sending the information to the client and the voice service module;
the client is used for inputting personal information of a user, sending the personal information to the platform, sending the information sent by the platform to the user and displaying the information through the display module;
the storage module is used for storing voice data;
and the voice service module is used for transcribing and translating the voice data of the user and generating a transcribed text and a translated text.
Preferably, the voice service module is connected to a processing module, and the processing module is connected to an extraction module and is configured to process voice data sent by the voice service module and send the data to the extraction module; the extraction module is connected with the voice service module and used for extracting characteristics of the voice data sent by the processing module and sending the voice data to the voice service module.
Preferably, the processing module is configured to perform pre-emphasis, framing, windowing, and endpoint detection on the voice data sent by the voice service module, and send the processed voice data to the extraction module.
Preferably, the extraction module is used for extracting, from the voice data sent by the processing module, the important information reflecting the speech features while removing relatively irrelevant information, using linear prediction cepstral coefficients (LPCC), and for sending the resulting data to the voice service module.
Preferably, the system also comprises a voice acquisition module for acquiring voice data of the user and a conversion module for carrying out A/D conversion on the voice data acquired by the voice acquisition module, wherein the voice acquisition module is connected with the conversion module, and the conversion module is connected with the client.
Preferably, the method comprises the following steps:
S1, a user logs in to the client to record personal voice data, the voice data is sent to the platform through the client, and the platform synchronously sends the voice data to the voice service module and the storage module;
S2, during translation, the voice acquisition module acquires the user's voice data, which is transmitted to the client through the conversion module; the client transmits the received voice data to the platform, and the platform transmits the data pushed by the client to the voice service module and stores it in the storage module;
when the voice data sent by the user is in the same language as the voice data recorded by the other users, the voice service module only transcribes the voice data into text and sends the transcribed text to the platform; the platform sends the text to each client, the transcribed text is displayed through the display module connected to each client, and the user's voice information is sent to each client at the same time;
when the voice data sent by the user is in a language different from that recorded by some of the users, the voice service module both translates and transcribes the voice data and sends the translated text and the transcribed text to the platform; the platform sends the translated text to the corresponding clients and the transcribed text to the client of the original user, the translated text is displayed through the display module connected to the corresponding client, and the transcribed text is displayed through the display module connected to the client of the original user;
S3, the platform synchronously sends the voice data, transcribed text and translated text of each user to the storage module for storage.
Preferably, in step S2, when multiple users communicate with each other, their voice data are synchronized to the platform and translated and transcribed by the voice service module; the translated text is sent to the other users' clients for display, and the transcribed text is sent to the original client for display.
Preferably, when a user needs to look up communication records, the user logs in to the client and enters a query, the translated text and transcribed text of the requested voice information are sent to the user's client, and both are displayed through the display module connected to that client.
Compared with the prior art, the invention provides a voice transcription system and a method based on a multi-language model, which have the following beneficial effects:
according to the invention, a user logs on a client, and inputs voice data to the client and sends the voice data to the platform, the voice data is transcribed and translated through the voice service module connected with the platform, the transcribed text and the translated text are sent to the client of each corresponding user, and the transcribed text and the translated text are displayed through the display module connected with the client so as to be convenient for converting multiple languages, thereby avoiding the situation that a translator needs to keep up with the client at any time, the cost is high, the translation cost is high, the working efficiency is improved, and the situation that the translator is inconvenient in the field in some occasions is avoided.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and, together with the description, serve to explain the principles of the invention without limiting it. In the drawings:
FIG. 1 is a simplified structural diagram of the voice transcription system based on a multi-language model according to the present invention.
Detailed Description
In order to make the technical means, creative features, objectives and effects of the invention easy to understand, the invention is further described below with reference to specific embodiments and the accompanying drawings; the following embodiments are only preferred embodiments of the invention, not all of them. Based on the embodiments of the invention, all other embodiments obtained by those skilled in the art without creative effort fall within the protection scope of the invention.
Referring to fig. 1, a system and method for voice transcription based on a multi-language model includes a platform, a client connected to the platform, a storage module, a voice service module, and a display module connected to the client;
the platform is used for receiving the information sent by the client and the voice service module, sending the information to the client and the voice service module and sending the data to the storage module for storage;
the client is connected with the platform through a network and is used for inputting a user's personal information, sending it to the platform, delivering the information sent by the platform to the user and displaying it through the display module; the display module can display the transcribed text and the translated text for the user to view;
the storage module is used for storing voice data and storing data transmitted and received by the platform;
the voice service module is connected with the platform through a network and used for transcribing and translating voice data of the user and generating a transcribed text and a translated text.
The voice service module is connected with the processing module, and the processing module is connected with the extraction module and used for processing the voice data sent by the voice service module and sending the data to the extraction module; the extraction module is connected with the voice service module and used for extracting characteristics of the voice data sent by the processing module and sending the voice data to the voice service module.
The processing module is used for performing pre-emphasis, framing, windowing and endpoint detection on the voice data sent by the voice service module and sending the processed voice data to the extraction module. Pre-emphasis, also called high-frequency boosting, compensates for the tendency of the high-frequency part of a speech signal to lose information under the influence of oral and nasal radiation and similar effects, and is therefore applied during pre-processing; its purpose is to boost the high-frequency components and flatten the spectrum of the signal so that spectral analysis or vocal-tract parameter analysis can be performed. Framing is a common technique in speech signal analysis and processing: the signal is processed in short segments because, over a limited period of time, its characteristics can be regarded as relatively stable, a property known as short-time stationarity; the continuous speech signal can therefore be divided into several relatively independent parts for analysis, which makes continuous speech simpler to process. Windowing is applied after framing so that each frame of the speech signal starts and ends smoothly; in practice, rectangular and Hamming window functions are the most commonly used. The final pre-processing step is endpoint detection, a technique for identifying the positions of the start and end points of units such as phonemes, syllables and words in the speech signal.
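As a minimal illustration of these four pre-processing steps, the NumPy sketch below performs pre-emphasis, framing, Hamming windowing and a crude energy-based endpoint check; the 25 ms frame length, 10 ms hop, 0.97 pre-emphasis factor and energy threshold are common textbook defaults assumed here, not values disclosed in the patent:

```python
import numpy as np

def preprocess(signal: np.ndarray, sample_rate: int = 16000,
               frame_ms: float = 25.0, hop_ms: float = 10.0,
               alpha: float = 0.97) -> np.ndarray:
    """Pre-emphasis, framing, Hamming windowing and a simple endpoint check.

    Assumes `signal` holds at least one frame of audio samples.
    """
    signal = signal.astype(np.float64)

    # Pre-emphasis (high-frequency boosting): y[n] = x[n] - alpha * x[n-1]
    emphasized = np.append(signal[0], signal[1:] - alpha * signal[:-1])

    # Framing: split into short overlapping frames (short-time stationarity)
    frame_len = int(sample_rate * frame_ms / 1000)
    hop_len = int(sample_rate * hop_ms / 1000)
    n_frames = 1 + (len(emphasized) - frame_len) // hop_len
    frames = np.stack([emphasized[i * hop_len: i * hop_len + frame_len]
                       for i in range(n_frames)])

    # Windowing: taper each frame so it starts and ends smoothly
    frames *= np.hamming(frame_len)

    # Endpoint detection (crude): keep frames whose short-time energy
    # exceeds a fraction of the maximum frame energy
    energy = np.sum(frames ** 2, axis=1)
    return frames[energy > 0.05 * energy.max()]
```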
The extraction module is used for extracting, from the voice data sent by the processing module, the important information reflecting the speech features while removing relatively irrelevant information, using linear prediction cepstral coefficients (LPCC), and for sending the resulting data to the voice service module.
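The sketch below shows one conventional way to obtain LPCC features from a single windowed frame, via autocorrelation, the Levinson-Durbin recursion and the standard LPC-to-cepstrum conversion; the predictor order of 12 is an assumed, commonly used value, since the patent does not specify the model order:

```python
import numpy as np

def lpcc(frame: np.ndarray, order: int = 12) -> np.ndarray:
    """Linear prediction cepstral coefficients for one windowed frame."""
    # Autocorrelation for lags 0..order
    r = np.correlate(frame, frame, mode="full")[len(frame) - 1:][:order + 1]

    # Levinson-Durbin recursion -> LPC coefficients a[1..order]
    a = np.zeros(order + 1)
    e = r[0] + 1e-10          # small offset guards against silent frames
    for i in range(1, order + 1):
        acc = r[i] - np.dot(a[1:i], r[i - 1:0:-1])
        k = acc / e
        a[1:i] = a[1:i] - k * a[i - 1:0:-1]
        a[i] = k
        e *= 1.0 - k * k

    # LPC -> cepstrum: c[n] = a[n] + sum_{k=1}^{n-1} (k/n) * c[k] * a[n-k]
    c = np.zeros(order + 1)
    for n in range(1, order + 1):
        c[n] = a[n] + sum((k / n) * c[k] * a[n - k] for k in range(1, n))
    return c[1:]
```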
The system further comprises a voice acquisition module for acquiring the user's voice data and a conversion module for performing A/D conversion on the voice data acquired by the voice acquisition module; the voice acquisition module is connected with the conversion module, and the conversion module is connected with the client.
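Purely as an illustration of the A/D step carried out by the conversion module, the snippet below quantises already-sampled analogue values into 16-bit PCM; the bit depth and the [-1, 1] input range are assumptions, not values given in the patent:

```python
import numpy as np

def analog_to_digital(analog_samples: np.ndarray) -> np.ndarray:
    """Quantise sampled analogue values in [-1.0, 1.0] to 16-bit PCM."""
    clipped = np.clip(analog_samples, -1.0, 1.0)
    return np.round(clipped * 32767).astype(np.int16)
```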
The voice transcription method based on the multi-language model comprises the following steps:
S1, a user logs in to the client to record personal voice data, the voice data is sent to the platform through the client, and the platform synchronously sends the voice data to the voice service module and the storage module;
S2, during translation, the voice acquisition module acquires the user's voice data, which is transmitted to the client through the conversion module; the client transmits the received voice data to the platform, and the platform transmits the data pushed by the client to the voice service module and stores it in the storage module;
when the voice data sent by the user is in the same language as the voice data recorded by the other users, the voice service module only transcribes the voice data into text and sends the transcribed text to the platform; the platform sends the text to each client, the transcribed text is displayed through the display module connected to each client, and the user's voice information is sent to each client at the same time;
when the voice data sent by the user is in a language different from that recorded by some of the users, the voice service module both translates and transcribes the voice data and sends the translated text and the transcribed text to the platform; the platform sends the translated text to the corresponding clients and the transcribed text to the client of the original user, the translated text is displayed through the display module connected to the corresponding client, and the transcribed text is displayed through the display module connected to the client of the original user;
S3, the platform synchronously sends the voice data, transcribed text and translated text of each user to the storage module for storage.
In step S2, when multiple users communicate with each other, their voice data are synchronized to the platform and translated and transcribed by the voice service module; the translated text is sent to the other users' clients for display, and the transcribed text is sent to the original client for display.
When a user needs to review communication records, the user logs in to the client and enters a query, and the translated text and transcribed text of the requested voice information are sent to the user's client and displayed through the connected display module. When users speaking different languages log in to their clients, each user's voice data is recorded: the voice acquisition module collects the voice, the conversion module converts the analogue signal into a digital signal and sends it to the client, and the client sends it to the platform. The platform sends the data to the storage module for storage and, at the same time, to the voice service module, where it passes through the processing module and the extraction module before the voice data is transcribed and translated; the generated transcribed text and translated text are sent back to the platform, which sends the transcribed text to the original user and the translated text to the other users, while also storing the data in the storage module. Each user receives the translation data sent by the platform through the client and can conveniently view the texts through the display module. This avoids the situation in which an interpreter must accompany the user at all times at high cost with expensive translation fees, improves working efficiency, and also avoids the situations in which having an interpreter on site is inconvenient.
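To make the routing just described concrete, the sketch below shows how a platform might dispatch a single utterance: transcription only when speaker and listener share a recorded language, transcription plus translation otherwise, with the results archived for later queries. The Client dataclass and the send, transcribe and translate callables are hypothetical interfaces assumed for illustration; they are not APIs disclosed in the patent:

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Client:
    user_id: str
    language: str                      # language recorded at registration (S1)
    send: Callable[[str, str], None]   # send(kind, text) to the display module

def route_utterance(speaker: Client, listeners: List[Client],
                    transcribe: Callable[[bytes, str], str],
                    translate: Callable[[str, str, str], str],
                    audio: bytes, archive: List[dict]) -> None:
    # Always transcribe in the speaker's own language (step S2)
    transcript = transcribe(audio, speaker.language)
    speaker.send("transcript", transcript)

    translations = {}
    for listener in listeners:
        if listener.language == speaker.language:
            # Same recorded language: transcription only
            listener.send("transcript", transcript)
        else:
            # Different language: translate the transcript for this listener
            text = translate(transcript, speaker.language, listener.language)
            translations[listener.language] = text
            listener.send("translation", text)

    # Step S3: synchronise audio, transcript and translations to storage
    archive.append({"user": speaker.user_id, "audio": audio,
                    "transcript": transcript, "translations": translations})
```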
The foregoing shows and describes the general principles, essential features and advantages of the invention. It will be understood by those skilled in the art that the invention is not limited to the embodiments described above; the above embodiments and description merely set out preferred embodiments of the invention and are not intended to limit it. The scope of the invention is defined by the appended claims and their equivalents.

Claims (8)

1. A voice transcription system based on a multi-language model is characterized by comprising a platform, a client connected with the platform, a storage module, a voice service module and a display module connected with the client;
the platform is used for receiving the information sent by the client and the voice service module and sending the information to the client and the voice service module;
the client is used for inputting personal information of a user, sending the personal information to the platform, sending the information sent by the platform to the user and displaying the information through the display module;
the storage module is used for storing voice data;
and the voice service module is used for transcribing and translating the voice data of the user and generating a transcribed text and a translated text.
2. The multi-language model-based speech transcription system as claimed in claim 1, wherein: the voice service module is connected with the processing module, and the processing module is connected with the extraction module and is used for processing the voice data sent by the voice service module and sending the data to the extraction module; the extraction module is connected with the voice service module and used for extracting characteristics of the voice data sent by the processing module and sending the voice data to the voice service module.
3. The multi-language model-based speech transcription system as claimed in claim 2, wherein: the processing module is used for carrying out pre-emphasis, framing, windowing and endpoint detection on the voice data sent by the voice service module and sending the processed voice data to the extraction module.
4. A multi-language model-based speech transcription system as claimed in claim 3, characterized in that: the extraction module is used for extracting, from the voice data sent by the processing module, the important information reflecting the voice characteristics while removing relatively irrelevant information, using linear prediction cepstral coefficients (LPCC), and for sending the data to the voice service module.
5. A multi-language model-based speech transcription system according to any one of claims 1 to 4, characterized in that: the system further comprises a voice acquisition module for acquiring a user's voice data and a conversion module for performing A/D conversion on the voice data acquired by the voice acquisition module, wherein the voice acquisition module is connected with the conversion module, and the conversion module is connected with the client.
6. A voice transcription method based on a multi-language model is characterized by comprising the following steps:
s1, a user logs in the client to record personal voice data, and the voice data is sent to the platform through the client, and the platform synchronously sends the voice data to the voice service module and the storage module;
s2, during translation, the voice acquisition module acquires user voice data, the user voice data is transmitted to the client through the conversion module, the client transmits the received voice data to the platform, and the platform transmits the data pushed by the client to the voice service module and stores the data in the storage module;
when the voice data sent by the user is consistent with the voice data recorded by other users, the voice service module only transcribes the voice data into texts and sends the transcribed texts to the platform, the platform sends the texts to each client, the transcribed texts are displayed through the display module connected with each client, and meanwhile, the voice information of the user is sent to each client;
when the voice data sent by the user is different from the voice data recorded by the individual user, the voice service module translates and transcribes the voice data and sends the translated text and the transcribed text to the platform, the platform sends the translated text to the individual corresponding client and sends the transcribed text to the client of the original user, the translated text is displayed through the display module connected with the corresponding client, and the transcribed text is displayed through the display module connected with the client of the original user;
and S3, the platform synchronously sends the voice data, the transcription text and the translation text of each user to the storage module for storage.
7. The voice transcription method based on a multi-language model according to claim 6, characterized in that: in step S2, when multiple users communicate with each other, their voice data are synchronized to the platform and translated and transcribed by the voice service module; the translated text is sent to the other users' clients for display, and the transcribed text is sent to the original client for display.
8. The voice transcription method based on a multi-language model according to claim 6, characterized in that: when a user needs to look up communication records, the user logs in to the client and enters a query, the translated text and transcribed text of the requested voice information are sent to the user's client, and both are displayed through the display module connected to that client.
CN202110371093.9A 2021-04-07 2021-04-07 Voice transcription system and method based on multi-language model Pending CN113160827A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110371093.9A CN113160827A (en) 2021-04-07 2021-04-07 Voice transcription system and method based on multi-language model

Publications (1)

Publication Number Publication Date
CN113160827A true CN113160827A (en) 2021-07-23

Family

ID=76888535

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110371093.9A Pending CN113160827A (en) 2021-04-07 2021-04-07 Voice transcription system and method based on multi-language model

Country Status (1)

Country Link
CN (1) CN113160827A (en)

Patent Citations (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN202587038U (en) * 2012-04-11 2012-12-05 上海车音网络科技有限公司 Voice data processing platform and system thereof
CN105103151A (en) * 2013-02-08 2015-11-25 机械地带有限公司 Systems and methods for multi-user multi-lingual communications
CN105408891A (en) * 2013-06-03 2016-03-16 机械地带有限公司 Systems and methods for multi-user multi-lingual communications
CN107229616A (en) * 2016-03-25 2017-10-03 阿里巴巴集团控股有限公司 Language identification method, apparatus and system
JP2018060165A (en) * 2016-09-28 2018-04-12 パナソニック インテレクチュアル プロパティ コーポレーション オブ アメリカPanasonic Intellectual Property Corporation of America Voice recognition method, portable terminal, and program
CN106453043A (en) * 2016-09-29 2017-02-22 安徽声讯信息技术有限公司 Multi-language conversion-based instant communication system
CN108595443A (en) * 2018-03-30 2018-09-28 浙江吉利控股集团有限公司 Simultaneous interpreting method, device, intelligent vehicle mounted terminal and storage medium
KR20200090579A (en) * 2019-01-21 2020-07-29 (주)한컴인터프리 Method and System for Interpreting and Translating using Smart Device
CN110049270A (en) * 2019-03-12 2019-07-23 平安科技(深圳)有限公司 Multi-person conference speech transcription method, apparatus, system, equipment and storage medium
CN110335610A (en) * 2019-07-19 2019-10-15 北京硬壳科技有限公司 The control method and display of multimedia translation
CN110457717A (en) * 2019-08-07 2019-11-15 深圳市博音科技有限公司 Remote translating system and method
CN110689770A (en) * 2019-08-12 2020-01-14 合肥马道信息科技有限公司 Online classroom voice transcription and translation system and working method thereof
KR20210020448A (en) * 2019-08-14 2021-02-24 주식회사 사운드브릿지 Simultaneous interpretation device based on mobile cloud and electronic device
CN112447168A (en) * 2019-09-05 2021-03-05 阿里巴巴集团控股有限公司 Voice recognition system and method, sound box, display device and interaction platform
CN110556094A (en) * 2019-10-18 2019-12-10 重庆旅游人工智能信息科技有限公司 Artificial intelligent voice simultaneous interpretation system of tour guide machine
CN111554280A (en) * 2019-10-23 2020-08-18 爱声科技有限公司 Real-time interpretation service system for mixing interpretation contents using artificial intelligence and interpretation contents of interpretation experts
CN112951236A (en) * 2021-02-07 2021-06-11 北京有竹居网络技术有限公司 Voice translation equipment and method

Similar Documents

Publication Publication Date Title
CN111128126B (en) Multi-language intelligent voice conversation method and system
CN110049270B (en) Multi-person conference voice transcription method, device, system, equipment and storage medium
JP4393494B2 (en) Machine translation apparatus, machine translation method, and machine translation program
CN102903361A (en) Instant call translation system and instant call translation method
WO2008084476A2 (en) Vowel recognition system and method in speech to text applications
CN107945805A (en) A kind of intelligent across language voice identification method for transformation
CN108053823A (en) A kind of speech recognition system and method
CN111477216A (en) Training method and system for pronunciation understanding model of conversation robot
KR20140121580A (en) Apparatus and method for automatic translation and interpretation
CN106453043A (en) Multi-language conversion-based instant communication system
CN101876887A (en) Voice input method and device
CN110853615A (en) Data processing method, device and storage medium
US20020198716A1 (en) System and method of improved communication
CN110265000A (en) A method of realizing Rapid Speech writing record
CN113744722A (en) Off-line speech recognition matching device and method for limited sentence library
CN109686365B (en) Voice recognition method and voice recognition system
CN113505609A (en) One-key auxiliary translation method for multi-language conference and equipment with same
JPH0965424A (en) Automatic translation system using radio portable terminal equipment
CN107885736A (en) Interpretation method and device
CN113160827A (en) Voice transcription system and method based on multi-language model
CN102196100A (en) Instant call translation system and method
CN106228984A (en) Voice recognition information acquisition methods
CN111108553A (en) Voiceprint detection method, device and equipment for sound collection object
CN111738023A (en) Automatic image-text audio translation method and system
CN116665674A (en) Internet intelligent recruitment publishing method based on voice and pre-training model

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination