CN116524910B - Manuscript prefabrication method and system based on microphone - Google Patents

Manuscript prefabrication method and system based on microphone

Info

Publication number
CN116524910B
CN116524910B (Application CN202310744330.0A)
Authority
CN
China
Prior art keywords
microphone
target voice
voice
manuscript
prefabrication
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202310744330.0A
Other languages
Chinese (zh)
Other versions
CN116524910A (en)
Inventor
虞焰兴
徐勇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Anhui Semxum Information Technology Co ltd
Original Assignee
Anhui Semxum Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Anhui Semxum Information Technology Co ltd filed Critical Anhui Semxum Information Technology Co ltd
Priority to CN202310744330.0A priority Critical patent/CN116524910B/en
Publication of CN116524910A publication Critical patent/CN116524910A/en
Application granted granted Critical
Publication of CN116524910B publication Critical patent/CN116524910B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 - Speech recognition
    • G10L15/06 - Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G10L15/063 - Training
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 - Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/34 - Browsing; Visualisation therefor
    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 - Speech recognition
    • G10L15/26 - Speech to text systems
    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 - Speech recognition
    • G10L15/28 - Constructional details of speech recognition systems
    • G10L15/30 - Distributed recognition, e.g. in client-server systems, for mobile phones or network applications
    • Y - GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 - TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P - CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00 - Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/30 - Computing systems specially adapted for manufacturing

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Electrically Operated Instructional Devices (AREA)

Abstract

The invention belongs to the technical field of manuscript prefabrication and provides a microphone-based manuscript prefabrication method and system. The method comprises the following steps: setting a distinct mark on each of n microphones, the marks being denoted A1, A2, ..., An; acquiring the manuscript corresponding to each of the n microphones and recording it as voice; training the n recorded voices respectively to obtain the target voices corresponding to the n manuscripts; judging whether the microphone marked Am emits a reading signal, where n ≥ m ≥ 1; if a reading signal is emitted, playing the target voice corresponding to the microphone marked Am; if no reading signal is emitted, waiting for one to be emitted. The invention can record a manuscript in advance, adjust the voice information into a clear target voice through training, and thereby avoid problems of unclear speech, heavy accent, or poor intonation during playing, so that the person receiving the information can acquire it accurately.

Description

Manuscript prefabrication method and system based on microphone
Technical Field
The invention belongs to the technical field of manuscript prefabrication, and particularly relates to a manuscript prefabrication method and system based on microphones.
Background
With the development of Internet of Things technology, information is transmitted ever more conveniently, and the microphone, as a carrier of sound information, is widely used in Internet of Things applications. The Internet of Things is an important component of the new generation of information technology; its core and foundation remain the Internet, of which it is an extension and expansion that brings the user side to any article, allowing information exchange and communication between articles.
At present, in fields such as large conferences and online teaching, remote information transmission systems are developing rapidly. Sound serves as the information transmission medium and must be played through a microphone, yet the functions of existing microphones are limited: during information transmission, if the speaker is unclear, has a heavy accent, or misspeaks, the person receiving the information may receive it incorrectly or fail to understand it.
Disclosure of Invention
In order to solve at least one of the problems in the background art, the invention provides a microphone-based manuscript prefabrication method and system.
In order to achieve the above purpose, the present invention adopts the following technical scheme:
a microphone-based manuscript prefabrication method, comprising the steps of:
setting a distinct mark on each of n microphones, the marks being denoted A1, A2, ..., An;
acquiring the manuscript corresponding to each of the n microphones and recording it as voice;
training the n recorded voices respectively to obtain the target voices corresponding to the n manuscripts;
judging whether the microphone marked Am emits a reading signal, where n ≥ m ≥ 1;
if a reading signal is emitted, playing the target voice corresponding to the microphone marked Am;
if no reading signal is emitted, waiting for one to be emitted.
Preferably, the voice is trained through a bag-of-words model, term frequency-inverse document frequency (TF-IDF), or word embedding;
the bag-of-words model comprises the following formulas:
v(D) = [count(w_1, D), count(w_2, D), ..., count(w_N, D)];
where v(D) denotes the bag-of-words vector of text D, w denotes a vocabulary word, and count(w_i, D) denotes the number of times the word w_i appears in the text D, with N ≥ i ≥ 1;
the word frequency-reverse file frequency includes the following formula:
tf-idf(w, D, C) = tf(w, D) × idf(w, C);
where w denotes a vocabulary word, D denotes a text, C denotes a corpus, and tf(w, D) denotes the frequency with which the word w appears in the text D;
the word embedding includes the following formula:
v(w) = [v_1(w), v_2(w), ..., v_d(w)];
where d denotes the dimension of the vector, and v_1(w) to v_d(w) denote the values of the vector v(w) in dimensions 1 to d, i.e. the specific semantic or grammatical features of the word w in dimensions 1 to d.
Preferably, playing the target voice includes the following steps:
acquiring the audio frequency of a target voice;
and judging, based on the audio frequency, whether the target voice needs tuning, until the audio frequency of the target voice falls within a preset audio frequency range.
Preferably, playing the target voice includes the following steps:
converting the target voice into text information;
and playing the text information through the multimedia equipment.
A microphone-based manuscript prefabrication system for performing the above-mentioned microphone-based manuscript prefabrication method, comprising:
the first input unit is used for reading manuscripts and sending out reading signals;
the recording unit is used for recording the manuscript into voice;
the training unit is used for training the recorded voice to obtain target voice;
the server unit is used for converting the target voice into text information and binding and storing the target voice and the text information;
the receiver unit is used for acquiring the target voice from the server unit, performing frequency modulation, acquiring the text information, and judging whether a reading signal is received;
the sound unit is used for playing the target voice;
and the display unit is used for displaying the text information.
Preferably, the first input unit includes a microphone provided with a signal key for emitting a reading signal.
Preferably, the server unit is connected with the receiver unit through a network, and comprises a storage module and an analysis module;
the storage module is used for storing target voice and text information;
the analysis module is used for analyzing the target voice and acquiring text information corresponding to the target voice.
Preferably, a marking module is arranged in the recording unit, and the marking module is used for marking a plurality of microphones and binding target voices with corresponding marks.
Preferably, the receiver unit comprises a wifi module and a tuning module;
the wifi module is used for acquiring a reading signal and downloading target voice and text information from the server unit;
and the tuning module is used for carrying out frequency modulation on the target voice.
The invention has the beneficial effects that:
1. The invention provides a microphone manuscript prefabrication method that can record the manuscript in advance, adjust the voice information into a clear target voice through training, and thereby avoid problems of unclear speech, heavy accent, or poor intonation during playing, so that the person receiving the information can acquire it accurately;
2. In the prefabrication method, the plurality of microphones are marked so that each microphone can store a different target voice and text in advance; during a multi-person conference or multi-person teaching session, the corresponding target voice and text are selected for playing according to the mark of the microphone, which improves the speaker's working efficiency and makes the method more user-friendly.
Additional features and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention. The objectives and other advantages of the invention may be realized and attained by the structure particularly pointed out in the written description and drawings.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions of the prior art, the following description will briefly explain the drawings used in the embodiments or the description of the prior art, and it is obvious that the drawings in the following description are some embodiments of the present invention, and other drawings can be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 illustrates a flow chart of a microphone-based manuscript prefabrication method of the present invention;
fig. 2 shows a block diagram of a microphone-based manuscript prefabrication system of the present invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are some embodiments of the present invention, but not all embodiments of the present invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
A microphone-based manuscript prefabrication method, as shown in fig. 1, comprises the following steps:
S1: setting a distinct mark on each of n microphones, the marks being denoted A1, A2, ..., An;
S2: acquiring the manuscript corresponding to each of the n microphones and recording it as voice;
S3: training the n recorded voices respectively to obtain the target voices corresponding to the n manuscripts;
S4: judging whether the microphone marked Am emits a reading signal, where n ≥ m ≥ 1; if a reading signal is emitted, playing the target voice corresponding to the microphone marked Am; if no reading signal is emitted, waiting for one to be emitted.
In step S1, each microphone generates an identification code, denoted A1 to An for simplicity of description. In steps S2 to S4 the marks A1 to An are then associated and bound with the corresponding voice and text information, so that when the data are retrieved, only the marks A1 to An need to be identified to obtain the corresponding target voice and text information. A minimal sketch of this binding and lookup is given below.
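To make the mark-to-manuscript binding concrete, the following minimal Python sketch shows one possible way to implement this bookkeeping; the class and function names (MicrophoneRegistry, bind, on_reading_signal) and the sample file paths are illustrative assumptions, not part of the claimed method.

```python
# Illustrative sketch only: binds microphone marks (A1..An) to prefabricated
# target voice / text and returns them when a reading signal is received.
from dataclasses import dataclass


@dataclass
class PrefabricatedManuscript:
    target_voice_path: str   # path to the trained, cleaned-up voice recording
    text: str                # text transcription bound to the same mark


class MicrophoneRegistry:
    def __init__(self) -> None:
        self._by_mark: dict[str, PrefabricatedManuscript] = {}

    def bind(self, mark: str, voice_path: str, text: str) -> None:
        """Bind a mark such as 'A1' to its target voice and text (steps S1-S3)."""
        self._by_mark[mark] = PrefabricatedManuscript(voice_path, text)

    def on_reading_signal(self, mark: str) -> PrefabricatedManuscript | None:
        """Step S4: when microphone Am emits a reading signal, return its data."""
        return self._by_mark.get(mark)


if __name__ == "__main__":
    registry = MicrophoneRegistry()
    registry.bind("A1", "a1_target_voice.wav", "Opening remarks ...")
    registry.bind("A2", "a2_target_voice.wav", "Lecture notes ...")

    manuscript = registry.on_reading_signal("A2")
    if manuscript is not None:
        print("play:", manuscript.target_voice_path)
        print("display:", manuscript.text)
```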
It should further be noted that the manuscript prefabrication method of the present invention can also be applied in other scenarios. For example, in online teaching a teacher can input the teaching content into a microphone in advance; the input information is then extensively trained to eliminate the influence of accents, dialects, misstatements, and the like, forming clear Mandarin speech. The teaching content can additionally be converted into foreign-language speech, which benefits international teaching.
Further, in step S3 the speech is trained by means of a bag-of-words model, term frequency-inverse document frequency (TF-IDF), or word embedding, specifically as follows:
1. data collection and cleaning: a large amount of text data, such as web text, news stories, social media comments, etc., needs to be collected first. These data then need to be cleaned to remove duplicate, erroneous and irrelevant data to ensure the quality of the data.
2. Prefabricating a file: file preparation refers to converting raw text data into a computer-processable form, such as converting text into vectors or matrices. This may be accomplished using various text representation methods, such as bag of words models, TF-IDF, word embedding, and the like.
3. Model training: next, training of the file pre-fabricated data using machine learning algorithms is required to build models that can automatically understand human language. Common machine learning algorithms include naive bayes, decision trees, random forests, neural networks, and the like.
4. Model evaluation and optimization: after training the model, it needs to be evaluated and optimized. Various metrics may be used to evaluate the performance of the model, such as accuracy, recall, F1 values, and the like. If the performance of the model is not good enough, the model needs to be adjusted and optimized to improve its performance.
5. Prediction and application: finally, the trained model is applied to actual scenes, such as natural language processing, machine translation, emotion analysis, intelligent customer service and the like.
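As referenced in steps 3 and 4 above, the following is a minimal training-and-evaluation sketch using scikit-learn; the tiny corpus, the labels, and the choice of a naive Bayes classifier over bag-of-words features are illustrative assumptions rather than the patent's prescribed pipeline.

```python
# Illustrative sketch of steps 3-4: train a naive Bayes text model on
# bag-of-words features and evaluate it with accuracy / recall / F1.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.metrics import accuracy_score, recall_score, f1_score

# Tiny made-up corpus: 1 = clearly read sentence, 0 = unclear / accented.
texts = ["clear mandarin sentence", "heavy accent unclear words",
         "well articulated speech", "mumbled dialect phrase"]
labels = [1, 0, 1, 0]

model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(texts, labels)                 # step 3: model training

predictions = model.predict(texts)       # step 4: evaluation (on the training data here)
print("accuracy:", accuracy_score(labels, predictions))
print("recall:  ", recall_score(labels, predictions))
print("F1:      ", f1_score(labels, predictions))
```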
File prefabrication and text training are important technologies in the field of natural language processing, which can enable a computer to better understand and process human language and provide better language processing services for people.
It should be noted that there are various methods and formulas for file prefabrication and text training; several commonly used ones are listed below:
Bag-of-Words Model
The bag-of-words model is a simple and efficient text representation method that treats a piece of text as an unordered collection of words, ignoring word order and grammatical structure and considering only the frequency with which each word appears in the text. The bag-of-words model can be represented by a vector whose dimension is the size of the vocabulary, with each element representing the number of times the corresponding word appears in the text. The formula of the bag-of-words model is as follows:
v(D) = [count(w_1, D), count(w_2, D), ..., count(w_N, D)];
where v(D) denotes the bag-of-words vector of text D, w denotes a vocabulary word, and count(w_i, D) denotes the number of times the word w_i appears in the text D, with N ≥ i ≥ 1. It should be noted that every element in the bag-of-words vector is a non-negative integer, so a sparse vector representation can be used, i.e. only the positions and values of non-zero elements are stored, to save storage space.
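For illustration, a from-scratch sketch of the bag-of-words formula above, with an assumed vocabulary and text:

```python
# Illustrative bag-of-words sketch: v(D) = [count(w_1, D), ..., count(w_N, D)].
from collections import Counter

vocabulary = ["speech", "microphone", "manuscript", "clear"]  # w_1 .. w_N (assumed)
text = "clear speech from the microphone carries the manuscript as clear speech"

counts = Counter(text.split())            # count(w, D) for every word in D
v_D = [counts[w] for w in vocabulary]     # keep only the vocabulary dimensions
print(v_D)                                # [2, 1, 1, 2]
```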
TF-IDF (term frequency-inverse document frequency) is a common text representation method that multiplies each element of the bag-of-words vector by a weighting factor which takes into account both the importance of the word within the text and its prevalence across the corpus. The TF-IDF formula can be expressed as:
tf-idf(w, D, C) = tf(w, D) × idf(w, C);
where w denotes a vocabulary word, D denotes a text, C denotes a corpus, and tf(w, D) denotes the frequency with which the word w appears in the text D, usually a normalized word frequency such as:
tf(w, D) = count(w, D) / max{count(w', D) : w' ∈ D};
where max{count(w', D) : w' ∈ D} denotes the count of the word that appears most often in the text D and is used to normalize the word frequency; idf(w, C) denotes the inverse document frequency of the word w in the corpus C and is generally defined as:
idf(w, C) = log(N / df(w, C));
where N denotes the total number of texts in the corpus and df(w, C) denotes the number of texts containing the word w. The larger the value of idf(w, C), the less common the word w is in the corpus and therefore the greater its discriminative power. In addition, when the word w appears in every text, idf(w, C) is 0, which should be handled specially to avoid calculation errors.
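For illustration, a from-scratch TF-IDF sketch following the formulas above; the three-text corpus is an assumption made up for the example, and a production system would more likely rely on a library implementation:

```python
# Illustrative TF-IDF sketch: tf-idf(w, D, C) = tf(w, D) * idf(w, C).
import math
from collections import Counter

corpus = [                                   # C: a tiny assumed corpus of texts D
    "clear speech clear manuscript",
    "microphone records the manuscript",
    "speech training removes accent",
]


def tf(word: str, text: str) -> float:
    counts = Counter(text.split())
    return counts[word] / max(counts.values())   # normalized by the most frequent word


def idf(word: str, corpus: list[str]) -> float:
    df = sum(1 for text in corpus if word in text.split())
    return math.log(len(corpus) / df) if df else 0.0


def tf_idf(word: str, text: str, corpus: list[str]) -> float:
    return tf(word, text) * idf(word, corpus)


print(tf_idf("clear", corpus[0], corpus))        # distinctive word -> higher weight
print(tf_idf("manuscript", corpus[0], corpus))   # appears in 2 of 3 texts -> lower weight
```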
Word Embedding
Word embedding is a method of mapping words into a vector space in a way that captures semantic and grammatical relationships between words. The general formula for word embedding is as follows:
v(w) = [v_1(w), v_2(w), ..., v_d(w)];
where d denotes the dimension of the vector, and v_1(w) to v_d(w) denote the values of the vector v(w) in dimensions 1 to d, i.e. the specific semantic or grammatical features of the word w in dimensions 1 to d. It should be noted that both the choice of these features and the dimension of the vector affect the performance of the word embedding model.
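For illustration, a minimal word-embedding lookup sketch in which each word maps to a d-dimensional vector v(w); the vectors are hand-written assumed values, whereas in practice they would be learned from data (for example with word2vec-style training):

```python
# Illustrative word-embedding sketch: v(w) = [v_1(w), ..., v_d(w)] with d = 3.
import math

embeddings = {                      # assumed, hand-written vectors for illustration
    "speech":     [0.9, 0.1, 0.3],
    "voice":      [0.8, 0.2, 0.4],
    "microphone": [0.2, 0.9, 0.1],
}


def cosine_similarity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.dist(a, [0.0] * len(a)) * math.dist(b, [0.0] * len(b)))


# Semantically close words should have similar vectors.
print(cosine_similarity(embeddings["speech"], embeddings["voice"]))       # high
print(cosine_similarity(embeddings["speech"], embeddings["microphone"]))  # lower
```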
Further, in step S4, playing the target voice includes voice broadcasting and text display. For voice broadcasting, the audio frequency of the target voice is first obtained, and whether the target voice requires tuning is then determined from that audio frequency, tuning being repeated until the audio frequency of the target voice falls within a preset audio frequency range (an illustrative sketch of such a loop is given below). For text display, the target voice is first converted into text information, and the text information is then presented through the multimedia device.
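One plausible reading of this tuning step is a simple check-and-adjust loop on the dominant frequency of the target voice; the sketch below, including the preset range and step size, is an assumed illustration rather than the patent's prescribed implementation:

```python
# Illustrative sketch of the tuning check: shift the target voice's dominant
# frequency until it falls within a preset range before playback.
PRESET_RANGE_HZ = (85.0, 255.0)   # assumed comfortable speech range
STEP_HZ = 10.0                    # assumed adjustment step per iteration


def needs_tuning(frequency_hz: float) -> bool:
    low, high = PRESET_RANGE_HZ
    return not (low <= frequency_hz <= high)


def tune(frequency_hz: float) -> float:
    """Move the frequency one step toward the preset range."""
    low, high = PRESET_RANGE_HZ
    if frequency_hz < low:
        return frequency_hz + STEP_HZ
    if frequency_hz > high:
        return frequency_hz - STEP_HZ
    return frequency_hz


frequency = 300.0                 # assumed measured dominant frequency of the target voice
while needs_tuning(frequency):
    frequency = tune(frequency)
print("play target voice at", frequency, "Hz")
```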
When the text information is played, a display screen can be connected according to the scenario so that the voice and text are presented synchronously. For example, at a large conference venue with many attendees, voice information cannot be transmitted effectively to all participants, so the information can also be conveyed through the text on the display screen; likewise, during teaching, the display screen can present the voice information synchronously so that students in the back rows can also acquire the teaching content effectively.
A microphone-based manuscript prefabrication system for performing the above-mentioned microphone-based manuscript prefabrication method, as shown in fig. 2, comprising: the first input unit, which is used for reading manuscripts and emitting a reading signal; the recording unit, which is used for recording the manuscript as voice; the training unit, which is used for training the recorded voice to obtain the target voice; the server unit, which is used for converting the target voice into text information and binding and storing the target voice and the text information; the receiver unit, which is used for acquiring the target voice from the server unit, performing frequency modulation, acquiring the text information, and judging whether a reading signal is received; the sound unit, which is used for playing the target voice; and the display unit, which is used for displaying the text information.
Further, the first input unit includes a microphone provided with a signal key for emitting a reading signal.
Further, the server unit is connected with the receiver unit through a network and comprises a storage module and an analysis module; the storage module is used for storing target voice and text information; and the analysis module is used for analyzing the target voice and acquiring text information corresponding to the target voice.
Further, a marking module is arranged in the recording unit and used for marking a plurality of microphones and binding target voices with corresponding marks.
Further, the receiver unit comprises a wifi module and a tuning module; the wifi module is used for acquiring a reading signal and downloading target voice and text information from the server unit; and the tuning module is used for carrying out frequency modulation on the target voice.
For the system embodiment, reference may be made to the description of the method embodiment for the relevant details, since the two essentially correspond. The units and modules of the microphone-based manuscript prefabrication system are divided only according to functional logic and are not limited to the above division, as long as the corresponding functions can be realized; in addition, the specific names of the units are only for distinguishing them from one another and are not intended to limit the protection scope of the present invention.
Although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims (9)

1. A microphone-based manuscript prefabrication method, comprising the steps of:
setting a distinct mark on each of n microphones, the marks being denoted A1, A2, ..., An;
acquiring the manuscript corresponding to each of the n microphones and recording it as voice;
training the n recorded voices respectively to obtain the target voices corresponding to the n manuscripts, wherein the target voices are Mandarin voices or foreign-language voices without accent, grammatical errors, or misstatements;
judging whether the microphone marked Am emits a reading signal, where n ≥ m ≥ 1;
if a reading signal is emitted, playing the target voice corresponding to the microphone marked Am;
playing the target voice, comprising the following steps:
acquiring the audio frequency of the target voice;
judging, based on the audio frequency, whether the target voice needs tuning, until the audio frequency of the target voice falls within a preset audio frequency range;
if no reading signal is emitted, waiting for one to be emitted.
2. The microphone-based manuscript prefabrication method according to claim 1, wherein the voice is trained by a bag-of-words model, term frequency-inverse document frequency (TF-IDF), or word embedding;
the bag-of-words model comprises the following formulas:
v(D) = [count(w_1, D), count(w_2, D), ..., count(w_N, D)];
where v(D) denotes the bag-of-words vector of text D, w denotes a vocabulary word, and count(w_i, D) denotes the number of times the word w_i appears in the text D, with N ≥ i ≥ 1;
the word frequency-reverse file frequency includes the following formula:
tf-idf(w, D, C) = tf(w, D) × idf(w, C);
where w denotes a vocabulary word, D denotes a text, C denotes a corpus, tf(w, D) denotes the frequency with which the word w appears in the text D, and idf(w, C) denotes the inverse document frequency of the word w in the corpus C;
the word embedding includes the following formula:
v(w) = [v_1(w), v_2(w), ..., v_d(w)];
where d denotes the dimension of the vector, and v_1(w) to v_d(w) denote the values of the vector v(w) in dimensions 1 to d, i.e. the specific semantic or grammatical features of the word w in dimensions 1 to d.
3. The microphone-based manuscript prefabrication method according to claim 1, wherein playing the target voice comprises the steps of:
converting the target voice into text information;
and playing the text information through the multimedia equipment.
4. A microphone-based manuscript prefabrication system for performing the microphone-based manuscript prefabrication method as claimed in any one of claims 1 to 3.
5. A microphone-based manuscript prefabrication system, comprising:
the first input unit is used for reading manuscripts and sending out reading signals;
the recording unit is used for recording the manuscript into voice;
the training unit is used for training the recorded voice to obtain the target voice, wherein the target voice is a Mandarin voice or foreign-language voice without accent, grammatical errors, or misstatements;
the server unit is used for converting the target voice into text information and binding and storing the target voice and the text information;
the receiver unit is used for acquiring the target voice from the server unit, performing frequency modulation, acquiring the text information, and judging whether a reading signal is received;
the sound unit is used for playing the target voice;
and the display unit is used for displaying the text information.
6. The microphone-based manuscript prefabrication system according to claim 5, wherein the first input unit comprises a microphone provided with a signal key for emitting a reading signal.
7. The microphone-based manuscript prefabrication system according to claim 5, wherein the server unit is connected to the receiver unit through a network, and comprises a storage module and an analysis module;
the storage module is used for storing target voice and text information;
the analysis module is used for analyzing the target voice and acquiring text information corresponding to the target voice.
8. The microphone-based manuscript prefabrication system according to claim 5, wherein the recording unit is provided with a marking module, wherein the marking module is used for marking a plurality of microphones and binding target voices with corresponding marks.
9. The microphone-based manuscript prefabrication system according to claim 5, wherein the receiver unit comprises a wifi module and a tuning module;
the wifi module is used for acquiring a reading signal and downloading target voice and text information from the server unit;
and the tuning module is used for carrying out frequency modulation on the target voice.
CN202310744330.0A 2023-06-25 2023-06-25 Manuscript prefabrication method and system based on microphone Active CN116524910B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310744330.0A CN116524910B (en) 2023-06-25 2023-06-25 Manuscript prefabrication method and system based on microphone

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310744330.0A CN116524910B (en) 2023-06-25 2023-06-25 Manuscript prefabrication method and system based on microphone

Publications (2)

Publication Number Publication Date
CN116524910A CN116524910A (en) 2023-08-01
CN116524910B true CN116524910B (en) 2023-09-08

Family

ID=87392489

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310744330.0A Active CN116524910B (en) 2023-06-25 2023-06-25 Manuscript prefabrication method and system based on microphone

Country Status (1)

Country Link
CN (1) CN116524910B (en)

Patent Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2004309965A (en) * 2003-04-10 2004-11-04 Advanced Media Inc Conference recording/dictation system
WO2017200076A1 (en) * 2016-05-20 2017-11-23 日本電信電話株式会社 Dialog method, dialog system, dialog device, and program
WO2018121757A1 (en) * 2016-12-31 2018-07-05 深圳市优必选科技有限公司 Method and system for speech broadcast of text
CN106782540A (en) * 2017-01-17 2017-05-31 联想(北京)有限公司 Speech ciphering equipment and the voice interactive system including the speech ciphering equipment
CN206759648U (en) * 2017-06-02 2017-12-15 北京经纬中天信息技术有限公司 The live recording and broadcasting system of Rich Media
CN107220228A (en) * 2017-06-13 2017-09-29 深圳市鹰硕技术有限公司 One kind teaching recorded broadcast data correction device
JP2019153099A (en) * 2018-03-05 2019-09-12 コニカミノルタ株式会社 Conference assisting system, and conference assisting program
CN112750465A (en) * 2020-12-29 2021-05-04 昆山杜克大学 Cloud language ability evaluation system and wearable recording terminal
CN115147957A (en) * 2021-03-15 2022-10-04 爱国者电子科技有限公司 Intelligent voice door lock control method and control system
CN115865875A (en) * 2021-09-24 2023-03-28 精工爱普生株式会社 Display method, display device and display system
CN114553841A (en) * 2022-04-25 2022-05-27 广州集韵信息科技有限公司 Communication method and system based on cloud service

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Chen Bin. Python Machine Learning (3rd Edition), Intelligent Science and Technology Series. China Machine Press, 2021, pp. 164-165. *

Also Published As

Publication number Publication date
CN116524910A (en) 2023-08-01

Similar Documents

Publication Publication Date Title
US8386265B2 (en) Language translation with emotion metadata
CN101382937B (en) Multimedia resource processing method based on speech recognition and on-line teaching system thereof
US20180204565A1 (en) Automatic Language Model Update
US6816858B1 (en) System, method and apparatus providing collateral information for a video/audio stream
CN112104919B (en) Content title generation method, device, equipment and computer readable storage medium based on neural network
CN104050160B (en) Interpreter's method and apparatus that a kind of machine is blended with human translation
KR20120038000A (en) Method and system for determining the topic of a conversation and obtaining and presenting related content
CN105957531A (en) Speech content extracting method and speech content extracting device based on cloud platform
WO2019214456A1 (en) Gesture language translation system and method, and server
CN109712612A (en) A kind of voice keyword detection method and device
CN102339606B (en) Depressed mood phone automatic speech recognition screening system
WO2023222088A1 (en) Voice recognition and classification method and apparatus
Lamel et al. Speech processing for audio indexing
Yang et al. Open source magicdata-ramc: A rich annotated mandarin conversational (ramc) speech dataset
Cho et al. StreamHover: Livestream transcript summarization and annotation
CN114328817A (en) Text processing method and device
CN112185363A (en) Audio processing method and device
CN115206293A (en) Multi-task air traffic control voice recognition method and device based on pre-training
CN111899740A (en) Voice recognition system crowdsourcing test case generation method based on test requirements
US20210264812A1 (en) Language learning system and method
US20220414338A1 (en) Topical vector-quantized variational autoencoders for extractive summarization of video transcripts
Galibert et al. Ritel: an open-domain, human-computer dialog system.
CN116524910B (en) Manuscript prefabrication method and system based on microphone
CN114125506A (en) Voice auditing method and device
CN109979458A (en) News interview original text automatic generation method and relevant device based on artificial intelligence

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant