CN114936555B - Method and system for AI intelligent labeling of Mongolian - Google Patents

Method and system for AI intelligent labeling of Mongolian

Info

Publication number: CN114936555B
Application number: CN202210573462.7A
Authority: CN (China)
Prior art keywords: labeling, word, module, words, audio
Legal status: Active (granted)
Other languages: Chinese (zh)
Other versions: CN114936555A
Inventors: 娜仁格日乐, 陈磊, 杨忠, 王辉, 戴林, 尹帮仁, 程彪
Current assignee: Inner Mongolia Autonomous Region Public Security Bureau; Iflytek Information Technology Co Ltd
Original assignee: Inner Mongolia Autonomous Region Public Security Bureau; Iflytek Information Technology Co Ltd
Application filed by Inner Mongolia Autonomous Region Public Security Bureau and Iflytek Information Technology Co Ltd
Priority to CN202210573462.7A
Published as CN114936555A; application granted; published as CN114936555B

Classifications

    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking (G06F40/20 Natural language analysis; G06F40/279 Recognition of textual entities)
    • G06F40/117 Tagging; Marking up; Designating a block; Setting of attributes (G06F40/10 Text processing; G06F40/103 Formatting, i.e. changing of presentation of documents)
    • G06F40/151 Transformation (G06F40/12 Use of codes for handling textual entities)
    • G06F40/253 Grammatical analysis; Style critique (G06F40/20 Natural language analysis)
    • G10L15/005 Language recognition (G10L15/00 Speech recognition)
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Machine Translation (AREA)
  • Document Processing Apparatus (AREA)
  • Character Discrimination (AREA)

Abstract

The application relates to a method for AI intelligent labeling of Mongolian, which comprises the following steps: determining whether the audio data fall within the invalid-data range; identifying whether the audio content is a Xilingol League accent; identifying the type of the transcription content and normalizing it; identifying special text and applying special marks; and outputting the final transcription annotation content. The method and the device improve the screening of invalid audio data, reduce wasted computation, and raise the speed of speech transcription labeling; recognition of various special cases and text types in standard Zhenglan Banner Mongolian is improved, and the accuracy of transcription labeling is increased.

Description

Method and system for AI intelligent labeling of Mongolian
Technical Field
The application relates to the field of Mongolian processing, in particular to a method and a system for AI intelligent labeling of Mongolian.
Background
With the development of the times and the progress of society, exchanges among all sectors of society have become increasingly close. With the growth of the logistics and transportation industries, and on the basis of a well-developed communications industry, trade between countries has gradually increased. The language barrier faced by some farmers and herders during cross-border trade has become one of the main obstacles, so apps and websites for language translation play a large role. However, some farmers and herders have a limited level of education and little familiarity with electronic devices, and because they work in the same place all year round their dialect accents are heavy, so language translation software cannot translate their speech well when it is used.
Existing methods for transcribing and labeling the standard-pronunciation Mongolian of Zhenglan Banner perform poorly and cannot complete the transcription and labeling of Zhenglan Banner standard-pronunciation Mongolian.
Disclosure of Invention
To solve the problem of poor transcription labeling of Zhenglan Banner standard Mongolian, the application provides a method and a system for AI intelligent labeling of Mongolian.
The application provides a method for AI intelligent labeling of Mongolian, which comprises the following steps:
Step S1, judging whether the audio data are valid: if the audio data do not fall within the invalid-data range, they are valid and audio transcription labeling is performed; if the audio data meet any criterion of the invalid-data range, they are invalid, no audio transcription labeling is performed, and they are tagged and labeled as bad data;
Step S2, judging the audio accent and identifying whether the audio content is a Xilingol League accent: if the recognition result is a Xilingol League accent, audio transcription labeling is performed; if the recognition result is another regional accent, no audio transcription labeling is performed and the audio of the other regional accent is marked directly;
Step S3, normalizing the transcription content: the type of the transcription content is identified and transcription labeling is performed;
Step S4, labeling special text: special text is identified and special marks are applied;
Step S5, outputting the transcription annotation: the results of steps S3-S4 are integrated to output the final transcription annotation content.
Through this scheme, the method and the device improve the screening of invalid audio data, reduce wasted computation, and raise the speed of speech transcription labeling; recognition of various special cases and text types in standard Zhenglan Banner Mongolian is improved, and the accuracy of transcription labeling is increased.
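For illustration, a minimal sketch of how steps S1-S5 could be chained in code; every function name, signature, and return value here is an assumption, since the application does not specify a concrete implementation:

```python
# Hypothetical sketch of the S1-S5 pipeline; the function bodies are placeholders.

def is_invalid(audio: bytes) -> bool:
    # Step S1: check the clip against the invalid-data range
    # (non-target language, severe clipping, voice noise, single word, ...).
    return False

def is_xilingol_accent(audio: bytes) -> bool:
    # Step S2: accent judgment (Xilingol League accent vs. other regional accents).
    return True

def transcribe_and_normalize(audio: bytes) -> str:
    # Step S3: transcription plus the normalization rules S301-S311.
    return ""

def mark_special_text(text: str) -> str:
    # Step S4: special-text rules S41-S47 (numerals, English, interjections, punctuation, spaces).
    return text

def label_mongolian_audio(audio: bytes) -> dict:
    if is_invalid(audio):
        return {"status": "bad_data"}            # tagged, no transcription labeling
    if not is_xilingol_accent(audio):
        return {"status": "other_accent"}        # marked directly, no transcription labeling
    text = mark_special_text(transcribe_and_normalize(audio))
    return {"status": "labeled", "text": text}   # step S5: integrated output
```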
Further, in the step S1, the invalid-data range includes: non-target-language and non-Zhenglan-Banner types, severe clipping types, poor speaker-recording types, voice-noise types, re-read types, single-word types, and rap and singing types;
the non-target-language and non-Zhenglan-Banner types are specifically: empty data, pure ambient noise, pure music, pure human-voice noise, pure non-speech human sounds, and pure system broadcast sounds;
the severe clipping type is specifically: the audio is harsh and roaring, the waveform exceeds the upper and lower boundary lines, and the speech content cannot be heard clearly;
the poor speaker-recording type is specifically: the speaker pops the microphone or speaks in a muffled voice so that the speech cannot be heard clearly;
the voice-noise type is specifically: human-voice noise interferes with the main speaker so that the speech cannot be heard clearly;
the re-read type is specifically: a word is left unfinished and then read again;
the single-word type is specifically: an audio clip has only one word. The screening of invalid audio data is thus improved, wasted computation is reduced, and the speed of speech transcription labeling is raised;
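As a rough illustration only, a sketch of how two of the measurable criteria (empty data and severe clipping) plus the single-word criterion might be screened automatically; the 0.999 amplitude limit and 5% clipping ratio are invented thresholds, not values from the application:

```python
import numpy as np

def is_in_invalid_range(samples: np.ndarray, transcript_words: list) -> bool:
    """Return True if the clip matches part of the invalid-data range (sketch)."""
    if samples.size == 0:                          # empty data
        return True
    clipped_ratio = float(np.mean(np.abs(samples) >= 0.999))
    if clipped_ratio > 0.05:                       # waveform beyond the boundary lines: severe clipping
        return True
    if len(transcript_words) <= 1:                 # single-word type
        return True
    return False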
further, the step S3 includes: step S301, summarizing processing; step S302, first word processing, specifically
Figure GDA00037555720900000224
Processing; step S303, processing the postfix of the derivative; step S304, processing the word with the aid of the language; step S305, compound word processing; step S306, standard voice and dialect vocabulary processing; step S307, pronoun processing; step S308, second class word processing, specifically +.>
Figure GDA0003755572090000021
Processing; step S309, third class word processing, specifically +.>
Figure GDA00037555720900000225
Processing related words; step S310, borrowing word processing; step S311, new vocabulary processing; the label transfer can be normalized.
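One plausible way to organize the S301-S311 sub-steps is as an ordered chain of rewrite rules applied to the transcription; the sketch below shows only that plumbing, with empty placeholder rules, because the concrete Mongolian-script rewrites shown as images in the original cannot be reproduced here:

```python
from typing import Callable, List

NormRule = Callable[[str], str]

def generalization_rule(text: str) -> str:      # S301 (placeholder)
    return text

def loanword_rule(text: str) -> str:            # S310 (placeholder)
    return text

def normalize_transcription(text: str, rules: List[NormRule]) -> str:
    # Apply the sub-step rules in order; each one rewrites the text and passes it on.
    for rule in rules:
        text = rule(text)
    return text

normalized = normalize_transcription("raw transcription", [generalization_rule, loanword_rule])
```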
In the step S301: through generalization, the meaning of a word that originally refers to a specific thing is broadened so that it covers the same class or other closely related things; the most common form of generalization is to repeat, after a noun, a word that is identical except for its [Mongolian letter] consonant; sometimes the repetition uses other consonants [Mongolian words], converting the first syllable of the second word into a [Mongolian letter] consonant; for generalized terms formed from adjectives, the first syllable of the adjective is followed by a [Mongolian letter] consonant; there is only one generalizing verb, [Mongolian word], and after it is attached to the end of a full verb the meaning range of that verb is broadened;
in the step S302: the comitative-case word [Mongolian word] written separately is replaced with [Mongolian word], but when it does not occur in connected sentences the word is transcribed with [Mongolian word] instead; when a noun takes [Mongolian suffix] and becomes an adjective, it must be written connectedly; the word [Mongolian word] written separately is replaced with [Mongolian word], but in connected sentences it is transcribed directly from the audio; the demonstrative [Mongolian word] in modal verbs expressing conjecture must be written connectedly; when the nouns of the twelve zodiac animals take [Mongolian suffix], they must be written connectedly;
in the step S303: if invariable words appear in a sentence, the function words are written separately; adverbs and [Mongolian word] are written separately from [Mongolian word];
in the step S304: for [Mongolian particle], when it directly follows a positive word it is transcribed as [Mongolian form] and read with the first vowel, and when it directly follows a negative word it is transcribed as [Mongolian form] and read with the second vowel; in other cases, apart from the first and second vowels, the reading is judged from the vowel-harmony (masculine or feminine) class of the word; [Mongolian form] should be read as the fifth vowel, [Mongolian form] as the sixth vowel [o], [Mongolian form] as the seventh vowel [u], and [Mongolian form] as the fifth vowel [v]; if a word ending in [Mongolian form] or [Mongolian form] appears before [Mongolian particle], it is transcribed exactly as in the audio;
in the step S305: personal names and place names formed from two or more words, and some proper nouns, are written connectedly without distinguishing the vowel-harmony class of the words; when the second root of a word composed of two roots is a vowel, it is written separately;
in the step S306: in connected sentences [Mongolian word] falls outside the standard pronunciation range, so the data must be normalized directly; for dialect vocabulary, the word [Mongolian word] in the Hull spoken dialect and the word [Mongolian word] in the Wu Zhumu spoken dialect are transcribed directly from the audio; if the words [Mongolian words] appear in the audio, they are transcribed directly from the audio;
in the step S307, the pronouns include: [Mongolian words];
in the step S308: [Mongolian word] is written separately after nouns, pronouns, and some time and place words used as nominalized second-case forms; it is written connectedly after some adjectives and locational words; when the dative-locative (third case) suffix [Mongolian suffix] is followed by the additional component [Mongolian suffix] of a word and a change is needed, it must be transcribed as [Mongolian form];
in the step S309: if [Mongolian word] changes the meaning, form, and function of the word when its additional components change, it is written connectedly; if [Mongolian word] can form a compound word without changing the word and serves both word-forming and inflectional functions, it is written separately;
in the step S310: long-established Mongolian loanwords are transcribed with the actual Mongolian pronunciation, and newly introduced loanwords are transcribed with the closest pronunciation in the Mongolian phonological system; in the step S311: for the suffix of a noun derived from a verb, if the root ends in the consonant [Mongolian letter], it should be replaced with a suffix of the same meaning and function for normative transcription.
Further, in the step S4, transcription follows the speech content of the main speaker and the content is strictly consistent with what is heard; the transcription content is written starting from the first column; if the background sound is a human voice, is in the target language, and is clearly audible, all of it is labeled in order, and if the background sound is unclear, only the main speaker is labeled; the text must be fully consistent with the audio, and place names and personal names must be reasonable;
the step S4 includes: step S41, labeling Arabic numerals, transcribing the corresponding Mongolian words according to the audio; step S42, labeling English: if English is encountered during transcription labeling it is labeled directly, and if another foreign language is encountered it is labeled according to the meaning of the foreign word in Mongolian; step S43, labeling interjections; step S44, labeling text with grammatical errors: as long as the pronunciation is clear, the audio content is transcribed and labeled directly; step S45, labeling punctuation marks: only the four punctuation marks ". [Mongolian mark] ? !" may appear during transcription labeling; step S46, labeling proper nouns: Chinese personal and place names and English personal and place names are labeled according to the actual Mongolian standard requirements; step S47, labeling space text: spaces exist in the Mongolian typing process;
in the step S43, the term "one' S voice" is collectively denoted as
Figure GDA00037555720900000319
Representing affirmative; the Chinese words and phrases are uniformly marked as
Figure GDA00037555720900000318
Indicating insight and comprehension; the word "Java" is marked uniformly as->
Figure GDA00037555720900000317
Indicating a appreciation, indicating surprise; the Chinese qi word "o" is uniformly marked as->
Figure GDA00037555720900000320
Representing a general doubt; the word "bar" is uniformly marked as +.>
Figure GDA00037555720900000321
Representing a good bar; the speech ends with an o and the last word is not in +.>
Figure GDA00037555720900000322
Or->
Figure GDA00037555720900000323
Marking +.>
Figure GDA00037555720900000324
Or->
Figure GDA00037555720900000325
Can realize the labeling of the Chinese words, arabic numerals, english and punctuation marks.
Further, in the step S5, the audio is transcribed word by word; colloquial words in the audio must be transcribed correctly in their written form, and words swallowed for colloquial reasons must be restored when transcribing; punctuation marks must be used and used correctly, and only the punctuation marks ". [Mongolian mark] ? !" of the Meng Keli Mongolian input method are used; during transcription, [Mongolian word] is written separately; personal names must be written connectedly, and place names that cannot be written connectedly because of the input method are written separately. The transcription can thus be normalized.
Further, in the step S5, when the transcription annotation is output, it is written starting from the first column; when a punctuation mark is entered, one space is manually added before it; exactly one space is used between words, suffixes, [Mongolian word], and symbols. Spaces can thus be added between words, suffixes, [Mongolian word], and symbols.
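A small sketch of the spacing rules above (write from the first column, exactly one space between words, suffixes, and symbols, so a punctuation mark kept as its own token automatically gets the single space required in front of it); the token content is illustrative ASCII, not real Mongolian script:

```python
def format_output_line(tokens: list) -> str:
    # One space between words, suffixes, and symbols; no blank at the start of the line.
    # Punctuation is kept as its own token, so joining also puts one space before it.
    return " ".join(tok for tok in tokens if tok)

print(format_output_line(["word", "suffix", "?"]))  # -> "word suffix ?"
```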
Furthermore, in the step S5, when the transcription annotation is output, the Meng Keli input method with the MN mark must be used; platform transcription requires the national-standard-encoded Meng Keli input method, and the typed letters [Mongolian word] are black. The transcription labeling can thus be normalized.
Further, in the step S5, when the transcription annotation is output, for the suffixes [Mongolian suffixes]: apart from [Mongolian suffixes], the other suffixes are on shortcut keys 1-9, where the suffix on key 6, [Mongolian suffix], is not used and is written as [Mongolian form] instead; the other suffixes carry their own part of speech; when using the suffixes, it is recommended to follow the input method, and when a suffix needs correction it must be corrected together with the word before it, so that the part of speech of the transcribed suffix is correct. This makes outputting the transcription annotation more convenient.
Further, in the step S5, when the transcription annotation is output, the Chinese word "also" in the audio is written as [Mongolian word] instead, and the Chinese word "more" in the audio is written as [Mongolian word] instead. Replacement writing of "also" and "more" is thus implemented.
A system for AI intelligent labeling of Mongolian, characterized in that a method for AI intelligent labeling of Mongolian according to any one of claims 1-9 is applied:
the system for AI intelligent labeling of Mongolian comprises: an invalid data range judging module (1), an accent judging module, a special text labeling module, a transcription content normalization processing module, and a transcription content output module;
the invalid data range judging module determines, from the audio data, whether any criterion of the invalid-data range is met: non-target-language and non-Zhenglan-Banner types, severe clipping types, poor speaker-recording types, voice-noise types, re-read types, single-word types, and rap and singing types; if a criterion is met the data are judged invalid, and if none is met the data are judged valid;
the accent judging module is connected with the invalid data range judging module and is used for receiving and recognizing the audio content passed by the invalid data range judging module and judging whether the audio content is a Xilingol League accent;
the special text labeling module is connected with the accent judging module and is used for receiving and labeling the audio content passed by the accent judging module; the special text labeling module comprises: a first special labeling sub-module for labeling Arabic numerals; a second special labeling sub-module for labeling English; a third special labeling sub-module for labeling interjections; a fourth special labeling sub-module for labeling text with grammatical errors; a fifth special labeling sub-module for labeling punctuation marks; a sixth special labeling sub-module for labeling proper-noun text; and a seventh special labeling sub-module for labeling space text;
the transcription content normalization processing module comprises: a first normalization sub-module for normalizing generalization transcription; a second normalization sub-module for normalizing [Mongolian word]; a third normalization sub-module for normalizing derivational suffixes; a fourth normalization sub-module for normalizing modal particles; a fifth normalization sub-module for normalizing compound-word transcription; a sixth normalization sub-module for normalizing standard-pronunciation and dialect vocabulary; a seventh normalization sub-module for normalizing pronouns; an eighth normalization sub-module for normalizing [Mongolian word]; a ninth normalization sub-module for normalizing words related to [Mongolian word]; a tenth normalization sub-module for normalizing loanwords; and an eleventh normalization sub-module for normalizing new vocabulary;
the transcription content output module is connected with the special text labeling module and the transcription content normalization processing module respectively and is used for receiving their data and outputting the integrated transcription content.
Through this system, the screening of invalid audio data is improved, wasted computation is reduced, and the speed of speech transcription labeling is raised; recognition of various special cases and text types in standard Zhenglan Banner Mongolian is improved, and the accuracy of transcription labeling is increased.
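A schematic wiring of modules (1)-(5) as plain Python classes, mirroring the connections described above; all class names, methods, and trivial return values are assumptions made for the sketch:

```python
from typing import Optional

class InvalidDataRangeJudge:                    # module (1)
    def is_valid(self, audio) -> bool:
        return True                             # placeholder decision

class AccentJudge:                              # module (2), fed by module (1)
    def is_xilingol(self, audio) -> bool:
        return True                             # placeholder decision

class SpecialTextLabeler:                       # module (3), sub-modules 31-37
    def label(self, text: str) -> str:
        return text

class TranscriptionNormalizer:                  # module (4), sub-modules 401-411
    def normalize(self, text: str) -> str:
        return text

class TranscriptionOutput:                      # module (5), joins (3) and (4)
    def emit(self, text: str) -> str:
        return text

class MongolianLabelingSystem:
    def __init__(self) -> None:
        self.range_judge = InvalidDataRangeJudge()
        self.accent_judge = AccentJudge()
        self.special = SpecialTextLabeler()
        self.normalizer = TranscriptionNormalizer()
        self.output = TranscriptionOutput()

    def run(self, audio, raw_text: str) -> Optional[str]:
        if not self.range_judge.is_valid(audio):      # invalid data: tag and stop
            return None
        if not self.accent_judge.is_xilingol(audio):  # other accent: mark directly
            return None
        text = self.normalizer.normalize(raw_text)
        text = self.special.label(text)
        return self.output.emit(text)
```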
In summary, the present application has the following beneficial technical effects:
1. the screening of invalid audio data is improved, wasted computation is reduced, and the speed of speech transcription labeling is raised;
2. recognition of various special cases and text types in standard Zhenglan Banner Mongolian is improved, and the accuracy of transcription labeling is increased.
Drawings
FIG. 1 is a step diagram of a method for AI intelligent labeling of Mongolian in an embodiment of the application;
FIG. 2 is a logic diagram of a method for AI intelligent labeling of Mongolian in an embodiment of the application;
FIG. 3 is a block diagram of a system for AI intelligent labeling of Mongolian in an embodiment of the application.
Reference numerals:
1. an invalid data range determination module; 2. an accent judgment module;
3. a special text labeling module; 31. a first special labeling sub-module; 32. a second special labeling sub-module; 33. a third special labeling sub-module; 34. a fourth special labeling sub-module; 35. a fifth special labeling sub-module; 36. a sixth special labeling sub-module; 37. a seventh special labeling sub-module;
4. a transcription content normalization processing module; 401. a first normalization sub-module; 402. a second normalization sub-module; 403. a third normalization sub-module; 404. a fourth normalization sub-module; 405. a fifth normalization sub-module; 406. a sixth normalization sub-module; 407. a seventh normalization sub-module; 408. an eighth normalization sub-module; 409. a ninth normalization sub-module; 410. a tenth normalization sub-module; 411. an eleventh normalization sub-module;
5. a transcription content output module.
Detailed Description
The embodiments of the present application, including the shape and construction of the components, their mutual positions and connections, their roles and working principles, and the methods of manufacture, operation, and use, are described in detail below to help those skilled in the art understand the inventive concept and technical solution of the invention more fully. For convenience of description, reference is made to the directions shown in the drawings.
Example 1
Referring to fig. 1-2, a method for AI intelligent labeling of Mongolian includes the following steps:
Step S1, judging whether the audio data are valid: if the audio data do not fall within the invalid-data range, they are valid and audio transcription labeling is performed; if the audio data meet any criterion of the invalid-data range, they are invalid, no audio transcription labeling is performed, and they are tagged and labeled as bad data;
Step S2, judging the audio accent and identifying whether the audio content is a Xilingol League accent: if the recognition result is a Xilingol League accent, including the Xilinhot accent, audio transcription labeling is performed; if the recognition result is another regional accent, no audio transcription labeling is performed and the audio of the other regional accent is marked directly;
Step S3, normalizing the transcription content: the type of the transcription content is identified and transcription labeling is performed;
Step S4, labeling special text: special text is identified and special marks are applied;
Step S5, outputting the transcription annotation: the results of steps S3-S4 are integrated to output the final transcription annotation content.
In the step S1, the invalid-data range includes: non-target-language and non-Zhenglan-Banner types, severe clipping types, poor speaker-recording types, voice-noise types, re-read types, single-word types, and rap and singing types;
the non-target-language and non-Zhenglan-Banner types are specifically: empty data, pure ambient noise, pure music, pure human-voice noise, pure non-speech human sounds such as singing, sneezing, coughing, and laughing, and pure system broadcast sounds such as sounds from devices like mobile phones, televisions, and radios;
the severe clipping type is specifically: the audio is harsh and roaring, the waveform exceeds the upper and lower boundary lines, and the speech content cannot be heard clearly;
the poor speaker-recording type is specifically: the speaker pops the microphone or speaks in a muffled voice so that the speech cannot be heard clearly;
the voice-noise type is specifically: human-voice noise interferes with the main speaker, i.e. the main and secondary speakers overlap so severely that the speech cannot be heard clearly;
the re-read type is specifically: a word is left unfinished and then read again, for example the start of a word is read, broken off, and then the whole word is read;
the single-word type is specifically: an audio clip has only one word; rap and singing are also invalid data.
The step S3 includes: step S301, generalization processing; step S302, first word-class processing, specifically processing of [Mongolian word]; step S303, derivational-suffix processing; step S304, modal-particle processing; step S305, compound-word processing; step S306, standard-pronunciation and dialect-vocabulary processing; step S307, pronoun processing; step S308, second word-class processing, specifically processing of [Mongolian word]; step S309, third word-class processing, specifically processing of words related to [Mongolian word]; step S310, loanword processing; step S311, new-vocabulary processing;
In the step S301: through generalization, the meaning of a word that originally refers to a specific thing is broadened so that it covers the same class or other closely related things, such as: [Mongolian examples]; the most common form of generalization is to repeat, after a noun, a word that is identical except for its [Mongolian letter] consonant, i.e. an echo word; sometimes the repetition uses other consonants [Mongolian words] and, once the pair becomes a vocabulary unit, the first syllable of the second word is converted into a [Mongolian letter] or similar consonant, such as: [Mongolian examples]; for generalized terms formed from adjectives, the first syllable of the adjective is followed by a [Mongolian letter] consonant, such as: [Mongolian example]; there is only one generalizing verb, [Mongolian word], and after it is attached to the end of a full verb the meaning range of that verb is broadened;
in the step S302: the comitative-case word [Mongolian word] written separately is replaced with [Mongolian word], but when it does not occur in connected sentences the word is transcribed with [Mongolian word] instead, such as: [Mongolian example]; when it expresses [Mongolian word] it must be written separately, such as: [Mongolian example]; when a noun takes [Mongolian suffix] and becomes an adjective, it is written connectedly, such as: [Mongolian example]; the word [Mongolian word] written separately is replaced with [Mongolian word], but in connected sentences it is transcribed directly from the audio, such as: [Mongolian examples]; the demonstrative [Mongolian word] in modal verbs expressing conjecture must be written connectedly, such as: [Mongolian example]; when the nouns of the twelve zodiac animals take [Mongolian suffix], they must be written connectedly, such as: [Mongolian example];
In the step S303: if the following unchanged words appear in the sentence, such as all words connecting the works, the works are separated and written, such as:
Figure GDA00037555720900000730
Figure GDA00037555720900000732
Figure GDA00037555720900000733
adverbs->
Figure GDA00037555720900000734
Split write (s)/(s) on a memory card>
Figure GDA00037555720900000735
Such as:
Figure GDA00037555720900000736
these words, if they appear in sentences, are written in succession, such as: />
Figure GDA00037555720900000737
Figure GDA00037555720900000738
In the step S304: for the following
Figure GDA00037555720900000739
Positive word is directly followed by transcription ++>
Figure GDA00037555720900000740
Reading the first vowel, directly transcribing the negative word followed by +.>
Figure GDA00037555720900000741
Reading the second vowels, and judging according to the yin-yang of the words at other times except the first vowels and the second vowels;
Figure GDA00037555720900000742
should be read as "fifth vowel" [ v ]>
Figure GDA00037555720900000743
Should be read as "sixth vowel" [ o ]>
Figure GDA00037555720900000744
Should be read as "seventh vowel" [ u ]>
Figure GDA00037555720900000745
Should be read as "fifth vowel" [ v ]>
Figure GDA00037555720900000746
If->
Figure GDA00037555720900000747
Front appearance "/-on>
Figure GDA00037555720900000748
Or->
Figure GDA00037555720900000749
The words of the "end" may all be transcribed in audio, such as: />
Figure GDA00037555720900000750
Figure GDA00037555720900000751
In the step S305: the names of people and places formed by two or more words, and proper nouns are written in succession without distinguishing yin and yang of words in most cases, for example:
Figure GDA00037555720900000752
Figure GDA0003755572090000081
the second root of the word consisting of two roots is written according to the change in the word when the second root is a vowel in theory, but the words cannot be output in the typing process, and the words are written separately, such as:
Figure GDA0003755572090000082
in the step S306: in connected sentences [Mongolian word] falls outside the standard pronunciation range, so the data must be normalized directly; for dialect vocabulary, the word [Mongolian word] in the Hull spoken dialect and the word [Mongolian word] in the Wu Zhumu spoken dialect both mean father and fall within the standard range, so they are transcribed directly from the audio, taking care to write the dialect vocabulary correctly; if the words [Mongolian words] appear in the audio, they are transcribed directly from the audio;
in the step S307, the pronouns include: [Mongolian words];
in the step S308: nouns, pronouns, and some time and place words used as nominalized second-case forms are written separately, for example: [Mongolian example]; some adjectives and locational words, including some special pronouns, are written connectedly, such as: [Mongolian examples]; when the dative-locative (third case) suffix [Mongolian suffix] is followed by the additional component [Mongolian suffix] of a word and a change is needed, it must be transcribed as [Mongolian form], such as: [Mongolian example];
in the step S309: if [Mongolian word] changes the meaning, form, and function of the word when its additional components change, it is written connectedly, such as: [Mongolian example]; if [Mongolian word] can form a compound word without changing the word and serves both word-forming and inflectional functions, it is written separately, for example: [Mongolian example];
in the step S310: for loanwords with a long history in Mongolian, the actual Mongolian pronunciation is used for transcription, and newly introduced loanwords are transcribed with the closest pronunciation in the Mongolian phonological system, for example: [Mongolian examples];
in the step S311: for the suffix of a noun derived from a verb, if the root ends in the consonant [Mongolian letter], it should be replaced with a suffix of the same meaning and function for normative transcription, such as: [Mongolian examples]; words with lengthened-syllable morphemes are written accordingly, such as: [Mongolian example]; some free variants of roots are unified with free variants of suffixes, such as: [Mongolian examples]; the correct written forms of other vocabulary are: [Mongolian examples].
In the step S4, transcription follows the speech content of the main speaker, the content is strictly consistent with what is heard, and extra words, missing words, and wrong words are not allowed; the transcription content is written starting from the first column, and a blank at the beginning is forbidden; if the background sound is a human voice, is in the target language, and is clearly audible, all of it is labeled in order, and if the background sound is unclear, only the main speaker is labeled; the text must be fully consistent with the audio, place names and personal names must be reasonable, uncertain words are checked in a dictionary, and words must not be invented from the pronunciation;
the step S4 includes: step S41, labeling Arabic numerals: Arabic numerals may not appear in the labeled content, and the corresponding Mongolian words are transcribed according to the audio, such as: [Mongolian examples]; for the year 2004, transcribe the corresponding words according to the actual pronunciation instead of labeling the digits: [Mongolian example];
step S42, labeling English: if English is encountered during transcription labeling it is labeled directly, and if another foreign language is encountered it is labeled according to the meaning of the foreign word in Mongolian;
step S43, labeling interjections;
step S44, labeling text with grammatical errors: as long as the pronunciation is clear, the audio content is transcribed and labeled directly;
step S45, labeling punctuation marks: apart from the four punctuation marks ". [Mongolian mark] ? !", no other punctuation may appear in the labeling result;
step S46, labeling proper nouns: Chinese personal and place names, English personal and place names, and the like are labeled according to the actual Mongolian standard requirements;
step S47, labeling space text: spaces exist in the Mongolian typing process, their meaning does not change, and they may not be added or removed at will;
in the step S43, the interjection expressing affirmation is uniformly labeled as [Mongolian word]; the interjection expressing realization and comprehension is uniformly labeled as [Mongolian word]; the interjection expressing admiration or surprise is uniformly labeled as [Mongolian word]; the Chinese interjection "o" expressing a general question is uniformly labeled as [Mongolian word]; the Chinese particle "ba" is uniformly labeled as [Mongolian word], expressing agreement, and the Chinese sentence-final "ba" itself is not written literally; when the speech ends in "o" and the last word does not end in [Mongolian letter] or [Mongolian letter], it is labeled as [Mongolian word] or [Mongolian word].
In the step S5, the audio is transcribed word by word, and words may not be dropped or added; colloquial words in the audio must be transcribed correctly in their written form, words must not be invented, and words swallowed for colloquial reasons must be restored when transcribing, for example: [Mongolian example]; punctuation marks must be used and used correctly; only the punctuation marks ". [Mongolian mark] ? !" of the Meng Keli Mongolian input method are used, and punctuation from other input methods cannot be used; during transcription [Mongolian word] is written separately, for example [Mongolian form] cannot be written as [Mongolian form]; personal names must be written connectedly, and place names that cannot be written connectedly because of the input method are written separately.
In the step S5, when the transcription annotation is output, it is written starting from the first column, and no blank is left at the beginning of the transcription; when a punctuation mark is entered, one space is manually added before it; exactly one space is used between words, suffixes, [Mongolian word], and symbols, with no missing or extra spaces.
In the step S5, when the transcription annotation is output, the Meng Keli input method with the MN mark must be used; platform transcription requires the national-standard-encoded Meng Keli input method [Mongolian word], and input methods without the Meng Keli encoding [Mongolian word] cannot be used; the typed words are black and may not be blue or red, except for individual special words and fixed expressions.
In the step S5, when the transcription annotation is output, for the suffixes [Mongolian suffixes]: apart from [Mongolian suffixes], the other suffixes are on shortcut keys 1-9, where the suffix on key 6, [Mongolian suffix], is not used and is written as [Mongolian form] instead, for example: [Mongolian examples]; the other suffixes carry their own part of speech; when using the suffixes, it is recommended to follow the input method, and when a suffix needs correction it must be corrected together with the word before it, so that the part of speech of the transcribed suffix is correct.
In the step S5, when the transcription annotation is output, the Chinese word "also" in the audio is written as [Mongolian word] instead, and the Chinese word "more" in the audio is written as [Mongolian word] instead.
The method for AI intelligent labeling of Mongolian has the following advantages: the screening of invalid audio data is improved, wasted computation is reduced, and the speed of speech transcription labeling is raised; recognition of various special cases and text types in standard Zhenglan Banner Mongolian is improved, and the accuracy of transcription labeling is increased.
Example 2
Referring to fig. 3, a system for AI intelligent labeling of Mongolian includes: an invalid data range judging module 1, an accent judging module 2, a special text labeling module 3, a transcription content normalization processing module 4, and a transcription content output module 5;
the invalid data range judging module 1 determines, from the audio data, whether any criterion of the invalid-data range is met: non-target-language and non-Zhenglan-Banner types, severe clipping types, poor speaker-recording types, voice-noise types, re-read types, single-word types, and rap and singing types; if a criterion is met the data are judged invalid, and if none is met the data are judged valid;
the accent judging module is connected with the invalid data range judging module 1 and is used for receiving and recognizing the audio content passed by the invalid data range judging module 1 and judging whether the audio content is a Xilingol League accent;
the special text labeling module 3 is connected with the accent judging module 2 and is used for receiving and labeling the audio content passed by the accent judging module 2; the special text labeling module 3 includes: a first special labeling sub-module 31 for labeling Arabic numerals; a second special labeling sub-module 32 for labeling English; a third special labeling sub-module 33 for labeling interjections; a fourth special labeling sub-module 34 for labeling text with grammatical errors; a fifth special labeling sub-module 35 for labeling punctuation marks; a sixth special labeling sub-module 36 for labeling proper-noun text; and a seventh special labeling sub-module 37 for labeling space text;
the transcription content normalization processing module 4 comprises: a first normalization sub-module 401 for normalizing generalization transcription; a second normalization sub-module 402 for normalizing [Mongolian word]; a third normalization sub-module 403 for normalizing derivational suffixes; a fourth normalization sub-module 404 for normalizing modal particles; a fifth normalization sub-module 405 for normalizing compound-word transcription; a sixth normalization sub-module 406 for normalizing standard-pronunciation and dialect vocabulary; a seventh normalization sub-module 407 for normalizing pronouns; an eighth normalization sub-module 408 for normalizing [Mongolian word]; a ninth normalization sub-module 409 for normalizing words related to [Mongolian word]; a tenth normalization sub-module 410 for normalizing loanwords; and an eleventh normalization sub-module 411 for normalizing new vocabulary;
the transcription content output module 5 is connected with the special text labeling module 3 and the transcription content normalization processing module 4 respectively and is used for receiving their data and outputting the integrated transcription content.
In this embodiment of the application, the system for AI intelligent labeling of Mongolian works on the following principle: by providing the invalid data range judging module, the screening of invalid audio data is improved, wasted computation is reduced, and the speed of speech transcription labeling is raised; by providing the special text labeling module and the transcription content normalization processing module, recognition of various special cases and text types in standard Zhenglan Banner Mongolian is improved, and the accuracy of transcription labeling is increased.
The invention and its embodiments have been described above schematically and without limitation; the drawings illustrate only one embodiment, and the actual structure is not limited to it. Therefore, if a person of ordinary skill in the art, enlightened by this disclosure, devises without inventive effort structural arrangements and embodiments similar to this technical solution without departing from the gist of the invention, they shall all fall within the protection scope of the invention.

Claims (4)

1. A method for AI intelligent labeling of Mongolian, characterized by comprising the following steps:
step S1, judging data validity: judge whether the audio data are valid; if the audio data do not fall within the invalid-data range, they are valid and audio transcription labeling is performed; if the audio data meet any criterion of the invalid-data range, they are invalid, no audio transcription labeling is performed, and they are tagged and labeled as bad data, wherein the invalid-data range includes: non-target-language and non-Zhenglan-Banner types, severe clipping types, poor speaker-recording types, voice-noise types, re-read types, single-word types, and rap and singing types;
step S2, judging the audio accent: identify whether the audio content is a Xilingol League accent; if the recognition result is a Xilingol League accent, audio transcription labeling is performed; if the recognition result is another regional accent, no audio transcription labeling is performed and the audio of the other regional accent is marked directly;
step S3, normalizing the transcription content: the type of the transcription content is identified and transcription labeling is performed;
step S4, labeling special text: special text is identified and special marks are applied;
step S5, outputting the transcription annotation: the results of step S3 and step S4 are integrated to output the final transcription annotation content;
in the step S4, transcription follows the speech content of the main speaker and the content is consistent with what is heard; the transcription content is written starting from the first column; if the background sound is a human voice, is in the target language, and is clearly audible, all of it is labeled in order, and if the background sound is unclear, only the main speaker is labeled; the text is fully consistent with the audio, and place names and personal names are reasonable;
the step S4 includes: step S41, labeling Arabic numerals, transcribing the corresponding Mongolian words according to the audio; step S42, labeling English: if English is encountered during transcription labeling it is labeled directly, and if another foreign language is encountered it is labeled according to the meaning of the foreign word in Mongolian; step S43, labeling interjections; step S44, labeling text with grammatical errors: as long as the pronunciation is clear, the audio content is transcribed and labeled directly; step S45, labeling punctuation marks: only the four punctuation marks ". [Mongolian mark] ? !" may appear; step S46, labeling proper nouns: Chinese personal and place names and English personal and place names are labeled according to the actual Mongolian standard requirements; step S47, labeling space text: spaces exist in the Mongolian typing process;
in the step S43, the interjection expressing affirmation is uniformly labeled as [Mongolian word]; the interjection expressing realization and comprehension is uniformly labeled as [Mongolian word]; the interjection expressing admiration or surprise is uniformly labeled as [Mongolian word]; the Chinese interjection "o" expressing a general question is uniformly labeled as [Mongolian word]; the Chinese particle "ba" is uniformly labeled as [Mongolian word], expressing agreement; when the speech ends in "o" and the last word does not end in [Mongolian letter] or [Mongolian letter], it is labeled as [Mongolian word] or [Mongolian word];
in the step S5, the audio is transcribed word by word; colloquial words in the audio are transcribed correctly in their written form, and words swallowed for colloquial reasons are restored when transcribing: [Mongolian examples]; punctuation marks must be used and used correctly; only the punctuation marks ". [Mongolian mark] ? !" under the Meng Keli Mongolian input method are used; during transcription [Mongolian word] is written separately: [Mongolian example]; personal names are written connectedly, and place names that cannot be written connectedly because of the Meng Keli Mongolian input method are written separately;
in the step S5, when the transcription annotation is output, it is written starting from the first column, and no blank is left at the beginning of the transcription; when a punctuation mark is entered, one space is manually added before it; there is one space between words, suffixes, [Mongolian word], and symbols;
in the step S5, when the transcription annotation is output, the Meng Keli Mongolian input method with the MN mark is used; platform transcription uses the national-standard-encoded Meng Keli Mongolian input method [Mongolian word]; input methods without the Meng Keli encoding [Mongolian word] cannot be used; the typed words are black, except for special words and fixed expressions;
in the step S5, when the transcription annotation is output, for the suffixes [Mongolian suffixes]: apart from [Mongolian suffixes], the other suffixes are on shortcut keys 1-9, where the suffix on key 6, [Mongolian suffix], is not used and is written as [Mongolian form] instead, the nine shortcuts ① to ⑨ corresponding to the Mongolian suffixes shown in the figures; the other suffixes carry their own part of speech; when using the suffixes, they are used as recommended by the Meng Keli Mongolian input method, and when a suffix needs correction it is corrected together with the word before it;
in the step S5, when the transcription annotation is output, the Chinese word "also" in the audio is written as [Mongolian word] instead, and the Chinese word "more" in the audio is written as [Mongolian word] instead.
2. The method for AI intelligent labeling of Mongolian according to claim 1, characterized in that:
in the step S1, the non-target-language and non-Zhenglan-Banner types are specifically: empty data, pure ambient noise, pure music, pure human-voice noise, pure non-speech human sounds, and pure system broadcast sounds;
the severe clipping type is specifically: the audio is harsh and roaring, the waveform exceeds the upper and lower boundary lines, and the speech content cannot be heard clearly;
the poor speaker-recording type is specifically: the speaker pops the microphone or speaks in a muffled voice so that the speech cannot be heard clearly;
the voice-noise type is specifically: human-voice noise interferes with the main speaker so that the speech cannot be heard clearly;
the re-read type is specifically: a word is left unfinished and then read again;
the single-word type is specifically: an audio clip has only one word.
3. The method for intelligent AI-labeling of mongolian language according to claim 1, wherein:
the step S3 includes: step S301, summarizing processing; step S302, first word processing, specifically
Figure QLYQS_32
Processing; step S303, processing the postfix of the derivative; step S304, processing the word with the aid of the language; step S305, compound word processing; step S306, standard voice and dialect vocabulary processing; step S307, pronoun processing; step S308, second class word processing, specifically +.>
Figure QLYQS_33
Processing; step S309, third class word processing, specifically +.>
Figure QLYQS_34
Processing related words; step S310, borrowing word processing; step S311, new vocabulary processing;
in the step S301: the meaning of a word can be expanded by a generalized form, so that the word originally referring to a specific object can be changed into a word comprising the same kind or other objects closely related to the same; the most predominant form of the generalization is to repeat one more after a noun
Figure QLYQS_35
The same word as the consonant is sometimes repeated with other consonants +.>
Figure QLYQS_36
The first syllable of the second word is converted into +.>
Figure QLYQS_37
Consonants; in the case of a generic term consisting of adjectives, the first syllable of the adjective is followed by +.>
Figure QLYQS_38
Consonants; the generalized verb is only one +.>
Figure QLYQS_39
After being used for the ambiguous verb;
in the step S302: common-case words [Mongolian script: QLYQS_41] that are written separately are replaced with [Mongolian script: QLYQS_43], but when they are not in consecutive sentences the word is transcribed with [Mongolian script: QLYQS_45] instead; when a noun takes [Mongolian script: QLYQS_42] and becomes an adjective, it must be written accordingly; words [Mongolian script: QLYQS_44] that are written separately are replaced with [Mongolian script: QLYQS_46], but in consecutive sentences they are transcribed directly from the audio; in a modal verb expressing a speculative implication, [Mongolian script: QLYQS_47] must be written joined; when [Mongolian script: QLYQS_40] is added to the twelve classes of nouns, it must be written joined;
in the step S303: if uninflected words appear in a sentence, the function words are written separately; adverbs and [Mongolian script: QLYQS_48] are written separately [Mongolian script: QLYQS_49];
In the step S304: for the following
[Mongolian script: QLYQS_53], when it directly follows a positive word it is transcribed as [Mongolian script: QLYQS_55] and read with the first vowel; when it directly follows a negative word it is transcribed as [Mongolian script: QLYQS_61] and read with the second vowel; in all other cases, apart from the first and second vowels, the reading is judged by the masculine/feminine (yin-yang) class of the word; [Mongolian script: QLYQS_52] should be read as the fifth vowel [v] [Mongolian script: QLYQS_57]; [Mongolian script: QLYQS_59] should be read as the sixth vowel [o] [Mongolian script: QLYQS_60]; [Mongolian script: QLYQS_50] should be read as the seventh vowel [u] [Mongolian script: QLYQS_56]; [Mongolian script: QLYQS_62] should be read as the fifth vowel [v] [Mongolian script: QLYQS_63]; if a word ending in "[Mongolian script: QLYQS_54]" or [Mongolian script: QLYQS_58] appears before [Mongolian script: QLYQS_51], it is transcribed according to the audio;
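A hedged sketch of the "judge by the masculine/feminine (yin-yang) class of the word" fallback in step S304. Because the Mongolian letters are published only as images, the vowel sets and the class-to-vowel mapping below are illustrative assumptions over a Latin transliteration, not data taken from the patent.

```python
# Hedged sketch of the step-S304 fallback rule ("judge by the yin-yang class
# of the word").  The vowel sets and the class-to-vowel mapping are assumed
# for illustration; they are NOT taken from the claim, whose Mongolian letters
# appear only as images.
MASCULINE_VOWELS = set("aou")   # "yang" (back) vowels - assumed transliteration
FEMININE_VOWELS = set("eöü")    # "yin" (front) vowels - assumed transliteration


def harmony_class(word: str) -> str:
    """Classify a transliterated word as masculine, feminine, or neutral."""
    letters = set(word.lower())
    if letters & MASCULINE_VOWELS:
        return "masculine"
    if letters & FEMININE_VOWELS:
        return "feminine"
    return "neutral"


def particle_vowel_reading(previous_word: str, polarity: str) -> str:
    """Follow the claimed rule order: positive word -> first vowel,
    negative word -> second vowel, otherwise decide by vowel harmony."""
    if polarity == "positive":
        return "first vowel"
    if polarity == "negative":
        return "second vowel"
    # Which harmony class maps to which vowel is an assumption for illustration.
    return {
        "masculine": "seventh vowel [u]",
        "feminine": "sixth vowel [o]",
        "neutral": "fifth vowel [v]",
    }[harmony_class(previous_word)]
```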
in the step S305: personal names and place names composed of two or more words are written joined, without distinguishing the masculine/feminine (yin-yang) class of the words; in a word composed of two roots, if the second root begins with a vowel it is written separately;
in the step S306: in consecutive sentences, [Mongolian script: QLYQS_64] belongs to the non-standard pronunciation range and the data must be normalized directly; for dialect vocabulary, the word [Mongolian script: QLYQS_65] in the Hull spoken dialect and the word [Mongolian script: QLYQS_66] in Wu Zhumu speech are transcribed directly from the audio; if the words [Mongolian script: QLYQS_67] [Mongolian script: QLYQS_68] appear in the audio, they are also transcribed directly from the audio;
in the step S307, the pronouns include: [Mongolian script: QLYQS_69] [Mongolian script: QLYQS_70];
in the step S308:
nouns and pronouns in which [Mongolian script: QLYQS_71] acts as the second-case form are written separately; [Mongolian script: QLYQS_72] is written joined after [Mongolian script: QLYQS_73] [Mongolian script: QLYQS_74]; when the third-case (locative) suffix [Mongolian script: QLYQS_75], an additional component added after the word, [Mongolian script: QLYQS_76], needs to change, it is transcribed as [Mongolian script: QLYQS_77];
In the step S309: if it is
[Mongolian script: QLYQS_78] and the meaning, form, and function of the word change when additional components are added, the word is written joined; if [Mongolian script: QLYQS_79] can form a compound word without the word itself changing, and it has both word-forming and inflection-forming additional-component functions, it is written separately;
in the step S310: Mongolian borrowed words are transcribed according to their actual Mongolian pronunciation, and newly introduced borrowed words are transcribed with the closest pronunciation available in the Mongolian phonological system; in the step S311: for the suffix of a noun derived from a verb, if the root word ends in the consonant [Mongolian script: QLYQS_80], the standard transcription replaces it with a suffix having the same meaning and function.
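A minimal sketch of the step-S310 distinction between established and newly introduced borrowings, assuming a hypothetical lookup table and approximation helper; neither is disclosed in the claim.

```python
# Minimal sketch of the step-S310 rule: established borrowings are transcribed
# with their conventional Mongolian pronunciation (a lookup here), while newly
# introduced borrowings fall back to a nearest-pronunciation approximation.
# Both the table contents and the approximation helper are hypothetical.
ESTABLISHED_BORROWINGS = {
    "radio": "<MONGOLIAN_RADIO>",  # placeholder entry
}


def approximate_pronunciation(word: str) -> str:
    """Stand-in for mapping a new borrowing onto the closest Mongolian sounds."""
    return f"<APPROX:{word}>"


def transcribe_borrowing(word: str) -> str:
    """Prefer the conventional transcription; otherwise approximate it."""
    return ESTABLISHED_BORROWINGS.get(word, approximate_pronunciation(word))
```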
4. A system for AI intelligent labeling of Mongolian, characterized in that the method for AI intelligent labeling of Mongolian according to any one of claims 1-3 is applied:
the system for AI intelligent labeling of Mongolian comprises: an invalid data range judging module (1), an accent judging module (2), a special text labeling module (3), a transcription content normalization processing module (4), and a transcription content output module (5);
the invalid data range judging module (1) judges whether the audio data falls within the invalid data range, the invalid data range being: the non-target-language and non-Blue-Banner type, the severe amplitude-clipping type, the poor speaker-recording-quality type, the human-voice-noise type, the read-back type, the single-word type, and the rap-and-singing type; audio satisfying any of these types is judged to be invalid data, and audio satisfying none of them is judged to be valid data;
the accent judging module (2) is connected with the invalid data range judging module (1) and is used for receiving and identifying the audio content judged to be valid by the invalid data range judging module (1) and judging whether the audio content carries a Xilingol League accent;
the special text labeling module (3) is connected with the accent judging module (2) and is used for receiving and labeling the audio content judged to pass by the accent judging module (2); the special text labeling module (3) comprises: a first special labeling sub-module (31) for labeling Arabic numerals; a second special labeling sub-module (32) for labeling English; a third special labeling sub-module (33) for labeling modal particles; a fourth special labeling sub-module (34) for labeling grammatically erroneous text; a fifth special labeling sub-module (35) for labeling punctuation marks; a sixth special labeling sub-module (36) for labeling proper-noun text; a seventh special labeling sub-module (37) for labeling space text;
the transcription content normalization processing module (4) comprises: a first normalization sub-module (401) for normalizing generalized-form transcription; a second normalization sub-module (402) for normalizing the processing of [Mongolian script: QLYQS_81] words; a third normalization sub-module (403) for normalizing the processing of derivative suffixes; a fourth normalization sub-module (404) for normalizing the processing of modal particles; a fifth normalization sub-module (405) for normalizing compound-word transcription; a sixth normalization sub-module (406) for normalizing standard-pronunciation and dialect vocabulary; a seventh normalization sub-module (407) for normalizing the processing of pronouns; an eighth normalization sub-module (408) for normalizing the processing of [Mongolian script: QLYQS_82] words; a ninth normalization sub-module (409) for normalizing the processing of [Mongolian script: QLYQS_83]-related words; a tenth normalization sub-module (410) for normalizing the processing of borrowed words; an eleventh normalization sub-module (411) for normalizing new vocabulary;
the transcription content output module (5) is connected to the special text labeling module (3) and to the transcription content normalization processing module (4), respectively, and is used for receiving their data and outputting the combined transcription content.
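To illustrate the data flow claimed for the system (module (1) filtering, module (2) accent gating, modules (3) and (4) feeding module (5)), here is a structural sketch; the class, field, and method names are assumptions, since the claim specifies only the modules and their connections.

```python
# Structural sketch of the claim-4 data flow: audio rejected by module (1)
# never reaches module (2); modules (3) and (4) both feed module (5).
# All class, field, and method names are illustrative assumptions.
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Callable


@dataclass
class MongolianLabelingSystem:
    is_invalid: Callable[[object], bool]          # module (1): invalid data range judging
    is_xilingol_accent: Callable[[object], bool]  # module (2): accent judging
    label_special_text: Callable[[object], dict]  # module (3): seven special labeling sub-modules
    normalize_content: Callable[[object], dict]   # module (4): eleven normalization sub-modules
    outputs: list[dict] = field(default_factory=list)

    def process(self, audio: object) -> dict | None:
        if self.is_invalid(audio):
            return None                           # discarded as invalid data
        if not self.is_xilingol_accent(audio):
            return None                           # accent check not passed
        labels = self.label_special_text(audio)
        normalized = self.normalize_content(audio)
        # module (5): transcription content output combines both branches
        combined = {"labels": labels, "normalized": normalized}
        self.outputs.append(combined)
        return combined
```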
CN202210573462.7A 2022-05-24 2022-05-24 Method and system for AI intelligent labeling of Mongolian Active CN114936555B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210573462.7A CN114936555B (en) 2022-05-24 2022-05-24 Method and system for AI intelligent labeling of Mongolian

Publications (2)

Publication Number Publication Date
CN114936555A CN114936555A (en) 2022-08-23
CN114936555B true CN114936555B (en) 2023-06-06

Family

ID=82865086

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210573462.7A Active CN114936555B (en) 2022-05-24 2022-05-24 Method and system for AI intelligent labeling of Mongolian

Country Status (1)

Country Link
CN (1) CN114936555B (en)

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2009021183A1 (en) * 2007-08-08 2009-02-12 Lessac Technologies, Inc. System-effected text annotation for expressive prosody in speech synthesis and recognition
CN101110171A (en) * 2007-09-05 2008-01-23 黄谷 Bopomofo Chinese and English contrasting multifunctional learning aid
CN103632663B (en) * 2013-11-25 2016-08-17 内蒙古大学 A kind of method of Mongol phonetic synthesis front-end processing based on HMM
CN105957518B (en) * 2016-06-16 2019-05-31 内蒙古大学 A kind of method of Mongol large vocabulary continuous speech recognition
CN111104546B (en) * 2019-12-03 2021-08-27 珠海格力电器股份有限公司 Method and device for constructing corpus, computing equipment and storage medium

Also Published As

Publication number Publication date
CN114936555A (en) 2022-08-23

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CB03 Change of inventor or designer information

Inventor after: Hu Rong, Dai Lin, Yin Bangren, Cheng Biao, Yang Xuefeng, Zhou Bateer, Ren Fuqiang, Naren Grzyle, Yang Zhong, Chen Lei, Wang Hui, He Dongwei, Hou Zhuowei

Inventor before: Naren Grzyle, Chen Lei, Yang Zhong, Wang Hui, Dai Lin, Yin Bangren, Cheng Biao
