CN114936555B - Method and system for AI intelligent labeling of Mongolian - Google Patents

Method and system for AI intelligent labeling of Mongolian

Info

Publication number: CN114936555B
Application number: CN202210573462.7A
Authority: CN (China)
Prior art keywords: labeling, word, module, words, audio
Legal status: Active (granted)
Other languages: Chinese (zh)
Other versions: CN114936555A
Inventors: 娜仁格日乐, 陈磊, 杨忠, 王辉, 戴林, 尹帮仁, 程彪
Current assignee: Inner Mongolia Autonomous Region Public Security Bureau; Iflytek Information Technology Co Ltd
Original assignee: Inner Mongolia Autonomous Region Public Security Bureau; Iflytek Information Technology Co Ltd
Application filed by Inner Mongolia Autonomous Region Public Security Bureau and Iflytek Information Technology Co Ltd
Priority to CN202210573462.7A
Published as CN114936555A; application granted; published as CN114936555B

Classifications

    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking (G06F40/20 Natural language analysis; G06F40/279 Recognition of textual entities)
    • G06F40/117 Tagging; Marking up; Designating a block; Setting of attributes (G06F40/10 Text processing; G06F40/103 Formatting, i.e. changing of presentation of documents)
    • G06F40/151 Transformation (G06F40/12 Use of codes for handling textual entities)
    • G06F40/253 Grammatical analysis; Style critique (G06F40/20 Natural language analysis)
    • G10L15/005 Language recognition (G10L15/00 Speech recognition)
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Machine Translation (AREA)
  • Document Processing Apparatus (AREA)
  • Character Discrimination (AREA)

Abstract

The application relates to a method for AI intelligent labeling of Mongolian, which comprises the following steps: determining whether the audio data fall within the invalid-data range; identifying whether the audio content is a Xilingol League accent; identifying the type of the transcription content and normalizing it; identifying special text and applying special marks; and outputting the final transcription annotation content. The method and the device improve the screening of invalid audio data, reduce wasted computation, and raise the speed of speech transcription labeling; recognition of various special cases and text types in standard Zhenglan Banner Mongolian is improved, and the accuracy of transcription labeling is increased.

Description

Method and system for AI intelligent labeling of Mongolian
Technical Field
The application relates to the field of Mongolian processing, in particular to a method and a system for AI intelligent labeling of Mongolian.
Background
With the development of the times and the progress of society, exchanges among all sectors of society have become increasingly close. With the growth of the logistics and transportation industries, and on the basis of a well-developed communications industry, trade between countries has gradually increased. The language barrier faced by some farmers and herders during cross-border trade has become one of the main obstacles, so apps and websites for language translation play a large role. However, some farmers and herders have a limited level of education and little familiarity with electronic devices, and because they work in the same place all year round their dialect accents are heavy, so language translation software cannot translate their speech well when it is used.
Existing methods for transcribing and labeling the standard-pronunciation Mongolian of Zhenglan Banner perform poorly and cannot complete the transcription and labeling of Zhenglan Banner standard-pronunciation Mongolian.
Disclosure of Invention
To solve the problem of poor transcription labeling of Zhenglan Banner standard Mongolian, the application provides a method and a system for AI intelligent labeling of Mongolian.
The application provides a method for AI intelligent labeling of Mongolian, which comprises the following steps:
Step S1, judging whether the audio data are valid: if the audio data do not fall within the invalid-data range, they are valid and audio transcription labeling is performed; if the audio data meet any criterion of the invalid-data range, they are invalid, no audio transcription labeling is performed, and they are tagged and labeled as bad data;
Step S2, judging the audio accent and identifying whether the audio content is a Xilingol League accent: if the recognition result is a Xilingol League accent, audio transcription labeling is performed; if the recognition result is another regional accent, no audio transcription labeling is performed and the audio of the other regional accent is marked directly;
Step S3, normalizing the transcription content: the type of the transcription content is identified and transcription labeling is performed;
Step S4, labeling special text: special text is identified and special marks are applied;
Step S5, outputting the transcription annotation: the results of steps S3-S4 are integrated to output the final transcription annotation content.
Through this scheme, the method and the device improve the screening of invalid audio data, reduce wasted computation, and raise the speed of speech transcription labeling; recognition of various special cases and text types in standard Zhenglan Banner Mongolian is improved, and the accuracy of transcription labeling is increased.
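For illustration, a minimal sketch of how steps S1-S5 could be chained in code; every function name, signature, and return value here is an assumption, since the application does not specify a concrete implementation:

```python
# Hypothetical sketch of the S1-S5 pipeline; the function bodies are placeholders.

def is_invalid(audio: bytes) -> bool:
    # Step S1: check the clip against the invalid-data range
    # (non-target language, severe clipping, voice noise, single word, ...).
    return False

def is_xilingol_accent(audio: bytes) -> bool:
    # Step S2: accent judgment (Xilingol League accent vs. other regional accents).
    return True

def transcribe_and_normalize(audio: bytes) -> str:
    # Step S3: transcription plus the normalization rules S301-S311.
    return ""

def mark_special_text(text: str) -> str:
    # Step S4: special-text rules S41-S47 (numerals, English, interjections, punctuation, spaces).
    return text

def label_mongolian_audio(audio: bytes) -> dict:
    if is_invalid(audio):
        return {"status": "bad_data"}            # tagged, no transcription labeling
    if not is_xilingol_accent(audio):
        return {"status": "other_accent"}        # marked directly, no transcription labeling
    text = mark_special_text(transcribe_and_normalize(audio))
    return {"status": "labeled", "text": text}   # step S5: integrated output
```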
Further, in the step S1, the invalid-data range includes: non-target-language and non-Zhenglan-Banner types, severe clipping types, poor speaker-recording types, voice-noise types, re-read types, single-word types, and rap and singing types;
the non-target-language and non-Zhenglan-Banner types are specifically: empty data, pure ambient noise, pure music, pure human-voice noise, pure non-speech human sounds, and pure system broadcast sounds;
the severe clipping type is specifically: the audio is harsh and roaring, the waveform exceeds the upper and lower boundary lines, and the speech content cannot be heard clearly;
the poor speaker-recording type is specifically: the speaker pops the microphone or speaks in a muffled voice so that the speech cannot be heard clearly;
the voice-noise type is specifically: human-voice noise interferes with the main speaker so that the speech cannot be heard clearly;
the re-read type is specifically: a word is left unfinished and then read again;
the single-word type is specifically: an audio clip has only one word. The screening of invalid audio data is thus improved, wasted computation is reduced, and the speed of speech transcription labeling is raised;
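As a rough illustration only, a sketch of how two of the measurable criteria (empty data and severe clipping) plus the single-word criterion might be screened automatically; the 0.999 amplitude limit and 5% clipping ratio are invented thresholds, not values from the application:

```python
import numpy as np

def is_in_invalid_range(samples: np.ndarray, transcript_words: list) -> bool:
    """Return True if the clip matches part of the invalid-data range (sketch)."""
    if samples.size == 0:                          # empty data
        return True
    clipped_ratio = float(np.mean(np.abs(samples) >= 0.999))
    if clipped_ratio > 0.05:                       # waveform beyond the boundary lines: severe clipping
        return True
    if len(transcript_words) <= 1:                 # single-word type
        return True
    return False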
further, the step S3 includes: step S301, summarizing processing; step S302, first word processing, specifically
Figure GDA00037555720900000224
Processing; step S303, processing the postfix of the derivative; step S304, processing the word with the aid of the language; step S305, compound word processing; step S306, standard voice and dialect vocabulary processing; step S307, pronoun processing; step S308, second class word processing, specifically +.>
Figure GDA0003755572090000021
Processing; step S309, third class word processing, specifically +.>
Figure GDA00037555720900000225
Processing related words; step S310, borrowing word processing; step S311, new vocabulary processing; the label transfer can be normalized.
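One plausible way to organize the S301-S311 sub-steps is as an ordered chain of rewrite rules applied to the transcription; the sketch below shows only that plumbing, with empty placeholder rules, because the concrete Mongolian-script rewrites shown as images in the original cannot be reproduced here:

```python
from typing import Callable, List

NormRule = Callable[[str], str]

def generalization_rule(text: str) -> str:      # S301 (placeholder)
    return text

def loanword_rule(text: str) -> str:            # S310 (placeholder)
    return text

def normalize_transcription(text: str, rules: List[NormRule]) -> str:
    # Apply the sub-step rules in order; each one rewrites the text and passes it on.
    for rule in rules:
        text = rule(text)
    return text

normalized = normalize_transcription("raw transcription", [generalization_rule, loanword_rule])
```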
In the step S301: through generalization, the meaning of a word that originally refers to a specific thing is broadened so that it covers the same class or other closely related things; the most common form of generalization is to repeat, after a noun, a word that is identical except for its [Mongolian letter] consonant; sometimes the repetition uses other consonants [Mongolian words], converting the first syllable of the second word into a [Mongolian letter] consonant; for generalized terms formed from adjectives, the first syllable of the adjective is followed by a [Mongolian letter] consonant; there is only one generalizing verb, [Mongolian word], and after it is attached to the end of a full verb the meaning range of that verb is broadened;
in the step S302: the comitative-case word [Mongolian word] written separately is replaced with [Mongolian word], but when it does not occur in connected sentences the word is transcribed with [Mongolian word] instead; when a noun takes [Mongolian suffix] and becomes an adjective, it must be written connectedly; the word [Mongolian word] written separately is replaced with [Mongolian word], but in connected sentences it is transcribed directly from the audio; the demonstrative [Mongolian word] in modal verbs expressing conjecture must be written connectedly; when the nouns of the twelve zodiac animals take [Mongolian suffix], they must be written connectedly;
in the step S303: if invariable words appear in a sentence, the function words are written separately; adverbs and [Mongolian word] are written separately from [Mongolian word];
in the step S304: for [Mongolian particle], when it directly follows a positive word it is transcribed as [Mongolian form] and read with the first vowel, and when it directly follows a negative word it is transcribed as [Mongolian form] and read with the second vowel; in other cases, apart from the first and second vowels, the reading is judged from the vowel-harmony (masculine or feminine) class of the word; [Mongolian form] should be read as the fifth vowel, [Mongolian form] as the sixth vowel [o], [Mongolian form] as the seventh vowel [u], and [Mongolian form] as the fifth vowel [v]; if a word ending in [Mongolian form] or [Mongolian form] appears before [Mongolian particle], it is transcribed exactly as in the audio;
in the step S305: personal names and place names formed from two or more words, and some proper nouns, are written connectedly without distinguishing the vowel-harmony class of the words; when the second root of a word composed of two roots is a vowel, it is written separately;
in the step S306: in connected sentences [Mongolian word] falls outside the standard pronunciation range, so the data must be normalized directly; for dialect vocabulary, the word [Mongolian word] in the Hull spoken dialect and the word [Mongolian word] in the Wu Zhumu spoken dialect are transcribed directly from the audio; if the words [Mongolian words] appear in the audio, they are transcribed directly from the audio;
in the step S307, the pronouns include: [Mongolian words];
in the step S308: [Mongolian word] is written separately after nouns, pronouns, and some time and place words used as nominalized second-case forms; it is written connectedly after some adjectives and locational words; when the dative-locative (third case) suffix [Mongolian suffix] is followed by the additional component [Mongolian suffix] of a word and a change is needed, it must be transcribed as [Mongolian form];
in the step S309: if [Mongolian word] changes the meaning, form, and function of the word when its additional components change, it is written connectedly; if [Mongolian word] can form a compound word without changing the word and serves both word-forming and inflectional functions, it is written separately;
in the step S310: long-established Mongolian loanwords are transcribed with the actual Mongolian pronunciation, and newly introduced loanwords are transcribed with the closest pronunciation in the Mongolian phonological system; in the step S311: for the suffix of a noun derived from a verb, if the root ends in the consonant [Mongolian letter], it should be replaced with a suffix of the same meaning and function for normative transcription.
Further, in the step S4, transcription follows the speech content of the main speaker and the content is strictly consistent with what is heard; the transcription content is written starting from the first column; if the background sound is a human voice, is in the target language, and is clearly audible, all of it is labeled in order, and if the background sound is unclear, only the main speaker is labeled; the text must be fully consistent with the audio, and place names and personal names must be reasonable;
the step S4 includes: step S41, labeling Arabic numerals, transcribing the corresponding Mongolian words according to the audio; step S42, labeling English: if English is encountered during transcription labeling it is labeled directly, and if another foreign language is encountered it is labeled according to the meaning of the foreign word in Mongolian; step S43, labeling interjections; step S44, labeling text with grammatical errors: as long as the pronunciation is clear, the audio content is transcribed and labeled directly; step S45, labeling punctuation marks: only the four punctuation marks ". [Mongolian mark] ? !" may appear during transcription labeling; step S46, labeling proper nouns: Chinese personal and place names and English personal and place names are labeled according to the actual Mongolian standard requirements; step S47, labeling space text: spaces exist in the Mongolian typing process;
in the step S43, the term "one' S voice" is collectively denoted as
Figure GDA00037555720900000319
Representing affirmative; the Chinese words and phrases are uniformly marked as
Figure GDA00037555720900000318
Indicating insight and comprehension; the word "Java" is marked uniformly as->
Figure GDA00037555720900000317
Indicating a appreciation, indicating surprise; the Chinese qi word "o" is uniformly marked as->
Figure GDA00037555720900000320
Representing a general doubt; the word "bar" is uniformly marked as +.>
Figure GDA00037555720900000321
Representing a good bar; the speech ends with an o and the last word is not in +.>
Figure GDA00037555720900000322
Or->
Figure GDA00037555720900000323
Marking +.>
Figure GDA00037555720900000324
Or->
Figure GDA00037555720900000325
Can realize the labeling of the Chinese words, arabic numerals, english and punctuation marks.
Further, in the step S5, the audio is transcribed word by word; colloquial words in the audio must be transcribed correctly in their written form, and words swallowed for colloquial reasons must be restored when transcribing; punctuation marks must be used and used correctly, and only the punctuation marks ". [Mongolian mark] ? !" of the Meng Keli Mongolian input method are used; during transcription, [Mongolian word] is written separately; personal names must be written connectedly, and place names that cannot be written connectedly because of the input method are written separately. The transcription can thus be normalized.
Further, in the step S5, when the transcription annotation is output, it is written starting from the first column; when a punctuation mark is entered, one space is manually added before it; exactly one space is used between words, suffixes, [Mongolian word], and symbols. Spaces can thus be added between words, suffixes, [Mongolian word], and symbols.
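A small sketch of the spacing rules above (write from the first column, exactly one space between words, suffixes, and symbols, so a punctuation mark kept as its own token automatically gets the single space required in front of it); the token content is illustrative ASCII, not real Mongolian script:

```python
def format_output_line(tokens: list) -> str:
    # One space between words, suffixes, and symbols; no blank at the start of the line.
    # Punctuation is kept as its own token, so joining also puts one space before it.
    return " ".join(tok for tok in tokens if tok)

print(format_output_line(["word", "suffix", "?"]))  # -> "word suffix ?"
```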
Furthermore, in the step S5, when the transcription annotation is output, the Meng Keli input method with the MN mark must be used; platform transcription requires the national-standard-encoded Meng Keli input method, and the typed letters [Mongolian word] are black. The transcription labeling can thus be normalized.
Further, in the step S5, when the transcription annotation is output, for the suffixes [Mongolian suffixes]: apart from [Mongolian suffixes], the other suffixes are on shortcut keys 1-9, where the suffix on key 6, [Mongolian suffix], is not used and is written as [Mongolian form] instead; the other suffixes carry their own part of speech; when using the suffixes, it is recommended to follow the input method, and when a suffix needs correction it must be corrected together with the word before it, so that the part of speech of the transcribed suffix is correct. This makes outputting the transcription annotation more convenient.
Further, in the step S5, when the transcription annotation is output, the Chinese word "also" in the audio is written as [Mongolian word] instead, and the Chinese word "more" in the audio is written as [Mongolian word] instead. Replacement writing of "also" and "more" is thus implemented.
A system for AI intelligent labeling of Mongolian, characterized in that a method for AI intelligent labeling of Mongolian according to any one of claims 1-9 is applied:
the system for AI intelligent labeling of Mongolian comprises: an invalid data range judging module (1), an accent judging module, a special text labeling module, a transcription content normalization processing module, and a transcription content output module;
the invalid data range judging module determines, from the audio data, whether any criterion of the invalid-data range is met: non-target-language and non-Zhenglan-Banner types, severe clipping types, poor speaker-recording types, voice-noise types, re-read types, single-word types, and rap and singing types; if a criterion is met the data are judged invalid, and if none is met the data are judged valid;
the accent judging module is connected with the invalid data range judging module and is used for receiving and recognizing the audio content passed by the invalid data range judging module and judging whether the audio content is a Xilingol League accent;
the special text labeling module is connected with the accent judging module and is used for receiving and labeling the audio content passed by the accent judging module; the special text labeling module comprises: a first special labeling sub-module for labeling Arabic numerals; a second special labeling sub-module for labeling English; a third special labeling sub-module for labeling interjections; a fourth special labeling sub-module for labeling text with grammatical errors; a fifth special labeling sub-module for labeling punctuation marks; a sixth special labeling sub-module for labeling proper-noun text; and a seventh special labeling sub-module for labeling space text;
the transcription content normalization processing module comprises: a first normalization sub-module for normalizing generalization transcription; a second normalization sub-module for normalizing [Mongolian word]; a third normalization sub-module for normalizing derivational suffixes; a fourth normalization sub-module for normalizing modal particles; a fifth normalization sub-module for normalizing compound-word transcription; a sixth normalization sub-module for normalizing standard-pronunciation and dialect vocabulary; a seventh normalization sub-module for normalizing pronouns; an eighth normalization sub-module for normalizing [Mongolian word]; a ninth normalization sub-module for normalizing words related to [Mongolian word]; a tenth normalization sub-module for normalizing loanwords; and an eleventh normalization sub-module for normalizing new vocabulary;
the transcription content output module is connected with the special text labeling module and the transcription content normalization processing module respectively and is used for receiving their data and outputting the integrated transcription content.
Through this system, the screening of invalid audio data is improved, wasted computation is reduced, and the speed of speech transcription labeling is raised; recognition of various special cases and text types in standard Zhenglan Banner Mongolian is improved, and the accuracy of transcription labeling is increased.
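A schematic wiring of modules (1)-(5) as plain Python classes, mirroring the connections described above; all class names, methods, and trivial return values are assumptions made for the sketch:

```python
from typing import Optional

class InvalidDataRangeJudge:                    # module (1)
    def is_valid(self, audio) -> bool:
        return True                             # placeholder decision

class AccentJudge:                              # module (2), fed by module (1)
    def is_xilingol(self, audio) -> bool:
        return True                             # placeholder decision

class SpecialTextLabeler:                       # module (3), sub-modules 31-37
    def label(self, text: str) -> str:
        return text

class TranscriptionNormalizer:                  # module (4), sub-modules 401-411
    def normalize(self, text: str) -> str:
        return text

class TranscriptionOutput:                      # module (5), joins (3) and (4)
    def emit(self, text: str) -> str:
        return text

class MongolianLabelingSystem:
    def __init__(self) -> None:
        self.range_judge = InvalidDataRangeJudge()
        self.accent_judge = AccentJudge()
        self.special = SpecialTextLabeler()
        self.normalizer = TranscriptionNormalizer()
        self.output = TranscriptionOutput()

    def run(self, audio, raw_text: str) -> Optional[str]:
        if not self.range_judge.is_valid(audio):      # invalid data: tag and stop
            return None
        if not self.accent_judge.is_xilingol(audio):  # other accent: mark directly
            return None
        text = self.normalizer.normalize(raw_text)
        text = self.special.label(text)
        return self.output.emit(text)
```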
In summary, the present application has the following beneficial technical effects:
1. the screening of invalid audio data is improved, wasted computation is reduced, and the speed of speech transcription labeling is raised;
2. recognition of various special cases and text types in standard Zhenglan Banner Mongolian is improved, and the accuracy of transcription labeling is increased.
Drawings
FIG. 1 is a step diagram of a method for AI intelligent labeling of Mongolian in an embodiment of the application;
FIG. 2 is a logic diagram of a method for AI intelligent labeling of Mongolian in an embodiment of the application;
FIG. 3 is a block diagram of a system for AI intelligent labeling of Mongolian in an embodiment of the application.
Reference numerals:
1. an invalid data range determination module; 2. an accent judgment module;
3. a special text labeling module; 31. a first special labeling sub-module; 32. a second special labeling sub-module; 33. a third special labeling sub-module; 34. a fourth special labeling sub-module; 35. a fifth special labeling sub-module; 36. a sixth special labeling sub-module; 37. a seventh special labeling sub-module;
4. a transcription content normalization processing module; 401. a first normalization sub-module; 402. a second normalization sub-module; 403. a third normalization sub-module; 404. a fourth normalization sub-module; 405. a fifth normalization sub-module; 406. a sixth normalization sub-module; 407. a seventh normalization sub-module; 408. an eighth normalization sub-module; 409. a ninth normalization sub-module; 410. a tenth normalization sub-module; 411. an eleventh normalization sub-module;
5. a transcription content output module.
Detailed Description
The embodiments of the present application, including the shape and construction of the components, their mutual positions and connections, their roles and working principles, and the methods of manufacture, operation, and use, are described in detail below to help those skilled in the art understand the inventive concept and technical solution of the invention more fully. For convenience of description, reference is made to the directions shown in the drawings.
Example 1
Referring to fig. 1-2, a method for AI intelligent labeling of Mongolian includes the following steps:
Step S1, judging whether the audio data are valid: if the audio data do not fall within the invalid-data range, they are valid and audio transcription labeling is performed; if the audio data meet any criterion of the invalid-data range, they are invalid, no audio transcription labeling is performed, and they are tagged and labeled as bad data;
Step S2, judging the audio accent and identifying whether the audio content is a Xilingol League accent: if the recognition result is a Xilingol League accent, including the Xilinhot accent, audio transcription labeling is performed; if the recognition result is another regional accent, no audio transcription labeling is performed and the audio of the other regional accent is marked directly;
Step S3, normalizing the transcription content: the type of the transcription content is identified and transcription labeling is performed;
Step S4, labeling special text: special text is identified and special marks are applied;
Step S5, outputting the transcription annotation: the results of steps S3-S4 are integrated to output the final transcription annotation content.
In the step S1, the invalid-data range includes: non-target-language and non-Zhenglan-Banner types, severe clipping types, poor speaker-recording types, voice-noise types, re-read types, single-word types, and rap and singing types;
the non-target-language and non-Zhenglan-Banner types are specifically: empty data, pure ambient noise, pure music, pure human-voice noise, pure non-speech human sounds such as singing, sneezing, coughing, and laughing, and pure system broadcast sounds such as sounds from devices like mobile phones, televisions, and radios;
the severe clipping type is specifically: the audio is harsh and roaring, the waveform exceeds the upper and lower boundary lines, and the speech content cannot be heard clearly;
the poor speaker-recording type is specifically: the speaker pops the microphone or speaks in a muffled voice so that the speech cannot be heard clearly;
the voice-noise type is specifically: human-voice noise interferes with the main speaker, i.e. the main and secondary speakers overlap so severely that the speech cannot be heard clearly;
the re-read type is specifically: a word is left unfinished and then read again, for example the start of a word is read, broken off, and then the whole word is read;
the single-word type is specifically: an audio clip has only one word; rap and singing are also invalid data.
The step S3 includes: step S301, generalization processing; step S302, first word-class processing, specifically processing of [Mongolian word]; step S303, derivational-suffix processing; step S304, modal-particle processing; step S305, compound-word processing; step S306, standard-pronunciation and dialect-vocabulary processing; step S307, pronoun processing; step S308, second word-class processing, specifically processing of [Mongolian word]; step S309, third word-class processing, specifically processing of words related to [Mongolian word]; step S310, loanword processing; step S311, new-vocabulary processing;
In the step S301: through generalization, the meaning of a word that originally refers to a specific thing is broadened so that it covers the same class or other closely related things, such as: [Mongolian examples]; the most common form of generalization is to repeat, after a noun, a word that is identical except for its [Mongolian letter] consonant, i.e. an echo word; sometimes the repetition uses other consonants [Mongolian words] and, once the pair becomes a vocabulary unit, the first syllable of the second word is converted into a [Mongolian letter] or similar consonant, such as: [Mongolian examples]; for generalized terms formed from adjectives, the first syllable of the adjective is followed by a [Mongolian letter] consonant, such as: [Mongolian example]; there is only one generalizing verb, [Mongolian word], and after it is attached to the end of a full verb the meaning range of that verb is broadened;
in the step S302: the comitative-case word [Mongolian word] written separately is replaced with [Mongolian word], but when it does not occur in connected sentences the word is transcribed with [Mongolian word] instead, such as: [Mongolian example]; when it expresses [Mongolian word] it must be written separately, such as: [Mongolian example]; when a noun takes [Mongolian suffix] and becomes an adjective, it is written connectedly, such as: [Mongolian example]; the word [Mongolian word] written separately is replaced with [Mongolian word], but in connected sentences it is transcribed directly from the audio, such as: [Mongolian examples]; the demonstrative [Mongolian word] in modal verbs expressing conjecture must be written connectedly, such as: [Mongolian example]; when the nouns of the twelve zodiac animals take [Mongolian suffix], they must be written connectedly, such as: [Mongolian example];
In the step S303: if the following unchanged words appear in the sentence, such as all words connecting the works, the works are separated and written, such as:
Figure GDA00037555720900000730
Figure GDA00037555720900000732
Figure GDA00037555720900000733
adverbs->
Figure GDA00037555720900000734
Split write (s)/(s) on a memory card>
Figure GDA00037555720900000735
Such as:
Figure GDA00037555720900000736
these words, if they appear in sentences, are written in succession, such as: />
Figure GDA00037555720900000737
Figure GDA00037555720900000738
In the step S304: for the following
Figure GDA00037555720900000739
Positive word is directly followed by transcription ++>
Figure GDA00037555720900000740
Reading the first vowel, directly transcribing the negative word followed by +.>
Figure GDA00037555720900000741
Reading the second vowels, and judging according to the yin-yang of the words at other times except the first vowels and the second vowels;
Figure GDA00037555720900000742
should be read as "fifth vowel" [ v ]>
Figure GDA00037555720900000743
Should be read as "sixth vowel" [ o ]>
Figure GDA00037555720900000744
Should be read as "seventh vowel" [ u ]>
Figure GDA00037555720900000745
Should be read as "fifth vowel" [ v ]>
Figure GDA00037555720900000746
If->
Figure GDA00037555720900000747
Front appearance "/-on>
Figure GDA00037555720900000748
Or->
Figure GDA00037555720900000749
The words of the "end" may all be transcribed in audio, such as: />
Figure GDA00037555720900000750
Figure GDA00037555720900000751
In the step S305: the names of people and places formed by two or more words, and proper nouns are written in succession without distinguishing yin and yang of words in most cases, for example:
Figure GDA00037555720900000752
Figure GDA0003755572090000081
the second root of the word consisting of two roots is written according to the change in the word when the second root is a vowel in theory, but the words cannot be output in the typing process, and the words are written separately, such as:
Figure GDA0003755572090000082
in the step S306: in connected sentences [Mongolian word] falls outside the standard pronunciation range, so the data must be normalized directly; for dialect vocabulary, the word [Mongolian word] in the Hull spoken dialect and the word [Mongolian word] in the Wu Zhumu spoken dialect both mean father and fall within the standard range, so they are transcribed directly from the audio, taking care to write the dialect vocabulary correctly; if the words [Mongolian words] appear in the audio, they are transcribed directly from the audio;
in the step S307, the pronouns include: [Mongolian words];
in the step S308: nouns, pronouns, and some time and place words used as nominalized second-case forms are written separately, for example: [Mongolian example]; some adjectives and locational words, including some special pronouns, are written connectedly, such as: [Mongolian examples]; when the dative-locative (third case) suffix [Mongolian suffix] is followed by the additional component [Mongolian suffix] of a word and a change is needed, it must be transcribed as [Mongolian form], such as: [Mongolian example];
in the step S309: if [Mongolian word] changes the meaning, form, and function of the word when its additional components change, it is written connectedly, such as: [Mongolian example]; if [Mongolian word] can form a compound word without changing the word and serves both word-forming and inflectional functions, it is written separately, for example: [Mongolian example];
in the step S310: for loanwords with a long history in Mongolian, the actual Mongolian pronunciation is used for transcription, and newly introduced loanwords are transcribed with the closest pronunciation in the Mongolian phonological system, for example: [Mongolian examples];
in the step S311: for the suffix of a noun derived from a verb, if the root ends in the consonant [Mongolian letter], it should be replaced with a suffix of the same meaning and function for normative transcription, such as: [Mongolian examples]; words with lengthened-syllable morphemes are written accordingly, such as: [Mongolian example]; some free variants of roots are unified with free variants of suffixes, such as: [Mongolian examples]; the correct written forms of other vocabulary are: [Mongolian examples].
In the step S4, transcription follows the speech content of the main speaker, the content is strictly consistent with what is heard, and extra words, missing words, and wrong words are not allowed; the transcription content is written starting from the first column, and a blank at the beginning is forbidden; if the background sound is a human voice, is in the target language, and is clearly audible, all of it is labeled in order, and if the background sound is unclear, only the main speaker is labeled; the text must be fully consistent with the audio, place names and personal names must be reasonable, uncertain words are checked in a dictionary, and words must not be invented from the pronunciation;
the step S4 includes: step S41, labeling Arabic numerals: Arabic numerals may not appear in the labeled content, and the corresponding Mongolian words are transcribed according to the audio, such as: [Mongolian examples]; for the year 2004, transcribe the corresponding words according to the actual pronunciation instead of labeling the digits: [Mongolian example];
step S42, labeling English: if English is encountered during transcription labeling it is labeled directly, and if another foreign language is encountered it is labeled according to the meaning of the foreign word in Mongolian;
step S43, labeling interjections;
step S44, labeling text with grammatical errors: as long as the pronunciation is clear, the audio content is transcribed and labeled directly;
step S45, labeling punctuation marks: apart from the four punctuation marks ". [Mongolian mark] ? !", no other punctuation may appear in the labeling result;
step S46, labeling proper nouns: Chinese personal and place names, English personal and place names, and the like are labeled according to the actual Mongolian standard requirements;
step S47, labeling space text: spaces exist in the Mongolian typing process, their meaning does not change, and they may not be added or removed at will;
in the step S43, the interjection expressing affirmation is uniformly labeled as [Mongolian word]; the interjection expressing realization and comprehension is uniformly labeled as [Mongolian word]; the interjection expressing admiration or surprise is uniformly labeled as [Mongolian word]; the Chinese interjection "o" expressing a general question is uniformly labeled as [Mongolian word]; the Chinese particle "ba" is uniformly labeled as [Mongolian word], expressing agreement, and the Chinese sentence-final "ba" itself is not written literally; when the speech ends in "o" and the last word does not end in [Mongolian letter] or [Mongolian letter], it is labeled as [Mongolian word] or [Mongolian word].
In the step S5, the audio is transcribed word by word, and words may not be dropped or added; colloquial words in the audio must be transcribed correctly in their written form, words must not be invented, and words swallowed for colloquial reasons must be restored when transcribing, for example: [Mongolian example]; punctuation marks must be used and used correctly; only the punctuation marks ". [Mongolian mark] ? !" of the Meng Keli Mongolian input method are used, and punctuation from other input methods cannot be used; during transcription [Mongolian word] is written separately, for example [Mongolian form] cannot be written as [Mongolian form]; personal names must be written connectedly, and place names that cannot be written connectedly because of the input method are written separately.
In the step S5, when the transcription annotation is output, it is written starting from the first column, and no blank is left at the beginning of the transcription; when a punctuation mark is entered, one space is manually added before it; exactly one space is used between words, suffixes, [Mongolian word], and symbols, with no missing or extra spaces.
In the step S5, when the transcription annotation is output, the Meng Keli input method with the MN mark must be used; platform transcription requires the national-standard-encoded Meng Keli input method [Mongolian word], and input methods without the Meng Keli encoding [Mongolian word] cannot be used; the typed words are black and may not be blue or red, except for individual special words and fixed expressions.
In the step S5, when the transcription annotation is output, for the suffixes [Mongolian suffixes]: apart from [Mongolian suffixes], the other suffixes are on shortcut keys 1-9, where the suffix on key 6, [Mongolian suffix], is not used and is written as [Mongolian form] instead, for example: [Mongolian examples]; the other suffixes carry their own part of speech; when using the suffixes, it is recommended to follow the input method, and when a suffix needs correction it must be corrected together with the word before it, so that the part of speech of the transcribed suffix is correct.
In the step S5, when the transcription annotation is output, the Chinese word "also" in the audio is written as [Mongolian word] instead, and the Chinese word "more" in the audio is written as [Mongolian word] instead.
The method for AI intelligent labeling of Mongolian has the following advantages: the screening of invalid audio data is improved, wasted computation is reduced, and the speed of speech transcription labeling is raised; recognition of various special cases and text types in standard Zhenglan Banner Mongolian is improved, and the accuracy of transcription labeling is increased.
Example 2
Referring to fig. 3, a system for AI intelligent labeling of Mongolian includes: an invalid data range judging module 1, an accent judging module 2, a special text labeling module 3, a transcription content normalization processing module 4, and a transcription content output module 5;
the invalid data range judging module 1 determines, from the audio data, whether any criterion of the invalid-data range is met: non-target-language and non-Zhenglan-Banner types, severe clipping types, poor speaker-recording types, voice-noise types, re-read types, single-word types, and rap and singing types; if a criterion is met the data are judged invalid, and if none is met the data are judged valid;
the accent judging module is connected with the invalid data range judging module 1 and is used for receiving and recognizing the audio content passed by the invalid data range judging module 1 and judging whether the audio content is a Xilingol League accent;
the special text labeling module 3 is connected with the accent judging module 2 and is used for receiving and labeling the audio content passed by the accent judging module 2; the special text labeling module 3 includes: a first special labeling sub-module 31 for labeling Arabic numerals; a second special labeling sub-module 32 for labeling English; a third special labeling sub-module 33 for labeling interjections; a fourth special labeling sub-module 34 for labeling text with grammatical errors; a fifth special labeling sub-module 35 for labeling punctuation marks; a sixth special labeling sub-module 36 for labeling proper-noun text; and a seventh special labeling sub-module 37 for labeling space text;
the transcription content normalization processing module 4 comprises: a first normalization sub-module 401 for normalizing generalization transcription; a second normalization sub-module 402 for normalizing [Mongolian word]; a third normalization sub-module 403 for normalizing derivational suffixes; a fourth normalization sub-module 404 for normalizing modal particles; a fifth normalization sub-module 405 for normalizing compound-word transcription; a sixth normalization sub-module 406 for normalizing standard-pronunciation and dialect vocabulary; a seventh normalization sub-module 407 for normalizing pronouns; an eighth normalization sub-module 408 for normalizing [Mongolian word]; a ninth normalization sub-module 409 for normalizing words related to [Mongolian word]; a tenth normalization sub-module 410 for normalizing loanwords; and an eleventh normalization sub-module 411 for normalizing new vocabulary;
the transcription content output module 5 is connected with the special text labeling module 3 and the transcription content normalization processing module 4 respectively and is used for receiving their data and outputting the integrated transcription content.
In this embodiment of the application, the system for AI intelligent labeling of Mongolian works on the following principle: by providing the invalid data range judging module, the screening of invalid audio data is improved, wasted computation is reduced, and the speed of speech transcription labeling is raised; by providing the special text labeling module and the transcription content normalization processing module, recognition of various special cases and text types in standard Zhenglan Banner Mongolian is improved, and the accuracy of transcription labeling is increased.
The invention and its embodiments have been described above schematically and without limitation; the drawings illustrate only one embodiment, and the actual structure is not limited to it. Therefore, if a person of ordinary skill in the art, enlightened by this disclosure, devises without inventive effort structural arrangements and embodiments similar to this technical solution without departing from the gist of the invention, they shall all fall within the protection scope of the invention.

Claims (4)

1. A method for AI intelligent labeling of Mongolian, characterized by comprising the following steps:
step S1, judging data validity: judge whether the audio data are valid; if the audio data do not fall within the invalid-data range, they are valid and audio transcription labeling is performed; if the audio data meet any criterion of the invalid-data range, they are invalid, no audio transcription labeling is performed, and they are tagged and labeled as bad data, wherein the invalid-data range includes: non-target-language and non-Zhenglan-Banner types, severe clipping types, poor speaker-recording types, voice-noise types, re-read types, single-word types, and rap and singing types;
step S2, judging the audio accent: identify whether the audio content is a Xilingol League accent; if the recognition result is a Xilingol League accent, audio transcription labeling is performed; if the recognition result is another regional accent, no audio transcription labeling is performed and the audio of the other regional accent is marked directly;
step S3, normalizing the transcription content: the type of the transcription content is identified and transcription labeling is performed;
step S4, labeling special text: special text is identified and special marks are applied;
step S5, outputting the transcription annotation: the results of step S3 and step S4 are integrated to output the final transcription annotation content;
in the step S4, transcription follows the speech content of the main speaker and the content is consistent with what is heard; the transcription content is written starting from the first column; if the background sound is a human voice, is in the target language, and is clearly audible, all of it is labeled in order, and if the background sound is unclear, only the main speaker is labeled; the text is fully consistent with the audio, and place names and personal names are reasonable;
the step S4 includes: step S41, labeling Arabic numerals, transcribing the corresponding Mongolian words according to the audio; step S42, labeling English: if English is encountered during transcription labeling it is labeled directly, and if another foreign language is encountered it is labeled according to the meaning of the foreign word in Mongolian; step S43, labeling interjections; step S44, labeling text with grammatical errors: as long as the pronunciation is clear, the audio content is transcribed and labeled directly; step S45, labeling punctuation marks: only the four punctuation marks ". [Mongolian mark] ? !" may appear; step S46, labeling proper nouns: Chinese personal and place names and English personal and place names are labeled according to the actual Mongolian standard requirements; step S47, labeling space text: spaces exist in the Mongolian typing process;
in the step S43, the interjection expressing affirmation is uniformly labeled as [Mongolian word]; the interjection expressing realization and comprehension is uniformly labeled as [Mongolian word]; the interjection expressing admiration or surprise is uniformly labeled as [Mongolian word]; the Chinese interjection "o" expressing a general question is uniformly labeled as [Mongolian word]; the Chinese particle "ba" is uniformly labeled as [Mongolian word], expressing agreement; when the speech ends in "o" and the last word does not end in [Mongolian letter] or [Mongolian letter], it is labeled as [Mongolian word] or [Mongolian word];
in the step S5, the audio is transcribed word by word; colloquial words in the audio are transcribed correctly in their written form, and words swallowed for colloquial reasons are restored when transcribing: [Mongolian examples]; punctuation marks must be used and used correctly; only the punctuation marks ". [Mongolian mark] ? !" under the Meng Keli Mongolian input method are used; during transcription [Mongolian word] is written separately: [Mongolian example]; personal names are written connectedly, and place names that cannot be written connectedly because of the Meng Keli Mongolian input method are written separately;
in the step S5, when the transcription annotation is output, it is written starting from the first column, and no blank is left at the beginning of the transcription; when a punctuation mark is entered, one space is manually added before it; there is one space between words, suffixes, [Mongolian word], and symbols;
in the step S5, when the transcription annotation is output, the Meng Keli Mongolian input method with the MN mark is used; platform transcription uses the national-standard-encoded Meng Keli Mongolian input method [Mongolian word]; input methods without the Meng Keli encoding [Mongolian word] cannot be used; the typed words are black, except for special words and fixed expressions;
in the step S5, when the transcription annotation is output, for the suffixes [Mongolian suffixes]: apart from [Mongolian suffixes], the other suffixes are on shortcut keys 1-9, where the suffix on key 6, [Mongolian suffix], is not used and is written as [Mongolian form] instead, the nine shortcuts ① to ⑨ corresponding to the Mongolian suffixes shown in the figures; the other suffixes carry their own part of speech; when using the suffixes, they are used as recommended by the Meng Keli Mongolian input method, and when a suffix needs correction it is corrected together with the word before it;
in the step S5, when the transcription annotation is output, the Chinese word "also" in the audio is written as [Mongolian word] instead, and the Chinese word "more" in the audio is written as [Mongolian word] instead.
2. The method for AI intelligent labeling of Mongolian according to claim 1, characterized in that:
in the step S1, the non-target-language and non-Zhenglan-Banner types are specifically: empty data, pure ambient noise, pure music, pure human-voice noise, pure non-speech human sounds, and pure system broadcast sounds;
the severe clipping type is specifically: the audio is harsh and roaring, the waveform exceeds the upper and lower boundary lines, and the speech content cannot be heard clearly;
the poor speaker-recording type is specifically: the speaker pops the microphone or speaks in a muffled voice so that the speech cannot be heard clearly;
the voice-noise type is specifically: human-voice noise interferes with the main speaker so that the speech cannot be heard clearly;
the re-read type is specifically: a word is left unfinished and then read again;
the single-word type is specifically: an audio clip has only one word.
3. The method for intelligent AI-labeling of mongolian language according to claim 1, wherein:
the step S3 includes: step S301, summarizing processing; step S302, first word processing, specifically
Figure QLYQS_32
Processing; step S303, processing the postfix of the derivative; step S304, processing the word with the aid of the language; step S305, compound word processing; step S306, standard voice and dialect vocabulary processing; step S307, pronoun processing; step S308, second class word processing, specifically +.>
Figure QLYQS_33
Processing; step S309, third class word processing, specifically +.>
Figure QLYQS_34
Processing related words; step S310, borrowing word processing; step S311, new vocabulary processing;
in the step S301: the meaning of a word can be expanded by a generalized form, so that the word originally referring to a specific object can be changed into a word comprising the same kind or other objects closely related to the same; the most predominant form of the generalization is to repeat one more after a noun
Figure QLYQS_35
The same word as the consonant is sometimes repeated with other consonants +.>
Figure QLYQS_36
The first syllable of the second word is converted into +.>
Figure QLYQS_37
Consonants; in the case of a generic term consisting of adjectives, the first syllable of the adjective is followed by +.>
Figure QLYQS_38
Consonants; the generalized verb is only one +.>
Figure QLYQS_39
After being used for the ambiguous verb;
in the step S302: common-case words [Mongolian script: QLYQS_41] that are written separately are replaced with [Mongolian script: QLYQS_43], but when they are not in consecutive sentences the word is transcribed with [Mongolian script: QLYQS_45] instead; when a noun takes [Mongolian script: QLYQS_42] and becomes an adjective, it must be written accordingly; words [Mongolian script: QLYQS_44] that are written separately are replaced with [Mongolian script: QLYQS_46], but in consecutive sentences they are transcribed directly from the audio; in a modal verb expressing a speculative implication, [Mongolian script: QLYQS_47] must be written joined; when [Mongolian script: QLYQS_40] is added to the twelve classes of nouns, it must be written joined;
in the step S303: if uninflected words appear in a sentence, the function words are written separately; adverbs and [Mongolian script: QLYQS_48] are written separately [Mongolian script: QLYQS_49];
In the step S304: for the following
[Mongolian script: QLYQS_53], when it directly follows a positive word it is transcribed as [Mongolian script: QLYQS_55] and read with the first vowel; when it directly follows a negative word it is transcribed as [Mongolian script: QLYQS_61] and read with the second vowel; in all other cases, apart from the first and second vowels, the reading is judged by the masculine/feminine (yin-yang) class of the word; [Mongolian script: QLYQS_52] should be read as the fifth vowel [v] [Mongolian script: QLYQS_57]; [Mongolian script: QLYQS_59] should be read as the sixth vowel [o] [Mongolian script: QLYQS_60]; [Mongolian script: QLYQS_50] should be read as the seventh vowel [u] [Mongolian script: QLYQS_56]; [Mongolian script: QLYQS_62] should be read as the fifth vowel [v] [Mongolian script: QLYQS_63]; if a word ending in "[Mongolian script: QLYQS_54]" or [Mongolian script: QLYQS_58] appears before [Mongolian script: QLYQS_51], it is transcribed according to the audio;
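A hedged sketch of the "judge by the masculine/feminine (yin-yang) class of the word" fallback in step S304. Because the Mongolian letters are published only as images, the vowel sets and the class-to-vowel mapping below are illustrative assumptions over a Latin transliteration, not data taken from the patent.

```python
# Hedged sketch of the step-S304 fallback rule ("judge by the yin-yang class
# of the word").  The vowel sets and the class-to-vowel mapping are assumed
# for illustration; they are NOT taken from the claim, whose Mongolian letters
# appear only as images.
MASCULINE_VOWELS = set("aou")   # "yang" (back) vowels - assumed transliteration
FEMININE_VOWELS = set("eöü")    # "yin" (front) vowels - assumed transliteration


def harmony_class(word: str) -> str:
    """Classify a transliterated word as masculine, feminine, or neutral."""
    letters = set(word.lower())
    if letters & MASCULINE_VOWELS:
        return "masculine"
    if letters & FEMININE_VOWELS:
        return "feminine"
    return "neutral"


def particle_vowel_reading(previous_word: str, polarity: str) -> str:
    """Follow the claimed rule order: positive word -> first vowel,
    negative word -> second vowel, otherwise decide by vowel harmony."""
    if polarity == "positive":
        return "first vowel"
    if polarity == "negative":
        return "second vowel"
    # Which harmony class maps to which vowel is an assumption for illustration.
    return {
        "masculine": "seventh vowel [u]",
        "feminine": "sixth vowel [o]",
        "neutral": "fifth vowel [v]",
    }[harmony_class(previous_word)]
```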
in the step S305: personal names and place names composed of two or more words are written joined, without distinguishing the masculine/feminine (yin-yang) class of the words; in a word composed of two roots, if the second root begins with a vowel it is written separately;
in the step S306: in consecutive sentences, [Mongolian script: QLYQS_64] belongs to the non-standard pronunciation range and the data must be normalized directly; for dialect vocabulary, the word [Mongolian script: QLYQS_65] in the Hull spoken dialect and the word [Mongolian script: QLYQS_66] in Wu Zhumu speech are transcribed directly from the audio; if the words [Mongolian script: QLYQS_67] [Mongolian script: QLYQS_68] appear in the audio, they are also transcribed directly from the audio;
in the step S307, the pronouns include: [Mongolian script: QLYQS_69] [Mongolian script: QLYQS_70];
in the step S308:
nouns and pronouns in which [Mongolian script: QLYQS_71] acts as the second-case form are written separately; [Mongolian script: QLYQS_72] is written joined after [Mongolian script: QLYQS_73] [Mongolian script: QLYQS_74]; when the third-case (locative) suffix [Mongolian script: QLYQS_75], an additional component added after the word, [Mongolian script: QLYQS_76], needs to change, it is transcribed as [Mongolian script: QLYQS_77];
In the step S309: if it is
[Mongolian script: QLYQS_78] and the meaning, form, and function of the word change when additional components are added, the word is written joined; if [Mongolian script: QLYQS_79] can form a compound word without the word itself changing, and it has both word-forming and inflection-forming additional-component functions, it is written separately;
in the step S310: Mongolian borrowed words are transcribed according to their actual Mongolian pronunciation, and newly introduced borrowed words are transcribed with the closest pronunciation available in the Mongolian phonological system; in the step S311: for the suffix of a noun derived from a verb, if the root word ends in the consonant [Mongolian script: QLYQS_80], the standard transcription replaces it with a suffix having the same meaning and function.
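A minimal sketch of the step-S310 distinction between established and newly introduced borrowings, assuming a hypothetical lookup table and approximation helper; neither is disclosed in the claim.

```python
# Minimal sketch of the step-S310 rule: established borrowings are transcribed
# with their conventional Mongolian pronunciation (a lookup here), while newly
# introduced borrowings fall back to a nearest-pronunciation approximation.
# Both the table contents and the approximation helper are hypothetical.
ESTABLISHED_BORROWINGS = {
    "radio": "<MONGOLIAN_RADIO>",  # placeholder entry
}


def approximate_pronunciation(word: str) -> str:
    """Stand-in for mapping a new borrowing onto the closest Mongolian sounds."""
    return f"<APPROX:{word}>"


def transcribe_borrowing(word: str) -> str:
    """Prefer the conventional transcription; otherwise approximate it."""
    return ESTABLISHED_BORROWINGS.get(word, approximate_pronunciation(word))
```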
4. A system for AI intelligent labeling of Mongolian, characterized in that the method for AI intelligent labeling of Mongolian according to any one of claims 1-3 is applied:
the system for AI intelligent labeling of Mongolian comprises: an invalid data range judging module (1), an accent judging module (2), a special text labeling module (3), a transcription content normalization processing module (4), and a transcription content output module (5);
the invalid data range judging module (1) judges whether the audio data falls within the invalid data range, the invalid data range being: the non-target-language and non-Blue-Banner type, the severe amplitude-clipping type, the poor speaker-recording-quality type, the human-voice-noise type, the read-back type, the single-word type, and the rap-and-singing type; audio satisfying any of these types is judged to be invalid data, and audio satisfying none of them is judged to be valid data;
the accent judging module (2) is connected with the invalid data range judging module (1) and is used for receiving and identifying the audio content judged to be valid by the invalid data range judging module (1) and judging whether the audio content carries a Xilingol League accent;
the special text labeling module (3) is connected with the accent judging module (2) and is used for receiving and labeling the audio content judged to pass by the accent judging module (2); the special text labeling module (3) comprises: a first special labeling sub-module (31) for labeling Arabic numerals; a second special labeling sub-module (32) for labeling English; a third special labeling sub-module (33) for labeling modal particles; a fourth special labeling sub-module (34) for labeling grammatically erroneous text; a fifth special labeling sub-module (35) for labeling punctuation marks; a sixth special labeling sub-module (36) for labeling proper-noun text; a seventh special labeling sub-module (37) for labeling space text;
the transcription content normalization processing module (4) comprises: a first normalization sub-module (401) for normalizing generalized-form transcription; a second normalization sub-module (402) for normalizing the processing of [Mongolian script: QLYQS_81] words; a third normalization sub-module (403) for normalizing the processing of derivative suffixes; a fourth normalization sub-module (404) for normalizing the processing of modal particles; a fifth normalization sub-module (405) for normalizing compound-word transcription; a sixth normalization sub-module (406) for normalizing standard-pronunciation and dialect vocabulary; a seventh normalization sub-module (407) for normalizing the processing of pronouns; an eighth normalization sub-module (408) for normalizing the processing of [Mongolian script: QLYQS_82] words; a ninth normalization sub-module (409) for normalizing the processing of [Mongolian script: QLYQS_83]-related words; a tenth normalization sub-module (410) for normalizing the processing of borrowed words; an eleventh normalization sub-module (411) for normalizing new vocabulary;
the transcription content output module (5) is connected to the special text labeling module (3) and to the transcription content normalization processing module (4), respectively, and is used for receiving their data and outputting the combined transcription content.
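To illustrate the data flow claimed for the system (module (1) filtering, module (2) accent gating, modules (3) and (4) feeding module (5)), here is a structural sketch; the class, field, and method names are assumptions, since the claim specifies only the modules and their connections.

```python
# Structural sketch of the claim-4 data flow: audio rejected by module (1)
# never reaches module (2); modules (3) and (4) both feed module (5).
# All class, field, and method names are illustrative assumptions.
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Callable


@dataclass
class MongolianLabelingSystem:
    is_invalid: Callable[[object], bool]          # module (1): invalid data range judging
    is_xilingol_accent: Callable[[object], bool]  # module (2): accent judging
    label_special_text: Callable[[object], dict]  # module (3): seven special labeling sub-modules
    normalize_content: Callable[[object], dict]   # module (4): eleven normalization sub-modules
    outputs: list[dict] = field(default_factory=list)

    def process(self, audio: object) -> dict | None:
        if self.is_invalid(audio):
            return None                           # discarded as invalid data
        if not self.is_xilingol_accent(audio):
            return None                           # accent check not passed
        labels = self.label_special_text(audio)
        normalized = self.normalize_content(audio)
        # module (5): transcription content output combines both branches
        combined = {"labels": labels, "normalized": normalized}
        self.outputs.append(combined)
        return combined
```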
CN202210573462.7A 2022-05-24 2022-05-24 Method and system for AI intelligent labeling of Mongolian Active CN114936555B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210573462.7A CN114936555B (en) 2022-05-24 2022-05-24 Method and system for AI intelligent labeling of Mongolian

Publications (2)

Publication Number Publication Date
CN114936555A CN114936555A (en) 2022-08-23
CN114936555B true CN114936555B (en) 2023-06-06

Family

ID=82865086

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210573462.7A Active CN114936555B (en) 2022-05-24 2022-05-24 Method and system for AI intelligent labeling of Mongolian

Country Status (1)

Country Link
CN (1) CN114936555B (en)

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2009021183A1 (en) * 2007-08-08 2009-02-12 Lessac Technologies, Inc. System-effected text annotation for expressive prosody in speech synthesis and recognition
CN101110171A (en) * 2007-09-05 2008-01-23 黄谷 Bopomofo Chinese and English contrasting multifunctional learning aid
CN103632663B (en) * 2013-11-25 2016-08-17 内蒙古大学 A kind of method of Mongol phonetic synthesis front-end processing based on HMM
CN105957518B (en) * 2016-06-16 2019-05-31 内蒙古大学 A kind of method of Mongol large vocabulary continuous speech recognition
CN111104546B (en) * 2019-12-03 2021-08-27 珠海格力电器股份有限公司 Method and device for constructing corpus, computing equipment and storage medium

Also Published As

Publication number Publication date
CN114936555A (en) 2022-08-23

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CB03 Change of inventor or designer information

Inventor after: Hu Rong, Dai Lin, Yin Bangren, Cheng Biao, Yang Xuefeng, Zhou Bateer, Ren Fuqiang, Naren Grzyle, Yang Zhong, Chen Lei, Wang Hui, He Dongwei, Hou Zhuowei

Inventor before: Naren Grzyle, Chen Lei, Yang Zhong, Wang Hui, Dai Lin, Yin Bangren, Cheng Biao
