CN109062891B

CN109062891B - Media processing method, device, terminal and medium

Info

Publication number: CN109062891B
Application number: CN201810744442.5A
Authority: CN
Inventors: 杜若; 覃勋辉; 向海; 侯聪; 刘科
Original assignee: Chongqing Xiezhi Technology Co ltd
Current assignee: Beijing Star Cube Digital Technology Co ltd
Priority date: 2018-07-09
Filing date: 2018-07-09
Publication date: 2022-07-26
Anticipated expiration: 2038-07-09
Also published as: CN109062891A

Abstract

The embodiment of the invention discloses a media processing method, a device and a terminal, wherein the method comprises the following steps: receiving media information, wherein the media information comprises text description content, performing word segmentation processing on the text description content to obtain at least one word group, obtaining Chinese pinyin corresponding to each word group, identifying the word group corresponding to the Chinese pinyin as a dialect word group when the Chinese pinyin exists in a Chinese pinyin database, searching a standard language matched with the Chinese pinyin corresponding to the dialect word group in the Chinese pinyin database, and replacing the dialect word group with the standard language to obtain updated text description content. By the method, dialects in the text can be identified and replaced by the standard language, and communication efficiency is improved.

Description

Media processing method, device, terminal and medium

Technical Field

The present invention relates to the field of communications technologies, and in particular, to a media processing method, an apparatus, a terminal, and a computer-readable storage medium.

Background

Modern Chinese has a variety of widely distributed dialects, such as Northeastern dialects, Hunan dialects, and Min dialects. The differences between the dialects of modern Chinese appear in pronunciation, vocabulary and grammar, and are particularly prominent in pronunciation. Because of differences in regional culture, people in different regions use different dialects. When a user communicates through an instant messaging client or searches through a search engine, dialect voice information or dialect text information may be input. If the receiver is not familiar with the dialect, the dialect voice information or dialect text information may be misunderstood, which lowers communication efficiency. How to effectively translate dialects into a standard language is therefore a technical problem that urgently needs to be solved.

Disclosure of Invention

The embodiment of the invention provides a media processing method, a media processing device, a terminal and a computer readable storage medium, which can identify dialects in a text and replace the dialects in the text with standard languages, so that the communication efficiency is improved.

In a first aspect, an embodiment of the present invention provides a media processing method, where the method includes:

receiving media information, wherein the media information comprises text description content;

performing word segmentation processing on the text description content to obtain at least one word group;

obtaining Chinese pinyin corresponding to each phrase;

when the Chinese pinyin exists in the Chinese pinyin database, identifying the phrase corresponding to the Chinese pinyin as a dialect phrase;

searching a standard language matched with the Chinese pinyin corresponding to the dialect phrase in the Chinese pinyin database;

and replacing the dialect phrase with the standard language to obtain the updated text description content.

In a second aspect, the present invention provides a media processing apparatus, the apparatus comprising:

the receiving module is used for receiving media information, and the media information comprises text description content;

the word segmentation module is used for carrying out word segmentation processing on the text description content to obtain at least one word group;

the acquisition module is used for acquiring Chinese pinyin corresponding to each phrase;

the identification module is used for identifying the phrase corresponding to the Chinese pinyin as a dialect phrase when the Chinese pinyin exists in the Chinese pinyin database;

the searching module is used for searching a standard language matched with the Chinese pinyin corresponding to the dialect phrase in the Chinese pinyin database;

and the replacing module is used for replacing the dialect phrase with the standard language to obtain the updated text description content.

In the embodiment of the invention, a terminal receives media information that comprises text description content, and performs word segmentation processing on the text description content to obtain at least one word group. The terminal obtains the Chinese pinyin corresponding to each word group and, when the Chinese pinyin exists in a Chinese pinyin database, identifies the word group corresponding to the Chinese pinyin as a dialect word group. The terminal then searches the Chinese pinyin database for a standard language matched with the Chinese pinyin corresponding to the dialect word group, and replaces the dialect word group with the standard language to obtain updated text description content. By this method, dialects in the text can be identified and replaced by the standard language, so that communication efficiency is improved.

Drawings

In order to more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings needed to be used in the embodiments will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and it is obvious for those skilled in the art that other drawings can be obtained according to these drawings without creative efforts.

FIG. 1 is a schematic flowchart of a media processing method according to an embodiment of the present invention;

FIG. 2 is a schematic flowchart of another media processing method according to an embodiment of the present invention;

FIG. 3 is a schematic flowchart of another media processing method according to an embodiment of the present invention;

FIG. 4 is a network topology diagram of interaction between a terminal and a server according to an embodiment of the present invention;

FIG. 5 is a schematic structural diagram of a media processing device according to an embodiment of the present invention;

FIG. 6 is a schematic structural diagram of a terminal according to an embodiment of the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

The media processing method provided by the embodiment of the invention is applied to a media processing device. The media processing device may include a terminal and a server, and the terminal may include, but is not limited to, electronic devices such as a smart phone, a palmtop computer, an intelligent robot or a wearable device. The operating system of the terminal may include, but is not limited to, the Android operating system, the iOS operating system, the Symbian operating system, the BlackBerry operating system, the Windows Phone 8 operating system, and the like, which is not limited in the embodiment of the present invention.

Please refer to fig. 1, which is a flowchart illustrating a media processing method according to an embodiment of the present invention, where the method includes:

S101, receiving media information, wherein the media information comprises text description content.

In one implementation, the media information is text information, and the text information includes text description content in Chinese. For example, the text description content in the text information may be "今天啷个这么热" (roughly, "how is it so hot today", where "啷个" is a dialect word).

In one implementation, the media information is voice information, and after receiving the voice information, the media processing device may recognize content included in the voice information and store the recognized content in a text form to obtain text description content.

S102, performing word segmentation processing on the text description content to obtain at least one word group.

In the embodiment of the invention, the media processing device performs word segmentation processing on the text description content, and the at least one word group is the result of that processing.

In one implementation, the media processing device may perform word segmentation on the text description content by character matching: the Chinese character string to be analyzed is matched against the phrases in a preset database according to a preset rule. If a phrase in the character string is found in the preset database, the matching succeeds, and the matched phrase is split from the character string and determined as a word group. The character-matching algorithm may specifically be a forward maximum matching method, a reverse maximum matching method, a minimum segmentation method, a bidirectional maximum matching method, and the like.
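The character-matching approach can be sketched as follows. This is only an illustration under stated assumptions, not the implementation of the invention: the function names, the `max_len` parameter and the four-entry `VOCAB` set are hypothetical, the sample sentence "今天啷个这么热" is assumed from the pinyin "jintian/langge/zheme/re" used in the examples, and the tie-breaking rule in the bidirectional variant (prefer the result with fewer word groups) is one common heuristic rather than something the text specifies.

```python
def forward_max_match(text, vocab, max_len=4):
    """Scan left to right, taking the longest vocabulary entry at each position."""
    tokens, i = [], 0
    while i < len(text):
        # Try the longest candidate first; a single character is always accepted.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in vocab:
                tokens.append(piece)
                i += size
                break
    return tokens


def reverse_max_match(text, vocab, max_len=4):
    """Same idea, scanning right to left."""
    tokens, j = [], len(text)
    while j > 0:
        for size in range(min(max_len, j), 0, -1):
            piece = text[j - size:j]
            if size == 1 or piece in vocab:
                tokens.append(piece)
                j -= size
                break
    tokens.reverse()
    return tokens


def bidirectional_max_match(text, vocab, max_len=4):
    """Run both scans and keep the result with fewer word groups (ties go to reverse)."""
    fwd = forward_max_match(text, vocab, max_len)
    rev = reverse_max_match(text, vocab, max_len)
    return fwd if len(fwd) < len(rev) else rev


VOCAB = {"今天", "啷个", "这么", "热"}  # hypothetical preset database
print(forward_max_match("今天啷个这么热", VOCAB))
```

If the dialect word is missing from the preset database, the same functions fall back to single characters for it, which is the kind of fragmented result that a later confidence check can pick up.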

In one implementation, the media processing device determines whether to combine characters into a word group according to the frequency or probability with which the characters co-occur adjacently. Specifically, the media processing device counts the frequency of each combination of adjacent characters in the text description content, calculates the adjacent co-occurrence probability of the characters, and determines a character combination as a word group if its adjacent co-occurrence probability is greater than a preset threshold.
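A minimal sketch of the co-occurrence approach is given below. The text does not define the probability, so the sketch assumes one simple choice, P(b follows a) = count(ab) / count(a), and only merges pairs of characters. The function names, the threshold of 0.5 and the four-line corpus are hypothetical.

```python
from collections import Counter


def cooccurrence_probabilities(corpus):
    """Estimate P(b follows a) = count(ab) / count(a) for adjacent characters."""
    char_counts, pair_counts = Counter(), Counter()
    for line in corpus:
        char_counts.update(line)                 # count single characters
        pair_counts.update(zip(line, line[1:]))  # count adjacent character pairs
    return {pair: n / char_counts[pair[0]] for pair, n in pair_counts.items()}


def segment_by_cooccurrence(text, probs, threshold=0.5):
    """Merge two adjacent characters when their probability exceeds the threshold."""
    tokens, i = [], 0
    while i < len(text):
        pair = (text[i], text[i + 1]) if i + 1 < len(text) else None
        if pair is not None and probs.get(pair, 0.0) > threshold:
            tokens.append(text[i] + text[i + 1])
            i += 2
        else:
            tokens.append(text[i])
            i += 1
    return tokens


corpus = ["今天热", "今天冷", "这么热", "这么冷"]  # hypothetical statistics source
probs = cooccurrence_probabilities(corpus)
print(segment_by_cooccurrence("今天这么热", probs))
```

In this toy corpus "今" is always followed by "天", so the pair is merged, while "天" is followed by "热" only half of the time and stays separate.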

In one implementation, the media processing device implements word segmentation processing on the text description content by learning a word segmentation rule using a statistical machine learning model based on a large number of already segmented texts, so as to obtain at least one word group.

It should be noted that the media processing device may also perform word segmentation on the text description content in other ways, which is not limited herein. For example, after acquiring the text description content "今天啷个这么热" (roughly, "how is it so hot today"), the media processing device performs word segmentation to obtain the word groups "今天/啷个/这么/热".

S103, obtaining the Chinese pinyin corresponding to each phrase.

In the embodiment of the invention, after the media processing equipment performs word segmentation processing on the text, pinyin corresponding to each word group is obtained.

For example, if the word groups obtained by the media processing device by segmenting the text description content are "今天/啷个/这么/热" ("today", the dialect word "啷个", "so", "hot"), the media processing device obtains the pinyin of each word group and determines it to be "jintian/langge/zheme/re".

S104, when the Chinese pinyin exists in the Chinese pinyin database, identifying the phrase corresponding to the Chinese pinyin as a dialect phrase.

In the embodiment of the invention, after the media processing equipment acquires the Chinese pinyin corresponding to each phrase, whether the Chinese pinyin corresponding to the phrase in the word segmentation result exists is detected in the Chinese pinyin database, and if so, the phrase corresponding to the Chinese pinyin is identified as the dialect phrase. In a specific implementation, the media processing device may detect whether the pinyin exists in the pinyin database by using a bidirectional maximum matching algorithm, and after finding the pinyin corresponding to the phrase in the word segmentation result in the pinyin database, the media processing device identifies the phrase corresponding to the pinyin as a dialect phrase.

For example, if the media processing device determines that the Chinese pinyin corresponding to the word groups is "jintian/langge/zheme/re", and the Chinese pinyin database contains the pinyin "langge", the media processing device determines that the word group "啷个" corresponding to "langge" is a dialect phrase.

S105, searching a standard language matched with the Chinese pinyin corresponding to the dialect phrase in the Chinese pinyin database.

In the embodiment of the invention, the Chinese pinyin database contains dialect pinyins and the corresponding standard language. For example, the standard language corresponding to the dialect pinyin "langge" is "怎么" ("how"). It should be noted that the correspondence between dialect pinyin and standard language may be many-to-one, that is, a plurality of dialect pinyins may correspond to one standard-language expression.

In one implementation, the standard language may be a dialect of another region. The user may preset the type of the standard language, and the media processing device may select the corresponding Chinese pinyin library according to an instruction sent by the user, where the pinyin in different Chinese pinyin libraries corresponds to different dialects. For example, if the Chinese pinyin library is a Northeastern dialect library, the standard language corresponding to the Chinese pinyin in that library is the Northeastern dialect.

For example, the media processing device performs word segmentation on the received text information to obtain the word group "don't need to be young" (a Chongqing dialect expression whose Mandarin equivalent is "do not do so"). The media processing device obtains the pinyin of the word group, "buyaonenge". If the dialect library preset by the user is a Northeastern dialect library, the media processing device finds in that library the Northeastern dialect expression matched with the pinyin "buyaonenge", namely "do not know".

S106, replacing the dialect phrases with the standard language to obtain the updated text description content.

In the embodiment of the invention, after the media processing device determines the standard language corresponding to the dialect phrase, the dialect characters in the text are replaced by the corresponding standard language, so as to obtain the updated text description content.

For example, the media information is text information, and the text includes at least one Chinese paragraph. The text description content in the text information may be "今天啷个这么热" (roughly, "how is it so hot today"). The word groups obtained by word segmentation of the text are "今天/啷个/这么/热". The media processing device acquires the Chinese pinyin of each word group and determines it to be "jintian/langge/zheme/re". The pinyin "langge" is found in the Chinese pinyin library, and the corresponding standard language is "怎么" ("how"). The media processing device therefore replaces the word group "啷个" with "怎么", and the updated text information is "今天怎么这么热".

In one implementation, the media information may also be voice information, and the media processing device may be an intelligent robot. The user inputs the voice message "今天啷个这么热" ("how is it so hot today", spoken in dialect) to the intelligent robot. After receiving it, the intelligent robot converts the voice information into text information, performs dialect conversion on the text information by replacing the dialect vocabulary with the standard language, and obtains the updated text information "今天怎么这么热". Optionally, the robot outputs the converted text information as speech. Alternatively, the robot may respond to the voice information input by the user, for example by answering "it is hot today because today's solar term is Great Heat".

In one implementation, the media processing device is an electronic device such as a mobile phone or a computer, and a user communicates in a dialect. The media processing device converts the voice information input by the user into text information, performs word segmentation on the text description content, and judges from the pinyin of each word group whether the voice information contains dialect. If so, it replaces the dialect word groups with standard-language word groups to obtain the updated text description content, and outputs the updated text description content as speech.

In the embodiment of the present invention, a media processing device receives media information, where the media information includes text description content, performs word segmentation processing on the text description content to obtain at least one word group, the media processing device obtains chinese pinyins corresponding to the word groups, and when the chinese pinyins exist in a chinese pinyin database, the media processing device identifies the word groups corresponding to the chinese pinyins as dialect word groups, searches for a standard language matched with the chinese pinyins corresponding to the dialect word groups in the chinese pinyin database, and replaces the dialect word groups with the standard language to obtain updated text description content. By the method, dialects in the text can be identified and replaced by the standard language, and communication efficiency is improved.
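Steps S103 to S106 can be sketched as follows. This is a minimal illustration, not the implementation of the invention: the tables `PINYIN` and `DIALECT_DB` and the function `replace_dialect` are hypothetical, the word groups are assumed to be already segmented, the Chinese characters are assumed from the pinyin "jintian/langge/zheme/re" in the examples, and a real system would obtain pinyin from a pronunciation dictionary rather than from a four-entry table.

```python
# Hypothetical pinyin lookup for the word groups of the running example.
PINYIN = {"今天": "jintian", "啷个": "langge", "这么": "zheme", "热": "re"}

# Hypothetical Chinese pinyin database: dialect pinyin -> standard language.
DIALECT_DB = {"langge": "怎么"}


def replace_dialect(word_groups):
    """Replace every word group whose pinyin is in the dialect database."""
    updated = []
    for group in word_groups:
        pinyin = PINYIN.get(group)               # S103: pinyin of the word group
        if pinyin in DIALECT_DB:                 # S104: pinyin found, so dialect
            updated.append(DIALECT_DB[pinyin])   # S105 and S106: look up and replace
        else:
            updated.append(group)                # not dialect: keep unchanged
    return "".join(updated)


print(replace_dialect(["今天", "啷个", "这么", "热"]))  # 今天怎么这么热
```

Keying the database by pinyin rather than by characters means that different written forms of the same dialect pronunciation resolve to one entry, which appears to be the point of the pinyin step.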

Referring to fig. 2, a flow chart of another media processing method according to an embodiment of the invention is shown, where the method includes:

S201, the terminal receives media information, and the media information comprises text description content.

In the embodiment of the invention, a terminal receives media information, and in one implementation mode, the media information is voice information, the terminal identifies the voice information and translates the identified voice information into a text to obtain text description content; in one implementation mode, the media information is text information, and the terminal acquires text description contents in the text information.

S202, the terminal carries out word segmentation processing on the text description content to obtain at least one word group.

In the embodiment of the invention, the terminal performs word segmentation on the text description content to obtain at least one word group. In an implementation manner, the terminal may implement word segmentation processing on the text description content based on a character matching manner, or the terminal may determine whether to combine each character into a word group by using the frequency or probability of the adjacent co-occurrence of the character and the character, or the terminal implements word segmentation processing on the text description content by learning a word segmentation rule by using a statistical machine learning model based on a large amount of already-segmented texts, so as to obtain at least one word group. It should be noted that the terminal may also implement word segmentation processing on the text description content in other ways, and the present invention is not limited herein.

S203, the terminal obtains a first segmentation confidence coefficient of the text description content after the segmentation processing.

In the embodiment of the invention, after the terminal performs word segmentation on the text description content, it obtains a first word segmentation confidence of the segmented text description content. The first word segmentation confidence is essentially a marginal probability, that is, the probability of placing a segmentation boundary at a specific position in the text description content, and is a real number between 0 and 1.

In an optional implementation manner, the first segmentation confidence may also be determined based on similarity of results obtained by different segmentation algorithms, specifically, the terminal selects a target segmentation algorithm to perform segmentation processing on the text description content, after obtaining a first segmentation result, continues to perform segmentation processing on the text description content by using a check segmentation algorithm to obtain a second segmentation processing result, compares whether the first segmentation result is the same as the second segmentation result, and obtains the first segmentation confidence according to the similarity of the first segmentation result and the second segmentation result. It should be noted that the target word segmentation algorithm is the word segmentation algorithm adopted in step S202, and the check word segmentation algorithm is one or more of word segmentation algorithms such as a forward maximum matching method, a reverse maximum matching method, a bidirectional maximum matching method, and a machine learning method.

Optionally, when there is one check word segmentation algorithm, the similarity may be the ratio of the number of identical word groups in the first and second word segmentation results to the number of word groups in the first word segmentation result. For example, the text description content is "今天啷个这么热" (roughly, "how is it so hot today"); the first word segmentation result, obtained with the target word segmentation algorithm, is "今天/啷个/这么/热"; and the second word segmentation result, obtained with the check word segmentation algorithm, is "今天/啷/个/这么/热". The number of identical word groups in the two results is 3 and the number of word groups in the first result is 4, so the similarity is 75% and the confidence is 75%.
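The ratio described above can be computed as in the following sketch. The function name `segmentation_similarity` is hypothetical, the word groups are assumed from the pinyin "jintian/langge/zheme/re" in the examples, and counting identical word groups as a multiset intersection is an assumption, since the text does not say how repeated word groups are handled.

```python
from collections import Counter


def segmentation_similarity(first, second):
    """Share of the word groups in `first` that also occur in `second`."""
    if not first:
        return 0.0
    # Multiset intersection: a repeated word group counts once per matching occurrence.
    shared = sum((Counter(first) & Counter(second)).values())
    return shared / len(first)


first = ["今天", "啷个", "这么", "热"]        # target algorithm result
second = ["今天", "啷", "个", "这么", "热"]   # check algorithm result
print(segmentation_similarity(first, second))  # 0.75
```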

When there are multiple check word segmentation algorithms, the similarity can be obtained as the weighted sum of the similarities between the first word segmentation result and each second word segmentation result. For example, the check word segmentation algorithms are check word segmentation algorithm 1, check word segmentation algorithm 2 and check word segmentation algorithm 3, the corresponding second word segmentation results are second word segmentation result 1, second word segmentation result 2 and second word segmentation result 3, and the corresponding weights are 0.3, 0.3 and 0.4. If the similarity between the first word segmentation result and second word segmentation result 1 is 80%, the similarity with second word segmentation result 2 is 80%, and the similarity with second word segmentation result 3 is 90%, the final first word segmentation confidence is 0.3 × 80% + 0.3 × 80% + 0.4 × 90% = 84%.
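The weighted combination is a one-line computation. In the sketch below the function name `weighted_confidence` is hypothetical, and the weights are assumed to sum to 1 as they do in the example.

```python
def weighted_confidence(similarities, weights):
    """Weighted sum of the similarity to each check segmentation result."""
    if len(similarities) != len(weights):
        raise ValueError("one weight is needed per similarity")
    return sum(s * w for s, w in zip(similarities, weights))


confidence = weighted_confidence([0.80, 0.80, 0.90], [0.3, 0.3, 0.4])
print(round(confidence, 2))  # 0.84
```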

S204, if the confidence of the first word segmentation is smaller than a first preset threshold value, the terminal obtains the pinyin corresponding to each word group.

In the embodiment of the present invention, after obtaining the first word segmentation confidence of the segmented text description content, the terminal judges whether the first word segmentation confidence is less than a first preset threshold. The first preset threshold may be preset by a developer, for example 85%, 90% or 95%. If the first word segmentation confidence is greater than or equal to the first preset threshold, the process ends; if it is less than the first preset threshold, the terminal obtains the Chinese pinyin corresponding to each word group. For example, if the word groups obtained by the terminal by segmenting the text description content are "今天/啷个/这么/热" ("today", the dialect word "啷个", "so", "hot"), the terminal obtains the Chinese pinyin of each word group and determines it to be "jintian/langge/zheme/re".

S205, when the Chinese pinyin corresponding to a word group in the word segmentation result exists in the Chinese pinyin database, the terminal identifies the word group corresponding to the Chinese pinyin as a dialect word group.

In the embodiment of the invention, after the terminal acquires the Chinese pinyin corresponding to each word group, it detects whether the Chinese pinyin corresponding to a word group in the word segmentation result exists in the Chinese pinyin database, and if so, identifies the word group corresponding to the Chinese pinyin as a dialect word group. In a specific implementation, the terminal may detect whether the pinyin exists in the Chinese pinyin database by using a bidirectional maximum matching algorithm, and after finding the pinyin corresponding to a word group of the word segmentation result in the database, the terminal identifies that word group as a dialect word group.

S206, the terminal searches a standard language matched with the Chinese pinyin corresponding to the dialect phrase in the Chinese pinyin database.

In the embodiment of the invention, the pinyin of the dialect phrases and the corresponding standard language are stored in the Chinese pinyin database in advance. After the terminal identifies a dialect phrase in the word segmentation result, it searches the Chinese pinyin database for the standard language matched with the Chinese pinyin corresponding to the dialect phrase. It should be noted that the standard language may be Mandarin, Cantonese, Minnan, a Northeastern dialect, etc., or English, German, Italian, etc. Different languages correspond to different Chinese pinyin databases, and the user may preset the type of the Chinese pinyin database.

S207, the terminal replaces the dialect phrase with the standard language to obtain the updated text description content.

In the embodiment of the invention, after the terminal finds the standard language corresponding to the dialect phrase in the Chinese pinyin database, the dialect phrase in the text description content is replaced by the corresponding standard language to obtain the updated text description content.

For example, the standard language is Mandarin, and the word groups obtained by the terminal by segmenting the text description content are "今天/啷个/这么/热". The terminal obtains the pinyin of each word group and determines it to be "jintian/langge/zheme/re". Using a bidirectional maximum matching algorithm, the terminal finds the pinyin "langge" in the Chinese pinyin database and identifies the word group "啷个" as a dialect phrase. The standard language corresponding to the pinyin "langge" in the database is "怎么" ("how"), so the terminal replaces "啷个" in the text description content "今天啷个这么热" with "怎么" to obtain the updated text description content "今天怎么这么热" ("how is it so hot today").

S208, the terminal performs word segmentation processing on the updated text description content to obtain a second word segmentation confidence coefficient of the updated text description content.

In the embodiment of the present invention, after the updated text description content is obtained by the terminal, the updated text description content is subjected to word segmentation processing to obtain a second word segmentation confidence of the updated text description content, where a manner of word segmentation processing performed on the updated text description content by the terminal is the same as that in step S202, and is not described herein again.

S209, the terminal judges whether the second word segmentation confidence is greater than a second preset threshold.

After the terminal acquires the second segmentation confidence of the updated text description, it is determined whether the second segmentation confidence is greater than a second preset threshold, where the second preset threshold may be 85%, 90%, 95%, and the like, and may be specifically preset by a developer, and the embodiment of the present invention is not limited.

S210, if the second word segmentation confidence is greater than the second preset threshold, responding to the updated text description content.

In the embodiment of the invention, if the confidence of the second word segmentation of the updated text description content acquired by the terminal is greater than a second preset threshold, the updated text description content is responded. Specifically, the manner in which the terminal responds to the updated text description content may be to perform text output on the updated text description content, and the manner in which the terminal responds to the updated text description content may also be to convert the updated text description content into a voice message and perform voice output.

S211, if the confidence of the second word segmentation is smaller than or equal to the second preset threshold, responding to the text description content.

In the embodiment of the invention, if the second word segmentation confidence of the updated text description content acquired by the terminal is less than or equal to the second preset threshold, the terminal responds to the original text description content. Specifically, the terminal may respond to the text description content by outputting it as text, or by converting it into a voice message and outputting it as speech.

For example, the terminal obtains the text description content "今天啷个这么热" and replaces the dialect phrase with the standard language to obtain the updated text description content "今天怎么这么热" ("how is it so hot today"). The terminal segments the updated text description content to obtain the word segmentation result "今天/怎么/这么/热", and the second word segmentation confidence corresponding to this result is 92%. If the second preset threshold is 90%, the terminal determines that the second word segmentation confidence is greater than the second preset threshold and outputs the updated text description content "今天怎么这么热" as speech. If the second preset threshold is 95%, the terminal determines that the second word segmentation confidence is less than the second preset threshold and outputs the initial text description content "今天啷个这么热" as speech.

In the embodiment of the invention, the terminal performs word segmentation on the text description content in the media information and obtains a first word segmentation confidence corresponding to the word segmentation result. If the first word segmentation confidence is less than a first preset threshold, the terminal judges that dialect may exist in the text description content. It then obtains the pinyin corresponding to each word group in the word segmentation result and detects whether that pinyin exists in a preset Chinese pinyin database. If it does, the terminal preliminarily confirms the corresponding word group as a dialect word group and replaces it with the standard language to obtain updated text description content. The terminal then performs word segmentation on the updated text description content, obtains a second word segmentation confidence corresponding to the new word segmentation result, and judges whether the second word segmentation confidence is greater than a second preset threshold. If so, the terminal further confirms that dialect exists in the original text description content and responds to the updated text description content. If not, the terminal judges that replacing the dialect word group with the standard language did not improve the word segmentation, that the dialect judgment may have been wrong, and responds to the original text description content. By this method, dialects in the text can be identified and replaced by the standard language, and communication efficiency is improved.
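The two-threshold control flow of steps S202 to S211 can be sketched as follows. The function `respond` and its parameters are hypothetical, and the segmenter, confidence function and replacement function are passed in as stubs so that only the decision logic is shown. The sketch also assumes that when the first confidence already reaches the first threshold ("the process ends"), the original text is used unchanged. The stub scores reuse the 75% and 92% figures from the examples, and the characters are assumed from the pinyin "jintian/langge/zheme/re".

```python
def respond(text, segment, confidence, replace_dialect,
            first_threshold=0.90, second_threshold=0.90):
    """Return the text the terminal should respond to (steps S202 to S211)."""
    groups = segment(text)                                  # S202
    if confidence(text, groups) >= first_threshold:         # S203, S204
        return text  # assumption: the original text is used when the process ends
    updated = replace_dialect(groups)                       # S204 to S207
    second = confidence(updated, segment(updated))          # S208
    return updated if second > second_threshold else text   # S209 to S211


# Stub components (all hypothetical).
SCORES = {"今天啷个这么热": 0.75, "今天怎么这么热": 0.92}


def segment(text):
    return list(text)  # placeholder segmenter: one character per word group


def confidence(text, groups):
    return SCORES[text]  # placeholder: look the confidence up instead of computing it


def replace_dialect(groups):
    return "".join(groups).replace("啷个", "怎么")


print(respond("今天啷个这么热", segment, confidence, replace_dialect, second_threshold=0.90))
print(respond("今天啷个这么热", segment, confidence, replace_dialect, second_threshold=0.95))
```

With a second threshold of 90% the updated sentence is returned, and with 95% the original sentence is returned, which matches the worked example for steps S210 and S211.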

Referring to fig. 3, a schematic flow chart of another media processing method according to an embodiment of the invention is shown, where the method includes:

S301, the terminal receives media information, and the media information comprises text description content.

S302, the terminal sends the received media information to a server.

S303, the server carries out word segmentation processing on the text description content in the media information to obtain at least one word group.

S304, the server obtains a first word segmentation confidence of the text description content after word segmentation processing.

S305, the server determines that the first word segmentation confidence is smaller than a first preset threshold, and obtains the Chinese pinyin corresponding to each word group.

S306, when the Chinese pinyin corresponding to the word group in the word segmentation result exists in the Chinese pinyin database, the server identifies the word group corresponding to the Chinese pinyin as a dialect word group.

S307, the server searches a standard language matched with the Chinese pinyin corresponding to the dialect phrase in the Chinese pinyin database.

S308, the server replaces the dialect phrases with standard languages to obtain updated text description contents.

S309, the server carries out word segmentation processing on the updated text description content to obtain a second word segmentation confidence of the updated text description content.

S310, the server determines that the second word segmentation confidence is larger than a second preset threshold.

S311, the server sends the updated text description content to the terminal.

In the embodiment of the present invention, the terminal in step S311 may be the same terminal as the terminal in step S301, or may be another terminal that establishes a communication connection with the terminal in step S301.

In the embodiment of the invention, the terminal acquires the media information, the server replaces dialect characters of the media information to obtain a processing result, and the processing result is returned to the terminal.
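Steps S306 to S308 amount to a lookup keyed by pinyin followed by a substitution. The sketch below illustrates that idea under loud assumptions: both tables are hypothetical toy data (the phrases 啥子 and 晓得 are merely illustrative dialect words, not taken from the patent), and a real system would use a full pinyin converter and a curated database.

```python
from typing import Dict, List, Tuple

# Hypothetical toy converter table: phrase -> toneless pinyin.
PHRASE_TO_PINYIN: Dict[str, str] = {
    "啥子": "sha zi",
    "晓得": "xiao de",
    "今天": "jin tian",
}

# Hypothetical pinyin database: pinyin of a dialect phrase -> standard-language phrase.
PINYIN_DATABASE: Dict[str, str] = {
    "sha zi": "什么",
    "xiao de": "知道",
}


def replace_dialect(phrases: List[str]) -> Tuple[str, List[str]]:
    """Replace phrases whose pinyin is in the database.

    Returns the updated text and the list of phrases identified as dialect.
    """
    output: List[str] = []
    dialect_phrases: List[str] = []
    for phrase in phrases:
        pinyin = PHRASE_TO_PINYIN.get(phrase)
        standard = PINYIN_DATABASE.get(pinyin) if pinyin is not None else None
        if standard is not None:
            dialect_phrases.append(phrase)  # pinyin found: treat as dialect
            output.append(standard)
        else:
            output.append(phrase)  # unknown or standard phrase: keep as-is
    return "".join(output), dialect_phrases
```

Keying the database by pinyin rather than by characters means that different written forms of the same dialect pronunciation can resolve to one entry, which appears to be the motivation for the pinyin step.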

Fig. 4 is a schematic structural diagram of a media processing system according to the present invention. As shown in fig. 4, the system includes at least one first terminal 401, a server 402, and at least one second terminal 403. The first terminal 401 establishes a communication connection with the second terminal 403 through the server 402. In one implementation, a user may input media information including text description content at the first terminal 401, and the first terminal 401 sends the text description content to the server 402. The server 402 converts the dialect in the text description content into the standard language to obtain updated text description content and returns media information including the updated text description content to the first terminal 401, which responds to the received media information. Specifically, the first terminal 401 may respond by displaying the updated text description content or by outputting it by voice.

In one implementation, the first terminal 401 sends media information including text description content to the second terminal 403, of which there may be one or more. When the server 402 detects that the text description content sent by the first terminal 401 includes a dialect, it replaces the dialect in the text description content with the standard language to obtain updated text description content and sends media information including the updated text description content to the second terminal 403. After receiving the media information including the updated text description content, the second terminal 403 may display the updated text description content or output it by voice.

In this way, dialects in the text can be recognized during communication and replaced by the standard language, so that communication efficiency is improved.
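The relay role of the server 402 in fig. 4 can be sketched as below. The classes `Terminal` and `Server`, the `relay` method, and the `normalize` callable are all hypothetical names introduced for illustration; the sketch only shows that the server normalizes the text once and then delivers it to whichever terminals are addressed, including the sender itself.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Terminal:
    """Hypothetical terminal that just records what it receives."""
    terminal_id: str
    inbox: List[str] = field(default_factory=list)

    def receive(self, text: str) -> None:
        self.inbox.append(text)


@dataclass
class Server:
    """Hypothetical server that replaces dialect text before delivery."""
    normalize: Callable[[str], str]  # dialect -> standard language
    terminals: Dict[str, Terminal] = field(default_factory=dict)

    def register(self, terminal: Terminal) -> None:
        self.terminals[terminal.terminal_id] = terminal

    def relay(self, text: str, recipient_ids: List[str]) -> str:
        updated = self.normalize(text)  # normalize once per message
        for recipient_id in recipient_ids:
            self.terminals[recipient_id].receive(updated)
        return updated
```

Passing the sender's own identifier as the recipient corresponds to the first implementation (the result is returned to the first terminal 401), while passing one or more other identifiers corresponds to delivery to the second terminal 403.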

The media processing device provided by the embodiment of the invention will be described in detail with reference to fig. 5. It should be noted that the media processing device shown in fig. 5 is used for executing the methods of the embodiments of the present invention shown in figs. 1-3. For convenience of description, only the portions related to the embodiment of the present invention are shown; for technical details not disclosed here, refer to the embodiments of the present invention shown in figs. 1-3.

Referring to fig. 5, which is a schematic structural diagram of a media processing device according to the present invention, the media processing device 50 may include: a receiving module 501, a word segmentation module 502, an obtaining module 503, an identifying module 504, a searching module 505, and a replacing module 506.

A receiving module 501, configured to receive media information, where the media information includes text description content;

a word segmentation module 502, configured to perform word segmentation processing on the text description content to obtain at least one word group;

an obtaining module 503, configured to obtain the Chinese pinyin corresponding to each word group;

an identifying module 504, configured to identify the word group corresponding to the Chinese pinyin as a dialect word group when the Chinese pinyin exists in the Chinese pinyin database;

a searching module 505, configured to search the Chinese pinyin database for a standard language matched with the Chinese pinyin corresponding to the dialect word group;

a replacing module 506, configured to replace the dialect phrase with the standard language to obtain an updated text description content.

In one implementation, the obtaining module 503 is further configured to,

acquiring a first segmentation confidence coefficient of the text description content after the segmentation processing;

and if the first word segmentation confidence is smaller than a first preset threshold, triggering execution of the step of obtaining the Chinese pinyin corresponding to each word group.
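The claims describe deriving the first word segmentation confidence from the similarity between the results of a target word segmentation algorithm and a check word segmentation algorithm, without fixing the similarity measure. The sketch below assumes one plausible measure, the Jaccard similarity of the character spans produced by the two segmentations; the function names `spans` and `segmentation_confidence` are hypothetical.

```python
from typing import List, Set, Tuple


def spans(phrases: List[str]) -> Set[Tuple[int, int]]:
    """Convert a segmentation into a set of (start, end) character spans."""
    result: Set[Tuple[int, int]] = set()
    position = 0
    for phrase in phrases:
        result.add((position, position + len(phrase)))
        position += len(phrase)
    return result


def segmentation_confidence(first: List[str], second: List[str]) -> float:
    """Assumed measure: Jaccard similarity of the two span sets, in [0, 1]."""
    a, b = spans(first), spans(second)
    if not a and not b:
        return 1.0  # two empty segmentations agree trivially
    return len(a & b) / len(a | b)
```

Two identical segmentations score 1.0, and segmentations that disagree on every boundary score 0.0; the result can then be compared with the first preset threshold.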

In one implementation the apparatus further comprises a response module 507,

the word segmentation module 502 is further configured to perform word segmentation processing on the updated text description content to obtain a second word segmentation confidence of the updated text description content;

the response module 507 is configured to respond to the updated text description content if the second word segmentation confidence is greater than a second preset threshold.

In one implementation, the response module 507 is further configured to respond to the text description content if the second word segmentation confidence is smaller than or equal to the second preset threshold.

In one implementation, the searching module 505 is further configured to detect whether the Chinese pinyin exists in the Chinese pinyin database using a bidirectional maximum matching algorithm.
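Bidirectional maximum matching runs a forward and a backward longest-match scan and keeps the better of the two results. The sketch below applies it to a sequence of pinyin syllables against a set of database entries. It is an illustration under assumptions: the entry format (tuples of toneless syllables), the function names, and the tie-breaking rule (fewer tokens, then fewer single-syllable tokens, then the backward result) are common conventions and are not specified in this document.

```python
from typing import List, Sequence, Set, Tuple

Entry = Tuple[str, ...]  # hypothetical format: one database entry as syllables


def forward_match(syllables: Sequence[str], dictionary: Set[Entry],
                  max_len: int) -> List[Entry]:
    """Greedy longest match scanning left to right."""
    out: List[Entry] = []
    i = 0
    while i < len(syllables):
        for size in range(min(max_len, len(syllables) - i), 0, -1):
            candidate = tuple(syllables[i:i + size])
            if size == 1 or candidate in dictionary:
                out.append(candidate)
                i += size
                break
    return out


def backward_match(syllables: Sequence[str], dictionary: Set[Entry],
                   max_len: int) -> List[Entry]:
    """Greedy longest match scanning right to left."""
    out: List[Entry] = []
    j = len(syllables)
    while j > 0:
        for size in range(min(max_len, j), 0, -1):
            candidate = tuple(syllables[j - size:j])
            if size == 1 or candidate in dictionary:
                out.append(candidate)
                j -= size
                break
    out.reverse()
    return out


def bidirectional_match(syllables: Sequence[str],
                        dictionary: Set[Entry]) -> List[Entry]:
    """Prefer fewer tokens, then fewer single-syllable tokens, then backward."""
    max_len = max((len(entry) for entry in dictionary), default=1)
    fwd = forward_match(syllables, dictionary, max_len)
    bwd = backward_match(syllables, dictionary, max_len)
    if len(fwd) != len(bwd):
        return fwd if len(fwd) < len(bwd) else bwd
    singles_fwd = sum(1 for token in fwd if len(token) == 1)
    singles_bwd = sum(1 for token in bwd if len(token) == 1)
    return fwd if singles_fwd < singles_bwd else bwd


def dialect_hits(syllables: Sequence[str], dictionary: Set[Entry]) -> List[Entry]:
    """Return the matched tokens that are present in the pinyin database."""
    return [t for t in bidirectional_match(syllables, dictionary) if t in dictionary]
```

For instance, with the hypothetical entries `("sha", "zi")` and `("xiao", "de")`, the syllable sequence `ni xiao de sha zi` yields both entries as hits and leaves `ni` unmatched.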

In the embodiment of the present invention, the receiving module 501 receives media information, where the media information includes text description content, and the word segmentation module 502 performs word segmentation processing on the text description content to obtain at least one word group. The obtaining module 503 obtains the Chinese pinyin corresponding to each word group. When the Chinese pinyin exists in the Chinese pinyin database, the identifying module 504 identifies the word group corresponding to the Chinese pinyin as a dialect word group, and the searching module 505 searches the Chinese pinyin database for the standard language matched with the Chinese pinyin corresponding to the dialect word group. The replacing module 506 replaces the dialect word group with the standard language to obtain the updated text description content. In this way, dialects in the text can be identified and replaced by the standard language, so that communication efficiency is improved.

Fig. 6 is a schematic structural diagram of a media processing device according to an embodiment of the present invention. As shown in fig. 6, the media processing device includes at least one processor 601, an input device 603, an output device 604, a memory 605, and at least one communication bus 602. The communication bus 602 is used to enable connection and communication between these components. The input device 603 may be a control panel, a microphone, or the like, and the output device 604 may be a display screen or the like. The memory 605 may be a high-speed RAM memory or a non-volatile memory (e.g., at least one disk memory). The memory 605 may optionally be at least one storage device located remotely from the processor 601. The processor 601 may be combined with the media processing device described in fig. 5. The memory 605 stores a set of program codes, and the processor 601, the input device 603, and the output device 604 call the program codes stored in the memory 605 to perform the following operations:

an input device 603 configured to receive media information, where the media information includes textual description content;

a processor 601, configured to perform word segmentation processing on the text description content to obtain at least one word group;

a processor 601, configured to obtain the Chinese pinyin corresponding to each word group;

a processor 601, configured to identify the word group corresponding to the Chinese pinyin as a dialect word group when the Chinese pinyin exists in the Chinese pinyin database;

a processor 601, configured to search the Chinese pinyin database for a standard language matched with the Chinese pinyin corresponding to the dialect word group;

and the processor 601 is configured to replace the dialect phrase with the standard language to obtain updated text description content.

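In one implementation, before the processor 601 obtains the Chinese pinyin corresponding to each word group, the processor 601 is further configured to:

acquire a first word segmentation confidence of the text description content after the word segmentation processing;

and if the first word segmentation confidence is smaller than a first preset threshold, trigger execution of the step of obtaining the Chinese pinyin corresponding to each word group.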

In one implementation, the processor 601 is specifically configured to:

performing word segmentation processing on the updated text description content to obtain a second word segmentation confidence coefficient of the updated text description content;

and if the second word segmentation confidence is greater than a second preset threshold, responding to the updated text description content.

In one implementation, the processor 601 performs word segmentation processing on the updated text description content, and after obtaining a second word segmentation confidence of the updated text description content, is further configured to:

and responding to the text description content if the second word segmentation confidence is smaller than or equal to the second preset threshold.

In one implementation, the processor 601 is configured to detect whether the Chinese pinyin exists in the Chinese pinyin database using a bidirectional maximum matching algorithm.

In the embodiment of the present invention, media information is received through the input device 603, where the media information includes text description content. The processor 601 performs word segmentation processing on the text description content to obtain at least one word group and obtains the Chinese pinyin corresponding to each word group. When the Chinese pinyin exists in the Chinese pinyin database, the processor 601 identifies the word group corresponding to the Chinese pinyin as a dialect word group, searches the Chinese pinyin database for the standard language matched with the Chinese pinyin corresponding to the dialect word group, and replaces the dialect word group with the standard language to obtain updated text description content. In this way, dialects in the text can be recognized and replaced by the standard language, and communication efficiency is improved.

The modules in the embodiments of the present invention may be implemented by a general-purpose integrated circuit, such as a CPU (Central Processing Unit), or by an ASIC (Application Specific Integrated Circuit).

It should be understood that in the embodiment of the present invention, the Processor 601 may be a Central Processing Unit (CPU), and the Processor may also be other general purpose processors, Digital Signal Processors (DSPs), Application Specific Integrated Circuits (ASICs), Field Programmable Gate Arrays (FPGAs) or other Programmable logic devices, discrete Gate or transistor logic devices, discrete hardware components, etc. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.

The bus 602 may be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, an Extended ISA (EISA) bus, or the like. The bus 602 may be divided into an address bus, a data bus, a control bus, and the like. For convenience of illustration, fig. 6 shows the bus as a single thick line, but this does not mean that there is only one bus or only one type of bus.

It will be understood by those skilled in the art that all or part of the processes of the methods of the above embodiments may be implemented by a computer program, which may be stored in a computer storage medium and which, when executed, may include the processes of the method embodiments described above. The computer storage medium may be a magnetic disk, an optical disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), or the like.

The above disclosure describes only preferred embodiments of the present invention and is not intended to limit the scope of the claims; equivalent changes made in accordance with the claims of the present invention therefore still fall within the scope of the invention.

Claims

1. A method of media processing, the method comprising:

performing word segmentation processing on the text description content according to a target word segmentation algorithm to obtain a first word segmentation result, wherein the first word segmentation result comprises at least one word group;

performing word segmentation processing on the text description content according to a check word segmentation algorithm to obtain a second word segmentation result, wherein the second word segmentation result comprises at least one word group;

comparing at least one phrase included in the first segmentation result with at least one phrase included in the second segmentation result to obtain the similarity of the first segmentation result and the second segmentation result;

obtaining a first segmentation confidence coefficient of the first segmentation result according to the similarity;

if the first segmentation confidence is smaller than a first preset threshold, obtaining the pinyin corresponding to each phrase included in the first segmentation result;

2. The method according to claim 1, wherein after said replacing the dialect phrase with the standard language to obtain the updated text description content, the method further comprises:

and if the second word segmentation confidence is larger than a second preset threshold, responding to the updated text description content.

3. The method according to claim 2, wherein after performing word segmentation processing on the updated text description content to obtain a second word segmentation confidence of the updated text description content, the method further comprises:

and if the second word segmentation confidence is less than or equal to the second preset threshold, responding to the text description content.

4. The method of claim 1, wherein after obtaining the pinyin corresponding to each word group included in the first segmentation result, the method further comprises:

and detecting whether the Chinese pinyin exists in the Chinese pinyin database by adopting a bidirectional maximum matching algorithm.

5. A media processing apparatus, characterized in that the apparatus comprises:

the word segmentation module is used for carrying out word segmentation processing on the text description content according to a target word segmentation algorithm to obtain a first word segmentation result, and the first word segmentation result comprises at least one word group;

the word segmentation module is further used for performing word segmentation processing on the text description content according to a check word segmentation algorithm to obtain a second word segmentation result, and the second word segmentation result comprises at least one word group;

the obtaining module is used for comparing at least one phrase included in the first segmentation result with at least one phrase included in the second segmentation result to obtain the similarity between the first segmentation result and the second segmentation result;

the obtaining module is further used for obtaining a first word segmentation confidence of the first segmentation result according to the similarity;

the obtaining module is further configured to obtain pinyin corresponding to each phrase included in the first segmentation result if the first segmentation confidence is smaller than a first preset threshold;

6. The media processing device of claim 5, wherein the device further comprises a response module,

the word segmentation module is further configured to perform word segmentation processing on the updated text description content to obtain a second word segmentation confidence of the updated text description content;

and the response module is used for responding to the updated text description content if the second word segmentation confidence is greater than a second preset threshold.

7. A terminal, characterized in that it comprises a processor, an input device, an output device and a memory, said processor, input device, output device and memory being interconnected, wherein said memory is used to store a computer program comprising program instructions, said processor being configured to invoke said program instructions to perform the method according to any of the claims 1-4.

8. A computer-readable storage medium, characterized in that the computer storage medium stores a computer program comprising program instructions that, when executed by a processor, cause the processor to carry out the method according to any one of claims 1-4.