CN110970030A

CN110970030A - Voice recognition conversion method and system

Info

Publication number: CN110970030A
Application number: CN201911260985.0A
Authority: CN
Inventors: 蔡志成
Original assignee: AI Speech Ltd
Current assignee: AI Speech Ltd
Priority date: 2019-12-10
Filing date: 2019-12-10
Publication date: 2020-04-07

Abstract

The invention discloses a voice recognition conversion method comprising the following steps: acquiring voice data and performing voice recognition on the voice data to generate text information to be output; performing conversion analysis on the text information to be output to obtain a text to be converted; and judging whether the text to be converted meets a conversion condition, processing the text information to be output according to the judgment result, and generating and outputting a recognition result. The invention also discloses a voice recognition conversion system. With the disclosed method and system, the recognition result is more user-friendly and the user's voice interaction experience is improved.

Description

Voice recognition conversion method and system

Technical Field

The invention relates to the technical field of voice recognition, in particular to a voice recognition conversion method and a voice recognition conversion system.

Background

At present, speech recognition and text conversion in the domestic speech technology industry produces Chinese text only: everything converted from speech, including recognized numbers, is written in Chinese characters, which causes considerable inconvenience. For example, when a user speaks a sentence involving numbers, a string of Chinese numerals is returned after voice recognition, which greatly degrades the user experience.

Disclosure of Invention

To solve the above problems, the inventor proposes improving how the recognition result is analyzed once the acquired voice information has been recognized: the Chinese numerals in the recognition result are analyzed and, taking the context into account, intelligently converted into Arabic numerals, so that the recognition result is more user-friendly and the user's voice interaction experience is improved.

According to one aspect of the present invention, a speech recognition conversion method is provided, comprising the steps of: acquiring voice data and performing voice recognition on the voice data to generate text information to be output; performing conversion analysis on the text information to be output to obtain a text to be converted; and judging whether the text to be converted meets a conversion condition, processing the text information to be output according to the judgment result, and generating and outputting a recognition result. Judging the obtained text to be converted yields a text form better suited to the user's conversation scenario, which improves the user's voice interaction experience. This solves the prior-art problem that text is produced directly from the voice input alone, so that sentences calling for special handling, such as those with obvious numeric features, are inconvenient for the user to read.

In some embodiments, performing conversion analysis on the text information to be output to obtain the text to be converted is implemented as: screening the content of the text information to be output by keyword, and taking text information that contains Chinese numerals as the text to be converted. Locating the text type by keyword is accurate and simple to implement. It addresses the uniform, one-size-fits-all text conversion of the prior art, and in particular the poor user experience caused when a recognition result with obvious numeric features is still displayed as Chinese numerals.

In some embodiments, judging the conversion condition of the text to be converted, processing the text information to be output according to the judgment result, and generating and outputting the recognition result comprises: judging whether the text to be converted meets the conversion condition according to its Chinese numeral content and its number type; for a text to be converted that meets the conversion condition, converting it into the corresponding Arabic numeral content, replacing the text to be converted within the text information to be output with the converted Arabic numeral content, and generating and outputting the recognition result; or, for a text to be converted that does not meet the conversion condition, outputting the text information to be output unchanged as the recognition result. Because the text content is extracted according to the conversion condition, converted to Arabic numerals, and automatically substituted at its original position, adaptive text conversion is achieved automatically, conversion accuracy is improved, the result genuinely fits the scenario, and the user experience is greatly improved.

In some embodiments, the numeric type of the text to be converted that meets the conversion condition includes a date type, a floating point type, an integer type, a telephone number, and a percentage. Therefore, conversion of various digital types can be realized, and various context requirements of users can be met.

In some embodiments, judging the conversion condition of the text to be converted according to its Chinese numeral content and number type includes: pre-configuring and storing a number type conversion library, which comprises number types, the expression form corresponding to each number type, and a conversion mode; matching the Chinese numeral content of the text to be converted against the expression forms in the number type conversion library to determine its number type; and obtaining the conversion mode corresponding to the determined number type and judging, based on that conversion mode, whether the text to be converted meets the conversion condition. The number types, their expression forms, and their conversion modes can be configured according to user requirements or user habits. The expression form of the text to be converted is thus used to determine the number type it belongs to, and the conversion mode then reflects how each number type is conventionally expressed, so that the conversion result better matches the user's needs and reading habits. This greatly improves the flexibility and applicability of the text output by voice recognition and improves the user experience.

According to another aspect of the present invention, a speech recognition conversion system is provided, including: an information acquisition module for acquiring current voice data and performing voice recognition on the voice data to generate text information; a first conversion module for performing conversion analysis on the text information to obtain a text to be converted; and a second conversion module for judging whether the text to be converted meets a conversion condition, processing the text information to be output according to the judgment result, and generating and outputting a recognition result. After the first conversion module obtains the text to be converted, the second conversion module judges it, yielding a text form better suited to the user's conversation context. This improves the user's voice interaction experience and solves the prior-art problem that text is produced directly from the voice input alone, so that sentences calling for special handling, such as those with obvious numeric features, are inconvenient for the user to read.

In some embodiments, the first conversion module is configured to screen the content of the text information by keyword and to take text information containing Chinese numerals as the text to be converted. Locating the text type by keyword is accurate and simple to implement. It addresses the uniform, one-size-fits-all text conversion of the prior art, and in particular the poor user experience caused when a recognition result with obvious numeric features is still displayed as Chinese numerals.

In some embodiments, the second conversion module comprises: a judging unit for judging whether the first conversion result meets the conversion condition, and for passing a judgment result that meets the conversion condition to the first conversion output unit or a judgment result that does not meet it to the second conversion output unit; a first conversion output unit for, when the judgment result is that the conversion condition is met, converting the text to be converted into the corresponding Arabic numeral content, replacing the text to be converted within the text information to be output with the converted Arabic numeral content, and generating and outputting a recognition result; and a second conversion output unit for outputting the text information to be output as the recognition result when the judgment result is that the conversion condition is not met. When the text information to be output is processed, the text content is extracted according to the conversion condition and converted to Arabic numerals, and the first conversion output unit substitutes the Arabic numeral content in place and outputs it directly. Adaptive text conversion is thus achieved automatically, conversion accuracy is improved, the result genuinely fits the scenario, and the user experience is greatly improved.

According to another aspect of the present invention, an electronic device is provided, including: at least one processor, and a memory communicatively coupled to the at least one processor, wherein the memory stores instructions executable by the at least one processor, the instructions being executed by the at least one processor to enable the at least one processor to perform the steps of the above-described method.

According to a further aspect of the invention, a storage medium is provided, on which a computer program is stored which, when being executed by a processor, carries out the steps of the above-mentioned method.

Drawings

FIG. 1 is a flow chart of a speech recognition conversion method according to an embodiment of the present invention;

FIG. 2 is a functional block diagram of a speech recognition conversion system according to an embodiment of the present invention;

FIG. 3 is a schematic structural diagram of an electronic device according to an embodiment of the invention.

Detailed Description

The present invention will be described in further detail with reference to the accompanying drawings.

Fig. 1 schematically shows a speech recognition conversion method according to an embodiment of the present invention. As shown in Fig. 1, this embodiment includes the following steps:

Step S101: Acquire voice data, and perform voice recognition on the voice data to generate text information to be output. In this embodiment, the voice data is acquired by activating a voice detection device to pick up the user's speech. The received user speech is then recognized using a speech recognition engine or other existing technology to obtain a first recognition result, namely the text information to be output.

Step S102: Perform conversion analysis on the text information to be output to obtain a text to be converted. After the text information to be output is acquired, keyword screening is first performed on its content. Keyword screening itself can be implemented with existing techniques, and the keywords are user-definable; for example, the screening condition is set to Chinese numerals, and text information found to contain Chinese numerals is taken as the text to be converted. Illustratively, a standard regular expression can be built from the Unicode code points of the relevant Chinese characters and used to match keywords in the content of the text information, thereby performing a preliminary numeral screening of the text information to be output and locating the numeric content to be converted, namely the text to be converted.
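The preliminary screening in step S102 can be sketched as follows. This is a minimal illustration assuming Python and its standard `re` module; the character set `CN_NUMERAL_CHARS` and the function names are hypothetical choices rather than anything specified in this application, and literal characters are used in place of explicit Unicode escapes for readability.

```python
import re

# Assumed character set: Chinese digits plus common magnitude characters.
# The application only says the pattern is built from Chinese character
# Unicode codes; this particular set is an illustrative guess.
CN_NUMERAL_CHARS = "零〇一二两三四五六七八九十百千万亿"
CN_NUMERAL_RUN = re.compile(f"[{CN_NUMERAL_CHARS}]+")


def find_candidates(text: str) -> list[tuple[int, int, str]]:
    """Return (start, end, fragment) for each run of Chinese numeral characters."""
    return [(m.start(), m.end(), m.group()) for m in CN_NUMERAL_RUN.finditer(text)]


def contains_chinese_numerals(text: str) -> bool:
    """True if the text should be passed on as a 'text to be converted'."""
    return CN_NUMERAL_RUN.search(text) is not None


if __name__ == "__main__":
    print(find_candidates("我们认识一千四百二十天了"))
```

Recording the start and end offsets of each fragment is a design choice that makes the later in-place replacement straightforward.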

Step S103: Judge whether the text to be converted meets the conversion condition, process the text information to be output according to the judgment result, and generate and output a recognition result. After the text to be converted is obtained, its conversion condition is first judged according to its Chinese numeral content and number type, which can be implemented as follows. A number type conversion library is configured in advance and used for the judgment; the stored library illustratively comprises number types, the expression form corresponding to each number type, and a conversion mode. Once the text to be converted is obtained, its Chinese numeral content is matched against the expression forms in the number type conversion library to determine the corresponding number type. After the number type is determined, the corresponding conversion mode is obtained, and the text to be converted is judged to determine whether it needs to be converted, that is, whether it meets the conversion condition. The number types configured as eligible for conversion may illustratively include a date type, a floating point type, an integer type, a telephone number, and a percentage.
The expression form configured for each number type follows common habits of expression and may illustratively be as follows. For the percentage type, the expression form is configured to include keywords such as "percent", "per thousand", and "zero point". For the date type, the expression form may be configured to include keyword patterns such as year-month-day (e.g., a full date in October of the year "two zero one nine") and month-day (e.g., dates in July or August). For the floating point type, the expression form may be configured to include keywords such as "point" (e.g., "one hundred point three four zero eight") and the fraction marker (e.g., "one third"). For the telephone number type, the expression form may be configured as a digit string of specific length or specific content (e.g., "zero eight six"). For the integer type, the expression form may be configured as a regular expression conforming to the specification, covering for example fixed one-digit expressions (e.g., one, two … nine), fixed two-digit expressions (e.g., twenty, thirty-five), and fixed three-digit expressions (e.g., one hundred and one, two hundred and thirty-four).
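One possible shape for the number type conversion library (number type, expression form, conversion mode) is sketched below in Python. Everything here is an assumption made for illustration: the regular expressions, the `NumberType` record, the `classify` helper, and the mode labels `"direct"` and `"context"` do not come from this application, which states direct conversion only for the floating point and telephone number types; treating percentage and date as direct is a guess.

```python
import re
from dataclasses import dataclass
from typing import Optional

DIGITS = "零〇一二两三四五六七八九"
UNITS = "十百千万亿"


@dataclass(frozen=True)
class NumberType:
    name: str            # number type
    pattern: re.Pattern  # expression form, as a regular expression
    mode: str            # conversion mode: "direct" or "context" (hypothetical labels)


# Entries are tried in order, so more specific forms come first.
LIBRARY = [
    NumberType("percentage", re.compile(f"百分之[{DIGITS}{UNITS}点]+"), "direct"),
    NumberType("date", re.compile(f"[{DIGITS}]{{2,4}}年"), "direct"),
    NumberType("float", re.compile(f"[{DIGITS}{UNITS}]+点[{DIGITS}]+"), "direct"),
    NumberType("phone", re.compile(f"[{DIGITS}幺]{{7,11}}"), "direct"),
    NumberType("integer", re.compile(f"[{DIGITS}{UNITS}]+"), "context"),
]


def classify(fragment: str) -> Optional[NumberType]:
    """Return the first library entry whose expression form matches the whole fragment."""
    for entry in LIBRARY:
        if entry.pattern.fullmatch(fragment):
            return entry
    return None


if __name__ == "__main__":
    for sample in ["百分之七十", "二零零八年", "一百点三四零八", "一千四百二十"]:
        entry = classify(sample)
        print(sample, entry.name if entry else None)
```

Keeping the library as plain data means new number types or user-specific habits could be added without touching the matching logic.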
To make conversion more accurate, a conversion mode is configured for each number type, based on experience and in keeping with user habits. Illustratively, for the integer type (where the length of the Chinese text is greater than or equal to 2), the left and right context of the numeral segment in the text must also be examined, and whether the conversion condition is met is judged from that context, for example by checking whether the context contains an explicit measurement unit corresponding to a measurement unit configured in the conversion mode. For units unsuited to Arabic numerals, such as the Chinese unit words for "ruler" (chi), "bar", "grain", "root", and "pile", the conversion condition is judged not to be met. For units suited to conversion, whether the condition is met is judged by scenario; for example, for certain Chinese unit words the condition is judged to be met only when the numeral text has two or more digits. For the floating point type or the telephone number type, by contrast, the conversion mode is configured as direct conversion, that is, the conversion condition is judged to be met directly.
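The context check for integers might look like the following Python sketch. The set `NON_CONVERTIBLE_UNITS` is a hypothetical back-translation of the unit words listed above, the function name is invented, and leaving single-character numerals unconverted is an assumption, since the text only describes the case where the Chinese text length is at least 2.

```python
# Hypothetical back-translation of the unit words named in the text
# ("ruler", "bar", "grain", "root", "pile"); the real configured list may differ.
NON_CONVERTIBLE_UNITS = {"尺", "条", "粒", "根", "堆"}


def integer_meets_condition(text: str, start: int, end: int) -> bool:
    """Judge whether the integer fragment text[start:end] should become Arabic digits.

    Only the character immediately to the right is inspected here; a fuller
    version would also look at the left context, as the description requires.
    """
    fragment = text[start:end]
    if len(fragment) < 2:
        # Assumption: single-character numerals are left as Chinese text.
        return False
    following = text[end:end + 1]  # empty string when the fragment ends the text
    return following not in NON_CONVERTIBLE_UNITS


if __name__ == "__main__":
    print(integer_meets_condition("我们认识一千四百二十天了", 4, 10))
    print(integer_meets_condition("飞流直下三千尺", 4, 6))
```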

The judgment is thus based on the combination of the number type and the Chinese numeral content, where the Chinese numeral content used in the judgment condition reflects established reading habits, so the judgment result is more accurate and better matches the user's habits. The text to be converted that meets the conversion condition is then converted into the corresponding Arabic numeral content, the text to be converted within the text information to be output is replaced with the converted Arabic numeral content, and the recognition result is generated and output. For a text to be converted that does not meet the conversion condition, the text information to be output is output directly as the recognition result without conversion.

Illustratively, when the acquired text to be converted is "we have known each other for one thousand four hundred and twenty days, right", then according to the Chinese numeral content and the configuration of the number type conversion library, a typical user finds a day count in Chinese characters easier to read as an Arabic numeral. The numeral is therefore judged to be a text to be converted that meets the conversion condition, "one thousand four hundred and twenty" is converted into "1420" and substituted at the same position, and the recognition result is: "we have known each other for 1420 days, right".

When the obtained text to be converted is "today's winning rate is seventy percent", then according to the Chinese numeral content and the configuration of the number type conversion library, a typical user finds a percentage in Chinese characters easier to read as an Arabic numeral. The numeral is therefore judged to be a text to be converted that meets the conversion condition, "seventy percent" is converted into "70%" and substituted at the same position, and the recognition result is: "today's winning rate is 70%".
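The two examples above (a day count and a percentage) can be reproduced on the underlying Chinese text with a short Python sketch. The Chinese sentences are assumed reconstructions of the examples, the function names are hypothetical, and the converter only handles positional numerals below ten thousand; it is not presented as the conversion routine of this application.

```python
DIGIT_VALUES = {"零": 0, "〇": 0, "一": 1, "二": 2, "两": 2, "三": 3, "四": 4,
                "五": 5, "六": 6, "七": 7, "八": 8, "九": 9}
UNIT_VALUES = {"十": 10, "百": 100, "千": 1000}


def chinese_to_int(numeral: str) -> int:
    """Convert a positional Chinese numeral below ten thousand, e.g. 一千四百二十."""
    total, digit = 0, 0
    for ch in numeral:
        if ch in DIGIT_VALUES:
            digit = DIGIT_VALUES[ch]
        elif ch in UNIT_VALUES:
            # A bare unit such as 十 in 十五 counts as one of that unit.
            total += (digit or 1) * UNIT_VALUES[ch]
            digit = 0
        else:
            raise ValueError(f"unsupported character: {ch!r}")
    return total + digit


def to_arabic(fragment: str) -> str:
    """Convert an integer or percentage fragment into Arabic numeral text."""
    if fragment.startswith("百分之"):
        return f"{chinese_to_int(fragment[3:])}%"
    return str(chinese_to_int(fragment))


def replace_fragment(text: str, fragment: str) -> str:
    """Substitute the converted fragment at its original position (first occurrence)."""
    return text.replace(fragment, to_arabic(fragment), 1)


if __name__ == "__main__":
    print(replace_fragment("我们认识一千四百二十天了", "一千四百二十"))
    print(replace_fragment("今天中奖率是百分之七十", "百分之七十"))
```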

When the obtained text to be converted is "the ice disaster of the year two zero zero eight was very severe", then according to the Chinese numeral content and the configuration of the number type conversion library, a typical user finds a year, month, or day in Chinese characters easier to read as Arabic numerals. The numeral is therefore judged to be a text to be converted that meets the conversion condition, the year "two zero zero eight" is converted into "2008" and substituted at the same position, and the recognition result is: "the ice disaster of 2008 was very severe".
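Years are usually spoken digit by digit, so they can be converted with a simple character mapping rather than positional arithmetic. The Python sketch below shows this under stated assumptions: the Chinese sentence is a reconstruction of the example above, and the pattern `YEAR` and the function `convert_years` are hypothetical names.

```python
import re

# Map each Chinese digit character to its Arabic digit.
DIGIT_TABLE = str.maketrans("零〇一二三四五六七八九", "00123456789")

# Two to four Chinese digits immediately followed by the character for "year".
YEAR = re.compile("[零〇一二三四五六七八九]{2,4}(?=年)")


def convert_years(text: str) -> str:
    """Replace digit-by-digit years with Arabic digits, keeping the year marker."""
    return YEAR.sub(lambda m: m.group().translate(DIGIT_TABLE), text)


if __name__ == "__main__":
    print(convert_years("二零零八年冰灾好严重"))
```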

However, some types of text should, given users' reading habits, not be converted into Arabic numerals; that is, the text to be converted does not meet the conversion condition, and the text information to be output is output as the recognition result. Preferably, poetry containing Chinese numerals is not converted, since users habitually read it in its original form. To this end, a poetry number type can be configured in the number type conversion library together with existing lines of verse that contain numerals, so that when poetry-type text to be output is obtained, matching against the content of the number type conversion library determines whether it is poetry and whether the conversion condition is met. Illustratively, when the acquired text to be converted is the line of classical verse "the flying torrent plunges straight down three thousand chi", then according to the Chinese numeral content, a typical user finds numerals in classical poetry easier to read as Chinese characters. The text is therefore judged not to require conversion, the numerals are kept as they are, and the recognition result is output directly: "the flying torrent plunges straight down three thousand chi".
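The poetry exception could be handled with a lookup against configured verse lines before any conversion is attempted. The Python sketch below is illustrative only: `POEM_LINES` holds two well-known lines by Li Bai as sample entries, and the function names and the injected `convert` callable are hypothetical.

```python
from typing import Callable

# Sample configured verse lines that contain numerals (illustrative entries only).
POEM_LINES = {"飞流直下三千尺", "桃花潭水深千尺"}


def is_poetry(text: str) -> bool:
    """True if the text contains any configured verse line."""
    return any(line in text for line in POEM_LINES)


def output_for(text: str, convert: Callable[[str], str]) -> str:
    """Return the text unchanged for poetry; otherwise apply the supplied conversion."""
    if is_poetry(text):
        return text
    return convert(text)


if __name__ == "__main__":
    # Stand-in conversion for the demo; a real converter would go here.
    demo_convert = lambda t: t.replace("三千", "3000")
    print(output_for("飞流直下三千尺", demo_convert))
    print(output_for("走了三千步", demo_convert))
```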

With the method provided by this embodiment, judging the obtained text to be converted yields a text form better suited to the user's conversation context, which improves the user's voice interaction experience and solves the prior-art problem that text is produced directly from the voice input alone, so that sentences with obvious numeric features are inconvenient for the user to read. Of course, in other embodiments, the screening keywords and the configured conversion library may be set as required with reference to the concept of the above method, so as to convert other specific content on demand; the embodiments of the present invention are not limited in this respect.

Fig. 2 schematically shows a block diagram of a speech recognition conversion system according to an embodiment of the present invention. As shown in Fig. 2, the speech recognition conversion system of this embodiment includes an information acquisition module 1, a first conversion module 2, and a second conversion module 3. The information acquisition module 1 is configured to acquire current voice data and perform voice recognition on the voice data to generate text information; it may be implemented as a voice recognition engine configured to recognize and convert the picked-up content. The first conversion module 2 is used for performing conversion analysis on the text information to obtain a text to be converted. The second conversion module 3 is used for judging whether the text to be converted meets the conversion condition, processing the text information to be output according to the judgment result, and generating and outputting the recognition result. The first conversion module 2 is configured to perform keyword screening on the content of the text information to obtain the text to be converted; the keyword screening may use a regular expression covering Chinese numerals, so that the text to be converted screened in this way is text content containing Chinese numerals. Whether the text to be converted meets the conversion condition can be configured through the relevant parameters of a number type conversion library. The number types to be screened may illustratively be configured to include a date type, a floating point type, an integer type, a telephone number, and a percentage; the screening may be done by configuring regular expressions that meet the requirements; and the conversion mode configured for each number type, based on experience and user habits, serves as the manner and basis of the judgment. For the specific working principle and processing procedure of the first conversion module 2 and the second conversion module 3, refer to the description of the method above; they are not repeated here.

The second conversion module 3 includes a judging unit 301, a first conversion output unit 302, and a second conversion output unit 303. The judging unit 301 is configured to judge whether the first conversion result meets the conversion condition, and to pass a judgment result that meets the conversion condition to the first conversion output unit 302 or a judgment result that does not meet it to the second conversion output unit 303. The first conversion output unit 302 is configured to, when the judgment result is that the conversion condition is met, convert the text to be converted into the corresponding Arabic numeral content, replace the text to be converted within the text information to be output with the converted Arabic numeral content, and generate and output a recognition result. The second conversion output unit 303 is configured to output the text information to be output as the recognition result when the judgment result is that the conversion condition is not met. For the specific processing performed by the units of the second conversion module 3, refer to the description of the method above.
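The division of the second conversion module into a judging unit and two output units can be sketched as a small Python class. This is only an illustration of the control flow: the class and method names are invented, and the judgment and conversion logic are injected as callables rather than implemented, since their details are configuration dependent.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class SecondConversionModule:
    # Judging unit 301: decides whether a fragment of the text meets the condition.
    meets_condition: Callable[[str, str], bool]
    # Conversion used by the first conversion output unit 302.
    to_arabic: Callable[[str], str]

    def process(self, text: str, fragment: Optional[str]) -> str:
        """Route the text to the first or second conversion output unit."""
        if fragment is not None and self.meets_condition(text, fragment):
            return self._first_output(text, fragment)
        return self._second_output(text)

    def _first_output(self, text: str, fragment: str) -> str:
        # Unit 302: convert and substitute in place.
        return text.replace(fragment, self.to_arabic(fragment), 1)

    def _second_output(self, text: str) -> str:
        # Unit 303: output the text unchanged.
        return text


if __name__ == "__main__":
    module = SecondConversionModule(
        meets_condition=lambda text, frag: frag != "三千",
        to_arabic=lambda frag: {"一千四百二十": "1420"}.get(frag, frag),
    )
    print(module.process("我们认识一千四百二十天了", "一千四百二十"))
    print(module.process("飞流直下三千尺", "三千"))
```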

With the system provided by this embodiment, after the first conversion module obtains the text to be converted, the second conversion module judges it, yielding a text form better suited to the user's conversation context. This improves the user's voice interaction experience and solves the prior-art problem that text is produced directly from the voice input alone, so that sentences with obvious numeric features are inconvenient for the user to read.

In some embodiments, the present invention provides a non-transitory computer-readable storage medium, in which one or more programs including executable instructions are stored, and the executable instructions can be read and executed by an electronic device (including but not limited to a computer, a server, or a network device, etc.) to perform the above-mentioned voice recognition conversion method of the present invention.

In some embodiments, the present invention further provides a computer program product comprising a computer program stored on a non-volatile computer-readable storage medium, the computer program comprising program instructions which, when executed by a computer, cause the computer to perform the above-mentioned method of speech recognition conversion.

In some embodiments, an embodiment of the present invention further provides an electronic device, which includes: at least one processor, and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of speech recognition conversion described above.

In some embodiments, the present invention further provides a storage medium, on which a computer program is stored, which when executed by a processor is capable of performing the above-mentioned method of speech recognition conversion.

The voice recognition conversion apparatus of the above embodiment of the present invention can be used to execute the voice recognition conversion method of the above embodiment and accordingly achieves the technical effects of that method, which are not repeated here. In the embodiments of the present invention, the relevant functional modules may be implemented by a hardware processor.

Fig. 3 is a schematic hardware structure diagram of an electronic device for performing the speech recognition conversion method according to another embodiment of the present application. As shown in Fig. 3, the electronic device includes one or more processors 510 and a memory 520; one processor 510 is taken as an example in Fig. 3.

The device for performing the speech recognition conversion method may further include: an input device 530 and an output device 540.

The processor 510, the memory 520, the input device 530, and the output device 540 may be connected by a bus or by other means; a bus connection is taken as an example in Fig. 3.

The memory 520, which is a non-volatile computer-readable storage medium, may be used to store non-volatile software programs, non-volatile computer-executable programs, and modules, such as program instructions/modules corresponding to the method for speech recognition conversion in the embodiments of the present application. The processor 510 executes various functional applications of the server and data processing, i.e., a method of implementing voice recognition conversion in the above-described method embodiments, by executing nonvolatile software programs, instructions, and modules stored in the memory 520.

The memory 520 may include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function; the storage data area may store data created according to use of the device for voice recognition conversion, and the like. Further, the memory 520 may include high speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid state storage device. In some embodiments, memory 520 may optionally include memory located remotely from processor 510, which may be connected to a speech recognition conversion device via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.

The input device 530 may receive input numeric or character information and generate signals related to user settings and function control of the speech recognition conversion device. The output device 540 may include a display device such as a display screen.

The one or more modules described above are stored in the memory 520 and, when executed by the one or more processors 510, perform the method of speech recognition conversion in any of the method embodiments described above.

The product can execute the method provided by the embodiment of the application, and has the corresponding functional modules and beneficial effects of the execution method. For technical details that are not described in detail in this embodiment, reference may be made to the methods provided in the embodiments of the present application.

The electronic device of the embodiments of the present application exists in various forms, including but not limited to:

(1) Mobile communication devices, which are characterized by mobile communication capabilities and are primarily aimed at providing voice and data communication. Such terminals include smart phones (e.g., iPhone), multimedia phones, feature phones, and low-end phones, among others.

(2) Ultra-mobile personal computer devices, which belong to the category of personal computers, have computing and processing functions, and generally also support mobile internet access. Such terminals include PDA, MID, and UMPC devices, such as the iPad.

(3) Portable entertainment devices, which can display and play multimedia content. Such devices include audio and video players (e.g., iPod), handheld game consoles, e-book readers, as well as smart toys and portable car navigation devices.

(4) Servers, whose architecture is similar to that of a general-purpose computer but which have higher requirements for processing capability, stability, reliability, security, scalability, manageability, and the like, because they need to provide highly reliable services.

(5) Other electronic devices with data interaction functions.

The above-described embodiments of the apparatus are merely illustrative. The units described as separate parts may or may not be physically separate, and the parts displayed as units may or may not be physical units; that is, they may be located in one place or distributed across a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment.

Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a general-purpose hardware platform, and can of course also be implemented by hardware. Based on such understanding, the above technical solutions, in essence or in the part that contributes to the related art, may be embodied in the form of a software product, which may be stored in a computer-readable storage medium, such as a ROM/RAM, a magnetic disk, or an optical disk, and which includes instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to execute the method according to the embodiments or some parts of the embodiments.

Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of the present application, and not to limit them. Although the present application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features may be equivalently replaced, and that such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions in the embodiments of the present application.

What has been described above are merely some embodiments of the present invention. It will be apparent to those skilled in the art that various changes and modifications can be made without departing from the inventive concept thereof, and these changes and modifications can be made without departing from the spirit and scope of the invention.

Claims

1. A speech recognition conversion method, comprising the steps of:

acquiring voice data, and performing voice recognition on the voice data to generate text information to be output;

performing conversion analysis on the text information to be output to obtain a text to be converted;

and judging the conversion condition of the text to be converted, processing the text information to be output according to the judgment result, and generating and outputting a recognition result.

2. The method according to claim 1, wherein performing conversion analysis on the text information to be output to obtain the text to be converted is implemented as:

performing keyword screening on the content of the text information to be output, and acquiring the text information containing Chinese numerals as the text to be converted.

3. The method according to claim 2, wherein judging the conversion condition of the text to be converted, processing the text information to be output according to the judgment result, and generating and outputting the recognition result comprises:

judging the conversion condition of the text to be converted according to the Chinese numeral content and the number type of the text to be converted, and, according to the judgment result,

converting the text to be converted which meets the conversion condition into corresponding Arabic numeral content, replacing the text to be converted in the text information to be output with the Arabic numeral content, and generating and outputting the recognition result; or

for the text to be converted which does not meet the conversion condition, outputting the text information to be output as the recognition result.

4. The method of claim 3, wherein the number types of the text to be converted that meet the conversion condition include a date type, a floating point type, an integer type, a telephone number, and a percentage.

5. The method according to claim 4, wherein judging the conversion condition of the text to be converted according to the Chinese numeral content and the number type of the text to be converted is implemented as:

pre-configuring and storing a number type conversion library, wherein the number type conversion library comprises number types and, for each number type, a corresponding expression form and conversion mode;

matching the Chinese numeral content of the text to be converted against the expression forms in the number type conversion library to determine the number type;

and acquiring the corresponding conversion mode according to the determined number type, and judging, based on the conversion mode, whether the text to be converted meets the conversion condition.
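
By way of non-limiting illustration, the screening, type matching, and conversion steps of claims 2 to 5 might be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the implementation of the invention: the names `Rule`, `LIBRARY`, `SPAN`, `integer_text`, and `convert_text`, the regular expressions used as expression forms, and the rule that a lone numeral is not converted are all hypothetical. The date type is omitted for brevity, and integers are assumed to be below 10^8.

```python
import re
from dataclasses import dataclass
from typing import Callable

DIGITS = {"零": 0, "〇": 0, "一": 1, "二": 2, "两": 2, "三": 3, "四": 4,
          "五": 5, "六": 6, "七": 7, "八": 8, "九": 9}
UNITS = {"十": 10, "百": 100, "千": 1000}
D = "零〇一二三四五六七八九"      # characters read digit by digit
N = D + "两十百千万"             # characters allowed in an integer

# Screening step (claim 2): find spans that contain Chinese numerals.
SPAN = re.compile(f"(?:百分之)?[{N}点]+")


def integer_text(s: str) -> str:
    """Convert a Chinese integer (assumed below 10^8) to Arabic digits."""
    if all(c in DIGITS for c in s):          # digit-by-digit reading, e.g. 二零一九
        return "".join(str(DIGITS[c]) for c in s)
    total = section = digit = 0
    for ch in s:
        if ch in DIGITS:
            digit = DIGITS[ch]
        elif ch in UNITS:                    # a bare 十 counts as 10
            section += (digit or 1) * UNITS[ch]
            digit = 0
        else:                                # ch == "万"
            total += (section + digit) * 10000
            section = digit = 0
    return str(total + section + digit)


@dataclass
class Rule:
    number_type: str
    expression: re.Pattern                   # expression form
    convert: Callable[[re.Match], str]       # conversion mode
    condition: Callable[[str], bool] = lambda span: True


# Hypothetical type conversion library; earlier rules take precedence.
LIBRARY = [
    Rule("percentage", re.compile(f"百分之([{N}]+)"),
         lambda m: integer_text(m.group(1)) + "%"),
    Rule("floating point", re.compile(f"([{N}]+)点([{D}]+)"),
         lambda m: integer_text(m.group(1)) + "." + integer_text(m.group(2))),
    Rule("telephone number", re.compile(f"[{D}]{{7,11}}"),
         lambda m: integer_text(m.group(0))),
    Rule("integer", re.compile(f"[{N}]+"),
         lambda m: integer_text(m.group(0)),
         # Assumed condition: skip lone numerals, which often belong to words such as 一起.
         condition=lambda span: len(span) >= 2),
]


def convert_text(text: str) -> str:
    """Replace qualifying Chinese numeral spans with Arabic numerals."""
    def replace(match: re.Match) -> str:
        span = match.group(0)
        for rule in LIBRARY:
            hit = rule.expression.fullmatch(span)
            if hit and rule.condition(span):
                return rule.convert(hit)
        return span                          # no rule applies: keep the original text
    return SPAN.sub(replace, text)
```

Under these assumptions, `convert_text("增长了百分之五十")` returns `"增长了50%"`, while `convert_text("我们一起去")` is returned unchanged because the lone numeral does not meet the assumed conversion condition.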

6. A speech recognition conversion system, comprising:

the information acquisition module is used for acquiring current voice data and carrying out voice recognition on the voice data to generate text information;

the first conversion module is used for carrying out conversion analysis on the text information to obtain a text to be converted;

and the second conversion module is used for judging the conversion condition of the text to be converted, processing the text information to be output according to the judgment result, and generating and outputting a recognition result.

7. The system as claimed in claim 6, wherein the first conversion module is configured to perform keyword screening according to the content of the text information, and obtain the text information containing Chinese numerals as the text to be converted.

8. The system of claim 6, wherein the second conversion module comprises:

the judging unit is used for judging the conversion condition of the first conversion result and outputting the judgment result meeting the conversion condition to the following first conversion output unit or outputting the judgment result not meeting the conversion condition to the following second conversion output unit;

the first conversion output unit is used for, when the judgment result is that the conversion condition is met, converting the text to be converted into corresponding Arabic numeral content, replacing the text to be converted in the text information to be output with the converted Arabic numeral content, and generating and outputting a recognition result;

and the second conversion output unit is used for outputting the text information to be output as the recognition result when the judgment result is that the conversion condition is not met.

9. An electronic device, comprising: at least one processor, and a memory communicatively coupled to the at least one processor, wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the steps of the method of any one of claims 1-5.

10. A storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, carries out the steps of the method of any one of claims 1 to 5.
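
By way of non-limiting illustration, the module structure recited in the system claims (an information acquisition module, a first conversion module, and a second conversion module with a judging unit and two conversion output units) might be wired together as in the following Python sketch. All class, function, and variable names are hypothetical. The recognizer is a stand-in that merely decodes UTF-8 bytes, and the screening, condition, and conversion callables are simplified placeholders chosen only so that the example runs.

```python
import re
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class InformationAcquisitionModule:
    recognize: Callable[[bytes], str]        # hypothetical speech recognizer

    def acquire(self, voice_data: bytes) -> str:
        return self.recognize(voice_data)


@dataclass
class FirstConversionModule:
    screen: Callable[[str], Optional[str]]   # returns the text to be converted, if any

    def analyze(self, text: str) -> Optional[str]:
        return self.screen(text)


@dataclass
class SecondConversionModule:
    meets_condition: Callable[[str], bool]   # judging unit
    to_arabic: Callable[[str], str]

    def process(self, text: str, candidate: Optional[str]) -> str:
        if candidate is not None and self.meets_condition(candidate):
            # First conversion output unit: replace the span with Arabic numerals.
            return text.replace(candidate, self.to_arabic(candidate))
        # Second conversion output unit: output the text unchanged.
        return text


# Illustrative wiring with stand-in components (assumptions, not the patented logic).
TABLE = str.maketrans("零一二三四五六七八九", "0123456789")


def screen_digits(text: str) -> Optional[str]:
    found = re.search("[零一二三四五六七八九]+", text)
    return found.group(0) if found else None


acquisition = InformationAcquisitionModule(recognize=lambda voice: voice.decode("utf-8"))
first = FirstConversionModule(screen=screen_digits)
second = SecondConversionModule(meets_condition=lambda s: len(s) >= 2,
                                to_arabic=lambda s: s.translate(TABLE))


def run_pipeline(voice_data: bytes) -> str:
    text = acquisition.acquire(voice_data)
    return second.process(text, first.analyze(text))
```

Passing the conversion logic in as callables keeps the module boundaries of the claims visible while leaving the actual recognizer and conversion rules open. With the stand-ins above, `run_pipeline("房间号是三零二".encode("utf-8"))` returns `"房间号是302"`.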