CN110377914B - Character recognition method, device and storage medium - Google Patents

Character recognition method, device and storage medium

Info

Publication number
CN110377914B
CN110377914B (application CN201910677203.7A)
Authority
CN
China
Prior art keywords
character
detected
vector
stroke
word vector
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910677203.7A
Other languages
Chinese (zh)
Other versions
CN110377914A (en)
Inventor
李原野
季成晖
卢俊之
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Priority to CN201910677203.7A priority Critical patent/CN110377914B/en
Publication of CN110377914A publication Critical patent/CN110377914A/en
Application granted granted Critical
Publication of CN110377914B publication Critical patent/CN110377914B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/12 Use of codes for handling textual entities
    • G06F40/126 Character encoding
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/232 Orthographic correction, e.g. spell checking or vowelisation

Abstract

The application discloses a character recognition method, a character recognition device and a storage medium, and relates to the technical field of information processing. The stroke coding sequence of the character to be detected is determined according to the stroke order of the character to be detected, and the context information of the character to be detected is obtained. Then, the word vector of the character to be detected is determined according to the stroke coding sequence and the context information, and the character to be detected is identified according to its word vector and a stored mapping relation between word vectors and characters. That is, the character to be detected can be identified by combining its font characteristics and contextual semantics, so that the identification accuracy of wrongly written characters is improved.

Description

Character recognition method, device and storage medium
Technical Field
The present application relates to the field of information processing technologies, and in particular, to a character recognition method, device, and storage medium.
Background
Currently, a terminal can display a wide variety of text information. Since the text information may have wrongly written characters, the terminal can recognize the wrongly written characters in the text information.
In the related art, terminals generally recognize wrongly written characters based on the semantics of Chinese characters. For example, the terminal may obtain the pinyin of a Chinese character, and then acquire a plurality of candidate characters whose pinyin is close to that of the Chinese character. The terminal can then compare how well the Chinese character matches the semantics of its context against how well each candidate character matches that context, so as to judge whether the Chinese character is a wrongly written character.
However, in many cases, similar pinyin does not imply similar semantics; that is, a wrongly written character may not be caused by pinyin similarity at all. In such cases, the wrongly written character cannot be detected by the above method.
Disclosure of Invention
The embodiment of the application provides a character recognition method, a character recognition device and a storage medium, which can be used for improving the recognition accuracy of wrongly-written characters. The technical scheme is as follows:
in one aspect, a character recognition method is provided, the method including:
determining a stroke coding sequence of the character to be detected according to the stroke sequence of the character to be detected;
acquiring context information of the character to be detected;
determining a word vector of the character to be detected according to the stroke coding sequence and the context information of the character to be detected;
and identifying the character to be detected according to the stored mapping relation between word vectors and characters and the word vector of the character to be detected.
In another aspect, there is provided a character recognition apparatus, the apparatus including:
the first determining module is used for determining the stroke coding sequence of the character to be detected according to the stroke sequence of the character to be detected;
the first acquisition module is used for acquiring the context information of the character to be detected;
the second determining module is used for determining the word vector of the character to be detected according to the stroke coding sequence and the context information of the character to be detected;
and the recognition module is used for recognizing the character to be detected according to the stored mapping relation between the word vector and the character and the word vector of the character to be detected.
In another aspect, a character recognition apparatus is provided, the apparatus comprising a processor, a communication interface, a memory, and a communication bus;
the processor, the communication interface and the memory complete mutual communication through the communication bus;
the memory is used for storing computer programs;
the processor is used for executing the program stored in the memory so as to realize the steps of the character recognition method.
In another aspect, a computer-readable storage medium is provided, in which a computer program is stored, which computer program, when being executed by a processor, realizes the steps of the character recognition method as provided above.
The beneficial effects brought by the technical scheme provided by the embodiment of the application at least comprise:
in the embodiment of the application, the stroke coding sequence of the character to be detected is determined according to the stroke order of the character to be detected, and the context information of the character to be detected is obtained. Then, the word vector of the character to be detected is determined according to the stroke coding sequence and the context information, and the character to be detected is identified according to its word vector and the stored mapping relation between word vectors and characters. That is, the character to be detected can be identified by combining its font characteristics and contextual semantics, so that the identification accuracy of wrongly written characters is improved.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings needed in the description of the embodiments are briefly introduced below. It is obvious that the drawings in the following description show only some embodiments of the present invention, and those skilled in the art can obtain other drawings based on these drawings without creative effort.
FIG. 1 is a flow chart of a character recognition method provided by an embodiment of the present application;
fig. 2 is a schematic structural diagram of a character recognition apparatus according to an embodiment of the present application;
fig. 3 is a schematic structural diagram of a first determining module provided in an embodiment of the present application;
FIG. 4 is a schematic structural diagram of another character recognition apparatus according to an embodiment of the present application;
fig. 5 is a schematic structural diagram of a terminal for performing character recognition according to an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, embodiments of the present invention will be described in detail with reference to the accompanying drawings.
Before explaining the embodiments of the present application in detail, an application scenario related to the embodiments of the present application will be described.
Currently, a terminal can display a wide variety of text information. For example, the terminal may display text information input by the user. Alternatively, the terminal may perform character recognition by OCR (Optical Character Recognition) to obtain text information and then display it. Or, after the terminal starts an application, it may obtain data from the application's server for display; for example, after starting a map application, the terminal may obtain the names of points of interest from the map application's server and display them on the map. Because errors may occur while the text information is being input, converted, or recognized, it may contain wrongly written characters. Based on this, the terminal may use the character recognition method provided by the embodiment of the present application to recognize wrongly written characters in the text information before displaying it.
Of course, in some possible scenarios it may be necessary to automatically acquire synonyms or similar-shaped characters of certain words; in this case, they may also be identified using the related implementations provided in the embodiments of the present application.
In addition, it should be further noted that the character recognition method provided by the embodiment of the present application may also be executed by a server in some scenarios. For example, for an application server storing text information, the application server can recognize a wrongly written word by the character recognition method so as to correct the wrongly written word.
Next, the character recognition method provided in the embodiment of the present application is described.
Fig. 1 is a flowchart of the character recognition method according to an embodiment of the present application. Referring to fig. 1, the method may be applied to an intelligent device; in the embodiment of the present application it is described as applied to a terminal by way of example. The method may include the following steps:
step 101: and determining the stroke coding sequence of the character to be detected according to the stroke sequence of the character to be detected.
In the embodiment of the application, the terminal can acquire the text information to be detected, which may include a plurality of Chinese characters. For each of the plurality of Chinese characters, the terminal can take the character as a character to be detected and detect whether it is a wrongly written character by the character recognition method provided by the embodiment of the application. In other words, the character to be detected may be any character in the text information to be detected.
The terminal can split the character to be detected according to its stroke order to obtain a stroke sequence comprising a plurality of character components, determine the coding information corresponding to each character component in the stroke sequence according to the mapping relation between character components and coding information, and sort the determined pieces of coding information according to the order of the character components in the stroke sequence to obtain the stroke coding sequence.
It should be noted that a Chinese character is generally composed of a plurality of strokes, which follow a fixed writing order. Based on this, the terminal can split the character to be detected into a plurality of character components according to its stroke order, where each character component can be a single stroke. The split character components are arranged in stroke order, thereby obtaining the stroke sequence.
The terminal can store the mapping relation between the character components and the coding information. After obtaining the stroke sequence, the terminal may sequentially obtain the coding information corresponding to each character component according to the sequence of each character component in the stroke sequence, and then arrange the multiple pieces of coding information obtained sequentially according to the obtaining sequence, thereby obtaining the stroke coding sequence.
Table 1 shows a mapping relationship between character components and encoding information according to an embodiment of the present application. As shown in Table 1, each character component may correspond to a numeric code, so that, through this mapping relationship, the split character components can be converted into numeric codes that the terminal can process, that is, into the stroke code sequence.
TABLE 1 Mapping between character components and encoding information

Character component | 一 (horizontal) | 丨 (vertical) | 丿 (left-falling) | ㇏ (right-falling) | 丶 (dot) | fold | ……
Encoding information | 1 | 2 | 3 | 4 | 5 | 6 | ……
Exemplarily, assuming that the character to be detected is 国 ("country"), the terminal may first split 国 into a plurality of character components according to its stroke order: 丨, fold, 一, 一, 丨, 一, 丶, 一. Next, from the mapping relationship between character components and coding information shown in Table 1, the coding information corresponding to 丨 is "2", the coding information corresponding to the fold is "6", the coding information corresponding to 一 is "1", and the coding information corresponding to 丶 is "5". Based on this, the terminal can obtain the stroke code sequence "2,6,1,1,2,1,5,1" according to the order of the character components.
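To make this concrete, a minimal Python sketch of this step is given below. The component-to-code mapping follows Table 1, where only the codes for the horizontal stroke (1), vertical stroke (2), dot (5) and fold (6) are confirmed by the worked example; the codes for the falling strokes are assumptions.

```python
# Hypothetical component-to-code mapping in the spirit of Table 1. Codes 1, 2,
# 5 and 6 are confirmed by the worked example; 3 and 4 are assumed.
STROKE_CODES = {"一": 1, "丨": 2, "丿": 3, "㇏": 4, "丶": 5, "fold": 6}

def stroke_code_sequence(components):
    """Map the split character components, in stroke order, to their codes."""
    return [STROKE_CODES[c] for c in components]

# The character 国 split in stroke order:
print(stroke_code_sequence(["丨", "fold", "一", "一", "丨", "一", "丶", "一"]))
# -> [2, 6, 1, 1, 2, 1, 5, 1]
```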
Step 102: and acquiring the context information of the character to be detected.
The stroke code sequence obtained by splitting the Chinese character can be used for representing the font characteristics of the Chinese character, but is not enough for representing the semantics of the Chinese character. Therefore, in the embodiment of the application, the terminal can also acquire the context information of the character to be detected.
The terminal can acquire the context information of the character to be detected through an n-gram model. Illustratively, the terminal may set the size n of the n-gram window according to a window size input by the user. Then, the terminal may obtain n characters forward and n characters backward from the position of the character to be detected in the text information, and use the obtained 2n characters as the context information of the character to be detected, where n is an integer greater than or equal to 1.
For example, if n = 2, the character to be detected is 幼 ("young"), and the text information is 十八幼儿园地址 ("Kindergarten No. 18 address"), the terminal obtains the two characters 十八 forward and the two characters 儿园 backward from the position of 幼, and these 4 characters form the context information of the character 幼.
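A minimal sketch of this window extraction follows; the sample string is a reconstruction of the example above, and the function name is illustrative.

```python
def context_window(text, index, n=2):
    """Return up to n characters before and n characters after position index."""
    return text[max(0, index - n):index] + text[index + 1:index + 1 + n]

text = "十八幼儿园地址"            # "Kindergarten No. 18 address" (illustrative)
i = text.index("幼")              # position of the character to be detected
print(context_window(text, i))    # "十八儿园", the 4 context characters
```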
It should be noted that, in this embodiment of the present application, the terminal may also execute step 102 first and then execute step 101, or execute steps 101 and 102 simultaneously, which is not limited in this embodiment of the present application.
Step 103: and determining the word vector of the character to be detected according to the stroke coding sequence and the context information of the character to be detected.
After acquiring the stroke coding sequence and the context information of the character to be detected, the terminal can determine a plurality of segment coding sequences according to the stroke coding sequence, and then process the plurality of segment coding sequences and the context information of the character to be detected through a specified deep learning model to obtain the word vector of the character to be detected.
In this embodiment of the application, in order to unify the format of the data input to the specified deep learning model, after determining the stroke code sequence, the terminal may detect whether the number of pieces of coded information included in the stroke code sequence is equal to k. If the number is less than k, the terminal may append preset coded information after the last piece of coded information, so that the supplemented stroke code sequence contains exactly k pieces of coded information.
For example, assuming that k = 50 and the stroke code sequence of the character to be detected includes 8 pieces of encoded information, the terminal may append 42 pieces of preset encoded information so that the supplemented stroke code sequence includes 50 pieces. The preset encoded information may be a value that does not exist in the mapping relationship between character components and encoding information; for example, it may be equal to -1, although other values may also be used, which is not limited in this embodiment of the application.
After supplementing the stroke code sequence, the terminal may generate a plurality of segment code sequences based on the supplemented sequence. Illustratively, starting from the first piece of encoded information in the stroke code sequence, the terminal may group that piece together with the m consecutive pieces that follow it into a segment, obtaining the first segment code sequence. Then the second piece of encoded information and the m pieces following it form the second segment code sequence, and so on, where m is an integer greater than or equal to 1 and less than or equal to k. Typically, m may be equal to 2, 3 or 4.
For example, taking the supplemented stroke code sequence "2,6,1,1,2,1,5,1,-1,…,-1" obtained above and assuming m = 2: when dividing the sequence, the first piece of encoded information "2" and the following 2 pieces "6" and "1" form one segment, giving the first segment code sequence "2,6,1"; then the second piece "6" and the following 2 pieces "1" and "1" form a segment, giving the second segment code sequence "6,1,1"; the third piece and the two consecutive pieces after it give the third segment code sequence "1,1,2", and so on.
Optionally, in some possible cases, the terminal may divide the stroke code sequence by taking m as different values according to the above method, so as to obtain a plurality of segment code sequences. For example, the terminal may determine, by the above method, a plurality of segment code sequences including three pieces of code information when m = 2. Then, the terminal may make m =3, and continue to divide the stroke code sequence by the above method, so as to obtain a plurality of segment code sequences containing four pieces of code information. Then, the terminal may further let m =4, and continue to divide the stroke code sequence by the above method, so as to obtain a plurality of segment code sequences including five pieces of code information.
Alternatively, if the number of pieces of coded information contained in the stroke code sequence of the character to be detected is already equal to k, the terminal may skip the supplementing step and generate the segment code sequences directly by the method described above.
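The padding and segmentation described above can be sketched as follows, treating k = 50, the padding value -1, and the window parameter m as assumed hyper-parameters:

```python
def pad_sequence(codes, k=50, pad=-1):
    """Right-pad the stroke code sequence with the preset value up to length k."""
    return codes + [pad] * max(0, k - len(codes))

def segment_sequences(codes, m):
    """Group each piece of encoded information with the m consecutive pieces after it."""
    return [codes[i:i + m + 1] for i in range(len(codes) - m)]

padded = pad_sequence([2, 6, 1, 1, 2, 1, 5, 1])
# Combine the m = 2, 3 and 4 divisions, as in the variant described above.
segments = [seg for m in (2, 3, 4) for seg in segment_sequences(padded, m)]
print(segments[:2])  # [[2, 6, 1], [6, 1, 1]]
```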
After obtaining the plurality of segment coding sequences, the terminal may use them, together with the acquired context information of the character to be detected, as input values of the specified deep learning model. The specified deep learning model may generate a segment code vector from each segment code sequence, then generate an initial word vector based on the generated plurality of segment code vectors, and finally determine and output the word vector of the character to be detected according to the initial word vector and the context information of the character to be detected.
For each segment coding sequence, the specified deep learning model can assign values to an initial segment vector of a specified dimension according to the segment coding sequence, thereby generating the segment code vector corresponding to that sequence. After generating the plurality of segment code vectors, the model may sum and average the vector elements located at the same position across the segment code vectors to obtain the initial word vector. For example, assuming the segment code vectors are 20-dimensional, the model sums and averages the elements at the first position of each segment code vector to obtain the first element of the initial word vector, sums and averages the elements at the second position to obtain the second element, and so on, until the initial word vector is obtained.
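The averaging step alone might look like the following sketch, in which random 20-dimensional vectors stand in for the segment code vectors the model actually produces:

```python
import numpy as np

# Stand-ins for the segment code vectors: one 20-dimensional vector per
# segment code sequence (10 segments assumed here).
rng = np.random.default_rng(0)
segment_vectors = rng.normal(size=(10, 20))

# Element-wise sum-and-average across the segment code vectors gives the
# initial word vector.
initial_word_vector = segment_vectors.mean(axis=0)
print(initial_word_vector.shape)  # (20,)
```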
It should be noted that the specified deep learning model may be a skip-gram model, and it refers to a model trained in advance on a large amount of sample data.
Illustratively, the terminal can obtain a plurality of sample data containing the specified characters, wherein each sample data comprises a stroke code sequence of the specified characters and context information of the specified characters; processing a plurality of sample data through a specified deep learning model to obtain a word vector corresponding to a specified character; and storing the designated character and the word vector corresponding to the designated character in the mapping relation of the word vector and the character.
For each Chinese character, the terminal can obtain corresponding sample text information. For example, for the character 国, the terminal may acquire a plurality of sample text messages containing 国. Then, the terminal may obtain a plurality of sample data for each character from the sample text information corresponding to that character.
It should be noted that, for any Chinese character, referred to here as the designated character for convenience of description, after the terminal acquires a plurality of sample text messages containing the designated character, it may, following the related method described above, acquire the context information of the designated character in each sample text message according to the character's position in that message. Meanwhile, the terminal may also determine the stroke coding sequence of the designated character according to its stroke order, again following the related method described above. Then, the terminal may take the stroke coding sequence together with each of the obtained pieces of context information as one sample datum of the designated character, and input the sample data to the specified deep learning model, which processes the plurality of sample data of the designated character to adjust its model parameters and finally outputs the word vector corresponding to the designated character. This word vector, obtained by training the specified deep learning model, is stored together with the designated character in the mapping relation between word vectors and characters.
For each character in the Chinese characters, the terminal can process the corresponding character through the method, further train the specified deep learning model through the sample data of the corresponding character, simultaneously output the trained word vector, and correspondingly store the word vector and the corresponding character.
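How such sample data might be assembled for one designated character is sketched below, reusing the context_window helper from step 102; the corpus argument and the function name are illustrative, not from the patent.

```python
def build_samples(designated_char, stroke_codes, texts, n=2):
    """Pair the character's stroke code sequence with the context of each occurrence."""
    samples = []
    for text in texts:
        for i, ch in enumerate(text):
            if ch == designated_char:
                samples.append((stroke_codes, context_window(text, i, n)))
    return samples

# e.g. build_samples("国", [2, 6, 1, 1, 2, 1, 5, 1], corpus) yields one
# (stroke code sequence, context) pair per occurrence of 国 in the corpus.
```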
Step 104: and identifying the character to be detected according to the stored mapping relation between the word vector and the character and the word vector of the character to be detected.
After determining the word vector of the character to be detected through step 103, the terminal may determine the similarity between the word vector of the character to be detected and each word vector included in the mapping relation between word vectors and characters, and then identify the character to be detected according to these similarities.
The terminal can calculate the distance between the word vector of the character to be detected and each word vector in the mapping relation; this distance represents the similarity between the character to be detected and the corresponding word vector, and the closer two vectors are, the higher their similarity. The implementation of calculating the distance between two vectors may follow related implementations and is not described again in this embodiment.
After determining the vector distance between the word vector of the character to be detected and each word vector, the terminal may obtain a vector distance less than a specified distance threshold from among the plurality of vector distances. If the number of the vector distances smaller than the specified distance threshold is only one, the terminal can acquire the character corresponding to the word vector corresponding to the vector distance, and if the character is the same as the character to be detected, the character to be detected is not a wrongly-written character. If the number of the vector distances smaller than the specified distance threshold is multiple, the terminal may obtain a minimum vector distance among the multiple vector distances smaller than the specified distance threshold, and obtain a character corresponding to a word vector corresponding to the minimum vector distance. If the character is different from the character to be detected, the character to be detected can be determined to be a wrongly written character, and at the moment, the terminal can display the character corresponding to the word vector corresponding to the minimum vector distance as a recommended character. And the recommended character is a correct character recommended to the user and used for replacing the character to be detected.
Alternatively, in some possible cases, if there is no vector distance smaller than a specified distance threshold, the terminal may directly determine the character to be detected as a wrongly-written character.
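Taken together, the decision logic of this step can be sketched as follows, assuming Euclidean distance as the vector distance and treating the stored mapping and the threshold as placeholders:

```python
import numpy as np

def recognize(word_vector, stored_vectors, threshold, char_to_detect):
    """stored_vectors maps each character to its stored word vector."""
    distances = {ch: float(np.linalg.norm(word_vector - vec))
                 for ch, vec in stored_vectors.items()}
    close = {ch: d for ch, d in distances.items() if d < threshold}
    if not close:
        return True, None                 # no vector close enough: wrongly written
    nearest = min(close, key=close.get)   # character at the minimum vector distance
    if nearest == char_to_detect:
        return False, None                # the character itself is the best match
    return True, nearest                  # wrongly written; recommend the nearest

stored = {"国": np.array([0.10, 0.90]), "图": np.array([0.60, 0.30])}
print(recognize(np.array([0.12, 0.88]), stored, 0.5, "图"))  # (True, '国')
```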
In the embodiment of the application, the terminal can determine the stroke coding sequence of the character to be detected according to its stroke order, and obtain the context information of the character to be detected. Then, the word vector of the character to be detected is determined according to the stroke coding sequence and the context information, and the character to be detected is identified according to its word vector and the stored mapping relation between word vectors and characters. That is, the character to be detected can be identified by combining its font characteristics and contextual semantics, so that the identification accuracy of wrongly written characters is improved; in particular, the accuracy for wrongly written characters whose shape is similar to the correct character can be improved. For example, text information obtained through OCR usually contains shape-similar wrong characters, and the method provided by the embodiment of the present application improves recognition accuracy on such text.
In addition, the character recognition method in the embodiment of the application can also be used for mining similar characters. In this scenario, after calculating the vector distance between the word vector of the character to be detected and each stored word vector, the terminal can determine the similarity probability between each word vector and the word vector of the character to be detected according to the vector distance, and then output the characters similar to the character to be detected according to the similarity probability.
Next, the character recognition apparatus provided in the embodiment of the present application is described.
Referring to fig. 2, an embodiment of the present application provides a character recognition apparatus 200, where the apparatus 200 includes:
the first determining module 201 is configured to determine a stroke coding sequence of a character to be detected according to a stroke sequence of the character to be detected;
a first obtaining module 202, configured to obtain context information of the character to be detected;
the second determining module 203 is configured to determine a word vector of the character to be detected according to the stroke coding sequence and the context information of the character to be detected;
and the recognition module 204 is configured to recognize the character to be detected according to the stored mapping relationship between the word vector and the character and the word vector of the character to be detected.
Optionally, referring to fig. 3, the first determining module 201 includes:
the splitting submodule 2011 is configured to split the character to be detected according to the stroke sequence of the character to be detected, so as to obtain a stroke sequence including a plurality of character components;
the determining submodule 2012 is configured to determine, according to a mapping relationship between a character component and encoding information, encoding information corresponding to each character component in the stroke sequence;
and the sequencing submodule 2013 is used for sequencing the determined plurality of coded information according to the sequence of the plurality of character components in the stroke sequence to obtain the stroke coded sequence.
Optionally, the second determining module 203 is specifically configured to:
determining a plurality of segment coding sequences according to the stroke coding sequence;
and processing the plurality of segment coding sequences and the context information of the character to be detected through a specified deep learning model to obtain a word vector of the character to be detected.
Optionally, referring to fig. 4, the apparatus 200 further comprises:
a second obtaining module 205, configured to obtain a plurality of sample data including a specified character, where each sample data includes a stroke code sequence of the specified character and a context information of the specified character;
the processing module 206 is configured to process the multiple sample data through a specified deep learning model to obtain a word vector corresponding to the specified character;
the storage module 207 is configured to store the specified character and the word vector of the specified character in a mapping relationship between the word vector and the character.
Optionally, the identifying module 204 is specifically configured to:
determining the similarity between each word vector contained in the mapping relation between the word vector and the character and the word vector of the character to be detected;
and identifying the character to be detected according to the similarity between each character vector and the character vector of the character to be detected.
In summary, the embodiment of the present application may determine the stroke coding sequence of the character to be detected according to its stroke order, and obtain the context information of the character to be detected. Then, the word vector of the character to be detected is determined according to the stroke coding sequence and the context information, and the character to be detected is identified according to its word vector and the stored mapping relation between word vectors and characters. That is, the character to be detected can be recognized by combining its font characteristics and contextual semantics, improving the recognition accuracy of wrongly written characters.
It should be noted that: in the above-described embodiment, when the character recognition apparatus recognizes a wrongly written or mispronounced character, only the division of the functional modules is used as an example, and in practical applications, the function distribution may be completed by different functional modules according to needs, that is, the internal structure of the device is divided into different functional modules to complete all or part of the above-described functions. In addition, the character recognition device and the character recognition method provided by the above embodiments belong to the same concept, and specific implementation processes thereof are described in the method embodiments in detail and are not described herein again.
Fig. 5 shows a block diagram of a terminal 500 for character recognition according to an exemplary embodiment of the present application. The terminal 500 may be a smartphone, a tablet, a laptop, or a desktop computer. The terminal 500 may also be referred to as a user equipment, a portable terminal, a laptop terminal, a desktop terminal, or by other names.
In general, the terminal 500 includes: a processor 501 and a memory 502.
The processor 501 may include one or more processing cores, such as a 4-core processor, an 8-core processor, and so on. The processor 501 may be implemented in at least one hardware form of a DSP (Digital Signal Processing), an FPGA (Field-Programmable Gate Array), and a PLA (Programmable Logic Array). The processor 501 may also include a main processor and a coprocessor, where the main processor is a processor for Processing data in a wake state, and is also called a Central Processing Unit (CPU); a coprocessor is a low power processor for processing data in a standby state. In some embodiments, the processor 501 may be integrated with a GPU (Graphics Processing Unit), which is responsible for rendering and drawing the content required to be displayed on the display screen. In some embodiments, processor 501 may also include an AI (Artificial Intelligence) processor for processing computational operations related to machine learning.
Memory 502 may include one or more computer-readable storage media, which may be non-transitory. Memory 502 may also include high-speed random access memory, as well as non-volatile memory, such as one or more magnetic disk storage devices, flash memory storage devices. In some embodiments, a non-transitory computer-readable storage medium in memory 502 is used to store at least one instruction for execution by processor 501 to implement the character recognition methods provided by the method embodiments of the present application.
In some embodiments, the terminal 500 may further optionally include: a peripheral interface 503 and at least one peripheral. The processor 501, memory 502 and peripheral interface 503 may be connected by a bus or signal lines. Various peripheral devices may be connected to the peripheral interface 503 via a bus, signal line, or circuit board. Specifically, the peripheral device includes: at least one of radio frequency circuitry 504, touch screen display 505, camera 506, audio circuitry 507, positioning components 508, and power supply 509.
The peripheral interface 503 may be used to connect at least one peripheral related to I/O (Input/Output) to the processor 501 and the memory 502. In some embodiments, the processor 501, memory 502, and peripheral interface 503 are integrated on the same chip or circuit board; in some other embodiments, any one or two of the processor 501, the memory 502 and the peripheral interface 503 may be implemented on a single chip or circuit board, which is not limited in this embodiment.
The Radio Frequency circuit 504 is used for receiving and transmitting RF (Radio Frequency) signals, also called electromagnetic signals. The radio frequency circuit 504 communicates with communication networks and other communication devices via electromagnetic signals, converting an electrical signal into an electromagnetic signal for transmission, or converting a received electromagnetic signal into an electrical signal. Optionally, the radio frequency circuit 504 includes: an antenna system, an RF transceiver, one or more amplifiers, a tuner, an oscillator, a digital signal processor, a codec chipset, a subscriber identity module card, and so forth. The radio frequency circuit 504 may communicate with other devices via at least one wireless communication protocol. The wireless communication protocols include, but are not limited to: the world wide web, metropolitan area networks, intranets, generations of mobile communication networks (2G, 3G, 4G, and 5G), wireless local area networks, and/or WiFi (Wireless Fidelity) networks. In some embodiments, the radio frequency circuit 504 may further include NFC (Near Field Communication) related circuits, which are not limited in this application.
The display screen 505 is used to display a UI (User Interface). The UI may include graphics, text, icons, video, and any combination thereof. When the display screen 505 is a touch display screen, the display screen 505 also has the ability to capture touch signals on or over the surface of the display screen 505. The touch signal may be input to the processor 501 as a control signal for processing. At this point, the display screen 505 may also be used to provide virtual buttons and/or a virtual keyboard, also referred to as soft buttons and/or a soft keyboard. In some embodiments, the display screen 505 may be one, providing the front panel of the terminal 500; in other embodiments, the display screens 505 may be at least two, respectively disposed on different surfaces of the terminal 500 or in a folded design; in still other embodiments, the display 505 may be a flexible display disposed on a curved surface or on a folded surface of the terminal 500. Even more, the display screen 505 can be arranged in a non-rectangular irregular figure, i.e. a shaped screen. The Display screen 505 may be made of LCD (Liquid Crystal Display), OLED (Organic Light-Emitting Diode), and other materials.
The camera assembly 506 is used to capture images or video. Optionally, camera assembly 506 includes a front camera and a rear camera. Generally, a front camera is disposed on a front panel of a terminal device, and a rear camera is disposed on a rear surface of the terminal device. In some embodiments, the number of the rear cameras is at least two, and each of the rear cameras is any one of a main camera, a depth-of-field camera, a wide-angle camera and a telephoto camera, so that the main camera and the depth-of-field camera are fused to realize a background blurring function, the main camera and the wide-angle camera are fused to realize panoramic shooting and a VR (Virtual Reality) shooting function or other fusion shooting functions. In some embodiments, camera assembly 506 may also include a flash. The flash lamp can be a monochrome temperature flash lamp or a bicolor temperature flash lamp. The double-color-temperature flash lamp is a combination of a warm-light flash lamp and a cold-light flash lamp, and can be used for light compensation at different color temperatures.
The audio circuitry 507 may include a microphone and a speaker. The microphone is used for collecting sound waves of a user and the environment, converting the sound waves into electric signals, and inputting the electric signals to the processor 501 for processing, or inputting the electric signals to the radio frequency circuit 504 to realize voice communication. For the purpose of stereo sound collection or noise reduction, a plurality of microphones may be provided at different portions of the terminal 500. The microphone may also be an array microphone or an omni-directional pick-up microphone. The speaker is used to convert electrical signals from the processor 501 or the radio frequency circuit 504 into sound waves. The loudspeaker can be a traditional film loudspeaker or a piezoelectric ceramic loudspeaker. When the speaker is a piezoelectric ceramic speaker, the speaker can be used for purposes such as converting an electric signal into a sound wave audible to a human being, or converting an electric signal into a sound wave inaudible to a human being to measure a distance. In some embodiments, audio circuitry 507 may also include a headphone jack.
The positioning component 508 is used for positioning the current geographic location of the terminal 500 for navigation or LBS (Location Based Service). The positioning component 508 may be a positioning component based on the GPS (Global Positioning System) of the United States, the BeiDou system of China, the GLONASS system of Russia, or the Galileo system of the European Union.
Power supply 509 is used to provide power to the various components of terminal 500. The power source 509 may be alternating current, direct current, disposable or rechargeable. When power supply 509 includes a rechargeable battery, the rechargeable battery may be a wired rechargeable battery or a wireless rechargeable battery. The wired rechargeable battery is a battery charged through a wired line, and the wireless rechargeable battery is a battery charged through a wireless coil. The rechargeable battery may also be used to support fast charge technology.
In some embodiments, the terminal 500 also includes one or more sensors 510. The one or more sensors 510 include, but are not limited to: acceleration sensor 511, gyro sensor 512, pressure sensor 513, fingerprint sensor 514, optical sensor 515, and proximity sensor 516.
The acceleration sensor 511 may detect the magnitude of acceleration on three coordinate axes of the coordinate system established with the terminal 500. For example, the acceleration sensor 511 may be used to detect components of the gravitational acceleration in three coordinate axes. The processor 501 may control the touch screen 505 to display the user interface in a landscape view or a portrait view according to the acceleration signal of gravity collected by the acceleration sensor 511. The acceleration sensor 511 may also be used for acquisition of motion data of a game or a user.
The gyro sensor 512 may detect a body direction and a rotation angle of the terminal 500, and the gyro sensor 512 may cooperate with the acceleration sensor 511 to acquire a 3D motion of the user on the terminal 500. The processor 501 may implement the following functions according to the data collected by the gyro sensor 512: motion sensing (such as changing the UI according to a user's tilting operation), image stabilization at the time of photographing, game control, and inertial navigation.
The pressure sensor 513 may be disposed on a side bezel of the terminal 500 and/or an underlying layer of the touch display screen 505. When the pressure sensor 513 is disposed on the side frame of the terminal 500, a user's holding signal of the terminal 500 may be detected, and the processor 501 performs left-right hand recognition or shortcut operation according to the holding signal collected by the pressure sensor 513. When the pressure sensor 513 is disposed at the lower layer of the touch display screen 505, the processor 501 controls the operability control on the UI interface according to the pressure operation of the user on the touch display screen 505. The operability control comprises at least one of a button control, a scroll bar control, an icon control and a menu control.
The fingerprint sensor 514 is used for collecting a fingerprint of the user, and the processor 501 identifies the identity of the user according to the fingerprint collected by the fingerprint sensor 514, or the fingerprint sensor 514 identifies the identity of the user according to the collected fingerprint. Upon recognizing that the user's identity is a trusted identity, the processor 501 authorizes the user to perform relevant sensitive operations including unlocking the screen, viewing encrypted information, downloading software, paying, and changing settings, etc. The fingerprint sensor 514 may be provided on the front, back, or side of the terminal 500. When a physical button or a vendor Logo is provided on the terminal 500, the fingerprint sensor 514 may be integrated with the physical button or the vendor Logo.
The optical sensor 515 is used to collect the ambient light intensity. In one embodiment, the processor 501 may control the display brightness of the touch display screen 505 based on the ambient light intensity collected by the optical sensor 515. Specifically, when the ambient light intensity is high, the display brightness of the touch display screen 505 is increased; when the ambient light intensity is low, the display brightness of the touch display screen 505 is turned down. In another embodiment, processor 501 may also dynamically adjust the shooting parameters of camera head assembly 506 based on the ambient light intensity collected by optical sensor 515.
A proximity sensor 516, also referred to as a distance sensor, is typically disposed on the front panel of the terminal 500. The proximity sensor 516 is used to collect the distance between the user and the front surface of the terminal 500. In one embodiment, when the proximity sensor 516 detects that the distance between the user and the front surface of the terminal 500 gradually decreases, the processor 501 controls the touch display screen 505 to switch from the bright screen state to the dark screen state; when the proximity sensor 516 detects that the distance between the user and the front surface of the terminal 500 becomes gradually larger, the processor 501 controls the touch display screen 505 to switch from the screen-rest state to the screen-on state.
Those skilled in the art will appreciate that the configuration shown in fig. 5 is not intended to be limiting of terminal 500 and may include more or fewer components than those shown, or some components may be combined, or a different arrangement of components may be used.
In an exemplary embodiment of the present application, there is also provided a computer-readable storage medium, such as a memory, including instructions executable by a processor in the terminal device to perform the character recognition method in the above-described embodiment. For example, the computer readable storage medium may be a ROM, a Random Access Memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, and the like.
It will be understood by those skilled in the art that all or part of the steps for implementing the above embodiments may be implemented by hardware, or may be implemented by a program instructing relevant hardware, where the program may be stored in a computer-readable storage medium, and the above-mentioned storage medium may be a read-only memory, a magnetic disk or an optical disk, etc.
The above description is only exemplary of the present application and should not be taken as limiting, as any modification, equivalent replacement, or improvement made within the spirit and principle of the present application should be included in the protection scope of the present application.

Claims (7)

1. A method of character recognition, the method comprising:
determining a stroke coding sequence of the character to be detected according to the stroke sequence of the character to be detected;
acquiring N characters forwards and backwards from the position of the character to be detected in the text information through an N-gram model, and taking the acquired 2N characters as context information of the character to be detected, wherein N is an integer greater than or equal to 1, and N is the window size of the N-gram model;
determining a plurality of segment coding sequences according to the stroke coding sequence;
inputting the plurality of segment coding sequences and the context information of the character to be detected into an appointed deep learning model, wherein the appointed deep learning model is used for assigning a value to an initial segment vector of an appointed dimension according to each segment coding sequence for each segment coding sequence in the plurality of segment coding sequences, and generating segment coding vectors corresponding to the corresponding segment coding sequences to obtain a plurality of segment coding vectors; summing and averaging vector elements positioned at the same position in the plurality of segment coding vectors to obtain an initial word vector; determining and outputting the word vector of the character to be detected according to the initial word vector and the context information of the character to be detected;
calculating a vector distance between each word vector contained in the mapping relation between the word vector and the character and the word vector of the character to be detected, wherein the vector distance is used for representing the similarity between the character to be detected and the corresponding word vector; acquiring a vector distance smaller than a specified distance threshold; if the number of the vector distances smaller than the specified distance threshold is only one, acquiring the character corresponding to the word vector corresponding to the vector distance, and if the character is the same as the character to be detected, determining that the character to be detected is not a wrongly-written character; if the number of the vector distances smaller than the specified distance threshold is multiple, acquiring the minimum vector distance among the multiple vector distances smaller than the specified distance threshold, and acquiring the character corresponding to the word vector corresponding to the minimum vector distance, and if the character is different from the character to be detected, determining that the character to be detected is a wrongly-written character and displaying the character corresponding to the word vector corresponding to the minimum vector distance as a recommended character, wherein the recommended character is a correct character recommended to a user and used for replacing the character to be detected.
2. The method according to claim 1, wherein determining the stroke code sequence of the character to be detected according to the stroke order of the character to be detected comprises:
splitting the character to be detected according to the stroke sequence of the character to be detected to obtain a stroke sequence comprising a plurality of character components;
determining coding information corresponding to each character component in the stroke sequence according to the mapping relation between the character components and the coding information;
and sequencing the determined plurality of coded information according to the sequence of the plurality of character components in the stroke sequence to obtain the stroke coding sequence.
3. The method according to claim 1 or 2, wherein before determining the word vector of the character to be detected according to the stroke code sequence and the context information of the character to be detected, the method further comprises:
acquiring a plurality of sample data containing designated characters, wherein each sample data comprises a stroke coding sequence of the designated characters and context information of the designated characters;
processing the plurality of sample data through a specified deep learning model to obtain a word vector corresponding to the specified character;
and correspondingly storing the specified character and a word vector of the specified character in a mapping relation between the word vector and the character.
4. An apparatus for character recognition, the apparatus comprising:
the first determining module is used for determining the stroke coding sequence of the character to be detected according to the stroke sequence of the character to be detected;
the first obtaining module is used for obtaining N characters from the position of the character to be detected in the text information through an N-gram model forwards and backwards, and taking the obtained 2N characters as context information of the character to be detected, wherein N is an integer larger than or equal to 1, and N is the window size of the N-gram model;
the second determining module is used for determining a plurality of segment coding sequences according to the stroke coding sequences; inputting the plurality of segment coding sequences and the context information of the character to be detected into a specified deep learning model; the appointed deep learning model is used for assigning values to initial segment vectors of appointed dimensions according to each segment coding sequence in the segment coding sequences to generate segment coding vectors corresponding to the corresponding segment coding sequences to obtain a plurality of segment coding vectors; summing and averaging vector elements positioned at the same position in the plurality of segment coding vectors to obtain an initial word vector; determining and outputting the word vector of the character to be detected according to the initial word vector and the context information of the character to be detected;
the recognition module is used for calculating a vector distance between each word vector contained in the mapping relation between the word vector and the character and the word vector of the character to be detected, wherein the vector distance is used for representing the similarity between the character to be detected and the corresponding word vector; obtaining a vector distance smaller than a specified distance threshold; if the number of the vector distances smaller than the specified distance threshold is only one, acquiring characters corresponding to the word vectors corresponding to the vector distances, and if the characters are the same as the characters to be detected, the characters to be detected are not wrongly-written characters; if the number of the vector distances smaller than the specified distance threshold is multiple, acquiring the minimum vector distance in the multiple vector distances smaller than the specified distance threshold, and acquiring the character corresponding to the word vector corresponding to the minimum vector distance, if the character is different from the character to be detected, the character to be detected is a wrongly-written character, and displaying the character corresponding to the word vector corresponding to the minimum vector distance as a recommended character, wherein the recommended character is a correct character recommended to a user and used for replacing the character to be detected.
5. The apparatus of claim 4, wherein the first determining module comprises:
the splitting submodule is used for splitting the character to be detected according to the stroke sequence of the character to be detected to obtain a stroke sequence comprising a plurality of character components;
the determining submodule is used for determining the coding information corresponding to each character component in the stroke sequence according to the mapping relation between the character components and the coding information;
and the sequencing submodule is used for sorting the determined coding information according to the order of the character components in the stroke sequence to obtain the stroke coding sequence.
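The three submodules of claim 5 amount to: split by stroke sequence, look each component up in a component-to-code table, and keep the codes in component order. A minimal sketch, assuming a hypothetical `COMPONENT_CODES` table and a pre-split component list (real splitting would need per-character stroke-order data):

```python
from typing import Dict, List

# Hypothetical mapping between character components and coding
# information; a real system would cover all strokes and radicals.
COMPONENT_CODES: Dict[str, int] = {"一": 1, "丨": 2, "丿": 3, "丶": 4, "乙": 5}

def stroke_coding_sequence(components: List[str]) -> List[int]:
    """Order the codes by the components' order in the stroke sequence."""
    return [COMPONENT_CODES[c] for c in components]

# '十' split by stroke order into a horizontal then a vertical stroke.
codes = stroke_coding_sequence(["一", "丨"])   # [1, 2]
```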
6. An apparatus for character recognition, the apparatus comprising a processor and a memory, the memory having stored therein at least one instruction, at least one program, a set of codes, or a set of instructions, the at least one instruction, the at least one program, the set of codes, or the set of instructions being loaded and executed by the processor to implement the character recognition method according to any one of claims 1 to 3.
7. A computer-readable storage medium, having stored therein at least one instruction, at least one program, a set of codes, or a set of instructions, which is loaded and executed by a processor to implement the character recognition method of any one of claims 1 to 3.
CN201910677203.7A 2019-07-25 2019-07-25 Character recognition method, device and storage medium Active CN110377914B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910677203.7A CN110377914B (en) 2019-07-25 2019-07-25 Character recognition method, device and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910677203.7A CN110377914B (en) 2019-07-25 2019-07-25 Character recognition method, device and storage medium

Publications (2)

Publication Number Publication Date
CN110377914A CN110377914A (en) 2019-10-25
CN110377914B CN110377914B (en) 2023-01-06

Family

ID=68255945

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910677203.7A Active CN110377914B (en) 2019-07-25 2019-07-25 Character recognition method, device and storage medium

Country Status (1)

Country Link
CN (1) CN110377914B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111144105B (en) * 2019-12-17 2023-03-14 浙江大华技术股份有限公司 Word and sentence processing method and device and computer storage medium
CN114612911B (en) * 2022-01-26 2022-11-29 哈尔滨工业大学(深圳) Stroke-level handwritten character sequence recognition method, device, terminal and storage medium

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9292565B2 (en) * 2010-06-30 2016-03-22 International Business Machines Corporation Template-based recognition of food product information
CN105608462A (en) * 2015-12-10 2016-05-25 小米科技有限责任公司 Character similarity judgment method and device
CN105549890B (en) * 2015-12-29 2019-03-05 清华大学 One-dimensional handwriting input equipment and one-dimensional hand-written character input method
CN107169763A (en) * 2017-04-26 2017-09-15 沈思远 Safe payment method and system based on signature recognition
CN109299269A (en) * 2018-10-23 2019-02-01 阿里巴巴集团控股有限公司 A kind of file classification method and device
CN109858039B (en) * 2019-03-01 2023-09-05 北京奇艺世纪科技有限公司 Text information identification method and identification device
CN109933686B (en) * 2019-03-18 2023-02-03 创新先进技术有限公司 Song label prediction method, device, server and storage medium
CN109992783B (en) * 2019-04-03 2020-10-30 同济大学 Chinese word vector modeling method

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH0573726A (en) * 1991-09-13 1993-03-26 Aiwa Co Ltd Character recognition device
US8160839B1 (en) * 2007-10-16 2012-04-17 Metageek, Llc System and method for device recognition based on signal patterns
CN102663380A (en) * 2012-03-30 2012-09-12 中南大学 Method for identifying character in steel slab coding image
CN106709490A (en) * 2015-07-31 2017-05-24 腾讯科技(深圳)有限公司 Character recognition method and device
CN109271497A (en) * 2018-08-31 2019-01-25 华南理工大学 A kind of event-driven service matching method based on term vector

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Han Zhihao; Hou Jun. License plate recognition method based on wavelet transform under poor illumination. 《信息系统工程》 (Information Systems Engineering), 2011, 31-33. *

Also Published As

Publication number Publication date
CN110377914A (en) 2019-10-25

Similar Documents

Publication Publication Date Title
CN109829456B (en) Image identification method and device and terminal
CN110059685B (en) Character area detection method, device and storage medium
CN110490179B (en) License plate recognition method and device and storage medium
CN110490186B (en) License plate recognition method and device and storage medium
CN110991457B (en) Two-dimensional code processing method and device, electronic equipment and storage medium
CN112581358B (en) Training method of image processing model, image processing method and device
CN110795019B (en) Key recognition method and device for soft keyboard and storage medium
CN110321126B (en) Method and device for generating page code
CN114170349A (en) Image generation method, image generation device, electronic equipment and storage medium
CN108491748B (en) Graphic code identification and generation method and device and computer readable storage medium
CN111209377A (en) Text processing method, device, equipment and medium based on deep learning
CN111027490A (en) Face attribute recognition method and device and storage medium
CN111459466B (en) Code generation method, device, equipment and storage medium
CN110738185B (en) Form object identification method, form object identification device and storage medium
CN110677713B (en) Video image processing method and device and storage medium
CN110675473B (en) Method, device, electronic equipment and medium for generating GIF dynamic diagram
CN110377914B (en) Character recognition method, device and storage medium
CN111586279B (en) Method, device and equipment for determining shooting state and storage medium
CN111753606A (en) Intelligent model upgrading method and device
CN110728167A (en) Text detection method and device and computer readable storage medium
CN110378318B (en) Character recognition method and device, computer equipment and storage medium
CN112053360A (en) Image segmentation method and device, computer equipment and storage medium
CN110163192B (en) Character recognition method, device and readable medium
CN113537430B (en) Beacon, beacon generation method, beacon generation device and equipment
CN113343709B (en) Method for training intention recognition model, method, device and equipment for intention recognition

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant