WO2021042505A1 - Method and apparatus for generating notes based on character recognition technology, and computer device - Google Patents

Method and apparatus for generating notes based on character recognition technology, and computer device

Info

Publication number
WO2021042505A1
Authority
WO
WIPO (PCT)
Prior art keywords
picture
text
designated
value
preset
Prior art date
Application number
PCT/CN2019/116337
Other languages
English (en)
Chinese (zh)
Inventor
温桂龙
Original Assignee
平安科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司 filed Critical 平安科技(深圳)有限公司
Publication of WO2021042505A1 publication Critical patent/WO2021042505A1/fr

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10Character recognition
    • G06V30/14Image acquisition
    • G06V30/148Segmentation of character regions
    • G06V30/153Segmentation of character regions using recognition of characters or words
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10Character recognition
    • G06V30/24Character recognition characterised by the processing or recognition method
    • G06V30/242Division of the character sequences into groups prior to recognition; Selection of dictionaries
    • G06V30/244Division of the character sequences into groups prior to recognition; Selection of dictionaries using graphical properties, e.g. alphabet type or font
    • G06V30/2455Discrimination between machine-print, hand-print and cursive writing
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10Character recognition

Definitions

  • This application relates to the computer field, in particular to a method, device, computer equipment and storage medium for generating notes based on text recognition technology.
  • the main purpose of this application is to provide a note generation method, device, computer equipment, and storage medium based on text recognition technology, aiming to improve the preservation of information when generating notes.
  • this application proposes a note generation method based on text recognition technology, which is applied to a designated terminal, and includes:
  • the handwritten text and printed text in the designated picture are recognized as handwritten text and printed text, respectively, by using a preset text recognition technology, and feature data of the handwritten text in the designated picture is extracted, wherein the feature data includes at least the positions of heavy strokes and the number of heavy strokes in the handwritten text;
  • the feature data is input into an emotion recognition model trained on the basis of a neural network model to obtain the predicted emotion category output by the emotion recognition model, wherein the emotion recognition model is trained on sample data composed of pre-collected handwritten text and the emotion categories associated with that handwritten text;
  • the printed text and the handwritten text are typeset according to the target text typesetting type to generate the note.
  • the note generation method, device, computer equipment, and storage medium based on text recognition technology of this application use the emotion recognition model to recognize the emotion category of the note writer at the time of writing, such as excitement or sadness, and select the corresponding typesetting method according to that category.
  • the emotion category information is thereby preserved in the form of typesetting, which overcomes the defect of information loss (such as loss of emotion) in existing text recognition technology and improves the preservation of information.
  • FIG. 1 is a schematic flowchart of a note generation method based on text recognition technology according to an embodiment of the application;
  • FIG. 2 is a schematic block diagram of the structure of a note generation device based on text recognition technology according to an embodiment of the application;
  • FIG. 3 is a schematic block diagram of the structure of a computer device according to an embodiment of the application.
  • an embodiment of the present application provides a method for generating notes based on text recognition technology, which is applied to a designated terminal, and includes:
  • if the designated picture is not similar to the picture previously acquired by the designated terminal, a preset text recognition technology is used to recognize the handwritten text and printed text in the designated picture as handwritten text and printed text, respectively, and feature data of the handwritten text in the designated picture is extracted, wherein the feature data includes at least the positions of heavy strokes and the number of heavy strokes in the handwritten text;
  • a designated picture with handwritten text and printed text is acquired.
  • the designated picture may be a picture with handwritten text and printed text collected in real time through a preset camera, or may be a pre-stored picture with handwritten text and printed text.
  • Printed text refers to the fonts used by publications, i.e. the fonts used for text printed in batches, where publications are physical carriers such as books and magazines. There is therefore a clear distinction between handwritten text and printed text.
  • a preset picture similarity determination method is used to determine whether the designated picture is similar to the picture previously acquired by the designated terminal.
  • the picture similarity judgment method is, for example: compare the corresponding pixels in the two pictures one by one; if the proportion of identical pixels among all pixels is greater than a predetermined threshold, the pictures are judged to be similar, and otherwise they are judged not to be similar. If the designated picture is similar to the picture previously obtained by the designated terminal, the designated picture has already undergone recognition processing; the previous recognition result only needs to be retrieved, and the recognition operation does not need to be performed again.
  • in step S3, if the designated picture is not similar to the picture previously acquired by the designated terminal, the handwritten text and printed text in the designated picture are respectively recognized as handwritten text and printed text by using a preset text recognition technology, and feature data of the handwritten text in the designated picture is extracted, wherein the feature data includes at least the positions of heavy strokes and the number of heavy strokes in the handwritten text. If the designated picture is not similar to the picture previously acquired by the designated terminal, the designated picture has not undergone recognition processing and is a brand-new picture, and therefore needs to be recognized.
  • the preset text recognition technology is, for example, OCR (Optical Character Recognition) technology, in which one or more of the following technical means may be used in the recognition process:
  • Graying: the RGB model is used to represent each pixel of the image; the average of the R, G, and B values of each pixel replaces the original R, G, and B values to obtain the gray value of the image;
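The graying step above can be sketched as follows; this is a minimal illustration assuming an image represented as nested lists of (R, G, B) tuples, and the function name is hypothetical:

```python
def to_grayscale(pixels):
    """Replace each pixel's (R, G, B) values by their average, giving the
    gray value of the image, as described in the graying step above."""
    return [[round((r + g + b) / 3) for (r, g, b) in row] for row in pixels]

# A 1x2 image: pure red and pure white.
gray = to_grayscale([[(255, 0, 0), (255, 255, 255)]])
```

Working on gray values (one value per pixel instead of three) reduces the computation needed by the later comparison and binarization steps.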
  • Binarization: the pixels of the image are divided into a black part and a white part; black is regarded as foreground information and white as background information, which removes objects and backgrounds other than the target text from the original image. Noise reduction: median filter, mean filter, adaptive Wiener filter, etc.
  • Text segmentation: a projection operation is used to segment the text; a single line or multiple lines of text are projected onto the X axis and the values are accumulated. Text regions necessarily yield relatively large values while gap regions yield none, and by considering the plausibility of the gaps, individual characters are segmented. Feature extraction: special pixels such as extreme points and isolated points are extracted as feature points of the image and then subjected to dimensionality reduction to increase processing speed.
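The projection operation for text segmentation can be sketched as below; the representation (nested 0/1 lists, 1 = foreground) and the function name are assumptions, since the source gives no code:

```python
def segment_columns(binary):
    """Project foreground pixels onto the X axis by summing each column;
    columns whose accumulated value is zero are gaps between characters,
    so runs of non-zero columns delimit individual characters."""
    width = len(binary[0])
    profile = [sum(row[x] for row in binary) for x in range(width)]
    segments, start = [], None
    for x, value in enumerate(profile):
        if value > 0 and start is None:
            start = x                        # a character run begins
        elif value == 0 and start is not None:
            segments.append((start, x - 1))  # the run ended at x - 1
            start = None
    if start is not None:
        segments.append((start, width - 1))
    return segments
```

Each returned (start, end) pair is the column span of one segmented character; the same projection applied to the Y axis separates lines.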
  • the method for extracting the feature data of the handwritten text in the designated picture, where the feature data includes at least the positions of heavy strokes and the number of heavy strokes in the handwritten text, includes: dividing the strokes of the handwritten text into multiple points for data collection and analysis, obtaining the pressure value at each point, the stroke order when writing, and so on by identifying the data change trend of the pixels, and then obtaining the feature data including the positions of heavy strokes and the number of heavy strokes.
  • a heavy stroke refers to the stroke in the handwritten text written with the greatest force.
  • the neural network model can be any model, such as VGG16 model, VGG-F model, ResNet152 model, ResNet50 model, DPN131 model, AlexNet model, DenseNet model, etc., and the DPN model is preferred.
  • DPN stands for Dual Path Network.
  • the emotion categories can be classified in any manner, for example, including tension, happiness, sadness, indignation, and so on.
  • the target text typesetting type corresponding to the predicted emotion type is obtained according to the preset correspondence between the emotion category and the text typesetting type.
  • the preset correspondence between the emotion category and the text typesetting type is, for example: when the emotion category is a stable emotion, an identifier replaces the original handwritten text in place and the recognized handwritten text is recorded at the end of the document, without disturbing the printed text;
  • when the emotion category is agitation, the handwritten text is typeset in a special font in its original position.
  • the text typesetting can be any feasible scheme, where the typesetting type corresponds to the emotion category: for example, a passionate emotion category is reflected with a red, bold font, and a sad emotion category with a green, italic font.
  • the typesetting type can also include any other feasible types.
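A minimal sketch of such a correspondence between emotion categories and typesetting types might look like the following; the category names and style attributes are hypothetical examples following the text:

```python
# Hypothetical correspondence table: emotion category -> typesetting type.
TYPESETTING = {
    "passionate": {"color": "red", "weight": "bold"},
    "sad": {"color": "green", "style": "italic"},
    "stable": {"placement": "endnote"},  # replace in place, record at the end
}

def typeset_style(emotion):
    """Look up the target text typesetting type for a predicted emotion
    category, falling back to plain black text for unknown categories."""
    return TYPESETTING.get(emotion, {"color": "black"})
```

In practice the table would be a preset configuration of the designated terminal, so that any predicted emotion category maps deterministically to one typesetting type.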
  • in step S6, the printed text and the handwritten text are typeset according to the target text typesetting type to generate the note. Because the note obtained by typesetting the printed text and the handwritten text according to the target text typesetting type retains more of the information of the original handwritten text, the recognition result is more relevant, the user experience is better, and the rate of missing information is lower.
  • the step S2 of judging whether the designated picture is similar to the picture previously obtained by the designated terminal by using a preset picture similarity judgment method includes:
  • S201 Perform gray-scale processing on the designated picture and the picture previously acquired by the designated terminal respectively to obtain a first gray-scale picture and a second gray-scale picture;
  • S202 Calculate the average value Am of the gray values of all pixels in the m-th column or the m-th row of the gray-scale picture, and calculate the average value B of the gray values of all the pixels in the gray-scale picture;
  • grayscale means that the color of each pixel is a shade of gray.
  • the gray scale range is, for example, 0-255 (when the values of R, G, and B are all 0-255, of course, it will also change with the change of the value range of R, G, and B).
  • the gray-scale processing method can be any method, such as the component method, the maximum value method, the average method, and the weighted average method. Among them, since there are only 256 value ranges for gray values, image comparison on this basis can greatly reduce the amount of calculation. Then calculate the average value Am of the gray values of all pixels in the m-th column or the m-th row of the gray-scale picture, and calculate the average value B of the gray values of all the pixels in the gray-scale picture.
  • the process of calculating the average value Am of the gray values of all pixels in the m-th column or m-th row of the gray-scale picture includes: collect the gray values of all pixels in the m-th column or m-th row of the gray-scale picture, add them together, and divide the resulting sum by the number of pixels in the m-th column or m-th row to obtain the average value Am of the gray values of all pixels in that column or row.
  • the process of calculating the average value B of the gray values of all pixels in the gray-scale picture includes: calculate the sum of the gray values of all pixels in the gray-scale picture, then divide that sum by the number of pixels to obtain the average value B of the gray values of all pixels in the gray-scale picture.
  • the overall variance is used to measure the difference between the average value Am of the gray values of the pixels in the m-th column or m-th row of the gray-scale picture and the average value B of the gray values of all pixels in the gray-scale picture.
  • the overall variance of the m-th column or m-th row is computed as Dm = (Am - B)^2 for each of the two gray-scale pictures, and the difference between the two pictures is taken as the maximum, over all columns or rows m, of the absolute difference between the Dm of the first gray-scale picture and the Dm of the second gray-scale picture (the return value is the maximum value among these differences). If the overall variances of the m-th column or m-th row of the two pictures are the same or approximately the same, the gray values of that column or row are considered the same or approximately the same (the approximate judgment saves computing power, and because the overall variances of two different pictures are generally unequal, the accuracy of the judgment is high); otherwise, the gray values of the m-th column or m-th row of the first and second gray-scale pictures are considered different. It is then judged whether the maximum difference is less than the preset variance error threshold.
  • if the difference between the overall variances is less than the preset variance error threshold, the designated picture is similar to the picture previously acquired by the designated terminal. An approximate judgment is used (because the gray values of gray-scale pictures converted from two different pictures are generally not all equal, while those converted from the same picture are generally all equal), so it can be determined whether the designated picture is similar to the picture previously acquired by the designated terminal while reducing the computing resources consumed. Accordingly, the subsequent steps are performed only when the designated picture is not similar to the picture previously acquired by the designated terminal (if it is similar, the designated picture has already been processed for note generation and does not need to be processed again), reducing unnecessary resource consumption.
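The column-mean comparison described above can be sketched as follows. Because the formulas in the source are presented as images, the per-column overall variance (Am - B)^2 and the maximum-difference criterion are assumptions inferred from the surrounding text:

```python
def column_stats(gray):
    """Per-column average gray values Am and the global average B for a
    grayscale picture given as a list of rows of gray values."""
    rows, cols = len(gray), len(gray[0])
    a = [sum(gray[r][c] for r in range(rows)) / rows for c in range(cols)]
    b = sum(sum(row) for row in gray) / (rows * cols)
    return a, b

def variance_gap(gray1, gray2):
    """Largest difference between the per-column overall variances
    (Am - B)^2 of two same-sized grayscale pictures; the pictures are
    judged similar when this gap is below a preset error threshold."""
    a1, b1 = column_stats(gray1)
    a2, b2 = column_stats(gray2)
    return max(abs((x - b1) ** 2 - (y - b2) ** 2) for x, y in zip(a1, a2))
```

Identical pictures give a gap of zero, so any positive variance error threshold accepts them, while pictures with different column statistics generally produce a large gap.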
  • the step S2 of judging whether the designated picture is similar to the picture previously obtained by the designated terminal by using a preset picture similarity judgment method includes:
  • according to the formula: proportion of identical pixels = number of identical pixels / number of all pixels in the designated picture, the proportion of identical pixels is obtained.
  • this embodiment adopts a method of successively comparing pixels for judgment. If the two pictures are the same, the number of the same pixels should account for the vast majority, that is, the proportion of the same pixels is close to 1.
  • according to the formula: proportion of identical pixels = number of identical pixels / number of all pixels in the designated picture, the proportion of identical pixels is calculated, and if the proportion of identical pixels is greater than the preset proportion threshold, it is determined that the designated picture is similar to the picture previously acquired by the designated terminal.
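This pixel-by-pixel comparison can be sketched as follows (the function name and the 0.95 threshold are hypothetical; any preset proportion threshold works the same way):

```python
def is_similar(pic_a, pic_b, proportion_threshold=0.95):
    """Compare corresponding pixels one by one and judge the pictures
    similar when the proportion of identical pixels exceeds the preset
    proportion threshold."""
    flat_a = [p for row in pic_a for p in row]
    flat_b = [p for row in pic_b for p in row]
    if len(flat_a) != len(flat_b):
        return False  # differently sized pictures cannot match pixel-wise
    same = sum(1 for a, b in zip(flat_a, flat_b) if a == b)
    return same / len(flat_a) > proportion_threshold

# Example: a 2x2 picture compared with itself and with a modified copy.
pic = [[(0, 0, 0), (1, 1, 1)], [(2, 2, 2), (3, 3, 3)]]
```
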
  • when the color of the handwritten text is different from the color of the printed text, the step S3 of using the preset text recognition technology to recognize the handwritten text and the printed text in the designated picture as handwritten text and printed text, respectively, includes:
  • S301 Collect the value of the R color channel, the value of the G color channel, and the value of the B color channel in the RGB color model of each pixel in the designated picture, and, according to a preset three-value method, set the RGB color of each pixel in the designated picture to (0,0,0), (255,255,255) or (P,P,P), where P is a preset value greater than 0 and less than 255, to obtain a temporary picture composed of three colors;
  • this application uses a three-value method: according to the preset three-value method, the RGB color of each pixel in the designated picture is set to (0,0,0), (255,255,255) or (P,P,P), where P is a preset value greater than 0 and less than 255, to obtain a temporary picture composed of three colors. The area occupied by each of the three colors in the temporary picture is then calculated, and the preset text segmentation method is applied to the areas occupied by the two colors with the smaller areas (since the largest area is certainly the background, there is no need to analyze it) to obtain segmented single handwritten characters and segmented single printed characters.
  • the support vector machine is a generalized linear classifier that performs binary classification of data in a supervised learning manner, and is suitable for comparing the recognized text with the pre-stored text to output the most similar text. According to this, the text features of the single handwritten text and the text features of the single printed text are extracted, and input into a preset support vector machine for classification, and the recognized handwritten text and printed text are obtained.
  • the character feature is, for example, a special point in the pixel point corresponding to the character, such as an extreme point, an isolated point, etc.
  • the step S301 of collecting the value of the R color channel, the value of the G color channel, and the value of the B color channel in the RGB color model of each pixel in the designated picture and setting, according to the preset three-value method, the RGB color of each pixel in the designated picture to (0,0,0), (255,255,255) or (P,P,P) includes:
  • F2 = MAX{ROUND[(a1R+a2G+a3B)/L, 0], B}, to obtain the reference value F2, where MAX is the maximum-value function, B is the second threshold parameter with a preset value in the range (0,255), and B is greater than A;
  • the value of the R color channel, the value of the G color channel, and the value of the B color channel in the RGB color model of each pixel in the designated picture are collected, and according to the preset three-value method the RGB color of each pixel in the designated picture is set to (0,0,0), (255,255,255) or (P,P,P).
  • the ROUND function is a rounding function.
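Under the reading that the reference values are F1 = MIN{ROUND[(a1R+a2G+a3B)/L, 0], A} and F2 = MAX{ROUND[(a1R+a2G+a3B)/L, 0], B} (an assumption, since the complete formulas are presented as images in the source), the three-value step might be sketched as:

```python
def ternarize(pixels, a1=1, a2=1, a3=1, L=3, A=85, B=170, P=128):
    """Map each pixel to black, white, or the middle color (P, P, P)
    according to its weighted gray value; the weights a1..a3, the divisor
    L, and the thresholds A < B are preset parameters (the values used
    here are illustrative)."""
    out = []
    for row in pixels:
        new_row = []
        for (r, g, b) in row:
            gray = round((a1 * r + a2 * g + a3 * b) / L)
            if min(gray, A) != A:       # F1 != A: gray < A, dark -> black
                new_row.append((0, 0, 0))
            elif max(gray, B) != B:     # F2 != B: gray > B, bright -> white
                new_row.append((255, 255, 255))
            else:                       # between A and B -> middle color
                new_row.append((P, P, P))
        out.append(new_row)
    return out
```

This matches the judgment modules described later: a pixel is left as (P,P,P) only when F1 equals A and F2 equals B, i.e. when its gray value lies between the two thresholds.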
  • S401 retrieve pre-collected sample data, and divide the sample data into a training set and a test set; wherein the sample data includes pre-collected handwritten text and emotion categories associated with the pre-collected handwritten text;
  • S402 Input the sample data of the training set into a preset neural network model for training to obtain an initial emotion recognition model, where the stochastic gradient descent method is used in the training process;
  • S403 Use the sample data of the test set to verify the initial emotion recognition model.
  • if the verification is passed, the initial emotion recognition model is recorded as the emotion recognition model.
  • This application is based on a neural network model to train an emotion recognition model.
  • the neural network model can be VGG16 model, VGG-F model, ResNet152 model, ResNet50 model, DPN131 model, AlexNet model, DenseNet model, etc.
  • the stochastic gradient descent method randomly samples part of the training data in place of the entire training set; if the sample size is large (for example, hundreds of thousands), perhaps only tens of thousands or even thousands of samples are used when iterating toward the optimal solution, which improves training speed. Further, the training can also use back-propagation to update the parameters of each layer of the neural network.
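The minibatch idea behind stochastic gradient descent can be illustrated with a toy least-squares step; this is only a sketch of the optimization principle, not the patent's actual network or training code:

```python
def sgd_step(w, batch, lr=0.1):
    """One gradient step on a sampled minibatch for the least-squares loss
    sum((w . x - y)^2): the gradient is estimated from the batch instead
    of the entire training set, which is what speeds training up."""
    grad = [0.0] * len(w)
    for x, y in batch:
        err = sum(wi * xi for wi, xi in zip(w, x)) - y
        for i, xi in enumerate(x):
            grad[i] += 2 * err * xi  # d/dw_i of (w.x - y)^2
    return [wi - lr * gi / len(batch) for wi, gi in zip(w, grad)]
```

Repeating such steps over randomly drawn minibatches approximates full-batch gradient descent at a fraction of the per-iteration cost; back-propagation computes gradients of the same kind layer by layer for a deep network.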
  • back-propagation is based on the gradient descent method, and its input-output relationship is essentially a mapping: the function of a neural network with n inputs and m outputs is a continuous mapping from n-dimensional Euclidean space to a finite field in m-dimensional Euclidean space. This mapping is highly non-linear, which facilitates updating the parameters of each layer of the neural network model.
  • the sample data of the test set is then used to verify the initial emotion recognition model, and if the verification is passed, the initial emotion recognition model is recorded as the emotion recognition model.
  • after the step S6 of typesetting the printed text and the handwritten text according to the target text typesetting type, the method further includes:
  • S61 Receive an acquisition request for acquiring a handwritten note sent by a second terminal, where the acquisition request records a reading format supported by the second terminal;
  • the note is sent to the second terminal. Since the second terminal may not support reading and displaying the note, the note's format is checked before it is sent, to avoid the second terminal failing to recognize it. On this basis, it is determined whether the reading format of the reading software can display the note; if it can, the note is sent to the second terminal. Further, if the reading format of the reading software cannot display the note, the note is converted to the reading format of the reading software and then sent to the second terminal.
  • an embodiment of the present application provides a note generation device based on text recognition technology, which is applied to a designated terminal, and includes:
  • the designated picture acquiring unit 10 is used to acquire designated pictures with handwritten text and printed text;
  • the similarity determination unit 20 is configured to use a preset picture similarity determination method to determine whether the designated picture is similar to the picture previously acquired by the designated terminal;
  • the feature data acquiring unit 30 is configured to, if the designated picture is not similar to the picture previously acquired by the designated terminal, use a preset text recognition technology to recognize the handwritten text and printed text in the designated picture as handwritten text and printed text, and to extract feature data of the handwritten text in the designated picture, wherein the feature data includes at least the positions of heavy strokes and the number of heavy strokes in the handwritten text;
  • the predicted emotion category obtaining unit 40 is configured to input the feature data into an emotion recognition model trained based on a neural network model to obtain the predicted emotion category output by the emotion recognition model, wherein the emotion recognition model is based on pre-collected handwritten text , And training from sample data composed of emotion categories associated with the pre-collected handwritten text;
  • the typesetting type obtaining unit 50 is configured to obtain the target text typesetting type corresponding to the predicted emotion type according to the preset correspondence between the emotion category and the text typesetting type;
  • the typesetting unit 60 is configured to typeset the printed text and the handwritten text according to the target text typesetting type to generate the note.
  • the similarity judgment unit 20 includes:
  • a grayscale subunit configured to perform grayscale processing on the designated picture and the picture previously acquired by the designated terminal, respectively, to obtain a first grayscale picture and a second grayscale picture;
  • the average value calculation subunit is used to calculate the average value Am of the gray values of all pixels in the m-th column or the m-th row of the gray-scale image, and calculate the average value B of the gray values of all the pixels in the gray-scale image ;
  • the overall variance calculation subunit is used to calculate, according to the formula Dm = (Am - B)^2 for m = 1, ..., N, the overall variance of the m-th column or m-th row of the gray-scale picture, where N is the total number of columns or rows in the gray-scale picture;
  • the variance difference calculation subunit is used to obtain the difference between the overall variance of the m-th column or m-th row of the first gray-scale picture and the overall variance of the m-th column or m-th row of the second gray-scale picture, taking the maximum of these differences over all columns or rows;
  • the error threshold judgment subunit is used to judge whether the difference is less than the preset variance error threshold;
  • the similarity determination subunit is used to determine, if the difference is less than the preset variance error threshold, that the designated picture is similar to the picture previously obtained by the designated terminal.
  • the similarity judgment unit 20 includes:
  • the same pixel count subunit which is used to sequentially compare corresponding pixels in the designated picture and the picture previously obtained by the designated terminal, and count the number of identical pixels;
  • the proportion threshold judging subunit is used to judge whether the proportion of the same pixel is greater than a preset proportion threshold
  • the second similarity determination subunit is configured to determine that the designated picture is similar to the picture previously obtained by the designated terminal if the proportion of the same pixel is greater than the preset proportion threshold.
  • when the color of the handwritten text is different from the color of the printed text, the characteristic data acquiring unit 30 includes:
  • the temporary picture generation subunit is used to collect the value of the R color channel, the value of the G color channel, and the value of the B color channel in the RGB color model of each pixel in the designated picture, and, according to the preset three-value method, set the RGB color of each pixel in the designated picture to (0,0,0), (255,255,255) or (P,P,P), where P is a preset value greater than 0 and less than 255, to obtain a temporary picture composed of three colors;
  • the segmentation subunit is used to calculate the area occupied by the three colors in the temporary picture, and use the preset text segmentation method for the area occupied by the two colors with the smaller area to obtain the divided single handwritten text and Separate single printed text;
  • the recognition subunit is used to extract the text features of the single handwritten characters and the text features of the single printed characters and input them into a preset support vector machine for classification, obtaining the recognized handwritten text and printed text.
  • the temporary picture generation subunit includes:
  • the reference value F1 judgment module is used to judge whether the value of the reference value F1 is equal to A;
  • the reference value F2 judgment module is used to judge whether the value of the reference value F2 is equal to B;
  • the color setting module is configured to set the RGB color of the designated pixel to (255, 255, 255) if the value of the reference value F2 is not equal to B.
  • the device includes:
  • the sample data retrieval unit is used to retrieve pre-collected sample data and divide the sample data into a training set and a test set; wherein the sample data includes pre-collected handwritten characters and is associated with the pre-collected handwritten characters Emotional category;
  • the training unit is used to input the sample data of the training set into the preset neural network model for training to obtain the initial emotion recognition model, wherein the stochastic gradient descent method is used in the training process;
  • a verification unit for verifying the initial emotion recognition model by using sample data of the test set
  • the marking unit is configured to record the initial emotion recognition model as the emotion recognition model if the verification of the initial emotion recognition model is passed.
  • the device includes:
  • a reading format obtaining unit configured to receive an obtaining request for obtaining handwritten notes sent by a second terminal, wherein the obtaining request records a reading format supported by the second terminal;
  • the reading format judgment unit is used to judge whether the reading format of the reading software can display the notes
  • the note sending unit is configured to send the note to the second terminal if the reading format of the reading software can display the note.
  • an embodiment of the present application also provides a computer device.
  • the computer device may be a server, and its internal structure may be as shown in the figure.
  • the computer equipment includes a processor, a memory, a network interface, and a database connected through a system bus, where the processor of the computer device is used to provide computing and control capabilities.
  • the memory of the computer device includes a non-volatile storage medium and an internal memory.
  • the non-volatile storage medium stores an operating system, a computer program, and a database.
  • the internal memory provides an environment for the operation of the operating system and the computer program stored in the non-volatile storage medium.
  • the database of the computer equipment is used to store the data used in the note generation method based on text recognition technology.
  • the network interface of the computer device is used to communicate with an external terminal through a network connection.
  • the computer program is executed by the processor to realize a note generation method based on text recognition technology.
  • the above-mentioned processor executes the above-mentioned note generation method based on text recognition technology; the steps of the method correspond one-to-one to the steps of the note generation method based on text recognition technology of the aforementioned embodiment, and will not be repeated here.
  • An embodiment of the present application also provides a computer-readable storage medium on which a computer program is stored.
  • the computer-readable storage medium is, for example, a non-volatile computer-readable storage medium or a volatile computer-readable storage medium.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)
  • Character Input (AREA)

Abstract

Disclosed are a note generation method and apparatus based on character recognition technology, a computer device, and a storage medium, the method comprising: acquiring a designated picture containing handwritten characters and printed characters; if the designated picture is not similar to a picture previously acquired by a designated terminal, recognizing the handwritten characters and the printed characters in the designated picture as handwritten text and printed text respectively, and extracting feature data of the handwritten characters in the designated picture; inputting the feature data into an emotion recognition model trained on the basis of a neural network model and acquiring a predicted emotion category output by the emotion recognition model; acquiring a designated text typesetting type corresponding to the predicted emotion category; and typesetting the printed text and the handwritten text according to the designated text typesetting type to generate a handwritten note. The degree of information preservation is thereby increased.
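The abstract's method steps can be sketched end-to-end as one pipeline. Every callable below is an injected stub standing in for a component the application leaves unspecified (the similarity test, the character recognizers, the feature extractor, the emotion model, and the typesetting lookup); the function names are placeholders, not the patent's implementation.

```python
def generate_handwritten_note(picture, previous_pictures,
                              is_similar, recognize, extract_features,
                              emotion_model, typesetting_for_emotion):
    """Sketch of the abstract's flow from designated picture to note."""
    # Step 1: if the designated picture is similar to a previously
    # acquired picture, do not generate a new note.
    if any(is_similar(picture, prev) for prev in previous_pictures):
        return None

    # Step 2: recognize handwritten and printed characters separately.
    handwritten_text, printed_text = recognize(picture)

    # Step 3: predict an emotion category from handwriting feature data.
    emotion = emotion_model(extract_features(picture))

    # Step 4: typeset both texts according to the typesetting type
    # associated with the predicted emotion category.
    layout = typesetting_for_emotion(emotion)
    return layout(printed_text, handwritten_text)
```

With trivial stubs this returns a typeset string for a new picture and `None` for a duplicate, which is the branch the abstract describes.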
PCT/CN2019/116337 2019-09-03 2019-11-07 Procédé et appareil de génération de notes basés sur une technologie de reconnaissance de caractères, et dispositif informatique WO2021042505A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910828605.2 2019-09-03
CN201910828605.2A CN110705233B (zh) 2019-09-03 2019-09-03 基于文字识别技术的笔记生成方法、装置和计算机设备

Publications (1)

Publication Number Publication Date
WO2021042505A1 true WO2021042505A1 (fr) 2021-03-11

Family

ID=69194318

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/116337 WO2021042505A1 (fr) 2019-09-03 2019-11-07 Procédé et appareil de génération de notes basés sur une technologie de reconnaissance de caractères, et dispositif informatique

Country Status (2)

Country Link
CN (1) CN110705233B (fr)
WO (1) WO2021042505A1 (fr)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112882678A (zh) * 2021-03-15 2021-06-01 百度在线网络技术(北京)有限公司 图文处理方法和展示方法、装置、设备和存储介质
CN113255613A (zh) * 2021-07-06 2021-08-13 北京世纪好未来教育科技有限公司 判题方法、装置及计算机存储介质
CN113486653A (zh) * 2021-07-06 2021-10-08 安徽淘云科技股份有限公司 形近字多候选控制方法、装置和设备

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111651960B (zh) * 2020-06-01 2023-05-30 杭州尚尚签网络科技有限公司 一种从合同简体迁移到繁体的光学字符联合训练及识别方法
CN111832547A (zh) * 2020-06-24 2020-10-27 平安普惠企业管理有限公司 文字识别模型的动态部署方法、装置和计算机设备
CN112257710A (zh) * 2020-10-26 2021-01-22 北京云杉世界信息技术有限公司 一种带文字平面的图片倾斜度检测方法及装置
CN112257629A (zh) * 2020-10-29 2021-01-22 广联达科技股份有限公司 一种建筑图纸的文本信息识别方法及装置
CN113610186A (zh) * 2021-08-20 2021-11-05 湖州师范学院 一种数字化书写识别情绪状态的方法

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050100217A1 (en) * 2003-11-07 2005-05-12 Microsoft Corporation Template-based cursive handwriting recognition
CN106598948A (zh) * 2016-12-19 2017-04-26 杭州语忆科技有限公司 基于长短期记忆神经网络结合自动编码器的情绪识别方法
CN108885555A (zh) * 2016-11-30 2018-11-23 微软技术许可有限责任公司 基于情绪的交互方法和装置
CN109189985A (zh) * 2018-08-17 2019-01-11 北京达佳互联信息技术有限公司 文本风格处理方法、装置、电子设备及存储介质
CN109815463A (zh) * 2018-12-13 2019-05-28 深圳壹账通智能科技有限公司 文本编辑选取控制方法、装置、计算机设备及存储介质
CN110135427A (zh) * 2019-04-11 2019-08-16 北京百度网讯科技有限公司 用于识别图像中的字符的方法、装置、设备和介质

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2767894A1 (fr) * 2013-02-15 2014-08-20 BlackBerry Limited Procédé et appareil d'ajustement d'embellissements graphique de texte
US20160239608A1 (en) * 2013-09-13 2016-08-18 Vivago Oy Arrangement and a method for creating a synthesis from numerical data and textual information
US20170068436A1 (en) * 2015-09-03 2017-03-09 Microsoft Technology Licensing, Llc Interpreting and Supplementing Captured Stroke Information
US10210383B2 (en) * 2015-09-03 2019-02-19 Microsoft Technology Licensing, Llc Interacting with an assistant component based on captured stroke information

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050100217A1 (en) * 2003-11-07 2005-05-12 Microsoft Corporation Template-based cursive handwriting recognition
CN108885555A (zh) * 2016-11-30 2018-11-23 微软技术许可有限责任公司 基于情绪的交互方法和装置
CN106598948A (zh) * 2016-12-19 2017-04-26 杭州语忆科技有限公司 基于长短期记忆神经网络结合自动编码器的情绪识别方法
CN109189985A (zh) * 2018-08-17 2019-01-11 北京达佳互联信息技术有限公司 文本风格处理方法、装置、电子设备及存储介质
CN109815463A (zh) * 2018-12-13 2019-05-28 深圳壹账通智能科技有限公司 文本编辑选取控制方法、装置、计算机设备及存储介质
CN110135427A (zh) * 2019-04-11 2019-08-16 北京百度网讯科技有限公司 用于识别图像中的字符的方法、装置、设备和介质

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112882678A (zh) * 2021-03-15 2021-06-01 百度在线网络技术(北京)有限公司 图文处理方法和展示方法、装置、设备和存储介质
CN112882678B (zh) * 2021-03-15 2024-04-09 百度在线网络技术(北京)有限公司 图文处理方法和展示方法、装置、设备和存储介质
CN113255613A (zh) * 2021-07-06 2021-08-13 北京世纪好未来教育科技有限公司 判题方法、装置及计算机存储介质
CN113255613B (zh) * 2021-07-06 2021-09-24 北京世纪好未来教育科技有限公司 判题方法、装置及计算机存储介质
CN113486653A (zh) * 2021-07-06 2021-10-08 安徽淘云科技股份有限公司 形近字多候选控制方法、装置和设备

Also Published As

Publication number Publication date
CN110705233A (zh) 2020-01-17
CN110705233B (zh) 2023-04-07

Similar Documents

Publication Publication Date Title
WO2021042505A1 (fr) Procédé et appareil de génération de notes basés sur une technologie de reconnaissance de caractères, et dispositif informatique
WO2021027336A1 (fr) Procédé et appareil d'authentification basés sur un cachet et une signature et dispositif informatique
Dutta et al. Improving CNN-RNN hybrid networks for handwriting recognition
CN109543690B (zh) 用于提取信息的方法和装置
CN108664996B (zh) 一种基于深度学习的古文字识别方法及系统
CN108764195B (zh) 手写模型训练方法、手写字识别方法、装置、设备及介质
RU2707147C1 (ru) Обучение нейронной сети посредством специализированных функций потерь
CN111191568B (zh) 翻拍图像识别方法、装置、设备及介质
CN113254654B (zh) 模型训练、文本识别方法、装置、设备和介质
WO2020164278A1 (fr) Dispositif et procédé de traitement des images, appareil électronique, et support d'enregistrement lisible
CN113011253B (zh) 基于ResNeXt网络的人脸表情识别方法、装置、设备及存储介质
CN113011144A (zh) 表单信息的获取方法、装置和服务器
CN111932418B (zh) 一种学生学习情况识别方法、系统、教学终端及存储介质
CN114357206A (zh) 基于语义分析的教育类视频彩色字幕生成方法及系统
WO2022062028A1 (fr) Procédé de reconnaissance d'étiquette de vin, procédé et appareil de gestion d'informations d'œnologie, dispositif, et support de stockage
CN113111880A (zh) 证件图像校正方法、装置、电子设备及存储介质
JP2012048624A (ja) 学習装置、方法及びプログラム
CN111340032A (zh) 一种基于金融领域应用场景的字符识别方法
CN113361666B (zh) 一种手写字符识别方法、系统及介质
CN111008624A (zh) 光学字符识别方法和产生光学字符识别的训练样本的方法
CN114037886A (zh) 图像识别方法、装置、电子设备和可读存储介质
CN111242114B (zh) 文字识别方法及装置
CN111881880A (zh) 一种基于新型网络的票据文本识别方法
CN116645683A (zh) 基于提示学习的签名笔迹鉴别方法、系统及存储介质
CN111414889A (zh) 基于文字识别的财务报表识别方法及装置

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19944315

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19944315

Country of ref document: EP

Kind code of ref document: A1