WO2021042505A1 - Note generation method and apparatus based on character recognition technology, and computer device - Google Patents

Note generation method and apparatus based on character recognition technology, and computer device

Info

Publication number
WO2021042505A1
WO2021042505A1 (application PCT/CN2019/116337, CN2019116337W)
Authority
WO
WIPO (PCT)
Prior art keywords
picture
text
designated
value
preset
Prior art date
Application number
PCT/CN2019/116337
Other languages
French (fr)
Chinese (zh)
Inventor
温桂龙 (Wen Guilong)
Original Assignee
平安科技(深圳)有限公司 (Ping An Technology (Shenzhen) Co., Ltd.)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司 (Ping An Technology (Shenzhen) Co., Ltd.)
Publication of WO2021042505A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/08: Learning methods
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 30/00: Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V 30/10: Character recognition
    • G06V 30/14: Image acquisition
    • G06V 30/148: Segmentation of character regions
    • G06V 30/153: Segmentation of character regions using recognition of characters or words
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 30/00: Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V 30/10: Character recognition
    • G06V 30/24: Character recognition characterised by the processing or recognition method
    • G06V 30/242: Division of the character sequences into groups prior to recognition; Selection of dictionaries
    • G06V 30/244: Division of the character sequences into groups prior to recognition; Selection of dictionaries using graphical properties, e.g. alphabet type or font
    • G06V 30/2455: Discrimination between machine-print, hand-print and cursive writing
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 30/00: Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V 30/10: Character recognition

Definitions

  • This application relates to the computer field, in particular to a method, device, computer equipment and storage medium for generating notes based on text recognition technology.
  • the main purpose of this application is to provide a note generation method, device, computer equipment, and storage medium based on text recognition technology, aiming to improve the preservation of information when generating notes.
  • this application proposes a note generation method based on text recognition technology, which is applied to a designated terminal, and includes:
  • the handwritten text and the printed text in the designated picture are recognized as handwritten text and printed text, respectively, using a preset text recognition technology, and feature data of the handwritten text in the designated picture is extracted, wherein the feature data includes at least the positions of heavy strokes and the number of heavy strokes in the handwritten text;
  • the feature data is input into an emotion recognition model trained on the basis of a neural network model to obtain the predicted emotion category output by the emotion recognition model, wherein the emotion recognition model is trained on sample data composed of pre-collected handwritten text and the emotion categories associated with that handwritten text;
  • the printed text and the handwritten text are typeset according to the target text typesetting type to generate the note.
  • the note generation method, device, computer equipment, and storage medium based on text recognition technology of this application use the emotion recognition model to recognize the emotion category of the note writer at the time of writing, and select the corresponding typesetting method according to that category. The emotion-category information, such as excitement or sadness, is thus preserved in the form of the typesetting itself, which overcomes the defect of information loss (such as loss of emotion) in existing text recognition technology and improves the preservation of information.
  • FIG. 1 is a schematic flowchart of a note generation method based on text recognition technology according to an embodiment of the application;
  • FIG. 2 is a schematic block diagram of the structure of a note generation device based on text recognition technology according to an embodiment of the application;
  • FIG. 3 is a schematic block diagram of the structure of a computer device according to an embodiment of the application.
  • an embodiment of the present application provides a method for generating notes based on text recognition technology, which is applied to a designated terminal, and includes:
  • if the designated picture is not similar to a picture previously acquired by the designated terminal, a preset text recognition technology is used to recognize the handwritten text and printed text in the designated picture as handwritten text and printed text, respectively, and feature data of the handwritten text in the designated picture is extracted, wherein the feature data includes at least the positions of heavy strokes and the number of heavy strokes in the handwritten text;
  • a designated picture with handwritten text and printed text is acquired.
  • the designated picture may be a picture with handwritten text and printed text collected in real time through a preset camera, or may be a pre-stored picture with handwritten text and printed text.
  • Printed text refers to the font used by publications to publish text, and is the font used for text printed in batches, where publications are physical carriers such as books and magazines. Therefore, there is a clear distinction between handwritten text and printed text.
  • a preset picture similarity determination method is used to determine whether the designated picture is similar to the picture previously acquired by the designated terminal.
  • the picture similarity judgment method is, for example: compare the corresponding pixels of the two pictures one by one; if the proportion of identical pixels among all pixels is greater than a predetermined threshold, the pictures are judged similar; if the proportion is not greater than the predetermined threshold, they are judged not similar. If the designated picture is similar to a picture previously obtained by the designated terminal, the designated picture has already undergone recognition processing, so the previous recognition result can simply be retrieved and the recognition operation need not be performed again.
  • in step S3, if the designated picture is not similar to the picture previously acquired by the designated terminal, the handwritten text and printed text in the designated picture are recognized as handwritten text and printed text, respectively, using a preset text recognition technology, and feature data of the handwritten text in the designated picture is extracted, wherein the feature data includes at least the positions of heavy strokes and the number of heavy strokes in the handwritten text. If the designated picture is not similar to any picture previously acquired by the designated terminal, the designated picture has not undergone recognition processing and is a brand-new picture, and therefore needs to be recognized.
  • the preset text recognition technology is, for example, OCR (Optical Character Recognition, optical character recognition) technology, in which one or more of the following technical means can be used in the recognition process:
  • Grayscale conversion: with the RGB model representing each pixel of the image, replace each pixel's original R, G, and B values with their average to obtain the gray value of the image;
  • Binarization: divide the pixels of the image into a black part and a white part, treating black as foreground information and white as background information, so as to remove objects and background in the original image other than the target text; noise reduction: median filtering, mean filtering, adaptive Wiener filtering, etc.
  • Text segmentation: use a projection operation to segment text. Project one or more lines of text onto the X axis and accumulate the values; text areas yield relatively large accumulated values while interval areas yield none, and the reasonableness of the intervals is then considered, so that individual characters are segmented. Feature extraction: extract special pixels, such as extreme points and isolated points, as feature points of the image, and then apply dimensionality reduction to them to increase processing speed.
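  The projection-based segmentation step above can be sketched as follows. This is a minimal illustration assuming an already-binarized single text line where 1 marks foreground pixels; it is not the patent's actual implementation.

```python
import numpy as np

def segment_by_projection(binary_img):
    """Segment characters by projecting a binarized text line onto the X axis.

    binary_img: 2-D array where 1 marks text (foreground) pixels and 0 marks
    background, as produced by the binarization step described above.
    Returns a list of (start_col, end_col) spans, one per character.
    """
    # Accumulate foreground pixels column by column (projection onto X axis).
    profile = binary_img.sum(axis=0)
    spans, start = [], None
    for x, v in enumerate(profile):
        if v > 0 and start is None:
            start = x                      # a text region begins
        elif v == 0 and start is not None:
            spans.append((start, x - 1))   # an interval region closes the span
            start = None
    if start is not None:
        spans.append((start, len(profile) - 1))
    return spans

# Two "characters" separated by an empty column gap:
img = np.array([[1, 1, 0, 0, 1],
                [1, 0, 0, 0, 1]])
print(segment_by_projection(img))  # [(0, 1), (4, 4)]
```

  A full pipeline would first project onto the Y axis to isolate lines, then apply the same scan line by line, and then weigh the "reasonableness" of the gaps before committing to a cut.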
  • the method for extracting the feature data of the handwritten text in the designated picture, where the feature data includes at least the positions of heavy strokes and the number of heavy strokes in the handwritten text, includes: dividing the strokes of the handwritten text into multiple points for data collection and analysis, obtaining the pressure value of each point, the clarity of the writing sequence, and so on by identifying the data change trend of the pixels, and thereby obtaining feature data including the positions of heavy strokes and the number of heavy strokes.
  • a heavy stroke refers to a stroke of the handwritten text written with the greatest force.
  • the neural network model can be any model, such as the VGG16, VGG-F, ResNet152, ResNet50, DPN131, AlexNet, or DenseNet model, with the DPN (Dual Path Network) model preferred.
  • the emotion categories can be classified in any manner, for example, including tension, happiness, sadness, indignation, and so on.
  • the target text typesetting type corresponding to the predicted emotion type is obtained according to the preset correspondence between the emotion category and the text typesetting type.
  • the preset correspondence between emotion category and text typesetting type is, for example: when the emotion category is a stable emotion, an identifier replaces the original handwritten text and the recognized handwritten text is recorded at the end of the document, without disturbing the printed text;
  • when the emotion category is agitation, the handwritten text is typeset in a special font in its original place.
  • the text typesetting can be done in any feasible way, where the typesetting type corresponds to the emotion category: for example, a passionate emotion category is reflected with a red, bold font, and a sad emotion category with a green, italic font.
  • the typesetting type can also include any other feasible types.
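  As a concrete illustration of such a correspondence table, the following sketch maps emotion categories to typesetting rules. All category names, style values, and the marker format are hypothetical, chosen only to match the examples given in the text.

```python
# Hypothetical mapping from predicted emotion category to typesetting rules.
TYPESETTING_BY_EMOTION = {
    "passionate": {"color": "red",   "bold": True,  "italic": False, "inline": True},
    "sad":        {"color": "green", "bold": False, "italic": True,  "inline": True},
    # Stable emotion: replace the handwriting with an identifier in place and
    # record the recognized text at the end, leaving the printed text intact.
    "stable":     {"color": "black", "bold": False, "italic": False, "inline": False},
}

def typeset(printed_text, handwritten_text, emotion):
    style = TYPESETTING_BY_EMOTION.get(emotion, TYPESETTING_BY_EMOTION["stable"])
    rendered = handwritten_text
    if style["bold"]:
        rendered = f"**{rendered}**"
    if style["italic"]:
        rendered = f"*{rendered}*"
    if style["inline"]:
        # Agitated/expressive categories keep the note in its original place.
        return f"{printed_text} [{style['color']}]{rendered}[/{style['color']}]"
    # Stable emotion: identifier in place, note text appended at the end.
    return f"{printed_text} [note-1]\n---\n[note-1]: {handwritten_text}"

print(typeset("Chapter 1.", "great point!", "passionate"))
```

  The same table-lookup shape works whatever concrete rendering backend (rich text, HTML, PDF) the terminal uses.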
  • in step S6, the printed text and the handwritten text are typeset according to the target text typesetting type to generate the note. Since the note obtained by typesetting the printed text and the handwritten text according to the target typesetting type further retains the information of the original handwritten text, the recognition result is more faithful, the user experience is better, and the rate of missing information is lower.
  • the step S2 of judging whether the designated picture is similar to the picture previously obtained by the designated terminal by using a preset picture similarity judgment method includes:
  • S201 Perform gray-scale processing on the designated picture and the picture previously acquired by the designated terminal respectively to obtain a first gray-scale picture and a second gray-scale picture;
  • S202 Calculate the average value Am of the gray values of all pixels in the m-th column or the m-th row of the gray-scale picture, and calculate the average value B of the gray values of all the pixels in the gray-scale picture;
  • grayscale means that each pixel's color is a shade of gray. The gray-scale range is, for example, 0-255 (when the values of R, G, and B each range over 0-255; it changes accordingly with the value ranges of R, G, and B).
  • the gray-scale processing method can be any method, such as the component method, the maximum-value method, the average method, or the weighted-average method. Since gray values take only 256 possible values, comparing images on this basis greatly reduces the amount of computation. Then calculate the average value Am of the gray values of all pixels in the m-th column or m-th row of the gray-scale picture, and calculate the average value B of the gray values of all pixels in the gray-scale picture.
  • the process of calculating the average value Am of the gray values of all pixels in the m-th column or m-th row of the gray-scale picture includes: adding up the gray values of all pixels in the m-th column or m-th row, and dividing that sum by the number of pixels in the m-th column or m-th row, giving the average value Am of the gray values of all pixels in the m-th column or m-th row of the gray-scale picture.
  • the process of calculating the average value B of the gray values of all pixels in the gray-scale picture includes: calculating the sum of the gray values of all pixels in the gray-scale picture, and then dividing that sum by the number of pixels, giving the average value B of the gray values of all pixels in the gray-scale picture.
  • the overall variance is used to measure the average of the gray values Am of the pixels in the m-th column or the m-th row of the gray-scale picture and the average of the gray-scale values of all pixels in the gray-scale picture. The difference between the value B.
  • if the difference between the overall variances of the m-th column or m-th row of the first and second gray-scale pictures is small, the gray values of the m-th column or m-th row of the first gray-scale picture are considered the same or approximately the same as those of the second gray-scale picture (an approximate judgment that saves computing power; and because the overall variances of two different pictures are generally unequal, the accuracy of the judgment is high); otherwise, the gray values of the m-th column or m-th row of the two gray-scale pictures are considered different. It is then judged whether the difference is less than the preset variance error threshold, where the value judged is the maximum of these differences over all columns or rows.
  • if the value is less than the preset variance error threshold, the designated picture is determined to be similar to the picture previously acquired by the designated terminal. The approximate judgment is used because the gray values of gray-scale pictures converted from two different pictures are generally not all equal, while those converted from the same picture generally are; it therefore determines whether the designated picture is similar to the previously acquired picture while reducing the computing resources consumed. Accordingly, the subsequent steps are performed only when the designated picture is not similar to the picture previously acquired by the designated terminal (if it is similar, the designated picture has already been processed for note generation and need not be processed again), reducing unnecessary resource consumption.
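  A hedged sketch of this column-average comparison follows. The formula images are not reproduced in the source, so the exact "overall variance" expression below is one plausible reading (the squared deviation of each column mean Am from the picture-wide mean B), not the patent's own formula.

```python
import numpy as np

def overall_variances(gray):
    """Per-column 'overall variance': squared deviation of each column mean Am
    from the picture-wide mean B. Plausible reading of the quantity described
    in the text; the source's formula images are not preserved."""
    col_means = gray.mean(axis=0)           # Am for each column m
    return (col_means - gray.mean()) ** 2   # deviation of each Am from B

def similar_by_variance(gray1, gray2, threshold=1.0):
    """Judge similarity by the largest per-column difference between the two
    pictures' overall variances (the 'maximum value' mentioned in the text)."""
    diff = np.abs(overall_variances(gray1) - overall_variances(gray2))
    return diff.max() < threshold

a = np.arange(12, dtype=float).reshape(3, 4)
assert similar_by_variance(a, a)          # identical pictures: all diffs are 0
assert not similar_by_variance(a, 2 * a)  # different gray-value structure
```

  Because only N column (or row) statistics are compared instead of every pixel, the check stays cheap even for large pictures, which is the resource saving the text describes.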
  • in another embodiment, the step S2 of judging whether the designated picture is similar to the picture previously obtained by the designated terminal using a preset picture similarity judgment method includes:
  • according to the formula: proportion of identical pixels = number of identical pixels / number of all pixels in the designated picture, the proportion of identical pixels is obtained;
  • this embodiment adopts a method of successively comparing pixels for judgment. If the two pictures are the same, the number of the same pixels should account for the vast majority, that is, the proportion of the same pixels is close to 1.
  • according to the formula: proportion of identical pixels = number of identical pixels / number of all pixels in the designated picture, the proportion of identical pixels is calculated, and if the proportion of identical pixels is greater than a preset proportion threshold, the designated picture is determined to be similar to the picture previously acquired by the designated terminal.
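  This pixel-proportion embodiment is straightforward to sketch; the threshold value below is illustrative, and a flattened pixel list stands in for the 2-D picture:

```python
def same_pixel_proportion(img1, img2):
    """Compare corresponding pixels one by one and return the proportion of
    identical pixels among all pixels of the designated picture."""
    same = sum(1 for p, q in zip(img1, img2) if p == q)
    return same / len(img1)

def similar_by_proportion(img1, img2, proportion_threshold=0.9):
    # Identical pictures drive the proportion toward 1; the threshold is a
    # preset parameter, not a value fixed by the source.
    return same_pixel_proportion(img1, img2) > proportion_threshold

pic = [(0, 0, 0)] * 9 + [(255, 255, 255)]
prior = [(0, 0, 0)] * 10
print(same_pixel_proportion(pic, prior))   # 0.9
print(similar_by_proportion(pic, prior))   # False: 0.9 is not greater than 0.9
```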
  • in an embodiment where the color of the handwritten text is different from the color of the printed text, the step S3 of using the preset text recognition technology to recognize the handwritten text and the printed text in the designated picture as handwritten text and printed text includes:
  • S301: Collect the values of the R, G, and B color channels in the RGB color model for each pixel in the designated picture, and according to a preset tri-value method set the RGB color of each pixel in the designated picture to (0,0,0), (255,255,255), or (P,P,P), where P is a preset value greater than 0 and less than 255, to obtain a temporary picture composed of three colors;
  • this application uses a tri-value method: according to the preset tri-value method, the RGB color of each pixel in the designated picture is set to (0,0,0), (255,255,255), or (P,P,P), where P is a preset value greater than 0 and less than 255, to obtain a temporary picture composed of three colors. The area occupied by each of the three colors in the temporary picture is then calculated, and the preset text segmentation method is applied to the areas occupied by the two colors with the smaller areas (the largest area is necessarily the background, so it need not be analyzed) to obtain segmented individual handwritten characters and individual printed characters.
  • the support vector machine is a generalized linear classifier that performs binary classification of data in a supervised learning manner, and is suitable for comparing the recognized text with the pre-stored text to output the most similar text. According to this, the text features of the single handwritten text and the text features of the single printed text are extracted, and input into a preset support vector machine for classification, and the recognized handwritten text and printed text are obtained.
  • the character feature is, for example, a special point in the pixel point corresponding to the character, such as an extreme point, an isolated point, etc.
  • the step S301 of collecting the values of the R, G, and B color channels in the RGB color model of each pixel in the designated picture and, according to the preset tri-value method, setting the RGB color of each pixel in the designated picture to (0,0,0), (255,255,255), or (P,P,P) includes:
  • according to the formula F2 = MAX{ROUND[(a1R + a2G + a3B)/L, 0], B}, obtain the reference value F2, where MAX is the maximum-value function, B is a second threshold parameter with a preset value in the range (0,255), and B is greater than A;
  • the values of the R, G, and B color channels in the RGB color model of each pixel in the designated picture are collected, and according to the preset tri-value method the RGB color of each pixel in the designated picture is set to (0,0,0), (255,255,255), or (P,P,P).
  • ROUND function is a rounding function
  • S401 retrieve pre-collected sample data, and divide the sample data into a training set and a test set; wherein the sample data includes pre-collected handwritten text and emotion categories associated with the pre-collected handwritten text;
  • S402 Input the sample data of the training set into a preset neural network model for training to obtain an initial emotion recognition model, where the stochastic gradient descent method is used in the training process;
  • S403 Use the sample data of the test set to verify the initial emotion recognition model.
  • if the verification passes, the initial emotion recognition model is recorded as the emotion recognition model.
  • This application is based on a neural network model to train an emotion recognition model.
  • the neural network model can be VGG16 model, VGG-F model, ResNet152 model, ResNet50 model, DPN131 model, AlexNet model, DenseNet model, etc.
  • the stochastic gradient descent method randomly samples part of the training data in place of the entire training set; if the sample size is large (for example, hundreds of thousands), perhaps only tens of thousands or even thousands of samples are used per iteration until the optimal solution is reached, which improves training speed. Further, the training can also use back-propagation to update the parameters of each layer of the neural network.
  • back-propagation is based on the gradient descent method, and its input-output relationship is essentially a mapping: the function of a neural network with n inputs and m outputs is a continuous mapping from n-dimensional Euclidean space to a finite field in m-dimensional Euclidean space. This mapping is highly non-linear, which facilitates updating the parameters of each layer of the neural network model.
  • the sample data of the test set is then used to verify the initial emotion recognition model, and if the verification is passed, the initial emotion recognition model is recorded as the emotion recognition model.
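  The mini-batch sampling that distinguishes stochastic gradient descent from full-batch training can be illustrated with a toy logistic-regression loop. The data, model, and hyperparameters below are stand-ins, not the patent's DPN setup; only the random-subset gradient step is the point.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for the emotion samples: 2-D feature vectors (imagine
# heavy-stroke position and count) with binary labels.
X = rng.normal(size=(1000, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(float)   # hypothetical two-class labels

w, b, lr, batch = np.zeros(2), 0.0, 0.1, 32
for step in range(300):
    idx = rng.integers(0, len(X), batch)     # random mini-batch, not the full set
    xb, yb = X[idx], y[idx]
    p = 1.0 / (1.0 + np.exp(-(xb @ w + b)))  # logistic prediction
    w -= lr * xb.T @ (p - yb) / batch        # gradient step on cross-entropy
    b -= lr * (p - yb).mean()                # (the back-propagated update)

acc = ((1.0 / (1.0 + np.exp(-(X @ w + b))) > 0.5) == (y == 1.0)).mean()
print(f"training accuracy: {acc:.2f}")
```

  Each iteration touches only 32 of the 1000 samples, which is exactly why SGD scales to the "hundreds of thousands" of samples mentioned above.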
  • the step S6 of formatting the printed text and the handwritten text according to the target text typesetting includes:
  • S61 Receive an acquisition request for acquiring a handwritten note sent by a second terminal, where the acquisition request records a reading format supported by the second terminal;
  • the note is sent to the second terminal. Since the second terminal may not support reading and displaying the note directly, the note is format-converted before being sent, to prevent the second terminal from failing to recognize it. Based on this, it is determined whether the reading format of the reading software can display the note; if it can, the note is sent to the second terminal. Further, if the reading format of the reading software cannot display the note, the note is converted to the reading format of the reading software and then sent to the second terminal.
  • an embodiment of the present application provides a note generation device based on text recognition technology, which is applied to a designated terminal, and includes:
  • the designated picture acquiring unit 10 is used to acquire designated pictures with handwritten text and printed text;
  • the similarity determination unit 20 is configured to use a preset picture similarity determination method to determine whether the designated picture is similar to the picture previously acquired by the designated terminal;
  • the feature data acquiring unit 30 is configured to, if the designated picture is not similar to the picture previously acquired by the designated terminal, use a preset text recognition technology to recognize the handwritten text and printed text in the designated picture as handwritten text and printed text, respectively, and to extract feature data of the handwritten text in the designated picture, wherein the feature data includes at least the positions of heavy strokes and the number of heavy strokes in the handwritten text;
  • the predicted emotion category obtaining unit 40 is configured to input the feature data into an emotion recognition model trained on the basis of a neural network model to obtain the predicted emotion category output by the emotion recognition model, wherein the emotion recognition model is trained on sample data composed of pre-collected handwritten text and the emotion categories associated with that handwritten text;
  • the typesetting type obtaining unit 50 is configured to obtain the target text typesetting type corresponding to the predicted emotion type according to the preset correspondence between the emotion category and the text typesetting type;
  • the typesetting unit 60 is configured to typeset the printed text and the handwritten text according to the target text typesetting type to generate the note.
  • the similarity judgment unit 20 includes:
  • a grayscale subunit configured to perform grayscale processing on the designated picture and the picture previously acquired by the designated terminal, respectively, to obtain a first grayscale picture and a second grayscale picture;
  • the average value calculation subunit is used to calculate the average value Am of the gray values of all pixels in the m-th column or the m-th row of the gray-scale image, and calculate the average value B of the gray values of all the pixels in the gray-scale image ;
  • the overall variance calculation subunit is used to calculate, according to the preset formula (not reproduced here), the overall variance of the m-th column or m-th row of each gray-scale picture, where N is the total number of columns or rows in the gray-scale picture;
  • the variance difference calculation subunit is used to obtain, according to the preset formula (not reproduced here), the difference between the overall variance of the m-th column or m-th row of the first gray-scale picture and the overall variance of the m-th column or m-th row of the second gray-scale picture;
  • the error threshold judgment subunit is used to judge whether the difference is less than the preset variance error threshold;
  • the similarity determination subunit is used to determine, if the difference is less than the preset variance error threshold, that the designated picture is similar to the picture previously obtained by the designated terminal.
  • the similarity judgment unit 20 includes:
  • the same-pixel counting subunit is used to sequentially compare corresponding pixels in the designated picture and the picture previously obtained by the designated terminal, and to count the number of identical pixels;
  • the proportion threshold judging subunit is used to judge whether the proportion of identical pixels is greater than a preset proportion threshold;
  • the second similarity determination subunit is configured to determine that the designated picture is similar to the picture previously obtained by the designated terminal if the proportion of the same pixel is greater than the preset proportion threshold.
  • in an embodiment where the color of the handwritten text is different from the color of the printed text, the feature data acquiring unit 30 includes:
  • the temporary picture generation subunit is used to collect the values of the R, G, and B color channels in the RGB color model of each pixel in the designated picture and, according to the preset tri-value method, set the RGB color of each pixel in the designated picture to (0,0,0), (255,255,255), or (P,P,P), where P is a preset value greater than 0 and less than 255, to obtain the temporary picture composed of three colors;
  • the segmentation subunit is used to calculate the area occupied by the three colors in the temporary picture, and use the preset text segmentation method for the area occupied by the two colors with the smaller area to obtain the divided single handwritten text and Separate single printed text;
  • the recognition subunit is used to extract the text features of the single handwritten text and the text features of the single printed text, and input them into a preset support vector machine for classification to obtain the recognized handwritten text text and printed text text.
  • the temporary picture generation subunit includes:
  • the reference value F1 judgment module is used to judge whether the value of the reference value F1 is equal to A;
  • the reference value F2 judgment module is used to judge whether the value of the reference value F2 is equal to B;
  • the color setting module is configured to set the RGB color of the designated pixel to (255, 255, 255) if the value of the reference value F2 is not equal to B.
  • the device includes:
  • the sample data retrieval unit is used to retrieve pre-collected sample data and divide the sample data into a training set and a test set; wherein the sample data includes pre-collected handwritten characters and is associated with the pre-collected handwritten characters Emotional category;
  • the training unit is used to input the sample data of the training set into the preset neural network model for training to obtain the initial emotion recognition model, wherein the stochastic gradient descent method is used in the training process;
  • a verification unit for verifying the initial emotion recognition model by using sample data of the test set
  • the marking unit is configured to record the initial emotion recognition model as the emotion recognition model if the verification of the initial emotion recognition model is passed.
  • the device includes:
  • a reading format obtaining unit configured to receive an obtaining request for obtaining handwritten notes sent by a second terminal, wherein the obtaining request records a reading format supported by the second terminal;
  • the reading format judgment unit is used to judge whether the reading format of the reading software can display the notes
  • the note sending unit is configured to send the note to the second terminal if the reading format of the reading software can display the note.
  • an embodiment of the present application also provides a computer device.
  • the computer device may be a server, and its internal structure may be as shown in the figure.
  • the computer equipment includes a processor, a memory, a network interface, and a database connected through a system bus, where the processor of the computer device is used to provide computing and control capabilities.
  • the memory of the computer device includes a non-volatile storage medium and an internal memory.
  • the non-volatile storage medium stores an operating system, a computer program, and a database.
  • the memory provides an environment for the operation of the operating system and computer programs in the non-volatile storage medium.
  • the database of the computer equipment is used to store the data used in the note generation method based on text recognition technology.
  • the network interface of the computer device is used to communicate with an external terminal through a network connection.
  • the computer program is executed by the processor to realize a note generation method based on text recognition technology.
  • the above-mentioned processor executes the above-mentioned note generation method based on text recognition technology; the steps of the method correspond one-to-one to the steps of the note generation method based on text recognition technology of the foregoing embodiment, and are not repeated here.
  • An embodiment of the present application also provides a computer-readable storage medium on which a computer program is stored.
  • the computer-readable storage medium is, for example, a non-volatile computer-readable storage medium or a volatile computer-readable storage medium.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Biophysics (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)
  • Character Input (AREA)

Abstract

A note generation method and apparatus based on character recognition technology, a computer device, and a storage medium, the method comprising: acquiring a designated picture containing handwritten characters and printed characters; if the designated picture is not similar to the picture previously acquired by a designated terminal, recognising the handwritten characters and the printed characters in the designated picture as handwritten text and printed text respectively, and extracting feature data of the handwritten characters in the designated picture; inputting the feature data into an emotion recognition model trained on the basis of a neural network model and acquiring the predicted emotion category output by the emotion recognition model; acquiring the target text typesetting type corresponding to the predicted emotion category; and typesetting the printed text and the handwritten text on the basis of the target text typesetting type to generate a handwritten note. The degree of information preservation is thereby increased.

Description

Note generation method, device and computer equipment based on text recognition technology
This application claims priority to the Chinese patent application filed with the Chinese Patent Office on September 3, 2019, with application number 201910828605.2 and entitled "Note Generation Method, Apparatus and Computer Equipment Based on Text Recognition Technology", the entire content of which is incorporated herein by reference.
Technical Field
This application relates to the computer field, and in particular to a note generation method, device, computer equipment and storage medium based on text recognition technology.
Background
When reading physical books, many people have the habit of taking notes or making excerpts. If physical books carrying handwritten notes could be converted into digital text that is easier to edit, later collation and editing by the user would be more convenient, and the information would be easier to understand and disseminate. The prior art generally can only recognize such books mechanically: the resulting text either fails to distinguish the book's original printed content from the handwritten notes, or keeps the handwritten text as pictures (in order to preserve all of its information) and splices them with the printed text. This either loses information or makes note generation consume a large amount of computing resources. The prior art therefore lacks a satisfactory technical solution for generating handwritten notes.
Technical Problem
The main purpose of this application is to provide a note generation method, device, computer equipment and storage medium based on text recognition technology, aiming to improve the degree of information preservation when generating notes.
Technical Solution
To achieve the above purpose of the invention, this application proposes a note generation method based on text recognition technology, applied to a designated terminal, including:

acquiring a designated picture containing handwritten text and printed text;

judging, by a preset picture similarity judgment method, whether the designated picture is similar to the picture previously acquired by the designated terminal;

if the designated picture is not similar to the picture previously acquired by the designated terminal, recognizing the handwritten text and the printed text in the designated picture as handwritten text and printed text respectively by a preset text recognition technology, and extracting feature data of the handwritten text in the designated picture, wherein the feature data includes at least the positions and the number of heavy strokes in the handwritten text;

inputting the feature data into an emotion recognition model trained on the basis of a neural network model, and obtaining the predicted emotion category output by the emotion recognition model, wherein the emotion recognition model is trained on sample data composed of pre-collected handwritten text and the emotion categories associated with that pre-collected handwritten text;

obtaining the target text typesetting type corresponding to the predicted emotion category according to a preset correspondence between emotion categories and text typesetting types;

typesetting the printed text and the handwritten text according to the target text typesetting type to generate the note.
Beneficial Effects
The note generation method, device, computer equipment and storage medium based on text recognition technology of this application use an emotion recognition model to recognize the emotion category of the note writer at the time the notes were written, and select a corresponding typesetting method according to that category. The emotion information (excitement, sadness, and so on) is thus preserved in the form of typesetting, overcoming the loss of information (for example, loss of emotion) when existing text recognition technology recognizes text, and improving the degree of information preservation.
Brief Description of the Drawings
FIG. 1 is a schematic flowchart of a note generation method based on text recognition technology according to an embodiment of the application;

FIG. 2 is a schematic structural block diagram of a note generation device based on text recognition technology according to an embodiment of the application;

FIG. 3 is a schematic structural block diagram of a computer device according to an embodiment of the application.
The realization, functional characteristics and advantages of the purpose of this application will be further described with reference to the embodiments and the accompanying drawings.
Preferred Embodiments of the Application
Referring to FIG. 1, an embodiment of the present application provides a note generation method based on text recognition technology, applied to a designated terminal, including:

S1. Acquire a designated picture containing handwritten text and printed text;

S2. Judge, by a preset picture similarity judgment method, whether the designated picture is similar to the picture previously acquired by the designated terminal;

S3. If the designated picture is not similar to the picture previously acquired by the designated terminal, recognize the handwritten text and the printed text in the designated picture as handwritten text and printed text respectively by a preset text recognition technology, and extract feature data of the handwritten text in the designated picture, wherein the feature data includes at least the positions and the number of heavy strokes in the handwritten text;

S4. Input the feature data into an emotion recognition model trained on the basis of a neural network model, and obtain the predicted emotion category output by the emotion recognition model, wherein the emotion recognition model is trained on sample data composed of pre-collected handwritten text and the emotion categories associated with that pre-collected handwritten text;

S5. Obtain the target text typesetting type corresponding to the predicted emotion category according to a preset correspondence between emotion categories and text typesetting types;

S6. Typeset the printed text and the handwritten text according to the target text typesetting type to generate the note.
As described in step S1 above, a designated picture containing handwritten text and printed text is acquired. The designated picture may be a picture with handwritten text and printed text collected in real time by a preset camera, or a pre-stored picture with handwritten text and printed text. Printed text here means text in the typefaces used by publications — books, magazines and other physical carriers — for mass-printed content; handwritten text is therefore clearly distinguishable from printed text.
As described in step S2 above, a preset picture similarity judgment method is used to judge whether the designated picture is similar to the picture previously acquired by the designated terminal. One such method is: compare the corresponding pixels of the two pictures one by one; if the proportion of identical pixels among all pixels is greater than a predetermined threshold, the pictures are judged similar; otherwise they are judged not similar. If the designated picture is similar to the picture previously acquired by the designated terminal, the picture has already been recognized; the previous recognition result only needs to be retrieved, and the recognition operation does not need to be performed again.
As described in step S3 above, if the designated picture is not similar to the picture previously acquired by the designated terminal, it is a brand-new picture that has not been recognized, so recognition is required: a preset text recognition technology recognizes the handwritten text and the printed text in the designated picture as handwritten text and printed text respectively, and feature data of the handwritten text is extracted, including at least the positions and the number of heavy strokes in the handwritten text.

The preset text recognition technology is, for example, OCR (Optical Character Recognition), and one or more of the following techniques may be used in the recognition process. Grayscaling: represent each pixel of the image with the RGB model, and replace the original R, G and B values of each pixel with their average to obtain the gray value of the image. Binarization: divide the pixels of the image into black and white, treating black as foreground information and white as background information, so as to remove objects and background other than the target text. Noise reduction: apply median filtering, mean filtering, adaptive Wiener filtering or similar to remove image noise introduced during acquisition, compression and transmission. Tilt correction: process the image with methods such as the Hough transform to correct tilt caused by photographing. Text segmentation: segment text by a projection operation — project one or more lines of text onto the X axis and accumulate the values; text regions necessarily have large accumulated values and gap regions have none, and by considering the plausibility of the gaps, single characters are segmented out. Feature extraction: extract special pixels such as extreme points and isolated points as feature points of the image, then reduce their dimensionality to increase processing speed. Classification: classify with an SVM (Support Vector Machine) classifier to obtain an initial recognition result. Result processing: optimize the initial recognition result with NLP (Natural Language Processing) methods before output, so as to eliminate misrecognized characters that resemble the correct characters in shape but do not fit the context.

The feature data of the handwritten text in the designated picture — at least the positions and the number of heavy strokes — may be extracted, for example, by decomposing each stroke of the handwritten text into multiple points for data collection and analysis, and deriving the pressure value at each point and the clarity of the writing order from the data change trend of the pixels. A heavy stroke is the stroke written with the greatest force in the handwritten text.
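The projection operation used for text segmentation above can be sketched as follows. This is an illustrative sketch only, not code from the application; the function name and data layout are our assumptions, and a real pipeline would work on a binarized image array:

```python
# Illustrative sketch of projection-based character segmentation: sum the
# "ink" pixels of a binarized image along the X axis and cut at the empty
# gaps between characters, as the text-segmentation step describes.

def segment_by_projection(binary_rows):
    """binary_rows: list of rows, each a list of 0/1 (1 = text pixel).
    Returns (start, end) column spans that contain text."""
    if not binary_rows:
        return []
    width = len(binary_rows[0])
    # Project onto the X axis: accumulate each column's text pixels.
    projection = [sum(row[x] for row in binary_rows) for x in range(width)]
    spans, start = [], None
    for x, value in enumerate(projection):
        if value > 0 and start is None:
            start = x                    # a character region begins
        elif value == 0 and start is not None:
            spans.append((start, x))     # a gap ends the region
            start = None
    if start is not None:
        spans.append((start, width))
    return spans

# Two "characters" separated by one empty column:
image = [
    [1, 1, 0, 1, 1],
    [1, 0, 0, 0, 1],
    [1, 1, 0, 1, 1],
]
print(segment_by_projection(image))  # [(0, 2), (3, 5)]
```

The "plausibility of the gaps" check mentioned in the text (merging spans separated by very narrow gaps) would be a post-processing pass over the returned spans.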
As described in step S4 above, the feature data is input into the emotion recognition model trained on the basis of a neural network model, and the predicted emotion category output by the model is obtained; the model is trained on sample data composed of pre-collected handwritten text and the emotion categories associated with it. The neural network model can be any model, such as the VGG16 model, VGG-F model, ResNet152 model, ResNet50 model, DPN131 model, AlexNet model or DenseNet model; a DPN model is preferred. DPN (Dual Path Network) is a neural network structure that introduces the core idea of DenseNet on the basis of ResNeXt, so that the model makes fuller use of features. DPN, ResNeXt and DenseNet are existing network structures and are not described further here. The emotion categories can be divided in any manner, for example into tension, happiness, sadness and indignation.
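The train-then-predict flow of step S4 can be sketched with a deliberately simplified stand-in. The patent trains a deep network (e.g. DPN) with stochastic gradient descent; the nearest-centroid classifier below is not that model — it only illustrates how feature vectors of the kind described (heavy-stroke count, stroke pressure) map to an emotion label, and the feature names and labels are our assumptions:

```python
# Toy stand-in for the emotion recognition model: a nearest-centroid
# classifier over hypothetical handwriting features. The real model in
# the application is a neural network (e.g. DPN) trained with SGD.

def train_centroids(samples):
    """samples: list of (feature_vector, emotion_label) pairs."""
    sums, counts = {}, {}
    for features, label in samples:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, v in enumerate(features):
            acc[i] += v
        counts[label] = counts.get(label, 0) + 1
    # Per-label mean feature vector (the "centroid").
    return {label: [v / counts[label] for v in acc]
            for label, acc in sums.items()}

def predict(centroids, features):
    def dist(label):
        return sum((a - b) ** 2 for a, b in zip(centroids[label], features))
    return min(centroids, key=dist)

# Hypothetical features: (number of heavy strokes, mean stroke pressure)
training = [
    ((8, 0.9), "agitated"), ((7, 0.8), "agitated"),
    ((1, 0.3), "calm"),     ((2, 0.2), "calm"),
]
model = train_centroids(training)
print(predict(model, (6, 0.7)))  # agitated
```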
As described in step S5 above, the target text typesetting type corresponding to the predicted emotion category is obtained according to the preset correspondence between emotion categories and text typesetting types. For example, when the emotion category is a calm one, the handwritten text is replaced in place by a marker and the recognized handwritten text is recorded at the end of the document, so that the continuity of the printed text is not broken; when the emotion category is an agitated one, the handwritten text is typeset in place in a special font. The typesetting can be done in any feasible way, with the typesetting type corresponding to the emotion category — for example, a red bold font for an impassioned category and a green italic font for a sad category. Of course, the typesetting type can also include any other feasible types.
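The preset correspondence of step S5 can be sketched as a lookup table. The three styles below are the examples given in the text (red bold for agitated, green italic for sad, footnote replacement for calm); the dictionary structure and fallback behaviour are our assumptions:

```python
# Sketch of the preset emotion-category -> text-typesetting-type mapping.
TYPESETTING_BY_EMOTION = {
    "agitated": {"color": "red",   "weight": "bold",   "placement": "inline"},
    "sad":      {"color": "green", "style":  "italic", "placement": "inline"},
    "calm":     {"placement": "footnote"},  # marker in place, text at the end
}

def target_typesetting(predicted_emotion):
    # Fall back to plain inline text for categories without a preset style.
    return TYPESETTING_BY_EMOTION.get(predicted_emotion, {"placement": "inline"})

print(target_typesetting("agitated")["color"])  # red
```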
As described in step S6 above, the printed text and the handwritten text are typeset according to the target text typesetting type to generate the note. Because the handwritten note obtained by typesetting the printed text and the handwritten text according to the target text typesetting type further preserves the information of the original handwritten text, the recognition result is more faithful, the user experience is better, and the rate of missing information is lower.
In one embodiment, the step S2 of judging, by a preset picture similarity judgment method, whether the designated picture is similar to the picture previously acquired by the designated terminal includes:
S201. Perform grayscale processing on the designated picture and on the picture previously acquired by the designated terminal, respectively, to obtain a first grayscale picture and a second grayscale picture;

S202. Calculate the average value A_m of the gray values of all pixels in the m-th column (or the m-th row) of a grayscale picture, and calculate the average value B of the gray values of all pixels in the grayscale picture;

S203. According to the formula

σ_m² = (A_m − B)² / N, m = 1, …, N,

calculate the overall variance σ_m² of the m-th column (or the m-th row) of the grayscale picture, where N is the total number of columns (or rows) in the grayscale picture;

S204. According to the formula

Δσ_m² = |σ_{1,m}² − σ_{2,m}²|,

obtain the difference Δσ_m² between the overall variances of the m-th column (or the m-th row) of the first grayscale picture and of the second grayscale picture, where σ_{1,m}² is the overall variance of the m-th column (or the m-th row) of the first grayscale picture and σ_{2,m}² is that of the second grayscale picture;

S205. Judge whether max_{1≤m≤N} Δσ_m² is less than a preset variance error threshold;

S206. If max_{1≤m≤N} Δσ_m² is less than the preset variance error threshold, determine that the designated picture is similar to the picture previously acquired by the designated terminal.
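Steps S201–S206 can be sketched as follows. The original publication renders the formulas as images, so the per-column "overall variance" used here, σ_m² = (A_m − B)² / N, is our reading of the surrounding description rather than a verbatim copy; grayscale conversion itself (step S201) is assumed to have already happened:

```python
# Sketch of the column-variance picture similarity test (steps S201-S206),
# under the assumption sigma_m^2 = (A_m - B)^2 / N per column.

def column_variances(gray):
    """gray: 2D list of 0-255 gray values (rows x columns).
    Returns sigma_m^2 for each column m."""
    n_rows, n_cols = len(gray), len(gray[0])
    # A_m: mean gray value of column m; B: mean gray value of all pixels.
    col_means = [sum(row[m] for row in gray) / n_rows for m in range(n_cols)]
    overall = sum(sum(row) for row in gray) / (n_rows * n_cols)
    return [(a - overall) ** 2 / n_cols for a in col_means]

def similar(gray1, gray2, threshold):
    """True if the largest per-column variance difference is below threshold."""
    v1, v2 = column_variances(gray1), column_variances(gray2)
    return max(abs(a - b) for a, b in zip(v1, v2)) < threshold

a = [[10, 200], [12, 198]]
print(similar(a, a, 1e-9))  # True: identical pictures
```

As the text notes, this is an approximate test: two different pictures can in principle share column variances, but in practice they rarely do, and the test avoids a full pixel-by-pixel comparison.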
如上所述,实现了利用预设的图片相似度判断方法,判断所述指定图片与所述指定终端前一次获取的图片是否相似。其中,灰度化指将彩色表示一种灰度颜色,例如在在RGB模型中,如果R=G=B时,则彩色表示一种灰度颜色,其中R=G=B的值叫灰度值,因此,灰度图像每个像素只需一个字节存放灰度值(又称强度值、亮度值),减少存储量。灰度范围例如为0-255(当R,G,B的取值均为0-255时,当然也会随R,G,B的取值范围的变化而变化)。采用灰度化处理的方法可以为任意方法,例如分量法、最大值法、平均值法、加权平均法等。其中,由于灰度值的取值范围只有256种,在此基础上进行图片对比能够大大减轻计算量。再计算所述灰度图片的第m列或者第m行的所有像素点的灰度值的平均值 Am,以及计算所述灰度图片中所有像素点的灰度值的平均值B。其中,计算所述灰度图片的第m列或者第m行的所有像素点的灰度值的平均值Am的过程包括:采集所述灰度图片的第m列或者第m行的所有像素点的灰度值,对所述第m列或者第m行的所有像素点的灰度值进行加和处理,将进行过加和处理得到的灰度值之和除以所述第m列或者第m行的所有像素点的数量,得到所述灰度图片的第m列或者第m行的所有像素点的灰度值的平均值Am。计算所述灰度图片中所有像素点的灰度值的平均值B的过程包括:计算所述灰度图片中所有像素点的灰度值之和,再以所述灰度值之和除以所述像素点的数量,得到所述灰度图片中所有像素点的灰度值的平均值B。根据公式:
Figure PCTCN2019116337-appb-000009
计算所述灰度图片的第m列或者第m行的总体方差
Figure PCTCN2019116337-appb-000010
其中N为所述灰度图片中的列或者行的总数量。在本申请中,采用总体方差来衡量所述灰度图片的第m列或者第m行的像素点的灰度值的平均值Am与所述灰度图片中所有像素点的灰度值的平均值B之间的差异。
As described above, the use of a preset picture similarity judgment method is implemented to judge whether the specified picture is similar to the picture previously obtained by the specified terminal. Among them, grayscale refers to the color representing a grayscale color. For example, in the RGB model, if R=G=B, the color represents a grayscale color, and the value of R=G=B is called grayscale. Therefore, each pixel of a grayscale image only needs one byte to store the grayscale value (also called intensity value, brightness value), which reduces the storage capacity. The gray scale range is, for example, 0-255 (when the values of R, G, and B are all 0-255, of course, it will also change with the change of the value range of R, G, and B). The gray-scale processing method can be any method, such as the component method, the maximum value method, the average method, and the weighted average method. Among them, since there are only 256 value ranges for gray values, image comparison on this basis can greatly reduce the amount of calculation. Then calculate the average value Am of the gray values of all pixels in the m-th column or the m-th row of the gray-scale picture, and calculate the average value B of the gray values of all the pixels in the gray-scale picture. Wherein, the process of calculating the average value Am of the gray values of all pixels in the m-th column or m-th row of the gray-scale picture includes: collecting all the pixels in the m-th column or m-th row of the gray-scale picture Add the gray values of all pixels in the mth column or mth row, and divide the sum of the gray values obtained by the summation by the mth column or The number of all pixels in the m rows is the average value Am of the gray values of all the pixels in the mth column or mth row of the grayscale image. 
The process of calculating the average value B of the gray values of all pixels in the gray image includes: calculating the sum of the gray values of all pixels in the gray image, and then dividing the sum of the gray values by According to the number of pixels, the average value B of the gray values of all pixels in the gray image is obtained. According to the formula:
Figure PCTCN2019116337-appb-000009
Calculate the overall variance of the m-th column or m-th row of the grayscale image
Figure PCTCN2019116337-appb-000010
Where N is the total number of columns or rows in the grayscale picture. In this application, the overall variance is used to measure the average of the gray values Am of the pixels in the m-th column or the m-th row of the gray-scale picture and the average of the gray-scale values of all pixels in the gray-scale picture. The difference between the value B.
根据公式:
Figure PCTCN2019116337-appb-000011
获得两张所述灰度图片的第m列或者第m行的总体方差之差
Figure PCTCN2019116337-appb-000012
其中,
Figure PCTCN2019116337-appb-000013
为第一张灰度图片的第m列或者第m行的总体方差,
Figure PCTCN2019116337-appb-000014
为第二张灰度图片的第m列或者第m行的总体方差。总体方差之差
Figure PCTCN2019116337-appb-000015
反应了两张灰度图片的第m列或者第m行的灰度值的差异。当
Figure PCTCN2019116337-appb-000016
较小时,例如为0时,表明
Figure PCTCN2019116337-appb-000017
等于或者近似等于
Figure PCTCN2019116337-appb-000018
可视为第一张灰度图片第m列或者第m行的灰度值与第二张灰度图片第m列或者第m行的灰度值相同或者近似相同(近似判断,以节省算力,并且由于不同的两张图片的总体方差一般不相等,因此该判断的准确性很高),反之认为第一张灰度图片第m列或者第m行的灰度值与第二张灰度图片第m列或者第m行的灰度值不相同。判断
Figure PCTCN2019116337-appb-000019
是否小于预设的方差误差阈值。其中
Figure PCTCN2019116337-appb-000020
的返回值即为
Figure PCTCN2019116337-appb-000021
中的最大值。若
Figure PCTCN2019116337-appb-000022
小于预设的方差误差阈值,则判定所述指定图片与所述指定终端前一次获取的图片相似。利用了近似判断(由于两张不同图片转化为的灰度图片的所有灰度值一般不相等,而相同图片转化为的灰度图片的所有灰度值一般相等),实现了在消耗较少计算资源的前提下,判断所述指定图片与所述指定终端前一次获取的图片是否相似。据此,当所述指定图片与所述指定终端前一次获取的图片不相似的前提下,才进行后续的步骤(若所述指定图片与所述指定终端前一次获取的图片相似,则表明所述指定图片已进行了笔记生成处理,因此无需再次进行处理),减少了不必要的资源消耗。
According to the formula:
Figure PCTCN2019116337-appb-000011
Obtain the difference between the overall variance of the m-th column or m-th row of the two grayscale images
Figure PCTCN2019116337-appb-000012
among them,
Figure PCTCN2019116337-appb-000013
Is the overall variance of the mth column or mth row of the first grayscale image,
Figure PCTCN2019116337-appb-000014
Is the overall variance of the mth column or mth row of the second grayscale image. Difference of population variance
Figure PCTCN2019116337-appb-000015
It reflects the difference in the gray value of the m-th column or m-th row of the two gray-scale pictures. when
Figure PCTCN2019116337-appb-000016
When it is smaller, such as 0, it means
Figure PCTCN2019116337-appb-000017
Equal to or approximately equal to
Figure PCTCN2019116337-appb-000018
It can be considered that the gray value of the m-th column or row of the first gray-scale image is the same or approximately the same as the gray value of the m-th column or m-th row of the second gray-scale image (approximate judgment to save computing power , And because the overall variance of the two different pictures is generally not equal, the accuracy of the judgment is very high), on the contrary, the gray value of the mth column or mth row of the first grayscale image is considered to be the same as the second grayscale value. The gray values of the m-th column or m-th row of the picture are different. judgment
Figure PCTCN2019116337-appb-000019
Whether it is less than the preset variance error threshold. among them
Figure PCTCN2019116337-appb-000020
The return value is
Figure PCTCN2019116337-appb-000021
The maximum value in. If
Figure PCTCN2019116337-appb-000022
If it is less than the preset variance error threshold, it is determined that the specified picture is similar to the picture previously acquired by the specified terminal. Approximate judgment is used (because all gray values of grayscale pictures converted from two different pictures are generally not equal, and all grayscale values of grayscale pictures converted from the same picture are generally equal), it is possible to reduce the cost of calculation Under the premise of resources, it is determined whether the designated picture is similar to the picture previously acquired by the designated terminal. Accordingly, when the designated picture is not similar to the picture previously acquired by the designated terminal, the subsequent steps are performed (if the designated picture is similar to the picture previously acquired by the designated terminal, it indicates that the designated picture is similar to the picture previously acquired by the designated terminal. The specified picture has been processed for note generation, so there is no need to process it again), reducing unnecessary resource consumption.
In one embodiment, the step S2 of judging, by a preset picture similarity judgment method, whether the designated picture is similar to the picture previously acquired by the designated terminal includes:

S211. Compare the corresponding pixels of the designated picture and of the picture previously acquired by the designated terminal one by one, and count the number of identical pixels;

S212. Obtain the identical-pixel proportion according to the formula: identical-pixel proportion = number of identical pixels / number of all pixels in the designated picture;

S213. Judge whether the identical-pixel proportion is greater than a preset proportion threshold;

S214. If the identical-pixel proportion is greater than the preset proportion threshold, determine that the designated picture is similar to the picture previously acquired by the designated terminal.

As described above, a preset picture similarity judgment method judges whether the designated picture is similar to the picture previously acquired by the designated terminal. To judge this precisely, this embodiment compares the pixels one by one. If the two pictures are the same, identical pixels should form the vast majority, that is, the identical-pixel proportion approaches 1. Accordingly, the proportion is calculated by the formula: identical-pixel proportion = number of identical pixels / number of all pixels in the designated picture; if it is greater than the preset proportion threshold, the designated picture is determined to be similar to the picture previously acquired by the designated terminal.
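Steps S211–S214 can be sketched directly; the sketch assumes the two pictures have equal dimensions, which the pixel-by-pixel comparison implies:

```python
# Sketch of the pixel-by-pixel similarity test (steps S211-S214):
# ratio = identical pixels / total pixels, compared to a threshold.

def same_pixel_ratio(img1, img2):
    """img1, img2: equal-sized 2D lists of pixel values."""
    total = identical = 0
    for row1, row2 in zip(img1, img2):
        for p1, p2 in zip(row1, row2):
            total += 1
            identical += (p1 == p2)
    return identical / total

def is_similar(img1, img2, ratio_threshold=0.95):
    # The 0.95 default is illustrative; the application only says "preset".
    return same_pixel_ratio(img1, img2) > ratio_threshold

a = [[1, 2], [3, 4]]
b = [[1, 2], [3, 9]]
print(same_pixel_ratio(a, b))  # 0.75
```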
In one embodiment, the color of the handwritten text is different from the color of the printed text, and step S3 of recognizing, by a preset character recognition technology, the handwritten text and the printed text in the designated picture as handwritten text content and printed text content respectively includes:
S301. Collect the value of the R color channel, the value of the G color channel and the value of the B color channel in the RGB color model of each pixel in the designated picture, and set, according to a preset ternarization (three-value) method, the RGB color of each pixel in the designated picture to (0,0,0), (255,255,255) or (P,P,P), where P is a preset value greater than 0 and less than 255, to obtain a temporary picture composed of three colors;
S302. Calculate the area occupied by each of the three colors in the temporary picture, and apply a preset character segmentation method separately to the regions occupied by the two colors with the smaller areas, obtaining segmented individual handwritten characters and segmented individual printed characters;
S303、提取所述单个手写文字的文字特征和所述单个印刷体文字的文字特征,并输入预设的支持向量机中进行分类,获得识别而得的手写文字文本和印刷体文字文本。S303. Extract the text features of the single handwritten text and the text features of the single printed text, and input them into a preset support vector machine for classification to obtain recognized handwritten text and printed text.
如上所述,实现了采用三值化法获得识别而得的手写文字文本和印刷体文字文本。为了更准确地区分手写文字与印刷体文字,本申请使用了三值化法,即根据预设的三值化法将所述指定图片中的像素点的RGB颜色设置为(0,0,0)、(255,255,255)或者(P,P,P),其中P为大于0且小于255的预设数值,获得由三种颜色构成的暂时图片,并计算三种颜色在所述暂时图片中所占面积,并对面积较小的两种颜色的所占区域分别采用预设的文字分割方法(由于面积最大的肯定是背景,因此无需对面积最大的区域进行分析),获得分割开的单个手写文字和分割开的单个印刷体文字。其中所述支持向量机是一类按监督学习方式对数据进行二元分类的广义线性分类器,适用于对待识别文字与预存的文字进行对比,以输出最相似的文字。据此提取所述单个手写文字的文字特征和所述单个印刷体文字的文字特征,并输入预设的支持向量机中进行分类,获得识别而得的手写文字文本和印刷体文字文本。其中所述文字特征例如为文字对应的像素点中的特殊的点如极值点,孤立点等。As described above, the recognition of handwritten text and printed text using the three-value method is realized. In order to distinguish between handwritten text and printed text more accurately, this application uses a three-value method, that is, according to a preset three-value method, the RGB color of the pixel in the specified picture is set to (0,0,0 ), (255,255,255) or (P,P,P), where P is a preset value greater than 0 and less than 255, obtain a temporary picture composed of three colors, and calculate the proportion of the three colors in the temporary picture Area, and use the preset text segmentation method for the area occupied by the two colors with the smaller area (because the largest area is definitely the background, so there is no need to analyze the area with the largest area) to obtain a single handwritten text that is divided And separate printed text. The support vector machine is a generalized linear classifier that performs binary classification of data in a supervised learning manner, and is suitable for comparing the recognized text with the pre-stored text to output the most similar text. According to this, the text features of the single handwritten text and the text features of the single printed text are extracted, and input into a preset support vector machine for classification, and the recognized handwritten text and printed text are obtained. The character feature is, for example, a special point in the pixel point corresponding to the character, such as an extreme point, an isolated point, etc.
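The area comparison in S302 — the color occupying the largest area is taken as the background, and the two smaller areas as the handwritten and printed text regions — can be sketched as follows (the tuple encoding of the three colors is an assumption for the example):

```python
from collections import Counter

def split_background(ternary_img):
    """Count the area (pixel count) of each of the three colors in a
    ternarized picture and return (background_color, text_colors):
    the color with the largest area is treated as the background, and
    the two smaller areas are the text regions to be segmented."""
    counts = Counter(px for row in ternary_img for px in row)
    ranked = [color for color, _ in counts.most_common()]
    return ranked[0], ranked[1:]
```

Only the two returned text colors need further segmentation, which mirrors the remark above that the largest region is necessarily the background and need not be analyzed.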
In one embodiment, step S301 of collecting the value of the R color channel, the value of the G color channel and the value of the B color channel in the RGB color model of the pixels in the designated picture and setting, according to the preset ternarization method, the RGB color of the pixels in the designated picture to (0,0,0), (255,255,255) or (P,P,P) includes:
S3011. Collect the value of the R color channel, the value of the G color channel and the value of the B color channel in the RGB color model of a designated pixel in the designated picture, and obtain a reference value F1 according to the formula F1=MIN{ROUND[(a1R+a2G+a3B)/L,0],A}, where MIN is the minimum value function, ROUND is the rounding function, a1, a2 and a3 are positive numbers greater than 0 and less than L, L is an integer greater than 0, A is a preset first threshold parameter taking a value in the range (0,255), and R, G and B are respectively the values of the R, G and B color channels in the RGB color model of the designated pixel;
S3012、判断所述参考数值F1的值是否等于A;S3012. Determine whether the value of the reference value F1 is equal to A;
S3013. If the value of the reference value F1 equals A, obtain a reference value F2 according to the formula F2=MAX{ROUND[(a1R+a2G+a3B)/L,0],B}, where MAX is the maximum value function and B is a preset second threshold parameter taking a value in the range (0,255), B being greater than A;
S3014、判断所述参考数值F2的值是否等于B;S3014. Determine whether the value of the reference value F2 is equal to B;
S3015、若所述参考数值F2的值不等于B,则将所述指定像素点的RGB颜色设置为(255,255,255)。S3015: If the value of the reference value F2 is not equal to B, set the RGB color of the designated pixel to (255, 255, 255).
如上所述,实现了采集所述指定图片中的像素点的RGB颜色模型中的R颜色通道的数值、G颜色通道的数值和B颜色通道的数值,并根据预设的三值化法将所述指定图片中的像素点的RGB颜色设置为(0,0,0)、(255,255,255)或者(P,P,P)。本申请采用公式:F1=MIN{ROUND[(a1R+a2G+a3B)/L,0],A}和公式:F2=MAX{ROUND[(a1R+a2G+a3B)/L,0],B},以确定所述指定像素点的RGB颜色。进一步地,若所述参考数值F1的值不等于A,则将所述指定像素点的RGB颜色设置为(0,0,0)。进一步地,若所述参考数值F2的值等于B,则将所述指定像素点的RGB颜色设置为(P,P,P)。实现了三值化处理,以使背景、印刷体文字、手写体文字完全区分出来,以便于后续的识别处理。其中ROUND函数是四舍五入函数,ROUND(X,a)指对实数X按小数位为a进行四舍五入运算,其中a为大于等于0的整数,例如ROUND(2.4,0)=2。As mentioned above, the value of the R color channel, the value of the G color channel, and the value of the B color channel in the RGB color model of the pixel in the specified picture are collected, and all the values are calculated according to the preset three-value method. The RGB color of the pixel in the specified picture is set to (0,0,0), (255,255,255) or (P,P,P). This application adopts the formula: F1=MIN{ROUND[(a1R+a2G+a3B)/L,0],A} and the formula: F2=MAX{ROUND[(a1R+a2G+a3B)/L,0],B} To determine the RGB color of the designated pixel. Further, if the value of the reference value F1 is not equal to A, the RGB color of the designated pixel is set to (0, 0, 0). Further, if the value of the reference value F2 is equal to B, the RGB color of the designated pixel is set to (P, P, P). Three-value processing is realized, so that the background, printed text, and handwritten text can be completely distinguished to facilitate subsequent recognition processing. The ROUND function is a rounding function, ROUND(X,a) refers to the rounding operation of the real number X according to the decimal place a, where a is an integer greater than or equal to 0, for example, ROUND(2.4,0)=2.
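A minimal sketch of the per-pixel ternarization logic explained above (F1 ≠ A → black, F2 = B → gray, otherwise white); the weight coefficients a1, a2, a3, L and the thresholds A, B, P below are illustrative values, not ones prescribed by this application:

```python
def ternarize_pixel(r, g, b, a1=30, a2=59, a3=11, L=100, A=85, B=170, P=128):
    """Map one RGB pixel to black (0,0,0), gray (P,P,P) or white
    (255,255,255) via the F1/F2 thresholds of S3011-S3015.
    All keyword parameters are illustrative preset values."""
    # ROUND with 0 decimal places, rounding half up as in ROUND(2.4,0)=2
    w = int((a1 * r + a2 * g + a3 * b) / L + 0.5)
    f1 = min(w, A)          # F1 = MIN{ROUND[(a1R+a2G+a3B)/L, 0], A}
    if f1 != A:             # darker than A -> printed-text region
        return (0, 0, 0)
    f2 = max(w, B)          # F2 = MAX{ROUND[(a1R+a2G+a3B)/L, 0], B}
    if f2 == B:             # between A and B -> handwritten-text region
        return (P, P, P)
    return (255, 255, 255)  # brighter than B -> background
```

With these thresholds a dark pixel maps to black, a mid-tone pixel to (P,P,P), and a bright pixel to white, so background, printed text and handwritten text end up fully separated for the subsequent recognition step.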
In one embodiment, before step S4 of inputting the feature data into the emotion recognition model trained on the basis of a neural network model to obtain the predicted emotion category output by the emotion recognition model, the emotion recognition model being trained from sample data composed of pre-collected handwritten text and the emotion categories associated with that pre-collected handwritten text, the method includes:
S401、调取预先采集的样本数据,并将样本数据分成训练集和测试集;其中,所述样本数据包括预先采集的手写文字,以及与所述预先采集的手写文字关联的情绪类别;S401. Retrieve pre-collected sample data, and divide the sample data into a training set and a test set; wherein the sample data includes pre-collected handwritten text and emotion categories associated with the pre-collected handwritten text;
S402、将训练集的样本数据输入到预设的神经网络模型中进行训练,得到初始情绪识别模型,其中,训练的过程中采用随机梯度下降法;S402: Input the sample data of the training set into a preset neural network model for training to obtain an initial emotion recognition model, where the stochastic gradient descent method is used in the training process;
S403、利用测试集的样本数据验证所述初始情绪识别模型;S403: Use the sample data of the test set to verify the initial emotion recognition model.
S404、若所述初始情绪识别模型验证通过,则将所述初始情绪识别模型记为所述情绪识别模型。S404: If the verification of the initial emotion recognition model is passed, record the initial emotion recognition model as the emotion recognition model.
如上所述,实现了设置情绪识别模型。本申请基于神经网络模型以训练出情绪识别模型。其中神经网络模型可为VGG16模型、VGG-F模型、ResNet152模型、ResNet50模型、DPN131模型、AlexNet模型和DenseNet模型等。其中,随机梯度下降法就是随机取样一些训练数据,替代整个训练集,如果样本量很大的情况(例如几十万),那么可能只用其中几万条或者几千条的样本,就已经迭代到最优解了,可以提高训练速度。进一步地,训练还可以采用反向传导法则更新神经网络各层的参数。其中反向传导法则是建立在梯度下降法的基础上,其输入输出关系实质上是一种映射关系:一个n输入m输出的神经网络所完成的功能是从n维欧氏空间向m维欧氏空间中一有限域的连续映射,这一映射具有高度非线性,有利于神经网络模型各层的参数的更新。获得初始情绪识别模型。再利用测试集的样本数据验证所述初始情绪识别模型,若验证通过,则将所述初始情绪识别模型记为所述情绪识别模型。As described above, the emotion recognition model is set. This application is based on a neural network model to train an emotion recognition model. Among them, the neural network model can be VGG16 model, VGG-F model, ResNet152 model, ResNet50 model, DPN131 model, AlexNet model, DenseNet model, etc. Among them, the stochastic gradient descent method is to randomly sample some training data to replace the entire training set. If the sample size is large (for example, hundreds of thousands), then only tens of thousands or thousands of samples may be used, and iterative When the optimal solution is reached, the training speed can be improved. Further, the training can also use the reverse conduction rule to update the parameters of each layer of the neural network. The reverse conduction law is based on the gradient descent method, and its input-output relationship is essentially a mapping relationship: the function of a neural network with n-input and m-output is from n-dimensional Euclidean space to m-dimensional Ou A continuous mapping of a finite field in the space, this mapping is highly non-linear, which is conducive to the update of the parameters of each layer of the neural network model. Obtain the initial emotion recognition model. The sample data of the test set is then used to verify the initial emotion recognition model, and if the verification is passed, the initial emotion recognition model is recorded as the emotion recognition model.
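The split/train/validate flow of S401–S404 can be sketched with a deliberately simple one-layer stand-in for the neural network (mini-batches are drawn randomly, in the spirit of stochastic gradient descent; the real model would be a deep network such as VGG16 or ResNet as noted above, and the 80/20 split is an illustrative choice):

```python
import random

def train_and_validate(samples, labels, epochs=200, lr=0.1, batch=4):
    """Sketch of S401-S404: split the sample data into a training set
    and a test set, train by sampling random mini-batches instead of
    the whole training set, then measure accuracy on the test set.
    Returns ((weights, bias), test_accuracy); S404's pass/fail is the
    caller comparing test_accuracy against a preset threshold."""
    data = list(zip(samples, labels))
    random.shuffle(data)
    split = int(0.8 * len(data))            # illustrative 80/20 split
    train, test = data[:split], data[split:]
    w, b = [0.0] * len(samples[0]), 0.0
    for _ in range(epochs):
        # stochastic sampling: a small random subset replaces the full set
        for x, y in random.sample(train, min(batch, len(train))):
            z = sum(wi * xi for wi, xi in zip(w, x)) + b
            err = (1.0 if z > 0 else 0.0) - y   # perceptron-style update
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
            b -= lr * err
    correct = sum(
        1 for x, y in test
        if (1.0 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0.0) == y
    )
    return (w, b), correct / max(1, len(test))
```

The random mini-batch draw is what makes the training stochastic: on a large corpus only a small fraction of the samples is touched before the parameters settle, which is exactly the speed-up described above.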
In one embodiment, after step S6 of typesetting the printed text content and the handwritten text content according to the target text typesetting type to generate the note, the method includes:
S61、接收第二终端发送的获取手写笔记的获取请求,其中所述获取请求记载有所述第二终端支持的阅读格式;S61. Receive an acquisition request for acquiring a handwritten note sent by a second terminal, where the acquisition request records a reading format supported by the second terminal;
S62. Determine whether the reading format supported by the second terminal can display the note;
S63. If the reading format supported by the second terminal can display the note, send the note to the second terminal.
如上所述,实现了将所述笔记发送给所述第二终端。由于所述第二终端可能并不支持阅读展示所述笔记,那么将所述笔记进行格式变换之后再发送给第二终端,以避免所述第二终端识别手写笔记失败。据此,判断所述阅读软件的阅读格式是否能够展示所述笔记;若所述阅读软件的阅读格式能够展示所述笔记,则将所述笔记发送给所述第二终端。进一步地,若所述阅读软件的阅读格式不能够展示所述笔记,则将所述笔记的格式转换为所述阅读软件的阅读格式,再发送给所述第二终端。As described above, it is realized that the note is sent to the second terminal. Since the second terminal may not support reading and displaying the note, the note is formatted and then sent to the second terminal to avoid the second terminal from failing to recognize the handwritten note. Based on this, it is determined whether the reading format of the reading software can display the note; if the reading format of the reading software can display the note, the note is sent to the second terminal. Further, if the reading format of the reading software cannot display the note, the format of the note is converted to the reading format of the reading software, and then sent to the second terminal.
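The delivery logic above, including the fallback conversion when the requested format cannot display the note directly, can be sketched as follows (the format names and the converter table are assumptions for the example):

```python
def deliver_note(note, requested_format, converters):
    """Return the note in a format the second terminal can display.
    `note` is a (format, content) pair taken from the generated note;
    `converters` maps (from_format, to_format) pairs to conversion
    functions, standing in for the format conversion described above."""
    fmt, content = note
    if fmt == requested_format:
        return note                       # S63: format already supported
    convert = converters.get((fmt, requested_format))
    if convert is None:
        raise ValueError(f"no converter from {fmt} to {requested_format}")
    return (requested_format, convert(content))   # convert, then send
```

Keeping the converters in a table means new terminal formats only require registering one more entry rather than changing the delivery code.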
参照图2,本申请实施例提供一种基于文字识别技术的笔记生成装置,应用于指定终端,包括:Referring to FIG. 2, an embodiment of the present application provides a note generation device based on text recognition technology, which is applied to a designated terminal, and includes:
指定图片获取单元10,用于获取具有手写文字和印刷体文字的指定图片;The designated picture acquiring unit 10 is used to acquire designated pictures with handwritten text and printed text;
相似度判断单元20,用于利用预设的图片相似度判断方法,判断所述指定图片与所述指定终端前一次获取的图片是否相似;The similarity determination unit 20 is configured to use a preset picture similarity determination method to determine whether the designated picture is similar to the picture previously acquired by the designated terminal;
特征数据获取单元30,用于若所述指定图片与所述指定终端前一次获取的图片不相似,则利用预设的文字识别技术将所述指定图片中的手写文字和印刷体文字分别识别为手写文字文本和印刷体文字文本,以及提取所述指定图片中手写文字的特征数据,其中所述特征数据至少包括所述手写文字中的重笔位置与重笔数量;The feature data acquiring unit 30 is configured to, if the designated picture is not similar to the picture previously acquired by the designated terminal, use a preset text recognition technology to recognize the handwritten text and printed text in the designated picture as Handwritten text and printed text, as well as extracting feature data of the handwritten text in the designated picture, wherein the feature data includes at least the repetition position and the repetition number in the handwritten text;
预测情绪类别获取单元40,用于将所述特征数据输入基于神经网络模型训练完成的情绪识别模型,获得所述情绪识别模型输出的预测情绪类别,其中所述情绪识别模型基于预先采集的手写文字,以及与所述预先采集的手写文字关联的情绪类别组成的样本数据训练而成;The predicted emotion category obtaining unit 40 is configured to input the feature data into an emotion recognition model trained based on a neural network model to obtain the predicted emotion category output by the emotion recognition model, wherein the emotion recognition model is based on pre-collected handwritten text , And training from sample data composed of emotion categories associated with the pre-collected handwritten text;
排版类型获取单元50,用于根据预设的情绪类别与文字排版类型的对应关系,获取与所述预测情绪类别对应的目标文字排版类型;The typesetting type obtaining unit 50 is configured to obtain the target text typesetting type corresponding to the predicted emotion type according to the preset correspondence between the emotion category and the text typesetting type;
排版单元60,用于将所述印刷体文字文本和所述手写文字文本根据所述目标文字排版类型进行排版,生成所述笔记。The typesetting unit 60 is configured to typeset the printed text and the handwritten text according to the target text typesetting type to generate the note.
其中上述单元分别用于执行的操作与前述实施方式的基于文字识别技术的笔记生成方法的步骤一一对应,在此不再赘述。The operations performed by the above-mentioned units respectively correspond to the steps of the note generation method based on the text recognition technology of the foregoing embodiment, and will not be repeated here.
在一个实施方式中,所述相似度判断单元20,包括:In one embodiment, the similarity judgment unit 20 includes:
灰度化子单元,用于分别对所述指定图片与所述指定终端前一次获取的图片进行灰度化处理,得到第一灰度图片和第二灰度图片;A grayscale subunit, configured to perform grayscale processing on the designated picture and the picture previously acquired by the designated terminal, respectively, to obtain a first grayscale picture and a second grayscale picture;
平均值计算子单元,用于计算灰度图片的第m列或者第m行的所有像素点的灰度值的平均值Am,以及计算灰度图片中所有像素点的灰度值的平均值B;The average value calculation subunit is used to calculate the average value Am of the gray values of all pixels in the m-th column or the m-th row of the gray-scale image, and calculate the average value B of the gray values of all the pixels in the gray-scale image ;
The overall variance calculation subunit is configured to calculate, according to a preset formula (given in the original only as images, Figure PCTCN2019116337-appb-000023 and Figure PCTCN2019116337-appb-000024), the overall variance S_m^2 of the m-th column or the m-th row of a grayscale picture, where N is the total number of columns or rows in the grayscale picture;
The variance difference calculation subunit is configured to obtain, according to a further preset formula (Figure PCTCN2019116337-appb-000025), the difference D_m (Figure PCTCN2019116337-appb-000026) between the overall variances of the m-th column or the m-th row of the first grayscale picture and the second grayscale picture, where S_{m,1}^2 (Figure PCTCN2019116337-appb-000027) is the overall variance of the m-th column or the m-th row of the first grayscale picture and S_{m,2}^2 (Figure PCTCN2019116337-appb-000028) is the overall variance of the m-th column or the m-th row of the second grayscale picture;
The error threshold judgment subunit is configured to judge whether D_m (Figure PCTCN2019116337-appb-000029) is less than a preset variance error threshold;
The similarity determination subunit is configured to determine that the designated picture is similar to the picture previously acquired by the designated terminal if D_m (Figure PCTCN2019116337-appb-000030) is less than the preset variance error threshold.
其中上述子单元分别用于执行的操作与前述实施方式的基于文字识别技术的笔记生成方法的步骤一一对应,在此不再赘述。The operations performed by the above-mentioned sub-units respectively correspond to the steps of the note generation method based on the text recognition technology of the foregoing embodiment, and will not be repeated here.
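Since the variance formulas are only available as images in the source, the sketch below assumes a simple stand-in: the per-column "overall variance" S_m^2 is taken as the squared deviation of the column mean A_m from the global mean B, and two pictures are judged similar when every per-column difference D_m stays below the threshold (the threshold value is also illustrative):

```python
def column_means(gray):
    """Mean gray value A_m of each column of a grayscale image
    (a list of equal-length rows of 0-255 integers)."""
    n_rows = len(gray)
    return [sum(row[m] for row in gray) / n_rows for m in range(len(gray[0]))]

def column_signature(gray):
    """Assumed stand-in for the per-column overall variance S_m^2:
    squared deviation of the column mean A_m from the global mean B.
    The patent's exact formula is not recoverable from the source."""
    a = column_means(gray)
    b = sum(a) / len(a)   # columns are equal length, so this equals
                          # the mean gray value of all pixels
    return [(am - b) ** 2 for am in a]

def similar_by_variance(gray1, gray2, threshold=10.0):
    """Similar when every per-column variance difference D_m is below
    the preset variance error threshold."""
    d = [abs(s1 - s2)
         for s1, s2 in zip(column_signature(gray1), column_signature(gray2))]
    return max(d) < threshold
```

The column-profile comparison is cheaper than the pixel-by-pixel method of S211–S214 and tolerant of small local changes, which is why a variance error threshold rather than exact equality is used.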
在一个实施方式中,所述相似度判断单元20,包括:In one embodiment, the similarity judgment unit 20 includes:
相同像素点统计子单元,用于依次对比所述指定图片与所述指定终端前一次获取的图片中对应的像素点,并统计相同像素点的数量;The same pixel count subunit, which is used to sequentially compare corresponding pixels in the designated picture and the picture previously obtained by the designated terminal, and count the number of identical pixels;
相同像素点占比计算子单元,用于根据公式:相同像素点占比=所述相同像素点的数量/所述指定图片中所有像素点的数量,获得所述相同像素点占比;The same pixel ratio calculation subunit is used to obtain the same pixel ratio according to the formula: the same pixel ratio=the number of the same pixels/the number of all pixels in the specified picture;
占比阈值判断子单元,用于判断所述相同像素点占比是否大于预设的占比阈值;The proportion threshold judging subunit is used to judge whether the proportion of the same pixel is greater than a preset proportion threshold;
第二相似判定子单元,用于若所述相同像素点占比大于预设的占比阈值,则判定所述指定图片与所述指定终端前一次获取的图片相似。The second similarity determination subunit is configured to determine that the designated picture is similar to the picture previously obtained by the designated terminal if the proportion of the same pixel is greater than the preset proportion threshold.
其中上述子单元分别用于执行的操作与前述实施方式的基于文字识别技术的笔记生成方法的步骤 一一对应,在此不再赘述。The operations performed by the above-mentioned sub-units respectively correspond to the steps of the note generation method based on the text recognition technology of the foregoing embodiment, and will not be repeated here.
在一个实施方式中,所述手写文字的颜色与所述印刷体文字的颜色不同,所述特征数据获取单元30,包括:In one embodiment, the color of the handwritten text is different from the color of the printed text, and the characteristic data acquiring unit 30 includes:
暂时图片生成子单元,用于采集所述指定图片中的像素点的RGB颜色模型中的R颜色通道的数值、G颜色通道的数值和B颜色通道的数值,并根据预设的三值化法将所述指定图片中的像素点的RGB颜色设置为(0,0,0)、(255,255,255)或者(P,P,P),其中P为大于0且小于255的预设数值,获得由三种颜色构成的暂时图片;The temporary picture generation subunit is used to collect the value of the R color channel, the value of the G color channel and the value of the B color channel in the RGB color model of the pixel in the specified picture, and according to the preset three-value method Set the RGB color of the pixel in the specified picture to (0,0,0), (255,255,255) or (P,P,P), where P is a preset value greater than 0 and less than 255, and the Temporary pictures composed of various colors;
分割子单元,用于计算三种颜色在所述暂时图片中所占面积,并对面积较小的两种颜色的所占区域分别采用预设的文字分割方法,获得分割开的单个手写文字和分割开的单个印刷体文字;The segmentation subunit is used to calculate the area occupied by the three colors in the temporary picture, and use the preset text segmentation method for the area occupied by the two colors with the smaller area to obtain the divided single handwritten text and Separate single printed text;
识别子单元,用于提取所述单个手写文字的文字特征和所述单个印刷体文字的文字特征,并输入预设的支持向量机中进行分类,获得识别而得的手写文字文本和印刷体文字文本。The recognition subunit is used to extract the text features of the single handwritten text and the text features of the single printed text, and input them into a preset support vector machine for classification to obtain the recognized handwritten text text and printed text text.
其中上述子单元分别用于执行的操作与前述实施方式的基于文字识别技术的笔记生成方法的步骤一一对应,在此不再赘述。The operations performed by the above-mentioned sub-units respectively correspond to the steps of the note generation method based on the text recognition technology of the foregoing embodiment, and will not be repeated here.
在一个实施方式中,所述暂时图片生成子单元,包括:In one embodiment, the temporary picture generation subunit includes:
参考数值F1计算模块,用于采集所述指定图片中的像素点的RGB颜色模型中的R颜色通道的数值、G颜色通道的数值和B颜色通道的数值,并根据公式:F1=MIN{ROUND[(a1R+a2G+a3B)/L,0],A},获取参考数值F1,其中MIN为最小值函数,ROUND为四舍五入函数,a1、a2、a3均为大于0且小于L的正数,L为大于0的整数,A为预设的取值在范围(0,255)之内第一阈值参数,R、G、B分别为所述指定图片中的指定像素点的RGB颜色模型中的R颜色通道的数值、G颜色通道的数值和B颜色通道的数值;The reference value F1 calculation module is used to collect the value of the R color channel, the value of the G color channel and the value of the B color channel in the RGB color model of the pixel in the specified picture, and according to the formula: F1=MIN{ROUND [(a1R+a2G+a3B)/L,0],A}, get the reference value F1, where MIN is the minimum value function, ROUND is the rounding function, a1, a2, and a3 are all positive numbers greater than 0 and less than L, L is an integer greater than 0, A is the first threshold parameter with a preset value in the range (0, 255), R, G, and B are respectively the R color in the RGB color model of the designated pixel in the designated picture The numerical value of the channel, the numerical value of the G color channel and the numerical value of the B color channel;
参考数值F1判断模块,用于判断所述参考数值F1的值是否等于A;The reference value F1 judgment module is used to judge whether the value of the reference value F1 is equal to A;
The reference value F2 calculation module is configured to obtain, if the value of the reference value F1 equals A, a reference value F2 according to the formula F2=MAX{ROUND[(a1R+a2G+a3B)/L,0],B}, where MAX is the maximum value function and B is a preset second threshold parameter taking a value in the range (0,255), B being greater than A;
参考数值F2判断模块,用于判断所述参考数值F2的值是否等于B;The reference value F2 judgment module is used to judge whether the value of the reference value F2 is equal to B;
颜色设置模块,用于若所述参考数值F2的值不等于B,则将所述指定像素点的RGB颜色设置为(255,255,255)。The color setting module is configured to set the RGB color of the designated pixel to (255, 255, 255) if the value of the reference value F2 is not equal to B.
其中上述模块分别用于执行的操作与前述实施方式的基于文字识别技术的笔记生成方法的步骤一一对应,在此不再赘述。The operations performed by the above-mentioned modules respectively correspond to the steps of the note generation method based on the text recognition technology of the foregoing embodiment, and will not be repeated here.
在一个实施方式中,所述装置,包括:In one embodiment, the device includes:
样本数据调取单元,用于调取预先采集的样本数据,并将样本数据分成训练集和测试集;其中,所 述样本数据包括预先采集的手写文字,以及与所述预先采集的手写文字关联的情绪类别;The sample data retrieval unit is used to retrieve pre-collected sample data and divide the sample data into a training set and a test set; wherein the sample data includes pre-collected handwritten characters and is associated with the pre-collected handwritten characters Emotional category;
训练单元,用于将训练集的样本数据输入到预设的神经网络模型中进行训练,得到初始情绪识别模型,其中,训练的过程中采用随机梯度下降法;The training unit is used to input the sample data of the training set into the preset neural network model for training to obtain the initial emotion recognition model, wherein the stochastic gradient descent method is used in the training process;
验证单元,用于利用测试集的样本数据验证所述初始情绪识别模型;A verification unit for verifying the initial emotion recognition model by using sample data of the test set;
标记单元,用于若所述初始情绪识别模型验证通过,则将所述初始情绪识别模型记为所述情绪识别模型。The marking unit is configured to record the initial emotion recognition model as the emotion recognition model if the verification of the initial emotion recognition model is passed.
其中上述单元分别用于执行的操作与前述实施方式的基于文字识别技术的笔记生成方法的步骤一一对应,在此不再赘述。The operations performed by the above-mentioned units respectively correspond to the steps of the note generation method based on the text recognition technology of the foregoing embodiment, and will not be repeated here.
在一个实施方式中,所述装置,包括:In one embodiment, the device includes:
阅读格式获取单元,用于接收第二终端发送的获取手写笔记的获取请求,其中所述获取请求记载有所述第二终端支持的阅读格式;A reading format obtaining unit, configured to receive an obtaining request for obtaining handwritten notes sent by a second terminal, wherein the obtaining request records a reading format supported by the second terminal;
The reading format judgment unit is configured to determine whether the reading format supported by the second terminal can display the note;
The note sending unit is configured to send the note to the second terminal if the reading format supported by the second terminal can display the note.
其中上述单元分别用于执行的操作与前述实施方式的基于文字识别技术的笔记生成方法的步骤一一对应,在此不再赘述。The operations performed by the above-mentioned units respectively correspond to the steps of the note generation method based on the text recognition technology of the foregoing embodiment, and will not be repeated here.
Referring to FIG. 3, an embodiment of the present application further provides a computer device. The computer device may be a server, and its internal structure may be as shown in the figure. The computer device includes a processor, a memory, a network interface and a database connected through a system bus. The processor of the computer device provides computation and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program and a database. The internal memory provides an environment for running the operating system and the computer program stored in the non-volatile storage medium. The database of the computer device stores the data used by the note generation method based on character recognition technology. The network interface of the computer device communicates with an external terminal through a network connection. When executed by the processor, the computer program implements a note generation method based on character recognition technology.
上述处理器执行上述基于文字识别技术的笔记生成方法,其中所述方法包括的步骤分别与执行前述实施方式的基于文字识别技术的笔记生成方法的步骤一一对应,在此不再赘述。The above-mentioned processor executes the above-mentioned note generation method based on text recognition technology, wherein the steps included in the method respectively correspond to the steps of executing the note generation method based on text recognition technology of the aforementioned embodiment one-to-one, and will not be repeated here.
本申请一实施例还提供一种计算机可读存储介质,其上存储有计算机程序,计算机程序被处理器执行时实现基于文字识别技术的笔记生成方法,其中所述方法包括的步骤分别与执行前述实施方式的基于文字识别技术的笔记生成方法的步骤一一对应,在此不再赘述。其中所述计算机可读存储介质,例如为非易失性的计算机可读存储介质,或者为易失性的计算机可读存储介质。An embodiment of the present application also provides a computer-readable storage medium on which a computer program is stored. When the computer program is executed by a processor, a method for generating notes based on text recognition technology is realized, wherein the steps included in the method are respectively the same as those in The steps of the note generation method based on the text recognition technology of the embodiment correspond one to one, and will not be repeated here. The computer-readable storage medium is, for example, a non-volatile computer-readable storage medium or a volatile computer-readable storage medium.

Claims (20)

  1. 一种基于文字识别技术的笔记生成方法,应用于指定终端,其特征在于,包括:A note generation method based on text recognition technology, applied to a designated terminal, is characterized in that it includes:
    获取具有手写文字和印刷体文字的指定图片;Obtain designated pictures with handwritten text and printed text;
    利用预设的图片相似度判断方法,判断所述指定图片与所述指定终端前一次获取的图片是否相似;Using a preset picture similarity judgment method to judge whether the specified picture is similar to the picture previously obtained by the specified terminal;
    若所述指定图片与所述指定终端前一次获取的图片不相似,则利用预设的文字识别技术将所述指定图片中的手写文字和印刷体文字分别识别为手写文字文本和印刷体文字文本,以及提取所述指定图片中手写文字的特征数据,其中所述特征数据至少包括所述手写文字中的重笔位置与重笔数量;If the designated picture is not similar to the picture previously obtained by the designated terminal, the handwritten text and printed text in the designated picture are recognized as handwritten text and printed text, respectively, by using a preset text recognition technology , And extracting feature data of the handwritten text in the designated picture, wherein the feature data includes at least the repetition position and the number of repetitions in the handwritten text;
    将所述特征数据输入基于神经网络模型训练完成的情绪识别模型,获得所述情绪识别模型输出的预测情绪类别,其中所述情绪识别模型基于预先采集的手写文字,以及与所述预先采集的手写文字关联的情绪类别组成的样本数据训练而成;The feature data is input into the emotion recognition model trained based on the neural network model to obtain the predicted emotion category output by the emotion recognition model, wherein the emotion recognition model is based on pre-collected handwritten text and is related to the pre-collected handwritten text. Trained on sample data composed of emotion categories associated with text;
    根据预设的情绪类别与文字排版类型的对应关系,获取与所述预测情绪类别对应的目标文字排版类型;Acquiring the target text typesetting type corresponding to the predicted emotion type according to the preset correspondence relationship between the emotion category and the text typesetting type;
    将所述印刷体文字文本和所述手写文字文本根据所述目标文字排版类型进行排版,生成所述笔记。The printed text and the handwritten text are typeset according to the target text typesetting type to generate the note.
  2. 根据权利要求1所述的基于文字识别技术的笔记生成方法,其特征在于,所述利用预设的图片相似度判断方法,判断所述指定图片与所述指定终端前一次获取的图片是否相似的步骤,包括:The method for generating notes based on text recognition technology according to claim 1, wherein the predetermined method for judging the similarity of pictures is used to judge whether the designated picture is similar to the picture previously obtained by the designated terminal. The steps include:
    分别对所述指定图片与所述指定终端前一次获取的图片进行灰度化处理,得到第一灰度图片和第二灰度图片;Performing gray-scale processing on the designated picture and the picture previously acquired by the designated terminal, respectively, to obtain a first gray-scale picture and a second gray-scale picture;
    计算灰度图片的第m列或者第m行的所有像素点的灰度值的平均值Am,以及计算灰度图片中所有像素点的灰度值的平均值B;Calculate the average value Am of the gray values of all pixels in the m-th column or the m-th row of the gray-scale picture, and calculate the average value B of the gray values of all the pixels in the gray-scale picture;
    根据公式：σ_m² = (1/N)·Σ_{m=1}^{N}(A_m − B)²，计算灰度图片的第m列或者第m行的总体方差σ_m²，其中N为所述灰度图片中的列或者行的总数量；According to the formula σ_m² = (1/N)·Σ_{m=1}^{N}(A_m − B)², calculate the overall variance σ_m² of the m-th column or the m-th row of the grayscale picture, where N is the total number of columns or rows in the grayscale picture;
    根据公式：Δσ_m² = σ_{1m}² − σ_{2m}²，获得所述第一灰度图片与所述第二灰度图片的第m列或者第m行的总体方差之差Δσ_m²，其中σ_{1m}²为所述第一灰度图片的第m列或者第m行的总体方差，σ_{2m}²为所述第二灰度图片的第m列或者第m行的总体方差；According to the formula Δσ_m² = σ_{1m}² − σ_{2m}², obtain the difference Δσ_m² between the overall variances of the m-th column or the m-th row of the first grayscale picture and the second grayscale picture, where σ_{1m}² is the overall variance of the m-th column or the m-th row of the first grayscale picture and σ_{2m}² is the overall variance of the m-th column or the m-th row of the second grayscale picture;
    判断|Δσ_m²|是否小于预设的方差误差阈值；Judge whether |Δσ_m²| is less than the preset variance error threshold;
    若|Δσ_m²|小于预设的方差误差阈值，则判定所述指定图片与所述指定终端前一次获取的图片相似。If |Δσ_m²| is less than the preset variance error threshold, it is determined that the designated picture is similar to the picture previously acquired by the designated terminal.
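The variance-based check of claim 2 can be sketched in Python, under the assumption that the "overall variance" is the mean squared deviation of the per-column (or per-row) gray means A_m from the global mean B; the function names and the threshold value are ours, not the patent's.

```python
def column_means(gray):
    """A_m: mean gray value of each column of a 2-D gray image."""
    n_rows, n_cols = len(gray), len(gray[0])
    return [sum(gray[r][c] for r in range(n_rows)) / n_rows
            for c in range(n_cols)]

def overall_variance(gray):
    """sigma^2 = (1/N) * sum_m (A_m - B)^2 over the N column means,
    where B is the mean gray value of all pixels."""
    a = column_means(gray)
    b = sum(sum(row) for row in gray) / (len(gray) * len(gray[0]))
    return sum((am - b) ** 2 for am in a) / len(a)

def is_similar(gray1, gray2, eps=1.0):
    """Similar when the absolute variance difference is below a preset
    variance error threshold eps (illustrative value)."""
    return abs(overall_variance(gray1) - overall_variance(gray2)) < eps
```

Transposing the input lists would give the row-wise variant the claim also allows.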
  3. 根据权利要求1所述的基于文字识别技术的笔记生成方法，其特征在于，所述利用预设的图片相似度判断方法，判断所述指定图片与所述指定终端前一次获取的图片是否相似的步骤，包括：The note generation method based on text recognition technology according to claim 1, wherein the step of using the preset picture similarity judgment method to judge whether the designated picture is similar to the picture previously acquired by the designated terminal comprises:
    依次对比所述指定图片与所述指定终端前一次获取的图片中对应的像素点,并统计相同像素点的数量;Sequentially compare the corresponding pixels in the designated picture and the picture previously acquired by the designated terminal, and count the number of the same pixels;
    根据公式:相同像素点占比=所述相同像素点的数量/所述指定图片中所有像素点的数量,获得所述相同像素点占比;According to the formula: the proportion of the same pixels=the number of the same pixels/the number of all the pixels in the specified picture, the proportion of the same pixels is obtained;
    判断所述相同像素点占比是否大于预设的占比阈值;Judging whether the proportion of the same pixel points is greater than a preset proportion threshold;
    若所述相同像素点占比大于预设的占比阈值,则判定所述指定图片与所述指定终端前一次获取的图片相似。If the proportion of the same pixel is greater than the preset proportion threshold, it is determined that the specified picture is similar to the picture previously obtained by the specified terminal.
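Claim 3's pixel-comparison variant is straightforward to sketch; the images are assumed to be equally sized 2-D lists of pixel values, and the threshold value is illustrative.

```python
def same_pixel_ratio(img1, img2):
    """Fraction of corresponding positions whose pixel values match exactly:
    ratio = (number of same pixels) / (number of all pixels)."""
    flat1 = [p for row in img1 for p in row]
    flat2 = [p for row in img2 for p in row]
    same = sum(1 for a, b in zip(flat1, flat2) if a == b)
    return same / len(flat1)

def is_similar_by_pixels(img1, img2, threshold=0.9):
    """Similar when the same-pixel ratio exceeds a preset threshold."""
    return same_pixel_ratio(img1, img2) > threshold
```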
  4. 根据权利要求1所述的基于文字识别技术的笔记生成方法，其特征在于，所述手写文字的颜色与所述印刷体文字的颜色不同，所述利用预设的文字识别技术将所述指定图片中的手写文字和印刷体文字分别识别为手写文字文本和印刷体文字文本的步骤，包括：The note generation method based on text recognition technology according to claim 1, wherein the color of the handwritten text is different from the color of the printed text, and the step of using the preset text recognition technology to recognize the handwritten text and the printed text in the designated picture as handwritten text and printed text respectively comprises:
    采集所述指定图片中的像素点的RGB颜色模型中的R颜色通道的数值、G颜色通道的数值和B颜色通道的数值，并根据预设的三值化法将所述指定图片中的像素点的RGB颜色设置为(0,0,0)、(255,255,255)或者(P,P,P)，其中P为大于0且小于255的预设数值，获得由三种颜色构成的暂时图片；Collect the value of the R color channel, the value of the G color channel and the value of the B color channel in the RGB color model of the pixels in the designated picture, and set the RGB color of the pixels in the designated picture to (0,0,0), (255,255,255) or (P,P,P) according to a preset three-value method, where P is a preset value greater than 0 and less than 255, obtaining a temporary picture composed of three colors;
    计算三种颜色在所述暂时图片中所占面积，并对面积较小的两种颜色的所占区域分别采用预设的文字分割方法，获得分割开的单个手写文字和分割开的单个印刷体文字；Calculate the areas occupied by the three colors in the temporary picture, and apply a preset text segmentation method to the regions occupied by the two colors with the smaller areas respectively, obtaining segmented individual handwritten characters and segmented individual printed characters;
    提取所述单个手写文字的文字特征和所述单个印刷体文字的文字特征,并输入预设的支持向量机中进行分类,获得识别而得的手写文字文本和印刷体文字文本。The text features of the single handwritten text and the text features of the single printed text are extracted and input into a preset support vector machine for classification to obtain recognized handwritten text and printed text.
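The area reasoning in claim 4 can be sketched as follows: after ternarization, the color covering the largest area is taken to be the page background, and the two minority colors are the text layers (handwritten vs. printed) handed to segmentation. The three fixed colors are assumptions for illustration.

```python
from collections import Counter

# Assumed outputs of the three-value step: black, (P,P,P) gray, white.
BLACK, GRAY, WHITE = (0, 0, 0), (128, 128, 128), (255, 255, 255)

def minority_colors(ternary_img):
    """Return the two colors with the smaller areas in the ternarized
    picture; the largest-area color is treated as the background."""
    counts = Counter(p for row in ternary_img for p in row)
    ordered = [color for color, _ in counts.most_common()]
    return ordered[1:]  # drop the background (largest area)
```

Each returned color's region would then go through the preset text segmentation method and the SVM classification step of the claim.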
  5. 根据权利要求4所述的基于文字识别技术的笔记生成方法，其特征在于，所述采集所述指定图片中的像素点的RGB颜色模型中的R颜色通道的数值、G颜色通道的数值和B颜色通道的数值，并根据预设的三值化法将所述指定图片中的像素点的RGB颜色设置为(0,0,0)、(255,255,255)或者(P,P,P)的步骤，包括：The note generation method based on text recognition technology according to claim 4, wherein the step of collecting the value of the R color channel, the value of the G color channel and the value of the B color channel in the RGB color model of the pixels in the designated picture, and setting the RGB color of the pixels in the designated picture to (0,0,0), (255,255,255) or (P,P,P) according to the preset three-value method comprises:
    采集所述指定图片中的像素点的RGB颜色模型中的R颜色通道的数值、G颜色通道的数值和B颜色通道的数值，并根据公式：F1=MIN{ROUND[(a1R+a2G+a3B)/L,0],A}，获取参考数值F1，其中MIN为最小值函数，ROUND为四舍五入函数，a1、a2、a3均为大于0且小于L的正数，L为大于0的整数，A为预设的取值在范围(0,255)之内的第一阈值参数，R、G、B分别为所述指定图片中的指定像素点的RGB颜色模型中的R颜色通道的数值、G颜色通道的数值和B颜色通道的数值；Collect the value of the R color channel, the value of the G color channel and the value of the B color channel in the RGB color model of the pixels in the designated picture, and obtain the reference value F1 according to the formula F1 = MIN{ROUND[(a1R+a2G+a3B)/L, 0], A}, where MIN is the minimum value function, ROUND is the rounding function, a1, a2 and a3 are positive numbers greater than 0 and less than L, L is an integer greater than 0, A is a preset first threshold parameter whose value lies within the range (0, 255), and R, G and B are respectively the value of the R color channel, the value of the G color channel and the value of the B color channel in the RGB color model of the designated pixel in the designated picture;
    判断所述参考数值F1的值是否等于A;Determine whether the value of the reference value F1 is equal to A;
    若所述参考数值F1的值不等于A，则根据公式：F2=MAX{ROUND[(a1R+a2G+a3B)/L,0],B}，获取参考数值F2，其中MAX为最大值函数，B为预设的取值在范围(0,255)之内的第二阈值参数，并且B大于A；If the value of the reference value F1 is not equal to A, obtain the reference value F2 according to the formula F2 = MAX{ROUND[(a1R+a2G+a3B)/L, 0], B}, where MAX is the maximum value function, and B is a preset second threshold parameter whose value lies within the range (0, 255), B being greater than A;
    判断所述参考数值F2的值是否等于B;Determine whether the value of the reference value F2 is equal to B;
    若所述参考数值F2的值不等于B,则将所述指定像素点的RGB颜色设置为(255,255,255)。If the value of the reference value F2 is not equal to B, the RGB color of the designated pixel is set to (255, 255, 255).
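Claim 5's MIN/MAX thresholding amounts to banding a weighted gray value against the two thresholds A and B. The sketch below implements that banding; the weights a1, a2, a3 and divisor L follow the common luma convention (a1+a2+a3 = L) as an assumption, and the dark/middle/light color assignment is our reading of the claim, not a definitive implementation.

```python
def ternarize_pixel(r, g, b, A=85, B=170,
                    a1=299, a2=587, a3=114, L=1000, P=128):
    """Map one RGB pixel to (0,0,0), (P,P,P) or (255,255,255) by banding
    the weighted gray value between the preset thresholds A < B."""
    luma = round((a1 * r + a2 * g + a3 * b) / L)
    f1 = min(luma, A)            # F1 = MIN{ROUND[(a1R+a2G+a3B)/L], A}
    if f1 != A:                  # luma below A: darkest band
        return (0, 0, 0)
    f2 = max(luma, B)            # F2 = MAX{ROUND[(a1R+a2G+a3B)/L], B}
    if f2 != B:                  # luma above B: lightest band
        return (255, 255, 255)
    return (P, P, P)             # between A and B: middle band
```

Applying this per pixel yields the three-color temporary picture of claim 4.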
  6. 根据权利要求1所述的基于文字识别技术的笔记生成方法，其特征在于，所述将所述特征数据输入基于神经网络模型训练完成的情绪识别模型，获得所述情绪识别模型输出的预测情绪类别，其中所述情绪识别模型基于预先采集的手写文字，以及与所述预先采集的手写文字关联的情绪类别组成的样本数据训练而成的步骤之前，包括：The note generation method based on text recognition technology according to claim 1, wherein before the step of inputting the feature data into the emotion recognition model trained on a neural network model to obtain the predicted emotion category output by the emotion recognition model, where the emotion recognition model is trained on sample data composed of pre-collected handwritten text and the emotion categories associated with the pre-collected handwritten text, the method comprises:
    调取预先采集的样本数据,并将样本数据分成训练集和测试集;其中,所述样本数据包括预先采集的手写文字,以及与所述预先采集的手写文字关联的情绪类别;Retrieve pre-collected sample data, and divide the sample data into a training set and a test set; wherein the sample data includes pre-collected handwritten characters and emotion categories associated with the pre-collected handwritten characters;
    将训练集的样本数据输入到预设的神经网络模型中进行训练,得到初始情绪识别模型,其中,训练的过程中采用随机梯度下降法;Input the sample data of the training set into the preset neural network model for training to obtain the initial emotion recognition model, where the stochastic gradient descent method is used in the training process;
    利用测试集的样本数据验证所述初始情绪识别模型;Verifying the initial emotion recognition model by using sample data of the test set;
    若所述初始情绪识别模型验证通过,则将所述初始情绪识别模型记为所述情绪识别模型。If the verification of the initial emotion recognition model is passed, the initial emotion recognition model is recorded as the emotion recognition model.
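The training flow of claim 6 (split the samples, train by stochastic gradient descent, verify on the test set, keep the model only if verification passes) can be sketched with a one-feature logistic unit standing in for the neural network; the 80/20 split, learning rate, and pass criterion are illustrative assumptions.

```python
import math
import random

def train_and_verify(samples, lr=0.5, epochs=200, pass_acc=0.9, seed=0):
    """samples: list of (feature, label) pairs, label in {0, 1}.
    Returns the trained (w, b) if test-set verification passes, else None."""
    rng = random.Random(seed)
    data = samples[:]
    rng.shuffle(data)
    split = int(0.8 * len(data))          # 80/20 train/test split (assumed)
    train, test = data[:split], data[split:]
    w, b = 0.0, 0.0
    for _ in range(epochs):
        rng.shuffle(train)
        for x, y in train:                # one SGD step per sample
            p = 1.0 / (1.0 + math.exp(-(w * x + b)))
            w -= lr * (p - y) * x
            b -= lr * (p - y)
    correct = sum(((w * x + b) > 0) == (y == 1) for x, y in test)
    accuracy = correct / len(test) if test else 0.0
    return (w, b) if accuracy >= pass_acc else None
```

A real implementation would use a multi-feature neural network over the heavy-stroke feature data, but the split/train/verify skeleton is the same.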
  7. 根据权利要求1所述的基于文字识别技术的笔记生成方法，其特征在于，所述将所述印刷体文字文本和所述手写文字文本根据所述目标文字排版类型进行排版，生成所述笔记的步骤之后，包括：The note generation method based on text recognition technology according to claim 1, wherein after the step of typesetting the printed text and the handwritten text according to the target text typesetting type to generate the note, the method comprises:
    接收第二终端发送的获取手写笔记的获取请求,其中所述获取请求记载有所述第二终端支持的阅读格式;Receiving an acquisition request for acquiring a handwritten note sent by a second terminal, where the acquisition request records a reading format supported by the second terminal;
    判断所述第二终端支持的阅读格式是否能够展示所述笔记；Judge whether the reading format supported by the second terminal can display the note;
    若所述第二终端支持的阅读格式能够展示所述笔记，则将所述笔记发送给所述第二终端。If the reading format supported by the second terminal can display the note, send the note to the second terminal.
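Claim 7's format check before sending the note can be sketched as a small dispatch function; the format string and response shape are hypothetical.

```python
NOTE_FORMAT = "pdf"  # assumed export format of the generated note

def handle_fetch_request(supported_formats, note_bytes):
    """Send the note only when one of the requesting terminal's reading
    formats can display it; otherwise report an unsupported format."""
    if NOTE_FORMAT in supported_formats:
        return {"status": "sent", "payload": note_bytes}
    return {"status": "unsupported_format", "payload": None}
```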
  8. 一种基于文字识别技术的笔记生成装置,应用于指定终端,其特征在于,包括:A note generation device based on text recognition technology, which is applied to a designated terminal, and is characterized in that it includes:
    指定图片获取单元,用于获取具有手写文字和印刷体文字的指定图片;Designated picture acquisition unit for acquiring designated pictures with handwritten text and printed text;
    相似度判断单元,用于利用预设的图片相似度判断方法,判断所述指定图片与所述指定终端前一次获取的图片是否相似;A similarity judgment unit, configured to use a preset image similarity judgment method to judge whether the designated picture is similar to the picture previously acquired by the designated terminal;
    特征数据获取单元，用于若所述指定图片与所述指定终端前一次获取的图片不相似，则利用预设的文字识别技术将所述指定图片中的手写文字和印刷体文字分别识别为手写文字文本和印刷体文字文本，以及提取所述指定图片中手写文字的特征数据，其中所述特征数据至少包括所述手写文字中的重笔位置与重笔数量；The feature data acquisition unit is configured to, if the designated picture is not similar to the picture previously acquired by the designated terminal, use a preset text recognition technology to recognize the handwritten text and the printed text in the designated picture as handwritten text and printed text respectively, and to extract feature data of the handwritten text in the designated picture, where the feature data includes at least the heavy-stroke positions and the number of heavy strokes in the handwritten text;
    预测情绪类别获取单元，用于将所述特征数据输入基于神经网络模型训练完成的情绪识别模型，获得所述情绪识别模型输出的预测情绪类别，其中所述情绪识别模型基于预先采集的手写文字，以及与所述预先采集的手写文字关联的情绪类别组成的样本数据训练而成；The predicted emotion category acquisition unit is configured to input the feature data into an emotion recognition model trained on a neural network model to obtain the predicted emotion category output by the emotion recognition model, where the emotion recognition model is trained on sample data composed of pre-collected handwritten text and the emotion categories associated with the pre-collected handwritten text;
    排版类型获取单元,用于根据预设的情绪类别与文字排版类型的对应关系,获取与所述预测情绪类别对应的目标文字排版类型;The typesetting type obtaining unit is configured to obtain the target text typesetting type corresponding to the predicted emotion type according to the preset correspondence between the emotion category and the text typesetting type;
    排版单元,用于将所述印刷体文字文本和所述手写文字文本根据所述目标文字排版类型进行排版, 生成所述笔记。The typesetting unit is configured to typeset the printed text and the handwritten text according to the target text typesetting type to generate the note.
  9. 根据权利要求8所述的基于文字识别技术的笔记生成装置，其特征在于，所述相似度判断单元，包括：The note generation device based on text recognition technology according to claim 8, wherein the similarity judgment unit comprises:
    灰度化子单元,用于分别对所述指定图片与所述指定终端前一次获取的图片进行灰度化处理,得到第一灰度图片和第二灰度图片;A grayscale subunit, configured to perform grayscale processing on the designated picture and the picture previously acquired by the designated terminal, respectively, to obtain a first grayscale picture and a second grayscale picture;
    平均值计算子单元,用于计算灰度图片的第m列或者第m行的所有像素点的灰度值的平均值Am,以及计算灰度图片中所有像素点的灰度值的平均值B;The average value calculation subunit is used to calculate the average value Am of the gray values of all pixels in the m-th column or the m-th row of the gray-scale image, and calculate the average value B of the gray values of all the pixels in the gray-scale image ;
    总体方差计算子单元，用于根据公式：σ_m² = (1/N)·Σ_{m=1}^{N}(A_m − B)²，计算灰度图片的第m列或者第m行的总体方差σ_m²，其中N为所述灰度图片中的列或者行的总数量；The overall variance calculation subunit is configured to calculate the overall variance σ_m² of the m-th column or the m-th row of the grayscale picture according to the formula σ_m² = (1/N)·Σ_{m=1}^{N}(A_m − B)², where N is the total number of columns or rows in the grayscale picture;
    方差之差计算子单元，用于根据公式：Δσ_m² = σ_{1m}² − σ_{2m}²，获得所述第一灰度图片与所述第二灰度图片的第m列或者第m行的总体方差之差Δσ_m²，其中σ_{1m}²为所述第一灰度图片的第m列或者第m行的总体方差，σ_{2m}²为所述第二灰度图片的第m列或者第m行的总体方差；The variance difference calculation subunit is configured to obtain the difference Δσ_m² between the overall variances of the m-th column or the m-th row of the first grayscale picture and the second grayscale picture according to the formula Δσ_m² = σ_{1m}² − σ_{2m}², where σ_{1m}² is the overall variance of the m-th column or the m-th row of the first grayscale picture and σ_{2m}² is the overall variance of the m-th column or the m-th row of the second grayscale picture;
    误差阈值判断子单元，用于判断|Δσ_m²|是否小于预设的方差误差阈值；The error threshold judgment subunit is configured to judge whether |Δσ_m²| is less than the preset variance error threshold;
    相似判定子单元，用于若|Δσ_m²|小于预设的方差误差阈值，则判定所述指定图片与所述指定终端前一次获取的图片相似。The similarity determination subunit is configured to determine, if |Δσ_m²| is less than the preset variance error threshold, that the designated picture is similar to the picture previously acquired by the designated terminal.
  10. 根据权利要求8所述的基于文字识别技术的笔记生成装置，其特征在于，所述相似度判断单元，包括：The note generation device based on text recognition technology according to claim 8, wherein the similarity judgment unit comprises:
    相同像素点统计子单元,用于依次对比所述指定图片与所述指定终端前一次获取的图片中对应的像素点,并统计相同像素点的数量;The same pixel count subunit, which is used to sequentially compare corresponding pixels in the designated picture and the picture previously obtained by the designated terminal, and count the number of identical pixels;
    相同像素点占比计算子单元,用于根据公式:相同像素点占比=所述相同像素点的数量/所述指定图片中所有像素点的数量,获得所述相同像素点占比;The same pixel ratio calculation subunit is used to obtain the same pixel ratio according to the formula: the same pixel ratio=the number of the same pixels/the number of all pixels in the specified picture;
    占比阈值判断子单元,用于判断所述相同像素点占比是否大于预设的占比阈值;The proportion threshold judging subunit is used to judge whether the proportion of the same pixel is greater than a preset proportion threshold;
    第二相似判定子单元,用于若所述相同像素点占比大于预设的占比阈值,则判定所述指定图片与所述指定终端前一次获取的图片相似。The second similarity determination subunit is configured to determine that the designated picture is similar to the picture previously obtained by the designated terminal if the proportion of the same pixel is greater than the preset proportion threshold.
  11. 根据权利要求8所述的基于文字识别技术的笔记生成装置，其特征在于，所述手写文字的颜色与所述印刷体文字的颜色不同，所述特征数据获取单元，包括：The note generation device based on text recognition technology according to claim 8, wherein the color of the handwritten text is different from the color of the printed text, and the feature data acquisition unit comprises:
    暂时图片生成子单元，用于采集所述指定图片中的像素点的RGB颜色模型中的R颜色通道的数值、G颜色通道的数值和B颜色通道的数值，并根据预设的三值化法将所述指定图片中的像素点的RGB颜色设置为(0,0,0)、(255,255,255)或者(P,P,P)，其中P为大于0且小于255的预设数值，获得由三种颜色构成的暂时图片；The temporary picture generation subunit is configured to collect the value of the R color channel, the value of the G color channel and the value of the B color channel in the RGB color model of the pixels in the designated picture, and to set the RGB color of the pixels in the designated picture to (0,0,0), (255,255,255) or (P,P,P) according to the preset three-value method, where P is a preset value greater than 0 and less than 255, obtaining a temporary picture composed of three colors;
    分割子单元，用于计算三种颜色在所述暂时图片中所占面积，并对面积较小的两种颜色的所占区域分别采用预设的文字分割方法，获得分割开的单个手写文字和分割开的单个印刷体文字；The segmentation subunit is configured to calculate the areas occupied by the three colors in the temporary picture, and to apply a preset text segmentation method to the regions occupied by the two colors with the smaller areas respectively, obtaining segmented individual handwritten characters and segmented individual printed characters;
    识别子单元,用于提取所述单个手写文字的文字特征和所述单个印刷体文字的文字特征,并输入预设的支持向量机中进行分类,获得识别而得的手写文字文本和印刷体文字文本。The recognition subunit is used to extract the text features of the single handwritten text and the text features of the single printed text, and input them into a preset support vector machine for classification to obtain the recognized handwritten text text and printed text text.
  12. 根据权利要求11所述的基于文字识别技术的笔记生成装置,其特征在于,所述暂时图片生成子单元,包括:The note generation device based on text recognition technology according to claim 11, wherein the temporary picture generation subunit comprises:
    参考数值F1计算模块，用于采集所述指定图片中的像素点的RGB颜色模型中的R颜色通道的数值、G颜色通道的数值和B颜色通道的数值，并根据公式：F1=MIN{ROUND[(a1R+a2G+a3B)/L,0],A}，获取参考数值F1，其中MIN为最小值函数，ROUND为四舍五入函数，a1、a2、a3均为大于0且小于L的正数，L为大于0的整数，A为预设的取值在范围(0,255)之内的第一阈值参数，R、G、B分别为所述指定图片中的指定像素点的RGB颜色模型中的R颜色通道的数值、G颜色通道的数值和B颜色通道的数值；The reference value F1 calculation module is configured to collect the value of the R color channel, the value of the G color channel and the value of the B color channel in the RGB color model of the pixels in the designated picture, and to obtain the reference value F1 according to the formula F1 = MIN{ROUND[(a1R+a2G+a3B)/L, 0], A}, where MIN is the minimum value function, ROUND is the rounding function, a1, a2 and a3 are positive numbers greater than 0 and less than L, L is an integer greater than 0, A is a preset first threshold parameter whose value lies within the range (0, 255), and R, G and B are respectively the value of the R color channel, the value of the G color channel and the value of the B color channel in the RGB color model of the designated pixel in the designated picture;
    参考数值F1判断模块,用于判断所述参考数值F1的值是否等于A;The reference value F1 judgment module is used to judge whether the value of the reference value F1 is equal to A;
    参考数值F2计算模块，用于若所述参考数值F1的值不等于A，则根据公式：F2=MAX{ROUND[(a1R+a2G+a3B)/L,0],B}，获取参考数值F2，其中MAX为最大值函数，B为预设的取值在范围(0,255)之内的第二阈值参数，并且B大于A；The reference value F2 calculation module is configured to, if the value of the reference value F1 is not equal to A, obtain the reference value F2 according to the formula F2 = MAX{ROUND[(a1R+a2G+a3B)/L, 0], B}, where MAX is the maximum value function, and B is a preset second threshold parameter whose value lies within the range (0, 255), B being greater than A;
    参考数值F2判断模块,用于判断所述参考数值F2的值是否等于B;The reference value F2 judgment module is used to judge whether the value of the reference value F2 is equal to B;
    颜色设置模块,用于若所述参考数值F2的值不等于B,则将所述指定像素点的RGB颜色设置为(255,255,255)。The color setting module is configured to set the RGB color of the designated pixel to (255, 255, 255) if the value of the reference value F2 is not equal to B.
  13. 根据权利要求8所述的基于文字识别技术的笔记生成装置，其特征在于，所述装置，包括：The note generation device based on text recognition technology according to claim 8, wherein the device comprises:
    样本数据调取单元，用于调取预先采集的样本数据，并将样本数据分成训练集和测试集；其中，所述样本数据包括预先采集的手写文字，以及与所述预先采集的手写文字关联的情绪类别；The sample data retrieval unit is configured to retrieve pre-collected sample data and divide the sample data into a training set and a test set, where the sample data includes pre-collected handwritten text and the emotion categories associated with the pre-collected handwritten text;
    训练单元,用于将训练集的样本数据输入到预设的神经网络模型中进行训练,得到初始情绪识别模型,其中,训练的过程中采用随机梯度下降法;The training unit is used to input the sample data of the training set into the preset neural network model for training to obtain the initial emotion recognition model, wherein the stochastic gradient descent method is used in the training process;
    验证单元,用于利用测试集的样本数据验证所述初始情绪识别模型;A verification unit for verifying the initial emotion recognition model by using sample data of the test set;
    标记单元,用于若所述初始情绪识别模型验证通过,则将所述初始情绪识别模型记为所述情绪识别模型。The marking unit is configured to record the initial emotion recognition model as the emotion recognition model if the verification of the initial emotion recognition model is passed.
  14. 根据权利要求8所述的基于文字识别技术的笔记生成装置，其特征在于，所述装置，包括：The note generation device based on text recognition technology according to claim 8, wherein the device comprises:
    阅读格式获取单元,用于接收第二终端发送的获取手写笔记的获取请求,其中所述获取请求记载有所述第二终端支持的阅读格式;A reading format obtaining unit, configured to receive an obtaining request for obtaining handwritten notes sent by a second terminal, wherein the obtaining request records a reading format supported by the second terminal;
    阅读格式判断单元，用于判断所述第二终端支持的阅读格式是否能够展示所述笔记；The reading format judgment unit is configured to judge whether the reading format supported by the second terminal can display the note;
    笔记发送单元，用于若所述第二终端支持的阅读格式能够展示所述笔记，则将所述笔记发送给所述第二终端。The note sending unit is configured to send the note to the second terminal if the reading format supported by the second terminal can display the note.
  15. 一种计算机设备，包括存储器和处理器，所述存储器存储有计算机程序，其特征在于，所述处理器执行所述计算机程序时实现基于文字识别技术的笔记生成方法，所述基于文字识别技术的笔记生成方法，包括：A computer device, comprising a memory and a processor, the memory storing a computer program, wherein the processor, when executing the computer program, implements a note generation method based on text recognition technology, the method comprising:
    获取具有手写文字和印刷体文字的指定图片;Obtain designated pictures with handwritten text and printed text;
    利用预设的图片相似度判断方法,判断所述指定图片与所述指定终端前一次获取的图片是否相似;Using a preset picture similarity judgment method to judge whether the specified picture is similar to the picture previously obtained by the specified terminal;
    若所述指定图片与所述指定终端前一次获取的图片不相似，则利用预设的文字识别技术将所述指定图片中的手写文字和印刷体文字分别识别为手写文字文本和印刷体文字文本，以及提取所述指定图片中手写文字的特征数据，其中所述特征数据至少包括所述手写文字中的重笔位置与重笔数量；If the designated picture is not similar to the picture previously acquired by the designated terminal, use a preset text recognition technology to recognize the handwritten text and the printed text in the designated picture as handwritten text and printed text respectively, and extract feature data of the handwritten text in the designated picture, where the feature data includes at least the heavy-stroke positions and the number of heavy strokes in the handwritten text;
    将所述特征数据输入基于神经网络模型训练完成的情绪识别模型，获得所述情绪识别模型输出的预测情绪类别，其中所述情绪识别模型基于预先采集的手写文字，以及与所述预先采集的手写文字关联的情绪类别组成的样本数据训练而成；The feature data is input into an emotion recognition model trained on a neural network model to obtain the predicted emotion category output by the emotion recognition model, where the emotion recognition model is trained on sample data composed of pre-collected handwritten text and the emotion categories associated with the pre-collected handwritten text;
    根据预设的情绪类别与文字排版类型的对应关系,获取与所述预测情绪类别对应的目标文字排版类型;Acquiring the target text typesetting type corresponding to the predicted emotion type according to the preset correspondence relationship between the emotion category and the text typesetting type;
    将所述印刷体文字文本和所述手写文字文本根据所述目标文字排版类型进行排版,生成所述笔记。The printed text and the handwritten text are typeset according to the target text typesetting type to generate the note.
  16. 根据权利要求15所述的计算机设备,其特征在于,所述利用预设的图片相似度判断方法,判断所述指定图片与所述指定终端前一次获取的图片是否相似的步骤,包括:The computer device according to claim 15, wherein the step of determining whether the designated picture is similar to the picture previously obtained by the designated terminal by using a preset picture similarity judgment method comprises:
    分别对所述指定图片与所述指定终端前一次获取的图片进行灰度化处理,得到第一灰度图片和第二灰度图片;Performing gray-scale processing on the designated picture and the picture previously acquired by the designated terminal, respectively, to obtain a first gray-scale picture and a second gray-scale picture;
    计算灰度图片的第m列或者第m行的所有像素点的灰度值的平均值Am,以及计算灰度图片中所有像素点的灰度值的平均值B;Calculate the average value Am of the gray values of all pixels in the m-th column or the m-th row of the gray-scale picture, and calculate the average value B of the gray values of all the pixels in the gray-scale picture;
    根据公式：σ_m² = (1/N)·Σ_{m=1}^{N}(A_m − B)²，计算灰度图片的第m列或者第m行的总体方差σ_m²，其中N为所述灰度图片中的列或者行的总数量；According to the formula σ_m² = (1/N)·Σ_{m=1}^{N}(A_m − B)², calculate the overall variance σ_m² of the m-th column or the m-th row of the grayscale picture, where N is the total number of columns or rows in the grayscale picture;
    根据公式：Δσ_m² = σ_{1m}² − σ_{2m}²，获得所述第一灰度图片与所述第二灰度图片的第m列或者第m行的总体方差之差Δσ_m²，其中σ_{1m}²为所述第一灰度图片的第m列或者第m行的总体方差，σ_{2m}²为所述第二灰度图片的第m列或者第m行的总体方差；According to the formula Δσ_m² = σ_{1m}² − σ_{2m}², obtain the difference Δσ_m² between the overall variances of the m-th column or the m-th row of the first grayscale picture and the second grayscale picture, where σ_{1m}² is the overall variance of the m-th column or the m-th row of the first grayscale picture and σ_{2m}² is the overall variance of the m-th column or the m-th row of the second grayscale picture;
    判断|Δσ_m²|是否小于预设的方差误差阈值；Judge whether |Δσ_m²| is less than the preset variance error threshold;
    若|Δσ_m²|小于预设的方差误差阈值，则判定所述指定图片与所述指定终端前一次获取的图片相似。If |Δσ_m²| is less than the preset variance error threshold, it is determined that the designated picture is similar to the picture previously acquired by the designated terminal.
  17. 根据权利要求15所述的计算机设备,其特征在于,所述利用预设的图片相似度判断方法,判断所述指定图片与所述指定终端前一次获取的图片是否相似的步骤,包括:The computer device according to claim 15, wherein the step of determining whether the designated picture is similar to the picture previously obtained by the designated terminal by using a preset picture similarity judgment method comprises:
    依次对比所述指定图片与所述指定终端前一次获取的图片中对应的像素点,并统计相同像素点的数量;Sequentially compare the corresponding pixels in the designated picture and the picture previously acquired by the designated terminal, and count the number of the same pixels;
    根据公式:相同像素点占比=所述相同像素点的数量/所述指定图片中所有像素点的数量,获得所述相同像素点占比;According to the formula: the proportion of the same pixels=the number of the same pixels/the number of all the pixels in the specified picture, the proportion of the same pixels is obtained;
    判断所述相同像素点占比是否大于预设的占比阈值;Judging whether the proportion of the same pixel points is greater than a preset proportion threshold;
    若所述相同像素点占比大于预设的占比阈值,则判定所述指定图片与所述指定终端前一次获取的图片相似。If the proportion of the same pixel is greater than the preset proportion threshold, it is determined that the specified picture is similar to the picture previously obtained by the specified terminal.
  18. 一种非易失性的计算机可读存储介质，其上存储有计算机程序，其特征在于，所述计算机程序被处理器执行时实现基于文字识别技术的笔记生成方法，所述基于文字识别技术的笔记生成方法，包括：A non-volatile computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements a note generation method based on text recognition technology, the method comprising:
    获取具有手写文字和印刷体文字的指定图片;Obtain designated pictures with handwritten text and printed text;
    利用预设的图片相似度判断方法,判断所述指定图片与所述指定终端前一次获取的图片是否相似;Using a preset picture similarity judgment method to judge whether the specified picture is similar to the picture previously obtained by the specified terminal;
    若所述指定图片与所述指定终端前一次获取的图片不相似，则利用预设的文字识别技术将所述指定图片中的手写文字和印刷体文字分别识别为手写文字文本和印刷体文字文本，以及提取所述指定图片中手写文字的特征数据，其中所述特征数据至少包括所述手写文字中的重笔位置与重笔数量；If the designated picture is not similar to the picture previously acquired by the designated terminal, use a preset text recognition technology to recognize the handwritten text and the printed text in the designated picture as handwritten text and printed text respectively, and extract feature data of the handwritten text in the designated picture, where the feature data includes at least the heavy-stroke positions and the number of heavy strokes in the handwritten text;
    将所述特征数据输入基于神经网络模型训练完成的情绪识别模型，获得所述情绪识别模型输出的预测情绪类别，其中所述情绪识别模型基于预先采集的手写文字，以及与所述预先采集的手写文字关联的情绪类别组成的样本数据训练而成；The feature data is input into an emotion recognition model trained on a neural network model to obtain the predicted emotion category output by the emotion recognition model, where the emotion recognition model is trained on sample data composed of pre-collected handwritten text and the emotion categories associated with the pre-collected handwritten text;
    根据预设的情绪类别与文字排版类型的对应关系,获取与所述预测情绪类别对应的目标文字排版类型;Acquiring the target text typesetting type corresponding to the predicted emotion type according to the preset correspondence relationship between the emotion category and the text typesetting type;
    将所述印刷体文字文本和所述手写文字文本根据所述目标文字排版类型进行排版,生成所述笔记。The printed text and the handwritten text are typeset according to the target text typesetting type to generate the note.
  19. 根据权利要求18所述的非易失性的计算机可读存储介质，其特征在于，所述利用预设的图片相似度判断方法，判断所述指定图片与所述指定终端前一次获取的图片是否相似的步骤，包括：The non-volatile computer-readable storage medium according to claim 18, wherein the step of using the preset picture similarity judgment method to judge whether the designated picture is similar to the picture previously acquired by the designated terminal comprises:
    分别对所述指定图片与所述指定终端前一次获取的图片进行灰度化处理,得到第一灰度图片和第二灰度图片;Performing gray-scale processing on the designated picture and the picture previously acquired by the designated terminal, respectively, to obtain a first gray-scale picture and a second gray-scale picture;
    计算灰度图片的第m列或者第m行的所有像素点的灰度值的平均值Am,以及计算灰度图片中所有像素点的灰度值的平均值B;Calculate the average value Am of the gray values of all pixels in the m-th column or the m-th row of the gray-scale picture, and calculate the average value B of the gray values of all the pixels in the gray-scale picture;
    根据公式:
    Figure PCTCN2019116337-appb-100025
    计算灰度图片的第m列或者第m行的总体方差
    Figure PCTCN2019116337-appb-100026
    其中N为所述灰度图片中的列或者行的总数量;
    According to the formula:
    Figure PCTCN2019116337-appb-100025
    Calculate the overall variance of the m-th column or m-th row of the grayscale image
    Figure PCTCN2019116337-appb-100026
    Where N is the total number of columns or rows in the grayscale picture;
    根据公式:
    Figure PCTCN2019116337-appb-100027
    获得所述第一灰度图片与所述第二灰度图片的第m列或者第m行的总体方差之差
    Figure PCTCN2019116337-appb-100028
    其中,
    Figure PCTCN2019116337-appb-100029
    为所述第一灰度图片的第m列或者第m行的总体方差,
    Figure PCTCN2019116337-appb-100030
    为所述第二灰度图片的第m列或者第m行的总体方差;
    According to the formula:
    Figure PCTCN2019116337-appb-100027
    Obtain the difference between the overall variance of the m-th column or m-th row of the first gray-scale picture and the second gray-scale picture
    Figure PCTCN2019116337-appb-100028
    among them,
    Figure PCTCN2019116337-appb-100029
    Is the overall variance of the m-th column or m-th row of the first grayscale picture,
    Figure PCTCN2019116337-appb-100030
    Is the overall variance of the m-th column or the m-th row of the second grayscale picture;
    判断
    Figure PCTCN2019116337-appb-100031
    是否小于预设的方差误差阈值;
    judgment
    Figure PCTCN2019116337-appb-100031
    Whether it is less than the preset variance error threshold;
    Figure PCTCN2019116337-appb-100032
    小于预设的方差误差阈值,则判定所述指定图片与所述指定终端前一次获取的图片相似。
    If
    Figure PCTCN2019116337-appb-100032
    If it is less than the preset variance error threshold, it is determined that the specified picture is similar to the picture previously acquired by the specified terminal.
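The variance comparison described in claim 19 can be sketched as follows. This is an illustrative Python/NumPy sketch, not the patented implementation: the luma weights used for gray-scale conversion, the use of column (rather than row) averages for A_m, and the threshold value of 1.0 are all assumptions not fixed by the claim text.

```python
import numpy as np

def column_mean_variance(gray: np.ndarray) -> float:
    """Overall variance of the per-column averages A_m around the global mean B."""
    col_means = gray.mean(axis=0)   # A_m for each column m = 1..N
    global_mean = gray.mean()       # B over all pixels
    # (1/N) * sum over m of (A_m - B)^2
    return float(((col_means - global_mean) ** 2).mean())

def similar_by_variance(img1: np.ndarray, img2: np.ndarray,
                        threshold: float = 1.0) -> bool:
    """Judge similarity by whether the variance difference is below the threshold."""
    def to_gray(img: np.ndarray) -> np.ndarray:
        # Standard luma weights (an assumed choice; the claim only says
        # "gray-scale processing" without fixing a particular formula)
        if img.ndim == 3:
            return img[..., :3] @ np.array([0.299, 0.587, 0.114])
        return img.astype(float)
    diff = column_mean_variance(to_gray(img1)) - column_mean_variance(to_gray(img2))
    return abs(diff) < threshold
```

For intuition: a uniform picture has zero column-mean variance, while a picture with a strong horizontal gradient has a large one, so the two are judged dissimilar under any small threshold.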
  20. The non-volatile computer-readable storage medium according to claim 18, wherein the step of using the preset picture similarity determination method to determine whether the designated picture is similar to the picture previously acquired by the designated terminal comprises:
    comparing, one by one, the corresponding pixels of the designated picture and of the picture previously acquired by the designated terminal, and counting the number of identical pixels;
    obtaining the proportion of identical pixels according to the formula: proportion of identical pixels = number of identical pixels / number of all pixels in the designated picture;
    determining whether the proportion of identical pixels is greater than a preset proportion threshold; and
    if the proportion of identical pixels is greater than the preset proportion threshold, determining that the designated picture is similar to the picture previously acquired by the designated terminal.
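The pixel-proportion test of claim 20 is simpler to sketch; a minimal Python/NumPy version follows, with the 0.9 proportion threshold chosen arbitrarily for illustration (the claim leaves the threshold value open):

```python
import numpy as np

def similar_by_pixel_ratio(img1: np.ndarray, img2: np.ndarray,
                           ratio_threshold: float = 0.9) -> bool:
    """Compare corresponding pixels and test the identical-pixel proportion."""
    if img1.shape != img2.shape:
        return False  # pixel-wise comparison needs pictures of equal size
    same = int(np.count_nonzero(img1 == img2))  # number of identical pixels
    ratio = same / img1.size                    # identical pixels / all pixels
    return ratio > ratio_threshold
```

Note that for multi-channel images `img1.size` counts individual channel values rather than whole pixels; a per-pixel variant would compare along the channel axis first.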
PCT/CN2019/116337 2019-09-03 2019-11-07 Note generation method and apparatus based on character recognition technology, and computer device WO2021042505A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910828605.2 2019-09-03
CN201910828605.2A CN110705233B (en) 2019-09-03 2019-09-03 Note generation method and device based on character recognition technology and computer equipment

Publications (1)

Publication Number Publication Date
WO2021042505A1 true WO2021042505A1 (en) 2021-03-11

Family

ID=69194318

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/116337 WO2021042505A1 (en) 2019-09-03 2019-11-07 Note generation method and apparatus based on character recognition technology, and computer device

Country Status (2)

Country Link
CN (1) CN110705233B (en)
WO (1) WO2021042505A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112882678A (en) * 2021-03-15 2021-06-01 百度在线网络技术(北京)有限公司 Image-text processing method, display method, device, equipment and storage medium
CN113255613A (en) * 2021-07-06 2021-08-13 北京世纪好未来教育科技有限公司 Question judging method and device and computer storage medium

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111476279A (en) * 2020-03-24 2020-07-31 平安银行股份有限公司 Similarity value-based identification method and device and computer equipment
CN111651960B (en) * 2020-06-01 2023-05-30 杭州尚尚签网络科技有限公司 Joint optical character training and recognition method for converting contracts from simplified to traditional Chinese
CN111832547A (en) * 2020-06-24 2020-10-27 平安普惠企业管理有限公司 Dynamic deployment method and device of character recognition model and computer equipment
CN112257710A (en) * 2020-10-26 2021-01-22 北京云杉世界信息技术有限公司 Method and device for detecting the inclination of a picture containing a text plane
CN112257629A (en) * 2020-10-29 2021-01-22 广联达科技股份有限公司 Text information identification method and device for construction drawing
CN113610186A (en) * 2021-08-20 2021-11-05 湖州师范学院 Method for recognizing emotional state through digital writing

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050100217A1 (en) * 2003-11-07 2005-05-12 Microsoft Corporation Template-based cursive handwriting recognition
CN106598948A (en) * 2016-12-19 2017-04-26 杭州语忆科技有限公司 Emotion recognition method based on a long short-term memory neural network combined with an autoencoder
CN108885555A (en) * 2016-11-30 2018-11-23 微软技术许可有限责任公司 Emotion-based interaction method and apparatus
CN109189985A (en) * 2018-08-17 2019-01-11 北京达佳互联信息技术有限公司 Text style processing method, device, electronic equipment and storage medium
CN109815463A (en) * 2018-12-13 2019-05-28 深圳壹账通智能科技有限公司 Text editing selection control method, apparatus, computer device and storage medium
CN110135427A (en) * 2019-04-11 2019-08-16 北京百度网讯科技有限公司 The method, apparatus, equipment and medium of character in image for identification

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2767894A1 (en) * 2013-02-15 2014-08-20 BlackBerry Limited Method and apparatus pertaining to adjusting textual graphic embellishments
US20160239608A1 (en) * 2013-09-13 2016-08-18 Vivago Oy Arrangement and a method for creating a synthesis from numerical data and textual information
US10210383B2 (en) * 2015-09-03 2019-02-19 Microsoft Technology Licensing, Llc Interacting with an assistant component based on captured stroke information
US20170068436A1 (en) * 2015-09-03 2017-03-09 Microsoft Technology Licensing, Llc Interpreting and Supplementing Captured Stroke Information

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050100217A1 (en) * 2003-11-07 2005-05-12 Microsoft Corporation Template-based cursive handwriting recognition
CN108885555A (en) * 2016-11-30 2018-11-23 微软技术许可有限责任公司 Emotion-based interaction method and apparatus
CN106598948A (en) * 2016-12-19 2017-04-26 杭州语忆科技有限公司 Emotion recognition method based on a long short-term memory neural network combined with an autoencoder
CN109189985A (en) * 2018-08-17 2019-01-11 北京达佳互联信息技术有限公司 Text style processing method, device, electronic equipment and storage medium
CN109815463A (en) * 2018-12-13 2019-05-28 深圳壹账通智能科技有限公司 Text editing selection control method, apparatus, computer device and storage medium
CN110135427A (en) * 2019-04-11 2019-08-16 北京百度网讯科技有限公司 The method, apparatus, equipment and medium of character in image for identification

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112882678A (en) * 2021-03-15 2021-06-01 百度在线网络技术(北京)有限公司 Image-text processing method, display method, device, equipment and storage medium
CN112882678B (en) * 2021-03-15 2024-04-09 百度在线网络技术(北京)有限公司 Image-text processing method, image-text processing display method, image-text processing device, image-text processing equipment and storage medium
CN113255613A (en) * 2021-07-06 2021-08-13 北京世纪好未来教育科技有限公司 Question judging method and device and computer storage medium
CN113255613B (en) * 2021-07-06 2021-09-24 北京世纪好未来教育科技有限公司 Question judging method and device and computer storage medium

Also Published As

Publication number Publication date
CN110705233B (en) 2023-04-07
CN110705233A (en) 2020-01-17

Similar Documents

Publication Publication Date Title
WO2021042505A1 (en) Note generation method and apparatus based on character recognition technology, and computer device
WO2021027336A1 (en) Authentication method and apparatus based on seal and signature, and computer device
Dutta et al. Improving CNN-RNN hybrid networks for handwriting recognition
CN109543690B (en) Method and device for extracting information
CN108664996B (en) Ancient character recognition method and system based on deep learning
CN108764195B (en) Handwriting model training method, handwritten character recognition method, device, equipment and medium
WO2021051598A1 (en) Text sentiment analysis model training method, apparatus and device, and readable storage medium
RU2707147C1 (en) Neural network training by means of specialized loss functions
WO2020164278A1 (en) Image processing method and device, electronic equipment and readable storage medium
CN113254654B (en) Model training method, text recognition method, device, equipment and medium
CN113011144A (en) Form information acquisition method and device and server
CN111932418B (en) Student learning condition identification method and system, teaching terminal and storage medium
JP5214679B2 (en) Learning apparatus, method and program
WO2022062028A1 (en) Wine label recognition method, wine information management method and apparatus, device, and storage medium
CN111340032A (en) Character recognition method based on application scene in financial field
CN113361666B (en) Handwritten character recognition method, system and medium
CN113011253B (en) Facial expression recognition method, device, equipment and storage medium based on ResNeXt network
CN114357206A (en) Education video color subtitle generation method and system based on semantic analysis
CN114037886A (en) Image recognition method and device, electronic equipment and readable storage medium
CN111242114B (en) Character recognition method and device
CN111881880A (en) Bill text recognition method based on novel network
CN116645683A (en) Signature handwriting identification method, system and storage medium based on prompt learning
CN111414889A (en) Financial statement identification method and device based on character identification
CN111008624A (en) Optical character recognition method and method for generating training sample for optical character recognition
CN115880702A (en) Data processing method, device, equipment, program product and storage medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19944315

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19944315

Country of ref document: EP

Kind code of ref document: A1