CN111797610A - Font and key sentence analysis method based on image processing - Google Patents


Info

Publication number
CN111797610A
CN111797610A (application CN202010671365.2A)
Authority
CN
China
Prior art keywords
font, text, key, detected, analysis
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN202010671365.2A
Other languages
Chinese (zh)
Inventor
耿绘绘
张誉文
Current Assignee
Zhengzhou Mayuan Network Technology Co ltd
Original Assignee
Zhengzhou Mayuan Network Technology Co ltd
Priority date
Filing date
Publication date
Application filed by Zhengzhou Mayuan Network Technology Co ltd filed Critical Zhengzhou Mayuan Network Technology Co ltd
Priority to CN202010671365.2A priority Critical patent/CN111797610A/en
Publication of CN111797610A publication Critical patent/CN111797610A/en
Withdrawn legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06F18/22 Matching criteria, e.g. proximity measures
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/40 Document-oriented image-based pattern recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Computation (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Multimedia (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Document Processing Apparatus (AREA)

Abstract

The invention provides a font and key-sentence analysis method based on image processing. Text sub-images to be detected are fed simultaneously into a font-analysis twin (Siamese) network and a key-sentence-analysis twin network to obtain a font analysis result and a key-sentence analysis result for each sub-image. The per-sub-image results are then integrated into a font analysis result and an article-structure judgment for the whole text image to be detected, from which a scoring range for the text image is derived. The method assists the marking teacher during grading, is unaffected by the teacher's subjective factors, produces accurate detection results, and improves the teacher's working efficiency.

Description

Font and key sentence analysis method based on image processing
Technical Field
The invention relates to the field of artificial intelligence and image processing, in particular to a font and key sentence analysis method based on image processing.
Background
At present, key sentences in English compositions and the handwriting itself are generally analyzed manually, i.e., through the subjective judgment of a marking teacher. This approach has several drawbacks: it consumes a great deal of the teacher's time and energy; when the teacher overlooks key sentences in the composition, the scoring result becomes inaccurate; and inaccurate scores dampen students' enthusiasm for learning.
Disclosure of Invention
In order to solve the above problems, the present invention provides a font and key-sentence analysis method based on image processing, the method comprising:
building twin (Siamese) networks, namely a font-analysis twin network and a key-sentence-analysis twin network, where each twin network comprises two weight-sharing branches and a distance-calculation module, and each branch comprises an encoder and a fully connected layer;
training the twin networks: a training set is constructed from collected English text images; after a cropping operation, each image in the training set is labeled as to whether its handwriting is attractive and whether it contains a key sentence; two cropped images from the training set are fed simultaneously into the two branches of each twin network, and the font-analysis and key-sentence-analysis twin networks output the Euclidean distance between the two font feature vectors and between the two key-sentence feature vectors, respectively;
cropping the text image to be detected into n text sub-images; selecting one branch from each of the trained key-sentence-analysis and font-analysis twin networks and feeding the sub-images into the selected branches in turn; each of the two distance-calculation modules, and each encoder and fully connected layer in the two selected branches, corresponds to a block; every block randomly selects an available node, a private blockchain is generated according to the inference order of the two selected branches and the distance-calculation modules, and computation then proceeds at the corresponding nodes along the private chain;
comparing the Euclidean distances to obtain the font analysis result and key-sentence analysis result of each text sub-image; integrating the per-sub-image font and key-sentence analysis results to obtain the font analysis result and article-structure judgment of the text image to be detected; and deriving the scoring range of the text image from that font analysis result and article-structure judgment.
In the training set, the ratio of English text images containing key sentences to those not containing key sentences is 1:1, as is the ratio of images with attractive handwriting to images with unattractive handwriting.
The feature vectors obtained during training are stored in four categories: contains a key sentence, does not contain a key sentence, attractive font, and unattractive font.
The cropping operation is performed with a sliding window whose size is set to cover at least two lines of English text.
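The sliding-window cropping step can be sketched as follows. This is a minimal illustration, not the patent's implementation; window size and stride are assumptions (the patent only requires the window to cover at least two text lines).

```python
# Hypothetical sketch of the sliding-window cropping operation.
# `image` is a 2-D list of pixel rows; win_h/win_w and the strides are assumed.
def sliding_window_crops(image, win_h, win_w, stride_y, stride_x):
    """Yield (top, left, crop) for every window position inside `image`."""
    h, w = len(image), len(image[0])
    for top in range(0, h - win_h + 1, stride_y):
        for left in range(0, w - win_w + 1, stride_x):
            crop = [row[left:left + win_w] for row in image[top:top + win_h]]
            yield top, left, crop
```

Each yielded crop is one text sub-image; for a text image this produces the n sub-images referred to above.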
The node selected by the block corresponding to an encoder is a local server node; the nodes selected by the blocks corresponding to the fully connected layers and distance-calculation modules are cloud server nodes. Data transmitted between nodes is encrypted and decrypted with an encryption/decryption algorithm.
The inference order is specifically:
Key-sentence-analysis twin network: the node selected by the block corresponding to the encoder in the selected branch extracts features from the input image to obtain a first feature map; after a flattening operation, the first feature map is sent to the node selected by the block corresponding to the fully connected layer, which computes the key-sentence feature vector; the distance-calculation module at the node selected by its block then computes the Euclidean distances between this key-sentence feature vector and the two stored categories of feature vectors (contains a key sentence, does not contain a key sentence).
Font-analysis twin network: the node selected by the block corresponding to the encoder in the selected branch extracts features from the input image to obtain a second feature map; after flattening, the second feature map is sent to the node selected by the block corresponding to the fully connected layer, which computes the font feature vector; the distance-calculation module at the node selected by its block then computes the Euclidean distances between this font feature vector and the two stored categories of feature vectors (attractive font, unattractive font).
The comparison operation is specifically: two thresholds α and β are set. If any of the Euclidean distances between the key-sentence feature vector and the stored contains-key-sentence feature vectors is below α, the sub-image is judged to contain a key sentence; if any of the distances to the stored does-not-contain-key-sentence feature vectors is below α, it is judged not to contain a key sentence. Likewise, if any of the distances between the font feature vector and the stored attractive-font feature vectors is below β, the font is judged attractive; if any of the distances to the stored unattractive-font feature vectors is below β, the font is judged unattractive.
The integration operation is specifically: for the n key-sentence analysis results, let r be the ratio of the number of sub-images judged to contain a key sentence to n. When r is at least a first threshold T1, the article structure of the text image to be detected is judged excellent; when r is at least a second threshold T2 but below T1 (with T2 < T1), the structure is judged good; when r is below T2, the structure is judged general.
For the n font analysis results, when the ratio of the number of attractive-font results to n is at least a threshold T3, the font analysis result of the text image to be detected is attractive; otherwise it is judged unattractive.
The scoring rules are as follows, with a full score of e: when the article structure is general and the font is judged general, the scoring range is [0, a]; when the structure is general and the font is attractive, (a, b]; when the structure is good and the font is general, (b, c]; when the structure is good and the font is attractive, or the structure is excellent and the font is general, (c, d]; and when the structure is excellent and the font is attractive, (d, e], where 0 < a < b < c < d < e.
The invention has the beneficial effects that:
1. The method uses a relatively simple twin network for coarse recognition; the network is small and runs fast. It assists the marking teacher's grading process by simultaneously determining whether an English composition contains key sentences and whether its handwriting is attractive, and combines the two analysis results into an objective and fair scoring range. The teacher can then assign a specific score within that range based on other factors, which reduces workload and improves efficiency.
2. Existing practice relies mainly on the marking teacher manually judging whether the composition contains key sentences, a judgment easily influenced by the teacher's subjective factors, which makes the score less objective. The proposed method is unaffected by subjective human factors and can accurately and quickly determine whether an English composition contains key sentences, assisting the grading process without requiring excessive effort from the teacher. It thereby reduces the teacher's workload and effectively avoids scoring errors caused by key sentences being overlooked for subjective reasons.
3. The method adopts blockchain and encryption technology, which ensures data security, improves the security of systems using the method, and prevents data leakage during transmission.
Drawings
Fig. 1 is a flow chart of the method.
Detailed Description
To make the invention comprehensible to those skilled in the art, it is described in detail below with reference to an embodiment and the accompanying drawing, fig. 1.
Example (b):
A method for analyzing fonts and key sentences based on image processing comprises the following steps:
Acquire English text images. Since the size, format, etc. of the writing area in an English test paper are fixed, images of the same writing area are captured by cameras with identical pose.
Build the twin networks, comprising a font-analysis twin network and a key-sentence-analysis twin network. As shown in fig. 1, the two twin networks share the same architecture but have different weights; each comprises two weight-sharing branches and a distance-calculation module. Specifically:
The key-sentence-analysis twin network comprises two branches and a first distance-calculation module. The left branch comprises a first encoder and a first fully connected layer; the right branch comprises a second encoder and a second fully connected layer; both fully connected layers are connected to the first distance-calculation module.
The font-analysis twin network comprises two branches and a second distance-calculation module. The left branch comprises a third encoder and a third fully connected layer; the right branch comprises a fourth encoder and a fourth fully connected layer; both fully connected layers are connected to the second distance-calculation module.
Each encoder extracts features from an input image to obtain a feature map; after a flattening operation, the feature map is sent to the fully connected layer, which computes a feature vector.
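The flatten-then-fully-connected step that turns a feature map into a feature vector can be illustrated in miniature. This is not the patent's actual architecture (the encoder itself is unspecified); it only shows the flattening and the linear layer.

```python
# Minimal illustration of the flattening operation and a fully connected
# layer. The weight/bias values used in any example are arbitrary.
def flatten(feature_map):
    """Flatten a 2-D feature map into a 1-D vector, row by row."""
    return [v for row in feature_map for v in row]

def fully_connected(x, weights, biases):
    """One linear layer: `weights` holds one row of input-sized weights per
    output unit; returns the feature vector W @ x + b."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, biases)]
```

In the patent's networks, the output of `fully_connected` plays the role of the key-sentence or font feature vector.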
Train the twin networks:
A training set is constructed from the collected English text images. After the cropping operation, each image is labeled as to whether its handwriting is attractive and whether it contains a key sentence; it is suggested that a group of marking teachers be surveyed so that each image's attractiveness label reflects multiple opinions. After screening, a certain number of images are retained as the training data set, and it is suggested that the ratio of images containing key sentences to images not containing them be 1:1, and likewise 1:1 for images with attractive versus unattractive handwriting;
Two English text images randomly selected from the training set are cropped with the sliding window and then fed simultaneously into the two branches of each twin network; the font-analysis twin network and the key-sentence-analysis twin network output the Euclidean distance between the two font feature vectors and between the two key-sentence feature vectors, respectively. The twin networks are trained with a contrastive loss function:

L = \frac{1}{2N} \sum_{n=1}^{N} \left[ Y D^2 + (1 - Y) \max(m - D, 0)^2 \right]

where Y is a label indicating whether the two samples match: Y = 1 means the two samples are similar (matching), and Y = 0 means they do not match; N is the number of sample pairs; and m is a set margin threshold, which differs between the two twin networks (α in the key-sentence-analysis twin network, β in the font-analysis twin network). Here

D = \lVert X_1 - X_2 \rVert_2 = \sqrt{\sum_{p=1}^{P} (X_{1p} - X_{2p})^2}

is the Euclidean distance between the feature vectors X_1 and X_2 output by the two branches, where P is the feature dimension of the sample (taken as 1 in the twin networks of the present invention).

According to this loss function, when two samples are similar (Y = 1), the loss term is D^2, the squared Euclidean distance between the similar samples: the loss is small when similar samples are close in Euclidean distance and grows as they move apart.

When the two samples do not match (Y = 0), the loss term is \max(m - D, 0)^2: the loss is large when the Euclidean distance between mismatched samples is small, and small when it is large. The margin m restricts attention to distances between 0 and m; once mismatched samples are farther apart than m, the requirement is satisfied and the loss term is zero.
After training, the twin networks ensure that feature vectors of similar samples are close and feature vectors of dissimilar samples are far apart, with the threshold m serving as the criterion: a Euclidean distance below m means the samples are similar, and a distance above m means they are dissimilar. In the two trained twin networks, the feature vectors of the English-text training images are stored in four categories: contains a key sentence, does not contain a key sentence, attractive font, and unattractive font. Within a category the feature vectors are close in Euclidean distance; between categories they are far apart.
It should be noted that the two branches of a twin network share weights. Both branches are used during training with the contrastive loss function, which suits situations where the number of samples is limited; during actual analysis, only one branch is used to output a feature vector, whose Euclidean distances to the stored feature vectors are then computed.
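The contrastive loss above can be written directly from its formula. A minimal sketch in plain Python, operating on pairs of feature vectors rather than on a real network's outputs:

```python
import math

def euclidean(x1, x2):
    """Euclidean distance D between two feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x1, x2)))

def contrastive_loss(pairs, m):
    """pairs: list of (x1, x2, y) with y = 1 for matching samples, 0 otherwise.
    Implements L = 1/(2N) * sum(y*D^2 + (1-y)*max(m-D, 0)^2)."""
    total = 0.0
    for x1, x2, y in pairs:
        d = euclidean(x1, x2)
        total += y * d ** 2 + (1 - y) * max(m - d, 0.0) ** 2
    return total / (2 * len(pairs))
```

Matching pairs are penalized by their squared distance, while mismatched pairs contribute nothing once they are farther apart than the margin m, exactly as described above.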
The text image to be detected is cropped with a sliding window into n text sub-images; the window size is suggested to cover at least two lines of English text. Each sub-image is fed in turn into one branch of the font-analysis twin network and one branch of the key-sentence-analysis twin network, yielding a font feature vector and a key-sentence feature vector. The distance-calculation modules of the two branches compute the Euclidean distances between the key-sentence feature vector and the two stored categories of feature vectors (contains / does not contain a key sentence), and between the font feature vector and the two stored categories (attractive / unattractive font). A comparison operation then yields the font analysis result and key-sentence analysis result of the sub-image, specifically:
Using the set thresholds α and β of the two twin networks: if any Euclidean distance between the key-sentence feature vector and a stored contains-key-sentence feature vector is below α, the sub-image is judged to contain a key sentence; if any distance to a stored does-not-contain-key-sentence feature vector is below α, it is judged not to contain one. Likewise, if any distance between the font feature vector and a stored attractive-font feature vector is below β, the font is judged attractive; if any distance to a stored unattractive-font feature vector is below β, it is judged unattractive.
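The comparison rule can be sketched as a nearest-stored-vector test. A minimal illustration; the assumption that the positive category is checked first (when both categories fall within the threshold) is mine, since the patent does not specify a tie-breaking order.

```python
import math

def euclidean(x1, x2):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x1, x2)))

def judge(feature, positive_store, negative_store, threshold):
    """Return 'positive' if `feature` is within `threshold` of any stored
    positive-category vector, 'negative' for the negative category, else None
    (undecided). Mirrors the patent's comparison operation; the
    positive-first ordering is an assumption."""
    if any(euclidean(feature, s) < threshold for s in positive_store):
        return "positive"
    if any(euclidean(feature, s) < threshold for s in negative_store):
        return "negative"
    return None
```

For key sentences, `positive_store` / `negative_store` would hold the contains / does-not-contain categories with threshold α; for fonts, the attractive / unattractive categories with threshold β.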
The font analysis results and key-sentence analysis results of the text sub-images are integrated as follows: for the n key-sentence analysis results, let r be the ratio of the number of sub-images judged to contain a key sentence to n. When r is at least a first threshold T1, the article structure of the text image to be detected is judged excellent; when r is at least a second threshold T2 but below T1, the structure is judged good; when r is below T2, the structure is judged general.
For the n font analysis results, when the ratio of the number of attractive-font results to n is at least a threshold T3, the font analysis result of the text image to be detected is that the font is attractive; otherwise it is judged unattractive.
The font analysis result and article-structure judgment result of the text image to be detected are thus obtained.
The scoring range of the text image to be detected is obtained from its font analysis result and article-structure judgment, with the following specific rule. This embodiment assumes a full score of 30: structure general and font general, [0, 10]; structure general and font attractive, (10, 15]; structure good and font general, (15, 20]; structure good and font attractive, or structure excellent and font general, (20, 25]; structure excellent and font attractive, (25, 30].
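The integration and scoring steps can be sketched together. The structure thresholds T1 = 2/3 and T2 = 1/3 are assumed example values (the patent leaves the thresholds unspecified); the scoring table follows the embodiment's 30-point rule.

```python
from fractions import Fraction

def article_structure(results, t1=Fraction(2, 3), t2=Fraction(1, 3)):
    """results: per-sub-image booleans (True = contains a key sentence).
    t1 and t2 are assumed example thresholds, not values from the patent."""
    r = Fraction(sum(results), len(results))
    if r >= t1:
        return "excellent"
    if r >= t2:
        return "good"
    return "general"

def score_range(structure, font_attractive):
    """Scoring ranges for a full score of 30, per the embodiment's rule."""
    table = {
        ("general", False): (0, 10),
        ("general", True): (10, 15),
        ("good", False): (15, 20),
        ("good", True): (20, 25),
        ("excellent", False): (20, 25),
        ("excellent", True): (25, 30),
    }
    return table[(structure, font_attractive)]
```

For example, a composition whose sub-images mostly contain key sentences and whose handwriting is judged attractive falls into the top range.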
The marking teacher then gives a specific score based on the obtained scoring range and other influencing factors.
Considering data security during transmission, the method also adopts blockchain and encryption technology.
The implementer selects one branch in each of the two trained twin networks. The encoder and fully connected layer in each selected branch, and each distance-calculation module, correspond to one block each. Every block randomly selects an available node for computation; the available nodes comprise local server nodes and cloud server nodes in a hybrid-cloud arrangement: the blocks corresponding to encoders select local server nodes, while the blocks corresponding to fully connected layers and distance-calculation modules select cloud server nodes. Each node contains a distance-calculation module, a trained encoder, and a fully connected layer. A private blockchain is generated according to the inference order of the selected branches, and data transmitted between nodes is encrypted with an encryption algorithm.
Nodes are randomly selected as follows: a random-number seed is generated, and a random-number sequence is produced from the seed by the middle-square method. The random numbers in the sequence are assigned to the available local server nodes in turn, the nodes are sorted by the size of their random numbers, and each local server node thereby obtains an index. Each block corresponding to an encoder in a selected branch is assigned a fixed label, and each block selects the node whose index equals its label, completing the selection of local server nodes for the encoder blocks. The number of nodes is greater than the number of blocks.
Similarly, following the same random selection method, each cloud server node obtains an index; each block corresponding to a fully connected layer or distance-calculation module in the selected branches is assigned a fixed label and selects the node with the matching index, completing the selection of cloud server nodes for those blocks.
The implementer may derive the random-number seed from values such as the IP address or the global time.
The middle-square method generates the random-number sequence as follows: let the obtained seed be X0; take X0 mod 10000 to obtain a four-digit number; square it to obtain an eight-digit number, padding with leading zeros if it has fewer than eight digits; take the middle four digits of the result as the next random number X1; take X1 mod 10000 to obtain a new four-digit number; and repeat to obtain the sequence. The number of random numbers in the sequence equals the number of available nodes.
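The middle-square steps above translate directly into code. A minimal sketch:

```python
def middle_square_sequence(seed, count):
    """Generate `count` random numbers by the middle-square method: reduce
    the current value mod 10000, square it, zero-pad the square to eight
    digits, and take the middle four digits as the next value."""
    x = seed % 10000
    out = []
    for _ in range(count):
        sq = str(x * x).zfill(8)   # eight digits, zero-padded on the left
        x = int(sq[2:6])           # middle four digits become the next number
        out.append(x)
    return out
```

For example, from seed 1234: 1234^2 = 01522756, middle four digits 5227; 5227^2 = 27321529, middle four digits 3215; and so on. In practice `count` would equal the number of available nodes.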
The generation of the private blockchain is illustrated with the left branch of the key-sentence-analysis twin network and the first distance-calculation module:
The inference order of this branch and module is: the first encoder extracts features from the input image to obtain a first feature map; after flattening, the first feature map is sent to the first fully connected layer, which outputs the key-sentence feature vector; the first distance-calculation module then computes the Euclidean distances between that feature vector and the two stored categories of feature vectors (contains / does not contain a key sentence).
The blocks corresponding to the encoder, fully connected layer, and distance-calculation module of the left branch are numbered in order: the first encoder corresponds to block [1], the first fully connected layer to block [2], and the first distance-calculation module to block [3]. Block [1] contains the parameter and image-information data of the first encoder, block [2] that of the first fully connected layer, and block [3] the information data of the first distance-calculation module. Suppose the three blocks have randomly selected nodes by the method above:
block [1] selects local server node 5, block [2] selects cloud server node 8, and block [3] selects cloud server node 7. Specifically, the encoder in node 5 computes the data of block [1], the fully connected layer in node 8 computes block [2], and the distance-calculation module in node 7 computes block [3]. The blocks are then linked in inference order to form the private blockchain: block [1] to block [2], and block [2] to block [3].
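The linking of blocks in inference order can be sketched as a hash-linked list. This is a simplified illustration under my own assumptions (SHA-256 linking, string payloads), not the patent's actual chain format.

```python
import hashlib

def build_private_chain(blocks):
    """blocks: list of (name, node, payload) tuples in inference order.
    Each chain entry records its payload, its assigned node, and the hash
    of the previous entry, forming a minimal private chain."""
    chain = []
    prev_hash = "0" * 64  # genesis predecessor
    for name, node, payload in blocks:
        entry = {
            "block": name,
            "node": node,
            "payload": payload,
            "prev_hash": prev_hash,
        }
        h = hashlib.sha256(repr(entry).encode()).hexdigest()
        entry["hash"] = h
        chain.append(entry)
        prev_hash = h
    return chain
```

With the example above, the three entries would be block [1] on local node 5, block [2] on cloud node 8, and block [3] on cloud node 7, each linked to its predecessor.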
Similarly, the left branch of the font-analysis twin network and the second distance-calculation module generate their own private blockchain by the same steps.
Thus, for each text sub-image to be detected, two private blockchains are generated according to the inference order, and computing at the corresponding nodes along these chains yields the sub-image's font analysis result and key-sentence analysis result.
The embodiment chooses to use the RC5 encryption algorithm, and data between all nodes is transmitted based on the encryption and decryption algorithm until the analysis of all english composition fonts and key sentences is completed. The mechanism of the RC5 encryption algorithm is:
creating a key group: the RC5 algorithm uses 2r +2 key-dependent 32-bit words for encryption, where r denotes the number of rounds of encryption. A key group is created by first copying the key bytes into an array L of 32-bit words (note here whether the processors are in little-endian order or big-endian order), and the last word can be padded with zeros if necessary. Then, the array S is initialized by using a linear congruence generator, and finally L and S are mixed.
Encryption processing: after the key set is created, encryption of the plaintext is started, and when encryption is performed, the plaintext packet is firstly divided into two 32-bit words: a and B (for example, in the case of assuming that the byte order of the processor is little-endian, w is 32, the first plaintext byte enters the lowest byte of a, the fourth plaintext byte enters the highest byte of a, the fifth plaintext byte enters the lowest byte of B, and so on), and the addition is performed by moving left in a loop. The output ciphertext is the content in registers a and B.
Decryption: the ciphertext block is split into two words A and B (stored the same way as during encryption), and the rounds are inverted using subtraction and circular right shifts.
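The mechanism above can be made concrete with a minimal RC5-32 sketch in pure Python. The round count r = 12 and the key are illustrative, and a real deployment should use a vetted cryptographic library rather than this sketch:

```python
# Hedged sketch of RC5-32 (word size w = 32): key schedule, encryption,
# decryption. Round count and key below are illustrative only.

M = 0xFFFFFFFF  # 32-bit mask

def rotl(x, s):
    s %= 32
    return ((x << s) | (x >> (32 - s))) & M

def rotr(x, s):
    s %= 32
    return ((x >> s) | (x << (32 - s))) & M

def expand_key(key, r):
    """Build the 2r + 2 key-dependent 32-bit words S from the key bytes."""
    t = 2 * r + 2
    c = max(1, (len(key) + 3) // 4)
    # Copy key bytes into an array L of 32-bit words (little-endian here),
    # zero-padding the last word if necessary.
    L = [int.from_bytes(key[i:i + 4].ljust(4, b"\x00"), "little")
         for i in range(0, 4 * c, 4)]
    # Initialize S with the linear sequence S[i] = P + i*Q (mod 2^32).
    P, Q = 0xB7E15163, 0x9E3779B9
    S = [(P + i * Q) & M for i in range(t)]
    # Mix L and S together.
    A = B = i = j = 0
    for _ in range(3 * max(t, c)):
        A = S[i] = rotl((S[i] + A + B) & M, 3)
        B = L[j] = rotl((L[j] + A + B) & M, (A + B) & M)
        i, j = (i + 1) % t, (j + 1) % c
    return S

def encrypt(block, S, r):
    """Encrypt one two-word block using additions and data-dependent
    circular left shifts."""
    A = (block[0] + S[0]) & M
    B = (block[1] + S[1]) & M
    for i in range(1, r + 1):
        A = (rotl(A ^ B, B) + S[2 * i]) & M
        B = (rotl(B ^ A, A) + S[2 * i + 1]) & M
    return A, B

def decrypt(block, S, r):
    """Invert the rounds using subtractions and circular right shifts."""
    A, B = block
    for i in range(r, 0, -1):
        B = rotr((B - S[2 * i + 1]) & M, A) ^ A
        A = rotr((A - S[2 * i]) & M, B) ^ B
    return (A - S[0]) & M, (B - S[1]) & M

S = expand_key(b"example key bytes", r=12)
ct = encrypt((0x12345678, 0x9ABCDEF0), S, r=12)
pt = decrypt(ct, S, r=12)  # recovers (0x12345678, 0x9ABCDEF0)
```

Because each round's rotation amount depends on the data itself, RC5's strength rests largely on these data-dependent rotations; the round count r trades speed against security margin.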
The implementer may choose a different encryption method if desired.
The above description is intended to provide those skilled in the art with a better understanding of the present invention and is not intended to limit the present invention.

Claims (9)

1. A font and key-sentence analysis method based on image processing, characterized by comprising the following steps:
building twin networks, namely a font analysis twin network and a key-sentence analysis twin network, each comprising two weight-sharing branches and a distance calculation module, each branch comprising an encoder and a fully connected layer;
training the twin networks: constructing a training set from collected English text images; after cropping, labeling each training image as to whether its font is attractive and whether it contains a key sentence; cropping two images from the training set and feeding them simultaneously into the two twin networks; the processing of the font analysis twin network and the key-sentence analysis twin network yields the Euclidean distance between the two key-sentence feature vectors and the Euclidean distance between the two font feature vectors;
cropping the text image to be detected to obtain n text sub-images to be detected; selecting one branch from each of the trained key-sentence analysis twin network and font analysis twin network, and feeding the text sub-images to be detected into the selected branches in turn, wherein the two distance calculation modules and the encoders and fully connected layers of the two branches each correspond to a block; every block randomly selects an available node, the blocks generate private blockchains according to the inference order of the two selected branches and the distance calculation modules, and the computation is then performed at the corresponding nodes along the private blockchains;
comparing the Euclidean distances to obtain the font analysis result and key-sentence analysis result of each text sub-image to be detected; integrating the font analysis results and key-sentence analysis results of the text sub-images to obtain the font analysis result and article-structure judgment of the text image to be detected; and obtaining the scoring range of the text image to be detected from its font analysis result and article-structure judgment.
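The distance step in claim 1 can be illustrated numerically. The linear `embed` function below is a stand-in for the encoder plus fully connected layer, and the weights (shared by both branches) are illustrative, not the patent's network:

```python
# Numeric sketch of the twin-network distance step: both branches share the
# same (hypothetical) embedding, and the Euclidean distance between the two
# feature vectors measures similarity.

import math

def embed(pixels, weights):
    # Shared-weight stand-in for "encoder + fully connected layer":
    # a single linear projection of the flattened image.
    return [sum(p * w for p, w in zip(pixels, row)) for row in weights]

def euclidean(u, v):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))

weights = [[0.5, -0.25, 0.1], [0.0, 0.3, -0.2]]  # shared by both branches
img_a, img_b = [1.0, 0.0, 2.0], [1.0, 0.5, 2.0]
d = euclidean(embed(img_a, weights), embed(img_b, weights))
```

Because the branches share weights, identical inputs always map to identical feature vectors and hence to a distance of zero, which is what makes the Euclidean distance a meaningful similarity measure here.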
2. The method of claim 1, wherein in the training set the ratio of English text images containing key sentences to English text images not containing key sentences is 1:1, and the ratio of English text images with attractive fonts to English text images with unattractive fonts is 1:1.
3. The method of claim 1, wherein the feature vectors obtained during training are stored in four categories: contains a key sentence, contains no key sentence, attractive font, and unattractive font.
4. The method of claim 1, wherein the cropping is performed with a sliding window sized to contain at least two lines of English text.
5. The method of claim 1, wherein the node selected for the block corresponding to an encoder is a local server node, the nodes selected for the blocks corresponding to the fully connected layers and the distance calculation modules are cloud server nodes, and data transmitted between nodes is encrypted and decrypted with an encryption and decryption algorithm.
6. The method according to claim 1, characterized in that the inference order is specifically:
for the key-sentence analysis twin network: the node selected for the block corresponding to the encoder in the selected branch performs feature extraction on the input image to obtain a first feature map; after a flattening operation, the first feature map is sent to the node selected for the block corresponding to the fully connected layer, which computes the key-sentence feature vector; and the node selected for the block corresponding to the distance calculation module computes the Euclidean distances between the key-sentence feature vector and the two stored categories of feature vectors, contains-key-sentence and no-key-sentence;
for the font analysis twin network: the node selected for the block corresponding to the encoder in the selected branch performs feature extraction on the input image to obtain a second feature map; after a flattening operation, the second feature map is sent to the node selected for the block corresponding to the fully connected layer, which computes the font feature vector; and the node selected for the block corresponding to the distance calculation module computes the Euclidean distances between the font feature vector and the two stored categories of feature vectors, attractive font and unattractive font.
7. The method of claim 1, wherein the comparing operation is specifically: setting two thresholds α and β; when any of the Euclidean distances between the key-sentence feature vector and the stored contains-key-sentence feature vectors is below α, judging that a key sentence is contained, and when any of the Euclidean distances between the key-sentence feature vector and the stored no-key-sentence feature vectors is below α, judging that no key sentence is contained; when any of the Euclidean distances between the font feature vector and the stored attractive-font feature vectors is below β, judging the font attractive, and when any of the Euclidean distances between the font feature vector and the stored unattractive-font feature vectors is below β, judging the font unattractive.
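A sketch of this comparison with an illustrative threshold; the claim does not say which check wins when both classes have a distance below the threshold, so checking the positive class first is an assumption:

```python
# Sketch of the threshold comparison in claim 7; the tie-break order
# (positive class checked first) and the threshold value are assumptions.

def classify(dists_pos, dists_neg, threshold):
    """Return True if any distance to a stored positive-class vector is
    below the threshold, False if any distance to a stored negative-class
    vector is, and None if neither check fires (undecided)."""
    if any(d < threshold for d in dists_pos):
        return True
    if any(d < threshold for d in dists_neg):
        return False
    return None

alpha = 0.5  # illustrative key-sentence threshold
# Distances to stored "contains key sentence" / "no key sentence" vectors:
has_key_sentence = classify([0.3, 0.9], [0.8, 1.2], alpha)
```

The same function serves the font decision with the threshold β and the attractive/unattractive vector stores.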
8. The method of claim 1, wherein the integrating operation is specifically: for the n key-sentence analysis results, when the ratio of the number of results containing a key sentence to n is greater than or equal to a first threshold, the article-structure judgment of the text image to be detected is excellent; when the ratio is greater than or equal to a second threshold and less than the first threshold, the judgment is good; and when the ratio is less than the second threshold, the judgment is general; for the n font analysis results, when the ratio of the number of attractive results to n is greater than or equal to a third threshold, the font analysis result of the text image to be detected is attractive; otherwise it is judged unattractive. [The threshold values appear in the original only as formula images FDA0002582417790000021 through FDA0002582417790000025.]
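The integration rule can be sketched as below. The actual ratio thresholds appear in the original only as formula images, so 2/3 and 1/3 are stand-in values:

```python
# Sketch of the integration in claim 8; T_HIGH and T_LOW are hypothetical
# stand-ins for the threshold formula images lost from the text.

T_HIGH, T_LOW = 2 / 3, 1 / 3

def structure_grade(results):
    """results: per-sub-image booleans, True = contains a key sentence."""
    ratio = sum(results) / len(results)
    if ratio >= T_HIGH:
        return "excellent"
    if ratio >= T_LOW:
        return "good"
    return "general"

def font_grade(results):
    """results: per-sub-image booleans, True = font judged attractive."""
    ratio = sum(results) / len(results)
    return "attractive" if ratio >= T_HIGH else "unattractive"

grade = structure_grade([True, True, False, True])  # ratio 0.75
```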
9. The method of claim 1, wherein the specific scoring rule is: the full score is e; when the article structure is general and the font is general, the score range is [0, a]; when the structure is general and the font is attractive, or the structure is good and the font is general, the range is (a, b]; when the structure is good and the font is attractive, the range is (b, c]; when the structure is excellent and the font is general, the range is (c, d]; and when the structure is excellent and the font is attractive, the range is (d, e], where 0 < a < b < c < d < e.
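The scoring rule amounts to a small lookup table. The breakpoints a through e below are illustrative (the claim only requires 0 < a < b < c < d < e):

```python
# Sketch of the scoring rule in claim 9 with hypothetical breakpoints;
# each pair (structure grade, font attractive?) maps to a score interval.

a, b, c, d, e = 60, 70, 80, 90, 100  # illustrative breakpoints

def score_range(structure, font_attractive):
    """Map (structure grade, font attractive?) to a (low, high] interval."""
    table = {
        ("general", False): (0, a),
        ("general", True): (a, b),
        ("good", False): (a, b),
        ("good", True): (b, c),
        ("excellent", False): (c, d),
        ("excellent", True): (d, e),
    }
    return table[(structure, font_attractive)]

rng = score_range("excellent", True)
```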
CN202010671365.2A 2020-07-13 2020-07-13 Font and key sentence analysis method based on image processing Withdrawn CN111797610A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010671365.2A CN111797610A (en) 2020-07-13 2020-07-13 Font and key sentence analysis method based on image processing

Publications (1)

Publication Number Publication Date
CN111797610A true CN111797610A (en) 2020-10-20

Family

ID=72808525

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010671365.2A Withdrawn CN111797610A (en) 2020-07-13 2020-07-13 Font and key sentence analysis method based on image processing

Country Status (1)

Country Link
CN (1) CN111797610A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112507671A (en) * 2020-12-18 2021-03-16 北京百度网讯科技有限公司 Method, device and readable medium for adjusting text space
CN112507671B (en) * 2020-12-18 2024-01-12 北京百度网讯科技有限公司 Method, apparatus, and readable medium for adjusting text distance
CN113204974A (en) * 2021-05-14 2021-08-03 清华大学 Method, device and equipment for generating confrontation text and storage medium
CN113205084A (en) * 2021-07-05 2021-08-03 北京一起教育科技有限责任公司 English dictation correction method and device and electronic equipment
CN113205084B (en) * 2021-07-05 2021-10-08 北京一起教育科技有限责任公司 English dictation correction method and device and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication

Application publication date: 20201020