CN113312444B - Word stock construction method and device, electronic equipment and storage medium - Google Patents

Info

Publication number
CN113312444B
Authority
CN
China
Prior art keywords
font
component
chinese character
target chinese
complexity
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110694439.9A
Other languages
Chinese (zh)
Other versions
CN113312444A (en)
Inventor
郦悦华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Agricultural Bank of China
Original Assignee
Agricultural Bank of China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Agricultural Bank of China
Priority to CN202110694439.9A
Publication of CN113312444A
Application granted
Publication of CN113312444B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/31 Indexing; Data structures therefor; Storage structures
    • G06F16/316 Indexing structures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Abstract

The present application provides a word stock construction method and device, electronic equipment, and a storage medium. Acquired first input data is input into a first neural network model to obtain first output data, and the corresponding first components under a first font are obtained from a Chinese character library according to the first font attribute value set of the components of a target Chinese character, thereby obtaining second input data. The second input data is input into a second neural network model to obtain second output data; the target Chinese character under the first font is generated according to the components under the first font and the second output data, and third input data is obtained; and the third input data is input into a third neural network model to obtain a second position parameter value of the target Chinese character in a character construction space, from which the final target Chinese character is obtained. The application uses three neural network models to perform component selection, relative position adjustment of the components, and position arrangement of the whole Chinese character, so that the construction time of the word stock is shortened and efficiency is improved.

Description

Word stock construction method and device, electronic equipment and storage medium
Technical Field
The present application relates to the field of computer technologies, and in particular, to a method and apparatus for constructing a word stock, an electronic device, and a storage medium.
Background
At present, with the development of network technology, users' demand for personalized Chinese character fonts keeps growing, so the construction of Chinese character libraries has become particularly important.
In the prior art, to meet users' personalized demands for Chinese characters, a designer usually adjusts the stroke positions and stroke forms of each Chinese character manually in order to construct word stocks in different fonts.
However, constructing a Chinese character library in this way is time-consuming, inefficient, and costly in labor.
Disclosure of Invention
The application provides a word stock construction method, a device, electronic equipment and a storage medium, which are used for solving the problem of long time consumption for constructing a Chinese character word stock in the prior art.
In a first aspect, the present application provides a word stock construction method, the method comprising:
acquiring first input data, the first input data comprising: average values of the font attribute value sets of the components of a target Chinese character, wherein the average values correspond one-to-one to the font attribute value sets, and each font attribute value set comprises the attribute values of a font attribute under multiple fonts; wherein different attribute values of the font attribute are used to characterize different fonts;
inputting the first input data into a first neural network model to obtain first output data, wherein the first output data comprises: a first font attribute value set of the components of the target Chinese character, the ratio of the weight of each component to the area of the circumscribed rectangle of the component, and a shape parameter of the circumscribed rectangle of the component; wherein the first font attribute value set comprises a first attribute value of the font attribute;
obtaining, according to the first font attribute value set of the components of the target Chinese character, the corresponding first components under a first font from a Chinese character library, and obtaining second input data, wherein the second input data comprises: a second attribute value of the font attribute of each first component under the first font, the ratio of the weight of the first component to the area of the circumscribed rectangle of the first component, and a shape parameter of the circumscribed rectangle of the first component;
inputting the second input data into a second neural network model to obtain second output data, wherein the second output data comprises structural relationship parameters between the first components and shape parameters of the circumscribed rectangles of the first components;
generating the target Chinese character under the first font according to the components under the first font and the second output data, and obtaining third input data, wherein the third input data comprises a shape parameter of the circumscribed rectangle of the target Chinese character under the first font, a shape parameter of the target Chinese character, a first position parameter value of the target Chinese character in a character construction space, and the ratio of the weight of the target Chinese character to the area of the circumscribed rectangle of the target Chinese character;
inputting the third input data into a third neural network model to obtain a second position parameter value of the target Chinese character in the character construction space; and obtaining the target Chinese character under the first font in the character construction space based on the second position parameter value and the target Chinese character under the first font.
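The data flow of the steps above can be sketched as a chain of six stages. This is only an illustrative outline under stated assumptions: the function `build_character`, its parameter names, and the use of plain dictionaries for the input and output data are hypothetical and do not come from the application, and the three models are passed in as opaque callables rather than real neural networks.

```python
from typing import Any, Callable, Dict, Tuple

Data = Dict[str, Any]  # hypothetical container for input/output data


def build_character(
    first_input: Data,
    model_1: Callable[[Data], Data],
    lookup_components: Callable[[Data], Data],
    model_2: Callable[[Data], Data],
    assemble: Callable[[Data, Data], Tuple[Any, Data]],
    model_3: Callable[[Data], Data],
    place: Callable[[Any, Data], Any],
) -> Any:
    """Chain the three models in the order described in the first aspect."""
    # First model: selects the font attribute values of the components.
    first_output = model_1(first_input)
    # Library lookup: fetch the first components and build the second input.
    second_input = lookup_components(first_output)
    # Second model: structural relationship and rectangle shapes of the components.
    second_output = model_2(second_input)
    # Compose the character under the first font and derive the third input.
    glyph, third_input = assemble(second_input, second_output)
    # Third model: second position parameter value in the construction space.
    position = model_3(third_input)
    # Place the composed character in the construction space.
    return place(glyph, position)
```

Passing the stages as callables keeps the sketch independent of any particular neural network library; each stage can be replaced by a stub when testing the order of operations.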
In one possible implementation, the font attribute includes at least one of: complexity parameters in different directions, maximum complexity parameters in different directions, and center-of-gravity parameters in different directions.
In one possible implementation, the font attribute includes complexity parameters in different directions; the complexity parameter includes a complexity; the method further comprises:
for each direction, and for each component of the target Chinese character under each font, obtaining the complexity of each component of the target Chinese character in different directions under multiple fonts by performing the following processing:
acquiring a pixel array of the component; counting the number of pixels of the pixel array along the direction, the number of pixels in the pixel array corresponding to the circumscribed rectangle of the component, and the number of pixels in the pixel array corresponding to the component; and calculating the ratio of the number of pixels corresponding to the component to a product result, wherein the product result is the number of pixels of the pixel array along the direction multiplied by the number of pixels corresponding to the circumscribed rectangle of the component in the pixel array, and taking the ratio as the complexity of the component in the direction under the font.
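The complexity computation described above can be sketched as follows. The sketch assumes a binary pixel array stored as a list of rows in which 1 marks a component pixel, and it assumes that "the number of pixels of the pixel array along the direction" means the array width for the horizontal direction and the array height for the vertical direction; the function name `complexity` and the direction labels are hypothetical.

```python
from typing import List

Pixels = List[List[int]]  # assumed layout: rows of 0/1, 1 = component pixel


def complexity(pixels: Pixels, direction: str) -> float:
    """Component pixels / (pixels along the direction * bounding-rectangle pixels)."""
    coords = [(r, c) for r, row in enumerate(pixels)
              for c, v in enumerate(row) if v]
    if not coords:
        raise ValueError("pixel array contains no component pixels")
    rows = [r for r, _ in coords]
    cols = [c for _, c in coords]
    # Number of pixels inside the circumscribed (minimum bounding) rectangle.
    rect_pixels = (max(rows) - min(rows) + 1) * (max(cols) - min(cols) + 1)
    # Assumption: array width for "horizontal", array height otherwise.
    along = len(pixels[0]) if direction == "horizontal" else len(pixels)
    return len(coords) / (along * rect_pixels)
```

For a 3-by-4 array whose component occupies three pixels of a 2-by-2 bounding rectangle, this gives 3 / (4 × 4) in the horizontal direction and 3 / (3 × 4) in the vertical direction.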
In one possible implementation, the complexity parameter includes a complexity ratio; the method further comprises:
for the complexity of each component of the target Chinese character in each direction under each font, calculating the ratio of that complexity to the complexity of the component in the other directions under the font, and obtaining the complexity ratios of each component of the target Chinese character under multiple fonts.
In one possible implementation, for each direction, and for each component of the target Chinese character under each font, the maximum complexity of each component of the target Chinese character in different directions under multiple fonts is obtained by performing the following processing:
acquiring the coordinates of each pixel in a pixel array of the component; for each group of pixels in the direction, calculating the sum of the coordinates of the pixels in the group that correspond to the component, and obtaining the sum of coordinates corresponding to each group of pixels in the direction; and taking the maximum of the sums of coordinates corresponding to the groups of pixels in the direction as the maximum complexity of the component in the direction under the font.
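A minimal sketch of the maximum complexity computation follows. The text does not fix the grouping or the coordinate origin, so the sketch makes two explicit assumptions: a "group of pixels in the direction" is one row for the horizontal direction and one column for the vertical direction, and the coordinate summed is the 1-based index of the pixel within its group. The function name `max_complexity` is hypothetical.

```python
from typing import List

Pixels = List[List[int]]  # assumed layout: rows of 0/1, 1 = component pixel


def max_complexity(pixels: Pixels, direction: str) -> int:
    """Largest per-group sum of the coordinates of component pixels."""
    if direction == "horizontal":
        groups = pixels  # assumption: each row is one group
    else:
        groups = [list(col) for col in zip(*pixels)]  # each column is one group
    # Assumption: coordinates are 1-based positions within the group.
    sums = [sum(i for i, v in enumerate(group, start=1) if v) for group in groups]
    return max(sums)
```

With a different coordinate origin the absolute values change, but the selection of the group with the largest sum works the same way.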
In one possible implementation, the maximum complexity parameter includes a maximum complexity ratio; the method further comprises:
for the maximum complexity of each component of the target Chinese character in each direction under each font, calculating the ratio of that maximum complexity to the maximum complexity of the component in the other directions under the font, and obtaining the maximum complexity ratios of each component of the target Chinese character under multiple fonts.
In one possible implementation, the font attribute includes center-of-gravity parameters in different directions; the center-of-gravity parameter includes a center of gravity; the method further comprises:
for each direction, and for each component of the target Chinese character under each font, obtaining the center of gravity of each component of the target Chinese character in different directions under multiple fonts by performing the following processing:
acquiring the coordinates of each pixel in a pixel array of the component; counting the number of pixels in the pixel array corresponding to the circumscribed rectangle of the component; calculating the sum of the coordinates, in the direction, of the pixels corresponding to the component; and calculating the ratio of that sum to the number of pixels corresponding to the circumscribed rectangle of the component in the pixel array, and taking the ratio as the center of gravity of the component in the direction under the font.
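The center-of-gravity computation above can be sketched in the same style. The sketch assumes a binary pixel array (1 marks a component pixel), 1-based coordinates, and that the coordinate "in the direction" is the column index for the horizontal direction and the row index for the vertical direction; the function name `center_of_gravity` is hypothetical. Note that, as described in the text, the divisor is the pixel count of the circumscribed rectangle rather than the number of component pixels.

```python
from typing import List

Pixels = List[List[int]]  # assumed layout: rows of 0/1, 1 = component pixel


def center_of_gravity(pixels: Pixels, direction: str) -> float:
    """Sum of component-pixel coordinates in the direction / bounding-rectangle pixels."""
    # Assumption: coordinates are 1-based (row, column) pairs.
    coords = [(r, c) for r, row in enumerate(pixels, start=1)
              for c, v in enumerate(row, start=1) if v]
    if not coords:
        raise ValueError("pixel array contains no component pixels")
    rows = [r for r, _ in coords]
    cols = [c for _, c in coords]
    rect_pixels = (max(rows) - min(rows) + 1) * (max(cols) - min(cols) + 1)
    axis = cols if direction == "horizontal" else rows
    return sum(axis) / rect_pixels
```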
In one possible implementation, the center-of-gravity parameter includes a center-of-gravity ratio; the method further comprises:
for the center of gravity of each component of the target Chinese character in each direction under each font, calculating the ratio of that center of gravity to the center of gravity of the component in the other directions under the font, and obtaining the center-of-gravity ratios of each component of the target Chinese character under multiple fonts.
In one possible implementation, the shape parameter of the component further includes morphological parameters of the component in different regions under different scanning directions; the method further comprises:
acquiring a pixel array of the component; for each scanning direction, determining the groups of pixels along the scanning direction in the current scanning region of the pixel array; scanning each group of pixels along the scanning direction and determining the first pixel scanned in the group that corresponds to the component; calculating the sum of the coordinates of the first pixels of the groups to obtain a first summation result; and calculating the reciprocal of the mean square value of the first summation result as the morphological parameter of the component in the region under the scanning direction.
In one possible implementation, the first position parameter value includes the ratio of the full-deficiency degree of the left part of the target Chinese character to the sum of the full-deficiency degrees in the left and right directions, and the ratio of the full-deficiency degree of the upper part of the target Chinese character to the sum of the full-deficiency degrees in the upper and lower directions; the second position parameter value includes a parameter value of the center-of-gravity parameter of the target Chinese character in the character construction space.
In a second aspect, the present application provides a word stock construction apparatus, the apparatus comprising:
a first acquisition unit, configured to acquire first input data, the first input data comprising: average values of the font attribute value sets of the components of a target Chinese character, wherein the average values correspond one-to-one to the font attribute value sets, and each font attribute value set comprises the attribute values of a font attribute under multiple fonts; wherein different attribute values of the font attribute are used to characterize different fonts;
a first generation unit, configured to input the first input data into a first neural network model to obtain first output data, wherein the first output data comprises: a first font attribute value set of the components of the target Chinese character, the ratio of the weight of each component to the area of the circumscribed rectangle of the component, and a shape parameter of the circumscribed rectangle of the component; wherein the first font attribute value set comprises a first attribute value of the font attribute;
a second acquisition unit, configured to obtain, according to the first font attribute value set of the components of the target Chinese character, the corresponding first components under a first font from a Chinese character library, and to obtain second input data, wherein the second input data comprises: a second attribute value of the font attribute of each first component under the first font, the ratio of the weight of the first component to the area of the circumscribed rectangle of the first component, and a shape parameter of the circumscribed rectangle of the first component;
a second generation unit, configured to input the second input data into a second neural network model to obtain second output data, wherein the second output data comprises structural relationship parameters between the first components and shape parameters of the circumscribed rectangles of the first components;
a third acquisition unit, configured to generate the target Chinese character under the first font according to the components under the first font and the second output data, and to obtain third input data, wherein the third input data comprises a shape parameter of the circumscribed rectangle of the target Chinese character under the first font, a shape parameter of the target Chinese character, a first position parameter value of the target Chinese character in a character construction space, and the ratio of the weight of the target Chinese character to the area of the circumscribed rectangle of the target Chinese character;
a third generation unit, configured to input the third input data into a third neural network model to obtain a second position parameter value of the target Chinese character in the character construction space, and to obtain the target Chinese character under the first font in the character construction space based on the second position parameter value and the target Chinese character under the first font.
In one possible implementation, the font attribute includes at least one of: complexity parameters in different directions, maximum complexity parameters in different directions, and center-of-gravity parameters in different directions.
In one possible implementation, the font attribute includes complexity parameters in different directions; the complexity parameter includes a complexity; the apparatus further comprises:
a first calculation unit, configured to obtain, for each direction and for each component of the target Chinese character under each font, the complexity of each component of the target Chinese character in different directions under multiple fonts by performing the following processing:
acquiring a pixel array of the component; counting the number of pixels of the pixel array along the direction, the number of pixels in the pixel array corresponding to the circumscribed rectangle of the component, and the number of pixels in the pixel array corresponding to the component; and calculating the ratio of the number of pixels corresponding to the component to a product result, wherein the product result is the number of pixels of the pixel array along the direction multiplied by the number of pixels corresponding to the circumscribed rectangle of the component in the pixel array, and taking the ratio as the complexity of the component in the direction under the font.
In one possible implementation, the complexity parameter includes a complexity ratio; the apparatus further comprises:
a second calculation unit, configured to calculate, for the complexity of each component of the target Chinese character in each direction under each font, the ratio of that complexity to the complexity of the component in the other directions under the font, and to obtain the complexity ratios of each component of the target Chinese character under multiple fonts.
In one possible implementation, the font attribute includes maximum complexity parameters in different directions; the apparatus further comprises:
a third calculation unit, configured to obtain, for each direction and for each component of the target Chinese character under each font, the maximum complexity of each component of the target Chinese character in different directions under multiple fonts by performing the following processing:
acquiring the coordinates of each pixel in a pixel array of the component; for each group of pixels in the direction, calculating the sum of the coordinates of the pixels in the group that correspond to the component, and obtaining the sum of coordinates corresponding to each group of pixels in the direction;
and taking the maximum of the sums of coordinates corresponding to the groups of pixels in the direction as the maximum complexity of the component in the direction under the font.
In one possible implementation, the maximum complexity parameter includes a maximum complexity ratio; the apparatus further comprises:
a fourth calculation unit, configured to calculate, for the maximum complexity of each component of the target Chinese character in each direction under each font, the ratio of that maximum complexity to the maximum complexity of the component in the other directions under the font, and to obtain the maximum complexity ratios of each component of the target Chinese character under multiple fonts.
In one possible implementation, the font attribute includes center-of-gravity parameters in different directions; the center-of-gravity parameter includes a center of gravity; the apparatus further comprises:
a fifth calculation unit, configured to obtain, for each direction and for each component of the target Chinese character under each font, the center of gravity of each component of the target Chinese character in different directions under multiple fonts by performing the following processing:
acquiring the coordinates of each pixel in a pixel array of the component; counting the number of pixels in the pixel array corresponding to the circumscribed rectangle of the component; calculating the sum of the coordinates, in the direction, of the pixels corresponding to the component; and calculating the ratio of that sum to the number of pixels corresponding to the circumscribed rectangle of the component in the pixel array, and taking the ratio as the center of gravity of the component in the direction under the font.
In one possible implementation, the center-of-gravity parameter includes a center-of-gravity ratio; the apparatus further comprises:
a sixth calculation unit, configured to calculate, for the center of gravity of each component of the target Chinese character in each direction under each font, the ratio of that center of gravity to the center of gravity of the component in the other directions under the font, and to obtain the center-of-gravity ratios of each component of the target Chinese character under multiple fonts.
In one possible implementation, the shape parameter of the component further includes morphological parameters of the component in different regions under different scanning directions; the apparatus further comprises:
a seventh calculation unit, configured to: acquire a pixel array of the component; for each scanning direction, determine the groups of pixels along the scanning direction in the current scanning region of the pixel array; scan each group of pixels along the scanning direction and determine the first pixel scanned in the group that corresponds to the component; calculate the sum of the coordinates of the first pixels of the groups to obtain a first summation result; and calculate the reciprocal of the mean square value of the first summation result as the morphological parameter of the component in the region under the scanning direction.
In one possible implementation, the first position parameter value includes the ratio of the full-deficiency degree of the left part of the target Chinese character to the sum of the full-deficiency degrees in the left and right directions, and the ratio of the full-deficiency degree of the upper part of the target Chinese character to the sum of the full-deficiency degrees in the upper and lower directions; the second position parameter value includes a parameter value of the center-of-gravity parameter of the target Chinese character in the character construction space.
In a third aspect, the present application provides an electronic device comprising: a memory and a processor;
the memory is used for storing instructions executable by the processor;
wherein the processor is configured to perform the method according to any of the first aspects according to the executable instructions.
In a fourth aspect, the present application provides a computer-readable storage medium having stored therein computer-executable instructions for performing the method of any of the first aspects when executed by a processor.
In a fifth aspect, the application provides a computer program product comprising a computer program which, when executed by a processor, implements the method according to any of the first aspects.
According to the word stock construction method and device, the electronic equipment, and the storage medium provided by the present application, acquired first input data is input into a first neural network model to obtain first output data, and the corresponding first components under a first font are obtained from a Chinese character library according to the first font attribute value set of the components of a target Chinese character, thereby obtaining second input data. The second input data is input into a second neural network model to obtain second output data; the target Chinese character under the first font is generated according to the components under the first font and the second output data, and third input data is obtained; the third input data is input into a third neural network model to obtain a second position parameter value of the target Chinese character in a character construction space; and the target Chinese character under the first font in the character construction space is obtained based on the second position parameter value and the target Chinese character under the first font. By constructing three neural network models, the application performs the selection of Chinese character components, the adjustment of their relative positions, and the position arrangement of the whole character, thereby constructing the word stock, shortening its construction time, and improving efficiency.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the application and together with the description, serve to explain the principles of the application.
FIG. 1 is a schematic flow chart of a word stock construction method according to an embodiment of the present application;
FIG. 2 is a schematic diagram of a selection process of a Chinese character component according to an embodiment of the present application;
FIG. 3 is a schematic diagram of a first model training process according to an embodiment of the present application;
FIG. 4 is a flow chart of a complexity obtaining method according to an embodiment of the present application;
FIG. 5 is a schematic diagram of a component pixel array according to an embodiment of the present application;
FIG. 6 is a flow chart of a method for obtaining a maximum complexity parameter according to an embodiment of the present application;
FIG. 7 is a flow chart of a method for obtaining a center-of-gravity parameter according to an embodiment of the present application;
FIG. 8 is a flow chart of a method for obtaining morphological parameters according to an embodiment of the present application;
FIG. 9 is a schematic structural diagram of a word stock construction device according to an embodiment of the present application;
FIG. 10 is a schematic diagram of a word stock construction device according to an embodiment of the present application;
FIG. 11 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Specific embodiments of the present application have been shown by way of the above drawings and will be described in more detail below. The drawings and the written description are not intended to limit the scope of the inventive concepts in any way, but rather to illustrate the inventive concepts to those skilled in the art by reference to the specific embodiments.
Detailed Description
Reference will now be made in detail to exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, the same numbers in different drawings refer to the same or similar elements, unless otherwise indicated. The implementations described in the following exemplary examples do not represent all implementations consistent with the application. Rather, they are merely examples of apparatus and methods consistent with aspects of the application as detailed in the accompanying claims.
At present, with the continuous development of network technology, most people work online, and as online working becomes more widespread, users' demand for personalized fonts keeps growing. To meet this demand, a designer usually has to construct a Chinese character library manually. For example, the components of a Chinese character in the target font are manually selected from the font library, the shape and size of the selected components are manually adjusted in the character construction space, and the positions of the components are adjusted, so that the final Chinese character in the target font is obtained.
However, this way of constructing a Chinese character library is time-consuming and labor-intensive, and its design efficiency is low.
The application provides a method and a device for constructing a word stock, electronic equipment and a storage medium, and aims to solve the technical problems in the prior art.
The following describes the technical scheme of the present application and how the technical scheme of the present application solves the above technical problems in detail with specific embodiments. The following embodiments may be combined with each other, and the same or similar concepts or processes may not be described in detail in some embodiments. Embodiments of the present application will be described below with reference to the accompanying drawings.
Fig. 1 is a schematic flow chart of a word stock construction method according to an embodiment of the present application. As shown in fig. 1, the method comprises the steps of:
S101, acquiring first input data, wherein the first input data comprises: average values of the font attribute value sets of the components of a target Chinese character, wherein the average values correspond one-to-one to the font attribute value sets, and each font attribute value set comprises the attribute values of a font attribute under multiple fonts; wherein different attribute values of the font attribute are used to characterize different fonts.
Illustratively, when designing a target Chinese character, the average value of the font attribute value set of each component of the target Chinese character is first obtained. For example, when a Chinese character "" is required, it is first determined that the character can be composed of a left component and a right component, and that each component is available in several different fonts in the component library of Chinese characters. FIG. 2 is a schematic diagram of a selection process of a Chinese character component according to an embodiment of the present application. For the left component of the Chinese character "", the font library may contain three different fonts, and the component in one of these fonts is selected as the left component of the final target Chinese character "". Therefore, when the first input data of a component is acquired, the average value of the set of attribute values of the font attribute of the component under the different fonts can be taken as the first input data of the component.
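The averaging step described above can be illustrated with a short sketch. It assumes that the attribute values of one component are held in a dictionary mapping each font attribute to its values under the available fonts; the function name `first_input_for_component` and the attribute names in the usage note are hypothetical.

```python
from statistics import mean
from typing import Dict, List


def first_input_for_component(attribute_sets: Dict[str, List[float]]) -> Dict[str, float]:
    """Average each font attribute value set over the available fonts.

    attribute_sets maps a font attribute name to the values of that
    attribute for the same component under different fonts.
    """
    return {name: mean(values) for name, values in attribute_sets.items()}
```

For instance, a component available in three fonts with horizontal complexities 0.25, 0.5, and 0.75 contributes the single value 0.5 for that attribute to the first input data.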
S102, inputting first input data into a first neural network model to obtain first output data, wherein the first output data comprises: a first font property value set of a component of the target Chinese character, a ratio of the weight of the component to the area of the circumscribed rectangle of the component, and a shape parameter of the circumscribed rectangle of the component; wherein the first set of font property values comprises a first property value of the font property.
Illustratively, after the first input data of each component of the target Chinese character is obtained, it is used as the input of the first neural network model to obtain the first output data. The first neural network model can be used to select, from the Chinese character component library, the components of the target Chinese character under the target font.
Specifically, the first neural network model outputs first output data, which includes a first font attribute value set of a component of the target Chinese character (i.e., first attribute values of the font attributes), the ratio of the component weight to the area of the component's circumscribed rectangle, and a shape parameter of the component's circumscribed rectangle. The first font is the font of the target Chinese character that is finally to be designed. The component's circumscribed rectangle is its minimum bounding rectangle, and the ratio of the component weight to the area of the circumscribed rectangle is the ratio of the number of pixels belonging to the component to the number of all pixels in the circumscribed rectangle. Optionally, the shape parameter of the circumscribed rectangle may be characterized by its width and height.
The first neural network model in this step is obtained by training a neural network model with the following as training samples: the font attribute value sets of the components of sample Chinese characters, the font attribute values of those components under a preset font, the ratio of the component weight to the area of the component's circumscribed rectangle, and the shape parameter of the component's circumscribed rectangle. Fig. 3 is a schematic structural diagram of a first model training process according to an embodiment of the present application. As shown in the figure, the average value of the font attribute value set of each component of the sample Chinese characters can be determined from the Chinese character library and used as the input parameter during model training. The attribute values under the font already designed in the font library, the ratio of the component weight to the area of the component's circumscribed rectangle, and the shape parameter of the component's circumscribed rectangle are used as the output parameters during training. The connection parameters are the parameters passed between the neurons of adjacent layers during forward or backward propagation of the model.
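As a minimal sketch of how the circumscribed rectangle and the weight-to-area ratio described above could be computed, assume the component is available as a binary pixel array (1 for a component pixel, 0 for background). The function names and the sample array are illustrative assumptions and are not part of the embodiment.

```python
def bounding_box(pixels):
    """Return (top, left, height, width) of the minimum circumscribed rectangle.

    pixels is a list of equally long rows containing 0 (background) or 1 (component).
    """
    rows = [r for r, row in enumerate(pixels) if any(row)]
    cols = [c for c in range(len(pixels[0])) if any(row[c] for row in pixels)]
    if not rows:
        raise ValueError("the pixel array contains no component pixel")
    top, left = rows[0], cols[0]
    return top, left, rows[-1] - top + 1, cols[-1] - left + 1


def weight_to_area_ratio(pixels):
    """Component pixel count divided by the pixel count of the circumscribed rectangle."""
    _, _, height, width = bounding_box(pixels)
    weight = sum(sum(row) for row in pixels)  # number of component pixels
    return weight / (height * width)


# Illustrative 5x5 glyph: a T-shaped component with a one-pixel empty border.
glyph = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0],
]
box = bounding_box(glyph)            # rectangle covering rows 1-3 and columns 1-3
ratio = weight_to_area_ratio(glyph)  # 5 component pixels over 9 rectangle pixels
```

The height and width returned by `bounding_box` correspond to the shape parameter of the circumscribed rectangle mentioned in the text.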
Optionally, in order to prevent the model from overfitting during training, the training samples may be divided into a training set and a verification set when training the neural network model. The training set is used to train the model; the verification set is then used as the input of the model to obtain the model's error. If the error is smaller than a preset value, training stops; otherwise, training continues.
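The stopping rule described above can be sketched as a simple loop. This is only an illustration under stated assumptions: `train_step` and `validation_error` are hypothetical callables, `max_epochs` is an added safety limit that the text does not mention, and the toy one-weight model stands in for the neural network.

```python
def train_with_validation(train_step, validation_error, preset_value, max_epochs=1000):
    """Train until the verification-set error drops below preset_value.

    train_step and validation_error are hypothetical callables supplied by the
    caller; max_epochs is a safety limit added for this sketch.
    """
    error = float("inf")
    for epoch in range(1, max_epochs + 1):
        train_step()                # one pass over the training set
        error = validation_error()  # error measured on the verification set
        if error < preset_value:    # stop condition described in the text
            return epoch, error
    return max_epochs, error


# Toy stand-in for the network: y = w * x, fitted by gradient descent.
state = {"w": 0.0}
training_set = [(1.0, 2.0), (2.0, 4.0)]
verification_set = [(3.0, 6.0)]


def train_step(learning_rate=0.1):
    # Gradient of the mean squared error with respect to w on the training set.
    grad = sum(2 * (state["w"] * x - y) * x for x, y in training_set) / len(training_set)
    state["w"] -= learning_rate * grad


def validation_error():
    # Mean squared error on the verification set.
    return sum((state["w"] * x - y) ** 2 for x, y in verification_set) / len(verification_set)


epochs, error = train_with_validation(train_step, validation_error, preset_value=0.01)
```

Keeping the verification samples out of `train_step` is what lets the measured error indicate generalization rather than memorization.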
S103, according to the first font attribute value set of the components of the target Chinese character, obtaining a corresponding first component under the first font from a Chinese character library, and obtaining second input data, wherein the second input data comprises: the second attribute value of the font attribute of the first component in the first font, the ratio of the weight of the first component to the area of the circumscribed rectangle of the first component, and the shape parameter of the circumscribed rectangle of the first component.
For example, after the first font attribute value set of the components of the target Chinese character is obtained, a first component of the target Chinese character in the first font may be selected from the Chinese character library according to each attribute value in the first font attribute value set.
Optionally, according to the first output data, the user may select the component in the Chinese character component library that has the smallest error relative to the first output data as the component of the target Chinese character in the first font. The first font is the target font that is finally to be designed.
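The smallest-error selection described above can be sketched as a nearest-neighbour lookup. The text does not specify the error measure, so the sum of squared differences used here is an assumption, as are the function name, the candidate names, and the sample values.

```python
def select_component(predicted, candidates):
    """Return the name of the candidate whose parameters are closest to predicted.

    predicted  -- list of parameter values from the first output data
    candidates -- dict mapping a candidate name to its parameter list
    The squared-error measure is an assumption made for this sketch.
    """
    def squared_error(features):
        return sum((p - f) ** 2 for p, f in zip(predicted, features))

    return min(candidates, key=lambda name: squared_error(candidates[name]))


# Hypothetical values: [attribute value, weight-to-area ratio, width, height].
predicted = [0.50, 0.40, 12.0, 20.0]
candidates = {
    "font_a": [0.52, 0.41, 12.0, 19.0],
    "font_b": [0.30, 0.60, 10.0, 22.0],
    "font_c": [0.90, 0.20, 15.0, 25.0],
}
chosen = select_component(predicted, candidates)
```

In practice the parameters would normally be brought to a common scale first, since otherwise the width and height dominate the error.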
After the first component of the target Chinese character in the first font is determined, the second attribute value of the font attribute of the first component in the first font, the ratio of the weight of the first component to the area of the circumscribed rectangle of the first component, and the shape parameter of the circumscribed rectangle of the first component can be obtained.
Optionally, the second attribute value of the font attribute of the first component in the first font, the ratio of the weight of the first component to the area of the circumscribed rectangle of the first component, and the shape parameter of the circumscribed rectangle of the first component may be equal to the corresponding values in the first output data, or may differ from them (i.e., when there is an error).
S104, inputting second input data into the second neural network model to obtain second output data, wherein the second output data comprises structural relation parameters between the first components and shape parameters of circumscribed rectangles of the first components.
Illustratively, the second input data obtained in step S103 is taken as the input of the second neural network model to obtain its output. The second neural network model is used to determine the relative positional relationship between the first components of the target Chinese character selected in step S103 under the first font, and to adaptively adjust the sizes of the components while determining that relationship. The second output data includes structural relationship parameters between the first components and shape parameters of the circumscribed rectangles of the first components. Optionally, the structural relationship parameter between the first components may be determined by the geometric centers of the circumscribed rectangles of the first components in the same coordinate system. Optionally, the shape parameter of the circumscribed rectangle of each first component may be determined by the center of gravity of the component's circumscribed rectangle and by the width and height of that rectangle.
The second neural network model in this step is obtained by training a neural network model with the following as training samples: the font attribute values of the components of sample Chinese characters under a preset font, the shape parameters of the circumscribed rectangles of those components under the preset font, and the structural relationship parameters between the components of the sample Chinese characters.
Optionally, when determining the number of hidden layer nodes of the neural network model during training, an empirical formula may be used first. For example, the initial number of hidden layer nodes may be determined by the following formula:
$m = \log_2 n$
where $m$ is the initial number of hidden layer nodes and $n$ is the number of input layer nodes.
Optionally, during model training, the hidden layer neurons may use the S-type (sigmoid) transfer function logsig to ensure the nonlinear mapping capability of the neural network.
Optionally, during model training, the output layer neurons may use the linear function purelin to extend the value range of the neural network output.
Optionally, a Bayesian learning algorithm may also be incorporated during model training, which can effectively improve the prediction performance of the trained neural network when the training samples are insufficient.
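The empirical node count, the S-type hidden-layer function and the linear output function mentioned above can be sketched as follows. This is an illustrative forward pass only: rounding the logarithm to an integer, the function names and the weights in the example are assumptions, and training (including any Bayesian learning step) is not shown.

```python
import math


def initial_hidden_nodes(n_inputs):
    """Empirical starting value m = log2(n); rounding to an integer is an assumption."""
    return max(1, round(math.log2(n_inputs)))


def logsig(x):
    """S-type (logistic sigmoid) transfer function for hidden layer neurons."""
    return 1.0 / (1.0 + math.exp(-x))


def purelin(x):
    """Linear transfer function for output layer neurons."""
    return x


def forward(x, w_hidden, b_hidden, w_out, b_out):
    """One forward pass through a single-hidden-layer network."""
    hidden = [
        logsig(sum(w * xi for w, xi in zip(weights, x)) + b)
        for weights, b in zip(w_hidden, b_hidden)
    ]
    return [
        purelin(sum(w * h for w, h in zip(weights, hidden)) + b)
        for weights, b in zip(w_out, b_out)
    ]


m = initial_hidden_nodes(50)  # log2(50) is about 5.64, rounded to 6
y = forward([0.0, 0.0], w_hidden=[[1.0, 1.0]], b_hidden=[0.0], w_out=[[2.0]], b_out=[1.0])
```

Because the output layer is linear, the network output is not confined to the (0, 1) range of the sigmoid, which is the value range extension the text refers to.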
S105, generating target Chinese characters in the first font according to the components in the first font and the second output data, and obtaining third input data, wherein the third input data comprises the shape parameters of the circumscribed rectangle of the target Chinese characters in the first font, the shape parameters of the target Chinese characters, the first position parameter value of the target Chinese characters in a character forming space and the ratio of the weight of the target Chinese characters to the area of the circumscribed rectangle of the target Chinese characters.
After the second output data is obtained, the size of the components in the first font is adjusted and the relative positions of the components in the first font are arranged according to the second output data and the components in the first font, namely the components in the first font are adjusted and then spliced together to obtain the target Chinese characters in the first font. And analyzing the target Chinese characters in the first font to obtain the shape parameters of the circumscribed rectangle of the target Chinese characters in the first font, the shape parameters of the target Chinese characters, the first position parameter value of the target Chinese characters in the character forming space and the ratio of the weight of the target Chinese characters to the area of the circumscribed rectangle of the target Chinese characters, and taking the parameters as third input data.
Optionally, the first position parameter value of the target Chinese character in the character forming space may be represented by two ratios: the ratio of the left full-deficiency degree of the target Chinese character to the sum of the full-deficiency degrees in the left and right directions, and the ratio of the upper full-deficiency degree of the target Chinese character to the sum of the full-deficiency degrees in the upper and lower directions. The left full-deficiency degree is computed by scanning from the left frame of the circumscribed rectangle toward the right frame: for each row of pixels, the number of pixels between the left frame and the first component pixel encountered is squared and its reciprocal is taken as the result for that row, and the results of all rows are summed to give the left full-deficiency degree of the target Chinese character.
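The left full-deficiency degree and the left-to-total ratio described above can be sketched as follows. Several points are assumptions of this sketch rather than statements of the embodiment: the distance is counted from 1 (so a pixel touching the left frame has distance 1 and no division by zero occurs), rows without any component pixel contribute nothing, and the right full-deficiency degree is obtained by the mirrored scan.

```python
def left_fullness(pixels):
    """Sum over rows of 1 / d**2, where d is the 1-based column of the first component pixel."""
    total = 0.0
    for row in pixels:
        if 1 in row:
            distance = row.index(1) + 1  # counted from the left frame, starting at 1
            total += 1.0 / distance ** 2
    return total


def right_fullness(pixels):
    """Mirrored scan from the right frame (assumed to be symmetric to the left scan)."""
    return left_fullness([row[::-1] for row in pixels])


def horizontal_position_ratio(pixels):
    """Left full-deficiency degree divided by the sum of the left and right degrees."""
    left = left_fullness(pixels)
    right = right_fullness(pixels)
    return left / (left + right)


# Illustrative glyph, already cropped to its circumscribed rectangle.
glyph = [
    [1, 1, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 1],
]
left = left_fullness(glyph)               # 1/1 + 1/4 + 1/16
ratio = horizontal_position_ratio(glyph)
```

The ratio for the upper and lower directions could be obtained in the same way by transposing the array before scanning.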
S106, inputting third input data into a third neural network model to obtain a second position parameter value of the target Chinese character in the character construction space; and obtaining the target Chinese character under the first font in the character forming space based on the second position parameter value of the target Chinese character in the character forming space and the target Chinese character under the first font.
Illustratively, in this step, the third neural network model is used to adjust the position, in the character forming space, of the Chinese character generated in step S105, so as to obtain the target Chinese character in the first font in the character forming space.
Specifically, the third input data obtained in step S105 is input into the third neural network model, which outputs a second position parameter of the target Chinese character in the character forming space, and the position of the target Chinese character in the character forming space is adjusted according to the second position parameter. The second position parameter can be characterized by the center of gravity of the Chinese character in the character forming space.
The third neural network model is obtained by training a neural network model with the following as training samples: the shape parameters of the circumscribed rectangle of a sample Chinese character, the shape parameters of the sample Chinese character, the first position parameter value of the sample Chinese character in the character forming space, the ratio of the weight of the sample Chinese character to the area of its circumscribed rectangle, and the second position parameter value of the sample Chinese character in the character forming space.
Alternatively, the second position parameter value of the target Chinese character in the character forming space may be represented by a parameter value of a gravity center parameter of the target Chinese character in the character forming space.
Alternatively, as the three neural network models described above, BP (Back Propagation) neural networks may be used.
Optionally, in order to avoid saturation of the output caused by input data with an excessively large absolute value, the input data of the neural network model may be normalized before being input to the model, for example as follows.
where $x_{\max}$ is the maximum value in the input data, $x_{\min}$ is the minimum value in the input data, $x_i$ is the $i$-th parameter in the input data before normalization, and $x_i'$ is the $i$-th parameter in the normalized input data.
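One common normalization that uses exactly these quantities is min-max scaling, $x_i' = (x_i - x_{\min}) / (x_{\max} - x_{\min})$, which maps the input data to the range [0, 1]. Whether the embodiment uses this exact form or a rescaled variant (for example, a mapping to [-1, 1]) is an assumption of the sketch below, and the function name is illustrative.

```python
def min_max_normalize(values):
    """Map values linearly to [0, 1] using their minimum and maximum.

    This min-max form is assumed for illustration; a constant input is mapped
    to zeros to avoid division by zero.
    """
    x_min, x_max = min(values), max(values)
    if x_max == x_min:
        return [0.0 for _ in values]
    return [(x - x_min) / (x_max - x_min) for x in values]


normalized = min_max_normalize([2.0, 4.0, 10.0])
```

The same `x_min` and `x_max` obtained from the training data would normally be reused when normalizing new inputs at prediction time.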
In this embodiment, through three neural network models, the selection and relative position adjustment of the components under the first font of the target Chinese character are realized, and the position of the obtained whole character (i.e. the target Chinese character) in the character formation space is determined, so that the construction time of the character library in the character formation space of the target Chinese character is shortened, and the efficiency is improved.
In the embodiment shown in fig. 1, the font properties include at least one of the following: complexity parameters in different directions, maximum complexity parameters in different directions, and center of gravity parameters in different directions.
When the complexity parameters in different directions are acquired, the complexity parameters include a complexity, which can be obtained through the following steps. Fig. 4 is a flow chart of a complexity obtaining method according to an embodiment of the present application. As shown in fig. 4, the method includes:
S201, acquiring a pixel array of the component.
Illustratively, the pixel array of the component includes pixel points in the circumscribed rectangular frame of the component and pixel points corresponding to the component itself. Fig. 5 is a schematic diagram of a component pixel array according to an embodiment of the present application. The component can be divided into pixel points which are arrayed, wherein the number of the pixel points corresponding to the component is the number of black small boxes in the graph, and the number of the pixel points in the circumscribed rectangular frame of the component is the sum of the number of the black small boxes and the number of the white small boxes in the largest rectangular frame.
S202, counting the number of pixels of the pixel array along a certain direction, the number of pixels corresponding to the circumscribed rectangle of the components in the pixel array, and the number of pixels corresponding to the components in the pixel array.
Illustratively, the number of pixels of the pixel array along a certain direction is counted. For example, in fig. 5 this may be the number of pixels along the X-axis direction, that is, the number of rows of pixels contained in the circumscribed rectangle of the component, or the number of pixels along the Y-axis direction, that is, the number of columns of pixels contained in the circumscribed rectangle of the component. The number of pixels corresponding to the circumscribed rectangle of the component in the pixel array and the number of pixels corresponding to the component in the pixel array are then counted in turn.
S203, calculating the ratio of the number of pixels corresponding to the component to the product result, wherein the product result is the result of multiplying the number of pixels of the pixel array along a certain direction by the number of pixels corresponding to the circumscribed rectangle of the component in the pixel array, and the ratio is used as the complexity of the component in a certain direction under the font.
Illustratively, after calculating the product of the number of pixels in a certain direction and the number of pixels corresponding to the circumscribed rectangle of the component in the pixel array, the ratio of the number of pixels corresponding to the component to the product result is taken as the complexity of the component in a certain direction under the font. Thereafter, for each direction, based on each component of the target Chinese character in each font, the complexity of each component of the target Chinese character in different directions in a plurality of fonts is obtained by performing steps S201-S203. The complexity of the different directions may include the complexity of the component X direction and the complexity of the component Y direction.
Optionally, the complexity parameter further includes a complexity ratio. Specifically, after the complexity of each component of the target Chinese character in a certain font in each direction is obtained, the ratio of the complexity to the complexity of the components in other directions in the font is calculated, so that the ratio of the complexity of each component of the target Chinese character in multiple fonts can be obtained. For example, the ratio may be a ratio of the complexity of the component in the X-direction to the complexity in the Y-direction.
In this embodiment, when acquiring the font attribute, the complexity parameter of the component may be used as the font attribute value of the component, so that the model prediction accuracy obtained by training is higher.
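Steps S201 to S203 and the complexity ratio can be sketched as follows, assuming the pixel array has already been cropped to the circumscribed rectangle of the component. Following the wording used for fig. 5, the X-direction count is taken as the number of rows and the Y-direction count as the number of columns; the function name and the sample array are illustrative assumptions.

```python
def complexity(pixels, direction):
    """Component pixels / (pixel count along the direction * rectangle pixel count)."""
    rows, cols = len(pixels), len(pixels[0])
    component_pixels = sum(sum(row) for row in pixels)  # pixels belonging to the component
    rect_pixels = rows * cols                           # pixels in the circumscribed rectangle
    if direction == "x":
        along = rows  # per the description of fig. 5: X-direction count = number of rows
    elif direction == "y":
        along = cols  # per the description of fig. 5: Y-direction count = number of columns
    else:
        raise ValueError("direction must be 'x' or 'y'")
    return component_pixels / (along * rect_pixels)


# Illustrative component, 2 rows by 4 columns, 6 component pixels.
glyph = [
    [1, 1, 1, 1],
    [0, 1, 1, 0],
]
cx = complexity(glyph, "x")  # 6 / (2 * 8)
cy = complexity(glyph, "y")  # 6 / (4 * 8)
complexity_ratio = cx / cy   # ratio of X-direction to Y-direction complexity
```

With this definition the complexity ratio reduces to the number of columns divided by the number of rows of the circumscribed rectangle.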
In some embodiments, when the font property is obtained, a maximum complexity parameter is included in the font property, and the maximum complexity parameter includes a maximum complexity. For each direction, based on each component of the target Chinese character under each font, the maximum complexity of each component of the target Chinese character in different directions under multiple fonts is obtained, and the method mainly comprises the following steps. Fig. 6 is a flow chart of a method for obtaining a maximum complexity parameter according to an embodiment of the present application, as shown in fig. 6.
S301, acquiring coordinates of each pixel in a pixel array of the component.
For example, for a component, coordinates may be added to the pixel array to obtain pixel coordinates for each pixel in the component bounding rectangle. For example, the coordinates of the geometric center in the small rectangular frame corresponding to each pixel point may be regarded as the coordinates of the pixel point.
S302, calculating the sum of coordinates of pixels corresponding to the component in each group of pixels in a certain direction, and obtaining the sum of coordinates corresponding to each group of pixels in the certain direction.
Illustratively, in this step the pixels are divided into groups according to the direction, and the sum of the coordinates of the component pixels in each group is calculated. For example, for the X direction of the component, the pixels in the circumscribed rectangle of the component may be grouped by rows, and for each row of pixels the sum of the X-direction coordinates of the pixels belonging to the component is calculated, giving the sum of coordinates corresponding to each group of pixels in that direction. Similarly, for the Y direction of the component, the pixels in the circumscribed rectangle of the component may be grouped by columns, and for each column of pixels the sum of the Y-direction coordinates of the pixels belonging to the component is calculated, giving the sum of coordinates corresponding to each group of pixels in that direction.
S303, taking the maximum value in the sum of coordinates corresponding to each group of pixels in a certain direction as the maximum complexity of the component in the certain direction under the font.
Illustratively, the sums of coordinates of the groups of pixels in a certain direction are compared, and the maximum value is selected as the maximum complexity of the component in that direction under the font.
Optionally, the maximum complexity may include an X-direction maximum complexity and a Y-direction maximum complexity.
Optionally, the maximum complexity parameter includes a maximum complexity ratio. Specifically, the maximum complexity ratio may be calculated as follows: for the maximum complexity of each component of the target Chinese character in each direction under each font, the ratio of that maximum complexity to the maximum complexity of the component in the other direction under the same font is calculated, giving the maximum complexity ratio of each component of the target Chinese character under the plurality of fonts. For example, the maximum complexity ratio may be the ratio of the X-direction maximum complexity to the Y-direction maximum complexity.
In this embodiment, when acquiring the font attribute, the maximum complexity parameter of the component may be used as the font attribute value of the component, so as to improve the prediction accuracy of the model.
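Steps S301 to S303 can be sketched as follows, assuming the pixel array is cropped to the circumscribed rectangle, the coordinate of a pixel is the geometric center of its cell, and the origin is the top-left corner of the rectangle. The origin, the function name and the sample array are assumptions made for illustration.

```python
def max_complexity(pixels, direction):
    """Largest per-group sum of component pixel coordinates.

    For 'x', pixels are grouped by rows and X coordinates are summed;
    for 'y', pixels are grouped by columns and Y coordinates are summed.
    """
    rows, cols = len(pixels), len(pixels[0])
    if direction == "x":
        sums = [sum(c + 0.5 for c in range(cols) if pixels[r][c]) for r in range(rows)]
    elif direction == "y":
        sums = [sum(r + 0.5 for r in range(rows) if pixels[r][c]) for c in range(cols)]
    else:
        raise ValueError("direction must be 'x' or 'y'")
    return max(sums)


glyph = [
    [1, 1, 1, 1],
    [0, 1, 1, 0],
]
max_x = max_complexity(glyph, "x")  # row 0: 0.5 + 1.5 + 2.5 + 3.5
max_y = max_complexity(glyph, "y")  # columns 1 and 2: 0.5 + 1.5
max_ratio = max_x / max_y           # maximum complexity ratio
```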
In some embodiments, when the font property is obtained, the font property includes gravity center parameters of different directions; the center of gravity parameter includes a center of gravity; for each direction, based on each component of the target Chinese character under each font, the following processing is performed to obtain the barycenter of each component of the target Chinese character under different directions under multiple fonts, which mainly comprises the following steps. Fig. 7 is a flowchart of a method for acquiring a gravity center parameter according to an embodiment of the present application, as shown in fig. 7.
S401, acquiring coordinates of each pixel point in a pixel array of the component.
Illustratively, this step is similar to the principle of step S301 in fig. 6, and will not be described here.
S402, counting the number of pixels corresponding to the circumscribed rectangle of the component in the pixel array.
Illustratively, this step is similar in principle to step S202 in fig. 4, and will not be described here.
S403, calculating the sum of the coordinates, in a certain direction, of the pixels corresponding to the component.
Illustratively, in this step, for a given direction, the sum of the coordinates in that direction of the pixels corresponding to the component is calculated; for example, for the X direction, the sum of the X-direction coordinates of the pixels corresponding to the component is calculated.
S404, calculating the ratio of the sum of coordinates of the pixels corresponding to the component to the number of pixels corresponding to the circumscribed rectangle of the component in the pixel array, and taking the ratio as the center of gravity of the component in that direction under the font.
Illustratively, in this step, the ratio of the sum of coordinates obtained in step S403 to the number of pixels in the circumscribed rectangle of the component is taken as the center of gravity in the corresponding direction; that is, when the sum of the X-direction coordinates is divided by the number of pixels in the circumscribed rectangle of the component, the result is the center of gravity in the X direction.
Optionally, the center of gravity parameter includes a center of gravity ratio. Specifically, for the center of gravity of each component of the target Chinese character in each direction under each font, the ratio of that center of gravity to the center of gravity of the component in the other direction under the same font is calculated, giving the center of gravity ratio of each component of the target Chinese character under the plurality of fonts.
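Steps S401 to S404 can be sketched as follows. As stated in the text, the coordinate sum is divided by the number of pixels in the circumscribed rectangle (a conventional centroid would divide by the number of component pixels instead). Cell-center coordinates with the origin at the top-left corner of the rectangle, the function name and the sample array are assumptions of this sketch.

```python
def center_of_gravity(pixels, direction):
    """Sum of component pixel coordinates divided by the rectangle pixel count."""
    rows, cols = len(pixels), len(pixels[0])
    rect_pixels = rows * cols  # pixels in the circumscribed rectangle (S402)
    if direction == "x":
        coord_sum = sum(c + 0.5 for r in range(rows) for c in range(cols) if pixels[r][c])
    elif direction == "y":
        coord_sum = sum(r + 0.5 for r in range(rows) for c in range(cols) if pixels[r][c])
    else:
        raise ValueError("direction must be 'x' or 'y'")
    return coord_sum / rect_pixels  # ratio described in S404


glyph = [
    [1, 1, 1, 1],
    [0, 1, 1, 0],
]
gx = center_of_gravity(glyph, "x")  # (8 + 4) / 8
gy = center_of_gravity(glyph, "y")  # (4 * 0.5 + 2 * 1.5) / 8
gravity_ratio = gx / gy             # center of gravity ratio, X to Y
```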
In some embodiments, when the font property is obtained, the font property includes morphological parameters of the component in different regions in different scanning directions. Fig. 8 is a flowchart of a method for obtaining morphological parameters according to an embodiment of the application. The method comprises the following steps:
S501, acquiring a pixel array of a component, and determining each group of pixels along the scanning direction in the current scanning area of the pixel array for each scanning direction.
Illustratively, the scanning directions of the component may include: scanning from the upper frame to the lower frame of the component's circumscribed rectangle, from the lower frame to the upper frame, from the left frame to the right frame, and from the right frame to the left frame. The scanning area of the circumscribed rectangle may be divided into an upper part, a middle part, and a lower part, or into a left part, a middle part, and a right part. Specifically, each scanning area can be scanned along each scanning direction of the component to obtain each group of pixels along the scanning direction in the current scanning area of the pixel array. Accordingly, the morphological (form) parameters of the component are divided into: the upper form, the lower form, the left form of the upper part, the right form of the upper part, the left form of the lower part, the right form of the lower part, the left form of the middle part, the right form of the middle part, the left form, the right form, the upper form of the left part, the lower form of the left part, the upper form of the right part, the lower form of the right part, the upper form of the middle part, and the lower form of the middle part.
S502, scanning each group of pixels along the scanning direction, and determining a first scanned pixel in each group of pixels, wherein the first pixel belongs to a pixel corresponding to the component.
Illustratively, as each group of pixels is scanned along the scanning direction, the first pixel encountered in that group that belongs to the component is taken as the first pixel of the group.
S503, calculating the sum of coordinates of the first pixels corresponding to each group of pixels to obtain a first summation result, and calculating the reciprocal of the mean square value of the first summation result as the morphological parameter of the component in the area in the scanning direction.
Illustratively, the coordinates of the first pixels are determined and summed to obtain the first summation result, and the reciprocal of the mean square value of the first summation result is taken as the morphological parameter of the component in that area in that scanning direction. When scanning from the upper frame of the component's circumscribed rectangle downwards or from the lower frame upwards, the Y-direction coordinates of the first pixels are summed; when scanning from the left frame to the right frame or from the right frame to the left frame, the X-direction coordinates of the first pixels are summed.
In this embodiment, by acquiring the morphological parameters of the components in different areas and in different scanning directions, the differences between components under different fonts can be captured better, so that the accuracy of the trained model is higher.
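The scanning part of this procedure (steps S501 and S502, plus the summation at the start of S503) can be sketched as follows. The sketch stops at the first summation result, because the exact normalization of the reciprocal-of-mean-square step is not spelled out here. Cell-center coordinates, skipping groups that contain no component pixel, the direction names and the sample array are all assumptions; a scanning area such as the upper part can be handled by passing only the corresponding rows or columns.

```python
def first_hits(pixels, direction):
    """Coordinate of the first component pixel met in each group along a scan direction.

    Rows are the groups for horizontal scans and columns for vertical scans.
    Groups without any component pixel are skipped (an assumption of this sketch).
    """
    rows, cols = len(pixels), len(pixels[0])
    if direction in ("left_to_right", "right_to_left"):
        groups = [list(row) for row in pixels]
    elif direction in ("top_to_bottom", "bottom_to_top"):
        groups = [[pixels[r][c] for r in range(rows)] for c in range(cols)]
    else:
        raise ValueError("unknown scan direction")
    reverse = direction in ("right_to_left", "bottom_to_top")
    hits = []
    for group in groups:
        order = range(len(group) - 1, -1, -1) if reverse else range(len(group))
        for i in order:
            if group[i]:
                hits.append(i + 0.5)  # cell-center coordinate of the first pixel
                break
    return hits


glyph = [
    [1, 1, 1, 1],
    [0, 1, 1, 0],
]
left_scan = first_hits(glyph, "left_to_right")
bottom_scan = first_hits(glyph, "bottom_to_top")
first_summation = sum(left_scan)  # first summation result for the left-to-right scan
```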
The above-described word stock construction method is exemplified below in conjunction with specific embodiments.
For the first neural network model, when the target Chinese character includes two components, the input parameters comprise the parameters of component 1 and of component 2, 50 in total, and for each component include: the X-direction complexity of the component, the Y-direction complexity of the component, the ratio of the X-direction complexity to the Y-direction complexity, the X-direction maximum complexity, the Y-direction maximum complexity, the ratio of the X-direction maximum complexity to the Y-direction maximum complexity, the center of gravity X within the component's circumscribed rectangle, the center of gravity Y within the component's circumscribed rectangle, the ratio of the center of gravity X to the center of gravity Y within the component's circumscribed rectangle, the upper form, the lower form, the left form of the upper part, the right form of the upper part, the left form of the lower part, the right form of the lower part, the left form of the middle part, the right form of the middle part, the left form, the right form, the upper form of the left part, the lower form of the left part, the upper form of the right part, the lower form of the right part, the upper form of the middle part, and the lower form of the middle part.
For the first neural network model, the output parameters are the parameter basis for selecting the character-forming components. The output parameters of component 1 and component 2 total 30 and specifically include:
the width of the component's circumscribed rectangle, the height of the component's circumscribed rectangle, the center of gravity X of the component's circumscribed rectangle, the center of gravity Y of the component's circumscribed rectangle, the ratio of the component weight to the area of the circumscribed rectangle, the upper form, the lower form, the left form, and the right form of the component. Components of a left-right structure further include: the left form of the upper part, the right form of the upper part, the left form of the lower part, the right form of the lower part, the left form of the middle part, and the right form of the middle part. Components of an upper-lower structure further include: the upper form of the left part, the lower form of the left part, the upper form of the right part, the lower form of the right part, the upper form of the middle part, and the lower form of the middle part.
For the second neural network model, when the target Chinese character includes two components, the input parameters include: the width of the component's circumscribed rectangle, the height of the component's circumscribed rectangle, the X-direction complexity of the component, the ratio of the X-direction complexity to the Y-direction complexity, the ratio of the X-direction maximum complexity to the Y-direction maximum complexity, the center of gravity X within the component's circumscribed rectangle, the center of gravity Y within the component's circumscribed rectangle, the ratio of the component weight to the area of the character forming space, the upper form, the lower form, the left form, and the right form of the component. The left component of a left-right structure further includes: the right form of the upper part, the right form of the lower part, and the right form of the middle part. The right component of a left-right structure further includes: the left form of the upper part, the left form of the lower part, and the left form of the middle part. The upper component of an upper-lower structure further includes: the lower form of the left part, the lower form of the right part, and the lower form of the middle part. The lower component of an upper-lower structure further includes: the upper form of the left part, the upper form of the right part, and the upper form of the middle part.
For the second neural network model, the output parameters are the parameter basis for selecting the character-forming components. The output parameters of component 1 and component 2 total 8 and specifically include: the center of gravity X of the component's circumscribed rectangle, the center of gravity Y of the component's circumscribed rectangle, the geometric center X of the component's circumscribed rectangle, and the geometric center Y of the component's circumscribed rectangle.
For the third neural network model, when the target Chinese character includes two components, the input parameters are the factors affecting the position of the Chinese character in the character forming space. The input parameters total 25 and include: the width of the circumscribed rectangle of the Chinese character, the height of the circumscribed rectangle of the Chinese character, the center of gravity X within the circumscribed rectangle of the Chinese character, the center of gravity Y within the circumscribed rectangle of the Chinese character, the X-direction complexity of the Chinese character, the Y-direction complexity of the Chinese character, the ratio of the weight of the Chinese character to the area of its circumscribed rectangle, the ratio of the left full-deficiency degree of the Chinese character to the sum of the full-deficiency degrees in the left and right directions, the ratio of the upper full-deficiency degree of the Chinese character to the sum of the full-deficiency degrees in the upper and lower directions, the upper form, the lower form, the left form of the upper part, the right form of the upper part, the left form of the lower part, the right form of the lower part, the left form of the middle part, the right form of the middle part, the left form, the right form, the upper form of the left part, the lower form of the left part, the upper form of the right part, the lower form of the right part, the upper form of the middle part, and the lower form of the middle part of the Chinese character.
For the third neural network model, the output parameters are the basis for determining the position of the Chinese character in the word formation space, and specifically comprise the gravity center X of the Chinese character in the word formation space and the gravity center Y of the Chinese character in the word formation space.
In the above embodiment, the font property of the components in each neural network model may be the same or different indicators, which is not particularly limited herein.
Fig. 9 is a schematic structural diagram of a word stock construction device according to an embodiment of the present application. As shown in Fig. 9, the device includes:
a first acquisition unit 61, configured to acquire first input data, where the first input data includes: the mean values of the font attribute value sets of the components of the target Chinese character, where the mean values correspond one-to-one to the font attribute value sets, and each font attribute value set includes the attribute values of a font attribute under multiple fonts; wherein different attribute values of the font attribute are used to characterize different fonts;
the first generating unit 62 is configured to input the first input data into the first neural network model, and obtain first output data, where the first output data includes: a first font property value set of a component of the target Chinese character, a ratio of the weight of the component to the area of the circumscribed rectangle of the component, and a shape parameter of the circumscribed rectangle of the component; wherein the first set of font property values comprises a first property value of a font property;
a second obtaining unit 63, configured to obtain, according to the first font attribute value set of the components of the target Chinese character, a corresponding first component in the first font from the Chinese character library, and obtain second input data, where the second input data includes: a second attribute value of the font attribute of the first component in the first font, a ratio of the weight of the first component to the area of the circumscribed rectangle of the first component, and a shape parameter of the circumscribed rectangle of the first component;
a second generating unit 64, configured to input the second input data into the second neural network model, and obtain second output data, where the second output data includes structural relation parameters between the first components and shape parameters of the circumscribed rectangles of the first components;
a third obtaining unit 65, configured to generate the target Chinese character in the first font according to the components in the first font and the second output data, and obtain third input data, where the third input data includes a shape parameter of the circumscribed rectangle of the target Chinese character in the first font, a shape parameter of the target Chinese character, a first position parameter value of the target Chinese character in the word formation space, and a ratio of the weight of the target Chinese character to the area of the circumscribed rectangle of the target Chinese character;
a third generating unit 66, configured to input the third input data into the third neural network model, and obtain a second position parameter value of the target Chinese character in the word formation space; and to obtain the target Chinese character in the first font in the word formation space based on the second position parameter value of the target Chinese character in the word formation space and the target Chinese character in the first font.
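The data flow through units 61 to 66 can be sketched as a chain of calls. The following Python is only an illustration of the ordering described above, not the device's implementation: every callable (`model1`, `select_components`, `model2`, `compose`, `model3`, `place`) and every data value in the usage part is a hypothetical stand-in introduced for this sketch.

```python
def build_character(first_input, model1, select_components, model2,
                    compose, model3, place):
    """Chain the units in the order given in the text (all callables are hypothetical)."""
    first_output = model1(first_input)                            # unit 62
    components, second_input = select_components(first_output)    # unit 63
    second_output = model2(second_input)                          # unit 64
    glyph, third_input = compose(components, second_output)       # unit 65
    position = model3(third_input)                                # unit 66: second position value
    return place(glyph, position)                                 # unit 66: place in word formation space


# Usage with stub callables that only record the order in which they run.
trace = []


def stub(name, result):
    def call(*args):
        trace.append(name)
        return result
    return call


result = build_character(
    {"mean_attributes": [0.2, 0.3]},                 # made-up first input data
    stub("model1", {"attributes": [0.25]}),
    stub("select_components", (["left", "right"], {"attributes": [0.24]})),
    stub("model2", {"layout": "left-right"}),
    stub("compose", ("glyph", {"bbox": (10, 12)})),
    stub("model3", (0.5, 0.5)),
    stub("place", "placed glyph"),
)
print(trace)
```

Passing the models in as callables keeps the sketch independent of any particular neural network library, which the text does not name.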
The device provided in this embodiment is configured to implement the technical scheme provided by the method, and the implementation principle and the technical effect are similar and are not repeated.
Fig. 10 is a schematic structural diagram of another word stock construction device according to an embodiment of the present application. As shown in fig. 10, in some embodiments, the font properties include at least one of: complexity parameters in different directions, maximum complexity parameters in different directions, and center of gravity parameters in different directions.
In some embodiments, the font properties include complexity parameters for different directions; the complexity parameter includes complexity; the apparatus further comprises:
a first calculation unit 67, configured to, for each direction and based on each component of the target Chinese character under each font, obtain the complexity of each component of the target Chinese character in different directions under multiple fonts by performing the following processing:
acquiring a pixel array of the component; counting the number of pixels of the pixel array along the direction, the number of pixels corresponding to the circumscribed rectangle of the component in the pixel array, and the number of pixels corresponding to the component in the pixel array; and calculating the ratio of the number of pixels corresponding to the component to a product result, where the product result is obtained by multiplying the number of pixels of the pixel array along the direction by the number of pixels corresponding to the circumscribed rectangle of the component in the pixel array, and the ratio is used as the complexity of the component in the direction under the font.
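The counting procedure above can be sketched in Python. This is a minimal illustration under stated assumptions, not the patented implementation: the component is assumed to be a binary pixel array (a list of rows, with 1 for component pixels), the "number of pixels along the direction" is assumed to be the array width for X and the array height for Y, and the circumscribed rectangle is assumed to be the tight bounding box of the component pixels. The names `bounding_box_area` and `complexity` are hypothetical.

```python
def bounding_box_area(pixels):
    """Pixel count of the tight bounding rectangle of the component (assumed definition)."""
    rows = [r for r, row in enumerate(pixels) if any(row)]
    cols = [c for c in range(len(pixels[0])) if any(row[c] for row in pixels)]
    if not rows:
        raise ValueError("pixel array contains no component pixels")
    return (rows[-1] - rows[0] + 1) * (cols[-1] - cols[0] + 1)


def complexity(pixels, direction):
    """Component pixels divided by (pixels along direction * bounding-rectangle pixels)."""
    if direction not in ("x", "y"):
        raise ValueError("direction must be 'x' or 'y'")
    n_along = len(pixels[0]) if direction == "x" else len(pixels)
    n_component = sum(sum(row) for row in pixels)  # values are assumed to be 0 or 1
    return n_component / (n_along * bounding_box_area(pixels))


# A made-up 3x4 component: 3 component pixels inside a 2x2 bounding rectangle.
glyph = [
    [0, 1, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]
print(complexity(glyph, "x"))  # 3 / (4 * 4)
print(complexity(glyph, "y"))  # 3 / (3 * 4)
```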
In some embodiments, the complexity parameter comprises a complexity ratio; the apparatus further comprises:
a second calculating unit 68, configured to calculate, for the complexity of each component of the target Chinese character in each direction under each font, the ratio of that complexity to the complexity of the component in other directions under the font, and obtain the complexity ratio of each component of the target Chinese character under multiple fonts.
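The ratio step can be sketched as follows. This is an illustrative helper, not the patent's code: the name `direction_ratios` and the dictionary layout are assumptions, and skipping pairs whose denominator is zero is a choice made here because the text does not say how that case is handled.

```python
def direction_ratios(values):
    """Map each ordered pair of distinct directions (a, b) to values[a] / values[b]."""
    ratios = {}
    for a, value_a in values.items():
        for b, value_b in values.items():
            if a != b and value_b != 0:  # assumption: undefined ratios are skipped
                ratios[(a, b)] = value_a / value_b
    return ratios


# Made-up per-direction complexities of one component under one font.
ratios = direction_ratios({"x": 0.1875, "y": 0.25})
print(ratios[("x", "y")])
```

The same helper shape would apply to any per-direction quantity for which a ratio between directions is taken.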
In some embodiments, the font properties include maximum complexity parameters for different directions; the apparatus further comprises:
a third calculation unit 69, configured to, for each direction and based on each component of the target Chinese character under each font, obtain the maximum complexity of each component of the target Chinese character in different directions under multiple fonts by performing the following processing:
acquiring coordinates of each pixel in a pixel array of the component; calculating, for each group of pixels in the direction, the sum of the coordinates of the pixels corresponding to the component, and obtaining the sum of coordinates corresponding to each group of pixels in the direction; and taking the maximum value of the sums of coordinates corresponding to the groups of pixels in the direction as the maximum complexity of the component in the direction under the font.
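One possible reading of this procedure is sketched below in Python. Several points are assumptions made for illustration, since the text does not fix them: a "group of pixels in the direction" is taken to be a row for X and a column for Y, the "coordinate" is taken to be the 0-based index along that direction, and the component is a binary pixel array. The name `max_complexity` is hypothetical.

```python
def max_complexity(pixels, direction):
    """Largest per-group sum of the coordinates of component pixels (assumed reading)."""
    if direction == "x":
        groups = pixels                                   # each row runs along X
    elif direction == "y":
        groups = [list(col) for col in zip(*pixels)]      # each column runs along Y
    else:
        raise ValueError("direction must be 'x' or 'y'")
    sums = [sum(i for i, v in enumerate(group) if v) for group in groups]
    return max(sums)


# Made-up 3x4 component.
glyph = [
    [0, 1, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]
print(max_complexity(glyph, "x"))  # row sums are 3, 1, 0
print(max_complexity(glyph, "y"))  # column sums are 0, 1, 0, 0
```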
In some embodiments, the apparatus further comprises:
a fourth calculation unit 70, configured to calculate, for the maximum complexity of each component of the target Chinese character in each direction under each font, the ratio of that maximum complexity to the maximum complexity of the component in other directions under the font, and obtain the maximum complexity ratio of each component of the target Chinese character under multiple fonts.
In some embodiments, the apparatus further comprises:
a fifth calculation unit 71, configured to, for each direction and based on each component of the target Chinese character under each font, obtain the center of gravity of each component of the target Chinese character in different directions under multiple fonts by performing the following processing:
acquiring coordinates of each pixel in a pixel array of the component; counting the number of pixels corresponding to the circumscribed rectangle of the component in the pixel array; calculating, along the direction, the sum of the coordinates of the pixels corresponding to the component; and calculating the ratio of that sum to the number of pixels corresponding to the circumscribed rectangle of the component in the pixel array, the ratio being used as the center of gravity of the component in the direction under the font.
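The center-of-gravity calculation can be sketched in Python as below. This is an illustration under assumptions rather than the patented implementation: the component is a binary pixel array, coordinates are 0-based indices (column index for X, row index for Y), and the circumscribed rectangle is the tight bounding box of the component pixels. The name `center_of_gravity` is hypothetical.

```python
def center_of_gravity(pixels, direction):
    """Sum of component-pixel coordinates divided by the bounding-rectangle pixel count."""
    if direction not in ("x", "y"):
        raise ValueError("direction must be 'x' or 'y'")
    rows = [r for r, row in enumerate(pixels) if any(row)]
    cols = [c for c in range(len(pixels[0])) if any(row[c] for row in pixels)]
    if not rows:
        raise ValueError("pixel array contains no component pixels")
    n_rect = (rows[-1] - rows[0] + 1) * (cols[-1] - cols[0] + 1)
    coord_sum = sum(
        (c if direction == "x" else r)
        for r, row in enumerate(pixels)
        for c, v in enumerate(row)
        if v
    )
    return coord_sum / n_rect


# Made-up 3x4 component with a 2x2 bounding rectangle.
glyph = [
    [0, 1, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]
print(center_of_gravity(glyph, "x"))  # (1 + 2 + 1) / 4
print(center_of_gravity(glyph, "y"))  # (0 + 0 + 1) / 4
```

Note that, following the text, the divisor is the pixel count of the circumscribed rectangle; a conventional centroid would instead divide by the number of component pixels.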
In some embodiments, the center of gravity parameter comprises a center of gravity ratio; the apparatus further comprises:
a sixth calculation unit 72, configured to calculate, for the center of gravity of each component of the target Chinese character in each direction under each font, the ratio of that center of gravity to the centers of gravity of the component in other directions under the font, and obtain the center of gravity ratio of each component of the target Chinese character under multiple fonts.
In some embodiments, the component shape parameters further include morphological parameters of the component at different regions for different scan directions; the apparatus further comprises:
a seventh calculation unit 73, configured to: acquire a pixel array of the component; determine, for each scan direction, the groups of pixels along the scan direction in the current scan region of the pixel array; scan each group of pixels along the scan direction, and determine the first scanned pixel in each group of pixels, the first pixel being a pixel that belongs to the component; calculate the sum of the coordinates of the first pixels corresponding to the groups of pixels to obtain a first summation result; and calculate the reciprocal of the mean square value of the first summation result as the morphological parameter of the component in the region under the scan direction.
In some embodiments, the first location parameter value includes a ratio of a left full-deficiency of the target Chinese character to a sum of left and right full-deficiency of the target Chinese character and a ratio of an upper full-deficiency of the target Chinese character to a sum of up and down full-deficiency of the target Chinese character; the second position parameter value comprises a parameter value of a gravity center parameter of the target Chinese character in the character forming space.
The device provided in this embodiment is configured to implement the technical scheme provided by the method, and the implementation principle and the technical effect are similar and are not repeated.
Fig. 11 is a schematic structural diagram of an electronic device according to an embodiment of the present application. As shown in Fig. 11, the electronic device includes:
a processor 291 and a memory 292; the electronic device may further include a communication interface 293 and a bus 294. The processor 291, the memory 292, and the communication interface 293 may communicate with each other via the bus 294. The communication interface 293 may be used for information transfer. The processor 291 may call logic instructions in the memory 292 to perform the methods of the above embodiments.
Further, the logic instructions in memory 292 described above may be implemented in the form of software functional units and stored in a computer-readable storage medium when sold or used as a stand-alone product.
The memory 292 is a computer readable storage medium, and may be used to store a software program, a computer executable program, and program instructions/modules corresponding to the methods in the embodiments of the present application. The processor 291 executes functional applications and data processing by running software programs, instructions and modules stored in the memory 292, i.e., implements the methods of the method embodiments described above.
The memory 292 may include a program storage area and a data storage area; the program storage area may store an operating system and an application program required for at least one function; the data storage area may store data created according to the use of the terminal device, etc. Further, the memory 292 may include high-speed random access memory, and may also include non-volatile memory.
The present application provides a computer readable storage medium having stored therein computer executable instructions which when executed by a processor are adapted to carry out a method as in any of the first aspects.
The application provides a computer program product comprising a computer program which, when executed by a processor, implements a method according to any of the first aspects.
Other embodiments of the application will be apparent to those skilled in the art from consideration of the specification and practice of the application disclosed herein. This application is intended to cover any variations, uses, or adaptations of the application following, in general, the principles of the application and including such departures from the present disclosure as come within known or customary practice within the art to which the application pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the application being indicated by the following claims.
It is to be understood that the application is not limited to the precise arrangements and instrumentalities shown in the drawings, which have been described above, and that various modifications and changes may be effected without departing from the scope thereof. The scope of the application is limited only by the appended claims.

Claims (13)

1. A word stock construction method, comprising:
acquiring first input data, the first input data comprising: the average value of font attribute value sets of a component of the target Chinese character corresponds to the font attribute value sets one by one, and each font attribute value set comprises attribute values of font attributes under various fonts; wherein different attribute values of the font attribute are used to characterize different fonts;
Inputting the first input data into a first neural network model to obtain first output data, wherein the first output data comprises: the first font attribute value set of the component of the target Chinese character, the ratio of the weight of the component to the area of the circumscribed rectangle of the component and the shape parameter of the circumscribed rectangle of the component; wherein the first set of font property values comprises a first property value of the font property;
according to the first font attribute value set of the components of the target Chinese character, obtaining a corresponding first component under a first font from a Chinese character library, and obtaining second input data, wherein the second input data comprises: a second attribute value of a font attribute of the first member in a first font, a ratio of a weight of the first member to an area of an circumscribed rectangle of the first member, and a shape parameter of the circumscribed rectangle of the first member;
inputting the second input data into a second neural network model to obtain second output data, wherein the second output data comprises structural relation parameters between the first components and shape parameters of circumscribed rectangles of the first components;
generating a target Chinese character in the first font according to the component in the first font and the second output data, and obtaining third input data, wherein the third input data comprises a shape parameter of an circumscribed rectangle of the target Chinese character in the first font, a shape parameter of the target Chinese character, a first position parameter value of the target Chinese character in a character forming space and a ratio of the weight of the target Chinese character to the area of the circumscribed rectangle of the target Chinese character;
Inputting the third input data into a third neural network model to obtain a second position parameter value of the target Chinese character in a character construction space; and obtaining the target Chinese character under the first font in the character forming space based on the second position parameter value of the target Chinese character in the character forming space and the target Chinese character under the first font.
2. The method of claim 1, wherein the font properties comprise at least one of: complexity parameters in different directions, maximum complexity parameters in different directions, and center of gravity parameters in different directions.
3. The method of claim 2, wherein the font properties include complexity parameters for different directions; the complexity parameter includes complexity; the method further comprises the steps of:
for each direction, based on each component of the target Chinese character under each font, obtaining the complexity of each component of the target Chinese character in different directions under multiple fonts by performing the following processing:
acquiring a pixel array of the component; counting the number of pixels of the pixel array along the direction, the number of pixels corresponding to the circumscribed rectangle of the component in the pixel array and the number of pixels corresponding to the component in the pixel array; and calculating the ratio of the number of pixels corresponding to the component to a product result, wherein the product result is the result of multiplying the number of pixels of the pixel array along the direction by the number of pixels corresponding to the circumscribed rectangle of the component in the pixel array, and the ratio is used as the complexity of the component in the direction under the font.
4. A method according to claim 3, wherein the complexity parameter comprises a complexity ratio; the method further comprises the steps of:
and calculating the ratio of the complexity of each component of the target Chinese character in each direction to the complexity of the components of the target Chinese character in other directions in each font, and obtaining the ratio of the complexity of each component of the target Chinese character in multiple fonts.
5. The method of claim 2, wherein the font properties include maximum complexity parameters for different directions; the method further comprises the steps of:
for each direction, based on each component of the target Chinese character under each font, obtaining the maximum complexity of each component of the target Chinese character in different directions under multiple fonts by performing the following processing:
acquiring coordinates of each pixel in a pixel array of the component; calculating the sum of coordinates of pixels corresponding to the member in each group of pixels in the direction, and obtaining the sum of coordinates corresponding to each group of pixels in the direction; and taking the maximum value of the sum of coordinates corresponding to each group of pixels in the direction as the maximum complexity of the component in the direction under the font.
6. The method of claim 5, wherein the maximum complexity parameter comprises a maximum complexity ratio; the method further comprises the steps of:
and calculating, for the maximum complexity of each component of the target Chinese character in each direction under each font, the ratio of that maximum complexity to the maximum complexity of the component in other directions under the font, and obtaining the maximum complexity ratio of each component of the target Chinese character under multiple fonts.
7. The method of claim 2, wherein the font properties include center of gravity parameters for different directions; the center of gravity parameter includes a center of gravity; the method further comprises the steps of:
for each direction, based on each component of the target Chinese character under each font, the barycenter of each component of the target Chinese character in different directions under multiple fonts is obtained by performing the following processing:
acquiring coordinates of each pixel in a pixel array of the component; counting the number of pixels corresponding to the circumscribed rectangle of the component in the pixel array; calculating, along the direction, the sum of the coordinates of the pixels corresponding to the component; and calculating the ratio of the sum of the coordinates of the pixels corresponding to the component to the number of pixels corresponding to the circumscribed rectangle of the component in the pixel array, and taking the ratio as the center of gravity of the component in the direction under the font.
8. The method of claim 7, wherein the center of gravity parameter comprises a center of gravity ratio; the method further comprises the steps of:
and calculating the ratio of the gravity center of each component of the target Chinese character in each direction to the gravity centers of the components of the target Chinese character in other directions in each font, and obtaining the gravity center ratio of each component of the target Chinese character in various fonts.
9. The method of claim 2, wherein the component shape parameters further comprise morphological parameters of the component at different regions for different scan directions; the method further comprises the steps of:
acquiring a pixel array of the component; determining, for each scan direction, a respective set of pixels along the scan direction in a current scan region of the pixel array; scanning each group of pixels along the scanning direction, and determining a first scanned pixel in each group of pixels, wherein the first pixel belongs to a pixel corresponding to a component; calculating the sum of coordinates of a first pixel corresponding to each group of pixels to obtain a first summation result; and calculating the reciprocal of the mean square value of the first summation result as a morphological parameter of the component in the area under the scanning direction.
10. The method of claim 1, wherein the first location parameter value comprises a ratio of a left full-deficiency of the target chinese character to a sum of left and right full-deficiency of the target chinese character and a ratio of an upper full-deficiency of the target chinese character to a sum of up and down full-deficiency of the target chinese character; the second position parameter value comprises a parameter value of a gravity center parameter of the target Chinese character in a character forming space.
11. A word stock construction apparatus, comprising:
a first acquisition unit configured to acquire first input data including: the average value of font attribute value sets of a component of the target Chinese character corresponds to the font attribute value sets one by one, and each font attribute value set comprises attribute values of font attributes under various fonts; wherein different attribute values of the font attribute are used to characterize different fonts;
the first generating unit is configured to input the first input data into a first neural network model, and obtain first output data, where the first output data includes: the first font attribute value set of the component of the target Chinese character, the ratio of the weight of the component to the area of the circumscribed rectangle of the component and the shape parameter of the circumscribed rectangle of the component; wherein the first set of font property values comprises a first property value of the font property;
the second obtaining unit is configured to obtain, according to the first font attribute value set of the component of the target Chinese character, a corresponding first component in a first font from a Chinese character library, and obtain second input data, where the second input data includes: a second attribute value of a font attribute of the first component in the first font, a ratio of a weight of the first component to an area of a circumscribed rectangle of the first component, and a shape parameter of the circumscribed rectangle of the first component;
the second generation unit is used for inputting the second input data into a second neural network model to obtain second output data, wherein the second output data comprises structural relation parameters between the first components and shape parameters of circumscribed rectangles of the first components;
a third obtaining unit, configured to generate, according to the component under the first font and the second output data, a target chinese character under the first font, and obtain third input data, where the third input data includes a shape parameter of an circumscribed rectangle of the target chinese character under the first font, a shape parameter of the target chinese character, a first position parameter value of the target chinese character in a word formation space, and a ratio of a weight of the target chinese character to an area of the circumscribed rectangle of the target chinese character;
The third generating unit is used for inputting the third input data into a third neural network model to obtain a second position parameter value of the target Chinese character in a character construction space; and obtaining the target Chinese character under the first font in the character forming space based on the second position parameter value of the target Chinese character in the character forming space and the target Chinese character under the first font.
12. An electronic device, comprising: a memory, a processor;
the memory being configured to store instructions executable by the processor;
wherein the processor is configured to perform the method of any of claims 1-10 according to the executable instructions.
13. A computer readable storage medium having stored therein computer executable instructions which when executed by a processor are adapted to carry out the method of any one of claims 1-10.
CN202110694439.9A 2021-06-22 2021-06-22 Word stock construction method and device, electronic equipment and storage medium Active CN113312444B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110694439.9A CN113312444B (en) 2021-06-22 2021-06-22 Word stock construction method and device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110694439.9A CN113312444B (en) 2021-06-22 2021-06-22 Word stock construction method and device, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN113312444A CN113312444A (en) 2021-08-27
CN113312444B CN113312444B (en) 2023-11-24

Family

ID=77379760

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110694439.9A Active CN113312444B (en) 2021-06-22 2021-06-22 Word stock construction method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN113312444B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110211203A (en) * 2019-06-10 2019-09-06 大连民族大学 The method of the Chinese character style of confrontation network is generated based on condition
CN110533737A (en) * 2019-08-19 2019-12-03 大连民族大学 The method generated based on structure guidance Chinese character style
CN112784531A (en) * 2019-11-05 2021-05-11 北京大学 Chinese font and word stock generation method based on deep learning and part splicing
CN112862025A (en) * 2021-03-08 2021-05-28 成都字嗅科技有限公司 Chinese character stroke filling method, system, terminal and medium based on computer

Also Published As

Publication number Publication date
CN113312444A (en) 2021-08-27

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant