CN113591433A

CN113591433A - Text typesetting method and device, storage medium and computer equipment

Info

Publication number: CN113591433A
Application number: CN202110194444.3A
Authority: CN
Inventors: 伍敏慧; 梅利健; 林榆耿
Original assignee: Tencent Technology Shenzhen Co Ltd
Current assignee: Tencent Technology Shenzhen Co Ltd
Priority date: 2021-02-20
Filing date: 2021-02-20
Publication date: 2021-11-02

Abstract

The application discloses a text typesetting method, a text typesetting device, a storage medium and computer equipment. The method and the device can: acquire a plurality of text boxes in a target text image and the text content of each text box; determine geometric information of each text box; sort all the text boxes in the target text image according to a target arrangement direction, based on the geometric information of each text box, to obtain sequence information of each text box; calculate association information between adjacent text boxes in the target text image; determine typesetting information of the text content of each text box based on the geometric information and the sequence information of each text box in the target text image and the association information between all adjacent text boxes; and typeset all the text content of the target text image according to the typesetting information of the text content of each text box. The method and the device can thereby typeset the text content of the target text image.

Description

Text typesetting method and device, storage medium and computer equipment

Technical Field

The present application relates to the field of data processing, and in particular, to a method and an apparatus for text composition, a storage medium, and a computer device.

Background

With the development of technology, related techniques in the field of computer vision can recognize an image containing text and produce a recognition result. In the prior art, the recognition result may include editable text, but the layout information of that text in the image cannot be effectively recognized, so the layout of the editable text differs greatly from the layout in the image.

During research and practice on the prior art, the inventors of the present application found that, because the layout of the editable text differs greatly from its layout in the image, the editable text has to be laid out manually before the user can use it.

Disclosure of Invention

The embodiment of the application provides a text typesetting method, a text typesetting device, a storage medium and computer equipment, which can typeset text contents in target text images.

The embodiment of the application provides a text typesetting method, which comprises the following steps:

acquiring a plurality of text boxes in a target text image and the text content of each text box;

determining geometric information of each text box;

based on the geometric information of each text box, sequencing all the text boxes in the target text image according to the target arrangement direction to obtain sequence information of each text box;

calculating the association information between adjacent text boxes in the target text image;

determining the typesetting information of the text content of each text box based on the geometric information and the sequence information of each text box in the target text image and the association information between all adjacent text boxes;

and typesetting all the text contents of the target text image according to the typesetting information of the text contents of each text box.

Accordingly, the present application provides a text composition apparatus, comprising:

the acquisition module is used for acquiring a plurality of text boxes in the target text image and the text content of each text box;

a geometry determining module for determining the geometry information of each text box;

the sorting module is used for sorting all the text boxes in the target text image according to the target arrangement direction based on the geometric information of each text box to obtain the sequence information of each text box;

the calculation module is used for calculating the association information between the adjacent text boxes in the target text image;

the information determining module is used for determining the typesetting information of the text content of each text box based on the geometric information and the sequence information of each text box in the target text image and the association information between all adjacent text boxes;

and the typesetting module is used for typesetting all the text contents of the target text image according to the typesetting information of the text contents of each text box.

In some embodiments, the geometric information includes border information and feature information, and the geometry determining module includes a measuring submodule and a determining submodule, wherein:

the measuring submodule is used for measuring the text box to obtain border information of the text box;

and the determining submodule is used for determining the feature information of the text box based on the border information.

In some embodiments, the feature information includes a feature line, a feature point, and an angle value of the feature line, and the determining submodule is specifically configured to:

determining feature points and feature lines of the text box in the area of the text box based on the border information;

and measuring the angle value of the feature line in the reference direction of the target text image.

In some embodiments, the target arrangement direction includes a target first direction and a target second direction, the sorting module includes a determination sub-module, a sorting sub-module, and a setting sub-module, wherein,

the determining submodule is used for determining a starting text box and a starting feature point of the starting text box from all text boxes of the target text image;

the sorting submodule is used for sorting the feature points of all the text boxes in the target first direction and the target second direction of the target text image by taking the starting feature point as a starting point, to obtain sorting information of the feature points of each text box;

and the setting submodule is used for setting the sorting information of the feature points as the sequence information of the text boxes to which the feature points belong, so as to obtain the sequence information of each text box.

In some embodiments, the sorting submodule is specifically configured to:

determining, in the target second direction and by taking the starting feature point as a starting point, at least one target sub-direction of the target first direction in the target text image;

and sorting all the feature points in the target sub-direction by taking the starting feature point as a starting point, to obtain sorting information of all the feature points in the target sub-direction.

In some embodiments, the adjacent text boxes include a first text box and a second text box, the association information includes distance information and proportion information, and the calculation module includes a calculation submodule and a determining submodule, wherein:

the calculation submodule is used for calculating distance information between the first text box and the second text box according to the feature line of the first text box and the feature point of the second text box;

and the determining submodule is used for determining proportion information between the first text box and the second text box based on the feature lines of the first text box and the second text box.
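One way to read the distance calculation above is as a point-to-line distance. The following is a minimal sketch under that assumption; the application does not state which distance measure is used, and the function name `point_to_line_distance` is hypothetical.

```python
import math


def point_to_line_distance(point, line_start, line_end):
    """Perpendicular distance from `point` to the line through the feature line.

    Assumption: "distance information" is taken as the perpendicular distance
    from the second box's feature point to the first box's feature line.
    """
    (px, py), (ax, ay), (bx, by) = point, line_start, line_end
    dx, dy = bx - ax, by - ay
    length = math.hypot(dx, dy)
    if length == 0:
        # Degenerate feature line: fall back to point-to-point distance.
        return math.hypot(px - ax, py - ay)
    # |cross product| / |line vector| gives the perpendicular distance.
    return abs(dx * (py - ay) - dy * (px - ax)) / length
```

For two horizontally aligned lines of text, this value is simply the vertical gap between the first line's feature line and the second line's feature point.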

In some embodiments, the determining submodule is specifically configured to:

determining a first projection length of the feature line of the first text box in a reference direction;

determining a second projection length of the feature line of the second text box in the reference direction;

and calculating length proportion information between the first projection length and the second projection length, and setting the length proportion information as the proportion information between the first text box and the second text box.
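The projection-based proportion above can be sketched as follows. This is only an illustration: the application does not say which of the two lengths is the numerator, so the sketch assumes the shorter length is divided by the longer one (giving a value between 0 and 1). The names `projection_length` and `proportion_info` are hypothetical.

```python
import math


def projection_length(line, reference_angle_deg=0.0):
    """Length of a feature line projected onto the reference direction."""
    (ax, ay), (bx, by) = line
    theta = math.radians(reference_angle_deg)
    ux, uy = math.cos(theta), math.sin(theta)  # unit vector of the reference direction
    return abs((bx - ax) * ux + (by - ay) * uy)


def proportion_info(first_line, second_line, reference_angle_deg=0.0):
    """Ratio of the shorter projection length to the longer one (assumed convention)."""
    first = projection_length(first_line, reference_angle_deg)
    second = projection_length(second_line, reference_angle_deg)
    longer = max(first, second)
    return min(first, second) / longer if longer else 0.0
```

With this convention, two lines of nearly equal width give a value close to 1, while a short heading next to a full-width line gives a small value.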

In some embodiments, the association information includes distance information and proportion information, the adjacent text boxes include a first text box and a second text box, and the information determining module is specifically configured to:

calculating an angle difference value between the angle values of the feature lines of the first text box and the second text box;

when the angle difference value is smaller than a preset angle threshold value, the distance information is smaller than a preset distance threshold value, and the proportion information is larger than a preset proportion threshold value, assigning the same typesetting information to the first text box and the second text box;

and when the distance information is greater than the preset distance threshold value, assigning different typesetting information to the first text box and the second text box.
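The two threshold rules above can be written as a small decision function. This is a sketch only: the threshold values are hypothetical placeholders, and the application does not specify the outcome for cases that neither rule covers, so the sketch returns `None` for them.

```python
from typing import Optional


def same_typesetting(angle_diff: float, distance: float, proportion: float,
                     angle_threshold: float = 5.0,       # hypothetical value
                     distance_threshold: float = 30.0,   # hypothetical value
                     proportion_threshold: float = 0.8   # hypothetical value
                     ) -> Optional[bool]:
    """Return True for the same typesetting information, False for different.

    Returns None when neither of the two stated rules applies.
    """
    if (angle_diff < angle_threshold
            and distance < distance_threshold
            and proportion > proportion_threshold):
        return True
    if distance > distance_threshold:
        return False
    return None
```

For example, two boxes with nearly parallel feature lines, a small gap, and similar widths would be assigned the same typesetting information, while two boxes separated by a large gap would not.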

In some embodiments, the sorting module is specifically configured to:

segmenting the target text image to obtain at least two content blocks, wherein the content blocks comprise text blocks, and each text block comprises at least one text box;

based on the geometric information of each text box, sorting the at least one text box in the text block according to the target arrangement direction to obtain the sequence information of each text box in the text block;

the calculation module is specifically configured to:

calculating association information between adjacent text boxes in at least one text block of the target text image.

In some embodiments, the content blocks include a non-text block, the non-text block corresponds to non-text content, and the information determining module is specifically configured to:

determining the typesetting information of the text content of each text box in the text block based on the geometric information and the sequence information of all the text boxes in the text block of the target text image and the association information between all the adjacent text boxes;

the typesetting module is specifically used for:

and typesetting all the text contents and the non-text contents in the target text image according to the typesetting information of the text contents of each text box in the text block and the position information of the non-text block.

Correspondingly, the embodiment of the present application further provides a storage medium, where the storage medium stores a computer program, and the computer program is suitable for being loaded by a processor to execute any one of the text typesetting methods provided by the embodiment of the present application.

Accordingly, an embodiment of the present application further provides a computer device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein when the processor executes the computer program, any one of the text typesetting methods provided by the embodiment of the present application is implemented.

The method and the device can acquire a plurality of text boxes in the target text image and the text content of each text box; determine geometric information of each text box; sort all the text boxes in the target text image according to the target arrangement direction, based on the geometric information of each text box, to obtain the sequence information of each text box; calculate the association information between adjacent text boxes in the target text image; determine the typesetting information of the text content of each text box based on the geometric information and the sequence information of each text box in the target text image and the association information between all adjacent text boxes; and typeset all the text content of the target text image according to the typesetting information of the text content of each text box.

After receiving the plurality of text boxes of the target text image and the text content of each text box, the method and the device can analyze and sort the information of the text boxes, determine the typesetting information of the text content of each text box according to the obtained geometric information, sequence information and association information, and thereby typeset the text content of the target text image.

Drawings

In order to illustrate the technical solutions in the embodiments of the present application more clearly, the drawings needed in the description of the embodiments are briefly introduced below. The drawings in the following description show only some embodiments of the present application, and those skilled in the art can obtain other drawings based on these drawings without creative effort.

FIG. 1 is a schematic view of a scene of a text typesetting system according to an embodiment of the present application;

FIG. 2 is a flowchart illustrating a text typesetting method according to an embodiment of the present application;

FIG. 3 is a diagram illustrating a process of a text typesetting method according to an embodiment of the present application;

FIG. 4 is a diagram illustrating another example process of a text typesetting method according to an embodiment of the present application;

FIG. 5 is a diagram illustrating another example process of a text typesetting method according to an embodiment of the present application;

FIG. 6 is a diagram illustrating another example process of a text typesetting method according to an embodiment of the present application;

FIG. 7 is a diagram illustrating another example process of a text typesetting method according to an embodiment of the present application;

FIG. 8 is a diagram illustrating another example process of a text typesetting method according to an embodiment of the present application;

FIG. 9 is a diagram illustrating another example process of a text typesetting method according to an embodiment of the present application;

FIG. 10 is another flowchart illustrating a text typesetting method according to an embodiment of the present application;

FIG. 11 is a schematic structural diagram of a text typesetting device according to an embodiment of the present application;

FIG. 12 is a schematic structural diagram of a computer device provided in an embodiment of the present application.

Detailed Description

The technical solutions in the embodiments of the present application will be described clearly and completely with reference to the drawings in the embodiments of the present application, and it is obvious that the embodiments described in the present application are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.

Artificial Intelligence (AI) is a theory, method, technique and application system that uses a digital computer or a machine controlled by a digital computer to simulate, extend and expand human intelligence, perceive the environment, acquire knowledge and use the knowledge to obtain the best results. In other words, artificial intelligence is a comprehensive technique of computer science that attempts to understand the essence of intelligence and produce a new kind of intelligent machine that can react in a manner similar to human intelligence. Artificial intelligence studies the design principles and implementation methods of various intelligent machines, so that the machines have the functions of perception, reasoning and decision making.

Artificial intelligence technology is a comprehensive discipline that covers a wide range of fields, including both hardware-level and software-level technologies. The artificial intelligence infrastructure generally includes technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technologies, operation/interaction systems, mechatronics, and the like. Artificial intelligence software technology mainly comprises computer vision technology, voice processing technology, natural language processing technology, machine learning/deep learning and the like.

Computer Vision (CV) is the science of how to make a machine "see": cameras and computers are used in place of human eyes to identify, track and measure targets, and the resulting images are further processed so that they become more suitable for human observation or for transmission to an instrument for detection. As a scientific discipline, computer vision studies related theories and techniques in an attempt to build artificial intelligence systems that can obtain information from images or multidimensional data. Computer vision technologies generally include image processing, image recognition, image semantic understanding, image retrieval, OCR, video processing, video semantic understanding, video content/behavior recognition, three-dimensional object reconstruction, 3D technologies, virtual reality, augmented reality, simultaneous localization and mapping, and other technologies, and also include common biometric technologies such as face recognition and fingerprint recognition.

In the application, the text box, the text content and the like can be obtained by identifying the target text image through a computer vision technology.

The text typesetting method can be integrated in a text typesetting device, and the text typesetting device can be integrated in a text typesetting system. The text typesetting system can comprise one or more computer devices, and the computer devices can comprise terminals, servers and the like. A server can be an independent physical server, a server cluster or distributed system formed by a plurality of physical servers, or a cloud server providing cloud computing services. The terminal may be, but is not limited to, a smart phone, a tablet computer, a laptop computer, a desktop computer, a smart speaker, a smart watch, and the like. The terminal and the server may be directly or indirectly connected through wired or wireless communication, and the application is not limited herein.

Referring to fig. 1, the text typesetting system may include a text typesetting device, which may be integrated in a computer device. The text typesetting device may acquire a plurality of text boxes in the target text image and the text content of each text box; determine geometric information of each text box; sort all the text boxes in the target text image according to the target arrangement direction, based on the geometric information of each text box, to obtain the sequence information of each text box; calculate the association information between adjacent text boxes in the target text image; determine the typesetting information of the text content of each text box based on the geometric information and the sequence information of each text box in the target text image and the association information between all adjacent text boxes; and typeset all the text content of the target text image according to the typesetting information of the text content of each text box.

It should be noted that the scene schematic diagram of the text composition system shown in fig. 1 is only an example, and the text composition system and the scene described in the embodiment of the present application are for more clearly illustrating the technical solution of the embodiment of the present application, and do not form a limitation on the technical solution provided in the embodiment of the present application.

Details are given below. In this embodiment, a text typesetting method is described in detail. The text typesetting method may be integrated on a server, as shown in fig. 2, which is a schematic flow diagram of the text typesetting method provided in this embodiment of the present application. The text typesetting method can comprise the following steps:

101. Acquire a plurality of text boxes in the target text image and the text content of each text box.

The target text image may include an image containing text, such as an image captured on a computer device, an image photographed using a camera or the like, or an image transmitted by another person. The target text image may be in various formats such as JPEG format, BMP format, GIF format, PNG format, and the like.

Image recognition may be performed on the target text image to obtain a recognition result, and the recognition result may include a plurality of text boxes of the target text image and the text content in each text box. Specifically, the target text image may be recognized based on an Optical Character Recognition (OCR) technique.

The text box may include an area containing characters in the target text image, and the text content may include the editable text content in the text box. The text in the present application may include text in multiple languages such as English and Chinese, numeric text such as Arabic numerals and Roman numerals, symbolic text such as punctuation marks and mathematical symbols, and the like. The text in the present application may include handwritten text, printed text, and the like.

Specifically, the text boxes and their text content may be obtained in various manners, such as obtaining them directly and locally from the computer device, or sending a request to another computer device (such as a server) and receiving a data packet returned by the other computer device that contains the text boxes, the text content, and the like.

For example, referring to fig. 3, 6 text boxes of the image 1 and the text content of each text box are obtained. Specifically, the image 1 may include a text box 1, a text box 2, a text box 3, a text box 4, a text box 5, and a text box 6, where the text content of the text box 1 is "1.2.2. XXXX of XXXX"; the text content of the text box 2 is "based on deep learning, a text segment region, a table region, a picture region, and the like are identified by using XXXX. XXXX comparison"; the text content of the text box 3 is "hard to acquire and depends on training data, for XXXXXXXXXXXXXX."; the text content of the text box 4 is "2. XXXXXXXX detailed description (XXXX)"; the text content of the text box 5 is "2.1.XXX full XXXX (XXXX) provided by XXX"; and the text content of the text box 6 is "we proceed with XXXXXX on the results of text recognition. For text line xxxxxxxx, paragraph and null are determined."
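As an illustration of what step 101 yields, a recognition result might be held in a small data structure such as the one below. The `TextBox` class, the `corners`/`text` field names, and the raw dictionary format are all assumptions made for this sketch; the application does not prescribe a data format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TextBox:
    box_id: int     # 1-based identifier, e.g. "text box 1"
    corners: tuple  # four (x, y) vertices: top-left, top-right, bottom-right, bottom-left
    text: str       # editable text content recognized in the box


def load_recognition_result(raw):
    """Convert a list of {"corners": [...], "text": "..."} items into TextBox objects.

    The input layout is hypothetical; a real OCR service may return a different one.
    """
    return [
        TextBox(index, tuple(tuple(corner) for corner in item["corners"]), item["text"])
        for index, item in enumerate(raw, start=1)
    ]
```

Each later step (geometric information, sorting, association information) can then work on the `corners` of these boxes, while the `text` is carried through unchanged until typesetting.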

102. Determine geometric information of each text box.

The geometric information may include graphic information of the text box. The text box may include a rectangular box, and the geometric information may include position information of points in the text box, position information of lines, length information, and the like.

Specifically, the geometric information of the text box may be determined in various manners; for example, the geometric information may be obtained directly from a memory, or it may be obtained by measuring the text box.

For example, for the text box 1 in fig. 3, the text box may be measured and calculated to obtain the geometric information 1 of the text box 1.

In some embodiments, the geometric information includes border information and feature information, and the step of "determining geometric information of each text box" may include:

measuring the text box to obtain border information of the text box;

determining feature information of the text box based on the border information.

The border information may include position information and length information of the edges of the text box, and the feature information may include information capable of embodying and representing some or all of the features of the text box.

The feature information of the text box may be determined from the border information in various manners. For example, specific border information may be set as the feature information of the text box; the specific border information may be obtained by screening all border information of the text box based on a preset condition, or may be obtained by randomly selecting from all border information, and the like.

For example, the text box 1 may be measured to obtain the border information 1 of the text box 1, where the border information 1 may include the position and length of each edge of the text box 1 in the image 1, and then the feature information 1 of the text box 1 is determined according to the border information 1.
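The measuring step in this example can be sketched as follows, assuming the text box is a quadrilateral given by its four corner coordinates in image pixels. The function name `border_info` and the returned dictionary layout are hypothetical.

```python
import math


def border_info(corners):
    """Position and length of each edge of a quadrilateral text box.

    `corners` is (top-left, top-right, bottom-right, bottom-left), each an (x, y) pair.
    """
    tl, tr, br, bl = corners
    edges = {
        "top": (tl, tr),
        "right": (tr, br),
        "bottom": (bl, br),
        "left": (tl, bl),
    }
    return {
        name: {"start": start, "end": end, "length": math.dist(start, end)}
        for name, (start, end) in edges.items()
    }
```

The feature information can then be derived from selected entries of this dictionary, for example from the left and right edges.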

In some embodiments, the feature information includes a feature line, a feature point, and an angle value of the feature line, and the step of "determining the feature information of the text box based on the border information" may include:

determining feature points and feature lines of the text box in the area of the text box based on the border information;

measuring the angle value of the feature line in the reference direction of the target text image.

The feature points include points in the text box, and a feature point may represent a partial feature of the text box; for example, the feature points may include the center point, the centroid, the incenter, and the like of the text box.

The feature line may include a line in the text box, and the feature line may represent a partial feature of the text box; for example, the feature line may include a line segment formed by connecting any two points in the text box.

Specifically, the text box may be a triangle, a quadrangle, another polygon, a circle, or the like. When the text box is a rectangular box, the border information may include a left edge line, a right edge line, an upper edge line, and a lower edge line, and the feature point and the feature line of the text box may be determined as follows: connect the midpoint of the left edge line and the midpoint of the right edge line of the rectangular box to obtain the feature line of the rectangular box, determine the midpoint of the feature line, and set the midpoint of the feature line as the feature point of the text box.

The reference direction may be any direction. The reference direction may be determined according to the direction of the text boxes in the target text image, may be a direction close to or the same as the direction of the characters in the text box, and may be displayed in the form of a line.

The angle value may be the included angle between the feature line and the reference direction. It may be measured directly or calculated by a trigonometric function or the like, and the manner can be chosen flexibly in practical applications without limitation.

For example, referring to fig. 4, the left midpoint of the left edge line and the right midpoint of the right edge line of the text box 1 may be determined by measurement, and the feature line 1 of the text box 1 may be obtained by connecting the left midpoint and the right midpoint. The midpoint of the feature line 1 may then be determined to obtain the feature point 1 of the text box 1. The reference direction may be the horizontal direction of the image 1, that is, the direction 1, and the included angle between the direction 1 and the feature line 1 is measured or calculated to obtain the angle value 1.
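The construction in this example (midpoints of the left and right edges, the connecting feature line, its midpoint, and the angle to a horizontal reference direction) can be sketched as below. The function names are hypothetical, and the angle is computed with `atan2` as one possible trigonometric approach; the result is folded into the range 0 to 90 degrees, which is an assumption of this sketch.

```python
import math


def midpoint(a, b):
    return ((a[0] + b[0]) / 2, (a[1] + b[1]) / 2)


def feature_info(corners, reference_angle_deg=0.0):
    """Feature line, feature point and angle value of a quadrilateral text box.

    `corners` is (top-left, top-right, bottom-right, bottom-left).
    """
    tl, tr, br, bl = corners
    left_mid = midpoint(tl, bl)    # midpoint of the left edge line
    right_mid = midpoint(tr, br)   # midpoint of the right edge line
    feature_point = midpoint(left_mid, right_mid)
    line_angle = math.degrees(
        math.atan2(right_mid[1] - left_mid[1], right_mid[0] - left_mid[0])
    )
    # Included angle between the feature line and the reference direction.
    angle = abs(line_angle - reference_angle_deg) % 180.0
    angle = min(angle, 180.0 - angle)
    return {
        "feature_line": (left_mid, right_mid),
        "feature_point": feature_point,
        "angle": angle,
    }
```

For an axis-aligned box the angle value is 0, and for a box whose text runs diagonally upward or downward at 45 degrees the angle value is 45.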

103. Sort all the text boxes in the target text image according to the target arrangement direction based on the geometric information of each text box, to obtain the sequence information of each text box.

The target arrangement direction may include the direction along which the text boxes are sorted, and there may be more than one target arrangement direction. For example, the target arrangement direction may be the horizontal direction at a vertical coordinate of 20 in a coordinate system, or the direction indicated by the points whose vertical coordinate is 20 and whose horizontal coordinate is greater than or equal to 0. When there are multiple target arrangement directions, each target arrangement direction may have a priority, and the order information of the text boxes is determined in order of priority.

The order information may include the order of a text box among all the text boxes, and the order information may be marked in text, numbers, symbols, and the like.

Specifically, the target arrangement direction may be scanned in sequence to detect geometric information of the text boxes: smaller order information is set for the text box whose geometric information is detected earlier, and larger order information is set for the text box whose geometric information is detected later. For example, if the geometric information includes the center point of the text box, the order information of the text box to which the first detected center point X belongs is determined as s1 in the target arrangement direction.

For example, referring to fig. 5, the target arrangement direction may be direction 2 (the vertical direction in the image 2), and the geometric information on which the sorting is based may be the left edge line of the text box. The left edge lines may then be detected in sequence along the direction 2, so as to obtain the order information of the text box 1 (order 1), the text box 2 (order 2), the text box 3 (order 3), the text box 4 (order 4), the text box 5 (order 5), and the text box 6 (order 6).
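The example above can be sketched as a sort on the vertical position of each left edge line. The sketch assumes image coordinates in which y increases downward, and it represents each left edge line by the mean y of its two endpoints; both choices, and the function name `order_by_left_edge`, are assumptions.

```python
def order_by_left_edge(boxes):
    """Assign order 1, 2, 3, ... to text boxes scanned along the vertical direction.

    `boxes` maps a box name to its corners (top-left, top-right, bottom-right,
    bottom-left). A box whose left edge line is met earlier gets a smaller order.
    """
    def left_edge_y(corners):
        top_left, _, _, bottom_left = corners
        return (top_left[1] + bottom_left[1]) / 2

    ranked = sorted(boxes, key=lambda name: left_edge_y(boxes[name]))
    return {name: order for order, name in enumerate(ranked, start=1)}
```

Called on the six boxes of a single-column page, this returns orders 1 to 6 from top to bottom regardless of the order in which the boxes were recognized.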

In some embodiments, the target arrangement direction includes a target first direction and a target second direction, and the step of "sorting all the text boxes in the target text image according to the target arrangement direction based on the geometric information of each text box to obtain the sequence information of each text box" may include:

determining a starting text box and a starting feature point of the starting text box from all the text boxes of the target text image;

sorting the feature points of all the text boxes in the target first direction and the target second direction of the target text image by taking the starting feature point as a starting point, to obtain sorting information of the feature points of each text box;

and setting the sorting information of the feature points as the sequence information of the text boxes to which the feature points belong, to obtain the sequence information of each text box.

The starting text box and the starting feature point may be the starting points of the sorting process, and the sequence information of the starting text box may be order 1.

Specifically, the geometric information may be the feature points of the text boxes. The feature points may then be detected in the target first direction and the target second direction by taking the starting feature point as a starting point, so as to obtain sorting information of the feature points of each text box, where the sorting information of a feature point is the sequence information of the text box to which the feature point belongs.

For example, the starting text box in the image 1 may be the text box 1, the starting feature point may be the feature point 1 of the text box 1, and the target arrangement direction may include a direction 1 and a direction 2 (corresponding to the target first direction and the target second direction). The feature points of all the text boxes may then be sorted in the direction 1 and the direction 2 to obtain the sorting information of each feature point, and the sorting information of each feature point is the order information of the text box to which that feature point belongs.

In some embodiments, the step of "taking the starting feature point as a starting point, and sorting the feature points of all the text boxes in the target first direction and the target second direction of the target text image to obtain the sorting information of the feature point of each text box" may include:

and sorting all the feature points in the target sub-direction, with the starting feature point as a starting point, to obtain the sorting information of all the feature points in the target sub-direction.

For example, referring to fig. 6, six sub-directions of direction 1 may be determined along direction 2, namely sub-direction 1 through sub-direction 6 from top to bottom in direction 2. Then, with feature point 1 as the starting point, sub-directions 1 through 6 are checked in turn for feature points, and the detected feature points are ranked in the order in which they are detected, so as to obtain the sorting information of each feature point.
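A minimal sketch of this sub-direction scan is given below. It assumes that direction 1 is horizontal and direction 2 is vertical, that each feature point is an `(x, y)` pair, and that points whose y coordinates differ by no more than a hypothetical `row_tolerance` lie on the same sub-direction. The function name and the tolerance value are illustrative and do not come from the application.

```python
def order_feature_points(points, row_tolerance=5.0):
    """points maps a text box id to its feature point (x, y); returns id -> rank."""
    # Visit the points from top to bottom along direction 2.
    by_height = sorted(points.items(), key=lambda item: item[1][1])

    rows = []  # each row stands for one sub-direction of direction 1
    for box_id, (x, y) in by_height:
        if rows and abs(y - rows[-1]["y"]) <= row_tolerance:
            rows[-1]["items"].append((x, box_id))
        else:
            rows.append({"y": y, "items": [(x, box_id)]})

    # Within each sub-direction, rank the points along direction 1.
    order, rank = {}, 1
    for row in rows:
        for _, box_id in sorted(row["items"]):
            order[box_id] = rank
            rank += 1
    return order


points = {"a": (10, 0), "b": (10, 20), "c": (200, 1), "d": (10, 40)}
print(order_feature_points(points))
```

Grouping by a tolerance rather than by exact equality is a design choice meant to absorb the small skew that photographed or scanned text usually has.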

104. Calculating the association information between adjacent text boxes in the target text image.

Adjacent text boxes are text boxes that have an adjacency relationship in the target text image: if no other text box lies between two text boxes, the two text boxes form a group of adjacent text boxes. Adjacency may also be defined in a specific direction; for example, the direction used to determine adjacency may be the target arrangement direction used when sorting the text boxes.

The association information may include information about the association relationship between adjacent text boxes in the target text image. The association relationship may be preset, or may be determined from information about the adjacent text boxes, for example from their geometric information. Specifically, at least one reference comparison value may be set for the geometric information, and the association relationship may be calculated from the result of comparing the geometric information with the reference comparison value.

For example, the association information 1 between the text box 1 and the text box 2, the association information 2 between the text box 2 and the text box 3, the association information 3 between the text box 3 and the text box 4, the association information 4 between the text box 4 and the text box 5, and the association information 5 between the text box 5 and the text box 6 in the image 1 may be calculated, respectively.

In some embodiments, the adjacent text boxes include a first text box and a second text box, the association information includes distance information and scale information, and the step of "calculating association information between adjacent text boxes in the target text image" may include:

calculating distance information between the first text box and the second text box according to the characteristic line of the first text box and the characteristic point of the second text box;

based on the feature lines of the first text box and the second text box, proportion information between the first text box and the second text box is determined.

The distance information may include information capable of representing the distance between adjacent text boxes. It may be calculated in multiple ways from the geometric information (including border information, feature information, and the like) of the text boxes; for example, it may be determined from the left border lines of the adjacent text boxes, or calculated from the feature points and feature lines of the adjacent text boxes.

Specifically, any group of adjacent text boxes may include a first text box and a second text box, where the order information of the first text box precedes that of the second text box. A feature line of the first text box and a feature point of the second text box may be determined, and the distance between the feature point and the feature line is calculated; the obtained distance is the distance information between the first text box and the second text box.
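The distance between a feature point and a feature line can be illustrated with the standard point-to-line formula. The sketch below assumes that the feature line is given by its two endpoints and is treated as an infinite straight line; the function name `point_to_line_distance` is hypothetical.

```python
import math


def point_to_line_distance(point, line_start, line_end):
    """Perpendicular distance from a feature point to the line through two endpoints."""
    (px, py), (x1, y1), (x2, y2) = point, line_start, line_end
    dx, dy = x2 - x1, y2 - y1
    length = math.hypot(dx, dy)
    if length == 0:
        # Degenerate feature line: fall back to the point-to-point distance.
        return math.hypot(px - x1, py - y1)
    # |cross product| / |line vector| gives the perpendicular distance.
    return abs(dx * (py - y1) - dy * (px - x1)) / length


# Feature line of the first text box and feature point of the second text box.
print(point_to_line_distance((3, 4), (0, 0), (10, 0)))
```

Using the perpendicular distance instead of a plain difference in y coordinates keeps the measure meaningful when the text lines are slightly rotated.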

The proportion information may include a proportion relationship between the first text box and the second text box, the proportion information may be obtained according to geometric information of the text boxes, and the proportion information may reflect a proportion relationship of some aspects (such as length, area, and the like) of the text boxes.

For example, the area proportional relationship between the first text box and the second text box may be determined according to the border information of the first text box and the second text box, and for example, the length proportional relationship between the first text box and the second text box may be determined according to the feature lines, the border information, and the like of the first text box and the second text box.

For example, distance information 1 between the text box 1 and the text box 2 is calculated from the feature line 1 of the text box 1 and the feature point 2 of the text box 2, and the proportion information 1 between the text box 1 and the text box 2 is determined based on the feature line 1 of the text box 1 and the feature line 2 of the text box 2.

In some embodiments, the step of "determining proportion information between the first text box and the second text box based on the feature lines of the first text box and the second text box" may include:

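determining a first projection length of the feature line of the first text box in a reference direction;

determining a second projection length of the feature line of the second text box in the reference direction;

and calculating length ratio information between the first projection length and the second projection length, and setting the length ratio information as the proportion information between the first text box and the second text box.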

For example, referring to fig. 7, the reference direction may be direction 1. The projection of feature line 1 of text box 1 in the reference direction is projection line 1, and the length of projection line 1 is projection length 1; the projection of feature line 2 of text box 2 in the reference direction is projection line 2, and the length of projection line 2 is projection length 2. Length ratio information between projection length 1 and projection length 2 is calculated, and the length ratio information is set as the proportion information between text box 1 and text box 2.
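The projection-based ratio can be sketched as follows. The function names are hypothetical, the reference direction is assumed to be a two-dimensional vector (direction 1 is taken as the x axis), and dividing the first projection length by the second is an assumption, since the text does not fix which length is the numerator.

```python
import math


def projection_length(line_start, line_end, direction):
    """Length of the projection of a feature line onto the reference direction."""
    dx, dy = line_end[0] - line_start[0], line_end[1] - line_start[1]
    ux, uy = direction
    return abs(dx * ux + dy * uy) / math.hypot(ux, uy)


def length_ratio(first_line, second_line, direction=(1.0, 0.0)):
    """Ratio of the first projection length to the second (assumed convention)."""
    first = projection_length(first_line[0], first_line[1], direction)
    second = projection_length(second_line[0], second_line[1], direction)
    if second == 0:
        raise ValueError("second feature line has zero projection length")
    return first / second


feature_line_1 = ((0, 0), (100, 0))
feature_line_2 = ((0, 20), (50, 20))
print(length_ratio(feature_line_1, feature_line_2))
```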

105. Determining the layout information of the text content of each text box based on the geometric information and the order information of each text box in the target text image and the association information between all adjacent text boxes.

The layout information may include paragraph information of the text content. The paragraph information may include the paragraph in which the text content is located, the spacing (before the paragraph, after the paragraph, and within the paragraph), the paragraph margin (an indentation value), and the like. The paragraph information of the text content can be determined once the paragraph to which the text content belongs has been determined. That paragraph can be determined from the association information between adjacent text boxes; for example, when the association information satisfies a preset condition, the two adjacent text boxes are determined to belong to the same paragraph.

Specifically, when determining the paragraph to which each text box belongs, a starting text box may be determined first, and paragraph identifier 1 is assigned to the text content of the starting text box. The association information between the starting text box and its adjacent text box is then checked against a preset condition: if the condition is satisfied, the adjacent text box is assigned paragraph identifier 1, the same as the starting text box; if not, it is assigned paragraph identifier 2, that is, a new paragraph identifier different from that of the starting text box. The association information of all adjacent text boxes is compared in turn according to the order information of the text boxes, so as to obtain the paragraph identifiers of all text boxes.

For example, paragraph identifier d1 is assigned to text box 1, and the association information and geometric information of text box 1 and text box 2 of image 1 are compared according to the order information of the text boxes. If the preset condition is not met, paragraph identifier d2 is assigned to text box 2. The association information and geometric information of text box 2 and text box 3 are then compared; if the preset condition is met, text box 3 is assigned paragraph identifier d2, the same as text box 2. This continues until the association information and geometric information of all text boxes have been compared, yielding paragraph identifier d3 for text box 4, paragraph identifier d4 for text box 5, and paragraph identifier d5 for text box 6.
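The identifier assignment in this example can be sketched as a single pass over the ordered text boxes. The function name `assign_paragraph_ids` and the boolean list it takes are hypothetical: entry `i` states whether the preset condition holds between the text box at position `i` and the next one in order.

```python
def assign_paragraph_ids(meets_condition):
    """Return one paragraph identifier per text box, in order."""
    ids = [1]  # the starting text box opens the first paragraph
    for condition_met in meets_condition:
        # Reuse the previous identifier when the condition holds, else open a new one.
        ids.append(ids[-1] if condition_met else ids[-1] + 1)
    return [f"d{number}" for number in ids]


# Outcomes for the pairs (1,2), (2,3), (3,4), (4,5), (5,6) in the example above.
print(assign_paragraph_ids([False, True, False, False, False]))
```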

In some embodiments, the association information includes distance information and scale information, the adjacent text boxes include a first text box and a second text box, and the step of determining layout information of the text content of each text box based on the geometric information, the order information, and the association information between all the adjacent text boxes in the target text image comprises:

calculating an angle difference value between angle values of the characteristic lines of the first text box and the second text box;

and when the distance information is greater than the preset distance threshold, assigning different layout information to the first text box and the second text box.

Specifically, the geometric information includes the angle values of the feature lines of the text boxes, so the difference between the angle values of the feature lines of adjacent text boxes can be calculated. For text content in the same paragraph, the direction of the text lines is relatively fixed even across different lines, so the angle difference usually stays within a reasonable range. The distance between different lines of the same paragraph is also relatively stable, and, apart from the last line of a paragraph, which may be a special case, the amount of text in different lines is relatively stable as well. Although operations such as taking a screenshot or a photograph may deform the target text image to some extent, these quantities still remain within reasonable intervals. Therefore, whether adjacent text boxes are in the same paragraph can be determined by comparing the angle difference, the distance information, the proportion information, and the like.

The preset angle threshold, the preset distance threshold and the preset proportion threshold can be flexibly set.

If the distance between adjacent text boxes is large, that is, the distance information is large, the text contents of the two text boxes belong to different paragraphs or a blank line exists between them, and paragraph splitting or blank-line processing needs to be performed according to the distance information.

For example, in image 2, if the distance information between text box 1 and text box 2 is greater than preset distance threshold 2, text box 2 is assigned a paragraph identifier a that is different from that of text box 1. If the angle difference between text box 2 and text box 3 is smaller than preset angle threshold 1, the distance information between text box 2 and text box 3 is smaller than preset distance threshold 2, and the proportion information between text box 2 and text box 3 is greater than preset proportion threshold 3, text box 3 may be assigned the same paragraph identifier a as text box 2.
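The threshold comparison in this example can be written as a single predicate. This is a sketch only: the function name and the three default threshold values are hypothetical, since the text states that the thresholds can be set flexibly and gives no numbers.

```python
def same_paragraph(angle_difference, distance, proportion,
                   angle_threshold=5.0, distance_threshold=30.0,
                   proportion_threshold=0.5):
    """True when two adjacent text boxes are judged to share a paragraph."""
    return (abs(angle_difference) < angle_threshold
            and distance < distance_threshold
            and proportion > proportion_threshold)


# Text boxes 1 and 2: too far apart, so a different paragraph identifier.
print(same_paragraph(angle_difference=0.5, distance=60.0, proportion=1.0))
# Text boxes 2 and 3: all three comparisons pass, so the same identifier.
print(same_paragraph(angle_difference=0.5, distance=20.0, proportion=1.6))
```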

106. Typesetting all the text contents of the target text image according to the layout information of the text content of each text box.

For example, the text content of image 1 is laid out according to the layout information of the text content of each text box in image 1, and the result is shown in fig. 9.

In some embodiments, the step of "ordering all text boxes in the target text image according to the target arrangement direction based on the geometric information of each text box, and obtaining the order information of each text box" may include:

sequencing at least one text box in the text block according to the target arrangement direction based on the geometric information of each text box to obtain the sequence information of each text box in the text block;

the step of "calculating the association information between adjacent text boxes in the target text image" may include:

correlation information between adjacent text boxes within at least one text block of the target text image is calculated.

In some embodiments, the content blocks include non-text blocks corresponding to the non-text content, and the step of determining layout information of the text content of each text box based on the geometric information, the order information, and the association information between all adjacent text boxes in the target text image may include:

the step of "typesetting all the text contents of the target text image according to the typesetting information of the text contents of each text box" may include:

For example, referring to image 2 in fig. 8, image 2 is segmented to obtain text block 1, image block 1, and text block 2. All the text boxes in text block 1 are sorted according to the target arrangement direction based on the geometric information of each text box in text block 1, to obtain the order information of each text box in text block 1. The association information between all adjacent text boxes in text block 1, including distance information and proportion information, is calculated to obtain the layout information of all the text content in text block 1. Image 2 is then laid out according to the position information of the segmented image block 1 and text block 2.

The method described in the above embodiments is further illustrated in detail by way of example.

The present application will describe a text typesetting method by taking a text typesetting system integrated in a computer device as an example, as shown in fig. 10, fig. 10 is a schematic flow diagram of the text typesetting method provided in the embodiment of the present application. The text typesetting method can comprise the following steps:

201. The computer device recognizes the target image to obtain a plurality of text boxes and the text content of each text box.

202. The computer device determines a feature line and a feature point for each text box.

203. The computer device measures an angle value of the characteristic line of each text box in a reference direction.

204. The computer device segments the target image to obtain at least one text block, where each text block includes at least one text box.

205. The computer device sorts the feature points in all the text blocks in the target image according to the target arrangement direction to obtain the sorting information of each feature point in each text block, and determines the sorting information of each feature point as the order information, within the text block, of the text box to which the feature point belongs.

206. The computer device calculates distance information and scale information between all adjacent text boxes within each text block.

207. The computer device determines layout information of text contents of each text box in each text block based on a feature line, a feature point, an angle value, and order information of each text box in each text block, and distance information and scale information between all adjacent text boxes.

208. The computer device typesets all the text contents in the target image according to the layout information of the text content of each text box in each text block.
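Steps 205 to 208 can be pictured end to end with the simplified sketch below. It assumes that recognition and segmentation (steps 201 to 204) have already produced the boxes of a single text block, that each feature line is given by two endpoints whose left end doubles as the feature point, that the vertical gap stands in for the distance information, and that the ratio of feature line lengths stands in for the proportion information. All names and threshold values are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class Box:
    text: str
    start: tuple  # left end of the feature line, also used as the feature point
    end: tuple    # right end of the feature line


def angle_deg(box):
    return math.degrees(math.atan2(box.end[1] - box.start[1], box.end[0] - box.start[0]))


def length(box):
    return math.hypot(box.end[0] - box.start[0], box.end[1] - box.start[1])


def typeset(boxes, max_angle=5.0, max_gap=30.0, min_ratio=0.5):
    # Step 205 (simplified): order boxes top to bottom, then left to right.
    ordered = sorted(boxes, key=lambda b: (b.start[1], b.start[0]))
    paragraphs = []
    for index, cur in enumerate(ordered):
        same = False
        if index > 0:
            prev = ordered[index - 1]
            # Step 206 (simplified): gap and length ratio between neighbours.
            gap = abs(cur.start[1] - prev.start[1])
            ratio = length(prev) / length(cur) if length(cur) else float("inf")
            same = (abs(angle_deg(cur) - angle_deg(prev)) < max_angle
                    and gap < max_gap and ratio > min_ratio)
        # Step 207: continue the current paragraph or open a new one.
        if same:
            paragraphs[-1].append(cur.text)
        else:
            paragraphs.append([cur.text])
    # Step 208: emit one string per paragraph.
    return [" ".join(lines) for lines in paragraphs]


boxes = [
    Box("second line.", (0, 80), (120, 80)),
    Box("Title", (0, 0), (100, 0)),
    Box("First line of body", (0, 60), (200, 60)),
]
print(typeset(boxes))
```

Dividing the previous line's length by the current line's length lets a short last line stay in its paragraph, while a long line that follows a short one tends to start a new paragraph.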

In order to better implement the text typesetting method provided by the embodiment of the application, the embodiment of the application also provides a device based on the text typesetting method. The terms used have the same meanings as in the text typesetting method above, and specific implementation details can refer to the description in the method embodiment.

Fig. 11 is a schematic structural diagram of a text layout apparatus according to an embodiment of the present application, as shown in fig. 11, where the text layout apparatus may include an obtaining module 301, a geometry determining module 302, a sorting module 303, a calculating module 304, an information determining module 305, and a layout module 306, where,

an obtaining module 301, configured to obtain a plurality of text boxes in a target text image and text content of each text box;

a geometry determination module 302 for determining the geometry information of each text box;

the sorting module 303 is configured to sort all the text boxes in the target text image according to the target arrangement direction based on the geometric information of each text box, so as to obtain sequence information of each text box;

a calculating module 304, configured to calculate association information between adjacent text boxes in the target text image;

an information determining module 305, configured to determine layout information of text content of each text box based on the geometric information, the order information, and the association information between all adjacent text boxes of each text box in the target text image;

the layout module 306 is configured to layout all the text contents of the target text image according to the layout information of the text contents of each text box.

the sequencing submodule is used for sequencing the feature points of all the text boxes in a first target direction and a second target direction of the target text image by taking the initial feature points as starting points to obtain sequencing information of the feature points of each text box;

In some embodiments, the ordering sub-module is specifically configured to:

and sequencing all the feature points in the target sub-direction by taking the initial feature point as a starting point to obtain sequencing information of all the feature points in the target sub-direction.

In some embodiments, the determination submodule is specifically configured to:

In some embodiments, the ranking module is specifically configured to:

the calculation module is specifically configured to:

calculating the association information between adjacent text boxes in at least one text block of the target text image;

the typesetting module is specifically used for:

In the present application, the obtaining module 301 obtains a plurality of text boxes in a target text image and text content of each text box; the geometry determination module 302 determines the geometry information for each text box; the sorting module 303 sorts all the text boxes in the target text image according to the target arrangement direction based on the geometric information of each text box to obtain the sequence information of each text box; the calculation module 304 calculates the association information between adjacent text boxes in the target text image; the information determination module 305 determines layout information of the text content of each text box based on the geometric information, the order information, and the association information between all adjacent text boxes of each text box in the target text image; the layout module 306 lays out all the text contents of the target text image according to the layout information of the text contents of each text box.

In addition, an embodiment of the present application further provides a computer device, where the computer device may be a terminal or a server, as shown in fig. 12, which shows a schematic structural diagram of the computer device according to the embodiment of the present application, and specifically:

the computer device may include components such as a processor 401 of one or more processing cores, memory 402 of one or more computer-readable storage media, a power supply 403, and an input unit 404. Those skilled in the art will appreciate that the computer device configuration illustrated in FIG. 12 does not constitute a limitation of computer devices, and may include more or fewer components than those illustrated, or some components may be combined, or a different arrangement of components. Wherein:

The processor 401 is the control center of the computer device. It connects the various parts of the entire computer device using various interfaces and lines, and performs the various functions of the computer device and processes data by running or executing software programs and/or modules stored in the memory 402 and calling data stored in the memory 402, thereby monitoring the computer device as a whole. Optionally, the processor 401 may include one or more processing cores; preferably, the processor 401 may integrate an application processor and a modem processor, where the application processor mainly handles the operating system, user interfaces, application programs, and the like, and the modem processor mainly handles wireless communications. It will be appreciated that the modem processor described above may also not be integrated into the processor 401.

The memory 402 may be used to store software programs and modules, and the processor 401 executes various functional applications and data processing by running the software programs and modules stored in the memory 402. The memory 402 may mainly include a program storage area and a data storage area, where the program storage area may store an operating system, an application program required by at least one function (such as a sound playing function, an image playing function, etc.), and the like; the data storage area may store data created according to use of the computer device, and the like. Further, the memory 402 may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid-state storage device. Accordingly, the memory 402 may also include a memory controller to provide the processor 401 with access to the memory 402.

The computer device further comprises a power supply 403 for supplying power to the various components, and preferably, the power supply 403 is logically connected to the processor 401 via a power management system, so that functions of managing charging, discharging, and power consumption are implemented via the power management system. The power supply 403 may also include any component of one or more dc or ac power sources, recharging systems, power failure detection circuitry, power converters or inverters, power status indicators, and the like.

The computer device may also include an input unit 404, the input unit 404 being operable to receive input numeric or character information and to generate keyboard, mouse, joystick, optical or trackball signal inputs related to user settings and function control.

Although not shown, the computer device may further include a display unit and the like, which are not described in detail herein. Specifically, in this embodiment, the processor 401 in the computer device loads the executable file corresponding to the process of one or more application programs into the memory 402 according to the following instructions, and the processor 401 runs the application programs stored in the memory 402, thereby implementing various functions as follows:

acquiring a plurality of text boxes in a target text image and the text content of each text box; determining geometric information of each text box; based on the geometric information of each text box, sequencing all the text boxes in the target text image according to the target arrangement direction to obtain the sequence information of each text box; calculating the associated information between adjacent text boxes in the target text image; determining the typesetting information of the text content of each text box based on the geometric information and the sequence information of each text box in the target text image and the correlation information between all adjacent text boxes; and typesetting all the text contents of the target text image according to the typesetting information of the text contents of each text box.

The above operations can be implemented in the foregoing embodiments, and are not described in detail herein.

According to an aspect of the application, a computer program product or computer program is provided, comprising computer instructions, the computer instructions being stored in a computer readable storage medium. The processor of the computer device reads the computer instructions from the computer-readable storage medium, and the processor executes the computer instructions to cause the computer device to perform the method provided in the various alternative implementations of the above embodiments.

It will be understood by those skilled in the art that all or part of the steps of the methods of the above embodiments may be performed by a computer program, which may be stored in a computer-readable storage medium and loaded and executed by a processor, or by related hardware controlled by the computer program.

To this end, an embodiment of the present application further provides a storage medium, in which a computer program is stored, where the computer program can be loaded by a processor to execute the steps in any one of the text typesetting methods provided in the embodiment of the present application. For example, the computer program may perform the steps of:

The storage medium may include: Read-Only Memory (ROM), Random Access Memory (RAM), magnetic disks, optical disks, and the like.

Since the computer program stored in the storage medium can execute the steps in any text typesetting method provided in the embodiment of the present application, the beneficial effects that can be realized by any text typesetting method provided in the embodiment of the present application can be realized, which are detailed in the foregoing embodiments and will not be described herein again.

The text typesetting method, the text typesetting device, the storage medium and the computer equipment provided by the embodiment of the application are introduced in detail, a specific example is applied in the text to explain the principle and the implementation mode of the application, and the description of the embodiment is only used for helping to understand the method and the core idea of the application; meanwhile, for those skilled in the art, according to the idea of the present application, there may be variations in the specific embodiments and the application scope, and in summary, the content of the present specification should not be construed as a limitation to the present application.

Claims

1. A text typesetting method is characterized by comprising the following steps:

determining geometric information of each text box;

2. The method of claim 1, wherein the geometric information comprises border information and feature information, and wherein determining the geometric information for each text box comprises:

measuring the text box to obtain frame information of the text box;

and determining the characteristic information of the text box based on the frame information.

3. The method of claim 2, wherein the feature information comprises a feature line, a feature point, and an angle value of the feature line, and wherein determining the feature information of the text box based on the bounding box information comprises:

4. The method of claim 3, wherein the target arrangement direction comprises a target first direction and a target second direction,

the step of sequencing all the text boxes in the target text image according to the target arrangement direction based on the geometric information of each text box to obtain the sequence information of each text box comprises the following steps:

with the starting feature point as a starting point, sorting the feature points of all the text boxes in the target first direction and the target second direction of the target text image to obtain sorting information of the feature point of each text box;

and setting the sorting information of the feature points as the order information of the text boxes to which the feature points belong, to obtain the order information of each text box.

5. The method according to claim 4, wherein the step of ranking the feature points of all the text boxes in the target first direction and the target second direction of the target text image with the starting feature point as a starting point to obtain ranking information of the feature point of each text box comprises:

6. The method of claim 3, wherein the adjacent text boxes include a first text box and a second text box, wherein the association information includes distance information and scale information,

the calculating the association information between the adjacent text boxes in the target text image comprises:

and determining proportion information between the first text box and the second text box based on the characteristic lines of the first text box and the second text box.

7. The method of claim 6, wherein determining the ratio information between the first text box and the second text box based on the feature lines of the first text box and the second text box comprises:

8. The method of claim 3, wherein the association information includes distance information and scale information, wherein the adjacent text boxes include a first text box and a second text box,

determining layout information of text content of each text box based on the geometric information, the sequence information and the association information between all adjacent text boxes in the target text image, including:

9. The method according to claim 1, wherein the sorting all the text boxes in the target text image according to the target arrangement direction based on the geometric information of each text box to obtain the order information of each text box comprises:

and calculating the association information between the adjacent text boxes in at least one text block of the target text image.

10. The method of claim 9, wherein the content blocks comprise non-text blocks corresponding to non-text content, and wherein determining layout information for the text content of each text box based on the geometric information, the order information, and the association information between all adjacent text boxes of each text box in the target text image comprises:

the typesetting of all the text contents of the target text image according to the typesetting information of the text contents of each text box comprises the following steps:

11. A text composition apparatus, comprising:

12. A storage medium, characterized in that it stores a plurality of computer programs adapted to be loaded by a processor for performing the steps of the method according to any one of claims 1 to 10.

13. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the steps of the method according to any of claims 1 to 10 are implemented when the computer program is executed by the processor.