CN111626036B - Image-text typesetting processing method - Google Patents

Publication number
CN111626036B
Authority
CN
China
Prior art keywords
characters
page
picture
typesetting
character
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010462916.4A
Other languages
Chinese (zh)
Other versions
CN111626036A (en)
Inventor
杨希羚
王东洋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing Lanjingren Network Technology Co ltd
Original Assignee
Nanjing Lanjingren Network Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing Lanjingren Network Technology Co ltd filed Critical Nanjing Lanjingren Network Technology Co ltd
Priority to CN202010462916.4A priority Critical patent/CN111626036B/en
Publication of CN111626036A publication Critical patent/CN111626036A/en
Application granted granted Critical
Publication of CN111626036B publication Critical patent/CN111626036B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/189 Automatic justification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/103 Formatting, i.e. changing of presentation of documents
    • G06F40/109 Font handling; Temporal or kinetic typography

Abstract

The invention discloses a novel image-text typesetting processing method, which relates to the technical field of big data and comprises the steps of establishing an image-text typesetting system, wherein the image-text typesetting system comprises a user terminal, an api gateway and a background server cluster, establishing a PDF generator, a rear-end WEB page generator, a typesetting service module, an article data service and transaction management service module and a database in the background server cluster, and communicating the user terminal with the background server cluster through the api gateway.

Description

Image-text typesetting processing method
Technical Field
The invention relates to the technical field of big data, in particular to a graphic and text typesetting processing method.
Background
The image-text typesetting methods currently on the market mainly rely either on manual typesetting with professional typesetting software, or on manually filling image-text content into templates preset by a system. Both are manual typesetting modes and consume considerable labor and time. As the technology has matured, automatic typesetting has been applied mainly to banner (banner advertisement) design and one-key typesetting of multiple pictures, while automatic typesetting of users' long image-text content remains relatively rare on the market.
The prior art has the following defects:
1. manual typesetting is time-consuming;
2. existing simple image-text typesetting techniques focus mainly on the reasonable layout of pictures and the judgment of picture and text positions, and offer no good solution for scaling pictures to fit the text, judging whether image-text content crosses pages, calculating image-text spacing, or identifying and merging blank content.
Disclosure of Invention
The invention aims to provide a method for typesetting image-text, which solves the technical problem of automatic typesetting of image-text contents by judging the relative position of the original image-text contents and respectively calculating the relative height of image-text paragraphs.
In order to achieve the purpose, the invention adopts the following technical scheme:
an image-text typesetting processing method comprises the following steps:
step 1: establishing an image-text typesetting system, wherein the image-text typesetting system comprises a user side, an api gateway and a background server cluster, establishing a PDF generator, a rear-end WEB page generator, a typesetting service module, an article data service and transaction management service module and a database in the background server cluster, and communicating the user side with the background server cluster through the api gateway;
step 2: a user inputs an original article through a user side, and the user side sends a typesetting request to the api gateway, wherein the typesetting request comprises a request instruction and a user side number;
the user inputs an original article through the user side, enters the print zone interface provided by the user side, and selects the article to be printed;
step 3: the api gateway verifies the request sent by the user side and verifies the authority of the user side number: when the user side number meets the authority requirement, the api gateway forwards the typesetting request to the background server cluster; when the user side number does not meet the authority requirement, the api gateway feeds back authority failure information to the user side, and step 2 is executed again;
step 4: after the background server cluster receives the typesetting request, the typesetting service module sends a request for the article to the api gateway, and the api gateway obtains the original article from the user side and forwards it to the typesetting service module;
step 5: the typesetting service module performs json parsing on the original article, obtains the pictures in the original article, obtains the text and the emoji expressions by parsing the HTML, and thereby obtains a preprocessed text;
step 6: the typesetting service module typesets the preprocessed text according to the following method:
step S1: traversing the preprocessed text, and sequencing the pages according to paragraph division;
step S2: selecting a page according to the page sequencing order, and judging whether the current page contains characters or pictures: if yes, go to step S10; otherwise, go to step S3;
step S3: typesetting directly from the beginning of the next page, and judging whether the page content has the text above and the picture below: if yes, go to step S6; if not, go to step S4;
step S4: putting the picture into the current page, calculating the space size occupied by the picture on the current page, calculating the width of all characters, and judging whether all the characters can be put into the current page: if yes, putting characters in the characters, and executing the step S2; if not, go to step S5;
step S5: setting the minimum scaling of the picture, scaling the picture according to the minimum scaling, and judging whether all characters can be put into the current page after the picture is scaled: if yes, put in the characters and execute step S2; if not, the characters are put into the page to the maximum extent, and the redundant characters are put into the next paragraph, and step S2 is executed;
step S6: calculating the widths of all characters in the characters, and processing the characters according to the following method:
step A1: sequentially traversing the characters to sequentially obtain a character;
step A2: judging the character: if it is an emoji expression, converting the emoji expression into a picture, and executing step A3; if it is an English character, executing step A4; if it is a Chinese character, executing step A5;
step A3: finding the ending position of the emoji expression, marking the position and offsetting the loop position accordingly, calculating the width of the emoji expression according to the width of a Chinese character, and executing step A6;
step A4: the width of each English character is preset in an array at initialization; querying the array to obtain the width of the English character, and executing step A6;
step A5: the Chinese character is an equal-width character and is calculated as 13 px;
step A6: judging whether, after this character is added to the current line, the width of the line reaches the preset width: if yes, go to step A7; if not, saving the current processed content, and executing step A9;
step A7: if the character is English, the end of the line is processed and the space characters at the end of the line are deleted; if the last character of the line is an English character, detecting whether the position is the end of a word; if it is not the end of the word, the line cannot hold the complete word, and the word is moved to the next line;
step A8: traversing the characters in sequence, and judging whether the next character may appear at the head of a line: if yes, returning the content and position of the current line, saving the current processed content, and executing step A1; if not, moving the last character of the current line to the next line; if that character cannot be placed at the head of a line either, continuing to move, at most twice; then returning the content and position of the current line, saving the current processed content, and executing step A9;
step A9: judging whether all the characters are processed, if so, executing the step S7, and if not, executing the step A1;
step S7: judging whether the current page can put down all characters according to the width of the characters: if yes, go to step S8; if not, the current page is filled, a new page is opened to put the residual characters, and the step S7 is executed;
step S8: judging whether the current page can be put in pictures, if so, directly putting in pictures, and executing the step S2; otherwise, go to step S9;
step S9: reducing the picture according to the minimum scaling, and judging whether the reduced picture can be put into the current page: if yes, putting the pictures, carrying out secondary zooming on all the pictures of the current page, if the pictures occupy the residual space except the characters to the maximum extent, storing the current page, and executing the step S2; if not, the characters are put on the current page and stored, the pictures are put on the next page, and the step S2 is executed;
step S10: acquiring a paragraph to be typeset, and calculating the total height of the paragraph to be typeset;
step S11: judging whether the picture is a picture on the character, if so, executing the step S12; otherwise, executing step S17;
step S12: typesetting the characters word by word, calculating the height of the typeset characters, and storing the characters in each line after typesetting;
step S13: calculating the typesetting height of the picture, and judging whether different pages of the horizontal and vertical images and one page of the vertical image meet the following conditions: if yes, go to step S14; if not, the current content is typeset to the next page, and the step S2 is executed;
step S14: judging whether the current page can put down all the contents: if yes, put the current content in and execute step S2; otherwise, go to step S15;
step S15: after the picture is zoomed according to the minimum zoom scale, whether the current page can be put with all contents or not is judged, if yes, the picture is put into the current page, all pictures of the current page are zoomed for the second time, the residual space except the characters is occupied to the maximum extent, the current page is stored, and the step S2 is executed; otherwise, go to step S16;
step S16: putting the maximum amount of characters in the current page, and putting redundant characters in the following page, and executing the step S2;
step S17: calculating the height of the picture, and calculating the height after the characters are typeset and the width of each row of data;
step S18: judging different pages of the horizontal and vertical images, and judging whether the vertical image can be placed one page or not: if not, the vertical picture is placed at the next page, and the step S2 is executed; if yes, go to step S19;
step S19: judging whether the current page can put all contents: if yes, all characters and pictures are put in, and step S2 is executed; otherwise, go to step S20;
step S20: judging whether the picture of the current page can be put down independently: if yes, go to step S21; if not, the picture is arranged in the next page, and step S2 is executed;
step S21: putting the picture into the current page, putting the characters into the new page, and executing the step S2;
step 7: after the original article is typeset in full text according to the method in the step 6, a back-end WEB page generator generates a corresponding html file, a PDF generator generates a corresponding PDF format file, and the typeset full text is stored in a database which allocates addresses for the full text;
step 8: generating a catalog and page numbers for the full text after typesetting, generating a two-dimensional code or a bar code according to the address distributed by the database, calculating the thickness of the article after the article is formed into a book according to the page numbers, and automatically generating a cover;
step 9: the user side acquires the data in the database and displays the two-dimensional code or the bar code to the user; the user checks the html file generated by the back-end WEB page generator by scanning the two-dimensional code or the bar code, and completes the printing of the PDF file by sending a printing request.
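The preprocessing in step 5 can be illustrated with a minimal sketch. The article JSON layout (a `blocks` list whose entries carry `img` and `html` keys) is a hypothetical assumption, since the text does not describe the actual format, and the emoji check uses a rough assumed code point range rather than a complete emoji definition.

```python
import json
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects the text nodes of an HTML fragment and ignores the tags."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def preprocess(article_json):
    """Split an article into picture/text/emoji records (hypothetical schema)."""
    blocks = json.loads(article_json)["blocks"]  # assumed key name
    result = []
    for block in blocks:
        parser = _TextExtractor()
        parser.feed(block.get("html", ""))
        parser.close()
        text = "".join(parser.parts)
        # Rough, assumed emoji range; real emoji detection is more involved.
        emoji = [ch for ch in text if 0x1F300 <= ord(ch) <= 0x1FAFF]
        result.append({"picture": block.get("img"), "text": text, "emoji": emoji})
    return result
```

Each record then carries what the later typesetting steps need: the picture reference, the plain text, and the emoji that must be converted to pictures.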
Preferably, the background server cluster adopts an Ali cloud host as a main operating environment.
Preferably, in executing step S4, the widths of all characters are calculated as follows: Chinese uses an equal-width font, and its width is measured on Chrome with the PingFang SC font at size 13 as the reference; English letters, digits and punctuation are measured on Chrome with the "Times New Roman" font at size 13 as the reference.
Preferably, punctuation marks cannot appear at the head of a line when step A8 is performed.
Preferably, in executing step S6, the justify field at the front end is used so that the text is adaptively aligned within the expected width.
The image-text typesetting processing method solves the technical problem of automatically typesetting articles into a book with one key. The user only needs to select the articles to be printed in the Meipian APP, so ordinary users can easily produce a book of their own. Typesetting is performed fully automatically by the program without manual work, so the typesetting cost is far lower and the typesetting speed far higher than those of manual typesetting, which greatly reduces production cost and improves productivity. The spacing between image-text paragraphs is refined into picture-text, picture-picture and text-text intervals, so that pictures are positioned more reasonably within the text. The invention scales the picture by the minimum scaling and enlarges it a second time after typesetting, which keeps the article compact while preserving picture clarity.
Drawings
FIG. 1 is a system architecture diagram of the present invention;
FIG. 2 is a first flowchart of the present invention;
FIG. 3 is a second flowchart of the present invention;
FIG. 4 is a third flowchart of the present invention;
FIG. 5 is a fourth flow chart of the present invention;
FIG. 6 is a fifth flowchart of the present invention.
Detailed Description
As shown in fig. 1-6, a method for typesetting image and text includes the following steps:
step 1: establishing an image-text typesetting system, wherein the image-text typesetting system comprises a user side, an api gateway and a background server cluster, establishing a PDF generator, a rear-end WEB page generator, a typesetting service module, an article data service and transaction management service module and a database in the background server cluster, and communicating the user side with the background server cluster through the api gateway;
step 2: a user inputs an original article through a user side, and the user side sends a typesetting request to the api gateway, wherein the typesetting request comprises a request instruction and a user side number;
the user inputs an original article through the user side, enters the print zone interface provided by the user side, and selects the article to be printed;
in this embodiment, the user side is a user side APP.
The image-text typesetting is based on image-text paragraphs in APP, and a paragraph with image-text in APP is called an image-text paragraph. In order to make typesetting beautiful, the technology used by the invention comprises text length calculation, English word segmentation, punctuation segmentation, text in-line self-adaptation, picture scaling calculation, page crossing processing, image-text interval calculation, blank combination, emoji expression processing, catalog generation, article page number generation, two-dimensional code and bar code generation and cover generation.
Step 3: the api gateway verifies the request sent by the user side and verifies the authority of the user side number: when the user side number meets the authority requirement, the api gateway forwards the typesetting request to the background server cluster; when the user side number does not meet the authority requirement, the api gateway feeds back authority failure information to the user side, and step 2 is executed again;
step 4: after the background server cluster receives the typesetting request, the typesetting service module sends a request for the article to the api gateway, and the api gateway obtains the original article from the user side and forwards it to the typesetting service module;
step 5: the typesetting service module performs json parsing on the original article, obtains the pictures in the original article, obtains the text and the emoji expressions by parsing the HTML, and thereby obtains a preprocessed text;
step 6: the typesetting service module typesets the preprocessed text according to the following method:
step S1: traversing the preprocessed text, and sequencing the pages according to paragraph division;
step S2: selecting a page according to the page sequencing order, and judging whether the current page contains characters or pictures: if yes, go to step S10; otherwise, go to step S3;
step S3: typesetting directly from the beginning of the next page, and judging whether the page content has the text above and the picture below: if yes, go to step S6; if not, go to step S4;
step S4: putting the picture into the current page, calculating the space size occupied by the picture on the current page, calculating the width of all characters, and judging whether all the characters can be put into the current page: if yes, putting characters in the characters, and executing the step S2; if not, go to step S5;
step S5: setting the minimum scaling of the picture, scaling the picture according to the minimum scaling, and judging whether all characters can be put into the current page after the picture is scaled: if yes, put in the characters and execute step S2; if not, the characters are put into the page to the maximum extent, and the redundant characters are put into the next paragraph, and step S2 is executed;
step S6: calculating the widths of all characters in the characters, and processing the characters according to the following method:
step A1: sequentially traversing the characters to sequentially obtain a character;
step A2: judging the character: if it is an emoji expression, converting the emoji expression into a picture, and executing step A3; if it is an English character, executing step A4; if it is a Chinese character, executing step A5;
step A3: finding the ending position of the emoji expression, marking the position and offsetting the loop position accordingly, calculating the width of the emoji expression according to the width of a Chinese character, and executing step A6;
step A4: the width of each English character is preset in an array at initialization; querying the array to obtain the width of the English character, and executing step A6;
step A5: the Chinese character is an equal-width character and is calculated as 13 px;
step A6: judging whether, after this character is added to the current line, the width of the line reaches the preset width: if yes, go to step A7; if not, saving the current processed content, and executing step A9;
step A7: if the character is English, the end of the line is processed and the space characters at the end of the line are deleted; in order to avoid an English word being split across two lines, if the last character of the line is an English character, whether the position is the end of a word is detected; if it is not the end of the word, the line cannot hold the complete word, and the word is moved to the next line;
step A8: traversing the characters in sequence, and judging whether the next character may appear at the head of a line: if yes, returning the content and position of the current line, saving the current processed content, and executing step A1; if not, moving the last character of the current line to the next line; if that character cannot be placed at the head of a line either, continuing to move, at most twice; then returning the content and position of the current line, saving the current processed content, and executing step A9;
for aesthetic reasons, a punctuation mark cannot appear at the beginning of a line of text; if a punctuation mark would appear at the head of a line, the character preceding it is pulled down to serve as the head of the line.
Step A9: judging whether all the characters are processed, if so, executing the step S7, and if not, executing the step A1;
step S7: judging whether the current page can put down all characters according to the width of the characters: if yes, go to step S8; if not, the current page is filled, a new page is opened to put the residual characters, and the step S7 is executed;
step S8: judging whether the current page can be put in pictures, if so, directly putting in pictures, and executing the step S2; otherwise, go to step S9;
step S9: reducing the picture according to the minimum scaling, and judging whether the reduced picture can be put into the current page: if yes, putting the pictures, carrying out secondary zooming on all the pictures of the current page, if the pictures occupy the residual space except the characters to the maximum extent, storing the current page, and executing the step S2; if not, the characters are put on the current page and stored, the pictures are put on the next page, and the step S2 is executed;
step S10: acquiring a paragraph to be typeset, and calculating the total height of the paragraph to be typeset;
step S11: judging whether the picture is a picture on the character, if so, executing the step S12; otherwise, executing step S17;
step S12: typesetting the characters word by word, calculating the height of the typeset characters, and storing the characters in each line after typesetting;
step S13: calculating the typesetting height of the picture, and judging whether different pages of the horizontal and vertical images and one page of the vertical image meet the following conditions: if yes, go to step S14; if not, the current content is typeset to the next page, and the step S2 is executed;
step S14: judging whether the current page can put down all the contents: if yes, put the current content in and execute step S2; otherwise, go to step S15;
step S15: after the picture is zoomed according to the minimum zoom scale, whether the current page can be put with all contents or not is judged, if yes, the picture is put into the current page, all pictures of the current page are zoomed for the second time, the residual space except the characters is occupied to the maximum extent, the current page is stored, and the step S2 is executed; otherwise, go to step S16;
step S16: putting the maximum amount of characters in the current page, and putting redundant characters in the following page, and executing the step S2;
step S17: calculating the height of the picture, and calculating the height after the characters are typeset and the width of each row of data;
step S18: judging different pages of the horizontal and vertical images, and judging whether the vertical image can be placed one page or not: if not, the vertical picture is placed at the next page, and the step S2 is executed; if yes, go to step S19;
step S19: judging whether the current page can put all contents: if yes, all characters and pictures are put in, and step S2 is executed; otherwise, go to step S20;
step S20: judging whether the picture of the current page can be put down independently: if yes, go to step S21; if not, the picture is arranged in the next page, and step S2 is executed;
step S21: putting the picture into the current page, putting the characters into the new page, and executing the step S2;
To keep an image-text paragraph internally compact while avoiding excessive blank space on a single page, the image-text paragraph is treated as a whole during typesetting. This often leads to the situation where the image-text paragraph cannot fit on the current page; moving the whole paragraph to the next page looks better, but may leave too much blank space on the current page.
To handle this, the invention sets a minimum image-text amount per page. It first checks whether the current page has reached the minimum required image-text amount for one page; if so, the whole image-text paragraph being processed is displayed starting from the next page.
If the current page has not reached the minimum requirement, part of the image-text paragraph is placed on the current page and the rest on the next page: in the picture-above-text mode, the picture is placed first and then as much text as fits; in the text-above-picture mode, text is added to the current page until it is full, and the remaining content is placed on the next page.
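The page-crossing decision described above can be sketched as a small function. The fill ratio used as the "minimum image-text amount", the layout names and the returned action names are all hypothetical; the text does not give a concrete threshold.

```python
def place_overflowing_paragraph(used_height, page_height, min_fill_ratio, layout):
    """Decide how to place an image-text paragraph that does not fit the page.

    min_fill_ratio is an assumed way to express the minimum image-text
    amount of a page, as a fraction of the page height.
    """
    if used_height >= page_height * min_fill_ratio:
        # The page is already full enough: keep the paragraph intact.
        return "move_whole_paragraph_to_next_page"
    if layout == "picture_above_text":
        # Picture first, then as much text as still fits.
        return "place_picture_then_fill_text"
    if layout == "text_above_picture":
        # Fill the page with text; the remainder goes to the next page.
        return "fill_text_then_continue_on_next_page"
    raise ValueError(f"unknown layout: {layout!r}")
```

For example, with a 1000 px page and an assumed ratio of 0.7, a page that already holds 800 px of content moves the paragraph as a whole, while a page holding 300 px splits it.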
For aesthetic reasons, the invention uses 3 kinds of intervals between image-text paragraphs: picture-text, picture-picture and text-text intervals.
Step 7: after the original article is typeset in full text according to the method in the step 6, a back-end WEB page generator generates a corresponding html file, a PDF generator generates a corresponding PDF format file, and the typeset full text is stored in a database which allocates addresses for the full text;
step 8: generating a catalog and page numbers for the full text after typesetting, generating a two-dimensional code or a bar code according to the address distributed by the database, calculating the thickness of the article after the article is formed into a book according to the page numbers, and automatically generating a cover;
step 9: the user side acquires the data in the database and displays the two-dimensional code or the bar code to the user; the user checks the html file generated by the back-end WEB page generator by scanning the two-dimensional code or the bar code, and completes the printing of the PDF file by sending a printing request.
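Step 8 does not state how the book thickness is derived from the page count. One plausible reading, given here only as a sketch, is that each sheet carries two pages and the spine thickness is the sheet count times a per-sheet thickness. The default sheet thickness and the cover allowance are hypothetical parameters.

```python
import math


def spine_thickness_mm(page_count, sheet_thickness_mm=0.1, cover_allowance_mm=0.0):
    """Estimate spine thickness from the page count (assumed: 2 pages per sheet)."""
    if page_count < 0:
        raise ValueError("page_count must be non-negative")
    sheets = math.ceil(page_count / 2)  # an odd page count still needs a full sheet
    return sheets * sheet_thickness_mm + cover_allowance_mm
```

The result would feed the cover generator, which needs the spine width to lay out the front cover, spine and back cover as one sheet.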
Preferably, the background server cluster adopts an Ali cloud host as a main operating environment.
Preferably, in executing step S4, the widths of all characters are calculated as follows: Chinese uses an equal-width font, and its width is measured on Chrome with the PingFang SC font at size 13 as the reference; English letters, digits and punctuation are measured on Chrome with the "Times New Roman" font at size 13 as the reference.
For a section of text, each character is traversed in sequence and the widths are summed; when the sum reaches the maximum width of a line, the text is split and the position recorded, that is, the text before the position is placed in the current line and the text after it is processed in the next line.
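The width summation and line splitting can be sketched as follows. The 13 px width for Chinese characters comes from the text; the Latin width table and its fallback value are hypothetical placeholders for the widths that would be measured in Chrome, and treating every non-ASCII character as a 13 px character is a simplification.

```python
from __future__ import annotations

CJK_WIDTH_PX = 13.0  # from the text: Chinese characters are equal-width, 13 px

# Hypothetical widths; real values would be measured in Chrome per character.
LATIN_WIDTHS = {" ": 3.25, "i": 3.6, "l": 3.6, "m": 10.1, "w": 9.4}
DEFAULT_LATIN_WIDTH = 6.5  # assumed fallback for characters not in the table


def char_width(ch: str) -> float:
    """Table lookup for ASCII characters, fixed width for everything else."""
    if ord(ch) < 128:
        return LATIN_WIDTHS.get(ch, DEFAULT_LATIN_WIDTH)
    return CJK_WIDTH_PX


def split_lines(text: str, max_width: float) -> list[str]:
    """Accumulate character widths and start a new line when the limit is exceeded."""
    lines, current, used = [], [], 0.0
    for ch in text:
        width = char_width(ch)
        if current and used + width > max_width:
            lines.append("".join(current))
            current, used = [], 0.0
        current.append(ch)
        used += width
    if current:
        lines.append("".join(current))
    return lines
```

This sketch only covers the width-based split; keeping English words whole and keeping punctuation off the head of a line are separate passes.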
Preferably, punctuation marks cannot appear at the head of a line when step A8 is performed.
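The line-head rule can be sketched as a pass over already split lines: when a line starts with a punctuation mark, the last character of the previous line is pulled down, at most twice as in step A8. The set of forbidden characters is an assumed sample, not a list taken from the text.

```python
# Assumed sample of punctuation that may not start a line.
FORBIDDEN_AT_LINE_START = set("，。！？；：、）》”’,.!?;:)")


def fix_line_heads(lines, max_moves=2):
    """Pull characters down from the previous line so no line starts with punctuation."""
    fixed = list(lines)
    for i in range(1, len(fixed)):
        moves = 0
        while (
            moves < max_moves
            and fixed[i]
            and fixed[i][0] in FORBIDDEN_AT_LINE_START
            and len(fixed[i - 1]) > 1
        ):
            # Move the last character of the previous line to the head of this line.
            fixed[i] = fixed[i - 1][-1] + fixed[i]
            fixed[i - 1] = fixed[i - 1][:-1]
            moves += 1
    return fixed
```

A pulled-down character makes the receiving line wider, so a real implementation would have to re-check that line against the preset width; the sketch leaves that out.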
Preferably, in executing step S6, because the measurement has errors, errors accumulate when the text is long, so that characters which should be aligned are not aligned; the invention therefore uses the justify field at the front end so that the characters are adaptively aligned within the expected width.
When processing picture scaling, for image-text paragraphs composed of the same picture with different text, the display of the image-text paragraph is controlled by scaling the size of the picture.
First, a minimum scaling is set for the picture. If the image-text paragraph can be completely displayed on the page at the minimum scaling, the total space required by the text is calculated first, the text space is then subtracted from the total page space to obtain the picture space, and finally the picture is scaled to the size of that space.
In the present embodiment, the minimum scaling is 0.88.
If the image-text paragraph cannot fit on the current page even after the picture is scaled to the minimum scaling, the paragraph must be displayed across multiple pages. In this case, in order to display the picture at a larger size, the picture is placed at the largest size the current page allows, the text is then placed progressively, and the part that does not fit is carried over to the next page for processing.
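The scaling rule can be sketched with heights only. The 0.88 minimum comes from this embodiment; capping the scale at 1.0 (never enlarging beyond the original size) and returning `None` for the multi-page case are assumptions of the sketch.

```python
MIN_SCALE = 0.88  # minimum scaling stated for this embodiment


def picture_scale(page_space, text_height, picture_height, min_scale=MIN_SCALE):
    """Return the picture scale that lets the paragraph fit, or None if it must span pages."""
    if picture_height <= 0:
        raise ValueError("picture_height must be positive")
    available = page_space - text_height  # space left for the picture
    if available >= picture_height:
        return 1.0  # assumed cap: the picture is not enlarged here
    if available >= picture_height * min_scale:
        # Scale the picture so it exactly fills the remaining space.
        return available / picture_height
    return None  # even the minimum scaling does not fit: split across pages
```

With a 1000 px page, 450 px of text and a 600 px picture, 550 px remain, which lies between 528 px (600 × 0.88) and 600 px, so the picture is scaled to 550/600 of its size.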
Because a small number of articles contain a large amount of blank content, for example a passage containing too many consecutive spaces or invisible characters, or too many consecutive line feeds, large blank areas can result and seriously affect the appearance. To solve this, the invention uses regular expressions to eliminate consecutive invisible blank characters, keeping at most one blank character within the text. Consecutive line feeds are merged into a single line feed to avoid large blanks.
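The blank merging can be sketched with two regular expressions. The exact character classes are assumptions; the text only says that consecutive blank characters are reduced to at most one and consecutive line feeds to one.

```python
import re

# Assumed set of blank characters: space, tab, no-break space, full-width space.
_BLANK_RUN = re.compile(r"[ \t\u00a0\u3000]+")
# One or more line feeds, together with single spaces around them.
_NEWLINE_RUN = re.compile(r" ?(?:\r?\n ?)+")


def collapse_blanks(text):
    """Reduce runs of blanks to one space and runs of line feeds to one line feed."""
    text = _BLANK_RUN.sub(" ", text)
    return _NEWLINE_RUN.sub("\n", text)
```

Running the blank pass first means that lines containing only spaces are reduced to a single space, which the second pattern then absorbs into the merged line feed.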
As shown in fig. 2-6, F1, F2, F3, F4, F5, F6, and F7 are paging connection symbols of the flowchart.
In FIG. 1, Puppeteer is a Node library officially produced by Google, which controls headless Chrome via the DevTools protocol.
TradeManager is the international version of AliWangWang, article is a semantic tag introduced by html5, and MS SQL, Oracle and Redis are databases.
iOS, Android, Html and CSS are all client running environments.
The image-text typesetting processing method of the invention solves the technical problem of automatically typesetting articles into a book with one click. The user only needs to select the articles to be printed in the Meipian app, so that ordinary people can conveniently produce a book of their own. Typesetting is fully automatic and requires no manual work; its cost is far lower and its speed far higher than those of manual typesetting, which greatly reduces production cost and improves productivity. The layout spacing is refined into picture-text, picture-picture, text-text and inter-paragraph spacing, where the inter-paragraph spacing is only the spacing between one whole image-text paragraph and another, so that pictures are positioned more reasonably within the text. The invention first scales pictures at the minimum scaling and enlarges them a second time after typesetting, which keeps the article short while preserving picture clarity.

Claims (5)

1. An image-text typesetting processing method, characterized by comprising the following steps:
step 1: establishing an image-text typesetting system, wherein the image-text typesetting system comprises a user side, an api gateway and a background server cluster, establishing a PDF generator, a rear-end WEB page generator, a typesetting service module, an article data service and transaction management service module and a database in the background server cluster, and communicating the user side with the background server cluster through the api gateway;
step 2: a user inputs an original article through a user side, and the user side sends a typesetting request to the api gateway, wherein the typesetting request comprises a request instruction and a user side number;
step 3: the api gateway verifies the request sent by the user side and checks the authority of the user side number: when the user side number meets the authority requirement, the api gateway forwards the typesetting request to the background server cluster; when the user side number does not meet the authority requirement, the api gateway feeds back authority failure information to the user side and step 2 is executed;
step 4: after the background server cluster receives the typesetting request, the typesetting service module sends a request for the article to the api gateway, and the api gateway requests the original article from the user side and forwards it to the typesetting service module;
step 5: the typesetting service module performs json parsing on the original article, obtains the pictures in the original article, obtains the characters and the emoji expressions by parsing the HTML, and obtains a preprocessed text;
step 6: the typesetting service module typesets the preprocessed text according to the following method:
step S1: traversing the preprocessed text, and sequencing the pages according to paragraph division;
step S2: selecting a page according to the page sequencing order, and judging whether the current page contains characters or pictures: if yes, go to step S10; otherwise, go to step S3;
step S3: typesetting directly from the beginning of the next page, and judging whether the page content is text above with the picture below: if yes, executing step S6; if not, executing step S4;
step S4: putting the picture into the current page, calculating the space occupied by the picture on the current page, calculating the widths of all characters, and judging whether all the characters can be put into the current page: if yes, putting in the characters and executing step S2; if not, executing step S5;
step S5: setting the minimum scaling of the picture, scaling the picture according to the minimum scaling, and judging whether all characters can be put into the current page after the picture is scaled: if yes, put in the characters and execute step S2; if not, the characters are put into the page to the maximum extent, and the redundant characters are put into the next paragraph, and step S2 is executed;
step S6: calculating the widths of all characters in the characters, and processing the characters according to the following method:
step A1: traversing the characters in sequence and obtaining one character at a time;
step A2: judging the character: if it is an emoji expression, converting the emoji expression into a picture and executing step A3; if it is an English character, executing step A4; if it is a Chinese character, executing step A5;
step A3: finding the ending position of the emoji expression, marking the position and advancing the loop position accordingly, calculating the width of the emoji expression according to the width of a Chinese character, and executing step A6;
step A4: the width of each English character having been set in an array at the start, querying the array to obtain the width of the English character, and executing step A6;
step A5: the Chinese character is an equal-width character and its width is calculated as 13 px;
step A6: judging whether, after this character is added to the current line, the width of the line reaches the preset width: if yes, executing step A7; if not, saving the current processed content and executing step A9;
step A7: if the character is English, processing the end of the line and deleting the space character at the end of the line; if the last character of the line is an English character, detecting whether the position is the end of a word, and if it is not, the line cannot hold the whole word, so the word is moved to the next line;
step A8: continuing to traverse the characters in sequence and judging whether the next character may appear at the head of a line: if yes, returning the content and position of the current line, saving the current processed content, and executing step A1; if not, moving the last character of the current line to the next line, and if that character cannot be placed at the head of a line either, continuing to move, at most twice, then returning the content and position of the current line, saving the current processed content, and executing step A9;
step A9: judging whether all the characters are processed, if so, executing the step S7, and if not, executing the step A1;
step S7: judging whether the current page can put down all characters according to the width of the characters: if yes, go to step S8; if not, the current page is filled, a new page is opened to put the residual characters, and the step S7 is executed;
step S8: judging whether the current page can be put in pictures, if so, directly putting in pictures, and executing the step S2; otherwise, go to step S9;
step S9: reducing the picture according to the minimum scaling, and judging whether the reduced picture can be put into the current page: if yes, putting in the picture, carrying out secondary scaling on all the pictures of the current page so that they occupy, to the maximum extent, the remaining space other than the characters, storing the current page, and executing step S2; if not, putting the characters on the current page and storing it, putting the picture on the next page, and executing step S2;
step S10: acquiring a paragraph to be typeset, and calculating the total height of the paragraph to be typeset;
step S11: judging whether the picture is a picture on the character, if so, executing the step S12; otherwise, executing step S17;
step S12: typesetting the characters word by word, calculating the height of the typeset characters, and storing the characters in each line after typesetting;
step S13: calculating the typeset height of the picture, and judging whether the following conditions are met: horizontal and vertical pictures are on different pages, and a vertical picture occupies one page: if yes, executing step S14; if not, typesetting the current content onto the next page and executing step S2;
step S14: judging whether the current page can put down all the contents: if yes, put the current content in and execute step S2; otherwise, go to step S15;
step S15: after the picture is zoomed according to the minimum zoom scale, whether the current page can be put with all contents or not is judged, if yes, the picture is put into the current page, all pictures of the current page are zoomed for the second time, the residual space except the characters is occupied to the maximum extent, the current page is stored, and the step S2 is executed; otherwise, go to step S16;
step S16: putting the maximum amount of characters in the current page, and putting redundant characters in the following page, and executing the step S2;
step S17: calculating the height of the picture, and calculating the height after the characters are typeset and the width of each row of data;
step S18: judging whether horizontal and vertical pictures are on different pages, and judging whether the vertical picture can be placed on one page: if not, placing the vertical picture on the next page and executing step S2; if yes, executing step S19;
step S19: judging whether the current page can put all contents: if yes, all characters and pictures are put in, and step S2 is executed; otherwise, go to step S20;
step S20: judging whether the picture of the current page can be put down independently: if yes, go to step S21; if not, the picture is arranged in the next page, and step S2 is executed;
step S21: putting the picture into the current page, putting the characters into the new page, and executing the step S2;
step 7: after the full text of the original article has been typeset according to the method in step 6, the back-end WEB page generator generates a corresponding html file, the PDF generator generates a corresponding PDF file, and the typeset full text is stored in the database, which allocates an address for the full text;
step 8: generating a catalog and page numbers for the typeset full text, generating a two-dimensional code or a bar code according to the address allocated by the database, calculating the thickness of the book formed from the article according to the number of pages, and automatically generating a cover;
step 9: the user side acquires the data in the database and displays the two-dimensional code or the bar code to the user; the user views the html file generated by the back-end WEB page generator by scanning the two-dimensional code or the bar code, and completes the printing of the PDF file by sending a printing request.
2. The image-text composition processing method according to claim 1, characterized in that: the background server cluster adopts an Ali cloud host as a main operating environment.
3. The image-text composition processing method according to claim 1, characterized in that: in executing step S4, the widths of all characters are calculated as follows: Chinese uses an equal-width font, and its width is measured on Chrome with the PingFang SC font at size 13 as the reference; English letters, digits and punctuation are measured on Chrome with the "Times New Roman" font at size 13 as the reference.
4. The image-text composition processing method according to claim 1, characterized in that: in performing step A8, the punctuation mark cannot appear at the head of the line.
5. The image-text composition processing method according to claim 1, characterized in that: in executing step S6, the justify field at the front end is used so that the text aligns adaptively within the expected width.
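The character-width and line-breaking logic of steps A1 to A8, together with the rule of claim 4, can be sketched as follows. This is a simplified illustration, not the claimed method itself: the English width table holds made-up values (the patent measures real widths on Chrome), every non-ASCII code point, including an emoji, is given the Chinese width, whole-word wrapping for English is omitted, and a character is moved down at most once rather than twice. All names are hypothetical.

```python
from typing import List

CJK_WIDTH = 13.0  # step A5: Chinese characters are equal-width, 13 px
# Hypothetical widths in px; real values would be measured in the browser.
LATIN_WIDTHS = {" ": 3.25, "i": 3.6, "l": 3.6, "m": 10.1, "w": 9.4}
DEFAULT_LATIN_WIDTH = 6.5  # assumed fallback for characters not in the table
NO_LINE_START = set("，。！？；：、,.!?;:")  # punctuation barred from a line head


def char_width(ch: str) -> float:
    """Look up an English width (step A4) or use the Chinese width (A3/A5)."""
    if ord(ch) < 128:
        return LATIN_WIDTHS.get(ch, DEFAULT_LATIN_WIDTH)
    return CJK_WIDTH


def break_lines(text: str, max_width: float) -> List[str]:
    """Greedily fill lines up to max_width, keeping punctuation off line heads."""
    lines: List[str] = []
    line, width = "", 0.0
    for ch in text:
        w = char_width(ch)
        if line and width + w > max_width:
            if ch in NO_LINE_START and len(line) > 1:
                # Move the last character down so the punctuation follows it.
                carry, line = line[-1], line[:-1]
                lines.append(line)
                line, width = carry, char_width(carry)
            else:
                lines.append(line.rstrip(" "))  # drop trailing spaces (A7)
                line, width = "", 0.0
        line += ch
        width += w
    if line:
        lines.append(line)
    return lines
```

Moving the preceding character down, rather than squeezing the punctuation onto the full line, keeps every line within the preset width at the cost of a slightly shorter line, which justification can then stretch.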
CN202010462916.4A 2020-05-27 2020-05-27 Image-text typesetting processing method Active CN111626036B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010462916.4A CN111626036B (en) 2020-05-27 2020-05-27 Image-text typesetting processing method

Publications (2)

Publication Number Publication Date
CN111626036A CN111626036A (en) 2020-09-04
CN111626036B CN111626036B (en) 2021-04-30

Family

ID=72271481

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010462916.4A Active CN111626036B (en) 2020-05-27 2020-05-27 Image-text typesetting processing method

Country Status (1)

Country Link
CN (1) CN111626036B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112434487B (en) * 2020-10-27 2024-01-30 北京奇艺世纪科技有限公司 Image-text typesetting method and device and electronic equipment
CN112380816B (en) * 2020-11-11 2022-05-31 珠海读书郎网络教育有限公司 Test paper typesetting method and system based on mapping table
CN113743075B (en) * 2021-09-08 2023-10-31 北京超图软件股份有限公司 Meteorological disaster image-text service product generation method, device, equipment and storage medium
CN114153404A (en) * 2021-11-25 2022-03-08 武汉新新数码彩色印务有限公司 Mobile internet intelligent printing method and mobile internet intelligent printing system
CN114241093B (en) * 2022-02-28 2022-06-17 世纪开元智印互联科技集团股份有限公司 Method and system for automatically typesetting photo book without shielding

Citations (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1808482A (en) * 2006-02-09 2006-07-26 北京北大方正电子有限公司 Visual electronic signature and verification method
CN101013418A (en) * 2007-02-15 2007-08-08 北京大学 Auto-adaptive typesetting method for word in textbox
CN101183358A (en) * 2007-12-25 2008-05-21 北京方正国际软件系统有限公司 Typesetting method and apparatus
CN101937428A (en) * 2010-08-11 2011-01-05 优视科技有限公司 Method and system for rearranging pictures with literal contents for mobile terminal equipment
CN101984419A (en) * 2010-10-21 2011-03-09 优视科技有限公司 Method and device for reforming paragraphs of webpage picture content
CN102789448A (en) * 2012-06-21 2012-11-21 奇智软件(北京)有限公司 Electronic data typesetting method and device
CN103455475A (en) * 2012-06-01 2013-12-18 腾讯科技(深圳)有限公司 Typesetting method, equipment and system
CN104239284A (en) * 2014-09-15 2014-12-24 广州市西美信息科技有限公司 Method and device for automatic image-text composition
CN104615587A (en) * 2012-06-21 2015-05-13 北京奇虎科技有限公司 Electronic data composing method and device
CN105045776A (en) * 2015-09-07 2015-11-11 武汉大学 Automatic page type setting method
CN105912519A (en) * 2016-05-27 2016-08-31 北京京东尚科信息技术有限公司 Electronic document typesetting method and typesetting device
CN106445903A (en) * 2015-08-04 2017-02-22 腾讯科技(深圳)有限公司 Image-text data typesetting method and apparatus
CN106970900A (en) * 2017-03-26 2017-07-21 北京图文天地科技发展有限公司 A kind of method of compatible emoji emoticon typesetting
CN107291682A (en) * 2016-03-30 2017-10-24 同方知网(北京)技术有限公司 It is a kind of to divide piece algorithm based on many electronic documents for redirecting processing and twin check
CN107688557A (en) * 2016-08-03 2018-02-13 北大方正集团有限公司 Composition method, composing system and terminal
CN108170658A (en) * 2018-01-12 2018-06-15 山西同方知网数字出版技术有限公司 A kind of flexibly configurable, the Text region flexibly defined adapt critique system
CN110110290A (en) * 2019-03-29 2019-08-09 北京点众科技股份有限公司 A kind of method and apparatus for the typesetting pattern setting e-book
CN110516192A (en) * 2019-09-02 2019-11-29 福建天晴数码有限公司 Picture resource material and textual materials are dragged to software interface method and its system
CN110705223A (en) * 2019-08-13 2020-01-17 北京众信博雅科技有限公司 Footnote recognition and extraction method for multi-page layout document

Family Cites Families (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7017816B2 (en) * 2003-09-30 2006-03-28 Hewlett-Packard Development Company, L.P. Extracting graphical bar codes from template-based documents
JP5629435B2 (en) * 2009-06-30 2014-11-19 キヤノン株式会社 Information processing apparatus, information processing method, and program
US20120196260A1 (en) * 2011-02-01 2012-08-02 Kao Nhiayi Electronic Comic (E-Comic) Metadata Processing
US20140108941A1 (en) * 2012-10-17 2014-04-17 Christopher Stephen Joel Method and Apparatus for Automatically Optimizing the Loading of Images in a Cloud-Based Proxy Service
US9386432B2 (en) * 2013-08-12 2016-07-05 Yahoo! Inc. Displaying location-based images that match the weather conditions
US9779065B1 (en) * 2013-08-29 2017-10-03 Google Inc. Displaying graphical content items based on textual content items
CN105205077A (en) * 2014-06-25 2015-12-30 广州市动景计算机科技有限公司 Page layout method, device and system
US10157178B2 (en) * 2015-02-06 2018-12-18 International Business Machines Corporation Identifying categories within textual data
CN109710686A (en) * 2018-12-27 2019-05-03 杭州火树科技有限公司 The analysis system of visualization building chart
CN110728129B (en) * 2019-09-03 2023-06-23 北京字节跳动网络技术有限公司 Method, device, medium and equipment for typesetting text content in picture
CN110705503B (en) * 2019-10-14 2022-02-25 北京信息科技大学 Method and device for generating directory structured information
CN110929495B (en) * 2019-11-08 2023-08-29 广州坚和网络科技有限公司 Method for automatically beautifying typesetting of articles
CN111178019A (en) * 2019-12-30 2020-05-19 深圳市越疆科技有限公司 Robot character writing method, device and system based on voice control

Also Published As

Publication number Publication date
CN111626036A (en) 2020-09-04

Similar Documents

Publication Publication Date Title
CN111626036B (en) Image-text typesetting processing method
US8593666B2 (en) Method and system for printing a web page
CN101246550B (en) Image character recognition method and device
US8515176B1 (en) Identification of text-block frames
CN112230870B (en) Method and device for printing form data
US10417516B2 (en) System and method for preprocessing images to improve OCR efficacy
CN101373465A (en) Translation apparatus, and translation method
CN110705503B (en) Method and device for generating directory structured information
US9734132B1 (en) Alignment and reflow of displayed character images
CN104133809B (en) Font style bolding method
CN109726369A (en) A kind of intelligent template questions record Implementation Technology based on normative document
KR100743175B1 (en) Advertisement writing method using computerized typesetting system
CN112417826A (en) PDF online editing method and device, electronic equipment and readable storage medium
CN105677262A (en) Typesetting method and device for banner printing
CN111198664A (en) Document printing method and device, computer storage medium and terminal
CN112183019B (en) Display method, computing equipment and computer storage medium of electronic book handwritten notes
CN106776489B (en) Electronic document display method and system of display device
CN110032348A (en) A kind of character display method, device, medium
CN111126007A (en) HTML (Hypertext markup language) -based medical record document paging algorithm
CN116778032B (en) Answer sheet generation method, device, equipment and storage medium
CN111046096B (en) Method and device for generating graphic structured information
CN113703699B (en) Real-time output method and device for electronic file
JP2014092824A (en) Display control apparatus and program
CN112906347B (en) Character typesetting method, electronic equipment and storage medium
JP2682873B2 (en) Recognition device for tabular documents

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB03 Change of inventor or designer information
Inventor after: Yang Xiling
Inventor after: Wang Dongyang
Inventor before: Yang Xiling
GR01 Patent grant