CN111626036B

CN111626036B - Image-text typesetting processing method

Info

Publication number: CN111626036B
Application number: CN202010462916.4A
Authority: CN
Inventors: 杨希羚; 王东洋
Original assignee: Nanjing Lanjingren Network Technology Co ltd
Current assignee: Nanjing Lanjingren Network Technology Co ltd
Priority date: 2020-05-27
Filing date: 2020-05-27
Publication date: 2021-04-30
Anticipated expiration: 2040-05-27
Also published as: CN111626036A

Abstract

The invention discloses an image-text typesetting processing method in the technical field of big data. The method comprises establishing an image-text typesetting system that comprises a user terminal, an api gateway and a background server cluster; establishing, in the background server cluster, a PDF generator, a back-end web page generator, a typesetting service module, an article data service and transaction management service module, and a database; and having the user terminal communicate with the background server cluster through the api gateway.

Description

Image-text typesetting processing method

Technical Field

The invention relates to the technical field of big data, and in particular to an image-text typesetting processing method.

Background

The image-text typesetting methods currently on the market mainly rely either on professional typesetting software for manual typesetting, or on manually filling image-text content into a template preset by a system. Both are manual typesetting modes and consume considerable labor and time. As the technology has matured, automatic typesetting has mainly been applied to banner (banner advertisement) design and one-key typesetting of multiple images, while automatic typesetting based on a user's long image-text content remains relatively rare on the market.

The prior art has the following defects:

1. manual typesetting is time-consuming;

2. existing simple image-text typesetting techniques mainly focus on reasonable layout of pictures and on judging image-text positions, and offer no good solution for scaling pictures to fit the text, judging whether image-text content crosses pages, calculating image-text spacing, or identifying and merging blank content.

Disclosure of Invention

The invention aims to provide an image-text typesetting processing method, which solves the technical problem of automatically typesetting image-text content by judging the relative positions of the pictures and text in the original content and calculating the relative heights of the image-text paragraphs.

In order to achieve the purpose, the invention adopts the following technical scheme:

an image-text typesetting processing method comprises the following steps:

step 1: establishing an image-text typesetting system, wherein the image-text typesetting system comprises a user side, an api gateway and a background server cluster; establishing a PDF generator, a back-end web page generator, a typesetting service module, an article data service and transaction management service module and a database in the background server cluster; and having the user side communicate with the background server cluster through the api gateway;

step 2: a user inputs an original article through a user side, and the user side sends a typesetting request to the api gateway, wherein the typesetting request comprises a request instruction and a user side number;

the user inputs the original article through the user side, enters the print zone interface provided by the user side, and selects the article to be printed;

step 3: the api gateway verifies the request sent by the user side and checks the authority of the user side number: when the user side number meets the authority requirement, the api gateway forwards the typesetting request to the background server cluster; when the user side number does not meet the authority requirement, the api gateway feeds back authority failure information to the user side, and step 2 is executed again;

step 4: after the background server cluster receives the typesetting request, the typesetting service module sends a request for the article to the api gateway, and the api gateway obtains the original article from the user side and forwards it to the typesetting service module;

step 5: the typesetting service module performs JSON parsing on the original article, obtains the pictures in the original article, obtains the characters and the emoji by parsing the HTML, and thereby obtains a preprocessed text;

step 6: the typesetting service module typesets the preprocessed text according to the following method:

step S1: traversing the preprocessed text, and sequencing the pages according to paragraph division;

step S2: selecting a page according to the page sequencing order, and judging whether the current page contains characters or pictures: if yes, go to step S10; otherwise, go to step S3;

step S3: typesetting directly from the beginning of the next page, and judging whether the page content has the characters above and the picture below: if yes, go to step S6; if not, go to step S4;

step S4: putting the picture into the current page, calculating the space occupied by the picture on the current page, calculating the widths of all the characters, and judging whether all the characters can be put into the current page: if yes, putting in the characters and executing step S2; if not, go to step S5;

step S5: setting the minimum scaling of the picture, scaling the picture according to the minimum scaling, and judging whether all characters can be put into the current page after the picture is scaled: if yes, put in the characters and execute step S2; if not, the characters are put into the page to the maximum extent, and the redundant characters are put into the next paragraph, and step S2 is executed;

step S6: calculating the widths of all the characters in the text, and processing the characters according to the following method:

step A1: sequentially traversing the characters to sequentially obtain a character;

step A2: judging the character: if it is an emoji, converting the emoji into a picture and executing step A3; if it is an English character, executing step A4; if it is a Chinese character, executing step A5;

step A3: finding the end position of the emoji, marking that position and advancing the loop position accordingly, calculating the width of the emoji according to the width of a Chinese character, and executing step A6;

step A4: the width of each English character is stored in an array in advance; querying the array to obtain the width of the English character, and executing step A6;

step A5: Chinese characters are equal-width characters, each calculated as 13 px;

step A6: judging whether, after this character is added to the current line, the width of the line reaches the preset width: if yes, go to step A7; if not, saving the current processing content and executing step A9;

step A7: if the character is English, processing the end of the line and deleting the space characters at the end of the line; if the last character of the line is an English character, detecting whether that position is the end of a word, and if it is not, the line cannot hold the complete word, so the word is moved to the next line;

step A8: traversing the characters in sequence, and judging whether the next character can appear at the head of a line: if yes, returning the content and position of the current line, saving the current processed content, and executing step A1; if not, moving the last character of the current line to the next line, and if that character cannot be placed at the head of a line either, continuing to move, at most twice in total, then returning the content and position of the current line, saving the current processed content, and executing step A9;

step A9: judging whether all the characters are processed, if so, executing the step S7, and if not, executing the step A1;

step S7: judging whether the current page can put down all characters according to the width of the characters: if yes, go to step S8; if not, the current page is filled, a new page is opened to put the residual characters, and the step S7 is executed;

step S8: judging whether the current page can be put in pictures, if so, directly putting in pictures, and executing the step S2; otherwise, go to step S9;

step S9: reducing the picture according to the minimum scaling, and judging whether the reduced picture can be put into the current page: if yes, putting in the picture, scaling all the pictures of the current page a second time so that they occupy, to the maximum extent, the remaining space not taken by the characters, storing the current page, and executing step S2; if not, the characters are put on the current page and stored, the picture is put on the next page, and step S2 is executed;

step S10: acquiring a paragraph to be typeset, and calculating the total height of the paragraph to be typeset;

step S11: judging whether the picture is above the characters: if yes, executing step S12; otherwise, executing step S17;

step S12: typesetting the characters word by word, calculating the height of the typeset characters, and storing the characters in each line after typesetting;

step S13: calculating the typesetting height of the picture, and judging whether the conditions that horizontal and vertical pictures go on different pages and that a vertical picture takes one page are met: if yes, go to step S14; if not, the current content is typeset onto the next page, and step S2 is executed;

step S14: judging whether the current page can put down all the contents: if yes, put the current content in and execute step S2; otherwise, go to step S15;

step S15: judging whether the current page can hold all the contents after the picture is scaled according to the minimum scaling: if yes, the picture is put into the current page, all the pictures of the current page are scaled a second time so that they occupy, to the maximum extent, the remaining space not taken by the characters, the current page is stored, and step S2 is executed; otherwise, go to step S16;

step S16: putting the maximum amount of characters in the current page, and putting redundant characters in the following page, and executing the step S2;

step S17: calculating the height of the picture, and calculating the height after the characters are typeset and the width of each row of data;

step S18: applying the condition that horizontal and vertical pictures go on different pages, and judging whether the vertical picture can be placed on one page: if not, the vertical picture is placed on the next page, and step S2 is executed; if yes, go to step S19;

step S19: judging whether the current page can put all contents: if yes, all characters and pictures are put in, and step S2 is executed; otherwise, go to step S20;

step S20: judging whether the picture of the current page can be put down independently: if yes, go to step S21; if not, the picture is arranged in the next page, and step S2 is executed;

step S21: putting the picture into the current page, putting the characters into the new page, and executing the step S2;

step 7: after the full text of the original article is typeset according to the method in step 6, the back-end web page generator generates a corresponding html file, the PDF generator generates a corresponding PDF file, and the typeset full text is stored in the database, which allocates an address for the full text;

step 8: generating a catalog and page numbers for the typeset full text, generating a two-dimensional code or a bar code according to the address allocated by the database, calculating the thickness of the book formed from the article according to the number of pages, and automatically generating a cover;

step 9: the user side acquires the data in the database and displays the two-dimensional code or the bar code to the user; the user views the html file generated by the back-end web page generator by scanning the two-dimensional code or the bar code, and completes printing of the PDF file by sending a printing request.
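The preprocessing in step 5 (JSON parsing, picture extraction, HTML parsing and emoji detection) might be sketched as follows. This is only an illustration under stated assumptions: the JSON layout with a "sections" list holding "img" and "html" keys is hypothetical, since the text does not give the article schema, and the emoji check covers only a few common Unicode blocks.

```python
import json
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects the text nodes of an HTML fragment and drops the tags."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def html_to_text(fragment):
    parser = _TextExtractor()
    parser.feed(fragment)
    parser.close()
    return "".join(parser.parts)


def is_emoji(ch):
    # Rough check over a few common emoji blocks only (assumption).
    cp = ord(ch)
    return 0x1F300 <= cp <= 0x1FAFF or 0x2600 <= cp <= 0x27BF


def preprocess(article_json):
    """Turn the hypothetical article JSON into a list of paragraphs."""
    paragraphs = []
    for section in json.loads(article_json)["sections"]:
        text = html_to_text(section.get("html", ""))
        paragraphs.append({
            "picture": section.get("img"),  # None when the section has no picture
            "text": text,
            "emoji_positions": [i for i, ch in enumerate(text) if is_emoji(ch)],
        })
    return paragraphs
```

Each returned paragraph keeps the picture reference, the plain text and the emoji positions together, so the later width calculation can treat emoji separately from Chinese and English characters.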

Preferably, the background server cluster adopts an Alibaba Cloud host as the main operating environment.

Preferably, in executing step S4, the widths of all characters are calculated as follows: Chinese uses an equal-width font, and its width is measured on Chrome with the PingFang SC font at size 13 as the reference; English letters, digits and punctuation are measured on Chrome with the "Times New Roman" font at size 13 as the reference.
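A minimal sketch of such a width lookup is given below. The 13 px width for Chinese characters comes from step A5; the per-character widths in the table and the fallback width are hypothetical placeholders, standing in for values that would be measured once in Chrome with the reference fonts named above.

```python
CJK_WIDTH_PX = 13.0  # step A5: Chinese characters are equal-width, 13 px each

# Hypothetical sample values; real entries would come from measurement.
LATIN_WIDTHS_PX = {" ": 3.25, "i": 3.6, "l": 3.6, "m": 10.1, "W": 12.3, ".": 3.25}
DEFAULT_LATIN_WIDTH_PX = 6.5  # fallback for characters missing from the table (assumption)


def is_cjk(ch):
    """True for characters in the CJK Unified Ideographs block."""
    return "\u4e00" <= ch <= "\u9fff"


def char_width(ch):
    """Width of one character in pixels."""
    if is_cjk(ch):
        return CJK_WIDTH_PX
    return LATIN_WIDTHS_PX.get(ch, DEFAULT_LATIN_WIDTH_PX)


def text_width(text):
    """Total width of a string, summing the per-character widths."""
    return sum(char_width(ch) for ch in text)
```

Keeping the measured widths in a table mirrors step A4, where the English character widths are stored in an array and looked up rather than measured at typesetting time.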

Preferably, punctuation marks cannot appear at the head of a line when step A8 is performed.

Preferably, in executing step S6, the justify field at the front end is used so that the text aligns adaptively within the desired width.

The image-text typesetting processing method solves the technical problem of automatically typesetting an article into a book with one key. The user only needs to select the article to be printed in the Meipian APP, so that ordinary people can conveniently make a book of their own. Typesetting is fully automatic and requires no manual work; its cost is far lower than that of manual typesetting and its speed is much higher, which greatly reduces production cost and improves productivity. Spacing is refined into picture-text, picture-picture and text-text spacing, and the spacing between image-text paragraphs is applied only between one whole image-text paragraph and another, so that pictures are positioned more reasonably within the text. The invention scales pictures at the minimum scaling and enlarges them a second time after typesetting, which keeps the article compact while preserving picture clarity.

Drawings

FIG. 1 is a system architecture diagram of the present invention;

FIG. 2 is a first flowchart of the present invention;

FIG. 3 is a second flowchart of the present invention;

FIG. 4 is a third flowchart of the present invention;

FIG. 5 is a fourth flowchart of the present invention;

FIG. 6 is a fifth flowchart of the present invention.

Detailed Description

As shown in fig. 1-6, a method for typesetting image and text includes the following steps:

In this embodiment, the user side is a user-side APP.

The image-text typesetting is based on image-text paragraphs in APP, and a paragraph with image-text in APP is called an image-text paragraph. In order to make typesetting beautiful, the technology used by the invention comprises text length calculation, English word segmentation, punctuation segmentation, text in-line self-adaptation, picture scaling calculation, page crossing processing, image-text interval calculation, blank combination, emoji expression processing, catalog generation, article page number generation, two-dimensional code and bar code generation and cover generation.

Regarding step A7: if the character is English, the end of the line is processed and the space characters at the end of the line are deleted. To avoid an English word being split across two lines, if the last character of the line is an English character, it is checked whether that position is the end of a word; if it is not, the line cannot hold the whole word, and the word is moved to the next line.

For aesthetic reasons, a punctuation mark cannot appear at the beginning of a line of text; if a punctuation mark would appear at the head of a line, the character preceding it is pulled down to serve as the head of that line.
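The two adjustments described above (keeping an English word on one line and keeping punctuation off the head of a line) can be sketched as a correction applied to a tentative break position. The function name, the punctuation set and the way the break index is represented are all assumptions made for illustration; the limit of two moves follows step A8.

```python
# Punctuation that must not start a line (illustrative, not exhaustive).
NO_LINE_START = set(",.;:!?)]}，。；：！？、）》」』”’")


def is_ascii_letter(ch):
    return ch.isascii() and ch.isalpha()


def adjust_break(text, pos):
    """Return a break index <= pos that respects both rules.

    text[:index] stays on the current line, text[index:] moves down.
    """
    if pos <= 0 or pos >= len(text):
        return pos
    # Rule 1: do not split an English word across two lines.
    if is_ascii_letter(text[pos - 1]) and is_ascii_letter(text[pos]):
        start = pos
        while start > 0 and is_ascii_letter(text[start - 1]):
            start -= 1
        if start > 0:  # only move the word if the line keeps some content
            pos = start
    # Rule 2: no punctuation at the head of a line; pull the previous
    # character down, at most twice (as in step A8).
    moves = 0
    while moves < 2 and pos > 1 and text[pos] in NO_LINE_START:
        pos -= 1
        moves += 1
    return pos
```

For example, the current line would then be taken as `text[:pos].rstrip(" ")`, which also removes the trailing space characters mentioned in step A7.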

To keep an image-text paragraph internally compact while avoiding excessive blank space on a single page, the image-text paragraph is treated as a whole during typesetting. This often means the paragraph cannot fit on the current page. For appearance, the whole paragraph can be moved to the next page, but this may leave too much blank space on the current page.

To handle this, the invention sets a minimum image-text amount per page. It first checks whether the current page has reached the minimum image-text amount required for one page; if so, the whole image-text paragraph being processed is displayed starting from the next page.

If the current page has not reached the minimum requirement, the image-text paragraph is split, with part placed on the current page and the rest on the next page. If the paragraph has the picture above and the text below, the picture is placed first, followed by as much text as fits; if it has the text above and the picture below, text is added to the current page until the page is full, and the remaining content is placed on the next page.

For aesthetic reasons, the invention uses 3 kinds of spacing: picture-text, picture-picture and text-text. The spacing between image-text paragraphs is applied only between one whole image-text paragraph and another whole image-text paragraph.
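The page-crossing decision described above might look like the following sketch. The text does not state the minimum image-text amount, so the `min_fill_ratio` default of 0.6 is purely hypothetical, as are the function names and the string labels they return.

```python
def decide_placement(used_h, paragraph_h, page_h, min_fill_ratio=0.6):
    """Decide how to place an image-text paragraph on the current page.

    used_h      height already occupied on the current page
    paragraph_h total height of the image-text paragraph
    page_h      usable height of one page
    """
    if used_h + paragraph_h <= page_h:
        return "fit"        # the whole paragraph fits on the current page
    if used_h >= min_fill_ratio * page_h:
        return "next_page"  # page is full enough: move the paragraph as a whole
    return "split"          # page too empty: fill it and continue on the next page


def split_order(picture_first):
    """Order in which the parts are placed when a paragraph is split."""
    return ["picture", "text"] if picture_first else ["text", "picture"]
```

The threshold trades compactness against blank space: a higher ratio splits more paragraphs across pages, while a lower ratio moves more paragraphs whole and leaves larger gaps.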

For a section of text, sequentially traversing each character, calculating the width sum, if the width sum reaches the maximum width of a line of characters, segmenting and recording the position, namely, putting the text in front of the position into the current line, and putting the text behind the position into the next line for processing.
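The width-sum segmentation just described can be sketched as a greedy split. The `width_of` callback is a hypothetical stand-in for the per-character width lookup; the word-boundary and punctuation adjustments are deliberately left out of this sketch.

```python
def split_lines(text, max_width, width_of):
    """Greedy split: start a new line when the next character would overflow.

    width_of is a function mapping one character to its width in pixels.
    """
    lines, current, used = [], [], 0.0
    for ch in text:
        w = width_of(ch)
        if current and used + w > max_width:
            lines.append("".join(current))  # record the segment before this position
            current, used = [], 0.0
        current.append(ch)
        used += w
    if current:
        lines.append("".join(current))
    return lines
```

With a 39 px line and 13 px Chinese characters, for instance, each line would hold three characters before the split position is recorded.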

Preferably, in step S6, because the width measurements contain errors, the errors accumulate when the text is long, so that characters that should be aligned end up misaligned; the invention therefore uses the justify field at the front end so that the characters align adaptively within the expected width.

When handling picture scaling, for image-text paragraphs composed of the same picture with different amounts of text, the display of the paragraph is controlled by scaling the size of the picture.

First, a minimum scaling is set for the picture. If the image-text paragraph can be completely displayed on the page at the minimum scaling, the total space required by the characters is calculated first, the character space is then subtracted from the total page space to obtain the picture space, and finally the picture is scaled to the size of that space.
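As an illustration only, and assuming the front-end field mentioned above corresponds to the CSS declaration `text-align: justify`, the back-end page generator could emit each precomputed line as a fixed-width block and let the browser absorb the measurement error. The function name and the inline-style approach are hypothetical; `text-align-last` is added because a one-line block is otherwise treated as a last line and left unjustified.

```python
from html import escape


def render_line(text, width_px, font_px=13):
    """Render one precomputed line as a fixed-width, justified block."""
    style = (
        f"width:{width_px}px;font-size:{font_px}px;"
        "text-align:justify;text-align-last:justify;"
    )
    return f'<div style="{style}">{escape(text)}</div>'
```

Because the browser distributes the leftover space itself, small per-character errors no longer accumulate into visibly ragged line ends.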

In the present embodiment, the minimum scaling is 0.88.
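The scaling calculation can be sketched as below, working with heights only. The minimum scaling of 0.88 is the value of this embodiment; the function name, the height-only model and the cap at the original size (scale 1.0) are assumptions of this sketch.

```python
MIN_SCALE = 0.88  # minimum scaling used in this embodiment


def fit_picture(page_h, text_h, pic_h, min_scale=MIN_SCALE):
    """Return the scale for the picture, or None if the paragraph cannot
    fit on the page even at the minimum scaling.

    page_h: total page space, text_h: space needed by the characters,
    pic_h: height of the picture at its original size.
    """
    if text_h + pic_h * min_scale > page_h:
        return None                   # must be displayed across pages
    picture_space = page_h - text_h   # page space minus character space
    return min(1.0, picture_space / pic_h)  # cap at original size (assumption)
```

For a page of 1000, characters needing 400 and a picture of 650, the picture space is 600, so the picture is scaled by 600/650, which is above the 0.88 minimum.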

If the picture cannot be placed on the current page even after being scaled to the minimum scaling, the image-text paragraph must be displayed across multiple pages. In that case, so that the picture is displayed as large as possible, the picture is placed at the maximum size the current page allows, the characters are then placed gradually, and the part that does not fit is carried over to the next page for processing.

A small number of articles contain large blank areas added during writing, for example a passage with too many consecutive spaces or invisible characters, or too many consecutive line breaks; these produce large blank areas and seriously affect appearance. To solve this, regular expressions are used to eliminate consecutive invisible blank characters, keeping at most one blank character between characters. Consecutive line breaks are merged into a single line break to avoid large blank areas.
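A rough sketch of this multi-page case follows, assuming a picture-above-text paragraph placed on an empty page. The function name and return shape are hypothetical, and "maximum size" is read here as the original size, reduced only if the picture is taller than the page.

```python
def split_overflow(page_h, pic_h, line_heights):
    """Place the picture first, then text lines until the page is full.

    Returns (picture_scale, lines_on_this_page, lines_carried_over).
    """
    scale = min(1.0, page_h / pic_h)  # largest size the page allows
    remaining = page_h - pic_h * scale
    placed = 0
    for h in line_heights:
        if h > remaining:
            break                     # this line and the rest go to the next page
        remaining -= h
        placed += 1
    return scale, placed, len(line_heights) - placed
```

The carried-over lines would then be handled as ordinary text on the following page.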
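The blank merging could be done with two regular expressions, as in the sketch below. The set of characters treated as invisible blanks (space, tab, no-break space, zero-width space, ideographic space) is an assumption; the text does not list them.

```python
import re

# Characters treated as invisible blanks in this sketch (assumption).
_BLANK_RUN = re.compile(r"[ \t\u00a0\u200b\u3000]+")
# A line break followed by further breaks, optionally separated by one blank.
_BREAK_RUN = re.compile(r"\r?\n(?: ?\r?\n)+")


def collapse_blanks(text):
    """Keep at most one blank in a run and merge consecutive line breaks."""
    text = _BLANK_RUN.sub(" ", text)   # runs of blanks -> a single space
    text = _BREAK_RUN.sub("\n", text)  # runs of line breaks -> one line break
    return text
```

Running the blank pass first means lines that contain only invisible characters shrink to a single space and are then absorbed by the line-break pass.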

As shown in fig. 2-6, F1, F2, F3, F4, F5, F6, and F7 are paging connection symbols of the flowchart.

In FIG. 1, Puppeteer is a Node library officially produced by Google, which controls headless Chrome via the DevTools protocol.

TradeManager is the international version of AliWangWang, article is a semantic tag introduced by HTML5, and MS SQL, Oracle and Redis are databases.

iOS, Android, HTML and CSS are all client running environments.

Claims

1. An image-text typesetting processing method, characterized by comprising the following steps:

2. The image-text typesetting processing method according to claim 1, characterized in that: the background server cluster adopts an Alibaba Cloud host as the main operating environment.

3. The image-text typesetting processing method according to claim 1, characterized in that: in executing step S4, the widths of all characters are calculated as follows: Chinese uses an equal-width font, and its width is measured on Chrome with the PingFang SC font at size 13 as the reference; English letters, digits and punctuation are measured on Chrome with the "Times New Roman" font at size 13 as the reference.

4. The image-text typesetting processing method according to claim 1, characterized in that: in executing step A8, punctuation marks cannot appear at the head of a line.

5. The image-text typesetting processing method according to claim 1, characterized in that: in executing step S6, the justify field at the front end is used so that the text aligns adaptively within the desired width.