Disclosure of Invention
The main object of the invention is to provide a web page-based PDF generation method, device, equipment and storage medium, so as to solve the technical problem in the prior art that PDF files generated from web pages are of poor quality.
In order to achieve the above object, the present invention provides a web page-based PDF generation method, which includes the following steps:
When a PDF generation instruction is received, determining a web page to be processed and a PDF file page height according to the PDF generation instruction;
Acquiring page information of the webpage to be processed;
Determining page elements in the webpage to be processed and element heights corresponding to the page elements according to the page information;
Determining a screenshot area according to the page height of the PDF file, the page element and the element height;
and obtaining a picture to be processed according to the screenshot area, and generating a PDF file according to the picture to be processed.
Optionally, the determining a screenshot area according to the PDF file page height, the page element and the element height includes:
traversing the page elements, and taking the traversed page elements as page elements to be processed;
Taking the element height corresponding to the page element to be processed as the element height to be processed;
and determining a screenshot area according to the page height of the PDF file, the page element to be processed and the element height to be processed.
Optionally, the determining the screenshot area according to the PDF file page height, the to-be-processed page element and the to-be-processed element height includes:
Sorting the page elements to be processed to obtain element sorting results;
Progressively adding the heights of the elements to be processed according to the element sequencing result to obtain the total heights of the elements;
comparing the total height of the elements with the page height of the PDF file to determine critical elements;
and determining a screenshot area according to the critical element.
Optionally, the comparing the total height of the element with the page height of the PDF file to determine a critical element includes:
Comparing the total height of the elements with the page height of the PDF file;
And when the total height of the elements is larger than the page height of the PDF file, selecting critical elements from the elements to be processed.
Optionally, the determining a screenshot area according to the critical element includes:
Selecting elements to be compared which are arranged before the critical elements from the elements to be processed according to the element sorting result;
Searching the height of the element to be compared corresponding to the element to be compared, and calculating the height to be compared according to the height of the element to be compared;
and determining a screenshot area according to the height to be compared, the critical element and the element to be compared.
Optionally, the determining a screenshot area according to the height to be compared, the critical element and the element to be compared includes:
calculating a height difference value according to the page height of the PDF file and the height to be compared, and judging whether the critical element has a subelement or not;
When the critical element has sub-elements, acquiring the heights of the sub-elements corresponding to the sub-elements;
And determining a screenshot area according to the sub-element, the height of the sub-element, the height difference value and the element to be compared.
Optionally, after the determining whether the critical element has a child element, the method further includes:
when the critical element does not have sub-elements, acquiring an element region to be compared corresponding to the element to be compared;
And determining a screenshot area according to the element area to be compared.
Optionally, the determining a screenshot area according to the sub-element, the sub-element height, the height difference value and the element to be compared includes:
Sequencing the sub-elements to obtain a sub-element sequencing result;
determining a critical subelement according to the subelement sequencing result, the subelement height and the height difference value;
and determining a screenshot area according to the critical subelement and the element to be compared.
Optionally, the determining a screenshot area according to the critical subelement and the element to be compared includes:
judging whether the critical sub-element has a grandchild element;
when the critical sub-element has a grandchild element, acquiring a grandchild element height corresponding to the grandchild element;
selecting, from the sub-elements according to the sub-element sorting result, the sub-elements to be compared that are arranged before the critical sub-element;
and determining a screenshot area according to the grandchild element, the grandchild element height and the sub-elements to be compared.
Optionally, after the judging whether the critical sub-element has a grandchild element, the method further includes:
when the critical sub-element has no grandchild element, acquiring a sub-element region to be compared corresponding to the sub-element to be compared;
and determining a screenshot area according to the element region to be compared and the sub-element region to be compared.
Optionally, before traversing the page element and taking the traversed page element as the page element to be processed, the method further includes:
determining an element selection strategy according to the PDF generation instruction;
correspondingly, the traversing the page element, taking the traversed page element as the page element to be processed, includes:
determining a target page element according to the element selection strategy and the page element;
traversing the target page element, and taking the traversed target page element as a page element to be processed.
Optionally, the determining the target page element according to the element selection policy and the page element includes:
determining target element information according to the element selection strategy;
And selecting a target page element from the page elements according to the target element information.
Optionally, the obtaining the to-be-processed picture according to the screenshot area, and generating the PDF file according to the to-be-processed picture, includes:
Performing screenshot processing on the webpage to be processed according to the screenshot area to obtain a picture to be processed;
Sequencing the pictures to be processed to obtain a picture sequencing result;
and carrying out format conversion on the pictures to be processed according to the picture sorting result so as to generate a PDF file.
In addition, in order to achieve the above object, the present invention also provides a PDF generating device based on a web page, where the PDF generating device based on a web page includes:
the instruction receiving module is used for determining, when a PDF generation instruction is received, a web page to be processed and a PDF file page height according to the PDF generation instruction;
The information acquisition module is used for acquiring page information of the webpage to be processed;
The element determining module is used for determining page elements in the webpage to be processed and element heights corresponding to the page elements according to the page information;
the screenshot area module is used for determining a screenshot area according to the page height of the PDF file, the page elements and the element heights;
and the file generation module is used for obtaining a picture to be processed according to the screenshot area and generating a PDF file according to the picture to be processed.
Optionally, the screenshot area module is further configured to traverse the page element, and take the traversed page element as a page element to be processed; taking the element height corresponding to the page element to be processed as the element height to be processed; and determining a screenshot area according to the page height of the PDF file, the page element to be processed and the element height to be processed.
Optionally, the screenshot area module is further configured to sort the page elements to be processed to obtain an element sorting result; progressively adding the heights of the elements to be processed according to the element sequencing result to obtain the total heights of the elements; comparing the total height of the elements with the page height of the PDF file to determine critical elements; and determining a screenshot area according to the critical element.
Optionally, the screenshot area module is further configured to compare the total height of the element with the height of the PDF file page; and when the total height of the elements is larger than the page height of the PDF file, selecting critical elements from the elements to be processed.
Optionally, the screenshot area module is further configured to select an element to be compared, which is ranked before the critical element, from the elements to be processed according to the element ranking result; searching the height of the element to be compared corresponding to the element to be compared, and calculating the height to be compared according to the height of the element to be compared; and determining a screenshot area according to the height to be compared, the critical element and the element to be compared.
In addition, in order to achieve the above object, the present invention also proposes a web page-based PDF generating apparatus, including: a memory, a processor, and a web page-based PDF generation program that is stored in the memory and executable on the processor, wherein the web page-based PDF generation program is configured to implement the steps of the web page-based PDF generation method described above.
In addition, in order to achieve the above object, the present invention also proposes a storage medium having stored thereon a web page-based PDF generation program which, when executed by a processor, implements the steps of the web page-based PDF generation method described above.
According to the web page-based PDF generation method provided by the invention, when a PDF generation instruction is received, a web page to be processed and a PDF file page height are determined according to the PDF generation instruction; page information of the web page to be processed is acquired; page elements in the web page to be processed and the element heights corresponding to the page elements are determined according to the page information; a screenshot area is determined according to the PDF file page height, the page elements and the element heights; and a picture to be processed is obtained according to the screenshot area, and a PDF file is generated according to the picture to be processed. Because the web page to be processed and the PDF file page height are determined from the PDF generation instruction, and the screenshot area is determined from the page elements and their heights, the document specification of the PDF file required by the user can be determined according to the PDF generation instruction and reasonable paging can be carried out according to the screenshot area, which improves the quality of the PDF file generated from the web page.
Detailed Description
It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
Referring to fig. 1, fig. 1 is a schematic structural diagram of a PDF generating device based on web pages in a hardware running environment according to an embodiment of the present invention.
As shown in fig. 1, the web page-based PDF generation apparatus may include: a processor 1001, such as a central processing unit (CPU), a communication bus 1002, a user interface 1003, a network interface 1004, and a memory 1005. The communication bus 1002 is used to enable connection and communication between these components. The user interface 1003 may include a display and an input unit such as keys, and may optionally also include a standard wired interface and a wireless interface. The network interface 1004 may optionally include a standard wired interface and a wireless interface (e.g., a Wi-Fi interface). The memory 1005 may be a high-speed random access memory (RAM) or a non-volatile memory, such as a disk memory. The memory 1005 may optionally also be a storage device separate from the processor 1001 described above.
Those skilled in the art will appreciate that the device architecture shown in fig. 1 does not constitute a limitation of the web page based PDF generation device, and may include more or fewer components than illustrated, or may combine certain components, or may be a different arrangement of components.
As shown in fig. 1, an operating system, a network communication module, a user interface module, and a PDF generation program based on web pages may be included in the memory 1005 as one type of storage medium.
In the PDF generating device based on the web page shown in fig. 1, the network interface 1004 is mainly used for connecting to an external network and performing data communication with other network devices; the user interface 1003 is mainly used for connecting user equipment and communicating data with the user equipment; the apparatus of the present invention calls the PDF generation program based on the web page stored in the memory 1005 through the processor 1001, and executes the PDF generation method based on the web page provided by the embodiment of the present invention.
Based on the hardware structure, the embodiment of the PDF generation method based on the webpage is provided.
Referring to fig. 2, fig. 2 is a flowchart of a first embodiment of a PDF generating method based on web pages according to the present invention.
In a first embodiment, the web page-based PDF generation method includes the steps of:
Step S10, when a PDF generation instruction is received, determining a web page to be processed and a PDF file page height according to the PDF generation instruction.
It should be noted that the execution body of this embodiment may be a web page-based PDF generation device, or may be another device capable of implementing the same or similar functions, which is not limited in this embodiment. This embodiment is described by taking the web page-based PDF generation device as an example, where the web page-based PDF generation device may be a computer device.
It should be understood that the web page to be processed is the web page from which the PDF file is to be generated, and the PDF file page height is the page height of the generated PDF file. Documents of different specifications have different page heights: for example, a document of A4 specification has the A4 page height, a document of A5 specification has the A5 page height, and the page heights of landscape A4 and portrait A4 also differ. Therefore, the PDF file page height may be determined according to the PDF generation instruction so as to determine the document specification of the PDF file, where the document specification may be a predetermined specification such as A4 or A5, or a specification (width × height) defined by the user, which is not limited in this embodiment.
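By way of illustration only, the mapping from a document specification to a page size might be sketched as follows. The names `PageSpec`, `Orientation` and `resolvePageSize` are hypothetical and do not come from this embodiment; the preset sizes are the ISO 216 portrait dimensions in millimetres, and the sketch assumes that a user-defined specification is used exactly as given.

```typescript
// Page sizes in millimetres. Presets are ISO 216 portrait dimensions.
interface PageSize {
  width: number;
  height: number;
}

type PresetName = "A4" | "A5";
type PageSpec = PresetName | PageSize; // preset name or user-defined width x height
type Orientation = "portrait" | "landscape";

const PRESETS: Record<PresetName, PageSize> = {
  A4: { width: 210, height: 297 },
  A5: { width: 148, height: 210 },
};

// Hypothetical helper: resolves the page size requested by a PDF generation
// instruction. Orientation only applies to presets (an assumption of this sketch).
function resolvePageSize(spec: PageSpec, orientation: Orientation = "portrait"): PageSize {
  if (typeof spec !== "string") {
    return { ...spec };
  }
  const { width, height } = PRESETS[spec];
  return orientation === "landscape"
    ? { width: height, height: width }
    : { width, height };
}
```

In this sketch, landscape A4 and portrait A4 resolve to different page heights (210 and 297), which is the distinction the paragraph above draws.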
It should be understood that the user may open a plurality of web pages at the same time, and one of them may be selected as the web page to be processed through the PDF generation instruction; for example, the currently displayed web page may be taken as the web page to be processed. The uniform resource locator (URL) of a web page may also be specified by the PDF generation instruction, and the web page identified by the URL is then taken as the web page to be processed, which is not limited in this embodiment.
In the embodiment, the designated web page can be exported to be a PDF document, and the page height of the PDF file is determined according to the PDF generation instruction input by the user, so that the generated PDF file is more diversified, and the customization requirement of the user is met.
Step S20, obtaining page information of the webpage to be processed.
It should be understood that each page has its corresponding page information, which may include information such as a page element and an element height corresponding to the page element, and may also include other information, which is not limited in this embodiment. After the webpage to be processed is determined, the webpage information of the webpage to be processed can be further acquired.
And step S30, determining page elements in the webpage to be processed and element heights corresponding to the page elements according to the page information.
It should be understood that the page elements in this embodiment may be elements such as text and pictures in the web page, and may also be other elements, which is not limited in this embodiment. The element height corresponding to a page element refers to the height value of that page element; for example, when the page element is a picture, the height of the picture is the element height corresponding to the page element.
It should be understood that a page element in this embodiment may also be described as a DOM node; the page elements here are equivalent to the DOM nodes of the page. According to the HTML DOM standard, everything in an HTML document is a node, and the HTML DOM treats the HTML document as a tree structure called a node tree. Each node may therefore have child nodes, and each child node may in turn have grandchild nodes. This embodiment uses the terms sub-element and grandchild element, where a sub-element corresponds to a child node and a grandchild element corresponds to a grandchild node.
And S40, determining a screenshot area according to the page height of the PDF file, the page element and the element height.
It can be understood that the screenshot area can be determined according to the PDF file page height, the page elements and the element heights, and a screenshot is taken according to the screenshot area to obtain the picture to be processed. In this embodiment, whether paging is needed can be judged according to the PDF file page height, the page elements and the element heights, and the screenshot area is determined according to the actual situation.
And S50, obtaining a picture to be processed according to the screenshot area, and generating a PDF file according to the picture to be processed.
It should be appreciated that after the screenshot area is determined, the html2canvas plug-in may be used to perform the screenshot and then jsPDF may be used to generate the PDF file.
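By way of illustration only, the step of turning ordered pictures into PDF pages might be sketched as follows. The `Picture` type, the `PdfWriter` interface and the `writePictures` function are hypothetical; `PdfWriter` is merely shaped after the `addImage` and `addPage` methods of jsPDF so that a jsPDF document could be passed in, and the sketch assumes that the document starts with one empty page and that each picture has already been scaled to fit the page.

```typescript
// A captured screenshot, already encoded (for example as a PNG data URL).
interface Picture {
  order: number;    // position of the picture in the final document
  dataUrl: string;
  widthMm: number;  // size at which the picture is drawn on the page
  heightMm: number;
}

// Hypothetical minimal writer interface, shaped after jsPDF's addImage/addPage.
interface PdfWriter {
  addImage(imageData: string, format: string, x: number, y: number,
           width: number, height: number): unknown;
  addPage(): unknown;
}

// Sorts the pictures and writes one picture per page.
// Assumes the writer already holds one empty page when it is passed in.
function writePictures(pdf: PdfWriter, pictures: Picture[]): number {
  const sorted = [...pictures].sort((a, b) => a.order - b.order);
  sorted.forEach((picture, index) => {
    if (index > 0) {
      pdf.addPage(); // every picture after the first starts a new page
    }
    pdf.addImage(picture.dataUrl, "PNG", 0, 0, picture.widthMm, picture.heightMm);
  });
  return sorted.length; // number of pages written
}
```

Passing the writer in as a parameter keeps the paging logic independent of any particular PDF library.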
In this embodiment, the web page to be processed and the PDF file page height are determined according to the PDF generation instruction, the page elements in the web page to be processed and their corresponding element heights are determined, the screenshot area is then determined to obtain the picture to be processed, and the PDF file is generated according to the picture to be processed. In this way, the document specification of the PDF file required by the user can be determined according to the PDF generation instruction, reasonable paging is performed according to the screenshot area, and the quality of the PDF file generated from the web page is improved.
In an embodiment, as shown in fig. 3, a second embodiment of the web page-based PDF generating method according to the present invention is provided based on the first embodiment, and the step S40 includes:
step S401, traversing the page element, and taking the traversed page element as a page element to be processed.
In a specific implementation, for example, when there are 5 page elements in the web page to be processed, namely page elements 1, 2, 3, 4 and 5, the page elements may be traversed and the 5 traversed page elements taken as the page elements to be processed; in this case, all page elements in the web page to be processed are converted into the PDF file by default.
Further, in actual use, the user may not want to export all page elements in the web page but only some of them. In order to export only the selected page elements and combine them into the PDF file, before step S401, the method further includes:
determining an element selection strategy according to the PDF generation instruction;
It should be understood that when the user inputs the PDF generation instruction, an element selection policy may further be added to the PDF generation instruction, and the page elements that the user wants to export may be determined according to the element selection policy. For example, when the element selection policy is to export page elements 1-3, the PDF file is generated from page elements 1-3; when the element selection policy is to export page elements 1, 3 and 5, the PDF file is generated from page elements 1, 3 and 5. In this embodiment, the picture to be processed is obtained by taking a screenshot, and the PDF file is then generated from the picture to be processed, so that the page elements can be flexibly selected according to user requirements.
Accordingly, the step S401 includes:
determining target element information according to the element selection strategy; selecting a target page element from the page elements according to the target element information; traversing the target page element, and taking the traversed target page element as a page element to be processed.
It should be appreciated that the target element information may be determined according to an element selection policy, and the target page element may be selected from the page elements according to the target element information, for example, when the element selection policy is to derive the page elements 1, 3, 5, the target page element may be determined to be the page elements 1, 3, 5. After determining the target page element, traversing the target page element, and taking the traversed target page element as the page element to be processed.
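By way of illustration only, selecting the target page elements might be sketched as follows. The `SelectionPolicy` shape (a list of element ids) and the function `selectTargetElements` are hypothetical; this embodiment does not prescribe how the policy is encoded.

```typescript
interface PageElement {
  id: number;
  height: number;
}

// Hypothetical policy shape: the ids of the page elements the user wants to export.
interface SelectionPolicy {
  ids: number[];
}

// Returns the target page elements. When no policy is given, every page element
// is exported, matching the default behaviour described for step S401.
function selectTargetElements(
  elements: PageElement[],
  policy?: SelectionPolicy,
): PageElement[] {
  if (!policy) {
    return [...elements];
  }
  const wanted = new Set(policy.ids);
  return elements.filter((element) => wanted.has(element.id));
}
```

Filtering the original list, rather than iterating over the policy, preserves the page order of the selected elements.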
Step S402, taking the element height corresponding to the page element to be processed as the element height to be processed.
It is understood that, after determining the page element to be processed, the element height corresponding to the page element to be processed may be used as the element height to be processed.
Step S403, determining a screenshot area according to the PDF file page height, the page element to be processed, and the element height to be processed.
It should be understood that, whether paging is required or not may be determined according to the page height of the PDF file, the page element to be processed, and the element height to be processed, so as to determine the screenshot area.
Further, in order to accurately determine whether paging is required, and perform reasonable paging when paging is required, the step S403 includes:
sorting the page elements to be processed to obtain element sorting results; progressively adding the heights of the elements to be processed according to the element sequencing result to obtain the total heights of the elements; comparing the total height of the elements with the page height of the PDF file to determine critical elements; and determining a screenshot area according to the critical element.
It should be understood that the page elements to be processed may be ordered to obtain element ordering results, and in this embodiment, the ordering is illustrated in the order from top to bottom. Then, the element height to be processed is added according to the element sequencing result to obtain the total element height, the total element height is compared with the page height of the PDF file, whether paging is needed or not is judged according to the comparison result, and when paging is not needed, a screenshot area is determined according to the element of the page to be processed; when paging is needed, selecting a critical element from elements to be processed, and determining a screenshot area according to the critical element.
In a specific implementation, for example, as shown in fig. 4 (a first page schematic diagram), the PDF file page height may be denoted PAGEHEIGHT and the element heights of page elements 1-5 may be denoted h1, h2, h3, h4 and h5, so that the total element height is htotal = h1 + h2 + h3 + h4 + h5. Comparing the total element height with the PDF file page height shows that htotal is less than PAGEHEIGHT, so paging is not required and a single page suffices. Page elements 1-5 may therefore be taken as the screenshot area, a screenshot is taken to obtain the picture to be processed, and the PDF file is then generated.
In a specific implementation, for example, as shown in fig. 5 (a second page schematic diagram), htotal > PAGEHEIGHT, so paging is required. The critical element is found to be page element 5, and page element 5 needs to be processed further to determine the screenshot area.
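By way of illustration only, the accumulation and comparison described above might be sketched as follows. The `top` field (vertical offset used for the top-to-bottom ordering) and the function name `findCriticalElement` are assumptions of this sketch. The critical element is taken to be the first element at which the running total exceeds the page height.

```typescript
interface PageElement {
  id: number;
  top: number;    // vertical offset in the page, used for top-to-bottom ordering
  height: number;
}

// Returns the critical element, i.e. the first element (in top-to-bottom order)
// at which the accumulated height exceeds pageHeight, or null when every
// element fits on a single page and no paging is needed.
function findCriticalElement(
  elements: PageElement[],
  pageHeight: number,
): PageElement | null {
  const sorted = [...elements].sort((a, b) => a.top - b.top);
  let total = 0;
  for (const element of sorted) {
    total += element.height;
    if (total > pageHeight) {
      return element;
    }
  }
  return null;
}
```

With five elements of height 100 each, a page height of 600 yields no critical element (the fig. 4 case), while a page height of 450 yields page element 5 (the fig. 5 case).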
Further, in order to make paging more reasonable and finer when paging is required, the determining the screenshot area according to the critical element includes:
selecting elements to be compared which are arranged before the critical elements from the elements to be processed according to the element sorting result; searching the height of the element to be compared corresponding to the element to be compared, and calculating the height to be compared according to the height of the element to be compared; calculating a height difference value according to the page height of the PDF file and the height to be compared, and judging whether the critical element has a subelement or not; when the critical element has sub-elements, acquiring the heights of the sub-elements corresponding to the sub-elements; and determining a screenshot area according to the sub-element, the height of the sub-element, the height difference value and the element to be compared.
It should be understood that the element to be compared arranged before the critical element can be selected from the elements to be processed according to the element sorting result, then the height to be compared is calculated according to the height of the element to be compared corresponding to the element to be compared, and the height difference is calculated according to the page height of the PDF file and the height to be compared.
In a specific implementation, for example, the critical element is page element 5, and page elements 1-4 are arranged before page element 5, so page elements 1-4 are taken as the elements to be compared and h1, h2, h3 and h4 as the heights of the elements to be compared. The height to be compared is obtained by adding h1, h2, h3 and h4, and the height difference can then be calculated as height difference = PAGEHEIGHT - (h1 + h2 + h3 + h4).
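By way of illustration only, the height difference calculation might be sketched as follows; the function name `heightDifference` is hypothetical.

```typescript
// Computes the space left on the current page after the elements to be compared
// (those arranged before the critical element) have been placed.
function heightDifference(pageHeight: number, heightsToCompare: number[]): number {
  const heightToCompare = heightsToCompare.reduce((sum, h) => sum + h, 0);
  return pageHeight - heightToCompare;
}
```

For example, with PAGEHEIGHT = 450 and h1 = h2 = h3 = h4 = 100, the height difference is 50; this is the space against which the sub-elements of the critical element are subsequently compared.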
It can be understood that a page element may contain sub-elements, and a sub-element may in turn contain grandchild elements. It may therefore be judged whether the critical element has sub-elements; when it does, the sub-element heights corresponding to the sub-elements are obtained, and the screenshot area is then determined according to the sub-elements, the sub-element heights, the height difference and the elements to be compared.
After the judging whether the critical element has the sub-element, the method further comprises:
when the critical element does not have sub-elements, acquiring an element region to be compared corresponding to the element to be compared; and determining a screenshot area according to the element area to be compared.
It should be understood that when the critical element has no sub-elements, the element region to be compared corresponding to the elements to be compared is obtained, a screenshot of that region is taken and placed on the first page of the file, and the remaining page elements, starting from the critical element, are placed on the subsequent pages.
In a specific implementation, as shown in fig. 6, fig. 6 is a third page schematic. At this time, the critical element is page element 5, and no sub-elements exist in page element 5, so page elements 1-4 are placed on the first page, and page element 5 is placed on the second page.
Further, the determining a screenshot area according to the sub-element, the sub-element height, the height difference value and the element to be compared includes:
Sequencing the sub-elements to obtain a sub-element sequencing result; determining a critical subelement according to the subelement sequencing result, the subelement height and the height difference value; and determining a screenshot area according to the critical subelement and the element to be compared.
It should be appreciated that the sub-elements may be ordered to obtain a sub-element ordering result, and the critical sub-elements may be determined based on the sub-element ordering result, the sub-element heights, and the height differences. For example, as shown in fig. 7, fig. 7 is a schematic diagram of a fourth page, in which the page element 5 includes two sub-elements, namely, sub-element 51 and sub-element 52, whose corresponding sub-element heights are h51 and h52, respectively, and h51 is smaller than the height difference, and h51 plus h52 is larger than the height difference, so that the sub-element 52 can be determined to be a critical sub-element.
Further, the determining a screenshot area according to the critical subelement and the element to be compared includes:
Judging whether Sun Yuansu exists in the critical subelement or not; acquiring a sub-element region to be compared corresponding to the sub-element to be compared when Sun Yuansu is not present in the critical sub-element; and determining a screenshot area according to the element area to be compared and the sub-element area to be compared. Acquiring a Sun Yuansu height corresponding to the Sun Yuansu when the critical subelement exists Sun Yuansu; selecting the subelements to be compared which are arranged before the critical subelements from the subelements according to the subelements sequencing result; and determining a screenshot area according to the Sun Yuansu, the Sun Yuansu height and the subelements to be compared.
It should be appreciated that it may be further determined whether Sun Yuansu is present in the critical subelement, and when Sun Yuansu is not present in the critical subelement, fig. 8 is a fifth page schematic, and subelement 52 is a critical subelement and Sun Yuansu is not present, so page elements 1-4 and subelements may be placed on a first page, subelement 52 may be placed on a second page, and the corresponding screenshot region may be determined.
It will be appreciated that when the critical subelements exist Sun Yuansu, the Sun Yuansu height corresponding to Sun Yuansu may be obtained, and subelements to be compared arranged between the critical subelements are selected from the subelements according to the subelements sorting result, so as to determine the critical Sun Yuansu.
In a specific implementation, as shown in fig. 9, which is a schematic diagram of a sixth page, the sub-element to be compared is sub-element 51, and sub-element 52 includes two grandchild elements, namely grandchild element 521 and grandchild element 522, whose corresponding grandchild element heights are h521 and h522, respectively. The critical grandchild element is grandchild element 522, so it can be further determined whether grandchild element 522 has next-level elements. When grandchild element 522 has no next-level element, page elements 1-4, sub-element 51 and grandchild element 521 can be placed on the first page, and grandchild element 522 is placed on the second page. When grandchild element 522 has next-level elements, the above steps are repeated until no next-level element exists.
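The level-by-level descent described above (element, sub-element, grandchild element, and so on until no next-level element exists) lends itself to a recursive formulation. The following is a minimal sketch under stated assumptions: `LayoutNode`, `SplitResult` and `splitAtCritical` are hypothetical names, the heights are invented for illustration, and the fallback for a node whose children all fit while the node itself does not (for example because of padding) is an assumption not taken from the disclosure.

```typescript
// Hypothetical tree standing in for a page element and its descendants.
interface LayoutNode {
  id: string;
  height: number;
  children: LayoutNode[]; // assumed sorted in document order
}

interface SplitResult {
  firstPage: string[];        // ids kept on the current page
  breakBefore: string | null; // id that starts the next page, null if all fit
}

function splitAtCritical(siblings: LayoutNode[], available: number): SplitResult {
  const firstPage: string[] = [];
  let remaining = available;
  for (let i = 0; i < siblings.length; i++) {
    const node = siblings[i];
    if (node.height <= remaining) {
      // The node fits as a whole: keep it on the current page.
      firstPage.push(node.id);
      remaining -= node.height;
      continue;
    }
    // The node is critical at this level.
    if (node.children.length === 0) {
      return { firstPage, breakBefore: node.id };
    }
    // Repeat the same steps one level down, in the space that is left.
    const inner = splitAtCritical(node.children, remaining);
    if (inner.breakBefore === null) {
      // Assumption: children fit but the node itself does not; move it whole.
      return { firstPage, breakBefore: node.id };
    }
    return {
      firstPage: firstPage.concat(inner.firstPage),
      breakBefore: inner.breakBefore,
    };
  }
  return { firstPage, breakBefore: null };
}

// Made-up heights reproducing the fig. 9 situation.
const leafNode = (id: string, height: number): LayoutNode => ({ id, height, children: [] });
const pageElements: LayoutNode[] = [
  leafNode("1", 200), leafNode("2", 200), leafNode("3", 200), leafNode("4", 200),
  {
    id: "5", height: 400, children: [
      leafNode("51", 100),
      { id: "52", height: 300, children: [leafNode("521", 150), leafNode("522", 150)] },
    ],
  },
];
const split = splitAtCritical(pageElements, 1100);
```

With these illustrative heights, page elements 1-4, sub-element 51 and grandchild element 521 stay on the first page, and the break falls before grandchild element 522.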
In this embodiment, through the above scheme, when paging is required, a critical element is automatically determined from the page elements to be processed, and the critical element is further examined to identify its sub-elements and grandchild elements, so that the screenshot area is determined according to the sub-elements and grandchild elements, and it is thereby determined which elements are placed on the first page and which are placed on the following page. This not only avoids the distortion caused by too many page elements on the same page, but also ensures that no element is cut during paging and left incompletely displayed.
In an embodiment, as shown in fig. 9, a third embodiment of the web page-based PDF generation method according to the present invention is provided based on the first embodiment or the second embodiment. This embodiment is described on the basis of the first embodiment, and step S50 includes:
Step S501, performing screenshot processing on the web page to be processed according to the screenshot area to obtain a picture to be processed.
It should be understood that in this embodiment, the html2canvas plug-in may be used to perform screenshot processing to obtain a to-be-processed picture.
Step S502, sorting the pictures to be processed to obtain a picture sorting result.
It should be understood that since the screenshot is performed according to the screenshot area, when there are a plurality of screenshot areas, a plurality of pictures to be processed can be obtained. The pictures to be processed can be ordered to obtain a picture ordering result.
Step S503, performing format conversion on the to-be-processed picture according to the picture sorting result, so as to generate a PDF file.
It should be appreciated that the pictures to be processed may be format-converted using the jsPDF plug-in, based on the picture ordering result, to generate the PDF file. In the prior art, HTML-to-PDF programs such as wkhtmltopdf and iText require server-side support, which consumes time and resources and cannot be used offline. The PDF file generation manner in this embodiment can be used offline, thereby meeting the requirement of converting a web page into a PDF file offline.
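Steps S502 and S503 can be illustrated with the following minimal sketch, which sorts the captured pictures and writes one picture per PDF page while preserving the aspect ratio. The names `PdfWriter`, `CapturedPicture` and `writePictures` are hypothetical. The `PdfWriter` interface is a deliberately small stand-in: it is assumed here that a jsPDF document object can be passed in through its `addImage(imageData, format, x, y, width, height)` and `addPage()` methods, and that the first page already exists when writing starts.

```typescript
// Minimal writer surface; a jsPDF instance is assumed to satisfy it.
interface PdfWriter {
  addImage(data: string, format: string, x: number, y: number, w: number, h: number): unknown;
  addPage(): unknown;
}

// Hypothetical record for one screenshot (one picture to be processed).
interface CapturedPicture {
  order: number;   // position of the screenshot area within the web page
  dataUrl: string; // e.g. obtained from canvas.toDataURL("image/png")
  widthPx: number;
  heightPx: number;
}

// Sorts the pictures (step S502) and places each on its own page (step S503).
// Returns the number of pages written.
function writePictures(
  pdf: PdfWriter,
  pictures: CapturedPicture[],
  pageWidth: number,  // in the unit of the PDF document
  pageHeight: number,
): number {
  const sorted = pictures.slice().sort((a, b) => a.order - b.order);
  sorted.forEach((pic, i) => {
    if (i > 0) {
      pdf.addPage(); // the first page is assumed to exist already
    }
    // One uniform scale factor for both axes, so the picture is not distorted.
    const scale = Math.min(pageWidth / pic.widthPx, pageHeight / pic.heightPx);
    pdf.addImage(pic.dataUrl, "PNG", 0, 0, pic.widthPx * scale, pic.heightPx * scale);
  });
  return sorted.length;
}
```

In a browser, each `CapturedPicture` could be produced in step S501 by awaiting `html2canvas(element)` and calling `toDataURL("image/png")` on the resulting canvas; that part is omitted here so that the sketch has no external dependencies.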
In a specific implementation, in the case shown in fig. 4, page elements 1-5 may be used as the screenshot area to obtain a picture to be processed, and the picture to be processed is placed on the first page of the PDF file.
In the case shown in fig. 6, page elements 1 to 4 may be used as the first screenshot area and page element 5 may be used as the second screenshot area to obtain a first picture to be processed and a second picture to be processed; the first picture to be processed is placed on the first page of the PDF file, and the second picture to be processed is placed on the second page of the PDF file.
In the case shown in fig. 8, page elements 1 to 4 and sub-element 51 may be used as the first screenshot area and sub-element 52 may be used as the second screenshot area to obtain a first picture to be processed and a second picture to be processed; the first picture to be processed is placed on the first page of the PDF file, and the second picture to be processed is placed on the second page of the PDF file.
In the case shown in fig. 9, page elements 1 to 4, sub-element 51 and grandchild element 521 may be used as the first screenshot area and grandchild element 522 may be used as the second screenshot area to obtain a first picture to be processed and a second picture to be processed; the first picture to be processed is placed on the first page of the PDF file, and the second picture to be processed is placed on the second page of the PDF file.
According to this scheme, the pictures to be processed are obtained according to the screenshot areas, and the PDF file is generated according to the order of the pictures to be processed, so that the generated PDF file is not distorted and the document quality of the PDF file is improved.
In addition, the embodiment of the invention also provides a storage medium, wherein the storage medium stores a PDF generating program based on the webpage, and the PDF generating program based on the webpage realizes the steps of the PDF generating method based on the webpage when being executed by a processor.
Because the storage medium adopts all the technical schemes of all the embodiments, the storage medium has at least all the beneficial effects brought by the technical schemes of the embodiments, and the description is omitted here.
In addition, referring to fig. 10, an embodiment of the present invention further provides a PDF generating device based on a web page, where the PDF generating device based on a web page includes:
The instruction receiving module 10 is configured to determine, when receiving a PDF generation instruction, a height of a web page to be processed and a height of a PDF file page according to the PDF generation instruction.
It should be understood that the web page to be processed is the web page from which the PDF file is to be generated, and the PDF file page height is the height of each page of the generated PDF file. Documents of different specifications have different page heights: for example, a document of the A4 specification has the A4 page height, a document of the A5 specification has the A5 page height, and the page heights of landscape A4 and portrait A4 also differ. Therefore, the document specification of the PDF file, and thus the PDF file page height, may be determined according to the PDF generation instruction. In addition to predetermined specifications such as A4 and A5, the document specification may be a user-defined specification (width × height), which is not limited in this embodiment.
It should be understood that the user may have a plurality of web pages open at the same time, and one of them may be selected as the web page to be processed through the PDF generation instruction; for example, the currently displayed web page may be taken as the web page to be processed. The uniform resource locator (URL) of a web page may also be determined from the PDF generation instruction, and the web page specified by the URL is taken as the web page to be processed, which is not limited in this embodiment.
In the embodiment, the designated web page can be exported to be a PDF document, and the page height of the PDF file is determined according to the PDF generation instruction input by the user, so that the generated PDF file is more diversified, and the customization requirement of the user is met.
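The determination of the PDF file page height from the document specification can be sketched as follows. This is an illustration only: `PdfGenerationInstruction`, `resolvePageSize` and `pageHeightInPixels` are hypothetical names, and the shape of the instruction is an assumption. The preset dimensions are the ISO 216 values for A4 and A5. The pixel conversion further assumes that the captured web content is scaled so that its width fills the PDF page width.

```typescript
type Orientation = "portrait" | "landscape";

interface PageSize {
  width: number;  // millimetres
  height: number; // millimetres
}

// ISO 216 sizes in portrait orientation.
const PRESET_SIZES: { [name: string]: PageSize } = {
  A4: { width: 210, height: 297 },
  A5: { width: 148, height: 210 },
};

// Hypothetical payload of a PDF generation instruction.
interface PdfGenerationInstruction {
  format: string | PageSize; // preset name, or user-defined width x height
  orientation?: Orientation; // applied to presets only in this sketch
}

function resolvePageSize(instruction: PdfGenerationInstruction): PageSize {
  if (typeof instruction.format !== "string") {
    // A user-defined specification is taken as given.
    return { width: instruction.format.width, height: instruction.format.height };
  }
  const preset = PRESET_SIZES[instruction.format];
  if (!preset) {
    throw new Error("Unknown page format: " + instruction.format);
  }
  // Landscape swaps width and height, so the page height differs from portrait.
  return instruction.orientation === "landscape"
    ? { width: preset.height, height: preset.width }
    : { width: preset.width, height: preset.height };
}

// Page height expressed in the pixel unit of the web page, assuming the
// content width is mapped onto the full PDF page width.
function pageHeightInPixels(size: PageSize, contentWidthPx: number): number {
  return (contentWidthPx * size.height) / size.width;
}
```

For example, `resolvePageSize({ format: "A4", orientation: "landscape" })` yields a page height of 210 mm, whereas the portrait A4 page height is 297 mm.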
And the information acquisition module 20 is used for acquiring the page information of the webpage to be processed.
It should be understood that each page has its corresponding page information, which may include information such as a page element and an element height corresponding to the page element, and may also include other information, which is not limited in this embodiment. After the webpage to be processed is determined, the webpage information of the webpage to be processed can be further acquired.
And the element determining module 30 is configured to determine a page element in the web page to be processed and an element height corresponding to the page element according to the page information.
It should be understood that the page elements in this embodiment may be elements such as text and pictures in the web page, and may also be other elements, which is not limited in this embodiment. The element height corresponding to a page element refers to the height value of that page element; for example, when the page element is a picture, the height of the picture is the element height corresponding to the page element.
It should be understood that a page element in this embodiment may also be expressed as a DOM node, that is, a page element in this embodiment may be equivalent to a DOM node of the page. According to the HTML DOM standard, all content in an HTML document is a node, and the HTML DOM regards the HTML document as a tree structure, called a node tree. Each node may therefore have corresponding child nodes, and each child node may have corresponding grandchild nodes. In this embodiment, these are expressed as sub-elements and grandchild elements, where sub-elements correspond to child nodes and grandchild elements correspond to grandchild nodes.
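The correspondence between the node tree and the page element, sub-element and grandchild element levels can be illustrated by walking the tree and recording each element's height and depth. In this sketch, `MeasurableElement`, `ElementInfo` and `describeElement` are hypothetical names. `MeasurableElement` is a structural subset of the DOM `Element` interface (its `children` collection and `getBoundingClientRect()` method), so a real DOM element is assumed to be usable in its place; using the bounding rectangle as the element height is likewise an assumption.

```typescript
// Structural subset of a DOM Element, so the sketch also runs outside a browser.
interface MeasurableElement {
  children: ArrayLike<MeasurableElement>;
  getBoundingClientRect(): { height: number };
}

interface ElementInfo {
  height: number;
  depth: number; // 0 = page element, 1 = sub-element, 2 = grandchild element, ...
  children: ElementInfo[];
}

// Recursively records the height of an element and of all its descendants.
function describeElement(el: MeasurableElement, depth: number = 0): ElementInfo {
  const children: ElementInfo[] = [];
  for (let i = 0; i < el.children.length; i++) {
    children.push(describeElement(el.children[i], depth + 1));
  }
  return { height: el.getBoundingClientRect().height, depth, children };
}
```

In a browser, `describeElement(document.body)` would be one possible way to obtain the page information from which page elements and element heights are determined.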
And the screenshot area module 40 is configured to determine a screenshot area according to the page height of the PDF file, the page element and the element height.
It can be understood that the screenshot area can be determined according to the PDF file page height, the page elements and the element heights, and the screenshot is performed according to the screenshot area to obtain the picture to be processed. According to the scheme in this embodiment, whether paging is needed can be judged according to the PDF file page height, the page elements and the element heights, and the screenshot area is determined according to the actual situation.
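The judgment of whether paging is needed can be sketched by progressively adding the element heights and comparing the total with the PDF file page height. The names `PagingDecision` and `decidePaging` are hypothetical. It is assumed here that the height to be compared is the sum of the heights of the elements arranged before the critical element, so that the height difference is the space left on the page after those elements.

```typescript
interface PagingDecision {
  needsPaging: boolean;
  criticalIndex: number;    // first element that does not fit, -1 if none
  heightDifference: number; // page height minus the heights placed before it
}

function decidePaging(elementHeights: number[], pageHeight: number): PagingDecision {
  let total = 0;
  for (let i = 0; i < elementHeights.length; i++) {
    if (total + elementHeights[i] > pageHeight) {
      // Element i crosses the page boundary: it is the critical element.
      return { needsPaging: true, criticalIndex: i, heightDifference: pageHeight - total };
    }
    total += elementHeights[i];
  }
  // Every element fits on one page: a single screenshot area suffices.
  return { needsPaging: false, criticalIndex: -1, heightDifference: pageHeight - total };
}
```

For instance, with illustrative heights `[200, 200, 200, 200, 400]` and a page height of 1100, the fifth element is critical and the height difference is 300.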
The file generating module 50 is configured to obtain a to-be-processed picture according to the screenshot area, and generate a PDF file according to the to-be-processed picture.
It should be appreciated that after the screenshot area is determined, the html2canvas plug-in may be used to perform the screenshot and then jsPDF may be used to generate the PDF file.
In the embodiment, the page heights of the webpage to be processed and the PDF file are determined according to the PDF generation instruction, page elements in the webpage to be processed and corresponding element heights are determined, a screenshot area is further determined to obtain a picture to be processed, and a PDF file is generated according to the picture to be processed, so that the document specification of the PDF file required by a user can be determined according to the PDF generation instruction, reasonable paging is performed according to the screenshot area, and the quality of the PDF file generated according to the webpage is improved.
In an embodiment, the screenshot area module 40 is further configured to calculate a height difference value according to the page height of the PDF file and the height to be compared, and determine whether the critical element has a subelement; when the critical element has sub-elements, acquiring the heights of the sub-elements corresponding to the sub-elements; and determining a screenshot area according to the sub-element, the height of the sub-element, the height difference value and the element to be compared.
In an embodiment, the screenshot area module 40 is further configured to obtain an element area to be compared corresponding to the element to be compared when the critical element does not have a subelement; and determining a screenshot area according to the element area to be compared.
In an embodiment, the screenshot area module 40 is further configured to sort the subelements to obtain a subelement sorting result; determining a critical subelement according to the subelement sequencing result, the subelement height and the height difference value; and determining a screenshot area according to the critical subelement and the element to be compared.
In an embodiment, the screenshot area module 40 is further configured to determine whether a grandchild element exists in the critical sub-element; when a grandchild element exists in the critical sub-element, acquire the grandchild element height corresponding to the grandchild element; select, from the sub-elements and according to the sub-element ordering result, the sub-elements to be compared that are arranged before the critical sub-element; and determine a screenshot area according to the grandchild element, the grandchild element height and the sub-elements to be compared.
In an embodiment, the screenshot area module 40 is further configured to obtain the to-be-compared sub-element region corresponding to the sub-element to be compared when no grandchild element exists in the critical sub-element; and determine a screenshot area according to the to-be-compared element region and the to-be-compared sub-element region.
In an embodiment, the screenshot-area module 40 is further configured to determine an element selection policy according to the PDF generation instruction; determining a target page element according to the element selection strategy and the page element; traversing the target page element, and taking the traversed target page element as a page element to be processed.
In an embodiment, the screenshot-area module 40 is further configured to determine target element information according to the element selection policy; and selecting a target page element from the page elements according to the target element information.
In an embodiment, the file generating module 50 is further configured to perform screenshot processing on the web page to be processed according to the screenshot area, so as to obtain a picture to be processed; sequencing the pictures to be processed to obtain a picture sequencing result; and carrying out format conversion on the pictures to be processed according to the picture sorting result so as to generate a PDF file.
Other embodiments or specific implementation methods of the PDF generating device based on web pages in the present invention may refer to the above method embodiments, and are not described herein again.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other like elements in the process, method, article, or apparatus that comprises the element.
The foregoing embodiment numbers of the present invention are merely for the purpose of description, and do not represent the advantages or disadvantages of the embodiments.
From the above description of the embodiments, it will be clear to those skilled in the art that the methods of the above embodiments may be implemented by means of software plus a necessary general-purpose hardware platform, and may of course also be implemented by hardware, although in many cases the former is the preferred implementation. Based on such understanding, the technical solution of the present invention, in essence or in the part that contributes over the prior art, may be embodied in the form of a software product stored in a computer-readable storage medium (e.g. ROM/RAM, magnetic disk, optical disk) as described above, comprising instructions for causing a smart device (which may be a mobile phone, a computer, a web page-based PDF generating device, an air conditioner, a network device, etc.) to perform the methods according to the embodiments of the present invention.
The foregoing description is only of the preferred embodiments of the present invention and is not intended to limit the scope of the invention. Any equivalent structure or equivalent process made on the basis of the content disclosed herein, or any direct or indirect application thereof in other related technical fields, likewise falls within the scope of protection of the present invention.