CN114817804A - Webpage generation method and device, electronic equipment and storage medium - Google Patents

Info

Publication number
CN114817804A
Authority
CN
China
Prior art keywords
webpage
text
target
web page
original text
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210346440.7A
Other languages
Chinese (zh)
Inventor
李亚楠
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Dajia Internet Information Technology Co Ltd
Original Assignee
Beijing Dajia Internet Information Technology Co Ltd
Application filed by Beijing Dajia Internet Information Technology Co Ltd
Priority to CN202210346440.7A
Publication of CN114817804A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/958 Organisation or management of web site content, e.g. publishing, maintaining pages or automatic linking
    • G06F16/986 Document structures and storage, e.g. HTML extensions
    • G06F16/955 Retrieval from the web using information identifiers, e.g. uniform resource locators [URL]

Abstract

The disclosure relates to a webpage generation method, a webpage generation device, electronic equipment and a storage medium, and relates to the technical field of internet. The method comprises the following steps: responding to a processing request, and acquiring an original text of a first webpage, wherein the processing request comprises a uniform resource locator of the first webpage, and the original text of the first webpage is used for acquiring content in the first webpage; preprocessing the original text of the first webpage to obtain a target text of the first webpage; and inserting the content in the first webpage into a second webpage based on the target text of the first webpage to generate a target webpage. In the disclosure, a new webpage (i.e., a target webpage) generated by the electronic device may completely and effectively represent content in the first webpage, and further, in the process of using the target webpage, the electronic device may perform related operations on the content in the first webpage in the target webpage, so as to improve the use effectiveness of the target webpage.

Description

Webpage generation method and device, electronic equipment and storage medium
Technical Field
The present disclosure relates to the field of internet technologies, and in particular, to a method and an apparatus for generating a web page, an electronic device, and a storage medium.
Background
Currently, rich text content in a third-party webpage can be converted by a text editor into a corresponding Hypertext Markup Language (HTML) character string, and the HTML character string is then inserted into another webpage to generate a new webpage.
However, the content of a third-party webpage may include not only the rich text content but also, for example, the encoding format used by the third-party webpage. Merely inserting the HTML character string into another webpage may therefore be insufficient; that is, the generated new webpage may not completely and effectively represent the content of the third-party webpage, which in turn may affect the use of the newly generated webpage.
Disclosure of Invention
The present disclosure provides a webpage generation method, a webpage generation device, electronic equipment and a storage medium, which solve the technical problem in the prior art that a new webpage generated merely by inserting the relevant HTML character string into another webpage cannot completely and effectively represent the content of the third-party webpage, which affects the use of the newly generated webpage.
The technical scheme of the embodiment of the disclosure is as follows:
according to a first aspect of the embodiments of the present disclosure, a method for generating a web page is provided. The method can comprise the following steps: responding to a processing request, and acquiring an original text of a first webpage, wherein the processing request comprises a uniform resource locator of the first webpage, and the original text of the first webpage is used for acquiring content in the first webpage; preprocessing the original text of the first webpage to obtain a target text of the first webpage; and inserting the content in the first webpage into a second webpage based on the target text of the first webpage to generate a target webpage.
Optionally, the preprocessing the original text of the first webpage to obtain the target text of the first webpage specifically includes: performing first preprocessing on the original text of the first webpage to obtain the target text of the first webpage, wherein the first preprocessing is used for performing scope processing on the original text.
Optionally, the performing the first preprocessing on the original text of the first webpage to obtain the target text of the first webpage specifically includes: determining a plurality of style texts of the first webpage based on the original text of the first webpage; determining a plurality of first tags of the first webpage and a second tag of the second webpage; and when a target tag whose identifier is the same as the identifier of the second tag exists among the plurality of first tags, adding a preset scope identifier to at least one style text of the first webpage to generate the target text, wherein the at least one style text is the style text corresponding to the target tag.
Optionally, the inserting the content in the first webpage into the second webpage based on the target text of the first webpage to generate the target webpage specifically includes: setting scope information for the style text corresponding to the target tag based on the preset scope identifier in the target text; and inserting the content in the first webpage into the second webpage based on the target text of the first webpage, and updating the target tag based on the scope information to generate the target webpage.
Optionally, the preprocessing the original text of the first webpage to obtain the target text of the first webpage further includes: performing second preprocessing on the original text of the first webpage to obtain the target text of the first webpage, wherein the second preprocessing is used for filtering the original text.
Optionally, the performing the second preprocessing on the original text of the first webpage to obtain the target text of the first webpage specifically includes: acquiring a risk text from the original text of the first webpage according to a preset query rule; translating the risk text to generate a translated text; and filtering the translated text from the original text of the first webpage to generate the target text.
Optionally, the inserting the content in the first webpage into the second webpage based on the target text of the first webpage to generate the target webpage specifically includes: and based on the target text, inserting the content in the first webpage into the specified position of the second webpage to generate the target webpage.
According to a second aspect of the embodiments of the present disclosure, a webpage generation apparatus is provided. The apparatus may include an obtaining module and a processing module; the obtaining module is configured to obtain an original text of a first webpage in response to a processing request, wherein the processing request includes a uniform resource locator of the first webpage, and the original text of the first webpage is used for obtaining content in the first webpage; the processing module is configured to preprocess the original text of the first webpage to obtain a target text of the first webpage; the processing module is further configured to insert the content in the first webpage into a second webpage based on the target text of the first webpage to generate a target webpage.
Optionally, the processing module is specifically configured to perform a first preprocessing on the original text of the first webpage to obtain a target text of the first webpage, where the first preprocessing is used to perform scoping on the original text.
Optionally, the webpage generation apparatus further includes a determining module; the determining module is configured to determine a plurality of style texts of the first webpage based on the original text of the first webpage; the determining module is further configured to determine a plurality of first tags of the first webpage and a second tag of the second webpage; the processing module is specifically configured to add a preset scope identifier to at least one style text of the first webpage to generate the target text when a target tag whose identifier is the same as the identifier of the second tag exists among the plurality of first tags, where the at least one style text is the style text corresponding to the target tag.
Optionally, the processing module is further specifically configured to set scope information for the style text corresponding to the target tag based on the preset scope identifier in the target text; the processing module is specifically configured to insert the content in the first webpage into the second webpage based on the target text of the first webpage, and update the target tag based on the scope information to generate the target webpage.
Optionally, the processing module is specifically configured to perform a second preprocessing on the original text of the first webpage to obtain a target text of the first webpage, where the second preprocessing is used to perform a filtering processing on the original text.
Optionally, the obtaining module is further configured to obtain a risk text from the original text of the first webpage according to a preset query rule; the processing module is specifically configured to perform translation processing on the risk text to generate a translated text; the processing module is further configured to filter the translated text from the original text of the first webpage to generate the target text.
Optionally, the processing module is specifically configured to insert the content in the first webpage into the specified position of the second webpage based on the target text, so as to generate the target webpage.
According to a third aspect of embodiments of the present disclosure, there is provided an electronic device, which may include: a processor and a memory configured to store processor-executable instructions; wherein the processor is configured to execute the instructions to implement any of the above-described optional web page generation methods of the first aspect.
According to a fourth aspect of embodiments of the present disclosure, there is provided a computer-readable storage medium having instructions stored thereon, where the instructions in the computer-readable storage medium, when executed by an electronic device, enable the electronic device to perform any one of the above-mentioned optional web page generation methods of the first aspect.
According to a fifth aspect of embodiments of the present disclosure, there is provided a computer program product comprising computer instructions which, when run on an electronic device, cause the electronic device to perform any one of the optional webpage generation methods of the first aspect.
The technical scheme provided by the embodiment of the disclosure at least brings the following beneficial effects:
based on any one of the above aspects, in the present disclosure, in response to a processing request, an electronic device may obtain an original text of a first webpage, perform preprocessing on the original text of the first webpage to obtain a target text of the first webpage, and then insert content in the first webpage into a second webpage based on the target text of the first webpage to generate a target webpage. In the embodiment of the present disclosure, the content of the first webpage includes not only the related rich text content, but also the attribute information of the first webpage, the encoding format used by the first webpage, and other scripts included in the first webpage. In this way, the electronic device may obtain the content in the first web page based on the original text (or source code) of the first web page, and insert the content in the first web page into the second web page based on the pre-processed target text of the first web page. The generated new webpage (i.e., the target webpage) can completely and effectively represent the content in the first webpage, and further, in the process of using the target webpage, the electronic device can execute related operations on the content in the first webpage in the target webpage so as to improve the use effectiveness of the target webpage.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and, together with the description, serve to explain the principles of the disclosure and are not to be construed as limiting the disclosure.
FIG. 1 is a flowchart illustrating a method for generating a web page according to an embodiment of the present disclosure;
FIG. 2 is a flowchart illustrating a method for generating a web page according to an embodiment of the present disclosure;
FIG. 3 is a flowchart illustrating a method for generating a web page according to an embodiment of the present disclosure;
FIG. 4 is a flowchart illustrating a method for generating a web page according to an embodiment of the present disclosure;
FIG. 5 is a flowchart illustrating a method for generating a web page according to an embodiment of the present disclosure;
FIG. 6 is a flowchart illustrating a method for generating a web page according to an embodiment of the present disclosure;
FIG. 7 is a flowchart illustrating a method for generating a web page according to an embodiment of the present disclosure;
FIG. 8 is a schematic structural diagram illustrating a web page generation apparatus provided in an embodiment of the present disclosure;
FIG. 9 is a schematic structural diagram illustrating still another web page generation apparatus provided in an embodiment of the present disclosure.
Detailed Description
In order to make the technical solutions of the present disclosure better understood by those of ordinary skill in the art, the technical solutions in the embodiments of the present disclosure will be clearly and completely described below with reference to the accompanying drawings.
It should be noted that the terms "first," "second," and the like in the description and claims of the present disclosure and in the above-described drawings are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the disclosure described herein are capable of operation in sequences other than those illustrated or otherwise described herein. The implementations described in the exemplary embodiments below are not intended to represent all implementations consistent with the present disclosure. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the present disclosure, as detailed in the appended claims.
It will be further understood that the terms "comprises" and/or "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, and/or components.
The data to which the present disclosure relates may be data that is authorized by a user or sufficiently authorized by parties.
As described in the background, in the prior art only the HTML character string corresponding to the rich text content in a third-party webpage is inserted into other webpages. Because the content of the third-party webpage may include more than the rich text content, the newly generated webpage cannot completely and effectively represent that content, which affects the use of the newly generated webpage. Based on this, the embodiments of the present disclosure provide a webpage generation method. The content of the first webpage includes not only related rich text content, but also attribute information of the first webpage, the encoding format used by the first webpage, and other scripts included in the first webpage. In this way, the electronic device may obtain the content in the first webpage based on the original text (or source code) of the first webpage, and insert the content in the first webpage into the second webpage based on the preprocessed target text of the first webpage. The generated new webpage (i.e., the target webpage) can completely and effectively represent the content in the first webpage, and further, in the process of using the target webpage, the electronic device can execute related operations on the content of the first webpage in the target webpage, so as to improve the use effectiveness of the target webpage.
The webpage generation method, the webpage generation device, the electronic equipment and the storage medium provided by the embodiment of the disclosure are applied to a scene of generating a webpage (or webpage insertion). When the electronic device responds to the processing request, the target webpage can be generated according to the method provided by the embodiment of the disclosure.
The following describes an exemplary method for generating a web page according to an embodiment of the present disclosure with reference to the accompanying drawings.
For example, the electronic device executing the webpage generation method provided by the embodiment of the present disclosure may be a mobile phone, a tablet computer, a desktop computer, a laptop, a handheld computer, a notebook, an ultra-mobile personal computer (UMPC), a netbook, a cellular phone, a personal digital assistant (PDA), an augmented reality (AR)/virtual reality (VR) device, or another device on which a content community application can be installed and used; the present disclosure does not particularly limit the specific form of the electronic device. The electronic device can interact with a user through one or more of a keyboard, a touch pad, a touch screen, a remote controller, voice interaction, handwriting equipment and the like.
As shown in fig. 1, a webpage generation method provided by the embodiment of the present disclosure may include S101-S103.
S101, the electronic device obtains an original text of the first webpage in response to the processing request.
The processing request comprises a uniform resource locator of the first webpage, and the original text of the first webpage is used for acquiring content in the first webpage.
It should be understood that the first webpage is a third-party webpage, i.e., a webpage (or page) of a third-party website. The original text of the first webpage is the source code (or the character string of the source code) of the first webpage. Specifically, the electronic device may obtain the source code (i.e., the original text) of the first webpage based on the Uniform Resource Locator (URL) of the first webpage included in the processing request.
It can be understood that the electronic device may obtain the original text of the first webpage through an asynchronous Ajax request whose response type (responseType) is set to text, so that the response can be used to obtain the source code of the first webpage.
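As an illustration of S101, the following minimal sketch fetches the source code of a page as plain text. The source describes an Ajax request issued in a browser; this sketch substitutes the Python standard library for it, and the names `get_original_text`, `default_fetch` and the `"url"` field of the request are assumptions made for the example, not part of the disclosure.

```python
from typing import Callable
from urllib.request import urlopen


def default_fetch(url: str) -> str:
    """Download the page at `url` and return its source code as text."""
    with urlopen(url) as response:
        # Fall back to UTF-8 when the server does not declare a charset.
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def get_original_text(request: dict,
                      fetch: Callable[[str], str] = default_fetch) -> str:
    """S101 sketch: read the URL from the processing request, fetch the source."""
    url = request["url"]  # hypothetical field holding the first webpage's URL
    return fetch(url)
```

Passing the fetch function as a parameter keeps the step testable without network access; any callable that maps a URL to text can stand in for the real request.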
S102, the electronic device preprocesses the original text of the first webpage to obtain a target text of the first webpage.
It should be understood that the electronic device preprocesses the original text of the first webpage, so that the electronic device can insert the content in the first webpage into other webpages based on the target text of the first webpage after preprocessing.
S103, the electronic device inserts the content in the first webpage into the second webpage based on the target text of the first webpage to generate a target webpage.
It should be understood that the content of the first webpage may include related rich text content, and may also include attribute information of the first webpage (e.g., whether the first webpage is a Chinese page), the encoding format used by the first webpage, and other scripts (e.g., interaction scripts) included in the first webpage. After obtaining the original text (or source code) of the first webpage, the electronic device may obtain the content in the first webpage based on the original text, preprocess the original text of the first webpage to obtain the target text of the first webpage, and insert the content in the first webpage into the second webpage based on the target text of the first webpage, so as to generate the target webpage.
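To make concrete why the source code carries more than rich text, the sketch below reads three of the content kinds named above from a page's source: the language attribute, the declared encoding, and inline scripts. It is only an illustration; the class `ContentSummary` and the function `summarize` are hypothetical names, and the assumption that these items appear as `<html lang>`, `<meta charset>` and `<script>` is ours, not the source's.

```python
from html.parser import HTMLParser


class ContentSummary(HTMLParser):
    """Collect the page language, declared encoding, and inline scripts."""

    def __init__(self) -> None:
        super().__init__()
        self.lang = None        # value of <html lang="...">, if any
        self.charset = None     # value of <meta charset="...">, if any
        self.scripts = []       # text of each <script> element
        self._in_script = False

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "html":
            self.lang = attributes.get("lang")
        elif tag == "meta" and "charset" in attributes:
            self.charset = attributes["charset"]
        elif tag == "script":
            self._in_script = True
            self.scripts.append("")

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_script = False

    def handle_data(self, data):
        # Script text may arrive in several chunks, so append to the last entry.
        if self._in_script:
            self.scripts[-1] += data


def summarize(original_text: str) -> ContentSummary:
    """Parse the source code and return the collected summary."""
    parser = ContentSummary()
    parser.feed(original_text)
    parser.close()
    return parser
```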
It will be appreciated that the second web page is a web page into which relevant content (e.g., content in the first web page) is to be inserted.
In an optional implementation, the electronic device may insert the content of the first web page into the second web page in ele.
The technical scheme provided by the embodiment can at least bring the following beneficial effects: as known from S101 to S103, in response to the processing request, the electronic device may obtain an original text of a first web page, pre-process the original text of the first web page to obtain a target text of the first web page, and then insert content in the first web page into a second web page based on the target text of the first web page to generate the target web page. In the embodiment of the present disclosure, the content of the first webpage includes not only the related rich text content, but also the attribute information of the first webpage, the encoding format used by the first webpage, and other scripts included in the first webpage. In this way, the electronic device may obtain the content in the first web page based on the original text (or source code) of the first web page, and insert the content in the first web page into the second web page based on the pre-processed target text of the first web page. The generated new webpage (i.e., the target webpage) can completely and effectively represent the content in the first webpage, and further, in the process of using the target webpage, the electronic device can execute related operations on the content in the first webpage in the target webpage so as to improve the use effectiveness of the target webpage.
With reference to fig. 1, as shown in fig. 2, in an implementation manner of the embodiment of the present disclosure, the preprocessing the original text of the first webpage to obtain the target text of the first webpage specifically includes S1021.
S1021, the electronic device performs first preprocessing on the original text of the first webpage to obtain a target text of the first webpage.
Wherein the first preprocessing is used for performing scoping processing on the original text.
It should be appreciated that scoping processes are used to specifically process certain areas (or portions) of the original text to distinguish (or differentiate) the first web page from the second web page, and thus the electronic device may process content in the first web page differently from content in the second web page.
The technical scheme provided by the embodiment can at least bring the following beneficial effects: as known from S1021, the electronic device may perform a first pre-processing on the original text of the first webpage to obtain a target text of the first webpage. In the embodiment of the disclosure, since the first preprocessing is used for performing scope processing on the original text of the first webpage, the scope processing can distinguish the first webpage from the second webpage, and further the electronic device can perform different processing on the content in the first webpage and the content in the second webpage, so that the electronic device can avoid performing unnecessary processing on the content in the second webpage, and further the target webpage can be accurately generated.
With reference to fig. 2, as shown in fig. 3, in an implementation manner of the embodiment of the present disclosure, the performing the first preprocessing on the original text of the first webpage to obtain the target text of the first webpage may specifically include S1021a-S1021c.
S1021a, the electronic device determines a plurality of style texts of the first webpage based on the original text of the first webpage.
It should be understood that the source code (or the character string of the source code) including the character string related to the style may be understood as the style text in the first web page, and the plurality of style texts of the first web page are a part of the original text of the first web page. Specifically, a style text of a web page may be used to control a position, a size, a color, and the like of a certain tag (or element) in the web page, and the electronic device may control the tag or content to be displayed in the first web page through the plurality of style texts.
Alternatively, the electronic device may determine the style texts from the original text of the first webpage through a preset processing rule (e.g., a regular expression).
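A regular-expression rule of this kind could look like the following sketch. It assumes the style texts live in `<style>` blocks and that each rule has the simple form `selector { declarations }` (no nested at-rules); the patterns and the function name `extract_style_texts` are illustrative only and are not taken from the disclosure.

```python
import re
from typing import List

# Assumed rule: style texts are the CSS rules inside <style> ... </style>.
STYLE_BLOCK = re.compile(r"<style\b[^>]*>(.*?)</style\s*>", re.IGNORECASE | re.DOTALL)
# Assumed rule shape: "selector { declarations }" without nesting.
CSS_RULE = re.compile(r"([^{}]+)\{([^{}]*)\}")


def extract_style_texts(original_text: str) -> List[str]:
    """Return every CSS rule found inside the <style> blocks of the source."""
    rules = []
    for block in STYLE_BLOCK.findall(original_text):
        for selector, body in CSS_RULE.findall(block):
            rules.append(selector.strip() + "{" + body.strip() + "}")
    return rules
```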
S1021b, the electronic device determines a plurality of first tags of the first web page and a second tag of the second web page.
It should be appreciated that the second tag is one of a plurality of tags included in the second webpage.
It is understood that a tag is a piece of markup that a webpage (or browser) uses to identify content. A webpage (e.g., the first webpage) may include a plurality of tags (i.e., the first tags), and a tag may represent a paragraph, a sentence, or the like in the webpage.
S1021c, when a target tag whose identifier is the same as the identifier of the second tag exists among the plurality of first tags, the electronic device adds a preset scope identifier to at least one style text of the first webpage to generate the target text.
The at least one style text is the style text corresponding to the target tag.
It should be understood that one tag (e.g., a target tag) may correspond to at least one style text.
It can be understood that, when the target tag identical to the identifier of the second tag exists in the plurality of first tags, it indicates that the same tag identifier (or tag) exists in the first webpage and the second webpage. However, the user may only want to update or overlay the first webpage (or the target tag in the first webpage) based on the at least one style text (i.e., the style text corresponding to the target tag), and the style of the second webpage (or the second tag of the second webpage) remains unchanged.
In the embodiment of the disclosure, since the identifier of the target tag is the same as the identifier of the second tag, if the electronic device directly performs an update operation on the target tag according to the at least one style text, it is likely that the target tag will be updated together with the second tag. At this time, the electronic device may add the preset scope identifier to the at least one style text, so that the electronic device may perform an update operation only on the target tag, and may not perform an update operation on the second tag (which may also be understood as keeping the style of the second tag unchanged).
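The check for a shared tag identifier can be sketched as a set intersection over the tag names of the two pages. This is a simplified reading that treats the tag name (such as h1) as the identifier; `TagCollector`, `collect_tags` and `find_target_tags` are hypothetical names used only for this example.

```python
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Record the name of every start tag seen in a page."""

    def __init__(self) -> None:
        super().__init__()
        self.tags = set()

    def handle_starttag(self, tag, attrs):
        self.tags.add(tag)


def collect_tags(html_text: str) -> set:
    """Return the set of tag names used in the given source code."""
    collector = TagCollector()
    collector.feed(html_text)
    collector.close()
    return collector.tags


def find_target_tags(first_page: str, second_page: str) -> set:
    """Tags of the first page whose identifier also occurs in the second page."""
    return collect_tags(first_page) & collect_tags(second_page)
```

Any tag returned by `find_target_tags` is one whose style text would also affect the second webpage if inserted unchanged, which is the case the scope identifier is meant to handle.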
The following describes an example of a process of adding, by an electronic device, a preset scope identifier to the at least one style text in the embodiment of the present disclosure.
For example, assume that the identifier of the target tag and the identifier of the second tag are both h1, i.e., the two identifiers are the same. Further, assume that the at least one style text includes a first style text, and the first style text is h1 { font-size: 18px; color: black; }, i.e., the update operation that the electronic device needs to perform on the target tag sets the font size to 18 pixels and the color to black. The electronic device may then add a preset scope identifier (e.g., partB) to the first style text, so that the first style text obtained after adding the preset scope identifier is #partB h1 { font-size: 18px; color: black; }, which prevents the electronic device from performing the update operation on the second tag based on the first style text.
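The h1 example can be sketched as a small rewrite of CSS selectors. The function `add_scope` below is a hypothetical helper that assumes simple, non-nested rules, and the spelling `partB` of the scope identifier is an assumption made for the example; it prefixes a selector only when its leading tag name is one of the target tags.

```python
import re

# Assumed rule shape: "selector { declarations }" without nesting.
CSS_RULE = re.compile(r"([^{}]+)\{([^{}]*)\}")


def add_scope(style_text: str, scope_id: str, target_tags: set) -> str:
    """Prefix selectors that start with a target tag with '#<scope_id> '."""

    def scope_rule(match):
        selectors = [s.strip() for s in match.group(1).split(",")]
        scoped = []
        for selector in selectors:
            # Leading tag name, e.g. "h1" for "h1.title" or "h1 > span".
            tag_name = re.split(r"[\s.#:\[>+~]", selector, maxsplit=1)[0]
            if tag_name in target_tags:
                scoped.append("#" + scope_id + " " + selector)
            else:
                scoped.append(selector)
        return ", ".join(scoped) + "{" + match.group(2) + "}"

    return CSS_RULE.sub(scope_rule, style_text)
```

With the scope identifier in place, the rule matches only h1 elements nested inside an element whose id is the scope identifier, so an h1 belonging to the second webpage keeps its own style.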
The technical scheme provided by the embodiment can at least bring the following beneficial effects: as known from S1021a-S1021c, the electronic device can determine a plurality of style texts of the first webpage based on the original text of the first webpage, and determine a plurality of first tags of the first webpage and a second tag of the second webpage; when a target tag whose identifier is the same as the identifier of the second tag exists among the plurality of first tags, the same tag identifier (or tag) exists in the first webpage and the second webpage, and the electronic device may add a preset scope identifier to at least one style text of the first webpage (i.e., the style text corresponding to the target tag) to generate the target text. In this way, the electronic device is prevented from mistakenly operating on a tag (such as the second tag) in the second webpage, which improves the effectiveness of webpage generation.
With reference to fig. 3 and as shown in fig. 4, in an implementation manner of the embodiment of the present disclosure, the inserting content in the first web page into the second web page based on the target text of the first web page to generate the target web page specifically includes S1031 to S1032.
S1031, the electronic device sets scope information for the style text corresponding to the target tag based on the preset scope identifier in the target text.
With reference to the description of the above embodiment, it should be understood that the style text corresponding to the target tag is the at least one style text, and the scope information is used to represent the scope (or region) corresponding to the at least one style text. The electronic device sets scope information for the at least one style text based on the preset scope identifier in the target text, that is, determines which regions (specifically, regions corresponding to the target label) in the target text are set with the scope information.
S1032, the electronic device inserts the content in the first webpage into the second webpage based on the target text of the first webpage, and updates the target label based on the scope information to generate the target webpage.
It can be understood that the electronic device updates the target tag based on the scope information, that is, updates the target tag in a corresponding range (or area) of the target tag according to the at least one text style.
In the disclosed embodiment, the target tag (or the initial state of the target tag) may be plain text. The electronic device updates the target label based on the scope information, that is, updates or covers the target label (or the text corresponding to the target label) based on the scope information according to the style provided by the at least one style text, so as to obtain the target label including the corresponding style, that is, the updated target label.
The technical scheme provided by the embodiment can at least bring the following beneficial effects: s1031 to S1032 may determine that the electronic device may set scope information for the style text corresponding to the target tag based on the preset scope identifier in the target text, and then the electronic device inserts the content in the first web page into the second web page based on the target text of the first web page, and updates the target tag based on the scope information, so as to generate the target web page. In the embodiment of the disclosure, the electronic device may update or cover the target tag (or the text corresponding to the target tag) according to the style provided by the style text (i.e., at least one style text) corresponding to the target tag based on the scope information, and may accurately and effectively generate the target webpage.
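One way to picture S1031 to S1032 is to wrap the content of the first webpage in a container that carries the scope identifier, so that the scoped style text applies only inside that region. The sketch below is an assumption-laden illustration: the helper `build_scoped_fragment`, the use of a `div` container, and the scope name `partB` are hypothetical and are not taken from the disclosure.

```python
def build_scoped_fragment(content_html: str, scoped_css: str, scope_id: str) -> str:
    """Wrap the first page's content in a container whose id is the scope identifier.

    The container delimits the region (the scope information) in which the
    scoped style text takes effect.
    """
    return (
        f"<style>{scoped_css}</style>"
        f'<div id="{scope_id}">{content_html}</div>'
    )


second_page_heading = "<h1>Host page title</h1>"
fragment = build_scoped_fragment(
    content_html="<h1>Inserted title</h1>",
    scoped_css="#partB h1 { font-size:18px; color:black }",
    scope_id="partB",
)
# The host heading sits outside the container, so the scoped rule does not match it.
target_page_body = second_page_heading + fragment
print(target_page_body)
```

Because the rule is written as `#partB h1`, a browser would style only the inserted h1, leaving the h1 of the second webpage untouched.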
In an optional implementation manner, the method for generating a webpage may further include step a.
Step A, when the target tag and the second tag are updated according to the same style text, the electronic device adds a preset scope identifier to the style text corresponding to the second tag.
It is to be understood that the same style text may be one of the at least one style text described above. When the target tag and the second tag are updated according to the same style text, it indicates that the user wants to update (or cover) both the target tag and the second tag based on that style text. At this time, in addition to adding the preset scope identifier to the style text corresponding to the target tag (i.e., the at least one style text), the electronic device may add the preset scope identifier to the style text corresponding to the second tag, so that the electronic device may perform the update operation on both the target tag and the second tag based on the same style text.
With reference to fig. 1, as shown in fig. 5, in an implementation manner of the embodiment of the present disclosure, the preprocessing the original text of the first webpage to obtain the target text of the first webpage may specifically include S1022.
S1022, the electronic device performs second preprocessing on the original text of the first webpage to obtain a target text of the first webpage.
Wherein the second preprocessing is used for filtering the original text.
It should be appreciated that the filtering process may filter (or delete) invalid text in the original text, and avoid processing (or updating) the invalid text, so as to improve the effectiveness of webpage generation.
With reference to fig. 5, as shown in fig. 6, in an implementation manner of the embodiment of the present disclosure, the performing the second preprocessing on the original text of the first webpage to obtain the target text of the first webpage may specifically include S1022a-S1022c.
S1022a, the electronic device obtains the risk text from the original text of the first webpage according to the preset query rule.
It should be understood that the risk text is text (or code) in the original text of the first web page where a risk exists.
In an alternative implementation manner, the preset query rule may be a dompress technology, that is, the electronic device may obtain the risk text (or risk code) from the original text (or source code) of the first webpage through the dompress technology.
S1022b, the electronic device performs translation processing on the risk text to generate a translation text.
Specifically, after acquiring the risk text (or the risk code), the electronic device may perform a translation process on the risk text, where the translation process may be understood as escaping, that is, converting the relevant script into plain text content that can be displayed but not executed.
S1022c, the electronic device filters the translated text from the original text of the first webpage to generate the target text.
It is understood that the electronic device filters the translated text from the original text, i.e., deletes the translated text from the original text. The electronic device deletes the translation text, so that the target text can be generated.
It will be appreciated that the risk text (or risk code) may be executed as a script in the first web page (specifically, in the original text of the first web page), i.e., a series of logic would be executed based on the risk text. In the embodiment of the disclosure, the electronic device performs a translation process (or a filtering process) on the risk text, generates a translation text, and filters the translation text from the original text of the first webpage to generate the target text. That is, the risk text is no longer executed as a script but is handled as text content, so that XSS attacks can be effectively prevented.
In an implementation manner of the embodiment of the present disclosure, the electronic device may use, as DOM content, the text obtained after the risk text is translated and/or the text obtained after the preset scope identifier is added to the relevant style text, that is, the original text of the first webpage after these two operations (or processes), and then insert the content of the first webpage into the second webpage in the form of v-html.
The technical scheme provided by the embodiment can at least bring the following beneficial effects: as can be seen from S1022a-S1022c, the electronic device may obtain a risk text from the original text of the first webpage according to a preset query rule, and perform translation processing on the risk text to generate a translation text; the electronic device may then filter the translated text from the original text of the first web page to generate a target text. In the embodiment of the disclosure, since the risk text is a text (or a code) with a risk in the original text, the risk text might otherwise be executed as a script in the first webpage. The electronic device therefore translates the risk text to generate a translated text with plain text content, that is, a text that can only be displayed but cannot be executed, and may filter (or delete) the translated text from the original text to generate the target text. In this way, XSS attacks can be effectively prevented, and the security of the target webpage is improved.
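The steps S1022a-S1022c can be sketched roughly as follows. This is a simplified illustration only: the query rule here is a hypothetical regular expression that treats `<script>` blocks as risk text, and the names `find_risk_text` and `neutralize` are assumptions. A pattern of this kind is not a complete defense against XSS, and a real implementation would more likely rely on a vetted sanitizing library.

```python
import html
import re
from typing import List

# Hypothetical query rule: treat <script> blocks as risk text.
RISK_PATTERN = re.compile(r"<script\b[^>]*>.*?</script\s*>", re.IGNORECASE | re.DOTALL)


def find_risk_text(original_text: str) -> List[str]:
    """S1022a: obtain the risk text from the original text."""
    return RISK_PATTERN.findall(original_text)


def neutralize(original_text: str, remove: bool = False) -> str:
    """S1022b and S1022c: escape each risk text and, optionally, delete it.

    With remove=False the escaped script is kept as displayable text;
    with remove=True it is filtered out of the original text entirely.
    """
    def handle(match: "re.Match[str]") -> str:
        escaped = html.escape(match.group(0))  # script becomes inert text
        return "" if remove else escaped

    return RISK_PATTERN.sub(handle, original_text)


original = "<p>hi</p><script>alert(1)</script>"
print(find_risk_text(original))
print(neutralize(original))
print(neutralize(original, remove=True))
```

The `remove` flag reflects the two readings found in the description above: the escaped text may be shown as plain text content, or it may be deleted from the original text before the target text is generated.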
With reference to fig. 1, as shown in fig. 7, in an implementation manner of the embodiment of the present disclosure, the inserting the content in the first web page into the second web page based on the target text of the first web page to generate the target web page may specifically include S1033.
S1033, the electronic equipment inserts the content in the first webpage into the designated position of the second webpage based on the target text to generate the target webpage.
In the embodiment of the disclosure, the electronic device inserts the content in the first webpage into the designated position of the second webpage, so that the content of a certain webpage can be conveniently and accurately inserted into a specific position of other webpages, and a target webpage with high accuracy can be generated.
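A minimal sketch of S1033 follows, assuming the designated position is marked in the second webpage by a placeholder string. The marker `<!-- insert-here -->` and the function `insert_at_position` are hypothetical; the disclosure does not specify how the designated position is expressed.

```python
def insert_at_position(second_page: str, fragment: str, marker: str) -> str:
    """Replace the single placeholder in the second page with the fragment."""
    if second_page.count(marker) != 1:
        # An absent or ambiguous position would make the insertion inaccurate.
        raise ValueError("designated position must occur exactly once")
    return second_page.replace(marker, fragment, 1)


second_page = "<main><h1>Host</h1><!-- insert-here --></main>"
target_page = insert_at_position(
    second_page, '<div id="partB"><p>Inserted</p></div>', "<!-- insert-here -->"
)
print(target_page)
```

Rejecting a missing or repeated marker keeps the insertion deterministic, which is one way to reflect the accuracy goal stated above.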
It is understood that, in practical implementation, the electronic device according to the embodiment of the present disclosure may include one or more hardware structures and/or software modules for implementing the corresponding web page generation method, and these hardware structures and/or software modules may constitute an electronic device. Those of skill in the art will readily appreciate that the present disclosure can be implemented in hardware or a combination of hardware and computer software for implementing the exemplary algorithm steps described in connection with the embodiments disclosed herein. Whether a function is performed by hardware or by computer software driving hardware depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present disclosure.
Based on such understanding, the embodiment of the present disclosure further provides a web page generating device correspondingly, and fig. 8 shows a schematic structural diagram of the web page generating device provided by the embodiment of the present disclosure. As shown in fig. 8, the web page generating apparatus 10 may include: an obtaining module 101 and a processing module 102.
The obtaining module 101 is configured to obtain an original text of a first web page in response to a processing request, where the processing request includes a uniform resource locator of the first web page, and the original text of the first web page is used to obtain content in the first web page.
The processing module 102 is configured to pre-process the original text of the first webpage to obtain a target text of the first webpage.
The processing module 102 is further configured to insert the content in the first webpage into a second webpage based on the target text of the first webpage to generate a target webpage.
Optionally, the processing module 102 is specifically configured to perform a first preprocessing on the original text of the first webpage to obtain a target text of the first webpage, where the first preprocessing is used to perform scoping on the original text.
Optionally, the web page generating apparatus 10 further includes a determining module 103.
A determination module 103 configured to determine a plurality of style texts of the first web page based on the original text of the first web page.
The determining module 103 is further configured to determine a plurality of first tags of the first web page and a second tag of the second web page.
The processing module 102 is specifically configured to, when a target tag identical to the identifier of the second tag exists in the plurality of first tags, add a preset scope identifier to at least one style text of the first webpage to generate the target text, where the at least one style text is a style text corresponding to the target tag.
Optionally, the processing module 102 is further specifically configured to set scope information for the style text corresponding to the target tag based on the preset scope identifier in the target text.
The processing module 102 is specifically configured to insert the content in the first web page into the second web page based on the target text of the first web page, and update the target tag based on the scope information to generate the target web page.
Optionally, the processing module 102 is specifically configured to perform a second preprocessing on the original text of the first webpage to obtain a target text of the first webpage, where the second preprocessing is used to perform a filtering processing on the original text.
Optionally, the obtaining module 101 is further configured to obtain a risk text from the original text of the first webpage according to a preset query rule.
The processing module 102 is further specifically configured to perform a translation process on the risk text to generate a translation text.
The processing module 102 is further specifically configured to filter the translated text from the original text of the first web page to generate the target text.
Optionally, the processing module 102 is specifically configured to insert the content in the first webpage into the specified position of the second webpage based on the target text, so as to generate the target webpage.
As described above, the embodiment of the present disclosure may perform division of the functional modules on the web page generation apparatus according to the above method example. The integrated module can be realized in a hardware form, and can also be realized in a software functional module form. In addition, it should be further noted that the division of the modules in the embodiments of the present disclosure is schematic, and is only a logic function division, and there may be another division manner in actual implementation. For example, the functional blocks may be divided for the respective functions, or two or more functions may be integrated into one processing block.
Regarding the web page generation apparatus in the foregoing embodiment, the specific manner in which each module executes operations and the beneficial effects thereof have been described in detail in the foregoing method embodiment, and are not described herein again.
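The module division of fig. 8 might be sketched as follows, purely as an illustration of how the obtaining module and the processing module could cooperate. All class and method names are assumptions, the fetch function is injected so that the sketch does not depend on a network, and the preprocessing steps are passed in as plain callables rather than being the specific first and second preprocessing of the disclosure.

```python
from typing import Callable, List


class ObtainingModule:
    """Obtains the original text of the first page from its URL."""

    def __init__(self, fetch: Callable[[str], str]) -> None:
        self._fetch = fetch  # injected, e.g. an HTTP client in practice

    def obtain_original_text(self, url: str) -> str:
        return self._fetch(url)


class ProcessingModule:
    """Preprocesses the original text and inserts the result into the second page."""

    def __init__(self, steps: List[Callable[[str], str]]) -> None:
        self._steps = steps

    def preprocess(self, original_text: str) -> str:
        target_text = original_text
        for step in self._steps:  # e.g. scope processing, filtering
            target_text = step(target_text)
        return target_text

    def generate(self, target_text: str, second_page: str, marker: str) -> str:
        return second_page.replace(marker, target_text, 1)


def handle_request(url: str, second_page: str, marker: str,
                   obtaining: ObtainingModule, processing: ProcessingModule) -> str:
    original_text = obtaining.obtain_original_text(url)
    target_text = processing.preprocess(original_text)
    return processing.generate(target_text, second_page, marker)


obtaining = ObtainingModule(fetch=lambda url: "  <p>hello</p>  ")
processing = ProcessingModule(steps=[str.strip])
page = handle_request("https://example.com/first", "<main>{{slot}}</main>", "{{slot}}",
                      obtaining, processing)
print(page)
```

Keeping the preprocessing steps as a list mirrors the remark above that the module division is only a logical one and that functions may be split or merged in an actual implementation.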
Fig. 9 is a schematic structural diagram of another web page generation apparatus provided by the present disclosure. As shown in fig. 9, the web page generating device 20 may include at least one processor 201 and a memory 203 for storing processor-executable instructions. Wherein the processor 201 is configured to execute the instructions in the memory 203 to implement the web page generation method in the above-described embodiments.
In addition, the web page generating device 20 may further include a communication bus 202 and at least one communication interface 204.
The processor 201 may be a Central Processing Unit (CPU), a micro-processing unit, an ASIC, or one or more integrated circuits for controlling the execution of programs according to the present disclosure.
The communication bus 202 may include a path that conveys information between the aforementioned components.
The communication interface 204 may be any device, such as a transceiver, for communicating with other devices or communication networks, such as an ethernet, a Radio Access Network (RAN), a Wireless Local Area Network (WLAN), etc.
The memory 203 may be a read-only memory (ROM) or other type of static storage device that can store static information and instructions, a Random Access Memory (RAM) or other type of dynamic storage device that can store information and instructions, an electrically erasable programmable read-only memory (EEPROM), a compact disc read-only memory (CD-ROM) or other optical disc storage (including compact disc, laser disc, optical disc, digital versatile disc, Blu-ray disc, etc.), magnetic disk storage media or other magnetic storage devices, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer, but is not limited to these. The memory may be self-contained and connected to the processing unit by a bus. The memory may also be integrated with the processing unit.
The memory 203 is used for storing instructions for executing the disclosed solution, and is controlled by the processor 201. The processor 201 is configured to execute instructions stored in the memory 203 to implement the functions of the disclosed method.
In particular implementations, processor 201 may include one or more CPUs such as CPU0 and CPU1 in fig. 9 for one embodiment.
In particular implementations, web page generating device 20 may include a plurality of processors, such as processor 201 and processor 207 in fig. 9, as an embodiment. Each of these processors may be a single-core (single-CPU) processor or a multi-core (multi-CPU) processor. A processor herein may refer to one or more devices, circuits, and/or processing cores for processing data (e.g., computer program instructions).
In a specific implementation, the web page generating apparatus 20 may further include an output device 205 and an input device 206, as an embodiment. The output device 205 is in communication with the processor 201 and may display information in a variety of ways. For example, the output device 205 may be a Liquid Crystal Display (LCD), a Light Emitting Diode (LED) display device, a Cathode Ray Tube (CRT) display device, a projector (projector), or the like. The input device 206 is in communication with the processor 201 and can accept user input in a variety of ways. For example, the input device 206 may be a mouse, a keyboard, a touch screen device, or a sensing device, among others.
Those skilled in the art will appreciate that the configuration shown in fig. 9 does not constitute a limitation of the web page generating apparatus 20, and may include more or fewer components than those shown, or combine some components, or adopt a different arrangement of components.
In addition, the present disclosure also provides a computer-readable storage medium including instructions, which when executed by an electronic device, cause the electronic device to perform the webpage generating method provided in the above embodiment.
In addition, the present disclosure also provides a computer program product including instructions, which when executed by an electronic device, cause the electronic device to execute the webpage generating method provided in the above embodiment.
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the invention disclosed herein. This disclosure is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.

Claims (10)

1. A webpage generating method is applied to electronic equipment and is characterized by comprising the following steps:
responding to a processing request, and acquiring an original text of a first webpage, wherein the processing request comprises a uniform resource locator of the first webpage, and the original text of the first webpage is used for acquiring content in the first webpage;
preprocessing the original text of the first webpage to obtain a target text of the first webpage;
and inserting the content in the first webpage into a second webpage based on the target text of the first webpage to generate a target webpage.
2. The method for generating a web page according to claim 1, wherein the preprocessing the original text of the first web page to obtain the target text of the first web page includes:
and performing first preprocessing on the original text of the first webpage to obtain a target text of the first webpage, wherein the first preprocessing is used for performing scope processing on the original text.
3. The method for generating a web page according to claim 1, wherein the first preprocessing the original text of the first web page to obtain the target text of the first web page includes:
determining a plurality of style texts of the first webpage based on the original text of the first webpage;
determining a plurality of first tags of the first webpage and a second tag of the second webpage;
when a target tag whose identifier is identical to the identifier of the second tag exists in the plurality of first tags, adding a preset scope identifier to at least one style text of the first webpage to generate the target text, wherein the at least one style text is the style text corresponding to the target tag.
4. The method for generating a web page according to claim 3, wherein the inserting the content in the first web page into a second web page based on the target text of the first web page to generate a target web page comprises:
setting scope information for the style text corresponding to the target tag based on the preset scope identifier in the target text;
inserting the content in the first webpage into the second webpage based on the target text of the first webpage, and updating the target tag based on the scope information to generate the target webpage.
5. The method for generating web pages according to any one of claims 1 to 4, wherein the preprocessing the original text of the first web page to obtain the target text of the first web page further comprises:
and performing second preprocessing on the original text of the first webpage to obtain a target text of the first webpage, wherein the second preprocessing is used for filtering the original text.
6. The method for generating a web page according to claim 5, wherein the second preprocessing the original text of the first web page to obtain the target text of the first web page includes:
acquiring a risk text from an original text of the first webpage according to a preset query rule;
translating the risk text to generate a translated text;
filtering the translated text from the original text of the first web page to generate the target text.
7. A webpage generating device, characterized by comprising an obtaining module and a processing module;
the obtaining module is configured to obtain an original text of a first webpage in response to a processing request, where the processing request includes a uniform resource locator of the first webpage, and the original text of the first webpage is used for obtaining content in the first webpage;
the processing module is configured to preprocess an original text of the first webpage to obtain a target text of the first webpage;
the processing module is further configured to insert the content in the first webpage into a second webpage based on the target text of the first webpage to generate a target webpage.
8. An electronic device, characterized in that the electronic device comprises:
a processor;
a memory configured to store the processor-executable instructions;
wherein the processor is configured to execute the instructions to implement the web page generation method of any one of claims 1-6.
9. A computer-readable storage medium having instructions stored thereon, wherein the instructions in the computer-readable storage medium, when executed by an electronic device, enable the electronic device to perform the web page generation method of any one of claims 1-6.
10. A computer program product, characterized in that it comprises computer instructions which, when run on an electronic device, cause the electronic device to perform the web page generation method according to any one of claims 1-6.
CN202210346440.7A 2022-03-31 2022-03-31 Webpage generation method and device, electronic equipment and storage medium Pending CN114817804A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210346440.7A CN114817804A (en) 2022-03-31 2022-03-31 Webpage generation method and device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210346440.7A CN114817804A (en) 2022-03-31 2022-03-31 Webpage generation method and device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN114817804A true CN114817804A (en) 2022-07-29

Family

ID=82532845

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210346440.7A Pending CN114817804A (en) 2022-03-31 2022-03-31 Webpage generation method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN114817804A (en)

Similar Documents

Publication Publication Date Title
EP3876116B1 (en) Method and apparatus for running mini program, electronic device, and storage medium
EP3518124A1 (en) Webpage rendering method and related device
CN109522018B (en) Page processing method and device and storage medium
US8819177B2 (en) Adding personalized value to web sites
CN104461484B (en) The implementation method and device of front-end template
CN106294658B (en) Webpage quick display method and device
EP2458499B1 (en) Method and equipment for generating widget
US9117314B2 (en) Information output apparatus, method, and recording medium for displaying information on a video display
US11677809B2 (en) Methods for transforming a server side template into a client side template and devices thereof
CN104268229B (en) Resource obtaining method and device based on multi-process browser
CN112685671A (en) Page display method, device, equipment and storage medium
CN114996619A (en) Page display method and device, computer equipment and storage medium
WO2021098242A1 (en) Page processing method and apparatus, electronic device and computer readable medium
WO2022134776A1 (en) Label-based anti-crawler method and apparatus, computer device, and storage medium
CN113360106B (en) Webpage printing method and device
CN113239256A (en) Method for generating website signature and method and device for identifying website
CN111488546A (en) Page generation method and device and storage medium
CN114817804A (en) Webpage generation method and device, electronic equipment and storage medium
CN111783006A (en) Page generation method and device, electronic equipment and computer readable medium
CN108664535B (en) Information output method and device
KR20110027790A (en) Information output apparatus, information output method, and recording medium
CN114679321B (en) SSTI vulnerability detection method, device and medium
US11847405B1 (en) Encoding hyperlink data in a printed document
CN111190818B (en) Front-end code analysis method, front-end code analysis device, computer equipment and storage medium
CN118036053A (en) Page data display method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination