CN117312711A - Search engine optimization method and system based on AI analysis - Google Patents
Search engine optimization method and system based on AI analysis
- Publication number
- CN117312711A
- Authority
- CN
- China
- Prior art keywords
- content
- page
- search engine
- links
- keywords
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/958—Organisation or management of web site content, e.g. publishing, maintaining pages or automatic linking
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/30—Monitoring
- G06F11/3003—Monitoring arrangements specially adapted to the computing system or computing system component being monitored
- G06F11/302—Monitoring arrangements specially adapted to the computing system or computing system component being monitored where the computing system component is a software system
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/30—Monitoring
- G06F11/34—Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation ; Recording or statistical evaluation of user activity, e.g. usability assessment
- G06F11/3438—Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation ; Recording or statistical evaluation of user activity, e.g. usability assessment monitoring of user actions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/951—Indexing; Web crawling techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/955—Retrieval from the web using information identifiers, e.g. uniform resource locators [URL]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2201/00—Indexing scheme relating to error detection, to error correction, and to monitoring
- G06F2201/865—Monitoring of software
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Databases & Information Systems (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Computing Systems (AREA)
- Quality & Reliability (AREA)
- Mathematical Physics (AREA)
- Computer Hardware Design (AREA)
- Information Transfer Between Computers (AREA)
Abstract
The invention discloses a search engine optimization method and system based on AI analysis, belonging to the technical field of website optimization. The method comprises the following steps: acquiring a website URL provided by a user, and acquiring the operating-region information and HTML content of the website according to the URL; parsing the web page with the Requests library and the BeautifulSoup library, capturing all links in the page, and filtering out empty links, external links and repeatedly captured links; analyzing the page elements and checking whether the content of each page element is blank or unreasonable; if so, generating corresponding supplementary content according to the context information, and filling in or replacing the original element content through JavaScript; if not, continuing the analysis; and periodically monitoring the ranking and exposure of the user's website in search engines, and adjusting the website according to the monitoring results. The invention thereby optimizes the ranking and exposure of the website in overseas search engines.
Description
Technical Field
The invention relates to the technical field of webpage optimization, in particular to a search engine optimization method and system based on AI analysis.
Background
Artificial intelligence analysis has broad prospects in search engine optimization. With the continuous development and application of artificial intelligence technology, the role of AI in search engine optimization is becoming more and more important. In recent years, with the advent of a series of artificial intelligence technologies such as GPT, the effect of AI in search engine optimization has become more pronounced. GPT is a deep-learning-based natural language processing technology that automatically generates text and simulates human language behavior, allowing AI to better understand a user's search query, provide more relevant search results, and optimize the ranking of websites.
Sophisticated AI search engine optimization (SEO) tools are critical to the success of digital content. Once visitors reach a website, search engine optimization keeps them on each page longer and holds their attention as they learn more about what the site offers. Among the many elements of SEO, page elements (such as meta titles and descriptions) are the most critical factors for achieving a high ranking on Google and similar search engines. Many tools on the market can offer optimization suggestions, but most website owners can only perform simple operations on their sites: faced with too many specialized modification suggestions, they do not know how to apply them, and they lack the corresponding development capability. In addition, existing search engine optimization methods are based on keyword optimization, whose effect in overseas markets is limited because search habits and languages differ greatly between users in different countries and regions.
Disclosure of Invention
The invention aims to provide a search engine optimization method and system based on AI analysis that solve the following technical problems:
most website owners can only perform simple operations on their sites; faced with too many specialized modification suggestions, they do not know how to apply them, and they lack the corresponding development capability. In addition, existing search engine optimization methods are based on keyword optimization, whose effect in overseas markets is limited because search habits and languages differ greatly between users in different countries and regions.
The aim of the invention can be achieved by the following technical scheme:
A search engine optimization method based on AI analysis comprises the following steps:
acquiring a website URL provided by a user, and acquiring the operating-region information and HTML content of the website according to the URL;
parsing the web page with the Requests library and the BeautifulSoup library, capturing all links in the page, and filtering out empty links, external links and repeatedly captured links;
analyzing the page elements and checking whether the content of each page element is blank or unreasonable; if so, generating corresponding supplementary content according to the context information, and filling in or replacing the original element content through JavaScript; if not, continuing the analysis;
periodically monitoring the ranking and exposure of the user's website in search engines, and adjusting the website according to the monitoring results.
As a further scheme of the invention: the process of analyzing the page elements is as follows:
identifying the text elements of the page, the text elements comprising the title, keywords and brief introduction, and checking whether the title, keyword and brief introduction content is blank or unreasonable;
identifying the picture tags in the page, checking whether each picture has an alt attribute and whether the alt content is reasonable, checking whether the picture opens normally, and checking whether the picture size meets the requirements;
identifying the link tags in the page, and checking the quantity and quality of internal links and external links;
identifying the h1 tags in the page, and checking the position and number of h1 tags used in the page.
As a further scheme of the invention: the process of generating the corresponding complement content is as follows:
generating a corresponding title, keywords and brief introduction from the article content of the web page, and writing them into the corresponding text elements of the page;
analyzing the picture and the context of the paragraph in which it appears, generating accurate alt attribute content, and adjusting the picture size;
adjusting the density of internal links, adjusting the quantity and quality of external links, and adjusting the position and number of h1 tags.
As a further scheme of the invention: the process of updating the title and keywords according to the content of the article is as follows:
dividing a text annotated with keywords into a plurality of words, mapping the words into vectors, applying position-embedding encoding, and inputting the vectors into a BERT model to extract the content features of the text; updating the BERT model by back-propagation against the annotated keywords until convergence; applying position-embedding encoding to the web page article to be processed and inputting it into the BERT model to extract the content features of the article, and clustering the features to generate pending keywords; updating the model by back-propagation using the pending keywords, repeating the update multiple times, and marking the resulting pending keywords as preferred keywords; writing the preferred keywords into the corresponding positions of the page; and generating a corresponding page title from the preferred keywords in grammatical order and updating the page title.
As a further scheme of the invention: the process of updating the brief introduction content according to the article content is as follows:
segmenting the text content by paragraph and sentence, and removing interference elements, the interference elements comprising advertisements, headers and footers; examining the opening part of the text to find paragraphs whose style and format differ from the rest of the content; extracting from those paragraphs the sentences containing specific words, the specific words comprising topic, purpose, method, result and argument; combining the extracted sentences; verifying the context information with a natural language processing tool; and forming the brief introduction content and updating it.
As a further scheme of the invention: the process of generating the picture alt attribute is as follows:
acquiring basic information about the image, including its title and description, from the image metadata; automatically recognizing the image content with a picture recognition tool to generate a preliminary description text; preprocessing the recognized text to remove unnecessary characters, punctuation marks and formatting; acquiring the context information of the position where the picture is located; clustering the context information and the description text to obtain keyword information; generating the alt attribute; and inserting the generated alt attribute into the alt description of the img tag.
A search engine optimization system based on AI analysis, comprising:
a content acquisition module, configured to acquire the website URL provided by the user and to acquire the operating-region information and HTML content of the website according to the URL;
a link screening module, configured to parse the web page with the Requests library and the BeautifulSoup library, capture all links in the page, and filter out empty links, external links and repeatedly captured links;
a page analysis module, configured to analyze the page elements and check whether the content of each page element is blank or unreasonable; if so, to generate corresponding supplementary content according to the context information and fill in or replace the original element content through JavaScript; if not, to continue the analysis;
and a monitoring feedback module, configured to periodically monitor the ranking and exposure of the user's website in search engines and adjust the website according to the monitoring results.
The invention has the beneficial effects that:
By detecting web pages, the invention supplements missing or unreasonable page elements, generates page content that is easy to find in search and attracts attention, and removes some unreasonable links. It uses AI technology to analyze and predict the search behavior and user preferences of overseas markets, thereby optimizing the ranking and exposure of the website in overseas search engines and improving the website's visibility and traffic in overseas markets.
Drawings
The invention is further described below with reference to the accompanying drawings.
FIG. 1 is a flow chart of a search engine optimization method based on AI analysis of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those skilled in the art based on these embodiments without inventive effort fall within the scope of the invention.
Referring to FIG. 1, the invention discloses a search engine optimization method and system based on AI analysis, the method comprising the following steps:
acquiring a website URL provided by a user, and acquiring the operating-region information and HTML content of the website according to the URL;
parsing the web page with the Requests library and the BeautifulSoup library, capturing all links in the page, and filtering out empty links, external links and repeatedly captured links;
analyzing the page elements and checking whether the content of each page element is blank or unreasonable; if so, generating corresponding supplementary content according to the context information, and filling in or replacing the original element content through JavaScript; if not, continuing the analysis;
periodically monitoring the ranking and exposure of the user's website in search engines, and adjusting the website according to the monitoring results.
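The link-filtering step can be illustrated with a minimal sketch. The patent names the Requests and BeautifulSoup libraries; to stay dependency-free, this sketch instead uses the Python standard library's `html.parser` on an HTML string that is assumed to have been fetched already. The names `LinkCollector` and `filter_links` are hypothetical, and treating `#` fragments and `javascript:` pseudo-links as "empty" links is an assumption, not something the patent specifies.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collects the raw href value of every <a> tag (None when absent)."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.hrefs.append(dict(attrs).get("href"))


def filter_links(html, page_url):
    """Return unique internal links, dropping empty, external and duplicate ones."""
    collector = LinkCollector()
    collector.feed(html)
    site_host = urlparse(page_url).netloc
    seen, kept = set(), []
    for href in collector.hrefs:
        href = (href or "").strip()
        # Empty links (assumed): missing href, bare fragment, javascript: pseudo-link.
        if not href or href.startswith("#") or href.lower().startswith("javascript:"):
            continue
        # Resolve relative links and drop the fragment so /a and /a#sec compare equal.
        absolute, _fragment = urldefrag(urljoin(page_url, href))
        parts = urlparse(absolute)
        # External links: non-HTTP schemes or a host other than the page's own.
        if parts.scheme not in ("http", "https") or parts.netloc != site_host:
            continue
        if absolute in seen:  # repeatedly captured link
            continue
        seen.add(absolute)
        kept.append(absolute)
    return kept
```

In a setup closer to the one described, the HTML string would come from an HTTP request to the user's URL, and the `<a>` tags would be collected with BeautifulSoup rather than `HTMLParser`; the filtering logic would stay the same.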
The method and system use AI technology to analyze and predict the search behavior and user preferences of overseas markets, thereby optimizing the ranking and exposure of the website in overseas search engines and improving its visibility and traffic in overseas markets.
In a preferred embodiment of the present invention, the process of analyzing the page elements is:
identifying the text elements of the page, the text elements comprising the title, keywords and brief introduction, and checking whether the title, keyword and brief introduction content is blank or unreasonable;
identifying the picture tags in the page, checking whether each picture has an alt attribute and whether the alt content is reasonable, checking whether the picture opens normally, and checking whether the picture size meets the requirements;
identifying the link tags in the page, and checking the quantity and quality of internal links and external links;
identifying the h1 tags in the page, and checking the position and number of h1 tags used in the page.
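The checks listed above might look roughly like the following sketch, which covers only the "blank" cases (judging whether content is "unreasonable", whether a picture opens, and link quality are out of scope here). It assumes the keywords and brief introduction live in `<meta name="keywords">` and `<meta name="description">` tags and that exactly one h1 per page is the target; neither assumption is stated in the patent. `PageAudit` and `audit_page` are hypothetical names.

```python
from html.parser import HTMLParser


class PageAudit(HTMLParser):
    """Records the page elements that the analysis step inspects."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta = {}                # lower-cased meta name -> stripped content
        self.images_missing_alt = []  # src of each <img> with no usable alt
        self.h1_count = 0
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and attrs.get("name"):
            self.meta[attrs["name"].lower()] = (attrs.get("content") or "").strip()
        elif tag == "img" and not (attrs.get("alt") or "").strip():
            self.images_missing_alt.append(attrs.get("src"))
        elif tag == "h1":
            self.h1_count += 1

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def audit_page(html):
    """Return a list of human-readable issues found in the page."""
    audit = PageAudit()
    audit.feed(html)
    issues = []
    if not audit.title.strip():
        issues.append("title is blank")
    for name in ("keywords", "description"):
        if not audit.meta.get(name):
            issues.append(f"meta {name} is blank")
    for src in audit.images_missing_alt:
        issues.append(f"image {src} has no alt text")
    if audit.h1_count != 1:  # assumed target: a single h1 per page
        issues.append(f"expected one h1, found {audit.h1_count}")
    return issues
```

Each reported issue would then be handed to the content-generation step, which decides what to fill in.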
In another preferred embodiment of the present invention, the process of generating the corresponding complement content is:
generating a corresponding title, keywords and brief introduction from the article content of the web page, and writing them into the corresponding text elements of the page;
analyzing the picture and the context of the paragraph in which it appears, generating accurate alt attribute content, and adjusting the picture size;
adjusting the density of internal links, adjusting the quantity and quality of external links, and adjusting the position and number of h1 tags.
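The method states that generated content is filled in or substituted through JavaScript. One way this could be done, sketched below, is to have the server side emit a small script that the page runs. The function name `build_fill_script`, its parameters, and the choice of `<meta name="description">` as the target of the brief introduction are assumptions for illustration. `json.dumps` is used so that generated text is emitted as a properly escaped JavaScript string literal.

```python
import json


def build_fill_script(title=None, description=None, image_alts=None):
    """Return JavaScript that fills in the given values when run in the page.

    image_alts maps an image's src attribute to the alt text to apply.
    """
    lines = []
    if title:
        lines.append(f"document.title = {json.dumps(title)};")
    if description:
        lines.append(
            "var m = document.querySelector('meta[name=\"description\"]');\n"
            "if (!m) { m = document.createElement('meta'); m.name = 'description';"
            " document.head.appendChild(m); }\n"
            f"m.content = {json.dumps(description)};"
        )
    for src, alt in (image_alts or {}).items():
        lines.append(
            "document.querySelectorAll('img').forEach(function (img) {\n"
            f"  if (img.getAttribute('src') === {json.dumps(src)}) img.alt = {json.dumps(alt)};\n"
            "});"
        )
    return "\n".join(lines)
```

For example, `build_fill_script(title="Trail Shoes")` yields the single statement `document.title = "Trail Shoes";`. Whether search engine crawlers execute such a script before indexing the page is outside the scope of this sketch.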
In another preferred embodiment of the present invention, the process of updating the title and keywords according to the content of the article is:
dividing a text annotated with keywords into a plurality of words, mapping the words into vectors, applying position-embedding encoding, and inputting the vectors into a BERT model to extract the content features of the text; updating the BERT model by back-propagation against the annotated keywords until convergence; applying position-embedding encoding to the web page article to be processed and inputting it into the BERT model to extract the content features of the article, and clustering the features to generate pending keywords; updating the model by back-propagation using the pending keywords, repeating the update multiple times, and marking the resulting pending keywords as preferred keywords; writing the preferred keywords into the corresponding positions of the page; and generating a corresponding page title from the preferred keywords in grammatical order and updating the page title.
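The BERT fine-tuning and clustering themselves are not reproduced here. The sketch below covers only the final selection step, under one possible reading of the text: that pending keywords which keep reappearing across update rounds are the ones marked as preferred. The per-round keyword lists are taken as given input, and both `select_preferred_keywords` and the `min_rounds` threshold are hypothetical.

```python
from collections import Counter


def select_preferred_keywords(rounds, min_rounds=2):
    """Keep pending keywords that appear in at least `min_rounds` update rounds.

    rounds is a list of keyword lists, one list per update round.
    """
    counts = Counter()
    for candidates in rounds:
        # Count each keyword once per round, ignoring case and stray spaces.
        counts.update({kw.strip().lower() for kw in candidates if kw.strip()})
    # Most frequently recurring first; ties broken alphabetically for stability.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [kw for kw, n in ranked if n >= min_rounds]
```

With three rounds in which "running shoes" appears every time and "trail" twice, the function returns those two keywords in that order and drops keywords seen only once.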
In another preferred embodiment of the present invention, the process of updating the brief introduction content according to the article content is:
segmenting the text content by paragraph and sentence, and removing interference elements, the interference elements comprising advertisements, headers and footers; examining the opening part of the text to find paragraphs whose style and format differ from the rest of the content; extracting from those paragraphs the sentences containing specific words, the specific words comprising topic, purpose, method, result and argument; combining the extracted sentences; verifying the context information with a natural language processing tool; and forming the brief introduction content and updating it.
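The sentence-extraction part of this process can be sketched as follows. The sketch assumes the opening paragraphs have already been isolated and cleaned of advertisements, headers and footers, and that the text is English; it does not attempt the style detection or the NLP verification step. The cue words are the ones listed in the text, while `build_description`, the regular-expression sentence splitter, and the 160-character limit are assumptions.

```python
import re

# Cue words named in the description; matching is by lower-cased substring.
CUE_WORDS = ("topic", "purpose", "method", "result", "argument")


def build_description(paragraphs, max_chars=160):
    """Join the sentences from the given paragraphs that contain a cue word."""
    picked = []
    for paragraph in paragraphs:
        # Split after sentence-ending punctuation that is followed by whitespace.
        for sentence in re.split(r"(?<=[.!?])\s+", paragraph.strip()):
            if any(cue in sentence.lower() for cue in CUE_WORDS):
                picked.append(sentence)
    description = " ".join(picked)
    if len(description) > max_chars:
        # Truncate, then drop the trailing partial word.
        description = description[:max_chars].rsplit(" ", 1)[0]
    return description
```

Substring matching means "results" and "methods" also count as hits; a production version would likely need language-specific cue lists for each target market.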
In another preferred embodiment of the present invention, the process of generating the picture alt attribute is:
acquiring basic information about the image, including its title and description, from the image metadata; automatically recognizing the image content with a picture recognition tool to generate a preliminary description text; preprocessing the recognized text to remove unnecessary characters, punctuation marks and formatting; acquiring the context information of the position where the picture is located; clustering the context information and the description text to obtain keyword information; generating the alt attribute; and inserting the generated alt attribute into the alt description of the img tag.
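A rough sketch of the preprocessing and keyword-selection steps follows. It takes the recognized description text as input (the picture recognition tool is not modeled), and it replaces the clustering step with a much simpler stand-in: words that also occur in the surrounding paragraph are ranked first. The stop-word list, the English-only cleaning rule, and the names `clean_text` and `generate_alt` are all assumptions.

```python
import re

# Small illustrative stop-word list (assumed, English only).
STOP_WORDS = {"a", "an", "the", "of", "in", "on", "and", "with", "is", "to"}


def clean_text(text):
    """Lower-case the text and strip everything except letters, digits and spaces."""
    no_punct = re.sub(r"[^0-9a-z\s]", " ", text.lower())
    return re.sub(r"\s+", " ", no_punct).strip()


def generate_alt(recognized_text, context_text, max_words=8):
    """Build alt text from recognized words, favouring those also in the context."""
    context_words = set(clean_text(context_text).split())
    words = []
    for word in clean_text(recognized_text).split():
        if word not in STOP_WORDS and word not in words:
            words.append(word)
    # Stable sort: words shared with the surrounding paragraph come first.
    words.sort(key=lambda w: w not in context_words)
    return " ".join(words[:max_words])
```

The result is a keyword-style alt string rather than a fluent sentence; producing natural phrasing would need the generation step that the description leaves unspecified.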
A search engine optimization system based on AI analysis, comprising:
a content acquisition module, configured to acquire the website URL provided by the user and to acquire the operating-region information and HTML content of the website according to the URL;
a link screening module, configured to parse the web page with the Requests library and the BeautifulSoup library, capture all links in the page, and filter out empty links, external links and repeatedly captured links;
a page analysis module, configured to analyze the page elements and check whether the content of each page element is blank or unreasonable; if so, to generate corresponding supplementary content according to the context information and fill in or replace the original element content through JavaScript; if not, to continue the analysis;
and a monitoring feedback module, configured to periodically monitor the ranking and exposure of the user's website in search engines and adjust the website according to the monitoring results.
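The text does not say how the monitoring feedback module decides that an adjustment is needed. One simple possibility, sketched here purely as an assumption, is to compare the average search-result position over the most recent observations with the average over the preceding ones and to trigger re-analysis when the rank has worsened by more than a tolerance. `needs_adjustment`, `window` and `tolerance` are hypothetical names and values.

```python
from statistics import mean


def needs_adjustment(rank_history, window=3, tolerance=2):
    """Return True when the recent average rank is clearly worse than before.

    rank_history lists search-result positions, oldest first; 1 is the best rank.
    """
    if len(rank_history) < 2 * window:
        return False  # not enough observations to compare two windows
    earlier = mean(rank_history[-2 * window:-window])
    recent = mean(rank_history[-window:])
    # A larger position number means a worse rank.
    return recent - earlier > tolerance
```

A history of `[5, 6, 5, 9, 10, 11]` would trigger an adjustment, while `[5, 6, 5, 6, 5, 6]` would not.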
The foregoing describes one embodiment of the present invention in detail, but the description is only a preferred embodiment of the present invention and should not be construed as limiting the scope of the invention. All equivalent changes and modifications within the scope of the present invention are intended to be covered by the present invention.
Claims (7)
1. A search engine optimization method based on AI analysis is characterized by comprising the following steps:
acquiring a website URL provided by a user, and acquiring the operating-region information and HTML content of the website according to the URL;
parsing the web page with the Requests library and the BeautifulSoup library, capturing all links in the page, and filtering out empty links, external links and repeatedly captured links;
analyzing the page elements and checking whether the content of each page element is blank or unreasonable; if so, generating corresponding supplementary content according to the context information, and filling in or replacing the original element content through JavaScript; if not, continuing the analysis;
periodically monitoring the ranking and exposure of the user's website in search engines, and adjusting the website according to the monitoring results.
2. The method for optimizing a search engine based on AI analysis of claim 1, wherein the process of analyzing the page elements is:
identifying the text elements of the page, the text elements comprising the title, keywords and brief introduction, and checking whether the title, keyword and brief introduction content is blank or unreasonable;
identifying the picture tags in the page, checking whether each picture has an alt attribute and whether the alt content is reasonable, checking whether the picture opens normally, and checking whether the picture size meets the requirements;
identifying the link tags in the page, and checking the quantity and quality of internal links and external links;
identifying the h1 tags in the page, and checking the position and number of h1 tags used in the page.
3. The method for optimizing a search engine based on AI analysis of claim 1, wherein the process of generating the corresponding complement content is:
generating a corresponding title, keywords and brief introduction from the article content of the web page, and writing them into the corresponding text elements of the page;
analyzing the picture and the context of the paragraph in which it appears, generating accurate alt attribute content, and adjusting the picture size;
adjusting the density of internal links, adjusting the quantity and quality of external links, and adjusting the position and number of h1 tags.
4. The search engine optimization method based on AI analysis of claim 3, wherein the process of updating titles and keywords according to the contents of articles is as follows:
dividing a text annotated with keywords into a plurality of words, mapping the words into vectors, applying position-embedding encoding, and inputting the vectors into a BERT model to extract the content features of the text; updating the BERT model by back-propagation against the annotated keywords until convergence; applying position-embedding encoding to the web page article to be processed and inputting it into the BERT model to extract the content features of the article, and clustering the features to generate pending keywords; updating the model by back-propagation using the pending keywords, repeating the update multiple times, and marking the resulting pending keywords as preferred keywords; writing the preferred keywords into the corresponding positions of the page; and generating a corresponding page title from the preferred keywords in grammatical order and updating the page title.
5. The search engine optimization method based on AI analysis of claim 3, wherein the process of updating the brief introduction content according to the article content is:
segmenting the text content by paragraph and sentence, and removing interference elements, the interference elements comprising advertisements, headers and footers; examining the opening part of the text to find paragraphs whose style and format differ from the rest of the content; extracting from those paragraphs the sentences containing specific words, the specific words comprising topic, purpose, method, result and argument; combining the extracted sentences; verifying the context information with a natural language processing tool; and forming the brief introduction content and updating it.
6. The search engine optimization method based on AI analysis of claim 3, wherein the process of generating the picture alt attribute is:
acquiring basic information about the image, including its title and description, from the image metadata; automatically recognizing the image content with a picture recognition tool to generate a preliminary description text; preprocessing the recognized description text to remove unnecessary characters, punctuation marks and formatting; acquiring the context information of the position where the picture is located; clustering the context information and the description text, and acquiring the key information at the cluster cores; generating the alt attribute according to the key information; and inserting the generated alt attribute into the alt description of the img tag.
7. A search engine optimization system based on AI analysis, comprising:
a content acquisition module, configured to acquire the website URL provided by the user and to acquire the operating-region information and HTML content of the website according to the URL;
a link screening module, configured to parse the web page with the Requests library and the BeautifulSoup library, capture all links in the page, and filter out empty links, external links and repeatedly captured links;
a page analysis module, configured to analyze the page elements and check whether the content of each page element is blank or unreasonable; if so, to generate corresponding supplementary content according to the context information and fill in or replace the original element content through JavaScript; if not, to continue the analysis;
and a monitoring feedback module, configured to periodically monitor the ranking and exposure of the user's website in search engines and adjust the website according to the monitoring results.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311250434.2A CN117312711A (en) | 2023-09-26 | 2023-09-26 | Search engine optimization method and system based on AI analysis |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311250434.2A CN117312711A (en) | 2023-09-26 | 2023-09-26 | Search engine optimization method and system based on AI analysis |
Publications (1)
Publication Number | Publication Date |
---|---|
CN117312711A (en) | 2023-12-29 |
Family
ID=89287785
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202311250434.2A (Pending) CN117312711A (en) | Search engine optimization method and system based on AI analysis | 2023-09-26 | 2023-09-26 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117312711A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN118095443A (en) * | 2024-04-19 | 2024-05-28 | 深圳爱莫科技有限公司 | Training method and equipment for generating large text model according to facts |
CN118094049A (en) * | 2024-04-19 | 2024-05-28 | 福建省政务门户网站运营管理有限公司 | Portal website dynamic management system based on big data |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN118095443A (en) * | 2024-04-19 | 2024-05-28 | 深圳爱莫科技有限公司 | Training method and equipment for generating large text model according to facts |
CN118094049A (en) * | 2024-04-19 | 2024-05-28 | 福建省政务门户网站运营管理有限公司 | Portal website dynamic management system based on big data |
CN118094049B (en) * | 2024-04-19 | 2024-07-23 | 福建省政务门户网站运营管理有限公司 | Portal website dynamic management system based on big data |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||