CN111914201A - Network page processing method and device - Google Patents

Publication number
CN111914201A
Authority
CN
China
Prior art keywords
page
pages
target
medical
field
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010789735.2A
Other languages
Chinese (zh)
Other versions
CN111914201B (en)
Inventor
康战辉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Priority to CN202010789735.2A priority Critical patent/CN111914201B/en
Publication of CN111914201A publication Critical patent/CN111914201A/en
Application granted granted Critical
Publication of CN111914201B publication Critical patent/CN111914201B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/958 Organisation or management of web site content, e.g. publishing, maintaining pages or automatic linking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

Embodiments of the present application provide a method and a device for processing a web page. The method comprises: performing domain classification on pages to be processed based on their content to obtain at least one domain; determining an authority value of each page in a target domain relative to the other pages in the target domain based on the association relationships among the pages in the target domain; and presenting information of the pages in the target domain in a web page based on the authority value of each page. Because the authority values are calculated over the pages of a single domain, associated pages within that domain are displayed together, which improves the logic and hierarchy of web page pushing and the content pushing effect on the user side.

Description

Network page processing method and device
Technical Field
The present application relates to the field of computer and communication technologies, and in particular, to a method and an apparatus for processing a web page.
Background
Many websites push content by recommending related web pages within a web page, thereby promoting information. For in-site pushing, a website commonly retrieves pages through the index of its in-site search engine and pushes them directly, so that related content is presented on the user terminal. However, because the pushed content varies in source, type, and so on, content pushed in this way is often disorganized and lacks logic and hierarchy, so the content pushing effect on the user terminal is poor.
Disclosure of Invention
Embodiments of the present application provide a method and an apparatus for processing a network page, so that the logic and hierarchy of network page pushing can be improved at least to a certain extent, and the content pushing effect on a user side is improved.
Other features and advantages of the present application will be apparent from the following detailed description, or may be learned by practice of the application.
According to an aspect of an embodiment of the present application, a method for processing a web page is provided, including: performing domain classification on the page to be processed based on the content in the page to be processed to obtain at least one domain; determining authority values of the pages in a target domain relative to other pages in the target domain based on the association relationship among the pages in the target domain; and presenting the information of the pages in the target domain in the web page based on the authority values corresponding to the pages.
According to an aspect of the embodiments of the present application, there is provided a device for processing a web page, including: a classification unit, configured to perform domain classification on the page to be processed based on the content in the page to be processed to obtain at least one domain; a valuation unit, configured to determine authority values of the pages in the target domain relative to other pages in the target domain based on the association relationship among the pages in the target domain; and a presenting unit, configured to present the information of the pages in the target domain in the web page based on the authority values corresponding to the pages.
In some embodiments of the present application, based on the foregoing solution, the processing device of the web page further includes: the first acquisition unit is used for acquiring the website navigation information; the second acquisition unit is used for acquiring a page in the website based on the website structure and the seed page in the website navigation information; and the relationship determining unit is used for determining the association relationship among the pages based on the link relationship among the pages.
In some embodiments of the present application, based on the foregoing scheme, the second acquisition unit is configured to: crawl the information in the website based on the website structure in the website navigation information and the seed page, to obtain the pages in the website.
In some embodiments of the present application, based on the foregoing scheme, the classification unit includes: the extraction unit is used for extracting the text content in the page to be processed; and the input unit is used for inputting the text content into a trained page classification model to obtain a field corresponding to the to-be-processed page output by the page classification model.
In some embodiments of the present application, based on the foregoing scheme, the method for training the page classification model includes: acquiring the text content of a page sample and its corresponding domain label; extracting a vocabulary sample from the text content; inputting the vocabulary sample into a page classification network to obtain a classification result output by the page classification network; and adjusting parameters in the page classification network based on a loss function computed from the classification result and the domain label, to obtain the page classification model.
In some embodiments of the present application, based on the foregoing scheme, the valuation unit includes: an associated page determining unit, configured to determine the associated pages in the target domain based on the association relationship among the pages in the selected target domain; and an authority value determining unit, configured to determine the authority values of the associated pages in the target domain relative to other pages in the target domain based on the calling relationship among the associated pages, wherein the calling relationship is positively correlated with the authority values.
In some embodiments of the present application, based on the foregoing solution, the authority value determining unit is configured to: determine an association matrix based on the calling relationship among the associated pages; determine an authority parameter representing the relationship between the associated page and other pages in the target domain, based on the target domain and the pages in the target domain other than the associated page; and determine the authority value of the associated page in the target domain relative to other pages in the target domain based on the association matrix, the authority parameter and a damping coefficient.
In some embodiments of the present application, based on the foregoing solution, the presenting unit includes: a third acquisition unit, configured to acquire a search term for the target domain; a target page determining unit, configured to search the pages corresponding to the target domain for target pages corresponding to the search term; and a page presenting unit, configured to determine the display order of the target pages based on their authority values and to present the information of the target pages in the web page in that display order.
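The presentation step described above (search term, target pages, display order by authority value) can be illustrated with a minimal Python sketch. The function name `rank_target_pages`, the sample pages, the scores, and the use of case-insensitive substring matching as the search are all assumptions made for illustration; the application does not specify how matching is performed.

```python
from typing import Dict, List


def rank_target_pages(
    search_term: str,
    domain_pages: Dict[str, str],
    authority: Dict[str, float],
) -> List[str]:
    """Return IDs of matching pages in the target domain, highest authority first."""
    term = search_term.lower()
    # Target pages: pages of the target domain whose text contains the term
    # (substring matching is an assumption, not the application's search method).
    matches = [pid for pid, text in domain_pages.items() if term in text.lower()]
    # Display order: descending authority value; ties broken by page ID.
    return sorted(matches, key=lambda pid: (-authority.get(pid, 0.0), pid))


# Hypothetical pages of one target domain and their precomputed authority values.
pages = {
    "p1": "Causes of mouth ulcers",
    "p2": "Treating mouth ulcers at home",
    "p3": "Facial surgery side effects",
}
scores = {"p1": 0.2, "p2": 0.5, "p3": 0.3}

order = rank_target_pages("mouth ulcers", pages, scores)
print(order)  # ['p2', 'p1']
```

Page p3 is excluded because it does not match the term, even though its authority value is higher than that of p1.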
In some embodiments of the present application, based on the foregoing solution, the processing device of the web page further includes: a medical classification unit, configured to classify the medical pages to be processed based on the articles in them, to obtain the medical domain corresponding to each medical page; a medical valuation unit, configured to determine authority values of the medical pages in a target medical domain relative to other medical pages in the target medical domain based on the association relationship among the medical pages in the selected target medical domain; and a medical presentation unit, configured to present the information of the medical pages in the medical domain in a medical web page based on the authority values corresponding to the pages.
According to an aspect of the embodiments of the present application, there is provided a computer readable medium, on which a computer program is stored, which when executed by a processor, implements the method for processing a web page as described in the above embodiments.
According to an aspect of an embodiment of the present application, there is provided an electronic device including: one or more processors; a storage device, configured to store one or more programs, which when executed by the one or more processors, cause the one or more processors to implement the method for processing a web page as described in the above embodiments.
According to an aspect of embodiments herein, there is provided a computer program product or computer program comprising computer instructions stored in a computer readable storage medium. The processor of the computer device reads the computer instructions from the computer-readable storage medium, and the processor executes the computer instructions, so that the computer device executes the processing method of the web page provided in the above-mentioned various alternative implementations.
In the technical solutions provided in some embodiments of the present application, each page to be processed in a website is classified to obtain the pages corresponding to each domain, so that the pages in each domain can be processed in a targeted manner. Authority values of the pages in a target domain relative to other pages in that domain are determined according to the association relationship among those pages, and the information of the associated pages in the target domain is finally presented in the web page based on these authority values. Because the authority values are calculated over the pages of a single domain, associated pages within that domain are displayed together, which improves the logic and hierarchy of web page pushing and the content pushing effect on the user terminal.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the application.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present application and together with the description, serve to explain the principles of the application. It is obvious that the drawings in the following description are only some embodiments of the application, and that for a person skilled in the art, other drawings can be derived from them without inventive effort. In the drawings:
FIG. 1 shows a schematic diagram of an exemplary system architecture to which aspects of embodiments of the present application may be applied;
FIG. 2 schematically illustrates a flow diagram of a method of processing a web page according to one embodiment of the present application;
FIG. 3 schematically shows a diagram of semantic drift according to an embodiment of the present application;
FIG. 4 schematically shows a diagram of training a page classification model according to an embodiment of the present application;
FIG. 5 schematically shows a flow diagram for presenting information of a page in the target domain in a web page according to one embodiment of the present application;
FIG. 6 schematically shows a flow chart of a method of processing a medical network page according to an embodiment of the present application;
FIG. 7 schematically shows a schematic view of a medical field classification according to an embodiment of the present application;
FIG. 8 schematically illustrates a diagram presenting information of a medical page according to an embodiment of the present application;
FIG. 9 schematically illustrates a block diagram of a processing device of a web page according to an embodiment of the present application;
FIG. 10 schematically shows a block diagram of a processing device of a medical network page according to an embodiment of the present application;
FIG. 11 illustrates a schematic structural diagram of a computer system suitable for use in implementing the electronic device of an embodiment of the present application.
Detailed Description
Example embodiments will now be described more fully with reference to the accompanying drawings. Example embodiments may, however, be embodied in many different forms and should not be construed as limited to the examples set forth herein; rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the concept of example embodiments to those skilled in the art.
Furthermore, the described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. In the following description, numerous specific details are provided to give a thorough understanding of embodiments of the application. One skilled in the relevant art will recognize, however, that the subject matter of the present application can be practiced without one or more of the specific details, or with other methods, components, devices, steps, and so forth. In other instances, well-known methods, devices, implementations, or operations have not been shown or described in detail to avoid obscuring aspects of the application.
The block diagrams shown in the figures are functional entities only and do not necessarily correspond to physically separate entities. I.e. these functional entities may be implemented in the form of software, or in one or more hardware modules or integrated circuits, or in different networks and/or processor means and/or microcontroller means.
The flow charts shown in the drawings are merely illustrative and do not necessarily include all of the contents and operations/steps, nor do they necessarily have to be performed in the order described. For example, some operations/steps may be decomposed, and some operations/steps may be combined or partially combined, so that the actual execution sequence may be changed according to the actual situation.
Artificial Intelligence (AI) is a theory, method, technique and application system that uses a digital computer or a machine controlled by a digital computer to simulate, extend and expand human Intelligence, perceive the environment, acquire knowledge and use the knowledge to obtain the best results. In other words, artificial intelligence is a comprehensive technique of computer science that attempts to understand the essence of intelligence and produce a new intelligent machine that can react in a manner similar to human intelligence. Artificial intelligence is the research of the design principle and the realization method of various intelligent machines, so that the machines have the functions of perception, reasoning and decision making.
Artificial intelligence technology is a comprehensive discipline that covers a wide range of fields, including both hardware-level and software-level technologies. The artificial intelligence infrastructure generally includes technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technologies, operation/interaction systems, mechatronics, and the like. Artificial intelligence software technology mainly comprises computer vision, speech processing, natural language processing, machine learning/deep learning, and the like.
Natural Language Processing (NLP) is an important direction in the fields of computer science and artificial intelligence. It studies theories and methods that enable effective communication between humans and computers using natural language. Natural language processing is a science integrating linguistics, computer science and mathematics. Research in this field involves natural language, that is, the language people use every day, so it is closely related to the study of linguistics. Natural language processing techniques typically include text processing, semantic understanding, machine translation, question answering by robots, knowledge graphs, and the like.
Machine Learning (ML) is a multi-disciplinary field that draws on probability theory, statistics, approximation theory, convex analysis, algorithmic complexity theory and other disciplines. It studies how a computer can simulate or implement human learning behavior so as to acquire new knowledge or skills and reorganize its existing knowledge structure to continuously improve its performance. Machine learning is the core of artificial intelligence, is the fundamental way to make computers intelligent, and is applied in all fields of artificial intelligence. Machine learning and deep learning generally include techniques such as artificial neural networks, belief networks, reinforcement learning, transfer learning, inductive learning, and the like.
With the research and progress of artificial intelligence technology, the artificial intelligence technology is developed and applied in a plurality of fields, such as common virtual assistants, intelligent speakers, intelligent marketing, robots, intelligent medical treatment, intelligent customer service, and the like.
The scheme provided by the embodiments of the present application involves artificial intelligence technologies such as natural language processing and machine learning, and is explained through the following embodiments. Fig. 1 shows a schematic diagram of an exemplary system architecture to which the technical solution of the embodiments of the present application can be applied.
As shown in fig. 1, the system architecture may include a terminal device (e.g., one or more of a smartphone 101, a tablet computer 102, and a portable computer 103 shown in fig. 1, but may also be a desktop computer, etc.), a network 104, and a server 105. The network 104 serves as a medium for providing communication links between terminal devices and the server 105. Network 104 may include various connection types, such as wired communication links, wireless communication links, and so forth.
It should be understood that the number of terminal devices, networks, and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation. For example, server 105 may be a server cluster comprised of multiple servers, or the like.
A user may use a terminal device to interact with the server 105 over the network 104 to receive or send messages or the like. The server 105 may be a server that provides various services. For example, the server 105 performs domain classification on the page to be processed based on the content in the page to be processed to obtain at least one domain, then determines authority values of each page in the target domain relative to other pages in the target domain based on the association relationship between the pages in the target domain, and finally presents the information of the page in the target domain in the web page based on the authority values corresponding to each page.
According to the scheme in this embodiment, each page to be processed in the website is classified to obtain the pages corresponding to each domain, so that the pages in each domain can be processed in a targeted manner. Authority values of the pages in a target domain relative to other pages in that domain are determined according to the association relationship among those pages, and the information of the associated pages in the target domain is finally presented in the web page based on these authority values. Because the authority values are calculated over the pages of a single domain, associated pages within that domain are displayed together, which improves the logic and hierarchy of web page pushing and the content pushing effect on the user terminal.
It should be noted that the processing method of the web page provided in the embodiment of the present application is generally executed by the server 105, and accordingly, the processing device of the web page is generally disposed in the server 105. However, in other embodiments of the present application, the terminal device may also have a similar function as the server, so as to execute the method for processing the web page provided in the embodiments of the present application.
The implementation details of the technical solution of the embodiment of the present application are set forth in detail below:
Fig. 2 shows a flowchart of a processing method of a web page according to an embodiment of the present application, which may be performed by a server, which may be the server shown in fig. 1. Referring to fig. 2, the method for processing the web page at least includes steps S210 to S230, which are described in detail as follows:
In step S210, the to-be-processed page is subjected to domain classification based on the content in the to-be-processed page, so as to obtain at least one domain.
Fig. 3 is a schematic diagram of semantic drift according to an embodiment of the present application.
As shown in fig. 3, in an embodiment of the present application, because different websites define their related-recommendation policies differently, there may still be a certain semantic drift between the related recommendation links given by the system and the original web page. For example, the text area 310 of the web page in fig. 3 contains the title "the reason for frequent blisters on the mouth" and the corresponding body text, while the related questions in the recommendation area 320 of the same page include "whether face shaping has side effects". The two are unrelated, and this situation leads to drift and bias in content recommendation. To avoid this, in this embodiment the to-be-processed pages are classified by domain based on their content, so as to obtain the pages included in each domain.
In an embodiment of the present application, in the process of performing domain classification on the page to be processed, the page to be processed may be classified based on text content and images in the page to be processed. For example, the similarity between images in each page to be processed is identified, the pages to be processed belonging to the same field are determined based on the similarity, and the corresponding name of the field is determined based on the image content or the text content.
In one embodiment of the present application, the pages to be processed may include various pages in a website, and the content in the pages may be associated or not associated. Meanwhile, the pages may also include a next level page below one page, and the like.
In one embodiment of the present application, a domain may represent a page type, the scope to which a page belongs, and the like. The domains in this embodiment may be obtained through multiple rounds of classification, yielding domains at different levels, such as first-level domains and second-level domains under one directory.
In an embodiment of the present application, the process of performing domain classification on the to-be-processed page based on the content in the to-be-processed page in step S210 to obtain at least one domain includes the following steps: extracting text content in a page to be processed; and inputting the text content into the trained page classification model to obtain the field corresponding to the page to be processed and output by the page classification model.
In an embodiment of the present application, the domain classification of a to-be-processed page may be based on the text content of the page. The domain of each page may be obtained by identifying similarities or associations among the text content of the to-be-processed pages. Alternatively, the text content of the to-be-processed page may be input into the trained page classification model to obtain the domain output by the model for that page.
Specifically, in an embodiment of the present application, a method for training a page classification model includes: acquiring the text content of a page sample and its corresponding domain label; extracting a vocabulary sample from the text content; inputting the vocabulary sample into a page classification network to obtain a classification result output by the page classification network; and adjusting parameters in the page classification network based on a loss function computed from the classification result and the domain label, to obtain a page classification model.
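As a rough illustration of grouping pages by the similarity of their text content, the following Python sketch uses Jaccard similarity over word sets with a greedy grouping rule. The similarity measure, the 0.3 threshold, the whitespace tokenizer (Chinese text would need word segmentation instead), and all names are assumptions; the application does not specify how similarity is computed.

```python
from typing import Dict, List, Set


def tokens(text: str) -> Set[str]:
    """Split text into a set of lowercase words (whitespace tokenization is an assumption)."""
    return set(text.lower().split())


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity: size of the intersection divided by size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def group_by_similarity(pages: Dict[str, str], threshold: float = 0.3) -> List[List[str]]:
    """Greedy grouping: a page joins the first group whose first member is similar enough."""
    groups: List[List[str]] = []
    for pid, text in pages.items():
        for group in groups:
            if jaccard(tokens(text), tokens(pages[group[0]])) >= threshold:
                group.append(pid)
                break
        else:
            # No existing group is similar enough, so start a new one.
            groups.append([pid])
    return groups


# Hypothetical page texts.
sample_pages = {
    "a": "causes of mouth ulcers",
    "b": "treatment of mouth ulcers",
    "c": "facial surgery side effects",
}
groups = group_by_similarity(sample_pages)
print(groups)  # [['a', 'b'], ['c']]
```

Each resulting group would still need a domain name, which the application says can be derived from the text or image content.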
Fig. 4 is a schematic diagram of a training page classification model according to an embodiment of the present application.
As shown in fig. 4, in an embodiment of the present application, the page classification network may be constructed based on a text convolutional neural network (TextCNN). First, the text content of each page sample is labeled to determine its domain label. A sequence of length n is then input, and words 0 to n-1 are extracted from it. In the input layer 410, each word is mapped to a word vector of dimension K. The K-dimensional word vectors are passed to the convolutional layer 420, where the convolution kernels may be, for example, 2 × 1, 3 × 1 and 4 × 1 in size, each with 1024 layers. The pooling layer 430 pools the output of the convolutional layer 420 to obtain 1024 layers of pooled data. The fully connected layer then connects the pooled data to obtain the classification system corresponding to the page sample, and the classification label is finally obtained from the classification system.
Further, after the classification label of the page sample is obtained, it is compared with the assigned domain label, and the loss function is determined from the comparison result so that the parameters in the page classification network can be adjusted, thereby obtaining the page classification model.
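The layer sequence described above (word vectors, convolutions over windows of 2, 3 and 4 words, pooling, full connection, loss against the domain label) can be sketched in plain Python. This is a toy forward pass only: the class name `TinyTextCNN`, the random weights, the ReLU activation, max-over-time pooling, softmax with cross-entropy, and the small sizes (4 filters instead of 1024) are assumptions for illustration, and the parameter update step is omitted. Input sequences must contain at least 4 words.

```python
import math
import random
from typing import Dict, List

random.seed(0)
Vector = List[float]


def rand_matrix(rows: int, cols: int) -> List[Vector]:
    """Small random weights standing in for trained parameters."""
    return [[random.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def softmax(logits: Vector) -> Vector:
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs: Vector, label: int) -> float:
    """Loss comparing the predicted distribution with the domain label."""
    return -math.log(probs[label])


class TinyTextCNN:
    def __init__(self, vocab_size: int, embed_dim: int, num_filters: int,
                 num_classes: int, kernel_heights=(2, 3, 4)) -> None:
        # Input layer: word ID -> K-dimensional word vector.
        self.embedding = rand_matrix(vocab_size, embed_dim)
        # One filter bank per kernel height; each filter spans h consecutive word vectors.
        self.filters: Dict[int, List[Vector]] = {
            h: rand_matrix(num_filters, h * embed_dim) for h in kernel_heights
        }
        # Fully connected layer over the concatenated pooled features.
        self.fc = rand_matrix(num_classes, num_filters * len(kernel_heights))

    def forward(self, word_ids: List[int]) -> Vector:
        vectors = [self.embedding[i] for i in word_ids]
        pooled: Vector = []
        for h, bank in self.filters.items():
            # Flatten every window of h consecutive word vectors.
            windows = [sum(vectors[s:s + h], []) for s in range(len(vectors) - h + 1)]
            for filt in bank:
                # Convolution response with ReLU, one value per window position.
                responses = [max(0.0, sum(w * x for w, x in zip(filt, win))) for win in windows]
                pooled.append(max(responses))  # max-over-time pooling
        logits = [sum(w * x for w, x in zip(row, pooled)) for row in self.fc]
        return softmax(logits)


model = TinyTextCNN(vocab_size=50, embed_dim=8, num_filters=4, num_classes=3)
probs = model.forward([5, 12, 7, 3, 9])   # hypothetical word IDs of one title
loss = cross_entropy(probs, label=1)      # hypothetical domain label
```

In practice the parameters would be adjusted by gradient descent on this loss, typically with a deep learning framework rather than hand-written loops.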
Illustratively, as shown in fig. 4, in the application scenario of a medical website, the TextCNN-based classification model is trained on popular-science medical articles obtained from the medical website, with the samples labeled automatically. First, a word vector model for the medical domain is trained on tens of millions of medical information articles collected in advance; the information titles used in the subsequent training and prediction stages are then represented as vectors. Words 0 to n-1 in the leftmost data sequence of fig. 4 are the K-dimensional word vectors of the segmented words in a medical information title, and the classification system on the far right is the disease classification of the medical website.
In an embodiment of the present application, before the process of determining authority values of each page in the target domain relative to other pages in the target domain based on the association relationship between the pages in the target domain in step S220, the following steps are included: acquiring website navigation information; acquiring a page in a website based on a website structure and a seed page in the website navigation information; and determining the association relation between the pages based on the link relation between the pages.
It should be noted that, this part of the scheme may be executed before step S220, or may be executed before step S210.
In an embodiment of the application, the website navigation information is acquired, and the information in the website is crawled based on the website structure and the seed page in the website navigation information to obtain the pages in the website. The association relationship between the pages is then determined based on the link relationship between the pages.
Specifically, the website navigation information in this embodiment may include a website structure, a seed page as a root page or a home page, and the like.
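Crawling from a seed page and recording the link relationship can be sketched as a breadth-first traversal. This is only an illustration under stated assumptions: the in-memory `SITE` dictionary stands in for real HTTP fetching and link extraction, and the names `crawl` and `fetch_links` as well as the `max_pages` limit are hypothetical.

```python
from collections import deque

def crawl(seed, fetch_links, max_pages=1000):
    """Breadth-first traversal starting at the seed page.
    Returns {page: [pages it links to]}, i.e. the link relationship."""
    links = {}
    queue = deque([seed])
    while queue and len(links) < max_pages:
        page = queue.popleft()
        if page in links:
            continue  # already visited
        links[page] = list(fetch_links(page))
        for target in links[page]:
            if target not in links:
                queue.append(target)
    return links

# Hypothetical site structure; a real crawler would download and parse pages.
SITE = {
    "/": ["/nephrology", "/cardiology"],
    "/nephrology": ["/nephrology/kidney-stones", "/"],
    "/cardiology": ["/"],
    "/nephrology/kidney-stones": ["/nephrology"],
}

link_graph = crawl("/", lambda page: SITE.get(page, []))
```

The resulting `link_graph` holds every page reachable from the seed page together with its outgoing links, from which the association relationship between pages can be derived.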
In step S220, authority values of the pages in the target domain relative to other pages in the target domain are determined based on the association relationship between the pages in the target domain.
In an embodiment of the present application, authority values are determined for a specified target domain according to the association relationship between the pages in that domain. In this embodiment, a more important page tends to be referenced more often by other pages, that is, more hyperlinks leading to it are placed on other pages. Illustratively, a link from page A to page B is interpreted as page A casting a vote for page B, and the rank and authority value of the voted page are determined according to the voting source, the source of that source (that is, the rank of the pages linking to page A), and the voting target.
In an embodiment of the present application, the process of determining authority values of the pages in the target domain relative to other pages in the target domain based on the association relationship between the pages in the target domain in step S220 includes the following steps S2201 to S2202:
in step S2201, the associated pages in the target domain are determined based on the association relationship between the pages in the selected target domain.
In an embodiment of the present application, there are association relationships between the pages in a website. Within the scope of the target domain, some pages are associated with one another, while other pages may have no association relationship. In this embodiment, based on the association relationship between the pages in the target domain, the pages that have an association relationship are used as the associated pages in the target domain.
In step S2202, authority values of the associated pages in the target domain relative to other pages in the target domain are determined based on a call relationship between the associated pages, where the call relationship and the authority values are positively correlated.
In an embodiment of the present application, since a positive correlation exists between the calling relationship and the authority value, in this embodiment, the authority value of the associated page in the target field relative to other pages is determined according to the calling relationship between the associated pages.
In an embodiment of the present application, the process in step S2202 of determining, based on the call relationship between the associated pages, the authority values of the associated pages in the target domain relative to other pages in the target domain, where the call relationship is positively correlated with the authority value, includes the following steps:
based on the calling relationship among the associated pages, determining the association matrix as:
Figure BDA0002623329690000111
wherein $p_1$ to $p_N$ denote the page identifiers, $N$ is a natural number greater than 2, $\iota(p_i, p_j)$ represents the calling relationship of page $p_i$ with respect to page $p_j$, and $i$ and $j$ are natural numbers smaller than $N$.
An authority parameter s representing the relationship between the associated pages and other pages is determined based on the target domain and the pages outside the target domain. Here s is a vector, i.e., the in-link matrix within the same domain. Specifically, for a given domain, the kth element of s is 1 if page k belongs to the domain and 0 otherwise. Since different pages belong to different domains, each domain has its own s; |s| denotes the number of 1s in s, and the larger this number, the more pages the domain contains.
The authority values of the associated pages in the target domain relative to other pages in the target domain are determined iteratively, based on the association matrix, the authority parameter s, and the damping coefficient q, as follows:
Figure BDA0002623329690000112
the concrete expression is as follows:
Figure BDA0002623329690000113
In one embodiment of the application, the authority parameter is determined based on the pages contained in a domain, and the authority values corresponding to the pages in that domain are determined based on the authority parameter, which improves the comprehensiveness and accuracy of the authority calculation for the pages.
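The iterative computation can be illustrated with a short sketch. The exact formulas of this embodiment are contained in figures that are not reproduced in this text, so the code below uses the standard topic-sensitive (personalized) PageRank iteration as an assumed stand-in; it is not necessarily the formula of the application. In particular, it is an assumption that the damping coefficient q weights the link-following term and that (1 - q) is spread evenly over the pages marked in s. The function name and the toy graph are hypothetical.

```python
def domain_authority(links, domain_pages, q=0.85, iterations=100):
    """Topic-sensitive PageRank sketch.
    links: {page: [pages it links to]}
    domain_pages: set of pages belonging to the target domain
    Returns {page: authority value}; the values sum to 1."""
    pages = sorted(links)
    # s[k] is 1 if page k belongs to the domain, otherwise 0; s_size is |s|.
    s = {p: 1.0 if p in domain_pages else 0.0 for p in pages}
    s_size = sum(s.values())
    if s_size == 0:
        raise ValueError("the target domain contains no known page")
    rank = {p: s[p] / s_size for p in pages}
    for _ in range(iterations):
        nxt = {p: (1.0 - q) * s[p] / s_size for p in pages}
        for src in pages:
            targets = [t for t in links[src] if t in s]
            if targets:
                share = q * rank[src] / len(targets)
                for t in targets:
                    nxt[t] += share  # each link acts as a vote
            else:
                # Page without outgoing links: return its weight to the domain.
                for p in pages:
                    nxt[p] += q * rank[src] * s[p] / s_size
        rank = nxt
    return rank

# Hypothetical three-page domain: A and B both link to C, C links back to A.
toy_links = {"A": ["C"], "B": ["C"], "C": ["A"]}
authority = domain_authority(toy_links, domain_pages={"A", "B", "C"})
```

In this toy graph, page C is linked to by two pages and therefore receives the highest value, while page B, which no page links to, receives the lowest; this matches the positive correlation between the call relationship and the authority value described above.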
In step S230, information of the page in the target domain is presented in the web page based on the authority value corresponding to each page.
In an embodiment of the application, after authority values corresponding to pages are obtained through calculation, information of the pages with association relation in a target field is presented in a webpage based on the authority values of the pages.
In an embodiment of the present application, as shown in fig. 5, the process of presenting the information of the page in the target domain in the web page based on the authority value corresponding to each page in step S230 includes steps S2301 to S2303:
in step S2301, a search term for the target domain is acquired.
In one embodiment of the application, after the authority value is calculated, the search term for the target field input by the user is obtained. The search term in this embodiment may be a search keyword corresponding to the target field, or may be an image, a screenshot, or the like.
Specifically, in this embodiment, after the search term input by the user is acquired, the target field corresponding to the search term is determined in the website based on the search term. Alternatively, the user is directly prompted to input a search term for the target field in the environment corresponding to the target field.
In step S2302, a target page corresponding to the search term is searched for from the pages corresponding to the target field.
In one embodiment of the application, a target page corresponding to a search term is determined from the pages corresponding to the target field. Specifically, the target page may be found by checking whether the text content of each page corresponding to the target field contains the search term or a similar term; if so, that page is determined to be a target page.
In step S2303, a display order of the target page is determined based on the authority value corresponding to the target page, and information of the target page is presented in the web page based on the display order.
In one embodiment of the application, after determining a target page corresponding to a search term and authority values of the target page corresponding to the field, determining a display order of the target page based on the authority values of the target page, so as to present information of the target page in a webpage based on the display order. Specifically, the target page with the highest authority value may be used as the main page, and based on the authority value, information of other target pages is presented in the recommended part below the main page.
In one embodiment of the present application, the information of the target page that is presented may include an illustration of the target page, a summary of the target page, the generation date of the target page, and the like.
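Steps S2301 to S2303 can be sketched as a filter followed by a sort. This is a minimal illustration under assumptions: the page texts, the authority values, the list of similar terms, and the function name `rank_target_pages` are all hypothetical, and simple substring matching stands in for whatever matching the embodiment actually uses.

```python
def rank_target_pages(pages, authority, search_term, similar_terms=()):
    """Select pages whose text contains the search term or a similar term,
    then order them by authority value, highest first."""
    terms = [search_term.lower()] + [t.lower() for t in similar_terms]
    targets = [
        page_id
        for page_id, text in pages.items()
        if any(term in text.lower() for term in terms)
    ]
    return sorted(targets, key=lambda pid: authority.get(pid, 0.0), reverse=True)

# Hypothetical pages of one target field and their authority values.
domain_page_texts = {
    "p1": "Kidney stones: causes and treatment",
    "p2": "Uremia overview",
    "p3": "Diet tips for renal calculi",
}
page_authority = {"p1": 0.2, "p2": 0.3, "p3": 0.5}

display_order = rank_target_pages(
    domain_page_texts, page_authority, "kidney stones",
    similar_terms=["renal calculi"],
)
main_page, recommended = display_order[0], display_order[1:]
```

Here the page with the highest authority value becomes the main page, and the remaining target pages form the recommended part below it.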
Fig. 6 shows a flowchart of a processing method of a medical network page according to an embodiment of the present application in the medical field, where the processing method of the medical network page may be performed by a server, which may be the server shown in fig. 1. Referring to fig. 6, the processing method of the medical web page at least includes steps S610 to S630, which are described in detail as follows:
in step S610, the medical pages are classified based on the articles in the medical page to be processed, so as to obtain the medical field corresponding to the medical page.
In an embodiment of the application, the medical field corresponding to the medical page is obtained by classifying the field of the medical page based on the article in the medical page to be processed in the medical website.
Fig. 7 is a schematic diagram of a medical field classification provided in an embodiment of the present application.
As shown in fig. 7, the medical field in the present embodiment may include a primary domain, a secondary domain, and so on. The primary domain may be a department classification 710, which may include medical domains such as internal medicine, surgery, oncology, neurology, infectious diseases, otorhinolaryngology, and pediatrics. The secondary domain 710 may be a domain below a primary domain, for example nephrology, gastroenterology, and endocrinology under internal medicine. The tertiary domain may be a domain below a secondary domain, for example kidney stones, kidney deficiency, and uremia under nephrology.
In the embodiment, the medical fields are divided into different levels, so that the medical fields are delineated more clearly and clearer page recommendations are obtained.
For example, in this embodiment, different domain levels may be divided for pages in the website according to different project levels, and then pages in corresponding ranges are recommended based on target domains in the different domain levels.
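The multi-level domain classification can be represented as a small tree. The sketch below uses the example domains named in the description of fig. 7; the data layout and the function name `domain_path` are assumptions made for illustration.

```python
# Primary domain -> secondary domain -> list of tertiary domains.
DOMAIN_TREE = {
    "internal medicine": {
        "nephrology": ["kidney stones", "kidney deficiency", "uremia"],
        "gastroenterology": [],
        "endocrinology": [],
    },
    "surgery": {},
    "pediatrics": {},
}

def domain_path(tree, name):
    """Return the path from the primary domain down to `name`,
    or None if the name is not part of the tree."""
    for primary, secondaries in tree.items():
        if primary == name:
            return [primary]
        for secondary, tertiaries in secondaries.items():
            if secondary == name:
                return [primary, secondary]
            if name in tertiaries:
                return [primary, secondary, name]
    return None

uremia_path = domain_path(DOMAIN_TREE, "uremia")
```

Choosing a target domain at a higher level of the path widens the range of pages considered for recommendation, and choosing a lower level narrows it.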
In step S620, based on the association relationship between the medical pages in the selected target medical field, authority values of the medical pages in the target medical field relative to other medical pages in the target medical field are determined.
In one embodiment of the application, authority values of the medical pages in the target medical field relative to other medical pages in the target medical field are determined based on the association relationships among the medical pages in the selected target medical field. For the authority value determination method, refer to the description of step S220 corresponding to fig. 2; details are not repeated here.
In step S630, information of the medical page in the medical field is presented in the medical web page based on the authority values corresponding to the respective pages.
Fig. 8 is a schematic diagram illustrating a presentation of information of a medical page according to an embodiment of the present application.
As shown in fig. 8, when the main page 810 of the current medical website is displayed, related recommendations appear at the bottom of the page; these include the associated pages that are in the same domain as the main page and associated with it. Each associated page has a different authority value, and in this embodiment, the summary information of the associated pages may be displayed in order of authority value from high to low, such as 820, 830, and 840 in fig. 8.
In the embodiment, the authority values of the associated pages are determined in the same medical field, so that when the content of one main page is displayed, the corresponding associated page can be determined based on the content of the main page, and the display mode of the information of the associated page can be determined based on the authority values of the associated pages, thereby improving the efficiency of page pushing.
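The selection of related recommendations for a main page can be sketched as follows. It is an illustration under assumptions: a page is treated as associated with the main page if either one links to the other, and the page identifiers, domain assignments, authority values, and the `limit` parameter are hypothetical.

```python
def related_recommendations(main_page, links, domain_of, authority, limit=3):
    """Return pages in the same domain as the main page that are linked to or
    from it, ordered by authority value from high to low."""
    domain = domain_of[main_page]
    linked = set(links.get(main_page, []))
    linked |= {src for src, targets in links.items() if main_page in targets}
    linked.discard(main_page)
    candidates = [p for p in linked if domain_of.get(p) == domain]
    # Sort by descending authority; break ties by page id for a stable order.
    ordered = sorted(candidates, key=lambda p: (-authority.get(p, 0.0), p))
    return ordered[:limit]

# Hypothetical link relationship, domains, and authority values.
site_links = {"main": ["a", "b", "x"], "a": ["main"], "b": [], "c": ["main"], "x": []}
page_domain = {"main": "nephrology", "a": "nephrology", "b": "nephrology",
               "c": "nephrology", "x": "cardiology"}
authority_values = {"a": 0.2, "b": 0.5, "c": 0.3}

recommendations = related_recommendations(
    "main", site_links, page_domain, authority_values
)
```

Page `x` is linked from the main page but belongs to another domain, so it is left out; the remaining pages appear in the order in which their summaries would be shown below the main page.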
The following describes an embodiment of an apparatus of the present application, which may be used to execute a method for processing a web page in the foregoing embodiment of the present application. For details that are not disclosed in the embodiments of the apparatus of the present application, please refer to the embodiments of the method for processing web pages described above in the present application.
FIG. 9 shows a block diagram of a processing device of a web page according to an embodiment of the application.
Referring to fig. 9, a device 900 for processing a web page according to an embodiment of the present application includes: a classifying unit 910, configured to perform domain classification on the page to be processed based on content in the page to be processed, so as to obtain at least one domain; a rating unit 920, configured to determine authority values of each page in the target domain relative to other pages in the target domain based on an association relationship between pages in the target domain; a presenting unit 930, configured to present, in the web page, information of the page in the target field based on the authority value corresponding to each page.
In some embodiments of the present application, based on the foregoing solution, the processing apparatus 900 for a web page further includes: the first acquisition unit is used for acquiring the website navigation information; the second acquisition unit is used for acquiring the page in the website based on the website structure and the seed page in the website navigation information; and the relationship determining unit is used for determining the association relationship between the pages based on the link relationship between the pages.
In some embodiments of the present application, based on the foregoing scheme, the second obtaining unit is configured to: and crawling the information in the website based on the website structure and the seed page in the website navigation information to obtain the page in the website.
In some embodiments of the present application, based on the foregoing scheme, the classification unit 910 includes: the extraction unit is used for extracting text contents in the page to be processed; and the input unit is used for inputting the text content into the trained page classification model to obtain the field corresponding to the page to be processed and output by the page classification model.
In some embodiments of the present application, based on the foregoing scheme, the training method of the page classification model includes: acquiring text content of a page sample and a corresponding domain label thereof; extracting a vocabulary sample from the text content; inputting the vocabulary sample into a page classification network to obtain a classification result output by the page classification network; and adjusting parameters in the page classification network based on a loss function obtained from the classification result and the domain label, to obtain the page classification model.
In some embodiments of the present application, based on the foregoing scheme, the rating unit 920 includes: an associated page determining unit, configured to determine associated pages in the target field based on the association relationship among the pages in the selected target field; and an authority value determining unit, configured to determine the authority values of the associated pages in the target field relative to other pages in the target field based on the calling relationship among the associated pages, wherein the calling relationship is positively correlated with the authority values.
In some embodiments of the present application, based on the foregoing scheme, the authority value determination unit is configured to: determine an association matrix based on the calling relationship between the associated pages; determine an authority parameter representing the relationship between the associated pages and other pages based on the target field and the pages other than those in the target field; and determine the authority values of the associated pages in the target field relative to other pages in the target field based on the association matrix, the authority parameter, and the damping coefficient.
In some embodiments of the present application, based on the foregoing scheme, the presenting unit 930 includes: a third acquisition unit, configured to acquire a search term for the target domain; a target page determining unit, configured to search for a target page corresponding to the search term among the pages corresponding to the target field; and a page presenting unit, configured to determine the display order of the target page based on the authority value corresponding to the target page and to present the information of the target page in the webpage based on the display order.
Fig. 10 is a block diagram of a processing apparatus of a medical network page according to an embodiment of the present application. For the implementation of this processing apparatus as applied in the medical field, refer to the embodiment corresponding to fig. 6; details are not repeated here.
Referring to fig. 10, a processing apparatus 1000 for a medical network page according to an embodiment of the present application includes: the medical classification unit 1010 is used for classifying the medical pages based on the articles in the medical pages to be processed to obtain the medical fields corresponding to the medical pages; a medical rating unit 1020, configured to determine authority values of the medical pages in the target medical field relative to other medical pages in the target medical field based on the association relationship between the medical pages in the selected target medical field; and a medical presenting unit 1030, configured to present information of medical pages in the medical field in the medical web page based on the authority values corresponding to the respective pages.
FIG. 11 illustrates a schematic structural diagram of a computer system suitable for use in implementing the electronic device of an embodiment of the present application.
It should be noted that the computer system 1100 of the electronic device shown in fig. 11 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present application.
As shown in fig. 11, a computer system 1100 includes a Central Processing Unit (CPU) 1101, which can perform various appropriate actions and processes, such as performing the methods described in the above embodiments, according to a program stored in a Read-Only Memory (ROM) 1102 or a program loaded from a storage section 1108 into a Random Access Memory (RAM) 1103. The RAM 1103 also stores various programs and data necessary for system operation. The CPU 1101, ROM 1102, and RAM 1103 are connected to each other by a bus 1104. An Input/Output (I/O) interface 1105 is also connected to the bus 1104.
The following components are connected to the I/O interface 1105: an input section 1106 including a keyboard, a mouse, and the like; an output section 1107 including a display such as a Cathode Ray Tube (CRT) or a Liquid Crystal Display (LCD), a speaker, and the like; a storage section 1108 including a hard disk and the like; and a communication section 1109 including a network interface card such as a LAN (Local Area Network) card, a modem, or the like. The communication section 1109 performs communication processing via a network such as the Internet. A drive 1110 is also connected to the I/O interface 1105 as necessary. A removable medium 1111, such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory, is mounted on the drive 1110 as necessary, so that a computer program read from it can be installed into the storage section 1108 as necessary.
In particular, according to embodiments of the application, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present application include a computer program product comprising a computer program embodied on a computer-readable medium, the computer program comprising program code for performing the method illustrated by the flowchart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication section 1109 and/or installed from the removable medium 1111. When the computer program is executed by the Central Processing Unit (CPU) 1101, the various functions defined in the system of the present application are performed.
It should be noted that the computer readable medium shown in the embodiments of the present application may be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a Read-Only Memory (ROM), an Erasable Programmable Read-Only Memory (EPROM), a flash Memory, an optical fiber, a portable Compact Disc Read-Only Memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present application, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In this application, however, a computer readable signal medium may include a propagated data signal with a computer program embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. The computer program embodied on the computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wired, etc., or any suitable combination of the foregoing.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present application. Each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units described in the embodiments of the present application may be implemented by software or by hardware, and the described units may also be disposed in a processor. In some cases, the names of the units do not constitute a limitation on the units themselves.
According to an aspect of the application, a computer program product or computer program is provided, comprising computer instructions, the computer instructions being stored in a computer readable storage medium. The processor of the computer device reads the computer instructions from the computer-readable storage medium, and the processor executes the computer instructions to cause the computer device to perform the method provided in the various alternative implementations described above.
As another aspect, the present application also provides a computer-readable medium, which may be contained in the electronic device described in the above embodiments; or may exist separately without being assembled into the electronic device. The computer readable medium carries one or more programs which, when executed by an electronic device, cause the electronic device to implement the method described in the above embodiments.
It should be noted that although several modules or units of the device for action execution are mentioned in the above detailed description, such a division is not mandatory. Indeed, according to embodiments of the application, the features and functionality of two or more modules or units described above may be embodied in one module or unit. Conversely, the features and functions of one module or unit described above may be further divided so as to be embodied by a plurality of modules or units. Through the above description of the embodiments, those skilled in the art will readily understand that the exemplary embodiments described herein may be implemented by software, or by software in combination with necessary hardware. Therefore, the technical solution according to the embodiments of the present application can be embodied in the form of a software product, which can be stored in a non-volatile storage medium (which can be a CD-ROM, a USB flash drive, a removable hard disk, etc.) or on a network, and which includes several instructions to enable a computing device (which can be a personal computer, a server, a touch terminal, a network device, etc.) to execute the method according to the embodiments of the present application.
Other embodiments of the present application will be apparent to those skilled in the art from consideration of the specification and practice of the embodiments disclosed herein. This application is intended to cover any variations, uses, or adaptations of the invention following, in general, the principles of the application and including such departures from the present disclosure as come within known or customary practice within the art to which the invention pertains.
It will be understood that the present application is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the application is limited only by the appended claims.

Claims (10)

1. A method for processing a web page, comprising:
performing domain classification on the page to be processed based on the content in the page to be processed to obtain at least one domain;
determining authority values of the pages in a target domain relative to other pages in the target domain based on the association relationship among the pages in the target domain;
and presenting the information of the pages in the target field in the webpage based on the authority values corresponding to the pages.
2. The method according to claim 1, wherein before determining authority values of the pages in the target domain relative to other pages in the target domain based on the association relationship between the pages in the target domain, the method further comprises:
acquiring website navigation information;
acquiring a page in the website based on a website structure and a seed page in the website navigation information;
and determining the association relation among the pages based on the link relation among the pages.
3. The method of claim 2, wherein obtaining the page in the website based on the website structure and the seed page in the website navigation information comprises:
and crawling the information in the website based on the website structure in the website navigation information and the seed page to obtain the page in the website.
4. The method according to claim 1, wherein performing domain classification on the page to be processed based on the content in the page to be processed to obtain at least one domain comprises:
extracting text content in the page to be processed;
and inputting the text content into a trained page classification model to obtain a field corresponding to the page to be processed and output by the page classification model.
5. The method of claim 4, wherein the method of training the page classification model comprises:
acquiring text content of a page sample and a corresponding domain label thereof;
extracting a vocabulary sample from the text content;
inputting the vocabulary sample into a page classification network to obtain a classification result output by the page classification network;
and adjusting parameters in the page classification network based on a loss function obtained from the classification result and the domain label, to obtain the page classification model.
6. The method of claim 1, wherein determining authority values of each page in a target domain relative to other pages in the target domain based on the association between pages in the target domain comprises:
determining an associated page in the target field based on the association relationship between the pages in the selected target field;
and determining authority values of the associated pages in the target field relative to other pages in the target field based on the calling relation among the associated pages, wherein the calling relation is positively correlated with the authority values.
7. The method of claim 6, wherein determining authority values of the associated pages in the target domain relative to other pages in the target domain based on call relations between the associated pages comprises:
determining an association matrix based on the calling relation among the associated pages;
determining an authority parameter representing the relationship between the associated page and other pages in the target field based on the target field and the other pages except the page in the target field;
and determining the authority value of the associated page in the target field relative to other pages in the target field based on the association matrix, the authority parameter, and the damping coefficient.
8. The method according to claim 1, wherein presenting information of pages in the target domain in a web page based on the authority values corresponding to the respective pages comprises:
acquiring a search term for the target field;
searching for a target page corresponding to the search term among the pages corresponding to the target field;
determining the display sequence of the target page based on the authority value corresponding to the target page, and presenting the information of the target page in the webpage based on the display sequence.
9. The method of claim 1, further comprising:
classifying the medical pages based on articles in the medical pages to be processed to obtain medical fields corresponding to the medical pages;
determining authority values of the medical pages in the target medical field relative to other medical pages in the target medical field based on the association relationship among the medical pages in the selected target medical field;
and presenting the information of the medical page in the medical field in a medical webpage based on the authority values corresponding to the pages.
10. An apparatus for processing web pages, comprising:
the classification unit is used for carrying out domain classification on the page to be processed based on the content in the page to be processed to obtain at least one domain;
the rating unit is used for determining authority values of the pages in the target field relative to other pages in the target field based on the association relationship among the pages in the target field;
and the presenting unit is used for presenting the information of the pages in the target field in the webpage based on the authority values corresponding to the pages.
CN202010789735.2A 2020-08-07 2020-08-07 Processing method and device of network page Active CN111914201B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010789735.2A CN111914201B (en) 2020-08-07 2020-08-07 Processing method and device of network page

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010789735.2A CN111914201B (en) 2020-08-07 2020-08-07 Processing method and device of network page

Publications (2)

Publication Number Publication Date
CN111914201A (en) 2020-11-10
CN111914201B (en) 2023-11-07

Family

ID=73283233

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010789735.2A Active CN111914201B (en) 2020-08-07 2020-08-07 Processing method and device of network page

Country Status (1)

Country Link
CN (1) CN111914201B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112416212A (en) * 2020-11-25 2021-02-26 维沃移动通信有限公司 Program access method, device, electronic equipment and readable storage medium

Citations (20)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101132446A (en) * 2006-08-23 2008-02-27 上海万纬信息技术有限公司 Web page intelligent snapping system and method thereof
US20080275833A1 (en) * 2007-05-04 2008-11-06 Microsoft Corporation Link spam detection using smooth classification function
CN101751438A (en) * 2008-12-17 2010-06-23 中国科学院自动化研究所 Theme webpage filter system for driving self-adaption semantics
CN101903878A (en) * 2007-10-11 2010-12-01 谷歌公司 Methods and systems for classifying search results to determine page elements
CN102567409A (en) * 2010-12-31 2012-07-11 珠海博睿科技有限公司 Method and device for providing retrieval associated word
CN102859516A (en) * 2009-04-08 2013-01-02 谷歌公司 Generating improved document classification data using historical search results
CN102890717A (en) * 2012-09-29 2013-01-23 北京奇虎科技有限公司 System and method for building webpage category knowledge base
CN102902793A (en) * 2012-09-29 2013-01-30 北京奇虎科技有限公司 Creation system and method of webpage classification knowledge base
CN102902790A (en) * 2012-09-29 2013-01-30 北京奇虎科技有限公司 Web page classification system and method
CN102959545A (en) * 2010-06-29 2013-03-06 微软公司 Navigation to popular search results
US20150095300A1 (en) * 2010-06-20 2015-04-02 Remeztech Ltd. System and method for mark-up language document rank analysis
CN104504070A (en) * 2014-12-22 2015-04-08 北京奇虎科技有限公司 Search method and device
US20150302076A1 (en) * 2014-04-17 2015-10-22 Samsung Electronics Co., Ltd. Method of storing and expressing web page in an electronic device
CN106649823A (en) * 2016-12-29 2017-05-10 淮海工学院 Webpage classification recognition method based on comprehensive subject term vertical search and focused crawler
CN106776710A (en) * 2016-11-18 2017-05-31 广东技术师范学院 A kind of picture and text construction of knowledge base method based on vertical search engine
CN106874340A (en) * 2016-12-22 2017-06-20 新华三技术有限公司 A kind of web page address sorting technique and device
CN107153498A (en) * 2016-03-30 2017-09-12 阿里巴巴集团控股有限公司 A kind of page processing method, device and intelligent terminal
CN108694197A (en) * 2017-04-10 2018-10-23 富士通株式会社 Hypertext grasping means and device
CN110209906A (en) * 2018-02-07 2019-09-06 北京京东尚科信息技术有限公司 Method and apparatus for extracting webpage information
US20210377628A1 (en) * 2018-08-31 2021-12-02 Beijing Bytedance Network Technology Co., Ltd. Method and apparatus for outputting information

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
SHINE N. DAS et al.: "An Efficient Approach for Finding Near Duplicate Web pages using Minimum Weight Overlapping Method", 《INTERNATIONAL JOURNAL OF ELECTRICAL AND COMPUTER ENGINEERING (IJECE)》, pages 187 - 194 *
HE Li (何力) et al.: "Hierarchical text classification based on unlabeled Web data", 《CAAI Transactions on Intelligent Systems》, pages 330 - 335 *

Also Published As

Publication number Publication date
CN111914201B (en) 2023-11-07

Similar Documents

Publication Publication Date Title
US10963794B2 (en) Concept analysis operations utilizing accelerators
US11151177B2 (en) Search method and apparatus based on artificial intelligence
CN107491534B (en) Information processing method and device
CN111737476B (en) Text processing method and device, computer readable storage medium and electronic equipment
CN110851713B (en) Information processing method, recommending method and related equipment
CN107463704B (en) Search method and device based on artificial intelligence
CN113535984B (en) Knowledge graph relation prediction method and device based on attention mechanism
CN111046275B (en) User label determining method and device based on artificial intelligence and storage medium
US9535980B2 (en) NLP duration and duration range comparison methodology using similarity weighting
CN106776503A (en) The determination method and device of text semantic similarity
CN113011172B (en) Text processing method, device, computer equipment and storage medium
CN113392209A (en) Text clustering method based on artificial intelligence, related equipment and storage medium
CN111813905A (en) Corpus generation method and device, computer equipment and storage medium
CN113761190A (en) Text recognition method and device, computer readable medium and electronic equipment
CN112464042A (en) Task label generation method according to relation graph convolution network and related device
Azzam et al. A question routing technique using deep neural network for communities of question answering
CN111914201B (en) Processing method and device of network page
JP2023517518A (en) Vector embedding model for relational tables with null or equivalent values
CN116628162A (en) Semantic question-answering method, device, equipment and storage medium
WO2021223165A1 (en) Systems and methods for object evaluation
CN109885647B (en) User history verification method, device, electronic equipment and storage medium
CN113705692A (en) Emotion classification method and device based on artificial intelligence, electronic equipment and medium
CN113656586B (en) Emotion classification method, emotion classification device, electronic equipment and readable storage medium
CN114357163A (en) Text type identification method and device, computer readable medium and electronic equipment
CN112528183B (en) Webpage component layout method and device based on big data, electronic equipment and medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant