CN108334630A - A kind of URL classification method and system - Google Patents
A kind of URL classification method and system
- Publication number
- CN108334630A
- Authority
- CN
- China
- Prior art keywords
- url
- classification
- sorted
- feature
- library
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/955—Retrieval from the web using information identifiers, e.g. uniform resource locators [URL]
- G06F16/9566—URL specific, e.g. using aliases, detecting broken or misspelled links
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Databases & Information Systems (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Information Transfer Between Computers (AREA)
Abstract
The present invention discloses a URL classification method and system and relates to the field of information classification. The URL classification method includes: determining whether classification information for a URL to be classified exists in a preset URL classification library; when the URL classification library contains no classification information for the URL to be classified, obtaining, from the web page corresponding to the URL to be classified, a feature phrase that expresses the content of the page; performing lexical analysis on the feature phrase to generate a classification marker that expresses user behavior; and generating corresponding classification information according to the URL to be classified and its classification marker, and recording it in the URL classification library. The present invention can classify all URLs with high accuracy.
Description
Technical field
The present invention relates to the field of information classification, and in particular to a URL classification method and system.
Background
A uniform resource locator (URL) is a concise representation of the location of, and the method of access to, a resource available on the Internet. It is the address of a standard resource on the Internet, also called a web address.
At present, analyzing the URLs a user visits is one technique used when labeling users. However, this technique is usually implemented by analyzing the composition of the URL or by deriving the category of the URL through clustering. URL composition varies endlessly, so classifying URLs from their composition alone cannot achieve high accuracy. With clustering, the number of available URL training samples is limited, and the trained results deviate considerably. Therefore, to classify URLs accurately, the analysis needs to start from the page content that the URL points to.
For example, patent publication CN106960040A discloses a method and device for determining the category of a URL. The method includes: obtaining, from the web page content corresponding to a URL to be classified, the feature field corresponding to each preset feature; for each feature field, dividing the feature field into at least one first phrase, and determining a first feature value of the feature field according to the pre-stored target-class probability and non-target-class probability of each phrase in each feature; and determining the category of the URL to be classified according to the first feature values so determined and a pre-trained URL classification model. In the technical solution disclosed in that document, the URL is first accessed to obtain the corresponding keywords, and the keywords are then used in clustering training to obtain training samples. Because the number of training samples is limited, a certain error often arises in practical use.
The above analysis shows that existing technology has shortcomings in the accuracy of URL classification.
Summary of the invention
The technical problem to be solved by the present invention is that existing technology classifies URLs with low accuracy.
To solve this technical problem, the present invention provides a URL classification method and system.
The URL classification method includes:
Determining whether classification information for a URL to be classified exists in a preset URL classification library;
When the URL classification library contains no classification information for the URL to be classified, obtaining, from the web page corresponding to the URL to be classified, a feature phrase that expresses the content of the page;
Performing lexical analysis on the feature phrase to generate a classification marker that expresses user behavior;
Generating corresponding classification information according to the URL to be classified and the classification marker corresponding to the URL to be classified, and recording it in the URL classification library.
Optionally, determining whether classification information for the URL to be classified exists in the preset URL classification library includes:
Extracting a feature string from the URL to be classified;
Querying the URL classification library with the feature string to determine whether the library contains classification information for the URL to be classified.
Optionally, generating corresponding classification information according to the URL to be classified and the classification marker corresponding to the URL to be classified includes:
Generating the corresponding classification information according to the feature string of the URL to be classified and the classification marker.
Optionally, obtaining, from the web page corresponding to the URL to be classified, a feature phrase that expresses the content of the page includes:
Accessing the URL to be classified to obtain the corresponding web page content;
Determining the feature phrase that expresses the web page content.
Optionally, the feature phrase includes at least the title information of the web page corresponding to the URL to be classified.
In another aspect, the present invention also provides a URL classification system, including:
A judgment module, for determining whether classification information for a URL to be classified exists in a preset URL classification library;
A feature phrase acquisition module, for obtaining, from the web page corresponding to the URL to be classified, a feature phrase that expresses the content of the page when the URL classification library contains no classification information for the URL to be classified;
A classification marker generation module, for performing lexical analysis on the feature phrase to generate a classification marker that expresses user behavior;
A classification module, for generating corresponding classification information according to the URL to be classified and the classification marker corresponding to the URL to be classified, and recording it in the URL classification library.
Optionally, the judgment module includes:
A string extraction submodule, for extracting the feature string of the URL to be classified;
A judgment submodule, for querying the URL classification library with the feature string to determine whether the library contains classification information for the URL to be classified.
Optionally, the classification module includes:
A classification information generation submodule, for generating the corresponding classification information according to the feature string of the URL to be classified and the classification marker.
Optionally, the feature phrase acquisition module includes:
A URL access submodule, for accessing the URL to be classified to obtain the corresponding web page content;
A feature phrase determination submodule, for determining the feature phrase that expresses the web page content.
Optionally, the feature phrase includes at least the title information of the web page corresponding to the URL to be classified.
When a URL to be classified has no corresponding category in the URL classification library, the web page corresponding to the URL is analyzed, a feature phrase expressing the page content is extracted, and lexical analysis is performed on the feature phrase to obtain a classification marker expressing user behavior. The URL is then classified according to the URL and the classification marker, and the URL classification library is updated. The present invention can thus classify URLs with high accuracy.
Description of the drawings
Fig. 1 is a flowchart of a URL classification method provided by Embodiment one of the present invention;
Fig. 2 is a flowchart of a URL classification method provided by Embodiment two of the present invention;
Fig. 3 is a structural block diagram of a URL classification system provided by Embodiment three of the present invention.
Detailed description
The technical solutions of the present invention are further described below through specific embodiments and with reference to the drawings, but the present invention is not limited to these embodiments.
It should also be understood that the specific embodiments described here serve only to explain the present invention and are not intended to limit it.
In the present invention, a URL classification library is provided, which holds the classification information of URLs. In the URL classification library, URLs are labeled with classification markers, and each classification marker can correspond to multiple types of URL.
A classification marker can express a user behavior, for example "purchase a washing machine", "look up washing machine prices", or "purchase cosmetics of a certain brand".
When a URL needs to be classified, the URL classification library is queried first, since it may already contain classification information for the URL. When the library contains no classification information for the URL, the text content corresponding to the URL must be mined to generate a new classification.
Specifically, a feature phrase expressing the page content is extracted and lexical analysis is performed on it to obtain a classification marker expressing user behavior; the URL is then classified according to the URL and the classification marker, and the URL classification library is updated.
In the present invention, when a URL to be classified has no corresponding category in the URL classification library, a new classification can be generated, which makes URL classification complete and accurate: no URL is left without a category. In addition, the more precise matching used in the method of the present invention yields higher classification accuracy.
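The lookup-then-extend flow described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the invention: the library is assumed to be an in-memory dictionary keyed by the complete URL, and the names `classify_url`, `fetch_features` and `analyze` are hypothetical stand-ins for feature phrase extraction and lexical analysis.

```python
from typing import Callable, Dict, List


def classify_url(
    url: str,
    library: Dict[str, str],
    fetch_features: Callable[[str], List[str]],
    analyze: Callable[[List[str]], str],
) -> str:
    """Return the classification marker for url, extending library on a miss.

    library is a hypothetical in-memory stand-in for the URL
    classification library: it maps a URL to its classification marker.
    """
    if url in library:
        # Classification information already exists: reuse it.
        return library[url]
    features = fetch_features(url)   # feature phrase expressing the page content
    marker = analyze(features)       # lexical analysis -> classification marker
    library[url] = marker            # record the new classification information
    return marker
```

A URL that misses the library is analyzed once; later requests for the same URL are answered from the library without accessing the page again.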
Embodiment one
Fig. 1 shows a flowchart of a URL classification method provided by Embodiment one of the present invention, described in detail below with reference to the drawing:
In this embodiment, the URL to be classified is first looked up in the URL classification library. When it has no corresponding category there, the web page corresponding to the URL is analyzed, a feature phrase expressing the page content is extracted, and lexical analysis is performed on the feature phrase to obtain a classification marker expressing user behavior. The URL is then classified according to the URL and the classification marker, and the URL classification library is updated.
Step S101: determine whether classification information for the URL to be classified exists in the preset URL classification library.
The URL classification library holds the classification information of URLs. URLs can be labeled with classification markers, and each classification marker can correspond to multiple types of URL.
The URL can be matched against the classification library to determine whether the URL to be classified belongs to a URL category in the library. The matching can be exact or fuzzy. When matching the URL to be classified, a string formed from part of its characters can be used. In general, this string contains the main characteristic information of the URL; its specific form is not limited here.
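As one possible reading of exact versus fuzzy matching, the sketch below tries the complete URL first and then falls back to the longest library key that occurs inside the URL. The text does not fix a matching rule, so the substring rule, the function name `lookup` and the sample marker in the usage note are all assumptions made for illustration.

```python
from typing import Dict, Optional


def lookup(url: str, library: Dict[str, str], fuzzy: bool = True) -> Optional[str]:
    """Return the classification marker for url, or None if no record matches."""
    # Exact match: the whole URL is a key of the library.
    if url in library:
        return library[url]
    if fuzzy:
        # Assumed fuzzy rule: any stored string that occurs inside the URL
        # counts as a match; the longest such string is preferred.
        hits = [key for key in library if key in url]
        if hits:
            return library[max(hits, key=len)]
    return None
```

With a library of `{"phicomm.com/article": "read forum articles"}` (a made-up marker), `lookup("bbs.phicomm.com/article/titleS=123", library)` finds the record through the fuzzy rule, while the same call with `fuzzy=False` returns `None`.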
Step S102: when the URL classification library contains no classification information for the URL to be classified, obtain, from the web page corresponding to the URL to be classified, a feature phrase that expresses the content of the page.
When the URL classification library contains no classification information for the URL to be classified, that is, when the URL has no corresponding category in the library, the web page content corresponding to the URL needs to be analyzed so that a new classification can be generated in the library.
A feature phrase expressing the page content is obtained from the web page corresponding to the URL. The feature phrase comprises multiple words that express the page content. These words can be obtained by extracting text information such as the title of the page.
Further, pictures in the web page corresponding to the URL can also be recognized to obtain text information expressing the picture content; words are extracted from this text information and included in the feature phrase. The recognition may identify text information in the web page, or may convert a shape in a picture into text describing that shape; for example, if a picture contains the shape of a refrigerator, it can be recognized as the text "refrigerator".
In addition, video content in the web page can also be recognized in a way similar to picture recognition, which is not repeated here.
It should be noted that picture recognition and video recognition are common prior art and are not described in detail here.
The feature phrase is used to express the content of the web page. A web page can present information in many ways, not only as text, so web page information can be extracted in a variety of ways.
Optionally, the feature phrase includes at least the title information of the web page corresponding to the URL to be classified.
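A minimal sketch of obtaining a feature phrase from the page title, using only the Python standard library. The helper names `TitleParser` and `feature_phrase_from_html` are hypothetical, and splitting the title on `|`, `,` and spaced hyphens is an assumption; Chinese titles would normally need a word segmenter instead. Fetching the page is left out so that the example stays self-contained.

```python
import re
from html.parser import HTMLParser
from typing import List


class TitleParser(HTMLParser):
    """Collects the text found inside the <title> element."""

    def __init__(self) -> None:
        super().__init__()
        self._in_title = False
        self.title_parts: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title_parts.append(data)


def feature_phrase_from_html(html: str) -> List[str]:
    """Split the page title into candidate feature words (assumed separators)."""
    parser = TitleParser()
    parser.feed(html)
    parser.close()
    title = "".join(parser.title_parts)
    # Separators are an assumption: "|", "," or a hyphen surrounded by spaces.
    words = re.split(r"\s*[|,]\s*|\s+-\s+", title)
    return [w.strip() for w in words if w and w.strip()]
```

The returned list plays the role of the feature phrase; words recognized from pictures or video could be appended to the same list.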
Step S103: perform lexical analysis on the feature phrase to generate a classification marker that expresses user behavior.
Further, the lexical analysis can be implemented by a lexical analysis module; calling the lexical analysis module through a lexical analysis API produces the classification marker expressing user behavior. When the module is called, the feature phrase can be represented in vectorized form as a vocabulary vector, for example ["xx store", "washing machine", "xx model", "drum"].
The lexical analysis API is an access address of the lexical analysis module, through which the vocabulary vector to be analyzed is passed to the module. The lexical analysis module can be implemented in many ways. For example, it can accumulate a weight for each category of word in the vocabulary vector being analyzed and take the attribute with the highest weight as the feature of that category. Suppose the vocabulary vector contains "Haier washing machine", "TCL washing machine" and "Samsung SC1000". The lexical analysis module accumulates weight for "washing machine", so the weight of "washing machine" in its category is 2 while the weight of every other entry is 1. The module therefore outputs "washing machine" as one of the features of the vocabulary vector in that category. The classification marker expressing user behavior is then obtained from the features of each category.
In the present invention, the classification marker can express a user behavior, for example "purchase a washing machine", "look up washing machine prices", or "purchase cosmetics of a certain brand".
It should be noted that lexical analysis is a technique commonly used in the prior art and can be implemented in many ways; the process above is one of them.
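The weight accumulation in the washing machine example can be sketched as below. The text does not say how words are mapped to categories or how a behavior is attached to the winning feature, so the small `KNOWN_CATEGORIES` lexicon, the substring test and the fixed `action` prefix are hypothetical choices made only to keep the example runnable.

```python
from collections import Counter
from typing import Sequence

# Hypothetical category lexicon; a real module would use a much larger one.
KNOWN_CATEGORIES = ("washing machine", "refrigerator")


def accumulate_weights(
    vocabulary: Sequence[str],
    categories: Sequence[str] = KNOWN_CATEGORIES,
) -> Counter:
    """Add one unit of weight per vocabulary entry to the category it mentions."""
    weights: Counter = Counter()
    for term in vocabulary:
        lowered = term.lower()
        matched = [c for c in categories if c in lowered]
        # An entry that mentions no known category is counted under its own name.
        for name in matched or [lowered]:
            weights[name] += 1
    return weights


def behavior_marker(vocabulary: Sequence[str], action: str = "purchase") -> str:
    """Build a marker from the highest-weighted feature (action is assumed)."""
    weights = accumulate_weights(vocabulary)
    if not weights:
        raise ValueError("empty vocabulary vector")
    feature, _ = weights.most_common(1)[0]
    return f"{action} {feature}"
```

For the vocabulary vector in the text, "washing machine" accumulates a weight of 2 and "samsung sc1000" a weight of 1, so the sketch selects "washing machine" as the feature.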
Step S104: generate corresponding classification information according to the URL to be classified and the classification marker corresponding to the URL to be classified, and record it in the URL classification library.
When generating the classification information of the URL, the URL can be processed to obtain a representative string that can stand for a type of URL; alternatively, the complete URL can be written directly into the URL classification library.
In the URL classification library, URLs are labeled with classification markers, and each classification marker can correspond to multiple types of URL.
The classification information includes the string related to the URL and the corresponding classification marker.
When a URL to be classified is not in the preset URL classification library, the web page corresponding to the URL is analyzed, a feature phrase expressing the page content is extracted, and lexical analysis is performed on the feature phrase to obtain a classification marker expressing user behavior. The URL is then classified according to the URL and the classification marker, and the URL classification library is updated. Compared with existing classification techniques, the present invention can classify all URLs completely and with high accuracy.
Embodiment two
Fig. 2 shows a flowchart of a URL classification method provided by Embodiment two of the present invention, described in detail below with reference to the drawing:
Step S201: extract the feature string of the URL to be classified.
The feature string is a representative string within the URL that can stand for a type of URL. For example, for the URL "bbs.phicomm.com/article/titleS=123", the corresponding feature string is "phicomm.com/article". The present invention does not limit how the feature string is extracted. In general, the feature string includes at least the main part of the domain name and a field from the upper-level directory.
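One way to derive such a feature string is sketched below with `urllib.parse.urlsplit` from the Python standard library. The extraction rule is left open by the text, so two assumptions are made here: the main part of the domain is taken to be the last two host labels (which is wrong for suffixes such as `.com.cn`), and the upper-level directory is taken to be the first path segment. The function name `feature_string` is hypothetical.

```python
import re
from urllib.parse import urlsplit


def feature_string(url: str) -> str:
    """Return '<main domain>/<first path segment>' for url (assumed rule)."""
    # urlsplit only recognizes the host when the URL has a scheme or starts
    # with "//", so scheme-less input such as "bbs.phicomm.com/..." is prefixed.
    if not re.match(r"[A-Za-z][A-Za-z0-9+.\-]*://", url):
        url = "//" + url
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    # Naive main-domain rule: keep the last two labels of the host name.
    domain = ".".join(host.split(".")[-2:])
    segments = [s for s in parts.path.split("/") if s]
    return f"{domain}/{segments[0]}" if segments else domain
```

For the example in the text, `feature_string("bbs.phicomm.com/article/titleS=123")` yields `"phicomm.com/article"`. A production version would likely consult a public suffix list rather than counting labels.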
Step S202: query the URL classification library with the feature string to determine whether the library contains classification information for the URL to be classified.
Step S203: when the URL classification library contains no classification information for the URL to be classified, obtain, from the web page corresponding to the URL to be classified, a feature phrase that expresses the content of the page.
Step S204: perform lexical analysis on the feature phrase to generate a classification marker that expresses user behavior.
Step S205: generate corresponding classification information according to the feature string of the URL to be classified and the classification marker, and record it in the URL classification library.
In this embodiment, the feature string of the URL to be classified and the corresponding classification marker are written into the URL classification library, which generates a new classification.
In this embodiment, the feature string of a URL can stand for a type of URL, so the category of a whole type of URL is determined at once.
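The effect of keying the library by feature string can be illustrated with a small sketch of steps S201 to S205. The class `UrlClassificationLibrary` and its parameters `key_fn`, `fetch_features` and `analyze` are hypothetical names; the dictionary stands in for whatever storage a real URL classification library would use.

```python
from typing import Callable, Dict, List


class UrlClassificationLibrary:
    """In-memory stand-in for a library keyed by feature string."""

    def __init__(self, key_fn: Callable[[str], str]) -> None:
        self._key_fn = key_fn                  # extracts the feature string
        self._records: Dict[str, str] = {}     # feature string -> marker

    def classify(
        self,
        url: str,
        fetch_features: Callable[[str], List[str]],
        analyze: Callable[[List[str]], str],
    ) -> str:
        key = self._key_fn(url)                    # S201: extract feature string
        marker = self._records.get(key)            # S202: query the library
        if marker is None:
            marker = analyze(fetch_features(url))  # S203 and S204
            self._records[key] = marker            # S205: record the new class
        return marker
```

Because the record is stored under the feature string, a second URL of the same type is answered from the library without its page being accessed.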
Embodiment three
Fig. 3 shows a structural block diagram of a URL classification system provided by Embodiment three of the present invention, described in detail below with reference to the drawing:
The URL classification system includes:
A judgment module 31, for determining whether classification information for a URL to be classified exists in a preset URL classification library;
A feature phrase acquisition module 32, for obtaining, from the web page corresponding to the URL to be classified, a feature phrase that expresses the content of the page when the URL classification library contains no classification information for the URL to be classified;
A classification marker generation module 33, for performing lexical analysis on the feature phrase to generate a classification marker that expresses user behavior;
A classification module 34, for generating corresponding classification information according to the URL to be classified and the classification marker corresponding to the URL to be classified, and recording it in the URL classification library.
Optionally, the judgment module 31 includes:
A string extraction submodule, for extracting the feature string of the URL to be classified;
A judgment submodule, for querying the URL classification library with the feature string to determine whether the library contains classification information for the URL to be classified.
Optionally, the classification module 34 includes:
A classification information generation submodule, for generating the corresponding classification information according to the feature string of the URL to be classified and the classification marker.
Optionally, the feature phrase acquisition module 32 includes:
A URL access submodule, for accessing the URL to be classified to obtain the corresponding web page content;
A feature phrase determination submodule, for determining the feature phrase that expresses the web page content.
Optionally, the feature phrase includes at least the title information of the web page corresponding to the URL to be classified.
Because the URL classification system of this embodiment is used to carry out the foregoing method embodiments, see method Embodiment one and method Embodiment two above for details, which are not repeated here.
It should be understood that the steps of the present invention have no strict order of execution; any variation that can be conceived and that does not affect the implementation of the functions falls within the scope of protection of the present invention.
In the embodiments provided in this application, it should be understood that the described method and system are illustrative and can be adjusted in actual implementation.
In addition, the specific names of the functional units or modules serve only to distinguish them from one another and are not intended to limit the scope of protection of the present invention.
The specific embodiments described here merely illustrate the spirit of the present invention. Those skilled in the art to which the present invention belongs may make various modifications or additions to the described embodiments, or replace them in similar ways, without departing from the spirit of the present invention or exceeding the scope defined by the appended claims.
Claims (10)
1. A URL classification method, characterized by comprising the steps of:
Determining whether classification information for a URL to be classified exists in a preset URL classification library;
When the URL classification library contains no classification information for the URL to be classified, obtaining, from the web page corresponding to the URL to be classified, a feature phrase that expresses the content of the page;
Performing lexical analysis on the feature phrase to generate a classification marker that expresses user behavior;
Generating corresponding classification information according to the URL to be classified and the classification marker corresponding to the URL to be classified, and recording it in the URL classification library.
2. The URL classification method according to claim 1, characterized in that determining whether classification information for the URL to be classified exists in the preset URL classification library comprises:
Extracting a feature string from the URL to be classified;
Querying the URL classification library with the feature string to determine whether the library contains classification information for the URL to be classified.
3. The URL classification method according to claim 2, characterized in that generating corresponding classification information according to the URL to be classified and the classification marker corresponding to the URL to be classified comprises:
Generating the corresponding classification information according to the feature string of the URL to be classified and the classification marker.
4. The URL classification method according to claim 1, characterized in that obtaining, from the web page corresponding to the URL to be classified, a feature phrase that expresses the content of the page comprises:
Accessing the URL to be classified to obtain the corresponding web page content;
Determining the feature phrase that expresses the web page content.
5. The URL classification method according to claim 1, characterized in that the feature phrase includes at least the title information of the web page corresponding to the URL to be classified.
6. A URL classification system, characterized by comprising:
A judgment module, for determining whether classification information for a URL to be classified exists in a preset URL classification library;
A feature phrase acquisition module, for obtaining, from the web page corresponding to the URL to be classified, a feature phrase that expresses the content of the page when the URL classification library contains no classification information for the URL to be classified;
A classification marker generation module, for performing lexical analysis on the feature phrase to generate a classification marker that expresses user behavior;
A classification module, for generating corresponding classification information according to the URL to be classified and the classification marker corresponding to the URL to be classified, and recording it in the URL classification library.
7. The URL classification system according to claim 6, characterized in that the judgment module comprises:
A string extraction submodule, for extracting the feature string of the URL to be classified;
A judgment submodule, for querying the URL classification library with the feature string to determine whether the library contains classification information for the URL to be classified.
8. The URL classification system according to claim 7, characterized in that the classification module comprises:
A classification information generation submodule, for generating the corresponding classification information according to the feature string of the URL to be classified and the classification marker.
9. The URL classification system according to claim 6, characterized in that the feature phrase acquisition module comprises:
A URL access submodule, for accessing the URL to be classified to obtain the corresponding web page content;
A feature phrase determination submodule, for determining the feature phrase that expresses the web page content.
10. The URL classification system according to claim 6, characterized in that the feature phrase includes at least the title information of the web page corresponding to the URL to be classified.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810156915.XA CN108334630A (en) | 2018-02-24 | 2018-02-24 | A kind of URL classification method and system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810156915.XA CN108334630A (en) | 2018-02-24 | 2018-02-24 | A kind of URL classification method and system |
Publications (1)
Publication Number | Publication Date |
---|---|
CN108334630A true CN108334630A (en) | 2018-07-27 |
Family
ID=62929737
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810156915.XA Pending CN108334630A (en) | 2018-02-24 | 2018-02-24 | A kind of URL classification method and system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108334630A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110516136A (en) * | 2019-08-29 | 2019-11-29 | 南京烽火天地通信科技有限公司 | A kind of internet crawler content page recognition methods based on sample |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102819591A (en) * | 2012-08-07 | 2012-12-12 | 北京网康科技有限公司 | Content-based web page classification method and system |
CN102819597A (en) * | 2012-08-13 | 2012-12-12 | 北京星网锐捷网络技术有限公司 | Web page classification method and equipment |
US20160217144A1 (en) * | 2013-09-04 | 2016-07-28 | Zte Corporation | Method and device for obtaining web page category standards, and method and device for categorizing web page categories |
WO2017167067A1 (en) * | 2016-03-30 | 2017-10-05 | 阿里巴巴集团控股有限公司 | Method and device for webpage text classification, method and device for webpage text recognition |
-
2018
- 2018-02-24 CN CN201810156915.XA patent/CN108334630A/en active Pending
Non-Patent Citations (1)
Title |
---|
宗校军: "中文网页定题采集及分类研究", 《中国博士学位论文全文数据库(信息科技辑)》 * |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11809393B2 (en) | Image and text data hierarchical classifiers | |
CN101542486B (en) | Rank graph | |
US8856129B2 (en) | Flexible and scalable structured web data extraction | |
CN102053983B (en) | Method, system and device for querying vertical search | |
US7676745B2 (en) | Document segmentation based on visual gaps | |
JP4637969B1 (en) | Properly understand the intent of web pages and user preferences, and recommend the best information in real time | |
EP2570974A1 (en) | Automatic crowd sourcing for machine learning in information extraction | |
CN103886020B (en) | A kind of real estate information method for fast searching | |
US20110246462A1 (en) | Method and System for Prompting Changes of Electronic Document Content | |
CN107038173A (en) | Application query method and apparatus, similar application detection method and device | |
CN105243058A (en) | Webpage content translation method and electronic apparatus | |
CN103617192B (en) | The clustering method and device of a kind of data object | |
US20230351789A1 (en) | Systems and methods for deep learning based approach for content extraction | |
CN112035675A (en) | Medical text labeling method, device, equipment and storage medium | |
CN104036190A (en) | Method and device for detecting page tampering | |
ur Rehman et al. | Learning a semantic space for modeling images, tags and feelings in cross-media search | |
CN114222000B (en) | Information pushing method, device, computer equipment and storage medium | |
US11386263B2 (en) | Automatic generation of form application | |
CN109885583A (en) | Data query method, apparatus, equipment and storage medium based on block chain | |
CN108334630A (en) | A kind of URL classification method and system | |
CN113836434B (en) | Web page data processing method based on database | |
CN109948015B (en) | Meta search list result extraction method and system | |
CN115186240A (en) | Social network user alignment method, device and medium based on relevance information | |
JP2007323238A (en) | Highlighting device and program | |
CN114239689A (en) | Multi-mode-based website type judgment method and device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 20180727 |