CN108959289A - Website category acquisition method and device - Google Patents
- Publication number
- CN108959289A (CN201710351636.4A)
- Authority
- CN
- China
- Prior art keywords
- data set
- data acquisition
- acquisition system
- order data
- website
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
- G06F18/231—Hierarchical techniques, i.e. dividing or merging pattern sets so as to obtain a dendrogram
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/06—Buying, selling or leasing transactions
- G06Q30/0601—Electronic shopping [e-shopping]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2216/00—Indexing scheme relating to additional aspects of information retrieval not explicitly covered by G06F16/00 and subgroups
- G06F2216/07—Guided tours
Abstract
This application discloses a website category acquisition method and device. One specific embodiment of the method includes: obtaining an order data set and an access data set of a target website within a first preset time period; analyzing the order data set and the access data set, selecting order data from the order data set to generate a target order data set, and selecting access data from the access data set to generate a target access data set; extracting a feature vector from the target order data set and the target access data set; and inputting the feature vector into a pre-trained website classification model for classification to obtain the secondary category of the target website, where the website classification model characterizes the correspondence between the feature vector of a website and the secondary category of the website. This embodiment improves website classification efficiency.
Description
Technical field
This application relates to the field of computer technology, specifically to the field of Internet technology, and in particular to a website category acquisition method and device.
Background
With the spread of the Internet, the advantages of online shopping have become more prominent. The number of users who shop online keeps rising, and websites of various types (such as online stores) continue to emerge.
Websites of the same type may have different business modes. According to these business modes, websites of the same type can be further divided into different categories.
However, in existing approaches, websites are usually classified by those skilled in the art through manual analysis, so website classification efficiency is low.
Summary of the invention
The purpose of the embodiments of this application is to propose an improved website category acquisition method and device, to solve the technical problem mentioned in the Background section above.
In a first aspect, an embodiment of this application provides a website category acquisition method. The method includes: obtaining an order data set and an access data set of a target website within a first preset time period; analyzing the order data set and the access data set, selecting order data from the order data set to generate a target order data set, and selecting access data from the access data set to generate a target access data set; extracting a feature vector from the target order data set and the target access data set; and inputting the feature vector into a pre-trained website classification model for classification to obtain the secondary category of the target website, where the website classification model characterizes the correspondence between the feature vector of a website and the secondary category of the website.
In some embodiments, the feature vector includes at least one of the following: the order volume of the target website, the order amount of the target website, the visit count of the target website, and the page view count of the target website.
In some embodiments, after the feature vector is input into the pre-trained website classification model for classification and the secondary category of the target website is obtained, the method further includes: querying a first mapping table to obtain the primary category to which the secondary category of the target website belongs, where the first mapping table stores secondary categories and the primary categories to which they belong; obtaining the initial primary category that the target website submitted at registration; determining whether the primary category to which the secondary category of the target website belongs is the same as the initial primary category; and, if they differ, outputting abnormality prompt information.
In some embodiments, after the feature vector is input into the pre-trained website classification model for classification and the secondary category of the target website is obtained, the method further includes: querying a second mapping table to obtain the peak ordering time period corresponding to the secondary category of the target website, where the second mapping table stores secondary categories and their corresponding peak ordering time periods; and outputting the peak ordering time period corresponding to the secondary category of the target website.
In some embodiments, the method further includes a step of establishing the website classification model, which includes: obtaining, for each of multiple websites, an order data set and an access data set within a second preset time period; analyzing the order data sets and access data sets of the multiple websites, selecting order data from the order data sets of the multiple websites to generate multiple sample order data sets, and selecting access data from the access data sets of the multiple websites to generate multiple sample access data sets; extracting multiple sample feature vectors from the multiple sample order data sets and the multiple sample access data sets, respectively; and clustering the multiple sample feature vectors to obtain the website classification model.
In some embodiments, analyzing the order data sets and access data sets of the multiple websites, selecting order data from the order data sets of the multiple websites to generate multiple sample order data sets, and selecting access data from the access data sets of the multiple websites to generate multiple sample access data sets includes: deleting the order data and access data with missing fields from the order data sets and access data sets of the multiple websites to obtain first order data sets and first access data sets of the multiple websites; deduplicating the first order data sets and the first access data sets of the multiple websites, respectively, to obtain second order data sets and second access data sets of the multiple websites; and denoising the second order data sets and the second access data sets of the multiple websites based on a preset first cluster number to obtain the multiple sample order data sets and the multiple sample access data sets.
In some embodiments, extracting multiple sample feature vectors from the multiple sample order data sets and the multiple sample access data sets, respectively, includes: normalizing the multiple sample order data sets and the multiple sample access data sets, respectively, to obtain multiple normalized sample order data sets and multiple normalized sample access data sets; and generating first-derivative sets corresponding to the multiple normalized sample order data sets and first-derivative sets corresponding to the multiple normalized sample access data sets, and using them as the multiple sample feature vectors.
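The normalization and first-derivative steps can be illustrated with a minimal Python sketch. The source does not state how a data set is turned into a numeric sequence, which normalization is used, or how the derivative is computed; the sketch assumes a per-hour order count series, min-max normalization, and a finite difference as the discrete first derivative. The names `min_max_normalize`, `first_derivative`, and `hourly_orders` are hypothetical.

```python
from typing import List


def min_max_normalize(series: List[float]) -> List[float]:
    """Scale a series to [0, 1] (assumed normalization; not specified in the source)."""
    lo, hi = min(series), max(series)
    if hi == lo:
        # A constant series carries no shape information; map it to zeros.
        return [0.0 for _ in series]
    return [(x - lo) / (hi - lo) for x in series]


def first_derivative(series: List[float]) -> List[float]:
    """Discrete first derivative: the difference between consecutive values."""
    return [b - a for a, b in zip(series, series[1:])]


# Hypothetical sample: orders per hour for one website.
hourly_orders = [2, 4, 10, 6, 2]
normalized = min_max_normalize(hourly_orders)
sample_feature_vector = first_derivative(normalized)
```

Under these assumptions the feature vector describes the shape of the ordering curve rather than its absolute scale, so websites of different sizes with similar ordering rhythms yield similar vectors.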
In some embodiments, clustering the multiple sample feature vectors to obtain the website classification model includes: performing hierarchical clustering on the multiple sample feature vectors using a hierarchical clustering method, based on a preset second cluster number and a preset distance parameter, to obtain the website classification model.
In some embodiments, the hierarchical clustering method includes at least one of the following: the shortest distance method, the longest distance method, the average distance method, and the centroid distance method.
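As an illustration of the hierarchical clustering described above, the following Python sketch merges the closest pair of clusters until a preset cluster number is reached or the closest pair is farther apart than a preset distance. It is a sketch under assumptions, not the patented implementation: the source does not say how the preset distance parameter is used (a stopping threshold is assumed here), Euclidean distance is assumed, and the centroid distance method is omitted for brevity. All function and variable names are hypothetical.

```python
import math
from typing import Callable, Dict, List, Sequence

Vector = Sequence[float]


def euclidean(a: Vector, b: Vector) -> float:
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Each rule turns the pairwise point distances between two clusters
# into a single cluster-to-cluster distance.
LINKAGES: Dict[str, Callable[[List[float]], float]] = {
    "single": min,                            # shortest distance method
    "complete": max,                          # longest distance method
    "average": lambda ds: sum(ds) / len(ds),  # average distance method
}


def agglomerative(points: Sequence[Vector], n_clusters: int,
                  max_distance: float, linkage: str = "average") -> List[List[int]]:
    """Return clusters as lists of point indices."""
    combine = LINKAGES[linkage]
    clusters: List[List[int]] = [[i] for i in range(len(points))]
    while len(clusters) > max(n_clusters, 1):
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                ds = [euclidean(points[p], points[q])
                      for p in clusters[i] for q in clusters[j]]
                d = combine(ds)
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        if d > max_distance:
            break  # assumed role of the preset distance parameter
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Hypothetical two-dimensional sample feature vectors.
samples = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0), (10.0, 0.0)]
clusters = agglomerative(samples, n_clusters=3, max_distance=3.0)
```

The pairwise search is cubic in the number of samples, which is acceptable for a sketch; a production system would more likely rely on an existing clustering library.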
In a second aspect, an embodiment of this application provides a website category acquisition device. The device includes: an acquisition unit, configured to obtain an order data set and an access data set of a target website within a first preset time period; a selection unit, configured to analyze the order data set and the access data set, select order data from the order data set to generate a target order data set, and select access data from the access data set to generate a target access data set; an extraction unit, configured to extract a feature vector from the target order data set and the target access data set; and a classification unit, configured to input the feature vector into a pre-trained website classification model for classification to obtain the secondary category of the target website, where the website classification model characterizes the correspondence between the feature vector of a website and the secondary category of the website.
In some embodiments, the feature vector includes at least one of the following: the order volume of the target website, the order amount of the target website, the visit count of the target website, and the page view count of the target website.
In some embodiments, the device further includes: a first query unit, configured to query a first mapping table to obtain the primary category to which the secondary category of the target website belongs, where the first mapping table stores secondary categories and the primary categories to which they belong; a category acquisition unit, configured to obtain the initial primary category that the target website submitted at registration; a determination unit, configured to determine whether the primary category to which the secondary category of the target website belongs is the same as the initial primary category; and a first output unit, configured to output abnormality prompt information if they differ.
In some embodiments, the device further includes: a second query unit, configured to query a second mapping table to obtain the peak ordering time period corresponding to the secondary category of the target website, where the second mapping table stores secondary categories and their corresponding peak ordering time periods; and a second output unit, configured to output the peak ordering time period corresponding to the secondary category of the target website.
In some embodiments, the device further includes a website classification model establishing unit, which includes: an acquisition subunit, configured to obtain, for each of multiple websites, an order data set and an access data set within a second preset time period; a selection subunit, configured to analyze the order data sets and access data sets of the multiple websites, select order data from the order data sets of the multiple websites to generate multiple sample order data sets, and select access data from the access data sets of the multiple websites to generate multiple sample access data sets; an extraction subunit, configured to extract multiple sample feature vectors from the multiple sample order data sets and the multiple sample access data sets, respectively; and a clustering subunit, configured to cluster the multiple sample feature vectors to obtain the website classification model.
In some embodiments, the selection subunit includes: a deletion module, configured to delete the order data and access data with missing fields from the order data sets and access data sets of the multiple websites to obtain first order data sets and first access data sets of the multiple websites; a deduplication module, configured to deduplicate the first order data sets and the first access data sets of the multiple websites, respectively, to obtain second order data sets and second access data sets of the multiple websites; and a denoising module, configured to denoise the second order data sets and the second access data sets of the multiple websites based on a preset first cluster number to obtain the multiple sample order data sets and the multiple sample access data sets.
In some embodiments, the extraction subunit includes: a normalization module, configured to normalize the multiple sample order data sets and the multiple sample access data sets, respectively, to obtain multiple normalized sample order data sets and multiple normalized sample access data sets; and a derivation module, configured to generate first-derivative sets corresponding to the multiple normalized sample order data sets and first-derivative sets corresponding to the multiple normalized sample access data sets, and to use them as the multiple sample feature vectors.
In some embodiments, the clustering subunit is further configured to perform hierarchical clustering on the multiple sample feature vectors using a hierarchical clustering method, based on a preset second cluster number and a preset distance parameter, to obtain the website classification model.
In some embodiments, the hierarchical clustering method includes at least one of the following: the shortest distance method, the longest distance method, the average distance method, and the centroid distance method.
In a third aspect, an embodiment of this application provides a server. The server includes: one or more processors; and a storage device for storing one or more programs. When the one or more programs are executed by the one or more processors, the one or more processors implement the method described in any implementation of the first aspect.
In a fourth aspect, an embodiment of this application provides a computer-readable storage medium on which a computer program is stored. When the computer program is executed by a processor, it implements the method described in any implementation of the first aspect.
The website category acquisition method and device provided by the embodiments of this application obtain the order data set and the access data set of the target website within the first preset time period and analyze them to generate the target order data set and the target access data set; then extract a feature vector from the target order data set and the target access data set; and finally input the feature vector into the pre-trained website classification model for classification to obtain the secondary category of the target website. Because websites are classified by the website classification model rather than manually, website classification efficiency is improved.
Brief description of the drawings
Other features, objects, and advantages of this application will become more apparent upon reading the following detailed description of non-limiting embodiments with reference to the accompanying drawings:
Fig. 1 is a diagram of an exemplary system architecture to which this application can be applied;
Fig. 2 is a flowchart of one embodiment of the website category acquisition method according to this application;
Fig. 3 is a flowchart of one embodiment of the method for establishing the website classification model according to this application;
Fig. 4 is a structural schematic diagram of one embodiment of the website category acquisition device according to this application;
Fig. 5 is a structural schematic diagram of a computer system suitable for implementing the server of the embodiments of this application.
Detailed description of the embodiments
This application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are used only to explain the related invention and do not limit it. It should also be noted that, for ease of description, only the parts relevant to the related invention are shown in the drawings.
It should be noted that, where no conflict arises, the embodiments of this application and the features in the embodiments can be combined with one another. This application is described in detail below with reference to the drawings and in conjunction with the embodiments.
Fig. 1 shows an exemplary system architecture 100 to which the website category acquisition method or the website category acquisition device of this application can be applied.
As shown in Fig. 1, the system architecture 100 may include a terminal device 101, a database server 102, a network 103, and a server 104. The network 103 is the medium that provides communication links between the terminal device 101, the database server 102, and the server 104. The network 103 may include various connection types, such as wired or wireless communication links, or fiber-optic cables.
A user can use the terminal device 101 to interact with the server 104 through the network 103, for example to receive or send messages. For example, the user can use the terminal device 101 to send the order data set and the access data set of the target website within the first preset time period to the server 104 through the network 103. The terminal device 101 can be any of various electronic devices, including but not limited to smartphones, tablet computers, e-book readers, laptop computers, and desktop computers.
The database server 102 can be used to store the order data set and the access data set of the target website within the first preset time period, so that the server 104 can obtain them from the database server 102 through the network 103.
The server 104 can be a server that provides various services. For example, the server 104 can obtain the order data set and the access data set of the target website within the first preset time period from the terminal device 101 or the database server 102, perform processing such as analysis on the obtained data sets, and output the processing result (for example, the secondary category of the target website).
It should be noted that the website category acquisition method provided by the embodiments of this application is generally executed by the server 104; correspondingly, the website category acquisition device is generally provided in the server 104.
It should be understood that the numbers of terminal devices, database servers, networks, and servers in Fig. 1 are merely illustrative. There can be any number of terminal devices, database servers, networks, and servers, as the implementation requires. If the order data set and the access data set of the target website within the first preset time period are stored in the server 104, the system architecture 100 need not include the terminal device 101 or the database server 102.
With continued reference to Fig. 2, a flow 200 of one embodiment of the website category acquisition method according to this application is shown. The website category acquisition method includes the following steps:
Step 201: obtain the order data set and the access data set of the target website within the first preset time period.
In this embodiment, the electronic device on which the website category acquisition method runs (for example, the server 104 shown in Fig. 1) can obtain the order data set and the access data set of the target website within the first preset time period (for example, within a certain day, a certain week, or a certain month). A website generally refers to a collection of related web pages, made on the Internet according to certain rules with tools such as HTML (HyperText Markup Language), that is used to present specific content. For example, a website can be an online store on an e-commerce platform, and the target website can be a particular online store on an e-commerce platform.
In this embodiment, the order data set can be the set of data related to the orders that users place on the target website. Each order record can include, but is not limited to: information about the target website (for example, its name, telephone number, and address), information about the ordering user (for example, the user's account name, telephone number, and address), and information about the ordered item (for example, its name, SKU (Stock Keeping Unit) number, product category, and price). The access data set can be the set of data related to users' visits to the target website. Each access record can include, but is not limited to: information about the target website (for example, its name, telephone number, and address), information about the visiting user (for example, the user's account name, telephone number, and address), and information about the visited item (for example, its name, SKU number, product category, and price).
It should be noted that the electronic device can obtain the order data set and the access data set of the target website within the first preset time period locally, from a terminal with which it communicates (for example, the terminal device 101 shown in Fig. 1), or from a database server with which it communicates (for example, the database server 102 shown in Fig. 1). This embodiment does not limit where the electronic device obtains these data sets from.
Step 202: analyze the order data set and the access data set, select order data from the order data set to generate the target order data set, and select access data from the access data set to generate the target access data set.
In this embodiment, based on the order data set and the access data set obtained in step 201, the electronic device can analyze the two data sets, obtaining the target order data set from the order data set and the target access data set from the access data set.
In this embodiment, the electronic device can obtain the target order data set and the target access data set in several ways.
In some optional implementations of this embodiment, the electronic device can randomly select several order records from the order data set to generate the target order data set, and randomly select several access records from the access data set to generate the target access data set.
In some optional implementations of this embodiment, the electronic device can first delete the order records and access records with missing fields from the order data set and the access data set, and then deduplicate the order data set and the access data set, respectively, to obtain the target order data set and the target access data set.
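The deletion and deduplication described above can be sketched in Python as follows. The source lists the kinds of information a record holds but does not define a schema, so the field names in `REQUIRED_FIELDS` and the sample records are hypothetical, and treating two records as duplicates only when every field matches is an assumption.

```python
from typing import Dict, List, Optional, Tuple

Record = Dict[str, Optional[str]]

# Hypothetical field names; the source does not define a record schema.
REQUIRED_FIELDS: Tuple[str, ...] = ("order_id", "site", "user", "sku", "price")


def drop_incomplete(records: List[Record]) -> List[Record]:
    """Remove records in which any required field is missing or empty."""
    return [r for r in records
            if all(r.get(f) not in (None, "") for f in REQUIRED_FIELDS)]


def deduplicate(records: List[Record]) -> List[Record]:
    """Keep the first copy of each record whose fields are all identical."""
    seen = set()
    unique: List[Record] = []
    for r in records:
        key = tuple(sorted(r.items()))  # order-independent fingerprint of the record
        if key not in seen:
            seen.add(key)
            unique.append(r)
    return unique


raw_orders: List[Record] = [
    {"order_id": "1", "site": "shop-a", "user": "u1", "sku": "s1", "price": "9.90"},
    {"order_id": "1", "site": "shop-a", "user": "u1", "sku": "s1", "price": "9.90"},  # duplicate
    {"order_id": "2", "site": "shop-a", "user": "u2", "sku": None, "price": "5.00"},  # missing SKU
    {"order_id": "3", "site": "shop-a", "user": "u3", "sku": "s2", "price": "12.00"},
]
target_orders = deduplicate(drop_incomplete(raw_orders))
```

The same two functions would apply unchanged to access records, given a suitable list of required fields.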
Step 203: extract a feature vector from the target order data set and the target access data set.
In this embodiment, based on the target order data set and the target access data set generated in step 202, the electronic device can extract a feature vector from the two data sets. As an example, the electronic device can perform statistical analysis on the target order data set to obtain the order volume of the target website, and perform statistical analysis on the target access data set to obtain the visit count of the target website. The electronic device can then use the order volume and the visit count of the target website as the feature vector. Alternatively, it can normalize the order volume and the visit count of the target website and use the normalized values as the feature vector.
In some optional implementations of this embodiment, the feature vector can include, but is not limited to, at least one of the following: the order volume of the target website, the order amount of the target website, the visit count of the target website, and the page view count of the target website.
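A minimal Python sketch of this statistical extraction follows. The source does not define how the four quantities are computed, so the sketch makes several assumptions: the order amount is the sum of item prices, the visit count is the number of distinct visiting users, and the page view count is the number of access records. The field names `price` and `user` and the function name `extract_feature_vector` are hypothetical.

```python
from typing import Any, Dict, List


def extract_feature_vector(orders: List[Dict[str, Any]],
                           accesses: List[Dict[str, Any]]) -> List[float]:
    """Return [order volume, order amount, visit count, page view count]."""
    order_volume = float(len(orders))
    order_amount = float(sum(o["price"] for o in orders))
    # Assumption: one "visit" per distinct user, one "page view" per access record.
    visit_count = float(len({a["user"] for a in accesses}))
    page_views = float(len(accesses))
    return [order_volume, order_amount, visit_count, page_views]


# Hypothetical target data sets for one website.
target_orders = [{"user": "u1", "price": 10.0}, {"user": "u2", "price": 5.5}]
target_accesses = [{"user": "u1"}, {"user": "u1"}, {"user": "u2"}]
feature_vector = extract_feature_vector(target_orders, target_accesses)
```

Because the four components have very different scales, a real pipeline would probably normalize them before classification, as the embodiment mentions.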
Step 204: input the feature vector into the pre-trained website classification model for classification to obtain the secondary category of the target website.
In this embodiment, based on the feature vector extracted in step 203, the electronic device can input the feature vector into the pre-trained website classification model for classification to obtain the secondary category of the target website. The secondary category can be the business mode category of the website. For example, secondary categories can include, but are not limited to: a wholesale and retail mode category, a physical store plus online store mode category, a distribution mode category, and a purchasing agent mode category.
In this embodiment, the website classification model can be used to characterize the correspondence between the feature vector of a website and the secondary category of the website. The electronic device can establish the website classification model in several ways. For example, based on statistics over the feature vectors and secondary categories of a large number of websites, the electronic device can generate a mapping table that stores the correspondence between multiple feature vectors and secondary categories, and use that mapping table as the website classification model.
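One way to picture a mapping table used as a classification model is the Python sketch below. The source does not say how a feature vector that is absent from the table is matched; returning the category of the nearest stored vector by Euclidean distance is an assumption made here for illustration. The table contents, category labels, and the function name `classify` are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

# Each row pairs a stored feature vector with a secondary category (hypothetical values).
MappingTable = List[Tuple[Sequence[float], str]]

mapping_table: MappingTable = [
    ((0.9, 0.2), "wholesale_retail"),
    ((0.1, 0.8), "purchasing_agent"),
    ((0.5, 0.5), "distribution"),
]


def classify(feature_vector: Sequence[float], table: MappingTable) -> str:
    """Return the category of the stored vector closest to feature_vector."""
    if not table:
        raise ValueError("mapping table is empty")
    # Assumption: nearest-neighbor matching; the source does not specify the lookup rule.
    _, category = min(table, key=lambda row: math.dist(row[0], feature_vector))
    return category


secondary_category = classify((0.85, 0.25), mapping_table)
```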
In some optional implementations of this embodiment, after obtaining the secondary category of the target website, the electronic device can first query the first mapping table to obtain the primary category to which the secondary category of the target website belongs; next, obtain the initial primary category that the target website submitted at registration; then determine whether the primary category to which the secondary category belongs is the same as the initial primary category; and finally, if the two differ, output abnormality prompt information. The first mapping table can be used to store secondary categories and the primary categories to which they belong. The primary category can be the type of the website. Websites can be divided into multiple types according to the product category of the items they trade, for example electronic product websites, book websites, food websites, drug websites, and clothing websites. As an example, if the primary category to which the secondary category of the target website belongs is drugs, while the initial primary category that the target website submitted at registration is clothing, the electronic device can output abnormality prompt information to indicate that the target website may have registered with false information.
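The consistency check described above amounts to a table lookup followed by a comparison, as in this Python sketch. The table contents, the category labels, the message wording, and the function name `check_registration` are hypothetical, and silently skipping a secondary category that is missing from the table is an assumption.

```python
from typing import Dict, Optional

# Hypothetical first mapping table: secondary category -> primary category.
FIRST_MAPPING_TABLE: Dict[str, str] = {
    "secondary_a": "drugs",
    "secondary_b": "clothing",
}


def check_registration(secondary: str, registered_primary: str,
                       table: Dict[str, str] = FIRST_MAPPING_TABLE) -> Optional[str]:
    """Return an abnormality prompt if the categories disagree, otherwise None."""
    expected = table.get(secondary)
    if expected is None or expected == registered_primary:
        # Unknown secondary category (assumed: no prompt) or consistent registration.
        return None
    return (f"Abnormal: registered as '{registered_primary}' "
            f"but classified under '{expected}'")


prompt = check_registration("secondary_a", "clothing")
```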
In some optional implementations of this embodiment, after obtaining the secondary category of the target website, the electronic device can first query the second mapping table to obtain the peak ordering time period corresponding to the secondary category of the target website, and then output that peak ordering time period. The second mapping table can be used to store secondary categories and their corresponding peak ordering time periods. For each secondary category, those skilled in the art can perform statistical analysis on the ordering times of a large number of websites to obtain the peak ordering time period corresponding to that secondary category.
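The statistical analysis that fills the second mapping table could look like the following Python sketch. The source does not define how a peak period is determined or represented; the sketch assumes that ordering times are bucketed by hour of day and that the peak period of a secondary category is simply its busiest hour. The function name `build_peak_table` and the sample data are hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple


def build_peak_table(orders: Iterable[Tuple[str, int]]) -> Dict[str, int]:
    """Map each secondary category to its busiest hour of day (0-23).

    `orders` yields (secondary category, hour of day) pairs, one per order.
    """
    per_category: Dict[str, Counter] = defaultdict(Counter)
    for category, hour in orders:
        per_category[category][hour] += 1
    return {category: counts.most_common(1)[0][0]
            for category, counts in per_category.items()}


# Hypothetical ordering times collected from many websites.
observed = [
    ("secondary_a", 20), ("secondary_a", 20), ("secondary_a", 9),
    ("secondary_b", 10), ("secondary_b", 10), ("secondary_b", 10), ("secondary_b", 21),
]
second_mapping_table = build_peak_table(observed)
peak_hour = second_mapping_table["secondary_a"]
```

Once the table is built, answering a query for a target website is a single dictionary lookup on its secondary category.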
The website category acquisition method provided by the embodiments of this application obtains the order data set and the access data set of the target website within the first preset time period and analyzes them to generate the target order data set and the target access data set; then extracts a feature vector from the target order data set and the target access data set; and finally inputs the feature vector into the pre-trained website classification model for classification to obtain the secondary category of the target website. Because websites are classified by the website classification model rather than manually, website classification efficiency is improved.
With further reference to Fig. 3, a flow 300 of one embodiment of the method for establishing the website classification model is shown. The flow 300 of the method for establishing the website classification model includes the following steps:
Step 301: obtain, for each of multiple websites, an order data set and an access data set within a second preset time period.
In this embodiment, the electronic device (for example, the server 104 shown in Fig. 1) can obtain, for each of multiple websites, the order data set and the access data set within the second preset time period (for example, within a certain day, a certain week, or a certain month). A website can be an online store on an e-commerce platform.
Step 302: analyze the order data sets and access data sets of the multiple websites, select order data from the order data sets of the multiple websites to generate multiple sample order data sets, and select access data from the access data sets of the multiple websites to generate multiple sample access data sets.
In the present embodiment, based on the order data sets and access data sets of the multiple websites obtained in step 301, the electronic device may analyze those data sets, select order data from the order data sets of the multiple websites to generate multiple sample order data sets, and select access data from the access data sets of the multiple websites to generate multiple sample access data sets.
In the present embodiment, the electronic device may obtain the multiple sample order data sets and sample access data sets in several ways.
In some optional implementations of the present embodiment, for each of the multiple websites, the electronic device may randomly select several pieces of order data from the order data set of the website to generate the sample order data set of the website, and may randomly select several pieces of access data from the access data set of the website to generate the sample access data set of the website.
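Random selection of this kind can be sketched with the standard library. The function name `draw_sample`, the parameter `k`, and the fixed seed are assumptions made for illustration; the application does not specify how many records are drawn.

```python
import random
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def draw_sample(records: Sequence[T], k: int, seed: int = 0) -> List[T]:
    """Randomly pick up to k records without replacement."""
    rng = random.Random(seed)  # fixed seed only so the sketch is reproducible
    return rng.sample(list(records), min(k, len(records)))
```

The same helper would be applied once to the order data set and once to the access data set of each website.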
In some optional implementations of the present embodiment, the electronic device may obtain the multiple sample order data sets and sample access data sets through the following steps.
First, the electronic device may delete the order data and access data with missing fields from the order data sets and access data sets of the multiple websites, obtaining first order data sets and first access data sets of the multiple websites. Specifically, for each piece of order data or access data of each website, the electronic device may determine whether the fields of that piece of data are complete and, if they are not, delete it.
Then, the electronic device may deduplicate the first order data sets and the first access data sets of the multiple websites, obtaining second order data sets and second access data sets of the multiple websites. Specifically, for the first order data set or first access data set of each website, the electronic device may deduplicate that set to remove duplicate first order data from the first order data set or duplicate first access data from the first access data set.
Finally, the electronic device may denoise the second order data sets and second access data sets of the multiple websites based on a preset first cluster number (for example, a value between 12 and 17), obtaining the multiple sample order data sets and multiple sample access data sets. Specifically, based on the first cluster number, the electronic device may apply a hierarchical clustering method to the second order data sets and second access data sets of the multiple websites, remove the second order data sets and second access data sets of websites that may be fraudulently registered, and use the second order data sets and second access data sets of the remaining websites as the multiple sample order data sets and multiple sample access data sets.
Step 303: extract multiple sample feature vectors from the multiple sample order data sets and the multiple sample access data sets, respectively.
In the present embodiment, based on the multiple sample order data sets and multiple sample access data sets generated in step 302, the electronic device may extract multiple sample feature vectors from the multiple sample order data sets and the multiple sample access data sets, respectively. A sample feature vector may include, but is not limited to, at least one of the following: the order volume of the website, the order amount of the website, the number of visits to the website, and the number of page views of the website. As an example, for each sample order data set and each sample access data set, the electronic device may statistically analyze the sample order data set to obtain the corresponding order volume, and may also statistically analyze the sample access data set to obtain the corresponding number of visits. The electronic device may then use the order volume corresponding to the sample order data set and the number of visits corresponding to the sample access data set as the sample feature vector.
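A minimal sketch of this statistical extraction is shown below. The record layout is an assumption: each order is taken to carry an `amount` field, each access record is taken to be one page view, and distinct `session_id` values are counted as visits. None of these field names or definitions come from the application.

```python
from typing import Dict, List, Sequence


def extract_feature_vector(orders: Sequence[Dict], visits: Sequence[Dict]) -> List[float]:
    """Build [order volume, order amount, number of visits, number of page views]."""
    order_volume = len(orders)                               # number of orders
    order_amount = sum(o["amount"] for o in orders)          # total order value
    visit_count = len({v["session_id"] for v in visits})     # distinct sessions (assumed definition)
    page_views = len(visits)                                 # one access record = one page view (assumed)
    return [order_volume, order_amount, visit_count, page_views]
```

A vector built from only the order volume and the number of visits, as in the example above, would simply keep the first and third components.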
In some optional implementations of the present embodiment, the electronic device may extract the multiple sample feature vectors through the following steps.
First, the electronic device may normalize the multiple sample order data sets and the multiple sample access data sets, obtaining multiple normalized sample order data sets and multiple normalized sample access data sets. Here, the electronic device may normalize the multiple sample order data sets and multiple sample access data sets using the min-max normalization method. Specifically, the electronic device may first set a minimum value (min) and a maximum value (max), and then map each original value x to a value x* in the interval [min, max] by the following min-max normalization formula:
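The formula itself does not appear in this text. One common form of min-max normalization that is consistent with the description is given here as an assumption, not as the formula of the application. It maps an original value x into the target interval [min, max], where x_lo and x_hi (symbols introduced here) denote the smallest and largest original values in the data set:

```latex
x^{*} = \min + \frac{\left(x - x_{\mathrm{lo}}\right)\left(\max - \min\right)}{x_{\mathrm{hi}} - x_{\mathrm{lo}}}
```

With min = 0 and max = 1 this reduces to x* = (x - x_lo) / (x_hi - x_lo), which maps every value into [0, 1].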
As an example, the order volume of a certain website during hours 7 to 12 of a certain day is shown in Table 1 below:
Table 1
If the order volume for each hour in Table 1 is mapped by the min-max normalization formula to a normalized value in the interval [0, 1], the normalized order volume of the website during hours 7 to 12 of that day is as shown in Table 2 below:
Table 2
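The normalization step can be sketched as follows. The function name and its parameters are assumptions, and the scaling used is the common min-max form that maps the smallest input to `new_min` and the largest to `new_max`; the hourly values in the usage note are invented for illustration and are not the values of Table 1.

```python
from typing import List, Sequence


def min_max_normalize(values: Sequence[float], new_min: float = 0.0, new_max: float = 1.0) -> List[float]:
    """Linearly rescale values so that min(values) -> new_min and max(values) -> new_max."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are equal; map everything to the lower bound to avoid dividing by zero.
        return [new_min for _ in values]
    return [new_min + (v - lo) / (hi - lo) * (new_max - new_min) for v in values]
```

For instance, calling `min_max_normalize([10, 30, 50, 20, 40, 60])` on six hypothetical hourly order volumes rescales the smallest value to 0 and the largest to 1, with the others spaced proportionally in between.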
Then, the electronic device may generate first-derivative sets corresponding to the multiple normalized sample order data sets and first-derivative sets corresponding to the multiple normalized sample access data sets, and use them as the multiple sample feature vectors. Continuing the example above, the electronic device may use the following formula to compute the first derivative f'(x*_i) corresponding to the normalized order volume at each hour:
where i is a positive integer with 7 ≤ i ≤ 12, x* is the normalized order volume, x*_i is the normalized order volume at the i-th hour, f'(x*) is the first derivative corresponding to the normalized order volume, and f'(x*_i) is the first derivative corresponding to the normalized order volume at the i-th hour.
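The derivative formula is not reproduced in this text, so the sketch below substitutes a forward difference between consecutive hours with a unit time step. This is an assumed stand-in for the first derivative, not the formula of the application, and the function name is hypothetical.

```python
from typing import List, Sequence


def first_differences(normalized: Sequence[float]) -> List[float]:
    """Forward difference x*_(i+1) - x*_i for consecutive time slots (unit step)."""
    return [b - a for a, b in zip(normalized, normalized[1:])]
```

With this choice a series of n normalized values yields n - 1 differences, so the last hour has no entry; a scheme that keeps n values (for example, a backward difference at the end point) would be an equally plausible reading.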
Step 304: cluster the multiple sample feature vectors to obtain the website classification model.
In the present embodiment, based on the multiple sample feature vectors extracted in step 303, the electronic device may cluster the multiple sample feature vectors, thereby establishing a trained website classification model that captures the accurate correspondence between the feature vector of a website and the second-level category of the website. Clustering is generally the process of dividing a set of physical or abstract objects into multiple classes composed of similar objects. A class produced by clustering is a set of data objects that are similar to the other objects in the same class and different from the objects in other classes. Here, clustering the multiple sample feature vectors produces multiple classes, each of which corresponds to one second-level category.
In some optional implementations of the present embodiment, the electronic device may perform hierarchical clustering on the multiple sample feature vectors using a hierarchical clustering method, based on a preset second cluster number (the second cluster number is generally smaller than the first cluster number; for example, a value between 2 and 5) and a preset distance parameter, to obtain the website classification model. Hierarchical clustering is a major clustering method that completes clustering by generating a series of nested cluster trees. Single-point clusters are at the bottom of the tree, and a root-node cluster at the top of the tree covers all data points. Hierarchical clustering can be divided into agglomerative (bottom-up) clustering and divisive (top-down) clustering; agglomerative clustering is used here. The distance parameter may include a distance value between two classes and a distance value between two objects of the same class. The distance indicated by the distance parameter may be the Euclidean distance or the Manhattan distance. Hierarchical clustering terminates when the distance between two classes and the distance between two objects of the same class reach the distances indicated by the distance parameter, or when the number of classes reaches the second cluster number.
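The agglomerative procedure and its two stopping conditions can be sketched in plain Python. Several simplifications are assumed here: the distance parameter is reduced to a single threshold `max_merge_distance` on the distance between classes, single linkage is used to measure that distance, and all names are hypothetical.

```python
import math
from typing import Callable, List, Sequence

Point = Sequence[float]


def euclidean(a: Point, b: Point) -> float:
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a: Point, b: Point) -> float:
    return sum(abs(x - y) for x, y in zip(a, b))


def agglomerate(
    points: Sequence[Point],
    target_clusters: int,
    max_merge_distance: float,
    metric: Callable[[Point, Point], float] = euclidean,
) -> List[List[Point]]:
    """Bottom-up clustering that repeatedly merges the two closest clusters.

    Merging stops when the number of clusters reaches target_clusters or when
    the closest pair of clusters is farther apart than max_merge_distance.
    """
    clusters: List[List[Point]] = [[p] for p in points]  # start from single-point clusters
    while len(clusters) > max(target_clusters, 1):
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage: distance between the closest members of the two clusters.
                d = min(metric(a, b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        if d > max_merge_distance:
            break  # distance-based termination
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters
```

This brute-force version recomputes all pairwise distances on every pass, which is fine for a small illustration but would be replaced by a library implementation for a large number of websites.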
In some optional implementations of the present embodiment, the hierarchical clustering method may include, but is not limited to, at least one of the following: the shortest distance method (SL, single-linkage), the longest distance method (CL, complete-linkage), the average distance method (AL, average-linkage), and the centroid distance method (centroid-linkage). In the shortest distance method, the inter-class distance equals the minimum distance between objects of the two classes. In the longest distance method, the inter-class distance equals the maximum distance between objects of the two classes. In the average distance method, the inter-class distance equals the average distance between objects of the two classes. In the centroid distance method, the inter-class distance equals the distance between the centroids of the two classes.
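The four inter-class distances can be written directly from their definitions. The sketch below assumes Euclidean distance between objects and represents a class as a list of equal-length numeric tuples; the function names are illustrative.

```python
import math
from itertools import product
from typing import List, Sequence

Point = Sequence[float]


def euclidean(a: Point, b: Point) -> float:
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def single_linkage(c1: Sequence[Point], c2: Sequence[Point]) -> float:
    """Shortest distance method: minimum distance over all cross-class pairs."""
    return min(euclidean(a, b) for a, b in product(c1, c2))


def complete_linkage(c1: Sequence[Point], c2: Sequence[Point]) -> float:
    """Longest distance method: maximum distance over all cross-class pairs."""
    return max(euclidean(a, b) for a, b in product(c1, c2))


def average_linkage(c1: Sequence[Point], c2: Sequence[Point]) -> float:
    """Average distance method: mean distance over all cross-class pairs."""
    return sum(euclidean(a, b) for a, b in product(c1, c2)) / (len(c1) * len(c2))


def centroid(c: Sequence[Point]) -> List[float]:
    """Component-wise mean of the points in a class."""
    return [sum(col) / len(c) for col in zip(*c)]


def centroid_linkage(c1: Sequence[Point], c2: Sequence[Point]) -> float:
    """Centroid distance method: distance between the two class centroids."""
    return euclidean(centroid(c1), centroid(c2))
```

For two classes `[(0.0, 0.0), (0.0, 2.0)]` and `[(3.0, 0.0), (3.0, 2.0)]`, the single-linkage and centroid distances coincide, while the complete-linkage distance is larger because it uses the diagonal pair.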
The method for establishing the website classification model provided by the embodiments of the present application obtains the order data sets and access data sets of multiple websites within a second preset time period and analyzes them to generate multiple sample order data sets and sample access data sets; it then extracts multiple sample feature vectors from the multiple sample order data sets and the multiple sample access data sets, respectively; finally, it clusters the multiple sample feature vectors to obtain the website classification model. This makes it possible to quickly establish a website classification model that captures the accurate correspondence between the feature vector of a website and the second-level category of the website.
With further reference to Fig. 4, as an implementation of the methods shown in the above figures, the present application provides one embodiment of a website category acquisition apparatus. This apparatus embodiment corresponds to the method embodiment shown in Fig. 2, and the apparatus can be applied in various electronic devices.
As shown in Fig. 4, the website category acquisition apparatus 400 of the present embodiment may include an acquiring unit 401, a selection unit 402, an extraction unit 403, and a classification unit 404. The acquiring unit 401 is configured to obtain the order data set and the access data set of a target website within a first preset time period. The selection unit 402 is configured to analyze the order data set and the access data set, select order data from the order data set to generate a target order data set, and select access data from the access data set to generate a target access data set. The extraction unit 403 is configured to extract a feature vector from the target order data set and the target access data set. The classification unit 404 is configured to input the feature vector into a pre-trained website classification model for classification to obtain the second-level category of the target website, wherein the website classification model characterizes the correspondence between the feature vector of a website and the second-level category of the website.
In the present embodiment, for the specific processing of the acquiring unit 401, the selection unit 402, the extraction unit 403, and the classification unit 404 of the website category acquisition apparatus 400, and the technical effects they bring, reference may be made to the descriptions of step 201, step 202, step 203, and step 204 in the embodiment corresponding to Fig. 2, respectively; details are not repeated here.
In some optional implementations of the present embodiment, the feature vector includes at least one of the following: the order volume of the target website, the order amount of the target website, the number of visits to the target website, and the number of page views of the target website.
In some optional implementations of the present embodiment, the website category acquisition apparatus 400 may further include: a first query unit (not shown), configured to query a first mapping table to obtain the first-level category to which the second-level category of the target website belongs, wherein the first mapping table stores second-level categories and the first-level category to which each second-level category belongs; a category acquiring unit (not shown), configured to obtain the initial category submitted by the target website at registration; a determination unit, configured to determine whether the first-level category to which the second-level category of the target website belongs is the same as the initial category; and a first output unit (not shown), configured to output abnormality prompt information if they are not the same.
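The check performed by these units can be sketched as a lookup followed by a comparison. The table contents, the category names, the message format, and the function name are all hypothetical.

```python
from typing import Optional

# Hypothetical first mapping table: second-level category -> first-level category.
FIRST_MAPPING_TABLE = {
    "snacks": "food",
    "laptops": "electronics",
}


def check_registered_category(predicted_second_level: str, registered_first_level: str) -> Optional[str]:
    """Return an abnormality message if the predicted and registered first-level categories differ."""
    predicted_first_level = FIRST_MAPPING_TABLE.get(predicted_second_level)
    if predicted_first_level != registered_first_level:
        return (
            f"category mismatch: predicted {predicted_first_level!r}, "
            f"registered {registered_first_level!r}"
        )
    return None  # categories agree, nothing to report
```

In this sketch a second-level category that is missing from the table is also reported as a mismatch; the application does not describe that case.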
In some optional implementations of the present embodiment, the website category acquisition apparatus 400 may further include: a second query unit (not shown), configured to query a second mapping table to obtain the peak ordering period corresponding to the second-level category of the target website, wherein the second mapping table stores second-level categories and the peak ordering period corresponding to each second-level category; and a second output unit (not shown), configured to output the peak ordering period corresponding to the second-level category of the target website.
In some optional implementations of the present embodiment, the website category acquisition apparatus 400 may further include a website classification model establishing unit (not shown), which may include: an obtaining subunit (not shown), configured to obtain the order data sets and access data sets of multiple websites within a second preset time period; a selection subunit (not shown), configured to analyze the order data sets and access data sets of the multiple websites, select order data from the order data sets of the multiple websites to generate multiple sample order data sets, and select access data from the access data sets of the multiple websites to generate multiple sample access data sets; an extraction subunit (not shown), configured to extract multiple sample feature vectors from the multiple sample order data sets and the multiple sample access data sets, respectively; and a clustering subunit (not shown), configured to cluster the multiple sample feature vectors to obtain the website classification model.
In some optional implementations of the present embodiment, the selection subunit may include: a deletion module (not shown), configured to delete the order data and access data with missing fields from the order data sets and access data sets of the multiple websites, obtaining first order data sets and first access data sets of the multiple websites; a deduplication module (not shown), configured to deduplicate the first order data sets and first access data sets of the multiple websites, obtaining second order data sets and second access data sets of the multiple websites; and a denoising module (not shown), configured to denoise the second order data sets and second access data sets of the multiple websites based on a preset first cluster number, obtaining multiple sample order data sets and multiple sample access data sets.
In some optional implementations of the present embodiment, the extraction subunit may include: a normalization module (not shown), configured to normalize the multiple sample order data sets and the multiple sample access data sets, obtaining multiple normalized sample order data sets and multiple normalized sample access data sets; and a derivation module (not shown), configured to generate first-derivative sets corresponding to the multiple normalized sample order data sets and first-derivative sets corresponding to the multiple normalized sample access data sets, and use them as the multiple sample feature vectors.
In some optional implementations of the present embodiment, the clustering subunit is further configured to perform hierarchical clustering on the multiple sample feature vectors using a hierarchical clustering method, based on a preset second cluster number and a preset distance parameter, to obtain the website classification model.
In some optional implementations of the present embodiment, the hierarchical clustering method includes at least one of the following: the shortest distance method, the longest distance method, the average distance method, and the centroid distance method.
Referring now to Fig. 5, a schematic structural diagram of a computer system 500 of a server suitable for implementing the embodiments of the present application is shown. The server shown in Fig. 5 is only an example and should not impose any limitation on the function or scope of use of the embodiments of the present application.
As shown in Fig. 5, the computer system 500 includes a central processing unit (CPU) 501, which can perform various appropriate actions and processing according to a program stored in a read-only memory (ROM) 502 or a program loaded from a storage section 508 into a random access memory (RAM) 503. The RAM 503 also stores the various programs and data required for the operation of the system 500. The CPU 501, the ROM 502, and the RAM 503 are connected to one another by a bus 504. An input/output (I/O) interface 505 is also connected to the bus 504.
The following components are connected to the I/O interface 505: an input section 506 including a keyboard, a mouse, and the like; an output section 507 including a cathode ray tube (CRT), a liquid crystal display (LCD), and the like, as well as a speaker; a storage section 508 including a hard disk and the like; and a communication section 509 including a network interface card such as a LAN card or a modem. The communication section 509 performs communication processing via a network such as the Internet. A drive 510 is also connected to the I/O interface 505 as needed. A removable medium 511, such as a magnetic disk, an optical disc, a magneto-optical disk, or a semiconductor memory, is mounted on the drive 510 as needed, so that a computer program read from it can be installed into the storage section 508 as needed.
In particular, according to embodiments of the present disclosure, the process described above with reference to the flowchart may be implemented as a computer software program. For example, embodiments of the present disclosure include a computer program product comprising a computer program carried on a computer-readable medium, the computer program containing program code for executing the method shown in the flowchart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication section 509, and/or installed from the removable medium 511. When the computer program is executed by the central processing unit (CPU) 501, the above functions defined in the method of the present application are performed.
It should be noted that the computer-readable medium of the present application may be a computer-readable signal medium, a computer-readable storage medium, or any combination of the two. A computer-readable storage medium may be, for example, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples of computer-readable storage media may include, but are not limited to: an electrical connection with one or more wires, a portable computer disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In the present application, a computer-readable storage medium may be any tangible medium that contains or stores a program, and the program may be used by or in connection with an instruction execution system, apparatus, or device. Also in the present application, a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, in which computer-readable program code is carried. Such a propagated data signal may take various forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium, and such a medium can send, propagate, or transmit a program for use by or in connection with an instruction execution system, apparatus, or device. The program code contained on a computer-readable medium may be transmitted by any suitable medium, including but not limited to wireless, wire, optical cable, RF, or any suitable combination of the above.
The flowcharts and block diagrams in the drawings illustrate the architecture, functions, and operations of possible implementations of systems, methods, and computer program products according to various embodiments of the present application. In this regard, each box in a flowchart or block diagram may represent a module, a program segment, or a portion of code, which contains one or more executable instructions for implementing the specified logical function. It should also be noted that, in some alternative implementations, the functions marked in the boxes may occur in an order different from that marked in the drawings. For example, two boxes shown in succession may in fact be executed substantially in parallel, and they may sometimes be executed in the reverse order, depending on the functions involved. It should also be noted that each box in the block diagrams and/or flowcharts, and combinations of boxes in the block diagrams and/or flowcharts, may be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
The units described in the embodiments of the present application may be implemented in software or in hardware. The described units may also be provided in a processor; for example, a processor may be described as including an acquiring unit, a selection unit, an extraction unit, and a classification unit. The names of these units do not, under certain circumstances, limit the units themselves; for example, the acquiring unit may also be described as "a unit that obtains the order data set and the access data set of a target website within a first preset time period".
In another aspect, the present application also provides a computer-readable medium, which may be included in the server described in the above embodiments, or may exist separately without being assembled into the server. The computer-readable medium carries one or more programs which, when executed by the server, cause the server to: obtain the order data set and the access data set of a target website within a first preset time period; analyze the order data set and the access data set, select order data from the order data set to generate a target order data set, and select access data from the access data set to generate a target access data set; extract a feature vector from the target order data set and the target access data set; and input the feature vector into a pre-trained website classification model for classification to obtain the second-level category of the target website, wherein the website classification model characterizes the correspondence between the feature vector of a website and the second-level category of the website.
The above description is only of the preferred embodiments of the present application and an explanation of the technical principles applied. Those skilled in the art should understand that the scope of the invention involved in the present application is not limited to technical solutions formed by the specific combination of the above technical features, and also covers other technical solutions formed by any combination of the above technical features or their equivalents without departing from the above inventive concept, for example, technical solutions formed by replacing the above features with technical features having similar functions disclosed in (but not limited to) the present application.
Claims (13)
1. A website category acquisition method, characterized in that the method comprises:
obtaining an order data set and an access data set of a target website within a first preset time period;
analyzing the order data set and the access data set, selecting order data from the order data set to generate a target order data set, and selecting access data from the access data set to generate a target access data set;
extracting a feature vector from the target order data set and the target access data set;
inputting the feature vector into a pre-trained website classification model for classification to obtain a second-level category of the target website, wherein the website classification model characterizes the correspondence between the feature vector of a website and the second-level category of the website.
2. The method according to claim 1, characterized in that the feature vector includes at least one of the following: the order volume of the target website, the order amount of the target website, the number of visits to the target website, and the number of page views of the target website.
3. The method according to claim 1, characterized in that, after inputting the feature vector into the pre-trained website classification model for classification to obtain the second-level category of the target website, the method further comprises:
querying a first mapping table to obtain the first-level category to which the second-level category of the target website belongs, wherein the first mapping table stores second-level categories and the first-level category to which each second-level category belongs;
obtaining the initial category submitted by the target website at registration;
determining whether the first-level category to which the second-level category of the target website belongs is the same as the initial category;
if they are not the same, outputting abnormality prompt information.
4. The method according to claim 1, characterized in that, after inputting the feature vector into the pre-trained website classification model for classification to obtain the second-level category of the target website, the method further comprises:
querying a second mapping table to obtain the peak ordering period corresponding to the second-level category of the target website, wherein the second mapping table stores second-level categories and the peak ordering period corresponding to each second-level category;
outputting the peak ordering period corresponding to the second-level category of the target website.
5. The method according to any one of claims 1-4, characterized in that the method further comprises a step of establishing the website classification model, the step of establishing the website classification model comprising:
obtaining order data sets and access data sets of multiple websites within a second preset time period;
analyzing the order data sets and access data sets of the multiple websites, selecting order data from the order data sets of the multiple websites to generate multiple sample order data sets, and selecting access data from the access data sets of the multiple websites to generate multiple sample access data sets;
extracting multiple sample feature vectors from the multiple sample order data sets and the multiple sample access data sets, respectively;
clustering the multiple sample feature vectors to obtain the website classification model.
6. The method according to claim 5, characterized in that analyzing the order data sets and access data sets of the multiple websites, selecting order data from the order data sets of the multiple websites to generate multiple sample order data sets, and selecting access data from the access data sets of the multiple websites to generate multiple sample access data sets comprises:
deleting the order data and access data with missing fields from the order data sets and access data sets of the multiple websites to obtain first order data sets and first access data sets of the multiple websites;
deduplicating the first order data sets and first access data sets of the multiple websites to obtain second order data sets and second access data sets of the multiple websites;
denoising the second order data sets and second access data sets of the multiple websites based on a preset first cluster number to obtain multiple sample order data sets and multiple sample access data sets.
7. The method according to claim 5, wherein extracting multiple sample feature vectors from the multiple sample order data sets and the multiple sample access data sets, respectively, comprises:
normalizing the multiple sample order data sets and the multiple sample access data sets, respectively, to obtain multiple normalized sample order data sets and multiple normalized sample access data sets; and
generating first-derivative sets corresponding to the multiple normalized sample order data sets and first-derivative sets corresponding to the multiple normalized sample access data sets, respectively, and using them as the multiple sample feature vectors.
8. The method according to claim 5, wherein clustering the multiple sample feature vectors to obtain the website classification model comprises:
performing hierarchical clustering on the multiple sample feature vectors using a hierarchical clustering method, based on a preset second cluster number and a preset distance parameter, to obtain the website classification model.
9. The method according to claim 8, wherein the hierarchical clustering method comprises at least one of: the shortest distance method, the longest distance method, the average distance method, and the centroid distance method.
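The hierarchical clustering of claims 8 and 9 can be sketched as a bottom-up (agglomerative) procedure. The four distance methods named in claim 9 are commonly known as single, complete, average, and centroid linkage. The sketch below makes two assumptions that the claims do not confirm: distances are Euclidean, and the preset second cluster number and the preset distance parameter act as stopping conditions (merging stops once the target number of clusters is reached or once the closest pair of clusters is farther apart than the distance parameter).

```python
import math


def centroid(cluster):
    """Component-wise mean of the vectors in a cluster."""
    n = len(cluster)
    return [sum(col) / n for col in zip(*cluster)]


def linkage_distance(c1, c2, method):
    """Distance between two clusters under the chosen linkage method."""
    if method == "centroid":
        return math.dist(centroid(c1), centroid(c2))
    pairwise = [math.dist(a, b) for a in c1 for b in c2]
    if method == "single":      # shortest distance method
        return min(pairwise)
    if method == "complete":    # longest distance method
        return max(pairwise)
    if method == "average":     # average distance method
        return sum(pairwise) / len(pairwise)
    raise ValueError(f"unknown linkage method: {method}")


def agglomerate(vectors, n_clusters, max_distance, method="average"):
    """Merge the closest clusters until a stopping condition is met."""
    clusters = [[list(v)] for v in vectors]  # start with one cluster per vector
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = linkage_distance(clusters[i], clusters[j], method)
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        if d > max_distance:
            break  # remaining clusters are too far apart to merge
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters
```

This naive version recomputes all pairwise cluster distances on every pass, which is acceptable for a handful of sample vectors but not for large inputs.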
10. A website category acquisition device, characterized in that the device comprises:
an acquisition unit configured to obtain an order data set and an access data set of a target website within a first preset time period;
a selection unit configured to analyze the order data set and the access data set, select order data from the order data set to generate a target order data set, and select access data from the access data set to generate a target access data set;
an extraction unit configured to extract a feature vector from the target order data set and the target access data set; and
a classification unit configured to input the feature vector into a pre-trained website classification model for classification to obtain a second-level category of the target website, wherein the website classification model characterizes the correspondence between feature vectors of websites and second-level categories of websites.
11. The device according to claim 10, wherein the device further comprises a website classification model establishing unit, the website classification model establishing unit comprising:
an obtaining subunit configured to obtain order data sets and access data sets of multiple websites within a second preset time period, respectively;
a choosing subunit configured to analyze the order data sets and access data sets of the multiple websites, select order data from the order data sets of the multiple websites to generate multiple sample order data sets, and select access data from the access data sets of the multiple websites to generate multiple sample access data sets;
an extracting subunit configured to extract multiple sample feature vectors from the multiple sample order data sets and the multiple sample access data sets, respectively; and
a clustering subunit configured to cluster the multiple sample feature vectors to obtain the website classification model.
12. A server, comprising:
one or more processors; and
a storage device for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method according to any one of claims 1-9.
13. A computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the method according to any one of claims 1-9.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710351636.4A CN108959289B (en) | 2017-05-18 | 2017-05-18 | Website category acquisition method and device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108959289A (en) | 2018-12-07 |
CN108959289B (en) | 2022-04-26 |
Family
ID=64462802
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710351636.4A Active CN108959289B (en) | 2017-05-18 | 2017-05-18 | Website category acquisition method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108959289B (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111882265A (en) * | 2020-06-29 | 2020-11-03 | 深圳市法本信息技术股份有限公司 | Cross-border e-commerce automatic customs declaration method and automatic customs declaration robot |
CN112417893A (en) * | 2020-12-16 | 2021-02-26 | 江苏徐工工程机械研究院有限公司 | Software function demand classification method and system based on semantic hierarchical clustering |
CN114615262A (en) * | 2022-01-30 | 2022-06-10 | 阿里巴巴(中国)有限公司 | Network aggregation method, storage medium, processor and system |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103324628A (en) * | 2012-03-21 | 2013-09-25 | 腾讯科技(深圳)有限公司 | Industry classification method and system for text publishing |
CN103605794A (en) * | 2013-12-05 | 2014-02-26 | 国家计算机网络与信息安全管理中心 | Website classifying method |
CN103744981A (en) * | 2014-01-14 | 2014-04-23 | 南京汇吉递特网络科技有限公司 | System for automatic classification analysis for website based on website content |
CN104809125A (en) * | 2014-01-24 | 2015-07-29 | 腾讯科技(深圳)有限公司 | Method and device for identifying webpage categories |
CN105184574A (en) * | 2015-06-30 | 2015-12-23 | 电子科技大学 | Method for detecting fraud behavior of merchant category code cloning |
US9262646B1 (en) * | 2013-05-31 | 2016-02-16 | Symantec Corporation | Systems and methods for managing web browser histories |
CN105556557A (en) * | 2013-09-20 | 2016-05-04 | 日本电气株式会社 | Shipment-volume prediction device, shipment-volume prediction method, recording medium, and shipment-volume prediction system |
CN106682217A (en) * | 2016-12-31 | 2017-05-17 | 成都数联铭品科技有限公司 | Method for enterprise second-grade industry classification based on automatic screening and learning of information |
Also Published As
Publication number | Publication date |
---|---|
CN108959289B (en) | 2022-04-26 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN106911697B (en) | Access rights setting method, device, server and storage medium | |
CN107832468B (en) | Demand recognition methods and device | |
CN109460513A (en) | Method and apparatus for generating clicking rate prediction model | |
CN107105031A (en) | Information-pushing method and device | |
CN108090162A (en) | Information-pushing method and device based on artificial intelligence | |
CN107908789A (en) | Method and apparatus for generating information | |
CN108520324A (en) | Method and apparatus for generating information | |
CN107391680A (en) | Content recommendation method, device and equipment | |
CN107315824A (en) | Method and apparatus for generating thermodynamic chart | |
CN109976997A (en) | Test method and device | |
CN110298716A (en) | Information-pushing method and device | |
CN108776692A (en) | Method and apparatus for handling information | |
CN109214730A (en) | Information-pushing method and device | |
CN109388548A (en) | Method and apparatus for generating information | |
CN109087138A (en) | Data processing method and system, computer system and readable storage medium storing program for executing | |
CN108121699A (en) | For the method and apparatus of output information | |
CN107977678A (en) | Method and apparatus for output information | |
CN109711733A (en) | For generating method, electronic equipment and the computer-readable medium of Clustering Model | |
CN107346344A (en) | The method and apparatus of text matches | |
CN108959289A (en) | Categories of websites acquisition methods and device | |
CN110097302A (en) | The method and apparatus for distributing order | |
CN109784407A (en) | The method and apparatus for determining the type of literary name section | |
CN110209658A (en) | Data cleaning method and device | |
CN110309142A (en) | The method and apparatus of regulation management | |
CN109753424A (en) | The method and apparatus of AB test |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||