CN101320370A - Deep layer web page data source sort management method based on query interface connection drawing - Google Patents


Info

Publication number
CN101320370A
Authority
CN
China
Prior art keywords
query interface
list
label
value
sim
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CNA2008100242518A
Other languages
Chinese (zh)
Other versions
CN101320370B (en)
Inventor
崔志明
赵朋朋
方巍
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shu Lan
Original Assignee
崔志明
赵朋朋
方巍
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 崔志明, 赵朋朋, 方巍
Priority to CN2008100242518A
Publication of CN101320370A
Application granted
Publication of CN101320370B
Legal status: Expired - Fee Related

Abstract

The present invention discloses a Deep Web data source classification management method based on a query interface connection graph. The method comprises the following steps: (1) acquiring a set of Deep Web query interface forms; (2) automatically extracting the feature values of the query interface forms acquired in step (1), the feature values comprising the form label names and attribute values; (3) constructing feature vectors for the forms; (4) obtaining, through similarity comparison between the vectors, association adjacency matrices for the labels, the attribute values, and the combination of labels and attribute values; (5) constructing a connection graph of the query interface form set, which can be represented by an association adjacency matrix; (6) clustering the weighted undirected graph with a clustering method; (7) obtaining the clustering result of the Deep Web data sources. By effectively constructing the query interface connection graph of Deep Web data sources and combining it with graph mining, the invention improves the performance of automatic classification management of large-scale Deep Web data sources.

Description

Deep Web data source classification management method based on a query interface connection graph
Technical field
The present invention relates to a method for the automatic classification management of information, and in particular to a classification management method applied to Deep Web data sources.
Background art
With the widespread use of network databases, the Web is rapidly "deepening". A large number of pages on the Internet are generated dynamically by back-end databases. This information cannot be reached through static links; it can only be obtained by filling in a form and submitting a query. Because traditional web crawlers cannot fill in forms, they cannot reach these pages. Existing search engines therefore cannot retrieve this information, which remains hidden from and invisible to users. Such pages are called the Deep Web (also known as the Invisible Web or Hidden Web). The Deep Web is the counterpart of the Surface Web; the concept was first proposed by Dr. Jill Ellsworth in 1994 and refers to Web pages whose content is difficult for general-purpose search engines to discover. Deep Web information is generally stored in databases and, compared with static pages, is usually larger in volume, more focused in topic, higher in quality, better structured, and faster growing. Studies show that the Deep Web holds about 500 times as much information as the Surface Web and comprises nearly 450,000 Deep Web sites. Large-scale Deep Web data integration is an effective way to make Deep Web information convenient for users.
Deep Web data sources are heterogeneous, dynamic, and cover a wide range of domains. The massive volume of Deep Web information cannot be obtained automatically by crawlers; the query interface is the only entry point to a Web database, and users obtain relevant results by filling in forms and submitting queries. To make effective use of the rich resources freely distributed on the Web, integrate them automatically, and help users easily find suitable databases and retrieve hidden information, automatic classification management of Deep Web data sources is particularly important. Organizing this information manually is clearly very difficult. Existing research on query interface features provides a starting point for the present work, but the methods need to be developed further, and new methods are needed to achieve automatic classification management of Deep Web data sources. Classification and clustering are important methods for managing data sources in data integration. At present, research on query interfaces is mostly confined to mining Web features and text information within classification and clustering methods. The introduction of graph models offers a new approach to Deep Web data source classification management, one that researchers have not yet explored.
Applying graph mining to Deep Web data source classification management has the following advantages. First, the Deep Web is notably heterogeneous and autonomous, and a large number of Deep Web sources span multiple domains. Sources belonging to the same domain are therefore associated with one another to some degree, which fits the construction of a Web graph model well. In addition, each online database is defined differently and the databases are mutually heterogeneous, which complicates information integration; graph mining can uncover the association features among Web sources and classify them accordingly. Second, graph mining can uncover many hidden topic features, so the graph structure can also reveal the multiple domain topics hidden in the Deep Web.
To make effective use of the rich resources freely distributed on the Web and help users easily find suitable databases and retrieve hidden information, a technique is needed that performs automatic classification management of large-scale Deep Web data sources.
Summary of the invention
The object of the present invention is to provide an automatic Deep Web data source classification management method that uses the rich features of Deep Web data source query interfaces together with graph mining to improve the performance of automatic classification management of heterogeneous Deep Web data sources, thereby facilitating large-scale data integration.
To achieve the above object, the main concept of the present invention is as follows:
Cluster analysis is performed mainly on structured query interfaces. A structured query interface contains multiple attributes and is usually a form expressed in HTML within a Web page. The controls in a form fall into three major classes: INPUT controls, SELECT controls, and TEXTAREA controls. The TYPE attribute of an INPUT control describes the type of the input element; the types include Text, CheckBox, Radio, Submit, Reset, Image, and Hidden. A query interface can be formally defined as F = {a_1, a_2, ..., a_n}, where a_i denotes a control attribute on the form. Each control is described by a corresponding label, that is, a piece of descriptive text, and each control can have one or more values. For example, a drop-down list offers the user several values to choose from, while a radio button or check box usually has a single value. Logically, a control and its associated label constitute an attribute, which corresponds to a field in the Deep Web back-end database. An attribute usually comprises one label and one or more form controls. The label can be regarded as the attribute name, and the form controls as the attribute values. A query interface can then be formally described as F = {(L_1, V_1), ..., (L_n, V_n)}, where L_i denotes a label and V_i = {E_j, ..., E_k} denotes the one or more control values corresponding to that label.
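The formal description F = {(L_1, V_1), ..., (L_n, V_n)} can be illustrated with a minimal Python sketch that reads one HTML form into a list of (label, values) pairs. The class name `FormFeatureParser`, the function `extract_form`, and the sample form are hypothetical, and the sketch assumes that every control is described by a `<label for="...">` element whose `for` attribute matches the control's `id`; the patent itself does not prescribe how labels are matched to controls.

```python
from html.parser import HTMLParser


class FormFeatureParser(HTMLParser):
    """Collects labels and attribute values from one HTML form.

    Assumption (not from the patent): each control has an `id` and is
    described by a <label for="..."> element pointing at that id.
    """

    def __init__(self):
        super().__init__()
        self.labels = {}        # control id -> label text
        self.values = {}        # control id -> list of attribute values
        self._label_for = None  # id targeted by the <label> being read
        self._select_id = None  # id of the <select> being read
        self._in_option = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "label":
            self._label_for = attrs.get("for")
        elif tag == "select":
            self._select_id = attrs.get("id")
            if self._select_id is not None:
                self.values.setdefault(self._select_id, [])
        elif tag == "option" and self._select_id is not None:
            self._in_option = True
        elif tag == "input":
            kind = (attrs.get("type") or "text").lower()
            if kind in ("radio", "checkbox") and "id" in attrs:
                value = attrs.get("value") or ""
                self.values.setdefault(attrs["id"], []).append(value)

    def handle_endtag(self, tag):
        if tag == "label":
            self._label_for = None
        elif tag == "select":
            self._select_id = None
        elif tag == "option":
            self._in_option = False

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        if self._label_for is not None:
            self.labels[self._label_for] = text
        elif self._in_option:
            self.values[self._select_id].append(text)


def extract_form(html):
    """Returns the form as [(label, [value, ...]), ...] in document order."""
    parser = FormFeatureParser()
    parser.feed(html)
    parser.close()
    return [(label, parser.values.get(cid, []))
            for cid, label in parser.labels.items()]


# Hypothetical book-search query interface.
SAMPLE = """
<form>
  <label for="title">Title</label><input type="text" id="title">
  <label for="fmt">Format</label>
  <select id="fmt"><option>Hardcover</option><option>Paperback</option></select>
</form>
"""
sample_form = extract_form(SAMPLE)
```

For the sample form, `sample_form` holds one pair per attribute: the free-text control contributes a label with no values, and the drop-down list contributes a label with two values.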
Deep Web data sources are strongly heterogeneous and interconnected, so the whole Deep Web can be abstracted as a heterogeneous, multi-relational graph. Nodes in the graph represent query interface forms, edges represent the association between query interface forms, and an association weight expresses the degree of association between two forms.
Based on the above concept, the technical solution adopted by the present invention is a Deep Web data source classification management method based on a query interface connection graph, comprising the following steps:
(1) Obtain a set of Deep Web query interface forms;
(2) Automatically extract the feature values of the query interface forms obtained in step (1), the feature values comprising the names of the form labels and the attribute values;
(3) Construct the form feature vectors: build the feature spaces LS and VS from the extracted label names and attribute values respectively, and construct for each form in the form set the corresponding feature vectors over the feature sets in LS and VS, thereby obtaining a vector set;
(4) In the vector set obtained in step (3), compute the similarity between each pair of vectors to obtain query interface connection graphs for the labels, the attribute values, and the combination of labels and attribute values, represented by the adjacency matrices LableMatrix, ValueMatrix, and LableValueMatrix respectively. The degree of association between query interfaces is computed as follows:
In the label-based form association computation, the number of identical label feature items is used as the measure and is then normalized:
$$Sim_L(F_1, F_2) = \frac{sw}{len}$$
where sw is the number of identical labels shared by forms F_1 and F_2, and len is the average length of the label feature vectors of F_1 and F_2; dividing the former by the latter normalizes the measure. Sim_L(F_1, F_2) denotes the label-based (Label, L) association weight of forms F_1 and F_2;
The form association based on attribute values, and on the combination of labels and attribute values, is computed with the similarity function between the corresponding vectors:
$$Sim_V(F_1, F_2) = \frac{\sum_{k=1}^{n} W_{1k} \times W_{2k}}{\sqrt{\left(\sum_{k=1}^{n} W_{1k}^{2}\right)\left(\sum_{k=1}^{n} W_{2k}^{2}\right)}}$$
In the formula, W_{1k} and W_{2k} are the components of the vectors formed from the attribute value (Value, V) sets of forms F_1 and F_2 respectively; the vector cosine formula gives the attribute-value-based association weight Sim_V(F_1, F_2) of forms F_1 and F_2. The association weight Sim_LV(F_1, F_2) based on the combination of labels and attribute values (Label & Value, LV) is computed in the same way as Sim_V(F_1, F_2), except that W_{1k} and W_{2k} are the components of the vectors formed jointly from the labels and attribute values of forms F_1 and F_2;
(5) Construct the query interface connection graph:
Merge the three matrices LableMatrix, ValueMatrix, and LableValueMatrix obtained in step (4) by weighting: the weighted sum of the similarity values in the three matrices is taken as the association weight between each pair of interconnected query interface forms. Following the construction of a weighted undirected connection graph, each query interface is a node in the graph, an undirected edge is created between any two query interfaces that have some degree of association, and the association weight serves as the weight of that edge:
$$Sim(F_1, F_2) = \omega_1 \cdot Sim_L(F_1, F_2) + \omega_2 \cdot Sim_V(F_1, F_2) + \omega_3 \cdot Sim_{LV}(F_1, F_2)$$
where ω_1, ω_2, and ω_3 are the weight coefficients assigned to the association components, with ω_1 in the range 0.25 to 0.35, ω_2 in the range 0.15 to 0.25, and ω_3 in the range 0.45 to 0.55; the optimal values of these weights can be determined with a genetic algorithm. Sim(F_1, F_2) denotes the association weight of query interface forms F_1 and F_2. This yields a Deep Web query interface connection graph, which can be represented by the adjacency matrix FormLinkMatrix;
(6) Cluster the weighted undirected connection graph of query interface forms with a clustering method;
(7) Obtain the clustering result of the Deep Web data sources; end.
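Steps (3) and (4) above can be sketched in Python as follows. This is only an illustration under stated assumptions: forms are lists of (label, values) pairs, the vector components W_{ik} are binary (1 if the feature occurs in the form, 0 otherwise), and len in Sim_L is taken as the mean number of labels of the two forms. None of these choices is fixed by the text, and all function names and sample forms are hypothetical.

```python
import math


def labels_of(form):
    """Set of lower-cased label names of a form given as [(label, values), ...]."""
    return {label.lower() for label, _ in form}


def values_of(form):
    """Set of lower-cased attribute values of a form."""
    return {value.lower() for _, values in form for value in values}


def feature_space(forms, features_of):
    """Feature space (LS or VS): sorted distinct features over all forms."""
    return sorted({feat for form in forms for feat in features_of(form)})


def to_vector(form, space, features_of):
    """Binary feature vector of one form over a shared feature space."""
    present = features_of(form)
    return [1.0 if feat in present else 0.0 for feat in space]


def sim_label(f1, f2):
    """Sim_L = sw / len, with len assumed to be the mean label count."""
    l1, l2 = labels_of(f1), labels_of(f2)
    mean_len = (len(l1) + len(l2)) / 2
    return len(l1 & l2) / mean_len if mean_len else 0.0


def cosine(w1, w2):
    """Cosine similarity; returns 0.0 when either vector is all zeros."""
    dot = sum(a * b for a, b in zip(w1, w2))
    norm = math.sqrt(sum(a * a for a in w1) * sum(b * b for b in w2))
    return dot / norm if norm else 0.0


# Two hypothetical book-search forms.
books = [("Title", []), ("Author", []), ("Format", ["Hardcover", "Paperback"])]
ebooks = [("Title", []), ("Format", ["Hardcover", "Ebook"])]

vs = feature_space([books, ebooks], values_of)  # value feature space VS
label_sim = sim_label(books, ebooks)            # 2 shared labels / 2.5
value_sim = cosine(to_vector(books, vs, values_of),
                   to_vector(ebooks, vs, values_of))
```

Sim_LV could be obtained in the same way by passing a function that returns the union of `labels_of(form)` and `values_of(form)` to `feature_space` and `to_vector`.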
In the above technical solution, in step (4), the form association matrix LableMatrix is generated from the label space vectors, the form association matrix ValueMatrix is generated from the attribute value space vectors, and the form association matrix LableValueMatrix is generated from the combined label and attribute value space vectors;
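As a rough sketch of how one of these association matrices might be assembled, the hypothetical helper `association_matrix` below fills a symmetric matrix from any pairwise similarity function. The zero diagonal (no self-loops) and the sample label sets are assumptions made for the example, not details taken from the patent.

```python
def association_matrix(forms, sim):
    """Symmetric n x n matrix of pairwise similarities; the diagonal stays 0."""
    n = len(forms)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            matrix[i][j] = matrix[j][i] = sim(forms[i], forms[j])
    return matrix


def shared_ratio(a, b):
    """Shared features divided by the mean set size (the sw / len measure)."""
    mean_len = (len(a) + len(b)) / 2
    return len(a & b) / mean_len if mean_len else 0.0


# Hypothetical label sets of three query interface forms.
label_sets = [{"title", "author"}, {"title", "artist"}, {"make", "model"}]
LableMatrix = association_matrix(label_sets, shared_ratio)
```

Passing a cosine similarity over attribute value vectors, or over combined label and value vectors, as `sim` would yield ValueMatrix and LableValueMatrix with the same helper.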
In the above technical solution, in step (5), the query interface form connection graph is generated from the form label association matrix LableMatrix, the form attribute value association matrix ValueMatrix, and the combined form label and attribute value association matrix LableValueMatrix.
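The weighted merge of step (5) might look like the following sketch. The default weights (0.3, 0.2, 0.5) are simply the midpoints of the ranges stated for ω_1, ω_2, and ω_3; the function names, the edge `threshold` parameter, and the sample matrices are hypothetical. The genetic-algorithm tuning of the weights is not shown.

```python
def merge_matrices(label_m, value_m, label_value_m, weights=(0.3, 0.2, 0.5)):
    """Element-wise weighted sum of the three association matrices."""
    w1, w2, w3 = weights
    n = len(label_m)
    return [[w1 * label_m[i][j] + w2 * value_m[i][j] + w3 * label_value_m[i][j]
             for j in range(n)]
            for i in range(n)]


def edges(link_matrix, threshold=0.0):
    """Undirected weighted edges (i, j, weight) with weight above threshold."""
    n = len(link_matrix)
    return [(i, j, link_matrix[i][j])
            for i in range(n)
            for j in range(i + 1, n)
            if link_matrix[i][j] > threshold]


# Hypothetical similarity values for two forms.
LableMatrix = [[0.0, 0.8], [0.8, 0.0]]
ValueMatrix = [[0.0, 0.5], [0.5, 0.0]]
LableValueMatrix = [[0.0, 0.6], [0.6, 0.0]]

FormLinkMatrix = merge_matrices(LableMatrix, ValueMatrix, LableValueMatrix)
graph_edges = edges(FormLinkMatrix)
```

With these inputs the single edge weight is 0.3 × 0.8 + 0.2 × 0.5 + 0.5 × 0.6 = 0.64.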
By applying the above technical solution, the present invention has the following advantage over the prior art:
By effectively constructing the query interface connection graph of Deep Web data sources and combining it with graph mining, the invention improves the performance of automatic classification management of large-scale Deep Web data sources.
Description of the drawings
Fig. 1 is a schematic workflow diagram of Deep Web classification management based on the query interface connection graph in an embodiment of the invention;
Fig. 2 is a flowchart of automatic Deep Web classification management based on the query interface connection graph in the embodiment;
Fig. 3 is a flowchart of constructing the query interface form label association matrix in the embodiment;
Fig. 4 is a flowchart of constructing the query interface form attribute value association matrix in the embodiment;
Fig. 5 is a flowchart of constructing the query interface form label and attribute value association matrix in the embodiment;
Fig. 6 is a schematic diagram of the construction of the query interface connection graph in the embodiment.
Detailed description of the embodiments
The invention is further described below with reference to the drawings and an embodiment:
Embodiment 1: Referring to Figs. 1 to 6, a Deep Web data source classification management method based on a query interface connection graph comprises the following steps:
(1) Obtain a set of Deep Web query interface forms;
(2) Automatically extract the feature values of the query interface forms obtained in step (1), the feature values comprising the names of the form labels and the attribute values;
(3) Construct the form feature vectors: build the feature spaces LS and VS from the extracted label names and attribute values respectively, and construct for each form in the form set the corresponding feature vectors over the feature sets in LS and VS, thereby obtaining a vector set;
(4) In the vector set obtained in step (3), compute the similarity between each pair of vectors to obtain query interface connection graphs for the labels, the attribute values, and the combination of labels and attribute values, represented by the adjacency matrices LableMatrix, ValueMatrix, and LableValueMatrix respectively. The degree of association between query interfaces is computed as follows:
In the label-based form association computation, the number of identical label feature items is used as the measure and is then normalized:
$$Sim_L(F_1, F_2) = \frac{sw}{len}$$
where sw is the number of identical labels shared by forms F_1 and F_2, and len is the average length of the label feature vectors of F_1 and F_2; dividing the former by the latter normalizes the measure. Sim_L(F_1, F_2) denotes the label-based (Label, L) association weight of forms F_1 and F_2;
The form association based on attribute values, and on the combination of labels and attribute values, is computed with the similarity function between the corresponding vectors:
$$Sim_V(F_1, F_2) = \frac{\sum_{k=1}^{n} W_{1k} \times W_{2k}}{\sqrt{\left(\sum_{k=1}^{n} W_{1k}^{2}\right)\left(\sum_{k=1}^{n} W_{2k}^{2}\right)}}$$
In the formula, W_{1k} and W_{2k} are the components of the vectors formed from the attribute value (Value, V) sets of forms F_1 and F_2 respectively; the vector cosine formula gives the attribute-value-based association weight Sim_V(F_1, F_2) of forms F_1 and F_2. The association weight Sim_LV(F_1, F_2) based on the combination of labels and attribute values (Label & Value, LV) is computed in the same way as Sim_V(F_1, F_2), except that W_{1k} and W_{2k} are the components of the vectors formed jointly from the labels and attribute values of forms F_1 and F_2;
(5) Construct the query interface connection graph:
Merge the three matrices LableMatrix, ValueMatrix, and LableValueMatrix obtained in step (4) by weighting: the weighted sum of the similarity values in the three matrices is taken as the association weight between each pair of interconnected query interface forms. Following the construction of a weighted undirected connection graph, each query interface is a node in the graph, an undirected edge is created between any two query interfaces that have some degree of association, and the association weight serves as the weight of that edge:
$$Sim(F_1, F_2) = \omega_1 \cdot Sim_L(F_1, F_2) + \omega_2 \cdot Sim_V(F_1, F_2) + \omega_3 \cdot Sim_{LV}(F_1, F_2)$$
where ω_1, ω_2, and ω_3 are the weight coefficients assigned to the association components, with ω_1 in the range 0.25 to 0.35, ω_2 in the range 0.15 to 0.25, and ω_3 in the range 0.45 to 0.55; the optimal values of these weights can be determined with a genetic algorithm. Sim(F_1, F_2) denotes the association weight of query interface forms F_1 and F_2. This yields a Deep Web query interface connection graph, which can be represented by the adjacency matrix FormLinkMatrix;
(6) Cluster the weighted undirected connection graph of query interface forms with a clustering method;
Traditional partitioning is a hard method that assigns each object strictly to one class. In the C-means algorithm, for example, the membership degree is either 1 or 0; such strict partitioning fails to reflect the uncertain membership between objects and classes in the real world. The generalized fuzzy clustering algorithm introduces a weighting exponent into the membership function. Here the fuzzy C-means (FCM) method is applied to cluster the query interface connection graph; it is simple and fast to compute and has an intuitive geometric meaning. The generated form connection graph is input in adjacency list form, the edge weights are normalized, and the FCM clustering computation yields the clusters of query interface forms.
(7) Obtain the clustering result of the Deep Web data sources; end.
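The fuzzy clustering of step (6) can be illustrated with a small pure-Python fuzzy C-means sketch. Several choices here are assumptions rather than details from the patent: each form is represented by its row of the connection matrix with the diagonal set to 1 (self-similarity), the edge weights are taken to be already normalized to [0, 1], the weighting exponent is m = 2, and the initial centers are picked by a farthest-point heuristic so that the example is deterministic. The function names and the sample matrix are hypothetical.

```python
import math


def _dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def _memberships(point, centers, m):
    """Fuzzy membership of one point in every cluster (values sum to 1)."""
    d = [max(_dist(point, c), 1e-12) for c in centers]  # avoid division by zero
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((d[k] / d[j]) ** power for j in range(len(d)))
            for k in range(len(d))]


def fuzzy_c_means(points, c, m=2.0, max_iter=100, tol=1e-6):
    """Returns (memberships, centers) for c fuzzy clusters; requires m > 1."""
    # Assumption: farthest-point seeding gives a deterministic start.
    centers = [list(points[0])]
    while len(centers) < c:
        far = max(points, key=lambda p: min(_dist(p, ctr) for ctr in centers))
        centers.append(list(far))
    u = []
    for _ in range(max_iter):
        u = [_memberships(p, centers, m) for p in points]
        new_centers = []
        for k in range(c):
            w = [row[k] ** m for row in u]  # membership raised to exponent m
            total = sum(w)
            new_centers.append(
                [sum(wi * p[dim] for wi, p in zip(w, points)) / total
                 for dim in range(len(points[0]))])
        shift = max(_dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return u, centers


# Hypothetical connection matrix of five forms: forms 0-2 are strongly
# associated with each other, as are forms 3-4.
link = [
    [1.0, 0.9, 0.8, 0.1, 0.0],
    [0.9, 1.0, 0.7, 0.0, 0.1],
    [0.8, 0.7, 1.0, 0.1, 0.0],
    [0.1, 0.0, 0.1, 1.0, 0.9],
    [0.0, 0.1, 0.0, 0.9, 1.0],
]
memberships, centers = fuzzy_c_means(link, c=2)
# Hard assignment: the cluster with the highest membership for each form.
clusters = [max(range(2), key=row.__getitem__) for row in memberships]
```

Each row of `memberships` gives the degree to which a form belongs to each cluster, which is the soft assignment that distinguishes FCM from hard C-means; taking the largest entry per row recovers a conventional partition.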

Claims (3)

1. A Deep Web data source classification management method based on a query interface connection graph, characterized in that it comprises the following steps:
(1) Obtain a set of Deep Web query interface forms;
(2) Automatically extract the feature values of the query interface forms obtained in step (1), the feature values comprising the names of the form labels and the attribute values;
(3) Construct the form feature vectors: build the feature spaces LS and VS from the extracted label names and attribute values respectively, and construct for each form in the form set the corresponding feature vectors over the feature sets in LS and VS, thereby obtaining a vector set;
(4) In the vector set obtained in step (3), compute the similarity between each pair of vectors to obtain query interface connection graphs for the labels, the attribute values, and the combination of labels and attribute values, represented by the adjacency matrices LableMatrix, ValueMatrix, and LableValueMatrix respectively. The degree of association between query interfaces is computed as follows:
In the label-based form association computation, the number of identical label feature items is used as the measure and is then normalized:
$$Sim_L(F_1, F_2) = \frac{sw}{len}$$
where sw is the number of identical labels shared by forms F_1 and F_2, and len is the average length of the label feature vectors of F_1 and F_2; dividing the former by the latter normalizes the measure. Sim_L(F_1, F_2) denotes the label-based (Label, L) association weight of forms F_1 and F_2;
The form association based on attribute values, and on the combination of labels and attribute values, is computed with the similarity function between the corresponding vectors:
$$Sim_V(F_1, F_2) = \frac{\sum_{k=1}^{n} W_{1k} \times W_{2k}}{\sqrt{\left(\sum_{k=1}^{n} W_{1k}^{2}\right)\left(\sum_{k=1}^{n} W_{2k}^{2}\right)}}$$
In the formula, W_{1k} and W_{2k} are the components of the vectors formed from the attribute value (Value, V) sets of forms F_1 and F_2 respectively; the vector cosine formula gives the attribute-value-based association weight Sim_V(F_1, F_2) of forms F_1 and F_2. The association weight Sim_LV(F_1, F_2) based on the combination of labels and attribute values (Label & Value, LV) is computed in the same way as Sim_V(F_1, F_2), except that W_{1k} and W_{2k} are the components of the vectors formed jointly from the labels and attribute values of forms F_1 and F_2;
(5) Construct the query interface connection graph:
Merge the three matrices LableMatrix, ValueMatrix, and LableValueMatrix obtained in step (4) by weighting: the weighted sum of the similarity values in the three matrices is taken as the association weight between each pair of interconnected query interface forms. Following the construction of a weighted undirected connection graph, each query interface is a node in the graph, an undirected edge is created between any two query interfaces that have some degree of association, and the association weight serves as the weight of that edge:
$$Sim(F_1, F_2) = \omega_1 \cdot Sim_L(F_1, F_2) + \omega_2 \cdot Sim_V(F_1, F_2) + \omega_3 \cdot Sim_{LV}(F_1, F_2)$$
where ω_1, ω_2, and ω_3 are the weight coefficients assigned to the association components, with ω_1 in the range 0.25 to 0.35, ω_2 in the range 0.15 to 0.25, and ω_3 in the range 0.45 to 0.55; the optimal values of these weights can be determined with a genetic algorithm. Sim(F_1, F_2) denotes the association weight of query interface forms F_1 and F_2. This yields a Deep Web query interface connection graph, which can be represented by the adjacency matrix FormLinkMatrix;
(6) Cluster the weighted undirected connection graph of query interface forms with a clustering method;
(7) Obtain the clustering result of the Deep Web data sources; end.
2. The Deep Web data source classification management method based on a query interface connection graph according to claim 1, characterized in that: in step (4), the form association matrix LableMatrix is generated from the label space vectors, the form association matrix ValueMatrix is generated from the attribute value space vectors, and the form association matrix LableValueMatrix is generated from the combined label and attribute value space vectors.
3. The Deep Web data source classification management method based on a query interface connection graph according to claim 1, characterized in that: in step (5), the query interface form connection graph is generated from the form label association matrix LableMatrix, the form attribute value association matrix ValueMatrix, and the combined form label and attribute value association matrix LableValueMatrix.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2008100242518A CN101320370B (en) 2008-05-16 2008-05-16 Deep layer web page data source sort management method based on query interface connection drawing

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN2008100242518A CN101320370B (en) 2008-05-16 2008-05-16 Deep layer web page data source sort management method based on query interface connection drawing

Publications (2)

Publication Number Publication Date
CN101320370A true CN101320370A (en) 2008-12-10
CN101320370B CN101320370B (en) 2011-06-01

Family

ID=40180423

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2008100242518A Expired - Fee Related CN101320370B (en) 2008-05-16 2008-05-16 Deep layer web page data source sort management method based on query interface connection drawing

Country Status (1)

Country Link
CN (1) CN101320370B (en)

Also Published As

Publication number Publication date
CN101320370B (en) 2011-06-01

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
ASS Succession or assignment of patent right

Owner name: SUZHOU PUDA NEW INFORMATION TECHNOLOGY CO., LTD.

Free format text: FORMER OWNER: CUI ZHIMING

Effective date: 20100524

Free format text: FORMER OWNER: ZHAO PENGPENG FANG WEI

C41 Transfer of patent application or patent right or utility model
COR Change of bibliographic data

Free format text: CORRECT: ADDRESS; FROM: 215001 ROOM 403, BUILDING 115, SUAN NEW HOUSING ESTATE, SUZHOU CITY, JIANGSU PROVINCE TO: 215021 NO.E101-18, PHASE 2, INTERNATIONAL SCIENCE PARK, NO.1355, JINJIHU AVENUE, SUZHOU INDUSTRY PARK, SUZHOU CITY, JIANGSU PROVINCE

TA01 Transfer of patent application right

Effective date of registration: 20100524

Address after: 215021, 1355 international science and Technology Park, Jinji Lake Avenue, Suzhou Industrial Park, Suzhou, Jiangsu, two E101-18

Applicant after: Suzhou Production Information Technology Co., Ltd.

Address before: 215001 room 115, building 403, Su an village, Suzhou, Jiangsu

Applicant before: Cui Zhiming

Co-applicant before: Zhao Pengpeng

Co-applicant before: Fang Wei

C14 Grant of patent or utility model
GR01 Patent grant
EE01 Entry into force of recordation of patent licensing contract

Application publication date: 20081210

Assignee: SUZHOU SOUKE INFORMATION TECHNOLOGY CO., LTD.

Assignor: Suzhou Production Information Technology Co., Ltd.

Contract record no.: 2013320010067

Denomination of invention: Deep layer web page data source sort management method based on query interface connection drawing

Granted publication date: 20110601

License type: Exclusive License

Record date: 20130412

LICC Enforcement, change and cancellation of record of contracts on the licence for exploitation of a patent or utility model
C41 Transfer of patent application or patent right or utility model
TR01 Transfer of patent right

Effective date of registration: 20161010

Address after: Canglang District of Suzhou City, Jiangsu province 215021 liberation Village 5 403 room

Patentee after: Shu Lan

Address before: 215021, 1355 international science and Technology Park, Jinji Lake Avenue, Suzhou Industrial Park, Suzhou, Jiangsu, two E101-18

Patentee before: Suzhou Production Information Technology Co., Ltd.

CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20110601

Termination date: 20180516