CN104166651B - Method and apparatus based on the data search integrated to homogeneous data object - Google Patents
- Publication number: CN104166651B
- Application number: CN201310182427.3A
- Authority: CN (China)
- Prior art keywords: data, label, data label, objects, data object
- Legal status: Active (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/951—Indexing; Web crawling techniques
Landscapes
- Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The application relates to a method and apparatus for data search based on the integration of homogeneous data objects, including: receiving a search request from a user, and searching all data objects to be searched for one or more data objects that match the search request; analyzing each of the one or more data objects found, to obtain the data label of each data object; matching the obtained data labels; and integrating the one or more data objects whose data labels match into a homogeneous data object combination, which is returned to the user as the search result. The application uses the data labels of data objects to classify and integrate a massive number of data objects in advance into homogeneous data objects, and displays only one of the multiple homogeneous data objects returned by the search engine, thereby improving the accuracy and return rate of data search and increasing the diversity of search results.
Description
Technical field
The application relates to the field of data search, and in particular to a method and apparatus for data search based on the integration of homogeneous data objects.
Background technology
With the arrival of the cloud era, big data has attracted increasing attention. Big data technology is not about possessing massive amounts of data or data objects; it is about collecting, processing, and organizing, within a reasonable time, the data that users need. A large amount of data exists on the network, and making full use of it can bring great convenience to users' lives. Users can search for the data they want with a search engine. Taking data search as an example, a search engine crawls web pages on the Internet in advance and can provide a retrieval service only after preprocessing the crawled pages. The most important preprocessing step is extracting the keywords in each page; other steps include removing duplicate pages, word segmentation, determining the page type, analyzing hyperlinks, and computing the importance or richness of each page.
When performing a data search, the search engine retrieves the entries most relevant to the keyword entered by the user. However, the number of search results matching the keyword is enormous and spans every field of social life, which lowers the quality of the search results: for example, they are inconvenient for the user and their accuracy is poor.
By means of information integration, the massive number of data objects crawled by a search engine can be subjected to content selection, analysis, classification, and similar processing, which narrows the scope of a data search and makes the search results more targeted. However, ambiguity between data (for example, the same keyword corresponding to different fields) lowers the accuracy of search results, and alternative written forms of a keyword (for example, "以太网" and "乙太网", both meaning "Ethernet") make the returned search results incomplete.
For example, when a data search is performed on the keyword "以太网" ("Ethernet"), search results related to "以太网" appear on the search results page. "以太网" and "乙太网" are keywords with the same meaning but different written forms; because no association exists between the two keywords, search results related to "乙太网" do not appear on the search results page. Some search results therefore fail to be retrieved, which lowers the quality of the search results, for example their return rate.
Moreover, because the search engine performs content selection, analysis, classification, and similar processing on the massive number of data objects, multiple identical or similar data objects may be displayed on the search results page when results are returned, which wastes search result slots. For example, if each search results page can display only 20 search results and 10 of those 20 are identical or similar data objects, the user has to click "next page" repeatedly to see different data objects.
Summary of the invention
The main purpose of the application is to provide a method and apparatus for data search based on the integration of homogeneous data objects, so as to solve the problem of low-quality search results that arises when a prior-art search engine performs a data search, because the volume of data is too large and no association exists between data objects.
To solve the above technical problem, the purpose of the application is achieved through the following technical solutions:
The application provides a method for data search based on the integration of homogeneous data objects, comprising the following steps: receiving a search request from a user, and searching all data objects to be searched for one or more data objects that match the search request; analyzing each of the one or more data objects found, to obtain the data label of each data object; matching the obtained data labels; and integrating the one or more data objects whose data labels match into a homogeneous data object combination, which is returned to the user as the search result.
Preferably, in the method of the application, the data label includes a first data label and a second data label, and the first data label and the second data label identify different attribute features of a data object.
Preferably, the method of the application may further include: performing advance integration processing on all data objects to be searched, to determine the one or more homogeneous data objects corresponding to each data object to be searched and thereby obtain a data object mapping table.
Preferably, in the method of the application, performing advance integration processing on all data objects to be searched includes: mining the second data labels in each data object and the second data label category distribution table; performing second data label mining on the second data labels in each data object, to generate the set of second data label synonyms of all data objects; performing first data label mining on the first data labels in each data object, to generate the first data label synonym set of all data objects; and mining the first data labels and second data labels in each data object, to generate the mapping relationship from first data labels to second data labels.
Preferably, in the method of the application, the second data label synonyms include the different second data labels carried by multiple data objects that are under the same category and have the same first data label; the first data label synonyms include multiple similar first data labels within the same data object.
Preferably, in the method of the application, mining the first data labels and second data labels in each data object to generate the mapping relationship from first data labels to second data labels includes: if a data object has only one first data label, and that first data label co-occurs with only one unique second data label, establishing a mapping relationship between the first data label and the second data label.
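The mapping rule above can be sketched as follows. This is a minimal illustration, not the patented implementation: the field names `first_labels` and `second_label` are hypothetical, and treating "co-occurs with only one unique second data label" as a corpus-wide condition is an assumption about how the rule is applied.

```python
from collections import defaultdict

def build_first_to_second_mapping(objects):
    """Map a first data label to a second data label when the pairing is unambiguous.

    `objects` is an iterable of dicts with the hypothetical keys
    'first_labels' (list of str) and 'second_label' (str or None).
    """
    cooccurring = defaultdict(set)
    for obj in objects:
        firsts = obj["first_labels"]
        second = obj.get("second_label")
        # Only objects with exactly one first label and a known second label count.
        if len(firsts) == 1 and second is not None:
            cooccurring[firsts[0]].add(second)
    # Keep only first labels that were seen with exactly one distinct second label.
    return {first: next(iter(seconds))
            for first, seconds in cooccurring.items()
            if len(seconds) == 1}
```

A first label that appears with two different second labels is dropped, because the mapping would then be ambiguous.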
Preferably, in the method of the application, performing advance integration processing on all data objects to be searched includes: extracting one or more second data labels from the same data object to obtain one or more candidate second data labels, and disambiguating the extracted candidate second data labels; extracting, based on configured rules, the first data labels in multiple data objects, and normalizing the extracted first data labels; normalizing second data labels, or first data labels, that are synonyms of each other; and completing the second data label of any data object that lacks one, according to the constructed mapping relationship between first data labels and second data labels.
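The normalization and completion steps above might look like the sketch below. It assumes the synonym tables have already been reduced to dictionaries that map each variant to one canonical form; those dictionaries, the field names, and the function names are all hypothetical.

```python
def normalize(label, canonical):
    """Return the canonical form of a label, or the label itself if no synonym is known."""
    return canonical.get(label, label)

def normalize_and_complete(obj, first_canonical, second_canonical, first_to_second):
    """Normalize an object's labels, then fill in a missing second label.

    `first_canonical` and `second_canonical` map synonym variants to a canonical
    label; `first_to_second` maps a first label to its second label.
    """
    firsts = [normalize(f, first_canonical) for f in obj.get("first_labels", [])]
    second = obj.get("second_label")
    if second is not None:
        second = normalize(second, second_canonical)
    else:
        # Completion: borrow the second label mapped from any known first label.
        for first in firsts:
            if first in first_to_second:
                second = first_to_second[first]
                break
    return {**obj, "first_labels": firsts, "second_label": second}
```

An object whose first labels have no mapping keeps `second_label` as `None`, so no label is guessed.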
Preferably, in the method of the application, disambiguating the extracted candidate second data labels includes: obtaining, based on the category distribution table of second data labels, the number of times a candidate second data label occurs under the category, and if that number exceeds a preset threshold, treating the candidate as a second data label of the data object; and/or, if a data object has multiple candidate second data labels, selecting the candidate that occurs most often in the second data label category distribution table as the second data label of the data object.
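A minimal sketch of the disambiguation step, assuming the category distribution table is a nested dictionary of the form `{category: {label: count}}`. The text allows the two rules to be used together or separately ("and/or"); applying the threshold first and then taking the most frequent survivor is only one possible combination, and the default threshold value is invented for illustration.

```python
def pick_second_label(candidates, category, distribution, threshold=5):
    """Choose one second data label for an object from its candidates.

    `distribution` is a hypothetical {category: {label: count}} table.
    Returns None when no candidate occurs more than `threshold` times.
    """
    counts = distribution.get(category, {})
    # Rule 1: keep candidates that occur more often than the preset threshold.
    kept = [c for c in candidates if counts.get(c, 0) > threshold]
    if not kept:
        return None
    # Rule 2: among the survivors, take the most frequent one.
    return max(kept, key=lambda c: counts[c])
```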
Preferably, the method of the application may include: displaying, on the search results page, one of the multiple data objects in the homogeneous data combination, where the homogeneous data combination includes multiple data objects that are homogeneous data objects of each other.
Preferably, in the method of the application, the homogeneous data objects may include: multiple data objects that are under the same category, have identical or synonymous second data labels, and have identical or synonymous first data labels.
The application also provides an apparatus for data search based on the integration of homogeneous data objects, including: a receiving and search module, configured to receive a search request from a user and to search all data objects to be searched for one or more data objects that match the search request; an acquisition module, configured to analyze each of the one or more data objects found, to obtain the data label of each data object; a matching module, configured to match the obtained data labels; and an integration and return module, configured to integrate the one or more data objects whose data labels match into a homogeneous data object combination and return it to the user as the search result.
Preferably, in the apparatus of the application, the data label includes a first data label and a second data label, and the first data label and the second data label identify different attribute features of a data object.
Preferably, the apparatus of the application may further include: a preprocessing module, configured to perform advance integration processing on all data objects to be searched, to determine the one or more homogeneous data objects corresponding to each data object to be searched and thereby obtain a data object mapping table.
Preferably, in the apparatus of the application, the preprocessing module is further configured to: mine the second data labels in each data object and the second data label category distribution table; perform second data label mining on the second data labels in each data object, to generate the set of second data label synonyms of all data objects; perform first data label mining on the first data labels in each data object, to generate the first data label synonym set of all data objects; mine the first data labels and second data labels in each data object, to generate the mapping relationship from first data labels to second data labels; and, if a data object has only one first data label and that first data label co-occurs with only one unique second data label, establish a mapping relationship between the first data label and the second data label.
Preferably, in the apparatus of the application, the second data label synonyms include the different second data labels carried by multiple data objects that are under the same category and have the same first data label; the first data label synonyms include multiple similar first data labels within the same data object.
Preferably, in the apparatus of the application, the preprocessing module is further configured to: extract one or more second data labels from the same data object to obtain one or more candidate second data labels, and disambiguate the extracted candidate second data labels; extract, based on configured rules, the first data labels in multiple data objects, and normalize the extracted first data labels; normalize second data labels, or first data labels, that are synonyms of each other; complete the second data label of any data object that lacks one, according to the constructed mapping relationship between first data labels and second data labels; obtain, based on the category distribution table of second data labels, the number of times a candidate second data label occurs under the category, and if that number exceeds a preset threshold, treat the candidate as a second data label of the data object; and/or, if a data object has multiple candidate second data labels, select the candidate that occurs most often in the second data label category distribution table as the second data label of the data object.
Preferably, in the apparatus of the application, the integration and return module is further configured to: display, on the search results page, one of the multiple data objects in the homogeneous data combination, where the homogeneous data combination includes multiple data objects that are homogeneous data objects of each other; the homogeneous data objects include multiple data objects that are under the same category, have identical or synonymous second data labels, and have identical or synonymous first data labels.
Compared with the prior art, the technical solution of the application has the following beneficial effects:
The application uses important labels or attributes of data objects, such as the first data label and the second data label, to classify and integrate a massive number of data objects in advance and to establish associations between homogeneous data objects, which improves the accuracy and return rate of data search and thereby the quality of search results.
The application integrates the multiple homogeneous data objects returned by the search engine and displays only one of them on the search results page, so that the page shows a wider variety of data objects, which increases the diversity of search results and improves the user experience.
Brief description of the drawings
The accompanying drawings described here provide a further understanding of the application and form part of it. The illustrative embodiments of the application and their descriptions are used to explain the application and do not unduly limit it. In the drawings:
Fig. 1 is a flowchart of the method for data search based on the integration of homogeneous data objects according to an embodiment of the application;
Fig. 2 is a flowchart of the steps of advance integration processing of homogeneous data objects according to an embodiment of the application;
Fig. 3 is a flowchart of the offline data mining processing performed on all data objects to be searched according to an embodiment of the application;
Fig. 4 is a flowchart of the corresponding normalization and mapping processing performed on all data objects to be searched according to an embodiment of the application; and
Fig. 5 is a structural diagram of the apparatus for data search based on the integration of homogeneous data objects according to an embodiment of the application.
Detailed description of the embodiments
The main idea of the application is to use the data labels (attributes) contained in the data objects being searched to distinguish homogeneous data objects from non-homogeneous ones, and to use those data labels (for example, the first data label and the second data label) to integrate the massive number of data objects in a database in advance. For example, second data labels with the same meaning but different written forms, such as "以太网" and "乙太网" (both meaning "Ethernet"), are mapped to each other; different first data labels contained in the same data object are mapped to each other; and different data objects having the same first data label are mapped to each other. Based on these mapping relationships between data objects and data labels, multiple data objects that are homogeneous data objects of each other are obtained. Then, during a data search, on the basis of this advance integration, a search request from the user, such as a keyword search request, retrieves not only the data objects in the database that match the keyword but also the homogeneous data objects of those matches, which improves the accuracy and return rate of the data search. The multiple homogeneous data objects can also be integrated so that only one of them is displayed on the search results page, which lets the page show a wider variety of data objects and increases the diversity of the search results.
To make the purpose, technical solution, and advantages of the application clearer, the application is described in further detail below with reference to the drawings and specific embodiments.
According to an embodiment of the application, a method for data search based on the integration of homogeneous data objects is provided.
Fig. 1 shows a flowchart of the method for data search based on the integration of homogeneous data according to an embodiment of the application.
At step S102, a search request from a user is received and a search is performed. The search request is used to find, among all data objects to be searched, one or more data objects that match the search request.
The search request may contain a keyword, a network link, or the like. A search is performed according to the search request to find the data objects that match the keyword, or the one or more data objects that the network link points to. By sending the search request, the user can obtain, from among many data objects, the data objects that match the keyword or the content represented by the network link; there may be one or more such matching data objects.
The one or more data objects may be stored in a database in the form of data files. Each of the one or more data objects includes various data labels, such as a first data label and a second data label. The terms "first data label" and "second data label" denote two entirely different features or attributes; the naming only distinguishes the two features and is not a definition. The data files to be searched that are stored in the database must be organized and integrated by a corresponding data structure to ensure that searches are complete, high-quality, and efficient; this is described below in the processing that integrates homogeneous data objects.
At step S104, each of the one or more data objects found is analyzed to obtain the data label of each data object. The data label includes a first data label and a second data label, which identify different attribute features of the data object.
In other words, each of the one or more data objects found can be analyzed to obtain its first data label and/or second data label.
Each data object contains multiple data labels, such as the data object's title, storage location, and index number. Each data object may include a first data label and a second data label, which characterize different attributes of the data object. Together, the first data label and the second data label can identify a data object, so the embodiments of the application use them as the basis for integrating homogeneous data. For example, in an employee information table, with the employee ID (12345) as the first data label and the name (Zhang San) as the second data label, the employee ID (12345) and the name (Zhang San) identify the employee Zhang San in the table.
Among massive data objects, the first data label and the second data label can be used to determine the characteristics of a data object and thereby its homogeneous data objects. As another example, a product can be treated as a data object, with the product's article number as the first data label and its brand word as the second data label. A product found by the search is analyzed to obtain its article number and brand word, which are then matched against the massive set of products to obtain the product's homogeneous products. For example, the article number "1111" and the brand word "Nike" identify the product "shell-toe sneakers"; matching the article number "1111" and the brand word "Nike" against the massive set of products yields multiple products of the same style as "shell-toe sneakers". For the step of matching the obtained first data label and/or second data label against massive data, see step S106.
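The product example can be illustrated with a small sketch. The `Product` record and `find_same_style` function are hypothetical names introduced here, the sample catalog is invented, and exact string equality stands in for whatever matching the real system uses.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Product:
    title: str
    article_number: str  # plays the role of the first data label
    brand: str           # plays the role of the second data label

def find_same_style(query, catalog):
    """Return the catalog products that share both labels with `query`."""
    return [p for p in catalog
            if p is not query
            and p.article_number == query.article_number
            and p.brand == query.brand]

hit = Product("shell-toe sneakers", "1111", "Nike")
catalog = [
    hit,
    Product("shell-toe sneakers, white", "1111", "Nike"),
    Product("running shoes", "2222", "Nike"),
    Product("sneakers", "1111", "OtherBrand"),
]
same_style = find_same_style(hit, catalog)  # only the white variant matches
```

Both labels must agree: sharing only the article number or only the brand is not enough to count as the same style.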
At step S106, the obtained data labels, e.g., the first data label and/or the second data label, are matched to obtain the one or more homogeneous data objects that match each data object.
In this way, more data objects matching the search request can be obtained, which makes the data search more comprehensive, improves the quality of the search results, and provides the user with a convenient data search service.
At step S108, each data object and its corresponding one or more homogeneous data objects are integrated (aggregated) into one homogeneous data combination, which is returned to the user.
In other words, the one or more data objects whose data labels (first data label and/or second data label) match can be integrated into a homogeneous data object combination and returned to the user as the search result.
The homogeneous data combination includes multiple data objects that are homogeneous data objects of each other, and through the combination the user can view each of the homogeneous data objects it contains.
In one embodiment, only one of the multiple data objects in a homogeneous data combination is displayed on the search results page, and the other homogeneous data objects are hidden. When the hidden homogeneous data objects need to be displayed, an operation that reveals them can be triggered, for example by pressing a button.
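One way to picture this display behavior is the sketch below, which collapses ranked results into groups and shows only the first member of each. The function name, the `key` callback, and the "shown"/"hidden" dictionary layout are assumptions made for illustration; the text does not say which member of a combination is displayed, so taking the highest-ranked one is also an assumption.

```python
def collapse_for_display(results, key):
    """Group ranked results by `key` and show only the first member of each group.

    `results` is assumed to be in ranking order; `key(obj)` returns a hashable
    value that is equal for homogeneous data objects.
    """
    groups = {}
    for obj in results:
        # dicts preserve insertion order, so groups stay in ranking order.
        groups.setdefault(key(obj), []).append(obj)
    return [{"shown": members[0], "hidden": members[1:]}
            for members in groups.values()]
```

The "hidden" list is what a button on the results page would reveal on demand.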
Further, the search result returns to the user the data objects that match the search request together with the one or more homogeneous data objects corresponding to those matches. Returning the corresponding homogeneous data objects along with the matching data objects improves the accuracy and return rate of the data search. Moreover, using the concept of a homogeneous data combination to aggregate data objects of the same class allows a wider variety of data objects to be shown on the search results page, which increases the diversity of the search results.
The ability to match more homogeneous data objects according to data labels, such as the first data label and/or the second data label, relies on a data structure built by integrating homogeneous data objects. The integration process for homogeneous data objects is described in detail below.
Fig. 2 shows a flowchart of the advance integration processing of homogeneous data objects according to an embodiment of the application.
At step S202, advance integration processing is performed on all data objects to be searched, to determine the one or more homogeneous data objects corresponding to each data object to be searched and obtain a data object mapping table.
The purpose of the advance integration processing step is to obtain the one or more homogeneous data objects corresponding to each data object. Homogeneous data objects are multiple data objects that are under the same category, have identical or synonymous second data labels, and have identical or synonymous first data labels.
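Under this definition, a data object mapping table could be built roughly as follows. This is a sketch under stated assumptions: each object is a dictionary with the hypothetical keys `id`, `category`, `first_label`, and `second_label`, and synonymy is represented by dictionaries that map each label variant to a canonical form.

```python
from collections import defaultdict

def build_object_mapping(objects, first_canonical, second_canonical):
    """Map each object id to the ids of its homogeneous data objects.

    Two objects are grouped when they share a category and their first and
    second labels are identical or map to the same canonical label.
    """
    groups = defaultdict(list)
    for obj in objects:
        key = (obj["category"],
               first_canonical.get(obj["first_label"], obj["first_label"]),
               second_canonical.get(obj["second_label"], obj["second_label"]))
        groups[key].append(obj["id"])
    # Every object maps to the other members of its own group.
    return {oid: [other for other in ids if other != oid]
            for ids in groups.values() for oid in ids}
```

At query time, looking up a matched object's id in this table yields its homogeneous data objects without any further label comparison.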
The advance integration processing step performs offline data mining on the massive set of data objects and, based on the mining results, performs online extraction of second data labels and first data labels together with the corresponding normalization and mapping processing. This finally yields the mapping relationships among data objects, first data labels, and second data labels, so that data objects of the same class under the same category are integrated together; that is, the association relationships of the integrated homogeneous data objects are obtained.
First, offline data mining is performed on the massive set of data objects (for example, on the order of hundreds of millions), i.e., on all data objects to be searched. In a preferred manner, as shown in Fig. 3, this mines the second data label table, the second data label synonym table, the second data label category distribution table, the first data label synonym table, and the mapping table from first data labels to second data labels.
Step S302: mine the second data labels in each data object and the second data label category distribution table.
The second data labels in each data object can be extracted to generate the set of second data labels of all data objects, e.g., forming the second data label table. Based on this set, all data objects in the database are divided into categories to obtain the category distribution of the second data labels of all data objects, e.g., forming the second data label category distribution table. Examples of categories in a paper database are geography and life; examples of categories in a product database are clothing and clocks and watches.
Preferably, for the data files of the data objects under the same category, all the distinct second data labels they contain are counted, all second data labels of all data objects are combined into the set of second data labels, and the number of times or frequency with which each second data label occurs is counted. From all second data labels of all data objects, the second data label table of all data objects can be formed; from the number of times or frequency with which the second data labels of each data object occur under the different categories, the second data label category distribution table of all second data labels can be formed. The second data label category distribution table contains multiple categories, the second data labels under each category, and the count (or frequency) of each second data label.
For example, three different second data labels, "school bus", "subway", and "car", are extracted from the files of the data object "city" under the geography category. Since all three occur in files of the object "city" under the same geography category, the three second data labels "school bus", "subway", and "car" are all put into the second data label set. In this way, all second data labels of the data objects under the same category and under different categories are extracted to form the second data label set, which can be stored in table form. In addition, the number of times or frequency with which these second data labels occur, e.g., the labels "school bus", "subway", and "car" in the object "city" under the geography category, must be counted to form the category distribution of the second data labels. Suppose "school bus" occurs 15 times, "subway" 10 times, and "car" 20 times, sorted in descending order, while in the files of the data object "large household items" under another category such as "life", "school bus" occurs 0 times, "subway" 0 times, and "car" 20 times. Then the two second data labels "school bus" and "subway" belong to the "geography" category, while "car" corresponds to both the "geography" and "life" categories. Each category, the second data labels under it, and the counted number of occurrences or frequency of each second data label are stored in the second data label category distribution table.
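The category distribution table described above amounts to a per-category counter of second data labels. The following sketch uses Python's `collections.Counter`; the dictionary keys `category` and `second_labels` and both function names are hypothetical.

```python
from collections import Counter, defaultdict

def build_category_distribution(objects):
    """Count how often each second data label occurs under each category."""
    table = defaultdict(Counter)
    for obj in objects:
        table[obj["category"]].update(obj["second_labels"])
    return table

def categories_of(label, table):
    """List the categories in which a second data label occurs at least once."""
    return sorted(cat for cat, counts in table.items() if counts[label] > 0)
```

With counts like those in the example, `categories_of("car", table)` would list both "geography" and "life", while "school bus" and "subway" would map to "geography" only. `table[category].most_common()` gives the labels of a category in descending order of count.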
The number of times or frequency with which a second data label occurs can be counted from the second-data-label-related information appearing in the data files of the data objects, for example the second data labels contained in a data object's attribute information or in the word segmentation result of its title.
Step S304: second data label mining can be performed on the second data labels in each data object to generate the set of second data label synonyms of all data objects, e.g., forming the second data label synonym table.
Under the same category, synonym mining is performed on the second data labels in each data object. For example, the second data labels and first data labels of all data objects under the same category are extracted. The different second data labels contained in two data objects that have the same first data label are regarded as one co-occurrence of a second data label synonym pair. For example, if data object M1 has label a (second data label) and code B (first data label), and data object M2 has label A (second data label) and code B (first data label), then label A and label a can be considered synonyms, and data objects M1 and M2 constitute one co-occurrence of the synonym pair. The co-occurrence counts of all second data label synonym pairs can then be counted based on the second data label category distribution table. The synonym pairs are sorted by co-occurrence count (or frequency) from high to low, and pairs with high counts are given priority when generating and saving the second data label synonym table.
This mining identifies, within the same category, multiple data objects that have different second
data labels but the same first data label, and associates them; their second data labels form the set
of second data label synonyms. That is, if multiple data objects share the same first data label but
have different second data labels, those different second data labels can be regarded as second data
label synonyms and can be saved in the form of a second data label synonym table.
Step S306: first data label mining can be performed on the first data labels of each data object to
generate the first data label synonym thesaurus for all data objects, for example a first data label
synonym table.
For example, the multiple first data labels that appear in a single data object (such as in its data
file) can be extracted, for instance the multiple first data labels contained in the title information
of the file. If two of these first data labels have the same length and the same prefix, they are
regarded as a first data label synonym pair. Finally, all first data label synonym pairs are
aggregated into a first data label synonym cluster (thesaurus), which can be saved in the form of a
first data label synonym table. In this way, multiple similar first data labels within the same data
object form the set of first data label synonyms.
More specifically, one data object can contain multiple first data labels. For example, a data object
A contains the first data labels "1110" and "1111". These two labels have the same length and the same
prefix, so "1110" and "1111" can be treated as first data label synonyms. A first data label synonym
table can be formed in this way.
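The "same length and same prefix" test can be sketched in Python as below. The application does not
state how long the shared prefix must be, so the `prefix_len` parameter (and its default of 3) is an
assumption made purely for illustration.

```python
from itertools import combinations


def first_label_synonym_pairs(labels, prefix_len=3):
    """Pair up first labels of one data object that look like variants.

    Two labels form a pair when they have the same length and share the
    first `prefix_len` characters (an assumed, configurable threshold).
    """
    pairs = set()
    for a, b in combinations(sorted(set(labels)), 2):
        if len(a) == len(b) and a[:prefix_len] == b[:prefix_len]:
            pairs.add((a, b))
    return pairs


# First data labels found in the title of a single data object.
pairs = first_label_synonym_pairs(["1110", "1111", "2220", "11100"])
```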
Step S308: the first data labels and second data labels of each data object can be mined to generate
the mapping relations from first data labels to second data labels.
For example, the first data labels and second data labels in the data files of all data objects are
extracted and, according to the second data label category distribution table, the number of times
(or frequency) with which the same first data label co-occurs with different second data labels in
data objects is counted. If a data object has only one first data label, and that first data label
has co-occurred with only one unique second data label, a mapping relation between the first data
label and the second data label is established, for example as a first-to-second data label mapping
table, and saved. This makes it possible to fill in second data label information that may be missing
from some data objects. For example, a data object may have only a first data label and no second data
label: it has only code "11" (first data label) and no label (second data label), but code "11" has
co-occurred with, and only with, label "A" (second data label), so code "11" is mapped to label "A".
Such a mapping can fill in the second data label as a feature of the data object, so that the recall
(also referred to as the return rate) of the aggregated homogeneous data objects can be improved when
responding to a search request.
For example, data object A contains only a first data label "1110" and lacks a second data label.
According to the first and second data labels extracted from all data objects in the same category,
first data label "1110" co-occurs only with second data label "BB" of data object B. In other words,
among all data objects, only data object B contains both first data label "1110" and second data label
"BB". In this case, a mapping relation from first data label "1110" to second data label "BB" can be
established.
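The mapping rule of step S308 can be sketched as follows, assuming each data object is given as a
hypothetical pair of (list of first labels, second label or `None`). A mapping is kept only when the
first label has co-occurred with exactly one second label.

```python
from collections import defaultdict


def build_first_to_second_mapping(objects):
    """Map a first label to a second label when the association is unambiguous.

    `objects` is an iterable of (first_labels, second_label_or_None).
    """
    cooccurring = defaultdict(set)
    for first_labels, second in objects:
        # Only objects with exactly one first label contribute evidence.
        if len(first_labels) == 1 and second is not None:
            cooccurring[first_labels[0]].add(second)
    # Keep first labels that were seen with one unique second label.
    return {
        first: next(iter(seconds))
        for first, seconds in cooccurring.items()
        if len(seconds) == 1
    }


objects = [
    (["1110"], "BB"),   # data object B
    (["1110"], None),   # data object A, second label missing
    (["2220"], "X"),
    (["2220"], "Y"),    # "2220" is ambiguous, so it is not mapped
]
mapping = build_first_to_second_mapping(objects)
```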
Further, based on the second data label table, the second data label synonym table, the first data
label synonym table, the first-to-second data label mapping table, and the second data label category
distribution table aggregated by the data mining described above, all online data objects can be
normalized and mapped, thereby capturing the mapping relations among data objects, first data labels,
and second data labels.
From the above data mining results on the mass of data objects, an initial association among
homogeneous data objects can be formed, that is, multiple data objects that are in the same category,
have the same or synonymous second data labels, and have the same or synonymous first data labels.
Next, for a given data object (or for each data object), the second data label and first data label
are extracted from the title information and attribute information of the data object (in its data
file), based on the sets (tables) mined offline. According to the data mining results, the
corresponding normalization (unification) and mapping are applied to all data objects to be searched,
and data objects belonging to the same class are finally integrated, as shown in Figure 4. This
further optimizes the integration of homogeneous data objects, the category distribution table of data
objects, and the mapping relations among data objects, second data labels, and first data labels.
Step S402: one or more second data labels are extracted from the same data object to obtain one or
more candidate second data labels. The title information of a data object is segmented into words
(the attribute information can be handled in the same way), and the resulting word fragments (or sets
of fragments) are matched against the second data label table. If a fragment exactly matches a second
data label in the table, that second data label is taken as a candidate. For example, if a data object
contains second data labels "A", "B", and "C", and the second data label table contains only "A" and
"B", then "A" and "B" are taken as the candidate second data labels of the data object.
Based on the second data label category distribution table, the number of times or frequency with
which each candidate second data label occurs is determined and the candidates are sorted from high to
low; the candidate with the highest count or frequency can be taken as the second data label of the
data object.
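Candidate extraction in step S402 might look like the sketch below. It assumes whitespace
tokenization and lower-cased exact matching of word fragments against the label table; real word
segmentation (especially for Chinese titles) would need a proper segmenter, and the `max_words` limit
is a hypothetical parameter.

```python
def candidate_second_labels(title, label_table, max_words=3):
    """Return label-table entries that exactly match a fragment of the title.

    Fragments are runs of 1..max_words consecutive tokens; longer fragments
    are tried first so multi-word labels are listed before single words.
    """
    tokens = title.lower().split()
    candidates = []
    for n in range(max_words, 0, -1):
        for i in range(len(tokens) - n + 1):
            fragment = " ".join(tokens[i:i + n])
            if fragment in label_table and fragment not in candidates:
                candidates.append(fragment)
    return candidates


label_table = {"new balance", "nb", "nike"}
candidates = candidate_second_labels("New Balance running shoes nb", label_table)
```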
Step S404: the one or more extracted candidate second data labels are disambiguated.
Disambiguation of the candidate second data labels includes selecting, according to the number of
times or frequency with which each candidate occurs in the category distribution table, the candidate
second data labels that satisfy a predetermined condition as the second data label of the data object.
In one specific embodiment, the category to which the data object belongs is determined and, based on
the second data label category distribution table, the number of times (or frequency) with which a
candidate second data label occurs in that category is obtained. If the count is greater than a preset
threshold (for example, 1), the candidate is regarded as a second data label of the data object. In
another embodiment, if a data object has multiple candidate second data labels, the one with the
highest count (or frequency) in the second data label category distribution table is selected as the
second data label of the data object. For example, suppose the candidate second data labels of a data
object are "A" and "B". If, according to the second data label category distribution table, "A" occurs
1000 times and "B" occurs once in the category to which the data object belongs, then "A" can be
determined to be the second data label of the data object.
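The two disambiguation embodiments can be combined in a short sketch. Chaining them (threshold
filter first, then highest count) is a design choice made here for illustration; the application
presents them as alternatives. `category_counts` stands for one category's row of the distribution
table.

```python
def disambiguate(candidates, category_counts, threshold=1):
    """Pick one second label from the candidates, or None if none qualifies.

    A candidate qualifies when its count in the object's category exceeds
    `threshold`; among qualifying candidates the most frequent one wins.
    """
    kept = [c for c in candidates if category_counts.get(c, 0) > threshold]
    if not kept:
        return None
    return max(kept, key=lambda c: category_counts[c])


category_counts = {"A": 1000, "B": 1}
chosen = disambiguate(["A", "B"], category_counts)
```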
Step S406: second data label synonym normalization. After disambiguation, the second data label
extracted from the data object can be rewritten based on the second data label synonym table mined
offline, thereby normalizing it. For example, if second data label "A" occurs 500 times in the second
data label category distribution table and its synonym "B" occurs 20 times, second data label "B" can
be changed to "A".
For example, suppose a data object is a commodity, and the second data label of the commodity can
include its brand word. The brand word of the same commodity may be written in different ways,
including synonyms and misspelled forms. For example, the brand word of a commodity is "new bolune";
this brand word has the synonyms "New Balance" and "new balance", the misspelled form "newbalance",
the abbreviation "nb", and so on. The extracted second data label (brand word) can be rewritten
according to the brand word synonym table (second data label synonym table) and the disambiguated
brand word (second data label), that is, unified into the single most suitable brand word, which
serves as the second data label of the commodity; for example, "New Balance" is used as the unified
second data label of the commodity.
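Synonym normalization can be sketched as rewriting a label to the most frequent member of its synonym
group. The data layout (a list of synonym sets plus a count dictionary) and the counts for the brand
example are assumptions for illustration only.

```python
def normalize_label(label, synonym_groups, counts):
    """Rewrite `label` to the most frequent member of its synonym group.

    Labels that belong to no group are returned unchanged.
    """
    for group in synonym_groups:
        if label in group:
            return max(group, key=lambda s: counts.get(s, 0))
    return label


synonym_groups = [{"A", "B"}, {"New Balance", "new balance", "newbalance", "nb"}]
counts = {"A": 500, "B": 20, "New Balance": 900, "new balance": 300,
          "newbalance": 40, "nb": 120}

normalized_b = normalize_label("B", synonym_groups, counts)
normalized_nb = normalize_label("nb", synonym_groups, counts)
```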
Step S408: according to the constructed mapping relations between first data labels and second data
labels, second data label completion is performed for data objects that lack a second data label.
If only a first data label, and no second data label, has been extracted from a data object (that is,
the second data label is missing), and the extracted first data label exactly matches a first data
label in the first-to-second data label mapping table mined offline for the same category, the second
data label of the data object is obtained from that mapping table so that the data object can be
aggregated into the corresponding set of homogeneous data objects. Further, this second data label can
be written into the data object that lacked it.
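A minimal sketch of the completion step, assuming data objects are plain dictionaries with
hypothetical keys `category`, `first`, and `second`, and that the offline mapping table is keyed by
(category, first label):

```python
def complete_second_label(obj, mapping):
    """Return `obj`, with a missing second label filled in when possible.

    The original dictionary is not modified; a completed copy is returned.
    """
    if obj.get("second") is not None:
        return obj
    firsts = obj.get("first", [])
    # Only an object with exactly one first label can be looked up.
    key = (obj["category"], firsts[0]) if len(firsts) == 1 else None
    if key in mapping:
        return {**obj, "second": mapping[key]}
    return obj


mapping = {("shoes", "1110"): "BB"}
a = {"id": "A", "category": "shoes", "first": ["1110"], "second": None}
completed = complete_second_label(a, mapping)
```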
Step S410: first data labels are extracted from multiple data objects based on configured rules. The
first data label of a data object is extracted, according to the configured rules, from the title
information, attribute information, and so on in its data file. For example, a regular expression can
be configured to extract the first data label of a data object.
Step S412: the multiple extracted first data labels are normalized. For example, if the labels in a
data object contain the same main number, such as "1110", and different sub-numbers, such as "001" and
"002", the sub-numbers are removed to normalize the first data label: "1110-001" and "1110-002" are
both normalized to "1110", or both normalized to the same first data label "1110-001".
Taking commodity search over a mass of data objects as an example, the first data label extracted from
a commodity data object is, for instance, its article number. The article number is split on a
separator: "537889-001" is split on "-" into the two parts "537889" and "001", and the main article
number, that is, the leading "537889", is taken as the normalized article number.
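Steps S410 and S412 can be illustrated together. The regular expression below is a hypothetical
configured rule (at least four digits, optionally followed by "-" and a sub-number); the application
does not specify the actual pattern.

```python
import re

# Hypothetical rule: at least four digits, optionally followed by "-" and a sub-number.
ARTICLE_NUMBER = re.compile(r"\b\d{4,}(?:-\d+)?\b")


def extract_first_labels(title):
    """Return every article-number-like token found in the title."""
    return ARTICLE_NUMBER.findall(title)


def normalize_first_label(label, separator="-"):
    """Keep only the main number in front of the separator."""
    return label.split(separator, 1)[0]


title = "Running shoe 537889-001 blue"
raw = extract_first_labels(title)
normalized = [normalize_first_label(label) for label in raw]
```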
Step S414: first data label synonym normalization. After the first data labels have been normalized,
the first data label extracted from the data object is rewritten, based on the first data label
synonym table mined offline, and unified into a single first data label. For example, if first data
label "1110" occurs 500 times in the second data label category distribution table and its synonym
"1111" occurs 20 times in the category distribution table of data objects, first data label "1111" can
be changed to "1110".
The online operation relies on the tables produced by offline data mining. Each data object is
integrated based on the first data label and second data label that occur most frequently among its
data labels, and second data labels or first data labels that are synonyms of each other are
normalized. According to the second data label table, the second data label synonym table, the first
data label synonym table, the second data label category distribution table, and the first-to-second
data label mapping table, it is determined which data objects in a given category should be integrated
into homogeneous data objects, together with their unified second data label and first data label, to
facilitate search matching.
Through the offline data mining and the online normalization and completion, the mapping relations
among data objects, first data labels, and second data labels can be obtained and a data object
mapping table can be formed, so that during data search homogeneous data objects are looked up
according to this mapping table.
Step S204: the data object mapping table obtained by the advance integration is stored. This includes
storing in a database the integrated data object mapping table obtained through offline mining and
online completion and normalization.
During the advance integration of data objects, the second data label table, the second data label
category distribution table, the second data label synonym table, the first data label synonym table,
and the first-to-second data label mapping table are formed, and these tables (sets) are stored in the
database as the results of the advance integration, so that they can be called at any time during data
search to improve the computing speed of the system.
With the method of the present application for integrating data objects in advance, associations are
established between homogeneous data objects and homogeneous data objects are displayed in the search
results, so that more comprehensive data is available to the user and the accuracy and recall of data
search are improved, thereby improving the quality of the search results.
The present application also provides a device for data search based on integrating homogeneous data
objects.
Figure 5 shows a structural diagram of the device for data search based on homogeneous data
integration according to an embodiment of the present application.
The device 500 described herein can include a receiving and search module 501, an acquisition module
503, a matching module 505, and an integration and return module 507. The modules correspond to the
implementation of the respective steps of the method described above.
The receiving and search module 501 is configured to receive a search request from a user and perform
the search, wherein the search request is used to search all data objects to be searched for one or
more data objects that match the search request.
The acquisition module 503 is configured to obtain the data label of each data object by analyzing
each of the one or more data objects found, wherein the data label includes a first data label and a
second data label, which respectively identify different attribute features of the data object. The
acquisition module 503 can thus be used to obtain the first data label and/or the second data label of
each data object.
The matching module 505 is configured to perform matching according to the obtained data labels (the
first data label and/or the second data label), that is, to further match each of the one or more data
objects found so as to obtain the one or more homogeneous data objects corresponding to each data
object.
The integration and return module 507 is configured to integrate the one or more data objects whose
data labels (first data label and/or second data label) match into a homogeneous data object
combination and return it to the user as the search result. The homogeneous data object combination
includes multiple data objects that are homogeneous data objects of one another, and the search
results page can display one of the multiple data objects in the combination.
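The integration and display behaviour described for module 507 can be sketched as grouping search
hits by their normalized labels and showing one representative per group. The dictionary keys and the
choice of the first hit as the displayed object are assumptions for illustration; the application only
states that one of the homogeneous data objects is shown.

```python
from collections import defaultdict


def integrate_results(hits):
    """Group search hits into homogeneous combinations.

    `hits` is a list of dicts with 'id', 'category', 'first', 'second'
    (labels assumed to be already normalized). One object per group is
    shown; the rest are kept as its homogeneous data objects.
    """
    groups = defaultdict(list)
    for hit in hits:
        key = (hit["category"], hit["second"], hit["first"])
        groups[key].append(hit["id"])
    return [{"shown": ids[0], "similar": ids[1:]} for ids in groups.values()]


hits = [
    {"id": 1, "category": "shoes", "first": "537889", "second": "New Balance"},
    {"id": 2, "category": "shoes", "first": "537889", "second": "New Balance"},
    {"id": 3, "category": "shoes", "first": "1110", "second": "BB"},
]
page = integrate_results(hits)
```

Collapsing each group to a single displayed entry is what frees space on the results page for a
wider variety of data objects.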
The device 500 described herein further includes a pretreatment module 509 and a memory module 511.
The pretreatment module 509 is configured to perform advance integration on all data objects to be
searched and to determine the one or more homogeneous data objects corresponding to each data object
to be searched, so as to obtain the data object mapping table.
Specifically, the pretreatment module 509 performs offline data mining and online data object
normalization and mapping on all data objects to be searched.
When performing the offline data mining, the pretreatment module 509 can mine the second data labels
in each data object and the second data label category distribution table.
The pretreatment module 509 can perform second data label mining on the second data labels in each
data object to generate the set of second data label synonyms for all data objects, wherein the second
data label synonyms are derived from multiple data objects that, within the same category, have
different second data labels and the same first data label.
The pretreatment module 509 can perform first data label mining on the first data labels in each data
object to generate the first data label synonym thesaurus for all data objects, wherein the first data
label synonyms include multiple similar first data labels within the same data object.
The pretreatment module 509 can mine the first data labels and second data labels in each data object
to generate the mapping relations from first data labels to second data labels. Specifically, if a
data object has only one first data label, and that first data label has co-occurred with only one
unique second data label, a mapping relation between the first data label and the second data label is
established.
When performing the online data object normalization and mapping, the pretreatment module 509 is
configured to extract one or more second data labels from the same data object to obtain one or more
candidate second data labels, and to disambiguate the extracted candidates. Further, based on the
second data label category distribution table, the number of times a candidate second data label
occurs in the category is obtained; if the count is greater than a preset threshold, the candidate is
regarded as a second data label of the data object. In another embodiment, if a data object has
multiple candidate second data labels, the one that occurs most often in the second data label
category distribution table is selected as the second data label of the data object.
The pretreatment module 509 is further configured to: extract first data labels from multiple data
objects based on configured rules and normalize the extracted first data labels; normalize second data
labels or first data labels that are synonyms of each other; and, according to the constructed mapping
relations between first data labels and second data labels, perform second data label completion for
data objects that lack a second data label.
The purpose of the pretreatment module 509 is to perform advance integration on all data objects to be
searched (a mass of data objects) to obtain homogeneous data objects, that is, multiple data objects
that are in the same category, have the same or synonymous second data labels, and have the same or
synonymous first data labels. During the advance integration, the mapping relations among data
objects, first data labels, and second data labels can be obtained to form the data object mapping
table.
The memory module 511 is configured to store the data object mapping table obtained by the advance
integration. During data search, the homogeneous data objects corresponding to a data object found by
the search can be matched directly through the data object mapping table.
In this way, key features or attributes of data objects, such as the first data label and the second
data label, are used to classify and integrate the mass of data objects in advance and to establish
associations between homogeneous data objects, which improves the accuracy and recall of data search
and thereby the quality of the search results.
In addition, the multiple homogeneous data objects returned by the search engine are integrated so
that only one of them is displayed on the search results page. The page can therefore display a wider
variety of data objects, which increases the diversity of the search results and gives a better user
experience.
The embodiments of the modules included in the device of the present application described with
reference to Fig. 5 correspond to the embodiments of the steps in the method of the present
application. Since Figs. 1 to 4 have been described in detail, the details of the modules are not
described again here so as not to obscure the present application.
The embodiments in this specification are generally described in a progressive manner; each embodiment
focuses on its differences from the other embodiments, and for identical or similar parts the
embodiments can be referred to one another.
The application can be described in the general context of computer executable instructions, such as program
Module or unit.Usually, program module or unit can include performing particular task or realize particular abstract data type
Routine, program, object, component, data structure etc..In general, program module or unit can be by softwares, hardware or both
Combination realize.The application can also be put into practice in a distributed computing environment, in these DCEs, by passing through
Communication network and connected remote processing devices perform task.In a distributed computing environment, program module or unit can
With positioned at including in the local and remote computer-readable storage medium including storage device.
Finally, in addition it is also necessary to explanation, term " comprising ", "comprising" or its any other variant are intended to non-exclusive
Property include so that process, method, commodity or equipment including a series of key elements not only include those key elements, and
Also include other key elements for being not expressly set out, or also include for this process, method, commodity or equipment inherently
Key element.In the absence of more restrictions, the key element limited by sentence "including a ...", it is not excluded that including described
Also there is other identical element in process, method, commodity or the equipment of key element.
It should be understood by those skilled in the art that, embodiments herein can be provided as method, system or computer program
Product.Therefore, the application can be using the reality in terms of complete hardware embodiment, complete software embodiment or combination software and hardware
Apply the form of example.Moreover, the application can be used in one or more computers for wherein including computer usable program code
Usable storage medium(Including but not limited to magnetic disk storage, CD-ROM, optical memory etc.)The computer program production of upper implementation
The form of product.
Specific examples are used herein to explain the principles and embodiments of the present
application, and the above description of the embodiments is intended only to help in understanding
the method of the present application and its main idea. At the same time, those of ordinary skill in
the art may, following the idea of the present application, make changes to the specific embodiments
and the scope of application. In summary, the content of this specification should not be construed as
limiting the present application.
In a typical configuration, a computing device includes one or more processors (CPU), input/output
interfaces, network interfaces, and memory. The memory may include volatile memory in computer-readable
media, such as random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM)
or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media include permanent and non-permanent, removable and non-removable media, which
can store information by any method or technology. The information can be computer-readable
instructions, data structures, program modules, or other data. Examples of computer storage media
include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM),
dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory
(ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory
technologies, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical
storage, magnetic cassettes, magnetic tape or disk storage or other magnetic storage devices, or any
other non-transmission medium that can be used to store information accessible by a computing device.
As defined herein, computer-readable media do not include transitory computer-readable media
(transitory media), such as modulated data signals and carrier waves.
Claims (17)
1. A data search method based on the integration of homogeneous data objects, characterized by
comprising:
receiving a search request from a user, and searching all data objects to be searched for one or more
data objects that match the search request;
analyzing each of the one or more data objects found, to obtain the data label of each data object;
matching the obtained data labels against the data objects to be searched, to obtain one or more
homogeneous data objects that match the data labels;
integrating the one or more data objects whose data labels match into a homogeneous data object
combination, and returning it to the user as the search result.
2. The method according to claim 1, characterized in that the data label includes a first data label
and a second data label, the first data label and the second data label respectively identifying
different attribute features of the data object.
3. The method according to claim 2, characterized by further comprising: performing advance
integration on all data objects to be searched, to determine the one or more homogeneous data objects
corresponding to each data object to be searched and thereby obtain a data object mapping table.
4. The method according to claim 3, characterized in that performing advance integration on all data
objects to be searched comprises:
mining the second data labels in each data object and the second data label category distribution
table;
performing second data label mining on the second data labels in each data object, to generate the set
of second data label synonyms of all data objects;
performing first data label mining on the first data labels in each data object, to generate the first
data label synonym thesaurus of all data objects;
mining the first data labels and second data labels in each data object, to generate the mapping
relations from first data labels to second data labels.
5. The method according to claim 4, characterized in that
the second data label synonyms include: within the same category, multiple data objects having
different second data labels and the same first data label;
the first data label synonyms include: multiple similar first data labels within the same data object.
6. The method according to claim 4, characterized in that mining the first data labels and second data
labels in each data object to generate the mapping relations from first data labels to second data
labels comprises: if a data object has only one first data label and that first data label co-occurs
with only one unique second data label, establishing a mapping relation between the first data label
and the second data label.
7. The method according to claim 3, characterized in that performing advance integration on all data
objects to be searched comprises:
extracting one or more second data labels from the same data object to obtain one or more candidate
second data labels, and disambiguating the one or more extracted candidate second data labels;
extracting first data labels from multiple data objects based on configured rules, and normalizing the
multiple extracted first data labels;
normalizing second data labels or first data labels that are synonyms of each other;
performing, according to the constructed mapping relations between first data labels and second data
labels, second data label completion for data objects that lack a second data label.
8. The method according to claim 7, characterized in that disambiguating the one or more extracted
candidate second data labels comprises:
obtaining, based on the second data label category distribution table, the number of times a candidate
second data label occurs in the category and, if the number is greater than a preset threshold,
regarding the candidate as a second data label of the data object; and/or, if a data object has
multiple candidate second data labels, selecting the second data label that occurs most often in the
second data label category distribution table as the second data label of the data object.
9. The method according to claim 1, characterized by comprising:
displaying, on the search results page, one of the multiple data objects in the homogeneous data
object combination, wherein the homogeneous data object combination includes multiple data objects
that are homogeneous data objects of one another.
10. The method according to claim 2, characterized in that the homogeneous data objects include:
multiple data objects that are in the same category, have the same or synonymous second data labels,
and have the same or synonymous first data labels.
11. A device for data search based on the integration of homogeneous data objects, characterized by comprising:
a receiving and searching module, configured to receive a search request from a user and to search, among all data objects to be searched, for one or more data objects that match the search request;
an acquisition module, configured to analyze each of the one or more data objects found, to obtain the data label of each data object;
a matching module, configured to match the obtained data labels against the data objects to be searched, to obtain one or more homogeneous data objects that match the data labels;
an integrating and returning module, configured to integrate the one or more data objects whose data labels match into a homogeneous data object combination, and to return the combination to the user as the search result.
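The four modules of claim 11 can be pictured as four small functions chained together. This is only a sketch under stated assumptions: substring matching on a hypothetical `title` field stands in for the real search, labels are read from hypothetical `category`, `second`, and `first` fields rather than extracted by analysis, and all function names are invented for the example.

```python
from typing import Dict, List, Set, Tuple

DataObject = Dict[str, str]
Labels = Tuple[str, str, str]


def search(query: str, corpus: List[DataObject]) -> List[DataObject]:
    """Receiving and searching module: naive substring match on the title."""
    q = query.lower()
    return [obj for obj in corpus if q in obj["title"].lower()]


def acquire_labels(obj: DataObject) -> Labels:
    """Acquisition module: read the labels stored on the object."""
    return (obj["category"], obj["second"], obj["first"])


def match(labels: Labels, corpus: List[DataObject]) -> List[DataObject]:
    """Matching module: find every object to be searched with the same labels."""
    return [obj for obj in corpus if acquire_labels(obj) == labels]


def handle_request(query: str, corpus: List[DataObject]) -> List[List[DataObject]]:
    """Integrating and returning module: one combination per distinct label set."""
    seen: Set[Labels] = set()
    combinations: List[List[DataObject]] = []
    for hit in search(query, corpus):
        labels = acquire_labels(hit)
        if labels not in seen:
            seen.add(labels)
            combinations.append(match(labels, corpus))
    return combinations
```

Because matching runs against the whole corpus, a combination can include objects that the original query did not hit directly but that carry the same labels as a hit.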
12. The device according to claim 11, characterized in that the data label comprises a first data label and a second data label, the first data label and the second data label respectively identifying different attribute features of a data object.
13. The device according to claim 12, characterized by further comprising:
a preprocessing module, configured to integrate all data objects to be searched in advance, so as to determine the one or more homogeneous data objects corresponding to each data object to be searched and thereby obtain a data object mapping table.
14. The device according to claim 13, characterized in that the preprocessing module is further configured to:
perform mining on the second data labels in each data object and on the second data label category distribution table;
mine the second data labels in each data object to generate a set of second data label synonyms for all data objects;
mine the first data labels in each data object to generate a thesaurus of first data label synonyms for all data objects;
mine the first data label and the second data label in each data object to generate a mapping relation from first data labels to second data labels;
if a data object has only a first data label, and that first data label co-occurs with only one unique second data label, establish the mapping relation between that first data label and that second data label.
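The last rule of claim 14, which keeps a first-to-second mapping only when the first data label co-occurs with exactly one second data label, can be sketched as follows. The input shape (pairs of first and second labels, with `None` for a missing second label) and the function name are assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Optional, Set, Tuple


def build_first_to_second(
        pairs: Iterable[Tuple[str, Optional[str]]]) -> Dict[str, str]:
    """Map each first label to its second label when the co-occurrence is unique."""
    seen: Dict[str, Set[str]] = defaultdict(set)
    for first, second in pairs:
        if second is not None:
            seen[first].add(second)
    # Keep only first labels that co-occur with exactly one second label.
    return {first: next(iter(seconds))
            for first, seconds in seen.items()
            if len(seconds) == 1}
```

A first data label seen with two different second data labels is ambiguous and produces no mapping, so objects carrying only that first label are not completed from it.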
15. The device according to claim 14, characterized in that the second data label synonyms comprise: the different second data labels of multiple data objects that are under the same category and have the same first data label; and the first data label synonyms comprise: multiple similar first data labels in the same data object.
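The second data label synonym criterion of claim 15 can be sketched by grouping objects on category and first data label and collecting the distinct second data labels in each group. The sketch only proposes candidate synonym sets; whether a further check is applied before accepting them is not stated in the claim. Field names and the function name are hypothetical.

```python
from collections import defaultdict
from typing import Dict, FrozenSet, List, Set, Tuple

DataObject = Dict[str, str]  # hypothetical keys: "category", "first", "second"


def candidate_second_label_synonyms(
        objects: List[DataObject]) -> List[FrozenSet[str]]:
    """Collect second labels that share a category and a first label."""
    by_key: Dict[Tuple[str, str], Set[str]] = defaultdict(set)
    for obj in objects:
        by_key[(obj["category"], obj["first"])].add(obj["second"])
    # Only groups with at least two distinct second labels suggest synonyms.
    return [frozenset(labels) for labels in by_key.values() if len(labels) > 1]
```

Note that the same first data label under a different category does not contribute to a synonym set, which matches the "same category" condition in the claim.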
16. The device according to claim 13, characterized in that the preprocessing module is further configured to:
extract one or more second data labels from the same data object to obtain one or more candidate second data labels, and disambiguate the extracted candidate second data labels;
extract, based on configured rules, the first data labels from multiple data objects, and normalize the extracted first data labels;
normalize second data labels or first data labels that are synonyms of each other;
complete, according to the constructed mapping relation between first data labels and second data labels, the second data label of each data object that lacks a second data label;
obtain, based on the category distribution table of second data labels, the number of times a candidate second data label occurs in the category, and, if the number of times exceeds a preset threshold, take the candidate as the second data label of the data object; and/or
if a data object has multiple candidate second data labels, select the candidate that occurs most often in the second data label category distribution table as the second data label of the data object.
17. The device according to claim 12, characterized in that the integrating and returning module is further configured to:
display, on the search results page, one of the multiple data objects in the homogeneous data combination, wherein the homogeneous data combination comprises: multiple data objects that are homogeneous data objects of one another;
the homogeneous data objects comprise: multiple data objects that are under the same category, have identical or synonymous second data labels, and have identical or synonymous first data labels.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711064889.XA CN107844565B (en) | 2013-05-16 | 2013-05-16 | Commodity searching method and device |
CN201310182427.3A CN104166651B (en) | 2013-05-16 | 2013-05-16 | Method and apparatus based on the data search integrated to homogeneous data object |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201310182427.3A CN104166651B (en) | 2013-05-16 | 2013-05-16 | Method and apparatus based on the data search integrated to homogeneous data object |
Related Child Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711064889.XA Division CN107844565B (en) | 2013-05-16 | 2013-05-16 | Commodity searching method and device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN104166651A (en) | 2014-11-26 |
CN104166651B (en) | 2017-10-13 |
Family
ID=51910470
Family Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201310182427.3A Active CN104166651B (en) | 2013-05-16 | 2013-05-16 | Method and apparatus based on the data search integrated to homogeneous data object |
CN201711064889.XA Active CN107844565B (en) | 2013-05-16 | 2013-05-16 | Commodity searching method and device |
Family Applications After (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711064889.XA Active CN107844565B (en) | 2013-05-16 | 2013-05-16 | Commodity searching method and device |
Country Status (1)
Country | Link |
---|---|
CN (2) | CN104166651B (en) |
Families Citing this family (19)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104462316B (en) * | 2014-12-01 | 2017-09-26 | 苏州朗米尔照明科技有限公司 | A kind of tag match method |
CN105786922B (en) * | 2014-12-25 | 2020-02-14 | 高德软件有限公司 | Method and device for determining missing electronic map data |
CN104794003B (en) * | 2015-02-04 | 2019-06-04 | 汉鼎宇佑互联网股份有限公司 | It is a kind of to integrate real-time and non-real-time mode big data analysis system |
CN106033457B (en) * | 2015-03-18 | 2019-10-18 | 阿里巴巴集团控股有限公司 | A kind of method and apparatus of the attribute information of determining fruit objective attribute target attribute |
CN105138637A (en) * | 2015-08-24 | 2015-12-09 | 浪潮软件股份有限公司 | Data processing method and device |
CN105279277A (en) * | 2015-11-12 | 2016-01-27 | 百度在线网络技术(北京)有限公司 | Knowledge data processing method and device |
CN107451141B (en) * | 2016-05-30 | 2021-01-29 | 阿里巴巴集团控股有限公司 | Data recommendation processing interaction method, device and system |
CN106372191A (en) * | 2016-08-31 | 2017-02-01 | 广东华邦云计算股份有限公司 | Data search method and device |
US10289309B2 (en) | 2016-09-12 | 2019-05-14 | Toshiba Memory Corporation | Automatic detection of multiple streams |
US10542089B2 (en) | 2017-03-10 | 2020-01-21 | Toshiba Memory Corporation | Large scale implementation of a plurality of open channel solid state drives |
US10073640B1 (en) | 2017-03-10 | 2018-09-11 | Toshiba Memory Corporation | Large scale implementation of a plurality of open channel solid state drives |
CN108415886B (en) * | 2018-03-07 | 2019-04-05 | 清华大学 | A kind of data label error correction method and device based on production process |
CN109063222B (en) * | 2018-11-04 | 2021-11-30 | 朗威寰球(北京)科技集团有限公司 | Self-adaptive data searching method based on big data |
CN109558468B (en) * | 2018-12-13 | 2022-04-01 | 北京百度网讯科技有限公司 | Resource processing method, device, equipment and storage medium |
CN110059967B (en) * | 2019-04-23 | 2021-02-23 | 北京相数科技有限公司 | Data processing method and device applied to city aid decision analysis |
CN110263240A (en) * | 2019-06-18 | 2019-09-20 | 深圳市酷开网络科技有限公司 | Integration searching method, device and the computer readable storage medium of information |
CN110516140A (en) * | 2019-08-15 | 2019-11-29 | 北京泰迪熊移动科技有限公司 | A kind of information processing method, equipment and computer storage medium |
CN111667235A (en) * | 2020-05-18 | 2020-09-15 | 上海兴亚报关有限公司 | Customs clearance information management method, system, device and storage medium |
CN113763081B (en) * | 2020-08-26 | 2024-06-18 | 北京沃东天骏信息技术有限公司 | Article recall method and device |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102890686A (en) * | 2011-07-21 | 2013-01-23 | 腾讯科技(深圳)有限公司 | Method and system for showing commodity search result |
CN102915498A (en) * | 2011-08-03 | 2013-02-06 | 腾讯科技(深圳)有限公司 | Method and device for goods classification of e-commerce platform |
CN102930038A (en) * | 2012-11-12 | 2013-02-13 | 江苏外博资讯有限公司 | Combined method of search result similar items and system of the same |
CN103020207A (en) * | 2012-12-06 | 2013-04-03 | 优视科技有限公司 | Browser label page grouping management method and device |
Family Cites Families (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR100776697B1 (en) * | 2006-01-05 | 2007-11-16 | 주식회사 인터파크지마켓 | Method for searching products intelligently based on analysis of customer's purchasing behavior and system therefor |
CN101206674A (en) * | 2007-12-25 | 2008-06-25 | 北京科文书业信息技术有限公司 | Enhancement type related search system and method using commercial articles as medium |
US8200682B2 (en) * | 2008-04-22 | 2012-06-12 | Uc4 Software Gmbh | Method of detecting a reference sequence of events in a sample sequence of events |
CN101887436B (en) * | 2009-05-12 | 2013-08-21 | 阿里巴巴集团控股有限公司 | Retrieval method and device |
CN101963966A (en) * | 2009-07-24 | 2011-02-02 | 李占胜 | Method for sorting search results by adding labels into search results |
CN102207963A (en) * | 2011-05-30 | 2011-10-05 | 何吴迪 | Post-search instant intelligent navigation technique for cloud computing window platform |
CN103092856B (en) * | 2011-10-31 | 2015-09-23 | 阿里巴巴集团控股有限公司 | Search result ordering method and equipment, searching method and equipment |
CN102902806B (en) * | 2012-10-17 | 2016-02-10 | 深圳市宜搜科技发展有限公司 | A kind of method and system utilizing search engine to carry out query expansion |
CN103106240A (en) * | 2012-12-12 | 2013-05-15 | 江苏乐买到网络科技有限公司 | Method of searching products in online shopping |
Also Published As
Publication number | Publication date |
---|---|
CN107844565A (en) | 2018-03-27 |
CN104166651A (en) | 2014-11-26 |
CN107844565B (en) | 2021-07-16 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
CN104166651B (en) | Method and apparatus based on the data search integrated to homogeneous data object | |
CN102725759B (en) | For the semantic directory of Search Results | |
US9323731B1 (en) | Data extraction using templates | |
CN103678335B (en) | The method of method, apparatus and the commodity navigation of commodity sign label | |
CN105760495B (en) | A kind of knowledge based map carries out exploratory searching method for bug problem | |
CN104331446B (en) | A kind of massive data processing method mapped based on internal memory | |
CN105045901A (en) | Search keyword push method and device | |
CN106815307A (en) | Public Culture knowledge mapping platform and its use method | |
WO2013163644A2 (en) | Updating a search index used to facilitate application searches | |
CN105159930A (en) | Search keyword pushing method and apparatus | |
Babu et al. | Improving Quality of Content Based Image Retrieval with Graph Based Ranking | |
CN110008306A (en) | A kind of data relationship analysis method, device and data service system | |
WO2010008488A1 (en) | Method and system for dynamically generating a search result | |
CN103838754A (en) | Information searching device and method | |
CN103995828B (en) | A kind of cloud storage daily record data analysis method | |
CN105138637A (en) | Data processing method and device | |
US20140075299A1 (en) | Systems and methods for generating extraction models | |
CN105404677A (en) | Tree structure based retrieval method | |
CN103853771B (en) | A kind of method for pushing and system of search result | |
JP5324677B2 (en) | Similar document search support device and similar document search support program | |
CN102902705B (en) | Ambiguity in location data | |
JP5780036B2 (en) | Extraction program, extraction method and extraction apparatus | |
CN107784019A (en) | Word treatment method and system are searched in a kind of searching service | |
Zhao et al. | Predicting missing provenance using semantic associations in reservoir engineering | |
KR20140026796A (en) | System and method for providing customized patent analysis service |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||