CN107145525A - Data processing method, searching method and related device for confirming search scene - Google Patents
Data processing method, searching method and related device for confirming search scene
- Publication number
- CN107145525A (application number CN201710243857.XA)
- Authority
- CN
- China
- Prior art keywords
- data
- mapping
- search
- scene
- data set
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2457—Query processing with adaptation to user needs
- G06F16/24578—Query processing with adaptation to user needs using ranking
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2457—Query processing with adaptation to user needs
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/953—Querying, e.g. by the use of web search engines
- G06F16/9535—Search customisation based on user profiles and personalisation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/06—Buying, selling or leasing transactions
- G06Q30/0601—Electronic shopping [e-shopping]
- G06Q30/0623—Item investigation
- G06Q30/0625—Directed, with specific intent or strategy
Abstract
Embodiments of the present invention provide a data processing method for confirming a search scene, a search method, and related devices, and relate to the fields of data processing and search. The data processing method includes: establishing an initial data mapping between a first data set and a second data set; adjusting the initial data mapping according to a monitoring data set to obtain an actual data mapping between the first data set and the second data set; and determining, based on the first data in the first data set to which the second data in the second data set are actually mapped, the search scene to which the second data are mapped. The invention can effectively optimize the data mapping relations and improve mapping precision, which in turn improves the precision of the subsequently determined search scene. It also improves matching efficiency, broadens the range of scenes that can be matched, and improves the accuracy of search results.
Description
Technical field
Embodiments of the present invention relate to the fields of data processing and search, and more particularly to a data processing method for confirming a search scene, a search method, and related devices.
Background
O2O e-commerce platforms have risen rapidly on the internet in recent years, and food delivery, built on the catering industry, has grown fastest. Users complete a purchase by searching for and selecting food in an application, and one core function this process necessarily involves is search.
Unlike traditional general-purpose text search engines such as Baidu and Google, a search engine for catering e-commerce has to carry out its search task using specific search scenes and specialized data sources. For example, for a search for "youtiao" (deep-fried dough sticks), the corresponding scenes should be "breakfast", "north", and so on. Put simply, a search scene is the information mined from behind a user's search behavior. For a search for "crayfish", for example, the corresponding search scene is information such as "summer, late night, group gathering, seafood". By "associating" such scene data, the engine can more accurately output the results the user expects.
At present, search scene recognition based on catering domain knowledge is still at an exploratory stage in China. In industry, vertical search for the catering field started late but has grown quickly, and upgrades to search scene recognition technology have not kept pace with rising demand. In academia, large-scale, high-value search data is hard to obtain, so research in this area has largely stalled. Yet the huge market demand puts great pressure on search in the catering field. Accurate and specialized recognition of the search scene has therefore become the core optimization direction for search engine technology in this field.
In one prior-art approach, scene recognition for vertical e-commerce search in the catering field is done mainly by manual labelling. This approach suffers from high labor cost and from labelling standards that are strongly subjective and cannot be made objective and uniform. Even where the prior art supports automation, it is difficult to guarantee accurate and specialized recognition of the search scene.
Summary of the invention
To address the defects of the prior art, embodiments of the present invention provide a data processing method for confirming a search scene, a search method, and related devices, which can map search scenes automatically and precisely, improve the recognition accuracy of search scenes, and improve the precision of search results.
In a first aspect, an embodiment of the present invention provides a data processing method for confirming a search scene, including:
establishing an initial data mapping between a first data set and a second data set, the first data set including multiple items of first data and the second data set including multiple items of second data;
adjusting the initial data mapping according to a monitoring data set to obtain an actual data mapping between the first data set and the second data set;
determining, based on the first data in the first data set to which the second data in the second data set are actually mapped, the search scene corresponding to the second data in the second data set.
In one implementation of the embodiment, the first data set is a scene feature library for the catering field, and the second data set includes dish data and merchant data.
In one implementation of the embodiment, the method further includes: processing a first data source along a time dimension and a geographic dimension to obtain the first data set. Alternatively, the method further includes: performing word segmentation, word frequency analysis, stem extraction, and semantic analysis on a monitoring data source to obtain the monitoring data set.
In one implementation of the embodiment, each item of monitoring data in the monitoring data set includes, in addition to a phrase name, a weight and/or a penalty factor.
Further, adjusting the initial data mapping according to the monitoring data set includes:
determining, by text matching, monitoring data and first data that match each other;
for each item of second data, modifying the mapping relations between the second data and the first data to which it is initially mapped, based on the weight of the monitoring data that matches that first data; and/or
for each item of second data, adjusting the weight of the first data to which the second data is initially mapped, based on the penalty factor of the monitoring data that matches that first data.
In one implementation of the embodiment, determining the search scene corresponding to the second data in the second data set, based on the first data in the first data set to which the second data are actually mapped, includes: for each item of second data, selecting at least part of the first data to which the second data is actually mapped, or a combination of that at least part of the first data, as the search scene.
In a second aspect, an embodiment of the present invention provides a search scene recognition method, including:
segmenting a search query into search words;
determining, by matching, the matched data in the second data set that match the search words;
determining the search scene corresponding to the search query according to the search scene to which the matched data are mapped;
wherein the search scene to which the second data set is mapped is determined using the data processing method described above.
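As an illustration only, the recognition steps of the second aspect could be sketched in Python as follows. The whitespace-based `segment` function, the `SCENES_BY_ITEM` table, and its contents are assumptions made for this sketch and are not taken from the embodiments. In particular, a real system would use a proper word segmenter for Chinese text, and "youtiao" stands for the deep-fried dough sticks of the background example.

```python
from typing import Dict, List

# Hypothetical excerpt of the second data set, with each entry already
# mapped to its search scenes (illustrative values only).
SCENES_BY_ITEM: Dict[str, List[str]] = {
    "youtiao": ["breakfast", "north"],
    "crayfish": ["summer", "late-night snack"],
}


def segment(query: str) -> List[str]:
    """Stand-in for word segmentation: lowercase and split on whitespace."""
    return query.lower().split()


def recognise_scenes(query: str) -> List[str]:
    """Return the scenes of every data set entry matched by a search word."""
    scenes: List[str] = []
    for word in segment(query):
        # Exact lookup is used here as the simplest form of matching.
        for scene in SCENES_BY_ITEM.get(word, []):
            if scene not in scenes:  # keep first-seen order, no duplicates
                scenes.append(scene)
    return scenes
```

A query whose words match nothing in the table yields an empty list, which a caller could treat as "no scene recognized".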
In a third aspect, an embodiment of the present invention also provides a search method, including:
determining the search scene corresponding to a search query according to the search query, the second data set, and the search scene to which the second data set is mapped, wherein the scene to which the second data set is mapped is determined using the data mapping method described above (the output of this step is the recognized search scene, and the step can be implemented as in the second aspect);
loading a data file corresponding to the search scene, the data file being configured with an optimization strategy for recalled data;
re-ranking the recalled data according to the data file.
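The loading and ranking steps of the third aspect might look like the following minimal Python sketch. The source does not specify the format of the data file or the nature of the optimization strategy, so the JSON layout, the per-item `boost` multipliers, and the names `SCENE_FILES`, `load_strategy`, and `rerank` are all hypothetical.

```python
import json
from typing import Dict, List

# Hypothetical per-scene data files, inlined as JSON strings for the sketch.
SCENE_FILES: Dict[str, str] = {
    "breakfast": '{"boost": {"soy milk": 2.0, "youtiao": 1.5}}',
}


def load_strategy(scene: str) -> Dict[str, float]:
    """Load the (assumed) boost table configured for a search scene."""
    raw = SCENE_FILES.get(scene, '{"boost": {}}')
    return json.loads(raw)["boost"]


def rerank(recalled: List[dict], scene: str) -> List[dict]:
    """Order recalled items by base score multiplied by the scene boost."""
    boost = load_strategy(scene)
    return sorted(
        recalled,
        key=lambda item: item["score"] * boost.get(item["name"], 1.0),
        reverse=True,
    )
```

With no file for a scene, every boost defaults to 1.0 and the recalled data keep their original score order.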
In a fourth aspect, an embodiment of the present invention provides a data processing device for confirming a search scene, including:
a data mapping establishing module, configured to establish an initial data mapping between a first data set and a second data set, the first data set including multiple items of first data and the second data set including multiple items of second data;
a data mapping adjusting module, configured to adjust the initial data mapping according to a monitoring data set to obtain an actual data mapping between the first data set and the second data set;
a search scene mapping module, configured to determine, based on the first data in the first data set to which the second data in the second data set are actually mapped, the search scene corresponding to the second data in the second data set.
In one implementation of the embodiment, the first data set is a scene feature library for the catering field, and the second data set includes dish data and merchant data.
In one implementation of the embodiment, the device further includes a first data processing module, configured to process a first data source along a time dimension and a geographic dimension to obtain the first data set. Alternatively, the device further includes a monitoring data processing module, configured to perform word segmentation, word frequency analysis, stem extraction, and semantic analysis on a monitoring data source to obtain the monitoring data set.
In one implementation of the embodiment, each item of monitoring data in the monitoring data set includes, in addition to a phrase name, a weight and/or a penalty factor.
Further, the data mapping adjusting module includes: a matching submodule, configured to determine, by text matching, monitoring data and first data that match each other; a first adjusting submodule, configured to, for each item of second data, modify the mapping relations between the second data and the first data to which it is initially mapped, based on the weight of the monitoring data that matches that first data; and/or a second adjusting submodule, configured to, for each item of second data, adjust the weight of the first data to which the second data is initially mapped, based on the penalty factor of the monitoring data that matches that first data.
In one implementation of the embodiment, the search scene mapping module is specifically configured to: for each item of second data, select at least part of the first data to which the second data is actually mapped, or a combination of that at least part of the first data, as the search scene.
In a fifth aspect, an embodiment of the present invention provides a search scene recognition device, including:
a word segmentation module, configured to segment a search query into search words;
a matching module, configured to determine, by matching, the matched data in the second data set that match the search words;
a determining module, configured to determine the search scene corresponding to the search query according to the search scene to which the matched data are mapped;
wherein the scene to which the second data set is mapped is determined using the data mapping method described above.
In a sixth aspect, an embodiment of the present invention provides a search device, including:
a scene determining module, configured to determine the search scene corresponding to a search query according to the search query, the second data set, and the search scene to which the second data set is mapped, wherein the scene to which the second data set is mapped is determined using the data mapping method described above (the output of this module is the recognized search scene, and the module can be implemented by the search scene recognition device described above);
a loading module, configured to load a data file corresponding to the search scene, the data file being configured with an optimization strategy for recalled data;
an optimization module, configured to re-rank the recalled data according to the loaded data file.
The functions of the search scene recognition device and the search device can be implemented in hardware, or in hardware executing corresponding software. The hardware or software includes one or more modules corresponding to the functions above.
In one possible design, the search scene recognition device or search device includes a processor and a memory. The memory stores a program that supports the device in performing the corresponding processing described above, and the processor is configured to execute the program stored in the memory. The device may also include a communication interface for communicating with other devices or communication networks.
In a seventh aspect, an embodiment of the present invention provides a computer storage medium for storing the computer software instructions used by the search scene recognition device and/or the search device, including the program involved in executing the corresponding method above so that the search scene recognition device and/or the search device carries out the corresponding data processing.
Embodiments of the present invention can effectively optimize the data mapping relations and improve mapping precision, which in turn improves the precision of the subsequently determined search scene. They can also improve matching efficiency and broaden the range of scenes that can be matched, thereby improving the accuracy of search results.
These and other aspects of the invention will be more readily apparent from the following description.
Brief description of the drawings
To explain the technical solutions of the embodiments of the present invention or of the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below show some embodiments of the present invention, and a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a flow diagram of a data processing method for confirming a search scene according to an embodiment of the present invention;
Fig. 2 is a flow diagram of a method for building a scene feature library according to an embodiment of the present invention;
Fig. 3 is a flow diagram of a method for obtaining monitoring data according to an embodiment of the present invention;
Fig. 4 is a schematic diagram of data mapping logic according to an embodiment of the present invention;
Fig. 5 is a flow diagram of a data mapping method according to an embodiment of the present invention;
Fig. 6 is a flow diagram of a search scene recognition method according to an embodiment of the present invention;
Fig. 7 is a flow diagram of a search method according to an embodiment of the present invention;
Fig. 8 is an example block diagram of a data processing device for confirming a search scene according to an embodiment of the present invention;
Fig. 9 is an example block diagram of a search scene recognition device according to an embodiment of the present invention;
Fig. 10 is an example block diagram of a search device according to an embodiment of the present invention.
Detailed description
To help those skilled in the art better understand the solution of the present invention, the technical solutions in the embodiments are described clearly and completely below with reference to the drawings of the embodiments.
Some of the flows described in the specification, the claims, and the drawings above contain multiple operations that appear in a particular order. It should be clearly understood, however, that these operations may be performed out of the order in which they appear here, or in parallel. Operation numbers such as 101 and 102 serve only to distinguish the operations and do not themselves imply any execution order. In addition, the flows may include more or fewer operations, and these operations may be performed in sequence or in parallel. Note that terms such as "first" and "second" here distinguish different messages, devices, modules, and so on. They do not indicate an order, nor do they require that "first" and "second" be of different types.
First, some terms that the present invention involves or may involve are explained. These explanations are given only to aid understanding and do not limit the embodiments of the invention.
Search technology: building information databases and indexed data for internet data resources, achieving performance optimization through various software and hardware techniques, and using relevant algorithmic strategies to optimize search accuracy and the ranking of results.
Scene recognition: deep data mining of search keywords, based on big data and natural language processing, to analyze the search scene a keyword belongs to and thereby optimize search results at a higher level.
Domain knowledge: specialized knowledge and skills within an industry field. A field is the scope of a particular profession or industry, such as finance, manufacturing, or catering. The knowledge framework formed by the expertise, skills, and management competence in a field is called a knowledge domain.
Natural language processing: the process of, and the techniques for, handling natural language information by computer. Natural language means the written or spoken languages of humans themselves, such as Chinese, English, and Japanese, as opposed to artificial, formalized computer languages. The key to processing natural language is enabling the computer to understand it.
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings. The described embodiments are evidently only some, not all, of the embodiments of the invention. All other embodiments obtained by those skilled in the art from the embodiments herein without creative effort fall within the scope of protection of the invention.
Fig. 1 is a flow diagram of a data processing method for confirming a search scene according to an embodiment of the present invention. Referring to Fig. 1, the method includes:
10: Establish an initial data mapping between a first data set and a second data set. The first data set contains multiple items of first data, and the second data set contains multiple items of second data.
In the present invention, the first data set and the second data set contain data on which data mapping can be performed directly. How to obtain the first and second data sets in a specific application environment will become clear from the description below.
Optionally, in this embodiment, step 10 can also be understood as labelling the second data set with the first data set, thereby establishing the initial mapping relations between the two data sets.
12: Adjust the initial data mapping according to a monitoring data set to obtain the actual data mapping between the first data set and the second data set.
Optionally, in one implementation of this embodiment, the role of the monitoring data set is to optimize the initial mapping relations obtained in step 10, for example by preventing over-fitting of the labels drawn from the first data set and by limiting mapping strength.
The monitoring data set contains monitoring data. In the present invention, monitoring data can be understood as standardized data samples that assist in processing such as data filtering, adjustment, and optimization, and that serve as reference data.
14: Determine, based on the actual data mapping, the search scene to which the second data set is mapped. Specifically, based on the first data in the first data set to which the second data in the second data set are actually mapped, determine the search scene corresponding to the second data in the second data set.
Compared with the prior art, in which a mapping may be insufficiently effective or over-fitted, the method of this embodiment adjusts the data mapping based on monitoring data. This can effectively optimize the data mapping relations and improve mapping precision, and in turn the precision of the determined search scene.
Optionally, in one implementation of this embodiment, each item of monitoring data in the monitoring data set includes a phrase name and adjustment parameters, the adjustment parameters including a weight and/or a penalty factor. In that case, step 12 can be implemented as follows:
First, text matching is used to determine monitoring data and first data that match each other. For example, the phrase names are matched against the first data in the first data set to determine the matching monitoring data and first data. Then, for each item of second data, the mapping relations between the second data and the first data to which it is initially mapped are modified based on the weight of the monitoring data that matches that first data; and/or, for each item of second data, the weight of the first data to which the second data is initially mapped is adjusted based on the penalty factor of the monitoring data that matches that first data.
Modifying the mapping relations between the second data and the first data to which it is initially mapped includes: deleting the mapping relation between the second data and any first data whose matched monitoring data has a weight value that does not satisfy a preset condition, and sorting the mappings between the first data and the second data by the weight values of the monitoring data matched by the first data.
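The adjustment described above can be sketched in Python under a few stated assumptions. The source does not say how weight and penalty factor combine, what the preset condition is, or how first data with no matching monitoring data are treated. This sketch assumes a multiplicative penalty, a minimum-weight threshold as the preset condition, and a default weight for unmatched first data. The names `Monitor` and `adjust_mapping` and all numeric values are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Monitor:
    """One item of monitoring data, keyed elsewhere by its phrase name."""
    weight: float
    penalty: float = 1.0  # assumed multiplicative; 1.0 means no penalty


def adjust_mapping(
    initial: Dict[str, List[str]],
    monitors: Dict[str, Monitor],
    min_weight: float = 0.2,       # assumed form of the "preset condition"
    default_weight: float = 0.2,   # assumed weight for unmatched first data
) -> Dict[str, List[Tuple[str, float]]]:
    """Turn an initial mapping into a weighted, sorted actual mapping."""
    actual: Dict[str, List[Tuple[str, float]]] = {}
    for item, features in initial.items():
        scored: List[Tuple[str, float]] = []
        for feature in features:
            # Exact phrase-name equality stands in for text matching.
            monitor: Optional[Monitor] = monitors.get(feature)
            if monitor is None:
                scored.append((feature, default_weight))
            elif monitor.weight >= min_weight:
                scored.append((feature, monitor.weight * monitor.penalty))
            # Otherwise the mapping relation is deleted.
        scored.sort(key=lambda pair: pair[1], reverse=True)
        actual[item] = scored
    return actual
```

For instance, if "youtiao" is initially mapped to "north", "breakfast", "dessert", and "staple food", and the monitoring data give "dessert" a weight below the threshold, the "dessert" mapping is removed and the remaining features are ordered by their adjusted weights.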
Optionally, in one implementation of this embodiment, the first data set is obtained by processing the first data source along a time dimension and a geographic dimension; the monitoring data set is obtained by text processing of the monitoring data source (including word segmentation, word frequency analysis, stem extraction, and semantic analysis); and the second data set may be an existing data set.
Optionally, in one implementation of this embodiment, the first data set, the second data set, and the monitoring data set contain data from the same field. Taking the catering field as an example, the first data set is a scene feature library for the catering field, the second data set includes dish data and merchant data, and the monitoring data set is obtained from useful catering-field information mined from external sources.
Optionally, in one implementation of this embodiment, for each item of second data, at least part of the first data to which the second data is actually mapped, or a combination of that at least part of the first data, is selected as the search scene.
For example, take the dish word "deep-fried twisted dough sticks" in the second data set, and assume that the first data it is mapped to include "breakfast", "north", "staple food", "fried food", "traditional Chinese" and so on. Among these, "breakfast" occurs most frequently and is the most representative. Therefore, in the mapping data of the dish word "deep-fried twisted dough sticks", "breakfast" can be ranked first among all features, with the largest weight. Then, in processing 14, "breakfast" can be chosen as the search scene of deep-fried twisted dough sticks. Of course, at least some of the mapped words can also be combined to form a scene, for example "northern breakfast". In other words, in this implementation, first data, or a combination of first data, whose weight satisfies a preset condition (for example, a weight ranking) can be selected as the corresponding search scene according to the weights of the matched monitoring data.
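By way of illustration only, the weight-based selection described above can be sketched as follows. The function name `choose_search_scene` and all numeric weights are hypothetical; the embodiment does not prescribe concrete values or a particular preset condition, and a simple top-N ranking is assumed here.

```python
from typing import Dict, List

def choose_search_scene(mapped: Dict[str, float], top_n: int = 1) -> List[str]:
    """Return the top_n mapped features, ordered by descending weight."""
    ranked = sorted(mapped.items(), key=lambda kv: kv[1], reverse=True)
    return [name for name, _ in ranked[:top_n]]

# Hypothetical weights for the features mapped to "deep-fried twisted dough sticks".
dough_stick_features = {
    "breakfast": 0.42,
    "north": 0.21,
    "staple food": 0.17,
    "fried food": 0.12,
    "traditional Chinese": 0.08,
}

single_scene = choose_search_scene(dough_stick_features)       # one feature as the scene
scene_parts = choose_search_scene(dough_stick_features, 2)     # parts of a combined scene
```

With these assumed weights, the single scene is "breakfast", and the two highest-ranked features ("breakfast" and "north") are the parts from which a combined scene such as "northern breakfast" could be formed.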
In this implementation, the feature words corresponding to "deep-fried twisted dough sticks" can be screened by their frequency, and the weight of each feature word can be optimized according to that frequency. Using the frequency of feature words as an auxiliary parameter to adjust or correct the weights mitigates possible inaccuracy in the weight description, and also helps ensure the accuracy of the actual data mapping obtained by the weight-based adjustment.
The frequency of a feature word refers to the number of times the feature word was recorded during the data collection and statistics stage of the first data set. For example, suppose that during the data collection stage of the first data set, a total of 723 phrases were counted whose main semantics is "breakfast" (or a synonymous expression); then, in the first data set, the word frequency of the feature word "breakfast" is 723 / (total number of occurrences of all feature words).
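By way of illustration only, the word frequency calculation can be sketched as below. The count 723 comes from the example above; the counts for the other feature words are invented so that the total is easy to check.

```python
from collections import Counter

def feature_word_frequency(counts: Counter, word: str) -> float:
    """Frequency = occurrences of `word` / occurrences of all feature words."""
    total = sum(counts.values())
    return counts[word] / total if total else 0.0

# 723 is taken from the example; the other counts are hypothetical.
counts = Counter({"breakfast": 723, "lunch": 1100, "dinner": 1177})
breakfast_frequency = feature_word_frequency(counts, "breakfast")  # 723 / 3000
```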
Below, taking the application of the present invention to the catering field as an example, the details of the present invention are described.
Fig. 2 is a schematic flowchart of a method for building a scene feature library according to an embodiment of the present invention. The scene feature library is a specific implementation of the first data set. Referring to Fig. 2, the method includes:
First, a first data source is obtained. The first data source includes user behavior data and data mined from external sources. User behavior data mainly reflects user behavior along the time dimension: click and browsing records of users are collected by a client (for example, an APP client), and on the server side these behaviors are collected and organized in chronological order. For example, the behavior data of user A at 11:00 on November 3, 2016 is "open APP -> browse home page -> scroll the menu down to page 3 -> after staying for 2 seconds, select and enter the 3rd merchant -> select product X on the merchant details page -> enter the order page -> select payment method and delivery location information", and so on. Data mined from external sources includes the public menus, cooking methods, catering categories and other information of mainstream professional catering websites.
Then, the first data source is analyzed by a data analysis subsystem to obtain basic time-scene data, basic festival-scene data and basic geographic-information data. Specifically, using text pattern matching, the first data source is divided into basic features such as four basic time scenes (for example, breakfast, lunch and dinner), basic festival scenes such as traditional Chinese and Western holidays, and user delivery scenes based on geographic information.
Then, after the basic feature information is obtained, a feature filtering model is trained and optimized by a fitting algorithm to filter the feature data, removing information that is incorrectly associated or does not belong to the catering field, so that the data in the feature library is sound.
The feature filtering model is trained because the initial feature data to be filtered often contains various kinds of noise. For example, when the original scene features are extracted, the search word "cigarette" may produce the two scene features "breakfast" and "dessert". Obviously this is a misidentification caused by dirty data and needs to be filtered out. Therefore, by manually setting the expected target state of the model and applying the fitting process, the filter conditions can be made progressively more accurate, and feature library data with weak logical association can then be filtered out.
Through the above processing, the scene feature library is obtained. By way of example, the basic data structure of the scene feature library is shown in the following table:
Feature ID | Feature name | Feature classification | Feature weight | Feature relation |
(Table one)
Referring to Table one, the feature ID is the unique identifier of each feature; search scene recognition uses this ID to call the relevant feature. The feature name makes it convenient for feature library administrators to inspect and display information. The feature classification indicates the category to which a feature belongs; for example, features can be divided into first-level features, second-level features and third-level subclasses. More specifically, "breakfast" is a first-level feature that contains the second-level feature "weight-loss breakfast", which in turn contains third-level features such as "tuna products".
The feature weight represents the influence factor of the feature in the feature library, and is calculated as:
$$W_i = \theta \cdot \frac{C_i}{\sum_{j \ge 0} C_j} + \mathrm{Punishment} \qquad (i \ge 0,\ j \text{ starting from } 0)$$
Here $W_i$ is the weight (also called the influence factor) of the i-th feature. θ is a manually set positive incentive parameter, used to weaken the interference caused by the noise mentioned above. $C_i$ is the quantity obtained for the name of the i-th feature in the training data through word segmentation, word frequency analysis and semantic analysis (for these steps, refer to the description of the monitoring data below); the training data is the first data source described above. Punishment is a penalty factor, used to correct weights whose influence is excessive because of over-fitting.
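By way of illustration only, the weight formula can be evaluated as in the following sketch. It assumes that C_i is a non-negative count per feature and that the penalty term may differ per feature and is negative when a weight is to be reduced; neither assumption is stated explicitly in the embodiment, and all numbers are hypothetical.

```python
from typing import Dict

def feature_weights(counts: Dict[str, float], theta: float,
                    punishment: Dict[str, float]) -> Dict[str, float]:
    """W_i = theta * C_i / sum_j C_j + Punishment_i (per-feature penalty is an assumption)."""
    total = sum(counts.values())
    weights = {}
    for name, c in counts.items():
        share = theta * c / total if total else 0.0
        weights[name] = share + punishment.get(name, 0.0)
    return weights

# Hypothetical counts, incentive parameter and penalty.
counts = {"breakfast": 60, "dinner": 30, "dessert": 10}
weights = feature_weights(counts, theta=1.0, punishment={"breakfast": -0.1})
```

Under these assumptions "breakfast" receives 0.6 from its share of the counts, which the penalty then lowers to about 0.5.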
The feature relation represents the relation between features, of which there are three kinds: approximation, mutual exclusion and inclusion. For example, "breakfast" and "dinner" are mutually exclusive features. Feature relation information plays an important role in optimizing the subsequent feature mapping: by comparing feature weights and feature relations, erroneous mapping results can be filtered out more accurately.
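By way of illustration only, one way of combining feature weights with the mutual-exclusion relation is sketched below. The rule used here (among mutually exclusive features, keep the one with the larger weight) is an assumption; the embodiment only states that weights and relations are compared.

```python
from typing import Dict, FrozenSet, Set

def drop_mutually_exclusive(mapped: Dict[str, float],
                            exclusive: Set[FrozenSet[str]]) -> Dict[str, float]:
    """Visit features by descending weight; skip one that excludes a feature already kept."""
    kept: Dict[str, float] = {}
    for name, weight in sorted(mapped.items(), key=lambda kv: kv[1], reverse=True):
        if any(frozenset((name, other)) in exclusive for other in kept):
            continue
        kept[name] = weight
    return kept

# Hypothetical mapping result containing two mutually exclusive time scenes.
mapped = {"breakfast": 0.6, "staple food": 0.3, "dinner": 0.2}
exclusive = {frozenset(("breakfast", "dinner"))}
filtered = drop_mutually_exclusive(mapped, exclusive)
```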
With the method provided by this embodiment, data processing is carried out by a fully automatic flow; in particular, dividing massive data along the time dimension and the geographic-information dimension effectively reduces the time wasted on data mining and manual review, and improves the performance of overall strategy evaluation. In addition, to make the feature library more descriptive and representative, the feature library can be optimized a second time through reverse excitation from the feature model. Compared with traditional feature extraction techniques, accuracy is higher and the features contained are more representative.
Fig. 3 is a schematic flowchart of a method for obtaining monitoring data according to an embodiment of the present invention. The method performs text processing on catering-field information to obtain monitoring data, where monitoring data refers to data suitable for a supervised model (a basic machine learning method). Specifically, as shown in Fig. 3, the method includes:
30: Obtain catering-field information. The catering-field information can be extracted by a web crawler from data mined from external sources.
32: Word segmentation. Specifically, a word segmentation tool can be used. For example, the wordseg segmentation tool works by matching a word dictionary generated from massive data against a piece of catering information; each phrase that matches successfully is treated as a candidate segment, and the segmentation with the highest degree of matching is selected according to the word weights provided by the dictionary and taken as the final result. After segmentation, a piece of catering information becomes a set of phrases. For example, if the text "The main ingredients of sweet and sour pork tenderloin include pork tenderloin, starch, tomato, etc." is regarded as catering information, the phrase set after segmentation is {"sweet and sour pork tenderloin", "main ingredients", "pork tenderloin", "starch", "tomato"}.
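By way of illustration only, dictionary-based segmentation with word weights can be sketched as below. This is not the wordseg tool itself; it is a small dynamic-programming stand-in that picks the set of dictionary phrases with the highest total weight, and the dictionary entries and weights are hypothetical.

```python
from typing import Dict, List, Tuple

def segment(text: str, dictionary: Dict[str, float]) -> List[str]:
    """Return the dictionary phrases in `text` whose total weight is highest."""
    n = len(text)
    max_len = max(map(len, dictionary), default=0)
    # best[i] = (score, phrases) for the prefix text[:i]
    best: List[Tuple[float, List[str]]] = [(0.0, [])] * (n + 1)
    for i in range(1, n + 1):
        # Option 1: treat text[i-1] as a character outside any dictionary phrase.
        best[i] = best[i - 1]
        # Option 2: end a dictionary phrase exactly at position i.
        for j in range(max(0, i - max_len), i):
            word = text[j:i]
            if word in dictionary:
                score = best[j][0] + dictionary[word]
                if score > best[i][0]:
                    best[i] = (score, best[j][1] + [word])
    return best[n][1]

# Hypothetical word dictionary with weights.
dictionary = {
    "sweet and sour pork tenderloin": 5.0,
    "pork tenderloin": 2.0,
    "main ingredients": 3.0,
    "starch": 2.0,
    "tomato": 2.0,
}
text = "sweet and sour pork tenderloin: main ingredients include pork tenderloin, starch, tomato"
phrases = segment(text, dictionary)
```

Because the long dish name carries a higher weight than the shorter phrase it contains, it is kept whole at the start of the text, while the later standalone "pork tenderloin" is still recognised.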
34: Word frequency analysis. Specifically, after word segmentation has been performed on every piece of catering-field information, the number of occurrences of each resulting phrase is counted; this count is the word frequency. The main purpose of word frequency analysis is to filter out unneeded words and keep the most representative ones. For example, suppose segmentation of catering-field information yields the two words "chicken cutlet" and "large chicken cutlet". According to word frequency statistics, "chicken cutlet" occurs 12834 times in total and "large chicken cutlet" occurs 231 times; for these two words with similar textual structure, only "chicken cutlet" need be retained.
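By way of illustration only, the frequency-based filtering can be sketched as below. "Similar textual structure" is approximated here by substring containment, which is an assumption; the two counts for the chicken cutlet phrases come from the example above and the third entry is hypothetical.

```python
from typing import Dict

def keep_representative(freqs: Dict[str, int]) -> Dict[str, int]:
    """Drop a phrase when a more frequent phrase is contained in it."""
    kept = {}
    for phrase, count in freqs.items():
        dominated = any(
            other != phrase and other in phrase and other_count > count
            for other, other_count in freqs.items()
        )
        if not dominated:
            kept[phrase] = count
    return kept

freqs = {"chicken cutlet": 12834, "large chicken cutlet": 231, "starch": 40}
representative = keep_representative(freqs)
```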
36: Stem extraction. Specifically, a stem dictionary is used to perform partial match checking against the segmented phrases generated above. For example, "delicious pork tenderloin" can be reduced to "pork tenderloin", with the modifier "delicious" removed. Stem extraction can identify the parts of speech in a phrase and then cut the phrase a second time, so that only the core noun part remains.
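By way of illustration only, stem extraction against a stem dictionary can be sketched as below. Choosing the longest matching stem is an assumption made for this sketch, no part-of-speech tagging is performed, and the stem dictionary is hypothetical.

```python
from typing import Set

def extract_stem(phrase: str, stems: Set[str]) -> str:
    """Return the longest dictionary stem contained in the phrase, or the phrase itself."""
    matches = [stem for stem in stems if stem in phrase]
    return max(matches, key=len) if matches else phrase

stems = {"pork tenderloin", "pork", "starch"}  # hypothetical stem dictionary
stem = extract_stem("delicious pork tenderloin", stems)
```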
38: Semantic analysis. By way of example, semantic analysis based on N-grams (a kind of language model) can be performed. This method rests on the assumption that the occurrence of the N-th word depends only on the preceding N-1 words and not on any other factor; the probability of a phrase is then the product of the probabilities of occurrence of its stems.
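By way of illustration only, the N-gram assumption can be sketched for N = 2 (a bigram model), where each word is conditioned on the single preceding word. The tiny corpus is invented, and no smoothing is applied, so unseen word pairs yield a probability of zero.

```python
from collections import Counter
from typing import List

def bigram_probability(phrase: List[str], corpus: List[List[str]]) -> float:
    """P(phrase) = product of P(word | previous word), estimated by relative counts."""
    history = Counter()   # how often a token appears with a successor
    bigrams = Counter()   # how often a (previous, current) pair appears
    for sentence in corpus:
        tokens = ["<s>"] + sentence
        history.update(tokens[:-1])
        bigrams.update(zip(tokens, tokens[1:]))
    probability = 1.0
    tokens = ["<s>"] + phrase
    for prev, cur in zip(tokens, tokens[1:]):
        if history[prev] == 0:
            return 0.0
        probability *= bigrams[(prev, cur)] / history[prev]
    return probability

corpus = [
    ["fish-flavoured", "shredded", "pork"],
    ["fish-flavoured", "eggplant"],
    ["shredded", "pork"],
    ["twice-cooked", "pork"],
]
p = bigram_probability(["fish-flavoured", "shredded", "pork"], corpus)  # 0.5 * 0.5 * 1.0
```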
Through the above processing 30-38, the monitoring data of the catering field is obtained. By way of example, the structure of the monitoring data is shown in the following table:
Phrase ID | Phrase title | Weight | Penalty factor |
Table two
In the table, the phrase ID uniquely identifies the phrase and is used when calling the monitoring data. The phrase title is used for text matching against the data in the first data set (for example, the feature words in the scene feature library). The weight indicates the importance of the monitoring data. For example, the dish "fish-flavoured shredded pork" is mapped to the three feature words "Sichuan cuisine", "popular" and "fashionable and creative"; in the system's monitoring data, the weights of the two monitoring phrases "Sichuan cuisine" and "popular" are significantly greater than that of "fashionable and creative", so the features remaining after filtering are "Sichuan cuisine" and "popular". At the same time, the system defines the phrase pattern "fish-flavoured XX" as a supervision pattern. The next time a phrase of the form "fish-flavoured XX" is processed, as long as "Sichuan cuisine", "popular" or similar features are present, the supervised model raises the influence factors of these features while limiting the mapping strength of other features. The penalty factor is a correction option for the monitoring data; its value is usually set manually, and the constraint that the monitoring data imposes on features is assessed by manual review after data sampling.
Fig. 4 is a schematic diagram of data mapping logic according to an embodiment of the present invention, illustrating the actual data mapping logic between the scene feature library and catering-field data. Referring to Fig. 4, the data mapping logic is as follows. First, a data mapping is established between the catering-field data (including dish data and merchant data) and the scene feature library. Then, the weight and penalty factor of the monitoring data are read and used to boost and to limit the mapping. Specifically, when the scene feature library is mapped to dish or merchant data, the weight of the monitoring data itself is used to boost those feature words that match monitoring data, while the penalty factor of the monitoring data limits the mapping strength (that is, the weight of the feature words), generating the effective mapping data (that is, the actual mapping data).
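By way of illustration only, the boost-and-limit adjustment can be sketched as below. The embodiment does not give the exact arithmetic, so the multiplicative rule `weight * (1 + monitoring weight) * (1 - penalty factor)`, the threshold and all numbers are assumptions; `MonitoringEntry` simply mirrors the four columns of the monitoring data table.

```python
from dataclasses import dataclass
from typing import Dict, Iterable

@dataclass
class MonitoringEntry:
    phrase_id: int
    phrase_title: str
    weight: float
    penalty_factor: float

def adjust_mapping(initial: Dict[str, float], monitoring: Iterable[MonitoringEntry],
                   threshold: float) -> Dict[str, float]:
    """Boost matched feature words by the monitoring weight, limit them by the
    penalty factor, and drop features whose adjusted weight falls below threshold."""
    by_title = {entry.phrase_title: entry for entry in monitoring}
    actual = {}
    for feature, weight in initial.items():
        entry = by_title.get(feature)  # exact text match; unmatched features are left unchanged
        if entry is not None:
            weight = weight * (1.0 + entry.weight) * (1.0 - entry.penalty_factor)
        if weight >= threshold:
            actual[feature] = weight
    return actual

# Hypothetical initial mapping for the dish "fish-flavoured shredded pork".
initial = {"Sichuan cuisine": 0.4, "popular": 0.4, "fashionable and creative": 0.2}
monitoring = [
    MonitoringEntry(1, "Sichuan cuisine", 0.9, 0.0),
    MonitoringEntry(2, "popular", 0.8, 0.1),
    MonitoringEntry(3, "fashionable and creative", 0.1, 0.5),
]
actual = adjust_mapping(initial, monitoring, threshold=0.3)
```

With these assumed values, the low-weight, heavily penalised feature drops out and the two remaining features keep their relative order.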
Traditional feature mapping techniques suffer from insufficient mapping quality or from over-fitting. The data mapping logic used in this embodiment introduces the concept of monitoring data: a supervised model of catering-field knowledge can be built from third-party data, and during data mapping the scene features of dish names and shop names are filtered based on the monitoring data, improving mapping accuracy.
In this embodiment, after the effective mapping data is generated, the feature words to which each catering-field word (for example, a dish name or merchant name) is mapped can be sorted by feature word frequency. Taking the dish word "deep-fried twisted dough sticks" as an example, the feature words it is mapped to include "breakfast", "north", "staple food", "fried food", "traditional Chinese" and so on, among which the scene feature "breakfast" occurs most frequently and is the most representative. Therefore, in the mapping data of the dish word "deep-fried twisted dough sticks", "breakfast" ranks first among all features and has the largest weight, and "breakfast" can be used as the search scene of deep-fried twisted dough sticks.
Fig. 5 is a schematic flowchart of a data mapping method according to an embodiment of the present invention, illustrating the actual data mapping process between the scene feature library and catering-field data (including dish data and merchant data). Referring to Fig. 5, the method includes:
50: Establish a data mapping between the scene feature library and the catering-field data.
52: Optimize the data mapping based on the monitoring data, for example by means of the weight and penalty factor described above.
54: Determine the search scene corresponding to the catering-field data. For example, for a single item of second data in the second data set, the first data mapped to it are sorted, screened or combined according to frequency of occurrence, weight or other parameters, thereby obtaining the corresponding search scene.
Fig. 6 is a schematic flowchart of a search scene recognition method according to an embodiment of the present invention. Referring to Fig. 6, the method includes:
60: Perform word segmentation on the search terms to obtain search terms as individual words. There may be one or more such words.
Optionally, in one implementation of this embodiment, recognition processing is first performed on the search terms entered by the user; the recognition processing includes simple filtering and first-recall triggering. Filtering means checking the search terms for anomalies; if the search terms are found to be abnormal, for example because they contain illegal characters or sensitive information, the search does not proceed to the next step.
Optionally, in this embodiment, word segmentation can be performed with the segmentation tool mentioned above.
62: Determine, by matching processing, the matched data in the second data set that matches the search term. The second data set has established a data mapping (that is, the actual data mapping) with the first data set using the data mapping method described above. For an explanation of the first data set and the second data set, refer to the description above.
Optionally, in one implementation of this embodiment, the matching processing is text matching, preferably partial matching. Partial matching means that if an item of second data in the second data set matches any word obtained by segmenting the search terms, that item of second data matches the search terms. For example, an approximation calculation is performed between the segmentation result of the search terms and the words of the feature dictionary; if the search term "Sichuan-flavoured twice-cooked pork" is successfully matched with the feature "twice-cooked" in the feature library, it is in fact the word "twice-cooked" within "Sichuan-flavoured twice-cooked pork" that matched the relevant feature.
Rapidly matching catering-field data by partial matching improves matching efficiency on the one hand, and effectively broadens the range of matched scenes on the other.
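By way of illustration only, partial matching can be sketched as below. The approximation calculation is replaced here by plain substring containment in either direction, which is an assumption, and the sample data is hypothetical.

```python
from typing import Iterable, List

def partial_match(search_words: Iterable[str], second_data: Iterable[str]) -> List[str]:
    """An item matches the search terms if it matches any one segmented word."""
    words = [w for w in search_words if w]
    return [
        item for item in second_data
        if any(word in item or item in word for word in words)
    ]

search_words = ["Sichuan-flavoured", "twice-cooked", "pork"]   # segmented search terms
second_data = ["twice-cooked pork", "deep-fried twisted dough sticks",
               "fish-flavoured shredded pork"]
matched = partial_match(search_words, second_data)
```

Two of the three items match here, each through a single shared word, which shows how partial matching widens the set of candidate scenes.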
64: Determine the search scene corresponding to the search terms according to the search scene to which the matched data is mapped.
Optionally, in one implementation of this embodiment, taking the catering field as an example, the first data set is the scene feature library and the second data set is catering-field data. After the search scenes corresponding to the search term are determined, the scenes can be ranked using the scene weights precomputed in the scene feature library.
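By way of illustration only, collecting the scenes of the matched data and ranking them by precomputed scene weight can be sketched as below. The names `scene_of` and `scene_weight`, and all data, are hypothetical.

```python
from typing import Dict, Iterable, List

def scenes_for_search(matched: Iterable[str], scene_of: Dict[str, List[str]],
                      scene_weight: Dict[str, float]) -> List[str]:
    """Collect the scenes mapped to the matched data, ranked by descending scene weight."""
    scenes = []
    for item in matched:
        for scene in scene_of.get(item, []):
            if scene not in scenes:
                scenes.append(scene)
    return sorted(scenes, key=lambda s: scene_weight.get(s, 0.0), reverse=True)

# Hypothetical actual mapping (second data -> search scenes) and scene weights.
scene_of = {
    "deep-fried twisted dough sticks": ["breakfast"],
    "soy milk": ["drink", "breakfast"],
    "steak": ["dinner"],
}
scene_weight = {"breakfast": 0.5, "dinner": 0.3, "drink": 0.2}
ranked_scenes = scenes_for_search(["soy milk", "deep-fried twisted dough sticks"],
                                  scene_of, scene_weight)
```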
Fig. 7 is a schematic flowchart of a search method according to an embodiment of the present invention. Referring to Fig. 7, the method includes:
70: Identify the search scene corresponding to the search terms. For example, the search scene corresponding to the search terms is determined according to the search terms, the second data set and the search scenes to which the second data set is mapped, where the search scenes mapped to the second data set are determined using the data mapping method described above. More specifically, the recognition can use the method shown in Fig. 6.
72: Load the data file corresponding to the search scene. The data file is configured with an optimization strategy for recalled data.
Optionally, in one implementation of this embodiment, the data files corresponding to different scenes are loaded dynamically, so as to obtain search results that match the user's search intent. Dynamic loading here means hot loading, that is, data can be changed in real time without restarting the service. In this embodiment, the ranking strategies of the recall logic are configured as individual data files, and the ranking algorithm is constructed by loading these files. By way of example, the data files of these ranking strategies are shown in the following table:
Tactful ID | Policy name | Policy class | Characterising parameter | Parameter role scope | Extend information |
(table three)
Here, the description parameter and the parameter scope indicate what the strategy acts on. For example, in a distance-based ranking strategy, the description parameter is "distance factor" and the parameter scope is "0 km - 20 km".
74: Optimize the ranking of the recalled data according to the data file.
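By way of illustration only, loading a ranking strategy from a data file and applying it to recalled data can be sketched as below. The JSON layout, the field names and the ranking rule (nearest first within the parameter scope, everything else afterwards) are assumptions; the embodiment only lists the columns of the strategy table. Hot loading is not shown; a service might, for instance, re-read the file whenever it changes.

```python
import json
from typing import Dict, List

# Hypothetical strategy data file content for one search scene.
STRATEGY_JSON = """
{
  "strategy_id": 1,
  "strategy_name": "nearby first",
  "strategy_class": "distance",
  "description_parameter": "distance factor",
  "parameter_scope_km": [0, 20],
  "extension": {}
}
"""

def load_strategy(text: str) -> Dict:
    """Parse one strategy data file."""
    return json.loads(text)

def rerank(recalled: List[Dict], strategy: Dict) -> List[Dict]:
    """Rank results inside the parameter scope by distance; keep the rest after them."""
    low, high = strategy["parameter_scope_km"]
    in_scope = [r for r in recalled if low <= r["distance_km"] <= high]
    out_of_scope = [r for r in recalled if not (low <= r["distance_km"] <= high)]
    return sorted(in_scope, key=lambda r: r["distance_km"]) + out_of_scope

recalled = [
    {"name": "A", "distance_km": 25.0},
    {"name": "B", "distance_km": 3.0},
    {"name": "C", "distance_km": 0.5},
]
ranked = rerank(recalled, load_strategy(STRATEGY_JSON))
```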
The method provided by this embodiment offers a modular computation entry point for search recall, so that a separate ranking optimization strategy can be designed for each search scene, achieving the personalized search effect of "a thousand faces for a thousand people".
The method embodiments according to the present invention have been described in detail above with reference to the drawings. Device embodiments according to the present invention are described below with reference to the drawings.
Fig. 8 is one block diagram of a data processing device for confirming a search scene according to an embodiment of the present invention. Referring to Fig. 8, the data processing device includes: a data mapping establishing module 80, configured to establish an initial data mapping between a first data set and a second data set; a data mapping adjusting module 82, configured to adjust the data mapping according to a monitoring data set to obtain an actual data mapping between the first data set and the second data set; and a search scene mapping module 84, configured to determine the search scene corresponding to the second data in the second data set based on the first data in the first data set to which the second data in the second data set is actually mapped.
Optionally, in one implementation of this embodiment, the monitoring data in the monitoring data set includes, in addition to a phrase title, a weight and/or a penalty factor.
Optionally, in one implementation of this embodiment, the data mapping adjusting module 82 includes: a matching submodule, configured to determine, by text matching, the monitoring data and first data that match each other; a first adjusting submodule, configured to, for each item of second data, modify the mapping relation between the second data and the first data it is initially mapped to, based on the weight of the monitoring data matching that first data; and/or a second adjusting submodule, configured to, for each item of second data, adjust the weight of the first data that the second data is initially mapped to, based on the penalty factor of the monitoring data matching that first data.
Optionally, in one implementation of this embodiment, the search scene mapping module 84 is specifically configured to: for each item of second data, choose at least part of the first data actually mapped to the second data, or a combination of the at least part of the first data, as the search scene. For example, the at least part of the first data is chosen based on the weight of the monitoring data that the first data matches.
Optionally, in one implementation of this embodiment, the first data set is a scene feature library for the catering field, and the second data set includes dish data and merchant data.
Fig. 9 is one block diagram of a search scene recognition device according to an embodiment of the present invention. Referring to Fig. 9, the device includes: a word segmentation module 90, configured to perform word segmentation on search terms to obtain a search term; a matching module 92, configured to determine, by matching processing, the matched data in the second data set that matches the search term; and a determining module 94, configured to determine the search scene corresponding to the search terms according to the search scene to which the matched data is mapped. The search scenes mapped to the second data set are obtained using the method described above.
Fig. 10 is one block diagram of a search device according to an embodiment of the present invention. Referring to Fig. 10, the device includes: a scene determining module 102, configured to determine the search scene corresponding to the search terms according to the search terms, the second data set and the search scenes to which the second data set is mapped (where the scenes mapped to the second data set are determined using the data mapping method described above, or using the search scene recognition device shown in Fig. 9); a loading module 104, configured to load the data file corresponding to the search scene, the data file being configured with an optimization strategy for recalled data; and an optimization module 106, configured to optimize the ranking of the recalled data according to the loaded data file.
The information pushing method and device according to the embodiments of the present invention have been described above with reference to the drawings. Those skilled in the art should understand that the method embodiments or implementations provided by the present invention can be realized by the corresponding device embodiments or implementations provided by the present invention, and that the processing procedures/logic of the device embodiments are consistent with those of the method embodiments. Therefore, for the device embodiments, detailed descriptions of the processing performed or performable by each module and submodule, explanations of specific names, terms and scopes, and descriptions of the beneficial effects of each embodiment and related feature can be found in the corresponding descriptions of the method embodiments and are not repeated here.
In one possible design related to the present invention, the aforementioned data processing device may include a processor and a memory. The memory is used to store a program that supports the data processing device in performing the processing performed by the corresponding modules/submodules described above, and the processor is configured to execute the program stored in the memory.
The program includes one or more computer instructions, which are called and executed by the processor.
More specifically, by executing the computer instructions, the processor is configured to:
establish an initial data mapping between a first data set and a second data set, the first data set including multiple items of first data and the second data set including multiple items of second data;
adjust the initial data mapping according to a monitoring data set to obtain an actual data mapping between the first data set and the second data set;
determine the search scene to which the second data in the second data set is mapped, based on the first data in the first data set to which the second data in the second data set is actually mapped.
Optionally, by executing the computer instructions, the processor may further be configured to: process a first data source along a time dimension and a geographic dimension to obtain the first data set; and perform word segmentation, word frequency analysis, stem extraction and semantic analysis on a monitoring data source to obtain the monitoring data set.
Optionally, the monitoring data in the monitoring data set includes, in addition to a phrase title, a weight and/or a penalty factor. In this case, by executing the computer instructions, the processor may further be configured to:
determine, by text matching, the monitoring data and first data that match each other; for each item of second data, modify the mapping relation between the second data and the first data it is initially mapped to, based on the weight of the monitoring data matching that first data; and/or, for each item of second data, adjust the weight of the first data that the second data is initially mapped to, based on the penalty factor of the monitoring data matching that first data.
Optionally, by executing the computer instructions, the processor may further be configured to: for each item of second data, choose at least part of the first data actually mapped to the second data, or a combination of the at least part of the first data, as the search scene.
Correspondingly, an embodiment of the present invention further provides a computer storage medium for storing the computer software instructions executed by the aforementioned data mapping device, including the program involved when the data mapping device performs the above data mapping method.
In another possible design related to the present invention, the aforementioned search device may include a processor and a memory. The memory is used to store a program that supports the data processing device in performing the processing performed by the corresponding modules/submodules, and the processor is configured to execute the program stored in the memory.
The program includes one or more computer instructions, which are called and executed by the processor.
More specifically, by executing the computer instructions, the processor is configured to: determine the search scene corresponding to the search terms according to the search terms, the second data set and the search scenes to which the second data set is mapped, where the search scenes mapped to the second data set are determined using the aforementioned data mapping method; load the data file corresponding to the search scene, the data file being configured with an optimization strategy for recalled data; and optimize the ranking of the recalled data according to the data file.
Correspondingly, an embodiment of the present invention further provides a computer storage medium for storing the computer software instructions executed by the aforementioned search device, including the program involved when the search device performs the search method described above.
Those skilled in the art will clearly understand that, for convenience and brevity of description, the specific working processes of the systems, devices and units described above may be found in the corresponding processes of the foregoing method embodiments and are not repeated here.
The device embodiments described above are merely illustrative. Units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art can understand and implement it without creative effort.
From the above description of the embodiments, those skilled in the art can clearly understand that each embodiment can be implemented by software plus a necessary general-purpose hardware platform, or of course by hardware. Based on this understanding, the essence of the above technical solutions, or the part that contributes to the prior art, can be embodied in the form of a software product. The computer software product can be stored in a computer-readable storage medium, such as ROM/RAM, a magnetic disk or an optical disc, and includes a number of instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform the methods described in the embodiments or in certain parts of the embodiments.
Finally, it should be noted that the above embodiments are intended only to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features may be replaced by equivalents, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention.
The present invention discloses A1, a data processing method for confirming a search scene, comprising:
establishing an initial data mapping between a first data set and a second data set, the first data set comprising multiple items of first data and the second data set comprising multiple items of second data;
adjusting the initial data mapping according to a monitoring data set to obtain an actual data mapping between the first data set and the second data set;
determining, based on the first data in the first data set to which the second data in the second data set is actually mapped, the search scene to which the second data in the second data set is mapped.
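As a rough illustration only, the three steps of A1 could be sketched as follows. The toy data, the function names, the uniform initial weights, and the choice of the highest-weighted first data as the scene are all assumptions made for this sketch and are not taken from the disclosure.

```python
# Hypothetical toy data: first data = scene features, second data = dish names.
FIRST_DATA = ["breakfast", "lunch", "late-night snack"]
SECOND_DATA = ["soy milk", "fried rice"]


def build_initial_mapping(second_data, first_data):
    """Step 1 (assumed form): map each second-data item to every first-data item, weight 1.0."""
    return {s: {f: 1.0 for f in first_data} for s in second_data}


def adjust_mapping(initial, monitoring):
    """Step 2 (assumed form): rescale weights with monitoring factors keyed by (second, first)."""
    return {
        s: {f: w * monitoring.get((s, f), 1.0) for f, w in targets.items()}
        for s, targets in initial.items()
    }


def determine_scene(actual, second_item):
    """Step 3 (assumed form): take the highest-weighted first data as the search scene."""
    targets = actual[second_item]
    return max(targets, key=targets.get)


# Hypothetical monitoring factors.
monitoring = {("soy milk", "breakfast"): 2.0, ("fried rice", "lunch"): 1.5}
actual = adjust_mapping(build_initial_mapping(SECOND_DATA, FIRST_DATA), monitoring)
scene = determine_scene(actual, "soy milk")
```

Here the monitoring factors are keyed directly by pairs for brevity; the disclosure instead matches monitoring data to first data by text matching.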
A2. The method according to A1, wherein the first data set is a scene feature library of the catering domain, and the second data set comprises dish data and merchant data.
A3. The method according to A1, further comprising: processing a first data source according to a time dimension and a geographic dimension to obtain the first data set.
A4. The method according to A1, further comprising:
performing text processing on a monitoring data source (including word segmentation, word frequency analysis, stem extraction, and semantic analysis) to obtain the monitoring data set.
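A minimal sketch of such text processing is given below, using only the Python standard library. It covers word segmentation, a toy stem extraction, and word frequency analysis; semantic analysis is omitted. Deriving a weight from relative word frequency is an assumption of this sketch, and all function names are hypothetical.

```python
import re
from collections import Counter


def segment(text):
    """Word segmentation stand-in: lowercase, then split on non-letter characters."""
    return re.findall(r"[a-z]+", text.lower())


def naive_stem(word):
    """Toy stem extraction: strip a trailing 's' from words longer than three letters."""
    return word[:-1] if len(word) > 3 and word.endswith("s") else word


def build_monitoring_set(source_texts):
    """Word frequency analysis: return {phrase: relative frequency} as an assumed weight."""
    counts = Counter(naive_stem(w) for t in source_texts for w in segment(t))
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}


monitoring_set = build_monitoring_set(["Noodles for breakfast", "breakfast noodle"])
```

For Chinese text, the regular-expression split would have to be replaced by a proper word segmenter; the structure of the pipeline stays the same.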
A5. The method according to any one of A1 to A4, wherein the monitoring data in the monitoring data set comprises a weight and/or a penalty factor.
A6. The method according to A5, wherein adjusting the initial data mapping according to the monitoring data set comprises:
determining, by text matching, the monitoring data and the first data that match each other;
for each item of second data, modifying the mapping relationship between the second data and the first data to which it is initially mapped, based on the weight of the monitoring data that matches the first data to which the second data is initially mapped; and/or
for each item of second data, adjusting the weight of the first data to which the second data is initially mapped, based on the penalty factor of the monitoring data that matches the first data to which the second data is initially mapped.
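The adjustment in A6 might look like the sketch below. The substring matcher, the rule that a low monitoring weight removes a mapping, the multiplication by the penalty factor, and the `min_weight` parameter are all assumptions made for illustration; the text above does not fix any of them.

```python
from dataclasses import dataclass


@dataclass
class MonitoringItem:
    phrase: str
    weight: float = 1.0   # used here to decide whether a mapping is kept
    penalty: float = 1.0  # used here to scale the weight of the mapped first data


def text_match(phrase, first_item):
    """Hypothetical text matching: case-insensitive substring test."""
    return phrase.lower() in first_item.lower()


def adjust(initial, monitoring, min_weight=0.5):
    """Return the actual mapping derived from the initial mapping and the monitoring data."""
    actual = {}
    for second, targets in initial.items():
        kept = {}
        for first, w in targets.items():
            keep = True
            for m in monitoring:
                if not text_match(m.phrase, first):
                    continue
                if m.weight < min_weight:
                    keep = False      # modify the mapping: drop this first data
                w *= m.penalty        # adjust the weight of the first data
            if keep:
                kept[first] = w
        actual[second] = kept
    return actual


initial = {"soy milk": {"breakfast": 1.0, "late-night snack": 1.0, "lunch": 1.0}}
monitoring = [
    MonitoringItem("breakfast", weight=0.9),
    MonitoringItem("snack", weight=0.1),
    MonitoringItem("lunch", weight=0.8, penalty=0.5),
]
actual = adjust(initial, monitoring)
```

In this toy run the "late-night snack" mapping is removed because its matching monitoring item has a low weight, while "lunch" is kept with a reduced weight.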
A7. The method according to any one of A1 to A4 or A6, wherein determining, based on the first data in the first data set to which the second data in the second data set is actually mapped, the search scene corresponding to the second data in the second data set comprises: for each item of second data, selecting at least part of the first data from the first data to which the second data is actually mapped, or a combination of the at least part of the first data, as the search scene.
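One possible reading of this selection step is a top-k choice over the weights of the actual mapping, sketched below. The top-k rule, the `k` parameter, and the "+"-joined combination label are assumptions of this sketch rather than details from the disclosure.

```python
def select_scenes(actual_targets, k=2):
    """Pick the k highest-weighted first data and also return their combination."""
    ranked = sorted(actual_targets.items(), key=lambda kv: kv[1], reverse=True)
    top = [name for name, _ in ranked[:k]]
    return top, "+".join(top)


# Hypothetical actual mapping for one item of second data.
scenes, combined = select_scenes({"breakfast": 0.9, "lunch": 0.2, "weekday": 0.7}, k=2)
```

Either the individual entries in `scenes` or the single `combined` label could then serve as the search scene for that item of second data.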
The invention also discloses B8, a search method, comprising:
determining the search scene corresponding to a search term according to the search term, a second data set, and the search scene to which the second data set is mapped, wherein the search scene to which the second data set is mapped is determined by the method according to any one of A1 to A7;
loading a data file corresponding to the search scene, the data file being configured with an optimization strategy for recalled data;
performing optimized ranking of the recalled data according to the data file.
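The search flow of B8 can be sketched as follows. The JSON layout of the data file, the `boost_field` and `boost` keys, the in-memory lookup table, and the scoring rule are all hypothetical; the disclosure only states that the data file configures an optimization strategy for the recalled data.

```python
import json

# Hypothetical result of the mapping method: second data -> search scene.
SCENE_OF = {"soy milk": "breakfast"}

# Hypothetical data file content, one strategy per search scene.
CONFIG_JSON = '{"breakfast": {"boost_field": "opens_early", "boost": 2.0}}'


def scene_for(term):
    """Determine the search scene for a search term by exact lookup (assumed)."""
    return SCENE_OF.get(term)


def load_strategy(scene, raw=CONFIG_JSON):
    """Load the optimization strategy configured for the scene."""
    return json.loads(raw).get(scene, {})


def rerank(recalled, strategy):
    """Re-rank recalled items, boosting those that have the configured field set."""
    field = strategy.get("boost_field")
    boost = strategy.get("boost", 1.0)

    def score(item):
        return item["score"] * (boost if field and item.get(field) else 1.0)

    return sorted(recalled, key=score, reverse=True)


recalled = [
    {"name": "A", "score": 1.0, "opens_early": False},
    {"name": "B", "score": 0.6, "opens_early": True},
]
ranked = rerank(recalled, load_strategy(scene_for("soy milk")))
```

With an unknown search term, `scene_for` returns `None`, the strategy is empty, and the recalled data keeps its original score order.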
The invention also discloses C9, a data processing device for confirming a search scene, comprising:
a data mapping establishing module, configured to establish an initial data mapping between a first data set and a second data set, the first data set comprising multiple items of first data and the second data set comprising multiple items of second data;
a data mapping adjusting module, configured to adjust the initial data mapping according to a monitoring data set to obtain an actual data mapping between the first data set and the second data set;
a search scene mapping module, configured to determine, based on the first data in the first data set to which the second data in the second data set is actually mapped, the search scene corresponding to the second data in the second data set.
C10. The device according to C9, wherein the first data set is a scene feature library of the catering domain, and the second data set comprises dish data and merchant data.
C11. The device according to C9, further comprising a first data processing module, configured to process a first data source according to a time dimension and a geographic dimension to obtain the first data set.
C12. The device according to C9, further comprising a monitoring data processing module, configured to perform text processing on a monitoring data source (for example, including word segmentation, word frequency analysis, stem extraction, and semantic analysis) to obtain the monitoring data set.
C13. The device according to any one of C9 to C12, wherein the monitoring data in the monitoring data set comprises, in addition to a phrase name, a weight and/or a penalty factor.
C14. The device according to C13, wherein the data mapping adjusting module comprises:
a matching submodule, configured to determine, by text matching, the monitoring data and the first data that match each other;
a first adjusting submodule, configured to, for each item of second data, modify the mapping relationship between the second data and the first data to which it is initially mapped, based on the weight of the monitoring data that matches the first data to which the second data is initially mapped; and/or
a second adjusting submodule, configured to, for each item of second data, adjust the weight of the first data to which the second data is initially mapped, based on the penalty factor of the monitoring data that matches the first data to which the second data is initially mapped.
C15. The device according to any one of C9 to C12 or C14, wherein the search scene mapping module is specifically configured to: for each item of second data, select at least part of the first data from the first data to which the second data is actually mapped, or a combination of the at least part of the first data, as the search scene.
The invention also discloses D16, a search device, comprising:
a scene determining module, configured to determine the search scene corresponding to a search term according to the search term, a second data set, and the search scene to which the second data set is mapped, wherein the search scene to which the second data set is mapped is determined by the method according to any one of A1 to A7;
a loading module, configured to load a data file corresponding to the search scene, the data file being configured with an optimization strategy for recalled data;
an optimization module, configured to perform optimized ranking of the recalled data according to the loaded data file.
The invention also discloses E1, a data mapping device, comprising a memory and a processor, wherein:
the memory is configured to store one or more computer instructions, the one or more computer instructions being called and executed by the processor;
the processor executes the computer instructions to perform the following processing:
establishing an initial data mapping between a first data set and a second data set, the first data set comprising multiple items of first data and the second data set comprising multiple items of second data;
adjusting the initial data mapping according to a monitoring data set to obtain an actual data mapping between the first data set and the second data set;
determining, based on the first data in the first data set to which the second data in the second data set is actually mapped, the search scene to which the second data in the second data set is mapped.
E2. The data mapping device according to E1, wherein the first data set is a scene feature library of the catering domain, and the second data set comprises dish data and merchant data.
E3. The data mapping device according to E1, wherein the processor executes the computer instructions to perform the following processing: processing a first data source according to a time dimension and a geographic dimension to obtain the first data set.
E4. The data mapping device according to E1, wherein the processor executes the computer instructions to perform the following processing: performing text processing on a monitoring data source (for example, including word segmentation, word frequency analysis, stem extraction, and semantic analysis) to obtain the monitoring data set.
E5. The data mapping device according to any one of E1 to E4, wherein the monitoring data in the monitoring data set comprises a weight and/or a penalty factor.
E6. The data mapping device according to E5, wherein the processor executes the computer instructions to perform the following processing: determining, by text matching, the monitoring data and the first data that match each other; for each item of second data, modifying the mapping relationship between the second data and the first data to which it is initially mapped, based on the weight of the monitoring data that matches the first data to which the second data is initially mapped; and/or, for each item of second data, adjusting the weight of the first data to which the second data is initially mapped, based on the penalty factor of the monitoring data that matches the first data to which the second data is initially mapped.
E7. The device according to any one of E1 to E4 or E6, wherein the processor executes the computer instructions to perform the following processing: for each item of second data, selecting at least part of the first data from the first data to which the second data is actually mapped, or a combination of the at least part of the first data, as the search scene.
The invention also discloses F1, a search device, comprising a memory and a processor, wherein:
the memory is configured to store one or more computer instructions, the one or more computer instructions being called and executed by the processor;
the processor executes the computer instructions to perform the following processing: determining the search scene corresponding to a search term according to the search term, a second data set, and the search scene to which the second data set is mapped, wherein the search scene to which the second data set is mapped is determined by the method according to any one of A1 to A7; loading a data file corresponding to the search scene, the data file being configured with an optimization strategy for recalled data; and performing optimized ranking of the recalled data according to the data file.
Claims (10)
1. A data processing method for confirming a search scene, characterized in that the method comprises:
establishing an initial data mapping between a first data set and a second data set, the first data set comprising multiple items of first data and the second data set comprising multiple items of second data;
adjusting the initial data mapping according to a monitoring data set to obtain an actual data mapping between the first data set and the second data set;
determining, based on the first data in the first data set to which the second data in the second data set is actually mapped, the search scene to which the second data in the second data set is mapped.
2. The method according to claim 1, characterized in that
the monitoring data in the monitoring data set comprises a weight and/or a penalty factor.
3. The method according to claim 2, characterized in that adjusting the initial data mapping according to the monitoring data set comprises:
determining, by text matching, the monitoring data and the first data that match each other;
for each item of second data, modifying the mapping relationship between the second data and the first data to which it is initially mapped, based on the weight of the monitoring data that matches the first data to which the second data is initially mapped; and/or
for each item of second data, adjusting the weight of the first data to which the second data is initially mapped, based on the penalty factor of the monitoring data that matches the first data to which the second data is initially mapped.
4. The method according to any one of claims 1 to 3, characterized in that determining, based on the first data in the first data set to which the second data in the second data set is actually mapped, the search scene corresponding to the second data in the second data set comprises:
for each item of second data, selecting at least part of the first data from the first data to which the second data is actually mapped, or a combination of the at least part of the first data, as the search scene.
5. A search method, characterized in that the method comprises:
determining the search scene corresponding to a search term according to the search term, a second data set, and the search scene to which the second data set is mapped, wherein the search scene to which the second data set is mapped is determined by the method according to any one of claims 1 to 4;
loading a data file corresponding to the search scene, the data file being configured with an optimization strategy for recalled data;
performing optimized ranking of the recalled data according to the data file.
6. A data processing device for confirming a search scene, characterized in that the device comprises:
a data mapping establishing module, configured to establish an initial data mapping between a first data set and a second data set, the first data set comprising multiple items of first data and the second data set comprising multiple items of second data;
a data mapping adjusting module, configured to adjust the initial data mapping according to a monitoring data set to obtain an actual data mapping between the first data set and the second data set;
a search scene mapping module, configured to determine, based on the first data in the first data set to which the second data in the second data set is actually mapped, the search scene corresponding to the second data in the second data set.
7. The device according to claim 6, characterized in that
the monitoring data in the monitoring data set comprises a weight and/or a penalty factor.
8. The device according to claim 7, characterized in that the data mapping adjusting module comprises:
a matching submodule, configured to determine, by text matching, the monitoring data and the first data that match each other;
a first adjusting submodule, configured to, for each item of second data, modify the mapping relationship between the second data and the first data to which it is initially mapped, based on the weight of the monitoring data that matches the first data to which the second data is initially mapped; and/or
a second adjusting submodule, configured to, for each item of second data, adjust the weight of the first data to which the second data is initially mapped, based on the penalty factor of the monitoring data that matches the first data to which the second data is initially mapped.
9. The device according to any one of claims 6 to 8, characterized in that the search scene mapping module is specifically configured to:
for each item of second data, select at least part of the first data from the first data to which the second data is actually mapped, or a combination of the at least part of the first data, as the search scene.
10. A search device, characterized in that the device comprises:
a scene determining module, configured to determine the search scene corresponding to a search term according to the search term, a second data set, and the search scene to which the second data set is mapped, wherein the search scene to which the second data set is mapped is determined by the method according to any one of claims 1 to 4;
a loading module, configured to load a data file corresponding to the search scene, the data file being configured with an optimization strategy for recalled data;
an optimization module, configured to perform optimized ranking of the recalled data according to the loaded data file.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710243857.XA CN107145525B (en) | 2017-04-14 | 2017-04-14 | Data processing method for confirming search scene, search method and corresponding device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN107145525A (en) | 2017-09-08 |
CN107145525B (en) | 2020-10-16 |
Family
ID=59773563
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710243857.XA Expired - Fee Related CN107145525B (en) | 2017-04-14 | 2017-04-14 | Data processing method for confirming search scene, search method and corresponding device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107145525B (en) |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102033877A (en) * | 2009-09-27 | 2011-04-27 | 阿里巴巴集团控股有限公司 | Search method and device |
CN102612704A (en) * | 2009-10-19 | 2012-07-25 | Metaio有限公司 | Method of providing a descriptor for at least one feature of an image and method of matching features |
CN102665838A (en) * | 2009-11-11 | 2012-09-12 | 微软公司 | Methods and systems for determining and tracking extremities of a target |
CN104239907A (en) * | 2014-07-16 | 2014-12-24 | 华南理工大学 | Far infrared pedestrian detection method for changed scenes |
US9141871B2 (en) * | 2011-10-05 | 2015-09-22 | Carnegie Mellon University | Systems, methods, and software implementing affine-invariant feature detection implementing iterative searching of an affine space |
CN105335391A (en) * | 2014-07-09 | 2016-02-17 | 阿里巴巴集团控股有限公司 | Processing method and device of search request on the basis of search engine |
CN105956189A (en) * | 2016-06-08 | 2016-09-21 | 北京百度网讯科技有限公司 | Artificial intelligence-based information recommendation method and apparatus |
Also Published As
Publication number | Publication date |
---|---|
CN107145525B (en) | 2020-10-16 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109522556B (en) | Intention recognition method and device | |
WO2017167069A1 (en) | Resume assessment method and apparatus | |
CN110765257A (en) | Intelligent consulting system of law of knowledge map driving type | |
KR100816934B1 (en) | Clustering system and method using search result document | |
CN107766371A (en) | A kind of text message sorting technique and its device | |
CN111881302B (en) | Knowledge graph-based bank public opinion analysis method and system | |
CN107194617B (en) | App software engineer soft skill classification system and method | |
CN110472203B (en) | Article duplicate checking and detecting method, device, equipment and storage medium | |
CN103544307B (en) | A kind of multiple search engine automation contrast evaluating method independent of document library | |
CN105608075A (en) | Related knowledge point acquisition method and system | |
CN107491447A (en) | Establish inquiry rewriting discrimination model, method for distinguishing and corresponding intrument are sentenced in inquiry rewriting | |
CN112613321A (en) | Method and system for extracting entity attribute information in text | |
CN106844482A (en) | A kind of retrieval information matching method and device based on search engine | |
CN111191413B (en) | Method, device and system for automatically marking event core content based on graph sequencing model | |
CN116561291A (en) | Intelligent recommendation method and system based on natural language intelligent conversion model | |
CN110795930A (en) | Article title optimization method, system, medium and equipment | |
Homocianu et al. | An Analysis of Scientific Publications on'Decision Support Systems' and'Business Intelligence'Regarding Related Concepts Using Natural Language Processing Tools | |
CN112328812B (en) | Domain knowledge extraction method and system based on self-adjusting parameters and electronic equipment | |
CN106227661B (en) | Data processing method and device | |
CN107145525A (en) | Data processing method, searching method and related device for confirming search scene | |
CN105893363A (en) | A method and a system for acquiring relevant knowledge points of a knowledge point | |
CN110377706A (en) | Search statement method for digging and equipment based on deep learning | |
CN115934899A (en) | IT industry resume recommendation method and device, electronic equipment and storage medium | |
CN106777191A (en) | A kind of search modes generation method and device based on search engine | |
CN113869973A (en) | Product recommendation method, product recommendation system, and computer-readable storage medium |
Legal Events
Code | Title | Description |
---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
CB02 | Change of applicant information | Address after: Building N3, building 12, No. 27, Jiancai Chengzhong Road, Haidian District, Beijing 100096. Applicant after: Beijing Xingxuan Technology Co.,Ltd. Address before: 100085 Beijing, Haidian District on the road to the information on the ground floor of the 1 to the 3 floor of the 2 floor, room 11, 202. Applicant before: Beijing Xiaodu Information Technology Co.,Ltd. |
GR01 | Patent grant | |
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20201016 |