CN107145525A - Data processing method, searching method and related device for confirming search scene - Google Patents

Data processing method, searching method and related device for confirming search scene

Info

Publication number
CN107145525A
Authority
CN
China
Prior art keywords
data
mapping
search
scene
data set
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201710243857.XA
Other languages
Chinese (zh)
Other versions
CN107145525B (en
Inventor
吴霄
梁东
苟秋媛
张潇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Xiaodu Information Technology Co Ltd
Original Assignee
Beijing Xiaodu Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Xiaodu Information Technology Co Ltd
Priority to CN201710243857.XA
Publication of CN107145525A
Application granted
Publication of CN107145525B
Expired - Fee Related
Anticipated expiration

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/245Query processing
    • G06F16/2457Query processing with adaptation to user needs
    • G06F16/24578Query processing with adaptation to user needs using ranking
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/245Query processing
    • G06F16/2457Query processing with adaptation to user needs
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/95Retrieval from the web
    • G06F16/953Querying, e.g. by the use of web search engines
    • G06F16/9535Search customisation based on user profiles and personalisation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00Commerce
    • G06Q30/06Buying, selling or leasing transactions
    • G06Q30/0601Electronic shopping [e-shopping]
    • G06Q30/0623Item investigation
    • G06Q30/0625Directed, with specific intent or strategy

Abstract

Embodiments of the present invention provide a data processing method, a searching method and related devices for confirming a search scene, relating to the fields of data processing and search. The data processing method includes: establishing an initial data mapping between a first data set and a second data set; adjusting the initial data mapping according to a monitoring data set to obtain an actual data mapping between the first data set and the second data set; and determining the search scene to which the second data in the second data set are mapped, based on the first data in the first data set to which the second data are actually mapped. The invention can effectively optimize the data mapping relations and improve mapping precision, thereby improving the precision of the subsequently determined search scene. It can also improve matching efficiency, broaden the range of matched scenes, and improve the accuracy of search results.

Description

Data processing method, searching method and related device for confirming search scene
Technical field
Embodiments of the present invention relate to the fields of data processing and search, and more particularly to a data processing method, a searching method and related devices for confirming a search scene.
Background art
O2O e-commerce platforms have risen rapidly on the internet in recent years, and food delivery based on catering has developed fastest. Users complete purchases by searching for and selecting food in an application, and one core function inevitably involved in this process is search.
Unlike traditional general-purpose text search engines such as Baidu and Google, a search engine for catering e-commerce needs to carry out search tasks using specific search scenes and specialized data sources. For example, for a search for "deep-fried twisted dough sticks", the corresponding scenes should be breakfast, the north, and so on. Put simply, a search scene is the information mined from behind a user's search behavior. For example, for a search for "crayfish", the corresponding search scene is information such as "summer, midnight, group party, seafood". By "associating" these scene data, the results the user expects can be output more accurately.
At present, search scene recognition based on catering-domain knowledge is still at an exploratory stage domestically. In industry, vertical search in the catering field started late but has grown quickly, so upgrades to search scene recognition technology have not kept up with rising demand. In academia, because large-scale, high-value search data are hard to obtain, research in this area has also largely stalled. Yet huge market demand puts great pressure on catering search. Accurate, specialized recognition of search scenes has therefore become the core optimization direction for search engine technology in this field.
In one prior-art approach, scene recognition for vertical e-commerce search in the catering field is mainly done by manual labeling. This approach suffers from high labor costs and from labeling standards that are highly subjective and cannot be made objective and uniform. Even where the prior art supports automation, it is difficult to ensure accurate and specialized recognition of search scenes.
Summary of the invention
To address the defects of the prior art, embodiments of the present invention provide a data processing method, a searching method and related devices for confirming a search scene, which can map search scenes automatically and precisely, improve the recognition accuracy of search scenes, and improve the precision of search results.
In a first aspect, an embodiment of the present invention provides a data processing method for confirming a search scene, including:
establishing an initial data mapping between a first data set and a second data set, the first data set including multiple items of first data and the second data set including multiple items of second data;
adjusting the initial data mapping according to a monitoring data set to obtain an actual data mapping between the first data set and the second data set; and
determining the search scene corresponding to the second data in the second data set, based on the first data in the first data set to which the second data are actually mapped.
In one implementation of the embodiment, the first data set is a scene feature library for the catering field, and the second data set includes dish data and merchant data.
In one implementation of the embodiment, the method further includes: processing a first data source along a time dimension and a geographic dimension to obtain the first data set. Alternatively, the method further includes: performing word segmentation analysis, word frequency analysis, stem extraction and semantic analysis on a monitoring data source to obtain the monitoring data set.
In one implementation of the embodiment, each item of monitoring data in the monitoring data set includes, in addition to a phrase name, a weight and/or a penalty factor.
Further, adjusting the initial data mapping according to the monitoring data set includes:
determining, by text matching, the monitoring data and first data that match each other;
for each item of second data, modifying the mapping relations between the second data and the first data to which it is initially mapped, based on the weight of the monitoring data matching that first data; and/or
for each item of second data, adjusting the weight of the first data to which the second data is initially mapped, based on the penalty factor of the monitoring data matching that first data.
In one implementation of the embodiment, determining the search scene corresponding to the second data in the second data set, based on the first data in the first data set to which the second data are actually mapped, includes: for each item of second data, selecting at least part of the first data to which the second data is actually mapped, or a combination of the at least part of the first data, as the search scene.
In a second aspect, an embodiment of the present invention provides a search scene recognition method, including:
performing word segmentation on a search query to obtain search words;
determining, by matching, the matched data in the second data set that match the search words; and
determining the search scene corresponding to the search query according to the search scene to which the matched data are mapped;
where the search scenes to which the second data set is mapped are determined using the foregoing data processing method.
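The recognition steps of the second aspect can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the mapping table `SCENE_BY_ITEM`, the function names, and the greedy longest-match segmenter (a stand-in for a real word segmentation tool) are all assumptions introduced here.

```python
# Hypothetical second-data items already mapped to search scenes.
SCENE_BY_ITEM = {
    "deep-fried twisted dough sticks": ["breakfast", "north"],
    "crayfish": ["summer", "midnight"],
}


def segment(query: str, vocabulary: set[str]) -> list[str]:
    """Greedy longest-match segmentation over whitespace tokens.

    Assumption: a stand-in for a real word segmenter.
    """
    tokens = query.lower().split()
    words = []
    i = 0
    while i < len(tokens):
        # Try the longest candidate phrase first; fall back to one token.
        for j in range(len(tokens), i, -1):
            phrase = " ".join(tokens[i:j])
            if phrase in vocabulary or j == i + 1:
                words.append(phrase)
                i = j
                break
    return words


def recognize_scenes(query: str, scene_by_item: dict[str, list[str]]) -> list[str]:
    """Segment the query, match words against the second data set,
    and collect the scenes mapped to the matched items."""
    scenes = []
    for word in segment(query, set(scene_by_item)):
        for scene in scene_by_item.get(word, []):
            if scene not in scenes:
                scenes.append(scene)
    return scenes
```

For instance, `recognize_scenes("spicy crayfish", SCENE_BY_ITEM)` segments the input into "spicy" and "crayfish" and returns only the scenes mapped to "crayfish".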
In a third aspect, an embodiment of the present invention further provides a searching method, including:
determining the search scene corresponding to a search query according to the search query, the second data set and the search scenes to which the second data set is mapped, where the scenes to which the second data set is mapped are determined using the foregoing data mapping method (the output of this step is the recognized search scene, and it may be implemented as in the second aspect above);
loading a data file corresponding to the search scene, the data file being configured with an optimization strategy for recalled data; and
ranking the recalled data in an optimized order according to the data file.
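As an illustration of the load-and-rank steps, the sketch below assumes the per-scene data file is JSON containing multiplicative boosts keyed by tag. The file format, the tag names and the scoring rule are all hypothetical; the source only states that the data file configures an optimization strategy for recalled data.

```python
import json

# Hypothetical per-scene strategy files, inlined here as JSON strings.
STRATEGY_FILES = {
    "breakfast": '{"boost": {"open_in_morning": 2.0, "serves_staple_food": 1.5}}',
}


def load_strategy(scene: str) -> dict:
    """Load the strategy for a scene; unknown scenes get an empty strategy."""
    return json.loads(STRATEGY_FILES.get(scene, '{"boost": {}}'))


def rerank(recalled: list[dict], strategy: dict) -> list[dict]:
    """Re-order recalled items by base score multiplied by scene boosts."""
    boosts = strategy["boost"]

    def score(item: dict) -> float:
        value = item["base_score"]
        for tag in item["tags"]:
            value *= boosts.get(tag, 1.0)
        return value

    return sorted(recalled, key=score, reverse=True)


# Made-up recalled results for illustration.
RECALLED = [
    {"name": "A", "base_score": 1.0, "tags": []},
    {"name": "B", "base_score": 0.6, "tags": ["open_in_morning"]},
]
```

With the "breakfast" strategy, item B is boosted above item A; with an unknown scene the base order is kept.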
In a fourth aspect, an embodiment of the present invention provides a data processing device for confirming a search scene, including:
a data mapping establishing module, configured to establish an initial data mapping between a first data set and a second data set, the first data set including multiple items of first data and the second data set including multiple items of second data;
a data mapping adjusting module, configured to adjust the initial data mapping according to a monitoring data set to obtain an actual data mapping between the first data set and the second data set; and
a search scene mapping module, configured to determine the search scene corresponding to the second data in the second data set, based on the first data in the first data set to which the second data are actually mapped.
In one implementation of the embodiment, the first data set is a scene feature library for the catering field, and the second data set includes dish data and merchant data.
In one implementation of the embodiment, the device further includes a first data processing module, configured to process a first data source along a time dimension and a geographic dimension to obtain the first data set. Alternatively, the device further includes a monitoring data processing module, configured to perform word segmentation analysis, word frequency analysis, stem extraction and semantic analysis on a monitoring data source to obtain the monitoring data set.
In one implementation of the embodiment, each item of monitoring data in the monitoring data set includes, in addition to a phrase name, a weight and/or a penalty factor.
Further, the data mapping adjusting module includes: a matching submodule, configured to determine, by text matching, the monitoring data and first data that match each other; a first adjusting submodule, configured to, for each item of second data, modify the mapping relations between the second data and the first data to which it is initially mapped, based on the weight of the monitoring data matching that first data; and/or a second adjusting submodule, configured to, for each item of second data, adjust the weight of the first data to which the second data is initially mapped, based on the penalty factor of the monitoring data matching that first data.
In one implementation of the embodiment, the search scene mapping module is specifically configured to: for each item of second data, select at least part of the first data to which the second data is actually mapped, or a combination of the at least part of the first data, as the search scene.
In a fifth aspect, an embodiment of the present invention provides a search scene recognition device, including:
a word segmentation module, configured to perform word segmentation on a search query to obtain search words;
a matching module, configured to determine, by matching, the matched data in the second data set that match the search words; and
a determining module, configured to determine the search scene corresponding to the search query according to the search scene to which the matched data are mapped;
where the scenes to which the second data set is mapped are determined using the foregoing data mapping method.
In a sixth aspect, an embodiment of the present invention provides a searching device, including:
a scene determining module, configured to determine the search scene corresponding to a search query according to the search query, the second data set and the search scenes to which the second data set is mapped, where the scenes to which the second data set is mapped are determined using the foregoing data mapping method (the output of this module is the recognized search scene, and it may be implemented by the search scene recognition device above);
a loading module, configured to load a data file corresponding to the search scene, the data file being configured with an optimization strategy for recalled data; and
an optimizing module, configured to rank the recalled data in an optimized order according to the loaded data file.
The functions of the search scene recognition device and the searching device may be implemented by hardware, or by hardware executing corresponding software. The hardware or software includes one or more modules corresponding to the above functions.
In one possible design, the structure of the search scene recognition device or the searching device includes a processor and a memory. The memory stores a program that supports the device in performing the corresponding processing described above, and the processor is configured to execute the program stored in the memory. The device may further include a communication interface for communicating with other devices or communication networks.
In a seventh aspect, an embodiment of the present invention provides a computer storage medium for storing the computer software instructions used by the search scene recognition device and/or the searching device, including the program involved in performing the corresponding methods above so that the search scene recognition device and/or the searching device carry out the corresponding data processing.
Embodiments of the present invention can effectively optimize the data mapping relations and improve mapping precision, thereby improving the precision of the subsequently determined search scene. They can also improve matching efficiency and broaden the range of matched scenes, and thus effectively improve the accuracy of search results.
These and other aspects of the invention will be more readily apparent from the following description.
Brief description of the drawings
To describe the technical solutions of the embodiments of the present invention or of the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below show some embodiments of the present invention; those of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a schematic flowchart of a data processing method for confirming a search scene according to an embodiment of the present invention;
Fig. 2 is a schematic flowchart of a method for building a scene feature library according to an embodiment of the present invention;
Fig. 3 is a schematic flowchart of a method for obtaining monitoring data according to an embodiment of the present invention;
Fig. 4 is a schematic diagram of data mapping logic according to an embodiment of the present invention;
Fig. 5 is a schematic flowchart of a data mapping method according to an embodiment of the present invention;
Fig. 6 is a schematic flowchart of a search scene recognition method according to an embodiment of the present invention;
Fig. 7 is a schematic flowchart of a searching method according to an embodiment of the present invention;
Fig. 8 is an example block diagram of a data processing device for confirming a search scene according to an embodiment of the present invention;
Fig. 9 is an example block diagram of a search scene recognition device according to an embodiment of the present invention;
Fig. 10 is an example block diagram of a searching device according to an embodiment of the present invention.
Detailed description
To help those skilled in the art better understand the solutions of the present invention, the technical solutions in the embodiments are described clearly and completely below with reference to the drawings of the embodiments.
Some of the flows described in the specification, the claims and the drawings above contain multiple operations that appear in a particular order. It should be clearly understood that these operations need not be performed in the order in which they appear here and may be performed in parallel. Operation numbers such as 101 and 102 serve only to distinguish the operations and do not themselves imply any execution order. The flows may also include more or fewer operations, and these operations may be performed sequentially or in parallel. Note that terms such as "first" and "second" herein distinguish different messages, devices, modules and the like; they do not indicate an order, nor do they require that the "first" and "second" items be of different types.
First, some terms that the present invention involves or may involve are explained. These explanations are only an aid to understanding and do not limit the embodiments of the invention.
Search technology: building information databases and indexing data for internet data resources, optimizing performance through various software and hardware techniques, and using relevant algorithms and strategies to optimize search accuracy and result ranking.
Scene recognition: deep data mining on search keywords, based on big data and natural language processing, to analyze the search scene in which a keyword occurs and thereby optimize search results at a higher level.
Domain knowledge: the specialized knowledge and skills of an industry field. A field is a defined profession or industry, such as finance, manufacturing or catering. The knowledge framework formed by the expertise, skills and management capabilities in a field is called a knowledge domain.
Natural language processing: the process of, and techniques for, handling natural language information by computer. Natural language means the written or spoken language of human beings, such as Chinese, English or Japanese, as opposed to artificial, formalized computer languages. The key to processing natural language is enabling the computer to understand it.
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings. The described embodiments are only some, not all, of the embodiments of the invention. All other embodiments obtained by those skilled in the art based on these embodiments without creative effort fall within the protection scope of the present invention.
Fig. 1 is a schematic flowchart of a data processing method for confirming a search scene according to an embodiment of the present invention. Referring to Fig. 1, the method includes:
10: Establish an initial data mapping between a first data set and a second data set. The first data set includes multiple items of first data, and the second data set includes multiple items of second data.
In the present invention, the first data set and the second data set contain data on which data mapping can be performed directly. How the first and second data sets are obtained in a specific application environment will become clear from the description below.
Optionally, in this embodiment, processing 10 can also be understood as labeling the second data set with the first data set, thereby establishing the initial mapping relations between the first data set and the second data set.
12: Adjust the initial data mapping according to a monitoring data set to obtain the actual data mapping between the first data set and the second data set.
Optionally, in one implementation of this embodiment, the role of the monitoring data set is to optimize the initial mapping relations obtained by processing 10, for example by limiting mapping strength to prevent over-fitting when labeling with the first data set.
The monitoring data set contains monitoring data. In the present invention, monitoring data can be understood as standardized data samples that serve as a reference and assist in processing such as data filtering, adjustment and optimization.
14: Determine the search scene to which the second data set is mapped, based on the actual data mapping. Specifically, based on the first data in the first data set to which the second data in the second data set are actually mapped, determine the search scene corresponding to the second data in the second data set.
Whereas mappings in the prior art may be insufficiently effective or over-fitted, the method of this embodiment adjusts the data mapping based on monitoring data, which can effectively optimize the data mapping relations, improve mapping precision, and in turn improve the precision of the determined search scene.
Optionally, in one implementation of this embodiment, each item of monitoring data in the monitoring data set includes a phrase name and adjustment parameters, the adjustment parameters including a weight and/or a penalty factor. In this case, processing 12 can be implemented as follows:
First, text matching is used to determine the monitoring data and first data that match each other; for example, the phrase names are matched against the first data in the first data set. Then, for each item of second data, the mapping relations between the second data and the first data to which it is initially mapped are modified based on the weight of the monitoring data matching that first data; and/or, for each item of second data, the weight of the first data to which the second data is initially mapped is adjusted based on the penalty factor of the monitoring data matching that first data.
Modifying the mapping relations between the second data and the first data to which it is initially mapped includes: deleting the mapping relation between the second data and any first data whose matched monitoring data has a weight value that does not satisfy a preset condition; and ordering the mappings between the first data and the second data by the weight values of the monitoring data matched to the first data.
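The adjustment in processing 12 might look like the sketch below. Several details are assumptions introduced for illustration and are not stated in the source: text matching is reduced to exact name equality, the preset condition is a minimum weight threshold, the penalty factor is applied multiplicatively, and first data with no matching monitoring data are kept unchanged.

```python
from dataclasses import dataclass


@dataclass
class Monitor:
    """One item of monitoring data, keyed elsewhere by its phrase name."""
    weight: float
    penalty: float = 1.0  # assumption: 1.0 means "no penalty"


def adjust_mapping(
    initial: dict[str, dict[str, float]],
    monitors: dict[str, Monitor],
    min_weight: float = 0.2,  # hypothetical preset condition
) -> dict[str, list[tuple[str, float]]]:
    """Turn an initial mapping {second: {first: weight}} into an actual mapping."""
    actual = {}
    for second, firsts in initial.items():
        kept = []  # (sort key, first data, adjusted weight)
        for first, weight in firsts.items():
            monitor = monitors.get(first)  # exact-name text match (assumption)
            if monitor is None:
                kept.append((weight, first, weight))  # unmatched: keep as-is
            elif monitor.weight >= min_weight:
                kept.append((monitor.weight, first, weight * monitor.penalty))
            # else: the mapping relation is deleted
        kept.sort(key=lambda item: item[0], reverse=True)
        actual[second] = [(first, adjusted) for _, first, adjusted in kept]
    return actual


# Made-up values for illustration.
INITIAL = {"deep-fried twisted dough sticks": {
    "north": 0.5, "dessert": 0.5, "breakfast": 0.5, "staple food": 0.5}}
MONITORS = {
    "breakfast": Monitor(weight=0.9),
    "north": Monitor(weight=0.6, penalty=0.5),
    "dessert": Monitor(weight=0.05),
}
```

Here "dessert" is dropped because its matched monitoring weight falls below the threshold, the remaining mappings are ordered by monitoring weight, and the weight of "north" is halved by its penalty factor.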
Optionally, in one implementation of this embodiment, the first data set is obtained by processing a first data source along a time dimension and a geographic dimension; the monitoring data set is obtained by text processing of a monitoring data source (including word segmentation analysis, word frequency analysis, stem extraction and semantic analysis); and the second data set may be an existing data set.
Optionally, in one implementation of this embodiment, the first data set, the second data set and the monitoring data set are data from the same field. Taking the catering field as an example, the first data set is a scene feature library for the catering field, the second data set includes dish data and merchant data, and the monitoring data set is derived from valid catering-field information mined from external sources.
Optionally, in one implementation of this embodiment, for each item of second data, at least part of the first data to which the second data is actually mapped, or a combination of the at least part of the first data, is selected as the search scene.
For example, take the dish word "deep-fried twisted dough sticks" in the second data set, and suppose the first data it maps to include "breakfast", "north", "staple food", "fried food", "Chinese tradition" and so on. Of these, "breakfast" occurs most frequently and is the most representative. In the mapping data for this dish word, "breakfast" can therefore be ranked first among all features, with the largest weight. In processing 14, "breakfast" can then be chosen as the search scene for deep-fried twisted dough sticks. Alternatively, at least some of the mapped words can be combined to form a scene, for example "northern breakfast". In other words, in this implementation, the first data, or combination of first data, whose weight satisfies a preset condition (for example, by weight ranking of the matched monitoring data) can be selected as the corresponding search scene.
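The selection in processing 14 can be illustrated with a small sketch. The weights below are made up, and using a top-k cut as the preset condition is an assumption; the source only requires that the chosen first data satisfy some preset condition on weight.

```python
def choose_scene(mapped: list[tuple[str, float]], top_k: int = 1) -> tuple[str, ...]:
    """Pick the top_k highest-weighted first data as the search scene.

    top_k=1 selects a single feature; top_k>1 returns a combination.
    """
    ranked = sorted(mapped, key=lambda pair: pair[1], reverse=True)
    return tuple(name for name, _ in ranked[:top_k])


# Hypothetical actual mapping for "deep-fried twisted dough sticks".
MAPPED = [("north", 0.6), ("breakfast", 0.9), ("staple food", 0.4)]
```

With these values, `choose_scene(MAPPED)` selects "breakfast" alone, while `choose_scene(MAPPED, top_k=2)` returns the pair that would correspond to a combined scene such as "northern breakfast".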
In this implementation, the feature words corresponding to "deep-fried twisted dough sticks" can be screened using the frequency of the feature words, and the weight of each feature word can be optimized according to its frequency. Using the frequency of feature words as an auxiliary parameter to adjust or correct the weights reduces possible inaccuracy in the weights, and thus helps ensure the accuracy of the actual data mapping obtained by weight-based adjustment.
The frequency of a feature word is based on the number of times the feature word is recorded during the data collection and statistics stage of the first data set. For example, suppose that during data collection for the first data set, a total of 723 phrases with "morning meal" or "breakfast" as their main meaning are counted. Then, in the first data set, the word frequency of the feature word "breakfast" is 723 / (total number of occurrences of all feature words).
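The word frequency described above is a simple ratio. The sketch below computes it for every feature word; the count of 723 for "breakfast" comes from the text, while the other counts are invented for illustration.

```python
def feature_frequencies(counts: dict[str, int]) -> dict[str, float]:
    """Frequency of each feature word = its count / total count of all feature words."""
    total = sum(counts.values())
    if total == 0:
        return {word: 0.0 for word in counts}
    return {word: count / total for word, count in counts.items()}


# "breakfast": 723 is from the text; the other counts are hypothetical.
COUNTS = {"breakfast": 723, "north": 500, "staple food": 777}
FREQUENCIES = feature_frequencies(COUNTS)
```

With these made-up totals, "breakfast" accounts for 723 of 2000 recorded feature words.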
Below, taking application of the present invention to the catering field as an example, the details of the invention are described more specifically.
Fig. 2 is a schematic flowchart of a method for building a scene feature library according to an embodiment of the present invention. The scene feature library is one specific implementation of the first data set. Referring to Fig. 2, the method includes:
First, a first data source is obtained. The first data source includes user behavior data and externally mined data. User behavior data mainly reflect user behavior along the time dimension: click and browse records are collected by the client (for example, an app client), and the server organizes these behaviors in chronological order. For example, the behavior data of user A at 11:00 on November 3, 2016 are "open the app -> browse the home page -> scroll the menu down to page 3 -> after pausing 2 seconds, select the 3rd merchant and enter -> select product X on the merchant detail page -> enter the order page -> select payment method and delivery location", and so on. Externally mined data include information such as the public menus, dish recipes and catering categories of mainstream professional catering websites.
Then, a data analysis subsystem analyzes the first data source to obtain basic time-scene data, basic festival-scene data and basic geographic-information data. Specifically, text pattern matching is used to divide the first data source into basic features: the four basic time scenes, including breakfast, lunch and dinner; basic festival scenes such as traditional Chinese and Western holidays; and user delivery scenes based on geographic information.
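As a rough illustration of dividing records into basic scene features by text pattern matching, the sketch below tags a text record with every scene whose pattern it matches. The tag names, the regular expressions and the sample records are all hypothetical; the source does not specify the patterns it uses.

```python
import re

# Hypothetical patterns for time, festival and geographic delivery scenes.
PATTERNS = {
    "time:breakfast": re.compile(r"\b(breakfast|morning)\b", re.I),
    "time:lunch": re.compile(r"\blunch\b", re.I),
    "time:dinner": re.compile(r"\b(dinner|supper)\b", re.I),
    "festival": re.compile(r"\b(spring festival|christmas)\b", re.I),
    "geo:delivery": re.compile(r"\bdeliver(y|ed)? to\b", re.I),
}


def tag_record(text: str) -> list[str]:
    """Return the basic scene tags whose pattern occurs in the record."""
    return [tag for tag, pattern in PATTERNS.items() if pattern.search(text)]
```

A record such as "Quick breakfast set, delivery to Haidian" would receive both a time tag and a delivery tag, while a record matching no pattern receives none.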
Then, after the basic feature information is obtained, a feature filtering model is trained and optimized by a fitting algorithm, and the feature data is filtered to remove associated information that is erroneous or does not belong to the catering field, so that the data in the feature library is reasonable.
The feature filtering model is trained because the initial feature data, before filtering, often contains various kinds of noise. For example, when the original scene features are extracted, the search word "cigarette" may produce the two scene features "breakfast" and "dessert". This is clearly a misidentification caused by dirty data and needs to be filtered out. Therefore, the expected target state of the model is set manually and, through the fitting process, the filter conditions are made progressively more accurate, so that feature library data with weak logical association can be filtered out.
Through the above processing, the scene feature library is obtained. By way of example, the basic data structure of the scene feature library is shown in the following table:
(table one)
Referring to Table One: the feature ID is the unique identifier of each feature, and this ID is used to call the relevant feature during search scene recognition. The feature name is for the convenience of feature library administrators when viewing and displaying information. The feature classification indicates the category to which the feature belongs; for example, features can be divided into first-level features, second-level features and third-level features. More specifically, "breakfast" is a first-level feature that contains the second-level feature "weight-loss breakfast", and this second-level feature in turn contains third-level features such as "tuna products".
The feature weight represents the influence factor of the feature in the feature library, and is calculated as:
$$W_i = \theta \cdot \frac{C_i}{\sum_{j=0} C_j} + \text{Punishment} \qquad (i \ge 0,\ j \text{ starting from } 0)$$
$W_i$ represents the weight (also called the influence factor) of the i-th feature. $\theta$ is a manually set positive incentive parameter, used to weaken the interference caused by the noise mentioned above. $C_i$ represents the count obtained for the name of the i-th feature from the training data through word segmentation, word frequency analysis and semantic analysis (these steps are explained in the description of the monitoring data below); the training data is the first data source described above. Punishment is a penalty factor, used to correct the excessive influence of a weight factor caused by over-fitting.
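A minimal Python sketch of this weight calculation follows. It assumes the literal reading of the formula, in which the penalty is an additive term after the normalized count, and the values of `theta`, `punishment` and the counts are hypothetical.

```python
def feature_weights(counts, theta, punishment):
    """Compute W_i = theta * C_i / sum_j(C_j) + punishment for every feature."""
    total = sum(counts)
    if total == 0:
        raise ValueError("at least one feature must have a non-zero count")
    return [theta * c / total + punishment for c in counts]

# Hypothetical counts for two features, e.g. "breakfast" and "dinner".
print(feature_weights([30, 10], theta=1.0, punishment=0.0))
```

A negative `punishment` lowers every weight by the same amount, which is one simple way to model a correction for over-fitted weights under this reading.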
The feature relation represents the relationship between features, and is one of three kinds: approximate, mutually exclusive and inclusive. For example, "breakfast" and "dinner" are mutually exclusive features. The feature relation information plays an important role in optimizing the subsequent feature mapping: by comparing feature weights and feature relations, erroneous mapping results can be filtered out more accurately.
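One way to picture how weights and the mutually exclusive relation could be compared is the sketch below, which keeps only the heavier feature of each exclusive pair. The rule, the function name `drop_exclusive_conflicts` and the sample weights are assumptions; the source does not state the exact filtering rule.

```python
def drop_exclusive_conflicts(mapped, exclusive_pairs):
    """Remove the lighter feature of every mutually exclusive pair.

    mapped: dict of feature name -> weight for one mapped item.
    exclusive_pairs: iterable of (feature_a, feature_b) tuples.
    """
    kept = dict(mapped)
    for a, b in exclusive_pairs:
        if a in kept and b in kept:
            # On a tie the second feature of the pair is dropped.
            loser = a if kept[a] < kept[b] else b
            del kept[loser]
    return kept

mapped = {"breakfast": 0.8, "dinner": 0.1, "staple food": 0.5}
print(drop_exclusive_conflicts(mapped, [("breakfast", "dinner")]))
```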
With the method provided by this embodiment, data processing is carried out through a fully automated flow; in particular, dividing massive data along the time dimension and the geographic-information dimension effectively reduces the unproductive time cost of data mining and manual review, and improves overall strategy evaluation performance. In addition, to make the feature library more descriptive and representative, the feature library can be optimized a second time by means of reverse incentive from the feature model. Compared with traditional feature extraction techniques, the accuracy is higher and the features included are more representative.
Fig. 3 is a schematic flowchart of a method for obtaining monitoring data according to an embodiment of the present invention. The method performs text processing on catering field information to obtain monitoring data, where monitoring data refers to data suitable for a supervised model (a basic machine learning method). Specifically, as shown in Fig. 3, the method includes:
30: Obtain catering field information. The catering field information can be extracted from the externally mined data by a web crawler.
32: Word segmentation (cutting-word) analysis. Specifically, a word segmentation tool can be used. Taking the wordseg segmentation tool as an example, its basic principle is to match a word dictionary generated from massive data against a passage of catering information. Each successfully matched phrase is treated as a candidate segment, and the segmentation with the highest matching degree is selected according to the word weights provided by the dictionary; this segmentation is taken as the final result. After segmentation, a passage of catering information becomes a set of phrases. For example, the text "the main ingredients of sweet and sour pork tenderloin include pork tenderloin, starch, tomato, etc." is regarded as catering information, and the phrase set after segmentation is {"sweet and sour pork tenderloin", "main ingredients", "pork tenderloin", "starch", "tomato"}.
34: Word frequency analysis. Specifically, after word segmentation analysis has been performed on every passage of catering field information, the number of occurrences of each resulting phrase is counted; this count is the word frequency information. The main purpose of word frequency analysis is to filter out unneeded words and keep the most representative ones. For example, suppose segmentation of the catering field information produces the two words "chicken cutlet" and "large chicken cutlet". According to the word frequency statistics, "chicken cutlet" appears 12834 times in total and "large chicken cutlet" appears 231 times in total; since the two words have a similar textual structure, only "chicken cutlet" is retained.
36: Stem extraction. Specifically, a stem dictionary is used to perform a partial-match check against the phrases generated above. For example, "delicious pork tenderloin" can be reduced to "pork tenderloin", with the modifier "delicious" removed. Stem extraction can identify the part of speech of the words in a phrase and then cut the phrase a second time, so that only the core noun part is left.
38: Semantic analysis. By way of example, semantic analysis based on N-gram (a language model) can be performed. This analysis method rests on the assumption that the occurrence of the N-th word is related only to the preceding N-1 words and to no other factor; the probability of a phrase is then the product of the occurrence probabilities of its stems.
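Steps 32 to 38 can be sketched in Python as four small functions. This is an illustrative sketch under stated assumptions, not the wordseg tool or the actual implementation: the dictionary weights are hypothetical, "similar textual structure" is approximated by substring containment, stem extraction is reduced to a longest-match lookup, and the N-gram model uses N = 2 with unsmoothed maximum-likelihood estimates. The sample strings are English renderings chosen for illustration.

```python
from collections import Counter

def segment(text, dictionary):
    """Step 32: choose the dictionary phrases in `text` with the highest total weight."""
    best = [(0.0, [])]  # best[i] = (score, phrases) for text[:i]
    for i in range(1, len(text) + 1):
        score, phrases = best[i - 1]  # default: leave character i-1 uncovered
        for phrase, weight in dictionary.items():
            start = i - len(phrase)
            if 0 <= start < i and text[start:i] == phrase:
                candidate = best[start][0] + weight
                if candidate > score:
                    score, phrases = candidate, best[start][1] + [phrase]
        best.append((score, phrases))
    return best[-1][1]

def keep_representative(freqs):
    """Step 34: drop a phrase when a more frequent phrase is contained in it."""
    return {
        phrase: count
        for phrase, count in freqs.items()
        if not any(o != phrase and o in phrase and c > count
                   for o, c in freqs.items())
    }

def extract_stem(phrase, stem_dictionary):
    """Step 36: keep the longest dictionary stem found inside the phrase."""
    matches = [stem for stem in stem_dictionary if stem in phrase]
    return max(matches, key=len) if matches else phrase

def bigram_probability(words, corpus):
    """Step 38: product of P(word | previous word), estimated from `corpus`."""
    contexts, pairs = Counter(), Counter()
    for sentence in corpus:
        padded = ["<s>"] + sentence
        contexts.update(padded[:-1])
        pairs.update(zip(padded, padded[1:]))
    prob = 1.0
    padded = ["<s>"] + words
    for prev, word in zip(padded, padded[1:]):
        if contexts[prev] == 0:
            return 0.0  # unseen context
        prob *= pairs[(prev, word)] / contexts[prev]
    return prob

# Hypothetical dictionary weights and sample data.
dictionary = {"sweet and sour pork tenderloin": 3.0, "main ingredients": 1.0,
              "pork tenderloin": 2.0, "starch": 1.0, "tomato": 1.0}
text = ("the main ingredients of sweet and sour pork tenderloin "
        "include pork tenderloin, starch, tomato")
print(segment(text, dictionary))
print(keep_representative({"chicken cutlet": 12834, "large chicken cutlet": 231}))
print(extract_stem("delicious pork tenderloin", {"pork tenderloin", "pork"}))
corpus = [["fried", "chicken"], ["fried", "rice"], ["steamed", "rice"]]
print(bigram_probability(["fried", "rice"], corpus))
```

The dynamic-programming segmenter prefers "sweet and sour pork tenderloin" over the shorter "pork tenderloin" inside it because the longer phrase carries the higher weight, which mirrors choosing the segmentation with the highest matching degree.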
Through the above processing 30-38, the monitoring data of the catering field is obtained. By way of example, the structure of the monitoring data is shown in the following table:
Phrase ID | Phrase title | Weight | Penalty factor
Table two
Here, the phrase ID uniquely identifies the phrase and is used when calling the monitoring data. The phrase title is used for text matching against data in the first data set (for example, feature words in the scene feature library). The weight indicates the importance of the monitoring data item. For example, suppose the dish "fish-flavoured shredded pork" is mapped to the three feature words "Sichuan cuisine", "popular" and "fashion intention", and in the system's monitoring data the weights of the two supervision phrases "Sichuan cuisine" and "popular" are significantly greater than that of the phrase "fashion intention". Then the features that remain after filtering are "Sichuan cuisine" and "popular", and the system also records the phrase pattern "fish-flavoured XX" as a supervision pattern. The next time a phrase of the form "fish-flavoured XX" is processed, whenever "Sichuan cuisine", "popular" or similar features appear, the supervised model raises the influence factor of those features and limits the mapping strength of other features. The penalty factor is a correction option of the monitoring data; its value is usually set manually, and the constraint that the monitoring data places on features is assessed by manual review of sampled data.
Fig. 4 is a schematic diagram of data mapping logic according to an embodiment of the present invention, showing the actual data mapping logic between the scene feature library and the catering field data. Referring to Fig. 4, the data mapping logic includes the following. First, a data mapping is established between the catering field data (including dish data and merchant data) and the scene feature library. Then, the weights and penalty factors of the monitoring data are read and used to boost and to limit the mapping. Specifically, when the scene feature library is mapped to dish or merchant data, the weight of a monitoring data item boosts the feature words that match that item, while the penalty factor of the monitoring data limits the mapping strength (that is, the weight of the feature word), producing the effective mapping data (that is, the actual mapping data).
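The boost-and-limit step can be pictured with the following Python sketch. The source does not give the arithmetic, so the multiplicative boost `(1 + weight)`, the multiplicative limit `(1 - penalty)`, the threshold `min_strength` and all sample values are assumptions chosen only to make the idea concrete.

```python
def adjust_mapping(initial, monitoring, min_strength=0.5):
    """Turn an initial mapping into an effective mapping for one dish or merchant.

    initial: dict of feature word -> initial mapping strength.
    monitoring: dict of phrase title -> (weight, penalty factor).
    """
    effective = {}
    for feature, strength in initial.items():
        if feature in monitoring:  # exact text match with a phrase title
            weight, penalty = monitoring[feature]
            strength *= 1.0 + weight   # boost by the monitoring weight
            strength *= 1.0 - penalty  # limit by the penalty factor
        if strength >= min_strength:
            effective[feature] = strength
    return effective

# Hypothetical values for the dish "fish-flavoured shredded pork".
initial = {"Sichuan cuisine": 0.5, "popular": 0.5, "fashion intention": 0.5}
monitoring = {"Sichuan cuisine": (1.0, 0.0), "popular": (0.5, 0.0),
              "fashion intention": (0.0, 0.5)}
print(adjust_mapping(initial, monitoring))
```

With these sample values the low-weight, penalized feature falls below the threshold and is dropped, while the two strongly supervised features are kept with raised strengths.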
Traditional feature mapping techniques can suffer from insufficient mapping quality or from over-fitting. The data mapping logic used in this embodiment introduces the concept of monitoring data: a supervised model of catering field knowledge can be built from third-party data, and the scene features of dish names and shop names are then filtered based on the monitoring data during data mapping, improving mapping accuracy.
In this embodiment, after the effective mapping data is generated, the feature words mapped to each catering field word (for example, a dish name or merchant name) can be sorted by feature word frequency. Taking the dish word "youtiao" (deep-fried dough sticks) as an example, the feature words it is mapped to include "breakfast", "north", "staple food", "fried food", "Chinese tradition" and so on. Among them the scene feature "breakfast" occurs most frequently and is the most representative. Therefore, in the mapping data of the dish word "youtiao", "breakfast" ranks first among all features and has the largest weight, and "breakfast" can be used as the search scene of youtiao.
Fig. 5 is a schematic flowchart of a data mapping method according to an embodiment of the present invention, showing the actual data mapping process between the scene feature library and the catering field data (including dish data and merchant data). Referring to Fig. 5, the method includes:
50: Establish a data mapping between the scene feature library and the catering field data.
52: Optimize the data mapping based on the monitoring data, for example by means of the weights and penalty factors described above.
54: Determine the search scene corresponding to the catering field data. For example, for a single item of second data in the second data set, the first data mapped to it is sorted, screened or combined according to frequency of occurrence, weight or other parameters, to obtain the corresponding search scene.
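Step 54 can be illustrated by ranking the feature words mapped to one item and keeping the top entries as its search scene. The function name `search_scenes`, the `top_k` parameter and the sample weights are hypothetical.

```python
def search_scenes(mapped_features, top_k=1):
    """Rank mapped feature words by weight and return the top_k as search scenes."""
    ranked = sorted(mapped_features.items(), key=lambda item: item[1], reverse=True)
    return [feature for feature, _ in ranked[:top_k]]

# Hypothetical effective mapping for one dish word.
youtiao = {"breakfast": 0.9, "north": 0.3, "staple food": 0.6, "fried food": 0.5}
print(search_scenes(youtiao))           # single most representative scene
print(search_scenes(youtiao, top_k=2))  # a combination of two scenes
```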
Fig. 6 is a schematic flowchart of a search scene recognition method according to an embodiment of the present invention. Referring to Fig. 6, the method includes:
60: Perform word segmentation on the search query (search terms) to obtain search words. There may be one or more search words.
Optionally, in one implementation of this embodiment, the search query entered by the user first undergoes identification processing, which includes simple filtering and initial recall triggering. Filtering means checking the search query for abnormalities; if the query is found to be abnormal, for example because it contains illegal characters or sensitive information, the search does not proceed to the next step.
Optionally, in this embodiment, word segmentation can be performed using the word segmentation tool mentioned above.
62: Determine, by matching processing, the matched data in the second data set that matches the search words. The second data set has a data mapping (that is, the actual data mapping) established with the first data set using the data mapping method described above. For an explanation of the first data set and the second data set, see above.
Optionally, in one implementation of this embodiment, the matching processing is text matching, preferably partial matching. Partial matching means that if an item of second data in the second data set matches any one of the words obtained by segmenting the search query, that item of second data matches the search query. For example, an approximate calculation is carried out between the segmentation result of the search query and the words of the feature dictionary; if the search query "Sichuan-flavoured twice-cooked pork" successfully matches the feature "twice-cooked" in the feature library, it is in fact the word "twice-cooked" within "Sichuan-flavoured twice-cooked pork" that matches the relevant feature.
Quickly matching catering field data by partial matching improves matching efficiency on the one hand and effectively broadens the range of matched scenes on the other.
64: Determine the search scene corresponding to the search query according to the search scene mapped to the matched data.
Optionally, in one implementation of this embodiment, taking the catering field as an example, the first data set is the scene feature library and the second data set is the catering field data. After the search scenes corresponding to the search words are determined, the scenes can be ranked using the scene weights precomputed in the scene feature library.
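Steps 62 and 64 can be sketched as follows, taking the already segmented search words as input. Partial matching is modelled here as substring containment, and the function name `recognize_scenes`, the sample dish names, scenes and weights are all assumptions made for illustration.

```python
def recognize_scenes(search_words, second_data_scenes, scene_weights):
    """Collect the scenes of every partially matched item and rank them by weight.

    search_words: words obtained by segmenting the search query.
    second_data_scenes: dict of dish or merchant name -> list of mapped scenes.
    scene_weights: dict of scene -> precomputed weight.
    """
    scenes = set()
    for item, mapped in second_data_scenes.items():
        # Partial matching: a single matching word is enough.
        if any(word in item for word in search_words):
            scenes.update(mapped)
    return sorted(scenes, key=lambda s: scene_weights.get(s, 0.0), reverse=True)

words = ["Sichuan-flavoured", "twice-cooked", "pork"]
second_data = {"twice-cooked pork": ["Sichuan cuisine", "lunch"],
               "youtiao": ["breakfast"]}
weights = {"Sichuan cuisine": 0.9, "lunch": 0.4, "breakfast": 0.8}
print(recognize_scenes(words, second_data, weights))
```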
Fig. 7 is a schematic flowchart of a search method according to an embodiment of the present invention. Referring to Fig. 7, the method includes:
70: Identify the search scene corresponding to the search query. For example, the search scene corresponding to the search query is determined according to the search query, the second data set and the search scenes mapped to the second data set, where the search scenes mapped to the second data set are determined using the data mapping method described above. More specifically, the method shown in Fig. 6 can be used for the identification.
72: Load the data file corresponding to the search scene. The data file is configured with an optimization strategy for the recalled data.
Optionally, in one implementation of this embodiment, the data files corresponding to different scenes are loaded dynamically, so as to obtain search results that meet the user's search intent. Dynamic loading here is a hot-loading technique, that is, data can be changed in real time without restarting the service. In this embodiment, the ranking strategies of the recall logic are configured as individual data files, and the ranking algorithm is constructed by loading these data files. By way of example, the data files of these ranking strategies are as shown in the following table:
Policy ID | Policy name | Policy class | Characterising parameter | Parameter role scope | Extended information
(table three)
Here, the characterising parameter and the parameter role scope indicate the point at which the policy takes effect. For example, in a distance-based ranking policy, the characterising parameter is the "distance factor" and the parameter role scope is "0 km - 20 km".
74: Optimize the ordering of the recalled data according to the data file.
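As a hedged illustration of steps 72 and 74, the sketch below parses one ranking policy and applies it to recalled data. Storing the policy as JSON, the field names (`parameter`, `scope` and so on) and the choice to keep out-of-scope records after the sorted ones are all assumptions; the source only lists the columns of the policy table.

```python
import json

# Hypothetical policy file content; field names loosely follow the policy table.
POLICY_JSON = """
{"policy_id": 1, "policy_name": "nearby first", "policy_class": "distance",
 "parameter": "distance_km", "scope": [0, 20]}
"""

def optimize_order(recalled, policy):
    """Sort records inside the policy scope by the policy parameter.

    Records outside the scope keep their original order and follow the sorted ones.
    """
    key = policy["parameter"]
    low, high = policy["scope"]
    in_scope = [r for r in recalled if low <= r[key] <= high]
    out_of_scope = [r for r in recalled if not low <= r[key] <= high]
    return sorted(in_scope, key=lambda r: r[key]) + out_of_scope

policy = json.loads(POLICY_JSON)
recalled = [{"name": "A", "distance_km": 5.0},
            {"name": "B", "distance_km": 25.0},
            {"name": "C", "distance_km": 1.2}]
print([r["name"] for r in optimize_order(recalled, policy)])  # ['C', 'A', 'B']
```

Because the policy is plain data, a different scene can supply a different file without any change to `optimize_order`, which is the point of configuring ranking strategies as loadable data files.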
The method provided by this embodiment offers a modular computation entry point for search recall, so that a separate ranking optimization strategy can be designed for each search scene, achieving a personalized search effect in which every user sees results tailored to them ("a thousand faces for a thousand people").
The method embodiments of the present invention have been described in detail above with reference to the accompanying drawings. The device embodiments of the present invention are described below with reference to the accompanying drawings.
Fig. 8 is a block diagram of a data processing device for confirming a search scene according to an embodiment of the present invention. Referring to Fig. 8, the data processing device includes: a data mapping establishing module 80, configured to establish an initial data mapping between a first data set and a second data set; a data mapping adjusting module 82, configured to adjust the data mapping according to a monitoring data set to obtain an actual data mapping between the first data set and the second data set; and a search scene mapping module 84, configured to determine the search scene corresponding to second data in the second data set based on the first data in the first data set to which that second data is actually mapped.
Optionally, in one implementation of this embodiment, the monitoring data in the monitoring data set includes, in addition to a phrase title, a weight and/or a penalty factor.
Optionally, in one implementation of this embodiment, the data mapping adjusting module 82 includes: a matching submodule, configured to determine mutually matching monitoring data and first data by text matching; a first adjusting submodule, configured to, for each item of second data, modify the mapping relationship between the second data and the first data to which it is initially mapped, based on the weight of the monitoring data that matches that first data; and/or a second adjusting submodule, configured to, for each item of second data, adjust the weight of the first data to which the second data is initially mapped, based on the penalty factor of the monitoring data that matches that first data.
Optionally, in one implementation of this embodiment, the search scene mapping module 84 is specifically configured to: for each item of second data, select at least part of the first data, or a combination of at least part of the first data, from the first data to which the second data is actually mapped, as the search scene. For example, the at least part of the first data is selected based on the weights of the monitoring data matched by the first data.
Optionally, in one implementation of this embodiment, the first data set is a scene feature library of the catering field, and the second data set includes dish data and merchant data.
Fig. 9 is a block diagram of a search scene recognition device according to an embodiment of the present invention. Referring to Fig. 9, the device includes: a word segmentation module 90, configured to perform word segmentation on a search query to obtain search words; a matching module 92, configured to determine, by matching processing, the matched data in the second data set that matches the search words; and a determining module 94, configured to determine the search scene corresponding to the search query according to the search scene mapped to the matched data. The search scenes are mapped to the second data set using the method described above.
Fig. 10 is a block diagram of a search device according to an embodiment of the present invention. Referring to Fig. 10, the device includes: a scene determining module 102, configured to determine the search scene corresponding to a search query according to the search query, the second data set and the search scenes mapped to the second data set (where the scenes mapped to the second data set are determined using the data mapping method described above, or using the search scene recognition device shown in Fig. 9); a loading module 104, configured to load the data file corresponding to the search scene, the data file being configured with an optimization strategy for the recalled data; and an optimization module 106, configured to optimize the ordering of the recalled data according to the loaded data file.
The information pushing method and device according to embodiments of the present invention have been described above with reference to the accompanying drawings. Those skilled in the art should understand that the method embodiments or implementations provided by the present invention can be realized by the corresponding device embodiments or implementations provided by the present invention, and that the processing and logic of the device embodiments are consistent with those of the method embodiments. Therefore, for the device embodiments of the present invention, detailed descriptions of the processing performed or performable by each module and submodule, explanations of specific names, terms and ranges, and descriptions of the beneficial effects of each embodiment and related feature can be found in the corresponding descriptions of the method embodiments and are not repeated here.
In one possible design related to the present invention, the aforementioned data processing device can include a processor and a memory. The memory is used to store a program that supports the data processing device in performing the processing performed by the corresponding modules/submodules described above, and the processor is configured to execute the program stored in the memory.
The program includes one or more computer instructions, which are called and executed by the processor.
More specifically, by executing the computer instructions, the processor is used to:
establish an initial data mapping between a first data set and a second data set, the first data set including multiple items of first data and the second data set including multiple items of second data;
adjust the initial data mapping according to a monitoring data set to obtain an actual data mapping between the first data set and the second data set; and
determine the search scene mapped to second data in the second data set, based on the first data in the first data set to which that second data is actually mapped.
Optionally, by executing the computer instructions, the processor can also be used to: process a first data source according to the time dimension and the geographic dimension to obtain the first data set; and perform word segmentation analysis, word frequency analysis, stem extraction and semantic analysis on a monitoring data source to obtain the monitoring data set.
Optionally, the monitoring data in the monitoring data set includes, in addition to a phrase title, a weight and/or a penalty factor. In this case, by executing the computer instructions, the processor can also be used to:
determine mutually matching monitoring data and first data by text matching; for each item of second data, modify the mapping relationship between the second data and the first data to which it is initially mapped, based on the weight of the monitoring data that matches that first data; and/or, for each item of second data, adjust the weight of the first data to which the second data is initially mapped, based on the penalty factor of the monitoring data that matches that first data.
Optionally, by executing the computer instructions, the processor can also be used to: for each item of second data, select at least part of the first data, or a combination of the at least part of the first data, from the first data to which the second data is actually mapped, as the search scene.
Correspondingly, an embodiment of the present invention further provides a computer storage medium for storing the computer software instructions executed by the aforementioned data mapping device, including the program involved in the data mapping device performing the above data mapping method.
In another possible design related to the present invention, the aforementioned search device can include a processor and a memory. The memory is used to store a program that supports the search device in performing the processing performed by the corresponding modules/submodules, and the processor is configured to execute the program stored in the memory.
The program includes one or more computer instructions, which are called and executed by the processor.
More specifically, by executing the computer instructions, the processor is used to: determine the search scene corresponding to a search query according to the search query, the second data set and the search scenes mapped to the second data set, where the search scenes mapped to the second data set are determined using the aforementioned data mapping method; load the data file corresponding to the search scene, the data file being configured with an optimization strategy for the recalled data; and optimize the ordering of the recalled data according to the data file.
Correspondingly, an embodiment of the present invention further provides a computer storage medium for storing the computer software instructions executed by the aforementioned search device, including the program involved in the search device performing the search method described above.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the systems, devices and units described above can be found in the corresponding processes in the foregoing method embodiments and are not repeated here.
The device embodiments described above are merely illustrative. The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the modules can be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art can understand and implement it without creative effort.
From the description of the embodiments above, those skilled in the art can clearly understand that each embodiment can be implemented by software plus a necessary general-purpose hardware platform, and of course also by hardware. Based on this understanding, the part of the above technical solutions that in essence contributes to the prior art can be embodied in the form of a software product. The computer software product can be stored in a computer-readable storage medium, such as a ROM/RAM, magnetic disk or optical disc, and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device or the like) to perform the methods described in the embodiments or in some parts of the embodiments.
Finally, it should be noted that the above embodiments are only intended to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments can still be modified, or some of their technical features can be replaced by equivalents, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention.
The present invention discloses A1, a data processing method for confirming a search scene, including:
establishing an initial data mapping between a first data set and a second data set, the first data set including multiple items of first data and the second data set including multiple items of second data;
adjusting the initial data mapping according to a monitoring data set to obtain an actual data mapping between the first data set and the second data set; and
determining the search scene mapped to second data in the second data set, based on the first data in the first data set to which that second data is actually mapped.
A2. The method according to A1, wherein the first data set is a scene feature library of the catering field, and the second data set includes dish data and merchant data.
A3. The method according to A1, further including: processing a first data source according to the time dimension and the geographic dimension to obtain the first data set.
A4. The method according to A1, further including:
performing text processing (including word segmentation analysis, word frequency analysis, stem extraction and semantic analysis) on a monitoring data source to obtain the monitoring data set.
A5. The method according to any one of A1 to A4, wherein the monitoring data in the monitoring data set includes a weight and/or a penalty factor.
A6. The method according to A5, wherein
adjusting the initial data mapping according to the monitoring data set includes:
determining mutually matching monitoring data and first data by text matching;
for each item of second data, modifying the mapping relationship between the second data and the first data to which it is initially mapped, based on the weight of the monitoring data that matches that first data; and/or
for each item of second data, adjusting the weight of the first data to which the second data is initially mapped, based on the penalty factor of the monitoring data that matches that first data.
A7. The method according to any one of A1 to A4 or A6, wherein determining the search scene corresponding to second data in the second data set, based on the first data in the first data set to which that second data is actually mapped, includes: for each item of second data, selecting at least part of the first data, or a combination of the at least part of the first data, from the first data to which the second data is actually mapped, as the search scene.
The present invention also discloses B8, a search method, including:
determining the search scene corresponding to a search query according to the search query, a second data set and the search scenes mapped to the second data set, wherein the search scenes mapped to the second data set are determined using the method according to any one of A1 to A7;
loading the data file corresponding to the search scene, the data file being configured with an optimization strategy for the recalled data; and
optimizing the ordering of the recalled data according to the data file.
The present invention also discloses C9, a data processing device for confirming a search scene, including:
a data mapping establishing module, configured to establish an initial data mapping between a first data set and a second data set, the first data set including multiple items of first data and the second data set including multiple items of second data;
a data mapping adjusting module, configured to adjust the initial data mapping according to a monitoring data set to obtain an actual data mapping between the first data set and the second data set; and
a search scene mapping module, configured to determine the search scene corresponding to second data in the second data set, based on the first data in the first data set to which that second data is actually mapped.
C10. The device as described in C9, wherein the first data set is a scene feature library for the catering field, and the second data set includes dish data and merchant data.
C11. The device as described in C9, wherein the device further includes a first data processing module, for processing a first data source according to a time dimension and a geographic dimension to obtain the first data set.
C12. The device as described in C9, wherein the device further includes a monitoring data processing module, for performing text processing on a monitoring data source (e.g., including word segmentation, word frequency analysis, stem extraction, and semantic analysis) to obtain the monitoring data set.
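A minimal sketch of the text processing named in C12 is given below, covering only word segmentation and word frequency analysis. It is an assumption that word frequency is what yields the weight of each monitoring-data phrase; the regular expression is a stand-in for a real word segmenter (Chinese text in particular would need a dedicated one), and stem extraction and semantic analysis are omitted. `build_monitoring_set` and `min_count` are hypothetical names.

```python
import re
from collections import Counter


def build_monitoring_set(documents, min_count=2):
    """Derive {phrase: weight} from raw text by segmentation and word frequency.

    The weight is the phrase count divided by the highest count, so it lies in (0, 1].
    """
    counts = Counter()
    for text in documents:
        # Stand-in for word segmentation: lowercase alphabetic tokens only.
        counts.update(re.findall(r"[a-z]+", text.lower()))
    if not counts:
        return {}
    top = max(counts.values())
    return {word: count / top for word, count in counts.items() if count >= min_count}


# Made-up monitoring data source.
weights = build_monitoring_set([
    "Breakfast congee and breakfast buns",
    "Late breakfast deals",
    "Late snack",
])
```

With this input only "breakfast" and "late" occur often enough to be kept, with "breakfast" receiving the highest weight.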
C13. The device as described in any one of C9-C12, wherein the monitoring data in the monitoring data set includes, in addition to a phrase name, a weight and/or a penalty factor.
C14. The device as described in C13, wherein the data mapping adjusting module includes:
A matching submodule, for determining, by text matching, the monitoring data and the first data that match each other;
A first adjusting submodule, for modifying, for each item of second data, the mapping relation between the second data and the first data to which it is initially mapped, based on the weight of the monitoring data that matches that first data; and/or,
A second adjusting submodule, for adjusting, for each item of second data, the weight of the first data to which the second data is initially mapped, based on the penalty factor of the monitoring data that matches that first data.
C15. The device as described in any one of C9-C12 or C14, wherein the search scene mapping module is specifically used for: for each item of second data, selecting, from the first data to which the second data is actually mapped, at least part of the first data, or a combination of the at least part of the first data, as the search scene.
The invention also discloses D16, a search device, including:
A scene determining module, for determining the search scene corresponding to a search term according to the search term, the second data set, and the search scenes to which the second data set is mapped, wherein the search scenes to which the second data set is mapped are determined by the method described in any one of A1-A7;
A loading module, for loading a data file corresponding to the search scene, the data file being configured with an optimization strategy for recalled data;
An optimization module, for optimizing the ranking of the recalled data according to the loaded data file.
The invention also discloses E1, a data mapping unit, including a memory and a processor; wherein,
The memory is used to store one or more computer instructions, wherein the one or more computer instructions are to be called and executed by the processor;
The processor executes the computer instructions to perform the following processing:
Establishing an initial data mapping between a first data set and a second data set, the first data set including multiple items of first data and the second data set including multiple items of second data;
Adjusting the initial data mapping according to a monitoring data set to obtain an actual data mapping between the first data set and the second data set;
Determining the search scene to which the second data in the second data set is mapped, based on the first data in the first data set to which the second data in the second data set is actually mapped.
E2. The data mapping unit as described in E1, wherein the first data set is a scene feature library for the catering field, and the second data set includes dish data and merchant data.
E3. The data mapping unit as described in E1, wherein the processor executes the computer instructions to perform the following processing: processing a first data source according to a time dimension and a geographic dimension to obtain the first data set.
E4. The data mapping unit as described in E1, wherein the processor executes the computer instructions to perform the following processing: performing text processing on a monitoring data source (e.g., including word segmentation, word frequency analysis, stem extraction, and semantic analysis) to obtain the monitoring data set.
E5. The data mapping unit as described in any one of E1-E4, wherein the monitoring data in the monitoring data set includes a weight and/or a penalty factor.
E6. The data mapping unit as described in E5, wherein the processor executes the computer instructions to perform the following processing: determining, by text matching, the monitoring data and the first data that match each other; for each item of second data, modifying the mapping relation between the second data and the first data to which it is initially mapped, based on the weight of the monitoring data that matches that first data; and/or, for each item of second data, adjusting the weight of the first data to which the second data is initially mapped, based on the penalty factor of the monitoring data that matches that first data.
E7. The data mapping unit as described in any one of E1-E4 or E6, wherein the processor executes the computer instructions to perform the following processing: for each item of second data, selecting, from the first data to which the second data is actually mapped, at least part of the first data, or a combination of the at least part of the first data, as the search scene.
The invention also discloses F1, a search device, including a memory and a processor; wherein,
The memory is used to store one or more computer instructions, wherein the one or more computer instructions are to be called and executed by the processor;
The processor executes the computer instructions to perform the following processing: determining the search scene corresponding to a search term according to the search term, the second data set, and the search scenes to which the second data set is mapped, wherein the search scenes to which the second data set is mapped are determined by the method described in any one of A1-A7; loading a data file corresponding to the search scene, the data file being configured with an optimization strategy for recalled data; and optimizing the ranking of the recalled data according to the data file.

Claims (10)

1. A data processing method for confirming a search scene, characterized in that the method includes:
Establishing an initial data mapping between a first data set and a second data set, the first data set including multiple items of first data and the second data set including multiple items of second data;
Adjusting the initial data mapping according to a monitoring data set to obtain an actual data mapping between the first data set and the second data set;
Determining the search scene to which the second data in the second data set is mapped, based on the first data in the first data set to which the second data in the second data set is actually mapped.
2. The method as claimed in claim 1, characterized in that
The monitoring data in the monitoring data set includes a weight and/or a penalty factor.
3. The method as claimed in claim 2, characterized in that adjusting the initial data mapping according to the monitoring data set includes:
Determining, by text matching, the monitoring data and the first data that match each other;
For each item of second data, modifying the mapping relation between the second data and the first data to which it is initially mapped, based on the weight of the monitoring data that matches that first data; and/or,
For each item of second data, adjusting the weight of the first data to which the second data is initially mapped, based on the penalty factor of the monitoring data that matches that first data.
4. The method as claimed in any one of claims 1-3, characterized in that determining the search scene corresponding to the second data in the second data set, based on the first data in the first data set to which the second data in the second data set is actually mapped, includes:
For each item of second data, selecting, from the first data to which the second data is actually mapped, at least part of the first data, or a combination of the at least part of the first data, as the search scene.
5. A search method, characterized in that the method includes:
Determining the search scene corresponding to a search term according to the search term, the second data set, and the search scenes to which the second data set is mapped, wherein the search scenes to which the second data set is mapped are determined by the method as claimed in any one of claims 1-4;
Loading a data file corresponding to the search scene, the data file being configured with an optimization strategy for recalled data;
Optimizing the ranking of the recalled data according to the data file.
6. A data processing device for confirming a search scene, characterized in that the device includes:
A data mapping establishing module, for establishing an initial data mapping between a first data set and a second data set, the first data set including multiple items of first data and the second data set including multiple items of second data;
A data mapping adjusting module, for adjusting the initial data mapping according to a monitoring data set to obtain an actual data mapping between the first data set and the second data set;
A search scene mapping module, for determining the search scene corresponding to the second data in the second data set, based on the first data in the first data set to which the second data in the second data set is actually mapped.
7. The device as claimed in claim 6, characterized in that
The monitoring data in the monitoring data set includes a weight and/or a penalty factor.
8. The device as claimed in claim 7, characterized in that the data mapping adjusting module includes:
A matching submodule, for determining, by text matching, the monitoring data and the first data that match each other;
A first adjusting submodule, for modifying, for each item of second data, the mapping relation between the second data and the first data to which it is initially mapped, based on the weight of the monitoring data that matches that first data; and/or,
A second adjusting submodule, for adjusting, for each item of second data, the weight of the first data to which the second data is initially mapped, based on the penalty factor of the monitoring data that matches that first data.
9. The device as claimed in any one of claims 6-8, characterized in that the search scene mapping module is specifically used for:
For each item of second data, selecting, from the first data to which the second data is actually mapped, at least part of the first data, or a combination of the at least part of the first data, as the search scene.
10. A search device, characterized in that the device includes:
A scene determining module, for determining the search scene corresponding to a search term according to the search term, the second data set, and the search scenes to which the second data set is mapped, wherein the search scenes to which the second data set is mapped are determined by the method as claimed in any one of claims 1-4;
A loading module, for loading a data file corresponding to the search scene, the data file being configured with an optimization strategy for recalled data;
An optimization module, for optimizing the ranking of the recalled data according to the loaded data file.
CN201710243857.XA 2017-04-14 2017-04-14 Data processing method for confirming search scene, search method and corresponding device Expired - Fee Related CN107145525B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710243857.XA CN107145525B (en) 2017-04-14 2017-04-14 Data processing method for confirming search scene, search method and corresponding device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710243857.XA CN107145525B (en) 2017-04-14 2017-04-14 Data processing method for confirming search scene, search method and corresponding device

Publications (2)

Publication Number Publication Date
CN107145525A true CN107145525A (en) 2017-09-08
CN107145525B CN107145525B (en) 2020-10-16

Family

ID=59773563

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710243857.XA Expired - Fee Related CN107145525B (en) 2017-04-14 2017-04-14 Data processing method for confirming search scene, search method and corresponding device

Country Status (1)

Country Link
CN (1) CN107145525B (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102033877A (en) * 2009-09-27 2011-04-27 阿里巴巴集团控股有限公司 Search method and device
CN102612704A (en) * 2009-10-19 2012-07-25 Metaio有限公司 Method of providing a descriptor for at least one feature of an image and method of matching features
CN102665838A (en) * 2009-11-11 2012-09-12 微软公司 Methods and systems for determining and tracking extremities of a target
CN104239907A (en) * 2014-07-16 2014-12-24 华南理工大学 Far infrared pedestrian detection method for changed scenes
US9141871B2 (en) * 2011-10-05 2015-09-22 Carnegie Mellon University Systems, methods, and software implementing affine-invariant feature detection implementing iterative searching of an affine space
CN105335391A (en) * 2014-07-09 2016-02-17 阿里巴巴集团控股有限公司 Processing method and device of search request on the basis of search engine
CN105956189A (en) * 2016-06-08 2016-09-21 北京百度网讯科技有限公司 Artificial intelligence-based information recommendation method and apparatus

Also Published As

Publication number Publication date
CN107145525B (en) 2020-10-16

Similar Documents

Publication Publication Date Title
CN109522556B (en) Intention recognition method and device
WO2017167069A1 (en) Resume assessment method and apparatus
CN110765257A (en) Intelligent consulting system of law of knowledge map driving type
KR100816934B1 (en) Clustering system and method using search result document
CN107766371A (en) A kind of text message sorting technique and its device
CN111881302B (en) Knowledge graph-based bank public opinion analysis method and system
CN107194617B (en) App software engineer soft skill classification system and method
CN110472203B (en) Article duplicate checking and detecting method, device, equipment and storage medium
CN103544307B (en) A kind of multiple search engine automation contrast evaluating method independent of document library
CN105608075A (en) Related knowledge point acquisition method and system
CN107491447A (en) Establish inquiry rewriting discrimination model, method for distinguishing and corresponding intrument are sentenced in inquiry rewriting
CN112613321A (en) Method and system for extracting entity attribute information in text
CN106844482A (en) A kind of retrieval information matching method and device based on search engine
CN111191413B (en) Method, device and system for automatically marking event core content based on graph sequencing model
CN116561291A (en) Intelligent recommendation method and system based on natural language intelligent conversion model
CN110795930A (en) Article title optimization method, system, medium and equipment
Homocianu et al. An Analysis of Scientific Publications on'Decision Support Systems' and'Business Intelligence'Regarding Related Concepts Using Natural Language Processing Tools
CN112328812B (en) Domain knowledge extraction method and system based on self-adjusting parameters and electronic equipment
CN106227661B (en) Data processing method and device
CN107145525A (en) Data processing method, searching method and related device for confirming search scene
CN105893363A (en) A method and a system for acquiring relevant knowledge points of a knowledge point
CN110377706A (en) Search statement method for digging and equipment based on deep learning
CN115934899A (en) IT industry resume recommendation method and device, electronic equipment and storage medium
CN106777191A (en) A kind of search modes generation method and device based on search engine
CN113869973A (en) Product recommendation method, product recommendation system, and computer-readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: Building N3, building 12, No. 27, Jiancai Chengzhong Road, Haidian District, Beijing 100096

Applicant after: Beijing Xingxuan Technology Co.,Ltd.

Address before: 100085 Beijing, Haidian District on the road to the information on the ground floor of the 1 to the 3 floor of the 2 floor, room 11, 202

Applicant before: Beijing Xiaodu Information Technology Co.,Ltd.
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20201016