CN108153792A - A kind of data processing method and relevant apparatus - Google Patents

Publication number
CN108153792A
Authority
CN
China
Prior art keywords
resource
semantic primitive
search
participle
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201611110268.6A
Other languages
Chinese (zh)
Other versions
CN108153792B (en
Inventor
彭正超
安伟亭
魏虎
李鹏飞
张建锋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alibaba Group Holding Ltd
Original Assignee
Alibaba Group Holding Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba Group Holding Ltd
Priority to CN201611110268.6A
Publication of CN108153792A
Application granted
Publication of CN108153792B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/951 Indexing; Web crawling techniques
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

Embodiments of the present invention disclose a data processing method and a related apparatus. To improve the search experience, a target semantic primitive can be divided into segmented words (participles). For a given segmented word, if the resources in the search result obtained by searching with that word belong to relatively few categories, and those categories overlap strongly with the categories of the resources in the search result obtained by searching with the target semantic primitive, the word can be taken as a core word of the target semantic primitive. The resources in the search result obtained with the core word share, to some extent, the features of the resources the user hoped to find with the target semantic primitive. Treating the search result of the core word as the search result of the target semantic primitive therefore effectively expands the number of results, and the added resources are likely to be relevant to the purpose of the original search, which improves the user's search experience.

Description

Data processing method and related apparatus
Technical field
The present invention relates to the field of data processing, and in particular to a data processing method and a related apparatus.
Background
With the development of Internet technology, providing resources to users through Internet platforms has become widespread.
A user who wishes to view or obtain a certain type of resource can search on an Internet platform by entering search terms that describe the features of that resource, in the hope of finding resources with those features in the search result.
Summary of the invention
However, when the user enters many search terms or inaccurate search terms, the search may return few results or none at all, which degrades the user experience. How to improve the search experience is therefore a technical problem that urgently needs to be solved.
To solve the above technical problem, the present invention provides a data processing method and a related apparatus that can effectively expand the search result corresponding to a target semantic primitive, thereby improving the user's search experience.
Embodiments of the present invention disclose the following technical solutions:
In a first aspect, the present invention provides a data processing method, the method comprising:
Obtaining a target semantic primitive, the target semantic primitive being a search semantic primitive used for searching;
Dividing the target semantic primitive into a plurality of segmented words (participles);
Determining a core word of the target semantic primitive according to the number of categories to which the resources in the search result obtained by searching with a segmented word belong, and according to the degree of overlap between those categories and the categories to which the resources in the search result obtained by searching with the target semantic primitive belong;
Taking the search result obtained by searching with the core word determined from the plurality of segmented words as the search result obtained by searching with the target semantic primitive.
Optionally, before the dividing of the target semantic primitive into a plurality of segmented words, the method further includes:
If the number of resources in the search result of the target semantic primitive is less than a first threshold, or the character length of the target semantic primitive is greater than a second threshold, performing the step of dividing the target semantic primitive into a plurality of segmented words.
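The trigger condition above can be sketched as a simple predicate. This is only an illustration: the function name `should_segment` and the default threshold values are assumptions, since the text does not give concrete values for the first and second thresholds.

```python
def should_segment(result_count: int, query: str,
                   first_threshold: int = 5,
                   second_threshold: int = 12) -> bool:
    """Return True when the target semantic primitive should be segmented.

    first_threshold and second_threshold are illustrative defaults,
    not values taken from the patent text.
    """
    too_few_results = result_count < first_threshold
    query_too_long = len(query) > second_threshold
    return too_few_results or query_too_long
```

Either condition alone is sufficient to trigger segmentation, which matches the "or" in the optional step.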
Optionally, a first segmented word is any one of the plurality of segmented words, and the determining of the core word of the target semantic primitive according to the number of categories to which the resources in the search result obtained by searching with a segmented word belong, and according to the degree of overlap between those categories and the categories to which the resources in the search result obtained by searching with the target semantic primitive belong, includes:
Obtaining the number of times the first segmented word was used for searching in historical search behavior, the number of resources found by searching with the first segmented word, and the number of times the first segmented word co-occurred with other words in historical search behavior;
Calculating a core word score of the first segmented word according to the number of times the first segmented word was used for searching in historical search behavior, the number of resources found by searching with the first segmented word, the number of times the first segmented word co-occurred with other words in historical search behavior, the number of categories to which the resources in the search result obtained by searching with the first segmented word belong, and the degree of overlap between those categories and the categories to which the resources in the search result obtained by searching with the target semantic primitive belong;
If the core word score of the first segmented word is among the top N core word scores of the plurality of segmented words, determining the first segmented word as a core word of the target semantic primitive.
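The scoring and top-N selection described above can be sketched as follows. The text lists the inputs to the core word score but not the formula, so the combination used here (log-damped counts, a penalty for many categories, and a Jaccard overlap) is purely an assumption for illustration, as are the names `TokenStats`, `category_overlap`, `core_word_score`, and `pick_core_words`.

```python
import math
from dataclasses import dataclass


@dataclass
class TokenStats:
    token: str
    search_count: int        # times the word was used for searching in history
    resource_count: int      # resources found by searching with the word
    cooccurrence_count: int  # times the word co-occurred with other words
    categories: set          # categories of the resources found with the word


def category_overlap(token_categories: set, target_categories: set) -> float:
    """Jaccard overlap between two category sets (an assumed overlap measure)."""
    union = token_categories | target_categories
    if not union:
        return 0.0
    return len(token_categories & target_categories) / len(union)


def core_word_score(stats: TokenStats, target_categories: set) -> float:
    """Assumed scoring: fewer categories and higher overlap give a higher score."""
    popularity = (math.log1p(stats.search_count)
                  + math.log1p(stats.resource_count)
                  + math.log1p(stats.cooccurrence_count))
    focus = 1.0 / (1 + len(stats.categories))
    return popularity * focus * category_overlap(stats.categories, target_categories)


def pick_core_words(all_stats, target_categories: set, n: int = 1):
    """Return the n segmented words with the highest core word scores."""
    ranked = sorted(all_stats,
                    key=lambda s: core_word_score(s, target_categories),
                    reverse=True)
    return [s.token for s in ranked[:n]]
```

With equal historical counts, a word whose results fall in one category that matches the target outranks a word whose results are spread over many categories.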
Optionally, after the search result corresponding to the core word determined from the plurality of segmented words is taken as the search result of the target semantic primitive, the method further includes:
If the number of resources in the search result obtained by searching with the target semantic primitive does not meet a third threshold, expanding the target semantic primitive to obtain an expansion semantic primitive, the expansion semantic primitive being a search semantic primitive;
Taking the search result obtained by searching with the expansion semantic primitive as the search result obtained by searching with the target semantic primitive.
Optionally, a first resource is any one resource in the search result obtained by searching with the target semantic primitive, and the expanding of the target semantic primitive to obtain an expansion semantic primitive includes:
If the first resource can be found by searching with a first semantic primitive, taking the first semantic primitive as the expansion semantic primitive, the first semantic primitive being a search semantic primitive; or,
If a second resource can be found by searching with a second semantic primitive, taking the second semantic primitive as the expansion semantic primitive, the second semantic primitive being a search semantic primitive, and the second resource being a resource similar to the first resource.
Optionally, the expanding of the target semantic primitive to obtain an expansion semantic primitive includes:
Taking a third semantic primitive whose edit distance from the target semantic primitive is less than a fourth threshold as the expansion semantic primitive, the third semantic primitive being a search semantic primitive; or,
Taking a fourth semantic primitive whose semantic similarity to the target semantic primitive is less than a fifth threshold as the expansion semantic primitive, the fourth semantic primitive being a search semantic primitive.
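The edit-distance branch can be illustrated with the standard Levenshtein distance. The text does not say which edit distance is meant, so using Levenshtein is an assumption, and the function names and the default value of `fourth_threshold` are hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with a rolling one-row table."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def expand_by_edit_distance(target: str, candidates, fourth_threshold: int = 3):
    """Keep candidate search semantic primitives closer than the threshold."""
    return [c for c in candidates
            if c != target and edit_distance(target, c) < fourth_threshold]
```

In practice the candidates would come from previously used search semantic primitives; here they are simply passed in as a list.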
Optionally, before the search result obtained by searching with the expansion semantic primitive is taken as the search result obtained by searching with the target semantic primitive, the method further includes:
Calculating the association frequency between each obtained expansion semantic primitive and the target semantic primitive;
Obtaining the top M expansion semantic primitives with the highest association frequency;
Further determining L expansion semantic primitives from the M expansion semantic primitives according to the character length of each of the M expansion semantic primitives, the number of times each was used for searching in historical search behavior, and the degree of overlap between the categories of the resources in the search result obtained by searching with it and the categories of the resources in the search result obtained by searching with the target semantic primitive;
Taking the L expansion semantic primitives as the expansion semantic primitives of the target semantic primitive.
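The two-stage screening above (top M by association frequency, then L of those M) can be sketched as below. How character length, historical search count, and category overlap are combined in the second stage is not specified, so the length cap and the sort order used here are assumptions, and the names `Expansion`, `select_expansions`, and `max_length` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Expansion:
    text: str
    association_frequency: float  # association with the target semantic primitive
    search_count: int             # times used for searching in history
    category_overlap: float       # overlap with the target's result categories


def select_expansions(candidates, m: int, l: int, max_length: int = 20):
    """Stage 1: keep the top m by association frequency.
    Stage 2: drop overly long entries, then keep the best l by
    (category overlap, search count). The stage 2 rule is an assumption.
    """
    top_m = sorted(candidates,
                   key=lambda e: e.association_frequency,
                   reverse=True)[:m]
    kept = [e for e in top_m if len(e.text) <= max_length]
    kept.sort(key=lambda e: (e.category_overlap, e.search_count), reverse=True)
    return [e.text for e in kept[:l]]
```

A candidate with very high overlap but low association frequency is removed in the first stage and never reaches the second.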
Optionally, if the user who enters the target semantic primitive to search connects to a server through a client, the method further includes:
Determining resources to be recommended according to the search semantic primitives and core words the user has recently used for searching;
Recommending the resources to be recommended to the client.
Optionally, the determining of resources to be recommended according to the search semantic primitives and core words the user has recently used for searching includes:
Obtaining the real-time preference of the user, the real-time preference including a real-time resource preference and a real-time category preference;
Determining the resources to be recommended according to the real-time preference and the search semantic primitives and core words the user has recently used for searching.
Optionally, the method further includes:
Obtaining a resource set of resources to be ranked, the resource set being a search result or resources to be recommended;
Obtaining the real-time preference of the user who enters the target semantic primitive to search, the real-time preference including a real-time resource preference and a real-time category preference;
Obtaining a click conversion rate (CVR) and a click-through rate (CTR) for each resource to be ranked according to the features of the user, the real-time preference, and the cross features formed between those features and the resources to be ranked in the resource set;
Determining a ranking score for each resource to be ranked according to its CVR and CTR;
Ranking the resources to be ranked in the resource set by ranking score.
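The last two steps can be sketched as follows, taking the predicted CTR and CVR of each resource as given inputs. The text does not state how CVR and CTR are combined into a ranking score; multiplying them is a common choice and is used here only as an assumption. The function name `rank_resources` is hypothetical.

```python
def rank_resources(predictions: dict) -> list:
    """Rank resource ids by an assumed ranking score of CTR * CVR.

    predictions maps a resource id to a (ctr, cvr) pair produced by
    upstream models that are not shown here.
    """
    scores = {rid: ctr * cvr for rid, (ctr, cvr) in predictions.items()}
    # Highest ranking score first.
    return sorted(scores, key=scores.get, reverse=True)
```

For example, a resource with a modest CTR but a high CVR can outrank one that is clicked often but rarely converts.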
Optionally, the real-time resource preference of the user is determined according to the user's historical associated behaviors with resources and the times at which those behaviors occurred; the real-time category preference of the user is determined according to the user's historical associated behaviors with categories and the times at which those behaviors occurred.
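One way to turn behaviors and their occurrence times into a real-time preference is to weight each behavior by how recent it is. The exponential half-life decay below is an assumption chosen for illustration; the text only says that the behavior and its time of occurrence are both used. The function name, the event tuple layout, and `half_life_seconds` are hypothetical.

```python
from collections import defaultdict


def realtime_preference(events, now: float,
                        half_life_seconds: float = 3600.0) -> dict:
    """Aggregate time-decayed preference per resource id or category id.

    events is an iterable of (key, weight, timestamp) tuples, where weight
    reflects the behavior type (for example a click or a purchase).
    """
    prefs = defaultdict(float)
    for key, weight, timestamp in events:
        age = max(0.0, now - timestamp)
        # A behavior loses half of its weight every half_life_seconds.
        prefs[key] += weight * 0.5 ** (age / half_life_seconds)
    return dict(prefs)
```

The same function can be applied once with resource ids as keys and once with category ids as keys to obtain the two kinds of preference.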
Optionally, a third resource is any one of the resources to be ranked in the resource set, and before the determining of the ranking score corresponding to each resource to be recommended according to the CVR and CTR corresponding to that resource, the method includes:
Determining a feature value score of the third resource according to the feature value of the third resource and the average feature value of the category to which the third resource belongs;
If the third resource is a resource with a limited validity period, further determining a weighting score of the third resource according to the expiration time of the third resource;
Determining a diversity score according to the number of resources to be ranked in the resource set that belong to the category of the third resource and the total number of categories to which the resources to be ranked in the resource set belong;
The determining of the ranking score corresponding to each resource to be recommended according to the CVR and CTR corresponding to that resource includes:
Determining the ranking score of the third resource according to the CVR, CTR, feature value score, weighting score, and diversity score of the third resource.
Optionally, the determining of the weighting score of the third resource according to the expiration time of the third resource includes:
Calculating a time weighting part according to the difference between the expiration time of the third resource and the current time;
Calculating a count weighting part according to the number of times the user has obtained the third resource;
Determining the weighting score of the third resource according to the time weighting part and the count weighting part.
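The three steps above can be sketched as a single function. The text names the inputs of each part but not the formulas, so the reciprocal of the remaining days, the logarithm of the count, and the linear combination with `time_weight` and `count_weight` are all assumptions for illustration.

```python
import math


def weighting_score(expiration_ts: float, now_ts: float, times_obtained: int,
                    time_weight: float = 1.0,
                    count_weight: float = 0.5) -> float:
    """Assumed weighting score for a resource with a limited validity period."""
    # Time weighting part: grows as the expiration time approaches.
    days_left = max(0.0, (expiration_ts - now_ts) / 86400.0)
    time_part = 1.0 / (1.0 + days_left)
    # Count weighting part: grows slowly with repeated acquisitions.
    count_part = math.log1p(times_obtained)
    return time_weight * time_part + count_weight * count_part
```

Under these assumptions, a resource that expires tomorrow receives a larger weighting score than one that expires in a month, and a resource the user has obtained several times receives a larger score than one obtained once.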
In a second aspect, the present invention provides a data processing apparatus, the apparatus including an obtaining unit, a dividing unit, and a determining unit:
The obtaining unit is configured to obtain a target semantic primitive, the target semantic primitive being a search semantic primitive used for searching;
The dividing unit is configured to divide the target semantic primitive into a plurality of segmented words (participles);
The determining unit is configured to determine a core word of the target semantic primitive according to the number of categories to which the resources in the search result obtained by searching with a segmented word belong, and according to the degree of overlap between those categories and the categories to which the resources in the search result obtained by searching with the target semantic primitive belong;
The determining unit is further configured to take the search result obtained by searching with the core word determined from the plurality of segmented words as the search result obtained by searching with the target semantic primitive.
Optionally, the dividing unit is triggered if the number of resources in the search result of the target semantic primitive is less than a first threshold, or the character length of the target semantic primitive is greater than a second threshold.
Optionally, a first segmented word is any one of the plurality of segmented words, and the determining unit is further configured to: obtain the number of times the first segmented word was used for searching in historical search behavior, the number of resources found by searching with the first segmented word, and the number of times the first segmented word co-occurred with other words in historical search behavior; calculate a core word score of the first segmented word according to those three quantities, the number of categories to which the resources in the search result obtained by searching with the first segmented word belong, and the degree of overlap between those categories and the categories to which the resources in the search result obtained by searching with the target semantic primitive belong; and, if the core word score of the first segmented word is among the top N core word scores of the plurality of segmented words, determine the first segmented word as a core word of the target semantic primitive.
Optionally, the determining unit includes an expanding subunit and a determining subunit:
The expanding subunit is configured to, if the number of resources in the search result obtained by searching with the target semantic primitive does not meet a third threshold, expand the target semantic primitive to obtain an expansion semantic primitive, the expansion semantic primitive being a search semantic primitive;
The determining subunit is configured to take the search result obtained by searching with the expansion semantic primitive as the search result obtained by searching with the target semantic primitive.
Optionally, a first resource is any one resource in the search result obtained by searching with the target semantic primitive, and the expanding subunit is further configured to: if the first resource can be found by searching with a first semantic primitive, take the first semantic primitive as the expansion semantic primitive, the first semantic primitive being a search semantic primitive; or, if a second resource can be found by searching with a second semantic primitive, take the second semantic primitive as the expansion semantic primitive, the second semantic primitive being a search semantic primitive, and the second resource being a resource similar to the first resource.
Optionally, the expanding subunit is further configured to take a third semantic primitive whose edit distance from the target semantic primitive is less than a fourth threshold as the expansion semantic primitive, the third semantic primitive being a search semantic primitive; or to take a fourth semantic primitive whose semantic similarity to the target semantic primitive is less than a fifth threshold as the expansion semantic primitive, the fourth semantic primitive being a search semantic primitive.
Optionally, the expanding subunit is further configured to: calculate the association frequency between each obtained expansion semantic primitive and the target semantic primitive; obtain the top M expansion semantic primitives with the highest association frequency; further determine L expansion semantic primitives from the M expansion semantic primitives according to the character length of each of the M expansion semantic primitives, the number of times each was used for searching in historical search behavior, and the degree of overlap between the categories of the resources in the search result obtained by searching with it and the categories of the resources in the search result obtained by searching with the target semantic primitive; and take the L expansion semantic primitives as the expansion semantic primitives of the target semantic primitive.
Optionally, if the user who enters the target semantic primitive to search connects to a server through a client, the apparatus further includes a recommending unit:
The recommending unit is configured to determine resources to be recommended according to the search semantic primitives and core words the user has recently used for searching, and to recommend the resources to be recommended to the client.
Optionally, the recommending unit is further configured to obtain the real-time preference of the user, the real-time preference including a real-time resource preference and a real-time category preference, and to determine the resources to be recommended according to the real-time preference and the search semantic primitives and core words the user has recently used for searching.
Optionally, the apparatus further includes a resource obtaining unit, a preference obtaining unit, a click-rate determining unit, a score determining unit, and a ranking unit:
The resource obtaining unit is configured to obtain a resource set of resources to be ranked, the resource set being a search result or resources to be recommended;
The preference obtaining unit is configured to obtain the real-time preference of the user who enters the target semantic primitive to search, the real-time preference including a real-time resource preference and a real-time category preference;
The click-rate determining unit is configured to obtain a click conversion rate (CVR) and a click-through rate (CTR) for each resource to be ranked according to the features of the user, the real-time preference, and the cross features formed between those features and the resources to be ranked in the resource set;
The score determining unit is configured to determine a ranking score for each resource to be ranked according to its CVR and CTR;
The ranking unit is configured to rank the resources to be ranked in the resource set by ranking score.
Optionally, the real-time resource preference of the user is determined according to the user's historical associated behaviors with resources and the times at which those behaviors occurred; the real-time category preference of the user is determined according to the user's historical associated behaviors with categories and the times at which those behaviors occurred.
Optionally, a third resource is any one of the resources to be ranked in the resource set, and the score determining unit is further configured to: determine a feature value score of the third resource according to the feature value of the third resource and the average feature value of the category to which the third resource belongs; if the third resource is a resource with a limited validity period, further determine a weighting score of the third resource according to the expiration time of the third resource; determine a diversity score according to the number of resources to be ranked in the resource set that belong to the category of the third resource and the total number of categories to which the resources to be ranked in the resource set belong; and determine the ranking score of the third resource according to the CVR, CTR, feature value score, weighting score, and diversity score of the third resource.
Optionally, the score determining unit is further configured to: calculate a time weighting part according to the difference between the expiration time of the third resource and the current time; calculate a count weighting part according to the number of times the user has obtained the third resource; and determine the weighting score of the third resource according to the time weighting part and the count weighting part.
In a third aspect, the present invention provides a resource ranking method, the method comprising:
Obtaining a resource set of resources to be ranked, the resources to be ranked being resources provided on the Internet;
Obtaining the real-time preference of the user who enters the target semantic primitive to search, the real-time preference including a real-time resource preference and a real-time category preference;
Obtaining a click conversion rate (CVR) and a click-through rate (CTR) for each resource to be ranked according to the features of the user, the real-time preference, and the cross features formed between those features and the resources to be ranked in the resource set;
Determining a ranking score for each resource to be ranked according to its CVR and CTR;
Ranking the resources to be ranked in the resource set by ranking score.
Optionally, the real-time resource preference of the user is determined according to the user's historical associated behaviors with resources and the times at which those behaviors occurred; the real-time category preference of the user is determined according to the user's historical associated behaviors with categories and the times at which those behaviors occurred.
Optionally, a third resource is any one of the resources to be ranked in the resource set, and before the determining of the ranking score corresponding to each resource to be recommended according to the CVR and CTR corresponding to that resource, the method includes:
Determining a feature value score of the third resource according to the feature value of the third resource and the average feature value of the category to which the third resource belongs;
If the third resource is a resource with a limited validity period, further determining a weighting score of the third resource according to the expiration time of the third resource;
Determining a diversity score according to the number of resources to be ranked in the resource set that belong to the category of the third resource and the total number of categories to which the resources to be ranked in the resource set belong;
The determining of the ranking score corresponding to each resource to be recommended according to the CVR and CTR corresponding to that resource includes:
Determining the ranking score of the third resource according to the CVR, CTR, feature value score, weighting score, and diversity score of the third resource.
Optionally, the determining of the weighting score of the third resource according to the expiration time of the third resource includes:
Calculating a time weighting part according to the difference between the expiration time of the third resource and the current time;
Calculating a count weighting part according to the number of times the user has obtained the third resource;
Determining the weighting score of the third resource according to the time weighting part and the count weighting part.
In a fourth aspect, the present invention provides a personalized shopping guide framework, the personalized shopping guide framework including an online computing module and an offline computing module:
The online computing module is used for real-time merchant behavior analysis, commodity recall, and personalized ranking;
The offline computing module is responsible for merchant/service feature updates, ranking model training, and candidate commodity pool computation.
Optionally, the personalized shopping guide framework is further used for identifying real-time preferences, matching and recalling resources, and model-based ranking.
As can be seen from the above technical solutions, to improve the search experience, a target semantic primitive can be divided into segmented words. For a given segmented word, if the resources in the search result obtained by searching with that word belong to relatively few categories, and those categories overlap strongly with the categories of the resources in the search result obtained by searching with the target semantic primitive, it can be determined that the word is largely consistent with the features the target semantic primitive actually carries. In other words, the word captures the core meaning of the target semantic primitive better than the other words do, and can be taken as a core word of the target semantic primitive. The resources in the search result obtained with the core word share, to some extent, the features of the resources the user hoped to find with the target semantic primitive. Treating the search result of the core word as the search result of the target semantic primitive therefore effectively expands the number of results, and the added resources are likely to be relevant to the purpose of searching with the target semantic primitive, which improves the user's search experience.
Description of the drawings
In order to illustrate more clearly about the embodiment of the present invention or technical scheme of the prior art, to embodiment or will show below There is attached drawing needed in technology description to be briefly described, it should be apparent that, the accompanying drawings in the following description is only this Some embodiments of invention, for those of ordinary skill in the art, without having to pay creative labor, may be used also To obtain other attached drawings according to these attached drawings.
Fig. 1 is a schematic diagram, provided by an embodiment of the present invention, of processing a search semantic primitive in order to improve the search experience;
Fig. 2 is a flowchart of a data processing method provided by an embodiment of the present invention;
Fig. 3 is a flowchart of a core word determination method provided by an embodiment of the present invention;
Fig. 4 is a flowchart of a method for screening expansion semantic primitives provided by an embodiment of the present invention;
Fig. 5 is a flowchart of a resource ranking method provided by an embodiment of the present invention;
Fig. 6 is a schematic diagram of a personalization framework of a service platform provided by an embodiment of the present invention;
Fig. 7 is a diagram of the display effect of a personalized scenario of a service platform provided by an embodiment of the present invention;
Fig. 8 is a structural diagram of a data processing apparatus provided by an embodiment of the present invention;
Fig. 9 is a structural diagram of a ranking apparatus provided by an embodiment of the present invention.
Detailed description of the embodiments
Embodiments of the present invention are described below with reference to the accompanying drawings.
When searching on an Internet platform, problems with the entered search words sometimes lead to very few search results, or to none at all. For example, on a service platform aimed specifically at sellers, the resources provided are mainly management tools or service tools that help sellers manage their online shops, the commodities in those shops, and so on. The number of resources provided on such a service platform is therefore far smaller than the number of resources provided on a shopping platform aimed at buyers (such as Tmall). As a result, a user (such as a seller) searching on the service platform is more likely to get no search results, which leads to a poor search experience.
For this purpose, embodiments of the present invention provide a data processing method and a related apparatus that can preprocess a target semantic primitive used for searching, with the aim of increasing the number of resources in the search result while maintaining the quality of those resources. The target semantic primitive in the embodiments of the present invention can be understood as a search semantic primitive, that is, a semantic primitive that is used for searching and that carries features identifying the resources the user hopes to obtain. A search semantic primitive can take the form of a keyword, a key phrase, a key short sentence or the like, singly or in any combination. The resources in the embodiments of the present invention mainly refer to resources provided on the Internet, which can include physical resources and virtual resources. A physical resource can be an article with a physical structure, such as a household appliance, clothing or cosmetics. A virtual resource can be a virtualized product, such as a virtual item in a game, game currency or an electronic coupon; a virtual resource can also be a service or a management resource, such as a home cleaning service provided by a cleaner or a management service for an online shop.
Categories are among the features related to resources. Because a great many resources are provided on the Internet, the resources can be classified so that users can browse or search for the resources they need more easily and save time: resources having the same attribute are grouped into one category. A category can be understood as a general name for a class of resources, and the resources under a category all have at least one attribute in common. For example, "shoes" can serve as a category that includes every resource with the attribute "shoes", such as sports shoes, casual shoes, sandals and canvas shoes, or more specific resources of particular brands and sizes. "Canvas shoes" can likewise serve as a category that includes every resource with the attribute "canvas shoes", such as specific resources like brand a canvas shoes, brand b canvas shoes, fleece-lined canvas shoes and high-top canvas shoes. The present invention does not limit how resources are grouped into categories; they may be grouped according to classification granularity or according to specific requirements.
By preprocessing the target semantic primitive, multiple participles can be divided from it. For example, as shown in Fig. 1, target semantic primitive 100 is divided into x participles 200. Then, based on the categories to which the resources in the search result of each participle belong, y core words 300 that embody the core features of target semantic primitive 100 are determined from the x participles 200. Because the y core words 300 are determined from the x participles 200, y is generally less than or equal to x.
After the core words are determined, the search result 400 obtained by searching with the y core words 300 is used as a search result of target semantic primitive 100. "Used as" here does not mean that the search result of the core words replaces the original search result 500 obtained by searching with the target semantic primitive alone; rather, it mainly means that search result 400 of the core words expands original search result 500 of the target semantic primitive. After the expansion, the search result of target semantic primitive 100 contains more resources, and the added resources also have at least some of the relevant features carried by the target semantic primitive. That is, the added resources are highly relevant to the resources in the search result obtained with the target semantic primitive, which improves the search experience.
It can be seen that the target semantic primitive is divided into participles. For the search result obtained with one participle, if the resources in that search result belong to relatively few categories, and these categories overlap to a high degree with the categories of the resources in the search result obtained by searching with the target semantic primitive, it can be determined that this participle is basically consistent with the features actually carried by the target semantic primitive. In other words, this participle embodies the core meaning of the target semantic primitive relatively well and can serve as a core word of the target semantic primitive. The features of the resources in the search result obtained with a core word are, to a certain extent, consistent with the features of the resources that the user hopes to find with the target semantic primitive. Therefore, using the search result corresponding to the core word as a search result corresponding to the target semantic primitive effectively expands the number of resources, and the added resources are more likely to be relevant to the purpose of the search performed with the target semantic primitive, thereby improving the user's search experience.
Fig. 2 is a flowchart of a data processing method provided by an embodiment of the present invention. The method can be applied to a server and includes the following steps:
S201: Obtain a target semantic primitive, where the target semantic primitive is a search semantic primitive used for searching.
For example, the target semantic primitive obtained in this step may be a search semantic primitive that a user has just entered and for which no search result has yet been obtained, or a search semantic primitive used in historical search behavior. For a search semantic primitive that has just been entered, the corresponding core words can be obtained through the flow of Fig. 2, and the search result of the target semantic primitive can be expanded with the search results of the core words. For a search semantic primitive from historical search behavior, the corresponding core words can likewise be obtained through the flow of Fig. 2, and the expanded search result of the target semantic primitive can then be put to use, for example in resource recommendation, where the expanded search result enriches the resources that can be recommended.
In the embodiments of the present invention, word segmentation may be performed directly on the obtained target semantic primitive, or, to improve computational efficiency, it may be performed selectively. For selective processing, it can be determined in advance whether the search result that the target semantic primitive would yield is likely to harm the user's search experience. One of the main factors that harms the search experience is that too few resources are found, leaving the user little to choose from. Whether a target semantic primitive needs to be segmented can therefore be judged directly from the number of resources in its search result. Alternatively, because a longer search semantic primitive generally identifies more features, and resources that have all of these features at once are obviously rare or nonexistent, whether a target semantic primitive needs to be segmented can also be judged from its character length. To this end, an embodiment of the present invention provides a mechanism for determining whether the target semantic primitive needs to be segmented: optionally, if the number of resources in the search result of the target semantic primitive is less than a first threshold, or the character length of the target semantic primitive is greater than a second threshold, S202 is performed. That is, when searching with the target semantic primitive yields few resources, or the target semantic primitive is long, word segmentation can be performed on it.
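As a non-limiting illustration, the optional judgment described above can be sketched as follows. The function name and the default threshold values are hypothetical and are chosen only for the example; the embodiment does not specify concrete values for the first and second thresholds.

```python
def needs_segmentation(query: str, result_count: int,
                       first_threshold: int = 10,
                       second_threshold: int = 8) -> bool:
    """Decide whether S202 (word segmentation) should be performed.

    query            -- the target semantic primitive as entered
    result_count     -- number of resources in its search result
    first_threshold  -- hypothetical minimum acceptable resource count
    second_threshold -- hypothetical maximum acceptable character length
    """
    too_few_results = result_count < first_threshold
    too_long = len(query) > second_threshold
    # Either condition alone is enough to trigger segmentation.
    return too_few_results or too_long
```

With these example thresholds, a short query that returns many resources is left unsegmented, while a query with few results or a long query proceeds to S202.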
S202: Divide the target semantic primitive into multiple participles.
For example, the target semantic primitive can be segmented on the basis of word meaning or word structure. The participles obtained by dividing the target semantic primitive "women's dress shoes" can include "women's dress" and "shoes"; the participles obtained by dividing the target semantic primitive "commodity label collection" can include "commodity", "label" and "collection".
S203: Determine the core word of the target semantic primitive according to the number of categories to which the resources in the search result obtained by searching with a participle belong, and the degree of overlap between those categories and the categories to which the resources in the search result obtained by searching with the target semantic primitive belong.
For example, after the multiple participles are determined, each participle can be used separately as a keyword for an individual search, yielding the search result of each participle when searched on its own. For example, with 10 participles, searching with participle 1 alone yields search result 1, and searching with participle 2 alone yields search result 2.
For any one of the multiple participles, the number of categories to which the resources found with this participle belong can be determined. A large number indicates that this participle finds resources of many different categories, which means that the feature identified by this participle is a general feature that resources under different categories may all have. If the resources found with a participle belong to few categories, the resources having the feature identified by this participle are relatively concentrated, and the feature is likely to be typical of the resources under one or a few categories. In general, a user searching with the target semantic primitive wants to find a set of resources with fairly specific features, so an overly general feature may not identify the core meaning of the target semantic primitive well. When the resources with the feature identified by a participle are relatively concentrated, this participle may identify the core meaning of the target semantic primitive well; in other words, the feature identified by this participle is highly relevant to the features identified by the target semantic primitive.
The degree of overlap between the categories of the resources found with a participle and the categories of the resources found with the target semantic primitive can also be determined. The categories in which the resources of a search result are concentrated reveal the specific features identified by the search semantic primitive used for the search. For example, if the search semantic primitive is "shoes", most of the resources found with it will be concentrated in the "shoes" category.
Therefore, if a large proportion of the categories of the resources found with a participle are the same as the categories of the resources found with the target semantic primitive, the feature identified by this participle is fairly similar to the specific features identified by the target semantic primitive, and this participle can identify the core meaning of the target semantic primitive. In other words, the feature identified by this participle is highly relevant to the features identified by the target semantic primitive.
It can be seen that these two features, namely the number of categories to which the resources in the search result of a participle belong, and the degree of overlap between those categories and the categories to which the resources in the search result of the target semantic primitive belong, clearly indicate whether a participle can identify the specific features of the resources that the user hopes to find with the target semantic primitive. In other words, they clearly indicate whether the feature identified by a participle is highly relevant to the features identified by the target semantic primitive.
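As a non-limiting illustration, the two category features used in S203 can be computed as in the following sketch. The function name and the data layout are hypothetical. The degree of overlap is taken here as the share of the participle's categories that also occur among the target semantic primitive's categories, which is one possible reading of the description and is an assumption of this example.

```python
def category_features(participle_results, target_results):
    """Return (category count, degree of overlap) for one participle.

    Both arguments are lists of (resource_id, category) pairs taken from
    the respective search results.
    """
    participle_cats = {cat for _, cat in participle_results}
    target_cats = {cat for _, cat in target_results}
    cate_count = len(participle_cats)
    if not participle_cats:
        return 0, 0.0
    # Assumed definition: fraction of the participle's categories that
    # are shared with the target semantic primitive's categories.
    cate_match = len(participle_cats & target_cats) / len(participle_cats)
    return cate_count, cate_match
```

A participle with a small category count and an overlap close to 1 would be a stronger core word candidate than one whose results are spread over many unrelated categories.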
Through the analysis of the participles in S203, the core words of the target semantic primitive can be determined from the multiple participles. A core word is a participle that embodies the core meaning of the target semantic primitive, or a participle that is highly relevant to the features identified by the target semantic primitive.
A target semantic primitive can have one or more core words, which are determined from the participles divided from it. For example, if a target semantic primitive is divided into 10 participles, participle 1 to participle 10, then 5 of them, for example participle 1 to participle 5, may be determined as core words.
S204: Use the search result obtained by searching with the core word determined from the multiple participles as a search result obtained by searching with the target semantic primitive.
For example, because the feature identified by the determined core word is similar or highly relevant to the features identified by the target semantic primitive, the resources found with the core word are, to a certain extent, relevant to the features carried by the target semantic primitive, and are quite likely to be resources needed by the user who searched with the target semantic primitive. Using the search result obtained with the core word as a search result of the target semantic primitive therefore increases the number of resources beyond what searching with the target semantic primitive alone yields, and the added resources also embody, to a certain extent, the features identified by the target semantic primitive. Consequently, when a search result that includes both the search result obtained with the core word and the search result obtained with the target semantic primitive is displayed, the search experience is improved in terms of both the number and the quality of the resources.
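As a non-limiting illustration, the expansion in S204 can be sketched as a merge in which the original search result is kept and resources found with the core words are appended without duplicates. The function name and the use of resource identifiers are hypothetical; how the merged result is ordered for display is not addressed by this sketch.

```python
def expand_results(original, core_word_results):
    """Expand the original search result with core word search results.

    original          -- list of resource ids found with the target
                         semantic primitive itself
    core_word_results -- dict mapping each core word to the list of
                         resource ids found with that core word
    """
    merged = list(original)   # the original result is kept, not replaced
    seen = set(original)
    for results in core_word_results.values():
        for resource_id in results:
            if resource_id not in seen:   # skip resources already present
                seen.add(resource_id)
                merged.append(resource_id)
    return merged
```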
It can be seen that the target semantic primitive is divided into participles. For the search result obtained with one participle, if the resources in that search result belong to relatively few categories, and these categories overlap to a high degree with the categories of the resources in the search result obtained by searching with the target semantic primitive, it can be determined that this participle is basically consistent with the features actually carried by the target semantic primitive. In other words, this participle embodies the core meaning of the target semantic primitive relatively well and can serve as a core word of the target semantic primitive. The features of the resources in the search result obtained with a core word are, to a certain extent, consistent with the features of the resources that the user hopes to find with the target semantic primitive. Therefore, using the search result corresponding to the core word as a search result corresponding to the target semantic primitive effectively expands the number of resources, and the added resources are more likely to be relevant to the purpose of the search performed with the target semantic primitive, thereby improving the user's search experience.
In addition to the features mentioned in S203 for determining core words, relevant parameters of the participles in historical search behavior can also be taken into account when core words are determined from the divided participles, in order to improve accuracy. For clarity, the following description takes a first participle, which is any one of the multiple participles divided from the target semantic primitive, as an example. Each of the multiple participles can be processed in the same way as the first participle to calculate and determine the core words.
Fig. 3 is a flowchart of a core word determination method provided by an embodiment of the present invention. The method includes the following steps:
S301: Obtain the number of times the first participle was used for searching in historical search behavior, the number of resources found by searching with the first participle, and the number of times the first participle appeared together with other words in historical search behavior.
For example, historical search behavior can include search behavior previously carried out on the Internet platform. In historical search behavior, the first participle may itself have been used as a search semantic primitive. The number of times the first participle was used for searching in historical search behavior indicates its search popularity. If that number is high, the first participle is a search semantic primitive that is frequently used for searching, and frequent use reflects its reliability and validity as a search semantic primitive.
The number of resources found by searching with the first participle reflects, to a certain extent, its search quality: the more resources are found, the better the search quality of the first participle. If the first participle is a core word of the target semantic primitive, it can then contribute more resources to the search result of the target semantic primitive.
The number of times the first participle appeared together with other words in historical search behavior reflects how general the feature identified by the first participle is. If the first participle appears together with a large number of different words in the search semantic primitives of historical search behavior, the first participle is overly general and the feature it identifies is likely to be a common one, so it will be difficult for the first participle to embody the specific features that the target semantic primitive is intended to express. If few other words appear together with the first participle in search semantic primitives, so that the first participle forms relatively few new search semantic primitives in combination with other words, the feature identified by the first participle is mainly reflected in the resources that can be found with those few combined search semantic primitives, and that feature is therefore comparatively specific.
It can be seen that the number of times the first participle was used for searching in historical search behavior, the number of resources found by searching with the first participle, and the number of times the first participle appeared together with other words in historical search behavior indicate, to a certain extent, whether the feature identified by the first participle is a specific feature or a general one, which further improves the accuracy of core word determination.
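As a non-limiting illustration, two of the historical features obtained in S301 can be gathered from a search log as in the following sketch. The log layout (one user identifier and one list of words per search), the function name and the choice to count searching users rather than raw searches are assumptions made for the example.

```python
from collections import Counter


def history_features(participle, search_log):
    """Collect historical search features for one participle.

    search_log -- iterable of (user_id, words) pairs, where words is the
                  list of words in the search semantic primitive used

    Returns (query_cnt, co_occurrence):
      query_cnt     -- number of distinct users who searched with the
                       participle on its own (deduplicated by user)
      co_occurrence -- Counter of the other words that appeared together
                       with the participle in search semantic primitives
    """
    searching_users = set()
    co_occurrence = Counter()
    for user_id, words in search_log:
        if participle not in words:
            continue
        others = [w for w in words if w != participle]
        if not others:
            # The participle itself was the whole search semantic primitive.
            searching_users.add(user_id)
        co_occurrence.update(others)
    return len(searching_users), co_occurrence
```

A participle whose co-occurrence counter contains many different words would be treated as general, whereas one that appears with only a few words would be treated as comparatively specific.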
S302: Calculate the core word score of the first participle according to the number of times the first participle was used for searching in historical search behavior, the number of resources found by searching with the first participle, the number of times the first participle appeared together with other words in historical search behavior, the number of categories to which the resources in the search result obtained by searching with the first participle belong, and the degree of overlap between those categories and the categories to which the resources in the search result obtained by searching with the target semantic primitive belong.
For example, since a user searching with the target semantic primitive generally wants to find resources with fairly specific features, when core words are determined using the features obtained in S301, the aim is to use those features to exclude, as far as possible, participles that may identify general features, and to retain participles that identify specific features as core words.
An embodiment of the present invention further provides a specific way of calculating the core word score (score), as shown in the following formula:
where query_cnt is the number of times the first participle was used for searching in historical search behavior, which may be a value deduplicated by user; result_cnt is the number of resources in the search result obtained by searching with the first participle; cate_entropy is the category distribution entropy determined from the number of categories to which the resources in the search result obtained by searching with the first participle belong in historical search behavior; adjoin_entropy is the adjacency entropy determined from the number of times the first participle appeared together with other words in historical search behavior; and cate_match is the degree of overlap between the categories to which the resources in the search result obtained by searching with the first participle belong and the categories to which the resources in the search result obtained by searching with the target semantic primitive belong.
Further,
where p_i = (number of found resources that belong to category i) / (total number of found resources), and n is the number of categories to which the found resources belong;
where p_i = (number of times the first participle and the i-th word appear together in search semantic primitives) / (number of times the first participle appears together with other words in search semantic primitives), and n is the total number of words that appear together with the first participle in search semantic primitives.
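As a non-limiting illustration, and assuming that the category distribution entropy and the adjacency entropy are both ordinary Shannon entropies of the form -(p_1 log p_1 + ... + p_n log p_n) over the proportions p_i defined above, the two quantities can be computed as in the following sketch. The function name is hypothetical, and the use of the natural logarithm is an assumption, since the logarithm base is not stated in this text. The sketch does not attempt to reproduce the core word score formula itself.

```python
import math


def entropy(counts):
    """Shannon entropy (natural log) of the distribution given by counts.

    For cate_entropy, counts holds the number of found resources per
    category; for adjoin_entropy, it holds the number of co-occurrences
    of the first participle with each other word.
    """
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)
```

Under this assumption, resources concentrated in a single category give an entropy of zero, and the entropy grows as the resources (or co-occurring words) are spread more evenly over more categories (or words).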
S303: If the core word score of the first participle is among the N highest core word scores of the participles in the multiple participles, determine the first participle as a core word of the target semantic primitive.
For example, a higher core word score for the first participle indicates that the feature it identifies is less general and is more similar and more relevant to the features identified by the target semantic primitive. N may be configured according to the needs of the specific scenario, or may be a fixed value.
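As a non-limiting illustration, the selection in S303 can be sketched as follows, assuming that a core word score has already been calculated for every participle. The function name and the example scores are hypothetical.

```python
def select_core_words(scores, n):
    """Return the participles with the N highest core word scores.

    scores -- dict mapping each participle to its core word score
    n      -- number of core words to keep
    """
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [participle for participle, _ in ranked[:n]]
```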
As can be seen from the above embodiment, in determining the core words of the target semantic primitive, while it is ensured that the feature identified by a core word determined from the participles is highly relevant to the features identified by the target semantic primitive, the relevant parameters of the participle itself in historical search behavior are also taken into account, which improves the accuracy of core word determination.
It should be noted that core words are obtained from the participles divided from the target semantic primitive. If the meaning expressed by the target semantic primitive is itself unclear, it may be difficult to determine core words from the divided participles, or the core words that are determined may still fail to find relevant resources. Because this problem is likely to be caused by the target semantic primitive itself, it is also difficult to solve with core words obtained from the participles of the target semantic primitive. To this end, an embodiment of the present invention further provides a scheme for expanding the target semantic primitive, in which the search result obtained with an expansion semantic primitive is used to expand the search result of the target semantic primitive. Therefore, optionally, after S204 is performed, it can be judged whether the number of resources in the current search result of the target semantic primitive (the search result already expanded with the search results of the core words) is less than a predetermined threshold (a third threshold). When the number of resources in the search result is small, the target semantic primitive can be expanded.
That is, if the number of resources in the search result obtained by searching with the target semantic primitive does not meet the third threshold, expansion is performed according to the target semantic primitive to obtain an expansion semantic primitive.
The search result obtained by searching with the expansion semantic primitive is used as a search result obtained by searching with the target semantic primitive.
For example, an expansion semantic primitive is also a search semantic primitive, and one or more expansion semantic primitives may be obtained by expanding the target semantic primitive.
The expansion is based mainly on the target semantic primitive, for example on the resources obtained by searching with it. The present invention provides several ways of expansion, depending on whether searching with the target semantic primitive yields a search result. The following description takes a first resource, which is any one resource in the search result obtained by searching with the target semantic primitive, as an example.
For example, where searching with the target semantic primitive yields a search result, if the first resource can be found by searching with a first semantic primitive, the first semantic primitive is used as the expansion semantic primitive. The first semantic primitive here can be a search semantic primitive that differs from the target semantic primitive.
For example, if the first resource can be found with both the target semantic primitive and the first semantic primitive, the features identified by the two are relevant to each other at least in part, and that part can be the features that the first resource has. The resources found with the first semantic primitive are therefore more likely to be relevant to the resources found with the target semantic primitive, so the first semantic primitive, which has a certain relevance to the target semantic primitive, can be used as an expansion semantic primitive.
Alternatively, where searching with the target semantic primitive yields a search result, if a second resource can be found by searching with a second semantic primitive, the second semantic primitive is used as the expansion semantic primitive. The second semantic primitive is a search semantic primitive, and the second resource is a resource that is similar to the first resource.
For example, other resources that are similar to the first resource, such as the second resource, can also be looked up. If the second resource can be found with the second semantic primitive, it can be determined that the features identified by the target semantic primitive and by the second semantic primitive are relevant to each other at least in part, and that part can be the features that the first resource and the second resource have in common. The resources found with the second semantic primitive are more likely to be relevant to the resources found with the target semantic primitive, so the second semantic primitive, which has a certain relevance to the target semantic primitive, can be used as an expansion semantic primitive.
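As a non-limiting illustration, the two resource-based ways of expansion described above can be sketched as follows. The function name, the index that maps historical search semantic primitives to the resources they found, and the table of similar resources are all hypothetical; how resource similarity is determined is not addressed by this sketch.

```python
def expansions_by_shared_resources(target, query_to_resources, similar=None):
    """Find expansion candidates through shared or similar resources.

    target             -- the target semantic primitive
    query_to_resources -- dict mapping each search semantic primitive to
                          the resource ids it found
    similar            -- optional dict mapping a resource id to the ids
                          of resources regarded as similar to it
    """
    similar = similar or {}
    target_resources = set(query_to_resources.get(target, ()))
    wanted = set(target_resources)            # first resources
    for resource_id in target_resources:
        wanted.update(similar.get(resource_id, ()))   # second resources
    # Any other search semantic primitive that finds a wanted resource
    # is a candidate expansion semantic primitive.
    return sorted(query for query, resources in query_to_resources.items()
                  if query != target and wanted.intersection(resources))
```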
For both the case where searching with the target semantic primitive yields a search result and the case where it does not, a third semantic primitive whose edit distance from the target semantic primitive is less than a fourth threshold can be used as the expansion semantic primitive, where the third semantic primitive is a search semantic primitive.
For example, the edit distance reflects how close two search semantic primitives are in their character composition. When the difference in character composition is small (less than the fourth threshold), the semantic difference between the two search semantic primitives may not be very large, so the third semantic primitive, which has a certain semantic similarity to the target semantic primitive, can be used as an expansion semantic primitive.
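As a non-limiting illustration, expansion by edit distance can be sketched as follows, assuming that the edit distance is the Levenshtein distance (single-character insertions, deletions and substitutions). The function names and the candidate list are hypothetical, and the value of the fourth threshold is left to the caller.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between strings a and b (dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        curr = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def expansions_by_edit_distance(target, candidates, fourth_threshold):
    """Keep candidates whose edit distance to target is below the threshold."""
    return [c for c in candidates
            if c != target and edit_distance(target, c) < fourth_threshold]
```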
Alternatively, for both the case where searching with the target semantic primitive yields a search result and the case where it does not, a fourth semantic primitive whose semantic similarity measure with respect to the target semantic primitive is less than a fifth threshold can be used as the expansion semantic primitive, where the fourth semantic primitive is a search semantic primitive.
For example, by judging semantic similarity, for instance by applying the word2vec semantic analysis technique, the degree of similarity in semantics or word meaning between two search semantic primitives can be determined. When this similarity measure between two search semantic primitives is small (less than the fifth threshold), the features they identify should also be close, so the fourth semantic primitive, which has a certain semantic similarity to the target semantic primitive, can be used as an expansion semantic primitive.
It should be noted that if expansion yields a large number of expansion semantic primitives, using their search results to expand the search result of the target semantic primitive may leave too many resources in the expanded search result, which instead degrades the search experience. The expansion semantic primitives can therefore be screened before the search result of the target semantic primitive is expanded. As shown in Fig. 4, the method includes the following steps:
S401: Calculate the association frequency between each obtained expansion semantic primitive and the target semantic primitive.
For example, the association frequency mainly reflects the degree of association between the target semantic primitive and an expansion semantic primitive: the higher the association frequency, the stronger the association. The degree of association reflects the correlation or similarity of the features identified by the two search semantic primitives, so the features identified by an expansion semantic primitive that is strongly associated with the target semantic primitive should be more similar or more relevant to the features identified by the target semantic primitive.
The embodiment of the present invention proposes a way of judging the association frequency between two search semantic primitives by means of the resources they retrieve: if the resources retrievable with the target semantic primitive and the resources retrievable with an expansion semantic primitive contain many identical or related resources, the association frequency of the two search semantic primitives can be considered high; otherwise it is low.
S402: Obtain the top M expansion semantic primitives with the highest association frequency.
Screening by association frequency makes it possible to pick out, from the large number of expansion semantic primitives obtained, those with a high association frequency with the target semantic primitive. Here M is a positive integer; its setting can depend on the specific scenario requirements and on the number of expansion semantic primitives obtained by expansion.
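Steps S401 and S402 can be sketched as follows. The patent only states that more shared or related resources imply a higher association frequency; using the Jaccard overlap of the two result sets as that frequency is an assumption made here for illustration, and all names are hypothetical.

```python
def association_frequency(target_results: set, candidate_results: set) -> float:
    """Assumed measure: share of resources common to both result sets (Jaccard overlap)."""
    union = target_results | candidate_results
    if not union:
        return 0.0
    return len(target_results & candidate_results) / len(union)


def top_m(target_results: set, results_by_candidate: dict, m: int) -> list:
    """Return the M expansion semantic primitives with the highest association frequency."""
    ranked = sorted(
        results_by_candidate,
        key=lambda c: association_frequency(target_results, results_by_candidate[c]),
        reverse=True,
    )
    return ranked[:m]
```

Resources are represented here by opaque identifiers; a variant that also counts related (not only identical) resources would need a relatedness test that the text does not specify.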
S403: According to the character length of the M expansion semantic primitives, the number of times they were used for searching in historical search behavior, and the degree of overlap between the categories of the resources in the search results they retrieve and the categories of the resources in the search result obtained with the target semantic primitive, further determine L expansion semantic primitives from the M expansion semantic primitives.
For example, to further improve expansion precision, the character length of an expansion semantic primitive can also be considered: the features identified by an expansion semantic primitive whose character length is close to that of the target semantic primitive are more likely to be highly correlated with or similar to the features identified by the target semantic primitive.
For the number of times an expansion semantic primitive was used for searching in historical search behavior, refer to the description in S301 of the number of times the first participle was used for searching in historical search behavior; it is not repeated here. An expansion semantic primitive that is frequently used for searching demonstrates its reliability and validity as a search semantic primitive.
For the degree of overlap between the categories of the resources in the search result retrieved with an expansion semantic primitive and the categories of the resources in the search result obtained with the target semantic primitive, refer to the description of the degree of overlap in S203; it is not repeated here. A higher degree of overlap indicates that the features identified by the expansion semantic primitive are highly correlated with the features identified by the target semantic primitive.
By referring to the features proposed in S403, L expansion semantic primitives can be further screened out of the M expansion semantic primitives. These have one or any combination of three features: a character length close to that of the target semantic primitive, a large number of searches, and a high degree of overlap. Here L is a positive integer less than or equal to M.
S404: Use the L expansion semantic primitives as the expansion semantic primitives of the target semantic primitive.
Through the screening in S401 to S403, a subset of expansion semantic primitives with higher correlation and similarity to the target semantic primitive, namely these L expansion semantic primitives, can be selected from a larger number of expansion semantic primitives. Expanding the search result of the target semantic primitive with the search results of these L expansion semantic primitives both ensures resource quality and keeps the number of resources from becoming excessive, further improving the search experience.
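The second screening stage (S403) can be sketched as below. The patent names the three signals but not how they are combined; the equal-weight sum, the normalizations, and every identifier in this sketch are assumptions for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    text: str              # expansion semantic primitive
    search_count: int      # times used for searching in historical search behavior
    categories: frozenset  # categories of the resources it retrieves


def category_overlap(a: frozenset, b: frozenset) -> float:
    """Assumed overlap measure: shared categories divided by all categories."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def select_top_l(target_text: str, target_categories: frozenset,
                 candidates: list, l: int) -> list:
    """Pick L of the M candidates by an assumed equal-weight sum of three signals."""
    max_count = max((c.search_count for c in candidates), default=0)

    def score(c: Candidate) -> float:
        length_closeness = 1.0 / (1.0 + abs(len(c.text) - len(target_text)))
        popularity = c.search_count / max_count if max_count else 0.0
        overlap = category_overlap(c.categories, target_categories)
        return length_closeness + popularity + overlap

    return sorted(candidates, key=score, reverse=True)[:l]
```

In practice the relative weight of the three signals would be tuned for the scenario; the patent also allows using only one of them or any combination.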
Expansion of the search result of the target semantic primitive can be applied not only to displaying real-time search results but also, effectively, to resource recommendation. Resource recommendation can be understood as showing a user, when the user is not performing a search (for example, while browsing resources or upon connecting to an Internet platform), resources that the user may be interested in or that may meet the user's needs, in the hope of increasing the likelihood that the user obtains the recommended resources.
Resource recommendation generally requires the user to log in, for example by connecting to the server through a client. If the user has searched with the target semantic primitive, resources can be recommended to the user according to the user's historical search behavior.
Because a user's needs and interests may change over time, more weight can be given, when processing the user's historical search behavior, to recent search behavior, which better reflects the user's needs. How recent counts as recent can be determined by the specific scenario requirements, for example within one week or within one day. Earlier historical search behavior has less influence on judging the user's current needs and can be ignored or given a smaller weight.
Following this idea, resources to be recommended can first be determined according to the search semantic primitives and core words used in the user's recent searches. For how core words are determined, refer to the embodiments corresponding to Fig. 2 and Fig. 3; it is not repeated here.
The resources to be recommended can be resources retrieved with the core words and search semantic primitives, or resources similar or related to those resources. Once determined, the resources to be recommended can be recommended to the client.
Besides the search semantic primitives and core words used in recent searches, the user's real-time preference can also be considered when determining resources to be recommended. The real-time preference here can include a real-time resource preference and a real-time category preference. It directly reflects the user's recent or current liking and need for resources, and it can be determined from the user's historical association behavior with resources. The real-time resource preference is determined from the user's historical association behavior with resources and the times at which that behavior occurred; the real-time category preference is determined from the user's historical association behavior with categories and the times at which that behavior occurred.
The historical association behavior between a user and a resource can include resource-related operations performed by the user, such as browsing, clicking on, or obtaining a resource. Concretely, this can take the form of adding a commodity to a shopping cart, submitting an order, adding a commodity to favorites, and so on. Clearly, such historical association behavior is related to the user's need and liking for resources.
Because a user's needs and interests may change over time, the user's historical association behavior with resources can be handled in a way similar to the user's historical search behavior: recent association behavior better reflects the user's current needs, and the earlier the behavior occurred, the less it influences the judgment of the user's current needs.
Therefore, before the resources to be recommended are determined, the user's real-time preference can be obtained, and the resources to be recommended can be determined according to the real-time preference and the search semantic primitives and core words used in the user's recent searches.
After expansion the number of resources in the search result may be large, and the number of resources recommended during resource recommendation may also be large. If the resources most likely to meet the user's needs can be selected from this larger set and shown to the user first, the user is more likely to see the needed resource right away. In resource search, this lets the user pick out the required resource more efficiently, that is, it shortens the time spent browsing search results and improves the search experience. In resource recommendation, it increases the likelihood that the user obtains a recommended resource and improves the efficiency of recommendation. How to show the user first the resources most likely to meet the user's needs is therefore a problem that urgently needs to be solved.
Next, the resource sorting method provided by the embodiment of the present invention is described in detail on the basis of the embodiments corresponding to Fig. 2 to Fig. 4. It should be noted that the embodiment corresponding to Fig. 5 can also be implemented independently.
Fig. 5 is a flowchart of a resource sorting method provided by an embodiment of the present invention. The method includes the following steps:
S501: Obtain a resource collection of resources to be sorted; the resource collection is a search result or resources to be recommended.
For example, the resources that need to be sorted here, in other words the resources to be sorted in the resource collection, can be the search result of the target semantic primitive after expansion or the resources to be recommended determined during resource recommendation. They can also be resources provided on the Internet.
S502: Obtain the real-time preference of the user who entered the target semantic primitive to search; the real-time preference includes a real-time resource preference and a real-time category preference.
For example, the user's real-time preference directly reflects the user's recent or current liking and need for resources. For the real-time preference here, refer to the description above of using the real-time preference when determining resources to be recommended; it is not repeated here.
The user here can be the person who entered the target semantic primitive or the subject to whom resources are to be recommended. In other words, the user is the one to whom the resource ordering determined by the embodiment shown in Fig. 5 is displayed.
S503: According to the features of the user, the real-time preference, and the cross features formed between those features and the resources to be sorted in the resource collection, obtain the click conversion rate (Click Value Rate, CVR) and click-through rate (Click-Through-Rate, CTR) corresponding to each resource to be sorted.
For example, in addition to features reflecting the user's own characteristics, the features of the user can also include, when the user is an online seller, features of the seller's online shop, such as the range and characteristics of the goods sold. Associating the features of the user with each resource to be sorted in the resource collection generates cross features, which reflect how strongly the characteristics of the user or the online shop are associated with the characteristics of the resource. It should be noted that the features of a resource to be sorted itself can also be considered when calculating its CVR and CTR.
By feeding the features of the user, the real-time preference, and the cross features formed between those features and one resource to be sorted into a CVR/CTR prediction model, the CVR and CTR corresponding to that resource can be obtained. The higher the CVR or CTR of a resource to be sorted, the more likely the resource meets the user's needs or liking, and the more likely the user is to click on, view, and obtain it.
By changing the cross features fed into the CVR/CTR prediction model, the CVR and CTR of different resources to be sorted can be obtained, until the CVR and CTR of all resources to be sorted have been obtained.
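The prediction loop in S503 can be sketched as follows. The patent does not disclose the structure of the CVR/CTR prediction model; this sketch assumes a simple logistic model over one-hot cross features, and all feature names, weights, and function names are hypothetical.

```python
import math


def cross_features(user: dict, resource: dict) -> dict:
    """Pair every user feature with every resource feature as a one-hot indicator."""
    return {
        f"{u_key}={u_val}&{r_key}={r_val}": 1.0
        for u_key, u_val in user.items()
        for r_key, r_val in resource.items()
    }


def predict_rate(features: dict, weights: dict, bias: float = 0.0) -> float:
    """Assumed model: logistic function of a weighted feature sum."""
    z = bias + sum(weights.get(name, 0.0) * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))


def score_resources(user: dict, resources: dict,
                    ctr_weights: dict, cvr_weights: dict) -> dict:
    """Swap in each resource's cross features to get its (CTR, CVR) pair."""
    result = {}
    for name, resource in resources.items():
        feats = cross_features(user, resource)
        result[name] = (predict_rate(feats, ctr_weights),
                        predict_rate(feats, cvr_weights))
    return result
```

Here the user dictionary would carry both the user or shop features and the real-time preference (for example a hypothetical `preferred_category` entry), so that a resource in the preferred category activates a cross feature with a positive learned weight.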
S504: Determine the sort score of each resource to be sorted according to its CVR and CTR.
Because the CVR and CTR of a resource to be sorted reflect how likely the user is to view and obtain it, the sort score of the resource can be determined from its CVR and CTR. In general, the higher the CVR and CTR of a resource to be sorted, the higher its sort score can be set. A higher sort score indicates that the resource is more likely to meet the user's needs and liking and more likely to be viewed and obtained by the user.
S505: Sort the resources to be sorted in the resource collection by sort score.
For example, resources with higher sort scores can be placed in preferred positions of the display area, such as the first page of the displayed search results, or the position in the resource recommendation area that the user sees first or most easily.
When calculating the sort score of a resource to be sorted, other parameters of that resource can be considered in addition to its CVR and CTR. Optionally, its characteristic value score, weighting score, and diversity score can also be considered. Next, a third resource, which is any resource to be sorted in the resource collection, is used as an example to explain how the sort score of a resource to be sorted is calculated.
The characteristic value score of the third resource can be determined according to the characteristic value of the third resource and the mean characteristic value of the category to which the third resource belongs. The characteristic value here can be a parameter identifying the value of the third resource; for example, if the third resource is a pair of shoes, the selling price of the shoes can be the characteristic value of this commodity. The mean characteristic value can identify the average value of the resources in the category to which the third resource belongs; for example, the average selling price of all shoes in the shoes category can be the mean characteristic value of that category. The difference between the characteristic value of the third resource and the mean characteristic value reflects whether the characteristic value of the third resource is readily accepted by users: if it is too low or too high relative to the mean characteristic value, the likelihood of acceptance by the user is relatively low.
If the third resource is a resource with period timeliness, the weighting score of the third resource can be determined according to its expiration time. A resource with period timeliness is one that the user can possess only within a specific period; for example, for antivirus software valid for one year, one year is its period of validity. When a third resource with period timeliness is about to expire, the user's need to continue possessing it may be strong, for example renewing antivirus software that is about to expire for another year. The weighting score of the third resource can therefore be determined from the relationship between its expiration time and the current time.
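The patent does not give a formula for the characteristic value score. As one possible illustration only, the sketch below assumes a score that equals 1 when the characteristic value matches the category mean and decreases as the relative deviation grows in either direction; the function name and the formula are assumptions.

```python
def characteristic_value_score(value: float, category_mean: float) -> float:
    """Assumed score: 1 / (1 + relative deviation from the category mean)."""
    if category_mean <= 0:
        raise ValueError("category_mean must be positive")
    relative_deviation = abs(value - category_mean) / category_mean
    return 1.0 / (1.0 + relative_deviation)
```

Under this assumption a pair of shoes priced at the category average scores 1.0, while a pair priced at twice the average scores 0.5, matching the idea that values far above or below the mean are less readily accepted.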
The embodiment of the present invention provides a specific way of calculating the weighting score (R): calculate a time-weighted part from the difference between the expiration time of the third resource and the current time; calculate a count-weighted part from the number of times the user has obtained the third resource; and determine the weighting score of the third resource from the time-weighted part and the count-weighted part. The specific calculation is given by a formula that is not reproduced in this text.
In that formula, T is the expiration time, t is the current time, m is the number of times the user has purchased the third resource, and α and β are constants. One expression in the formula is the time-weighted part and another is the count-weighted part; both expressions are missing from this text.
The diversity score of the third resource can be determined according to the number of resources to be sorted in the resource collection that belong to the category of the third resource and the total number of categories to which the resources to be sorted in the resource collection belong. The diversity score reflects whether the resources to be sorted in the resource collection are all concentrated in one category or are spread over multiple categories. If resources of different categories can be shown to the user as comprehensively as possible when resources are sorted, the user is given more choices, which provides a better experience than a display in which resources of the same category are clustered together.
After the characteristic value score, weighting score, and diversity score of the third resource are determined, a more accurate sort score can be calculated for the third resource in combination with its CTR and CVR. The embodiment of the present invention provides a specific way of calculating the sort score (score), as follows:
score = w1*CTR + w2*CVR + w3*R + w4*P + w5*D
Here w1 to w5 are the weights of CTR, CVR, R, P, and D respectively, P is the characteristic value score, and D is the diversity score. Each weight can be a constant or can be adjusted for different scenario requirements.
The diversity score can be calculated from the number of resources to be sorted and the number for the category to which a resource to be sorted belongs. The specific formula is not reproduced in this text; in it, K is the total number of resources to be sorted and Ci is the number for the category to which the resource whose sort score is being calculated belongs.
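The weighted sum for the sort score and the ordering in S505 can be sketched as follows. The field names and the example weights are hypothetical; R, P, and D are taken as already computed inputs because their formulas are not available in this text.

```python
from dataclasses import dataclass


@dataclass
class ScoredResource:
    name: str
    ctr: float  # predicted click-through rate
    cvr: float  # predicted click conversion rate
    r: float    # weighting score R
    p: float    # characteristic value score P
    d: float    # diversity score D


def sort_score(item: ScoredResource, weights: tuple) -> float:
    """score = w1*CTR + w2*CVR + w3*R + w4*P + w5*D"""
    w1, w2, w3, w4, w5 = weights
    return (w1 * item.ctr + w2 * item.cvr + w3 * item.r
            + w4 * item.p + w5 * item.d)


def rank(items: list, weights: tuple) -> list:
    """Sort resources from highest to lowest sort score."""
    return sorted(items, key=lambda it: sort_score(it, weights), reverse=True)
```

Because the weights can be constants or adjusted per scenario, passing them as an argument keeps one ranking routine usable for both search results and recommendations.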
Next, taking a service platform for Internet merchants as the application scenario, how the technical solution provided by the embodiment of the present invention is applied to the service market is further explained.
The service market is a service trading platform that provides diversified services for Taobao-ecosystem merchants and currently covers more than 95% of active Taobao-ecosystem sellers. Its characteristics are low user visit frequency, few behaviors, and periodic ordering. The original service market showed every user the same content (one face for a thousand people), could not match merchants' actual needs well, and guided purchases inefficiently.
To solve the above problems, we designed a service market personalization framework, shown in Fig. 6, which achieved notable results in the personalized search and resource recommendation scenarios. Search click-through rate rose by 13% and the no-result rate fell by 468%; transactions per thousand impressions rose by 15%; recommendation clicks rose by 90%, transactions per thousand impressions rose by 267%, and the conversion rate was 71% higher than that of the service market overall.
The personalized shopping-guide framework includes an online computing module and an offline computing module:
The online computing module is responsible for real-time merchant behavior analysis, commodity recall, and personalized ranking;
The offline computing module is responsible for merchant/service feature updates, ranking model training, and candidate commodity pool calculation.
Optionally, the personalized shopping-guide framework is also used for real-time preference identification, resource matching and recall, and model ranking.
The personalized shopping-guide framework can be applied to the service market; the service market personalization framework can be divided into online computation and offline computation.
Online computation is responsible for real-time merchant behavior analysis (the real-time log and real-time behavior analysis part and the real-time preference part in Fig. 6), commodity recall (the recommendation recall and search recall parts in Fig. 6), and personalized ranking (the model ranking part in Fig. 6). Offline computation is responsible for merchant/service feature updates (the ODPS offline computation part in Fig. 6), ranking model training (the CTR/CVR model and periodic ordering model part and the machine learning training platform part in Fig. 6), and candidate commodity pool calculation (in Fig. 6, the ODPS offline computation part and machine learning training platform for similar commodities, collocation commodities, high-quality commodities in the category, merchant shop features, service features, and cross features).
At least three parts deserve emphasis.
1. Real-time preference identification
User behavior in the service market is infrequent, so identifying the user's real-time preference helps match user needs more accurately. The real-time preference has two dimensions, real-time commodity preference and real-time category preference. The user's real-time preference model is built by time-decayed accumulation plus feedback adjustment from real-time behavior: the real-time streaming framework JStorm aggregates behavior logs such as search, category, click, and purchase into the category and commodity dimensions; the number of the user's actions on each commodity and category is accumulated with time decay; and the TopN items are selected from the accumulated historical data to generate the real-time preference.
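The time-decayed accumulation and TopN selection described above can be sketched as follows. The patent does not state the decay function; exponential decay with a half-life is assumed here, and the event format and all names are hypothetical. The streaming aggregation itself (JStorm) is outside the scope of this sketch.

```python
from collections import defaultdict


def realtime_preference(events: list, now: float, half_life: float, top_n: int) -> list:
    """Accumulate behavior counts with assumed exponential time decay, then take TopN.

    events: (key, timestamp) pairs, where key is a commodity or category identifier.
    """
    scores = defaultdict(float)
    for key, timestamp in events:
        age = max(0.0, now - timestamp)
        scores[key] += 0.5 ** (age / half_life)  # each half_life halves the contribution
    return sorted(scores, key=scores.get, reverse=True)[:top_n]
```

With this weighting, a single action performed just now outweighs several actions performed many half-lives ago, which matches the idea that recent behavior better reflects current needs. Running the function once over commodity identifiers and once over category identifiers yields the two preference dimensions.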
2. Matching and recall
Recall in service market search has several problems: searches return no results, search results have low relevance, and search results are not of sufficiently high quality. To address these problems, core word extraction and query expansion are used to analyze and supplement the semantics of the original query. This includes: adaptive word segmentation and vectorized representation of the search term based on semantic embedding; to keep the core word semantically similar to the original query, core word extraction that combines the category distribution entropy of text units, the adjacency entropy with the original query, and the degree of category match with the original query; and, to supplement the semantic description of the original query, calculation of search expansion terms for the original search term from the similarity of historical search, click, and purchase behavior and from semantic similarity. The supplemented search terms significantly reduce the no-result rate, and search click-through rate and conversion rate also improve markedly. Personalized recommendation recall is based on the real-time commodity preference, real-time category preference, recent searches, and historically ordered commodities, and is expanded with similar commodities, collocation commodities, and high-quality commodities in the category, building a high-quality and diverse commodity pool for personalized recommendation.
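The category distribution entropy mentioned above can be illustrated with a short sketch. It assumes the entropy is the Shannon entropy of the categories of the resources that a text unit retrieves; how the patent combines it with the adjacency entropy and the category match is not specified, so only the entropy itself is shown, and the function name is hypothetical.

```python
import math
from collections import Counter


def category_entropy(result_categories: list) -> float:
    """Shannon entropy (in bits) of the category distribution of retrieved resources."""
    counts = Counter(result_categories)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((n / total) * math.log2(n / total) for n in counts.values())
```

A text unit whose results all fall in one category has entropy 0, while one whose results are spread evenly over four categories has entropy 2 bits; a low value suggests the text unit points to a focused set of categories.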
3. Model ranking
The model ranking part of personalized recommendation performs personalized ranking of the recalled commodity pool in combination with the features of the current merchant's shop and the merchant's behavior. The main steps are: 1) splice the features of each user-commodity pair, including shop features, user behavior preference features, recall channel features, commodity features, and other cross features, and feed them into the CTR/CVR prediction model to predict ctr and cvr scores; 2) weight commodities that fit an ordering pattern, such as periodic repurchases and high-quality commodities, which helps improve click conversion; 3) in combination with business goals, re-rank by integrating the ctr prediction, cvr prediction, ordering-pattern weighting score, real-time preference, customer unit price, diversity score, high-quality commodity score, and search text similarity.
Applying the technical solution provided by the embodiment of the present invention to the service platform gives each user a personalized view of the platform (a thousand faces for a thousand people): the content the service platform shows to each user is essentially different, and when different users search the service platform with the same search term, the search results also differ because the users' preferences differ. As shown in Fig. 7, whether in the selected services section, the section showing what peers are using, or the guess-you-like section, content can be displayed in a targeted way according to the user's preference.
Fig. 8 is a structural diagram of a data processing device provided by an embodiment of the present invention. The device includes an acquiring unit 801, a division unit 802, and a determination unit 803:
The acquiring unit 801 is configured to obtain a target semantic primitive; the target semantic primitive is a search semantic primitive used for searching;
The division unit 802 is configured to divide the target semantic primitive into multiple participles;
The determination unit 803 is configured to determine the core word of the target semantic primitive according to the number of categories to which the resources in the search results retrieved with the participles belong, and the degree of overlap between the categories of the resources in the search results retrieved with the participles and the categories of the resources in the search result obtained with the target semantic primitive;
The determination unit 803 is further configured to use the search result retrieved with the core word determined from the multiple participles as the search result obtained with the target semantic primitive.
Optionally, the division unit is triggered if the number of resources in the search result of the target semantic primitive is less than a first threshold or the character length of the target semantic primitive is greater than a second threshold.
Optionally, the first participle is any one of the multiple participles, and the determination unit is further configured to: obtain the number of times the first participle was used for searching in historical search behavior, the number of resources retrieved with the first participle, and the number of times the first participle co-occurred with other words in historical search behavior; calculate a core word score of the first participle according to the number of times the first participle was used for searching in historical search behavior, the number of resources retrieved with the first participle, the number of times the first participle co-occurred with other words in historical search behavior, the number of categories to which the resources in the search result retrieved with the first participle belong, and the degree of overlap between the categories of the resources in the search result retrieved with the first participle and the categories of the resources in the search result obtained with the target semantic primitive; and, if the core word score of the first participle is among the top N core word scores of the multiple participles, determine the first participle as a core word of the target semantic primitive.
Optionally, the determination unit includes expanding subelement and determination subelement:
If the quantity of resource does not meet third threshold value, institute in the search result obtained according to the target semanteme unit searches Expansion subelement is stated, for according to the target semantic primitive being expanded, obtains expanding semantic primitive, it is described to expand semantic list Member is search semantic primitive;
The determination subelement, for will be according to the search result searched for of expansion semantic primitive as according to institute State the search result that target semanteme unit searches obtain.
Optionally, first resource is any one money in the search result obtained according to the target semanteme unit searches Source, the expansion subelement, if being additionally operable to be searched for obtain the first resource according to the first semantic primitive, by described first For semantic primitive as the expansion semantic primitive, first semantic primitive is a search semantic primitive;It if alternatively, being capable of root It searches for obtain the Secondary resource according to the second semantic primitive, using second semantic primitive as the expansion semantic primitive, institute The second semantic primitive is stated as a search semantic primitive, the Secondary resource is the money for having similitude with the first resource Source.
Optionally, the expansion subunit is further configured to use, as the expansion semantic primitive, a third semantic primitive whose edit distance from the target semantic primitive is less than a fourth threshold, where the third semantic primitive is a search semantic primitive; or to use, as the expansion semantic primitive, a fourth semantic primitive whose word-sense similarity to the target semantic primitive is less than a fifth threshold, where the fourth semantic primitive is a search semantic primitive.
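The edit-distance branch of this expansion can be sketched as below. The sketch assumes the Levenshtein distance as the edit distance (the text does not name a specific variant), and the function names and the sample queries are hypothetical. The word-sense similarity branch is not covered.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings, computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def expand_by_edit_distance(target, candidates, fourth_threshold):
    """Keep candidate search semantic primitives closer than the fourth threshold."""
    return [c for c in candidates
            if c != target and edit_distance(target, c) < fourth_threshold]
```

For example, with a threshold of 2, only candidates that differ from the target by a single insertion, deletion or substitution are kept.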
Optionally, the expansion subunit is further configured to: calculate the association frequency between each obtained expansion semantic primitive and the target semantic primitive; obtain the M expansion semantic primitives with the highest association frequency; further determine L expansion semantic primitives from the M expansion semantic primitives according to the character length of the M expansion semantic primitives, the number of times they were used for searching in historical search behavior, and the degree of overlap between the categories of resources in the search results obtained by searching with them and the categories of resources in the search result obtained by searching with the target semantic primitive; and use the L expansion semantic primitives as the expansion semantic primitives of the target semantic primitive.
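A minimal sketch of this two-stage selection follows. The text lists the criteria for the second stage (character length, historical search count, category overlap) but not how they are combined, so the length and count cut-offs, the final sort by overlap, and all names here are assumptions.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    text: str
    assoc_freq: float        # association frequency with the target semantic primitive
    search_count: int        # times used for searching in historical search behavior
    category_overlap: float  # overlap with the target's result categories, in [0, 1]


def select_expansions(candidates, m, l, max_len=20, min_search_count=1):
    # Stage 1: keep the M candidates with the highest association frequency.
    top_m = sorted(candidates, key=lambda c: c.assoc_freq, reverse=True)[:m]
    # Stage 2 (assumed rule): drop overly long or never-searched candidates,
    # then keep the L candidates with the highest category overlap.
    kept = [c for c in top_m
            if len(c.text) <= max_len and c.search_count >= min_search_count]
    kept.sort(key=lambda c: c.category_overlap, reverse=True)
    return [c.text for c in kept[:l]]
```

Filtering on association frequency first keeps the more expensive category comparison limited to M candidates.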
Optionally, if the user who inputs the target semantic primitive to search connects to the server through a client, the device further includes a recommendation unit:
The recommendation unit is configured to determine resources to be recommended according to the search semantic primitives and core words the user has recently used for searching, and to recommend the resources to be recommended to the client.
Optionally, the recommendation unit is further configured to obtain the real-time preference of the user, where the real-time preference includes a real-time resource preference and a real-time category preference, and to determine the resources to be recommended according to the real-time preference and the search semantic primitives and core words the user has recently used for searching.
Optionally, Fig. 9 is a structural diagram of a ranking device provided in an embodiment of the present invention. The device includes a resource acquisition unit 901, a preference acquisition unit 902, a click-rate determination unit 903, a score determination unit 904 and a ranking unit 905:
The resource acquisition unit 901 is configured to obtain a resource set of resources to be ranked, where the resource set is a search result or resources to be recommended;
The preference acquisition unit 902 is configured to obtain the real-time preference of the user who inputs the target semantic primitive to search, where the real-time preference includes a real-time resource preference and a real-time category preference;
The click-rate determination unit 903 is configured to obtain, for each resource to be ranked, a corresponding conversion rate (CVR) and click-through rate (CTR) according to features of the user, the real-time preference, and cross features formed between those features and the resources to be ranked in the resource set;
The score determination unit 904 is configured to determine the ranking score corresponding to each resource to be ranked according to the CVR and CTR corresponding to that resource;
The ranking unit 905 is configured to rank the resources to be ranked in the resource set according to their ranking scores.
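The last two units can be illustrated with a short sketch. It assumes the per-resource CVR and CTR have already been predicted by some model, and it combines them with a weighted sum controlled by a hypothetical parameter `alpha`; the text only says the ranking score is determined from CVR and CTR, not how.

```python
def rank_resources(predictions, alpha=0.5):
    """Rank resource ids by an assumed score alpha * CVR + (1 - alpha) * CTR.

    predictions maps a resource id to a (cvr, ctr) pair.
    """
    def score(item):
        _, (cvr, ctr) = item
        return alpha * cvr + (1 - alpha) * ctr

    ordered = sorted(predictions.items(), key=score, reverse=True)
    return [resource_id for resource_id, _ in ordered]
```

Setting `alpha` closer to 1 favors resources that are likely to convert, while a value closer to 0 favors resources that are likely to be clicked.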
Optionally, the real-time resource preference of the user is determined according to the user's historical interaction behaviors with resources and the times at which those behaviors occurred; the real-time category preference of the user is determined according to the user's historical interaction behaviors with categories and the times at which those behaviors occurred.
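One common way to turn behaviors and their occurrence times into a preference is exponential time decay, sketched below. The decay form, the half-life, the behavior names and their weights are all assumptions for illustration; the text only states that both the behaviors and their times are used.

```python
from collections import defaultdict


def realtime_preference(events, now, half_life_s=3600.0, behavior_weights=None):
    """Aggregate time-decayed behavior weights per key.

    events: iterable of (key, behavior, timestamp), where key is a resource id
    (resource preference) or a category name (category preference).
    """
    weights = behavior_weights or {"click": 1.0, "favorite": 2.0, "purchase": 3.0}
    pref = defaultdict(float)
    for key, behavior, ts in events:
        age = max(0.0, now - ts)
        decay = 0.5 ** (age / half_life_s)  # a behavior one half-life old counts half
        pref[key] += weights.get(behavior, 1.0) * decay
    return dict(pref)
```

The same function can serve both preferences: pass resource ids as keys for the real-time resource preference and category names for the real-time category preference.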
Optionally, a third resource is any resource to be ranked in the resource set, and the score determination unit is further configured to: determine a feature value score of the third resource according to the feature value corresponding to the third resource and the mean feature value of the category to which the third resource belongs; if the third resource is a resource with time-limited validity, further determine a weighting score of the third resource according to the expiration time of the third resource; determine a diversity score according to the number of resources to be ranked in the resource set that belong to the category of the third resource and the total number of categories to which the resources to be ranked in the resource set belong; and determine the ranking score corresponding to the third resource according to the CVR, CTR, feature value score, weighting score and diversity score corresponding to the third resource.
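The feature value score, the diversity score and their combination into a ranking score might look like the sketch below. None of the formulas come from the text, which only names the inputs: the ratio to the category mean, the diversity expression and the multiplicative combination are assumptions, as are the function names.

```python
from collections import Counter


def feature_value_score(feature_value, category_mean):
    """Assumed form: the resource's feature value relative to its category mean."""
    if category_mean <= 0:
        return 0.0
    return feature_value / category_mean


def diversity_score(category, categories_of_all):
    """Assumed form: lower when many resources to be ranked share the category.

    categories_of_all lists the category of every resource to be ranked and
    must include the category of the resource being scored.
    """
    counts = Counter(categories_of_all)
    n_same = counts[category]   # resources in the same category
    n_categories = len(counts)  # total number of distinct categories
    return n_categories / (n_categories + n_same - 1)


def ranking_score(cvr, ctr, feature_score, weighting_score, diversity):
    """Assumed multiplicative combination of the five inputs."""
    return (cvr + ctr) * feature_score * (1.0 + weighting_score) * diversity
```

Under these assumptions a resource that is the only one in its category gets a diversity score of 1, and the score shrinks as its category becomes more crowded.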
Optionally, the score determination unit is further configured to: calculate a time weighting part according to the difference between the expiration time of the third resource and the current time; calculate a count weighting part according to the number of times the user has obtained the third resource; and determine the weighting score of the third resource according to the time weighting part and the count weighting part.
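A sketch of this weighting score is given below. The text specifies only the inputs of the two parts, so the linear ramp toward expiry, the one-week horizon, the saturation count and the mixing parameter `beta` are all assumptions chosen for illustration.

```python
def time_weight(expire_ts, now, horizon_s=7 * 24 * 3600.0):
    """Assumed form: grows linearly from 0 to 1 as the expiration time approaches."""
    remaining = expire_ts - now
    if remaining <= 0:
        return 0.0  # already expired: no boost
    return max(0.0, 1.0 - remaining / horizon_s)


def count_weight(times_obtained, saturation=5):
    """Assumed form: grows with the number of times obtained, capped at 1."""
    return min(times_obtained, saturation) / saturation


def weighting_score(expire_ts, now, times_obtained, beta=0.5):
    """Blend the time weighting part and the count weighting part."""
    return beta * time_weight(expire_ts, now) + (1 - beta) * count_weight(times_obtained)
```

For instance, a time-limited resource that expires in half the horizon and has been obtained at least five times receives a weighting score of 0.75 with the default parameters.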
It can be seen that, to improve the search experience, the target semantic primitive can be segmented into participles. For the search result obtained by searching with one participle, if the resources in that search result belong to few categories, and the degree of overlap between those categories and the categories of resources in the search result obtained by searching with the target semantic primitive is high, it can be determined that the features carried by this participle are substantially consistent with those of the target semantic primitive. In other words, this participle captures the core meaning of the target semantic primitive better than the other participles and can serve as a core word of the target semantic primitive. The characteristics of resources in the search result obtained by searching with the core word are, to a certain extent, consistent with the characteristics of the resources that the target semantic primitive was intended to find. Therefore, using the search result corresponding to the core word as the search result corresponding to the target semantic primitive effectively expands the number of results, and the added resources are more likely to be relevant to the purpose of searching with the target semantic primitive, thereby improving the user's search experience.
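The two category signals summarized above, the number of categories and the degree of overlap, can be computed as in the sketch below. Using the Jaccard index as the degree of overlap is an assumption; the text does not define the measure, and the function name `category_signals` is hypothetical.

```python
def category_signals(participle_categories, target_categories):
    """Return (category count, overlap) for one participle's search result.

    Each argument lists the category of every resource in the corresponding
    search result. Overlap is computed as the Jaccard index of the category sets.
    """
    a, b = set(participle_categories), set(target_categories)
    if not a or not b:
        return len(a), 0.0
    return len(a), len(a & b) / len(a | b)
```

A participle whose results stay within the target's categories yields a small count and an overlap near 1, which is the profile the text associates with a core word.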
Those of ordinary skill in the art will understand that all or some of the steps of the above method embodiments may be implemented by hardware under the control of program instructions. The program may be stored in a computer-readable storage medium and, when executed, performs the steps of the above method embodiments. The storage medium may be at least one of the following media capable of storing program code: read-only memory (ROM), RAM, a magnetic disk, an optical disc, and the like.
It should be noted that the embodiments in this specification are described in a progressive manner. For identical or similar parts, the embodiments may be referred to one another, and each embodiment focuses on its differences from the others. In particular, the device and system embodiments are described relatively briefly because they are substantially similar to the method embodiments; for relevant details, refer to the description of the method embodiments. The device and system embodiments described above are merely illustrative. Units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the embodiment. Those of ordinary skill in the art can understand and implement the embodiments without creative effort.
The above is only one specific embodiment of the present invention, but the protection scope of the present invention is not limited thereto. Any change or replacement that a person skilled in the art could readily conceive within the technical scope disclosed by the present invention shall fall within the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (28)

1. A data processing method, characterized in that the method comprises:
obtaining a target semantic primitive, where the target semantic primitive is a search semantic primitive used for searching;
dividing the target semantic primitive to obtain multiple participles;
determining a core word of the target semantic primitive according to the number of categories to which resources in the search result obtained by searching with a participle belong, and the degree of overlap between the categories to which resources in the search result obtained by searching with the participle belong and the categories to which resources in the search result obtained by searching with the target semantic primitive belong;
using the search result obtained by searching with the core word determined from the multiple participles as the search result obtained by searching with the target semantic primitive.
2. The method according to claim 1, characterized in that, before dividing the target semantic primitive to obtain multiple participles, the method further comprises:
if the number of resources in the search result of the target semantic primitive is less than a first threshold, or the character length of the target semantic primitive is greater than a second threshold, performing the step of dividing the target semantic primitive to obtain multiple participles.
3. The method according to claim 1 or 2, characterized in that a first participle is any one of the multiple participles, and determining the core word of the target semantic primitive according to the number of categories to which resources in the search result obtained by searching with a participle belong, and the degree of overlap between the categories to which resources in the search result obtained by searching with the participle belong and the categories to which resources in the search result obtained by searching with the target semantic primitive belong, comprises:
obtaining the number of times the first participle was used for searching in historical search behavior, the number of resources found by searching with the first participle, and the number of times the first participle occurred together with other words in historical search behavior;
calculating a core word score of the first participle according to the number of times the first participle was used for searching in historical search behavior, the number of resources found by searching with the first participle, the number of times the first participle occurred together with other words in historical search behavior, the number of categories to which resources in the search result obtained by searching with the first participle belong, and the degree of overlap between the categories to which resources in the search result obtained by searching with the first participle belong and the categories to which resources in the search result obtained by searching with the target semantic primitive belong;
if the core word score of the first participle ranks among the top N core word scores of the multiple participles, determining the first participle as a core word of the target semantic primitive.
4. The method according to claim 1, characterized in that, after using the search result corresponding to the core word determined from the multiple participles as the search result of the target semantic primitive, the method further comprises:
if the number of resources in the search result obtained by searching with the target semantic primitive does not meet a third threshold, expanding the target semantic primitive to obtain an expansion semantic primitive, where the expansion semantic primitive is a search semantic primitive;
using the search result obtained by searching with the expansion semantic primitive as the search result obtained by searching with the target semantic primitive.
5. The method according to claim 4, characterized in that a first resource is any resource in the search result obtained by searching with the target semantic primitive, and expanding the target semantic primitive to obtain an expansion semantic primitive comprises:
if the first resource can be found by searching with a first semantic primitive, using the first semantic primitive as the expansion semantic primitive, where the first semantic primitive is a search semantic primitive; or,
if a second resource can be found by searching with a second semantic primitive, using the second semantic primitive as the expansion semantic primitive, where the second semantic primitive is a search semantic primitive and the second resource is a resource similar to the first resource.
6. The method according to claim 4, characterized in that expanding the target semantic primitive to obtain an expansion semantic primitive comprises:
using, as the expansion semantic primitive, a third semantic primitive whose edit distance from the target semantic primitive is less than a fourth threshold, where the third semantic primitive is a search semantic primitive; or,
using, as the expansion semantic primitive, a fourth semantic primitive whose word-sense similarity to the target semantic primitive is less than a fifth threshold, where the fourth semantic primitive is a search semantic primitive.
7. The method according to any one of claims 4 to 6, characterized in that, before using the search result obtained by searching with the expansion semantic primitive as the search result obtained by searching with the target semantic primitive, the method further comprises:
calculating the association frequency between each obtained expansion semantic primitive and the target semantic primitive;
obtaining the M expansion semantic primitives with the highest association frequency;
further determining L expansion semantic primitives from the M expansion semantic primitives according to the character length of the M expansion semantic primitives, the number of times they were used for searching in historical search behavior, and the degree of overlap between the categories of resources in the search results obtained by searching with them and the categories of resources in the search result obtained by searching with the target semantic primitive;
using the L expansion semantic primitives as the expansion semantic primitives of the target semantic primitive.
8. The method according to claim 1, characterized in that, if the user who inputs the target semantic primitive to search connects to a server through a client, the method further comprises:
determining resources to be recommended according to the search semantic primitives and core words the user has recently used for searching;
recommending the resources to be recommended to the client.
9. The method according to claim 8, characterized in that determining resources to be recommended according to the search semantic primitives and core words the user has recently used for searching comprises:
obtaining the real-time preference of the user, where the real-time preference includes a real-time resource preference and a real-time category preference;
determining the resources to be recommended according to the real-time preference and the search semantic primitives and core words the user has recently used for searching.
10. The method according to claim 1, 4 or 8, characterized by further comprising:
obtaining a resource set of resources to be ranked, where the resource set is a search result or resources to be recommended;
obtaining the real-time preference of the user who inputs the target semantic primitive to search, where the real-time preference includes a real-time resource preference and a real-time category preference;
obtaining, for each resource to be ranked, a corresponding conversion rate (CVR) and click-through rate (CTR) according to features of the user, the real-time preference, and cross features formed between those features and the resources to be ranked in the resource set;
determining the ranking score corresponding to each resource to be ranked according to the CVR and CTR corresponding to that resource;
ranking the resources to be ranked in the resource set according to their ranking scores.
11. The method according to claim 9 or 10, characterized in that the real-time resource preference of the user is determined according to the user's historical interaction behaviors with resources and the times at which those behaviors occurred; the real-time category preference of the user is determined according to the user's historical interaction behaviors with categories and the times at which those behaviors occurred.
12. The method according to claim 10, characterized in that a third resource is any resource to be ranked in the resource set, and before determining the ranking score corresponding to each resource to be recommended according to the CVR and CTR corresponding to that resource, the method comprises:
determining a feature value score of the third resource according to the feature value corresponding to the third resource and the mean feature value of the category to which the third resource belongs;
if the third resource is a resource with time-limited validity, further determining a weighting score of the third resource according to the expiration time of the third resource;
determining a diversity score according to the number of resources to be ranked in the resource set that belong to the category of the third resource and the total number of categories to which the resources to be ranked in the resource set belong;
and determining the ranking score corresponding to each resource to be recommended according to the CVR and CTR corresponding to that resource comprises:
determining the ranking score corresponding to the third resource according to the CVR, CTR, feature value score, weighting score and diversity score corresponding to the third resource.
13. The method according to claim 12, characterized in that determining the weighting score of the third resource according to the expiration time of the third resource comprises:
calculating a time weighting part according to the difference between the expiration time of the third resource and the current time;
calculating a count weighting part according to the number of times the user has obtained the third resource;
determining the weighting score of the third resource according to the time weighting part and the count weighting part.
14. A data processing device, characterized in that the device comprises an acquisition unit, a division unit and a determination unit:
the acquisition unit is configured to obtain a target semantic primitive, where the target semantic primitive is a search semantic primitive used for searching;
the division unit is configured to divide the target semantic primitive to obtain multiple participles;
the determination unit is configured to determine a core word of the target semantic primitive according to the number of categories to which resources in the search result obtained by searching with a participle belong, and the degree of overlap between the categories to which resources in the search result obtained by searching with the participle belong and the categories to which resources in the search result obtained by searching with the target semantic primitive belong;
the determination unit is further configured to use the search result obtained by searching with the core word determined from the multiple participles as the search result obtained by searching with the target semantic primitive.
15. The device according to claim 14, characterized in that the division unit is triggered if the number of resources in the search result of the target semantic primitive is less than a first threshold, or the character length of the target semantic primitive is greater than a second threshold.
16. The device according to claim 14 or 15, characterized in that a first participle is any one of the multiple participles, and the determination unit is further configured to: obtain the number of times the first participle was used for searching in historical search behavior, the number of resources found by searching with the first participle, and the number of times the first participle occurred together with other words in historical search behavior; calculate a core word score of the first participle according to the number of times the first participle was used for searching in historical search behavior, the number of resources found by searching with the first participle, the number of times the first participle occurred together with other words in historical search behavior, the number of categories to which resources in the search result obtained by searching with the first participle belong, and the degree of overlap between the categories to which resources in the search result obtained by searching with the first participle belong and the categories to which resources in the search result obtained by searching with the target semantic primitive belong; and, if the core word score of the first participle ranks among the top N core word scores of the multiple participles, determine the first participle as a core word of the target semantic primitive.
17. The device according to claim 14, characterized in that the determination unit includes an expansion subunit and a determination subunit:
the expansion subunit is configured to, if the number of resources in the search result obtained by searching with the target semantic primitive does not meet a third threshold, expand the target semantic primitive to obtain an expansion semantic primitive, where the expansion semantic primitive is a search semantic primitive;
the determination subunit is configured to use the search result obtained by searching with the expansion semantic primitive as the search result obtained by searching with the target semantic primitive.
18. The device according to claim 17, characterized in that a first resource is any resource in the search result obtained by searching with the target semantic primitive, and the expansion subunit is further configured to: if the first resource can be found by searching with a first semantic primitive, use the first semantic primitive as the expansion semantic primitive, where the first semantic primitive is a search semantic primitive; or, if a second resource can be found by searching with a second semantic primitive, use the second semantic primitive as the expansion semantic primitive, where the second semantic primitive is a search semantic primitive and the second resource is a resource similar to the first resource.
19. The device according to claim 17, characterized in that the expansion subunit is further configured to use, as the expansion semantic primitive, a third semantic primitive whose edit distance from the target semantic primitive is less than a fourth threshold, where the third semantic primitive is a search semantic primitive; or to use, as the expansion semantic primitive, a fourth semantic primitive whose word-sense similarity to the target semantic primitive is less than a fifth threshold, where the fourth semantic primitive is a search semantic primitive.
20. The device according to any one of claims 17 to 19, characterized in that the expansion subunit is further configured to: calculate the association frequency between each obtained expansion semantic primitive and the target semantic primitive; obtain the M expansion semantic primitives with the highest association frequency; further determine L expansion semantic primitives from the M expansion semantic primitives according to the character length of the M expansion semantic primitives, the number of times they were used for searching in historical search behavior, and the degree of overlap between the categories of resources in the search results obtained by searching with them and the categories of resources in the search result obtained by searching with the target semantic primitive; and use the L expansion semantic primitives as the expansion semantic primitives of the target semantic primitive.
21. The device according to claim 14, characterized in that, if the user who inputs the target semantic primitive to search connects to a server through a client, the device further includes a recommendation unit:
the recommendation unit is configured to determine resources to be recommended according to the search semantic primitives and core words the user has recently used for searching, and to recommend the resources to be recommended to the client.
22. The device according to claim 21, characterized in that the recommendation unit is further configured to obtain the real-time preference of the user, where the real-time preference includes a real-time resource preference and a real-time category preference, and to determine the resources to be recommended according to the real-time preference and the search semantic primitives and core words the user has recently used for searching.
23. The device according to claim 14, 17 or 21, characterized in that the device further includes a resource acquisition unit, a preference acquisition unit, a click-rate determination unit, a score determination unit and a ranking unit:
the resource acquisition unit is configured to obtain a resource set of resources to be ranked, where the resource set is a search result or resources to be recommended;
the preference acquisition unit is configured to obtain the real-time preference of the user who inputs the target semantic primitive to search, where the real-time preference includes a real-time resource preference and a real-time category preference;
the click-rate determination unit is configured to obtain, for each resource to be ranked, a corresponding conversion rate (CVR) and click-through rate (CTR) according to features of the user, the real-time preference, and cross features formed between those features and the resources to be ranked in the resource set;
the score determination unit is configured to determine the ranking score corresponding to each resource to be ranked according to the CVR and CTR corresponding to that resource;
the ranking unit is configured to rank the resources to be ranked in the resource set according to their ranking scores.
24. The device according to claim 22 or 23, characterized in that the real-time resource preference of the user is determined according to the user's historical interaction behaviors with resources and the times at which those behaviors occurred; the real-time category preference of the user is determined according to the user's historical interaction behaviors with categories and the times at which those behaviors occurred.
25. The device according to claim 23, characterized in that a third resource is any resource to be ranked in the resource set, and the score determination unit is further configured to: determine a feature value score of the third resource according to the feature value corresponding to the third resource and the mean feature value of the category to which the third resource belongs; if the third resource is a resource with time-limited validity, further determine a weighting score of the third resource according to the expiration time of the third resource; determine a diversity score according to the number of resources to be ranked in the resource set that belong to the category of the third resource and the total number of categories to which the resources to be ranked in the resource set belong; and determine the ranking score corresponding to the third resource according to the CVR, CTR, feature value score, weighting score and diversity score corresponding to the third resource.
26. The device according to claim 25, characterized in that the score determination unit is further configured to: calculate a time weighting part according to the difference between the expiration time of the third resource and the current time; calculate a count weighting part according to the number of times the user has obtained the third resource; and determine the weighting score of the third resource according to the time weighting part and the count weighting part.
27. A personalized shopping guide framework, characterized in that the personalized shopping guide framework includes an online computing module and an offline computing module:
the online computing module is used for real-time merchant behavior analysis, commodity recall, and personalized ranking;
the offline computing module is responsible for merchant/service feature updates, ranking model training, and candidate commodity pool calculation.
28. The personalized shopping guide framework according to claim 27, characterized in that the personalized shopping guide framework is further used for real-time preference identification, resource matching and recall, and model-based ranking.
CN201611110268.6A 2016-12-02 2016-12-02 Data processing method and related device Active CN108153792B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201611110268.6A CN108153792B (en) 2016-12-02 2016-12-02 Data processing method and related device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201611110268.6A CN108153792B (en) 2016-12-02 2016-12-02 Data processing method and related device

Publications (2)

Publication Number Publication Date
CN108153792A true CN108153792A (en) 2018-06-12
CN108153792B CN108153792B (en) 2023-04-18

Family

ID=62468178

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201611110268.6A Active CN108153792B (en) 2016-12-02 2016-12-02 Data processing method and related device

Country Status (1)

Country Link
CN (1) CN108153792B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101019118A (en) * 2004-07-13 2007-08-15 谷歌股份有限公司 Personalization of placed content ordering in search results
US20090228482A1 (en) * 2006-11-09 2009-09-10 Huawei Technologies Co., Ltd. Network search method, system and device
CN103064838A (en) * 2011-10-19 2013-04-24 阿里巴巴集团控股有限公司 Data searching method and device
CN103123632A (en) * 2011-11-21 2013-05-29 阿里巴巴集团控股有限公司 Determining method for searching headword and device of searching headword, searching method and searching equipment
CN103914533A (en) * 2014-03-31 2014-07-09 百度在线网络技术(北京)有限公司 Promotion search result display method and device
CN105302810A (en) * 2014-06-12 2016-02-03 北京搜狗科技发展有限公司 Information search method and apparatus

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110968691A (en) * 2018-09-30 2020-04-07 北京国双科技有限公司 Judicial hotspot determination method and device
CN111192657A (en) * 2018-11-15 2020-05-22 宁波方太厨具有限公司 Menu recommendation method based on user behavior heat
CN109933672A (en) * 2019-02-12 2019-06-25 北京百度网讯科技有限公司 Handle method, apparatus, electronic equipment and the computer readable storage medium of inquiry
CN110427381A (en) * 2019-08-07 2019-11-08 北京嘉和海森健康科技有限公司 A kind of data processing method and relevant device
CN110910207A (en) * 2019-10-30 2020-03-24 苏宁云计算有限公司 Method and system for improving commodity recommendation diversity
CN112765480A (en) * 2021-04-12 2021-05-07 腾讯科技(深圳)有限公司 Information pushing method and device and computer readable storage medium
CN112765480B (en) * 2021-04-12 2021-06-18 腾讯科技(深圳)有限公司 Information pushing method and device and computer readable storage medium
CN113204697A (en) * 2021-04-29 2021-08-03 五八有限公司 Searching method, searching device, electronic equipment and storage medium
CN113065932A (en) * 2021-05-06 2021-07-02 北京京东振世信息技术有限公司 Article recommendation method and device

Also Published As

Publication number Publication date
CN108153792B (en) 2023-04-18

Similar Documents

Publication Publication Date Title
CN108153792A (en) A kind of data processing method and relevant apparatus
Ghose et al. Modeling consumer footprints on search engines: An interplay with social media
CN110110181B (en) Clothing matching recommendation method based on user style and scene preference
CN104866474B (en) Individuation data searching method and device
KR102219344B1 (en) Automatic advertisement execution device, method for automatically generating campaign information for an advertisement medium to execute an advertisement and computer program for executing the method
CN111461841B (en) Article recommendation method, device, server and storage medium
KR101385700B1 (en) Method and apparatus for providing moving image advertisements
US20150058331A1 (en) Search result ranking using machine learning
CN110428298A (en) A kind of shop recommended method, device and equipment
CN109325182B (en) Information pushing method and device based on session, computer equipment and storage medium
CN108153791B (en) Resource recommendation method and related device
CN108122122A (en) Advertisement placement method and system
KR102191486B1 (en) Automatic advertisement execution device, method for automatically generating campaign information for an advertisement medium to execute an advertisement and computer program for executing the method
CN105740268A (en) Information pushing method and apparatus
CN110019943A (en) Video recommendation method, device, electronic equipment and storage medium
CN110348930A (en) Business object data processing method, the recommended method of business object information and device
CN113837842A (en) Commodity recommendation method and equipment based on user behavior data
CN110602532A (en) Entity article recommendation method, device, server and storage medium
CN113946754A (en) User portrait based rights and interests recommendation method, device, equipment and storage medium
CN112488781A (en) Search recommendation method and device, electronic equipment and readable storage medium
CN109446402B (en) Searching method and device
CN115860870A (en) Commodity recommendation method, system and device and readable medium
CN110765346B (en) User intention mining method, device and equipment
CN114840745A (en) Personalized recommendation method and system based on graph feature learning and deep semantic matching model
Chatwin An overview of computational challenges in online advertising

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant