CN105718533A - Information pushing method and device - Google Patents

Information pushing method and device

Info

Publication number
CN105718533A
Authority
CN
China
Prior art keywords
information
identification information
identification
pushed
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201610029313.9A
Other languages
Chinese (zh)
Inventor
岳爱珍
崔燕
杨自强
谭静
高显
赵辉
王私江
于倩
白霄骅
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN201610029313.9A
Priority to PCT/CN2016/087453 (WO2017121076A1)
Publication of CN105718533A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/951 Indexing; Web crawling techniques

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

The invention discloses an information pushing method and device. An embodiment of the method comprises: obtaining candidate push information; determining identification information corresponding to the candidate push information on the basis of a pre-trained information identification model; generating information to be pushed on the basis of the candidate push information and its corresponding identification information; and pushing the information to be pushed. The embodiment distinguishes pushed items from one another by their identification information, so that users obtain information more efficiently.

Description

Information-pushing method and device
Technical field
The present application relates to the field of computer technology, specifically to the field of Internet technology, and in particular to an information pushing method and device.
Background
With the development of Internet technology, the volume of information is growing geometrically. At the same time, poor or absent information management has left the publication and dissemination of information out of control, producing a large amount of false and junk information, polluting the information environment, and giving rise to "garbage information".
However, existing information pushing approaches usually push the various items of candidate push information directly to the user without adding any identification to them. As a result, the pushed items are not distinguished from one another by any identification, and users obtain information inefficiently.
Summary of the invention
The purpose of the present application is to propose an improved information pushing method and device that solve the technical problems mentioned in the Background section above.
In a first aspect, the present application provides an information pushing method, the method comprising: obtaining candidate push information; determining, based on a pre-trained information identification model, identification information corresponding to the candidate push information; generating information to be pushed based on the candidate push information and the identification information corresponding to it; and pushing the information to be pushed.
In some embodiments, determining the identification information corresponding to the candidate push information based on the pre-trained information identification model comprises: identifying the website from which the candidate push information originates; searching for characteristic information of the website and importing the characteristic information into the pre-trained information identification model; and obtaining the identification information that the information identification model determines for the characteristic information of the website, and using that identification information as the identification information corresponding to the candidate push information.
In some embodiments, the characteristic information includes at least one of the following items of information about the website: number of servers, domain name age, ranking, keyword ranking, bounce rate, number of external links, traffic, weight, and website sponsor information.
In some embodiments, the method further comprises a step of establishing the information identification model, comprising: obtaining the sample data needed to train the model, wherein the sample data includes characteristic information of sample websites and the already determined identification information corresponding to that characteristic information; predicting, based on an initial model, the identification information corresponding to the characteristic information of the sample websites, wherein the initial model is one of the following: a support vector machine model, a decision tree model, a naive Bayes model, or a logistic regression model; judging whether the identification information predicted by the initial model is consistent with the already determined identification information; and if not, using the characteristic information of the sample websites and the corresponding already determined identification information as training data for the initial model, and revising the parameters of the initial model based on the training data to obtain the information identification model.
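The model-establishing step above can be sketched as a mistake-driven training loop. The following is only an illustrative sketch, not the applicant's implementation: it assumes a logistic regression initial model, a gradient-style update rule, two hypothetical normalized features, and a binary label (1 for positive identification, 0 for negative). None of these specifics come from the application.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(weights, bias, features):
    """Predict 1 (positive identification) or 0 (negative) with a logistic model."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1 if sigmoid(z) >= 0.5 else 0

def train(samples, n_features, lr=0.5, epochs=200):
    """Revise parameters using only the samples the current model gets wrong."""
    weights, bias = [0.0] * n_features, 0.0
    for _ in range(epochs):
        # Compare each prediction with the already determined label.
        mistakes = [(x, y) for x, y in samples if predict(weights, bias, x) != y]
        if not mistakes:
            break  # every prediction is consistent with its label
        for x, y in mistakes:
            p = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))
            for i, xi in enumerate(x):
                weights[i] += lr * (y - p) * xi  # assumed update rule
            bias += lr * (y - p)
    return weights, bias

# Hypothetical sample data: (normalized domain age, normalized weight score) -> label
samples = [((0.9, 0.8), 1), ((0.8, 0.9), 1), ((0.1, 0.2), 0), ((0.2, 0.1), 0)]
weights, bias = train(samples, n_features=2)
```

Any of the other listed model types (support vector machine, decision tree, naive Bayes) could replace the logistic model here; only the compare-then-revise loop is taken from the text.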
In some embodiments, the identification information includes first identification information and second identification information; and determining the identification information corresponding to the candidate push information based on the pre-trained model comprises: selecting one of the first identification information and the second identification information as the identification information corresponding to the candidate push information, based on whether the filing (record) information found for the website includes a preset keyword; or selecting one of the first identification information and the second identification information as the identification information corresponding to the candidate push information, based on whether the obtained set of user report information includes information about the website.
In a second aspect, the present application provides an information pushing device, the device comprising: an acquiring unit configured to obtain candidate push information; a determining unit configured to determine, based on a pre-trained information identification model, identification information corresponding to the candidate push information; a generating unit configured to generate information to be pushed based on the candidate push information and the identification information corresponding to it; and a pushing unit configured to push the information to be pushed.
In some embodiments, the determining unit comprises: a website determining subunit for identifying the website from which the candidate push information originates; a characteristic information searching subunit for searching for characteristic information of the website; a characteristic information importing subunit for importing the characteristic information into the pre-trained information identification model; and an identification information obtaining subunit for obtaining the identification information that the information identification model determines for the characteristic information of the website, and using that identification information as the identification information corresponding to the candidate push information.
In some embodiments, the characteristic information includes at least one of the following items of information about the website: number of servers, domain name age, ranking, keyword ranking, bounce rate, number of external links, traffic, weight, and website sponsor information.
In some embodiments, the device further comprises an information identification model establishing unit, comprising: a sample data obtaining subunit for obtaining characteristic information of sample websites and the already determined identification information corresponding to that characteristic information; a predicted identification information obtaining subunit for predicting, based on an initial model, the identification information corresponding to the characteristic information of the sample websites, wherein the initial model is one of the following: a support vector machine model, a decision tree model, a naive Bayes model, or a logistic regression model; a predicted identification information judging subunit for judging whether the identification information predicted by the initial model is consistent with the already determined identification information; and a parameter revising subunit for, when the predicted identification information judging subunit judges that the two are inconsistent, using the characteristic information of the sample websites and the corresponding already determined identification information as training data for the initial model, and revising the parameters of the initial model based on the training data to obtain the information identification model.
In some embodiments, the identification information includes first identification information and second identification information; and the determining unit comprises: a first selecting subunit for selecting one of the first identification information and the second identification information as the identification information corresponding to the candidate push information, based on whether the filing (record) information found for the website includes a preset keyword; or a second selecting subunit for selecting one of the first identification information and the second identification information as the identification information corresponding to the candidate push information, based on whether the obtained set of user report information includes information about the website.
The information pushing method and device provided by the present application obtain candidate push information, determine the identification information corresponding to it based on a pre-trained information identification model, generate information to be pushed based on the candidate push information and that identification information, and finally push the information to be pushed. Pushed items are thereby distinguished from one another by their identification, which makes it more efficient for users to obtain information.
Brief description of the drawings
Other features, objects, and advantages of the present application will become more apparent from the following detailed description of non-limiting embodiments, made with reference to the accompanying drawings:
Fig. 1 is a diagram of an exemplary system architecture to which the present application can be applied;
Fig. 2 is a flow chart of one embodiment of the information pushing method according to the present application;
Fig. 3 is a schematic diagram of an application scenario of the information pushing method according to the present application;
Fig. 4 is a flow chart of another embodiment of the information pushing method according to the present application;
Fig. 5 is a schematic structural diagram of one embodiment of the information pushing device according to the present application;
Fig. 6 is a schematic structural diagram of a computer system suitable for implementing a terminal device or server of the embodiments of the present application.
Detailed description of the invention
The present application is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain the relevant invention and do not limit it. It should also be noted that, for ease of description, only the parts related to the invention are shown in the drawings.
It should be noted that, provided there is no conflict, the embodiments in the present application and the features in the embodiments may be combined with one another. The present application is described in detail below with reference to the drawings and in conjunction with the embodiments.
Fig. 1 shows an exemplary system architecture 100 to which embodiments of the information pushing method or information pushing device of the present application can be applied.
As shown in Fig. 1, the system architecture 100 may include terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 is the medium that provides communication links between the terminal devices 101, 102, 103 and the server 105. The network 104 may include various connection types, such as wired or wireless communication links or fiber-optic cables.
A user may use the terminal devices 101, 102, 103 to interact with the server 105 through the network 104 in order to receive or send messages and the like. Various communication client applications may be installed on the terminal devices 101, 102, 103, such as web browser applications, shopping applications, search applications, map applications, instant messaging tools, email clients, and social platform software.
The terminal devices 101, 102, 103 may be various electronic devices that have a display screen and support web browsing, including but not limited to smartphones, tablet computers, e-book readers, MP3 players (Moving Picture Experts Group Audio Layer III), MP4 players (Moving Picture Experts Group Audio Layer IV), laptop computers, and desktop computers.
It should be noted that the information pushing method provided by the embodiments of the present application is generally executed by the server 105; correspondingly, the information pushing device is generally arranged in the server 105.
In some cases, the server 105 may also obtain candidate push information directly from other servers and push the information to be pushed to other servers, or the server 105 may itself store the candidate push information. In such cases the system architecture used by the present application need not involve the terminal devices 101, 102, 103.
It should be understood that the numbers of terminal devices, networks, and servers in Fig. 1 are merely illustrative. There may be any number of terminal devices, networks, and servers, depending on implementation needs.
With continued reference to Fig. 2, a flow 200 of one embodiment of the information pushing method according to the present application is shown. The information pushing method comprises the following steps:
Step 201: obtain candidate push information.
In this embodiment, if the candidate push information is generated from search result information, the electronic device on which the information pushing method runs (for example, the server shown in Fig. 1) may obtain the candidate push information as follows. First, it obtains the user's search request. Then, it queries the search result information corresponding to the search request. At this point the search result information may be used directly as the candidate push information; alternatively, screening conditions may be set according to actual needs, the search result information screened, and the screened search result information used as the candidate push information. For example, if the candidate push information must be timely, a time limit can be set and only the search results within that time limit retained.
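The time-limit screening mentioned above might look like the following minimal sketch. The result structure (a dict with a `published` timestamp), the function name, and the 30-day limit are all assumptions chosen for illustration.

```python
from datetime import datetime, timedelta

def screen_by_time_limit(results, now, max_age_days):
    """Keep only search results published within the configured time limit."""
    cutoff = now - timedelta(days=max_age_days)
    return [r for r in results if r["published"] >= cutoff]

# Hypothetical search results
now = datetime(2016, 1, 15)
results = [
    {"title": "recent item", "published": datetime(2016, 1, 10)},
    {"title": "stale item", "published": datetime(2015, 6, 1)},
]
candidates = screen_by_time_limit(results, now, max_age_days=30)
```

Other screening conditions would follow the same pattern: a predicate applied to each search result before it becomes candidate push information.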
In this embodiment, the electronic device on which the information pushing method runs (for example, the server shown in Fig. 1) may also obtain search result information directly from a search server through a wired or wireless connection and use the search result information as the candidate push information.
In this embodiment, the candidate push information may also be obtained from the user's account information or historical push information. For example, if the user's account information records the industry in which the user works, industry trend information for that industry can be used as the candidate push information.
Step 202: determine, based on a pre-trained information identification model, the identification information corresponding to the candidate push information.
In this embodiment, the identification information includes at least one of the following: image information, text information, and sound information. The identification information can be used to indicate whether the candidate push information is safe and trustworthy. For example, if the candidate push information comes from the website of a government agency, it is considered safe and trustworthy, and its corresponding identification information is positive, for example the word or image "excellent" or "top", or likewise the word or image of the letter "V"; further, the words or images "V1", "V2", "V3" can be used to indicate its degree of trustworthiness. If the candidate push information comes from a website with a record of having been reported, it is considered unsafe and untrustworthy, and its corresponding identification information is negative, for example the word or image "non-prime" or "not recommended". If the identification information corresponding to the candidate push information is not judged to be positive, it may also be set to empty; empty identification information still forms a contrast with positive identification information.
After the electronic device obtains the candidate push information, it can query the characteristic information of the candidate push information in a preset database. Specifically, it can first perform statistical analysis and/or semantic analysis on the candidate push information to extract at least one piece of key information, such as an organization name or a web address, and then query the preset database for the characteristic information corresponding to that key information. Alternatively, it can first obtain the website from which the candidate push information originates, search for that website on a webmaster tool website that provides site information, using query tools such as SEO (Search Engine Optimization) tools, and capture the information on the search result page as the characteristic information. After the characteristic information is obtained, it is imported into the pre-trained information identification model. According to the correspondence learned by the information identification model, the identification information corresponding to the characteristic information is obtained, and this is the identification information corresponding to the candidate push information.
In this embodiment, the characteristic information includes at least one of the following items of information about the website: number of servers, domain name age, ranking, keyword ranking, bounce rate, number of external links, traffic, weight, and website sponsor information.
The ranking information of a website may be the ranking obtained from the Alexa ranking system. Keyword ranking is a way of reflecting page rank in search engine results according to the relevance of characters, words, and phrases. Natural (organic) keyword ranking is generally the result of a search engine automatically analyzing and ranking all the relevant pages it has crawled; the search engine's website can usually provide a site's keyword ranking information. The bounce rate of a website is the percentage of visits to a page or a set of pages in which the user left after viewing only one page. For example, if 1000 different visitors enter a website through a given entry address within a certain period, and 50 of them exit the website directly without any further browsing, then the bounce rate for that entry address is 50/1000 = 5%. The number of external links of a website is the number of links from other websites pointing to it, and can be obtained with common external-link analysis tools. Website traffic is the volume of visits to a website, an indicator describing the number of users visiting the website and the number of pages they view. The traffic information of a website may be historical traffic information or estimated traffic information. The weight information of a website generally refers to a search engine's overall evaluation of the website; for example, Baidu weight, Google's PR (PageRank, Google's page ranking), or Sogou's SR (Sogou Rank, Sogou's page rating index) may be used.
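The bounce rate arithmetic in the example above can be written out as a small helper. The function name and the guard against a zero visit count are assumptions added for illustration.

```python
def bounce_rate(single_page_visits, total_visits):
    """Fraction of visits that left after viewing only one page."""
    if total_visits <= 0:
        raise ValueError("total_visits must be positive")
    return single_page_visits / total_visits

# The example from the text: 50 of 1000 visitors leave without further browsing.
rate = bounce_rate(50, 1000)
rate_as_percent = f"{rate:.0%}"
```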
In some optional implementations of this embodiment, the filing (record) information of the website can first be searched, and one of the first identification information and the second identification information selected as the identification information corresponding to the candidate push information, based on whether the filing information includes a preset keyword. For example, the filing information of the website can be searched, the content of the sponsor-nature field in the record captured, and a judgment made as to whether it contains a keyword such as public institution, government agency, military, or social organization. If so, the identification information corresponding to the candidate push information is determined to be the first identification information. The first identification information is positive identification information and may be a word or image such as "excellent" or "top".
In some optional implementations of this embodiment, a set of user report information can be obtained, and one of the first identification information and the second identification information selected as the identification information corresponding to the candidate push information, based on whether the set includes information about the website. The set of user report information may be the historical user reports collected by a server. Each report includes the matter reported and the reported object, which may be a website or the name of a website's sponsor. For example, a user reports that website 1 contains false content; after verification, website 1 does contain false content, and the server records the report. If the candidate push information originates from website 1, the identification information corresponding to the candidate push information is determined to be the second identification information. The second identification information is negative identification information and may be a word or image such as "non-prime" or "not recommended", or may be empty.
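The two optional rule-based selections described above can be sketched as follows. The keyword list mirrors the examples in the text; the function names, the label strings, and the fallback returned when a rule does not match are assumptions, since the text only states the matching branch of each rule.

```python
# Keywords taken from the examples in the text
PRESET_KEYWORDS = ("public institution", "government agency", "military", "social organization")

FIRST_ID = "excellent"         # positive identification information
SECOND_ID = "not recommended"  # negative identification information

def select_by_filing(sponsor_nature):
    """Select the first identification if the sponsor-nature field contains a preset keyword."""
    if any(keyword in sponsor_nature for keyword in PRESET_KEYWORDS):
        return FIRST_ID
    return SECOND_ID  # assumed fallback; the text does not specify this branch

def select_by_reports(site, reported_sites):
    """Select the second identification if the site appears in the user report set."""
    if site in reported_sites:
        return SECOND_ID
    return FIRST_ID  # assumed fallback; the text does not specify this branch

# Hypothetical data
reported_sites = {"website1.example"}
```

Because the text allows the second identification information to be empty, `SECOND_ID` could equally be an empty string.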
Step 203: generate information to be pushed based on the candidate push information and the identification information corresponding to it.
In this embodiment, the electronic device can combine the candidate push information and its corresponding identification information into the information to be pushed. For example, when the identification information is image information, the corresponding image can be added to a specified part of the candidate push information.
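A minimal sketch of this combining step is given below, using a text badge rather than an image. The `PushItem` structure, its field names, and the rendering format are hypothetical and chosen only to show the candidate and its identification travelling together.

```python
from dataclasses import dataclass

@dataclass
class PushItem:
    title: str
    url: str
    badge: str  # identification information; empty string means no badge

def build_push_item(candidate, identification):
    """Combine a candidate and its identification into one item to be pushed."""
    return PushItem(candidate["title"], candidate["url"], identification or "")

def render(item):
    """Place the badge at a fixed position (here, before the title)."""
    return f"[{item.badge}] {item.title}" if item.badge else item.title

# Hypothetical candidate
candidate = {"title": "News website 1", "url": "http://news1.example"}
item = build_push_item(candidate, "excellent")
```

An empty identification yields an unbadged item, which matches the option in the text of setting the identification information to empty.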
Step 204: push the information to be pushed.
With continued reference to Fig. 3, Fig. 3 is a schematic diagram of an application scenario of the information pushing method according to this embodiment. In the application scenario of Fig. 3, the user first initiates a search request with the search keyword "news". The information identification server then obtains the search result information in the background as candidate push information and extracts the characteristic information corresponding to the candidate push information. Next, the information identification server imports the characteristic information into the pre-trained information identification model, determines that the identification information corresponding to news website 1 and news website 2 in the candidate push information is positive, combines the positive identification word "excellent" with the candidate push information to generate the information to be pushed, and finally pushes the information to be pushed. When the user browses the search results and performs an operation such as hovering over or clicking the word "excellent", all or part of the characteristic information can be displayed as needed, for example in a floating window.
The method provided by the above embodiment of the present application determines the identification information corresponding to the candidate push information and generates the information to be pushed based on the candidate push information and that identification information. The embodiment thereby distinguishes pushed items from one another by their identification and makes it more efficient for users to obtain information.
With further reference to Fig. 4, a flow 400 of another embodiment of the information pushing method is shown. The flow 400 of this information pushing method comprises the following steps:
Step 401: obtain candidate push information.
In this embodiment, the candidate push information may be generated from search result information associated with a search operation, or from the user's account information or historical push information.
Step 402: obtain the website from which the candidate push information originates.
Generally, the candidate push information directly contains information about the website from which it originates; the website name or web address can be extracted by performing statistical analysis and/or semantic analysis on the candidate push information.
Step 403: judge whether the nature of the website sponsor is a public institution, government agency, military, or social organization.
In this embodiment, after the website from which the candidate push information originates is obtained, the sponsor nature corresponding to that website can be queried in a website filing information database. Table 1 shows some of the records in the filing information database.
Table 1: Partial records in the filing information database
Likewise, the website from which the candidate push information originates can be looked up on a website filing information query site, and the sponsor-nature information obtained by capturing it from the page. If the sponsor nature of the website from which the candidate push information originates is a public institution, government agency, military, or social organization, the identification information corresponding to the candidate push information is determined to be positive identification information, and the flow proceeds to step 406; if not, the flow proceeds to step 404.
As shown in Table 1, the sponsor nature corresponding to the government website of the State Intellectual Property Office is government agency, so the information provided by its website is highly reliable. Therefore, if the candidate push information comes from the government website of the State Intellectual Property Office, its corresponding identification information is determined to be positive identification information.
Step 404, it is judged that whether include the record of the website that above-mentioned candidate's pushed information is originated or the record of the sponsor of this website in unlawful practice data base.
Based on user, record in above-mentioned violation discreditable behavior data base can report that historical information obtains, can also based on whole nation credit information of enterprise publicity system, or enterprise's list acquisition of breaking one's promise of breaking the law on a serious scale of publicity, if unlawful practice data base includes the record of the website that above-mentioned candidate's pushed information is originated or the record of the sponsor of this website, then determine that the identification information corresponding with above-mentioned candidate's pushed information is passive identification information, and enter step 406;If it is not, then enter step 405.
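The order of checks in steps 403 to 405 can be sketched as a short decision cascade. This is an illustrative sketch under stated assumptions: the in-memory dictionary and set stand in for the record information database and the violation database, and the names `determine_identification`, `record_db`, `violation_db` and `model_predict` are hypothetical.

```python
from typing import Callable, Dict, Set, Tuple

POSITIVE, NEGATIVE = "positive", "negative"
TRUSTED_SPONSOR_TYPES = {
    "public institution", "government agency", "military", "social organization",
}


def determine_identification(
    site: str,
    record_db: Dict[str, Tuple[str, str]],  # site -> (sponsor name, sponsor nature)
    violation_db: Set[str],                 # sites and sponsors with violation records
    model_predict: Callable[[str], str],    # fallback used in step 405
) -> str:
    sponsor, nature = record_db.get(site, ("", ""))
    # Step 403: a trusted sponsor nature gives positive identification information.
    if nature in TRUSTED_SPONSOR_TYPES:
        return POSITIVE
    # Step 404: a violation record for the site or its sponsor gives negative.
    if site in violation_db or (sponsor and sponsor in violation_db):
        return NEGATIVE
    # Step 405: otherwise defer to the information identification model.
    return model_predict(site)
```

Placing the two database lookups first means the model is consulted only for websites that neither lookup can settle.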
Step 405: obtain characteristic information according to the candidate pushed information, and import the characteristic information into the information identification model.
The step of establishing the information identification model includes:
First, obtain the sample data needed to train the model, where the sample data includes the characteristic information of sample websites and the already determined identification information corresponding to that characteristic information.
The characteristic information of the sample websites and the corresponding determined identification information can be obtained from a sample data set. The sample data set may be set manually, or may be obtained by taking websites whose identification information has already been determined and crawling the corresponding characteristic information from search result pages on webmaster-tool websites.
For example, suppose the identification information corresponding to the official websites of the world's top 500 enterprises has been determined to be "excellent". The official websites of the world's top 500 enterprises are then searched for on a webmaster-tool website, and their characteristic information is crawled from the search result pages. The official websites of the world's top 500 enterprises serve as sample websites, their characteristic information serves as the characteristic information of the sample websites, and "excellent" serves as the determined identification information corresponding to that characteristic information.
The sample data needed to train the model can also be obtained by compiling statistics on users' browsing records. For example, the identification information corresponding to websites visited repeatedly by a large number of users within a specific time period is determined to be "excellent", and such websites are used as sample websites.
Second, based on an initial model, predict the identification information corresponding to the characteristic information of the sample websites, obtaining the identification information predicted by the initial model, where the initial model is one of the following: a support vector machine model, a decision tree model, a naive Bayes model, or a logistic regression model.
Third, determine whether the identification information predicted by the initial model for the characteristic information of a sample website is consistent with the determined identification information corresponding to that characteristic information.
If not, use the characteristic information of the sample website and its determined identification information as training data for the initial model, and revise the parameters of the initial model based on the training data to obtain the information identification model.
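The predict, compare and revise loop described in the three steps above can be sketched with a small logistic regression model. This is only one possible reading, under stated assumptions: labels are encoded as 1 (first identification information) and 0 (second), parameters are revised by a gradient step on each misclassified sample, and the function names, learning rate and epoch count are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Sample = Tuple[Sequence[float], int]  # (characteristic vector, label 0 or 1)


def predict(w: List[float], b: float, x: Sequence[float]) -> int:
    """Predict the label of x with the current linear model."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if z >= 0 else 0


def train_on_mistakes(samples: List[Sample], epochs: int = 100, lr: float = 0.1):
    """Revise parameters using only samples whose prediction disagrees with the label."""
    w, b = [0.0] * len(samples[0][0]), 0.0
    for _ in range(epochs):
        # Second and third steps: predict, then keep the inconsistent samples.
        mistakes = [(x, y) for x, y in samples if predict(w, b, x) != y]
        if not mistakes:
            break
        # Revision: one logistic-regression gradient step per inconsistent sample.
        for x, y in mistakes:
            z = sum(wi * xi for wi, xi in zip(w, x)) + b
            grad = 1.0 / (1.0 + math.exp(-z)) - y
            w = [wi - lr * grad * xi for wi, xi in zip(w, x)]
            b -= lr * grad
    return w, b
```

On a small, linearly separable sample set the loop stops as soon as every prediction agrees with its determined label.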
For example, suppose there are only two kinds of identification information corresponding to a website's characteristic information, first identification information and second identification information, where the second identification information may be empty. When the initial model is a support vector machine model, the LIBSVM software can be run with a linear kernel selected as the kernel function. With a linear kernel, the parameters that need to be selected and tuned are the penalty parameter C and the weight parameter weight. The weight parameter adjusts the value of C for each class, and can be set to the ratio of positive to negative samples (that is, the ratio of the number of sample websites corresponding to the first identification information to the number of sample websites corresponding to the second identification information). The penalty parameter C typically ranges from 0.0001 to 10000, and its value can be tuned according to the training data.
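The two tuning inputs named in this example can be computed with a few lines of code. The sketch below is an assumption-laden illustration: the helper names and the label strings are hypothetical, and the logarithmic spacing of the C values is one common choice that the text does not prescribe. In LIBSVM's `svm-train` tool, the corresponding options are `-t 0` (linear kernel), `-c` (penalty parameter C) and `-wi` (per-class weight applied to C).

```python
from collections import Counter
from typing import List, Sequence


def class_weight_ratio(labels: Sequence[str], first: str = "first", second: str = "second") -> float:
    """Ratio of samples with the first identification to samples with the second."""
    counts = Counter(labels)
    if counts[second] == 0:
        raise ValueError("no samples with the second identification information")
    return counts[first] / counts[second]


def c_grid(low_exp: int = -4, high_exp: int = 4) -> List[float]:
    """Candidate values of C from 10**low_exp to 10**high_exp, one per decade."""
    return [10.0 ** k for k in range(low_exp, high_exp + 1)]
```

Each candidate C would then be evaluated on held-out training data, keeping the value that predicts the determined identification information best.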
Fourth, obtain characteristic information according to the candidate pushed information and import it into the information identification model. The information identification model finds the identification information corresponding to the characteristic information according to the correspondence learned in training.
Step 406: generate the information to be pushed based on the candidate pushed information and the identification information corresponding to the candidate pushed information.
Step 407: push the information to be pushed.
After receiving the pushed information, a user who doubts the correctness of the identification information can report the error to the server, for example by clicking an error-report button provided for that purpose. After collecting the erroneous identification information, the server can use it together with the corresponding characteristic data as new training data and retrain the information identification model, further improving the model's accuracy.
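Turning error reports into new training data can be sketched as follows. This sketch makes assumptions the text does not state: there are exactly two kinds of identification information, so a reported error is corrected by switching to the other kind, and reports arrive as (website, reported identification) pairs. The function and parameter names are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple


def reports_to_training_data(
    reports: Sequence[Tuple[str, str]],             # (website, identification reported as wrong)
    features_by_site: Dict[str, Sequence[float]],   # stored characteristic data per website
    labels: Tuple[str, str] = ("first", "second"),
) -> List[Tuple[Sequence[float], str]]:
    """Build (characteristic data, corrected identification) pairs from error reports."""
    first, second = labels
    corrected = {first: second, second: first}
    data = []
    for site, wrong in reports:
        # Skip reports about unknown websites or unknown identification values.
        if site in features_by_site and wrong in corrected:
            data.append((features_by_site[site], corrected[wrong]))
    return data
```

The resulting pairs can be appended to the existing sample data before the model is retrained.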
As can be seen from Fig. 4, compared with the embodiment corresponding to Fig. 2, the flow 400 of the information pushing method in this embodiment highlights the step of determining the identification information. The scheme described in this embodiment can therefore bring in more data relevant to determining the identification information, so that the identification information is determined more accurately and information is pushed more effectively.
With further reference to Fig. 5, as an implementation of the methods shown in the above figures, the present application provides an embodiment of an information pushing device. This device embodiment corresponds to the method embodiment shown in Fig. 2, and the device can be applied in various electronic devices.
As shown in Fig. 5, the information pushing device 500 described in this embodiment includes an acquiring unit 501, a determining unit 502, a generating unit 503, and a pushing unit 504. The acquiring unit 501 is configured to obtain candidate pushed information; the determining unit 502 is configured to determine, based on a pre-trained information identification model, the identification information corresponding to the candidate pushed information; the generating unit 503 is configured to generate information to be pushed based on the candidate pushed information and the identification information corresponding to the candidate pushed information; and the pushing unit 504 is configured to push the information to be pushed.
In this embodiment, the acquiring unit 501 of the information pushing device 500 can obtain candidate pushed information from a terminal or other servers through a wired or wireless connection.
In this embodiment, the acquiring unit 501 obtains candidate pushed information, and an information identification model has been trained in advance on the information pushing device 500. The determining unit 502 of the information pushing device 500 can therefore determine the identification information corresponding to the candidate pushed information based on the pre-trained information identification model; the generating unit 503 can generate information to be pushed based on the candidate pushed information and the identification information corresponding to it; and the pushing unit 504 can push the information to be pushed generated by the generating unit 503.
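How the four units cooperate can be sketched as a small class. This is an illustrative sketch only: representing each unit as a callable, modelling candidate pushed information as a dictionary, and the `identification` key are all assumptions, not details given in this embodiment.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class InfoPushDevice:
    acquire: Callable[[], List[dict]]   # acquiring unit 501
    determine: Callable[[dict], str]    # determining unit 502
    push: Callable[[dict], None]        # pushing unit 504

    def generate(self, candidate: dict, identification: str) -> dict:
        """Generating unit 503: attach the identification to the candidate."""
        return {**candidate, "identification": identification}

    def run(self) -> List[dict]:
        """Acquire candidates, identify each, generate and push the result."""
        pushed = []
        for candidate in self.acquire():
            info = self.generate(candidate, self.determine(candidate))
            self.push(info)
            pushed.append(info)
        return pushed
```

For instance, passing `sent.append` as the pushing callable collects the generated information in a list `sent`, which is convenient for testing the flow without a network.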
In some embodiments, the determining unit 502 includes: a website determining subunit, configured to confirm the website from which the candidate pushed information originates; a characteristic information search subunit, configured to search for the characteristic information of the website; a characteristic information importing subunit, configured to import the characteristic information into the pre-trained information identification model; and an identification information obtaining subunit, configured to obtain the identification information, determined according to the information identification model, that corresponds to the characteristic information of the website, and to use that identification information as the identification information corresponding to the candidate pushed information.
In some embodiments, the characteristic information includes at least one of the following items of information about the website: server count information, domain name age information, ranking information, keyword ranking information, bounce rate information, external link count information, traffic information, weight information, and information about the website's sponsor.
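Before being imported into a model, characteristic information of this kind is typically arranged as a fixed-order numeric vector. The sketch below assumes hypothetical field names for the numeric items listed above and fills missing items with a default; the sponsor information is categorical and is left out of this sketch.

```python
from typing import Dict, List

# Hypothetical field names for the numeric items of characteristic information.
FEATURE_FIELDS = (
    "server_count", "domain_age_days", "rank", "keyword_rank",
    "bounce_rate", "external_links", "traffic", "weight",
)


def to_feature_vector(info: Dict[str, float], default: float = 0.0) -> List[float]:
    """Arrange the characteristic information in a fixed order, filling gaps."""
    return [float(info.get(field, default)) for field in FEATURE_FIELDS]
```

Keeping the field order fixed ensures that training samples and candidate websites are described by vectors of the same layout.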
In some embodiments, the device further includes an information identification model establishing unit, which includes: a sample data obtaining subunit, configured to obtain the characteristic information of sample websites and the determined identification information corresponding to that characteristic information; a predicted identification information obtaining subunit, configured to predict, based on an initial model, the identification information corresponding to the characteristic information of the sample websites, obtaining the identification information predicted by the initial model, where the initial model is one of the following: a support vector machine model, a decision tree model, a naive Bayes model, or a logistic regression model; a predicted identification information judging subunit, configured to determine whether the identification information predicted by the initial model for the characteristic information of a sample website is consistent with the determined identification information corresponding to that characteristic information; and a parameter revising subunit, configured to, when the predicted identification information judging subunit determines that the two are inconsistent, use the characteristic information of the sample website and its determined identification information as training data for the initial model, and revise the parameters of the initial model based on the training data to obtain the information identification model.
In some embodiments, the identification information includes first identification information and second identification information, and the determining unit 502 includes: a first selecting subunit, configured to select one of the first identification information and the second identification information as the identification information corresponding to the candidate pushed information, based on whether the retrieved record information of the website contains a preset keyword; or a second selecting subunit, configured to select one of the first identification information and the second identification information as the identification information corresponding to the candidate pushed information, based on whether the obtained set of user report information contains information about the website.
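The two selecting subunits can be sketched as simple membership tests. The text does not say which outcome maps to which identification information, so the mapping below (a keyword hit selects the first, a user report selects the second) is an assumption, as are the function names and label strings.

```python
from typing import Iterable, Set


def select_by_keywords(record_info: str, keywords: Iterable[str],
                       first: str = "first", second: str = "second") -> str:
    """First selecting subunit: choose by preset keywords in the record information."""
    return first if any(k in record_info for k in keywords) else second


def select_by_reports(site: str, reported_sites: Set[str],
                      first: str = "first", second: str = "second") -> str:
    """Second selecting subunit: choose by presence in the user report set."""
    return second if site in reported_sites else first
```

Either function alone yields identification information without consulting the trained model.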
Those skilled in the art will understand that the information pushing device 500 also includes some other well-known structures, such as a processor and a memory. In order not to unnecessarily obscure the embodiments of the present disclosure, these well-known structures are not shown in Fig. 5.
Referring now to Fig. 6, it shows a schematic structural diagram of a computer system 600 suitable for implementing a server of the embodiments of the present application.
As shown in Fig. 6, the computer system 600 includes a central processing unit (CPU) 601, which can perform various appropriate actions and processes according to a program stored in a read-only memory (ROM) 602 or a program loaded from a storage section 608 into a random access memory (RAM) 603. The RAM 603 also stores the various programs and data required for the operation of the system 600. The CPU 601, the ROM 602, and the RAM 603 are connected to one another by a bus 604. An input/output (I/O) interface 605 is also connected to the bus 604.
The following components are connected to the I/O interface 605: an input section 606 including a keyboard, a mouse, and the like; an output section 607 including a cathode ray tube (CRT), a liquid crystal display (LCD), and the like, as well as a speaker; a storage section 608 including a hard disk and the like; and a communication section 609 including a network interface card such as a LAN card or a modem. The communication section 609 performs communication processing via a network such as the Internet. A drive 610 is also connected to the I/O interface 605 as needed. A removable medium 611, such as a magnetic disk, an optical disc, a magneto-optical disk, or a semiconductor memory, is mounted on the drive 610 as needed, so that a computer program read from it can be installed into the storage section 608 as needed.
In particular, according to the embodiments of the present disclosure, the process described above with reference to the flow chart may be implemented as a computer software program. For example, the embodiments of the present disclosure include a computer program product, which includes a computer program tangibly embodied on a machine-readable medium, the computer program containing program code for performing the method shown in the flow chart. In such an embodiment, the computer program can be downloaded and installed from a network through the communication section 609, and/or installed from the removable medium 611. When the computer program is executed by the central processing unit (CPU) 601, the above functions defined in the method of the present application are performed.
The flow charts and block diagrams in the accompanying drawings illustrate the possible architectures, functions, and operations of systems, methods, and computer program products according to various embodiments of the present application. In this regard, each block in a flow chart or block diagram may represent a module, a program segment, or a portion of code, which contains one or more executable instructions for implementing the specified logical function. It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur in an order different from that noted in the drawings. For example, two blocks shown in succession may in fact be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending on the functions involved. It should also be noted that each block in the block diagrams and/or flow charts, and combinations of blocks in the block diagrams and/or flow charts, can be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
The units described in the embodiments of the present application may be implemented in software or in hardware. The described units may also be provided in a processor; for example, a processor may be described as including an acquiring unit, a determining unit, a generating unit, and a pushing unit. The names of these units do not, under certain circumstances, limit the units themselves; for example, the acquiring unit may also be described as "a unit for obtaining candidate pushed information".
In another aspect, the present application also provides a nonvolatile computer storage medium. This nonvolatile computer storage medium may be the one included in the device described in the above embodiments, or it may be a nonvolatile computer storage medium that exists separately and is not installed in a terminal. The nonvolatile computer storage medium stores one or more programs which, when executed by a device, cause the device to: obtain candidate pushed information; determine, based on a pre-trained information identification model, the identification information corresponding to the candidate pushed information; generate information to be pushed based on the candidate pushed information and the identification information corresponding to the candidate pushed information; and push the information to be pushed.
The above description is only of the preferred embodiments of the present application and an explanation of the technical principles applied. Those skilled in the art should understand that the scope of the invention involved in the present application is not limited to technical solutions formed by the particular combination of the above technical features, and should also cover other technical solutions formed by any combination of the above technical features or their equivalents without departing from the inventive concept, for example, technical solutions formed by replacing the above features with (but not limited to) technical features with similar functions disclosed in the present application.

Claims (10)

1. An information pushing method, characterized in that the method comprises:
obtaining candidate pushed information;
determining, based on a pre-trained information identification model, the identification information corresponding to the candidate pushed information;
generating information to be pushed based on the candidate pushed information and the identification information corresponding to the candidate pushed information;
pushing the information to be pushed.
2. The method according to claim 1, characterized in that determining, based on the pre-trained information identification model, the identification information corresponding to the candidate pushed information comprises:
confirming the website from which the candidate pushed information originates;
searching for the characteristic information of the website;
importing the characteristic information into the pre-trained information identification model;
obtaining the identification information, determined according to the information identification model, that corresponds to the characteristic information of the website, and using that identification information as the identification information corresponding to the candidate pushed information.
3. The method according to claim 2, characterized in that the characteristic information includes at least one of the following items of information about the website: server count information, domain name age information, ranking information, keyword ranking information, bounce rate information, external link count information, traffic information, weight information, and information about the website's sponsor.
4. The method according to claim 1 or 2, characterized in that the method further comprises:
a step of establishing the information identification model, comprising:
obtaining the sample data needed to train the model, wherein the sample data includes the characteristic information of sample websites and the determined identification information corresponding to the characteristic information of the sample websites;
predicting, based on an initial model, the identification information corresponding to the characteristic information of the sample websites, to obtain the identification information predicted by the initial model, wherein the initial model is one of the following models: a support vector machine model, a decision tree model, a naive Bayes model, or a logistic regression model;
determining whether the identification information predicted by the initial model for the characteristic information of a sample website is consistent with the determined identification information corresponding to the characteristic information of that sample website;
if not, using the characteristic information of the sample website and the determined identification information corresponding to it as training data for the initial model, and revising the parameters of the initial model based on the training data to obtain the information identification model.
5. The method according to claim 2, characterized in that the identification information includes first identification information and second identification information; and
determining, based on the pre-trained model, the identification information corresponding to the candidate pushed information comprises:
selecting one of the first identification information and the second identification information as the identification information corresponding to the candidate pushed information, based on whether the retrieved record information of the website contains a preset keyword;
or, selecting one of the first identification information and the second identification information as the identification information corresponding to the candidate pushed information, based on whether the obtained set of user report information contains information about the website.
6. An information pushing device, characterized in that the device comprises:
an acquiring unit, configured to obtain candidate pushed information;
a determining unit, configured to determine, based on a pre-trained information identification model, the identification information corresponding to the candidate pushed information;
a generating unit, configured to generate information to be pushed based on the candidate pushed information and the identification information corresponding to the candidate pushed information;
a pushing unit, configured to push the information to be pushed.
7. The device according to claim 6, characterized in that the determining unit comprises:
a website determining subunit, configured to confirm the website from which the candidate pushed information originates;
a characteristic information search subunit, configured to search for the characteristic information of the website;
a characteristic information importing subunit, configured to import the characteristic information into the pre-trained information identification model;
an identification information obtaining subunit, configured to obtain the identification information, determined according to the information identification model, that corresponds to the characteristic information of the website, and to use that identification information as the identification information corresponding to the candidate pushed information.
8. The device according to claim 7, characterized in that the characteristic information includes at least one of the following items of information about the website: server count information, domain name age information, ranking information, keyword ranking information, bounce rate information, external link count information, traffic information, weight information, and information about the website's sponsor.
9. The device according to claim 6 or 7, characterized in that the device further comprises:
an information identification model establishing unit, comprising:
a sample data obtaining subunit, configured to obtain the sample data needed to train the model, wherein the sample data includes the characteristic information of sample websites and the determined identification information corresponding to the characteristic information of the sample websites;
a predicted identification information obtaining subunit, configured to predict, based on an initial model, the identification information corresponding to the characteristic information of the sample websites, to obtain the identification information predicted by the initial model, wherein the initial model is one of the following models: a support vector machine model, a decision tree model, a naive Bayes model, or a logistic regression model;
a predicted identification information judging subunit, configured to determine whether the identification information predicted by the initial model for the characteristic information of a sample website is consistent with the determined identification information corresponding to the characteristic information of that sample website;
a parameter revising subunit, configured to, when the predicted identification information judging subunit determines that the identification information predicted by the initial model is inconsistent with the determined identification information, use the characteristic information of the sample website and the determined identification information corresponding to it as training data for the initial model, and revise the parameters of the initial model based on the training data to obtain the information identification model.
10. The device according to claim 7, characterized in that the identification information includes first identification information and second identification information; and
the determining unit comprises:
a first selecting subunit, configured to select one of the first identification information and the second identification information as the identification information corresponding to the candidate pushed information, based on whether the retrieved record information of the website contains a preset keyword;
or, a second selecting subunit, configured to select one of the first identification information and the second identification information as the identification information corresponding to the candidate pushed information, based on whether the obtained set of user report information contains information about the website.
CN201610029313.9A 2016-01-15 2016-01-15 Information pushing method and device Pending CN105718533A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201610029313.9A CN105718533A (en) 2016-01-15 2016-01-15 Information pushing method and device
PCT/CN2016/087453 WO2017121076A1 (en) 2016-01-15 2016-06-28 Information-pushing method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610029313.9A CN105718533A (en) 2016-01-15 2016-01-15 Information pushing method and device

Publications (1)

Publication Number Publication Date
CN105718533A true CN105718533A (en) 2016-06-29

Family

ID=56147623

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610029313.9A Pending CN105718533A (en) 2016-01-15 2016-01-15 Information pushing method and device

Country Status (2)

Country Link
CN (1) CN105718533A (en)
WO (1) WO2017121076A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110059297A (en) * 2019-04-22 2019-07-26 上海乂学教育科技有限公司 Knowledge point suitable for adaptive learning learns duration prediction method and its application
CN110392155A (en) * 2018-04-16 2019-10-29 阿里巴巴集团控股有限公司 It has been shown that, processing method, device and the equipment of notification message

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109559158A (en) * 2018-11-06 2019-04-02 北京奇虎科技有限公司 Promotion message put-on method, device, electronic equipment and readable storage medium storing program for executing
CN111488517B (en) * 2019-01-29 2024-07-19 北京沃东天骏信息技术有限公司 Method and device for training click rate estimation model
CN111949860B (en) * 2019-05-15 2022-02-08 北京字节跳动网络技术有限公司 Method and apparatus for generating a relevance determination model
CN112766995B (en) * 2019-10-21 2024-09-24 招商证券股份有限公司 Article recommendation method, device, terminal equipment and storage medium
CN111177552A (en) * 2019-12-27 2020-05-19 绍兴市上虞区理工高等研究院 Scientific and technological achievement pushing method and device based on user requirements
CN111597453B (en) * 2020-03-31 2024-05-07 平安科技(深圳)有限公司 User image drawing method, device, computer equipment and computer readable storage medium
CN112148937B (en) * 2020-10-12 2023-07-25 平安科技(深圳)有限公司 Method and system for pushing dynamic epidemic prevention knowledge
CN113724815B (en) * 2021-08-30 2024-06-21 深圳平安智慧医健科技有限公司 Information pushing method and device based on decision grouping model

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101059818A (en) * 2007-06-26 2007-10-24 申屠浩 Method for reinforcing search engine result safety
CN102142033A (en) * 2010-05-20 2011-08-03 百度在线网络技术(北京)有限公司 Method and device for providing relative sub-link information in search result
CN102664925A (en) * 2012-03-29 2012-09-12 奇智软件(北京)有限公司 Method and apparatus for displaying searching result
CN103235821A (en) * 2013-04-27 2013-08-07 百度在线网络技术(北京)有限公司 Original content searching method and searching server
CN103399957A (en) * 2013-08-21 2013-11-20 百度在线网络技术(北京)有限公司 Searching method, system and engine as well as client
CN103810162A (en) * 2012-11-05 2014-05-21 腾讯科技(深圳)有限公司 Method and system for recommending network information
CN103902888A (en) * 2012-12-24 2014-07-02 腾讯科技(深圳)有限公司 Website trust automatic rating method, server-side and system
CN104504058A (en) * 2014-12-18 2015-04-08 北京奇虎科技有限公司 Web page presentation method and browser device
CN104735074A (en) * 2015-03-31 2015-06-24 江苏通付盾信息科技有限公司 Malicious URL detection method and implement system thereof

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101963966A (en) * 2009-07-24 2011-02-02 李占胜 Method for sorting search results by adding labels into search results
US20110125791A1 (en) * 2009-11-25 2011-05-26 Microsoft Corporation Query classification using search result tag ratios
CN101814083A (en) * 2010-01-08 2010-08-25 上海复歌信息科技有限公司 Automatic webpage classification method and system
US20120059838A1 (en) * 2010-09-07 2012-03-08 Microsoft Corporation Providing entity-specific content in response to a search query
CN102375952B (en) * 2011-10-31 2014-12-24 北龙中网(北京)科技有限责任公司 Method for displaying whether website is credibly checked in search engine result
CN103401835A (en) * 2013-07-01 2013-11-20 北京奇虎科技有限公司 Method and device for presenting safety detection results of microblog page

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110392155A (en) * 2018-04-16 2019-10-29 阿里巴巴集团控股有限公司 Display and processing method, device and equipment for notification messages
CN110059297A (en) * 2019-04-22 2019-07-26 上海乂学教育科技有限公司 Knowledge point learning duration prediction method suitable for adaptive learning and its application

Also Published As

Publication number Publication date
WO2017121076A1 (en) 2017-07-20

Similar Documents

Publication Publication Date Title
CN105718533A (en) Information pushing method and device
CN105183912B (en) Abnormal log determination method and apparatus
CN1934569B (en) Search systems and methods with integration of user annotations
US20060288015A1 (en) Electronic content classification
US20150169710A1 (en) Method and apparatus for providing search results
CN107220386A (en) Information-pushing method and device
US20080059454A1 (en) Search document generation and use to provide recommendations
US20080134015A1 (en) Web Site Structure Analysis
CN110825956A (en) Information flow recommendation method and device, computer equipment and storage medium
CN110516173B (en) Illegal website identification method, device, equipment and medium
WO2010120941A2 (en) Automatic mapping of a location identifier pattern of an object to a semantic type using object metadata
CN108230113A (en) User portrait generation method, device, equipment and readable storage medium
CN101517967A (en) Traffic prediction for web sites
CN104899220A (en) Application program recommendation method and system
CN106339380A (en) Method and device for recommending frequently asked question information
CN105306495A (en) User identification method and device
CN107526718A (en) Method and apparatus for generating text
CN105426508A (en) Webpage generation method and apparatus
CN103544150A (en) Method and system for providing recommendation information for mobile terminal browser
CN108280102A (en) Internet behavior recording method, device and user terminal
CN103425767A (en) Method and system for determining prompt data
KR102575415B1 (en) Method and apparatus for providing information on advertisements available for reservation during the marketer's workload period
CN104573120A (en) Recommendation information obtaining method and device for terminal
CN116226494A (en) Crawler system and method for information search
KR102244668B1 (en) System and method for automatically inputting personal information using codes

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20160629
