Detailed Description
Exemplary embodiments of the present disclosure are described more fully below with reference to the accompanying drawings. Although the drawings show exemplary embodiments of the disclosure, it should be understood that the disclosure may be implemented in various forms and should not be limited by the embodiments set forth here. Rather, these embodiments are provided so that the disclosure will be understood more thoroughly and so that its scope can be fully conveyed to those skilled in the art.
It should be noted that the terms "first", "second", and the like in the description, the claims, and the above drawings of this application are used to distinguish similar objects and do not describe a particular order or sequence. It should be understood that data so described are interchangeable where appropriate, so that the embodiments described herein can be implemented in an order other than those illustrated or described herein. In addition, the terms "comprising" and "having", and any variations of them, are intended to cover non-exclusive inclusion. For example, a process, method, system, product, or device that comprises a series of steps or units is not necessarily limited to the steps or units expressly listed, but may include other steps or units that are not expressly listed or that are inherent to the process, method, product, or device.
According to an embodiment of this application, a method embodiment for recommending application programs is provided. It should be noted that the steps illustrated in the flowcharts of the drawings may be executed in a computer system, for example as a set of computer-executable instructions. Although a logical order is shown in the flowcharts, in some cases the steps shown or described may be performed in a different order.
To improve the accuracy and efficiency of application program recommendation, embodiments of the present invention provide a method and a device for recommending application programs. Preferred embodiments of the present invention are described below with reference to the accompanying drawings.
A word vector (Word Vector) model is a sequence-based learning model that is widely used in fields such as natural language processing (Natural Language Processing). A sentence takes the word as its smallest unit and is formed by arranging words in a certain order. By training a word vector model on the sentences in a corpus (a text file made up of many sentences), each word in the corpus can be converted into a vector Ω made up of real values in several dimensions, and the similarity between vectors characterizes the similarity between words. If the similarity between two vectors is high, the distance between them is small; if the similarity is low, the distance is large. The vector Ω of each word characterizes the positional relationships in which that word occurs across the many sentences.
For example, suppose that in different sentences the word "Apple" and the word "iPhone" always have similar contexts, such as: word A → word B → Apple → word C → word D, and word A → word B → iPhone → word C → word D. After training with the word vector model, the vector Ω1 of the word "Apple" and the vector Ω2 of the word "iPhone" are close together; that is, the similarity between Ω1 and Ω2 is high, which indicates that "Apple" and "iPhone" are highly similar. The word vector model can therefore map words that differ literally but are semantically identical or similar to highly similar vectors Ω. This is because the positional relationships among the words in the many sentences of the corpus determine how similar those words are.
Based on the above word vector theory, in embodiments of the present invention the identifier of an application program is treated as a word in the word vector model. Training with the word vector model converts the identifier of each application program in the application history download sequence set of multiple users into a real-valued vector Ω, establishing a correspondence between the identifier of each application program and its vector.
To make the advantages of the technical solution of the present invention clearer, the present invention is described in detail below with reference to the accompanying drawings and embodiments.
An embodiment of the present invention provides a method for recommending application programs. As shown in Fig. 1, the method includes:
101. Obtain the application history download sequence set of multiple users.
The application history download sequence set of a user includes the identification information corresponding to each of multiple application programs and the download order of those application programs. The application history download sequence set includes the application history download sequences of a large number of users, and the application download sequence of each user includes the identification information of multiple application programs.
For example, the identification information of the application programs may be denoted app1 to app12, as follows:
App download sequence of user 1: app1→app2→app3→app4→app5……
App download sequence of user 2: app6→app2→app3→app9→app10……
App download sequence of user 3: app11→app7→app12→app9→app4……
102. Train a word vector model on the identification information corresponding to the application programs in the application history download sequence set to obtain a vector corresponding to each piece of identification information.
In embodiments of the present invention, the identification information of an application program is treated as a word in the word vector model, and the word vector model is trained on the application history download sequence set. The identification information of each application program in the set is thereby converted into a real-valued vector; that is, the identification information of each application program corresponds to one vector. The similarity between the vectors of application programs characterizes the similarity between the application programs. For words trained with a word vector model, the closer the word vectors, the higher their similarity and the more similar the words are semantically. Likewise, for the vectors obtained by training the word vector model on the application history download sequence set, application programs with highly similar vectors are highly similar, and application programs with dissimilar vectors are dissimilar.
Following the way a word vector model is trained on the words of a sentence, suppose, for example, that the following user application history download sequences exist:
User 1: identifier of application program A1 → identifier of application program A2 → identifier of application program A3 → identifier of application program A4 → identifier of application program A5;
User 2: identifier of application program A1 → identifier of application program A2 → identifier of application program A6 → identifier of application program A4 → identifier of application program A5.
The vector corresponding to the identifier of application program A3 is then close to the vector corresponding to the identifier of application program A6. The similarity between the two vectors is high, which indicates that application programs A3 and A6 are highly similar.
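The intuition behind the A3/A6 example can be illustrated with a much simpler stand-in than a trained word vector model. The sketch below is not word2vec: it only counts which identifiers appear near each identifier (a co-occurrence vector) and compares those counts with cosine similarity. The function names and the window size of 2 are assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict

def context_vectors(sequences, window=2):
    """For each app ID, count the app IDs that occur within `window` positions of it."""
    contexts = defaultdict(Counter)
    for seq in sequences:
        for i, app in enumerate(seq):
            lo, hi = max(0, i - window), min(len(seq), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    contexts[app][seq[j]] += 1
    return contexts

def cosine(u, v):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

# The two download sequences from the example above.
sequences = [
    ["A1", "A2", "A3", "A4", "A5"],  # user 1
    ["A1", "A2", "A6", "A4", "A5"],  # user 2
]
vectors = context_vectors(sequences)
sim_a3_a6 = cosine(vectors["A3"], vectors["A6"])  # A3 and A6 share all neighbours
sim_a3_a1 = cosine(vectors["A3"], vectors["A1"])  # A3 and A1 share only A2
```

Because A3 and A6 occur between the same neighbours, their context vectors coincide and `sim_a3_a6` exceeds `sim_a3_a1`. A trained word vector model would produce dense vectors rather than counts, but the same ordering is what the embodiment relies on.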
After the word vector model has been trained on the application history download sequence set, the vector corresponding to the identifier of each application program in the set is obtained. An application program vectorization database is generated in advance to store the correspondence between application program identifiers and vectors, for example: the identifier of application program A1 and vector A1, the identifier of application program A2 and vector A2, the identifier of application program A3 and vector A3, the identifier of application program A4 and vector A4, the identifier of application program A5 and vector A5, and the identifier of application program A6 and vector A6.
103. Recommend application programs to a user who downloads application programs according to the vector corresponding to each piece of identification information.
In the method for recommending application programs provided by this embodiment of the present invention, the application history download sequence set of multiple users is first obtained; a word vector model is then trained on the identification information corresponding to the application programs in that set to obtain a vector corresponding to each piece of identification information; finally, application programs are recommended to a user who downloads application programs according to those vectors. Compared with the current practice of calculating application program similarity manually, this embodiment trains on users' application download histories in the same way that the word vector model word2vec is trained on text sequences, so the similarity between different application programs can be calculated efficiently, which improves the accuracy and efficiency of application program recommendation.
An embodiment of the present invention provides another method for recommending application programs. As shown in Fig. 2, the method includes:
201. Obtain the application history download sequence set of multiple users.
The application history download sequence set of the user includes the identification information corresponding to each of multiple application programs and the download order of those application programs. The application history sequence set includes the application download sequences of a large number of users, and the application download sequence of each user includes the identification information of multiple application programs.
The download order of the application programs is the chronological order in which a user downloaded each application program. For example, user A downloaded the application program named app1 on July 1, 2016, the application program named app2 on July 2, 2016, the application program named app3 on July 3, 2016, the application program named app4 on July 4, 2016, and the application program named app5 on July 5, 2016. Arranged in the chronological order in which user A downloaded them, the application history download sequence of user A is: app1→app2→app3→app4→app5.
In embodiments of the present invention, the names of the application programs in an application history download sequence are arranged in the chronological order in which the user downloaded them, so that the sequence matches the actual download order. This improves the accuracy of the similarity calculation between different application programs in subsequent steps.
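As a minimal sketch of this ordering step, the following code builds per-user download sequences from unordered download records. The record layout `(user, app_id, download_date)` and the function name `build_download_sequences` are assumptions made for illustration; the source does not specify how download records are stored.

```python
from collections import defaultdict
from datetime import date

def build_download_sequences(records):
    """Group (user, app_id, download_date) records by user, ordered by download date."""
    by_user = defaultdict(list)
    for user, app_id, downloaded_on in records:
        by_user[user].append((downloaded_on, app_id))
    # Sorting the (date, app_id) pairs puts each user's downloads in chronological order.
    return {
        user: [app_id for _, app_id in sorted(items)]
        for user, items in by_user.items()
    }

# User A's downloads from the example above, deliberately out of order.
records = [
    ("A", "app3", date(2016, 7, 3)),
    ("A", "app1", date(2016, 7, 1)),
    ("A", "app5", date(2016, 7, 5)),
    ("A", "app2", date(2016, 7, 2)),
    ("A", "app4", date(2016, 7, 4)),
]
sequences = build_download_sequences(records)
```

Here `sequences["A"]` holds the identifiers in the order app1, app2, app3, app4, app5, matching the sequence given for user A.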
202. Train a word vector model on the identification information corresponding to the application programs in the application history download sequence set to obtain a vector corresponding to each piece of identification information.
For a detailed description of step 202, in which a word vector model is trained on the identification information corresponding to the application programs in the application history download sequence set to obtain a vector corresponding to each piece of identification information, refer to the description of the corresponding step of Fig. 1; details are not repeated here.
Further, after step 202, the method further includes: storing the correspondence between the identification information and the vector of the identification information in the application program vectorization database. It should be noted that the application program vectorization database can be edited as needed, where editing includes any one or more of deleting, modifying, and adding.
In embodiments of the present invention, the application history download sequence set of multiple users is obtained once every preset time period. The identifiers of the application programs in the set are treated as words in the word vector model, the word vector model is retrained on the set to obtain the vector corresponding to the identifier of each application program in the set, and the application program vectorization database is updated with the newly trained vectors. The preset time period can be set as needed, for example three months or half a year; the embodiments of the present invention do not specifically limit it. For example, if the original vector stored in the application program vectorization database for the identifier of an application program differs from the newly trained vector, the original vector is replaced with the newly trained vector. If the vectorization database does not store a vector for the identifier of an application program, the vector corresponding to that identifier is added to the preset application program vectorization database.
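The replace-or-add rule described above can be sketched as follows, assuming the application program vectorization database is modelled as an in-memory dictionary from identifier to vector. The source does not specify the storage format, so the dictionary and the function name `update_vector_database` are illustrative assumptions.

```python
def update_vector_database(database, newly_trained):
    """Replace vectors that changed and add vectors for app IDs not yet stored.

    Returns the lists of replaced and newly added app IDs.
    """
    replaced, added = [], []
    for app_id, vector in newly_trained.items():
        if app_id not in database:
            added.append(app_id)
        elif database[app_id] != vector:
            replaced.append(app_id)
        else:
            continue  # stored vector is already up to date
        database[app_id] = list(vector)
    return replaced, added

# Hypothetical stored vectors and hypothetical output of a retraining run.
database = {"app1": [0.1, 0.2], "app2": [0.3, 0.4]}
newly_trained = {"app1": [0.1, 0.2], "app2": [0.5, 0.6], "app3": [0.7, 0.8]}
replaced, added = update_vector_database(database, newly_trained)
```

In this run `app2` is replaced because its vector changed, `app3` is added because it was not stored, and `app1` is left untouched.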
203. Obtain the identification information of a first application program downloaded by the user.
204. Obtain, from the application program vectorization database, the first vector corresponding to the identification information of the first application program.
Further, the method further includes: if the first vector is not obtained from the application program vectorization database, obtaining the application download sequence in which the first application program is located; adding that application download sequence to the application history download sequence set and retraining the word vector model on the application history download sequence set; and updating the correspondence between the identification information and the vector of the identification information in the application program vectorization database.
It should be noted that if the first vector corresponding to the identifier of the first application program cannot be found in the application program vectorization database, the application download sequence in which the first application program is located is obtained and combined with the previously obtained application history download sequence set. Training of the word vector model on the application history download sequence set is then forced, yielding the first vector corresponding to the identifier of the first application program, and the correspondence between the identifier of the first application program and the first vector is added to the application program vectorization database.
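The fallback path for a missing first vector can be sketched as below. The trainer is passed in as a callable so that the sketch stays independent of any particular word vector library; `toy_train` is a placeholder that assigns dummy vectors and is not a word vector model. All names here are hypothetical.

```python
def get_first_vector(app_id, database, sequence_set, user_sequence, train):
    """Look up the vector for `app_id`; if it is missing, force a retraining run.

    `train` is any callable that maps a list of download sequences to a
    dict of {app_id: vector}.
    """
    vector = database.get(app_id)
    if vector is None:
        sequence_set.append(user_sequence)      # add the sequence containing the app
        database.update(train(sequence_set))    # retrain and refresh the database
        vector = database.get(app_id)
    return vector

def toy_train(sequence_set):
    """Placeholder trainer (NOT word2vec): gives every app ID a dummy 2-d vector."""
    app_ids = sorted({app for seq in sequence_set for app in seq})
    return {app: [float(i), 1.0] for i, app in enumerate(app_ids)}

sequence_set = [["app1", "app2"]]
database = toy_train(sequence_set)              # initial database: app1, app2
first_vector = get_first_vector(
    "app9", database, sequence_set, ["app2", "app9"], toy_train
)
```

After the call, `app9` has an entry in `database` and the user's sequence has been appended to `sequence_set`, mirroring the forced retraining described above.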
In embodiments of the present invention, the word vector model can be trained on the application history download sequence set automatically at every preset time period, or training can be forced as needed. Using these two approaches, the application program vectorization database is continuously updated.
205. Search the application program vectorization database for at least one second vector whose similarity to the first vector meets a preset similarity.
In embodiments of the present invention, at least one second vector with the highest similarity to the first vector can be searched for in the preset application program vectorization database, which amounts to searching for at least one second application program with the highest similarity to the first application program. In one example, the similarity between the first vector and every other vector in the application program vectorization database is calculated to obtain the at least one second vector with the highest similarity to the first vector. The second vectors are searched for according to the number of second application programs to be found. That is, to find the one second application program most similar to the first application program, the one second vector most similar to the first vector is searched for; to find multiple second application programs most similar to the first application program, multiple second vectors most similar to the first vector are searched for, and the number of second vectors searched for equals the number of second application programs to be found.
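A minimal sketch of this highest-similarity lookup is given below. It assumes the vectorization database is a dictionary from identifier to vector and uses cosine similarity as the measure; the vectors and the function name `most_similar` are made up for illustration.

```python
import heapq
import math

def cosine_similarity(u, v):
    """Cosine similarity between two dense vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def most_similar(first_id, database, n=1):
    """Return the IDs of the n apps whose vectors are most similar to first_id's vector."""
    first_vector = database[first_id]
    scored = (
        (cosine_similarity(first_vector, vector), app_id)
        for app_id, vector in database.items()
        if app_id != first_id  # compare against every other stored vector
    )
    return [app_id for _, app_id in heapq.nlargest(n, scored)]

# Hypothetical 2-d vectors; trained vectors would have many more dimensions.
database = {
    "A1": [1.0, 0.0],
    "A2": [0.9, 0.1],
    "A3": [0.0, 1.0],
    "A4": [0.7, 0.7],
}
top_two = most_similar("A1", database, n=2)
```

With these vectors, `top_two` lists A2 first and A4 second. The parameter `n` plays the role of the number of second application programs to be found.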
In another example, at least one second vector whose similarity to the first vector falls within a preset interval is searched for in the preset application program vectorization database.
This amounts to searching for at least one second application program whose similarity to the first application program falls within the preset interval. In one example, the similarity between the first vector and every other vector in the preset application program vectorization database is calculated to obtain the at least one second vector whose similarity to the first vector falls within the preset interval. According to the number of second application programs to be found, multiple second vectors whose similarity to the first vector falls within the preset interval are searched for, and the number of second vectors searched for equals the number of second application programs to be found. For example, to find N second application programs whose similarity to the first application program is greater than 50%, N second vectors whose similarity to the first vector is greater than 50% are searched for.
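The interval-based variant can be sketched as a filter over similarity scores that have already been computed against the first vector. The scores, the 0.5 threshold (taken from the 50% example), and the function name `filter_by_similarity` are illustrative assumptions.

```python
def filter_by_similarity(scores, threshold=0.5, n=None):
    """Return app IDs whose similarity exceeds `threshold`, most similar first.

    `scores` maps each candidate app ID to its similarity with the first vector.
    If `n` is given, at most n IDs are returned.
    """
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    selected = [app_id for app_id, score in ranked if score > threshold]
    return selected if n is None else selected[:n]

# Hypothetical similarities between the first application and three candidates.
scores = {"A2": 0.99, "A3": 0.00, "A4": 0.71}
above_half = filter_by_similarity(scores, threshold=0.5)
best_one = filter_by_similarity(scores, threshold=0.5, n=1)
```

Here `above_half` contains A2 and A4, and `best_one` contains only A2. A closed interval with both a lower and an upper bound could be handled the same way by adding a second comparison.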
There are many possible ways to calculate the similarity between vectors. For example, the Euclidean distance between the vectors can be calculated and used to measure their similarity: the shorter the Euclidean distance, the higher the similarity between the vectors; the greater the Euclidean distance, the lower the similarity. As another example, the cosine similarity between the vectors can be calculated.
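Both measures can be written in a few lines. The sketch below uses made-up two-dimensional vectors named after the A3/A6 example; real trained vectors would have more dimensions.

```python
import math

def euclidean_distance(u, v):
    """Euclidean distance: smaller values mean the vectors are more similar."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def cosine_similarity(u, v):
    """Cosine similarity: larger values mean the vectors are more similar."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

# Hypothetical vectors: A3 and A6 point in nearly the same direction, A1 does not.
vec_a3, vec_a6, vec_a1 = [0.9, 0.1], [0.8, 0.2], [0.1, 0.9]

dist_close = euclidean_distance(vec_a3, vec_a6)
dist_far = euclidean_distance(vec_a3, vec_a1)
cos_close = cosine_similarity(vec_a3, vec_a6)
cos_far = cosine_similarity(vec_a3, vec_a1)
```

For these vectors the two measures agree: `dist_close` is smaller than `dist_far`, and `cos_close` is larger than `cos_far`. The two measures can rank candidates differently when vector lengths vary, since cosine similarity ignores magnitude.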
After at least one second vector whose similarity to the first vector meets the preset similarity has been found, the identifier of the second application program corresponding to each second vector is looked up in the application program vectorization database. If exactly one second vector meets the preset similarity, the identifier of the second application program corresponding to that second vector is output; if multiple second vectors meet the preset similarity, the identifier of the second application program corresponding to each of them is output. Each identifier output identifies an application program whose similarity to the first application program meets the preset similarity.
206. Recommend to the user the identification information of the application program corresponding to the second vector in the application program vectorization database.
Further, the method includes: outputting the recommended application program together with the type information and description information corresponding to the recommended application program.
In this other method for recommending application programs provided by an embodiment of the present invention, the application history download sequence set of multiple users is first obtained; a word vector model is then trained on the identification information corresponding to the application programs in that set to obtain a vector corresponding to each piece of identification information; finally, application programs are recommended to a user who downloads application programs according to those vectors. Compared with the current practice of calculating application program similarity manually, this embodiment trains on users' application download histories in the same way that the word vector model word2vec is trained on text sequences, so the similarity between different application programs can be calculated efficiently, which improves the accuracy and efficiency of application program recommendation.
Further, an embodiment of the present invention provides a device for recommending application programs. As shown in Fig. 3, the device includes: an acquiring unit 31, a training unit 32, and a recommendation unit 33.
The acquiring unit 31 is configured to obtain the application history download sequence set of multiple users, where the application history download sequence of a user includes the identification information corresponding to each of multiple application programs and the download order of those application programs;
The training unit 32 is configured to train a word vector model on the identification information corresponding to the application programs in the application history download sequence set to obtain a vector corresponding to each piece of identification information;
The recommendation unit 33 is configured to recommend application programs to a user who downloads application programs according to the vector corresponding to each piece of identification information.
It should be noted that, for other descriptions of the functional units of the device for recommending application programs provided by this embodiment of the present invention, reference may be made to the corresponding descriptions of the method shown in Fig. 1; details are not repeated here. It should be understood, however, that the device in this embodiment can implement all of the content of the foregoing method embodiment.
In the device for recommending application programs provided by this embodiment of the present invention, the application history download sequence set of multiple users is first obtained; a word vector model is then trained on the identification information corresponding to the application programs in that set to obtain a vector corresponding to each piece of identification information; finally, application programs are recommended to a user who downloads application programs according to those vectors. Compared with the current practice of calculating application program similarity manually, this embodiment trains on users' application download histories in the same way that the word vector model word2vec is trained on text sequences, so the similarity between different application programs can be calculated efficiently, which improves the accuracy and efficiency of application program recommendation.
Further, an embodiment of the present invention provides another device for recommending application programs. As shown in Fig. 4, the device includes: an acquiring unit 41, a training unit 42, and a recommendation unit 43.
The acquiring unit 41 is configured to obtain the application history download sequence set of multiple users, where the application history download sequence set of a user includes the identification information corresponding to each of multiple application programs and the download order of those application programs;
The training unit 42 is configured to train a word vector model on the identification information corresponding to the application programs in the application history download sequence set to obtain a vector corresponding to each piece of identification information;
The recommendation unit 43 is configured to recommend application programs to a user who downloads application programs according to the vector corresponding to each piece of identification information.
Further, the device further includes:
A storage unit 44, configured to store the correspondence between the identification information and the vector of the identification information in the application program vectorization database.
Specifically, the recommendation unit 43 includes:
An acquisition module 431, configured to obtain the identification information of a first application program downloaded by the user;
The acquisition module 431 is further configured to obtain, from the application program vectorization database, the first vector corresponding to the identification information of the first application program;
A searching module 432, configured to search the application program vectorization database for at least one second vector whose similarity to the first vector meets a preset similarity;
A recommending module 433, configured to recommend to the user the identification information of the application program corresponding to the second vector in the application program vectorization database.
Further, the device further includes an updating unit 45;
The acquiring unit 41 is further configured to, if the first vector is not obtained from the application program vectorization database, obtain the application download sequence in which the first application program is located;
The training unit 42 is further configured to add the application download sequence in which the first application program is located to the application history download sequence set and to retrain the word vector model on the application history download sequence set;
The updating unit 45 is configured to update the correspondence between the identification information and the vector of the identification information in the application program vectorization database.
The searching module 432 is specifically configured to search the application program vectorization database for at least one second vector with the highest similarity to the first vector.
Further, the device further includes:
An output unit 46, configured to output the recommended application program together with the type information and description information corresponding to the recommended application program.
It should be noted that, for other descriptions of the functional units of the device for recommending application programs provided by this embodiment of the present invention, reference may be made to the corresponding descriptions of the method shown in Fig. 2; details are not repeated here. It should be understood, however, that the device in this embodiment can implement all of the content of the foregoing method embodiment.
In this other device for recommending application programs provided by an embodiment of the present invention, the application history download sequence set of multiple users is first obtained; a word vector model is then trained on the identification information corresponding to the application programs in that set to obtain a vector corresponding to each piece of identification information; finally, application programs are recommended to a user who downloads application programs according to those vectors. Compared with the current practice of calculating application program similarity manually, this embodiment trains on users' application download histories in the same way that the word vector model word2vec is trained on text sequences, so the similarity between different application programs can be calculated efficiently, which improves the accuracy and efficiency of application program recommendation.
The device for recommending application programs includes a processor and a memory. The acquiring unit, training unit, recommendation unit, storage unit, updating unit, output unit, and so on are stored in the memory as program units, and the processor executes these program units stored in the memory to implement the corresponding functions.
The processor contains a kernel, which retrieves the corresponding program unit from the memory. One or more kernels may be provided, and the accuracy and efficiency of calculating application program similarity are improved by adjusting kernel parameters.
The memory may take the form of volatile memory, random access memory (RAM), and/or non-volatile memory in computer-readable media, such as read-only memory (ROM) or flash memory (flash RAM). The memory includes at least one memory chip.
This application also provides a computer program product which, when executed on a data processing device, is adapted to execute program code initialized with the following method steps: obtaining the application history download sequence set of multiple users, where the application history download sequence set of a user includes the identification information corresponding to each of multiple application programs and the download order of those application programs; training a word vector model on the identification information corresponding to the application programs in the application history download sequence set to obtain a vector corresponding to each piece of identification information; and recommending application programs to a user who downloads application programs according to the vector corresponding to each piece of identification information.
Those skilled in the art should understand that the embodiments of this application may be provided as a method, a system, or a computer program product. Therefore, this application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, this application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to magnetic disk storage, CD-ROM, and optical storage) that contain computer-usable program code.
This application is described with reference to flowcharts and/or block diagrams of the method, device (system), and computer program product according to the embodiments of this application. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing device to produce a machine, so that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to operate in a particular manner, so that the instructions stored in the computer-readable memory produce an article of manufacture that includes an instruction means, where the instruction means implements the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operation steps is performed on the computer or other programmable device to produce computer-implemented processing. The instructions executed on the computer or other programmable device thereby provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
In a typical configuration, a computing device includes one or more processors (CPUs), an input/output interface, a network interface, and memory.
The memory may take the form of volatile memory, random access memory (RAM), and/or non-volatile memory in computer-readable media, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media include permanent and non-permanent, removable and non-removable media, which can store information by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media (transitory media), such as modulated data signals and carrier waves.
The above are merely embodiments of this application and are not intended to limit this application. For those skilled in the art, this application may have various modifications and variations. Any modification, equivalent replacement, improvement, and the like made within the spirit and principle of this application shall fall within the scope of the claims of this application.