CN110020151B - Data processing method and device, electronic equipment and storage medium - Google Patents

Info

Publication number
CN110020151B
Authority
CN
China
Prior art keywords
site information
keyword
keywords
vector
word vector
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201711252207.8A
Other languages
Chinese (zh)
Other versions
CN110020151A (en)
Inventor
贺宇
董国盛
周泽南
苏雪峰
佟子健
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Sogou Network Technology Co ltd
Beijing Sogou Technology Development Co Ltd
Original Assignee
Beijing Sogou Technology Development Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Sogou Technology Development Co Ltd
Priority to CN201711252207.8A
Publication of CN110020151A
Application granted
Publication of CN110020151B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/958 Organisation or management of web site content, e.g. publishing, maintaining pages or automatic linking

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The embodiments of the invention provide a data processing method and device, electronic equipment and a storage medium, with the aim of improving the accuracy of correlation determination. The method comprises the following steps: forming an association path of keywords and site information according to the keywords and the site information in target search results; determining a first word vector of the keyword and a second word vector of the site information according to the association path and a preset model; and calculating the correlation between the keyword and the site information according to the first word vector and the second word vector. No manual classification processing is needed, and the accuracy of determining the correlation is effectively improved.

Description

Data processing method and device, electronic equipment and storage medium
Technical Field
The present invention relates to the field of computer technologies, and in particular, to a data processing method, a data processing apparatus, an electronic device, and a storage medium.
Background
With the development of network technology, more and more users query the information they need over the network, for example popular film and television works, popular games, and the performance and ranking of various commodities, so that the query results can assist them in selecting information.
Generally, query results need to be ranked before they are fed back. Some methods score the quality of sites, but they do not consider the correlation between query terms and sites, so the ranking may not meet the user's needs, which reduces query efficiency. Other methods determine the correlation between keywords and sites according to categories, that is, the correlation between a query term and a site is calculated from how well their categories match. However, the categories and classification features in these methods are usually set manually, so the accuracy of classification cannot be guaranteed; the accuracy of the correlation calculation is therefore difficult to guarantee, and rankings produced according to that correlation are also inaccurate.
Disclosure of Invention
The technical problem to be solved by the embodiments of the present invention is to provide a data processing method to improve the accuracy of the correlation determination.
Correspondingly, the embodiment of the invention also provides a data processing device, electronic equipment and a storage medium, which are used for ensuring the realization and application of the method.
In order to solve the above problem, an embodiment of the present invention discloses a data processing method, where the method includes: forming an association path of the keywords and the site information according to the keywords and the site information in the target search result; determining a first word vector of the keyword and a second word vector of the site information according to the association path and a preset model; and calculating the correlation between the keywords and the site information according to the first word vector and the second word vector.
Optionally, the forming an associated path of the keyword and the site information according to the keyword and the site information in the target search result includes: determining a plurality of target search results, and respectively extracting keywords and site information corresponding to the keywords from each target search result; and forming an associated path by adopting the corresponding relation between the keywords and the site information.
Optionally, the forming an association path by using the correspondence between the keyword and the site information includes: connecting each keyword with corresponding site information according to the corresponding relation between the keywords and the site information to form a bipartite graph of the keywords and the site information; and determining a plurality of associated paths of the keywords and the site information according to the bipartite graph.
Optionally, the determining a plurality of associated paths of the keyword and the site information according to the bipartite graph includes: and connecting the keywords and the site information in series in a random walk mode according to the bipartite graph to generate a plurality of associated paths.
Optionally, the determining a first word vector of the keyword and a second word vector of the site information according to the association path and the preset model includes: generating vector information according to the associated path, wherein the vector information comprises a first path vector of the keyword and a second path vector of the site information; and inputting the vector information into a preset model to obtain a first word vector of the keyword and a second word vector of the site information.
Optionally, the calculating the correlation between the keyword and the site information according to the first word vector and the second word vector includes: selecting keywords and site information; and performing correlation calculation on the first word vector of the keyword and the second word vector of the site information to obtain the correlation between the keyword and the site information.
Optionally, the method further includes: when a set service is executed through a query word, website information corresponding to the query word is acquired from a query result, wherein the set service comprises at least one of the following: searching and recommending services; and taking the query word as a keyword, taking the website information as site information, and querying the correlation between the corresponding keyword and the site information.
An embodiment of the present invention further provides a data processing apparatus, including: the path determining module is used for forming an associated path of the keyword and the site information according to the keyword and the site information in the target search result; the word vector determining module is used for determining a first word vector of the keyword and a second word vector of the site information according to the association path and the preset model; and the correlation calculation module is used for calculating the correlation between the keywords and the site information according to the first word vector and the second word vector.
Optionally, the path determining module includes: the data extraction submodule is used for determining a plurality of target search results and respectively extracting keywords and site information corresponding to the keywords from each target search result; and the path generation submodule is used for forming an associated path by adopting the corresponding relation between the keywords and the site information.
Optionally, the path generating sub-module includes: a bipartite graph generating unit, configured to connect each keyword with corresponding site information according to a correspondence between the keyword and the site information to form a bipartite graph of the keyword and the site information; and the path determining unit is used for determining a plurality of associated paths of the keywords and the site information according to the bipartite graph.
Optionally, the path determining unit is configured to generate a plurality of associated paths by connecting the keywords and the site information in series in a random walk manner according to the bipartite graph.
Optionally, the word vector determining module is configured to generate vector information according to the association path, where the vector information includes a first path vector of the keyword and a second path vector of the site information; and inputting the vector information into a preset model to obtain a first word vector of the keyword and a second word vector of the site information.
Optionally, the correlation calculation module is configured to select a keyword and site information; and performing correlation calculation on the first word vector of the keyword and the second word vector of the site information to obtain the correlation between the keyword and the site information.
Optionally, the apparatus further includes: a correlation query module, configured to acquire website information corresponding to a query word from a query result when a set service is executed through the query word, where the set service includes at least one of the following: a search service and a recommendation service; and to take the query word as a keyword and the website information as site information, and query the correlation between the corresponding keyword and the site information.
The embodiment of the present invention further provides a readable storage medium, which is characterized in that when instructions in the storage medium are executed by a processor of an electronic device, the electronic device is enabled to execute the data processing method according to any one of the embodiments of the present invention.
An embodiment of the present invention further provides an electronic device, which includes a memory and one or more programs, where the one or more programs are stored in the memory and are configured to be executed by one or more processors, and the one or more programs include instructions for: forming an association path of the keywords and the site information according to the keywords and the site information in the target search result; determining a first word vector of the keyword and a second word vector of the site information according to the association path and a preset model; and calculating the correlation between the keywords and the site information according to the first word vector and the second word vector.
Optionally, the forming an associated path of the keyword and the site information according to the keyword and the site information in the target search result includes: determining a plurality of target search results, and respectively extracting keywords and site information corresponding to the keywords from each target search result; and forming an associated path by adopting the corresponding relation between the keywords and the site information.
Optionally, the forming an association path by using the correspondence between the keyword and the site information includes: connecting each keyword with corresponding site information according to the corresponding relation between the keywords and the site information to form a bipartite graph of the keywords and the site information; and determining a plurality of associated paths of the keywords and the site information according to the bipartite graph.
Optionally, the determining a plurality of associated paths of the keyword and the site information according to the bipartite graph includes: and connecting the keywords and the site information in series in a random walk mode according to the bipartite graph to generate a plurality of associated paths.
Optionally, the determining a first word vector of the keyword and a second word vector of the site information according to the association path and the preset model includes: generating vector information according to the associated path, wherein the vector information comprises a first path vector of the keyword and a second path vector of the site information; and inputting the vector information into a preset model to obtain a first word vector of the keyword and a second word vector of the site information.
Optionally, the calculating the correlation between the keyword and the site information according to the first word vector and the second word vector includes: selecting keywords and site information; and performing correlation calculation on the first word vector of the keyword and the second word vector of the site information to obtain the correlation between the keyword and the site information.
Optionally, the one or more programs executed by the one or more processors include instructions further for: when a set service is executed through a query word, website information corresponding to the query word is acquired from a query result, wherein the set service comprises at least one of the following: searching and recommending services; and taking the query word as a keyword, taking the website information as site information, and querying the correlation between the corresponding keyword and the site information.
The embodiment of the invention has the following advantages:
According to the embodiment of the invention, an association path of keywords and site information can be formed according to the keywords and the site information in target search results, so that association paths are established from a large number of search results. A first word vector of the keyword and a second word vector of the site information are then determined according to the association paths and the preset model, and the correlation between the keyword and the site information is calculated from them. No manual classification processing is needed, and the accuracy of determining the correlation is effectively improved.
Drawings
FIG. 1 is a flow chart of the steps of one data processing method embodiment of the present invention;
FIG. 2 is a diagram illustrating an association path according to an embodiment of the present invention;
FIG. 3 is a schematic illustration of a bipartite graph according to an embodiment of the invention;
FIG. 4 is a flow chart of steps in another data processing method embodiment of the present invention;
FIG. 5 is a block diagram of an embodiment of a data processing apparatus of the present invention;
FIG. 6 is a block diagram of another data processing apparatus embodiment of the present invention;
FIG. 7 is a block diagram of a path generation submodule in an alternative embodiment of a data processing apparatus of the invention;
FIG. 8 is a block diagram illustrating an electronic device for data processing in accordance with an exemplary embodiment;
FIG. 9 is a schematic structural diagram of a server in an embodiment of the present invention.
Detailed Description
In order to make the aforementioned objects, features and advantages of the present invention comprehensible, embodiments accompanied with figures are described in further detail below.
Referring to fig. 1, a flowchart illustrating steps of an embodiment of a data processing method according to the present invention is shown, which may specifically include the following steps:
Step 102, forming an association path of the keyword and the site information according to the keyword and the site information in the target search result.
The embodiment of the invention can use search results as the data basis for constructing the association between keywords and sites. The target search results are the N search results most relevant to a keyword among the search results for that keyword, and can be determined in various ways. A keyword is a query word used to execute services such as searching, querying and recommending. A site is a website, and the site information is the identification information of the website in a search result, such as the website address. An association path is a path formed by related keywords and site information; it can connect keywords and their corresponding site information in series at random or according to a certain rule. For example, two keywords can be connected through the same site information, and two items of site information can be connected through the same keyword, so that any two adjacent nodes in the association path are associated. Of two adjacent nodes in the association path, one is a keyword and the other is site information. An example of the association path is shown in FIG. 2: keyword A1 - site information B1 - keyword A2 - site information B2 - ... - keyword An - site information Bn - ....
Keywords (queries) may be obtained from the query log of a search engine; for example, millions of keywords may be randomly selected from the query log. The search engine is then used to crawl the search results for each keyword, and target search results are obtained from those search results. How the target search results are chosen can be determined according to requirements, for example by taking the search results on the first page, or the first N search results. The site information corresponding to the keyword can then be obtained from each target search result, so that each keyword corresponds to a plurality of items of site information. The association paths of keywords and site information are then established according to a certain algorithm, that is, the keywords and the site information are associated to form corresponding association paths.
A search engine is a system that collects information from the internet using specific computer programs according to a certain policy, organizes and processes the information, provides a search service for users, and displays the relevant information obtained by searching to the user. Common search engines include Baidu (https://www.baidu.com), Sogou (https://www.sogo.com/), and so on. The query word or keyword entered by a user in a search engine may be denoted as query. The site information is the site to which each web page in the query results returned by the search engine belongs, and may be denoted as site; for example, when url = http://www.jianpu.cn/g/zh/zhoujielilun, site is www.jianpu.cn.
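The mapping from a result URL to its site can be sketched as follows. This is only an illustration, assuming that the site is taken to be the host part of the URL; the function name `site_of` is hypothetical and does not come from the patent.

```python
from urllib.parse import urlsplit


def site_of(url: str) -> str:
    """Return the host part of a result URL, used here as the site identifier."""
    host = urlsplit(url).hostname  # lower-cased host without port, or None
    if host is None:
        raise ValueError(f"no host found in URL: {url!r}")
    return host


# The URL below is the example given in the description.
example_site = site_of("http://www.jianpu.cn/g/zh/zhoujielilun")
print(example_site)
```

A URL without a scheme has no recognisable host under this parsing, so the sketch raises an error rather than guessing.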
Step 104, determining a first word vector of the keyword and a second word vector of the site information according to the association path and a preset model.
The preset model is a model used for training word vectors. The model can be regarded as being constructed from a data set according to a mathematical model. A mathematical model is a scientific or engineering model constructed using mathematical logic and mathematical language: a mathematical structure that expresses, generally or approximately, the characteristics or quantitative dependencies of a certain object system, where a mathematical structure is a purely relational structure of a system described by means of mathematical symbols. For example, the preset model may be a neural network language model, the skip-gram model of word2vec, and the like.
The preset model can be trained according to the association path, so that a first word vector of the keyword and a second word vector of the site information can be obtained. Vector information of each keyword and site information can be determined according to the association path, and then the vector information is input into a preset model for model training, so that a first word vector of each keyword and a second word vector of the site information can be obtained.
Step 106, calculating the correlation between the keywords and the site information according to the first word vector and the second word vector.
For any keyword and any item of site information, the correlation can be calculated using the first word vector of the keyword and the second word vector of the site information, so that the correlation between any keyword and any site information can be obtained.
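The description does not fix a particular correlation measure. As one common choice, and purely as an assumption here, the cosine of the angle between the two word vectors could be used; the function name `cosine` and the sample vectors are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same dimension")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical word vectors for one keyword and one item of site information.
keyword_vec = [0.2, 0.1, -0.4]
site_vec = [0.4, 0.2, -0.8]
correlation = cosine(keyword_vec, site_vec)
```

Because cosine similarity ignores vector length, two vectors pointing in the same direction get the maximum score regardless of their magnitudes.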
Therefore, in services such as querying, searching and recommending, the correlation between the keyword and the site information can serve as one dimension for ranking the query results corresponding to a query word, which improves ranking accuracy and processing efficiency.
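How the correlation enters the ranking is left open by the description. A minimal sketch, assuming a simple weighted sum of an existing ranking score and the keyword-site correlation, might look like this; `rerank`, `weight` and the sample scores are all hypothetical.

```python
def rerank(results, correlation, weight=0.5):
    """Sort (site, base_score) pairs by a weighted mix of base score and correlation.

    `correlation` maps a site to its correlation with the current query word;
    sites without an entry are treated as having correlation 0.0.
    """
    def combined(item):
        site, base_score = item
        return (1.0 - weight) * base_score + weight * correlation.get(site, 0.0)

    return sorted(results, key=combined, reverse=True)


# Hypothetical scores for two results returned for one query word.
results = [("www.a.com", 0.9), ("www.b.com", 0.8)]
correlation = {"www.a.com": 0.1, "www.b.com": 0.9}
ranked = rerank(results, correlation)
```

With `weight=0.0` the original order is kept, so the correlation acts as one adjustable dimension rather than replacing the existing ranking signals.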
In conclusion, an association path of keywords and site information can be formed according to the keywords and the site information in target search results, so that association paths are established from a large number of search results. A first word vector of the keyword and a second word vector of the site information are then determined according to the association paths and the preset model, and the correlation between the keyword and the site information is calculated from them. No manual classification processing is needed, and the accuracy of correlation determination is effectively improved. When ranking is then performed according to the correlation, ranking accuracy can also be effectively improved.
In the embodiments of the present application, a word vector, also called a word embedding (Word Embedding), is a general term for language modeling and representation learning techniques in natural language processing. Conceptually, it refers to embedding a high-dimensional space, whose dimension is the number of all words, into a continuous vector space of much lower dimension, so that each word or phrase is mapped to a vector over the real numbers. Natural Language Processing (NLP) is a field of computer science, artificial intelligence and linguistics that focuses on the interaction between computers and human (natural) language. It studies theories and methods that enable effective communication between humans and computers using natural language, and integrates linguistics, computer science and mathematics.
In an optional embodiment of the present invention, the forming an association path of the keyword and the site information according to the keyword and the site information in the target search result includes: determining a plurality of target search results, and respectively extracting keywords and site information corresponding to the keywords from each target search result; and forming an associated path by adopting the corresponding relation between the keywords and the site information. The search engine can be adopted to crawl search results corresponding to the keywords, then target search results are extracted from the search results, and then site information corresponding to the keywords is extracted from each target search result, so that the corresponding relation between each keyword and the site information is obtained, the corresponding relation is determined according to the site information in the target search results searched by the keywords, and the relevance between the keywords and the site information is represented. Therefore, the associated path of the keywords and the site information can be established according to the corresponding relation, and the keywords and the site information are connected in series.
Wherein, the forming of the association path by adopting the corresponding relation of the keywords and the site information comprises: connecting each keyword with corresponding site information according to the corresponding relation between the keywords and the site information to form a bipartite graph of the keywords and the site information; and determining a plurality of associated paths of the keywords and the site information according to the bipartite graph. According to the corresponding relation between the keywords and the site information, the keywords and the corresponding site information can be connected, namely one keyword corresponds to a plurality of site information, and one site information can also belong to a plurality of keywords, so that the keywords and the site information are connected to form a bipartite graph of the keywords and the site information, and the keywords and the site information form nodes of the bipartite graph. Then, any node in the bipartite graph is taken as a starting point to walk among the nodes of the bipartite graph, so that an associated path can be formed, and a plurality of associated paths can be formed on the basis of the bipartite graph.
For example: the keyword "Zhou Jielun" corresponds to the site information www.a.com, www.b.com; the keyword "Simple Love numbered notation" corresponds to the site information www.a.com, www.b.com, www.c.com, www.e.com; the keyword "guitar notation" corresponds to the site information www.c.com, www.d.com; the keyword "complete music scores" corresponds to the site information www.c.com, www.d.com, www.e.com. A bipartite graph as shown in FIG. 3 may be constructed.
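As a minimal sketch of this construction, the correspondence in the example can be stored as an adjacency map in which every edge joins one keyword node and one site node. The keyword labels are English renderings of the example keywords, and `build_bipartite` and the `("kw", ...)` / `("site", ...)` node tags are hypothetical names chosen for illustration.

```python
from collections import defaultdict

# Keyword -> site information, following the example in the description.
KEYWORD_SITES = {
    "Zhou Jielun": ["www.a.com", "www.b.com"],
    "Simple Love numbered notation": ["www.a.com", "www.b.com", "www.c.com", "www.e.com"],
    "guitar notation": ["www.c.com", "www.d.com"],
    "complete music scores": ["www.c.com", "www.d.com", "www.e.com"],
}


def build_bipartite(keyword_sites):
    """Return an undirected adjacency map whose edges only join keywords and sites."""
    adj = defaultdict(set)
    for keyword, sites in keyword_sites.items():
        for site in sites:
            adj[("kw", keyword)].add(("site", site))
            adj[("site", site)].add(("kw", keyword))
    return adj


graph = build_bipartite(KEYWORD_SITES)
```

Tagging each node with its type keeps a keyword and a site with the same text apart, and makes it easy to check that no two nodes of the same type are ever connected.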
Multiple association paths can then be constructed based on the bipartite graph. For example, one association path is: Zhou Jielun - www.a.com - Simple Love numbered notation - www.e.com - complete music scores - www.d.com - ...; another example path is: www.b.com - Zhou Jielun - www.a.com - Simple Love numbered notation - www.c.com - guitar notation - www.d.com - ....
A bipartite graph is a model in graph theory. Let G = (V, E) be an undirected graph. If the vertex set V can be divided into two mutually disjoint subsets (A, B) such that the two vertices i and j of every edge (i, j) in the graph belong to different subsets (i in A, j in B), then G is called a bipartite graph.
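The definition can be checked mechanically. The sketch below is a standard breadth-first two-colouring, not something taken from the patent: a graph is bipartite exactly when its vertices can be coloured with two colours so that no edge joins two vertices of the same colour. The function name `is_bipartite` and the toy graphs are hypothetical.

```python
from collections import deque


def is_bipartite(adj):
    """Two-colour an undirected graph with BFS; return (ok, colouring)."""
    colour = {}
    for start in adj:
        if start in colour:
            continue  # already reached from an earlier start node
        colour[start] = 0
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for neighbour in adj[node]:
                if neighbour not in colour:
                    colour[neighbour] = 1 - colour[node]
                    queue.append(neighbour)
                elif colour[neighbour] == colour[node]:
                    return False, {}  # an edge joins two same-coloured vertices
    return True, colour


# A keyword/site graph is bipartite; a triangle is not.
keyword_site_graph = {
    "guitar notation": ["www.c.com", "www.d.com"],
    "www.c.com": ["guitar notation"],
    "www.d.com": ["guitar notation"],
}
triangle = {1: [2, 3], 2: [1, 3], 3: [1, 2]}
ok_graph, colouring = is_bipartite(keyword_site_graph)
ok_triangle, _ = is_bipartite(triangle)
```

In the colouring of a keyword/site graph, the two colour classes correspond to the subsets A and B of the definition.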
Therefore, a bipartite graph can be formed based on the corresponding relation between the keywords and the site information, and a path formed by the keywords and the site information is obtained, so that vector conversion is facilitated.
Referring to fig. 4, a flowchart illustrating steps of another embodiment of a data processing method according to the present invention is shown, which may specifically include the following steps:
Step 402, determining a plurality of target search results, and extracting keywords and site information corresponding to the keywords from each target search result.
Keywords (queries) may be obtained from the query log of a search engine; for example, millions of keywords may be randomly selected from the query log. The search engine is then used to crawl the search results for each keyword, and target search results are obtained from those search results. How the target search results are chosen can be determined according to requirements, for example by taking the search results on the first page, or the first N search results. The site information of each target search result is extracted as site information corresponding to the keyword, so that one keyword corresponds to a plurality of items of site information. Since one site can be found through a plurality of keywords, one item of site information can also correspond to a plurality of keywords.
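This extraction step could be sketched as below, assuming the crawled results are already available as a mapping from each keyword to its ranked list of result URLs and that the first N results are taken as target search results. The names `keyword_site_map`, `search_results` and `top_n`, and the sample URLs, are hypothetical.

```python
from urllib.parse import urlsplit


def keyword_site_map(search_results, top_n=10):
    """Build keyword -> sites and site -> keywords maps from the top-N result URLs."""
    forward = {}
    inverse = {}
    for keyword, urls in search_results.items():
        sites = []
        for url in urls[:top_n]:  # the first N results are the target search results
            host = urlsplit(url).hostname
            if host and host not in sites:
                sites.append(host)  # keep each site once, in rank order
        forward[keyword] = sites
        for site in sites:
            inverse.setdefault(site, []).append(keyword)
    return forward, inverse


# Hypothetical crawled results: keyword -> ranked result URLs.
search_results = {
    "guitar notation": ["http://www.c.com/1", "http://www.d.com/2", "http://www.c.com/3"],
    "complete music scores": ["http://www.c.com/9", "http://www.e.com/4"],
}
forward, inverse = keyword_site_map(search_results, top_n=2)
```

The inverse map makes the many-to-many relation explicit: one site can be reached from several keywords, just as one keyword leads to several sites.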
Step 404, connecting each keyword with corresponding site information according to the corresponding relation between the keywords and the site information to form a bipartite graph of the keywords and the site information.
After the keywords and the corresponding site information are obtained from the target search results, each keyword can be connected with its corresponding site information according to the correspondence between them, that is, related site information and keywords are connected in series. As in the above example, the keyword "Zhou Jielun" is connected with the site information www.a.com and www.b.com, and the keyword "Simple Love numbered notation" is connected with the site information www.a.com, www.b.com, www.c.com and www.e.com, so that the relation Zhou Jielun - www.a.com - Simple Love numbered notation is established. During connection, information of the same type is not directly connected: two keywords are not directly connected, two items of site information are not directly connected, and keywords are connected only with site information, so that a corresponding bipartite graph is formed, an example of which is shown in FIG. 3.
Step 406, determining a plurality of associated paths of the keywords and the site information according to the bipartite graph.
Then, a node can be arbitrarily selected from the bipartite graph as a starting point, and the starting point can be walked in the bipartite graph to obtain associated paths of the keyword and the site information, wherein one starting point can obtain one or more associated paths, and a plurality of nodes can be selected from the bipartite graph as the starting point, so that a plurality of associated paths can be obtained through one bipartite graph. The walking mode in the bipartite graph can be various, for example, walking according to a certain rule, or random walking, etc., and can be determined according to requirements.
In an optional embodiment, the determining, according to the bipartite graph, a plurality of associated paths of keywords and site information includes: and connecting the keywords and the site information in series in a random walk mode according to the bipartite graph to generate a plurality of associated paths. If the random walk mode is adopted, the node can be selected as a starting point, and then random walk is carried out according to the association of the nodes in the bipartite graph, so that the keywords and the site information are connected in series to generate a corresponding association path.
A random walk means that each conserved quantity carried by an irregular walker corresponds to a diffusion transport law; it is close to Brownian motion and is the idealized mathematical state of Brownian motion. A random walk algorithm can be run on the bipartite graph to generate the associated paths.
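A uniform random walk over the bipartite graph could be sketched as below. The parameters `walks_per_node`, `walk_length` and `seed`, the function name `random_walks`, and the `q:` / `s:` node prefixes are all hypothetical choices for this sketch; the description does not fix the walk length or the number of walks per starting point.

```python
import random


def random_walks(graph, walks_per_node=2, walk_length=5, seed=0):
    """Generate associated paths by walking uniformly at random from every node."""
    rng = random.Random(seed)
    walks = []
    for start in sorted(graph):  # every node serves as a starting point
        for _ in range(walks_per_node):
            walk = [start]
            while len(walk) < walk_length:
                neighbors = sorted(graph[walk[-1]])
                if not neighbors:
                    break  # isolated node: the path ends early
                walk.append(rng.choice(neighbors))
            walks.append(walk)
    return walks


# Hypothetical bipartite graph: "q:" nodes are keywords, "s:" nodes are sites.
graph = {
    "q:keyword_1": {"s:www.a.com", "s:www.b.com"},
    "q:keyword_2": {"s:www.a.com", "s:www.c.com"},
    "s:www.a.com": {"q:keyword_1", "q:keyword_2"},
    "s:www.b.com": {"q:keyword_1"},
    "s:www.c.com": {"q:keyword_2"},
}
walks = random_walks(graph)
```

Because the graph is bipartite, every generated path alternates between keyword nodes and site nodes.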
Step 408, generating vector information according to the associated path, where the vector information includes a first path vector of the keyword and a second path vector of the site information.
Then, vectors of the keywords and the site information can be determined through the plurality of associated paths. Vector information can be generated according to the associated paths, wherein the vector information comprises a first path vector of the keyword and a second path vector of the site information; a first path vector can be obtained for each keyword, and a second path vector can be obtained for each piece of site information.
Step 410, inputting the vector information into a preset model to obtain a first word vector of the keyword and a second word vector of the site information.
That is, the associated paths are used as training data to train the preset model; for example, the first path vector of each keyword and the second path vector of each piece of site information in the associated paths are respectively input into the preset model. The preset model is thereby trained, the iteration process corresponding to the model is executed, and the first word vector of each keyword and the second word vector of each piece of site information can be obtained based on the model. Each selected keyword and its corresponding site information can thus each be represented by a corresponding vector. In the embodiment of the invention, "first word vector" and "second word vector" are general terms used to distinguish the vectors representing keywords from the vectors representing site information.
For example, using the skip-gram model in word2vec, a keyword (query) and a piece of site information (site) are each expressed as an n-dimensional dense vector, from which the correlation between the query and the site is then obtained. Skip-gram is a model for training word vectors that predicts the word vectors of the context within a certain window from an input word vector, so that the word vectors of the keywords and the site information can be determined conveniently.
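To illustrate how an associated path can serve as skip-gram training data, the sketch below enumerates the (center, context) pairs within a window. It does not train a model; the function name `skipgram_pairs`, the window size and the example path are assumptions, and an actual word2vec implementation would consume such pairs internally.

```python
def skipgram_pairs(path, window=2):
    """List the (center, context) pairs that a skip-gram model would be trained on."""
    pairs = []
    for i, center in enumerate(path):
        lo = max(0, i - window)
        hi = min(len(path), i + window + 1)
        for j in range(lo, hi):
            if j != i:  # a node is not its own context
                pairs.append((center, path[j]))
    return pairs


# Hypothetical associated path alternating between keywords and sites.
path = ["keyword_1", "www.a.com", "keyword_2", "www.c.com"]
pairs = skipgram_pairs(path, window=1)
```

With a window of 1, each keyword is paired only with adjacent sites; a larger window also pairs keywords with other keywords that share a site, which is how co-occurrence on the paths places both kinds of node in one vector space.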
Step 412, selecting keywords and site information.
Step 414, performing correlation calculation on the first word vector of the keyword and the second word vector of the site information to obtain the correlation between the keyword and the site information.
After the first word vector of each keyword and the second word vector of each piece of site information are obtained, a keyword and a piece of site information can be selected, and correlation calculation is then performed on the first word vector of the keyword and the second word vector of the site information to obtain a correlation value, so that the correlation between any keyword and any piece of site information can be determined.
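As one concrete instance of such a correlation calculation, cosine similarity (which the description names as one option) can be computed as in the sketch below. The function name and the example vectors are illustrative assumptions; returning 0.0 for a zero vector is a design choice of this sketch, not something the description specifies.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors; 0.0 if either is all zeros."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same dimension")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical first word vector (keyword) and second word vector (site).
keyword_vector = [0.2, 0.8, 0.1]
site_vector = [0.1, 0.9, 0.0]
correlation = cosine_similarity(keyword_vector, site_vector)
```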
This way of calculating the correlation between keywords and site information can be applied to various scenarios, including but not limited to search engines and recommendation systems. The keywords and the site information are expressed in vector form, the correlation between them is calculated based on the vectors, and the correlation is added as a continuous feature to scenarios such as search and recommendation, so that the search effect is better optimized. When a set service is executed through a query word, website information corresponding to the query word is acquired from a query result, wherein the set service comprises at least one of the following: a search service and a recommendation service; the query word is taken as a keyword and the website information as site information, and the correlation between the corresponding keyword and site information is queried.
In a search query scenario, after a user inputs a keyword, the search engine performs a search based on the keyword to obtain corresponding search results. In the process of ranking the search results, the correlation between the keyword and the site information in the search results is used as one of the ranking bases; that is, the determined correlation is combined with other signals to rank the search results. In actual processing, the correlation value between each keyword and each piece of site information may be stored in a database in advance; alternatively, the first word vector of each keyword and the second word vector of each piece of site information may be stored in the database, so that the corresponding first word vector and second word vector are retrieved as needed to calculate the correlation value as one of the ranking bases.
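The use of the correlation as one ranking basis among others might look like the following sketch. The linear blend with `weight`, the `base_score` attached to each result, the dictionaries standing in for the database, and the function names are all assumptions for illustration; the description does not specify how the correlation is combined with other ranking signals.

```python
import math


def cosine(u, v):
    """Cosine similarity; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank_results(query, results, query_vectors, site_vectors, weight=0.5):
    """Re-rank (site, base_score) results by blending base_score with query-site correlation."""
    q_vec = query_vectors.get(query)
    scored = []
    for site, base_score in results:
        s_vec = site_vectors.get(site)
        # Sites or queries without a stored vector contribute no correlation signal.
        relevance = cosine(q_vec, s_vec) if q_vec and s_vec else 0.0
        scored.append((site, (1 - weight) * base_score + weight * relevance))
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical stored vectors and initial results.
query_vectors = {"keyword_1": [1.0, 0.0]}
site_vectors = {"www.a.com": [1.0, 0.0], "www.b.com": [0.0, 1.0]}
results = [("www.b.com", 0.6), ("www.a.com", 0.5), ("www.x.com", 0.4)]
ranked = rank_results("keyword_1", results, query_vectors, site_vectors)
```

Storing the vectors rather than every precomputed correlation value keeps the lookup table small, at the cost of one similarity computation per candidate result.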
The application in a recommendation scenario is similar to that in a search query scenario: for recommendation results matched with a recommendation keyword, the correlation between the keyword and the site information in the recommendation results can be determined and used as one of the ranking bases for the recommendation results. In scenarios such as search query and recommendation, the accuracy of the returned results is thereby improved, which in turn improves processing efficiency.
In the embodiment of the invention, two kinds of things are associated by means of a graph, and vector representations of both are generated using a random walk strategy, so that their degree of correlation can be calculated directly. The same approach can be applied elsewhere: for news recommendation, vectors of users and news items can be generated; for advertisement click-through rate (CTR) prediction, users and advertisements can be vectorized; and so on.
The embodiment of the invention can be combined with technologies such as identification of the user's real intent and natural language processing to mine the direct relationship between the keyword (query) and the site information (site), thereby providing a novel method for calculating query-to-site correlation. The query and the site are vectorized by a machine learning method so that they lie in the same semantic space, and the accuracy of the similarity between the query and the site is improved by calculating a correlation measure over the vectors, such as cosine similarity.
It should be noted that, for simplicity of description, the method embodiments are described as a series of acts or combination of acts, but those skilled in the art will recognize that the present invention is not limited by the illustrated order of acts, as some steps may occur in other orders or concurrently in accordance with the embodiments of the present invention. Further, those skilled in the art will appreciate that the embodiments described in the specification are presently preferred and that no particular act is required to implement the invention.
The embodiment of the invention also provides an input device which is applied to terminal equipment, wherein the terminal equipment is provided with a touch screen and a pressure sensing device, and the pressure sensing device can sense the pressure information operated on the touch screen.
Referring to fig. 5, a block diagram of a data processing apparatus according to an embodiment of the present invention is shown, which may specifically include the following modules:
A path determining module 502, configured to form an associated path of the keyword and the site information according to the keyword and the site information in the target search result.
And a word vector determining module 504, configured to determine a first word vector of the keyword and a second word vector of the site information according to the association path and the preset model.
And a correlation calculation module 506, configured to calculate a correlation between the keyword and the site information according to the first word vector and the second word vector.
In summary, an association path of the keywords and the site information is formed according to the keywords and the site information in the target search result, so that association of the keywords and the site information is established according to a large number of search results, then a first word vector of the keywords and a second word vector of the site information are determined according to the association path and a preset model, the keywords and the sites are expressed according to a vector form, further, the correlation between the keywords and the site information is calculated, manual classification processing is not needed, and the accuracy of similarity determination is effectively improved.
Referring to fig. 6, a block diagram of a data processing apparatus according to another embodiment of the present invention is shown, which may specifically include the following modules:
A path determining module 502, configured to form an associated path of the keyword and the site information according to the keyword and the site information in the target search result.
And a word vector determining module 504, configured to determine a first word vector of the keyword and a second word vector of the site information according to the association path and the preset model.
And a correlation calculation module 506, configured to calculate a correlation between the keyword and the site information according to the first word vector and the second word vector.
The correlation query module 508 is configured to, when a setting service is executed through a query term, obtain website information corresponding to the query term from a query result, where the setting service includes at least one of the following: searching and recommending services; and taking the query word as a keyword, taking the website information as site information, and querying the correlation between the corresponding keyword and the site information.
Wherein the path determining module 502 includes: a data extraction sub-module 5022 and a path generation sub-module 5024, wherein:
the data extraction sub-module 5022 is used for determining a plurality of target search results and extracting keywords and site information corresponding to the keywords from each target search result;
and the path generating sub-module 5024 is used for forming an associated path by adopting the corresponding relation between the keyword and the site information.
As shown in fig. 7, the path generating sub-module 5024 includes: a bipartite graph generating unit 50242 and a path determining unit 50244, wherein:
a bipartite graph generating unit 50242, configured to connect each keyword with corresponding site information according to a correspondence between the keyword and the site information to form a bipartite graph of the keyword and the site information;
a path determining unit 50244, configured to determine a plurality of associated paths of the keyword and the site information according to the bipartite graph.
The path determining unit 50244 is configured to generate a plurality of associated paths by concatenating the keywords and the site information in a random walk manner according to the bipartite graph.
The word vector determining module 504 is configured to generate vector information according to the association path, where the vector information includes a first path vector of the keyword and a second path vector of the site information; and inputting the vector information into a preset model to obtain a first word vector of the keyword and a second word vector of the site information.
The correlation calculation module 506 is used for selecting keywords and site information; and performing correlation calculation on the first word vector of the keyword and the second word vector of the site information to obtain the correlation between the keyword and the site information.
This way of calculating the correlation between keywords and site information can be applied to various scenarios, including but not limited to search engines and recommendation systems. The keywords and the site information are expressed in vector form, the correlation between them is calculated based on the vectors, and the correlation is added as a continuous feature to scenarios such as search and recommendation, so that the search effect is better optimized. In the embodiment of the invention, two kinds of things are associated by means of a graph, and vector representations of both are generated using a random walk strategy, so that their degree of correlation can be calculated directly. The same approach can be applied elsewhere: for news recommendation, vectors of users and news items can be generated; for advertisement click-through rate (CTR) prediction, users and advertisements can be vectorized; and so on.
The embodiment of the invention can be combined with technologies such as identification of the user's real intent and natural language processing to mine the direct relationship between the keyword (query) and the site information (site), thereby providing a novel method for calculating query-to-site correlation. The query and the site are vectorized by a machine learning method so that they lie in the same semantic space, and the accuracy of the similarity between the query and the site is improved by calculating a correlation measure over the vectors, such as cosine similarity.
Since the device embodiment is basically similar to the method embodiment, it is described relatively briefly; for relevant details, refer to the corresponding description of the method embodiment.
FIG. 8 is a block diagram illustrating a structure of an electronic device 800 for data processing, according to an example embodiment. For example, the electronic device 800 may be a mobile phone, a computer, a digital broadcast terminal, a messaging device, a game console, a tablet device, a medical device, an exercise device, a personal digital assistant, and the like.
Referring to fig. 8, electronic device 800 may include one or more of the following components: processing component 802, memory 804, power component 806, multimedia component 808, audio component 810, input/output (I/O) interface 812, sensor component 814, and communication component 816.
The processing component 802 generally controls overall operation of the electronic device 800, such as operations associated with display, telephone calls, data communications, camera operations, and recording operations. The processing component 802 may include one or more processors 820 to execute instructions to perform all or a portion of the steps of the methods described above. Further, the processing component 802 can include one or more modules that facilitate interaction between the processing component 802 and other components. For example, the processing component 802 can include a multimedia module to facilitate interaction between the multimedia component 808 and the processing component 802.
The memory 804 is configured to store various types of data to support operation of the electronic device 800. Examples of such data include instructions for any application or method operating on the electronic device 800, contact data, phonebook data, messages, pictures, videos, and so forth. The memory 804 may be implemented by any type or combination of volatile or non-volatile memory devices such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disks.
The power component 806 provides power to the various components of the electronic device 800. The power component 806 may include a power management system, one or more power supplies, and other components associated with generating, managing, and distributing power for the electronic device 800.
The multimedia component 808 includes a screen that provides an output interface between the electronic device 800 and a user. In some embodiments, the screen may include a Liquid Crystal Display (LCD) and a Touch Panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive an input signal from a user. The touch panel includes one or more touch sensors to sense touch, slide, and gestures on the touch panel. The touch sensor may not only sense the boundary of a touch or slide action, but also detect the duration and pressure associated with the touch or slide operation. In some embodiments, the multimedia component 808 includes a front facing camera and/or a rear facing camera. The front-facing camera and/or the rear-facing camera may receive external multimedia data when the device 800 is in an operating mode, such as a shooting mode or a video mode. Each front camera and rear camera may be a fixed optical lens system or have a focal length and optical zoom capability.
The audio component 810 is configured to output and/or input audio signals. For example, the audio component 810 includes a Microphone (MIC) configured to receive external audio signals when the electronic device 800 is in an operational mode, such as a call mode, a recording mode, and a voice recognition mode. The received audio signals may further be stored in the memory 804 or transmitted via the communication component 816. In some embodiments, audio component 810 also includes a speaker for outputting audio signals.
The I/O interface 812 provides an interface between the processing component 802 and peripheral interface modules, which may be keyboards, click wheels, buttons, etc. These buttons may include, but are not limited to: a home button, a volume button, a start button, and a lock button.
The sensor component 814 includes one or more sensors for providing status assessments of various aspects of the electronic device 800. For example, the sensor component 814 may detect an open/closed state of the electronic device 800 and the relative positioning of components, such as a display and keypad of the electronic device 800. The sensor component 814 may also detect a change in the position of the electronic device 800 or a component of the electronic device 800, the presence or absence of user contact with the electronic device 800, orientation or acceleration/deceleration of the electronic device 800, and a change in the temperature of the electronic device 800. The sensor component 814 may include a proximity sensor configured to detect the presence of a nearby object without any physical contact. The sensor component 814 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor component 814 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
The communication component 816 is configured to facilitate wired or wireless communication between the electronic device 800 and other devices. The electronic device 800 may access a wireless network based on a communication standard, such as WiFi, 2G or 3G, or a combination thereof. In an exemplary embodiment, the communication component 816 receives a broadcast signal or broadcast associated information from an external broadcast management system via a broadcast channel. In an exemplary embodiment, the communications component 816 further includes a Near Field Communication (NFC) module to facilitate short-range communications. For example, the NFC module may be implemented based on Radio Frequency Identification (RFID) technology, infrared data association (IrDA) technology, Ultra Wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
In an exemplary embodiment, the electronic device 800 may be implemented by one or more Application Specific Integrated Circuits (ASICs), Digital Signal Processors (DSPs), Digital Signal Processing Devices (DSPDs), Programmable Logic Devices (PLDs), Field Programmable Gate Arrays (FPGAs), controllers, micro-controllers, microprocessors or other electronic components for performing the above-described methods.
In an exemplary embodiment, a non-transitory computer-readable storage medium comprising instructions, such as the memory 804 comprising instructions, executable by the processor 820 of the electronic device 800 to perform the above-described method is also provided. For example, the non-transitory computer readable storage medium may be a ROM, a Random Access Memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, and the like.
A non-transitory computer readable storage medium in which instructions, when executed by a processor of an electronic device, enable the electronic device to perform a data processing method, the method comprising: forming an associated path of the keywords and the site information according to the keywords and the site information in the target search result; determining a first word vector of the keyword and a second word vector of the site information according to the association path and a preset model; and calculating the correlation between the keywords and the site information according to the first word vector and the second word vector.
Optionally, the forming an associated path of the keyword and the site information according to the keyword and the site information in the target search result includes: determining a plurality of target search results, and respectively extracting keywords and site information corresponding to the keywords from each target search result; and forming an associated path by adopting the corresponding relation between the keywords and the site information.
Optionally, the forming an association path by using the correspondence between the keyword and the site information includes: connecting each keyword with corresponding site information according to the corresponding relation between the keywords and the site information to form a bipartite graph of the keywords and the site information; and determining a plurality of associated paths of the keywords and the site information according to the bipartite graph.
Optionally, the determining a plurality of associated paths of the keyword and the site information according to the bipartite graph includes: and connecting the keywords and the site information in series in a random walk mode according to the bipartite graph to generate a plurality of associated paths.
Optionally, the determining a first word vector of the keyword and a second word vector of the site information according to the association path and the preset model includes: generating vector information according to the associated path, wherein the vector information comprises a first path vector of the keyword and a second path vector of the site information; and inputting the vector information into a preset model to obtain a first word vector of the keyword and a second word vector of the site information.
Optionally, the calculating the correlation between the keyword and the site information according to the first word vector and the second word vector includes: selecting keywords and site information; and performing correlation calculation on the first word vector of the keyword and the second word vector of the site information to obtain the correlation between the keyword and the site information.
Optionally, the method further includes: when a set service is executed through a query word, website information corresponding to the query word is acquired from a query result, wherein the set service comprises at least one of the following: searching and recommending services; and taking the query word as a keyword, taking the website information as site information, and querying the correlation between the corresponding keyword and the site information.
Fig. 9 is a schematic structural diagram of a server in an embodiment of the present invention. The server 900 may vary widely in configuration or performance and may include one or more Central Processing Units (CPUs) 922 (e.g., one or more processors), memory 932, and one or more storage media 930 (e.g., one or more mass storage devices) storing applications 942 or data 944. The memory 932 and the storage medium 930 may be transient storage or persistent storage. The program stored on the storage medium 930 may include one or more modules (not shown), each of which may include a series of instruction operations for the server. Still further, the central processing unit 922 may be arranged to communicate with the storage medium 930 and to execute, on the server 900, the series of instruction operations in the storage medium 930.
The server 900 may also include one or more power supplies 926, one or more wired or wireless network interfaces 950, one or more input-output interfaces 958, one or more keyboards 956, and/or one or more operating systems 941, such as Windows Server, Mac OS X™, Unix™, Linux™, FreeBSD™, etc.
An embodiment of the present invention further provides an electronic device, which includes a memory and one or more programs, where the one or more programs are stored in the memory and are configured to be executed by one or more processors, the one or more programs including instructions for: forming an associated path of the keywords and the site information according to the keywords and the site information in the target search result; determining a first word vector of the keyword and a second word vector of the site information according to the association path and a preset model; and calculating the correlation between the keywords and the site information according to the first word vector and the second word vector.
Optionally, the forming an associated path of the keyword and the site information according to the keyword and the site information in the target search result includes: determining a plurality of target search results, and respectively extracting keywords and site information corresponding to the keywords from each target search result; and forming an associated path by adopting the corresponding relation between the keywords and the site information.
Optionally, the forming an association path by using the correspondence between the keyword and the site information includes: connecting each keyword with corresponding site information according to the corresponding relation between the keywords and the site information to form a bipartite graph of the keywords and the site information; and determining a plurality of associated paths of the keywords and the site information according to the bipartite graph.
Optionally, the determining a plurality of associated paths of the keyword and the site information according to the bipartite graph includes: and connecting the keywords and the site information in series in a random walk mode according to the bipartite graph to generate a plurality of associated paths.
Optionally, the determining a first word vector of the keyword and a second word vector of the site information according to the association path and the preset model includes: generating vector information according to the associated path, wherein the vector information comprises a first path vector of the keyword and a second path vector of the site information; and inputting the vector information into a preset model to obtain a first word vector of the keyword and a second word vector of the site information.
Optionally, the calculating the correlation between the keyword and the site information according to the first word vector and the second word vector includes: selecting keywords and site information; and performing correlation calculation on the first word vector of the keyword and the second word vector of the site information to obtain the correlation between the keyword and the site information.
Optionally, the one or more programs executed by the one or more processors include instructions further for: when a set service is executed through a query word, website information corresponding to the query word is acquired from a query result, wherein the set service comprises at least one of the following: searching and recommending services; and taking the query word as a keyword, taking the website information as site information, and querying the correlation between the corresponding keyword and the site information.
The embodiments in the present specification are described in a progressive manner; each embodiment focuses on its differences from the other embodiments, and for the parts that are the same or similar, the embodiments may be referred to one another.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, apparatus, or computer program product. Accordingly, embodiments of the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, embodiments of the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
Embodiments of the present invention are described with reference to flowchart illustrations and/or block diagrams of methods, terminal devices (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing terminal to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing terminal, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing terminal to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing terminal to cause a series of operational steps to be performed on the computer or other programmable terminal to produce a computer implemented process such that the instructions which execute on the computer or other programmable terminal provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present invention have been described, additional variations and modifications of these embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including preferred embodiments and all such alterations and modifications as fall within the scope of the embodiments of the invention.
Finally, it should also be noted that, in this document, relational terms such as "first" and "second" are used solely to distinguish one entity or action from another, and do not necessarily require or imply any actual relationship or order between those entities or actions. Also, the terms "comprises," "comprising," and any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or terminal that comprises a list of elements includes not only those elements but may also include other elements that are not expressly listed or that are inherent to such process, method, article, or terminal. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of additional identical elements in the process, method, article, or terminal that comprises the element.
The data processing method and apparatus, the electronic device, and the storage medium provided by the present invention have been described in detail above. Specific examples are used herein to explain the principles and implementation of the present invention, and the description of the above embodiments is intended only to help in understanding the method and core idea of the present invention. Meanwhile, a person skilled in the art may, following the idea of the present invention, vary the specific embodiments and the scope of application. In summary, the content of this specification should not be construed as limiting the present invention.

Claims (16)

1. A method of data processing, the method comprising:
determining a plurality of target search results;
extracting keywords and site information corresponding to the keywords from each target search result respectively;
forming an association path by using the correspondence between the keywords and the site information, wherein, of any two adjacent nodes in the association path, one is a keyword and the other is site information;
determining a first word vector of the keyword and a second word vector of the site information according to the association path and a preset model; and
calculating the correlation between the keyword and the site information according to the first word vector and the second word vector;
wherein determining the first word vector of the keyword and the second word vector of the site information according to the association path and the preset model comprises:
generating vector information according to the association path, wherein the vector information comprises a first path vector of the keyword and a second path vector of the site information; and
inputting the vector information into the preset model to obtain the first word vector of the keyword and the second word vector of the site information.
2. The method according to claim 1, wherein forming the association path by using the correspondence between the keywords and the site information comprises:
connecting each keyword with its corresponding site information according to the correspondence between the keywords and the site information to form a bipartite graph of the keywords and the site information; and
determining a plurality of association paths of the keywords and the site information according to the bipartite graph.
3. The method of claim 2, wherein determining the plurality of association paths of the keywords and the site information according to the bipartite graph comprises:
connecting the keywords and the site information in series by random walk on the bipartite graph to generate the plurality of association paths.
4. The method of claim 1, wherein calculating the correlation between the keyword and the site information according to the first word vector and the second word vector comprises:
selecting keywords and site information;
and performing correlation calculation on the first word vector of the keyword and the second word vector of the site information to obtain the correlation between the keyword and the site information.
5. The method of claim 1, further comprising:
when a set service is executed using a query word, acquiring website information corresponding to the query word from a query result, wherein the set service comprises at least one of: a search service and a recommendation service; and
taking the query word as a keyword and the website information as site information, and querying the correlation between the corresponding keyword and site information.
6. A data processing apparatus, comprising:
a path determination module, configured to determine a plurality of target search results, and to form an association path of keywords and site information according to the keywords and the site information in the target search results;
the word vector determining module is used for determining a first word vector of the keyword and a second word vector of the site information according to the association path and the preset model;
the correlation calculation module is used for calculating the correlation between the keywords and the site information according to the first word vector and the second word vector;
the path determination module includes:
the data extraction submodule is used for determining a plurality of target search results and respectively extracting keywords and site information corresponding to the keywords from each target search result;
the path generation submodule is used for forming an association path by using the correspondence between the keywords and the site information, wherein, of any two adjacent nodes in the association path, one is a keyword and the other is site information;
the word vector determining module is configured to generate vector information according to the association path, wherein the vector information includes a first path vector of the keyword and a second path vector of the site information, and to input the vector information into the preset model to obtain the first word vector of the keyword and the second word vector of the site information.
7. The apparatus of claim 6, wherein the path generation submodule comprises:
a bipartite graph generating unit, configured to connect each keyword with corresponding site information according to a correspondence between the keyword and the site information to form a bipartite graph of the keyword and the site information;
and the path determining unit is used for determining a plurality of association paths of the keywords and the site information according to the bipartite graph.
8. The apparatus of claim 7,
wherein the path determining unit is used for connecting the keywords and the site information in series by random walk on the bipartite graph to generate the plurality of association paths.
9. The apparatus of claim 6,
wherein the correlation calculation module is used for selecting a keyword and site information, and for performing correlation calculation on the first word vector of the keyword and the second word vector of the site information to obtain the correlation between the keyword and the site information.
10. The apparatus of claim 6, further comprising:
a correlation query module, configured to: when a set service is executed using a query word, acquire website information corresponding to the query word from a query result, wherein the set service comprises at least one of: a search service and a recommendation service; and take the query word as a keyword and the website information as site information, and query the correlation between the corresponding keyword and site information.
11. A readable storage medium, characterized in that instructions in the storage medium, when executed by a processor of an electronic device, enable the electronic device to perform the data processing method according to any one of claims 1-5.
12. An electronic device comprising a memory and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by one or more processors, the one or more programs including instructions for:
determining a plurality of target search results;
extracting keywords and site information corresponding to the keywords from each target search result respectively;
forming an association path by using the correspondence between the keywords and the site information, wherein, of any two adjacent nodes in the association path, one is a keyword and the other is site information;
determining a first word vector of the keyword and a second word vector of the site information according to the association path and a preset model;
calculating the correlation between the keywords and the site information according to the first word vector and the second word vector;
generating vector information according to the associated path, wherein the vector information comprises a first path vector of the keyword and a second path vector of the site information;
and inputting the vector information into a preset model to obtain a first word vector of the keyword and a second word vector of the site information.
13. The electronic device according to claim 12, wherein forming the association path by using the correspondence between the keywords and the site information comprises:
connecting each keyword with its corresponding site information according to the correspondence between the keywords and the site information to form a bipartite graph of the keywords and the site information; and
determining a plurality of association paths of the keywords and the site information according to the bipartite graph.
14. The electronic device of claim 13, wherein determining the plurality of association paths of the keywords and the site information according to the bipartite graph comprises:
connecting the keywords and the site information in series by random walk on the bipartite graph to generate the plurality of association paths.
15. The electronic device of claim 12, wherein the calculating the correlation between the keyword and the site information according to the first word vector and the second word vector comprises:
selecting keywords and site information;
and performing correlation calculation on the first word vector of the keyword and the second word vector of the site information to obtain the correlation between the keyword and the site information.
16. The electronic device of claim 12, wherein the one or more programs further include instructions for:
when a set service is executed using a query word, acquiring website information corresponding to the query word from a query result, wherein the set service comprises at least one of: a search service and a recommendation service; and
taking the query word as a keyword and the website information as site information, and querying the correlation between the corresponding keyword and site information.
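
The steps recited in the claims (bipartite graph, random-walk association paths, vectors derived from the paths, correlation between a keyword and site information) can be illustrated with a minimal sketch. It is not the claimed implementation: the sample pairs, the function names, the walk parameters, and the window size are all hypothetical, and the "preset model" is replaced here by plain co-occurrence counts with cosine similarity, purely as a stand-in, because the claims do not name a specific model.

```python
import math
import random
from collections import Counter, defaultdict

# Hypothetical (keyword, site) pairs standing in for what would be
# extracted from the target search results.
PAIRS = [
    ("python tutorial", "py-docs.example"),
    ("python tutorial", "py-learn.example"),
    ("python download", "py-docs.example"),
    ("python download", "py-home.example"),
    ("pasta recipe", "recipes.example"),
    ("pizza recipe", "recipes.example"),
]


def build_bipartite_graph(pairs):
    """Link each keyword node to its site nodes and vice versa."""
    graph = defaultdict(set)
    for keyword, site in pairs:
        k, s = ("kw", keyword), ("site", site)
        graph[k].add(s)
        graph[s].add(k)
    # Sorted neighbour lists keep the walks reproducible for a given seed.
    return {node: sorted(neighbours) for node, neighbours in graph.items()}


def random_walks(graph, walks_per_node=20, walk_length=6, seed=0):
    """Generate association paths; adjacent nodes alternate keyword/site."""
    rng = random.Random(seed)
    paths = []
    for start in sorted(graph):
        for _ in range(walks_per_node):
            path = [start]
            while len(path) < walk_length:
                path.append(rng.choice(graph[path[-1]]))
            paths.append(path)
    return paths


def path_vectors(paths, window=2):
    """Stand-in for the preset model (an assumption): each node's vector
    counts the nodes seen within `window` steps of it along the paths."""
    vectors = defaultdict(Counter)
    for path in paths:
        for i, node in enumerate(path):
            lo, hi = max(0, i - window), min(len(path), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[node][path[j]] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm = math.sqrt(sum(x * x for x in u.values())) * math.sqrt(
        sum(x * x for x in v.values())
    )
    return dot / norm if norm else 0.0


def correlation(vectors, keyword, site):
    """Correlation between one keyword and one piece of site information."""
    u = vectors.get(("kw", keyword), Counter())
    v = vectors.get(("site", site), Counter())
    return cosine(u, v)


graph = build_bipartite_graph(PAIRS)
paths = random_walks(graph)
vectors = path_vectors(paths)
print(correlation(vectors, "python tutorial", "py-docs.example"))
print(correlation(vectors, "python tutorial", "recipes.example"))
```

In this toy data the recipe site is never reachable from the Python keywords, so the second score is zero while the first is positive. A trained embedding model could be substituted for `path_vectors` without changing the surrounding steps.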
CN201711252207.8A 2017-12-01 2017-12-01 Data processing method and device, electronic equipment and storage medium Active CN110020151B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711252207.8A CN110020151B (en) 2017-12-01 2017-12-01 Data processing method and device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201711252207.8A CN110020151B (en) 2017-12-01 2017-12-01 Data processing method and device, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN110020151A (en) 2019-07-16
CN110020151B (en) 2022-04-26

Family

ID=67185939

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711252207.8A Active CN110020151B (en) 2017-12-01 2017-12-01 Data processing method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN110020151B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112883295B (en) * 2019-11-29 2024-02-23 北京搜狗科技发展有限公司 Data processing method, device and medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102789462A (en) * 2011-05-18 2012-11-21 阿里巴巴集团控股有限公司 Project recommendation method and system
CN103294681A (en) * 2012-02-23 2013-09-11 北京百度网讯科技有限公司 Method and device for generating search result
CN107122455A (en) * 2017-04-26 2017-09-01 中国人民解放军国防科学技术大学 A kind of network user's enhancing method for expressing based on microblogging
CN107291914A (en) * 2017-06-27 2017-10-24 达而观信息科技(上海)有限公司 A kind of method and system for generating search engine inquiry expansion word

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102193939B (en) * 2010-03-10 2016-04-06 阿里巴巴集团控股有限公司 The implementation method of information navigation, information navigation server and information handling system
CN106484698A (en) * 2015-08-25 2017-03-08 北京奇虎科技有限公司 A kind of method for pushing of search keyword and device

Also Published As

Publication number Publication date
CN110020151A (en) 2019-07-16

Similar Documents

Publication Publication Date Title
CN109800325B (en) Video recommendation method and device and computer-readable storage medium
US11120078B2 (en) Method and device for video processing, electronic device, and storage medium
CN111581488B (en) Data processing method and device, electronic equipment and storage medium
CN111291069B (en) Data processing method and device and electronic equipment
CN109918565B (en) Processing method and device for search data and electronic equipment
CN112508612B (en) Method for training advertisement creative generation model and generating advertisement creative and related device
CN112148980B (en) Article recommending method, device, equipment and storage medium based on user click
CN106815291B (en) Search result item display method and device and search result item display device
CN110110207B (en) Information recommendation method and device and electronic equipment
CN112784142A (en) Information recommendation method and device
CN112307281A (en) Entity recommendation method and device
CN112148923B (en) Method for ordering search results, method, device and equipment for generating ordering model
CN111538830A (en) French retrieval method, French retrieval device, computer equipment and storage medium
CN111241844A (en) Information recommendation method and device
CN113869063A (en) Data recommendation method and device, electronic equipment and storage medium
CN111368161A (en) Search intention recognition method and intention recognition model training method and device
CN110020151B (en) Data processing method and device, electronic equipment and storage medium
CN107436896B (en) Input recommendation method and device and electronic equipment
CN110110046B (en) Method and device for recommending entities with same name
CN112559852A (en) Information recommendation method and device
CN110147426B (en) Method for determining classification label of query text and related device
CN113157923B (en) Entity classification method, device and readable storage medium
CN113256379A (en) Method for correlating shopping demands for commodities
CN112052395B (en) Data processing method and device
CN110362686B (en) Word stock generation method and device, terminal equipment and server

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20220926

Address after: 100084. Room 9, floor 01, cyber building, building 9, building 1, Zhongguancun East Road, Haidian District, Beijing

Patentee after: BEIJING SOGOU TECHNOLOGY DEVELOPMENT Co.,Ltd.

Patentee after: Beijing Sogou Network Technology Co.,Ltd.

Address before: 100084. Room 9, floor 01, cyber building, building 9, building 1, Zhongguancun East Road, Haidian District, Beijing

Patentee before: BEIJING SOGOU TECHNOLOGY DEVELOPMENT Co.,Ltd.