WO2017113592A1 - Model generation method, word weighting method, apparatus, device and computer storage medium - Google Patents
Model generation method, word weighting method, apparatus, device and computer storage medium
- Publication number: WO2017113592A1 (application PCT/CN2016/084312)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- text
- picture
- word
- model
- regression
- Prior art date
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/313—Selection or weighting of terms for indexing
- G06F16/35—Clustering; Classification
- G06F16/532—Query formulation, e.g. graphical querying
- G06F16/58—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
- G06F16/583—Retrieval characterised by using metadata automatically derived from the content
- G06F16/5838—Retrieval characterised by using metadata automatically derived from the content, using colour
- G06F16/951—Indexing; Web crawling techniques
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
Definitions
- the present invention relates to the field of Internet application technologies, and in particular, to a model generation method, a word weighting method, an apparatus, a device, and a computer storage medium.
- the image search process includes: the user inputs a query word, the image search engine searches the image collection for image search results that match the query word, and the image search results are sorted and provided to the user.
- the image search engine is an information retrieval tool for finding Internet pictures.
- the image search engine needs to calculate the correlation between the query word and each candidate picture.
- the correlation calculation is mainly based on the weight of each word in the query word and the weight of each word in the text of each candidate picture.
- the weight of each word in the text of the candidate picture is obtained by using a word weighting technique. It can be seen that the quality of the word weighting directly affects the correlation calculation, which in turn affects the accuracy of the image search results.
- the embodiments of the present invention provide a model generation method, a word weighting method, an apparatus, a device, and a computer storage medium, which can improve the accuracy of the weighting result of each word in the text of a picture, and
- thereby improve the accuracy of the image search results.
- An aspect of an embodiment of the present invention provides a method for generating a model, including:
- performing machine learning based on the text feature and the visual feature to generate a first regression model and a first ranking model.
- a candidate picture whose similarity is greater than or equal to a preset similarity threshold is extracted as the other picture identical to the specified picture.
- the method further includes:
- each text in the text cluster is filtered according to at least one of the distances, among other criteria, to obtain a filtered text cluster.
- the text feature comprising at least one of the following features:
- machine learning is performed to generate a second regression model and a second ranking model.
- An aspect of an embodiment of the present invention provides a method for weighting a word, including:
- obtaining, by using a first regression model, a first regression score of each word in the text according to the text of the specified picture; the first regression model is generated by using the above model generation method;
- An aspect of an embodiment of the present invention provides a model generating apparatus, including:
- a picture obtaining unit configured to acquire other pictures that are the same as the specified picture, and use the specified picture and the other pictures as sample pictures;
- a text clustering unit configured to obtain a text cluster according to the text of the sample picture
- a first feature acquiring unit configured to obtain a text feature according to the text cluster, and obtain a visual feature according to the sample picture;
- a first generating unit configured to perform machine learning according to the text feature and the visual feature, to generate a first regression model and a first sorting model.
- a candidate picture whose similarity is greater than or equal to a preset similarity threshold is extracted as the other picture identical to the specified picture.
- the device further includes:
- a text processing unit configured to: filter each text in the text cluster according to at least one of the authoritative data of the site or page where each sample picture is located, the time information of the page where each sample picture is located, the click data of the site where each sample picture is located, and the distance between the word vector of the text of each sample picture and the word vector of the text cluster, to obtain a filtered text cluster.
- the text feature comprising at least one of the following features:
- the device further includes:
- a score obtaining unit configured to obtain, by using the first regression model, a regression score of each word in the text of each sample picture
- a sorting obtaining unit configured to obtain, by using the first sorting model, a sorting result of each word in the text of each sample picture
- a second feature acquiring unit configured to obtain related features of each picture in the image search result that matches each word in the text of each sample picture
- a second generating unit configured to perform machine learning according to the regression score, the sorting result, and the related feature, and generate a second regression model and a second sorting model.
- An aspect of an embodiment of the present invention provides a word weighting apparatus, including:
- a score obtaining unit configured to obtain a first regression score of each word in the text according to a text of the specified picture by using a first regression model; the first regression model is generated by using the model generating apparatus;
- a sorting obtaining unit configured to obtain, by using the first sorting model, a first sorting result of each word in the text according to the text of the specified picture; the first sorting model is generated by using the model generating apparatus;
- a word weighting unit configured to obtain, according to the first regression score and the first sorting result, a weighting score of each word in the text of the specified picture.
- word weighting unit is specifically configured to:
- word weighting unit further includes:
- a score obtaining module configured to obtain, according to the first regression score and the first sorting result, a second regression score of each word in the text of the specified picture by using a second regression model;
- the second regression model is generated by using the above model generating device;
- a sorting obtaining module configured to obtain, according to the first regression score and the first sorting result, a second sorting result of each word in the text of the specified picture by using a second sorting model; the second sorting model is generated by using the above model generating device;
- a word weighting module configured to calculate a weighting score of each word in the text of the specified picture according to the second regression score and the second sorting result, using a weighting function.
- in the embodiments of the present invention, a text cluster is obtained by aggregating the texts of identical pictures, and multiple features are then extracted based on the text cluster of the picture, so that machine learning can be performed on the various features to generate the required models, which can be used to weight words in the text of a picture.
- in the prior art, the accuracy of the word weighting result is relatively low; by contrast, the embodiments of the present invention can improve the accuracy of the weighting result of each word in the text of the picture, thereby improving the accuracy of the image search results.
- FIG. 1 is a schematic flowchart of Embodiment 1 of a method for generating a model according to an embodiment of the present invention
- FIG. 2 is a diagram showing an example of generating a local model and a global model according to an embodiment of the present invention
- FIG. 3 is a diagram showing an example of text clustering of a picture provided by an embodiment of the present invention.
- FIG. 4 is a schematic flowchart of Embodiment 2 of a method for generating a model according to an embodiment of the present invention
- FIG. 5 is a diagram showing an example of generating a model using a click feature according to an embodiment of the present invention
- FIG. 6 is a schematic flowchart of a method for weighting a word provided by an embodiment of the present invention.
- FIG. 7 is a functional block diagram of Embodiment 1 of a model generating apparatus according to an embodiment of the present invention.
- FIG. 8 is a functional block diagram of Embodiment 2 of a model generating apparatus according to an embodiment of the present invention.
- FIG. 9 is a functional block diagram of a third embodiment of a model generating apparatus according to an embodiment of the present invention.
- FIG. 10 is a functional block diagram of Embodiment 1 of a word weighting apparatus according to an embodiment of the present invention.
- FIG. 11 is a functional block diagram of Embodiment 2 of a word weighting apparatus according to an embodiment of the present invention.
- the terms "first", "second", etc. may be used to describe the regression models in the embodiments of the invention, but these regression models should not be limited by these terms. These terms are only used to distinguish the regression models from each other.
- for example, the first regression model may also be referred to as the second regression model without departing from the scope of the embodiments of the invention;
- similarly, the second regression model may also be referred to as the first regression model.
- the word "if" as used herein may be interpreted as "when", "while", "in response to determining", or "in response to detecting".
- similarly, the phrase "if it is determined" or "if (a stated condition or event) is detected" may be interpreted as "when it is determined", "in response to determining", "when (the stated condition or event) is detected", or "in response to detecting (the stated condition or event)".
- FIG. 1 it is a schematic flowchart of Embodiment 1 of a method for generating a model according to an embodiment of the present invention. As shown in the figure, the method includes the following steps:
- a significant difference between image search and web search is that the relevant text of a picture is generally short; therefore, the word weighting task encounters the problem of short text understanding.
- one way to solve this problem for the weighting task is to add a pre-processing step that clusters the text of the picture to obtain rich and accurate text. That is, the texts of identical pictures are aggregated, and the texts are mutually verified through the aggregation result, thereby filtering out credible and sufficient text to improve the validity of the statistical features of the picture-based text.
- FIG. 2 is an example diagram of generating a local model and a global model according to an embodiment of the present invention.
- specifically, other pictures that are the same as the specified picture are acquired, and then the specified picture and the other pictures identical to the specified picture are taken as the sample pictures in the embodiment of the present invention.
- the number of specified pictures may be one or more, and the number of other pictures that are the same as each specified picture may be one or more.
- the method of obtaining other pictures identical to the specified picture may include, but is not limited to:
- the signature of the specified picture is obtained by using the entire content of the specified picture or the main features of the specified picture, and the signature of each candidate picture in the picture set is obtained by the same method. Then, according to the signature of the specified picture and the signature of each candidate picture, the similarity between the specified picture and each candidate picture is obtained. Each similarity is compared with a preset similarity threshold, and the candidate pictures whose similarity is greater than or equal to the preset similarity threshold are extracted as the other pictures identical to the specified picture; that is, candidate pictures whose signature similarity with the specified picture is greater than or equal to the similarity threshold are considered to be the same picture as the specified picture. In this way, the other pictures identical to the specified picture are obtained.
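The signature-and-threshold comparison described above can be sketched roughly as follows. This is an illustrative toy in Python, not the patent's actual implementation: the patent does not specify the signature algorithm, so a simple brightness-bitmap signature and bit-match similarity are assumed here.

```python
def signature(pixels):
    """Toy picture signature: threshold each grayscale value against the
    mean brightness, yielding a bit vector (hypothetical; the patent does
    not specify how the signature is computed)."""
    mean = sum(pixels) / len(pixels)
    return [1 if p >= mean else 0 for p in pixels]

def similarity(sig_a, sig_b):
    """Fraction of matching bits between two signatures."""
    same = sum(1 for a, b in zip(sig_a, sig_b) if a == b)
    return same / len(sig_a)

def find_same_pictures(specified, candidates, threshold=0.9):
    """Extract candidates whose signature similarity with the specified
    picture is >= the preset similarity threshold."""
    spec_sig = signature(specified)
    return [idx for idx, cand in enumerate(candidates)
            if similarity(spec_sig, signature(cand)) >= threshold]

specified = [10, 10, 200, 200]
same = [12, 11, 198, 201]        # near-duplicate of the specified picture
different = [200, 200, 10, 10]   # inverted content
print(find_same_pictures(specified, [same, different]))  # -> [0]
```

In a production system the signature would typically be a perceptual hash or learned embedding; the threshold comparison, however, works the same way.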
- the text of the specified picture and the text of other pictures may be aggregated to obtain text clustering.
- the text of the specified picture or the text of other pictures may include, but is not limited to, at least one of the title of the page where the picture is located, the text displayed when the mouse hovers over the picture, the title of the picture, and the text in the page where the picture is located.
- FIG. 3 is a schematic diagram of a text clustering of a picture according to an embodiment of the present invention.
- a method for screening a text cluster may include, but is not limited to: filtering each text in the text cluster according to at least one of the authoritative data of the site or page where each sample picture is located, the time information of the page where each sample picture is located, the click data of the site where each sample picture is located, and the distance between the word vector of the text of each sample picture and the word vector of the text cluster, to obtain a filtered text cluster. Each text in the filtered text cluster can be considered a relatively high-quality text.
- for example, according to the time information of the page where each sample picture is located, the text of sample pictures from earlier pages can be deleted from the text cluster, preserving the text of sample pictures from more recent pages.
- for another example, the number of clicks of the site where each sample picture is located is calculated, the number of clicks is compared with a preset click-count threshold, and the text of sample pictures whose click count is less than the threshold is deleted from the text cluster.
- for another example, the distance between the word vector of the text of each sample picture and the word vector of the text cluster is calculated separately, each calculated distance is compared with a preset distance threshold, and the text of sample pictures whose distance is greater than or equal to the distance threshold is deleted from the text cluster.
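The word-vector distance screening above can be sketched as follows. This is an assumption-laden illustration: the cluster's word vector is taken to be the centroid of the member texts' vectors, and cosine distance is used, since the patent does not name a specific distance measure.

```python
import math

def centroid(vectors):
    """Mean word vector over all texts in the cluster (assumed to stand in
    for 'the word vector of the text cluster')."""
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]

def cosine_distance(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return 1.0 - dot / (na * nb)

def filter_cluster(texts, vectors, distance_threshold=0.5):
    """Delete texts whose word vector is >= the distance threshold away
    from the cluster vector, keeping the rest."""
    c = centroid(vectors)
    return [t for t, v in zip(texts, vectors)
            if cosine_distance(v, c) < distance_threshold]

texts = ["concert photo", "live concert", "unrelated recipe"]
vectors = [[1.0, 0.1], [0.9, 0.2], [0.0, 1.0]]
print(filter_cluster(texts, vectors, distance_threshold=0.3))
```

Here the outlier text whose vector points away from the cluster centroid is dropped, which matches the intent of removing low-quality text from the cluster.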
- the text features can be obtained according to the clustered texts obtained after the screening, and the visual features are obtained according to the sample images.
- the text feature may include at least one of the following features:
- the distribution features of the text may include, but are not limited to, the text fields in which each word in the text appears within the texts of the text cluster, the number of occurrences of each word in the text cluster, and the frequency of occurrence of each word in the text cluster.
- the distribution features of the words in the text on sites or pages of different levels may include, but are not limited to, at least one of the following: the number of occurrences and frequency of occurrence of each word in the pages or sites of each level, the ratio of the number of occurrences to the maximum number of occurrences, the ratio of the number of occurrences to the average number of occurrences, and the like.
- regarding the click features of the text: for a query word q, if the user clicks a picture p and also clicks a picture r, the text of r is called the extended click text of the text of p.
- the click features of the text may include, but are not limited to, the number of occurrences of the words in the text within the extended click text, the frequency of occurrence, the ratio of the number of occurrences to the maximum number of occurrences, the ratio of the number of occurrences to the average number of occurrences, and so on.
- the semantic features of a word in a text may include, but are not limited to, a semantic category of words in a text cluster, such as a plant, an animal, or a star.
- specifically, the text can be word-segmented to obtain the words in the text, and then the several words whose confidence is greater than or equal to a confidence threshold are taken as the topic words of the text.
- the a priori attributes of words in the text may include, but are not limited to: Inverse Document Frequency (IDF) data, semantic categories, co-occurring words, synonyms, near-synonyms, and related words.
- the a priori properties of the words can be mined from the corpus and/or user behavior logs.
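Among the a priori attributes above, IDF is the most mechanical to compute. A minimal sketch using the textbook form log(N/df) follows; the patent does not specify which IDF variant is mined from the corpus, so this form is an assumption.

```python
import math

def inverse_document_frequency(word, documents):
    """Textbook IDF: log(N / df), where N is the number of documents in
    the corpus and df is how many of them contain the word. Rare words
    get a higher score; a word absent from the corpus returns 0 here."""
    n = len(documents)
    df = sum(1 for doc in documents if word in doc)
    if df == 0:
        return 0.0
    return math.log(n / df)

docs = [{"cat", "sat"}, {"cat", "ran"}, {"dog", "ran"}, {"bird"}]
print(inverse_document_frequency("bird", docs))  # rarer word, higher IDF
print(inverse_document_frequency("cat", docs))
```

In practice IDF would be mined once over the whole corpus and looked up per word at weighting time.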
- the visual feature refers to a visual feature of the sample picture.
- for example, the visual features may include "Liu Moumou", "concert", and "celebrity".
- the visual features of the sample picture can be obtained by performing machine learning on the content of the sample picture and the user click logs.
- S104 Perform machine learning according to the text feature and the visual feature to generate a first regression model and a first ranking model.
- machine learning may be performed according to the text feature and the visual feature to generate a local model, where the local model includes a first regression model and a first sort model.
- the first regression model is used to obtain the regression score of each word in the text of the picture
- the first ordering model is used to obtain the ranking score of each word in the text of the picture
- the sorting score is used to determine the ordering among the words in the text of the picture. For example, if the ranking scores of words A, B, and C are 0.3, -1, and 1.2, respectively, the order among the words is "word C > word A > word B".
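The ordering in the example above follows directly from sorting words by their ranking scores, highest first; a minimal sketch:

```python
def order_words(scores):
    """Sort words by ranking score in descending order and render the
    ordering in the 'word X > word Y' form used in the example."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    return " > ".join("word " + w for w in ranked)

# The example's scores: A=0.3, B=-1, C=1.2
print(order_words({"A": 0.3, "B": -1, "C": 1.2}))  # word C > word A > word B
```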
- a Gradient Boosting Decision Tree can be used to perform machine learning on text features and visual features to generate a first regression model.
- the Gradient Boosting Rank (GBRank) algorithm may be used to perform machine learning on text features and visual features to generate a first sorting model.
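To make the GBDT idea concrete, here is a toy gradient-boosting regressor built from depth-1 regression stumps. It is an educational sketch of the technique named above, not the patent's implementation, and real systems would use a library such as scikit-learn or XGBoost with full decision trees.

```python
def fit_stump(X, residuals):
    """Find the (feature, threshold) split minimizing squared error,
    returning the split plus the mean residual on each side."""
    best = None
    for j in range(len(X[0])):
        for t in sorted(set(x[j] for x in X)):
            left = [r for x, r in zip(X, residuals) if x[j] <= t]
            right = [r for x, r in zip(X, residuals) if x[j] > t]
            if not left or not right:
                continue
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            err = (sum((r - lm) ** 2 for r in left)
                   + sum((r - rm) ** 2 for r in right))
            if best is None or err < best[0]:
                best = (err, (j, t, lm, rm))
    return best[1]

def gbdt_fit(X, y, rounds=50, lr=0.1):
    """Boosting loop: each stump fits the residuals left by the ensemble
    built so far, and its output is added with a small learning rate."""
    base = sum(y) / len(y)
    preds = [base] * len(y)
    stumps = []
    for _ in range(rounds):
        residuals = [yi - p for yi, p in zip(y, preds)]
        j, t, lm, rm = fit_stump(X, residuals)
        stumps.append((j, t, lm, rm))
        preds = [p + lr * (lm if x[j] <= t else rm)
                 for x, p in zip(X, preds)]
    return base, stumps, lr

def gbdt_predict(model, x):
    base, stumps, lr = model
    return base + sum(lr * (lm if x[j] <= t else rm)
                      for j, t, lm, rm in stumps)

# Tiny 1-feature example: the ensemble learns a step function.
X = [[0.0], [1.0], [2.0], [3.0]]
y = [0.0, 0.0, 1.0, 1.0]
model = gbdt_fit(X, y)
print(round(gbdt_predict(model, [0.5]), 3), round(gbdt_predict(model, [2.5]), 3))
```

The same additive-residual-fitting principle underlies GBRank, except that GBRank's objective is pairwise (penalizing mis-ordered pairs) rather than pointwise squared error.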
- FIG. 4 is a schematic flowchart of a second embodiment of a method for generating a model according to an embodiment of the present invention. As shown in the figure, the method is based on the model generation method of the first embodiment, and after S104 the method may further include the following steps:
- the first regression model and the first sorting model generated in the first embodiment can only obtain the regression score of each word within one text of a picture and the sorting position of each word within one text of a picture; if it is desired to compare the same word across the texts of different pictures, a global model needs to be generated further.
- the text of each sample picture may be first input into a first regression model, and the first regression model outputs a regression score of each word in the text.
- the text of each sample picture is input into the first sorting model, and the first sorting model can output the sorting result of each word in the text.
- the related features of a picture include at least one of the following: the user behavior features of each picture in the image search results that match each word in the text of each sample picture, the quality features of each picture, and the authoritative data of the site or page where each picture is located.
- the user behavior features of a picture may include, but are not limited to, the click data of the picture in the image search results that match a query word which contains the words in the text and whose importance is greater than a specified threshold.
- Click data can include: the number of clicks on the image, the frequency of clicks, the ratio of clicks to the maximum number of clicks, the ratio of clicks to the average number of clicks, and so on.
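The click-data ratios listed above are simple normalizations of raw click counts; a small sketch (the feature names are illustrative, not taken from the patent):

```python
def click_features(click_counts):
    """Per-picture click features: raw count, frequency (share of all
    clicks), ratio to the maximum count, and ratio to the average count."""
    total = sum(click_counts)
    peak = max(click_counts)
    avg = total / len(click_counts)
    return [{"clicks": c,
             "frequency": c / total,
             "ratio_to_max": c / peak,
             "ratio_to_avg": c / avg}
            for c in click_counts]

# Three pictures with 10, 40 and 50 clicks for some query word.
feats = click_features([10, 40, 50])
print(feats[2]["ratio_to_max"], feats[0]["frequency"])
```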
- the quality features of a picture may include, but are not limited to, the size of the picture, the clarity of the picture, data indicating whether the picture is aesthetically pleasing (such as true or false), whether the link of the picture is a dead link, whether the link of the picture is a connection to an external site, and so on.
- the authoritative data of the site or page where the picture is located may include, but is not limited to, the authoritative absolute value of the site or page where the picture is located, the ratio of that absolute value to the maximum absolute value, and the like.
- specifically, machine learning may be performed on the regression score of each word, the sorting result of each word, and the related features of each picture in the image search results that match the words in the text of each sample picture, to generate a global model; the global model includes a second regression model and a second sorting model.
- the second regression model is used to obtain each regression score when the same word corresponds to the text of different pictures
- the second sorting model is used to obtain the sort score when the same word corresponds to the text of different pictures
- the sorting score is used to determine the ordering of the same word across the texts of different pictures.
- for example, if the sorting scores of the word s in text A, the word s in text B, and the word s in text C are 0.3, -1, and 1.2, respectively, the order is "word s in text C > word s in text A > word s in text B".
- specifically, the GBDT algorithm can be used to perform machine learning on the regression score of each word, the sorting result of each word, and the related features of each picture in the image search results that match the words in the text of each sample picture, to generate the second regression model.
- similarly, the GBRank algorithm can be used to perform machine learning on the regression score of each word, the sorting result of each word, and the related features of each picture in the image search results that match the words in the text of each sample picture, to generate the second sorting model.
- FIG. 5 is an exemplary diagram of generating a model by using a click feature according to an embodiment of the present invention.
- the using the click feature generation model may include the following process:
- first, query words with click inversions and the corresponding search results are filtered out as candidate data;
- the search results are divided into different levels according to the click information of each search result;
- the data set data_a is obtained from the filtered click-inverted query words, the corresponding search results, and the search results divided into different levels.
- among the candidate data, only the data with a large difference in local features is filtered out as the training data used for generating the first sorting model in the local model; the quality of the filtered data is closely related to the features used by the local model.
- the search results of different levels can be used as the training data for generating the first regression model in the local model; these two sets of training data can be recorded as the local training data (train_local).
- machine learning is performed using train_local to generate a local model including a first regression model and a first ranking model.
- the regression score and the sorting result corresponding to data_a, i.e., the local model scores, are obtained by using the local model, and the regression score and the sorting result are added to the data set data_a to obtain the data set data_b.
- the data with only a small difference in local features is filtered out of the data set data_b, and only that data, together with its regression scores and sorting results, is used as the training data for the second regression model and the second sorting model in the global model, recorded as the global training data (train_global).
- machine learning is performed using train_global to generate a global model including a second regression model and a second ranking model.
- the learned local and global models can be used to weight the text of the test image in the test set and evaluate the test results.
- FIG. 6 is a schematic flowchart of a method for weighting a word according to an embodiment of the present invention. As shown in the figure, the following steps may be included:
- specifically, the text of the specified picture, the text features of the specified picture, and the visual features of the specified picture are input into the first regression model generated in Embodiment 1, and the first regression model obtains the first regression score of each word in the text of the specified picture according to the input information.
- similarly, the text of the specified picture, the text features of the specified picture, and the visual features of the specified picture are input into the first sorting model generated in Embodiment 1, and the first sorting model obtains the first sorting result of each word in the text of the specified picture according to the input information.
- the method for obtaining the weighting score of each word in the text of the specified picture by using the first regression score and the first sorting result may include but is not limited to the following two types:
- the first type: a weighting function may be used to map the fitting result of the first regression score and the first sorting result into a specified interval; for example, the specified interval is 0-100.
- the second type: if the second regression model and the second sorting model are further generated in the model generation method, the second regression model may be used to obtain, according to the first regression score and the first sorting result, a second regression score of each word in the text of the specified picture; the second sorting model may be used to obtain, according to the first regression score and the first sorting result, a second sorting result of each word in the text of the specified picture; finally, the weighting score of each word in the text of the specified picture is calculated according to the second regression score and the second sorting result, using the weighting function.
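The patent states only that a weighting function maps the fitted result into an interval such as 0-100 without giving its form; the sketch below assumes a linear combination of the two scores followed by min-max scaling, purely for illustration.

```python
def weighting_scores(regression, ranking, alpha=0.5, lo=0.0, hi=100.0):
    """Combine each word's regression score and sorting score with weight
    alpha, then min-max map the combined fit into [lo, hi]. Both the
    combination and the mapping are illustrative assumptions, not the
    patent's actual weighting function."""
    combined = {w: alpha * regression[w] + (1 - alpha) * ranking[w]
                for w in regression}
    cmin, cmax = min(combined.values()), max(combined.values())
    span = (cmax - cmin) or 1.0  # guard against all-equal scores
    return {w: lo + (hi - lo) * (c - cmin) / span
            for w, c in combined.items()}

scores = weighting_scores({"A": 0.3, "B": -1.0, "C": 1.2},
                          {"A": 0.2, "B": -0.8, "C": 1.0})
print(scores)  # C maps to 100, B to 0, A in between
```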
- Embodiments of the present invention further provide an apparatus embodiment for implementing the steps and methods in the foregoing method embodiments.
- FIG. 7 is a functional block diagram of Embodiment 1 of a model generating apparatus according to an embodiment of the present invention. As shown, the device includes:
- a picture obtaining unit 71 configured to acquire other pictures that are the same as the specified picture, and use the specified picture and the other pictures as sample pictures;
- a text clustering unit 72 configured to obtain a text cluster according to the text of the sample picture
- a first feature acquiring unit 73 configured to obtain a text feature according to the text cluster, and obtain a visual feature according to the sample image;
- the first generating unit 74 is configured to perform machine learning according to the text feature and the visual feature to generate a first regression model and a first sorting model.
- the picture obtaining unit 71 is specifically configured to:
- a candidate picture whose similarity is greater than or equal to a preset similarity threshold is extracted as the other picture identical to the specified picture.
- FIG. 8 is a functional block diagram of a second embodiment of a model generating apparatus according to an embodiment of the present invention. As shown, the device further includes:
- the text processing unit 75 is configured to: filter each text in the text cluster according to at least one of the authoritative data of the site or page where each sample picture is located, the time information of the page where each sample picture is located, the click data of the site where each sample picture is located, and the distance between the word vector of the text of each sample picture and the word vector of the text cluster, to obtain a filtered text cluster.
- the text features include at least one of the following: distribution features of each text in the text cluster; click features of each text in the text cluster; semantic features of the words in each text in the text cluster; subject terms of each text in the text cluster; and prior attributes of the words in each text in the text cluster.
- FIG. 9 is a functional block diagram of Embodiment 3 of a model generating apparatus according to an embodiment of the present invention. As shown, the device further includes:
- a score obtaining unit 76 configured to obtain, by using the first regression model, a regression score of each word in the text of each sample picture;
- a ranking obtaining unit 77 configured to obtain, by using the first ranking model, a ranking result of each word in the text of each sample picture;
- a second feature acquiring unit 78 configured to obtain related features of each picture in the image search results that match each word in the text of each sample picture;
- a second generating unit 79 configured to perform machine learning according to the regression scores, the ranking results, and the related features to generate a second regression model and a second ranking model.
- the related features include at least one of the following: user behavior features of each picture in the image search results that match each word in the text of each sample picture, quality features of each picture, and authority data of the site or page where each picture is located.
- FIG. 10 is a functional block diagram of Embodiment 1 of a word weighting apparatus according to an embodiment of the present invention. As shown, the device includes:
- a score obtaining unit 80 configured to obtain, according to the text of a specified picture and by using the first regression model, a first regression score of each word in the text; the first regression model is generated by the model generating apparatus shown in FIG. 7 and FIG. 8;
- a ranking obtaining unit 81 configured to obtain, according to the text of the specified picture and by using the first ranking model, a first ranking result of each word in the text; the first ranking model is generated by the model generating apparatus shown in FIG. 7 and FIG. 8;
- a word weighting unit 82 configured to obtain, according to the first regression score and the first ranking result, a weighting score of each word in the text of the specified picture.
- the word weighting unit 82 is specifically configured to: calculate the weighting score of each word in the text of the specified picture according to the first regression score and the first ranking result and by using a weighting function.
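The weighting function itself is not disclosed by the patent, so the sketch below simply interpolates between the first regression score and a score derived from the first ranking result; the interpolation parameter `alpha` and the linear rank-to-score mapping are assumptions.

```python
def weight_words(regression_scores, ranking, alpha=0.5):
    """Hypothetical weighting function.

    regression_scores: word -> first regression score in [0, 1]
    ranking:           list of words, best-ranked first
    alpha:             assumed blend between score and rank-derived score
    """
    n = len(ranking)
    # map rank position to a score in (0, 1]: best word gets 1.0
    rank_score = {w: (n - i) / n for i, w in enumerate(ranking)}
    return {w: alpha * regression_scores[w] + (1 - alpha) * rank_score[w]
            for w in ranking}
```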
- FIG. 11 is a functional block diagram of Embodiment 2 of a word weighting apparatus according to an embodiment of the present invention.
- the word weighting unit 82 further includes:
- a score obtaining module 821 configured to obtain, according to the first regression score and the first ranking result and by using a second regression model, a second regression score of each word in the text of the specified picture; the second regression model is generated by the model generating apparatus shown in FIG. 9;
- a ranking obtaining module 822 configured to obtain, according to the first regression score and the first ranking result and by using a second ranking model, a second ranking result of each word in the text of the specified picture; the second ranking model is generated by the model generating apparatus shown in FIG. 9;
- a word weighting module 823 configured to calculate the weighting score of each word in the text of the specified picture according to the second regression score and the second ranking result and by using a weighting function.
- By acquiring other pictures that are the same as a specified picture, the specified picture and the other pictures are used as sample pictures; a text cluster is then obtained according to the text of the sample pictures; text features are obtained according to the text cluster, and visual features are obtained according to the sample pictures; and machine learning is performed according to the text features and the visual features to generate a first regression model and a first ranking model, which are used to weight the words in the text of a picture.
- In other words, a text cluster is obtained by clustering the texts of pictures, multiple features are then extracted on the basis of the cluster, and machine learning over these features produces the models needed to weight the words in the text of a picture.
- The invention addresses the problem in the prior art that the accuracy of word weighting results is relatively low because the text of a picture is short. The embodiments of the invention can therefore improve the accuracy of the weighting score of each word in the text of a picture, and thereby the accuracy of image search results.
- the disclosed system, apparatus, and method may be implemented in other manners.
- the device embodiments described above are merely illustrative.
- the division of the unit is only a logical function division.
- multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed.
- the mutual coupling or direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interface, device or unit, and may be in an electrical, mechanical or other form.
- the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, may be located in one place, or may be distributed to multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of the embodiment.
- each functional unit in the various embodiments of the present invention may be integrated into one processing unit, each unit may exist physically separately, or two or more units may be integrated into one unit.
- the above integrated unit can be implemented in the form of hardware or in the form of hardware plus software functional units.
- the above-described integrated unit implemented in the form of a software functional unit can be stored in a computer readable storage medium.
- the above software functional unit is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to perform some of the steps of the methods of the various embodiments of the present invention.
- the foregoing storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.
Abstract
Description
Claims (22)
- 1. A model generation method, characterized in that the method comprises: acquiring other pictures that are the same as a specified picture, and using the specified picture and the other pictures as sample pictures; obtaining a text cluster according to the text of the sample pictures; obtaining text features according to the text cluster, and obtaining visual features according to the sample pictures; and performing machine learning according to the text features and the visual features to generate a first regression model and a first ranking model.
- 2. The method according to claim 1, characterized in that acquiring other pictures that are the same as the specified picture comprises: acquiring signatures of the specified picture and of each candidate picture; obtaining the similarity between the specified picture and each candidate picture according to the signatures of the specified picture and of each candidate picture; and extracting candidate pictures whose similarity is greater than or equal to a preset similarity threshold as the other pictures that are the same as the specified picture.
- 3. The method according to claim 1, characterized in that, before obtaining text features according to the text cluster and obtaining visual features according to the sample pictures, the method further comprises: filtering each text in the text cluster according to at least one of the authority data of the site or page where each sample picture is located, the time information of the page where each sample picture is located, the click data of the site where each sample picture is located, and the distance between the word vector of the text of each sample picture and the word vector of the text cluster, to obtain a filtered text cluster.
- 4. The method according to claim 1, characterized in that the text features comprise at least one of the following: distribution features of each text in the text cluster; click features of each text in the text cluster; semantic features of the words in each text in the text cluster; subject terms of each text in the text cluster; and prior attributes of the words in each text in the text cluster.
- 5. The method according to claim 1, characterized in that the method further comprises: obtaining, by using the first regression model, a regression score of each word in the text of each sample picture; obtaining, by using the first ranking model, a ranking result of each word in the text of each sample picture; obtaining related features of each picture in image search results that match each word in the text of each sample picture; and performing machine learning according to the regression scores, the ranking results, and the related features to generate a second regression model and a second ranking model.
- 6. The method according to claim 5, characterized in that the related features comprise at least one of the following: user behavior features of each picture in the image search results that match each word in the text of each sample picture, quality features of each picture, and authority data of the site or page where each picture is located.
- 7. A word weighting method, characterized in that the method comprises: obtaining, according to the text of a specified picture and by using a first regression model, a first regression score of each word in the text, the first regression model being generated by the model generation method according to any one of claims 1 to 4; obtaining, according to the text of the specified picture and by using a first ranking model, a first ranking result of each word in the text, the first ranking model being generated by the model generation method according to any one of claims 1 to 4; and obtaining a weighting score of each word in the text of the specified picture according to the first regression score and the first ranking result.
- 8. The method according to claim 7, characterized in that obtaining the weighting score of each word in the text of the specified picture according to the first regression score and the first ranking result comprises: calculating the weighting score of each word in the text of the specified picture according to the first regression score and the first ranking result and by using a weighting function.
- 9. The method according to claim 7, characterized in that obtaining the weighting score of each word in the text of the specified picture according to the first regression score and the first ranking result comprises: obtaining, according to the first regression score and the first ranking result and by using a second regression model, a second regression score of each word in the text of the specified picture, the second regression model being generated by the model generation method according to claim 5 or 6; obtaining, according to the first regression score and the first ranking result and by using a second ranking model, a second ranking result of each word in the text of the specified picture, the second ranking model being generated by the model generation method according to claim 5 or 6; and calculating the weighting score of each word in the text of the specified picture according to the second regression score and the second ranking result and by using a weighting function.
- 10. A model generation apparatus, characterized in that the apparatus comprises: a picture obtaining unit configured to acquire other pictures that are the same as a specified picture, and to use the specified picture and the other pictures as sample pictures; a text clustering unit configured to obtain a text cluster according to the text of the sample pictures; a first feature obtaining unit configured to obtain text features according to the text cluster and obtain visual features according to the sample pictures; and a first generating unit configured to perform machine learning according to the text features and the visual features to generate a first regression model and a first ranking model.
- 11. The apparatus according to claim 10, characterized in that the picture obtaining unit is specifically configured to: acquire signatures of the specified picture and of each candidate picture; obtain the similarity between the specified picture and each candidate picture according to the signatures of the specified picture and of each candidate picture; and extract candidate pictures whose similarity is greater than or equal to a preset similarity threshold as the other pictures that are the same as the specified picture.
- 12. The apparatus according to claim 10, characterized in that the apparatus further comprises: a text processing unit configured to filter each text in the text cluster according to at least one of the authority data of the site or page where each sample picture is located, the time information of the page where each sample picture is located, the click data of the site where each sample picture is located, and the distance between the word vector of the text of each sample picture and the word vector of the text cluster, to obtain a filtered text cluster.
- 13. The apparatus according to claim 10, characterized in that the text features comprise at least one of the following: distribution features of each text in the text cluster; click features of each text in the text cluster; semantic features of the words in each text in the text cluster; subject terms of each text in the text cluster; and prior attributes of the words in each text in the text cluster.
- 14. The apparatus according to claim 10, characterized in that the apparatus further comprises: a score obtaining unit configured to obtain, by using the first regression model, a regression score of each word in the text of each sample picture; a ranking obtaining unit configured to obtain, by using the first ranking model, a ranking result of each word in the text of each sample picture; a second feature obtaining unit configured to obtain related features of each picture in image search results that match each word in the text of each sample picture; and a second generating unit configured to perform machine learning according to the regression scores, the ranking results, and the related features to generate a second regression model and a second ranking model.
- 15. The apparatus according to claim 14, characterized in that the related features comprise at least one of the following: user behavior features of each picture in the image search results that match each word in the text of each sample picture, quality features of each picture, and authority data of the site or page where each picture is located.
- 16. A word weighting apparatus, characterized in that the apparatus comprises: a score obtaining unit configured to obtain, according to the text of a specified picture and by using a first regression model, a first regression score of each word in the text, the first regression model being generated by the model generation apparatus according to any one of claims 10 to 13; a ranking obtaining unit configured to obtain, according to the text of the specified picture and by using a first ranking model, a first ranking result of each word in the text, the first ranking model being generated by the model generation apparatus according to any one of claims 10 to 13; and a word weighting unit configured to obtain a weighting score of each word in the text of the specified picture according to the first regression score and the first ranking result.
- 17. The apparatus according to claim 16, characterized in that the word weighting unit is specifically configured to: calculate the weighting score of each word in the text of the specified picture according to the first regression score and the first ranking result and by using a weighting function.
- 18. The apparatus according to claim 16, characterized in that the word weighting unit further comprises: a score obtaining module configured to obtain, according to the first regression score and the first ranking result and by using a second regression model, a second regression score of each word in the text of the specified picture, the second regression model being generated by the model generation apparatus according to claim 14 or 15; a ranking obtaining module configured to obtain, according to the first regression score and the first ranking result and by using a second ranking model, a second ranking result of each word in the text of the specified picture, the second ranking model being generated by the model generation apparatus according to claim 14 or 15; and a word weighting module configured to calculate the weighting score of each word in the text of the specified picture according to the second regression score and the second ranking result and by using a weighting function.
- 19. A device, comprising: one or more processors; a memory; and one or more programs stored in the memory which, when executed by the one or more processors, cause the device to: acquire other pictures that are the same as a specified picture, and use the specified picture and the other pictures as sample pictures; obtain a text cluster according to the text of the sample pictures; obtain text features according to the text cluster and obtain visual features according to the sample pictures; and perform machine learning according to the text features and the visual features to generate a first regression model and a first ranking model.
- 20. A device, comprising: one or more processors; a memory; and one or more programs stored in the memory which, when executed by the one or more processors, cause the device to: obtain, according to the text of a specified picture and by using a first regression model, a first regression score of each word in the text, the first regression model being generated by the model generation method according to any one of claims 1 to 4; obtain, according to the text of the specified picture and by using a first ranking model, a first ranking result of each word in the text, the first ranking model being generated by the model generation method according to any one of claims 1 to 4; and obtain a weighting score of each word in the text of the specified picture according to the first regression score and the first ranking result.
- 21. A computer storage medium encoded with a computer program which, when executed by one or more computers, causes the one or more computers to: acquire other pictures that are the same as a specified picture, and use the specified picture and the other pictures as sample pictures; obtain a text cluster according to the text of the sample pictures; obtain text features according to the text cluster and obtain visual features according to the sample pictures; and perform machine learning according to the text features and the visual features to generate a first regression model and a first ranking model.
- 22. A computer storage medium encoded with a computer program which, when executed by one or more computers, causes the one or more computers to: obtain, according to the text of a specified picture and by using a first regression model, a first regression score of each word in the text, the first regression model being generated by the model generation method according to any one of claims 1 to 4; obtain, according to the text of the specified picture and by using a first ranking model, a first ranking result of each word in the text, the first ranking model being generated by the model generation method according to any one of claims 1 to 4; and obtain a weighting score of each word in the text of the specified picture according to the first regression score and the first ranking result.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US15/319,555 US10565253B2 (en) | 2015-12-31 | 2016-06-01 | Model generation method, word weighting method, device, apparatus, and computer storage medium |
JP2016572673A JP6428795B2 (ja) | 2015-12-31 | 2016-06-01 | モデル生成方法、単語重み付け方法、モデル生成装置、単語重み付け装置、デバイス、コンピュータプログラム及びコンピュータ記憶媒体 |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201511025975.0A CN105653701B (zh) | 2015-12-31 | 2015-12-31 | 模型生成方法及装置、词语赋权方法及装置 |
CN201511025975.0 | 2015-12-31 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2017113592A1 true WO2017113592A1 (zh) | 2017-07-06 |
Family
ID=56490920
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2016/084312 WO2017113592A1 (zh) | 2015-12-31 | 2016-06-01 | 模型生成方法、词语赋权方法、装置、设备及计算机存储介质 |
Country Status (4)
Country | Link |
---|---|
US (1) | US10565253B2 (zh) |
JP (1) | JP6428795B2 (zh) |
CN (1) | CN105653701B (zh) |
WO (1) | WO2017113592A1 (zh) |
Families Citing this family (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106919951B (zh) * | 2017-01-24 | 2020-04-21 | 杭州电子科技大学 | 一种基于点击与视觉融合的弱监督双线性深度学习方法 |
CN107992508B (zh) * | 2017-10-09 | 2021-11-30 | 北京知道未来信息技术有限公司 | 一种基于机器学习的中文邮件签名提取方法及系统 |
CN110598200B (zh) * | 2018-06-13 | 2023-05-23 | 北京百度网讯科技有限公司 | 语义识别方法及装置 |
CN109032375B (zh) * | 2018-06-29 | 2022-07-19 | 北京百度网讯科技有限公司 | 候选文本排序方法、装置、设备及存储介质 |
CN110569429B (zh) * | 2019-08-08 | 2023-11-24 | 创新先进技术有限公司 | 一种内容选择模型的生成方法、装置和设备 |
JP7321977B2 (ja) * | 2020-06-10 | 2023-08-07 | ヤフー株式会社 | 情報処理装置、情報処理方法および情報処理プログラム |
CN113283115B (zh) * | 2021-06-11 | 2023-08-08 | 北京有竹居网络技术有限公司 | 图像模型生成方法、装置和电子设备 |
CN113254513B (zh) * | 2021-07-05 | 2021-09-28 | 北京达佳互联信息技术有限公司 | 排序模型生成方法、排序方法、装置、电子设备 |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20070192350A1 (en) * | 2006-02-14 | 2007-08-16 | Microsoft Corporation | Co-clustering objects of heterogeneous types |
CN101582080A (zh) * | 2009-06-22 | 2009-11-18 | 浙江大学 | 一种基于图像和文本相关性挖掘的Web图像聚类方法 |
CN103810274A (zh) * | 2014-02-12 | 2014-05-21 | 北京联合大学 | 基于WordNet语义相似度的多特征图像标签排序方法 |
CN104077419A (zh) * | 2014-07-18 | 2014-10-01 | 合肥工业大学 | 结合语义与视觉信息的长查询图像检索重排序算法 |
Family Cites Families (16)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2000048041A (ja) | 1998-07-29 | 2000-02-18 | Matsushita Electric Ind Co Ltd | データ検索システム及びこれに用いる装置 |
JP3529036B2 (ja) | 1999-06-11 | 2004-05-24 | 株式会社日立製作所 | 文書付き画像の分類方法 |
US7716225B1 (en) * | 2004-06-17 | 2010-05-11 | Google Inc. | Ranking documents based on user behavior and/or feature data |
US7617176B2 (en) | 2004-07-13 | 2009-11-10 | Microsoft Corporation | Query-based snippet clustering for search result grouping |
US8078617B1 (en) * | 2009-01-20 | 2011-12-13 | Google Inc. | Model based ad targeting |
US20110125743A1 (en) * | 2009-11-23 | 2011-05-26 | Nokia Corporation | Method and apparatus for providing a contextual model based upon user context data |
JP2011221794A (ja) | 2010-04-09 | 2011-11-04 | Kddi Corp | 画像選定装置 |
CN102411563B (zh) | 2010-09-26 | 2015-06-17 | 阿里巴巴集团控股有限公司 | 一种识别目标词的方法、装置及系统 |
CN103201718A (zh) | 2010-11-05 | 2013-07-10 | 乐天株式会社 | 关于关键词提取的系统和方法 |
US9864817B2 (en) * | 2012-01-28 | 2018-01-09 | Microsoft Technology Licensing, Llc | Determination of relationships between collections of disparate media types |
US8880438B1 (en) * | 2012-02-15 | 2014-11-04 | Google Inc. | Determining content relevance |
CN102902821B (zh) * | 2012-11-01 | 2015-08-12 | 北京邮电大学 | 基于网络热点话题的图像高级语义标注、检索方法及装置 |
US9082047B2 (en) * | 2013-08-20 | 2015-07-14 | Xerox Corporation | Learning beautiful and ugly visual attributes |
CN103577537B (zh) * | 2013-09-24 | 2016-08-17 | 上海交通大学 | 面向图像分享网站图片的多重配对相似度确定方法 |
CN103942279B (zh) | 2014-04-01 | 2018-07-10 | 百度(中国)有限公司 | 搜索结果的展现方法和装置 |
CN104376105B (zh) * | 2014-11-26 | 2017-08-25 | 北京航空航天大学 | 一种社会媒体中图像低层视觉特征与文本描述信息的特征融合系统及方法 |
- 2015-12-31: CN CN201511025975.0A patent/CN105653701B/zh active Active
- 2016-06-01: US US15/319,555 patent/US10565253B2/en active Active
- 2016-06-01: JP JP2016572673A patent/JP6428795B2/ja active Active
- 2016-06-01: WO PCT/CN2016/084312 patent/WO2017113592A1/zh active Application Filing
Also Published As
Publication number | Publication date |
---|---|
JP2018509664A (ja) | 2018-04-05 |
JP6428795B2 (ja) | 2018-11-28 |
US10565253B2 (en) | 2020-02-18 |
US20180210897A1 (en) | 2018-07-26 |
CN105653701A (zh) | 2016-06-08 |
CN105653701B (zh) | 2019-01-15 |
Legal Events
- ENP — Entry into the national phase: Ref document number 2016572673; Country of ref document: JP; Kind code of ref document: A
- WWE — WIPO information, entry into national phase: Ref document number 15319555; Country of ref document: US
- 121 — Ep: the EPO has been informed by WIPO that EP was designated in this application: Ref document number 16880400; Country of ref document: EP; Kind code of ref document: A1
- NENP — Non-entry into the national phase: Ref country code: DE
- 122 — Ep: PCT application non-entry in European phase: Ref document number 16880400; Country of ref document: EP; Kind code of ref document: A1