CN113793191A - Commodity matching method and device and electronic equipment - Google Patents

Publication number
CN113793191A
Authority
CN
China
Prior art keywords
commodity
candidate
matching
commodities
category
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110181713.2A
Other languages
Chinese (zh)
Other versions
CN113793191B (en)
Inventor
付振宇
郑宇宇
赵英普
闫慧丽
孙孟哲
顾松庠
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jingdong Technology Holding Co Ltd
Original Assignee
Jingdong Technology Holding Co Ltd
Application filed by Jingdong Technology Holding Co Ltd
Priority to CN202110181713.2A
Publication of CN113793191A
Application granted
Publication of CN113793191B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/06 Buying, selling or leasing transactions
    • G06Q30/0601 Electronic shopping [e-shopping]
    • G06Q30/0623 Item investigation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/332 Query formulation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G06F16/3344 Query execution using natural language analysis

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Accounting & Taxation (AREA)
  • Finance (AREA)
  • General Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • General Business, Economics & Management (AREA)
  • Mathematical Physics (AREA)
  • Strategic Management (AREA)
  • Marketing (AREA)
  • Economics (AREA)
  • Development Economics (AREA)
  • Artificial Intelligence (AREA)
  • Image Analysis (AREA)

Abstract

The application provides a commodity matching method and device and electronic equipment, and belongs to the technical field of computer application. The matching method of the commodities comprises the following steps: acquiring first description information corresponding to a first commodity; identifying the first description information to determine a multi-dimensional feature of the first commodity; acquiring a plurality of candidate commodities from a commodity library according to a first matching degree between each dimension characteristic of each commodity in the commodity library and each dimension characteristic of a first commodity; determining a second matching degree of the first commodity and each candidate commodity according to the first matching degree between each dimension characteristic of the first commodity and the corresponding dimension characteristic of each candidate commodity and the weight corresponding to each dimension; and extracting the commodities matched with the first commodity from the candidate commodities according to the second matching degree of the first commodity and each candidate commodity. Therefore, by the commodity matching method, the key features related to the commodity attributes are extracted from the description information of the commodities for matching, and the accuracy of commodity matching is improved.

Description

Commodity matching method and device and electronic equipment
Technical Field
The present application relates to the field of computer application technologies, and in particular, to a method and an apparatus for matching a commodity, and an electronic device.
Background
In e-commerce commodity matching, the overall similarity between commodity titles is generally calculated, and whether two commodity titles match is determined from that overall similarity. However, because all text in a commodity title participates in the similarity calculation, text unrelated to the commodity attributes cannot be filtered out, so the accuracy of commodity title matching is low.
Disclosure of Invention
The commodity matching method, device, electronic equipment and storage medium provided by the present application are used to solve the following problem in the related art: when commodity titles are matched according to the overall similarity between them, all text in the titles participates in the similarity calculation, text unrelated to the commodity attributes cannot be filtered out, and the accuracy of commodity title matching is therefore low.
An embodiment of an aspect of the present application provides a matching method for a commodity, including: acquiring first description information corresponding to a first commodity; identifying the first description information to determine a multi-dimensional feature of the first commodity; acquiring a plurality of candidate commodities from a commodity library according to a first matching degree between each dimension characteristic of each commodity in the commodity library and each dimension characteristic of the first commodity; determining a second matching degree of the first commodity and each candidate commodity according to a first matching degree between each dimension characteristic of the first commodity and the corresponding dimension characteristic of each candidate commodity and the weight corresponding to each dimension; and extracting the commodities matched with the first commodity from the candidate commodities according to the second matching degree of the first commodity and each candidate commodity.
An embodiment of another aspect of the present application provides a commodity matching device, including: a first acquisition module, configured to acquire first description information corresponding to a first commodity; a first determining module, configured to identify the first description information so as to determine the multi-dimensional features of the first commodity; a second acquisition module, configured to acquire a plurality of candidate commodities from the commodity library according to a first matching degree between each dimension feature of each commodity in the commodity library and each dimension feature of the first commodity; a second determining module, configured to determine a second matching degree between the first commodity and each candidate commodity according to the first matching degree between each dimension feature of the first commodity and the corresponding dimension feature of each candidate commodity and the weight corresponding to each dimension; and an extracting module, configured to extract the commodities matched with the first commodity from the candidate commodities according to the second matching degree between the first commodity and each candidate commodity.
An embodiment of another aspect of the present application provides an electronic device, which includes a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor implements the commodity matching method described above when executing the program.
In yet another aspect, the present application provides a computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the commodity matching method described above.
According to the commodity matching method, the commodity matching device, the electronic equipment and the computer-readable storage medium, the multi-dimensional features of the first commodity are extracted from the first description information corresponding to the first commodity, the candidate commodities are obtained from the commodity library according to the first matching degree between the dimensional features of the commodities in the commodity library and the dimensional features of the first commodity, the second matching degree between the first commodity and each candidate commodity is determined according to the first matching degree between the dimensional features of the first commodity and the corresponding dimensional features of each candidate commodity and the weight corresponding to each dimension, and the commodity matched with the first commodity is extracted from the candidate commodities according to the second matching degree between the first commodity and each candidate commodity. Therefore, the multidimensional characteristics relevant to the commodity attributes of the first commodity are extracted from the first description information corresponding to the first commodity, the commodity library comprising a large number of commodities and the corresponding multidimensional characteristics is established, the commodities matched with the first commodity are extracted from the commodity library through the multidimensional characteristic matching relevant to the commodity attributes, the key characteristics relevant to the commodity attributes are extracted from the description information of the commodities for matching, useless text information irrelevant to the commodity attributes in the description information is filtered, and the accuracy of commodity matching is improved.
Additional aspects and advantages of the present application will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the present application.
Drawings
The foregoing and/or additional aspects and advantages of the present application will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings of which:
fig. 1 is a schematic flowchart of a method for matching a commodity according to an embodiment of the present disclosure;
fig. 2 is a schematic flowchart of another method for matching a product according to an embodiment of the present disclosure;
fig. 3 is a schematic flowchart of another method for matching a product according to an embodiment of the present disclosure;
fig. 4 is a schematic flowchart of another method for matching a product according to an embodiment of the present disclosure;
fig. 5 is a schematic structural diagram of a matching device for goods according to an embodiment of the present disclosure;
fig. 6 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
Reference will now be made in detail to the embodiments of the present application, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to the like or similar elements throughout. The embodiments described below with reference to the drawings are exemplary and intended to be used for explaining the present application and should not be construed as limiting the present application.
To address the problem in the related art that, when commodity titles are matched according to their overall similarity, all text in the titles participates in the similarity calculation and text unrelated to the commodity attributes cannot be filtered out, so that the accuracy of commodity title matching is low, the embodiment of the application provides a commodity matching method.
According to the commodity matching method provided by the embodiment of the application, the multi-dimensional features of the first commodity are extracted from the first description information corresponding to the first commodity; a plurality of candidate commodities are obtained from the commodity library according to the first matching degree between each dimension feature of each commodity in the commodity library and each dimension feature of the first commodity; the second matching degree between the first commodity and each candidate commodity is determined according to the first matching degree between each dimension feature of the first commodity and the corresponding dimension feature of each candidate commodity and the weight corresponding to each dimension; and the commodity matched with the first commodity is extracted from the candidate commodities according to the second matching degree between the first commodity and each candidate commodity. In this way, multi-dimensional features related to the commodity attributes are extracted from the first description information, a commodity library containing a large number of commodities and their multi-dimensional features is established, and the commodities matched with the first commodity are extracted from the library by matching on those attribute-related features. Useless text in the description information that is unrelated to the commodity attributes is thereby filtered out, and the accuracy of commodity matching is improved.
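The two-stage flow summarized above can be illustrated with a minimal Python sketch. It is not the claimed implementation: the weights, the exact-match scoring rule in `first_matching_degree`, the candidate rule (`min_hits`) and the acceptance threshold (`accept`) are all assumptions made for illustration, since the text only states that each dimension has a matching degree and a weight.

```python
from typing import Dict, List, Tuple

Features = Dict[str, str]

# Hypothetical weights: the text only states that each dimension has a weight.
WEIGHTS: Dict[str, float] = {
    "category": 0.4, "brand": 0.3, "model": 0.2, "performance": 0.1,
}


def first_matching_degree(a: str, b: str) -> float:
    """Assumed per-dimension score: 1.0 on a case-insensitive exact match, else 0.0."""
    return 1.0 if a and b and a.lower() == b.lower() else 0.0


def dimension_scores(query: Features, item: Features) -> Dict[str, float]:
    """First matching degree for every weighted dimension."""
    return {
        dim: first_matching_degree(query.get(dim, ""), item.get(dim, ""))
        for dim in WEIGHTS
    }


def match_commodity(
    query: Features,
    library: Dict[str, Features],
    min_hits: int = 1,
    accept: float = 0.8,
) -> List[Tuple[str, float]]:
    """Return (item_id, second matching degree) pairs, best first."""
    results = []
    for item_id, item in library.items():
        scores = dimension_scores(query, item)
        # Stage 1 (assumed rule): keep items that agree on at least min_hits dimensions.
        if sum(1 for s in scores.values() if s > 0) < min_hits:
            continue
        # Stage 2: second matching degree = weighted sum of first matching degrees.
        second = sum(WEIGHTS[dim] * s for dim, s in scores.items())
        if second >= accept:
            results.append((item_id, second))
    return sorted(results, key=lambda pair: pair[1], reverse=True)
```

Separating the cheap candidate filter from the weighted score keeps the second stage limited to a small candidate set, which matches the order of steps described in the text.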
Hereinafter, a matching method, an apparatus, an electronic device, a storage medium, and a computer program for a product provided by the present application will be described in detail with reference to the accompanying drawings.
Fig. 1 is a schematic flow chart of a method for matching a commodity according to an embodiment of the present disclosure.
As shown in fig. 1, the matching method of the commodities comprises the following steps:
step 101, acquiring first description information corresponding to a first commodity.
The matching method of the product according to the embodiment of the present application may be executed by the matching device of the product according to the embodiment of the present application. The matching device of the commodity of the embodiment of the application can be configured in any electronic equipment to execute the matching method of the commodity of the embodiment of the application.
For example, the commodity matching device according to the embodiment of the present application may be configured in a server corresponding to any e-commerce service, so that when an application needs to match commodities, it may send a commodity matching request to the server, and the server returns a commodity matching result to the application.
The first commodity is the commodity for which matching commodities need to be determined.
The first description information may refer to information describing the first product, such as a product title and a product introduction corresponding to the first product, which is not limited in the embodiment of the present application.
In this embodiment, the server may perform analysis processing on the product matching request when obtaining the product matching request, so as to determine first description information corresponding to the first product included in the product matching request. The commodity matching request may be sent to the server by the user through the client, or may be automatically triggered by the server according to a preset rule, which is not limited in the embodiment of the present application.
For example, if a user of the client needs to search for a commodity, the user inputs first description information corresponding to a first commodity in a search box of the client; after the search is triggered, the client may generate a commodity matching request according to the first description information and send the request to the server. The server can then parse the acquired commodity matching request to obtain the first description information corresponding to the first commodity. For example, the first description information corresponding to the first commodity may be "[In stock, ships same day] Huawei Mate 9 mobile phone, Mocha Gold, all-network, 4GB+32GB".
Step 102, identifying the first description information to determine a multi-dimensional feature of the first commodity.
The multi-dimensional features of the first commodity may include at least two of a category, a brand, a model, and a performance description.
In the embodiment of the application, a rule for identifying the first description information and the multi-dimensional features of the first commodity to be extracted may be preset. For example, the multi-dimensional features of the first commodity may be the two features of category and brand, or the four features of category, brand, model, and performance description.
It should be noted that, in actual use, the rule for extracting features from the first description information and the multi-dimensional features of the first commodity to be extracted may be set according to actual needs and specific application scenarios, which is not limited in this embodiment of the present application. The following description takes category, brand, model, and performance description as the multi-dimensional features by way of example.
As a possible implementation manner, the pre-trained classification model may be used to identify the first description information corresponding to the first commodity, so as to determine the category to which the first commodity belongs.
Furthermore, since the types of the commodities are many and there are many similar commodity categories (such as earphones/headsets, bluetooth earphones, and mobile phone earphones), in order to improve the accuracy of commodity category identification, a classification model capable of performing multi-level classification on the commodities can be trained, that is, the similar commodity categories can be clustered, and the clustered commodity categories are subjected to the secondary classification. That is, in a possible implementation manner of this embodiment of the present application, step 102 may include:
inputting the first description information into a primary classification model to determine a target primary category to which the first commodity belongs;
and under the condition that the target primary category to which the first commodity belongs contains the subcategory, inputting the first description information into a secondary classification model corresponding to the target primary category to determine the target secondary category to which the first commodity belongs.
In this embodiment of the application, first description information corresponding to the first commodity may be first input into a pre-trained primary classification model, so that the primary classification model outputs a target primary category to which the first commodity belongs. If the target first-level category to which the first commodity belongs does not include the sub-category, the target first-level category to which the first commodity belongs may be determined as the category to which the first commodity belongs.
Accordingly, if the target primary category to which the first item belongs includes a sub-category, the determination of the target secondary category to which the first item belongs may continue. Specifically, the first description information corresponding to the first commodity may be continuously input into the second-level classification model corresponding to the target first-level category, so that the second-level classification model corresponding to the target first-level category outputs the target second-level category to which the first commodity belongs, and the target second-level category to which the first commodity belongs is determined as the category to which the first commodity belongs.
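The two-level inference described above can be sketched as follows. This is an illustrative sketch only: the classifiers are represented as plain callables, and `toy_primary`, `toy_earphone_secondary` and `SECONDARY_MODELS` are hypothetical keyword-based stand-ins for the pre-trained models, which the text does not specify.

```python
from typing import Callable, Dict

Classifier = Callable[[str], str]


def classify_category(
    description: str,
    primary_model: Classifier,
    secondary_models: Dict[str, Classifier],
) -> str:
    """Return the category of a commodity using primary then secondary classification."""
    primary = primary_model(description)
    secondary_model = secondary_models.get(primary)
    if secondary_model is None:
        # The primary category has no subcategory: it is the final category.
        return primary
    # Otherwise the secondary model for this primary category decides.
    return secondary_model(description)


# Hypothetical stand-ins for trained models (keyword rules, for illustration only).
def toy_primary(description: str) -> str:
    return "earphones" if "earphone" in description.lower() else "mobile phone"


def toy_earphone_secondary(description: str) -> str:
    text = description.lower()
    return "bluetooth earphones" if "bluetooth" in text else "wired earphones"


SECONDARY_MODELS: Dict[str, Classifier] = {"earphones": toy_earphone_secondary}
```

Only primary categories that contain subcategories need an entry in the secondary-model mapping; all other primary categories are returned directly.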
Furthermore, when the multi-level classification model is trained, the initial classification model can be preliminarily trained through the training data set, and each class which is easy to be confused is clustered according to the confusion probability of the classification model after the preliminary training to each class so as to generate the multi-level classification model. That is, in a possible implementation manner of the embodiment of the present application, a primary classification model and a secondary classification model may be generated through training in the following steps:
acquiring a training data set and a testing data set, wherein the training data set and the testing data set respectively comprise description information of a plurality of commodities and a label category corresponding to each commodity;
training the initial classification model by using a training data set to generate a first classification model;
processing the description information of each commodity in the test data set by using a first classification model to determine the prediction probability of each commodity belonging to each category;
determining confusion probability among all classes according to the labeling class corresponding to each commodity and the prediction probability belonging to each class;
clustering the categories with the confusion probability larger than a threshold value to generate a first-level category;
training the initial classification model by using each clustered primary class and the corresponding description information respectively to generate a primary classification model;
and training the initial classification model by using each labeled class and description information contained in each primary class to generate a secondary classification model corresponding to each primary class.
In the embodiment of the application, description information of a large number of commodities can be acquired, and the description information of each commodity is labeled according to the category of each commodity to generate a labeling category corresponding to each commodity, that is, the description information of one commodity and the labeling category corresponding to the description information of one commodity can form a piece of training data or test data, and a training data set and a test data set can be generated according to the description information of a large number of commodities and the labeling categories corresponding to the description information of a large number of commodities. The training data in the training data set and the test data in the test data set can be different, so that the training precision of the classification model is improved.
In the embodiment of the application, description information of each commodity in a training data set can be firstly sequentially input into an initial classification model, so that the initial classification model outputs a prediction category corresponding to each commodity, a loss value of the initial classification model is further determined according to the difference between the prediction category corresponding to each commodity and a labeling category, and when the loss value of the initial classification model is greater than a loss threshold value, parameters of the initial classification model are updated according to the loss value of the initial classification model; and then, continuously training the updated classification model by using the training data set until the loss value of the updated classification model is less than or equal to the loss threshold value, and determining the updated classification model as the first classification model.
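The stopping rule described above (keep updating parameters while the loss exceeds a loss threshold) can be sketched with a small pure-Python softmax classifier over word tokens. The model type, the cross-entropy loss, the learning rate `lr`, and the `max_rounds` safety cap are assumptions for illustration; the text does not specify the classification model or its loss function.

```python
import math
from typing import Dict, List, Tuple

Weights = Dict[str, Dict[str, float]]


def predict_proba(weights: Weights, classes: List[str], tokens: List[str]) -> List[float]:
    """Softmax over per-class sums of token weights."""
    scores = [sum(weights[c].get(t, 0.0) for t in tokens) for c in classes]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def train_until_threshold(
    samples: List[List[str]],
    labels: List[str],
    loss_threshold: float = 0.1,
    lr: float = 0.5,
    max_rounds: int = 5000,
) -> Tuple[Weights, List[str], float]:
    """Batch gradient descent that stops once the mean loss is <= loss_threshold."""
    classes = sorted(set(labels))
    weights: Weights = {c: {} for c in classes}
    loss = float("inf")
    for _ in range(max_rounds):
        grads: Weights = {c: {} for c in classes}
        loss = 0.0
        for tokens, label in zip(samples, labels):
            probs = predict_proba(weights, classes, tokens)
            loss -= math.log(probs[classes.index(label)])
            for c, p in zip(classes, probs):
                g = p - (1.0 if c == label else 0.0)
                for t in tokens:
                    grads[c][t] = grads[c].get(t, 0.0) + g
        loss /= len(samples)
        if loss <= loss_threshold:
            break  # the current parameters already satisfy the threshold
        # Loss is still above the threshold: update parameters and train again.
        for c in classes:
            for t, g in grads[c].items():
                weights[c][t] = weights[c].get(t, 0.0) - lr * g / len(samples)
    return weights, classes, loss
```

The loss is checked before each update, so the returned parameters are the ones that produced the returned loss value.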
In the embodiment of the application, after the first classification model is generated by training, the first classification model may be tested by using a test data set to determine the confusion probability of the first classification model for each commodity category, and then the primary category and the secondary category are determined according to the confusion probability. Therefore, the description information of each commodity in the test data set can be input into the first classification model, so that the first classification model outputs the prediction probability that each commodity belongs to each category, and the confusion probability among the categories is further determined according to the labeling category corresponding to each commodity and the prediction probability that each commodity belongs to each category. If the confusion probability among the multiple categories is greater than the threshold value, determining that the multiple categories are similar categories, and clustering the multiple categories to generate a first-level category; accordingly, if the confusion probability between one category and the other categories is less than or equal to the threshold, it may be determined that there is no category similar to the category, that is, the category may be directly determined as a primary category.
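One way to turn per-commodity prediction probabilities into pairwise confusion probabilities is sketched below. Averaging the predicted distributions of all test commodities that share a labeled category is an assumption made here; the text states only that the confusion probability is derived from the labeled categories and the prediction probabilities. The function names are hypothetical.

```python
from typing import List, Sequence


def mean_prediction_matrix(
    classes: Sequence[str],
    labels: Sequence[str],
    probabilities: Sequence[Sequence[float]],
) -> List[List[float]]:
    """Row i is the mean predicted distribution of test items labeled classes[i]."""
    index = {c: i for i, c in enumerate(classes)}
    sums = [[0.0] * len(classes) for _ in classes]
    counts = [0] * len(classes)
    for label, probs in zip(labels, probabilities):
        row = index[label]
        counts[row] += 1
        for col, p in enumerate(probs):
            sums[row][col] += p
    return [
        [v / counts[r] if counts[r] else 0.0 for v in sums[r]]
        for r in range(len(classes))
    ]


def confusion_probability(
    matrix: Sequence[Sequence[float]], classes: Sequence[str], x: str, y: str
) -> float:
    """Mean of P(predicted y | labeled x) and P(predicted x | labeled y)."""
    i, j = classes.index(x), classes.index(y)
    return (matrix[i][j] + matrix[j][i]) / 2
```

Because the two directions are averaged, the resulting confusion probability is symmetric in the two categories.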
For example, suppose the first classification model is a trained full-category classification model that can predict six categories: A, B, C, D, E, and F. Table 1 shows the confusion matrix generated by testing the first classification model with the test data set. The first column of Table 1 gives the labeled category of the commodities, and the values in each row give the prediction probabilities, output by the first classification model, that commodities of that labeled category belong to each category. For example, a commodity labeled as category A has a prediction probability of 0.75 for category A, 0.15 for category B, 0.03 for category C, and so on; a commodity labeled as category B has a prediction probability of 0.2 for category A, 0.7 for category B, and so on. After the confusion matrix shown in Table 1 is obtained, the average of the prediction probability that a commodity labeled X belongs to category Y and the prediction probability that a commodity labeled Y belongs to category X may be determined as the confusion probability between category X and category Y. That is, the confusion probability between category A and category B is (0.15+0.2)/2 = 0.175, and the confusion probability between category E and category F is (0.2+0.25)/2 = 0.225.
Assuming that the threshold is 0.1, according to the confusion matrix shown in Table 1, the confusion probability between category A and category B and the confusion probability between category E and category F are both greater than the threshold, while the confusion probabilities between category C or category D and every other category are all less than the threshold. Category A and category B can therefore be clustered to generate a primary category AB, and category E and category F can be clustered to generate a primary category EF, so that the generated primary categories are AB, C, D, and EF.
TABLE 1
Labeled \ Predicted   A      B      C      D      E      F
A                     0.75   0.15   0.03   0.03   0.03   0.01
B                     0.2    0.7    0.02   0.04   0.01   0.03
C                     0.03   0.04   0.90   0.03   0      0
D                     0      0      0.03   0.95   0.02   0
E                     0      0      0.05   0.05   0.7    0.2
F                     0      0      0.05   0      0.25   0.7
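The clustering of easily confused categories can be reproduced from the Table 1 values with a short sketch. Merging is done here with a union-find structure, which also merges categories transitively (if X is confusable with Y and Y with Z, all three end up together); that transitive behavior is an assumption, as the text does not say how chains of confusable categories are handled. `cluster_confusable` is a hypothetical name.

```python
from typing import List, Sequence

CLASSES = ["A", "B", "C", "D", "E", "F"]

# Values copied from Table 1: rows are labeled categories, columns are predicted.
TABLE_1 = [
    [0.75, 0.15, 0.03, 0.03, 0.03, 0.01],
    [0.20, 0.70, 0.02, 0.04, 0.01, 0.03],
    [0.03, 0.04, 0.90, 0.03, 0.00, 0.00],
    [0.00, 0.00, 0.03, 0.95, 0.02, 0.00],
    [0.00, 0.00, 0.05, 0.05, 0.70, 0.20],
    [0.00, 0.00, 0.05, 0.00, 0.25, 0.70],
]


def cluster_confusable(
    classes: Sequence[str], matrix: Sequence[Sequence[float]], threshold: float
) -> List[List[str]]:
    """Group categories whose pairwise confusion probability exceeds the threshold."""
    parent = list(range(len(classes)))

    def find(i: int) -> int:
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i in range(len(classes)):
        for j in range(i + 1, len(classes)):
            confusion = (matrix[i][j] + matrix[j][i]) / 2
            if confusion > threshold:
                parent[find(j)] = find(i)

    groups = {}
    for i, name in enumerate(classes):
        groups.setdefault(find(i), []).append(name)
    return list(groups.values())
```

Each returned group corresponds to one primary category; a group with a single member is a category that was not confusable with any other.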
In the embodiment of the application, after each primary category is determined, the labeled category of each commodity in the training data set can be replaced with the primary category that contains it. For example, if the labeled categories in the training data set are A, B, C, D, E, and F, and the generated primary categories are AB, C, D, and EF, the labeled categories of the commodities labeled A or B may be modified to AB, and those of the commodities labeled E or F may be modified to EF. Then, the description information of each commodity in the relabeled training data set can be input into the initial classification model, the loss value of the initial classification model is determined according to the difference between the prediction category output for each commodity and the modified labeled category, and the parameters of the initial classification model are updated according to the loss value. Training continues until the loss value of the updated classification model is less than or equal to the loss threshold, at which point the updated classification model is determined as the primary classification model.
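The relabeling step for the primary classification model can be sketched as below. Naming a primary category by concatenating its member labels (for example "AB") follows the example in the text; the function name and the data layout (a list of description/label pairs) are assumptions.

```python
from typing import List, Sequence, Tuple

Example = Tuple[str, str]  # (description, labeled category)


def relabel_to_primary(
    dataset: Sequence[Example], primary_groups: Sequence[Sequence[str]]
) -> List[Example]:
    """Replace each labeled category with the primary category that contains it."""
    mapping = {
        label: "".join(group) for group in primary_groups for label in group
    }
    return [(description, mapping[label]) for description, label in dataset]
```

The relabeled data set can then be used to train the primary classification model in the same way as the first classification model.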
In the embodiment of the present application, after each primary category is determined, for each primary category, training data corresponding to the primary category may be obtained from a training data set according to each labeled category included in the primary category. For example, for the primary class AB, the labeled classes included in the primary class AB are class a and class B, the description information and labeled classes of the commodities labeled as class a and class B may be obtained from the training data set, and used as the training data corresponding to the primary class AB. And then, inputting the description information of each commodity in the training data corresponding to the primary class into an initial classification model, determining the loss value of the initial classification model according to the difference between the prediction class and the labeled class corresponding to each commodity output by the initial classification model, updating the parameters of the initial classification model according to the loss value, continuing training the updated classification model by using the training data corresponding to the primary class until the loss value of the updated classification model is less than or equal to a loss value threshold value, and determining the updated classification model as a secondary classification model corresponding to the primary class. Correspondingly, according to the same training mode as the above, the two-stage classification models corresponding to the respective one-stage classes can be sequentially trained and generated.
It should be noted that, if the primary class does not include multiple labeled classes, it may be determined that the primary class does not include the secondary class, and the secondary classification model corresponding to the primary class may not be trained.
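The data preparation for the two training stages can be sketched as follows. This is only an illustrative sketch: the `PRIMARY_CATEGORIES` grouping repeats the A to F example above, while the function names and the `(description, label)` tuple layout are assumptions made for the illustration and are not specified in the application.

```python
# Grouping of labeled categories into primary categories (A..F example from the text).
PRIMARY_CATEGORIES = {
    "AB": ["A", "B"],
    "C": ["C"],
    "D": ["D"],
    "EF": ["E", "F"],
}


def build_label_map(primary_categories):
    """Invert {primary: [labeled, ...]} into {labeled: primary}."""
    return {
        labeled: primary
        for primary, members in primary_categories.items()
        for labeled in members
    }


def relabel_for_primary_model(training_set, primary_categories):
    """Replace each labeled category with its primary category."""
    label_map = build_label_map(primary_categories)
    return [(description, label_map[label]) for description, label in training_set]


def data_for_secondary_model(training_set, primary_categories, primary):
    """Select samples, with their original labels, for one primary category.

    Returns an empty list when the primary category holds a single labeled
    category, since no secondary model is trained in that case.
    """
    members = set(primary_categories[primary])
    if len(members) < 2:
        return []
    return [(d, label) for d, label in training_set if label in members]


# Hypothetical training samples: (description information, labeled category).
samples = [("item 1", "A"), ("item 2", "B"), ("item 3", "C"), ("item 4", "E")]
primary_data = relabel_for_primary_model(samples, PRIMARY_CATEGORIES)
secondary_ab = data_for_secondary_model(samples, PRIMARY_CATEGORIES, "AB")
secondary_c = data_for_secondary_model(samples, PRIMARY_CATEGORIES, "C")
```

The relabeled list would feed the primary classification model, and each non-empty subset would feed the secondary classification model of its primary category.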
As a possible implementation manner, a brand recognition model can be trained and a brand dictionary prepared in advance, and the brand of the first commodity is determined by combining the dictionary with the recognition model. For example, word segmentation may be performed on the first description information corresponding to the first commodity to determine the word segments it contains, each word segment is then matched against the preset brand dictionary, and a word segment that is contained in the preset brand dictionary is determined as the brand of the first commodity. If the preset brand dictionary does not contain any word segment of the first description information, the first description information can be input into the pre-trained brand recognition model, so that the brand of the first commodity is generated by the brand recognition model.
It should be noted that, after the brand of the first commodity is generated through the brand recognition model, if the preset brand dictionary does not contain the generated brand of the first commodity, the generated brand of the first commodity may be added into the preset brand dictionary to continuously perfect the preset brand dictionary, and accuracy and efficiency of brand recognition are improved.
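A minimal sketch of the dictionary-first lookup with a model fallback is given below. The `tokenize` and `recognize_brand` callables are placeholders for the word segmentation step and the pre-trained brand recognition model, neither of which is specified here; the stand-ins used in the example are purely hypothetical.

```python
def identify_brand(description, brand_dictionary, tokenize, recognize_brand):
    """Return the brand found in the dictionary, else fall back to the model.

    brand_dictionary is a mutable set of known brand names. When the model
    produces a brand that the dictionary lacks, the brand is added so that
    later lookups can be answered by the dictionary alone.
    """
    for token in tokenize(description):
        if token in brand_dictionary:
            return token

    brand = recognize_brand(description)  # placeholder for the trained model
    if brand and brand not in brand_dictionary:
        brand_dictionary.add(brand)
    return brand


# Hypothetical stand-ins: whitespace tokenization and a fixed-answer "model".
known_brands = {"Huawei"}
from_dictionary = identify_brand("Huawei mate9 phone", known_brands, str.split, lambda s: None)
from_model = identify_brand("Konka refrigerator", known_brands, str.split, lambda s: "Konka")
```

After the second call the dictionary also contains "Konka", which mirrors the dictionary-perfecting step described above.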
As a possible implementation manner, the model and the performance description included in the first description information may be extracted by a preset regular expression. Optionally, the first description information may be matched according to a regular expression corresponding to the performance description, so as to extract the performance description included in the first description information; then, the extracted performance description may be removed from the first description information, and the remaining information may be matched according to a regular expression corresponding to the model, so as to extract the model included in the first description information.
As an example, two types of features may be included in the performance description: quantitative description and qualitative description. For example, quantitative descriptions may include: mobile phone memory: 4G +128G, camera focal length: 20-50mm, refrigerator capacity: 80 liters, etc.; qualitative descriptions may include: focusing mode of the camera: zooming and focusing, and a refrigerator: double door, single door, etc. Therefore, when the performance description included in the first description information is extracted, the quantitative description and the qualitative description may be sequentially extracted. For example, the first description information may be matched according to a regular expression corresponding to the quantitative description, so as to extract the quantitative description of the first commodity contained in the first description information; and then removing the quantitative description from the first description information, and matching the residual information after the quantitative description is removed according to a regular expression corresponding to the qualitative description so as to extract the qualitative description of the first commodity from the first description information.
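The sequential extraction (quantitative description first, then qualitative description from what remains) can be sketched as below. The two regular expressions are illustrative assumptions that cover only the examples mentioned above; the application does not disclose its actual expressions.

```python
import re

# Assumed patterns: a number (or range) followed by a unit, and a few fixed phrases.
QUANTITATIVE_RE = re.compile(r"\d+(?:-\d+)?\s?(?:GB|G|mm|liters)\b", re.IGNORECASE)
QUALITATIVE_RE = re.compile(r"double door|single door|zoom", re.IGNORECASE)


def extract_and_remove(pattern, text):
    """Return all matches of pattern plus the text with those matches removed."""
    matches = pattern.findall(text)
    remaining = " ".join(pattern.sub(" ", text).split())
    return matches, remaining


def extract_performance(description):
    """Extract quantitative, then qualitative descriptions; also return the leftover text."""
    quantitative, rest = extract_and_remove(QUANTITATIVE_RE, description)
    qualitative, rest = extract_and_remove(QUALITATIVE_RE, rest)
    return quantitative, qualitative, rest


fridge = extract_performance("refrigerator double door 80 liters")
camera = extract_performance("camera 20-50mm zoom")
```

The leftover text returned as the third element is what a model-number expression would then be matched against.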
It should be noted that, when extracting the multi-dimensional features such as the category, the brand, the model, the performance description, and the like of the first commodity from the first description information, the extraction order of the features is not limited in the embodiment of the present application. In actual use, the extraction sequence of each dimensional feature can be flexibly set according to actual needs and specific application scenes.
Step 103, obtaining a plurality of candidate commodities from the commodity library according to a first matching degree between each dimension feature of each commodity in the commodity library and each dimension feature of the first commodity.
The commodity library may be a database containing mapping relationships between description information of commodities and dimensional features corresponding to the commodities.
A candidate commodity may be a commodity in the commodity library for which the first matching degree between each of its dimension features and the corresponding dimension feature of the first commodity is greater than or equal to the matching degree threshold.
In this embodiment of the present application, feature extraction may be performed on the description information of each commodity in the commodity library according to the manner of extracting each dimension feature of the first commodity disclosed in the above step, so as to determine a multidimensional feature of each commodity in the commodity library, and store the description information of the commodity and the multidimensional feature corresponding to the description information in the commodity library in a corresponding manner.
In this embodiment, for each commodity in the commodity library, a first matching degree between each dimension feature of the commodity and the corresponding dimension feature of the first commodity may be determined, so as to determine whether the commodity is a commodity with a higher multi-dimensional feature similarity to the first commodity according to the first matching degree between each dimension feature of the commodity and the corresponding dimension feature of the first commodity, that is, whether the commodity may be used as a candidate commodity.
As a possible implementation manner, if the first matching degrees between the features of each dimension of one commodity in the commodity library and the features of the corresponding dimension of the first commodity are all greater than or equal to the matching degree threshold, it may be determined that the features of each dimension of the commodity are the same as or similar to the features of each dimension of the first commodity, and thus the commodity may be determined as a candidate commodity. Correspondingly, for a commodity in the commodity library, if the first matching degree between any one-dimensional feature of the commodity and the corresponding dimensional feature of the first commodity is smaller than the threshold value of the matching degree, it may be determined that different features exist between the commodity and the first commodity, and thus the commodity may not be determined as a candidate commodity.
For example, suppose the multi-dimensional features of a commodity include category, brand, model and performance description, and the matching degree threshold is 0.9. If the first matching degree between the category of the first commodity and the category of commodity A in the commodity library is 1, the first matching degree between the brand of the first commodity and the brand of commodity A is 1, the first matching degree between the model of the first commodity and the model of commodity A is 0.95, and the first matching degree between the performance description of the first commodity and the performance description of commodity A is 0.92, commodity A can be determined as a candidate commodity. If the first matching degree between the category of the first commodity and the category of commodity B in the commodity library is 0.9, the first matching degree between the brand of the first commodity and the brand of commodity B is 0.5, the first matching degree between the model of the first commodity and the model of commodity B is 0.3, and the first matching degree between the performance description of the first commodity and the performance description of commodity B is 0.1, it may be determined that commodity B is not a candidate commodity.
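The all-dimensions threshold test in this example can be written as a short sketch. The dictionary keys are illustrative names, and the second value listed for commodity B is read here as its brand matching degree.

```python
MATCH_THRESHOLD = 0.9  # matching degree threshold from the example


def is_candidate(first_matching_degrees, threshold=MATCH_THRESHOLD):
    """A commodity is a candidate only if every dimension meets the threshold."""
    return all(degree >= threshold for degree in first_matching_degrees.values())


commodity_a = {"category": 1.0, "brand": 1.0, "model": 0.95, "performance": 0.92}
commodity_b = {"category": 0.9, "brand": 0.5, "model": 0.3, "performance": 0.1}

a_is_candidate = is_candidate(commodity_a)  # True: all four degrees are >= 0.9
b_is_candidate = is_candidate(commodity_b)  # False: brand, model and performance fall short
```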
As an example, the commodity library may be used for performing preliminary screening, so that the feature corresponding to the commodity in the commodity library may only include a part of the dimensional feature of the first commodity, so as to reduce the data amount of the commodity library and improve the efficiency of matching the candidate commodities. Therefore, when the feature corresponding to the product in the product library may include only the partial dimension feature of the first product, the plurality of candidate products may be acquired from the product library according to the first matching degree between each dimension feature of each product in the product library and the partial dimension feature corresponding to the first product. Furthermore, after a plurality of candidate commodities are acquired, feature extraction may be performed on the description information corresponding to each candidate commodity to extract features that are not included in the candidate commodity but included in the first commodity, and further, a first matching degree between each newly extracted dimensional feature corresponding to the candidate commodity and the corresponding dimensional feature of the first commodity may be determined.
For example, assuming that the multidimensional features of the first commodity include category, brand, model and performance descriptions, the commodity library only includes mapping relationships between the description information of the commodity and the category and brand of the commodity, so that a plurality of candidate commodities can be obtained from the commodity library according to a first matching degree between the category of each commodity in the commodity library and the category of the first commodity and a first matching degree between the brand of each commodity in the commodity library and the brand of the first commodity. Then, according to the mode of extracting the model and the performance description of the first commodity, the model and the performance description of each candidate commodity can be extracted from the description information of each candidate commodity; further, a first degree of matching between the model of each candidate commodity and the model of the first commodity and a first degree of matching between the performance description of each candidate commodity and the performance description of the first commodity can be determined.
Step 104, determining a second matching degree of the first commodity and each candidate commodity according to the first matching degree between each dimension characteristic of the first commodity and the corresponding dimension characteristic of each candidate commodity and the weight corresponding to each dimension characteristic.
The second matching degree between the first commodity and the candidate commodity may be an overall similarity between the first commodity and the candidate commodity.
In this embodiment of the application, different weights may be assigned to the dimension features according to how important each dimension feature is for measuring the similarity between commodities. The second matching degree between the first commodity and each candidate commodity is then determined according to the first matching degree between each dimension feature of the first commodity and the corresponding dimension feature of the candidate commodity and the weight corresponding to each dimension feature, so that the second matching degree characterizes the overall similarity between the first commodity and the candidate commodity.
As an example, for a candidate commodity, the weighted average of the first matching degrees between each dimension feature of the first commodity and the corresponding dimension feature of the candidate commodity may be determined as the second matching degree between the first commodity and the candidate commodity. That is, the second matching degree between the first commodity and the candidate commodity may be determined by equation (1).
$$S_i = \sum_{j=1}^{n} \omega_j A_{ij} \qquad (1)$$
where $S_i$ is the second matching degree between the first commodity and the $i$-th candidate commodity, $i$ is the serial number of the candidate commodity, $A_{ij}$ is the first matching degree between the $j$-th dimension feature of the first commodity and the $j$-th dimension feature of the $i$-th candidate commodity, $\omega_j$ is the weight of the $j$-th dimension feature, $j$ is the serial number of the dimension feature, and $n$ is the number of dimension features.
For example, the multi-dimensional features of the first commodity are category, brand, model, quantitative description and qualitative description, and the weights of the category, brand, model, quantitative description and qualitative description are 0.3, 0.3, 0.2, 0.1 and 0.1, respectively. Suppose that the description information of the first commodity is "[In stock, ships same day] Huawei mate9 mobile phone mocha gold full network access 4GB+32GB", and the multi-dimensional features of the first commodity are "category: mobile phone, brand: Huawei, model: mate9, quantitative description: 4g 32g, qualitative description: full network access". The multi-dimensional features of candidate commodity A obtained from the commodity library are "category: mobile phone, brand: Huawei, model: mate9, quantitative description: 4g 32g, qualitative description: full network access"; the multi-dimensional features of candidate commodity B are "category: mobile phone, brand: Huawei, model: v30, quantitative description: 6g 128g, qualitative description: full network access". The first matching degrees between the dimension features of the first commodity and the corresponding dimension features of candidate commodity A are 1, 1, 1, 1 and 1, respectively; the first matching degrees between the dimension features of the first commodity and the corresponding dimension features of candidate commodity B are 1, 1, 0, 0 and 1, respectively. Thus, the second matching degree between the first commodity and candidate commodity A is 1 × 0.3 + 1 × 0.3 + 1 × 0.2 + 1 × 0.1 + 1 × 0.1 = 1, and the second matching degree between the first commodity and candidate commodity B is 1 × 0.3 + 1 × 0.3 + 0 × 0.2 + 0 × 0.1 + 1 × 0.1 = 0.7.
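The arithmetic of this example can be checked with a short sketch. The weights and first matching degrees are the ones that appear in the two sums above; the feature names and the function name are illustrative only.

```python
FEATURES = ("category", "brand", "model", "quantitative", "qualitative")

# Weights taken from the worked sums in the example; they add up to 1.
WEIGHTS = dict(zip(FEATURES, (0.3, 0.3, 0.2, 0.1, 0.1)))


def second_matching_degree(first_matching_degrees, weights):
    """Weighted sum of the per-dimension first matching degrees."""
    return sum(weights[name] * first_matching_degrees[name] for name in weights)


candidate_a = dict(zip(FEATURES, (1, 1, 1, 1, 1)))  # same model and memory
candidate_b = dict(zip(FEATURES, (1, 1, 0, 0, 1)))  # different model and memory

score_a = second_matching_degree(candidate_a, WEIGHTS)
score_b = second_matching_degree(candidate_b, WEIGHTS)
```

Up to floating-point rounding, `score_a` and `score_b` agree with the values 1 and 0.7 obtained in the example.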
As an example, when candidate commodities are retrieved from the commodity library using only part of the dimension features (such as category and brand), a search score of each candidate commodity in the commodity library may further be determined according to the first matching degree between each dimension feature of the candidate commodity and the corresponding dimension feature of the first commodity. The search score of the candidate commodity in the commodity library may then be incorporated when the second matching degree between the first commodity and each candidate commodity is determined. For example, the weighted average of the first matching degrees between each dimension feature of the first commodity and the corresponding dimension feature of the candidate commodity, together with the search score of the candidate commodity in the commodity library, may be determined as the second matching degree between the first commodity and the candidate commodity. That is, the second matching degree between the first commodity and the candidate commodity may be determined by equation (2).
$$S_i = \sum_{j=1}^{n} \omega_j A_{ij} + \omega_{es}\, es_i \qquad (2)$$
where $S_i$ is the second matching degree between the first commodity and the $i$-th candidate commodity, $i$ is the serial number of the candidate commodity, $A_{ij}$ is the first matching degree between the $j$-th dimension feature of the first commodity and the $j$-th dimension feature of the $i$-th candidate commodity, $\omega_j$ is the weight of the $j$-th dimension feature, $j$ is the serial number of the dimension feature, $n$ is the number of dimension features, $es_i$ is the search score of the $i$-th candidate commodity in the commodity library, and $\omega_{es}$ is the weight of the search score of the candidate commodity.
For example, the multi-dimensional features of the first commodity are category, brand, model, quantitative description and qualitative description, the weights of the category, brand, model, quantitative description and qualitative description are 0.2, 0.2, 0.2, 0.1 and 0.1, respectively, and the weight of the search score of the candidate commodity in the commodity library is 0.2. Suppose that the description information of the first commodity is "[In stock, ships same day] Huawei mate9 mobile phone mocha gold full network access 4GB+32GB", and the multi-dimensional features of the first commodity are "category: mobile phone, brand: Huawei, model: mate9, quantitative description: 4g 32g, qualitative description: full network access". The multi-dimensional features of candidate commodity A obtained from the commodity library are "category: mobile phone, brand: Huawei, model: mate9, quantitative description: 4g 32g, qualitative description: full network access"; the multi-dimensional features of candidate commodity B are "category: mobile phone, brand: Huawei, model: v30, quantitative description: 6g 128g, qualitative description: full network access". The first matching degrees between the dimension features of the first commodity and the corresponding dimension features of candidate commodity A are 1, 1, 1, 1 and 1, respectively, and the search score of candidate commodity A in the commodity library is 0.821; the first matching degrees between the dimension features of the first commodity and the corresponding dimension features of candidate commodity B are 1, 1, 0, 0 and 1, respectively, and the search score of candidate commodity B in the commodity library is 0.223.
Thus, the second matching degree between the first commodity and candidate commodity A is 1 × 0.2 + 1 × 0.2 + 1 × 0.2 + 1 × 0.1 + 1 × 0.1 + 0.821 × 0.2 = 0.9642, or approximately 0.964, and the second matching degree between the first commodity and candidate commodity B is 1 × 0.2 + 1 × 0.2 + 0 × 0.2 + 0 × 0.1 + 1 × 0.1 + 0.223 × 0.2 = 0.5446, or approximately 0.545.
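A sketch of the same calculation with the search score folded in follows. The weights, first matching degrees and search scores are those used in the sums above; the function and variable names are assumptions for illustration.

```python
FEATURES = ("category", "brand", "model", "quantitative", "qualitative")
FEATURE_WEIGHTS = dict(zip(FEATURES, (0.2, 0.2, 0.2, 0.1, 0.1)))
SEARCH_WEIGHT = 0.2  # weight of the candidate's search score in the commodity library


def second_matching_degree_with_search(first_matching_degrees, search_score,
                                       weights=FEATURE_WEIGHTS,
                                       search_weight=SEARCH_WEIGHT):
    """Weighted feature matches plus the weighted search score."""
    feature_part = sum(weights[name] * first_matching_degrees[name] for name in weights)
    return feature_part + search_weight * search_score


candidate_a = dict(zip(FEATURES, (1, 1, 1, 1, 1)))
candidate_b = dict(zip(FEATURES, (1, 1, 0, 0, 1)))

score_a = second_matching_degree_with_search(candidate_a, 0.821)  # about 0.9642
score_b = second_matching_degree_with_search(candidate_b, 0.223)  # about 0.5446
```

Because the five feature weights and the search weight sum to 1, the result stays within the range of the individual scores.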
Step 105, extracting the commodities matched with the first commodity from the candidate commodities according to the second matching degree of the first commodity and each candidate commodity.
In this embodiment of the application, after the second matching degrees of the first commodity and each candidate commodity are determined, the candidate commodities may be sorted in a descending order according to the second matching degrees of the first commodity and each candidate commodity, and the X candidate commodities with the largest second matching degree to the first commodity may be determined as the commodities matched with the first commodity.
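Selecting the X best candidates amounts to a descending sort followed by a slice, as in the sketch below. The mapping from candidate identifier to second matching degree and the identifier values are hypothetical.

```python
def top_matches(second_matching_degrees, x):
    """Return the ids of the x candidates with the largest second matching degree."""
    ranked = sorted(second_matching_degrees.items(),
                    key=lambda item: item[1], reverse=True)
    return [candidate_id for candidate_id, _ in ranked[:x]]


# Hypothetical second matching degrees for three candidates.
matches = top_matches({"A": 1.0, "B": 0.7, "C": 0.85}, 2)  # ["A", "C"]
```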
Further, if there are a plurality of candidate products having the same second matching degree as the first product, the plurality of candidate products may be sorted in descending order according to the respective first matching degrees corresponding to the plurality of candidate products or the search scores in the product library. That is, in a possible implementation manner of the embodiment of the present application, the step 105 may include:
and under the condition that the second matching degrees corresponding to the at least two candidate commodities are the same and are both greater than the second matching degrees corresponding to other candidate commodities, extracting the commodity matched with the first commodity from the at least two candidate commodities according to the first matching degrees corresponding to the at least two candidate commodities respectively.
As an example, the priority of each dimension feature may be preset, so that when the second matching degrees between multiple candidate commodities and the first commodity are the same, those candidate commodities are first sorted in descending order of the first matching degree corresponding to the feature with the highest priority. If some of those candidate commodities also have the same first matching degree for the feature with the highest priority, they are further sorted in descending order of the first matching degree corresponding to the feature with the second highest priority, and so on, until the ranking of all candidate commodities with the same second matching degree is determined. The candidate commodities ranked in the top X among all candidate commodities may then be selected and determined as the commodities matching the first commodity.
As an example, a preset feature may be further specified in advance, so that when the second matching degrees between the multiple candidate commodities and the first commodity are the same, the multiple candidate commodities with the same second matching degree may be sorted in a descending order according to the first matching degree corresponding to the preset feature, and then the candidate commodity with the top X in the sorting order may be selected from all the candidate commodities and determined as the commodity matched with the first commodity. For example, the preset feature may be a category, a search score in a product database, and the like, which is not limited in the embodiment of the present application.
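Priority-based tie-breaking can be sketched with a composite sort key: the second matching degree comes first, followed by the first matching degrees in priority order. The priority order and the candidate record layout below are assumptions for illustration.

```python
# Hypothetical priority order, highest priority first.
PRIORITY = ("category", "brand")


def rank_candidates(candidates, priority=PRIORITY):
    """Sort by second matching degree, then by first matching degrees in priority order.

    Each candidate is a dict with keys 'id', 'second' (second matching degree)
    and 'first' (per-feature first matching degrees). All keys sort descending.
    """
    def sort_key(candidate):
        tie_breakers = tuple(candidate["first"][name] for name in priority)
        return (candidate["second"],) + tie_breakers

    return sorted(candidates, key=sort_key, reverse=True)


candidates = [
    {"id": "c1", "second": 0.9, "first": {"category": 1.0, "brand": 0.8}},
    {"id": "c2", "second": 0.9, "first": {"category": 1.0, "brand": 1.0}},
    {"id": "c3", "second": 0.95, "first": {"category": 0.9, "brand": 0.9}},
]
ranking = [c["id"] for c in rank_candidates(candidates)]  # ["c3", "c2", "c1"]
```

Using a single preset feature as the tie-breaker, as in the second example above, corresponds to passing a one-element `priority` tuple.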
It should be noted that, in actual use, the specific value of X may be determined according to actual needs and the specific application scenario, which is not limited in the embodiment of the present application.
According to the commodity matching method provided by the embodiment of the application, the multi-dimensional features of the first commodity are extracted from the first description information corresponding to the first commodity; candidate commodities are obtained from the commodity library according to the first matching degree between each dimension feature of each commodity in the commodity library and each dimension feature of the first commodity; the second matching degree between the first commodity and each candidate commodity is determined according to the first matching degree between each dimension feature of the first commodity and the corresponding dimension feature of each candidate commodity and the weight corresponding to each dimension feature; and the commodity matched with the first commodity is extracted from the candidate commodities according to the second matching degree between the first commodity and each candidate commodity. In this way, the multi-dimensional features related to the commodity attributes of the first commodity are extracted from its first description information, a commodity library containing a large number of commodities and their multi-dimensional features is established, and the commodities matched with the first commodity are extracted from the commodity library through matching on those multi-dimensional features. Because only the key features related to the commodity attributes are extracted from the description information for matching, useless text information unrelated to the commodity attributes is filtered out, and the accuracy of commodity matching is improved.
In a possible implementation form of the present application, for different categories of commodities, the model numbers and the performance descriptions may have a large difference, so different identification rules may be set for the different categories of commodities, respectively, for identifying the model numbers and the performance descriptions of the commodities, so as to further improve the accuracy of the commodity feature identification.
The matching method of the commodities provided by the embodiment of the present application is further described below with reference to fig. 2.
Fig. 2 is a schematic flow chart of another commodity matching method according to an embodiment of the present disclosure.
As shown in fig. 2, the matching method of the commodities includes the following steps:
Step 201, acquiring first description information corresponding to a first commodity.
The detailed implementation process and principle of step 201 may refer to the detailed description of the above embodiments, and are not described herein again.
Step 202, performing category identification on the first description information to determine a category to which the first commodity belongs.
In the embodiment of the application, the first description information may be identified through a pre-trained classification model to determine the category to which the first commodity belongs. For specific implementation processes and principles, reference may be made to the detailed description of the embodiments, which is not repeated herein.
Step 203, acquiring an identification rule corresponding to the first commodity according to the category to which the first commodity belongs.
In the embodiment of the application, because commodities of different categories may express their models or performance descriptions in very different ways, a separate identification rule can be set for each commodity category and used to identify the models and/or performance descriptions of the commodities of that category. Thus, after the category to which the first commodity belongs is determined, the identification rule corresponding to that category may be looked up and determined as the identification rule corresponding to the first commodity.
As an example, if the model and the performance description in the first description information are determined by regular expression matching, a regular expression corresponding to the model of each category of commodities and a regular expression corresponding to the performance description of each category of commodities may be set to identify the model and the performance description of each category of commodities respectively.
Step 204, matching the first description information based on the identification rule to determine the model and/or the performance description of the first commodity.
In this embodiment of the application, if the multidimensional feature of the first commodity includes the model and the performance description of the first commodity, the identification rule may also include a model identification rule and a performance description identification rule corresponding to the first commodity. Thus, the first description information may be matched based on a model identification rule corresponding to the first commodity included in the identification rule to determine the model of the first commodity; and matching the first description information based on the performance description identification rule corresponding to the first commodity included in the identification rule to determine the performance description of the first commodity.
Furthermore, because the model of the commodity usually consists of letters and numbers, and the performance description of the commodity may also contain letters and numbers, in order to reduce the influence of the performance description of the commodity on the identification of the model of the commodity, the performance description can be removed from the first description information after the performance description of the commodity is determined, and then the model identification of the commodity is performed, so as to further improve the accuracy of the identification of the model of the commodity. That is, in a possible implementation manner of this embodiment of the present application, step 204 may include:
matching the first description information based on the identification rule corresponding to the performance description to determine the performance description of the first commodity;
and matching the rest information except the performance description of the first commodity in the first description information based on the identification rule corresponding to the model to determine the model of the first commodity.
As a possible implementation manner, the first description information may be first matched according to an identification rule corresponding to the performance description included in the identification rule, so as to determine the performance description of the first commodity. Then, the performance description of the first commodity may be removed from the first description information, and then, the remaining information from which the performance description of the first commodity is removed may be matched based on the identification rule corresponding to the model included in the identification rule, so as to determine the model of the first commodity.
For example, if the first description information of the first commodity is "Konka/kanka BCD-386BX4S4 door home energy saving 2-door cross-door refrigerator gold", matching the first description information based on the identification rule corresponding to the performance description determines that the performance description of the first commodity is "4 doors". The remaining information after removing the performance description of the first commodity from the first description information is therefore "Konka/kanka BCD-386BX4S home energy saving 2-door cross-door refrigerator gold", and matching the remaining information based on the identification rule corresponding to the model determines that the model of the first commodity is "BCD-386BX4S". Note that, if the model of the first commodity were extracted directly from the first description information, it would be erroneously recognized as "BCD-386BX4S4".
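The category-specific rule lookup and the performance-before-model ordering can be sketched as below. The rule table, both regular expressions and the simplified description string are assumptions chosen only to reproduce the behavior of the refrigerator example; the application's actual rules are not disclosed.

```python
import re

# Hypothetical per-category identification rules.
RULES = {
    "refrigerator": {
        "performance": re.compile(r"\d+ ?door"),
        "model": re.compile(r"[A-Z]{2,}-?\d+[A-Z0-9]*"),
    },
}


def extract_model_and_performance(description, category, rules=RULES):
    """Extract the performance description first, then the model from what remains."""
    rule = rules[category]
    performance = rule["performance"].findall(description)
    remaining = rule["performance"].sub(" ", description)
    model_match = rule["model"].search(remaining)
    model = model_match.group(0) if model_match else None
    return model, performance


# Simplified version of the description from the example (assumed wording).
text = "Konka BCD-386BX4S4 door household energy-saving refrigerator gold"

model, performance = extract_model_and_performance(text, "refrigerator")
# Matching the model rule directly on the full text swallows the door count.
naive_model = RULES["refrigerator"]["model"].search(text).group(0)
```

With these assumed rules, the two-step extraction yields "BCD-386BX4S" while the direct match yields "BCD-386BX4S4", which is the misrecognition the ordering is meant to avoid.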
Step 205, a plurality of candidate commodities are obtained from the commodity library according to a first matching degree between each dimension feature of each commodity in the commodity library and each dimension feature of the first commodity.
Step 206, determining a second matching degree between the first commodity and each candidate commodity according to the first matching degree between each dimension characteristic of the first commodity and the corresponding dimension characteristic of each candidate commodity and the weight corresponding to each dimension characteristic.
Step 207, extracting the commodity matched with the first commodity from the plurality of candidate commodities according to the second matching degree of the first commodity and each candidate commodity.
The detailed implementation process and principle of the steps 205-207 can refer to the detailed description of the above embodiments, and are not described herein again.
According to the commodity matching method provided by the embodiment of the application, the category of the first commodity is determined by performing category identification on first description information corresponding to the first commodity, the first description information is matched based on an identification rule to determine the model and/or performance description of the first commodity, then a plurality of candidate commodities are obtained from the commodity library according to the first matching degree between each dimension feature of each commodity in the commodity library and each dimension feature of the first commodity, and further the second matching degree between the first commodity and each candidate commodity is determined according to the first matching degree between each dimension feature of the first commodity and the corresponding dimension feature of each candidate commodity and the weight corresponding to each dimension, so that the commodity matched with the first commodity is extracted from each candidate commodity according to the second matching degree between the first commodity and each candidate commodity. Therefore, different identification rules are set for different types of commodities respectively and are used for identifying the types and the performance descriptions of the commodities, the commodities matched with the first commodities are extracted from the commodity library through multi-dimensional feature matching related to commodity attributes, key features related to the commodity attributes are extracted from description information of the commodities for matching, useless text information unrelated to the commodity attributes in the description information is filtered, and the accuracy of commodity feature identification is improved, so that the accuracy of commodity matching is further improved.
In a possible implementation form of the present application, since the multidimensional feature of the first commodity and the multidimensional feature of each candidate commodity may not be completely matched, the matched commodity may be selected from the candidate commodities according to a first matching degree between the multidimensional feature of the first commodity and the multidimensional feature of the candidate commodity, so as to further improve the accuracy of commodity matching.
The matching method of the commodities provided by the embodiment of the present application is further described below with reference to fig. 3.
Fig. 3 is a schematic flowchart of another method for matching a product according to an embodiment of the present disclosure.
As shown in fig. 3, the matching method of the merchandise includes the following steps:
Step 301, obtaining first description information corresponding to a first commodity.
Step 302, the first description information is identified to determine a multi-dimensional feature of the first item.
Step 303, obtaining a plurality of candidate commodities from the commodity library according to a first matching degree between each dimension feature of each commodity in the commodity library and each dimension feature of the first commodity.
The detailed implementation process and principle of the steps 301-303 can refer to the detailed description of the above embodiments, and are not described herein again.
Step 304, determining the feature quantity M of each dimension corresponding to the first commodity and the feature quantity N_i of each dimension corresponding to the ith candidate commodity, wherein M, N_i and i are each positive integers, and i is less than or equal to the total number K of candidate commodities.
Step 305, in the case that M is less than N_i, determining the second matching degree between the first commodity and each candidate commodity according to the first matching degree between the M-dimensional features corresponding to the first commodity and the M-dimensional features of each candidate commodity, and the weights corresponding to the M-dimensional features.
In the embodiment of the present application, when determining the second matching degree between the first commodity and each candidate commodity, the feature quantity M of the first commodity and the feature quantity N_i of each candidate commodity may be determined first. In the case that M is the same as N_i, the weighted average of the first matching degrees between each dimension feature of the first commodity and the corresponding dimension feature of the ith candidate commodity, computed with the weight of each feature, may be determined as the second matching degree between the first commodity and the ith candidate commodity.
When the feature quantity of a candidate commodity differs from M, for example when M is less than the feature quantity N_i of the ith candidate commodity, that is, when the features of the ith candidate commodity include features not included in the features of the first commodity, the features not included in the M-dimensional features of the first commodity may be removed from the N_i-dimensional features of the candidate commodity, so that only the M-dimensional features that are included in the N_i-dimensional features of the candidate commodity and are the same as the M-dimensional features of the first commodity are retained. The second matching degree between the first commodity and each candidate commodity is then determined according to the first matching degree between the M-dimensional features corresponding to the first commodity and the M-dimensional features of each candidate commodity, and the weights corresponding to the M-dimensional features. For example, for the ith candidate commodity, the weighted average of the first matching degrees between the M-dimensional features of the first commodity and the M-dimensional features of the ith candidate commodity may be determined as the second matching degree between the first commodity and the ith candidate commodity.
For example, the multidimensional features of the first commodity comprise a category, a brand and a model, and the multidimensional features of the candidate commodities comprise a category, a brand, a model and a performance description, so that the performance description features of the candidate commodities can be removed, and the second matching degree between the first commodity and each candidate commodity is determined according to the first matching degree between the category, the brand and the model of the first commodity and the category, the brand and the model of each candidate commodity and the weights corresponding to the category, the brand and the model.
Step 306, in the case that N_i is less than M, determining the second matching degree between the first commodity and each candidate commodity according to the first matching degree between the N_i-dimensional features corresponding to each candidate commodity and the N_i-dimensional features of the first commodity, and the weights corresponding to the N_i-dimensional features.
In the embodiment of the present application, when the feature quantity of a candidate commodity differs from M, for example when the feature quantity N_i of the ith candidate commodity is less than M, that is, when the M-dimensional features of the first commodity include features not included in the N_i-dimensional features of the ith candidate commodity, the features not included in the N_i-dimensional features of the ith candidate commodity may be removed from the M-dimensional features of the first commodity, so that only the N_i-dimensional features that are included in the M-dimensional features of the first commodity and are the same as the N_i-dimensional features of the ith candidate commodity are retained. The second matching degree between the first commodity and each candidate commodity is then determined according to the first matching degree between the N_i-dimensional features corresponding to the first commodity and the N_i-dimensional features of each candidate commodity, and the weights corresponding to the N_i-dimensional features. For example, for the ith candidate commodity, the weighted average of the first matching degrees between the N_i-dimensional features of the first commodity and the N_i-dimensional features of the ith candidate commodity may be determined as the second matching degree between the first commodity and the ith candidate commodity.
For example, if the multidimensional features of the first commodity include a category, a brand, and a model, and the multidimensional features of the candidate commodities include a category and a brand, the model features of the first commodity may be removed, and the second matching degree between the first commodity and each candidate commodity may be determined according to the first matching degree between the category and the brand of the first commodity and the category and the brand of each candidate commodity, and the weights corresponding to the category and the brand.
Step 307, extracting the commodity matched with the first commodity from the plurality of candidate commodities according to the second matching degree between the first commodity and each candidate commodity.
The detailed implementation process and principle of the step 307 may refer to the detailed description of the above embodiments, and are not described herein again.
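Steps 304 to 306 can be illustrated with a minimal Python sketch. It is not the implementation of the application: the function names, the dictionary representation of features, and the exact-match placeholder for the first matching degree are all assumptions made for illustration. The text only says "weighted average", so dividing by the sum of the retained weights is also an assumption.

```python
from typing import Dict


def first_matching_degree(a: str, b: str) -> float:
    """Placeholder first matching degree: 1.0 on exact match, else 0.0 (assumption)."""
    return 1.0 if a == b else 0.0


def second_matching_degree(
    first: Dict[str, str],
    candidate: Dict[str, str],
    weights: Dict[str, float],
) -> float:
    """Weighted average of first matching degrees over the shared dimensions."""
    # Keep only the dimensions present in both commodities; this covers
    # both the M < N_i case and the N_i < M case.
    shared = [dim for dim in first if dim in candidate]
    total_weight = sum(weights[dim] for dim in shared)
    if total_weight == 0:
        return 0.0
    weighted_sum = sum(
        weights[dim] * first_matching_degree(first[dim], candidate[dim])
        for dim in shared
    )
    # Normalising by the retained weights is an assumption of this sketch.
    return weighted_sum / total_weight
```

With a first commodity described by category, brand and model, and a candidate that also carries a performance description, the performance dimension is simply ignored when the score is computed.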
In the method for matching commodities provided by the embodiment of the application, the multidimensional features of the first commodity are extracted from the first description information corresponding to the first commodity, and a plurality of candidate commodities are obtained from the commodity library according to the first matching degree between the multidimensional features of each commodity in the commodity library and the multidimensional features of the first commodity. When the quantity M of the multidimensional features corresponding to the first commodity differs from the quantity N_i of the multidimensional features corresponding to the ith candidate commodity, the second matching degree between the first commodity and each candidate commodity is determined according to the features included in both the M-dimensional features and the N_i-dimensional features, and the commodity matched with the first commodity is extracted from the candidate commodities according to the second matching degree between the first commodity and each candidate commodity. Therefore, the multidimensional features related to the commodity attributes of the first commodity are extracted from the first description information corresponding to the first commodity, and a commodity library comprising a large number of commodities and their corresponding dimension features is established, so that the commodity matched with the first commodity is extracted from the commodity library through multi-dimensional feature matching related to commodity attributes. In this way, the key features related to the commodity attributes are extracted from the description information of the commodities for matching, useless text information unrelated to the commodity attributes is filtered out, and the extracted key features are unified, which further improves the accuracy of commodity matching.
In a possible implementation form of the present application, since the multidimensional feature of the first commodity may not be completely matched with the multidimensional feature of each candidate commodity, the matching commodity may be selected from the candidate commodities according to a first matching degree between the multidimensional feature of the first commodity and the multidimensional feature of the candidate commodity, and the weight of each multidimensional feature is adjusted, so as to further improve the accuracy of commodity matching.
The matching method of the commodities provided by the embodiment of the present application is further described below with reference to fig. 4.
Fig. 4 is a schematic flowchart of another method for matching a product according to an embodiment of the present disclosure.
As shown in fig. 4, the matching method of the merchandise includes the following steps:
Step 401, acquiring first description information corresponding to a first commodity.
Step 402, identifying the first description information to determine a multi-dimensional feature of the first item.
Step 403, obtaining a plurality of candidate commodities from the commodity library according to a first matching degree between each dimension feature of each commodity in the commodity library and each dimension feature of the first commodity.
The detailed implementation process and principle of the steps 401-403 may refer to the detailed description of the above embodiments, and are not described herein again.
Step 404, determining the feature quantity M of each dimension corresponding to the first commodity and the feature quantity N_i of each dimension corresponding to the ith candidate commodity, wherein M, N_i and i are each positive integers, and i is less than or equal to the total number K of candidate commodities.
Step 405, in the case that M and N_i are different, updating the weight corresponding to each dimension feature.
In the embodiment of the present application, when determining the second matching degree between the first commodity and each candidate commodity, the feature quantity M of the first commodity and the feature quantity N_i of each candidate commodity may be determined first. In the case that M is the same as N_i, the weighted average of the first matching degrees between each dimension feature of the first commodity and the corresponding dimension feature of the ith candidate commodity, computed with the weight of each feature, may be determined as the second matching degree between the first commodity and the ith candidate commodity.
When the feature quantity N_i of the ith candidate commodity differs from M, only the features included in both the M-dimensional features and the N_i-dimensional features may be retained for the first commodity, and likewise only the features included in both the M-dimensional features and the N_i-dimensional features may be retained for the ith candidate commodity.
As an example, in the case that M is less than the feature quantity N_i of the ith candidate commodity, that is, when the dimension features of the ith candidate commodity include features not included in the dimension features of the first commodity, the features not included in the M-dimensional features of the first commodity may be removed from the N_i-dimensional features of the candidate commodity, so that only the M-dimensional features that are included in the N_i-dimensional features of the candidate commodity and are the same as the M-dimensional features of the first commodity are retained. The weights corresponding to the M-dimensional features are then reassigned so that they are adapted to the current number of dimension features, thereby improving the accuracy of determining the second matching degree.
For example, the multidimensional features of the first commodity comprise a category, a brand and a model, the multidimensional features of each candidate commodity comprise a category, a brand, a model and a performance description, and the weights corresponding to the category, the brand, the model and the performance description are respectively 0.3, 0.3, 0.2 and 0.2. The performance description feature of each candidate commodity can then be removed, and the weights corresponding to the category, the brand and the model are respectively updated to 0.4, 0.4 and 0.2.
As an example, in the case that the feature quantity N_i of the ith candidate commodity is less than M, that is, when the M-dimensional features of the first commodity include features not included in the N_i-dimensional features of the ith candidate commodity, the features not included in the N_i-dimensional features of the ith candidate commodity may be removed from the M-dimensional features of the first commodity, so that only the N_i-dimensional features that are included in the M-dimensional features of the first commodity and are the same as the N_i-dimensional features of the ith candidate commodity are retained. The weights corresponding to the N_i-dimensional features are then reassigned so that they are adapted to the current number of dimension features, thereby improving the accuracy of determining the second matching degree.
For example, if the multidimensional features of the first commodity include a category, a brand and a model, the multidimensional features of each candidate commodity include a category and a brand, and the weights corresponding to the category, the brand and the model are respectively 0.4, 0.4 and 0.2, the model feature of the first commodity may be removed, and the weights corresponding to the category and the brand are respectively updated to 0.5 and 0.5.
And step 406, determining a second matching degree between the first commodity and each candidate commodity according to the updated weight corresponding to each dimension characteristic and the first matching degree between each dimension characteristic of the first commodity and the corresponding dimension characteristic of each candidate commodity.
As an example, in the case that M is less than the feature quantity N_i of the ith candidate commodity, the second matching degree between the first commodity and each candidate commodity may be determined according to the first matching degree between the M-dimensional features corresponding to the first commodity and the M-dimensional features of each candidate commodity, and the updated weights corresponding to the M-dimensional features. For example, for the ith candidate commodity, the weighted average of the first matching degrees between the M-dimensional features of the first commodity and the M-dimensional features of the ith candidate commodity may be determined as the second matching degree between the first commodity and the ith candidate commodity.
For example, the multidimensional features of the first commodity comprise a category, a brand and a model, the multidimensional features of each candidate commodity comprise a category, a brand, a model and a performance description, and the weights corresponding to the category, the brand, the model and the performance description are respectively 0.3, 0.3, 0.2 and 0.2. The performance description feature of each candidate commodity can then be removed, and the weights corresponding to the category, the brand and the model are respectively updated to 0.4, 0.4 and 0.2. The second matching degree between the first commodity and each candidate commodity is then determined according to the first matching degree between the category, the brand and the model of the first commodity and the category, the brand and the model of each candidate commodity, and the updated weights 0.4, 0.4 and 0.2 corresponding to the category, the brand and the model respectively.
As an example, in the case that the feature quantity N_i of the ith candidate commodity is less than M, the second matching degree between the first commodity and each candidate commodity may be determined according to the first matching degree between the N_i-dimensional features corresponding to the first commodity and the N_i-dimensional features of each candidate commodity, and the updated weights corresponding to the N_i-dimensional features. For example, for the ith candidate commodity, the weighted average of the first matching degrees between the N_i-dimensional features of the first commodity and the N_i-dimensional features of the ith candidate commodity may be determined as the second matching degree between the first commodity and the ith candidate commodity.
For example, if the multidimensional features of the first commodity include a category, a brand and a model, the multidimensional features of each candidate commodity include a category and a brand, and the weights corresponding to the category, the brand and the model are respectively 0.4, 0.4 and 0.2, the model feature of the first commodity may be removed, and the weights corresponding to the category and the brand are respectively updated to 0.5 and 0.5. The second matching degree between the first commodity and each candidate commodity is then determined according to the first matching degree between the category and the brand of the first commodity and the category and the brand of each candidate commodity, and the updated weights 0.5 and 0.5 corresponding to the category and the brand respectively.
Step 407, extracting the product matched with the first product from the plurality of candidate products according to the second matching degree of the first product and each candidate product.
The detailed implementation process and principle of the step 407 may refer to the detailed description of the above embodiments, and are not described herein again.
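Steps 405 and 406 can be illustrated with a minimal Python sketch. The text does not state how the updated weights are derived, so this sketch assumes a lookup table keyed by the set of retained dimensions, populated with the example values 0.4, 0.4, 0.2 and 0.5, 0.5 given above. The table, the equal-weight fallback and all function names are hypothetical.

```python
from typing import Dict, FrozenSet, Iterable, List

# Example weights from the text for three and for two retained dimensions.
WEIGHT_TABLE: Dict[FrozenSet[str], Dict[str, float]] = {
    frozenset({"category", "brand", "model"}): {
        "category": 0.4, "brand": 0.4, "model": 0.2,
    },
    frozenset({"category", "brand"}): {"category": 0.5, "brand": 0.5},
}


def updated_weights(shared_dims: Iterable[str]) -> Dict[str, float]:
    """Return weights adapted to the retained dimensions."""
    dims = frozenset(shared_dims)
    if dims in WEIGHT_TABLE:
        return dict(WEIGHT_TABLE[dims])
    # Fallback (assumption of this sketch): equal weights.
    return {dim: 1.0 / len(dims) for dim in dims} if dims else {}


def second_matching_degree(
    first_degrees: Dict[str, float],
    first_dims: List[str],
    candidate_dims: List[str],
) -> float:
    """Weighted sum of first matching degrees using the updated weights."""
    # Retain only the dimensions shared by both commodities.
    shared = [dim for dim in first_dims if dim in candidate_dims]
    weights = updated_weights(shared)
    return sum(weights[dim] * first_degrees[dim] for dim in shared)
```

Because the updated weights in the examples sum to 1, the weighted sum here coincides with the weighted average described in the text.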
In the method for matching commodities provided by the embodiment of the application, the multidimensional features of the first commodity are extracted from the first description information corresponding to the first commodity, and a plurality of candidate commodities are obtained from the commodity library according to the first matching degree between the multidimensional features of each commodity in the commodity library and the multidimensional features of the first commodity. When the quantity M of the multidimensional features corresponding to the first commodity differs from the quantity N_i of the multidimensional features corresponding to the ith candidate commodity, the weights corresponding to the dimension features are updated, and the second matching degree between the first commodity and each candidate commodity is determined according to the features included in both the M-dimensional features and the N_i-dimensional features, so that the commodity matched with the first commodity is extracted from the candidate commodities according to the second matching degree between the first commodity and each candidate commodity.
Therefore, the multidimensional characteristics relevant to the commodity attributes of the first commodity are extracted from the first description information corresponding to the first commodity, the commodity library comprising a large number of commodities and the corresponding multidimensional characteristics is established, the commodities matched with the first commodity are extracted from the commodity library through the multidimensional characteristic matching relevant to the commodity attributes, the key characteristics relevant to the commodity attributes are extracted from the description information of the commodities for matching, useless text information irrelevant to the commodity attributes in the description information is filtered, the extracted key characteristics are unified and weight adjusted, and the accuracy of commodity matching is further improved.
In order to realize the embodiment, the application also provides a matching device of the commodity.
Fig. 5 is a schematic structural diagram of a matching device for a commodity according to an embodiment of the present application.
As shown in fig. 5, the matching device 50 for merchandise includes:
the first obtaining module 51 is configured to obtain first description information corresponding to a first commodity;
a first determining module 52, configured to identify the first description information to determine a multi-dimensional feature of the first commodity;
a second obtaining module 53, configured to obtain a plurality of candidate commodities from the commodity library according to a first matching degree between each dimension feature of each commodity in the commodity library and each dimension feature of the first commodity;
a second determining module 54, configured to determine, according to the first matching degree between each dimension feature of the first commodity and the corresponding dimension feature of each candidate commodity and the weight corresponding to each dimension, a second matching degree between the first commodity and each candidate commodity;
and an extracting module 55, configured to extract a product matched with the first product from the multiple candidate products according to the second matching degree between the first product and each candidate product.
In practical use, the matching device for the commodities provided by the embodiment of the application can be configured in any electronic equipment to execute the matching method for the commodities.
According to the matching device for commodities provided by the embodiment of the application, the multidimensional features of the first commodity are extracted from the first description information corresponding to the first commodity, and a plurality of candidate commodities are obtained from the commodity library according to the first matching degree between each dimension feature of each commodity in the commodity library and each dimension feature of the first commodity. The second matching degree between the first commodity and each candidate commodity is determined according to the first matching degree between each dimension feature of the first commodity and the corresponding dimension feature of each candidate commodity and the weight corresponding to each dimension, and the commodity matched with the first commodity is extracted from the candidate commodities according to the second matching degree between the first commodity and each candidate commodity. Therefore, the multidimensional features related to the commodity attributes of the first commodity are extracted from the first description information corresponding to the first commodity, and a commodity library comprising a large number of commodities and their corresponding multidimensional features is established, so that the commodity matched with the first commodity is extracted from the commodity library through multi-dimensional feature matching related to commodity attributes. In this way, the key features related to the commodity attributes are extracted from the description information of the commodities for matching, useless text information unrelated to the commodity attributes is filtered out, and the accuracy of commodity matching is improved.
In one possible implementation form of the present application, the multidimensional feature includes at least two of the following: category, brand, model, and performance description.
Further, in another possible implementation form of the present application, the first determining module 52 includes:
the first determining unit is used for inputting the first description information into the primary classification model so as to determine a target primary category to which the first commodity belongs;
and the second determining unit is used for inputting the first description information into the secondary classification model corresponding to the target primary category to determine the target secondary category to which the first commodity belongs under the condition that the target primary category to which the first commodity belongs contains the subcategory.
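The two determining units above can be illustrated with a minimal Python sketch of the dispatch logic. The classification models are represented as plain callables from description text to a category name, which is an assumption made for illustration. The function and parameter names are hypothetical.

```python
from typing import Callable, Dict, Optional, Tuple

Classifier = Callable[[str], str]


def identify_category(
    description: str,
    primary_model: Classifier,
    secondary_models: Dict[str, Classifier],
) -> Tuple[str, Optional[str]]:
    """Return (target primary category, target secondary category or None)."""
    primary = primary_model(description)
    # A primary category has an entry here only if it contains subcategories.
    secondary_model = secondary_models.get(primary)
    if secondary_model is None:
        return primary, None
    return primary, secondary_model(description)
```

A primary category without subcategories has no entry in `secondary_models`, so the primary category is returned on its own.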
Further, in another possible implementation form of the present application, the matching device 50 for the article further includes:
the third acquisition module is used for acquiring a training data set and a testing data set, wherein the training data set and the testing data set respectively comprise description information of a plurality of commodities and a labeling category corresponding to each commodity;
a first generation module, configured to train the initial classification model using the training data set to generate a first classification model;
the third determining module is used for processing the description information of each commodity in the test data set by using the first classification model so as to determine the prediction probability of each commodity belonging to each class;
the fourth determining module is used for determining the confusion probability among all the categories according to the labeling categories corresponding to all the commodities and the prediction probabilities belonging to all the categories;
the second generation module is used for clustering the categories with the confusion probability larger than the threshold value so as to generate a first-level category;
the third generation module is used for training the initial classification model by utilizing each clustered primary class and the corresponding description information respectively so as to generate a primary classification model;
and the fourth generation module is used for training the initial classification model by using each labeled class and the description information contained in each primary class so as to generate a secondary classification model corresponding to each primary class.
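The fourth determining module and the second generation module can be illustrated with a minimal Python sketch. This passage does not define the confusion probability, so the sketch assumes that the confusion probability from category a to category b is the mean predicted probability of b over test commodities labelled a, and that categories are merged whenever any such value exceeds the threshold. Both choices and all names are assumptions.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple


def confusion_probabilities(
    labels: List[str], probs: List[Dict[str, float]]
) -> Dict[Tuple[str, str], float]:
    """Mean predicted probability of each wrong category, per labelled category."""
    sums: Dict[str, Dict[str, float]] = defaultdict(lambda: defaultdict(float))
    counts: Dict[str, int] = defaultdict(int)
    for label, prediction in zip(labels, probs):
        counts[label] += 1
        for category, p in prediction.items():
            if category != label:
                sums[label][category] += p
    return {
        (a, b): total / counts[a]
        for a, row in sums.items()
        for b, total in row.items()
    }


def cluster_categories(
    categories: List[str],
    confusion: Dict[Tuple[str, str], float],
    threshold: float,
) -> List[Set[str]]:
    """Merge categories whose confusion probability exceeds the threshold."""
    parent = {c: c for c in categories}

    def find(c: str) -> str:
        # Union-find lookup with path halving.
        while parent[c] != c:
            parent[c] = parent[parent[c]]
            c = parent[c]
        return c

    for (a, b), p in confusion.items():
        if p > threshold:
            parent[find(a)] = find(b)
    groups: Dict[str, Set[str]] = defaultdict(set)
    for c in categories:
        groups[find(c)].add(c)
    return list(groups.values())
```

Each returned set would then serve as one primary category, with its member labelled categories acting as the secondary categories under it.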
Further, in another possible implementation form of the present application, the first determining module 52 includes:
the third determining unit is used for carrying out category identification on the first description information so as to determine the category to which the first commodity belongs;
the acquisition unit is used for acquiring the identification rule corresponding to the first commodity according to the category of the first commodity;
and the fourth determining unit is used for matching the first description information based on the identification rule so as to determine the model and/or the performance description of the first commodity.
Further, in another possible implementation form of the present application, the fourth determining unit is specifically configured to:
matching the first description information based on the identification rule corresponding to the performance description to determine the performance description of the first commodity;
and matching the rest information except the performance description of the first commodity in the first description information based on the identification rule corresponding to the model to determine the model of the first commodity.
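The two-pass matching performed by the fourth determining unit can be illustrated with a minimal Python sketch. It assumes that the identification rules are regular expressions, which the text does not state. The function name and the example patterns in the note below are hypothetical.

```python
import re
from typing import List, Optional, Tuple


def extract_performance_and_model(
    description: str, performance_rule: str, model_rule: str
) -> Tuple[List[str], Optional[str]]:
    """Match the performance description first, then the model in the remainder."""
    performance = [m.group(0) for m in re.finditer(performance_rule, description)]
    # Remove the matched performance text so the model rule cannot pick it up.
    remaining = re.sub(performance_rule, " ", description)
    model_match = re.search(model_rule, remaining)
    model = model_match.group(0) if model_match else None
    return performance, model
```

For the description `"Acme laptop 16GB 512GB X15"`, a performance rule such as `r"\b\d+\s?(?:GB|TB)\b"` captures `16GB` and `512GB`. A loose model rule such as `r"\b[A-Za-z]*\d+[A-Za-z]*\b"` would hit `16GB` first if applied to the full text, but finds `X15` once the performance text has been removed.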
Further, in another possible implementation form of the present application, the second determining module 54 includes:
a fifth determining unit, configured to determine the feature quantity M of each dimension corresponding to the first commodity and the feature quantity N_i of each dimension corresponding to the ith candidate commodity, wherein M, N_i and i are each positive integers, and i is less than or equal to the total number K of candidate commodities;
a sixth determining unit, configured to determine, in the case that M is less than N_i, the second matching degree between the first commodity and each candidate commodity according to the first matching degree between the M-dimensional features corresponding to the first commodity and the M-dimensional features of each candidate commodity, and the weights corresponding to the M-dimensional features;
a seventh determining unit, configured to determine, in the case that N_i is less than M, the second matching degree between the first commodity and each candidate commodity according to the first matching degree between the N_i-dimensional features corresponding to each candidate commodity and the N_i-dimensional features of the first commodity, and the weights corresponding to the N_i-dimensional features.
Further, in another possible implementation form of the present application, the second determining module 54 includes:
an eighth determining unit, configured to determine the feature quantity M of each dimension corresponding to the first commodity and the feature quantity N_i of each dimension corresponding to the ith candidate commodity, wherein M, N_i and i are each positive integers, and i is less than or equal to the total number K of candidate commodities;
an updating unit, configured to update the weight corresponding to each dimension feature in the case that M and N_i are different;
and the ninth determining unit is used for determining the second matching degree of the first commodity and each candidate commodity according to the updated weight corresponding to each dimension characteristic and the first matching degree between each dimension characteristic of the first commodity and the corresponding dimension characteristic of each candidate commodity.
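One possible reading of the weight update is sketched below. The passage above does not state how the weights are updated, so the rule used here (keep only the dimensions present on both sides and rescale their weights to sum to 1) is purely an assumption, as are the weight values and the exact-match first matching degree.

```python
# Hypothetical base weights (an assumption, not values from the source).
BASE_WEIGHTS = {"category": 0.2, "brand": 0.2, "model": 0.4, "performance": 0.2}


def update_weights(weights, first_item, candidate):
    """Assumed update rule: restrict to shared dimensions and renormalize."""
    shared = [d for d in weights if d in first_item and d in candidate]
    total = sum(weights[d] for d in shared)
    if total == 0:
        return {}
    return {d: weights[d] / total for d in shared}


def second_matching_degree_updated(first_item, candidate, weights=BASE_WEIGHTS):
    """Second matching degree using updated weights when M differs from N_i."""
    if len(first_item) != len(candidate):
        weights = update_weights(weights, first_item, candidate)
    return sum(
        # Assumed first matching degree: exact match scores 1.0, else 0.0.
        w * (1.0 if first_item.get(d) == candidate.get(d) else 0.0)
        for d, w in weights.items()
    )


first = {"category": "memory", "brand": "Kingston", "model": "HX432C16"}
cand = {"category": "memory", "brand": "Kingston",
        "model": "HX432C16", "performance": "16GB"}
updated = update_weights(BASE_WEIGHTS, first, cand)
```

Under this assumed rule, a candidate that agrees on every shared dimension reaches a second matching degree of 1 regardless of how many dimensions are missing, which keeps scores comparable across candidates with different N_i.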
Further, in another possible implementation form of the present application, the extracting module 55 includes:
and the extracting unit is used for extracting the commodities matched with the first commodity from the at least two candidate commodities according to the first matching degrees respectively corresponding to the at least two candidate commodities under the condition that the second matching degrees corresponding to the at least two candidate commodities are the same and are both greater than the second matching degrees corresponding to other candidate commodities.
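The tie-breaking step can be sketched as follows. How the first matching degrees are compared among tied candidates is not specified in this passage, so the lexicographic comparison in a fixed dimension order (`PRIORITY`) and the record layout are hypothetical assumptions.

```python
import math

# Hypothetical comparison order for tie-breaking (an assumption).
PRIORITY = ("model", "brand", "category", "performance")


def pick_match(candidates, priority=PRIORITY):
    """Return the candidate with the highest second matching degree.

    Each candidate is a dict with keys "id", "second" (second matching
    degree) and "first" (dimension name -> first matching degree).
    Ties on "second" are broken by comparing the first matching degrees
    dimension by dimension in the assumed priority order.
    """
    best = max(c["second"] for c in candidates)
    tied = [c for c in candidates if math.isclose(c["second"], best)]
    if len(tied) == 1:
        return tied[0]
    return max(
        tied,
        key=lambda c: tuple(c["first"].get(d, 0.0) for d in priority),
    )


candidates = [
    {"id": "A", "second": 0.8, "first": {"model": 1.0, "brand": 0.5}},
    {"id": "B", "second": 0.8, "first": {"model": 0.9, "brand": 1.0}},
    {"id": "C", "second": 0.5, "first": {"model": 1.0, "brand": 1.0}},
]
winner = pick_match(candidates)
```

In this example A and B share the highest second matching degree, and A is selected because its model-level first matching degree is higher under the assumed order.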
It should be noted that the above explanation of the embodiment of the method for matching a product shown in fig. 1, fig. 2, fig. 3, and fig. 4 is also applicable to the matching device 50 of a product of this embodiment, and will not be repeated here.
The commodity matching device provided by the embodiments of the present application determines the category of a first commodity by performing category identification on the first description information corresponding to the first commodity, and determines the model and/or performance description of the first commodity by matching the first description information based on an identification rule. The device then acquires a plurality of candidate commodities from a commodity library according to the first matching degree between each dimension feature of each commodity in the commodity library and each dimension feature of the first commodity, determines a second matching degree between the first commodity and each candidate commodity according to the first matching degree between each dimension feature of the first commodity and the corresponding dimension feature of each candidate commodity and the weight corresponding to each dimension, and extracts the commodity matched with the first commodity from the candidate commodities according to the second matching degrees. In this way, different identification rules are set for different categories of commodities and are used to identify the models and performance descriptions of the commodities, and the commodity matched with the first commodity is extracted from the commodity library through multi-dimensional feature matching related to commodity attributes. Because key features related to commodity attributes are extracted from the description information for matching, while text unrelated to commodity attributes is filtered out, the accuracy of commodity feature identification is improved, which further improves the accuracy of commodity matching.
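The candidate acquisition step summarized above can be sketched as a filter over the commodity library. The similarity measure (`difflib.SequenceMatcher` on lower-cased strings), the averaging over shared dimensions, and the threshold of 0.6 are all assumptions chosen for illustration; the summary does not specify how the first matching degree is computed or how candidates are selected from it.

```python
from difflib import SequenceMatcher


def first_matching_degree(a, b):
    """Assumed first matching degree: character-level similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def get_candidates(first_item, library, threshold=0.6):
    """Keep library commodities whose mean per-dimension first matching
    degree with the first commodity reaches the (hypothetical) threshold."""
    candidates = []
    for product in library:
        shared = [d for d in first_item if d in product["features"]]
        if not shared:
            continue
        degrees = {
            d: first_matching_degree(first_item[d], product["features"][d])
            for d in shared
        }
        if sum(degrees.values()) / len(degrees) >= threshold:
            candidates.append({"id": product["id"], "first": degrees})
    return candidates


library = [
    {"id": "p1", "features": {"brand": "Kingston", "model": "HX432C16"}},
    {"id": "p2", "features": {"brand": "Samsung", "model": "M378A"}},
]
first = {"brand": "Kingston", "model": "HX432C16SB"}
candidates = get_candidates(first, library)
```

The per-dimension degrees are kept on each candidate so that a later step can combine them with weights into a second matching degree.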
In order to implement the above embodiments, the present application further provides an electronic device.
Fig. 6 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
As shown in fig. 6, the electronic device 200 includes:
a memory 210, a processor 220, and a bus 230 connecting different components (including the memory 210 and the processor 220), wherein the memory 210 stores a computer program, and the processor 220, when executing the program, implements the commodity matching method according to the embodiments of the present application.
Bus 230 represents one or more of any of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus architectures. By way of example, such architectures include, but are not limited to, the Industry Standard Architecture (ISA) bus, the Micro Channel Architecture (MCA) bus, the Enhanced ISA (EISA) bus, the Video Electronics Standards Association (VESA) local bus, and the Peripheral Component Interconnect (PCI) bus.
Electronic device 200 typically includes a variety of electronic device readable media. Such media may be any available media that is accessible by electronic device 200 and includes both volatile and nonvolatile media, removable and non-removable media.
Memory 210 may include computer system readable media in the form of volatile memory, such as Random Access Memory (RAM) 240 and/or cache memory 250. The electronic device 200 may further include other removable/non-removable, volatile/nonvolatile computer system storage media. By way of example only, storage system 260 may be used to read from and write to non-removable, nonvolatile magnetic media (not shown in FIG. 6, commonly referred to as a "hard drive"). Although not shown in FIG. 6, a magnetic disk drive for reading from and writing to a removable, nonvolatile magnetic disk (e.g., a "floppy disk") and an optical disk drive for reading from or writing to a removable, nonvolatile optical disk (e.g., a CD-ROM, DVD-ROM, or other optical media) may be provided. In these cases, each drive may be connected to bus 230 by one or more data media interfaces. Memory 210 may include at least one program product having a set (e.g., at least one) of program modules that are configured to carry out the functions of the embodiments of the application.
A program/utility 280 having a set (at least one) of program modules 270, including but not limited to an operating system, one or more application programs, other program modules, and program data, each of which or some combination thereof may comprise an implementation of a network environment, may be stored in, for example, the memory 210. The program modules 270 generally perform the functions and/or methodologies of the embodiments described herein.
Electronic device 200 may also communicate with one or more external devices 290 (e.g., keyboard, pointing device, display 291, etc.), with one or more devices that enable a user to interact with electronic device 200, and/or with any devices (e.g., network card, modem, etc.) that enable electronic device 200 to communicate with one or more other computing devices. Such communication may occur via input/output (I/O) interfaces 292. Also, the electronic device 200 may communicate with one or more networks (e.g., a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network such as the Internet) via the network adapter 293. As shown, the network adapter 293 communicates with the other modules of the electronic device 200 via the bus 230. It should be appreciated that although not shown in the figures, other hardware and/or software modules may be used in conjunction with the electronic device 200, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems, among others.
The processor 220 executes various functional applications and data processing by executing programs stored in the memory 210.
It should be noted that, the implementation process and the technical principle of the electronic device of this embodiment refer to the foregoing explanation of the matching method for the commodity according to the embodiment of the present application, and are not described herein again.
The electronic device according to the embodiment of the present application may execute the method for matching a product as described above, and extract a multi-dimensional feature of a first product from first description information corresponding to the first product, obtain a plurality of candidate products from a product library according to a first matching degree between each dimensional feature of each product in the product library and each dimensional feature of the first product, and further determine a second matching degree between the first product and each candidate product according to the first matching degree between each dimensional feature of the first product and each corresponding dimensional feature of the candidate product and a weight corresponding to each dimension, so as to extract a product matched with the first product from each candidate product according to the second matching degree between the first product and each candidate product. Therefore, the multidimensional characteristics relevant to the commodity attributes of the first commodity are extracted from the first description information corresponding to the first commodity, the commodity library comprising a large number of commodities and the corresponding multidimensional characteristics is established, the commodities matched with the first commodity are extracted from the commodity library through the multidimensional characteristic matching relevant to the commodity attributes, the key characteristics relevant to the commodity attributes are extracted from the description information of the commodities for matching, useless text information irrelevant to the commodity attributes in the description information is filtered, and the accuracy of commodity matching is improved.
In order to implement the above embodiments, the present application also proposes a computer-readable storage medium.
The computer-readable storage medium stores thereon a computer program, and the computer program is executed by a processor to implement the matching method for the commodity according to the embodiment of the present application.
In order to implement the foregoing embodiments, a further embodiment of the present application provides a computer program, which is executed by a processor to implement the matching method for the product according to the embodiments of the present application.
In an alternative implementation, the embodiments may be implemented in any combination of one or more computer-readable media. The computer readable medium may be a computer readable signal medium or a computer readable storage medium. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
A computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Computer program code for carrying out operations for aspects of the present application may be written in any combination of one or more programming languages, including object oriented programming languages such as Java, Smalltalk, and C++, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's electronic device, partly on the user's electronic device, as a stand-alone software package, partly on the user's electronic device and partly on a remote electronic device, or entirely on the remote electronic device or server. In the case of a remote electronic device, the remote electronic device may be connected to the user's electronic device through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external electronic device (e.g., through the Internet using an Internet service provider).
Other embodiments of the present application will be apparent to those skilled in the art from consideration of the specification and practice of the invention disclosed herein. This application is intended to cover any variations, uses, or adaptations of the invention following, in general, the principles of the application and including such departures from the present disclosure as come within known or customary practice within the art to which the invention pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the application being indicated by the following claims.
It will be understood that the present application is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the application is limited only by the appended claims.

Claims (12)

1. A method of matching a commodity, comprising:
acquiring first description information corresponding to a first commodity;
identifying the first description information to determine a multi-dimensional feature of the first commodity;
acquiring a plurality of candidate commodities from a commodity library according to a first matching degree between each dimension characteristic of each commodity in the commodity library and each dimension characteristic of the first commodity;
determining a second matching degree of the first commodity and each candidate commodity according to a first matching degree between each dimension characteristic of the first commodity and the corresponding dimension characteristic of each candidate commodity and the weight corresponding to each dimension characteristic;
and extracting the commodities matched with the first commodity from the candidate commodities according to the second matching degree of the first commodity and each candidate commodity.
2. The method of claim 1, wherein the multi-dimensional features comprise at least two of: category, brand, model, and performance description.
3. The method of claim 1, wherein said identifying the first description information to determine the multi-dimensional characteristics of the first item comprises:
inputting the first description information into a primary classification model to determine a target primary category to which the first commodity belongs;
and when the target primary category to which the first commodity belongs comprises a sub-category, inputting the first description information into a secondary classification model corresponding to the target primary category to determine the target secondary category to which the first commodity belongs.
4. The method of claim 3, further comprising:
acquiring a training data set and a testing data set, wherein the training data set and the testing data set respectively comprise description information of a plurality of commodities and a label category corresponding to each commodity;
training an initial classification model by using the training data set to generate a first classification model;
processing the description information of each of the commodities in the test data set by using the first classification model to determine a prediction probability that each of the commodities belongs to each category;
determining confusion probability among all classes according to the labeling class corresponding to each commodity and the prediction probability belonging to each class;
clustering the categories with the confusion probability larger than a threshold value to generate a first-level category;
training the initial classification model by using each clustered primary class and the corresponding description information respectively to generate a primary classification model;
and training the initial classification model by using each labeled class and description information contained in each primary class to generate a secondary classification model corresponding to each primary class.
5. The method of claim 1, wherein said identifying the first description information to determine the multi-dimensional characteristics of the first item comprises:
performing category identification on the first description information to determine a category to which the first commodity belongs;
acquiring an identification rule corresponding to the first commodity according to the category of the first commodity;
and matching the first description information based on the identification rule to determine the model and/or the performance description of the first commodity.
6. The method of claim 5, wherein said matching the first description information to determine the model and/or performance description of the first item based on the identification rule comprises:
matching the first description information based on an identification rule corresponding to the performance description to determine the performance description of the first commodity;
and matching the rest information except the performance description of the first commodity in the first description information based on the identification rule corresponding to the model so as to determine the model of the first commodity.
7. The method according to any one of claims 1 to 6, wherein the determining a second degree of matching of the first item with each of the candidate items based on the first degree of matching between each dimensional feature of the first item and the corresponding dimensional feature of each of the candidate items and the weight corresponding to each dimensional feature comprises:
determining the number M of dimension features corresponding to the first commodity and the number N_i of dimension features corresponding to the i-th candidate commodity, wherein M, N_i and i are positive integers, and i is less than or equal to the total number K of candidate commodities;
when M is less than N_i, determining the second matching degree of the first commodity and each candidate commodity according to the first matching degrees between the M dimension features corresponding to the first commodity and the corresponding M dimension features of each candidate commodity, and the weights corresponding to the M dimension features;
when N_i is less than M, determining the second matching degree of the first commodity and each candidate commodity according to the first matching degrees between the N_i dimension features corresponding to each candidate commodity and the corresponding N_i dimension features of the first commodity, and the weights respectively corresponding to the N_i dimension features.
8. The method according to any one of claims 1 to 6, wherein the determining a second degree of matching between the first item and each of the candidate items according to the first degree of matching between the dimensional features of the first item and the dimensional features corresponding to each of the candidate items and the weights corresponding to the dimensional features comprises:
determining the number M of dimension features corresponding to the first commodity and the number N_i of dimension features corresponding to the i-th candidate commodity, wherein M, N_i and i are positive integers, and i is less than or equal to the total number K of candidate commodities;
when M and N_i are different, updating the weight corresponding to each dimension feature;
and determining a second matching degree of the first commodity and each candidate commodity according to the updated weight corresponding to each dimension characteristic and a first matching degree between each dimension characteristic of the first commodity and the corresponding dimension characteristic of each candidate commodity.
9. The method as recited in claim 8, wherein said extracting the item matching the first item from the plurality of candidate items based on the second degree of match of the first item to each of the candidate items, respectively, comprises:
and under the condition that the second matching degrees corresponding to at least two candidate commodities are the same and are both greater than the second matching degrees corresponding to other candidate commodities, extracting commodities matched with the first commodity from the at least two candidate commodities according to the first matching degrees corresponding to the at least two candidate commodities respectively.
10. An apparatus for matching a commodity, comprising:
the first acquisition module is used for acquiring first description information corresponding to a first commodity;
the first determining module is used for identifying the first description information so as to determine the multi-dimensional characteristics of the first commodity;
the second acquisition module is used for acquiring a plurality of candidate commodities from the commodity library according to a first matching degree between each dimension characteristic of each commodity in the commodity library and each dimension characteristic of the first commodity;
the second determining module is used for determining a second matching degree of the first commodity and each candidate commodity according to the first matching degree between each dimension characteristic of the first commodity and the corresponding dimension characteristic of each candidate commodity and the weight corresponding to each dimension;
and the extracting module is used for extracting the commodities matched with the first commodity from the candidate commodities according to the second matching degree of the first commodity and each candidate commodity.
11. An electronic device, comprising: a memory, a processor, and a program stored on the memory and executable on the processor, characterized in that the processor, when executing the program, implements the method of matching a commodity according to any one of claims 1 to 9.
12. A computer-readable storage medium on which a computer program is stored, wherein the program, when executed by a processor, implements the method of matching a commodity according to any one of claims 1 to 9.
CN202110181713.2A 2021-02-09 2021-02-09 Commodity matching method and device and electronic equipment Active CN113793191B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110181713.2A CN113793191B (en) 2021-02-09 2021-02-09 Commodity matching method and device and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110181713.2A CN113793191B (en) 2021-02-09 2021-02-09 Commodity matching method and device and electronic equipment

Publications (2)

Publication Number Publication Date
CN113793191A true CN113793191A (en) 2021-12-14
CN113793191B CN113793191B (en) 2024-05-24

Family

ID=78876805

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110181713.2A Active CN113793191B (en) 2021-02-09 2021-02-09 Commodity matching method and device and electronic equipment

Country Status (1)

Country Link
CN (1) CN113793191B (en)

Citations (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100305748A1 (en) * 2009-05-27 2010-12-02 You Hsin Wen Commodity selection systems and methods
JP2013109773A (en) * 2013-01-07 2013-06-06 Olympus Corp Feature matching method and article recognition system
CN103235803A (en) * 2013-04-17 2013-08-07 北京京东尚科信息技术有限公司 Method and device for acquiring object attribute values from text
US20150026017A1 (en) * 2013-07-16 2015-01-22 Toshiba Tec Kabushiki Kaisha Information processing apparatus and information processing method
US20160260033A1 (en) * 2014-05-09 2016-09-08 Peter Keyngnaert Systems and Methods for Similarity and Context Measures for Trademark and Service Mark Analysis and Repository Searchess
WO2018041168A1 (en) * 2016-08-31 2018-03-08 腾讯科技(深圳)有限公司 Information pushing method, storage medium and server
CN108960945A (en) * 2017-05-18 2018-12-07 北京京东尚科信息技术有限公司 Method of Commodity Recommendation and device
CN110738553A (en) * 2019-10-18 2020-01-31 深圳市比量科技传媒有限公司 method and system for mapping commodity links of different shopping malls to each other
WO2020108608A1 (en) * 2018-11-29 2020-06-04 腾讯科技(深圳)有限公司 Search result processing method, device, terminal, electronic device, and storage medium
CN111353838A (en) * 2018-12-21 2020-06-30 北京京东尚科信息技术有限公司 Method and device for automatically checking commodity category
US20200242486A1 (en) * 2019-01-29 2020-07-30 Ricoh Company, Ltd. Method and apparatus for recognizing intention, and non-transitory computer-readable recording medium
EP3699780A1 (en) * 2019-02-21 2020-08-26 Beijing Baidu Netcom Science And Technology Co. Ltd. Method and apparatus for recommending entity, electronic device and computer readable medium
WO2020199591A1 (en) * 2019-03-29 2020-10-08 平安科技(深圳)有限公司 Text categorization model training method, apparatus, computer device, and storage medium
CN112036981A (en) * 2020-09-02 2020-12-04 珠海随变科技有限公司 Method, device, equipment and medium for providing target comparison commodities

Also Published As

Publication number Publication date
CN113793191B (en) 2024-05-24

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant