CN116543180A - Method, device, equipment and storage medium for checking similarity of articles - Google Patents

Publication number
CN116543180A
Authority
CN
China
Prior art keywords
article
feature vector
item
articles
value
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210090396.8A
Other languages
Chinese (zh)
Inventor
苏潇
张雄伟
周明龙
陶通
李勇
包勇军
颜伟鹏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Jingdong Century Trading Co Ltd
Beijing Wodong Tianjun Information Technology Co Ltd
Original Assignee
Beijing Jingdong Century Trading Co Ltd
Beijing Wodong Tianjun Information Technology Co Ltd
Application filed by Beijing Jingdong Century Trading Co Ltd and Beijing Wodong Tianjun Information Technology Co Ltd
Priority to CN202210090396.8A
Publication of CN116543180A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Abstract

The application provides a method, a device, equipment and a storage medium for checking the similarity of articles, and relates to the field of data processing. The method comprises the following steps: acquiring an article set to be checked, and acquiring a pre-stored feature vector of each article from a database based on identification data in the article set, wherein the feature vector of each article characterizes the article in multiple dimensions; traversing each article in the article set and determining a local outlier factor value of each article according to the feature vectors of the plurality of articles in the set, wherein the local outlier factor value of each article characterizes the degree to which the article is an outlier relative to the articles within its k-neighborhood; and updating the article set according to the local outlier factor values of the plurality of articles, thereby eliminating abnormal articles. The scheme detects abnormal articles in a similar article set.

Description

Method, device, equipment and storage medium for checking similarity of articles
Technical Field
The present disclosure relates to the field of data processing technologies, and in particular, to a method, an apparatus, a device, and a storage medium for checking similarity of articles.
Background
In electronic commerce, similar article sets can often implicitly express certain characteristics of articles, and the similar article sets have important application in various business scenes, for example, the similar article sets can be used for recommending articles for users in a recommendation system, and can also be used for merchants and platforms to automatically label the articles.
The sources of the similar article sets are various, such as merchant self-uploading, platform operator selection, algorithm automatic extraction and the like, and the similarity degree of the similar article sets cannot be ensured generally. Outliers in a collection of similar items often have a greater impact on subsequent tasks, such as decreasing the accuracy of item recommendations. Therefore, it is necessary and critical to perform a similarity check on a set of similar items.
Disclosure of Invention
The embodiment of the application provides a method, a device, equipment and a storage medium for checking the similarity of articles, which are used for detecting abnormal articles in a similar article set.
A first aspect of an embodiment of the present application provides an article similarity checking method, including:
acquiring an article set to be verified, wherein the article set to be verified comprises identification data of a plurality of articles;
acquiring feature vectors of the plurality of articles from a database based on the identification data;
determining a local outlier factor value of each article by traversing each article of the plurality of articles according to the feature vectors of the plurality of articles, wherein the local outlier factor value is used for indicating the outlier degree of the article relative to other articles in a k adjacent distance range of the article, and k is a positive integer;
and updating the to-be-checked article set according to the local outlier factor values of the plurality of articles, wherein the number of the articles of the updated article set is smaller than or equal to the number of the articles of the to-be-checked article set.
In an optional embodiment of the first aspect of the present application, the determining, by traversing each item of the plurality of items, a local outlier factor value of each item according to the feature vectors of the plurality of items includes:
determining an average local reachable density value of all articles except the first article in a k adjacent distance range of the first article and a local reachable density value of the first article according to the characteristic vectors of the plurality of articles; the first article is any one of the plurality of articles;
and determining a local outlier factor value of the first article according to the average local reachable density value and the local reachable density value of the first article.
In an optional embodiment of the first aspect of the present application, determining the local reachable density value of the first article according to the feature vectors of the plurality of articles includes:
determining a reachable distance between the first item and each item other than the first item in the k-nearest distance range of the first item according to feature vectors of all items in the k-nearest distance range of the first item in the plurality of items;
a local reachable density value of the first item is determined based on a reachable distance between the first item and each item other than the first item within a k-nearest distance range of the first item and a total number of items other than the first item within the k-nearest distance range of the first item.
In an optional embodiment of the first aspect of the present application, the determining, according to the feature vectors of all items in the k-adjacent distance range of the first item in the plurality of items, a reachable distance between the first item and each item other than the first item in the k-adjacent distance range of the first item includes:
determining a first distance value according to the characteristic vector of the first article and the characteristic vector of a second article, wherein the second article is any article except the first article in a k adjacent distance range of the first article;
acquiring a k adjacent distance value of the second article;
and determining the reachable distance between the first article and the second article according to the first distance value and the k adjacent distance value of the second article.
In an optional embodiment of the first aspect of the present application, the first distance value is a cosine distance value; the determining a first distance value according to the feature vector of the first article and the feature vector of the second article comprises: the first distance value is determined by the following formula:
$$d(p, o) = 1 - \frac{p \cdot o}{\lVert p \rVert \, \lVert o \rVert}$$
where d (p, o) represents a first distance value between the first article and the second article, p represents a feature vector of the first article, and o represents a feature vector of the second article.
In an optional embodiment of the first aspect of the present application, the determining the reachable distance between the first article and the second article according to the first distance value and the k-nearest distance value of the second article includes:
and taking the larger value of the first distance value and the k adjacent distance value of the second article as the reachable distance between the first article and the second article.
In an optional embodiment of the first aspect of the application, the determining the local outlier factor value of the first article according to the average local reachable density value and the local reachable density value of the first article includes:
taking the ratio of the average local reachable density value to the local reachable density value of the first article as the local outlier factor value of the first article.
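The ratio described in this embodiment can be sketched as follows. This is a minimal illustration rather than the application's implementation: the function and parameter names are hypothetical, and the local reachable density values are assumed to have been computed already.

```python
def local_outlier_factor(lrd_first, neighbour_lrds):
    """Ratio of the neighbours' average local reachable density to that of the first article.

    lrd_first      -- local reachable density value of the first article (assumed > 0)
    neighbour_lrds -- local reachable density values of the other articles in its
                      k adjacent distance range (hypothetical input, computed elsewhere)
    """
    if not neighbour_lrds:
        raise ValueError("at least one neighbouring article is required")
    average_lrd = sum(neighbour_lrds) / len(neighbour_lrds)
    return average_lrd / lrd_first


# An article whose density matches its neighbours gets a value of 1.
same_density = local_outlier_factor(2.0, [2.0, 2.0])
# An article in a sparser region than its neighbours gets a value above 1.
sparser = local_outlier_factor(0.5, [2.0, 2.0])
```

With this ratio, a value close to 1 means the first article lies in a region about as dense as that of its neighbours, and a value well above 1 means it lies in a sparser region.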
In an optional embodiment of the first aspect of the present application, the updating the set of items to be verified according to the local outlier factor values of the plurality of items includes:
and removing the articles with the local outlier factor values smaller than a preset threshold value from the articles to obtain an updated article set.
In an optional embodiment of the first aspect of the present application, creating the feature vector of any one of the plurality of articles includes:
acquiring a text feature vector, an attribute feature vector and a picture feature vector of the article, wherein the text feature vector is obtained by carrying out vector modeling on text information of the article, the attribute feature vector is obtained by carrying out vector modeling on attribute information of the article, and the picture feature vector is obtained by carrying out vector modeling on picture information of the article;
and carrying out multi-mode information fusion on the text feature vector, the attribute feature vector and the picture feature vector of the article to obtain the feature vector of the article.
In an optional embodiment of the first aspect of the present application, the performing multi-modal information fusion on the text feature vector, the attribute feature vector, and the picture feature vector of the article to obtain a feature vector of the article includes:
carrying out feature vector weighting processing on the text feature vector, the attribute feature vector and the picture feature vector of the object by adopting an attention mechanism model to obtain a feature vector of the object;
the input of the attention mechanism model comprises the text feature vector, the attribute feature vector and the picture feature vector of the item, and the output of the attention mechanism model comprises weight values of the text feature vector, the attribute feature vector and the picture feature vector of the item.
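The weighting step can be sketched as follows. The application does not specify the internal structure of the attention mechanism model, so this sketch makes several assumptions: the model is replaced by a list of three hypothetical raw scores, the scores are normalized with a softmax to obtain the weight values, the three feature vectors share the same dimension, and the fused vector is their weighted sum. All names are illustrative.

```python
import math


def softmax(scores):
    """Normalize raw scores into positive weights that sum to 1."""
    largest = max(scores)
    exps = [math.exp(s - largest) for s in scores]  # shift by max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]


def fuse(text_vec, attr_vec, pic_vec, scores):
    """Weighted sum of the three modality vectors.

    scores -- hypothetical raw attention scores for (text, attribute, picture);
              in practice they would come from a trained attention model.
    """
    if not (len(text_vec) == len(attr_vec) == len(pic_vec)):
        raise ValueError("this sketch assumes all three vectors have the same dimension")
    w_text, w_attr, w_pic = softmax(scores)
    fused = [w_text * t + w_attr * a + w_pic * p
             for t, a, p in zip(text_vec, attr_vec, pic_vec)]
    return fused, [w_text, w_attr, w_pic]


# Equal scores give equal weights of one third each.
fused, weights = fuse([1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0, 0.0])
```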
In an optional embodiment of the first aspect of the present application, the acquiring the text feature vector of the article includes:
acquiring Word vectors of each Word in text information of the article by using a Word2vec model;
and obtaining the text feature vector of the article by carrying out weighted summation on word vectors of all words in the text information of the article.
In an optional embodiment of the first aspect of the present application, the acquiring the attribute feature vector of the article includes:
determining an attribute vector of each attribute in the attribute information of the article by adopting an article knowledge graph model;
and obtaining the attribute feature vector of the article by carrying out weighted summation on the attribute vectors of all the attributes in the attribute information of the article.
In an optional embodiment of the first aspect of the present application, the acquiring the picture feature vector of the object includes:
extracting a target area in the picture information of the article by using an image detection algorithm, wherein the target area corresponds to the article;
inputting the picture of the target area into a convolutional neural network (CNN) to obtain the picture feature vector of the article.
A second aspect of embodiments of the present application provides an article similarity checking device, including:
an acquisition module, configured to acquire an article set to be verified, wherein the article set to be verified comprises identification data of a plurality of articles, and to acquire feature vectors of the plurality of articles from a database based on the identification data;
a processing module, configured to determine, by traversing each article of the plurality of articles, a local outlier factor value for each article according to the feature vectors of the plurality of articles, where the local outlier factor value is used to indicate an outlier degree of the article relative to other articles in a k-adjacent distance range of the article, and k is a positive integer;
an updating module, configured to update the article set to be verified according to the local outlier factor values of the plurality of articles, wherein the number of articles in the updated article set is smaller than or equal to the number of articles in the article set to be verified.
A third aspect of the embodiments of the present application provides an electronic device, including:
a memory;
a processor; and
a computer program;
wherein the computer program is stored in the memory and configured to be executed by the processor to implement the method of any of the first aspects.
A fourth aspect of embodiments of the present application provides a computer readable storage medium having stored thereon a computer program for execution by a processor to implement a method as in any of the first aspects.
A fifth aspect of embodiments of the present application provides a computer program product comprising a computer program which, when executed by a processor, implements the method of any of the first aspects.
The embodiments of the application provide a method, a device, equipment and a storage medium for checking the similarity of articles. The method comprises the following steps: acquiring an article set to be checked, and acquiring a pre-stored feature vector of each article from a database based on identification data in the article set, wherein the feature vector of each article characterizes the article in multiple dimensions; traversing each article in the article set and determining a local outlier factor value of each article according to the feature vectors of the plurality of articles in the set, wherein the local outlier factor value of each article characterizes the degree to which the article is an outlier relative to the articles within its k-neighborhood; and updating the article set according to the local outlier factor values of the plurality of articles, thereby eliminating abnormal articles. The scheme detects abnormal articles in a similar article set.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, a brief description will be given below of the drawings that are needed in the embodiments or the prior art descriptions, it being obvious that the drawings in the following description are some embodiments of the present application, and that other drawings may be obtained according to these drawings without inventive effort to a person skilled in the art.
Fig. 1 is a schematic view of a scenario of an article similarity checking method provided in an embodiment of the present application;
Fig. 2 is a schematic flow chart of an article similarity checking method according to an embodiment of the present application;
Fig. 3 is a schematic distribution diagram of an article set to be detected in a two-dimensional space according to an embodiment of the present application;
Fig. 4 is a schematic flow chart of creating an article feature vector according to an embodiment of the present application;
Fig. 5 is a schematic structural diagram of an article similarity checking device according to an embodiment of the present application;
Fig. 6 is a hardware configuration diagram of an electronic device according to an embodiment of the present application.
Specific embodiments thereof have been shown by way of example in the drawings and will herein be described in more detail. These drawings and the written description are not intended to limit the scope of the inventive concepts in any way, but to illustrate the concepts of the present application to those skilled in the art by reference to specific embodiments.
Detailed Description
For the purposes of making the objects, technical solutions and advantages of the embodiments of the present application more clear, the technical solutions of the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is apparent that the described embodiments are some embodiments of the present application, but not all embodiments. All other embodiments, which can be made by one of ordinary skill in the art based on the embodiments herein without making any inventive effort, are intended to be within the scope of the present application.
The terms first, second and the like in the description of embodiments of the present application, in the claims and in the above-described figures, are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the present application described herein may be implemented in other sequences than those illustrated or otherwise described herein.
It should be understood that the terms "comprises" and "comprising," and any variations thereof, as used herein, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements that are expressly listed or inherent to such process, method, article, or apparatus.
In the description of the embodiments of the present application, the term "corresponding" may indicate that there is a direct correspondence or an indirect correspondence between the two, or may indicate that there is an association between the two, or may indicate a relationship between the two and the indicated, configured, or the like.
At present, data platforms such as e-commerce platforms provide users with services such as article recommendation: the platform recommends articles that may interest a user according to the user's historical purchase or browsing records, making it easier for the user to purchase them. The recommendation process relies on similar article sets constructed by the platform, and if the articles in a similar article set are not sufficiently similar, accurate recommendations cannot be provided to users.
The existing similarity checking method for similar article sets comprises the following steps:
First, manual verification. The platform operator regularly checks each article in a given similar article set to determine whether an abnormal article exists, and removes any abnormal article from the set.
Second, rule-based screening. Fine-grained article classification rules are formulated, such as setting keywords for similar articles, and the articles meeting certain category requirements are further screened from the article set.
Third, main-data-based detection. Main data of the articles is screened from the article data and classified, and algorithms such as the threshold mutation method and isolation forest are then used to determine whether abnormal articles exist.
The first method is labor-intensive and costly, and the amount of data it can process is limited. The second method generalizes poorly, and its rules may become overly complex and rigidly restrict the article features that can be considered. The third method requires main data to be defined for the articles, does not use sufficiently comprehensive article information, and applies only to limited scenarios.
In view of the above problems, the similar article sets in current data platforms need further similarity verification, and the embodiments of the present application provide an article similarity verification method. The main inventive idea is as follows: first, the multidimensional data of each article in the data platform is modeled to generate general-purpose article feature vectors, so that the feature vectors are suitable for various business scenarios such as article recommendation and article classification. Then, the feature vector of each article in the similar article set (namely the initial similar article set, or the article set to be checked) is acquired, and an improved local outlier factor (LOF) algorithm is used to traverse each article in the set and judge whether it is an abnormal point, until anomaly detection has been completed for all articles in the set. Detecting the abnormal points in the similar article set in this way can improve the accuracy of article classification on the data platform, and in turn the accuracy of its article recommendation.
Before introducing the article similarity verification method provided by the application, firstly, an application scene of the verification method is briefly introduced. Fig. 1 is a schematic view of a scenario of an article similarity checking method according to an embodiment of the present application. As shown in fig. 1, the scene includes an article similarity checking device 11 and a data platform 12, where the article similarity checking device 11 is communicatively connected to the data platform 12.
As an example, the article similarity checking device 11 acquires a similar article data set from the data platform 12, performs vector analysis on information of multiple dimensions of articles in the similar article data set, detects abnormal article data in the similar article data set, and eliminates the abnormal article data from the similar article data set to realize checking of the article similarity in the similar article data set.
Optionally, the data platform 12 may be various data platforms such as an electronic commerce platform, a financial platform, a short video platform, etc., and the data platform 12 provides services such as searching for articles, ordering, etc.
Alternatively, the article similarity checking device 11 may be used as a processing device independent of the data platform 12; it may also be integrated into the data platform 12 as a functional module of the data platform 12, which is not limited in any way by the embodiments of the present application.
Based on the application scenario, the technical scheme provided by the embodiment of the application is described in detail through a specific embodiment. It should be noted that, the technical solution provided in the embodiments of the present application may include some or all of the following, and the following specific embodiments may be combined with each other, and the same or similar concepts or processes may not be described in some embodiments.
Fig. 2 is a flow chart of an article similarity checking method according to an embodiment of the present application. The method for checking the similarity of the articles can be applied to the device 11 for checking the similarity of the articles shown in fig. 1, and as shown in fig. 2, the method comprises the following steps:
Step 201, acquiring an article set to be verified, wherein the article set to be verified comprises identification data of a plurality of articles.
In this embodiment, the set of items to be inspected is an initial set of similar items. The sources of the similar article sets are different in different business scenes, including but not limited to self-uploading of users (such as article labels selected when the merchant uploads articles), manual selection of platform operation, automatic extraction of algorithms and the like. In the scene based on the construction of the article labels, the similar article collection is usually uploaded by a user and manually selected by operators, so that the article similarity cannot be ensured.
In this embodiment, the set of items to be inspected includes identification data for a plurality of items, where the identification data may be the number of the item for uniquely identifying the item.
By way of example, similar item set 1 is a set under the item tag "seaside", which may contain apparel, sunglasses, athletic equipment and other items suitable for use at the seaside. Similar item set 2 is a set under the item tag "plus-size crowd", which contains items such as clothes suitable for heavier people and for people who like loose styles.
It should be noted that an item tag is a descriptive word or phrase that can be used for item recommendation, such as "party", "plus-size crowd" or "camouflage style". A tag may also be time-sensitive, such as "Qixi Festival" or "Teachers' Day". The embodiments of the present application place no specific limitation on this.
Because manual selection involves a certain amount of deviation, abnormal items may be mixed into a similar item set. For example, the set under the tag "seaside" may contain abnormal items such as a sports watch with poor water resistance, and the set under the tag "plus-size crowd" may contain abnormal items such as an item whose picture shows slim-fit clothing.
Step 202, obtaining feature vectors of a plurality of objects from a database based on the identification data.
In this embodiment, the database of the data platform stores feature vectors of all the items in the data platform, and the feature vector of each item is created when the merchant uploads the relevant information of the item. After the article feature vector is created, the article identifier and the article feature vector can be bound and stored in a database of the data platform, namely, the database comprises the corresponding relation between the identification data of the article and the feature vector.
In this embodiment, the feature vector of each item is determined based on the feature vectors of each item in multiple dimensions. Feature vectors for each item in multiple dimensions include, but are not limited to, text feature vectors, attribute feature vectors, and picture feature vectors for the item. For a specific creation process of the item feature vector, see later embodiments.
The text feature vector of the article is obtained by carrying out vector modeling on the text information of the article, the attribute feature vector of the article is obtained by carrying out vector modeling on the attribute information of the article, and the picture feature vector of the article is obtained by carrying out vector modeling on the picture information of the article.
By way of example, the text information of an item includes, but is not limited to, the title and description of the item. The attribute information of the item covers various attributes of the item, such as its place of origin, applicable season and applicable crowd. The picture information of the item includes the main promotional image of the item.
Step 203, determining a local outlier factor value of each article by traversing each article of the plurality of articles according to the feature vectors of the plurality of articles. The local outlier factor value is used to indicate the outlier degree of the article relative to other articles in the k-adjacent distance range of the article, k being a positive integer.
Specifically, an article, such as a first article, is randomly selected from the set of articles to be verified, and the local outlier factor value of the first article is determined. The next article is then randomly selected from the remaining articles in the set, and its local outlier factor value is determined, and so on until all the articles in the set of articles to be verified have been traversed.
For ease of understanding, the data processing procedure of this step will be described in detail below taking the first article in the article set to be inspected as an example. Wherein the first article is any one of a plurality of articles in the article set to be inspected.
In an alternative embodiment of the present embodiment, step 203 includes:
Step 2031, determining an average local reachable density value of all items except the first item within a k-nearest distance range of the first item and a local reachable density value of the first item according to the feature vectors of the plurality of items.
In an alternative embodiment, the local reachable density value of the first item is determined by: determining a reachable distance between the first article and each article except the first article in the k adjacent distance range of the first article according to the feature vectors of all the articles in the k adjacent distance range of the first article in the plurality of articles; a local reachable density value of the first item is determined based on the reachable distance between the first item and each item other than the first item within a k-nearest distance range of the first item and a total number of items other than the first item within the k-nearest distance range of the first item.
As one example, a local reachable density value of the first item is determined by equation one:
$$\mathrm{lrd}_k(p) = 1 \Big/ \left( \frac{\sum_{o \in N_k(p)} \mathrm{reach\_dist}_k(p, o)}{|N_k(p)|} \right)$$
where lrd_k(p) represents the local reachable density value of the first item p, and reach_dist_k(p, o) represents the reachable distance between the first item p and the second item o, which is any item other than the first item p within the k-nearest distance range of the first item p. |N_k(p)| represents the total number of items other than the first item p within the k-adjacent distance range of the first item p, where |N_k(p)| ≥ k.
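As a minimal sketch of equation one, assuming it takes the standard LOF form (the inverse of the average reachable distance from the first item to the items in its k-adjacent distance range), the local reachable density can be computed from a list of precomputed reachable distances. The function name and the sample values are hypothetical.

```python
def local_reachable_density(reach_dists):
    """Inverse of the mean reachable distance from an item to its neighbours.

    reach_dists -- one reachable distance per item in the k-adjacent distance
                   range of the first item, so len(reach_dists) equals |N_k(p)|.
    """
    if not reach_dists:
        raise ValueError("the k-adjacent distance range must contain at least one item")
    mean_reach = sum(reach_dists) / len(reach_dists)
    if mean_reach == 0:
        # All neighbours coincide with the item (e.g. duplicates): density is unbounded.
        return float("inf")
    return 1.0 / mean_reach


# Small reachable distances mean a dense neighbourhood and a high density value.
dense = local_reachable_density([0.1, 0.1, 0.2, 0.2])
# Large reachable distances mean a sparse neighbourhood and a low density value.
sparse = local_reachable_density([0.8, 0.9, 1.0, 0.9])
```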
In an alternative embodiment, the reachable distance between the first item and the second item is determined by: determining a first distance value according to the feature vector of the first article and the feature vector of the second article; acquiring a k adjacent distance value of the second article; and determining the reachable distance between the first article and the second article according to the first distance value and the k adjacent distance value of the second article. The first distance value refers to the distance value between the feature vectors of the first article and the second article.
As one example, the first distance value is a cosin distance value. Specifically, a first distance value is determined by the formula two:
where d (p, o) represents a first distance value between the first article and the second article, p represents a feature vector of the first article, and o represents a feature vector of the second article.
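To make the first distance value concrete, the following is a minimal sketch that assumes the common definition of cosine distance as one minus cosine similarity. That definition, the function name `cosine_distance`, and the zero-vector check are assumptions for illustration and are not taken from formula two itself.

```python
import math
from typing import Sequence


def cosine_distance(p: Sequence[float], o: Sequence[float]) -> float:
    """Cosine distance between two feature vectors.

    Assumed definition: 1 - (p . o) / (|p| * |o|).
    """
    dot = sum(a * b for a, b in zip(p, o))
    norm_p = math.sqrt(sum(a * a for a in p))
    norm_o = math.sqrt(sum(b * b for b in o))
    if norm_p == 0.0 or norm_o == 0.0:
        # The angle between vectors is undefined if either vector is zero.
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_p * norm_o)
```

Under this assumed definition, vectors pointing in the same direction have distance 0 and orthogonal vectors have distance 1, regardless of their lengths.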
As one example, the larger of the first distance value and the k-nearest distance value of the second item is taken as the reachable distance between the first item and the second item. Specifically, the reachable distance between the first item and the second item is determined by formula three:

$$\mathrm{reach\_dist}_k(p, o) = \max\{k\text{-distance}(o),\ d(p, o)\}$$

where $\mathrm{reach\_dist}_k(p, o)$ represents the reachable distance between the first item p and the second item o, and k-distance(o) represents the k-nearest distance value of the second item o.
The above examples are described below with reference to the accompanying drawings. Fig. 3 is a schematic diagram of distribution of an article set to be detected in a two-dimensional space according to an embodiment of the present application. In the actual business scenario, the feature vector of the article is multidimensional, and for ease of understanding, the present example only uses a two-dimensional feature vector as an example to describe the scheme.
Assuming that k is 4, as shown in Fig. 3, the k-nearest distance range of the first item p contains items a, b, o and d in addition to p itself. Ordered from nearest to farthest from p, these are d, b, a, o, so the k-nearest distance value of the first item p is the distance between p and item o. The k-nearest distance range of the second item o contains items c, e, f, g and h in addition to o itself. Ordered from nearest to farthest from o, these are h, e, f, c (g), where c and g are at the same distance from o, so the k-nearest distance value of the second item o is the distance between o and item c (or g). It follows that $|N_k(p)|$ equals k, while $|N_k(o)|$ is greater than k.
The reachable distance between the first item p and the second item o is determined from the first distance value between p and o and the k-nearest distance value of the second item o. As shown in Fig. 3, the first distance value between the first item p and the second item o is d(p, o), and the k-nearest distance value of the second item o is d(o, c). Since d(p, o) > d(o, c) in Fig. 3, d(p, o) is taken as the reachable distance between the first item p and the second item o. On the same principle, the reachable distances between the first item p and each of items a, b and d can be determined, and the local reachable density value of the first item p is then determined based on formula one. For example, if the reachable distances between the first item p and items a, b, o and d are denoted $d_1$, $d_2$, $d_3$ and $d_4$ respectively, the local reachable density value of the first item p is $\mathrm{lrd}_4(p) = 4/(d_1 + d_2 + d_3 + d_4)$.
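The worked example above can be sketched in a few lines. The function names and the numeric distances below are hypothetical and chosen only for illustration; they are not read from Fig. 3.

```python
from typing import Sequence


def reach_dist(d_po: float, k_distance_o: float) -> float:
    """Reachable distance: the larger of d(p, o) and the k-nearest distance of o."""
    return max(d_po, k_distance_o)


def local_reachable_density(reach_dists: Sequence[float]) -> float:
    """Number of neighbors divided by the sum of their reachable distances."""
    return len(reach_dists) / sum(reach_dists)


# Hypothetical (d(p, x), k-distance(x)) pairs for the four neighbors of p.
pairs = [(2.0, 1.0), (1.0, 1.5), (3.0, 2.5), (0.5, 1.5)]
d = [reach_dist(d_px, k_dist_x) for d_px, k_dist_x in pairs]
lrd_p = local_reachable_density(d)  # 4 / (2.0 + 1.5 + 3.0 + 1.5)
```

Taking the maximum means that a neighbor sitting very close to p still contributes at least its own k-nearest distance, which smooths the density estimate inside dense clusters.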
Step 2032, determining a local outlier factor value for the first item based on the average local reachable density value and the local reachable density value for the first item.
In an alternative embodiment of the present embodiment, the ratio of the average local reachable density value to the local reachable density value of the first item is taken as the local outlier factor value of the first item. Specifically, the local outlier factor value of the first item may be determined by formula four:

$$\mathrm{LOF}_k(p) = \frac{\sum_{o \in N_k(p)} \mathrm{lrd}_k(o) \,/\, |N_k(p)|}{\mathrm{lrd}_k(p)}$$

where $\mathrm{lrd}_k(o)$ represents the local reachable density value of the second item o within its k-nearest distance range, $\sum_{o \in N_k(p)} \mathrm{lrd}_k(o) / |N_k(p)|$ represents the average local reachable density value of all items other than the first item p within the k-nearest distance range of the first item p, and $\mathrm{lrd}_k(p)$ represents the local reachable density value of the first item p within its k-nearest distance range. The local reachable density value of the second item o is determined in the same way as that of the first item p; reference is made to the above embodiments, and details are not repeated here.
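Steps 2031 and 2032 can be combined into one short sketch. The function names, the toy data, and the use of Euclidean distance (`math.dist`) for a two-dimensional illustration are assumptions; a cosine distance function could be passed as `dist` instead. The sketch recomputes distances on every call, so it is suitable only for small sets.

```python
import math
from typing import Callable, Dict, List, Sequence

Vector = Sequence[float]
Distance = Callable[[Vector, Vector], float]


def k_distance(p: str, items: Dict[str, Vector], k: int, dist: Distance) -> float:
    """Distance from item p to its k-th nearest other item."""
    ranked = sorted(dist(items[p], v) for name, v in items.items() if name != p)
    return ranked[k - 1]


def neighbors(p: str, items: Dict[str, Vector], k: int, dist: Distance) -> List[str]:
    """Items other than p within p's k-distance; ties can push the count above k."""
    radius = k_distance(p, items, k, dist)
    return [name for name, v in items.items()
            if name != p and dist(items[p], v) <= radius]


def lrd(p: str, items: Dict[str, Vector], k: int, dist: Distance) -> float:
    """Local reachable density: neighbor count over summed reachable distances."""
    nbrs = neighbors(p, items, k, dist)
    total = sum(max(k_distance(o, items, k, dist), dist(items[p], items[o]))
                for o in nbrs)
    return len(nbrs) / total  # raises ZeroDivisionError if all distances are 0


def lof(p: str, items: Dict[str, Vector], k: int, dist: Distance) -> float:
    """Average neighbor density divided by the density of p itself."""
    nbrs = neighbors(p, items, k, dist)
    mean_neighbor_lrd = sum(lrd(o, items, k, dist) for o in nbrs) / len(nbrs)
    return mean_neighbor_lrd / lrd(p, items, k, dist)


# Toy set: a, b, c form a cluster; p lies apart from it.
items = {"a": (0.0, 0.0), "b": (1.0, 0.0), "c": (2.0, 0.0), "p": (5.0, 0.0)}
scores = {name: lof(name, items, k=2, dist=math.dist) for name in items}
```

In this toy set, p scores about 2.04 while a, b and c stay below 1.5, so only p would be treated as an outlier under a threshold of 1.5.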
It should be noted that, the larger the local outlier factor value of the first article, the smaller the local reachable density of the first article is, and the more likely the first article is an outlier.
Step 204, updating the item set to be verified according to the local outlier factor values of the plurality of items, wherein the number of items in the updated item set is less than or equal to the number of items in the item set to be verified.
In an alternative embodiment of the present embodiment, items whose local outlier factor values are greater than or equal to a preset threshold are removed from the plurality of items, resulting in an updated item set that retains only the items whose local outlier factor values are less than the preset threshold.
The preset threshold may be about 1.5, and if the local outlier factor value of an article in the article set to be inspected is greater than or equal to 1.5, the article is indicated to be an abnormal article, and the article is removed from the article set; if the local outlier factor value of one article in the article set to be inspected is smaller than 1.5, indicating that the article is a normal article, and keeping the article in the article set.
It should be noted that, the preset threshold may be set reasonably according to an actual service scenario, and the embodiment of the present application is not limited.
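The update rule of step 204 can be sketched as a simple filter. The function name `update_item_set` and the item identifiers are hypothetical; the default of 1.5 follows the example threshold given above.

```python
from typing import Dict, List


def update_item_set(lof_by_item: Dict[str, float], threshold: float = 1.5) -> List[str]:
    """Keep items whose local outlier factor is below the threshold.

    Items at or above the threshold are treated as abnormal and dropped.
    """
    return [item_id for item_id, score in lof_by_item.items() if score < threshold]


kept = update_item_set({"sku-1": 0.9, "sku-2": 1.5, "sku-3": 2.3})
```

Because items are only ever removed, the updated set can never contain more items than the set to be verified.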
In the article similarity checking method provided by this embodiment, an article set to be checked is obtained, and a pre-stored feature vector of each article is obtained from a database based on the identification data in the article set to be checked, where the feature vector of each article characterizes the article in multiple dimensions. Each article in the article set to be checked is traversed, and a local outlier factor value of each article is determined according to the feature vectors of the plurality of articles in the set, where the local outlier factor value of each article characterizes the degree to which the article is an outlier relative to the articles in its k-neighborhood. The article set to be checked is then updated according to the local outlier factor values of the plurality of articles, and abnormal articles are eliminated. The scheme thus detects abnormal articles in a similar-article set. Using the similar-article set obtained in this way for article classification can improve the accuracy of article classification on the data platform, and can also improve the accuracy of article recommendation on the data platform.
Based on the above embodiments, a detailed description is given below of how the item feature vector is created in the above embodiments. The article feature vectors shown in the embodiment are obtained through multi-dimensional feature vector fusion, so that article features can be more comprehensively represented, and a data basis is provided for subsequent article classification and recommendation.
Fig. 4 is a schematic flow chart of creating an article feature vector according to an embodiment of the present application. As shown in fig. 4, the process of creating the feature vector of the article includes the steps of:
Step 301, obtaining a text feature vector of an article.
The purpose of this step is to vector model the text information of the item.
In an alternative embodiment, a Word2vec model is used to obtain the word vector of each word in the text information of the article, and the word vectors of all words in the text information of the article are weighted and summed to obtain the text feature vector of the article.
For example, assuming that the text information of an item includes n words, the word vectors of the n words are obtained through the Word2vec model and denoted $v_1, v_2, \ldots, v_n$; the vector of the text information of the item as a whole (i.e., the text feature vector of the item) is then expressed as $v_{text} = (v_1 + v_2 + \cdots + v_n)/n$. Alternatively, a weight value may be determined for each word according to its importance, and the vector of the text information may be represented by weighted summation.
In an alternative embodiment, a model such as fastText or GloVe may also be used to obtain the word vector of each word in the text information of the item. The model used to obtain the word vectors of the text information of the item is not particularly limited in the embodiments of the present application.
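The pooling described above can be sketched as follows. The function name `pool_vectors` is hypothetical, and the word vectors are assumed to have been produced already by a word-embedding model, which is not shown here.

```python
from typing import List, Optional, Sequence


def pool_vectors(vectors: Sequence[Sequence[float]],
                 weights: Optional[Sequence[float]] = None) -> List[float]:
    """Weighted sum of equal-length vectors; defaults to the plain average."""
    if not vectors:
        raise ValueError("at least one vector is required")
    if weights is None:
        # Uniform weights 1/n reproduce (v_1 + ... + v_n) / n.
        weights = [1.0 / len(vectors)] * len(vectors)
    if len(weights) != len(vectors):
        raise ValueError("one weight per vector is required")
    dim = len(vectors[0])
    return [sum(w * v[i] for w, v in zip(weights, vectors)) for i in range(dim)]


word_vectors = [[1.0, 2.0], [3.0, 4.0]]
v_text_mean = pool_vectors(word_vectors)
v_text_weighted = pool_vectors(word_vectors, weights=[0.25, 0.75])
```

The same helper applies unchanged to attribute vectors, since the attribute feature vector is pooled in the same way.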
Step 302, obtaining an attribute feature vector of the article.
The purpose of this step is to model the vector of various attribute information of the item. The attributes of the item may be considered as entities in the item knowledge graph, so attribute feature vectors of the item may be constructed based on pre-constructed item knowledge graphs.
In an alternative embodiment, an item knowledge graph model, such as a TransE or TransR knowledge graph model, is used to determine an attribute vector for each attribute in the attribute information of the item; the attribute feature vector of the item is then obtained by weighted summation of the attribute vectors of all attributes in the attribute information of the item.
For example, assuming that an item has m main attributes, the attribute vectors of the m attributes are first obtained through the item knowledge graph model and denoted $v_1, v_2, \ldots, v_m$; the vector of the item attribute information as a whole (i.e., the attribute feature vector of the item) can then be expressed as $v_{attr} = (v_1 + v_2 + \cdots + v_m)/m$. Alternatively, a weight value may be determined for each attribute according to its importance, and the vector of the attribute information may be represented by weighted summation.
Step 303, obtaining a picture feature vector of the object.
The purpose of this step is to vector model the picture information of the item.
In an alternative embodiment, a target area in the picture information of the item is extracted using an image detection algorithm, the target area corresponding to the item; the picture of the target area is then input into a convolutional neural network (CNN) to obtain the picture feature vector of the item. In this embodiment, the target area, i.e., the region of interest (ROI) of the picture, is encoded by the CNN. If the picture containing the item is denoted as x, the vector of the picture information of the item (i.e., the picture feature vector of the item) can be expressed as $v_{image} = \mathrm{CNN}(\mathrm{ROI}(x))$.
It should be noted that the dimensions of the feature vectors of steps 301 to 303 described above are the same.
Step 304, performing multi-modal information fusion on the text feature vector, the attribute feature vector and the picture feature vector of the item to obtain the feature vector of the item, wherein the feature vector of the item is the integrated feature vector of the item.
In an alternative embodiment, the attention mechanism model is used for carrying out feature vector weighting processing on the text feature vector, the attribute feature vector and the picture feature vector of the object to obtain the feature vector of the object. The input of the attention mechanism model comprises a text feature vector, an attribute feature vector and a picture feature vector of the object, and the output of the attention mechanism model comprises weight values of the text feature vector, the attribute feature vector and the picture feature vector of the object.
The attention mechanism model can dynamically assign different weights to the information sources of the item, to suit downstream tasks that need to focus on different information sources. Through multi-dimensional feature vector fusion, vectors from different feature spaces are integrated into the same space to obtain a unified vector representation of the item, which can be written as: $v = \alpha_{text} v_{text} + \alpha_{attr} v_{attr} + \alpha_{image} v_{image}$, where $\alpha_{text}, \alpha_{attr}, \alpha_{image} = \mathrm{Attention}(v_{text}, v_{attr}, v_{image})$, $\alpha_{text}$ represents the weight value of the text feature vector of the item, $\alpha_{attr}$ represents the weight value of the attribute feature vector of the item, and $\alpha_{image}$ represents the weight value of the picture feature vector of the item. The weight values are calculated by the Attention function, whose implementation differs between models.
Alternatively, in some embodiments, an Attention model may be constructed based on a single layer feedforward neural network to derive an Attention function corresponding to the model.
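One possible shape of such an Attention function is sketched below, assuming a single linear scoring layer followed by a softmax. The parameters `w` and `b` stand in for learned values and, like the function names, are hypothetical; the source does not specify the layer or its activation.

```python
import math
from typing import List, Sequence, Tuple


def attention_weights(vectors: Sequence[Sequence[float]],
                      w: Sequence[float], b: float = 0.0) -> List[float]:
    """Score each vector with one linear layer, then normalize with softmax."""
    scores = [sum(wi * vi for wi, vi in zip(w, v)) + b for v in vectors]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse(v_text: Sequence[float], v_attr: Sequence[float],
         v_image: Sequence[float], w: Sequence[float],
         b: float = 0.0) -> Tuple[List[float], List[float]]:
    """Return the fused vector and the three attention weights."""
    vectors = [v_text, v_attr, v_image]
    alphas = attention_weights(vectors, w, b)
    dim = len(v_text)
    fused = [sum(a * v[i] for a, v in zip(alphas, vectors)) for i in range(dim)]
    return fused, alphas


fused, alphas = fuse([3.0, 0.0], [0.0, 3.0], [3.0, 3.0], w=[0.0, 0.0])
```

With all-zero scoring parameters the three weights are equal, so the fusion reduces to a plain average; learned parameters would shift weight toward the sources that matter for a given downstream task. The sketch relies on the three input vectors having the same dimension, as noted for steps 301 to 303.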
Optionally, in some embodiments, other multi-mode information fusion models may be further used to perform feature vector fusion on the text feature vector, the attribute feature vector, and the picture feature vector of the object, so as to obtain the feature vector of the object.
Step 305, storing the feature vector of the article in a database. (optional)
Specifically, the feature vectors of the articles and the identification data of the articles are subjected to data association, and the associated data are stored in a database of a data platform shown in fig. 1, so that the subsequent processes of checking similar article sets, classifying the articles and the like are facilitated.
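As an illustration of associating feature vectors with identification data, the following sketch uses an in-memory SQLite database. The table name `item_vectors`, its columns, the JSON encoding of the vector, and the function names are all assumptions; the source does not describe the schema of the data platform's database.

```python
import json
import sqlite3
from typing import Dict, Iterable, List, Sequence


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open a database and create the (hypothetical) vector table if missing."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS item_vectors ("
        "item_id TEXT PRIMARY KEY, vector TEXT NOT NULL)"
    )
    return conn


def save_feature_vector(conn: sqlite3.Connection, item_id: str,
                        vector: Sequence[float]) -> None:
    """Store one feature vector under the item's identification data."""
    conn.execute(
        "INSERT OR REPLACE INTO item_vectors (item_id, vector) VALUES (?, ?)",
        (item_id, json.dumps(list(vector))),
    )
    conn.commit()


def load_feature_vectors(conn: sqlite3.Connection,
                         item_ids: Iterable[str]) -> Dict[str, List[float]]:
    """Fetch the stored vectors for the given identifiers; unknown ids are skipped."""
    ids = list(item_ids)
    if not ids:
        return {}
    placeholders = ",".join("?" for _ in ids)
    rows = conn.execute(
        f"SELECT item_id, vector FROM item_vectors WHERE item_id IN ({placeholders})",
        ids,
    )
    return {item_id: json.loads(vector) for item_id, vector in rows}


conn = open_store()
save_feature_vector(conn, "sku-1", [0.1, 0.2])
save_feature_vector(conn, "sku-2", [0.3, 0.4])
found = load_feature_vectors(conn, ["sku-1", "sku-3"])
```

The lookup by identifier mirrors the verification step, in which feature vectors are fetched from the database based on the identification data in the item set to be verified.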
The above process of creating the feature vector of the article is applicable to any article in the article set to be verified in the above embodiment.
The article feature vector creation process shown in the embodiment fuses information of articles in multiple dimensions, such as text information, attribute information, picture information and the like, and provides data support for subsequent similar article set verification, article classification and the like by constructing article feature vectors integrating the multi-dimensional information.
In summary, the method for checking the similarity of the articles provided in the embodiments of the present application mainly includes the following technical key points: modeling the vector of the item and using the item vector representation for anomaly detection. The vector modeling of the article comprehensively utilizes a plurality of information sources of the article, gives consideration to the information of the article in different dimensions, and technically utilizes a plurality of technologies such as word vector, knowledge graph representation learning, convolutional neural network, multi-modal fusion and the like. Abnormal article detection realizes detection of abnormal points in a similar article set, and makes adaptability improvement on a traditional LOF algorithm, so that the method can be better used for business scenes such as e-commerce and the like.
The article similarity checking method provided by the embodiment of the application is described above, and the article similarity checking device provided by the embodiment of the application will be described below.
According to the embodiment of the application, the function modules of the article similarity checking device can be divided according to the method embodiment, for example, each function module can be divided corresponding to each function, and two or more functions can be integrated in one processing module. The integrated modules described above may be implemented either in hardware or in software functional modules.
It should be noted that, in the embodiment of the present application, the division of the modules is schematic, which is merely a logic function division, and other division manners may be implemented in actual implementation. The following description will be given by taking an example of dividing each function module into corresponding functions.
Fig. 5 is a schematic structural diagram of an article similarity checking device according to an embodiment of the present application. As shown in fig. 5, the article similarity checking device 400 provided in this embodiment includes: an acquisition module 401, a processing module 402, and an update module 403.
An obtaining module 401, configured to obtain a set of items to be verified, where the set of items to be verified includes identification data of a plurality of items;
acquiring feature vectors of the plurality of articles from a database based on the identification data;
a processing module 402, configured to determine, by traversing each item of the plurality of items, a local outlier factor value for each item according to a feature vector of the plurality of items, where the local outlier factor value is used to indicate an outlier degree of the item with respect to other items of a k-adjacent distance range of the item, and k is a positive integer;
And the updating module 403 is configured to update the to-be-verified item set according to the local outlier factor values of the plurality of items, where the number of items in the updated item set is less than or equal to the number of items in the to-be-verified item set.
In an alternative embodiment of the present embodiment, the processing module 402 is configured to:
determining an average local reachable density value of all articles except the first article in a k adjacent distance range of the first article and a local reachable density value of the first article according to the characteristic vectors of the plurality of articles; the first article is any one of the plurality of articles;
and determining a local outlier factor value of the first article according to the average local reachable density value and the local reachable density value of the first article.
In an alternative embodiment of the present embodiment, the processing module 402 is configured to:
determining a reachable distance between the first item and each item other than the first item in the k-nearest distance range of the first item according to feature vectors of all items in the k-nearest distance range of the first item in the plurality of items;
a local reachable density value of the first item is determined based on a reachable distance between the first item and each item other than the first item within a k-nearest distance range of the first item and a total number of items other than the first item within the k-nearest distance range of the first item.
In an alternative embodiment of the present embodiment, the processing module 402 is configured to:
determining a first distance value according to the characteristic vector of the first article and the characteristic vector of a second article, wherein the second article is any article except the first article in a k adjacent distance range of the first article;
acquiring a k adjacent distance value of the second object;
and determining the reachable distance between the first article and the second article according to the first distance value and the k adjacent distance value of the second article.
Optionally, the first distance value is a cosine distance value;
in an alternative embodiment of the present embodiment, the processing module 402 is configured to:
the first distance value is determined by the following formula:
where d (p, o) represents a first distance value between the first article and the second article, p represents a feature vector of the first article, and o represents a feature vector of the second article.
In an alternative embodiment of the present embodiment, the processing module 402 is configured to: and taking the larger value of the first distance value and the k adjacent distance value of the second article as the reachable distance between the first article and the second article.
In an alternative embodiment of the present embodiment, the processing module 402 is configured to: and taking the ratio of the average local reachable density value to the local reachable density value of the first article as a local outlier factor value of the first article.
In an alternative embodiment of the present embodiment, the updating module 403 is configured to: remove, from the plurality of articles, the articles whose local outlier factor values are greater than or equal to a preset threshold value, to obtain an updated article set.
In an alternative embodiment of the present embodiment, the obtaining module 401 is configured to: acquiring a text feature vector, an attribute feature vector and a picture feature vector of the article, wherein the text feature vector is obtained by carrying out vector modeling on text information of the article, the attribute feature vector is obtained by carrying out vector modeling on attribute information of the article, and the picture feature vector is obtained by carrying out vector modeling on picture information of the article;
and the processing module 402 is configured to perform multi-mode information fusion on the text feature vector, the attribute feature vector and the picture feature vector of the object to obtain a feature vector of the object.
In an alternative embodiment of the present embodiment, the processing module 402 is configured to: carrying out feature vector weighting processing on the text feature vector, the attribute feature vector and the picture feature vector of the object by adopting an attention mechanism model to obtain a feature vector of the object;
The input of the attention mechanism model comprises the text feature vector, the attribute feature vector and the picture feature vector of the item, and the output of the attention mechanism model comprises weight values of the text feature vector, the attribute feature vector and the picture feature vector of the item.
In an optional embodiment of this embodiment, an obtaining module 401 is configured to obtain a Word vector of each Word in the text information of the article by using a Word2vec model;
and the processing module 402 is configured to obtain a text feature vector of the article by performing weighted summation on word vectors of all words in the text information of the article.
In an alternative embodiment of the present embodiment, the processing module 402 is configured to:
determining an attribute vector of each attribute in the attribute information of the article by adopting an article knowledge graph model;
and obtaining the attribute feature vector of the article by carrying out weighted summation on the attribute vectors of all the attributes in the attribute information of the article.
In an alternative embodiment of the present embodiment, the processing module 402 is configured to:
extracting a target area in the picture information of the article by using an image detection algorithm, wherein the target area corresponds to the article;
inputting the picture of the target area into a convolutional neural network (CNN) to obtain the picture feature vector of the article.
The article similarity checking device provided in this embodiment may execute the technical scheme of any one of the method embodiments, and its implementation principle and technical effect are similar, and will not be described herein.
Fig. 6 is a hardware configuration diagram of an electronic device according to an embodiment of the present application. As shown in fig. 6, the electronic device 500 provided in this embodiment includes:
a memory 501;
a processor 502; and
a computer program;
the computer program is stored in the memory 501 and configured to be executed by the processor 502 to implement the technical solution of any of the above method embodiments, and the implementation principle and technical effect are similar, and will not be described herein again.
Alternatively, the memory 501 may be separate or integrated with the processor 502. When memory 501 is a device separate from processor 502, electronic device 500 further includes: a bus 503 for connecting the memory 501 and the processor 502.
The present embodiment also provides a computer readable storage medium, on which a computer program is stored, the computer program being executed by the processor 502 to implement the technical solution of any of the method embodiments as described above.
Embodiments of the present application provide a computer program product comprising a computer program which, when executed by a processor, implements the technical solution of any of the method embodiments described above.
The embodiment of the application also provides a chip, which comprises: the processing module and the communication interface, the processing module can execute the technical scheme of any method embodiment.
Further, the chip further includes a storage module (e.g., a memory), where the storage module is configured to store instructions, and the processing module is configured to execute the instructions stored in the storage module, and execution of the instructions stored in the storage module causes the processing module to execute the technical solution of any of the foregoing method embodiments.
It should be understood that the above processor may be a central processing unit (CPU), or may be another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), or the like. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like. The steps of a method disclosed in connection with the present invention may be embodied directly as being executed by a hardware processor, or as being executed by a combination of hardware and software modules in a processor.
The memory may comprise a high-speed RAM memory, and may further comprise a non-volatile memory NVM, such as at least one magnetic disk memory, and may also be a U-disk, a removable hard disk, a read-only memory, a magnetic disk or optical disk, etc.
The bus may be an industry standard architecture (Industry Standard Architecture, ISA) bus, a peripheral component interconnect (Peripheral Component Interconnect, PCI) bus, or an extended industry standard architecture (Extended Industry Standard Architecture, EISA) bus, among others. The buses may be divided into address buses, data buses, control buses, etc. For ease of illustration, the buses in the drawings of the present application are not limited to only one bus or one type of bus.
The storage medium may be implemented by any type of volatile or nonvolatile memory device or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, or a magnetic or optical disk. A storage medium may be any available medium that can be accessed by a general-purpose or special-purpose computer.
An exemplary storage medium is coupled to the processor such that the processor can read information from, and write information to, the storage medium. In the alternative, the storage medium may be integral to the processor. The processor and the storage medium may reside in an application-specific integrated circuit (ASIC). The processor and the storage medium may also reside as discrete components in an electronic device.
Finally, it should be noted that the above embodiments are only intended to illustrate the technical solutions of the present application, not to limit them. Although the present application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments can still be modified, or some or all of their technical features can be replaced by equivalents; such modifications and substitutions do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present application.

Claims (17)

1. An article similarity checking method, comprising:
acquiring an article set to be verified, wherein the article set to be verified comprises identification data of a plurality of articles;
acquiring feature vectors of the plurality of articles from a database based on the identification data;
determining a local outlier factor value of each article by traversing each article of the plurality of articles according to a feature vector of the plurality of articles, wherein the local outlier factor value is used for indicating the outlier degree of the article relative to other articles in a k adjacent distance range of the article, and k is a positive integer;
and updating the to-be-checked article set according to the local outlier factor values of the plurality of articles, wherein the number of the articles of the updated article set is smaller than or equal to the number of the articles of the to-be-checked article set.
2. The method of claim 1, wherein said determining a local outlier factor value for each of said plurality of items by traversing said each item based on a feature vector of said plurality of items comprises:
determining an average local reachable density value of all articles except the first article in a k adjacent distance range of the first article and a local reachable density value of the first article according to the characteristic vectors of the plurality of articles; the first article is any one of the plurality of articles;
and determining a local outlier factor value of the first article according to the average local reachable density value and the local reachable density value of the first article.
3. The method of claim 2, wherein determining the local reachable density value of the first item from the feature vectors of the plurality of items comprises:
determining a reachable distance between the first item and each item other than the first item in the k-nearest distance range of the first item according to feature vectors of all items in the k-nearest distance range of the first item in the plurality of items;
A local reachable density value of the first item is determined based on a reachable distance between the first item and each item other than the first item within a k-nearest distance range of the first item and a total number of items other than the first item within the k-nearest distance range of the first item.
4. A method according to claim 3, wherein said determining the reachable distance between the first item and each item other than the first item in the k-nearest distance range of the first item from the feature vectors of all items in the k-nearest distance range of the first item in the plurality of items comprises:
determining a first distance value according to the characteristic vector of the first article and the characteristic vector of a second article, wherein the second article is any article except the first article in a k adjacent distance range of the first article;
acquiring a k adjacent distance value of the second object;
and determining the reachable distance between the first article and the second article according to the first distance value and the k adjacent distance value of the second article.
5. The method of claim 4, wherein the first distance value is a cosine distance value; the determining a first distance value according to the feature vector of the first article and the feature vector of the second article comprises: the first distance value is determined by the following formula:
Where d (p, o) represents a first distance value between the first article and the second article, p represents a feature vector of the first article, and o represents a feature vector of the second article.
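The formula referenced in claim 5 is not reproduced in this text. Assuming the conventional definition of cosine distance (an assumption; the original formula may differ, for example by using the cosine similarity directly), it would take the form:

```latex
d(p, o) = 1 - \frac{p \cdot o}{\lVert p \rVert \, \lVert o \rVert}
```

Under this definition, d(p, o) is 0 for feature vectors pointing in the same direction and 1 for orthogonal feature vectors.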
6. The method of claim 4, wherein determining the reachable distance between the first item and the second item based on the first distance value and the k-nearest distance value for the second item comprises:
taking the larger of the first distance value and the k-nearest distance value of the second item as the reachable distance between the first item and the second item.
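The reachable distance of claims 4 and 6 can be sketched as follows. The function names are hypothetical, and the k-nearest distance value is assumed here to be the distance from an item to its k-th nearest other item, as in the standard LOF formulation.

```python
def k_distance(distances_to_others, k):
    """Distance from an item to its k-th nearest other item (assumed definition)."""
    if not 1 <= k <= len(distances_to_others):
        raise ValueError("k must be between 1 and the number of other items")
    return sorted(distances_to_others)[k - 1]


def reachable_distance(first_distance, k_distance_of_second):
    """Larger of d(p, o) and the k-nearest distance value of the second item o."""
    return max(first_distance, k_distance_of_second)
```

Taking the maximum means that a first item lying very close to the second item is still assigned at least the second item's k-nearest distance value.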
7. The method of any one of claims 2-6, wherein the determining a local outlier factor value of the first item based on the average local reachable density value and the local reachable density value of the first item comprises:
taking the ratio of the average local reachable density value to the local reachable density value of the first item as the local outlier factor value of the first item.
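The ratio in claim 7 can be sketched as follows. This assumes that the average local reachable density value is taken over the other items within the k-nearest distance range of the first item, as in the standard LOF definition; the function and parameter names are hypothetical.

```python
def local_outlier_factor(item_lrd, neighbour_lrds):
    """Ratio of the neighbours' average density to the item's own density."""
    if not neighbour_lrds:
        raise ValueError("at least one neighbour density is required")
    average_lrd = sum(neighbour_lrds) / len(neighbour_lrds)
    return average_lrd / item_lrd
```

An item whose density equals that of its neighbours gets a value of 1.0, while an item in a sparser region than its neighbours gets a value above 1.0.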
8. The method of any of claims 1-6, wherein the updating the set of items to be verified based on the local outlier factor values of the plurality of items comprises:
removing, from the plurality of items, the items whose local outlier factor values are smaller than a preset threshold, to obtain an updated item set.
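The update step of claim 8 can be sketched as a simple filter. The comparison direction follows the claim wording as given here; the function name, the dictionary input, and the threshold value in the example are assumptions for illustration only.

```python
def update_item_set(lof_by_item, threshold):
    """Remove items whose LOF value is smaller than the threshold.

    lof_by_item: mapping from item identifier to local outlier factor value.
    Returns the identifiers of the remaining items in their original order.
    """
    return [item_id for item_id, lof in lof_by_item.items() if lof >= threshold]


# Hypothetical example values.
remaining = update_item_set({"a": 0.8, "b": 1.0, "c": 1.6}, threshold=1.0)
```

The updated set never contains more items than the original set, which is consistent with the device claim's requirement on the number of items.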
9. The method according to any one of claims 1-6, wherein the creating of the feature vector for any one of the plurality of items comprises:
acquiring a text feature vector, an attribute feature vector and a picture feature vector of the item, wherein the text feature vector is obtained by vector modeling of text information of the item, the attribute feature vector is obtained by vector modeling of attribute information of the item, and the picture feature vector is obtained by vector modeling of picture information of the item; and
performing multi-modal information fusion on the text feature vector, the attribute feature vector and the picture feature vector of the item to obtain the feature vector of the item.
10. The method of claim 9, wherein the multi-modal information fusion of the text feature vector, the attribute feature vector, and the picture feature vector of the item to obtain the feature vector of the item comprises:
performing feature vector weighting on the text feature vector, the attribute feature vector and the picture feature vector of the item by using an attention mechanism model to obtain the feature vector of the item;
wherein the input of the attention mechanism model comprises the text feature vector, the attribute feature vector and the picture feature vector of the item, and the output of the attention mechanism model comprises weight values of the text feature vector, the attribute feature vector and the picture feature vector of the item.
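The weighting in claim 10 can be illustrated with a very small attention sketch. The claims do not specify the attention mechanism model, so this example assumes a dot-product score against a hypothetical learned `query` vector followed by a softmax, and assumes the three feature vectors share the same dimension.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse(text_vec, attr_vec, pic_vec, query):
    """Return (fused feature vector, weights of the three input vectors).

    query is a hypothetical learned parameter; it is not named in the claims.
    """
    vectors = [text_vec, attr_vec, pic_vec]
    # One attention score per modality: dot product with the query.
    scores = [sum(q * x for q, x in zip(query, v)) for v in vectors]
    weights = softmax(scores)
    # Weighted sum of the three vectors, component by component.
    fused = [sum(w * v[i] for w, v in zip(weights, vectors))
             for i in range(len(query))]
    return fused, weights
```

With an all-zero query the three modalities receive equal weights, so the fused vector is simply their mean; a trained query would shift weight towards the more informative modality.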
11. The method of claim 9, wherein the obtaining the text feature vector for the item comprises:
acquiring a word vector of each word in the text information of the item by using a Word2vec model; and
obtaining the text feature vector of the item by performing a weighted summation of the word vectors of all words in the text information of the item.
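The weighted summation in claim 11 can be sketched as follows. A plain dictionary stands in for the trained Word2vec model, and the per-word weights (for example TF-IDF scores) are an assumption, since the claims do not say how the weights are chosen; out-of-vocabulary words are skipped, which is also an assumption.

```python
def text_feature_vector(words, word_vectors, word_weights):
    """Weighted sum of word vectors.

    words: tokens of the item's text information.
    word_vectors: mapping word -> vector (stand-in for a Word2vec lookup).
    word_weights: mapping word -> weight; missing words default to 1.0.
    """
    # Assumes word_vectors is non-empty and all vectors share one dimension.
    dim = len(next(iter(word_vectors.values())))
    result = [0.0] * dim
    for word in words:
        if word not in word_vectors:
            continue  # assumption: ignore out-of-vocabulary words
        weight = word_weights.get(word, 1.0)
        for i, x in enumerate(word_vectors[word]):
            result[i] += weight * x
    return result
```

The attribute feature vector of claim 12 has the same weighted-sum structure, with attribute vectors in place of word vectors.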
12. The method of claim 9, wherein the obtaining the attribute feature vector for the item comprises:
determining an attribute vector of each attribute in the attribute information of the item by using an item knowledge graph model; and
obtaining the attribute feature vector of the item by performing a weighted summation of the attribute vectors of all attributes in the attribute information of the item.
13. The method of claim 9, wherein the obtaining the picture feature vector of the item comprises:
extracting a target area in the picture information of the item by using an image detection algorithm, wherein the target area corresponds to the item; and
inputting the picture of the target area into a convolutional neural network (CNN) to obtain the picture feature vector of the item.
14. An article similarity verification device, comprising:
an acquisition module, configured to acquire an article set to be verified, wherein the article set to be verified comprises identification data of a plurality of articles, and to acquire feature vectors of the plurality of articles from a database based on the identification data;
a processing module, configured to determine, by traversing each article of the plurality of articles, a local outlier factor value of each article according to the feature vectors of the plurality of articles, wherein the local outlier factor value indicates the degree to which the article is an outlier relative to the other articles within the k-nearest distance range of the article, and k is a positive integer; and
an updating module, configured to update the article set to be verified according to the local outlier factor values of the plurality of articles, wherein the number of articles in the updated article set is smaller than or equal to the number of articles in the article set to be verified.
15. An electronic device, comprising:
a memory;
a processor; and
a computer program;
wherein the computer program is stored in the memory and configured to be executed by the processor to implement the method of any one of claims 1-13.
16. A computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the method of any one of claims 1-13.
17. A computer program product comprising a computer program which, when executed by a processor, implements the method of any of claims 1-13.
Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210090396.8A CN116543180A (en) 2022-01-25 2022-01-25 Method, device, equipment and storage medium for checking similarity of articles

Publications (1)

Publication Number Publication Date
CN116543180A true CN116543180A (en) 2023-08-04

Family

ID=87454764

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210090396.8A Pending CN116543180A (en) 2022-01-25 2022-01-25 Method, device, equipment and storage medium for checking similarity of articles

Country Status (1)

Country Link
CN (1) CN116543180A (en)

Similar Documents

Publication Publication Date Title
Poursaeed et al. Vision-based real estate price estimation
CN108629665B (en) Personalized commodity recommendation method and system
CN108256568B (en) Plant species identification method and device
CN107341716B (en) Malicious order identification method and device and electronic equipment
US9218364B1 (en) Monitoring an any-image labeling engine
CN111898031B (en) Method and device for obtaining user portrait
CN110263821B (en) Training of transaction feature generation model, and method and device for generating transaction features
CN109300003B (en) Enterprise recommendation method, enterprise recommendation device, computer equipment and storage medium
WO2020192013A1 (en) Directional advertisement delivery method and apparatus, and device and storage medium
US11017016B2 (en) Clustering product media files
CN110580489B (en) Data object classification system, method and equipment
WO2017088496A1 (en) Search recommendation method, device, apparatus and computer storage medium
CN111666275B (en) Data processing method and device, electronic equipment and storage medium
US11682060B2 (en) Methods and apparatuses for providing search results using embedding-based retrieval
US20230093756A1 (en) Systems and methods for generating recommendations
CN116308684B (en) Online shopping platform store information pushing method and system
CN112508638B (en) Data processing method and device and computer equipment
CN111488385A (en) Data processing method and device based on artificial intelligence and computer equipment
CN111429214B (en) Transaction data-based buyer and seller matching method and device
CN111861679A (en) Commodity recommendation method based on artificial intelligence
CN112989182B (en) Information processing method, information processing device, information processing apparatus, and storage medium
CN107305615A (en) Tables of data recognition methods and system
CN113128218A (en) Key field extraction method and device for bidding information
CN112687079A (en) Disaster early warning method, device, equipment and storage medium
EP3489838A1 (en) Method and apparatus for determining an association

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination