CN110929764A - Picture auditing method and device, electronic equipment and storage medium - Google Patents

Picture auditing method and device, electronic equipment and storage medium

Info

Publication number
CN110929764A
Authority
CN
China
Prior art keywords
picture
audited
pictures
candidate
feature vector
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN201911063137.0A
Other languages
Chinese (zh)
Inventor
彭冲
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Sankuai Online Technology Co Ltd
Original Assignee
Beijing Sankuai Online Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Sankuai Online Technology Co Ltd filed Critical Beijing Sankuai Online Technology Co Ltd
Priority to CN201911063137.0A priority Critical patent/CN110929764A/en
Publication of CN110929764A publication Critical patent/CN110929764A/en
Withdrawn legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/958 Organisation or management of web site content, e.g. publishing, maintaining pages or automatic linking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The disclosure relates to a picture auditing method and device, electronic equipment and a storage medium. The problems of high labor cost and low efficiency caused by a manual auditing mechanism are solved. The method comprises the following steps: acquiring a candidate picture set corresponding to a picture to be audited of a target object from an internet platform, wherein the candidate picture set comprises a plurality of pictures of the target object which pass audit; acquiring a characteristic vector of each picture in the candidate picture set; based on a clustering algorithm, clustering the pictures in the candidate picture set according to the feature vectors of all the pictures in the candidate picture set, and determining a reference picture according to a clustering result; and auditing the picture to be audited based on the reference picture to obtain an audit result.

Description

Picture auditing method and device, electronic equipment and storage medium
Technical Field
The present disclosure relates to the field of information technologies, and in particular, to a picture auditing method and apparatus, an electronic device, and a storage medium.
Background
With the development of networks, more and more information is published online, and netizens can comment on any information disclosed there. To keep online content civil, the comment information that netizens upload needs to be audited. Sensitive words in text information can be identified directly, so text information can be audited by sensitive-word filtering, but this approach is not suitable for picture information. Pictures uploaded to the network therefore have to be reviewed manually.
For example, on a public review website, when people review a commodity they have consumed, they often upload a picture of the commodity, and the uploaded picture is then reviewed manually to confirm that it shows the corresponding commodity. However, reviewing picture information manually incurs high labor cost and is inefficient.
Disclosure of Invention
The invention aims to provide a picture auditing method and device, electronic equipment and a storage medium, which can automatically audit pictures and solve the problems of high cost and low efficiency caused by a manual auditing mechanism.
In order to achieve the above object, according to a first aspect of the embodiments of the present disclosure, there is provided a picture auditing method, including:
acquiring a candidate picture set corresponding to a picture to be audited of a target object from an internet platform, wherein the candidate picture set comprises a plurality of pictures of the target object which pass audit;
acquiring a characteristic vector of each picture in the candidate picture set;
based on a clustering algorithm, clustering the pictures in the candidate picture set according to the feature vectors of all the pictures in the candidate picture set, and determining a reference picture according to a clustering result;
and auditing the picture to be audited based on the reference picture to obtain an audit result.
Optionally, the auditing the picture to be audited based on the reference picture to obtain an audit result, including:
acquiring a feature vector of the picture to be audited;
calculating the similarity between the picture to be audited and the reference picture according to the characteristic vector of the picture to be audited and the characteristic vector of the reference picture;
and if the similarity is greater than a preset threshold value, determining that the picture to be audited passes the audit.
Optionally, the obtaining the feature vector of each picture in the candidate picture set includes:
inputting the candidate picture set into a feature vector extraction model to obtain feature vectors of all pictures in the candidate picture set; and/or,
the obtaining of the feature vector of the picture to be audited includes:
inputting the picture to be audited into the feature vector extraction model to obtain the feature vector of the picture to be audited;
the feature vector extraction model is obtained by training pictures of different articles and feature vectors of the pictures as training samples.
Optionally, training the feature vector extraction model further includes:
acquiring approved pictures of different articles from the Internet platform, and determining pixel values and feature vectors of the acquired pictures;
and taking the pixel value of the acquired picture as input, taking the feature vector of the acquired picture as output, and training the feature vector extraction model.
Optionally, the clustering algorithm comprises a k-medoids algorithm.
According to a second aspect of the embodiments of the present disclosure, there is provided a picture auditing apparatus, including:
the acquisition module is used for acquiring a candidate picture set corresponding to a picture to be audited of a target article from an internet platform, and the candidate picture set comprises a plurality of pictures of the target article which pass the audit;
the execution module is used for acquiring the characteristic vector of each picture in the candidate picture set;
the clustering module is used for clustering the pictures in the candidate picture set according to the characteristic vector of each picture in the candidate picture set based on a clustering algorithm and determining a reference picture according to a clustering result;
and the auditing module is used for auditing the picture to be audited based on the reference picture to obtain an auditing result.
Optionally, the auditing module includes:
the obtaining submodule is used for obtaining the characteristic vector of the picture to be audited;
the calculation submodule is used for calculating the similarity between the picture to be audited and the reference picture according to the characteristic vector of the picture to be audited and the characteristic vector of the reference picture;
and the first determining submodule is used for determining that the picture to be audited passes the audit when the similarity is greater than a preset threshold value.
Optionally, the execution module includes:
the first input sub-module is used for inputting the candidate picture set into a feature vector extraction model to obtain feature vectors of all pictures in the candidate picture set;
the acquisition sub-module includes:
the second input submodule is used for inputting the picture to be audited into the feature vector extraction model to obtain the feature vector of the picture to be audited;
the feature vector extraction model is obtained by training pictures of different articles and feature vectors of the pictures as training samples.
Optionally, the picture auditing apparatus further includes a training module, where the training module includes:
the second determining submodule is used for acquiring the approved pictures of different articles from the Internet platform and determining the pixel values and the feature vectors of the acquired pictures;
and the training sub-module is used for taking the pixel value of the acquired picture as input, taking the feature vector of the acquired picture as output and training the feature vector extraction model.
Optionally, the clustering algorithm comprises a k-medoids algorithm.
According to a third aspect of embodiments of the present disclosure, there is provided a computer readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of the method of any one of the first aspects.
According to a fourth aspect of the embodiments of the present disclosure, there is provided an electronic apparatus including:
a memory having a computer program stored thereon;
a processor for executing the computer program in the memory to implement the steps of the method of any of the first aspects.
Through the technical scheme, the following technical effects can be at least achieved:
A candidate picture set corresponding to a picture to be audited of the target object is acquired from the Internet platform, wherein the candidate picture set comprises a plurality of pictures of the target object which have passed the audit. The pictures which pass the audit carry the characteristic information of the corresponding commodity. Extracting the feature vector of each picture in the candidate picture set yields the multi-dimensional (high-dimensional) feature information corresponding to the commodity. The pictures in the candidate picture set are clustered according to the feature vector of each picture, and a reference picture is determined according to the clustering result. Auditing the picture to be audited based on the reference picture makes it possible to accurately identify whether the picture to be audited is a picture of the corresponding commodity. The automatic picture auditing mode replaces the manual auditing mode, so that labor cost is saved and auditing efficiency is improved. In addition, because the reference picture is selected by clustering the candidate picture set, the selected reference picture is representative, and the auditing result is more accurate.
Additional features and advantages of the disclosure will be set forth in the detailed description which follows.
Drawings
The accompanying drawings, which are included to provide a further understanding of the disclosure and are incorporated in and constitute a part of this specification, illustrate embodiments of the disclosure and together with the description serve to explain the disclosure without limiting the disclosure. In the drawings:
Fig. 1 is a flowchart illustrating a picture auditing method according to an exemplary embodiment of the present disclosure.
Fig. 2 is a flowchart illustrating the training of a feature vector extraction model according to an exemplary embodiment of the present disclosure.
Fig. 3 is a block diagram illustrating a picture auditing apparatus according to an exemplary embodiment of the present disclosure.
Fig. 4 is a block diagram illustrating an audit module according to an exemplary embodiment of the present disclosure.
Fig. 5 is a block diagram illustrating another picture auditing apparatus according to an exemplary embodiment of the present disclosure.
Fig. 6 is a block diagram illustrating an electronic device according to an exemplary embodiment of the present disclosure.
Detailed Description
The following detailed description of specific embodiments of the present disclosure is provided in connection with the accompanying drawings. It should be understood that the detailed description and specific examples, while indicating the present disclosure, are given by way of illustration and explanation only, not limitation.
It is worth noting that the terms "first," "second," and the like in the description and claims of the present disclosure and in the above-described figures are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order.
The present disclosure is mainly applied to business scenarios in which commodity pictures uploaded to an Internet platform are audited. For example, it can be used to audit the commodity display pictures that merchants upload to online ordering platforms such as take-out ordering platforms, public review websites, and online supermarkets. As another example, it can be used to audit the pictures in the evaluation information that consumers upload to an online ordering platform. Generally, the pictures uploaded to an online ordering platform are audited manually to ensure that each uploaded picture corresponds to its commodity.
For example, if a merchant wants to sell a signature dish of fish-flavored shredded pork in an online shop, the merchant needs to add the information of the dish to the menu, including a display picture of the dish. When the merchant uploads the picture, it needs to be checked whether the picture actually shows fish-flavored shredded pork before the upload is accepted, so that the picture displayed on the online ordering platform is meaningful to consumers. Conversely, if a picture of braised fish is uploaded for the menu item of fish-flavored shredded pork, the picture does not correspond to the name of the dish, which may confuse consumers who want to order.
As another example, if a consumer purchases egg-fried rice on the ordering platform, the consumer may upload a photo of the egg-fried rice actually received as part of a review. Before the photo is successfully uploaded, it needs to be audited to ensure that it is a picture of the egg-fried rice rather than a picture of another commodity, for example fish-flavored shredded pork. In this way, the consumer's review of the egg-fried rice has reference value for other consumers.
However, when a user uploads a plurality of pictures corresponding to a plurality of commodities to the online ordering platform at the same time, manual auditing is clearly inefficient, and in such a case a commodity picture is easily filed under a commodity catalog it does not correspond to.
In view of this, embodiments of the present disclosure provide a picture auditing method and apparatus, an electronic device, and a storage medium, so as to implement automatic picture auditing and solve the problems of high labor cost and low efficiency caused by a manual auditing mechanism.
Fig. 1 is a flowchart illustrating a picture auditing method according to an exemplary embodiment of the present disclosure. As shown in Fig. 1, the method includes the following steps:
S101, obtaining a candidate picture set corresponding to a picture to be audited of a target article from an Internet platform, wherein the candidate picture set comprises a plurality of pictures of the target article which have passed the audit.
The internet platform can be a network trading platform such as a network shop, a take-out meal ordering platform, an online reservation platform, a network auction platform and the like.
The target article is the object currently being operated on. It may be any commodity or any service provided on the online ordering platform. Auditing means examining and verifying, with the purpose of judging whether the audited object meets preset conditions.
In step S101, a candidate image set corresponding to an image to be checked of a target item is obtained from an internet platform. The candidate picture set may be multiple pictures of the corresponding target item that have been approved, or may be one or multiple pictures of the corresponding target item set by the user. Each picture in the candidate picture set is a shot picture of the target object and contains partial or all characteristic information of the corresponding target object.
In one possible implementation, the candidate picture set corresponding to the target article may be obtained from the directories of a plurality of merchants on an online ordering platform. In addition, pictures of the target article on other website platforms can be obtained through web search. For example, if a user wants to upload a new picture of the scrambled eggs with tomato recommended by a merchant, the historical pictures of that dish in the merchant's recommended-dish directory can be obtained and used as the candidate picture set for the newly uploaded picture. The historical pictures are pictures which have passed the audit, and the newly uploaded picture is the picture to be audited. In another example, the audited pictures of scrambled eggs with tomato in the menu directories of other merchants can be obtained and used as the candidate picture set. In yet another example, pictures labeled as scrambled eggs with tomato that are found through web search can be used as the candidate picture set.
In another possible implementation, the candidate picture set may be selected as follows. Illustratively, when a user uploads a picture of a dish recommended by a merchant, the auditing system retrieves the latest N audited pictures, in reverse chronological order, under that dish in the merchant's recommended-dish directory. If the directory is K pictures short of N, the latest K audited pictures of the same dish from other merchants, again in reverse chronological order, are obtained to make up the N pictures, and the resulting N pictures are used as the candidate picture set.
S102, acquiring the feature vector of each picture in the candidate picture set.
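The selection rule in this implementation can be sketched as follows. This is only a minimal illustration under stated assumptions: the record type ApprovedPicture, its fields, and the function select_candidates are hypothetical names introduced here, and an approval timestamp is assumed to be available for every picture.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ApprovedPicture:
    """Hypothetical record for a picture that has already passed the audit."""
    picture_id: str
    merchant_id: str
    approved_at: float  # approval time as a timestamp; larger means more recent


def select_candidates(own: List[ApprovedPicture],
                      others: List[ApprovedPicture],
                      n: int) -> List[ApprovedPicture]:
    """Take the latest n approved pictures of the dish from the merchant's own
    directory; if K are missing, fill up with the latest K from other merchants."""
    def newest_first(pictures):
        return sorted(pictures, key=lambda p: p.approved_at, reverse=True)

    candidates = newest_first(own)[:n]
    missing = n - len(candidates)  # this is K in the text
    if missing > 0:
        candidates.extend(newest_first(others)[:missing])
    return candidates
```

If other merchants also have fewer than K approved pictures, this sketch simply returns a shorter candidate set; how that case should be handled is not stated in the text.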
A feature vector of a picture is a mapping representation of the picture in some defined semantic space. In the present disclosure, the obtained feature vector may be a mapping representation of a high-dimensional space, i.e., a high-dimensional feature vector.
S103, based on a clustering algorithm, clustering the pictures in the candidate picture set according to the feature vectors of the pictures in the candidate picture set, and determining a reference picture according to a clustering result.
Clustering is the process of grouping a set of physical or abstract objects. A group generated by clustering is called a cluster, and a cluster is a collection of data objects. Any two objects within the same cluster are highly similar, while two objects belonging to different clusters are highly dissimilar. Clustering thus allows a large amount of data to be characterized by a small number of clusters. Common clustering algorithms include the k-medoids algorithm and the k-means algorithm.
The k-means algorithm uses the mean or weighted average (centroid) of the points within a cluster as the representative point of the cluster, and iteratively divides the data objects into different clusters to minimize the objective function, thereby making the generated clusters as compact and independent as possible. The k-medoids algorithm instead selects the object in the cluster that lies closest to the cluster center (called the central point, or medoid) as the representative point of the cluster, and then repeatedly replaces representative points with non-representative points to improve the clustering quality.
In the embodiment of the present disclosure, a k-medoids algorithm may be adopted to cluster the candidate picture set. The input is a database of n objects and the number k of clusters expected; the output is k clusters that minimize the sum of the deviations of all objects from the central point of the cluster to which they belong. The clustering steps may be:
step 1, randomly selecting k points as the central points of an initial cluster;
step 2, distributing the rest points to the cluster represented by the current optimal central point according to the principle of being closest to the central point;
step 3, calculating a criterion function corresponding to each member point in each cluster, and selecting the point corresponding to the minimum criterion function as a new central point;
step 4, repeating steps 2 and 3 until none of the central points changes any more or the set maximum number of iterations is reached.
The criterion function is the sum of the distances between a certain member point and other member points in a cluster.
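The four steps and the criterion function can be sketched as below. This is a minimal illustration, not the implementation of the disclosure: it assumes the feature vectors are stacked as rows of a NumPy array, uses the Euclidean distance between them, and the function name k_medoids and the optional init parameter (fixed starting points for reproducibility) are hypothetical additions.

```python
import numpy as np


def k_medoids(features, k, max_iter=100, init=None, seed=0):
    """Cluster row vectors with the four steps described in the text.

    Returns (medoids, labels): medoids holds the row index of each central
    point, labels[i] is the cluster number of row i.
    """
    x = np.asarray(features, dtype=float)
    n = x.shape[0]
    # Pairwise Euclidean distances between all feature vectors.
    dist = np.sqrt(((x[:, None, :] - x[None, :, :]) ** 2).sum(axis=-1))

    # Step 1: pick k initial central points (random unless `init` is given).
    if init is None:
        medoids = np.random.default_rng(seed).choice(n, size=k, replace=False)
    else:
        medoids = np.array(init, dtype=int)

    for _ in range(max_iter):
        # Step 2: assign every point to its nearest central point.
        labels = np.argmin(dist[:, medoids], axis=1)
        # Step 3: in each cluster, the member with the smallest criterion
        # function (sum of distances to the other members) becomes the center.
        new_medoids = medoids.copy()
        for c in range(k):
            members = np.flatnonzero(labels == c)
            if members.size == 0:  # can happen when points are duplicated
                continue
            criterion = dist[np.ix_(members, members)].sum(axis=1)
            new_medoids[c] = members[np.argmin(criterion)]
        # Step 4: stop once no central point changes.
        if np.array_equal(new_medoids, medoids):
            break
        medoids = new_medoids

    labels = np.argmin(dist[:, medoids], axis=1)
    return medoids, labels
```

Because the starting points are chosen at random, different runs may end in different local optima; running the procedure several times and keeping the result with the smallest total deviation is a common remedy.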
In step S103, clustering the pictures in the candidate picture set according to the feature vector of each picture in the candidate picture set, and determining a reference picture according to the clustering result. Wherein the reference picture may be one or more. For example, when clustering the candidate picture set, the candidate picture set may be divided into a plurality of subsets according to the view angle (such as a front view, a side view, etc.) of the article in the picture, then each subset is clustered to obtain a plurality of clustering centers, and the pictures corresponding to the obtained plurality of clustering centers are used as reference pictures. Thus, the reference picture may be a plurality of pictures.
It should be noted that, in the present disclosure, the reference picture carries the characteristic information corresponding to the preset conditions, mentioned in the description of step S101, that the picture to be audited needs to satisfy.
S104, auditing the picture to be audited based on the reference picture to obtain an auditing result.
The picture to be audited is audited against the obtained reference picture to determine whether it meets the auditing condition, which gives the auditing result of the picture.
By adopting the method, the candidate picture set corresponding to the picture to be audited of the target object is obtained from the Internet platform, wherein the candidate picture set comprises a plurality of pictures which pass the audit of the target object. The picture which passes the audit has the corresponding characteristic information of the commodity, and the characteristic information is used for identifying the corresponding commodity. By extracting the feature vector of each picture in the candidate picture set, high-dimensional feature information corresponding to the commodity can be obtained. It is understood that the higher the dimension of the feature information, the easier it is to recognize the object. And clustering the pictures in the candidate picture set according to the characteristic vector of each picture in the candidate picture set, and determining a reference picture according to a clustering result. And auditing the picture to be audited based on the reference picture, so that whether the picture to be audited is a picture corresponding to the commodity can be accurately identified. The automatic image auditing mode replaces a manual auditing mode, so that the labor cost is saved, and the auditing efficiency is improved. In addition, the reference picture is selected by clustering the candidate picture set, so that the selected reference picture can be representative, and the auditing result is more accurate.
In addition, in step S103, the pictures in the candidate picture set are clustered according to the feature vector of each picture in the candidate picture set, so as to determine the reference picture. Specifically, the picture corresponding to the final cluster center can be used as the reference picture. The feature vector of the reference picture has the smallest criterion function value with respect to the feature vectors of the other candidate pictures, so the reference picture selected in this way is the most representative picture in the candidate picture set. Seen from the clustering perspective, each unlabeled picture (that is, a picture that has not yet been assigned to a cluster) is compared with the feature information contained in the reference picture, and the unlabeled pictures with similar features are assigned to the set represented by the reference picture, which yields the candidate picture set represented by the reference picture.
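When the whole candidate picture set is treated as a single cluster, choosing the reference picture amounts to finding the candidate with the smallest criterion function value. The following is a minimal sketch of that case, assuming Euclidean distance between feature vectors stacked as rows; the function name select_reference is hypothetical.

```python
import numpy as np


def select_reference(features):
    """Return the row index of the candidate whose summed distance to all
    other candidates (the criterion function) is smallest."""
    x = np.asarray(features, dtype=float)
    # Pairwise Euclidean distances; the diagonal is zero and adds nothing.
    dist = np.sqrt(((x[:, None, :] - x[None, :, :]) ** 2).sum(axis=-1))
    return int(np.argmin(dist.sum(axis=1)))
```

The returned index identifies the candidate picture that would serve as the reference picture in this simplified setting.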
In a possible implementation manner, the step S104 may include:
acquiring a feature vector of the picture to be audited; calculating the similarity between the picture to be audited and the reference picture according to the characteristic vector of the picture to be audited and the characteristic vector of the reference picture; and if the similarity is greater than a preset threshold value, determining that the picture to be audited passes the audit.
Specifically, calculating the similarity between the picture to be audited and the reference picture may consist of calculating the Euclidean distance between the feature vector of the picture to be audited and the feature vector of the reference picture. The Euclidean distance, also known as the Euclidean metric, is a common definition of distance: it is the true distance between two points in an m-dimensional space, or the natural length of a vector (i.e., the distance from the point to the origin). In two and three dimensions, the Euclidean distance is the actual distance between two points.
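As a small illustration of the definition, the Euclidean distance between two m-dimensional feature vectors can be computed as follows. The function name euclidean_distance is a hypothetical helper, and the vectors are assumed to be plain sequences of numbers.

```python
import math
from typing import Sequence


def euclidean_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Square root of the summed squared differences of the coordinates."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same dimension")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
```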
In addition, the Hamming distance between the picture to be audited and the reference picture can be calculated. The Hamming distance is used in error control coding for data transmission. It is the number of positions at which two strings of the same length differ; in other words, it is the number of characters that need to be replaced to convert one string into the other.
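The definition can be illustrated with a short sketch that counts differing positions in two equal-length strings. The function name hamming_distance is hypothetical; how a picture is encoded as such a string is not specified here.

```python
def hamming_distance(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs strings of equal length")
    return sum(ch_a != ch_b for ch_a, ch_b in zip(a, b))
```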
The calculated Euclidean distance or Hamming distance is converted into a similarity, which is compared with a preset threshold to determine whether the picture to be audited passes the audit. The smaller the distance between the two pictures, the greater the similarity, and the greater the similarity, the more alike the two pictures are. For example, if the calculated similarity between the picture to be audited and the reference picture is greater than the preset threshold, the picture to be audited passes the audit.
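The text does not state which conversion from distance to similarity is used. One simple choice, assumed here purely for illustration, is similarity = 1 / (1 + distance), which equals 1 at distance 0 and decreases as the distance grows. The function names below are hypothetical.

```python
def distance_to_similarity(distance: float) -> float:
    """Map a non-negative distance to a similarity in (0, 1].

    The formula 1 / (1 + d) is an assumption; any decreasing mapping works.
    """
    if distance < 0:
        raise ValueError("distance must be non-negative")
    return 1.0 / (1.0 + distance)


def passes_audit(distance: float, threshold: float) -> bool:
    """The picture passes only if the similarity is strictly above the threshold."""
    return distance_to_similarity(distance) > threshold
```

With this mapping, a similarity threshold of 0.4 corresponds to accepting distances below 1.5.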
For another example, the similarity between the picture to be checked and the reference picture may be directly calculated by using a similarity algorithm, for example, the similarity between the feature vectors of the picture to be checked and the reference picture may be calculated by using a cosine similarity algorithm.
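A minimal sketch of cosine similarity between two feature vectors is given below. The function name cosine_similarity is a hypothetical helper, and zero vectors are rejected because the measure is undefined for them.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors: dot(a, b) / (|a| * |b|)."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)
```

The result lies between -1 and 1 and can be compared with the preset threshold directly, without first converting a distance.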
As yet another example, the pixels of the picture to be audited and the pixels of the reference picture may be extracted, and the similarity corresponding to the pixel-based Hamming distance between the two pictures may be determined. This similarity is then compared with a preset similarity threshold to obtain the auditing result.
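How pixels are turned into comparable bits is not specified. The sketch below assumes one plausible scheme: each grayscale picture is binarized against its own mean brightness, the Hamming distance between the two bit maps is counted, and the similarity is the fraction of matching bits. Both pictures are assumed to have been resized to the same shape beforehand, and the function name pixel_hamming_similarity is hypothetical.

```python
import numpy as np


def pixel_hamming_similarity(img_a, img_b) -> float:
    """Fraction of positions whose above-mean/below-mean bit agrees."""
    a = np.asarray(img_a, dtype=float)
    b = np.asarray(img_b, dtype=float)
    if a.shape != b.shape:
        raise ValueError("pictures must have the same shape")
    # Binarize each picture against its own mean (an assumed scheme).
    bits_a = a > a.mean()
    bits_b = b > b.mean()
    differing = int(np.count_nonzero(bits_a != bits_b))  # Hamming distance
    return 1.0 - differing / bits_a.size
```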
With this method, the feature vector of the picture to be audited is audited against the feature vector of the reference picture, and whether the two pictures depict the same commodity can be determined directly by comparing their similarity value with the preset similarity threshold. Setting the similarity threshold also helps to guarantee the correctness of the auditing result.
In one possible implementation, the feature vector of the picture may be obtained through a pre-trained feature vector extraction model.
For example, the candidate picture set may be input into the feature vector extraction model to obtain a feature vector of each picture in the candidate picture set; similarly, the picture to be checked can also be input into the feature vector extraction model to obtain the feature vector of the picture to be checked; the feature vector extraction model is obtained by training pictures of different articles and feature vectors of the pictures as training samples.
The feature vector extraction model may be a neural network model, for example a Convolutional Neural Network (CNN). Acquiring the feature vectors of pictures with a pre-trained feature vector extraction model helps to ensure the accuracy of the similarity calculated from the feature vector of the reference picture and the feature vector of the picture to be audited. As can be understood by those skilled in the art, in a neural network trained on many samples, the loss function and the weight parameters of each layer are relatively stable, which helps to avoid over-fitting and under-fitting. Over-fitting is the phenomenon in which a machine learning or deep learning model fits the training samples too closely and consequently performs poorly on the validation and test data sets. Under-fitting means that too few features are learned, for example because there are too few training samples, so the trained model cannot identify the test data well. Therefore, if the feature vectors of the pictures are extracted separately without a pre-trained extraction model, the features of the picture to be audited cannot be identified well when it is audited against the reference picture. Using a feature extraction model trained in advance allows features to be extracted more accurately when the picture to be audited is audited automatically, which improves the accuracy of the auditing result.
Fig. 2 is a flowchart illustrating a method for training a feature vector extraction model according to an exemplary embodiment of the present disclosure, and as shown in fig. 2, the method may include the following steps:
S201, obtaining approved pictures of different articles from the Internet platform, and determining the pixel values and the feature vectors of the obtained pictures.
S202, taking the pixel value of the obtained picture as input, taking the feature vector of the obtained picture as output, and training the feature vector extraction model.
In the embodiment of the present disclosure, the feature vector acquired by using the feature vector extraction model is a high-dimensional vector. Training the feature vector extraction model in this way, on a plurality of sample data, lets the model learn the dimensions of the feature information contained in pictures that have passed the audit on the Internet platform. Therefore, when the feature vectors of the reference picture and of the picture to be audited are extracted, the two feature vectors are guaranteed to have the same dimensions, which in turn helps guarantee the correctness of the result of the picture auditing method.
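Steps S201 and S202 can be illustrated with a minimal sketch. It is not the model of the disclosure: a single linear layer stands in for the CNN, the "pictures" are synthetic four-pixel lists, and the names `predict`, `train`, `mean_squared_error` and `hidden_map` are hypothetical. It only shows the training contract (pixel values as input, feature vectors as output) and a held-out check of the kind used to detect over-fitting.

```python
import random

def predict(weights, pixels):
    """Map a flat list of pixel values to a feature vector."""
    return [sum(w * p for w, p in zip(row, pixels)) for row in weights]

def train(samples, n_in, n_out, lr=0.1, epochs=500, seed=0):
    """Fit the weights by stochastic gradient descent on squared error."""
    rng = random.Random(seed)
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    for _ in range(epochs):
        for pixels, target in samples:
            output = predict(weights, pixels)
            for j in range(n_out):
                error = output[j] - target[j]
                for i in range(n_in):
                    weights[j][i] -= lr * error * pixels[i]
    return weights

def mean_squared_error(weights, samples):
    """Average squared difference between predicted and target components."""
    errors = [
        (o - t) ** 2
        for pixels, target in samples
        for o, t in zip(predict(weights, pixels), target)
    ]
    return sum(errors) / len(errors)

# Synthetic stand-in for S201: pixel values plus known feature vectors.
# hidden_map is an assumption used only to generate consistent targets.
data_rng = random.Random(1)
hidden_map = [[0.5, -0.2, 0.1, 0.3], [0.0, 0.4, -0.3, 0.2]]
pictures = [[data_rng.random() for _ in range(4)] for _ in range(25)]
samples = [(px, predict(hidden_map, px)) for px in pictures]
train_set, validation_set = samples[:20], samples[20:]

# S202: pixel values in, feature vectors out.
weights = train(train_set, n_in=4, n_out=2)
train_error = mean_squared_error(weights, train_set)
validation_error = mean_squared_error(weights, validation_set)
```

Because every picture passes through the same trained weights, every extracted feature vector has the same length, which is the property the similarity calculation relies on. A large gap between `train_error` and `validation_error` would be a sign of over-fitting.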
Based on the same inventive concept, an embodiment of the present disclosure further provides a picture auditing apparatus, which is used to implement the steps of the picture auditing method provided by the foregoing method embodiment. As shown in fig. 3, the apparatus 300 includes:
an obtaining module 310, configured to obtain, from an internet platform, a candidate picture set corresponding to a picture to be audited of a target item, where the candidate picture set includes multiple pictures of the target item that have passed the audit;
an executing module 320, configured to obtain a feature vector of each picture in the candidate picture set;
the clustering module 330 is configured to cluster the pictures in the candidate picture set according to the feature vectors of the pictures in the candidate picture set based on a clustering algorithm, and determine a reference picture according to a clustering result;
the auditing module 340 is configured to audit the picture to be audited based on the reference picture to obtain an auditing result.
With this apparatus, a candidate picture set corresponding to the picture to be audited of the target item is obtained from the Internet platform, where the candidate picture set includes a plurality of pictures of the target item that have passed the audit. A picture that has passed the audit carries the feature information of the corresponding item, and this feature information is used to identify the item. By extracting the feature vector of each picture in the candidate picture set, the various kinds of feature information of the item can be obtained. It will be understood that the more kinds of feature information are available, the easier it is to recognize the item. The pictures in the candidate picture set are clustered according to their feature vectors, and a reference picture is determined according to the clustering result. Auditing the picture to be audited based on the reference picture makes it possible to identify accurately whether the picture to be audited is a picture of the item. This automatic picture auditing replaces manual auditing, which saves labor cost and improves auditing efficiency. In addition, because the reference picture is selected by clustering the candidate picture set, the selected reference picture is representative, which makes the audit result more accurate.
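The division of the apparatus into four modules can be sketched as a small class whose collaborators are injected as plain functions. This is an illustrative sketch only: the class name `PictureAuditor`, the field names, the in-memory `store`, and the toy stand-ins (a one-cluster medoid for the clustering module and a squared-distance test for the auditing module) are assumptions and are not taken from the disclosure.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

Vector = List[float]

@dataclass
class PictureAuditor:
    fetch_candidates: Callable[[str], Sequence[Vector]]  # role of obtaining module 310
    extract: Callable[[Vector], Vector]                  # role of executing module 320
    pick_reference: Callable[[List[Vector]], Vector]     # role of clustering module 330
    is_similar: Callable[[Vector, Vector], bool]         # role of auditing module 340

    def audit(self, item_id: str, picture: Vector) -> bool:
        candidates = self.fetch_candidates(item_id)
        vectors = [self.extract(p) for p in candidates]
        reference = self.pick_reference(vectors)
        return self.is_similar(self.extract(picture), reference)

def squared_distance(a: Vector, b: Vector) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))

def most_central(vectors: List[Vector]) -> Vector:
    """Toy stand-in for clustering: the vector closest to all the others."""
    return min(vectors, key=lambda v: sum(squared_distance(v, w) for w in vectors))

# Hypothetical in-memory "platform": pictures are already tiny feature lists.
store = {"item-1": [[1.0, 0.0], [0.9, 0.1], [0.8, 0.0]]}

auditor = PictureAuditor(
    fetch_candidates=lambda item_id: store[item_id],
    extract=lambda picture: list(picture),
    pick_reference=most_central,
    is_similar=lambda a, b: squared_distance(a, b) < 0.1,
)

accepted = auditor.audit("item-1", [0.95, 0.05])
rejected = auditor.audit("item-1", [0.0, 1.0])
```

Injecting the four roles as functions keeps each one replaceable, for example by swapping the toy `most_central` for a full clustering step, without changing the `audit` flow.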
Optionally, as shown in fig. 4, the auditing module 340 includes:
the obtaining submodule 341 is configured to obtain a feature vector of the picture to be audited;
the calculating submodule 342 is configured to calculate a similarity between the picture to be audited and the reference picture according to the feature vector of the picture to be audited and the feature vector of the reference picture;
the first determining sub-module 343 is configured to determine that the picture to be audited passes the audit when the similarity is greater than a preset threshold.
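The calculation and threshold decision performed by submodules 342 and 343 can be sketched as follows. The disclosure does not fix a particular similarity measure in this passage, so cosine similarity is an assumption here, as are the function names and the threshold value of 0.8.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors of equal dimension."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector is treated as dissimilar to everything
    return dot / (norm_a * norm_b)

def passes_audit(picture_vector, reference_vector, threshold=0.8):
    """Pass the audit only when the similarity exceeds the preset threshold."""
    return cosine_similarity(picture_vector, reference_vector) > threshold

close_result = passes_audit([1.0, 0.0, 0.0], [0.9, 0.1, 0.0])
far_result = passes_audit([1.0, 0.0, 0.0], [0.0, 1.0, 0.0])
```

The dimension check reflects the requirement that both vectors come from the same extraction model; the threshold would in practice be tuned on pictures with known audit outcomes.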
Optionally, the executing module 320 includes:
the first input sub-module is used for inputting the candidate picture set into a feature vector extraction model to obtain feature vectors of all pictures in the candidate picture set;
the obtaining sub-module 341 includes:
the second input submodule is used for inputting the picture to be audited into the feature vector extraction model to obtain the feature vector of the picture to be audited;
the feature vector extraction model is obtained by training pictures of different articles and feature vectors of the pictures as training samples.
Optionally, as shown in fig. 5, the picture auditing apparatus 300 further includes a training module 350, where the training module 350 includes:
the second determining submodule is used for acquiring the approved pictures of different articles from the Internet platform and determining the pixel values and the feature vectors of the acquired pictures;
and the training sub-module is used for taking the pixel value of the acquired picture as input, taking the feature vector of the acquired picture as output and training the feature vector extraction model.
Optionally, the clustering algorithm comprises a k-medoids algorithm.
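A minimal sketch of k-medoids applied to candidate feature vectors is given below. It uses the simple alternating variant (assign each vector to its nearest medoid, then move each medoid to the member with the smallest total distance to its cluster) with Euclidean distance and a deterministic initialisation. Taking the medoid of the largest cluster as the reference picture is an assumption made for illustration; the function names and sample vectors are hypothetical.

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def assign(vectors, medoids):
    """Label each vector with the position of its nearest medoid."""
    return [
        min(range(len(medoids)), key=lambda c: euclidean(v, vectors[medoids[c]]))
        for v in vectors
    ]

def k_medoids(vectors, k, max_iter=100):
    """Return (medoid indices, cluster labels) for the given vectors."""
    medoids = list(range(k))  # deterministic start: the first k vectors
    for _ in range(max_iter):
        labels = assign(vectors, medoids)
        new_medoids = []
        for c in range(k):
            members = [i for i, label in enumerate(labels) if label == c]
            if not members:  # keep the old medoid if a cluster became empty
                new_medoids.append(medoids[c])
                continue
            # The new medoid is the member with the least total distance to its cluster.
            best = min(
                members,
                key=lambda i: sum(euclidean(vectors[i], vectors[j]) for j in members),
            )
            new_medoids.append(best)
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids, assign(vectors, medoids)

def pick_reference(vectors, k):
    """Index of the medoid of the largest cluster (illustrative selection rule)."""
    medoids, labels = k_medoids(vectors, k)
    largest = max(range(k), key=labels.count)
    return medoids[largest]

candidate_vectors = [
    [1.0, 0.0], [0.0, 1.0], [0.9, 0.1],
    [0.1, 0.9], [1.0, 0.1], [0.95, 0.05],
]
medoids, labels = k_medoids(candidate_vectors, k=2)
reference_index = pick_reference(candidate_vectors, k=2)
```

Unlike k-means, each medoid is always one of the input vectors, so the selected reference corresponds to an actual approved picture rather than an averaged point.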
With regard to the apparatus in the above embodiment, the specific manner in which each module performs operations is implemented in the same or similar manner as in the above method embodiment, and will not be described in detail here.
Fig. 6 is a block diagram illustrating an electronic device 1900 according to an example embodiment. For example, the electronic device 1900 may be provided as a server. Referring to fig. 6, an electronic device 1900 includes a processor 1922, which may be one or more in number, and a memory 1932 for storing computer programs executable by the processor 1922. The computer program stored in memory 1932 may include one or more modules that each correspond to a set of instructions. Further, the processor 1922 may be configured to execute the computer program to perform the above-described picture auditing method.
Additionally, electronic device 1900 may also include a power component 1926 and a communication component 1950. The power component 1926 may be configured to perform power management of the electronic device 1900, and the communication component 1950 may be configured to enable communication, e.g., wired or wireless communication, of the electronic device 1900. In addition, the electronic device 1900 may also include input/output (I/O) interfaces 1958. The electronic device 1900 may operate based on an operating system stored in memory 1932, such as Windows Server, Mac OS X™, Unix™, Linux, etc.
In another exemplary embodiment, a computer readable storage medium comprising program instructions which, when executed by a processor, implement the steps of the above-described picture auditing method is also provided. For example, the computer readable storage medium may be the memory 1932 described above that includes program instructions that are executable by the processor 1922 of the electronic device 1900 to perform the picture auditing method described above.
In another exemplary embodiment, a computer program product is also provided, which comprises a computer program executable by a programmable apparatus, the computer program having code portions for performing the above-mentioned picture auditing method when executed by the programmable apparatus.
The preferred embodiments of the present disclosure are described in detail above with reference to the accompanying drawings. However, the present disclosure is not limited to the specific details of the above embodiments. Various simple modifications may be made to the technical solution of the present disclosure within the technical idea of the present disclosure, and these simple modifications all belong to the protection scope of the present disclosure.
It should be noted that the various features described in the above embodiments may be combined in any suitable manner without departing from the scope of the invention. In order to avoid unnecessary repetition, various possible combinations will not be separately described in this disclosure.
In addition, any combination of various embodiments of the present disclosure may be made, and the same should be considered as the disclosure of the present disclosure, as long as it does not depart from the spirit of the present disclosure.

Claims (10)

1. A picture auditing method is characterized by comprising the following steps:
acquiring a candidate picture set corresponding to a picture to be audited of a target object from an internet platform, wherein the candidate picture set comprises a plurality of pictures of the target object which pass audit;
acquiring a characteristic vector of each picture in the candidate picture set;
based on a clustering algorithm, clustering the pictures in the candidate picture set according to the feature vectors of all the pictures in the candidate picture set, and determining a reference picture according to a clustering result;
and auditing the picture to be audited based on the reference picture to obtain an audit result.
2. The method according to claim 1, wherein the auditing the picture to be audited based on the reference picture to obtain an audit result includes:
acquiring a feature vector of the picture to be audited;
calculating the similarity between the picture to be audited and the reference picture according to the characteristic vector of the picture to be audited and the characteristic vector of the reference picture;
and if the similarity is greater than a preset threshold value, determining that the picture to be audited passes the audit.
3. The method of claim 2,
the obtaining of the feature vector of each picture in the candidate picture set includes: inputting the candidate picture set into a feature vector extraction model to obtain feature vectors of all pictures in the candidate picture set; and/or,
the obtaining of the feature vector of the picture to be audited includes: inputting the picture to be audited into the feature vector extraction model to obtain the feature vector of the picture to be audited;
the feature vector extraction model is obtained by training pictures of different articles and feature vectors of the pictures as training samples.
4. The method of claim 3, further comprising training the feature vector extraction model, comprising:
acquiring approved pictures of different articles from the Internet platform, and determining pixel values and feature vectors of the acquired pictures;
and taking the pixel value of the acquired picture as input, taking the feature vector of the acquired picture as output, and training the feature vector extraction model.
5. The method according to any one of claims 1 to 4, wherein the clustering algorithm comprises a k-medoids algorithm.
6. A picture auditing device, comprising:
the acquisition module is used for acquiring a candidate picture set corresponding to a picture to be audited of a target article from an internet platform, wherein the candidate picture set comprises a plurality of pictures of the target article which pass audit;
the execution module is used for acquiring the characteristic vector of each picture in the candidate picture set;
the clustering module is used for clustering the pictures in the candidate picture set according to the characteristic vector of each picture in the candidate picture set based on a clustering algorithm and determining a reference picture according to a clustering result;
and the auditing module is used for auditing the picture to be audited based on the reference picture to obtain an auditing result.
7. The apparatus of claim 6, wherein the audit module comprises:
the obtaining submodule is used for obtaining the characteristic vector of the picture to be audited;
the calculation submodule is used for calculating the similarity between the picture to be audited and the reference picture according to the characteristic vector of the picture to be audited and the characteristic vector of the reference picture;
and the first determining submodule is used for determining that the picture to be audited passes the audit when the similarity is greater than a preset threshold value.
8. The apparatus of claim 7,
the execution module comprises:
the first input sub-module is used for inputting the candidate picture set into a feature vector extraction model to obtain feature vectors of all pictures in the candidate picture set;
the acquisition sub-module includes:
the second input submodule is used for inputting the picture to be audited into the feature vector extraction model to obtain the feature vector of the picture to be audited;
the feature vector extraction model is obtained by training pictures of different articles and feature vectors of the pictures as training samples.
9. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 5.
10. An electronic device, comprising:
a memory having a computer program stored thereon;
a processor for executing the computer program in the memory to carry out the steps of the method of any one of claims 1 to 5.
CN201911063137.0A 2019-10-31 2019-10-31 Picture auditing method and device, electronic equipment and storage medium Withdrawn CN110929764A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911063137.0A CN110929764A (en) 2019-10-31 2019-10-31 Picture auditing method and device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN110929764A true CN110929764A (en) 2020-03-27

Family

ID=69850174

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911063137.0A Withdrawn CN110929764A (en) 2019-10-31 2019-10-31 Picture auditing method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN110929764A (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106294577A (en) * 2016-07-27 2017-01-04 北京小米移动软件有限公司 Figure chip detection method and device
JP2017010468A (en) * 2015-06-25 2017-01-12 Kddi株式会社 System and method for retrieving objects reflected in imaging picture
US20170177979A1 (en) * 2015-12-22 2017-06-22 The Nielsen Company (Us), Llc Image quality assessment using adaptive non-overlapping mean estimation
CN107133221A (en) * 2017-06-09 2017-09-05 北京京东尚科信息技术有限公司 Signal auditing method, device, computer-readable medium and electronic equipment
CN108228844A (en) * 2018-01-09 2018-06-29 美的集团股份有限公司 A kind of picture screening technique and device, storage medium, computer equipment
CN109597902A (en) * 2018-12-20 2019-04-09 深圳市丰巢科技有限公司 Picture examination method, apparatus, equipment and storage medium
CN110377775A (en) * 2019-07-26 2019-10-25 Oppo广东移动通信有限公司 A kind of picture examination method and device, storage medium

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111563429A (en) * 2020-04-23 2020-08-21 广东博智林机器人有限公司 Drawing verification method and device, electronic equipment and storage medium
CN111767422A (en) * 2020-06-30 2020-10-13 平安国际智慧城市科技股份有限公司 Data auditing method, device, terminal and storage medium
CN115114469A (en) * 2021-03-17 2022-09-27 腾讯科技(深圳)有限公司 Picture identification method, device and equipment and storage medium
CN113205130A (en) * 2021-04-28 2021-08-03 五八有限公司 Data auditing method and device, electronic equipment and storage medium
CN113205130B (en) * 2021-04-28 2023-05-02 五八有限公司 Data auditing method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication

Application publication date: 20200327
