CN113254638B - Product image determining method, computer equipment and storage medium - Google Patents

Product image determining method, computer equipment and storage medium

Info

Publication number
CN113254638B
CN113254638B (application CN202110498466.9A)
Authority
CN
China
Prior art keywords
image vocabulary
similarity
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110498466.9A
Other languages
Chinese (zh)
Other versions
CN113254638A (en)
Inventor
张秦玮
谭坤
焦文叶
梁玲
杨树青
贾如意
Current Assignee
North Minzu University
Original Assignee
North Minzu University
Priority date
Filing date
Publication date
Application filed by North Minzu University
Priority to CN202110498466.9A
Publication of CN113254638A
Application granted
Publication of CN113254638B
Legal status: Active
Anticipated expiration

Classifications

    • G06F16/355 — Information retrieval of unstructured textual data; clustering/classification; class or cluster creation or modification
    • G06F16/335 — Information retrieval of unstructured textual data; querying; filtering based on additional data, e.g. user or group profiles
    • G06F18/22 — Pattern recognition; analysing; matching criteria, e.g. proximity measures
    • G06F18/23213 — Pattern recognition; non-hierarchical clustering with a fixed number of clusters, e.g. K-means clustering
    • G06F18/241 — Pattern recognition; classification techniques relating to the classification model
    • G06F40/205 — Handling natural language data; natural language analysis; parsing
    • G06F40/247 — Handling natural language data; lexical tools; thesauruses, synonyms


Abstract

The invention belongs to the field of computers and provides a product image determination method, computer equipment and a storage medium. The product image determination method comprises: calculating the similarity between any two image vocabularies and classifying the image vocabularies according to the similarity; then selecting a representative image vocabulary from each class of image vocabularies. By calculating the similarity between the image vocabularies of the target product, the method classifies the image vocabularies with clear discrimination, and the most representative image vocabulary in each class can be selected so as to describe the image of the target product accurately.

Description

Product image determination method, computer equipment and storage medium
Technical Field
The invention belongs to the field of computers, and particularly relates to a product image determining method, computer equipment and a storage medium.
Background
The product image refers to the consumer's perceived image of a product and its characteristics. The operator positions the product and highlights its specific functions so that the product meets the particular needs of consumers and is perceived by them accordingly. An undesirable product image may result from weaknesses in earlier products or from consumer resistance to product renewal; eliminating an undesirable image effectively expands the market for the product.
Determining the target product image is a key link in product image design and is crucial to the emotional design of products; the target image also determines the direction of product design. The current method for determining the product image is mainly subjective: the image is determined chiefly from the subjective evaluations that test subjects give of their own perception of the product.
Existing product image determination methods therefore suffer from strong subjectivity and low accuracy.
Disclosure of Invention
The embodiments of the invention aim to provide a product image determination method that addresses the strong subjectivity and low accuracy of existing product image determination methods.
An embodiment of the invention provides a product image determination method comprising the following steps:
acquiring a plurality of image vocabularies describing a target product;
calculating the similarity between any two image vocabularies, and classifying the image vocabularies according to the similarity;
and selecting representative image words from the various image words.
The acquisition of a plurality of image vocabularies for describing the target product comprises the following steps:
acquiring a plurality of descriptive vocabularies for describing the target product from the Internet;
and preliminarily screening the descriptive words, and taking the words obtained after screening as the image words.
The preliminary screening comprises:
matching the descriptive words with the words in a derogatory-word database, and screening out the descriptive words that match successfully;
and matching the descriptive words with the words in a styling-design vocabulary database of the target product, and screening out the descriptive words that fail to match.
The calculating the similarity between any two image words comprises the following steps:
matching each image vocabulary with the entries in a synonym forest database, wherein the synonym forest is divided into a plurality of levels whose logical structure is a tree; the atomic word group is the lowest level, each atomic word group comprises a plurality of entries, and each atomic word group has a corresponding code;
assigning the codes of the atomic word groups where the successfully matched entries are located to corresponding image vocabularies;
determining the parameter of image vocabulary similarity calculation by utilizing the codes of any two image vocabularies;
and substituting the parameters into the image vocabulary similarity calculation model to calculate the similarity between the two corresponding image vocabularies.
The first image vocabulary and the second image vocabulary are any two of the image vocabularies, and the parameters comprise k, m and n:
k is the number of layers from the root node to the nearest common parent node of the first and second image vocabularies in the tree structure of the synonym forest database; the root node is the initial node of each image vocabulary in that tree structure;
n is the number of nodes in the layer immediately below the nearest common parent node of the first and second image vocabularies;
m is the number of intervals between the first and second image vocabularies' nodes in the layer immediately below the nearest common parent node.
The image vocabulary similarity calculation model is:

sim(y₁, y₂) = (2·Σ_{r=1}^{k} Q(r) + α) / (2·Σ_{r=1}^{k} Q(r) + α + 2·Q(k)·Σ_{r=k+1}^{k+j} Q(r) + β)

wherein y₁ is the first image vocabulary, y₂ is the second image vocabulary, and sim(y₁, y₂) is the similarity of the first and second image vocabularies; j is the number of layers from the nearest common parent node down to the position of the first/second image vocabulary; Q(r) is the path weight of layer r on the path from the root node to the vocabulary position, where r is a layer number of the tree structure of the synonym forest database and r ∈ [1, 5]; Q(k) is the edge weight of the nearest common parent node; and α and β are the commonality and difference adjustment parameters of the similarity of the two image vocabularies.
The classifying the plurality of image vocabularies according to the similarity comprises the following steps:
based on the K-means clustering algorithm, selecting K image vocabularies from among the image vocabularies as cluster centers;
determining the distance from each remaining image vocabulary to each cluster center according to the similarity;
assigning each remaining image vocabulary to the cluster center nearest to it;
and re-determining the K cluster centers and repeating the classification process until the clustering sum-of-squared-error criterion function converges, whereby the image vocabularies are classified into K classes.
Selecting a representative image vocabulary from each class of image vocabularies comprises the following steps:
determining the distance from each image word in each class of words to the clustering center;
and selecting the image words with the nearest distance as the representative image words.
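The selection step above can be sketched in Python. This is an illustrative reading, not the patent's own code: the word list, the pairwise similarity values and the helper name `pick_representative` are invented for the example, and distance to the cluster center is read as 1 minus similarity.

```python
def pick_representative(words, sim):
    """Return the word of the class whose summed similarity to the other
    words is highest, i.e. the word nearest the cluster center when
    distance is taken as 1 - similarity."""
    return max(words, key=lambda w: sum(sim[frozenset((w, o))]
                                        for o in words if o != w))

# Toy class of size-related image words with invented pairwise similarities.
cluster = ["large", "huge", "grand"]
sim = {
    frozenset(("large", "huge")): 0.9,
    frozenset(("large", "grand")): 0.7,
    frozenset(("huge", "grand")): 0.6,
}
rep = pick_representative(cluster, sim)  # "large" summarizes this class best
```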
It is a further object of embodiments of the present invention to provide a computer device comprising a memory and a processor, the memory having stored therein a computer program which, when executed by the processor, causes the processor to perform the steps of the product image determination method.
It is a further object of embodiments of the present invention to provide a computer readable storage medium having stored thereon a computer program, which, when executed by a processor, causes the processor to carry out the steps of the product image determination method.
According to the product image determination method provided by the embodiments of the invention, the similarity between any two image vocabularies is calculated and the image vocabularies are classified according to the similarity; a representative image vocabulary is then selected from each class. By calculating the similarity between the image vocabularies of the target product, the method classifies the image vocabularies with clear discrimination and can select the most representative vocabulary in each class, so as to describe the target product accurately.
Drawings
FIG. 1 is a flow chart of a method for determining a product image according to an embodiment of the present invention;
FIG. 2 is a flowchart for acquiring a plurality of image vocabularies describing a target product according to an embodiment of the present invention;
FIG. 3 is a flowchart for calculating the similarity between any two image vocabularies according to an embodiment of the present invention;
FIG. 4 is a flowchart of classifying the image vocabularies according to the similarity according to an embodiment of the present invention;
FIG. 5 is a flowchart of selecting a representative image vocabulary from various types of image vocabularies according to an embodiment of the present invention;
FIG. 6 is a block diagram showing an internal configuration of a computer device according to an embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
It will be understood that, as used herein, the terms "first," "second," and the like may be used herein to describe various elements, but these elements should not be limited by these terms unless otherwise specified. These terms are only used to distinguish one element from another. For example, a first xx script may be referred to as a second xx script, and similarly, a second xx script may be referred to as a first xx script, without departing from the scope of the present application.
The product image determination method provided by the embodiments of the invention is implemented by computer equipment. The computer equipment may be an independent physical server or terminal, a server cluster composed of multiple physical servers, or a cloud server providing basic cloud computing services such as cloud databases, cloud storage and a CDN (content delivery network); the terminal may be, but is not limited to, a smartphone, tablet computer, notebook computer or desktop computer. The computer equipment comprises a memory, a processor and a computer program stored in the memory and executable on the processor; when the processor executes the computer program, the steps of the product image determination method are realized. The computer equipment also includes a network interface, an input device and a display screen. The memory includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and may also store a computer program which, when executed by the processor, causes the processor to perform the product image determination method; the internal memory may likewise store such a computer program. The display screen may be a liquid-crystal or electronic-ink display; the input device may be a touch layer covering the display screen, a key, trackball or touchpad arranged on the housing of the computer equipment, or an external keyboard, touchpad or mouse.
As shown in fig. 1, in one embodiment, a product image determination method is provided; this embodiment is mainly illustrated by applying the method to computer equipment. The method specifically comprises the following steps:
step S102, acquiring a plurality of image vocabularies for describing a target product;
the image words are words describing product images of target products, and the product images are perception images formed after the consumers perceive the product shapes, colors, textures and the like according to memory, experience and association. The product modeling is the most direct way to embody the product emotion, and in the product image design, the whole feeling of the product modeling, such as atmosphere, luxury, personality and the like, is usually expressed by using perceptual image adjectives (image words for short).
Step S104, calculating the similarity between any two image vocabularies, and classifying the image vocabularies according to the similarity;
the similarity represents the similarity between any two image words, if the similarity is higher, the similarity represents that the two image words are more similar, namely the description of the product shape of the target product by the two image words is more inclined to the same dimension, for example, the similarity of the two image words is 'large' and 'large' is higher, and the two image words are the image words describing the dimension of the size of the target product; conversely, the lower the similarity, the more dissimilar the two image words, i.e. the more the two image words tend to describe the product configuration of the target product in different dimensions, such as 'huge', 'vivid' two image words have lower similarity, and 'huge' describes the dimension of the size of the target product, while 'vivid' describes the dimension of the color.
And S106, selecting representative image words from the image words of various types.
Wherein, a representative image vocabulary is one that has high similarity with the remaining image vocabularies of its group and can, to some extent, summarize them. In this embodiment, calculating the similarity between the image vocabularies of the target product yields classes with significant discrimination, so that each dimension of the target product is described, and the most representative image vocabulary can be selected from each class so as to describe the target product accurately.
In an embodiment, as shown in fig. 2, step S102 may specifically include the following steps:
step S202, obtaining a plurality of descriptive vocabularies describing the target product from the Internet;
the source of the descriptive vocabulary of the target product includes, but is not only, the introduction about the product in each large online shopping platform on which the target product is sold, the comment of the consumer who has purchased the product on the commodity, the related introduction about the product on the manufacturer official network of the target product, or the evaluation report about the product in the related industry forum and the discussion and evaluation of the product by the forum user, or the description about the product in various news information reports about the product, etc. The method comprises the steps of firstly determining the position of a target product name in each channel, then extracting sentences containing the target product name, and selecting adjectives from the sentences as descriptive words by using a constituent sentence analysis algorithm.
Step S204, preliminarily screening the descriptive vocabularies, and taking the vocabularies obtained after screening as the image vocabularies;
by primarily screening the descriptive vocabularies, the descriptive vocabularies which obviously do not meet the requirements can be removed, the amount of the image vocabularies to be calculated is reduced, the workload of calculating the similarity of the image vocabularies at the later stage is further reduced, and meanwhile, the requirement of product image design is met;
the preliminary screening comprises the following steps:
Matching the descriptive words with the words in the derogatory-word database, and screening out those that match successfully: each acquired descriptive word is matched against the derogatory-word database, and if the match succeeds the word is judged derogatory and removed. The words in the derogatory-word database are socially recognized, objectively derogatory words, such as 'vulgar' and 'nauseating'.
Matching the descriptive words with the words in the styling-design vocabulary database of the target product, and screening out those that fail to match: each acquired descriptive word is matched against the styling-design vocabulary database of the target product; if the match succeeds the word is retained, and the words that fail to match are screened out.
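A minimal Python sketch of the two screening passes, assuming the two word databases are available as in-memory sets; the entries below are invented stand-ins for the derogatory-word database and the product's design vocabulary database.

```python
def preliminary_screen(candidates, derogatory_db, design_db):
    """Keep a descriptive word only if it (a) fails to match the
    derogatory-word database and (b) successfully matches the product's
    design vocabulary database."""
    return [w for w in candidates if w not in derogatory_db and w in design_db]

derogatory_db = {"vulgar", "nauseating"}                # invented entries
design_db = {"sleek", "luxurious", "slender", "vivid"}  # invented entries
candidates = ["sleek", "vulgar", "slender", "cheap", "vivid"]
image_words = preliminary_screen(candidates, derogatory_db, design_db)
# -> ["sleek", "slender", "vivid"]
```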
In one embodiment, as shown in fig. 3, the calculating the similarity between any two image vocabularies in step S104 includes:
step S302, matching each image vocabulary with entries in a synonym forest database;
the synonym forest is divided into a plurality of levels, and the logic structure among the levels is a tree structure; specifically, the synonym forest is divided into five levels including a large class, a middle class, a small class, word groups and atomic word groups, wherein the large class is 12, the middle class is 95, the small class is 1428, the word groups are 4026, and the atomic word groups are 17797, wherein the atomic word groups are the lowest level, and each atomic word group also has a plurality of entries, and each atomic word group has a corresponding code; the synonym forest is characterized in that 4 layers of large classes, middle classes, small classes, word groups and the like in the synonym forest are abstract level categories without specific terms, the atom word group layer contains specific terms, the synonym forest is stored in a text mode, each atom word group is one line and comprises one or more terms, and if the atom word group coded as ' Ea01A01 ═ represents ' long, slender, diffuse, long ' and the like; each atomic word group has 8-bit codes in a synonym forest, the first 7-bit codes are respectively represented by 1-bit capital letters, 1-bit lowercase letters, 2-bit decimal integers, 1-bit capital letters and 2-bit decimal integers, the 8 th bit is one of "═ #", "@", the "═ means that the meanings of the vocabularies in the atomic word group are the same," # "means that the meanings of the vocabularies in the atomic word group are related, and the" @ "means that the meanings of the vocabularies in the atomic word group are independent.
Step S304, assigning the codes of the atomic word groups where the successfully matched entries are located to corresponding image vocabularies;
for example, if the image word "slender" is successfully matched with the term "slender" in the atomic group encoded as "Ea 01a 01", the encoding "Ea 01a 01" is assigned to the image word "slender".
Step S306, determining the parameters of the image vocabulary similarity calculation by using the codes of any two image vocabularies;
the first image vocabulary and the second image vocabulary are any two of the plurality of image vocabularies, and the parameters comprise k, m and n;
k is the number of layers from a root node to the nearest public father node of the first image vocabulary and the second image vocabulary in the tree layer structure of the synonym forest database; the root node is an initial node of each image vocabulary in a synonym forest database tree layer structure;
n is the number of nodes at the next layer of the nearest common parent node of the first image vocabulary and the second image vocabulary;
m is the interval number between the next level nodes of the nearest common father nodes of the first image words and the second image words.
Taking the two image words 'luxury' and 'atmosphere' as an example: the code of 'luxury' is 'Eb32A01=' and the code of 'atmosphere' is 'Ee35A01='. The two codes agree only at level 1, i.e. the nearest common parent node of 'luxury' and 'atmosphere' is at level 1, so k = 1. At layer k + 1 the two words' branches lie among the six nodes Ea, Eb, Ec, Ed, Ee and Ef, so n = 6; and the number of intervals between their branch nodes Eb and Ee at layer k + 1 is 3, so m = 3.
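The layer count k can be read directly off the two codes by comparing them segment by segment (1 uppercase letter, 1 lowercase letter, 2 digits, 1 uppercase letter, 2 digits). A Python sketch follows, using the 'luxury'/'atmosphere' codes; note that n and m cannot be derived from the codes alone, since they require the sibling counts of the full synonym forest tree.

```python
def split_levels(code):
    """Split the 7-character prefix of a synonym-forest code into its five
    level segments: major class, middle class, minor class, word group,
    atomic word group."""
    return [code[0], code[1], code[2:4], code[4], code[5:7]]

def common_parent_depth(c1, c2):
    """k: the number of layers from the root to the nearest common parent
    node, i.e. how many leading level segments the two codes share."""
    k = 0
    for a, b in zip(split_levels(c1), split_levels(c2)):
        if a != b:
            break
        k += 1
    return k

k = common_parent_depth("Eb32A01=", "Ee35A01=")  # only level 1 agrees
```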
Step S308, substituting the parameters into an image vocabulary similarity calculation model to calculate the similarity between two corresponding image vocabularies;
the image vocabulary similarity calculation model is:

sim(y₁, y₂) = (2·Σ_{r=1}^{k} Q(r) + α) / (2·Σ_{r=1}^{k} Q(r) + α + 2·Q(k)·Σ_{r=k+1}^{k+j} Q(r) + β)    (1)

wherein y₁ is the first image vocabulary, y₂ is the second image vocabulary, and sim(y₁, y₂) is the similarity of the first and second image vocabularies; j is the number of layers from the nearest common parent node down to the position of the first/second image vocabulary; Q(r) is the path weight of layer r on the path from the root node to the vocabulary position, where r is a layer number of the tree structure of the synonym forest database and r ∈ [1, 5]; Q(k) is the edge weight of the nearest common parent node; and α and β are the commonality and difference adjustment parameters of the similarity of the two image vocabularies.
Wherein, the image vocabulary similarity is a number in the range [0, 1]. The similarity between two image words depends on their commonality and their difference: the more commonality, the more similar; the more difference, the less similar. Formula (1) is derived as follows:
Let the commonality and the difference of the first image vocabulary y₁ and the second image vocabulary y₂ be g(y₁, y₂) and c(y₁, y₂) respectively; the similarity sim(y₁, y₂) of y₁ and y₂ is defined as:

sim(y₁, y₂) = g(y₁, y₂) / (g(y₁, y₂) + c(y₁, y₂))    (2)
the common distance is the distance from the root node of the vocabulary to the nearest common parent node, and comprises y 1 Distance L from root node to nearest common parent node g1 (y 1 ,y 2 ) And y 2 Distance L from root node to nearest common parent node g2 (y 1 ,y 2 ) Two parts; difference distance is from the nearest public father node to y 1 、y 2 Sum of distances including nearest common parent node to y 1 Distance L c1 (y 1 ,y 2 ) And nearest common parent node to y 2 Distance L c2 (y 1 ,y 2 )。
Further, the commonality g(y₁, y₂) is the sum of the commonality distance and the commonality adjustment parameter. Writing α for the commonality adjustment parameter of the similarity of the two image vocabularies:

g(y₁, y₂) = L_g1(y₁, y₂) + L_g2(y₁, y₂) + α    (3)

and because:

L_g1(y₁, y₂) = L_g2(y₁, y₂)    (4)

defining:

L_g(y₁, y₂) = L_g1(y₁, y₂) = L_g2(y₁, y₂)    (5)

equation (3) becomes:

g(y₁, y₂) = 2·L_g(y₁, y₂) + α    (6)
The difference c(y₁, y₂) is the sum of the difference distance and the difference adjustment parameter. Writing β for the difference adjustment parameter of the similarity of the two image vocabularies:

c(y₁, y₂) = L_c1(y₁, y₂) + L_c2(y₁, y₂) + β    (7)

and because:

L_c1(y₁, y₂) = L_c2(y₁, y₂)    (8)

equation (7) becomes:

c(y₁, y₂) = 2·L_c(y₁, y₂) + β    (9)

Thus, equation (2) becomes:

sim(y₁, y₂) = (2·L_g(y₁, y₂) + α) / (2·L_g(y₁, y₂) + α + 2·L_c(y₁, y₂) + β)    (10)
further, the number of layers each image vocabulary passes from the root node to the position of the vocabulary is 5, and the weight of each layer is different. And Q (r) is the path weight from the root node to the vocabulary position, r is the layer number of the tree layer structure of the synonym forest database, and r belongs to [1,5], then:
Figure BDA0003055438780000131
Figure BDA0003055438780000132
the difference distance in the image vocabulary similarity obtained by the invention is related to the number of nodes at the next layer of the nearest public father node and the interval number between the nodes at the next layer of the nearest public father node. Let Q (k) denote the edge weight of the nearest common parent node, then:
Figure BDA0003055438780000133
wherein n is the number of nodes at the next layer of the nearest common father node of the first image vocabulary and the second image vocabulary; m is the interval number between the next level nodes of the nearest common father nodes of the first image vocabulary and the second image vocabulary; finally, y can be obtained 1 And y 2 The similarity calculation model of the two image vocabularies is as follows:
Figure BDA0003055438780000134
in this embodiment, q (k), q (r), and a are constants, j represents the number of layers of the first/second image vocabulary from the nearest common parent node to the position of the image vocabulary, and k + j is 5; the similarity parameter of the image vocabulary is substituted into the formula (1) to calculate the first image vocabulary y 1 And a second image vocabulary y 2 The similarity between the image vocabularies and the image vocabularies is calculated by the model based on the common distance and the difference distance between the image vocabularies, and the common adjusting parameter and the difference adjusting parameter are introduced, so that the accuracy and the efficiency of the calculation of the similarity between the image vocabularies are improved.
In addition, when the first five level codes (the first 7 characters) of the two image vocabularies in the tree structure of the synonym forest database are identical, i.e. k = 5, the similarity is determined by the 8th character of the code.
The 8th characters '=', '#' and '@' correspond to similarities of 1, 0.5 and 0 respectively: when the 8th character is '=', the two image words are synonymous and the similarity is 1; when it is '#', the two image words are related and the similarity is 0.5; when it is '@', the two image words are independent and the similarity is 0.
If an image vocabulary has several codes in the synonym forest, the maximum similarity over all code pairs is taken as the final similarity value. Let the first image vocabulary y₁ have w codes and the second image vocabulary y₂ have v codes; then:

sim(y₁, y₂) = max_{1≤i≤w, 1≤j≤v} sim(y₁(i), y₂(j))

wherein sim(y₁(i), y₂(j)) denotes the similarity between the i-th code of image vocabulary y₁ and the j-th code of image vocabulary y₂.
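These two rules — the 8th-character fallback when the 7-character prefixes coincide, and the maximum over all code pairs when a word has several codes — can be sketched as follows. The structural model is passed in as a callable; here it is a constant placeholder rather than the full formula.

```python
MARKER_SIM = {"=": 1.0, "#": 0.5, "@": 0.0}  # 8th-character similarities

def code_similarity(c1, c2, structural_sim):
    """Identical 7-character prefixes: the 8th character decides the score;
    otherwise fall back to the structural model."""
    if c1[:7] == c2[:7]:
        return MARKER_SIM[c1[7]]
    return structural_sim(c1, c2)

def word_similarity(codes1, codes2, structural_sim):
    """A word may carry several codes; the final similarity is the maximum
    over all code pairs."""
    return max(code_similarity(a, b, structural_sim)
               for a in codes1 for b in codes2)

s = word_similarity(["Ea01A01="], ["Ea01A01=", "Eb32A01="],
                    lambda a, b: 0.3)  # placeholder structural model
```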
As shown in fig. 4, the classifying the plurality of image vocabularies according to the similarity in step S104 includes:
s402, extracting K image words from each image word as a clustering center based on a K-means clustering algorithm;
step S404, determining the distance from the rest image vocabularies to each clustering center according to the similarity;
step S406, classifying each remaining image vocabulary to a clustering center closest to the image vocabulary;
and step S408, re-determining K clustering centers, and circularly executing the classification process until the clustering error square sum criterion function is converged, thereby classifying each image vocabulary into K classes.
The classification is performed with the K-means clustering algorithm, an iteratively solved cluster-analysis algorithm. Its main principle is: select K objects as initial cluster centers, compute the distance from each data object to each cluster center, and assign the data object to the class of the closest cluster center; then reselect the cluster centers. If the cluster centers do not change between two consecutive iterations, the adjustment of the data-object classification is finished and the clustering error sum-of-squares criterion function f has converged; if f has not converged, the cluster centers are modified again and the next iteration begins, until f converges and the algorithm ends.
The algorithm framework of the K-means clustering algorithm is as follows:
(1) Given a data set of size n, let O = 1 and select K initial cluster centers Z_d(O), d = 1, 2, ..., K, where d indexes the cluster centers and O denotes the iteration count.
(2) Compute the distance D(x_i, Z_d(O)) of each sample data object from each cluster center, i = 1, 2, ..., n, and assign each sample x_i to the closest cluster center.
(3) Let O = O + 1, compute the new cluster centers, and evaluate the clustering error sum-of-squares criterion function f:
f(O) = Σ_{d=1}^{K} Σ_{x ∈ C_d} ‖x − Z_d(O)‖², where C_d denotes the set of sample data objects assigned to cluster d.
(4) Judging: if |f(O+1) − f(O)| < θ (f has converged) or no object changes class, the algorithm ends; otherwise set O = O + 1, return to step (2), and recalculate, where θ is the convergence threshold.
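The loop in steps (1) to (4) can be sketched as follows. This is a generic one-dimensional numeric illustration with made-up sample values, not the patent's word-similarity distance:

```python
import random

# Generic K-means sketch of steps (1)-(4): assign each sample to its
# nearest center, recompute centers, and stop when the clustering error
# sum-of-squares criterion f changes by less than theta. 1-D points
# keep the illustration short; all values here are invented.
def kmeans_1d(samples, k, theta=1e-6, seed=0):
    rng = random.Random(seed)
    centers = rng.sample(samples, k)          # step (1): initial centers
    prev_f = float("inf")
    while True:
        # step (2): assign each sample to the closest cluster center
        clusters = [[] for _ in range(k)]
        for x in samples:
            d = min(range(k), key=lambda i: (x - centers[i]) ** 2)
            clusters[d].append(x)
        # step (3): recompute the centers and the criterion function f
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
        f = sum((x - centers[i]) ** 2
                for i, c in enumerate(clusters) for x in c)
        # step (4): stop once f has converged
        if abs(prev_f - f) < theta:
            return centers, clusters
        prev_f = f
```

On two well-separated groups of points, the loop converges to the two group means within a few iterations regardless of which samples are drawn as initial centers.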
After the classification is completed, a representative image vocabulary can be selected from each class of image vocabularies. As shown in fig. 5, the process comprises the following steps:
Step S502, determining the distance from each image vocabulary in each class to the cluster center;
Step S504, selecting the image vocabulary with the smallest distance as the representative image vocabulary.
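Steps S502 and S504 amount to an arg-min over the within-class distances; `distance` below is a hypothetical per-word distance derived from the similarity values, not an interface defined in the patent:

```python
# Pick the representative image vocabulary of one class: the word whose
# distance to the cluster center is smallest. `distance` maps a word to
# its distance and is an illustrative assumption.
def representative(words, distance):
    return min(words, key=distance)
```

For instance, `representative(["sleek", "smooth", "rounded"], {"sleek": 0.1, "smooth": 0.4, "rounded": 0.3}.get)` returns "sleek".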
In this embodiment, based on the similarity between image vocabularies, a high-discrimination classification of the image vocabularies is achieved through the K-means clustering algorithm, and a representative image vocabulary can be determined from each class, so that the styling of the target product can be described accurately in multiple dimensions.
The method calculates the similarity between any two image vocabularies, classifies the plurality of image vocabularies according to the similarity, and selects representative image vocabularies from the resulting classes. The image vocabulary similarity calculation model provided by the method is based on the common distance and the difference distance between image vocabularies and introduces a commonality adjustment parameter and a difference adjustment parameter, improving the accuracy and efficiency of the similarity calculation. Based on the similarity between image vocabularies, the K-means clustering algorithm achieves a high-discrimination classification of the image vocabularies, a representative image vocabulary can be determined from each class, and the styling of the target product can thus be described accurately in multiple dimensions.
In one embodiment, a computer device is proposed, the computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the following steps when executing the computer program:
acquiring a plurality of image vocabularies describing a target product;
calculating the similarity between any two image vocabularies, and classifying the image vocabularies according to the similarity;
and selecting representative image words from the various image words.
The computer device also comprises a network interface, an input device, and a display screen. The memory comprises a nonvolatile storage medium and an internal memory. The nonvolatile storage medium of the computer device stores an operating system and may also store a computer program that, when executed by the processor, causes the processor to implement the product image determining method. The internal memory may also store a computer program that, when executed by the processor, causes the processor to perform the product image determining method. The display screen of the computer device may be a liquid crystal display or an electronic ink display, and the input device may be a touch layer covering the display screen, a key, a trackball, or a touch pad arranged on the housing of the computer device, or an external keyboard, touch pad, or mouse.
Those skilled in the art will appreciate that the architecture shown in fig. 6 is merely a block diagram of some of the structures associated with the disclosed aspects and is not intended to limit the computer devices to which the disclosed aspects apply; a particular computer device may include more or fewer components than those shown, combine certain components, or have a different arrangement of components.
In one embodiment, a computer readable storage medium is provided, having a computer program stored thereon, which, when executed by a processor, causes the processor to perform the steps of:
acquiring a plurality of image vocabularies describing a target product;
calculating the similarity between any two image vocabularies, and classifying the plurality of image vocabularies according to the similarity;
and selecting representative image words from the various image words.
It should be understood that, although the steps in the flowcharts of the embodiments of the present invention are shown in sequence as indicated by the arrows, they are not necessarily performed in that sequence. Unless explicitly stated otherwise, the order of these steps is not strictly limited, and they may be performed in other orders. Moreover, at least a portion of the steps in the various embodiments may include multiple sub-steps or stages that are not necessarily performed at the same time but may be performed at different times, and the order of their performance is not necessarily sequential; they may be performed in turn or in alternation with other steps or with at least a portion of the sub-steps or stages of other steps.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program, which can be stored in a non-volatile computer-readable storage medium, and can include the processes of the embodiments of the methods described above when the program is executed. Any reference to memory, storage, database, or other medium used in the embodiments provided herein may include non-volatile and/or volatile memory, among others. Non-volatile memory can include read-only memory (ROM), Programmable ROM (PROM), Electrically Programmable ROM (EPROM), Electrically Erasable Programmable ROM (EEPROM), or flash memory. Volatile memory can include Random Access Memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as Static RAM (SRAM), Dynamic RAM (DRAM), Synchronous DRAM (SDRAM), Double Data Rate SDRAM (DDRSDRAM), Enhanced SDRAM (ESDRAM), Synchronous Link DRAM (SLDRAM), Rambus Direct RAM (RDRAM), direct bus dynamic RAM (DRDRAM), and memory bus dynamic RAM (RDRAM).
The technical features of the embodiments described above may be arbitrarily combined, and for the sake of brevity, all possible combinations of the technical features in the embodiments described above are not described, but should be considered as being within the scope of the present specification as long as there is no contradiction between the combinations of the technical features.
The above-mentioned embodiments only express several embodiments of the present invention, and the description thereof is more specific and detailed, but not construed as limiting the scope of the present invention. It should be noted that, for a person skilled in the art, several variations and modifications can be made without departing from the inventive concept, which falls within the scope of the present invention. Therefore, the protection scope of the present patent shall be subject to the appended claims.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents and improvements made within the spirit and principle of the present invention are intended to be included within the scope of the present invention.

Claims (5)

1. A method of determining a product image, comprising: acquiring a plurality of image vocabularies describing a target product; calculating the similarity between any two image vocabularies, and classifying the plurality of image vocabularies according to the similarity; and selecting representative image vocabularies from the classes of image vocabularies;
the obtaining of a plurality of image vocabularies describing the target product comprises the following steps: acquiring a plurality of descriptive vocabularies for describing the target product from the Internet; primarily screening the descriptive vocabularies, and taking the vocabularies obtained after screening as the image vocabularies;
the calculating the similarity between any two image words comprises the following steps: matching each image vocabulary with entries in a synonym forest database, wherein the synonym forest is divided into a plurality of levels, a logic structure between each level is a tree structure, an atomic word group is the lowest level, the atomic word group comprises a plurality of entries, and each atomic word group has a corresponding code; assigning the codes of the atomic word groups where the successfully matched entries are located to corresponding image vocabularies; determining the parameter of image vocabulary similarity calculation by utilizing the codes of any two image vocabularies; substituting the parameters into an image vocabulary similarity calculation model to calculate the similarity between two corresponding image vocabularies;
the first image vocabulary and the second image vocabulary are any two of the image vocabularies, and the parameters comprise k, m and n; k is the number of layers from a root node to the nearest public father node of the first image vocabulary and the second image vocabulary in the tree layer structure of the synonym forest database; the root node is an initial node of each image vocabulary in a synonym forest database tree layer structure; n is the number of nodes at the layer below the nearest common father node of the first image vocabulary and the second image vocabulary; m is the interval number between the next level nodes of the nearest common father nodes of the first image vocabulary and the second image vocabulary;
the image vocabulary similarity calculation model comprises the following steps:
[Formula (1): the image vocabulary similarity calculation model, shown as an image in the original]
wherein y1 is the first image vocabulary, y2 is the second image vocabulary, and sim(y1, y2) is the similarity between the first image vocabulary and the second image vocabulary; j represents the number of layers from the nearest common parent node down to the position of the first/second image vocabulary; q(r) is the path weight from the root node to the vocabulary position, r is the layer number in the tree structure of the synonym forest database, and r ∈ [1, 5]; q(k) represents the edge weight of the nearest common parent node, and a is the commonality adjustment parameter of the similarity between the first image vocabulary and the second image vocabulary;
the preliminary screening comprises:
matching the descriptive vocabularies against a derogatory-word database, and screening out the descriptive vocabularies that are successfully matched;
and matching the descriptive vocabularies against a styling-design vocabulary database of the target product, and screening out the descriptive vocabularies that are not successfully matched.
2. The method of claim 1, wherein the classifying the plurality of image words according to the similarity comprises:
selecting K image vocabularies from the plurality of image vocabularies as cluster centers, based on the K-means clustering algorithm;
determining the distance from each remaining image vocabulary to each cluster center according to the similarity;
classifying each remaining image vocabulary to the cluster center closest to it;
and re-determining the K cluster centers and repeating the classification process until the clustering error sum-of-squares criterion function converges, thereby classifying the image vocabularies into K classes.
3. The method of claim 2, wherein said selecting a representative image vocabulary from among said image vocabularies comprises:
determining the distance from each image word in each class of words to the clustering center;
and selecting the image vocabulary with the smallest distance as the representative image vocabulary.
4. A computer device comprising a memory and a processor, the memory having stored therein a computer program that, when executed by the processor, causes the processor to perform the steps of the product image determination method of any of claims 1 to 3.
5. A computer-readable storage medium, having stored thereon a computer program which, when executed by a processor, causes the processor to carry out the steps of the method of determining an image of a product according to any of claims 1 to 3.
CN202110498466.9A 2021-05-08 2021-05-08 Product image determining method, computer equipment and storage medium Active CN113254638B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110498466.9A CN113254638B (en) 2021-05-08 2021-05-08 Product image determining method, computer equipment and storage medium

Publications (2)

Publication Number Publication Date
CN113254638A CN113254638A (en) 2021-08-13
CN113254638B true CN113254638B (en) 2022-09-23

Family

ID=77224069

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110498466.9A Active CN113254638B (en) 2021-05-08 2021-05-08 Product image determining method, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN113254638B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114860886B (en) * 2022-05-25 2023-07-18 北京百度网讯科技有限公司 Method for generating relationship graph and method and device for determining matching relationship

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107066757A (en) * 2017-05-11 2017-08-18 北方民族大学 A kind of big data supports the module type spectrum Optimization Design in lower product modular design
CN109960786A (en) * 2019-03-27 2019-07-02 北京信息科技大学 Chinese Measurement of word similarity based on convergence strategy
CN110929529A (en) * 2019-11-29 2020-03-27 长沙理工大学 Text clustering method based on synonym forest semantic similarity
CN111813985A (en) * 2020-06-29 2020-10-23 浙江大学 Bionic design system for matching biological prototype with product style based on perceptual image

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10074200B1 (en) * 2015-04-22 2018-09-11 Amazon Technologies, Inc. Generation of imagery from descriptive text
CN108596051A (en) * 2018-04-04 2018-09-28 浙江大学城市学院 A kind of intelligent identification Method towards product style image
TWI687825B (en) * 2018-12-03 2020-03-11 國立臺灣師範大學 Method and system for mapping from natural language to color combination
CN111414753A (en) * 2020-03-09 2020-07-14 中国美术学院 Method and system for extracting perceptual image vocabulary of product
CN112528661A (en) * 2020-12-15 2021-03-19 北京信息科技大学 Entity similarity calculation method

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107066757A (en) * 2017-05-11 2017-08-18 北方民族大学 A kind of big data supports the module type spectrum Optimization Design in lower product modular design
CN109960786A (en) * 2019-03-27 2019-07-02 北京信息科技大学 Chinese Measurement of word similarity based on convergence strategy
CN110929529A (en) * 2019-11-29 2020-03-27 长沙理工大学 Text clustering method based on synonym forest semantic similarity
CN111813985A (en) * 2020-06-29 2020-10-23 浙江大学 Bionic design system for matching biological prototype with product style based on perceptual image

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
产品设计中的感性意象理论技术;王冬铀;《工业设计》;20180720(第7期);全文 *

Also Published As

Publication number Publication date
CN113254638A (en) 2021-08-13


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant