CN110807082A - Quality spot check item determination method, system, electronic device and readable storage medium - Google Patents

Info

Publication number
CN110807082A
Authority
CN
China
Prior art keywords
evaluation
evaluation data
quality
article
word
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201810866301.0A
Other languages
Chinese (zh)
Inventor
向彪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Jingdong Century Trading Co Ltd
Beijing Jingdong Shangke Information Technology Co Ltd
Original Assignee
Beijing Jingdong Century Trading Co Ltd
Beijing Jingdong Shangke Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Jingdong Century Trading Co Ltd and Beijing Jingdong Shangke Information Technology Co Ltd
Priority to CN201810866301.0A
Publication of CN110807082A

Abstract

The invention discloses a method, a system, an electronic device and a readable storage medium for determining quality spot check items. The method for determining the quality spot check items of an article comprises the following steps: acquiring first evaluation data of an article; performing word segmentation processing on the first evaluation data to obtain a plurality of evaluation word segments; presetting a spot check item library, which stores a plurality of items to be spot checked and a subject term corresponding to each item to be spot checked; calculating the similarity between each subject term and each evaluation word segment, and counting, for each subject term, the frequency with which the similarity is greater than a similarity threshold; and selecting the N items to be spot checked that correspond to the N subject terms with the highest frequency as quality spot check items, wherein N is a positive integer. Because the quality spot check items are selected automatically according to the evaluation data of users, dependence on professionals is eliminated and spot checking becomes automatic and scientific; because the selected items are derived from user evaluation data, they are more representative and more reliable.

Description

Quality spot check item determination method, system, electronic device and readable storage medium
Technical Field
The invention belongs to the field of big data processing, and particularly relates to a quality spot check item determination method, a quality spot check item determination system, electronic equipment and a readable storage medium.
Background
Article quality spot check is an effective method of quality supervision and management that is widely accepted and adopted by government quality supervision departments, industries and enterprises; the Internet, as a platform and channel for the circulation of articles, also requires article quality spot checks. An important part of an article quality spot check is the selection of spot check items. In the existing selection process, quality control personnel rely mainly on their own experience to judge the category to which an article belongs from its description, and then, with reference to the quality standards established by the national quality authority, either select the spot check items for that check or simply select all items.
This approach depends heavily on the experience of quality control personnel and places high demands on their professional competence. Because the articles on an Internet platform are numerous in type and change frequently, an approach that depends on quality control personnel does not scale well, and its response time and cost are difficult to control. In addition, the knowledge that quality control personnel have of an article tends to be one-sided and colored by personal bias, so the spot check items they select may be unreasonable and quality problems cannot be found efficiently and at low cost. As a result, spot check efficiency falls, cost rises, and the approach is neither sustainable nor suitable for wide adoption.
Disclosure of Invention
To overcome the defect in the prior art that quality spot checks of Internet articles depend mainly on the judgment of quality control personnel, which reduces spot check efficiency and prevents wide adoption, the invention provides a quality spot check item determination method, a system, an electronic device and a readable storage medium.
The invention solves the technical problems through the following technical scheme:
an item quality spot check item determination method, comprising:
acquiring first evaluation data of an article;
performing word segmentation processing on the first evaluation data to obtain a plurality of evaluation word segments;
presetting a spot check item library; the spot check item library stores a plurality of items to be spot checked and a subject term corresponding to each item to be spot checked;
calculating the similarity between each subject term and each evaluation word segment, and counting, for each subject term, the frequency with which the similarity is greater than a similarity threshold value;
and selecting N to-be-spot-checked items corresponding to the N subject terms with the highest frequency as quality spot-check items, wherein N is a positive integer.
Preferably, after the step of obtaining the first evaluation data of an article, the article quality spot check item determining method further includes:
judging whether the first evaluation data contain a negative evaluation of the quality of the article and, if so, filtering out the first evaluation data that do not contain a negative evaluation of the quality of the article;
and in the step of performing word segmentation processing on the first evaluation data, performing word segmentation processing on the filtered first evaluation data.
Preferably, the step of determining whether the first evaluation data includes a negative evaluation of the quality of the article specifically includes:
acquiring second evaluation data of the target object within a preset time;
endowing the second evaluation data with a target characteristic label, wherein the target characteristic label is used for representing whether the second evaluation data shows that the target object has a quality problem;
creating a text information base for judging the quality of the article according to the second evaluation data;
training according to the text information base and the target feature label to obtain an article evaluation data evaluation model;
and judging whether the first evaluation data comprises negative evaluation on the quality of the article by using the article evaluation data evaluation model.
Preferably, the step of creating a text information base for judging the quality of the article according to the second evaluation data specifically includes:
presetting a word vector library; the word vector library stores a plurality of standard word segments and a word vector corresponding to each standard word segment;
performing word segmentation processing on the second evaluation data to obtain a plurality of word segments;
acquiring word segmentation vectors corresponding to the plurality of word segments from the word vector library; the text information base comprises the word segmentation vectors;
the step of training according to the text information base and the target feature label to obtain the article evaluation data evaluation model specifically comprises the following steps:
inputting the word segmentation vectors and the target feature labels into a machine learning model as training samples, and training to obtain the article evaluation data evaluation model.
Preferably, the step of determining whether the first evaluation data includes a negative evaluation on the quality of the article by using the article evaluation data evaluation model specifically includes:
obtaining evaluation word segmentation vectors corresponding to the plurality of evaluation word segments from the word vector library;
inputting the evaluation word segmentation vectors into the article evaluation data evaluation model, and outputting a feature tag of the first evaluation data; the feature tag is used for characterizing whether the first evaluation data comprises negative evaluation on the quality of the article;
and judging whether the first evaluation data comprises negative evaluation on the quality of the article according to the feature tag.
Preferably, before the step of acquiring the word segmentation vectors corresponding to the plurality of word segments from the word vector library, the quality spot check item determining method further includes:
filtering stop words out of the plurality of word segments;
in the step of acquiring the word segmentation vectors corresponding to the plurality of word segments from the word vector library, the corresponding word segmentation vectors are acquired for the filtered word segments.
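The stop-word filtering and vector lookup described above can be sketched as follows. This is only an illustration under stated assumptions: the stop-word set, the toy two-dimensional vectors, and the helper names `filter_stop_words` and `lookup_vectors` are hypothetical and do not come from the patent, and word segments missing from the library are simply skipped.

```python
# Hypothetical stop-word list; a real one would be much larger and language-specific.
STOP_WORDS = frozenset({"the", "is", "a", "and", "very", ",", ".", "!"})


def filter_stop_words(segments, stop_words=STOP_WORDS):
    """Drop empty tokens, stop words and listed punctuation from the word segments."""
    return [s for s in segments if s.strip() and s.lower() not in stop_words]


def lookup_vectors(segments, word_vector_library):
    """Fetch the vector of each segment; segments absent from the library are skipped."""
    return [word_vector_library[s] for s in segments if s in word_vector_library]


# Toy word vector library (assumed values, two dimensions for readability).
word_vector_library = {"zipper": [0.2, 0.7], "loose": [0.6, 0.1]}

segments = ["The", "zipper", "is", "very", "loose", "!"]
filtered = filter_stop_words(segments)
vectors = lookup_vectors(filtered, word_vector_library)
print(filtered, vectors)
```

Filtering before the lookup keeps punctuation and function words from diluting whatever representation is later built from the vectors.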
Preferably, the step of calculating the similarity between each subject term and the evaluation word segment specifically comprises:
obtaining a subject term vector corresponding to the subject term from the word vector library;
and calculating the cosine similarity of the subject term vector and the evaluation word segmentation vector based on a cosine similarity algorithm to serve as the similarity.
An electronic device comprises a memory, a processor and a computer program stored on the memory and running on the processor, wherein the processor executes the computer program to realize the quality spot check item determination method.
A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the steps of the quality spot check item determination method described above.
An article quality spot check item determination system comprises a data acquisition module, a word segmentation module, a similarity calculation module, a frequency statistics module, a quality spot check item selection module and a spot check item library; the spot check item library stores a plurality of items to be spot checked and a subject term corresponding to each item to be spot checked;
the data acquisition module is used for acquiring first evaluation data of an article;
the word segmentation module is used for carrying out word segmentation processing on the first evaluation data to obtain a plurality of evaluation word segments;
the similarity calculation module is used for calculating the similarity between each subject term and each evaluation word segment and calling the frequency statistics module;
the frequency statistics module is used for counting the frequency with which the similarity between each subject term and the evaluation word segments is greater than a similarity threshold value;
the quality spot check item selection module is used for selecting the N items to be spot checked corresponding to the N subject terms with the highest frequency as quality spot check items, wherein N is a positive integer.
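One way to picture how these modules fit together is the sketch below. It is an assumption-laden illustration, not the patented implementation: the class name `SpotCheckSystem`, the callables passed in, and the exact-match similarity used in the toy run are all hypothetical stand-ins (a real system would plug in a word segmenter and a vector-based similarity).

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class SpotCheckSystem:
    acquire: Callable[[str], List[str]]       # data acquisition module
    segment: Callable[[str], List[str]]       # word segmentation module
    similarity: Callable[[str, str], float]   # similarity calculation module
    library: Dict[str, str]                   # spot check item library: subject term -> item
    threshold: float = 0.8                    # similarity threshold

    def count_frequencies(self, segments: List[str]) -> Dict[str, int]:
        """Frequency statistics module: hits above the threshold per subject term."""
        return {
            term: sum(1 for s in segments if self.similarity(term, s) > self.threshold)
            for term in self.library
        }

    def select_items(self, article_id: str, n: int) -> List[str]:
        """Selection module: items of the n subject terms with the highest frequency."""
        segments = [s for text in self.acquire(article_id) for s in self.segment(text)]
        counts = self.count_frequencies(segments)
        top_terms = sorted(counts, key=counts.get, reverse=True)[:n]
        return [self.library[term] for term in top_terms]


def toy_acquire(article_id: str) -> List[str]:
    """Stand-in for fetching an article's reviews (hypothetical data)."""
    return ["color faded", "strange smell", "faded badly"]


def toy_similarity(term: str, segment: str) -> float:
    """Stand-in similarity: 1.0 on exact match, else 0.0."""
    return 1.0 if term == segment else 0.0


system = SpotCheckSystem(
    acquire=toy_acquire,
    segment=str.split,
    similarity=toy_similarity,
    library={"faded": "color fastness", "smell": "odor test", "shrink": "dimensional stability"},
)
print(system.select_items("sku-1", 2))
```

Passing the modules in as callables keeps each one replaceable, which mirrors the statement that the module division is a matter of design rather than a fixed requirement.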
Preferably, the article quality spot check item determination system further comprises a judging module and an evaluation data filtering module;
the judging module is used for judging whether the first evaluation data contains negative evaluation on the quality of the article, and if so, the evaluation data filtering module is called;
the evaluation data filtering module is used for filtering out first evaluation data which do not contain negative evaluation on the quality of the article;
and the word segmentation module is used for carrying out word segmentation processing on the filtered first evaluation data.
Preferably, the judging module comprises an evaluation data obtaining unit, a label endowing unit, a text information base establishing unit and an article evaluation data evaluation model training unit;
the evaluation data acquisition unit is used for acquiring second evaluation data of the target object within a preset time;
the label endowing unit is used for endowing the second evaluation data with a target characteristic label, and the target characteristic label is used for representing whether the second evaluation data shows that the target article has a quality problem;
the text information base creating unit is used for creating a text information base for judging the quality of the article according to the second evaluation data;
the article evaluation data evaluation model training unit is used for training according to the text information base and the target characteristic label to obtain an article evaluation data evaluation model;
the judging module is used for judging whether the first evaluation data comprises negative evaluation on the quality of the article by utilizing the article evaluation data judging model.
Preferably, the judging module further comprises a word segmentation unit and a word vector library; the word vector library stores a plurality of standard word segments and a word vector corresponding to each standard word segment;
the word segmentation unit is used for carrying out word segmentation processing on the second evaluation data to obtain a plurality of word segments;
the text information base creating unit is used for acquiring word segmentation vectors corresponding to the plurality of word segments from the word vector library; the text information base comprises the word segmentation vectors;
and the article evaluation data evaluation model training unit is used for inputting the word segmentation vectors and the target feature labels into a machine learning model as training samples, and training to obtain the article evaluation data evaluation model.
Preferably, the judging module further comprises a word vector acquiring unit and a label output unit;
the word vector acquiring unit is used for acquiring evaluation word segmentation vectors corresponding to the evaluation word segmentations from the word vector library;
the label output unit is used for inputting the evaluation word segmentation vectors into the article evaluation data evaluation model and outputting the feature labels of the first evaluation data; the characteristic label is used for characterizing whether the first evaluation data comprises negative evaluation on the quality of the article;
the judging module is used for judging whether the first evaluation data comprises negative evaluation on the quality of the article according to the feature label.
Preferably, the text information base creating unit further comprises a stop word filtering unit;
the stop word filtering unit is used for filtering stop words out of the plurality of word segments;
the word vector acquiring unit is used for acquiring corresponding word segmentation vectors for the filtered word segments.
Preferably, the word vector acquiring unit is further configured to acquire a subject term vector corresponding to the subject term from the word vector library;
the similarity calculation module is used for calculating the cosine similarity of the subject term vector and the evaluation word segmentation vector based on a cosine similarity algorithm as the similarity.
The positive effects of the invention are as follows: the quality spot check items are selected automatically according to the evaluation data of users, so that dependence on professionals is eliminated and spot checking becomes automatic and scientific; because the selected items are derived from user evaluation data, they are more representative and more reliable.
Drawings
Fig. 1 is a flowchart of a method for determining an item quality spot check item according to embodiment 1 of the present invention.
Fig. 2 is a flowchart of a method for determining an item quality spot check item according to embodiment 2 of the present invention.
Fig. 3 is a flowchart of step 11 of the method for determining an item quality spot check item according to embodiment 2 of the present invention.
Fig. 4 is a flowchart of step 113 of the method for determining an item quality spot check item according to embodiment 2 of the present invention.
Fig. 5 is a flowchart of step 115 of the method for determining an item quality spot check item according to embodiment 2 of the present invention.
Fig. 6 is a flowchart of step 113 of the method for determining an item quality spot check item according to embodiment 3 of the present invention.
Fig. 7 is a flowchart of step 40 of the method for determining an item quality spot check item according to embodiment 3 of the present invention.
Fig. 8 is a schematic structural diagram of an electronic device according to embodiment 4 of the present invention.
Fig. 9 is a block diagram of an item quality spot check item determination system according to embodiment 6 of the present invention.
Fig. 10 is a block diagram of an item quality spot check item determination system according to embodiment 7 of the present invention.
Fig. 11 is a block diagram of the judging module in the item quality spot check item determination system according to embodiment 7 of the present invention.
Fig. 12 is a block diagram of the judging module in the item quality spot check item determination system according to embodiment 8 of the present invention.
Detailed Description
The invention is further illustrated by the following examples, which are not intended to limit the scope of the invention.
Example 1
An item quality spot check item determination method, as shown in fig. 1, includes:
step 10, acquiring first evaluation data of an article; in this embodiment, the reviews and refund records from the past 3 months for the article to be spot checked are selected as its evaluation data, so that the data better reflect the article's recent quality problems;
step 20, performing word segmentation processing on the first evaluation data to obtain a plurality of evaluation word segments;
step 30, presetting a sampling inspection item library; the spot check item library is stored with a plurality of items to be spot checked and a subject term corresponding to each item to be spot checked;
step 40, calculating the similarity between each subject term and each evaluation word, and counting the frequency of similarity greater than a similarity threshold;
step 50, selecting N to-be-spot-checked items corresponding to N subject terms with the highest frequency as quality spot-check items; n is a positive integer.
In the selection process, this embodiment may proceed as follows: all items to be spot checked are arranged in descending order of frequency and the top 5 are taken; a lower limit may also be set on the frequency, for example requiring a frequency greater than 10 (this threshold can be adjusted flexibly according to the actual application), and the items that satisfy both conditions are used as the final quality spot check items;
in this embodiment, the quality spot check items are selected automatically according to the evaluation data of users, so that dependence on professionals is eliminated and spot checking becomes automatic and scientific; because the selected items are derived from user evaluation data, they are more representative and more reliable.
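The ranking rule in this embodiment (top 5 by frequency, with a frequency greater than 10) can be sketched as follows. The function name `select_spot_check_items`, the sample frequencies and the term-to-item mapping are hypothetical illustrations; only the defaults of 5 and 10 come from the text above.

```python
def select_spot_check_items(frequency, item_of_term, top_n=5, min_frequency=10):
    """Rank subject terms by frequency, keep the top_n, then drop any whose
    frequency is not greater than min_frequency; return the matching items."""
    ranked = sorted(frequency.items(), key=lambda pair: pair[1], reverse=True)
    return [item_of_term[term] for term, count in ranked[:top_n] if count > min_frequency]


# Hypothetical frequencies counted for each subject term.
frequency = {"fading": 42, "odor": 17, "shrinkage": 11, "zipper": 9, "pilling": 3, "seam": 1}

# Hypothetical spot check item library: subject term -> item to be spot checked.
item_of_term = {
    "fading": "color fastness",
    "odor": "odor test",
    "shrinkage": "dimensional stability",
    "zipper": "zipper durability",
    "pilling": "pilling resistance",
    "seam": "seam strength",
}

selected = select_spot_check_items(frequency, item_of_term)
print(selected)  # ['color fastness', 'odor test', 'dimensional stability']
```

With these sample values, "zipper" and "pilling" fall inside the top 5 but are dropped by the frequency floor, so fewer than 5 items may be returned.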
Example 2
The method for determining the article quality spot check item in this embodiment is further improved on the basis of embodiment 1, as shown in fig. 2, after step 10, the method for determining the article quality spot check item further includes:
step 11, judging whether the first evaluation data comprise negative evaluation on the quality of the article, if so, executing step 12; if not, the data does not need to be filtered;
step 12, filtering first evaluation data which do not contain negative evaluation on the quality of the article;
further, step 20 is replaced by step 20-1, which specifically comprises:
step 20-1, performing word segmentation processing on the filtered first evaluation data to obtain a plurality of evaluation words;
it should be noted that the purpose of a quality spot check is generally to enable effective supervision and management of articles, with particular emphasis on problematic articles; therefore, in this embodiment, the evaluation data are filtered in advance so that the spot check focuses on articles with quality problems. As shown in fig. 3, step 11 specifically includes:
step 111, obtaining second evaluation data of the target object within a preset time;
step 112, assigning a target feature label to the second evaluation data; the target feature label indicates whether the second evaluation data show that the target article has a quality problem; to calibrate the target feature labels, part of the existing evaluation data can be selected and labeled manually according to whether it describes a quality problem, for example, 1 if it describes a quality problem of the article and 0 otherwise;
step 113, creating a text information base for judging the quality of the article according to the second evaluation data;
step 114, training according to the text information base and the target characteristic label to obtain an article evaluation data evaluation model;
and step 115, judging whether the first evaluation data comprises negative evaluation on the quality of the article by using the article evaluation data evaluation model.
Further, in this embodiment, as shown in fig. 4, step 113 specifically includes:
step 1131, presetting a word vector library; the word vector library stores a plurality of standard word segments and a word vector corresponding to each standard word segment;
step 1132, performing word segmentation processing on the second evaluation data to obtain a plurality of word segments;
step 1133, obtaining word segmentation vectors corresponding to the plurality of word segments from the word vector library; the text information base comprises the word segmentation vectors;
in step 114, the word segmentation vectors and the target feature labels are input into a machine learning model as training samples, and the article evaluation data evaluation model is obtained through training.
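A minimal sketch of steps 1131 to 1133 and step 114 is given below. The patent does not name the machine learning model or say how the vectors of one review are combined; averaging the vectors and fitting a logistic regression by stochastic gradient descent are assumptions made here for illustration, and the toy vectors, reviews and helper names are hypothetical.

```python
import math


def average_vector(vectors, dim):
    """Mean of one review's word segmentation vectors (zero vector if none were found)."""
    if not vectors:
        return [0.0] * dim
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic_regression(samples, labels, epochs=200, lr=0.5):
    """Fit weights and bias with plain stochastic gradient descent on the log loss."""
    weights, bias = [0.0] * len(samples[0]), 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            error = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias) - y
            weights = [w - lr * error * xi for w, xi in zip(weights, x)]
            bias -= lr * error
    return weights, bias


def predict_label(model, x):
    """Return 1 (quality problem) when the predicted probability is at least 0.5."""
    weights, bias = model
    return 1 if sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias) >= 0.5 else 0


# Toy word vector library (step 1131) and already segmented second evaluation data.
word_vector_library = {
    "broke": [1.0, 0.0], "faded": [0.9, 0.1],
    "fast": [0.0, 1.0], "polite": [0.1, 0.9],
}
segmented_reviews = [["broke", "faded"], ["fast", "polite"], ["faded"], ["polite"]]
labels = [1, 0, 1, 0]  # target feature labels: 1 = quality problem, 0 = otherwise

samples = [
    average_vector([word_vector_library[w] for w in review if w in word_vector_library], 2)
    for review in segmented_reviews
]
model = train_logistic_regression(samples, labels)
predictions = [predict_label(model, x) for x in samples]
print(predictions)
```

Any classifier that accepts fixed-length vectors could take the place of the logistic regression here.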
It should be noted that a word vector represents a word as a multidimensional array; with word vectors, similar words lie closer together when the cosine distance is calculated, and relatively mature open-source implementations exist for generating word vectors. In this embodiment, the word vector library may be generated by collecting item category information, item description information, item feedback information and the like, performing word segmentation on all feedback text under each category, learning a word vector representation for each word, and finally generating a word vector library corresponding to each item category.
Further, after the training is performed to obtain the item evaluation data evaluation model, as shown in fig. 5, step 115 specifically includes:
step 1151, obtaining evaluation word segmentation vectors corresponding to the plurality of evaluation word segments from the word vector library;
step 1152, inputting the evaluation word segmentation vectors into the article evaluation data evaluation model, and outputting a feature label of the first evaluation data; the feature label is used for characterizing whether the first evaluation data comprises negative evaluation on the quality of the article;
step 1153, judging whether the first evaluation data comprises negative evaluation on the quality of the article according to the feature label.
In this embodiment, after the second evaluation data is segmented, the corresponding word segmentation vectors are obtained from the word vector library; the word segmentation vectors and the target feature labels are then used as the training corpus to train an article evaluation data evaluation model, and the first evaluation data is judged based on that model.
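The filtering of steps 11 and 12 can be sketched as below, assuming that each review is judged separately and that a trained evaluation model is available as a callable returning the feature label. The rule-based `toy_classify` merely stands in for such a model; it, the whitespace segmentation, the toy vectors and the name `keep_negative_reviews` are all hypothetical.

```python
def keep_negative_reviews(reviews, segment, word_vector_library, classify):
    """Keep only the reviews whose feature label is 1 (negative evaluation of quality)."""
    kept = []
    for review in reviews:
        vectors = [word_vector_library[w] for w in segment(review) if w in word_vector_library]
        # Reviews with no known word segments cannot be judged and are dropped here.
        if vectors and classify(vectors) == 1:
            kept.append(review)
    return kept


def toy_classify(vectors):
    """Stand-in for the trained model: label 1 when the mean first component exceeds 0.5."""
    mean_first = sum(v[0] for v in vectors) / len(vectors)
    return 1 if mean_first > 0.5 else 0


# Toy word vector library (assumed values).
word_vector_library = {
    "zipper": [0.2, 0.1], "broke": [0.9, 0.0],
    "delivery": [0.0, 0.8], "fast": [0.1, 0.9],
}
first_evaluation_data = ["zipper broke", "delivery fast", "broke"]

filtered = keep_negative_reviews(first_evaluation_data, str.split, word_vector_library, toy_classify)
print(filtered)  # ['zipper broke', 'broke']
```

Only the filtered reviews would then be passed to word segmentation in step 20-1.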
Example 3
The method for determining the item quality spot check item in this embodiment is a further improvement on embodiment 2. Because part of the user evaluation data contains many non-standard phrases, punctuation marks, invalid content and the like, stop words, symbols and similar tokens are removed after word segmentation to improve the accuracy of the model. Therefore, as shown in fig. 6, before step 1133, step 113 further includes:
step 1134, filtering stop words out of the plurality of word segments;
further, step 1133 is replaced by step 1133-1, which specifically includes:
step 1133-1, obtaining, from the word vector library, the word segmentation vectors corresponding to the filtered word segments;
in addition, vector representations of the subject terms and the evaluation word segments can be queried from the word vector library, so that the closeness of two words can be obtained by calculating the cosine similarity of their vectors; specifically, as shown in fig. 7, step 40 includes:
step 401, obtaining a subject term vector corresponding to each subject term from the word vector library;
and step 402, calculating the cosine similarity of the subject term vector and the evaluation word segmentation vector based on a cosine similarity algorithm to serve as the similarity.
In a specific implementation of this embodiment, the subject terms may be taken from the specific quality item requirements defined in national and industry quality standards, and the cosine similarity between each word in the evaluation data and each subject term is calculated. If the cosine similarity between a word and a subject term is greater than a threshold (for example, 0.8), the problem reported in that record is very likely related to the subject term, so the count for that subject term is increased by 1; in this way the count corresponding to each quality subject term is finally obtained.
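The counting procedure of steps 401 and 402 can be sketched as follows, treating the quantity compared with the 0.8 threshold as cosine similarity. The three-dimensional toy vectors and the function names are hypothetical; a zero vector is given similarity 0 by convention in this sketch.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def count_subject_term_hits(subject_term_vectors, evaluation_vectors, threshold=0.8):
    """For each subject term, count the evaluation vectors whose similarity exceeds the threshold."""
    counts = {term: 0 for term in subject_term_vectors}
    for term, term_vector in subject_term_vectors.items():
        for evaluation_vector in evaluation_vectors:
            if cosine_similarity(term_vector, evaluation_vector) > threshold:
                counts[term] += 1
    return counts


# Toy subject term vectors and evaluation word segmentation vectors (assumed values).
subject_term_vectors = {"fading": [1.0, 0.0, 0.0], "odor": [0.0, 1.0, 0.0]}
evaluation_vectors = [
    [0.9, 0.1, 0.0],    # close to "fading"
    [0.5, 0.0, 0.87],   # close to neither subject term
    [0.1, 0.95, 0.0],   # close to "odor"
    [1.0, 0.05, 0.0],   # close to "fading"
]

counts = count_subject_term_hits(subject_term_vectors, evaluation_vectors)
print(counts)  # {'fading': 2, 'odor': 1}
```

The resulting counts are the frequencies that step 50 ranks when choosing the N quality spot check items.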
Example 4
An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the method for determining an item for quality spot check as described in any of embodiments 1 to 3 when executing the computer program.
Fig. 8 is a schematic structural diagram of an electronic device according to embodiment 4 of the present invention. FIG. 8 illustrates a block diagram of an exemplary electronic device 90 suitable for use in implementing embodiments of the present invention. The electronic device 90 shown in fig. 8 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiment of the present invention.
As shown in fig. 8, the electronic device 90 may take the form of a general purpose computing device, which may be a server device, for example. The components of the electronic device 90 may include, but are not limited to: at least one processor 91, at least one memory 92, and a bus 93 that connects the various system components (including the memory 92 and the processor 91).
The bus 93 includes a data bus, an address bus, and a control bus.
Memory 92 may include volatile memory, such as Random Access Memory (RAM)921 and/or cache memory 922, and may further include Read Only Memory (ROM) 923.
Memory 92 may also include a program tool 925 having a set (at least one) of program modules 924, such program modules 924 including, but not limited to: an operating system, one or more application programs, other program modules, and program data, each of which, or some combination thereof, may comprise an implementation of a network environment.
The processor 91 executes various functional applications and data processing by running a computer program stored in the memory 92.
The electronic device 90 may also communicate with one or more external devices 94 (e.g., keyboard, pointing device, etc.). Such communication may be through an input/output (I/O) interface 95. Also, the electronic device 90 may communicate with one or more networks (e.g., a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network, such as the Internet) via a network adapter 96. The network adapter 96 communicates with the other modules of the electronic device 90 via the bus 93. It should be appreciated that although not shown in the figures, other hardware and/or software modules may be used in conjunction with the electronic device 90, including but not limited to: microcode, device drivers, redundant processors, external disk drive arrays, RAID (disk array) systems, tape drives, and data backup storage systems, etc.
It should be noted that although several units/modules or sub-units/modules of the electronic device are mentioned in the above detailed description, this division is merely exemplary and not mandatory. Indeed, according to embodiments of the application, the features and functions of two or more of the units/modules described above may be embodied in one unit/module. Conversely, the features and functions of one unit/module described above may be further divided and embodied by a plurality of units/modules.
Example 5
A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the item quality spot check item determination method according to any one of embodiments 1 to 3.
More specific examples of the readable storage medium may include, but are not limited to: a portable disk, a hard disk, random access memory, read only memory, erasable programmable read only memory, optical storage device, magnetic storage device, or any suitable combination of the foregoing.
In a possible implementation manner, the present invention can also be implemented in the form of a program product that includes program code; when the program product is run on a terminal device, the program code causes the terminal device to execute the steps of the item quality spot check item determination method described in any one of embodiments 1 to 3.
The program code for carrying out the invention may be written in any combination of one or more programming languages, and may be executed entirely on the user device, partly on the user device, as a stand-alone software package, partly on the user device and partly on a remote device, or entirely on the remote device.
Example 6
An article quality spot-check item determination system is shown in fig. 9, and comprises a data acquisition module 1, a word segmentation module 2, a similarity calculation module 3, a frequency statistics module 4, a quality spot-check item selection module 5 and a spot-check item library 6; the spot check item library 6 stores a plurality of items to be spot checked and a subject term corresponding to each item to be spot checked;
the data acquisition module 1 is used for acquiring first evaluation data of an article; in this embodiment, the review and refund content from the past 3 months for the article to be spot-checked is selected as its evaluation data, so that the content better reflects the recent quality problems of the article;
the word segmentation module 2 is used for performing word segmentation processing on the first evaluation data to obtain a plurality of evaluation word segments;
the similarity calculation module 3 is used for calculating the similarity between each subject term and the evaluation participle and calling the frequency statistics module 4;
the frequency statistics module 4 is used for counting the frequency with which the similarity between each subject term and the evaluation participle is greater than a similarity threshold;
the quality spot check item selecting module 5 is configured to select N to-be-spot check items corresponding to the N most frequent subject terms as quality spot check items, where N is a positive integer.
When determining the spot check items, this embodiment may sort all items to be spot-checked in descending order of frequency and take the top 5, and may also set a lower limit on the frequency, for example requiring a frequency greater than 10; the items that satisfy these conditions are used as the final quality spot check items. The threshold values can be adjusted flexibly according to the actual application.
In this embodiment, the quality spot check items are selected automatically from the users' evaluation data, which removes the dependence on professionals and makes the spot check automatic and scientific; because the spot check items are derived from user evaluation data, they are more representative and more reliable.
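As a minimal sketch of the selection step described for this embodiment, the following Python function ranks the items by frequency and applies both limits. The function name, the parameter names and the sample counts are hypothetical and chosen only for illustration; the values 5 and 10 are the example values mentioned in the text.

```python
from typing import Dict, List


def select_spot_check_items(freq_by_item: Dict[str, int],
                            top_n: int = 5,
                            min_freq: int = 10) -> List[str]:
    """Return up to top_n items, highest frequency first, whose frequency exceeds min_freq."""
    # Keep only the items whose frequency is strictly greater than the lower limit.
    eligible = [(item, freq) for item, freq in freq_by_item.items() if freq > min_freq]
    # Sort by descending frequency; ties are broken alphabetically for a stable result.
    eligible.sort(key=lambda pair: (-pair[1], pair[0]))
    return [item for item, _ in eligible[:top_n]]


# Hypothetical frequency counts, one per item to be spot-checked.
counts = {
    "colour fastness": 42,
    "zipper strength": 17,
    "odour": 10,
    "seam strength": 3,
    "pilling": 25,
}
selected = select_spot_check_items(counts)
```

Applying the frequency limit before or after the ranking yields the same items, so the sketch filters first to keep the code short.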
Example 7
The system for determining the article quality spot check item of this embodiment is a further improvement on embodiment 6. As shown in fig. 10, the system further includes a judging module 7 and an evaluation data filtering module 8;
the judging module 7 is configured to judge whether the first evaluation data includes a negative evaluation on the quality of the article, and if so, invoke the evaluation data filtering module 8;
the evaluation data filtering module 8 is used for filtering out the first evaluation data which do not contain a negative evaluation on the quality of the article;
and the word segmentation module 2 is used for carrying out word segmentation processing on the filtered first evaluation data.
It should be noted that the purpose of quality spot checks is generally to supervise and manage articles effectively, with particular attention to problematic articles. In this embodiment, the evaluation data is therefore filtered in advance so that the spot check focuses on articles with quality problems. Specifically, as shown in fig. 11, the judging module 7 includes an evaluation data acquisition unit 71, a label assigning unit 72, a text information base creating unit 73, and an article evaluation data evaluation model training unit 74;
the evaluation data acquisition unit 71 is configured to acquire second evaluation data of the target item within a preset time;
the label assigning unit 72 is configured to assign a target feature label to the second evaluation data, where the target feature label is used to characterize whether the second evaluation data indicates that the target article has a quality problem; to assign the target feature labels, part of the existing evaluation data can be selected and manually labeled as to whether it concerns a quality problem, for example, data concerning a quality problem of the article is marked as 1, and otherwise as 0;
the text information base creating unit 73 is configured to create a text information base for evaluating the quality of the article according to the second evaluation data;
the article evaluation data evaluation model training unit 74 is configured to train according to the text information base and the target feature label to obtain an article evaluation data evaluation model;
the judging module 7 is configured to judge whether the first evaluation data includes a negative evaluation on the quality of the article by using the article evaluation data evaluation model.
Referring to fig. 11, the judging module 7 further includes a word segmentation unit 75 and a word vector library 76; the word vector library 76 stores a plurality of standard participles and word vectors corresponding to each standard participle;
the word segmentation unit 75 is configured to perform word segmentation processing on the second evaluation data to obtain a plurality of words;
the text information base creating unit 73 is configured to obtain word segmentation vectors corresponding to the multiple word segmentations from the word vector base 76; the text information base comprises the word segmentation vectors;
the article evaluation data evaluation model training unit 74 is configured to input the word segmentation vector and the target feature label as training samples into a machine learning model, and train to obtain the article evaluation data evaluation model.
It should be noted that a word vector represents a word as a multidimensional array; with word vectors, similar words are closer to each other when the cosine distance is calculated, and mature open-source implementations exist for generating word vectors. In this embodiment, the word vector library 76 may be generated by collecting item category information, item description information, item feedback information, and the like, performing word segmentation on all feedback text under each category, learning the word vector representation of each word, and finally generating the word vector library 76 corresponding to each item category.
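The per-category construction of a word vector library can be sketched as follows. This is only an illustration under stated assumptions: the text does not say how the vectors are learned, so simple co-occurrence counts within a review stand in here for a learned embedding (an open-source word-embedding tool would normally take their place), and the function name and sample data are hypothetical.

```python
from typing import Dict, List

WordVectors = Dict[str, List[float]]


def build_library(feedback_by_category: Dict[str, List[List[str]]]) -> Dict[str, WordVectors]:
    """Build one word vector table per item category from already segmented feedback."""
    library: Dict[str, WordVectors] = {}
    for category, reviews in feedback_by_category.items():
        # One vector dimension per distinct word seen in this category.
        vocab = sorted({word for review in reviews for word in review})
        index = {word: i for i, word in enumerate(vocab)}
        vectors = {word: [0.0] * len(vocab) for word in vocab}
        for review in reviews:
            for word in review:
                for other in review:
                    if other != word:
                        # Count how often `other` appears in the same review as `word`.
                        vectors[word][index[other]] += 1.0
        library[category] = vectors
    return library


# Hypothetical segmented feedback for a single category.
feedback = {"shoes": [["sole", "cracked"], ["sole", "split"], ["lace", "frayed"]]}
library = build_library(feedback)
```

In this toy data "cracked" and "split" receive the same vector because both appear only alongside "sole", which is the kind of closeness between similar words that the paragraph above relies on.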
In this embodiment, referring to fig. 11, the determining module 7 further includes a word vector obtaining unit 77 and a label output unit 78;
the word vector acquiring unit 77 is configured to acquire evaluation word segmentation vectors corresponding to the plurality of evaluation word segmentations from the word vector library 76;
the label output unit 78 is configured to input the evaluation word segmentation vector into the article evaluation data evaluation model, and output a feature label of the first evaluation data; the characteristic label is used for characterizing whether the first evaluation data comprises negative evaluation on the quality of the article;
the judging module 7 is configured to judge whether the first evaluation data includes a negative evaluation on the quality of the article according to the feature label.
In this embodiment, after the second evaluation data is segmented, the corresponding word segmentation vectors are obtained from the word vector library 76; the word segmentation vectors and the target feature labels are then used as the training corpus to train the article evaluation data evaluation model, and the first evaluation data is then judged based on that model.
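The flow of this embodiment (look up vectors, train on labeled second evaluation data, then judge and filter the first evaluation data) can be sketched as below. The text only says "a machine learning model", so a nearest-centroid classifier over averaged word vectors is assumed here purely for illustration; the toy word vectors, reviews and all function names are hypothetical.

```python
from typing import Dict, List, Sequence

Vector = List[float]


def mean_vector(vectors: Sequence[Vector]) -> Vector:
    """Component-wise mean of a non-empty list of equal-length vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def review_vector(words: Sequence[str], library: Dict[str, Vector], dim: int) -> Vector:
    """Represent a segmented review as the mean of the vectors of its known words."""
    known = [library[w] for w in words if w in library]
    return mean_vector(known) if known else [0.0] * dim


def train(samples: Sequence[Vector], labels: Sequence[int]) -> Dict[int, Vector]:
    """Nearest-centroid 'training': one mean vector per label (1 = quality problem, 0 = not)."""
    by_label: Dict[int, List[Vector]] = {}
    for vec, label in zip(samples, labels):
        by_label.setdefault(label, []).append(vec)
    return {label: mean_vector(vecs) for label, vecs in by_label.items()}


def predict(model: Dict[int, Vector], vec: Vector) -> int:
    """Return the label whose centroid is closest in squared Euclidean distance."""
    def sq_dist(a: Vector, b: Vector) -> float:
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(model, key=lambda label: sq_dist(model[label], vec))


# Hypothetical two-dimensional word vector library.
library = {"broken": [1.0, 0.0], "leaks": [0.9, 0.1],
           "fast": [0.0, 1.0], "delivery": [0.1, 0.9]}

# Second evaluation data with manually assigned target feature labels.
second_data = [["broken"], ["leaks"], ["fast", "delivery"]]
target_labels = [1, 1, 0]
model = train([review_vector(r, library, 2) for r in second_data], target_labels)

# Judge the first evaluation data and keep only the negative quality evaluations.
first_data = [["leaks", "broken"], ["delivery"]]
feature_labels = [predict(model, review_vector(r, library, 2)) for r in first_data]
kept = [r for r, label in zip(first_data, feature_labels) if label == 1]
```

Only the reviews in `kept` would then be passed on to word segmentation and similarity counting; any other classifier that accepts vectors and 0/1 labels could replace the centroid model.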
Example 8
The system for determining the article quality spot check item in this embodiment is a further improvement on embodiment 6. Because part of the user evaluation data contains many non-standard phrases, punctuation marks, invalid content, and the like, stop words and symbols are removed after the evaluation data is segmented in order to improve the accuracy of the model. As shown in fig. 12, the judging module 7 further includes a stop word filtering unit 79;
the stop word filtering unit 79 is configured to filter stop words in the multiple segmented words;
the word vector obtaining unit 77 is configured to obtain corresponding word segmentation vectors for the filtered multiple word segmentations.
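A stop word filtering step of this kind might look like the following sketch. The stop word list and the function name are hypothetical; a real deployment would load a list suited to the language of the reviews. Punctuation and symbols are detected through Unicode character categories so that both ASCII and full-width marks are dropped.

```python
import unicodedata
from typing import Iterable, List, Set

# Hypothetical stop word list, for illustration only.
STOP_WORDS: Set[str] = {"the", "a", "is", "it", "and", "very"}


def is_symbol_only(token: str) -> bool:
    """True if every character is Unicode punctuation (P*) or a symbol (S*)."""
    return all(unicodedata.category(ch)[0] in ("P", "S") for ch in token)


def filter_stop_words(segments: Iterable[str],
                      stop_words: Set[str] = STOP_WORDS) -> List[str]:
    """Drop empty segments, stop words, and segments made only of punctuation or symbols."""
    kept: List[str] = []
    for segment in segments:
        token = segment.strip()
        if not token:
            continue
        if token.lower() in stop_words:
            continue
        if is_symbol_only(token):
            continue
        kept.append(token)
    return kept


cleaned = filter_stop_words(["The", "zipper", "is", "broken", "!!!", "。。", " "])
```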
In addition, the vector representations of the subject terms and the evaluation participles can be looked up in the word vector library 76, so that the closeness of two words can be obtained by calculating the cosine similarity of their vectors. Specifically:
the word vector acquiring unit 77 is further configured to acquire a subject word vector corresponding to the subject word from the word vector library 76;
the similarity calculation module 3 is used for calculating the cosine similarity of the subject word vector and the evaluation word segmentation vector based on a cosine similarity algorithm as the similarity.
In a specific implementation of this embodiment, the cosine similarity between each word in the evaluation data and each subject term may be calculated, the subject terms being taken from the specific quality item requirements defined in national and industry quality standards. If the cosine similarity between a word and a subject term is greater than a certain threshold (for example, a threshold of 0.8), the problem reported in that record is very likely related to the subject term, so the count for that subject term is increased by 1; in this way the count corresponding to each quality subject term is finally obtained.
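The counting procedure just described can be sketched in a few lines of Python. The toy word vectors, subject terms and function names are hypothetical; the threshold of 0.8 is the example value given in the text.

```python
import math
from typing import Dict, List, Sequence

Vector = List[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def count_subject_hits(subject_terms: Sequence[str],
                       evaluation_words: Sequence[str],
                       library: Dict[str, Vector],
                       threshold: float = 0.8) -> Dict[str, int]:
    """For each subject term, count the evaluation words whose similarity exceeds the threshold."""
    counts = {term: 0 for term in subject_terms}
    for term in subject_terms:
        term_vec = library.get(term)
        if term_vec is None:
            continue  # subject term missing from the word vector library
        for word in evaluation_words:
            word_vec = library.get(word)
            if word_vec is not None and cosine_similarity(term_vec, word_vec) > threshold:
                counts[term] += 1
    return counts


# Hypothetical three-dimensional word vector library.
library = {
    "fading": [1.0, 0.0, 0.0], "faded": [0.9, 0.1, 0.0], "discoloured": [0.8, 0.2, 0.1],
    "zipper": [0.0, 1.0, 0.0], "stuck": [0.1, 0.9, 0.2], "late": [0.0, 0.0, 1.0],
}
counts = count_subject_hits(["fading", "zipper"],
                            ["faded", "discoloured", "stuck", "late"], library)
```

The resulting per-term counts are the frequencies from which the top N spot check items are then chosen.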
While specific embodiments of the invention have been described above, it will be appreciated by those skilled in the art that this is by way of example only, and that the scope of the invention is defined by the appended claims. Various changes and modifications to these embodiments may be made by those skilled in the art without departing from the spirit and scope of the invention, and these changes and modifications are within the scope of the invention.

Claims (16)

1. An item quality spot check item determination method, characterized by comprising:
acquiring first evaluation data of an article;
performing word segmentation processing on the first evaluation data to obtain a plurality of evaluation word segments;
presetting a sampling inspection project library; the spot check item library is stored with a plurality of items to be spot checked and a subject term corresponding to each item to be spot checked;
calculating the similarity between each subject term and the evaluation participle, and counting the frequency of the similarity greater than a similarity threshold value;
and selecting N to-be-spot-checked items corresponding to the N subject terms with the highest frequency as quality spot-check items, wherein N is a positive integer.
2. The method for determining a quality spot check item according to claim 1, wherein after the step of obtaining the first evaluation data of an item, the method for determining a quality spot check item further comprises:
judging whether the first evaluation data contain negative evaluation on the quality of the article, if so, filtering out the first evaluation data which do not contain negative evaluation on the quality of the article;
and in the step of performing word segmentation processing on the first evaluation data, performing word segmentation processing on the filtered first evaluation data.
3. The method for determining a quality spot check item according to claim 2, wherein the step of determining whether the first evaluation data includes a negative evaluation of the quality of the item specifically comprises:
acquiring second evaluation data of the target object within a preset time;
endowing the second evaluation data with a target characteristic label, wherein the target characteristic label is used for representing whether the second evaluation data shows that the target object has a quality problem;
creating a text information base for judging the quality of the article according to the second evaluation data;
training according to the text information base and the target feature label to obtain an article evaluation data evaluation model;
and judging whether the first evaluation data comprises negative evaluation on the quality of the article by using the article evaluation data evaluation model.
4. The method for determining a quality spot check item according to claim 3, wherein the step of creating a text information base for evaluating the quality of an item according to the evaluation data specifically comprises:
presetting a word vector library; the word vector library stores a plurality of standard participles and word vectors corresponding to each standard participle;
performing word segmentation processing on the second evaluation data to obtain a plurality of words;
acquiring word segmentation vectors corresponding to the plurality of word segmentations from the word vector library; the text information base comprises the word segmentation vectors;
the step of training according to the text information base to obtain the article evaluation data evaluation model specifically comprises the following steps:
and inputting the word segmentation vectors and the target feature labels into a machine learning model as training samples, and training to obtain the article evaluation data evaluation model.
5. The method for determining a quality spot check item according to claim 4, wherein the step of determining whether the first evaluation data includes a negative evaluation on the quality of the item by using the item evaluation data evaluation model specifically comprises:
obtaining evaluation word segmentation vectors corresponding to the plurality of evaluation word segmentation from the word vector library;
inputting the evaluation word segmentation vector into the article evaluation data evaluation model, and outputting a feature tag of the first evaluation data; the characteristic label is used for characterizing whether the first evaluation data comprises negative evaluation on the quality of the article;
and judging whether the first evaluation data comprises negative evaluation on the quality of the article according to the feature label.
6. The method of claim 4, wherein prior to the step of querying the word vector library for the word vectors corresponding to the plurality of words, the method further comprises:
filtering stop words in the plurality of participles;
in the step of inquiring the participle vectors corresponding to the participles from the word vector library, the corresponding participle vectors are obtained for the filtered participles.
7. The method for determining quality spot check items according to claim 5, wherein the step of calculating the similarity between each subject term and the evaluation participle specifically comprises:
obtaining a subject term vector corresponding to the subject term from the term vector library;
and calculating the cosine similarity of the subject word vector and the evaluation word vector based on a cosine similarity algorithm to serve as the similarity.
8. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the quality spot check item determination method of any one of claims 1 to 7 when executing the computer program.
9. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the quality spot check item determination method according to any one of claims 1 to 7.
10. An article quality spot check item determination system is characterized by comprising a data acquisition module, a word segmentation module, a similarity calculation module, a frequency statistics module, a quality spot check item selection module and a spot check item library; the spot check item library is stored with a plurality of items to be spot checked and a subject term corresponding to each item to be spot checked;
the data acquisition module is used for acquiring first evaluation data of an article;
the word segmentation module is used for carrying out word segmentation processing on the first evaluation data to obtain a plurality of evaluation words;
the similarity calculation module is used for calculating the similarity between each subject term and the evaluation participle and calling the frequency statistics module;
the frequency counting module is used for counting the frequency that the similarity between each subject term and the evaluation participle is greater than a similarity threshold value;
the quality spot check item selection module is used for selecting N to-be-spot check items corresponding to the N subject terms with the highest frequency as quality spot check items, wherein N is a positive integer.
11. The article quality spot check item determination system of claim 10, wherein the article quality spot check item determination system further comprises a judging module and an evaluation data filtering module;
the judging module is used for judging whether the first evaluation data contains negative evaluation on the quality of the article, and if so, the evaluation data filtering module is called;
the evaluation data filtering module is used for filtering first evaluation data which do not contain negative evaluation on the quality of the article;
and the word segmentation module is used for carrying out word segmentation processing on the filtered first evaluation data.
12. The item quality spot check item determination system of claim 11, wherein the judgment module comprises an evaluation data acquisition unit, a label assignment unit, a text information base creation unit, and an item evaluation data evaluation model training unit;
the evaluation data acquisition unit is used for acquiring second evaluation data of the target object within a preset time;
the label endowing unit is used for endowing the second evaluation data with a target characteristic label, and the target characteristic label is used for representing whether the second evaluation data shows that the target article has a quality problem;
the text information base creating unit is used for creating a text information base for judging the quality of the article according to the second evaluation data;
the article evaluation data evaluation model training unit is used for training according to the text information base and the target characteristic label to obtain an article evaluation data evaluation model;
the judging module is used for judging whether the first evaluation data comprises negative evaluation on the quality of the article by utilizing the article evaluation data evaluation model.
13. The article quality spot check item determination system of claim 12, wherein the judging module further comprises a word segmentation unit and a word vector library; the word vector library stores a plurality of standard participles and word vectors corresponding to each standard participle;
the word segmentation unit is used for carrying out word segmentation processing on the second evaluation data to obtain a plurality of words;
the text information base creating unit is used for acquiring word segmentation vectors corresponding to the multiple word segmentations from the word vector base; the text information base comprises the word segmentation vectors;
and the article evaluation data evaluation model training unit is used for inputting the word segmentation vectors and the target feature labels into a machine learning model as training samples, and training to obtain the article evaluation data evaluation model.
14. The item quality spot check item determination system of claim 13, wherein the judging module further comprises a word vector acquisition unit and a label output unit;
the word vector acquiring unit is used for acquiring evaluation word segmentation vectors corresponding to the evaluation word segmentations from the word vector library;
the label output unit is used for inputting the evaluation word segmentation vectors into the article evaluation data evaluation model and outputting the feature labels of the first evaluation data; the characteristic label is used for characterizing whether the first evaluation data comprises negative evaluation on the quality of the article;
the judging module is used for judging whether the first evaluation data comprises negative evaluation on the quality of the article according to the feature label.
15. The article quality spot check item determination system of claim 13, wherein the judging module further comprises a stop word filtering unit;
the stop word filtering unit is used for filtering stop words in the multiple participles;
the word vector obtaining unit is used for obtaining corresponding word segmentation vectors for the filtered multiple word segmentations.
16. The article quality spot check item determination system of claim 14, wherein the word vector obtaining unit is further configured to obtain a subject word vector corresponding to the subject word from the word vector library;
the similarity calculation module is used for calculating the cosine similarity of the subject word vector and the evaluation word segmentation vector based on a cosine similarity algorithm as the similarity.
CN201810866301.0A 2018-08-01 2018-08-01 Quality spot check item determination method, system, electronic device and readable storage medium Pending CN110807082A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810866301.0A CN110807082A (en) 2018-08-01 2018-08-01 Quality spot check item determination method, system, electronic device and readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810866301.0A CN110807082A (en) 2018-08-01 2018-08-01 Quality spot check item determination method, system, electronic device and readable storage medium

Publications (1)

Publication Number Publication Date
CN110807082A true CN110807082A (en) 2020-02-18

Family

ID=69486760

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810866301.0A Pending CN110807082A (en) 2018-08-01 2018-08-01 Quality spot check item determination method, system, electronic device and readable storage medium

Country Status (1)

Country Link
CN (1) CN110807082A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114254951A (en) * 2021-12-27 2022-03-29 南方电网物资有限公司 Power grid equipment arrival sampling inspection method based on digitization technology

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101334787A (en) * 2008-07-22 2008-12-31 深圳钱袋商务有限公司 Objects evaluation information enquiry system and method
CN101408966A (en) * 2008-11-20 2009-04-15 汤溪蔚 Method and system for evaluation or questionnaire inquisition of brands through network
CN103123633A (en) * 2011-11-21 2013-05-29 阿里巴巴集团控股有限公司 Generation method of evaluation parameters and information searching method based on evaluation parameters
WO2015198436A1 (en) * 2014-06-26 2015-12-30 楽天株式会社 Information processing device, information processing method, and program
CN105590176A (en) * 2016-03-07 2016-05-18 杭州国家电子商务产品质量监测处置中心 Product quality risk monitoring sampling method based on Internet
US20170139918A1 (en) * 2015-11-13 2017-05-18 Salesforce.Com, Inc. Managing importance ratings related to event records in a database system
CN107203841A (en) * 2017-05-08 2017-09-26 中车青岛四方机车车辆股份有限公司 The method of inspection and device of product quality

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination