CN116010609A - Material data classifying method and device, electronic equipment and storage medium - Google Patents

Material data classifying method and device, electronic equipment and storage medium

Info

Publication number
CN116010609A
Authority
CN
China
Prior art keywords
material data
generator
value
keywords
countermeasure network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202310286304.8A
Other languages
Chinese (zh)
Other versions
CN116010609B (en)
Inventor
段效亮
马乐
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shandong Zhonghan Software Co ltd
Original Assignee
Shandong Zhonghan Software Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shandong Zhonghan Software Co ltd filed Critical Shandong Zhonghan Software Co ltd
Priority to CN202310286304.8A priority Critical patent/CN116010609B/en
Publication of CN116010609A publication Critical patent/CN116010609A/en
Application granted granted Critical
Publication of CN116010609B publication Critical patent/CN116010609B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P 90/00: Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P 90/30: Computing systems specially adapted for manufacturing

Landscapes

  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present application relates to the technical field of data processing, and in particular provides a material data classification method and apparatus, an electronic device, and a storage medium, aiming to improve the efficiency of classifying material data. The material data classification method comprises the following steps: receiving a keyword input by a user; generating a plurality of near-synonyms for the keyword based on the generator of a pre-trained generative adversarial network, wherein the generative adversarial network comprises the generator and a discriminator and is trained according to discrimination results and a material data hit rate, the discrimination results being the discriminator's judgments of the sequences produced by the generator during training, and the material data hit rate being the query hit rate of target material data when the sequences produced by the generator are used for querying during training; and receiving the user's selection of the near-synonyms, querying material data with the keyword and each selected near-synonym, and classifying the queried material data into one category.

Description

Material data classifying method and device, electronic equipment and storage medium
Technical Field
The present invention relates to the field of data processing, and in particular, to a method and apparatus for classifying material data, an electronic device, and a storage medium.
Background
In pursuit of high-quality and high-efficiency development, enterprises generally adopt fine-grained management to sort and manage the massive data generated during their operation. For a large manufacturing enterprise, the data to be managed at the production end includes material data, product data, personnel data, and so on. Such an enterprise usually selects several suppliers for the same material, and these suppliers deliver that material simultaneously; however, different suppliers name or describe the material differently (their names and textual descriptions are hereinafter collectively referred to as material data), so the material data collected by the enterprise from its suppliers is difficult to classify.
In the prior art, material data is commonly classified by manually expanding keywords with near-synonyms, querying the material data with the keywords and the manually expanded near-synonyms, and then grouping the query results into one category. However, manual expansion of near-synonyms requires considerable manpower, and although the manually expanded near-synonyms are highly similar to the keywords, their relevance to the material database is low, so only a small portion of the material data of the same class can be retrieved, which limits the classification efficiency of the material data.
Disclosure of Invention
The embodiments of the present application provide a material data classification method and apparatus, an electronic device, and a storage medium, aiming to improve the classification efficiency of material data.
According to a first aspect of embodiments of the present application, there is provided a method for classifying material data, the method comprising:
receiving a keyword input by a user for querying material data;
generating a plurality of near-synonyms for the keyword based on the generator of a pre-trained generative adversarial network; the generative adversarial network comprises the generator and a discriminator and is trained according to discrimination results and a material data hit rate, wherein the discrimination results are the discriminator's judgments of the sequences produced by the generator during training, and the material data hit rate is the query hit rate of target material data when the sequences produced by the generator are used for querying during training;
and receiving the user's selection of the near-synonyms, querying material data with the keyword and each selected near-synonym respectively, and classifying the queried material data into one category.
Optionally, the method further comprises:
training the generator of the generative adversarial network according to the discrimination results and the material data hit rate, and training the discriminator of the generative adversarial network according to the discrimination results.
Optionally, the step of training the generator of the generative adversarial network according to the discrimination results and the material data hit rate, and training the discriminator of the generative adversarial network according to the discrimination results, comprises:
repeating the following steps S1 to S3 N times: S1, inputting random noise and a sample keyword into the generator to obtain a sequence generated by the generator; S2, completing the sequence based on a Monte Carlo search method to obtain a complete sequence, discriminating the complete sequence with the discriminator to obtain a discrimination result, and generating a reward value according to the discrimination result; S3, querying material data with the sequence to obtain a material query result;
calculating the average of the N reward values as a first reward value;
counting the query hit rate of the N material query results on target material data, and calculating a second reward value according to the query hit rate; the target material data is the material data matched with the sample keyword;
and updating the generator according to the first reward value and the second reward value, and updating the discriminator according to the first reward value.
Optionally, each of the N times step S1 is performed, the random noise input to the generator is different, while the sample keyword input to the generator is the same.
Optionally, the step of updating the generator according to the first reward value and the second reward value comprises:
performing a weighted summation of the first reward value and the second reward value according to their respective weights;
updating the generator according to the result of the weighted summation;
wherein, during training, the weight of the first reward value and the weight of the second reward value are adjusted according to how strongly the first reward value is changing: the more gradual the change of the first reward value, the smaller the weight of the first reward value and the larger the weight of the second reward value.
Optionally, the weight of the first reward value gradually decreases as the number of training rounds increases, and the weight of the second reward value gradually increases as the number of training rounds increases; the weight of the first reward value has a preset lower limit, the weight of the second reward value has a preset upper limit, and the sum of the preset lower limit and the preset upper limit equals 1.
Optionally, the method further comprises:
displaying the remaining material data that has not been queried;
and receiving a manual classification operation performed by the user on the remaining material data, and classifying the remaining material data into the corresponding categories according to the manual classification operation.
According to a second aspect of embodiments of the present application, there is provided a material data classifying device, the device comprising:
a keyword receiving module, configured to receive a keyword input by a user for querying material data;
a near-synonym generating module, configured to generate a plurality of near-synonyms for the keyword based on the generator of a pre-trained generative adversarial network; the generative adversarial network comprises the generator and a discriminator and is trained according to discrimination results and a material data hit rate, wherein the discrimination results are the discriminator's judgments of the sequences produced by the generator during training, and the material data hit rate is the query hit rate of target material data when the sequences produced by the generator are used for querying during training;
and a material data query module, configured to receive the user's selection of the near-synonyms, query material data with the keyword and each selected near-synonym respectively, and classify the queried material data into one category.
According to a third aspect of embodiments of the present application, there is provided an electronic device comprising a processor, a memory, and a computer program stored on the memory and executable on the processor, the computer program, when executed by the processor, implementing the material data classification method according to any one of the above.
According to a fourth aspect of embodiments of the present application, there is provided a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the material data classification method according to any one of the above.
With the material data classification method provided by the embodiments of the present application, the user only needs to input a keyword for querying material data, and the generator of the pre-trained generative adversarial network generates a plurality of near-synonyms for that keyword; the user then selects some or all of these near-synonyms for the query operation, and the corresponding material data can be queried and classified. This effectively saves the manpower required by the material data classification task and improves the classification efficiency of the material data. Moreover, because the generative adversarial network is trained according to the discrimination results and the material data hit rate, where the discrimination results are the discriminator's judgments of the sequences produced by the generator during training and the material data hit rate is the query hit rate of target material data when those sequences are used for querying during training, the near-synonyms produced by the generator are not only highly similar to the keyword but also highly relevant to the material database. When these near-synonyms are used for querying, material data of the same class has a higher query hit rate, further improving the classification efficiency. In addition, since the user must select near-synonyms from those produced by the generator to query the material data, user behavior is constrained and the standardization of data management is improved.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this application, illustrate embodiments of the application and together with the description serve to explain the application and do not constitute an undue limitation to the application. In the drawings:
FIG. 1 is a flow chart of a method for classifying material data according to an embodiment of the present disclosure;
fig. 2 is a schematic structural diagram of SeqGAN according to an embodiment of the present disclosure;
fig. 3 is a schematic structural diagram of the material data classification device provided by the present application.
In the figures: keyword receiving module 310, near-synonym generating module 320, and material data query module 330.
Detailed Description
Before describing embodiments of the present disclosure, it should be noted that some embodiments of the disclosure are described as process flows; although the operational steps of a flow may be numbered sequentially, they may also be performed in parallel, concurrently, or simultaneously.
The terms "first," "second," and the like may be used in embodiments of the present disclosure to describe various features, but these features should not be limited by these terms. These terms are only used to distinguish one feature from another.
The term "and/or," "and/or" may be used in embodiments of the present disclosure to include any and all combinations of one or more of the associated features listed.
It will be understood that when two elements are described in a connected or communicating relationship, unless a direct connection or direct communication between the two elements is explicitly stated, connection or communication between the two elements may be understood as direct connection or communication, as well as indirect connection or communication via intermediate elements.
In order to make the technical solutions and advantages of the embodiments of the present disclosure clearer, exemplary embodiments of the present disclosure are described in detail below in conjunction with the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present disclosure. It should be noted that, where there is no conflict, the embodiments of the present disclosure and the features therein may be combined with each other.
Currently, in pursuit of high-quality and high-efficiency development, enterprises generally adopt fine-grained management to sort and manage the massive data generated during their operation. For a large manufacturing enterprise, the data to be managed at the production end includes material data, product data, personnel data, and so on. Such an enterprise usually selects several suppliers for the same material, and these suppliers deliver that material simultaneously; however, different suppliers name or describe the material differently (their names and textual descriptions are hereinafter collectively referred to as material data), so the material data collected by the enterprise from its suppliers is difficult to classify.
For ease of understanding, consider the example of a single material, a hexagon bolt: some suppliers name it "hex bolt", some name it "hex head bolt", and some name it "hex full thread bolt". The enterprise enters these material data into its material database when the materials are received, so the same material ends up with several names in the material database, which is unfavorable for the classification and management of material data.
In the prior art, material data is commonly classified by manually expanding keywords with near-synonyms, querying the material data with the keywords and the manually expanded near-synonyms, and then grouping the query results into one category. However, manual expansion of near-synonyms requires considerable manpower, and although the manually expanded near-synonyms are highly similar to the keywords, their relevance to the material database is low, so only a small portion of the material data of the same class can be retrieved, which limits the classification efficiency of the material data.
In view of this, the embodiments of the present application provide a material data classification method and apparatus, an electronic device, and a storage medium, aiming to improve the classification efficiency of material data.
Referring to fig. 1, fig. 1 is a flow chart illustrating a method for classifying material data according to an embodiment of the present application. As shown in fig. 1, the material data classifying method includes the following steps:
s110: and receiving keywords input by a user and used for inquiring material data.
In the present application, when a user needs to classify material data, the user may input a keyword related to the category to which the material data belongs. For ease of understanding, for example, when the user needs to query and classify the material data related to "hex head bolt" in the material database, the user may enter the keyword "hex bolt" or the keyword "hex head bolt".
S120: generating a plurality of near-synonyms for the keyword based on the generator of a pre-trained generative adversarial network; the generative adversarial network comprises the generator and a discriminator and is trained according to discrimination results and a material data hit rate, wherein the discrimination results are the discriminator's judgments of the sequences produced by the generator during training, and the material data hit rate is the query hit rate of target material data when the sequences produced by the generator are used for querying during training.
In the present application, each time the generator is called, the keyword and one sample of random noise are input into the generator, and the generator produces one near-synonym of the keyword; calling the generator several times therefore yields several near-synonyms. To make the near-synonym generated each time different, the random noise input on each call is different, but the keyword input on each call is the same, namely the keyword received in step S110. The random noise may be white noise. In some embodiments, if the keyword input by the user is long, an n-gram language model may be used to segment the keyword; each segment is then expressed as a one-hot vector, the one-hot vectors are spliced into a longer binary vector, the spliced binary vector is concatenated with (or added to) binary random noise, and the concatenation (or addition) result is input into the generator, which generates a near-synonym of the keyword.
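For ease of understanding, the input encoding described above can be sketched as follows. This sketch is illustrative only and is not part of the claimed implementation: the toy vocabulary, the whitespace segmentation (standing in for an n-gram language model), the vector sizes, and the helper names are assumptions introduced here.

import numpy as np

# Minimal, illustrative sketch (not the patent's implementation) of the generator
# input: the keyword is segmented, each segment becomes a one-hot vector, the
# one-hot vectors are spliced into one binary vector, and binary random noise is
# concatenated to it before it is fed to the generator.

VOCAB = {"hex": 0, "head": 1, "full": 2, "thread": 3, "bolt": 4, "<unk>": 5}

def segment(keyword: str) -> list:
    # Stand-in for n-gram language-model segmentation.
    return keyword.lower().split()

def one_hot(index: int, size: int) -> np.ndarray:
    vec = np.zeros(size, dtype=np.float32)
    vec[index] = 1.0
    return vec

def build_generator_input(keyword: str, noise_dim: int = 16) -> np.ndarray:
    segments = segment(keyword)
    encoded = np.concatenate(
        [one_hot(VOCAB.get(s, VOCAB["<unk>"]), len(VOCAB)) for s in segments]
    )
    noise = np.random.randint(0, 2, size=noise_dim).astype(np.float32)  # binary white noise
    return np.concatenate([encoded, noise])  # element-wise addition is the alternative mentioned above

print(build_generator_input("hex full thread bolt").shape)  # (4 * 6 + 16,) = (40,)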
S130: receiving the user's selection of the near-synonyms, querying material data with the keyword and each selected near-synonym respectively, and classifying the queried material data into one category.
In the present application, the generated near-synonyms may be displayed to the user, and the user may select some or all of them through an input device (e.g., a mouse or a touch screen).
With this method, the user only needs to input a keyword for querying material data, and the generator of the pre-trained generative adversarial network generates a plurality of near-synonyms for that keyword; the user then selects some or all of these near-synonyms for the query operation, and the corresponding material data can be queried and classified. This effectively saves the manpower required by the material data classification task and improves the classification efficiency of the material data. Moreover, because the generative adversarial network is trained according to the discrimination results and the material data hit rate, where the discrimination results are the discriminator's judgments of the sequences produced by the generator during training and the material data hit rate is the query hit rate of target material data when those sequences are used for querying during training, the near-synonyms produced by the generator are not only highly similar to the keyword but also highly relevant to the material database. When these near-synonyms are used for querying, material data of the same class has a higher query hit rate, further improving the classification efficiency. In addition, since the user must select near-synonyms from those produced by the generator to query the material data, user behavior is constrained and the standardization of data management is improved.
In some embodiments of the present application, the material data classification method further comprises: training the generator of the generative adversarial network according to the discrimination results and the material data hit rate, and training the discriminator of the generative adversarial network according to the discrimination results.
In particular, the sequence generative adversarial network SeqGAN may be selected as the generative adversarial network; the reinforcement learning algorithm built into SeqGAN addresses the difficulty of training the generator on text tasks (i.e., discrete-data tasks).
Referring to fig. 2, fig. 2 is a schematic structural diagram of SeqGAN according to an embodiment of the present application. The training of the generator and the discriminator can be divided into an initial training phase and a formal training phase. In the initial training phase, the network parameters of the generator and the discriminator are randomly initialized; the generator is then pre-trained by maximum likelihood estimation to improve its search efficiency, after which the pre-trained generator produces some sequences and the discriminator is pre-trained by minimizing the cross-entropy loss. In the formal training phase, the generator of the generative adversarial network is trained according to the discrimination results and the material data hit rate, and the discriminator is trained according to the discrimination results. Specifically, in the formal training phase, the following steps S1 to S3 are first repeated N times (i.e., steps S1 to S3 are executed N times in a loop): S1, inputting random noise and a sample keyword into the generator to obtain a sequence generated by the generator; S2, completing the sequence based on a Monte Carlo search method to obtain a complete sequence, discriminating the complete sequence with the discriminator to obtain a discrimination result, and generating a reward value according to the discrimination result; S3, querying material data with the sequence to obtain a material query result.
The average of the N reward values is then calculated as a first reward value; the query hit rate of the N material query results on the target material data is counted, and a second reward value is calculated according to the query hit rate, the target material data being the material data matched with the sample keyword. Finally, the generator is updated according to the first reward value and the second reward value, and the discriminator is updated according to the first reward value.
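The formal-training loop described above can be outlined as follows. This is a schematic sketch rather than the patent's code: generator, discriminator, monte_carlo_complete, query_material_db, sample_noise, and target_ids are assumed placeholders, and the reward details follow the descriptions in the paragraphs that follow.

# Schematic outline of one formal-training iteration (steps S1 to S3 repeated N
# times, then the two reward values and the updates). All callables and data
# passed in are assumed placeholders for this sketch.

def training_iteration(generator, discriminator, sample_keyword, sample_noise,
                       monte_carlo_complete, query_material_db, target_ids,
                       n_repeats):
    reward_values, query_results = [], []
    for _ in range(n_repeats):
        noise = sample_noise()                               # different noise each time
        seq = generator.generate(sample_keyword, noise)      # S1: generate a sequence
        full_seq = monte_carlo_complete(generator, seq)      # S2: complete the sequence
        reward_values.append(discriminator.score(full_seq))  # S2: reward from the discriminator
        query_results.append(query_material_db(seq))         # S3: query the material data

    first_reward = sum(reward_values) / n_repeats            # average of the N reward values
    hit = {item for result in query_results for item in result if item in target_ids}
    second_reward = len(hit) / len(target_ids)               # query hit rate of the target data

    generator.update(first_reward, second_reward)            # weighted combination, see below
    discriminator.update(first_reward)                       # first reward used as the loss
    return first_reward, second_reward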
As shown in fig. 2, the dots to the right of the sequence generated by the generator represent the part of the sequence supplemented by the Monte Carlo search method. Completing the sequence based on the Monte Carlo search method can also be understood as completing each action, an action being the generation of one element of the sequence. When the complete sequence is discriminated by the discriminator, specifically, each of the sequences making up the complete sequence is judged by the discriminator (i.e., whether it is a real sequence or not) to obtain a discrimination result for that sequence; each discrimination result is a decimal between 0 and 1 indicating the probability that the discriminator considers the sequence to be real. The average of these discrimination results is calculated and taken as the discrimination result of the complete sequence. In addition, since none of the generated sequences is a real sequence, the closer the discrimination result of the complete sequence is to 1, the closer the generator's output is to a real sequence and the more deceptive it is; the discrimination result of the complete sequence can therefore be used directly as a reward value. The average of the N reward values is then calculated as the first reward value.
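One common reading of this step, a SeqGAN-style rollout reward in which several Monte Carlo completions of a generated prefix are each scored by the discriminator and averaged, can be sketched as follows; rollout, generator, and discriminator are assumed placeholders, and the sketch is illustrative rather than a statement of the claimed implementation.

# Several Monte Carlo completions of one generated prefix are scored by the
# discriminator (probability in [0, 1] that the sequence is real) and their
# average is used directly as the reward value, as described above.

def prefix_reward(prefix, generator, discriminator, rollout, n_completions=8):
    scores = []
    for _ in range(n_completions):
        full_seq = rollout(generator, prefix)          # complete the partial sequence
        scores.append(discriminator.score(full_seq))   # discrimination result in [0, 1]
    return sum(scores) / n_completions                 # average taken as the reward value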
In addition, the number of target material data items appearing in the N material query results is counted, and this count is divided by the total number of target material data items in the material database, giving the query hit rate of the target material data. Since a higher query hit rate of the target material data indicates that the sequences generated by the generator better match the material database, the calculated query hit rate can be used directly as the second reward value.
Finally, the generator is updated according to the first reward value and the second reward value, so that the sequences it generates become closer to real sequences while also matching the material database better. The discriminator is updated according to the first reward value, so that its ability to tell real sequences from generated ones improves. When updating the discriminator according to the first reward value, the first reward value may be used as a loss value and the discriminator is updated on the basis of this loss value: the larger the first reward value, the larger the loss used to update the discriminator.
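Read this way, the discriminator step can be sketched in a PyTorch-style form. The framework is an assumption, and the sketch only covers the generated-sequence term described above; the text does not spell out how real sequences enter the discriminator loss.

import torch

# Assumption: the first reward value (the average score the discriminator gave
# to the generated sequences) is differentiable w.r.t. the discriminator and is
# treated directly as its loss, so the more the discriminator is fooled, the
# larger the update it receives.

def update_discriminator(optimizer, first_reward: torch.Tensor) -> float:
    loss = first_reward          # larger first reward means a larger loss for the discriminator
    optimizer.zero_grad()
    loss.backward()              # requires first_reward to be part of the computation graph
    optimizer.step()
    return loss.item()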
When step S1 is performed N times, the random noise input to the generator differs each time, so that a different sequence is generated each time, while the sample keyword input to the generator is the same each time, so that every generated sequence is related to the sample keyword.
In some embodiments of the present application, when the generator is updated according to the first reward value and the second reward value, a weighted summation of the first reward value and the second reward value may first be performed according to their respective weights, and the generator is then updated according to the result of the weighted summation. During training, the weight of the first reward value and the weight of the second reward value are adjusted according to how strongly the first reward value is changing: the more gradual the change of the first reward value, the smaller the weight of the first reward value and the larger the weight of the second reward value. Specifically, the weight of the first reward value gradually decreases as the number of training rounds increases, and the weight of the second reward value gradually increases as the number of training rounds increases; the weight of the first reward value has a preset lower limit (e.g., 0.5), the weight of the second reward value has a preset upper limit (e.g., 0.5), and the sum of the preset lower limit and the preset upper limit equals 1.
In the present application, the first reward value fluctuates strongly in the early stage of training the generator and only slightly in the later stage, so the weight of the first reward value gradually decreases and the weight of the second reward value gradually increases as the number of training rounds grows. Giving the first reward value a larger weight (and the second a smaller weight) early in training lets the generator quickly learn how to produce near-synonyms that are close to the keyword. Giving the first reward value a smaller weight (and the second a larger weight) later in training, i.e., once the generator can already produce reasonable near-synonyms, makes the generator focus on producing near-synonyms that better match the material database. Designing the weights of the first and second reward values in this way helps the generator converge quickly on this complex training task.
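A small sketch of such a schedule is given below; the linear decay rate and the 0.5 limits are illustrative assumptions, the description above only fixing that the two limits sum to 1.

# The weight of the first reward value decays toward a preset lower limit as
# the training round grows; the weight of the second reward value grows toward
# a preset upper limit; the two limits sum to 1.

def reward_weights(round_idx, decay=0.02, lower_limit=0.5, upper_limit=0.5):
    w_first = max(lower_limit, 1.0 - decay * round_idx)   # shrinks toward the lower limit
    w_second = min(upper_limit, 1.0 - w_first)            # grows toward the upper limit
    return w_first, w_second

def combined_generator_reward(first_reward, second_reward, round_idx):
    w_first, w_second = reward_weights(round_idx)
    return w_first * first_reward + w_second * second_reward   # weighted summation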
In addition, the material data classification method of the present application may further comprise: displaying the remaining material data that has not been queried; and receiving a manual classification operation performed by the user on the remaining material data, and classifying the remaining material data into the corresponding categories according to that manual classification operation.
In a specific implementation, after the user has classified most of the material data in the material database by means of several keywords, the small amount of remaining material data can be called up and displayed, and the user then classifies it manually so that the remaining material data is placed into the corresponding categories.
The above embodiments describe a material data classification method; based on the same inventive concept, the present disclosure further provides a material data classification device through the following embodiments. Referring to fig. 3, fig. 3 is a schematic structural diagram of the material data classification device provided by the present application. As shown in fig. 3, the material data classification device includes:
the keyword receiving module 310 is configured to receive keywords input by a user and used for querying material data.
A paraphrasing generating module 320 for generating a plurality of paraphrasing for the keywords based on a generator in a pre-trained generated countermeasure network; the generating type countermeasure network comprises a generator and a discriminator, the generating type countermeasure network is trained according to discrimination results and material data hit rates, the discrimination results are the discrimination results of the discriminator on sequences generated by the generator during training, and the material data hit rates are query hit rates of target material data when the sequences generated by the generator are used for query during training.
And the material data query module 330 is configured to receive a selection operation of the paraphrasing by the user, query material data by using the keyword and each selected paraphrasing, and classify the queried material data into a category.
Based on the same inventive concept, an embodiment of the present application further provides an electronic device comprising a processor, a memory, and a computer program stored on the memory and executable on the processor; when executed by the processor, the computer program implements any one of the material data classification methods described above.
Based on the same inventive concept, an embodiment of the present application further provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements any one of the material data classification methods described above.
It will be appreciated by those skilled in the art that embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present application have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. It is therefore intended that the following claims be interpreted as including the preferred embodiments and all such alterations and modifications as fall within the scope of the application.
It will be apparent to those skilled in the art that various modifications and variations can be made in the present application without departing from the spirit or scope of the application. Thus, if such modifications and variations of the present application fall within the scope of the claims and the equivalents thereof, the present application is intended to cover such modifications and variations.

Claims (10)

1. A method for classifying material data, the method comprising:
receiving a keyword input by a user for querying material data;
generating a plurality of near-synonyms for the keyword based on a generator of a pre-trained generative adversarial network; wherein the generative adversarial network comprises the generator and a discriminator and is trained according to discrimination results and a material data hit rate, the discrimination results are the results of the discriminator discriminating the sequences generated by the generator during training, and the material data hit rate is the query hit rate of target material data when the sequences generated by the generator are used for querying during training;
and receiving a selection operation of the user on the near-synonyms, querying material data with the keyword and each selected near-synonym respectively, and classifying the queried material data into one category.
2. The method according to claim 1, wherein the method further comprises:
training the generator of the generative adversarial network according to the discrimination results and the material data hit rate, and training the discriminator of the generative adversarial network according to the discrimination results.
3. The method according to claim 2, wherein the step of training the generator of the generative adversarial network according to the discrimination results and the material data hit rate, and training the discriminator of the generative adversarial network according to the discrimination results, comprises:
repeating the following steps S1 to S3 N times: S1, inputting random noise and a sample keyword into the generator to obtain a sequence generated by the generator; S2, completing the sequence based on a Monte Carlo search method to obtain a complete sequence, discriminating the complete sequence with the discriminator to obtain a discrimination result, and generating a reward value according to the discrimination result; S3, querying material data with the sequence to obtain a material query result;
calculating the average of the N reward values as a first reward value;
counting the query hit rate of the N material query results on target material data, and calculating a second reward value according to the query hit rate; wherein the target material data is the material data matched with the sample keyword;
updating the generator according to the first reward value and the second reward value, and updating the discriminator according to the first reward value.
4. The method according to claim 3, wherein each of the N times step S1 is performed, the random noise input to the generator is different, while the sample keyword input to the generator is the same.
5. The method according to claim 3, wherein the step of updating the generator according to the first reward value and the second reward value comprises:
performing a weighted summation of the first reward value and the second reward value according to their respective weights;
updating the generator according to the weighted summation result;
wherein, during training, the weight of the first reward value and the weight of the second reward value are adjusted according to how strongly the first reward value is changing: the more gradual the change of the first reward value, the smaller the weight of the first reward value and the larger the weight of the second reward value.
6. The method according to claim 5, wherein the weight of the first reward value gradually decreases as the number of training rounds increases, the weight of the second reward value gradually increases as the number of training rounds increases, the weight of the first reward value has a preset lower limit, the weight of the second reward value has a preset upper limit, and the sum of the preset lower limit and the preset upper limit equals 1.
7. The method according to claim 1, wherein the method further comprises:
displaying the remaining material data that has not been queried;
and receiving a manual classification operation performed by the user on the remaining material data, and classifying the remaining material data into the corresponding categories according to the manual classification operation.
8. A material data classification device, the device comprising:
a keyword receiving module, configured to receive a keyword input by a user for querying material data;
a near-synonym generating module, configured to generate a plurality of near-synonyms for the keyword based on a generator of a pre-trained generative adversarial network; wherein the generative adversarial network comprises the generator and a discriminator and is trained according to discrimination results and a material data hit rate, the discrimination results are the results of the discriminator discriminating the sequences generated by the generator during training, and the material data hit rate is the query hit rate of target material data when the sequences generated by the generator are used for querying during training;
and a material data query module, configured to receive a selection operation of the user on the near-synonyms, query material data with the keyword and each selected near-synonym respectively, and classify the queried material data into one category.
9. An electronic device comprising a processor, a memory and a computer program stored on the memory and capable of running on the processor, which computer program, when executed by the processor, implements the method of classifying material data according to any one of claims 1 to 7.
10. A computer-readable storage medium, characterized in that the storage medium has stored thereon a computer program which, when executed by a processor, implements a method of classifying material data according to any one of claims 1 to 7.
CN202310286304.8A 2023-03-23 2023-03-23 Material data classifying method and device, electronic equipment and storage medium Active CN116010609B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310286304.8A CN116010609B (en) 2023-03-23 2023-03-23 Material data classifying method and device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310286304.8A CN116010609B (en) 2023-03-23 2023-03-23 Material data classifying method and device, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN116010609A true CN116010609A (en) 2023-04-25
CN116010609B CN116010609B (en) 2023-06-09

Family

ID=86035858

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310286304.8A Active CN116010609B (en) 2023-03-23 2023-03-23 Material data classifying method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN116010609B (en)

Citations (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105787029A (en) * 2016-02-25 2016-07-20 浪潮软件集团有限公司 SOLR-based key word recognition method
CN108427686A (en) * 2017-02-15 2018-08-21 北京国双科技有限公司 Text data querying method and device
US20190114348A1 (en) * 2017-10-13 2019-04-18 Microsoft Technology Licensing, Llc Using a Generative Adversarial Network for Query-Keyword Matching
CN110032734A (en) * 2019-03-18 2019-07-19 百度在线网络技术(北京)有限公司 Near synonym extension and generation confrontation network model training method and device
US20190286950A1 (en) * 2018-03-16 2019-09-19 Ebay Inc. Generating a digital image using a generative adversarial network
CN110942101A (en) * 2019-11-29 2020-03-31 湖南科技大学 Rolling bearing residual life prediction method based on depth generation type countermeasure network
CN111582348A (en) * 2020-04-29 2020-08-25 武汉轻工大学 Method, device, equipment and storage medium for training condition generating type countermeasure network
CN111651577A (en) * 2020-06-01 2020-09-11 全球能源互联网研究院有限公司 Cross-media data association analysis model training method, data association analysis method and system
CN112182155A (en) * 2020-09-25 2021-01-05 中国人民大学 Search result diversification method based on generating type countermeasure network
CN112884130A (en) * 2021-03-16 2021-06-01 浙江工业大学 SeqGAN-based deep reinforcement learning data enhanced defense method and device
US20210224275A1 (en) * 2020-01-21 2021-07-22 Oracle International Corporation Query classification and processing using neural network based machine learning
WO2021174827A1 (en) * 2020-03-02 2021-09-10 平安科技(深圳)有限公司 Text generation method and appartus, computer device and readable storage medium
CN114897163A (en) * 2022-05-23 2022-08-12 阿里巴巴(中国)有限公司 Pre-training model data processing method, electronic device and computer storage medium

Patent Citations (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105787029A (en) * 2016-02-25 2016-07-20 浪潮软件集团有限公司 SOLR-based key word recognition method
CN108427686A (en) * 2017-02-15 2018-08-21 北京国双科技有限公司 Text data querying method and device
US20190114348A1 (en) * 2017-10-13 2019-04-18 Microsoft Technology Licensing, Llc Using a Generative Adversarial Network for Query-Keyword Matching
US20190286950A1 (en) * 2018-03-16 2019-09-19 Ebay Inc. Generating a digital image using a generative adversarial network
CN110032734A (en) * 2019-03-18 2019-07-19 百度在线网络技术(北京)有限公司 Near synonym extension and generation confrontation network model training method and device
CN110942101A (en) * 2019-11-29 2020-03-31 湖南科技大学 Rolling bearing residual life prediction method based on depth generation type countermeasure network
US20210224275A1 (en) * 2020-01-21 2021-07-22 Oracle International Corporation Query classification and processing using neural network based machine learning
WO2021174827A1 (en) * 2020-03-02 2021-09-10 平安科技(深圳)有限公司 Text generation method and appartus, computer device and readable storage medium
CN111582348A (en) * 2020-04-29 2020-08-25 武汉轻工大学 Method, device, equipment and storage medium for training condition generating type countermeasure network
CN111651577A (en) * 2020-06-01 2020-09-11 全球能源互联网研究院有限公司 Cross-media data association analysis model training method, data association analysis method and system
CN112182155A (en) * 2020-09-25 2021-01-05 中国人民大学 Search result diversification method based on generating type countermeasure network
CN112884130A (en) * 2021-03-16 2021-06-01 浙江工业大学 SeqGAN-based deep reinforcement learning data enhanced defense method and device
CN114897163A (en) * 2022-05-23 2022-08-12 阿里巴巴(中国)有限公司 Pre-training model data processing method, electronic device and computer storage medium

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
SHUANSHUAN PANG et al.: "An Approach to Generate Topic Similar Document by Seed Extraction-Based SeqGAN Training for Bait Document", 2018 IEEE Third International Conference on Data Science in Cyberspace (DSC), pages 1-8
杨懿男; 齐林海; 王红; 苏林萍: "Research on small-sample data generation technology based on generative adversarial networks", 电力建设 (Electric Power Construction), no. 05, pages 71-77
陈培培; 邵曦: "Automatic music tag annotation based on generative adversarial networks", 南京信息工程大学学报(自然科学版) (Journal of Nanjing University of Information Science and Technology, Natural Science Edition), no. 06, pages 754-759

Also Published As

Publication number Publication date
CN116010609B (en) 2023-06-09

Similar Documents

Publication Publication Date Title
US8195674B1 (en) Large scale machine learning systems and methods
CN110020128B (en) Search result ordering method and device
CN109902823B (en) Model training method and device based on generation countermeasure network
CN109903103B (en) Method and device for recommending articles
TWI752349B (en) Risk identification method and device
CN110827120A (en) GAN network-based fuzzy recommendation method and device, electronic equipment and storage medium
McRee Symbolic regression using nearest neighbor indexing
CN116680320A (en) Mixed matching method based on big data
CN110263136A (en) The method and apparatus for pushing object to user based on intensified learning model
Dewancker et al. Interactive preference learning of utility functions for multi-objective optimization
Basaran et al. A multi-criteria decision making to rank android based mobile applications for mathematics
Vens et al. First order random forests with complex aggregates
CN116010609B (en) Material data classifying method and device, electronic equipment and storage medium
CN113204642A (en) Text clustering method and device, storage medium and electronic equipment
Chow et al. A new feature selection scheme using a data distribution factor for unsupervised nominal data
CN111382265B (en) Searching method, device, equipment and medium
CN111782805A (en) Text label classification method and system
US20090259614A1 (en) Method and expert system for valuating an object
Jovanović et al. Evolutionary approach for automated component-based decision tree algorithm design
JP5171686B2 (en) Accelerated search modeling system and method
CN112328918A (en) Commodity sorting method, computing device and computer-readable storage medium
CN113761108B (en) Data searching method, device, equipment and computer readable storage medium
CN109446562B (en) Retrieval method for configuration scheme of crank press
Carstensen Mining Distinct Representations of High-Utility Itemsets Using Particle Swarm Optimization
Mgboh et al. DEEPLY LEARN STUDENTS' ACADEMIC PERFORMANCE

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant