CN108304382A - Quality analysis method and system based on manufacturing process text data mining - Google Patents

Quality analysis method and system based on manufacturing process text data mining

Info

Publication number
CN108304382A
Authority
CN
China
Prior art keywords
failure
data
quality
manufacturing process
text data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201810074691.8A
Other languages
Chinese (zh)
Other versions
CN108304382B (en)
Inventor
潘丽
崔泽媛
刘士军
嵇存
郭芳芳
杨承磊
孟祥旭
武蕾
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shandong University
Original Assignee
Shandong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shandong University
Priority to CN201810074691.8A
Publication of CN108304382A
Application granted
Publication of CN108304382B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2465 Query processing support for facilitating data mining operations in structured databases
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/335 Filtering based on additional data, e.g. user or group profiles
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/237 Lexical tools
    • G06F40/242 Dictionaries

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Databases & Information Systems (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Fuzzy Systems (AREA)
  • Mathematical Physics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Software Systems (AREA)
  • General Factory Administration (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a quality analysis method and system based on manufacturing process text data mining. The method includes: acquiring data related to quality problems, the data including process document data, product structure relationship data, and finished-product quality fault description data; extracting quality tags from the data related to quality problems; and performing product quality problem analysis and fault-process association analysis according to the quality tags. The invention combines quality-related data from multiple systems, integrates data resources, and performs association analysis on these data, which helps trace quality problems to their source and makes it easier for experts to optimize processes.

Description

Quality analysis method and system based on manufacturing process text data mining
Technical field
The present invention relates to the field of product quality analysis, and in particular to a method and system for quality analysis based on process text data and product quality description text data from the manufacturing process.
Background art
As the idea that "data assets are key enterprise assets" gains wider acceptance, manufacturers in many countries have begun to treat data management as a core competitive capability of the enterprise. Under an intelligent management model, a corresponding information management system is designed and used for each data source. For example, a production process management system is used for process data, production flows, equipment operating plans, temporary processes, and monthly technical reports; a product structure management system is used for recording which parts and components make up a product and the relationships among them; and a product quality management system is used for quality problem records, nonconforming-product review and approval records, fault mode information, and the like. In other words, traditional management systems manage process documents, product structure, and finished-product quality data separately, each forming an independent subsystem. However, the finished-product quality of a product is not entirely independent of its original machining process documents; there is a certain association between them. A finished-product quality problem can often be traced back through the production process to the original machining process documents, so that the key process affecting quality, and thus the real root of the problem, can be found. Therefore, data can be obtained from the individual subsystems, part of the information filtered out, quality-related content extracted, and the various relationships analyzed automatically from the mass of historical data, so as to discover production patterns and guide actual industrial production.
Unstructured data makes up a large proportion of industrial data: product process text, quality description text, fault cause descriptions, and the like are all stored as unstructured text. The volume of text is large, and it is difficult to extract key information from it manually. At present, analysis and mining of social network text data is very common; large-scale real corpora generated on the network are processed at various depths to form knowledge networks of a certain scale. Research oriented to industry, however, is comparatively scarce, and there is little awareness of the need to build a knowledge system for this domain.
Therefore, how to analyze process text data in the specific context of industry and mine the key factors that affect product quality is a technical problem currently faced by those skilled in the art.
Summary of the invention
The purpose of the present invention is to overcome the problems that quality-related raw data in the manufacturing process are scattered, that the proportion of unstructured data is large, and that patterns are difficult to find intuitively. To this end, an approach is proposed for the combined analysis of process text data and product quality description text data.
To achieve the above purpose, the present invention adopts the following technical solution:
A quality analysis method based on manufacturing process text data mining, including the following steps:
acquiring data related to quality problems, the data including process document data, product structure relationship data, and finished-product quality fault description data;
extracting quality tags from the data related to quality problems;
performing product quality problem analysis and fault-process association analysis according to the quality tags.
Further, the quality tag extraction includes:
converting process text data into process keyword tags; and converting finished-product quality fault description data into fault tags.
Further, the product quality problem analysis includes fault phenomenon association analysis and quality problem conductivity analysis.
Further, the fault phenomenon association analysis includes:
extracting tags from the fault description data to obtain fault tag sequences;
extracting frequent itemsets from the fault tag sequences using the Apriori algorithm.
Further, the quality problem conductivity analysis includes:
counting the frequency with which each quality problem occurs in adjacent development stages of the same product, and sorting in descending order;
selecting the top N as the most conductive faults.
Further, the fault-process association analysis includes:
organizing the process documents of all components of the product according to the product structure relationship data to obtain a product process structure tree;
extracting keywords from the process document data to obtain the process keyword sequence corresponding to each document, and building a process keyword dictionary;
searching the product process structure tree according to the finished-product quality problem records of the product to obtain the key process factors involved in the quality problem.
Further, extracting keywords from the process document data includes:
preprocessing the process document data by word segmentation and stop-word filtering to obtain a candidate keyword sequence;
building a candidate keyword graph whose nodes are the candidate keywords, with an edge between any two candidate keywords that co-occur within a window of length K;
calculating the TF-IDF value of each word and assigning the edge weights accordingly;
iteratively propagating the weight of each node according to the TextRank formula until convergence;
sorting the nodes by weight in descending order and taking the top T words as the keywords of the document.
Further, the method also includes visualizing the quality analysis results to assist process optimization.
According to a second purpose of the present invention, there is also provided a quality analysis system based on manufacturing process text data mining, including a memory, a processor, and a computer program stored on the memory and executable on the processor, the processor implementing the quality analysis method when executing the program.
According to a third purpose of the present invention, there is also provided a computer-readable storage medium on which a computer program is stored, the program, when executed by a processor, implementing the quality analysis method based on manufacturing process text data mining.
Beneficial effects of the present invention
1. The present invention overcomes the problem that industrial raw data are scattered by performing association analysis on data from multiple systems. That is, quality-related data are extracted from the quality management system and the production process management system and, in combination with the product manufacturing process, the various potential relationships among quality problems are analyzed together and traced back to the process source affecting quality, so that experts can optimize the process.
2. To handle the large amount of unstructured text data in the analysis, the present invention uses keyword extraction techniques to extract keywords from process text and build a process dictionary, which facilitates the further construction of an industrial knowledge graph.
3. The present invention converts long fault phenomenon description texts into fault sequences, and analyzes both the associations among fault phenomenon features and the association of the same fault across different development stages. Mining and analyzing the co-occurrence patterns among faults helps to resolve a series of faults quickly or to avoid their occurrence; mining the conduction of fault phenomena between development stages helps to focus troubleshooting and reduce the incidence of process quality problems.
Description of the drawings
The accompanying drawings, which form a part of this application, are provided for further understanding of the application. The illustrative embodiments of the application and their description are used to explain the application and do not constitute an improper limitation of it.
Fig. 1 is a flow diagram of the method of the present invention.
Fig. 2 is a structural diagram of the system of the present invention.
Fig. 3 is a diagram of the fault tag sequences converted from multiple fault phenomenon description texts.
Fig. 4 is a schematic diagram of the quality problem conductivity analysis.
Fig. 5 is a schematic visualization of the process tag word cloud and the fault description tag word cloud of the present invention.
Fig. 6 is a schematic visualization of the quality problem conductivity analysis.
Detailed description
It should be noted that the following detailed description is exemplary and is intended to provide further explanation of the application. Unless otherwise indicated, all technical and scientific terms used herein have the same meaning as commonly understood by a person of ordinary skill in the art to which this application belongs.
It should also be noted that the terminology used herein is for describing specific embodiments only and is not intended to limit the exemplary embodiments of the application. As used herein, unless the context clearly indicates otherwise, the singular forms are intended to include the plural forms as well. It should further be understood that the terms "comprising" and/or "including", when used in this specification, indicate the presence of features, steps, operations, devices, components, and/or combinations thereof.
Where no conflict arises, the embodiments of the application and the features in the embodiments may be combined with one another.
The preferred form of the present invention is as an auxiliary analysis tool built on an enterprise's process and quality information management platform; the platform uses the present system through the interfaces that call the analysis models and mining methods the system provides.
In implementation, the subsystems of the enterprise information management platform that cover finished-product quality, product structure relationships, and component process flows must be understood in depth. Relying on the analysis flow and methods of the present invention, the raw data are analyzed to discover regular knowledge and support the optimization of industrial production.
The basic idea of the present invention is as follows: text keywords are automatically extracted from a mass of historical data by text analysis techniques; the associations among fault phenomena and the conductivity of faults between different development stages are analyzed; and, through the product structure relationships, quality problems are traced back to the original machining process documents to find their source process.
Embodiment one
This embodiment discloses a quality analysis method based on manufacturing process text data mining which, as shown in Fig. 1, includes the following steps:
Step 1: Multi-source data extraction
Data related to quality problems are extracted from the information management systems designed for the different data sources. For example, product process document data are extracted from the production process management system, product structure relationship data from the product structure management system, and finished-product quality fault description data from the product quality management system.
After extraction, the data are also preprocessed, for example by data cleaning and Chinese word segmentation.
Step 2: Quality tag extraction
Unstructured data makes up a large proportion of industrial data: product process text, quality description text, fault cause descriptions, and the like are all stored as unstructured text. Most of the words in these texts only support the statement, and only a very small number relate to quality problems. A tag extraction process can therefore convert process text data into process keyword tags and finished-product quality fault description data into quality problem tags. Screening high-quality, high-confidence keywords out of the large volume of raw text filters out redundant, irrelevant information, which makes it easier for experts to inspect the data directly and supports the subsequent quality analysis.
TF-IDF and TextRank are two common keyword extraction techniques. The TF-IDF algorithm does not consider the relationships between words or the characteristics of the words themselves, and measures the importance of a word only by word frequency. The TextRank algorithm is confined to computation over a single document and ignores the information in the corpus. The statistics-based TF-IDF and the graph-based TextRank can be combined and used together to extract keywords from process documents.
Step 3: Quality problem analysis and fault-process association analysis according to the quality tags
The quality problem analysis includes fault phenomenon association analysis and quality problem conductivity analysis.
1. Fault phenomenon association analysis
The occurrence of a fault is related not only to design and production; faults also co-occur with one another, that is, one fault can often lead to another. Potential relationships therefore exist among fault phenomena, in that two faults frequently appear together in fault phenomenon descriptions.
Because tags have been extracted from the fault phenomenon descriptions, each description text has been converted into a fault tag sequence, so the Apriori algorithm can be used to extract frequent itemsets from the fault tags. For example, after the long "fault phenomenon description" texts are converted into fault tag lists, a series of lists of fault phenomena is obtained, as in Fig. 3. List 1: [fault 1, fault 2, fault 3, fault 4, fault 5]; list 2: [fault 1, fault 2, fault 6, fault 7, fault 8]; list 3: [fault 1, fault 2, fault 3]; list 4: [fault 3, fault 4]. Analyzing the associations among fault phenomenon features, that is, which fault phenomena occur together, gives: <fault 1, fault 2>, co-occurring 3 times; <fault 3, fault 4>, co-occurring 2 times; <fault 1, fault 2, fault 3>, co-occurring 2 times. This indicates that the faults within <fault 1, fault 2>, <fault 3, fault 4>, and <fault 1, fault 2, fault 3> are associated.
The flow for mining fault co-occurrence patterns from a large number of fault phenomenon description texts is as follows:
(1) Extract keywords from each fault phenomenon description text, converting each text into a fault phenomenon tag sequence.
(2) Count the frequency with which each fault phenomenon tag appears across all fault phenomenon descriptions and discard the tags whose frequency is below the threshold, obtaining the frequent itemset F1 of fault phenomenon tags.
(3) Combine the items of the frequent itemset F1 in pairs to obtain the candidate itemset F2 of fault phenomenon tag groups.
(4) Count the frequency with which each item of the itemset F2 appears across all fault phenomenon descriptions and discard the items whose frequency is below the threshold, obtaining the frequent itemset F2 of fault phenomenon tags.
(5) Continue in the same way until the frequent itemset Fk contains no items.
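The level-wise flow in steps (2) to (5) can be sketched in Python as follows. This is a minimal illustration rather than the patented implementation: the function name frequent_itemsets, the parameter min_count (an absolute count threshold), and the tag strings are assumptions introduced here, and the data are the four example lists given above.

```python
from itertools import combinations


def frequent_itemsets(tag_lists, min_count):
    """Apriori-style level-wise mining over fault tag lists (illustrative sketch)."""
    transactions = [frozenset(tags) for tags in tag_lists]

    def support(itemset):
        # Number of fault descriptions that contain every tag in the itemset.
        return sum(1 for t in transactions if itemset <= t)

    # Step (2): frequent single tags (F1).
    all_tags = {tag for t in transactions for tag in t}
    current = {frozenset([tag]) for tag in all_tags
               if support(frozenset([tag])) >= min_count}

    result = {}
    k = 1
    while current:  # Step (5): stop when the frequent itemset Fk is empty.
        for itemset in current:
            result[itemset] = support(itemset)
        k += 1
        # Steps (3)/(4): join frequent (k-1)-itemsets into k-item candidates,
        # keep only candidates whose (k-1)-subsets are all frequent,
        # then discard candidates below the threshold.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        candidates = {c for c in candidates
                      if all(frozenset(s) in current for s in combinations(c, k - 1))}
        current = {c for c in candidates if support(c) >= min_count}
    return result


# The four example lists from the text (tag names are placeholders).
fault_lists = [
    ["fault1", "fault2", "fault3", "fault4", "fault5"],
    ["fault1", "fault2", "fault6", "fault7", "fault8"],
    ["fault1", "fault2", "fault3"],
    ["fault3", "fault4"],
]
patterns = frequent_itemsets(fault_lists, min_count=2)
for itemset, count in sorted(patterns.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    if len(itemset) > 1:
        print(sorted(itemset), count)
```

With a threshold of 2, the result contains the three patterns named in the example, and also <fault 1, fault 3> and <fault 2, fault 3>, which reach the same threshold.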
2. Quality problem conductivity analysis combined with the product manufacturing process
Production follows a fixed flow and passes through a series of development stages. Similar quality problems may appear in two adjacent development stages, which indicates that the quality problem has been conducted from one stage to the next. The quality problems appearing in each development stage of a product can be analyzed jointly to find which quality problems have been conducted.
When quality problem data are collected, the fault phenomenon description and the corresponding model and its development stage are usually recorded. Tags are extracted from the fault phenomenon descriptions, converting each description text into a fault tag sequence. For each model, the frequency with which each fault appears in the different development stages is counted and sorted in descending order, giving a fault list for each stage, from which the faults with high frequency in both stages are found. For example, suppose the product quality problem descriptions give, for "stage 1", the fault list in descending order of frequency [fault 1, fault 2, fault 3, fault 4, fault 5, ...] and, for "stage 2", the list [fault 2, fault 8, fault 9, fault 3, fault 7, ...]. It can then be seen that "fault 2" appears with high frequency in both stages, with "fault 3" second only to "fault 2". From "stage 1" to "stage 2", "fault 2" and "fault 3" are therefore the most conductive, as in Fig. 4.
If a fault appears with high frequency in the quality problem descriptions of two adjacent development stages, the fault has been conducted between those two stages. Comparing the frequencies of all faults that appear in the two stages and picking out those with the higher frequencies is simple and direct but, because the data volume is large, relatively inefficient. An algorithm can be used to find the N most conductive faults efficiently: count the frequency with which each quality problem occurs in the adjacent development stages, sort in descending order, and select the top N.
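One way to realize the count, sort, and top-N selection is sketched below. The text does not state how the two stage frequencies are combined into a single ranking, so the scoring rule here (the smaller of the two frequencies, so that a fault ranks high only when it is frequent in both stages) is an assumption, as are the function name most_conducted_faults and the sample data.

```python
from collections import Counter


def most_conducted_faults(stage_a_tags, stage_b_tags, top_n):
    """Rank faults that appear in both adjacent stages (illustrative sketch)."""
    # Frequency of each fault tag within each development stage.
    freq_a = Counter(tag for tags in stage_a_tags for tag in tags)
    freq_b = Counter(tag for tags in stage_b_tags for tag in tags)

    # Only faults seen in both stages can have been conducted.
    shared = freq_a.keys() & freq_b.keys()

    # Assumed scoring rule: the smaller of the two stage frequencies,
    # in descending order; ties are broken by tag name for determinism.
    ranked = sorted(shared, key=lambda f: (-min(freq_a[f], freq_b[f]), f))
    return ranked[:top_n]


# Hypothetical fault tag sequences for two adjacent stages of one model.
stage1 = [["fault1", "fault2"], ["fault2", "fault3"], ["fault2"],
          ["fault3", "fault1"], ["fault1"]]
stage2 = [["fault2", "fault8"], ["fault2", "fault3"], ["fault2", "fault8"],
          ["fault3", "fault9"]]
top_faults = most_conducted_faults(stage1, stage2, top_n=2)
print(top_faults)
```

Counting each stage once with a Counter and ranking only the shared faults avoids comparing every pair of faults across the two stages.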
3. Fault-process association analysis
There is a certain association between the finished-product quality of a product and its original machining process design: whether the process design is reasonable directly affects the quality of the finished product. A finished-product quality problem can therefore often be traced back through the production process to the original machining process documents, so that the key process affecting quality, and thus the real root of the problem, can be found.
The specific steps are as follows:
First, the process documents of all components of the product are organized according to the product structure relationships to obtain a product process structure tree;
Then, the process documents are processed with keyword extraction techniques to obtain the process keyword sequence corresponding to each document, providing a convenient process search function;
Finally, the product process structure tree is searched according to the finished-product quality problem records of the product to obtain the key process factors involved in the quality problem. These are then investigated to find the source process affecting quality.
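The three steps above can be illustrated with a small tree search. The text does not specify the tree's data structure or how a quality problem record is matched against process documents, so everything in this sketch is hypothetical: the ComponentNode class, the keyword-overlap matching rule, and the component names, document ids, and keywords in the example.

```python
from dataclasses import dataclass, field


@dataclass
class ComponentNode:
    """One component in a hypothetical product process structure tree."""
    name: str
    process_keywords: dict = field(default_factory=dict)  # document id -> keyword list
    children: list = field(default_factory=list)


def find_node(root, name):
    """Depth-first lookup of the component named in a quality problem record."""
    if root.name == name:
        return root
    for child in root.children:
        found = find_node(child, name)
        if found is not None:
            return found
    return None


def trace_processes(root, component_name, fault_keywords):
    """Collect process documents under a component whose keywords overlap the fault's."""
    start = find_node(root, component_name)
    if start is None:
        return []
    hits = []
    stack = [start]
    while stack:
        node = stack.pop()
        for doc_id, keywords in node.process_keywords.items():
            matched = sorted(set(keywords) & set(fault_keywords))
            if matched:
                hits.append((node.name, doc_id, matched))
        stack.extend(node.children)  # Descend into sub-components.
    return sorted(hits)


# Hypothetical product: a pump made of a housing and a shaft with a bearing.
bearing = ComponentNode("bearing", {"P-03": ["grinding", "heat treatment"]})
shaft = ComponentNode("shaft", {"P-02": ["turning", "heat treatment"]}, [bearing])
housing = ComponentNode("housing", {"P-01": ["casting", "annealing"]})
pump = ComponentNode("pump", {}, [housing, shaft])

suspects = trace_processes(pump, "shaft", ["heat treatment"])
print(suspects)
```

Starting the search at the component named in the quality record limits the candidate documents to that component and its sub-components, which mirrors tracing a finished-product problem back along the product structure.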
The process keyword dictionary is built as follows:
The process keyword dictionary is built by extracting keywords from all process documents, with expert participation. Using the dictionary, a process document can be converted into a process keyword sequence, which provides a convenient process search function so that the process can be traced when a problem arises at some process step. For process keyword extraction, TF-IDF is introduced on top of the TextRank algorithm to assign edge weights, which turns the TextRank graph into a directed graph. The original TextRank graph is undirected and considers only word co-occurrence within a single document: if two words co-occur within a window of limited size, an edge exists between their vertices, and every edge has weight 1, so different edges are not distinguished by weight. In this system, the edge from Wi to Wj is given the TF-IDF value of Wi as its weight, and the edge from Wj to Wi is given the TF-IDF value of Wj. The flow for extracting keywords from a process document is as follows:
(1) Perform Chinese word segmentation on the process text file D and filter out stop words against a stop-word dictionary, obtaining the word sequence D = {W1, W2, ..., Wn}, where each Wi is a candidate keyword.
(2) Build the candidate keyword graph G = (V, E). V is the node set, made up of the candidate keywords obtained in (1); E is the edge set, in which an edge exists between the nodes of two words if the words co-occur within a window of length K.
(3) Calculate the TF-IDF value of each word and assign the edge weights accordingly.
(4) Iteratively propagate the weight of each node according to the TextRank formula until convergence.
(5) Sort the nodes by weight in descending order and take the top T words as the keywords of the document.
In this way, keywords are extracted from each process document, converting it into a keyword sequence. With expert participation, the keywords of all documents are merged to build the process keyword dictionary.
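Steps (2) and (3) of the flow, building the co-occurrence graph and assigning TF-IDF edge weights, can be sketched as follows. The sketch assumes the documents are already segmented and stop-word filtered (step 1), and it stops before the iteration of step (4) because the text does not give the exact TextRank update formula used. The function names, the smoothed IDF variant, and the sample tokens are assumptions introduced for illustration.

```python
import math
from collections import Counter


def tf_idf(doc_tokens, corpus):
    """TF-IDF of each word in one document, using one common smoothed IDF variant."""
    n_docs = len(corpus)
    counts = Counter(doc_tokens)
    scores = {}
    for word, count in counts.items():
        df = sum(1 for doc in corpus if word in doc)  # documents containing the word
        idf = math.log((1 + n_docs) / (1 + df)) + 1.0
        scores[word] = (count / len(doc_tokens)) * idf
    return scores


def build_weighted_graph(doc_tokens, corpus, window):
    """Directed co-occurrence graph: the edge Wi -> Wj carries the TF-IDF value of Wi."""
    weights = tf_idf(doc_tokens, corpus)
    edges = {}
    for i, wi in enumerate(doc_tokens):
        # Words within a window of `window` tokens starting at Wi co-occur with it.
        for wj in doc_tokens[i + 1 : i + window]:
            if wi == wj:
                continue
            edges[(wi, wj)] = weights[wi]
            edges[(wj, wi)] = weights[wj]
    return edges


# Hypothetical segmented process documents (English tokens stand in for Chinese words).
corpus = [
    ["weld", "seam", "crack", "weld"],
    ["paint", "seam"],
    ["weld", "polish"],
]
graph = build_weighted_graph(corpus[0], corpus, window=2)
for (src, dst), weight in sorted(graph.items()):
    print(src, "->", dst, round(weight, 4))
```

The resulting edge dictionary is the input that a TextRank-style iteration in step (4) would consume.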
Step 4: Display the data analysis results intuitively to assist experts in process optimization
After the above analysis is complete, the results need to be presented to experts in a clear visual form to assist them in process optimization. The data visualization stage provides several display modes, such as word clouds, pie charts, and heat maps. A word cloud is a visual representation of unstructured text data: it gives visual prominence to high-frequency keywords and displays terms of different frequency with different font sizes, layouts, and colors, filtering out a large amount of text so that an expert can grasp the gist of the text at a glance, as in Fig. 5. The word cloud display is implemented as follows: the words to be displayed and their frequencies are sorted in descending order of frequency; word frequency is mapped to font size, so that frequent words appear in a larger font and infrequent words in a smaller one; and the words are drawn in descending order of frequency, starting from the center of the picture. Fig. 6 shows the quality problem conductivity analysis, displayed on coordinate axes.
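The sorting and frequency-to-font-size mapping used for the word cloud can be sketched as follows. The text says only that more frequent words get larger fonts; the linear mapping, the point-size range, and the function name font_sizes are assumptions made for this illustration, and the layout step (drawing outward from the center) is not covered.

```python
def font_sizes(word_freq, min_pt=10, max_pt=48):
    """Sort words by descending frequency and map frequency linearly to a font size."""
    if not word_freq:
        return []
    ranked = sorted(word_freq.items(), key=lambda kv: (-kv[1], kv[0]))
    lowest = min(word_freq.values())
    highest = max(word_freq.values())
    span = highest - lowest
    sized = []
    for word, freq in ranked:
        # All words share the largest size when every frequency is equal.
        scale = (freq - lowest) / span if span else 1.0
        sized.append((word, round(min_pt + scale * (max_pt - min_pt))))
    return sized


# Hypothetical fault tag frequencies.
drawing_order = font_sizes({"crack": 9, "leak": 5, "burr": 1})
print(drawing_order)
```

The returned list is already in drawing order, so a layout routine can place the first entry at the center of the picture and work outward.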
Embodiment two
The purpose of this embodiment is to provide a quality analysis system based on manufacturing process text data mining.
The system includes a memory, a processor, and a computer program stored on the memory and executable on the processor; the processor implements the following steps when executing the program:
acquiring data related to quality problems, the data including process document data, product structure relationship data, and finished-product quality fault description data;
extracting quality tags from the data related to quality problems;
performing product quality problem analysis and fault-process association analysis according to the quality tags.
Fig. 2 shows the functional structure of the system in detail. The system is divided into four functional modules: a preprocessing module, a data analysis module, a data visualization module, and a data storage module. The preprocessing module covers data extraction, data cleaning, Chinese word segmentation, and keyword extraction. The data analysis module covers quality and process dictionary construction, fault phenomenon association analysis, problem conductivity analysis, and fault-process association analysis. The data visualization module covers overall data display, data association display, and data comparison display. The data storage module covers configuration file storage, log record storage, specialized dictionary storage, and stop-word dictionary storage.
Embodiment three
The purpose of this embodiment is to provide a computer-readable storage medium.
A computer-readable storage medium on which a computer program is stored, the program performing the following steps when executed by a processor:
acquiring data related to quality problems, the data including process document data, product structure relationship data, and finished-product quality fault description data;
extracting quality tags from the data related to quality problems;
performing product quality problem analysis and fault-process association analysis according to the quality tags.
The steps involved in the devices of embodiments two and three above correspond to method embodiment one; for their specific implementation, see the relevant description of embodiment one. The term "computer-readable storage medium" should be understood to include a single medium or multiple media containing one or more instruction sets; it should also be understood to include any medium that can store, encode, or carry an instruction set for execution by a processor that causes the processor to perform any of the methods of the present invention.
Those skilled in the art will understand that the modules or steps of the present invention described above can be implemented with a general-purpose computing device. Alternatively, they can be implemented with program code executable by a computing device, so that they can be stored in a storage device and executed by the computing device; or they can each be made into an individual integrated circuit module, or several of the modules or steps can be made into a single integrated circuit module. The present invention is not limited to any specific combination of hardware and software.
Although the specific embodiments of the present invention have been described above with reference to the accompanying drawings, they do not limit the scope of protection of the invention. Those skilled in the art should understand that various modifications or variations that can be made on the basis of the technical solution of the present invention without creative effort remain within its scope of protection.

Claims (10)

1. A quality analysis method based on text data mining in the manufacturing process, characterized by comprising the following steps:
acquiring data related to quality problems, the data related to quality problems comprising process document data, product structure relationship data, and finished-product quality fault description data;
performing quality label extraction on the data related to quality problems;
performing product quality problem analysis and fault-process association analysis according to the quality labels.
2. The quality analysis method based on text data mining in the manufacturing process according to claim 1, characterized in that the quality label extraction comprises:
converting process text data into process keyword labels; and converting finished-product quality fault description data into fault labels.
3. The quality analysis method based on text data mining in the manufacturing process according to claim 1, characterized in that the product quality problem analysis comprises fault phenomenon association analysis and quality problem conductivity analysis.
4. The quality analysis method based on text data mining in the manufacturing process according to claim 3, characterized in that the fault phenomenon association analysis comprises:
performing label extraction on the fault description data to obtain fault label sequences;
extracting frequent itemsets from the fault label sequences using the Apriori algorithm.
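The frequent-itemset step of claim 4 can be sketched as follows. This is a minimal, illustrative Apriori implementation, not the patented implementation: the function name `apriori`, the use of an absolute support count for `min_support`, and the representation of each fault label sequence as a list of strings are all assumptions made for the example.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support count} for every frequent itemset.

    transactions: iterable of fault-label collections (hypothetical format).
    min_support:  minimum number of transactions that must contain an itemset
                  (an absolute count; this choice is an assumption).
    """
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count individual fault labels.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_support}
    frequent = dict(current)

    k = 2
    while current:
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a, b in combinations(list(current), 2)
                      if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(s) in current
                             for s in combinations(c, k - 1))}
        # Count the support of the surviving candidates.
        counts = {c: sum(1 for t in transactions if c <= t)
                  for c in candidates}
        current = {s: n for s, n in counts.items() if n >= min_support}
        frequent.update(current)
        k += 1
    return frequent
```

Calling `apriori(label_sequences, min_support=2)` on a list of fault label lists returns the label combinations that appear together in at least two fault records; these co-occurring faults are the candidates for association.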
5. The quality analysis method based on text data mining in the manufacturing process according to claim 3, characterized in that the quality problem conductivity analysis comprises:
counting the frequency with which each fault occurs in adjacent development stages of the same product, and sorting the faults in descending order of frequency;
selecting the top N faults as the most conductive faults.
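The counting and ranking in claim 5 can be illustrated with a short sketch. It assumes one particular reading of the claim: a fault is "conducted" once for every pair of adjacent stages of the same product in which it appears in both stages. The record layout `(product_id, stage_index, fault_label)` and the function name `most_conductive_faults` are hypothetical.

```python
from collections import Counter, defaultdict


def most_conductive_faults(records, top_n):
    """Rank faults by how often they recur in adjacent stages of a product.

    records: iterable of (product_id, stage_index, fault_label) tuples,
             where stage_index is an integer and consecutive integers denote
             adjacent development stages (an assumption of this sketch).
    """
    # Group the fault labels observed at each (product, stage).
    faults_by_stage = defaultdict(set)
    for product, stage, fault in records:
        faults_by_stage[(product, stage)].add(fault)

    # A fault is counted once per adjacent stage pair that contains it twice.
    carried = Counter()
    for (product, stage), faults in faults_by_stage.items():
        next_faults = faults_by_stage.get((product, stage + 1), set())
        for fault in faults & next_faults:
            carried[fault] += 1

    # Descending frequency; ties are broken alphabetically for determinism.
    ranked = sorted(carried.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_n]
```

The function returns `(fault, count)` pairs, so the caller can see both the ranking and how often each fault propagated from one stage to the next.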
6. The quality analysis method based on text data mining in the manufacturing process according to claim 1, characterized in that the fault-process association analysis comprises:
organizing the process documents of all components of a product according to the product structure relationship data, to obtain a product process structure tree;
performing keyword extraction on the process document data to obtain the process keyword sequence corresponding to each document, and building a process keyword dictionary;
searching the product process structure tree according to the finished-product quality problem records of the product, to obtain the key process factors involved in the quality problems.
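The tree search in claim 6 can be pictured with a small sketch. The claim does not spell out the search strategy, so the following is only one plausible reading: for each component named in a quality problem record, walk that component's subtree and tally the process keywords attached to every component reached. The data layout (a parent-to-children dictionary and a component-to-keywords dictionary) and the name `key_process_factors` are assumptions.

```python
from collections import Counter


def key_process_factors(structure, keywords_by_component, faulty_components):
    """Collect process keywords under each faulty component, most frequent first.

    structure:             {component: [child components]} (hypothetical layout)
    keywords_by_component: {component: [process keywords from its documents]}
    faulty_components:     components named in quality problem records
    """
    counts = Counter()
    for root in faulty_components:
        stack, seen = [root], set()
        # Depth-first walk of the subtree rooted at the faulty component.
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            counts.update(keywords_by_component.get(node, []))
            stack.extend(structure.get(node, []))
    # Descending frequency; ties are broken alphabetically for determinism.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
```

In this sketch a keyword that sits under several faulty components is counted once per record, so processes shared by many problem components rise to the top of the list.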
7. The quality analysis method based on text data mining in the manufacturing process according to claim 6, characterized in that performing keyword extraction on the process document data comprises:
preprocessing the process document data by word segmentation and stop-word filtering, to obtain a candidate keyword sequence;
constructing a candidate keyword graph in which the nodes are the candidate keywords and an edge exists between two candidate keywords that co-occur within a window of length K;
calculating the TF-IDF value of each word, and assigning the weight of each edge accordingly;
iteratively propagating the weight of each node according to the TextRank formula until convergence;
sorting the nodes by weight in descending order, and taking the T most important words as the keywords of the document.
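The steps of claim 7 can be sketched as a TF-IDF-weighted TextRank. Several details below are assumptions, because the claim does not fix them: the edge weight is taken as the sum of the TF-IDF values of its two endpoints, the IDF is smoothed so that every weight stays positive, the damping factor is the conventional 0.85, and the input is assumed to be already segmented and stop-word filtered. The function name `tfidf_textrank` and its parameters are hypothetical.

```python
import math
from collections import Counter, defaultdict


def tfidf_textrank(doc, corpus, window=3, top_t=3, d=0.85,
                   tol=1e-6, max_iter=200):
    """Return the top_t keywords of doc.

    doc:    list of candidate keywords (already segmented and filtered).
    corpus: list of token lists used for document frequencies (includes doc).
    window: co-occurrence window length K.
    """
    # TF-IDF per word, with a smoothed IDF (assumption) to keep values > 0.
    tf = Counter(doc)
    n_docs = len(corpus)
    tfidf = {}
    for w in tf:
        df = sum(1 for other in corpus if w in other)
        tfidf[w] = (tf[w] / len(doc)) * (math.log((1 + n_docs) / (1 + df)) + 1)

    # Undirected graph: connect words that co-occur within the window.
    # Edge weight = sum of endpoint TF-IDF values (assumption).
    weights = defaultdict(dict)
    for i, u in enumerate(doc):
        for v in doc[i + 1:i + window]:
            if u != v:
                weights[u][v] = weights[v][u] = tfidf[u] + tfidf[v]
    if not weights:
        return sorted(set(doc))[:top_t]

    # Weighted TextRank iteration until the largest change falls below tol.
    out_sum = {u: sum(nbrs.values()) for u, nbrs in weights.items()}
    score = {w: 1.0 for w in weights}
    for _ in range(max_iter):
        new = {v: (1 - d) + d * sum(weights[u][v] / out_sum[u] * score[u]
                                    for u in weights[v])
               for v in weights}
        delta = max(abs(new[v] - score[v]) for v in weights)
        score = new
        if delta < tol:
            break

    # Descending score; ties are broken alphabetically for determinism.
    return sorted(score, key=lambda w: (-score[w], w))[:top_t]
```

For Chinese process documents, the token lists would come from a word segmenter followed by a stop-word filter, which this sketch deliberately leaves out.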
8. The quality analysis method based on text data mining in the manufacturing process according to claim 6, characterized in that the method further comprises: visualizing the quality analysis results to assist process optimization.
9. A quality analysis system based on text data mining in the manufacturing process, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that the processor, when executing the program, implements the quality analysis method according to any one of claims 1-8.
10. A computer-readable storage medium storing a computer program, characterized in that the program, when executed by a processor, implements the quality analysis method based on text data mining in the manufacturing process according to any one of claims 1-8.
CN201810074691.8A 2018-01-25 2018-01-25 Quality analysis method and system based on text data mining in manufacturing process Active CN108304382B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810074691.8A CN108304382B (en) 2018-01-25 2018-01-25 Quality analysis method and system based on text data mining in manufacturing process

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810074691.8A CN108304382B (en) 2018-01-25 2018-01-25 Quality analysis method and system based on text data mining in manufacturing process

Publications (2)

Publication Number Publication Date
CN108304382A true CN108304382A (en) 2018-07-20
CN108304382B CN108304382B (en) 2021-02-02

Family

ID=62866376

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810074691.8A Active CN108304382B (en) 2018-01-25 2018-01-25 Quality analysis method and system based on text data mining in manufacturing process

Country Status (1)

Country Link
CN (1) CN108304382B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102735485A (en) * 2011-10-14 2012-10-17 中联重科股份有限公司 Excavator, and method and system for determining equipment fault
CN104268338A (en) * 2014-09-26 2015-01-07 北京航空航天大学 Complex product failure effect transfer relation model as well as analysis and evaluation method thereof
KR20160076646A (en) * 2014-12-23 2016-07-01 (주)해인씨앤에스 Method and apparatus for managing a process and quality improvement of manufacturing process
CN106202665A (en) * 2016-06-30 2016-12-07 北京航空航天大学 Initial failure root primordium recognition methods based on domain mapping Yu weighted association rules
CN107451666A (en) * 2017-07-15 2017-12-08 西安电子科技大学 Breaker based on big data analysis assembles Tracing back of quality questions system and method

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
甘超 等: "基于Apriori算法的设备故障诊断技术的研究", 《组合机床与自动化加工技术》 *
魏赟 等: "融合统计学和TextRank的生物医学文献关键短语抽取", 《计算机应用与软件》 *

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109243528A (en) * 2018-08-14 2019-01-18 张旭蓓 The bioprocess control method of knowledge based map digraph
CN109243528B (en) * 2018-08-14 2022-02-08 张旭蓓 Biological process control method based on knowledge graph digraph
CN110059319A (en) * 2019-04-22 2019-07-26 上海化学工业区公共管廊有限公司 A kind of piping lane failure analysis methods based on key words co-occurrence
CN110059319B (en) * 2019-04-22 2022-11-18 上海化学工业区公共管廊有限公司 Pipe gallery fault analysis method based on keyword co-occurrence
CN112395424A (en) * 2020-10-10 2021-02-23 北京仿真中心 Complex product quality problem tracing method and system
CN114138857A (en) * 2021-11-10 2022-03-04 北京师范大学 Big data mining method and device based on watershed water environment
CN116132107A (en) * 2022-12-16 2023-05-16 苏州可米可酷食品有限公司 Full life cycle quality data traceability management system based on data cloud processing product
CN116132107B (en) * 2022-12-16 2024-04-12 苏州可米可酷食品有限公司 Full life cycle quality data traceability management system based on data cloud processing product
CN116562714A (en) * 2023-07-07 2023-08-08 南通汤姆瑞斯工业智能科技有限公司 Workpiece information tracing system and method applied to machining
CN116562714B (en) * 2023-07-07 2023-12-08 南通汤姆瑞斯工业智能科技有限公司 Workpiece information tracing system and method applied to machining

Also Published As

Publication number Publication date
CN108304382B (en) 2021-02-02

Similar Documents

Publication Publication Date Title
CN106649260B (en) Product characteristic structure tree construction method based on comment text mining
US11645317B2 (en) Recommending topic clusters for unstructured text documents
CN108304382A (en) Mass analysis method based on manufacturing process text data digging and system
Inzalkar et al. A survey on text mining-techniques and application
US9317593B2 (en) Modeling topics using statistical distributions
CN104346379B (en) A kind of data element recognition methods of logic-based and statistical technique
EP2045737A2 (en) Selecting tags for a document by analysing paragraphs of the document
EP1323078A1 (en) A document categorisation system
WO2014210387A2 (en) Concept extraction
CN112395424A (en) Complex product quality problem tracing method and system
CN110297893A (en) Natural language question-answering method, device, computer installation and storage medium
Tabassum et al. Semantic analysis of Urdu english tweets empowered by machine learning
Barbosa et al. An approach to clustering and sequencing of textual requirements
Suresh et al. Data mining and text mining—a survey
JP5324677B2 (en) Similar document search support device and similar document search support program
US11675793B2 (en) System for managing, analyzing, navigating or searching of data information across one or more sources within a computer or a computer network, without copying, moving or manipulating the source or the data information stored in the source
CN110874366A (en) Data processing and query method and device
Miotto et al. Supporting the Curation of Biological Databases Reusable Text Mining
CN110929509B (en) Domain event trigger word clustering method based on louvain community discovery algorithm
CN106775694A (en) A kind of hierarchy classification method of software merit rating code product
CN116467291A (en) Knowledge graph storage and search method and system
CN112668836B (en) Risk spectrum-oriented associated risk evidence efficient mining and monitoring method and apparatus
Punitha et al. Partition document clustering using ontology approach
Hu et al. A classification model of power operation inspection defect texts based on graph convolutional network
Ordoñez et al. Business Process Models Clustering Based on Multimodal Search, K-means, and Cumulative and No-Continuous N-Grams

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant