CN114637782A - Method and device for generating text aiming at structured numerical data - Google Patents
- Publication number
- CN114637782A (application number CN202210343943.9A)
- Authority
- CN
- China
- Prior art keywords
- data sequence
- numerical
- abnormal data
- abnormal
- description text
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2455—Query execution
- G06F16/24553—Query execution of query operations
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/25—Integrating or interfacing systems involving database management systems
- G06F16/254—Extract, transform and load [ETL] procedures, e.g. ETL data flows in data warehouses
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/36—Creation of semantic tools, e.g. ontology or thesauri
- G06F16/367—Ontology
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/10—Text processing
- G06F40/166—Editing, e.g. inserting or deleting
- G06F40/186—Templates
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Databases & Information Systems (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Audiology, Speech & Language Pathology (AREA)
- General Health & Medical Sciences (AREA)
- Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- Life Sciences & Earth Sciences (AREA)
- Animal Behavior & Ethology (AREA)
- General Factory Administration (AREA)
Abstract
The invention provides a method for generating a text aiming at structured numerical data, which comprises the following steps: acquiring structured numerical data, wherein the structured numerical data is a numerical sequence related to semiconductor manufacturing; and matching a corresponding description text for the numerical sequence based on a preset rule template, wherein the description text describes abnormal information corresponding to the numerical sequence. The method for generating the text aiming at the structured numerical data converts the structured numerical data into the text data which is easier to identify, and well reveals the abnormal phenomenon in the semiconductor manufacturing process.
Description
Technical Field
The invention relates to the technical field of semiconductor manufacturing, in particular to a method and a device for generating texts aiming at structured numerical data.
Background
In semiconductor manufacturing, when a wafer has a yield problem, a scheme for improving the yield can be found by retrospectively analyzing the historical data generated by each machine, so as to determine the defect and/or failure characteristics of the wafer and the corresponding root causes. However, the data generated by the machines is structured numerical data. Analyzing such data requires considerable professional knowledge, and the abnormal information it reveals can only be judged in combination with past experience, which increases both the workload of workers and the professional threshold.
Disclosure of Invention
The embodiment of the invention provides a method and a device for generating texts aiming at structured numerical data, which are used for converting the structured numerical data into text data which is easier to identify and better disclosing abnormal phenomena in the semiconductor manufacturing process.
In a first aspect, the present invention provides a method for generating text for structured numerical data, comprising: acquiring structured numerical data, wherein the structured numerical data is a numerical sequence related to semiconductor manufacturing; and matching a corresponding description text for the numerical sequence based on a preset rule template, wherein the description text describes abnormal information corresponding to the numerical sequence.
The method for generating the text aiming at the structured numerical data converts the structured numerical data into the text data which is easier to identify, and better reveals the abnormal phenomenon in the semiconductor manufacturing process.
In one possible implementation, the method further comprises: determining abnormal data sequence segments in the numerical value sequence based on the distribution characteristics of the numerical value sequence;
the matching of the corresponding description text for the numerical sequence based on the preset rule template comprises the following steps:
and matching corresponding description texts for the abnormal data sequence segments based on a preset rule template.
In one possible implementation, the preset rule template includes a plurality of abnormal judgment conditions and abnormal phenomenon descriptions corresponding to the abnormal judgment conditions;
the matching of the corresponding description text for the abnormal data sequence segment based on the preset rule template comprises the following steps:
determining a target abnormality judgment condition which is met by the abnormal data sequence segment in the plurality of abnormality judgment conditions;
and acquiring abnormal phenomenon description corresponding to the target abnormal judgment condition as the description text.
In another possible implementation, the sequence of values includes a plurality of anomalous data sequence segments; the method further comprises the following steps:
acquiring position intervals of the plurality of abnormal data sequence segments in the numerical value sequence;
merging the first abnormal data sequence segment and the second abnormal data sequence segment to obtain a third abnormal data sequence segment, wherein the position intervals of the first abnormal data sequence segment and the second abnormal data sequence segment are overlapped;
and determining the description text of the third abnormal data sequence segment based on the description text of the first abnormal data sequence segment and the description text of the second abnormal data sequence segment.
In another possible implementation, the determining the description text of the third abnormal data sequence segment based on the description text of the first abnormal data sequence segment and the description text of the second abnormal data sequence segment includes:
if the description text of the first abnormal data sequence segment is the same as that of the second abnormal data sequence segment, determining the description text of the first abnormal data sequence segment as that of the third abnormal data sequence segment;
if the description text of the first abnormal data sequence segment is different from that of the second abnormal data sequence segment, determining that the abnormal data sequence segment at the overlapped part is described by the description text of the first abnormal data sequence segment or the description text of the second abnormal data sequence segment according to a preset rule.
In another possible implementation, before the matching of a corresponding description text for the numerical value sequence based on a preset rule template, the method further includes:
and preprocessing the numerical value sequence.
In one example, the preprocessing the sequence of values includes one or more of:
cleaning the numerical value sequence to remove numerical values which are larger than or equal to a first preset threshold value and numerical values which are smaller than or equal to a second preset threshold value, wherein the first preset threshold value is larger than the second preset threshold value;
and carrying out standardization processing on the numerical value sequence.
In another possible implementation, the method further comprises:
and performing knowledge extraction based on the description text for forming a target knowledge graph related to the semiconductor manufacturing.
In a second aspect, the present invention provides an apparatus for generating text for structured numerical data, comprising:
an acquisition module configured to acquire structured numerical data, the structured numerical data being a sequence of values related to semiconductor manufacturing;
and the matching module is configured to match a corresponding description text for the numerical value sequence based on a preset rule template, wherein the description text describes the abnormal information corresponding to the numerical value sequence.
In a third aspect, the present invention provides a computer-readable storage medium having stored thereon a computer program which, when executed in a computer, causes the computer to perform the method of the first and/or second aspect.
In a fourth aspect, the present invention also provides a computing device comprising a memory and a processor, the memory having stored therein instructions that, when executed by the processor, cause the method of the first aspect and/or the second aspect to be carried out.
In a fifth aspect, the present invention provides a computer program or computer program product comprising instructions which, when executed, cause a computer to perform the method of the first aspect and/or the second aspect.
Drawings
Fig. 1 is a schematic view of an application scenario of an apparatus for generating a text for structured numerical data according to an embodiment of the present invention;
Fig. 2 is a flowchart of a method for generating text for structured numerical data according to an embodiment of the present invention;
Fig. 3 is a schematic diagram of a rule template;
Fig. 4 is a schematic structural diagram of an apparatus for generating a text for structured numerical data according to an embodiment of the present invention;
Fig. 5 is a schematic structural diagram of a computing device according to an embodiment of the present invention.
Detailed Description
The technical solution of the present invention is further described in detail by the accompanying drawings and embodiments.
Fig. 1 is a schematic view of an application scenario of an apparatus for generating a text for structured numerical data according to an embodiment of the present invention. As shown in fig. 1, a numerical sequence generated by a semiconductor manufacturing apparatus is input to an apparatus 10, and a text sequence describing abnormality information corresponding to the numerical sequence is output.
An apparatus for generating text for structured numerical data is deployed on the device 10. A suitable computing device may be selected as the device 10 as needed, for example: various server devices, including dedicated server computers (e.g., personal computer servers, UNIX servers, terminal servers), blade servers, mainframe computers, server clusters, or any other suitable arrangement or combination; or various terminal devices, including various types of computer devices such as portable handheld devices, general purpose computers (e.g., personal computers or laptop computers), workstation computers, and wearable devices. Embodiments of the present invention do not limit what the device 10 is.
Wafer fabrication is a typical scenario in semiconductor manufacturing, and embodiments of the present invention are described below by taking wafer fabrication as an example.
Fig. 2 is a flowchart of a method for generating a text for structured numerical data according to an embodiment of the present invention. The method is applicable to the apparatus shown in fig. 1. As shown in fig. 2, a method for generating a text for structured numerical data according to an embodiment of the present invention at least includes steps S201 to S202.
In step S201, structured numerical value data is acquired.
Wherein the structured numerical data is a wafer fabrication related numerical sequence.
In one example, the wafer fabrication related value sequence may include a value sequence related to the wafer fabrication equipment (also referred to as a tool), such as equipment status data obtained by sensors during production, including but not limited to temperature, humidity, pressure, voltage, and current, as well as the utilization of the wafer fabrication equipment;
and/or wafer-related data, e.g., data obtained by wafer defect inspection during a production run (e.g., wafer defect data); data obtained by electrical testing of the wafer in the production flow (for example, failure type data of the wafer, including CPU interval failure, GPU interval failure, storage interval failure, and the like); wafer yield data; and the like.
The data may be obtained in various ways. For example, it may be obtained directly from the collecting device of each machine in the semiconductor manufacturing process, that is, the collecting device that collects the real-time data of each machine sends the collected data directly to the device 10; or it may be retrieved from a database storing the structured numerical data generated by each machine in the semiconductor manufacturing process; or it may be obtained by receiving user input.
In step S202, a corresponding description text is matched for the numerical value sequence based on a preset rule template, and the description text describes abnormal information corresponding to the numerical value sequence.
In step S202, the numerical sequence is converted into a natural language text description to better disclose the abnormal phenomena in the wafer manufacturing equipment production process.
In one example, the abnormal data sequence segment in the numerical sequence may be determined based on the distribution characteristics of the numerical sequence, and then the corresponding description text may be matched for the abnormal data sequence segment based on a preset rule template.
For example, a numerical sequence is first encoded, a sliding window is designed, abnormal data in the numerical sequence is automatically identified according to the characteristic distribution of the numerical sequence, and the abnormal data is expressed as text data describing the data.
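The sliding-window idea can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the embodiment: the function name `find_abnormal_segments`, the window size, and the rule "the window mean deviates from the global mean by more than k standard deviations" are all hypothetical stand-ins for the distribution characteristics mentioned above.

```python
from statistics import mean, pstdev


def find_abnormal_segments(values, window=3, k=1.0):
    """Return inclusive [start, end] index intervals of abnormal windows.

    A window is flagged when its mean deviates from the global mean by more
    than k global standard deviations (an assumed criterion).
    """
    if len(values) < window:
        return []
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant sequence has no deviation to detect.
        return []
    segments = []
    for start in range(len(values) - window + 1):
        win = values[start:start + window]
        if abs(mean(win) - mu) > k * sigma:
            segments.append([start, start + window - 1])
    return segments
```

Each returned interval is a position interval in the numerical sequence, so overlapping intervals can later be merged as described below in the text.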
In another example, the preset rule template includes a plurality of abnormal judgment conditions and abnormal phenomenon descriptions corresponding to the abnormal judgment conditions;
matching corresponding description texts for the abnormal data sequence segments based on a preset rule template, wherein the matching comprises the following steps: determining a target abnormal judgment condition which is met by the abnormal data sequence segment in a plurality of abnormal judgment conditions; and acquiring abnormal phenomenon description corresponding to the target abnormal judgment condition as a description text.
In one implementation, the plurality of abnormality judgment conditions and the abnormal phenomenon descriptions corresponding to them are as shown in fig. 3. When the abnormal data sequence segment of the numerical sequence satisfies "1 point is outside the control limits", that is, one point falls outside region A, the corresponding abnormal phenomenon is described as "A large shift", that is, a large deviation exists. In other words, the description text corresponding to the abnormal data sequence segment of the numerical value sequence is "A large shift".
The rule template may be implemented programmatically to obtain text describing the anomaly data. An exemplary, programmed implementation of a rule template is as follows:
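The exemplary listing is not present in this text. As a hedged illustration only, a rule template can be sketched as a list of (condition, description) pairs. Only the pairing of "1 point is outside the control limits" with "A large shift" comes from the description; the names `one_point_outside_limits`, `RULE_TEMPLATE`, and `match_description`, and the assumption that the values are z-scores with control limits at +3 and -3, are hypothetical.

```python
def one_point_outside_limits(segment, ucl=3.0, lcl=-3.0):
    """Condition: at least one point falls outside the control limits.

    The default limits assume the segment has already been standardized.
    """
    return any(x > ucl or x < lcl for x in segment)


# Each entry pairs an abnormality judgment condition with its description.
RULE_TEMPLATE = [
    (one_point_outside_limits, "A large shift"),
]


def match_description(segment, rules=RULE_TEMPLATE):
    """Return the description of the first satisfied condition, or None."""
    for condition, description in rules:
        if condition(segment):
            return description
    return None
```

Further conditions from fig. 3 could be appended to `RULE_TEMPLATE` in the same shape.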
In another example, when the numerical sequence includes a plurality of abnormal data sequence segments, the method further includes: obtaining the position intervals of the plurality of abnormal data sequence segments in the numerical sequence; merging a first abnormal data sequence segment and a second abnormal data sequence segment, whose position intervals overlap, to obtain a third abnormal data sequence segment; and determining the description text of the third abnormal data sequence segment based on the description text of the first abnormal data sequence segment and the description text of the second abnormal data sequence segment. If the description text of the first abnormal data sequence segment is the same as that of the second abnormal data sequence segment, that description text is determined as the description text of the third abnormal data sequence segment. If the two description texts differ, whether the abnormal data sequence segment at the overlapped part is described by the description text of the first abnormal data sequence segment or by that of the second abnormal data sequence segment is determined according to a preset rule.
In other words, the matching results of the multiple abnormal data sequence segments are combined, and abnormal data sequence segments whose position intervals overlap are merged. For example, the 12 detection result interval sets of the 8 abnormality detection templates are merged, where each interval is a list and each interval set is a list of such intervals.
For example, [[0,2], [4,5], [6,10], [8,20]] is one interval set and [0,2] is one of its intervals. There are 12 interval sets in total; interval merging is performed on each interval set separately, finally yielding 12 merged interval sets.
The interval merging process is as follows:
1. traversing the 12 interval sets;
2. removing empty intervals from the interval set to avoid errors;
3. if the number of intervals in the interval set is less than two, merging is not needed;
4. if the number of intervals in the interval set is two or more, sorting the intervals of the originally input interval set by left boundary in ascending order and traversing them: if the right boundary of the current interval >= the left boundary of the next interval, merging the two intervals into one continuous interval, whose left boundary is the left boundary of the current interval and whose right boundary is the right boundary of the next interval;
5. repeating until the traversal of the interval set is completed, and determining the interval set corresponding to the template.
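The steps above can be sketched as follows. This is a minimal illustration; the function names `merge_intervals` and `merge_all` are hypothetical. One deliberate deviation is labeled in the code: the merged right boundary is taken as the larger of the two right boundaries, an added safeguard for the case where the next interval is nested inside the current one.

```python
def merge_intervals(intervals):
    """Merge overlapping [left, right] intervals within one interval set."""
    # Step 2: drop empty intervals.
    cleaned = [list(iv) for iv in intervals if iv]
    # Step 3: fewer than two intervals need no merging.
    if len(cleaned) < 2:
        return cleaned
    # Step 4: sort by left boundary, then merge overlapping neighbours.
    cleaned.sort(key=lambda iv: iv[0])
    merged = [cleaned[0]]
    for left, right in cleaned[1:]:
        current = merged[-1]
        if current[1] >= left:
            # Assumption: max() guards against a nested next interval.
            current[1] = max(current[1], right)
        else:
            merged.append([left, right])
    return merged


def merge_all(interval_sets):
    """Steps 1 and 5: merge each interval set separately."""
    return [merge_intervals(s) for s in interval_sets]
```

For the interval set given in the text, `merge_intervals([[0, 2], [4, 5], [6, 10], [8, 20]])` yields `[[0, 2], [4, 5], [6, 20]]`.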
After the intervals are merged, the description of an overlapped abnormal data sequence segment is determined according to a preset template covering principle. For example, the template covering principle may be preset as 7 -> 9 -> 2 and 8 -> 10 -> 3, that is: template 7 covers template 9 and template 2, template 9 covers template 2, template 8 covers template 10 and template 3, and template 10 covers template 3. Thus, if the description texts matched with the abnormal data segment at the overlapping portion are those of template 7 and template 9, then, since template 7 covers template 9 under the preset principle, the abnormal data segment at the overlapping portion is described by the description text of template 7.
For example, if a superordinate phenomenon and a subordinate phenomenon overlap in interval, only the superordinate phenomenon is kept for the overlapping part of the description. Take template 7, "a medium shift on the upper side", with interval [10,20], and template 9, "a small shift on the upper side", with interval [15,30]. Before template coverage the text is: "From 10 to 20, there was a medium shift on the upper side; and from 15 to 30, there was a small shift on the upper side". After coverage it is: "From 10 to 20, there was a medium shift on the upper side; and from 21 to 30, there was a small shift on the upper side" (the small shift is the subordinate phenomenon, and its [15,20] portion is covered by the superordinate phenomenon, the medium shift).
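The covering example can be reproduced with a short sketch. It is illustrative only: the names `COVER_CHAINS`, `covers`, `trim_covered`, and `describe` are hypothetical, positions are assumed to be integers, and the sentence format is copied from the example text rather than from any stated specification.

```python
# Covering chains from the text: earlier templates cover later ones.
COVER_CHAINS = [[7, 9, 2], [8, 10, 3]]


def covers(a, b):
    """True if template a covers template b under COVER_CHAINS."""
    for chain in COVER_CHAINS:
        if a in chain and b in chain:
            return chain.index(a) < chain.index(b)
    return False


def trim_covered(dominant, subordinate):
    """Return the parts of `subordinate` not overlapped by `dominant`."""
    d_start, d_end = dominant
    s_start, s_end = subordinate
    if s_end < d_start or s_start > d_end:
        return [[s_start, s_end]]  # no overlap, nothing to trim
    pieces = []
    if s_start < d_start:
        pieces.append([s_start, d_start - 1])
    if s_end > d_end:
        pieces.append([d_end + 1, s_end])
    return pieces


def describe(segments):
    """Render ([start, end], phrase) pairs as one description sentence."""
    parts = [f"from {s} to {e}, there was {p}" for (s, e), p in segments]
    text = "; and ".join(parts)
    return text[:1].upper() + text[1:]
```

With template 7 on [10, 20] and template 9 on [15, 30], `covers(7, 9)` holds, `trim_covered([10, 20], [15, 30])` leaves `[[21, 30]]`, and `describe` then produces the post-coverage sentence quoted above.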
In another example, the sequence of values generated by the wafer fabrication facility may be SPC (Statistical Process Control) type data.
It is understood that SPC is a process control tool that uses mathematical statistical methods. It analyzes and evaluates the production process, promptly detects signs of systematic factors from feedback information, and takes measures to eliminate their influence, so that the process is kept in a controlled state influenced only by random factors, thereby controlling quality. The state of the process is monitored by statistical methods, and variation in product quality is reduced by ensuring that the production process is in a controlled state. The data generated by the equipment in the semiconductor manufacturing process is SPC type data. Currently, engineers run this type of data through an SPC tool, that is, analyze it statistically, and then write the analysis results up as a report for defect analysis and yield improvement.
According to the method for generating the text aiming at the structured numerical data, SPC data generated by equipment in the semiconductor manufacturing process are directly converted into the SPC event description text which is easier to identify through a template matching method, so that some abnormal condition information existing in the semiconductor manufacturing process can be better disclosed, a yield analysis engineer can conveniently find wafer defects and root causes, and a decision can be made aiming at the defect roots to improve the yield of wafers.
The method for generating the text for the structured numerical data provided by the embodiment of the invention further includes a step of preprocessing the numerical sequence before the step S202.
For example, the step of preprocessing the numerical sequence may include performing a cleaning process on the numerical sequence to remove a numerical value greater than or equal to a first preset threshold and a numerical value less than or equal to a second preset threshold, where the first preset threshold is greater than the second preset threshold; and/or, normalizing the numerical sequence.
Because the numerical sequence may contain some extremely large values far above the normal level or some extremely small values far below it, and these values seriously distort the average value for the wafer, the average is computed after such values are removed, that is, after the numerical sequence is cleaned. The preset thresholds may be set according to the actual situation, and their values are not limited in the embodiment of the present invention.
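The cleaning step can be sketched in a few lines. The function name `clean_sequence` and the parameter names are hypothetical; the comparison follows the description, which removes values greater than or equal to the first threshold and values less than or equal to the second.

```python
def clean_sequence(values, upper, lower):
    """Keep only values strictly between the two preset thresholds.

    `upper` is the first preset threshold and `lower` the second; the
    description requires the first to be larger than the second.
    """
    if upper <= lower:
        raise ValueError("upper threshold must be greater than lower threshold")
    return [x for x in values if lower < x < upper]
```

For instance, with thresholds 100 and -50, the sequence `[1, 100, 5, -50, 7]` is reduced to `[1, 5, 7]`.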
Because different SPC value ranges are different (for example, ucl and lcl are different), and the range in which the value sequence is judged to be abnormal is also different, different SPC value sequences need to be scaled correspondingly to be included in the same range, that is, the value sequence is standardized in the following manner:
1. calculating the mean μ and the standard deviation σ of the SPC data of the same type with the outliers removed (namely, the numerical sequence after the cleaning treatment);
2. x'[i] = (x[i] - μ) / σ (i.e., z-score normalization);
3. ucl = μ + 3σ;
4. lcl = μ - 3σ;
5. in the numerical sequence thus processed, values exceeding ucl become > 3 and values less than lcl become < -3, i.e., the range defined by ucl and lcl for the SPC data becomes [-3, 3].
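A minimal sketch of this standardization follows. The function names `standardize` and `control_limits` are hypothetical, and the use of the population standard deviation (`statistics.pstdev`) is an assumption, since the text does not say whether the population or sample form is meant.

```python
from statistics import mean, pstdev


def standardize(values):
    """Return the z-scores x'[i] = (x[i] - mu) / sigma of a cleaned sequence."""
    mu = mean(values)
    sigma = pstdev(values)  # assumption: population standard deviation
    if sigma == 0:
        raise ValueError("standard deviation is zero; z-scores are undefined")
    return [(x - mu) / sigma for x in values]


def control_limits(values):
    """Return (ucl, lcl) = (mu + 3*sigma, mu - 3*sigma) in original units."""
    mu = mean(values)
    sigma = pstdev(values)
    return mu + 3 * sigma, mu - 3 * sigma
```

Substituting ucl into the z-score formula gives (μ + 3σ - μ) / σ = 3, and likewise -3 for lcl, which is why the limits of every standardized sequence fall at [-3, 3].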
Those skilled in the art will appreciate that ucl denotes the upper specification limit of a characteristic value, i.e., a product characteristic greater than ucl is rejected as nonconforming; lcl denotes the lower specification limit of the characteristic value, i.e., a product characteristic less than lcl is rejected as nonconforming.
In one example, the method may further include performing knowledge extraction based on the description text to form a target knowledge graph related to wafer fabrication, where the target knowledge graph may be used for automatic inference related to wafer fabrication, for example, one or more of defects, failure categories, root causes of defect/failure categories, and corresponding decisions for solving the root causes of the wafers may be inferred according to SPC data of the wafers to improve yields of the wafers.
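The description does not state how knowledge extraction is performed. Purely as a hedged illustration, the sketch below turns a description text of the form used in the covering example into (subject, relation, object) triples that could feed a knowledge graph. The pattern, the relation name `has_abnormality`, the `sequence_id` argument, and the function name `extract_triples` are all assumptions.

```python
import re

# Assumed sentence shape: "from <start> to <end>, there was <phenomenon>".
PATTERN = re.compile(r"from (\d+) to (\d+), there was ([^;]+)", re.IGNORECASE)


def extract_triples(sequence_id, text):
    """Extract (segment, relation, phenomenon) triples from a description."""
    triples = []
    for start, end, phenomenon in PATTERN.findall(text):
        segment = f"{sequence_id}[{start},{end}]"
        triples.append((segment, "has_abnormality", phenomenon.strip()))
    return triples
```

Each triple links a position interval of one numerical sequence to the abnormal phenomenon reported for it; a real pipeline would also need to link these to equipment, defects, and root causes.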
Based on the same concept as the foregoing method embodiment, the embodiment of the present invention further provides an apparatus 400 for generating text for structured numerical data, where the apparatus 400 for generating text for structured numerical data includes units or modules to implement the steps in the methods shown in fig. 2 and 3.
Fig. 4 is a schematic structural diagram of an apparatus for generating a text for structured numerical data according to an embodiment of the present invention. As shown in fig. 4, the apparatus 400 for generating text for structured numerical data at least comprises:
an obtaining module 401 configured to obtain structured numerical data, which is a numerical sequence related to wafer manufacturing;
a matching module 402, configured to match a corresponding description text for the numerical value sequence based on a preset rule template, where the description text describes abnormal information corresponding to the numerical value sequence.
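The two-module structure can be sketched as follows. This is an assumed arrangement for illustration: the class names, the zero-argument `source` callable, and the (condition, description) rule shape are hypothetical and not taken from fig. 4.

```python
class ObtainingModule:
    """Obtains the structured numerical data as a numerical sequence."""

    def __init__(self, source):
        # `source`: any zero-argument callable returning numbers (assumed).
        self._source = source

    def obtain(self):
        return list(self._source())


class MatchingModule:
    """Matches description texts using a preset rule template."""

    def __init__(self, rules):
        # `rules`: list of (condition, description) pairs (assumed shape).
        self._rules = rules

    def match(self, sequence):
        return [desc for cond, desc in self._rules if cond(sequence)]


class TextGenerationApparatus:
    """Wires the obtaining module to the matching module."""

    def __init__(self, obtaining, matching):
        self.obtaining = obtaining
        self.matching = matching

    def run(self):
        return self.matching.match(self.obtaining.obtain())
```

Keeping acquisition behind a callable lets the same matching module serve data that comes from a collecting device, a database, or user input.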
The apparatus 400 for generating a text for structured numerical data according to the embodiment of the present invention may correspond to executing the method described in the embodiment of the present invention, and the above and other operations and/or functions of each module in the apparatus 400 for generating a text for structured numerical data are respectively for implementing the corresponding flows of each method in fig. 2 and 3, and for brevity, the detailed implementation may refer to the above description, and is not repeated herein.
Embodiments of the present invention also provide a computing device comprising at least one processor, a memory, and a communication interface, wherein the processor is configured to execute the method described in fig. 2 and 3. The computing device may be a server or a terminal device.
Fig. 5 is a schematic structural diagram of a computing device according to an embodiment of the present invention.
As shown in fig. 5, the computing device 500 includes at least one processor 501, a memory 502, and a communication interface 503. The processor 501, the memory 502 and the communication interface 503 are communicatively connected, and communication can be achieved wirelessly or by wire. The communication interface 503 is used for receiving user instructions or information sent by the collecting device; the memory 502 stores computer instructions that are executed by the processor 501 to perform the methods of the foregoing method embodiments.
It should be understood that, in the embodiment of the present invention, the processor 501 may be a central processing unit (CPU), and the processor 501 may also be another general purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc. The general purpose processor may be a microprocessor or any conventional processor or the like.
The memory 502 may include both read-only memory and random access memory, and provides instructions and data to the processor 501. Memory 502 may also include non-volatile random access memory.
The memory 502 may be either volatile memory or nonvolatile memory, or may include both volatile and nonvolatile memory. The non-volatile memory may be a read-only memory (ROM), a programmable ROM (PROM), an erasable PROM (EPROM), an electrically erasable PROM (EEPROM), or a flash memory. Volatile memory can be random access memory (RAM), which acts as external cache memory. By way of example, but not limitation, many forms of RAM are available, such as static random access memory (static RAM, SRAM), dynamic random access memory (DRAM), synchronous dynamic random access memory (SDRAM), double data rate synchronous dynamic random access memory (DDR SDRAM), enhanced SDRAM (ESDRAM), synchronous link DRAM (SLDRAM), and direct rambus RAM (DR RAM).
It should be understood that the computing device 500 according to the embodiment of the present invention may execute the method shown in fig. 2 and 3 according to the embodiment of the present invention, and the detailed description of the implementation of the method is referred to above and is not repeated herein for brevity.
An embodiment of the invention provides a computer-readable storage medium, on which a computer program is stored, which computer program, when being executed by a processor, causes the above-mentioned method of generating text for structured numerical data to be implemented.
An embodiment of the present invention provides a computer program or computer program product comprising instructions which, when executed, cause a computer to perform a method of generating text for structured numerical data as set out above.
It will be further appreciated by those of ordinary skill in the art that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein may be implemented in electronic hardware, computer software, or a combination of both. To clearly illustrate the interchangeability of hardware and software, the components and steps of the examples have been described above generally in terms of their functions. Whether these functions are performed in hardware or software depends on the particular application and the design constraints of the technical solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
The steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied in hardware, in a software module executed by a processor, or in a combination of the two. A software module may reside in random access memory (RAM), internal memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
The above-mentioned embodiments are intended to illustrate the objects, technical solutions and advantages of the present invention in further detail, and it should be understood that the above-mentioned embodiments are merely exemplary embodiments of the present invention, and are not intended to limit the scope of the present invention, and any modifications, equivalent substitutions, improvements and the like made within the spirit and principle of the present invention should be included in the scope of the present invention.
Claims (10)
1. A method of generating text for structured numerical data, comprising:
acquiring structured numerical data, wherein the structured numerical data is a numerical sequence related to semiconductor manufacturing;
and matching corresponding description texts for the numerical value sequences based on a preset rule template, wherein the description texts describe abnormal information corresponding to the numerical value sequences.
2. The method of claim 1, further comprising:
determining abnormal data sequence segments in the numerical value sequence based on the distribution characteristics of the numerical value sequence;
the matching of the corresponding description text for the numerical sequence based on the preset rule template comprises the following steps:
and matching corresponding description texts for the abnormal data sequence segments based on a preset rule template.
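The segment detection recited in claim 2 could be sketched as follows. The claim only says that segments are determined from the distribution characteristics of the numerical sequence; the specific criterion used here (points more than `k` standard deviations from the mean, grouped into consecutive runs) and the names `find_abnormal_segments` and `k` are assumptions made for illustration.

```python
from statistics import mean, pstdev
from typing import List, Sequence, Tuple


def find_abnormal_segments(values: Sequence[float], k: float = 2.0) -> List[Tuple[int, int]]:
    """Return inclusive (start, end) index intervals of consecutive points
    lying more than k standard deviations from the mean (assumed criterion)."""
    if not values:
        return []
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return []  # a constant sequence has no outliers under this criterion

    segments: List[Tuple[int, int]] = []
    start = None  # start index of the run currently being collected
    for i, v in enumerate(values):
        abnormal = abs(v - mu) > k * sigma
        if abnormal and start is None:
            start = i
        elif not abnormal and start is not None:
            segments.append((start, i - 1))
            start = None
    if start is not None:  # close a run that reaches the end of the sequence
        segments.append((start, len(values) - 1))
    return segments
```

Each returned interval is a candidate abnormal data sequence segment that can then be matched against the preset rule template.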
3. The method according to claim 2, wherein the preset rule template comprises a plurality of abnormal judgment conditions and abnormal phenomenon descriptions corresponding to the abnormal judgment conditions;
the matching of the corresponding description text for the abnormal data sequence segment based on the preset rule template comprises the following steps:
determining a target abnormality judgment condition which is met by the abnormal data sequence segment in the plurality of abnormality judgment conditions;
and acquiring abnormal phenomenon description corresponding to the target abnormal judgment condition as the description text.
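A minimal sketch of the rule-template matching recited in claim 3, assuming each rule pairs an abnormality judgment condition (a predicate over a segment) with an abnormal phenomenon description. The two rules shown, and the names `Rule`, `RULE_TEMPLATE`, and `match_description`, are hypothetical and do not come from the patent text.

```python
from typing import Callable, List, NamedTuple, Optional, Sequence


class Rule(NamedTuple):
    condition: Callable[[Sequence[float]], bool]  # abnormality judgment condition
    description: str                              # abnormal phenomenon description


# Hypothetical template; a real one would hold domain-specific conditions.
RULE_TEMPLATE: List[Rule] = [
    Rule(lambda seg: len(seg) >= 2 and all(b > a for a, b in zip(seg, seg[1:])),
         "values rise continuously"),
    Rule(lambda seg: len(seg) >= 2 and all(b < a for a, b in zip(seg, seg[1:])),
         "values fall continuously"),
]


def match_description(segment: Sequence[float],
                      rules: Sequence[Rule] = RULE_TEMPLATE) -> Optional[str]:
    """Return the description of the first rule whose condition the segment
    meets, or None when no condition is met."""
    for rule in rules:
        if rule.condition(segment):
            return rule.description
    return None
```

Returning the first satisfied rule is a design choice of this sketch; the claim does not state how ties between several satisfied conditions are resolved.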
4. A method according to claim 2 or 3, wherein the sequence of values comprises a plurality of abnormal data sequence segments;
the method further comprises the following steps:
acquiring position intervals of the abnormal data sequence segments in the numerical value sequence;
merging the first abnormal data sequence segment and the second abnormal data sequence segment to obtain a third abnormal data sequence segment, wherein the position intervals of the first abnormal data sequence segment and the second abnormal data sequence segment are overlapped;
and determining the description text of the third abnormal data sequence segment based on the description text of the first abnormal data sequence segment and the description text of the second abnormal data sequence segment.
5. The method of claim 4, wherein determining the description text of the third abnormal data sequence segment based on the description text of the first abnormal data sequence segment and the description text of the second abnormal data sequence segment comprises:
if the description text of the first abnormal data sequence segment is the same as that of the second abnormal data sequence segment, determining the description text of the first abnormal data sequence segment as that of the third abnormal data sequence segment;
if the description text of the first abnormal data sequence segment is different from that of the second abnormal data sequence segment, determining that the abnormal data sequence segment at the overlapped part is described by the description text of the first abnormal data sequence segment or the description text of the second abnormal data sequence segment according to a preset rule.
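The merging in claims 4 and 5 can be sketched as below, under stated assumptions: position intervals are inclusive index pairs, the "preset rule" for conflicting texts is reduced to a `prefer_first` flag, and the non-overlapping parts keep the text of the segment they came from. None of these details, nor the names `Segment`, `overlaps`, and `merge`, are specified in the claims.

```python
from typing import List, NamedTuple


class Segment(NamedTuple):
    start: int  # inclusive position in the numerical sequence
    end: int    # inclusive position in the numerical sequence
    text: str   # description text matched for this segment


def overlaps(a: Segment, b: Segment) -> bool:
    """True when the two position intervals share at least one index."""
    return a.start <= b.end and b.start <= a.end


def merge(first: Segment, second: Segment, prefer_first: bool = True) -> List[Segment]:
    """Merge two overlapping segments and return the described pieces of the
    merged (third) segment, ordered by position."""
    if not overlaps(first, second):
        raise ValueError("segments do not overlap")
    start, end = min(first.start, second.start), max(first.end, second.end)
    if first.text == second.text:
        # Same description: the merged segment simply inherits it.
        return [Segment(start, end, first.text)]

    # Different descriptions: the overlapped part takes one text by rule.
    ov_start, ov_end = max(first.start, second.start), min(first.end, second.end)
    winner = first if prefer_first else second
    pieces = [Segment(ov_start, ov_end, winner.text)]
    for seg in (first, second):
        if seg.start < ov_start:
            pieces.append(Segment(seg.start, ov_start - 1, seg.text))
        if seg.end > ov_end:
            pieces.append(Segment(ov_end + 1, seg.end, seg.text))
    return sorted(pieces)
```

For example, merging positions 0 to 5 described as "A" with positions 3 to 8 described as "B" yields 0 to 2 as "A", the overlap 3 to 5 as "A" (first preferred), and 6 to 8 as "B".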
6. The method according to any one of claims 1 to 5, wherein the matching of the corresponding description text for the numerical value sequence based on a preset rule template further comprises:
and preprocessing the numerical value sequence.
7. The method of claim 6, wherein the preprocessing the sequence of values comprises one or more of:
cleaning the numerical value sequence to remove numerical values which are larger than or equal to a first preset threshold value and numerical values which are smaller than or equal to a second preset threshold value, wherein the first preset threshold value is larger than the second preset threshold value;
and carrying out standardization processing on the numerical value sequence.
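The two preprocessing options of claim 7 might look like the sketch below. The cleaning step follows the claim wording directly (drop values at or above the first threshold and at or below the second). The claim does not say which standardization is used, so z-score scaling is an assumption here, as are the names `clean` and `standardize`.

```python
from statistics import mean, pstdev
from typing import List, Sequence


def clean(values: Sequence[float], upper: float, lower: float) -> List[float]:
    """Remove values >= upper (first preset threshold) and values <= lower
    (second preset threshold); upper must be greater than lower."""
    if upper <= lower:
        raise ValueError("the first threshold must be greater than the second")
    return [v for v in values if lower < v < upper]


def standardize(values: Sequence[float]) -> List[float]:
    """Z-score standardization (assumed): subtract the mean and divide by the
    population standard deviation."""
    if not values:
        return []
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]  # constant sequence: nothing to scale
    return [(v - mu) / sigma for v in values]
```

Cleaning before standardizing keeps extreme readings from distorting the mean and standard deviation used for scaling; the claim allows either step alone or both.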
8. The method of any of claims 1-7, further comprising:
and performing knowledge extraction based on the description text for forming a target knowledge graph related to the semiconductor manufacturing.
9. An apparatus for generating text for structured numerical data, comprising:
an acquisition module configured to acquire structured numerical data, the structured numerical data being a sequence of values related to semiconductor manufacturing;
and the matching module is configured to match a corresponding description text for the numerical sequence based on a preset rule template, wherein the description text describes abnormal information corresponding to the numerical sequence.
10. A computer-readable storage medium, on which a computer program is stored, which, when the computer program is executed in a computer, causes the computer to carry out the method of any one of claims 1-8.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210343943.9A CN114637782A (en) | 2022-04-02 | 2022-04-02 | Method and device for generating text aiming at structured numerical data |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210343943.9A CN114637782A (en) | 2022-04-02 | 2022-04-02 | Method and device for generating text aiming at structured numerical data |
Publications (1)
Publication Number | Publication Date |
---|---|
CN114637782A (en) | 2022-06-17 |
Family
ID=81950946
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210343943.9A (Pending), CN114637782A (en) | Method and device for generating text aiming at structured numerical data | 2022-04-02 | 2022-04-02 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114637782A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116089808A (en) * | 2023-02-06 | 2023-05-09 | 迪爱斯信息技术股份有限公司 | Feature selection method and device |
CN116090559A (en) * | 2023-02-03 | 2023-05-09 | 深圳智现未来工业软件有限公司 | Method for generating knowledge points based on wafer map detection data |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2001101184A (en) * | 1999-10-01 | 2001-04-13 | Nippon Telegr & Teleph Corp <Ntt> | Method and device for generating structurized document and storage medium with structurized document generation program stored therein |
JP2006172343A (en) * | 2004-12-20 | 2006-06-29 | Nec Corp | Structured document evaluation data generator, structured document evaluation data generation program and structured document inspection system |
CN104428762A (en) * | 2012-08-17 | 2015-03-18 | 英特尔公司 | Traversing data utilizing data relationships |
CN110377902A (en) * | 2019-06-21 | 2019-10-25 | 北京百度网讯科技有限公司 | The training method and device of text generation model are described |
2022-04-02: CN application CN202210343943.9A (CN114637782A), status: active, Pending
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN114637782A (en) | Method and device for generating text aiming at structured numerical data | |
CN109711440B (en) | Data anomaly detection method and device | |
US8233494B2 (en) | Hierarchical and incremental multivariate analysis for process control | |
WO2012094156A2 (en) | Methods and apparatus for data analysis | |
CN109882834B (en) | Method and device for monitoring operation data of boiler equipment | |
TWI663569B (en) | Quality prediction method for multi-workstation system and system thereof | |
CN114551271A (en) | Method and device for monitoring machine operation condition, storage medium and electronic equipment | |
JP2006318263A (en) | Information analysis system, information analysis method and program | |
KR20230042041A (en) | Prediction of Equipment Failure Modes from Process Traces | |
CN112565422B (en) | Method, system and storage medium for identifying fault data of power internet of things | |
CN112700050B (en) | Method and system for predicting ultra-short-term 1 st point power of photovoltaic power station | |
CN117520741A (en) | Method for predicting and improving yield of semiconductor factory based on big data | |
CN116776647B (en) | Performance prediction method and system for composite nickel-copper-aluminum heat dissipation bottom plate | |
CN117909864A (en) | Power failure prediction system and method | |
CN117272122A (en) | Wafer anomaly commonality analysis method and device, readable storage medium and terminal | |
US20130030760A1 (en) | Architecture for analysis and prediction of integrated tool-related and material-related data and methods therefor | |
CN113327072B (en) | Data sharing method and system for intelligent manufacturing equipment process | |
US20140100806A1 (en) | Method and apparatus for matching tools based on time trace data | |
CN114172708A (en) | Method for identifying network flow abnormity | |
Bassetto et al. | Operational methods for improving manufacturing control plans: case study in a semiconductor industry | |
Ershov et al. | Approach to the clustering modeling for the strong correlative control measurements for estimation of percent of the suitable integrated circuits in the semiconductor industry | |
CN113591266A (en) | Method and system for analyzing fault probability of electric energy meter | |
CN114818275A (en) | Method and device for constructing knowledge graph for semiconductor manufacturing yield analysis | |
CN113283501A (en) | Deep learning-based equipment state detection method, device, equipment and medium | |
CN103942615A (en) | Noisy point removing method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
TA01 | Transfer of patent application right | Effective date of registration: 20230411; Address after: Building A, Tianxia International Center, No. 8 Taoyuan Road, Dawangshan Community, Nantou Street, Nanshan District, Shenzhen City, Guangdong Province, 518054, 2605; Applicant after: Shenzhen Zhixian Future Industrial Software Co.,Ltd.; Address before: 200090 A307, 3rd floor, building a, East 1223, 1687 Changyang Road, Yangpu District, Shanghai; Applicant before: Raft Ferry (Shanghai) Technology Co.,Ltd. |