CN114637782A - Method and device for generating text aiming at structured numerical data - Google Patents

Method and device for generating text aiming at structured numerical data

Info

Publication number
CN114637782A
Authority
CN
China
Prior art keywords
data sequence
numerical
abnormal data
abnormal
description text
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210343943.9A
Other languages
Chinese (zh)
Inventor
夏敏
李云健
易丛文
徐文丞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Zhixian Future Industrial Software Co ltd
Original Assignee
Raft Ferry Shanghai Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Raft Ferry Shanghai Technology Co ltd filed Critical Raft Ferry Shanghai Technology Co ltd
Priority to CN202210343943.9A priority Critical patent/CN114637782A/en
Publication of CN114637782A publication Critical patent/CN114637782A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/245Query processing
    • G06F16/2455Query execution
    • G06F16/24553Query execution of query operations
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/25Integrating or interfacing systems involving database management systems
    • G06F16/254Extract, transform and load [ETL] procedures, e.g. ETL data flows in data warehouses
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367Ontology
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/10Text processing
    • G06F40/166Editing, e.g. inserting or deleting
    • G06F40/186Templates

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Animal Behavior & Ethology (AREA)
  • General Factory Administration (AREA)

Abstract

The invention provides a method for generating a text aiming at structured numerical data, which comprises the following steps: acquiring structured numerical data, wherein the structured numerical data is a numerical sequence related to semiconductor manufacturing; and matching a corresponding description text for the numerical sequence based on a preset rule template, wherein the description text describes abnormal information corresponding to the numerical sequence. The method for generating the text aiming at the structured numerical data converts the structured numerical data into the text data which is easier to identify, and well reveals the abnormal phenomenon in the semiconductor manufacturing process.

Description

Method and device for generating text aiming at structured numerical data
Technical Field
The invention relates to the technical field of semiconductor manufacturing, in particular to a method and a device for generating texts aiming at structured numerical data.
Background
In semiconductor manufacturing, when a wafer has a yield problem, for example, a scheme for improving the yield can be found by retrospectively analyzing the historical data generated by each machine to determine the defect and/or failure characteristics of the wafer and their root causes. However, the data generated by the machines is structured numerical data. Analyzing it requires considerable professional knowledge, and the abnormal information it reveals can only be judged in combination with past experience, which increases both the workload and the professional threshold for workers.
Disclosure of Invention
The embodiment of the invention provides a method and a device for generating texts aiming at structured numerical data, which are used for converting the structured numerical data into text data which is easier to identify and better disclosing abnormal phenomena in the semiconductor manufacturing process.
In a first aspect, the present invention provides a method for generating text for structured numerical data, comprising: acquiring structured numerical data, wherein the structured numerical data is a numerical sequence related to semiconductor manufacturing; and matching a corresponding description text for the numerical sequence based on a preset rule template, wherein the description text describes abnormal information corresponding to the numerical sequence.
The method for generating the text aiming at the structured numerical data converts the structured numerical data into the text data which is easier to identify, and better reveals the abnormal phenomenon in the semiconductor manufacturing process.
In one possible implementation, the method further comprises: determining abnormal data sequence segments in the numerical value sequence based on the distribution characteristics of the numerical value sequence;
the matching of the corresponding description text for the numerical sequence based on the preset rule template comprises the following steps:
and matching corresponding description texts for the abnormal data sequence segments based on a preset rule template.
In one possible implementation, the preset rule template includes a plurality of abnormal judgment conditions and abnormal phenomenon descriptions corresponding to the abnormal judgment conditions;
the matching of the corresponding description text for the abnormal data sequence segment based on the preset rule template comprises the following steps:
determining a target abnormality judgment condition which is met by the abnormal data sequence segment in the plurality of abnormality judgment conditions;
and acquiring abnormal phenomenon description corresponding to the target abnormal judgment condition as the description text.
In another possible implementation, the sequence of values includes a plurality of anomalous data sequence segments; the method further comprises the following steps:
acquiring position intervals of the plurality of abnormal data sequence segments in the numerical value sequence;
merging the first abnormal data sequence segment and the second abnormal data sequence segment to obtain a third abnormal data sequence segment, wherein the position intervals of the first abnormal data sequence segment and the second abnormal data sequence segment are overlapped;
and determining the description text of the third abnormal data sequence segment based on the description text of the first abnormal data sequence segment and the description text of the second abnormal data sequence segment.
In another possible implementation, the determining the description text of the third abnormal data sequence segment based on the description text of the first abnormal data sequence segment and the description text of the second abnormal data sequence segment includes:
if the description text of the first abnormal data sequence segment is the same as that of the second abnormal data sequence segment, determining the description text of the first abnormal data sequence segment as that of the third abnormal data sequence segment;
if the description text of the first abnormal data sequence segment is different from that of the second abnormal data sequence segment, determining that the abnormal data sequence segment at the overlapped part is described by the description text of the first abnormal data sequence segment or the description text of the second abnormal data sequence segment according to a preset rule.
In another possible implementation, the matching, based on a preset rule template, a corresponding description text for the numerical value sequence further includes:
and preprocessing the numerical value sequence.
In one example, the preprocessing the sequence of values includes one or more of:
cleaning the numerical value sequence to remove numerical values which are larger than or equal to a first preset threshold value and numerical values which are smaller than or equal to a second preset threshold value, wherein the first preset threshold value is larger than the second preset threshold value;
and carrying out standardization processing on the numerical value sequence.
In another possible implementation, the method further comprises:
and performing knowledge extraction based on the description text for forming a target knowledge graph related to the semiconductor manufacturing.
In a second aspect, the present invention provides an apparatus for generating text for structured numerical data, comprising:
an acquisition module configured to acquire structured numerical data, the structured numerical data being a sequence of values related to semiconductor manufacturing;
and the matching module is configured to match a corresponding description text for the numerical value sequence based on a preset rule template, wherein the description text describes the abnormal information corresponding to the numerical value sequence.
In a third aspect, the present invention provides a computer-readable storage medium having stored thereon a computer program which, when executed in a computer, causes the computer to perform the method of the first and/or second aspect.
In a fourth aspect, the present invention also provides a computing device comprising a memory and a processor, the memory having stored therein instructions that, when executed by the processor, cause the method of the first aspect and/or the second aspect to be carried out.
In a fifth aspect, the present invention provides a computer program or computer program product comprising instructions which, when executed, cause a computer to perform the method of the first aspect and/or the second aspect.
Drawings
Fig. 1 is a schematic view of an application scenario of an apparatus for generating a text for structured numerical data according to an embodiment of the present invention;
Fig. 2 is a flowchart of a method for generating text for structured numerical data according to an embodiment of the present invention;
Fig. 3 is a schematic diagram of a rule template;
Fig. 4 is a schematic structural diagram of an apparatus for generating a text for structured numerical data according to an embodiment of the present invention;
Fig. 5 is a schematic structural diagram of a computing device according to an embodiment of the present invention.
Detailed Description
The technical solution of the present invention is further described in detail by the accompanying drawings and embodiments.
Fig. 1 is a schematic view of an application scenario of an apparatus for generating a text for structured numerical data according to an embodiment of the present invention. As shown in fig. 1, a numerical sequence generated by a semiconductor manufacturing apparatus is input to an apparatus 10, and a text sequence describing abnormality information corresponding to the numerical sequence is output.
An apparatus for generating text for structured numerical data is deployed on the device 10. A suitable computing device may be selected for the device 10 as needed, for example: various server devices, including dedicated server computers (e.g., personal computer servers, UNIX servers, terminal servers), blade servers, mainframe computers, server clusters, or any other suitable arrangement or combination; or various terminal devices, including portable handheld devices, general purpose computers (e.g., personal computers or laptop computers), workstation computers, wearable devices, and the like. The embodiments of the present invention do not limit the specific form of the device 10.
Wafer fabrication is a typical scenario in semiconductor manufacturing, and embodiments of the present invention are described below by taking wafer fabrication as an example.
Fig. 2 is a flowchart of a method for generating a text for structured numerical data according to an embodiment of the present invention. The method is applicable to the apparatus shown in fig. 1. As shown in fig. 2, a method for generating a text for structured numerical data according to an embodiment of the present invention at least includes steps S201 to S202.
In step S201, structured numerical value data is acquired.
Wherein the structured numerical data is a wafer fabrication related numerical sequence.
In one example, the wafer fabrication related value sequence may include a sequence of values related to wafer fabrication equipment (also referred to as a tool), such as equipment status data obtained by sensors during production, including but not limited to temperature, humidity, pressure, voltage and current, as well as the utilization of the wafer fabrication equipment;
and/or wafer-related data, e.g., data obtained by wafer defect inspection during the production flow (e.g., wafer defect data); data obtained by electrical testing of the wafer in the production flow (e.g., failure type data of the wafer, including CPU interval failure, GPU interval failure, storage interval failure, and the like); wafer yield data, and the like.
The data may be obtained in various ways. For example, it may be obtained directly from the collecting device of each machine in the semiconductor manufacturing process, that is, the collecting device that collects the real-time data of each machine sends the collected data directly to the device 10; or it may be retrieved from a database storing the structured numerical data generated by each machine in the semiconductor manufacturing process; or it may be obtained by receiving user input.
In step S202, a corresponding description text is matched for the numerical value sequence based on a preset rule template, and the description text describes abnormal information corresponding to the numerical value sequence.
In step S202, the numerical sequence is converted into a natural language text description to better disclose the abnormal phenomena in the wafer manufacturing equipment production process.
In one example, the abnormal data sequence segment in the numerical sequence may be determined based on the distribution characteristics of the numerical sequence, and then the corresponding description text may be matched for the abnormal data sequence segment based on a preset rule template.
For example, a numerical sequence is first encoded, a sliding window is designed, abnormal data in the numerical sequence is automatically identified according to the characteristic distribution of the numerical sequence, and the abnormal data is expressed as text data describing the data.
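The sliding-window idea above can be illustrated with a minimal Python sketch. It is not the implementation used in the embodiment: it assumes the sequence has already been standardized so that the control limits sit at +3 and -3, and the function name `find_abnormal_windows`, the window length and the mean-based criterion are all hypothetical choices made for illustration.

```python
from statistics import mean


def find_abnormal_windows(values, window=3, limit=3.0):
    """Return inclusive [start, end] index intervals of abnormal windows.

    Assumption (not from the source): a window is treated as abnormal
    when the absolute value of its mean exceeds `limit`.
    """
    intervals = []
    # Slide a fixed-length window over the sequence one position at a time.
    for start in range(len(values) - window + 1):
        chunk = values[start:start + window]
        if abs(mean(chunk)) > limit:
            intervals.append([start, start + window - 1])
    return intervals


sequence = [0.1, -0.2, 0.0, 3.5, 3.6, 3.4, 0.2, 0.1]
abnormal = find_abnormal_windows(sequence)
```

Each returned interval marks a candidate abnormal data sequence segment that can then be passed to the rule template for matching.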
In another example, the preset rule template includes a plurality of abnormal judgment conditions and abnormal phenomenon descriptions corresponding to the abnormal judgment conditions;
matching corresponding description texts for the abnormal data sequence segments based on a preset rule template, wherein the matching comprises the following steps: determining a target abnormal judgment condition which is met by the abnormal data sequence segment in a plurality of abnormal judgment conditions; and acquiring abnormal phenomenon description corresponding to the target abnormal judgment condition as a description text.
In one implementation, the plurality of abnormality judgment conditions and the abnormal phenomenon descriptions corresponding to them are as shown in fig. 3. When an abnormal data sequence segment of the numerical sequence satisfies "1 point is outside the control limits", that is, one point falls outside region A, the corresponding abnormal phenomenon is described as "A large shift", that is, a large deviation exists. In other words, the description text corresponding to that abnormal data sequence segment is "A large shift".
The rule template may be implemented programmatically to obtain text describing the anomaly data. An exemplary, programmed implementation of a rule template is as follows:
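[Figure BDA0003580344540000041: code listing of the exemplary rule template implementation, present only as an image in the original publication]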
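Because the original listing is not reproduced in this text, the following is only a minimal Python sketch of what a programmed rule template could look like, not the listing from the publication. It assumes a standardized sequence with control limits at +3 and -3. The first rule follows the example in the text; the condition of the second rule is hypothetical and is paired with a description that the text mentions elsewhere. The names `RULES` and `match_description` are likewise assumptions.

```python
from typing import Callable, List, Optional, Sequence, Tuple

# A rule pairs an abnormality judgment condition with its description text.
Rule = Tuple[Callable[[Sequence[float]], bool], str]

RULES: List[Rule] = [
    # From the text: 1 point outside the control limits -> "A large shift".
    (lambda seg: any(abs(x) > 3 for x in seg), "A large shift"),
    # Hypothetical condition (not from the source), shown only to
    # illustrate how further rules would be added to the template.
    (lambda seg: len(seg) >= 2 and all(1 < x <= 3 for x in seg),
     "a small shift on the upper side"),
]


def match_description(segment: Sequence[float],
                      rules: List[Rule] = RULES) -> Optional[str]:
    """Return the description of the first rule the segment satisfies."""
    for condition, description in rules:
        if condition(segment):
            return description
    return None  # no abnormality judgment condition was met


text = match_description([0.2, 3.4, 0.1])
```

Keeping conditions and descriptions together in one table mirrors the template structure described above: the target condition is found first, and its description is then returned as the description text.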
In another example, when the numerical sequence includes a plurality of abnormal data sequence segments, the method further includes: obtaining the position intervals of the plurality of abnormal data sequence segments in the numerical sequence; merging a first abnormal data sequence segment and a second abnormal data sequence segment whose position intervals overlap to obtain a third abnormal data sequence segment; and determining the description text of the third abnormal data sequence segment based on the description texts of the first and second abnormal data sequence segments. If the description text of the first abnormal data sequence segment is the same as that of the second, that description text is used as the description text of the third abnormal data sequence segment. If the two description texts differ, a preset rule determines whether the abnormal data sequence segment at the overlapping portion is described by the description text of the first abnormal data sequence segment or by that of the second.
In other words, the matching results of the multiple abnormal data sequence segments are combined, and segments whose intervals overlap are merged. For example, the 12 detection-result interval sets produced by 8 abnormality detection templates are merged; each interval set is a list, and each element of that list is itself a list representing one interval.
For example, [[0,2], [4,5], [6,10], [8,20]] is an interval set, and [0,2] is one of its intervals. There are 12 such interval sets in total; interval merging is performed on each set separately, finally yielding 12 merged interval sets.
The interval merging process is as follows:
1. Traverse the 12 interval sets;
2. Remove empty intervals from the interval set to avoid errors;
3. If the interval set contains fewer than two intervals, no merging is needed;
4. If the interval set contains two or more intervals, sort the intervals in ascending order of their left boundary and traverse them; if the right boundary of the current interval >= the left boundary of the next interval, merge the two into one continuous interval whose left boundary is the left boundary of the current interval and whose right boundary is the right boundary of the next interval;
5. Repeat until the traversal of the interval set is completed, and determine the interval set corresponding to the template.
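The merging steps above can be sketched in Python as follows. This is an illustrative sketch rather than the code of the embodiment, and the function names are assumptions. One deliberate deviation is labeled in the code: the merged right boundary is taken as the larger of the two right boundaries, so that an interval fully contained in the current one does not shrink it.

```python
def merge_intervals(intervals):
    """Merge overlapping [left, right] intervals of one interval set."""
    # Step 2: drop empty intervals.
    cleaned = [list(iv) for iv in intervals if iv]
    # Step 3: fewer than two intervals need no merging.
    if len(cleaned) < 2:
        return cleaned
    # Step 4: sort by left boundary, then merge neighbours that overlap.
    cleaned.sort(key=lambda iv: iv[0])
    merged = [cleaned[0]]
    for left, right in cleaned[1:]:
        current = merged[-1]
        if current[1] >= left:
            # Assumption: use max() so a contained interval cannot
            # reduce the right boundary of the merged interval.
            current[1] = max(current[1], right)
        else:
            merged.append([left, right])
    return merged


def merge_all(interval_sets):
    """Steps 1 and 5: merge every interval set separately."""
    return [merge_intervals(s) for s in interval_sets]


result = merge_intervals([[0, 2], [4, 5], [6, 10], [8, 20]])
```

For the interval set used as the example in the text, [6,10] and [8,20] overlap and are merged into [6,20], while [0,2] and [4,5] are left unchanged.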
The description of an overlapping abnormal data sequence segment after interval merging is determined according to a preset template coverage principle. For example, the template coverage principle may be preset as 7 -> 9 -> 2 and 8 -> 10 -> 3, that is: template 7 covers template 9 and template 2, template 9 covers template 2, template 8 covers template 10 and template 3, and template 10 covers template 3. Thus, if during interval merging the description texts matched to the abnormal data segment at the overlapping portion come from template 7 and template 9, template 7 covers template 9 according to the preset principle, and the overlapping abnormal data segment is described by the description text of template 7.
For example, if a superordinate phenomenon and a subordinate phenomenon overlap in interval, only the superordinate phenomenon is used to describe the overlap. Take template 7, "a medium shift on the upper side", with an assumed interval of [10,20], and template 9, "a small shift on the upper side", with an assumed interval of [15,30]. Before template coverage the text is: "From 10 to 20, there was a medium shift on the upper side; and from 15 to 30, there was a small shift on the upper side". After coverage it is: "From 10 to 20, there was a medium shift on the upper side; and from 21 to 30, there was a small shift on the upper side" (the small shift is the subordinate phenomenon, and its [15,20] portion is covered by the superordinate phenomenon, the medium shift).
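The coverage example above can be reproduced with a short Python sketch. It is a hedged illustration under the assumption that intervals use inclusive integer positions; the helper names `subtract_interval` and `render` are hypothetical and do not come from the source.

```python
def subtract_interval(lower, higher):
    """Return the parts of `lower` that `higher` does not cover.

    Both arguments are inclusive [left, right] integer intervals;
    `higher` belongs to the covering (superordinate) template.
    """
    lo_l, lo_r = lower
    hi_l, hi_r = higher
    if hi_r < lo_l or hi_l > lo_r:
        return [[lo_l, lo_r]]  # no overlap, nothing is covered
    parts = []
    if lo_l < hi_l:
        parts.append([lo_l, hi_l - 1])  # piece left of the cover
    if lo_r > hi_r:
        parts.append([hi_r + 1, lo_r])  # piece right of the cover
    return parts


def render(events):
    """Join (interval, description) pairs into one description text."""
    clauses = [f"from {l} to {r}, there was {text}" for (l, r), text in events]
    sentence = "; and ".join(clauses)
    return sentence[:1].upper() + sentence[1:]


medium = ([10, 20], "a medium shift on the upper side")  # template 7
small = ([15, 30], "a small shift on the upper side")    # template 9
events = [medium] + [(part, small[1])
                     for part in subtract_interval(small[0], medium[0])]
covered_text = render(events)
```

Here the subordinate interval [15,30] is trimmed to [21,30], so `covered_text` matches the sentence given for the state after coverage.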
In another example, the sequence of values generated by the wafer fabrication facility may be SPC (Statistical Process Control) type data.
It is understood that SPC is a process control tool that uses mathematical statistics. It analyzes and evaluates the production process, detects signs of systematic factors in time from feedback information, and takes measures to eliminate their influence, so that the process is kept in a controlled state affected only by random factors, thereby controlling quality. The state of the process is monitored statistically, and variation in product quality is reduced by ensuring that the production process is in a controlled state. The data generated by equipment in the semiconductor manufacturing process is SPC type data. At present, an engineer runs this type of data through an SPC tool, i.e., analyzes it statistically, and then writes the analysis result up as a report for defect analysis and yield improvement.
According to the method for generating the text aiming at the structured numerical data, SPC data generated by equipment in the semiconductor manufacturing process are directly converted into the SPC event description text which is easier to identify through a template matching method, so that some abnormal condition information existing in the semiconductor manufacturing process can be better disclosed, a yield analysis engineer can conveniently find wafer defects and root causes, and a decision can be made aiming at the defect roots to improve the yield of wafers.
The method for generating the text for the structured numerical data provided by the embodiment of the invention further includes a step of preprocessing the numerical sequence before the step S202.
For example, the step of preprocessing the numerical sequence may include performing a cleaning process on the numerical sequence to remove a numerical value greater than or equal to a first preset threshold and a numerical value less than or equal to a second preset threshold, where the first preset threshold is greater than the second preset threshold; and/or, normalizing the numerical sequence.
The numerical sequence may contain extreme values far above the normal level or far below it, and such values seriously distort the average value for the wafer. The average is therefore computed after these values have been removed, that is, after the numerical sequence has been cleaned. The preset thresholds may be set according to the actual situation, and the embodiment of the present invention does not limit their values. In short, data cleaning removes the maxima and minima that deviate from the normal level so that an average value for the wafer can be obtained.
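The cleaning step can be sketched in Python as below. The threshold values and the sample data are invented for illustration, and the function name `clean_sequence` is an assumption; only the removal rule itself (drop values greater than or equal to the first threshold and values less than or equal to the second) follows the text.

```python
from statistics import mean


def clean_sequence(values, first_threshold, second_threshold):
    """Drop values >= first_threshold and values <= second_threshold."""
    if first_threshold <= second_threshold:
        raise ValueError("the first threshold must exceed the second")
    return [v for v in values if second_threshold < v < first_threshold]


# Illustrative data: 500.0 and -200.0 play the role of extreme values.
raw = [10.1, 9.8, 500.0, 10.3, -200.0, 9.9]
cleaned = clean_sequence(raw, first_threshold=100.0, second_threshold=-100.0)
average = mean(cleaned)  # average taken only after cleaning
```

Computing the average after cleaning keeps the two extreme readings from dominating it.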
Because different SPC value ranges are different (for example, ucl and lcl are different), and the range in which the value sequence is judged to be abnormal is also different, different SPC value sequences need to be scaled correspondingly to be included in the same range, that is, the value sequence is standardized in the following manner:
1. Calculate the mean μ and the standard deviation σ of the same type of SPC data with outliers removed (namely, the numerical sequence after the cleaning treatment);
2. x'[i] = (x[i] - μ) / σ (i.e., z-score normalization);
3. ucl = μ + 3σ;
4. lcl = μ - 3σ;
5. In the numerical sequence thus processed, values exceeding ucl become > 3 and values less than lcl become < -3, i.e., the range defined by ucl and lcl for the SPC data becomes [-3, 3].
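The standardization steps above can be checked with a small Python sketch. The source does not say whether the population or the sample standard deviation is meant; the sketch assumes the population form (`statistics.pstdev`), and the function name `standardize` and the sample data are hypothetical.

```python
from statistics import mean, pstdev


def standardize(values):
    """Return (z-scores, ucl, lcl) for an already cleaned sequence."""
    mu = mean(values)
    sigma = pstdev(values)  # assumption: population standard deviation
    if sigma == 0:
        raise ValueError("a constant sequence cannot be standardized")
    ucl = mu + 3 * sigma
    lcl = mu - 3 * sigma
    scaled = [(x - mu) / sigma for x in values]
    return scaled, ucl, lcl


data = [2, 4, 4, 4, 5, 5, 7, 9]  # mean 5, population sigma 2
scaled, ucl, lcl = standardize(data)
# A value equal to ucl maps to +3 and a value equal to lcl maps to -3.
ucl_scaled = (ucl - mean(data)) / pstdev(data)
lcl_scaled = (lcl - mean(data)) / pstdev(data)
```

After this scaling, sequences with different ucl and lcl values can be judged against the same [-3, 3] range.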
Those skilled in the art will appreciate that ucl denotes the upper specification limit of a characteristic value, i.e., a product characteristic greater than ucl results in an engineering reject; lcl denotes the lower specification limit of the characteristic value, i.e., a product characteristic less than lcl results in an engineering reject.
In one example, the method may further include performing knowledge extraction based on the description text to form a target knowledge graph related to wafer fabrication, where the target knowledge graph may be used for automatic inference related to wafer fabrication, for example, one or more of defects, failure categories, root causes of defect/failure categories, and corresponding decisions for solving the root causes of the wafers may be inferred according to SPC data of the wafers to improve yields of the wafers.
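The source does not specify how knowledge extraction is performed. Purely as an illustration, the following Python sketch uses a regular expression to turn a description text of the form shown earlier into (subject, phenomenon, interval) triples that could serve as input to a knowledge graph. The pattern, the triple layout, the function name `extract_triples` and the subject label are all assumptions.

```python
import re

# Assumed sentence shape: "from <start> to <end>, there was <phenomenon>".
PATTERN = re.compile(r"from (\d+) to (\d+), there was ([^;]+)", re.IGNORECASE)


def extract_triples(description_text, subject):
    """Return (subject, phenomenon, (start, end)) triples found in the text."""
    triples = []
    for start, end, phenomenon in PATTERN.findall(description_text):
        triples.append((subject, phenomenon.strip(), (int(start), int(end))))
    return triples


text = ("From 10 to 20, there was a medium shift on the upper side; "
        "and from 21 to 30, there was a small shift on the upper side")
# "chart-1" is a hypothetical identifier for the SPC data series.
triples = extract_triples(text, subject="chart-1")
```

A production system would more likely rely on a dedicated extraction model or schema; the point here is only that template-generated text has a regular structure that makes extraction straightforward.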
Based on the same concept as the foregoing method embodiment, the embodiment of the present invention further provides an apparatus 400 for generating text for structured numerical data, where the apparatus 400 for generating text for structured numerical data includes units or modules to implement the steps in the methods shown in fig. 2 and 3.
Fig. 4 is a schematic structural diagram of an apparatus for generating a text for structured numerical data according to an embodiment of the present invention. As shown in fig. 4, the apparatus 400 for generating text for structured numerical data at least comprises:
an obtaining module 401 configured to obtain structured numerical data, which is a numerical sequence related to wafer manufacturing;
a matching module 402, configured to match a corresponding description text for the numerical value sequence based on a preset rule template, where the description text describes abnormal information corresponding to the numerical value sequence.
The apparatus 400 for generating a text for structured numerical data according to the embodiment of the present invention may correspond to executing the method described in the embodiment of the present invention, and the above and other operations and/or functions of each module in the apparatus 400 for generating a text for structured numerical data are respectively for implementing the corresponding flows of each method in fig. 2 and 3, and for brevity, the detailed implementation may refer to the above description, and is not repeated herein.
Embodiments of the present invention also provide a computing device comprising at least one processor, a memory, and a communication interface, wherein the processor is configured to execute the method described in fig. 2 and 3. The computing device may be a server or a terminal device.
Fig. 5 is a schematic structural diagram of a computing device according to an embodiment of the present invention.
As shown in fig. 5, the computing device 500 includes at least one processor 501, a memory 502, and a communication interface 503. The processor 501, the memory 502 and the communication interface 503 are communicatively connected, and communication can be achieved wirelessly or by wire. The communication interface 503 is used for receiving user instructions or information sent by the collecting device; the memory 502 stores computer instructions that are executed by the processor 501 to perform the methods of the foregoing method embodiments.
It should be understood that, in the embodiment of the present invention, the processor 501 may be a central processing unit (CPU), and the processor 501 may also be another general purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc. The general purpose processor may be a microprocessor or any conventional processor or the like.
The memory 502 may include both read-only memory and random access memory, and provides instructions and data to the processor 501. Memory 502 may also include non-volatile random access memory.
The memory 502 may be either volatile memory or nonvolatile memory, or may include both volatile and nonvolatile memory. The non-volatile memory may be a read-only memory (ROM), a programmable ROM (PROM), an erasable PROM (EPROM), an electrically erasable PROM (EEPROM), or a flash memory. Volatile memory can be random access memory (RAM), which acts as external cache memory. By way of example, but not limitation, many forms of RAM are available, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchronous link DRAM (SLDRAM), and direct rambus RAM (DR RAM).
It should be understood that the computing device 500 according to the embodiment of the present invention may execute the method shown in fig. 2 and 3 according to the embodiment of the present invention, and the detailed description of the implementation of the method is referred to above and is not repeated herein for brevity.
An embodiment of the invention provides a computer-readable storage medium, on which a computer program is stored, which computer program, when being executed by a processor, causes the above-mentioned method of generating text for structured numerical data to be implemented.
An embodiment of the present invention provides a computer program or computer program product comprising instructions which, when executed, cause a computer to perform a method of generating text for structured numerical data as set out above.
It will be further appreciated by those of ordinary skill in the art that the elements and algorithm steps of the examples described in connection with the embodiments disclosed herein may be implemented in electronic hardware, computer software, or a combination of both. To clearly illustrate this interchangeability of hardware and software, the components and steps of the examples have been described above generally in terms of their functionality. Whether these functions are performed in hardware or software depends on the particular application and the design constraints of the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
The steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied in hardware, in a software module executed by a processor, or in a combination of the two. A software module may reside in random access memory (RAM), internal memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
The above-mentioned embodiments are intended to illustrate the objects, technical solutions and advantages of the present invention in further detail, and it should be understood that the above-mentioned embodiments are merely exemplary embodiments of the present invention, and are not intended to limit the scope of the present invention, and any modifications, equivalent substitutions, improvements and the like made within the spirit and principle of the present invention should be included in the scope of the present invention.

Claims (10)

1. A method of generating text for structured numerical data, comprising:
acquiring structured numerical data, wherein the structured numerical data is a numerical sequence related to semiconductor manufacturing;
and matching corresponding description texts for the numerical value sequences based on a preset rule template, wherein the description texts describe abnormal information corresponding to the numerical value sequences.
2. The method of claim 1, further comprising:
determining abnormal data sequence segments in the numerical value sequence based on the distribution characteristics of the numerical value sequence;
the matching of the corresponding description text for the numerical sequence based on the preset rule template comprises the following steps:
and matching corresponding description texts for the abnormal data sequence segments based on a preset rule template.
3. The method according to claim 2, wherein the preset rule template comprises a plurality of abnormal judgment conditions and abnormal phenomenon descriptions corresponding to the abnormal judgment conditions;
the matching of the corresponding description text for the abnormal data sequence segment based on the preset rule template comprises the following steps:
determining a target abnormality judgment condition which is met by the abnormal data sequence segment in the plurality of abnormality judgment conditions;
and acquiring abnormal phenomenon description corresponding to the target abnormal judgment condition as the description text.
4. A method according to claim 2 or 3, wherein the sequence of values comprises a plurality of abnormal data sequence segments;
the method further comprises the following steps:
acquiring position intervals of the abnormal data sequence segments in the numerical value sequence;
merging the first abnormal data sequence segment and the second abnormal data sequence segment to obtain a third abnormal data sequence segment, wherein the position intervals of the first abnormal data sequence segment and the second abnormal data sequence segment are overlapped;
and determining the description text of the third abnormal data sequence segment based on the description text of the first abnormal data sequence segment and the description text of the second abnormal data sequence segment.
5. The method of claim 4, wherein determining the description text of the third anomalous data sequence segment based on the description text of the first anomalous data sequence segment and the description text of the second anomalous data sequence segment comprises:
if the description text of the first abnormal data sequence segment is the same as that of the second abnormal data sequence segment, determining the description text of the first abnormal data sequence segment as that of the third abnormal data sequence segment;
if the description text of the first abnormal data sequence segment is different from that of the second abnormal data sequence segment, determining that the abnormal data sequence segment at the overlapped part is described by the description text of the first abnormal data sequence segment or the description text of the second abnormal data sequence segment according to a preset rule.
6. The method according to any one of claims 1 to 5, wherein the matching of the corresponding description text for the numerical value sequence based on a preset rule template further comprises:
and preprocessing the numerical value sequence.
7. The method of claim 6, wherein the preprocessing the sequence of values comprises one or more of:
cleaning the numerical value sequence to remove numerical values which are larger than or equal to a first preset threshold value and numerical values which are smaller than or equal to a second preset threshold value, wherein the first preset threshold value is larger than the second preset threshold value;
and carrying out standardization processing on the numerical value sequence.
8. The method of any of claims 1-7, further comprising:
and performing knowledge extraction based on the description text for forming a target knowledge graph related to the semiconductor manufacturing.
9. An apparatus for generating text for structured numerical data, comprising:
an acquisition module configured to acquire structured numerical data, the structured numerical data being a sequence of values related to semiconductor manufacturing;
and the matching module is configured to match a corresponding description text for the numerical sequence based on a preset rule template, wherein the description text describes abnormal information corresponding to the numerical sequence.
10. A computer-readable storage medium, on which a computer program is stored, which, when the computer program is executed in a computer, causes the computer to carry out the method of any one of claims 1-8.
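
For illustration only, the flow recited in claims 1 to 7 can be sketched as follows. This is a minimal sketch under stated assumptions, not the implementation of the invention: the z-score threshold `k`, the two rules in `RULES`, the fallback text, the choice of keeping the earlier segment's text when merged texts differ, and all function and class names are hypothetical and do not come from the claims.

```python
from dataclasses import dataclass
from statistics import mean, pstdev
from typing import Callable, List, Sequence, Tuple


@dataclass
class Segment:
    start: int      # first index of the segment (inclusive)
    end: int        # last index of the segment (inclusive)
    text: str = ""  # description text matched for this segment


# Hypothetical rule template: (abnormality judgment condition, phenomenon description).
Rule = Tuple[Callable[[Sequence[float]], bool], str]
RULES: List[Rule] = [
    (lambda w: mean(w) > 0, "values run above the sequence mean"),
    (lambda w: mean(w) < 0, "values run below the sequence mean"),
]


def preprocess(values: Sequence[float], upper: float, lower: float) -> List[float]:
    """Claim 7: drop values >= upper or <= lower, then standardise (z-score)."""
    kept = [v for v in values if lower < v < upper]
    mu, sigma = mean(kept), pstdev(kept)
    return [(v - mu) / sigma if sigma else 0.0 for v in kept]


def find_abnormal_segments(z: Sequence[float], k: float) -> List[Segment]:
    """Claim 2 (assumed criterion): contiguous runs where |z| exceeds k."""
    segments, start = [], None
    for i, v in enumerate(z):
        if abs(v) > k:
            if start is None:
                start = i
        elif start is not None:
            segments.append(Segment(start, i - 1))
            start = None
    if start is not None:
        segments.append(Segment(start, len(z) - 1))
    return segments


def match_text(z: Sequence[float], seg: Segment, rules: List[Rule]) -> str:
    """Claim 3: return the description of the first condition the segment meets."""
    window = z[seg.start:seg.end + 1]
    for condition, description in rules:
        if condition(window):
            return description
    return "unclassified anomaly"  # assumed fallback, not in the claims


def merge_overlapping(segments: List[Segment]) -> List[Segment]:
    """Claims 4-5: merge segments whose position intervals overlap.

    Assumed preset rule: when texts differ, the earlier segment's text is kept.
    """
    merged: List[Segment] = []
    for seg in sorted(segments, key=lambda s: s.start):
        if merged and seg.start <= merged[-1].end:
            merged[-1].end = max(merged[-1].end, seg.end)
        else:
            merged.append(Segment(seg.start, seg.end, seg.text))
    return merged


def describe(values: Sequence[float], upper: float, lower: float, k: float) -> List[Segment]:
    """Claim 1 end to end; indices refer to the cleaned sequence."""
    z = preprocess(values, upper, lower)
    segments = find_abnormal_segments(z, k)
    for seg in segments:
        seg.text = match_text(z, seg, RULES)
    return merge_overlapping(segments)
```

Because the run-based detector in this sketch never yields overlapping segments on its own, `merge_overlapping` only matters when segments come from several detectors or window sizes, which is one plausible reading of claim 4. Calling `describe(values, upper=100.0, lower=-100.0, k=1.5)` on a short sequence returns a list of `Segment` objects whose `text` fields hold the matched descriptions.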
CN202210343943.9A 2022-04-02 2022-04-02 Method and device for generating text aiming at structured numerical data Pending CN114637782A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210343943.9A CN114637782A (en) 2022-04-02 2022-04-02 Method and device for generating text aiming at structured numerical data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210343943.9A CN114637782A (en) 2022-04-02 2022-04-02 Method and device for generating text aiming at structured numerical data

Publications (1)

Publication Number Publication Date
CN114637782A true CN114637782A (en) 2022-06-17

Family

ID=81950946

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210343943.9A Pending CN114637782A (en) 2022-04-02 2022-04-02 Method and device for generating text aiming at structured numerical data

Country Status (1)

Country Link
CN (1) CN114637782A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116089808A (en) * 2023-02-06 2023-05-09 迪爱斯信息技术股份有限公司 Feature selection method and device
CN116090559A (en) * 2023-02-03 2023-05-09 深圳智现未来工业软件有限公司 Method for generating knowledge points based on wafer map detection data

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2001101184A (en) * 1999-10-01 2001-04-13 Nippon Telegr & Teleph Corp <Ntt> Method and device for generating structurized document and storage medium with structurized document generation program stored therein
JP2006172343A (en) * 2004-12-20 2006-06-29 Nec Corp Structured document evaluation data generator, structured document evaluation data generation program and structured document inspection system
CN104428762A (en) * 2012-08-17 2015-03-18 英特尔公司 Traversing data utilizing data relationships
CN110377902A (en) * 2019-06-21 2019-10-25 北京百度网讯科技有限公司 The training method and device of text generation model are described

Similar Documents

Publication Publication Date Title
CN114637782A (en) Method and device for generating text aiming at structured numerical data
CN109711440B (en) Data anomaly detection method and device
US8233494B2 (en) Hierarchical and incremental multivariate analysis for process control
WO2012094156A2 (en) Methods and apparatus for data analysis
CN109882834B (en) Method and device for monitoring operation data of boiler equipment
TWI663569B (en) Quality prediction method for multi-workstation system and system thereof
CN114551271A (en) Method and device for monitoring machine operation condition, storage medium and electronic equipment
JP2006318263A (en) Information analysis system, information analysis method and program
KR20230042041A (en) Prediction of Equipment Failure Modes from Process Traces
CN112565422B (en) Method, system and storage medium for identifying fault data of power internet of things
CN112700050B (en) Method and system for predicting ultra-short-term 1 st point power of photovoltaic power station
CN117520741A (en) Method for predicting and improving yield of semiconductor factory based on big data
CN116776647B (en) Performance prediction method and system for composite nickel-copper-aluminum heat dissipation bottom plate
CN117909864A (en) Power failure prediction system and method
CN117272122A (en) Wafer anomaly commonality analysis method and device, readable storage medium and terminal
US20130030760A1 (en) Architecture for analysis and prediction of integrated tool-related and material-related data and methods therefor
CN113327072B (en) Data sharing method and system for intelligent manufacturing equipment process
US20140100806A1 (en) Method and apparatus for matching tools based on time trace data
CN114172708A (en) Method for identifying network flow abnormity
Bassetto et al. Operational methods for improving manufacturing control plans: case study in a semiconductor industry
Ershov et al. Approach to the clustering modeling for the strong correlative control measurements for estimation of percent of the suitable integrated circuits in the semiconductor industry
CN113591266A (en) Method and system for analyzing fault probability of electric energy meter
CN114818275A (en) Method and device for constructing knowledge graph for semiconductor manufacturing yield analysis
CN113283501A (en) Deep learning-based equipment state detection method, device, equipment and medium
CN103942615A (en) Noisy point removing method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20230411

Address after: Building A, Tianxia International Center, No. 8 Taoyuan Road, Dawangshan Community, Nantou Street, Nanshan District, Shenzhen City, Guangdong Province, 518054, 2605

Applicant after: Shenzhen Zhixian Future Industrial Software Co.,Ltd.

Address before: 200090 A307, 3rd floor, building a, East 1223, 1687 Changyang Road, Yangpu District, Shanghai

Applicant before: Raft Ferry (Shanghai) Technology Co.,Ltd.
