WO2022252051A1 - Data processing method, apparatus, device, and storage medium - Google Patents

Data processing method, apparatus, device, and storage medium

Info

Publication number
WO2022252051A1
Authority
WO
WIPO (PCT)
Prior art keywords
sample
value
samples
values
positive
Application number
PCT/CN2021/097393
Other languages
English (en)
French (fr)
Inventor
张帆
王海金
王洪
雷一鸣
柴栋
贺王强
吴建民
Original Assignee
京东方科技集团股份有限公司
北京中祥英科技有限公司
Application filed by 京东方科技集团股份有限公司 and 北京中祥英科技有限公司
Priority to CN202180001364.XA (CN115735203A)
Priority to PCT/CN2021/097393 (WO2022252051A1)
Publication of WO2022252051A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing

Definitions

  • The present disclosure relates to the technical field of data processing, and in particular to a data processing method, apparatus, device, and storage medium.
  • During production, the equipment that a product passes through and the parameters of that equipment affect the product's performance and may cause the performance to fall below standard (such a product is also called defective). For a substandard product, the cause therefore has to be traced back to the equipment and its parameters.
  • A data processing method, comprising: obtaining sample data of each of a plurality of samples produced within a preset time period, where the sample data includes the values of a device parameter of a device on the sample's path at each collection time, and the test result of the sample; dividing the sample data into positive samples and negative samples according to the test results; determining the sample cut points of each sample according to the values of the device parameter, and obtaining N value groups of the target values of each sample, where a sample cut point represents an abrupt-change point (mutation point) in the values of the device parameter of that sample, the target values are the values of the device parameter for which the time difference between two adjacent collection times is less than a first threshold, and N is a positive integer greater than or equal to 1; and determining a correlation quantization value according to the difference between the Mth value group of the positive samples and the Mth value group of the negative samples, where the correlation quantization value represents the degree to which the device parameter influences defective samples, and M is a positive integer less
  • Determining the correlation quantization value according to the difference between the Mth value group of the positive samples and the Mth value group of the negative samples includes: determining a first value of a statistical index of the Mth value group of the negative samples and a second value of the statistical index of the Mth value group of the positive samples, where the statistical index characterizes the central tendency or the trend of change of the values in a value group; determining the difference between the first value and the second value; and determining the correlation quantization value according to the difference.
  • Determining the difference between the first value and the second value includes: determining the difference according to a characteristic parameter of the plurality of first values of the negative samples and the characteristic parameter of the plurality of second values of the positive samples.
  • The characteristic parameter includes the value at a target position and/or the overall mean.
  • Determining the difference between the first value and the second value according to the characteristic parameters includes: determining a first difference between the value at the target position among the plurality of first values of the negative samples and the value at the target position among the plurality of second values of the positive samples; determining a second difference between the overall mean of the plurality of first values of the negative samples and the overall mean of the plurality of second values of the positive samples; and determining the difference between the first value and the second value according to the first difference, the second difference, and a preset weight.
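The combination of the first difference, the second difference, and the preset weight can be illustrated with a minimal sketch. The text does not say how the three are combined or how the target position is chosen, so everything below is an assumption made for illustration: the function names are hypothetical, the target position is taken as a fractional position in the sorted values, absolute differences are used, and the weight forms a simple linear blend.

```python
from statistics import mean


def value_at_position(values, fraction):
    """Return the value at a fractional position of the sorted values (assumed meaning of 'target position')."""
    ordered = sorted(values)
    return ordered[int(fraction * (len(ordered) - 1))]


def weighted_difference(first_values, second_values, fraction=0.5, weight=0.5):
    """Blend the target-position difference and the overall-mean difference (assumed linear blend)."""
    # First difference: value at the target position, negative samples vs. positive samples.
    first_diff = abs(value_at_position(first_values, fraction)
                     - value_at_position(second_values, fraction))
    # Second difference: overall mean, negative samples vs. positive samples.
    second_diff = abs(mean(first_values) - mean(second_values))
    return weight * first_diff + (1.0 - weight) * second_diff


# first_values: statistical index of the Mth value group, one entry per negative sample
# second_values: the same index, one entry per positive sample
difference = weighted_difference([4.0, 6.0, 8.0], [1.0, 2.0, 3.0])
```

With a weight of 1.0 only the target-position difference contributes, and with a weight of 0.0 only the overall-mean difference does.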
  • Determining the sample cut points of each sample according to the values of the device parameter, and obtaining the N value groups of the target values of each sample, includes: determining the sample data of a reference sample according to the values of the device parameter, where the reference sample is one of the positive samples; determining the signal-to-noise ratio of the reference sample and taking its absolute value; screening out the device parameter values whose absolute value is greater than the absolute value of the signal-to-noise ratio and taking them as reference sample cut points; and determining the sample cut points of each sample according to a reference ratio and the reference sample cut points, and obtaining the N value groups of the target values of each sample, where the reference ratio is the ratio of the number of device parameter values of the reference sample to the number of device parameter values of each sample.
  • Determining the sample cut points of each sample according to the reference ratio and the reference sample cut points includes: determining preliminary sample cut points of each sample according to the reference ratio and the reference sample cut points; obtaining, according to the determined sample cut points and a preset window size, the correlation between the group of device parameter values that lie within the preset window size of a sample cut point and the group of device parameter values that lie within the preset window size of the reference sample cut point; and correcting the sample cut points of each sample according to the correlation.
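The two stages described above, scaling the reference cut points by the reference ratio and then correcting them with a windowed correlation, can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, cut points are treated as list indices, Pearson correlation is used as the correlation measure, and the correction is implemented as a small search over nearby shifts that keeps the best-correlated position. The text itself does not specify any of these choices.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / sqrt(var_a * var_b)


def preliminary_cut_points(ref_cuts, ref_len, sample_len):
    """Scale reference cut indices by the reference ratio (reference count / sample count)."""
    ratio = ref_len / sample_len
    return [min(sample_len - 1, round(c / ratio)) for c in ref_cuts]


def refine_cut_point(sample, reference, cut, ref_cut, window=2, max_shift=2):
    """Shift a preliminary cut to the nearby index whose window best matches the reference window."""
    ref_lo, ref_hi = ref_cut - window, ref_cut + window + 1
    if ref_lo < 0 or ref_hi > len(reference):
        return cut  # reference window does not fit; leave the cut unchanged
    ref_win = reference[ref_lo:ref_hi]
    best_cut, best_corr = cut, float("-inf")
    for shift in range(-max_shift, max_shift + 1):
        lo, hi = cut + shift - window, cut + shift + window + 1
        if lo < 0 or hi > len(sample):
            continue  # candidate window does not fit inside the sample
        corr = pearson(sample[lo:hi], ref_win)
        if corr > best_corr:
            best_cut, best_corr = cut + shift, corr
    return best_cut


reference = [0, 0, 0, 0, 0, 5, 5, 5, 5, 5]   # abrupt change at index 5
sample = [0, 0, 0, 0, 0, 0, 5, 5, 5, 5]      # abrupt change at index 6
cut = preliminary_cut_points([5], len(reference), len(sample))[0]
corrected = refine_cut_point(sample, reference, cut, ref_cut=5)
```

In this toy series the preliminary cut lands on index 5 and the correction moves it to index 6, where the sample's own step occurs.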
  • Determining the sample data of the reference sample according to the values of the device parameter includes: performing a Fourier transform on the device parameter values of each positive sample; taking the smallest number of device parameter values among the transformed positive samples as the interception quantity; taking the first interception-quantity values of the device parameter values of each positive sample to obtain a plurality of intercepted value groups, each of which contains the interception quantity of values; obtaining, following the order of the values in each intercepted value group, the median of the values at each position across the intercepted value groups, to obtain a median sequence; and determining the sample data of the reference sample from the positive samples, where the reference sample is the positive sample that differs least from the median sequence.
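The interception and median-sequence steps can be sketched as below. This is a simplified illustration, not the disclosed implementation: the Fourier transform step is omitted, so the input sequences stand in for the already-transformed values, the function name is hypothetical, and the "difference value" from the median sequence is assumed to be the sum of absolute differences.

```python
from statistics import median


def select_reference_sample(positive_samples):
    """Return (index of the reference sample, median sequence) for a list of value sequences."""
    # Interception quantity: the smallest number of values among the positive samples.
    cut_len = min(len(s) for s in positive_samples)
    # Keep the first cut_len values of every positive sample.
    truncated = [s[:cut_len] for s in positive_samples]
    # Median of the values at each position across all intercepted value groups.
    median_seq = [median(column) for column in zip(*truncated)]

    def distance(seq):
        # Assumed difference measure: sum of absolute deviations from the median sequence.
        return sum(abs(x - m) for x, m in zip(seq, median_seq))

    best_index = min(range(len(truncated)), key=lambda i: distance(truncated[i]))
    return best_index, median_seq


samples = [[1, 2, 3, 9], [1, 2, 4], [5, 6, 7, 8, 9]]
reference_index, median_sequence = select_reference_sample(samples)
```

Choosing an actual positive sample, rather than the median sequence itself, keeps the reference tied to real collected values.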
  • Obtaining the sample data of each of the plurality of samples produced within the preset time period includes: obtaining the sample data of every sample produced within the preset time period; obtaining the number of values included in the target values of each positive sample; determining a value range according to those numbers; and filtering out the positive samples whose number of target values falls outside the value range, to obtain the sample data of each of the plurality of samples produced within the preset time period.
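A minimal sketch of this filtering step follows. The text does not state how the value range is derived from the counts, so the rule used here (the median count plus or minus a relative tolerance) is purely an assumption, as are the function name and the `tolerance` parameter.

```python
from statistics import median


def filter_positive_by_count(positive_samples, tolerance=0.2):
    """Keep positive samples whose number of target values lies within an assumed range."""
    counts = [len(s) for s in positive_samples]
    center = median(counts)
    # Assumed value range: median count +/- tolerance (relative).
    low, high = center * (1 - tolerance), center * (1 + tolerance)
    return [s for s in positive_samples if low <= len(s) <= high]


positives = [[0] * 10, [0] * 10, [0] * 11, [0] * 30]
kept = filter_positive_by_count(positives)
```

Here the sample with 30 values is dropped because its length is far from the typical count, which keeps unusually long or short runs from distorting the later comparison.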
  • The method further includes: sorting the correlation quantization values by magnitude, and outputting the resulting ranking of the value groups of the device parameters corresponding to the correlation quantization values.
  • The method further includes: outputting information parameters of a value group of the device parameter, where the information parameters include the position of the value group within the device parameter values and/or the percentage of the target values that the value group accounts for.
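The ranking and the information parameters described in the two items above can be illustrated together. The record layout and names below are hypothetical, the sort direction (largest correlation quantization value first) is an assumption, and the position is represented as a start and end index within the target values.

```python
from dataclasses import dataclass


@dataclass
class ValueGroup:
    parameter: str   # device parameter name (illustrative)
    start: int       # first index of the group within the target values
    end: int         # index one past the last value of the group
    total: int       # number of target values of that parameter
    score: float     # correlation quantization value of the group


def rank_value_groups(groups):
    """Sort groups by score, largest first, and attach position and percentage."""
    ranked = sorted(groups, key=lambda g: g.score, reverse=True)
    return [
        {
            "parameter": g.parameter,
            "position": (g.start, g.end),
            "percentage": 100.0 * (g.end - g.start) / g.total,
            "score": g.score,
        }
        for g in ranked
    ]


report = rank_value_groups([
    ValueGroup("temperature", 0, 40, 100, 0.2),
    ValueGroup("pressure", 40, 100, 100, 0.9),
])
```

The first entry of `report` is the value group with the largest correlation quantization value, which is the one a user would inspect first.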
  • A data processing method, comprising: receiving a sample screening condition input by a user on a condition selection interface; obtaining sample data of each of a plurality of samples corresponding to the sample screening condition, where the sample data includes the values of a device parameter of a device on the sample's path at each collection time, and the test result of the sample; dividing the sample data into positive samples and negative samples according to the test results; determining the sample cut points of each sample according to the values of the device parameter, and obtaining N value groups of the target values of each sample, where a sample cut point represents an abrupt-change point (mutation point) in the values of the device parameter of that sample, the target values are the values of the device parameter for which the time difference between two adjacent collection times is less than a first threshold, and N is a positive integer greater than or equal to 1; and determining a correlation quantization value according to the difference between the Mth value group of the positive samples and the Mth value group of the negative samples, where the correlation quantization value is used to
  • The method further includes: sorting the correlation quantization values by magnitude. Displaying the correlation quantization values on the analysis result display interface includes: displaying, on the analysis result display interface, the ranking of the value groups of the device parameters corresponding to the correlation quantization values.
  • The method further includes: outputting information parameters of a value group of the device parameter, where the information parameters include the position of the value group within the device parameter values and/or the percentage of the target values that the value group accounts for.
  • A data processing apparatus, including: an acquisition module, configured to obtain sample data of each of a plurality of samples produced within a preset time period, where the sample data includes the values of a device parameter of a device on the sample's path at each collection time, and the test result of the sample; a division module, configured to divide the sample data into positive samples and negative samples according to the test results; and a determination module, configured to determine the sample cut points of each sample according to the values of the device parameter and obtain N value groups of the target values of each sample, where a sample cut point represents an abrupt-change point (mutation point) in the values of the device parameter of that sample, the target values are the values of the device parameter for which the time difference between two adjacent collection times is less than a first threshold, and N is a positive integer greater than or equal to 1; the correlation quantization value is determined according to the difference between the Mth value group of the positive samples and the Mth value group of the negative samples, the correlation quantization value characterizes the degree to which the device parameter influences defective samples, and
  • The determination module is specifically configured to: determine a first value of a statistical index of the Mth value group of the negative samples and a second value of the statistical index of the Mth value group of the positive samples, where the statistical index characterizes the central tendency or the trend of change of the values in a value group; determine the difference between the first value and the second value; and determine the correlation quantization value according to the difference.
  • The determination module is specifically configured to determine the difference between the first value and the second value according to a characteristic parameter of the plurality of first values of the negative samples and the characteristic parameter of the plurality of second values of the positive samples.
  • The characteristic parameter includes the value at a target position and/or the overall mean.
  • The determination module is specifically configured to: determine a first difference between the value at the target position among the plurality of first values of the negative samples and the value at the target position among the plurality of second values of the positive samples; determine a second difference between the overall mean of the plurality of first values of the negative samples and the overall mean of the plurality of second values of the positive samples; and determine the difference between the first value and the second value according to the first difference, the second difference, and a preset weight.
  • The determination module is specifically configured to: determine the sample data of a reference sample according to the values of the device parameter, where the reference sample is one of the positive samples; determine the signal-to-noise ratio of the reference sample and take its absolute value; screen out the device parameter values whose absolute value is greater than the absolute value of the signal-to-noise ratio and take them as reference sample cut points; and determine the sample cut points of each sample according to a reference ratio and the reference sample cut points, and obtain the N value groups of the target values of each sample, where the reference ratio is the ratio of the number of device parameter values of the reference sample to the number of device parameter values of each sample.
  • The determination module is specifically configured to determine preliminary sample cut points of each sample according to the reference ratio and the reference sample cut points. The acquisition module is further configured to obtain, according to the determined sample cut points and a preset window size, the correlation between the group of device parameter values that lie within the preset window size of a sample cut point and the group of device parameter values that lie within the preset window size of the reference sample cut point. The data processing apparatus further includes a correction module, configured to correct the sample cut points of each sample according to the correlation.
  • The determination module is further configured to: perform a Fourier transform on the device parameter values of each positive sample; take the smallest number of device parameter values among the transformed positive samples as the interception quantity; take the first interception-quantity values of the device parameter values of each positive sample to obtain a plurality of intercepted value groups, each of which contains the interception quantity of values; obtain, following the order of the values in each intercepted value group, the median of the values at each position across the intercepted value groups, to obtain a median sequence; and determine the sample data of the reference sample from the positive samples, where the reference sample is the positive sample that differs least from the median sequence.
  • The acquisition module is specifically configured to: obtain the sample data of every sample produced within the preset time period; obtain the number of values included in the target values of each positive sample; determine a value range according to those numbers; and filter out the positive samples whose number of target values falls outside the value range, to obtain the sample data of each of the plurality of samples produced within the preset time period.
  • The data processing apparatus further includes: a sorting module, configured to sort the correlation quantization values by magnitude; and an output module, configured to output the resulting ranking of the value groups of the device parameters corresponding to the correlation quantization values.
  • The output module is further configured to output information parameters of a value group of the device parameter, where the information parameters include the position of the value group within the device parameter values and/or the percentage of the target values that the value group accounts for.
  • A data processing apparatus, including: a receiving module, configured to receive a sample screening condition input by a user on a condition selection interface; an acquisition module, configured to obtain sample data of each of a plurality of samples corresponding to the sample screening condition, where the sample data includes the values of a device parameter of a device on the sample's path at each collection time, and the test result of the sample; a division module, configured to divide the sample data into positive samples and negative samples according to the test results; and a determination module, configured to determine the sample cut points of each sample according to the values of the device parameter and obtain N value groups of the target values of each sample, where a sample cut point represents an abrupt-change point (mutation point) in the values of the device parameter of that sample, the target values are the values of the device parameter for which the time difference between two adjacent collection times is less than a first threshold, and N is a positive integer greater than or equal to 1; the correlation quantization value is determined according to the difference between the Mth value group of the positive samples and the Mth value group of the negative samples, and the correlation quantization value represents the degree to which the device parameter influences defective samples.
  • The data processing apparatus further includes a sorting module, configured to sort the correlation quantization values by magnitude; the display module is specifically configured to display, on the analysis result display interface, the ranking of the value groups of the device parameters corresponding to the correlation quantization values.
  • The display module is further configured to display, on the analysis result display interface, information parameters of a value group of the device parameter, where the information parameters include the position of the value group within the device parameter values and/or the percentage of the target values that the value group accounts for.
  • An electronic device, including a processor and a memory for storing instructions executable by the processor, where the processor is configured to execute the executable instructions to implement one or more steps of the data processing method provided in any of the above aspects and their embodiments.
  • A computer-readable storage medium storing computer program instructions that, when run on a processor, cause the processor to perform one or more steps of the data processing method described in any of the above embodiments.
  • A computer program product including computer program instructions that, when executed on a computer, cause the computer to perform one or more steps of the data processing method described in any of the above embodiments.
  • A computer program that, when executed on a computer, causes the computer to perform one or more steps of the data processing method described in any of the above embodiments.
  • FIG. 1 is a block diagram of a data processing system according to some embodiments.
  • FIG. 2 is a structural diagram of an electronic device according to an embodiment.
  • FIG. 3 is a flowchart of a data processing method according to an embodiment.
  • FIG. 4 is a graph of the results of determining reference sample cut points according to some embodiments.
  • FIG. 5 is a flowchart of determining a sample cut point for a sample according to some embodiments.
  • FIG. 6 is a flowchart of determining a first difference, a second difference, and a difference value between the first difference and the second difference according to some embodiments.
  • FIG. 7 is a flowchart of another data processing method according to some embodiments.
  • FIG. 8 is a structural diagram of a condition selection interface according to some embodiments.
  • FIG. 9 is a block diagram of an outcome variable input interface according to some embodiments.
  • FIG. 10 is a structural diagram of a causal variable input interface according to some embodiments.
  • FIG. 11 is a sample distribution diagram according to some embodiments.
  • FIG. 12 is a structural diagram showing related quantitative values of a value group in an analysis result display interface according to some embodiments.
  • FIG. 13 is a structural diagram showing related quantitative values of two value groups in an analysis result display interface according to some embodiments.
  • FIG. 14 is a structural diagram of a data processing device 80 according to some embodiments.
  • FIG. 15 is a structural diagram of a data processing device 90 according to some embodiments.
  • The terms "first" and "second" are used for descriptive purposes only and should not be understood as indicating or implying relative importance or as implicitly specifying the number of the indicated technical features. Thus, a feature qualified by "first" or "second" may explicitly or implicitly include one or more of that feature. In the description of the embodiments of the present disclosure, unless otherwise specified, "plurality" means two or more.
  • In describing some embodiments, the expressions "coupled" and "connected" and their derivatives may be used.
  • For example, the term "connected" may be used in describing some embodiments to indicate that two or more elements are in direct physical or electrical contact with each other.
  • As another example, the term "coupled" may be used in describing some embodiments to indicate that two or more elements are in direct physical or electrical contact.
  • However, the terms "coupled" or "communicatively coupled" may also mean that two or more elements are not in direct contact with each other but still cooperate or interact with each other.
  • The embodiments disclosed herein are not necessarily limited to the content herein.
  • Depending on the context, the term "if" is optionally interpreted to mean "when", "upon", "in response to determining", or "in response to detecting".
  • Similarly, depending on the context, the phrases "if it is determined that ..." or "if [the stated condition or event] is detected" are optionally construed to mean "upon determining ...", "in response to determining ...", "upon detection of [the stated condition or event]", or "in response to detection of [the stated condition or event]".
  • The equipment and equipment parameters involved in any production process on a product's route affect the product's performance and may cause that performance to be substandard (also called defective). The inspection station that tests product performance is usually located after several devices, so the defective device cannot be located in time.
  • An embodiment of the present disclosure provides a data processing method that: obtains the sample data of each of a plurality of samples produced within a preset time period; divides the sample data into positive samples and negative samples according to the test results of the samples; determines the sample cut points of each sample and obtains N value groups (also called sequence segments) of the target values of each sample, where a sample cut point characterizes an abrupt-change point (mutation point) in the values of the device parameter of that sample; and determines a correlation quantization value according to the difference between the Mth value group of the positive samples and the Mth value group of the negative samples, where the correlation quantization value characterizes the degree to which the device parameter influences defective samples, N is a positive integer, and M is a positive integer less than or equal to N. This improves detection efficiency, so that the user can quickly make a decision and locate the cause of defective samples.
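The overall flow of the method summarized above (cut each sample's series into value groups, then compare the Mth group of the positive samples with the Mth group of the negative samples) can be sketched as follows. This is a toy illustration under explicit assumptions: the function names are hypothetical, cut points are given as list indices so that N equals the number of cut points plus one, the statistical index is the mean, and the correlation quantization value is the absolute difference of the per-class averages.

```python
from statistics import mean


def split_at_cut_points(values, cut_points):
    """Split one parameter series into value groups at the given cut indices."""
    bounds = [0, *sorted(cut_points), len(values)]
    return [values[a:b] for a, b in zip(bounds, bounds[1:])]


def correlation_quantization(pos_samples, neg_samples, m):
    """Compare the m-th value group (0-based) of positive and negative samples.

    Each sample is a (values, cut_points) pair; every group is assumed non-empty.
    """
    pos_stats = [mean(split_at_cut_points(v, c)[m]) for v, c in pos_samples]
    neg_stats = [mean(split_at_cut_points(v, c)[m]) for v, c in neg_samples]
    return abs(mean(neg_stats) - mean(pos_stats))


positive = [([1, 1, 5, 5], [2]), ([1, 1, 1, 5, 5, 5], [3])]
negative = [([1, 1, 9, 9], [2])]
scores = [correlation_quantization(positive, negative, m) for m in range(2)]
```

In this example the first value group is identical across both classes and the second differs, so only the second group receives a non-zero score, pointing at the segment of the parameter curve most associated with the defect.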
  • The data processing method provided by the embodiments of the present disclosure is applicable to the data processing system 10 shown in FIG. 1, which includes a data processing device 100, a display device 200, and a distributed storage device 300.
  • The data processing device 100 is coupled to the display device 200 and to the distributed storage device 300.
  • The distributed storage device 300 is configured to store production data generated by multiple devices (also referred to as factory devices).
  • The production data generated by the multiple devices includes the sample data of those devices. For example, the sample data includes the identifiers of the devices that the samples pass through during production, the parameters corresponding to those devices, the inspection results, and the production time. Each sample passes through at least one device during production.
  • The distributed storage device 300 may include multiple hardware memories distributed across different physical locations (for example, in different factories or on different production lines) that exchange information with one another through wireless transmission (such as a network). The data is therefore physically distributed but logically forms a single database based on big data technology.
  • The raw data of a large number of different devices is stored in the corresponding manufacturing systems, such as a Yield Management System (YMS), a Fault Detection & Classification (FDC) system, and a Manufacturing Execution System (MES).
  • This raw data can be extracted from the original tables by data extraction tools (such as Sqoop or Kettle) and then transmitted to the distributed storage device 300 (such as the Hadoop Distributed File System, HDFS). This reduces the load on the equipment and manufacturing systems and makes it easier for the data processing device 100 to read the data later.
  • The data in the distributed storage device 300 can be stored in Hive or HBase format.
  • With the Hive tool, for example, the above raw data is first stored in the database; preprocessing such as data cleaning and data conversion can then be carried out in Hive to obtain a data warehouse of the sample data.
  • The data warehouse can be connected to the display device 200, the data processing device 100, and other devices through different APIs to exchange data with them.
  • The display device 200 displays a selection page on which the user selects filter conditions.
  • The filter conditions include a result variable, a cause variable, and screening conditions (for example, product category and time period); based on them, the data processing device 100 performs intelligent mining for defect diagnosis and analysis.
  • The analysis result obtained by the data processing device 100 through the fault diagnosis analysis is displayed to the user on the analysis result display page of the display device 200.
  • The volume of the above raw data is very large.
  • For example, the raw data generated by all devices may amount to hundreds of gigabytes per day, or tens of gigabytes per hour.
  • A relational database can be used to store massive structured data, and distributed computing can be used to process it.
  • A big data solution based on a distributed file management system (Distributed File System, DFS) realizes both the storage and the computation of massive structured data.
  • The Hive tool is a Hadoop-based data warehouse tool that can be used for data extraction, transformation, and loading (ETL).
  • The Hive tool defines a simple SQL-like query language and also allows custom MapReduce mappers and reducers to handle complex analysis tasks that the default tools cannot complete.
  • The Hive tool has no dedicated data storage format and does not build indexes on the data, so users can freely organize the tables and process the data in the database. The parallel processing of a distributed file system can thus meet the storage and processing requirements of massive data: users can query and process simple data through SQL and use user-defined functions for complex processing. Therefore, when analyzing the massive data of a factory, the data in the factory database needs to be extracted into the distributed file system. On the one hand, the original data is not damaged; on the other hand, the efficiency of data analysis is improved.
  • The distributed storage device 300 may be a single memory, multiple memories, or a general term for multiple storage elements.
  • For example, the memory may include random access memory (RAM) or double data rate synchronous dynamic random access memory (DDR SDRAM), and may also include non-volatile memory, such as disk storage or flash memory.
  • The data processing device 100 may be any terminal device, server, virtual machine, or server cluster.
  • The display device 200 may be a display, or a product that includes a display, such as a television, a computer (all-in-one or desktop), a tablet computer, a mobile phone, or an electronic picture screen.
  • The display device may be any device that displays images, whether moving (e.g., video) or stationary (e.g., still images), and whether textual or pictorial.
  • the described embodiments may be implemented in or associated with a variety of electronic devices such as, but not limited to, game consoles, television monitors, flat panel displays, computer monitors, automotive displays (e.g., odometer displays, etc.), navigators, cockpit controls and/or displays, electronic photographs, electronic billboards or signs, projectors, architectural structures, packaging and aesthetic structures (e.g., displays of images of pieces of jewelry), etc.
  • the display device 200 described herein may include one or more displays, including one or more terminals with a display function, so that the data processing device 100 can send its processed data (such as influencing parameters) to the display device 200, which then displays the data. That is to say, through the interface of the display device 200 (that is, the user interaction interface), the user can fully interact with the data processing system 10 (control it and receive results).
  • the functions of the data processing device 100, the display device 200 and the distributed storage device 300 may be integrated into one electronic device or two electronic devices, or may be implemented separately by different devices.
  • the functions of the display device 200 and the distributed storage device 300 are not limited in this embodiment of the present disclosure.
  • the above functions of the data processing device 100, the display device 200 and the distributed storage device 300 can all be realized by the electronic device 30 shown in FIG. 2 .
  • the electronic device 30 in FIG. 2 includes, but is not limited to: a processor 301, a memory 302, an input unit 303, an interface unit 304, a power supply 305, and the like.
  • the electronic device 30 includes a display 306 .
  • the processor 301 is the control center of the electronic device. It uses various interfaces and lines to connect the parts of the entire electronic device, and performs the various functions of the electronic device and processes data by running or executing the software programs and/or modules stored in the memory 302 and calling the data stored in the memory 302, so as to monitor the electronic device as a whole.
  • the processor 301 may include one or more processing units; optionally, the processor 301 may integrate an application processor and a modem processor, wherein the application processor mainly handles the operating system, user interface, application programs, etc., and the modem processor mainly handles wireless communication. It can be understood that the foregoing modem processor may alternatively not be integrated into the processor 301.
  • the memory 302 can be used to store software programs as well as various data.
  • the memory 302 may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system, an application program required by at least one functional unit, and the like.
  • the memory 302 may include a high-speed random access memory, and may also include a non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid-state storage device.
  • the memory 302 may be a non-transitory computer-readable storage medium, for example, a read-only memory (ROM), a random access memory (RAM), a compact disc read-only memory (CD-ROM), a magnetic tape, a floppy disk, an optical data storage device, etc.
  • the input unit 303 may be a keyboard, a touch screen and other devices.
  • the interface unit 304 is an interface for connecting an external device to the electronic device 30 .
  • an external device may include a wired or wireless headset port, an external power (or battery charger) port, a wired or wireless data port, a memory card port, a port for connecting a device with an identification module, audio input/output (I/O) ports, video I/O ports, headphone ports, and more.
  • the interface unit 304 may be used to receive input (e.g., data information) from an external device and transmit the received input to one or more elements within the electronic device 30, or may be used to transfer data between the electronic device 30 and the external device.
  • a power supply 305 (such as a battery) can be used to supply power to various components.
  • the power supply 305 can be logically connected to the processor 301 through a power management system, so that functions such as charging management, discharging management, and power consumption management are realized through the power management system.
  • the display 306 is used to display information input by the user or information provided to the user (e.g., data processed by the processor 301).
  • the display 306 may include a display panel, and the display panel may be configured in the form of a liquid crystal display (liquid crystal display, LCD), an organic light-emitting diode (Organic Light-Emitting Diode, OLED), or the like.
  • when the electronic device 30 is the display device 200, the electronic device 30 includes a display 306.
  • the computer instructions in the embodiments of the present disclosure may also be referred to as application program codes or systems, which are not specifically limited in the embodiments of the present disclosure.
  • the electronic device shown in FIG. 2 is only an example and does not limit the electronic devices to which the embodiments of the present disclosure are applicable. In actual implementation, the electronic device may include more or fewer components than those shown in FIG. 2.
  • Figure 3 is a flow chart of a data processing method provided by an embodiment of the present disclosure, the method may be applied to the electronic device shown in Figure 2, and the method shown in Figure 3 may include the following steps:
  • the electronic device acquires sample data of each of the multiple samples generated within a preset time period.
  • the sample data includes the values, at each collection time, of the equipment parameters of the equipment that the sample passes through, and the inspection result of the sample.
  • the electronic device receives sample data of each of multiple samples of the same model produced by each device on the sample production line within a preset time period.
  • the electronic device performs data preprocessing through the following steps to obtain sample data of each sample among the multiple samples produced within a preset time period:
  • Step 1 The electronic device acquires initial sample data.
  • the initial sample data is the sample data of each sample produced within a preset time period.
  • the electronic device obtains, from the HBase database, the batch information related to products of a specific model within a preset time period and/or the identification information of the raw materials used to produce the products, and, according to the obtained batch information or identification information, acquires from the memory or the distributed storage system the sample data of each sample of the same model produced within the preset time period as the initial sample data.
  • samples in the embodiments of the present disclosure may be display panels in a display panel production line; of course, the samples in the embodiments of the present disclosure may also be other products.
  • the sample corresponding to the sample data may also be a display panel motherboard (glass), and the display panel motherboard may be produced and processed into multiple display panels (panels).
  • An exemplary test result may be 0 or 1, where 0 indicates that the sample belongs to one type, and 1 indicates that the sample belongs to another type. In an example, 0 indicates that the sample is a good sample, and 1 indicates that the sample is a bad sample. Specifically, bad samples can be divided into different types according to needs.
  • the variable corresponding to the test result of each sample among the multiple samples in the embodiment of the present disclosure is the same variable.
  • the 1 in the first column is the identification of sample 1; 49.5456, 49.5823 and 46.9352 are values of device parameter 1; the time 00:01:47 is the production time corresponding to the value 49.5456 of device parameter 1, the time 00:01:48 is the production time corresponding to the value 49.5823, and the time 00:01:49 is the production time corresponding to the value 46.9352.
  • the rest are similar to this and will not be repeated.
  • Step 2 The electronic device clips and/or filters the initial sample data to obtain sample data of each of the multiple samples produced within a preset time period.
  • the electronic device may clip and/or filter the initial sample data in at least one of the following ways to obtain the sample data of each of the multiple samples produced within a preset time period:
  • Method 1 The electronic device divides the initial sample data into positive samples (also called good samples) and negative samples (also called bad samples) according to the test results of the samples. For the negative samples in the initial sample data, the electronic device filters out the negative samples in which the ratio of the number of values of the device parameters to the number of collection times does not meet a preset condition. Exemplarily, the electronic device filters out, from the initial sample data, the negative samples for which 95% of the number of values of the device parameters is greater than the number of collection times of the device parameters.
  • the electronic device filters out sample data whose sample ID is 4.
  • Method 2 The electronic device divides the initial sample data into positive samples (also called good samples) and negative samples (also called bad samples) according to the test results of the samples.
  • for the positive samples, the electronic device performs the following steps: it obtains the number of values included in the target value of each positive sample; determines a value range according to these numbers; and filters out, from the sample data of the samples produced within the preset time period, the positive samples whose number of values included in the target value falls outside the value range.
  • Step 1 The electronic device acquires the number of values included in the target value of the positive sample.
  • the target value consists of those values of the device parameter for which the time difference between two adjacent collection times is smaller than the first threshold.
  • the target values for the sample whose sample ID is 1 are 49.5456, 49.5823, and 46.9352.
  • the number of target values of the sample whose sample ID is 1 is 3; similarly, the number of target values of the sample whose sample ID is 2 is 5, that of the sample whose sample ID is 3 is 7, and that of the sample whose sample ID is 4 is 5.
  • Step 2 The electronic device determines the value range according to the number of values included in the target value of the positive sample.
  • the electronic device acquires the median and the interquartile range (IQR) of the numbers of values included in the target values of the positive samples, and determines the value range according to the median and the interquartile range.
  • the electronic device determines the sum of the median and four times the interquartile range as the upper bound of the value range, and the difference between the median and four times the interquartile range as the lower bound of the value range. With the positive-sample counts of 3, 5 and 7 given above, the median is 5 and the interquartile range is 2, so the upper bound is 13 and the lower bound is -3.
  • Step 3 The electronic device filters out, from the initial sample data, the positive samples whose number of values included in the target value falls outside the determined value range.
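  • The three steps of Method 2 can be sketched as follows. This is a minimal illustration, not the disclosed implementation: the function names are hypothetical, and the "inclusive" quartile method is an assumption chosen because it reproduces the bounds 13 and -3 stated above for the counts 3, 5 and 7.

```python
import statistics

def count_range(counts, k=4):
    """Return (lower, upper) = median -/+ k * IQR of the target-value counts."""
    median = statistics.median(counts)
    # Assumption: quartiles are interpolated within the observed data.
    q1, _, q3 = statistics.quantiles(counts, n=4, method="inclusive")
    iqr = q3 - q1
    return median - k * iqr, median + k * iqr

def filter_positive_samples(target_counts, k=4):
    """Keep the positive samples whose count lies inside the value range."""
    lower, upper = count_range(list(target_counts.values()), k)
    return {sid: n for sid, n in target_counts.items() if lower <= n <= upper}

# Numbers of target values of the positive samples (sample IDs 1 to 3).
positive_counts = {1: 3, 2: 5, 3: 7}
bounds = count_range(list(positive_counts.values()))
kept = filter_positive_samples(positive_counts)
print(bounds, kept)
```

  • With these counts no positive sample is removed, since 3, 5 and 7 all lie inside the range.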
  • Method 3 The electronic device determines the clipping length according to the median of the number of values included in the target value of each sample, and clips the acquired sample data of each sample according to the clipping length.
  • the electronic device acquires the number of target values of each sample.
  • the electronic device obtains the median of the number of values of the target value, and the electronic device determines the cutting length according to the obtained median and the preset percentage.
  • the electronic device keeps a clipping-length number of values of the target value, counted onward from the start collection time of the target value, or counted back from the end collection time of the target value.
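  • A minimal sketch of Method 3, under stated assumptions: the preset percentage is not given in the text, so the 60 used here is purely hypothetical, as are the function names and the choice to truncate the product to an integer.

```python
import statistics

def clip_length(target_counts, percent):
    """Clipping length = median number of target values * preset percentage."""
    median_count = statistics.median(target_counts)
    return int(median_count * percent / 100)

def clip_from_start(values, length):
    """Keep `length` values counted onward from the start collection time."""
    return values[:length]

def clip_from_end(values, length):
    """Keep `length` values counted back from the end collection time."""
    return values[-length:] if length > 0 else []

# Numbers of target values of samples 1 to 4 as given in the text.
counts = [3, 5, 7, 5]
length = clip_length(counts, percent=60)  # hypothetical percentage

# Values of the sample whose sample ID is 2.
sample_2 = [47.0249, 47.0248, 47.0248, 47.0013, 47.0013]
head = clip_from_start(sample_2, length)
tail = clip_from_end(sample_2, length)
print(length, head, tail)
```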
  • S101 The electronic device divides the sample data into a positive sample and a negative sample according to a test result of the sample.
  • positive samples include: samples with sample ID 1 to sample ID 3, and negative samples include samples with sample ID 4.
  • the electronic device determines the sample cut point of each sample according to the value of the device parameter, and obtains N value groups corresponding to the target value of each sample.
  • the sample cut point of each sample is used to characterize the abrupt point of the value of the device parameter for each sample.
  • the target value consists of those values of the device parameter for which the time difference between two adjacent collection times is smaller than the first threshold, and N is a positive integer greater than or equal to 1.
  • the electronic device determines the sample cut point of each sample through the following steps:
  • Step 1 The electronic device determines the sample data of the reference sample according to the obtained values of the device parameters.
  • the reference sample is a positive sample among multiple samples.
  • the electronic device can perform a Fourier transform on the values of the device parameter of each positive sample; take the minimum number of values of the device parameter among the transformed positive samples as the interception quantity; and intercept that many values from the values of the device parameter of each positive sample to obtain multiple intercepted value groups, each containing exactly the interception quantity of values. Following the order of the values within each intercepted value group, the electronic device then obtains the median of the values at each position across the intercepted value groups to obtain a median sequence, and determines the positive sample whose difference from the median sequence is smallest as the reference sample.
  • the electronic device extracts sample data of 1% of the samples from the sample data corresponding to the positive samples. If the number of samples is between 20 and 200, sample data of 20 samples is drawn. If the number of samples is less than or equal to 20, all samples are drawn.
  • the electronic device determines a reference sample from the samples drawn by the above-mentioned method. In this way, determining the reference sample from the extracted samples can improve the efficiency of data processing.
  • the number of values of the device parameter is 3 for the sample whose sample ID is 1, 5 for the sample whose sample ID is 2, and 7 for the sample whose sample ID is 3, so 3 is the minimum number of values of the device parameter.
  • the electronic device determines that the interception length is 3.
  • the intercepted value group obtained by the electronic device is (49.5456, 49.5823, 46.9352) for the sample whose sample ID is 1, (47.0249, 47.0248, 47.0248) for the sample whose sample ID is 2, and (49.5344, 46.8889, 46.8889) for the sample whose sample ID is 3. The median at the first position is therefore 49.5344, the median at the second position is 47.0248, and the median at the third position is 46.9352.
  • the median sequence obtained by the electronic device is (49.5344, 47.0248, 46.9352).
  • the difference between the sample with the sample identifier 3 and the median sequence is the smallest, and the electronic device determines the sample with the sample identifier 3 as the reference sample.
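  • The selection of the reference sample can be sketched as follows. This is an illustration only: the Fourier transform step is omitted, the sum of absolute differences is an assumed measure of the "difference" from the median sequence, and the function name is hypothetical.

```python
import statistics

def pick_reference(samples):
    """Return (reference sample ID, median sequence) for the positive samples."""
    # Interception quantity: the smallest number of values among the samples.
    n = min(len(values) for values in samples.values())
    truncated = {sid: values[:n] for sid, values in samples.items()}
    # Median of the values at each position across the intercepted groups.
    median_seq = [statistics.median(col) for col in zip(*truncated.values())]

    def distance(values):
        # Assumed difference measure: sum of absolute differences.
        return sum(abs(a - b) for a, b in zip(values, median_seq))

    ref_id = min(truncated, key=lambda sid: distance(truncated[sid]))
    return ref_id, median_seq

positive_samples = {
    1: [49.5456, 49.5823, 46.9352],
    2: [47.0249, 47.0248, 47.0248, 47.0013, 47.0013],
    # Only the first three of the seven values of sample 3 appear in the text.
    3: [49.5344, 46.8889, 46.8889],
}
ref_id, median_seq = pick_reference(positive_samples)
print(ref_id)
```

  • On these values the sketch selects the sample whose sample ID is 3, in agreement with the text.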
  • Step 2 The electronic device determines the signal-to-noise ratio of the reference sample, and takes the absolute value of the signal-to-noise ratio as a threshold.
  • Step 3 The electronic device takes, as the reference sample cut point, each value among the filtered values of the device parameter whose absolute value is greater than the absolute value of the signal-to-noise ratio.
  • denoting the absolute value of the signal-to-noise ratio as threshold, the electronic device uses the values of the device parameter that lie outside the range [-threshold, threshold] among the filtered values of the device parameter as the reference sample cut points.
  • the numerical points in curve 1 are the sample data, and the numerical points in curve 2 are the sample data obtained by using a high-pass filter; the abscissa in FIG. 4 is the serial number corresponding to the collection time of the values of the device parameter in the sample data.
  • the electronic device may adjust the reference sample cut point according to the fluctuation range of the values of the device parameter near the reference sample cut point.
  • the fluctuation threshold is used to help determine abrupt points in the values of a device parameter.
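  • The thresholding in Step 3 can be sketched as follows, under several assumptions: the text does not give the formula for the signal-to-noise ratio or the design of the high-pass filter, so the threshold is passed in as a parameter (0.01 here is hypothetical) and a first difference stands in for the high-pass filter. The values of the sample whose sample ID is 2 are used only because that sequence is listed in full.

```python
def high_pass(values):
    """Assumed stand-in for the high-pass filter: first difference."""
    return [0.0] + [b - a for a, b in zip(values, values[1:])]

def cut_points(values, threshold):
    """Indices whose filtered value lies outside [-threshold, threshold]."""
    filtered = high_pass(values)
    return [i for i, x in enumerate(filtered) if abs(x) > threshold]

sample_2 = [47.0249, 47.0248, 47.0248, 47.0013, 47.0013]
cuts = cut_points(sample_2, threshold=0.01)  # hypothetical threshold
print(cuts, [sample_2[i] for i in cuts])
```

  • The only index flagged is the one where the value drops to 47.0013, which is where an abrupt change would be expected.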
  • Step 4 The electronic device determines the sample cut point of each sample according to the reference ratio and the reference sample cut point.
  • the value groups obtained from the sample cut points determined based on the same reference sample cut point correspond to each other;
  • the reference ratio is the ratio of the number of values of the device parameters of the reference sample to the number of values of the device parameters of each sample.
  • the electronic device determines a preliminary sample cut point of each sample according to the reference ratio and the reference sample cut point; according to the preliminary sample cut point and a preset window size, it obtains the correlation between the group of device parameter values whose distance from the sample cut point is within the preset window size and the group of device parameter values whose distance from the reference sample cut point is within the preset window size; and it corrects the sample cut point of each sample according to the obtained correlation.
  • the electronic device selects the interval of length 2nn before the cut point of the sample data as the preset window size range [start, end]. It traverses each point in the range [start, end], beginning with start: for each point it takes the data segment X of the preset window length going backward, selects the data segment Y of length nn before and after the reference sample cut point, and uses the Pearson correlation coefficient to calculate the correlation between data segment X and data segment Y. The electronic device then takes the point with the highest correlation within the range [start, end] as the sample cut point of the sample.
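  • Step 4 can be sketched as follows. The details are assumptions made for illustration: the preliminary cut point is taken as the reference cut point divided by the reference ratio, candidate points are searched within nn positions on either side of it, and both data segments span nn values before and after their cut point. All names and the step-shaped test signals are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient; -inf if either segment is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("-inf")
    return sxy / math.sqrt(sxx * syy)

def preliminary_cut(ref_cut, n_ref, n_sample):
    """Scale the reference cut point by the reference ratio n_ref / n_sample."""
    ratio = n_ref / n_sample
    return round(ref_cut / ratio)

def corrected_cut(sample, ref, ref_cut, nn):
    """Pick the candidate whose window correlates best with the reference."""
    y = ref[ref_cut - nn: ref_cut + nn]
    guess = preliminary_cut(ref_cut, len(ref), len(sample))
    best, best_r = guess, float("-inf")
    for p in range(guess - nn, guess + nn + 1):
        if p - nn < 0 or p + nn > len(sample):
            continue  # window would run off the data
        r = pearson(sample[p - nn: p + nn], y)
        if r > best_r:
            best, best_r = p, r
    return best

# Hypothetical step signals: the reference jumps at index 10, the sample at 12.
ref = [0.0] * 10 + [1.0] * 10
sample = [0.0] * 12 + [1.0] * 8
cut = corrected_cut(sample, ref, ref_cut=10, nn=3)
print(cut)
```

  • Treating constant segments as uncorrelated avoids a division by zero when a window contains no change at all.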
  • a device can include multiple device recipes (recipe steps).
  • the equipment recipe is used to describe the instruction of how the equipment should process the sample (also known as: the setting of equipment parameters for the equipment to process the sample).
  • a device recipe includes the value of the device parameter and the time corresponding to the value of the device parameter. A time difference between collection times of values of the same equipment parameter of an equipment recipe is less than a first threshold (for example: 1 second).
  • the cut point of the sample whose sample ID is 2 determined by the electronic device is 47.0013. Then, the electronic device divides the sample data whose sample identifier is 2 into two value groups, and the first value group includes values 47.0249, 47.0248 and 47.0248. The second set of values includes the values 47.0013 and 47.0013.
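  • Dividing a sample into value groups at its cut points can be sketched as follows, using the values of the sample whose sample ID is 2. The function name is hypothetical, and the cut point is expressed as the index of the first value of the new group.

```python
def split_groups(values, cut_indices):
    """Split values into consecutive groups, each cut starting a new group."""
    groups, start = [], 0
    for cut in sorted(cut_indices):
        groups.append(values[start:cut])
        start = cut
    groups.append(values[start:])
    return groups

sample_2 = [47.0249, 47.0248, 47.0248, 47.0013, 47.0013]
# The cut point 47.0013 is the value at index 3.
groups = split_groups(sample_2, [3])
print(groups)
```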
  • the electronic device determines the relevant quantitative value according to the difference between the Mth value group in the positive samples and the Mth value group in the negative samples.
  • the relevant quantitative value is used to represent the degree of influence of the device parameter on bad samples, and M is a positive integer less than or equal to N.
  • the electronic device determines the relevant quantitative value through the following steps:
  • S103-1 The electronic device determines the first value of the statistical index of the Mth value group in the negative sample.
  • Statistical indicators are used to characterize the central tendency or changing trend of values in a value group.
  • the statistical indicators used to characterize the central tendency of the values in a value group include at least one of the features that reflect the value group as a whole, such as the maximum value, minimum value, mean, median, standard deviation, index of the minimum value and index of the maximum value.
  • the statistical indicators used to characterize the trend of the value group include slope, range difference, sum of the difference of the downtrend (Stat_downtrend), the sum of the difference of the uptrend (Stat_uptrend), the sum of positive values (Positive_sum), and the sum of the uptrend
  • S103-2 The electronic device determines the second value of the statistical index of the Mth value group in the positive sample.
  • the electronic device obtains the second value of the statistical index of the first value group of the device parameter of each positive sample.
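  • A few of the statistical indicators named above can be computed for one value group as sketched below. The exact definitions are assumptions: the slope is a least-squares slope against the position index, and Stat_downtrend and Stat_uptrend are taken as the sums of the negative and positive successive differences. The function name is hypothetical.

```python
import statistics

def indicators(group):
    """Compute central-tendency and trend indicators of one value group."""
    n = len(group)
    diffs = [b - a for a, b in zip(group, group[1:])]
    mean_i = (n - 1) / 2
    mean_v = statistics.fmean(group)
    denom = sum((i - mean_i) ** 2 for i in range(n))
    slope = (
        sum((i - mean_i) * (v - mean_v) for i, v in enumerate(group)) / denom
        if denom else 0.0
    )
    return {
        "max": max(group),
        "min": min(group),
        "mean": mean_v,
        "median": statistics.median(group),
        "std": statistics.pstdev(group),
        "argmin": group.index(min(group)),
        "argmax": group.index(max(group)),
        "slope": slope,
        "range": max(group) - min(group),
        "Stat_downtrend": sum(d for d in diffs if d < 0),
        "Stat_uptrend": sum(d for d in diffs if d > 0),
    }

# Values of the sample whose sample ID is 1.
stats = indicators([49.5456, 49.5823, 46.9352])
print(stats["max"], stats["argmax"], stats["median"])
```

  • Applying the same function to the Mth value group of every negative sample gives the first values, and applying it to the Mth value group of every positive sample gives the second values.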
  • S103-3 The electronic device determines a difference between the first value and the second value.
  • the electronic device determines the difference between the first value and the second value according to the characteristic parameter of the first value and the characteristic parameter of the second value.
  • the characteristic parameters may include the value of the target position and/or the overall mean.
  • the electronic device may use a Kruskal-Wallis test to determine the difference between the first value and the second value.
  • the electronic device acquires the median of the plurality of first values and the median of the plurality of second values.
  • the electronic device determines the difference between the two medians as the difference between the first value and the second value.
  • the electronic device may use a T-test to determine the difference between the first value and the second value.
  • the electronic device acquires the overall mean of the multiple first values and the overall mean of the multiple second values, and determines the difference between the two overall means as the difference between the first value and the second value.
  • the electronic device determines a first difference between the value of the target position of the first value and the value of the target position of the second value.
  • the electronic device determines a second difference between the overall mean of the first value and the overall mean of the second value, and the electronic device determines the difference between the first value and the second value according to the first difference, the second difference, and a preset weight.
  • the electronic device determines the first difference according to the Kruskal-Wallis test and the second difference according to the T-test, and determines the sum of the first difference × 50% and the second difference × 50% as the difference (p value) between the first value and the second value.
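  • The weighted combination can be sketched with the plain median and mean differences described above. This is a simplification under stated assumptions: absolute differences are used, the input values are hypothetical, and no hypothesis test is run. Where the p values of the Kruskal-Wallis test and the T-test are wanted instead, SciPy's scipy.stats.kruskal and scipy.stats.ttest_ind are one possible source for them.

```python
import statistics

def median_difference(first_values, second_values):
    """Difference between the medians of the first and second values."""
    return abs(statistics.median(first_values) - statistics.median(second_values))

def mean_difference(first_values, second_values):
    """Difference between the overall means of the first and second values."""
    return abs(statistics.fmean(first_values) - statistics.fmean(second_values))

def combined_difference(first_values, second_values, w_first=0.5, w_second=0.5):
    """Weighted sum of the first (median) and second (mean) differences."""
    return (w_first * median_difference(first_values, second_values)
            + w_second * mean_difference(first_values, second_values))

# Hypothetical indicator values: first values come from the negative samples,
# second values from the positive samples.
first_values = [2.0, 4.0, 9.0]
second_values = [1.0, 2.0, 3.0]
difference = combined_difference(first_values, second_values)
print(difference)
```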
  • S103-4 The electronic device determines a related quantitative value according to the difference.
  • the electronic device may determine the first values of all statistical indexes of the value group of the device parameter in the negative samples, determine the second values of all corresponding statistical indexes of the value group of the device parameter in the positive samples, and obtain the related quantitative value of the value group according to the difference between the first values and the second values.
  • the electronic device may also determine the first value of each statistical index of the value group of the device parameter in the negative samples, determine the second value of the corresponding statistical index of the value group of the device parameter in the positive samples, obtain the difference between the first value and the second value of each statistical index, and obtain multiple related quantitative values of the value group according to the multiple differences. The electronic device sorts the multiple related quantitative values and outputs them to the user, which helps the user determine which statistical index best reflects the degree of adverse influence of the value group on the samples.
  • S104 The electronic device sorts the determined related quantized values, and outputs the sorted numerical groups of the device parameters corresponding to the related quantized values.
  • the electronic device sorts the value groups of the device parameters corresponding to the relevant quantitative values in descending order of the relevant quantitative values. In this way, the value groups that have the greatest impact on bad samples are ranked first, which makes it convenient for users to troubleshoot the causes of bad samples.
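  • The descending sort in S104 can be sketched as follows. The keys and most of the values are hypothetical; only 0.9682 is borrowed from the example given later for Figure 12.

```python
def rank_value_groups(related_values):
    """Sort (device parameter, value group) keys by related quantitative value."""
    return sorted(related_values.items(), key=lambda item: item[1], reverse=True)

# Keys are (device parameter, value group index); values are hypothetical
# related quantitative values.
related_values = {
    ("device parameter 2", 1): 0.41,
    ("device parameter 1", 1): 0.9682,
    ("device parameter 2", 2): 0.73,
}
ranking = rank_value_groups(related_values)
for (parameter, group_index), value in ranking:
    print(parameter, group_index, value)
```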
  • the electronic device acquires the sample data of each of the multiple samples produced within a preset time period; divides the sample data into positive samples and negative samples according to the test results of the samples; and determines the sample cut point of each sample. The sample cut point reflects an abrupt change in the values of the device parameter, and the sample cut points of each sample divide the target value of that sample into multiple value groups, so that the values within each value group follow the same trend.
  • the relevant quantitative value is then determined according to the difference between corresponding value groups in the positive samples and the negative samples. The larger the difference, the larger the relevant quantitative value, and the greater the degree of adverse influence of the value group on the samples, which makes it easier for the user to find the cause of bad samples.
  • FIG. 7 is a flow chart of another data processing method provided by an embodiment of the present disclosure. This method can be applied to the electronic device shown in FIG. 2 , and the method shown in FIG. 7 may include the following steps:
  • S200 The electronic device receives a sample filtering condition input by a user on a condition selection interface.
  • the sample filter conditions may include: sample model, factory ID, site, process, start time and end time, etc.
  • the condition selection interface is shown in Figure 8. The start time and end time in Figure 8 are used to receive the input time period, the factory input box is used to receive the factory identification, the process input box is used to receive the process, the site input box is used to receive the site, and the product model input box is used to receive the sample model.
  • the electronic device receives the input sample filtering conditions.
  • sample screening conditions may also include test result variables.
  • the electronic device reads a variable of a preset inspection result.
  • the electronic device acquires the input variable of the inspection result in response to the user's input on the result variable input interface.
  • the result variable input interface is shown in Figure 9, which is displayed when the user clicks on the result variable input box. The raw material in Figure 9 can be the panel master, and the testing site can be used for the user to select a test site. At least six variables of test results are available under the test site; the user can select as the variable of the test result, for example, the number of bad samples of type 1, the defect rate of type 1, the defect rate of the raw material of type 1, the defect rate of type 2, or the defect rate of the raw material of type 2.
  • the sample screening condition may also include device parameters, and the electronic device obtains the device parameters in response to user input on the cause variable input interface.
  • the raw material in FIG. 10 may be a panel master.
  • the testing site in Figure 10 is a testing site that can be used for user selection, and the product can be used for user selection of product models.
  • the process identification in Figure 10 can be used for the user to select the corresponding process, and a process corresponds to at least one process step.
  • Both the process step identification 1 and the process step identification 2 in Figure 10 can be used for the user to select a process step, and the process step identified by process step identification 2 in Figure 10 corresponds to at least three devices, where device 1, device 2 and device 3 each correspond to one device.
  • the electronic device obtains the sample data of each of the multiple samples corresponding to the sample screening conditions; the sample data includes the values, at each collection time, of the device parameters of the devices that the sample passes through, and the test result of the sample.
  • S202 The electronic device divides the sample data into positive samples and negative samples according to the test results of the samples.
  • the electronic device divides the sample data into positive samples and negative samples and displays the sample distribution as shown in FIG. 11 .
  • the abscissa is the production time
  • the ordinate is the inspection result.
  • the electronic device determines the sample cut point of each sample according to the values of the device parameter, and obtains N value groups of the target values corresponding to each sample; the sample cut point of each sample represents an abrupt change point in the values of the device parameter of that sample, the target values are those values of the device parameter for which the time difference between two adjacent collection times is less than the first threshold, and N is a positive integer greater than or equal to 1.
  • the electronic device determines the relevant quantitative value according to the difference between the Mth value group in the positive samples and the Mth value group in the negative samples; the relevant quantitative value represents the degree of influence of the device parameter on defective samples, and M is a positive integer less than or equal to N.
  • S205 The electronic device displays relevant quantitative values on the analysis result display interface.
  • the electronic device sorts the magnitudes of the relevant quantified values, and the electronic device displays the sorted numerical groups of the device parameters corresponding to the relevant quantified values on the analysis result display interface.
  • the electronic device displays each numerical group corresponding to the obtained relevant quantified value on the analysis result display interface as shown in Figure 12.
  • the electronic device takes the value group as a unit and sorts the multiple relevant quantitative values of each value group from high to low. The first entry in Figure 12 is device parameter 1, which has only one process and one value group; the relevant quantitative values of the 20 statistical indicators of device parameter 1 are sorted from high to low, and the relevant quantitative value of feature 1 is the highest, at 0.9682.
  • the electronic device obtains the output parameters, which include at least one of: the information parameters of the value group, the range percentage, the first ratio or the second ratio; the first ratio is the ratio of the number of samples that include the device parameter to the total number of the plurality of samples, and the second ratio is the ratio of the number of defective samples corresponding to the device parameter to the total number of negative samples; the electronic device displays the output parameters on the analysis result display interface.
  • the information parameter includes the position of the value group in the device parameter and/or the percentage of the value group in the target value.
  • the electronic device displays the acquired related quantitative values of each numerical group on the analysis result display interface as shown in FIG. 13 , and FIG. 13 includes related quantitative values of two numerical groups of the device parameter 1 .
  • the relevant quantitative value is up to 0.9682.
  • the name of the device parameter corresponding to this value group is parameter 1; value group: 0 (1/2) means that there is only one device recipe in the sample generation process, that the device recipe is divided into 2 value groups, and that this value group is the first one; value group percentage: 94.85%, the percentage of the entire device recipe that this value group accounts for; range percentage: 100.0%, indicating that the equipment
  • the embodiments of the present disclosure can divide the electronic device in the above embodiments into functional modules according to the above method examples; for example, each functional module can correspond to one function, or two or more functions can be integrated into one processing module.
  • the above-mentioned integrated modules can be implemented in the form of hardware or in the form of software function modules. It should be noted that the division of modules in the embodiments of the present disclosure is schematic, and is only a logical function division, and there may be another division manner in actual implementation.
  • the data processing device 80 includes: an acquisition module 801 , a division module 802 and a determination module 803 .
  • the acquisition module 801 is used to acquire the sample data of each of the plurality of samples generated within the preset time period; the sample data includes the values, at each collection time, of the device parameters of the devices the sample passes through, and the test results of the samples; the division module 802 is used to divide the sample data into positive samples and negative samples according to the test results of the samples; the determination module 803 is used to determine the sample cut point of each sample according to the values of the device parameter, and obtain N value groups of the target values corresponding to each sample; the sample cut point of each sample characterizes an abrupt change point in the values of the device parameter of that sample, the target values are those values of the device parameter for which the time difference between two adjacent collection times is less than the first threshold, and N is a positive integer greater than or equal to 1
  • the determination module 803 is also used to determine the relevant quantitative value according to the difference between the Mth value group in the positive samples and the Mth value group in the negative samples; the relevant quantitative value characterizes the degree of influence of the device parameter on defective samples, and M is a positive integer less than or equal to N
  • the determining module 803 is specifically configured to: determine the first value of the statistical index of the Mth value group in the negative samples and the second value of the statistical index of the Mth value group in the positive samples, where the statistical index characterizes the central tendency or the trend of change of the values in the value group; determine the difference between the first value and the second value; and determine the relevant quantitative value based on the difference.
  • the determining module 803 is specifically configured to: determine the difference between the first value and the second value according to the characteristic parameter of the multiple first values in the negative samples and the characteristic parameter of the multiple second values in the positive samples.
  • the characteristic parameter includes the value at the target position and/or the overall mean.
  • the determining module 803 is specifically configured to: determine the first difference between the value at the target position of the multiple first values in the negative samples and the value at the target position of the multiple second values in the positive samples; determine the second difference between the overall mean of the multiple first values in the negative samples and the overall mean of the multiple second values in the positive samples; and determine the difference between the first value and the second value according to the first difference, the second difference and the preset weights.
  • the determining module 803 is specifically configured to: determine the sample data of the reference sample according to the values of the device parameter, where the reference sample is one of the positive samples; determine the signal-to-noise ratio of the reference sample and take its absolute value as the signal-to-noise-ratio absolute value; take, as reference sample cut points, those filtered values of the device parameter whose absolute value is greater than the signal-to-noise-ratio absolute value; and determine the sample cut point of each sample according to the reference ratio and the reference sample cut points, to obtain N value groups of the target values corresponding to each sample; the reference ratio is the ratio of the number of values of the device parameter of the reference sample to the number of values of the device parameter of each sample.
  • the determination module 803 is specifically used to: determine a preliminary sample cut point of each sample according to the reference ratio and the reference sample cut points; the acquisition module is also used to: obtain, according to the determined sample cut point and the preset window size, the correlation between the value group of the device parameter within the preset window size of the sample cut point and the value group of the device parameter within the preset window size of the reference sample cut point; the data processing device also includes a correction module 804 for correcting the sample cut point of each sample according to the correlation.
  • the determination module 803 is also used to: perform a Fourier transform on the values of the device parameter of each sample in the positive samples; take the minimum number of values of the device parameter among the transformed positive samples as the truncation count; take the first truncation-count values of the device parameter of each sample in the positive samples to obtain multiple truncated value groups, each of which contains exactly the truncation count of values; obtain, following the order of the values in each truncated value group, the median of the values at each position across the truncated value groups, to obtain a median sequence; and determine the sample data of the reference sample from the positive samples, where the reference sample is the positive sample with the smallest difference from the median sequence.
  • the obtaining module 801 is specifically configured to: obtain the sample data of every sample generated within the preset time period; obtain the number of values included in the target values of the positive samples; determine the value range according to that number; and filter out the positive samples whose number of target values lies outside the value range, to obtain the sample data of each of the plurality of samples generated within the preset time period.
  • the data processing device 80 further includes: a sorting module 805, configured to sort the relevant quantitative values by magnitude; and an output module 806, configured to output the ranking of the value groups of the device parameters corresponding to the relevant quantitative values.
  • the output module 806 is further configured to: output information parameters of the value group of the device parameter, where the information parameter includes the position of the value group in the device parameter and/or the percentage of the value group in the target value.
  • the receiving function of the acquisition module 801 may be implemented by the interface unit 304 in FIG. 2 .
  • the processing functions of the acquisition module 801, the division module 802, the determination module 803, the correction module 804, the sorting module 805 and the output module 806 can all be implemented by the processor 301 in FIG. 2 calling the computer program stored in the memory 302.
  • an embodiment of the present disclosure provides a structural diagram of a data processing device 90.
  • the data processing device 90 includes: a receiving module 901, an acquisition module 902, a division module 903, a determination module 904, and a display module 905.
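  • the receiving module 901 is used to receive the sample screening condition input by the user on the condition selection interface;
  • the obtaining module 902 is used to obtain the sample data of each of a plurality of samples corresponding to the sample screening condition; the sample data includes the values, at each collection time, of the device parameters of the devices the sample passes through, and the test result of the sample;
  • the division module 903 is used to divide the sample data into positive samples and negative samples according to the test results of the samples;
  • the determination module 904 is used to determine the sample cut point of each sample according to the values of the device parameter, to obtain N value groups of the target values corresponding to each sample; the sample cut point of each sample represents an abrupt change point in the values of the device parameter of that sample, the target values are those values of the device parameter for which the time difference between two adjacent collection times is less than the first threshold, and N is a positive integer greater than or equal to 1; the determination module 904 is also used to determine the relevant quantitative value according to the difference between the Mth value group in the positive samples and the Mth value group in the negative samples, where the relevant quantitative value characterizes the degree of influence of the device parameter on defective samples and M is a positive integer less than or equal to N;
  • the display module 905 is used to display the relevant quantitative value on the analysis result display interface.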
  • the receiving module 901 can be used to perform S200
  • the obtaining module 902 can be used to perform S201
  • the division module 903 can be used to perform S202
  • the determination module 904 can be used to perform S203 and S204
  • the display module 905 can be used to perform S205.
  • the data processing device further includes: a sorting module 906, configured to sort the relevant quantitative values by magnitude; the display module 905 is specifically configured to: display, on the analysis result display interface, the ranking of the value groups of the device parameters corresponding to the relevant quantitative values.
  • the display module 905 is also used to: display the information parameters of the value group of the device parameter on the analysis result display interface, the information parameter includes the position of the value group in the device parameter and/or the percentage of the value group in the target value .
  • the receiving functions of the receiving module 901 and the acquiring module 902 may be implemented by the interface unit 304 in FIG. 3 .
  • the processing functions of the acquisition module 902, the division module 903, the determination module 904, the display module 905, and the sorting module 906 can all be implemented by the processor 301 in FIG. 3 calling a computer program stored in the memory 302.
  • Embodiments of the present disclosure also provide an electronic device, including: a processor and a memory for storing instructions executable by the processor; wherein the processor is configured to execute the executable instructions to achieve any of the above A data processing method described in an embodiment.
  • Some embodiments of the present disclosure provide a computer-readable storage medium (for example, a non-transitory computer-readable storage medium), where computer program instructions are stored in the computer-readable storage medium, and when the computer program instructions are run on a processor , so that the processor executes one or more steps in the data processing method described in any one of the above embodiments.
  • the above-mentioned computer-readable storage medium may include, but is not limited to: a magnetic storage device (for example, a hard disk, a floppy disk, or a magnetic tape), an optical disk (for example, a CD (Compact Disk) or a DVD (Digital Versatile Disk)), smart cards and flash memory devices (for example, EPROM (Erasable Programmable Read-Only Memory), card, stick or key drive).
  • Various computer-readable storage media described in this disclosure can represent one or more devices and/or other machine-readable storage media for storing information.
  • the term "machine-readable storage medium" may include, but is not limited to, wireless channels and various other media capable of storing, containing and/or carrying instructions and/or data.
  • Some embodiments of the present disclosure also provide a computer program product.
  • the computer program product includes computer program instructions. When the computer program instructions are executed on the computer, the computer program instructions cause the computer to execute one or more steps in the data processing method as described in the above-mentioned embodiments.
  • Some embodiments of the present disclosure also provide a computer program.
  • When the computer program is executed on a computer, it causes the computer to execute one or more steps in the data processing method described in the above-mentioned embodiments.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Complex Calculations (AREA)

Abstract

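A data processing method, comprising: acquiring sample data of each of a plurality of samples generated within a preset time period, where the sample data includes the values, at each collection time, of the device parameters of the devices the sample passes through, and the test result of the sample; dividing the sample data into positive samples and negative samples according to the test results of the samples; determining a sample cut point of each sample according to the values of the device parameter, to obtain N value groups of the target values corresponding to each sample, where the target values are those values of the device parameter for which the time difference between two adjacent collection times is less than a first threshold; and determining a relevant quantitative value according to the difference between the Mth value group in the positive samples and the Mth value group in the negative samples, where the relevant quantitative value characterizes the degree of influence of the device parameter on defective samples.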

Description

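Data processing method, apparatus, device, and storage medium
Technical Field
The present disclosure relates to the technical field of data processing, and in particular to a data processing method, apparatus, device, and storage medium.
Background
In the manufacturing process of a product, the devices involved in the production processes that the raw material passes through, and the device parameters of those devices, all affect the performance of the product and may cause its performance to be substandard (also referred to as defective). Therefore, for products whose performance is substandard, it is necessary to determine the cause from the devices and their device parameters.
Summary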
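In one aspect, a data processing method is provided, comprising: acquiring sample data of each of a plurality of samples generated within a preset time period, where the sample data includes the values, at each collection time, of the device parameters of the devices the sample passes through, and the test result of the sample; dividing the sample data into positive samples and negative samples according to the test results of the samples; determining a sample cut point of each sample according to the values of the device parameter, to obtain N value groups of the target values corresponding to each sample, where the sample cut point of each sample characterizes an abrupt change point in the values of the device parameter of that sample, the target values are those values of the device parameter for which the time difference between two adjacent collection times is less than a first threshold, and N is a positive integer greater than or equal to 1; and determining a relevant quantitative value according to the difference between the Mth value group in the positive samples and the Mth value group in the negative samples, where the relevant quantitative value characterizes the degree of influence of the device parameter on defective samples, and M is a positive integer less than or equal to N.
In some embodiments, determining the relevant quantitative value according to the difference between the Mth value group in the positive samples and the Mth value group in the negative samples includes: determining a first value of a statistical indicator of the Mth value group in the negative samples and a second value of the statistical indicator of the Mth value group in the positive samples, where the statistical indicator characterizes the central tendency or the trend of change of the values in a value group; determining the difference between the first value and the second value; and determining the relevant quantitative value according to the difference.
In other embodiments, determining the difference between the first value and the second value includes: determining the difference according to a characteristic parameter of the multiple first values in the negative samples and the characteristic parameter of the multiple second values in the positive samples.
In other embodiments, the characteristic parameter includes the value at a target position and/or the overall mean.
In other embodiments, determining the difference between the first value and the second value according to the characteristic parameters includes: determining a first difference between the value at the target position of the multiple first values in the negative samples and the value at the target position of the multiple second values in the positive samples; determining a second difference between the overall mean of the multiple first values in the negative samples and the overall mean of the multiple second values in the positive samples; and determining the difference between the first value and the second value according to the first difference, the second difference and preset weights.
In other embodiments, determining the sample cut point of each sample according to the values of the device parameter, to obtain the N value groups of the target values corresponding to each sample, includes: determining the sample data of a reference sample according to the values of the device parameter, where the reference sample is one of the positive samples; determining the signal-to-noise ratio of the reference sample and taking its absolute value as the signal-to-noise-ratio absolute value; taking, as reference sample cut points, those filtered values of the device parameter whose absolute value is greater than the signal-to-noise-ratio absolute value; and determining the sample cut point of each sample according to a reference ratio and the reference sample cut points, to obtain the N value groups of the target values corresponding to each sample. The reference ratio is the ratio of the number of values of the device parameter of the reference sample to the number of values of the device parameter of each sample.
In other embodiments, determining the sample cut point of each sample according to the reference ratio and the reference sample cut points includes: determining a preliminary sample cut point of each sample according to the reference ratio and the reference sample cut points; obtaining, according to the determined sample cut point and a preset window size, the correlation between the value group of the device parameter within the preset window size of the sample cut point and the value group of the device parameter within the preset window size of the reference sample cut point; and correcting the sample cut point of each sample according to the correlation.
In other embodiments, determining the sample data of the reference sample according to the values of the device parameter includes: performing a Fourier transform on the values of the device parameter of each sample in the positive samples; taking the minimum number of values of the device parameter among the transformed positive samples as the truncation count; taking the first truncation-count values of the device parameter of each sample in the positive samples to obtain multiple truncated value groups, each of which contains exactly the truncation count of values; obtaining, following the order of the values in each truncated value group, the median of the values at each position across the truncated value groups, to obtain a median sequence; and determining the sample data of the reference sample from the positive samples, where the reference sample is the positive sample with the smallest difference from the median sequence.
In other embodiments, acquiring the sample data of each of the plurality of samples generated within the preset time period includes: acquiring the sample data of every sample generated within the preset time period; obtaining the number of values included in the target values of the positive samples; determining a numerical range according to that number; and filtering out the positive samples whose number of target values lies outside the numerical range, to obtain the sample data of each of the plurality of samples generated within the preset time period.
And/or: acquiring the sample data of every sample produced within the preset time period; determining a trimming length according to the median of the numbers of values included in the target values of the samples; and trimming the acquired sample data of each sample according to the trimming length, to obtain the sample data of each of the plurality of samples generated within the preset time period.
In other embodiments, the method further includes: sorting the relevant quantitative values by magnitude, and outputting the ranking of the value groups of the device parameters corresponding to the relevant quantitative values.
In other embodiments, the method further includes: outputting information parameters of the value groups of the device parameter, where the information parameters include the position of the value group within the device parameter and/or the percentage of the target values accounted for by the value group.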
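In yet another aspect, a data processing method is provided, comprising: receiving a sample screening condition input by a user on a condition selection interface; acquiring sample data of each of a plurality of samples corresponding to the sample screening condition, where the sample data includes the values, at each collection time, of the device parameters of the devices the sample passes through, and the test result of the sample; dividing the sample data into positive samples and negative samples according to the test results of the samples; determining a sample cut point of each sample according to the values of the device parameter, to obtain N value groups of the target values corresponding to each sample, where the sample cut point of each sample characterizes an abrupt change point in the values of the device parameter of that sample, the target values are those values of the device parameter for which the time difference between two adjacent collection times is less than a first threshold, and N is a positive integer greater than or equal to 1; determining a relevant quantitative value according to the difference between the Mth value group in the positive samples and the Mth value group in the negative samples, where the relevant quantitative value characterizes the degree of influence of the device parameter on defective samples, and M is a positive integer less than or equal to N; and displaying the relevant quantitative value on an analysis result display interface.
In some embodiments, the method further includes: sorting the relevant quantitative values by magnitude; and displaying the relevant quantitative value on the analysis result display interface includes: displaying, on the analysis result display interface, the ranking of the value groups of the device parameters corresponding to the relevant quantitative values.
In other embodiments, the method further includes: outputting information parameters of the value groups of the device parameter, where the information parameters include the position of the value group within the device parameter and/or the percentage of the target values accounted for by the value group.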
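In another aspect, a data processing device is provided, comprising: an acquisition module, configured to acquire sample data of each of a plurality of samples generated within a preset time period, where the sample data includes the values, at each collection time, of the device parameters of the devices the sample passes through, and the test result of the sample; a division module, configured to divide the sample data into positive samples and negative samples according to the test results of the samples; and a determination module, configured to determine a sample cut point of each sample according to the values of the device parameter, to obtain N value groups of the target values corresponding to each sample, where the sample cut point of each sample characterizes an abrupt change point in the values of the device parameter of that sample, the target values are those values of the device parameter for which the time difference between two adjacent collection times is less than a first threshold, and N is a positive integer greater than or equal to 1, and to determine a relevant quantitative value according to the difference between the Mth value group in the positive samples and the Mth value group in the negative samples, where the relevant quantitative value characterizes the degree of influence of the device parameter on defective samples, and M is a positive integer less than or equal to N.
In some embodiments, the determination module is specifically configured to: determine a first value of a statistical indicator of the Mth value group in the negative samples and a second value of the statistical indicator of the Mth value group in the positive samples, where the statistical indicator characterizes the central tendency or the trend of change of the values in a value group; determine the difference between the first value and the second value; and determine the relevant quantitative value according to the difference.
In other embodiments, the determination module is specifically configured to: determine the difference between the first value and the second value according to a characteristic parameter of the multiple first values in the negative samples and the characteristic parameter of the multiple second values in the positive samples.
In other embodiments, the characteristic parameter includes the value at a target position and/or the overall mean.
In other embodiments, the determination module is specifically configured to: determine a first difference between the value at the target position of the multiple first values in the negative samples and the value at the target position of the multiple second values in the positive samples; determine a second difference between the overall mean of the multiple first values in the negative samples and the overall mean of the multiple second values in the positive samples; and determine the difference between the first value and the second value according to the first difference, the second difference and preset weights.
In other embodiments, the determination module is specifically configured to: determine the sample data of a reference sample according to the values of the device parameter, where the reference sample is one of the positive samples; determine the signal-to-noise ratio of the reference sample and take its absolute value as the signal-to-noise-ratio absolute value; take, as reference sample cut points, those filtered values of the device parameter whose absolute value is greater than the signal-to-noise-ratio absolute value; and determine the sample cut point of each sample according to a reference ratio and the reference sample cut points, to obtain the N value groups of the target values corresponding to each sample. The reference ratio is the ratio of the number of values of the device parameter of the reference sample to the number of values of the device parameter of each sample.
In other embodiments, the determination module is specifically configured to: determine a preliminary sample cut point of each sample according to the reference ratio and the reference sample cut points; the acquisition module is further configured to: obtain, according to the determined sample cut point and a preset window size, the correlation between the value group of the device parameter within the preset window size of the sample cut point and the value group of the device parameter within the preset window size of the reference sample cut point; and the data processing device further includes a correction module, configured to correct the sample cut point of each sample according to the correlation.
In other embodiments, the determination module is further configured to: perform a Fourier transform on the values of the device parameter of each sample in the positive samples; take the minimum number of values of the device parameter among the transformed positive samples as the truncation count; take the first truncation-count values of the device parameter of each sample in the positive samples to obtain multiple truncated value groups, each of which contains exactly the truncation count of values; obtain, following the order of the values in each truncated value group, the median of the values at each position across the truncated value groups, to obtain a median sequence; and determine the sample data of the reference sample from the positive samples, where the reference sample is the positive sample with the smallest difference from the median sequence.
In other embodiments, the acquisition module is specifically configured to: acquire the sample data of every sample generated within the preset time period; obtain the number of values included in the target values of the positive samples; determine a numerical range according to that number; and filter out the positive samples whose number of target values lies outside the numerical range, to obtain the sample data of each of the plurality of samples generated within the preset time period.
And/or: acquire the sample data of every sample produced within the preset time period; determine a trimming length according to the median of the numbers of values included in the target values of the samples; and trim the acquired sample data of each sample according to the trimming length, to obtain the sample data of each of the plurality of samples generated within the preset time period.
In other embodiments, the data processing device further includes: a sorting module, configured to sort the relevant quantitative values by magnitude; and an output module, configured to output the ranking of the value groups of the device parameters corresponding to the relevant quantitative values.
In other embodiments, the output module is further configured to: output information parameters of the value groups of the device parameter, where the information parameters include the position of the value group within the device parameter and/or the percentage of the target values accounted for by the value group.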
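In another aspect, a data processing device is provided, comprising: a receiving module, configured to receive a sample screening condition input by a user on a condition selection interface; an acquisition module, configured to acquire sample data of each of a plurality of samples corresponding to the sample screening condition, where the sample data includes the values, at each collection time, of the device parameters of the devices the sample passes through, and the test result of the sample; a division module, configured to divide the sample data into positive samples and negative samples according to the test results of the samples; a determination module, configured to determine a sample cut point of each sample according to the values of the device parameter, to obtain N value groups of the target values corresponding to each sample, where the sample cut point of each sample characterizes an abrupt change point in the values of the device parameter of that sample, the target values are those values of the device parameter for which the time difference between two adjacent collection times is less than a first threshold, and N is a positive integer greater than or equal to 1, and to determine a relevant quantitative value according to the difference between the Mth value group in the positive samples and the Mth value group in the negative samples, where the relevant quantitative value characterizes the degree of influence of the device parameter on defective samples, and M is a positive integer less than or equal to N; and a display module, configured to display the relevant quantitative value on an analysis result display interface.
In other embodiments, the data processing device further includes: a sorting module, configured to sort the relevant quantitative values by magnitude; and the display module is specifically configured to: display, on the analysis result display interface, the ranking of the value groups of the device parameters corresponding to the relevant quantitative values.
In other embodiments, the display module is further configured to: display, on the analysis result display interface, information parameters of the value groups of the device parameter, where the information parameters include the position of the value group within the device parameter and/or the percentage of the target values accounted for by the value group.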
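In another aspect, an electronic device is provided, comprising: a processor and a memory for storing instructions executable by the processor, where the processor is configured to execute the executable instructions to implement one or more steps of the data processing method provided by any of the above aspects and their embodiments.
In yet another aspect, a computer-readable storage medium is provided. The computer-readable storage medium stores computer program instructions which, when run on a processor, cause the processor to execute one or more steps of the data processing method described in any of the above embodiments.
In another aspect, a computer program product is provided. The computer program product includes computer program instructions which, when executed on a computer, cause the computer to execute one or more steps of the data processing method described in any of the above embodiments.
In another aspect, a computer program is provided. When the computer program is executed on a computer, it causes the computer to execute one or more steps of the data processing method described in any of the above embodiments.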
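Brief Description of the Drawings
To describe the technical solutions of the present disclosure more clearly, the accompanying drawings used in some embodiments of the present disclosure are briefly introduced below. Obviously, the drawings described below are merely drawings of some embodiments of the present disclosure, and a person of ordinary skill in the art can derive other drawings from them. In addition, the drawings described below may be regarded as schematic diagrams and do not limit the actual size of the products, the actual flow of the methods, or the actual timing of the signals involved in the embodiments of the present disclosure.
FIG. 1 is a structural diagram of a data processing system according to some embodiments;
FIG. 2 is a structural diagram of an electronic device according to an embodiment;
FIG. 3 is a flowchart of a data processing method according to an embodiment;
FIG. 4 is a result diagram of determining reference sample cut points according to some embodiments;
FIG. 5 is a flowchart of determining the sample cut point of a sample according to some embodiments;
FIG. 6 is a flowchart of determining the first difference, the second difference, and the difference value between the first difference and the second difference according to some embodiments;
FIG. 7 is a flowchart of another data processing method according to some embodiments;
FIG. 8 is a structural diagram of a condition selection interface according to some embodiments;
FIG. 9 is a structural diagram of a result variable input interface according to some embodiments;
FIG. 10 is a structural diagram of a cause variable input interface according to some embodiments;
FIG. 11 is a sample distribution diagram according to some embodiments;
FIG. 12 is a structural diagram of an analysis result display interface showing the relevant quantitative values of one value group according to some embodiments;
FIG. 13 is a structural diagram of an analysis result display interface showing the relevant quantitative values of two value groups according to some embodiments;
FIG. 14 is a structural diagram of a data processing device 80 according to some embodiments;
FIG. 15 is a structural diagram of a data processing device 90 according to some embodiments.
Detailed Description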
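The technical solutions in some embodiments of the present disclosure are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some of the embodiments of the present disclosure, not all of them. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments provided in the present disclosure fall within the scope of protection of the present disclosure.
Unless the context requires otherwise, throughout the specification and claims the term "comprise" and its other forms, such as the third-person singular "comprises" and the present participle "comprising", are to be interpreted in an open, inclusive sense, that is, as "including, but not limited to". In the description, the terms "one embodiment", "some embodiments", "exemplary embodiments", "example", "specific example" or "some examples" are intended to indicate that a particular feature, structure, material or characteristic related to the embodiment or example is included in at least one embodiment or example of the present disclosure. Such terms do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials or characteristics may be included in any one or more embodiments or examples in any suitable manner.
Hereinafter, the terms "first" and "second" are used for descriptive purposes only and are not to be understood as indicating or implying relative importance or implicitly specifying the number of the technical features indicated. A feature qualified by "first" or "second" may therefore explicitly or implicitly include one or more such features. In the description of the embodiments of the present disclosure, unless otherwise stated, "multiple" means two or more.
In describing some embodiments, the expressions "coupled" and "connected" and their derivatives may be used. For example, the term "connected" may be used in describing some embodiments to indicate that two or more components are in direct physical or electrical contact with each other. As another example, the term "coupled" may be used in describing some embodiments to indicate that two or more components are in direct physical or electrical contact. However, the terms "coupled" or "communicatively coupled" may also mean that two or more components are not in direct contact with each other but still cooperate or interact with each other. The embodiments disclosed here are not necessarily limited to the content herein.
As used herein, depending on the context, the term "if" is optionally interpreted as meaning "when", "upon", "in response to determining" or "in response to detecting". Similarly, depending on the context, the phrase "if it is determined that ..." or "if [a stated condition or event] is detected" is optionally interpreted as meaning "upon determining ...", "in response to determining ...", "upon detecting [the stated condition or event]" or "in response to detecting [the stated condition or event]".
The use of "adapted to" or "configured to" herein is meant as open and inclusive language that does not exclude devices adapted to or configured to perform additional tasks or steps.
In addition, the use of "based on" is meant to be open and inclusive, in that a process, step, calculation or other action "based on" one or more stated conditions or values may in practice be based on additional conditions or on values beyond those stated.
As used herein, "about" or "approximately" includes the stated value as well as the mean within an acceptable range of deviation for the particular value, where the acceptable range of deviation is as determined by a person of ordinary skill in the art in view of the measurement in question and the error associated with the measurement of the particular quantity (that is, the limitations of the measurement system).
In the related art, during the production of a product, the devices and device parameters involved in any production process the product passes through affect the performance of the product and may cause its performance to be substandard (also referred to as defective). The inspection station used to test product performance is usually located after several devices, so the device causing the defect cannot be located in time. To locate the device causing the defect, every device involved in the production process has to be traced; once a device has been located, its device parameters (including temperature, pressure, humidity, flow rate and other information) are obtained. A device has a large number of device parameters. For example, there can be up to 130 device parameters at the subunit level, and if the device contains 10 subunits, the device has 13000 device parameters in total; checking the device parameters one by one takes a great deal of time.
On this basis, the embodiments of the present disclosure provide a data processing method: acquiring sample data of each of a plurality of samples generated within a preset time period; dividing the sample data into positive samples and negative samples according to the test results of the samples; determining the sample cut point of each sample to obtain N value groups (also called sequence segments) of the target values corresponding to each sample, where the sample cut point of each sample characterizes an abrupt change point in the values of the device parameter of that sample; and determining a relevant quantitative value according to the difference between the Mth value group in the positive samples and the Mth value group in the negative samples, where the relevant quantitative value characterizes the degree of influence of the device parameter on defective samples, N is a positive integer, and M is a positive integer less than or equal to N. This improves detection efficiency, so that the user can make decisions quickly and locate the cause of defective samples.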
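The data processing method provided by the embodiments of the present disclosure is applicable to the data processing system 10 shown in FIG. 1. The data processing system 10 includes a data processing device 100, a display device 200 and a distributed storage device 300. The data processing device 100 is coupled to the display device 200 and to the distributed storage device 300.
The distributed storage device 300 is configured to store production data generated by multiple devices (also called factory equipment). For example, the production data generated by the multiple devices includes the sample data of those devices; for example, the sample data includes the identifiers of the devices that the samples pass through during production, the parameters corresponding to the devices, the test results and the production times, and each sample passes through at least one device during production.
The distributed storage device 300 stores relatively complete data (such as a database). It may include multiple hardware memories located at different physical locations (for example, in different factories or on different production lines) that exchange information with one another by wireless transmission (for example, over a network), so that the data is distributed but logically forms a single database based on big data technology.
The raw data of a large number of different devices is stored in the corresponding manufacturing systems, for example in the relational databases (such as Oracle or MySQL) of systems such as the yield management system (YMS), fault detection and classification (FDC) and the manufacturing execution system (MES). This raw data can be extracted table by table with data extraction tools (such as Sqoop or Kettle) and transferred to the distributed storage device 300 (for example, the Hadoop Distributed File System, HDFS), which reduces the load on the devices and manufacturing systems and makes it easier for the data processing device 100 to read the data later.
The data in the distributed storage device 300 can be stored using the Hive tool or in the HBase database format. For example, with Hive, the raw data is first stored in the database; preprocessing such as data cleaning and data conversion can then be carried out in Hive to obtain a data warehouse of the sample data of the samples. The data warehouse can be connected through different API interfaces to the display device 200, the data processing device 100 and so on, to exchange data with them. The display device 200 presents a selection page on which the user selects screening conditions, which include the result variable, the cause variables and filter conditions (for example, product category and time period). The data processing device 100 performs intelligent mining for defect diagnosis analysis, and the resulting analysis results are shown to the user on the analysis result display page of the display device 200.
Because multiple devices in multiple factories are involved, the volume of raw data is very large. For example, all devices together may generate several hundred gigabytes of raw data per day, and possibly tens of gigabytes per hour.
In the embodiments of the present disclosure, a relational database can be used to store massive structured data, and distributed computing can be used to process it. For example, a big data solution based on a distributed file system (DFS) is used to store and compute over massive structured data.
DFS-based big data technology allows large clusters to be built from many inexpensive hardware devices to process massive data. For example, Hive is a data warehouse tool based on Hadoop that can be used for data extraction, transformation and loading (ETL). Hive defines a simple SQL-like query language and also allows custom MapReduce mappers and reducers to handle complex analysis that the default tools cannot complete. Hive has no dedicated data storage format and does not build indexes for the data; users can freely organize the tables and process the data in the database. The parallel processing of distributed file management therefore meets the storage and processing requirements of massive data: users can handle simple data with SQL queries and use custom functions for complex processing. Accordingly, when analyzing the massive data of a factory, the data in the factory database needs to be extracted into the distributed file system, which on the one hand does not damage the raw data and on the other hand improves the efficiency of data analysis.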
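For example, the distributed storage device 300 may be one memory, multiple memories, or a collective term for multiple storage elements. For example, the memory may include random access memory (RAM) or double data rate synchronous dynamic random access memory (DDR SDRAM), and may also include non-volatile memory such as disk storage or flash memory.
The data processing device 100 may be any terminal device, server, virtual machine or server cluster.
The display device 200 may be a display, or a product containing a display, such as a television, a computer (all-in-one or desktop), a tablet computer, a mobile phone or an electronic picture frame. For example, the display device may be any device that displays images, whether moving (for example, video) or fixed (for example, still images), and whether text or pictures. More specifically, it is contemplated that the embodiments may be implemented in or associated with a variety of electronic devices, such as (but not limited to) game consoles, television monitors, flat panel displays, computer monitors, automotive displays (for example, odometer displays), navigators, cockpit controllers and/or displays, electronic photographs, electronic billboards or signs, projectors, architectural structures, packaging and aesthetic structures (for example, a display of an image of a piece of jewelry).
For example, the display device 200 described herein may include one or more displays, including one or more terminals with a display function, so that the data processing device 100 can send its processed data (for example, the influencing parameters) to the display device 200, which then displays it. In other words, the interface of the display device 200 (that is, the user interaction interface) enables full interaction between the user and the data processing system 10 (controlling it and receiving results).
It can be understood that the functions of the data processing device 100, the display device 200 and the distributed storage device 300 may be integrated in one or two electronic devices, or may be implemented separately by different devices; the embodiments of the present disclosure do not limit this.
The functions of the data processing device 100, the display device 200 and the distributed storage device 300 can all be implemented by the electronic device 30 shown in FIG. 2. The electronic device 30 in FIG. 2 includes, but is not limited to, a processor 301, a memory 302, an input unit 303, an interface unit 304 and a power supply 305. Optionally, the electronic device 30 includes a display 306.
The processor 301 is the control center of the electronic device. It connects the various parts of the electronic device through various interfaces and lines, and performs the functions of the electronic device and processes data by running or executing the software programs and/or modules stored in the memory 302 and calling the data stored in the memory 302, thereby monitoring the electronic device as a whole. The processor 301 may include one or more processing units; optionally, the processor 301 may integrate an application processor and a modem processor, where the application processor mainly handles the operating system, user interface and applications, and the modem processor mainly handles wireless communication. It can be understood that the modem processor may also not be integrated into the processor 301.
The memory 302 can be used to store software programs and various data. The memory 302 may mainly include a program storage area and a data storage area, where the program storage area may store the operating system and the applications required by at least one functional unit. In addition, the memory 302 may include high-speed random access memory and may also include non-volatile memory, such as at least one disk storage device, flash memory device, or other volatile solid-state storage device. Optionally, the memory 302 may be a non-transitory computer-readable storage medium, for example a read-only memory (ROM), a random access memory (RAM), a CD-ROM, a magnetic tape, a floppy disk or an optical data storage device.
The input unit 303 may be a keyboard, a touch screen or a similar component.
The interface unit 304 is the interface through which external apparatus connects to the electronic device 30. For example, the external apparatus may include a wired or wireless headset port, an external power supply (or battery charger) port, a wired or wireless data port, a memory card port, a port for connecting an apparatus having an identification module, an audio input/output (I/O) port, a video I/O port, an earphone port, and so on. The interface unit 304 can be used to receive input (for example, data information) from the external apparatus and transmit the received input to one or more elements within the electronic device 30, or to transfer data between the electronic device 30 and the external apparatus.
The power supply 305 (for example, a battery) can be used to supply power to the components. Optionally, the power supply 305 may be logically connected to the processor 301 through a power management system, so that charging, discharging and power consumption management are handled through the power management system.
The display 306 is used to display information input by the user or information provided to the user (for example, data processed by the processor 301). The display 306 may include a display panel, which may be configured as a liquid crystal display (LCD), an organic light-emitting diode (OLED) display, or the like. When the electronic device 30 is the display device 200, the electronic device 30 includes the display 306.
Optionally, the computer instructions in the embodiments of the present disclosure may also be referred to as application program code or a system, which is not specifically limited in the embodiments of the present disclosure.
It should be noted that the electronic device shown in FIG. 2 is only an example and does not limit the electronic devices to which the embodiments of the present disclosure are applicable. In an actual implementation, the electronic device may include more or fewer devices or components than those shown in FIG. 2.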
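FIG. 3 is a flowchart of a data processing method provided by an embodiment of the present disclosure. The method can be applied to the electronic device shown in FIG. 2 and may include the following steps:
S100: The electronic device acquires sample data of each of a plurality of samples generated within a preset time period. The sample data includes the values, at each collection time, of the device parameters of the devices the sample passes through, and the test result of the sample.
In one possible implementation, the electronic device receives the sample data of each of a plurality of samples of the same model produced within the preset time period by the devices on the sample production line.
In another possible implementation, the electronic device preprocesses the data through the following steps to obtain the sample data of each of the plurality of samples produced within the preset time period:
Step 1: The electronic device obtains initial sample data. The initial sample data is the sample data of every sample produced within the preset time period.
For example, the electronic device obtains from the HBase database the batch information related to products of a specific model within the preset time period and/or the identification information of the raw materials used to produce the product, and, according to the obtained batch information or identification information, obtains from the memory or the distributed storage system the sample data of every sample of the same model produced within the preset time period as the initial sample data.
It should be noted that a sample in the embodiments of the present disclosure may be a display panel on a display panel production line; a sample may of course also be another product. The sample data corresponding to a sample may also include a display panel mother board (glass), which can be processed into multiple display panels (panels).
The embodiments of the present disclosure do not limit how the test result of a sample is represented. For example, the test result may be 0 or 1, where 0 indicates that the sample belongs to one type and 1 indicates that it belongs to another type. In one example, 0 indicates a good sample and 1 indicates a defective sample. Defects can be divided into different types as required. For example, they can be classified by their direct effect on sample performance, such as bright-line defects, dark-line defects and hot spot defects; by their specific cause, such as signal-line short-circuit defects and misalignment defects; by their general cause, such as array process defects and color-filter process defects; or by their severity, such as defects that lead to scrapping and defects that lower quality. Alternatively, defect types may not be distinguished at all: a sample is considered defective if it has any defect, and non-defective otherwise. In the embodiments of the present disclosure, the test results of all of the plurality of samples correspond to the same variable.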
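For example, assume that the sample data corresponding to device parameter 1 of device A in the sample data obtained by the electronic device is as shown in Table 1 below:
Table 1
Figure PCTCN2021097393-appb-000001
In Table 1, the 1 in the first column is the identifier of sample 1. In the first row, 49.5456, 49.5823 and 46.9352 are values of device parameter 1. In the second row, 00:01:47 is the production time corresponding to the value 49.5456, 00:01:48 is the production time corresponding to the value 49.5823, and 00:01:49 is the production time corresponding to the value 46.9352. The remaining rows are similar and are not described again.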
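Step 2: The electronic device trims and/or filters the initial sample data to obtain the sample data of each of the plurality of samples produced within the preset time period.
The electronic device can trim and/or filter the initial sample data in at least one of the following ways:
Way 1: The electronic device divides the initial sample data into positive samples (also called good samples) and negative samples (also called defective samples) according to the test results of the samples. Among the negative samples in the initial sample data, the electronic device filters out those for which the ratio of the number of device parameter values to the number of collection times does not satisfy a preset condition. For example, the electronic device filters out of the initial sample data the negative samples for which 95% of the number of device parameter values is greater than the number of collection times of the device parameter. Based on the initial sample data in Table 1, the sample with identifier 4 has 5 device parameter values and 4 collection times; 5*95% is greater than 4, so the electronic device filters out the sample data of the sample with identifier 4.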
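The 95% rule of Way 1 can be written as a single predicate. The following is a minimal sketch: the function name and the `factor` parameter are illustrative assumptions, and only the case of 5 values against 4 collection times comes from the text.

```python
def keep_negative_sample(value_count, time_count, factor=0.95):
    """Return False when a negative sample should be filtered out.

    A sample is dropped when factor * (number of parameter values)
    exceeds the number of collection times, as in the Way 1 example.
    """
    return not (value_count * factor > time_count)


# Sample 4 from the running example: 5 values, 4 collection times.
print(keep_negative_sample(5, 4))   # 5 * 0.95 = 4.75 > 4, so it is dropped
```
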
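Way 2: The electronic device divides the initial sample data into positive samples (also called good samples) and negative samples (also called defective samples) according to the test results of the samples. Through the following steps, the electronic device obtains the number of values included in the target values of the positive samples, determines a numerical range according to that number, and filters out the positive samples whose number of target values lies outside the numerical range.
Step 1: The electronic device obtains the number of values included in the target values of the positive samples. The target values are those values of the device parameter for which the time difference between two adjacent collection times is less than the first threshold.
Based on the example in Table 1, the target values of the sample with identifier 1 are 49.5456, 49.5823 and 46.9352, so that sample has 3 target values. Similarly, the sample with identifier 2 has 5 target values, the sample with identifier 3 has 7, and the sample with identifier 4 has 5.
Step 2: The electronic device determines the numerical range according to the number of values included in the target values of the positive samples.
In a possible implementation, the electronic device obtains the median and the interquartile range (IQR) of the numbers of device parameter values, and determines the numerical range from the median and the IQR.
Based on the example of step 1 and the sample data remaining after the filtering of Way 1, the median of the numbers of values of the device parameter is 5. The IQR satisfies IQR = Q3 - Q1, where Q3 is the third quartile and Q1 is the first quartile; Q3 is 6 and Q1 is 4, so the IQR is 2. The electronic device takes the sum of the median and 4 times the IQR as the upper bound of the numerical range, which gives 13, and takes the difference between the median and 4 times the IQR as the lower bound, which gives -3.
Step 3: The electronic device filters out of the initial sample data the positive samples whose number of target values lies outside the determined numerical range.
In the example where device parameter 1 has 3, 5 and 7 values, all of these counts lie within the numerical range (-3, 13), so the electronic device does not filter out any positive sample.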
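The range computation in Way 2 can be sketched as follows. This is an illustration under stated assumptions: the quantile uses linear interpolation (which reproduces Q1 = 4 and Q3 = 6 for the counts 3, 5, 7, although the text does not say which quantile method is used), the bounds are treated as inclusive, and all function names are hypothetical.

```python
import math


def quantile(sorted_vals, q):
    """Linear-interpolation quantile of an already sorted list, 0 <= q <= 1."""
    pos = (len(sorted_vals) - 1) * q
    lo, hi = math.floor(pos), math.ceil(pos)
    return sorted_vals[lo] + (pos - lo) * (sorted_vals[hi] - sorted_vals[lo])


def count_range(counts, k=4):
    """Return (lower, upper) = median -/+ k * IQR of the value counts."""
    s = sorted(counts)
    median = quantile(s, 0.5)
    iqr = quantile(s, 0.75) - quantile(s, 0.25)
    return median - k * iqr, median + k * iqr


def filter_positive(counts_by_id, k=4):
    """Keep the positive samples whose value count lies inside the range."""
    lower, upper = count_range(list(counts_by_id.values()), k)
    return {sid: n for sid, n in counts_by_id.items() if lower <= n <= upper}


positive_counts = {1: 3, 2: 5, 3: 7}          # counts from the running example
print(count_range(list(positive_counts.values())))
print(filter_positive(positive_counts))
```

With the counts 3, 5 and 7 the range comes out as (-3, 13), so all three positive samples are kept, in line with the example above.
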
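Way 3: The electronic device determines a trimming length according to the median of the numbers of values included in the target values of the samples, and trims the acquired sample data of each sample according to the trimming length.
In a possible implementation, the electronic device obtains the number of target values of each sample, obtains the median of these numbers, and determines the trimming length from the median and a preset percentage. The electronic device then trims off a trimming length's worth of target values starting from the first collection time and moving toward later times, or trims off a trimming length's worth of target values starting from the last collection time and moving toward earlier times.
Based on the example in Way 2, the median of the numbers of target values of parameter 1 is 5 and the preset percentage is 3%, so the trimming length of the target values of parameter 1 is 5*3% = 0.15, which rounds down to 0. The electronic device therefore does not need to trim the positive samples in the initial sample data.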
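The trimming length of Way 3 is the floor of the median count times the preset percentage. A small sketch follows, assuming that "trimming" means dropping that many values from one end of the series; the function names are hypothetical.

```python
import math
import statistics


def trim_length(counts, percent):
    """Floor of (median of the value counts) * (preset percentage)."""
    return math.floor(statistics.median(counts) * percent)


def trim(values, length, from_start=True):
    """Drop `length` values from the start (or from the end) of a series."""
    if length <= 0:
        return list(values)
    return list(values[length:]) if from_start else list(values[:-length])


length = trim_length([3, 5, 7], 0.03)   # median 5, 5 * 3% = 0.15
print(length)                            # rounds down to 0: nothing is trimmed
```
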
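S101: The electronic device divides the sample data into positive samples and negative samples according to the test results of the samples.
Based on the example in Table 1, assume that the test results of the samples with identifiers 1 to 3 are good and the test result of the sample with identifier 4 is defective. In the divided sample data, the positive samples are then the samples with identifiers 1 to 3, and the negative samples are the sample with identifier 4.
S102: The electronic device determines the sample cut point of each sample according to the values of the device parameter, and obtains N value groups of the target values corresponding to each sample. The sample cut point of each sample characterizes an abrupt change point in the values of the device parameter of that sample. The target values are those values of the device parameter for which the time difference between two adjacent collection times is less than the first threshold, and N is a positive integer greater than or equal to 1.
In a possible implementation, the electronic device determines the sample cut point of each sample through the following steps:
Step 1: The electronic device determines the sample data of the reference sample according to the obtained values of the device parameter. The reference sample is one of the positive samples among the plurality of samples.
The electronic device can perform a Fourier transform on the values of the device parameter of each sample in the positive samples; take the minimum number of values of the device parameter among the transformed positive samples as the truncation count; and take the first truncation-count values of the device parameter of each sample in the positive samples to obtain multiple truncated value groups, each of which contains exactly the truncation count of values. Following the order of the values in each truncated value group, the electronic device obtains the median of the values at each position across the truncated value groups, which gives a median sequence, and determines the positive sample with the smallest difference from the median sequence as the reference sample.
Optionally, if the number of positive samples in the sample data corresponding to the device parameter (hereinafter the sample count) is greater than or equal to 200, the electronic device draws the sample data of 1% of the samples from the sample data of the positive samples. If 20 < sample count < 200, it draws the sample data of 20 samples. If the sample count is less than or equal to 20, it draws all samples. The electronic device then determines the reference sample from the drawn samples using the method above. Determining the reference sample from a subset of the samples improves the efficiency of data processing.
Based on the sample data of the samples with identifiers 1, 2 and 3 in Table 1, the sample with identifier 1 has 3 values of the device parameter, the sample with identifier 2 has 5, and the sample with identifier 3 has 7, so 3 is the minimum number of values of the device parameter and the electronic device sets the truncation length to 3. The truncated value group of the sample with identifier 1 is (49.5456, 49.5823, 46.9352), that of the sample with identifier 2 is (47.0249, 47.0248, 47.0248), and that of the sample with identifier 3 is (49.5344, 46.8889, 46.8889). The electronic device obtains 49.5344 as the median at the first position, 47.0248 as the median at the second position and 46.8889 as the median at the third position, giving the median sequence (49.5344, 47.0248, 46.8889). Of the samples with identifiers 1, 2 and 3, the sample with identifier 3 has the smallest difference from the median sequence, so the electronic device determines the sample with identifier 3 as the reference sample.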
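The reference sample selection can be sketched as below. This sketch makes several assumptions: it omits the Fourier transform step, it measures the "difference" from the median sequence as a sum of absolute differences (the text does not define the metric), and it only uses the first three values of sample 3, since the remaining values are not listed in the text. Note that a plain per-position median of the three truncated groups as listed gives 46.9352 at the third position rather than the 46.8889 stated in the example; the selected reference sample is sample 3 either way.

```python
import statistics


def reference_sample(positive):
    """positive: dict of sample id -> list of parameter values in collection order.

    Returns (id of the reference sample, median sequence).
    """
    n = min(len(v) for v in positive.values())           # truncation count
    truncated = {sid: v[:n] for sid, v in positive.items()}
    # Median of the values at each position across all truncated groups.
    median_seq = [statistics.median(col) for col in zip(*truncated.values())]

    def distance(vals):
        # Assumed metric: sum of absolute differences to the median sequence.
        return sum(abs(a - b) for a, b in zip(vals, median_seq))

    ref_id = min(truncated, key=lambda sid: distance(truncated[sid]))
    return ref_id, median_seq


positive = {
    1: [49.5456, 49.5823, 46.9352],
    2: [47.0249, 47.0248, 47.0248, 47.0013, 47.0013],
    3: [49.5344, 46.8889, 46.8889],   # first three of seven values
}
ref_id, median_seq = reference_sample(positive)
print(ref_id, median_seq)
```
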
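Step 2: The electronic device determines the signal-to-noise ratio of the reference sample, and takes the absolute value of the signal-to-noise ratio as the SNR absolute value.
Step 3: The electronic device takes, as reference sample cut points, those filtered values of the equipment parameter whose absolute values are greater than the SNR absolute value.
For example, assume that the SNR absolute value determined by the electronic device is threshold. The electronic device takes the filtered values of the equipment parameter that fall outside the threshold range [-threshold, threshold] as the cut points of the reference sample. As shown in FIG. 4, the points on curve 1 are the sample data and the points on curve 2 are the sample data obtained with a high-pass filter. No value in FIG. 4 falls outside the threshold range, so the sample data in FIG. 4 forms a single value group. The abscissa in FIG. 4 is the sequence number corresponding to the acquisition time of each value of the equipment parameter in the sample data.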
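The thresholding in Step 3 can be illustrated with a deliberately simple stand-in. The source does not specify the high-pass filter or how the signal-to-noise ratio is computed, so this sketch assumes a first-order difference as the filter and takes the threshold as a given number; `high_pass` and `cut_points` are hypothetical names.

```python
def high_pass(values):
    """Crude high-pass filter (assumption): first-order difference, 0.0 at index 0."""
    return [0.0] + [b - a for a, b in zip(values, values[1:])]

def cut_points(values, threshold):
    """Indices whose filtered value lies outside [-threshold, threshold]."""
    filtered = high_pass(values)
    return [i for i, v in enumerate(filtered) if abs(v) > threshold]

values = [47.0249, 47.0248, 47.0248, 47.0013, 47.0013]
print(cut_points(values, threshold=0.01))  # [3]
print(cut_points(values, threshold=1.0))   # []
```

An empty result corresponds to the situation in FIG. 4, where no filtered value leaves the threshold range and the whole sample stays one value group.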
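Optionally, the electronic device may adjust a reference sample cut point according to the fluctuation amplitude of the equipment parameter values near the determined reference sample cut point. For example, when the difference between the equipment parameter values on the two sides of a determined reference sample cut point is less than a fluctuation threshold, the reference sample cut point is adjusted. The fluctuation threshold helps determine the abrupt change points in the values of the equipment parameter.
Step 4: The electronic device determines the sample cut point of each sample according to a reference ratio and the reference sample cut points. Value groups obtained from sample cut points that are derived from the same reference sample cut point correspond to one another. The reference ratio is the ratio of the number of values of the equipment parameter of the reference sample to the number of values of the equipment parameter of each sample.
In a possible implementation, for each sample, the electronic device determines a preliminary sample cut point according to the reference ratio and the reference sample cut point; according to the determined sample cut point and a preset window size, obtains the correlation between the group of equipment parameter values within the preset window size of the sample cut point and the group of equipment parameter values within the preset window size of the reference sample cut point; and corrects the sample cut point of each sample according to the obtained correlation.
As shown in FIG. 5, assume that the reference ratio is 2. The electronic device selects the 2nn interval before the cut point of the sample data as the preset window range [start, end]. The electronic device traverses each point in the preset window range [start, end], beginning at start, takes the data segment X of preset-window-range length that follows the point, takes the data segment Y spanning nn values before and after the cut point of the reference sample, and computes the correlation between data segment X and data segment Y using the Pearson correlation coefficient. The electronic device then takes the cut point with the highest correlation within the preset window range [start, end] as the sample cut point of the sample.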
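The correlation-based correction of a cut point can be sketched as follows. This is a simplified variant under stated assumptions, not the exact windowing of FIG. 5: candidates are searched symmetrically around the preliminary cut point, both segments span `window` values on each side of a cut point, and the names `pearson` and `refine_cut_point` and the parameters `window` and `search` are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient; 0.0 if either segment is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)

def refine_cut_point(ref, ref_cut, sample, window, search):
    """Scale the reference cut point by the reference ratio, then refine it."""
    ratio = len(ref) / len(sample)      # reference ratio
    initial = round(ref_cut / ratio)    # preliminary sample cut point
    y = ref[ref_cut - window: ref_cut + window]
    best, best_r = initial, -2.0
    for c in range(initial - search, initial + search + 1):
        if c - window < 0 or c + window > len(sample):
            continue  # candidate window would leave the sample
        r = pearson(sample[c - window: c + window], y)
        if r > best_r:
            best, best_r = c, r
    return best

ref = [1, 1, 1, 1, 1, 5, 5, 5, 5, 5]     # jump at index 5
sample = [1, 1, 1, 1, 1, 1, 1, 5, 5, 5]  # same shape, jump at index 7
print(refine_cut_point(ref, 5, sample, window=2, search=3))  # 7
```

In this toy input the preliminary cut point is 5, and the candidate at index 7 is the only one whose surrounding segment correlates perfectly with the segment around the reference cut point.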
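It can be understood that the sample cut points of a sample divide the target values corresponding to the sample into N value groups, where N is a positive integer greater than or equal to 1. When N equals 1, the sample has no sample cut point, its target values do not need to be divided, and the target values form a single value group. In actual production, one piece of equipment may include multiple equipment recipes (recipe steps). An equipment recipe is an instruction describing how the equipment should process a sample (also called the setting of the equipment parameters with which the equipment processes the sample). In form, an equipment recipe includes the values of the equipment parameters and the times corresponding to those values. Within one equipment recipe, the time differences between the acquisition times of the values of the same equipment parameter are less than the first threshold (for example, 1 second).
Based on the example in Table 1, the electronic device determines that the cut point of the sample with sample identifier 2 is 47.0013. The electronic device therefore cuts the sample data of the sample with sample identifier 2 into two value groups: the first value group includes the values 47.0249, 47.0248, and 47.0248, and the second value group includes the values 47.0013 and 47.0013.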
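Splitting a sample into value groups at its cut points can be sketched in a few lines. The helper name `split_into_groups` is hypothetical, and a cut point is assumed to be the index of the first value of the next group.

```python
def split_into_groups(values, cut_indices):
    """Split `values` into N value groups; each cut index starts a new group."""
    groups, start = [], 0
    for cut in sorted(cut_indices):
        groups.append(values[start:cut])
        start = cut
    groups.append(values[start:])
    return groups

sample_2 = [47.0249, 47.0248, 47.0248, 47.0013, 47.0013]
print(split_into_groups(sample_2, [3]))
# [[47.0249, 47.0248, 47.0248], [47.0013, 47.0013]]
```

With no cut points the function returns a single group containing all target values, which is the N = 1 case described above.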
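It can be understood that the values within the same value group tend to follow the same trend; they are stable and contain no abrupt change. In this way, the correlation quantification value of a value group, determined from the difference between the positive and negative samples for the value group at the same position of the equipment parameter, can characterize the degree to which that value group influences sample defects.
S103: The electronic device determines a correlation quantification value according to the difference between the M-th value group in the positive samples and the M-th value group in the negative samples. The correlation quantification value characterizes the degree of influence of the equipment parameter on defective samples, and M is a positive integer less than or equal to N.
In a possible implementation, the electronic device determines the correlation quantification value through the following steps:
S103-1: The electronic device determines a first value of a statistical indicator of the M-th value group in the negative samples. The statistical indicator characterizes the central tendency or the trend of change of the values in a value group.
Statistical indicators characterizing the central tendency of the values in a value group include at least one of the maximum, the minimum, the mean, the median, the standard deviation, the index of the minimum, the index of the maximum, and other features that reflect the value group as a whole.
Statistical indicators characterizing the trend of change of a value group include at least one of: the slope, the range, the sum of the differences on downward trends (Stat_downtrend), the sum of the differences on upward trends (Stat_uptrend), the sum of the positive values (Positive_sum), the maximum of the sums over rising intervals (Positive_max), the start index of the consecutive rising interval with the largest sum (Positive_maxstart), the end index of the consecutive rising interval with the largest sum (Positive_maxend), the sum of the negative values (Negative_sum), the maximum of the sums over falling intervals (Negative_max), the start index of the consecutive falling interval with the largest sum (Negative_maxstart), the end index of the consecutive falling interval with the largest sum (Negative_maxend), or the sum of absolute values (L1_NORM).
The indices above indicate the position of a value within its value group. For example, for the value group (-2, 1, -1, 2, 3, -3, 4, -4), the index of the first value, -2, is 1, the index of the second value, 1, is 2, and so on.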
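Based on the example value group (-2, 1, -1, 2, 3, -3, 4, -4): the maximum of the equipment parameter values in the group is 4; the minimum is -4; the mean is (-2 + 1 + (-1) + 2 + 3 + (-3) + 4 + (-4))/8 = 0; the median is (-1 + 1)/2 = 0; and the standard deviation std satisfies the formula
$$\mathrm{std} = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^2}$$
where n = 8, $x_1$ corresponds to -2, $x_8$ corresponds to -4, and the remaining terms follow in the same way, and $\bar{x}$ is the mean of -2, 1, -1, 2, 3, -3, 4, -4. The computed standard deviation is 2.93. Range is the difference between the maximum and the minimum, 4 - (-4) = 8; Index_min, the index of the minimum, is 8; Index_max is 7; Stat_downtrend is -2 - 6 - 8 = -16; Stat_uptrend is 3 + 3 + 1 + 7 = 14; Slope satisfies the slope formula:
Figure PCTCN2021097393-appb-000004
Figure PCTCN2021097393-appb-000005
The computed Slope is -0.21428571; Positive_sum is 10; Positive_max is 2 + 3 = 5; Positive_maxstart is 4, because the consecutive rising interval with the largest sum is [2, 3] and the index of 2 is 4; Positive_maxend is 5, because the consecutive rising interval with the largest sum is [2, 3] and the index of 3 is 5; Negative_sum is the sum of the negative values, -2 - 1 - 3 - 4 = -10; Negative_max is the maximum among the sums over falling intervals, which is -4; Negative_maxstart is the start index of the consecutive falling interval with the largest sum, which is 8, the index of -4; Negative_maxend is the end index of that interval, which is likewise 8; L1_NORM is 10 - (-10) = 20. Assuming that the statistical indicators comprise the 20 values above, the first value of the statistical indicators of this value group is the feature vector [-4, 4, 0, 0, 2.93, 8, 8, 7, -16, 14, -0.214, 10, 5, 4, 5, -10, -4, 8, 8, 20].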
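The worked example can be reproduced with a short sketch. It is an illustration under assumptions: `compute_indicators` and `_sign_runs` are hypothetical names; rising and falling intervals are interpreted as maximal runs of consecutive positive or negative values, which is the reading that matches the numbers in the example; the standard deviation is the sample standard deviation (divisor n - 1), which matches 2.93; and Slope is left out because its formula is only available as a figure.

```python
import statistics

def _sign_runs(values, positive):
    """(sum, start, end) of each maximal run of positive (or negative) values; 1-based."""
    runs, start, total = [], None, 0
    for i, v in enumerate(values, start=1):
        if (v > 0) if positive else (v < 0):
            if start is None:
                start, total = i, 0
            total += v
        elif start is not None:
            runs.append((total, start, i - 1))
            start = None
    if start is not None:
        runs.append((total, start, len(values)))
    return runs

def compute_indicators(values):
    diffs = [b - a for a, b in zip(values, values[1:])]
    pos = max(_sign_runs(values, True), default=(0, None, None))
    neg = min(_sign_runs(values, False), default=(0, None, None))
    return {
        "min": min(values), "max": max(values),
        "mean": statistics.mean(values), "median": statistics.median(values),
        "std": statistics.stdev(values),            # sample standard deviation
        "range": max(values) - min(values),
        "index_min": values.index(min(values)) + 1,  # 1-based indices
        "index_max": values.index(max(values)) + 1,
        "stat_downtrend": sum(d for d in diffs if d < 0),
        "stat_uptrend": sum(d for d in diffs if d > 0),
        "positive_sum": sum(v for v in values if v > 0),
        "positive_max": pos[0], "positive_maxstart": pos[1], "positive_maxend": pos[2],
        "negative_sum": sum(v for v in values if v < 0),
        "negative_max": neg[0], "negative_maxstart": neg[1], "negative_maxend": neg[2],
        "l1_norm": sum(abs(v) for v in values),
    }

indicators = compute_indicators([-2, 1, -1, 2, 3, -3, 4, -4])
print(round(indicators["std"], 2), indicators["stat_downtrend"], indicators["l1_norm"])
# 2.93 -16 20
```

Returning a dictionary rather than a bare vector keeps each indicator addressable by name when the first and second values are later compared indicator by indicator.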
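S103-2: The electronic device determines a second value of the statistical indicator of the M-th value group in the positive samples.
Based on the example in S103-1, assume that the value group in S103-1 is the first value group of this equipment parameter in a negative sample. The electronic device obtains the second value of the statistical indicators of the first value group of this equipment parameter in each positive sample.
S103-3: The electronic device determines the difference between the first value and the second value.
In a possible implementation, the electronic device determines the difference between the first value and the second value according to a characteristic parameter of the first value and the characteristic parameter of the second value.
The characteristic parameter may include the value at a target position and/or the overall mean.
In one example, the electronic device may use the Kruskal-Wallis test to determine the difference value between the first value and the second value.
For example, assume that the target position is the median. The electronic device obtains the median of the plurality of first values and the median of the plurality of second values, and takes the difference between the two medians as the difference between the first value and the second value.
In another example, the electronic device may use the t-test to determine the difference between the first value and the second value.
For example, the electronic device obtains the overall mean of the plurality of first values and the overall mean of the plurality of second values, and takes the difference between the two overall means as the difference value between the first value and the second value.
In another possible implementation, the electronic device determines a first difference between the value at the target position of the first value and the value at the same target position of the second value, determines a second difference between the overall mean of the first value and the overall mean of the second value, and determines the difference value between the first value and the second value according to the first difference, the second difference, and preset weights.
For example, as shown in FIG. 6, the electronic device determines the first difference using the Kruskal-Wallis test and the second difference using the t-test, and takes the sum of the first difference × 50% and the second difference × 50% as the difference value (p value) between the first value and the second value.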
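The weighted combination can be sketched with plain location differences. This is a stand-in, not the statistical tests themselves: the first difference is taken here as the absolute difference of medians and the second as the absolute difference of means, whereas an implementation following FIG. 6 would derive them from a Kruskal-Wallis test and a t-test (for instance `scipy.stats.kruskal` and `scipy.stats.ttest_ind`, if SciPy is available). The function name and weight parameters are hypothetical.

```python
import statistics

def combined_difference(first_values, second_values, w_location=0.5, w_mean=0.5):
    """Weighted sum of a median-based and a mean-based difference.

    first_values: one statistical indicator over the negative samples.
    second_values: the same indicator over the positive samples.
    """
    first_diff = abs(statistics.median(first_values) - statistics.median(second_values))
    second_diff = abs(statistics.mean(first_values) - statistics.mean(second_values))
    return w_location * first_diff + w_mean * second_diff

print(combined_difference([1, 2, 3], [2, 4, 9]))  # 2.5
```

In the toy call the medians differ by 2 and the means by 3, so equal weights of 50% give 2.5.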
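S103-4: The electronic device determines the correlation quantification value according to the difference.
It can be understood that the larger the difference value, the stronger the correlation between the value group and the inspection result, and the stronger the correlation, the larger the correlation quantification value of that value group of the equipment parameter.
It can be understood that, in a possible implementation, the electronic device may determine the first values of all statistical indicators of a value group of the equipment parameter in the negative samples, determine the second values of the corresponding statistical indicators of the value group of the equipment parameter in the positive samples, and obtain the correlation quantification value of the value group from the difference value between the first values and the second values.
In another possible implementation, the electronic device may instead determine the first value of each statistical indicator of a value group of the equipment parameter in the negative samples and the second value of the corresponding statistical indicator of the value group in the positive samples, obtain for each statistical indicator the difference value between the first value of the negative samples and the second value of the positive samples, and obtain multiple correlation quantification values of the value group from the multiple difference values. The electronic device sorts the multiple correlation quantification values and outputs them to the user, which makes it easier for the user to determine which statistical indicator better reflects the degree to which the value group influences sample defects.
Optionally, S104: The electronic device sorts the determined correlation quantification values and outputs the ranking of the value groups of the equipment parameters corresponding to the correlation quantification values.
For example, the electronic device sorts the value groups of the equipment parameters in descending order of their correlation quantification values. The value group with the greatest influence on defective samples is then ranked first, which makes it easier for the user to find the cause of the defective samples.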
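The ranking in S104 amounts to a descending sort. In this sketch the record layout and the function name `rank_groups` are hypothetical, and all scores except 0.9682 (which appears in the description of FIG. 12) are made-up illustration values.

```python
def rank_groups(scores):
    """Sort value groups by correlation quantification value, highest first."""
    return sorted(scores, key=lambda item: item["value"], reverse=True)

scores = [
    {"parameter": "parameter 2", "group": 1, "value": 0.41},    # hypothetical
    {"parameter": "parameter 1", "group": 1, "value": 0.9682},
    {"parameter": "parameter 1", "group": 2, "value": 0.73},    # hypothetical
]
for rank, item in enumerate(rank_groups(scores), start=1):
    print(rank, item["parameter"], item["group"], item["value"])
```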
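In the embodiments of the present disclosure, the electronic device obtains the sample data of each of a plurality of samples produced within a preset time period; divides the sample data into positive samples and negative samples according to the inspection results of the samples; and determines the sample cut point of each sample. The sample cut points reflect the abrupt change points in the values of the equipment parameter, and the sample cut points of each sample divide the target values corresponding to the sample into a plurality of value groups, so that the values within each value group tend to follow the same trend. The correlation quantification value is determined according to the difference between a value group of the equipment parameter in the negative samples and the corresponding value group in the positive samples: the larger the difference, the larger the correlation quantification value, and the greater the degree to which the value group contributes to sample defects. This makes it easier for the user to identify the cause of the defective samples.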
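FIG. 7 is a flowchart of another data processing method provided by an embodiment of the present disclosure. The method can be applied to the electronic device shown in FIG. 2, and the method shown in FIG. 7 may include the following steps:
S200: The electronic device receives a sample screening condition entered by the user on a condition selection interface.
The sample screening condition may include a sample model, a factory identifier, a site, a process, a start time, an end time, and the like.
For example, the condition selection interface is shown in FIG. 8. In FIG. 8, the start time and end time fields receive the input time period, the factory input box receives the factory identifier, the process input box receives the process, the site input box receives the site, and the product model input box receives the sample model. After the user enters the relevant information in the input boxes in FIG. 8 and clicks the confirm button, the electronic device receives the entered sample screening condition.
Optionally, the sample screening condition may further include an inspection result variable.
In a possible implementation, the electronic device reads a preset inspection result variable.
In another possible implementation, the electronic device obtains the entered inspection result variable in response to the user's input on a result variable input interface.
For example, the result variable input interface is shown in FIG. 9; clicking the result variable input box displays the interface in FIG. 9. In FIG. 9, the raw material may be a panel mother board, and the inspection site entry lets the user select that inspection site, under which there are at least six inspection result variables: the type 1 defect count lets the user select the number of defective samples of type 1 as the inspection result variable; the type 1 defect rate lets the user select the defect rate of samples of type 1; the type 1 raw material defect rate lets the user select the defect rate of the raw material of type 1; the type 2 defect count lets the user select the number of defective samples of type 2; the type 2 defect rate lets the user select the defect rate of samples of type 2; and the type 2 raw material defect rate lets the user select the defect rate of the raw material of type 2.
Optionally, the sample screening condition may further include equipment parameters; the electronic device obtains the equipment parameters in response to the user's input on a cause variable input interface.
For example, FIG. 10 shows the cause variable input interface. In FIG. 10, the raw material may be a panel mother board, the inspection site is an inspection site that the user can select, and the product entry lets the user select the product model. The process identifier in FIG. 10 lets the user select the corresponding process. One process corresponds to at least one process step, and process step identifier 1 and process step identifier 2 in FIG. 10 each let the user select a process step. The process step labeled process step identifier 2 in FIG. 10 corresponds to at least three pieces of equipment: equipment 1, equipment 2, and equipment 3 each correspond to one piece of equipment.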
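S201: The electronic device obtains the sample data of each of a plurality of samples corresponding to the sample screening condition. The sample data includes the values, at each acquisition time, of the equipment parameters of the equipment that the sample passes through, and the inspection result of the sample.
S202: The electronic device divides the sample data into positive samples and negative samples according to the inspection results of the samples.
Optionally, after dividing the sample data into positive samples and negative samples, the electronic device displays the sample distribution shown in FIG. 11, in which the abscissa is the production time and the ordinate is the inspection result.
S203: The electronic device determines a sample cut point of each sample according to the values of the equipment parameter, and obtains N value groups of the target values corresponding to each sample. The sample cut point of each sample characterizes an abrupt change point in the values of the equipment parameter for the sample, the target values are those values of the equipment parameter for which the time difference between two adjacent acquisition times is less than a first threshold, and N is a positive integer greater than or equal to 1.
For details, refer to the description of S102 in the foregoing embodiment; they are not repeated here.
S204: The electronic device determines a correlation quantification value according to the difference between the M-th value group in the positive samples and the M-th value group in the negative samples. The correlation quantification value characterizes the degree of influence of the equipment parameter on defective samples, and M is a positive integer less than or equal to N.
For details, refer to the description of S103 above; they are not repeated here.
S205: The electronic device displays the correlation quantification value on an analysis result display interface.
Optionally, the electronic device sorts the correlation quantification values by magnitude and displays, on the analysis result display interface, the ranking of the value groups of the equipment parameters corresponding to the correlation quantification values.
In one example, the electronic device displays each value group corresponding to the obtained correlation quantification values on the analysis result display interface, as shown in FIG. 12. Taking the value group as the unit, the electronic device sorts the multiple correlation quantification values of each value group from high to low. Ranked first in FIG. 12 is equipment parameter 1, which has only one process and one value group; the correlation quantification values of its 20 statistical indicators are sorted from high to low, and feature 1 has the highest correlation quantification value, 0.9682.
Optionally, the electronic device obtains output parameters, which include at least one of: an information parameter of the value group, a range percentage, a first ratio, or a second ratio. The first ratio is the ratio of the number of samples that include the equipment parameter to the total number of the plurality of samples, and the second ratio is the ratio of the number of defective samples corresponding to the equipment parameter to the total number of negative samples. The electronic device displays the output parameters on the analysis result display interface. The information parameter includes the position of the value group within the equipment parameter and/or the percentage of the target values that the value group accounts for.
For example, the electronic device displays the correlation quantification value of each obtained value group on the analysis result display interface, as shown in FIG. 13, which includes the correlation quantification values of two value groups of equipment parameter 1. For the value group in the first row of FIG. 13, the largest correlation quantification value is 0.9682 and the corresponding equipment parameter name is parameter 1. "Value group: 0 (1/2)" indicates that there is only one equipment recipe in the sample production process, that this equipment recipe is divided into 2 value groups, and that this value group is the 1st of them. "Value group percentage: 94.85%" is the percentage of the whole equipment recipe that this value group accounts for. "Range percentage: 100.0%" is the percentage that the range (maximum minus minimum) of this equipment recipe accounts for in the range of the whole process. "Defect ratio: (89/89)" is the number of defective samples that reported this equipment parameter over the number of all defective samples. "Parameter sample ratio: (1095/1143)" is the number of samples that reported this equipment parameter over the number of all samples.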
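The foregoing mainly describes the solutions provided by the embodiments of the present disclosure from the perspective of the method. To implement the above functions, the electronic device includes corresponding hardware structures and/or software modules for performing each function. Those skilled in the art will readily appreciate that, in combination with the units and algorithm steps of the examples described in the embodiments disclosed herein, the present disclosure can be implemented in hardware or in a combination of hardware and computer software. Whether a function is performed by hardware or by computer software driving hardware depends on the specific application and design constraints of the technical solution. Skilled persons may use different methods to implement the described functions for each specific application, but such implementations should not be considered beyond the scope of this application.
In the embodiments of the present disclosure, the electronic device in the foregoing embodiments may be divided into functional modules according to the foregoing method examples. For example, each functional module may correspond to one function, or two or more functions may be integrated into one processing module. The integrated module may be implemented in the form of hardware or in the form of a software functional module. It should be noted that the division into modules in the embodiments of the present disclosure is illustrative and is merely a division by logical function; other divisions are possible in actual implementation.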
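FIG. 14 is a structural diagram of a data processing apparatus 80 provided by an embodiment of the present disclosure. The data processing apparatus 80 includes an obtaining module 801, a dividing module 802, and a determining module 803. The obtaining module 801 is configured to obtain the sample data of each of a plurality of samples produced within a preset time period, where the sample data includes the values, at each acquisition time, of the equipment parameter of the equipment that the sample passes through, and the inspection result of the sample. The dividing module 802 is configured to divide the sample data into positive samples and negative samples according to the inspection results of the samples. The determining module 803 is configured to determine a sample cut point of each sample according to the values of the equipment parameter, to obtain N value groups of the target values corresponding to each sample, where the sample cut point of each sample characterizes an abrupt change point in the values of the equipment parameter of the sample, the target values are those values of the equipment parameter for which the time difference between two adjacent acquisition times is less than a first threshold, and N is a positive integer greater than or equal to 1; and to determine a correlation quantification value according to the difference between the M-th value group in the positive samples and the M-th value group in the negative samples, where the correlation quantification value characterizes the degree of influence of the equipment parameter on defective samples, and M is a positive integer less than or equal to N. For example, with reference to FIG. 3, the obtaining module 801 may be used to perform S100, the dividing module 802 may be used to perform S101, and the determining module 803 may be used to perform S102 and S103.
In some embodiments, the determining module 803 is specifically configured to: determine a first value of a statistical indicator of the M-th value group in the negative samples and a second value of the statistical indicator of the M-th value group in the positive samples, where the statistical indicator characterizes the central tendency or the trend of change of the values in a value group; determine the difference between the first value and the second value; and determine the correlation quantification value according to the difference.
In other embodiments, the determining module 803 is specifically configured to determine the difference between the first value and the second value according to a characteristic parameter of the plurality of first values in the negative samples and the characteristic parameter of the plurality of second values in the positive samples.
In other embodiments, the characteristic parameter includes the value at a target position and/or the overall mean.
In other embodiments, the determining module 803 is specifically configured to: determine a first difference between the value at the target position of the plurality of first values in the negative samples and the value at the target position of the plurality of second values in the positive samples; determine a second difference between the overall mean of the plurality of first values in the negative samples and the overall mean of the plurality of second values in the positive samples; and determine the difference between the first value and the second value according to the first difference, the second difference, and preset weights.
In other embodiments, the determining module 803 is specifically configured to: determine the sample data of a reference sample according to the values of the equipment parameter, where the reference sample is one of the positive samples; determine the signal-to-noise ratio of the reference sample and take the absolute value of the signal-to-noise ratio as the SNR absolute value; take, as reference sample cut points, those filtered values of the equipment parameter whose absolute values are greater than the SNR absolute value; and determine the sample cut point of each sample according to a reference ratio and the reference sample cut points, to obtain N value groups of the target values corresponding to each sample, where the reference ratio is the ratio of the number of values of the equipment parameter of the reference sample to the number of values of the equipment parameter of each sample.
In other embodiments, the determining module 803 is specifically configured to determine a preliminary sample cut point of each sample according to the reference ratio and the reference sample cut point. The obtaining module is further configured to obtain, according to the determined sample cut point and a preset window size, the correlation between the group of equipment parameter values within the preset window size of the sample cut point and the group of equipment parameter values within the preset window size of the reference sample cut point. The data processing apparatus further includes a correcting module 804, configured to correct the sample cut point of each sample according to the correlation.
In other embodiments, the determining module 803 is further configured to: perform a Fourier transform on the values of the equipment parameter of each positive sample; take the minimum number of values of the equipment parameter among the transformed positive samples as the truncation count; take the first truncation-count values of the equipment parameter of each positive sample to obtain a plurality of truncated value groups, each containing exactly truncation-count values; obtain, following the order of the values in each truncated value group, the median of the values at each position across the truncated value groups, to obtain a median sequence; and determine the sample data of the reference sample from the positive samples, where the reference sample is the positive sample whose difference value from the median sequence is smallest.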
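In other embodiments, the obtaining module 801 is specifically configured to: obtain the sample data of each sample produced within the preset time period; obtain the number of values included in the target values of the positive samples; determine a value range according to the number of values included in the target values of the positive samples; and filter out, from the sample data of each sample produced within the preset time period, the positive samples whose target values include a number of values outside the value range, to obtain the sample data of each of the plurality of samples produced within the preset time period;
and/or, obtain the sample data of each sample produced within the preset time period; determine a trimming length according to the median of the numbers of values included in the target values of each sample; and trim the obtained sample data of each sample according to the trimming length, to obtain the sample data of each of the plurality of samples produced within the preset time period.
In other embodiments, the data processing apparatus 80 further includes: a sorting module 805, configured to sort the correlation quantification values by magnitude; and an output module 806, configured to output the ranking of the value groups of the equipment parameters corresponding to the correlation quantification values.
In other embodiments, the output module 806 is further configured to output an information parameter of a value group of the equipment parameter, where the information parameter includes the position of the value group within the equipment parameter and/or the percentage of the target values that the value group accounts for.
In one example, referring to FIG. 2, the receiving function of the obtaining module 801 may be implemented by the interface unit 304 in FIG. 2. The processing function of the obtaining module 801, the dividing module 802, the determining module 803, the correcting module 804, the sorting module 805, and the output module 806 may all be implemented by the processor 301 in FIG. 2 invoking a computer program stored in the memory 302.
For a detailed description of the optional manners above, refer to the foregoing method embodiments; it is not repeated here. In addition, for the explanation and the beneficial effects of any application example of the data processing apparatus 80 provided above, refer to the corresponding method embodiments; they are not repeated here.
It should be noted that the actions performed by the modules above are only specific examples; for the actions actually performed by each unit, refer to the actions or steps mentioned in the description of the embodiment based on FIG. 3.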
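FIG. 15 is a structural diagram of a data processing apparatus 90 provided by an embodiment of the present disclosure. The data processing apparatus 90 includes a receiving module 901, an obtaining module 902, a dividing module 903, a determining module 904, and a display module 905. The receiving module 901 is configured to receive a sample screening condition entered by the user on a condition selection interface. The obtaining module 902 is configured to obtain the sample data of each of a plurality of samples corresponding to the sample screening condition, where the sample data includes the values, at each acquisition time, of the equipment parameter of the equipment that the sample passes through, and the inspection result of the sample. The dividing module 903 is configured to divide the sample data into positive samples and negative samples according to the inspection results of the samples. The determining module 904 is configured to determine a sample cut point of each sample according to the values of the equipment parameter, to obtain N value groups of the target values corresponding to each sample, where the sample cut point of each sample characterizes an abrupt change point in the values of the equipment parameter of the sample, the target values are those values of the equipment parameter for which the time difference between two adjacent acquisition times is less than a first threshold, and N is a positive integer greater than or equal to 1; and to determine a correlation quantification value according to the difference between the M-th value group in the positive samples and the M-th value group in the negative samples, where the correlation quantification value characterizes the degree of influence of the equipment parameter on defective samples, and M is a positive integer less than or equal to N. The display module 905 is configured to display the correlation quantification value on an analysis result display interface. For example, with reference to FIG. 7, the receiving module 901 may be used to perform S200, the obtaining module 902 may be used to perform S201, the dividing module 903 may be used to perform S202, the determining module 904 may be used to perform S203 and S204, and the display module 905 may be used to perform S205.
In other embodiments, the data processing apparatus further includes a sorting module 906, configured to sort the correlation quantification values by magnitude; the display module 905 is specifically configured to display, on the analysis result display interface, the ranking of the value groups of the equipment parameters corresponding to the correlation quantification values.
In other embodiments, the display module 905 is further configured to display, on the analysis result display interface, an information parameter of a value group of the equipment parameter, where the information parameter includes the position of the value group within the equipment parameter and/or the percentage of the target values that the value group accounts for.
In one example, referring to FIG. 2, the receiving functions of the receiving module 901 and the obtaining module 902 may be implemented by the interface unit 304 in FIG. 2. The processing function of the obtaining module 902, the dividing module 903, the determining module 904, the display module 905, and the sorting module 906 may all be implemented by the processor 301 in FIG. 2 invoking a computer program stored in the memory 302.
For a detailed description of the optional manners above, refer to the foregoing method embodiments; it is not repeated here. In addition, for the explanation and the beneficial effects of any application example of the data processing apparatus 90 provided above, refer to the corresponding method embodiments; they are not repeated here.
It should be noted that the actions performed by the modules above are only specific examples; for the actions actually performed by each unit, refer to the actions or steps mentioned in the description of the embodiment based on FIG. 7.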
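An embodiment of the present disclosure further provides an electronic device, including a processor and a memory for storing instructions executable by the processor, where the processor is configured to execute the executable instructions to implement the data processing method of any of the foregoing embodiments.
Some embodiments of the present disclosure provide a computer-readable storage medium (for example, a non-transitory computer-readable storage medium) storing computer program instructions that, when run on a processor, cause the processor to perform one or more steps of the data processing method of any of the foregoing embodiments.
For example, the computer-readable storage medium may include, but is not limited to: magnetic storage devices (for example, hard disks, floppy disks, or magnetic tapes), optical discs (for example, CDs (Compact Disk) and DVDs (Digital Versatile Disk)), smart cards, and flash memory devices (for example, EPROM (Erasable Programmable Read-Only Memory), cards, sticks, or key drives). The various computer-readable storage media described in the present disclosure may represent one or more devices and/or other machine-readable storage media for storing information. The term "machine-readable storage medium" may include, but is not limited to, wireless channels and various other media capable of storing, containing, and/or carrying instructions and/or data.
Some embodiments of the present disclosure further provide a computer program product. The computer program product includes computer program instructions that, when executed on a computer, cause the computer to perform one or more steps of the data processing method of the foregoing embodiments.
Some embodiments of the present disclosure further provide a computer program. When executed on a computer, the computer program causes the computer to perform one or more steps of the data processing method of the foregoing embodiments.
The beneficial effects of the computer-readable storage medium, the computer program product, and the computer program above are the same as those of the data processing method of the foregoing embodiments, and are not repeated here.
The foregoing is merely a description of specific implementations of the present disclosure, and the protection scope of the present disclosure is not limited thereto. Any change or replacement that a person skilled in the art could conceive within the technical scope disclosed herein shall fall within the protection scope of the present disclosure. Therefore, the protection scope of the present disclosure shall be subject to the protection scope of the claims.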

Claims (20)

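  1. A data processing method, comprising:
    obtaining sample data of each of a plurality of samples produced within a preset time period, wherein the sample data comprises values, at each acquisition time, of an equipment parameter of equipment that the sample passes through, and an inspection result of the sample;
    dividing the sample data into positive samples and negative samples according to the inspection results of the samples;
    determining a sample cut point of each sample according to the values of the equipment parameter, to obtain N value groups of target values corresponding to each sample, wherein the sample cut point of each sample characterizes an abrupt change point in the values of the equipment parameter of the sample, the target values are values of the equipment parameter for which the time difference between two adjacent acquisition times is less than a first threshold, and N is a positive integer greater than or equal to 1; and
    determining a correlation quantification value according to a difference between the M-th value group in the positive samples and the M-th value group in the negative samples, wherein the correlation quantification value characterizes a degree of influence of the equipment parameter on defective samples, and M is a positive integer less than or equal to N.
  2. The data processing method according to claim 1, wherein determining the correlation quantification value according to the difference between the M-th value group in the positive samples and the M-th value group in the negative samples comprises:
    determining a first value of a statistical indicator of the M-th value group in the negative samples and a second value of the statistical indicator of the M-th value group in the positive samples, wherein the statistical indicator characterizes a central tendency or a trend of change of the values in a value group;
    determining a difference between the first value and the second value; and
    determining the correlation quantification value according to the difference.
  3. The data processing method according to claim 2, wherein determining the difference between the first value and the second value comprises:
    determining the difference between the first value and the second value according to a characteristic parameter of a plurality of the first values in the negative samples and the characteristic parameter of a plurality of the second values in the positive samples.
  4. The data processing method according to claim 3, wherein the characteristic parameter comprises a value at a target position and/or an overall mean.
  5. The data processing method according to claim 3, wherein determining the difference between the first value and the second value according to the characteristic parameter of the plurality of first values in the negative samples and the characteristic parameter of the plurality of second values in the positive samples comprises:
    determining a first difference between the value at a target position of the plurality of first values in the negative samples and the value at the target position of the plurality of second values in the positive samples;
    determining a second difference between an overall mean of the plurality of first values in the negative samples and an overall mean of the plurality of second values in the positive samples; and
    determining the difference between the first value and the second value according to the first difference, the second difference, and preset weights.
  6. The data processing method according to any one of claims 1 to 5, wherein determining the sample cut point of each sample according to the values of the equipment parameter, to obtain the N value groups of the target values corresponding to each sample, comprises:
    determining sample data of a reference sample according to the values of the equipment parameter, wherein the reference sample is one of the positive samples;
    determining a signal-to-noise ratio of the reference sample, and taking the absolute value of the signal-to-noise ratio as an SNR absolute value;
    taking, as reference sample cut points, those filtered values of the equipment parameter whose absolute values are greater than the SNR absolute value; and
    determining the sample cut point of each sample according to a reference ratio and the reference sample cut points, to obtain the N value groups of the target values corresponding to each sample, wherein the reference ratio is the ratio of the number of values of the equipment parameter of the reference sample to the number of values of the equipment parameter of each sample.
  7. The data processing method according to claim 6, wherein determining the sample cut point of each sample according to the reference ratio and the reference sample cut points comprises:
    determining a preliminary sample cut point of each sample according to the reference ratio and the reference sample cut point; obtaining, according to the determined sample cut point and a preset window size, a correlation between the group of values of the equipment parameter within the preset window size of the sample cut point and the group of values of the equipment parameter within the preset window size of the reference sample cut point; and
    correcting the sample cut point of each sample according to the correlation.
  8. The data processing method according to claim 6 or 7, wherein determining the sample data of the reference sample according to the values of the equipment parameter comprises:
    performing a Fourier transform on the values of the equipment parameter of each of the positive samples;
    taking the minimum number of values of the equipment parameter among the transformed positive samples as a truncation count;
    taking the first truncation-count values of the equipment parameter of each of the positive samples to obtain a plurality of truncated value groups, wherein each truncated value group contains exactly the truncation count of values;
    obtaining, following the order of the values in each truncated value group, the median of the values at each position across the plurality of truncated value groups, to obtain a median sequence; and
    determining the sample data of the reference sample from the positive samples, wherein the reference sample is the positive sample whose difference value from the median sequence is smallest.
  9. The data processing method according to any one of claims 1 to 8, wherein obtaining the sample data of each of the plurality of samples produced within the preset time period comprises:
    obtaining sample data of each sample produced within the preset time period; obtaining the number of values included in the target values of the positive samples; determining a value range according to the number of values included in the target values of the positive samples; and filtering out, from the sample data of each sample produced within the preset time period, the positive samples whose target values include a number of values outside the value range, to obtain the sample data of each of the plurality of samples produced within the preset time period;
    and/or, obtaining sample data of each sample produced within the preset time period; determining a trimming length according to the median of the numbers of values included in the target values of each sample; and trimming the obtained sample data of each sample according to the trimming length, to obtain the sample data of each of the plurality of samples produced within the preset time period.
  10. The data processing method according to any one of claims 1 to 9, further comprising:
    sorting the correlation quantification values by magnitude, and outputting a ranking of the value groups of the equipment parameters corresponding to the correlation quantification values.
  11. The data processing method according to any one of claims 1 to 9, further comprising:
    outputting an information parameter of a value group of the equipment parameter, wherein the information parameter comprises the position of the value group within the equipment parameter and/or the percentage of the target values that the value group accounts for.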
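  12. A data processing method, comprising:
    receiving a sample screening condition entered by a user on a condition selection interface;
    obtaining sample data of each of a plurality of samples corresponding to the sample screening condition, wherein the sample data comprises values, at each acquisition time, of an equipment parameter of equipment that the sample passes through, and an inspection result of the sample;
    dividing the sample data into positive samples and negative samples according to the inspection results of the samples;
    determining a sample cut point of each sample according to the values of the equipment parameter, to obtain N value groups of target values corresponding to each sample, wherein the sample cut point of each sample characterizes an abrupt change point in the values of the equipment parameter of the sample, the target values are values of the equipment parameter for which the time difference between two adjacent acquisition times is less than a first threshold, and N is a positive integer greater than or equal to 1;
    determining a correlation quantification value according to a difference between the M-th value group in the positive samples and the M-th value group in the negative samples, wherein the correlation quantification value characterizes a degree of influence of the equipment parameter on defective samples, and M is a positive integer less than or equal to N; and
    displaying the correlation quantification value on an analysis result display interface.
  13. The data processing method according to claim 12, further comprising:
    sorting the correlation quantification values by magnitude;
    wherein displaying the correlation quantification value on the analysis result display interface comprises:
    displaying, on the analysis result display interface, a ranking of the value groups of the equipment parameters corresponding to the correlation quantification values.
  14. The data processing method according to claim 12 or 13, further comprising:
    displaying, on the analysis result display interface, an information parameter of a value group of the equipment parameter, wherein the information parameter comprises the position of the value group within the equipment parameter and/or the percentage of the target values that the value group accounts for.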
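  15. A data processing apparatus, comprising:
    an obtaining module, configured to obtain sample data of each of a plurality of samples produced within a preset time period, wherein the sample data comprises values, at each acquisition time, of an equipment parameter of equipment that the sample passes through, and an inspection result of the sample;
    a dividing module, configured to divide the sample data into positive samples and negative samples according to the inspection results of the samples; and
    a determining module, configured to determine a sample cut point of each sample according to the values of the equipment parameter, to obtain N value groups of target values corresponding to each sample, wherein the sample cut point of each sample characterizes an abrupt change point in the values of the equipment parameter of the sample, the target values are values of the equipment parameter for which the time difference between two adjacent acquisition times is less than a first threshold, and N is a positive integer greater than or equal to 1; and to determine a correlation quantification value according to a difference between the M-th value group in the positive samples and the M-th value group in the negative samples, wherein the correlation quantification value characterizes a degree of influence of the equipment parameter on defective samples, and M is a positive integer less than or equal to N.
  16. The data processing apparatus according to claim 15, wherein the determining module is specifically configured to:
    determine a first value of a statistical indicator of the M-th value group in the negative samples and a second value of the statistical indicator of the M-th value group in the positive samples, wherein the statistical indicator characterizes a central tendency or a trend of change of the values in a value group;
    determine a difference between the first value and the second value; and
    determine the correlation quantification value according to the difference.
  17. A data processing apparatus, comprising:
    a receiving module, configured to receive a sample screening condition entered by a user on a condition selection interface;
    an obtaining module, configured to obtain sample data of each of a plurality of samples corresponding to the sample screening condition, wherein the sample data comprises values, at each acquisition time, of an equipment parameter of equipment that the sample passes through, and an inspection result of the sample;
    a dividing module, configured to divide the sample data into positive samples and negative samples according to the inspection results of the samples;
    a determining module, configured to determine a sample cut point of each sample according to the values of the equipment parameter, to obtain N value groups of target values corresponding to each sample, wherein the sample cut point of each sample characterizes an abrupt change point in the values of the equipment parameter of the sample, the target values are values of the equipment parameter for which the time difference between two adjacent acquisition times is less than a first threshold, and N is a positive integer greater than or equal to 1;
    and to determine a correlation quantification value according to a difference between the M-th value group in the positive samples and the M-th value group in the negative samples, wherein the correlation quantification value characterizes a degree of influence of the equipment parameter on defective samples, and M is a positive integer less than or equal to N; and
    a display module, configured to display the correlation quantification value on an analysis result display interface.
  18. An electronic device, comprising:
    a processor and a memory for storing instructions executable by the processor, wherein the processor is configured to execute the executable instructions to implement the data processing method according to any one of claims 1 to 11, or to implement the data processing method according to any one of claims 12 to 14.
  19. A computer-readable storage medium, wherein instructions in the computer-readable storage medium, when executed by a processor of an electronic device, enable the electronic device to perform the data processing method according to any one of claims 1 to 11, or to perform the data processing method according to any one of claims 12 to 14.
  20. A computer program product, comprising computer instructions that, when run on a computer device, cause the computer device to perform the data processing method according to any one of claims 1 to 11, or to perform the data processing method according to any one of claims 12 to 14.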
PCT/CN2021/097393 2021-05-31 2021-05-31 数据处理方法、装置、设备及存储介质 WO2022252051A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202180001364.XA CN115735203A (zh) 2021-05-31 2021-05-31 数据处理方法、装置、设备及存储介质
PCT/CN2021/097393 WO2022252051A1 (zh) 2021-05-31 2021-05-31 数据处理方法、装置、设备及存储介质

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2021/097393 WO2022252051A1 (zh) 2021-05-31 2021-05-31 数据处理方法、装置、设备及存储介质

Publications (1)

Publication Number Publication Date
WO2022252051A1 true WO2022252051A1 (zh) 2022-12-08

Family

ID=84323801

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/097393 WO2022252051A1 (zh) 2021-05-31 2021-05-31 数据处理方法、装置、设备及存储介质

Country Status (2)

Country Link
CN (1) CN115735203A (zh)
WO (1) WO2022252051A1 (zh)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190354860A1 (en) * 2016-12-14 2019-11-21 Conti Temic Microelectronic Gmbh Device for Classifying Data
CN111177507A (zh) * 2019-12-31 2020-05-19 支付宝(杭州)信息技术有限公司 多标记业务处理的方法及装置
CN111325228A (zh) * 2018-12-17 2020-06-23 上海游昆信息技术有限公司 一种模型训练方法及装置
CN112818114A (zh) * 2019-11-15 2021-05-18 阿里巴巴集团控股有限公司 信息的分类方法、检测方法、计算设备及存储介质

Also Published As

Publication number Publication date
CN115735203A (zh) 2023-03-03


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21943427

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE